Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
420 views
in Technique[技术] by (71.8m points)

google cloud dataflow - Apache beam streaming pipeline deplyment doesn't exit

I have a streaming apache beam pipeline that I want to deploy using DataflowRunner. My pipeline code looks something like this:

with beam.Pipeline(options=pipeline_options) as p:
    (p | "Read input from PubSub" >>
     beam.io.ReadFromPubSub(subscription=known_args.subscription)
     # ...

Then I deploy the pipeline like this python3 main.py --runner=DataflowRunner --streaming ... The pipeline is being deployed successfully but the problem is that the process does not end but continues to show logs coming from the workers.

Is there a way to start the pipeline, check that it's in a running state, and then exit the process?

question from:https://stackoverflow.com/questions/66066567/apache-beam-streaming-pipeline-deplyment-doesnt-exit

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I believe when you are using `with beam.Pipeline() as p:' it will by default wait until the pipeline finishes or being terminated because it will invoke the 'enter' and 'exit' function on the pipeline, see https://github.com/apache/beam/blob/93c2bd8c8a7988f99a1299b9a1dd3a01122a35be/sdks/python/apache_beam/pipeline.py#L581.

Alternatively you can try

p = beam.Pipeline(options=pipeline_options)
p | ...
p.run()

Which should be non-blocking.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

56.9k users

...