Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
995 views
in Technique[技术] by (71.8m points)

rust - How can I create a Tokio runtime inside another Tokio runtime without getting the error "Cannot start a runtime from within a runtime"?

I'm using rust_bert for summarising text. I need to set a model with rust_bert::pipelines::summarization::SummarizationModel::new, which fetches the model from the internet. It does this asynchronously using tokio and the issue that (I think) I'm running into is that I am running the Tokio runtime within another Tokio runtime, as indicated by the error message:

Downloading https://cdn.huggingface.co/facebook/bart-large-cnn/config.json to "/home/(censored)/.cache/.rustbert/bart-cnn/config.json"
thread 'main' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /home/(censored)/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/enter.rs:38:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I've tried running the model fetching synchronously with tokio::task::spawn_blocking and tokio::task::block_in_place but neither of them are working for me. block_in_place gives the same error as if weren't there, and spawn_blocking doesn't really seem to be of use to me. I've also tried making summarize_text async, but that didn't help much. Github Issue tokio-rs/tokio#2194 and Reddit post "'Cannot start a runtime from within a runtime.' with Actix-Web And Postgresql" seem similar (same-ish error message), but they weren't of much help in finding a solution.

The code I've got issues with is as follows:

use egg_mode::tweet;
use rust_bert::pipelines::summarization::SummarizationModel;

fn summarize_text(model: SummarizationModel, text: &str) -> String {
    let output = model.summarize(&[text]);
    // @TODO: output summarization
    match output.is_empty() {
        false => "FALSE".to_string(),
        true => "TRUE".to_string(),
    }
}

#[tokio::main]
async fn main() {
    let model = SummarizationModel::new(Default::default()).unwrap();

    let token = egg_mode::auth::Token::Bearer("obviously not my token".to_string());
    let tweet_id = 1221552460768202756; // example tweet

    println!("Loading tweet [{id}]", id = tweet_id);
    let status = tweet::show(tweet_id, &token).await;
    match status {
        Err(err) => println!("Failed to fetch tweet: {}", err),
        Ok(tweet) => {
            println!(
                "Original tweet:
{orig}

Summarized tweet:
{sum}",
                orig = tweet.text,
                sum = summarize_text(model, &tweet.text)
            );
        }
    }
}
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Solving the problem

This is a reduced example:

use tokio; // 1.0.2

#[tokio::main]
async fn inner_example() {}

#[tokio::main]
async fn main() {
    inner_example();
}
thread 'main' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /playground/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.2/src/runtime/enter.rs:39:9

To avoid this, you need to run the code that creates the second Tokio runtime on a completely independent thread. The easiest way to do this is to use std::thread::spawn:

use std::thread;

#[tokio::main]
async fn inner_example() {}

#[tokio::main]
async fn main() {
    thread::spawn(|| {
        inner_example();
    }).join().expect("Thread panicked")
}

For improved performance, you may wish to use a threadpool instead of creating a new thread each time. Conveniently, Tokio itself provides such a threadpool via spawn_blocking:

#[tokio::main]
async fn inner_example() {}

#[tokio::main]
async fn main() {
    tokio::task::spawn_blocking(|| {
        inner_example();
    }).await.expect("Task panicked")
}

In some cases you don't need to actually create a second Tokio runtime and can instead reuse the parent runtime. To do so, you pass in a Handle to the outer runtime. You can optionally use a lightweight executor like futures::executor to block on the result, if you need to wait for the work to finish:

use tokio::runtime::Handle; // 1.0.2

fn inner_example(handle: Handle) {
    futures::executor::block_on(async {
        handle
            .spawn(async {
                // Do work here
            })
            .await
            .expect("Task spawned in Tokio executor panicked")
    })
}

#[tokio::main]
async fn main() {
    let handle = Handle::current();

    tokio::task::spawn_blocking(|| {
        inner_example(handle);
    })
    .await
    .expect("Blocking task panicked")
}

See also:

Avoiding the problem

A better path is to avoid creating nested Tokio runtimes in the first place. Ideally, if a library uses an asynchronous executor, it would also offer the direct asynchronous function so you could use your own executor.

It's worth looking at the API to see if there is a non-blocking alternative, and if not, raising an issue on the project's repository.

You may also be able to reorganize your code so that the Tokio runtimes are not nested but are instead sequential:

struct Data;

#[tokio::main]
async fn inner_example() -> Data {
    Data
}

#[tokio::main]
async fn core(_: Data) {}

fn main() {
    let data = inner_example();
    core(data);
}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...