Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
192 views
in Technique[技术] by (71.8m points)

c# - How to implement an efficient WhenEach that streams an IAsyncEnumerable of task results?

I am trying to update my toolset with the new tools offered by C# 8, and one method that seems particularly useful is a version of Task.WhenAll that returns an IAsyncEnumerable. This method should stream the task results as soon as they become available, so naming it WhenAll doesn't make much sense. WhenEach sounds more appropriate. The signature of the method is:

public static IAsyncEnumerable<TResult> WhenEach<TResult>(Task<TResult>[] tasks);

This method could be used like this:

var tasks = new Task<int>[]
{
    ProcessAsync(1, 300),
    ProcessAsync(2, 500),
    ProcessAsync(3, 400),
    ProcessAsync(4, 200),
    ProcessAsync(5, 100),
};

await foreach (int result in WhenEach(tasks))
{
    Console.WriteLine($"Processed: {result}");
}

static async Task<int> ProcessAsync(int result, int delay)
{
    await Task.Delay(delay);
    return result;
}

Expected output:

Processed: 5
Processed: 4
Processed: 1
Processed: 3
Processed: 2

I managed to write a basic implementation using the method Task.WhenAny in a loop, but there is a problem with this approach:

public static async IAsyncEnumerable<TResult> WhenEach<TResult>(
    Task<TResult>[] tasks)
{
    var hashSet = new HashSet<Task<TResult>>(tasks);
    while (hashSet.Count > 0)
    {
        var task = await Task.WhenAny(hashSet).ConfigureAwait(false);
        yield return await task.ConfigureAwait(false);
        hashSet.Remove(task);
    }
}

The problem is the performance. The implementation of the Task.WhenAny creates a defensive copy of the supplied list of tasks, so calling it repeatedly in a loop results in O(n2) computational complexity. My naive implementation struggles to process 10,000 tasks. The overhead is nearly 10 sec in my machine. I would like the method to be nearly as performant as the build-in Task.WhenAll, that can handle hundreds of thousands of tasks with ease. How could I improve the WhenEach method to make it performs decently?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

By using code from this article, you can implement the following:

public static Task<Task<T>>[] Interleaved<T>(IEnumerable<Task<T>> tasks)
{
   var inputTasks = tasks.ToList();

   var buckets = new TaskCompletionSource<Task<T>>[inputTasks.Count];
   var results = new Task<Task<T>>[buckets.Length];
   for (int i = 0; i < buckets.Length; i++)
   {
       buckets[i] = new TaskCompletionSource<Task<T>>();
       results[i] = buckets[i].Task;
   }

   int nextTaskIndex = -1;
   Action<Task<T>> continuation = completed =>
   {
       var bucket = buckets[Interlocked.Increment(ref nextTaskIndex)];
       bucket.TrySetResult(completed);
   };

   foreach (var inputTask in inputTasks)
       inputTask.ContinueWith(continuation, CancellationToken.None, TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);

   return results;
}

Then change your WhenEach to call the Interleaved code

public static async IAsyncEnumerable<TResult> WhenEach<TResult>(Task<TResult>[] tasks)
{
    foreach (var bucket in Interleaved(tasks))
    {
        var t = await bucket;
        yield return await t;
    }
}

Then you can call your WhenEach as per usual

await foreach (int result in WhenEach(tasks))
{
    Console.WriteLine($"Processed: {result}");
}

I did some rudimentary benchmarking with 10k tasks and performed 5 times better in terms of speed.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...