I've removed all the p/invoke stuff and re-created a simplified version of the compiler-generated state machine logic. It exhibits the same behavior: the awaiter gets garbage-collected after the first invocation of the state machine's MoveNext method.
Microsoft has recently done an excellent job of providing a Web UI for their .NET reference sources, which has been very helpful. After studying the implementation of AsyncTaskMethodBuilder and, most importantly, AsyncMethodBuilderCore.GetCompletionAction, I now believe the GC behavior I'm seeing makes perfect sense. I'll try to explain that below.
The code:
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Runtime.InteropServices;
using System.Runtime.CompilerServices;

namespace ConsoleApplication
{
    public class Program
    {
        // Original version with async/await
        /*
        static async Task TestAsync()
        {
            Console.WriteLine("Enter TestAsync");
            var awaiter = new Awaiter();
            //var hold = GCHandle.Alloc(awaiter);
            var i = 0;
            while (true)
            {
                await awaiter;
                Console.WriteLine("tick: " + i++);
            }
            Console.WriteLine("Exit TestAsync");
        }
        */

        // Manually coded state machine version
        struct StateMachine : IAsyncStateMachine
        {
            public int _state;
            public Awaiter _awaiter;
            public AsyncTaskMethodBuilder _builder;

            public void MoveNext()
            {
                Console.WriteLine("StateMachine.MoveNext, state: " + this._state);
                switch (this._state)
                {
                    case -1:
                        {
                            this._awaiter = new Awaiter();
                            goto case 0;
                        }
                    case 0:
                        {
                            this._state = 0;
                            var awaiter = this._awaiter;
                            this._builder.AwaitOnCompleted(ref awaiter, ref this);
                            return;
                        }
                    default:
                        throw new InvalidOperationException();
                }
            }

            public void SetStateMachine(IAsyncStateMachine stateMachine)
            {
                Console.WriteLine("StateMachine.SetStateMachine, state: " + this._state);
                this._builder.SetStateMachine(stateMachine);
                // s_strongRef = stateMachine;
            }

            static object s_strongRef = null;
        }

        static Task TestAsync()
        {
            StateMachine stateMachine = new StateMachine();
            stateMachine._state = -1;
            stateMachine._builder = AsyncTaskMethodBuilder.Create();
            stateMachine._builder.Start(ref stateMachine);
            return stateMachine._builder.Task;
        }

        public static void Main(string[] args)
        {
            var task = TestAsync();
            Thread.Sleep(1000);
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            Console.WriteLine("Press Enter to exit...");
            Console.ReadLine();
        }

        // custom awaiter
        public class Awaiter :
            System.Runtime.CompilerServices.INotifyCompletion
        {
            Action _continuation;

            public Awaiter()
            {
                Console.WriteLine("Awaiter()");
            }

            ~Awaiter()
            {
                Console.WriteLine("~Awaiter()");
            }

            // resume after await, called upon external event
            public void Continue()
            {
                var continuation = Interlocked.Exchange(ref _continuation, null);
                if (continuation != null)
                    continuation();
            }

            // custom Awaiter methods
            public Awaiter GetAwaiter()
            {
                return this;
            }

            public bool IsCompleted
            {
                get { return false; }
            }

            public void GetResult()
            {
            }

            // INotifyCompletion
            public void OnCompleted(Action continuation)
            {
                Console.WriteLine("Awaiter.OnCompleted");
                Volatile.Write(ref _continuation, continuation);
            }
        }
    }
}
The compiler-generated state machine is a mutable struct, passed around by ref. Apparently, this is an optimization to avoid extra allocations.
The core part of this takes place inside AsyncMethodBuilderCore.GetCompletionAction, where the current state machine struct gets boxed, and the reference to the boxed copy is kept by the continuation callback passed to INotifyCompletion.OnCompleted.
This is the only reference to the state machine that has a chance to survive a GC after await. The Task object returned by TestAsync does not hold a reference to it, only the await continuation callback does. I believe this is done on purpose, to preserve the efficient GC behavior.
Note the commented line:
// s_strongRef = stateMachine;
If I un-comment it, the boxed copy of the state machine doesn't get GC'ed, and awaiter stays alive as a part of it. Of course, this is not a solution, but it illustrates the problem.
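The reachability argument can be tried out in isolation. The following is only a minimal sketch, not the actual AsyncMethodBuilderCore code: the State struct, BoxedStateDemo and MakeCallback are hypothetical names made up for illustration. It boxes a struct, lets a delegate capture the boxed copy, and watches the box through a WeakReference. The exact output depends on the runtime and build configuration, but typically the box is reported alive while the delegate is held and collected once the delegate is dropped.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class BoxedStateDemo
{
    // Hypothetical stand-in for a compiler-generated state machine struct.
    struct State
    {
        public int Value;
    }

    // Boxes the struct and returns a delegate that captures the boxed copy,
    // loosely mirroring the idea of a continuation that holds the only
    // reference to the boxed state machine. NoInlining keeps the temporary
    // locals out of the caller's stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Action MakeCallback(out WeakReference weak)
    {
        object boxed = new State { Value = 42 };  // boxing: a copy on the heap
        weak = new WeakReference(boxed);          // observes the box, does not root it
        return () => Console.WriteLine("Value: " + ((State)boxed).Value);
    }

    public static void Main()
    {
        WeakReference weak;
        Action callback = MakeCallback(out weak);

        GC.Collect();
        Console.WriteLine("callback held, box alive: " + weak.IsAlive);
        GC.KeepAlive(callback); // the delegate is rooted at least up to here

        callback = null; // drop the only strong reference chain to the box
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine("callback dropped, box alive: " + weak.IsAlive);
    }
}
```

In the awaiter scenario above, the role of the callback variable is played by whoever stores the continuation passed to OnCompleted. If that holder itself is only reachable from the boxed state machine, nothing roots the cycle.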
So, I've come to the following conclusion. While an async operation is "in flight" and none of the state machine's states (MoveNext) is currently being executed, it's the responsibility of the "keeper" of the continuation callback to put a strong hold on the callback itself, to make sure the boxed copy of the state machine does not get garbage-collected.
For example, in the case of YieldAwaitable (returned by Task.Yield), the external reference to the continuation callback is kept by the ThreadPool task scheduler, as a result of the ThreadPool.QueueUserWorkItem call. In the case of Task.GetAwaiter, it is indirectly referenced by the task object.
In my case, the "keeper" of the continuation callback is the Awaiter itself.
Thus, as long as there are no external references to the continuation callback that the CLR is aware of (outside the state machine object), the custom awaiter should take steps to keep the callback object alive. This, in turn, would keep the whole state machine alive. The following steps would be necessary in this case:
- Call GCHandle.Alloc on the callback upon INotifyCompletion.OnCompleted.
- Call GCHandle.Free when the async event has actually happened, before invoking the continuation callback.
- Implement IDisposable to call GCHandle.Free if the event has never happened.
Given that, below is a version of the original timer callback code, which works correctly. Note, there is no need to put a strong hold on the timer callback delegate (WaitOrTimerCallbackProc callback). It is kept alive as a part of the state machine. Updated: as pointed out by @svick, this statement may be specific to the current implementation of the state machine (C# 5.0). I've added GC.KeepAlive(callback) to eliminate any dependency on this behavior, in case it changes in future compiler versions.
using System;
using System.Runtime.InteropServices;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApplication
{
    class Program
    {
        // Test task
        static async Task TestAsync(CancellationToken token)
        {
            using (var awaiter = new Awaiter())
            {
                WaitOrTimerCallbackProc callback = (a, b) =>
                    awaiter.Continue();
                try
                {
                    IntPtr timerHandle;
                    if (!CreateTimerQueueTimer(out timerHandle,
                            IntPtr.Zero,
                            callback,
                            IntPtr.Zero, 500, 500, 0))
                        throw new System.ComponentModel.Win32Exception(
                            Marshal.GetLastWin32Error());
                    try
                    {
                        var i = 0;
                        while (true)
                        {
                            token.ThrowIfCancellationRequested();
                            await awaiter;
                            Console.WriteLine("tick: " + i++);
                        }
                    }
                    finally
                    {
                        DeleteTimerQueueTimer(IntPtr.Zero, timerHandle, IntPtr.Zero);
                    }
                }
                finally
                {
                    // reference the callback at the end
                    // to avoid a chance for it to be GC'ed
                    GC.KeepAlive(callback);
                }
            }
        }

        // Entry point
        static void Main(string[] args)
        {
            // cancel in 10s
            var testTask = TestAsync(new CancellationTokenSource(10 * 1000).Token);
            Thread.Sleep(1000);
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, true);
            Thread.Sleep(2000);
            Console.WriteLine("Press Enter to GC...");
            Console.ReadLine();
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            Console.WriteLine("Press Enter to exit...");
            Console.ReadLine();
        }

        // Custom awaiter
        public class Awaiter :
            System.Runtime.CompilerServices.INotifyCompletion,
            IDisposable
        {
            Action _continuation;
            GCHandle _hold = new GCHandle();

            public Awaiter()
            {
                Console.WriteLine("Awaiter()");
            }

            ~Awaiter()
            {
                Console.WriteLine("~Awaiter()");
            }

            void ReleaseHold()
            {
                if (_hold.IsAllocated)
                    _hold.Free();
            }

            // resume after await, called upon external event
            public void Continue()
            {
                Action continuation;
                // it's OK to use lock (this)
                // the C# compiler would never do this,
                // because it's slated to work with struct awaiters
                lock (this)
                {
continuation