Most questions of this kind can be answered by looking at the CLR source code, available through the SSCLI20 distribution. It is pretty dated by now, being of .NET 2.0 vintage, but a lot of the core CLR features haven't changed much.
The source code file you want to look at is clr/src/vm/syncblk.cpp. Three classes play a role here: AwareLock is the low-level lock implementation that takes care of acquiring the lock; SyncBlock is the class that implements the queue of threads that are waiting to enter a lock; and CLREvent is the wrapper for the operating system synchronization object, the one you are asking about.
This is C++ code and the level of abstraction is quite high. The code interacts heavily with the garbage collector and includes a lot of testing code, so I'll give a brief description of the process instead.
SyncBlock has the m_Monitor member that stores the AwareLock instance. SyncBlock::Enter() directly calls AwareLock::Enter(), which first tries to acquire the lock as cheaply as possible. It starts by checking whether the thread already owns the lock and, if so, just increments the lock count. Next it uses FastInterlockCompareExchange(), an internal function that's very similar to Interlocked.CompareExchange(). If the lock is not contended, this succeeds very quickly and Monitor.Enter() returns. If it fails, another thread already owns the lock and AwareLock::EnterEpilog is used. At that point the operating system's thread scheduler has to get involved, so a CLREvent is used. It is dynamically created if necessary and its WaitOne() method is called, which involves a kernel transition.
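To make the three steps concrete, here is a minimal C++ sketch of the same pattern: a recursion check, then a compare-exchange fast path, then a blocking wait. It is not the SSCLI20 code. SketchLock, TryAcquire and RunContendedCounter are hypothetical names invented for illustration, and a std::mutex plus std::condition_variable pair stands in for CLREvent, on the assumption that blocking on it is what eventually hands the thread to the operating system's scheduler.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of the pattern described above; NOT the SSCLI20 code.
class SketchLock {
public:
    void Enter() {
        const std::thread::id self = std::this_thread::get_id();
        // Step 1: this thread already owns the lock, so just bump the count.
        if (owner_.load() == self) {
            ++recursion_;
            return;
        }
        // Step 2: uncontended case, a single compare-exchange in user mode.
        if (TryAcquire(self)) return;
        // Step 3: contended case. Block until the owner releases the lock.
        // The condition variable is an assumed stand-in for CLREvent.
        std::unique_lock<std::mutex> guard(waitMutex_);
        wake_.wait(guard, [&] { return TryAcquire(self); });
    }

    void Exit() {
        assert(owner_.load() == std::this_thread::get_id());
        if (--recursion_ > 0) return;      // still held recursively
        owner_.store(std::thread::id());   // no owner any more
        {
            // Release under the wait mutex so a waiter cannot miss the wakeup.
            // Simplification: a real lock would skip this when nobody waits.
            std::lock_guard<std::mutex> guard(waitMutex_);
            state_.store(0);
        }
        wake_.notify_one();
    }

private:
    // Atomically flips state_ from 0 (free) to 1 (held); records the owner.
    bool TryAcquire(std::thread::id self) {
        int expected = 0;
        if (!state_.compare_exchange_strong(expected, 1)) return false;
        owner_.store(self);
        recursion_ = 1;
        return true;
    }

    std::atomic<int> state_{0};
    std::atomic<std::thread::id> owner_{std::thread::id()};
    int recursion_ = 0;                    // only touched by the owning thread
    std::mutex waitMutex_;
    std::condition_variable wake_;
};

// Hypothetical driver: several threads increment a plain int under the lock,
// entering it twice per iteration to exercise the recursive path as well.
int RunContendedCounter(int threadCount, int iterations) {
    SketchLock lock;
    int counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.Enter();
                lock.Enter();
                ++counter;
                lock.Exit();
                lock.Exit();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Calling RunContendedCounter(4, 10000) should return 40000 if the lock provides mutual exclusion. Only step 3 can block the calling thread; steps 1 and 2 never leave user mode, which is the point the description above makes about the uncontended case.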
That's enough to answer your question: the Monitor class enters kernel mode when the lock is contended and the thread has to wait.