author     Koundinya Veluri <kouvel@users.noreply.github.com>    2017-11-27 10:28:02 -0800
committer  GitHub <noreply@github.com>                           2017-11-27 10:28:02 -0800
commit     93ddc4d563a09b21dd829c8c8a50911a200042f3
tree       62f865ee49fe8df07286d90a94f3d21be7d55197 /src
parent     7d4d7a3c0c20fd34aa7db360ebaa8e18720199f2
Improve Monitor scaling (#14216)
Improve Monitor scaling and reduce spinning

Part 1: Improve Monitor scaling

Fixes https://github.com/dotnet/coreclr/issues/13978

- Refactored AwareLock::m_MonitorHeld into a class LockState with operations to mutate the state (sketched below)
- Allowed the lock to be taken by a non-waiter when there is a waiter, to prevent creating lock convoys
- Added a bit to LockState that indicates a waiter has been signaled to wake, to avoid waking more than one waiter at a time. A waiter that wakes by observing the signal unsets this bit. See AwareLock::EnterEpilogHelper().
- Added a spinner count to LockState. Spinners now register and unregister themselves, and lock releasers don't wake a waiter when there is a registered spinner (the spinner guarantees to take the lock if it's available when unregistering itself).
  - This was necessary mostly on Windows to reduce CPU usage to the expected level in contended cases with several threads. I believe the cause is the priority boost Windows gives to signaled threads, which seems to make waiters succeed in acquiring the lock much more frequently. That creates a CPU usage problem: once the woken waiter releases the lock, on its next lock attempt it becomes a spinner. This keeps repeating, converting several waiters into spinners unnecessarily. Before registering spinners, I typically saw 4-6 spinners under contention (with delays inside and outside the lock) when I expected at most 1-2 spinners.
  - It costs an interlocked operation before and after the spin loop. That doesn't seem too significant, since spinning is a relatively slow path anyway, and the reduction in CPU usage in turn reduces contention on the lock and lets more useful work get done.
- Updated waiters to spin a bit before going back to waiting; the reasons are explained in AwareLock::EnterEpilogHelper()
- Removed AwareLock::Contention() and any references (this removes the 10 repeats of the entire spin loop in that function). With the lock convoy issue gone, it appears to no longer be necessary.

Perf
- On Windows, throughput has increased significantly starting at slightly below proc-count threads. On Linux, latency and throughput have increased more significantly at similar proc counts.
- Most of the larger regressions are in the unlocked fast paths. The code there hasn't changed and is almost identical (minor layout differences); I'm treating this as noise until we figure out how to get consistently faster code generated.
- The smaller regressions are within noise range.

Part 2: Reduce Monitor spinning

Fixes https://github.com/dotnet/coreclr/issues/13980

- Added a new config value, Monitor_SpinCount; Monitor spins for that many iterations, with a default of 30 (0x1e) (sketched below). This seems to give a reasonably decent balance between latency, fairness, and throughput. Lower spin counts improve latency and fairness significantly and regress throughput slightly; higher spin counts improve throughput slightly and regress latency and fairness significantly.
- The other spin constants can still be used to disable spinning, but otherwise they are no longer used by Monitor.
- Decreased the number of bits used for tracking the spinner count to 3. This seems to be more than enough, since only one thread can take the lock at a time, and it prevents spikes of unnecessary CPU usage.

Tried some things that didn't pan out:
- Sleep(0) doesn't seem to add anything to the spin loop, so I left it out. Instead of calling Sleep(0), the thread can just proceed to waiting. Waiting is more expensive than Sleep(0), but I didn't see that benefit in the tests. Omitting Sleep(0) also keeps the spin loop very short (a few microseconds max).
- Increasing the average YieldProcessor() duration per spin iteration improved throughput slightly but regressed latency and fairness very quickly. Given that fairness is generally worse with Part 1 of this change, it felt like a better compromise to take a small reduction in throughput for larger improvements in latency and fairness.
- Tried having a very small percentage of lock releases randomly wake a waiter despite there being spinners, to improve fairness. This improved fairness noticeably, but not as much as decreasing the spin count slightly, and it made latency and throughput worse more quickly. After reducing the percentage to a point where I was hardly seeing fairness improvements, there were still noticeable latency and throughput regressions.

Miscellaneous
- Moved the YieldProcessorNormalized code into separate files so that it can be included earlier and where needed
- Added a cap for the "optimal max normalized yields per spin iteration", since it can be very large on machines where YieldProcessor may be implemented as a no-op, in which case it's probably not worth spinning for the full duration
- Refactored duplicate code in the portable versions of MonEnterWorker, MonEnter, and MonReliableEnter. MonTryEnter has a slightly different structure, so I did not refactor that.

Perf
- Throughput is a bit lower than before at lower thread counts and better at medium-high thread counts. It's a bit lower at lower thread counts for two reasons:
  - The shorter spin loop means the lock is polled more frequently, because the exponential backoff does not get as high. This makes it more likely for a spinner to steal the lock from another thread, causing the other thread to sometimes wait early.
  - The duration of YieldProcessor() calls per spin iteration has decreased, so a spinner or spinning waiter is more likely to take the lock; the rest is similar to the above.
- For the same reasons, latency is better than before. Fairness is better on Windows and worse on Linux compared to the baseline, because the baseline behaves differently on the two platforms. Latency also differs between Windows and Linux in the baseline; I suspect those differences are due to scheduling.
- Performance now scales appropriately on processors with different pause delays.

Part 3: Add mitigation for waiter starvation

Normally, threads are allowed to preempt waiters to acquire the lock, and there are cases where waiters can easily be starved as a result. For example, consider a thread that holds a lock for a significant amount of time (much longer than the time it takes to do a context switch), then releases and reacquires the lock in quick succession, and repeats. Though a waiter would be woken upon each lock release, it usually will not have enough time to context-switch in and take the lock, and can be starved for an unreasonably long duration. To prevent such starvation and force a bit of fair forward progress, it is sometimes necessary to change the normal policy and disallow threads from preempting waiters. A new bit was added to LockState, and ShouldNotPreemptWaiters() indicates the current state of the policy.

- When the first waiter begins waiting, it records the current time as the "waiter starvation start time": a point in time after which no forward progress has occurred for waiters. When a waiter acquires the lock, the time is updated to the current time.
- Before a spinner begins spinning, and when a waiter is signaled to wake, the thread checks whether the starvation duration has crossed a threshold (currently 100 ms) and, if so, sets ShouldNotPreemptWaiters() (sketched below). When unreasonable starvation is occurring, the lock will be released occasionally, and if spinners are the cause, spinners will be about to start spinning.
- Before starting to spin, if ShouldNotPreemptWaiters() is set, the spinner skips spinning and waits instead. Spinners that are already registered when ShouldNotPreemptWaiters() is set will stop spinning as necessary. Eventually, all spinners drain and no new ones register.
- After the spinners have drained, only a waiter will be able to acquire the lock. When a waiter acquires the lock, or when the last waiter unregisters itself, ShouldNotPreemptWaiters() is cleared to restore the normal policy.
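For orientation before reading the diff: Part 1's convoy avoidance comes down to a single 32-bit state word and one compare-exchange that any thread may attempt while ShouldNotPreemptWaiters is clear. The sketch below condenses the layout constants and LockState::InterlockedTryLock() from the syncblk.h/syncblk.inl hunks in this diff; contracts and asserts are omitted.

    // m_state layout (per the syncblk.h changes below):
    //   bit 0      IsLocked
    //   bit 1      ShouldNotPreemptWaiters
    //   bits 2-4   spinner count (3 bits, so at most 7 registered spinners)
    //   bit 5      IsWaiterSignaledToWake
    //   bits 6-31  waiter count
    bool AwareLock::LockState::InterlockedTryLock(LockState state)
    {
        // A non-waiter may take the lock even when there are waiters (this is
        // what prevents lock convoys); it must not when ShouldNotPreemptWaiters
        // is set, since that policy exists to let starved waiters make progress.
        if (state.ShouldNonWaiterAttemptToAcquireLock())
        {
            LockState newState = state;
            newState.InvertIsLocked(); // set the lock bit
            return CompareExchangeAcquire(newState, state) == state;
        }
        return false;
    }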
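Part 2 replaces the old exponential-backoff spin constants with a flat, configurable iteration budget. Here is a minimal sketch of the resulting loop shape; TryAcquireOrRegister() is a hypothetical stand-in for the interlocked helpers, and the real loop in ObjHeader::EnterObjMonitorHelperSpin() below additionally registers/unregisters the spinner and bails out to the slow path when necessary.

    YieldProcessorNormalizationInfo normalizationInfo;
    const DWORD spinCount = g_SpinConstants.dwMonitorSpinCount; // default 0x1e (30)
    for (DWORD spinIteration = 0; spinIteration < spinCount; ++spinIteration)
    {
        // Pause for a duration that is normalized across processors with
        // different pause-instruction latencies and grows with the iteration
        AwareLock::SpinWait(normalizationInfo, spinIteration);

        if (TryAcquireOrRegister()) // hypothetical stand-in, not in this diff
        {
            return AwareLock::EnterHelperResult_Entered;
        }
    }
    // Spin budget exhausted; fall through to waiting on the monitor event
    return AwareLock::EnterHelperResult_Contention;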
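Part 3's starvation check reduces to a millisecond timestamp comparison. The diff below only declares RecordWaiterStarvationStartTime() and ShouldStopPreemptingWaiters() and defines the 100 ms threshold in syncblk.h, so the bodies here are a sketch; GetTickCount() is an assumed time source.

    void AwareLock::RecordWaiterStarvationStartTime()
    {
        // Armed when the first waiter starts waiting, and re-armed whenever a
        // waiter acquires the lock, since that counts as forward progress
        m_waiterStarvationStartTimeMs = GetTickCount(); // assumed time source
    }

    bool AwareLock::ShouldStopPreemptingWaiters() const
    {
        // Checked before a spinner starts spinning and when a waiter is signaled
        // to wake; unsigned subtraction tolerates 32-bit tick-count wraparound
        return GetTickCount() - m_waiterStarvationStartTimeMs >=
               WaiterStarvationDurationMsBeforeStoppingPreemptingWaiters; // 100 ms
    }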
Diffstat (limited to 'src')
-rw-r--r--  src/debug/daccess/request.cpp  |    6
-rw-r--r--  src/inc/clrconfigvalues.h      |    1
-rw-r--r--  src/inc/utilcode.h             |    1
-rw-r--r--  src/utilcode/utsem.cpp         |    3
-rw-r--r--  src/vm/amd64/asmconstants.h    |   16
-rw-r--r--  src/vm/eeconfig.cpp            |   10
-rw-r--r--  src/vm/eeconfig.h              |    2
-rw-r--r--  src/vm/i386/asmconstants.h     |   14
-rw-r--r--  src/vm/interpreter.cpp         |    2
-rw-r--r--  src/vm/jithelpers.cpp          |  120
-rw-r--r--  src/vm/object.h                |    2
-rw-r--r--  src/vm/object.inl              |   36
-rw-r--r--  src/vm/syncblk.cpp             |  575
-rw-r--r--  src/vm/syncblk.h               |  346
-rw-r--r--  src/vm/syncblk.inl             |  592
-rw-r--r--  src/vm/threads.cpp             |   16
-rw-r--r--  src/vm/vars.cpp                |    3
17 files changed, 1127 insertions, 618 deletions
diff --git a/src/debug/daccess/request.cpp b/src/debug/daccess/request.cpp
index 92a4758fee..377e008fd4 100644
--- a/src/debug/daccess/request.cpp
+++ b/src/debug/daccess/request.cpp
@@ -3619,9 +3619,9 @@ ClrDataAccess::GetSyncBlockData(unsigned int SBNumber, struct DacpSyncBlockData
}
#endif // FEATURE_COMINTEROP
- pSyncBlockData->MonitorHeld = pBlock->m_Monitor.m_MonitorHeld;
- pSyncBlockData->Recursion = pBlock->m_Monitor.m_Recursion;
- pSyncBlockData->HoldingThread = HOST_CDADDR(pBlock->m_Monitor.m_HoldingThread);
+ pSyncBlockData->MonitorHeld = pBlock->m_Monitor.GetMonitorHeldStateVolatile();
+ pSyncBlockData->Recursion = pBlock->m_Monitor.GetRecursionLevel();
+ pSyncBlockData->HoldingThread = HOST_CDADDR(pBlock->m_Monitor.GetHoldingThread());
if (pBlock->GetAppDomainIndex().m_dwIndex)
{
diff --git a/src/inc/clrconfigvalues.h b/src/inc/clrconfigvalues.h
index 968aed5b11..8c79b93010 100644
--- a/src/inc/clrconfigvalues.h
+++ b/src/inc/clrconfigvalues.h
@@ -657,6 +657,7 @@ RETAIL_CONFIG_DWORD_INFO_EX(EXTERNAL_SpinLimitProcCap, W("SpinLimitProcCap"), 0x
RETAIL_CONFIG_DWORD_INFO_EX(EXTERNAL_SpinLimitProcFactor, W("SpinLimitProcFactor"), 0x4E20, "Hex value specifying the multiplier on NumProcs to use when calculating the maximum spin duration", EEConfig_default)
RETAIL_CONFIG_DWORD_INFO_EX(EXTERNAL_SpinLimitConstant, W("SpinLimitConstant"), 0x0, "Hex value specifying the constant to add when calculating the maximum spin duration", EEConfig_default)
RETAIL_CONFIG_DWORD_INFO_EX(EXTERNAL_SpinRetryCount, W("SpinRetryCount"), 0xA, "Hex value specifying the number of times the entire spin process is repeated (when applicable)", EEConfig_default)
+RETAIL_CONFIG_DWORD_INFO_EX(INTERNAL_Monitor_SpinCount, W("Monitor_SpinCount"), 0x1e, "Hex value specifying the maximum number of spin iterations Monitor may perform upon contention on acquiring the lock before waiting.", EEConfig_default)
//
// Native Binder
diff --git a/src/inc/utilcode.h b/src/inc/utilcode.h
index f48af70f60..d7676cb86a 100644
--- a/src/inc/utilcode.h
+++ b/src/inc/utilcode.h
@@ -5515,6 +5515,7 @@ struct SpinConstants
DWORD dwMaximumDuration;
DWORD dwBackoffFactor;
DWORD dwRepetitions;
+ DWORD dwMonitorSpinCount;
};
extern SpinConstants g_SpinConstants;
diff --git a/src/utilcode/utsem.cpp b/src/utilcode/utsem.cpp
index 4a76bbdb31..5c88936541 100644
--- a/src/utilcode/utsem.cpp
+++ b/src/utilcode/utsem.cpp
@@ -79,7 +79,8 @@ SpinConstants g_SpinConstants = {
50, // dwInitialDuration
40000, // dwMaximumDuration - ideally (20000 * max(2, numProc)) ... updated in code:InitializeSpinConstants_NoHost
3, // dwBackoffFactor
- 10 // dwRepetitions
+ 10, // dwRepetitions
+ 0 // dwMonitorSpinCount
};
inline void InitializeSpinConstants_NoHost()
diff --git a/src/vm/amd64/asmconstants.h b/src/vm/amd64/asmconstants.h
index 016a318abd..f526e22f2b 100644
--- a/src/vm/amd64/asmconstants.h
+++ b/src/vm/amd64/asmconstants.h
@@ -188,22 +188,6 @@ ASMCONSTANT_OFFSETOF_ASSERT(DelegateObject, _methodPtr);
#define OFFSETOF__DelegateObject___target 0x08
ASMCONSTANT_OFFSETOF_ASSERT(DelegateObject, _target);
-#define OFFSETOF__g_SystemInfo__dwNumberOfProcessors 0x20
-ASMCONSTANTS_C_ASSERT(OFFSETOF__g_SystemInfo__dwNumberOfProcessors
- == offsetof(SYSTEM_INFO, dwNumberOfProcessors));
-
-#define OFFSETOF__g_SpinConstants__dwInitialDuration 0x0
-ASMCONSTANTS_C_ASSERT(OFFSETOF__g_SpinConstants__dwInitialDuration
- == offsetof(SpinConstants, dwInitialDuration));
-
-#define OFFSETOF__g_SpinConstants__dwMaximumDuration 0x4
-ASMCONSTANTS_C_ASSERT(OFFSETOF__g_SpinConstants__dwMaximumDuration
- == offsetof(SpinConstants, dwMaximumDuration));
-
-#define OFFSETOF__g_SpinConstants__dwBackoffFactor 0x8
-ASMCONSTANTS_C_ASSERT(OFFSETOF__g_SpinConstants__dwBackoffFactor
- == offsetof(SpinConstants, dwBackoffFactor));
-
#define OFFSETOF__MethodTable__m_dwFlags 0x00
ASMCONSTANTS_C_ASSERT(OFFSETOF__MethodTable__m_dwFlags
== offsetof(MethodTable, m_dwFlags));
diff --git a/src/vm/eeconfig.cpp b/src/vm/eeconfig.cpp
index 16d6d73c9f..df744566e8 100644
--- a/src/vm/eeconfig.cpp
+++ b/src/vm/eeconfig.cpp
@@ -212,6 +212,7 @@ HRESULT EEConfig::Init()
dwSpinLimitProcFactor = 0x4E20;
dwSpinLimitConstant = 0x0;
dwSpinRetryCount = 0xA;
+ dwMonitorSpinCount = 0;
iJitOptimizeType = OPT_DEFAULT;
fJitFramed = false;
@@ -1013,11 +1014,20 @@ HRESULT EEConfig::sync()
#endif
dwSpinInitialDuration = CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_SpinInitialDuration);
+ if (dwSpinInitialDuration < 1)
+ {
+ dwSpinInitialDuration = 1;
+ }
dwSpinBackoffFactor = CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_SpinBackoffFactor);
+ if (dwSpinBackoffFactor < 2)
+ {
+ dwSpinBackoffFactor = 2;
+ }
dwSpinLimitProcCap = CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_SpinLimitProcCap);
dwSpinLimitProcFactor = CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_SpinLimitProcFactor);
dwSpinLimitConstant = CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_SpinLimitConstant);
dwSpinRetryCount = CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_SpinRetryCount);
+ dwMonitorSpinCount = CLRConfig::GetConfigValue(CLRConfig::INTERNAL_Monitor_SpinCount);
fJitFramed = (GetConfigDWORD_DontUse_(CLRConfig::UNSUPPORTED_JitFramed, fJitFramed) != 0);
fJitAlignLoops = (GetConfigDWORD_DontUse_(CLRConfig::UNSUPPORTED_JitAlignLoops, fJitAlignLoops) != 0);
diff --git a/src/vm/eeconfig.h b/src/vm/eeconfig.h
index 84e4ccf15b..6f290fa1f7 100644
--- a/src/vm/eeconfig.h
+++ b/src/vm/eeconfig.h
@@ -272,6 +272,7 @@ public:
DWORD SpinLimitProcFactor(void) const {LIMITED_METHOD_CONTRACT; return dwSpinLimitProcFactor; }
DWORD SpinLimitConstant(void) const {LIMITED_METHOD_CONTRACT; return dwSpinLimitConstant; }
DWORD SpinRetryCount(void) const {LIMITED_METHOD_CONTRACT; return dwSpinRetryCount; }
+ DWORD MonitorSpinCount(void) const {LIMITED_METHOD_CONTRACT; return dwMonitorSpinCount; }
// Jit-config
@@ -978,6 +979,7 @@ private: //----------------------------------------------------------------
DWORD dwSpinLimitProcFactor;
DWORD dwSpinLimitConstant;
DWORD dwSpinRetryCount;
+ DWORD dwMonitorSpinCount;
#ifdef VERIFY_HEAP
int iGCHeapVerify;
diff --git a/src/vm/i386/asmconstants.h b/src/vm/i386/asmconstants.h
index 58b567d21c..70f6e30c4a 100644
--- a/src/vm/i386/asmconstants.h
+++ b/src/vm/i386/asmconstants.h
@@ -65,20 +65,6 @@ ASMCONSTANTS_C_ASSERT(CONTEXT_Eip == offsetof(CONTEXT,Eip))
#define CONTEXT_Esp 0xc4
ASMCONSTANTS_C_ASSERT(CONTEXT_Esp == offsetof(CONTEXT,Esp))
-// SYSTEM_INFO from rotor_pal.h
-#define SYSTEM_INFO_dwNumberOfProcessors 20
-ASMCONSTANTS_C_ASSERT(SYSTEM_INFO_dwNumberOfProcessors == offsetof(SYSTEM_INFO,dwNumberOfProcessors))
-
-// SpinConstants from clr/src/vars.h
-#define SpinConstants_dwInitialDuration 0
-ASMCONSTANTS_C_ASSERT(SpinConstants_dwInitialDuration == offsetof(SpinConstants,dwInitialDuration))
-
-#define SpinConstants_dwMaximumDuration 4
-ASMCONSTANTS_C_ASSERT(SpinConstants_dwMaximumDuration == offsetof(SpinConstants,dwMaximumDuration))
-
-#define SpinConstants_dwBackoffFactor 8
-ASMCONSTANTS_C_ASSERT(SpinConstants_dwBackoffFactor == offsetof(SpinConstants,dwBackoffFactor))
-
#ifndef WIN64EXCEPTIONS
// EHContext from clr/src/vm/i386/cgencpu.h
#define EHContext_Eax 0x00
diff --git a/src/vm/interpreter.cpp b/src/vm/interpreter.cpp
index d18eede1f1..b4b18cb13d 100644
--- a/src/vm/interpreter.cpp
+++ b/src/vm/interpreter.cpp
@@ -1756,7 +1756,7 @@ static void MonitorEnter(Object* obj, BYTE* pbLockTaken)
GET_THREAD()->PulseGCMode();
}
objRef->EnterObjMonitor();
- _ASSERTE ((objRef->GetSyncBlock()->GetMonitor()->m_Recursion == 1 && pThread->m_dwLockCount == lockCount + 1) ||
+ _ASSERTE ((objRef->GetSyncBlock()->GetMonitor()->GetRecursionLevel() == 1 && pThread->m_dwLockCount == lockCount + 1) ||
pThread->m_dwLockCount == lockCount);
if (pbLockTaken != 0) *pbLockTaken = 1;
diff --git a/src/vm/jithelpers.cpp b/src/vm/jithelpers.cpp
index 1611d585a5..dda73f2018 100644
--- a/src/vm/jithelpers.cpp
+++ b/src/vm/jithelpers.cpp
@@ -4416,7 +4416,7 @@ NOINLINE static void JIT_MonEnter_Helper(Object* obj, BYTE* pbLockTaken, LPVOID
GET_THREAD()->PulseGCMode();
}
objRef->EnterObjMonitor();
- _ASSERTE ((objRef->GetSyncBlock()->GetMonitor()->m_Recursion == 1 && pThread->m_dwLockCount == lockCount + 1) ||
+ _ASSERTE ((objRef->GetSyncBlock()->GetMonitor()->GetRecursionLevel() == 1 && pThread->m_dwLockCount == lockCount + 1) ||
pThread->m_dwLockCount == lockCount);
if (pbLockTaken != 0) *pbLockTaken = 1;
@@ -4427,69 +4427,18 @@ NOINLINE static void JIT_MonEnter_Helper(Object* obj, BYTE* pbLockTaken, LPVOID
}
/*********************************************************************/
-NOINLINE static void JIT_MonContention_Helper(Object* obj, BYTE* pbLockTaken, LPVOID __me)
-{
- FC_INNER_PROLOG_NO_ME_SETUP();
-
- OBJECTREF objRef = ObjectToOBJECTREF(obj);
-
- // Monitor helpers are used as both hcalls and fcalls, thus we need exact depth.
- HELPER_METHOD_FRAME_BEGIN_ATTRIB_1(Frame::FRAME_ATTR_EXACT_DEPTH|Frame::FRAME_ATTR_CAPTURE_DEPTH_2, objRef);
-
- GCPROTECT_BEGININTERIOR(pbLockTaken);
-
- objRef->GetSyncBlock()->QuickGetMonitor()->Contention();
- if (pbLockTaken != 0) *pbLockTaken = 1;
-
- GCPROTECT_END();
- HELPER_METHOD_FRAME_END();
-
- FC_INNER_EPILOG();
-}
-
-/*********************************************************************/
#include <optsmallperfcritical.h>
HCIMPL_MONHELPER(JIT_MonEnterWorker_Portable, Object* obj)
{
FCALL_CONTRACT;
-
- AwareLock::EnterHelperResult result;
- Thread * pCurThread;
-
- if (obj == NULL)
- {
- goto FramedLockHelper;
- }
-
- pCurThread = GetThread();
- if (pCurThread->CatchAtSafePointOpportunistic())
- {
- goto FramedLockHelper;
- }
-
- result = obj->EnterObjMonitorHelper(pCurThread);
- if (result == AwareLock::EnterHelperResult_Entered)
+ if (obj != nullptr && obj->TryEnterObjMonitorSpinHelper())
{
MONHELPER_STATE(*pbLockTaken = 1);
return;
}
- if (result == AwareLock::EnterHelperResult_Contention)
- {
- result = obj->EnterObjMonitorHelperSpin(pCurThread);
- if (result == AwareLock::EnterHelperResult_Entered)
- {
- MONHELPER_STATE(*pbLockTaken = 1);
- return;
- }
- if (result == AwareLock::EnterHelperResult_Contention)
- {
- FC_INNER_RETURN_VOID(JIT_MonContention_Helper(obj, MONHELPER_ARG, GetEEFuncEntryPointMacro(JIT_MonEnter)));
- }
- }
-FramedLockHelper:
FC_INNER_RETURN_VOID(JIT_MonEnter_Helper(obj, MONHELPER_ARG, GetEEFuncEntryPointMacro(JIT_MonEnter)));
}
HCIMPLEND
@@ -4498,40 +4447,11 @@ HCIMPL1(void, JIT_MonEnter_Portable, Object* obj)
{
FCALL_CONTRACT;
- Thread * pCurThread;
- AwareLock::EnterHelperResult result;
-
- if (obj == NULL)
- {
- goto FramedLockHelper;
- }
-
- pCurThread = GetThread();
-
- if (pCurThread->CatchAtSafePointOpportunistic())
- {
- goto FramedLockHelper;
- }
-
- result = obj->EnterObjMonitorHelper(pCurThread);
- if (result == AwareLock::EnterHelperResult_Entered)
+ if (obj != nullptr && obj->TryEnterObjMonitorSpinHelper())
{
return;
}
- if (result == AwareLock::EnterHelperResult_Contention)
- {
- result = obj->EnterObjMonitorHelperSpin(pCurThread);
- if (result == AwareLock::EnterHelperResult_Entered)
- {
- return;
- }
- if (result == AwareLock::EnterHelperResult_Contention)
- {
- FC_INNER_RETURN_VOID(JIT_MonContention_Helper(obj, NULL, GetEEFuncEntryPointMacro(JIT_MonEnter)));
- }
- }
-FramedLockHelper:
FC_INNER_RETURN_VOID(JIT_MonEnter_Helper(obj, NULL, GetEEFuncEntryPointMacro(JIT_MonEnter)));
}
HCIMPLEND
@@ -4540,42 +4460,12 @@ HCIMPL2(void, JIT_MonReliableEnter_Portable, Object* obj, BYTE* pbLockTaken)
{
FCALL_CONTRACT;
- Thread * pCurThread;
- AwareLock::EnterHelperResult result;
-
- if (obj == NULL)
- {
- goto FramedLockHelper;
- }
-
- pCurThread = GetThread();
-
- if (pCurThread->CatchAtSafePointOpportunistic())
- {
- goto FramedLockHelper;
- }
-
- result = obj->EnterObjMonitorHelper(pCurThread);
- if (result == AwareLock::EnterHelperResult_Entered)
+ if (obj != nullptr && obj->TryEnterObjMonitorSpinHelper())
{
*pbLockTaken = 1;
return;
}
- if (result == AwareLock::EnterHelperResult_Contention)
- {
- result = obj->EnterObjMonitorHelperSpin(pCurThread);
- if (result == AwareLock::EnterHelperResult_Entered)
- {
- *pbLockTaken = 1;
- return;
- }
- if (result == AwareLock::EnterHelperResult_Contention)
- {
- FC_INNER_RETURN_VOID(JIT_MonContention_Helper(obj, pbLockTaken, GetEEFuncEntryPointMacro(JIT_MonReliableEnter)));
- }
- }
-FramedLockHelper:
FC_INNER_RETURN_VOID(JIT_MonEnter_Helper(obj, pbLockTaken, GetEEFuncEntryPointMacro(JIT_MonReliableEnter)));
}
HCIMPLEND
@@ -4817,7 +4707,7 @@ HCIMPL_MONHELPER(JIT_MonEnterStatic_Portable, AwareLock *lock)
goto FramedLockHelper;
}
- if (lock->EnterHelper(pCurThread, true /* checkRecursiveCase */))
+ if (lock->TryEnterHelper(pCurThread))
{
#if defined(_DEBUG) && defined(TRACK_SYNC)
// The best place to grab this is from the ECall frame
diff --git a/src/vm/object.h b/src/vm/object.h
index a79e846621..3d981b98d0 100644
--- a/src/vm/object.h
+++ b/src/vm/object.h
@@ -497,6 +497,8 @@ class Object
return GetHeader()->TryEnterObjMonitor(timeOut);
}
+ bool TryEnterObjMonitorSpinHelper();
+
FORCEINLINE AwareLock::EnterHelperResult EnterObjMonitorHelper(Thread* pCurThread)
{
WRAPPER_NO_CONTRACT;
diff --git a/src/vm/object.inl b/src/vm/object.inl
index 495f7bff71..7d79554cc9 100644
--- a/src/vm/object.inl
+++ b/src/vm/object.inl
@@ -109,8 +109,40 @@ inline void Object::EnumMemoryRegions(void)
// the enumeration to the MethodTable.
}
-#endif // #ifdef DACCESS_COMPILE
-
+#else // !DACCESS_COMPILE
+
+FORCEINLINE bool Object::TryEnterObjMonitorSpinHelper()
+{
+ CONTRACTL{
+ SO_TOLERANT;
+ NOTHROW;
+ GC_NOTRIGGER;
+ MODE_COOPERATIVE;
+ } CONTRACTL_END;
+
+ Thread *pCurThread = GetThread();
+ if (pCurThread->CatchAtSafePointOpportunistic())
+ {
+ return false;
+ }
+
+ AwareLock::EnterHelperResult result = EnterObjMonitorHelper(pCurThread);
+ if (result == AwareLock::EnterHelperResult_Entered)
+ {
+ return true;
+ }
+ if (result == AwareLock::EnterHelperResult_Contention)
+ {
+ result = EnterObjMonitorHelperSpin(pCurThread);
+ if (result == AwareLock::EnterHelperResult_Entered)
+ {
+ return true;
+ }
+ }
+ return false;
+}
+
+#endif // DACCESS_COMPILE
inline TypeHandle ArrayBase::GetTypeHandle() const
{
diff --git a/src/vm/syncblk.cpp b/src/vm/syncblk.cpp
index 9c512675b6..8f6f5448f9 100644
--- a/src/vm/syncblk.cpp
+++ b/src/vm/syncblk.cpp
@@ -1917,8 +1917,12 @@ AwareLock::EnterHelperResult ObjHeader::EnterObjMonitorHelperSpin(Thread* pCurTh
return AwareLock::EnterHelperResult_Contention;
}
- for (DWORD spinCount = g_SpinConstants.dwInitialDuration; AwareLock::SpinWaitAndBackOffBeforeOperation(&spinCount);)
+ YieldProcessorNormalizationInfo normalizationInfo;
+ const DWORD spinCount = g_SpinConstants.dwMonitorSpinCount;
+ for (DWORD spinIteration = 0; spinIteration < spinCount; ++spinIteration)
{
+ AwareLock::SpinWait(normalizationInfo, spinIteration);
+
LONG oldValue = m_SyncBlockValue.LoadWithoutBarrier();
// Since spinning has begun, chances are good that the monitor has already switched to AwareLock mode, so check for that
@@ -1931,22 +1935,46 @@ AwareLock::EnterHelperResult ObjHeader::EnterObjMonitorHelperSpin(Thread* pCurTh
return AwareLock::EnterHelperResult_UseSlowPath;
}
- // Check the recursive case once before the spin loop. If it's not the recursive case in the beginning, it will not
- // be in the future, so the spin loop can avoid checking the recursive case.
SyncBlock *syncBlock = g_pSyncTable[oldValue & MASK_SYNCBLOCKINDEX].m_SyncBlock;
_ASSERTE(syncBlock != NULL);
AwareLock *awareLock = &syncBlock->m_Monitor;
- if (awareLock->EnterHelper(pCurThread, true /* checkRecursiveCase */))
+
+ AwareLock::EnterHelperResult result = awareLock->TryEnterBeforeSpinLoopHelper(pCurThread);
+ if (result != AwareLock::EnterHelperResult_Contention)
{
- return AwareLock::EnterHelperResult_Entered;
+ return result;
}
- while (AwareLock::SpinWaitAndBackOffBeforeOperation(&spinCount))
+
+ ++spinIteration;
+ if (spinIteration < spinCount)
{
- if (awareLock->EnterHelper(pCurThread, false /* checkRecursiveCase */))
+ while (true)
{
- return AwareLock::EnterHelperResult_Entered;
+ AwareLock::SpinWait(normalizationInfo, spinIteration);
+
+ ++spinIteration;
+ if (spinIteration >= spinCount)
+ {
+ // The last lock attempt for this spin will be done after the loop
+ break;
+ }
+
+ result = awareLock->TryEnterInsideSpinLoopHelper(pCurThread);
+ if (result == AwareLock::EnterHelperResult_Entered)
+ {
+ return AwareLock::EnterHelperResult_Entered;
+ }
+ if (result == AwareLock::EnterHelperResult_UseSlowPath)
+ {
+ break;
+ }
}
}
+
+ if (awareLock->TryEnterAfterSpinLoopHelper(pCurThread))
+ {
+ return AwareLock::EnterHelperResult_Entered;
+ }
break;
}
@@ -2134,7 +2162,7 @@ BOOL ObjHeader::GetThreadOwningMonitorLock(DWORD *pThreadId, DWORD *pAcquisition
SyncBlock* psb = g_pSyncTable[(int)index].m_SyncBlock;
_ASSERTE(psb->GetMonitor() != NULL);
- Thread* pThread = psb->GetMonitor()->m_HoldingThread;
+ Thread* pThread = psb->GetMonitor()->GetHoldingThread();
if(pThread == NULL)
{
*pThreadId = 0;
@@ -2144,7 +2172,7 @@ BOOL ObjHeader::GetThreadOwningMonitorLock(DWORD *pThreadId, DWORD *pAcquisition
else
{
*pThreadId = pThread->GetThreadId();
- *pAcquisitionCount = psb->GetMonitor()->m_Recursion;
+ *pAcquisitionCount = psb->GetMonitor()->GetRecursionLevel();
return TRUE;
}
}
@@ -2751,8 +2779,7 @@ SyncBlock *ObjHeader::GetSyncBlock()
// The lock is orphaned.
pThread = (Thread*) -1;
}
- syncBlock->InitState();
- syncBlock->SetAwareLock(pThread, recursionLevel + 1);
+ syncBlock->InitState(recursionLevel + 1, pThread);
}
}
else if ((bits & BIT_SBLK_IS_HASHCODE) != 0)
@@ -2917,73 +2944,45 @@ void AwareLock::Enter()
}
CONTRACTL_END;
- Thread *pCurThread = GetThread();
-
- for (;;)
+ Thread *pCurThread = GetThread();
+ LockState state = m_lockState;
+ if (!state.IsLocked() || m_HoldingThread != pCurThread)
{
- // Read existing lock state.
- LONG state = m_MonitorHeld.LoadWithoutBarrier();
-
- if (state == 0)
+ if (m_lockState.InterlockedTryLock_Or_RegisterWaiter(this, state))
{
- // Common case: lock not held, no waiters. Attempt to acquire lock by
- // switching lock bit.
- if (FastInterlockCompareExchange((LONG*)&m_MonitorHeld, 1, 0) == 0)
- {
- break;
- }
- }
- else
- {
- // It's possible to get here with waiters but no lock held, but in this
- // case a signal is about to be fired which will wake up a waiter. So
- // for fairness sake we should wait too.
- // Check first for recursive lock attempts on the same thread.
- if (m_HoldingThread == pCurThread)
- {
- goto Recursion;
- }
-
- // Attempt to increment this count of waiters then goto contention
- // handling code.
- if (FastInterlockCompareExchange((LONG*)&m_MonitorHeld, (state + 2), state) == state)
- {
- goto MustWait;
- }
- }
- }
-
- // We get here if we successfully acquired the mutex.
- m_HoldingThread = pCurThread;
- m_Recursion = 1;
- pCurThread->IncLockCount();
+ // We get here if we successfully acquired the mutex.
+ m_HoldingThread = pCurThread;
+ m_Recursion = 1;
+ pCurThread->IncLockCount();
#if defined(_DEBUG) && defined(TRACK_SYNC)
- {
- // The best place to grab this is from the ECall frame
- Frame *pFrame = pCurThread->GetFrame();
- int caller = (pFrame && pFrame != FRAME_TOP
- ? (int) pFrame->GetReturnAddress()
- : -1);
- pCurThread->m_pTrackSync->EnterSync(caller, this);
- }
+ // The best place to grab this is from the ECall frame
+ Frame *pFrame = pCurThread->GetFrame();
+ int caller = (pFrame && pFrame != FRAME_TOP
+ ? (int)pFrame->GetReturnAddress()
+ : -1);
+ pCurThread->m_pTrackSync->EnterSync(caller, this);
#endif
+ return;
+ }
- return;
+ // Lock was not acquired and the waiter was registered
-MustWait:
- // Didn't manage to get the mutex, must wait.
- EnterEpilog(pCurThread);
- return;
+ // Didn't manage to get the mutex, must wait.
+ // The precondition for EnterEpilog is that the count of waiters be bumped
+ // to account for this thread, which was done above.
+ EnterEpilog(pCurThread);
+ return;
+ }
-Recursion:
// Got the mutex via recursive locking on the same thread.
_ASSERTE(m_Recursion >= 1);
m_Recursion++;
+
#if defined(_DEBUG) && defined(TRACK_SYNC)
// The best place to grab this is from the ECall frame
Frame *pFrame = pCurThread->GetFrame();
- int caller = (pFrame && pFrame != FRAME_TOP ? (int) pFrame->GetReturnAddress() : -1);
+ int caller = (pFrame && pFrame != FRAME_TOP ? (int)pFrame->GetReturnAddress() : -1);
pCurThread->m_pTrackSync->EnterSync(caller, this);
#endif
}
@@ -3000,123 +2999,58 @@ BOOL AwareLock::TryEnter(INT32 timeOut)
}
CONTRACTL_END;
- if (timeOut != 0)
- {
- LARGE_INTEGER qpFrequency, qpcStart, qpcEnd;
- BOOL canUseHighRes = QueryPerformanceCounter(&qpcStart);
-
- // try some more busy waiting
- if (Contention(timeOut))
- return TRUE;
-
- DWORD elapsed = 0;
- if (canUseHighRes && QueryPerformanceCounter(&qpcEnd) && QueryPerformanceFrequency(&qpFrequency))
- elapsed = (DWORD)((qpcEnd.QuadPart-qpcStart.QuadPart)/(qpFrequency.QuadPart/1000));
-
- if (elapsed >= (DWORD)timeOut)
- return FALSE;
-
- if (timeOut != (INT32)INFINITE)
- timeOut -= elapsed;
- }
-
Thread *pCurThread = GetThread();
- TESTHOOKCALL(AppDomainCanBeUnloaded(pCurThread->GetDomain()->GetId().m_dwId,FALSE));
+ TESTHOOKCALL(AppDomainCanBeUnloaded(pCurThread->GetDomain()->GetId().m_dwId, FALSE));
- if (pCurThread->IsAbortRequested())
+ if (pCurThread->IsAbortRequested())
{
pCurThread->HandleThreadAbort();
}
-retry:
-
- for (;;) {
-
- // Read existing lock state.
- LONG state = m_MonitorHeld.LoadWithoutBarrier();
-
- if (state == 0)
- {
- // Common case: lock not held, no waiters. Attempt to acquire lock by
- // switching lock bit.
- if (FastInterlockCompareExchange((LONG*)&m_MonitorHeld, 1, 0) == 0)
- {
- break;
- }
- }
- else
+ LockState state = m_lockState;
+ if (!state.IsLocked() || m_HoldingThread != pCurThread)
+ {
+ if (timeOut == 0
+ ? m_lockState.InterlockedTryLock(state)
+ : m_lockState.InterlockedTryLock_Or_RegisterWaiter(this, state))
{
- // It's possible to get here with waiters but no lock held, but in this
- // case a signal is about to be fired which will wake up a waiter. So
- // for fairness sake we should wait too.
- // Check first for recursive lock attempts on the same thread.
- if (m_HoldingThread == pCurThread)
- {
- goto Recursion;
- }
- else
- {
- goto WouldBlock;
- }
- }
- }
-
- // We get here if we successfully acquired the mutex.
- m_HoldingThread = pCurThread;
- m_Recursion = 1;
- pCurThread->IncLockCount();
+ // We get here if we successfully acquired the mutex.
+ m_HoldingThread = pCurThread;
+ m_Recursion = 1;
+ pCurThread->IncLockCount();
#if defined(_DEBUG) && defined(TRACK_SYNC)
- {
- // The best place to grab this is from the ECall frame
- Frame *pFrame = pCurThread->GetFrame();
- int caller = (pFrame && pFrame != FRAME_TOP ? (int) pFrame->GetReturnAddress() : -1);
- pCurThread->m_pTrackSync->EnterSync(caller, this);
- }
+ // The best place to grab this is from the ECall frame
+ Frame *pFrame = pCurThread->GetFrame();
+ int caller = (pFrame && pFrame != FRAME_TOP ? (int)pFrame->GetReturnAddress() : -1);
+ pCurThread->m_pTrackSync->EnterSync(caller, this);
#endif
+ return true;
+ }
- return TRUE;
-
-WouldBlock:
- // Didn't manage to get the mutex, return failure if no timeout, else wait
- // for at most timeout milliseconds for the mutex.
- if (!timeOut)
- {
- return FALSE;
- }
-
- // The precondition for EnterEpilog is that the count of waiters be bumped
- // to account for this thread
-
- for (;;)
- {
- // Read existing lock state.
- LONG state = m_MonitorHeld.LoadWithoutBarrier();
+ // Lock was not acquired and the waiter was registered if the timeout is nonzero
- if (state == 0)
+ // Didn't manage to get the mutex, return failure if no timeout, else wait
+ // for at most timeout milliseconds for the mutex.
+ if (timeOut == 0)
{
- goto retry;
+ return false;
}
- if (FastInterlockCompareExchange((LONG*)&m_MonitorHeld, (state + 2), state) == state)
- {
- break;
- }
+ // The precondition for EnterEpilog is that the count of waiters be bumped
+ // to account for this thread, which was done above
+ return EnterEpilog(pCurThread, timeOut);
}
- return EnterEpilog(pCurThread, timeOut);
-
-Recursion:
// Got the mutex via recursive locking on the same thread.
_ASSERTE(m_Recursion >= 1);
m_Recursion++;
#if defined(_DEBUG) && defined(TRACK_SYNC)
// The best place to grab this is from the ECall frame
Frame *pFrame = pCurThread->GetFrame();
- int caller = (pFrame && pFrame != FRAME_TOP ? (int) pFrame->GetReturnAddress() : -1);
+ int caller = (pFrame && pFrame != FRAME_TOP ? (int)pFrame->GetReturnAddress() : -1);
pCurThread->m_pTrackSync->EnterSync(caller, this);
#endif
-
return true;
}
@@ -3140,27 +3074,58 @@ BOOL AwareLock::EnterEpilog(Thread* pCurThread, INT32 timeOut)
return EnterEpilogHelper(pCurThread, timeOut);
}
+#ifdef _DEBUG
+#define _LOGCONTENTION
+#endif // _DEBUG
+
+#ifdef _LOGCONTENTION
+inline void LogContention()
+{
+ WRAPPER_NO_CONTRACT;
+#ifdef LOGGING
+ if (LoggingOn(LF_SYNC, LL_INFO100))
+ {
+ LogSpewAlways("Contention: Stack Trace Begin\n");
+ void LogStackTrace();
+ LogStackTrace();
+ LogSpewAlways("Contention: Stack Trace End\n");
+ }
+#endif
+}
+#else
+#define LogContention()
+#endif
+
BOOL AwareLock::EnterEpilogHelper(Thread* pCurThread, INT32 timeOut)
{
STATIC_CONTRACT_THROWS;
STATIC_CONTRACT_MODE_COOPERATIVE;
STATIC_CONTRACT_GC_TRIGGERS;
- DWORD ret = 0;
- BOOL finished = false;
+ // IMPORTANT!!!
+ // The caller has already registered a waiter. This function needs to unregister the waiter on all paths (exception paths
+ // included). On runtimes where thread-abort is supported, a thread-abort also needs to unregister the waiter. There may be
+ // a possibility for preemptive GC toggles below to handle a thread-abort, that should be taken into consideration when
+ // porting this code back to .NET Framework.
// Require all callers to be in cooperative mode. If they have switched to preemptive
// mode temporarily before calling here, then they are responsible for protecting
// the object associated with this lock.
_ASSERTE(pCurThread->PreemptiveGCDisabled());
+ COUNTER_ONLY(GetPerfCounters().m_LocksAndThreads.cContention++);
+
+ // Fire a contention start event for a managed contention
+ FireEtwContentionStart_V1(ETW::ContentionLog::ContentionStructs::ManagedContention, GetClrInstanceId());
+ LogContention();
- OBJECTREF obj = GetOwningObject();
+ OBJECTREF obj = GetOwningObject();
// We cannot allow the AwareLock to be cleaned up underneath us by the GC.
IncrementTransientPrecious();
+ DWORD ret;
GCPROTECT_BEGIN(obj);
{
if (!m_SemEvent.IsMonitorEventAllocated())
@@ -3183,104 +3148,98 @@ BOOL AwareLock::EnterEpilogHelper(Thread* pCurThread, INT32 timeOut)
} param;
param.pThis = this;
param.timeOut = timeOut;
- param.ret = ret;
+
+ // Measure the time we wait so that, in the case where we wake up
+ // and fail to acquire the mutex, we can adjust remaining timeout
+ // accordingly.
+ ULONGLONG start = CLRGetTickCount64();
EE_TRY_FOR_FINALLY(Param *, pParam, &param)
{
- // Measure the time we wait so that, in the case where we wake up
- // and fail to acquire the mutex, we can adjust remaining timeout
- // accordingly.
- ULONGLONG start = CLRGetTickCount64();
-
pParam->ret = pParam->pThis->m_SemEvent.Wait(pParam->timeOut, TRUE);
_ASSERTE((pParam->ret == WAIT_OBJECT_0) || (pParam->ret == WAIT_TIMEOUT));
-
- // When calculating duration we consider a couple of special cases.
- // If the end tick is the same as the start tick we make the
- // duration a millisecond, to ensure we make forward progress if
- // there's a lot of contention on the mutex. Secondly, we have to
- // cope with the case where the tick counter wrapped while we where
- // waiting (we can cope with at most one wrap, so don't expect three
- // month timeouts to be very accurate). Luckily for us, the latter
- // case is taken care of by 32-bit modulo arithmetic automatically.
-
- if (pParam->timeOut != (INT32) INFINITE)
- {
- ULONGLONG end = CLRGetTickCount64();
- ULONGLONG duration;
- if (end == start)
- {
- duration = 1;
- }
- else
- {
- duration = end - start;
- }
- duration = min(duration, (DWORD)pParam->timeOut);
- pParam->timeOut -= (INT32)duration;
- }
}
EE_FINALLY
{
if (GOT_EXCEPTION())
{
- // We must decrement the waiter count.
- for (;;)
- {
- LONG state = m_MonitorHeld.LoadWithoutBarrier();
- _ASSERTE((state >> 1) != 0);
- if (FastInterlockCompareExchange((LONG*)&m_MonitorHeld, state - 2, state) == state)
- {
- break;
- }
- }
+ // It is likely the case that an APC threw an exception, for instance Thread.Interrupt(). The wait subsystem
+ // guarantees that if a signal to the event being waited upon is observed by the woken thread, that thread's
+ // wait will return WAIT_OBJECT_0. So in any race between m_SemEvent being signaled and the wait throwing an
+ // exception, a thread that is woken by an exception would not observe the signal, and the signal would wake
+ // another thread as necessary.
- // And signal the next waiter, else they'll wait forever.
- m_SemEvent.Set();
+ // We must decrement the waiter count.
+ m_lockState.InterlockedUnregisterWaiter();
}
} EE_END_FINALLY;
ret = param.ret;
+ if (ret != WAIT_OBJECT_0)
+ {
+ // We timed out, decrement waiter count.
+ m_lockState.InterlockedUnregisterWaiter();
+ break;
+ }
- if (ret == WAIT_OBJECT_0)
+ // Spin a bit while trying to acquire the lock. This has a few benefits:
+ // - Spinning helps to reduce waiter starvation. Since other non-waiter threads can take the lock while there are
+ // waiters (see LockState::InterlockedTryLock()), once a waiter wakes it will be able to better compete
+ // with other spinners for the lock.
+ // - If there is another thread that is repeatedly acquiring and releasing the lock, spinning before waiting again
+ // helps to prevent a waiter from repeatedly context-switching in and out
+ // - Furthermore, in the same situation as above, waking up and waiting shortly thereafter deprioritizes this waiter because
+ // events release waiters in FIFO order. Spinning a bit helps a waiter to retain its priority at least for one
+ // spin duration before it gets deprioritized behind all other waiters.
+ if (g_SystemInfo.dwNumberOfProcessors > 1)
{
- // Attempt to acquire lock (this also involves decrementing the waiter count).
- for (;;)
+ bool acquiredLock = false;
+ YieldProcessorNormalizationInfo normalizationInfo;
+ const DWORD spinCount = g_SpinConstants.dwMonitorSpinCount;
+ for (DWORD spinIteration = 0; spinIteration < spinCount; ++spinIteration)
{
- LONG state = m_MonitorHeld.LoadWithoutBarrier();
- _ASSERTE(((size_t)state >> 1) != 0);
-
- if ((size_t)state & 1)
+ if (m_lockState.InterlockedTry_LockAndUnregisterWaiterAndObserveWakeSignal(this))
{
+ acquiredLock = true;
break;
}
- if (FastInterlockCompareExchange((LONG*)&m_MonitorHeld, ((state - 2) | 1), state) == state)
- {
- finished = true;
- break;
- }
+ SpinWait(normalizationInfo, spinIteration);
}
- }
- else
- {
- // We timed out, decrement waiter count.
- for (;;)
+ if (acquiredLock)
{
- LONG state = m_MonitorHeld.LoadWithoutBarrier();
- _ASSERTE((state >> 1) != 0);
- if (FastInterlockCompareExchange((LONG*)&m_MonitorHeld, state - 2, state) == state)
- {
- finished = true;
- break;
- }
+ break;
}
}
- if (finished)
+ if (m_lockState.InterlockedObserveWakeSignal_Try_LockAndUnregisterWaiter(this))
{
break;
}
+
+ // When calculating duration we consider a couple of special cases.
+ // If the end tick is the same as the start tick we make the
+ // duration a millisecond, to ensure we make forward progress if
+ // there's a lot of contention on the mutex. Secondly, we have to
+ // cope with the case where the tick counter wrapped while we were
+ // waiting (we can cope with at most one wrap, so don't expect three
+ // month timeouts to be very accurate). Luckily for us, the latter
+ // case is taken care of by 32-bit modulo arithmetic automatically.
+ if (timeOut != (INT32)INFINITE)
+ {
+ ULONGLONG end = CLRGetTickCount64();
+ ULONGLONG duration;
+ if (end == start)
+ {
+ duration = 1;
+ }
+ else
+ {
+ duration = end - start;
+ }
+ duration = min(duration, (DWORD)timeOut);
+ timeOut -= (INT32)duration;
+ }
}
pCurThread->DisablePreemptiveGC();
@@ -3288,9 +3247,12 @@ BOOL AwareLock::EnterEpilogHelper(Thread* pCurThread, INT32 timeOut)
GCPROTECT_END();
DecrementTransientPrecious();
+ // Fire a contention end event for a managed contention
+ FireEtwContentionStop(ETW::ContentionLog::ContentionStructs::ManagedContention, GetClrInstanceId());
+
if (ret == WAIT_TIMEOUT)
{
- return FALSE;
+ return false;
}
m_HoldingThread = pCurThread;
@@ -3300,11 +3262,10 @@ BOOL AwareLock::EnterEpilogHelper(Thread* pCurThread, INT32 timeOut)
#if defined(_DEBUG) && defined(TRACK_SYNC)
// The best place to grab this is from the ECall frame
Frame *pFrame = pCurThread->GetFrame();
- int caller = (pFrame && pFrame != FRAME_TOP ? (int) pFrame->GetReturnAddress() : -1);
+ int caller = (pFrame && pFrame != FRAME_TOP ? (int)pFrame->GetReturnAddress() : -1);
pCurThread->m_pTrackSync->EnterSync(caller, this);
#endif
-
- return (ret != WAIT_TIMEOUT);
+ return true;
}
@@ -3339,156 +3300,6 @@ BOOL AwareLock::Leave()
}
}
-#ifdef _DEBUG
-#define _LOGCONTENTION
-#endif // _DEBUG
-
-#ifdef _LOGCONTENTION
-inline void LogContention()
-{
- WRAPPER_NO_CONTRACT;
-#ifdef LOGGING
- if (LoggingOn(LF_SYNC, LL_INFO100))
- {
- LogSpewAlways("Contention: Stack Trace Begin\n");
- void LogStackTrace();
- LogStackTrace();
- LogSpewAlways("Contention: Stack Trace End\n");
- }
-#endif
-}
-#else
-#define LogContention()
-#endif
-
-
-
-bool AwareLock::Contention(INT32 timeOut)
-{
- CONTRACTL
- {
- INSTANCE_CHECK;
- THROWS;
- GC_TRIGGERS;
- MODE_ANY;
- INJECT_FAULT(COMPlusThrowOM(););
- }
- CONTRACTL_END;
-
- DWORD startTime = 0;
- if (timeOut != (INT32)INFINITE)
- startTime = GetTickCount();
-
- COUNTER_ONLY(GetPerfCounters().m_LocksAndThreads.cContention++);
-
-
- LogContention();
- Thread *pCurThread = GetThread();
- OBJECTREF obj = GetOwningObject();
- bool bEntered = false;
- bool bKeepGoing = true;
-
- // We cannot allow the AwareLock to be cleaned up underneath us by the GC.
- IncrementTransientPrecious();
-
- GCPROTECT_BEGIN(obj);
- {
- GCX_PREEMP();
-
- // Try spinning and yielding before eventually blocking.
- // The limit of 10 is largely arbitrary - feel free to tune if you have evidence
- // you're making things better
- for (DWORD iter = 0; iter < g_SpinConstants.dwRepetitions && bKeepGoing; iter++)
- {
- DWORD i = g_SpinConstants.dwInitialDuration;
-
- do
- {
- if (TryEnter())
- {
- bEntered = true;
- goto entered;
- }
-
- if (g_SystemInfo.dwNumberOfProcessors <= 1)
- {
- bKeepGoing = false;
- break;
- }
-
- if (timeOut != (INT32)INFINITE && GetTickCount() - startTime >= (DWORD)timeOut)
- {
- bKeepGoing = false;
- break;
- }
-
- // Spin for i iterations, and make sure to never go more than 20000 iterations between
- // checking if we should SwitchToThread
- int remainingDelay = i;
-
- while (remainingDelay > 0)
- {
- int currentDelay = min(remainingDelay, 20000);
- remainingDelay -= currentDelay;
-
- // Delay by approximately 2*currentDelay clock cycles (Pentium III).
-
- // This is brittle code - future processors may of course execute this
- // faster or slower, and future code generators may eliminate the loop altogether.
- // The precise value of the delay is not critical, however, and I can't think
- // of a better way that isn't machine-dependent.
- for (int delayCount = currentDelay; (--delayCount != 0); )
- {
- YieldProcessor(); // indicate to the processor that we are spining
- }
-
- // TryEnter will not take the lock if it has waiters. This means we should not spin
- // for long periods without giving the waiters a chance to run, since we won't
- // make progress until they run and they may be waiting for our CPU. So once
- // we're spinning >20000 iterations, check every 20000 iterations if there are
- // waiters and if so call SwitchToThread.
- //
- // Since this only affects the spinning heuristic, calling HasWaiters now
- // and getting a dirty read is fine. Note that it is important that TryEnter
- // not take the lock because we could easily starve waiting threads.
- // They make only one attempt before going back to sleep, and spinners on
- // other CPUs would likely get the lock. We could fix this by allowing a
- // woken thread to become a spinner again, at which point there are no
- // starvation concerns and TryEnter can take the lock.
- if (remainingDelay > 0 && HasWaiters())
- {
- __SwitchToThread(0, CALLER_LIMITS_SPINNING);
- }
- }
-
- // exponential backoff: wait a factor longer in the next iteration
- i *= g_SpinConstants.dwBackoffFactor;
- }
- while (i < g_SpinConstants.dwMaximumDuration);
-
- {
- GCX_COOP();
- pCurThread->HandleThreadAbort();
- }
-
- __SwitchToThread(0, CALLER_LIMITS_SPINNING);
- }
-entered: ;
- }
- GCPROTECT_END();
- // we are in co-operative mode so no need to keep this set
- DecrementTransientPrecious();
- if (!bEntered && timeOut == (INT32)INFINITE)
- {
- // We've tried hard to enter - we need to eventually block to avoid wasting too much cpu
- // time.
- Enter();
- bEntered = TRUE;
- }
- return bEntered;
-}
-
-
LONG AwareLock::LeaveCompletely()
{
WRAPPER_NO_CONTRACT;
diff --git a/src/vm/syncblk.h b/src/vm/syncblk.h
index 9dcd57c497..212a726eb5 100644
--- a/src/vm/syncblk.h
+++ b/src/vm/syncblk.h
@@ -17,6 +17,7 @@
#include "slist.h"
#include "crst.h"
#include "vars.hpp"
+#include "yieldprocessornormalized.h"
// #SyncBlockOverview
//
@@ -165,6 +166,7 @@ inline void InitializeSpinConstants()
g_SpinConstants.dwMaximumDuration = min(g_pConfig->SpinLimitProcCap(), g_SystemInfo.dwNumberOfProcessors) * g_pConfig->SpinLimitProcFactor() + g_pConfig->SpinLimitConstant();
g_SpinConstants.dwBackoffFactor = g_pConfig->SpinBackoffFactor();
g_SpinConstants.dwRepetitions = g_pConfig->SpinRetryCount();
+ g_SpinConstants.dwMonitorSpinCount = g_SpinConstants.dwMaximumDuration == 0 ? 0 : g_pConfig->MonitorSpinCount();
#endif
}
@@ -184,11 +186,247 @@ class AwareLock
friend class SyncBlock;
public:
- Volatile<LONG> m_MonitorHeld;
+ enum EnterHelperResult {
+ EnterHelperResult_Entered,
+ EnterHelperResult_Contention,
+ EnterHelperResult_UseSlowPath
+ };
+
+ enum LeaveHelperAction {
+ LeaveHelperAction_None,
+ LeaveHelperAction_Signal,
+ LeaveHelperAction_Yield,
+ LeaveHelperAction_Contention,
+ LeaveHelperAction_Error,
+ };
+
+private:
+ class LockState
+ {
+ private:
+ // Layout constants for m_state
+ static const UINT32 IsLockedMask = (UINT32)1 << 0; // bit 0
+ static const UINT32 ShouldNotPreemptWaitersMask = (UINT32)1 << 1; // bit 1
+ static const UINT32 SpinnerCountIncrement = (UINT32)1 << 2;
+ static const UINT32 SpinnerCountMask = (UINT32)0x7 << 2; // bits 2-4
+ static const UINT32 IsWaiterSignaledToWakeMask = (UINT32)1 << 5; // bit 5
+ static const UINT8 WaiterCountShift = 6;
+ static const UINT32 WaiterCountIncrement = (UINT32)1 << WaiterCountShift;
+ static const UINT32 WaiterCountMask = (UINT32)-1 >> WaiterCountShift << WaiterCountShift; // bits 6-31
+
+ private:
+ UINT32 m_state;
+
+ public:
+ LockState(UINT32 state = 0) : m_state(state)
+ {
+ LIMITED_METHOD_CONTRACT;
+ }
+
+ public:
+ UINT32 GetState() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return m_state;
+ }
+
+ UINT32 GetMonitorHeldState() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ static_assert_no_msg(IsLockedMask == 1);
+ static_assert_no_msg(WaiterCountShift >= 1);
+
+ // Return only the locked state and waiter count in the previous (m_MonitorHeld) layout for the debugger:
+ // bit 0: 1 if locked, 0 otherwise
+ // bits 1-31: waiter count
+ UINT32 state = m_state;
+ return (state & IsLockedMask) + (state >> WaiterCountShift << 1);
+ }
+
+ public:
+ bool IsUnlockedWithNoWaiters() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return !(m_state & (IsLockedMask + WaiterCountMask));
+ }
+
+ void InitializeToLockedWithNoWaiters()
+ {
+ LIMITED_METHOD_CONTRACT;
+ _ASSERTE(!m_state);
+
+ m_state = IsLockedMask;
+ }
+
+ public:
+ bool IsLocked() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return !!(m_state & IsLockedMask);
+ }
+
+ private:
+ void InvertIsLocked()
+ {
+ LIMITED_METHOD_CONTRACT;
+ m_state ^= IsLockedMask;
+ }
+
+ public:
+ bool ShouldNotPreemptWaiters() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return !!(m_state & ShouldNotPreemptWaitersMask);
+ }
+
+ private:
+ void InvertShouldNotPreemptWaiters()
+ {
+ WRAPPER_NO_CONTRACT;
+
+ m_state ^= ShouldNotPreemptWaitersMask;
+ _ASSERTE(!ShouldNotPreemptWaiters() || HasAnyWaiters());
+ }
+
+ bool ShouldNonWaiterAttemptToAcquireLock() const
+ {
+ WRAPPER_NO_CONTRACT;
+ _ASSERTE(!ShouldNotPreemptWaiters() || HasAnyWaiters());
+
+ return !(m_state & (IsLockedMask + ShouldNotPreemptWaitersMask));
+ }
+
+ public:
+ bool HasAnySpinners() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return !!(m_state & SpinnerCountMask);
+ }
+
+ private:
+ bool TryIncrementSpinnerCount()
+ {
+ WRAPPER_NO_CONTRACT;
+
+ LockState newState = m_state + SpinnerCountIncrement;
+ if (newState.HasAnySpinners()) // overflow check
+ {
+ m_state = newState;
+ return true;
+ }
+ return false;
+ }
+
+ void DecrementSpinnerCount()
+ {
+ WRAPPER_NO_CONTRACT;
+ _ASSERTE(HasAnySpinners());
+
+ m_state -= SpinnerCountIncrement;
+ }
+
+ public:
+ bool IsWaiterSignaledToWake() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return !!(m_state & IsWaiterSignaledToWakeMask);
+ }
+
+ private:
+ void InvertIsWaiterSignaledToWake()
+ {
+ LIMITED_METHOD_CONTRACT;
+ m_state ^= IsWaiterSignaledToWakeMask;
+ }
+
+ public:
+ bool HasAnyWaiters() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return m_state >= WaiterCountIncrement;
+ }
+
+ private:
+ void IncrementWaiterCount()
+ {
+ LIMITED_METHOD_CONTRACT;
+ _ASSERTE(m_state + WaiterCountIncrement >= WaiterCountIncrement);
+
+ m_state += WaiterCountIncrement;
+ }
+
+ void DecrementWaiterCount()
+ {
+ WRAPPER_NO_CONTRACT;
+ _ASSERTE(HasAnyWaiters());
+
+ m_state -= WaiterCountIncrement;
+ }
+
+ private:
+ bool NeedToSignalWaiter() const
+ {
+ WRAPPER_NO_CONTRACT;
+ return HasAnyWaiters() && !(m_state & (SpinnerCountMask + IsWaiterSignaledToWakeMask));
+ }
+
+ private:
+ operator UINT32() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return m_state;
+ }
+
+ LockState &operator =(UINT32 state)
+ {
+ LIMITED_METHOD_CONTRACT;
+
+ m_state = state;
+ return *this;
+ }
+
+ public:
+ LockState VolatileLoad() const
+ {
+ WRAPPER_NO_CONTRACT;
+ return ::VolatileLoad(&m_state);
+ }
+
+ private:
+ LockState CompareExchange(LockState toState, LockState fromState)
+ {
+ LIMITED_METHOD_CONTRACT;
+ return (UINT32)InterlockedCompareExchange((LONG *)&m_state, (LONG)toState, (LONG)fromState);
+ }
+
+ LockState CompareExchangeAcquire(LockState toState, LockState fromState)
+ {
+ LIMITED_METHOD_CONTRACT;
+ return (UINT32)InterlockedCompareExchangeAcquire((LONG *)&m_state, (LONG)toState, (LONG)fromState);
+ }
+
+ public:
+ bool InterlockedTryLock();
+ bool InterlockedTryLock(LockState state);
+ bool InterlockedUnlock();
+ bool InterlockedTrySetShouldNotPreemptWaitersIfNecessary(AwareLock *awareLock);
+ bool InterlockedTrySetShouldNotPreemptWaitersIfNecessary(AwareLock *awareLock, LockState state);
+ EnterHelperResult InterlockedTry_LockOrRegisterSpinner(LockState state);
+ EnterHelperResult InterlockedTry_LockAndUnregisterSpinner();
+ bool InterlockedUnregisterSpinner_TryLock();
+ bool InterlockedTryLock_Or_RegisterWaiter(AwareLock *awareLock, LockState state);
+ void InterlockedUnregisterWaiter();
+ bool InterlockedTry_LockAndUnregisterWaiterAndObserveWakeSignal(AwareLock *awareLock);
+ bool InterlockedObserveWakeSignal_Try_LockAndUnregisterWaiter(AwareLock *awareLock);
+ };
+
+ friend class LockState;
+
+private:
+ LockState m_lockState;
ULONG m_Recursion;
PTR_Thread m_HoldingThread;
-
- private:
+
LONG m_TransientPrecious;
@@ -198,16 +436,20 @@ public:
CLREvent m_SemEvent;
+ DWORD m_waiterStarvationStartTimeMs;
+
+ static const DWORD WaiterStarvationDurationMsBeforeStoppingPreemptingWaiters = 100;
+
// Only SyncBlocks can create AwareLocks. Hence this private constructor.
AwareLock(DWORD indx)
- : m_MonitorHeld(0),
- m_Recursion(0),
+ : m_Recursion(0),
#ifndef DACCESS_COMPILE
// PreFAST has trouble with initializing a NULL PTR_Thread.
m_HoldingThread(NULL),
#endif // DACCESS_COMPILE
m_TransientPrecious(0),
- m_dwSyncIndex(indx)
+ m_dwSyncIndex(indx),
+ m_waiterStarvationStartTimeMs(0)
{
LIMITED_METHOD_CONTRACT;
}
@@ -237,24 +479,60 @@ public:
#endif // defined(ENABLE_CONTRACTS_IMPL)
public:
- enum EnterHelperResult {
- EnterHelperResult_Entered,
- EnterHelperResult_Contention,
- EnterHelperResult_UseSlowPath
- };
+ UINT32 GetLockState() const
+ {
+ WRAPPER_NO_CONTRACT;
+ return m_lockState.GetState();
+ }
- enum LeaveHelperAction {
- LeaveHelperAction_None,
- LeaveHelperAction_Signal,
- LeaveHelperAction_Yield,
- LeaveHelperAction_Contention,
- LeaveHelperAction_Error,
- };
+ bool IsUnlockedWithNoWaiters() const
+ {
+ WRAPPER_NO_CONTRACT;
+ return m_lockState.IsUnlockedWithNoWaiters();
+ }
+
+ UINT32 GetMonitorHeldStateVolatile() const
+ {
+ WRAPPER_NO_CONTRACT;
+ return m_lockState.VolatileLoad().GetMonitorHeldState();
+ }
+
+ ULONG GetRecursionLevel() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return m_Recursion;
+ }
+
+ PTR_Thread GetHoldingThread() const
+ {
+ LIMITED_METHOD_CONTRACT;
+ return m_HoldingThread;
+ }
+
+private:
+ void ResetWaiterStarvationStartTime();
+ void RecordWaiterStarvationStartTime();
+ bool ShouldStopPreemptingWaiters() const;
- static bool SpinWaitAndBackOffBeforeOperation(DWORD *spinCountRef);
+private: // friend access is required for this unsafe function
+ void InitializeToLockedWithNoWaiters(ULONG recursionLevel, PTR_Thread holdingThread)
+ {
+ WRAPPER_NO_CONTRACT;
+
+ m_lockState.InitializeToLockedWithNoWaiters();
+ m_Recursion = recursionLevel;
+ m_HoldingThread = holdingThread;
+ }
+
+public:
+ static void SpinWait(const YieldProcessorNormalizationInfo &normalizationInfo, DWORD spinIteration);
// Helper encapsulating the fast path entering monitor. Returns what kind of result was achieved.
- bool EnterHelper(Thread* pCurThread, bool checkRecursiveCase);
+ bool TryEnterHelper(Thread* pCurThread);
+
+ EnterHelperResult TryEnterBeforeSpinLoopHelper(Thread *pCurThread);
+ EnterHelperResult TryEnterInsideSpinLoopHelper(Thread *pCurThread);
+ bool TryEnterAfterSpinLoopHelper(Thread *pCurThread);
// Helper encapsulating the core logic for leaving monitor. Returns what kind of
// follow up action is necessary
@@ -272,9 +550,10 @@ public:
// CLREvent::SetMonitorEvent works even if the event has not been initialized yet
m_SemEvent.SetMonitorEvent();
+
+ m_lockState.InterlockedTrySetShouldNotPreemptWaitersIfNecessary(this);
}
- bool Contention(INT32 timeOut = INFINITE);
void AllocLockSemEvent();
LONG LeaveCompletely();
BOOL OwnedByCurrentThread();
@@ -308,13 +587,6 @@ public:
LIMITED_METHOD_CONTRACT;
return m_HoldingThread;
}
-
- // Do we have waiters?
- inline BOOL HasWaiters()
- {
- LIMITED_METHOD_CONTRACT;
- return (m_MonitorHeld >> 1) > 0;
- }
};
#ifdef FEATURE_COMINTEROP
@@ -636,7 +908,7 @@ class SyncBlock
{
WRAPPER_NO_CONTRACT;
return (!IsPrecious() &&
- m_Monitor.m_MonitorHeld.RawValue() == (LONG)0 &&
+ m_Monitor.IsUnlockedWithNoWaiters() &&
m_Monitor.m_TransientPrecious == 0);
}
@@ -721,16 +993,6 @@ class SyncBlock
m_dwAppDomainIndex = dwAppDomainIndex;
}
- void SetAwareLock(Thread *holdingThread, DWORD recursionLevel)
- {
- LIMITED_METHOD_CONTRACT;
- // <NOTE>
- // DO NOT SET m_MonitorHeld HERE! THIS IS NOT PROTECTED BY ANY LOCK!!
- // </NOTE>
- m_Monitor.m_HoldingThread = PTR_Thread(holdingThread);
- m_Monitor.m_Recursion = recursionLevel;
- }
-
DWORD GetHashCode()
{
LIMITED_METHOD_CONTRACT;
@@ -875,10 +1137,10 @@ class SyncBlock
// This should ONLY be called when initializing a SyncBlock (i.e. ONLY from
// ObjHeader::GetSyncBlock()), otherwise we'll have a race condition.
// </NOTE>
- void InitState()
+ void InitState(ULONG recursionLevel, PTR_Thread holdingThread)
{
- LIMITED_METHOD_CONTRACT;
- m_Monitor.m_MonitorHeld.RawValue() = 1;
+ WRAPPER_NO_CONTRACT;
+ m_Monitor.InitializeToLockedWithNoWaiters(recursionLevel, holdingThread);
}
#if defined(ENABLE_CONTRACTS_IMPL)
diff --git a/src/vm/syncblk.inl b/src/vm/syncblk.inl
index 376cd4c2e7..697afe369b 100644
--- a/src/vm/syncblk.inl
+++ b/src/vm/syncblk.inl
@@ -8,34 +8,492 @@
#ifndef DACCESS_COMPILE
-FORCEINLINE bool AwareLock::SpinWaitAndBackOffBeforeOperation(DWORD *spinCountRef)
+FORCEINLINE bool AwareLock::LockState::InterlockedTryLock()
+{
+ WRAPPER_NO_CONTRACT;
+ return InterlockedTryLock(*this);
+}
+
+FORCEINLINE bool AwareLock::LockState::InterlockedTryLock(LockState state)
+{
+ WRAPPER_NO_CONTRACT;
+
+ // The monitor releases waiters fairly, in FIFO order, but allows a non-waiter to acquire the lock whenever it's available,
+ // to avoid lock convoys.
+ //
+ // Lock convoys can be detrimental to performance in scenarios where work is being done on multiple threads and the work
+ // involves periodically taking a particular lock for a short time to access shared resources. With a lock convoy, once
+ // there is a waiter for the lock (which is not uncommon in such scenarios), a worker thread would be forced to
+ // context-switch on the subsequent attempt to acquire the lock, often long before the worker thread exhausts its time
+ // slice. This process repeats as long as the lock has a waiter, forcing every worker to context-switch on each attempt to
+ // acquire the lock, killing performance and creating a negative feedback loop that makes it more likely for the lock to
+ // have waiters. To avoid the lock convoy, each worker needs to be allowed to acquire the lock multiple times in sequence
+ // despite there being a waiter for the lock in order to have the worker continue working efficiently during its time slice
+ // as long as the lock is not contended.
+ //
+ // This scheme can starve waiters. Waiter starvation is mitigated by other means; see
+ // InterlockedTrySetShouldNotPreemptWaitersIfNecessary().
+ if (state.ShouldNonWaiterAttemptToAcquireLock())
+ {
+ LockState newState = state;
+ newState.InvertIsLocked();
+
+ return CompareExchangeAcquire(newState, state) == state;
+ }
+ return false;
+}
+
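
The convoy-avoidance policy above reduces to a single predicate over the packed state word. A minimal standalone sketch in portable C++, with a purely illustrative bit layout (the commit's real LockState also packs a spinner count, a waiter count, and a wake-signal bit):

#include <atomic>
#include <cstdint>

// Illustrative layout only: a lock bit plus an anti-starvation bit.
constexpr uint32_t kIsLockedBit  = 1u << 0;
constexpr uint32_t kNoPreemptBit = 1u << 1;

// A non-waiter may take the lock whenever it is free, even if waiters exist,
// unless the anti-starvation policy is currently in effect.
bool ShouldNonWaiterAttemptToAcquireLock(uint32_t state)
{
    return (state & (kIsLockedBit | kNoPreemptBit)) == 0;
}

bool TryLock(std::atomic<uint32_t> &state)
{
    uint32_t s = state.load(std::memory_order_relaxed);
    return ShouldNonWaiterAttemptToAcquireLock(s) &&
           state.compare_exchange_strong(s, s | kIsLockedBit, std::memory_order_acquire);
}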
+FORCEINLINE bool AwareLock::LockState::InterlockedUnlock()
+{
+ WRAPPER_NO_CONTRACT;
+ static_assert_no_msg(IsLockedMask == 1);
+ _ASSERTE(IsLocked());
+
+ LockState state = InterlockedDecrementRelease((LONG *)&m_state);
+ while (true)
+ {
+ // Keep track of whether a thread has been signaled to wake but has not yet woken from the wait.
+ // IsWaiterSignaledToWakeMask is cleared when a signaled thread wakes up by observing a signal. Since threads can
+ // preempt waiting threads and acquire the lock (see InterlockedTryLock()), one thread may, for example, acquire
+ // and release the lock multiple times while there are multiple waiting threads. In such a case, we don't want that
+ // thread to signal a waiter every time it releases the lock, as that will cause unnecessary context switches with more
+ // and more signaled threads waking up, finding that the lock is still locked, and going right back into a wait state.
+ // So, signal only one waiting thread at a time.
+ if (!state.NeedToSignalWaiter())
+ {
+ return false;
+ }
+
+ LockState newState = state;
+ newState.InvertIsWaiterSignaledToWake();
+
+ LockState stateBeforeUpdate = CompareExchange(newState, state);
+ if (stateBeforeUpdate == state)
+ {
+ return true;
+ }
+
+ state = stateBeforeUpdate;
+ }
+}
+
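
The release path pairs an unconditional decrement of the lock bit with a CAS loop that raises the wake signal at most once. A toy equivalent under the same illustrative layout (the real NeedToSignalWaiter() additionally skips the wake while a spinner is registered):

#include <atomic>
#include <cstdint>

constexpr uint32_t kLockBit     = 1u << 0;
constexpr uint32_t kSignaledBit = 1u << 1;
constexpr uint32_t kWaiterShift = 2; // waiter count occupies the high bits

// Returns true when the caller must set the event to wake exactly one waiter.
bool Unlock(std::atomic<uint32_t> &state)
{
    uint32_t s = state.fetch_sub(kLockBit, std::memory_order_release) - kLockBit;
    for (;;)
    {
        bool hasWaiters = (s >> kWaiterShift) != 0;
        if (!hasWaiters || (s & kSignaledBit) != 0)
            return false; // nobody to wake, or a wake is already in flight
        if (state.compare_exchange_weak(s, s | kSignaledBit))
            return true;  // on failure, s is reloaded and the loop re-evaluates
    }
}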
+FORCEINLINE bool AwareLock::LockState::InterlockedTrySetShouldNotPreemptWaitersIfNecessary(AwareLock *awareLock)
+{
+ WRAPPER_NO_CONTRACT;
+ return InterlockedTrySetShouldNotPreemptWaitersIfNecessary(awareLock, *this);
+}
+
+FORCEINLINE bool AwareLock::LockState::InterlockedTrySetShouldNotPreemptWaitersIfNecessary(
+ AwareLock *awareLock,
+ LockState state)
+{
+ WRAPPER_NO_CONTRACT;
+ _ASSERTE(awareLock != nullptr);
+ _ASSERTE(&awareLock->m_lockState == this);
+
+ // Normally, threads are allowed to preempt waiters to acquire the lock in order to avoid creating lock convoys, see
+ // InterlockedTryLock(). There are cases where waiters can be easily starved as a result. For example, a thread that
+ // holds a lock for a significant amount of time (much longer than the time it takes to do a context switch), then
+ // releases and reacquires the lock in quick succession, and repeats. Though a waiter would be woken upon lock release,
+ // usually it will not have enough time to context-switch-in and take the lock, and can be starved for an unreasonably long
+ // duration.
+ //
+ // In order to prevent such starvation and force a bit of fair forward progress, it is sometimes necessary to change the
+ // normal policy and disallow threads from preempting waiters. ShouldNotPreemptWaiters() indicates the current state of the
+ // policy and this function determines whether the policy should be changed to disallow non-waiters from preempting waiters.
+ // - When the first waiter begins waiting, it records the current time as a "waiter starvation start time". That is a
+ // point in time after which no forward progress has occurred for waiters. When a waiter acquires the lock, the time is
+ // updated to the current time.
+ // - This function checks whether the starvation duration has crossed a threshold and, if so, sets
+ // ShouldNotPreemptWaiters().
+ //
+ // When unreasonable starvation is occurring, the lock will still be released occasionally. If the starvation is caused by
+ // spinners, new spinners will keep starting to spin.
+ // - Before a thread starts to spin, this function is called. If ShouldNotPreemptWaiters() is set, the thread will skip
+ // spinning and wait instead. Spinners that are already registered at the time ShouldNotPreemptWaiters() is set will stop
+ // spinning as necessary. Eventually, all spinners will drain and no new ones will be registered.
+ // - Upon releasing a lock, if there are no spinners, a waiter will be signaled to wake. On that path, this function
+ // is called.
+ // - Eventually, after spinners have drained, only a waiter will be able to acquire the lock. When a waiter acquires
+ // the lock, or when the last waiter unregisters itself, ShouldNotPreemptWaiters() is cleared to restore the normal
+ // policy.
+
+ while (true)
+ {
+ if (!state.HasAnyWaiters())
+ {
+ _ASSERTE(!state.ShouldNotPreemptWaiters());
+ return false;
+ }
+ if (state.ShouldNotPreemptWaiters())
+ {
+ return true;
+ }
+ if (!awareLock->ShouldStopPreemptingWaiters())
+ {
+ return false;
+ }
+
+ LockState newState = state;
+ newState.InvertShouldNotPreemptWaiters();
+
+ LockState stateBeforeUpdate = CompareExchange(newState, state);
+ if (stateBeforeUpdate == state)
+ {
+ return true;
+ }
+
+ state = stateBeforeUpdate;
+ }
+}
+
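
All of the LockState operations in this file share one shape: snapshot the state, compute the desired new state, and on a failed compare-exchange retry with the freshly observed value. A generic rendering of that pattern (illustrative, not part of the commit):

#include <atomic>
#include <cstdint>

// transform(current, desired) fills in `desired` and returns true, or returns
// false to decline the update (the operation's failure path).
template <typename TransformFn>
bool UpdateState(std::atomic<uint32_t> &state, TransformFn transform)
{
    uint32_t s = state.load(std::memory_order_relaxed);
    for (;;)
    {
        uint32_t desired;
        if (!transform(s, desired))
            return false;
        // On failure, compare_exchange_weak writes the freshly observed value
        // into s, mirroring the `state = stateBeforeUpdate` retries above.
        if (state.compare_exchange_weak(s, desired))
            return true;
    }
}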
+FORCEINLINE AwareLock::EnterHelperResult AwareLock::LockState::InterlockedTry_LockOrRegisterSpinner(LockState state)
+{
+ WRAPPER_NO_CONTRACT;
+
+ while (true)
+ {
+ LockState newState = state;
+ if (state.ShouldNonWaiterAttemptToAcquireLock())
+ {
+ newState.InvertIsLocked();
+ }
+ else if (state.ShouldNotPreemptWaiters() || !newState.TryIncrementSpinnerCount())
+ {
+ return EnterHelperResult_UseSlowPath;
+ }
+
+ LockState stateBeforeUpdate = CompareExchange(newState, state);
+ if (stateBeforeUpdate == state)
+ {
+ return state.ShouldNonWaiterAttemptToAcquireLock() ? EnterHelperResult_Entered : EnterHelperResult_Contention;
+ }
+
+ state = stateBeforeUpdate;
+ }
+}
+
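
TryIncrementSpinnerCount() fails when the narrow spinner-count field would overflow, which is what routes excess threads to the wait path above. A toy saturating increment over a packed field (the field position and width here are illustrative):

#include <cstdint>

constexpr uint32_t kSpinnerShift = 1;
constexpr uint32_t kSpinnerMask  = 0x7u << kSpinnerShift; // a narrow counter field

// Returns false when the field is saturated; otherwise writes the bumped state.
bool TryIncrementSpinnerCount(uint32_t state, uint32_t &newState)
{
    if ((state & kSpinnerMask) == kSpinnerMask)
        return false;
    newState = state + (1u << kSpinnerShift);
    return true;
}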
+FORCEINLINE AwareLock::EnterHelperResult AwareLock::LockState::InterlockedTry_LockAndUnregisterSpinner()
+{
+ WRAPPER_NO_CONTRACT;
+
+ // This function is called from inside a spin loop; it must unregister the spinner if and only if the lock is acquired.
+ LockState state = *this;
+ while (true)
+ {
+ _ASSERTE(state.HasAnySpinners());
+ if (!state.ShouldNonWaiterAttemptToAcquireLock())
+ {
+ return state.ShouldNotPreemptWaiters() ? EnterHelperResult_UseSlowPath : EnterHelperResult_Contention;
+ }
+
+ LockState newState = state;
+ newState.InvertIsLocked();
+ newState.DecrementSpinnerCount();
+
+ LockState stateBeforeUpdate = CompareExchange(newState, state);
+ if (stateBeforeUpdate == state)
+ {
+ return EnterHelperResult_Entered;
+ }
+
+ state = stateBeforeUpdate;
+ }
+}
+
+FORCEINLINE bool AwareLock::LockState::InterlockedUnregisterSpinner_TryLock()
+{
+ WRAPPER_NO_CONTRACT;
+
+ // This function is called at the end of a spin loop; it must always unregister the spinner, and must acquire the lock if
+ // it's available. If the lock is available, the spinner must acquire it along with unregistering itself, because a lock
+ // releaser does not wake a waiter when there is a spinner registered.
+
+ LockState stateBeforeUpdate = InterlockedExchangeAdd((LONG *)&m_state, -(LONG)SpinnerCountIncrement);
+ _ASSERTE(stateBeforeUpdate.HasAnySpinners());
+ if (stateBeforeUpdate.IsLocked())
+ {
+ return false;
+ }
+
+ LockState state = stateBeforeUpdate;
+ state.DecrementSpinnerCount();
+ _ASSERTE(!state.IsLocked());
+ do
+ {
+ LockState newState = state;
+ newState.InvertIsLocked();
+
+ LockState stateBeforeUpdate = CompareExchangeAcquire(newState, state);
+ if (stateBeforeUpdate == state)
+ {
+ return true;
+ }
+
+ state = stateBeforeUpdate;
+ } while (!state.IsLocked());
+ return false;
+}
+
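
Because the unregistration is unconditional, a plain atomic subtract suffices, and only the subsequent lock attempt needs a CAS loop. A toy equivalent of the shape above (illustrative layout):

#include <atomic>
#include <cstdint>

constexpr uint32_t kLockedBit        = 1u << 0;
constexpr uint32_t kSpinnerIncrement = 1u << 1; // spinner count sits next to the lock bit

bool UnregisterSpinnerAndTryLock(std::atomic<uint32_t> &state)
{
    // Unconditionally drop this thread's spinner registration...
    uint32_t s = state.fetch_sub(kSpinnerIncrement) - kSpinnerIncrement;
    // ...then keep trying to take the lock for as long as it stays free, since
    // a releaser will not have woken a waiter while we were still registered.
    while ((s & kLockedBit) == 0)
    {
        if (state.compare_exchange_weak(s, s | kLockedBit, std::memory_order_acquire))
            return true;
    }
    return false;
}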
+FORCEINLINE bool AwareLock::LockState::InterlockedTryLock_Or_RegisterWaiter(AwareLock *awareLock, LockState state)
+{
+ WRAPPER_NO_CONTRACT;
+ _ASSERTE(awareLock != nullptr);
+ _ASSERTE(&awareLock->m_lockState == this);
+
+ bool waiterStarvationStartTimeWasReset = false;
+ while (true)
+ {
+ LockState newState = state;
+ if (state.ShouldNonWaiterAttemptToAcquireLock())
+ {
+ newState.InvertIsLocked();
+ }
+ else
+ {
+ newState.IncrementWaiterCount();
+
+ if (!state.HasAnyWaiters() && !waiterStarvationStartTimeWasReset)
+ {
+ // This would be the first waiter. Once the waiter is registered, another thread may check the waiter starvation
+ // start time and the previously recorded value may be stale, causing ShouldNotPreemptWaiters() to be set
+ // unnecessarily. Reset the start time before registering the waiter.
+ waiterStarvationStartTimeWasReset = true;
+ awareLock->ResetWaiterStarvationStartTime();
+ }
+ }
+
+ LockState stateBeforeUpdate = CompareExchange(newState, state);
+ if (stateBeforeUpdate == state)
+ {
+ if (state.ShouldNonWaiterAttemptToAcquireLock())
+ {
+ return true;
+ }
+
+ if (!state.HasAnyWaiters())
+ {
+ // This was the first waiter, record the waiter starvation start time
+ _ASSERTE(waiterStarvationStartTimeWasReset);
+ awareLock->RecordWaiterStarvationStartTime();
+ }
+ return false;
+ }
+
+ state = stateBeforeUpdate;
+ }
+}
+
+FORCEINLINE void AwareLock::LockState::InterlockedUnregisterWaiter()
+{
+ WRAPPER_NO_CONTRACT;
+
+ LockState state = *this;
+ while (true)
+ {
+ _ASSERTE(state.HasAnyWaiters());
+
+ LockState newState = state;
+ newState.DecrementWaiterCount();
+ if (newState.ShouldNotPreemptWaiters() && !newState.HasAnyWaiters())
+ {
+ newState.InvertShouldNotPreemptWaiters();
+ }
+
+ LockState stateBeforeUpdate = CompareExchange(newState, state);
+ if (stateBeforeUpdate == state)
+ {
+ return;
+ }
+
+ state = stateBeforeUpdate;
+ }
+}
+
+FORCEINLINE bool AwareLock::LockState::InterlockedTry_LockAndUnregisterWaiterAndObserveWakeSignal(AwareLock *awareLock)
+{
+ WRAPPER_NO_CONTRACT;
+ _ASSERTE(awareLock != nullptr);
+ _ASSERTE(&awareLock->m_lockState == this);
+
+ // This function is called from the waiter's spin loop and should observe the wake signal only if it also takes the lock,
+ // to prevent a lock releaser from waking another waiter while one is already spinning to acquire the lock.
+ bool waiterStarvationStartTimeWasRecorded = false;
+ LockState state = *this;
+ while (true)
+ {
+ _ASSERTE(state.HasAnyWaiters());
+ _ASSERTE(state.IsWaiterSignaledToWake());
+ if (state.IsLocked())
+ {
+ return false;
+ }
+
+ LockState newState = state;
+ newState.InvertIsLocked();
+ newState.InvertIsWaiterSignaledToWake();
+ newState.DecrementWaiterCount();
+ if (newState.ShouldNotPreemptWaiters())
+ {
+ newState.InvertShouldNotPreemptWaiters();
+
+ if (newState.HasAnyWaiters() && !waiterStarvationStartTimeWasRecorded)
+ {
+ // Update the waiter starvation start time. The time must be recorded before ShouldNotPreemptWaiters() is
+ // cleared, as once that is cleared, another thread may check the waiter starvation start time and the
+ // previously recorded value may be stale, causing ShouldNotPreemptWaiters() to be set again unnecessarily.
+ waiterStarvationStartTimeWasRecorded = true;
+ awareLock->RecordWaiterStarvationStartTime();
+ }
+ }
+
+ LockState stateBeforeUpdate = CompareExchange(newState, state);
+ if (stateBeforeUpdate == state)
+ {
+ if (newState.HasAnyWaiters())
+ {
+ _ASSERTE(!state.ShouldNotPreemptWaiters() || waiterStarvationStartTimeWasRecorded);
+ if (!waiterStarvationStartTimeWasRecorded)
+ {
+ // Since the lock was acquired successfully by a waiter, update the waiter starvation start time
+ awareLock->RecordWaiterStarvationStartTime();
+ }
+ }
+ return true;
+ }
+
+ state = stateBeforeUpdate;
+ }
+}
+
+FORCEINLINE bool AwareLock::LockState::InterlockedObserveWakeSignal_Try_LockAndUnregisterWaiter(AwareLock *awareLock)
+{
+ WRAPPER_NO_CONTRACT;
+ _ASSERTE(awareLock != nullptr);
+ _ASSERTE(&awareLock->m_lockState == this);
+
+ // This function is called at the end of the waiter's spin loop. It must always observe the wake signal, and if the lock is
+ // available, it must acquire the lock and unregister the waiter. If the lock is available, the waiter must acquire it
+ // along with observing the wake signal, because a lock releaser does not wake another waiter while a waiter has been
+ // signaled but the wake signal has not yet been observed.
+
+ LockState stateBeforeUpdate = InterlockedExchangeAdd((LONG *)&m_state, -(LONG)IsWaiterSignaledToWakeMask);
+ _ASSERTE(stateBeforeUpdate.IsWaiterSignaledToWake());
+ if (stateBeforeUpdate.IsLocked())
+ {
+ return false;
+ }
+
+ bool waiterStarvationStartTimeWasRecorded = false;
+ LockState state = stateBeforeUpdate;
+ state.InvertIsWaiterSignaledToWake();
+ _ASSERTE(!state.IsLocked());
+ do
+ {
+ _ASSERTE(state.HasAnyWaiters());
+ LockState newState = state;
+ newState.InvertIsLocked();
+ newState.DecrementWaiterCount();
+ if (newState.ShouldNotPreemptWaiters())
+ {
+ newState.InvertShouldNotPreemptWaiters();
+
+ if (newState.HasAnyWaiters() && !waiterStarvationStartTimeWasRecorded)
+ {
+ // Update the waiter starvation start time. The time must be recorded before ShouldNotPreemptWaiters() is
+ // cleared, as once that is cleared, another thread may check the waiter starvation start time and the
+ // previously recorded value may be stale, causing ShouldNotPreemptWaiters() to be set again unnecessarily.
+ waiterStarvationStartTimeWasRecorded = true;
+ awareLock->RecordWaiterStarvationStartTime();
+ }
+ }
+
+ LockState stateBeforeUpdate = CompareExchange(newState, state);
+ if (stateBeforeUpdate == state)
+ {
+ if (newState.HasAnyWaiters())
+ {
+ _ASSERTE(!state.ShouldNotPreemptWaiters() || waiterStarvationStartTimeWasRecorded);
+ if (!waiterStarvationStartTimeWasRecorded)
+ {
+ // Since the lock was acquired successfully by a waiter, update the waiter starvation start time
+ awareLock->RecordWaiterStarvationStartTime();
+ }
+ }
+ return true;
+ }
+
+ state = stateBeforeUpdate;
+ } while (!state.IsLocked());
+ return false;
+}
+
+FORCEINLINE void AwareLock::ResetWaiterStarvationStartTime()
+{
+ LIMITED_METHOD_CONTRACT;
+ m_waiterStarvationStartTimeMs = 0;
+}
+
+FORCEINLINE void AwareLock::RecordWaiterStarvationStartTime()
+{
+ WRAPPER_NO_CONTRACT;
+
+ DWORD currentTimeMs = GetTickCount();
+ if (currentTimeMs == 0)
+ {
+ // Don't record zero; that value is reserved for indicating that a time has not been recorded
+ --currentTimeMs;
+ }
+ m_waiterStarvationStartTimeMs = currentTimeMs;
+}
+
+FORCEINLINE bool AwareLock::ShouldStopPreemptingWaiters() const
+{
+ WRAPPER_NO_CONTRACT;
+
+ // If the recorded time is zero, a time has not been recorded yet
+ DWORD waiterStarvationStartTimeMs = m_waiterStarvationStartTimeMs;
+ return
+ waiterStarvationStartTimeMs != 0 &&
+ GetTickCount() - waiterStarvationStartTimeMs >= WaiterStarvationDurationMsBeforeStoppingPreemptingWaiters;
+}
+
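
The subtraction in the comparison above is wraparound-safe: GetTickCount() wraps roughly every 49.7 days, and unsigned subtraction yields the elapsed time modulo 2^32. For example:

#include <cstdint>

bool ElapsedAtLeast(uint32_t nowMs, uint32_t startMs, uint32_t thresholdMs)
{
    // Well-defined even when nowMs < startMs because the tick count wrapped.
    return nowMs - startMs >= thresholdMs;
}
// ElapsedAtLeast(5u, 0xFFFFFFFEu, 7u) == true: the elapsed time is 7 ms.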
+FORCEINLINE void AwareLock::SpinWait(const YieldProcessorNormalizationInfo &normalizationInfo, DWORD spinIteration)
+{
+ WRAPPER_NO_CONTRACT;
+
+ _ASSERTE(g_SystemInfo.dwNumberOfProcessors != 1);
+ _ASSERTE(spinIteration < g_SpinConstants.dwMonitorSpinCount);
+
+ YieldProcessorWithBackOffNormalized(normalizationInfo, spinIteration);
+}
+
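
YieldProcessorWithBackOffNormalized issues processor-yield hints whose count grows with the spin iteration. A rough x86-only stand-in, omitting the normalization step (which scales the pause count by the measured cost of a pause on the current hardware):

#include <immintrin.h>

void SpinWaitSketch(unsigned spinIteration)
{
    // Capped exponential back-off: later iterations pause for longer.
    unsigned pauses = 1u << (spinIteration < 10 ? spinIteration : 10);
    for (unsigned i = 0; i < pauses; ++i)
        _mm_pause(); // other architectures would use their own yield hint
}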
+FORCEINLINE bool AwareLock::TryEnterHelper(Thread* pCurThread)
{
CONTRACTL{
SO_TOLERANT;
NOTHROW;
GC_NOTRIGGER;
- MODE_COOPERATIVE;
+ MODE_ANY;
} CONTRACTL_END;
- _ASSERTE(spinCountRef != nullptr);
- DWORD &spinCount = *spinCountRef;
- _ASSERTE(g_SystemInfo.dwNumberOfProcessors != 1);
-
- if (spinCount > g_SpinConstants.dwMaximumDuration)
+ if (m_lockState.InterlockedTryLock())
{
- return false;
+ m_HoldingThread = pCurThread;
+ m_Recursion = 1;
+ pCurThread->IncLockCount();
+ return true;
}
- for (DWORD i = 0; i < spinCount; i++)
+ if (GetOwningThread() == pCurThread) /* monitor is held, but it could be a recursive case */
{
- YieldProcessor();
+ m_Recursion++;
+ return true;
}
-
- spinCount *= g_SpinConstants.dwBackoffFactor;
- return true;
+ return false;
}
-FORCEINLINE bool AwareLock::EnterHelper(Thread* pCurThread, bool checkRecursiveCase)
+FORCEINLINE AwareLock::EnterHelperResult AwareLock::TryEnterBeforeSpinLoopHelper(Thread *pCurThread)
{
CONTRACTL{
SO_TOLERANT;
@@ -44,23 +502,91 @@ FORCEINLINE bool AwareLock::EnterHelper(Thread* pCurThread, bool checkRecursiveC
MODE_ANY;
} CONTRACTL_END;
- LONG state = m_MonitorHeld.LoadWithoutBarrier();
- if (state == 0)
+ LockState state = m_lockState;
+
+ // Check the recursive case once before the spin loop. If it's not a recursive enter now, it cannot become one later,
+ // so the spin loop can avoid rechecking the recursive case.
+ if (!state.IsLocked() || GetOwningThread() != pCurThread)
{
- if (InterlockedCompareExchangeAcquire((LONG*)&m_MonitorHeld, 1, 0) == 0)
+ if (m_lockState.InterlockedTrySetShouldNotPreemptWaitersIfNecessary(this, state))
{
- m_HoldingThread = pCurThread;
- m_Recursion = 1;
- pCurThread->IncLockCount();
- return true;
+ // This thread currently should not preempt waiters, just wait
+ return EnterHelperResult_UseSlowPath;
}
+
+ // Not a recursive enter, try to acquire the lock or register the spinner
+ EnterHelperResult result = m_lockState.InterlockedTry_LockOrRegisterSpinner(state);
+ if (result != EnterHelperResult_Entered)
+ {
+ // EnterHelperResult_Contention:
+ // Lock was not acquired and the spinner was registered
+ // EnterHelperResult_UseSlowPath:
+ // This thread currently should not preempt waiters, or we reached the maximum number of spinners, just wait
+ return result;
+ }
+
+ // Lock was acquired and the spinner was not registered
+ m_HoldingThread = pCurThread;
+ m_Recursion = 1;
+ pCurThread->IncLockCount();
+ return EnterHelperResult_Entered;
+ }
+
+ // Recursive enter
+ m_Recursion++;
+ return EnterHelperResult_Entered;
+}
+
+FORCEINLINE AwareLock::EnterHelperResult AwareLock::TryEnterInsideSpinLoopHelper(Thread *pCurThread)
+{
+ CONTRACTL{
+ SO_TOLERANT;
+ NOTHROW;
+ GC_NOTRIGGER;
+ MODE_ANY;
+ } CONTRACTL_END;
+
+ // Try to acquire the lock and unregister the spinner. The recursive case is not checked here because
+ // TryEnterBeforeSpinLoopHelper() would have taken care of that case before the spin loop.
+ EnterHelperResult result = m_lockState.InterlockedTry_LockAndUnregisterSpinner();
+ if (result != EnterHelperResult_Entered)
+ {
+ // EnterHelperResult_Contention:
+ // Lock was not acquired and the spinner was not unregistered
+ // EnterHelperResult_UseSlowPath:
+ // This thread currently should not preempt waiters, stop spinning and just wait
+ return result;
}
- else if (checkRecursiveCase && GetOwningThread() == pCurThread) /* monitor is held, but it could be a recursive case */
+
+ // Lock was acquired and spinner was unregistered
+ m_HoldingThread = pCurThread;
+ m_Recursion = 1;
+ pCurThread->IncLockCount();
+ return EnterHelperResult_Entered;
+}
+
+FORCEINLINE bool AwareLock::TryEnterAfterSpinLoopHelper(Thread *pCurThread)
+{
+ CONTRACTL{
+ SO_TOLERANT;
+ NOTHROW;
+ GC_NOTRIGGER;
+ MODE_ANY;
+ } CONTRACTL_END;
+
+ // Unregister the spinner and try to acquire the lock. A spinner must not unregister itself without trying to acquire the
+ // lock, because a lock releaser does not wake a waiter while a spinner is registered (the spinner is expected to take the
+ // lock if it becomes available).
+ if (!m_lockState.InterlockedUnregisterSpinner_TryLock())
{
- m_Recursion++;
- return true;
+ // Spinner was unregistered and the lock was not acquired
+ return false;
}
- return false;
+
+ // Spinner was unregistered and the lock was acquired
+ m_HoldingThread = pCurThread;
+ m_Recursion = 1;
+ pCurThread->IncLockCount();
+ return true;
}
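
The three helpers are designed to bracket a spin loop in the contended-enter slow path. A plausible driver is sketched below; the real slow path lives in syncblk.cpp and differs in details, so this only illustrates the intended protocol:

bool EnterContended(AwareLock *lock, Thread *thread,
                    const YieldProcessorNormalizationInfo &info, DWORD spinCount)
{
    AwareLock::EnterHelperResult result = lock->TryEnterBeforeSpinLoopHelper(thread);
    if (result == AwareLock::EnterHelperResult_Entered)
        return true;
    if (result == AwareLock::EnterHelperResult_Contention) // registered as a spinner
    {
        for (DWORD i = 0; i < spinCount; ++i)
        {
            AwareLock::SpinWait(info, i);
            result = lock->TryEnterInsideSpinLoopHelper(thread);
            if (result == AwareLock::EnterHelperResult_Entered)
                return true;
            if (result == AwareLock::EnterHelperResult_UseSlowPath)
                break; // stop spinning; fall through to unregister the spinner
        }
        // Unregisters the spinner, taking the lock if it is available.
        if (lock->TryEnterAfterSpinLoopHelper(thread))
            return true;
    }
    return false; // caller proceeds to register as a waiter and wait
}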
FORCEINLINE AwareLock::EnterHelperResult ObjHeader::EnterObjMonitorHelper(Thread* pCurThread)
@@ -105,7 +631,7 @@ FORCEINLINE AwareLock::EnterHelperResult ObjHeader::EnterObjMonitorHelper(Thread
SyncBlock *syncBlock = g_pSyncTable[oldValue & MASK_SYNCBLOCKINDEX].m_SyncBlock;
_ASSERTE(syncBlock != NULL);
- if (syncBlock->m_Monitor.EnterHelper(pCurThread, true /* checkRecursiveCase */))
+ if (syncBlock->m_Monitor.TryEnterHelper(pCurThread))
{
return AwareLock::EnterHelperResult_Entered;
}
@@ -159,7 +685,7 @@ FORCEINLINE AwareLock::LeaveHelperAction AwareLock::LeaveHelper(Thread* pCurThre
if (m_HoldingThread != pCurThread)
return AwareLock::LeaveHelperAction_Error;
- _ASSERTE((size_t)m_MonitorHeld & 1);
+ _ASSERTE(m_lockState.IsLocked());
_ASSERTE(m_Recursion >= 1);
#if defined(_DEBUG) && defined(TRACK_SYNC) && !defined(CROSSGEN_COMPILE)
@@ -174,14 +700,14 @@ FORCEINLINE AwareLock::LeaveHelperAction AwareLock::LeaveHelper(Thread* pCurThre
m_HoldingThread->DecLockCount();
m_HoldingThread = NULL;
- // Clear lock bit.
- LONG state = InterlockedDecrementRelease((LONG*)&m_MonitorHeld);
-
- // If wait count is non-zero on successful clear, we must signal the event.
- if (state & ~1)
+ // Clear lock bit and determine whether we must signal a waiter to wake
+ if (!m_lockState.InterlockedUnlock())
{
- return AwareLock::LeaveHelperAction_Signal;
+ return AwareLock::LeaveHelperAction_None;
}
+
+ // There is a waiter and we must signal a waiter to wake
+ return AwareLock::LeaveHelperAction_Signal;
}
return AwareLock::LeaveHelperAction_None;
}
diff --git a/src/vm/threads.cpp b/src/vm/threads.cpp
index 11f9c866af..de5eb6abec 100644
--- a/src/vm/threads.cpp
+++ b/src/vm/threads.cpp
@@ -1241,11 +1241,11 @@ void Dbg_TrackSyncStack::EnterSync(UINT_PTR caller, void *pAwareLock)
{
LIMITED_METHOD_CONTRACT;
- STRESS_LOG4(LF_SYNC, LL_INFO100, "Dbg_TrackSyncStack::EnterSync, IP=%p, Recursion=%d, MonitorHeld=%d, HoldingThread=%p.\n",
+ STRESS_LOG4(LF_SYNC, LL_INFO100, "Dbg_TrackSyncStack::EnterSync, IP=%p, Recursion=%u, LockState=%x, HoldingThread=%p.\n",
caller,
- ((AwareLock*)pAwareLock)->m_Recursion,
- ((AwareLock*)pAwareLock)->m_MonitorHeld.LoadWithoutBarrier(),
- ((AwareLock*)pAwareLock)->m_HoldingThread );
+ ((AwareLock*)pAwareLock)->GetRecursionLevel(),
+ ((AwareLock*)pAwareLock)->GetLockState(),
+ ((AwareLock*)pAwareLock)->GetHoldingThread());
if (m_Active)
{
@@ -1267,11 +1267,11 @@ void Dbg_TrackSyncStack::LeaveSync(UINT_PTR caller, void *pAwareLock)
{
WRAPPER_NO_CONTRACT;
- STRESS_LOG4(LF_SYNC, LL_INFO100, "Dbg_TrackSyncStack::LeaveSync, IP=%p, Recursion=%d, MonitorHeld=%d, HoldingThread=%p.\n",
+ STRESS_LOG4(LF_SYNC, LL_INFO100, "Dbg_TrackSyncStack::LeaveSync, IP=%p, Recursion=%u, LockState=%x, HoldingThread=%p.\n",
caller,
- ((AwareLock*)pAwareLock)->m_Recursion,
- ((AwareLock*)pAwareLock)->m_MonitorHeld.LoadWithoutBarrier(),
- ((AwareLock*)pAwareLock)->m_HoldingThread );
+ ((AwareLock*)pAwareLock)->GetRecursionLevel(),
+ ((AwareLock*)pAwareLock)->GetLockState(),
+ ((AwareLock*)pAwareLock)->GetHoldingThread());
if (m_Active)
{
diff --git a/src/vm/vars.cpp b/src/vm/vars.cpp
index 343f1545c8..3c54029231 100644
--- a/src/vm/vars.cpp
+++ b/src/vm/vars.cpp
@@ -146,7 +146,8 @@ SpinConstants g_SpinConstants = {
50, // dwInitialDuration
40000, // dwMaximumDuration - ideally (20000 * max(2, numProc))
3, // dwBackoffFactor
- 10 // dwRepetitions
+ 10, // dwRepetitions
+ 0 // dwMonitorSpinCount
};
// support for Event Tracing for Windows (ETW)