path: root/src/vm/comsynchronizable.cpp
Age | Commit message | Author | Files | Lines
2019-04-03 | Remove ADID and ADIndex from CoreCLR (#23588) | David Wrighton | 1 | -3/+3
- Remove concept of AppDomain from object api in VM
- Various infrastructure around entering/leaving appdomains is removed
- Add small implementation of GetAppDomain for use by DAC (to match existing behavior)
- Simplify finalizer thread operations
- Eliminate AppDomain::Terminate
- Remove use of ADID from stresslog
- Remove thread enter/leave tracking from AppDomain
- Remove unused asm constants across all architectures
- Re-order header inclusion order to put gcenv.h before handletable
- Remove retail only sync block code involving appdomain index
2019-03-01 | Implement Serialization Guard | Morgan Brown | 1 | -0/+23
Add Serialization Guard API and consume it in CoreLib targets
2019-02-25 | Remove support for ICLRExecutionManager and pause/resume code for waits (#22834) | Filip Navara | 1 | -9/+1
2019-01-23 | Remove all traces of FEATURE_STACK_PROBE. (#22149) | Filip Navara | 1 | -9/+0
2019-01-23 | Remove unused thread abortion methods. (#22147) | Filip Navara | 1 | -136/+0
2019-01-10 | Normalize a few more spin-wait loops (#21586) | Koundinya Veluri | 1 | -2/+2
Normalize a few more spin-wait loops
- Fixed a few more spin-waits to normalize the spin-wait duration between processors
- These spin-waits have so far not needed to be retuned to avoid unreasonably long spin-wait durations. They can be retuned as necessary in the future.
- Added a version of YieldProcessorNormalized() that normalizes based on spin-wait counts tuned for pre-Skylake processors, for spin-wait loops that have not been retuned.
- Moved some files around to make YieldProcessorNormalized() and the like available in more places. Initialization is still only done in the VM. Uses outside the VM will use the defaults, where there would be no significant change from before.
- Made YieldProcessor() private outside of the GC and added System_YieldProcessor() for when the system-defined implementation is intended to be used
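The normalization mentioned in this commit (and described in detail in the 2017-09-01 entry further down) boils down to scaling raw pause counts by a per-processor cost measured once at run time. The following is a rough, hedged sketch of that idea only; all names, the probe count, and the target constant are invented for illustration and this is not the CoreCLR implementation:

    #include <chrono>
    #include <immintrin.h>   // _mm_pause on x86/x64

    static unsigned g_pausesPerNormalizedYield = 1;  // computed once, lazily

    // Measure roughly how long one pause takes, then derive how many pauses add
    // up to a fixed target delay, so a "normalized yield" costs about the same
    // wall-clock time on processors with fast and slow pause instructions.
    static void InitNormalizedYield()
    {
        using namespace std::chrono;
        constexpr unsigned kProbe = 10000;
        auto start = high_resolution_clock::now();
        for (unsigned i = 0; i < kProbe; i++)
            _mm_pause();
        long long nsPerPause =
            duration_cast<nanoseconds>(high_resolution_clock::now() - start).count() / kProbe;

        constexpr long long kTargetNsPerYield = 40;  // arbitrary target for this sketch
        g_pausesPerNormalizedYield =
            nsPerPause > 0 ? static_cast<unsigned>(kTargetNsPerYield / nsPerPause) : 1;
        if (g_pausesPerNormalizedYield == 0)
            g_pausesPerNormalizedYield = 1;
    }

    // Callers express spin counts in normalized units, so loops written against
    // this primitive do not need per-processor retuning.
    static void YieldProcessorNormalizedSketch()
    {
        for (unsigned i = 0; i < g_pausesPerNormalizedYield; i++)
            _mm_pause();
    }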
2019-01-03 | Cleanup current culture handling in the unmanaged runtime (#21706) | Jan Kotas | 1 | -141/+0
A large portion of the current culture handling in the unmanaged runtime, inherited from desktop, has been a no-op. The nativeInitCultureAccessors QCall that it depended on was (almost) never called in CoreCLR.
- Delete resetting of the current culture on threadpool threads. It was needed on desktop because of a very tricky flow of the current culture between appdomains. It is superseded by flowing the current culture via AsyncLocal in CoreCLR.
- Comment out the fetch of the managed current culture for unmanaged resource lookup. It has a number of problems that are not easy to fix. We are not localizing the unmanaged runtime currently anyway, so it is OK to just comment it out.
- Fix the rest to call CultureInfo directly without going through Thread.CurrentThread
2018-12-10 | Refactor internal System.AppDomain out of CoreLib (#21460) | Jan Kotas | 1 | -68/+0
Fixes #21028
2018-11-16 | Fix unloadability races (#21024) | Jan Vorlicek | 1 | -2/+2
* Fix LoaderAllocator::AllocateHandle
  When another thread wins the race in growing the handle table, the code was not refreshing the slotsUsed local to the new, up-to-date value. This was leading to overwriting / reusing a live handle. This change fixes it.
* Embed ThreadLocalBlock in Thread
  Instead of allocating the ThreadLocalBlock dynamically, embed it in the Thread. That solves a race between thread destruction and LoaderAllocator destruction: the ThreadLocalBlock could have been deleted during Thread shutdown while the LoaderAllocator's destruction was still working with it.
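As a hedged illustration of the bug class fixed in LoaderAllocator::AllocateHandle (hypothetical types and names; not the actual CoreCLR code), the pattern is: a thread captures a "slots used" value, loses the race to grow the table, and must reload the value instead of continuing with the stale local:

    #include <atomic>
    #include <cstring>
    #include <mutex>

    struct HandleBlock
    {
        std::atomic<size_t> slotsUsed{0};
        std::atomic<size_t> capacity{0};
        void** slots = nullptr;
        std::mutex growLock;
    };

    static void GrowIfFull(HandleBlock& block)
    {
        std::lock_guard<std::mutex> guard(block.growLock);
        size_t cap = block.capacity.load();
        if (block.slotsUsed.load() < cap)
            return;                         // another thread already grew the table
        size_t newCap = cap == 0 ? 16 : cap * 2;
        void** newSlots = new void*[newCap]();
        if (block.slots != nullptr)
            std::memcpy(newSlots, block.slots, cap * sizeof(void*));
        block.slots = newSlots;             // old array intentionally leaked in this sketch
        block.capacity.store(newCap);
    }

    size_t ReserveSlot(HandleBlock& block)
    {
        for (;;)
        {
            size_t used = block.slotsUsed.load();
            if (used < block.capacity.load() &&
                block.slotsUsed.compare_exchange_weak(used, used + 1))
            {
                return used;                // reserved the slot we just re-validated
            }
            // Lost the race or the block was full: reload the state on the next
            // iteration instead of reusing the stale `used` local, which is the
            // essence of the AllocateHandle fix described above.
            GrowIfFull(block);
        }
    }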
2018-10-04 | Remove AppDomain unload (#20250) | Jan Vorlicek | 1 | -14/+4
* Remove AppDomain unload
  This change removes all code in AppDomain that's related to AppDomain unloading, which is obsolete in CoreCLR. It also removes all calls to the removed methods. In a few places, I have made the change simpler by taking into account the fact that there is always just one AppDomain.
2018-08-01 | Expose OSThreadId and TimeStamp on EventWrittenEventArgs (#19002) | Brian Robbins | 1 | -0/+21
* Add EventWrittenEventArgs public properties OSThreadId and TimeStamp and plumb through OSThreadId.
* Plumb ActivityId and RelatedActivityId to EventListener.
2018-06-03 | Warnings cleanup (#18260) | Robin Sue | 1 | -1/+0
* Clean up all disabled warnings that do not trigger
* Fix warning about line continuation in a single-line comment
* Eliminate all unreferenced local variables and re-enable the warning
2018-05-31 | Review feedback | Vance Morrison | 1 | -1/+1
2018-05-31 | Review feedback | Vance Morrison | 1 | -1/+1
2018-05-31 | Review feedback | Vance Morrison | 1 | -1/+1
2018-05-31 | Fixed misspelling | Vance Morrison | 1 | -3/+3
2018-05-31 | Review feedback (avoid passing out unpinned managed pointers). | Vance Morrison | 1 | -5/+11
2018-05-30 | Fix Managed Thread Name Setting | Vance Morrison | 1 | -2/+6
System.Threading.Thread has a 'Name' property that calls the Windows SetThreadDescription API, which gives the thread a friendly name for debuggers and profilers. However, this code was broken in a couple of ways. First, it actually called SetThreadDescription with a null name when a thread was created (even if the name was set later). Secondly, if Thread.Name was set before the thread was actually started, then SetThreadDescription was not called at all. This change fixes both of these issues. First, when a thread is created, it uses the Name property if it is set. Second, if the Name property is set, it does not call SetThreadDescription if the thread has not been created yet. Third, it skips calls to SetThreadDescription if the name is null.
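A minimal sketch of the fixed ordering, assuming a simplified thread state and using the real Win32 SetThreadDescription API (the struct and helper names here are invented; the actual logic lives in the VM's thread code):

    #include <windows.h>   // SetThreadDescription, Windows 10 1607+ SDK
    #include <string>

    struct ManagedThreadState
    {
        HANDLE osHandle = nullptr;   // null until the OS thread has been created
        std::wstring name;           // Thread.Name, empty if never set
    };

    // Called when Thread.Name is assigned.
    void OnNameSet(ManagedThreadState& t, const wchar_t* newName)
    {
        t.name = (newName != nullptr) ? newName : L"";

        // Skip null/empty names, and defer the call if the OS thread does not exist yet.
        if (!t.name.empty() && t.osHandle != nullptr)
            SetThreadDescription(t.osHandle, t.name.c_str());
    }

    // Called when the OS thread is created (Thread.Start).
    void OnThreadStarted(ManagedThreadState& t, HANDLE osHandle)
    {
        t.osHandle = osHandle;

        // Apply a name that was set before Start instead of passing null.
        if (!t.name.empty())
            SetThreadDescription(t.osHandle, t.name.c_str());
    }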
2018-03-19 | Delete unused downlevel globalization support (#17022) | Jan Kotas | 1 | -2/+0
Delete ENABLE_DOWNLEVEL_FOR_NLS and everything under it
2018-02-28 | Delete unnecessary StackCrawlMarks (#16648) | Jan Kotas | 1 | -3/+3
CAS leftovers
2018-02-28 | Add Thread.GetCurrentProcessorId() API (#16650) | Jan Kotas | 1 | -0/+7
Contributes to https://github.com/dotnet/corefx/issues/16767
2017-12-28 | Recognize STA\MTA Attribute For Main Function (#15652) | Anirudh Agnihotry | 1 | -6/+3
* Apartment state set for main method
* g_fWeownprocess removed and CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_FinalizeOnShutdown) set
2017-12-20 | Revert "Respect STA/MTAThread attributes (#15512)" | Jan Kotas | 1 | -3/+6
This reverts commit 21cfdb6f5bb8c596aa55cc50892be0bfabee5de3.
2017-12-16 | Respect STA/MTAThread attributes (#15512) | Anirudh Agnihotry | 1 | -6/+3
* Apartment state set for main method
* Default apartment state is Unknown
* Removed extra validation of token
* Removed legacy apartment code
2017-10-31 | Clean up YieldProcessorNormalized (#14739) | Koundinya Veluri | 1 | -6/+2
Move YieldProcessorNormalized into separate files
Clean up YieldProcessorNormalized
2017-09-19 | Move initialization of YieldProcessorNormalized to the finalizer thread (#14058) | Koundinya Veluri | 1 | -6/+8
Move initialization of YieldProcessorNormalized to the finalizer thread
Fixes https://github.com/dotnet/coreclr/issues/13984
- Also moved the relevant functions out of the Thread class, as requested in the issue
- For some reason, after moving the functions out of the Thread class, YieldProcessorNormalized was not getting inlined anymore. It seems to be important for it to be inlined so that the memory loads are hoisted out of outer loops. To remove the dependency on the compiler to do it (even with forceinline it's not always possible to hoist, for instance in InterlockedCompareExchange loops), changed the signatures to do what is intended.
2017-09-01 | Add normalized equivalent of YieldProcessor, retune some spin loops (#13670) | Koundinya Veluri | 1 | -7/+27
* Add normalized equivalent of YieldProcessor, retune some spin loops
Part of fix for https://github.com/dotnet/coreclr/issues/13388

Normalized equivalent of YieldProcessor
- The delay incurred by YieldProcessor is measured once lazily at run-time
- Added YieldProcessorNormalized that yields for a specific duration (the duration is approximately equal to what was measured for one YieldProcessor on a Skylake processor, about 125 cycles). The measurement calculates how many YieldProcessor calls are necessary to get a delay close to the desired duration.
- Changed Thread.SpinWait to use YieldProcessorNormalized

Thread.SpinWait divide-count-by-7 experiment
- At this point I experimented with changing Thread.SpinWait to divide the requested number of iterations by 7, to see how it fares on perf. On my Sandy Bridge processor, 7 * YieldProcessor == YieldProcessorNormalized. See numbers in PR below.
- Not too many regressions, and the overall perf is somewhat as expected - not much change on the Sandy Bridge processor, significant improvement on the Skylake processor.
- I'm discounting the SemaphoreSlim throughput score because it seems to be heavily dependent on Monitor. It would be more interesting to revisit SemaphoreSlim after retuning Monitor's spin heuristics.
- ReaderWriterLockSlim seems to perform worse on Skylake; the current spin heuristics are not translating well.

Spin tuning
- At this point, I abandoned the experiment above and tried to retune spins that use Thread.SpinWait
- General observations
  - YieldProcessor stage
    - At this stage, in many places we're currently doing very long spins on YieldProcessor per iteration of the spin loop. In the last YieldProcessor iteration, it amounts to about 70 K cycles on Sandy Bridge and 512 K cycles on Skylake.
    - Long spins on YieldProcessor don't let other work run efficiently. Especially when many scheduled threads all issue a long YieldProcessor, a significant portion of the processor can go unused for a long time.
    - Long spins on YieldProcessor do in some cases help to reduce contention in high-contention cases, effectively taking away some threads into a long delay. Sleep(1) works much better but has a much higher delay, so it's not always appropriate. In other cases, I found that it's better to do more iterations with a shorter YieldProcessor. It would be even better to reduce the contention in the app or to have a proper wait in the sync object, where appropriate.
    - Updated the YieldProcessor measurement above to calculate the number of YieldProcessorNormalized calls that amount to about 900 cycles (this was tuned based on perf), and modified SpinWait's YieldProcessor stage to cap the number of iterations passed to Thread.SpinWait. Effectively, the first few iterations have a longer delay than before on Sandy Bridge and a shorter delay than before on Skylake, and the later iterations have a much shorter delay than before on both.
  - Yield/Sleep(0) stage - observed a couple of issues:
    - When there are no threads to switch to, Yield and Sleep(0) become no-ops, turning the spin loop into a busy-spin that may quickly reach the max spin count and cause the thread to enter a wait state, or may just busy-spin for longer than desired before a Sleep(1). Completing the spin loop too early can cause excessive context switching if a wait follows, and entering the Sleep(1) stage too early can cause excessive delays.
    - If there are multiple threads doing Yield and Sleep(0) (typically from the same spin loop due to contention), they may switch between one another, delaying work that could make progress.
    - I found that it works well to interleave a Yield/Sleep(0) with YieldProcessor; it enforces a minimum delay for this stage. Modified SpinWait to do this until it reaches the Sleep(1) threshold.
  - Sleep(1) stage
    - I didn't see any benefit in the tests from interleaving Sleep(1) calls with some Yield/Sleep(0) calls; perf actually seemed to be a bit worse. If the Sleep(1) stage is reached, there is probably a lot of contention and the Sleep(1) stage helps to remove some threads from the equation for a while. Adding some Yield/Sleep(0) in between seems to add back some of that contention.
    - Modified SpinWait to use a Sleep(1) threshold, after which point it only does Sleep(1) on each spin iteration
    - For the Sleep(1) threshold, I couldn't find one constant that works well in all cases
      - Spin loops that are followed by a proper wait (such as a wait on an event that is signaled when the resource becomes available) benefit from not doing Sleep(1) at all and spinning in the other stages for longer
      - Infinite spin loops usually seemed to benefit from a lower Sleep(1) threshold to reduce contention, but the threshold also depends on other factors like how much work is done in each spin iteration, how efficient waiting is, and whether waiting has any negative side-effects.
    - Added an internal overload of SpinWait.SpinOnce to take the Sleep(1) threshold as a parameter
- SpinWait - tweaked the spin strategy as mentioned above
- ManualResetEventSlim - changed to use SpinWait, retuned the default number of iterations (the total delay is still significantly less than before). Retained the previous behavior of doing Sleep(1) if a higher spin count is requested.
- Task - it was using the same heuristics as ManualResetEventSlim; copied the changes here as well
- SemaphoreSlim - changed to use SpinWait, retuned similarly to ManualResetEventSlim but with double the number of iterations because the wait path is a lot more expensive
- SpinLock - SpinLock was using very long YieldProcessor spins. Changed to use SpinWait, removed the process count multiplier, simplified.
- ReaderWriterLockSlim - this one is complicated as there are many issues. The current spin heuristics performed better even after normalizing Thread.SpinWait but without changing the SpinWait iterations (the delay is longer than before), so I left this one as is.
- The perf (see numbers in PR below) seems to be much better than both the baseline and the Thread.SpinWait divide-by-7 experiment
  - On Sandy Bridge, I didn't see many significant regressions. ReaderWriterLockSlim is a bit worse in some cases and a bit better in other similar cases, but at least the really low scores in the baseline got much better and not the other way around.
  - On Skylake, some significant regressions are in SemaphoreSlim throughput (which I'm discounting, as mentioned in the experiment above) and CountdownEvent add/signal throughput. The latter can probably be improved later.
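As a rough sketch of the staged strategy described in the commit above (the thresholds, names, and stage boundaries are invented for illustration and are not the actual SpinWait implementation), one spin step might look like this:

    #include <windows.h>    // Sleep, SwitchToThread
    #include <immintrin.h>  // _mm_pause

    static void YieldProcessorNormalizedSketch(int count)
    {
        for (int i = 0; i < count; i++)
            _mm_pause();    // stand-in for the normalized yield
    }

    void SpinOnceSketch(int spinIndex, int sleep1Threshold)
    {
        if (spinIndex >= sleep1Threshold)
        {
            // Sleep(1) stage: heavy contention, take this thread out of the mix
            // for a while; no interleaving with Yield/Sleep(0), per the tuning notes.
            Sleep(1);
        }
        else if (spinIndex < 10)
        {
            // YieldProcessor stage: short, capped, normalized delays rather than
            // the very long per-iteration spins used before the retuning.
            YieldProcessorNormalizedSketch(1 << spinIndex);
        }
        else if ((spinIndex & 1) == 0)
        {
            // Yield/Sleep(0) stage, interleaved with YieldProcessor so the stage
            // still enforces a minimum delay even when there is nothing to switch to.
            YieldProcessorNormalizedSketch(16);
        }
        else
        {
            if (!SwitchToThread())
                Sleep(0);
        }
    }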
2017-08-14 | Added SetThreadDescription to set the unmanaged thread name (#12593) | Alois-xx | 1 | -1/+14
* Added SetThreadDescription to also set the unmanaged thread name when a managed thread name is set. This will show up in future debuggers which know how to read that information, or in ETW traces in the Thread Name column.
* Use printf instead of wprintf, which exists on all platforms.
* Removed printf. Ensure that GetProcAddress is only called once, even when the method is not present. The potential perf hit should be negligible since setting a thread name can only happen once per managed thread.
* - Moved the SetThreadName code to winfix.cpp as proposed
  - Finalizer and threadpool threads get their name
  - GCToEEInterface::CreateBackgroundThread is also named
  - Regular GC threads have no name, because when I included utilcode.h things broke apart.
* Fix for data race in g_pfnSetThreadDescription
* Fix string literals on Unix builds.
* Fixed nits. Settled the thread pool thread name on ".NET Core ThreadPool".
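A hedged sketch of the lazy, race-free lookup this commit describes (the g_pfnSetThreadDescription name comes from the commit text; the helper and the sentinel scheme are illustrative, not the actual winfix.cpp code). SetThreadDescription only exists on newer Windows versions, so it is resolved dynamically and the result is cached:

    #include <windows.h>

    typedef HRESULT(WINAPI* PFNSetThreadDescription)(HANDLE, PCWSTR);

    // -1 == not resolved yet, nullptr == resolved but unavailable on this OS.
    static PVOID volatile g_pfnSetThreadDescription = (PVOID)(INT_PTR)-1;

    static void SetUnmanagedThreadName(HANDLE thread, PCWSTR name)
    {
        PVOID pfn = g_pfnSetThreadDescription;
        if (pfn == (PVOID)(INT_PTR)-1)
        {
            HMODULE kernel32 = GetModuleHandleW(L"kernel32.dll");
            PVOID resolved = (kernel32 != nullptr)
                ? (PVOID)GetProcAddress(kernel32, "SetThreadDescription")
                : nullptr;
            // Interlocked publish avoids the data race on the cached pointer;
            // concurrent first callers may both resolve, but the result is identical,
            // and after that GetProcAddress is never called again.
            InterlockedExchangePointer(&g_pfnSetThreadDescription, resolved);
            pfn = resolved;
        }
        if (pfn != nullptr && name != nullptr)
            ((PFNSetThreadDescription)pfn)(thread, name);
    }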
2017-08-07 | Cleanup code access security from the unmanaged runtime (#13241) | Jan Kotas | 1 | -1/+0
2017-06-13 | Partially remove relocations for ModuleSection (ZapVirtualSectionType). (#11853) | Ruben Ayrapetyan | 1 | -1/+1
2017-03-27 | Add heuristic to trigger GC to finalize dead threads and clean up handles and memory (#10413) | Koundinya Veluri | 1 | -12/+2
Add heuristic to trigger GC to finalize dead threads and clean up handles and memory
A thread that is started allocates some handles in Thread::AllocHandles(). After it terminates, the managed thread object needs to be collected by a GC for the handles to be released. Some applications that are mostly idle but have a timer ticking periodically will cause a thread pool thread to be created on each tick, since the thread pool currently preserves threads for a maximum of 20 seconds. Over time the number of handles accumulates to a high value. Thread creation adds some memory pressure, but that does not force a GC until a sufficiently large handle count, and for some mostly idle apps that can take very long. The same issue arises with directly starting threads as well.
Fixes #6602:
- Track a dead thread count separately from the current dead thread count. This is the count that will contribute to triggering a GC once it reaches a threshold. The count is tracked separately so that it may be reset to zero when a max-generation GC occurs, preventing dead threads that survive a GC due to references from contributing to triggering a GC again in this fashion.
- If the count exceeds a threshold, enumerate dead threads to see which GC generation the corresponding managed thread objects are in. If the duration of time since the last GC of the desired generation also exceeds a threshold, trigger a preferably non-blocking GC on the finalizer thread.
- Removed a couple of handles and some code paths specific to user-requested thread suspension, which is not supported on CoreCLR.
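A simplified sketch of that heuristic, with invented names, thresholds, and hooks (the real logic is spread across the VM's thread store, GC, and finalizer code):

    #include <atomic>
    #include <chrono>

    static std::atomic<unsigned> g_deadThreadCountForGCTrigger{0};
    static std::chrono::steady_clock::time_point g_lastTriggeredGC{};

    constexpr unsigned kDeadThreadCountThreshold = 32;             // invented value
    constexpr std::chrono::seconds kMinTimeBetweenTriggeredGCs{10}; // invented value

    // Placeholder: in the VM this would queue work for the finalizer thread.
    static void RequestGCOnFinalizerThread() { }

    // Hypothetical hook called when a managed thread object becomes dead.
    void OnThreadBecameDead()
    {
        unsigned count = g_deadThreadCountForGCTrigger.fetch_add(1) + 1;
        if (count < kDeadThreadCountThreshold)
            return;

        auto now = std::chrono::steady_clock::now();
        if (now - g_lastTriggeredGC < kMinTimeBetweenTriggeredGCs)
            return;

        g_lastTriggeredGC = now;
        // Ask the finalizer thread to run a (preferably non-blocking) GC of the
        // generation that holds the dead managed thread objects.
        RequestGCOnFinalizerThread();
    }

    // Hypothetical hook called after a max-generation GC completes.
    void OnMaxGenerationGCFinished()
    {
        // Dead threads that survived due to outstanding references should not keep
        // re-triggering GCs, so the trigger count is reset here.
        g_deadThreadCountForGCTrigger.store(0);
    }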
2017-03-25 | Add Interlocked.MemoryBarrierProcessWide (#10476) | Jan Kotas | 1 | -10/+0
Contributes to #16799
2017-03-07 | [x86/Linux] CDECL as default P/Invoke Calling Convention (#9977) | Jonghyun Park | 1 | -1/+1
* [x86/Linux] CDECL as default P/Invoke Calling Convention
2017-02-14 | Remove never defined FEATURE_REMOTING | danmosemsft | 1 | -193/+0
2017-02-14 | Remove never defined FEATURE_LEAK_CULTURE_INFO | danmosemsft | 1 | -52/+0
2017-02-14 | Remove never defined FEATURE_INCLUDE_ALL_INTERFACES | danmosemsft | 1 | -14/+0
2017-02-12 | Remove dead appdomainhelper* files | danmosemsft | 1 | -1/+0
2017-02-12 | Remove never defined FEATURE_COMPRESSEDSTACK | danmosemsft | 1 | -45/+0
2017-02-10 | Remove always defined FEATURE_CORECLR | danmosemsft | 1 | -151/+0
2017-02-07 | Remove more CAS (#9390) | Dan Moseley | 1 | -28/+5
* Remove PermissionSet
* Remove HostProtectionAttribute
* Remove PermissionState
* Remove S.Security.Permissions
* Remove IPrincipal
* Fix native side
* Remove model.xml again
2017-02-06 | CAS Security cleanup (#9355) | Jan Kotas | 1 | -1/+0
2016-12-09 | [x86/Linux] Use Portable FastGetDomain (#8556) | Jonghyun Park | 1 | -3/+3
FastGetDomain function (with __declspec(naked)) causes segmentation fault.
2016-01-27 | Update license headers | dotnet-bot | 1 | -4/+3
2015-09-04 | Remove thread affinity and critical region stuff for Unix | Jan Vorlicek | 1 | -1/+2
WaitHandleNative::CorWaitMultipleNative was calling Thread::BeginThreadAffinityAndCriticalRegion, which resulted in incrementing Thread::m_dwCriticalRegionCount. However, there is nothing that would decrement it on CoreCLR, so if WaitHandleNative::CorWaitMultipleNative is called, in a debug build we get an assert in Thread::InternalReset. It turns out that the critical region and thread affinity stuff is not to be used in CoreCLR, so I have disabled handling of it in CoreCLR for Unix. The only remainder is the static methods Thread::BeginThreadAffinity and Thread::EndThreadAffinity, which are used in the ThreadAffinityHolder. Conditionally removing the holder usage would be messy, so I have instead kept those methods and made their bodies empty.
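A minimal sketch of the shape this leaves behind (illustrative only, not the literal CoreCLR source):

    // The methods are kept so the ThreadAffinityHolder still compiles and can be
    // used unconditionally; their bodies are intentionally empty on CoreCLR.
    class Thread
    {
    public:
        static void BeginThreadAffinity() { }
        static void EndThreadAffinity() { }
    };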
2015-01-30 | Initial commit to populate CoreCLR repo | dotnet-bot | 1 | -0/+2243
[tfs-changeset: 1407945]