platform/upstream/coreclr - Domain: Dotnet / Core; Licenses: MIT;

Age	Commit message	Author	Files	Lines
2019-03-01	Implement Serialization Guard	Morgan Brown	1	-0/+1
	Add Serialization Guard API and consume it in CoreLib targets
2019-01-23	Remove unused thread abortion methods. (#22147)	Filip Navara	1	-5/+0

2019-01-03	Cleanup current culture handling in the unmanaged runtime (#21706)	Jan Kotas	1	-1/+0
	A large portion of the current culture handling in the unmanaged runtime, inherited from desktop, has been a no-op. The nativeInitCultureAccessors QCall that it depended on in desktop was (almost) never called in CoreCLR.
	- Delete resetting of current culture on threadpool threads. It was needed in desktop because of a very tricky flow of current culture between appdomains. It is superseded by flowing the current culture via AsyncLocal in CoreCLR.
	- Comment out the fetch of the managed current culture for unmanaged resource lookup. It has a number of problems that are not easy to fix. We are not localizing the unmanaged runtime currently anyway, so it is OK to just comment it out.
	- Fix the rest to call CultureInfo directly without going through Thread.CurrentThread.
2018-12-10	Refactor internal System.AppDomain out of CoreLib (#21460)	Jan Kotas	1	-4/+0
	Fixes #21028
2018-12-03	Refactor all FCalls out of AppDomain.cs (#21337)	Jan Kotas	1	-2/+0
	This saves the unmanaged->managed->unmanaged trip to initialize the assembly binder. Includes small bits of unrelated cleanup.
2018-08-01	Expose OSThreadId and TimeStamp on EventWrittenEventArgs (#19002)	Brian Robbins	1	-0/+1
	* Add EventWrittenEventArgs public properties OSThreadId and TimeStamp and plumb through OSThreadId.
	* Plumb ActivityId and RelatedActivityId to EventListener.
2018-02-28	Delete unnecessary StackCrawlMarks (#16648)	Jan Kotas	1	-2/+2
	CAS leftovers
2018-02-28	Add Thread.GetCurrentProcessorId() API (#16650)	Jan Kotas	1	-0/+2
	Contributes to https://github.com/dotnet/corefx/issues/16767
2017-09-01	Add normalized equivalent of YieldProcessor, retune some spin loops (#13670)	Koundinya Veluri	1	-0/+1
	* Add normalized equivalent of YieldProcessor, retune some spin loops

	Part of fix for https://github.com/dotnet/coreclr/issues/13388

	Normalized equivalent of YieldProcessor
	- The delay incurred by YieldProcessor is measured once lazily at run-time
	- Added YieldProcessorNormalized that yields for a specific duration (the duration is approximately equal to what was measured for one YieldProcessor on a Skylake processor, about 125 cycles). The measurement calculates how many YieldProcessor calls are necessary to get a delay close to the desired duration.
	- Changed Thread.SpinWait to use YieldProcessorNormalized

	Thread.SpinWait divide count by 7 experiment
	- At this point I experimented with changing Thread.SpinWait to divide the requested number of iterations by 7, to see how it fares on perf. On my Sandy Bridge processor, 7 * YieldProcessor == YieldProcessorNormalized. See numbers in PR below.
	- Not too many regressions, and the overall perf is somewhat as expected - not much change on Sandy Bridge processor, significant improvement on Skylake processor.
	  - I'm discounting the SemaphoreSlim throughput score because it seems to be heavily dependent on Monitor. It would be more interesting to revisit SemaphoreSlim after retuning Monitor's spin heuristics.
	  - ReaderWriterLockSlim seems to perform worse on Skylake, the current spin heuristics are not translating well

	Spin tuning
	- At this point, I abandoned the experiment above and tried to retune spins that use Thread.SpinWait
	- General observations
	  - YieldProcessor stage
	    - At this stage in many places we're currently doing very long spins on YieldProcessor per iteration of the spin loop. In the last YieldProcessor iteration, it amounts to about 70 K cycles on Sandy Bridge and 512 K cycles on Skylake.
	    - Long spins on YieldProcessor don't let other work run efficiently. Especially when many scheduled threads all issue a long YieldProcessor, a significant portion of the processor can go unused for a long time.
	    - Long spins on YieldProcessor are in some cases helping to reduce contention in high-contention cases, effectively taking away some threads into a long delay. Sleep(1) works much better but has a much higher delay so it's not always appropriate. In other cases, I found that it's better to do more iterations with a shorter YieldProcessor. It would be even better to reduce the contention in the app or to have a proper wait in the sync object, where appropriate.
	    - Updated the YieldProcessor measurement above to calculate the number of YieldProcessorNormalized calls that amount to about 900 cycles (this was tuned based on perf), and modified SpinWait's YieldProcessor stage to cap the number of iterations passed to Thread.SpinWait. Effectively, the first few iterations have a longer delay than before on Sandy Bridge and a shorter delay than before on Skylake, and the later iterations have a much shorter delay than before on both.
	  - Yield/Sleep(0) stage
	    - Observed a couple of issues:
	      - When there are no threads to switch to, Yield and Sleep(0) become no-op and it turns the spin loop into a busy-spin that may quickly reach the max spin count and cause the thread to enter a wait state, or may just busy-spin for longer than desired before a Sleep(1). Completing the spin loop too early can cause excessive context switching if a wait follows, and entering the Sleep(1) stage too early can cause excessive delays.
	      - If there are multiple threads doing Yield and Sleep(0) (typically from the same spin loop due to contention), they may switch between one another, delaying work that can make progress.
	    - I found that it works well to interleave a Yield/Sleep(0) with YieldProcessor, it enforces a minimum delay for this stage. Modified SpinWait to do this until it reaches the Sleep(1) threshold.
	  - Sleep(1) stage
	    - I didn't see any benefit in the tests to interleave Sleep(1) calls with some Yield/Sleep(0) calls, perf seemed to be a bit worse actually. If the Sleep(1) stage is reached, there is probably a lot of contention and the Sleep(1) stage helps to remove some threads from the equation for a while. Adding some Yield/Sleep(0) in-between seems to add back some of that contention.
	    - Modified SpinWait to use a Sleep(1) threshold, after which point it only does Sleep(1) on each spin iteration
	    - For the Sleep(1) threshold, I couldn't find one constant that works well in all cases
	      - For spin loops that are followed by a proper wait (such as a wait on an event that is signaled when the resource becomes available), they benefit from not doing Sleep(1) at all, and spinning in other stages for longer
	      - For infinite spin loops, they usually seemed to benefit from a lower Sleep(1) threshold to reduce contention, but the threshold also depends on other factors like how much work is done in each spin iteration, how efficient waiting is, and whether waiting has any negative side-effects.
	    - Added an internal overload of SpinWait.SpinOnce to take the Sleep(1) threshold as a parameter
	- SpinWait - Tweaked the spin strategy as mentioned above
	- ManualResetEventSlim - Changed to use SpinWait, retuned the default number of iterations (total delay is still significantly less than before). Retained the previous behavior of having Sleep(1) if a higher spin count is requested.
	- Task - It was using the same heuristics as ManualResetEventSlim, copied the changes here as well
	- SemaphoreSlim - Changed to use SpinWait, retuned similarly to ManualResetEventSlim but with double the number of iterations because the wait path is a lot more expensive
	- SpinLock - SpinLock was using very long YieldProcessor spins. Changed to use SpinWait, removed process count multiplier, simplified.
	- ReaderWriterLockSlim - This one is complicated as there are many issues. The current spin heuristics performed better even after normalizing Thread.SpinWait but without changing the SpinWait iterations (the delay is longer than before), so I left this one as is.
	- The perf (see numbers in PR below) seems to be much better than both the baseline and the Thread.SpinWait divide by 7 experiment
	  - On Sandy Bridge, I didn't see many significant regressions. ReaderWriterLockSlim is a bit worse in some cases and a bit better in other similar cases, but at least the really low scores in the baseline got much better and not the other way around.
	  - On Skylake, some significant regressions are in SemaphoreSlim throughput (which I'm discounting as I mentioned above in the experiment) and CountdownEvent add/signal throughput. The latter can probably be improved later.
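	The normalization arithmetic and the three spin stages described in the commit above can be sketched roughly as follows. This is an illustrative model in Python, not the runtime's code. The 125-cycle and 900-cycle figures come from the commit message; the function names, the thresholds of 10 and 20, the 18-cycle Sandy Bridge figure used in the note below, and the exact interleaving pattern are all hypothetical.

```python
# Figures quoted in the commit message.
TARGET_CYCLES_PER_NORMALIZED_YIELD = 125   # about one YieldProcessor on Skylake
MAX_CYCLES_PER_SPIN_ITERATION = 900        # cap for the YieldProcessor stage


def yields_per_normalized_yield(measured_cycles_per_yield):
    """Raw YieldProcessor calls needed to approximate one normalized yield.

    measured_cycles_per_yield stands in for the delay the runtime measures
    once, lazily, at run time (hypothetical parameter name).
    """
    if measured_cycles_per_yield <= 0:
        raise ValueError("measured delay must be positive")
    ratio = TARGET_CYCLES_PER_NORMALIZED_YIELD / measured_cycles_per_yield
    return max(1, round(ratio))


def max_normalized_yields_per_iteration():
    """Largest spin count allowed in one iteration of the YieldProcessor stage."""
    return max(1, MAX_CYCLES_PER_SPIN_ITERATION // TARGET_CYCLES_PER_NORMALIZED_YIELD)


def spin_action(iteration, yield_threshold=10, sleep1_threshold=20):
    """Choose what a spin loop does on a given iteration.

    Both thresholds are made-up values for illustration. Passing
    sleep1_threshold=None models a loop that is followed by a proper wait
    and therefore never enters the Sleep(1) stage.
    """
    if sleep1_threshold is not None and iteration >= sleep1_threshold:
        return ("sleep1", 0)                      # Sleep(1) stage
    cap = max_normalized_yields_per_iteration()
    if iteration < yield_threshold:
        # YieldProcessor stage: exponential backoff, capped near 900 cycles.
        return ("spin", min(1 << iteration, cap))
    # Yield/Sleep(0) stage: alternate with a short spin so the stage keeps
    # a minimum delay even when there is no other thread to switch to.
    if (iteration - yield_threshold) % 2 == 0:
        return ("yield", 0)
    return ("spin", cap)
```

	With these assumptions, a processor whose YieldProcessor takes about 18 cycles needs 7 raw yields per normalized yield, which is consistent with the 7x ratio reported for Sandy Bridge, and one that takes about 125 cycles needs 1.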
2017-03-25	Add Interlocked.MemoryBarrierProcessWide (#10476)	Jan Kotas	1	-1/+0
	Contributes to #16799
2017-03-07	[x86/Linux] CDECL as default P/Invoke Calling Convention (#9977)	Jonghyun Park	1	-1/+1
	* [x86/Linux] CDECL as default P/Invoke Calling Convention
2017-02-14	Remove never defined FEATURE_REMOTING	danmosemsft	1	-4/+0

2017-02-14	Remove never defined FEATURE_LEAK_CULTURE_INFO	danmosemsft	1	-7/+0

2017-02-12	Remove never defined FEATURE_COMPRESSEDSTACK	danmosemsft	1	-4/+0

2017-02-10	Remove always defined FEATURE_CORECLR	danmosemsft	1	-13/+0

2017-02-07	Remove more CAS (#9390)	Dan Moseley	1	-2/+2
	* Remove PermissionSet
	* Remove HostProtectionAttribute
	* Remove PermissionState
	* Remove S.Security.Permissions
	* Remove IPrincipal
	* Fix native side
	* Remove model.xml again
2016-01-27	Update license headers	dotnet-bot	1	-4/+3

2015-09-04	Remove thread affinity and critical region stuff for Unix	Jan Vorlicek	1	-0/+2
	WaitHandleNative::CorWaitMultipleNative was calling Thread::BeginThreadAffinityAndCriticalRegion, which increments Thread::m_dwCriticalRegionCount. However, nothing decrements it on CoreCLR, so if WaitHandleNative::CorWaitMultipleNative is called in a debug build, we get an assert in Thread::InternalReset. It turns out that the critical region and thread affinity stuff is not to be used in CoreCLR, so I have disabled handling of that in CoreCLR for Unix. All that remains are the static methods Thread::BeginThreadAffinity and Thread::EndThreadAffinity, which are used in the ThreadAffinityHolder. Conditionally removing the holder usage would be messy, so I kept those methods and made their bodies empty instead.
2015-01-30	Initial commit to populate CoreCLR repo	dotnet-bot	1	-0/+157
	[tfs-changeset: 1407945]