summaryrefslogtreecommitdiff
path: root/src/vm/amd64/JitHelpers_Slow.asm
AgeCommit message (Collapse)AuthorFilesLines
2018-05-04Fix System.String over-allocation (#17876)Jan Kotas1-3/+1
BaseSize for System.String was not set correctly. It caused unnecessary extra 8 bytes to be allocated at the end of strings that had `Length % 4 < 2` on 64-bit platforms. This change makes affected strings proportionally cheaper. For example, `new string('a', 1)` in a long-running loop is 7% faster.
2017-10-11Delete !FEATURE_IMPLICIT_TLS (#14398)Jan Kotas1-481/+0
Linux and Windows arm64 are using the regular C/C++ thread local statics. This change unifies the remaining Windows architectures to be on the same plan.
2017-09-26Remove Monitor asm helpers (#14146)Koundinya Veluri1-888/+0
- Removed asm helpers on Windows and used portable C++ helpers instead - Rearranged fast path code to improve them a bit and match the asm more closely Perf: - The asm helpers are a bit faster. The code generated for the portable helpers is almost the same now, the remaining differences are: - There were some layout issues where hot paths were in the wrong place and return paths were not cloned. Instrumenting some of the tests below with PGO on x64 resolved all of the layout issues. I couldn't get PGO instrumentation to work on x86 but I imagine it would be the same there. - Register usage - x64: All of the Enter functions are using one or two (TryEnter is using two) callee-saved registers for no apparent reason, forcing them to be saved and restored. r10 and r11 seem to be available but they're not being used. - x86: Similarly to x64, the compiled functions are pushing and popping 2-3 additional registers in the hottest fast paths. - I believe this is the main remaining gap and PGO is not helping with this - On Linux, perf is >= before for the most part - Perf tests used for below are updated in PR https://github.com/dotnet/coreclr/pull/13670
2017-06-26Replace array type handle with method table in arguments of array allocation ↵Ruben Ayrapetyan1-51/+18
helpers (#12369) * Remove direct usage of type handle in JIT_NewArr1, with except of retrieving template method table. * Assert that array type descriptor is loaded when array object's method table is set. * Pass template method tables instead of array type descriptors to array allocation helpers.
2017-03-16[Local GC] Break EE dependency on GC's generation table and alloc lock in ↵Sean Gillespie1-36/+32
single-proc scenarios (#10065) * Remove usage of the generation table from the EE by introducing an EE-owned GC alloc context used for allocations on single-proc machines. * Move the GC alloc lock to the EE side of the interface * Repair the Windows ARM build * Move the decision to use per-thread alloc contexts to the EE * Rename the lock used by StartNoGCRegion and EndNoGCRegion to be more indicative of what it is protecting * Address code review feedback 2 (enumerate the global alloc context as a part of GCToEEInterface) * Code review feedback (3) * Address code review feedback (move some GC-internal globals to gcimpl.h and gc.cpp) * g_global_alloc_lock is a dword, not a qword - fixes a deadlock * Move GlobalAllocLock to gchelpers.cpp and switch to preemptive mode when spinning * Repair the Windows x86 build
2016-08-29Monitor.TryEnter asm timeout != 0 before spin (#6951)Ben Adams1-8/+12
* JIT_MonTryEnter_Slow timeout != 0 before spin * JIT_MonTryEnter_InlineGetThread timeout != 0 before spin
2016-06-01Remove duplicate avoid in comments from a few files (#5363)James Singleton1-2/+2
Remove double avoid in comments
2016-04-12Implement software write watch and make concurrent GC functional outside WindowsKoundinya Veluri1-0/+18
- Implemented software write watch using write barriers - A new set of write barriers is introduced, each corresponding to an existing one, but which also updates the write watch table. The GC switches to a write watch barrier during concurrent GC, and switches back to a non write watch barrier after the final query for dirty pages. - The write watch table is alloacted along with the card table - Since the card table is used differently, different synchonization is used for the write watch table. The runtime is suspended during resize since that is the most infrequently occuring operation, of that, ResetWriteWatch, and GetWriteWatch. - ResetWriteWatch() doesn't need a suspend, but since the software WW version is much faster than the Windows version, moved it into the suspended region to avoid some synchronization that would otherwise be required - The background calls to GetWriteWatch() don't need or do a suspend. They only need to synchronize with the resize path, not for the purpose of correct functionality, but to not miss dirty pages such that concurrent GC is effective. Miscellaneous: - Fixed runtests.sh to copy mscorlib.dll and delete the Windows version of mscorlib.ni.dll
2016-01-27Update license headersdotnet-bot1-4/+3
2015-11-19Add casts to APIs that require LONG / DWORD parametersJan Vorlicek1-1/+1
2015-01-30Initial commit to populate CoreCLR repo dotnet-bot1-0/+1809
[tfs-changeset: 1407945]