Age | Commit message | Author | Files | Lines |
|
|
|
|
|
We need to flush instruction cache only for pages that have relocations
instead of full sections because otherwise application's shared clean
memory is increased in some cases on Linux.
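A minimal sketch of the idea, not the actual runtime code: given the addresses of the relocations applied to an image, compute only the pages that were touched and merge neighbours into ranges to flush. The function names, the 4096-byte page size, and the assumption that each relocation fits within one page are all illustrative.

```python
PAGE_SIZE = 4096  # assumed page size; real code would query the OS


def pages_with_relocations(reloc_addrs, page_size=PAGE_SIZE):
    """Return the sorted base addresses of pages that contain a relocation."""
    mask = ~(page_size - 1)
    return sorted({addr & mask for addr in reloc_addrs})


def flush_ranges(reloc_addrs, page_size=PAGE_SIZE):
    """Merge adjacent touched pages into (start, length) ranges to flush."""
    ranges = []
    for base in pages_with_relocations(reloc_addrs, page_size):
        if ranges and ranges[-1][0] + ranges[-1][1] == base:
            # This page directly follows the previous range: extend it.
            start, length = ranges[-1]
            ranges[-1] = (start, length + page_size)
        else:
            ranges.append((base, page_size))
    return ranges
```

Pages that hold no relocation are never written, so they are left out of the flush entirely instead of being covered by a whole-section range.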
|
|
|
|
* Separate sections READONLY_VCHUNKS and READONLY_DICTIONARY
* Remove relocations for second-level indirection of Vtable in case FEATURE_NGEN_RELOCS_OPTIMIZATIONS is enabled.
Introduce FEATURE_NGEN_RELOCS_OPTIMIZATIONS, under which NGEN-specific relocation optimizations are enabled
* Replace push/pop of R11 in stubs with
- str/ldr of R4 in space reserved in epilog for non-tail calls
- usage of R4 with hybrid-tail calls (same as for EmitShuffleThunk)
* Replace push/pop of R11 for function epilog with usage of LR as helper register right before its restore from stack
|
|
* [x86/Linux] Fix marshalling struct with 64-bit types
The System V ABI for i386 defines 4-byte alignment for 64-bit types.
* [Linux/x86] Fix marshalling tests in the case of System V i386 ABI
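To illustrate why the alignment rule matters for marshalling, here is a small layout calculator. It is only a sketch: the struct, the field names, and the helper functions are hypothetical, and the "natural alignment" rule is included purely as a contrast to the System V i386 rule stated above.

```python
def layout(fields, align_of):
    """Compute C-style field offsets and total size.

    fields:   list of (name, size_in_bytes) for scalar fields
    align_of: function mapping a scalar size to its required alignment
    """
    offset = 0
    max_align = 1
    offsets = {}
    for name, size in fields:
        align = align_of(size)
        offset = (offset + align - 1) // align * align  # round up to alignment
        offsets[name] = offset
        offset += size
        max_align = max(max_align, align)
    total = (offset + max_align - 1) // max_align * max_align  # tail padding
    return offsets, total


def sysv_i386_align(size):
    # System V i386: 64-bit scalars are only 4-byte aligned.
    return min(size, 4)


def natural_align(size):
    # Contrast case (assumption): every scalar aligned to its own size.
    return size


# Hypothetical struct { int32 flag; int64 value; }
FIELDS = [("flag", 4), ("value", 8)]
```

With these rules `value` lands at offset 4 (total size 12) under System V i386 and at offset 8 (total size 16) under natural alignment, so a marshaller that assumes the wrong rule reads the 64-bit field from the wrong place.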
|
|
|
|
|
|
StompWriteBarrier(WriteBarrierOp::StompResize) on ARM (#18107)
|
|
|
|
exception (#17710) (#17844)
|
|
(#17842)
Add portable PDB caching to StackTrace.
This is the mscorlib side of the change.
|
|
|
|
Fixes buggy memset implementation
Use heavily optimized platform implementation
Follows amd64 & arm precedent
|
|
JIT_Memset alignment code was definitely broken for some
unaligned cases
JIT_MemCpy likely had the same issue
Simplify implementation to reduce maintenance burden
|
|
The call to PInvokeStubWorker can do all kinds of stuff to the
VASigCookieReg in the GenericPInvokeCalli case, since x15 is just a
temporary register. Let's save it in a callee-saved register so that
when we come back after stub generation, we still have the correct value
for the VASigCookie.
|
|
When enumerating live gc registers, if we are not on the active stack frame,
we need to report callee-save gc registers that are live before the call.
The reason is that the liveness of gc registers may change across a call
to a method that does not return. In this case the instruction after the call
may be a jump target and a register that didn't have a live gc pointer before
the call may have a live gc pointer after the jump. To make sure we report the
registers that have live gc pointers before the call we subtract 1 from curOffs.
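The effect of subtracting 1 can be shown with a toy liveness table. This is a sketch under stated assumptions: the table format, register names, and offsets are made up for illustration and do not reflect the real GC info encoding.

```python
# Hypothetical liveness table:
# (start_offset, end_offset_exclusive, live callee-saved GC registers).
LIVE_RANGES = [
    (0, 20, {"r4"}),         # covers the call instruction at offsets 16..19
    (20, 40, {"r4", "r5"}),  # offset 20 is a jump target; r5 is live only there
]


def live_gc_regs(offset):
    """Return the registers holding live GC pointers at a code offset."""
    for start, end, regs in LIVE_RANGES:
        if start <= offset < end:
            return regs
    return set()


def regs_to_report(cur_offs, is_active_frame):
    """For a non-active frame cur_offs is the return address, so look up
    the offset just before it, which still lies inside the call."""
    lookup = cur_offs if is_active_frame else cur_offs - 1
    return live_gc_regs(lookup)
```

For a caller frame whose return address is offset 20, the lookup at 19 reports only `r4`, which is the state before the call; looking up 20 directly would wrongly report `r5` as well.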
|
|
* Fix x86 steady state tiered compilation performance
Also included - a few tiered compilation only test hooks + small logging fix for JitBench
Tiered compilation wasn't correctly implementing the MayHavePrecode and RequiresStableEntryPoint policy functions. On x64 this was a non-issue, but due to compact entrypoints on x86 it led to methods allocating both FuncPtrStubs and Precodes. The FuncPtrStubs would never get backpatched, which caused never-ending invocations of the Prestub for some methods. Although such code still runs correctly, it is much slower than it needs to be. On MusicStore x86 I am seeing a 20% improvement in steady state RPS after this fix, bringing us in line with what I've seen on x64.
|
|
Fix trigger for tier 1 call counting delay
The trigger was taking into account all non-tier-1 JIT invocations to delay call counting, even for those methods that are not eligible for tiering. In the AllReady benchmark, some dynamic methods were being jitted frequently enough to not allow tier 1 call counting to begin. Fixed to count only eligible methods jitted at tier 0, such that methods not eligible for tiering don't interfere with the tiering heuristics.
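A toy model of the corrected trigger, for illustration only. The class, its method names, and the timer-tick mechanism are assumptions; the only part taken from the description above is that just tier 0 JIT activity from methods eligible for tiering may extend the delay.

```python
class CallCountingDelay:
    """Toy model: call counting begins on the first timer tick that saw no
    tier 0 JIT activity from tiering-eligible methods since the last tick."""

    def __init__(self):
        self.activity_since_tick = False
        self.counting_started = False

    def on_method_jitted(self, eligible_for_tiering, tier):
        # Methods that are not eligible for tiering (for example dynamic
        # methods) must not postpone call counting.
        if eligible_for_tiering and tier == 0:
            self.activity_since_tick = True

    def on_timer_tick(self):
        if self.counting_started:
            return
        if self.activity_since_tick:
            self.activity_since_tick = False  # extend the delay by one tick
        else:
            self.counting_started = True
```

In this model a steady stream of ineligible methods no longer keeps the delay alive, which matches the AllReady scenario described above.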
|
|
|
|
There were two problems:
* The illegal instruction 0xde01 used for INTERRUPT_INSTR_CALL doesn't
generate SIGILL, but SIGTRAP, since this is the code used for
breakpoints.
* The USE_REDIRECT_FOR_GCSTRESS was defined even for FEATURE_PAL for
ARM, which is incorrect and resulted in explicit redirect frame not
being created in DoGcStress and thus the GC stack walk was skipping
managed frames that it should walk.
|
|
GC info fix: correctly adjust argCnt
|
|
* Rename conflicting definitions of VER_MAJOR/MINORVERSION
These macros are defined by the Windows SDK. They were overloaded to mean the CLR version,
which was causing interesting redefinition issues.
* Delete workaround for redefined Windows SDK macros
* Delete ProjectN version
* Delete dead code
|
|
Tighten arm32/arm64 write barrier kill reg sets
|
|
Fixes regex-redux-1 failure seen in https://github.com/dotnet/coreclr/issues/15309
- HashMap lookups and insertions occur under a level 0 lock and may enter cooperative GC mode inside the lock. A GC that is triggered may delete some dynamic code, which takes another level 0 lock. It does not look like a deadlock is possible as a result.
- Fixed by increasing the crst level for the lock used in ReadyToRunInfo
|
|
Add an API so the VM can tell the GC how long to spin, in order to normalize spinning across processor families.
|
|
is to be used by BCL for deciding when to trim memory usage in pooling code
|
|
|
|
The JIT write barrier helpers have a custom calling convention that
avoids killing most registers. The JIT was not taking advantage of
this, and thus was killing unnecessary registers when a write barrier
was necessary. In particular, some integer callee-trash registers
are unaffected by the write barriers, and no floating-point register
is affected.
Also, I got rid of the `FEATURE_WRITE_BARRIER` define, which is always
set. I also put some code under `LEGACY_BACKEND` for easier cleanup
later. I removed some unused defines in target.h for some platforms.
|
|
It fixes crashes on arm when using AOT images.
|
|
corruption of the trace file. (#17358)
|
|
When there are nested calls, and there is a non-ptr on the stack below the last ptr popped by the inner call, the `argHigh` and `argCnt` values can get out of sync.
|
|
containers (#17265)
On Windows, we need to shut down the EE when receiving a CTRL_CLOSE_EVENT so that we run ProcessExit handlers and other code that relies on ProcessExit working (like AssemblyLoadContext.Unloading). One way we receive this event is when there's a running process in a docker container that has the stop command run against it.
|
|
Things have progressed far enough that it's time to use a friendlier name. The feature still has performance aspects that need to be investigated and improved, but I don't want to scare people off simply because it isn't as fast as it could be.
This also updates JitBench to use a newer CoreFX version, since that appeared to be broken, and updates some comments and usage of the tieredcompilation variable.
|
|
There was one more change needed to make the unwinding work properly.
Pushes in some prologs were missing the unwinder annotation.
The fix is to use PROLOG_PUSH for them.
To make things in this file consistent, I've also replaced pops in
epilogs with EPILOG_POP macro and vpush / vpop with PROLOG_VPUSH /
PROLOG_VPOP, although it is not functionally necessary.
With these changes, all the EH related issues are gone.
|
|
* Fix DelayLoad_MethodCall unwinding
Unwinding through DelayLoad_MethodCall was broken due to the overwriting
of R7 which is used as a frame pointer. That caused some managed
exceptions to cause abort with unhandled PAL_SEHException.
This change fixes the problem by using a different register.
* Fix one more spot with the same issue
|
|
When running ready to run code on ARM, the ExternalMethodFixupWorker
doesn't handle the entrypoints of FCalls correctly. It tries to handle
them as compact entrypoints, but those use different machine code
instructions, which results in an assert in debug / checked builds.
This change detects the runtime supplied calls before trying to check
for the compact entrypoint.
|
|
* Add FailFast error log to Windows Event Log
* change const wchar * to lpcwstr
* Enable sending unhandled exception info to Windows Event Log
* Change log format to match console output and address PR comments
* Remove more comments
* Change the order DoReportForUnhandledException to do a safety check first before calling managed code
* Fix parameter name in header file
* Add Windows Event logging in DefaultCatchHandler and remove DoReportForUnhandledException
* Add back event reporting for ignored unhandled exception cases, fix broken UNIX builds
* Fix more broken unix builds
* Fix typo
* Address PR comments
|
|
|
|
* Enable reflection load ComImport assembly and Type.IsComObjectType
* Update Enable reflection load ComImport assembly
|
|
|
|
|
|
- This change makes the compare for very short Span strings a bit slower and for longer Span strings many times faster.
- Switch several places where it was a clear benefit to use it.
`String.CompareOrdinal(string,string)` is a notable exception that I have left intact for now. It is fine-tuned for the current string layout, and replacing it with a call to the vectorized SequenceCompareTo gives mixed results.
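The trade-off described above can be pictured with a chunked comparison. This is only a sketch of the general technique, not the actual SequenceCompareTo code: the function name, the chunk width of 8, and the use of Python slices in place of wide vector compares are all assumptions.

```python
def sequence_compare_to(a, b, chunk=8):
    """Ordinal three-way compare: negative, zero, or positive."""
    common = min(len(a), len(b))
    i = 0
    # Skip whole chunks that are equal. A vectorized version would do this
    # with one wide compare per chunk instead of a slice comparison.
    while i + chunk <= common and a[i:i + chunk] == b[i:i + chunk]:
        i += chunk
    # Scan the rest of the common prefix element by element; this covers
    # the first mismatching chunk (if any) and the short tail.
    while i < common:
        if a[i] != b[i]:
            return ord(a[i]) - ord(b[i])
        i += 1
    # Common prefix is identical: the shorter sequence sorts first.
    return len(a) - len(b)
```

For inputs shorter than one chunk the chunk loop is pure overhead, while for long inputs most of the work happens a chunk at a time, which is consistent with the short-slower, long-faster behaviour noted in the first bullet.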
|
|
(#17232)
|
|
|
|
|
|
|
|
Fix hex value of the instruction ldp x16, x0, [x12]
|
|
|
|
|