Age | Commit message (Collapse) | Author | Files | Lines |
|
- Methods marked with AggressiveOptimization are not NGENed at all.
- The methods are compiled during the runtime with high JITC overhead.
- It makes launching time slower over 6% in our embedded systems.
|
|
on the ISAs supported by the target CPU. (#25365)
|
|
* Add crossgen option to setup default base address for native image
This is enabled only with -DFEATURE_ENABLE_NO_ADDRESS_SPACE_RANDOMIZATION.
* Mmap native images at default base address if env variable COMPlus_UseDefaultBaseAddr=0x1 is setup.
This is enabled only with -DFEATURE_ENABLE_NO_ADDRESS_SPACE_RANDOMIZATION.
|
|
* Allow pregenerating all HW intrinsics in CoreLib
This is a follow up to #24689 that lets us pregenerate all hardware intrinsics in CoreLib.
We ensures the potentially unsupported code will never be reachable at runtime on CPUs that don't support it by not reporting the `IsSupported` property as intrinsic in crossgen. This ensures the support checks are always JITted. JITting the support checks is very cheap.
There is cost in the form of an extra call and failure to do constant propagation of the return value, but the cost is negligible in practice and gets eliminated once the tiered JIT tiers the method up.
We only do this in CoreLib because user code could technically not guard intrinsic use in `IsSupported` checks and pregenerating the code could lead to illegal instruction traps at runtime (instead of `PlatformNotSupportedException` throws) - it's a bad user experience.
|
|
Since crossgen normally prints a message when the compilation didn't happen (because e.g. something is not supported), we should print something here too. Otherwise the `/verbose` log makes it look like pregeneration succeeded and sends someone down a wrong path investigating why it's not picked up at runtime...
|
|
* Basic support for precompiled pinvoke stubs
* Generate R2R file with multiple references to same IL stub (one per method which the IL stub is associated with)
* Not all il stubs are p/invokes. Don't fail when they aren't.
* Consistently use IsDynamicScope and GetModule to avoid unsafe memory access in IL stub compilation paths
* Enable full p/invoke il stubs when compiling System.Private.Corelib
* Disable IL Stub generation in crossgen for ARM32.
- The cross bitness logic is not correct for IL Stub generation
|
|
Enable pinvoke stub inlining on Unix
Exclude x86 Unix platforms from inlining pinvoke stubs (limited support)
|
|
We currently don't precompile methods that use hardware intrinsics because we don't know the CPU that the generated code will run on. Jitting these methods slows down startup and accounts for 3% of startup time in PowerShell.
With this change, we're going to lift this restriction for CoreLib (the thing that matters for startup) and support generating HW intrinsics for our minimum supported target ISA (SSE/SSE2).
|
|
|
|
We abort R2R compiling methods with `thisTransform == CORINFO_BOX_THIS`. This means we don't R2R compile some methods that do virtual calls on valuetypes (e.g. calling `ToString` on a struct that doesn't itself provide a `ToString` method).
We can't allow this in general, but enums and primitives should be fine.
|
|
* Basic infra for cuckoo filter of attributes
- Implement cuckoo filter lookup logic
- Implement new ready to run section
- Add dumper to R2RDump
- Parse section on load into data structure
- Implement function to query filter
- Add concept of enum of well known attributes
- So that attribute name hashes themselves may be cached
* Wrap all even vaguely perf critical uses of attribute by name parsing with use of R2R data
* Update emmintrin.h in the PAL header to contain the needed SSE2 intrinsics for the feature
- Disable the presence table for non Corelib cases. Current performance data does not warrant the size increase in other generated binaries
|
|
(#24596)
It's not unusual to see these types of warnings with incremental IBC.
These warnings will move under the crossgen verbose mode.
|
|
data (#24557)
* Making crossgen throw when compiling with the PartialNgen flag and no IBC data
* Adding missing m_fPartialNGen check
|
|
|
|
|
|
Rename method ICorJitInfo::allocBBProfileBuffer to ICorJitInfo::allocMethodBlockCounts
Rename method ICorJitInfo::getBBProfileData to ICorJitInfo:"getMethodBlockCounts
Rename args and use DWORD instead of ULONG for ICorJitInfo:allocMethodBlockCounts and ICorJitInfo:getMethodBlockCounts
Rename Compiler::fgProfileBuffer to Compiler::fgBlockCounts
Use an #ifdef FEATURE_CORECLR to fix the missing CORINFO_FLG_DISABLE_TIER0_FOR_LOOPS flag on the desktop.
Make fgBlockCountsCount and fgNumProfileRuns DWORDs instead of ULONGs
Rename local var bbCurrentBlockProfileBuffer to currentBlockCounts
Rename local var bbProfileBufferStart to profileBlockCountsStart
Use DWORD when iterating over BlockCounts instead of ULONG
Rename ZapImage::hashBBProfileData to ZapImage::hashMethodBlockCounts
SuperPMI - Fixed all references to allocBBProfileBuffer => allocMethodBlockCounts
SuperPMI - fixed all reference to getBBProfileBuffer => getMethodBlockCounts
|
|
- Add crossgen test to verify file version is preserved
- Add support for general win32 resource copying to ReadyToRun
- Copy all resources
|
|
#23363 added a crossplat implementation of `GetWin32Resource`, but forgot to enable places that call it.
|
|
* Static RVA fields should be placed sequentially after the qsort operation to not break Managed C++ binaries in R2R
* Add regression test
|
|
version bubble (#23940)
|
|
references (#23828)
* Exclude PInvokes declared on other modules. We don't yet encode cross module references
|
|
- Remove concept of AppDomain from object api in VM
- Various infrastructure around entering/leaving appdomains is removed
- Add small implementation of GetAppDomain for use by DAC (to match existing behavior)
- Simplify finalizer thread operations
- Eliminate AppDomain::Terminate
- Remove use of ADID from stresslog
- Remove thread enter/leave tracking from AppDomain
- Remove unused asm constants across all architectures
- Re-order header inclusion order to put gcenv.h before handletable
- Remove retail only sync block code involving appdomain index
|
|
|
|
required (#22560)
* These changes enable the inlining of some PInvokes that do not require any marshalling. With inlined pinvokes, R2R performance should become slightly better, since we'll avoid jitting some of the pinvoke IL stubs that we jit today for S.P.CoreLib. Performance gains not yet measured.
* Added JIT_PInvokeBegin/End helpers for all architectures. Linux stubs not yet implemented
* Add INLINE_GETTHREAD for arm/arm64
* Set CORJIT_FLAG_USE_PINVOKE_HELPERS jit flag for ReadyToRun compilations
* Updating R2RDump tool to handle pinvokes
|
|
|
|
warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
conversions
Update src/ToolBox/superpmi/mcs/verbdumptoc.cpp
Co-Authored-By: franksinankaya <41809318+franksinankaya@users.noreply.github.com>
|
|
* Add parenthesis
src/vm/sha1.cpp: In function ‘void SHA1_block(SHA1_CTX*)’:
src/vm/sha1.cpp:93:29: warning: suggest parentheses around arithmetic in operand of ‘|’ [-Wparentheses]
#define ROUND3(B, C, D) ((C & (B | D) | (B & D)) + sha1_round3)
^
src/vm/sha1.cpp:139:32: note: in expansion of macro ‘ROUND3’
e += ROTATE32L(a, 5) + ROUND3(b, c, d) + msg80[i];
* Move declaration into same file as one was defined Extern the other one was static
* Remove hr=hr undefined assignment
* Fix mutli-line comment warning
* Convert multi-character literal
* Remove null check for stack local variables
rc/vm/invokeutil.cpp: In static member function ‘static void InvokeUtil::SetValidField(CorElementType, TypeHandle, FieldDesc*, OBJECTREF*, OBJECTREF*, TypeHandle, CLR_BOOL*)’:
src/vm/invokeutil.cpp:978:29: warning: the address of ‘Throwable’ will never be NULL [-Waddress]
EX_CATCH_THROWABLE(&Throwable);
^
src/inc/ex.h:1087:21: note: in definition of macro ‘EX_CATCH_THROWABLE’
if (NULL != ppThrowable)
^
|
|
* Use thread_local for thread local storage on non MSVC targets
* Use local copy of visitor rather than function parameter
* Remove extra class qualifier
* Replace hex number representation in ASM files
* Reorder STDAPI and DLLEXPORT
* Suppress conversion
Suppress warning during hash
add casting
* Remove anonymous struct
src/vm/codeversion.h:112:16: warning: ‘struct NativeCodeVersion::<anonymous union>::SyntheticStorage’ invalid; an anonymous union can only have non-static data members [-fpermissive]
struct SyntheticStorage
* Remove class declaration
Remove extra class declaration
* Remove extern C
* Add implicit paranthesis
src/vm/amd64/virtualcallstubcpu.hpp:735:103: warning: suggest parentheses around ‘-’ in operand of ‘&’ [-Wparentheses]
resolveInit.toMiss1 = offsetof(ResolveStub,miss)-(offsetof(ResolveStub,toMiss1)+1) & 0xFF;
^
src/vm/amd64/virtualcallstubcpu.hpp:741:103: warning: suggest parentheses around ‘-’ in operand of ‘&’ [-Wparentheses]
resolveInit.toMiss2 = offsetof(ResolveStub,miss)-(offsetof(ResolveStub,toMiss2)+1) & 0xFF;
Add parenthesis
src/vm/dataimage.cpp:631:55: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
previousRvaInfo->rva == rvaInfo->rva && previousRvaInfo->size >= rvaInfo->size
Add parenthesis
src/debug/daccess/daccess.cpp:6871:29: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
_ASSERTE(peFile == NULL && reflectionModule != NULL || peFile != NULL && reflectionModule == NULL);
Add parenthesis
src/vm/dataimage.cpp:631:57: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
(previousRvaInfo->rva == rvaInfo->rva) && (previousRvaInfo->size >= rvaInfo->size)
* Initialize member 1
src/ilasm/method.cpp:35:36: warning: operation on ‘((Method*)this)->Method::m_ulColumns[0]’ may be undefined [-Wsequence-point]
m_ulColumns[0]=m_ulColumns[0]=0;
* Remove unknown compiler option
* Abstract DLLEXPORT
|
|
|
|
For single-def locals, the type of a reference seen at the assignment to the
local may be a more specific type than the local's declared type. If so the jit
would prefer to use the assignment type to describe the local's value, as this
will lead to better optimization. For instance in
```
object x = "a string"; // only assignment to x
```
the jit can optimize better if it models the type of `x` as `string`.
Instead of relying on `mergeClasses` plus some jit-side screening to decide if
the assignment type is a more specific type, implement a new jit interface
method `isMoreSpecificType` that tries to answer this question more directly.
Added a test case with type equivalence that hit asserts.
Closes #22583.
|
|
* Add reporting exception from ResolveEHClause
When an exception, like EEFileLoadException happens in the
ResolveEHClause, it was not caught by the runtime and so it caused exit
with `terminating with uncaught exception of type EEFileLoadException*`
message without any additional details.
This change adds catching the exception, reporting its details and call
stack and then failing fast.
* Change StackSString to SString
* Ensure the catch clause types are loaded before EH
In crossgen-ed images, ensure the types used in catch clauses are loaded
before the function containing these clauses is executed. That ensures
that a failure to load the EH clause type will occur at that time
instead of during the EH stack walking that searches for the catch
handler.
* Fix EH clause class module check
* Remove the EH clause class module check
It turns out that even if the class was from the current module, it may
depend on types from other modules, so we still need to add a fixup for
it.
|
|
|
|
* Preliminary Changes
* Module Index Resolution
* Change infoModule encoding
* Change referencing module in R2R
* Pre-condition Check
* Virtual Method Module Resolution
* Remove Workarounds and add conditional import loading
* Add signature kind module override
* Add ELEMENT_TYPE_MODULE_ZAPSIG
* Add switch to enable large version bubble
* Cleanup
* Change Native header check
* Add large version bubble test
* Add Large Version Bubble Checks
* Cleanup
* Revert unnecessary check
* Change EncodeMethod Version Bubble Condition
* Add Large Version Bubble asserts
* Cleanup
* Add default argument to runtests.py
* Change test PreCommands
* Revert whitespace changes
* Change breaking conditional check
* Streamline Version Bubble test
* Address PR Feedback
* Address PR Feedback #2
* Remove dead code
* Add crossgen-time ifdef
|
|
switch (#21987)
|
|
This change improves detection of allocators with side effects.
Allocators can cause side effects if the allocated object may have a finalizer.
This change adds a pHasSideEffects parameter to getNewHelper JitEE interface
method. It's used by the jit to check for allocator side effects instead of
guessing from helper ids.
Fixes #21530.
|
|
During my work on the CPAOT compiler I found out that in the past
I had believed we should use the compressed uint signature encoding
for emitting fixup types. This place may have contributed to my
mistake. I have audited all places in CoreCLR where signatures are
decoded and in all cases the fixup type is assumed to be a byte,
not a signature-compressed uint.
As I said, the bug is harmless (after all, Crossgen works) because
the enum code of READYTORUN_FIXUP_Helper is less than 128 and for
values 0-127 the compressed uint encoding is identical with the
byte value; I however believe it's useful to make this change
nonetheless for the sake of code clarity and to help avoid future
confusions similar to mine.
|
|
Default interface methods in their unresolved state don't have a generic context. The generic context is only added once the method is resolved to its implementation.
|
|
* Add support for RVA fields to crossgen/R2R
RVA fields are became more common with pre-inititialized ReadOnlySpan<byte>. Fix crossgen to deal with them for R2R.
* Fix tests, map new JIT helper for R2R
|
|
|
|
|
|
|
|
This has two parts:
## Part 1
CoreRT represents native type handles differently from CoreCLR - on CoreCLR, `RuntimeTypeHandle` is a wrapper over `RuntimeType` and RyuJIT is aware of that. On CoreRT, `RuntimeTypeHandle` wraps the native type handle, not a `RuntimeType`.
The knowledge is hardcoded in importer when importing the sequence "ldtoken foo / call Type.GetTypeFromHandle" - importer just removes the call and bashes the result of ldtoken to be a reference type. CoreRT had to avoid reporting `Type.GetTypeFromHandle` as an intrinsic because of that.
I'm adding another helper that lets RyuJIT avoid hardcoding that knowledge. Instead of just bashing the return type, we swap the helper call.
## Part 2
Native type handle equality checks need to go through a helper, unless the EE side says it's okay to compare native type handles directly.
|
|
* Implementation of R2R vtable call thunks. These thunks will fetch the target code pointer from the vtable of the input thisPtr, and jump to that address.
This is especially helpful with generics, since we can avoid a generic dictionary lookup cost for a simple vtable call.
Overall, these thunks cause the CPU to have less branch mispredictions, and give a small performance boost to vtable calls.
These stubs are under VirtualCallStubManager so that the managed debugger can handle stepping through them.
|
|
The jit incorporates the value of integer and float typed initonly static
fields into its codegen, if the class initializer has already run.
The jit can't incorporate the values of ref typed initonly static fields,
but the types of those values can't change, and the jit can use this knowledge
to enable type based optimizations like devirtualization.
In particular for static fields initialized by complex class factory logic the
jit can now see the end result of that logic instead of having to try and deduce
the type of object that will initialize or did initialize the field.
Examples of this factory pattern in include `EqualityComparer<T>.Default` and
`Comparer<T>.Default`. The former is already optimized in some cases by via
special-purpose modelling in the framework, jit, and runtime (see #14125) but
the latter is not. With this change calls through `Comparer<T>.Default` may now
also devirtualize (though won't yet inline as the devirtualization happens
late).
Also update the reflection code to throw an exception instead of changing the value
of a fully initialized static readonly field.
Closes #4108.
|
|
The 'const' used in this context has no meaning
|
|
Enable high entropy for 64bit native images, which expands the set of virtual address bases a native image can be loaded at.
|
|
Add two methods to JitEE interface: getHeapClassSize and canAllocateOnStack.
Change JITEEVersionIdentifier.
|
|
Add MethodImplOptions.AggressiveOptimization and use it for tiering
Part of fix for https://github.com/dotnet/corefx/issues/32235
Workaround for https://github.com/dotnet/coreclr/issues/19751
- Added and set CORJIT_FLAG_AGGRESSIVE_OPT to indicate that a method is flagged with AggressiveOptimization
- For a method flagged with AggressiveOptimization, tiering uses a foreground tier 1 JIT on first call to the method, skipping the tier 0 JIT and call counting
- When tiering is disabled, a method flagged with AggressiveOptimization does not use r2r-pregenerated code
- R2r crossgen does not generate code for a method flagged with AggressiveOptimization
|
|
* Performance fix for R2R: use vtable-based codegen for virtual calls within the System.Private.CoreLib version bubble, avoiding the use of the VSD, or in the generics case, a dictionary lookup.
The CoreLib assembly will always be serviced along side the coreclr runtime, so special casing CoreLib to using vtable-based calls like the fragile NI case is ok.
|
|
|