path: root/src/jit/importer.cpp
2018-11-29  Track single def locals in importer (#21251)  (Andy Ayers, 1 file, -6/+33)

* Defer setting lvSingleDef until lvaMarkLocalVars. Since this information isn't used early on in the jit, defer setting this bit until later.
* Track single-def locals in the importer and use this information to enable type updates during late devirtualization. This gets some cases where the exact type materializes late (perhaps via a chain of copies).
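A minimal sketch of the gating idea, with stand-in types (the real LclVarDsc and class handles are much richer than this):

    #include <cstdio>

    // Illustrative stand-ins only, not RyuJIT's actual definitions.
    struct ClassHandle { const char* name; };

    struct LclVarDsc
    {
        bool         lvSingleDef = false;   // importer saw exactly one store to this local
        ClassHandle* lvClassHnd  = nullptr; // best-known class of the local
    };

    // Late devirtualization may sharpen the local's type only when the local
    // is single-def; a multi-def local could also hold values of other types.
    void updateClassIfSingleDef(LclVarDsc& lcl, ClassHandle* exactType)
    {
        if (lcl.lvSingleDef)
        {
            lcl.lvClassHnd = exactType;
        }
    }

    int main()
    {
        ClassHandle derived{"Derived"};
        LclVarDsc   tmp;
        tmp.lvSingleDef = true; // as recorded during import
        updateClassIfSingleDef(tmp, &derived);
        printf("local class: %s\n", tmp.lvClassHnd != nullptr ? tmp.lvClassHnd->name : "<unknown>");
        return 0;
    }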
2018-11-28  Implement the S.R.I.VectorXXX `get_Zero` and `As` helper methods as JIT intrinsics (#21198)  (Tanner Gooding, 1 file, -10/+362)

* Remove ARM64_SIMD_StaticCast intrinsic and the x86 TwoTypeGeneric flag
* Implementing the `S.R.Intrinsic.VectorXXX.As` methods as runtime intrinsics
* Implementing the get_Zero method on the S.R.Intrinsic.VectorXXX types as runtime intrinsics
2018-11-19  Add asserts for the types of the Span indexers. (#21037)  (Carol Eidt, 1 file, -0/+2)

Also, add a test for #20958.
2018-11-19  Make type comparisons more general purpose (#20940)  (Michal Strehovský, 1 file, -6/+20)

This has two parts:

## Part 1
CoreRT represents native type handles differently from CoreCLR - on CoreCLR, `RuntimeTypeHandle` is a wrapper over `RuntimeType` and RyuJIT is aware of that. On CoreRT, `RuntimeTypeHandle` wraps the native type handle, not a `RuntimeType`. The knowledge is hardcoded in the importer when importing the sequence "ldtoken foo / call Type.GetTypeFromHandle" - the importer just removes the call and bashes the result of ldtoken to be a reference type. CoreRT had to avoid reporting `Type.GetTypeFromHandle` as an intrinsic because of that. I'm adding another helper that lets RyuJIT avoid hardcoding that knowledge. Instead of just bashing the return type, we swap the helper call.

## Part 2
Native type handle equality checks need to go through a helper, unless the EE side says it's okay to compare native type handles directly.
2018-11-13  Fix for bug 20499.  (Eugene Rozenfeld, 1 file, -19/+57)

When the jit is copying a struct-typed return to the return local in a synchronized method on arm64, it ends up invoking an importer utility outside the importer, where impTreeLast is not set. The call sequence is:

    fgMorphBlocks --> gtNewTempAssign --> impAssignStruct --> impAssignStructPtr --> impAppendTree

When impAssignStructPtr sees GT_COMMA src nodes, it unwraps them and inserts additional statements. The fix is to pass an insertion point statement through this call chain to prevent impAssignStruct and impAssignStructPtr from calling impAppendTree outside of the importer.
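A sketch of the fix's shape with a hypothetical minimal statement list (the real importer statements carry trees and IL offsets): the callee inserts after a caller-supplied statement instead of appending at importer-only state.

    #include <cstdio>

    // Hypothetical stand-in for the importer's statement list.
    struct Statement
    {
        const char* text;
        Statement*  next;
    };

    // Mirrors the fixed impAssignStructPtr: when an extra statement must be
    // materialized (the GT_COMMA unwrapping described above), it goes after
    // the insertion point passed down the call chain, not after impTreeLast,
    // which is unset outside the importer.
    void insertAfter(Statement* insertionPoint, Statement* stmt)
    {
        stmt->next           = insertionPoint->next;
        insertionPoint->next = stmt;
    }

    int main()
    {
        Statement ret{"assign return local", nullptr};
        Statement spill{"spilled comma side effect", nullptr};
        insertAfter(&ret, &spill);
        for (Statement* s = &ret; s != nullptr; s = s->next)
        {
            printf("%s\n", s->text);
        }
        return 0;
    }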
2018-11-05  Add support for BSWAP intrinsic (#18398)  (Levi Broderick, 1 file, -0/+48)

With this change, the JIT will recognize a call to BinaryPrimitives.ReverseEndianness and will emit a bswap instruction. This logic is currently only hooked up for x86 and x64; ARM still uses fallback logic. If the JIT can't emit a bswap instruction (for example, trying to emit a 64-bit bswap in a 32-bit process), it will fall back to a software implementation, so the APIs will work across all architectures.
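A sketch of what the software fallback has to compute (portable C++ of my own, not the BCL's actual fallback code): a 64-bit reversal composed from two 32-bit reversals, the shape left for a 32-bit process.

    #include <cstdint>
    #include <cstdio>

    static uint32_t ReverseEndianness32(uint32_t v)
    {
        return (v << 24) | ((v & 0xFF00u) << 8) | ((v >> 8) & 0xFF00u) | (v >> 24);
    }

    static uint64_t ReverseEndianness64(uint64_t v)
    {
        // swap the two halves, then swap the bytes within each half
        uint32_t hi = (uint32_t)(v >> 32);
        uint32_t lo = (uint32_t)v;
        return ((uint64_t)ReverseEndianness32(lo) << 32) | ReverseEndianness32(hi);
    }

    int main()
    {
        printf("%08x\n", ReverseEndianness32(0x12345678u)); // 78563412
        printf("%016llx\n", (unsigned long long)ReverseEndianness64(0x0102030405060708ull)); // 0807060504030201
        return 0;
    }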
2018-11-05  Adding some new functions to System.Math and System.MathF (#20788)  (Tanner Gooding, 1 file, -11/+1)

* Adding BitIncrement, BitDecrement, CopySign, MaxMagnitude, and MinMagnitude to Math and MathF
* Adding FusedMultiplyAdd, IlogB, Log2, and ScaleB to Math and MathF
* Adding some basic PAL tests for fma, ilogb, log2, and scalbn
* Fixing a couple typos and adding clarifying comments
* Fixing the MSVC _VVV FCALL declarations
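The PAL tests above exercise the C runtime primitives these APIs build on; a quick C++ illustration of those four:

    #include <cmath>
    #include <cstdio>

    int main()
    {
        printf("fma(2, 3, 1) = %g\n", fma(2.0, 3.0, 1.0)); // 2*3+1 with a single rounding step
        printf("ilogb(1024)  = %d\n", ilogb(1024.0));      // unbiased binary exponent: 10
        printf("log2(1024)   = %g\n", log2(1024.0));       // 10
        printf("scalbn(3, 4) = %g\n", scalbn(3.0, 4));     // 3 * 2^4 = 48
        return 0;
    }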
2018-11-05  Updating the importer to throw a NotImplementedException if it finds a mustExpand intrinsic that it can't expand (#20792)  (Tanner Gooding, 1 file, -6/+10)

* Updating the importer to throw a NotImplementedException if it finds a mustExpand hwintrinsic that it can't expand
* Updating the JITEEVersionIdentifier
2018-11-03  Ifdef out GenTreeLclVar::gtLclILoffs  (Mike Danes, 1 file, -8/+8)

This member is only used in debug builds.
2018-11-02  Delete SMALL_TREE_NODES  (Mike Danes, 1 file, -4/+0)

It is always defined and disabling it isn't really an option - making everything "large" would require more memory and some code (e.g. gtCloneExpr) seems to be broken without SMALL_TREE_NODES.
2018-10-29  JIT: streamline temp usage for returns (#20640)  (Andy Ayers, 1 file, -9/+20)

If the jit decides it needs a return spill temp, and the return value has already been spilled to a single-def temp, re-use the existing temp for the return rather than creating a new one. In conjunction with #20553 this allows late devirtualization for calls where the object in the virtual call is the result of an inline that provides a better type, and the object formerly reached the call via one or more intermediate temps. Closes #15873.
2018-10-26  JIT: refactor how we do late devirtualization (#20553)  (Andy Ayers, 1 file, -6/+1)

Change late devirtualization to run in a postorder callback during the tree search to update GT_RET_EXPRs, instead of shoehorning it into the preorder callback. This allows the jit to reconsider all calls for late devirtualization, not just calls that are parents of particular GT_RET_EXPRs. The jit will take advantage of this in subsequent work that does more aggressive bottom-up type sharpening.

Reconsidering all calls for late devirt instead of just a select subset incurs around a 0.5% throughput impact. To mitigate this, short-circuit the tree walk when we no longer see a GTF_CALL flag. This prevents us from walking through CALL and RET_EXPR free subtrees. To make this work we have to make sure all node types propagate flags from their children. This required some updates to a few node constructors.

There is also an odd quirk in the preorder callback where it may overwrite the parent node (from GT_ASG to GT_BLK or GT_NOP). If the overwrite creates a GT_NOP, the postorder callback may see a null tree. Tolerate this for now by checking for that case in the postorder callback. Also take advantage of `gtRetExprVal` to tunnel through GT_RET_EXPR chains to the final tree to avoid needing multiple overwrites when propagating the return expression back into the IR.

With these mitigations this change has minimal throughput impact.
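A toy version of the pruned postorder walk (stand-in node type; the real code uses RyuJIT's tree walker and its pre/postorder callbacks): a subtree whose flags lack GTF_CALL cannot contain a CALL or RET_EXPR, so it is skipped whole.

    #include <cstdio>
    #include <vector>

    enum : unsigned { GTF_CALL = 0x1 }; // "a call exists somewhere in this subtree"

    struct Node
    {
        const char*        name;
        unsigned           flags; // assumed to include each child's flags
        std::vector<Node*> children;
    };

    void lateDevirtWalk(Node* tree)
    {
        if ((tree->flags & GTF_CALL) == 0)
        {
            return; // short-circuit: nothing below can be a CALL or RET_EXPR
        }
        for (Node* child : tree->children)
        {
            lateDevirtWalk(child);
        }
        printf("postorder: %s\n", tree->name); // reconsider for late devirtualization
    }

    int main()
    {
        Node call{"call Foo", GTF_CALL, {}};
        Node add{"add", 0, {}}; // call-free subtree: pruned
        Node root{"stmt", GTF_CALL, {&call, &add}};
        lateDevirtWalk(&root);
        return 0;
    }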
2018-10-23  JIT: recover types from helper calls and more (#20447)  (Andy Ayers, 1 file, -1/+4)

The jit needs to recover class handles in order to devirtualize and do other type-based optimizations. This change allows the jit to find the type for more trees: in particular, helper calls, intrinsics, and expanded static field accesses.

Also, annotate a few methods to control jit optimization. We don't want to optimize special methods that are used to inform crossgen about desirable generic instantiations. `CommonlyUsedGenericInstantiations` was already annotated but `CommonlyUsedWinRTRedirectedInterfaceStubs` wasn't. And because `RuntimeType` is sealed, calls through types are now often devirtualized. Name lookups on types are frequent, especially on error paths. The method `GetCachedName` looks like an attractive inline but simply expands into a larger sequence of two other calls, so block it from being inlined.
2018-10-12  JIT: add some devirtualization info to the inline context (#20395)  (Andy Ayers, 1 file, -1/+4)

Allows the jit to remember which calls were devirtualized and which of those were then optimized to use an unboxed entry point. This info is then dumped out as part of the inline tree. Also remove some of the clutter from the COMPlus_JitPrintInlinedMethods output stream -- we don't need to see both the in-stream results and the final results, and we don't really need to know about the budget. This information is still dumped for COMPlus_JitDump.
2018-10-05  Refactoring of struct promotion code for the future fix. (#20216)  (Sergey Andreenko, 1 file, -3/+2)

Create StructPromotionHelper.

* Add a check for promoted fields with a changed type. `CanPromoteStructType` can fake a field type if the field is a struct of a single field of scalar type aligned at its natural boundary. Add a check in debug that when we morph a struct field access we have the expected type in the field handle.
2018-09-28  Make `structType` optional in jitEEInterface method `getFieldType`. (#20191)  (Sergey Andreenko, 1 file, -2/+1)

* Make `structType` optional in `getFieldType`. The declaration in corinfo.h says: "if 'structType' == 0, then don't bother the structure info". However, `getFieldTypeInternal` did not check this case.
* Do not bother the structure info when we do not need it from `getFieldType`.
2018-09-26  Do not treat a custom layout as overlapping when trying to inline a struct method. (#20141)  (Sergey Andreenko, 1 file, -1/+1)

* Do not treat custom layout as overlapping when trying to inline a struct method. That hack was added 4 years ago with independent struct promotion for parameters. I was not able to find any related issues. This change produces asm diffs because it allows us to inline more (lvaCanPromoteStructVar is a multiplier for the inlineIsProfitable parameter). For System.Private.CoreLib the total bytes of diff is 294 (0.01% of base). For example, we started to inline methods from 'System.Threading.Tasks.ValueTask', which has '[StructLayout(LayoutKind.Auto)]'.
* Always sort fields in lvaCanPromoteStructType. It will be optimized away in future commits.
* Delete the argument that is no longer used.
* Fix a variable name according to jit-coding-conventions. Rename 'StructPromotionInfo' to 'structPromotionInfo'.
2018-09-17  Jit: ensure objp properly updated by SpillRetExprHelper (#19974)  (Andy Ayers, 1 file, -7/+8)

We need to arrange to actually update the call objp in SpillRetExprHelper; otherwise the changes don't propagate back into the tree.
2018-09-06  Remove unused GenTree flags (#19840)  (mikedn, 1 file, -13/+5)

GTF_IND_ARR_LEN was set by the importer in minopts/debug mode and used only by value numbering, which does not run in minopts/debug mode. GTF_FLD_NULLCHECK was also set by the importer and not used anywhere. fgMorphField has its own opinion about when an explicit null check is needed.
2018-08-23  JIT: fix handling of newarr size (#19633)  (Andy Ayers, 1 file, -0/+17)

`newarr` accepts either an int or native int for array size, and the newarr helper the jit invokes takes a native int. If the value on the stack is an int and the jit is running on a platform where native int is larger than int, the jit must explicitly widen the int to native int. Found this while running some code through the interpreter -- the path the interpreter uses to invoke jitted code doesn't guarantee that int args are 64 bit clean.
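A self-contained illustration of why the explicit widening matters, simulating a 64-bit register whose upper bits are stale (plain C++, not jit code):

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        int64_t reg   = 0x7fffffff00000005LL; // low 32 bits hold the IL int 5; high bits are garbage
        int32_t ilInt = (int32_t)reg;         // the value the IL stack logically contains

        int64_t withoutWidening = reg;            // trusting the register: a huge bogus length
        int64_t withWidening    = (int64_t)ilInt; // the explicit cast the jit must now emit

        printf("without widening: %lld\n", (long long)withoutWidening);
        printf("with widening:    %lld\n", (long long)withWidening);
        return 0;
    }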
2018-08-22  define FMT_BB as "BB%02u" and use it uniformly in the codebase  (Brian Sullivan, 1 file, -31/+36)

We use this format when printing the BasicBlock number bbNum. The define is used with string concatenation to put the block number into printf format strings.
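For example (the define matches the commit; the one-field BasicBlock is a stand-in):

    #include <cstdio>

    #define FMT_BB "BB%02u" // string-literal concatenation does the rest

    struct BasicBlock { unsigned bbNum; };

    int main()
    {
        BasicBlock pred{3}, succ{17};
        printf("edge " FMT_BB " -> " FMT_BB "\n", pred.bbNum, succ.bbNum); // edge BB03 -> BB17
        return 0;
    }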
2018-07-13  JIT: optimize some cases of unused structs (#18819)  (Andy Ayers, 1 file, -1/+16)

In some cases CSC will use `ldobj; pop` to null check a pointer to struct. This change avoids copying the struct value for such constructs. Codegen may still redundantly null check, if there are multiple such checks in a method. Fixes #18710.
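A plain C++ sketch of the semantics (not the jit's IR): the struct value is never needed, only a fault if the pointer is null, so touching one byte is enough.

    #include <cstdio>

    struct Big { char bytes[256]; };

    // What `ldobj; pop` really asks for: fault if p is null. Copying all
    // 256 bytes, the old behavior, is wasted work.
    void nullCheckOnly(const Big* p)
    {
        volatile char probe = *reinterpret_cast<const volatile char*>(p); // faults on null
        (void)probe; // discarded, mirroring the `pop`
    }

    int main()
    {
        Big b{};
        nullCheckOnly(&b);
        printf("checked without copying %zu bytes\n", sizeof(Big));
        return 0;
    }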
2018-07-03  compGSReorderStackLayout is not supported when EnC is on (#18713)  (Andrew Au, 1 file, -4/+7)
2018-07-03  Merge pull request #18504 from mikedn/comp-small  (Bruce Forstall, 1 file, -8/+21)

Make Compiler and CodeGen objects smaller
2018-07-01  Fix recursive inlining of PInvoke stubs (#18737)  (Jan Kotas, 1 file, -3/+3)

PInvoke stubs can be inlined into regular methods in CoreRT. The PInvoke transition has to be inlined as well when that happens; otherwise, there is a risk of recursive inlining in corner cases.
2018-06-30  Remove Compiler::impSmallStack  (Mike Danes, 1 file, -8/+21)

Memory allocation via ArenaAllocator is cheap so it's not worth having this inline array that ends up wasting 384 bytes when a larger stack is needed. Care needs to be taken when inlining to avoid allocating a new stack every time - just steal the stack of the inliner compiler (after ensuring that it is large enough).
2018-06-30  Pass CompAllocator by value (#15025)  (mikedn, 1 file, -3/+3)

Passing CompAllocator objects by value is advantageous because it no longer needs to be dynamically allocated and cached. CompAllocator instances can now be freely created, copied and stored, which makes adding new CompMemKind values easier.

Together with other cleanup this also improves memory allocation performance by removing some extra levels of indirection that were previously required - jitstd::allocator had a pointer to CompAllocator, CompAllocator had a pointer to Compiler and Compiler finally had a pointer to ArenaAllocator. Without MEASURE_MEM_ALLOC enabled, both jitstd::allocator and CompAllocator now just contain a pointer to ArenaAllocator. When MEASURE_MEM_ALLOC is enabled CompAllocator also contains a pointer, but to a MemStatsAllocator object that holds the relevant memory kind. This way CompAllocator is always pointer sized, so that enabling MEASURE_MEM_ALLOC does not result in increased memory usage due to objects that store a CompAllocator instance.

In order to implement this, 2 additional significant changes have been made:

* MemStats has been moved to ArenaAllocator; it's after all the allocator's job to maintain statistics. This also fixes some issues related to memory statistics, such as not tracking the memory allocated by the inlinee compiler (since that one used its own MemStats instance).
* Extract the arena page pooling logic out of the allocator. It doesn't make sense to pool an allocator; it has very little state that can actually be reused, and everything else (including MemStats) needs to be reset on reuse. What really needs to be pooled is just a page of memory.

Since this was touching allocation code, the opportunity has been used to perform additional cleanup:

* Remove unnecessary LSRA ListElementAllocator
* Remove compGetMem and compGetMemArray
* Make CompAllocator and HostAllocator more like the std allocator
* Update HashTable to use CompAllocator
* Update ArrayStack to use CompAllocator
* Move CompAllocator & friends to alloc.h
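A simplified sketch of the resulting layout (the names match the commit, the bodies are placeholders): CompAllocator stays pointer-sized, so it is cheap to create, copy, and pass by value.

    #include <cstddef>
    #include <cstdlib>

    // Placeholder arena: the real one pools pages; malloc stands in here.
    class ArenaAllocator
    {
    public:
        void* allocateMemory(size_t size) { return malloc(size); }
    };

    // One pointer of state: freely copied and stored by value.
    class CompAllocator
    {
        ArenaAllocator* m_arena;

    public:
        explicit CompAllocator(ArenaAllocator* arena) : m_arena(arena) {}

        template <typename T>
        T* allocate(size_t count)
        {
            return static_cast<T*>(m_arena->allocateMemory(count * sizeof(T)));
        }
    };

    int main()
    {
        ArenaAllocator arena;
        CompAllocator  alloc(&arena); // no dynamic allocation of the allocator itself
        int* data = alloc.allocate<int>(16);
        data[0]  = 42;
        int ok   = (data[0] == 42) ? 0 : 1;
        free(data); // only because this sketch backs the arena with malloc
        return ok;
    }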
2018-06-29  Remove relocations for vtable chunks (#17147)  (Gleb Balykov, 1 file, -3/+4)

* Separate sections READONLY_VCHUNKS and READONLY_DICTIONARY
* Remove relocations for the second-level indirection of the vtable when FEATURE_NGEN_RELOCS_OPTIMIZATIONS is enabled. Introduce FEATURE_NGEN_RELOCS_OPTIMIZATIONS, under which NGEN-specific relocation optimizations are enabled
* Replace push/pop of R11 in stubs with str/ldr of R4 in space reserved in the epilog for non-tail calls, and with usage of R4 for hybrid tail calls (same as for EmitShuffleThunk)
* Replace push/pop of R11 in the function epilog with usage of LR as a helper register right before its restore from the stack
2018-06-28  JIT: fix bug returning small structs on linux x64 (#18563)  (Andy Ayers, 1 file, -6/+82)

The jit was retyping all calls with small struct returns as power-of-two sized ints when generating code for linux x64 and arm32/arm64. When the results of those calls were assigned to fields, the jit would use overly wide stores, which could corrupt neighboring fields.

The fix is to keep better track of the smallest power-of-two enclosing int type size for the struct. If this exactly matches the struct size, then the result of the call can be used in an arbitrary context. If the enclosing type is larger, the call's result must be copied to a temp, and then the temp can be reinterpreted as the struct for subsequent uses.

Defer retyping inline candidates and tail call candidates, then handle deferred updates for cases where calls don't get inlined.

Add test cases for 6 byte structs showing the various situations the jit must handle: callee is not an inline candidate, is an inline candidate and gets inlined, or is an inline candidate that does not get inlined. Add a 3 byte test case to get coverage on arm32. Add the new tests to the arm32/arm64 test lists.
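A sketch of the size test driving the decision (plain C++; the jit's real logic lives in its call-retyping code):

    #include <cstdio>
    #include <initializer_list>

    // Smallest power-of-two integer size enclosing a struct of `size` bytes.
    unsigned enclosingIntSize(unsigned size)
    {
        unsigned pow2 = 1;
        while (pow2 < size)
        {
            pow2 <<= 1;
        }
        return pow2;
    }

    int main()
    {
        // 4-byte struct: enclosing int matches, so the call result can be used
        // directly. 6-byte struct: enclosing int is 8 bytes; an 8-byte store
        // would clobber 2 neighboring bytes, so spill to a temp and reinterpret.
        for (unsigned sz : {3u, 4u, 6u, 8u})
        {
            printf("struct of %u bytes -> enclosing int %u (%s)\n", sz, enclosingIntSize(sz),
                   enclosingIntSize(sz) == sz ? "use directly" : "copy to temp");
        }
        return 0;
    }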
2018-06-25  Cross-bitness in instance fields placement and CORINFO structs (#18366)  (Egor Chesakov, 1 file, -7/+7)

* Replace sizeof(OBJECTREF*) with TARGET_POINTER_SIZE
* Define IN_TARGET_32BIT and IN_TARGET_64BIT macros
* Replace IN_WIN32 and IN_WIN64 with IN_TARGET_32BIT and IN_TARGET_64BIT in src/vm/jitinterface.cpp and src/vm/methodtablebuilder.cpp
* Define and use OFFSETOF__CORINFO_* constants in clrjit
* Define for all 64-bit targets
* Use unsigned __int32 to emphasize that this is a 32-bit number
* Rename Array__u1Elems to Array__data
* Eliminate OFFSETOF__CORINFO_RefArray__length
* Rename OFFSETOF__CORINFO_RefAny__ to OFFSETOF__CORINFO_TypedReference__
* Fix OFFSETOF__CORINFO_TypedReference__dataPtr macro value
2018-06-23  Use correct basic block to check legality of PInvoke callsite for inlining (#18620)  (Jan Kotas, 1 file, -1/+3)

It is the same logic as used in other similar places.
2018-06-19  PInvoke calli support for CoreRT (#18534)  (Jan Kotas, 1 file, -47/+34)

* Ifdef out the NGen-specific PInvoke calli inlining limitation for CoreCLR. This limitation seems to be a left-over from the effort to eliminate JITing with fragile NGen.
* Delete dead partial-trust related code
* Allow PInvoke stub inlining
* Add convertCalliToCall JIT/EE interface method
* Update superpmi
2018-06-13  [Windows|Arm64|VarArgs] Correctly pass HFA arguments (#18364)  (Jarret Shook, 1 file, -2/+6)

* Fix passing an HFA of two floats to vararg methods. Previously, the type would be reported as HFA and enregistered; however, this is not correct, as the arm64 varargs ABI requires passing using int registers.
* Address linux build issue
* Apply final format patch
* Add _TARGET_WINDOWS_
2018-06-13  Fix enregistered lclFld bug (#18418)  (Carol Eidt, 1 file, -1/+6)

* Fix enregistered lclFld bug: in `impFixupStructReturnType()`, don't transform to `GT_LCL_FLD` if we have a scalar lclVar. Also, to avoid future bad codegen, add verification and recovery code to Lowering. Fix #18408
* Extract the full conditions for whether a lclVar is a reg candidate, so it can be called from the assert in Lowering.
* Review feedback
2018-06-06  Fix issue in Compiler::impImportStaticFieldAccess (#18328)  (Jan Vorlicek, 1 file, -2/+10)

When the CORINFO_FIELD_STATIC_SHARED_STATIC_HELPER accessor is being processed for structs, the method creates a GT_IND node instead of a GT_OBJ node. For structs that can be normalized to a specific type (like TYP_SIMD8), this causes an assert in Compiler::impNormStructVal in the switch case for GT_IND. This change modifies the node creation to work the same way as in the default case for accessors, creating GT_OBJ for structs and GT_IND otherwise.
2018-06-04  JIT: tolerate byref/nint return type mismatches during inlining (#18274)  (Andy Ayers, 1 file, -4/+14)

Have the jit tolerate byref/nint return type mismatches for inlines the same way that it tolerates them for ret instructions. Fixes #18270.
2018-06-04  Merge pull request #18267 from mikedn/lockadd3  (Carol Eidt, 1 file, -4/+4)

Cleanup LOCKADD handling
2018-06-04  Cleanup LOCKADD handling  (Mike Danes, 1 file, -4/+4)

LOCKADD nodes are generated rather early and there's no reason for that:

* The CORINFO_INTRINSIC_InterlockedAdd32/64 intrinsics are not actually used. Even if they were used, we could still import them as XADD nodes and rely on lowering to generate LOCKADD when needed.
* gtExtractSideEffList transforms XADD into LOCKADD, but this can be done in lowering. LOCKADD is an XARCH-specific optimization after all.

Additionally:

* Avoid the need for special handling in LSRA by making GT_LOCKADD a "no value" oper.
* Split LOCKADD codegen from XADD/XCHG codegen; attempting to use the same code for all 3 just makes things more complex.
* The address is always in a register, so there's no real need to create an indir node on the fly; the relevant emitter functions can be called directly.

The last point above is actually a CQ issue - we always generate `add [reg], imm`; more complex address modes are not used. Unfortunately this problem starts early, when the importer spills the address to a local variable. If that ever gets fixed then we could probably generate a contained LEA in lowering.
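The XADD/LOCKADD distinction in everyday C++ terms (standard C++, not jit code): whether the old value is consumed decides which instruction shape suffices on x86.

    #include <atomic>
    #include <cstdio>

    int main()
    {
        std::atomic<int> counter{0};

        counter.fetch_add(1);           // result discarded: `lock add [mem], 1` suffices
        int old = counter.fetch_add(1); // result consumed: needs `lock xadd`

        printf("old=%d now=%d\n", old, counter.load());
        return 0;
    }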
2018-06-03  Set GTF_RELOP_QMARK in gtNewQmarkNode  (luqunl, 1 file, -5/+0)
2018-06-02  Moving the x86 lookupHWIntrinsic and lookupHWIntrinsicISA methods to be static methods on HWIntrinsicInfo  (Tanner Gooding, 1 file, -4/+4)
2018-06-02  JIT: fix issue with zero-length stackalloc (#18183)  (Andy Ayers, 1 file, -5/+10)

Defer calling `setNeedsGSSecurityCookie` until we know for sure that the jit won't do the zero-length stackalloc optimization. Closes #18176.
2018-05-22  Remove JIT LEGACY_BACKEND code (#18064)  (Bruce Forstall, 1 file, -108/+10)

All code related to the LEGACY_BACKEND JIT is removed. This includes all code related to x87 floating-point code generation. Almost 50,000 lines of code have been removed.

* Remove legacyjit/legacynonjit directories
* Remove reg pairs
* Remove tiny instruction descriptors
* Remove compCanUseSSE2 (it's always true)
* Remove unused FEATURE_FP_REGALLOC
2018-05-18  JIT: clear up some no-op casts in the importer (#17996)  (Andy Ayers, 1 file, -9/+19)

If we see a no-op cast (say converting long to ulong) where the operand to be cast already has the right stack type, don't generate a GT_CAST; just use the operand. This comes up in some cases with unsafe operations where the pointer type is cast. Interposing a cast between a GT_IND and GT_ADDR results in poor codegen, as the jit thinks the addressed local is address-taken.
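A schematic of the change with stand-in types (hypothetical helper; not the actual importer code): when the operand already has the cast's stack type, hand it back instead of wrapping it.

    #include <cstdio>

    enum var_types { TYP_INT, TYP_LONG };

    struct GenTree
    {
        var_types gtType;
    };

    // long <-> ulong (and int <-> uint) share a stack type, so such casts are
    // no-ops at this level; reusing the operand avoids a GT_CAST sitting
    // between GT_IND and GT_ADDR and defeating address-mode formation.
    GenTree* importCast(GenTree* op, var_types stackType)
    {
        if (op->gtType == stackType)
        {
            return op; // no-op cast: just use the operand
        }
        return new GenTree{stackType}; // placeholder for a real GT_CAST node
    }

    int main()
    {
        GenTree  longVal{TYP_LONG};
        GenTree* result = importCast(&longVal, TYP_LONG); // e.g. (ulong)someLong
        printf("%s\n", result == &longVal ? "cast elided" : "cast created");
        return 0;
    }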
2018-05-10  JIT: fix case with slow generic delegate creation (#17802)  (Andy Ayers, 1 file, -0/+9)

The jit has been using IR pattern matching to find the information needed for optimizing delegate construction. This missed one case, causing the code to use a very slow construction path. The fix relies on the token cache to convey this information to the optimization; with this change the jit now relies on the token cache for all the other cases too.

This initial commit preserves the pattern match code and verifies that when it fires it reaches the same conclusion as the token cache does. This cross-validation revealed one case where the token information was less specific than the pattern match, so we now also update the method handle in the token from the call info. A subsequent commit will remove the pattern matching when we are confident the new token cache logic handles all the cases correctly. Closes #12264.
2018-04-24  Fix TailCallStress mode to do more legality checks (#17763)  (Eugene Rozenfeld, 1 file, -5/+13)

Check with the EE whether a tail call is allowed before adding the tail. prefix in TailCallStress mode.
2018-04-17  Unix/x64 ABI cleanup  (Carol Eidt, 1 file, -17/+17)

Eliminate `FEATURE_UNIX_AMD64_STRUCT_PASSING` and replace it with `UNIX_AMD64_ABI` when used alone. Both are currently defined; it is highly unlikely the latter will work alone; and it significantly clutters up the code, especially the JIT. Also, fix the altjit support (now `UNIX_AMD64_ABI_ITF`) to *not* call `ClassifyEightBytes` if the struct is too large. Otherwise it asserts.
2018-03-30  Tighten arm32/arm64 write barrier kill reg sets  (Bruce Forstall, 1 file, -3/+3)

The JIT write barrier helpers have a custom calling convention that avoids killing most registers. The JIT was not taking advantage of this, and thus was killing unnecessary registers when a write barrier was necessary. In particular, some integer callee-trash registers are unaffected by the write barriers, and no floating-point register is affected.

Also, I got rid of the `FEATURE_WRITE_BARRIER` define, which is always set. I also put some code under `LEGACY_BACKEND` for easier cleanup later. I removed some unused defines in target.h for some platforms.
2018-03-26  Merge pull request #15301 from mikedn/cast-un  (Carol Eidt, 1 file, -45/+45)

Fix inconsistent handling of zero extending casts
2018-03-22  Fix bug for #17089  (Petr Bred, 1 file, -2/+4)

Signed-off-by: Petr Bred <bredpetr@gmail.com>
2018-03-20  JIT: remove boxing for interface call to shared generic struct (#17006)  (Andy Ayers, 1 file, -6/+53)

Improve the jit's ability to optimize a box-interface call sequence to handle cases where the unboxed entry point requires a method table argument. Added two test cases, one from the inspiring issue, and another where the jit is then able to inline the method. Closes #16982.