path: root/src/jit/compiler.hpp
Age | Commit message | Author | Files | Lines
2019-06-05 | Small fixes around AST nodes. (#24957) | Sergey Andreenko | 1 | -1/+1
* Fix MEASURE_NODE_SIZE and naming mistakes. * The additional fields were deleted in #14582 (~1.5 years ago). * Fix GT_INDEX_ADDR def. We created them as `new (this, GT_INDEX_ADDR) GenTreeIndexAddr` but used the smaller `GenTreeIndex` as the necessary size. * Use LargeOpOpcode instead of GT_CALL.
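The sizing mismatch is easiest to see in a small model. Below is a minimal sketch, assuming a simplified node hierarchy and a plain arena stand-in (none of these definitions are the actual JIT types): allocating with the smaller type's size while constructing the larger type under-allocates the node.

```cpp
// Minimal sketch (not the actual JIT allocator) of the sizing bug described above:
// the node is constructed as GenTreeIndexAddr, so the allocation must be sized for
// GenTreeIndexAddr rather than for the smaller GenTreeIndex.
#include <cstddef>
#include <new>

struct GenTree          { int gtOper = 0; };
struct GenTreeIndex     : GenTree      { GenTree* gtArr = nullptr; GenTree* gtIndex = nullptr; };
struct GenTreeIndexAddr : GenTreeIndex { unsigned gtElemSize = 0; }; // extra fields -> larger node

static void* allocNode(std::size_t size) { return ::operator new(size); }

int main()
{
    // Buggy pattern: size taken from the smaller type, larger type constructed in place:
    //   new (allocNode(sizeof(GenTreeIndex))) GenTreeIndexAddr();   // overruns the allocation
    // Fixed pattern: the allocation size matches the constructed type.
    GenTreeIndexAddr* node = new (allocNode(sizeof(GenTreeIndexAddr))) GenTreeIndexAddr();
    return (node->gtElemSize == 0) ? 0 : 1;
}
```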
2019-05-16 | Ensure that SIMD fields are correctly typed (#24377) | Carol Eidt | 1 | -0/+6
When a struct field is imported, its type needs to be normalized. Also, the LHS of a struct init, even if a SIMD type, should not be transformed to a non-block node, except in the case of a SIMD local, in which case it must be transformed to a simple assignment. Also, add an assert to catch this kind of bug in liveness. Fix #24336
2019-05-01 | Adjust some terms (#24351) | Dan Moseley | 1 | -1/+1
2019-04-19 | Fixes for Zero Offset field sequence tracking | Brian Sullivan | 1 | -0/+15
- A GT_LCL_VAR may have a zeroOffset field - Add an assert to prevent building field sequences with duplicates - Fix fgMorphField when we have a zero offset field - Improve fgAddFieldSeqForZeroOffset - Add JitDump info - Handle GT_LCL_FLD - Changing the sign of an int constant also removes any field sequence information - Added a method header comment for fgAddFieldSeqForZeroOffset - Changed when we call fgAddFieldSeqForZeroOffset to be before the call to fgMorphSmpOp - Prefer calling fgAddFieldSeqForZeroOffset() to GetZeroOffsetFieldMap()->Set()
2019-04-17 | Add lvIsImplicitByRef information to lvaSetStruct (#19223) | Jarret Shook | 1 | -2/+9
Before, implicit byrefs were tracked by setting lvIsParam and lvIsTemp. This change explicitly adds a flag for implicitByRef instead of overloading those bits. In addition, it fixes the decision to copy an implicitByRef for arm64 varargs. Temporarily bump weight on byref params to match old behavior and avoid codegen diffs. Re-enabled various tests and parts of tests. Closes #20046 Closes #19860
2019-04-16 | Arm64 vector ABI (#23675) | Carol Eidt | 1 | -3/+3
* Support for Arm64 Vector ABI Extend HFA support to support vectors as well as floating point types. This requires that the JIT recognize vector types even during crossgen, so that the ABI is supported consistently. Also, fix and re-enable the disabled Arm64 Simd tests. Fix #16022
2019-03-29 | Use GenTreeStmt* where it is implied. (#22963) | Sergey Andreenko | 1 | -7/+4
* Extract `impAppendStmt` and `impExtractLastStmt`. * Delete `BEG_STMTS` fake stmt. Use new functions to keep the list updated. * Retype `impTreeList` and `impTreeLast` as statements. Rename `impTreeList` and `impTreeLast` to show that they are statements. * Fix fields that have to be stmt. * Start using GenTreeStmt. Change `optVNAssertionPropCurStmt` to use GenTreeStmt. Replace `GenTree* stmt = block->bbTreeList` with `GenTreeStmt* stmt = block->firstStmt()`. Save results of `FirstNonPhiDef` as `GenTreeStmt`. * Replace do-while with for loop. * Change type inside VNAssertionPropVisitorInfo. * Delete unused args from `optVNConstantPropOnTree`. * Update fields to be stmt. Update optVNConstantPropCurStmt to use Stmt. Change `lvDefStmt` to stmt. Update LoopCloning structs. Update `optDebugLogLoopCloning`. Make `compCurStmt` a statement. Update declaration name in `BuildNode`. * Clean simple cpp files. Clean valuenum. Clean ssabuilder. Clean simd. Clean optcse. Clean loopcloning. Clean copyprop. Clean optimizer part1. * Start cleaning importer, morph, flowgraph, gentree. * Continue cleaning functions. Clean assertionprop. Clean morph. Clean gentree. Clean flowgraph. Clean compiler. Clean rangecheck. Clean indirectcalltransformer. Clean others. * Create some temp stmts. * Delete unnecessary noway_assert and casts. * Init `impStmtList` and `impLastStmt` in release. * Respond to review feedback.
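The statement-walking idiom the change standardizes on (`GenTreeStmt* stmt = block->firstStmt()` plus a for loop) can be sketched as below; `firstStmt` comes from the commit message, while the field names `gtStmtExpr` and `gtNextStmt` are illustrative assumptions rather than guaranteed JIT names.

```cpp
// Minimal self-contained sketch of the typed statement list described above.
#include <cstdio>

struct GenTree { /* expression node */ };

struct GenTreeStmt
{
    GenTree*     gtStmtExpr = nullptr; // root expression of the statement (assumed field name)
    GenTreeStmt* gtNextStmt = nullptr; // next statement in the block (assumed field name)
};

struct BasicBlock
{
    GenTreeStmt* bbFirstStmt = nullptr;
    GenTreeStmt* firstStmt() const { return bbFirstStmt; }
};

// Replaces the old "GenTree* stmt = block->bbTreeList" do-while pattern with a
// typed for-loop over statements, as the change describes.
void visitStatements(BasicBlock* block)
{
    for (GenTreeStmt* stmt = block->firstStmt(); stmt != nullptr; stmt = stmt->gtNextStmt)
    {
        std::printf("visiting statement with root %p\n", (void*)stmt->gtStmtExpr);
    }
}

int main()
{
    BasicBlock block;
    visitStatements(&block); // empty block: the loop body never runs
    return 0;
}
```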
2019-03-19 | Adding const to functions that don't change or shouldn't change state (#23329) | Brian Bohe | 1 | -1/+1
2019-03-13 | Fix for Issue 21231 | Brian Sullivan | 1 | -2/+2
When transferring a Zero offset from one GenTree node to another, we need to check if there already is a FieldSeq and append to it. Added a third parameter 'kind' to JitHashTable::Set, and added enum SetKind. Only allow Set to overwrite an existing entry when kind is set to Overwrite. Added validation for all calls to JitHashTable::Set, asserting that we don't expect the key to already exist or that we passed Overwrite indicating that we expect to handle it properly. Added two test cases for Issue 21231
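A rough model of the Set/SetKind contract described here, assuming a simple std::map-backed table in place of the real templated JitHashTable; the enum value name `None` is an assumption, while `Overwrite` comes from the commit message.

```cpp
// Self-contained sketch of the overwrite-checking Set behavior described above.
#include <cassert>
#include <map>

enum SetKind { None, Overwrite };

template <typename K, typename V>
class SimpleHashTable
{
    std::map<K, V> m_map;
public:
    void Set(const K& key, const V& value, SetKind kind = None)
    {
        // Unless the caller explicitly asks to overwrite, an existing key is a bug.
        assert(kind == Overwrite || m_map.find(key) == m_map.end());
        m_map[key] = value;
    }
};

int main()
{
    SimpleHashTable<int, int> table;
    table.Set(1, 10);            // first insertion: fine
    table.Set(1, 20, Overwrite); // caller declares the intent to replace the entry
    return 0;
}
```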
2019-02-12 | JIT: change how we block gc refs from callee saves for inline pinvokes (#22477) | Andy Ayers | 1 | -0/+1
Add a new marker instruction that we emit once we've enabled preemptive gc in the inline pinvoke method prolog. Use that to kill off callee saves registers with GC references, instead of waiting until the call. This closes a window of vulnerability we see in GC stress where if a stress interrupt happens between the point at which we enable preemptive GC and the point at which we make the call, we may report callee saves as GC live when they're actually dead. Closes #19211.
2019-01-30 | Remove GTF_ADDR_ONSTACK and IsVarAddr. | Eugene Rozenfeld | 1 | -25/+0
IsVarAddr was checking GTF_ADDR_ONSTACK to determine if the GT_ADDR node is an address of a local. This change removes both GTF_ADDR_ONSTACK and IsVarAddr and uses IsLocalAddrExpr instead. IsLocalAddrExpr uses opcodes to determine if a GT_ADDR node is a local address. The GTF_ADDR_ONSTACK flag is ancient, added before 2002, so I couldn't find the checkin that introduced it. I changed the assert to a check and an assignment since simplifications inside fgMorphArgs between https://github.com/dotnet/coreclr/blob/1a1e4c4d5a8030cb8d82a2e5b06c2ab357b92534/src/jit/morph.cpp#L3709 (which causes https://github.com/dotnet/coreclr/blob/1a1e4c4d5a8030cb8d82a2e5b06c2ab357b92534/src/jit/morph.cpp#L3057) and https://github.com/dotnet/coreclr/blob/1a1e4c4d5a8030cb8d82a2e5b06c2ab357b92534/src/jit/morph.cpp#L3790 may result in more GT_ADDR nodes recognized by IsLocalAddrExpr. x86 and x64 pmi frameworks had no code diffs and some gcinfo reductions (15 methods with gcinfo diffs in x86). Fixes #22190.
2019-01-09 | Merge pull request #20772 from mikedn/ir-cleanup | Bruce Forstall | 1 | -73/+4
Some IR cleanup
2019-01-04 | JIT: encapsulate general checks for optimization | Andy Ayers | 1 | -2/+2
Add methods that answer the general question of whether or not the jit is optimizing the code it produces. Use this to replace composite checks for minopts and debug codegen (the two modes where the jit is not optimizing).
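A sketch of the encapsulation being described, assuming illustrative member names (the commit only says the new queries replace composite minopts/debug-codegen checks):

```cpp
// Sketch: a single query replaces scattered "minopts || debug codegen" checks.
// The member names here are illustrative assumptions, not the exact JIT fields.
struct CompilerOptions
{
    bool minOpts;   // compiling with minimal optimization
    bool debugCode; // compiling debuggable (unoptimized) code

    bool OptimizationDisabled() const { return minOpts || debugCode; }
    bool OptimizationEnabled()  const { return !OptimizationDisabled(); }
};

int main()
{
    CompilerOptions opts{ /* minOpts */ false, /* debugCode */ true };
    // Callers ask the general question instead of re-spelling the composite check.
    return opts.OptimizationEnabled() ? 0 : 1;
}
```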
2018-12-21 | Improve removal of dead calls to allocator helpers. | Eugene Rozenfeld | 1 | -9/+8
This change improves detection of allocators with side effects. Allocators can cause side effects if the allocated object may have a finalizer. This change adds a pHasSideEffects parameter to getNewHelper JitEE interface method. It's used by the jit to check for allocator side effects instead of guessing from helper ids. Fixes #21530.
2018-12-16 | Merge pull request #21400 from BruceForstall/FixArm32FloatRange | Bruce Forstall | 1 | -37/+46
Fix arm32 local variable references
2018-12-16 | Enable object stack allocation in R2R mode. | Eugene Rozenfeld | 1 | -5/+7
This change modified the importer to create GenTreeAllocObj node for box and newobj instead of a helper call in R2R mode. ObjectAllocator phase decides whether the object can be allocated on the stack or has to be created on the heap via a helper call. To trigger object stack allocation COMPlus_JitObjectStackAllocation has to be set (it's not set by default).
2018-12-10 | Eliminate GenTreeRegVar and GT_REG_VAR and RegVar (#18317) | Julius R Friedman | 1 | -1/+0
Issue #18201 / Hackathon
2018-12-05 | Fix arm32 local variable references | Bruce Forstall | 1 | -37/+46
Arm32 has different addressing mode offset ranges for floating-point and integer instructions. In addition, the ranges aren't too large. So in functions with a frame pointer, we try to access some variables using the frame pointer and some with the stack pointer, to expand the total number of variables we can access without allocating a "reserved register" just used for constructing large offsets. This calculation was incorrect for struct variables that contained floats, as float fields require calculating using the floating point range, but we were calculating using the variable type (struct), instead of the instruction type (floating-point). In addition, we were not correctly calculating the frame pointer range using the actual variable offset plus "within variable" offset (struct member offset). Added a test that covers some of these cases. Fixes #19537
2018-11-05 | Add support for BSWAP intrinsic (#18398) | Levi Broderick | 1 | -0/+2
With this change, the JIT will recognize a call to BinaryPrimitives.ReverseEndianness and will emit a bswap instruction. This logic is currently only hooked up for x86 and x64; ARM still uses fallback logic. If the JIT can't emit a bswap instruction (for example, trying to emit a 64-bit bswap in a 32-bit process), it will fall back to a software implementation, so the APIs will work across all architectures.
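For illustration, a portable software byte swap like the fallback path described (a generic sketch, not the actual CoreLib or JIT code); the 64-bit swap is built from two 32-bit swaps, which matches the 32-bit-process case mentioned above.

```cpp
// Generic software byte-swap fallback: reverse byte order without a bswap instruction.
#include <cstdint>
#include <cstdio>

static uint32_t ReverseEndianness32(uint32_t value)
{
    return (value >> 24) | ((value >> 8) & 0x0000FF00u) |
           ((value << 8) & 0x00FF0000u) | (value << 24);
}

static uint64_t ReverseEndianness64(uint64_t value)
{
    // Compose the 64-bit swap from two 32-bit swaps and swap the halves,
    // one way to handle the "no 64-bit bswap available" case.
    uint64_t hi = ReverseEndianness32(static_cast<uint32_t>(value));
    uint64_t lo = ReverseEndianness32(static_cast<uint32_t>(value >> 32));
    return (hi << 32) | lo;
}

int main()
{
    std::printf("%08x\n", ReverseEndianness32(0x12345678u));                          // 78563412
    std::printf("%016llx\n", (unsigned long long)ReverseEndianness64(0x0102030405060708ull));
    return 0;
}
```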
2018-11-03 | Delete GenTreeLabel | Mike Danes | 1 | -8/+0
This is only used to hold a pointer to a BasicBlock for GenTreeBoundsChk and GenTreeIndexAddr. This doesn't serve any purpose and it does not behave like a real operand (e.g. it's not included in the linear order).
2018-11-02 | Delete SMALL_TREE_NODES | Mike Danes | 1 | -65/+4
It is always defined and disabling it isn't really an option - making everything "large" would require more memory and some code (e.g. gtCloneExpr) seems to be broken without SMALL_TREE_NODES.
2018-11-01 | Remove redundant zero-initializations for long-lifetime structs. (#20753) | Eugene Rozenfeld | 1 | -3/+4
When compInitMem is true long-lifetime structs (i.e., the ones with lvIsTemp set to false) are zero-initialized in the prolog: https://github.com/dotnet/coreclr/blob/c8a63947382b0db428db602238199ca81badbe8e/src/jit/codegencommon.cpp#L4765 Therefore, these structs don't need an explicit zero-initialization in blocks that are not in a loop.
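The resulting check can be summarized roughly as follows; `compInitMem` and `lvIsTemp` come from the commit message, but the surrounding structure and the function name are assumptions made for illustration.

```cpp
// Condensed sketch of the redundancy condition described above.
struct LclVarDsc
{
    bool lvIsTemp; // short-lifetime compiler temp (not zero-initialized in the prolog)
};

struct Compiler
{
    bool info_compInitMem; // method requires zero-init, so the prolog zeroes long-lifetime locals

    // An explicit zero-init of a struct local is redundant when the prolog already
    // zeroed it, as long as the assignment is not inside a loop (where it may be
    // re-establishing the zero value on later iterations).
    bool ZeroInitIsRedundant(const LclVarDsc& varDsc, bool blockIsInLoop) const
    {
        return info_compInitMem && !varDsc.lvIsTemp && !blockIsInLoop;
    }
};

int main()
{
    Compiler  comp{ true };
    LclVarDsc longLived{ false };
    return comp.ZeroInitIsRedundant(longLived, /* blockIsInLoop */ false) ? 0 : 1;
}
```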
2018-10-30 | JIT: Fix call flag propagation for GenTreeArrElem (#20660) | Andy Ayers | 1 | -11/+2
Closes #20651. Also fix up some "near miss" cases for GenTreeField and GenTreeBoundsCheck, where we get lucky and the importer currently splits trees with temps so the currently ignored child nodes have no interesting side effects. Revise GenTreeField a bit to pull more of the initialization work into the constructor. Add a missing R2R field propagation for field nodes in GtClone (evidently also never hit in practice).
2018-10-29 | Dump the "reason" for a compiler temp | Bruce Forstall | 1 | -0/+2
Compiler temps are created with a "reason" that is dumped in JitDump. Save the reason and display it in the local variable table dump. This is useful when trying to find a particular temp and see what code has been generated using it.
2018-10-18 | [RyuJIT] Delete dead code (#20411) | mikedn | 1 | -28/+0
* Delete dead code: optFindLocalInit and related functions (optIsTrackedLocal, lvaLclVarRefs, lvaLclVarRefsAccumIntoRes, lvaLclVarRefsAccum) are not used anywhere. Also delete a bunch of undefined function declarations. * Cleanup DataFlow callback comment
2018-09-06 | Remove unused GenTree flags (#19840) | mikedn | 1 | -7/+1
GTF_IND_ARR_LEN was set by the importer in minopts/debug mode and used only by value numbering, which does not run in minopts/debug mode. GTF_FLD_NULLCHECK was also set by the importer and not used anywhere. fgMorphField has its own opinion about when an explicit null check is needed.
2018-08-25 | Remove some GT_ASG_op leftovers (#18205) | mikedn | 1 | -46/+22
2018-08-25 | Streamline fgExcludeFromSsa (#15351) | mikedn | 1 | -43/+1
This function is relatively expensive due to the many checks it does. Adding an LclVarDsc "in SSA" bit that is set during SSA construction by calling fgExcludeFromSsa only once per variable results in a 0.35% drop in instructions retired. Most of the checks done in fgExcludeFromSsa are implied by lvTracked and they could probably be converted to asserts. But lvOverlappingFields is not implied by lvTracked, so even if all redundant checks are converted to asserts, fgExcludeFromSsa still needs 2 checks rather than just one. Incidentally, this difference between tracked variables and SSA variables results in SSA and value numbers being assigned to some variables that are actually excluded from SSA - SsaBuilder::RenameVariables and fgValueNumber assign numbers to all live-in fgFirstBB variables that require initialization without checking fgExcludeFromSsa first. Structs with overlapping fields are not common but properly excluding them is still enough to save 0.15% memory when compiling corelib. - Replace calls to fgExcludeFromSsa with calls to lvaInSsa (the old name is kind of weird, it has nothing to do with the flow graph and "exclude" results in double negation) - Move fgExcludeFromSsa logic to SsaBuilder::IncludeInSsa and use it to initialize LclVarDsc::lvInSsa for all variables - Change RenameVariables and fgValueNumber to call lvaInSsa before assigning numbers to live-in fgFirstBB variables
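The "compute once during SSA construction, then test a cached bit" pattern can be sketched as below; `lvaInSsa`, `lvInSsa`, `lvTracked`, and `lvOverlappingFields` are named in the commit message, but the candidate checks and layout here are simplified assumptions.

```cpp
// Sketch of caching the SSA-inclusion answer in a per-local bit.
struct LclVarDsc
{
    bool lvTracked           = false; // variable is tracked for liveness
    bool lvOverlappingFields = false; // struct with overlapping fields
    bool lvInSsa             = false; // cached answer: included in SSA
};

struct Compiler
{
    LclVarDsc lvaTable[8];

    // Run once during SSA construction (the commit moves this into SsaBuilder::IncludeInSsa).
    void ComputeSsaInclusion()
    {
        for (LclVarDsc& dsc : lvaTable)
        {
            dsc.lvInSsa = dsc.lvTracked && !dsc.lvOverlappingFields;
        }
    }

    // The hot query is now a single bit test instead of many repeated checks.
    bool lvaInSsa(unsigned lclNum) const { return lvaTable[lclNum].lvInSsa; }
};

int main()
{
    Compiler comp;
    comp.lvaTable[0].lvTracked = true;
    comp.ComputeSsaInclusion();
    return comp.lvaInSsa(0) ? 0 : 1;
}
```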
2018-08-23 | Merge pull request #15011 from mikedn/ssa-mem-num | Carol Eidt | 1 | -2/+2
Eliminate duplicate SSA number bookkeeping
2018-08-20 | JIT: remove incremental ref count updates (#19345) | Andy Ayers | 1 | -171/+13
Remove almost all of the code in the jit that tries to maintain local ref counts incrementally. Also remove `lvaSortAgain` and related machinery. Explicitly sort locals before post-lower-liveness when optimizing to get the best set of tracked locals. Explicitly recount after post-lower liveness to get accurate counts after dead stores. This can lead to tracked unreferenced arguments; tolerate this during codegen.
2018-08-08 | JIT: update lvaGrabTemp for new minopts/debug ref counting approach (#19351) | Andy Ayers | 1 | -5/+14
If the jit has started normal ref counting and is in minopts or debug, set all new temps to be implicitly referenced by default. Closes #19346.
2018-07-31 | JIT: fast path for minopts/debug codegen in lvaMarkRefs (#19103) | Andy Ayers | 1 | -3/+15
For minopts and debug codegen, consider all locals to be implicitly referenced. This is set up during `lvaMarkLocalVars` and maintained after that by having `incRefCnts` set the implicit reference bit and not doing anything in `decRefCnts` for minopts / debug. Likewise suppress local var sorting, as we don't have accurate counts to go by.
2018-07-23 | JIT: some lclvars related cleanup (#19077) | Andy Ayers | 1 | -107/+5
Consolidate various compiler globals used when setting local var ref counts by folding them into the visitor: * lvaMarkRefsCurBlock * lvaMarkRefsCurStmt * lvaMarkRefsWeight Remove the largely vestigial `lvPrefReg` and associated methods to set or modify this field. Haven't verified, but this is likely a remnant of the legacy backend. In the one remaining use (lcl var sorting predicates), swap in `lvIsRegArg` instead, which gets most of the same cases.
2018-07-22 | JIT: stateful local ref counts and weights (#19068) | Andy Ayers | 1 | -27/+223
Introduce a notion of state for local var ref counts and weighted ref counts. Accesses and current state must agree. State is invalid initially, enabled for an early period around bits of morph, invalid again for a time, and then enabled normally once lvaMarkRefs is called. Accesses normally specify RCS_NORMAL as the desired state, but in the accesses of selected ref counts in morph, specify RCS_EARLY. Revise how we decide if normal ref counting is active by changing `lvaLocalVarRefCounted` into a method. Update `gtIsLikelyRegVar` to not access ref counts when they're not in a valid state. Change weight APIs over to use `weight_t`.
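The state-checked access pattern might look roughly like this; RCS_EARLY, RCS_NORMAL, and `weight_t` come from the commit message, while RCS_INVALID, the field names, and the accessor shape are assumptions made for a self-contained sketch (the real state lives on the Compiler and the counts on LclVarDsc).

```cpp
// Sketch of stateful ref-count access: reads must name the regime they expect.
#include <cassert>

enum RefCountState
{
    RCS_INVALID, // ref counts cannot be trusted yet
    RCS_EARLY,   // limited early counts maintained around parts of morph
    RCS_NORMAL   // normal counts, valid once lvaMarkRefs has run
};

typedef float weight_t; // weighted counts move to a dedicated type per the commit

struct LocalVar
{
    RefCountState refCountState = RCS_INVALID; // advanced as compilation progresses
    unsigned      refCnt        = 0;
    weight_t      refCntWtd     = 0;

    // A mismatch between the expected and current state asserts instead of
    // silently reading a stale or meaningless count.
    unsigned lvRefCnt(RefCountState expected = RCS_NORMAL) const
    {
        assert(refCountState == expected);
        return refCnt;
    }
};

int main()
{
    LocalVar var;
    var.refCountState = RCS_NORMAL; // pretend lvaMarkRefs has run
    return (var.lvRefCnt() == 0) ? 0 : 1;
}
```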
2018-07-20 | JIT: handle implicit local var references via local var attribute bit (#19012) | Andy Ayers | 1 | -3/+2
Instead of relying on ref count bumps, add a new attribute bit to local vars to indicate that they may have implicit references (prolog, epilog, gc, eh) and may not have any IR references. Use this attribute bit to ensure that the ref count and weighted ref count for such variables are never reported as zero, and as a result that these variables end up being allocated and reportable. This is another preparatory step for #18969 and frees the jit to recompute explicit ref counts via an IR scan without having to special case the counts for these variables. The jit can no longer describe implicit counts other than 1 and implicit weights other than BB_UNITY_WEIGHT, but that currently doesn't seem to be very important. The new bit fits into an existing padding void so LclVarDsc remains at 128 bytes (for windows x64).
2018-07-19 | Eliminate duplicate SSA number bookkeeping | Mike Danes | 1 | -2/+2
During SSA construction SsaRenameState keeps a definition count for each variable in an array. But each variable has a lvPerSsaData array that does almost the same thing, count the definitions. "Almost" because lvPerSsaData is a JitExpandArray that tracks only the array size and not the number of elements actually stored in the array. Replace JitExpandArray with a purposely designed "array" that is in charge of allocating new SSA numbers and handles their intricacies - RESERVED_SSA_NUM, UNINIT_SSA_NUM and FIRST_SSA_NUM. This also allows the removal of the allocator from the array. Allocating new SSA numbers happens only during SSA construction and it's reasonable to pass the allocator to AllocSsaNum rather than increasing the size of PerSsaArray and LclVarDsc.
2018-07-18 | JIT: force all local var ref counts to be accessed via API (#18979) | Andy Ayers | 1 | -18/+20
This is a preparatory change for auditing and controlling how local variable ref counts are observed and manipulated. See #18969 for context. No diffs seen locally. No TP impact expected. There is a small chance we may see some asserts in broader testing as there were places in original code where local ref counts were incremented without checking for possible overflows. The new APIs will assert for overflow cases.
2018-07-05 | Enable genFnCalleeRegArgs for Arm64 Varargs (#18714) | Jarret Shook | 1 | -4/+18
* Enable genFnCalleeRegArgs for Arm64 Varargs. Before, the method would early out and incorrectly expect the usage of all incoming arguments to be their homed stack slots. It is instead possible for incoming arguments to be homed to different integer registers. The change will mangle the float types for vararg cases in the same way that is done during lvaInitUserArgs and fgMorphArgs. * Apply format patch * Account for softfp case * Address feedback * Apply format patch * Use standard function header for mangleVarArgsType * Remove confusing comment
2018-06-30 | Move temp info from Compiler to RegSet | Mike Danes | 1 | -6/+6
Temporaries are only used during register allocation and code generation. They waste space (136 bytes) in the compiler object during inlining.
2018-06-30 | Pass CompAllocator by value (#15025) | mikedn | 1 | -78/+10
Passing CompAllocator objects by value is advantageous because it no longer needs to be dynamically allocated and cached. CompAllocator instances can now be freely created, copied and stored, which makes adding new CompMemKind values easier. Together with other cleanup this also improves memory allocation performance by removing some extra levels of indirection that were previously required - jitstd::allocator had a pointer to CompAllocator, CompAllocator had a pointer to Compiler and Compiler finally had a pointer to ArenaAllocator. Without MEASURE_MEM_ALLOC enabled, both jitstd::allocator and CompAllocator now just contain a pointer to ArenaAllocator. When MEASURE_MEM_ALLOC is enabled CompAllocator also contains a pointer but to a MemStatsAllocator object that holds the relevant memory kind. This way CompAllocator is always pointer sized so that enabling MEASURE_MEM_ALLOC does not result in increased memory usage due to objects that store a CompAllocator instance. In order to implement this, 2 additional significant changes have been made: * MemStats has been moved to ArenaAllocator, it's after all the allocator's job to maintain statistics. This also fixes some issues related to memory statistics, such as not tracking the memory allocated by the inlinee compiler (since that one used its own MemStats instance). * Extract the arena page pooling logic out of the allocator. It doesn't make sense to pool an allocator, it has very little state that can actually be reused and everything else (including MemStats) needs to be reset on reuse. What really needs to be pooled is just a page of memory. Since this was touching allocation code the opportunity has been used to perform additional cleanup: * Remove unnecessary LSRA ListElementAllocator * Remove compGetMem and compGetMemArray * Make CompAllocator and HostAllocator more like the std allocator * Update HashTable to use CompAllocator * Update ArrayStack to use CompAllocator * Move CompAllocator & friends to alloc.h
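The pointer-sized, pass-by-value design described above might be modeled like this (the MEASURE_MEM_ALLOC path through a per-kind stats object is omitted, and malloc stands in for the real arena pages; this is not the actual JIT implementation):

```cpp
// Simplified sketch of a pointer-sized allocator handle that is copied by value.
#include <cstddef>
#include <cstdlib>

class ArenaAllocator
{
public:
    // Real arenas carve from pooled pages; malloc stands in for brevity.
    void* allocateMemory(std::size_t size) { return std::malloc(size); }
};

class CompAllocator
{
    ArenaAllocator* m_arena; // the handle is just one pointer, so copying is cheap
public:
    explicit CompAllocator(ArenaAllocator* arena) : m_arena(arena) {}

    template <typename T>
    T* allocate(std::size_t count)
    {
        return static_cast<T*>(m_arena->allocateMemory(count * sizeof(T)));
    }
};

int main()
{
    ArenaAllocator arena;
    CompAllocator alloc(&arena); // freely created and passed around by value
    int* values = alloc.allocate<int>(4);
    values[0] = 42;
    return (values[0] == 42) ? 0 : 1;
}
```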
2018-06-22 | Expand ARM local variable R11-based addressing | Bruce Forstall | 1 | -14/+16
When addressing a local with negative offsets from R11 (if it can't be reached from SP), allow the full range of negative offsets allowed in the instructions. Floating-point load/store especially has a much bigger range than what was previously allowed.
2018-06-18 | [JIT/ARM] Fix Compiler::lvaFrameAddress | Konstantin Baladurin | 1 | -8/+5
Use an SP-based offset only if R10 is reserved or the offset is lower than the encoding limit.
2018-06-14 | [Windows|Arm64|Vararg] Add FEATURE_ARG_SPLIT (#18346) | Jarret Shook | 1 | -2/+2
* [ARM64|Windows|Vararg] Add FEATURE_ARG_SPLIT Enable splitting >8 byte <= 16 byte structs for arm64 varargs between x7 and virtual stack slot 0. * Force notHfa for vararg methods * Correctly pass isVararg * Correct var name
2018-06-13 | [Windows|Arm64|VarArgs] Correctly pass HFA arguments (#18364) | Jarret Shook | 1 | -5/+6
* Fix passing HFA of two floats to vararg methods Previously, the type would be reported as HFA and enregistered; however, this is not correct, as arm64 varargs abi requires passing using int registers. * Address linux build issue * Apply final format patch * Add _TARGET_WINDOWS_
2018-05-22 | Remove JIT LEGACY_BACKEND code (#18064) | Bruce Forstall | 1 | -184/+9
Remove JIT LEGACY_BACKEND code All code related to the LEGACY_BACKEND JIT is removed. This includes all code related to x87 floating-point code generation. Almost 50,000 lines of code have been removed. Remove legacyjit/legacynonjit directories Remove reg pairs Remove tiny instruction descriptors Remove compCanUseSSE2 (it's always true) Remove unused FEATURE_FP_REGALLOC
2018-05-15 | Do not allocate memory in compUpdateTreeLife. (#17055) | Sergey Andreenko | 1 | -23/+0
* move compUpdateLifeVar and compUpdateTreeLife to separate files for legacy and non-legacy case * create TreeLifeUpdater class
2018-04-17 | Unix/x64 ABI cleanup | Carol Eidt | 1 | -2/+2
Eliminate `FEATURE_UNIX_AMD64_STRUCT_PASSING` and replace it with `UNIX_AMD64_ABI` when used alone. Both are currently defined; it is highly unlikely the latter will work alone; and it significantly clutters up the code, especially the JIT. Also, fix the altjit support (now `UNIX_AMD64_ABI_ITF`) to *not* call `ClassifyEightBytes` if the struct is too large. Otherwise it asserts.
2018-03-30 | fix the bug | Sergey Andreenko | 1 | -0/+4
2018-03-28 | Add crossbitness support to ClrJit: | Egor Chesakov | 1 | -0/+9
* Add FEATURE_CROSSBITNESS in crosscomponents.cmake * Exclude mscordaccore mscordbi sos from CLR_CROSS_COMPONENTS_LIST when FEATURE_CROSSBITNESS is defined in crosscomponents.cmake * Introduce target_size_t in src/jit/target.h * Use size_t value in genMov32RelocatableImmediate in src/jit/codegen.h src/jit/codegencommon.cpp * Fix definition/declaration inconsistency for emitter::emitIns_R_I in emitarm.cpp * Zero HiVal when GenTree::SetOper changes GenTreeLngCon->GenTreeIntCon in src/jit/compiler.hpp * Explicitly specify roundUp(expr, TARGET_POINTER_SIZE) * Use target_size_t* target in emitOutputDataSec in src/jit/emit.cpp
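As a rough illustration of the crossbitness point: a size-like value on the target must be typed by the target's pointer size rather than the host's. The names below mirror the commit message (target_size_t, TARGET_POINTER_SIZE, roundUp), but the exact definitions are assumptions.

```cpp
// Illustrative crossbitness sketch: pick a fixed-width size type for the target.
#include <cstdint>
#include <cstdio>

#define TARGET_POINTER_SIZE 4            // e.g. a 32-bit ARM target built by a 64-bit host

#if TARGET_POINTER_SIZE == 8
typedef uint64_t target_size_t;
#else
typedef uint32_t target_size_t;
#endif

// roundUp as in "roundUp(expr, TARGET_POINTER_SIZE)": align up to a power-of-two boundary.
inline target_size_t roundUp(target_size_t size, target_size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

int main()
{
    std::printf("%u\n", (unsigned)roundUp(10, TARGET_POINTER_SIZE)); // prints 12 on a 32-bit target
    return 0;
}
```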
2018-03-26 | Merge pull request #15301 from mikedn/cast-un | Carol Eidt | 1 | -4/+5
Fix inconsistent handling of zero extending casts