path: root/src/jit/gtlist.h
Age | Commit message | Author | Files | Lines
2019-06-05 | Small fixes around AST nodes. (#24957) | Sergey Andreenko | 1 | -3/+3
* Fix MEASURE_NODE_SIZE and naming mistakes.
* The additional fields were deleted in #14582 (~1.5 years ago).
* Fix the GT_INDEX_ADDR def. We created these nodes as `new (this, GT_INDEX_ADDR) GenTreeIndexAddr` but used the smaller `GenTreeIndex` as the necessary size.
* Use LargeOpOpcode instead of GT_CALL.
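To make the sizing mistake concrete, here is a minimal sketch of the mismatch described above; the struct layouts are hypothetical stand-ins, not the real GenTree definitions. The per-oper size table named the smaller `GenTreeIndex` while the node was constructed as the larger `GenTreeIndexAddr`, so the placement-new allocation was too small.
```
#include <cstddef>

struct GenTreeIndex     { void* arr; void* index; };                   // smaller node (stand-in layout)
struct GenTreeIndexAddr { void* arr; void* index; void* bbThrow;       // larger node (stand-in layout)
                          unsigned elemSize; unsigned lenOffset; };

// What the per-oper table effectively records for GT_INDEX_ADDR: the struct
// used to size the placement-new allocation. Before the fix it named GenTreeIndex.
constexpr size_t declaredSize = sizeof(GenTreeIndex);      // too small
constexpr size_t requiredSize = sizeof(GenTreeIndexAddr);  // what actually gets constructed

static_assert(requiredSize > declaredSize, "the allocation was smaller than the constructed node");
```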
2019-03-29 | Use GenTreeStmt* where it is implied. (#22963) | Sergey Andreenko | 1 | -2/+0
* Extract `impAppendStmt` and `impExtractLastStmt`.
* Delete the `BEG_STMTS` fake stmt. Use the new functions to keep the list updated.
* Retype `impTreeList` and `impTreeLast` as statements. Rename `impTreeList` and `impTreeLast` to show that they are statements.
* Fix fields that have to be stmt.
* Start using GenTreeStmt. Change `optVNAssertionPropCurStmt` to use GenTreeStmt. Replace `GenTree* stmt = block->bbTreeList` with `GenTreeStmt* stmt = block->firstStmt()`. Save results of `FirstNonPhiDef` as `GenTreeStmt`.
* Replace a do-while with a for loop.
* Change the type inside VNAssertionPropVisitorInfo.
* Delete unused args from `optVNConstantPropOnTree`.
* Update fields to be stmt. Update optVNConstantPropCurStmt to use Stmt. Change `lvDefStmt` to stmt. Update the LoopCloning structs. Update `optDebugLogLoopCloning`. Make `compCurStmt` a statement. Update the declaration name in `BuildNode`.
* Clean simple cpp files. Clean valuenum. Clean ssabuilder. Clean simd. Clean optcse. Clean loopcloning. Clean copyprop. Clean optimizer part 1.
* Start cleaning importer, morph, flowgraph, gentree.
* Continue cleaning functions. Clean assertionprop. Clean morph. Clean gentree. Clean flowgraph. Clean compiler. Clean rangecheck. Clean indirectcalltransformer. Clean others.
* Create some temp stmts.
* Delete unnecessary noway_asserts and casts.
* Init `impStmtList` and `impLastStmt` in release.
* Respond to review round 1.
2019-03-18 | Fix explicit constructor calls and remove multi-line comments (#23162) | Sinan Kaya | 1 | -69/+69
* Fix implicit constructor call
* Extern C format patch
* Multi-line comments
* Remove direct constructor call
* Conversion
* Need parenthesis
* Return value on resize
* declspec(Thread)
* Ignore warnings for GCC
* Formatting issues
* Move cast to constant
2019-02-12 | JIT: change how we block gc refs from callee saves for inline pinvokes (#22477) | Andy Ayers | 1 | -0/+2
Add a new marker instruction that we emit once we've enabled preemptive GC in the inline pinvoke method prolog. Use that to kill off callee-save registers with GC references, instead of waiting until the call. This closes a window of vulnerability we see in GC stress: if a stress interrupt happens between the point at which we enable preemptive GC and the point at which we make the call, we may report callee saves as GC live when they're actually dead. Closes #19211.
2019-01-09 | Merge pull request #20772 from mikedn/ir-cleanup | Bruce Forstall | 1 | -1/+1
Some IR cleanup
2018-12-10 | Eliminate GenTreeRegVar and GT_REG_VAR and RegVar (#18317) | Julius R Friedman | 1 | -1/+0
Issue #18201 / Hackathon
2018-11-05 | Add support for BSWAP intrinsic (#18398) | Levi Broderick | 1 | -0/+3
With this change, the JIT will recognize a call to BinaryPrimitives.ReverseEndianness and will emit a bswap instruction. This logic is currently only hooked up for x86 and x64; ARM still uses fallback logic. If the JIT can't emit a bswap instruction (for example, trying to emit a 64-bit bswap in a 32-bit process), it will fall back to a software implementation, so the APIs will work across all architectures.
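For reference, a sketch of what a 32-bit software fallback boils down to (plain C++, shifts and masks; the helper name is illustrative). On x86/x64 the JIT instead emits a single bswap instruction for the call.
```
#include <cstdint>

// Reverse the byte order of a 32-bit value using shifts and masks; this is the
// kind of sequence the software fallback computes when bswap cannot be emitted.
uint32_t ReverseEndianness32(uint32_t value)
{
    return ((value & 0x000000FFu) << 24) |
           ((value & 0x0000FF00u) << 8)  |
           ((value & 0x00FF0000u) >> 8)  |
           ((value & 0xFF000000u) >> 24);
}
```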
2018-11-03 | Delete GenTreeLabel | Mike Danes | 1 | -1/+1
This is only used to hold a pointer to a BasicBlock for GenTreeBoundsChk and GenTreeIndexAddr. This doesn't serve any purpose and it does not behave like a real operand (e.g. it's not included in the linear order).
2018-08-28 | Handle multiReg COPY | Carol Eidt | 1 | -3/+1
On x86, `MUL_LONG` wasn't being treated as a multi-reg node, as it should be, so that when it gets spilled or copied the additional register is handled correctly. Also, the ARM and x86 versions of genStoreLongLclVar should be identical and shared (neither version was handling the copy of a `MUL_LONG`). Finally, fix the LSRA dumping of multi-reg nodes. Fixes #19397
2018-06-14 | [Windows|Arm64|Vararg] Add FEATURE_ARG_SPLIT (#18346) | Jarret Shook | 1 | -2/+2
* [ARM64|Windows|Vararg] Add FEATURE_ARG_SPLIT. Enable splitting structs larger than 8 bytes and at most 16 bytes for arm64 varargs between x7 and virtual stack slot 0.
* Force notHfa for vararg methods
* Correctly pass isVararg
* Correct var name
2018-06-04 | Cleanup LOCKADD handling | Mike Danes | 1 | -1/+1
LOCKADD nodes are generated rather early and there's no reason for that:
* The CORINFO_INTRINSIC_InterlockedAdd32/64 intrinsics are not actually used. Even if they were used, we could still import them as XADD nodes and rely on lowering to generate LOCKADD when needed.
* gtExtractSideEffList transforms XADD into LOCKADD, but this can be done in lowering. LOCKADD is an XARCH-specific optimization after all.

Additionally:
* Avoid the need for special handling in LSRA by making GT_LOCKADD a "no value" oper.
* Split LOCKADD codegen from XADD/XCHG codegen; attempting to use the same code for all 3 just makes things more complex.
* The address is always in a register, so there's no real need to create an indir node on the fly; the relevant emitter functions can be called directly.

The last point above is actually a CQ issue: we always generate `add [reg], imm`, and more complex address modes are not used. Unfortunately this problem starts early, when the importer spills the address to a local variable. If that ever gets fixed, we could probably generate a contained LEA in lowering.
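A hedged illustration (plain C++, not JIT IR) of the distinction the cleanup leans on: when the fetched value is unused, a plain `lock add [mem], imm` (LOCKADD) suffices; when the old value is consumed, an exchanging `lock xadd` (XADD) is required.
```
#include <atomic>

void Increment(std::atomic<int>& counter)
{
    counter.fetch_add(1);          // result discarded: can lower to a LOCKADD-style "lock add"
}

int IncrementAndGetOld(std::atomic<int>& counter)
{
    return counter.fetch_add(1);   // old value consumed: needs an XADD-style "lock xadd"
}
```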
2018-05-22 | Remove JIT LEGACY_BACKEND code (#18064) | Bruce Forstall | 1 | -35/+7
All code related to the LEGACY_BACKEND JIT is removed. This includes all code related to x87 floating-point code generation. Almost 50,000 lines of code have been removed.
* Remove legacyjit/legacynonjit directories
* Remove reg pairs
* Remove tiny instruction descriptors
* Remove compCanUseSSE2 (it's always true)
* Remove unused FEATURE_FP_REGALLOC
2018-02-26 | Update the table-driven framework to support x86 imm-intrinsics. | Fei Peng | 1 | -0/+4
And add a new range-check IR for x86 imm-intrinsics.
2018-01-19 | Fix desktop build | Bruce Forstall | 1 | -1/+1
1. Fix `LEGACY_BACKEND`
2. `#if FEATURE_HW_INTRINSICS` => `#ifdef FEATURE_HW_INTRINSICS`

[tfs-changeset: 1686599]
2017-11-17 | Fix GenTree definition for GT_BITCAST | Bruce Forstall | 1 | -1/+1
Nothing should be defined as GenTreeUnOp; use GenTreeOp instead.
2017-10-25 | Enable Crc32, Popcnt, Lzcnt intrinsics | Fei Peng | 1 | -0/+4
2017-10-18 | Ifdef out legacy uses of GT_ASG_op (#14384) | mikedn | 1 | -1/+6
* Ifdef out legacy uses of GT_ASG_op. GT_ASG_op nodes are only generated when the legacy backend is used.
* Address feedback
* Cleanup gtOverflow/gtOverflowEx
2017-10-16 | Merge pull request #14350 from CarolEidt/LsraInfoCleanup | Carol Eidt | 1 | -1/+1
Cleanup of Lowering & LsraInfo
2017-10-11 | Cleanup of Lowering & LsraInfo | Carol Eidt | 1 | -1/+1
These are preparatory changes for eliminating gtLsraInfo.

Register requirements should never be set on contained nodes. This includes setting isDelayFree and restricting to byte registers for x86.
- This results in net positive diffs for the framework (eliminating incorrect setting of hasDelayFreeSrc), though a net regression for the tests on x86 (including many instances of effectively the same code).
- The regressions are largely related to issue #11274.

Improve consistency of IsValue():
- Any node that can be contained should produce a value, and have a type (e.g. GT_FIELD_LIST).
- Some value nodes (GTK_NOVALUE isn't set) are allowed to have TYP_VOID, in which case IsValue() should return false.
- This simplifies IsValue().
- Any node that can be assigned a register should return true for IsValue() (e.g. GT_LOCKADD).
- PUTARG_STK doesn't produce a value; get the type from its operand.
- This requires some fixing up of SIMD12 operands.
- Unused GT_LONG nodes shouldn't define any registers.

Eliminate isNoRegCompare, by setting the type of the JTRUE operand to TYP_VOID.
- Set GTF_SET_FLAGS on the operand to ensure it is not eliminated as dead code.
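A minimal sketch of the simplified IsValue() rule described above, using stand-in types rather than the real JIT definitions: a node produces a value only when its operator kind is not marked GTK_NOVALUE and its type is not TYP_VOID.
```
#include <cstdint>

enum : uint32_t { GTK_NOVALUE = 0x1 /* other kind bits omitted */ };
enum var_types : uint8_t { TYP_VOID, TYP_INT, TYP_REF /* ... */ };

struct GenTreeSketch
{
    uint32_t  kind;   // operator-kind bits for this node's oper
    var_types type;

    bool IsValue() const
    {
        return ((kind & GTK_NOVALUE) == 0) && (type != TYP_VOID);
    }
};
```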
2017-10-07 | Removed unused opers and code | Mike Danes | 1 | -2/+0
* GT_DIV_HI and GT_MOD_HI are not used anywhere
* genCodeForBinary doesn't handle GT_MUL_LONG
* OperIsHigh is not used anywhere
2017-10-05 | JIT: More type equality opts (#14317) | Andy Ayers | 1 | -0/+2
* JIT: Wrap some runtime lookups in new node type

Work based on the plan outlined in #14305. Introduce a new unary node type GT_RUNTIMELOOKUP that wraps existing runtime lookup trees created in `impLookupToTree` and holds onto the handle that inspired the lookup. Note there are other importer paths that create lookups directly that we might also want to wrap, though wrapping is not a requirement for correctness. Keep this node type around through morph, then unwrap and just use the wrapped node after that.

* JIT: More enhancements to type equality testing

The jit is now able to optimize some type equality checks in shared method instances where one or both of the types require runtime lookup, for instance comparing `T` to a value type, or comparing two different value types `V1<T>` and `V2<T>`. Add two new jit interface methods, one for testing for type equality and the other for type casting. These return Must/MustNot/Maybe results depending on whether or not the equality or cast can be resolved at jit time. Implement the equality check. Use this to enhance the type equality opts in the jit for both direct comparison and the checking done by unbox.any.
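A sketch of the three-way answer those new interface methods return, following the Must/MustNot/Maybe wording above; the exact identifiers and the caller shown here are assumptions for illustration, not taken from this log.
```
// Possible shape of the result and how a caller might use it: only fold the
// type-equality test when the runtime gives a definite answer.
enum class TypeCompareState
{
    May     = 0,   // cannot be resolved at jit time
    MustNot = 1,   // definitely not equal / cast definitely fails
    Must    = 2    // definitely equal / cast definitely succeeds
};

bool TryFoldTypeEquality(TypeCompareState state, bool& result)
{
    switch (state)
    {
        case TypeCompareState::Must:    result = true;  return true;
        case TypeCompareState::MustNot: result = false; return true;
        default:                        return false;   // May: keep the runtime check
    }
}
```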
2017-09-26 | [Arm64] Add GT_JCMP node | Steve MacLean | 1 | -0/+1
Create a new node type, GT_JCMP, to represent a fused Relop + JTrue that does not set flags. Add lowering code to create GT_JCMP when Arm64 can use cbz, cbnz, tbz, or tbnz.
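A hedged illustration of source shapes that can lower to the fused forms named above (no flags register involved); the Arm64 mnemonics in the comments are the intent, and the register and label names are illustrative.
```
bool IsZero(long long v)
{
    if (v == 0)             // relop + JTRUE  ->  GT_JCMP  ->  "cbz  x0, <label>"
        return true;
    return false;
}

bool Bit3IsSet(long long v)
{
    if ((v & 0x8) != 0)     // single-bit test + JTRUE  ->  GT_JCMP  ->  "tbnz x0, #3, <label>"
        return true;
    return false;
}
```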
2017-09-26 | Merge pull request #14171 from wateret/ryu-arm-putarg-bitcast | Carol Eidt | 1 | -0/+4
[RyuJIT/armarch] Put arguments with GT_BITCAST
2017-09-26 | [RyuJIT/armarch] Put arguments with GT_BITCAST | Hanjoung Lee | 1 | -0/+4
Put arguments with GT_BITCAST instead of GT_COPY for arm32/arm64. Fixes #14008.
2017-09-20 | mark argplace node as no_lir (#14044) | Sergey Andreenko | 1 | -16/+16
2017-09-15 | [RyuJit] fix the inconsistency between setContained and isContained. (#13991) | Sergey Andreenko | 1 | -1/+1
* Show the problem with contained arg_place. We set contained on PUTARG_REG, but it doesn't pass the IsContained check.
* Fix the problem with gtControlExpr
* Fix the problem with ARGPLACE
* Additional improvement 1: we should never have a contained node that is the last node in the execution order.
* Additional improvement 2 for xarch: it is redundant, we do not need to set it as contained.
* Additional improvement 2 for arm: `GenTree* ctrlExpr = call->gtControlExpr;` was unused.
* Additional improvement 3: unify CheckLir.
2017-09-13 | Add GT_BT | Mike Danes | 1 | -1/+4
2017-08-30 | Revert commit bec6ac10f3968a8f699aad6233657ac59df37a73. | Pat Gavlin | 1 | -0/+2
This restores the `GT_INDEX_ADDR` changes.
2017-08-30 | Revert "Merge pull request #13245 from pgavlin/NoExpandIndex" (#13682) | Jan Kotas | 1 | -2/+0
This reverts commit a7ffdeca6fed927dbd457293d97b07237db95e82, reversing changes made to f5f622db2a00d7687f256c0d1cdda5e6f6da7ad4.
2017-08-21 | Use a smaller expansion of `GT_INDEX` in MinOpts. | Pat Gavlin | 1 | -0/+2
We currently expand `GT_INDEX` nodes during morph into an explicit bounds check followed by a load. For example, this tree:
```
[000059] ------------ /--* LCL_VAR int V09 loc6
[000060] R--XG------- /--* INDEX ref
[000058] ------------ | \--* LCL_VAR ref V00 arg0
[000062] -A-XG------- * ASG ref
[000061] D------N---- \--* LCL_VAR ref V10 loc7
```
is expanded into this tree:
```
[000060] R--XG+------ /--* IND ref
[000491] -----+------ | | /--* CNS_INT long 16 Fseq[#FirstElem]
[000492] -----+------ | \--* ADD byref
[000488] -----+-N---- | | /--* CNS_INT long 3
[000489] -----+------ | | /--* LSH long
[000487] -----+------ | | | \--* CAST long <- int
[000484] i----+------ | | | \--* LCL_VAR int V09 loc6
[000490] -----+------ | \--* ADD byref
[000483] -----+------ | \--* LCL_VAR ref V00 arg0
[000493] ---XG+------ /--* COMMA ref
[000486] ---X-+------ | \--* ARR_BOUNDS_CHECK_Rng void
[000059] -----+------ | +--* LCL_VAR int V09 loc6
[000485] ---X-+------ | \--* ARR_LENGTH int
[000058] -----+------ | \--* LCL_VAR ref V00 arg0
[000062] -A-XG+------ * ASG ref
[000061] D----+-N---- \--* LCL_VAR ref V10 loc7
```
Even in this simple case, where both the array object and the index are lclVars, this represents a rather large increase in the size of the IR. In the worst case, the JIT introduces an additional lclVar for both the array object and the index, adding several additional nodes to the tree. When optimizing, exposing the structure of the array access may be helpful, as it may allow the compiler to better analyze the program. When we are not optimizing, however, the expansion serves little purpose besides constraining the IR shapes that must be handled by the backend. Due to its need for lclVars in the worst case, this expansion may even bloat the size of the generated code, as all lclVar references are generated as loads/stores from/to the stack when we are not optimizing. In the case above, the expanded tree generates the following x64 assembly:
```
IN0018: 000092 mov rdi, gword ptr [V00 rbp-10H]
IN0019: 000096 mov edi, dword ptr [rdi+8]
IN001a: 000099 cmp dword ptr [V09 rbp-48H], edi
IN001b: 00009C jae G_M5106_IG38
IN001c: 0000A2 mov rdi, gword ptr [V00 rbp-10H]
IN001d: 0000A6 mov esi, dword ptr [V09 rbp-48H]
IN001e: 0000A9 movsxd rsi, esi
IN001f: 0000AC mov rdi, gword ptr [rdi+8*rsi+16]
IN0020: 0000B1 mov gword ptr [V10 rbp-50H], rdi
```
Inspired by other recent experiments (e.g. #13188), this change introduces a new node that replaces the above expansion in MinOpts. This node, `GT_INDEX_ADDR`, represents the bounds check and address computation involved in an array access, and returns the address of the element that is to be loaded or stored. Using this node, the example tree given above expands to the following:
```
[000489] a--XG+------ /--* IND ref
[000059] -----+------ | | /--* LCL_VAR int V09 loc6
[000060] R--XG+--R--- | \--* INDEX_ADDR byref
[000058] -----+------ | \--* LCL_VAR ref V00 arg0
[000062] -A-XG+------ * ASG ref
[000061] D----+-N---- \--* LCL_VAR ref V10 loc7
```
This expansion requires only the addition of the `GT_IND` node that represents the memory access itself. This saving in IR size translates to about a 2% decrease in instructions retired during non-optimizing compilation.

Furthermore, this expansion tends to generate smaller code; for example, the tree given above is generated in 29 rather than 35 bytes:
```
IN0018: 000092 mov edi, dword ptr [V09 rbp-48H]
IN0019: 000095 mov rsi, gword ptr [V00 rbp-10H]
IN001a: 000099 cmp rdi, qword ptr [rsi+8]
IN001b: 00009D jae G_M5106_IG38
IN001c: 0000A3 lea rsi, bword ptr [rsi+8*rdi+16]
IN001d: 0000A8 mov rdi, gword ptr [rsi]
IN001e: 0000AB mov gword ptr [V10 rbp-50H], rdi
```
2017-07-12 | Remove unused GT_PHYSREGDST node | Carol Eidt | 1 | -1/+0
2017-07-12 | Implement a new approach for SIMD8/LONG interactions. (#12590) | Pat Gavlin | 1 | -0/+1
SIMD8 values need to be converted to longs under a small number of situations on x64/Windows:
- SIMD8 values are passed and returned as LONGs
- SIMD8 values may be stored to a LONG lclVar

Currently, LSRA performs some gymnastics when building use positions in order to ensure that registers are properly allocated. This change is a stab at a different approach: rather than pushing this work onto the RA, lowering inserts `GT_BITCAST` nodes where necessary to indicate that the source long- or SIMD8-typed value should be reinterpreted as a SIMD8- or long-typed value. The RA performs one specific optimization wherein it retypes stores of `GT_BITCAST` nodes to non-register-candidate local vars with the type of the cast's operand and preferences the cast to its source interval. This approach trades slightly larger IR for some functions that manipulate SIMD8 values for tighter code in buildRefPositions.
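A rough C++ analogue of what a GT_BITCAST between SIMD8 and long expresses: the same 8 bytes reinterpreted, with no lane conversion. The `Simd8` struct here is only a stand-in for the 8-byte payload.
```
#include <cstdint>
#include <cstring>

struct Simd8 { float x, y; };   // stand-in for an 8-byte SIMD8 value

int64_t BitcastSimd8ToLong(Simd8 v)
{
    static_assert(sizeof(Simd8) == sizeof(int64_t), "must be the same size");
    int64_t bits;
    std::memcpy(&bits, &v, sizeof(bits));   // reinterpret the bytes, don't convert the lanes
    return bits;
}
```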
2017-06-29 | [RyuJIT/armel] Support `double` argument passing | Hanjoung Lee | 1 | -1/+5
- Fix putting `double` arguments between the Lowering and Codegen phases
- Rename GenTreeMulLong to GenTreeMultiRegOp; GT_PUTARG_REG can be GenTreeMultiRegOp on RyuJIT/arm

Fixes #12293
2017-06-28 | [RyuJIT/ARM32] Enable passing large split struct argument (#12050) | Hyeongseok Oh | 1 | -0/+3
* [RyuJIT/ARM32] Enable passing large split structs. This enables passing split structs larger than 16 bytes. To support split structs, it defines a new GenTree type, GenTreePutArgSplit. GenTreePutArgSplit is similar to GenTreePutArgStk, but it is used for split structs only and has an additional field to save register information. The GenTreePutArgSplit node is generated in the lower phase.
* Apply reviews: split struct argument passing
  - Fix some comments: genPutArgSplit, GenTreePutArgStk, GenTreePutArgSplit, NewPutArg, ArgComplete
  - Add assertion checks in genPutArgSplit, genCallInstruction
  - Rename variable: baseReg
  - Change flag for GenTreePutArgSplit: _TARGET_ARM && !LEGACY_BACKEND
  - Change type of gtOtherRegs in GenTreePutArgSplit
  - Remove duplicated code: NewPutArg
  - Implement spill & restore flag for GenTreePutArgSplit
* Apply reviews
  - Rebase
  - Update managing spillFlag for split structs
  - Implement spill & restore code generation
  - Fix typos and rename variables
  - Fix bug related to printing gentree for split structs
* Fix bug and comments
  - Fix bug in regset.cpp
  - Add comments in morph.cpp's NYI_ARM
  - Fix comment typos in lsraarmarch.cpp
2017-06-08 | Make JIT dumps more readable | Mike Danes | 1 | -171/+170
Replace all uses of NodeName with OpName so we get proper names instead of symbols (e.g. ADD instead of +). Names stand out better, especially in JIT dumps where we use various symbols to draw tree lines.
2017-05-23 | Introduce GT_CMP and GT_SETCC(condition) | Mike Danes | 1 | -1/+11
2017-05-15 | RyuJIT/ARM32: Enable GT_MUL_LONG codegen. | Mikhail Skvortcov | 1 | -7/+12
2017-02-25 | Refactor isContained(). | Pat Gavlin | 1 | -5/+5
This changes isContained to terminate early if it is not possible for the node in question to ever be contained. There are many nodes that are statically non-containable: any node that is not a value may not be contained, along with a small number of value nodes. To avoid the need for a `switch` to identify the latter, their node kinds have been marked with a new flag, `GTK_NOCONTAIN`. The checks for identifying non-containable nodes are encapsulated within a new function, `canBeContained`.
2017-01-20 | Comments | Mike Danes | 1 | -0/+7
2017-01-17 | Add GT_TEST_EQ and GT_TEST_NE | Mike Danes | 1 | -0/+4
2017-01-10 | Updates based on PR review. | Carol Eidt | 1 | -1/+1
2017-01-10 | Fix handling of PutArgStk | Carol Eidt | 1 | -2/+2
GT_PUTARG_STK doesn't produce a value, so it should have the GTK_NOVALUE flag set. Although the dstCount was being set to zero by the parent call, localDefUse was also being set, causing a register to be allocated. Fixing this produces a number of improvements due to reuse of constant registers that were otherwise unnecessarily "overwritten" by the localDefUse. Also on x86, GT_LONG shouldn't be used to pass a long, since GT_LONG should always be a value-producing node. Instead, use the existing GT_FIELD_LIST approach.
2016-11-18 | Reinstate the struct optimization changes: | Carol Eidt | 1 | -0/+2
Remove many of the restrictions on structs that were added to preserve behavior of the old IR form. Change the init block representation to not require changing the parent when a copy block is changed to an init. In addition, fix the bug that was causing the corefx ValueTuple tests to fail. This was a bug in assertion prop where it was not invalidating the assertions about the parent struct when a field is modified. Add a repro test case for that bug. Incorporate the fix to #7954, which was encountered after the earlier version of these changes. This was a case where we reused a struct temp, and it wound up eventually getting assigned the value it originally contained, so we had V09 = V09. This hit the assert in liveness, which wasn't expecting to find a struct assignment with the same lclVar on the lhs and rhs. This assert should have been eliminated with the IR change to struct assignments, but when this situation arises we may run into issues if we call a helper that doesn't expect the lhs and rhs to be the same. So, change morph to eliminate such assignments.
2016-11-03 | Revert "Enable optimization of structs" | Jan Kotas | 1 | -2/+0
2016-10-20 | Enable optimization of structs | Carol Eidt | 1 | -0/+2
Remove many of the restrictions on structs that were added to preserve behavior of the old IR form. Change the init block representation to not require changing the parent when a copy block is changed to an init.
2016-09-20 | Clean up GenTree node size dumping code. (#7278) | Peter Kukol | 1 | -0/+2
Clean up GenTreeXxxx struct size dump; fix 32-bit build when MEASURE_NODE_SIZE is enabled.
2016-09-20 | Support GT_OBJ for x86 | Carol Eidt | 1 | -1/+2
Add support for GT_OBJ for x86, and allow these nodes to be transformed into a list of fields (in morph) when the struct is promoted. Add a new list type for this (GT_FIELD_LIST) with the type and offset, and use it for multireg arg passing as well, for consistency. Also refactor fgMorphArgs so that there is a positive check for reMorphing, rather than relying on gtCallLateArgs, which can be null if there are no register args. In codegenxarch, modify the struct passing (genPutStructArgStk) to work for both the x64/ux and x86 cases, including the option of pushing fields onto the stack. Eliminate the redundant INS_movs_ptr, and replace it with the pre-existing INS_movsp.
2016-09-16 | Add optimization for shift by CNS_INT | Michelle McDaniel | 1 | -0/+10
This change adds support for shifting by a GT_CNS_INT without going through a helper. If the shiftOp is a GT_CNS_INT we do several transformations based on the shift amount:
- If the shift amount is 0, the shift is a nop, so we just put together the hi and lo ops as a GT_LONG.
- If the shift amount is < 32, we generate a shl/shld pattern, a shr/shrd pattern, or a sar/shrd pattern, depending on the oper. The first operand of the shrd/shld is a GT_LONG, which we crack in codegen, using it essentially as two int operands rather than creating a tri-op GenTree node (essentially so that we can have 3 operands instead of the normal two).
- If the shift amount is 32, it differs between shifting left and shifting right. For GT_LSH, we move the loOp into the hiResult and set the loResult to 0. For GT_RSZ, we move the hiOp into the loResult and set the hiResult to 0. For GT_RSH, we move the hiOp into the loResult and set the hiResult to a 31-bit signed shift of the hiOp, to sign extend.
- If the shift amount is less than 64 but larger than 32: for GT_LSH, the hiResult is a shift of the loOp by shift amount - 32 (the move from lo into hi is the 32-bit shift), and we set the loResult to 0. For GT_RSH and GT_RSZ, the loResult is a right shift (signed for GT_RSH) of the hiOp by shift amount - 32. The hiResult is 0 for GT_RSZ, and a 31-bit signed shift of hiOp1 for GT_RSH.
- If the shift amount is >= 64, we set both hiResult and loResult to 0 for GT_LSH and GT_RSZ, and do a sign-extending shift to set hiResult and loResult to the sign of the original hiOp for GT_RSH.
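As a worked example of the left-shift cases above, here is a plain C++ sketch (not JIT IR; the helper and parameter names are illustrative) that computes the hi/lo halves for a constant shift amount `n`:
```
#include <cstdint>
#include <utility>

// Returns {hiResult, loResult} for a 64-bit left shift of the value whose
// 32-bit halves are hiOp (upper) and loOp (lower), shifted by the constant n.
std::pair<uint32_t, uint32_t> DecomposeLongLsh(uint32_t hiOp, uint32_t loOp, unsigned n)
{
    if (n == 0)                                              // nop: recombine as GT_LONG
        return {hiOp, loOp};
    if (n < 32)                                              // shld/shl pattern
        return {(hiOp << n) | (loOp >> (32 - n)), loOp << n};
    if (n == 32)                                             // lo moves into hi, lo becomes 0
        return {loOp, 0u};
    if (n < 64)                                              // shift lo by (n - 32), lo becomes 0
        return {loOp << (n - 32), 0u};
    return {0u, 0u};                                         // n >= 64: everything shifted out
}
```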
2016-09-15 | Option for reporting GenTree operator bashing stats (#7152) | Peter Kukol | 1 | -162/+158
* Add option (off by default) to report GenTree operator bashing stats.
2016-09-14 | Address PR feedback. | Pat Gavlin | 1 | -0/+3