Age | Commit message (Collapse) | Author | Files | Lines |
|
* Fix contained LEA handling
This adds an LEA case to both `LinearScan::BuildOperandUses` and `CodeGen::genConsumeRegs`.
Fix #25039
|
|
* Cleanup block stores and test for 24846
Fix zero-length assert/bad codegen for initblk.
Remove redundant assertions in codegen and those that don't directly relate to codegen requirements.
Eliminate redundant LEA that was being generated by `genCodeForCpBlk`.
Rename `genCodeFor[Cp|Init]Blk` to `genCodeFor[Cp|Init]BlkHelper` to parallel the other forms.
Fix the test case for #24846.
|
|
Fix #24846
|
|
* Avoid using ins_Load/ins_Store with constant type
* Use ldr to load array lengths/lower bounds
genCodeForArrIndex and genCodeForArrOffset emit a ldrsw on ARM64 but that's not necessary. Array lengths are always positive. Lower bounds are signed but then the subsequent subtract is anyway EA_4BYTE so sign extension isn't actually needed. Just use ldr on both ARM32 and ARM64.
* Use ldr to load array length (part 2)
genCodeForIndexAddr's range check generation is messed up:
- It uses ldrsw to load array length on ARM64. Not needed, the length is positive.
- It uses sxtw to sign extend the array length if the index is native int. But it was already extended by the load.
- It creates IND and LEA nodes out of thin air. Just call the appropriate emitter functions.
- It always generates a TYP_I_IMPL cmp, even when the index is TYP_INT. Well, that's just bogus.
* Stop using the instruction size for immediate scaling on ARM32
The scaling of immediate operands is a property of the instruction and its encoding. It doesn't make sense to throw the instruction size (emitAttr) into the mix, that's a codegen/emitter specific concept. On ARM32 it's also almost meaningless, at least in the case of integer types - all instructions are really EA_4BYTE, even ldrb, ldrh etc.
The ARM64 emitter already extracts the scaling factor from the instruction. It can't use the instruction size as on ARM64 the size is used to select between 32 bit and 64 bit instructions so it's never EA_1BYTE/EA_2BYTE.
* Stop using ldrsw for TYP_INT loads
ARM64's ins_Load returns INS_ldrsw for TYP_INT but there's nothing in the JIT type system that requires sign extending TYP_INT values on load. The fact that an indir node has TYP_INT doesn't even imply that the value to load is signed, it may be unsigned and indir nodes will never have type TYP_UINT nor have the GTF_UNSIGNED flag set.
XARCH uses a mov (rather than movsxd, the equivalent of ldrsw) so it zero extends. There's no reason for ARM64 to behave differently and doing so makes it more difficult to share codegen logic between XARCH and ARM64.
Other ARM64 compilers also use ldr rather than ldrsw.
This requires patching up emitInsAdjustLoadStoreAttr so EA_4BYTE loads don't end up using EA_8BYTE, which ldrsw requires.
* Cleanup genCodeForIndir/genCodeForStoreInd
In particular, cleanup selection of acquire/release instructions. The existing code starts by selecting a "normal" instruction, only to throw it away and redo the type/size logic in the volatile case. It also gets this wrong in the process: it required that "targetType" be TYP_UINT or TYP_LONG to use ldar, but there are no TYP_UINT indirs.
Also rename "targetType" to "type", using "target" is misleading. The real target type would be genActualType(tree->TypeGet()).
* Remove ARM32/64 load/store instruction size inconsistencies
- Require EA_4BYTE for byte/halfword instructions on ARM32.
- Remove emitInsAdjustLoadStoreAttr on ARM64. Getting the correct instruction size simply requires using emitActualTypeSize, that will provide the correct size on both ARM32 and ARM64.
- Replace emitTypeSize with emitActualTypeSize as needed.
* Remove unnecessary insUnscaleImm parameter
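
The argument for dropping ldrsw can be checked with a small host-side sketch. This is not JIT code: `zeroExtend32`, `signExtend32` and `sub32` are hypothetical helpers that model a zero-extending 32-bit load (such as `ldr` into a w register), a sign-extending load (such as `ldrsw`), and an EA_4BYTE subtract.

```cpp
#include <cstdint>

// Models a zero-extending 32-bit load: the upper 32 bits become zero.
inline uint64_t zeroExtend32(uint32_t bits)
{
    return static_cast<uint64_t>(bits);
}

// Models a sign-extending 32-bit load: the upper 32 bits copy bit 31.
inline uint64_t signExtend32(uint32_t bits)
{
    return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(bits)));
}

// Models a 32-bit (EA_4BYTE) subtract: only the low 32 bits of each
// operand take part, so the upper halves never influence the result.
inline uint32_t sub32(uint64_t a, uint64_t b)
{
    return static_cast<uint32_t>(a) - static_cast<uint32_t>(b);
}
```

For a non-negative value such as an array length, both extensions yield the same 64-bit register contents. For a negative lower bound the two differ in the upper half, but `sub32` returns the same result either way, which is why the sign extension buys nothing in these two cases.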
|
|
Moving VariableLiveRanges classes outside Compiler class
|
|
|
|
* Fix x86 stack probing
On x86, structs are passed by value on the stack. We copy structs
to the stack in various ways, but one way is to first subtract the
size of the struct and then use a "rep movsb" instruction. If the
struct we are passing is sufficiently large, this can cause us to
miss the stack guard page.
So, introduce stack probes for these struct copies.
It turns out the stack pointer after prolog probing can be sitting
near the very end of the guard page (one `STACK_ALIGN` slot before
the end, which allows a "call" instruction which pushes its
return address to touch the guard page with the return address push).
We don't want to probe with every argument push, though. So change
the prolog probing to insert an "extra" touch at the final SP location
if the previous touch was "too far" away, leaving at least some
buffer zone for un-probed SP adjustments. I chose this to be the
size of the largest SIMD register, which also can get copied to the
argument stack with a "SUB;MOV" sequence.
Added several test case variations showing different large stack
probe situations.
Fixes #23796
* Increase the argument size probe buffer
* Formatting
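
The "extra touch" rule described above can be modelled roughly as follows. This is a simplified simulation, not the JIT's prolog code: the page size, the buffer size, the function name, and the exact "too far" test (distance from the last touch plus the buffer exceeds one page) are all assumptions made for illustration, as is treating the incoming SP as already touched.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical constants for illustration only.
constexpr std::size_t kPageSize    = 0x1000; // assumed 4 KiB pages
constexpr std::size_t kProbeBuffer = 32;     // assumed size of the largest SIMD register

// Returns the addresses a prolog would touch while moving SP down from
// `initialSp` by `frameSize` bytes: one touch per full page, plus an extra
// touch at the final SP when un-probed adjustments of up to kProbeBuffer
// bytes could end up more than a page below the last touch.
inline std::vector<std::uintptr_t> prologProbeAddresses(std::uintptr_t initialSp, std::size_t frameSize)
{
    std::vector<std::uintptr_t> probes;
    const std::uintptr_t finalSp   = initialSp - frameSize;
    std::uintptr_t       lastTouch = initialSp; // assumed to be touched already

    // Touch one address per page while a full page remains.
    while (lastTouch - finalSp >= kPageSize)
    {
        lastTouch -= kPageSize;
        probes.push_back(lastTouch);
    }

    // Extra touch at the final SP if the previous touch is "too far" away.
    if ((lastTouch - finalSp) + kProbeBuffer > kPageSize)
    {
        probes.push_back(finalSp);
    }
    return probes;
}
```

With these assumed values, a frame of 0x1FF0 bytes gets one page touch and one extra touch at the final SP, while a frame of exactly 0x2000 bytes gets two page touches and no extra one.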
|
|
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
|
|
* Defining VariableLiveRange class
* Adding some typedefs to avoid rewriting
* Defining VariableLiveDescriptor class
* Initializing VariableLiveRange structures before BasicBlock code is being generated
* Getting a siVarLoc for variable homes from a given LclVarDsc and stack level
* Defining VariableLiveKeeper class
* Reporting VariableLiveRanges on changes of variable liveness or variable homes
* Adding USING_VARIABLE_LIVE_RANGE flag to enable/disable VariableLiveRange
* Send VariableLiveRanges to debugger
* Reporting variable homes on prolog
* Wrong argument
* Miss to change variable homes count before sending them to debugger
* Adding dumper of VariableLiveRanges for each block and end of code generation
* Close all open VariableLiveRanges on last BasicBlock
* Changing order of properties initialization on VariableLiveRange constructor
* Type error on assignation
* Rephrasing comments, moving dumps and fixing typos
* Changing const VARSET_TP* for VARSET_VALARG_TP on args
* Variable home was variable location in VariableLiveRange context
* Rephrase and rename of VariableLiveKeeper properties
* Missing some renames
* Adding const where BasicBlock should not be modified
* siBeginBlock and siInit have support for debug code for VariableLiveRange and siScope info
* Adding USING_VARIABLE_LIVE_RANGE flags on methods definition.
* Variable home -> variable location
* Renaming and rephrasing names and uses of VariableLiveRange
* Moving LiveRangeDumper ctor to class declaration
* Removing destructors
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Removing blank spaces and reordering functions inside class definition
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Miss to increment the index after refactoring
* Logic for keeping the last BasicBlock end IL offset is shared between siScope and VariableLiveRange for debug code
* Missing to print on debug the last block VariableLiveRanges
* Avoid updating VariableLiveRange when unspilling and dying at the same assembly instruction
* Rephrasing #ifs and #ifdefs
* Calling VariableLiveKeeper in one line
* Avoid copying siVarLoc on genSetScopeInfo
* Removing unused args from eeSetLVinfo
* Changing VariableLiveKeeper ctor
* Typo
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Updating VariableLiveDescriptor ctor
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Error on first argument
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Changing reference for pointer
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Renaming assembly offset -> native offset
* removing unnecessary comments and asserts
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Update VariableLiveRange dump message
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Moving VariableLiveRanges classes inside VariableLiveKeeper
* Wrong flag name
* Adding documentation about how we track variables for debug info
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Adding opened issues to doc file
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Changing dump title
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Renaming VariableLiveKeeper property
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Update documentation
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Updating comments on flags
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
* Setting Scope Info as default way of tracking variables for debug info
Signed-off-by: Brian Bohe <brianbohe@gmail.com>
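
As a rough picture of what tracking live ranges per variable location involves, here is a minimal sketch. The types below are hypothetical stand-ins and do not mirror the actual `VariableLiveRange`, `VariableLiveDescriptor` or `VariableLiveKeeper` classes; they only illustrate the idea of opening a range when a variable becomes live, closing it when the variable dies, and splitting it when the variable's location changes.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical: one range during which a variable lives in one location.
struct LiveRange
{
    std::string location;    // e.g. a register name or a stack slot
    uint32_t    startOffset; // native offset where the range begins
    uint32_t    endOffset;   // native offset where the range ends
    bool        open;        // true until the range is closed
};

// Hypothetical tracker for a single variable.
class LiveRangeTracker
{
public:
    // The variable becomes live in `location` at `nativeOffset`.
    void startRange(const std::string& location, uint32_t nativeOffset)
    {
        m_ranges.push_back({location, nativeOffset, nativeOffset, true});
    }

    // The variable dies at `nativeOffset`; close the open range, if any.
    void endRange(uint32_t nativeOffset)
    {
        if (!m_ranges.empty() && m_ranges.back().open)
        {
            m_ranges.back().endOffset = nativeOffset;
            m_ranges.back().open      = false;
        }
    }

    // The variable moves (for example, it is spilled): close the current
    // range and open a new one at the same native offset.
    void updateLocation(const std::string& newLocation, uint32_t nativeOffset)
    {
        endRange(nativeOffset);
        startRange(newLocation, nativeOffset);
    }

    const std::vector<LiveRange>& ranges() const { return m_ranges; }

private:
    std::vector<LiveRange> m_ranges;
};
```

A variable that starts in a register at offset 0, is spilled at offset 10 and dies at offset 24 ends up with two closed ranges, which is the kind of list a debugger would need to locate the variable at any native offset.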
|
|
* Extract `impAppendStmt` and `impExtractLastStmt`.
* Delete `BEG_STMTS` fake stmt.
Use new functions to keep the list updated.
* Retype `impTreeList` and `impTreeLast` as statements.
Rename `impTreeList` and `impTreeLast` to show that they are statements.
* Fix fields that have to be stmt.
* Start using GenTreeStmt.
Change `optVNAssertionPropCurStmt` to use GenTreeStmt.
Replace `GenTree* stmt = block->bbTreeList` with `GenTreeStmt* stmt = block->firstStmt()`.
Save results of `FirstNonPhiDef` as `GenTreeStmt`.
* Replace do-while with for loop.
* Change type inside VNAssertionPropVisitorInfo.
* Delete unused args from `optVNConstantPropOnTree`.
* Update fields to be stmt.
Update optVNConstantPropCurStmt to use Stmt.
Change `lvDefStmt` to stmt.
Update LoopCloning structs.
Update `optDebugLogLoopCloning`.
Make `compCurStmt` a statement.
Update declaration name in `BuildNode`.
* Clean simple cpp files.
Clean valuenum.
Clean ssabuilder.
Clean simd.
Clean optcse.
Clean loopcloning.
Clean copyprop.
Clean optimizer part1.
* Start cleaning importer, morph, flowgraph, gentree.
* Continue clean functions.
Clean assertionprop.
Clean morph.
Clean gentree.
Clean flowgraph.
Clean compiler.
Clean rangecheck.
Clean indirectcalltransformer.
Clean others.
* Create some temp stmt.
* Delete unnecessary noway_assert and casts.
* Init `impStmtList` and `impLastStmt` in release.
* Response review 1.
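
One plausible reason to prefer a for loop over a do-while when walking a block's statements is that the for loop also handles an empty block. The sketch below uses hypothetical `Stmt` and `Block` types, not the JIT's `GenTreeStmt` or `BasicBlock`; only the shape of the loop and the typed `firstStmt()` accessor are meant to echo the change.

```cpp
#include <cstddef>

// Hypothetical stand-ins for a statement node and its owning block.
struct Stmt
{
    int   id;
    Stmt* next;
};

struct Block
{
    Stmt* head;

    // Typed accessor, so callers never handle an untyped tree pointer.
    Stmt* firstStmt() const { return head; }
};

// Counts statements with a for loop; unlike a do-while, the loop body is
// never entered when the block has no statements.
inline std::size_t countStmts(const Block& block)
{
    std::size_t count = 0;
    for (Stmt* stmt = block.firstStmt(); stmt != nullptr; stmt = stmt->next)
    {
        ++count;
    }
    return count;
}
```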
|
|
* Handle addressing modes for HW intrinsics
Also, eliminate some places where the code size estimates were over-estimating.
Contribute to #19550
Fix #19521
|
|
* Clean `assertionprop.cpp`.
* Clean `codegencommon.cpp`.
* Clean `codegenlinear.cpp`.
* Clean `compiler.cpp`.
* Clean `earlyprop.cpp`.
* Clean `gschecks.cpp`.
* Clean `lclvars.cpp`.
* Clean `jiteh.cpp`.
* Clean `liveness.cpp`.
* Clean `hashbv.cpp`.
* Clean `gcinfo.cpp`.
* Clean `optimizer.cpp`.
|
|
* Make genStackLevel accessible from CodeGenInterface
* Initializing stackLevel before first BasicBlock's code is generated
* Typo
* Removing extra line on comments
|
|
* Flag USING_SCOPE_INFO definition
* Enclosing siScope's related functions uses with USING_SCOPE_INFO flag definition check
* Encapsulating genSetScopeInfo when using siVarScope
* Moving comment inside flag defined block
* Include siScope/psiScope functions only when flag USING_SCOPE_INFO is defined
* Disable scope info
* Typo
* Adding comment flag name on #endif
* Remove redundant access levels/flags
* Repeating last accessibility level in case flag is disabled
* Setting use of siScope and psiScope as default way of reporting variable homes
|
|
function
|
|
Expand GT_JCC/SETCC condition support
|
|
Update var life for multireg local
|
|
When a multi-reg var is defined by a call, but doesn't currently reside in a register,
we must still update liveness.
Fix #21500
|
|
Issue #18201 / Hackathon
|
|
|
|
|
|
|
|
The actual checking had gotten lost between JIT32 and RyuJIT.
I fixed the "on return from function" case for x86/x64, and
the "around every call site" case for x86.
I removed the arm64 case because it's not easy to store SP to a
stack local or directly compare SP against a stack local without
a temporary. Also, for the fixed outgoing arg space ABIs (all but x86),
these checks don't seem too useful anyway, so I also removed the
arm case.
|
|
Refactor integer cast codegen
|
|
* delete isProfLeaveCB from arm signature
The previous implementation was done many years ago and I do not know why it was done that way.
* extract GetSavedSet
* add isNoGCHelper
* delete isNoGC arg
* move declarations closer to their uses
* delete isGc from genEmitCall
* delete unused method declaration.
* add emitNoGChelper that accepts CORINFO_METHOD_HANDLE
* fix missed switch cases
* add function headers
* Fix feedback
* Fix feedback2
|
|
|
|
|
|
On x86, `MUL_LONG` wasn't considered a multi-reg node. It should be, so that when it gets spilled or copied the additional register is handled correctly.
Also, the ARM and X86 versions of genStoreLongLclVar should be identical and shared (neither version was handling the copy of a `MUL_LONG`).
Finally, fix the LSRA dumping of multi-reg nodes.
Fix #19397
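
To see why such a node needs two registers on a 32-bit target, consider a widening multiply of two 32-bit values. The sketch below is plain C++, not JIT code; `RegPair` and `mulLong` are hypothetical names, and the assumption is that `MUL_LONG` represents a 32x32 to 64-bit multiply whose result is split into a low and a high half.

```cpp
#include <cstdint>

// Hypothetical pair of 32-bit "registers" holding one 64-bit result.
struct RegPair
{
    uint32_t lo; // low 32 bits of the product
    uint32_t hi; // high 32 bits of the product
};

// Widening multiply: the full 64-bit product does not fit in a single
// 32-bit register, so both halves must be tracked together.
inline RegPair mulLong(uint32_t a, uint32_t b)
{
    const uint64_t product = static_cast<uint64_t>(a) * b;
    return {static_cast<uint32_t>(product), static_cast<uint32_t>(product >> 32)};
}
```

If only `lo` were saved when the value is spilled or copied, the `hi` half would be lost, which matches the kind of bug the message describes.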
|
|
* Fix warnings due to "strlen return type is size_t" in src/jit/emitarm.cpp src/jit/unwindarm.cpp
* Use ptrdiff_t disp in emitter::emitOutputInstr in src/jit/emitarm.cpp
* Compiler::gtHashValue should depend on host-bitness in src/jit/gentree.cpp
* Simplify checking using ImmedValNeedsReloc() in src/jit/lowerarmarch.cpp
* Use target_ssize_t immVal in Lowering::IsContainableImmed in src/jit/lowerarmarch.cpp
* Remove int offs and use BYTE* addr and %p specifier in emitter::emitDispInsHelp in IF_T2_J3 case in src/jit/emitarm.cpp
* Cast gtIconVal to target_size_t in CodeGen::genLclHeap in src/jit/codegenarm.cpp
* Use int argSize in CodeGen::genEmitCall in src/jit/codegen.h src/jit/codegenlinear.cpp
* Use ssize_t disp in emitter::emitIns_Call in src/jit/emitarm.cpp src/jit/emitarm.h
* Use int argSize in emitter::emitIns_Call in src/jit/emitarm.cpp src/jit/emitarm.h
* Use target_size_t return type in Compiler::eeGetPageSize Compiler::getVeryLargeFrameSize in src/jit/codegencommon.cpp src/jit/compiler.h
* Cast gtIconVal to unsigned in CodeGen::genCodeForShift CodeGen::genCodeForShiftLong in src/jit/codegenarm.cpp src/jit/codegenarmarch.cpp
* Cast gtIconVal to unsigned in DecomposeLongs::DecomposeRotate in src/jit/decomposelongs.cpp
* Use unsigned size in CodeGen::genConsumePutStructArgStk in src/jit/codegenlinear.cpp
* Use target_ssize_t stmImm in cast in CodeGen::genZeroInitFrame in src/jit/codegencommon.cpp
* Cast to target_ssize_t in Compiler::gtSetEvalOrder in src/jit/gentree.cpp
* Address PR feedback - use dspPtr(addr) in src/jit/emitarm.cpp
|
|
We use the following format when printing the BasicBlock number: bbNum
This define is used with string concatenation to put this in printf format strings
|
|
Temporaries are only used during register allocation and code generation. They waste space (136 bytes) in the compiler object during inlining.
|
|
* Corrected a few typos in the documentation and comments
|
|
* Use correct field offset in genPutArgStkFieldList
Fix #18482
* formatting
* Add the new test to the arm and arm64 test lists
|
|
* [ARM64|Windows|Vararg] Add FEATURE_ARG_SPLIT
Enable splitting >8 byte <= 16 byte structs for arm64 varargs
between x7 and virtual stack slot 0.
* Force notHfa for vararg methods
* Correctly pass isVararg
* Correct var name
|
|
* Unify struct arg handling
Eliminate unnecessary struct copies, especially on Linux, and reduce code duplication.
Across all targets, use GT_FIELD_LIST to pass promoted structs on stack, and avoid
requiring a copy and/or marking `lvDoNotEnregister` for those cases.
Unify the specification of multi-reg args:
- numRegs now indicates the actual number of reg args (not the size in pointer-size units)
- regNums contains all the arg register numbers
|
|
* Replace regTracker.rsTrackRegTrash with regSet.verifyRegUsed
* Replace regTracker.rsTrackRegCopy with regSet.verifyRegUsed
* Replace regTracker.rsTrackRegIntCns with regSet.verifyRegUsed
* Replace regTracker.rsTrashRegSet with regSet.verifyRegistersUsed
* Move verifyReg* method implementation from header file to implementation file
* Replace regTracker.rsTrackRegLclVar with regSet.verifyRegistersUsed
* Eliminate RegTracker
* Fixed the funclet prolog special case
* Experiment - Changing the verification code to actually set the mask
* Adding function headers
* Removed a few trailing spaces
|
|
Remove JIT LEGACY_BACKEND code
All code related to the LEGACY_BACKEND JIT is removed. This includes all code related to x87 floating-point code generation. Almost 50,000 lines of code have been removed.
Remove legacyjit/legacynonjit directories
Remove reg pairs
Remove tiny instruction descriptors
Remove compCanUseSSE2 (it's always true)
Remove unused FEATURE_FP_REGALLOC
|
|
This is a follow-up to #17501 that fixed #17398.
gc pointer reporting in fully-interruptible mode: the latter assumed that
register gc pointer liveness doesn't change across calls while #6103 introduced
codegen where it wasn't true.
This change inserts int3 after non-returning calls at the end of basic blocks
so that gc pointer liveness doesn't change across calls. This is additional
insurance in case any other place in the runtime is dependent on that contract.
|
|
Fixes the runtime issues (#16359,#15389) seen with COMPlus_JitStressRegs=0x2* on ARM64.
|
|
LoadAlignedVector128 as contained.
|
|
* jit sources: Each local pointer variable must be declared on its own line.
Implement https://github.com/dotnet/coreclr/blob/master/Documentation/coding-guidelines/clr-jit-coding-conventions.md#101-pointer-declarations
Each local pointer variable must be declared on its own line.
* add constGenTreePtr
* delete GenTreePtr
* delete constGenTreePtr
* fix arm
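
The motivation for the one-pointer-per-line rule is easy to demonstrate: in C++ the `*` binds to the declarator, not to the type. The function names below are hypothetical and exist only to make the difference checkable.

```cpp
#include <type_traits>

// Discouraged form: the `*` applies to `first` only, so `second` is a
// plain int rather than a pointer.
inline bool secondIsPointerWhenSharingALine()
{
    int *first = nullptr, second = 0;
    (void)first;
    (void)second;
    return std::is_pointer<decltype(second)>::value;
}

// Preferred form: each pointer variable is declared on its own line.
inline bool secondIsPointerOnOwnLine()
{
    int* first  = nullptr;
    int* second = nullptr;
    (void)first;
    (void)second;
    return std::is_pointer<decltype(second)>::value;
}
```

The first function returns false and the second returns true, so the shared-line form silently declares a variable of a different type than a reader might expect.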
|
|
|
|
Move genLongToIntCast call to codegenlinear
|
|
These are preparatory changes for eliminating gtLsraInfo.
Register requirements should never be set on contained nodes. This includes setting isDelayFree and restricting to byte registers for x86.
- This results in net positive diffs for the framework (eliminating incorrect setting of hasDelayFreeSrc), though a net regression for the tests on x86 (including many instances of effectively the same code).
- The regressions are largely related to issue #11274.
Improve consistency of IsValue():
- Any node that can be contained should produce a value, and have a type (e.g. GT_FIELD_LIST).
- Some value nodes (GTK_NOVALUE isn't set) are allowed to have TYP_VOID, in which case IsValue() should return false.
- This simplifies IsValue().
- Any node that can be assigned a register should return true for IsValue() (e.g. GT_LOCKADD).
- PUTARG_STK doesn't produce a value; get type from its operand.
- This requires some fixing up of SIMD12 operands.
- Unused GT_LONG nodes shouldn't define any registers
Eliminate isNoRegCompare, by setting type of JTRUE operand to TYP_VOID
- Set GTF_SET_FLAGS on the operand to ensure it is not eliminated as dead code.
|
|
|
|
|
|
Do TreeNodeInfoInit in buildIntervals
|
|
Repro test. Fix and additional assert.
|
|
Eliminate the second pass in Lowering, and instead call `TreeNodeInfoInit` on each node just prior to `buildRefPositionsForNode`.
|
|
Fix lclVar move node type inserted on lsra resolve phase
|