|
Byref pointers need to point within their "host" object -- thus
the alternate name "interior pointers". If the JIT creates and
reports a pointer as a "byref", but it points outside the host
object, and a GC occurs that moves the host object, the byref
pointer will not be updated. If a subsequent calculation puts
the byref "back" into the host object, it will actually be pointing
to garbage, since the host object has moved.
This occurred on ARM with array index calculations, in particular
because ARM doesn't have a single-instruction "base + scale*index + offset"
addressing mode. Thus, for the `ProcessJagged3DArray()` function in the
jaggedarr_cs_do test case, we were generating:
```
// r0 = array object, r6 = computed index offset. We mark r4 as a byref.
add r4, r0, r6
// r4 - 32 is the address of the array element we care about. Then we load the array element.
// In this case, the loaded element is a gcref, so r4 becomes a gcref.
ldr r4, [r4-32]
```
We get this math because the user code uses `a[i - 10]`, which is
essentially `a + (i - 10) * 4 + 8` for element size 4. This is optimized
to `a + i * 4 - 32`. In the above code, `r6` is `i * 4`. In this case,
after the first instruction, `r4` can point beyond the array.
If a GC happens, `r4` isn't updated, and the second instruction loads garbage.
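To make the folded arithmetic concrete, here is a sketch in C++ (the function and variable names are made up for illustration; the layout constants 8 and 4 come from the text above):
```cpp
#include <cstdint>

// Sketch of the folded address computation for &a[i - 10]:
//   a + 8 + (i - 10) * 4  ==  a + i * 4 - 32
uintptr_t ElementAddress(uintptr_t arrayObj, int i)
{
    // "add r4, r0, r6": this intermediate value can point past the array
    uintptr_t intermediate = arrayObj + (uintptr_t)i * 4;
    // only this final value is a valid interior pointer to the element
    return intermediate - 32;
}
```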
There are several fixes:
1. Change array morphing in `fgMorphArrayIndex()` to rearrange the array index
IR node creation to only create a byref pointer that is precise; don't create
"intermediate" byref pointers that don't represent the actual array element
address being computed. The tree matching code that annotates the generated tree
with field sequences needs to be updated to match the new form.
2. Change `fgMoveOpsLeft()` to prevent the left-weighted reassociation optimization
`[byref]+ (ref, [int]+ (int, int)) => [byref]+ ([byref]+ (ref, int), int)`. This
optimization creates "incorrect" byrefs that don't necessarily point within
the host object.
3. Add an additional condition to the `Fold "((x+icon1)+icon2) to (x+(icon1+icon2))"`
morph optimization to prevent merging of constant TYP_REF nodes, which now were
being recognized due to different tree shapes. This was probably always a problem,
but the particular tree shape wasn't seen before.
These fixes are all-platform. However, to reduce risk at this point, they are
enabled for ARM only, under the `FEATURE_PREVENT_BAD_BYREFS` `#ifdef`.
Fixes #17517.
There are many, many diffs.
For ARM32 ngen-based desktop asm diffs, it is a 0.30% improvement across all
framework assemblies. A lot of the diffs seem to be because we CSE the entire
array address offset expression, not just the index expression.
|
|
The JIT write barrier helpers have a custom calling convention that
avoids killing most registers. The JIT was not taking advantage of
this, and thus was killing unnecessary registers when a write barrier
was necessary. In particular, some integer callee-trash registers
are unaffected by the write barriers, and no floating-point register
is affected.
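A minimal sketch of the resulting kill-set shape (the mask names follow the JIT's `RBM_` convention, but the values here are illustrative, not the real target definitions):
```cpp
#include <cstdint>

typedef std::uint64_t regMaskTP;

// Illustrative values only; the real masks are target-specific.
const regMaskTP RBM_CALLEE_TRASH              = 0x000000FF; // what an ordinary call kills
const regMaskTP RBM_CALLEE_TRASH_WRITEBARRIER = 0x00000007; // subset a write barrier kills

// Kill set for a store that needs a write barrier: a strict subset of the
// ordinary call kill set, with no floating-point registers at all.
regMaskTP WriteBarrierKillSet()
{
    return RBM_CALLEE_TRASH_WRITEBARRIER; // no RBM_FLT_* bits included
}
```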
Also, I got rid of the `FEATURE_WRITE_BARRIER` define, which is always
set. I also put some code under `LEGACY_BACKEND` for easier cleanup
later. I removed some unused defines in target.h for some platforms.
|
|
* [arm32] Fixed RBM_PROFILER_*
* Changed trash registers to RBM_NONE
|
|
* Add FEATURE_CROSSBITNESS in crosscomponents.cmake
* Exclude mscordaccore, mscordbi, and sos from CLR_CROSS_COMPONENTS_LIST when FEATURE_CROSSBITNESS is defined in crosscomponents.cmake
* Introduce target_size_t in src/jit/target.h (see the sketch after this list)
* Use a size_t value in genMov32RelocatableImmediate in src/jit/codegen.h and src/jit/codegencommon.cpp
* Fix definition/declaration inconsistency for emitter::emitIns_R_I in emitarm.cpp
* Zero HiVal when GenTree::SetOper converts GenTreeLngCon->GenTreeIntCon in src/jit/compiler.hpp
* Explicitly specify roundUp(expr, TARGET_POINTER_SIZE)
* Use target_size_t* target in emitOutputDataSec in src/jit/emit.cpp
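A sketch of the idea behind `target_size_t` (the definition here is an assumption modeled on the commit's description, not the verbatim code from src/jit/target.h): an unsigned type whose width follows the target's pointer size rather than the host's, which is exactly what a cross-bitness build needs.
```cpp
// Assumed sketch: size_t is a *host* type, so a 32-bit hosted JIT targeting
// a 64-bit platform (or vice versa) must size target-pointer quantities
// with a type keyed to the target instead.
#if defined(_TARGET_64BIT_)
typedef unsigned long long target_size_t; // 64-bit target pointer size
#else
typedef unsigned int       target_size_t; // 32-bit target pointer size
#endif
```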
|
|
When unrolling a STOREOBJ, we can generate multiple consecutive
byref helper calls. This helper has a unique calling convention
where the dst and src addresses are modified by adding pointer
size to their original value (thus allowing consecutive helper
calls without reloading the dst/src addresses). So, for liveness
purposes, the helper call kills the dst/src values. However, for
GC purposes, it does not, as the registers still contain byref
pointers. We were, in the ARM case, reporting the r0/r1 registers
dead after the first call, so a GC didn't update them, and a
second call updated garbage.
In fixing this, I cleaned up the helper call kill handling a bit.
I also fixed and improved RyuJIT/x86 write barrier kill modeling.
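A C model of that calling convention (a sketch under assumed semantics, not the actual asm helper):
```cpp
#include <cstring>

// Models the byref-assign helper: copy one pointer-sized value, then leave
// dst/src advanced so a consecutive call needs no address reloads. After the
// call the registers' *values* have changed (killed for liveness), but they
// still hold valid byref pointers (not killed for GC).
void AssignByrefModel(char*& dst, const char*& src)
{
    std::memcpy(dst, src, sizeof(void*)); // the real helper does a barriered store
    dst += sizeof(void*);
    src += sizeof(void*);
}
```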
|
|
The GS cookie check was using r2/r3 registers, after they had been
reloaded as outgoing argument registers for the JMP call, thus
trashing them. Change the temp regs used to r12/lr, the only
non-argument, non-callee-saved registers available on arm32.
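A quick mask computation showing why r12/lr are the only candidates (the bit encodings are illustrative, not the JIT's actual register masks):
```cpp
#include <cstdint>

int main()
{
    // One bit per register r0-r15 (illustrative encoding).
    const std::uint32_t allRegs     = 0xFFFF;
    const std::uint32_t argRegs     = 0x000F; // r0-r3: reloaded for the JMP call
    const std::uint32_t calleeSaved = 0x0FF0; // r4-r11: restored before the check
    const std::uint32_t special     = 0xA000; // sp (r13) and pc (r15)

    std::uint32_t avail = allRegs & ~(argRegs | calleeSaved | special);
    // avail == 0x5000: only bit 12 (r12) and bit 14 (lr) remain.
    return (avail == 0x5000) ? 0 : 1;
}
```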
Partially fixes #14862
|
|
On ARM64, IP0 and IP1 are not in the register selection order, though there are some cases where they must be allocated (see #14607), so we may see them as free when looking for a register to spill.
Also, V15 was missing from the selection order (V16 appeared in the order twice).
Fix #14626
|
|
Revert to prior behavior.
|
|
* [Arm64] SIMD simple defines
* Fix #else
|
|
LSRA Arm64 consistent reg sets
|
|
* Codegen for fast tail call
Codegen call and epilog for fast tail call
* Implementation for GT_START_NONGC
This implementation removes two NYI_ARM
Code generation for GT_START_NONGC which is used to prevent GC in fast tail call
* Define fast tail call target register and mask on ARMARCH
Define REG_FASTTAILCALL_TARGET and RBM_FASTTAILCALL_TARGET on ARMARCH
Modify LSRA init and codegen to use these definitions
* Merge genFnEpilog
Merge genFnEpilog for ARM32 and ARM64
* Fix bug in getFirstArgWithStackSlot
Fix bug in getFirstArgWithStackSlot: AMD64 and X86
|
|
tryAllocateFreeReg() uses the RegOrder array to iterate over the available registers. This needs to be consistent with the available registers of the given type. Otherwise, allocateBusyReg() will assert when it finds a free register that should have been allocated in tryAllocateFreeReg().
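A sketch of the invariant being restored (types and names are illustrative, not LSRA's actual code):
```cpp
#include <cassert>
#include <cstdint>

// Every register allocatable for a given type must appear in that type's
// selection order; otherwise allocateBusyReg() can find a "free" register
// that tryAllocateFreeReg() never tried, and assert.
void CheckOrderCoversAvailable(const int* regOrder, int count, std::uint64_t available)
{
    std::uint64_t tried = 0;
    for (int i = 0; i < count; i++)
    {
        tried |= 1ull << regOrder[i]; // registers tryAllocateFreeReg() will consider
    }
    assert((available & ~tried) == 0); // nothing allocatable may be missing
}
```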
Fix #14591
|
|
|
|
|
|
[Arm64] Add emitters for cbz, cbnz, tbz, and tbnz
|
|
|
|
hqueue/arm/ryujit/issue_12614_enable_unrolling_for_cpblk
[RyuJIT/ARM32] enable unrolling for cpblk
|
|
- Define INITBLK_UNROLL_LIMIT and CPBLK_UNROLL_LIMIT for ARM32
- Introduce the same codegen helpers to ARM32 from ARM64,
  i.e. genCodeForLoadOffset() and genCodeForStoreOffset()
- Introduce genCodeForCpBlkUnroll() to ARM32 from ARM64
- Enable unrolling for cpblk (see the sketch below)
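A sketch of what the unrolled form means (the 16-byte size is an assumed example; the real threshold is CPBLK_UNROLL_LIMIT):
```cpp
#include <cstdint>

// Instead of a memcpy helper call, a small cpblk becomes straight-line
// load/store pairs; e.g. a 16-byte copy:
void CpBlkUnroll16(std::uint8_t* dst, const std::uint8_t* src)
{
    *(std::uint32_t*)(dst + 0)  = *(const std::uint32_t*)(src + 0);  // ldr/str pair
    *(std::uint32_t*)(dst + 4)  = *(const std::uint32_t*)(src + 4);  // ldr/str pair
    *(std::uint32_t*)(dst + 8)  = *(const std::uint32_t*)(src + 8);  // ldr/str pair
    *(std::uint32_t*)(dst + 12) = *(const std::uint32_t*)(src + 12); // ldr/str pair
}
```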
|
|
* Implement profiler ELT callbacks for AMD64 Linux
* Some formatting fixes
* Fixed profiler
* Added a frame-aligning option
* Added stack alignment for quad-value stores
|
|
* Make REG_VIRTUAL_STUB_PARAM depend on the ABI.
Add VirtualStubParamInfo, which is part of the compiler and must be initialized in
compCompile.
|
|
* Delete DECLARE_TYPED_ENUM
Delete the workaround for g++ C++11, which was fixed in gcc 4.4.1 many
years ago.
The workaround makes the code dirty, and sometimes we get typos like:
};
END_DECLARE_TYPED_ENUM(insFlags,unsigned)
or
END_DECLARE_TYPED_ENUM(ChunkExtraAttribs, BYTE);
with a doubled `;;`
* jit-format
|
|
Eliminate the use of GTF_REG_VAL in RyuJIT and reuse the
bit to mark nodes as contained.
Abstract GTF_REG_VAL for legacy backend.
|
|
Make RBM_CALLEE_TRASH consistent with the CORINFO_HELP_STOP_FOR_GC helper,
which is JIT_RareDisableHelper, defined in vm/arm/asmhelpers.S
Signed-off-by: Hyung-Kyu Choi <hk0110.choi@samsung.com>
|
|
The CORINFO_HELP_STOP_FOR_GC helper preserves the integer and double return registers.
Signed-off-by: Hyung-Kyu Choi <hk0110.choi@samsung.com>
|
|
[Ryujit/ARM32] LSRA support for arm32 floating-point register allocation
|
|
* Implement lowering for GT_STORE_OBJ
In #10657, I commented that NYI messages about GT_STORE_OBJ were printed when running the CodeGenBringUpTests.
The `Lowering::LowerBlockStore(GenTreeBlk* blkNode)` implementation is simply copied.
After the lowering phase, code generation in codegenarm.cpp uses the code below; lowering picks one of:
```cpp
blkNode->gtBlkOpKind = GenTreeBlk::BlkOpKindUnroll;
// or
blkNode->gtBlkOpKind = GenTreeBlk::BlkOpKindHelper;
```
```cpp
void CodeGen::genCodeForStoreBlk(GenTreeBlk* blkOp)
{
    if (blkOp->gtBlkOpGcUnsafe)
    {
        getEmitter()->emitDisableGC();
    }

    bool isCopyBlk = blkOp->OperIsCopyBlkOp();

    switch (blkOp->gtBlkOpKind)
    {
        case GenTreeBlk::BlkOpKindHelper:
            if (isCopyBlk)
            {
                genCodeForCpBlk(blkOp);
            }
            else
            {
                genCodeForInitBlk(blkOp);
            }
            break;

        case GenTreeBlk::BlkOpKindUnroll:
            if (isCopyBlk)
            {
                genCodeForCpBlkUnroll(blkOp);
            }
            else
            {
                genCodeForInitBlkUnroll(blkOp);
            }
            break;

        default:
            unreached();
    }

    if (blkOp->gtBlkOpGcUnsafe)
    {
        getEmitter()->emitEnableGC();
    }
}
```
`genCodeForCpBlk` and `genCodeForInitBlk` are implemented on ARM/ARM64 via the MEMCPY/MEMSET helpers,
but `genCodeForCpBlkUnroll` and `genCodeForInitBlkUnroll` are implemented on neither ARM nor ARM64.
Therefore those need to be implemented.
* Implement NYI: GT_STORE_OBJ needs a write barrier implementation
The code was copied from ARM64, with the GC write barrier parts removed, since they are not supported on ARM.
* Implement CodeGen::genCodeForCpObj
* Refactor some codes
* Use the INS_OPTS_LDST_POST_INC option for post-indexing
When a structure was copied, the generated asm was wrong: the offset never advanced.
```
IN0013: 000048 ldr r3, [r1+4]
IN0014: 00004A str r3, [r0+4]
IN0015: 00004C ldr r3, [r1+4]
IN0016: 00004E str r3, [r0+4]
```
Either the index must increment, or post-indexed addressing must be used, so I emit
the instruction with the INS_OPTS_LDST_POST_INC option (see the sketch after this list).
* Fix conflicts
* Fix conflicts and apply #11219
I wanted to merge genCodeForCpObj into codegenarmarch.cpp, but the function was modified in #11219,
so I decided to keep the code separate for now.
If the ARM version also needs modification in the future, it can be merged then.
* Fix conflicts
* Remove NYI
* Fix genCountBits assertion
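As referenced in the post-indexing item above, here is a C model of what INS_OPTS_LDST_POST_INC selects (a sketch; the register comments map to the asm listing):
```cpp
#include <cstdint>

// "ldr r3, [r1], #4" loads through r1 and then advances r1 by 4, so the
// copy walks forward instead of re-copying the same word.
void CopyWordsPostInc(std::uint32_t*& dst, const std::uint32_t*& src, int words)
{
    for (int i = 0; i < words; i++)
    {
        std::uint32_t tmp = *src++; // ldr r3, [r1], #4
        *dst++ = tmp;               // str r3, [r0], #4
    }
}
```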
|
|
[Arm64] Enable FEATURE_TAILCALL_OPT
|
|
* [Arm64/Unix] Enable FEATURE_USE_SOFTWARE_WRITE_WATCH_FOR_GC_HEAP
* [Arm64/Unix] Enable FEATURE_MANUALLY_MANAGED_CARD_BUNDLES
|
|
|
|
* When computing RegRecords at BasicBlock entry, account for
overlapping float and double registers.
Signed-off-by: Hyung-Kyu Choi <hk0110.choi@samsung.com>
|
|
[RyuJIT/ARM32] [ReadyToRun] Fix target register for invocation to Thunk
|
|
Change _TARGET_ARM_ and _TARGET_ARM64_ to _TARGET_ARMARCH_
|
|
- Use register r4 to pass the indirection cell from code generated by R2R
- Define REG_R2R_INDIRECT_PARAM on ARM32 to merge with the ARM64 routine
|
|
|
|
Enable Windows hosted, Linux target amd64 altjit
With this change, we build a JIT that runs on Windows amd64
and targets Linux amd64, as an altjit named linuxnonjit.dll.
This is useful for debugging, or generating asm code or diffs.
You can even easily create Windows/non-Windows asm diffs
(either to compare the asm, or compare the generated code size).
For this to work, the JIT-EE interface method
getSystemVAmd64PassStructInRegisterDescriptor() was changed
to always be built in, by defining `FEATURE_UNIX_AMD64_STRUCT_PASSING_ITF`
in all AMD64 builds. The `_ITF` suffix indicates that this is
functionality specific to implementing the JIT-EE interface
contract. There were many places in the VM that used this
interchangeably with `FEATURE_UNIX_AMD64_STRUCT_PASSING`. Now,
`FEATURE_UNIX_AMD64_STRUCT_PASSING` means code in the VM needed
to implement this feature, but not required to implement the
JIT-EE interface contract. In particular, MethodTables compute
and cache the "eightbyte" info of structs when loading a type.
This is not done when only `FEATURE_UNIX_AMD64_STRUCT_PASSING_ITF`
is set, to avoid altering MethodTable behavior on non-Unix
AMD64 builds. Instead, if `getSystemVAmd64PassStructInRegisterDescriptor()`
is called on a non-Unix build (by the altjit), the `ClassifyEightBytes()`
function is called, and nothing is cached. Hopefully (though it was
hard for me to guarantee by observation), calling `ClassifyEightBytes()`
does not have any side effects on MethodTables. It doesn't matter much
in any case: when this path is hit for the altjit, we don't care much about runtime behavior.
The previously used `PLATFORM_UNIX` define is now insufficient.
I introduced the `#define` macros `_HOST_UNIX_` to indicate the
JIT being built will run on Unix, and `_TARGET_UNIX_` to indicate
the JIT is generating code targeting Unix. Some things were
converted to use the `UNIX_AMD64_ABI` define, which makes more
sense.
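A sketch of how the split is meant to be used (the macro names are from the commit; the fragments themselves are illustrative, not code from the source):
```cpp
// The two questions are now independent: a Windows-hosted altjit targeting
// Linux defines _TARGET_UNIX_ but not _HOST_UNIX_.
#if defined(_TARGET_UNIX_)
// The JIT is *generating code* for a Unix target
// (e.g. System V AMD64 struct passing, UNIX_AMD64_ABI).
#endif

#if defined(_HOST_UNIX_)
// The JIT binary itself *runs* on a Unix host (PAL, host headers, etc.).
#endif
```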
|
|
|
|
[RyuJIT/ARM32] Fix helper kill mask and call target consuming
|
|
|
|
The arrays were being initialized with register numbers, not
register masks. The arrays are only used on non-Windows amd64
for homing circular incoming argument conflicts. Perhaps we
never saw this happen?
|
|
|
|
Change the JIT code to align the stack to 16 bytes, as modern compilers do
|
|
* [x86/Linux] (Partially) Enable FEATURE_EH_FUNCLETS
* Update CLR ABI Document
* Add TODO (for Funclet Prolog/Epilog Gen)
|
|
Add the UNIX_X86_ABI definition for Unix/Linux-specific ABI parts
The first use will be the 16-byte stack alignment code
|
|
|
|
There is no native load/store instruction for Vector3/TYP_SIMD12,
so we need to break this type down into two loads or two stores,
with an additional instruction to put the values together in the
xmm target register. AMD64 SIMD support already implements most of
this. For RyuJIT/x86, we need to implement stack argument support
(both incoming and outgoing), which is different from the AMD64 ABI.
In addition, this change implements accurate alignment-sensitive
codegen for all SIMD types. For RyuJIT/x86, the stack is only 4
byte aligned (unless we have double alignment), so SIMD locals are
not known to be aligned (TYP_SIMD8 could be with double alignment).
For AMD64, we were unnecessarily pessimizing alignment information,
and were always generating unaligned moves when on AVX2 hardware.
Now, all SIMD types are given their preferred alignment in
getSIMDTypeAlignment() and alignment determination in
isSIMDTypeLocalAligned() takes stack alignment into account (it still
needs support for x86 dynamic stack alignment of SIMD locals).
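For example, a 12-byte Vector3 load can be assembled from an 8-byte and a 4-byte load (a sketch using SSE2 intrinsics; the JIT's actual instruction selection may differ):
```cpp
#include <emmintrin.h>

// Load 12 bytes without reading past the end of the value: two loads, plus
// one instruction to combine them in the xmm target register.
__m128 LoadVector3(const float* p)
{
    __m128 xy = _mm_castpd_ps(_mm_load_sd((const double*)p)); // x, y, 0, 0
    __m128 z  = _mm_load_ss(p + 2);                           // z, 0, 0, 0
    return _mm_movelh_ps(xy, z);                              // x, y, z, 0
}
```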
Fixes #7863
|
|
On x86, the P/Invoke cookie (when required) is passed on the stack
after all other stack arguments (if any).
|
|
The "JIT flags" currently passed between the EE and the JIT have traditionally
been bit flags in a 32-bit word. Recently, a second 32-bit word was added to
accommodate additional flags, but that set of flags is definitely second-class:
it is not universally passed, and requires using a separate set of bit
definitions and comparing those bits against the proper, second word.
This change replaces all uses of bare DWORD or 'unsigned int' types
representing flags with CORJIT_FLAGS, which is now an opaque type. All
flag names were renamed from CORJIT_FLG_* to CORJIT_FLAG_* to ensure all
cases were changed to use the new names, which are also scoped within the
CORJIT_FLAGS type itself.
Another motivation to do this, besides cleaner code, is to allow enabling the
SSE/AVX flags for x86. For x86, we had fewer bits available in the "first
word", so would have to either put them in the "second word", which, as
stated, was very much 2nd class and not plumbed through many usages, or
we could move other bits to the "second word", with the same issues. Neither
was a good option.
RyuJIT compiles with both COR_JIT_EE_VERSION > 460 and <= 460. I introduced
a JitFlags adapter class in jitee.h to handle both JIT flag types. All JIT
code uses this JitFlags type, which operates identically to the new
CORJIT_FLAGS type.
In addition to introducing the new CORJIT_FLAGS type, the SSE/AVX flags are
enabled for x86.
The JIT-EE interface GUID is changed, as this is a breaking change.
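The shape of such an opaque flags type, as a minimal sketch (the real CORJIT_FLAGS API differs in details; this just shows scoped flag names over a single wide word):
```cpp
class JitFlagsSketch
{
public:
    enum Flag
    {
        FLAG_SPEED_OPT = 0,
        FLAG_SIZE_OPT  = 1,
        // ... one scoped name per flag; no bare bit constants to misuse
    };

    void Set(Flag f)         { bits |= 1ULL << f; }
    void Clear(Flag f)       { bits &= ~(1ULL << f); }
    bool IsSet(Flag f) const { return (bits & (1ULL << f)) != 0; }

private:
    unsigned long long bits = 0; // one word wide enough for all flags
};
```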
|
|
|
|
|
|
Direct call instruction: (bl ±imm24)
Indirect call instruction: (movw, movt, blx reg)
It is pretty hard to choose between direct and indirect call instructions in the general case.
However, for recursive calls we can safely estimate the jump distance,
since we know the relative distance between the jump source and destination.
So this change generates direct call instructions for recursive calls
when the jump distance is close enough.
Plus, the call jumps directly to the function, bypassing the prestub.
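A sketch of the reachability check (the reach follows from the imm24 encoding named above, a signed 24-bit word offset; the helper name is made up and the PC-relative bias is ignored):
```cpp
#include <cstdint>

// A signed 24-bit word offset gives roughly +/-2^25 bytes (+/-32MB) of
// reach from the call site; only then can the recursive call be direct.
bool FitsInBlImm24(std::intptr_t callSite, std::intptr_t target)
{
    std::ptrdiff_t dist = target - callSite;
    return (dist >= -(1 << 25)) && (dist < (1 << 25));
}
```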
For #7002
|