path: root/src/jit/emitxarch.h
Age | Commit message | Author | Files | Lines
2018-09-06 | Clean CodeGen::genEmitCall (#19804) | Sergey Andreenko | 1 | -18/+1
* delete isProfLeaveCB from arm signature. The previous implementation was done many years ago and I do not know why it was done that way.
* extract GetSavedSet
* add isNoGCHelper
* delete isNoGC arg
* move declarations closer to their uses
* delete isGc from genEmitCall
* delete unused method declaration
* add emitNoGChelper that accepts CORINFO_METHOD_HANDLE
* fix missed switch cases
* add function headers
* Fix feedback
* Fix feedback2
2018-09-05 | Implement AVX2 Gather intrinsic in JIT | Fei Peng | 1 | -0/+9
2018-08-09 | Implementing the Avx.MaskStore intrinsics | Tanner Gooding | 1 | -0/+2
2018-06-06 | Adding containment support for more x86 hardware intrinsics (#18297) | Tanner Gooding | 1 | -1/+33
* Adding containment support to one-operand scalar HWIntrinsics (x86)
* Adding containment support to two-operand imm HWIntrinsics (x86)
* Adding containment support to three-operand imm HWIntrinsics (x86)
* Updating hwintrinsiccodegenxarch to properly mask Sse41.Insert for TYP_FLOAT
* Updating the Sse41.Insert tests for TYP_FLOAT
* Adding containment support for Sse2.CompareLessThan and BlendVariable (Sse41/Avx/Avx2)
* Fixing `genHWIntrinsic_R_RM_I` to call `emitIns_SIMD_R_R_I`, rather than `emitIns_R_R_I`
* Updating emitOutputSV to not modify the code for IF_RWR_RRD_SRD_CNS
* Cleaning up some of the emitxarch code
* Moving roundps and roundpd into the IsDstSrcImm check
2018-06-04 | Adding function headers to the 'emitIns_SIMD_*' methods and clarifying comments on related code. | Tanner Gooding | 1 | -6/+15
2018-06-04 | Updating the x86 HWIntrinsics to support containment for most one-operand intrinsics. | Tanner Gooding | 1 | -2/+2
2018-06-04 | Fixing the name of the parameters in the emitIns_SIMD_* methods. | Tanner Gooding | 1 | -19/+22
2018-05-25 | Updating the JIT to handle the FMA hardware intrinsics. | Tanner Gooding | 1 | -0/+13
2018-05-22 | Remove JIT LEGACY_BACKEND code (#18064) | Bruce Forstall | 1 | -83/+1
All code related to the LEGACY_BACKEND JIT is removed. This includes all code related to x87 floating-point code generation. Almost 50,000 lines of code have been removed.
* Remove legacyjit/legacynonjit directories
* Remove reg pairs
* Remove tiny instruction descriptors
* Remove compCanUseSSE2 (it's always true)
* Remove unused FEATURE_FP_REGALLOC
2018-03-22 | Add emitIns_AR_R_I for vextracti/f128 | Fei Peng | 1 | -0/+1
2018-02-26 | Update the table-driven framework to support x86 imm-intrinsics. | Fei Peng | 1 | -0/+1
And add a new range-check IR for x86 imm-intrinsics.
2018-02-09 | Updating the emitter to more generally handle 4-Byte SSE4 instructions. | Tanner Gooding | 1 | -5/+1
2018-02-06 | Disable prefetch instructions for LEGACY_BACKEND | Carol Eidt | 1 | -0/+9
2018-02-05 | Adding support for the StoreFence/Prefetch* APIs and the new Sse scalar overloads. | Tanner Gooding | 1 | -0/+2
2018-02-03 | Updating the HWIntrinsic codegen to support marking LoadVector128 and LoadAlignedVector128 as contained. | Tanner Gooding | 1 | -0/+5
2018-01-31 | Delete GenTreePtr. (#16027) | Sergey Andreenko | 1 | -1/+1
* jit sources: each local pointer variable must be declared on its own line, implementing https://github.com/dotnet/coreclr/blob/master/Documentation/coding-guidelines/clr-jit-coding-conventions.md#101-pointer-declarations (a minimal sketch follows this entry)
* add constGenTreePtr
* delete GenTreePtr
* delete constGenTreePtr
* fix arm
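A minimal sketch of what that pointer-declaration convention looks like in practice; the local names and the gtGetOp1/gtGetOp2 accessors are used purely for illustration:

    // Each local pointer variable gets its own line, spelled GenTree* rather than GenTreePtr.
    GenTree* op1 = tree->gtGetOp1();
    GenTree* op2 = tree->gtGetOp2();

    // Rather than combining several pointer declarations on one line:
    // GenTree *op1 = tree->gtGetOp1(), *op2 = tree->gtGetOp2();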
2018-01-19 | Fix desktop build | Bruce Forstall | 1 | -2/+20
1. Fix `LEGACY_BACKEND`
2. `#if FEATURE_HW_INTRINSICS` => `#ifdef FEATURE_HW_INTRINSICS`
[tfs-changeset: 1686599]
2018-01-19 | Merge SSE intrinsics into the table-driven framework | Fei Peng | 1 | -13/+6
2018-01-18 | table-driven Intel hardware intrinsics | Fei Peng | 1 | -6/+9
2018-01-17 | Updating emitIns_R_R_A_I to not be defined for the legacy backend. | Tanner Gooding | 1 | -0/+2
2018-01-16 | Adding support for the SSE Load, LoadAligned, LoadHigh, LoadLow, and LoadScalar intrinsics | Tanner Gooding | 1 | -0/+3
2018-01-16 | Updating most of the SSE Compare intrinsics to support containment | Tanner Gooding | 1 | -0/+19
2018-01-16 | Adding support for the SSE Reciprocal, ReciprocalSqrt, and Sqrt intrinsics | Tanner Gooding | 1 | -0/+1
2018-01-16 | Adding support for the SSE compare eq, gt, ge, lt, le, ne, ord, and unord intrinsics | Tanner Gooding | 1 | -0/+1
2018-01-16 | Merge pull request #14736 from tannergooding/roundsx | Tanner Gooding | 1 | -0/+6
Enable CORINFO_INTRINSIC Round, Ceiling, and Floor to generate ROUNDSS and ROUNDSD
2018-01-16 | Mark emitIns_R_A and emitIns_R_R_A to be not defined for legacy backend | Tanner Gooding | 1 | -0/+2
2018-01-14 | Adding SSE4.1 intrinsic support for Round, Ceiling, and Floor. | Tanner Gooding | 1 | -0/+6
2018-01-12 | Fixing the HWIntrinsic codegen containment checks | Tanner Gooding | 1 | -19/+4
2018-01-12 | Adding basic containment support to the x86 HWIntrinsics | Tanner Gooding | 1 | -0/+28
2017-12-12 | Enable Vector128/256<T> and Add intrinsics | Fei Peng | 1 | -0/+4
2017-11-14 | Change VEX-encoding selection to avoid AVX-SSE transition penalties | Fei Peng | 1 | -7/+7
2017-10-30 | Rename and simplify SSE3_4 to SSE4 | Fei Peng | 1 | -5/+5
2017-10-24 | Cleanup unused emitter arguments | Brian Sullivan | 1 | -16/+3
* Removed unused idClsCookie from struct instrDescDebugInfo
* Cleaned up several ifdefs
2017-10-03 | remove FEATURE_AVX_SUPPORT flag | Fei Peng | 1 | -7/+12
2017-09-27 | fix bad VEX.vvvv to avoid false register dependency | Fei Peng | 1 | -5/+5
2017-05-10 | add jit intrinsic support for vector conversion/narrow/widen on AMD64 and x86, except double->long/ulong conversion on x86 | helloguo | 1 | -0/+2
2017-04-07 | Remove RELOC_SUPPORT define | Bruce Forstall | 1 | -3/+1
It's always defined, is always expected to be defined, and the build doesn't work without it. Also remove unused `SECURITY_CHECK` and `VERIFY_IMPORTER` defines.
2017-03-06 | Un-clang-format-horrible-ify emitIns_Call() and genEmitCall() | Bruce Forstall | 1 | -21/+27
2017-02-23 | Rewrite Is4ByteAVXInstruction() and Is4ByteSSE4Instruction() | Fei Peng | 1 | -0/+1
2017-02-05 | Enable SIMD for RyuJIT/x86 | Bruce Forstall | 1 | -1/+11
This change implements support for Vector<long>, handling SIMDIntrinsicInit, which takes a LONG, and decomposition of SIMDIntrinsicGetItem, which produces a LONG. It also enables SIMD, including AVX, by default for RyuJIT/x86.
2017-01-10 | fix comments, assertion failure in crossgen mscorlib | Li Tian | 1 | -2/+2
2017-01-08 | Remove AVX/SSE transition penalties | Li Tian | 1 | -0/+28
There are two kinds of transition penalties:
1. Transition from 256-bit AVX code to 128-bit legacy SSE code.
2. Transition from 128-bit legacy SSE code to either 128-bit or 256-bit AVX code. This only happens if there was a preceding AVX256-to-legacy-SSE transition penalty.
The primary goal is to remove the #1 AVX-to-SSE transition penalty.
Added two emitter flags: contains256bitAVXInstruction indicates that the JIT method contains 256-bit AVX code; containsAVXInstruction indicates that the method contains 128-bit or 256-bit AVX code.
Issue VZEROUPPER in the prolog if the method contains 128-bit or 256-bit AVX code, to avoid the legacy-SSE-to-AVX transition penalty; this can happen in the reverse PInvoke situation. Issue VZEROUPPER in the epilog if the method contains 256-bit AVX code, to avoid the AVX-to-legacy-SSE transition penalty.
To limit the code-size increase, we only issue VZEROUPPER before a PInvoke call to a user-defined function if the JIT method contains 256-bit AVX code, assuming the user-defined function contains legacy SSE code. There is no need to issue VZEROUPPER after the PInvoke call, because the #2 SSE-to-AVX transition penalty cannot happen once the #1 AVX-to-SSE transition has been taken care of before the PInvoke call.
We measured a roughly 1% to 3% performance gain on TechEmpower plaintext and verified that the VTune AVX/SSE events OTHER_ASSISTS.AVX_TO_SSE and OTHER_ASSISTS.SSE_TO_AVX have been reduced to 0.
Fixes #7240.
Follow-up changes: move the setContainsAVX flags to lower; refactor into a smaller method; fix a typo in comments; fix a format error.
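A rough sketch of the VZEROUPPER placement policy described above, using the two flags this entry names; the function shape, parameters, and the emitVzeroupper() stand-in are illustrative assumptions, not the actual emitter code:

    #include <cstdio>

    // Hypothetical stand-in for emitting the vzeroupper instruction.
    static void emitVzeroupper() { std::puts("vzeroupper"); }

    // Sketch only: where VZEROUPPER is issued under the policy described above.
    void placeVzeroupper(bool containsAVXInstruction,       // method contains any 128/256-bit (VEX) AVX code
                         bool contains256bitAVXInstruction, // method contains 256-bit AVX code
                         bool inProlog, bool inEpilog, bool beforePInvokeCall)
    {
        if (inProlog && containsAVXInstruction)
        {
            emitVzeroupper(); // avoid the legacy-SSE-to-AVX penalty, e.g. for reverse PInvoke
        }
        if (inEpilog && contains256bitAVXInstruction)
        {
            emitVzeroupper(); // avoid the AVX-to-legacy-SSE penalty back in the caller
        }
        if (beforePInvokeCall && contains256bitAVXInstruction)
        {
            emitVzeroupper(); // the user-defined callee is assumed to contain legacy SSE code
        }
        // Nothing is issued after the PInvoke call: the #2 SSE-to-AVX penalty only follows an
        // earlier AVX-to-SSE penalty, which the VZEROUPPER before the call already prevents.
    }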
2016-11-30 | Fix x86 encoder to use 64-bit type to accumulate opcode/prefix bits | Bruce Forstall | 1 | -36/+52
The encoder was using size_t, a 32-bit type on x86, to accumulate the opcode and prefix bits to emit. AVX support uses 3 bytes for prefixes, which is more than the 32-bit type can handle. So, change all code-byte-related types from size_t to a new code_t, defined as "unsigned __int64" on RyuJIT x86 (there is precedent for this type on the ARM architectures).
Fixes #8331
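The fix boils down to the type definition the message describes; a sketch, where the exact preprocessor condition is an assumption:

    // On RyuJIT x86, accumulate opcode/prefix bytes in a 64-bit type: the 3-byte VEX
    // prefix plus opcode bytes no longer fit in size_t, which is 32 bits on x86.
    #if defined(_TARGET_X86_)
    typedef unsigned __int64 code_t;
    #else
    typedef size_t code_t; // already 64 bits on 64-bit targets
    #endif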
2016-11-29 | Merge pull request #8291 from sivarv/sse34 | Sivarv | 1 | -0/+18
Enable use of SSE3_4 instruction set for SIMD codegen.
2016-11-28 | Enable using SSE3_4 instruction set for SIMD codegen. | sivarv | 1 | -0/+18
2016-11-28 | Factor out common stack adjustment code | Bruce Forstall | 1 | -0/+12
2016-08-11 | Reformat jit sources with clang-tidy and format | Michelle McDaniel | 1 | -449/+349
This change is the result of running clang-tidy and clang-format on jit sources.
2016-07-29 | Massage code for clang-format | Michelle McDaniel | 1 | -1/+1
This change starts the process of updating the jit code to make it ready for being formatted by clang-format. Changes mostly include reflowing comments that go past our column limit and moving comments around ifdefs so clang-format does not modify the indentation. Additionally, some header files are manually reformatted for pointer alignment and marked as clang-format off so that we do not lose the current formatting.
2016-07-26 | Enable multireg returns on Arm64 | Brian Sullivan | 1 | -4/+4
* Added method IsMultiRegPassedType and updated IsMultiRegReturnType
* Switched these methods to using getArgTypeForStruct and getReturnTypeForStruct
* Removed IsRegisterPassable and used IsMultiRegReturned instead
* Converted lvIsMultiregStruct to use getArgTypeForStruct
* Renamed varDsc->lvIsMultiregStruct() to compiler->lvaIsMultiregStruct(varDsc)
* Skip calling getPrimitiveTypeForStruct when we have a struct larger than 8 bytes
* Refactored ReturnTypeDesc::InitializeReturnType
* Fixed missing SPK_ByReference case in InitializeReturnType
* Fixes for RyuJIT x86 TYP_LONG return types and additional ARM64 work for full multireg support
* Guarded the ARM64 uses of MAX_RET_MULTIREG_BYTES with FEATURE_MULTIREG_RET
* Fixes for multireg returns in Arm64 codegen
* Added dumping of lvIsMultiRegArg and lvIsMultiRegRet in the assembly output
* Added check and set of compFloatingPointUsed to InitializeStructReturnType
* Fixes to handle JIT helper calls that say they return a TYP_STRUCT with no class handle available
* Placed all of the second GC return reg handling under MULTIREG_HAS_SECOND_GC_RET ifdefs
* Added the Arm64 VM changes from Rahul's PR 5175
* Updated getArgTypeForStruct for x86/arm32 so that it returns TYP_STRUCT for all pass-by-value cases
* Fixes for the passing of 3, 5, 6, or 7 byte sized structs
* Fixed an issue on ARM64 where we would back-fill into x7 after passing a 16-byte struct on the stack
* Implemented register shuffling for multi-reg call returns on Arm64
* Fixed a regression on Arm32 for struct args that are not multi-reg
* Updated Tests.Lst with 23 additional passing tests
* Changes from code review feedback
2016-04-22 | Fix #3561: assert on RyuJIT x86 when generating shl by 1 | Bruce Forstall | 1 | -2/+2
This assert hit in the encoder when we were trying to generate an INS_shl_N with a constant of 1, instead of using the special xarch INS_shl_1 encoding, which saves a byte. It turns out that amd64 does in fact generate the suboptimal encoding currently.
The bad code occurs in the RMW case of genCodeForShift(). It turns out that function is unnecessarily complex, unique (it doesn't use the common RMW code paths), and has a number of other latent bugs. To fix this, I split genCodeForShift() by leaving the non-RMW case there and adding a genCodeForShiftRMW() function just for the RMW case. I rewrote the RMW case to use the existing emitInsRMW functions.
Other related cleanups along the way:
1. I changed emitHandleMemOp to consistently set the idInsFmt field, and changed all callers to stop pre-setting or post-setting this field. This makes the API much easier to understand. I added a big header comment for the function. Now, it takes a "basic" insFmt (using ARD, AWR, or ARW forms), which might be munged to an MRD/MWR/MRW form if necessary.
2. I changed some places to always use the most derived GenTree type for all uses. For instance, if the code has "GenTreeIndir* mem = node->AsIndir()", then always use "mem" from then on, and don't use "node". I changed some functions to take more derived GenTree node types.
3. I rewrote the emitInsRMW() functions to be much simpler, and rewrote their header comments.
4. I added GenTree predicates OperIsShift(), OperIsRotate(), and OperIsShiftOrRotate().
5. I added function genMapShiftInsToShiftByConstantIns() to encapsulate mapping from INS_shl to INS_shl_N or INS_shl_1 based on a constant. This code was already in 3 different places. (A sketch follows this entry.)
6. The change in assertionprop.cpp is simply to make JitDumps readable.
In addition to fixing the bug for RyuJIT/x86, there are a small number of x64 diffs where we now generate smaller encodings for shift by 1.
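A hedged sketch of the mapping helper described in item 5 above; only INS_shl, INS_shl_1, and INS_shl_N are named in the entry, so the enum stand-ins and the overall signature here are assumptions:

    // Stand-ins for the JIT instruction enum values named in the log entry.
    enum instruction { INS_shl, INS_shl_1, INS_shl_N };

    // Map a shift instruction plus a constant shift amount to its encoded form:
    // the one-byte-shorter *_1 encoding for a shift by 1, otherwise the *_N form.
    instruction genMapShiftInsToShiftByConstantIns(instruction ins, int shiftByValue)
    {
        if (ins == INS_shl)
        {
            return (shiftByValue == 1) ? INS_shl_1 : INS_shl_N;
        }
        // Other shifts/rotates would presumably be mapped the same way.
        return ins;
    }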