author: Li Tian <litian2025@gmail.com> 2016-12-11 19:13:28 -0800
committer: Li Tian <litian2025@gmail.com> 2017-01-08 12:44:12 -0800
commit: cc169eac6736693c4bdbd9f61ae821146252e4cb
tree: 95e296a6ace5a53eb24037698d7813947777ef8a /src/jit/lower.h
parent: c10c1ff8e3237689212606c9aa5153beec8a1778
Remove AVX/SSE transition penalties

There are two kinds of transition penalties:

1. Transition from 256-bit AVX code to 128-bit legacy SSE code.
2. Transition from 128-bit legacy SSE code to either 128-bit or 256-bit AVX code. This only happens if there was a preceding AVX256->legacy SSE transition penalty.

The primary goal is to remove the #1 AVX to SSE transition penalty.

Added two emitter flags: contains256bitAVXInstruction indicates that the JIT method contains 256-bit AVX code; containsAVXInstruction indicates that the method contains 128-bit or 256-bit AVX code.

Issue VZEROUPPER in the prolog if the method contains 128-bit or 256-bit AVX code, to avoid the legacy SSE to AVX transition penalty; this could happen in the reverse PInvoke situation. Issue VZEROUPPER in the epilog if the method contains 256-bit AVX code, to avoid the AVX to legacy SSE transition penalty.

To limit the code size increase, we only issue VZEROUPPER before a PInvoke call to a user-defined function if the JIT method contains 256-bit AVX code, assuming the user-defined function contains legacy SSE code. There is no need to issue VZEROUPPER after the PInvoke call, because the #2 SSE to AVX transition penalty won't happen once the #1 AVX to SSE transition has been taken care of before the PInvoke call.

We measured a ~1% to 3% performance gain on TechEmpower plaintext and verified that the VTune AVX/SSE events OTHER_ASSISTS.AVX_TO_SSE and OTHER_ASSISTS.SSE_TO_AVX have been reduced to 0.

Fix #7240

move setContainsAVX flags to lower, refactor to a smaller method

refactor, fix typo in comments

fix format error
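
The VZEROUPPER placement rules described in the commit message can be sketched as a few predicates over the two flags. This is a minimal illustration, not the JIT's actual code: the `AvxFlags` struct, the `avxEnabled` parameter, and the `NeedsVzeroupper*` helpers are hypothetical names, and the way `isFloatingType` and `sizeOfSIMDVector` are interpreted (32 bytes meaning a 256-bit vector) is an assumption based only on the parameter names in the declaration added to lower.h.

```cpp
// Hypothetical stand-in for the two emitter flags named in the commit message.
struct AvxFlags
{
    bool containsAVX       = false; // method contains 128-bit or 256-bit AVX code
    bool contains256bitAVX = false; // method contains 256-bit AVX code
};

// Sketch of a flag-setting helper modeled on the declaration added to lower.h.
// ASSUMPTIONS: `flags` and `avxEnabled` are extra parameters invented for this
// example; treating sizeOfSIMDVector == 32 as "256-bit" is a guess.
inline void SetContainsAVXFlags(AvxFlags& flags,
                                bool      avxEnabled,
                                bool      isFloatingType   = true,
                                unsigned  sizeOfSIMDVector = 0)
{
    if (!avxEnabled || !isFloatingType)
    {
        return; // nothing in this node would be encoded with AVX
    }
    flags.containsAVX = true;
    if (sizeOfSIMDVector == 32)
    {
        flags.contains256bitAVX = true;
    }
}

// Prolog: guards against the SSE -> AVX penalty (e.g. reverse PInvoke).
inline bool NeedsVzeroupperInProlog(const AvxFlags& f)
{
    return f.containsAVX;
}

// Epilog: guards against the AVX -> SSE penalty in the caller.
inline bool NeedsVzeroupperInEpilog(const AvxFlags& f)
{
    return f.contains256bitAVX;
}

// Before a PInvoke: only for user-defined targets, which are assumed to
// contain legacy SSE code; this limits the code size increase.
inline bool NeedsVzeroupperBeforePInvoke(const AvxFlags& f, bool isUserDefinedTarget)
{
    return f.contains256bitAVX && isUserDefinedTarget;
}
```

Under these assumptions, a method that only uses 128-bit AVX would get VZEROUPPER in its prolog but not in its epilog or before PInvoke calls, which matches the policy stated above.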
Diffstat (limited to 'src/jit/lower.h')
-rw-r--r-- src/jit/lower.h | 1
1 files changed, 1 insertions, 0 deletions
diff --git a/src/jit/lower.h b/src/jit/lower.h
index 555b9e26c6..da47cf9041 100644
--- a/src/jit/lower.h
+++ b/src/jit/lower.h
@@ -235,6 +235,7 @@ private:
#if defined(_TARGET_XARCH_)
void SetMulOpCounts(GenTreePtr tree);
+ void SetContainsAVXFlags(bool isFloatingType = true, unsigned sizeOfSIMDVector = 0);
#endif // defined(_TARGET_XARCH_)
#if !CPU_LOAD_STORE_ARCH