RyuJIT/x86: Implement TYP_SIMD12 support

There is no native load/store instruction for Vector3/TYP_SIMD12, so we need to break this type down into two loads or two stores, with an additional instruction to put the values together in the xmm target register. AMD64 SIMD support already implements most of this. For RyuJIT/x86, we need to implement stack argument support (both incoming and outgoing), which is different from the AMD64 ABI.

In addition, this change implements accurate alignment-sensitive codegen for all SIMD types. For RyuJIT/x86, the stack is only 4-byte aligned (unless we have double alignment), so SIMD locals are not known to be aligned (TYP_SIMD8 locals could be aligned when double alignment is in effect). For AMD64, we were unnecessarily pessimizing alignment information, and were always generating unaligned moves when on AVX2 hardware. Now, all SIMD types are given their preferred alignment in getSIMDTypeAlignment(), and alignment determination in isSIMDTypeLocalAligned() takes stack alignment into account. X86 still needs to consider dynamic stack alignment for SIMD locals.

Fixes #7863
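
The two-part load and store described above can be modeled in portable C++. This is only an illustrative sketch, not the JIT's codegen: `Xmm`, `loadSimd12`, and `storeSimd12` are hypothetical names, the 16-byte register is modeled as four float lanes, and zeroing the unused upper lane on load is an assumption made for the sketch.

```cpp
#include <array>
#include <cstring>

// Hypothetical stand-in for a 16-byte xmm register: four float lanes.
using Xmm = std::array<float, 4>;

// Load a 12-byte Vector3 as an 8-byte piece plus a 4-byte piece.
// The upper lane is zeroed here (an assumption of this sketch).
Xmm loadSimd12(const void* src)
{
    Xmm reg{}; // all four lanes start at zero
    const unsigned char* p = static_cast<const unsigned char*>(src);
    std::memcpy(reg.data(), p, 8);         // first load: lanes 0 and 1
    std::memcpy(reg.data() + 2, p + 8, 4); // second load: lane 2
    return reg;
}

// Store only the low 12 bytes, again as an 8-byte piece plus a 4-byte piece,
// so the 4 bytes following the destination are left untouched.
void storeSimd12(void* dst, const Xmm& reg)
{
    unsigned char* p = static_cast<unsigned char*>(dst);
    std::memcpy(p, reg.data(), 8);         // first store: lanes 0 and 1
    std::memcpy(p + 8, reg.data() + 2, 4); // second store: lane 2
}
```

The point of splitting the store is that a full 16-byte write would clobber whatever follows the 12-byte value in memory.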
author: Bruce Forstall <brucefo@microsoft.com> 2016-11-13 19:35:32 -0800
committer: Bruce Forstall <brucefo@microsoft.com> 2016-12-02 17:55:27 -0800
commit: e401df83de9a4f135e71ca2ab06eff19c112a881 (patch)
tree: ecc4e369dff35148cbc4212d940045ca642f9b84 /src/jit/regset.cpp
parent: a0a055ba3cf8265055a37a14f206e9e30836bc18 (diff)
download: coreclr-e401df83de9a4f135e71ca2ab06eff19c112a881.tar.gz
coreclr-e401df83de9a4f135e71ca2ab06eff19c112a881.tar.bz2
coreclr-e401df83de9a4f135e71ca2ab06eff19c112a881.zip
1 files changed, 10 insertions, 0 deletions
diff --git a/src/jit/regset.cpp b/src/jit/regset.cpp
index 2980f96813..0d0ac3e0ce 100644
--- a/src/jit/regset.cpp
+++ b/src/jit/regset.cpp
@@ -3175,6 +3175,16 @@ var_types Compiler::tmpNormalizeType(var_types type)
 
     type = genActualType(type);
 
+#if defined(FEATURE_SIMD) && !defined(_TARGET_64BIT_)
+    // For SIMD on 32-bit platforms, we always spill SIMD12 to a 16-byte SIMD16 temp.
+    // This is because we don't have a single instruction to store 12 bytes. We also
+    // allocate non-argument locals as 16 bytes; see lvSize().
+    if (type == TYP_SIMD12)
+    {
+        type = TYP_SIMD16;
+    }
+#endif // defined(FEATURE_SIMD) && !defined(_TARGET_64BIT_)
+
 #else  // LEGACY_BACKEND
     if (!varTypeIsGC(type))
     {
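
The normalization rule added in this hunk can be exercised in isolation with a small standalone model. The sketch below is not the JIT's code: `SketchType`, `normalizeSpillType`, and `spillTempSize` are hypothetical names, and the 64-bit check is passed as a parameter rather than selected with `_TARGET_64BIT_`.

```cpp
#include <cstddef>

// Hypothetical subset of type kinds; the real JIT uses var_types.
enum SketchType { SK_INT, SK_SIMD8, SK_SIMD12, SK_SIMD16 };

// On 32-bit targets a SIMD12 value is spilled to a SIMD16 temp, because
// no single instruction stores exactly 12 bytes. Other types pass through.
SketchType normalizeSpillType(SketchType type, bool target64Bit)
{
    if (!target64Bit && type == SK_SIMD12)
    {
        return SK_SIMD16;
    }
    return type;
}

// Size in bytes of the spill temp that would be allocated for a type.
std::size_t spillTempSize(SketchType type, bool target64Bit)
{
    switch (normalizeSpillType(type, target64Bit))
    {
        case SK_INT:    return 4;
        case SK_SIMD8:  return 8;
        case SK_SIMD12: return 12;
        case SK_SIMD16: return 16;
    }
    return 0; // unreachable for the enumerators above
}
```

With a 16-byte temp, a spill or reload on 32-bit can use one full-width move instead of the two-part 12-byte sequence.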