author tmsriram <tmsriram@google.com> 2017-06-15 14:24:18 -0700
committer Victor Costan <pwnall@chromium.org> 2017-06-28 18:34:54 -0700
commit f24f9d2d97fe702822bf4fcc23fbce0912b5060a
tree 4529fa7058e932ceed774f20b451fc22ce6f75b6
parent 82deffcde796fe2b6260087485cbb64543aa8788
Explicitly copy internal::wordmask to the stack array to work around a compiler optimization with LLVM that converts const stack arrays to global arrays.

This is a temporary change and should be reverted when https://reviews.llvm.org/D30759 is fixed.

With PIE, accessing stack arrays is more efficient than global arrays and wordmask was moved to the stack due to that. However, the LLVM compiler automatically converts stack arrays, detected as constant, to global arrays and this transformation hurts PIE performance with LLVM. We are working to fix this in the LLVM compiler, via https://reviews.llvm.org/D30759, to not do this conversion in PIE mode. Until this patch is finished, please consider this source change as a temporary work around to keep this array on the stack.

This source change is important to allow some projects to flip the default compiler from GCC to LLVM for optimized builds.

This change works for the following reason. The LLVM compiler does not convert non-const stack arrays to global arrays and explicitly copying the elements is enough to make the compiler assume that this is a non-const array.

With GCC, this change does not affect code-gen in any significant way. The array initialization code is slightly different as it copies the constants directly to the stack. With LLVM, this keeps the array on the stack.

No change in performance with GCC (within noise range). With LLVM, ~0.7% improvement in optimized mode (no FDO) and ~1.75% improvement in FDO mode.
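
For illustration only, the two copy styles described above can be sketched as a pair of standalone functions. This is a minimal sketch, not code from snappy: the namespace `sketch`, the table `kWordMask` (a hypothetical stand-in for `internal::wordmask`, with values assumed here), and both function names are invented for the example. Both functions compute the same result; whether a particular compiler keeps `local` on the stack in either form depends on the compiler version and flags, and is not something this sketch demonstrates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace sketch {

// Hypothetical stand-in for internal::wordmask (values assumed for this
// example): entry n keeps the low n bytes of a 32-bit word.
static const uint32_t kWordMask[] = {0u, 0xffu, 0xffffu, 0xffffffu,
                                     0xffffffffu};
constexpr size_t kWordMaskSize = sizeof(kWordMask) / sizeof(kWordMask[0]);
static_assert(kWordMaskSize == 5, "element-wise copy below assumes 5 entries");

// Form before the change: the local array is filled with memcpy. Per the
// commit message, LLVM may detect such an array as constant and turn it
// back into a global array.
inline uint32_t LowBytesViaMemcpy(uint32_t value, size_t n_bytes) {
  assert(n_bytes < kWordMaskSize);
  uint32_t local[kWordMaskSize];
  std::memcpy(local, kWordMask, sizeof(local));
  return value & local[n_bytes];
}

// Form after the change: each element is assigned explicitly. Per the
// commit message, this is enough for LLVM to treat the array as non-const.
inline uint32_t LowBytesViaElementCopy(uint32_t value, size_t n_bytes) {
  assert(n_bytes < kWordMaskSize);
  uint32_t local[kWordMaskSize];
  local[0] = kWordMask[0];
  local[1] = kWordMask[1];
  local[2] = kWordMask[2];
  local[3] = kWordMask[3];
  local[4] = kWordMask[4];
  return value & local[n_bytes];
}

}  // namespace sketch
```

The two functions are interchangeable from the caller's point of view, which matches the commit's claim that the change is behavior-preserving; only the generated code is expected to differ.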
-rw-r--r-- snappy.cc 11
1 files changed, 10 insertions, 1 deletions
diff --git a/snappy.cc b/snappy.cc
index 8bb5d23..1ba247b 100644
--- a/snappy.cc
+++ b/snappy.cc
@@ -662,7 +662,16 @@ class SnappyDecompressor {
     // For position-independent executables, accessing global arrays can be
     // slow. Move wordmask array onto the stack to mitigate this.
     uint32 wordmask[sizeof(internal::wordmask)/sizeof(uint32)];
-    memcpy(wordmask, internal::wordmask, sizeof(wordmask));
+    // Do not use memcpy to copy internal::wordmask to
+    // wordmask. LLVM converts stack arrays to global arrays if it detects
+    // const stack arrays and this hurts the performance of position
+    // independent code. This change is temporary and can be reverted when
+    // https://reviews.llvm.org/D30759 is approved.
+    wordmask[0] = internal::wordmask[0];
+    wordmask[1] = internal::wordmask[1];
+    wordmask[2] = internal::wordmask[2];
+    wordmask[3] = internal::wordmask[3];
+    wordmask[4] = internal::wordmask[4];
     // We could have put this refill fragment only at the beginning of the loop.
     // However, duplicating it at the end of each branch gives the compiler more