Git init (HEAD, 2.0_alpha, master, 2.0alpha, 1.0_post)

author: Kibum Kim <kb0929.kim@samsung.com> 2012-01-07 00:46:38 +0900
committer: Kibum Kim <kb0929.kim@samsung.com> 2012-01-07 00:46:38 +0900
commit: f5660c6460a863b19f9ef745575780e37cc192a9 (patch)
tree: 0b478679da32d706de7b0de546d2e4daf03b160c /mpi/i586/README
parent: 06b9124a4f9d38acc78e6af686bc49a06f6354f8 (diff)
download: gnupg-1.0_post.tar.gz
gnupg-1.0_post.tar.bz2
gnupg-1.0_post.zip
1 files changed, 26 insertions, 0 deletions
diff --git a/mpi/i586/README b/mpi/i586/README
new file mode 100644
index 0000000..d73b082
--- /dev/null
+++ b/mpi/i586/README
@@ -0,0 +1,26 @@
+This directory contains mpn functions optimized for Intel Pentium
+processors.
+
+RELEVANT OPTIMIZATION ISSUES
+
+1. Pentium doesn't allocate cache lines on writes, unlike most other modern
+processors.  Since the functions in the mpn class do array writes, we have to
+handle allocating the destination cache lines by reading a word from it in the
+loops, to achieve the best performance.
+
+2. Pairing of memory operations requires that the two issued operations refer
+to different cache banks.  The simplest way to ensure this is to read/write
+two words from the same object.  If we make operations on different objects,
+they might or might not be to the same cache bank.
+
+STATUS
+
+1. mpn_lshift and mpn_rshift run at about 6 cycles/limb, but the Pentium
+documentation indicates that they should take only 43/8 = 5.375 cycles/limb,
+or 5 cycles/limb asymptotically.
+
+2. mpn_add_n and mpn_sub_n asymptotically run at 2 cycles/limb.  Due to loop
+overhead and other delays (cache refill?), they run at or near 2.5 cycles/limb.
+
+3. mpn_mul_1, mpn_addmul_1, mpn_submul_1 all run 1 cycle faster than they
+should...
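
Optimization issue 1 in the README above can be illustrated in portable C. The following is only a rough sketch of the idea, not the assembly code that lives in this directory: the name sketch_add_n and the limb_t typedef are hypothetical, and the dummy read is assumed to help only on a cache that does not allocate lines on writes, as the README describes for the Pentium. Each iteration reads one word of the destination before storing into it, so the destination cache line would already be present when the store happens.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the limb type used by the mpn functions. */
typedef uint32_t limb_t;

/* Sketch: add the n-limb numbers a and b, store the sum in dst, and
   return the carry out (0 or 1).  Not the actual mpn_add_n. */
limb_t sketch_add_n(limb_t *dst, const limb_t *a, const limb_t *b, size_t n)
{
    limb_t carry = 0;

    for (size_t i = 0; i < n; i++) {
        /* Dummy read of the destination word.  The volatile access keeps
           the compiler from dropping it; on a cache without write
           allocation this read is what pulls the line in. */
        (void)*(volatile limb_t *)&dst[i];

        limb_t s  = a[i] + carry;   /* add the incoming carry        */
        limb_t c1 = s < carry;      /* 1 if that addition wrapped    */
        limb_t r  = s + b[i];       /* add the second operand        */
        limb_t c2 = r < s;          /* 1 if that addition wrapped    */

        dst[i] = r;
        carry  = c1 | c2;           /* at most one of the two is set */
    }
    return carry;
}
```

A compiler is free to schedule the surrounding loads and stores as it likes, so this shows the access pattern only; the cycle counts listed under STATUS refer to the hand-written assembly, not to code like this.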