author | Mitchell Spryn <mspryn@fb.com> | 2020-06-26 11:08:56 -0700 |
---|---|---|
committer | MyungJoo Ham <myungjoo.ham@samsung.com> | 2021-07-15 10:49:30 +0900 |
commit | 6f2c349dc55661e02b841f9a86605225de0bed99 (patch) | |
tree | dca832c9e40bf5dc0d93a6120e49f45594c9fa43 | |
parent | e60a928d220c024b7a7622fc149c764d6112974a (diff) | |
Fix illegal opcode bug in caffe2 (#40584)
Refs: tizen_7.0_m2_release, tizen_6.5.m2_release, submit/tizen_6.5/20211028.163601, submit/tizen/20210715.075526, accepted/tizen/unified/20210715.094605, accepted/tizen/7.0/unified/hotfix/20221116.111404, accepted/tizen/7.0/unified/20221110.060752, accepted/tizen/6.5/unified/20211028.231830, tizen_7.0_hotfix, tizen_7.0, tizen_6.5, accepted/tizen_7.0_unified_hotfix, accepted/tizen_7.0_unified, accepted/tizen_6.5_unified
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40584
Also patch [this github issue](https://github.com/pytorch/pytorch/issues/33124)
involving an illegal assembly instruction in 8x8-dq-aarch64-neon.S.
Test Plan:
Build binaries, copy to shaker, run executables. Also run all
existing caffe tests.
Reviewed By: kimishpatel
Differential Revision: D22240670
Change-Id: I6ac15b550dbbb83caed4f9cd0fb5345e2614cc09
fbshipit-source-id: 51960266ce58699fe6830bcf75632b92a122f638
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
-rw-r--r-- | aten/src/ATen/native/quantized/cpu/qnnpack/src/q8gemm/8x8-dq-aarch64-neon.S | 16 |
1 files changed, 8 insertions, 8 deletions
diff --git a/aten/src/ATen/native/quantized/cpu/qnnpack/src/q8gemm/8x8-dq-aarch64-neon.S b/aten/src/ATen/native/quantized/cpu/qnnpack/src/q8gemm/8x8-dq-aarch64-neon.S
index 7a67c6d401..b8bde02006 100644
--- a/aten/src/ATen/native/quantized/cpu/qnnpack/src/q8gemm/8x8-dq-aarch64-neon.S
+++ b/aten/src/ATen/native/quantized/cpu/qnnpack/src/q8gemm/8x8-dq-aarch64-neon.S
@@ -687,14 +687,14 @@ BEGIN_FUNCTION pytorch_q8gemm_dq_ukernel_8x8__aarch64_neon
 SUB x1, x1, 4
- MOV V8.4s, V9.4s
- MOV v10.4s, v11.4s
- MOV v12.4s, V13.4s
- MOV V14.4s, V15.4s
- MOV V16.4s, V17.4s
- MOV V18.4s, V19.4s
- MOV V20.4s, V21.4s
- MOV V22.4s, V23.4s
+ MOV V8.16b, V9.16b
+ MOV v10.16b, v11.16b
+ MOV v12.16b, V13.16b
+ MOV V14.16b, V15.16b
+ MOV V16.16b, V17.16b
+ MOV V18.16b, V19.16b
+ MOV V20.16b, V21.16b
+ MOV V22.16b, V23.16b
 5:
 CMP x1, 2
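
The patch changes only the arrangement specifier on each register-to-register vector `MOV`, from `.4s` to `.16b`. In the A64 instruction set the vector `MOV` is an alias of `ORR Vd.T, Vn.T, Vn.T` and is defined only for the `8B` and `16B` arrangements. Both `.4s` and `.16b` describe the full 128-bit register, so the bits copied are the same. The following is a minimal sketch of a lint check that would flag the old form. It is not part of the commit: the helper name `check_vector_mov` and the regular expression are assumptions made for illustration, and the check deliberately ignores element-indexed forms such as `MOV v0.s[1], v1.s[0]`.

```python
import re

# Arrangements accepted by the AArch64 vector MOV alias (ORR Vd.T, Vn.T, Vn.T).
VALID_MOV_ARRANGEMENTS = {"8b", "16b"}

# Hypothetical pattern: matches whole-register forms such as "MOV V8.4s, V9.4s".
# Mnemonic and register case are ignored, as in the patched file.
_VECTOR_MOV = re.compile(
    r"^\s*mov\s+v(\d+)\.(\w+)\s*,\s*v(\d+)\.(\w+)\s*$", re.IGNORECASE
)


def check_vector_mov(line):
    """Return None if the line is acceptable (or not a vector MOV), else a message."""
    match = _VECTOR_MOV.match(line)
    if match is None:
        return None  # not a whole-register vector MOV; nothing to check
    dst_arr = match.group(2).lower()
    src_arr = match.group(4).lower()
    if dst_arr != src_arr:
        return f"arrangement mismatch: {dst_arr} vs {src_arr}"
    if dst_arr not in VALID_MOV_ARRANGEMENTS:
        return f"unsupported arrangement .{dst_arr}; use .8b or .16b"
    return None


if __name__ == "__main__":
    for asm in ("MOV V8.4s, V9.4s", "MOV V8.16b, V9.16b", "SUB x1, x1, 4"):
        print(asm, "->", check_vector_mov(asm))
```

Run over the removed lines, the check reports each `.4s` move; run over the added lines, it reports nothing.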