path: root/caffe2/perfkernels
Age | Commit message | Author | Files | Lines
2019-01-30 | more careful use of inline/template function in perfkernels (#15388) | Jongsoo Park | 13 | -1236/+1236
2019-01-18 | Remove ATen/Half.h and ATen/core/Half.h forwarding headers. | Edward Yang | 5 | -5/+5
2019-01-16 | Resolve errors in perfkernel for Windows (#16031) | Tongliang Liao | 15 | -384/+574
2019-01-09 | Fix several ResourceWarning: unclosed file (#15746) | Mickaël Schoentgen | 1 | -5/+4
2018-12-21 | handle empty inputs to SparseLengthsMean correctly (#15389) | Jongsoo Park | 3 | -98/+98
2018-12-13 | minimize header file includes from _avx2.cc (#14950) | Jongsoo Park | 10 | -503/+350
2018-12-11 | re-enable copy of python files, but be careful that the copy is only … (#14... | Zachary DeVito | 1 | -0/+0
2018-12-05 | include avx512vl to avx512 code path (#14733) | Jongsoo Park | 3 | -14/+16
2018-12-03 | add avx512 option (but no avx512 kernel yet) (#14664) | Jongsoo Park | 4 | -19/+105
2018-12-02 | inline adagrad functions (#14194) | Jongsoo Park | 3 | -93/+168
2018-11-16 | fix sparse_adagrad param_size overflow error (#14049) | Jongsoo Park | 2 | -6/+6
2018-11-11 | moving simd adagrad code to perfkernels (#13549) | Jongsoo Park | 5 | -1/+697
2018-09-24 | set up c10 scaffolding. Move macros proper first. | Yangqing Jia | 1 | -2/+2
2018-09-22 | Remove many caffe2::TIndex and replace them with int64_t (#11943) | Christian Puhrsch | 7 | -198/+198
2018-09-20 | codemod: caffe::float16 -> at::Half (#11785) | Roy Li | 7 | -184/+184
2018-08-22 | Faster random number generation in fused_rowwise_random_quantization_ops (#10... | Wei Wen | 3 | -0/+582
2018-08-10 | Remove caffe1 specific proto (#10380) | Yangqing Jia | 1 | -2/+2
2018-08-04 | Optimize reduce ops for 2d and 3d (#9992) | Xiaomeng Yang | 2 | -2/+2
2018-07-11 | Fix Eigen issue on OS X with CUDA and nvcc compile (#9350) | Orion Reblitz-Richardson | 2 | -0/+2
2018-07-10 | Revert D8768025: [pytorch][PR] Fix Eigen issue on OS X with CUDA and nvcc com... | Mike Kelley | 2 | -2/+0
2018-07-10 | Fix Eigen issue on OS X with CUDA and nvcc compile (#9270) | Orion Reblitz-Richardson | 2 | -0/+2
2018-03-27 | Add SparseLengthsPositionalWeightedSum operator that fuses SparseLengthsWeigh... | Jongsoo Park | 7 | -431/+1091
2018-03-27 | Remove Apache headers from source. | Orion Reblitz-Richardson | 15 | -256/+0
2018-01-19 | Add FusedEmbeddingLookup | Peter Goldsborough | 4 | -72/+3113
2017-12-03 | Fix typo in clang ifdef to fix clang 3.9 build | Pieter Noordhuis | 1 | -1/+1
2017-11-20 | Fix perfkernel compile error on clang 3.8 | Pieter Noordhuis | 1 | -2/+15
2017-11-13 | Limit this fix to apple clang only | caozhong | 1 | -1/+1
2017-11-03 | Fix intrinsic in perf kerneles for int8 | Dmytro Dzhulgakov | 2 | -64/+64
2017-11-03 | Fix boundary checking in 8-bit sparselengthssum ops | Dmytro Dzhulgakov | 2 | -116/+317
2017-09-28 | Re-license to Apache | Yangqing Jia | 12 | -0/+191
2017-09-21 | Adding uint8 support for to code generator for and high-performance emebding ... | Misha Smelyanskiy | 6 | -110/+1412
2017-09-19 | Backed out changeset 3a5c020294d8 | Yangqing Jia | 6 | -1393/+92
2017-09-19 | Adding uint8 support for to code generator for and high-performance emebding ... | Misha Smelyanskiy | 6 | -92/+1393
2017-08-30 | Code generator for and high-performance emebding look-up kernels, supporting | Misha Smelyanskiy | 2 | -39/+1637
2017-08-22 | cmake: generate macros.h with configure_file() | Luke Yeager | 2 | -0/+4
2017-08-08 | Fix build failure on MacOS X with clang-800.0.42.1 | Jammy Zhou | 1 | -1/+1
2017-08-02 | add proper build support for perfkernels | Yangqing Jia | 6 | -1/+113
2017-07-30 | Scaffolding for perfkernels dispatch of embedding lookup | Dmytro Dzhulgakov | 3 | -0/+230
2017-07-26 | fix perfkernels build | Yangqing Jia | 1 | -1/+1
2017-07-25 | vectorized typed axpy implementation | Yangqing Jia | 6 | -9/+198
2017-07-25 | lengths_reducer_ops refactoring. | Yangqing Jia | 2 | -0/+62