Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
In case of softfp abi, assembly implementations of only those APIs are
used which doesnt have a floating point argument or return value.
In case of hard abi, all assembly implementations are used.
|
|
Added generic 4x4 and 4x2 gemm kernels
Added generic 4x2 trmm kernel
|
|
If ARM abi is not explicitly mentioned on the command line, then set the
arm abi to softfp or hard according to the compiler environment.
This assumes that compiler sets the defines __ARM_PCS and __ARM_PCS_VFP
accordingly.
|
|
|
|
|
|
|
|
|
|
|
|
Add 64bit support for Microsoft Visual Studio
|
|
|
|
|
|
|
|
|
|
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
|
|
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
|
|
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
|
|
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
|
|
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
|
|
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
|
|
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
|
|
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
|
|
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
|
|
* Fix installation of header files with cmake
Install only the required header files, with openblas_config.h preprocessed like in Makefile.install
Fixes #1184
* Update CMakeLists.txt
Escape remaining semicolons in awk argument list (to get it working on Windows as well)
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Add files via upload
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
see if it is the single quotes that cause the problem on windows
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Use C utility instead of awk for header generation in cmake builds
* Update CMakeLists.txt
* Fix generation and installation of header files
Generate openblas_config.h and f77blas.h with same contents as in plain Makefile builds and install only the public header files
|
|
Update test to use openblas_make_complex_float and openblas_make_comp…
|
|
openblas_make_complex_double functions
|
|
build: Flang has the same interface as PGI
|
|
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
|
|
build: Flang compiler support
|
|
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
|
|
build: fix libxlmass errors building on Power CPU
|
|
IBM MASS library is upgraded to 8.1.5 and 8.1.3 is not available.
Update README.md and Makefile.power to use version 8.1.5 of libxlmass.
|
|
Build shared library on Android without SONAME versioning
|
|
Android does not support versioned SONAME entries, ref. #1173
|
|
MIPS threading fixes
|
|
Fixes to driver/others/memory.c
|
|
|
|
MIPS 32-bit currently has an empty blas_lock implementation which is
worse than nothing at all. MIPS 64-bit does has a blas_lock
implementation but is broken. Remove them and fallback to the generic
version in common.h which should do the right thing on MIPS.
|
|
The MIPS architecture has weak memory ordering and therefore requires
sutible memory barriers when doing lock free programming with multiple
threads (just like ARM does). This commit implements those barriers for
MIPS and MIPS64 using GCC bultins which is probably easiest way.
|
|
Before this commit, the "position < NUM_BUFFERS" loop condition from
blas_memory_free will be completely optimized away by GCC. This is
because the condition can only be false after undefined behavior has
already been invoked (reading past the end of an array). As a
consequence of this bug, GCC also removes the subsequent if statement
and all the code after the error label because all of it is dead.
This commit switches the loop condition around so it works as intended.
|
|
Fix workspace computation in LAPACKE ?tpmqrt
|
|
Fix DYNAMIC_ARCH=1 breaking builds on non-x86 platforms
|
|
|
|
|
|
From netlib PR#144
|