OpenBLAS ChangeLog
====================================================================
Version 0.1 alpha2.2
14-Jul-2011
common:
* Fixed a build bug when DYNAMIC_ARCH=1 & INTERFACE64=1.
(Refs issue #44 on github)
====================================================================
Version 0.1 alpha2.1
28-Jun-2011
common:
* Stop the build and print an error message when Fortran compiler
detection fails. (Refs issue #42 on github)
====================================================================
Version 0.1 alpha2
23-Jun-2011
common:
* Fixed the undefined blasint bug in <cblas.h>. Other software
can now include this header successfully. (Refs issue #13 on github)
* Fixed the SEGFAULT bug on 64 cores. On an SMP server, the number
of CPUs or cores should be less than or equal to 64. (Refs issue #14
on github)
* Support "void goto_set_num_threads(int num_threads)" and "void
openblas_set_num_threads(int num_threads)" when USE_OPENMP=1
* Added extern "C" to support C++. Thanks to Tasio for the patch. (Refs
issue #21 on github)
* Provided an error message when the arch is not supported. (Refs
issue #19 on github)
* Fixed issue #23, a bug in the f_check script when generating link flags.
* Added openblas_set_num_threads for Fortran.
* Fixed #25, a wrong result of rotmg.
* Fixed a bug in detecting the underscore prefix in c_check.
* Print the wall time (cycles) when FUNCTION_PROFILE is enabled.
* Fixed #35, a build bug with NO_LAPACK=1 & DYNAMIC_ARCH=1.
* Added install target. You can use "make install". (Refs #20)
x86/x86_64:
* Fixed #28, a wrong result of dsdot on x86_64.
* Fixed #32, a SEGFAULT bug of zdotc with gcc-4.6.
* Fixed #33, a ztrmm bug on Nehalem.
* Worked around #27, the low axpy performance with small input sizes & multiple threads.
MIPS64:
* Fixed #28, a wrong result of dsdot on Loongson3A/MIPS64.
* Optimized single/double precision BLAS Level3 on Loongson3A/MIPS64. (Refs #2)
* Optimized single/double precision axpy function on Loongson3A/MIPS64. (Refs #3)
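
For context on the dsdot fixes listed in this release: in netlib BLAS, dsdot takes
two single-precision vectors but accumulates and returns the dot product in double
precision. The sketch below is a plain C reference for that behaviour, written for
illustration only. The name ref_dsdot is hypothetical and this is not the OpenBLAS
kernel; it only assumes the netlib convention for strides.

```c
#include <stddef.h>

/* Hypothetical reference routine (not OpenBLAS code): dot product of two
   single-precision vectors, accumulated and returned in double precision. */
double ref_dsdot(int n, const float *x, int incx, const float *y, int incy)
{
    double acc = 0.0;
    if (n <= 0)
        return acc;

    /* With a negative stride, netlib BLAS starts at the far end of the vector. */
    ptrdiff_t ix = (incx < 0) ? (ptrdiff_t)(1 - n) * incx : 0;
    ptrdiff_t iy = (incy < 0) ? (ptrdiff_t)(1 - n) * incy : 0;

    for (int i = 0; i < n; ++i) {
        /* Widen each operand before multiplying so no precision is lost. */
        acc += (double)x[ix] * (double)y[iy];
        ix += incx;
        iy += incy;
    }
    return acc;
}
```

A result such as 16777216 + 1 cannot be represented in single precision, which is
why a kernel that accumulates in float can disagree with this reference.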
====================================================================
Version 0.1 alpha1
20-Mar-2011
common:
* Support "make NO_LAPACK=1" to build the library without
LAPACK functions.
* Fixed a random SEGFAULT when nodemask==NULL on Linux kernels above 2.6.34.
Thanks to Mr. Ei-ji Nakama for providing this patch. (Refs issue #12 on github)
* Added DEBUG=1 rule in Makefile.rule to build debug version.
* Disabled compiling quad precision in the reference BLAS library (netlib BLAS).
* Added unit test cases in the utest/ subdir, using the CUnit framework.
* Supported OPENBLAS_* & GOTO_* environment variables (Please see README)
* Imported GotoBLAS2 1.13 BSD version
x86/x86_64:
* On 32-bit x86, worked around a bug in zdot_sse2.S line 191 that would cause
zdotu & zdotc failures. (Refs issue #8 #9 on github)
* Modified ?axpy functions to return the same results as netlib BLAS
when incx==0 or incy==0 (Refs issue #7 on github)
* Modified ?swap functions to return the same results as netlib BLAS
when incx==0 or incy==0 (Refs issue #6 on github)
* Modified ?rot functions to return the same results as netlib BLAS
when incx==0 or incy==0 (Refs issue #4 on github)
* Detect Intel Westmere, Intel Clarkdale and Intel Arrandale
to use the Nehalem code.
* Fixed a typo bug in compiling the dynamic ARCH library.
MIPS64:
* Improved daxpy performance on ICT Loongson 3A.
* Supported ICT Loongson 3A CPU (Refs issue #1 on github)
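
To illustrate what "the same results as netlib BLAS" means for ?axpy with a zero
increment, here is a minimal C sketch of the netlib daxpy loop. The name ref_daxpy
is hypothetical and this is not the OpenBLAS kernel. Under the netlib loop,
incx==0 reuses x[0] for every element of y, and incy==0 accumulates all n updates
into y[0].

```c
#include <stddef.h>

/* Hypothetical reference routine (not OpenBLAS code): y := a*x + y using the
   netlib BLAS stride rules, including zero increments. */
void ref_daxpy(int n, double a, const double *x, int incx, double *y, int incy)
{
    /* netlib daxpy returns early for an empty vector or a zero scalar. */
    if (n <= 0 || a == 0.0)
        return;

    /* With a negative stride, start at the far end of the vector. */
    ptrdiff_t ix = (incx < 0) ? (ptrdiff_t)(1 - n) * incx : 0;
    ptrdiff_t iy = (incy < 0) ? (ptrdiff_t)(1 - n) * incy : 0;

    for (int i = 0; i < n; ++i) {
        /* A zero stride leaves the index in place, so the same element is
           read (incx==0) or updated (incy==0) on every iteration. */
        y[iy] += a * x[ix];
        ix += incx;
        iy += incy;
    }
}
```

For example, with a=2, x={3,5,7}, y={1,1,1}, incx=0 and incy=1, every element of y
becomes 1 + 2*3 = 7. An optimized kernel that special-cases strides needs to
reproduce exactly this outcome to match the reference.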
====================================================================