OpenBLAS ChangeLog
====================================================================
Version 0.1 alpha2.1
28-Jun-2011

common:
	* Stop the build and print an error message when Fortran compiler
	  detection fails. (Refs issue #42 on github)

====================================================================
Version 0.1 alpha2
23-Jun-2011

common:
	* Fixed the undefined blasint bug in <cblas.h>. Other software
	  can now include this header successfully. (Refs issue #13 on github)
	* Fixed a SEGFAULT on machines with 64 cores. On SMP servers, the
	  number of CPUs or cores must be less than or equal to 64. (Refs
	  issue #14 on github)
	* Supported "void goto_set_num_threads(int num_threads)" and "void
	  openblas_set_num_threads(int num_threads)" when USE_OPENMP=1.
	* Added extern "C" to support C++. Thanks to Tasio for the patch.
	  (Refs issue #21 on github)
	* Provided an error message when the arch is not supported. (Refs
	  issue #19 on github)
	* Fixed issue #23, a bug in the f_check script when generating link flags.
	* Added openblas_set_num_threads for Fortran.
	* Fixed #25, a wrong result from rotmg.
	* Fixed a bug in detecting the underscore prefix in c_check.
	* Print the wall time (cycles) when FUNCTION_PROFILE is enabled.
	* Fixed #35, a build bug with NO_LAPACK=1 & DYNAMIC_ARCH=1.
	* Added install target. You can use "make install". (Refs #20)


x86/x86_64:
	* Fixed #28, a wrong result from dsdot on x86_64.
	* Fixed #32, a SEGFAULT in zdotc with gcc-4.6.
	* Fixed #33, a ztrmm bug on Nehalem.
	* Worked around #27, the low-performance axpy issue with small input sizes and multiple threads.

MIPS64:
	* Fixed #28, a wrong result from dsdot on Loongson3A/MIPS64.
	* Optimized single/double precision BLAS Level3 on Loongson3A/MIPS64. (Refs #2)
	* Optimized single/double precision axpy function on Loongson3A/MIPS64. (Refs #3)

====================================================================
Version 0.1 alpha1
20-Mar-2011

common:
	* Supported "make NO_LAPACK=1" to build the library without
	  LAPACK functions.
	* Fixed a random SEGFAULT when nodemask==NULL on Linux kernels above 2.6.34.
	  Thanks to Mr. Ei-ji Nakama for providing this patch. (Refs issue #12 on github)
	* Added DEBUG=1 rule in Makefile.rule to build debug version.
	* Disabled compiling quad precision in the reference BLAS library (netlib BLAS).
	* Added unit test cases in the utest/ subdir, using the CUnit framework.
	* Supported OPENBLAS_* & GOTO_* environment variables. (Please see README)
	* Imported GotoBLAS2 1.13 BSD version

x86/x86_64:
	* On 32-bit x86, addressed a bug at zdot_sse2.S line 191 that would cause
	  zdotu & zdotc failures; it is worked around rather than fixed outright.
	  (Refs issue #8 #9 on github)
	* Modified ?axpy functions to return the same results as netlib BLAS
	  when incx==0 or incy==0. (Refs issue #7 on github)
	* Modified ?swap functions to return the same results as netlib BLAS
	  when incx==0 or incy==0. (Refs issue #6 on github)
	* Modified ?rot functions to return the same results as netlib BLAS
	  when incx==0 or incy==0. (Refs issue #4 on github)
	* Detect Intel Westmere, Intel Clarkdale, and Intel Arrandale
	  and use the Nehalem code for them.
	* Fixed a typo bug in compiling the dynamic ARCH library.
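
Note on the incx==0 / incy==0 entries above: the sketch below illustrates
the netlib-style loop that those entries take as the reference result for
?axpy. It is a minimal illustration, not OpenBLAS source. The name ref_daxpy
is hypothetical, and the code uses 0-based indexing where the netlib Fortran
is 1-based.

```c
#include <stddef.h>

/* Hypothetical reference-style daxpy: y := alpha*x + y.
 * Mirrors the general (non-unit-stride) loop of netlib daxpy,
 * translated to 0-based indexing. Not an OpenBLAS symbol. */
void ref_daxpy(int n, double alpha, const double *x, int incx,
               double *y, int incy)
{
    if (n <= 0 || alpha == 0.0)
        return;

    /* For negative strides, start from the far end, as netlib does. */
    ptrdiff_t ix = (incx < 0) ? (ptrdiff_t)(1 - n) * incx : 0;
    ptrdiff_t iy = (incy < 0) ? (ptrdiff_t)(1 - n) * incy : 0;

    for (int i = 0; i < n; ++i) {
        /* With incx==0 every step reads x[0];
         * with incy==0 every step accumulates into y[0]. */
        y[iy] += alpha * x[ix];
        ix += incx;
        iy += incy;
    }
}
```

With incx==0 the single element x[0] is added (scaled by alpha) to every
element of y. With incy==0 all n products are accumulated into y[0].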

MIPS64:
	* Improved daxpy performance on ICT Loongson 3A.
	* Supported ICT Loongson 3A CPU (Refs issue #1 on github)
====================================================================