From compiler warning.
|
|
(merge from Julien on June 17th, 2016)
|
|
Reported by Eugene Chereshnev on April 4th 2016
See:
http://icl.cs.utk.edu/lapack-forum/posting.php?mode=reply&f=13&t=4941
|
|
Reported by Eugene Chereshnev on April 4th 2016
See:
http://icl.cs.utk.edu/lapack-forum/posting.php?mode=reply&f=13&t=4943
|
|
reported by Eugene Chereshnev on April 4th 2016
See:
http://icl.cs.utk.edu/lapack-forum/posting.php?mode=reply&f=13&t=4942
|
|
parameter:
reported by Eugene Chereshnev on April 4th 2016
See:
http://icl.cs.utk.edu/lapack-forum/posting.php?mode=reply&f=13&t=4945
|
|
reported by Eugene Chereshnev on April 4th 2016
See:
http://icl.cs.utk.edu/lapack-forum/posting.php?mode=reply&f=13&t=4946
|
|
should be replaced with ones without _t.
Reported by Alex Zotkevich, Intel Co. on April 11th 2016
See http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4950
|
|
Reported by Alex Zotkevich, Intel Co. on April 11th 2016
See http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4951
|
|
Reported by Alex Zotkevich, Intel Co. on April 11th 2016
See http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4953
|
|
Reported by Alex Zotkevich, Intel Co. on April 11th 2016
See http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4954
|
|
the gitignore command *.[oa]. I removed the spaces and my `git` follows the
.gitignore. Better for me now.
|
|
See: http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4975
In dlasy2.f, INFO = 1 needs to be inserted before line 442.
|
|
Compiling current lapack svn trunk with ifort -warn all results in
errors like:
ifort -O3 -fp-model strict -warn all -c sorcsd2by1.f -o sorcsd2by1.o
sorcsd2by1.f(350): error #6633: The type of the actual argument differs from the type of the dummy argument. [0]
CALL SORBDB1( M, P, Q, X11, LDX11, X21, LDX21, THETA, 0, 0,
------------------------------------------------------------------^
sorcsd2by1.f(350): error #6633: The type of the actual argument differs from the type of the dummy argument. [0]
CALL SORBDB1( M, P, Q, X11, LDX11, X21, LDX21, THETA, 0, 0,
---------------------------------------------------------------------^
sorcsd2by1.f(351): error #6633: The type of the actual argument differs from the type of the dummy argument. [0]
$ 0, 0, WORK, -1, CHILDINFO )
--------------------------^
sorcsd2by1.f(351): error #6633: The type of the actual argument differs from the type of the dummy argument. [0]
$ 0, 0, WORK, -1, CHILDINFO )
-----------------------------^
ifort -O3 -fp-model strict -warn all -c cgesdd.f -o cgesdd.o
cgesdd.f(343): error #6633: The type of the actual argument differs from the type of the dummy argument. [CDUM]
CALL CGEBRD( M, N, CDUM(1), M, CDUM(1), DUM(1), CDUM(1),
-------------------------------------------^
|
|
Contribution from Mark Gates (UTK)
From Mark:
It blocks NB gemv calls into one gemm call inside trevc. To do that, it
needs a new routine, trevc3, because unfortunately the lwork was not
passed into trevc. (I highly recommend all new routines always pass
lwork and lrwork, where applicable, to enable future upgrades & to
catch lwork bugs.)
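
A minimal sketch of the blocking idea described above, assuming plain-Python stand-ins for gemv and gemm (these helpers are illustrative only and are not the trevc3 code). It shows that NB separate matrix-vector products give the same result as one matrix-matrix product whose right-hand side holds the NB vectors as columns. The blocked form needs room for the gathered block, which is consistent with the remark that a workspace length has to be passed in.

```python
# Illustrative only: plain-Python stand-ins for BLAS gemv and gemm.
def gemv(a, x):
    """Return the matrix-vector product a*x (a is a list of rows)."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def gemm(a, b):
    """Return the matrix-matrix product a*b (both are lists of rows)."""
    cols = list(zip(*b))
    return [[sum(aik * bkj for aik, bkj in zip(row, col)) for col in cols]
            for row in a]


a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
vectors = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0]]  # NB = 3 vectors of length 2

# Unblocked: one gemv call per vector.
one_by_one = [gemv(a, x) for x in vectors]

# Blocked: gather the vectors as columns of X, then make a single gemm call.
x_block = [list(row) for row in zip(*vectors)]          # 2-by-NB
blocked = gemm(a, x_block)                              # 3-by-NB
blocked_columns = [list(col) for col in zip(*blocked)]  # back to one list per vector
```

Each entry of blocked_columns matches the corresponding gemv result, so the blocked form changes the call pattern and not the arithmetic.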
|
|
Adding Julien Schueller's fix for CBLAS cmake
|
|
Contributed by Julien Schueller
Sent to Julie on May 8th 2016
Corresponds to SVN rev 1748.
|
|
See: http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4970
Posted on Wed May 25th
From Lawrence
=============
In zhbgvd.f (chbgvd.f)
Line 312: LRWMIN = N
Line 374: Call ZHBGST(....,RWORK(INDWRK), IINFO)
INDWRK = N+1 and ZHBGST requires RWORK(N)
Therefore either
i. line 312 should be LRWMIN = 2*N; or,
ii. line 374 should use RWORK(INDE)
From Julien
===========
Let us go with fix (ii) then.
Note: I simply used "RWORK", not "RWORK(INDE)"
|
|
See: http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4970
Posted on Wed May 25th
From Lawrence
=============
In dsbgvd.f (ssbgvd.f)
with the case WANTZ = .False.
LWMIN = 2*N (line 282)
But, (line 140)
Call dsbgst(.....,work(INDWRK), iinfo)
where INDWRK = N+1 and dsbgst requires a workspace of 2*N
Therefore either
i. line 282 should be LWMIN = 3*N; or,
ii. line 140 should use WORK(INDE)
From Julien
===========
Let us go with fix (ii) then.
Note: I simply used "WORK", not "WORK(INDE)"
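
The index arithmetic behind this report and the zhbgvd one before it can be sketched as follows. This is a hedged illustration with hypothetical helper names: it models a 1-based work array whose declared minimum length is LWMIN (or LRWMIN) and a callee that uses `need` elements starting at the offset it is handed.

```python
def required_length(start, need):
    """Smallest 1-based array length such that elements start..start+need-1 exist."""
    return start - 1 + need


def overrun(declared_min, start, need):
    """Number of elements past the declared minimum the callee may touch (0 if none)."""
    return max(0, required_length(start, need) - declared_min)


n = 10
indwrk = n + 1

# dsbgvd with WANTZ = .FALSE.: LWMIN = 2*N and dsbgst needs 2*N elements.
dsbgvd_before = overrun(2 * n, indwrk, 2 * n)  # WORK(INDWRK) passed
dsbgvd_after = overrun(2 * n, 1, 2 * n)        # WORK passed

# zhbgvd: LRWMIN = N and zhbgst needs N real elements.
zhbgvd_before = overrun(n, indwrk, n)          # RWORK(INDWRK) passed
zhbgvd_after = overrun(n, 1, n)                # RWORK passed
```

With the offset INDWRK = N+1 the callee can run N elements past the documented minimum in both routines. Passing the start of the array removes the overrun without raising the minimum.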
|
|
------------------------------------------------------------------------
r1744 | langou | 2016-04-29 15:39:17 -0600 (Fri, 29 Apr 2016) | 15 lines
See post of Nathan Whitehead on the forum (4/29/2016)
http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4958
Addressing second comment (out of two):
"For GELSS, there is a DUM dummy argument but it is complex, [CZ]GEBRD expects
a real array as 6th argument. I added a "REAL DUMMY(1)" in local arrays, then
passed DUMMY for 6th argument of [CZ]GEBRD."
(Use the array S instead of creating a REAL DUMMY(1) array.)
(Update the thank-you file.)
------------------------------------------------------------------------
r1743 | langou | 2016-04-29 15:12:04 -0600 (Fri, 29 Apr 2016) | 4 lines
( minor: re-indent the file zgges3.f )
------------------------------------------------------------------------
r1742 | langou | 2016-04-29 15:11:31 -0600 (Fri, 29 Apr 2016) | 14 lines
See post of Nathan Whitehead on the forum (4/29/2016)
http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4958
Addressing first comment (out of two):
"For CGGES3, there is an argument that is complex where a real array is expected.
SRC/cgges3.f:397
I changed WORK to RWORK in the second to last argument to CHGEQZ."
(Update the thank-you file.)
|
|
reported by nathanw
In the CUNCSD function I saw an illegal use of an integer for an array argument.
To fix I replaced 0 with U1 in 4th argument of CUNGQR and CUNGLQ to be consistent with ZUNCSD.
SRC/cuncsd.f:491,496
|
|
*gesvd, and *bdsdc).
Items are labelled (a) through (m), omitting (l).
Several are not bugs, just suggestions.
Most bugs are in *gesdd.
There's one bug (g) in *bdsdc. This is the underlying cause of LAPACK bug #111.
There's one bug (m) in [cz]gesvd. I also added an INT() cast in these
assignments to silence compiler warnings. Changed:
LWORK_ZGEQRF=CDUM(1)
to:
LWORK_ZGEQRF = INT( CDUM(1) )
Where possible, I ran a test showing the wrong behavior, then a test showing the
corrected behavior. These use a modified version of the MAGMA SVD tester
(calling LAPACK), because I could adjust the lwork as needed. The last 3 columns
are the lwork type, the lwork size, and the lwork formula. The lwork types are:
  doc_old   as documented in LAPACK 3.6.
  doc       as in the attached, updated documentation.
  min_old   minwrk, as computed in LAPACK 3.6.
  min       minwrk, as computed in the attached, updated code.
  min-1     minimum - 1; this should cause gesdd to return an error.
  opt       optimal size.
  max       the maximum size LAPACK will take advantage of;
            in some cases, the optimal is n*n + work, while the max is m*n + work.
  query     what gesdd returns for an lwork query; should equal opt or max.
After the lwork, occasionally there is a ! or ? error code indicating:
  !   error: lwork < min. For (min-1), this ought to appear.
  ?   compatibility issue: lwork < min_old, will fail for lapack <= 3.6.
I also tested the update routines on a wide variety of sizes and jobz, with
various lwork.
Besides fixing the bugs below, I made two significant changes.
1) Changed *gesdd from computing each routine's workspace using, e.g.:
N*ilaenv(...)
to querying each routine for its LWORK, e.g.:
CALL ZGEBRD( M, N, CDUM(1), M, CDUM(1), DUM(1), CDUM(1),
$ CDUM(1), CDUM(1), -1, IERR )
LWORK_ZGEBRD_MN = INT( CDUM(1) )
This matches how *gesvd was changed in LAPACK 3.4.0.
2) Changed the Workspace: comments, which were incredibly deceptive.
For instance, in Path 2 before dbdsdc, it said
Workspace: need N + N*N + BDSPAC
since dbdsdc needs the [e] vector, [U] matrix, and bdspac.
However, that ignores that the [tauq, taup] vectors and [R] matrix
are also already allocated, though dbdsdc doesn't need them.
So the workspace needed at that point in the code is actually
Workspace: need N*N [R] + 3*N [e, tauq, taup] + N*N [U] + BDSPAC
For clarity, I added in [brackets] what matrices or vectors were allocated,
and the order reflects their order in memory.
I may do a similar change for *gesvd eventually. The workspace comments in
MAGMA's *gesvd have already been updated as above.
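
The bracketed accounting in change 2 can be sketched as a running offset table. This is an illustration under stated assumptions: the helper name is hypothetical, and the BDSPAC value is an arbitrary placeholder, not the value the code computes.

```python
def layout(segments):
    """Assign consecutive 1-based offsets to named segments, in memory order."""
    offsets, start = {}, 1
    for name, size in segments:
        offsets[name] = start
        start += size
    return offsets, start - 1  # (offset of each segment, total length needed)


n = 4
bdspac = 100  # placeholder for illustration only

# Workspace: need N*N [R] + 3*N [e, tauq, taup] + N*N [U] + BDSPAC
offsets, total = layout([
    ("R", n * n),
    ("e", n),
    ("tauq", n),
    ("taup", n),
    ("U", n * n),
    ("bdsdc_work", bdspac),
])

# The old comment counted only what dbdsdc itself needs: N + N*N + BDSPAC.
old_comment = n + n * n + bdspac
```

Here total is 144 while the old comment gives 120. The gap is the space for [R], [tauq] and [taup], which is already allocated at that point even though dbdsdc does not use it.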
================================================================================
a) Throughout, to simplify equations, let:
mn = min( M, N )
mx = max( M, N )
================================================================================
b) [sdcz]gesdd Path 4 (m >> n, job="all") has wrong minwrk formula in code:
minwrk = bdspac + mn*mn + 2*mn + mx
= 4*mn*mn + 6*mn + mx
This is an overestimate, needlessly rejecting the documented formula:
doc = 4*mn*mn + 7*mn
In complex, the correct min fails, but the doc matches the wrong minwrk.
Solution: fix code to:
minwrk = mn*mn + max( 3*mn + bdspac, mn + mx )
= mn*mn + max( 3*mn*mn + 7*mn, mn + mx )
Test cases:
m=40, ..., 100; n=20; jobz='A'
See bug (d) showing documentation is also wrong.
Also, see bug (i), complex [cz]gesdd should return -12 instead of -13.
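
A small arithmetic check of item (b), as a sketch. It assumes bdspac = 3*mn*mn + 4*mn, which is the value implied by the two forms of the old minwrk given above. The function names are hypothetical.

```python
def bdspac(mn):
    # Implied by: bdspac + mn*mn + 2*mn + mx = 4*mn*mn + 6*mn + mx
    return 3 * mn * mn + 4 * mn


def minwrk_old(m, n):
    """Path 4 minwrk as computed by the old code."""
    mn, mx = min(m, n), max(m, n)
    return bdspac(mn) + mn * mn + 2 * mn + mx


def minwrk_fixed(m, n):
    """Path 4 minwrk after the proposed fix."""
    mn, mx = min(m, n), max(m, n)
    return mn * mn + max(3 * mn + bdspac(mn), mn + mx)


def lwork_doc(m, n):
    """Documented size: 4*mn*mn + 7*mn."""
    mn = min(m, n)
    return 4 * mn * mn + 7 * mn


# Test cases from the text: m = 40, ..., 100; n = 20; jobz = 'A'
rows = [(m, minwrk_old(m, 20), minwrk_fixed(m, 20), lwork_doc(m, 20))
        for m in range(40, 101, 10)]
```

For every m in that range the old minwrk exceeds the documented size, so a caller following the documentation is rejected, while the fixed minwrk equals the documented size.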
================================================================================
bt) transposed case
[sd]gesdd Path 4t (n >> m, job="all") has a different wrong minwrk; see bug (c).
[cz]gesdd Path 4t exhibits same bug as Path 4.
Test cases:
m=20; n=40, ..., 100; jobz='A'
================================================================================
c) [sd]gesdd Path 4t (n >> m, job="all") has wrong minwrk formula in code,
different than bug (b):
minwrk = bdspac + m*m + 3*m
= 4*mn*mn + 7*mn
This formula lacks any dependence on N, so the code will fail (without
setting info; orglq calls xerbla) if N is too large, N > 3*M*M + 6*M.
Bug does not occur in complex.
Test cases:
m=20; n = 1320; jobz='A' ok with documented lwork
m=20; n > 1320; jobz='A' fails with documented lwork
Solution: as in bug (b), fix code to:
minwrk = mn*mn + max( 3*mn + bdspac, mn + mx )
= mn*mn + max( 3*mn*mn + 7*mn, mn + mx )
See bug (d) showing documentation is also wrong.
================================================================================
d) [sd]gesdd documentation lists the same minimum size for jobz='S' and 'A':
If JOBZ = 'S' or 'A', LWORK >= min(M,N)*(7 + 4*min(M,N))
However, jobz='A' actually also depends on max(M,N):
minwrk = mn*mn + max( 3*mn*mn + 7*mn, mn + mx )
This causes the formula to fail for mx > 3*mn*mn + 6*mn.
Test cases:
m > 1320; n = 20; jobz='A' fails with documented lwork, even after fixing bugs (b) and (c).
m = 20; n > 1320; jobz='A' fails also.
Solution: in docs, split these two cases. This fix uses an overestimate,
so that codes using it will be backwards compatible with LAPACK <= 3.6.
If JOBZ = 'S', LWORK >= 4*mn*mn + 7*mn.
If JOBZ = 'A', LWORK >= 4*mn*mn + 6*mn + mx.
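
The threshold in item (d) can be checked numerically with the formulas quoted above. This is a sketch with hypothetical function names.

```python
def minwrk_jobz_a(m, n):
    """minwrk for jobz='A': mn*mn + max(3*mn*mn + 7*mn, mn + mx)."""
    mn, mx = min(m, n), max(m, n)
    return mn * mn + max(3 * mn * mn + 7 * mn, mn + mx)


def doc_old(m, n):
    """Old documentation for jobz='S' or 'A': min(M,N)*(7 + 4*min(M,N))."""
    mn = min(m, n)
    return mn * (7 + 4 * mn)


def doc_new_a(m, n):
    """Proposed documentation for jobz='A': 4*mn*mn + 6*mn + mx."""
    mn, mx = min(m, n), max(m, n)
    return 4 * mn * mn + 6 * mn + mx


def first_failing_mx(mn):
    """Smallest mx for which the old documented size is below minwrk."""
    mx = mn
    while doc_old(mx, mn) >= minwrk_jobz_a(mx, mn):
        mx += 1
    return mx


threshold = first_failing_mx(20)
```

For mn = 20 the search stops at mx = 1321, which agrees with the condition mx > 3*mn*mn + 6*mn = 1320. The proposed jobz='A' size stays at or above minwrk on both sides of that threshold.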
================================================================================
e) [sd]gesdd, Path 5, jobz='A' has wrong maxwrk formula in the code:
MAXWRK = MAX( MAXWRK, BDSPAC + 3*N )
Should be:
MAXWRK = MAX( WRKBL, BDSPAC + 3*N )
This causes the lwork query to ignore WRKBL, and return the minimum
workspace size, BDSPAC + 3*N, instead of the optimal workspace size.
However, it only affects the result for small sizes where min(M,N) < NB.
Path 5t has the correct maxwrk formula.
Complex is correct for both Path 5 and 5t.
Test case:
Compare lwork query with
M = 30, N = 20, jobz='A', lwork query is 1340
M = 20, N = 30, jobz='A', lwork query is 3260
These should be the same.
Solution: fix code as above.
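
A sketch of why the one-word change in item (e) matters. Two assumptions are made here and are not taken from the code: MAXWRK still holds a small initial value when the statement runs (consistent with the query returning BDSPAC + 3*N), and WRKBL is given the value 3260 reported for the transposed query. BDSPAC uses the value 3*N*N + 4*N implied by item (b).

```python
def lwork_query_path5(n, wrkbl, fixed):
    """Model the final MAXWRK statement of [sd]gesdd Path 5, jobz='A'."""
    bdspac = 3 * n * n + 4 * n   # implied by the formulas in item (b)
    maxwrk = 1                   # assumption: not yet updated on this path
    if fixed:
        maxwrk = max(wrkbl, bdspac + 3 * n)
    else:
        maxwrk = max(maxwrk, bdspac + 3 * n)   # WRKBL is never consulted
    return maxwrk


wrkbl = 3260  # assumption: borrowed from the Path 5t query result above
before = lwork_query_path5(20, wrkbl, fixed=False)
after = lwork_query_path5(20, wrkbl, fixed=True)
```

Under these assumptions the unfixed statement returns 1340, the value reported for M = 30, N = 20, and the fixed statement returns the larger WRKBL.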
================================================================================
f) Not a bug, just a suggestion.
The lwork minimum sizes are not actually minimums, and can be larger than
the queried lwork size.
Solution: add a comment:
These are not tight minimums in all cases; see comments inside code.
================================================================================
g) [sd]bdsdc segfaults due to too small workspace size. Its documentation claims:
If COMPQ = 'N' then LWORK >= (4 * N).
Based on this, in zgesdd, the rwork size >= 5*min(M,N).
However, LAPACK bug 111 found that rwork size >= 7*min(M,N) was required.
In dbdsdc, if uplo='L', then it rotates lower bidiagonal to upper bidiagonal,
and saves 2 vectors of Givens rotations in work. It shifts WSTART from
1 to 2*N-1. Then it calls dlasdq( ..., work( wstart ), info ).
As dlasdq requires 4*N, dbdsdc would now require 6*N in this case.
This caused zgesdd to require rwork size >= 7*min(M,N) when N > M and jobz='N'.
My preferred solution is to change WSTART to 1 in the DLASDQ call inside dbdsdc:
IF( ICOMPQ.EQ.0 ) THEN
CALL DLASDQ( 'U', 0, N, 0, 0, 0, D, E, VT, LDVT, U, LDU, U,
$ LDU, WORK( WSTART ), INFO )
GO TO 40
END IF
to:
IF( ICOMPQ.EQ.0 ) THEN
* Ignores WSTART, which is needed only for ICOMPQ = 1 or 2;
* using WSTART would change required workspace to 6*N for uplo='L'.
CALL DLASDQ( 'U', 0, N, 0, 0, 0, D, E, VT, LDVT, U, LDU, U,
$ LDU, WORK( 1 ), INFO )
GO TO 40
END IF
The [cz]gesdd documentation, which was changed to 7*min(M,N) in LAPACK 3.6,
may be reverted to 5*min(M,N), if desired.
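
The workspace arithmetic in item (g) can be sketched as follows. The helper is hypothetical and models a 1-based work array from which dlasdq uses 4*N elements starting at the offset it receives.

```python
def work_needed(n, wstart):
    """Length of WORK touched when dlasdq is handed WORK(wstart)."""
    dlasdq_need = 4 * n
    return wstart - 1 + dlasdq_need


n = 10
shifted = work_needed(n, 2 * n - 1)  # uplo='L': WSTART moved from 1 to 2*N-1
unshifted = work_needed(n, 1)        # proposed fix: pass WORK( 1 )
```

With the shifted start the call touches 6*N - 2 elements, which exceeds the documented 4*N and is covered by the 6*N figure quoted above. Passing WORK( 1 ) brings the requirement back to 4*N.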
================================================================================
h) [sd]gesdd for jobz='N' requires bdspac = 7*n for the dbdsdc workspace.
However, dbdsdc requires only 4*n, or 6*n before fixing bug (g).
For backwards compatibility, I did not change the code, but added a comment
for clarification.
================================================================================
i) [cz]gesdd returns info = -13 instead of info = -12 for lwork errors.
================================================================================
j) In zgesdd, for computing maxwrk, these paths:
Path 6, jobz=A
Path 6t, jobz=S
Path 6t, jobz=A
query ilaenv( 1, zungbr, ... )
when the code actually calls zunmbr (twice). I corrected it.
================================================================================
k) In zgesdd documentation, currently
lrwork >= max( 5*mn*mn + 7*mn, 2*mx*mn + 2*mn*mn + mn )
It doesn't need that much, particularly for (mx >> mn) case.
If (mx >> mn), lrwork >= 5*mn*mn + 5*mn;
else, lrwork >= max( 5*mn*mn + 5*mn,
2*mx*mn + 2*mn*mn + mn ).
I changed this in the documentation. Feel free to revert if you prefer.
================================================================================
m) [cz]gesvd, Path 10 and 10t, have minwrk inside the wrong conditional:
IF( .NOT.WNTVN ) THEN
MAXWRK = MAX( MAXWRK, 2*N+LWORK_ZUNGBR_P )
MINWRK = 2*N + M
END IF
So Path 10 with jobvt='N', and Path 10t with jobu='N', have minwrk = 1,
so an invalid lwork is not correctly rejected.
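
The control-flow problem in item (m) can be sketched as below. The function is a simplified model, not the routine itself: the initial values of 1 and the LWORK_ZUNGBR_P value are assumptions, and moving the MINWRK assignment out of the conditional is one plausible correction, not necessarily the exact change that was committed.

```python
def workspace_sizes(m, n, wntvn, lwork_zungbr_p, fixed):
    """Model the Path 10 minwrk/maxwrk computation of [cz]gesvd."""
    minwrk, maxwrk = 1, 1
    if fixed:
        minwrk = 2 * n + m          # set regardless of jobvt
    if not wntvn:
        maxwrk = max(maxwrk, 2 * n + lwork_zungbr_p)
        if not fixed:
            minwrk = 2 * n + m      # old code: only reached when jobvt != 'N'
    return minwrk, maxwrk


def lwork_rejected(lwork, minwrk):
    """True when the lwork check would report an error."""
    return lwork < minwrk


m, n = 30, 20
old_min, _ = workspace_sizes(m, n, wntvn=True, lwork_zungbr_p=640, fixed=False)
new_min, _ = workspace_sizes(m, n, wntvn=True, lwork_zungbr_p=640, fixed=True)
```

With jobvt='N' the old logic leaves minwrk at 1, so an lwork of 10 passes the check. With the assignment outside the conditional, minwrk is 2*N + M = 70 and the same lwork is rejected.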
================================================================================
mt) transposed case
broken: with old routine, Path 10t with jobu='N' doesn't enforce minwrk
|
|
http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4937
Thanks to Eugene Chereshnev (Intel)
|
|
Used A(1,1) and T(1,1) in call to *LARFG (around line 177)
to make arguments 2 and 5 scalars rather than 2d-arrays.
|
|
The work array, RWORK, is declared COMPLEX; it should be REAL.
|
|
on March 1st 2016
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 42/42] Fix lapacke_?tprfb - avoid nancheck of unset data
---> LAST ONE!!!
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 41/42] Fix lapacke_?tpmqrt - avoid nancheck of unset data
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 40/42] Fix lapacke_?steqr - avoid nancheck of z when compz=='i'
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 39/42] Fix lapacke_?hetri2x - avoid nancheck of unset data
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 38/42] Fix lapacke_???swapr* - missing parameter lda, and more
- lda should not be missed
- a has leading dimension lda, not n
- a_t has leading dimension lda_t, not n
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 37/42] Fix lapacke_?gemqrt - avoid nancheck of unset data
|
|
(dmitry.g.baksheev@intel.com)
Part of [PATCH 36/42]
- nancheck of input shall cover n-by-n, not lda-by-n
|
|
(dmitry.g.baksheev@intel.com)
!!! NOT APPLYING PATCH [PATCH 36/42] !!! Fix lapacke_?syconv - parameter work is not part of hi-level interface
See revision 1609 | langou | 2015-10-28 22:06:14 -0700 (Wed, 28 Oct 2015)
"In ?syconv, replace the variable name WORK by the variable name E. E is the
standard way to name the superdiagonal/subdiagonal of a symmetric tridiagonal
matrix. Also, E (previously WORK) is of size N-1, not N. So correct this in
the comment."
Updated LAPACKE with new parameter name.
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 34/42] Fix lapacke_?sytri2 - work is real, not complex
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 33/42] Fix lapacke_?bdsvdx - vl,vu are real; ns must be passed by ref
- vl and vu are real kind
- ns is [out] and must be passed by reference
- lwork must be at least 1
- e is n-1 array
- ldz should be compared to ncols_z for r-major case
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 32/42] Fix lapacke_?{un,or}csd2by1 - theta is real, and use ld*_t
- theta is real, not complex
- use ld*_t for calling LAPACK in row-major case, not ld*
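
Why the transposed leading dimension matters in the row-major case can be sketched as follows. The helper names are hypothetical and this is not the LAPACKE source. The choice lda_t = max(1, m) is an assumption about how the column-major copy is sized.

```python
def to_col_major(a_row, m, n, lda, lda_t):
    """Copy an m-by-n row-major matrix (row stride lda) into
    column-major storage with column stride lda_t."""
    a_t = [0.0] * (lda_t * n)
    for i in range(m):
        for j in range(n):
            a_t[i + j * lda_t] = a_row[i * lda + j]
    return a_t


def col_major_index(i, j, ld):
    """Flat index of element (i, j) as a column-major routine computes it."""
    return i + j * ld


m, n, lda = 2, 3, 5                # row-major input with padded rows
a_row = [1.0, 2.0, 3.0, 0.0, 0.0,
         4.0, 5.0, 6.0, 0.0, 0.0]
lda_t = max(1, m)                  # assumption: tight column-major copy
a_t = to_col_major(a_row, m, n, lda, lda_t)

good = col_major_index(1, 2, lda_t)   # index used when lda_t is passed
bad = col_major_index(1, 2, lda)      # index used if lda were passed instead
```

With lda_t the routine reads element (1, 2) at index 5, which holds 6.0. With the row-major lda it would compute index 11, past the end of the 6-element copy.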
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 31/42] Fix lapacke_?tprfb - parameter work is not 'const'
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 30/42] Fix lapacke_?lascl - bugs and NaN checks
- nancheck content, not padding (e.g. M-by-N, not LDA-by-N)
- types L, U: use ?gb_nancheck(m,n), not ?tr_nancheck(n)
- type H: use ?gb_nancheck(m,n), not ?hs_nancheck(n)
- type Z: use ?gb_nancheck correctly, do not check unset data
- type Z: M-by-N should be checked
- type Z: A is 9th parameter
- nrows_a is needed for correct transposition
- info from LAPACK should be respected
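
The first point in the list above (check content, not padding) can be sketched as below, using hypothetical helpers that are not the LAPACKE nancheck functions. The matrix is stored column-major with leading dimension lda > m, and the padding rows hold unset data, modelled here as NaN.

```python
import math


def content_has_nan(a, m, n, lda):
    """Scan only the m-by-n content of a column-major array."""
    return any(math.isnan(a[i + j * lda]) for j in range(n) for i in range(m))


def padded_has_nan(a, n, lda):
    """Buggy variant: scans all lda rows of each column, padding included."""
    return any(math.isnan(a[i + j * lda]) for j in range(n) for i in range(lda))


m, n, lda = 2, 2, 3
nan = float("nan")
a = [1.0, 2.0, nan,    # column 0: two valid entries, one padding slot
     3.0, 4.0, nan]    # column 1

content_flag = content_has_nan(a, m, n, lda)
padded_flag = padded_has_nan(a, n, lda)
```

The content check reports no NaN, while the lda-by-n scan raises a false alarm on data the caller never set.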
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 29/42] Fix lapacke_?gb_nancheck - misuse of leading dimension as matrix size
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 28/42] Fix lapacke_?gesvj - correct eval of nrows_v
|
|
(dmitry.g.baksheev@intel.com)
[PATCH 27/42] Fix lapacke_?gesvdx* - NS is reference; VU, VL are real; ...
- NS must be passed by ref
- VU and VL shall be real, not integer
- LWORK must be at least 1, even for N=0
- correct allocation of WORK, JOBU/JOBVT cannot be 'A' or 'S'
- nrows/ncols = max(...,0), zero is possible, negatives are not
fixup lapacke_?gesvdx
|