#include <math.h>
#include "superlu_zdefs.h"
Functions/Subroutines | |
| void | pzgssvx_ABglobal (superlu_options_t *options, SuperMatrix *A, ScalePermstruct_t *ScalePermstruct, doublecomplex B[], int ldb, int nrhs, gridinfo_t *grid, LUstruct_t *LUstruct, double *berr, SuperLUStat_t *stat, int *info) |
-- Distributed SuperLU routine (version 1.0) -- Lawrence Berkeley National Lab, Univ. of California Berkeley. September 1, 1999
| void pzgssvx_ABglobal | ( | superlu_options_t * | options, | |
| SuperMatrix * | A, | |||
| ScalePermstruct_t * | ScalePermstruct, | |||
| doublecomplex | B[], | |||
| int | ldb, | |||
| int | nrhs, | |||
| gridinfo_t * | grid, | |||
| LUstruct_t * | LUstruct, | |||
| double * | berr, | |||
| SuperLUStat_t * | stat, | |||
| int * | info | |||
| ) |
Purpose =======
pzgssvx_ABglobal solves a system of linear equations A*X=B, by using Gaussian elimination with "static pivoting" to compute the LU factorization of A.
Static pivoting is a technique that combines the numerical stability of partial pivoting with the scalability of Cholesky (no pivoting), to run accurately and efficiently on large numbers of processors.
See our paper at http://www.nersc.gov/~xiaoye/SuperLU/ for a detailed description of the parallel algorithms.
Here are the options for using this code:
1. Independent of all the other options specified below, the
user must supply
On output, B is overwritten with the solution X.
2. Depending on options->Fact, the user has several options
for solving A*X=B. The standard option is for factoring
A "from scratch". (The other options, described below,
are used when A is sufficiently similar to a previously
solved problem to save time by reusing part or all of
the previous factorization.)
In this case the user must also supply
as well as the following options, which are described in more
detail below:
The outputs returned include
(part of ScalePermstruct may also need to be supplied on input,
depending on options->RowPerm and options->ColPerm as described
later).
A1 = Pc*Pr*diag(R)*A*diag(C)*Pc^T = L*U
(Note that A1 = Aout * Pc^T, where Aout is the matrix stored
in A on output.)
3. The second value of options->Fact assumes that a matrix with the same
sparsity pattern as A has already been factored:
In this case the user must still specify the following options
as before:
but not options->ColPerm, whose value is ignored. This is because the
previous column permutation from ScalePermstruct->perm_c is used as
input. The user must also supply
The outputs returned include
4. The third value of options->Fact assumes that a matrix B with the same
sparsity pattern as A has already been factored, and where the
row permutation of B can be reused for A. This is useful when A and B
have similar numerical values, so that the same row permutation
will make both factorizations numerically stable. This lets us reuse
all of the previously computed structure of L and U.
In this case the user must still specify the following options
as before:
but not options->RowPerm or options->ColPerm, whose values are ignored.
This is because the permutations from ScalePermstruct->perm_r and
ScalePermstruct->perm_c are used as input.
The user must also supply
The outputs returned include
5. The fourth and last value of options->Fact assumes that A is
identical to a matrix that has already been factored on a previous
call, and reuses its entire LU factorization
In this case all the other options mentioned above are ignored
(options->Equil, options->RowPerm, options->ColPerm,
options->ReplaceTinyPivot)
The user must also supply
all of which are unmodified on output.
Arguments =========
options (input) superlu_options_t*
The structure defines the input parameters to control
how the LU decomposition will be performed.
The following fields should be defined for this structure:
o Fact (fact_t)
Specifies whether or not the factored form of the matrix
A is supplied on entry, and if not, how the matrix A should
be factorized based on the previous history.
= DOFACT: The matrix A will be factorized from scratch.
Inputs: A
options->Equil, RowPerm, ColPerm, ReplaceTinyPivot
Outputs: modified A
(possibly row and/or column scaled and/or
permuted)
all of ScalePermstruct
all of LUstruct
= SamePattern: the matrix A will be factorized assuming
that a factorization of a matrix with the same sparsity
pattern was performed prior to this one. Therefore, this
factorization will reuse column permutation vector
ScalePermstruct->perm_c and the elimination tree
LUstruct->etree
Inputs: A
options->Equil, RowPerm, ReplaceTinyPivot
ScalePermstruct->perm_c
LUstruct->etree
Outputs: modified A
(possibly row and/or column scaled and/or
permuted)
rest of ScalePermstruct (DiagScale, R, C, perm_r)
rest of LUstruct (GLU_persist, Llu)
= SamePattern_SameRowPerm: the matrix A will be factorized
assuming that a factorization of a matrix with the same
sparsity pattern and similar numerical values was performed
prior to this one. Therefore, this factorization will reuse
both row and column scaling factors R and C, and the
both row and column permutation vectors perm_r and perm_c,
distributed data structure set up from the previous symbolic
factorization.
Inputs: A
options->Equil, ReplaceTinyPivot
all of ScalePermstruct
all of LUstruct
Outputs: modified A
(possibly row and/or column scaled and/or
permuted)
modified LUstruct->Llu
= FACTORED: the matrix A is already factored.
Inputs: all of ScalePermstruct
all of LUstruct
o Equil (yes_no_t)
Specifies whether to equilibrate the system.
= NO: no equilibration.
= YES: scaling factors are computed to equilibrate the system:
diag(R)*A*diag(C)*inv(diag(C))*X = diag(R)*B.
Whether or not the system will be equilibrated depends
on the scaling of the matrix A, but if equilibration is
used, A is overwritten by diag(R)*A*diag(C) and B by
diag(R)*B.
o RowPerm (rowperm_t)
Specifies how to permute rows of the matrix A.
= NATURAL: use the natural ordering.
= LargeDiag: use the Duff/Koster algorithm to permute rows of
the original matrix to make the diagonal large
relative to the off-diagonal.
= MY_PERMR: use the ordering given in ScalePermstruct->perm_r
input by the user.
o ColPerm (colperm_t)
Specifies what type of column permutation to use to reduce fill.
= NATURAL: natural ordering.
= MMD_AT_PLUS_A: minimum degree ordering on structure of A'+A.
= MMD_ATA: minimum degree ordering on structure of A'*A.
= MY_PERMC: the ordering given in ScalePermstruct->perm_c.
o ReplaceTinyPivot (yes_no_t)
= NO: do not modify pivots
= YES: replace tiny pivots by sqrt(epsilon)*norm(A) during
LU factorization.
o IterRefine (IterRefine_t)
Specifies how to perform iterative refinement.
= NO: no iterative refinement.
= DOUBLE: accumulate residual in double precision.
= EXTRA: accumulate residual in extra precision.
NOTE: all options must be indentical on all processes when
calling this routine.
A (input/output) SuperMatrix*
On entry, matrix A in A*X=B, of dimension (A->nrow, A->ncol).
The number of linear equations is A->nrow. The type of A must be:
Stype = SLU_NC; Dtype = SLU_Z; Mtype = SLU_GE. That is, A is stored in
compressed column format (also known as Harwell-Boeing format).
See supermatrix.h for the definition of 'SuperMatrix'.
This routine only handles square A, however, the LU factorization
routine pzgstrf can factorize rectangular matrices.
On exit, A may be overwritten by Pc*Pr*diag(R)*A*diag(C),
depending on ScalePermstruct->DiagScale, options->RowPerm and
options->colpem:
if ScalePermstruct->DiagScale != NOEQUIL, A is overwritten by
diag(R)*A*diag(C).
if options->RowPerm != NATURAL, A is further overwritten by
Pr*diag(R)*A*diag(C).
if options->ColPerm != NATURAL, A is further overwritten by
Pc*Pr*diag(R)*A*diag(C).
If all the above condition are true, the LU decomposition is
performed on the matrix Pc*Pr*diag(R)*A*diag(C)*Pc^T.
NOTE: Currently, A must reside in all processes when calling
this routine.
ScalePermstruct (input/output) ScalePermstruct_t*
The data structure to store the scaling and permutation vectors
describing the transformations performed to the matrix A.
It contains the following fields:
o DiagScale (DiagScale_t)
Specifies the form of equilibration that was done.
= NOEQUIL: no equilibration.
= ROW: row equilibration, i.e., A was premultiplied by
diag(R).
= COL: Column equilibration, i.e., A was postmultiplied
by diag(C).
= BOTH: both row and column equilibration, i.e., A was
replaced by diag(R)*A*diag(C).
If options->Fact = FACTORED or SamePattern_SameRowPerm,
DiagScale is an input argument; otherwise it is an output
argument.
o perm_r (int*)
Row permutation vector, which defines the permutation matrix Pr;
perm_r[i] = j means row i of A is in position j in Pr*A.
If options->RowPerm = MY_PERMR, or
options->Fact = SamePattern_SameRowPerm, perm_r is an
input argument; otherwise it is an output argument.
o perm_c (int*)
Column permutation vector, which defines the
permutation matrix Pc; perm_c[i] = j means column i of A is
in position j in A*Pc.
If options->ColPerm = MY_PERMC or options->Fact = SamePattern
or options->Fact = SamePattern_SameRowPerm, perm_c is an
input argument; otherwise, it is an output argument.
On exit, perm_c may be overwritten by the product of the input
perm_c and a permutation that postorders the elimination tree
of Pc*A'*A*Pc'; perm_c is not changed if the elimination tree
is already in postorder.
o R (double*) dimension (A->nrow)
The row scale factors for A.
If DiagScale = ROW or BOTH, A is multiplied on the left by
diag(R).
If DiagScale = NOEQUIL or COL, R is not defined.
If options->Fact = FACTORED or SamePattern_SameRowPerm, R is
an input argument; otherwise, R is an output argument.
o C (double*) dimension (A->ncol)
The column scale factors for A.
If DiagScale = COL or BOTH, A is multiplied on the right by
diag(C).
If DiagScale = NOEQUIL or ROW, C is not defined.
If options->Fact = FACTORED or SamePattern_SameRowPerm, C is
an input argument; otherwise, C is an output argument.
B (input/output) doublecomplex*
On entry, the right-hand side matrix of dimension (A->nrow, nrhs).
On exit, the solution matrix if info = 0;
NOTE: Currently, B must reside in all processes when calling
this routine.
ldb (input) int (global)
The leading dimension of matrix B.
nrhs (input) int (global)
The number of right-hand sides.
If nrhs = 0, only LU decomposition is performed, the forward
and back substitutions are skipped.
grid (input) gridinfo_t*
The 2D process mesh. It contains the MPI communicator, the number
of process rows (NPROW), the number of process columns (NPCOL),
and my process rank. It is an input argument to all the
parallel routines.
Grid can be initialized by subroutine SUPERLU_GRIDINIT.
See superlu_zdefs.h for the definition of 'gridinfo_t'.
LUstruct (input/output) LUstruct_t*
The data structures to store the distributed L and U factors.
It contains the following fields:
o etree (int*) dimension (A->ncol)
Elimination tree of Pc*(A'+A)*Pc' or Pc*A'*A*Pc', dimension A->ncol.
It is computed in sp_colorder() during the first factorization,
and is reused in the subsequent factorizations of the matrices
with the same nonzero pattern.
On exit of sp_colorder(), the columns of A are permuted so that
the etree is in a certain postorder. This postorder is reflected
in ScalePermstruct->perm_c.
NOTE:
Etree is a vector of parent pointers for a forest whose vertices
are the integers 0 to A->ncol-1; etree[root]==A->ncol.
o Glu_persist (Glu_persist_t*)
Global data structure (xsup, supno) replicated on all processes,
describing the supernode partition in the factored matrices
L and U:
xsup[s] is the leading column of the s-th supernode,
supno[i] is the supernode number to which column i belongs.
o Llu (LocalLU_t*)
The distributed data structures to store L and U factors.
See superlu_ddefs.h for the definition of 'LocalLU_t'.
berr (output) double*, dimension (nrhs)
The componentwise relative backward error of each solution
vector X(j) (i.e., the smallest relative change in
any element of A or B that makes X(j) an exact solution).
stat (output) SuperLUStat_t*
Record the statistics on runtime and floating-point operation count.
See util.h for the definition of 'SuperLUStat_t'.
info (output) int*
= 0: successful exit
> 0: if info = i, and i is
<= A->ncol: U(i,i) is exactly zero. The factorization has
been completed, but the factor U is exactly singular,
so the solution could not be computed.
> A->ncol: number of bytes allocated when memory allocation
failure occurred, plus A->ncol.
See superlu_zdefs.h for the definitions of various data types.
1.5.5