Man page - hetrd_2stage(3)
NAME
hetrd_2stage - {he,sy}trd_2stage: reduction to tridiagonal, 2-stage
SYNOPSIS
Functions
subroutine chetrd_2stage (vect, uplo, n, a, lda, d, e, tau, hous2, lhous2, work, lwork, info)
CHETRD_2STAGE
subroutine dsytrd_2stage (vect, uplo, n, a, lda, d, e, tau, hous2, lhous2, work, lwork, info)
DSYTRD_2STAGE
subroutine ssytrd_2stage (vect, uplo, n, a, lda, d, e, tau, hous2, lhous2, work, lwork, info)
SSYTRD_2STAGE
subroutine zhetrd_2stage (vect, uplo, n, a, lda, d, e, tau, hous2, lhous2, work, lwork, info)
ZHETRD_2STAGE
Detailed Description
Function Documentation
subroutine chetrd_2stage (character vect, character uplo, integer n, complex, dimension( lda, * ) a, integer lda, real, dimension( * ) d, real, dimension( * ) e, complex, dimension( * ) tau, complex, dimension( * ) hous2, integer lhous2, complex, dimension( * ) work, integer lwork, integer info)
CHETRD_2STAGE
Purpose:
CHETRD_2STAGE reduces a complex Hermitian matrix A to real symmetric
tridiagonal form T by a unitary similarity transformation:
Q1**H Q2**H * A * Q2 * Q1 = T.
Parameters
VECT
VECT is CHARACTER*1
= 'N': No need for the Householder representation,
in particular for the second stage (Band to
tridiagonal), and thus LHOUS2 is of size max(1, 4*N);
= 'V': the Householder representation is needed to
either generate Q1 Q2 or to apply Q1 Q2,
then LHOUS2 is to be queried and computed.
(NOT AVAILABLE IN THIS RELEASE).
UPLO
UPLO is CHARACTER*1
= 'U': Upper triangle of A is stored;
= 'L': Lower triangle of A is stored.
N
N is INTEGER
The order of the matrix A. N >= 0.
A
A is COMPLEX array, dimension (LDA,N)
On entry, the Hermitian matrix A. If UPLO = 'U', the leading
N-by-N upper triangular part of A contains the upper
triangular part of the matrix A, and the strictly lower
triangular part of A is not referenced. If UPLO = 'L', the
leading N-by-N lower triangular part of A contains the lower
triangular part of the matrix A, and the strictly upper
triangular part of A is not referenced.
On exit, if UPLO = 'U', the band superdiagonal
of A is overwritten by the corresponding elements of the
internal band-diagonal matrix AB, and the elements above
the KD superdiagonal, with the array TAU, represent the unitary
matrix Q1 as a product of elementary reflectors; if UPLO
= 'L', the diagonal and band subdiagonal of A are
overwritten by the corresponding elements of the internal band-diagonal
matrix AB, and the elements below the KD subdiagonal, with
the array TAU, represent the unitary matrix Q1 as a product
of elementary reflectors. See Further Details.
LDA
LDA is INTEGER
The leading dimension of the array A. LDA >= max(1,N).
D
D is REAL array, dimension (N)
The diagonal elements of the tridiagonal matrix T.
E
E is REAL array, dimension (N-1)
The off-diagonal elements of the tridiagonal matrix T.
TAU
TAU is COMPLEX array, dimension (N-KD)
The scalar factors of the elementary reflectors of
the first stage (see Further Details).
HOUS2
HOUS2 is COMPLEX array, dimension (MAX(1,LHOUS2))
Stores the Householder representation of the stage2
band to tridiagonal.
LHOUS2
LHOUS2 is INTEGER
The dimension of the array HOUS2.
LHOUS2 >= 1.
If LWORK = -1, or LHOUS2 = -1,
then a query is assumed; the routine
only calculates the optimal size of the HOUS2 array, returns
this value as the first entry of the HOUS2 array, and no error
message related to LHOUS2 is issued by XERBLA.
If VECT='N', LHOUS2 = max(1, 4*n);
if VECT='V', option not yet available.
WORK
WORK is COMPLEX array, dimension (MAX(1,LWORK))
On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
LWORK
LWORK is INTEGER
The dimension of the array WORK.
If N = 0, LWORK >= 1, else LWORK = MAX(1, dimension).
If LWORK = -1, or LHOUS2 = -1,
then a workspace query is assumed; the routine
only calculates the optimal size of the WORK array, returns
this value as the first entry of the WORK array, and no error
message related to LWORK is issued by XERBLA.
LWORK = MAX(1, dimension) where
dimension = max(stage1,stage2) + (KD+1)*N
          = N*KD + N*max(KD+1,FACTOPTNB)
            + max(2*KD*KD, KD*NTHREADS)
            + (KD+1)*N
where KD is the blocking size of the reduction,
FACTOPTNB is the blocking used by the QR or LQ
algorithm (usually FACTOPTNB=128 is a good choice), and
NTHREADS is the number of threads used when
OpenMP compilation is enabled, otherwise =1.
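The LWORK formula above can be evaluated directly. The following is a minimal sketch in Python, not part of LAPACK: the function name lwork_dimension is hypothetical, and KD, FACTOPTNB and NTHREADS are passed in as assumptions because the library normally chooses them internally. A workspace query with LWORK = -1 remains the reliable way to size WORK.

```python
def lwork_dimension(n, kd, factoptnb=128, nthreads=1):
    """Evaluate the documented LWORK formula (illustrative only).

    kd, factoptnb and nthreads are assumed values; the library
    normally selects them itself.
    """
    if n == 0:
        # Documented special case: if N = 0, LWORK >= 1.
        return 1
    dimension = (n * kd
                 + n * max(kd + 1, factoptnb)
                 + max(2 * kd * kd, kd * nthreads)
                 + (kd + 1) * n)
    return max(1, dimension)


# Example: N = 1000 with an assumed band width KD = 32.
print(lwork_dimension(1000, 32))
```

The value counts array elements of the routine's data type, not bytes.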
INFO
INFO is INTEGER
= 0: successful exit
< 0: if INFO = -i, the i-th argument had an illegal value
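A negative INFO identifies the offending argument by its position in the call. As a rough illustration, the hypothetical helper below (not part of LAPACK) maps an INFO value to the argument name, using the argument order from the subroutine signature above.

```python
# Argument order copied from the subroutine signature.
ARGUMENTS = ("VECT", "UPLO", "N", "A", "LDA", "D", "E", "TAU",
             "HOUS2", "LHOUS2", "WORK", "LWORK", "INFO")


def describe_info(info):
    """Translate an INFO value into a short message (hypothetical helper)."""
    if info == 0:
        return "successful exit"
    if info < 0 and -info <= len(ARGUMENTS):
        # INFO = -i refers to the i-th argument (1-based).
        return f"argument {-info} ({ARGUMENTS[-info - 1]}) had an illegal value"
    return f"unexpected INFO value {info}"


print(describe_info(-5))
```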
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Further Details:
Implemented by Azzam Haidar.
All details are available in the technical report and the SC11 and SC13 papers.
Azzam Haidar, Hatem Ltaief, and Jack Dongarra.
Parallel reduction to condensed forms for symmetric eigenvalue problems
using aggregated fine-grained and memory-aware kernels. In Proceedings
of 2011 International Conference for High Performance Computing,
Networking, Storage and Analysis (SC '11), New York, NY, USA,
Article 8, 11 pages.
http://doi.acm.org/10.1145/2063384.2063394
A. Haidar, J. Kurzak, P. Luszczek, 2013.
An improved parallel singular value algorithm and its implementation
for multicore hardware. In Proceedings of 2013 International Conference
for High Performance Computing, Networking, Storage and Analysis (SC '13).
Denver, Colorado, USA, 2013.
Article 90, 12 pages.
http://doi.acm.org/10.1145/2503210.2503292
A. Haidar, R. Solca, S. Tomov, T. Schulthess and J. Dongarra.
A novel hybrid CPU-GPU generalized eigensolver for electronic structure
calculations based on fine-grained memory aware tasks.
International Journal of High Performance Computing Applications.
Volume 28 Issue 2, Pages 196-209, May 2014.
http://hpc.sagepub.com/content/28/2/196
subroutine dsytrd_2stage (character vect, character uplo, integer n, double precision, dimension( lda, * ) a, integer lda, double precision, dimension( * ) d, double precision, dimension( * ) e, double precision, dimension( * ) tau, double precision, dimension( * ) hous2, integer lhous2, double precision, dimension( * ) work, integer lwork, integer info)
DSYTRD_2STAGE
Purpose:
DSYTRD_2STAGE reduces a real symmetric matrix A to real symmetric
tridiagonal form T by an orthogonal similarity transformation:
Q1**T Q2**T * A * Q2 * Q1 = T.
Parameters
VECT
VECT is CHARACTER*1
= 'N': No need for the Householder representation,
in particular for the second stage (Band to
tridiagonal), and thus LHOUS2 is of size max(1, 4*N);
= 'V': the Householder representation is needed to
either generate Q1 Q2 or to apply Q1 Q2,
then LHOUS2 is to be queried and computed.
(NOT AVAILABLE IN THIS RELEASE).
UPLO
UPLO is CHARACTER*1
= 'U': Upper triangle of A is stored;
= 'L': Lower triangle of A is stored.
N
N is INTEGER
The order of the matrix A. N >= 0.
A
A is DOUBLE PRECISION array, dimension (LDA,N)
On entry, the symmetric matrix A. If UPLO = 'U', the leading
N-by-N upper triangular part of A contains the upper
triangular part of the matrix A, and the strictly lower
triangular part of A is not referenced. If UPLO = 'L', the
leading N-by-N lower triangular part of A contains the lower
triangular part of the matrix A, and the strictly upper
triangular part of A is not referenced.
On exit, if UPLO = 'U', the band superdiagonal
of A is overwritten by the corresponding elements of the
internal band-diagonal matrix AB, and the elements above
the KD superdiagonal, with the array TAU, represent the orthogonal
matrix Q1 as a product of elementary reflectors; if UPLO
= 'L', the diagonal and band subdiagonal of A are
overwritten by the corresponding elements of the internal band-diagonal
matrix AB, and the elements below the KD subdiagonal, with
the array TAU, represent the orthogonal matrix Q1 as a product
of elementary reflectors. See Further Details.
LDA
LDA is INTEGER
The leading dimension of the array A. LDA >= max(1,N).
D
D is DOUBLE PRECISION array, dimension (N)
The diagonal elements of the tridiagonal matrix T.
E
E is DOUBLE PRECISION array, dimension (N-1)
The off-diagonal elements of the tridiagonal matrix T.
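D and E together describe the whole of T: D holds the N diagonal entries, and E holds the N-1 entries that appear on both the superdiagonal and the subdiagonal, since T is symmetric. A small Python sketch of how a caller might expand them into a dense matrix for inspection follows; the helper name tridiagonal_from_d_e is hypothetical and not part of LAPACK.

```python
def tridiagonal_from_d_e(d, e):
    """Build the dense symmetric tridiagonal matrix T from D and E."""
    n = len(d)
    if len(e) != max(n - 1, 0):
        raise ValueError("E must have N-1 entries")
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        t[i][i] = d[i]
    for i in range(n - 1):
        # The same off-diagonal value sits above and below the diagonal.
        t[i][i + 1] = e[i]
        t[i + 1][i] = e[i]
    return t


for row in tridiagonal_from_d_e([1.0, 2.0, 3.0], [4.0, 5.0]):
    print(row)
```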
TAU
TAU is DOUBLE PRECISION array, dimension (N-KD)
The scalar factors of the elementary reflectors of
the first stage (see Further Details).
HOUS2
HOUS2 is DOUBLE PRECISION array, dimension (MAX(1,LHOUS2))
Stores the Householder representation of the stage2
band to tridiagonal.
LHOUS2
LHOUS2 is INTEGER
The dimension of the array HOUS2.
LHOUS2 >= 1.
If LWORK = -1, or LHOUS2 = -1,
then a query is assumed; the routine
only calculates the optimal size of the HOUS2 array, returns
this value as the first entry of the HOUS2 array, and no error
message related to LHOUS2 is issued by XERBLA.
If VECT='N', LHOUS2 = max(1, 4*n);
if VECT='V', option not yet available.
WORK
WORK is DOUBLE PRECISION array, dimension (MAX(1,LWORK))
On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
LWORK
LWORK is INTEGER
The dimension of the array WORK.
If N = 0, LWORK >= 1, else LWORK = MAX(1, dimension).
If LWORK = -1, or LHOUS2 = -1,
then a workspace query is assumed; the routine
only calculates the optimal size of the WORK array, returns
this value as the first entry of the WORK array, and no error
message related to LWORK is issued by XERBLA.
LWORK = MAX(1, dimension) where
dimension = max(stage1,stage2) + (KD+1)*N
          = N*KD + N*max(KD+1,FACTOPTNB)
            + max(2*KD*KD, KD*NTHREADS)
            + (KD+1)*N
where KD is the blocking size of the reduction,
FACTOPTNB is the blocking used by the QR or LQ
algorithm (usually FACTOPTNB=128 is a good choice), and
NTHREADS is the number of threads used when
OpenMP compilation is enabled, otherwise =1.
INFO
INFO is INTEGER
= 0: successful exit
< 0: if INFO = -i, the i-th argument had an illegal value
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Further Details:
Implemented by Azzam Haidar.
All details are available in the technical report and the SC11 and SC13 papers.
Azzam Haidar, Hatem Ltaief, and Jack Dongarra.
Parallel reduction to condensed forms for symmetric eigenvalue problems
using aggregated fine-grained and memory-aware kernels. In Proceedings
of 2011 International Conference for High Performance Computing,
Networking, Storage and Analysis (SC '11), New York, NY, USA,
Article 8, 11 pages.
http://doi.acm.org/10.1145/2063384.2063394
A. Haidar, J. Kurzak, P. Luszczek, 2013.
An improved parallel singular value algorithm and its implementation
for multicore hardware. In Proceedings of 2013 International Conference
for High Performance Computing, Networking, Storage and Analysis (SC '13).
Denver, Colorado, USA, 2013.
Article 90, 12 pages.
http://doi.acm.org/10.1145/2503210.2503292
A. Haidar, R. Solca, S. Tomov, T. Schulthess and J. Dongarra.
A novel hybrid CPU-GPU generalized eigensolver for electronic structure
calculations based on fine-grained memory aware tasks.
International Journal of High Performance Computing Applications.
Volume 28 Issue 2, Pages 196-209, May 2014.
http://hpc.sagepub.com/content/28/2/196
subroutine ssytrd_2stage (character vect, character uplo, integer n, real, dimension( lda, * ) a, integer lda, real, dimension( * ) d, real, dimension( * ) e, real, dimension( * ) tau, real, dimension( * ) hous2, integer lhous2, real, dimension( * ) work, integer lwork, integer info)
SSYTRD_2STAGE
Purpose:
SSYTRD_2STAGE reduces a real symmetric matrix A to real symmetric
tridiagonal form T by an orthogonal similarity transformation:
Q1**T Q2**T * A * Q2 * Q1 = T.
Parameters
VECT
VECT is CHARACTER*1
= 'N': No need for the Householder representation,
in particular for the second stage (Band to
tridiagonal), and thus LHOUS2 is of size max(1, 4*N);
= 'V': the Householder representation is needed to
either generate Q1 Q2 or to apply Q1 Q2,
then LHOUS2 is to be queried and computed.
(NOT AVAILABLE IN THIS RELEASE).
UPLO
UPLO is CHARACTER*1
= 'U': Upper triangle of A is stored;
= 'L': Lower triangle of A is stored.
N
N is INTEGER
The order of the matrix A. N >= 0.
A
A is REAL array, dimension (LDA,N)
On entry, the symmetric matrix A. If UPLO = 'U', the leading
N-by-N upper triangular part of A contains the upper
triangular part of the matrix A, and the strictly lower
triangular part of A is not referenced. If UPLO = 'L', the
leading N-by-N lower triangular part of A contains the lower
triangular part of the matrix A, and the strictly upper
triangular part of A is not referenced.
On exit, if UPLO = 'U', the band superdiagonal
of A is overwritten by the corresponding elements of the
internal band-diagonal matrix AB, and the elements above
the KD superdiagonal, with the array TAU, represent the orthogonal
matrix Q1 as a product of elementary reflectors; if UPLO
= 'L', the diagonal and band subdiagonal of A are
overwritten by the corresponding elements of the internal band-diagonal
matrix AB, and the elements below the KD subdiagonal, with
the array TAU, represent the orthogonal matrix Q1 as a product
of elementary reflectors. See Further Details.
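The on-entry rule for A means only one triangle is ever read, so the other may hold arbitrary values. The Python sketch below illustrates which entries count as referenced for each UPLO setting; both helper names are hypothetical, indices are 0-based here (unlike Fortran), and accepting lowercase UPLO is an assumption of this sketch.

```python
def is_referenced(i, j, uplo):
    """True if entry (i, j), 0-based, lies in the triangle read on entry."""
    uplo = uplo.upper()
    if uplo == "U":
        return i <= j   # diagonal and above
    if uplo == "L":
        return i >= j   # diagonal and below
    raise ValueError("UPLO must be 'U' or 'L'")


def stored_triangle(a, uplo):
    """Copy of a with the unreferenced entries replaced by None."""
    n = len(a)
    return [[a[i][j] if is_referenced(i, j, uplo) else None
             for j in range(n)]
            for i in range(n)]


full = [[1.0, 2.0], [2.0, 3.0]]
print(stored_triangle(full, "U"))
print(stored_triangle(full, "L"))
```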
LDA
LDA is INTEGER
The leading dimension of the array A. LDA >= max(1,N).
D
D is REAL array, dimension (N)
The diagonal elements of the tridiagonal matrix T.
E
E is REAL array, dimension (N-1)
The off-diagonal elements of the tridiagonal matrix T.
TAU
TAU is REAL array, dimension (N-KD)
The scalar factors of the elementary reflectors of
the first stage (see Further Details).
HOUS2
HOUS2 is REAL array, dimension (MAX(1,LHOUS2))
Stores the Householder representation of the stage2
band to tridiagonal.
LHOUS2
LHOUS2 is INTEGER
The dimension of the array HOUS2.
LHOUS2 >= 1.
If LWORK = -1, or LHOUS2 = -1,
then a query is assumed; the routine
only calculates the optimal size of the HOUS2 array, returns
this value as the first entry of the HOUS2 array, and no error
message related to LHOUS2 is issued by XERBLA.
If VECT='N', LHOUS2 = max(1, 4*n);
if VECT='V', option not yet available.
WORK
WORK is REAL array, dimension (LWORK)
LWORK
LWORK is INTEGER
The dimension of the array WORK.
If N = 0, LWORK >= 1, else LWORK = MAX(1, dimension).
If LWORK = -1, or LHOUS2 = -1,
then a workspace query is assumed; the routine
only calculates the optimal size of the WORK array, returns
this value as the first entry of the WORK array, and no error
message related to LWORK is issued by XERBLA.
LWORK = MAX(1, dimension) where
dimension = max(stage1,stage2) + (KD+1)*N
          = N*KD + N*max(KD+1,FACTOPTNB)
            + max(2*KD*KD, KD*NTHREADS)
            + (KD+1)*N
where KD is the blocking size of the reduction,
FACTOPTNB is the blocking used by the QR or LQ
algorithm (usually FACTOPTNB=128 is a good choice), and
NTHREADS is the number of threads used when
OpenMP compilation is enabled, otherwise =1.
INFO
INFO is INTEGER
= 0: successful exit
< 0: if INFO = -i, the i-th argument had an illegal value
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Further Details:
Implemented by Azzam Haidar.
All details are available in the technical report and the SC11 and SC13 papers.
Azzam Haidar, Hatem Ltaief, and Jack Dongarra.
Parallel reduction to condensed forms for symmetric eigenvalue problems
using aggregated fine-grained and memory-aware kernels. In Proceedings
of 2011 International Conference for High Performance Computing,
Networking, Storage and Analysis (SC '11), New York, NY, USA,
Article 8, 11 pages.
http://doi.acm.org/10.1145/2063384.2063394
A. Haidar, J. Kurzak, P. Luszczek, 2013.
An improved parallel singular value algorithm and its implementation
for multicore hardware. In Proceedings of 2013 International Conference
for High Performance Computing, Networking, Storage and Analysis (SC '13).
Denver, Colorado, USA, 2013.
Article 90, 12 pages.
http://doi.acm.org/10.1145/2503210.2503292
A. Haidar, R. Solca, S. Tomov, T. Schulthess and J. Dongarra.
A novel hybrid CPU-GPU generalized eigensolver for electronic structure
calculations based on fine-grained memory aware tasks.
International Journal of High Performance Computing Applications.
Volume 28 Issue 2, Pages 196-209, May 2014.
http://hpc.sagepub.com/content/28/2/196
subroutine zhetrd_2stage (character vect, character uplo, integer n, complex*16, dimension( lda, * ) a, integer lda, double precision, dimension( * ) d, double precision, dimension( * ) e, complex*16, dimension( * ) tau, complex*16, dimension( * ) hous2, integer lhous2, complex*16, dimension( * ) work, integer lwork, integer info)
ZHETRD_2STAGE
Purpose:
ZHETRD_2STAGE reduces a complex Hermitian matrix A to real symmetric
tridiagonal form T by a unitary similarity transformation:
Q1**H Q2**H * A * Q2 * Q1 = T.
Parameters
VECT
VECT is CHARACTER*1
= 'N': No need for the Householder representation,
in particular for the second stage (Band to
tridiagonal), and thus LHOUS2 is of size max(1, 4*N);
= 'V': the Householder representation is needed to
either generate Q1 Q2 or to apply Q1 Q2,
then LHOUS2 is to be queried and computed.
(NOT AVAILABLE IN THIS RELEASE).
UPLO
UPLO is CHARACTER*1
= 'U': Upper triangle of A is stored;
= 'L': Lower triangle of A is stored.
N
N is INTEGER
The order of the matrix A. N >= 0.
A
A is COMPLEX*16 array, dimension (LDA,N)
On entry, the Hermitian matrix A. If UPLO = 'U', the leading
N-by-N upper triangular part of A contains the upper
triangular part of the matrix A, and the strictly lower
triangular part of A is not referenced. If UPLO = 'L', the
leading N-by-N lower triangular part of A contains the lower
triangular part of the matrix A, and the strictly upper
triangular part of A is not referenced.
On exit, if UPLO = 'U', the band superdiagonal
of A is overwritten by the corresponding elements of the
internal band-diagonal matrix AB, and the elements above
the KD superdiagonal, with the array TAU, represent the unitary
matrix Q1 as a product of elementary reflectors; if UPLO
= 'L', the diagonal and band subdiagonal of A are
overwritten by the corresponding elements of the internal band-diagonal
matrix AB, and the elements below the KD subdiagonal, with
the array TAU, represent the unitary matrix Q1 as a product
of elementary reflectors. See Further Details.
LDA
LDA is INTEGER
The leading dimension of the array A. LDA >= max(1,N).
D
D is DOUBLE PRECISION array, dimension (N)
The diagonal elements of the tridiagonal matrix T.
E
E is DOUBLE PRECISION array, dimension (N-1)
The off-diagonal elements of the tridiagonal matrix T.
TAU
TAU is COMPLEX*16 array, dimension (N-KD)
The scalar factors of the elementary reflectors of
the first stage (see Further Details).
HOUS2
HOUS2 is COMPLEX*16 array, dimension (MAX(1,LHOUS2))
Stores the Householder representation of the stage2
band to tridiagonal.
LHOUS2
LHOUS2 is INTEGER
The dimension of the array HOUS2.
LHOUS2 >= 1.
If LWORK = -1, or LHOUS2 = -1,
then a query is assumed; the routine
only calculates the optimal size of the HOUS2 array, returns
this value as the first entry of the HOUS2 array, and no error
message related to LHOUS2 is issued by XERBLA.
If VECT='N', LHOUS2 = max(1, 4*n);
if VECT='V', option not yet available.
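For VECT='N' the documented HOUS2 size depends only on N. As a rough check on the value a query returns, the rule can be written as the hypothetical Python helper below. It is a sketch of the documented rule only, and treating lowercase VECT as valid is an assumption here.

```python
def lhous2_size(vect, n):
    """Documented HOUS2 length for VECT='N' (illustrative helper)."""
    vect = vect.upper()
    if vect == "N":
        return max(1, 4 * n)
    if vect == "V":
        # The man page marks this option as not available in this release.
        raise NotImplementedError("VECT='V' is not available in this release")
    raise ValueError("VECT must be 'N' or 'V'")


print(lhous2_size("N", 250))
```

In real code a query call with LHOUS2 = -1 is still preferable, because it stays correct if a later release changes the rule.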
WORK
WORK is COMPLEX*16 array, dimension (MAX(1,LWORK))
On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
LWORK
LWORK is INTEGER
The dimension of the array WORK.
If N = 0, LWORK >= 1, else LWORK = MAX(1, dimension).
If LWORK = -1, or LHOUS2 = -1,
then a workspace query is assumed; the routine
only calculates the optimal size of the WORK array, returns
this value as the first entry of the WORK array, and no error
message related to LWORK is issued by XERBLA.
LWORK = MAX(1, dimension) where
dimension = max(stage1,stage2) + (KD+1)*N
          = N*KD + N*max(KD+1,FACTOPTNB)
            + max(2*KD*KD, KD*NTHREADS)
            + (KD+1)*N
where KD is the blocking size of the reduction,
FACTOPTNB is the blocking used by the QR or LQ
algorithm (usually FACTOPTNB=128 is a good choice), and
NTHREADS is the number of threads used when
OpenMP compilation is enabled, otherwise =1.
INFO
INFO is INTEGER
= 0: successful exit
< 0: if INFO = -i, the i-th argument had an illegal value
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Further Details:
Implemented by Azzam Haidar.
All details are available in the technical report and the SC11 and SC13 papers.
Azzam Haidar, Hatem Ltaief, and Jack Dongarra.
Parallel reduction to condensed forms for symmetric eigenvalue problems
using aggregated fine-grained and memory-aware kernels. In Proceedings
of 2011 International Conference for High Performance Computing,
Networking, Storage and Analysis (SC '11), New York, NY, USA,
Article 8, 11 pages.
http://doi.acm.org/10.1145/2063384.2063394
A. Haidar, J. Kurzak, P. Luszczek, 2013.
An improved parallel singular value algorithm and its implementation
for multicore hardware. In Proceedings of 2013 International Conference
for High Performance Computing, Networking, Storage and Analysis (SC '13).
Denver, Colorado, USA, 2013.
Article 90, 12 pages.
http://doi.acm.org/10.1145/2503210.2503292
A. Haidar, R. Solca, S. Tomov, T. Schulthess and J. Dongarra.
A novel hybrid CPU-GPU generalized eigensolver for electronic structure
calculations based on fine-grained memory aware tasks.
International Journal of High Performance Computing Applications.
Volume 28 Issue 2, Pages 196-209, May 2014.
http://hpc.sagepub.com/content/28/2/196
Author
Generated automatically by Doxygen for LAPACK from the source code.