IMSL Fortran Library User Guide 6 PDF
User's Guide
MATH/LIBRARY
Version 6.0
800.222.4675
713.784.3131
303.379.3040
www.vni.com
Table of Contents
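Introduction ..............................................................................................................................xix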
Routines .....................................................................................................................................1
Usage Notes ...............................................................................................................................5
Matrix Types .....................................................................................................................5
Solution of Linear Systems ...............................................................................................6
Multiple Right Sides..........................................................................................................7
Determinants .....................................................................................................................7
Iterative Refinement ..........................................................................................................7
Matrix Inversion ................................................................................................................7
Singularity .........................................................................................................................7
Special Linear Systems......................................................................................................8
Iterative Solution of Linear Systems .................................................................................8
QR Decomposition ............................................................................................................8
LIN_SOL_GEN .........................................................................................................................9
LIN_SOL_SELF ......................................................................................................................17
Fortran Numerical MATH LIBRARY
LIN_SOL_LSQ ........................................................................................................................27
LIN_SOL_SVD........................................................................................................................36
LIN_SOL_TRI .........................................................................................................................44
LIN_SVD .................................................................................................................................57
Parallel Constrained Least-Squares Solvers.............................................................................66
Solving Constrained Least-Squares Systems...................................................................66
PARALLEL_NONNEGATIVE_LSQ .....................................................................................66
PARALLEL_BOUNDED_LSQ ..............................................................................................74
LSARG.....................................................................................................................................82
LSLRG .....................................................................................................................................87
LFCRG.....................................................................................................................................93
LFTRG .....................................................................................................................................98
LFSRG ...................................................................................................................................103
LFIRG ....................................................................................................................................107
LFDRG...................................................................................................................................113
LINRG ...................................................................................................................................114
LSACG...................................................................................................................................118
LSLCG ...................................................................................................................................123
LFCCG...................................................................................................................................127
LFTCG ...................................................................................................................................133
LFSCG ...................................................................................................................................138
LFICG ....................................................................................................................................142
LFDCG...................................................................................................................................148
LINCG ...................................................................................................................................149
LSLRT ...................................................................................................................................154
LFCRT ...................................................................................................................................157
LFDRT ...................................................................................................................................161
LINRT....................................................................................................................................163
LSLCT ...................................................................................................................................164
LFCCT ...................................................................................................................................168
LFDCT ...................................................................................................................................172
LINCT....................................................................................................................................174
LSADS ...................................................................................................................................176
LSLDS ...................................................................................................................................180
LFCDS ...................................................................................................................................185
LFTDS ...................................................................................................................................190
LFSDS....................................................................................................................................194
LFIDS ....................................................................................................................................199
LFDDS ...................................................................................................................................204
LINDS....................................................................................................................................205
LSASF....................................................................................................................................209
LSLSF ....................................................................................................................................212
LFCSF....................................................................................................................................214
LFTSF ....................................................................................................................................217
LFSSF ....................................................................................................................................219
LFISF .....................................................................................................................................221
LFDSF....................................................................................................................................224
LSADH ..................................................................................................................................226
LSLDH...................................................................................................................................231
LFCDH ..................................................................................................................................236
LFTDH...................................................................................................................................241
LFSDH...................................................................................................................................246
LFIDH....................................................................................................................................251
LFDDH ..................................................................................................................................256
LSAHF...................................................................................................................................258
LSLHF ...................................................................................................................................261
LFCHF ...................................................................................................................................263
LFTHF ...................................................................................................................................266
LFSHF ...................................................................................................................................269
LFIHF ....................................................................................................................................271
LFDHF...................................................................................................................................274
LSLTR ...................................................................................................................................275
LSLCR ...................................................................................................................................277
LSARB...................................................................................................................................280
LSLRB ...................................................................................................................................282
LFCRB...................................................................................................................................287
LFTRB ...................................................................................................................................290
LFSRB ...................................................................................................................................293
LFIRB ....................................................................................................................................295
LFDRB...................................................................................................................................298
LSAQS...................................................................................................................................300
LSLQS ...................................................................................................................................303
LSLPB ...................................................................................................................................305
LFCQS ...................................................................................................................................308
LFTQS ...................................................................................................................................311
LFSQS ...................................................................................................................................313
LFIQS ....................................................................................................................................315
LFDQS...................................................................................................................................318
LSLTQ ...................................................................................................................................319
LSLCQ...................................................................................................................................321
LSACB...................................................................................................................................324
LSLCB ...................................................................................................................................327
LFCCB...................................................................................................................................330
LFTCB ...................................................................................................................................333
LFSCB ...................................................................................................................................336
LFICB ....................................................................................................................................338
LFDCB...................................................................................................................................342
LSAQH ..................................................................................................................................344
LSLQH...................................................................................................................................346
LSLQB...................................................................................................................................349
LFCQH ..................................................................................................................................352
LFTQH...................................................................................................................................355
LFSQH...................................................................................................................................358
LFIQH....................................................................................................................................360
LFDQH ..................................................................................................................................362
LSLXG...................................................................................................................................364
LFTXG...................................................................................................................................369
LFSXG...................................................................................................................................374
LSLZG ...................................................................................................................................377
LFTZG ...................................................................................................................................382
LFSZG ...................................................................................................................................387
LSLXD...................................................................................................................................391
LSCXD...................................................................................................................................395
LNFXD ..................................................................................................................................399
LFSXD ...................................................................................................................................404
LSLZD ...................................................................................................................................408
LNFZD...................................................................................................................................412
LFSZD ...................................................................................................................................417
LSLTO ...................................................................................................................................421
LSLTC ...................................................................................................................................423
LSLCC ...................................................................................................................................425
PCGRC ..................................................................................................................................427
JCGRC ...................................................................................................................................433
GMRES..................................................................................................................................436
LSQRR...................................................................................................................................446
LQRRV ..................................................................................................................................452
LSBRR ...................................................................................................................................458
LCLSQ ...................................................................................................................................462
LQRRR ..................................................................................................................................466
LQERR...................................................................................................................................473
LQRSL ...................................................................................................................................478
LUPQR...................................................................................................................................484
LCHRG ..................................................................................................................................489
LUPCH...................................................................................................................................491
LDNCH..................................................................................................................................494
LSVRR...................................................................................................................................498
LSVCR...................................................................................................................................504
LSGRR...................................................................................................................................508
515
Routines .................................................................................................................................515
Usage Notes ...........................................................................................................................516
Reformulating Generalized Eigenvalue Problems.........................................................519
LIN_EIG_SELF .....................................................................................................................520
LIN_EIG_GEN ......................................................................................................................527
LIN_GEIG_GEN ...................................................................................................................535
EVLRG ..................................................................................................................................543
EVCRG ..................................................................................................................................545
EPIRG ....................................................................................................................................548
EVLCG ..................................................................................................................................550
EVCCG ..................................................................................................................................552
EPICG ....................................................................................................................................555
EVLSF ...................................................................................................................................557
EVCSF ...................................................................................................................................559
EVASF ...................................................................................................................................561
EVESF ...................................................................................................................................563
EVBSF ...................................................................................................................................566
EVFSF ...................................................................................................................................568
EPISF .....................................................................................................................................571
EVLSB...................................................................................................................................573
EVCSB...................................................................................................................................575
EVASB ..................................................................................................................................578
EVESB...................................................................................................................................581
EVBSB...................................................................................................................................584
EVFSB ...................................................................................................................................586
EPISB.....................................................................................................................................589
EVLHF...................................................................................................................................591
EVCHF ..................................................................................................................................593
EVAHF ..................................................................................................................................596
EVEHF...................................................................................................................................599
EVBHF ..................................................................................................................................602
EVFHF...................................................................................................................................604
EPIHF ....................................................................................................................................607
EVLRH ..................................................................................................................................609
EVCRH..................................................................................................................................611
EVLCH ..................................................................................................................................614
EVCCH..................................................................................................................................616
GVLRG..................................................................................................................................618
GVCRG .................................................................................................................................621
GPIRG ...................................................................................................................................625
GVLCG..................................................................................................................................627
GVCCG .................................................................................................................................629
GPICG ...................................................................................................................................632
GVLSP...................................................................................................................................634
GVCSP...................................................................................................................................636
GPISP.....................................................................................................................................639
643
Routines .................................................................................................................................643
Usage Notes ...........................................................................................................................645
Piecewise Polynomials ..................................................................................................645
Splines and B-splines ....................................................................................................646
Cubic Splines.................................................................................................................647
Tensor Product Splines..................................................................................................648
Quadratic Interpolation..................................................................................................649
Scattered Data Interpolation ..........................................................................................649
Least Squares.................................................................................................................649
Smoothing by Cubic Splines .........................................................................................649
Rational Chebyshev Approximation .............................................................................649
Using the Univariate Spline Routines............................................................................649
Choosing an Interpolation Routine................................................................................651
SPLINE_CONSTRAINTS.....................................................................................................652
SPLINE_VALUES ................................................................................................................653
SPLINE_FITTING.................................................................................................................654
SURFACE_CONSTRAINTS ................................................................................................664
SURFACE_VALUES ............................................................................................................665
SURFACE_FITTING ............................................................................................................666
CSIEZ ....................................................................................................................................677
CSINT ....................................................................................................................................680
CSDEC...................................................................................................................................682
CSHER...................................................................................................................................687
CSAKM .................................................................................................................................690
CSCON ..................................................................................................................................692
CSPER ...................................................................................................................................696
CSVAL...................................................................................................................................699
CSDER...................................................................................................................................700
CS1GD ...................................................................................................................................703
CSITG ....................................................................................................................................706
SPLEZ....................................................................................................................................708
BSINT ....................................................................................................................................711
BSNAK ..................................................................................................................................715
BSOPK...................................................................................................................................718
BS2IN ....................................................................................................................................720
BS3IN ....................................................................................................................................725
BSVAL...................................................................................................................................731
BSDER...................................................................................................................................732
BS1GD ...................................................................................................................................735
BSITG ....................................................................................................................................738
BS2VL ...................................................................................................................................741
BS2DR ...................................................................................................................................742
BS2GD ...................................................................................................................................746
BS2IG ....................................................................................................................................750
BS3VL ...................................................................................................................................754
BS3DR ...................................................................................................................................756
BS3GD ...................................................................................................................................760
BS3IG ....................................................................................................................................766
BSCPP....................................................................................................................................770
PPVAL ...................................................................................................................................771
PPDER ...................................................................................................................................774
PP1GD ...................................................................................................................................776
PPITG ....................................................................................................................................780
QDVAL..................................................................................................................................782
QDDER..................................................................................................................................784
QD2VL...................................................................................................................................786
QD2DR ..................................................................................................................................789
QD3VL...................................................................................................................................792
QD3DR ..................................................................................................................................796
SURF......................................................................................................................................800
RLINE....................................................................................................................................803
RCURV..................................................................................................................................806
FNLSQ ...................................................................................................................................811
BSLSQ ...................................................................................................................................815
BSVLS ...................................................................................................................................819
CONFT ..................................................................................................................................824
BSLS2....................................................................................................................................833
BSLS3....................................................................................................................................838
CSSED ...................................................................................................................................844
CSSMH..................................................................................................................................848
CSSCV...................................................................................................................................851
RATCH..................................................................................................................................854
Chapter 4: Integration and Differentiation
859
Routines .................................................................................................................................859
Usage Notes ...........................................................................................................................860
Univariate Quadrature ...................................................................................................860
Multivariate Quadrature ................................................................................................861
Gauss rules and three-term recurrences.........................................................................861
Numerical differentiation ..............................................................................................862
QDAGS..................................................................................................................................862
QDAG....................................................................................................................................865
QDAGP..................................................................................................................................869
QDAGI...................................................................................................................................872
QDAWO ................................................................................................................................875
QDAWF.................................................................................................................................879
QDAWS.................................................................................................................................883
QDAWC ................................................................................................................................886
QDNG....................................................................................................................................889
TWODQ.................................................................................................................................891
QAND....................................................................................................................................896
QMC ......................................................................................................................................899
GQRUL..................................................................................................................................901
GQRCF ..................................................................................................................................905
RECCF...................................................................................................................................908
RECQR ..................................................................................................................................911
FQRUL ..................................................................................................................................914
DERIV ...................................................................................................................................918
Chapter 5: Differential Equations
923
Routines .................................................................................................................................923
Usage Notes ...........................................................................................................................923
Ordinary Differential Equations ....................................................................................924
Differential-algebraic Equations....................................................................................924
Partial Differential Equations ........................................................................................925
Summary .......................................................................................................................926
IVPRK ...................................................................................................................................927
IVMRK ..................................................................................................................................934
IVPAG ...................................................................................................................................944
BVPFD...................................................................................................................................961
BVPMS..................................................................................................................................973
DASPG ..................................................................................................................................980
Introduction to Subroutine PDE_1D_MG............................................................................1003
PDE_1D_MG.......................................................................................................................1004
Description ..................................................................................................................1011
Remarks on the Examples ...........................................................................................1013
Example 1 - Electrodynamics Model...........................................................................1015
Example 3 - Population Dynamics ..............................................................................1020
Example 4 - A Model in Cylindrical Coordinates .......................................................1023
Example 5 - A Flame Propagation Model ...................................................................1024
Example 6 - A Hot Spot Model ................................................................................1027
Example 7 - Traveling Waves .....................................................................................1029
Example 9 - Electrodynamics, Parameters Studied with MPI .....................................1033
MOLCH ...............................................................................................................................1038
FPS2H ..................................................................................................................................1053
FPS3H ..................................................................................................................................1059
SLEIG ..................................................................................................................................1066
SLCNT .................................................................................................................................1078
Chapter 6: Transforms
1083
Routines ...............................................................................................................................1083
Usage Notes .........................................................................................................................1084
Fast Fourier Transforms ..............................................................................................1084
Continuous versus Discrete Fourier Transform...........................................................1085
Inverse Laplace Transform ..........................................................................................1086
FAST_DFT ..........................................................................................................................1086
FAST_2DFT ........................................................................................................................1093
FAST_3DFT ........................................................................................................................1099
FFTRF..................................................................................................................................1103
FFTRB .................................................................................................................................1106
FFTRI...................................................................................................................................1109
FFTCF..................................................................................................................................1111
FFTCB .................................................................................................................................1113
FFTCI...................................................................................................................................1116
FSINT ..................................................................................................................................1118
FSINI....................................................................................................................................1120
FCOST .................................................................................................................................1122
FCOSI ..................................................................................................................................1124
QSINF ..................................................................................................................................1126
QSINB..................................................................................................................................1129
QSINI ...................................................................................................................................1131
QCOSF.................................................................................................................................1133
QCOSB ................................................................................................................................1135
QCOSI..................................................................................................................................1137
FFT2D..................................................................................................................................1139
FFT2B ..................................................................................................................................1142
FFT3F ..................................................................................................................................1145
FFT3B ..................................................................................................................................1149
RCONV................................................................................................................................1153
CCONV................................................................................................................................1158
RCORL ................................................................................................................................1163
CCORL ................................................................................................................................1168
INLAP..................................................................................................................................1172
SINLP ..................................................................................................................................1175
Chapter 7: Nonlinear Equations
1183
Routines ...............................................................................................................................1183
Usage Notes .........................................................................................................................1183
Zeros of a Polynomial .................................................................................................1183
Zero(s) of a Function ...................................................................................................1184
Root of System of Equations .......................................................................................1184
ZPLRC .................................................................................................................................1184
ZPORC.................................................................................................................................1186
ZPOCC.................................................................................................................................1188
ZANLY................................................................................................................................1189
ZBREN ................................................................................................................................1192
ZREAL.................................................................................................................................1195
NEQNF ................................................................................................................................1198
NEQNJ.................................................................................................................................1201
NEQBF ................................................................................................................................1204
NEQBJ .................................................................................................................................1210
Chapter 8: Optimization
1217
Routines ...............................................................................................................................1217
Usage Notes .........................................................................................................................1218
Unconstrained Minimization .......................................................................................1218
Minimization with Simple Bounds..............................................................................1219
Linearly Constrained Minimization.............................................................................1219
Nonlinearly Constrained Minimization .......................................................................1219
Selection of Routines...................................................................................................1220
UVMIF.................................................................................................................................1222
UVMID................................................................................................................................1225
UVMGS ...............................................................................................................................1229
UMINF.................................................................................................................................1232
UMING................................................................................................................................1237
UMIDH................................................................................................................................1243
UMIAH................................................................................................................................1249
UMCGF ...............................................................................................................................1255
UMCGG...............................................................................................................................1259
UMPOL ...............................................................................................................................1263
UNLSF.................................................................................................................................1267
UNLSJ .................................................................................................................................1273
BCONF ................................................................................................................................1279
BCONG ...............................................................................................................................1286
BCODH ...............................................................................................................................1293
BCOAH ...............................................................................................................................1299
BCPOL.................................................................................................................................1306
BCLSF .................................................................................................................................1310
BCLSJ..................................................................................................................................1317
BCNLS.................................................................................................................................1324
READ_MPS.........................................................................................................................1333
MPS File Format .........................................................................................................1337
NAME Section ............................................................................................................1338
ROWS Section.............................................................................................................1338
COLUMNS Section.....................................................................................................1339
RHS Section ................................................................................................................1339
RANGES Section ........................................................................................................1340
BOUNDS Section........................................................................................................1340
QUADRATIC Section.................................................................................................1342
ENDATA Section........................................................................................................1342
MPS_FREE..........................................................................................................................1343
DENSE_LP ..........................................................................................................................1346
DLPRS .................................................................................................................................1351
SLPRS..................................................................................................................................1355
QPROG ................................................................................................................................1361
LCONF.................................................................................................................................1364
LCONG................................................................................................................................1370
NNLPF .................................................................................................................................1377
NNLPG ................................................................................................................................1383
CDGRD................................................................................................................................1390
FDGRD ................................................................................................................................1392
FDHES .................................................................................................................................1394
GDHES ................................................................................................................................1397
FDJAC .................................................................................................................................1400
CHGRD................................................................................................................................1403
CHHES.................................................................................................................................1406
CHJAC.................................................................................................................................1410
GGUES ................................................................................................................................1414
Chapter 9: Basic Matrix/Vector Operations
1419
Routines ...............................................................................................................................1419
Basic Linear Algebra Subprograms .....................................................................................1422
Programming Notes for Level 1 BLAS .......................................................................1422
Descriptions of the Level 1 BLAS Subprograms ........................................................1423
Programming Notes for Level 2 and Level 3 BLAS ...................................................1433
Descriptions of the Level 2 and Level 3 BLAS...........................................................1434
Other Matrix/Vector Operations ..........................................................................................1443
CRGRG................................................................................................................................1444
CCGCG................................................................................................................................1445
CRBRB ................................................................................................................................1447
CCBCB ................................................................................................................................1448
CRGRB ................................................................................................................................1450
CRBRG ................................................................................................................................1452
CCGCB ................................................................................................................................1453
CCBCG ................................................................................................................................1455
CRGCG................................................................................................................................1457
CRRCR ................................................................................................................................1458
CRBCB ................................................................................................................................1460
CSFRG.................................................................................................................................1462
CHFCG ................................................................................................................................1463
CSBRB.................................................................................................................................1465
CHBCB................................................................................................................................1467
TRNRR ................................................................................................................................1469
MXTXF ...............................................................................................................................1470
MXTYF ...............................................................................................................................1472
MXYTF ...............................................................................................................................1474
MRRRR ...............................................................................................................................1476
MCRCR ...............................................................................................................................1479
HRRRR................................................................................................................................1481
BLINF..................................................................................................................................1483
POLRG ................................................................................................................................1485
MURRV...............................................................................................................................1487
MURBV...............................................................................................................................1489
MUCRV...............................................................................................................................1491
MUCBV...............................................................................................................................1493
ARBRB................................................................................................................................1495
ACBCB................................................................................................................................1497
NRIRR .................................................................................................................................1499
NR1RR.................................................................................................................................1501
NR2RR.................................................................................................................................1502
NR1RB.................................................................................................................................1504
NR1CB.................................................................................................................................1505
DISL2...................................................................................................................................1507
DISL1...................................................................................................................................1509
DISLI ...................................................................................................................................1510
VCONR ...............................................................................................................................1512
VCONC ...............................................................................................................................1514
Extended Precision Arithmetic ............................................................................................1516
Chapter 10: Linear Algebra Operators and Generic Functions
1521
Routines ...............................................................................................................................1521
Usage Notes .........................................................................................................................1522
Matrix Optional Data Changes.............................................................................................1522
Dense Matrix Computations ................................................................................................1524
Dense Matrix Functions .......................................................................................................1526
Dense Matrix Parallelism Using MPI ..................................................................................1527
General Remarks .........................................................................................................1527
Getting Started with Modules MPI_setup_int and MPI_node_int .................1527
Using Processors .........................................................................................................1529
Sparse Matrix Computations................................................................................................1530
Introduction .................................................................................................................1530
Derived Type Definitions ............................................................................................1532
Type SLU_options ......................................................................................................1533
Overloaded Assignments.............................................................................................1534
.x. .........................................................................................................................................1537
.tx. ........................................................................................................................................1541
.xt. ........................................................................................................................................1544
.hx.........................................................................................................................................1547
.xh.........................................................................................................................................1550
.t. ..........................................................................................................................................1553
.h...........................................................................................................................................1556
.i. ..........................................................................................................................................1558
.ix. ........................................................................................................................................1561
.xi. ........................................................................................................................................1571
CHOL...................................................................................................................................1574
COND ..................................................................................................................................1577
DET......................................................................................................................................1581
DIAG....................................................................................................................................1584
DIAGONALS ......................................................................................................................1585
EIG.......................................................................................................................................1586
EYE......................................................................................................................................1590
FFT.......................................................................................................................................1592
FFT_BOX ............................................................................................................................1594
IFFT .....................................................................................................................................1596
IFFT_BOX ...........................................................................................................................1598
isNaN ...................................................................................................................................1600
NaN ......................................................................................................................................1601
NORM..................................................................................................................................1602
ORTH...................................................................................................................................1605
RAND ..................................................................................................................................1608
RANK ..................................................................................................................................1610
SVD......................................................................................................................................1612
UNIT ....................................................................................................................................1614
Chapter 11: Utilities
1617
Routines ...............................................................................................................................1617
Usage Notes for ScaLAPACK Utilities ...............................................................................1619
ScaLAPACK Supporting Modules..............................................................................1622
ScaLAPACK_SETUP..........................................................................................................1622
ScaLAPACK_GETDIM ......................................................................................................1624
ScaLAPACK_READ ...........................................................................................................1625
ScaLAPACK_WRITE .........................................................................................................1627
ScaLAPACK_MAP .............................................................................................................1636
ScaLAPACK_UNMAP........................................................................................................1637
ScaLAPACK_EXIT.............................................................................................................1640
ERROR_POST.....................................................................................................................1640
SHOW..................................................................................................................................1643
WRRRN ...............................................................................................................................1647
WRRRL ...............................................................................................................................1649
WRIRN ................................................................................................................................1653
WRIRL.................................................................................................................................1655
WRCRN ...............................................................................................................................1658
WRCRL ...............................................................................................................................1660
WROPT ...............................................................................................................................1664
PGOPT.................................................................................................................................1671
PERMU................................................................................................................................1673
PERMA................................................................................................................................1674
SORT_REAL.......................................................................................................................1677
SVRGN................................................................................................................................1679
SVRGP.................................................................................................................................1681
SVIGN .................................................................................................................................1683
SVIGP..................................................................................................................................1684
SVRBN ................................................................................................................................1685
SVRBP.................................................................................................................................1687
SVIBN .................................................................................................................................1688
SVIBP ..................................................................................................................................1690
SRCH ...................................................................................................................................1691
ISRCH..................................................................................................................................1694
SSRCH.................................................................................................................................1696
ACHAR ...............................................................................................................................1698
IACHAR ..............................................................................................................................1699
ICASE..................................................................................................................................1700
IICSR ...................................................................................................................................1701
IIDEX...................................................................................................................................1703
CVTSI..................................................................................................................................1704
CPSEC .................................................................................................................................1705
TIMDY ................................................................................................................................1706
TDATE ................................................................................................................................1707
NDAYS................................................................................................................................1708
NDYIN.................................................................................................................................1710
IDYWK................................................................................................................................1711
VERML ...............................................................................................................................1713
RAND_GEN ........................................................................................................................1714
RNGET ................................................................................................................................1722
RNSET.................................................................................................................................1723
RNOPT ................................................................................................................................1724
RNIN32................................................................................................................................1726
RNGE32...............................................................................................................................1727
RNSE32 ...............................................................................................................................1729
RNIN64................................................................................................................................1729
RNGE64...............................................................................................................................1730
RNSE64 ...............................................................................................................................1732
RNUNF................................................................................................................................1732
RNUN ..................................................................................................................................1734
FAURE_INIT ......................................................................................................................1736
FAURE_FREE.....................................................................................................................1736
FAURE_NEXT....................................................................................................................1737
IUMAG................................................................................................................................1739
UMAG .................................................................................................................................1743
SUMAG/DUMAG ...............................................................................................................1746
PLOTP .................................................................................................................................1746
PRIME .................................................................................................................................1749
CONST.................................................................................................................................1751
CUNIT .................................................................................................................................1753
HYPOT ................................................................................................................................1757
MP_SETUP..........................................................................................................................1758
Reference Material
1763
Contents ...............................................................................................................................1763
User Errors ...........................................................................................................................1763
What Determines Error Severity .................................................................................1763
Kinds of Errors and Default Actions ...........................................................................1764
Errors in Lower-Level Routines ..................................................................................1765
Routines for Error Handling ........................................................................................1765
ERSET .................................................................................................................................1765
IERCD and N1RTY .............................................................................................................1766
Examples .....................................................................................................................1766
Machine-Dependent Constants ............................................................................................1769
IMACH ................................................................................................................................1769
AMACH...............................................................................................................................1771
DMACH...............................................................................................................................1772
IFNAN(X)............................................................................................................................1772
UMACH...............................................................................................................................1774
Matrix Storage Modes..........................................................................................................1775
Reserved Names...................................................................................................................1784
Deprecated Features and Renamed Routines .......................................................................1785
Automatic Workspace Allocation................................................................................1785
Changing the Amount of Space Allocated...................................................................1786
Character Workspace...................................................................................................1787
A-1
B-1
Routines .................................................................................................................................B-1
Appendix C: References
C-1
D-1
Product Support
Index
Introduction
Most of the routines are available in both single and double precision versions. Many routines for
linear solvers and eigensystems are also available for complex and double-complex precision
arithmetic. The same user interface is found on the many hardware versions that span the range
from personal computer to supercomputer.
This library is the result of merging two products: the IMSL Fortran Numerical Libraries and the
IMSL Fortran 90 Library.
User Background
To use this product you should be familiar with the Fortran 90 language as well as the withdrawn
Fortran 77 language, which is, in practice, a subset of Fortran 90. A summary of the ISO and
ANSI standard language is found in Metcalf and Reid (1990). A more comprehensive illustration
is given in Adams et al. (1992).
Those routines implemented in the IMSL Fortran Numerical Library provide a simpler, more
reliable user interface than was possible with Fortran 77. Features of the IMSL Fortran Numerical
Library include the use of descriptive names, short required argument lists, packaged user-interface blocks, a suite of testing and benchmark software, and a collection of examples. Source
code is provided for the benchmark software and examples.
Some of the routines in the IMSL Fortran Numerical Library can take advantage of a standard
Message Passing Interface (MPI) environment but do not require one if the user chooses not to
take advantage of MPI. The MPI Capable logo shown below cues the reader when this is the case:
[MPI Capable logo]
Routines documented with the MPI Capable logo can also be called in a scalar (single-computer)
environment.
Other routines in the IMSL Library take advantage of MPI and require that an MPI environment
be present in order to use them. The MPI Required logo shown below cues the reader when this is
the case:
[MPI Required logo]
NOTE: It is recommended that users considering using the MPI capabilities of the product read
the following sections of the MATH Library documentation:
Introduction: Using MPI Routines
Introduction: Using ScaLAPACK Enhanced Routines
Chapter 10, Linear Algebra Operators and Generic Functions: Dense Matrix Parallelism Using MPI
Getting Started
The IMSL MATH/LIBRARY is a collection of Fortran routines and functions useful in
mathematical analysis research and application development. Each routine is designed and
documented for use in research activities as well as by technical specialists.
To use any of these routines, you must write a program in Fortran 90 (or possibly some other
language) to call the MATH/LIBRARY routine. Each routine conforms to established conventions
in programming and documentation. We give first priority in development to efficient algorithms,
clear documentation, and accurate results. The uniform design of the routines makes it easy to use
more than one routine in a given application. Also, you will find that the design consistency
enables you to apply your experience with one MATH/LIBRARY routine to other IMSL routines
that you use.
Often the quickest way to use the MATH/LIBRARY is to find an example similar to your problem
and then to mimic the example. Each routine document has at least one example demonstrating its
application. The example for a routine may be created simply for illustration, it may be from a
textbook (with reference to the source), or it may be from the mathematical literature.
The documentation for each routine is organized into the following sections:
Purpose: a statement of the purpose of the routine. If the routine is a function rather than a
subroutine the purpose statement will reflect this fact.
Function Return Value: a description of the return value (for functions only).
Required Arguments: a description of the required arguments in the order of their occurrence.
Input arguments usually occur first, followed by input/output arguments, with output
arguments described last. Furthermore, the following terms apply to arguments:
Input: Argument must be initialized; it is not changed by the routine.
Input/Output: Argument must be initialized; the routine returns output through this
argument; cannot be a constant or an expression.
Input[/Output]: Argument must be initialized; the routine may return output through this
argument based on other optional data the user may choose to pass to this routine; cannot
be a constant or an expression.
Input or Output: Select appropriate option to define the argument as either input or output.
See individual routines for further instructions.
Output: No initialization is necessary; cannot be a constant or an expression. The routine
returns output through this argument.
Optional Arguments: a description of the optional arguments in the order of their occurrence.
Fortran 90 Interface: a section that describes the generic and specific interfaces to the routine.
Fortran 77 Style Interface: an optional section, which describes Fortran 77 style interfaces, is
supplied for backwards compatibility with previous versions of the Library.
Programming notes: an optional section that contains programming details not covered
elsewhere.
Example: at least one application of this routine showing input and required dimension and
type statements.
Additional Examples: an optional section with additional applications of this routine showing
input and required dimension and type statements.
Naming Conventions
The names of the routines are mnemonic and unique. Most routines are available in both a single
precision and a double precision version, with names of the two versions sharing a common root.
The root name is also the generic interface name. The name of the double precision specific
version begins with D_. The single precision specific version begins with S_. For
example, the following pairs are precision specific names of routines in the two different
precisions: S_GQRUL/D_GQRUL (the root is GQRUL, for Gauss quadrature rule) and
S_RECCF/D_RECCF (the root is RECCF, for recurrence coefficient). The precision specific
names of the IMSL routines that return or accept complex data begin with the prefix C_
or Z_ for complex or double complex, respectively. Of course, the generic name can be used as
an entry point for all precisions supported.
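The relationship between a generic root name and its S_ and D_ specific versions can be illustrated with a standard Fortran 90 generic interface. The following is only a sketch of the mechanism: the module demo_norm_int and the routines S_DEMO_NORM and D_DEMO_NORM are hypothetical names invented for this illustration and are not part of the IMSL Library.

```fortran
module demo_norm_int               ! hypothetical module, not an IMSL module
  implicit none
  interface demo_norm              ! generic (root) name
    module procedure s_demo_norm, d_demo_norm
  end interface
contains
  function s_demo_norm(x) result(r)      ! single precision specific version
    real, intent(in) :: x(:)
    real :: r
    r = sqrt(sum(x*x))
  end function s_demo_norm

  function d_demo_norm(x) result(r)      ! double precision specific version
    double precision, intent(in) :: x(:)
    double precision :: r
    r = sqrt(sum(x*x))
  end function d_demo_norm
end module demo_norm_int

program generic_name_demo
  use demo_norm_int
  implicit none
  real :: xs(2)
  double precision :: xd(2)
  xs = (/ 3.0, 4.0 /)
  xd = (/ 3.0d0, 4.0d0 /)
  print *, demo_norm(xs)      ! compiler selects s_demo_norm from the argument type
  print *, demo_norm(xd)      ! compiler selects d_demo_norm from the argument type
  print *, d_demo_norm(xd)    ! a specific name can also be called directly
end program generic_name_demo
```

The compiler picks the specific routine from the type of the actual argument, which is why a single generic name can serve as the entry point for every supported precision.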
When this convention is not followed the generic and specific interfaces are noted in the
documentation. For example, in the case of the BLAS and trigonometric intrinsic functions where
standard names are already established, the standard names are used as the precision specific
names. There may also be other interfaces supplied to the routine to provide for backwards
compatibility with previous versions of the IMSL Fortran Numerical Library. These alternate
interfaces are noted in the documentation when they are available.
Except when expressly stated otherwise, the names of the variables in the argument lists follow
the Fortran default type for integer and floating point. In other words, a variable whose name
begins with one of the letters I through N is of type INTEGER, and otherwise is of type REAL
or DOUBLE PRECISION, depending on the precision of the routine.
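As a reminder of the Fortran default typing rule that the argument names follow, here is a minimal sketch. It deliberately omits IMPLICIT NONE so that the default rule is in effect; the variable names are chosen only for illustration.

```fortran
program implicit_typing_demo
  ! IMPLICIT NONE is deliberately omitted so that the default
  ! typing rule applies to the undeclared names below.
  nr = 7                      ! begins with N: default type INTEGER
  r  = 7.0                    ! begins with R: default type REAL
  print *, 'nr/2 =', nr/2     ! integer division
  print *, 'r/2  =', r/2      ! real division
end program implicit_typing_demo
```

New user code would normally still declare every variable under IMPLICIT NONE; the rule matters here only as a guide to reading the documented argument lists.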
An assumed-size array with more than one dimension that is used as a Fortran argument can have
an assumed-size declarator for the last dimension only. In the MATH/LIBRARY routines, the
information about the first dimension is passed by a variable with the prefix LD and with the
array name as the root. For example, the argument LDA contains the leading dimension of array A.
In most cases, information about the dimensions of arrays is obtained from the array through the
use of the Fortran 90 size function. Therefore, arguments carrying this type of information are
usually defined as optional arguments.
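The two ways of conveying array dimensions described above can be contrasted in a short sketch. The routines TRACE77 and TRACE90 below are hypothetical and written only for this illustration; they are not IMSL routines. The first receives the leading dimension explicitly through LDA, as an assumed-size interface requires, while the second obtains the dimension from the array itself with the size function.

```fortran
program leading_dimension_demo
  implicit none
  integer, parameter :: lda = 4, n = 3
  real :: a(lda, n)           ! storage has 4 rows; only the first 3 are used
  integer :: i, j

  a = 0.0
  do j = 1, n
    do i = 1, n
      a(i, j) = real(i + j)
    end do
  end do

  ! Both calls compute the trace of the same 3 x 3 matrix.
  print *, trace77(n, a, lda)       ! Fortran 77 style: dimensions passed explicitly
  print *, trace90(a(1:n, 1:n))     ! Fortran 90 style: dimensions taken from the array

contains

  function trace77(n, a, lda) result(t)
    integer, intent(in) :: n, lda
    real, intent(in) :: a(lda, *)   ! assumed-size in the last dimension only
    real :: t
    integer :: i
    t = 0.0
    do i = 1, n
      t = t + a(i, i)
    end do
  end function trace77

  function trace90(a) result(t)
    real, intent(in) :: a(:, :)     ! assumed-shape: extents travel with the array
    real :: t
    integer :: i
    t = 0.0
    do i = 1, size(a, 1)
      t = t + a(i, i)
    end do
  end function trace90

end program leading_dimension_demo
```

Because the assumed-shape version can query the array, a dimension argument such as LDA becomes redundant in most calls, which is consistent with such arguments being optional.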
Where appropriate, the same variable name is used consistently throughout a chapter in the
MATH/LIBRARY. For example, in the routines for random number generation, NR denotes the
number of random numbers to be generated, and R or IR denotes the array that stores the numbers.
When writing programs accessing the MATH/LIBRARY, the user should choose Fortran names
that do not conflict with names of IMSL subroutines, functions, or named common blocks. The
careful user can avoid any conflicts with IMSL names if, in choosing names, the following rules
are observed:
Do not choose a name that appears in the Alphabetical Summary of Routines, at the end of the
User's Manual, nor one of these names preceded by D, S_, D_, C_, or Z_.
Do not choose a name consisting of more than three characters with a numeral in the second
or third position.
For further details, see the section on Reserved Names in the Reference Material.
where the overstrike denotes complex conjugation. IMSL Fortran Numerical Library linear
algebra software uses this convention to conserve the utility of generic documentation for that
code subject. All references to orthogonal matrices are to be replaced by their complex
counterparts, unitary matrices. Thus, an $n \times n$ orthogonal matrix $Q$ satisfies the
condition $Q^T Q = I_n$. An $n \times n$ unitary matrix $V$ satisfies the analogous condition for complex
matrices, $V^* V = I_n$.
Programming Conventions
In general, the IMSL MATH/LIBRARY codes are written so that computations are not affected by
underflow, provided the system (hardware or software) places a zero value in the register. In this
case, system error messages indicating underflow should be ignored.
IMSL codes are also written to avoid overflow. A program that produces system error messages
indicating overflow should be examined for programming errors such as incorrect input data,
mismatch of argument types, or improper dimensioning.
In many cases, the documentation for a routine points out common pitfalls that can lead to failure
of the algorithm.
Library routines detect error conditions, classify them as to severity, and treat them accordingly.
This error-handling capability provides automatic protection for the user without requiring the user
to make any specific provisions for the treatment of error conditions. See the section on User
Errors in the Reference Material for further details.
Module Usage
Users are required to incorporate a use statement near the top of their program for the IMSL
routine being called when writing new code that uses this library. However, legacy code that
calls routines in the previous version of the library without a use statement will
continue to work as before. Also, code that employed the use numerical_libraries statement
from the previous version of the library will continue to work properly with this version of the
library.
Users wishing to update existing programs so as to call other routines from this library should
incorporate a use statement for the specific new routine being called. (Here, the term "new
routine" means any routine in the library that is new to the user's program.) Use of the more
encompassing imsl_libraries module in this case could result in argument mismatches for
the old routine(s) being called. (The compiler would catch this.)
Users wishing to update existing programs to call the new generic versions of the routines must
change their calls to the existing routines to match the new calling sequences and use either the
routine specific interface modules or the all-encompassing imsl_libraries module.
[MPI Required logo]
Users of the IMSL Fortran Numerical Library benefit by having a standard Message
Passing Interface (MPI) environment. This is needed to accomplish parallel computing within parts of
the Library. Either of the MPI icons (MPI Capable or MPI Required) cues the reader when this is the case. If parallel computing
is not required, then the IMSL Library suite of dummy MPI routines can be substituted for
standard MPI routines. All requested MPI routines called by the IMSL Library are in this dummy
suite. Warning messages will appear if a code or example requires more than one process to
execute. Typically users need not be aware of the parallel codes.
NOTE: A standard MPI environment is not part of the IMSL Fortran Numerical Library. The
standard includes a library of MPI Fortran and C routines, MPI include files, usage
documentation, and other run-time utilities.
NOTE: Details on linking to the appropriate libraries are explained in the online README file of
the product distribution.
There are three situations of MPI usage in the IMSL Fortran Numerical Library:
1. There are some computations that are performed with the box data type that benefit from the
use of parallel processing. For computations involving a single array or a single problem,
there is no IMSL use of parallel processing or MPI codes. The box data type implies that
several problems of the same size and type are to be computed and solved. Each rack of the
box is an independent problem. This means that each problem could potentially be solved in
parallel. The default for computing a box data type calculation is that a single processor will
do all of the problems, one after the other. If this is acceptable there should be no further
concern about which version of the libraries is used for linking. If the problems are to be
solved in parallel, then the user must link with a working version of an MPI Library and the
appropriate IMSL Library. Examples demonstrating the use of the box data type may be found in
Chapter 10, Linear Algebra Operators and Generic Functions.
NOTE: Box data type routines are marked with the MPI Capable icon.
2. Various routines in Chapter 1, Linear Systems, allow the user to interface with the
ScaLAPACK Library routines. If the user chooses to run on only one processor then these
routines will utilize either IMSL Library code or LAPACK Library code based on the
libraries the user chooses to use during linking. If the user chooses to run on multiple
processors then working versions of MPI, ScaLAPACK, PBLAS, and Blacs will need to be
present. These routines are marked with the MPI Capable icon.
3. Some routines or operators in the Library require that a working MPI Library be
present in order for them to run. Examples are the large-scale parallel solvers and the
ScaLAPACK utilities. Routines of this type are marked with the MPI Required icon. For
these routines, the user must link with a working version of an MPI Library and the
appropriate IMSL Library.
In all cases described above it is the user's responsibility to supply working versions of the
aforementioned third party libraries when those libraries are required.
Table A below lists the chapters and IMSL routines calling MPI routines or the replacement nonparallel package.
Chapter Name and Number     Routine
Linear Systems, 1           PARALLEL_NONNEGATIVE_LSQ
Linear Systems, 1           PARALLEL_BOUNDED_LSQ
Linear Systems, 1
Utilities, 11               ScaLAPACK_SETUP
Utilities, 11               ScaLAPACK_GETDIM
Utilities, 11               ScaLAPACK_READ
Utilities, 11               ScaLAPACK_WRITE
Utilities, 11               ScaLAPACK_MAP
Utilities, 11               ScaLAPACK_UNMAP
Utilities, 11               ScaLAPACK_EXIT
Reference Material
Programming Tips
Each subject routine called or otherwise referenced requires the use statement for an interface
block designed for that subject routine. The contents of this interface block are the interfaces to the
separate routines available for that subject. Packaged descriptive names for option numbers that
modify documented optional data or internal parameters might also be provided in the interface
block. Although this seems like an additional complication, many errors are avoided at an early
stage in development through the use of these interface blocks. The use statement is required
for each routine called in the user's program. As illustrated in Examples 3 and 4 in routine
lin_geig_gen, the use statement is required for defining the secondary option flags.
The function subprogram for s_NaN() or d_NaN() does not require an interface block because it
has only a single required dummy argument. Also, if one is only using the Fortran 77 interfaces
supplied for backwards compatibility then the use statements are not required.
existed in the Fortran 77 version of the Library. Note that it is not necessary to include use
statements when calling these routines by themselves. Existing programs that called these
routines will continue to work in the same manner as before.
Some of the primary routines have arguments epack= and iopt=. As noted the epack=
argument is of derived type s_error or d_error. The prefix s_ or d_ is chosen
depending on the precision of the data type for that routine. These optional arguments are part of
the interface to certain routines, and are used to modify internal algorithm choices or other
parameters.
Optional Data
This additional optional argument (available for some routines) is further distinguished: it is a derived
type array that contains a number of parameters to modify the internal algorithm of a routine. This
derived type has the name ?_options, where ?_ is either s_ or d_. The choice depends
on the precision of the data type. The declaration of this derived type is packaged within the
modules for these codes.
The definition of the derived types is:
type ?_options
integer idummy; real(kind(?)) rdummy
end type
where the ?_ is either s_ or d_, and the kind value matches the desired data type
indicated by the choice of s or d.
Example 3 in Chapter 1, Linear Systems of LIN_SOL_GEN illustrates the use of iterative
refinement to compute a double-precision solution based on a single-precision factorization of the
matrix. This is communicated to the routine using an optional argument with optional data. For
efficiency of iterative refinement, perform the factorization step once, and then save the factored
matrix in the array A and the pivoting information in the rank-1 integer array, ipivots. By
default, the factorization is normally discarded. To enable the routine to be re-entered with a
previously computed factorization of the matrix, optional data are used as array entries in the
iopt= optional argument. The packaging of LIN_SOL_GEN includes the definitions of the
self-documenting integer parameters lin_sol_gen_save_LU and lin_sol_gen_solve_A. These
parameters have the values 2 and 3, but the programmer usually does not need to be aware of them.
The following rules apply to the iopt=iopt optional argument:
1. Define a relative index, for example IO, for placing option numbers and data into the
array argument iopt. Initially, set IO = 1. Before a call to the IMSL Library routine,
follow Steps 2 through 4.
2. The data structure for the optional data array has the following form:
iopt (IO) = ?_options (Option_number, Optional_data)
[iopt (IO + 1) =?_options (Option_number, Optional_data)]
The length of the data set is specified by the documentation for an individual routine. (The
Optional_data is output in some cases and may not be used in other cases.) The square
braces [. . .] denote optional items.
Illustration: Example 3 in Chapter 2, Singular Value and Eigenvalue Decomposition of
There is one line of code required for the change and the new tolerance:
iopt (1) = d_options(d_lin_sol_self_set_small, &
epsilon(one) *abs (d(i)))
3. changes to default values of internal parameters have been made. This implies that the last
option number is the value zero or the value of SIZE (iopt) matches the last optional
value changed.
4. To add more options, replace IO with IO + n, where n is the number of items required for
the previous option. Go to Step 2.
Option numbers can be written in any order, and any selected set of options can be changed from
the defaults. They may be repeated. Example 3 of LIN_SOL_SELF in Chapter 1, Linear Systems,
uses three and then four option numbers for purposes of computing an eigenvector associated with
a known eigenvalue.
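The indexing rules above can be sketched without the Library itself. The following is a minimal illustration, assuming a stand-in derived type with the same two components as ?_options; the module name options_sketch, the routine fill_options, and the parameter names save_lu and solve_a are hypothetical, and only the values 2 and 3 are taken from the text. In a real program the type and the option names come from the IMSL modules through the use statement.

```fortran
module options_sketch
  implicit none
  ! Stand-in for the Library's s_options type (assumption: same two
  ! components as the type definition shown earlier in this section).
  type s_options
     integer :: idummy
     real(kind(1e0)) :: rdummy
  end type s_options
  ! Hypothetical names; the values 2 and 3 are those quoted in the text
  ! for lin_sol_gen_save_LU and lin_sol_gen_solve_A.
  integer, parameter :: save_lu = 2, solve_a = 3
contains
  ! Fill iopt with two options, following Steps 1 through 4.
  ! Assumes size(iopt) >= 2.
  subroutine fill_options(iopt, n_used)
    type(s_options), intent(out) :: iopt(:)
    integer, intent(out) :: n_used
    integer :: io
    ! Option number 0 in every unused entry marks the end of the options.
    iopt = s_options(0, 0.0e0)
    io = 1                                  ! Step 1: relative index
    iopt(io) = s_options(save_lu, 0.0e0)    ! Step 2: first option
    io = io + 1                             ! Step 4: this option used one entry
    iopt(io) = s_options(solve_a, 0.0e0)    ! Step 2 again: second option
    n_used = io
  end subroutine fill_options
end module options_sketch
```

Calling fill_options with an array of four entries leaves option numbers 2 and 3 in the first two entries and zeros in the rest, so the zero option number ends the list as described in Step 3.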
Using the overloaded assignment and logical operations, this code fragment can be written in the
equivalent and more readable form:
s_epack(1) = 0
call lin_sol_gen(A,b,x,epack=s_epack)
if (s_epack(1) > 0) call error_post(s_epack)
Generally the assignments and logical operations refer only to component idummy. The
assignment s_epack(1)=0 is equivalent to s_epack(1)=s_error(0,0E0). Thus, the
floating-point component rdummy is assigned the value 0E0. The assignment statement
I=s_epack(1), for I an integer type, is equivalent to I=s_epack(1)%idummy. The value
of component rdummy is ignored in this assignment. For the logical operators, a single element of
any of the IMSL Fortran Numerical Library derived types can be in either the first or second
operand.
Derived Type   Overloaded Assignments            Tests
s_options      I=s_options(1); s_options(1)=I    ==  /=  <  <=  >  >=
d_options      I=d_options(1); d_options(1)=I    ==  /=  <  <=  >  >=
s_epack        I=s_epack(1); s_epack(1)=I        ==  /=  <  <=  >  >=
d_epack        I=d_epack(1); d_epack(1)=I        ==  /=  <  <=  >  >=
In the examples operator_ex01, ..., _ex37, the overloaded assignments and tests have been
used whenever they improve the readability of the code.
Error Handling
The routines in the IMSL MATH/LIBRARY attempt to detect and report errors and invalid input.
Errors are classified and are assigned a code number. By default, errors of moderate or worse
severity result in messages being automatically printed by the routine. Moreover, errors of worse
severity cause program execution to stop. The severity level and the general nature of the error are
designated by an error type ranging from 0 to 5. An error type 0 is no error; types 1 through 5
are progressively more severe. In most cases, you need not be concerned with our method of
handling errors. For those interested, a complete description of the error-handling system is given
in the Reference Material, which also describes how you can change the default actions and access
the error code numbers.
A separate error handler is provided to allow users to handle errors of differing types being
reported from several nodes without danger of jumbling or mixing error messages. The design
of this error handler is described more fully in Hanson (1992). The primary feature of the design is
the use of a separate array for each parallel call to a routine. This allows the user to summarize
errors using the routine error_post in a non-parallel part of an application. For a more detailed
discussion of the use of this error handler in applications which use MPI for distributed
computing, see the Reference Material.
Printing Results
Most of the routines in the IMSL MATH/LIBRARY (except the line printer routines and special
utility routines) do not print any of the results. The output is returned in Fortran variables, and you
can print these yourself. See Chapter 11, Utilities, for detailed descriptions of these routines.
A commonly used routine in the examples is the IMSL routine UMACH (see the Reference Material),
which retrieves the Fortran device unit number for printing the results. Because this routine obtains
device unit numbers, it can be used to redirect the input or output. The section on
Machine-Dependent Constants in the Reference Material contains a description of the routine UMACH.
Fortran 90 Constructs
The IMSL Fortran Numerical Library contains routines which take advantage of Fortran 90
language constructs, including Fortran 90 array data types. One feature of the design is that the
default use may be as simple as the problem statement. Complicated, professional-quality
mathematical software is hidden from the casual or beginning user.
In addition, high-level operators and functions are provided in the Library. They are described in
Chapter 10, Linear Algebra Operators and Generic Functions.
Defined Operation            Matrix Operation
A .x. B                      $AB$
.i. A                        $A^{-1}$
.t. A, .h. A                 $A^T$, $A^*$
A .ix. B                     $A^{-1}B$
B .xi. A                     $BA^{-1}$
A .tx. B, A .hx. B           $A^TB$, $A^*B$
B .xt. A, B .xh. A           $BA^T$, $BA^*$
                             $A = USV^T$
R=CHOL(A)                    $A = R^TR$
Q=ORTH(A [,R=R])             $(A = QR)$, $Q^TQ = I$
U=UNIT(A)                    $[u_1, \ldots] = [a_1/\|a_1\|, \ldots]$
F=DET(A)                     det(A) = determinant
K=RANK(A)                    rank(A) = rank
P=NORM(A[,[type=]i])         $p = \|A\|_1 = \max_j \sum_i |a_{ij}|$
                             $p = \|A\|_{\mathrm{huge}(1)} = \max_i \sum_j |a_{ij}|$
C=COND(A)                    $\|A^{-1}\| \cdot \|A\|$
Z=EYE(N)                     $Z = I_N$
A=DIAG(X)                    $A = \mathrm{diag}(x_1, \ldots)$
X=DIAGONALS(A)               $x = (a_{11}, \ldots)$
W=FFT(Z); Z=IFFT(W)
A=RAND(A)
L=isNaN(A)
Defined Operation            Matrix Operation
                             Data Management
A .x. B                      $AB$
.t. A, .h. A                 $A^T$, $A^*$
A .ix. B                     $A^{-1}B$
B .xi. A                     $BA^{-1}$
A .tx. B, A .hx. B           $A^TB$, $A^*B$
B .xt. A, B .xh. A           $BA^T$, $BA^*$
C=COND(A)                    $\|A^{-1}\| \cdot \|A\|$
Table B.2 Defined Operators and Generic Functions for Harwell-Boeing Sparse Matrices
Generic Name of    LAPACK Routines used when Linking
IMSL Routine       with High Performance Libraries
LSARG              ?GERFS, ?GETRF, ?GECON, ?=S/D
LSLRG
LFCRG              ?GETRF, ?GECON, ?=S/D
LFTRG              ?GETRF, ?=S/D
LFSRG              ?GETRS, ?=S/D
LFIRG              ?GETRS, ?=S/D
LINRG
LSACG
LSLCG
LFCCG
LFTCG              ?GETRF, ?=C/Z
LFSCG              ?GETRS, ?=C/Z
LFICG              ?GERFS, ?GETRS, ?=C/Z
LINCG
LSLRT              ?TRTRS, ?=S/D
LFCRT              ?TRCON, ?=S/D
LSLCT              ?TRTRS, ?=C/Z
LFCCT              ?TRCON, ?=C/Z
LSADS
LSLDS              ?POTRF, ?POTRS, ?=S/D
LFCDS              ?POTRF, ?POCON, ?=S/D
LFTDS              ?POTRF, ?=S/D
LFSDS              ?POTRS, ?=S/D
LFIDS
LINDS              ?POTRF, ?=S/D
LSASF
LSLSF
LFCSF
LFTSF              ?SYTRF, ?=S/D
LFSSF              ?SYTRF, ?=S/D
LFISF              ?SYRFS, ?=S/D
LSADH
LSLDH
LFCDH
LFTDH              ?POTRF, ?=C/Z
LFSDH              ?TRTRS, ?=C/Z
LFIDH
LSAHF
LSLHF
LFCHF
LFTHF              ?HETRF, ?=C/Z
LFSHF              ?HETRS, ?=C/Z
LFIHF
LSARB
LSLRB
LFCRB
LFTRB              ?GBTRF, ?=S/D
LFSRB              ?GBTRS, ?=S/D
LFIRB
LSQRR
LQRRV
LSBRR              ?GEQRF, ?=S/D
LQRRR              ?GEQRF, ?=S/D
LSVRR              ?GESVD, ?=S/D
LSVCR              ?GESVD, ?=C/Z
LSGRR              ?GESVD, ?=S/D
LQRSL
LQERR              ?ORGQR, ?=S/D
EVLRG
EVCRG              ?GEEVX, ?=S/D
EVLCG
EVCCG              ?GEEV, ?=C/Z
EVLSF              ?SYEV, ?=S/D
EVCSF              ?SYEV, ?=S/D
EVLHF              ?HEEV, ?=C/Z
EVCHF              ?HEEV, ?=C/Z
GVLRG
GVCRG
GVLCG
GVCCG
GCLSP              ?SYGV, ?=S/D
GCCSP              ?SYGV, ?=S/D
ScaLAPACK, Blackford et al. (1997), includes a subset of LAPACK codes redesigned for use on
distributed memory MIMD parallel computers. A number of IMSL Library routines make use of a
subset of the ScaLAPACK library.
Table D below lists the IMSL routines that make use of ScaLAPACK codes. The intent is to
provide access to the ScaLAPACK codes through the familiar IMSL routine interface. The IMSL
routines that utilize ScaLAPACK codes have a ScaLAPACK Interface documented in addition to
the FORTRAN 90 Interface. Like the LAPACK codes, access to the ScaLAPACK codes is made
by linking to the appropriate library. Details on linking to the appropriate IMSL Library and
alternate libraries for ScaLAPACK and BLAS are explained in the online README file of the
product distribution.
Generic Name of    ScaLAPACK Routines used when Linking
IMSL Routine       with High Performance Libraries
LSARG              P?GERFS, P?GETRF, P?GETRS, ?=S/D
LSLRG
LFCRG              P?GETRF, P?GECON, ?=S/D
LFTRG              P?GETRF, ?=S/D
LFSRG              P?GETRS, ?=S/D
LFIRG
LINRG
LSACG
LSLCG
LFCCG
LFTCG              P?GETRF, ?=C/Z
LFSCG              P?GETRS, ?=C/Z
LFICG              P?GERFS, P?GETRS, ?=C/Z
LINCG
LSLRT              P?TRTRS, ?=S/D
LFCRT              P?TRCON, ?=S/D
LSLCT              P?TRTRS, ?=C/Z
LFCCT              P?TRCON, ?=C/Z
LSADS
LSLDS              P?POTRF, P?POTRS, ?=S/D
LFCDS              P?POTRF, P?POCON, ?=S/D
LFTDS              P?POTRF, ?=S/D
LFSDS              P?POTRS, ?=S/D
LFIDS
LINDS
LSADH
LSLDH
LFCDH
LFTDH              P?POTRF, ?=C/Z
LFSDH              P?POTRS, ?=C/Z
LFIDH
LSLRB
LSQRR
LQRRV
LQRRR
LSVRR              P?GESVD, ?=S/D
LSGRR              P?GESVD, ?=S/D
LQRSL
LQERR              P?ORGQR, ?=S/D
Table D IMSL Routines and ScaLAPACK Routines Utilized Within
General Remarks
Use of the ScaLAPACK enhanced routines allows a user to solve large linear systems of algebraic
equations at a performance level that might not be achievable on one computer by performing the
work in parallel across multiple computers. One might also use these routines on linear systems
that prove to be too large for the address space of the target computer. Visual Numerics has tried
to facilitate the use of parallel computing in these situations by providing interfaces to
ScaLAPACK routines which accomplish the task. The IMSL Library solver interface has the same
look and feel whether one is using the routine on a single computer or across multiple computers.
The basic steps required to utilize the IMSL routines which interface with ScaLAPACK routines
are:
1. Initialize MPI
2. Initialize the processor grid
3. Define any necessary array descriptors
4. Allocate space for the local arrays
5. Set up local matrices across the processor grid
6. Call the IMSL routine which interfaces with ScaLAPACK
7. Gather the results from across the processor grid
8. Release the processor grid
9. Exit MPI
Utilities are provided in the IMSL Library that facilitate these steps for the user. Each of these
utilities is documented in Chapter 11, Utilities. We visit the steps briefly here:
1. Initialize MPI
The user should call MP_SETUP() at this step. This function is described in detail in
Getting Started with Modules MPI_setup_int and MPI_node_int in Chapter 10, Linear
Algebra Operators and Generic Functions of this manual. For ScaLAPACK usage, suffice it to say
that following a call to the function MP_SETUP(), the module MPI_node_int will contain
information about the number of processors, the rank of a processor, and the communicator for the
application. A call to this function will return the number of processors available to the program.
Since the module MPI_node_int is used by MPI_setup_int, it is not necessary to explicitly
use the module MPI_node_int. If MP_SETUP() is not called, then the program will compute
entirely on one node. No routine from MPI will be called.
9. Exit MPI
A call to MP_SETUP with the argument FINAL will shut down MPI and set the value of
MP_NPROCS = 0. This flags that MPI has been initialized and terminated. It cannot be initialized
again in the same program unit execution. No MPI routine is defined when MP_NPROCS has this
value.
Routines
1.1. Linear Solvers
Cholesky Factorization
Cholesky factoring for rank deficient matrices ....................LCHRG
Cholesky factor update........................................................ LUPCH
Cholesky factor down-date.................................................. LDNCH
Usage Notes
Section 1.1 describes routines for solving systems of linear algebraic equations by direct matrix
factorization methods, for computing only the matrix factorizations, and for computing linear
least-squares solutions.
Section 1.2 describes routines for solving constrained least-squares problems in parallel.
Many of the routines described in sections 1.3 and 1.4 are for matrices with special properties or
structure. Computer time and storage requirements for solving systems with coefficient matrices
of these types can often be drastically reduced, using the appropriate routine, compared with using
a routine for solving a general complex system.
The appropriate matrix property and corresponding routine can be located in the Routines
section. Many of the linear equation solver routines in this chapter are derived from subroutines
from LINPACK, Dongarra et al. (1979). Other routines have been developed by Visual Numerics,
derived from draft versions of LAPACK subprograms, Bischof et al. (1988), or were obtained
from alternate sources.
A system of linear equations is represented by Ax = b where A is the n × n coefficient data matrix,
b is the known right-hand-side n-vector, and x is the unknown or solution n-vector. Figure 1-1
summarizes the relationships among the subroutines. Routine names are in boxes and input/output
data are in ovals. The suffix ** in the subroutine names depends on the matrix type. For example,
to compute the determinant of A use LFC** or LFT** followed by LFD**.
The paths using LSA** or LFI** use iterative refinement for a more accurate solution. The path
using LSA** is the same as using LFC** followed by LFI**. The path using LSL** is the same as
the path using LFC** followed by LFS**. The matrix inversion routines LIN** are available only
for certain matrix types.
Matrix Types
The two letter codes for the form of coefficient matrix, indicated by ** in Figure 1-1, are as
follows:
RG
CG
TR or CR
RB
TQ or CQ
CB
SF
DS    Real symmetric positive definite matrix stored in the upper half of a square matrix.
DH    Complex Hermitian positive definite matrix stored in the upper half of a complex
      square matrix.
HF    Complex Hermitian matrix stored in the upper half of a complex square matrix.
QS or PB
QH or QB
XG
ZG
XD
ZD
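[Figure 1-1: diagram of the routine relationships. The matrix A is factored by LFT**, or by LFC**, which also computes the condition number. The factorization feeds LIN** ($A^{-1}$), LFS** and LFI** ($x = A^{-1}b$ or $x = A^{-T}b$), and LFD** (determinant). LSL** and LSA** compute the solution directly from A.]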
Determinants
The routines for evaluating determinants are named LFD**. As indicated in Figure 1-1, these
routines require the factors of the matrix as input. The values of determinants are often badly
scaled. Additional complications in structures for evaluating them result from this fact. See Rice
(1983) for comments on determinant evaluation.
Iterative Refinement
Iterative refinement can often improve the accuracy of a well-posed numerical solution. The
iterative refinement algorithm used is as follows:
$x_0 = A^{-1}b$
For $i = 1, 50$
    $r_i = Ax_{i-1} - b$, computed in higher precision
    $p_i = A^{-1}r_i$
    $x_i = x_{i-1} - p_i$
    if ($\|p_i\| \le \varepsilon \|x_i\|$) Exit
End for
Error: Matrix is too ill-conditioned
If the matrix A is in single precision, then the residual $r_i = Ax_{i-1} - b$ is computed in double
precision. If A is in double precision, then quadruple-precision arithmetic routines are used.
The use of the value 50 is arbitrary. In fact a single correction is usually sufficient. It is also
helpful even when $r_i$ is computed in the same precision as the data.
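The refinement loop can be sketched in standard Fortran without the Library. This is a minimal illustration, not the Library's implementation: the module refine_sketch and the routines lu_factor, lu_solve, and refine are hypothetical names, the factorization is a plain partial-pivoting LU in single precision, the residual is formed in double precision, and the stopping tolerance is assumed to be the single-precision machine epsilon.

```fortran
module refine_sketch
  implicit none
  integer, parameter :: sp = kind(1e0), dp = kind(1d0)
contains
  ! In-place LU factorization with partial pivoting (single precision).
  subroutine lu_factor(a, piv)
    real(sp), intent(inout) :: a(:,:)
    integer, intent(out) :: piv(:)
    integer :: i, j, k, n
    real(sp) :: row(size(a,2))
    n = size(a,1)
    do i = 1, n
       k = i - 1 + maxloc(abs(a(i:n,i)), dim=1)   ! pivot row
       piv(i) = k
       if (k /= i) then
          row = a(i,:); a(i,:) = a(k,:); a(k,:) = row
       end if
       a(i+1:n,i) = a(i+1:n,i) / a(i,i)           ! multipliers
       do j = i+1, n
          a(i+1:n,j) = a(i+1:n,j) - a(i+1:n,i)*a(i,j)
       end do
    end do
  end subroutine lu_factor

  ! Solve using the saved factors; b is overwritten by the solution.
  subroutine lu_solve(a, piv, b)
    real(sp), intent(in) :: a(:,:)
    integer, intent(in) :: piv(:)
    real(sp), intent(inout) :: b(:)
    integer :: i, n
    real(sp) :: t
    n = size(a,1)
    do i = 1, n                                   ! apply row interchanges
       t = b(i); b(i) = b(piv(i)); b(piv(i)) = t
    end do
    do i = 2, n                                   ! forward substitution, unit L
       b(i) = b(i) - dot_product(a(i,1:i-1), b(1:i-1))
    end do
    do i = n, 1, -1                               ! back substitution with U
       b(i) = (b(i) - dot_product(a(i,i+1:n), b(i+1:n))) / a(i,i)
    end do
  end subroutine lu_solve

  ! Iterative refinement: factor once in single precision, then correct
  ! the solution using residuals computed in double precision.
  subroutine refine(a, b, x, iters)
    real(dp), intent(in) :: a(:,:), b(:)
    real(dp), intent(out) :: x(:)
    integer, intent(out) :: iters
    real(sp) :: lu(size(a,1),size(a,2)), p(size(b))
    real(dp) :: r(size(b))
    integer :: piv(size(b)), i
    iters = 0
    lu = real(a, sp)
    call lu_factor(lu, piv)
    p = real(b, sp)
    call lu_solve(lu, piv, p)
    x = real(p, dp)                               ! x_0
    do i = 1, 50
       r = matmul(a, x) - b                       ! r_i in higher precision
       p = real(r, sp)
       call lu_solve(lu, piv, p)                  ! p_i from the saved factors
       x = x - real(p, dp)                        ! x_i = x_{i-1} - p_i
       iters = i
       if (maxval(abs(p)) <= epsilon(1.0_sp)*maxval(abs(x))) return
    end do
    error stop 'Matrix is too ill-conditioned'
  end subroutine refine
end module refine_sketch
```

The factorization is computed once and reused for every correction, which is the same economy that Example 3 of LIN_SOL_GEN obtains by saving the LU factors.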
Matrix Inversion
An inverse of the coefficient matrix can be computed directly by one of the routines named
LIN**. These routines are provided for general matrix forms and some special matrix forms.
When they do not exist, or when it is desirable to compute a high accuracy inverse, the two-step
technique of calling the factoring routine followed by the solver routine can be used. The inverse
is the solution of the matrix system AX = I where I denotes the n × n identity matrix, and the
solution is $X = A^{-1}$.
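The two-step technique can be written column by column. As a short worked statement (the symbols $x_j$ and $e_j$ for the columns of X and I are notation introduced here, not taken from the manual):

```latex
AX = I
\quad\Longleftrightarrow\quad
A x_j = e_j, \qquad j = 1, \ldots, n,
\qquad X = [\, x_1 \; x_2 \; \cdots \; x_n \,] = A^{-1}
```

One factorization of A therefore serves all n right-hand sides $e_j$, which is why the factoring routine is called once and the solver routine is applied to the identity matrix.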
Singularity
The numerical and mathematical notions of singularity are not the same. A matrix is considered
numerically singular if it is sufficiently close to a mathematically singular matrix. If error
messages are issued regarding an exact singularity then specific error message level reset actions
must be taken to handle the error condition. By default, the routines in this chapter stop. The
solvers require that the coefficient matrix be numerically nonsingular. There are some tests to
determine if this condition is met. When the matrix is factored, using routines LFC**, the
condition number is computed. If the condition number is large compared to the working
precision, a warning message is issued and the computations are continued. In this case, the user
needs to verify the usability of the output. If the matrix is determined to be mathematically
singular, or ill-conditioned, a least-squares routine or the singular value decomposition routine
may be used for further analysis.
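Toeplitz matrices have entries that are constant along each diagonal, for example:
$$A = \begin{bmatrix} p_0 & p_1 & p_2 & p_3 \\ p_{-1} & p_0 & p_1 & p_2 \\ p_{-2} & p_{-1} & p_0 & p_1 \\ p_{-3} & p_{-2} & p_{-1} & p_0 \end{bmatrix}$$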
Real Toeplitz systems can be solved using LSLTO. Complex Toeplitz systems can be solved using
LSLTC.
Circulant matrices have the property that each row is obtained by shifting the row above it one
place to the right. Entries that are shifted off at the right reenter at the left. For example:
$$A = \begin{bmatrix} p_1 & p_2 & p_3 & p_4 \\ p_4 & p_1 & p_2 & p_3 \\ p_3 & p_4 & p_1 & p_2 \\ p_2 & p_3 & p_4 & p_1 \end{bmatrix}$$
QR Decomposition
The QR decomposition of a matrix A consists of finding an orthogonal matrix Q, a permutation
matrix P, and an upper trapezoidal matrix R with diagonal elements of nonincreasing magnitude,
such that AP = QR. This decomposition is determined by the routines LQRRR or LQRRV. It returns
R and the information needed to compute Q. To actually compute Q use LQERR. Figure 1-2
summarizes the relationships among the subroutines.
The QR decomposition can be used to solve the linear system Ax = b. This is equivalent to
$Rx = Q^TPb$. The routine LQRSL can be used to find $Q^TPb$ from the information computed by
LQRRR. Then x can be computed by solving a triangular system using LSLRT. If the system Ax = b
is overdetermined, then this procedure solves the least-squares problem, i.e., it finds an x for which
$\|Ax - b\|_2^2$ is a minimum.
If the matrix A is changed by a rank-1 update, $A \leftarrow A + xy^T$, the QR decomposition of A can be
updated/down-dated using the routine LUPQR. In some applications a series of linear systems
which differ by rank-1 updates must be solved. Computing the QR decomposition once and then
updating or down-dating it is usually faster than solving each system anew.
[Figure 1-2 Least-Squares Routine: diagram in which LQRRR or LQRRV computes the QR decomposition of A, LUPQR updates it for $A \leftarrow A + xy^T$, LQERR forms Q, and LQRSL computes $Qb$, $Q^Tb$, and the least-squares solution.]
LIN_SOL_GEN
Solves a general system of linear equations Ax = b. Using optional arguments, any of several
related computations can be performed. These extra tasks include computing the LU factorization
of A using partial pivoting, representing the determinant of A, computing the inverse matrix $A^{-1}$,
and solving $A^Tx = b$ or $Ax = b$ given the LU factorization of A.
Required Arguments
A Array of size n × n containing the matrix. (Input [/Output])
If the packaged option lin_sol_gen_save_LU is used then the LU factorization of A
is saved in A. For solving efficiency, the diagonal reciprocals of the matrix U are saved
in the diagonal entries of A.
Optional Arguments
NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix.
Default: n = size (A, 1)
NRHS = nb (Input)
Uses array b(1:n, 1:nb) for the input right-hand side matrix.
Default: nb = size(b, 2)
Note that b must be a rank-2 array.
pivots = pivots(:) (Output [/Input])
Integer array of size n that contains the individual row interchanges. To construct the
permuted order so that no pivoting is required, define an integer array ip(n). Initialize
ip(i) = i, i = 1, n and then execute the loop, after calling lin_sol_gen,
k=pivots(i)
interchange ip(i) and ip(k), i=1,n
The matrix defined by the array assignment that permutes the rows,
A(1:n, 1:n) = A(ip(1:n), 1:n), requires no pivoting for maintaining numerical
stability. Now, the optional argument iopt= and the packaged option number
?_lin_sol_gen_no_pivoting can be safely used for increased efficiency during
the LU factorization of A.
det = det(1:2) (Output)
Array of size 2 of the same type and kind as A for representing the determinant of the
input matrix. The determinant is represented by two numbers. The first is the base with
the sign or complex angle of the result. The second is the exponent. When det(2) is
within exponent range, the value of the determinant is given by the expression
abs(det(1))**det(2) * (det(1))/abs(det(1)). If the matrix is not singular,
abs(det(1)) = radix(det); otherwise, det(1) = 0., and det(2) = huge(abs(det(1))).
ainv = ainv(:,:) (Output)
Array of the same type and kind as A(1:n, 1:n). It contains the inverse matrix, $A^{-1}$,
when the input matrix is nonsingular.
Derived type array with the same precision as the input matrix; used for passing
optional data to the routine. The options are as follows:
Option Name
Option Value
lin_sol_gen_set_small
lin_sol_gen_save_LU
lin_sol_gen_solve_A
lin_sol_gen_solve_ADJ
lin_sol_gen_no_pivoting
lin_sol_gen_scan_for_NaN
lin_sol_gen_no_sing_mess
lin_sol_gen_A_is_sparse
Replaces a diagonal term of the matrix U if it is smaller in magnitude than the value
Small using the same sign or complex direction as the diagonal. The system is declared
singular. A solution is approximated based on this replacement if no overflow results.
Default: the smallest number that can be reciprocated safely
iopt(IO) = ?_options(?_lin_sol_gen_save_LU, ?_dummy)
Saves the LU factorization of A. Requires the optional argument pivots= if the
routine will be used later for solving systems with the same matrix. This is the only
case where the input arrays A and b are not saved. For solving efficiency, the diagonal
reciprocals of the matrix U are saved in the diagonal entries of A.
iopt(IO) = ?_options(?_lin_sol_gen_solve_A, ?_dummy)
Uses the LU factorization of A computed and saved to solve Ax = b.
iopt(IO) = ?_options(?_lin_sol_gen_solve_ADJ,?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) .or. isNan(b(i,j)) ==.true.
iopt(IO) = ?_options(?_lin_sol_gen_A_is_sparse,?_dummy)
Uses an indirect updating loop for the LU factorization that is efficient for sparse
matrices where all matrix entries are stored.
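Two of the optional arguments described above, pivots= and det=, return data that usually needs a small amount of post-processing. The following is a minimal sketch in standard Fortran, assuming double-precision real data; the module name lin_sol_gen_args_sketch and the functions permuted_order and det_value are hypothetical helpers, not Library routines.

```fortran
module lin_sol_gen_args_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
contains
  ! Build the permuted row order ip from the row interchanges in pivots:
  ! start from ip(i) = i and interchange ip(i) with ip(pivots(i)) in order.
  function permuted_order(pivots) result(ip)
    integer, intent(in) :: pivots(:)
    integer :: ip(size(pivots))
    integer :: i, k, tmp
    ip = [(i, i = 1, size(pivots))]
    do i = 1, size(pivots)
       k = pivots(i)
       tmp = ip(i); ip(i) = ip(k); ip(k) = tmp
    end do
  end function permuted_order

  ! Evaluate the determinant from its two-number representation:
  ! det(1) is the base carrying the sign, det(2) is the exponent.
  ! The result can overflow or underflow when det(2) is outside
  ! the exponent range of the data type.
  function det_value(det) result(d)
    real(dp), intent(in) :: det(2)
    real(dp) :: d
    if (det(1) == 0.0_dp) then
       d = 0.0_dp                      ! singular matrix
    else
       d = abs(det(1))**det(2) * det(1) / abs(det(1))
    end if
  end function det_value
end module lin_sol_gen_args_sketch
```

With ip = permuted_order(pivots), the array assignment A(1:n, 1:n) = A(ip(1:n), 1:n) produces the row-permuted matrix described under pivots=. For the determinant, a base of -2 and an exponent of 3 represent the value -8.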
FORTRAN 90 Interface
Generic:
Specific:
Description
Routine LIN_SOL_GEN solves a system of linear algebraic equations with a nonsingular
coefficient matrix A. It first computes the LU factorization of A with partial pivoting such that
LU = A. The matrix U is upper triangular, while the following is true:
$L^{-1}A = L_n P_n L_{n-1} P_{n-1} \cdots L_1 P_1 A = U$
The factors $P_i$ and $L_i$ are defined by the partial pivoting. Each $P_i$ is an interchange of row i with
row $j \ge i$. Thus, $P_i$ is defined by that value of j. Every
$L_i = I + m_i e_i^T$
is an elementary elimination matrix. The vector $m_i$ is zero in entries 1, ..., i. This vector is stored
as column i in the strictly lower-triangular part of the working array containing the decomposition
information. The reciprocals of the diagonals of the matrix U are saved in the diagonal of the
working array. The solution of the linear system Ax = b is found by solving two simpler systems,
$y = L^{-1}b$ and $x = U^{-1}y$
More mathematical details are found in Golub and Van Loan (1989, Chapter 3).
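The reduction to two simpler systems follows directly from the factorization. As a short worked derivation using only the symbols defined above:

```latex
\begin{aligned}
Ax = b
  &\;\Longrightarrow\; L^{-1}A\,x = L^{-1}b \\
  &\;\Longrightarrow\; Ux = y,
     \qquad y = L^{-1}b = L_n P_n L_{n-1} P_{n-1} \cdots L_1 P_1\, b \\
  &\;\Longrightarrow\; x = U^{-1}y
\end{aligned}
```

Forming y applies the saved interchanges and elimination steps to b, and the final step is a back substitution with the upper triangular matrix U.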
Output
Example 1 for LIN_SOL_GEN is correct.
Additional Examples
Example 2: Matrix Inversion and Determinant
This example computes the inverse and determinant of A, a random matrix. Tests are made on the
conditions
$AA^{-1} = I$
and
$\det(A^{-1}) = \det(A)^{-1}$
Output
Example 2 for LIN_SOL_GEN is correct.
use lin_sol_gen_int
use rand_gen_int
implicit none
! This is Example 3 for LIN_SOL_GEN.
integer, parameter :: n=32
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1d0)), parameter :: d_zero=0.0d0
integer ipivots(n)
real(kind(1e0)) a(n,n), b(n,1), x(n,1), w(n**2)
real(kind(1e0)) change_new, change_old
real(kind(1d0)) c(n,1), d(n,n), y(n,1)
type(s_options) :: iopti(2)=s_options(0,zero)
! Generate a random matrix.
call rand_gen(w)
a = reshape(w, (/n,n/))
! Generate a random right hand side.
call rand_gen(b(1:n,1))
! Save double precision copies of the matrix and right hand side.
d = a
c = b
! Start solution at zero.
y = d_zero
change_old = huge(one)
! Use packaged option to save the factorization.
iopti(1) = s_options(s_lin_sol_gen_save_LU,zero)
iterative_refinement: do
b = c - matmul(d,y)
call lin_sol_gen(a, b, x, &
pivots=ipivots, iopt=iopti)
y = x + y
change_new = sum(abs(x))
! Exit when changes are no longer decreasing.
if (change_new >= change_old) &
exit iterative_refinement
change_old = change_new
! Use option to re-enter code with factorization saved; solve only.
iopti(2) = s_options(s_lin_sol_gen_solve_A,zero)
end do iterative_refinement
write (*,*) 'Example 3 for LIN_SOL_GEN is correct.'
end
Output
Example 3 for LIN_SOL_GEN is correct.
with initial values $y(0) = y_0$. For this example, the matrix A is real and constant with respect to t.
The unique solution is given by the matrix exponential:
$y(t) = e^{At}y_0$
This example uses the eigenvalue-eigenvector decomposition
$A = XDX^{-1}$
to evaluate the solution with the equivalent formula
$y(t) = Xe^{Dt}z_0$
where
$z_0 = X^{-1}y_0$
is computed using the complex arithmetic version of lin_sol_gen. The results for y(t) are real
quantities, but the evaluation uses intermediate complex-valued calculations. Note that the
computation of the complex matrix X and the diagonal matrix D is performed using the IMSL
MATH/LIBRARY FORTRAN 77 interface to routine EVCRG. This is an illustration of intermixing
interfaces of FORTRAN 77 and Fortran 90 code. The information is made available to the Fortran
90 compiler by using the FORTRAN 77 interface for EVCRG. Also, see operator_ex04, supplied
with the product examples, where the Fortran 90 function EIG() has replaced the call to EVCRG.
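The equivalence of the two formulas for y(t) can be checked in a few lines. This is a worked derivation under the assumption, implicit in the example, that A is diagonalizable with eigenvalues $d_1, \ldots, d_n$ on the diagonal of D:

```latex
\begin{aligned}
A^k &= (XDX^{-1})^k = X D^k X^{-1}, \\
e^{At} &= \sum_{k=0}^{\infty} \frac{t^k A^k}{k!}
        = X \Bigl( \sum_{k=0}^{\infty} \frac{t^k D^k}{k!} \Bigr) X^{-1}
        = X e^{Dt} X^{-1}, \\
y(t) &= e^{At} y_0 = X e^{Dt} X^{-1} y_0 = X e^{Dt} z_0,
\qquad e^{Dt} = \mathrm{diag}\bigl(e^{d_1 t}, \ldots, e^{d_n t}\bigr)
\end{aligned}
```

Only one linear solve, $Xz_0 = y_0$, is needed; after that each time point costs a diagonal scaling and one matrix-vector product.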
use lin_sol_gen_int
use rand_gen_int
use Numerical_Libraries
implicit none
! This is Example 4 for LIN_SOL_GEN.
integer, parameter :: n=32, k=128
real(kind(1e0)), parameter :: one=1.0e0, t_max=1, delta_t=t_max/(k-1)
real(kind(1e0)) err, A(n,n), atemp(n,n), ytemp(n**2)
real(kind(1e0)) t(k), y(n,k), y_prime(n,k)
complex(kind(1e0)) EVAL(n), EVEC(n,n)
Output
Example 4 for LIN_SOL_GEN is correct.
LIN_SOL_SELF
Solves a system of linear equations Ax = b, where A is a self-adjoint matrix. Using optional
arguments, any of several related computations can be performed. These extra tasks include
computing and saving the factorization of A using symmetric pivoting, representing the
determinant of A, computing the inverse matrix $A^{-1}$, or computing the solution of Ax = b given the
factorization of A.
Required Arguments
A Array of size n × n containing the self-adjoint matrix. (Input [/Output])
If the packaged option lin_sol_self_save_factors is used then the factorization
of A is saved in A. For solving efficiency, the diagonal reciprocals of the matrix R are
saved in the diagonal entries of A when the Cholesky method is used.
B Array of size n × nb containing the right-hand side matrix. (Input [/Output])
If the packaged option lin_sol_self_save_factors is used then input B is used as
work storage and is not saved.
X Array of size n × nb containing the solution matrix. (Output)
Optional Arguments
NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix.
Default: n = size(A, 1)
NRHS = nb (Input)
Uses the array b(1:n, 1:nb) for the input right-hand side matrix.
Default: nb = size(b, 2)
Note that b must be a rank-2 array.
pivots = pivots(:) (Output [/Input])
Integer array of size n + 1 that contains the individual row interchanges in the first n
locations. Applied in order, these yield the permutation matrix P. Location n + 1
contains the number of the first diagonal term no larger than Small, which is defined on
the next page of this chapter.
det = det(1:2) (Output)
Array of size 2 of the same type and kind as A for representing the determinant of the
input matrix. The determinant is represented by two numbers. The first is the base with
the sign or complex angle of the result. The second is the exponent. When det(2) is
within exponent range, the value of the determinant is given by the expression
abs(det(1))**det(2) * (det(1))/abs(det(1)). If the matrix is not singular,
abs(det(1)) = radix(det); otherwise, det(1) = 0., and det(2) = huge(abs(det(1))).
ainv = ainv(:,:) (Output)
Array of the same type and kind as A(1:n, 1:n). It contains the inverse matrix, $A^{-1}$,
when the input matrix is nonsingular.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix; used for passing
optional data to the routine. The options are as follows:
Option Name
Option Value
lin_sol_self_set_small
lin_sol_self_save_factors
lin_sol_self_no_pivoting
lin_sol_self_use_Cholesky
lin_sol_self_solve_A
lin_sol_self_scan_for_NaN
lin_sol_self_no_sing_mess
will be used for solving further systems with the same matrix. This is the only case
where the input arrays A and b are not saved. For solving efficiency, the diagonal
reciprocals of the matrix R are saved in the diagonal entries of A when the Cholesky
method is used.
iopt(IO) = ?_options(?_lin_sol_self_no_pivoting, ?_dummy)
Does no row pivoting. The array pivots(:), if present, satisfies pivots(i) = i + 1 for
i = 1, ..., n − 1 when using Aasen's method. When using the Cholesky method,
pivots(i) = i for i = 1, ..., n.
iopt(IO) = ?_options(?_lin_sol_self_use_Cholesky, ?_dummy)
The Cholesky decomposition $PAP^T = R^T R$ is used instead of the Aasen method.
iopt(IO) = ?_options(?_lin_sol_self_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) .or. isNan(b(i,j)) ==.true.
iopt(IO) = ?_options(?_lin_sol_self_no_sing_mess,?_dummy)
FORTRAN 90 Interface
Generic:
Specific:
Description
Routine LIN_SOL_SELF solves a system of linear algebraic equations with a nonsingular
coefficient matrix A. By default, the routine computes the factorization of A using Aasen's
method. This decomposition has the form
$$PAP^T = LTL^T$$
where P is a permutation matrix, L is a unit lower-triangular matrix, and T is a tridiagonal
self-adjoint matrix. The solution of the linear system Ax = b is found by solving simpler systems,
$$u = L^{-1} P b$$
$$Tv = u$$
and
$$x = P^T L^{-T} v$$
More mathematical details for real matrices are found in Golub and Van Loan (1989, Chapter 4).
When the optional Cholesky algorithm is used with a positive definite, self-adjoint matrix, the
factorization has the alternate form
$$PAP^T = R^T R$$
where P is a permutation matrix and R is an upper-triangular matrix. The solution of the linear
system Ax = b is computed by solving the systems
$$u = R^{-T} P b$$
and
$$x = P^T R^{-1} u$$
The permutation is chosen so that the diagonal term is maximized at each step of the
decomposition. The individual interchanges are optionally available in the argument pivots.
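The two triangular solves of the Cholesky path can be illustrated without the library. The following is a minimal sketch, assuming no pivoting (so P = I) and a small hand-picked positive definite matrix; it is not the algorithm used inside lin_sol_self, and the program and variable names are illustrative only.

```fortran
program cholesky_sketch
  implicit none
  integer, parameter :: n = 3
  integer :: i, j
  real(kind(1d0)) :: a(n,n), r(n,n), b(n), u(n), x(n)

  ! Small symmetric positive definite test matrix and right-hand side
  ! (values chosen for illustration only).
  a = reshape((/ 4d0, 2d0, 2d0,  &
                 2d0, 5d0, 3d0,  &
                 2d0, 3d0, 6d0 /), (/ n, n /))
  b = (/ 8d0, 10d0, 11d0 /)

  ! Factor A = R**T R with R upper triangular (no pivoting, so P = I).
  r = 0d0
  do j = 1, n
    r(j,j) = sqrt(a(j,j) - sum(r(1:j-1,j)**2))
    do i = j+1, n
      r(j,i) = (a(j,i) - sum(r(1:j-1,j)*r(1:j-1,i))) / r(j,j)
    end do
  end do

  ! Forward solve R**T u = b.
  do i = 1, n
    u(i) = (b(i) - sum(r(1:i-1,i)*u(1:i-1))) / r(i,i)
  end do

  ! Back solve R x = u.
  do i = n, 1, -1
    x(i) = (u(i) - sum(r(i,i+1:n)*x(i+1:n))) / r(i,i)
  end do

  ! Report the largest residual component of b - A x.
  write (*,*) 'max residual = ', maxval(abs(b - matmul(a, x)))
end program cholesky_sketch
```

With pivoting, the same two solves are applied to the permuted right-hand side Pb, and the result is permuted back by $P^T$.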
The n × n self-adjoint system Ax = b is solved for x. This solution method is not as satisfactory, in
terms of numerical accuracy, as solving the system $Cx \cong d$ directly by using the routine
lin_sol_lsq. Also, see operator_ex05, Chapter 10.
use lin_sol_self_int
use rand_gen_int
implicit none
! This is Example 1 for LIN_SOL_SELF.
integer, parameter :: m=64, n=32
real(kind(1e0)), parameter :: one=1e0
real(kind(1e0)) err
real(kind(1e0)), dimension(n,n) :: A, b, x, res, y(m*n),&
C(m,n), d(m,n)
! Generate two rectangular random matrices.
call rand_gen(y)
C = reshape(y,(/m,n/))
call rand_gen(y)
d = reshape(y,(/m,n/))
! Form the normal equations for the rectangular system.
A = matmul(transpose(C),C)
b = matmul(transpose(C),d)
! Compute the solution for Ax = b.
call lin_sol_self(A, b, x)
! Check the results for small residuals.
res = b - matmul(A,x)
err = maxval(abs(res))/sum(abs(A)+abs(b))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 1 for LIN_SOL_SELF is correct.'
end if
end
Output
Example 1 for LIN_SOL_SELF is correct.
Additional Examples
Example 2: System Solving with Cholesky Method
This example solves the same form of the system as Example 1. The optional argument iopt=
is used to note that the Cholesky algorithm is used since the matrix A is positive definite and
self-adjoint. In addition, the sample covariance matrix
$$\Gamma = \sigma^2 A^{-1}$$
is computed, where
$$\sigma^2 = \frac{\| d - Cx \|^2}{m - n}$$
The inverse matrix is returned as the ainv= optional argument. The scale factor $\sigma^2$ and $\Gamma$ are
computed after returning from the routine. Also, see operator_ex06, Chapter 10.
use lin_sol_self_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 2 for LIN_SOL_SELF.
integer, parameter :: m=64, n=32
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1e0)) err
real(kind(1e0)) a(n,n), b(n,1), c(m,n), d(m,1), cov(n,n), x(n,1), &
res(n,1), y(m*n)
type(s_options) :: iopti(1)=s_options(0,zero)
! Generate a random rectangular matrix and a random right hand side.
call rand_gen(y)
c = reshape(y,(/m,n/))
! Fill all m entries of the right-hand side d (declared d(m,1)).
call rand_gen(d(1:m,1))
! Form the normal equations for the rectangular system.
a = matmul(transpose(c),c)
b = matmul(transpose(c),d)
! Use packaged option to use Cholesky decomposition.
iopti(1) = s_options(s_lin_sol_self_Use_Cholesky,zero)
! Compute the solution of Ax=b with optional inverse obtained.
call lin_sol_self(a, b, x, ainv=cov, &
iopt=iopti)
Output
Example 2 for LIN_SOL_SELF is correct.
are computed using the routine lin_eig_self. An eigenvector, corresponding to one of these
eigenvalues, $\lambda$, is computed using inverse iteration. This solves the near singular system
$(A - \lambda I)x = b$ for an eigenvector, x. Following the computation of a normalized eigenvector
$$y = \frac{x}{\| x \|}$$
the consistency condition
$$\lambda = y^T A y$$
is checked. Since a singular system is expected, suppress the fatal error message that normally
prints when the error post-processor routine error_post is called within the routine
lin_sol_self. Also, see operator_ex07, Chapter 10.
use lin_sol_self_int
use lin_eig_self_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 3 for LIN_SOL_SELF.
integer i, tries
Output
Example 3 for LIN_SOL_SELF is correct.
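The self-adjoint augmented system
$$\begin{bmatrix} I & A \\ A^T & 0 \end{bmatrix} \begin{bmatrix} r \\ x \end{bmatrix} = \begin{bmatrix} b \\ 0 \end{bmatrix}$$
is solved using iterative refinement. This solution method is appropriate for least-squares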
problems when an accurate solution is required. The solution and residuals are accumulated in
double precision, while the decomposition is computed in single precision. Also, see
operator_ex08, supplied with the product examples.
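The code for this example fills the augmented matrix with an identity block, the matrix A, and its transpose. Reading that block system row by row shows why its solution contains the least-squares solution. This is a sketch of the reasoning, with r denoting the residual vector and x the least-squares unknowns, which is how the blocks of the code's solution vector are interpreted here.

```latex
\[
\begin{aligned}
r + A x &= b \quad\Longrightarrow\quad r = b - A x, \\
A^T r &= 0 \quad\Longrightarrow\quad A^T (b - A x) = 0
        \quad\Longrightarrow\quad A^T A\, x = A^T b .
\end{aligned}
\]
```

The second block row states that the residual is orthogonal to the columns of A, which is the normal-equations condition, but the product $A^T A$ is never formed explicitly.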
use lin_sol_self_int
use rand_gen_int
implicit none
! This is Example 4 for LIN_SOL_SELF.
integer i
integer, parameter :: m=8, n=4
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1d0)), parameter :: d_zero=0.0d0
integer ipivots((n+m)+1)
real(kind(1e0)) a(m,n), b(m,1), w(m*n), f(n+m,n+m), &
g(n+m,1), h(n+m,1)
real(kind(1e0)) change_new, change_old
real(kind(1d0)) c(m,1), d(m,n), y(n+m,1)
type(s_options) :: iopti(2)=s_options(0,zero)
! Generate a random matrix.
call rand_gen(w)
a = reshape(w, (/m,n/))
! Generate a random right hand side.
call rand_gen(b(1:m,1))
! Save double precision copies of the matrix and right hand side.
d = a
c = b
! Fill in augmented system for accurately solving the least-squares
! problem.
f = zero
do i=1, m
f(i,i) = one
end do
f(1:m,m+1:) = a
f(m+1:,1:m) = transpose(a)
! Start solution at zero.
y = d_zero
change_old = huge(one)
! Use packaged option to save the factorization.
iopti(1) = s_options(s_lin_sol_self_save_factors,zero)
iterative_refinement: do
g(1:m,1) = c(1:m,1) - y(1:m,1) - matmul(d,y(m+1:m+n,1))
g(m+1:m+n,1) = - matmul(transpose(d),y(1:m,1))
call lin_sol_self(f, g, h, &
pivots=ipivots, iopt=iopti)
y = h + y
change_new = sum(abs(h))
! Exit when changes are no longer decreasing.
if (change_new >= change_old) &
exit iterative_refinement
change_old = change_new
! Use option to re-enter code with factorization saved; solve only.
iopti(2) = s_options(s_lin_sol_self_solve_A,zero)
end do iterative_refinement
write (*,*) 'Example 4 for LIN_SOL_SELF is correct.'
end
Output
Example 4 for LIN_SOL_SELF is correct.
LIN_SOL_LSQ
Solves a rectangular system of linear equations $Ax \cong b$, in a least-squares sense. Using optional
arguments, any of several related computations can be performed. These extra tasks include
computing and saving the factorization of A using column and row pivoting, representing the
determinant of A, computing the generalized inverse matrix $A^\dagger$, or computing the least-squares
solution of
$$Ax \cong b$$
or
$$A^T y \cong b,$$
given the factorization of A. An optional argument is provided for computing the following
unscaled covariance matrix
$$C = (A^T A)^{-1}$$
Least-squares solutions, where the unknowns are non-negative or have simple bounds, can be
computed with PARALLEL_NONNEGATIVE_LSQ and PARALLEL_BOUNDED_LSQ. These codes can
be restricted to execute without MPI.
Required Arguments
A Array of size m × n containing the matrix. (Input [/Output])
If the packaged option lin_sol_lsq_save_QR is used then the factorization of A is
saved in A. For efficiency, the diagonal reciprocals of the matrix R are saved in the
diagonal entries of A.
B Array of size m × nb containing the right-hand side matrix. When using the option to
solve adjoint systems $A^T x \cong b$, the size of b is n × nb. (Input [/Output])
If the packaged option lin_sol_lsq_save_QR is used then input B is used as work
storage and is not saved.
X Array of size n × nb containing the solution matrix. When using the option to
solve adjoint systems $A^T x \cong b$, the size of x is m × nb. (Output)
Optional Arguments
MROWS = m (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: m = size(A, 1)
NCOLS = n (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: n = size(A, 2)
NRHS = nb (Input)
Uses the array b(1:, 1:nb) for the input right-hand side matrix.
Default: nb = size(b, 2)
Note that b must be a rank-2 array.
pivots = pivots(:) (Output [/Input])
Integer array of size 2 * min(m, n) + 1 that contains the individual row followed by the
column interchanges. The last array entry contains the approximate rank of A.
trans = trans(:) (Output [/Input])
Array of size 2 * min(m, n) that contains data for the construction of the orthogonal
decomposition.
det = det(1:2) (Output)
Array of size 2 of the same type and kind as A for representing the products of the
determinants of the matrices Q, P, and R. The determinant is represented by two
numbers. The first is the base with the sign or complex angle of the result. The second
is the exponent. When det(2) is within exponent range, the value of this expression is
given by abs(det(1))**det(2) * (det(1))/abs(det(1)). If the matrix is not singular,
abs(det(1)) = radix(det); otherwise, det(1) = 0., and det(2) = huge(abs(det(1))).
ainv = ainv(:,:) (Output)
Array with size n × m of the same type and kind as A(1:m, 1:n). It contains the
generalized inverse matrix, $A^\dagger$.
cov = cov(:,:) (Output)
Array with size n × n of the same type and kind as A(1:m, 1:n). It contains the
unscaled covariance matrix, $C = (A^T A)^{-1}$.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix; used for passing
optional data to the routine. The options are as follows:
Packaged Options for lin_sol_lsq
Option Prefix = ?
Option Name
Option Value
lin_sol_lsq_set_small
lin_sol_lsq_save_QR
lin_sol_lsq_solve_A
lin_sol_lsq_solve_ADJ
lin_sol_lsq_no_row_pivoting
lin_sol_lsq_no_col_pivoting
lin_sol_lsq_scan_for_NaN
lin_sol_lsq_no_sing_mess
iopt(IO) = ?_options(?_lin_sol_lsq_set_small, Small)
Replaces with Small if a diagonal term of the matrix R is smaller in magnitude than the
value Small. A solution is approximated based on this replacement in either case.
Default: the smallest number that can be reciprocated safely
iopt(IO) = ?_options(?_lin_sol_lsq_save_QR, ?_dummy)
Saves the factorization of A. Requires the optional arguments pivots= and
trans= if the routine is used for solving further systems with the same matrix. This
is the only case where the input arrays A and b are not saved. For efficiency, the
diagonal reciprocals of the matrix R are saved in the diagonal entries of A.
iopt(IO) = ?_options(?_lin_sol_lsq_solve_A, ?_dummy)
Uses the factorization of A computed and saved to solve Ax = b.
iopt(IO) = ?_options(?_lin_sol_lsq_solve_ADJ, ?_dummy)
Uses the factorization of A computed and saved to solve $A^T x = b$.
iopt(IO) = ?_options(?_lin_sol_lsq_no_row_pivoting, ?_dummy)
Does no row pivoting. The array pivots(:), if present, satisfies pivots(i) = i for
i = 1, ..., min(m, n).
iopt(IO) = ?_options(?_lin_sol_lsq_no_col_pivoting, ?_dummy)
Does no column pivoting. The array pivots(:), if present, satisfies
pivots(i + min(m, n)) = i for i = 1, ..., min(m, n).
iopt(IO) = ?_options(?_lin_sol_lsq_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) .or. isNan(b(i,j)) ==.true.
FORTRAN 90 Interface
Generic:
Specific:
Description
Routine LIN_SOL_LSQ solves a rectangular system of linear algebraic equations in a least-squares
sense. It computes the decomposition of A using an orthogonal factorization. This decomposition
has the form
$$QAP = \begin{bmatrix} R_{k \times k} & 0 \\ 0 & 0 \end{bmatrix}$$
where the matrices Q and P are products of elementary orthogonal and permutation matrices. The
matrix R is k × k, where k is the approximate rank of A. This value is determined by the value of
the parameter Small. See Golub and Van Loan (1989, Chapter 5.4) for further details. Note that the
use of both row and column pivoting is nonstandard, but the routine defaults to this choice for enhanced reliability.
is a real matrix with m > n. The least-squares problem is derived from polynomial data fitting to
the function
$$y(x) = e^x + \cos\left(\frac{\pi x}{2}\right)$$
using a discrete set of values in the interval $-1 \le x \le 1$. The polynomial is represented as the series
$$u(x) = \sum_{i=0}^{N} c_i T_i(x)$$
where the $T_i(x)$ are Chebyshev polynomials. It is natural for the problem matrix and solution to
have a column or entry corresponding to the subscript zero, which is used in this code. Also, see
operator_ex09, supplied with the product examples.
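The zero-based column indexing pairs naturally with the three-term Chebyshev recurrence. The following is a minimal standalone sketch, assuming a small equally spaced grid chosen only for illustration; it builds the columns $T_0, \ldots, T_n$ and compares them with the closed form $\cos(i \arccos x)$. The program name and sizes are not part of the library example.

```fortran
program chebyshev_sketch
  implicit none
  integer, parameter :: m = 5, n = 4
  integer :: i
  real(kind(1d0)) :: x(m), a(m,0:n), err

  ! Equally spaced sample points on [-1,1] (illustrative grid only).
  do i = 1, m
    x(i) = -1d0 + 2d0*real(i-1, kind(1d0))/real(m-1, kind(1d0))
  end do

  ! Column i holds T_i(x), built with T_i = 2 x T_{i-1} - T_{i-2}.
  a(:,0) = 1d0
  a(:,1) = x
  do i = 2, n
    a(:,i) = 2*x*a(:,i-1) - a(:,i-2)
  end do

  ! Compare with the closed form T_i(x) = cos(i*acos(x)).
  err = 0d0
  do i = 0, n
    err = max(err, maxval(abs(a(:,i) - cos(i*acos(x)))))
  end do
  write (*,*) 'largest deviation from cos(i*acos(x)) = ', err
end program chebyshev_sketch
```

Declaring the matrix as a(m,0:n) lets the column index equal the polynomial degree, so no index shift is needed in the recurrence.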
use lin_sol_lsq_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 1 for LIN_SOL_LSQ.
integer i
integer, parameter :: m=128, n=8
real(kind(1d0)), parameter :: one=1d0, zero=0d0
real(kind(1d0)) A(m,0:n), c(0:n,1), pi_over_2, x(m), y(m,1), &
u(m), v(m), w(m), delta_x
! Generate a random grid of points.
call rand_gen(x)
! Transform points to the interval -1,1.
x = x*2 - one
! Compute the constant 'PI/2'.
pi_over_2 = atan(one)*2
! Generate known function data on the grid.
y(1:m,1) = exp(x) + cos(pi_over_2*x)
! Fill in the least-squares matrix for the Chebyshev polynomials.
A(:,0) = one; A(:,1) = x
do i=2, n
A(:,i) = 2*x*A(:,i-1) - A(:,i-2)
end do
! Solve for the series coefficients.
call lin_sol_lsq(A, y, c)
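! Generate an array of equally spaced points on the interval -1,1.
delta_x = 2/real(m-1,kind(one))
do i=1, m
x(i) = -one + (i-1)*delta_x
end do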
Output
Example 1 for LIN_SOL_LSQ is correct.
Additional Examples
Example 2: System Solving with the Generalized Inverse
This example solves the same form of the system as Example 1. In this case, the grid of evaluation
points is equally spaced. The coefficients are computed using the smoothing formulas by rows
of the generalized inverse matrix, $A^\dagger$, computed using the optional argument ainv=. Thus, the
coefficients are given by the matrix-vector product $c = (A^\dagger) y$, where y is the vector of values of
the function y(x) evaluated at the grid of points. Also, see operator_ex10, supplied with the
product examples.
use lin_sol_lsq_int
implicit none
! This is Example 2 for LIN_SOL_LSQ.
integer i
integer, parameter :: m=128, n=8
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
real(kind(1d0)) a(m,0:n), c(0:n,1), pi_over_2, x(m), y(m,1), &
u(m), v(m), w(m), delta_x, inv(0:n, m)
! Generate an array of equally spaced points on the interval -1,1.
delta_x = 2/real(m-1,kind(one))
do i=1, m
x(i) = -one + (i-1)*delta_x
end do
! Compute the constant 'PI/2'.
pi_over_2 = atan(one)*2
! Compute data values on the grid.
y(1:m,1) = exp(x) + cos(pi_over_2*x)
! Fill in the least-squares matrix for the Chebyshev polynomials.
a(:,0) = one
a(:,1) = x
do i=2, n
a(:,i) = 2*x*a(:,i-1) - a(:,i-2)
end do
! Compute the generalized inverse of the least-squares matrix.
call lin_sol_lsq(a, y, c, nrhs=0, ainv=inv)
! Compute the series coefficients using the generalized inverse
! as 'smoothing formulas.'
c(0:n,1) = matmul(inv(0:n,1:m),y(1:m,1))
! Evaluate residuals using backward recurrence formulas.
u = zero
v = zero
do i=n, 0, -1
w = 2*x*u - v + c(i,1)
v = u
u = w
end do
y(1:m,1) = exp(x) + cos(pi_over_2*x) - (u-x*v)
! Check that n+2 sign changes in the residual curve occur.
! (This test will fail when n is larger.)
x = one
x = sign(x,y(1:m,1))
if (count(x(1:m-1) /= x(2:m)) == n+2) then
write (*,*) 'Example 2 for LIN_SOL_LSQ is correct.'
end if
end
Output
Example 2 for LIN_SOL_LSQ is correct.
$$f(p) = \sum_{j=1}^{n} c_j \left( \| p - q_j \|^2 + \delta^2 \right)^{1/2}$$
where $\delta^2$ is a parameter. This example uses $\delta^2 = 1$, but either larger or smaller values can give a
better approximation for user problems. The coefficients {cj} are obtained by solving the
following m × n linear least-squares problem:
$$f(p_j) = y_j$$
This example illustrates an effective use of Fortran 90 array operations to eliminate many details
required to build the matrix and right-hand side for the {cj} . For this example, the two sets of
points {pi} and {qj} are chosen randomly. The values {yj} are computed from the following
formula:
$$y_j = e^{-\| p_j \|^2}$$
The residual function
$$r(p) = e^{-\| p \|^2} - f(p)$$
is computed at an N × N square grid of equally spaced points on the unit square. The magnitude of
r(p) may be larger at certain points on this grid than the residuals at the given points, {pi}. Also,
see operator_ex11, supplied with the product examples.
use lin_sol_lsq_int
use rand_gen_int
implicit none
! This is Example 3 for LIN_SOL_LSQ.
integer i, j
integer, parameter :: m=128, n=32, k=2, n_eval=16
real(kind(1d0)), parameter :: one=1.0d0, delta_sqr=1.0d0
real(kind(1d0)) a(m,n), b(m,1), c(n,1), p(k,m), q(k,n), &
x(k*m), y(k*n), t(k,m,n), res(n_eval,n_eval), &
w(n_eval), delta
! Generate a random set of data points in k=2 space.
call rand_gen(x)
p = reshape(x,(/k,m/))
! Generate a random set of center points in k-space.
call rand_gen(y)
q = reshape(y,(/k,n/))
! Compute the coefficient matrix for the least-squares system.
t = spread(p,3,n)
do j=1, n
t(1:,:,j) = t(1:,:,j) - spread(q(1:,j),2,m)
end do
a = sqrt(sum(t**2,dim=1) + delta_sqr)
! Compute the right hand side of data values.
b(1:,1) = exp(-sum(p**2,dim=1))
! Compute the solution.
call lin_sol_lsq(a, b, c)
! Check the results.
if (sum(abs(matmul(transpose(a),b-matmul(a,c))))/sum(abs(a)) &
<= sqrt(epsilon(one))) then
write (*,*) 'Example 3 for LIN_SOL_LSQ is correct.'
end if
! Evaluate residuals, known function - approximation at a square
! grid of points. (This evaluation is only for k=2.)
delta = one/real(n_eval-1,kind(one))
do i=1, n_eval
w(i) = (i-1)*delta
end do
res = exp(-(spread(w,1,n_eval)**2 + spread(w,2,n_eval)**2))
do j=1, n
res = res - c(j,1)*sqrt((spread(w,1,n_eval) - q(1,j))**2 + &
(spread(w,2,n_eval) - q(2,j))**2 + delta_sqr)
end do
end
Output
Example 3 for LIN_SOL_LSQ is correct.
$\varepsilon^{-1/2}$,
where $\varepsilon$ is the machine precision, but any larger value can be used. The fact that lin_sol_lsq
performs row pivoting in this case is critical for obtaining an accurate solution to the constrained
problem solved using weighting. See Golub and Van Loan (1989, Chapter 12) for more
information about this method. Also, see operator_ex12, supplied with the product examples.
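The weighting idea can be written as a single unconstrained problem. The following is a sketch inferred from the example code, which appends one heavily weighted row of ones to the matrix and the same weight to the right-hand side; the symbol $\omega$ for the weight is notation introduced here.

```latex
\[
\min_{x} \;
\left\|
\begin{bmatrix} A \\ \omega\, e^T \end{bmatrix} x
-
\begin{bmatrix} b \\ \omega \end{bmatrix}
\right\|_2^2
\;=\;
\min_{x} \; \| A x - b \|_2^2 + \omega^2 \Bigl( \sum_{i=1}^{n} x_i - 1 \Bigr)^2,
\qquad
e = (1, \ldots, 1)^T, \quad \omega = \varepsilon^{-1/2} .
\]
```

Because $\omega^2 = 1/\varepsilon$ is very large, any violation of the constraint $\sum_i x_i = 1$ dominates the objective, so the minimizer satisfies the constraint to roughly working accuracy, which is what the example's final test checks.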
use lin_sol_lsq_int
use rand_gen_int
implicit none
! This is Example 4 for LIN_SOL_LSQ.
integer, parameter :: m=64, n=32
real(kind(1e0)), parameter :: one=1.0e0
real(kind(1e0)) :: a(m+1,n), b(m+1,1), x(n,1), y(m*n)
call rand_gen(b(1:m,1))
! Heavily weight desired constraint.
a(m+1,1:n) = one/sqrt(epsilon(one))
b(m+1,1) = one/sqrt(epsilon(one))
call lin_sol_lsq(a, b, x)
if (abs(sum(x) - one)/sum(abs(x)) <= &
sqrt(epsilon(one))) then
write (*,*) 'Example 4 for LIN_SOL_LSQ is correct.'
end if
end
Output
Example 4 for LIN_SOL_LSQ is correct.
LIN_SOL_SVD
Solves a rectangular least-squares system of linear equations $Ax \cong b$ using singular value
decomposition
$$A = USV^T$$
With optional arguments, any of several related computations can be performed. These extra tasks
include computing the rank of A, the orthogonal m × m and n × n matrices U and V, and the m × n
diagonal matrix of singular values, S.
Required Arguments
A Array of size m × n containing the matrix. (Input [/Output])
If the packaged option lin_sol_svd_overwrite_input is used, this array is not
saved on output.
B Array of size m × nb containing the right-hand side matrix. (Input [/Output])
If the packaged option lin_sol_svd_overwrite_input is used, this array is not
saved on output.
X Array of size n × nb containing the solution matrix. (Output)
Optional Arguments
MROWS = m (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: m = size (A, 1)
NCOLS = n (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: n = size(A, 2)
NRHS = nb (Input)
Uses the array b(1:, 1:nb) for the input right-hand side matrix.
Default: nb = size(b, 2)
Note that b must be a rank-2 array.
RANK = k (Output)
Number of singular values that are at least as large as the value Small. It will satisfy k
<= min(m, n).
u = u(:,:) (Output)
Array of the same type and kind as A(1:m, 1:n). It contains the m × m orthogonal
matrix U of the singular value decomposition.
s = s(:) (Output)
Array of the same precision as A(1:m, 1:n). This array is real even when the matrix
data is complex. It contains the m × n diagonal matrix S in a rank-1 array. The singular
values are nonnegative and ordered non-increasing.
v = v(:,:) (Output)
Array of the same type and kind as A(1:m, 1:n). It contains the n × n orthogonal matrix
V.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix. Used for passing
optional data to the routine. The options are as follows:
Packaged Options for lin_sol_svd
Option Prefix = ?
Option Name
Option Value
lin_sol_svd_set_small
lin_sol_svd_overwrite_input
lin_sol_svd_safe_reciprocal
lin_sol_svd_scan_for_NaN
iopt(IO) = ?_options(?_lin_sol_svd_set_small, Small)
Replaces with zero a diagonal term of the matrix S if it is smaller in magnitude than the
value Small. This determines the approximate rank of the matrix, which is returned as
the rank= optional argument. A solution is approximated based on this
replacement.
Default: the smallest number that can be safely reciprocated
iopt(IO) = ?_options(?_lin_sol_svd_overwrite_input,?_dummy)
Does not save the input arrays A(:,:) and b(:,:).
iopt(IO) = ?_options(?_lin_sol_svd_safe_reciprocal, safe)
Replaces a denominator term with safe if it is smaller in magnitude than the value safe.
Default: the smallest number that can be safely reciprocated
iopt(IO) = ?_options(?_lin_sol_svd_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) .or. isNan(b(i,j)) ==.true.
See the isNaN() function, Chapter 10.
Default: Does not scan for NaNs
FORTRAN 90 Interface
Generic:
Specific:
Description
Routine LIN_SOL_SVD solves a rectangular system of linear algebraic equations in a least-squares
sense. It computes the factorization of A known as the singular value decomposition. This
decomposition has the following form:
$$A = USV^T$$
The matrices U and V are orthogonal. The matrix S is diagonal with the diagonal terms non-increasing. See Golub and Van Loan (1989, Chapters 5.4 and 5.5) for further details.
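Given the decomposition, a least-squares solution follows from the factors directly. The following is a sketch of the standard formula, assuming that only the k singular values judged nonzero (those at least as large as Small) are used; $u_j$ and $v_j$ denote columns of U and V, and $s_j$ the singular values.

```latex
\[
x \;=\; V S^{\dagger} U^T b
  \;=\; \sum_{j=1}^{k} \frac{u_j^T b}{s_j}\, v_j ,
\qquad
S^{\dagger} = \operatorname{diag}\bigl(s_1^{-1}, \ldots, s_k^{-1}, 0, \ldots, 0\bigr)^T .
\]
```

Dropping the terms with negligible $s_j$ is what keeps the computed solution bounded when A is rank deficient or nearly so.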
Output
Example 1 for LIN_SOL_SVD is correct.
Additional Examples
Example 2: Polar Decomposition of a Square Matrix
A polar decomposition of an n × n random matrix is obtained. This decomposition satisfies
A = PQ, where P is orthogonal and Q is self-adjoint and positive definite.
Given the singular value decomposition
$$A = USV^T$$
the polar decomposition follows from the matrix products
$$P = UV^T \quad \text{and} \quad Q = VSV^T$$
This example uses the optional arguments u=, s=, and v=, then array intrinsic functions to
calculate P and Q. Also, see operator_ex14, Chapter 10.
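That these products give a polar decomposition can be verified directly from the orthogonality of U and V. A short check, assuming a real square matrix with positive singular values:

```latex
\[
\begin{aligned}
PQ &= (U V^T)(V S V^T) = U (V^T V) S V^T = U S V^T = A, \\
P^T P &= V U^T U V^T = V V^T = I, \\
Q^T &= (V S V^T)^T = V S^T V^T = V S V^T = Q, \\
z^T Q z &= (V^T z)^T S\, (V^T z) > 0 \quad \text{for } z \neq 0 .
\end{aligned}
\]
```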
use lin_sol_svd_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_SOL_SVD.
integer i
integer, parameter :: n=32
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
Output
Example 2 for LIN_SOL_SVD is correct.
is computed, where S is of low rank. Approximations using fewer of these nonzero singular values
and vectors suffice to reconstruct A. Also, see operator_ex15, supplied with the product
examples.
use lin_sol_svd_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 3 for LIN_SOL_SVD.
integer i, j, k
integer, parameter :: n=32
real(kind(1e0)), parameter :: half=0.5e0, one=1e0, zero=0e0
real(kind(1e0)) a(n,n), b(n,0), x(n,0), s(n), u(n,n), &
v(n,n), c(n,n)
! Fill in value one for points inside the circle.
a = zero
do i=1, n
do j=1, n
if ((i-n/2)**2 + (j-n/2)**2 <= (n/4)**2) a(i,j) = one
end do
end do
! Compute the singular value decomposition.
call lin_sol_svd(a, b, x, nrhs=0,&
s=s, u=u, v=v)
! How many terms, to the nearest integer, exactly
! match the circle?
c = zero; k = count(s > half)
do i=1, k
c = c + spread(u(1:n,i),2,n)*spread(v(1:n,i),1,n)*s(i)
if (count(int(c-a) /= 0) == 0) exit
end do
if (i < k) then
write (*,*) 'Example 3 for LIN_SOL_SVD is correct.'
end if
end
Output
Example 3 for LIN_SOL_SVD is correct.
$$\int_0^1 e^{-st} f(t)\,dt = s^{-1}(1 - e^{-s}) = g(s)$$
The unknown function f(t) = 1 is computed. This problem is equivalent to the numerical inversion
of the Laplace Transform of the function g(s) using real values of t and s, solving for a function
that is nonzero only on the unit interval. The evaluation of the integral uses the following
approximate integration rule:
approximate integration rule:
$$\int_0^1 f(t) e^{-st}\,dt = \sum_{j=1}^{n} f(t_j) \int_{t_j}^{t_{j+1}} e^{-st}\,dt$$
$$t_j = \frac{j - 1}{n}$$
The points {s j } are computed so that the range of g(s) is uniformly sampled. This requires the
solution of m equations
$$g(s_i) = g_i = \frac{i}{m + 1}$$
for j = 1, ..., n and i = 1, ..., m. Fortran 90 array operations are used to solve for the collocation
points {si} as a single series of steps. Newton's method,
$$s \leftarrow s - \frac{h}{h'}$$
whose entry at the intersection of row i and column j is equal to the value
$$\int_{t_j}^{t_{j+1}} e^{-s_i t}\,dt$$
is explicitly integrated and evaluated as an array operation. The solution analysis of the resulting
linear least-squares system
$$Af \cong g$$
$$b = U^T g$$
followed by using as few of the largest singular values as possible to minimize the following
squared error residual:
$$\sum_{j=1}^{n} (1 - f_j)^2$$
$$f = \sum_{j=1}^{k} b_j \frac{v_j}{s_j}$$
a = (exp(-spread(t(1:n),1,m)*spread(s,2,n)) &
- exp(-spread(t(2:n+1),1,m)*spread(s,2,n))) / &
spread(s,2,n)
b(1:,1)=g
! Compute the singular value decomposition.
call lin_sol_svd(a, b, f, nrhs=0, &
rank=k, u=U_S, v=V_S, s=S_S)
! Singular values that are larger than epsilon determine
! the rank=k.
k = count(S_S > epsilon(one))
oldrms = huge(one)
g = matmul(transpose(U_S), b(1:m,1))
! Find the minimum number of singular values that gives a good
! approximation to f(t) = 1.
do i=1,k
f(1:n,1) = matmul(V_S(1:,1:i), g(1:i)/S_S(1:i))
f = f - one
rms = sum(f**2)/n
if (rms > oldrms) exit
oldrms = rms
end do
write (*,"( ' Using this number of singular values, ', &
&i4 / ' the approximate R.M.S. error is ', 1pe12.4)") &
i-1, oldrms
if (sqrt(oldrms) <= delta_t**2) then
write (*,*) 'Example 4 for LIN_SOL_SVD is correct.'
end if
end
Output
Example 4 for LIN_SOL_SVD is correct.
LIN_SOL_TRI
Solves multiple systems of linear equations
$$A_j x_j = y_j, \quad j = 1, \ldots, k$$
Each matrix Aj is tridiagonal with the same dimension, n. The default solution method is based on
LU factorization computed using cyclic reduction or, optionally, Gaussian elimination with partial
pivoting.
Required Arguments
C Array of size 2n × k containing the upper diagonals of the matrices Aj. Each upper
diagonal is entered in array locations C(1:n − 1, j). The data C(n, 1:k) are not used.
(Input [/Output])
The input data is overwritten. See note below.
D Array of size 2n × k containing the diagonals of the matrices Aj. Each diagonal is
entered in array locations D(1:n, j). (Input [/Output])
The input data is overwritten. See note below.
B Array of size 2n × k containing the lower diagonals of the matrices Aj. Each lower
diagonal is entered in array locations B(2:n, j). The data
B(1, 1:k) are not used. (Input [/Output])
The input data is overwritten. See note below.
Y Array of size 2n × k containing the right-hand sides, yj. Each right-hand side is entered
in array locations Y(1:n, j). The computed solution xj is returned in locations Y(1:n, j).
(Input [/Output])
NOTE: The required arguments have the Input data overwritten. If these quantities are
used later, they must be saved in user-defined arrays. The routine uses each array's
locations (n + 1:2 * n, 1:k) for scratch storage and intermediate data in the LU
factorization. The default values for problem dimensions are n = (size (D, 1))/2 and
k = size (D, 2).
Optional Arguments
NCOLS = n (Input)
Uses arrays C(1:n − 1, 1:k), D(1:n, 1:k), and B(2:n, 1:k) as the upper, main and
lower diagonals for the input tridiagonal matrices. The right-hand sides and solutions
are in array Y(1:n, 1:k). Note that each of these arrays is rank-2.
Default: n = (size(D, 1))/2
NPROB = k (Input)
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix. Used for passing
optional data to the routine. The options are as follows:
Packaged Options for LIN_SOL_TRI
Option Prefix = ?: s_, d_, c_, z_
Option Name                    Option Value
lin_sol_tri_set_small          1
lin_sol_tri_set_jolt
lin_sol_tri_scan_for_NaN
lin_sol_tri_factor_only
lin_sol_tri_solve_only
lin_sol_tri_use_Gauss_elim
iopt(IO) = ?_options(?_lin_sol_tri_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(C(i,j)) .or.
isNaN(D(i,j)) .or.
isNaN(B(i,j)) .or.
isNaN(Y(i,j)) == .true.
iopt(IO) = ?_options(?_lin_sol_tri_factor_only, ?_dummy)
Obtain the LU factorization of the matrices Aj. Does not solve for a solution.
Default: Factor the matrices and solve the systems.
iopt(IO) = ?_options(?_lin_sol_tri_solve_only, ?_dummy)
iopt(IO) = ?_options(?_lin_sol_tri_use_Gauss_elim, ?_dummy)
The accuracy, numerical stability or efficiency of the cyclic reduction algorithm may
be inferior to the use of LU factorization with partial pivoting.
Default: Use cyclic reduction to compute the factorization.
FORTRAN 90 Interface
Generic:
Specific:
Description
Routine lin_sol_tri solves k systems of tridiagonal linear algebraic equations, each problem of
dimension n × n. No relation between k and n is required. See Kershaw, pages 86–88 in Rodrigue
(1982) for further details. To deal with poorly conditioned or singular systems, a specific
regularizing term is added to each reciprocated value. This technique keeps the factorization
process efficient and avoids exceptions from overflow or division by zero. Each occurrence of an
array reciprocal $a^{-1}$ is replaced by the expression $(a + t)^{-1}$, where the array temporary t has the
value 0 whenever the corresponding entry satisfies |a| > Small. Alternately, t has the value 2 × jolt.
(Every small denominator gives rise to a finite jolt.) Since this tridiagonal solver is used in the
routines lin_svd and lin_eig_self for inverse iteration, regularization is required. Users can
reset the values of Small and jolt for their own needs. Using the default values for these
parameters, it is generally necessary to scale the tridiagonal matrix so that the maximum
magnitude has value approximately one. This is normally not an issue when the systems are
nonsingular.
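The regularized reciprocal described above can be sketched with array operations. This is a minimal illustration, not the library's internal code: the threshold values small and jolt below are hypothetical, chosen only so the effect is visible, and they are not the library defaults.

```fortran
program regularized_reciprocal
  implicit none
  integer, parameter :: dp = kind(1d0)
  ! Hypothetical thresholds for illustration only (not the library defaults).
  real(dp), parameter :: small = 1.0e-10_dp, jolt = 1.0e-8_dp
  real(dp) :: a(4), t(4), r(4)

  a = (/ 2.0_dp, 0.0_dp, -1.0e-12_dp, -0.5_dp /)

  ! t is 0 where |a| > small, and 2*jolt for every small denominator.
  t = merge(0.0_dp, 2.0_dp*jolt, abs(a) > small)

  ! Each reciprocal 1/a becomes 1/(a + t). Because small < 2*jolt in this
  ! sketch, a + t cannot vanish, so no division by zero occurs.
  r = 1.0_dp/(a + t)

  write (*,*) 'regularized reciprocals: ', r
end program regularized_reciprocal
```

Entries with large magnitude are reciprocated unchanged; the zero and the tiny entry each receive a finite "jolt" instead of producing an overflow or an exception.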
The routine is designed to use cyclic reduction as the default method for computing the LU
factorization. Using an optional parameter, standard elimination and partial pivoting will be used
to compute the factorization. Partial pivoting is numerically stable but is likely to be less efficient
than cyclic reduction.
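To make the diagonal storage convention concrete, the following sketch builds one small tridiagonal system with the upper diagonal in c(1:n-1), the main diagonal in d(1:n), and the lower diagonal in b(2:n), forms y = A*x with EOSHIFT, and recovers x by plain elimination without pivoting. It is an assumption-laden illustration: it uses neither cyclic reduction nor partial pivoting, and it relies on the test matrix being diagonally dominant.

```fortran
program tridiagonal_storage
  implicit none
  integer, parameter :: dp = kind(1d0), n = 5
  real(dp) :: c(n), d(n), b(n), x(n), y(n), z(n), w(n), m
  integer :: i

  ! Upper diagonal in c(1:n-1), main in d(1:n), lower in b(2:n).
  c = -1.0_dp; c(n) = 0.0_dp
  b = -1.0_dp; b(1) = 0.0_dp
  d = 4.0_dp
  x = (/ (real(i, dp), i = 1, n) /)

  ! Row i of A*x is b(i)*x(i-1) + d(i)*x(i) + c(i)*x(i+1).
  y = d*x + c*eoshift(x, shift=+1) + b*eoshift(x, shift=-1)

  ! Forward elimination (no pivoting; safe here by diagonal dominance).
  w = d; z = y
  do i = 2, n
     m = b(i)/w(i-1)
     w(i) = w(i) - m*c(i-1)
     z(i) = z(i) - m*z(i-1)
  end do

  ! Back substitution.
  z(n) = z(n)/w(n)
  do i = n-1, 1, -1
     z(i) = (z(i) - c(i)*z(i+1))/w(i)
  end do

  write (*,*) 'max error in recovered x = ', maxval(abs(z - x))
end program tridiagonal_storage
```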
Output
Example 1 for LIN_SOL_TRI is correct.
Additional Examples
Example 2: Iterative Refinement and Use of Partial Pivoting
This program unit shows usage that typically gives acceptable accuracy for a large class of
problems. Our goal is to use the efficient cyclic reduction algorithm when possible, and to keep
using it unless it fails. In exceptional cases our program switches to the LU factorization with
partial pivoting. Using both factorization and solution methods enhances reliability and
maintains efficiency on average. Also, see operator_ex18, supplied with the product
examples.
use lin_sol_tri_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_SOL_TRI.
integer i, nopt
integer, parameter :: n=128
real(kind(1e0)), parameter :: s_one=1e0, s_zero=0e0
real(kind(1d0)), parameter :: d_one=1d0, d_zero=0d0
real(kind(1e0)), dimension(2*n,n) :: d, b, c, res(n,n), &
x, y
real(kind(1e0)) change_new, change_old, err
type(s_options) :: iopt(2) = s_options(0,s_zero)
real(kind(1d0)), dimension(n,n) :: d_save, b_save, c_save, &
x_save, y_save, x_sol
logical solve_only
c = s_zero; d=s_zero; b=s_zero; x=s_zero
! Generate the upper, main, and lower diagonals of the
! matrices A. A random vector x is used to construct the
! right-hand sides: y=A*x.
do i = 1, n
call rand_gen (c(1:n,i))
call rand_gen (d(1:n,i))
call rand_gen (b(1:n,i))
call rand_gen (x(1:n,i))
end do
! Save double precision copies of the diagonals and the
! right-hand side.
c_save = c(1:n,1:n); d_save = d(1:n,1:n)
b_save = b(1:n,1:n); x_save = x(1:n,1:n)
y_save(1:n,1:n) = d(1:n,1:n)*x_save + &
c(1:n,1:n)*EOSHIFT(x_save,SHIFT=+1,DIM=1) + &
b(1:n,1:n)*EOSHIFT(x_save,SHIFT=-1,DIM=1)
! Iterative refinement loop.
factorization_choice: do nopt=0, 1
! Set the logical to flag the first time through.
solve_only = .false.
x_sol = d_zero
change_old = huge(s_one)
iterative_refinement: do
Output
Example 2 for LIN_SOL_TRI is correct.
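The iterative refinement idea used in Example 2 can be sketched independently of the library. The program below is a simplified stand-in, not the example's code: the helper solve_sp is a hypothetical single-precision elimination routine written for this sketch (lin_sol_tri is not called), the residual is accumulated in double precision, and the loop stops when the size of the correction no longer decreases.

```fortran
program refine_tridiagonal
  implicit none
  integer, parameter :: sp = kind(1e0), dp = kind(1d0), n = 6
  real(sp) :: c(n), d(n), b(n), dx(n), change_new, change_old
  real(dp) :: cs(n), ds(n), bs(n), y(n), r(n), x(n)
  integer :: iter

  ! A diagonally dominant test matrix; double precision copies are saved.
  c = 1.0_sp; c(n) = 0.0_sp
  b = 2.0_sp; b(1) = 0.0_sp
  d = 5.0_sp
  cs = c; ds = d; bs = b
  y = 1.0_dp
  x = 0.0_dp
  change_old = huge(1.0_sp)

  do iter = 1, 20
     ! Residual r = y - A*x evaluated in double precision.
     r = y - (ds*x + cs*eoshift(x, shift=+1) + bs*eoshift(x, shift=-1))
     ! Correction computed in single precision.
     call solve_sp(c, d, b, real(r, sp), dx)
     x = x + dx
     change_new = sum(abs(dx))
     if (change_new >= change_old) exit   ! corrections stopped shrinking
     change_old = change_new
  end do

  r = y - (ds*x + cs*eoshift(x, shift=+1) + bs*eoshift(x, shift=-1))
  write (*,*) 'refinement steps = ', iter, '  max residual = ', maxval(abs(r))

contains

  ! Hypothetical helper: tridiagonal elimination without pivoting.
  subroutine solve_sp(c, d, b, rhs, sol)
    real(sp), intent(in) :: c(:), d(:), b(:), rhs(:)
    real(sp), intent(out) :: sol(:)
    real(sp) :: w(size(d)), m
    integer :: i, k
    k = size(d)
    w = d; sol = rhs
    do i = 2, k
       m = b(i)/w(i-1)
       w(i) = w(i) - m*c(i-1)
       sol(i) = sol(i) - m*sol(i-1)
    end do
    sol(k) = sol(k)/w(k)
    do i = k-1, 1, -1
       sol(i) = (sol(i) - c(i)*sol(i+1))/w(i)
    end do
  end subroutine solve_sp
end program refine_tridiagonal
```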
The eigenvalues $\lambda_1, \ldots, \lambda_n$
of a tridiagonal real, self-adjoint matrix are computed. Note that the computation is performed
using the IMSL MATH/LIBRARY FORTRAN 77 interface to routine EVASB. The user may write
this interface based on documentation of the arguments (IMSL 2003, p. 480), or use the module
Numerical_Libraries as we have done here. The eigenvectors corresponding to k < n of the
eigenvalues are required. These vectors are computed using inverse iteration for all the
eigenvalues at one step. See Golub and Van Loan (1989, Chapter 7). The eigenvectors are then
orthogonalized. Also, see operator_ex19, supplied with the product examples.
use lin_sol_tri_int
use rand_gen_int
use Numerical_Libraries
implicit none
! This is Example 3 for LIN_SOL_TRI.
integer i, j, nopt
integer, parameter :: n=128, k=n/4, ncoda=1, lda=2
real(kind(1e0)), parameter :: s_one=1e0, s_zero=0e0
real(kind(1e0)) A(lda,n), EVAL(k)
type(s_options) :: iopt(2)=s_options(0,s_zero)
real(kind(1e0)) d(n), b(n), d_t(2*n,k), c_t(2*n,k), perf_ratio, &
b_t(2*n,k), y_t(2*n,k), eval_t(k), res(n,k), temp
logical small
! This flag is used to get the k largest eigenvalues.
small = .false.
! Generate the main diagonal and the co-diagonal of the
! tridiagonal matrix.
call rand_gen (b)
call rand_gen (d)
A(1,1:)=b; A(2,1:)=d
! Use Numerical Libraries routine for the calculation of k
! largest eigenvalues.
CALL EVASB (N, K, A, LDA, NCODA, SMALL, EVAL)
EVAL_T = EVAL
! Use DNFL tridiagonal solver for inverse iteration
! calculation of eigenvectors.
factorization_choice: do nopt=0,1
! Create k tridiagonal problems, one for each inverse
! iteration system.
b_t(1:n,1:k) = spread(b,DIM=2,NCOPIES=k)
c_t(1:n,1:k) = EOSHIFT(b_t(1:n,1:k),SHIFT=1,DIM=1)
d_t(1:n,1:k) = spread(d,DIM=2,NCOPIES=k) - &
spread(EVAL_T,DIM=1,NCOPIES=n)
! Start the right-hand side at random values, scaled downward
! to account for the expected 'blowup' in the solution.
do i=1, k
call rand_gen (y_t(1:n,i))
end do
! Do two iterations for the eigenvectors.
do i=1, 2
y_t(1:n,1:k) = y_t(1:n,1:k)*epsilon(s_one)
call lin_sol_tri(c_t, d_t, b_t, y_t, &
iopt=iopt)
iopt(nopt+1) = s_options(s_lin_sol_tri_solve_only,s_zero)
end do
! Orthogonalize the eigenvectors. (This is the most
! intensive part of the computing.)
do j=1,k-1 ! Forward sweep of HMGS orthogonalization.
temp=s_one/sqrt(sum(y_t(1:n,j)**2))
y_t(1:n,j)=y_t(1:n,j)*temp
y_t(1:n,j+1:k)=y_t(1:n,j+1:k)+ &
spread(-matmul(y_t(1:n,j),y_t(1:n,j+1:k)), &
DIM=1,NCOPIES=n)* spread(y_t(1:n,j),DIM=2,NCOPIES=k-j)
end do
temp=s_one/sqrt(sum(y_t(1:n,k)**2))
y_t(1:n,k)=y_t(1:n,k)*temp
do j=k-1,1,-1 ! Backward sweep of HMGS.
y_t(1:n,j+1:k)=y_t(1:n,j+1:k)+ &
spread(-matmul(y_t(1:n,j),y_t(1:n,j+1:k)), &
DIM=1,NCOPIES=n)* spread(y_t(1:n,j),DIM=2,NCOPIES=k-j)
end do
!
!
!
!
!
!
!
!
end do factorization_choice
if (perf_ratio <= s_one) then
write (*,*) 'Example 3 for LIN_SOL_TRI is correct.'
end if
end
Output
Example 3 for LIN_SOL_TRI is correct.
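The orthogonalization step can be illustrated with a single forward sweep of modified Gram-Schmidt. This is a simplified sketch: Example 3 uses forward and backward sweeps with array intrinsics, whereas the program below uses explicit loops on a small, arbitrarily chosen set of columns and then checks that the result is orthonormal.

```fortran
program mgs_sketch
  implicit none
  integer, parameter :: dp = kind(1d0), n = 4, k = 3
  real(dp) :: y(n,k), g(k,k)
  integer :: i, j

  ! Arbitrary linearly independent columns, stored column by column.
  y = reshape(real((/ 1,1,0,0,  1,0,1,0,  1,0,0,1 /), dp), (/ n, k /))

  do j = 1, k
     ! Normalize column j, then remove its component from later columns.
     y(:,j) = y(:,j)/sqrt(sum(y(:,j)**2))
     do i = j+1, k
        y(:,i) = y(:,i) - dot_product(y(:,j), y(:,i))*y(:,j)
     end do
  end do

  ! Departure of Y^T Y from the identity measures loss of orthogonality.
  g = matmul(transpose(y), y)
  do j = 1, k
     g(j,j) = g(j,j) - 1.0_dp
  end do
  write (*,*) 'max |Y^T Y - I| = ', maxval(abs(g))
end program mgs_sketch
```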
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \equiv u_{xx}$$
is solved for values of $0 \le x \le \pi$ and $t > 0$. A boundary value problem consists of choosing the
value $u(0, t) = u_0$ such that $u(x_1, t_1) = u_1$. Fixed values of $x_1$, $u_1 = \frac{1}{2}$, and $t_1 = 1$
are used for illustration of the solution process. The one-parameter equation
$$u(x_1, t_1) - u_1 = 0$$
is solved. The variables are changed to $v(x, t) = u(x, t) - u_0$, so
that $v(0, t) = 0$. The function $v(x, t)$ satisfies the differential equation. The one-parameter equation
solved is therefore
$$v(x_1, t_1) - (u_1 - u_0) = 0$$
To solve this equation for $u_0$, use the standard technique of the variational equation,
$$w \equiv \frac{\partial v}{\partial u_0}$$
Thus
$$\frac{\partial w}{\partial t} = \frac{\partial^2 w}{\partial x^2}$$
Since the initial data for
$v(x, 0) = -u_0$
= 2*hx/3
= hx/6
= -2/hx
= 1/hx
Output
Example 4 for LIN_SOL_TRI is correct.
LIN_SVD
Computes the singular value decomposition (SVD) of a rectangular matrix, A. This gives the
decomposition
$A = USV^T$
Required Arguments
A Array of size m × n containing the matrix. (Input [/Output])
If the packaged option lin_svd_overwrite_input is used, this array is not saved
on output.
S Array of size min(m, n) containing the real singular values. These nonnegative values
are in non-increasing order. (Output)
U Array of size m × m containing the singular vectors, U. (Output)
Optional Arguments
MROWS = m (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: m = size(A, 1)
NCOLS = n (Input)
Uses array A(1:m, 1:n) for the input matrix.
Default: n = size(A, 2)
RANK = k (Output)
Number of singular values that exceed the value Small. RANK will satisfy
k <= min(m, n).
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix. Used for passing
optional data to the routine. The options are as follows:
Packaged Options for LIN_SVD
Option Prefix = ?
Option Name                    Option Value
lin_svd_set_small
lin_svd_overwrite_input
lin_svd_scan_for_NaN
lin_svd_use_qr
lin_svd_skip_orth
lin_svd_use_gauss_elim
lin_svd_set_perf_ratio
iopt(IO) = ?_options(?_lin_svd_set_small, Small)
If a singular value is smaller than Small, it is defined as zero for the purpose of
computing the rank of A.
Default: the smallest number that can be reciprocated safely
iopt(IO) = ?_options(?_lin_svd_overwrite_input, ?_dummy)
Does not save the input array A(:, :).
iopt(IO) = ?_options(?_lin_svd_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that
isNaN(a(i,j)) == .true.
iopt(IO) = ?_options(?_lin_svd_skip_orth, ?_dummy)
If the eigenvalues are computed using inverse iteration, skips the final
orthogonalization of the vectors. This method results in a more efficient computation.
However, the singular vectors, while a complete set, may not be orthogonal.
Default: singular vectors are orthogonalized if obtained using inverse iteration
iopt(IO) = ?_options(?_lin_svd_use_gauss_elim, ?_dummy)
If the eigenvalues are computed using inverse iteration, uses standard elimination with
partial pivoting to solve the inverse iteration problems.
Default: singular vectors computed using cyclic reduction
iopt(IO) = ?_options(?_lin_svd_set_perf_ratio, perf_ratio)
Uses residuals for approximate normalized singular vectors if they have a performance
index no larger than perf_ratio. Otherwise an alternate approach is taken and the
singular vectors are computed again: Standard elimination is used instead of cyclic
reduction, or the standard QR algorithm is used as a backup procedure to inverse
iteration. Larger values of perf_ratio are less likely to cause these exceptions.
Default: perf_ratio = 4
FORTRAN 90 Interface
Generic:
Specific:
Description
Routine lin_svd is an implementation of the QR algorithm for computing the SVD of
rectangular matrices. An orthogonal reduction of the input matrix to upper bidiagonal form is
performed. Then, the SVD of a real bidiagonal matrix is calculated. The orthogonal decomposition
AV = US results from products of intermediate matrix factors. See Golub and Van Loan (1989,
Chapter 8) for details.
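The way the optional RANK value relates to the singular values and the threshold Small can be sketched in a few lines. The singular values and the threshold below are hypothetical; in particular, the value of small is an assumption made for this illustration and is not the library default.

```fortran
program svd_rank_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  ! Hypothetical threshold; the library default is not reproduced here.
  real(dp), parameter :: small = 1.0e-12_dp
  real(dp) :: s(4)
  integer :: k

  ! Hypothetical singular values in non-increasing order.
  s = (/ 3.0_dp, 1.5_dp, 1.0e-15_dp, 0.0_dp /)

  ! The rank is the number of singular values that exceed small.
  k = count(s > small)
  write (*,*) 'numerical rank = ', k
end program svd_rank_sketch
```

With these values two singular values exceed the threshold, so the last two are treated as zero for the purpose of computing the rank.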
Output
Example 1 for LIN_SVD is correct.
Additional Examples
Example 2: Linear Least Squares with a Quadratic Constraint
An m × n matrix equation $Ax \cong b$, m > n, is approximated in a least-squares sense. The matrix b is
size m × k. Each of the k solution vectors of the matrix x is constrained to have Euclidean length of
value $\alpha_j > 0$. The value of $\alpha_j$ is chosen so that the constrained solution is 0.25 times the length of the
nonregularized or standard least-squares solution. See Golub and Van Loan (1989, Chapter 12)
for more details. In the Example 2 code, Newton's method is used to solve for each regularizing
parameter of the k systems. The solution is then computed and its length is checked. Also, see
operator_ex22, supplied with the product examples.
use lin_svd_int
use rand_gen_int
implicit none
Output
Example 2 for LIN_SVD is correct.
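One standard way to set up the Newton iteration for a length constraint is sketched below. It is not taken from the Example 2 source: it assumes the SVD is already available, uses hypothetical singular values s and a hypothetical vector g (standing for the leading entries of U^T b), and assumes the regularized solution has components s_j g_j / (s_j^2 + mu). The parameter name mu is also an assumption. Newton's method is applied to the squared solution length minus the squared target length.

```fortran
program constrained_ls_newton
  implicit none
  integer, parameter :: dp = kind(1d0), n = 3
  real(dp) :: s(n), g(n), mu, alpha, phi, dphi, len0
  integer :: it

  s = (/ 4.0_dp, 2.0_dp, 1.0_dp /)    ! hypothetical singular values
  g = (/ 1.0_dp, -2.0_dp, 0.5_dp /)   ! hypothetical leading entries of U^T b

  ! Length of the unregularized least-squares solution (mu = 0).
  len0 = sqrt(sum((g/s)**2))
  ! Target: one quarter of that length, as in the example description.
  alpha = 0.25_dp*len0

  mu = 0.0_dp
  do it = 1, 100
     ! phi(mu) = ||x(mu)||^2 - alpha^2 is convex and decreasing for mu >= 0,
     ! so Newton's method started at mu = 0 approaches the root from the left.
     phi  = sum((s*g/(s**2 + mu))**2) - alpha**2
     dphi = -2.0_dp*sum((s*g)**2/(s**2 + mu)**3)
     if (abs(phi) <= 1.0e-12_dp*alpha**2) exit
     mu = mu - phi/dphi
  end do

  write (*,*) 'mu = ', mu
  write (*,*) 'length ratio = ', sqrt(sum((s*g/(s**2 + mu))**2))/len0
end program constrained_ls_newton
```

The printed length ratio gives a direct check that the constrained solution has the intended fraction of the unconstrained length.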
and
$BX = V\,\mathrm{diag}(s_1, \ldots, s_n)$
The $c_i$ are nonincreasing, and the $s_i$ are nondecreasing. See Golub and Van Loan (1989, Chapter
8) for more details. Our method is based on computing three SVDs as opposed to the QR
decomposition and two SVDs outlined in Golub and Van Loan. As a bonus, an SVD of the matrix
X is obtained, and you can use this information to answer further questions about its conditioning.
This form of the decomposition assumes that the matrix
$D = \begin{bmatrix} A \\ B \end{bmatrix}$
has all its singular values strictly positive. For alternate problems, where some singular values of
D are zero, the GSVD becomes
$U^T A = \mathrm{diag}(c_1, \ldots, c_n)\, W$
and
$V^T B = \mathrm{diag}(s_1, \ldots, s_n)\, W$
The matrix W has the same singular values as the matrix D. Also, see operator_ex23, supplied
with the product examples.
use lin_svd_int
use rand_gen_int
implicit none
! This is Example 3 for LIN_SVD.
$$\|Ax - b\|_2^2 + \lambda^2 \|x\|_2^2$$
The solution to this problem, with row k deleted, is denoted by $x_k(\lambda)$. Using nonnegative weights
$(w_1, \ldots, w_m)$, the cross-validation squared error $C(\lambda)$ is given by:
$$mC(\lambda) = \sum_{k=1}^{m} w_k \left( a_k^T x_k(\lambda) - b_k \right)^2$$
With the SVD $A = USV^T$ and product $g = U^T b$, this quantity can be written as
$$mC(\lambda) = \sum_{k=1}^{m} w_k \left( \frac{b_k - \sum_{j=1}^{n} u_{kj}\, g_j\, \dfrac{s_j^2}{s_j^2 + \lambda^2}}{1 - \sum_{j=1}^{n} u_{kj}^2\, \dfrac{s_j^2}{s_j^2 + \lambda^2}} \right)^2$$
This expression is minimized. See Golub and Van Loan (1989, Chapter 12) for more details. In the
Example 4 code, $mC(\lambda)$ is evaluated at p = 10 grid points using a log scale with respect to $\lambda$,
$0.1\, s_1 \le \lambda \le 10\, s_1$. Array operations and intrinsics are used to evaluate the function and then to
choose an approximate minimum. Following the computation of the optimum $\lambda$, the regularized
solutions are computed. Also, see operator_ex24, supplied with the product examples.
use lin_svd_int
use rand_gen_int
implicit none
! This is Example 4 for LIN_SVD.
integer i
integer, parameter :: m=32, n=16, p=10, k=4
sqrt(epsilon(one))) then
write (*,*) 'Example 4 for LIN_SVD is correct.'
end if
end
Output
Example 4 for LIN_SVD is correct.
PARALLEL_NONNEGATIVE_LSQ
REQUIRED
For a detailed description of MPI Requirements see Dense Matrix Parallelism Using MPI in
Chapter 10 of this manual.
Solves a linear, non-negative constrained least-squares system.
Usage Notes
CALL PARALLEL_NONNEGATIVE_LSQ&
(A, B, X, RNORM, W, INDEX, IPART, IOPT = IOPT)
Required Arguments
A(1:M,:) (Input/Output) Columns of the matrix with limits given by entries in the array
IPART(1:2,1:max(1,MP_NPROCS)). On output $A_k$ is replaced by the product $QA_k$,
where Q is an orthogonal matrix. The value SIZE(A,1) defines the value of M. Each
processor starts and exits with its piece of the partitioned matrix.
B(1:M) (Input/Output) Assumed-size array of length M containing the right-hand side
vector, b . On output b is replaced by the product Qb , where Q is the orthogonal
matrix applied to A . All processors in the communicator start and exit with the same
vector.
X(1:N) (Output) Assumed-size array of length N containing the solution, $x \ge 0$. The
value SIZE(X) defines the value of N. All processors exit with the same vector.
RNORM (Output) Scalar that contains the Euclidean or least-squares length of the residual
vector, $\|Ax - b\|$. All processors exit with the same value.
W(1:N) (Output) Assumed-size array of length N containing the dual vector,
$w = A^T(b - Ax) \le 0$. All processors exit with the same vector.
INDEX(1:N) (Output) Assumed-size array of length N containing the NSETP indices of
columns in the positive solution, and the remainder that are at their constraint. The
number of positive components in the solution x is given by the Fortran intrinsic
function value,
NSETP=COUNT(X > 0). All processors exit with the same array.
IPART(1:2,1:max(1,MP_NPROCS)) (Input) Assumed-size array containing the
partitioning describing the matrix A . The value MP_NPROCS is the number of
processors in the communicator,
except when MPI has been finalized with a call to the routine MP_SETUP('Final').
This causes MP_NPROCS to be assigned 0. Normally users will give the partitioning to
processor of rank = MP_RANK by setting IPART(1,MP_RANK+1)= first column
index, and IPART(2,MP_RANK+1)= last column index. The number of columns per
node is typically based on their relative computing power. To avoid a node with rank
MP_RANK doing any work except communication, set IPART(1,MP_RANK+1) = 0 and
IPART(2,MP_RANK+1)= -1. In this exceptional case there is no reference to the
array A(:,:) at that node.
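The column partitioning held in IPART can be sketched without MPI. The program below is only an illustration: the sizes n and nprocs are hypothetical constants standing in for SIZE(X) and MP_NPROCS, equal blocks are assigned to each rank, and any remaining columns go to the highest rank.

```fortran
program ipart_sketch
  implicit none
  ! Hypothetical sizes standing in for the column count and MP_NPROCS.
  integer, parameter :: n = 10, nprocs = 3
  integer :: ipart(2, nprocs), l, dn

  dn = n/nprocs                      ! columns per process (rounded down)
  do l = 1, nprocs
     ipart(1,l) = (l-1)*dn + 1       ! first column owned by rank l-1
     ipart(2,l) = l*dn               ! last column owned by rank l-1
  end do
  ipart(2,nprocs) = n                ! leftover columns go to the highest rank

  do l = 1, nprocs
     write (*,*) 'rank', l-1, ': columns', ipart(1,l), 'to', ipart(2,l)
  end do
end program ipart_sketch
```

In a real run each process would use its own pair IPART(1:2,MP_RANK+1) to decide which columns of A it holds; weighting the block sizes by relative computing power, as the text suggests, is a straightforward variation.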
Optional Argument
IOPT(:) (Input) Assumed-size array of derived type S_OPTIONS or D_OPTIONS. This
argument is used to change internal parameters of the algorithm. Normally users will
not be concerned about this argument, so they would not include it in the argument list
for the routine.
Option Value
PNLSQ_SET_TOLERANCE
PNLSQ_SET_MAX_ITERATIONS
PNLSQ_SET_MIN_RESIDUAL
FORTRAN 90 Interface
Generic: CALL PARALLEL_NONNEGATIVE_LSQ (A, B, X, RNORM, W, INDEX, IPART [,…])
Specific: The specific interface names are S_PARALLEL_NONNEGATIVE_LSQ and
D_PARALLEL_NONNEGATIVE_LSQ.
Description
Subroutine PARALLEL_NONNEGATIVE_LSQ solves the linear least-squares system $Ax \cong b$, $x \ge 0$,
using the algorithm NNLS found in Lawson and Hanson (1995), pages 160-161. The code now
updates the dual vector w of Step 2, page 161. The remaining new steps involve exchange of
required data, using MPI.
random values. When the minimum Euclidean length solution to the inequalities has been
calculated, the residuals $r = Gy - h \ge 0$ are computed, with the dual variables to the NNLS
problem indicating the entries of r that are precisely zero.
The fact that matrix products involving both $E$ and $E^T$ are needed to compute the constrained
solution y and the residuals r implies that message passing is required. This occurs after the
NNLS solution is computed.
PROGRAM PNLSQ_EX1
! Use Parallel_nonnegative_LSQ to solve an inequality
! constraint problem, Gy >= h. This algorithm uses
! Algorithm LDP of Solving Least Squares Problems,
! page 165. The constraints are allocated to the
! processors, by rows, in columns of the array A(:,:).
USE PNLSQ_INT
USE MPI_SETUP_INT
USE RAND_INT
USE SHOW_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: MP=500, NP=400, M=NP+1, N=MP
REAL(KIND(1D0)), PARAMETER :: ZERO=0D0, ONE=1D0
REAL(KIND(1D0)), ALLOCATABLE :: &
A(:,:), B(:), X(:), Y(:), W(:), ASAVE(:,:)
REAL(KIND(1D0)) RNORM
INTEGER, ALLOCATABLE :: INDEX(:), IPART(:,:)
INTEGER K, L, DN, J, JSHIFT, IERROR
LOGICAL :: PRINT=.false.
! Parallel_nonnegative_LSQ.
A=rand(A); ASAVE=A
IF(MP_RANK == 0 .and. PRINT) &
CALL SHOW(IPART, &
"Partition of the constraints to be solved")
! Set the right-hand side to be one in the last component, zero elsewhere.
B=ZERO;B(M)=ONE
! Solve the dual problem.
CALL Parallel_nonnegative_LSQ &
(A, B, X, RNORM, W, INDEX, IPART)
! Each processor multiplies its block times the part of
! the dual corresponding to that part of the partition.
Y=ZERO
DO J=IPART(1,MP_RANK+1),IPART(2,MP_RANK+1)
JSHIFT=J-IPART(1,MP_RANK+1)+1
Y=Y+ASAVE(:,JSHIFT)*X(J)
END DO
! Accumulate the pieces from all the processors. Put sum into B(:)
! on rank 0 processor.
B=Y
IF(MP_NPROCS > 1) &
CALL MPI_REDUCE(Y, B, M, MPI_DOUBLE_PRECISION,&
MPI_SUM, 0, MP_LIBRARY_WORLD, IERROR)
IF(MP_RANK == 0) THEN
! Compute constrained solution at the root.
! The constraints will have no solution if B(M) = ONE.
! All of these example problems have solutions.
B(M)=B(M)-ONE;B=-B/B(M)
END IF
! Send the inequality constraint solution to all nodes.
IF(MP_NPROCS > 1) &
CALL MPI_BCAST(B, M, MPI_DOUBLE_PRECISION, &
0, MP_LIBRARY_WORLD, IERROR)
! For large problems this printing needs to be removed.
IF(MP_RANK == 0 .and. PRINT) &
CALL SHOW(B(1:NP), &
"Minimal length solution of the constraints")
! Compute residuals of the individual constraints.
! If only the solution is desired, the program ends here.
X=ZERO
DO J=IPART(1,MP_RANK+1),IPART(2,MP_RANK+1)
JSHIFT=J-IPART(1,MP_RANK+1)+1
X(J)=dot_product(B,ASAVE(:,JSHIFT))
END DO
! This cleans up residuals that are about rounding
! error unit (times) the size of the constraint
Output
Example 1 for PARALLEL_NONNEGATIVE_LSQ is correct.
Additional Examples
Example 2: Distributed Non-negative Least-Squares
The program PNLSQ_EX2 illustrates the computation of the solution to a system of linear least-squares equations with simple constraints: $a_i^T x \cong b_i$, $i = 1, \ldots, m$, subject to $x \ge 0$. In this example
we write the row vectors $[a_i^T : b_i]$ on a file. This illustrates reading the data by rows and
arranging the data by columns, as required by PARALLEL_NONNEGATIVE_LSQ. After reading the
data, the right-hand side vector is broadcast to the group before computing a solution, x . The
block-size is chosen so that each participating processor receives the same number of columns,
except any remaining columns sent to the processor with largest rank. This processor contains the
right-hand side before the broadcast.
This example illustrates connecting a BLACS context handle and the Fortran Library MPI
communicator, MP_LIBRARY_WORLD, described in Chapter 10.
PROGRAM PNLSQ_EX2
! Use Parallel_Nonnegative_LSQ to solve a least-squares
! problem, A x = b, with x >= 0. This algorithm uses a
! distributed version of NNLS, found in the book
! Solving Least Squares Problems, page 165. The data is
! read from a file, by rows, and sent to the processors,
! as array columns.
USE PNLSQ_INT
USE SCALAPACK_IO_INT
USE BLACS_INT
USE MPI_SETUP_INT
USE RAND_INT
USE ERROR_OPTION_PACKET
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: M=128, N=32, NP=N+1, NIN=10
real(kind(1d0)), ALLOCATABLE, DIMENSION(:) :: &
d_A(:,:), A(:,:), B, C, W, X, Y
real(kind(1d0)) RNORM, ERROR
INTEGER, ALLOCATABLE :: INDEX(:), IPART(:,:)
INTEGER I, J, K, L, DN, JSHIFT, IERROR, &
CONTXT, NPROW, MYROW, MYCOL, DESC_A(9)
TYPE(d_OPTIONS) IOPT(1)
! Routines with the "BLACS_" prefix are from the
! BLACS library.
CALL BLACS_PINFO(MP_RANK, MP_NPROCS)
! Make initialization for BLACS.
CALL BLACS_GET(0,0, CONTXT)
! Define processor grid to be 1 by MP_NPROCS.
NPROW=1
CALL BLACS_GRIDINIT(CONTXT, 'N/A', NPROW, MP_NPROCS)
! Get this processor's role in the process grid.
CALL BLACS_GRIDINFO(CONTXT, NPROW, MP_NPROCS, &
MYROW, MYCOL)
! Connect BLACS context with communicator MP_LIBRARY_WORLD.
CALL BLACS_GET(CONTXT, 10, MP_LIBRARY_WORLD)
! Setup for MPI:
MP_NPROCS=MP_SETUP()
DN=max(1,NP/MP_NPROCS)
ALLOCATE(IPART(2,MP_NPROCS))
! Spread columns evenly to the processors. Any odd
! number of columns are in the processor with highest
! rank.
IPART(1,:)=1; IPART(2,:)=0
DO L=2,MP_NPROCS
IPART(2,L-1)=IPART(1,L-1)+DN
IPART(1,L)=IPART(2,L-1)+1
END DO
IPART(2,MP_NPROCS)=NP
IPART(2,:)=min(NP,IPART(2,:))
! Note which processor (L-1) receives the right-hand side.
DO L=1,MP_NPROCS
IF(IPART(1,L) <= NP .and. NP <= IPART(2,L)) EXIT
END DO
K=max(0,IPART(2,MP_RANK+1)-IPART(1,MP_RANK+1)+1)
ALLOCATE(d_A(M,K), W(N), X(N), Y(N),&
B(M), C(M), INDEX(N))
IF(MP_RANK == 0 ) THEN
ALLOCATE(A(M,N))
! Define the matrix data using random values.
A=rand(A); B=rand(B)
! Write the rows of data to an external file.
OPEN(UNIT=NIN, FILE='Atest.dat', STATUS='UNKNOWN')
DO I=1,M
WRITE(NIN,*) (A(I,J),J=1,N), B(I)
END DO
CLOSE(NIN)
ELSE
! No resources are used where this array is not saved.
ALLOCATE(A(M,0))
END IF
!
!
!
!
Output
Example 2 for PARALLEL_NONNEGATIVE_LSQ is correct.
PARALLEL_BOUNDED_LSQ
REQUIRED
For a detailed description of MPI Requirements see Dense Matrix Parallelism Using MPI in
Chapter 10 of this manual.
Solves a linear least-squares system with bounds on the unknowns.
Usage Notes
CALL PARALLEL_BOUNDED_LSQ &
(A, B, BND, X, RNORM, W, INDEX, IPART, &
NSETP, NSETZ, IOPT=IOPT)
Required Arguments
A(1:M,:) (Input/Output) Columns of the matrix with limits given by entries in the array
IPART(1:2,1:max(1,MP_NPROCS)). On output $A_k$ is replaced by the product $QA_k$,
where Q is an orthogonal matrix. The value SIZE(A,1) defines the value of M. Each
processor starts and exits with its piece of the partitioned matrix.
B(1:M) (Input/Output) Assumed-size array of length M containing the right-hand side
vector, b. On output b is replaced by the product $Q(b - Ag)$, where Q is the
orthogonal matrix applied to A and g is a set of active bounds for the solution. All
processors in the communicator start and exit with the same vector.
BND(1:2,1:N) (Input) Assumed-size array containing the bounds for x. The lower
bound $\alpha_j$ is in BND(1,J), and the upper bound $\beta_j$ is in BND(2,J).
X(1:N) (Output) Assumed-size array of length N containing the solution, x. The
value SIZE(X) defines the value of N. All processors exit with the same vector.
RNORM (Output) Scalar that contains the Euclidean or least-squares length of the residual
vector, $\|Ax - b\|$. All processors exit with the same value.
W(1:N) (Output) Assumed-size array of length N containing the dual vector,
$w = A^T(b - Ax)$. At a solution exactly one of the following is true for each
$j$, $1 \le j \le n$:
$\alpha_j = x_j = \beta_j$, and $w_j$ arbitrary
$\alpha_j = x_j$, and $w_j \le 0$
$x_j = \beta_j$, and $w_j \ge 0$
$\alpha_j < x_j < \beta_j$, and $w_j = 0$
NSETZ (Output) An INTEGER indicating the solution components held at fixed values.
The column indices are output in the array INDEX(:).
Optional Argument
IOPT(:) (Input) Assumed-size array of derived type S_OPTIONS or D_OPTIONS. This
argument is used to change internal parameters of the algorithm. Normally users will
not be concerned about this argument, so they would not include it in the argument list
for the routine.
Option Name                    Option Value
PBLSQ_SET_TOLERANCE
PBLSQ_SET_MAX_ITERATIONS
PBLSQ_SET_MIN_RESIDUAL
IOPT(IO)= PBLSQ_SET_MAX_ITERATIONS
IOPT(IO+1)= NEW_MAX_ITERATIONS Replaces the default maximum number of iterations
from 3*N to NEW_MAX_ITERATIONS. Note that this option requires two entries in the
FORTRAN 90 Interface
Generic:
Specific:
Description
Subroutine PARALLEL_BOUNDED_LSQ solves the least-squares linear system $Ax \cong b$, $\alpha \le x \le \beta$,
using the algorithm BVLS found in Lawson and Hanson (1995), pages 279-283. The new steps
involve updating the dual vector and exchange of required data, using MPI. The optional changes
to default tolerances, minimum residual, and the number of iterations are new features.
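The sign conditions on the dual vector w at a solution can be checked directly for a small problem. The sketch below is serial and does not call PARALLEL_BOUNDED_LSQ: the matrix, right-hand side, bounds, and candidate solution are hypothetical values chosen so that one component sits at its upper bound and the other lies strictly between its bounds, and the tolerance tol is an assumption.

```fortran
program bvls_optimality_check
  implicit none
  integer, parameter :: dp = kind(1d0), m = 3, n = 2
  real(dp) :: a(m,n), b(m), x(n), w(n), bnd(2,n), tol
  logical :: ok(n)
  integer :: j

  ! Hypothetical data: columns of A are the first two unit vectors.
  a = reshape(real((/ 1,0,0,  0,1,0 /), dp), (/ m, n /))
  b = (/ 2.0_dp, 0.5_dp, 1.0_dp /)
  bnd(1,:) = 0.0_dp                  ! lower bounds
  bnd(2,:) = 1.0_dp                  ! upper bounds
  x = (/ 1.0_dp, 0.5_dp /)           ! candidate solution

  ! Dual vector w = A^T (b - A x).
  w = matmul(transpose(a), b - matmul(a, x))
  tol = sqrt(epsilon(1.0_dp))        ! assumed tolerance for the sign tests

  do j = 1, n
     if (bnd(1,j) == bnd(2,j)) then
        ok(j) = x(j) == bnd(1,j)               ! fixed variable: w(j) arbitrary
     else if (x(j) == bnd(1,j)) then
        ok(j) = w(j) <= tol                    ! at lower bound: w(j) <= 0
     else if (x(j) == bnd(2,j)) then
        ok(j) = w(j) >= -tol                   ! at upper bound: w(j) >= 0
     else
        ok(j) = abs(w(j)) <= tol .and. &       ! strictly inside: w(j) = 0
                x(j) > bnd(1,j) .and. x(j) < bnd(2,j)
     end if
  end do

  write (*,*) 'w = ', w
  write (*,*) 'optimality conditions hold: ', all(ok)
end program bvls_optimality_check
```

For this data the first component is held at its upper bound with a positive dual entry, and the second is free with a zero dual entry, so both tests pass.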
PROGRAM PBLSQ_EX1
! Use Parallel_bounded_LSQ to solve an inequality
! constraint problem, Gy >= h. Force F of the constraints
! to be equalities. This algorithm uses LDP of
! Solving Least Squares Problems, page 165.
! Forcing equality constraints by freeing the dual is
! new here. The constraints are allocated to the
! processors, by rows, in columns of the array A(:,:).
USE PBLSQ_INT
USE MPI_SETUP_INT
USE RAND_INT
USE SHOW_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: MP=500, NP=400, M=NP+1, &
N=MP, F=NP/10
REAL(KIND(1D0)), PARAMETER :: ZERO=0D0, ONE=1D0
REAL(KIND(1D0)), ALLOCATABLE :: &
A(:,:), B(:), BND(:,:), X(:), Y(:), &
W(:), ASAVE(:,:)
REAL(KIND(1D0)) RNORM
INTEGER, ALLOCATABLE :: INDEX(:), IPART(:,:)
INTEGER K, L, DN, J, JSHIFT, IERROR, NSETP, NSETZ
LOGICAL :: PRINT=.false.
IPART(1,1)=1
DO L=2,MP_NPROCS
IPART(2,L-1)=IPART(1,L-1)+DN
IPART(1,L)=IPART(2,L-1)+1
END DO
IPART(2,MP_NPROCS)=N
! Define the constraints using random data.
K=max(0,IPART(2,MP_RANK+1)-IPART(1,MP_RANK+1)+1)
ALLOCATE(A(M,K), ASAVE(M,K), BND(2,N), &
X(N), W(N), B(M), Y(M), INDEX(N))
! The use of ASAVE can be replaced by regenerating the
! data for A(:,:) after the return from
! Parallel_bounded_LSQ
A=rand(A); ASAVE=A
IF(MP_RANK == 0 .and. PRINT) &
call show(IPART,&
"Partition of the constraints to be solved")
! Set the right-hand side to be one in the last
! component, zero elsewhere.
B=ZERO;B(M)=ONE
! Solve the dual problem. Letting the dual variable
! have no constraint forces an equality constraint
! for the primal problem.
BND(1,1:F)=-HUGE(ONE); BND(1,F+1:)=ZERO
BND(2,:)=HUGE(ONE)
CALL Parallel_bounded_LSQ &
(A, B, BND, X, RNORM, W, INDEX, IPART, &
NSETP, NSETZ)
! Each processor multiplies its block times the part
! of the dual corresponding to that partition.
Y=ZERO
DO J=IPART(1,MP_RANK+1),IPART(2,MP_RANK+1)
JSHIFT=J-IPART(1,MP_RANK+1)+1
Y=Y+ASAVE(:,JSHIFT)*X(J)
END DO
! Accumulate the pieces from all the processors.
! Put sum into B(:) on rank 0 processor.
B=Y
IF(MP_NPROCS > 1) &
CALL MPI_REDUCE(Y, B, M, MPI_DOUBLE_PRECISION,&
MPI_SUM, 0, MP_LIBRARY_WORLD, IERROR)
IF(MP_RANK == 0) THEN
! Compute constraint solution at the root.
! The constraints will have no solution if B(M) = ONE.
! All of these example problems have solutions.
B(M)=B(M)-ONE;B=-B/B(M)
END IF
Output
Example 1 for PARALLEL_BOUNDED_LSQ is correct.
Additional Examples
Example 2: Distributed Newton-Raphson Method with Step Control
The program PBLSQ_EX2 illustrates the computation of the solution of a non-linear system of
equations. We use a constrained Newton-Raphson method.
This algorithm works with the problem chosen for illustration. The step-size control used here,
employing only simple bounds, may not work on other non-linear systems of equations. Therefore
we do not recommend the simple non-linear solving technique illustrated here for an arbitrary
problem. The test case is Brown's Almost Linear Problem, Moré, et al. (1982). The components
are given by:
$$f_i(x) = x_i + \sum_{j=1}^{n} x_j - (n + 1), \quad i = 1, \ldots, n - 1$$
$$f_n(x) = x_1 \cdots x_n - 1$$
The functions are zero at the point $x = (\delta, \ldots, \delta, \delta^{1-n})^T$, where $\delta > 1$ is a particular root of the
equation $n\delta^n - (n + 1)\delta^{n-1} + 1 = 0$, which follows from substituting this point into $f_i(x) = 0$.
In the code the last equation is replaced by $\ln(x_1 \cdots x_n) = 0$, and the lower bound on
$x_n - y_n$ is replaced by the strict bound, EPSILON(1D0), the arithmetic precision, which restricts
the relative accuracy of $x_n$. The input for routine PARALLEL_BOUNDED_LSQ expects each
processor to obtain that part of $J(x)$ it owns. Those columns of the Jacobian matrix correspond
to the partition given in the array IPART(:,:). Here the columns of the matrix are evaluated, in
parallel, on the nodes where they are required.
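The test function and its Jacobian can be evaluated serially in a few lines. This sketch is not the example program: it uses the product form of the last equation rather than a logarithmic form, evaluates the full Jacobian on one process instead of distributing columns, and uses a small hypothetical size n. The helper name brown is introduced only for this illustration.

```fortran
program brown_almost_linear
  implicit none
  integer, parameter :: dp = kind(1d0), n = 5
  real(dp) :: x(n), f(n), jac(n,n)

  ! The point of all ones is a root of the system.
  x = 1.0_dp
  call brown(x, f, jac)
  write (*,*) 'max |f| at x = (1,...,1): ', maxval(abs(f))

  ! Residuals at the starting point used for the Newton iteration.
  x = 0.5_dp
  call brown(x, f, jac)
  write (*,*) 'f at x = (1/2,...,1/2): ', f

contains

  ! Hypothetical helper: residuals and Jacobian of the test function.
  subroutine brown(x, f, jac)
    real(dp), intent(in) :: x(:)
    real(dp), intent(out) :: f(:), jac(:,:)
    integer :: i, j, nv
    nv = size(x)
    ! f_i = x_i + sum(x) - (n + 1), i = 1, ..., n-1
    f(1:nv-1) = x(1:nv-1) + sum(x) - real(nv+1, dp)
    ! f_n = x_1 * ... * x_n - 1
    f(nv) = product(x) - 1.0_dp
    ! Rows 1..n-1: ones everywhere, with 2 on the diagonal.
    jac(1:nv-1,:) = 1.0_dp
    do i = 1, nv-1
       jac(i,i) = 2.0_dp
    end do
    ! Row n: derivative of the product with respect to x_j.
    do j = 1, nv
       jac(nv,j) = product(x(1:j-1))*product(x(j+1:nv))
    end do
  end subroutine brown
end program brown_almost_linear
```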
PROGRAM PBLSQ_EX2
! Use Parallel_bounded_LSQ to solve a non-linear system
! of equations. The example is an ACM-TOMS test problem,
! except for the larger size. It is "Brown's Almost Linear
! Function."
USE ERROR_OPTION_PACKET
USE PBLSQ_INT
USE MPI_SETUP_INT
USE SHOW_INT
USE Numerical_Libraries, ONLY : N1RTY
IMPLICIT NONE
INTEGER, PARAMETER :: N=200, MAXIT=5
REAL(KIND(1D0)), PARAMETER :: ZERO=0D0, ONE=1D0,&
HALF=5D-1, TWO=2D0
REAL(KIND(1D0)), ALLOCATABLE :: &
A(:,:), B(:), BND(:,:), X(:), Y(:), W(:)
REAL(KIND(1D0)) RNORM
INTEGER, ALLOCATABLE :: INDEX(:), IPART(:,:)
INTEGER K, L, DN, J, JSHIFT, IERROR, NSETP, &
NSETZ, ITER
LOGICAL :: PRINT=.false.
TYPE(D_OPTIONS) IOPT(3)
! Setup for MPI:
MP_NPROCS=MP_SETUP()
DN=N/max(1,max(1,MP_NPROCS))-1
ALLOCATE(IPART(2,max(1,MP_NPROCS)))
! Spread Jacobian matrix columns evenly to the processors.
IPART(1,1)=1
DO L=2,MP_NPROCS
IPART(2,L-1)=IPART(1,L-1)+DN
IPART(1,L)=IPART(2,L-1)+1
END DO
IPART(2,MP_NPROCS)=N
K=max(0,IPART(2,MP_RANK+1)-IPART(1,MP_RANK+1)+1)
ALLOCATE(A(N,K), BND(2,N), &
X(N), W(N), B(N), Y(N), INDEX(N))
! This is Newton's method on "Brown's almost
! linear function."
X=HALF
ITER=0
! Turn off messages and stopping for FATAL class errors.
CALL ERSET (4, 0, 0)
NEWTON_METHOD: DO
! Set bounds for the values after the step is taken.
! All variables are positive and bounded below by HALF,
! except for variable N, which has an upper bound of HALF.
BND(1,1:N-1)=-HUGE(ONE)
BND(2,1:N-1)=X(1:N-1)-HALF
BND(1,N)=X(N)-HALF
BND(2,N)=X(N)-EPSILON(ONE)
! Compute the residual function.
B(1:N-1)=SUM(X)+X(1:N-1)-(N+1)
B(N)=LOG(PRODUCT(X))
if(mp_rank == 0 .and. PRINT) THEN
CALL SHOW(B, &
"Developing non-linear function residual")
END IF
IF (MAXVAL(ABS(B(1:N-1))) <= SQRT(EPSILON(ONE)))&
EXIT NEWTON_METHOD
! Compute the derivatives local to each processor.
A(1:N-1,:)=ONE
DO J=1,N-1
IF(J < IPART(1,MP_RANK+1)) CYCLE
IF(J > IPART(2,MP_RANK+1)) CYCLE
JSHIFT=J-IPART(1,MP_RANK+1)+1
A(J,JSHIFT)=TWO
END DO
A(N,:)=ONE/X(IPART(1,MP_RANK+1):IPART(2,MP_RANK+1))
! Reset the linear independence tolerance.
IOPT(1)=D_OPTIONS(PBLSQ_SET_TOLERANCE,&
sqrt(EPSILON(ONE)))
IOPT(2)=PBLSQ_SET_MAX_ITERATIONS
! If N iterations was not enough on a previous iteration, reset to 2*N.
IF(N1RTY(1) == 0) THEN
IOPT(3)=N
ELSE
IOPT(3)=2*N
CALL E1POP('MP_SETUP')
CALL E1PSH('MP_SETUP')
END IF
CALL parallel_bounded_LSQ &
(A, B, BND, Y, RNORM, W, INDEX, IPART, NSETP, &
NSETZ,IOPT=IOPT)
! The array Y(:) contains the constrained Newton step.
! Update the variables.
X=X-Y
IF(mp_rank == 0 .and. PRINT) THEN
CALL show(BND, "Bounds for the moves")
CALL SHOW(X, "Developing Solution")
CALL SHOW((/RNORM/), &
"Linear problem residual norm")
END IF
! This is a safety measure for not taking too many steps.
ITER=ITER+1
IF(ITER > MAXIT) EXIT NEWTON_METHOD
END DO NEWTON_METHOD
IF(MP_RANK == 0) THEN
IF(ITER <= MAXIT) WRITE(*,*)&
" Example 2 for PARALLEL_BOUNDED_LSQ is correct."
END IF
! See to any errors and shut down MPI.
MP_NPROCS=MP_SETUP('Final')
END
LSARG
CAPABLE
Required Arguments
A N by N matrix containing the coefficients of the linear system. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^{T}X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSARG solves a system of linear algebraic equations having a real general coefficient
matrix. It first uses the routine LFCRG to compute an LU factorization of the coefficient matrix and to
estimate the condition number of the matrix. The solution of the linear system is then found using
the iterative refinement routine LFIRG. The underlying code is based on either LINPACK,
LAPACK, or ScaLAPACK code, depending upon which supporting libraries are used during
linking. For a detailed explanation, see "Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK" in the Introduction section of this manual.
LSARG fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the
iterative refinement algorithm fails to converge. These errors occur only if A is singular or very
close to a singular matrix.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system. LSARG solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
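The warning threshold can be related to the standard first-order perturbation bound for linear systems. The relation below is a textbook result quoted for orientation, not a statement about how LSARG computes its estimate; $\delta A$ and $\delta x$ denote a perturbation of A and the resulting change in x.

```latex
\frac{\|\delta x\|}{\|x\|} \;\lesssim\; \kappa(A)\,\frac{\|\delta A\|}{\|A\|},
\qquad \kappa(A) = \|A\|\,\|A^{-1}\| .
```

With $\kappa(A) > 1/\varepsilon$, a relative perturbation of size $\varepsilon$ in A, which is unavoidable when A is stored in floating point, can already change x by an amount comparable to x itself.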
Comments
1.
2.
Informational errors
Type
Code
3
X0 Local vector of length MXLDA containing the local portions of the distributed vector X.
X contains the solution to the linear system. (Output)
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call to
SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of three linear equations is solved. The coefficient matrix has real general form and the
right-hand-side vector b has three elements.
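      USE LSARG_INT
      USE WRRRN_INT
      IMPLICIT NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,LDA), B(N), X(N)
!                                 Set values for A and B
      A(1,:) = (/ 33.0,  16.0,  72.0/)
      A(2,:) = (/-24.0, -10.0, -57.0/)
      A(3,:) = (/ 18.0, -11.0,   7.0/)
!
      B = (/129.0, -96.0, 8.5/)
!                                 Solve the system of equations
      CALL LSARG (A, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END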
Output
                   X
      1       2       3
  1.000   1.500   1.000
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The
coefficient matrix has real general form and the right-hand-side vector b has three elements.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities)
used to map and unmap arrays to and from the processor grid. They are used here for brevity.
DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSARG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXLDA, MXCOL
      REAL, ALLOCATABLE ::        A(:,:), B(:), X(:)
      REAL, ALLOCATABLE ::        A0(:,:), B0(:), X0(:)
      PARAMETER  (N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(N,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/ 33.0,  16.0,  72.0/)
         A(2,:) = (/-24.0, -10.0, -57.0/)
         A(3,:) = (/ 18.0, -11.0,   7.0/)
!
         B = (/129.0, -96.0, 8.5/)
      ENDIF
Output
                   X
      1       2       3
  1.000   1.500   1.000
LSLRG
CAPABLE
Required Arguments
A N by N matrix containing the coefficients of the linear system. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^{T}X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSLRG solves a system of linear algebraic equations having a real general coefficient
matrix. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code,
depending upon which supporting libraries are used during linking. For a detailed explanation, see
"Using ScaLAPACK, LAPACK, LINPACK, and EISPACK" in the Introduction section of this
manual. LSLRG first uses the routine LFCRG to compute an LU factorization of the coefficient
matrix based on Gauss elimination with partial pivoting. Experiments were analyzed to determine
efficient implementations on several different computers. For some supercomputers, particularly
those with efficient vendor-supplied BLAS, versions that call Level 1, 2 and 3 BLAS are used.
The remaining computers use a factorization method provided to us by Dr. Leonard J. Harding of
the University of Michigan. Harding's work involves "loop unrolling and jamming" techniques
that achieve excellent performance on many computers. Using an option, LSLRG will estimate the
condition number of the matrix. The solution of the linear system is then found using LFSRG.
The routine LSLRG fails if U, the upper triangular part of the factorization, has a zero diagonal
element. This occurs only if A is close to a singular matrix.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that small changes in A can cause large changes in the solution x. If
the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that either
LIN_SOL_SVD or LSARG be used.
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2LRG the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLRG.
Additional memory allocation for FACT and option value restoration are done
automatically in LSLRG. Users directly calling L2LRG can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSLRG or L2LRG. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17 This option has two values that determine if the L1 condition number is to be
computed. Routine LSLRG temporarily replaces IVAL(2) by IVAL(1). The routine
L2CRG computes the condition number if IVAL(2) = 2. Otherwise L2CRG skips this
computation. LSLRG restores the option. Default values for the option are
IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example 1
A system of three linear equations is solved. The coefficient matrix has real general form and the
right-hand-side vector b has three elements.
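      USE LSLRG_INT
      USE WRRRN_INT
      IMPLICIT NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,LDA), B(N), X(N)
!                                 Set values for A and B
      A(1,:) = (/ 33.0,  16.0,  72.0/)
      A(2,:) = (/-24.0, -10.0, -57.0/)
      A(3,:) = (/ 18.0, -11.0,   7.0/)
!
      B = (/129.0, -96.0, 8.5/)
!                                 Solve the system of equations
      CALL LSLRG (A, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END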
Output
                   X
      1       2       3
  1.000   1.500   1.000
Additional Example
Example 2
A system of N = 16 linear equations is solved using the routine L2LRG. The option manager is used
to eliminate memory bank conflict inefficiencies that may occur when the matrix dimension is a
multiple of 16. The leading dimension of FACT = A is increased from N to N + IVAL(3) = 17,
since N = 16 = IVAL(4). The data used for the test is a nonsymmetric Hadamard matrix and a
right-hand side generated by a known solution, $x_j = j$, $j = 1, \ldots, N$.
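The idea of generating a right-hand side from a known solution can be sketched as follows. This is a minimal stand-alone program, not the manual's listing: it builds a Hadamard matrix by Sylvester's doubling construction, which yields a symmetric matrix rather than the nonsymmetric one used in the example, and the program name and the final self-checks are illustrative assumptions.

```fortran
program hadamard_demo
   implicit none
   integer, parameter :: n = 16
   real    :: a(n,n), x(n), b(n), eye(n,n)
   integer :: nn, j

   ! Sylvester doubling: H(2m) = [ H(m)  H(m) ; H(m) -H(m) ], starting from H(1) = 1.
   a = 0.0
   a(1,1) = 1.0
   nn = 1
   do while (nn < n)
      a(1:nn,      nn+1:2*nn) =  a(1:nn,1:nn)
      a(nn+1:2*nn, 1:nn)      =  a(1:nn,1:nn)
      a(nn+1:2*nn, nn+1:2*nn) = -a(1:nn,1:nn)
      nn = 2*nn
   end do

   ! Known solution x_j = j and the right-hand side b = A x it generates.
   x = (/ (real(j), j = 1, n) /)
   b = matmul(a, x)

   ! Check the Hadamard property A * A^T = n * I (exact in floating point here).
   eye = 0.0
   do j = 1, n
      eye(j,j) = 1.0
   end do
   if (maxval(abs(matmul(a, transpose(a)) - real(n)*eye)) > 0.0) &
      error stop 'matrix is not Hadamard'

   ! Because A^T A = n I, the known solution is recovered as x = A^T b / n.
   if (maxval(abs(matmul(transpose(a), b)/real(n) - x)) > 1.0e-4) &
      error stop 'known solution not recovered'

   print '(a,4f8.1)', ' first entries of b:', b(1:4)
end program hadamard_demo
```

A right-hand side built this way gives a test problem whose exact answer is known in advance, so the computed X can be compared directly with $x_j = j$.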
      USE L2LRG_INT
      USE IUMAG_INT
      USE WRRRN_INT
      USE SGEMV_INT
      IMPLICIT NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=17, N=16)
!                                 SPECIFICATIONS FOR PARAMETERS
      INTEGER    ICHP, IPATH, IPUT, KBANK
      REAL       ONE, ZERO
      PARAMETER  (ICHP=1, IPATH=1, IPUT=2, KBANK=16, ONE=1.0E0, &
                  ZERO=0.0E0)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    I, IPVT(N), J, K, NN
      REAL       A(LDA,N), B(N), WK(N), X(N)
!                                 SPECIFICATIONS FOR SAVE VARIABLES
      INTEGER    IOPT(1), IVAL(4)
      SAVE       IVAL
!
      A(1,1) = ONE
      NN     = 1
Output
      1       2       3       4       5       6       7       8       9      10
   1.00    2.00    3.00    4.00    5.00    6.00    7.00    8.00    9.00   10.00

     11      12      13      14      15      16
  11.00   12.00   13.00   14.00   15.00   16.00
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The
coefficient matrix has real general form and the right-hand-side vector b has three elements.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11,
Utilities) used to map and unmap arrays to and from the processor grid. They are used here for
brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local
arrays.
      USE MPI_SETUP_INT
      USE LSLRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::        A(:,:), B(:), X(:)
      REAL, ALLOCATABLE ::        A0(:,:), B0(:), X0(:)
      PARAMETER  (N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(N,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/ 33.0,  16.0,  72.0/)
         A(2,:) = (/-24.0, -10.0, -57.0/)
         A(3,:) = (/ 18.0, -11.0,   7.0/)
!
         B = (/129.0, -96.0, 8.5/)
      ENDIF
Output
                   X
      1       2       3
  1.000   1.500   1.000
LFCRG
CAPABLE
Computes the LU factorization of a real general matrix and estimates its $L_1$ condition number.
Required Arguments
A N by N matrix to be factored. (Input)
FACT N by N matrix containing the LU factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
IPVT Vector of length N containing the pivoting information for the LU factorization.
(Output)
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFCRG performs an LU factorization of a real general coefficient matrix. It also estimates
the condition number of the matrix. The underlying code is based on either LINPACK, LAPACK,
or ScaLAPACK code, depending upon which supporting libraries are used during linking. For a
detailed explanation, see "Using ScaLAPACK, LAPACK, LINPACK, and EISPACK" in the
Introduction section of this manual. The LU factorization is done using scaled partial pivoting.
Scaled partial pivoting differs from partial pivoting in that the pivoting strategy is the same as if
each row were scaled to have the same $\infty$-norm. Otherwise, partial pivoting is used.
The $L_1$ condition number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to
compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm is the same as
used by LINPACK and is described in a paper by Cline et al. (1979).
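For a small matrix whose inverse is known, the definition can be evaluated directly. The sketch below does this for the 3 by 3 matrix with rows (1, 3, 3), (1, 3, 4), (1, 4, 3) and its inverse as printed in the AINV output of the example; the program and function names are illustrative, and the exact value it computes is not the estimate that LFCRG returns, which may differ.

```fortran
program cond1_demo
   implicit none
   real :: a(3,3), ainv(3,3), kappa

   ! Matrix from the LFCRG example and its inverse (the AINV output).
   a(1,:)    = (/  1.0,  3.0,  3.0 /)
   a(2,:)    = (/  1.0,  3.0,  4.0 /)
   a(3,:)    = (/  1.0,  4.0,  3.0 /)
   ainv(1,:) = (/  7.0, -3.0, -3.0 /)
   ainv(2,:) = (/ -1.0,  0.0,  1.0 /)
   ainv(3,:) = (/ -1.0,  1.0,  0.0 /)

   ! kappa_1(A) = ||A||_1 * ||A^{-1}||_1
   kappa = norm1(a) * norm1(ainv)
   print '(a,f8.3)', ' kappa_1(A) = ', kappa
   print '(a,f8.4)', ' 1/kappa_1  = ', 1.0/kappa

   ! ||A||_1 = 10 and ||A^{-1}||_1 = 9, so the exact value is 90.
   if (abs(kappa - 90.0) > 1.0e-4) error stop 'unexpected condition number'

contains

   ! L1 matrix norm: the largest absolute column sum.
   real function norm1(m)
      real, intent(in) :: m(:,:)
      norm1 = maxval(sum(abs(m), dim=1))
   end function norm1

end program cond1_demo
```

The exact value of 90 (reciprocal about 0.011) is consistent with the bounds RCOND < .02 and condition number < 100.0 reported in the example output.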
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
LFCRG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
can occur only if A either is singular or is very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFIRG, LFSRG and LFDRG.
To solve systems of equations with multiple right-hand-side vectors, use LFCRG followed by either
LFIRG or LFSRG called once for each right-hand side. The routine LFDRG can be called to compute
the determinant of the coefficient matrix after LFCRG has performed the factorization.
Let F be the matrix FACT and let p be the vector IPVT. The triangular matrix U is stored in the
upper triangle of F. The strict lower triangle of F contains the information needed to reconstruct $L^{-1}$
using
$L^{-1} = L_{N-1}P_{N-1} \cdots L_1 P_1$
where $P_k$ is the identity matrix with rows $k$ and $p_k$ interchanged and $L_k$ is the identity with $F_{ik}$ for
$i = k + 1, \ldots, N$ inserted below the diagonal. The strict lower half of F can also be thought of as
containing the negative of the multipliers. LFCRG is based on the LINPACK routine SGECO; see
Dongarra et al. (1979). SGECO uses unscaled partial pivoting.
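As an illustration, the product form can be written out for N = 3 using the matrix with rows (1, 3, 3), (1, 3, 4), (1, 4, 3). The factors below were worked by hand under the assumption that ties in the pivot search go to the first candidate row; they are a sketch of the storage scheme, and the values a particular build of LFCRG returns may differ in such details.

```latex
F = \begin{pmatrix} 1 & 3 & 3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix},
\qquad p = (1,\,3,\,3),
\\[4pt]
L_1 = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix},\quad
P_1 = I,\quad
P_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},\quad
L_2 = I,
\\[4pt]
L^{-1}A = L_2 P_2 L_1 P_1 A
= \begin{pmatrix} 1 & 3 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = U .
```

Here $L_1$ subtracts row 1 from rows 2 and 3, $P_2$ then swaps rows 2 and 3 to bring a nonzero pivot onto the diagonal, and no further elimination is needed.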
Comments
1.
2.
Informational errors
Type
Code
3
4
1
2
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
The inverse of a 3 × 3 matrix is computed. LFCRG is called to factor the matrix and to check for
singularity or ill-conditioning. LFIRG is called to determine the columns of the inverse.
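      USE LFCRG_INT
      USE UMACH_INT
      USE LFIRG_INT
      USE WRRRN_INT
      IMPLICIT NONE
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), J, NOUT
      REAL       A(LDA,N), AINV(LDA,N), FACT(LDFACT,N), RCOND, &
                 RES(N), RJ(N)
!                                 Set values for A
      A(1,:) = (/ 1.0, 3.0, 3.0/)
      A(2,:) = (/ 1.0, 3.0, 4.0/)
      A(3,:) = (/ 1.0, 4.0, 3.0/)
!                                 Factor A and estimate its condition number
      CALL LFCRG (A, FACT, IPVT, RCOND)
!                                 Print the reciprocal condition number
!                                 and the L1 condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99998) RCOND, 1.0E0/RCOND
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.0E0
      DO 10  J=1, N
         RJ(J) = 1.0
!                                 LFIRG computes the J-th column of
!                                 the inverse of A
         CALL LFIRG (A, FACT, IPVT, RJ, AINV(:,J), RES)
         RJ(J) = 0.0
   10 CONTINUE
!                                 Print results
      CALL WRRRN ('AINV', AINV)
!
99998 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
      END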
Output
RCOND < .02
L1 Condition number < 100.0
              AINV
         1        2        3
1    7.000   -3.000   -3.000
2   -1.000    0.000    1.000
3   -1.000    1.000    0.000
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. LFCRG is called to
factor the matrix and to check for singularity or ill-conditioning. LFIRG is called to determine the
columns of the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see
Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors
for the local arrays.
      USE MPI_SETUP_INT
      USE LFCRG_INT
      USE UMACH_INT
      USE LFIRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    J, LDA, N, DESCA(9), DESCL(9)
      INTEGER    INFO, MXCOL, MXLDA, NOUT
      INTEGER, ALLOCATABLE ::     IPVT0(:)
      REAL, ALLOCATABLE ::        A(:,:), AINV(:,:), X0(:), RJ(:)
      REAL, ALLOCATABLE ::        A0(:,:), FACT0(:,:), RES0(:), RJ0(:)
      REAL       RCOND
      PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/ 1.0, 3.0, 3.0/)
         A(2,:) = (/ 1.0, 3.0, 4.0/)
         A(3,:) = (/ 1.0, 4.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context id, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA), RES0(MXLDA), IPVT0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Call the factorization routine
      CALL LFCRG (A0, FACT0, IPVT0, RCOND)
!                                 Print the reciprocal condition number
!                                 and the L1 condition number
      IF (MP_RANK .EQ. 0) THEN
         CALL UMACH (2, NOUT)
         WRITE (NOUT,99998) RCOND, 1.0E0/RCOND
      ENDIF
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.0E0
      DO 10  J=1, N
         RJ(J) = 1.0
         CALL SCALAPACK_MAP(RJ, DESCL, RJ0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFIRG
!                                 reference computes the J-th column of
!                                 the inverse of A
         CALL LFIRG (A0, FACT0, IPVT0, RJ0, X0, RES0)
         RJ(J) = 0.0
         CALL SCALAPACK_UNMAP(X0, DESCL, AINV(:,J))
   10 CONTINUE
!                                 Print results
!                                 Only Rank=0 has the inverse, AINV.
      IF (MP_RANK .EQ. 0) CALL WRRRN ('AINV', AINV)
      IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
      DEALLOCATE(A0, IPVT0, FACT0, RES0, RJ, RJ0, X0)
!                                 Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
99998 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
      END
Output
RCOND < .02
L1 Condition number < 100.0
              AINV
         1        2        3
1    7.000   -3.000   -3.000
2   -1.000    0.000    1.000
3   -1.000    1.000    0.000
LFTRG
CAPABLE
Required Arguments
A N by N matrix to be factored. (Input)
FACT N by N matrix containing the LU factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
IPVT Vector of length N containing the pivoting information for the LU factorization.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFTRG performs an LU factorization of a real general coefficient matrix. The underlying
code is based on either LINPACK, LAPACK, or ScaLAPACK code, depending upon which
supporting libraries are used during linking. For a detailed explanation, see "Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK" in the Introduction section of this manual. The LU
factorization is done using scaled partial pivoting. Scaled partial pivoting differs from partial
pivoting in that the pivoting strategy is the same as if each row were scaled to have the same $\infty$-norm.
Otherwise, partial pivoting is used.
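The difference between the two pivoting strategies can be seen on a 2 by 2 matrix. The program below is an illustrative sketch, not library code: the function name pivot_row and the test matrix are assumptions, each row is scaled by its largest absolute entry, and rows are assumed to be nonzero.

```fortran
program scaled_pivot_demo
   implicit none
   real    :: a(2,2)
   integer :: p_plain, p_scaled

   a(1,:) = (/ 2.0, 1000.0 /)
   a(2,:) = (/ 1.0,    1.0 /)

   p_plain  = pivot_row(a, 1, .false.)
   p_scaled = pivot_row(a, 1, .true.)
   print *, 'partial pivoting picks row        ', p_plain
   print *, 'scaled partial pivoting picks row ', p_scaled

   ! Unscaled: |2| > |1| selects row 1.  Scaled: 2/1000 < 1/1 selects row 2.
   if (p_plain /= 1 .or. p_scaled /= 2) error stop 'unexpected pivot choice'

contains

   ! Return the pivot row for column k, searching rows k..n.
   integer function pivot_row(m, k, scaled)
      real,    intent(in) :: m(:,:)
      integer, intent(in) :: k
      logical, intent(in) :: scaled
      real    :: s(size(m,1))
      integer :: n
      n = size(m,1)
      if (scaled) then
         s = maxval(abs(m), dim=2)    ! largest absolute entry of each row
      else
         s = 1.0
      end if
      pivot_row = k - 1 + maxloc(abs(m(k:n,k)) / s(k:n), dim=1)
   end function pivot_row

end program scaled_pivot_demo
```

Scaling prevents a row with uniformly large entries from being chosen as the pivot row merely because of its magnitude.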
The routine LFTRG fails if U, the upper triangular part of the factorization, has a zero diagonal
element. This can occur only if A is singular or very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFIRG, LFSRG and LFDRG.
To solve systems of equations with multiple right-hand-side vectors, use LFTRG followed by either
LFIRG or LFSRG called once for each right-hand side. The routine LFDRG can be called to compute
the determinant of the coefficient matrix after LFTRG has performed the factorization. Let F be the
matrix FACT and let p be the vector IPVT. The triangular matrix U is stored in the upper triangle of
F. The strict lower triangle of F contains the information needed to reconstruct $L^{-1}$ using
$L^{-1} = L_{N-1}P_{N-1} \cdots L_1 P_1$
where $P_k$ is the identity matrix with rows $k$ and $p_k$ interchanged and $L_k$ is the identity with $F_{ik}$ for
$i = k + 1, \ldots, N$ inserted below the diagonal. The strict lower half of F can also be thought of as
containing the negative of the multipliers.
Routine LFTRG is based on the LINPACK routine SGEFA. See Dongarra et al. (1979). The routine
SGEFA uses partial pivoting.
Comments
1.
2.
Informational error
Type
Code
4
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A linear system with multiple right-hand sides is solved. Routine LFTRG is called to factor the
coefficient matrix. The routine LFSRG is called to compute the two solutions for the two
right-hand sides. In this case, the coefficient matrix is assumed to be well-conditioned and correctly
scaled. Otherwise, it would be better to call LFCRG to perform the factorization, and LFIRG to
compute the solutions.
      USE LFTRG_INT
      USE LFSRG_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), J
      REAL       A(LDA,LDA), B(N,2), FACT(LDFACT,LDFACT), X(N,2)
!
!                                 Set values for A and B
!
!                                 A = (  1.0   3.0   3.0)
!                                     (  1.0   3.0   4.0)
!                                     (  1.0   4.0   3.0)
!
!                                 B = (  1.0  10.0)
!                                     (  4.0  14.0)
!                                     ( -1.0   9.0)
!
      DATA A/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/
      DATA B/1.0, 4.0, -1.0, 10.0, 14.0, 9.0/
!
      CALL LFTRG (A, FACT, IPVT)
!                                 Solve for the two right-hand sides
      DO 10  J=1, 2
         CALL LFSRG (FACT, IPVT, B(:,J), X(:,J))
   10 CONTINUE
!                                 Print results
      CALL WRRRN ('X', X)
      END
Output
          X
         1        2
1   -2.000    1.000
2   -2.000   -1.000
3    3.000    4.000
ScaLAPACK Example
A linear system with multiple right-hand sides is solved. Routine LFTRG is called to factor the
coefficient matrix. The routine LFSRG is called to compute the two solutions for the two
right-hand sides. In this case, the coefficient matrix is assumed to be well-conditioned and correctly
scaled. Otherwise, it would be better to call LFCRG to perform the factorization, and LFIRG to
compute the solutions. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see
Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors
for the local arrays.
      USE MPI_SETUP_INT
      USE LFTRG_INT
      USE LFSRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    J, LDA, N, DESCA(9), DESCL(9)
      INTEGER    INFO, MXCOL, MXLDA
      INTEGER, ALLOCATABLE ::     IPVT0(:)
      REAL, ALLOCATABLE ::        A(:,:), B(:,:), X(:,:), X0(:)
      REAL, ALLOCATABLE ::        A0(:,:), FACT0(:,:), B0(:)
      PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N,2), X(N,2))
!                                 Set values for A and B
         A(1,:) = (/ 1.0, 3.0, 3.0/)
         A(2,:) = (/ 1.0, 3.0, 4.0/)
         A(3,:) = (/ 1.0, 4.0, 3.0/)
         B(1,:) = (/ 1.0, 10.0/)
         B(2,:) = (/ 4.0, 14.0/)
         B(3,:) = (/-1.0,  9.0/)
      ENDIF
END
Output
          X
         1        2
1   -2.000    1.000
2   -2.000   -1.000
3    3.000    4.000
LFSRG
CAPABLE
Solves a real general system of linear equations given the LU factorization of the coefficient
matrix.
Required Arguments
FACT N by N matrix containing the LU factorization of the coefficient matrix A as output
from routine LFCRG or LFTRG. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization of A
as output from subroutine LFCRG or LFTRG. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT, 2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT, 1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^{T}X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFSRG computes the solution of a system of linear algebraic equations having a real
general coefficient matrix. To compute the solution, the coefficient matrix must first undergo an
LU factorization. This may be done by calling either LFCRG or LFTRG. The solution to Ax = b is
found by solving the triangular systems Ly = b and Ux = y. The forward elimination step consists
of solving the system Ly = b by applying the same permutations and elimination operations to b
that were applied to the columns of A in the factorization routine. The backward substitution step
consists of solving the triangular system Ux = y for x.
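The two phases described above can be sketched in a few lines. The program below is a self-contained illustration written for this purpose, not the LFTRG or LFSRG source: the names factor and solve are hypothetical, plain (unscaled) partial pivoting is used, and negative multipliers are stored below the diagonal in the LINPACK manner so that the same interchanges and eliminations can be replayed on b.

```fortran
program lu_solve_demo
   implicit none
   integer, parameter :: n = 3
   real    :: a(n,n), fact(n,n), b(n), x(n)
   integer :: ipvt(n)

   a(1,:) = (/ 1.0, 3.0, 3.0 /)
   a(2,:) = (/ 1.0, 3.0, 4.0 /)
   a(3,:) = (/ 1.0, 4.0, 3.0 /)
   b      = (/ 1.0, 4.0, -1.0 /)

   call factor(a, fact, ipvt)
   call solve(fact, ipvt, b, x)
   print '(a,3f8.3)', ' x =', x
   if (maxval(abs(x - (/ -2.0, -2.0, 3.0 /))) > 1.0e-4) error stop 'wrong solution'

contains

   ! Gaussian elimination with partial pivoting.  U ends up in the upper
   ! triangle of f; the negative multipliers are stored below the diagonal.
   subroutine factor(m, f, p)
      real,    intent(in)  :: m(:,:)
      real,    intent(out) :: f(:,:)
      integer, intent(out) :: p(:)
      integer :: j, k, l, nn
      real    :: row(size(m,2))
      nn = size(m,1)
      f = m
      do k = 1, nn-1
         l = k - 1 + maxloc(abs(f(k:nn,k)), dim=1)
         p(k) = l
         if (f(l,k) == 0.0) error stop 'zero pivot: matrix is singular'
         ! Interchange rows k and l in columns k..nn only.
         row(k:nn) = f(k,k:nn);  f(k,k:nn) = f(l,k:nn);  f(l,k:nn) = row(k:nn)
         f(k+1:nn,k) = -f(k+1:nn,k) / f(k,k)
         do j = k+1, nn
            f(k+1:nn,j) = f(k+1:nn,j) + f(k+1:nn,k) * f(k,j)
         end do
      end do
      p(nn) = nn
      if (f(nn,nn) == 0.0) error stop 'zero pivot: matrix is singular'
   end subroutine factor

   ! Forward elimination (replay interchanges and multipliers on b),
   ! then back substitution with U.
   subroutine solve(f, p, rhs, sol)
      real,    intent(in)  :: f(:,:), rhs(:)
      integer, intent(in)  :: p(:)
      real,    intent(out) :: sol(:)
      integer :: k, nn
      real    :: t
      nn = size(f,1)
      sol = rhs
      do k = 1, nn-1
         t = sol(p(k));  sol(p(k)) = sol(k);  sol(k) = t
         sol(k+1:nn) = sol(k+1:nn) + f(k+1:nn,k) * sol(k)
      end do
      do k = nn, 1, -1
         sol(k) = sol(k) / f(k,k)
         sol(1:k-1) = sol(1:k-1) - f(1:k-1,k) * sol(k)
      end do
   end subroutine solve

end program lu_solve_demo
```

The right-hand side (1, 4, -1) is the first column of B in the LFTRG example, and the computed solution (-2, -2, 3) matches the first column of X printed there. Once the factorization is available, each additional right-hand side costs only the two triangular sweeps.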
LFSRG and LFIRG both solve a linear system given its LU factorization. LFIRG generally takes
more time and produces a more accurate answer than LFSRG. Each iteration of the iterative
refinement algorithm used by LFIRG calls LFSRG. The underlying code is based on either
LINPACK, LAPACK, or ScaLAPACK code, depending upon which supporting libraries are used
during linking. For a detailed explanation, see "Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK" in the Introduction section of this manual.
B0 Local vector of length MXLDA containing the local portions of the distributed vector B.
B contains the right-hand side of the linear system. (Input)
X0 Local vector of length MXLDA containing the local portions of the distributed vector X.
X contains the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a
call to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities)
has been made. See the ScaLAPACK Example below.
Example
The inverse is computed for a real general 3 × 3 matrix. The input matrix is assumed to be
well-conditioned; hence, LFTRG is used rather than LFCRG.
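      USE LFSRG_INT
      USE LFTRG_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    I, IPVT(N), J
      REAL       A(LDA,LDA), AINV(LDA,LDA), FACT(LDFACT,LDFACT), RJ(N)
!                                 Set values for A
      A(1,:) = (/ 1.0, 3.0, 3.0/)
      A(2,:) = (/ 1.0, 3.0, 4.0/)
      A(3,:) = (/ 1.0, 4.0, 3.0/)
!
      CALL LFTRG (A, FACT, IPVT)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.0E0
      DO 10  J=1, N
         RJ(J) = 1.0
!                                 LFSRG computes the J-th column of
!                                 the inverse of A
         CALL LFSRG (FACT, IPVT, RJ, AINV(:,J))
         RJ(J) = 0.0
   10 CONTINUE
!                                 Print results
      CALL WRRRN ('AINV', AINV)
      END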
Output
              AINV
         1        2        3
1    7.000   -3.000   -3.000
2   -1.000    0.000    1.000
3   -1.000    1.000    0.000
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. The input matrix is
assumed to be well-conditioned, hence, LFTRG is used rather than LFCRG. LFSRG is called to
determine the columns of the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility
routines (see Chapter 11, Utilities) used to map and unmap arrays to and from the processor
grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes
the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFTRG_INT
      USE UMACH_INT
      USE LFSRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    J, LDA, N, DESCA(9), DESCL(9)
      INTEGER    INFO, MXCOL, MXLDA
      INTEGER, ALLOCATABLE ::     IPVT0(:)
      REAL, ALLOCATABLE ::        A(:,:), AINV(:,:), X0(:), RJ(:)
      REAL, ALLOCATABLE ::        A0(:,:), FACT0(:,:), RJ0(:)
      PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/ 1.0, 3.0, 3.0/)
         A(2,:) = (/ 1.0, 3.0, 4.0/)
         A(3,:) = (/ 1.0, 4.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context id, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA), FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA), IPVT0(MXLDA))
Output
              AINV
         1        2        3
1    7.000   -3.000   -3.000
2   -1.000    0.000    1.000
3   -1.000    1.000    0.000
LFIRG
CAPABLE
Uses iterative refinement to improve the solution of a real general system of linear equations.
Required Arguments
A N by N matrix containing the coefficient matrix of the linear system. (Input)
FACT N by N matrix containing the LU factorization of the coefficient matrix A as output
from routine LFCRG/DLFCRG or LFTRG/DLFTRG. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization of A
as output from routine LFCRG/DLFCRG or LFTRG/DLFTRG. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
RES Vector of length N containing the final correction at the improved solution. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system A * X = B is solved.
IPATH = 2 means the system $A^{T} X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFIRG computes the solution of a system of linear algebraic equations having a real
general coefficient matrix. Iterative refinement is performed on the solution vector to improve the
accuracy. Usually almost all of the digits in the solution are accurate, even if the matrix is
somewhat ill-conditioned. The underlying code is based on either LINPACK, LAPACK, or
ScaLAPACK code, depending upon which supporting libraries are used during linking. For a
detailed explanation, see "Using ScaLAPACK, LAPACK, LINPACK, and EISPACK" in the
Introduction section of this manual.
To compute the solution, the coefficient matrix must first undergo an LU factorization. This may
be done by calling either LFCRG or LFTRG.
Iterative refinement fails only if the matrix is very ill-conditioned.
Routines LFIRG and LFSRG both solve a linear system given its LU factorization. LFIRG generally
takes more time and produces a more accurate answer than LFSRG. Each iteration of the iterative
refinement algorithm used by LFIRG calls LFSRG.
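The textbook form of iterative refinement is summarized below for orientation. The manual does not spell out the internals of LFIRG, so the iteration index $i$, the correction $d^{(i)}$, and the stopping rule are assumptions of this sketch rather than a description of the routine.

```latex
\begin{aligned}
x^{(0)} &: \quad L U\, x^{(0)} = b && \text{(initial solve with the stored factors)} \\
r^{(i)} &= b - A\,x^{(i)} && \text{(residual, often accumulated in higher precision)} \\
L U\, d^{(i)} &= r^{(i)} && \text{(one forward and back substitution, as in LFSRG)} \\
x^{(i+1)} &= x^{(i)} + d^{(i)} && \text{(corrected solution)}
\end{aligned}
```

The iteration stops once the correction $d^{(i)}$ is negligible relative to $x^{(i)}$. In this picture the output argument RES, described as the final correction at the improved solution, plays the role of the last $d^{(i)}$, and the original matrix A is needed alongside FACT because the residual is formed with A itself.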
Comments
Informational error
Type  Code
3     2     The input matrix is too ill-conditioned for iterative refinement to be
            effective.
RES0 Local vector of length MXLDA containing the local portions of the distributed
vector RES. RES contains the final correction at the improved solution to the linear
system. (Output)
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving
the system each of the first two times by adding 0.5 to the second element.
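      USE LFIRG_INT
      USE LFCRG_INT
      USE UMACH_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), B(N), FACT(LDFACT,LDFACT), RCOND, RES(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  1.0   3.0   3.0)
!                                     (  1.0   3.0   4.0)
!                                     (  1.0   4.0   3.0)
!
!                                 B = ( -0.5  -1.0   1.5)
!
      DATA A/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/
      DATA B/-0.5, -1.0, 1.5/
!                                 Factor A and estimate its condition number
      CALL LFCRG (A, FACT, IPVT, RCOND)
!                                 Print the estimated condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Solve the three systems
      DO 10  J=1, 3
         CALL LFIRG (A, FACT, IPVT, B, X, RES)
!                                 Print results
         CALL WRRRN ('X', X, 1, N, 1)
!                                 Perturb B by adding 0.5 to B(2)
         B(2) = B(2) + 0.5
   10 CONTINUE
!
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
      END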
Output
RCOND < 0.02
L1 Condition number < 100.0
                   X
      1       2       3
 -5.000   2.000  -0.500

                   X
      1       2       3
 -6.500   2.000   0.000

                   X
      1       2       3
 -8.000   2.000   0.500
ScaLAPACK Example
The same set of linear systems is solved successively as a distributed example. The right-hand side
vector is perturbed after solving the system each of the first two times by adding 0.5 to the second
element. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11,
Utilities) used to map and unmap arrays to and from the processor grid. They are used here for
brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local
arrays.
      USE MPI_SETUP_INT
      USE LFIRG_INT
      USE UMACH_INT
      USE LFCRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                  INFO, MXCOL, MXLDA, NOUT
      INTEGER, ALLOCATABLE ::  IPVT0(:)
      REAL, ALLOCATABLE ::     A(:,:), B(:), X(:), X0(:), AINV(:,:)
      REAL, ALLOCATABLE ::     A0(:,:), FACT0(:,:), RES0(:), B0(:)
      REAL                     RCOND
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,N), AINV(LDA,N), B(N), X(N))
!                                 Set values for A and B
          A(1,:) = (/ 1.0, 3.0, 3.0/)
          A(2,:) = (/ 1.0, 3.0, 4.0/)
          A(3,:) = (/ 1.0, 4.0, 3.0/)
!
          B(:) = (/-0.5, -1.0, 1.5/)
      ENDIF
Output
RCOND < 0.02
L1 Condition number < 100.0
            X
     1        2        3
-5.000    2.000   -0.500

            X
     1        2        3
-6.500    2.000    0.000

            X
     1        2        3
-8.000    2.000    0.500
LFDRG
Computes the determinant of a real general matrix given the LU factorization of the matrix.
Required Arguments
FACT N by N matrix containing the LU factorization of the matrix A as output from routine
LFTRG/DLFTRG or LFCRG/DLFCRG. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization as
output from routine LFTRG/DLFTRG or LFCRG/DLFCRG. (Input)
DET1 Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDRG computes the determinant of a real general coefficient matrix. To compute the
determinant, the coefficient matrix must first undergo an LU factorization. This may be done by
calling either LFCRG or LFTRG. The formula $\det A = \det L \, \det U$ is used to compute the
determinant. Since the determinant of a triangular matrix is the product of the diagonal elements,
$$\det U = \prod_{i=1}^{N} U_{ii}$$
(The matrix U is stored in the upper triangle of FACT.) Since L is the product of triangular matrices
with unit diagonals and of permutation matrices, $\det L = (-1)^k$, where k is the number of pivoting
interchanges.
Routine LFDRG is based on the LINPACK routine SGEDI; see Dongarra et al. (1979).
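The determinant computation described above can be sketched in a few lines of self-contained Fortran. This is an illustration under stated assumptions, not the LFDRG source: the factorization loop is a hypothetical stand-in for LFTRG, the pivot vector is assumed to hold the row interchanged with row k at step k, and the matrix is the 3 × 3 matrix used in the LFIRG and LINRG examples rather than the one in the example below.

```fortran
program det_from_lu
  implicit none
  integer, parameter :: n = 3
  real    :: f(n,n), tmp(n), det1, det2
  integer :: ipvt(n), i, k, p

  ! Illustrative matrix, stored column by column.
  f = reshape((/ 1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0 /), (/ n, n /))

  ! Hypothetical stand-in for LFTRG: in-place LU with partial pivoting.
  do k = 1, n
     p = k - 1 + maxloc(abs(f(k:n,k)), dim=1)
     ipvt(k) = p
     if (p /= k) then
        tmp = f(k,:);  f(k,:) = f(p,:);  f(p,:) = tmp
     end if
     if (f(k,k) == 0.0) cycle            ! singular column: nothing to eliminate
     f(k+1:n,k) = f(k+1:n,k) / f(k,k)
     do i = k + 1, n
        f(i,k+1:n) = f(i,k+1:n) - f(i,k) * f(k,k+1:n)
     end do
  end do

  ! det A = (-1)**k * product of U(i,i), kept as DET1 * 10**DET2.
  det1 = 1.0
  det2 = 0.0
  do k = 1, n
     if (ipvt(k) /= k) det1 = -det1      ! one sign change per interchange
     det1 = det1 * f(k,k)
     if (det1 == 0.0) exit
     do while (abs(det1) < 1.0)
        det1 = det1 * 10.0
        det2 = det2 - 1.0
     end do
     do while (abs(det1) >= 10.0)
        det1 = det1 / 10.0
        det2 = det2 + 1.0
     end do
  end do
  print '(a,f6.3,a,f4.0)', ' det(A) = ', det1, ' * 10**', det2
end program det_from_lu
```

Renormalizing after every multiplication keeps the running product inside 1.0 ≤ |DET1| < 10.0, which is what lets the mantissa and exponent form avoid overflow and underflow for large N. For this matrix U has a unit diagonal and one interchange occurs, so the determinant is -1.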
Example
The determinant is computed for a real general 3 × 3 matrix.
      USE LFDRG_INT
      USE LFTRG_INT
      USE UMACH_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), DET1, DET2, FACT(LDFACT,LDFACT)
!
      CALL LFTRG (A, FACT, IPVT)
!
99999 FORMAT (' The determinant of A is ', F6.3, ' * 10**', F2.0)
      END
Output
The determinant of A is -4.761 * 10**3.
LINRG
Required Arguments
A N by N matrix containing the matrix to be inverted. (Input)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDAINV Leading dimension of AINV exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDAINV = size (AINV,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LINRG computes the inverse of a real general matrix. The underlying code is based on
either LINPACK, LAPACK, or ScaLAPACK code depending upon which supporting libraries
are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK in the Introduction section of this manual. LINRG first uses the routine
LFCRG to compute an LU factorization of the coefficient matrix and to estimate the condition
number of the matrix. Routine LFCRG computes U and the information needed to compute $L^{-1}$.
LINRT is then used to compute $U^{-1}$. Finally, $A^{-1}$ is computed using $A^{-1} = U^{-1} L^{-1}$.
The routine LINRG fails if U, the upper triangular part of the factorization, has a zero diagonal
element or if the iterative refinement algorithm fails to converge. This error occurs only if A is
singular or very close to a singular matrix.
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in $A^{-1}$.
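As a small sanity check on these statements, the following self-contained sketch multiplies the 3 × 3 matrix of the example in this section by the inverse listed in its output, and evaluates the exact L1 condition number from the two column-sum norms. It calls no IMSL routine; the program and variable names are illustrative only.

```fortran
program inverse_check
  implicit none
  integer, parameter :: n = 3
  real    :: a(n,n), ainv(n,n), r(n,n), anorm, ainvnorm
  integer :: i

  ! A and the inverse reported for it, both stored column by column.
  a    = reshape((/ 1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0 /), (/ n, n /))
  ainv = reshape((/ 7.0, -1.0, -1.0, -3.0, 0.0, 1.0, -3.0, 1.0, 0.0 /), &
                 (/ n, n /))

  ! Residual of the defining identity A * AINV = I.
  r = matmul(a, ainv)
  do i = 1, n
     r(i,i) = r(i,i) - 1.0
  end do
  print '(a,es10.3)', ' max |A*AINV - I|    = ', maxval(abs(r))

  ! L1 norm of a matrix: largest column sum of absolute values.
  anorm    = maxval(sum(abs(a), dim=1))
  ainvnorm = maxval(sum(abs(ainv), dim=1))
  print '(a,f6.1)',   ' L1 condition number = ', anorm * ainvnorm
end program inverse_check
```

The column-sum norms are 10 and 9, so the exact L1 condition number is 90, far below 1/ε in single precision. No warning would be expected for this matrix.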
Comments
1.
2.
Informational errors
Type
Code
3
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
The inverse is computed for a real general 3 × 3 matrix.
      USE LINRG_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDAINV=3)
      INTEGER    I, J, NOUT
      REAL       A(LDA,LDA), AINV(LDAINV,LDAINV)
!
!                                 A = (  1.0   3.0   3.0)
!                                     (  1.0   3.0   4.0)
!                                     (  1.0   4.0   3.0)
!
      DATA A/1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0/
!
      CALL LINRG (A, AINV)
!                                 Print results
      CALL WRRRN ('AINV', AINV)
      END
Output
              AINV
         1        2        3
1    7.000   -3.000   -3.000
2   -1.000    0.000    1.000
3   -1.000    1.000    0.000
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities) used to map and unmap
arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LINRG_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER               LDA, LDAINV, N, DESCA(9)
      INTEGER               INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::  A(:,:), AINV(:,:)
      REAL, ALLOCATABLE ::  A0(:,:), AINV0(:,:)
      PARAMETER (LDA=3, LDAINV=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,N), AINV(LDAINV,N))
!                                 Set values for A
          A(1,:) = (/ 1.0, 3.0, 3.0/)
          A(2,:) = (/ 1.0, 3.0, 4.0/)
          A(3,:) = (/ 1.0, 4.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
Output
              AINV
         1        2        3
1    7.000   -3.000   -3.000
2   -1.000    0.000    1.000
3   -1.000    1.000    0.000
LSACG
Required Arguments
A Complex N by N matrix containing the coefficients of the linear system. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^H X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSACG solves a system of linear algebraic equations with a complex general coefficient
matrix. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code
depending upon which supporting libraries are used during linking. For a detailed explanation see
Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this
manual. LSACG first uses the routine LFCCG to compute an LU factorization of the coefficient
matrix and to estimate the condition number of the matrix. The solution of the linear system is
then found using the iterative refinement routine LFICG.
LSACG fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the
iterative refinement algorithm fails to converge. These errors occur only if A is singular or very
close to a singular matrix.
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system. LSACG solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2ACG the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSACG.
Additional memory allocation for FACT and option value restoration are done
automatically in LSACG. Users directly calling L2ACG can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSACG or L2ACG. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17 This option has two values that determine if the L1 condition number is to be
computed. Routine LSACG temporarily replaces IVAL(2) by IVAL(1). The routine
L2CCG computes the condition number if IVAL(2) = 2. Otherwise L2CCG skips this
computation. LSACG restores the option. Default values for the option are
IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of three linear equations is solved. The coefficient matrix has complex general form and
the right-hand-side vector b has three elements.
      USE LSACG_INT
      USE WRCRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!                                 Set values for A and B
!
!                                 A = (  3.0-2.0i   2.0+4.0i   0.0-3.0i)
!                                     (  1.0+1.0i   2.0-6.0i   1.0+2.0i)
!                                     (  4.0+0.0i  -5.0+1.0i   3.0-2.0i)
!
!                                 B = ( 10.0+5.0i   6.0-7.0i  -1.0+2.0i)
!
Output
                        X
              1                 2                 3
( 1.000,-1.000)   ( 2.000, 1.000)   ( 0.000, 3.000)
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The
coefficient matrix has complex general form and the right-hand-side vector b has three elements.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities)
used to map and unmap arrays to and from the processor grid. They are used here for brevity.
DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSACG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  LDA, N, DESCA(9), DESCX(9)
      INTEGER                  INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::  A(:,:), B(:), X(:)
      COMPLEX, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
          A(1,:) = (/ (3.0, -2.0), (2.0, 4.0), (0.0, -3.0)/)
          A(2,:) = (/ (1.0, 1.0), (2.0, -6.0), (1.0, 2.0)/)
          A(3,:) = (/ (4.0, 0.0), (-5.0, 1.0), (3.0, -2.0)/)
!
Output
                        X
              1                 2                 3
( 1.000,-1.000)   ( 2.000, 1.000)   ( 0.000, 3.000)
LSLCG
Required Arguments
A Complex N by N matrix containing the coefficients of the linear system. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^H X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSLCG solves a system of linear algebraic equations with a complex general coefficient
matrix. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code
depending upon which supporting libraries are used during linking. For a detailed explanation see
Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this
manual. LSLCG first uses the routine LFCCG to compute an LU factorization of the coefficient
matrix and to estimate the condition number of the matrix. The solution of the linear system is
then found using LFSCG.
LSLCG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
occurs only if A either is a singular matrix or is very close to a singular matrix.
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that
LSACG be used.
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2LCG the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLCG.
Additional memory allocation for FACT and option value restoration are done
automatically in LSLCG. Users directly calling L2LCG can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSLCG or L2LCG. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSLCG temporarily replaces IVAL(2) by IVAL(1). The
routine L2CCG computes the condition number if IVAL(2) = 2. Otherwise L2CCG
skips this computation. LSLCG restores the option. Default values for the option
are IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of three linear equations is solved. The coefficient matrix has complex general form and
the right-hand-side vector b has three elements.
      USE LSLCG_INT
      USE WRCRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!                                 Set values for A and B
!
!                                 A = (  3.0-2.0i   2.0+4.0i   0.0-3.0i)
!                                     (  1.0+1.0i   2.0-6.0i   1.0+2.0i)
!                                     (  4.0+0.0i  -5.0+1.0i   3.0-2.0i)
!
!                                 B = ( 10.0+5.0i   6.0-7.0i  -1.0+2.0i)
!
Output
                        X
              1                 2                 3
( 1.000,-1.000)   ( 2.000, 1.000)   ( 0.000, 3.000)
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The
coefficient matrix has complex general form and the right-hand-side vector b has three elements.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities)
used to map and unmap arrays to and from the processor grid. They are used here for brevity.
DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSLCG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  LDA, N, DESCA(9), DESCX(9)
      INTEGER                  INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::  A(:,:), B(:), X(:)
      COMPLEX, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
!
Output
                        X
              1                 2                 3
( 1.000,-1.000)   ( 2.000, 1.000)   ( 0.000, 3.000)
LFCCG
Computes the LU factorization of a complex general matrix and estimates its L1 condition number.
Required Arguments
A Complex N by N matrix to be factored. (Input)
FACT Complex N by N matrix containing the LU factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
IPVT Vector of length N containing the pivoting information for the LU factorization.
(Output)
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFCCG performs an LU factorization of a complex general coefficient matrix. It also
estimates the condition number of the matrix. The underlying code is based on either LINPACK,
LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during
linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual. The LU factorization is done using scaled
partial pivoting. Scaled partial pivoting differs from partial pivoting in that the pivoting strategy is
the same as if each row were scaled to have the same ∞-norm.
The L1 condition number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to
compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm is the same as
used by LINPACK and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
LFCCG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
can occur only if A either is singular or is very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFICG, LFSCG and LFDCG.
To solve systems of equations with multiple right-hand-side vectors, use LFCCG followed by either
LFICG or LFSCG called once for each right-hand side. The routine LFDCG can be called to compute
the determinant of the coefficient matrix after LFCCG has performed the factorization.
Let F be the matrix FACT and let p be the vector IPVT. The triangular matrix U is stored in the
upper triangle of F. The strict lower triangle of F contains the information needed to reconstruct $L^{-1}$
using
$$L^{-1} = L_{N-1} P_{N-1} \cdots L_1 P_1$$
where $P_k$ is the identity matrix with rows $k$ and $p_k$ interchanged and $L_k$ is the identity with $F_{ik}$ for
$i = k + 1, \ldots, N$ inserted below the diagonal. The strict lower half of F can also be thought of as
containing the negative of the multipliers.
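The storage scheme just described can be checked numerically with a short, self-contained sketch. It is an illustration only: it uses a real matrix for brevity (the 3 × 3 matrix of the LINRG example) rather than a complex one, the factorization loop is a hypothetical LINPACK-style stand-in rather than LFCCG itself, and it assumes every pivot is nonzero. After factoring, the interchanges $P_k$ and elementary matrices $L_k$ are applied to A in order, and the result is compared with the upper triangle of F.

```fortran
program apply_linv
  implicit none
  integer, parameter :: n = 3
  real    :: a(n,n), f(n,n), w(n,n), tmp(n), err
  integer :: ipvt(n), i, k, p

  a = reshape((/ 1.0, 1.0, 1.0, 3.0, 3.0, 4.0, 3.0, 4.0, 3.0 /), (/ n, n /))

  ! LINPACK-style factorization: rows are interchanged only in the active
  ! columns k..n, and the NEGATIVE multipliers are stored below the diagonal.
  f = a
  do k = 1, n - 1
     p = k - 1 + maxloc(abs(f(k:n,k)), dim=1)
     ipvt(k) = p
     tmp(k:n) = f(k,k:n);  f(k,k:n) = f(p,k:n);  f(p,k:n) = tmp(k:n)
     f(k+1:n,k) = -f(k+1:n,k) / f(k,k)       ! assumes f(k,k) is nonzero
     do i = k + 1, n
        f(i,k+1:n) = f(i,k+1:n) + f(i,k) * f(k,k+1:n)
     end do
  end do
  ipvt(n) = n

  ! Apply L(N-1) P(N-1) ... L(1) P(1) to A, rightmost factor first.
  w = a
  do k = 1, n - 1
     p = ipvt(k)
     tmp = w(k,:);  w(k,:) = w(p,:);  w(p,:) = tmp      ! P(k)
     do i = k + 1, n
        w(i,:) = w(i,:) + f(i,k) * w(k,:)               ! L(k)
     end do
  end do

  ! W should equal U: the upper triangle of F, with zeros below the diagonal.
  err = 0.0
  do i = 1, n
     err = max(err, maxval(abs(w(i,i:n) - f(i,i:n))))
     if (i > 1) err = max(err, maxval(abs(w(i,1:i-1))))
  end do
  print '(a,es10.3)', ' max |Linv*A - U| = ', err
end program apply_linv
```

Because the multipliers are stored with their sign already reversed, each $L_k$ is applied by a plain addition of a multiple of row k, which is the sense in which F holds the negative of the multipliers.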
Comments
1.
2.
Informational errors
Type
Code
3
4
1
2
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
The inverse of a 3 × 3 matrix is computed. LFCCG is called to factor the matrix and to check for
singularity or ill-conditioning. LFICG is called to determine the columns of the inverse.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       RCOND, THIRD
      COMPLEX    A(LDA,N), AINV(LDA,N), RJ(N), FACT(LDFACT,N), RES(N)
!                                 Declare functions
      COMPLEX    CMPLX
!                                 Set values for A
!
!                                 A = (  1.0+1.0i   2.0+3.0i   3.0+3.0i)
!                                     (  2.0+1.0i   5.0+3.0i   7.0+4.0i)
!                                     ( -2.0+1.0i  -4.0+4.0i  -5.0+3.0i)
!
99999 FORMAT (' RCOND = ',F5.3,/,
      END
Output
RCOND < .02
L1 Condition number < 100.0
                          AINV
                 1                 2                 3
1  ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
2  (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
3  (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. LFCCG is called to
factor the matrix and to check for singularity or ill-conditioning. LFICG is called to determine the
columns of the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see
Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors
for the local arrays.
      USE MPI_SETUP_INT
      USE LFCCG_INT
      USE UMACH_INT
      USE LFICG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                  INFO, MXCOL, MXLDA, NOUT
      INTEGER, ALLOCATABLE ::  IPVT0(:)
      COMPLEX, ALLOCATABLE ::  A(:,:), AINV(:,:), X0(:), RJ(:)
      COMPLEX, ALLOCATABLE ::  A0(:,:), FACT0(:,:), RES0(:), RJ0(:)
      REAL                     RCOND, THIRD
      PARAMETER (LDA=3, N=3)
!
Output
RCOND < .02
L1 Condition number < 100.0
                          AINV
                 1                 2                 3
1  ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
2  (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
3  (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
LFTCG
Required Arguments
A Complex N by N matrix to be factored. (Input)
FACT Complex N by N matrix containing the LU factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
IPVT Vector of length N containing the pivoting information for the LU factorization.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFTCG performs an LU factorization of a complex general coefficient matrix. The LU
factorization is done using scaled partial pivoting. Scaled partial pivoting differs from partial
pivoting in that the pivoting strategy is the same as if each row were scaled to have the same
∞-norm.
LFTCG fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
can occur only if A either is singular or is very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFICG, LFSCG and LFDCG.
To solve systems of equations with multiple right-hand-side vectors, use LFTCG followed by either
LFICG or LFSCG called once for each right-hand side. The routine LFDCG can be called to compute
the determinant of the coefficient matrix after LFTCG has performed the factorization.
Let F be the matrix FACT and let p be the vector IPVT. The triangular matrix U is stored in the
upper triangle of F. The strict lower triangle of F contains the information needed to reconstruct $L^{-1}$
using
$$L^{-1} = L_{N-1} P_{N-1} \cdots L_1 P_1$$
where $P_k$ is the identity matrix with rows $k$ and $p_k$ interchanged and $L_k$ is the identity with $F_{ik}$ for
$i = k + 1, \ldots, N$ inserted below the diagonal. The strict lower half of F can also be thought of as
containing the negative of the multipliers.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2.
Informational error
Type   Code
4      2      The input matrix is singular.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A linear system with multiple right-hand sides is solved. LFTCG is called to factor the coefficient
matrix. LFSCG is called to compute the two solutions for the two right-hand sides. In this case the
coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be
better to call LFCCG to perform the factorization, and LFICG to compute the solutions.
      USE LFTCG_INT
      USE LFSCG_INT
      USE WRCRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N)
      COMPLEX    A(LDA,LDA), B(N,2), X(N,2), FACT(LDFACT,LDFACT)
!                                 Set values for A
!
!                                 A = (  1.0+1.0i   2.0+3.0i   3.0-3.0i)
!                                     (  2.0+1.0i   5.0+3.0i   7.0-5.0i)
!                                     ( -2.0+1.0i  -4.0+4.0i   5.0+3.0i)
!
!                                 Factor A
      CALL LFTCG (A, FACT, IPVT)
Output
                  X
                 1                 2
1  ( 1.000,-1.000)   ( 0.000, 2.000)
2  ( 2.000, 4.000)   (-2.000,-1.000)
3  ( 3.000, 0.000)   ( 1.000, 3.000)
ScaLAPACK Example
The same linear system with multiple right-hand sides is solved as a distributed example. LFTCG is
called to factor the matrix. LFSCG is called to compute the two solutions for the two right-hand
sides. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11,
Utilities) used to map and unmap arrays to and from the processor grid. They are used here for
brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local
arrays.
      USE MPI_SETUP_INT
      USE LFTCG_INT
      USE LFSCG_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER                  J, LDA, N, DESCA(9), DESCL(9)
      INTEGER                  INFO, MXCOL, MXLDA
      INTEGER, ALLOCATABLE ::  IPVT0(:)
      COMPLEX, ALLOCATABLE ::  A(:,:), B(:,:), X(:,:), X0(:)
      COMPLEX, ALLOCATABLE ::  A0(:,:), FACT0(:,:), B0(:)
      PARAMETER (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,N), B(N,2), X(N,2))
!                                 Set values for A and B
          A(1,:) = (/ ( 1.0, 1.0), ( 2.0, 3.0), ( 3.0,-3.0)/)
          A(2,:) = (/ ( 2.0, 1.0), ( 5.0, 3.0), ( 7.0,-5.0)/)
          A(3,:) = (/ (-2.0, 1.0), (-4.0, 4.0), ( 5.0, 3.0)/)
!
Output
                  X
                 1                 2
1  ( 1.000,-1.000)   ( 0.000, 2.000)
2  ( 2.000, 4.000)   (-2.000,-1.000)
3  ( 3.000, 0.000)   ( 1.000, 3.000)
LFSCG
Solves a complex general system of linear equations given the LU factorization of the coefficient
matrix.
Required Arguments
FACT Complex N by N matrix containing the LU factorization of the coefficient matrix A
as output from routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization of A
as output from routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^H X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFSCG computes the solution of a system of linear algebraic equations having a complex
general coefficient matrix. To compute the solution, the coefficient matrix must first undergo an
LU factorization. This may be done by calling either LFCCG or LFTCG. The solution to Ax = b is
found by solving the triangular systems Ly = b and Ux = y. The forward elimination step consists
of solving the system Ly = b by applying the same permutations and elimination operations to b
that were applied to the columns of A in the factorization routine. The backward substitution step
consists of solving the triangular system Ux = y for x.
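The two triangular solves can be sketched as follows. This is a minimal illustration, not the LFSCG source: the complex factors L and U are made-up values rather than output of LFCCG or LFTCG, the row interchanges recorded in IPVT are omitted for brevity, and the right-hand side is built from a known solution so that the error can be printed.

```fortran
program lu_solve_sketch
  implicit none
  integer, parameter :: n = 3
  complex :: l(n,n), u(n,n), b(n), x(n), y(n), xtrue(n)
  integer :: i

  ! Illustrative unit lower triangular L and upper triangular U.
  l = (0.0, 0.0)
  u = (0.0, 0.0)
  do i = 1, n
     l(i,i) = (1.0, 0.0)
  end do
  l(2,1) = (0.0, 1.0);  l(3,1) = (2.0, 0.0);  l(3,2) = (1.0, -1.0)
  u(1,1) = (1.0, 1.0);  u(1,2) = (2.0, 0.0);  u(1,3) = (0.0, 1.0)
  u(2,2) = (2.0, 0.0);  u(2,3) = (1.0, 0.0)
  u(3,3) = (0.0, 2.0)

  ! Right-hand side b = L*U*xtrue for a known solution.
  xtrue = (/ (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0) /)
  b = matmul(l, matmul(u, xtrue))

  ! Forward elimination: solve L*y = b from the top row down.
  do i = 1, n
     y(i) = b(i) - sum(l(i,1:i-1) * y(1:i-1))
  end do

  ! Backward substitution: solve U*x = y from the bottom row up.
  do i = n, 1, -1
     x(i) = (y(i) - sum(u(i,i+1:n) * x(i+1:n))) / u(i,i)
  end do

  print '(a,es10.3)', ' max |x - xtrue| = ', maxval(abs(x - xtrue))
end program lu_solve_sketch
```

The sums are written with SUM rather than DOT_PRODUCT because DOT_PRODUCT conjugates its first argument for complex data, which is not wanted here.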
Routines LFSCG and LFICG both solve a linear system given its LU factorization. LFICG generally
takes more time and produces a more accurate answer than LFSCG. Each iteration of the iterative
refinement algorithm used by LFICG calls LFSCG.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
The inverse is computed for a complex general 3 × 3 matrix. The input matrix is assumed to be
well-conditioned, hence LFTCG is used rather than LFCCG.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      INTEGER    IPVT(N)
      REAL       THIRD
      COMPLEX    A(LDA,LDA), AINV(LDA,LDA), RJ(N), FACT(LDFACT,LDFACT)
!                                 Declare functions
      COMPLEX    CMPLX
!                                 Set values for A
!
!                                 A = (  1.0+1.0i   2.0+3.0i   3.0+3.0i)
!                                     (  2.0+1.0i   5.0+3.0i   7.0+4.0i)
!                                     ( -2.0+1.0i  -4.0+4.0i  -5.0+3.0i)
!
Output
                          AINV
                 1                 2                 3
1  ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
2  (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
3  (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. The input matrix is
assumed to be well-conditioned, hence LFTCG is used rather than LFCCG. LFSCG is called to
determine the columns of the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility
routines (see Chapter 11, Utilities) used to map and unmap arrays to and from the processor
grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes
the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LFTCG_INT
USE LFSCG_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    J, LDA, N, DESCA(9), DESCL(9)
INTEGER    INFO, MXCOL, MXLDA
INTEGER, ALLOCATABLE ::  IPVT0(:)
COMPLEX, ALLOCATABLE ::  A(:,:), AINV(:,:), X0(:)
COMPLEX, ALLOCATABLE ::  A0(:,:), FACT0(:,:), RJ(:), RJ0(:)
REAL       THIRD
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
   A(1,:) = (/ ( 1.0, 1.0), ( 2.0, 3.0), ( 3.0, 3.0)/)
   A(2,:) = (/ ( 2.0, 1.0), ( 5.0, 3.0), ( 7.0, 4.0)/)
   A(3,:) = (/ (-2.0, 1.0), (-4.0, 4.0), (-5.0, 3.0)/)
!                                 Scale A by dividing by three
   THIRD = 1.0/3.0
   A = A * THIRD
ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
CALL DESCINIT(DESCL, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), RJ(N), &
         RJ0(MXLDA), IPVT0(MXLDA))
Output
                          AINV
                  1                 2                 3
1   ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
2   (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
3   (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
LFICG
Uses iterative refinement to improve the solution of a complex general system of linear equations.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the linear system. (Input)
FACT Complex N by N matrix containing the LU factorization of the coefficient matrix A
as output from routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization of A
as output from routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
RES Complex vector of length N containing the residual vector at the improved solution.
(Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^H X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFICG computes the solution of a system of linear algebraic equations having a complex
general coefficient matrix. Iterative refinement is performed on the solution vector to improve the
accuracy. Usually almost all of the digits in the solution are accurate, even if the matrix is
somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an LU factorization. This may
be done by calling either LFCCG, or LFTCG.
Iterative refinement fails only if the matrix is very ill-conditioned. Routines LFICG and LFSCG
both solve a linear system given its LU factorization. LFICG generally takes more time and
produces a more accurate answer than LFSCG. Each iteration of the iterative refinement algorithm
used by LFICG calls LFSCG.
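The refinement loop described above can be sketched in a few lines. This is a minimal illustration in pure Python, not the IMSL implementation: the names `lu_factor`, `lu_solve`, and `refine` are hypothetical, the fixed iteration count is an assumption, and the residual is computed in ordinary double precision, whereas library implementations often accumulate it in higher precision. The matrix is the one shown in the comment of the Example for this routine, and the right-hand side is built from a known solution.

```python
def lu_factor(a):
    """LU factorization with partial pivoting (hypothetical helper).

    Returns (lu, perm): L (unit diagonal) and U packed in one matrix,
    and perm such that row i of P*A is row perm[i] of A.
    """
    n = len(a)
    lu = [list(row) for row in a]
    perm = list(range(n))
    for k in range(n):
        # Choose the pivot of largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ZeroDivisionError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b from the packed factors (the role LFSCG plays)."""
    n = len(lu)
    x = [b[perm[i]] for i in range(n)]
    for i in range(n):                      # forward substitution with L
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):            # back substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def refine(a, lu, perm, b, iterations=3):
    """Iterative refinement: x <- x + solve(A, b - A x), repeated."""
    n = len(a)
    x = lu_solve(lu, perm, b)
    for _ in range(iterations):
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        d = lu_solve(lu, perm, r)           # each pass reuses the factors
        x = [x[i] + d[i] for i in range(n)]
    res = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x, res


a = [[1 + 1j, 2 + 3j, 3 - 3j],
     [2 + 1j, 5 + 3j, 7 - 5j],
     [-2 + 1j, -4 + 4j, 5 + 3j]]
x_true = [1 - 1j, 2 + 4j, 3 + 0j]
b = [sum(a[i][j] * x_true[j] for j in range(3)) for i in range(3)]
lu, perm = lu_factor(a)
x, res = refine(a, lu, perm, b)
print(x)
```

Because the factorization is computed once and reused for every correction, each refinement pass costs only a matrix-vector product and two triangular solves.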
Comments
Informational error
Type   Code
3      2      The input matrix is too ill-conditioned for iterative refinement to be
              effective.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving
the system each of the first two times by adding 0.5 + 0.5i to the second element.
USE LFICG_INT
USE LFCCG_INT
USE WRCRN_INT
USE UMACH_INT
!                                 Declare variables
PARAMETER  (LDA=3, LDFACT=3, N=3)
INTEGER    IPVT(N), NOUT
REAL       RCOND
COMPLEX    A(LDA,LDA), B(N), X(N), FACT(LDFACT,LDFACT), RES(N)
!                                 Declare functions
COMPLEX    CMPLX
!                                 Set values for A
!
!                                 A = (  1.0+1.0i   2.0+3.0i   3.0-3.0i)
!                                     (  2.0+1.0i   5.0+3.0i   7.0-5.0i)
!                                     ( -2.0+1.0i  -4.0+4.0i   5.0+3.0i)
!
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
END
Output
                                                      3
                                        ( 3.000, 0.000)
                          X
                  1                 2                 3
    ( 0.910,-1.061)   ( 1.986, 4.175)   ( 3.123, 0.071)
                          X
                  1                 2                 3
    ( 0.821,-1.123)   ( 1.972, 4.349)   ( 3.245, 0.142)
ScaLAPACK Example
The same set of linear systems is solved successively as a distributed example. The right-hand-side vector is perturbed after solving the system each of the first two times by adding 0.5 + 0.5i to
the second element. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see
Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors
for the local arrays.
USE MPI_SETUP_INT
USE LFICG_INT
USE LFCCG_INT
USE WRCRN_INT
USE UMACH_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    J, LDA, N, DESCA(9), DESCL(9)
INTEGER    INFO, MXCOL, MXLDA, NOUT
INTEGER, ALLOCATABLE ::  IPVT0(:)
COMPLEX, ALLOCATABLE ::  A(:,:), B(:), X(:), X0(:), RES(:)
COMPLEX, ALLOCATABLE ::  A0(:,:), FACT0(:,:), B0(:), RES0(:)
REAL       RCOND
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), B(N), X(N), RES(N))
!                                 Set values for A and B
   A(1,:) = (/ ( 1.0, 1.0), ( 2.0, 3.0), ( 3.0, 3.0)/)
   A(2,:) = (/ ( 2.0, 1.0), ( 5.0, 3.0), ( 7.0, 4.0)/)
   A(3,:) = (/ (-2.0, 1.0), (-4.0, 4.0), (-5.0, 3.0)/)
!
B
!
!
!
!
= (/ (3.0,
ENDIF
4.0)/)
!
!
!
!
!
!
!
!
Output
RCOND < 0.025
L1 Condition number < 75.0
                          X
                  1                 2                 3
    ( 1.000,-1.000)   ( 2.000, 4.000)   ( 3.000, 0.000)
                          X
                  1                 2                 3
    ( 0.910,-1.061)   ( 1.986, 4.175)   ( 3.123, 0.071)
                          X
                  1                 2                 3
    ( 0.821,-1.123)   ( 1.972, 4.349)   ( 3.245, 0.142)
LFDCG
Computes the determinant of a complex general matrix given the LU factorization of the matrix.
Required Arguments
FACT Complex N by N matrix containing the LU factorization of the coefficient matrix A
as output from routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization of A
as output from routine LFCCG/DLFCCG or LFTCG/DLFTCG. (Input)
DET1 Complex scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDCG computes the determinant of a complex general coefficient matrix. To compute the
determinant the coefficient matrix must first undergo an LU factorization. This may be done by
calling either LFCCG or LFTCG. The formula det A = det L det U is used to compute the
determinant. Since the determinant of a triangular matrix is the product of the diagonal elements,
$$\det U = \prod_{i=1}^{N} U_{ii}$$
(The matrix U is stored in the upper triangle of FACT.) Since L is the product of triangular matrices
with unit diagonals and of permutation matrices, $\det L = (-1)^k$ where k is the number of pivoting
interchanges.
LFDCG is based on the LINPACK routine CGEDI; see Dongarra et al. (1979).
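The determinant formula above, together with the mantissa/exponent form used for DET1 and DET2, can be sketched as follows. This is an illustrative pure-Python sketch rather than the CGEDI code: the name `det_from_lu` is hypothetical, the elimination is performed inline instead of reading FACT and IPVT, and the normalization uses the complex modulus (an assumption; the library may use a different magnitude measure). The matrix is the one shown in the Example for this routine.

```python
def det_from_lu(a):
    """Return (det1, det2) with det(A) = det1 * 10**det2.

    det1 is normalized so that 1.0 <= |det1| < 10.0, or is 0.0 when
    the matrix is singular.
    """
    n = len(a)
    u = [list(row) for row in a]
    swaps = 0
    for k in range(n):
        # Partial pivoting; each row interchange flips the sign of det L.
        p = max(range(k, n), key=lambda i: abs(u[i][k]))
        if u[p][k] == 0:
            return 0.0, 0
        if p != k:
            u[k], u[p] = u[p], u[k]
            swaps += 1
        for i in range(k + 1, n):
            m = u[i][k] / u[k][k]
            for j in range(k, n):
                u[i][j] -= m * u[k][j]
    det1 = complex((-1) ** swaps)    # det L = (-1)**k
    det2 = 0
    for i in range(n):
        det1 *= u[i][i]              # det U = product of the diagonal
        # Rescale after every factor so the product cannot overflow.
        while abs(det1) >= 10.0:
            det1 /= 10.0
            det2 += 1
        while abs(det1) < 1.0:
            det1 *= 10.0
            det2 -= 1
    return det1, det2


a = [[3 - 2j, 2 + 4j, 0 - 3j],
     [1 + 1j, 2 - 6j, 1 + 2j],
     [4 + 0j, -5 + 1j, 3 - 2j]]
det1, det2 = det_from_lu(a)
print(det1, det2)
```

Keeping the exponent separate is what lets the determinant of a large matrix be reported even when the full product would overflow or underflow the floating-point range.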
Example
The determinant is computed for a complex general 3 × 3 matrix.
USE LFDCG_INT
USE LFTCG_INT
USE UMACH_INT
!                                 Declare variables
PARAMETER  (LDA=3, LDFACT=3, N=3)
INTEGER    IPVT(N), NOUT
REAL       DET2
COMPLEX    A(LDA,LDA), FACT(LDFACT,LDFACT), DET1
!                                 Set values for A
!
!                                 A = (  3.0-2.0i   2.0+4.0i   0.0-3.0i)
!                                     (  1.0+1.0i   2.0-6.0i   1.0+2.0i)
!                                     (  4.0+0.0i  -5.0+1.0i   3.0-2.0i)
!
!                                 Factor A
CALL LFTCG (A, FACT, IPVT)
!
!
!
!
99999 FORMAT (' The determinant of A is',3X,'(',F6.3,',',F6.3,&
      ') * 10**',F2.0)
END
Output
The determinant of A is ( 0.700, 1.100) * 10**1.
LINCG
Computes the inverse of a complex general matrix.
Required Arguments
A Complex N by N matrix containing the matrix to be inverted. (Input)
AINV Complex N by N matrix containing the inverse of A. (Output)
If A is not needed, A and AINV can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDAINV Leading dimension of AINV exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDAINV = size (AINV,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LINCG computes the inverse of a complex general matrix. The underlying code is based
on either LINPACK , LAPACK, or ScaLAPACK code depending upon which supporting libraries
are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK in the Introduction section of this manual.
LINCG first uses the routine LFCCG to compute an LU factorization of the coefficient matrix and to
estimate the condition number of the matrix. LFCCG computes U and the information needed to
compute $L^{-1}$. LINCT is then used to compute $U^{-1}$. Finally, $A^{-1}$ is computed using $A^{-1} = U^{-1}L^{-1}$.
LINCG fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the
iterative refinement algorithm fails to converge. This error occurs only if A is singular or very
close to a singular matrix.
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in $A^{-1}$.
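As a brief check on the final step of the inversion, assume the factorization is written as A = LU with the row interchanges absorbed into L (the convention the LFDCG description uses). The inverse of a product reverses the order of the factors:

```latex
A = LU
\;\Longrightarrow\;
A^{-1} = (LU)^{-1} = U^{-1} L^{-1},
\qquad\text{since}\qquad
(LU)\,(U^{-1}L^{-1}) = L\,(U U^{-1})\,L^{-1} = L L^{-1} = I .
```

This is why only a triangular inverse of U and the information needed to apply the inverse of L are required to form the inverse of A.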
Comments
1.
2.
Informational errors
Type
Code
3
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a
call to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities)
has been made. See the ScaLAPACK Example below.
Example
The inverse is computed for a complex general 3 × 3 matrix.
USE LINCG_INT
USE WRCRN_INT
USE CSSCAL_INT
!                                 Declare variables
PARAMETER  (LDA=3, LDAINV=3, N=3)
REAL       THIRD
COMPLEX    A(LDA,LDA), AINV(LDAINV,LDAINV)
!                                 Set values for A
!
!                                 A = (  1.0+1.0i   2.0+3.0i   3.0+3.0i)
!                                     (  2.0+1.0i   5.0+3.0i   7.0+4.0i)
!                                     ( -2.0+1.0i  -4.0+4.0i  -5.0+3.0i)
!
Output
                          AINV
                  1                 2                 3
1   ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
2   (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
3   (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities) used to map and unmap
arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LINCG_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    J, LDA, N, DESCA(9)
INTEGER    INFO, MXCOL, MXLDA, NPROW, NPCOL
COMPLEX, ALLOCATABLE ::  A(:,:), AINV(:,:)
COMPLEX, ALLOCATABLE ::  A0(:,:), AINV0(:,:)
REAL       THIRD
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
   A(1,:) = (/ ( 1.0, 1.0), ( 2.0, 3.0), ( 3.0, 3.0)/)
   A(2,:) = (/ ( 2.0, 1.0), ( 5.0, 3.0), ( 7.0, 4.0)/)
   A(3,:) = (/ (-2.0, 1.0), (-4.0, 4.0), (-5.0, 3.0)/)
!                                 Scale A by dividing by three
   THIRD = 1.0/3.0
   A = A * THIRD
ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
ALLOCATE(A0(MXLDA,MXCOL), AINV0(MXLDA,MXCOL))
!                                 Map input array to the processor grid
CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Compute the inverse
CALL LINCG (A0, AINV0)
!                                 Unmap the results from the distributed
!                                 arrays back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
CALL SCALAPACK_UNMAP(AINV0, DESCA, AINV)
!                                 Print results.
!                                 Only Rank=0 has the inverse, AINV.
IF(MP_RANK.EQ.0) CALL WRCRN ('AINV', AINV)
IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
DEALLOCATE(A0, AINV0)
!                                 Exit ScaLAPACK usage
CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
MP_NPROCS = MP_SETUP('FINAL')
END
Output
                          AINV
                  1                 2                 3
1   ( 6.400,-2.800)   (-3.800, 2.600)   (-2.600, 1.200)
2   (-1.600,-1.800)   ( 0.200, 0.600)   ( 0.400,-0.800)
3   (-0.600, 2.200)   ( 1.200,-1.400)   ( 0.400, 0.200)
LSLRT
Solves a real triangular system of linear equations.
Required Arguments
A N by N matrix containing the coefficient matrix for the triangular linear system. (Input)
For a lower triangular system, only the lower triangular part and diagonal of A are
referenced. For an upper triangular system, only the upper triangular part and diagonal
of A are referenced.
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means solve AX = B, A lower triangular.
IPATH = 2 means solve AX = B, A upper triangular.
IPATH = 3 means solve $A^T X = B$, A lower triangular.
IPATH = 4 means solve $A^T X = B$, A upper triangular.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSLRT solves a system of linear algebraic equations with a real triangular coefficient
matrix. LSLRT fails if the matrix A has a zero diagonal element, in which case A is singular. The
underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon
which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
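The four IPATH cases amount to forward or back substitution on either A or its transpose. The following is a minimal pure-Python sketch under that reading; the function name `solve_triangular` is hypothetical and the code is not the library's implementation. It references only the triangle that the chosen IPATH says is meaningful and raises an error on a zero diagonal element, mirroring the failure condition described above.

```python
def solve_triangular(a, b, ipath=1):
    """Solve A x = b (ipath 1, 2) or A^T x = b (ipath 3, 4).

    ipath 1 and 3 treat A as lower triangular, 2 and 4 as upper.
    """
    n = len(b)
    lower = ipath in (1, 3)
    transpose = ipath in (3, 4)

    def t(i, j):
        # Element (i, j) of the effective matrix, A or A^T.
        return a[j][i] if transpose else a[i][j]

    # The transpose of a lower triangular matrix is upper triangular,
    # so forward substitution applies when exactly one of the two holds.
    forward = lower != transpose
    order = range(n) if forward else range(n - 1, -1, -1)
    x = list(b)
    for i in order:
        cols = range(i) if forward else range(i + 1, n)
        s = sum(t(i, j) * x[j] for j in cols)
        d = t(i, i)
        if d == 0:
            raise ZeroDivisionError("zero diagonal element: A is singular")
        x[i] = (x[i] - s) / d
    return x


a = [[2.0, 0.0, 0.0],
     [2.0, -1.0, 0.0],
     [-4.0, 2.0, 5.0]]
x1 = solve_triangular(a, [2.0, 5.0, 0.0], ipath=1)
x3 = solve_triangular(a, [-12.0, 7.0, 10.0], ipath=3)
print(x1, x3)
```

The first call uses the matrix and right-hand side from the Example for this routine; the second right-hand side was chosen for illustration so that the transposed system has the same solution.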
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of three linear equations is solved. The coefficient matrix has lower triangular form and
the right-hand-side vector, b, has three elements.
USE LSLRT_INT
USE WRRRN_INT
!                                 Declare variables
PARAMETER  (LDA=3)
REAL       A(LDA,LDA), B(LDA), X(LDA)
!                                 Set values for A and B
!
!                                 A = (  2.0               )
!                                     (  2.0   -1.0        )
!                                     ( -4.0    2.0    5.0 )
!
!                                 B = (  2.0    5.0    0.0 )
!
DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
DATA B/2.0, 5.0, 0.0/
!
!                                 Solve AX = B   (IPATH = 1)
CALL LSLRT (A, B, X)
!                                 Print results
CALL WRRRN ('X', X, 1, 3, 1)
END
Output
            X
      1        2        3
  1.000   -3.000    2.000
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The
coefficient matrix has lower triangular form and the right-hand-side vector b has three elements.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities)
used to map and unmap arrays to and from the processor grid. They are used here for brevity.
DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LSLRT_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    LDA, N, DESCA(9), DESCX(9)
INTEGER    INFO, MXCOL, MXLDA
REAL, ALLOCATABLE ::  A(:,:), B(:), X(:)
REAL, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
   A(1,:) = (/ 2.0, 0.0, 0.0/)
   A(2,:) = (/ 2.0, -1.0, 0.0/)
   A(3,:) = (/-4.0, 2.0, 5.0/)
!
   B = (/ 2.0, 5.0, 0.0/)
ENDIF
Set up a 1D processor grid and define
!
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
            X
      1        2        3
  1.000   -3.000    2.000
LFCRT
Estimates the condition number of a real triangular matrix.
Required Arguments
A N by N matrix containing the coefficient matrix for the triangular linear system. (Input)
For a lower triangular system, only the lower triangular part and diagonal of A are
referenced. For an upper triangular system, only the upper triangular part and diagonal
of A are referenced.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means A is lower triangular.
IPATH = 2 means A is upper triangular.
Default: IPATH =1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFCRT estimates the condition number of a real triangular matrix. The L1 condition
number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to compute $\|A^{-1}\|_1$,
the condition number is only estimated. The estimation algorithm is the same as used by
LINPACK and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x.
The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
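For a small matrix the definition of the L1 condition number can be evaluated exactly, which gives a reference value for the estimate. The sketch below is plain Python and deliberately does what the routine avoids: it forms the inverse explicitly. The helper names `norm1` and `inverse_lower` are hypothetical, and this is not the Cline et al. estimation algorithm. The matrix is the lower triangular one used in the Example for this routine.

```python
def norm1(m):
    """Matrix 1-norm: the largest absolute column sum."""
    n = len(m)
    return max(sum(abs(m[i][j]) for i in range(n)) for j in range(n))


def inverse_lower(a):
    """Invert a lower triangular matrix column by column."""
    n = len(a)
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j, n):
            rhs = 1.0 if i == j else 0.0
            # Forward substitution on column j of the identity.
            s = sum(a[i][k] * inv[k][j] for k in range(j, i))
            inv[i][j] = (rhs - s) / a[i][i]
    return inv


a = [[2.0, 0.0, 0.0],
     [2.0, -1.0, 0.0],
     [-4.0, 2.0, 5.0]]
kappa = norm1(a) * norm1(inverse_lower(a))
rcond = 1.0 / kappa
print(kappa, rcond)
```

The exact value computed this way is consistent with the bounds printed in the Output of the Example (RCOND < 0.1, condition number < 15.0); an estimator can return a slightly different number, which is why the manual reports bounds rather than digits.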
Comments
1.
2.
Informational error
Type
Code
3
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
An estimate of the reciprocal condition number is computed for a 3 × 3 lower triangular
coefficient matrix.
USE LFCRT_INT
USE UMACH_INT
!                                 Declare variables
PARAMETER  (LDA=3)
REAL       A(LDA,LDA), RCOND
INTEGER    NOUT
!                                 Set values for A and B
!
!                                 A = (  2.0               )
!                                     (  2.0   -1.0        )
!                                     ( -4.0    2.0    5.0 )
!
DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
!
!                                 Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
END
Output
RCOND < 0.1
L1 Condition number < 15.0
ScaLAPACK Example
The same lower triangular matrix as in the example above is used in this distributed computing
example. An estimate of the reciprocal condition number is computed for the 3 × 3 lower
triangular coefficient matrix. SCALAPACK_MAP is an IMSL utility routine (see Chapter 11,
Utilities) used to map an array to the processor grid. It is used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LFCRT_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    LDA, N, NOUT, DESCA(9)
INTEGER    INFO, MXCOL, MXLDA
REAL       RCOND
REAL, ALLOCATABLE ::  A(:,:)
REAL, ALLOCATABLE ::  A0(:,:)
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N))
!                                 Set values for A
   A(1,:) = (/ 2.0, 0.0, 0.0/)
   A(2,:) = (/ 2.0, -1.0, 0.0/)
   A(3,:) = (/-4.0, 2.0, 5.0/)
ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
Output
RCOND < 0.1
L1 Condition number < 15.0
LFDRT
Computes the determinant of a real triangular matrix.
Required Arguments
A N by N matrix containing the triangular matrix. (Input)
The matrix can be either upper or lower triangular.
DET1 Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDRT computes the determinant of a real triangular coefficient matrix. The determinant
of a triangular matrix is the product of the diagonal elements:
$$\det A = \prod_{i=1}^{N} A_{ii}$$
LFDRT is based on the LINPACK routine STRDI; see Dongarra et al. (1979).
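As a worked instance of the product formula and of the DET1/DET2 form, take the lower triangular matrix used in the Example for this routine, whose diagonal elements are 2.0, -1.0, and 5.0:

```latex
\det A = \prod_{i=1}^{3} A_{ii} = (2.0)(-1.0)(5.0) = -10.0 = -1.0 \times 10^{1},
\qquad\text{so}\qquad
\mathrm{DET1} = -1.0,\quad \mathrm{DET2} = 1 .
```

The mantissa satisfies 1.0 ≤ |DET1| < 10.0 as required, and the values agree with the Output shown for the Example.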
Comments
Informational error
Type   Code
3      1      The input triangular matrix is singular.
Example
The determinant is computed for a 3 × 3 lower triangular matrix.
USE LFDRT_INT
USE UMACH_INT
!                                 Declare variables
PARAMETER  (LDA=3)
REAL       A(LDA,LDA), DET1, DET2
INTEGER    NOUT
!                                 Set values for A
!
!                                 A = (  2.0               )
!                                     (  2.0   -1.0        )
!                                     ( -4.0    2.0    5.0 )
!
DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
!
!                                 Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) DET1, DET2
99999 FORMAT (' The determinant of A is ', F6.3, ' * 10**', F2.0)
END
Output
The determinant of A is -1.000 * 10**1.
LINRT
Computes the inverse of a real triangular matrix.
Required Arguments
A N by N matrix containing the triangular matrix to be inverted. (Input)
For a lower triangular matrix, only the lower triangular part and diagonal of A are
referenced. For an upper triangular matrix, only the upper triangular part and diagonal
of A are referenced.
AINV N by N matrix containing the inverse of A. (Output)
If A is lower triangular, AINV is also lower triangular. If A is upper triangular, AINV is
also upper triangular. If A is not needed, A and AINV can share the same storage
locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means A is lower triangular.
IPATH = 2 means A is upper triangular.
Default: IPATH = 1.
LDAINV Leading dimension of AINV exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDAINV = size (AINV,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LINRT computes the inverse of a real triangular matrix. It fails if A has a zero diagonal
element.
Example
The inverse is computed for a 3 × 3 lower triangular matrix.
USE LINRT_INT
USE WRRRN_INT
!                                 Declare variables
PARAMETER  (LDA=3)
REAL       A(LDA,LDA), AINV(LDA,LDA)
!                                 Set values for A
!
!                                 A = (  2.0               )
!                                     (  2.0   -1.0        )
!                                     ( -4.0    2.0    5.0 )
!
DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
!
!                                 Print results
CALL WRRRN ('AINV', AINV)
END
Output
            AINV
        1        2        3
1   0.500    0.000    0.000
2   1.000   -1.000    0.000
3   0.000    0.400    0.200
LSLCT
Solves a complex triangular system of linear equations.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the triangular linear system.
(Input)
For a lower triangular system, only the lower triangle of A is referenced. For an upper
triangular system, only the upper triangle of A is referenced.
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means solve AX = B, A lower triangular
IPATH = 2 means solve AX = B, A upper triangular
IPATH = 3 means solve $A^H X = B$, A lower triangular
IPATH = 4 means solve $A^H X = B$, A upper triangular
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSLCT solves a system of linear algebraic equations with a complex triangular coefficient
matrix. LSLCT fails if the matrix A has a zero diagonal element, in which case A is singular. The
underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending upon
which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
Informational error
Type   Code
4      1      The input triangular matrix is singular. Some of its diagonal elements are
              near zero.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call to
SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of three linear equations is solved. The coefficient matrix has lower triangular form and
the right-hand-side vector, b, has three elements.
USE LSLCT_INT
USE WRCRN_INT
!                                 Declare variables
INTEGER    LDA
PARAMETER  (LDA=3)
COMPLEX    A(LDA,LDA), B(LDA), X(LDA)
!                                 Set values for A and B
!
!                                 A = ( -3.0+2.0i                        )
!                                     ( -2.0-1.0i   0.0+6.0i             )
!                                     ( -1.0+3.0i   1.0-5.0i  -4.0+0.0i  )
!
!
!
!                                 Solve AX = B
CALL LSLCT (A, B, X)
!                                 Print results
CALL WRCRN ('X', X, 1, 3, 1)
END
Output
                          X
                  1                 2                 3
    ( 3.000, 2.000)   ( 1.000, 1.000)   ( 2.000, 0.000)
ScaLAPACK Example
The same lower triangular matrix as in the example above is used in this distributed computing
example. The system of three linear equations is solved. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities) used to map and unmap
arrays to and from the processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK
tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LSLCT_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    LDA, N, DESCA(9), DESCX(9)
INTEGER    INFO, MXCOL, MXLDA
COMPLEX, ALLOCATABLE ::  A(:,:), B(:), X(:)
COMPLEX, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A
   A(1,:) = (/ (-3.0, 2.0), (0.0, 0.0), ( 0.0, 0.0)/)
   A(2,:) = (/ (-2.0, -1.0), (0.0, 6.0), ( 0.0, 0.0)/)
   A(3,:) = (/ (-1.0, 3.0), (1.0, -5.0), (-4.0, 0.0)/)
!
B
ENDIF
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
                  1                 2                 3
    ( 3.000, 2.000)   ( 1.000, 1.000)   ( 2.000, 0.000)
LFCCT
Estimates the condition number of a complex triangular matrix.
Required Arguments
A Complex N by N matrix containing the triangular matrix. (Input)
For a lower triangular system, only the lower triangle of A is referenced. For an upper
triangular system, only the upper triangle of A is referenced.
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means A is lower triangular.
IPATH = 2 means A is upper triangular.
Default: IPATH =1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFCCT estimates the condition number of a complex triangular matrix. The L1 condition
number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to compute $\|A^{-1}\|_1$,
the condition number is only estimated. The estimation algorithm is the same as used by
LINPACK and is described by Cline et al. (1979). If the estimated condition number is greater
than 1/ε (where ε is machine precision), a warning error is issued. This indicates that very small
changes in A can cause very large changes in the solution x. The underlying code is based on
either LINPACK , LAPACK, or ScaLAPACK code depending upon which supporting libraries
are used during linking. For a detailed explanation see Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2.
Informational error
Type
Code
3
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
An estimate of the reciprocal condition number is computed for a 3 × 3 lower triangular
coefficient matrix.
USE LFCCT_INT
USE UMACH_INT
!                                 Declare variables
INTEGER    LDA, N
PARAMETER  (LDA=3)
INTEGER    NOUT
REAL       RCOND
COMPLEX    A(LDA,LDA)
!
!
!
!
!
!
)
0.0+6.0i
)
1.0-5.0i -4.0+0.0i )
!                                 Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
END
Output
RCOND < 0.2
L1 Condition number < 10.0
ScaLAPACK Example
The same lower triangular matrix as in the example above is used in this distributed computing
example. An estimate of the reciprocal condition number is computed for a 3 × 3 lower triangular
coefficient matrix. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see
Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors
for the local arrays.
USE MPI_SETUP_INT
USE LFCCT_INT
USE UMACH_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    LDA, N, NOUT, DESCA(9)
INTEGER    INFO, MXCOL, MXLDA
REAL       RCOND
COMPLEX, ALLOCATABLE ::  A(:,:)
COMPLEX, ALLOCATABLE ::  A0(:,:)
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N))
!
Output
RCOND < 0.2
L1 Condition number < 10.0
LFDCT
Computes the determinant of a complex triangular matrix.
Required Arguments
A Complex N by N matrix containing the triangular matrix. (Input)
DET1 Complex scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDCT computes the determinant of a complex triangular coefficient matrix. The
determinant of a triangular matrix is the product of the diagonal elements:
$$\det A = \prod_{i=1}^{N} A_{ii}$$
LFDCT is based on the LINPACK routine CTRDI; see Dongarra et al. (1979).
Comments
Informational error
Type
Code
3
Example
The determinant is computed for a 3 × 3 complex lower triangular matrix.
USE LFDCT_INT
USE UMACH_INT
!                                 Declare variables
INTEGER    LDA, N
PARAMETER  (LDA=3, N=3)
INTEGER    NOUT
REAL       DET2
COMPLEX    A(LDA,LDA), DET1
!                                 Set values for A
!
!                                 A = ( -3.0+2.0i                        )
!                                     ( -2.0-1.0i   0.0+6.0i             )
!                                     ( -1.0+3.0i   1.0-5.0i  -4.0+0.0i  )
!
!                                 Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) DET1, DET2
99999 FORMAT (' The determinant of A is (',F4.1,',',F4.1,') * 10**',&
      F2.0)
END
Output
The determinant of A is ( 0.5, 0.7) * 10**2.
LINCT
Computes the inverse of a complex triangular matrix.
Required Arguments
A Complex N by N matrix containing the triangular matrix to be inverted. (Input)
For a lower triangular matrix, only the lower triangle of A is referenced. For an upper
triangular matrix, only the upper triangle of A is referenced.
AINV Complex N by N matrix containing the inverse of A. (Output)
If A is lower triangular, AINV is also lower triangular. If A is upper triangular, AINV is
also upper triangular. If A is not needed, A and AINV can share the same storage
locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means A is lower triangular.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LINCT computes the inverse of a complex triangular matrix. It fails if A has a zero
diagonal element.
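For a triangular matrix, the diagonal of the inverse consists of the reciprocals of the diagonal of A, which is why a zero diagonal element makes the inversion fail. The following sketch applies this standard identity to the diagonal of the matrix used in the example for this routine. It is a hand calculation, not a description of the internal algorithm.

```latex
\begin{aligned}
(A^{-1})_{ii} &= \frac{1}{A_{ii}}, \qquad i = 1,\dots,N \\
\frac{1}{-3+2i} &= \frac{-3-2i}{13} \approx -0.2308-0.1538i \\
\frac{1}{0+6i} &= -\frac{i}{6} \approx 0.0000-0.1667i \\
\frac{1}{-4+0i} &= -0.2500
\end{aligned}
```

These values match the diagonal entries of AINV printed in the example output.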
Comments
Informational error
Type  Code
4     1     The input triangular matrix is singular. Some of its diagonal elements are
            close to zero.
Example
The inverse is computed for a 3 × 3 lower triangular matrix.
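USE LINCT_INT
USE WRCRN_INT
!                                 Declare variables
INTEGER    LDA
PARAMETER  (LDA=3)
COMPLEX    A(LDA,LDA), AINV(LDA,LDA)
!                                 Set values for A
!
!                                 A = ( -3.0+2.0i                      )
!                                     ( -2.0-1.0i   0.0+6.0i           )
!                                     ( -1.0+3.0i   1.0-5.0i -4.0+0.0i )
!
DATA A/(-3.0,2.0), (-2.0,-1.0), (-1.0,3.0), (0.0,0.0), (0.0,6.0),&
       (1.0,-5.0), (0.0,0.0), (0.0,0.0), (-4.0,0.0)/
!                                 Compute the inverse of A
CALL LINCT (A, AINV)
!                                 Print results
CALL WRCRN ('AINV', AINV)
END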
Output
                          AINV
                   1                   2                   3
1  (-0.2308,-0.1538)   ( 0.0000, 0.0000)   ( 0.0000, 0.0000)
2  (-0.0897, 0.0513)   ( 0.0000,-0.1667)   ( 0.0000, 0.0000)
3  ( 0.2147,-0.0096)   (-0.2083,-0.0417)   (-0.2500, 0.0000)
LSADS
CAPABLE
Solves a real symmetric positive definite system of linear equations with iterative refinement.
Required Arguments
A N by N matrix containing the coefficient matrix of the symmetric positive definite linear
system. (Input)
Only the upper triangle of A is referenced.
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSADS solves a system of linear algebraic equations having a real symmetric positive
definite coefficient matrix. The underlying code is based on either LINPACK, LAPACK, or
ScaLAPACK code depending upon which supporting libraries are used during linking. For a
detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the
Introduction section of this manual. LSADS first uses the routine LFCDS to compute an $R^T R$
Cholesky factorization of the coefficient matrix and to estimate the condition number of the
matrix. The matrix R is upper triangular. The solution of the linear system is then found using the
iterative refinement routine LFIDS. LSADS fails if any submatrix of R is not positive definite, if R
has a zero diagonal element, or if the iterative refinement algorithm fails to converge. These errors
occur only if A is very close to either a singular matrix or a matrix that is not positive definite. If
the estimated condition number is greater than 1/ε (where ε is machine precision), a warning error
is issued. This indicates that very small changes in A can cause very large changes in the solution
x. Iterative refinement can sometimes find the solution to such a system. LSADS solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
Comments
1.
2.
Informational errors
Type
Code
3.
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2ADS the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSADS.
Additional memory allocation for FACT and option value restoration are done
automatically in LSADS. Users directly calling L2ADS can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSADS or L2ADS. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSADS temporarily replaces IVAL(2) by IVAL(1). The
routine L2CDS computes the condition number if IVAL(2) = 2. Otherwise L2CDS
skips this computation. LSADS restores the option. Default values for the option
are IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of three linear equations is solved. The coefficient matrix has real positive definite form
and the right-hand-side vector b has three elements.
USE LSADS_INT
USE WRRRN_INT
!                                 Declare variables
INTEGER    LDA, N
PARAMETER  (LDA=3, N=3)
REAL       A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
!                                 B = ( 27.0 -78.0  64.0)
!
DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
DATA B/27.0, -78.0, 64.0/
!
CALL LSADS (A, B, X)
!                                 Print results
CALL WRRRN ('X', X, 1, N, 1)
!
END
Output
          X
     1        2        3
 1.000   -4.000    7.000
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The
coefficient matrix has real positive definite form and the right-hand-side vector b has three
elements. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11,
Utilities) used to map and unmap arrays to and from the processor grid. They are used here for
brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local
arrays.
USE MPI_SETUP_INT
USE LSADS_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    LDA, N, DESCA(9), DESCX(9)
INTEGER    INFO, MXCOL, MXLDA
REAL, ALLOCATABLE ::  A(:,:), B(:), X(:)
REAL, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
   A(1,:) = (/ 1.0, -3.0, 2.0/)
   A(2,:) = (/ -3.0, 10.0, -5.0/)
   A(3,:) = (/ 2.0, -5.0, 6.0/)
!
   B = (/27.0, -78.0, 64.0/)
ENDIF
!
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
          X
     1        2        3
 1.000   -4.000    7.000
LSLDS
CAPABLE
Solves a real symmetric positive definite system of linear equations without iterative refinement.
Required Arguments
A N by N matrix containing the coefficient matrix of the symmetric positive definite linear
system. (Input)
Only the upper triangle of A is referenced.
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSLDS solves a system of linear algebraic equations having a real symmetric positive
definite coefficient matrix. The underlying code is based on either LINPACK, LAPACK, or
ScaLAPACK code depending upon which supporting libraries are used during linking. For a
detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the
Introduction section of this manual. LSLDS first uses the routine LFCDS to compute an $R^T R$
Cholesky factorization of the coefficient matrix and to estimate the condition number of the
matrix. The matrix R is upper triangular. The solution of the linear system is then found using the
routine LFSDS. LSLDS fails if any submatrix of R is not positive definite or if R has a zero
diagonal element. These errors occur only if A is very close to either a singular matrix or a
matrix that is not positive definite. If the estimated condition number is greater than 1/ε (where
ε is machine precision), a warning error is issued. This indicates that very small changes in A can
cause very large changes in the solution x. If the coefficient matrix is ill-conditioned, it is
recommended that LSADS be used.
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2LDS the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLDS.
Additional memory allocation for FACT and option value restoration are done
automatically in LSLDS. Users directly calling L2LDS can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSLDS or L2LDS. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSLDS temporarily replaces IVAL(2) by IVAL(1). The
routine L2CDS computes the condition number if IVAL(2) = 2. Otherwise L2CDS
skips this computation. LSLDS restores the option. Default values for the option
are IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of three linear equations is solved. The coefficient matrix has real positive definite form
and the right-hand-side vector b has three elements.
USE LSLDS_INT
USE WRRRN_INT
!                                 Declare variables
INTEGER    LDA, N
PARAMETER  (LDA=3, N=3)
REAL       A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
!                                 B = ( 27.0 -78.0  64.0)
!
DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
DATA B/27.0, -78.0, 64.0/
!
CALL LSLDS (A, B, X)
!                                 Print results
CALL WRRRN ('X', X, 1, N, 1)
!
END
Output
          X
     1        2        3
 1.000   -4.000    7.000
ScaLAPACK Example
The same system of three linear equations is solved as a distributed computing example. The
coefficient matrix has real positive definite form and the right-hand-side vector b has three
elements. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11,
Utilities) used to map and unmap arrays to and from the processor grid. They are used here for
brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local
arrays.
USE MPI_SETUP_INT
USE LSLDS_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    LDA, N, DESCA(9), DESCX(9)
INTEGER    INFO, MXCOL, MXLDA
REAL, ALLOCATABLE ::  A(:,:), B(:), X(:)
REAL, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
   A(1,:) = (/ 1.0, -3.0, 2.0/)
   A(2,:) = (/ -3.0, 10.0, -5.0/)
   A(3,:) = (/ 2.0, -5.0, 6.0/)
!
   B = (/27.0, -78.0, 64.0/)
ENDIF
!
!
!
!
!
!
!
!
!
!
!
!                                 array.
CALL SCALAPACK_UNMAP(X0, DESCX, X)
!                                 Print results.
!                                 Only Rank=0 has the solution, X.
IF(MP_RANK .EQ. 0)CALL WRRRN ('X', X, 1, N, 1)
!                                 Exit ScaLAPACK usage
CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
MP_NPROCS = MP_SETUP('FINAL')
END
Output
          X
     1        2        3
 1.000   -4.000    7.000
LFCDS
CAPABLE
Computes the $R^T R$ Cholesky factorization of a real symmetric positive definite matrix and estimates
its L1 condition number.
Required Arguments
A N by N symmetric positive definite matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT N by N matrix containing the upper triangular matrix R of the factorization of A in
the upper triangular part. (Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share
the same storage locations.
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFCDS computes an $R^T R$ Cholesky factorization and estimates the condition number of a
real symmetric positive definite coefficient matrix. The matrix R is upper triangular.
The L1 condition number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to
compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm is the same as
used by LINPACK and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
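For the 3 × 3 matrix used in the example for this routine, the exact L1 condition number can be computed by hand from the inverse printed in the example output. This is an illustrative check, not the estimation algorithm itself; it uses the fact that the L1 norm of a matrix is its largest absolute column sum.

```latex
\begin{aligned}
A &= \begin{pmatrix} 1 & -3 & 2 \\ -3 & 10 & -5 \\ 2 & -5 & 6 \end{pmatrix}, &
A^{-1} &= \begin{pmatrix} 35 & 8 & -5 \\ 8 & 2 & -1 \\ -5 & -1 & 1 \end{pmatrix} \\
\|A\|_1 &= \max(6,\,18,\,13) = 18, &
\|A^{-1}\|_1 &= \max(48,\,11,\,7) = 48 \\
\kappa(A) &= 18 \times 48 = 864
\end{aligned}
```

The exact value 864 is consistent with the bound L1 Condition number < 875.0 reported in the example output.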
LFCDS fails if any submatrix of R is not positive definite or if R has a zero diagonal element.
These errors occur only if A is very close to a singular matrix or to a matrix which is not positive
definite.
The $R^T R$ factors are returned in a form that is compatible with routines LFIDS, LFSDS and LFDDS.
To solve systems of equations with multiple right-hand-side vectors, use LFCDS followed by either
LFIDS or LFSDS called once for each right-hand side. The routine LFDDS can be called to compute
the determinant of the coefficient matrix after LFCDS has performed the factorization.
Comments
1.
2.
Informational errors
Type
Code
3
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
The inverse of a 3 × 3 matrix is computed. LFCDS is called to factor the matrix and to check for
nonpositive definiteness or ill-conditioning. LFIDS is called to determine the columns of the
inverse.
USE LFCDS_INT
USE UMACH_INT
USE WRRRN_INT
USE LFIDS_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N, NOUT
PARAMETER  (LDA=3, LDFACT=3, N=3)
REAL       A(LDA,LDA), AINV(LDA,LDA), RCOND, FACT(LDFACT,LDFACT),&
           RES(N), RJ(N)
!
!                                 Set values for A
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!                                 Factor the matrix A
CALL LFCDS (A, FACT, RCOND)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
RJ = 0.0E0
DO 10 J=1, N
   RJ(J) = 1.0E0
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFIDS
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
   CALL LFIDS (A, FACT, RJ, AINV(:,J), RES)
   RJ(J) = 0.0E0
10 CONTINUE
!                                 Print the results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
CALL WRRRN ('AINV', AINV)
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F9.3)
END
Output
RCOND < 0.005
L1 Condition number < 875.0
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. LFCDS is called to
factor the matrix and to check for singularity or ill-conditioning. LFIDS is called to determine the
columns of the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see
Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors
for the local arrays.
USE MPI_SETUP_INT
USE LFCDS_INT
USE UMACH_INT
USE LFIDS_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    J, LDA, N, NOUT, DESCA(9), DESCL(9)
INTEGER    INFO, MXCOL, MXLDA
REAL, ALLOCATABLE ::  A(:,:), AINV(:,:), X0(:), RJ(:)
REAL, ALLOCATABLE ::  A0(:,:), FACT0(:,:), RES0(:), RJ0(:)
REAL       RCOND
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
   A(1,:) = (/ 1.0, -3.0, 2.0/)
   A(2,:) = (/ -3.0, 10.0, -5.0/)
   A(3,:) = (/ 2.0, -5.0, 6.0/)
ENDIF
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
10 CONTINUE
!
!                                 Print results.
!                                 Only Rank=0 has the solution, AINV.
IF(MP_RANK.EQ.0) CALL WRRRN ('AINV', AINV)
IF (MP_RANK .EQ. 0) DEALLOCATE(A, AINV)
DEALLOCATE(A0, FACT0, RJ, RJ0, RES0, X0)
!                                 Exit ScaLAPACK usage
CALL SCALAPACK_EXIT(MP_ICTXT)
!                                 Shut down MPI
MP_NPROCS = MP_SETUP('FINAL')
99998 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F9.3)
END
Output
RCOND < 0.005
L1 Condition number < 875.0
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
LFTDS
CAPABLE
Computes the $R^T R$ Cholesky factorization of a real symmetric positive definite matrix.
Required Arguments
A N by N symmetric positive definite matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT N by N matrix containing the upper triangular matrix R of the factorization of A in
the upper triangle, and the lower triangular matrix $R^T$ in the lower triangle. (Output)
If A is not needed, A and FACT can share the same storage location.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFTDS computes an $R^T R$ Cholesky factorization of a real symmetric positive definite
coefficient matrix. The matrix R is upper triangular.
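As an illustration, the factor R for the 3 × 3 matrix used in the example for this routine can be derived by hand with the textbook Cholesky recurrences. This is a sketch of the mathematics only; the order in which the library actually computes the entries depends on the underlying LINPACK, LAPACK, or ScaLAPACK code.

```latex
\begin{aligned}
R_{11} &= \sqrt{A_{11}} = 1, \quad R_{12} = A_{12}/R_{11} = -3, \quad R_{13} = A_{13}/R_{11} = 2 \\
R_{22} &= \sqrt{A_{22} - R_{12}^2} = \sqrt{10 - 9} = 1, \quad
R_{23} = (A_{23} - R_{12}R_{13})/R_{22} = (-5 + 6)/1 = 1 \\
R_{33} &= \sqrt{A_{33} - R_{13}^2 - R_{23}^2} = \sqrt{6 - 4 - 1} = 1 \\
R &= \begin{pmatrix} 1 & -3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
R^T R = \begin{pmatrix} 1 & -3 & 2 \\ -3 & 10 & -5 \\ 2 & -5 & 6 \end{pmatrix} = A
\end{aligned}
```

Every quantity under a square root is positive here, which is the hand-calculation counterpart of the positive definiteness requirement.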
LFTDS fails if any submatrix of R is not positive definite or if R has a zero diagonal element.
These errors occur only if A is very close to a singular matrix or to a matrix which is not positive
definite.
The $R^T R$ factors are returned in a form that is compatible with routines LFIDS, LFSDS and LFDDS.
To solve systems of equations with multiple right-hand-side vectors, use LFTDS followed by either
LFIDS or LFSDS called once for each right-hand side. The routine LFDDS can be called to compute
the determinant of the coefficient matrix after LFTDS has performed the factorization.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
Informational error
Type  Code
4     2     The input matrix is not positive definite.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
The inverse of a 3 × 3 matrix is computed. LFTDS is called to factor the matrix and to check for
nonpositive definiteness. LFSDS is called to determine the columns of the inverse.
USE LFTDS_INT
USE LFSDS_INT
USE WRRRN_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N
PARAMETER  (LDA=3, LDFACT=3, N=3)
REAL       A(LDA,LDA), AINV(LDA,LDA), FACT(LDFACT,LDFACT), RJ(N)
!
!                                 Set values for A
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!                                 Factor the matrix A
CALL LFTDS (A, FACT)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
RJ = 0.0E0
DO 10 J=1, N
   RJ(J) = 1.0E0
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFSDS
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
   CALL LFSDS (FACT, RJ, AINV(:,J))
   RJ(J) = 0.0E0
10 CONTINUE
!                                 Print the results
CALL WRRRN ('AINV', AINV)
END
Output
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. LFTDS is called to
factor the matrix and to check for nonpositive definiteness. LFSDS is called to determine the
columns of the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see
Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors
for the local arrays.
USE MPI_SETUP_INT
USE LFTDS_INT
USE UMACH_INT
USE LFSDS_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    J, LDA, N, DESCA(9), DESCL(9)
INTEGER    INFO, MXCOL, MXLDA
REAL, ALLOCATABLE ::  A(:,:), AINV(:,:), X0(:)
REAL, ALLOCATABLE ::  A0(:,:), FACT0(:,:), RES0(:), RJ0(:)
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
   A(1,:) = (/ 1.0, -3.0, 2.0/)
   A(2,:) = (/ -3.0, 10.0, -5.0/)
   A(3,:) = (/ 2.0, -5.0, 6.0/)
ENDIF
!
!
CALL
!
!
CALL
!
CALL
CALL
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
RCOND < 0.005
L1 Condition number < 875.0
            AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
LFSDS
CAPABLE
Solves a real symmetric positive definite system of linear equations given the $R^T R$ Cholesky
factorization of the coefficient matrix.
Required Arguments
FACT N by N matrix containing the $R^T R$ factorization of the coefficient matrix A as output
from routine LFCDS/DLFCDS or LFTDS/DLFTDS. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
This routine computes the solution for a system of linear algebraic equations having a real
symmetric positive definite coefficient matrix. To compute the solution, the coefficient matrix
must first undergo an $R^T R$ factorization. This may be done by calling either LFCDS or LFTDS. R is
an upper triangular matrix.
The solution to Ax = b is found by solving the triangular systems $R^T y = b$ and $Rx = y$.
LFSDS and LFIDS both solve a linear system given its $R^T R$ factorization. LFIDS generally takes
more time and produces a more accurate answer than LFSDS. Each iteration of the iterative
refinement algorithm used by LFIDS calls LFSDS.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
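The two triangular solves described above can be traced by hand for the first right-hand side of the example for this routine, b = (-1, -3, -3), using the factor R of the example matrix (R can be verified by multiplying out $R^T R$). This is a worked sketch, not library output.

```latex
\begin{aligned}
R &= \begin{pmatrix} 1 & -3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \\
R^T y = b:&\quad y_1 = -1, \quad y_2 = -3 + 3y_1 = -6, \quad y_3 = -3 - 2y_1 - y_2 = 5 \\
R x = y:&\quad x_3 = 5, \quad x_2 = y_2 - x_3 = -11, \quad x_1 = y_1 + 3x_2 - 2x_3 = -44
\end{aligned}
```

Substituting x = (-44, -11, 5) into Ax reproduces b, which confirms the forward and back substitutions.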
Comments
Informational error
Type  Code
4     1     The input matrix is singular.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A set of linear systems is solved successively. LFTDS is called to factor the coefficient matrix.
LFSDS is called to compute the four solutions for the four right-hand sides. In this case the
coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be
better to call LFCDS to perform the factorization, and LFIDS to compute the solutions.
USE LFSDS_INT
USE LFTDS_INT
USE WRRRN_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N
PARAMETER  (LDA=3, LDFACT=3, N=3)
REAL       A(LDA,LDA), B(N,4), FACT(LDFACT,LDFACT), X(N,4)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
!                                 B = ( -1.0   3.6  -8.0  -9.4)
!                                     ( -3.0  -4.2  11.0  17.6)
!                                     ( -3.0  -5.2  -6.0 -23.4)
!
DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
DATA B/-1.0, -3.0, -3.0, 3.6, -4.2, -5.2, -8.0, 11.0, -6.0,&
       -9.4, 17.6, -23.4/
!                                 Factor the matrix A
CALL LFTDS (A, FACT)
!                                 Compute the solutions
DO 10 I=1, 4
   CALL LFSDS (FACT, B(:,I), X(:,I))
10 CONTINUE
!                                 Print solutions
CALL WRRRN ('The solution vectors are', X)
!
END
Output
1
2
3
ScaLAPACK Example
The same set of linear systems is solved successively as a distributed example. Routine LFTDS is
called to factor the coefficient matrix. The routine LFSDS is called to compute the four solutions
for the four right-hand sides. In this case, the coefficient matrix is assumed to be well-conditioned
and correctly scaled. Otherwise, it would be better to call LFCDS to perform the factorization, and
LFIDS to compute the solutions. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility
routines (see Chapter 11, Utilities) used to map and unmap arrays to and from the processor
grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes
the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LFSDS_INT
USE LFTDS_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    J, LDA, N, DESCA(9), DESCL(9)
INTEGER    INFO, MXCOL, MXLDA
REAL, ALLOCATABLE ::  A(:,:), B(:,:), X(:,:), X0(:)
REAL, ALLOCATABLE ::  A0(:,:), FACT0(:,:), B0(:)
PARAMETER  (LDA=3, N=3)
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
1
2
3
LFIDS
CAPABLE
Uses iterative refinement to improve the solution of a real symmetric positive definite system of
linear equations.
Required Arguments
A N by N matrix containing the symmetric positive definite coefficient matrix of the linear
system. (Input)
Only the upper triangle of A is referenced.
FACT N by N matrix containing the $R^T R$ factorization of the coefficient matrix A as output
from routine LFCDS/DLFCDS or LFTDS/DLFTDS. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
RES Vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFIDS computes the solution of a system of linear algebraic equations having a real
symmetric positive definite coefficient matrix. Iterative refinement is performed on the solution
vector to improve the accuracy. Usually almost all of the digits in the solution are accurate, even if
the matrix is somewhat ill-conditioned. The underlying code is based on either LINPACK,
LAPACK, or ScaLAPACK code depending upon which supporting libraries are used during
linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual.
To compute the solution, the coefficient matrix must first undergo an $R^T R$ factorization. This may
be done by calling either LFCDS or LFTDS.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIDS and LFSDS both solve a linear system given its $R^T R$ factorization. LFIDS generally takes
more time and produces a more accurate answer than LFSDS. Each iteration of the iterative
refinement algorithm used by LFIDS calls LFSDS.
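The text does not spell out the refinement iteration. In its textbook form, which is assumed here for illustration and may differ in detail from the library's implementation (for example in the precision used for the residual and in the stopping test), each pass computes a residual, solves for a correction with the existing factorization, and updates the solution:

```latex
\begin{aligned}
r^{(k)} &= b - A x^{(k)} \\
R^T R \, d^{(k)} &= r^{(k)} \\
x^{(k+1)} &= x^{(k)} + d^{(k)}
\end{aligned}
```

The correction solve in the second line is the step that corresponds to a call to LFSDS, consistent with the statement above that each iteration calls LFSDS.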
Comments
Informational error
Type  Code
3     2     The input matrix is too ill-conditioned for iterative refinement to be
            effective.
X0 Local vector of length MXLDA containing the local portions of the distributed vector X.
X contains the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
RES0 Local vector of length MXLDA containing the local portions of the distributed
vector RES. RES contains the residual vector at the improved solution to the linear
system. (Output)
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving
the system each of the first two times by adding 0.2 to the second element.
USE LFIDS_INT
USE LFCDS_INT
USE UMACH_INT
USE WRRRN_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N
PARAMETER  (LDA=3, LDFACT=3, N=3)
REAL       A(LDA,LDA), B(N), RCOND, FACT(LDFACT,LDFACT), RES(N,3),&
           X(N,3)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
!                                 B = (  1.0  -3.0   2.0)
!
DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
DATA B/1.0, -3.0, 2.0/
!                                 Factor the matrix A
CALL LFCDS (A, FACT, RCOND)
!                                 Print the estimated condition number
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Compute the solutions
DO 10 I=1, 3
   CALL LFIDS (A, FACT, B, X(:,I), RES(:,I))
   B(2) = B(2) + .2E0
10 CONTINUE
!                                 Print solutions and residuals
CALL WRRRN ('The solution vectors are', X)
CALL WRRRN ('The residual vectors are', RES)
!
99999 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F9.3)
END
Output
RCOND = 0.001
L1 Condition number =
674.727
vectors are
2
3
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
ScaLAPACK Example
The same set of linear systems is solved successively as a distributed example. The right-hand-side
vector is perturbed after solving the system each of the first two times by adding 0.2 to the
second element. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter
11, Utilities) used to map and unmap arrays to and from the processor grid. They are used here
for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the
local arrays.
USE MPI_SETUP_INT
USE LFIDS_INT
USE LFCDS_INT
USE UMACH_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE "mpif.h"
!                                 Declare variables
INTEGER    J, LDA, N, NOUT, DESCA(9), DESCL(9)
INTEGER    INFO, MXCOL, MXLDA
REAL       RCOND
REAL, ALLOCATABLE ::  A(:,:), B(:), X(:,:), RES(:,:), X0(:)
REAL, ALLOCATABLE ::  A0(:,:), FACT0(:,:), B0(:), RES0(:)
PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,N), B(N), X(N,3), RES(N,3))
!                                 Set values for A and B
   A(1,:) = (/ 1.0, -3.0, 2.0/)
   A(2,:) = (/-3.0, 10.0, -5.0/)
   A(3,:) = (/ 2.0, -5.0, 6.0/)
!
   B = (/ 1.0, -3.0, 2.0/)
ENDIF
!
!
Output
RCOND = 0.001
L1 Condition number =
674.727
3
1
2
3
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
LFDDS
Computes the determinant of a real symmetric positive definite matrix given the $R^T R$ Cholesky
factorization of the matrix.
Required Arguments
FACT N by N matrix containing the $R^T R$ factorization of the coefficient matrix A as output
from routine LFCDS/DLFCDS or LFTDS/DLFTDS. (Input)
DET1 Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that $1.0 \le |\text{DET1}| < 10.0$ or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form $\det(A) = \text{DET1} \times 10^{\text{DET2}}$.
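The mantissa and exponent convention described for DET1 and DET2 can be illustrated with a short standalone sketch. It is not the library's implementation: the program name `det_normalize` and the subroutine `normalize` are hypothetical, no IMSL routine is called, and the exponent is held in an INTEGER for simplicity, whereas the examples in this chapter print DET2 with a real edit descriptor.

```fortran
program det_normalize
  implicit none
  real    :: det1
  integer :: det2

  ! Two sample determinant values (21 and -2), chosen to match the
  ! determinant examples printed in this chapter.
  call normalize(21.0, det1, det2)
  print '(a,f6.3,a,i0)', ' det = ', det1, ' * 10**', det2
  call normalize(-2.0, det1, det2)
  print '(a,f6.3,a,i0)', ' det = ', det1, ' * 10**', det2

contains

  ! Split det into a mantissa with 1.0 <= |mant| < 10.0 (or mant = 0.0)
  ! and a power-of-ten exponent, so that det = mant * 10**iexp.
  ! Hypothetical helper, written only for this illustration.
  subroutine normalize(det, mant, iexp)
    real,    intent(in)  :: det
    real,    intent(out) :: mant
    integer, intent(out) :: iexp

    mant = det
    iexp = 0
    if (mant == 0.0) return
    ! Scale down values that are too large.
    do while (abs(mant) >= 10.0)
      mant = mant/10.0
      iexp = iexp + 1
    end do
    ! Scale up values that are too small.
    do while (abs(mant) < 1.0)
      mant = mant*10.0
      iexp = iexp - 1
    end do
  end subroutine normalize

end program det_normalize
```

Returning the two parts separately lets a caller work with determinants whose magnitude would otherwise fall outside the range of the floating-point type.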
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDDS computes the determinant of a real symmetric positive definite coefficient matrix.
To compute the determinant, the coefficient matrix must first undergo an $R^T R$ factorization. This
may be done by calling either LFCDS or LFTDS. The formula $\det A = \det R^T \det R = (\det R)^2$ is
used to compute the determinant. Since the determinant of a triangular matrix is the product of the
diagonal elements,
$$\det R = \prod_{i=1}^{N} R_{ii}$$
Example
The determinant is computed for a real positive definite 3 × 3 matrix.
      USE LFDDS_INT
      USE LFTDS_INT
      USE UMACH_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, NOUT
      PARAMETER  (LDA=3, LDFACT=3)
      REAL       A(LDA,LDA), DET1, DET2, FACT(LDFACT,LDFACT)
!
!                                 Set values for A
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  20.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
      DATA A/1.0, -3.0, 2.0, -3.0, 20.0, -5.0, 2.0, -5.0, 6.0/
!                                 Factor the matrix
      CALL LFTDS (A, FACT)
!                                 Compute the determinant
      CALL LFDDS (FACT, DET1, DET2)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant of A is ',F6.3,' * 10**',F2.0)
      END
Output
The determinant of A is 2.100 * 10**1.
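As a hand check of the output above, the Cholesky factor of the example matrix can be worked out directly. The entries of R below are computed here from the standard Cholesky recurrences; they are not taken from the source.

```latex
A = \begin{pmatrix} 1 & -3 & 2 \\ -3 & 20 & -5 \\ 2 & -5 & 6 \end{pmatrix} = R^T R,
\qquad
R = \begin{pmatrix} 1 & -3 & 2 \\ 0 & \sqrt{11} & 1/\sqrt{11} \\ 0 & 0 & \sqrt{21/11} \end{pmatrix}

\det R = 1 \cdot \sqrt{11} \cdot \sqrt{21/11} = \sqrt{21},
\qquad
\det A = (\det R)^2 = 21 = 2.100 \times 10^{1}
```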
LINDS
Computes the inverse of a real symmetric positive definite matrix.
Required Arguments
A N by N matrix containing the symmetric positive definite matrix to be inverted. (Input)
Only the upper triangle of A is referenced.
AINV N by N matrix containing the inverse of A. (Output)
If A is not needed, A and AINV can share the same storage locations.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDAINV Leading dimension of AINV exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDAINV = size (AINV,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LINDS computes the inverse of a real symmetric positive definite matrix. The underlying
code is based on either LINPACK, LAPACK, or ScaLAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see
Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this
manual. LINDS first uses the routine LFCDS to compute an $R^T R$ factorization of the coefficient
matrix and to estimate the condition number of the matrix. LINRT is then used to compute $R^{-1}$.
Finally $A^{-1}$ is computed using $A^{-1} = R^{-1} R^{-T}$.
LINDS fails if any submatrix of R is not positive definite or if R has a zero diagonal element.
These errors occur only if A is very close to a singular matrix or to a matrix which is not positive
definite.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in $A^{-1}$.
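The three steps in the description can be traced by hand for the 3 × 3 matrix used in the example for this routine. The factor R and its inverse below are worked out here as an illustration and are not part of the source; the final product agrees with the AINV table printed in the example output.

```latex
A = R^T R \;\Longrightarrow\; A^{-1} = (R^T R)^{-1} = R^{-1} R^{-T}

A = \begin{pmatrix} 1 & -3 & 2 \\ -3 & 10 & -5 \\ 2 & -5 & 6 \end{pmatrix},
\qquad
R = \begin{pmatrix} 1 & -3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
R^{-1} = \begin{pmatrix} 1 & 3 & -5 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}

A^{-1} = R^{-1} R^{-T}
       = \begin{pmatrix} 35 & 8 & -5 \\ 8 & 2 & -1 \\ -5 & -1 & 1 \end{pmatrix}
```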
Comments
1.
2.
Informational errors
Type Code
3
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
The inverse is computed for a real positive definite 3 × 3 matrix.
      USE LINDS_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, LDAINV
      PARAMETER  (LDA=3, LDAINV=3)
      REAL       A(LDA,LDA), AINV(LDAINV,LDAINV)
!
!                                 Set values for A
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!
      CALL LINDS (A, AINV)
!                                 Print results
      CALL WRRRN ('AINV', AINV)
      END
Output
          AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
ScaLAPACK Example
The inverse of the same 3 × 3 matrix is computed as a distributed example. SCALAPACK_MAP and
SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities) used to map and unmap
arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LINDS_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE    'mpif.h'
!                                 Declare variables
      INTEGER    J, LDA, LDFACT, N, DESCA(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::        A(:,:), AINV(:,:)
      REAL, ALLOCATABLE ::        A0(:,:), AINV0(:,:)
      PARAMETER  (LDA=3, N=3)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/ 1.0, -3.0,  2.0/)
         A(2,:) = (/-3.0, 10.0, -5.0/)
         A(3,:) = (/ 2.0, -5.0,  6.0/)
      ENDIF
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
          AINV
        1       2       3
1   35.00    8.00   -5.00
2    8.00    2.00   -1.00
3   -5.00   -1.00    1.00
LSASF
Solves a real symmetric system of linear equations with iterative refinement.
Required Arguments
A N by N matrix containing the coefficient matrix of the symmetric linear system. (Input)
Only the upper triangle of A is referenced.
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSASF solves systems of linear algebraic equations having a real symmetric indefinite
coefficient matrix. It first uses the routine LFCSF to compute a $U D U^T$ factorization of the
coefficient matrix and to estimate the condition number of the matrix. D is a block diagonal matrix
with blocks of order 1 or 2, and U is a matrix composed of the product of a permutation matrix
and a unit upper triangular matrix. The solution of the linear system is then found using the
iterative refinement routine LFISF.
LSASF fails if a block in D is singular or if the iterative refinement algorithm fails to converge.
These errors occur only if A is singular or very close to a singular matrix.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system. LSASF solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2ASF the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSASF.
Additional memory allocation for FACT and option value restoration are done
automatically in LSASF. Users directly calling L2ASF can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSASF or L2ASF. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSASF temporarily replaces IVAL(2) by IVAL(1). The
routine L2CSF computes the condition number if IVAL(2) = 2. Otherwise L2CSF
skips this computation. LSASF restores the option. Default values for the option
are IVAL(*) = 1, 2.
Example
A system of three linear equations is solved. The coefficient matrix has real symmetric form and
the right-hand-side vector b has three elements.
      USE LSASF_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
!                                 B = (  4.1  -4.7   6.5)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
      DATA B/4.1, -4.7, 6.5/
!
      CALL LSASF (A, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END
Output
          X
     1       2       3
-4.100  -3.500   1.200
LSLSF
Solves a real symmetric system of linear equations without iterative refinement.
Required Arguments
A N by N matrix containing the coefficient matrix of the symmetric linear system. (Input)
Only the upper triangle of A is referenced.
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLSF solves systems of linear algebraic equations having a real symmetric indefinite
coefficient matrix. It first uses the routine LFCSF to compute a $U D U^T$ factorization of the
coefficient matrix. D is a block diagonal matrix with blocks of order 1 or 2, and U is a matrix
composed of the product of a permutation matrix and a unit upper triangular matrix.
The solution of the linear system is then found using the routine LFSSF.
LSLSF fails if a block in D is singular. This occurs only if A either is singular or is very close to a
singular matrix.
Comments
1.
2.
Informational errors
Type  Code
3     1    The input matrix is too ill-conditioned. The solution might not be accurate.
4     2    The input matrix is singular.
3.
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2LSF the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLSF.
Additional memory allocation for FACT and option value restoration are done
automatically in LSLSF. Users directly calling L2LSF can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSLSF or L2LSF. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSLSF temporarily replaces IVAL(2) by IVAL(1). The
Example
A system of three linear equations is solved. The coefficient matrix has real symmetric form and
the right-hand-side vector b has three elements.
      USE LSLSF_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
!                                 B = (  4.1  -4.7   6.5)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
      DATA B/4.1, -4.7, 6.5/
!
      CALL LSLSF (A, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END
Output
          X
     1       2       3
-4.100  -3.500   1.200
LFCSF
Computes the $U D U^T$ factorization of a real symmetric matrix and estimates its L1 condition
number.
Required Arguments
A N by N symmetric matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT N by N matrix containing information about the factorization of the symmetric
matrix A. (Output)
Only the upper triangle of FACT is used. If A is not needed, A and FACT can share the
same storage locations.
IPVT Vector of length N containing the pivoting information for the factorization.
(Output)
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFCSF performs a $U D U^T$ factorization of a real symmetric indefinite coefficient matrix.
It also estimates the condition number of the matrix. The $U D U^T$ factorization is called the
diagonal pivoting factorization.
The L1 condition number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to
compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm is the same as
used by LINPACK and is described by Cline et al. (1979).
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
LFCSF fails if A is singular or very close to a singular matrix.
The $U D U^T$ factors are returned in a form that is compatible with routines LFISF, LFSSF and
LFDSF. To solve systems of equations with multiple right-hand-side vectors, use LFCSF followed
by either LFISF or LFSSF called once for each right-hand side. The routine LFDSF can be called
to compute the determinant of the coefficient matrix after LFCSF has performed the factorization.
The underlying code is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
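To make the definition of the L1 condition number concrete, the sketch below evaluates it exactly for the 3 × 3 matrix used in the example for this routine, taking the inverse from the AINV table in that example's output. This is a standalone illustration in standard Fortran, not the estimation algorithm that LFCSF uses; the program name `cond_l1` and its variable names are hypothetical.

```fortran
program cond_l1
  implicit none
  real :: a(3,3), ainv(3,3), anorm, ainvnorm

  ! Both matrices are symmetric, so filling them in column-major
  ! order with RESHAPE gives the same result as reading by rows.
  a    = reshape((/  1.0, -2.0,  1.0, &
                    -2.0,  3.0, -2.0, &
                     1.0, -2.0,  3.0 /), (/ 3, 3 /))
  ainv = reshape((/ -2.5, -2.0, -0.5, &
                    -2.0, -1.0,  0.0, &
                    -0.5,  0.0,  0.5 /), (/ 3, 3 /))

  ! The matrix 1-norm is the largest absolute column sum.
  anorm    = maxval(sum(abs(a),    dim=1))
  ainvnorm = maxval(sum(abs(ainv), dim=1))

  print '(a,f8.3)', ' norm1(A)      = ', anorm
  print '(a,f8.3)', ' norm1(inv(A)) = ', ainvnorm
  print '(a,f8.3)', ' kappa(A)      = ', anorm*ainvnorm
  print '(a,f8.3)', ' 1/kappa(A)    = ', 1.0/(anorm*ainvnorm)
end program cond_l1
```

By hand, the largest absolute column sums are 7 for A and 5 for its inverse, so the exact condition number is 35. That is consistent with the bounds reported in the example output (L1 condition number below 40.0, RCOND below 0.05).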
Comments
1.
2.
Informational errors
Type
Code
3
4
1
2
Example
The inverse of a 3 × 3 matrix is computed. LFCSF is called to factor the matrix and to check for
singularity or ill-conditioning. LFISF is called to determine the columns of the inverse.
      USE LFCSF_INT
      USE UMACH_INT
      USE LFISF_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), AINV(N,N), FACT(LDA,LDA), RJ(N), RES(N),&
                 RCOND
!
!                                 Set values for A
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
!                                 Factor A and return the reciprocal
!                                 condition number estimate
      CALL LFCSF (A, FACT, IPVT, RCOND)
!                                 Print the estimate of the condition
!                                 number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!
!
!
!
!
!
Output
RCOND < 0.05
L1 Condition number < 40.0
             AINV
         1        2        3
1   -2.500   -2.000   -0.500
2   -2.000   -1.000    0.000
3   -0.500    0.000    0.500
LFTSF
Computes the $U D U^T$ factorization of a real symmetric matrix.
Required Arguments
A N by N symmetric matrix to be factored. (Input)
Only the upper triangle of A is referenced.
FACT N by N matrix containing information about the factorization of the symmetric
matrix A. (Output)
Only the upper triangle of FACT is used. If A is not needed, A and FACT can share the
same storage locations.
IPVT Vector of length N containing the pivoting information for the factorization.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFTSF performs a $U D U^T$ factorization of a real symmetric indefinite coefficient matrix.
The $U D U^T$ factorization is called the diagonal pivoting factorization.
LFTSF fails if A is singular or very close to a singular matrix.
The $U D U^T$ factors are returned in a form that is compatible with routines LFISF, LFSSF and
LFDSF. To solve systems of equations with multiple right-hand-side vectors, use LFTSF followed
by either LFISF or LFSSF called once for each right-hand side. The routine LFDSF can be called
to compute the determinant of the coefficient matrix after LFTSF has performed the factorization.
The underlying code is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
Informational error
Type  Code
4     2    The input matrix is singular.
Example
The inverse of a 3 × 3 matrix is computed. LFTSF is called to factor the matrix and to check for
singularity. LFSSF is called to determine the columns of the inverse.
      USE LFTSF_INT
      USE LFSSF_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N)
      REAL       A(LDA,LDA), AINV(N,N), FACT(LDA,LDA), RJ(N)
!
!                                 Set values for A
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
!                                 Factor A
      CALL LFTSF (A, FACT, IPVT)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = 0.0E0
      DO 10  J=1, N
         RJ(J) = 1.0E0
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFSSF
!                                 reference places the J-th column of
!                                 the inverse of A in the J-th column
!                                 of AINV
         CALL LFSSF (FACT, IPVT, RJ, AINV(:,J))
         RJ(J) = 0.0E0
   10 CONTINUE
!                                 Print the inverse
      CALL WRRRN ('AINV', AINV)
      END
Output
             AINV
         1        2        3
1   -2.500   -2.000   -0.500
2   -2.000   -1.000    0.000
3   -0.500    0.000    0.500
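The column-by-column approach used in the example above rests on the identity below, stated here for N = 3. The check of the first column uses the values from the printed AINV table; the notation $x_j$ and $e_j$ is introduced only for this illustration.

```latex
A x_j = e_j \quad (j = 1, 2, 3),
\qquad
A^{-1} = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix}

A x_1 =
\begin{pmatrix} 1 & -2 & 1 \\ -2 & 3 & -2 \\ 1 & -2 & 3 \end{pmatrix}
\begin{pmatrix} -2.5 \\ -2.0 \\ -0.5 \end{pmatrix}
=
\begin{pmatrix} -2.5 + 4.0 - 0.5 \\ 5.0 - 6.0 + 1.0 \\ -2.5 + 4.0 - 1.5 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
= e_1
```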
LFSSF
Solves a real symmetric system of linear equations given the $U D U^T$ factorization of the
coefficient matrix.
Required Arguments
FACT N by N matrix containing the factorization of the coefficient matrix A as output from
routine LFCSF/DLFCSF or LFTSF/DLFTSF. (Input)
Only the upper triangle of FACT is used.
IPVT Vector of length N containing the pivoting information for the factorization of A as
output from routine LFCSF/DLFCSF or LFTSF/DLFTSF. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of the
calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFSSF computes the solution of a system of linear algebraic equations having a real
symmetric indefinite coefficient matrix.
To compute the solution, the coefficient matrix must first undergo a $U D U^T$ factorization. This
may be done by calling either LFCSF or LFTSF.
LFSSF and LFISF both solve a linear system given its $U D U^T$ factorization. LFISF generally takes
more time and produces a more accurate answer than LFSSF. Each iteration of the iterative
refinement algorithm used by LFISF calls LFSSF.
The underlying code is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Example
A set of linear systems is solved successively. LFTSF is called to factor the coefficient matrix.
LFSSF is called to compute the four solutions for the four right-hand sides. In this case the
coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be
better to call LFCSF to perform the factorization, and LFISF to compute the solutions.
      USE LFSSF_INT
      USE LFTSF_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N)
      REAL       A(LDA,LDA), B(N,4), X(N,4), FACT(LDA,LDA)
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
!                                 B = ( -1.0   3.6  -8.0   -9.4)
!                                     ( -3.0  -4.2  11.0   17.6)
!                                     ( -3.0  -5.2  -6.0  -23.4)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
      DATA B/-1.0, -3.0, -3.0, 3.6, -4.2, -5.2, -8.0, 11.0, -6.0,&
           -9.4, 17.6, -23.4/
!                                 Factor A
      CALL LFTSF (A, FACT, IPVT)
!                                 Solve for the four right-hand sides
      DO 10  I=1, 4
         CALL LFSSF (FACT, IPVT, B(:,I), X(:,I))
   10 CONTINUE
!                                 Print results
      CALL WRRRN ('X', X)
      END
Output
                X
        1       2       3       4
1   10.00    2.00    1.00    0.00
2    5.00   -3.00    5.00    1.20
3   -1.00   -4.40    1.00   -7.00
LFISF
Uses iterative refinement to improve the solution of a real symmetric system of linear equations.
Required Arguments
A N by N matrix containing the coefficient matrix of the symmetric linear system. (Input)
Only the upper triangle of A is referenced.
FACT N by N matrix containing the factorization of the coefficient matrix A as output from
routine LFCSF/DLFCSF or LFTSF/DLFTSF. (Input)
Only the upper triangle of FACT is used.
IPVT Vector of length N containing the pivoting information for the factorization of A as
output from routine LFCSF/DLFCSF or LFTSF/DLFTSF. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
RES Vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
LFISF computes the solution of a system of linear algebraic equations having a real symmetric
indefinite coefficient matrix. Iterative refinement is performed on the solution vector to improve
the accuracy. Usually almost all of the digits in the solution are accurate, even if the matrix is
somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo a $U D U^T$ factorization. This
may be done by calling either LFCSF or LFTSF.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFISF and LFSSF both solve a linear system given its $U D U^T$ factorization. LFISF generally takes
more time and produces a more accurate answer than LFSSF. Each iteration of the iterative
refinement algorithm used by LFISF calls LFSSF.
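The text does not spell out the refinement iteration itself. In its usual textbook form, which is assumed here rather than taken from the source, each step computes a residual, solves for a correction with the existing factorization (the step that would correspond to a call to LFSSF), and updates the solution. The symbols $x^{(k)}$, $r^{(k)}$, and $d^{(k)}$ are introduced only for this sketch.

```latex
r^{(k)} = b - A x^{(k)}

U D U^T d^{(k)} = r^{(k)}

x^{(k+1)} = x^{(k)} + d^{(k)}
```

On this reading, the output argument RES would correspond to the residual $b - Ax$ at the final iterate.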
Comments
Informational error
Type  Code
3     2    The input matrix is too ill-conditioned for iterative refinement to be effective.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving
the system each of the first two times by adding 0.2 to the second element.
      USE LFISF_INT
      USE UMACH_INT
      USE LFCSF_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), B(N), X(N), FACT(LDA,LDA), RES(N), RCOND
!
!                                 Set values for A and B
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
!                                 B = (  4.1  -4.7   6.5)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
      DATA B/4.1, -4.7, 6.5/
!                                 Factor A and compute the estimate
!                                 of the reciprocal condition number
      CALL LFCSF (A, FACT, IPVT, RCOND)
!                                 Print condition number
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) RCOND, 1.0E0/RCOND
!                                 Solve, then perturb right-hand side
      DO 10  I=1, 3
         CALL LFISF (A, FACT, IPVT, B, X, RES)
!                                 Print results
         CALL WRRRN ('X', X, 1, N, 1)
         CALL WRRRN ('RES', RES, 1, N, 1)
         B(2) = B(2) + .20E0
   10 CONTINUE
!
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F9.3)
      END
Output
          X
     1       2       3
-4.100  -3.500   1.200

               RES
         1           2           3
-2.384E-07  -2.384E-07   0.000E+00

          X
     1       2       3
-4.500  -3.700   1.200

               RES
         1           2           3
-2.384E-07  -2.384E-07   0.000E+00

          X
     1       2       3
-4.900  -3.900   1.200

               RES
         1           2           3
-2.384E-07  -2.384E-07   0.000E+00
LFDSF
Computes the determinant of a real symmetric matrix given the $U D U^T$ factorization of the matrix.
Required Arguments
FACT N by N matrix containing the factored matrix A as output from subroutine
LFTSF/DLFTSF or LFCSF/DLFCSF. (Input)
IPVT Vector of length N containing the pivoting information for the $U D U^T$ factorization
as output from routine LFTSF/DLFTSF or LFCSF/DLFCSF. (Input)
DET1 Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that $1.0 \le |\text{DET1}| < 10.0$ or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form $\det(A) = \text{DET1} \times 10^{\text{DET2}}$.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (FACT,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDSF computes the determinant of a real symmetric indefinite coefficient matrix. To
compute the determinant, the coefficient matrix must first undergo a $U D U^T$ factorization. This
may be done by calling either LFCSF or LFTSF. Since $\det U = \pm 1$, the formula
$\det A = \det U \det D \det U^T = \det D$ is used to compute the determinant. Next $\det D$ is computed as
the product of the determinants of its blocks.
LFDSF is based on the LINPACK routine SSIDI; see Dongarra et al. (1979).
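Because D has diagonal blocks of order 1 or 2, the product over blocks reduces to two simple cases. The block notation $D_k$ and the entries a, b, c, d below are introduced only for this illustration. The last line is an independent cofactor expansion of the 3 × 3 matrix used in the example for this routine; it agrees with the printed result of -2.000 * 10**0.

```latex
\det D = \prod_k \det D_k,
\qquad
\det \begin{pmatrix} d \end{pmatrix} = d,
\qquad
\det \begin{pmatrix} a & b \\ b & c \end{pmatrix} = ac - b^2

\det \begin{pmatrix} 1 & -2 & 1 \\ -2 & 3 & -2 \\ 1 & -2 & 3 \end{pmatrix}
= 1\,(9 - 4) + 2\,(-6 + 2) + 1\,(4 - 3) = 5 - 8 + 1 = -2
```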
Example
The determinant is computed for a real symmetric 3 × 3 matrix.
      USE LFDSF_INT
      USE LFTSF_INT
      USE UMACH_INT
!                                 Declare variables
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       A(LDA,LDA), FACT(LDA,LDA), DET1, DET2
!
!                                 Set values for A
!
!                                 A = (  1.0  -2.0   1.0)
!                                     ( -2.0   3.0  -2.0)
!                                     (  1.0  -2.0   3.0)
!
      DATA A/1.0, -2.0, 1.0, -2.0, 3.0, -2.0, 1.0, -2.0, 3.0/
!                                 Factor A
      CALL LFTSF (A, FACT, IPVT)
!                                 Compute the determinant
      CALL LFDSF (FACT, IPVT, DET1, DET2)
!                                 Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
99999 FORMAT (' The determinant of A is ',F6.3,' * 10**',F2.0)
      END
Output
The determinant of A is -2.000 * 10**0.
LSADH
Solves a Hermitian positive definite system of linear equations with iterative refinement.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the Hermitian positive
definite linear system. (Input)
Only the upper triangle of A is referenced.
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution of the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSADH solves a system of linear algebraic equations having a complex Hermitian positive
definite coefficient matrix. It first uses the routine LFCDH to compute an $R^H R$ Cholesky
factorization of the coefficient matrix and to estimate the condition number of the matrix. The
matrix R is upper triangular. The solution of the linear system is then found using the iterative
refinement routine LFIDH.
LSADH fails if any submatrix of R is not positive definite, if R has a zero diagonal element or if the
iterative refinement algorithm fails to converge. These errors occur only if A either is very close to
a singular matrix or is a matrix that is not positive definite.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system. LSADH solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2.
Informational errors
Type
Code
3
4
4
3.
2
4
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2ADH the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSADH.
Additional memory allocation for FACT and option value restoration are done
automatically in LSADH. Users directly calling L2ADH can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSADH or L2ADH. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSADH temporarily replaces IVAL(2) by IVAL(1). The
routine L2CDH computes the condition number if IVAL(2) = 2. Otherwise L2CDH
skips this computation. LSADH restores the option. Default values for the option
are IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of five linear equations is solved. The coefficient matrix has complex positive definite
form and the right-hand-side vector b has five elements.
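      USE LSADH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=5, N=5)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!        A = ( 2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!            (            4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!            (                      10.0+0.0i   0.0+4.0i   0.0+0.0i )
!            (                                  6.0+0.0i   1.0+1.0i )
!            (                                             9.0+0.0i )
!
!        B = ( 1.0+5.0i  12.0-6.0i  1.0-16.0i  -3.0-3.0i  25.0+16.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
            4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
            (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B /(1.0,5.0), (12.0,-6.0), (1.0,-16.0), (-3.0,-3.0),&
            (25.0,16.0)/
!
      CALL LSADH (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
!
      END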
Output
                                X
               1                2                3                4
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)
               5
( 3.000, 2.000)
ScaLAPACK Example
The same system of five linear equations is solved as a distributed computing example. The
coefficient matrix has complex positive definite form and the right-hand-side vector b has five
elements. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11,
Utilities) used to map and unmap arrays to and from the processor grid. They are used here for
brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local
arrays.
USE MPI_SETUP_INT
USE LSADH_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE    'mpif.h'
!                                 Declare variables
      INTEGER    LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::
      COMPLEX, ALLOCATABLE ::
      PARAMETER  (LDA=5, N=5)
!
0.0),(0.0,
0.0),(0.0,
4.0),(0.0,
0.0),(1.0,
0.0),(9.0,
0.0)/)
0.0)/)
0.0)/)
1.0)/)
0.0)/)
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
                                X
               1                2                3                4
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)
               5
( 3.000, 2.000)
LSLDH
Solves a complex Hermitian positive definite system of linear equations without iterative
refinement.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the Hermitian positive
definite linear system. (Input)
Only the upper triangle of A is referenced.
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSLDH solves a system of linear algebraic equations having a complex Hermitian positive
definite coefficient matrix. The underlying code is based on either LINPACK, LAPACK, or
ScaLAPACK code depending upon which supporting libraries are used during linking. For a
detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the
Introduction section of this manual. LSLDH first uses the routine LFCDH to compute an $R^H R$
Cholesky factorization of the coefficient matrix and to estimate the condition number of the
matrix. The matrix $R$ is upper triangular. The solution of the linear system is then found using the
routine LFSDH.
LSLDH fails if any submatrix of $R$ is not positive definite or if $R$ has a zero diagonal element.
These errors occur only if A is very close to a singular matrix or to a matrix which is not positive
definite.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that
LSADH be used.
Comments
1.
2.
3.
Informational errors
Type
Code
3
4
4
2
4
16   This option uses four values to solve memory bank conflict (access inefficiency)
     problems. In routine L2LDH the leading dimension of FACT is increased by
     IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
     temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLDH.
     Additional memory allocation for FACT and option value restoration are done
     automatically in LSLDH. Users directly calling L2LDH can allocate additional
     space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
     longer cause inefficiencies. There is no requirement that users change existing
     applications that use LSLDH or L2LDH. Default values for the option are
     IVAL(*) = 1, 16, 0, 1.
17   This option has two values that determine if the L1 condition number is to be
     computed. Routine LSLDH temporarily replaces IVAL(2) by IVAL(1). The
     routine L2CDH computes the condition number if IVAL(2) = 2. Otherwise L2CDH
     skips this computation. LSLDH restores the option. Default values for the option
     are IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of five linear equations is solved. The coefficient matrix has complex Hermitian positive
definite form and the right-hand-side vector b has five elements.
USE LSLDH_INT
USE WRCRN_INT
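!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=5, N=5)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 Set values for A and B
!
!     A = ( 2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!         (            4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!         (                      10.0+0.0i   0.0+4.0i   0.0+0.0i )
!         (                                  6.0+0.0i   1.0+1.0i )
!         (                                             9.0+0.0i )
!
!     B = ( 1.0+5.0i  12.0-6.0i  1.0-16.0i  -3.0-3.0i  25.0+16.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
              4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
              (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B /(1.0,5.0), (12.0,-6.0), (1.0,-16.0), (-3.0,-3.0),&
              (25.0,16.0)/
!
      CALL LSLDH (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
!
      END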
Output
                                       X
              1                2                3                4                5
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)  ( 3.000, 2.000)
ScaLAPACK Example
The same system of five linear equations is solved as a distributed computing example. The
coefficient matrix has complex Hermitian positive definite form and the right-hand-side vector b
has five elements. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11,
Utilities) used to map and unmap arrays to and from the processor grid. They are used here for
brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local
arrays.
USE MPI_SETUP_INT
USE LSLDH_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::   A(:,:), B(:), X(:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), B0(:), X0(:)
      PARAMETER  (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
Output
                                       X
              1                2                3                4                5
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)  ( 3.000, 2.000)
LFCDH
CAPABLE
Computes the $R^H R$ factorization of a complex Hermitian positive definite matrix and estimates its
$L_1$ condition number.
Required Arguments
A Complex N by N Hermitian positive definite matrix to be factored. (Input) Only the
upper triangle of A is referenced.
FACT Complex N by N matrix containing the upper triangular matrix R of the factorization
of A in the upper triangle. (Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share
the same storage locations.
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFCDH computes an $R^H R$ Cholesky factorization and estimates the condition number of a
complex Hermitian positive definite coefficient matrix. The matrix $R$ is upper triangular.
The $L_1$ condition number of the matrix $A$ is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is
expensive to compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm
is the same as used by LINPACK and is described by Cline et al. (1979).
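For reference, the quantities named in the paragraph above can be written out as follows. The symbols $\kappa_1$ and $\varepsilon$ are notation chosen for this note; the reading of RCOND as an estimate of the reciprocal condition number is taken from the RCOND argument description.

```latex
\kappa_1(A) = \|A\|_1 \, \|A^{-1}\|_1 , \qquad
\|A\|_1 = \max_{1 \le j \le N} \sum_{i=1}^{N} |A_{ij}| , \qquad
\mathrm{RCOND} \approx \frac{1}{\kappa_1(A)} , \qquad
\text{warning if } \kappa_1(A) > \frac{1}{\varepsilon} .
```

As a rough hand check against the 5 by 5 example in this section (worked from the printed AINV, so the digits are approximate): $\|A\|_1 \approx 16.24$ from column 3 and $\|A^{-1}\|_1 \approx 1.19$ from column 1, giving $\kappa_1(A) \approx 19.4$, which is consistent with the reported bound of 25.0.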
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
LFCDH fails if any submatrix of $R$ is not positive definite or if $R$ has a zero diagonal element.
These errors occur only if A is very close to a singular matrix or to a matrix which is not positive
definite.
The $R^H R$ factors are returned in a form that is compatible with routines LFIDH, LFSDH and
LFDDH. To solve systems of equations with multiple right-hand-side vectors, use LFCDH followed
by either LFIDH or LFSDH called once for each right-hand side. The routine LFDDH can be called
to compute the determinant of the coefficient matrix after LFCDH has performed the factorization.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2.
Informational errors
Type
Code
3
3
1
4
4
4
4
2
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
The inverse of a 5 × 5 Hermitian positive definite matrix is computed. LFCDH is called to factor the
matrix and to check for nonpositive definiteness or ill-conditioning. LFIDH is called to determine
the columns of the inverse.
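      USE LFCDH_INT
      USE LFIDH_INT
      USE UMACH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NOUT
      PARAMETER  (LDA=5, LDFACT=5, N=5)
      REAL       RCOND
      COMPLEX    A(LDA,LDA), AINV(LDA,LDA), FACT(LDFACT,LDFACT),&
                 RES(N), RJ(N)
!
!                                 Set values for A
!
!     A = ( 2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!         (            4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!         (                      10.0+0.0i   0.0+4.0i   0.0+0.0i )
!         (                                  6.0+0.0i   1.0+1.0i )
!         (                                             9.0+0.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
              4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
              (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
!                                 Factor the matrix A
      CALL LFCDH (A, FACT, RCOND)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = (0.0E0, 0.0E0)
      DO 10 J=1, N
         RJ(J) = (1.0E0,0.0E0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFIDH
!                                 reference solves for the J-th column of
!                                 the inverse of A
         CALL LFIDH (A, FACT, RJ, AINV(:,J), RES)
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE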
!
99999 FORMAT (
END
RCOND = ,F5.3,/,
Output
RCOND < 0.075
L1 Condition number < 25.0
                          AINV
                   1                   2                   3
1  ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)
2  ( 0.2166, 0.2166)  ( 0.4332, 0.0000)  (-0.0599,-0.1198)
3  (-0.0899, 0.0300)  (-0.0599, 0.1198)  ( 0.1797, 0.0000)
4  (-0.0207,-0.0622)  (-0.0829,-0.0415)  ( 0.0000, 0.1244)
5  ( 0.0092, 0.0046)  ( 0.0138,-0.0046)  (-0.0138,-0.0138)
                   4                   5
1  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4  ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5  (-0.0288, 0.0288)  ( 0.1175, 0.0000)
ScaLAPACK Example
The inverse of the same 5 × 5 Hermitian positive definite matrix in the preceding example is
computed as a distributed computing example. LFCDH is called to factor the matrix and to check
for nonpositive definiteness or ill-conditioning. LFIDH is called to determine the
columns of the inverse. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see
Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They are
used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the descriptors
for the local arrays.
USE MPI_SETUP_INT
USE LFCDH_INT
USE LFIDH_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    J, LDA, N, NOUT, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL       RCOND
      COMPLEX, ALLOCATABLE ::   A(:,:), AINV(:,:), RJ(:), RJ0(:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), FACT0(:,:), RES0(:), X0(:)
      PARAMETER  (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA), RES0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Factor the matrix A
      CALL LFCDH (A0, FACT0, RCOND)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = (0.0E0, 0.0E0)
      DO 10 J=1, N
         RJ(J) = (1.0E0,0.0E0)
         CALL SCALAPACK_MAP(RJ, DESCX, RJ0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFIDH
!                                 reference solves for the J-th column of
!                                 the inverse of A
         CALL LFIDH (A0, FACT0, RJ0, X0, RES0)
!                                 Unmap the results from the distributed
!                                 array back to a non-distributed array
         CALL SCALAPACK_UNMAP(X0, DESCX, AINV(:,J))
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
Output
RCOND < 0.075
L1 Condition number < 25.0
                          AINV
                   1                   2                   3
1  ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)
2  ( 0.2166, 0.2166)  ( 0.4332, 0.0000)  (-0.0599,-0.1198)
3  (-0.0899, 0.0300)  (-0.0599, 0.1198)  ( 0.1797, 0.0000)
4  (-0.0207,-0.0622)  (-0.0829,-0.0415)  ( 0.0000, 0.1244)
5  ( 0.0092, 0.0046)  ( 0.0138,-0.0046)  (-0.0138,-0.0138)
                   4                   5
1  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4  ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5  (-0.0288, 0.0288)  ( 0.1175, 0.0000)
LFTDH
CAPABLE
Computes the $R^H R$ factorization of a complex Hermitian positive definite matrix.
Required Arguments
A Complex N by N Hermitian positive definite matrix to be factored. (Input) Only the
upper triangle of A is referenced.
FACT Complex N by N matrix containing the upper triangular matrix R of the factorization
of A in the upper triangle. (Output)
Only the upper triangle of FACT will be used. If A is not needed, A and FACT can share
the same storage locations.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFTDH computes an $R^H R$ Cholesky factorization of a complex Hermitian positive definite
coefficient matrix. The matrix $R$ is upper triangular.
LFTDH fails if any submatrix of $R$ is not positive definite or if $R$ has a zero diagonal element.
These errors occur only if A is very close to a singular matrix or to a matrix which is not positive
definite.
The $R^H R$ factors are returned in a form that is compatible with routines LFIDH, LFSDH and
LFDDH. To solve systems of equations with multiple right-hand-side vectors, use LFTDH followed
by either LFIDH or LFSDH called once for each right-hand side. The IMSL routine LFDDH can be
called to compute the determinant of the coefficient matrix after LFTDH has performed the
factorization.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
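To make the factorization concrete, the entries of the upper triangular factor can be written column by column as below. This is the standard textbook Cholesky recurrence for a Hermitian positive definite matrix, given as an illustration; it is not a statement about the order of operations inside the LINPACK, LAPACK, or ScaLAPACK code that the routine actually calls.

```latex
A = R^H R , \qquad
R_{ij} = \frac{1}{R_{ii}} \Bigl( A_{ij} - \sum_{k=1}^{i-1} \overline{R_{ki}} \, R_{kj} \Bigr)
\quad (1 \le i < j) , \qquad
R_{jj} = \sqrt{ A_{jj} - \sum_{k=1}^{j-1} |R_{kj}|^2 } .
```

Only entries $A_{ij}$ with $i \le j$ appear, which matches the statement that only the upper triangle of A is referenced. A nonpositive quantity under the square root corresponds to the failure case described above, where the matrix is not positive definite.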
Comments
Informational errors
Type  Code
  3     4   The input matrix is not Hermitian. It has a diagonal entry with a small
            imaginary part.
  4     2   The input matrix is not positive definite.
  4     4   The input matrix is not Hermitian. It has a diagonal entry with an imaginary
            part.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
The inverse of a 5 × 5 matrix is computed. LFTDH is called to factor the matrix and to check for
nonpositive definiteness. LFSDH is called to determine the columns of the inverse.
USE LFTDH_INT
USE LFSDH_INT
USE WRCRN_INT
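!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=5, LDFACT=5, N=5)
      COMPLEX    A(LDA,LDA), AINV(LDA,LDA), FACT(LDFACT,LDFACT), RJ(N)
!
!                                 Set values for A
!
!     A = ( 2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!         (            4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!         (                      10.0+0.0i   0.0+4.0i   0.0+0.0i )
!         (                                  6.0+0.0i   1.0+1.0i )
!         (                                             9.0+0.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
              4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
              (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
!                                 Factor the matrix A
      CALL LFTDH (A, FACT)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = (0.0E0, 0.0E0)
      DO 10 J=1, N
         RJ(J) = (1.0E0,0.0E0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFSDH
!                                 reference solves for the J-th column of
!                                 the inverse of A
         CALL LFSDH (FACT, RJ, AINV(:,J))
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE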
Output
                          AINV
                   1                   2                   3
1  ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)
2                     ( 0.4332, 0.0000)  (-0.0599,-0.1198)
3                                        ( 0.1797, 0.0000)
                   4                   5
1  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4  ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5                     ( 0.1175, 0.0000)
ScaLAPACK Example
The inverse of the same 5 × 5 Hermitian positive definite matrix in the preceding example is
computed as a distributed computing example. LFTDH is called to factor the matrix and to check
for nonpositive definiteness. LFSDH is called to determine the columns of the inverse.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities)
used to map and unmap arrays to and from the processor grid. They are used here for brevity.
DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFTDH_INT
      USE LFSDH_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    J, LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::   A(:,:), AINV(:,:), RJ(:), RJ0(:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), FACT0(:,:), X0(:)
      PARAMETER  (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), AINV(LDA,N))
!                                 Set values for A
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(N, N, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 and MXCOL
      CALL SCALAPACK_GETDIM(N, N, MP_MB, MP_NB, MXLDA, MXCOL)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, N, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, MXLDA, INFO)
      CALL DESCINIT(DESCX, N, 1, MP_MB, 1, 0, 0, MP_ICTXT, MXLDA, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE(A0(MXLDA,MXCOL), X0(MXLDA),FACT0(MXLDA,MXCOL), RJ(N), &
               RJ0(MXLDA))
!                                 Map input arrays to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Factor the matrix A
      CALL LFTDH (A0, FACT0)
!                                 Set up the columns of the identity
!                                 matrix one at a time in RJ
      RJ = (0.0E0, 0.0E0)
      DO 10 J=1, N
         RJ(J) = (1.0E0,0.0E0)
         CALL SCALAPACK_MAP(RJ, DESCX, RJ0)
!                                 RJ is the J-th column of the identity
!                                 matrix so the following LFSDH
!                                 reference solves for the J-th column of
!                                 the inverse of A
         CALL LFSDH (FACT0, RJ0, X0)
!                                 Unmap the results from the distributed
!                                 array back to a non-distributed array
         CALL SCALAPACK_UNMAP(X0, DESCX, AINV(:,J))
         RJ(J) = (0.0E0,0.0E0)
   10 CONTINUE
Output
                          AINV
                   1                   2                   3
1  ( 0.7166, 0.0000)  ( 0.2166,-0.2166)  (-0.0899,-0.0300)
2  ( 0.2166, 0.2166)  ( 0.4332, 0.0000)  (-0.0599,-0.1198)
3  (-0.0899, 0.0300)  (-0.0599, 0.1198)  ( 0.1797, 0.0000)
4  (-0.0207,-0.0622)  (-0.0829,-0.0415)  ( 0.0000, 0.1244)
5  ( 0.0092, 0.0046)  ( 0.0138,-0.0046)  (-0.0138,-0.0138)
                   4                   5
1  (-0.0207, 0.0622)  ( 0.0092,-0.0046)
2  (-0.0829, 0.0415)  ( 0.0138, 0.0046)
3  ( 0.0000,-0.1244)  (-0.0138, 0.0138)
4  ( 0.2592, 0.0000)  (-0.0288,-0.0288)
5  (-0.0288, 0.0288)  ( 0.1175, 0.0000)
LFSDH
CAPABLE
Solves a complex Hermitian positive definite system of linear equations given the RH R
factorization of the coefficient matrix.
Required Arguments
FACT Complex N by N matrix containing the factorization of the coefficient matrix A as
output from routine LFCDH/DLFCDH or LFTDH/DLFTDH. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
This routine computes the solution for a system of linear algebraic equations having a complex
Hermitian positive definite coefficient matrix. To compute the solution, the coefficient matrix
must first undergo an $R^H R$ factorization. This may be done by calling either LFCDH or LFTDH.
$R$ is an upper triangular matrix.
The solution to $Ax = b$ is found by solving the triangular systems $R^H y = b$ and $Rx = y$.
LFSDH and LFIDH both solve a linear system given its $R^H R$ factorization. LFIDH generally takes
more time and produces a more accurate answer than LFSDH. Each iteration of the iterative
refinement algorithm used by LFIDH calls LFSDH.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see
Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this
manual.
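The two triangular solves described above can be sketched in plain Fortran without the library. This is a minimal illustration, not the IMSL implementation: the program name and the variables r, y and xref are hypothetical, the factorization loop is the textbook Cholesky recurrence, and the test system is the five-equation example used for LSLDH earlier in this chapter, whose documented solution serves as the reference.

```fortran
program rhr_solve_sketch
  ! Illustrative only: factor A = R^H R, then solve R^H y = b and R x = y.
  implicit none
  integer, parameter :: n = 5
  complex :: a(n,n), r(n,n), b(n), x(n), y(n), xref(n)
  integer :: i, j

  ! Upper triangle of the Hermitian positive definite example matrix.
  a = (0.0, 0.0)
  a(1,1) = (2.0, 0.0);   a(1,2) = (-1.0, 1.0)
  a(2,2) = (4.0, 0.0);   a(2,3) = (1.0, 2.0)
  a(3,3) = (10.0, 0.0);  a(3,4) = (0.0, 4.0)
  a(4,4) = (6.0, 0.0);   a(4,5) = (1.0, 1.0)
  a(5,5) = (9.0, 0.0)
  b    = [ (1.0, 5.0), (12.0, -6.0), (1.0, -16.0), (-3.0, -3.0), (25.0, 16.0) ]
  ! Solution printed in the manual for this system.
  xref = [ (2.0, 1.0), (3.0, 0.0), (-1.0, -1.0), (0.0, -2.0), (3.0, 2.0) ]

  ! Factor A = R^H R using only the upper triangle of A.
  ! dot_product conjugates its first argument for complex data.
  r = (0.0, 0.0)
  do j = 1, n
     do i = 1, j - 1
        r(i,j) = (a(i,j) - dot_product(r(1:i-1,i), r(1:i-1,j))) / r(i,i)
     end do
     r(j,j) = sqrt(real(a(j,j)) - real(dot_product(r(1:j-1,j), r(1:j-1,j))))
  end do

  ! Forward substitution: solve R^H y = b (R^H is lower triangular).
  do i = 1, n
     y(i) = (b(i) - dot_product(r(1:i-1,i), y(1:i-1))) / r(i,i)
  end do

  ! Back substitution: solve R x = y.
  do i = n, 1, -1
     x(i) = (y(i) - sum(r(i,i+1:n) * x(i+1:n))) / r(i,i)
  end do

  do i = 1, n
     print '(a,i0,a,f8.4,a,f8.4,a)', 'x(', i, ') = (', real(x(i)), ',', aimag(x(i)), ')'
  end do

  ! Self-check against the documented solution.
  if (maxval(abs(x - xref)) > 1.0e-4) error stop 'solution differs from documented output'
end program rhr_solve_sketch
```

Because the diagonal of $R$ is real and positive, dividing by r(i,i) serves for both $R^H$ and $R$. Once r is available, each further right-hand side costs only the two substitution loops, which is why the factorization is kept separate from the solve.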
Comments
Informational error
Type  Code
  4     1   The input matrix is singular.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call
to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (Utilities) has been
made. See the ScaLAPACK Example below.
Example
A set of linear systems is solved successively. LFTDH is called to factor the coefficient matrix.
LFSDH is called to compute the three solutions for the three right-hand sides. In this case, the
coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be
better to call LFCDH to perform the factorization, and LFIDH to compute the solutions.
USE LFSDH_INT
USE LFTDH_INT
USE WRCRN_INT
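!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=5, LDFACT=5, N=5)
      COMPLEX    A(LDA,LDA), B(N,3), FACT(LDFACT,LDFACT), X(N,3)
!
!                                 Set values for A and B
!
!     A = ( 2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!         (            4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!         (                      10.0+0.0i   0.0+4.0i   0.0+0.0i )
!         (                                  6.0+0.0i   1.0+1.0i )
!         (                                             9.0+0.0i )
!
!     B = (   3.0+3.0i    4.0+0.0i    29.0-9.0i  )
!         (   5.0-5.0i   15.0-10.0i  -36.0-17.0i )
!         (   5.0+4.0i  -12.0-56.0i  -15.0-24.0i )
!         (   9.0+7.0i  -12.0+10.0i  -23.0-15.0i )
!         ( -22.0+1.0i    3.0-1.0i   -23.0-28.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
              4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
              (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B /(3.0,3.0), (5.0,-5.0), (5.0,4.0), (9.0,7.0), (-22.0,1.0),&
              (4.0,0.0), (15.0,-10.0), (-12.0,-56.0), (-12.0,10.0),&
              (3.0,-1.0), (29.0,-9.0), (-36.0,-17.0), (-15.0,-24.0),&
              (-23.0,-15.0), (-23.0,-28.0)/
!                                 Factor the matrix A
      CALL LFTDH (A, FACT)
!                                 Compute the solutions
      DO 10 I=1, 3
         CALL LFSDH (FACT, B(:,I), X(:,I))
   10 CONTINUE
!                                 Print solutions
      CALL WRCRN ('X', X)
!
      END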
Output
                          X
                 1                2                3
1  (  1.00,  0.00)  (  3.00, -1.00)  ( 11.00, -1.00)
2  (  1.00, -2.00)  (  2.00,  0.00)  ( -7.00,  0.00)
3  (  2.00,  0.00)  ( -1.00, -6.00)  ( -2.00, -3.00)
4  (  2.00,  3.00)  (  2.00,  1.00)  ( -2.00, -3.00)
5  ( -3.00,  0.00)  (  0.00,  0.00)  ( -2.00, -3.00)
ScaLAPACK Example
The same set of linear systems as in the preceding example is solved successively as a
distributed computing example. LFTDH is called to factor the matrix. LFSDH is called to compute
the three solutions for the three right-hand sides. In this case, the coefficient matrix is assumed
to be well-conditioned and correctly scaled. Otherwise, it would be better to call LFCDH to perform
the factorization, and LFIDH to compute the solutions.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities)
used to map and unmap arrays to and from the processor grid. They are used here for brevity.
DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LFTDH_INT
USE LFSDH_INT
USE WRCRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    J, LDA, N, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      COMPLEX, ALLOCATABLE ::
      COMPLEX, ALLOCATABLE ::
      PARAMETER  (LDA=5, N=5)
!
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
!
         B(1,:) = (/(3.0, 3.0), ( 4.0, 0.0), ( 29.0, -9.0)/)
         B(2,:) = (/(5.0, -5.0), ( 15.0,-10.0), (-36.0,-17.0)/)
         B(3,:) = (/(5.0, 4.0), (-12.0,-56.0), (-15.0,-24.0)/)
         B(4,:) = (/(9.0, 7.0), (-12.0, 10.0), (-23.0,-15.0)/)
         B(5,:) = (/(-22.0,1.0), ( 3.0, -1.0), (-23.0,-28.0)/)
      ENDIF
Output
                          X
                 1                2                3
1  (  1.00,  0.00)  (  3.00, -1.00)  ( 11.00, -1.00)
2  (  1.00, -2.00)  (  2.00,  0.00)  ( -7.00,  0.00)
3  (  2.00,  0.00)  ( -1.00, -6.00)  ( -2.00, -3.00)
4  (  2.00,  3.00)  (  2.00,  1.00)  ( -2.00, -3.00)
5  ( -3.00,  0.00)  (  0.00,  0.00)  ( -2.00, -3.00)
LFIDH
CAPABLE
Uses iterative refinement to improve the solution of a complex Hermitian positive definite system
of linear equations.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the linear system. (Input)
Only the upper triangle of A is referenced.
FACT Complex N by N matrix containing the factorization of the coefficient matrix A as
output from routine LFCDH/DLFCDH or LFTDH/DLFTDH. (Input)
Only the upper triangle of FACT is used.
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution. (Output)
RES Complex vector of length N containing the residual vector at the improved solution.
(Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LFIDH computes the solution of a system of linear algebraic equations having a complex
Hermitian positive definite coefficient matrix. Iterative refinement is performed on the solution
vector to improve the accuracy. Usually almost all of the digits in the solution are accurate, even if
the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an $R^H R$ factorization. This may
be done by calling either LFCDH or LFTDH.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIDH and LFSDH both solve a linear system given its $R^H R$ factorization. LFIDH generally takes
more time and produces a more accurate answer than LFSDH. Each iteration of the iterative
refinement algorithm used by LFIDH calls LFSDH.
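The refinement loop referred to above follows the usual pattern sketched below. This is the generic textbook scheme, shown only to indicate where the LFSDH-style solve enters each iteration; the iteration index $k$, the correction $d^{(k)}$ and the starting guess are notation introduced here, and this section does not state the stopping rule or the precision in which the residual is accumulated.

```latex
x^{(0)} : \; R^H R \, x^{(0)} = b , \qquad
r^{(k)} = b - A x^{(k)} , \qquad
R^H R \, d^{(k)} = r^{(k)} , \qquad
x^{(k+1)} = x^{(k)} + d^{(k)} .
```

Each correction step reuses the existing factors, so one iteration costs a matrix-vector product plus one pair of triangular solves. The vector returned in RES corresponds to the residual at the final iterate.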
Comments
Informational error
Type  Code
  3     3   The input matrix is too ill-conditioned for iterative refinement to be
            effective.
A0 MXLDA by MXCOL complex local matrix containing the local portions of the
distributed matrix A. A contains the coefficient matrix of the linear system. (Input)
Only the upper triangle of A is referenced.
FACT0 MXLDA by MXCOL complex local matrix containing the local portions of the
distributed matrix FACT as output from routine LFCDH or LFTDH. FACT contains the
factorization of the matrix A. (Input)
Only the upper triangle of FACT is referenced.
B0 Complex local vector of length MXLDA containing the local portions of the distributed
vector B. B contains the right-hand side of the linear system. (Input)
X0 Complex local vector of length MXLDA containing the local portions of the distributed
vector X. X contains the solution to the linear system. (Output)
RES0 Complex local vector of length MXLDA containing the local portions of the
distributed vector RES. RES contains the residual vector at the improved solution to the
linear system. (Output)
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a call to
SCALAPACK_GETDIM (Chapter 11, Utilities) after a call to SCALAPACK_SETUP
(Chapter 11, Utilities) has been made. See the ScaLAPACK Example below.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed by adding
(1 + i)/2 to the second element after each call to LFIDH.
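      USE LFIDH_INT
      USE LFCDH_INT
      USE UMACH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=5, LDFACT=5, N=5)
      REAL       RCOND
      COMPLEX    A(LDA,LDA), B(N), FACT(LDFACT,LDFACT), RES(N,3), X(N,3)
!
!                                 Set values for A and B
!
!     A = ( 2.0+0.0i  -1.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!         (            4.0+0.0i   1.0+2.0i   0.0+0.0i   0.0+0.0i )
!         (                      10.0+0.0i   0.0+4.0i   0.0+0.0i )
!         (                                  6.0+0.0i   1.0+1.0i )
!         (                                             9.0+0.0i )
!
!     B = ( 3.0+3.0i  5.0-5.0i  5.0+4.0i  9.0+7.0i  -22.0+1.0i )
!
      DATA A /(2.0,0.0), 4*(0.0,0.0), (-1.0,1.0), (4.0,0.0),&
              4*(0.0,0.0), (1.0,2.0), (10.0,0.0), 4*(0.0,0.0),&
              (0.0,4.0), (6.0,0.0), 4*(0.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B /(3.0,3.0), (5.0,-5.0), (5.0,4.0), (9.0,7.0), (-22.0,1.0)/
!                                 Factor the matrix A
      CALL LFCDH (A, FACT, RCOND)
!                                 Solve the three systems, perturbing
!                                 the second element of B after each call
      DO 10 J=1, 3
         CALL LFIDH (A, FACT, B, X(:,J), RES(:,J))
         B(2) = B(2) + (0.5E0,0.5E0)
   10 CONTINUE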
!
!
!
!
99999 FORMAT (
END
RCOND = ,F5.3,/,
Output
RCOND < 0.07
L1 Condition number < 25.0
                            X
                 1                2                3
1  ( 1.000, 0.000)  ( 1.217, 0.000)  ( 1.433, 0.000)
2  ( 1.000,-2.000)  ( 1.217,-1.783)  ( 1.433,-1.567)
3  ( 2.000, 0.000)  ( 1.910, 0.030)  ( 1.820, 0.060)
4  ( 2.000, 3.000)  ( 1.979, 2.938)  ( 1.959, 2.876)
5  (-3.000, 0.000)  (-2.991, 0.005)  (-2.982, 0.009)
                                       RES
                         1                         2                         3
1  ( 1.192E-07, 0.000E+00)  ( 6.592E-08, 1.686E-07)  ( 1.318E-07, 2.010E-14)
2  ( 1.192E-07,-2.384E-07)  (-5.329E-08,-5.329E-08)  ( 1.318E-07,-2.258E-07)
3  ( 2.384E-07, 8.259E-08)  ( 2.390E-07,-3.309E-08)  ( 2.395E-07, 1.015E-07)
4  (-2.384E-07, 2.814E-14)  (-8.240E-08,-8.790E-09)  (-1.648E-07,-1.758E-08)
5  (-2.384E-07,-1.401E-08)  (-2.813E-07, 6.981E-09)  (-3.241E-07,-2.795E-08)
ScaLAPACK Example
As in the preceding example, a set of linear systems is solved successively as a distributed
computing example. The right-hand-side vector is perturbed by adding (1 + i)/2 to the second
element after each call to LFIDH. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility
routines (see Chapter 11, Utilities) used to map and unmap arrays to and from the processor
grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes
the descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LFCDH_INT
      USE LFIDH_INT
      USE UMACH_INT
      USE WRCRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE "mpif.h"
!                                 Declare variables
      INTEGER    J, LDA, N, NOUT, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL       RCOND
      COMPLEX, ALLOCATABLE ::   A(:,:), B(:), B0(:), RES(:,:), X(:,:)
      COMPLEX, ALLOCATABLE ::   A0(:,:), FACT0(:,:), X0(:), RES0(:)
      PARAMETER  (LDA=5, N=5)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), RES(N,3), X(N,3))
!                                 Set values for A and B
         A(1,:) = (/(2.0, 0.0),(-1.0, 1.0),( 0.0, 0.0),(0.0, 0.0),(0.0, 0.0)/)
         A(2,:) = (/(0.0, 0.0),( 4.0, 0.0),( 1.0, 2.0),(0.0, 0.0),(0.0, 0.0)/)
         A(3,:) = (/(0.0, 0.0),( 0.0, 0.0),(10.0, 0.0),(0.0, 4.0),(0.0, 0.0)/)
         A(4,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(6.0, 0.0),(1.0, 1.0)/)
         A(5,:) = (/(0.0, 0.0),( 0.0, 0.0),( 0.0, 0.0),(0.0, 0.0),(9.0, 0.0)/)
!
         B = (/(3.0, 3.0),(5.0,-5.0),(5.0, 4.0),(9.0, 7.0),(-22.0, 1.0)/)
      ENDIF
Output
RCOND < 0.07
L1 Condition number < 25.0
                            X
                 1                2                3
1  ( 1.000, 0.000)  ( 1.217, 0.000)  ( 1.433, 0.000)
2  ( 1.000,-2.000)  ( 1.217,-1.783)  ( 1.433,-1.567)
3  ( 2.000, 0.000)  ( 1.910, 0.030)  ( 1.820, 0.060)
4  ( 2.000, 3.000)  ( 1.979, 2.938)  ( 1.959, 2.876)
5  (-3.000, 0.000)  (-2.991, 0.005)  (-2.982, 0.009)
                                       RES
                         1                         2                         3
1  ( 1.192E-07, 0.000E+00)  ( 6.592E-08, 1.686E-07)  ( 1.318E-07, 2.010E-14)
2  ( 1.192E-07,-2.384E-07)  (-5.329E-08,-5.329E-08)  ( 1.318E-07,-2.258E-07)
3  ( 2.384E-07, 8.259E-08)  ( 2.390E-07,-3.309E-08)  ( 2.395E-07, 1.015E-07)
4  (-2.384E-07, 2.814E-14)  (-8.240E-08,-8.790E-09)  (-1.648E-07,-1.758E-08)
5  (-2.384E-07,-1.401E-08)  (-2.813E-07, 6.981E-09)  (-3.241E-07,-2.795E-08)
LFDDH
Computes the determinant of a complex Hermitian positive definite matrix given the $R^H R$
Cholesky factorization of the matrix.
Required Arguments
FACT Complex N by N matrix containing the $R^H R$ factorization of the coefficient matrix A
as output from routine LFCDH/DLFCDH or LFTDH/DLFTDH. (Input)
DET1 Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDDH computes the determinant of a complex Hermitian positive definite coefficient
matrix. To compute the determinant, the coefficient matrix must first undergo an $R^H R$
factorization. This may be done by calling either LFCDH or LFTDH. The formula
$\det A = \det R^H \det R = (\det R)^2$ is used to compute the determinant. Since the determinant of a
triangular matrix is the product of the diagonal elements,
$$\det R = \prod_{i=1}^{N} R_{ii}$$
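The determinant formula above can be illustrated without the library. The following is a minimal sketch in standard Fortran, not the LFTDH/LFDDH implementation: the module `chol_det_sketch` and its procedures `chol_upper` and `det_from_chol` are hypothetical names introduced here, the factorization is a textbook unpivoted Cholesky with no error checking, and the matrix is the 3 × 3 example matrix of this section. The determinant is accumulated as a mantissa and a power of ten, following the DET1 and DET2 convention described under Required Arguments.

```fortran
module chol_det_sketch
  implicit none
contains
  ! Upper triangular factor R with A = R^H R (hypothetical helper, no pivoting,
  ! no check that A is positive definite).
  subroutine chol_upper(a, r)
    complex, intent(in)  :: a(:,:)
    complex, intent(out) :: r(:,:)
    integer :: i, j, n
    n = size(a, 1)
    r = (0.0, 0.0)
    do j = 1, n
      do i = 1, j - 1
        ! dot_product conjugates its first argument, as R^H R requires
        r(i,j) = (a(i,j) - dot_product(r(1:i-1,i), r(1:i-1,j))) / real(r(i,i))
      end do
      r(j,j) = sqrt(real(a(j,j)) - real(dot_product(r(1:j-1,j), r(1:j-1,j))))
    end do
  end subroutine chol_upper

  ! det(A) = (product of R(i,i))**2, returned as det1 * 10**det2
  ! with 1.0 <= |det1| < 10.0 or det1 = 0.0.
  subroutine det_from_chol(r, det1, det2)
    complex, intent(in) :: r(:,:)
    real, intent(out)   :: det1, det2
    integer :: i
    det1 = 1.0
    det2 = 0.0
    do i = 1, size(r, 1)
      det1 = det1 * real(r(i,i))**2
      do while (abs(det1) >= 10.0)
        det1 = det1 / 10.0
        det2 = det2 + 1.0
      end do
      do while (abs(det1) < 1.0 .and. det1 /= 0.0)
        det1 = det1 * 10.0
        det2 = det2 - 1.0
      end do
    end do
  end subroutine det_from_chol
end module chol_det_sketch

program demo_chol_det
  use chol_det_sketch
  implicit none
  complex :: a(3,3), r(3,3)
  real    :: det1, det2
  a(1,:) = [ (6.0, 0.0), ( 1.0,-1.0), ( 4.0, 0.0) ]
  a(2,:) = [ (1.0, 1.0), ( 7.0, 0.0), (-5.0, 1.0) ]
  a(3,:) = [ (4.0, 0.0), (-5.0,-1.0), (11.0, 0.0) ]
  call chol_upper(a, r)
  call det_from_chol(r, det1, det2)
  print '(a,f6.3,a,f3.0)', ' The determinant of A is ', det1, ' * 10**', det2
end program demo_chol_det
```

In single precision the mantissa should come out close to 1.4 with exponent 2, which agrees with the determinant 140 obtained by direct cofactor expansion of this matrix.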
Example
The determinant is computed for a complex Hermitian positive definite 3 × 3 matrix.
      USE LFDDH_INT
      USE LFTDH_INT
      USE UMACH_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, NOUT
      PARAMETER  (LDA=3, LDFACT=3)
      REAL       DET1, DET2
      COMPLEX    A(LDA,LDA), FACT(LDFACT,LDFACT)
!
!                                 A = (  6.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   7.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  11.0+0.0i )
!
99999 FORMAT (' The determinant of A is ',F6.3,' * 10**',F2.0)
      END
Output
The determinant of A is 1.400 * 10**2.
LSAHF
Solves a complex Hermitian system of linear equations with iterative refinement.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the Hermitian linear system.
(Input)
Only the upper triangle of A is referenced.
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSAHF solves systems of linear algebraic equations having a complex Hermitian
indefinite coefficient matrix. It first uses the routine LFCHF to compute a $U D U^H$ factorization of
the coefficient matrix and to estimate the condition number of the matrix. D is a block diagonal
matrix with blocks of order 1 or 2 and U is a matrix composed of the product of a permutation
matrix and a unit upper triangular matrix. The solution of the linear system is then found using the
iterative refinement routine LFIHF.
LSAHF fails if a block in D is singular or if the iterative refinement algorithm fails to converge.
These errors occur only if A is singular or very close to a singular matrix.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system. LSAHF solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
Comments
1.
2. Informational errors
Type Code
3    1
3    4
4    2
4    4
3.
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2AHF the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSAHF.
Additional memory allocation for FACT and option value restoration are done
automatically in LSAHF. Users directly calling L2AHF can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSAHF or L2AHF. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be computed.
Routine LSAHF temporarily replaces IVAL(2) by IVAL(1). The routine L2CHF
computes the condition number if IVAL(2) = 2. Otherwise L2CHF skips this
computation. LSAHF restores the option. Default values for the option are
IVAL(*) = 1, 2.
Example
A system of three linear equations is solved. The coefficient matrix has complex Hermitian form
and the right-hand-side vector b has three elements.
      USE LSAHF_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
      CALL LSAHF (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END
Output
                           X
              1                  2                  3
(  2.00,  1.00)    (-10.00, -1.00)    (  3.00,  5.00)
LSLHF
Solves a complex Hermitian system of linear equations without iterative refinement.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the Hermitian linear system.
(Input)
Only the upper triangle of A is referenced.
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLHF solves systems of linear algebraic equations having a complex Hermitian
indefinite coefficient matrix. It first uses the routine LFCHF to compute a $U D U^H$ factorization of
the coefficient matrix. D is a block diagonal matrix with blocks of order 1 or 2 and U is a matrix
composed of the product of a permutation matrix and a unit upper triangular matrix.
The solution of the linear system is then found using the routine LFSHF. LSLHF fails if a block in
D is singular. This occurs only if A is singular or very close to a singular matrix. If the coefficient
matrix is ill-conditioned or poorly scaled, it is recommended that LSAHF be used.
Comments
1.
2. Informational errors
Type Code
3    1
3    4
4    2
4    4
3.
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2LHF the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLHF.
Additional memory allocation for FACT and option value restoration are done
automatically in LSLHF. Users directly calling L2LHF can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSLHF or L2LHF. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSLHF temporarily replaces IVAL(2) by IVAL(1). The
routine L2CHF computes the condition number if IVAL(2) = 2. Otherwise L2CHF
skips this computation. LSLHF restores the option. Default values for the option
are IVAL(*) = 1, 2.
Example
A system of three linear equations is solved. The coefficient matrix has complex Hermitian form
and the right-hand-side vector b has three elements.
      USE LSLHF_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      COMPLEX    A(LDA,LDA), B(N), X(N)
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
      CALL LSLHF (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END
Output
                           X
              1                  2                  3
(  2.00,  1.00)    (-10.00, -1.00)    (  3.00,  5.00)
LFCHF
Computes the $U D U^H$ factorization of a complex Hermitian matrix and estimates its L1 condition
number.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the Hermitian linear system.
(Input)
Only the upper triangle of A is referenced.
FACT Complex N by N matrix containing the information about the factorization of the
Hermitian matrix A. (Output)
Only the upper triangle of FACT is used. If A is not needed, A and FACT can share the
same storage locations.
IPVT Vector of length N containing the pivoting information for the factorization.
(Output)
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFCHF performs a $U D U^H$ factorization of a complex Hermitian indefinite coefficient
matrix. It also estimates the condition number of the matrix. The $U D U^H$ factorization is called the
diagonal pivoting factorization.
The L1 condition number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to
compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm is the same as
used by LINPACK and is described by Cline et al. (1979).
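For a small matrix the definition can be evaluated directly. The sketch below is not the LINPACK estimator used by LFCHF: it simply forms the largest column sum of absolute values for the 3 × 3 matrix of the example in this section and for its inverse as printed (to four decimals) in that example's output. The module `norm1_sketch` and function `norm1` are hypothetical names introduced for illustration.

```fortran
module norm1_sketch
  implicit none
contains
  ! L1 matrix norm: the largest column sum of absolute values.
  real function norm1(m)
    complex, intent(in) :: m(:,:)
    norm1 = maxval(sum(abs(m), dim=1))
  end function norm1
end module norm1_sketch

program demo_l1_cond
  use norm1_sketch
  implicit none
  complex :: a(3,3), ainv(3,3)
  real    :: cond
  ! Coefficient matrix of the example
  a(1,:) = [ (3.0, 0.0), ( 1.0,-1.0), ( 4.0, 0.0) ]
  a(2,:) = [ (1.0, 1.0), ( 2.0, 0.0), (-5.0, 1.0) ]
  a(3,:) = [ (4.0, 0.0), (-5.0,-1.0), (-2.0, 0.0) ]
  ! Inverse as printed in the example output (rounded to four decimals)
  ainv(1,:) = [ (0.2000, 0.0000), ( 0.1200, 0.0400), ( 0.0800,-0.0400) ]
  ainv(2,:) = [ (0.1200,-0.0400), ( 0.1467, 0.0000), (-0.1267,-0.0067) ]
  ainv(3,:) = [ (0.0800, 0.0400), (-0.1267, 0.0067), (-0.0267, 0.0000) ]
  cond = norm1(a) * norm1(ainv)
  print '(a,f6.3)', ' L1 condition number from the definition: ', cond
  print '(a,f6.3)', ' Its reciprocal:                          ', 1.0 / cond
end program demo_l1_cond
```

With these rounded entries the product is roughly 4.6, which is consistent with the bounds RCOND < 0.25 and L1 condition number < 6.0 reported in the example output.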
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
LFCHF fails if A is singular or very close to a singular matrix.
The $U D U^H$ factors are returned in a form that is compatible with routines LFIHF, LFSHF and
LFDHF. To solve systems of equations with multiple right-hand-side vectors, use LFCHF followed
by either LFIHF or LFSHF called once for each right-hand side. The routine LFDHF can be called
to compute the determinant of the coefficient matrix after LFCHF has performed the factorization.
The underlying code is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2. Informational errors
Type Code
3    1
3    4
4    2
4    4
Example
The inverse of a 3 × 3 complex Hermitian matrix is computed. LFCHF is called to factor the matrix
and to check for singularity or ill-conditioning. LFIHF is called to determine the columns of the
inverse.
      USE LFCHF_INT
      USE UMACH_INT
      USE LFIHF_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       RCOND
      COMPLEX    A(LDA,LDA), AINV(LDA,N), FACT(LDA,LDA), RJ(N), RES(N)
!                                 Set values for A
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
99999 FORMAT (' RCOND = ',F5.3,/,
      END
Output
RCOND < 0.25
L1 Condition number < 6.0
                               AINV
                  1                   2                   3
1  ( 0.2000, 0.0000)   ( 0.1200, 0.0400)   ( 0.0800,-0.0400)
2  ( 0.1200,-0.0400)   ( 0.1467, 0.0000)   (-0.1267,-0.0067)
3  ( 0.0800, 0.0400)   (-0.1267, 0.0067)   (-0.0267, 0.0000)
LFTHF
Computes the $U D U^H$ factorization of a complex Hermitian matrix.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the Hermitian linear system.
(Input)
Only the upper triangle of A is referenced.
FACT Complex N by N matrix containing the information about the factorization of the
Hermitian matrix A. (Output)
Only the upper triangle of FACT is used. If A is not needed, A and FACT can share the
same storage locations.
IPVT Vector of length N containing the pivoting information for the factorization.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFTHF performs a $U D U^H$ factorization of a complex Hermitian indefinite coefficient
matrix. The $U D U^H$ factorization is called the diagonal pivoting factorization.
LFTHF fails if A is singular or very close to a singular matrix.
The $U D U^H$ factors are returned in a form that is compatible with routines LFIHF, LFSHF and
LFDHF. To solve systems of equations with multiple right-hand-side vectors, use LFTHF followed
by either LFIHF or LFSHF called once for each right-hand side. The routine LFDHF can be called
to compute the determinant of the coefficient matrix after LFTHF has performed the factorization.
The underlying code is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
Informational errors
Type Code
3    4    The input matrix is not Hermitian. It has a diagonal entry with a small
          imaginary part.
4    2    The input matrix is singular.
4    4    The input matrix is not Hermitian. It has a diagonal entry with an imaginary
          part.
Example
The inverse of a 3 × 3 matrix is computed. LFTHF is called to factor the matrix and check for
singularity. LFSHF is called to determine the columns of the inverse.
      USE LFTHF_INT
      USE LFSHF_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N)
      COMPLEX    A(LDA,LDA), AINV(LDA,N), FACT(LDA,LDA), RJ(N)
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
Output
                               AINV
                  1                   2                   3
1  ( 0.2000, 0.0000)   ( 0.1200, 0.0400)   ( 0.0800,-0.0400)
2  ( 0.1200,-0.0400)   ( 0.1467, 0.0000)   (-0.1267,-0.0067)
3  ( 0.0800, 0.0400)   (-0.1267, 0.0067)   (-0.0267, 0.0000)
LFSHF
Solves a complex Hermitian system of linear equations given the $U D U^H$ factorization of the
coefficient matrix.
Required Arguments
FACT Complex N by N matrix containing the factorization of the coefficient matrix A as
output from routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
Only the upper triangle of FACT is used.
IPVT Vector of length N containing the pivoting information for the factorization of A as
output from routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFSHF computes the solution of a system of linear algebraic equations having a complex
Hermitian indefinite coefficient matrix.
To compute the solution, the coefficient matrix must first undergo a $U D U^H$ factorization. This
may be done by calling either LFCHF or LFTHF.
LFSHF and LFIHF both solve a linear system given its $U D U^H$ factorization. LFIHF generally takes
more time and produces a more accurate answer than LFSHF. Each iteration of the iterative
refinement algorithm used by LFIHF calls LFSHF.
The underlying code is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Example
A set of linear systems is solved successively. LFTHF is called to factor the coefficient matrix.
LFSHF is called to compute the three solutions for the three right-hand sides. In this case the
coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be
better to call LFCHF to perform the factorization, and LFIHF to compute the solutions.
      USE LFSHF_INT
      USE WRCRN_INT
      USE LFTHF_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), I
      COMPLEX    A(LDA,LDA), B(N,3), X(N,3), FACT(LDA,LDA)
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
Output
                              X
                 1                  2                  3
1  (  2.00,  1.00)    (  1.00,  0.00)    (  0.00, -1.00)
2  (-10.00, -1.00)    ( -3.00, -4.00)    (  0.00, -2.00)
3  (  3.00,  5.00)    ( -0.50,  3.00)    (  0.00, -3.00)
LFIHF
Uses iterative refinement to improve the solution of a complex Hermitian system of linear
equations.
Required Arguments
A Complex N by N matrix containing the coefficient matrix of the Hermitian linear system.
(Input)
Only the upper triangle of A is referenced.
FACT Complex N by N matrix containing the factorization of the coefficient matrix A as
output from routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
Only the upper triangle of FACT is used.
IPVT Vector of length N containing the pivoting information for the factorization of A as
output from routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution. (Output)
RES Complex vector of length N containing the residual vector at the improved solution.
(Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFIHF computes the solution of a system of linear algebraic equations having a complex
Hermitian indefinite coefficient matrix.
Iterative refinement is performed on the solution vector to improve the accuracy. Usually almost
all of the digits in the solution are accurate, even if the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo a $U D U^H$ factorization. This
may be done by calling either LFCHF or LFTHF.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIHF and LFSHF both solve a linear system given its $U D U^H$ factorization. LFIHF generally takes
more time and produces a more accurate answer than LFSHF. Each iteration of the iterative
refinement algorithm used by LFIHF calls LFSHF.
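The relationship between a plain solve and iterative refinement can be sketched in a few lines of standard Fortran. This is an illustration only and not the LFIHF algorithm: the hypothetical function `solve_sp` is a dense Gaussian elimination with partial pivoting that stands in for the factor and solve steps (it does not exploit Hermitian structure), and the hypothetical subroutine `refine` performs one correction step with the residual accumulated in double precision. The right-hand side is built from the solution printed for problem 1 of the example in this section.

```fortran
module refine_sketch
  implicit none
  integer, parameter :: sp = kind(1.0), dp = kind(1.0d0)
contains
  ! Single precision dense solve by Gaussian elimination with partial pivoting.
  function solve_sp(a, b) result(x)
    complex(sp), intent(in) :: a(:,:), b(:)
    complex(sp) :: x(size(b)), w(size(b), size(b)+1), row(size(b)+1)
    integer :: i, k, n, p
    n = size(b)
    w(:, 1:n) = a
    w(:, n+1) = b
    do k = 1, n
      p = k - 1 + maxloc(abs(w(k:n, k)), dim=1)   ! pivot row
      row = w(k,:);  w(k,:) = w(p,:);  w(p,:) = row
      do i = k + 1, n
        w(i, k:) = w(i, k:) - (w(i,k) / w(k,k)) * w(k, k:)
      end do
    end do
    do k = n, 1, -1                               ! back substitution
      x(k) = (w(k, n+1) - sum(w(k, k+1:n) * x(k+1:n))) / w(k,k)
    end do
  end function solve_sp

  ! One refinement step: res = b - A*x in double precision, then x = x + A^{-1} res.
  subroutine refine(a, b, x, res)
    complex(sp), intent(in)    :: a(:,:), b(:)
    complex(sp), intent(inout) :: x(:)
    complex(dp), intent(out)   :: res(:)
    res = cmplx(b, kind=dp) - matmul(cmplx(a, kind=dp), cmplx(x, kind=dp))
    x = x + solve_sp(a, cmplx(res, kind=sp))
  end subroutine refine
end module refine_sketch

program demo_refine
  use refine_sketch
  implicit none
  complex(sp) :: a(3,3), b(3), x(3)
  complex(dp) :: res(3)
  integer :: it
  a(1,:) = [ (3.0, 0.0), ( 1.0,-1.0), ( 4.0, 0.0) ]
  a(2,:) = [ (1.0, 1.0), ( 2.0, 0.0), (-5.0, 1.0) ]
  a(3,:) = [ (4.0, 0.0), (-5.0,-1.0), (-2.0, 0.0) ]
  ! Right-hand side formed from the solution printed for problem 1
  b = matmul(a, [ (2.0, 1.0), (-10.0,-1.0), (3.0, 5.0) ])
  x = solve_sp(a, b)
  do it = 1, 2
    call refine(a, b, x, res)
  end do
  print '(3(" (",f7.2,",",f7.2,")"))', x
  print '(a,es10.3)', ' max |residual| before the last correction:', maxval(abs(res))
end program demo_refine
```

Each pass costs one extra solve, which is why a refining routine generally takes more time than a single solve; the residual is the quantity returned in RES.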
Comments
Informational error
Type Code
3    3    The input matrix is too ill-conditioned for iterative refinement to be
          effective.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving
the system each of the first two times by adding 0.2 + 0.2i to the second element.
      USE LFIHF_INT
      USE UMACH_INT
      USE LFCHF_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       RCOND
      COMPLEX    A(LDA,LDA), B(N), X(N), FACT(LDA,LDA), RES(N)
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
99998 FORMAT (' RCOND = ',F5.3,/,' L1 Condition number = ',F6.3)
99999 FORMAT (//,' For problem ', I1)
      END
Output
RCOND < 0.25
L1 Condition number < 5.0

For problem 1
                           X
              1                  2                  3
(  2.00,  1.00)    (-10.00, -1.00)    (  3.00,  5.00)
                                     RES
                      1                          2                          3
( 2.384E-07,-4.768E-07)    ( 0.000E+00,-3.576E-07)    (-1.421E-14, 1.421E-14)

For problem 2
                           X
              1                  2                  3
( 2.016, 1.032)    (-9.971,-0.971)    ( 2.973, 4.976)
                                     RES
                      1                          2                          3
( 2.098E-07,-1.764E-07)    ( 6.231E-07,-1.518E-07)    ( 1.272E-07, 4.005E-07)

For problem 3
                           X
              1                  2                  3
( 2.032, 1.064)    (-9.941,-0.941)    ( 2.947, 4.952)
                                     RES
                      1                          2                          3
( 4.196E-07,-3.529E-07)    ( 2.925E-07,-3.632E-07)    ( 2.543E-07, 3.242E-07)
LFDHF
Computes the determinant of a complex Hermitian matrix given the $U D U^H$ factorization of the
matrix.
Required Arguments
FACT Complex N by N matrix containing the factorization of the coefficient matrix A as
output from routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
Only the upper triangle of FACT is used.
IPVT Vector of length N containing the pivoting information for the factorization of A as
output from routine LFCHF/DLFCHF or LFTHF/DLFTHF. (Input)
DET1 Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that 1.0 ≤ |DET1| < 10.0 or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form det(A) = DET1 * 10**DET2.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDHF computes the determinant of a complex Hermitian indefinite coefficient matrix. To
compute the determinant, the coefficient matrix must first undergo a $U D U^H$ factorization. This
may be done by calling either LFCHF or LFTHF. Since $\det U = \pm 1$, the formula
$\det A = \det U \det D \det U^H = \det D$ is used to compute the determinant. $\det D$ is computed as the
product of the determinants of its blocks.
LFDHF is based on the LINPACK routine CSIDI; see Dongarra et al. (1979).
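The block-by-block product can be sketched as follows. This is not the CSIDI code and it does not read the FACT and IPVT arrays produced by the library: the function `block_det`, the explicit list of block orders `blk`, and the matrix D used below are all hypothetical and chosen only to show how blocks of order 1 and 2 contribute to the determinant of a Hermitian block diagonal matrix.

```fortran
module block_det_sketch
  implicit none
contains
  ! Determinant of a Hermitian block diagonal matrix d whose diagonal blocks
  ! have the orders listed in blk (each entry is 1 or 2).
  real function block_det(d, blk)
    complex, intent(in) :: d(:,:)
    integer, intent(in) :: blk(:)
    integer :: b, k
    block_det = 1.0
    k = 1
    do b = 1, size(blk)
      if (blk(b) == 1) then
        block_det = block_det * real(d(k,k))
      else
        ! 2 by 2 Hermitian block: d11*d22 - |d12|**2 is real
        block_det = block_det * (real(d(k,k)) * real(d(k+1,k+1)) - abs(d(k,k+1))**2)
      end if
      k = k + blk(b)
    end do
  end function block_det
end module block_det_sketch

program demo_block_det
  use block_det_sketch
  implicit none
  complex :: d(3,3)
  ! Hypothetical D: one block of order 1 followed by one block of order 2
  d = (0.0, 0.0)
  d(1,1) = (2.0, 0.0)
  d(2,2) = (1.0, 0.0);  d(2,3) = (2.0, 1.0)
  d(3,2) = (2.0,-1.0);  d(3,3) = (3.0, 0.0)
  print '(a,f6.2)', ' det D =', block_det(d, [1, 2])
end program demo_block_det
```

For this D the two block determinants are 2 and 1*3 - |2+i|**2 = -2, so the product is close to -4. A negative result is possible because the matrix is indefinite.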
Example
The determinant is computed for a complex Hermitian 3 × 3 matrix.
      USE LFDHF_INT
      USE LFTHF_INT
      USE UMACH_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      INTEGER    IPVT(N), NOUT
      REAL       DET1, DET2
      COMPLEX    A(LDA,LDA), FACT(LDA,LDA)
!                                 Set values for A
!
!                                 A = (  3.0+0.0i   1.0-1.0i   4.0+0.0i )
!                                     (  1.0+1.0i   2.0+0.0i  -5.0+1.0i )
!                                     (  4.0+0.0i  -5.0-1.0i  -2.0+0.0i )
!
99999 FORMAT (' The determinant is', F5.1, ' * 10**', F2.0)
      END
Output
The determinant is -1.5 * 10**2.
LSLTR
Solves a real tridiagonal system of linear equations.
Required Arguments
C Vector of length N containing the subdiagonal of the tridiagonal matrix in C(2) through
C(N). (Input/Output)
On output C is destroyed.
Optional Arguments
N Order of the tridiagonal matrix. (Input)
Default: N = size (C,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLTR factors and solves the real tridiagonal linear system Ax = b. LSLTR is intended
just for tridiagonal systems. The coefficient matrix does not have to be symmetric. The algorithm
is Gaussian elimination with partial pivoting for numerical stability. See Dongarra (1979),
LINPACK subprograms SGTSL/DGTSL, for details. When computing on vector or parallel
computers the cyclic reduction algorithm, LSLCR, should be considered as an alternative method
to solve the system.
Comments
Informational error
Type Code
4    2    An element along the diagonal became exactly zero during execution.
Example
A system of n = 4 linear equations is solved.
      USE LSLTR_INT
      USE WRRRL_INT
!                                 Declaration of variables
      INTEGER    N
      PARAMETER  (N=4)
!
      REAL
      CHARACTER
!
      DATA FMT/'(E13.6)'/
      DATA CLABEL/'NUMBER'/
      DATA RLABEL/'NONE'/
!
Output
                        Solution:
            1               2               3               4
 0.400000E+01   -0.800000E+01   -0.700000E+01    0.900000E+01
LSLCR
Computes the $LDU$ factorization of a real tridiagonal matrix A using a cyclic reduction algorithm.
Required Arguments
C Array of size 2N containing the upper codiagonal of the N by N tridiagonal matrix in the
entries C(1), …, C(N − 1). (Input/Output)
A Array of size 2N containing the diagonal of the N by N tridiagonal matrix in the entries
A(1), …, A(N). (Input/Output)
B Array of size 2N containing the lower codiagonal of the N by N tridiagonal matrix in the
entries B(1), …, B(N − 1). (Input/Output)
Y Array of size 2N containing the right-hand side for the system Ax = y in the order Y(1),
…, Y(N). (Input/Output) The vector x overwrites Y in storage.
U Array of size 2N of flags that indicate any singularities of A. (Output)
A value U(I) = 1. means that a divide by zero would have occurred during the factoring.
Otherwise U(I) = 0.
IR Array of integers that determine the sizes of loops performed in the cyclic reduction
algorithm. (Output)
IS Array of integers that determine the sizes of loops performed in the cyclic reduction
algorithm. (Output)
The sizes of IR and IS must be at least $\log_2(N) + 3$.
Optional Arguments
N Order of the matrix. (Input)
N must be greater than zero.
Default: N = size (C,1).
IJOB Flag to direct the desired factoring or solving step. (Input)
Default: IJOB = 1.
IJOB      Action
1         Factor the matrix A and solve the system Ax = y, where y is stored in array Y.
2         Do the solve step only. Use y from array Y. (The factoring step has already
          been done.)
4, 5, 6   Same meaning as with the value IJOB = 3. For efficiency, no error checking
          is done on the validity of any input value.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLCR factors and solves the real tridiagonal linear system Ax = y. The matrix is
decomposed in the form $A = LDU$, where L is unit lower triangular, U is unit upper triangular,
and D is diagonal. The algorithm used for the factorization is effectively that described in Kershaw
(1982). More details, tests and experiments are reported in Hanson (1990).
LSLCR is intended just for tridiagonal systems. The coefficient matrix does not have to be
symmetric. The algorithm amounts to Gaussian elimination, with no pivoting for numerical
stability, on the matrix whose rows and columns are permuted to a new order. See Hanson (1990)
for details. The expectation is that LSLCR will outperform either LSLTR or LSLPB on vector or
parallel computers. Its performance may be inferior for small values of n, on scalar computers, or
high-performance computers with non-optimizing compilers.
Example
A system of n = 1000 linear equations is solved. The coefficient matrix is the symmetric matrix of
the second difference operation, and the right-hand-side vector y is the first column of the identity
matrix. Note that $a_{n,n} = 1$. The solution vector will be the first column of the inverse matrix of A.
Then a new system is solved where y is now the last column of the identity matrix. The solution
vector for this system will be the last column of the inverse matrix.
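The two solves described above can be reproduced with a plain tridiagonal elimination. The sketch below assumes the second difference matrix has 2 on the diagonal and -1 on both codiagonals, with the last diagonal entry set to 1 as stated above. It eliminates in natural order without pivoting, so it is not the cyclic reduction algorithm of LSLCR and does not use that routine's 2N-length work arrays; the module `tridiag_sketch` and subroutine `tridiag_solve` are hypothetical names.

```fortran
module tridiag_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Solve a tridiagonal system by elimination in natural order (no pivoting).
  ! sub(2:n) is the subdiagonal, diag(1:n) the diagonal, sup(1:n-1) the
  ! superdiagonal. diag is overwritten by the pivots and rhs by the solution.
  subroutine tridiag_solve(sub, diag, sup, rhs)
    real(dp), intent(in)    :: sub(:), sup(:)
    real(dp), intent(inout) :: diag(:), rhs(:)
    real(dp) :: m
    integer  :: i, n
    n = size(diag)
    do i = 2, n
      m = sub(i) / diag(i-1)
      diag(i) = diag(i) - m * sup(i-1)
      rhs(i)  = rhs(i)  - m * rhs(i-1)
    end do
    rhs(n) = rhs(n) / diag(n)
    do i = n - 1, 1, -1
      rhs(i) = (rhs(i) - sup(i) * rhs(i+1)) / diag(i)
    end do
  end subroutine tridiag_solve
end module tridiag_sketch

program demo_second_difference
  use tridiag_sketch
  implicit none
  integer, parameter :: n = 1000
  real(dp) :: sub(n), diag(n), sup(n), y1(n), y2(n)
  sub = -1.0_dp
  sup = -1.0_dp
  ! First system: y is the first column of the identity matrix
  diag = 2.0_dp;  diag(n) = 1.0_dp
  y1 = 0.0_dp;    y1(1) = 1.0_dp
  call tridiag_solve(sub, diag, sup, y1)
  ! Second system: y is the last column; diag was overwritten, so reset it
  diag = 2.0_dp;  diag(n) = 1.0_dp
  y2 = 0.0_dp;    y2(n) = 1.0_dp
  call tridiag_solve(sub, diag, sup, y2)
  print '(a,i5)', ' The value of n is: ', n
  print '(a,f10.5,f12.3)', ' Elements 1, n of inverse matrix columns 1 and n:', &
        y1(1), y2(n)
end program demo_second_difference
```

For this matrix the inverse has entries min(i, j), so the two printed values should be close to 1 and 1000. Unlike this sketch, which repeats the elimination for the second right-hand side, LSLCR can reuse its factorization through the IJOB argument.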
      USE LSLCR_INT
      USE UMACH_INT
!                                 Declare variables
      INTEGER    LP, N, N2
      PARAMETER  (LP=12, N=1000, N2=2*N)
!
      INTEGER
      REAL
!
Output
The value of n is:       1000
Elements 1, n of inverse matrix columns 1 and n:    1.00000    1000.000
LSARB
Solves a real system of linear equations in band storage mode with iterative refinement.
Required Arguments
A (NLCA + NUCA + 1) by N array containing the N by N banded coefficient matrix in band
storage mode. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^T X = B$ is solved.
Default: IPATH =1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSARB solves a system of linear algebraic equations having a real banded coefficient
matrix. It first uses the routine LFCRB to compute an LU factorization of the coefficient matrix and
to estimate the condition number of the matrix. The solution of the linear system is then found
using the iterative refinement routine LFIRB.
LSARB fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the
iterative refinement algorithm fails to converge. These errors occur only if A is singular or very
close to a singular matrix.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system. LSARB solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
Comments
1.
2. Informational errors
Type Code
3
3.
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2ARB the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSARB.
Additional memory allocation for FACT and option value restoration are done
automatically in LSARB. Users directly calling L2ARB can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSARB or L2ARB. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSARB temporarily replaces IVAL(2) by IVAL(1). The
routine L2CRB computes the condition number if IVAL(2) = 2. Otherwise L2CRB
skips this computation. LSARB restores the option. Default values for the option
are IVAL(*) = 1, 2.
Example
A system of four linear equations is solved. The coefficient matrix has real banded form with 1
upper and 1 lower codiagonal. The right-hand-side vector b has four elements.
      USE LSARB_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, N, NLCA, NUCA
      PARAMETER  (LDA=3, N=4, NLCA=1, NUCA=1)
      REAL       A(LDA,N), B(N), X(N)
!                                 Set values for A in band form, and B
!
!                                 A = (  0.0  -1.0  -2.0   2.0)
!                                     (  2.0   1.0  -1.0   1.0)
!                                     ( -3.0   0.0   2.0   0.0)
!
!                                 B = (  3.0   1.0  11.0  -2.0)
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
            2.0, 1.0, 0.0/
      DATA B/3.0, 1.0, 11.0, -2.0/
!
      CALL LSARB (A, NLCA, NUCA, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
!
      END
Output
                  X
     1        2        3        4
 2.000    1.000   -3.000    4.000
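The band layout used in this example can be checked independently of the library. The sketch below assumes that element (i, j) of the full matrix is held in row i - j + NUCA + 1 of column j of the band array, which is the layout implied by the (NLCA + NUCA + 1) by N array shown in the example; the function `band_to_full` is a hypothetical helper, not a library routine. It expands the band array and confirms that the printed solution satisfies the system.

```fortran
module band_sketch
  implicit none
contains
  ! Expand band storage to a full N by N matrix, assuming element (i,j) of the
  ! full matrix is stored in band(i-j+nuca+1, j).
  function band_to_full(band, nlca, nuca) result(full)
    real, intent(in)    :: band(:,:)
    integer, intent(in) :: nlca, nuca
    real :: full(size(band,2), size(band,2))
    integer :: i, j, n
    n = size(band, 2)
    full = 0.0
    do j = 1, n
      do i = max(1, j - nuca), min(n, j + nlca)
        full(i,j) = band(i - j + nuca + 1, j)
      end do
    end do
  end function band_to_full
end module band_sketch

program demo_band
  use band_sketch
  implicit none
  integer, parameter :: n = 4, nlca = 1, nuca = 1
  real :: a(nlca+nuca+1, n), b(n), x(n)
  ! Same DATA values as the example above
  data a /0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0, 2.0, 1.0, 0.0/
  data b /3.0, 1.0, 11.0, -2.0/
  x = [2.0, 1.0, -3.0, 4.0]          ! solution printed in the Output
  print '(a,4f8.3)', ' A*x - b =', matmul(band_to_full(a, nlca, nuca), x) - b
end program demo_band
```

All four components of the printed difference should be zero, since the data are small integers. The zero entries in the first and last columns of the band array correspond to positions outside the matrix and are never read.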
LSLRB
Solves a real system of linear equations in band storage mode without iterative refinement.
Required Arguments
A (NLCA + NUCA + 1) by N array containing the N by N banded coefficient matrix in band
storage mode. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^T X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSLRB solves a system of linear algebraic equations having a real banded coefficient
matrix. It first uses the routine LFCRB to compute an LU factorization of the coefficient matrix and
to estimate the condition number of the matrix. The solution of the linear system is then found
using LFSRB. LSLRB fails if U, the upper triangular part of the factorization, has a zero diagonal
element. This occurs only if A is singular or very close to a singular matrix. If the estimated
condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning error is issued. This
indicates that very small changes in A can cause very large changes in the solution x. If the
coefficient matrix is ill-conditioned or poorly scaled, it is recommended that LSARB be used.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
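The band storage layout expected for A can be illustrated with the 4 x 4 matrix from the LSLRB example. The following is a minimal sketch in standard Fortran that does not call any IMSL routine. The index mapping (element (i,j) of the full matrix goes to row NUCA + 1 + i - j of column j) is inferred from the example's DATA statement rather than quoted from the manual's storage-mode definition, and the names full and band are hypothetical.

```fortran
program band_pack_demo
  implicit none
  integer, parameter :: n = 4, nlca = 1, nuca = 1
  integer, parameter :: lda = nlca + nuca + 1
  real :: full(n,n), band(lda,n)
  integer :: i, j

  ! Full form of the 4 x 4 matrix used in the LSLRB example
  full = 0.0
  full(1,1:2) = (/  2.0, -1.0 /)
  full(2,1:3) = (/ -3.0,  1.0, -2.0 /)
  full(3,3:4) = (/ -1.0,  2.0 /)
  full(4,3:4) = (/  2.0,  1.0 /)

  ! Assumed mapping: full(i,j) -> band(nuca+1+i-j, j).
  ! Corner positions with no matching matrix element stay zero.
  band = 0.0
  do j = 1, n
     do i = max(1, j - nuca), min(n, j + nlca)
        band(nuca + 1 + i - j, j) = full(i,j)
     end do
  end do

  ! Each printed row is one row of the (NLCA + NUCA + 1) by N array
  do i = 1, lda
     print '(4f7.1)', band(i,:)
  end do
end program band_pack_demo
```

Read column by column, the packed array reproduces the twelve values in the example's DATA statement for A: the upper codiagonal in row 1, the main diagonal in row 2, and the lower codiagonal in row 3.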
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2LRB the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLRB.
Additional memory allocation for FACT and option value restoration are done
automatically in LSLRB. Users directly calling L2LRB can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSLRB or L2LRB. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSLRB temporarily replaces IVAL(2) by IVAL(1). The
routine L2CRB computes the condition number if IVAL(2) = 2. Otherwise L2CRB
skips this computation. LSLRB restores the option. Default values for the option
are IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXCOL can be obtained through a call to
SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities) has been
made. See the ScaLAPACK Example below.
Example
A system of four linear equations is solved. The coefficient matrix has real banded form with 1
upper and 1 lower codiagonal. The right-hand-side vector b has four elements.
      USE LSLRB_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, N, NLCA, NUCA
      PARAMETER  (LDA=3, N=4, NLCA=1, NUCA=1)
      REAL       A(LDA,N), B(N), X(N)
!                                 Set values for A in band form, and B
!
!                                 A = (  0.0  -1.0  -2.0   2.0)
!                                     (  2.0   1.0  -1.0   1.0)
!                                     ( -3.0   0.0   2.0   0.0)
!
!                                 B = (  3.0   1.0  11.0  -2.0)
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
           2.0, 1.0, 0.0/
      DATA B/3.0, 1.0, 11.0, -2.0/
!
      CALL LSLRB (A, NLCA, NUCA, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
!
      END
Output
                  X
     1       2       3       4
 2.000   1.000  -3.000   4.000
ScaLAPACK Example
A system of six linear equations is solved as a distributed computing example. The
coefficient matrix has real banded form with 1 upper and 1 lower codiagonal. The right-hand-side
vector b has six elements. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines
(see Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They
are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the
descriptors for the local arrays.
      USE MPI_SETUP_INT
      USE LSLRB_INT
      USE WRRRN_INT
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    LDA, M, N, NLCA, NUCA, NRA, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA
      REAL, ALLOCATABLE ::  A(:,:), B(:), X(:)
      REAL, ALLOCATABLE ::  A0(:,:), B0(:), X0(:)
      PARAMETER  (LDA=3, N=6, NLCA=1, NUCA=1)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
         ALLOCATE (A(LDA,N), B(N), X(N))
!                                 Set values for A and B
         A(1,:) = (/  0.0,  0.0, -3.0,  0.0, -1.0, -3.0/)
         A(2,:) = (/ 10.0, 10.0, 15.0, 10.0,  1.0,  6.0/)
         A(3,:) = (/  0.0,  0.0,  0.0, -5.0,  0.0,  0.0/)
!
         B = (/ 10.0, 7.0, 45.0,
      ENDIF
      NRA = NLCA + NUCA + 1
      M = 2*NLCA + 2*NUCA + 1
!
Output
                          X
     1       2       3       4       5       6
 1.000   1.600   3.000   2.900  -4.000   5.167
LFCRB
Computes the LU factorization of a real matrix in band storage mode and estimates its L1 condition
number.
Required Arguments
A (NLCA + NUCA + 1) by N array containing the N by N matrix in band storage mode to be
factored. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
FACT (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the matrix A.
(Output)
If A is not needed, A can share the first (NLCA + NUCA + 1) * N locations with FACT.
IPVT Vector of length N containing the pivoting information for the LU factorization.
(Output)
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LFCRB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, RCOND)
Double:
Description
Routine LFCRB performs an LU factorization of a real banded coefficient matrix. It also estimates
the condition number of the matrix. The LU factorization is done using scaled partial pivoting.
Scaled partial pivoting differs from partial pivoting in that the pivoting strategy is the same as if
each row were scaled to have the same $\infty$-norm.
The L1 condition number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to
compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm is the same as
used by LINPACK and is described by Cline et al. (1979).
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
LFCRB fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
can occur only if A is singular or very close to a singular matrix. The LU factors are returned in a
form that is compatible with routines LFIRB, LFSRB and LFDRB. To solve systems of equations
with multiple right-hand-side vectors, use LFCRB followed by either LFIRB or LFSRB called once
for each right-hand side. The routine LFDRB can be called to compute the determinant of the
coefficient matrix after LFCRB has performed the factorization.
Let F be the matrix FACT, let $m_l$ = NLCA and let $m_u$ = NUCA. The first $m_l + m_u + 1$ rows of F contain
the triangular matrix U in band storage form. The lower $m_l$ rows of F contain the multipliers
needed to reconstruct $L^{-1}$.
The underlying code is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
WK)
2.
Informational errors
Type
Code
3
4
1
2
Example
The inverse of a $4 \times 4$ band matrix with one upper and one lower codiagonal is computed. LFCRB
is called to factor the matrix and to check for singularity or ill-conditioning. LFIRB is called to
determine the columns of the inverse.
      USE LFCRB_INT
      USE UMACH_INT
      USE LFIRB_INT
      USE WRRRN_INT
!
INTEGER
PARAMETER
INTEGER
REAL
!
!
!
!
!
Declare variables
LDA, LDFACT, N, NLCA, NUCA, NOUT
DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
!
!
!
!
!
!
99999 FORMAT (
END
RCOND = ,F5.3,/,
Output
RCOND < .07
L1 Condition number = 25.0
                AINV
         1        2        3        4
1   -1.000   -1.000    0.400   -0.800
2   -3.000   -2.000    0.800   -1.600
3    0.000    0.000   -0.200    0.400
4    0.000    0.000    0.400    0.200
LFTRB
Computes the LU factorization of a real matrix in band storage mode.
Required Arguments
A (NLCA + NUCA + 1) by N array containing the N by N matrix in band storage mode to be
factored. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine LFTRB performs an LU factorization of a real banded coefficient matrix using
Gaussian elimination with partial pivoting. A failure occurs if U, the upper triangular factor, has a
zero diagonal element. This can happen if A is close to a singular matrix. The LU factors are
returned in a form that is compatible with routines LFIRB, LFSRB and LFDRB. To solve systems of
equations with multiple right-hand-side vectors, use LFTRB followed by either LFIRB or LFSRB
called once for each right-hand side. The routine LFDRB can be called to compute the determinant
of the coefficient matrix after LFTRB has performed the factorization.
Let $m_l$ = NLCA, and let $m_u$ = NUCA. The first $m_l + m_u + 1$ rows of FACT contain the triangular
matrix U in band storage form. The next $m_l$ rows of FACT contain the multipliers needed to
produce L.
The routine LFTRB is based on the blocked LU factorization algorithm for banded linear
systems given in Du Croz, et al. (1990). Level-3 BLAS invocations were replaced by in-line loops.
The blocking factor nb has the default value 1 in LFTRB. It can be reset to any positive value not
exceeding 32.
Comments
1.
Informational error
Type
Code
4
3.
Example
A linear system with multiple right-hand sides is solved. LFTRB is called to factor the coefficient
matrix. LFSRB is called to compute the two solutions for the two right-hand sides. In this case the
coefficient matrix is assumed to be appropriately scaled. Otherwise, it may be better to call routine
LFCRB to perform the factorization, and LFIRB to compute the solutions.
      USE LFTRB_INT
      USE LFSRB_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), B(N,2), FACT(LDFACT,N), X(N,2)
!                                 Set values for A in band form, and B
!
!                                 A = (  0.0  -1.0  -2.0   2.0)
!                                     (  2.0   1.0  -1.0   1.0)
!                                     ( -3.0   0.0   2.0   0.0)
!
!                                 B = ( 12.0  -17.0)
!                                     (-19.0   23.0)
!                                     (  6.0    5.0)
!                                     (  8.0    5.0)
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
           2.0, 1.0, 0.0/
!
!
!
END
Output
           X
         1        2
1    3.000   -8.000
2   -6.000    1.000
3    2.000    1.000
4    4.000    3.000
LFSRB
Solves a real system of linear equations given the LU factorization of the coefficient matrix in
band storage mode.
Required Arguments
FACT (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the coefficient
matrix A as output from routine LFCRB/DLFCRB or LFTRB/DLFTRB. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization of A
as output from routine LFCRB/DLFCRB or LFTRB/DLFTRB. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFSRB computes the solution of a system of linear algebraic equations having a real
banded coefficient matrix. To compute the solution, the coefficient matrix must first undergo an
LU factorization. This may be done by calling either LFCRB or LFTRB. The solution to Ax = b is
found by solving the banded triangular systems Ly = b and Ux = y. The forward elimination step
consists of solving the system Ly = b by applying the same permutations and elimination
operations to b that were applied to the columns of A in the factorization routine. The backward
substitution step consists of solving the banded triangular system Ux = y for x.
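The two triangular solves can be sketched on the 4 x 4 matrix used in the examples of this chapter. This is an illustration only, under simplifying assumptions: the factors below were computed by hand without row interchanges and are held in full (not band) storage, so they are not the contents of FACT and IPVT that LFCRB or LFTRB would return. The names l, u, y are hypothetical.

```fortran
program lu_solve_demo
  implicit none
  integer, parameter :: n = 4
  real :: l(n,n), u(n,n), b(n), y(n), x(n)
  integer :: i

  ! Hand-computed factors with A = L*U (assumption: no pivoting)
  l = 0.0
  do i = 1, n
     l(i,i) = 1.0
  end do
  l(2,1) = -1.5
  l(4,3) = -2.0

  u = 0.0
  u(1,1:2) = (/  2.0, -1.0 /)
  u(2,2:3) = (/ -0.5, -2.0 /)
  u(3,3:4) = (/ -1.0,  2.0 /)
  u(4,4)   = 5.0

  ! Right-hand side from the LSLRB example
  b = (/ 3.0, 1.0, 11.0, -2.0 /)

  ! Forward elimination: solve L y = b (L has a unit diagonal)
  do i = 1, n
     y(i) = b(i) - dot_product(l(i,1:i-1), y(1:i-1))
  end do

  ! Back substitution: solve U x = y
  do i = n, 1, -1
     x(i) = (y(i) - dot_product(u(i,i+1:n), x(i+1:n))) / u(i,i)
  end do

  print '(4f8.3)', x
end program lu_solve_demo
```

The computed x agrees with the solution (2, 1, -3, 4) printed in the LSLRB example. A banded implementation would restrict each dot product to the entries inside the band.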
LFSRB and LFIRB both solve a linear system given its LU factorization. LFIRB generally takes
more time and produces a more accurate answer than LFSRB. Each iteration of the iterative
refinement algorithm used by LFIRB calls LFSRB.
The underlying code is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Example
The inverse is computed for a real banded $4 \times 4$ matrix with one upper and one lower codiagonal.
The input matrix is assumed to be well-conditioned, hence LFTRB is used rather than LFCRB.
      USE LFSRB_INT
      USE LFTRB_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N)
!                                 Set values for A in band form
!
!                                 A = (  0.0  -1.0  -2.0   2.0)
!                                     (  2.0   1.0  -1.0   1.0)
!                                     ( -3.0   0.0   2.0   0.0)
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
           2.0, 1.0, 0.0/
!
!
!
!
!
!
!
!
!
END
Output
                AINV
         1        2        3        4
1   -1.000   -1.000    0.400   -0.800
2   -3.000   -2.000    0.800   -1.600
3    0.000    0.000   -0.200    0.400
4    0.000    0.000    0.400    0.200
LFIRB
Uses iterative refinement to improve the solution of a real system of linear equations in band
storage mode.
Required Arguments
A (NUCA + NLCA + 1) by N array containing the N by N banded coefficient matrix in band
storage mode. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
FACT (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the matrix A as
output from routines LFCRB/DLFCRB or LFTRB/DLFTRB. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization of A
as output from routine LFCRB/DLFCRB or LFTRB/DLFTRB. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
RES Vector of length N containing the residual vector at the improved
solution. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system $AX = B$ is solved.
IPATH = 2 means the system $A^T X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LFIRB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, B, IPATH, X,
RES)
Double:
Description
Routine LFIRB computes the solution of a system of linear algebraic equations having a real
banded coefficient matrix. Iterative refinement is performed on the solution vector to improve the
accuracy. Usually almost all of the digits in the solution are accurate, even if the matrix is
somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an LU factorization. This may
be done by calling either LFCRB or LFTRB.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIRB and LFSRB both solve a linear system given its LU factorization. LFIRB generally takes
more time and produces a more accurate answer than LFSRB. Each iteration of the iterative
refinement algorithm used by LFIRB calls LFSRB.
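The refinement loop can be summarized as follows. This is the textbook scheme, given as a sketch under the assumption that LFIRB follows it; the manual itself confirms only that each iteration calls LFSRB. The iteration index $k$ and the correction vector $d^{(k)}$ are notation introduced here for illustration.

```latex
\begin{aligned}
LU\,x^{(0)} &= b            && \text{initial solve from the factorization},\\
r^{(k)}     &= b - A\,x^{(k)} && \text{residual of the current iterate},\\
LU\,d^{(k)} &= r^{(k)}      && \text{one more pair of triangular solves},\\
x^{(k+1)}   &= x^{(k)} + d^{(k)} && \text{corrected iterate}.
\end{aligned}
```

Under this reading, the output argument RES corresponds to the residual $r = b - Ax$ evaluated at the final iterate, and the extra solve per iteration accounts for LFIRB taking more time than LFSRB.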
Comments
Informational error
Type   Code
3      2      The input matrix is too ill-conditioned for iterative refinement to be
              effective.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving
the system each of the first two times by adding 0.5 to the second element.
      USE LFIRB_INT
      USE LFCRB_INT
      USE UMACH_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), B(N), FACT(LDFACT,N), RCOND, RES(N), X(N)
!                                 Set values for A in band form, and B
!
!                                 A = (  0.0  -1.0  -2.0   2.0)
!                                     (  2.0   1.0  -1.0   1.0)
!                                     ( -3.0   0.0   2.0   0.0)
!
!                                 B = (  3.0   5.0   7.0  -9.0)
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
           2.0, 1.0, 0.0/
      DATA B/3.0, 5.0, 7.0, -9.0/
!
      DO 10  J=1, 3
         CALL LFIRB (A, NLCA, NUCA, FACT, IPVT, B, X, RES)
!                                 Print results
         CALL WRRRN ('X', X, 1, N, 1)
!                                 Perturb B by adding 0.5 to B(2)
         B(2) = B(2) + 0.5E0
   10 CONTINUE
!
99999 FORMAT (
END
RCOND = ,F5.3,/,
Output
RCOND < .07
L1 Condition number = 25.0
                  X
     1       2       3       4
 2.000   1.000  -5.000   1.000

                  X
     1       2       3       4
 1.500   0.000  -5.000   1.000

                  X
     1       2       3       4
 1.000  -1.000  -5.000   1.000
LFDRB
Computes the determinant of a real matrix in band storage mode given the LU factorization of the
matrix.
Required Arguments
FACT (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the matrix A as
output from routine LFTRB/DLFTRB or LFCRB/DLFCRB. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization as
output from routine LFTRB/DLFTRB or LFCRB/DLFCRB. (Input)
DET1 Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that $1.0 \le |\text{DET1}| < 10.0$ or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form $\det(A) = \text{DET1} \times 10^{\text{DET2}}$.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LFDRB (N, FACT, LDFACT, NLCA, NUCA, IPVT, DET1, DET2)
Double:
Description
Routine LFDRB computes the determinant of a real banded coefficient matrix. To compute the
determinant, the coefficient matrix must first undergo an LU factorization. This may be done by
calling either LFCRB or LFTRB. The formula $\det A = \det L \, \det U$ is used to compute the
determinant. Since the determinant of a triangular matrix is the product of the diagonal elements,
$$\det U = \prod_{i=1}^{N} U_{ii}$$
(The matrix U is stored in the upper NUCA + NLCA + 1 rows of FACT as a banded matrix.) Since L
is the product of triangular matrices with unit diagonals and of permutation matrices, $\det L = (-1)^k$,
where k is the number of pivoting interchanges.
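The mantissa and exponent form of the result can be sketched as follows. This is a minimal illustration, not the LFDRB implementation: it assumes the diagonal of U is already available in a plain vector (here taken from a hand factorization of the example matrix without row interchanges, so k = 0), and the names udiag and k are hypothetical.

```fortran
program det_demo
  implicit none
  integer, parameter :: n = 4
  real :: udiag(n), det1, det2
  integer :: i, k

  ! Diagonal of U from a hand factorization (assumption: no pivoting)
  udiag = (/ 2.0, -0.5, -1.0, 5.0 /)
  k = 0                       ! number of row interchanges

  det1 = 1.0
  det2 = 0.0
  do i = 1, n
     det1 = det1 * udiag(i)
     if (det1 == 0.0) exit
     ! Renormalize so that 1.0 <= |det1| < 10.0
     do while (abs(det1) < 1.0)
        det1 = det1 * 10.0
        det2 = det2 - 1.0
     end do
     do while (abs(det1) >= 10.0)
        det1 = det1 / 10.0
        det2 = det2 + 1.0
     end do
  end do
  ! det L = (-1)**k flips the sign for an odd number of interchanges
  if (mod(k, 2) == 1) det1 = -det1

  print '(a, f6.3, a, f3.0)', ' det = ', det1, ' * 10**', det2
end program det_demo
```

Renormalizing after every multiplication keeps the running product away from overflow and underflow, which is the reason for returning the determinant as two numbers. For this matrix the sketch gives a mantissa of 5.000 and an exponent of 0, in agreement with the example output.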
LFDRB is based on the LINPACK routine SGBDI; see Dongarra et al. (1979).
Example
The determinant is computed for a real banded $4 \times 4$ matrix with one upper and one lower
codiagonal.
      USE LFDRB_INT
      USE LFTRB_INT
      USE UMACH_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), DET1, DET2, FACT(LDFACT,N)
!                                 Set values for A in band form
!
!                                 A = (  0.0  -1.0  -2.0   2.0)
!                                     (  2.0   1.0  -1.0   1.0)
!                                     ( -3.0   0.0   2.0   0.0)
!
      DATA A/0.0, 2.0, -3.0, -1.0, 1.0, 0.0, -2.0, -1.0, 2.0,&
           2.0, 1.0, 0.0/
!
      CALL LFTRB (A, NLCA, NUCA, FACT, IPVT)
!                                 Compute the determinant
      CALL LFDRB (FACT, NLCA, NUCA, IPVT, DET1, DET2)
!                                 Print the results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) DET1, DET2
99999 FORMAT (' The determinant of A is ', F6.3, ' * 10**', F2.0)
      END
Output
 The determinant of A is  5.000 * 10**0.
LSAQS
Solves a real symmetric positive definite system of linear equations in band symmetric storage
mode with iterative refinement.
Required Arguments
A NCODA + 1 by N array containing the N by N positive definite band coefficient matrix in
band symmetric storage mode. (Input)
NCODA Number of upper codiagonals of A. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSAQS solves a system of linear algebraic equations having a real symmetric positive
definite band coefficient matrix. It first uses the routine LFCQS to compute an $R^T R$ Cholesky
factorization of the coefficient matrix and to estimate the condition number of the matrix. R is an
upper triangular band matrix. The solution of the linear system is then found using the iterative
refinement routine LFIQS.
LSAQS fails if any submatrix of R is not positive definite, if R has a zero diagonal element or if the
iterative refinement algorithm fails to converge. These errors occur only if A is very close to a
singular matrix or to a matrix which is not positive definite.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system. LSAQS solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
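Band symmetric storage keeps only the diagonal and the upper codiagonals. The sketch below packs the full form of the 4 x 4 matrix from the LSAQS example into an (NCODA + 1) by N array using standard Fortran only. The mapping (element (i,j) with i <= j goes to row NCODA + 1 + i - j of column j) is inferred from the example's DATA statement, not quoted from the manual's storage-mode definition, and the names full and band are hypothetical.

```fortran
program band_sym_pack_demo
  implicit none
  integer, parameter :: n = 4, ncoda = 2, lda = ncoda + 1
  real :: full(n,n), band(lda,n)
  integer :: i, j

  ! Full symmetric form of the matrix used in the LSAQS example
  full(1,:) = (/  2.0,  0.0, -1.0,  0.0 /)
  full(2,:) = (/  0.0,  4.0,  2.0,  1.0 /)
  full(3,:) = (/ -1.0,  2.0,  7.0, -1.0 /)
  full(4,:) = (/  0.0,  1.0, -1.0,  3.0 /)

  ! Assumed mapping: upper-triangle element (i,j) -> band(ncoda+1+i-j, j).
  ! The diagonal lands in the last row; unused corners stay zero.
  band = 0.0
  do j = 1, n
     do i = max(1, j - ncoda), j
        band(ncoda + 1 + i - j, j) = full(i,j)
     end do
  end do

  do i = 1, lda
     print '(4f7.1)', band(i,:)
  end do
end program band_sym_pack_demo
```

Read column by column, the packed array matches the values given for A in the example's DATA statement, with the diagonal (2, 4, 7, 3) in row NCODA + 1.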
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2AQS the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSAQS.
Additional memory allocation for FACT and option value restoration are done
automatically in LSAQS.
Users directly calling L2AQS can allocate additional space for FACT and set
IVAL(3) and IVAL(4) so that memory bank conflicts no longer cause
inefficiencies. There is no requirement that users change existing applications
that use LSAQS or L2AQS. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSAQS temporarily replaces IVAL(2) by IVAL(1). The
routine L2CQS computes the condition number if IVAL(2) = 2. Otherwise L2CQS
skips this computation. LSAQS restores the option. Default values for the option
are IVAL(*) = 1,2.
Example
A system of four linear equations is solved. The coefficient matrix has real positive definite
band form, and the right-hand-side vector b has four elements.
      USE LSAQS_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, N, NCODA
      PARAMETER  (LDA=3, N=4, NCODA=2)
      REAL       A(LDA,N), B(N), X(N)
!
!                                 A = (  0.0   0.0  -1.0   1.0 )
!                                     (  0.0   0.0   2.0  -1.0 )
!                                     (  2.0   4.0   7.0   3.0 )
!
!                                 B = (  6.0 -11.0 -11.0  19.0 )
!
      DATA A/2*0.0, 2.0, 2*0.0, 4.0, -1.0, 2.0, 7.0, 1.0, -1.0, 3.0/
      DATA B/6.0, -11.0, -11.0, 19.0/
!                                 Solve A*X = B
      CALL LSAQS (A, NCODA, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END
Output
                  X
     1       2       3       4
 4.000  -6.000   2.000   9.000
LSLQS
Solves a real symmetric positive definite system of linear equations in band symmetric storage
mode without iterative refinement.
Required Arguments
A NCODA + 1 by N array containing the N by N positive definite band symmetric coefficient
matrix in band symmetric storage mode. (Input)
NCODA Number of upper codiagonals of A. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLQS solves a system of linear algebraic equations having a real symmetric positive
definite band coefficient matrix. It first uses the routine LFCQS to compute an $R^T R$ Cholesky
factorization of the coefficient matrix and to estimate the condition number of the matrix. R is an
upper triangular band matrix. The solution of the linear system is then found using the routine
LFSQS.
LSLQS fails if any submatrix of R is not positive definite or if R has a zero diagonal element.
These errors occur only if A is very close to a singular matrix or to a matrix which is not positive
definite.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that
LSAQS be used.
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2LQS the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLQS.
Additional memory allocation for FACT and option value restoration are done
automatically in LSLQS. Users directly calling L2LQS can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSLQS or L2LQS. Default values for the option are
IVAL(*) = 1,16,0,1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSLQS temporarily replaces IVAL(2) by IVAL(1). The
routine L2CQS computes the condition number if IVAL(2) = 2. Otherwise L2CQS
skips this computation. LSLQS restores the option. Default values for the option
are IVAL(*) = 1,2.
Example
A system of four linear equations is solved. The coefficient matrix has real positive definite band
form and the right-hand-side vector b has four elements.
      USE LSLQS_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, N, NCODA
      PARAMETER  (LDA=3, N=4, NCODA=2)
      REAL       A(LDA,N), B(N), X(N)
!
!            Set values for A in band symmetric form, and B
!
!                                 A = (  0.0   0.0  -1.0   1.0 )
!                                     (  0.0   0.0   2.0  -1.0 )
!                                     (  2.0   4.0   7.0   3.0 )
!
!                                 B = (  6.0 -11.0 -11.0  19.0 )
!
      DATA A/2*0.0, 2.0, 2*0.0, 4.0, -1.0, 2.0, 7.0, 1.0, -1.0, 3.0/
      DATA B/6.0, -11.0, -11.0, 19.0/
!                                 Solve A*X = B
      CALL LSLQS (A, NCODA, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END
Output
                  X
     1       2       3       4
 4.000  -6.000   2.000   9.000
LSLPB
Computes the $R^T D R$ Cholesky factorization of a real symmetric positive definite matrix A in
codiagonal band symmetric storage mode. Solves a system Ax = b.
Required Arguments
A Array containing the N by N positive definite band coefficient matrix and right hand
side in codiagonal band symmetric storage mode. (Input/Output)
The number of array columns must be at least NCODA + 2. The number of columns is
not an input to this subprogram.
On output, A contains the solution and factors. See Comments section for details.
Optional Arguments
N Order of the matrix. (Input)
Must satisfy N > 0.
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Must satisfy LDA $\ge$ N + NCODA.
Default: LDA = size (A,1).
IJOB Flag to direct the desired factorization or solving step. (Input)
Default: IJOB = 1.
IJOB Meaning
factor the matrix A and solve the system Ax = b, where b is stored in column
NCODA + 2 of array A. The vector x overwrites b in storage.
solve step only. Use b as column NCODA + 2 of A. (The factorization step has
already been done.) The vector x overwrites b in storage.
4,5,6 same meaning as with the value IJOB - 3. For efficiency, no error checking is
done on values LDA, N, NCODA, and U(*).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLPB factors and solves the symmetric positive definite banded linear system Ax = b.
The matrix is factored so that $A = R^T D R$, where R is unit upper triangular and D is diagonal. The
reciprocals of the diagonal entries of D are computed and saved to make the solving step more
efficient. Errors will occur if D has a non-positive diagonal element. Such events occur only if A is
very close to a singular matrix or is not positive definite.
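A small worked instance may make the factored form concrete. The $2 \times 2$ matrix below is a hypothetical illustration chosen for easy arithmetic; it does not come from the manual.

```latex
A = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}
  = \underbrace{\begin{pmatrix} 1 & 0 \\ \tfrac12 & 1 \end{pmatrix}}_{R^T}
    \underbrace{\begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}}_{D}
    \underbrace{\begin{pmatrix} 1 & \tfrac12 \\ 0 & 1 \end{pmatrix}}_{R},
\qquad
D^{-1} = \begin{pmatrix} \tfrac14 & 0 \\ 0 & \tfrac12 \end{pmatrix}.
```

Both diagonal entries of $D$ are positive, as required for a positive definite matrix. With the reciprocals in $D^{-1}$ saved, the solve step $R^T y = b$, $z = D^{-1} y$, $R x = z$ needs multiplications only, which is the efficiency gain the description refers to.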
LSLPB is efficient for problems with a small band width. The particular cases NCODA = 0, 1, 2 are
done with special loops within the code. These cases will give good performance. See Hanson
(1989) for details. When solving tridiagonal systems, NCODA = 1, the cyclic reduction code LSLCR
should be considered as an alternative. The expectation is that LSLCR will outperform LSLPB on
vector or parallel computers. It may be inferior on scalar computers or even parallel computers
with non-optimizing compilers.
Comments
1.
2.
3.
Informational error
Type
Code
4
Example
A system of four linear equations is solved. The coefficient matrix has real positive definite
codiagonal band form and the right-hand-side vector b has four elements.
      USE LSLPB_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, N, NCODA
      PARAMETER  (N=4, NCODA=2, LDA=N+NCODA)
      INTEGER    IJOB
      REAL       A(LDA,NCODA+2), U(N)
      REAL       R(N,N), RT(N,N), D(N,N), WK(N,N), AA(N,N)
!
!                                 A = (  *     *     *      *  )
!                                     (  *     *     *      *  )
!                                     ( 2.0    *     *     6.0 )
!                                     ( 4.0   0.0    *   -11.0 )
!                                     ( 7.0   2.0  -1.0  -11.0 )
!                                     ( 3.0  -1.0   1.0   19.0 )
!
Output
                  X
     1       2       3       4
 4.000  -6.000   2.000   9.000
LFCQS
Computes the $R^T R$ Cholesky factorization of a real symmetric positive definite matrix in band
symmetric storage mode and estimates its L1 condition number.
Required Arguments
A NCODA + 1 by N array containing the N by N positive definite band coefficient matrix in
band symmetric storage mode to be factored. (Input)
NCODA Number of upper codiagonals of A. (Input)
FACT NCODA + 1 by N array containing the $R^T R$ factorization of the matrix A in band
symmetric form. (Output)
If A is not needed, A and FACT can share the same storage locations.
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFCQS computes an $R^T R$ Cholesky factorization and estimates the condition number of a
real symmetric positive definite band coefficient matrix. R is an upper triangular band matrix.
The L1 condition number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to
compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm is the same as
used by LINPACK and is described by Cline et al. (1979).
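As a worked check of the definition, the exact L1 condition number can be computed by hand for the $4 \times 4$ matrix of the LFCQS example, whose inverse appears in that example's output. The 1-norm of a matrix is its largest absolute column sum; the entries of $A^{-1}$ below are the printed values (0.6667, 0.3333, 0.1667, 0.0833) written as fractions.

```latex
A = \begin{pmatrix}
2 & 1 & 0 & 0\\
1 & 2.5 & 1 & 0\\
0 & 1 & 2.5 & 1\\
0 & 0 & 1 & 2
\end{pmatrix},
\qquad
\|A\|_1 = \max(3,\ 4.5,\ 4.5,\ 3) = 4.5,
\\[6pt]
\|A^{-1}\|_1 = \max\!\left(\tfrac23+\tfrac13+\tfrac16+\tfrac1{12},\
                           \tfrac13+\tfrac23+\tfrac13+\tfrac16\right)
             = \max(1.25,\ 1.5) = 1.5,
\\[6pt]
\kappa(A) = 4.5 \times 1.5 = 6.75 .
```

The value reported in the example output, 6.239, is an estimate and is close to, but not identical with, this exact figure.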
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
LFCQS fails if any submatrix of R is not positive definite or if R has a zero diagonal element.
These errors occur only if A is very close to a singular matrix or to a matrix which is not positive
definite.
The $R^T R$ factors are returned in a form that is compatible with routines LFIQS, LFSQS and LFDQS.
To solve systems of equations with multiple right-hand-side vectors, use LFCQS followed by either
LFIQS or LFSQS called once for each right-hand side. The routine LFDQS can be called to compute
the determinant of the coefficient matrix after LFCQS has performed the factorization.
LFCQS is based on the LINPACK routine SPBCO; see Dongarra et al. (1979).
Comments
1.
2.
Informational errors
Type
Code
3
4
3
2
Example
The inverse of a $4 \times 4$ symmetric positive definite band matrix with one codiagonal is computed.
LFCQS is called to factor the matrix and to check for nonpositive definiteness or ill-conditioning.
LFIQS is called to determine the columns of the inverse.
USE LFCQS_INT
USE LFIQS_INT
USE UMACH_INT
USE WRRRN_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N, NCODA, NOUT
PARAMETER  (LDA=2, LDFACT=2, N=4, NCODA=1)
REAL       A(LDA,N), AINV(N,N), RCOND, FACT(LDFACT,N),&
           RES(N), RJ(N)
!
!                                 A = (  0.0   1.0   1.0   1.0 )
!                                     (  2.0   2.5   2.5   2.0 )
!
Output
RCOND = 0.160
L1 Condition number = 6.239
                AINV
         1        2        3        4
1   0.6667  -0.3333   0.1667  -0.0833
2  -0.3333   0.6667  -0.3333   0.1667
3   0.1667  -0.3333   0.6667  -0.3333
4  -0.0833   0.1667  -0.3333   0.6667
LFTQS
Computes the $R^TR$ Cholesky factorization of a real symmetric positive definite matrix in band
symmetric storage mode.
Required Arguments
A NCODA + 1 by N array containing the N by N positive definite band coefficient matrix in
band symmetric storage mode to be factored. (Input)
NCODA Number of upper codiagonals of A. (Input)
FACT NCODA + 1 by N array containing the $R^TR$ factorization of the matrix A. (Output)
If A is not needed, A and FACT can share the same storage locations.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFTQS computes an $R^TR$ Cholesky factorization of a real symmetric positive definite
band coefficient matrix. R is an upper triangular band matrix.
LFTQS fails if any submatrix of R is not positive definite or if R has a zero diagonal element.
These errors occur only if A is very close to a singular matrix or to a matrix which is not positive
definite.
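The following Python sketch (an illustration under stated assumptions, not the IMSL or LINPACK implementation) shows one way an $R^TR$ factorization can be carried out directly in band symmetric storage. It assumes the layout used in the examples of this chapter: the diagonal occupies the last row of the array and element $a_{ij}$ with $i \le j$ sits in row NCODA + 1 + i - j of column j (1-based). The function name `band_cholesky` is hypothetical.

```python
import math

def band_cholesky(band, ncoda):
    """Return R (same band layout as the input) such that A = R^T R."""
    n = len(band[0])
    fact = [[0.0] * n for _ in range(ncoda + 1)]

    def get(m, i, j):
        # Element (i, j), i <= j, of a matrix held in band symmetric storage (0-based).
        return m[ncoda + i - j][j]

    for j in range(n):
        lo = max(0, j - ncoda)          # first row of column j inside the band
        for i in range(lo, j + 1):
            s = get(band, i, j) - sum(get(fact, k, i) * get(fact, k, j)
                                      for k in range(lo, i))
            if i < j:
                fact[ncoda + i - j][j] = s / get(fact, i, i)
            else:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                fact[ncoda][j] = math.sqrt(s)
    return fact

# Band form of the 4 x 4 matrix used in the example for this routine.
band = [[0.0, 1.0, 1.0, 1.0],   # first codiagonal (leading entry unused)
        [2.0, 2.5, 2.5, 2.0]]   # diagonal
fact = band_cholesky(band, 1)
print(fact)
```

The `ValueError` branch mirrors the failure condition described above: a nonpositive pivot means the matrix is not positive definite, or is too close to singular for the factorization to proceed.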
The $R^TR$ factors are returned in a form that is compatible with routines LFIQS, LFSQS and LFDQS.
To solve systems of equations with multiple right-hand-side vectors, use LFTQS followed by either
LFIQS or LFSQS called once for each right-hand side. The routine LFDQS can be called to compute
the determinant of the coefficient matrix after LFTQS has performed the factorization.
LFTQS is based on the LINPACK routine SPBFA; see Dongarra et al. (1979).
Comments
Informational error
Type  Code
4     2
Example
The inverse of a $4 \times 4$ matrix is computed. LFTQS is called to factor the matrix and to check for
nonpositive definiteness. LFSQS is called to determine the columns of the inverse.
USE LFTQS_INT
USE WRRRN_INT
USE LFSQS_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N, NCODA
PARAMETER  (LDA=2, LDFACT=2, N=4, NCODA=1)
REAL       A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N)
!
!                                 Set values for A in band symmetric form
!
!                                 A = (  0.0   1.0   1.0   1.0 )
!                                     (  2.0   2.5   2.5   2.0 )
!
Output
                AINV
         1        2        3        4
1   0.6667  -0.3333   0.1667  -0.0833
2            0.6667  -0.3333   0.1667
3                     0.6667  -0.3333
4                              0.6667
LFSQS
Solves a real symmetric positive definite system of linear equations given the factorization of the
coefficient matrix in band symmetric storage mode.
Required Arguments
FACT NCODA + 1 by N array containing the $R^TR$ factorization of the positive definite band
matrix A in band symmetric storage mode as output from subroutine LFCQS/DLFCQS or
LFTQS/DLFTQS. (Input)
NCODA Number of upper codiagonals of A. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
This routine computes the solution for a system of linear algebraic equations having a real
symmetric positive definite band coefficient matrix. To compute the solution, the coefficient
matrix must first undergo an $R^TR$ factorization. This may be done by calling either LFCQS or
LFTQS. R is an upper triangular band matrix.
The solution to Ax = b is found by solving the triangular systems $R^Ty = b$ and $Rx = y$.
LFSQS and LFIQS both solve a linear system given its $R^TR$ factorization. LFIQS generally takes
more time and produces a more accurate answer than LFSQS. Each iteration of the iterative
refinement algorithm used by LFIQS calls LFSQS.
LFSQS is based on the LINPACK routine SPBSL; see Dongarra et al. (1979).
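The two triangular solves can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the IMSL code: the matrix is held in full (dense) storage for readability, and the function names `cholesky_upper` and `solve_rtr` are hypothetical. The data are the coefficient matrix and first right-hand side of the example for this routine.

```python
import math

def cholesky_upper(a):
    """Dense upper triangular R with A = R^T R."""
    n = len(a)
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j + 1):
            s = a[i][j] - sum(r[k][i] * r[k][j] for k in range(i))
            r[i][j] = math.sqrt(s) if i == j else s / r[i][i]
    return r

def solve_rtr(r, b):
    """Solve R^T R x = b by one forward and one backward substitution."""
    n = len(b)
    # Forward substitution: R^T y = b (R^T is lower triangular).
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(r[k][i] * y[k] for k in range(i))) / r[i][i]
    # Back substitution: R x = y.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(r[i][k] * x[k] for k in range(i + 1, n))) / r[i][i]
    return x

# Full form of the band matrix from the LFSQS example (two codiagonals).
A = [[2.0, 0.0, -1.0, 0.0],
     [0.0, 4.0, 2.0, 1.0],
     [-1.0, 2.0, 7.0, -1.0],
     [0.0, 1.0, -1.0, 3.0]]
x = solve_rtr(cholesky_upper(A), [4.0, 6.0, 15.0, -7.0])
print([round(v, 3) for v in x])
```

Once R is available, each additional right-hand side costs only the two substitutions, which is why the factorization and solve steps are offered as separate routines.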
Comments
Informational error
Type  Code
4     1     The factored matrix is singular.
Example
A set of linear systems is solved successively. LFTQS is called to factor the coefficient matrix.
LFSQS is called to compute the four solutions for the four right-hand sides. In this case the
coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be
better to call LFCQS to perform the factorization, and LFIQS to compute the solutions.
USE LFSQS_INT
USE LFTQS_INT
USE WRRRN_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N, NCODA
PARAMETER  (LDA=3, LDFACT=3, N=4, NCODA=2)
REAL       A(LDA,N), B(N,4), FACT(LDFACT,N), X(N,4)
!
!                                 A = (  0.0   0.0  -1.0   1.0 )
!                                     (  0.0   0.0   2.0  -1.0 )
!                                     (  2.0   4.0   7.0   3.0 )
!
!                                 B = (  4.0  -3.0   9.0  -1.0 )
!                                     (  6.0  10.0  29.0   3.0 )
!                                     ( 15.0  12.0  11.0   6.0 )
!                                     ( -7.0   1.0  14.0   2.0 )
!
DATA A/2*0.0, 2.0, 2*0.0, 4.0, -1.0, 2.0, 7.0, 1.0, -1.0, 3.0/
DATA B/4.0, 6.0, 15.0, -7.0, -3.0, 10.0, 12.0, 1.0, 9.0, 29.0,&
       11.0, 14.0, -1.0, 3.0, 6.0, 2.0/
!                                 Factor the matrix A
CALL LFTQS (A, NCODA, FACT)
!                                 Compute the solutions
DO 10 I=1, 4
   CALL LFSQS (FACT, NCODA, B(:,I), X(:,I))
10 CONTINUE
!                                 Print solutions
CALL WRRRN ('X', X)
!
END
Output
                 X
        1       2       3       4
1   3.000  -1.000   5.000   0.000
2   1.000   2.000   6.000   0.000
3   2.000   1.000   1.000   1.000
4  -2.000   0.000   3.000   1.000
LFIQS
Uses iterative refinement to improve the solution of a real symmetric positive definite system of
linear equations in band symmetric storage mode.
Required Arguments
A NCODA + 1 by N array containing the N by N positive definite band coefficient matrix in
band symmetric storage mode. (Input)
NCODA Number of upper codiagonals of A. (Input)
FACT NCODA + 1 by N array containing the $R^TR$ factorization of the matrix A from routine
LFCQS/DLFCQS or LFTQS/DLFTQS. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the system. (Output)
RES Vector of length N containing the residual vector at the improved solution. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFIQS computes the solution of a system of linear algebraic equations having a real
symmetric positive-definite band coefficient matrix. Iterative refinement is performed on the
solution vector to improve the accuracy. Usually almost all of the digits in the solution are
accurate, even if the matrix is somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an $R^TR$ factorization. This may
be done by calling either IMSL routine LFCQS or LFTQS.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFIQS and LFSQS both solve a linear system given its $R^TR$ factorization. LFIQS generally takes
more time and produces a more accurate answer than LFSQS. Each iteration of the iterative
refinement algorithm used by LFIQS calls LFSQS.
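The idea of iterative refinement can be sketched as follows. This Python fragment is a simplified illustration, not the algorithm as implemented in LFIQS: it uses dense storage, a fixed number of refinement steps, and hypothetical helper names. To make the effect visible, the factor R is deliberately rounded to three decimals, imitating a low-accuracy factorization; each step computes the residual with the original matrix and solves for a correction with the inexact factor.

```python
import math

def cholesky_upper(a):
    """Dense upper triangular R with A = R^T R."""
    n = len(a)
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j + 1):
            s = a[i][j] - sum(r[k][i] * r[k][j] for k in range(i))
            r[i][j] = math.sqrt(s) if i == j else s / r[i][i]
    return r

def solve_rtr(r, b):
    """Solve R^T R x = b by forward and backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(r[k][i] * y[k] for k in range(i))) / r[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(r[i][k] * x[k] for k in range(i + 1, n))) / r[i][i]
    return x

def refine(a, b, r, steps=8):
    """Improve the solution of A x = b using the (possibly inexact) factor r."""
    x = solve_rtr(r, b)
    for _ in range(steps):
        # Residual computed with the original matrix A.
        res = [bi - sum(aij * xj for aij, xj in zip(row, x))
               for row, bi in zip(a, b)]
        dx = solve_rtr(r, res)          # correction from the factored matrix
        x = [xi + di for xi, di in zip(x, dx)]
    return x

# Matrix and right-hand side from the LFIQS example, in full storage.
A = [[2.0, 1.0, 0.0, 0.0],
     [1.0, 2.5, 1.0, 0.0],
     [0.0, 1.0, 2.5, 1.0],
     [0.0, 0.0, 1.0, 2.0]]
b = [3.0, 5.0, 7.0, 4.0]
r_low = [[round(v, 3) for v in row] for row in cholesky_upper(A)]
x0 = solve_rtr(r_low, b)     # single solve with the rounded factor
x = refine(A, b, r_low)      # the same factor plus refinement steps
print(x0, x)
```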
Comments
Informational error
Type  Code
3     4
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving
the system each of the first two times by adding 0.5 to the second element.
USE LFIQS_INT
USE UMACH_INT
USE LFCQS_INT
USE WRRRN_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N, NCODA, NOUT
PARAMETER  (LDA=2, LDFACT=2, N=4, NCODA=1)
REAL       A(LDA,N), B(N), RCOND, FACT(LDFACT,N), RES(N,3),&
           X(N,3)
!
!                                 A = (  0.0   1.0   1.0   1.0 )
!                                     (  2.0   2.5   2.5   2.0 )
!
!                                 B = (  3.0   5.0   7.0   4.0 )
!
Output
RCOND = 0.160
L1 Condition number = 6.239
             X
        1       2       3
1   1.167   1.000   0.833
2   0.667   1.000   1.333
3   2.167   2.000   1.833
4   0.917   1.000   1.083

                   RES
            1            2            3
1   7.947E-08    0.000E+00    9.934E-08
2   7.947E-08    0.000E+00    3.974E-08
3   7.947E-08    0.000E+00    1.589E-07
4  -3.974E-08    0.000E+00   -7.947E-08
LFDQS
Computes the determinant of a real symmetric positive definite matrix given the $R^TR$ Cholesky
factorization in band symmetric storage mode.
Required Arguments
FACT NCODA + 1 by N array containing the $R^TR$ factorization of the positive definite band
matrix, A, in band symmetric storage mode as output from subroutine LFCQS/DLFCQS
or LFTQS/DLFTQS. (Input)
NCODA Number of upper codiagonals of A. (Input)
DET1 Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that $1.0 \le |\mathrm{DET1}| < 10.0$ or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form $\det(A) = \mathrm{DET1} \times 10^{\mathrm{DET2}}$.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDQS computes the determinant of a real symmetric positive-definite band coefficient
matrix. To compute the determinant, the coefficient matrix must first undergo an $R^TR$
factorization. This may be done by calling either IMSL routine LFCQS or LFTQS. The formula
$\det A = \det R^T \det R = (\det R)^2$ is used to compute the determinant. Since the determinant of a
triangular matrix is the product of the diagonal elements,
$$\det R = \prod_{i=1}^{N} R_{ii}$$
LFDQS is based on the LINPACK routine SPBDI; see Dongarra et al. (1979).
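A short Python sketch of the formula and of the mantissa/exponent normalization is given below. It is an illustration only: it uses dense storage, the helper names are hypothetical, and the exponent is returned as an integer here. The matrix is the one from the example for this routine, written in full form.

```python
import math

def cholesky_upper(a):
    """Dense upper triangular R with A = R^T R."""
    n = len(a)
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j + 1):
            s = a[i][j] - sum(r[k][i] * r[k][j] for k in range(i))
            r[i][j] = math.sqrt(s) if i == j else s / r[i][i]
    return r

def det_from_factor(r):
    """Return (det1, det2) with det(A) = det1 * 10**det2 and 1 <= |det1| < 10."""
    det_r = 1.0
    for i in range(len(r)):
        det_r *= r[i][i]                # det R is the product of the diagonal
    det1, det2 = det_r * det_r, 0       # det A = (det R)^2
    if det1 != 0.0:
        while abs(det1) >= 10.0:
            det1 /= 10.0
            det2 += 1
        while abs(det1) < 1.0:
            det1 *= 10.0
            det2 -= 1
    return det1, det2

# Full form of the band matrix from the LFDQS example (two codiagonals).
A = [[7.0, 2.0, 1.0, 0.0],
     [2.0, 6.0, 1.0, -2.0],
     [1.0, 1.0, 6.0, 3.0],
     [0.0, -2.0, 3.0, 8.0]]
det1, det2 = det_from_factor(cholesky_upper(A))
print(round(det1, 3), det2)
```

Keeping the mantissa and exponent separate avoids overflow or underflow when the determinant of a large matrix falls outside the floating-point range.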
Example
The determinant is computed for a real positive definite $4 \times 4$ matrix with 2 codiagonals.
USE LFDQS_INT
USE LFTQS_INT
USE UMACH_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N, NCODA, NOUT
PARAMETER  (LDA=3, N=4, LDFACT=3, NCODA=2)
REAL       A(LDA,N), DET1, DET2, FACT(LDFACT,N)
!
!                                 A = (  0.0   0.0   1.0  -2.0 )
!                                     (  0.0   2.0   1.0   3.0 )
!                                     (  7.0   6.0   6.0   8.0 )
!
DATA A/2*0.0, 7.0, 0.0, 2.0, 6.0, 1.0, 1.0, 6.0, -2.0, 3.0, 8.0/
!                                 Factor the matrix
CALL LFTQS (A, NCODA, FACT)
!                                 Compute the determinant
CALL LFDQS (FACT, NCODA, DET1, DET2)
!                                 Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) DET1, DET2
!
99999 FORMAT (' The determinant of A is ',F6.3,' * 10**',F2.0)
END
Output
The determinant of A is 1.186 * 10**3.
LSLTQ
Solves a complex tridiagonal system of linear equations.
Required Arguments
C Complex vector of length N containing the subdiagonal of the tridiagonal matrix in C(2)
through C(N). (Input/Output)
On output C is destroyed.
D Complex vector of length N containing the diagonal of the tridiagonal matrix.
(Input/Output)
On output D is destroyed.
E Complex vector of length N containing the superdiagonal of the tridiagonal matrix in
E(1) through E(N - 1). (Input/Output)
On output E is destroyed.
B Complex vector of length N containing the right-hand side of the linear system on entry
and the solution vector on return. (Input/Output)
Optional Arguments
N Order of the tridiagonal matrix. (Input)
Default: N = size (C,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLTQ factors and solves the complex tridiagonal linear system Ax = b. LSLTQ is intended
just for tridiagonal systems. The coefficient matrix does not have to be symmetric. The algorithm
is Gaussian elimination with pivoting for numerical stability. See Dongarra et al. (1979),
LINPACK subprograms CGTSL/ZGTSL, for details. When computing on vector or parallel
computers the cyclic reduction algorithm, LSLCQ, should be considered as an alternative method
to solve the system.
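Gaussian elimination with row interchanges on a tridiagonal system can be sketched as below. This Python fragment is an illustration in the spirit of the description above, not a transcription of LSLTQ or of the LINPACK code; the function name and the test data are assumptions made for the example. It follows the storage convention of the argument list (subdiagonal in positions 2 through N of `c`, superdiagonal in positions 1 through N - 1 of `e`) and works on copies instead of destroying its inputs.

```python
def solve_tridiagonal(c, d, e, b):
    """Solve a tridiagonal system using elimination with partial pivoting.

    c: subdiagonal (c[0] unused), d: diagonal, e: superdiagonal (e[n-1] unused).
    """
    n = len(d)
    c, d, e, b = list(c), list(d), list(e), list(b)
    e[n - 1] = 0
    f = [0] * n                      # second superdiagonal filled in by interchanges
    for k in range(n - 1):
        if abs(c[k + 1]) > abs(d[k]):
            # Interchange rows k and k+1 so the larger entry becomes the pivot.
            d[k], c[k + 1] = c[k + 1], d[k]
            e[k], d[k + 1] = d[k + 1], e[k]
            f[k], e[k + 1] = e[k + 1], f[k]
            b[k], b[k + 1] = b[k + 1], b[k]
        if d[k] == 0:
            raise ZeroDivisionError("matrix is singular")
        m = c[k + 1] / d[k]
        d[k + 1] -= m * e[k]
        e[k + 1] -= m * f[k]
        b[k + 1] -= m * b[k]
    if d[n - 1] == 0:
        raise ZeroDivisionError("matrix is singular")
    # Back substitution with up to two superdiagonals.
    x = [0] * n
    for k in range(n - 1, -1, -1):
        s = b[k]
        if k + 1 < n:
            s -= e[k] * x[k + 1]
        if k + 2 < n:
            s -= f[k] * x[k + 2]
        x[k] = s / d[k]
    return x

# Hypothetical test data (not the data of the LSLTQ example).
c = [0, 2 + 1j, 1, 3j]
d = [1j, 1, 2 - 1j, 4]
e = [3, 1 - 1j, 2, 0]
b = [4j, 1 + 3j, 2 + 2j, 8 - 3j]
x = solve_tridiagonal(c, d, e, b)
print(x)
```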
Comments
Informational error
Type  Code
4     2
Example
A system of n = 4 linear equations is solved.
USE LSLTQ_INT
USE WRCRL_INT
!                                 Declaration of variables
INTEGER    N
PARAMETER  (N=4)
COMPLEX
CHARACTER
!
DATA FMT/'(E13.6)'/
DATA CLABEL/'NUMBER'/
DATA RLABEL/'NONE'/
!
!
!
!
DATA
DATA
DATA
DATA
!
!
CALL LSLTQ (C, D, E, B)
!
Output
                          Solution:
              1                              2
(-0.400000E+01,-0.700000E+01)  (-0.700000E+01, 0.400000E+01)
              3                              4
( 0.700000E+01,-0.700000E+01)  ( 0.900000E+01, 0.200000E+01)
LSLCQ
Computes the LDU factorization of a complex tridiagonal matrix A using a cyclic reduction
algorithm.
Required Arguments
C Complex array of size 2N containing the upper codiagonal of the N by N tridiagonal
matrix in the entries C(1), ..., C(N - 1). (Input/Output)
A Complex array of size 2N containing the diagonal of the N by N tridiagonal matrix in the
entries A(1), ..., A(N). (Input/Output)
B Complex array of size 2N containing the lower codiagonal of the N by N tridiagonal
matrix in the entries B(1), ..., B(N - 1). (Input/Output)
Y Complex array of size 2N containing the right-hand side of the system Ax = y in the order
Y(1), ..., Y(N). (Input/Output)
The vector x overwrites Y in storage.
U Real array of size 2N of flags that indicate any singularities of A. (Output)
A value U(I) = 1. means that a divide by zero would have occurred during the
factoring. Otherwise U(I) = 0.
IR Array of integers that determine the sizes of loops performed in the cyclic reduction
algorithm. (Output)
IS Array of integers that determine the sizes of loops performed in the cyclic reduction
algorithm. (Output)
The sizes of these arrays must be at least $\log_2(N) + 3$.
Optional Arguments
N Order of the matrix. (Input)
N must be greater than zero.
Default: N = size (C,1).
IJOB Flag to direct the desired factoring or solving step. (Input)
Default: IJOB =1.
IJOB
Action
1
2
3
4
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLCQ factors and solves the complex tridiagonal linear system Ax = y. The matrix is
decomposed in the form A = LDU, where L is unit lower triangular, U is unit upper triangular, and
D is diagonal. The algorithm used for the factorization is effectively that described in Kershaw
(1982). More details, tests and experiments are reported in Hanson (1990).
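For orientation, the form A = LDU can be illustrated with a short Python sketch for a tridiagonal matrix. This is an assumption-laden illustration: it eliminates in the natural order 1, 2, ..., N, not in the permuted order of cyclic reduction, so it shows only what the factors look like and not how LSLCQ computes them. The function name and the test data are hypothetical.

```python
def tridiagonal_ldu(lower, diag, upper):
    """Factor a tridiagonal matrix as L D U without pivoting.

    Returns (l, d, u): the subdiagonal of the unit lower triangular L,
    the diagonal of D, and the superdiagonal of the unit upper triangular U.
    """
    n = len(diag)
    d = [diag[0]]
    l, u = [], []
    for k in range(n - 1):
        if d[k] == 0:
            raise ZeroDivisionError("zero pivot; no pivoting is performed")
        l.append(lower[k] / d[k])
        u.append(upper[k] / d[k])
        # (LDU)[k+1][k+1] = d[k+1] + l[k] * d[k] * u[k]
        d.append(diag[k + 1] - l[k] * d[k] * u[k])
    return l, d, u

# Hypothetical 4 x 4 test matrix.
lower = [1, 2, 3]
diag = [2, 3, 4, 5]
upper = [4, 1, 2]
l, d, u = tridiagonal_ldu(lower, diag, upper)
print(l, d, u)
```

A zero pivot here plays the role of the singularity flags returned in U by LSLCQ: it marks a step at which a divide by zero would have occurred.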
LSLCQ is intended just for tridiagonal systems. The coefficient matrix does not have to be
Hermitian. The algorithm amounts to Gaussian elimination, with no pivoting for numerical
stability, on the matrix whose rows and columns are permuted to a new order. See Hanson (1990)
for details. The expectation is that LSLCQ will outperform either LSLTQ or LSLQB on vector or
parallel computers. Its performance may be inferior for small values of n, on scalar computers, or
high-performance computers with non-optimizing compilers.
Example
A real skew-symmetric tridiagonal matrix, A, of dimension n = 1000 is given by $c_k = -k$, $a_k = 0$,
and $b_k = k$, $k = 1, \ldots, n - 1$, $a_n = 0$. This matrix will have eigenvalues that are purely imaginary.
The eigenvalue closest to the imaginary unit is required. This number is obtained by using inverse
iteration to approximate a complex eigenvector y. The eigenvalue is approximated by
$\lambda = y^H A y / y^H y$. (This example is contrived in the sense that the given tridiagonal skew-symmetric
matrix eigenvalue problem is essentially equivalent to the tridiagonal symmetric eigenvalue
problem where the $c_k = k$ and the other data are unchanged.)
USE LSLCQ_INT
USE UMACH_INT
!                                 Declare variables
INTEGER    LP, N, N2
PARAMETER  (LP=12, N=1000, N2=2*N)
!
INTEGER
REAL
COMPLEX
INTRINSIC
!
B(I) = I
!
Output
The value of n is:
1000
Value of approximate imaginary eigenvalue:
1.03811
LSACB
Solves a complex system of linear equations in band storage mode with iterative refinement.
Required Arguments
A Complex NLCA + NUCA + 1 by N array containing the N by N banded coefficient matrix in
band storage mode. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^HX = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSACB solves a system of linear algebraic equations having a complex banded coefficient
matrix. It first uses the routine LFCCB to compute an LU factorization of the coefficient matrix and
to estimate the condition number of the matrix. The solution of the linear system is then found
using the iterative refinement routine LFICB.
LSACB fails if U, the upper triangular part of the factorization, has a zero diagonal element or if the
iterative refinement algorithm fails to converge. These errors occur only if A is singular or very
close to a singular matrix.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system. LSACB solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2ACB the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSACB.
Additional memory allocation for FACT and option value restoration are done
automatically in LSACB. Users directly calling L2ACB can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSACB or L2ACB. Default values for the option are
IVAL(*) = 1,16,0,1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSACB temporarily replaces IVAL(2) by IVAL(1). The
routine L2CCB computes the condition number if IVAL(2) = 2. Otherwise
L2CCB skips this computation. LSACB restores the option. Default values for
the option are IVAL(*) = 1,2.
Example
A system of four linear equations is solved. The coefficient matrix has complex banded form with
one upper and one lower codiagonal. The right-hand-side vector b has four elements.
USE LSACB_INT
USE WRCRN_INT
!                                 Declare variables
INTEGER    LDA, N, NLCA, NUCA
PARAMETER  (LDA=3, N=4, NLCA=1, NUCA=1)
COMPLEX    A(LDA,N), B(N), X(N)
!
!                                 Set values for A in band form, and B
!
!                                 A = (  0.0+0.0i   4.0+0.0i  -2.0+2.0i  -4.0-1.0i )
!                                     ( -2.0-3.0i  -0.5+3.0i   3.0-3.0i   1.0-1.0i )
!                                     (  6.0+1.0i   1.0+1.0i   0.0+2.0i   0.0+0.0i )
!
!                                 B = ( -10.0-5.0i   9.5+5.5i  12.0-12.0i   0.0+8.0i )
!
END
Output
                                X
            1                2                3                4
( 3.000, 0.000)  (-1.000, 1.000)  ( 3.000, 0.000)  (-1.000, 1.000)
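The band storage layout used for A in this example can be checked with a small Python sketch. It is an illustration only and assumes the layout visible in the example data: the upper codiagonals occupy the first NUCA rows, the diagonal occupies row NUCA + 1, and the lower codiagonals follow, so that element $a_{ij}$ is stored in row NUCA + 1 + i - j of column j (1-based). The function names are hypothetical. Expanding the band array to a full matrix and multiplying by the printed solution reproduces the right-hand side b.

```python
def band_to_full(band, nlca, nuca):
    """Expand a band-storage array into a full n by n matrix (0-based indices)."""
    n = len(band[0])
    full = [[0j] * n for _ in range(n)]
    for j in range(n):
        for i in range(max(0, j - nuca), min(n, j + nlca + 1)):
            full[i][j] = band[nuca + i - j][j]
    return full

def matvec(a, x):
    """Matrix-vector product for a full matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

# Band form of A, the right-hand side b, and the printed solution x.
band = [[0, 4, -2 + 2j, -4 - 1j],
        [-2 - 3j, -0.5 + 3j, 3 - 3j, 1 - 1j],
        [6 + 1j, 1 + 1j, 2j, 0]]
b = [-10 - 5j, 9.5 + 5.5j, 12 - 12j, 8j]
x = [3, -1 + 1j, 3, -1 + 1j]
full = band_to_full(band, 1, 1)
print(matvec(full, x))
```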
LSLCB
Solves a complex system of linear equations in band storage mode without iterative refinement.
Required Arguments
A Complex NLCA + NUCA + 1 by N array containing the N by N banded coefficient matrix in
band storage mode. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, then B and X may share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system AX = B is solved.
IPATH = 2 means the system $A^HX = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLCB solves a system of linear algebraic equations having a complex banded coefficient
matrix. It first uses the routine LFCCB to compute an LU factorization of the coefficient matrix and
to estimate the condition number of the matrix. The solution of the linear system is then found
using LFSCB.
LSLCB fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
occurs only if A is singular or very close to a singular matrix.
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that
LSACB be used.
Comments
1.
2.
3.
Informational errors
Type
Code
3
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2LCB the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLCB.
Additional memory allocation for FACT and option value restoration are done
automatically in LSLCB. Users directly calling L2LCB can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSLCB or L2LCB. Default values for the option are
IVAL(*) = 1,16,0,1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSLCB temporarily replaces IVAL(2) by IVAL(1). The
routine L2CCB computes the condition number if IVAL(2) = 2. Otherwise L2CCB
skips this computation. LSLCB restores the option. Default values for the option
are IVAL(*) = 1,2.
Example
A system of four linear equations is solved. The coefficient matrix has complex banded form with
one upper and one lower codiagonal. The right-hand-side vector b has four elements.
USE LSLCB_INT
USE WRCRN_INT
!                                 Declare variables
INTEGER    LDA, N, NLCA, NUCA
PARAMETER  (LDA=3, N=4, NLCA=1, NUCA=1)
COMPLEX    A(LDA,N), B(N), X(N)
!
!                                 Set values for A in band form, and B
!
!                                 A = (  0.0+0.0i   4.0+0.0i  -2.0+2.0i  -4.0-1.0i )
!                                     ( -2.0-3.0i  -0.5+3.0i   3.0-3.0i   1.0-1.0i )
!                                     (  6.0+1.0i   1.0+1.0i   0.0+2.0i   0.0+0.0i )
!
!                                 B = ( -10.0-5.0i   9.5+5.5i  12.0-12.0i   0.0+8.0i )
!
END
Output
                                X
            1                2                3                4
( 3.000, 0.000)  (-1.000, 1.000)  ( 3.000, 0.000)  (-1.000, 1.000)
LFCCB
Computes the LU factorization of a complex matrix in band storage mode and estimates its $L_1$
condition number.
Required Arguments
A Complex NLCA + NUCA + 1 by N array containing the N by N matrix in band storage
mode to be factored. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
FACT Complex 2 * NLCA + NUCA + 1 by N array containing the LU factorization of the
matrix A. (Output)
If A is not needed, A can share the first (NLCA + NUCA + 1) * N locations with FACT .
IPVT Vector of length N containing the pivoting information for the LU factorization.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LFCCB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, RCOND)
Double:
Description
Routine LFCCB performs an LU factorization of a complex banded coefficient matrix. It also
estimates the condition number of the matrix. The LU factorization is done using scaled partial
pivoting. Scaled partial pivoting differs from partial pivoting in that the pivoting strategy is the
same as if each row were scaled to have the same $\infty$-norm.
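The difference between the two pivoting strategies can be illustrated with a small Python sketch. It is a simplified illustration, not the IMSL implementation: the function name is hypothetical, and the row scale is taken as the largest magnitude in the whole row. In the sample matrix the first row has the larger entry in column 1, but that entry is small relative to the rest of its row, so the scaled strategy selects the second row instead.

```python
def pivot_row(rows, col, scaled):
    """Return the index of the pivot row for column `col` among rows col..n-1."""
    def score(r):
        v = abs(rows[r][col])
        if scaled:
            # Compare entries as if every row had been scaled to infinity-norm 1.
            scale = max(abs(entry) for entry in rows[r])
            return v / scale if scale else 0.0
        return v
    return max(range(col, len(rows)), key=score)

rows = [[10.0, 1000.0],
        [1.0, 1.0]]
print(pivot_row(rows, 0, scaled=False), pivot_row(rows, 0, scaled=True))
```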
The $L_1$ condition number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive to
compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm is the same as
used by LINPACK and is described by Cline et al. (1979).
If the estimated condition number is greater than $1/\varepsilon$ (where $\varepsilon$ is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
LFCCB fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
can occur only if A is singular or very close to a singular matrix.
The LU factors are returned in a form that is compatible with IMSL routines LFICB, LFSCB and
LFDCB. To solve systems of equations with multiple right-hand-side vectors, use LFCCB followed
by either LFICB or LFSCB called once for each right-hand side. The routine LFDCB can be called
to compute the determinant of the coefficient matrix after LFCCB has performed the factorization.
Let F be the matrix FACT, let $m_l$ = NLCA and let $m_u$ = NUCA. The first $m_l + m_u + 1$ rows of F
contain the triangular matrix U in band storage form. The lower $m_l$ rows of F contain the
multipliers needed to reconstruct L.
LFCCB is based on the LINPACK routine CGBCO; see Dongarra et al. (1979). CGBCO uses unscaled
partial pivoting.
Comments
1.
WK)
2.
Informational errors
Type
Code
3
4
1
2
Example
The inverse of a $4 \times 4$ band matrix with one upper and one lower codiagonal is computed.
LFCCB is called to factor the matrix and to check for singularity or ill-conditioning. LFICB is
called to determine the columns of the inverse.
USE LFCCB_INT
USE UMACH_INT
USE LFICB_INT
USE WRCRN_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
INTEGER    IPVT(N)
REAL       RCOND
COMPLEX    A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N), RES(N)
!
!                                 Set values for A in band form
!
!                                 A = (
!                                     (
!                                     (
!
99999 FORMAT (' RCOND = ',F5.3,/,
END
Output
RCOND = 0.022
L1 condition number = 45.933
                                 AINV
                1                2                3                4
1  ( 0.562, 0.170)  ( 0.125, 0.260)  (-0.385,-0.135)  (-0.239,-1.165)
2  ( 0.122, 0.421)  (-0.195, 0.094)  ( 0.101,-0.289)  ( 0.874,-0.179)
3  ( 0.034, 0.904)  (-0.437, 0.090)  (-0.153,-0.527)  ( 1.087,-1.172)
4  ( 0.938, 0.870)  (-0.347, 0.527)  (-0.679,-0.374)  ( 0.415,-1.759)
LFTCB
Computes the LU factorization of a complex matrix in band storage mode.
Required Arguments
A Complex NLCA + NUCA + 1 by N array containing the N by N matrix in band storage
mode to be factored. (Input)
NLCA Number of lower codiagonals of A. (Input)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFTCB performs an LU factorization of a complex banded coefficient matrix. The LU
factorization is done using scaled partial pivoting. Scaled partial pivoting differs from partial
pivoting in that the pivoting strategy is the same as if each row were scaled to have the same
$\infty$-norm.
LFTCB fails if U, the upper triangular part of the factorization, has a zero diagonal element. This
can occur only if A is singular or very close to a singular matrix.
The LU factors are returned in a form that is compatible with routines LFICB, LFSCB and LFDCB.
To solve systems of equations with multiple right-hand-side vectors, use LFTCB followed by either
LFICB or LFSCB called once for each right-hand side. The routine LFDCB can be called to compute
the determinant of the coefficient matrix after LFTCB has performed the factorization.
Let F be the matrix FACT, let $m_l$ = NLCA and let $m_u$ = NUCA. The first $m_l + m_u + 1$ rows of F
contain the triangular matrix U in band storage form. The lower $m_l$ rows of F contain the
multipliers needed to reconstruct $L^{-1}$. LFTCB is based on the LINPACK routine CGBFA; see
Dongarra et al. (1979). CGBFA uses unscaled partial pivoting.
Comments
1.
2.
Informational error
Type  Code
4     2
Example
A linear system with multiple right-hand sides is solved. LFTCB is called to factor the coefficient
matrix. LFSCB is called to compute the two solutions for the two right-hand sides. In this case the
coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be
better to call LFCCB to perform the factorization, and LFICB to compute the solutions.
USE LFTCB_INT
USE LFSCB_INT
USE WRCRN_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N, NLCA, NUCA
PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
INTEGER    IPVT(N)
COMPLEX    A(LDA,N), B(N,2), FACT(LDFACT,N), X(N,2)
!
!                                 Set values for A in band form, and B
!
!                                 A = (
!                                     (
!                                     (
!
!                                 B = ( -4.0-5.0i   16.0-4.0i  )
!                                     (  9.5+5.5i   -9.5+19.5i )
!                                     (  9.0-9.0i   12.0+12.0i )
!                                     (  0.0+8.0i   -8.0-2.0i  )
!
END
Output
                  X
                1                2
1  ( 3.000, 0.000)  ( 0.000, 4.000)
2  (-1.000, 1.000)  ( 1.000,-1.000)
3  ( 3.000, 0.000)  ( 0.000, 4.000)
4  (-1.000, 1.000)  ( 1.000,-1.000)
LFSCB
Solves a complex system of linear equations given the LU factorization of the coefficient matrix in
band storage mode.
Required Arguments
FACT Complex 2 * NLCA + NUCA + 1 by N array containing the LU factorization of the
coefficient matrix A as output from subroutine LFCCB/DLFCCB or LFTCB/DLFTCB.
(Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization of A
as output from subroutine LFCCB/DLFCCB or LFTCB/DLFTCB. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFSCB computes the solution of a system of linear algebraic equations having a complex
banded coefficient matrix. To compute the solution, the coefficient matrix must first undergo an
LU factorization. This may be done by calling either LFCCB or LFTCB. The solution to Ax = b is
found by solving the banded triangular systems Ly = b and Ux = y. The forward elimination step
consists of solving the system Ly = b by applying the same permutations and elimination
operations to b that were applied to the columns of A in the factorization routine. The backward
substitution step consists of solving the banded triangular system Ux = y for x.
LFSCB and LFICB both solve a linear system given its LU factorization. LFICB generally takes
more time and produces a more accurate answer than LFSCB. Each iteration of the iterative
refinement algorithm used by LFICB calls LFSCB.
LFSCB is based on the LINPACK routine CGBSL; see Dongarra et al. (1979).
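The forward elimination and back substitution steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the IMSL or LINPACK code: it uses dense storage and ordinary (unscaled) partial pivoting, swaps whole rows during the factorization, and therefore applies all recorded interchanges to b before the forward elimination. The function names are hypothetical. The data are the full form of the complex band system solved in the LSACB example.

```python
def lu_factor(a):
    """Factor A with partial pivoting; return the packed LU matrix and pivot rows."""
    n = len(a)
    lu = [list(row) for row in a]
    ipvt = []
    for k in range(n - 1):
        p = max(range(k, n), key=lambda r: abs(lu[r][k]))
        ipvt.append(p)
        lu[k], lu[p] = lu[p], lu[k]          # interchange whole rows
        if lu[k][k] == 0:
            raise ZeroDivisionError("matrix is singular")
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]             # multiplier stored in place of the zero
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    if lu[n - 1][n - 1] == 0:
        raise ZeroDivisionError("matrix is singular")
    return lu, ipvt

def lu_solve(lu, ipvt, b):
    """Solve A x = b from the factors produced by lu_factor."""
    n = len(b)
    y = list(b)
    for k, p in enumerate(ipvt):             # same interchanges as in the factorization
        y[k], y[p] = y[p], y[k]
    for k in range(n - 1):                   # forward elimination: L y = P b
        for i in range(k + 1, n):
            y[i] -= lu[i][k] * y[k]
    x = [0j] * n
    for i in range(n - 1, -1, -1):           # back substitution: U x = y
        x[i] = (y[i] - sum(lu[i][j] * x[j] for j in range(i + 1, n))) / lu[i][i]
    return x

A = [[-2 - 3j, 4, 0, 0],
     [6 + 1j, -0.5 + 3j, -2 + 2j, 0],
     [0, 1 + 1j, 3 - 3j, -4 - 1j],
     [0, 0, 2j, 1 - 1j]]
b = [-10 - 5j, 9.5 + 5.5j, 12 - 12j, 8j]
lu, ipvt = lu_factor(A)
x = lu_solve(lu, ipvt, b)
print(x)
```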
Example
The inverse is computed for a complex banded $4 \times 4$ matrix with one upper and one lower codiagonal.
The input matrix is assumed to be well-conditioned; hence LFTCB is used rather than LFCCB.
USE LFSCB_INT
USE LFTCB_INT
USE WRCRN_INT
!                                 Declare variables
INTEGER    LDA, LDFACT, N, NLCA, NUCA
PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
INTEGER    IPVT(N)
COMPLEX    A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N)
!
END
Output
                 1                 2                 3                 4
1  ( 0.165,-0.341)   ( 0.376,-0.094)   (-0.282, 0.471)   (-1.600, 0.000)
2  ( 0.588,-0.047)   ( 0.259, 0.235)   (-0.494, 0.024)   (-0.800,-1.200)
3  ( 0.318, 0.271)   ( 0.012, 0.247)   (-0.759,-0.235)   (-0.550,-2.250)
4  ( 0.588,-0.047)   ( 0.259, 0.235)   (-0.994, 0.524)   (-2.300,-1.200)
LFICB
Uses iterative refinement to improve the solution of a complex system of linear equations in band
storage mode.
Required Arguments
A Complex NLCA + NUCA + 1 by N array containing the N by N coefficient matrix in band
storage mode. (Input)
NLCA Number of lower codiagonals of A. (Input)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system $AX = B$ is solved.
IPATH = 2 means the system $A^H X = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LFICB (N, A, LDA, NLCA, NUCA, FACT, LDFACT, IPVT, B, IPATH, X,
RES)
Double:
Description
Routine LFICB computes the solution of a system of linear algebraic equations having a complex
banded coefficient matrix. Iterative refinement is performed on the solution vector to improve the
accuracy. Usually almost all of the digits in the solution are accurate, even if the matrix is
somewhat ill-conditioned.
To compute the solution, the coefficient matrix must first undergo an LU factorization. This may
be done by calling either LFCCB or LFTCB.
Iterative refinement fails only if the matrix is very ill-conditioned.
LFICB and LFSCB both solve a linear system given its LU factorization. LFICB generally takes
more time and produces a more accurate answer than LFSCB. Each iteration of the iterative
refinement algorithm used by LFICB calls LFSCB.
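The refinement idea can be sketched without the library. The program below is a minimal illustration, not the LFICB implementation: it uses the full (non-band) form of the 4 × 4 matrix from this routine's example, a small hypothetical dense solver named `solve`, and it assumes the residual is accumulated in double precision. For brevity it refactors the matrix on every pass, whereas LFICB reuses the stored factorization through LFSCB.

```fortran
program refine_sketch
  implicit none
  integer, parameter :: n = 4, dp = kind(1.0d0)
  complex     :: a(n,n), b(n), x(n), d(n)
  complex(dp) :: r(n)
  integer     :: it

  ! Full form of the banded example matrix (one upper, one lower codiagonal).
  a = (0.0, 0.0)
  a(1,1) = (-2.0,-3.0); a(1,2) = ( 4.0, 0.0)
  a(2,1) = ( 6.0, 1.0); a(2,2) = (-0.5, 3.0); a(2,3) = (-2.0, 2.0)
  a(3,2) = ( 1.0, 1.0); a(3,3) = ( 3.0,-3.0); a(3,4) = (-4.0,-1.0)
  a(4,3) = ( 0.0, 2.0); a(4,4) = ( 1.0,-1.0)
  b = [(-10.0,-5.0), (9.5,5.5), (12.0,-12.0), (0.0,8.0)]

  call solve(a, b, x)                      ! initial single-precision solution
  do it = 1, 3                             ! a few refinement passes
     ! Residual r = b - A*x, accumulated in double precision (assumption).
     r = cmplx(b, kind=dp) - matmul(cmplx(a, kind=dp), cmplx(x, kind=dp))
     call solve(a, cmplx(r), d)            ! correction from A*d = r
     x = x + d
  end do
  print '(4("(",f7.3,",",f7.3,") "))', x

contains

  ! Hypothetical helper: dense Gaussian elimination with partial pivoting.
  subroutine solve(a, b, x)
    complex, intent(in)  :: a(:,:), b(:)
    complex, intent(out) :: x(:)
    complex :: w(size(b), size(b)+1), row(size(b)+1)
    integer :: n, i, k, p
    n = size(b)
    w(:, 1:n) = a
    w(:, n+1) = b
    do k = 1, n
       p = k - 1 + maxloc(abs(w(k:n, k)), dim=1)
       row = w(k,:); w(k,:) = w(p,:); w(p,:) = row
       do i = k+1, n
          w(i, k:) = w(i, k:) - (w(i,k)/w(k,k))*w(k, k:)
       end do
    end do
    do i = n, 1, -1                        ! back substitution
       x(i) = (w(i, n+1) - sum(w(i, i+1:n)*x(i+1:n)))/w(i,i)
    end do
  end subroutine solve

end program refine_sketch
```

The printed vector should be close to the X reported for system 1 in the example output.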
Comments
Informational error
Type  Code
3     3     The input matrix is too ill-conditioned for iterative refinement to be effective.
Example
A set of linear systems is solved successively. The right-hand-side vector is perturbed after solving
the system each of the first two times by adding (1 + i)/2 to the second element.
      USE LFICB_INT
      USE LFCCB_INT
      USE WRCRN_INT
      USE UMACH_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       RCOND
      COMPLEX    A(LDA,N), B(N), FACT(LDFACT,N), RES(N), X(N)
!
!                                 Set values for A in band form, and B
!
!                                 A = (  0.0+0.0i  4.0+0.0i -2.0+2.0i -4.0-1.0i )
!                                     ( -2.0-3.0i -0.5+3.0i  3.0-3.0i  1.0-1.0i )
!                                     (  6.0+1.0i  1.0+1.0i  0.0+2.0i  0.0+0.0i )
!
!                                 B = ( -10.0-5.0i  9.5+5.5i  12.0-12.0i  0.0+8.0i )
!
      CALL LFCCB (A, NLCA, NUCA, FACT, IPVT, RCOND)
!     ...
99998 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
99999 FORMAT (//,'  For system ',I1)
      END
Output
RCOND = 0.014
L1 Condition number = 72.414

For system 1
                                 X
               1                2                3                4
( 3.000, 0.000)  (-1.000, 1.000)  ( 3.000, 0.000)  (-1.000, 1.000)
                                RES
                      1                        2
( 0.000E+00, 0.000E+00)  ( 0.000E+00, 0.000E+00)
                      3                        4
( 0.000E+00, 5.684E-14)  ( 3.494E-22,-6.698E-22)

For system 2
                                 X
               1                2                3                4
( 3.235, 0.141)  (-0.988, 1.247)  ( 2.882, 0.129)  (-0.988, 1.247)
                                RES
                      1                        2
(-1.402E-08, 6.486E-09)  (-7.012E-10, 4.488E-08)
                      3                        4
(-1.122E-07, 7.188E-09)  (-7.012E-10, 4.488E-08)

For system 3
                                 X
               1                2                3                4
( 3.471, 0.282)  (-0.976, 1.494)  ( 2.765, 0.259)  (-0.976, 1.494)
                                RES
                      1                        2
(-2.805E-08, 1.297E-08)  (-1.402E-09,-2.945E-08)
                      3                        4
( 1.402E-08, 1.438E-08)  (-1.402E-09,-2.945E-08)
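The band form of A used in the example can be unpacked into a full matrix to check the reported solution. The sketch below assumes the mapping A(i,j) = AB(NUCA+1+i-j, j), which is consistent with the layout of the example (upper codiagonal in row 1, diagonal in row 2, lower codiagonal in row 3); the array name `ab` and the program itself are illustrative only.

```fortran
program band_to_full
  implicit none
  integer, parameter :: n = 4, nlca = 1, nuca = 1
  complex :: ab(nlca+nuca+1, n), a(n,n), x(n), b(n)
  integer :: i, j

  ! Band form as listed in the example, one array column per statement.
  ab(:,1) = [( 0.0, 0.0), (-2.0,-3.0), (6.0, 1.0)]
  ab(:,2) = [( 4.0, 0.0), (-0.5, 3.0), (1.0, 1.0)]
  ab(:,3) = [(-2.0, 2.0), ( 3.0,-3.0), (0.0, 2.0)]
  ab(:,4) = [(-4.0,-1.0), ( 1.0,-1.0), (0.0, 0.0)]

  ! Assumed mapping: A(i,j) is held in ab(nuca+1+i-j, j).
  a = (0.0, 0.0)
  do j = 1, n
     do i = max(1, j-nuca), min(n, j+nlca)
        a(i,j) = ab(nuca+1+i-j, j)
     end do
  end do

  ! Solution and right-hand side of system 1 from the example.
  x = [(3.0,0.0), (-1.0,1.0), (3.0,0.0), (-1.0,1.0)]
  b = [(-10.0,-5.0), (9.5,5.5), (12.0,-12.0), (0.0,8.0)]
  print *, 'max |A*x - b| =', maxval(abs(matmul(a, x) - b))
end program band_to_full
```

A residual at or near zero indicates that the assumed mapping reproduces the system solved in the example.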
LFDCB
Computes the determinant of a complex matrix given the LU factorization of the matrix in band
storage mode.
Required Arguments
FACT Complex (2 * NLCA + NUCA + 1) by N array containing the LU factorization of the
matrix A as output from routine LFTCB/DLFTCB or LFCCB/DLFCCB. (Input)
NLCA Number of lower codiagonals in matrix A. (Input)
NUCA Number of upper codiagonals in matrix A. (Input)
IPVT Vector of length N containing the pivoting information for the LU factorization as
output from routine LFTCB/DLFTCB or LFCCB/DLFCCB. (Input)
DET1 Complex scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that $1.0 \le |\mathrm{DET1}| < 10.0$ or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form $\det(A) = \mathrm{DET1} \times 10^{\mathrm{DET2}}$.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LFDCB (N, FACT, LDFACT, NLCA, NUCA, IPVT, DET1, DET2)
Double:
Description
Routine LFDCB computes the determinant of a complex banded coefficient matrix. To compute the
determinant, the coefficient matrix must first undergo an LU factorization. This may be done by
calling either LFCCB or LFTCB. The formula $\det A = \det L \, \det U$ is used to compute the
determinant. Since the determinant of a triangular matrix is the product of the diagonal elements,
$$\det U = \prod_{i=1}^{N} U_{ii}$$
(The matrix U is stored in the upper NUCA + NLCA + 1 rows of FACT as a banded matrix.) Since L
is the product of triangular matrices with unit diagonals and of permutation matrices, $\det L = (-1)^k$,
where k is the number of pivoting interchanges.
LFDCB is based on the LINPACK routine CGBDI; see Dongarra et al. (1979).
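The formula above can be illustrated with a small self-contained program. This is a sketch only: it applies dense Gaussian elimination with partial pivoting to the full form of the example matrix instead of calling LFTCB, counts the row interchanges k, and normalizes the product after it has been formed (a production routine would normalize while accumulating to avoid overflow).

```fortran
program det_from_lu
  implicit none
  integer, parameter :: n = 4
  complex :: u(n,n), row(n), det1
  real    :: det2
  integer :: i, k, p, nswap

  ! Full form of the banded example matrix.
  u = (0.0, 0.0)
  u(1,1) = (-2.0,-3.0); u(1,2) = ( 4.0, 0.0)
  u(2,1) = ( 6.0, 1.0); u(2,2) = (-0.5, 3.0); u(2,3) = (-2.0, 2.0)
  u(3,2) = ( 1.0, 1.0); u(3,3) = ( 3.0,-3.0); u(3,4) = (-4.0,-1.0)
  u(4,3) = ( 0.0, 2.0); u(4,4) = ( 1.0,-1.0)

  ! Reduce to upper triangular form, counting interchanges.
  nswap = 0
  do k = 1, n-1
     p = k - 1 + maxloc(abs(u(k:n,k)), dim=1)
     if (p /= k) then
        row = u(k,:); u(k,:) = u(p,:); u(p,:) = row
        nswap = nswap + 1
     end if
     do i = k+1, n
        u(i,k:) = u(i,k:) - (u(i,k)/u(k,k))*u(k,k:)
     end do
  end do

  ! det A = (-1)**k * product of the diagonal of U.
  det1 = (-1.0)**nswap
  do i = 1, n
     det1 = det1*u(i,i)
  end do

  ! Normalize so that 1.0 <= |det1| < 10.0 (or det1 = 0.0).
  det2 = 0.0
  if (abs(det1) /= 0.0) then
     do while (abs(det1) >= 10.0)
        det1 = det1/10.0; det2 = det2 + 1.0
     end do
     do while (abs(det1) < 1.0)
        det1 = det1*10.0; det2 = det2 - 1.0
     end do
  end if
  print '(a,"(",f6.3,",",f6.3,") * 10**",f2.0)', &
        ' The determinant of A is ', det1, det2
end program det_from_lu
```

Up to rounding, the mantissa and exponent should agree with the values shown in the Output of the example.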
Example
The determinant is computed for a complex banded 4 × 4 matrix with one upper and one lower
codiagonal.
      USE LFDCB_INT
      USE LFTCB_INT
      USE UMACH_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NLCA, NUCA, NOUT
      PARAMETER  (LDA=3, LDFACT=4, N=4, NLCA=1, NUCA=1)
      INTEGER    IPVT(N)
      REAL       DET2
      COMPLEX    A(LDA,N), DET1, FACT(LDFACT,N)
!
!                                 Set values for A in band form
!
!                                 A = (  0.0+0.0i  4.0+0.0i -2.0+2.0i -4.0-1.0i )
!                                     ( -2.0-3.0i -0.5+3.0i  3.0-3.0i  1.0-1.0i )
!                                     (  6.0+1.0i  1.0+1.0i  0.0+2.0i  0.0+0.0i )
!     ...
99999 FORMAT (' The determinant of A is (', F6.3, ',', F6.3, ') * 10**',&
             F2.0)
      END
Output
The determinant of A is ( 2.500,-1.500) * 10**1.
LSAQH
Solves a complex Hermitian positive definite system of linear equations in band Hermitian storage
mode with iterative refinement.
Required Arguments
A Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian
coefficient matrix in band Hermitian storage mode. (Input)
NCODA Number of upper or lower codiagonals of A. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSAQH solves a system of linear algebraic equations having a complex Hermitian positive
definite band coefficient matrix. It first uses the IMSL routine LFCQH to compute an $R^H R$
Cholesky factorization of the coefficient matrix and to estimate the condition number of the
matrix. R is an upper triangular band matrix. The solution of the linear system is then found using
the iterative refinement IMSL routine LFIQH.
LSAQH fails if any submatrix of R is not positive definite, if R has a zero diagonal element, or if the
iterative refinement algorithm fails to converge. These errors occur only if the matrix A either is
very close to a singular matrix or is a matrix that is not positive definite.
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system. LSAQH solves the
problem that is represented in the computer; however, this problem may differ from the problem
whose solution is desired.
Comments
1.
2.
3.
Informational errors
Type
Code
3
4
4
2
4
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2AQH the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSAQH.
Additional memory allocation for FACT and option value restoration are done
automatically in LSAQH. Users directly calling L2AQH can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSAQH or L2AQH. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSAQH temporarily replaces IVAL(2) by IVAL(1). The
routine L2CQH computes the condition number if IVAL(2) = 2. Otherwise L2CQH
skips this computation. LSAQH restores the option. Default values for the option
are IVAL(*) = 1, 2.
Example
A system of five linear equations is solved. The coefficient matrix has complex Hermitian positive
definite band form with one codiagonal and the right-hand-side vector b has five elements.
      USE LSAQH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N, NCODA
      PARAMETER  (LDA=2, N=5, NCODA=1)
      COMPLEX    A(LDA,N), B(N), X(N)
!
!                                 A = ( ...  0.0+4.0i  1.0+1.0i )
!                                     ( ...  6.0+0.0i  9.0+0.0i )
!     ...
      END
Output
                                 X
               1                2                3                4
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)
               5
( 3.000, 2.000)
LSLQH
Solves a complex Hermitian positive definite system of linear equations in band Hermitian storage
mode without iterative refinement.
Required Arguments
A Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian
coefficient matrix in band Hermitian storage mode. (Input)
NCODA Number of upper or lower codiagonals of A. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLQH solves a system of linear algebraic equations having a complex Hermitian positive
definite band coefficient matrix. It first uses the routine LFCQH to compute an $R^H R$ Cholesky
factorization of the coefficient matrix and to estimate the condition number of the matrix. R is an
upper triangular band matrix. The solution of the linear system is then found using the routine
LFSQH.
LSLQH fails if any submatrix of R is not positive definite or if R has a zero diagonal element.
These errors occur only if A either is very close to a singular matrix or is a matrix that is not
positive definite.
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. If the coefficient matrix is ill-conditioned or poorly scaled, it is recommended that
LSAQH be used.
Comments
1.
2.
3.
Informational errors
Type
Code
3
4
4
2
4
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2LQH the leading dimension of FACT is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSLQH.
Additional memory allocation for FACT and option value restoration are done
automatically in LSLQH. Users directly calling L2LQH can allocate additional
space for FACT and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSLQH or L2LQH. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSLQH temporarily replaces IVAL(2) by IVAL(1). The
routine L2CQH computes the condition number if IVAL(2) = 2. Otherwise L2CQH
skips this computation. LSLQH restores the option. Default values for the option
are IVAL(*) = 1, 2.
Example
A system of five linear equations is solved. The coefficient matrix has complex Hermitian positive
definite band form with one codiagonal and the right-hand-side vector b has five elements.
      USE LSLQH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    N, NCODA, LDA
      PARAMETER  (N=5, NCODA=1, LDA=NCODA+1)
      COMPLEX    A(LDA,N), B(N), X(N)
!
!                                 A = ( ...  0.0+4.0i  1.0+1.0i )
!                                     ( ...  6.0+0.0i  9.0+0.0i )
!
!                                 B = ( 1.0+5.0i  12.0-6.0i  ... )
!     ...
Output
                                 X
               1                2                3                4
( 2.000, 1.000)  ( 3.000, 0.000)  (-1.000,-1.000)  ( 0.000,-2.000)
               5
( 3.000, 2.000)
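The solution above can be checked directly in band Hermitian storage. The sketch below makes two assumptions that are not stated in this section: the matrix values are those of the DATA statement in the LFSQH example (whose trailing entries match the fragments shown here), and for i ≤ j the element A(i,j) is held in row NCODA+1+i-j of column j, with the lower triangle implied by conjugate symmetry. The right-hand side is taken from the last two columns of the LSLQB example array. All program and variable names are illustrative.

```fortran
program band_hermitian_check
  implicit none
  integer, parameter :: n = 5, ncoda = 1
  complex :: ah(ncoda+1, n), x(n), b(n), y(n)
  integer :: i, j

  ! Row 1 holds the codiagonal, row 2 the (real) diagonal.
  ah(1,:) = [(0.0,0.0), (-1.0,1.0), ( 1.0,2.0), (0.0,4.0), (1.0,1.0)]
  ah(2,:) = [(2.0,0.0), ( 4.0,0.0), (10.0,0.0), (6.0,0.0), (9.0,0.0)]
  x = [(2.0,1.0), (3.0,0.0), (-1.0,-1.0), (0.0,-2.0), (3.0,2.0)]
  b = [(1.0,5.0), (12.0,-6.0), (1.0,-16.0), (-3.0,-3.0), (25.0,16.0)]

  ! y = A*x using only the stored upper band:
  ! A(i,j) = ah(ncoda+1+i-j, j) for i <= j, and A(j,i) = conjg(A(i,j)).
  y = (0.0, 0.0)
  do j = 1, n
     do i = max(1, j-ncoda), j
        y(i) = y(i) + ah(ncoda+1+i-j, j)*x(j)
        if (i < j) y(j) = y(j) + conjg(ah(ncoda+1+i-j, j))*x(i)
     end do
  end do
  print *, 'max |A*x - b| =', maxval(abs(y - b))
end program band_hermitian_check
```

Only NCODA + 1 rows are stored, so the product touches each stored element once for the upper triangle and once, conjugated, for the lower triangle.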
LSLQB
Computes the $R^H D R$ Cholesky factorization of a complex Hermitian positive-definite matrix A in
codiagonal band Hermitian storage mode and solves a system Ax = b.
Required Arguments
A Array containing the N by N positive-definite band coefficient matrix and the right hand
side in codiagonal band Hermitian storage mode. (Input/Output)
The number of array columns must be at least 2 * NCODA + 3. The number of columns
is not an input to this subprogram.
NCODA Number of upper codiagonals of matrix A. (Input)
Must satisfy NCODA ≥ 0 and NCODA < N.
U Array of flags that indicate any singularities of A, namely loss of positive-definiteness of
a leading minor. (Output)
A value U(I) = 0. means that the leading minor of dimension I is not positive-definite.
Otherwise, U(I) = 1.
Optional Arguments
N Order of the matrix. (Input)
Must satisfy N > 0.
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Must satisfy LDA ≥ N + NCODA.
Default: LDA = size (A,1).
IJOB flag to direct the desired factorization or solving step. (Input)
Default: IJOB =1.
IJOB Meaning
factor the matrix A and solve the system Ax = b; where the real part of b is
stored in column 2 * NCODA + 2 and the imaginary part of b is stored in column
2 * NCODA + 3 of array A. The real and imaginary parts of b are overwritten by
the real and imaginary parts of x.
solve step only. Use the real part of b as column 2 * NCODA + 2 and the
imaginary part of b as column 2 * NCODA + 3 of A. (The factorization step has
already been done.) The real and imaginary parts of b are overwritten by the real
and imaginary parts of x.
4,5,6 same meaning as with the value IJOB = 3. For efficiency, no error checking is
done on values LDA, N, NCODA, and U(*).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSLQB factors and solves the Hermitian positive definite banded linear system Ax = b.
The matrix is factored so that $A = R^H D R$, where R is unit upper triangular and D is diagonal and
real. The reciprocals of the diagonal entries of D are computed and saved to make the solving step
more efficient. Errors will occur if D has a nonpositive diagonal element. Such events occur only
if A is very close to a singular matrix or is not positive definite.
LSLQB is efficient for problems with a small band width. The particular cases NCODA = 0, 1 are
done with special loops within the code. These cases will give good performance. See Hanson
(1989) for more on the algorithm. When solving tridiagonal systems, NCODA = 1, the cyclic
reduction code LSLCQ should be considered as an alternative. The expectation is that LSLCQ will
outperform LSLQB on vector or parallel computers. It may be inferior on scalar computers or even
parallel computers with non-optimizing compilers.
Comments
1.
2.
Informational error
Type
Code
4
Example
A system of five linear equations is solved. The coefficient matrix has real positive definite
codiagonal Hermitian band form and the right-hand-side vector b has five elements.
      USE LSLQB_INT
      USE WRRRN_INT
      INTEGER    LDA, N, NCODA
      PARAMETER  (N=5, NCODA=1, LDA=N+NCODA)
!
      INTEGER    I, IJOB, J
      REAL       A(LDA,2*NCODA+3), U(N)
!
!                                 (   *      *      *      *      *  )
!                                 (  2.0     *      *     1.0    5.0 )
!                                 (  4.0   -1.0    1.0   12.0   -6.0 )
!                                 ( 10.0    1.0    2.0    1.0  -16.0 )
!                                 (  6.0    0.0    4.0   -3.0   -3.0 )
!                                 (  9.0    1.0    1.0   25.0   16.0 )
!     ...
!                                 Print results
      CALL WRRRN ('REAL(X)', A((NCODA+1):,(2*NCODA+2):), 1, N, 1)
      CALL WRRRN ('IMAG(X)', A((NCODA+1):,(2*NCODA+3):), 1, N, 1)
      END
Output
                REAL(X)
     1       2       3       4       5
 2.000   3.000  -1.000   0.000   3.000

                IMAG(X)
     1       2       3       4       5
 1.000   0.000  -1.000  -2.000   2.000
LFCQH
Computes the $R^H R$ factorization of a complex Hermitian positive definite matrix in band
Hermitian storage mode and estimates its L1 condition number.
Required Arguments
A Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian
matrix to be factored in band Hermitian storage mode. (Input)
NCODA Number of upper or lower codiagonals of A. (Input)
FACT Complex NCODA + 1 by N array containing the $R^H R$ factorization of the matrix A.
(Output)
If A is not needed, A and FACT can share the same storage locations.
RCOND Scalar containing an estimate of the reciprocal of the L1 condition number of A.
(Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFCQH computes an $R^H R$ Cholesky factorization and estimates the condition number of a
complex Hermitian positive definite band coefficient matrix. R is an upper triangular band matrix.
The L1 condition number of the matrix A is defined to be $\kappa(A) = \|A\|_1 \|A^{-1}\|_1$. Since it is expensive
to compute $\|A^{-1}\|_1$, the condition number is only estimated. The estimation algorithm is the same as
used by LINPACK and is described by Cline et al. (1979).
If the estimated condition number is greater than 1/ε (where ε is machine precision), a warning
error is issued. This indicates that very small changes in A can cause very large changes in the
solution x. Iterative refinement can sometimes find the solution to such a system.
LFCQH fails if any submatrix of R is not positive definite or if R has a zero diagonal element.
These errors occur only if A either is very close to a singular matrix or is a matrix which is not
positive definite.
The $R^H R$ factors are returned in a form that is compatible with routines LFIQH, LFSQH and
LFDQH. To solve systems of equations with multiple right-hand-side vectors, use LFCQH followed
by either LFIQH or LFSQH called once for each right-hand side. The routine LFDQH can be called
to compute the determinant of the coefficient matrix after LFCQH has performed the factorization.
LFCQH is based on the LINPACK routine CPBCO; see Dongarra et al. (1979).
Comments
1.
2.
Informational errors
Type
Code
3
3
1
4
4
4
2
4
Example
The inverse of a 5 × 5 band Hermitian matrix with one codiagonal is computed. LFCQH is called to
factor the matrix and to check for nonpositive definiteness or ill-conditioning. LFIQH is called to
determine the columns of the inverse.
      USE LFCQH_INT
      USE LFIQH_INT
      USE UMACH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    N, NCODA, LDA, LDFACT, NOUT
      PARAMETER  (N=5, NCODA=1, LDA=NCODA+1, LDFACT=LDA)
      REAL       RCOND
      COMPLEX    A(LDA,N), AINV(N,N), FACT(LDFACT,N), RES(N), RJ(N)
!
!                                 A = ( ...  0.0+4.0i  1.0+1.0i )
!                                     ( ...  6.0+0.0i  9.0+0.0i )
!     ...
99999 FORMAT ('  RCOND = ',F5.3,/,'  L1 Condition number = ',F6.3)
      END
Output
RCOND = 0.067
L1 Condition number = 14.961

                                      AINV
                   1                   2                   3                   4
1  ( 0.7166, 0.0000)   ( 0.2166,-0.2166)   (-0.0899,-0.0300)   (-0.0207, 0.0622)
2  ( 0.2166, 0.2166)   ( 0.4332, 0.0000)   (-0.0599,-0.1198)   (-0.0829, 0.0415)
3  (-0.0899, 0.0300)   (-0.0599, 0.1198)   ( 0.1797, 0.0000)   ( 0.0000,-0.1244)
4  (-0.0207,-0.0622)   (-0.0829,-0.0415)   ( 0.0000, 0.1244)   ( 0.2592, 0.0000)
5  ( 0.0092, 0.0046)   ( 0.0138,-0.0046)   (-0.0138,-0.0138)   (-0.0288, 0.0288)
                   5
1  ( 0.0092,-0.0046)
2  ( 0.0138, 0.0046)
3  (-0.0138, 0.0138)
4  (-0.0288,-0.0288)
5  ( 0.1175, 0.0000)
LFTQH
Computes the $R^H R$ factorization of a complex Hermitian positive definite matrix in band
Hermitian storage mode.
Required Arguments
A Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian
matrix to be factored in band Hermitian storage mode. (Input)
NCODA Number of upper or lower codiagonals of A. (Input)
FACT Complex NCODA + 1 by N array containing the $R^H R$ factorization of the matrix A.
(Output)
If A is not needed, A and FACT can share the same storage locations.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFTQH computes an $R^H R$ Cholesky factorization of a complex Hermitian positive definite
band coefficient matrix. R is an upper triangular band matrix.
LFTQH fails if any submatrix of R is not positive definite or if R has a zero diagonal element.
These errors occur only if A either is very close to a singular matrix or is a matrix which is not
positive definite.
The $R^H R$ factors are returned in a form that is compatible with routines LFIQH, LFSQH and
LFDQH. To solve systems of equations with multiple right-hand-side vectors, use LFTQH followed
by either LFIQH or LFSQH called once for each right-hand side. The routine LFDQH can be called
to compute the determinant of the coefficient matrix after LFTQH has performed the factorization.
LFTQH is based on the LINPACK routine CPBFA; see Dongarra et al. (1979).
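The factorization itself can be sketched on a dense copy of the matrix. The program below is an illustration under assumptions, not the LFTQH algorithm: it ignores the band structure, uses the matrix values from the DATA statement of the LFSQH example, and stops if a nonpositive pivot appears (the situation the Description above attributes to a matrix that is not positive definite or is nearly singular).

```fortran
program cholesky_sketch
  implicit none
  integer, parameter :: n = 5
  complex :: a(n,n), r(n,n)
  real    :: s
  integer :: i, j

  ! Dense Hermitian tridiagonal matrix (values assumed from the LFSQH example).
  a = (0.0, 0.0)
  a(1,1) = (2.0,0.0); a(2,2) = (4.0,0.0); a(3,3) = (10.0,0.0)
  a(4,4) = (6.0,0.0); a(5,5) = (9.0,0.0)
  a(1,2) = (-1.0,1.0); a(2,3) = (1.0,2.0); a(3,4) = (0.0,4.0); a(4,5) = (1.0,1.0)
  do j = 2, n
     a(j,j-1) = conjg(a(j-1,j))            ! lower triangle by symmetry
  end do

  ! Column-by-column Cholesky: A = R^H R with R upper triangular and a
  ! real positive diagonal. dot_product conjugates its first argument.
  r = (0.0, 0.0)
  do j = 1, n
     do i = 1, j-1
        r(i,j) = (a(i,j) - dot_product(r(1:i-1,i), r(1:i-1,j)))/r(i,i)
     end do
     s = real(a(j,j)) - real(dot_product(r(1:j-1,j), r(1:j-1,j)))
     if (s <= 0.0) error stop 'matrix is not positive definite'
     r(j,j) = sqrt(s)
  end do

  print *, 'max |R^H R - A| =', &
           maxval(abs(matmul(conjg(transpose(r)), r) - a))
end program cholesky_sketch
```

For a band matrix the inner products only involve the last NCODA entries of each column, which is what makes the banded routines cheaper than this dense version.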
Comments
Informational errors
Type  Code
3     4     The input matrix is not Hermitian. It has a diagonal entry with a small
            imaginary part.
4     2     The input matrix is not positive definite.
4     4     The input matrix is not Hermitian. It has a diagonal entry with an imaginary
            part.
Example
The inverse of a 5 × 5 band Hermitian matrix with one codiagonal is computed. LFTQH is called to
factor the matrix and to check for nonpositive definiteness. LFSQH is called to determine the
columns of the inverse.
      USE LFTQH_INT
      USE LFSQH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NCODA
      PARAMETER  (LDA=2, LDFACT=2, N=5, NCODA=1)
      COMPLEX    A(LDA,N), AINV(N,N), FACT(LDFACT,N), RJ(N)
!
!                                 A = ( ...  0.0+4.0i  1.0+1.0i )
!                                     ( ...  6.0+0.0i  9.0+0.0i )
!     ...
      END
Output
                                      AINV
                   1                   2                   3                   4
1  ( 0.7166, 0.0000)   ( 0.2166,-0.2166)   (-0.0899,-0.0300)   (-0.0207, 0.0622)
2  ( 0.2166, 0.2166)   ( 0.4332, 0.0000)   (-0.0599,-0.1198)   (-0.0829, 0.0415)
3  (-0.0899, 0.0300)   (-0.0599, 0.1198)   ( 0.1797, 0.0000)   ( 0.0000,-0.1244)
4  (-0.0207,-0.0622)   (-0.0829,-0.0415)   ( 0.0000, 0.1244)   ( 0.2592, 0.0000)
5  ( 0.0092, 0.0046)   ( 0.0138,-0.0046)   (-0.0138,-0.0138)   (-0.0288, 0.0288)
                   5
1  ( 0.0092,-0.0046)
2  ( 0.0138, 0.0046)
3  (-0.0138, 0.0138)
4  (-0.0288,-0.0288)
5  ( 0.1175, 0.0000)
LFSQH
Solves a complex Hermitian positive definite system of linear equations given the factorization of
the coefficient matrix in band Hermitian storage mode.
Required Arguments
FACT Complex NCODA + 1 by N array containing the $R^H R$ factorization of the Hermitian
positive definite band matrix A. (Input)
FACT is obtained as output from routine LFCQH/DLFCQH or LFTQH/DLFTQH .
NCODA Number of upper or lower codiagonals of A. (Input)
B Complex vector of length N containing the right-hand-side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
If B is not needed, B and X can share the same storage locations.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
This routine computes the solution for a system of linear algebraic equations having a complex
Hermitian positive definite band coefficient matrix. To compute the solution, the coefficient
matrix must first undergo an $R^H R$ factorization. This may be done by calling either IMSL routine
LFCQH or LFTQH. R is an upper triangular band matrix.
The solution to Ax = b is found by solving the triangular systems $R^H y = b$ and $Rx = y$.
LFSQH and LFIQH both solve a linear system given its $R^H R$ factorization. LFIQH generally takes
more time and produces a more accurate answer than LFSQH. Each iteration of the iterative
refinement algorithm used by LFIQH calls LFSQH.
LFSQH is based on the LINPACK routine CPBSL; see Dongarra et al. (1979).
Comments
Informational error
Type  Code
4     1     The factored matrix has a diagonal element close to zero.
Example
A set of linear systems is solved successively. LFTQH is called to factor the coefficient matrix.
LFSQH is called to compute the three solutions for the three right-hand sides. In this case the
coefficient matrix is assumed to be well-conditioned and correctly scaled. Otherwise, it would be
better to call LFCQH to perform the factorization, and LFIQH to compute the solutions.
      USE LFSQH_INT
      USE LFTQH_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NCODA
      PARAMETER  (LDA=2, LDFACT=2, N=5, NCODA=1)
      COMPLEX    A(LDA,N), B(N,3), FACT(LDFACT,N), X(N,3)
!
!                 A = ( 0.0+0.0i -1.0+1.0i  1.0+2.0i  0.0+4.0i  1.0+1.0i )
!                     ( 2.0+0.0i  4.0+0.0i 10.0+0.0i  6.0+0.0i  9.0+0.0i )
!
!                 B = (  3.0+3.0i    4.0+0.0i   29.0-9.0i  )
!                     (  5.0-5.0i   15.0-10.0i -36.0-17.0i )
!                     (  5.0+4.0i  -12.0-56.0i -15.0-24.0i )
!                     (  9.0+7.0i  -12.0+10.0i -23.0-15.0i )
!                     (-22.0+1.0i    3.0-1.0i  -23.0-28.0i )
!
      DATA A/(0.0,0.0), (2.0,0.0), (-1.0,1.0), (4.0, 0.0), (1.0,2.0),&
            (10.0,0.0), (0.0,4.0), (6.0,0.0), (1.0,1.0), (9.0,0.0)/
      DATA B/(3.0,3.0), (5.0,-5.0), (5.0,4.0), (9.0,7.0), (-22.0,1.0),&
            (4.0,0.0), (15.0,-10.0), (-12.0,-56.0), (-12.0,10.0),&
            (3.0,-1.0), (29.0,-9.0), (-36.0,-17.0), (-15.0,-24.0),&
            (-23.0,-15.0), (-23.0,-28.0)/
!                                 Factor the matrix A
!     ...
Output
                             X
                 1                  2                  3
1  (  1.00,  0.00)   (  3.00, -1.00)   ( 11.00, -1.00)
2  (  1.00, -2.00)   (  2.00,  0.00)   ( -7.00,  0.00)
3  (  2.00,  0.00)   ( -1.00, -6.00)   ( -2.00, -3.00)
4  (  2.00,  3.00)   (  2.00,  1.00)   ( -2.00, -3.00)
5  ( -3.00,  0.00)   (  0.00,  0.00)   ( -2.00, -3.00)
LFIQH
Uses iterative refinement to improve the solution of a complex Hermitian positive definite system
of linear equations in band Hermitian storage mode.
Required Arguments
A Complex NCODA + 1 by N array containing the N by N positive definite band Hermitian
coefficient matrix in band Hermitian storage mode. (Input)
NCODA Number of upper or lower codiagonals of A. (Input)
FACT Complex NCODA + 1 by N array containing the $R^H R$ factorization of the matrix A as
output from routine LFCQH/DLFCQH or LFTQH/DLFTQH. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
RES Complex vector of length N containing the residual vector at the improved solution.
(Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
This routine computes the solution for a system of linear algebraic equations having a complex
Hermitian positive definite band coefficient matrix. To compute the solution, the coefficient
matrix must first undergo an $R^H R$ factorization. This may be done by calling either IMSL routine
LFCQH or LFTQH. R is an upper triangular band matrix.
The solution to Ax = b is found by solving the triangular systems $R^H y = b$ and $Rx = y$.
LFSQH and LFIQH both solve a linear system given its $R^H R$ factorization. LFIQH generally takes
more time and produces a more accurate answer than LFSQH. Each iteration of the iterative
refinement algorithm used by LFIQH calls LFSQH.
Comments
Informational error
Type  Code
4     1     The factored matrix has a diagonal element close to zero.
Example
A set of linear systems is solved successively. The right-hand side vector is perturbed after solving
the system each of the first two times by adding (1 + i)/2 to the second element.
      USE IMSL_LIBRARIES
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NCODA
      PARAMETER  (LDA=2, LDFACT=2, N=5, NCODA=1)
      REAL       RCOND
      COMPLEX    A(LDA,N), B(N), FACT(LDFACT,N), RES(N,3), X(N,3)
!
!                                 A = ( ...  0.0+4.0i  1.0+1.0i )
!                                     ( ...  6.0+0.0i  9.0+0.0i )
!
!                                 B = ( 3.0+3.0i  5.0-5.0i  ... )
!     ...
Output
                 1                  2                  3
1  (  1.00,  0.00)   (  3.00, -1.00)   ( 11.00, -1.00)
2  (  1.00, -2.00)   (  2.00,  0.00)   ( -7.00,  0.00)
3  (  2.00,  0.00)   ( -1.00, -6.00)   ( -2.00, -3.00)
4  (  2.00,  3.00)   (  2.00,  1.00)   ( -2.00, -3.00)
5  ( -3.00,  0.00)   (  0.00,  0.00)   ( -2.00, -3.00)
LFDQH
Computes the determinant of a complex Hermitian positive definite matrix given the $R^H R$
Cholesky factorization in band Hermitian storage mode.
Required Arguments
FACT Complex NCODA + 1 by N array containing the $R^H R$ factorization of the Hermitian
positive definite band matrix A. (Input)
FACT is obtained as output from routine LFCQH/DLFCQH or LFTQH/DLFTQH.
NCODA Number of upper or lower codiagonals of A. (Input)
DET1 Scalar containing the mantissa of the determinant. (Output)
The value DET1 is normalized so that $1.0 \le |\mathrm{DET1}| < 10.0$ or DET1 = 0.0.
DET2 Scalar containing the exponent of the determinant. (Output)
The determinant is returned in the form $\det(A) = \mathrm{DET1} \times 10^{\mathrm{DET2}}$.
Optional Arguments
N Number of equations. (Input)
Default: N = size (FACT,2).
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LFDQH computes the determinant of a complex Hermitian positive definite band
coefficient matrix. To compute the determinant, the coefficient matrix must first undergo an
$R^H R$ factorization. This may be done by calling either LFCQH or LFTQH. The formula
$\det A = \det R^H \, \det R = (\det R)^2$ is used to compute the determinant. Since the determinant of a
triangular matrix is the product of the diagonal elements,
$$\det R = \prod_{i=1}^{N} R_{ii}$$
LFDQH is based on the LINPACK routine CPBDI; see Dongarra et al. (1979).
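As a worked check of the formula, assume the example matrix is the Hermitian tridiagonal matrix given by the DATA statement of the LFSQH example (diagonal 2, 4, 10, 6, 9 and codiagonal entries $-1+i$, $1+2i$, $4i$, $1+i$, whose squared moduli are 2, 5, 16, 2). For a tridiagonal matrix the squared diagonal of R satisfies $R_{ii}^2 = A_{ii} - |A_{i-1,i}|^2 / R_{i-1,i-1}^2$, so:

```latex
\begin{aligned}
R_{11}^2 &= 2, \qquad
R_{22}^2 = 4 - \tfrac{2}{2} = 3, \qquad
R_{33}^2 = 10 - \tfrac{5}{3} = \tfrac{25}{3}, \\
R_{44}^2 &= 6 - \tfrac{16}{25/3} = \tfrac{102}{25}, \qquad
R_{55}^2 = 9 - \tfrac{2}{102/25} = \tfrac{434}{51}, \\
\det A &= \prod_{i=1}^{5} R_{ii}^2
        = 2 \cdot 3 \cdot \tfrac{25}{3} \cdot \tfrac{102}{25} \cdot \tfrac{434}{51}
        = 1736 = 1.736 \times 10^{3}.
\end{aligned}
```

Under that assumption the result corresponds to DET1 = 1.736 and DET2 = 3, in agreement with the Output of the example.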
Example
The determinant is computed for a 5 × 5 complex Hermitian positive definite band matrix with one
codiagonal.
      USE LFDQH_INT
      USE LFTQH_INT
      USE UMACH_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N, NCODA, NOUT
      PARAMETER  (LDA=2, N=5, LDFACT=2, NCODA=1)
      REAL       DET1, DET2
      COMPLEX    A(LDA,N), FACT(LDFACT,N)
!
!                                 A = ( ...  0.0+4.0i  1.0+1.0i )
!                                     ( ...  6.0+0.0i  9.0+0.0i )
!     ...
99999 FORMAT (' The determinant of A is ',F6.3,' * 10**',F2.0)
      END
Output
The determinant of A is  1.736 * 10**3.
LSLXG
Solves a sparse system of linear algebraic equations by Gaussian elimination.
Required Arguments
A Vector of length NZ containing the nonzero coefficients of the linear system. (Input)
IROW Vector of length NZ containing the row numbers of the corresponding elements in
A. (Input)
JCOL Vector of length NZ containing the column numbers of the corresponding elements
in A. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (B,1).
NZ The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system Ax = b is solved.
IPATH = 2 means the system A^T x = b is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Consider the linear equation
Ax = b
where A is an n × n sparse matrix. The sparse coordinate format for the matrix A requires one real
and two integer vectors. The real array a contains all the nonzeros in A. Let the number of
nonzeros be nz. The two integer arrays irow and jcol, each of length nz, contain the row and
column numbers for these entries in A. That is
    A_{irow(i), jcol(i)} = a(i),    i = 1, …, nz
with all other entries in A zero. The algorithm can be expressed as
    P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz
strategy (Duff et al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
Comments
1.
2. Informational errors
   Type   Code
   3      1
   3      2
   3      3
3. If the default parameters are desired for LSLXG, then set IPARAM(1) to zero and call the
routine LSLXG. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling LSLXG.
CALL L4LXG (IPARAM, RPARAM)
IPARAM(2)
IPARAM(5)
integer
When L2LXG is called, the values of LWK and LIWK are used instead of
IPARAM(5).
Default: 0.
IPARAM(6) = Iterative refinement is done when this is nonzero.
Default: 0.
must be bigger than the largest element in absolute value in its row
divided by RPARAM(2).
Default: 10.0.
RPARAM(3) = Drop-tolerance. Any element in the lower triangular factor L will
Large value of the growth factor indicates that an appreciable error in the
computed solution is possible.
RPARAM(5) = The value of the smallest pivotal element in absolute value.
(Output)
If double precision is required, then DL4LXG is called and RPARAM is declared
double precision.
Example
As an example consider the 6 × 6 linear system:
        10   0   0   0   0   0
         0  10  -3  -1   0   0
    A =  0   0  15   0   0   0
        -2   0   0  10  -1   0
        -1   0   0  -5   1  -3
        -1  -2   0   0   0   6
Let x^T = (1, 2, 3, 4, 5, 6) so that Ax = (10, 7, 45, 33, -34, 31)^T. The number of nonzeros in A is
nz = 15. The sparse coordinate form for A is given by:
irow   6   2   3   2   4   4   5   5   5   5   1   6   6   2   4
jcol   6   2   3   3   4   5   1   6   4   5   1   1   2   4   1
a      6  10  15  -3  10  -1  -1  -3  -5   1  10  -1  -2  -1  -2
USE LSLXG_INT
USE WRRRN_INT
USE L4LXG_INT
INTEGER    N, NZ
PARAMETER (N=6, NZ=15)
!
INTEGER
REAL
!
DATA A/6., 10., 15., -3., 10., -1., -1., -3., -5., 1., 10., -1.,&
-2., -1., -2./
DATA B/10., 7., 45., 33., -34., 31./
DATA IROW/6, 2, 3, 2, 4, 4, 5, 5, 5, 5, 1, 6, 6, 2, 4/
DATA JCOL/6, 2, 3, 3, 4, 5, 1, 6, 4, 5, 1, 1, 2, 4, 1/
!
!
!                                 Solve for X
Output
                       x
      1       2       3       4       5       6
  1.000   2.000   3.000   4.000   5.000   6.000
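The coordinate arrays of this example can be checked without the library. The following sketch, which is only an illustration and does not call LSLXG, forms the product Ax directly from irow, jcol, and a as given in the DATA statements above; it should reproduce the right-hand side (10, 7, 45, 33, -34, 31).

```fortran
program coo_matvec
  implicit none
  integer, parameter :: n = 6, nz = 15
  ! Coordinate-format data copied from the example's DATA statements
  integer :: irow(nz) = (/ 6, 2, 3, 2, 4, 4, 5, 5, 5, 5, 1, 6, 6, 2, 4 /)
  integer :: jcol(nz) = (/ 6, 2, 3, 3, 4, 5, 1, 6, 4, 5, 1, 1, 2, 4, 1 /)
  real :: a(nz) = (/ 6., 10., 15., -3., 10., -1., -1., -3., -5., 1., &
                     10., -1., -2., -1., -2. /)
  real :: x(n) = (/ 1., 2., 3., 4., 5., 6. /)
  real :: b(n)
  integer :: k

  ! Each stored entry a(k) sits at row irow(k), column jcol(k)
  b = 0.0
  do k = 1, nz
     b(irow(k)) = b(irow(k)) + a(k) * x(jcol(k))
  end do

  print '(6f8.1)', b
end program coo_matvec
```

Because the entries may appear in any order, the loop accumulates into b rather than assuming a row-sorted layout.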
LFTXG
Computes the LU factorization of a real general sparse matrix.
Required Arguments
A Vector of length NZ containing the nonzero coefficients of the linear system. (Input)
IROW Vector of length NZ containing the row numbers of the corresponding elements in
A. (Input)
JCOL Vector of length NZ containing the column numbers of the corresponding elements
in A. (Input)
NL The number of nonzero coefficients in the triangular matrix L excluding the diagonal
elements. (Output)
NFAC On input, the dimension of vector FACT. (Input/Output)
On output, the number of nonzero coefficients in the triangular matrices L and U.
FACT Vector of length NFAC containing the nonzero elements of L (excluding the
diagonals) in the first NL locations and the nonzero elements of U in NL + 1 to NFAC
locations. (Output)
IRFAC Vector of length NFAC containing the row numbers of the corresponding elements
in FACT. (Output)
JCFAC Vector of length NFAC containing the column numbers of the corresponding
elements in FACT. (Output)
IPVT Vector of length N containing the row pivoting information for the LU factorization.
(Output)
JPVT Vector of length N containing the column pivoting information for the LU
factorization. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (IPVT,1).
NZ The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IPARAM Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM.
Default: IPARAM(1) = 0.
See Comment 3.
RPARAM Parameter vector of length 5. (Input/Output)
See Comment 3.
FORTRAN 90 Interface
Generic:
CALL LFTXG (A, IROW, JCOL, NL, NFAC, FACT, IRFAC, JCFAC, IPVT,
JPVT [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL LFTXG (N, NZ, A, IROW, JCOL, IPARAM, RPARAM, NFAC, NL, FACT,
IRFAC, JCFAC, IPVT, JPVT)
Double:
Description
Consider the linear equation
Ax = b
where A is an n × n sparse matrix. The sparse coordinate format for the matrix A requires one real
and two integer vectors. The real array a contains all the nonzeros in A. Let the number of
nonzeros be nz. The two integer arrays irow and jcol, each of length nz, contain the row and
column numbers for these entries in A. That is
    A_{irow(i), jcol(i)} = a(i),    i = 1, …, nz
with all other entries in A zero. The algorithm can be expressed as
P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz
strategy (Duff et al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained using LFSXG by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
Comments
1.
2. Informational errors
   Type   Code
   3      1
   3      2
3. If the default parameters are desired for LFTXG, then set IPARAM(1) to zero and call the
routine LFTXG. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling LFTXG.
CALL L4LXG (IPARAM, RPARAM)
IPARAM(2)
Action
integer
When L2TXG is called, the values of LWK and LIWK are used instead of
IPARAM(5).
IPARAM(6) = Not used in LFTXG.
must be bigger than the largest element in absolute value in its row
divided by RPARAM(2).
Default: 10.0.
RPARAM(3) = Drop-tolerance. Any element in the lower triangular factor L will
Default: 0.0.
RPARAM(4) = The growth factor. It is calculated as the largest element in
absolute value in A at any stage of the Gaussian elimination divided by
the largest element in absolute value in the original A matrix. (Output)
Large value of the growth factor indicates that an appreciable error in the
computed solution is possible.
(Output)
Example
As an example, consider the 6 × 6 matrix of a linear system:
        10   0   0   0   0   0
         0  10  -3  -1   0   0
    A =  0   0  15   0   0   0
        -2   0   0  10  -1   0
        -1   0   0  -5   1  -3
        -1  -2   0   0   0   6
DATA A/6., 10., 15., -3., 10., -1., -1., -3., -5., 1., 10., -1.,&
-2., -1., -2./
DATA IROW/6, 2, 3, 2, 4, 4, 5, 5, 5, 5, 1, 6, 6, 2, 4/
DATA JCOL/6, 2, 3, 3, 4, 5, 1, 6, 4, 5, 1, 1, 2, 4, 1/
NFAC = 3*NZ
!
!
CALL WRRRN (' fact ', FACT, 1, NFAC, 1)
CALL WRIRN (' irfac ', IRFAC, 1, NFAC, 1)
Output
                                     fact
     1       2       3       4       5       6       7       8       9      10
 -0.10   -5.00   -0.20   -0.10   -0.10   -1.00   -0.20    4.90   -5.10    1.00

    11      12      13      14      15      16
 -1.00   30.00    6.00   -2.00   10.00   15.00

                             irfac
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
  3   4   4   5   5   6   6   6   5   5   4   4   3   3   2   1

                             jcfac
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
  2   3   1   4   2   5   2   6   6   5   6   4   4   3   2   1

           p
  1   2   3   4   5   6
  3   1   6   2   5   4

           q
  1   2   3   4   5   6
  3   1   2   6   5   4
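The printed factors can be checked against P AQ = LU with a small dense computation. The sketch below is an illustration only and does not call LFTXG. It assumes the printed arrays read fact = (-0.10, -5.00, -0.20, -0.10, -0.10, -1.00, -0.20, 4.90, -5.10, 1.00, -1.00, 30.00, 6.00, -2.00, 10.00, 15.00), with irfac, jcfac, p, and q as hard-coded in the program. It further assumes that L has an implicit unit diagonal and that (P AQ)(i,j) = A(p(i), q(j)); neither convention is stated in this section.

```fortran
program check_lu_factors
  implicit none
  integer, parameter :: n = 6, nz = 15, nfac = 16
  integer :: irow(nz) = (/ 6, 2, 3, 2, 4, 4, 5, 5, 5, 5, 1, 6, 6, 2, 4 /)
  integer :: jcol(nz) = (/ 6, 2, 3, 3, 4, 5, 1, 6, 4, 5, 1, 1, 2, 4, 1 /)
  real :: a(nz) = (/ 6., 10., 15., -3., 10., -1., -1., -3., -5., 1., &
                     10., -1., -2., -1., -2. /)
  ! Factor arrays as read from the printed output (assumed values)
  integer :: irfac(nfac) = (/ 3, 4, 4, 5, 5, 6, 6, 6, 5, 5, 4, 4, 3, 3, 2, 1 /)
  integer :: jcfac(nfac) = (/ 2, 3, 1, 4, 2, 5, 2, 6, 6, 5, 6, 4, 4, 3, 2, 1 /)
  real :: fact(nfac) = (/ -0.10, -5.00, -0.20, -0.10, -0.10, -1.00, -0.20, &
                          4.90, -5.10, 1.00, -1.00, 30.00, 6.00, -2.00, &
                          10.00, 15.00 /)
  integer :: p(n) = (/ 3, 1, 6, 2, 5, 4 /)
  integer :: q(n) = (/ 3, 1, 2, 6, 5, 4 /)
  real :: ad(n,n), l(n,n), u(n,n), paq(n,n)
  integer :: i, j, k

  ! Expand A from coordinate format to a dense array
  ad = 0.0
  do k = 1, nz
     ad(irow(k), jcol(k)) = a(k)
  end do

  ! Scatter the factor entries: strictly lower part to L, the rest to U
  l = 0.0
  u = 0.0
  do i = 1, n
     l(i,i) = 1.0          ! unit diagonal (assumption)
  end do
  do k = 1, nfac
     if (irfac(k) > jcfac(k)) then
        l(irfac(k), jcfac(k)) = fact(k)
     else
        u(irfac(k), jcfac(k)) = fact(k)
     end if
  end do

  ! Apply the row and column permutations (assumed convention)
  do j = 1, n
     do i = 1, n
        paq(i,j) = ad(p(i), q(j))
     end do
  end do

  print '(a,es10.2)', ' max |L*U - PAQ| = ', maxval(abs(matmul(l, u) - paq))
end program check_lu_factors
```

A residual near single-precision rounding level indicates that the assumed conventions are consistent with the printed output.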
LFSXG
Solves a sparse system of linear equations given the LU factorization of the coefficient matrix.
Required Arguments
NFAC The number of nonzero coefficients in FACT as output from subroutine
LFTXG/DLFTXG. (Input)
NL The number of nonzero coefficients in the triangular matrix L excluding the diagonal
elements as output from subroutine LFTXG/DLFTXG. (Input)
FACT Vector of length NFAC containing the nonzero elements of L (excluding the
diagonals) in the first NL locations and the nonzero elements of U in NL + 1 to NFAC
locations as output from subroutine LFTXG/DLFTXG. (Input)
IRFAC Vector of length NFAC containing the row numbers of the corresponding elements
in FACT as output from subroutine LFTXG/DLFTXG. (Input)
JCFAC Vector of length NFAC containing the column numbers of the corresponding
elements in FACT as output from subroutine LFTXG/DLFTXG. (Input)
IPVT Vector of length N containing the row pivoting information for the LU factorization
as output from subroutine LFTXG/DLFTXG. (Input)
JPVT Vector of length N containing the column pivoting information for the LU
factorization as output from subroutine LFTXG/DLFTXG. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (B,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system Ax = B is solved.
IPATH = 2 means the system A^T x = B is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFSXG (NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, X [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL LFSXG (N, NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, IPATH, X)
Double:
Description
Consider the linear equation
Ax = b
where A is an n × n sparse matrix. The sparse coordinate format for the matrix A requires one real
and two integer vectors. The real array a contains all the nonzeros in A. Let the number of
nonzeros be nz. The two integer arrays irow and jcol, each of length nz, contain the row and
column numbers for these entries in A. That is
    A_{irow(i), jcol(i)} = a(i),    i = 1, …, nz
with all other entries in A zero. The routine LFSXG computes the solution of the linear equation
given its LU factorization. The factorization is performed by calling LFTXG. The solution of the
linear system is then found by the forward and backward substitution. The algorithm can be
expressed as
P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz
strategy (Duff et al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
For more details, see Crowe et al. (1990).
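The three solution steps can be sketched with dense arrays. This is an illustration under stated assumptions, not the LFSXG implementation: the factor values are those printed in the LFTXG example for the same matrix, L is assumed to have an implicit unit diagonal, and the pivot vectors are assumed to mean (Pb)(i) = b(p(i)) and x(q(j)) = y(j). With b = (10, 7, 45, 33, -34, 31), the result should be close to x = (1, 2, 3, 4, 5, 6).

```fortran
program permuted_lu_solve
  implicit none
  integer, parameter :: n = 6, nfac = 16
  ! Factor arrays as printed in the LFTXG example (assumed values)
  integer :: irfac(nfac) = (/ 3, 4, 4, 5, 5, 6, 6, 6, 5, 5, 4, 4, 3, 3, 2, 1 /)
  integer :: jcfac(nfac) = (/ 2, 3, 1, 4, 2, 5, 2, 6, 6, 5, 6, 4, 4, 3, 2, 1 /)
  real :: fact(nfac) = (/ -0.10, -5.00, -0.20, -0.10, -0.10, -1.00, -0.20, &
                          4.90, -5.10, 1.00, -1.00, 30.00, 6.00, -2.00, &
                          10.00, 15.00 /)
  integer :: p(n) = (/ 3, 1, 6, 2, 5, 4 /)
  integer :: q(n) = (/ 3, 1, 2, 6, 5, 4 /)
  real :: b(n) = (/ 10., 7., 45., 33., -34., 31. /)
  real :: l(n,n), u(n,n), z(n), y(n), x(n)
  integer :: i, j, k

  ! Scatter the factors; the unit diagonal of L is left implicit
  l = 0.0
  u = 0.0
  do k = 1, nfac
     if (irfac(k) > jcfac(k)) then
        l(irfac(k), jcfac(k)) = fact(k)
     else
        u(irfac(k), jcfac(k)) = fact(k)
     end if
  end do

  ! 1) L z = P b   (forward substitution)
  do i = 1, n
     z(i) = b(p(i)) - dot_product(l(i,1:i-1), z(1:i-1))
  end do

  ! 2) U y = z     (backward substitution)
  do i = n, 1, -1
     y(i) = (z(i) - dot_product(u(i,i+1:n), y(i+1:n))) / u(i,i)
  end do

  ! 3) x = Q y     (undo the column permutation)
  do j = 1, n
     x(q(j)) = y(j)
  end do

  print '(6f8.3)', x
end program permuted_lu_solve
```

A production solver would work on the sparse factor storage directly; the dense scatter here only keeps the substitution loops easy to follow.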
Example
As an example, consider the 6 × 6 linear system:
        10   0   0   0   0   0
         0  10  -3  -1   0   0
    A =  0   0  15   0   0   0
        -2   0   0  10  -1   0
        -1   0   0  -5   1  -3
        -1  -2   0   0   0   6
Let
x1^T = (1, 2, 3, 4, 5, 6)
so that Ax1 = (10, 7, 45, 33, -34, 31)^T, and let x2^T = (6, 5, 4, 3, 2, 1)
so that Ax2 = (60, 35, 60, 16, -22, -10)^T. The sparse coordinate form for A is given by:
irow   6   2   3   2   4   4   5   5   5   5   1   6   6   2   4
jcol   6   2   3   3   4   5   1   6   4   5   1   1   2   4   1
a      6  10  15  -3  10  -1  -1  -3  -5   1  10  -1  -2  -1  -2
USE LFSXG_INT
USE WRRRL_INT
USE LFTXG_INT
INTEGER    N, NZ
PARAMETER  (N=6, NZ=15)
INTEGER    IPATH, IROW(NZ), JCOL(NZ), NFAC,&
           NL, IRFAC(3*NZ), JCFAC(3*NZ), IPVT(N), JPVT(N)
REAL       X(N), A(NZ), B(N,2), FACT(3*NZ)
CHARACTER  TITLE(2)*2, RLABEL(1)*4, CLABEL(1)*6
!                                 Perform LU factorization
CALL LFTXG (A, IROW, JCOL, NL, NFAC, FACT, IRFAC, JCFAC, IPVT, JPVT)
!
DO 10 I = 1, 2
!
RLABEL, CLABEL, 1, N, 1)
Output
                  x1
    1     2     3     4     5     6
  1.0   2.0   3.0   4.0   5.0   6.0

                  x2
    1     2     3     4     5     6
  6.0   5.0   4.0   3.0   2.0   1.0
LSLZG
Solves a complex sparse system of linear equations by Gaussian elimination.
Required Arguments
A Complex vector of length NZ containing the nonzero coefficients of the linear system.
(Input)
IROW Vector of length NZ containing the row numbers of the corresponding elements in
A. (Input)
JCOL Vector of length NZ containing the column numbers of the corresponding elements
in A. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (B,1).
NZ The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system Ax = b is solved.
IPATH = 2 means the system A^H x = b is solved.
Default: IPATH = 1.
IPARAM Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 3.
Default: IPARAM = 0.
RPARAM Parameter vector of length 5. (Input/Output)
See Comment 3.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Consider the linear equation
Ax = b
where A is an n × n complex sparse matrix. The sparse coordinate format for the matrix A requires
one complex and two integer vectors. The complex array a contains all the nonzeros in A. Let the
number of nonzeros be nz. The two integer arrays irow and jcol, each of length nz, contain the
row and column numbers for these entries in A. That is
    A_{irow(i), jcol(i)} = a(i),    i = 1, …, nz
with all other entries in A zero. The routine LSLZG first uses the routine LFTZG to perform an LU
factorization of the coefficient matrix. The solution of the linear system is then found using
LFSZG. The routine LFTZG by default
uses a symmetric Markowitz strategy (Crowe et al. 1990) to choose pivots that most likely would
reduce fill-ins while maintaining numerical stability. Different strategies are also provided as
options for row oriented or column oriented problems. The algorithm can be expressed as
P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz
strategy (Duff et al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
Comments
1.
2. Informational errors
   Type   Code
   3      1
   3      2
   3      3
3. If the default parameters are desired for LSLZG, then set IPARAM(1) to zero and call the
routine LSLZG. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling LSLZG.
CALL L4LZG (IPARAM, RPARAM)
Note that the call to L4LZG will set IPARAM and RPARAM to their default values, so
only nondefault values need to be set above. The arguments are as follows:
IPARAM Integer vector of length 6.
IPARAM(1) = Initialization flag.
IPARAM(2) = The pivoting strategy.
IPARAM(2)
Action
Action
integer
When L2LZG is called, the values of LWK and LIWK are used instead of
IPARAM(5).
Default: 0.
IPARAM(6) = Iterative refinement is done when this is nonzero.
Default: 0.
must be bigger than the largest element in absolute value in its row
divided by RPARAM(2).
Default: 10.0.
Large value of the growth factor indicates that an appreciable error in the
computed solution is possible.
(Output)
Example
As an example, consider the 6 × 6 linear system:
         10+7i       0       0       0       0       0
             0    3+2i   -3+0i   -1+2i       0       0
    A =      0       0    4+2i       0       0       0
         -2-4i       0       0    1+6i   -1+3i       0
         -5+4i       0       0   -5+0i   12+2i   -7+7i
        -1+12i   -2+8i       0       0       0    3+7i
Let
x^T = (1 + i, 2 + 2i, 3 + 3i, 4 + 4i, 5 + 5i, 6 + 6i)
so that
Ax = (3 + 17i, -19 + 5i, 6 + 18i, -38 + 32i, -63 + 49i, -57 + 83i)^T
The number of nonzeros in A is nz = 15. The sparse coordinate form for A is given by:
irow   6   2   2   4   3   1   5   4   6   5   5   6   4   2   5
jcol   6   2   3   5   3   1   1   4   1   4   5   2   1   4   6
USE LSLZG_INT
USE WRCRN_INT
INTEGER    N, NZ
PARAMETER  (N=6, NZ=15)
!
INTEGER    IROW(NZ), JCOL(NZ)
COMPLEX    A(NZ), B(N), X(N)
!
CALL WRCRN ('X', X)
END
Output
          X
1  ( 1.000, 1.000)
2  ( 2.000, 2.000)
3  ( 3.000, 3.000)
4  ( 4.000, 4.000)
5  ( 5.000, 5.000)
6  ( 6.000, 6.000)
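The complex right-hand side of this example can be checked with a coordinate-format product. The sketch below is an illustration that does not call LSLZG; the values of a are taken from the DATA statement of the LFSZG example, which uses the same irow and jcol arrays, and the result should match Ax = (3 + 17i, -19 + 5i, 6 + 18i, -38 + 32i, -63 + 49i, -57 + 83i).

```fortran
program complex_coo_matvec
  implicit none
  integer, parameter :: n = 6, nz = 15
  integer :: irow(nz) = (/ 6, 2, 2, 4, 3, 1, 5, 4, 6, 5, 5, 6, 4, 2, 5 /)
  integer :: jcol(nz) = (/ 6, 2, 3, 5, 3, 1, 1, 4, 1, 4, 5, 2, 1, 4, 6 /)
  ! Nonzero values as listed in the LFSZG example for the same matrix
  complex :: a(nz) = (/ (3.0,7.0), (3.0,2.0), (-3.0,0.0), (-1.0,3.0), &
                        (4.0,2.0), (10.0,7.0), (-5.0,4.0), (1.0,6.0), &
                        (-1.0,12.0), (-5.0,0.0), (12.0,2.0), (-2.0,8.0), &
                        (-2.0,-4.0), (-1.0,2.0), (-7.0,7.0) /)
  complex :: x(n), b(n)
  integer :: i, k

  ! x = (1+i, 2+2i, ..., 6+6i)
  do i = 1, n
     x(i) = cmplx(real(i), real(i))
  end do

  ! Accumulate b = A x entry by entry
  b = (0.0, 0.0)
  do k = 1, nz
     b(irow(k)) = b(irow(k)) + a(k) * x(jcol(k))
  end do

  ! Each complex value is printed as its real and imaginary parts
  do i = 1, n
     print '(i3,2f9.1)', i, b(i)
  end do
end program complex_coo_matvec
```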
LFTZG
Computes the LU factorization of a complex general sparse matrix.
Required Arguments
A Complex vector of length NZ containing the nonzero coefficients of the linear system.
(Input)
IROW Vector of length NZ containing the row numbers of the corresponding elements in
A. (Input)
JCOL Vector of length NZ containing the column numbers of the corresponding elements
in A. (Input)
NFAC On input, the dimension of vector FACT. (Input/Output)
On output, the number of nonzero coefficients in the triangular matrices L and U.
NL The number of nonzero coefficients in the triangular matrix L excluding the diagonal
elements. (Output)
FACT Complex vector of length NFAC containing the nonzero elements of L (excluding
the diagonals) in the first NL locations and the nonzero elements of U in NL + 1 to NFAC
locations. (Output)
IRFAC Vector of length NFAC containing the row numbers of the corresponding elements
in FACT. (Output)
JCFAC Vector of length NFAC containing the column numbers of the corresponding
elements in FACT. (Output)
IPVT Vector of length N containing the row pivoting information for the LU factorization.
(Output)
JPVT Vector of length N containing the column pivoting information for the LU
factorization. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (IPVT,1).
NZ The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IPARAM Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 3.
Default: IPARAM = 0.
RPARAM Parameter vector of length 5. (Input/Output)
See Comment 3.
FORTRAN 90 Interface
Generic:
CALL LFTZG (A, IROW, JCOL, NFAC, NL, FACT, IRFAC, JCFAC, IPVT,
JPVT [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL LFTZG (N, NZ, A, IROW, JCOL, IPARAM, RPARAM, NFAC, NL, FACT,
IRFAC, JCFAC, IPVT, JPVT)
Double:
Description
Consider the linear equation
Ax = b
where A is a complex n × n sparse matrix. The sparse coordinate format for the matrix A requires
one complex and two integer vectors. The complex array a contains all the nonzeros in A. Let the
number of nonzeros be nz. The two integer arrays irow and jcol, each of length nz, contain the
row and column indices for these entries in A. That is
    A_{irow(i), jcol(i)} = a(i),    i = 1, …, nz
with all other entries in A zero. The algorithm can be expressed as
    P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz
strategy (Duff et al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained using LFSZG by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
Comments
1.
2. Informational errors
   Type   Code
   3      1
   3      2
3. If the default parameters are desired for LFTZG, then set IPARAM(1) to zero and call the
routine LFTZG. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling LFTZG:
IPARAM(2)
IPARAM(5)
integer
Default: 0.
IPARAM(6) = Not used in LFTZG.
must be bigger than the largest element in absolute value in its row
divided by RPARAM(2).
Default: 10.0.
Large value of the growth factor indicates that an appreciable error in the
computed solution is possible.
(Output)
Example
As an example, the following 6 × 6 matrix is factorized, and the outcome is printed:
         10+7i       0       0       0       0       0
             0    3+2i   -3+0i   -1+2i       0       0
    A =      0       0    4+2i       0       0       0
         -2-4i       0       0    1+6i   -1+3i       0
         -5+4i       0       0   -5+0i   12+2i   -7+7i
        -1+12i   -2+8i       0       0       0    3+7i
The sparse coordinate form for A is given by:
irow   6   2   2   4   3   1   5   4   6   5   5   6   4   2   5
jcol   6   2   3   5   3   1   1   4   1   4   5   2   1   4   6
USE LFTZG_INT
USE WRCRN_INT
USE WRIRN_INT
INTEGER    N, NFAC, NZ
PARAMETER (N=6, NZ=15)
!
INTEGER
COMPLEX
!
CALL WRCRN ('fact', FACT, 1, NFAC, 1)
CALL WRIRN (' irfac ', IRFAC, 1, NFAC, 1)
CALL WRIRN (' jcfac ', JCFAC, 1, NFAC, 1)
CALL WRIRN (' p ', IPVT, 1, N, 1)
CALL WRIRN (' q ', JPVT, 1, N, 1)
!
END
Output
           fact
 1   (  0.50,  0.85)
 2   (  0.15, -0.41)
 3   ( -0.60,  0.30)
 4   (  2.23, -1.97)
 5   ( -0.15,  0.50)
 6   ( -0.04,  0.26)
 7   ( -0.32, -0.17)
 8   ( -0.92,  7.46)
 9   ( -6.71, -6.42)
10   ( 12.00,  2.00)
11   ( -1.00,  2.00)
12   ( -3.32,  0.21)
13   (  3.00,  7.00)
14   ( -2.00,  8.00)
15   ( 10.00,  7.00)
16   (  4.00,  2.00)

                             irfac
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
  3   4   4   5   5   6   6   6   5   5   4   4   3   3   2   1

                             jcfac
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
  2   3   1   4   2   5   2   6   6   5   6   4   4   3   2   1

           p
  1   2   3   4   5   6
  3   1   6   2   5   4

           q
  1   2   3   4   5   6
  3   1   2   6   5   4
LFSZG
Solves a complex sparse system of linear equations given the LU factorization of the coefficient
matrix.
Required Arguments
NFAC The number of nonzero coefficients in FACT as output from subroutine
LFTZG/DLFTZG. (Input)
NL The number of nonzero coefficients in the triangular matrix L excluding the diagonal
elements as output from subroutine LFTZG/DLFTZG. (Input)
FACT Complex vector of length NFAC containing the nonzero elements of L (excluding
the diagonals) in the first NL locations and the nonzero elements of U in NL + 1 to NFAC
locations as output from subroutine LFTZG/DLFTZG. (Input)
IRFAC Vector of length NFAC containing the row numbers of the corresponding elements
in FACT as output from subroutine LFTZG/DLFTZG. (Input)
JCFAC Vector of length NFAC containing the column numbers of the corresponding
elements in FACT as output from subroutine LFTZG/DLFTZG. (Input)
IPVT Vector of length N containing the row pivoting information for the LU factorization
as output from subroutine LFTZG/DLFTZG. (Input)
JPVT Vector of length N containing the column pivoting information for the LU
factorization as output from subroutine LFTZG/DLFTZG. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (B,1).
IPATH Path indicator. (Input)
IPATH = 1 means the system Ax = b is solved.
IPATH = 2 means the system A^H x = b is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
CALL LFSZG (NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, X [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL LFSZG (N, NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT, B, IPATH, X)
Double:
Description
Consider the linear equation
Ax = b
where A is a complex n × n sparse matrix. The sparse coordinate format for the matrix A requires
one complex and two integer vectors. The complex array a contains all the nonzeros in A. Let the
number of nonzeros be nz. The two integer arrays irow and jcol, each of length nz, contain the
row and column numbers for these entries in A. That is
    A_{irow(i), jcol(i)} = a(i),    i = 1, …, nz
with all other entries in A zero. The algorithm can be expressed as
    P AQ = LU
where P and Q are the row and column permutation matrices determined by the Markowitz
strategy (Duff et al. 1986), and L and U are lower and upper triangular matrices, respectively.
Finally, the solution x is obtained by the following calculations:
1) Lz = Pb
2) Uy = z
3) x = Qy
For more details, see Crowe et al. (1990).
Example
As an example, consider the 6 × 6 linear system:
         10+7i       0       0       0       0       0
             0    3+2i   -3+0i   -1+2i       0       0
    A =      0       0    4+2i       0       0       0
         -2-4i       0       0    1+6i   -1+3i       0
         -5+4i       0       0   -5+0i   12+2i   -7+7i
        -1+12i   -2+8i       0       0       0    3+7i
Let
x1^T = (1 + i, 2 + 2i, 3 + 3i, 4 + 4i, 5 + 5i, 6 + 6i)
so that
Ax1 = (3 + 17i, -19 + 5i, 6 + 18i, -38 + 32i, -63 + 49i, -57 + 83i)^T
and
x2^T = (6 + 6i, 5 + 5i, 4 + 4i, 3 + 3i, 2 + 2i, 1 + i)
so that
Ax2 = (18 + 102i, -16 + 16i, 8 + 24i, -11 - 11i, -63 + 7i, -132 + 106i)^T
The sparse coordinate form for A is given by:
irow   6   2   2   4   3   1   5   4   6   5   5   6   4   2   5
jcol   6   2   3   5   3   1   1   4   1   4   5   2   1   4   6
USE LFSZG_INT
USE WRCRN_INT
USE LFTZG_INT
INTEGER    N, NZ
PARAMETER (N=6, NZ=15)
INTEGER
COMPLEX
CHARACTER
!
DATA A/(3.0,7.0), (3.0,2.0), (-3.0,0.0), (-1.0,3.0), (4.0,2.0),&
(10.0,7.0), (-5.0,4.0), (1.0,6.0), (-1.0,12.0), (-5.0,0.0),&
(12.0,2.0), (-2.0,8.0), (-2.0,-4.0), (-1.0,2.0), (-7.0,7.0)/
DATA B/(3.0,17.0), (-19.0,5.0), (6.0,18.0), (-38.0,32.0),&
(-63.0,49.0), (-57.0,83.0), (18.0,102.0), (-16.0,16.0),&
(8.0,24.0), (-11.0,-11.0), (-63.0,7.0), (-132.0,106.0)/
DATA IROW/6, 2, 2, 4, 3, 1, 5, 4, 6, 5, 5, 6, 4, 2, 5/
DATA JCOL/6, 2, 3, 5, 3, 1, 1, 4, 1, 4, 5, 2, 1, 4, 6/
DATA TITLE/'x1','x2'/
!
NFAC = 3*NZ
!
!                                 Perform LU factorization
CALL LFTZG (A, IROW, JCOL, NFAC, NL, FACT, IRFAC, JCFAC, IPVT, JPVT)
!
IPATH = 1
DO 10 I = 1,2
!
!
END
Output
          x1
1  ( 1.000, 1.000)
2  ( 2.000, 2.000)
3  ( 3.000, 3.000)
4  ( 4.000, 4.000)
5  ( 5.000, 5.000)
6  ( 6.000, 6.000)

          x2
1  ( 6.000, 6.000)
2  ( 5.000, 5.000)
3  ( 4.000, 4.000)
4  ( 3.000, 3.000)
5  ( 2.000, 2.000)
6  ( 1.000, 1.000)
LSLXD
Solves a sparse system of symmetric positive definite linear algebraic equations by Gaussian
elimination.
Required Arguments
A Vector of length NZ containing the nonzero coefficients in the lower triangle of the linear
system. (Input)
The sparse matrix has nonzeros only in entries (IROW(i), JCOL(i)) for i = 1 to NZ, and
at this location the sparse matrix has value A(i).
IROW Vector of length NZ containing the row numbers of the corresponding elements in
the lower triangle of A. (Input)
Note IROW(i) ≥ JCOL(i), since we are only indexing the lower triangle.
JCOL Vector of length NZ containing the column numbers of the corresponding elements
in the lower triangle of A. (Input)
B Vector of length N containing the right-hand side of the linear system. (Input)
X Vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (B,1).
NZ The number of nonzero coefficients in the lower triangle of the linear system. (Input)
Default: NZ = size (A,1).
ITWKSP The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero.
Default: ITWKSP = 0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and symmetric. The sparse coordinate format for the matrix A
requires one real and two integer vectors. The real array a contains all the nonzeros in the lower
triangle of A including the diagonal. Let the number of nonzeros be nz. The two integer arrays
irow and jcol, each of length nz, contain the row and column indices for these entries in A. That
is
    A_{irow(i), jcol(i)} = a(i),    i = 1, …, nz
    irow(i) ≥ jcol(i),    i = 1, …, nz
1) Ly1 = Pb
2) L^T y2 = y1
3) x = P^T y2
The routine LFSXD accepts b and the permutation vector which determines P. It then returns x.
Comments
1.
Informational errors
   Type   Code
   4      1
   4      2
3. If the default parameters are desired for L2LXD, then set IPARAM(1) to zero and call the
routine L2LXD. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling L2LXD.
CALL L4LXD (IPARAM, RPARAM)
IPARAM(2)
Multifrontal
IPARAM(3)
factorization.
RPARAM(2) = The value of the smallest diagonal element in the Cholesky
factorization.
Example
As an example consider the 5 × 5 linear system:
        10   0   1   0   2
         0  20   0   0   3
    A =  1   0  30   4   0
         0   0   4  40   5
         2   3   0   5  50
Let x^T = (1, 2, 3, 4, 5) so that Ax = (23, 55, 107, 197, 278)^T. The number of nonzeros in the lower
triangle of A is nz = 10. The sparse coordinate form for the lower triangle of A is given by:
irow   1   2   3   3   4   4   5   5   5   5
jcol   1   2   1   3   3   4   1   2   4   5
a     10  20   1  30   4  40   2   3   5  50
or equivalently by
irow   4   5   5   5   1   2   3   3   4   5
jcol   4   1   2   4   1   2   1   3   3   5
a     40   2   3   5  10  20   1  30   4  50
USE LSLXD_INT
USE WRRRN_INT
INTEGER    N, NZ
PARAMETER  (N=5, NZ=10)
!
INTEGER    IROW(NZ), JCOL(NZ)
REAL       A(NZ), B(N), X(N)
!
DATA A/10., 20., 1., 30., 4., 40., 2., 3., 5., 50./
DATA B/23., 55., 107., 197., 278./
DATA IROW/1, 2, 3, 3, 4, 4, 5, 5, 5, 5/
DATA JCOL/1, 2, 1, 3, 3, 4, 1, 2, 4, 5/
!                                 Solve A * X = B
CALL LSLXD (A, IROW, JCOL, B, X)
!                                 Print results
CALL WRRRN (' x ', X, 1, N, 1)
END
Output
                   x
      1       2       3       4       5
  1.000   2.000   3.000   4.000   5.000
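Because only the lower triangle is stored, a product with the full symmetric matrix must use each off-diagonal entry twice. The sketch below is an illustration that does not call LSLXD; it uses the arrays from the DATA statements above and should reproduce the right-hand side (23, 55, 107, 197, 278).

```fortran
program sym_lower_coo_matvec
  implicit none
  integer, parameter :: n = 5, nz = 10
  ! Lower-triangle coordinate data copied from the example
  integer :: irow(nz) = (/ 1, 2, 3, 3, 4, 4, 5, 5, 5, 5 /)
  integer :: jcol(nz) = (/ 1, 2, 1, 3, 3, 4, 1, 2, 4, 5 /)
  real :: a(nz) = (/ 10., 20., 1., 30., 4., 40., 2., 3., 5., 50. /)
  real :: x(n) = (/ 1., 2., 3., 4., 5. /)
  real :: b(n)
  integer :: i, j, k

  b = 0.0
  do k = 1, nz
     i = irow(k)
     j = jcol(k)
     ! Stored entry A(i,j) with i >= j
     b(i) = b(i) + a(k) * x(j)
     ! Mirror entry A(j,i) = A(i,j); skip the diagonal to avoid double counting
     if (i /= j) b(j) = b(j) + a(k) * x(i)
  end do

  print '(5f8.1)', b
end program sym_lower_coo_matvec
```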
LSCXD
Performs the symbolic Cholesky factorization for a sparse symmetric matrix using a minimum
degree ordering or a user-specified ordering, and sets up the data structure for the numerical
Cholesky factorization.
Required Arguments
IROW Vector of length NZ containing the row subscripts of the nonzeros in the lower
triangular part of the matrix including the nonzeros on the diagonal. (Input)
JCOL Vector of length NZ containing the column subscripts of the nonzeros in the lower
triangular part of the matrix including the nonzeros on the diagonal. (Input)
(IROW(K), JCOL(K)) gives the row and column indices of the k-th nonzero element of
the matrix stored in coordinate form. Note, IROW(K) ≥ JCOL(K).
NZSUB Vector of length MAXSUB containing the row subscripts for the off-diagonal
nonzeros in the Cholesky factor in compressed format. (Output)
INZSUB Vector of length N + 1 containing pointers for NZSUB. The row subscripts for the
off-diagonal nonzeros in column J are stored in NZSUB from location INZSUB(J) to
INZSUB(J) + (ILNZ(J + 1) - ILNZ(J) - 1). (Output)
MAXNZ Total number of off-diagonal nonzeros in the Cholesky factor. (Output)
ILNZ Vector of length N + 1 containing pointers to the Cholesky factor. The off-diagonal
nonzeros in column J of the factor are stored from location ILNZ (J) to
ILNZ(J + 1) - 1. (Output)
(ILNZ, NZSUB, INZSUB) sets up the data structure for the off-diagonal nonzeros of the
Cholesky factor in column ordered form using compressed subscript format.
INVPER Vector of length N containing the inverse permutation. (Output)
INVPER (K) = I indicates that the original row K is the new row I.
Optional Arguments
N Number of equations. (Input)
Default: N = size (INVPER,1).
NZ Total number of the nonzeros in the lower triangular part of the symmetric matrix,
including the nonzeros on the diagonal. (Input)
Default: NZ = size (IROW,1).
IJOB Integer parameter selecting an ordering to permute the matrix symmetrically.
(Input)
IJOB = 0 selects the user ordering specified in IPER and reorders it so that the
multifrontal method can be used in the numerical factorization.
IJOB = 1 selects the user ordering specified in IPER.
IJOB = 2 selects a minimum degree ordering.
IJOB = 3 selects a minimum degree ordering suitable for the multifrontal method in the
numerical factorization.
Default: IJOB = 3.
ITWKSP The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero.
Default: ITWKSP = 0.
MAXSUB Number of subscripts contained in array NZSUB. (Input/Output)
On input, MAXSUB gives the size of the array NZSUB.
Note that when default workspace (ITWKSP = 0) is used, set MAXSUB = 3 * NZ.
Otherwise (ITWKSP > 0), set MAXSUB = (ITWKSP - 10 * N - 7) / 4. On output, MAXSUB
gives the number of subscripts used by the compressed subscript format.
Default: MAXSUB = 3*NZ.
IPER Vector of length N containing the ordering specified by IJOB. (Input/Output)
IPER (I) = K indicates that the original row K is the new row I.
ISPACE The storage space needed for stack of frontal matrices. (Output)
FORTRAN 90 Interface
Generic: Because the Fortran compiler cannot determine the precision desired from the
required arguments, there is no generic Fortran 90 Interface for this routine. The specific
Fortran 90 Interfaces are:
Single:
CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER [,…])
Or
CALL S_LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER [,…])
Double:
CALL DLSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER [,…])
Or
CALL D_LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER [,…])
FORTRAN 77 Interface
Single:
CALL LSCXD (N, NZ, IROW, JCOL, IJOB, ITWKSP, MAXSUB, NZSUB, INZSUB,
MAXNZ, ILNZ, IPER, INVPER, ISPACE)
Double:
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and symmetric. The sparse coordinate format for the matrix A
requires one real and two integer vectors. The real array a contains all the nonzeros in the lower
triangle of A including the diagonal. Let the number of nonzeros be nz. The two integer arrays
irow and jcol, each of length nz, contain the row and column indices for these entries in A. That
is
    A_{irow(i), jcol(i)} = a(i),    i = 1, …, nz
    irow(i) ≥ jcol(i),    i = 1, …, nz
Comments
1.
Workspace may be explicitly provided, if desired, by use of L2CXD. The reference is:
CALL L2CXD (N, NZ, IROW, JCOL, IJOB, MAXSUB, NZSUB, INZSUB,
MAXNZ, ILNZ, IPER, INVPER, ISPACE, LIWK, IWK)
Informational errors
Type
Code
4
Example
As an example, the following matrix is symbolically factorized, and the result is printed:
        10   0   1   0   2
         0  20   0   0   3
    A =  1   0  30   4   0
         0   0   4  40   5
         2   3   0   5  50
The number of nonzeros in the lower triangle of A is nz = 10. The sparse coordinate form for the
lower triangle of A is given by:
irow   1   2   3   3   4   4   5   5   5   5
jcol   1   2   1   3   3   4   1   2   4   5
or equivalently by
irow   4   5   5   5   1   2   3   3   4   5
jcol   4   1   2   4   1   2   1   3   3   5
USE LSCXD_INT
USE WRIRN_INT
INTEGER    N, NZ
PARAMETER (N=5, NZ=10)
!
INTEGER
!
DATA IROW/1, 2, 3, 3, 4, 4, 5, 5, 5, 5/
DATA JCOL/1, 2, 1, 3, 3, 4, 1, 2, 4, 5/
MAXSUB = 3 * NZ
CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER,&
MAXSUB=MAXSUB, IPER=IPER)
!                                 Print results
CALL WRIRN (' iper ', IPER, 1, N, 1)
CALL WRIRN (' invper ', INVPER, 1, N, 1)
CALL WRIRN (' nzsub ', NZSUB, 1, MAXSUB, 1)
CALL WRIRN (' inzsub ', INZSUB, 1, N+1, 1)
CALL WRIRN (' ilnz ', ILNZ, 1, N+1, 1)
END
Output
        iper
 1   2   3   4   5
 2   1   5   4   3

       invper
 1   2   3   4   5
 2   1   5   4   3

     nzsub
 1   2   3   4
 3   5   4   5

         inzsub
 1   2   3   4   5   6
 1   1   3   4   4   4

          ilnz
 1   2   3   4   5   6
 1   2   4   6   7   7
LNFXD
Computes the numerical Cholesky factorization of a sparse symmetrical matrix A.
Required Arguments
A Vector of length NZ containing the nonzero coefficients of the lower triangle of the
linear system. (Input)
IROW Vector of length NZ containing the row numbers of the corresponding elements in
the lower triangle of A. (Input)
JCOL Vector of length NZ containing the column numbers of the corresponding elements
in the lower triangle of A. (Input)
MAXSUB Number of subscripts contained in array NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
NZSUB Vector of length MAXSUB containing the row subscripts for the nonzeros in the
Cholesky factor in compressed format as output from subroutine LSCXD/DLSCXD.
(Input)
INZSUB Vector of length N + 1 containing pointers for NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
The row subscripts for the nonzeros in column J are stored from location INZSUB (J)
to INZSUB(J + 1) - 1.
MAXNZ Length of RLNZ as output from subroutine LSCXD/DLSCXD. (Input)
ILNZ Vector of length N + 1 containing pointers to the Cholesky factor as output from
subroutine LSCXD/DLSCXD. (Input)
The row subscripts for the nonzeros in column J of the factor are stored from location
ILNZ(J) to ILNZ(J + 1) - 1. (ILNZ, NZSUB, INZSUB) sets up the compressed data
structure in column ordered form for the Cholesky factor.
IPER Vector of length N containing the permutation as output from subroutine
LSCXD/DLSCXD. (Input)
INVPER Vector of length N containing the inverse permutation as output from subroutine
LSCXD/DLSCXD. (Input)
ISPACE The storage space needed for the stack of frontal matrices as output from
subroutine LSCXD/DLSCXD. (Input)
DIAGNL Vector of length N containing the diagonal of the factor. (Output)
RLNZ Vector of length MAXNZ containing the strictly lower triangle nonzeros of the
Cholesky factor. (Output)
RPARAM Parameter vector containing factorization information. (Output)
RPARAM(1) = smallest diagonal element.
RPARAM(2) = largest diagonal element.
Optional Arguments
N Number of equations. (Input)
Default: N = size (IPER,1).
NZ The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IJOB Integer parameter selecting factorization method. (Input)
IJOB = 1 yields factorization in sparse column format.
IJOB = 2 yields factorization using multifrontal method.
Default: IJOB = 1.
ITWKSP The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero.
Default: ITWKSP = 0.
FORTRAN 90 Interface
Generic:
CALL LNFXD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ, IPER,
INVPER, ISPACE, DIAGNL, RLNZ, RPARAM [, ...])
Specific:
FORTRAN 77 Interface
Single:
CALL LNFXD (N, NZ, A, IROW, JCOL, IJOB, MAXSUB, NZSUB, INZSUB,
MAXNZ, ILNZ, IPER, INVPER, ISPACE, ITWKSP, DIAGNL, RLNZ, RPARAM)
Double:
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and symmetric. The sparse coordinate format for the matrix A
requires one real and two integer vectors. The real array a contains all the nonzeros in the lower
triangle of A including the diagonal. Let the number of nonzeros be nz. The two integer arrays
irow and jcol, each of length nz, contain the row and column indices for these entries in A. That
is
$$A_{\mathrm{irow}(i),\,\mathrm{jcol}(i)} = a(i), \qquad i = 1, \ldots, nz$$
$$\mathrm{irow}(i) \ge \mathrm{jcol}(i), \qquad i = 1, \ldots, nz$$
with all other entries in the lower triangle of A zero. The routine LNFXD produces the Cholesky
factorization of $PAP^T$ given the symbolic factorization of A, which is computed by LSCXD. That is,
this routine computes L, which satisfies
$$PAP^T = LL^T$$
The diagonal of L is stored in DIAGNL and the strictly lower triangular part of L is stored in
compressed subscript form in R = RLNZ as follows. The nonzeros in the j-th column of L are stored
in locations R(i), ..., R(i + k), where i = ILNZ(j) and k = ILNZ(j + 1) - ILNZ(j) - 1. The row
subscripts are stored in the vector NZSUB from locations INZSUB(j) to INZSUB(j) + k.
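As a cross-check on this storage scheme, the following Python sketch (not IMSL code; `expand_factor` and the other names are hypothetical) unpacks the compressed factor into a dense lower triangle and compares $LL^T$ with the permuted matrix. The numbers are those of the LSCXD and LNFXD examples for the 5 x 5 matrix in this chapter, rounded as printed. It assumes that row and column k of $PAP^T$ correspond to row and column IPER(k) of A, an assumption that the small residual supports for this example.

```python
def expand_factor(diagnl, rlnz, ilnz, nzsub, inzsub):
    """Unpack (DIAGNL, RLNZ, ILNZ, NZSUB, INZSUB) into a dense lower triangle."""
    n = len(diagnl)
    dense = [[0.0] * n for _ in range(n)]
    for j in range(1, n + 1):                    # 1-based column, as in the text
        dense[j - 1][j - 1] = diagnl[j - 1]
        k = inzsub[j - 1]                        # first row subscript of column j
        for pos in range(ilnz[j - 1], ilnz[j]):  # ILNZ(j) .. ILNZ(j+1) - 1
            row = nzsub[k - 1]
            dense[row - 1][j - 1] = rlnz[pos - 1]
            k += 1
    return dense


# Values from the example output (rounded as printed).
diagnl = [4.472, 3.162, 7.011, 6.284, 5.430]
rlnz = [0.6708, 0.6325, 0.3162, 0.7132, -0.0285, 0.6398]
ilnz = [1, 2, 4, 6, 7, 7]
nzsub = [3, 5, 4, 5]
inzsub = [1, 1, 3, 4, 4, 4]
iper = [2, 1, 5, 4, 3]

a = [[10, 0, 1, 0, 2],
     [0, 20, 0, 0, 3],
     [1, 0, 30, 4, 0],
     [0, 0, 4, 40, 5],
     [2, 3, 0, 5, 50]]

dense_l = expand_factor(diagnl, rlnz, ilnz, nzsub, inzsub)
n = len(iper)
# Assumed convention: (P A P^T)[r][c] = A[iper[r]][iper[c]].
pap = [[a[iper[r] - 1][iper[c] - 1] for c in range(n)] for r in range(n)]
llt = [[sum(dense_l[r][k] * dense_l[c][k] for k in range(n)) for c in range(n)]
       for r in range(n)]
max_err = max(abs(llt[r][c] - pap[r][c]) for r in range(n) for c in range(n))
print("largest |L L^T - P A P^T| entry:", max_err)
```

Note that columns 1 and 2 share the leading entry of NZSUB (both INZSUB values are 1), which is what makes the subscript storage "compressed".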
The numerical computations can be carried out in one of two ways. The first method (when
IJOB = 2) performs the factorization using a multifrontal technique. This option requires more
storage but in certain cases will be faster. The multifrontal method is based on the routines in Liu
(1987). For detailed description of this method, see Liu (1990), also Duff and Reid (1983, 1984),
Ashcraft (1987), Ashcraft et al. (1987), and Liu (1986, 1989). The second method (when
IJOB = 1) is fully described in George and Liu (1981). This is just the standard factorization
method based on the sparse compressed storage scheme.
Comments
1.
Informational errors
Type
Code
4
4
1
2
Example
As an example, consider the 5 × 5 linear system:
$$A = \begin{bmatrix} 10 & 0 & 1 & 0 & 2 \\ 0 & 20 & 0 & 0 & 3 \\ 1 & 0 & 30 & 4 & 0 \\ 0 & 0 & 4 & 40 & 5 \\ 2 & 3 & 0 & 5 & 50 \end{bmatrix}$$
The number of nonzeros in the lower triangle of A is nz = 10. The sparse coordinate form for the
lower triangle of A is given by:
irow    1   2   3   3   4   4   5   5   5   5
jcol    1   2   1   3   3   4   1   2   4   5
a      10  20   1  30   4  40   2   3   5  50
or equivalently by
irow    4   5   5   5   1   2   3   3   4   5
jcol    4   1   2   4   1   2   1   3   3   5
a      40   2   3   5  10  20   1  30   4  50
We first call LSCXD to produce the symbolic information needed to pass on to LNFXD. Then call
LNFXD to factor this matrix. The results are displayed below.
      USE LNFXD_INT
      USE LSCXD_INT
      USE WRRRN_INT
      INTEGER    N, NZ, NRLNZ
      PARAMETER  (N=5, NZ=10, NRLNZ=10)
!                                 Declarations reconstructed from the
!                                 variables used below
      INTEGER    I, IJOB, ILNZ(N+1), INVPER(N), INZSUB(N+1), IPER(N),&
                 IROW(NZ), ISPACE, ISTOP, ISTRT, J, JCOL(NZ), K,&
                 MAXNZ, MAXSUB, NZSUB(3*NZ)
      REAL       A(NZ), DIAGNL(N), R(N,N), RLNZ(NRLNZ), RPARAM(2)
!
      DATA A/10., 20., 1., 30., 4., 40., 2., 3., 5., 50./
      DATA IROW/1, 2, 3, 3, 4, 4, 5, 5, 5, 5/
      DATA JCOL/1, 2, 1, 3, 3, 4, 1, 2, 4, 5/
!                                 Select minimum degree ordering
!                                 for multifrontal method
      IJOB = 3
!                                 Use default workspace
      MAXSUB = 3*NZ
!                                 IPER and ISPACE are requested here
!                                 because LNFXD needs them as input
      CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER, &
                  MAXSUB=MAXSUB, IPER=IPER, ISPACE=ISPACE)
!                                 Check if NRLNZ is large enough
      IF (NRLNZ .GE. MAXNZ) THEN
!                                 Choose multifrontal method
         IJOB = 2
         CALL LNFXD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, &
                     ILNZ, IPER, INVPER, ISPACE, DIAGNL, RLNZ, RPARAM, &
                     IJOB=IJOB)
!                                 Print results
         CALL WRRRN (' diagnl ', DIAGNL, 1, N, 1)
         CALL WRRRN (' rlnz ', RLNZ, 1, MAXNZ, 1)
      END IF
!                                 Construct L matrix
      R = 0.0E0
      DO I=1,N
!                                 Diagonal
         R(I,I) = DIAGNL(I)
         IF (ILNZ(I) .GT. MAXNZ) GO TO 50
!                                 Find elements of RLNZ for this column
         ISTRT = ILNZ(I)
         ISTOP = ILNZ(I+1) - 1
!                                 Get starting index for NZSUB
         K = INZSUB(I)
         DO J=ISTRT, ISTOP
!                                 NZSUB(K) is the row for this element of
!                                 RLNZ
            R((NZSUB(K)),I) = RLNZ(J)
            K = K + 1
         END DO
      END DO
   50 CONTINUE
      CALL WRRRN ('L', R, NRA=N, NCA=N)
      END
Output
                 diagnl
     1       2       3       4       5
 4.472   3.162   7.011   6.284   5.430

                        rlnz
      1        2        3        4        5        6
 0.6708   0.6325   0.3162   0.7132  -0.0285   0.6398

                     L
         1       2       3       4       5
 1   4.472   0.000   0.000   0.000   0.000
 2   0.000   3.162   0.000   0.000   0.000
 3   0.671   0.632   7.011   0.000   0.000
 4   0.000   0.000   0.713   6.284   0.000
 5   0.000   0.316  -0.029   0.640   5.430
LFSXD
Solves a real sparse symmetric positive definite system of linear equations, given the Cholesky
factorization of the coefficient matrix.
Required Arguments
N Number of equations. (Input)
FORTRAN 90 Interface
Generic:
CALL LFSXD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL,
IPER, B, X)
Specific:
FORTRAN 77 Interface
Single:
CALL LFSXD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL,
IPER, B, X)
Double:
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and symmetric. The sparse coordinate format for the matrix A
requires one real and two integer vectors. The real array a contains all the nonzeros in the lower
triangle of A including the diagonal. Let the number of nonzeros be nz. The two integer arrays
irow and jcol, each of length nz, contain the row and column indices for these entries in A. That
is
$$A_{\mathrm{irow}(i),\,\mathrm{jcol}(i)} = a(i), \qquad i = 1, \ldots, nz$$
$$\mathrm{irow}(i) \ge \mathrm{jcol}(i), \qquad i = 1, \ldots, nz$$
Comments
Informational error
Type   Code
4      1      The input matrix is numerically singular.
Example
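As an example, consider the 5 × 5 linear system:
$$A = \begin{bmatrix} 10 & 0 & 1 & 0 & 2 \\ 0 & 20 & 0 & 0 & 3 \\ 1 & 0 & 30 & 4 & 0 \\ 0 & 0 & 4 & 40 & 5 \\ 2 & 3 & 0 & 5 & 50 \end{bmatrix}$$
Let $x_1^T = (1, 2, 3, 4, 5)$ so that $Ax_1 = (23, 55, 107, 197, 278)^T$, and let $x_2^T = (5, 4, 3, 2, 1)$
so that $Ax_2 = (55, 83, 103, 97, 82)^T$. The number of nonzeros in the lower triangle of A is nz = 10.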
The sparse coordinate form for the lower triangle of A is given by:
irow    1   2   3   3   4   4   5   5   5   5
jcol    1   2   1   3   3   4   1   2   4   5
a      10  20   1  30   4  40   2   3   5  50
or equivalently by
irow    4   5   5   5   1   2   3   3   4   5
jcol    4   1   2   4   1   2   1   3   3   5
a      40   2   3   5  10  20   1  30   4  50
      USE LFSXD_INT
      USE LNFXD_INT
      USE LSCXD_INT
      USE WRRRN_INT
      INTEGER    N, NZ, NRLNZ
      PARAMETER  (N=5, NZ=10, NRLNZ=10)
!                                 Declarations reconstructed from the
!                                 variables used below
      INTEGER    IJOB, ILNZ(N+1), INVPER(N), INZSUB(N+1), IPER(N),&
                 IROW(NZ), ISPACE, ITWKSP, JCOL(NZ), MAXNZ, MAXSUB,&
                 NZSUB(3*NZ)
      REAL       A(NZ), B1(N), B2(N), DIAGNL(N), RLNZ(NRLNZ),&
                 RPARAM(2), X(N)
!
      DATA A/10., 20., 1., 30., 4., 40., 2., 3., 5., 50./
      DATA B1/23., 55., 107., 197., 278./
      DATA B2/55., 83., 103., 97., 82./
      DATA IROW/1, 2, 3, 3, 4, 4, 5, 5, 5, 5/
      DATA JCOL/1, 2, 1, 3, 3, 4, 1, 2, 4, 5/
!                                 Select minimum degree ordering
!                                 for multifrontal method
      IJOB = 3
!                                 Use default workspace
      ITWKSP = 0
      MAXSUB = 3*NZ
      CALL LSCXD (IROW, JCOL, NZSUB, INZSUB, MAXNZ, ILNZ, INVPER, &
                  MAXSUB=MAXSUB, IPER=IPER, ISPACE=ISPACE)
!                                 Check if NRLNZ is large enough
      IF (NRLNZ .GE. MAXNZ) THEN
!                                 Choose multifrontal method
         IJOB = 2
         CALL LNFXD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ,&
                     IPER, INVPER, ISPACE, DIAGNL, RLNZ, RPARAM, IJOB=IJOB)
!                                 Solve A * X1 = B1
         CALL LFSXD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL,&
                     IPER, B1, X)
!                                 Print X1
         CALL WRRRN (' x1 ', X, 1, N, 1)
!                                 Solve A * X2 = B2
         CALL LFSXD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, &
                     DIAGNL, IPER, B2, X)
!                                 Print X2
         CALL WRRRN (' x2 ', X, 1, N, 1)
      END IF
!
      END
Output
                   x1
     1       2       3       4       5
 1.000   2.000   3.000   4.000   5.000

                   x2
     1       2       3       4       5
 5.000   4.000   3.000   2.000   1.000
LSLZD
Solves a complex sparse Hermitian positive definite system of linear equations by Gaussian
elimination.
Required Arguments
A Complex vector of length NZ containing the nonzero coefficients in the lower triangle of
the linear system. (Input)
The sparse matrix has nonzeroes only in entries (IROW (i), JCOL(i)) for i = 1 to NZ, and
at this location the sparse matrix has value A(i).
IROW Vector of length NZ containing the row numbers of the corresponding elements in
the lower triangle of A. (Input)
Note IROW(i) ≥ JCOL(i), since we are only indexing the lower triangle.
JCOL Vector of length NZ containing the column numbers of the corresponding elements
in the lower triangle of A. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution to the linear system. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (B,1).
NZ The number of nonzero coefficients in the lower triangle of the linear system. (Input)
Default: NZ = size (A,1).
ITWKSP The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero.
Default: ITWKSP = 0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and Hermitian. The sparse coordinate format for the matrix A
requires one complex and two integer vectors. The complex array a contains all the nonzeros in
the lower triangle of A including the diagonal. Let the number of nonzeros be nz. The two integer
arrays irow and jcol, each of length nz, contain the row and column indices for these entries in
A. That is
$$A_{\mathrm{irow}(i),\,\mathrm{jcol}(i)} = a(i), \qquad i = 1, \ldots, nz$$
$$\mathrm{irow}(i) \ge \mathrm{jcol}(i), \qquad i = 1, \ldots, nz$$
symbolic factorization of a permutation of the coefficient matrix. It then calls LNFZD to perform
the numerical factorization. The solution of the linear system is then found using LFSZD.
The routine LSCXD computes a minimum degree ordering or uses a user-supplied ordering to set
up the sparse data structure for the Cholesky factor, L. Then the routine LNFZD produces the
numerical entries in L so that we have
$$PAP^T = LL^H$$
The solution x is then obtained by the following calculations:
1) $Ly_1 = Pb$
2) $L^H y_2 = y_1$
3) $x = P^T y_2$
The routine LFSZD accepts b and the permutation vector which determines P. It then returns x.
Comments
1.
Note that the parameter ITWKSP is not an argument for this routine.
2.
Informational errors
Type
Code
4
4
3.
1
2
If the default parameters are desired for L2LZD, then set IPARAM(1) to zero and call the
routine L2LZD. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling L2LZD.
CALL L4LZD (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to L4LZD will set IPARAM and RPARAM to their default values, so only
nondefault values need to be set above. The arguments are as follows:
IPARAM Integer vector of length 4.
IPARAM(1) = Initialization flag.
IPARAM(2) = The numerical factorization method.
IPARAM(2)
Action
Multifrontal
Sparse column
Default: 0.
Action
factorization.
Cholesky factorization.
If double precision is required, then DL4LZD is called and RPARAM is declared
double precision.
Example
As an example, consider the 3 × 3 linear system:
$$A = \begin{bmatrix} 2+0i & -1+i & 0 \\ -1-i & 4+0i & 1+2i \\ 0 & 1-2i & 10+0i \end{bmatrix}$$
Let $x^T = (1+i,\ 2+2i,\ 3+3i)$ so that $Ax = (-2+2i,\ 5+15i,\ 36+28i)^T$. The number of nonzeros in
the lower triangle of A is nz = 5. The sparse coordinate form for the lower triangle of A is given
by:
irow   1       2       3        2       3
jcol   1       2       3        1       2
a      2+0i    4+0i    10+0i    -1-i    1-2i
or equivalently by
irow   3        2       3       1       2
jcol   3        1       2       1       2
a      10+0i    -1-i    1-2i    2+0i    4+0i
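      USE LSLZD_INT
      USE WRCRN_INT
      INTEGER    N, NZ
      PARAMETER  (N=3, NZ=5)
!
      INTEGER    IROW(NZ), JCOL(NZ)
      COMPLEX    A(NZ), B(N), X(N)
!                                 DATA values and calls reconstructed from
!                                 the matrix, right side and required
!                                 arguments given above
      DATA A/(2.0,0.0), (4.0,0.0), (10.0,0.0), (-1.0,-1.0), (1.0,-2.0)/
      DATA B/(-2.0,2.0), (5.0,15.0), (36.0,28.0)/
      DATA IROW/1, 2, 3, 2, 3/
      DATA JCOL/1, 2, 3, 1, 2/
!                                 Solve A * X = B
      CALL LSLZD (A, IROW, JCOL, B, X)
!                                 Print results
      CALL WRCRN (' x ', X, 1, N, 1)
      END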
Output
                          x
               1                 2                 3
 ( 1.000, 1.000)   ( 2.000, 2.000)   ( 3.000, 3.000)
LNFZD
Computes the numerical Cholesky factorization of a sparse Hermitian matrix A.
Required Arguments
A Complex vector of length NZ containing the nonzero coefficients of the lower triangle of
the linear system. (Input)
IROW Vector of length NZ containing the row numbers of the corresponding elements in
the lower triangle of A. (Input)
JCOL Vector of length NZ containing the column numbers of the corresponding elements
in the lower triangle of A. (Input)
MAXSUB Number of subscripts contained in array NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
NZSUB Vector of length MAXSUB containing the row subscripts for the nonzeros in the
Cholesky factor in compressed format as output from subroutine LSCXD/DLSCXD.
(Input)
INZSUB Vector of length N + 1 containing pointers for NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
The row subscripts for the nonzeros in column J are stored from location INZSUB(J) to
INZSUB(J + 1) - 1.
MAXNZ Length of RLNZ as output from subroutine LSCXD/DLSCXD. (Input)
ILNZ Vector of length N + 1 containing pointers to the Cholesky factor as output from
subroutine LSCXD/DLSCXD. (Input)
The row subscripts for the nonzeros in column J of the factor are stored from location
ILNZ(J) to ILNZ(J + 1) - 1.
(ILNZ , NZSUB, INZSUB) sets up the compressed data structure in column ordered form
for the Cholesky factor.
IPER Vector of length N containing the permutation as output from subroutine
LSCXD/DLSCXD. (Input)
INVPER Vector of length N containing the inverse permutation as output from subroutine
LSCXD/DLSCXD. (Input)
ISPACE The storage space needed for the stack of frontal matrices as output from
subroutine LSCXD/DLSCXD. (Input)
DIAGNL Complex vector of length N containing the diagonal of the factor. (Output)
RLNZ Complex vector of length MAXNZ containing the strictly lower triangle nonzeros of
the Cholesky factor. (Output)
Optional Arguments
N Number of equations. (Input)
Default: N = size (IPER,1).
NZ The number of nonzero coefficients in the linear system. (Input)
Default: NZ = size (A,1).
IJOB Integer parameter selecting factorization method. (Input)
IJOB = 1 yields factorization in sparse column format.
IJOB = 2 yields factorization using multifrontal method.
Default: IJOB = 1.
ITWKSP The total workspace needed. (Input)
If the default is desired, set ITWKSP to zero. See Comment 1 for the default.
Default: ITWKSP = 0.
FORTRAN 90 Interface
Generic:
CALL LNFZD (A, IROW, JCOL, MAXSUB, NZSUB, INZSUB, MAXNZ, ILNZ, IPER,
INVPER, ISPACE, DIAGNL, RLNZ, RPARAM [, ...])
Specific:
FORTRAN 77 Interface
Single:
CALL LNFZD (N, NZ, A, IROW, JCOL, IJOB, MAXSUB, NZSUB, INZSUB, MAXNZ,
ILNZ, IPER, INVPER, ISPACE, ITWKSP, DIAGNL, RLNZ, RPARAM)
Double:
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and Hermitian. The sparse coordinate format for the matrix A
requires one complex and two integer vectors. The complex array a contains all the nonzeros in
the lower triangle of A including the diagonal. Let the number of nonzeros be nz. The two integer
arrays irow and jcol, each of length nz, contain the row and column indices for these entries in
A. That is
$$A_{\mathrm{irow}(i),\,\mathrm{jcol}(i)} = a(i), \qquad i = 1, \ldots, nz$$
$$\mathrm{irow}(i) \ge \mathrm{jcol}(i), \qquad i = 1, \ldots, nz$$
The diagonal of L is stored in DIAGNL and the strictly lower triangular part of L is stored in
compressed subscript form in R = RLNZ as follows. The nonzeros in the j-th column of L are stored
in locations R(i), ..., R(i + k), where i = ILNZ(j) and k = ILNZ(j + 1) - ILNZ(j) - 1. The row
subscripts are stored in the vector NZSUB from locations INZSUB(j) to INZSUB(j) + k.
The numerical computations can be carried out in one of two ways. The first method
(when IJOB = 2) performs the factorization using a multifrontal technique. This option requires
more storage but in certain cases will be faster. The multifrontal method is based on the routines in
Liu (1987). For detailed description of this method, see Liu (1990), also Duff and Reid (1983,
1984), Ashcraft (1987), Ashcraft et al. (1987), and Liu (1986, 1989). The second method (when
IJOB = 1) is fully described in George and Liu (1981). This is just the standard factorization
method based on the sparse compressed storage scheme.
Comments
1.
Informational errors
Type
Code
4
4
1
2
Example
As an example, consider the 3 × 3 linear system:
$$A = \begin{bmatrix} 2+0i & -1+i & 0 \\ -1-i & 4+0i & 1+2i \\ 0 & 1-2i & 10+0i \end{bmatrix}$$
The number of nonzeros in the lower triangle of A is nz = 5. The sparse coordinate form for the
lower triangle of A is given by:
irow   1       2       3        2       3
jcol   1       2       3        1       2
a      2+0i    4+0i    10+0i    -1-i    1-2i
or equivalently by
irow   3        2       3       1       2
jcol   3        1       2       1       2
a      10+0i    -1-i    1-2i    2+0i    4+0i
We first call LSCXD to produce the symbolic information needed to pass on to LNFZD. Then call
LNFZD to factor this matrix. The results are displayed below.
USE LNFZD_INT
USE LSCXD_INT
USE WRCRN_INT
INTEGER
N, NZ, NRLNZ
PARAMETER (N=3, NZ=5, NRLNZ=5)
!
INTEGER
REAL
COMPLEX
!
!
!
!
!
END
Output
                        diagnl
               1                 2                 3
 ( 1.414, 0.000)   ( 1.732, 0.000)   ( 2.887, 0.000)

                 rlnz
               1                 2
 (-0.707,-0.707)   ( 0.577,-1.155)
LFSZD
Solves a complex sparse Hermitian positive definite system of linear equations, given the
Cholesky factorization of the coefficient matrix.
Required Arguments
N Number of equations. (Input)
MAXSUB Number of subscripts contained in array NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
NZSUB Vector of length MAXSUB containing the row subscripts for the off-diagonal
nonzeros in the factor as output from subroutine LSCXD/DLSCXD. (Input)
INZSUB Vector of length N + 1 containing pointers for NZSUB as output from subroutine
LSCXD/DLSCXD. (Input)
The row subscripts of column J are stored from location INZSUB(J) to
INZSUB(J + 1) - 1.
MAXNZ Total number of off-diagonal nonzeros in the Cholesky factor as output from
subroutine LSCXD/DLSCXD. (Input)
RLNZ Complex vector of length MAXNZ containing the off-diagonal nonzeros in the factor
in column ordered format as output from subroutine LNFZD/DLNFZD. (Input)
ILNZ Vector of length N +1 containing pointers to RLNZ as output from subroutine
LSCXD/DLSCXD. The nonzeros in column J of the factor are stored from location
ILNZ(J) to ILNZ(J + 1) - 1. (Input)
The values (RLNZ, ILNZ, NZSUB, INZSUB) give the off-diagonal nonzeros of the factor
in a compressed subscript data format.
DIAGNL Complex vector of length N containing the diagonals of the Cholesky factor as
output from subroutine LNFZD/DLNFZD. (Input)
FORTRAN 90 Interface
Generic:
CALL LFSZD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL,
IPER, B, X)
Specific:
FORTRAN 77 Interface
Single:
CALL LFSZD (N, MAXSUB, NZSUB, INZSUB, MAXNZ, RLNZ, ILNZ, DIAGNL,
IPER, B, X)
Double:
Description
Consider the linear equation
Ax = b
where A is sparse, positive definite and Hermitian. The sparse coordinate format for the matrix A
requires one complex and two integer vectors. The complex array a contains all the nonzeros in
the lower triangle of A including the diagonal. Let the number of nonzeros be nz. The two integer
arrays irow and jcol, each of length nz, contain the row and column indices for these entries in
A. That is
$$A_{\mathrm{irow}(i),\,\mathrm{jcol}(i)} = a(i), \qquad i = 1, \ldots, nz$$
$$\mathrm{irow}(i) \ge \mathrm{jcol}(i), \qquad i = 1, \ldots, nz$$
The numerical computations can be carried out in one of two ways. The first method performs the
factorization using a multifrontal technique. This option requires more storage but in certain cases
will be faster. The multifrontal method is based on the routines in Liu (1987). For detailed
description of this method, see Liu (1990), also Duff and Reid (1983, 1984), Ashcraft (1987),
Ashcraft et al. (1987), and Liu (1986, 1989). The second method is fully described in George and
Liu (1981). This is just the standard factorization method based on the sparse compressed storage
scheme. Finally, the solution x is obtained by the following calculations:
1) $Ly_1 = Pb$
2) $L^H y_2 = y_1$
3) $x = P^T y_2$
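The three calculations can be traced on the small Hermitian example of this section. The Python sketch below is not IMSL code: it uses a dense Cholesky factorization in place of the sparse routines, all function names are hypothetical, and the permutation `iper` is an arbitrary illustrative choice (0-based), not the ordering that LSCXD would return. It assumes the convention that entry k of Pb is b(iper(k)).

```python
import math


def cholesky_hermitian(m):
    """Dense Cholesky factor L (lower triangular) with m = L L^H."""
    n = len(m)
    l = [[0j] * n for _ in range(n)]
    for j in range(n):
        d = complex(m[j][j]).real - sum(abs(l[j][k]) ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = complex(math.sqrt(d))
        for i in range(j + 1, n):
            s = m[i][j] - sum(l[i][k] * l[j][k].conjugate() for k in range(j))
            l[i][j] = s / l[j][j]
    return l


def solve_with_factor(l, iper, b):
    """Apply steps 1-3: L y1 = P b, L^H y2 = y1, x = P^T y2."""
    n = len(b)
    pb = [b[iper[k]] for k in range(n)]
    y1 = [0j] * n
    for i in range(n):                      # forward substitution
        y1[i] = (pb[i] - sum(l[i][k] * y1[k] for k in range(i))) / l[i][i]
    y2 = [0j] * n
    for i in reversed(range(n)):            # back substitution with L^H
        s = sum(l[k][i].conjugate() * y2[k] for k in range(i + 1, n))
        y2[i] = (y1[i] - s) / l[i][i].conjugate()
    x = [0j] * n
    for k in range(n):                      # undo the permutation
        x[iper[k]] = y2[k]
    return x


a = [[2 + 0j, -1 + 1j, 0j],
     [-1 - 1j, 4 + 0j, 1 + 2j],
     [0j, 1 - 2j, 10 + 0j]]
b = [-2 + 2j, 5 + 15j, 36 + 28j]
iper = [2, 0, 1]                            # hypothetical ordering, 0-based

pap = [[a[iper[r]][iper[c]] for c in range(3)] for r in range(3)]
x = solve_with_factor(cholesky_hermitian(pap), iper, b)
print(x)
```

Any permutation gives the same x; in the sparse routines the ordering matters only because it controls how much fill-in appears in L.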
Comments
Informational error
Type   Code
4      1      The input matrix is numerically singular.
Example
As an example, consider the 3 × 3 linear system:
$$A = \begin{bmatrix} 2+0i & -1+i & 0 \\ -1-i & 4+0i & 1+2i \\ 0 & 1-2i & 10+0i \end{bmatrix}$$
Let $x_1^T = (1+i,\ 2+2i,\ 3+3i)$ so that $Ax_1 = (-2+2i,\ 5+15i,\ 36+28i)^T$, and let
$x_2^T = (3+3i,\ 2+2i,\ 1+i)$ so that $Ax_2 = (2+6i,\ 7+5i,\ 16+8i)^T$. The number of nonzeros in the
lower triangle of A is nz = 5. The sparse coordinate form for the lower triangle of A is given by:
irow   1       2       3        2       3
jcol   1       2       3        1       2
a      2+0i    4+0i    10+0i    -1-i    1-2i
or equivalently by
irow   3        2       3       1       2
jcol   3        1       2       1       2
a      10+0i    -1-i    1-2i    2+0i    4+0i
USE IMSL_LIBRARIES
INTEGER
N, NZ, NRLNZ
PARAMETER
INTEGER
COMPLEX
REAL
!
DATA
DATA
DATA
DATA
DATA
!
!
!
!
!
!
!
!
!
!
END
Output
                          x1
               1                 2                 3
 ( 1.000, 1.000)   ( 2.000, 2.000)   ( 3.000, 3.000)

                          x2
               1                 2                 3
 ( 3.000, 3.000)   ( 2.000, 2.000)   ( 1.000, 1.000)
LSLTO
Solves a real Toeplitz linear system.
Required Arguments
A Real vector of length 2N - 1 containing the first row of the coefficient matrix followed
by its first column beginning with the second element. (Input)
See Comment 2.
B Real vector of length N containing the right-hand side of the linear system. (Input)
X Real vector of length N containing the solution of the linear system. (Output)
If B is not needed then B and X may share the same storage locations.
Optional Arguments
N Order of the matrix represented by A. (Input)
Default: N = (size (A,1) +1)/2
IPATH Integer flag. (Input)
IPATH = 1 means the system Ax = B is solved.
IPATH = 2 means the system $A^T x = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Toeplitz matrices have entries that are constant along each diagonal, for example,
$$A = \begin{bmatrix} p_0 & p_1 & p_2 & p_3 \\ p_{-1} & p_0 & p_1 & p_2 \\ p_{-2} & p_{-1} & p_0 & p_1 \\ p_{-3} & p_{-2} & p_{-1} & p_0 \end{bmatrix}$$
The routine LSLTO is based on the routine TSLS in the TOEPLITZ package, see Arushanian et al.
(1983). It is based on an algorithm of Trench (1964). This algorithm is also described by Golub
and van Loan (1983), pages 125-133.
Comments
1.
2. Because of the special structure of Toeplitz matrices, the first row and the first column
of a Toeplitz matrix completely characterize the matrix. Hence, only the elements
A(1, 1), ..., A(1, N), A(2, 1), ..., A(N, 1) need to be stored.
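The packed storage can be made concrete with a short Python sketch (not IMSL code; `toeplitz_from_packed` is a hypothetical name). It rebuilds the full matrix from the 2N - 1 stored values, using the first row and first column of the example that follows, and multiplies by the solution printed there to recover the right-hand side.

```python
def toeplitz_from_packed(packed):
    """Rebuild a Toeplitz matrix from its first row followed by the
    first column starting at the second element (2N - 1 values)."""
    n = (len(packed) + 1) // 2
    first_row = packed[:n]
    first_col = [packed[0]] + packed[n:]
    return [[first_row[c - r] if c >= r else first_col[r - c]
             for c in range(n)]
            for r in range(n)]


# First row (2, -3, -1, 6), then first column below the diagonal (1, 4, 3).
packed = [2.0, -3.0, -1.0, 6.0, 1.0, 4.0, 3.0]
a = toeplitz_from_packed(packed)
for row in a:
    print(row)

x = [-2.0, -1.0, 7.0, 4.0]          # solution printed in the example output
b = [sum(a[r][c] * x[c] for c in range(4)) for r in range(4)]
print(b)
```

The order N is recovered as (length + 1)/2, which matches the default stated for N above.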
Example
A system of four linear equations is solved. Note that only the first row and column of the matrix
A are entered.
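      USE LSLTO_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    N
      PARAMETER  (N=4)
      REAL       A(2*N-1), B(N), X(N)
!                                 Set values for A, and B
!
!                                 A = (  2  -3  -1   6  )
!                                     (  1   2  -3  -1  )
!                                     (  4   1   2  -3  )
!                                     (  3   4   1   2  )
!
!                                 B = ( 16 -29  -7   5  )
!
!                                 The statements below are reconstructed
!                                 from the matrix, the right side and the
!                                 printed solution; B(4) = 5 follows from
!                                 the fourth row of A times X
      DATA A/2.0, -3.0, -1.0, 6.0, 1.0, 4.0, 3.0/
      DATA B/16.0, -29.0, -7.0, 5.0/
!                                 Solve AX = B
      CALL LSLTO (A, B, X)
!                                 Print results
      CALL WRRRN ('X', X, 1, N, 1)
      END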
Output
                  X
      1        2        3        4
 -2.000   -1.000    7.000    4.000
LSLTC
Solves a complex Toeplitz linear system.
Required Arguments
A Complex vector of length 2N - 1 containing the first row of the coefficient matrix
followed by its first column beginning with the second element. (Input)
See Comment 2.
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution of the linear system. (Output)
Optional Arguments
N Order of the matrix represented by A. (Input)
Default: N = (size (A,1) + 1)/2.
IPATH Integer flag. (Input)
IPATH = 1 means the system Ax = B is solved.
IPATH = 2 means the system $A^T x = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Toeplitz matrices have entries which are constant along each diagonal, for example,
$$A = \begin{bmatrix} p_0 & p_1 & p_2 & p_3 \\ p_{-1} & p_0 & p_1 & p_2 \\ p_{-2} & p_{-1} & p_0 & p_1 \\ p_{-3} & p_{-2} & p_{-1} & p_0 \end{bmatrix}$$
The routine LSLTC is based on the routine TSLC in the TOEPLITZ package, see Arushanian et al.
(1983). It is based on an algorithm of Trench (1964). This algorithm is also described by Golub
and van Loan (1983), pages 125-133.
Comments
1.
2. Because of the special structure of Toeplitz matrices, the first row and the first column
of a Toeplitz matrix completely characterize the matrix. Hence, only the elements
A(1, 1), ..., A(1, N), A(2, 1), ..., A(N, 1) need to be stored.
Example
A system of four complex linear equations is solved. Note that only the first row and column of
the matrix A are entered.
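      USE LSLTC_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    N
      PARAMETER  (N=4)
      COMPLEX    A(2*N-1), B(N), X(N)
!                                 Set values for A and B
!
!                                 A = ( 2+2i   -3     1+4i   6-2i )
!                                     (  i     2+2i   -3     1+4i )
!                                     ( 4+2i    i     2+2i   -3   )
!                                     ( 3-4i   4+2i    i     2+2i )
!
!                                 B = ( 6+65i  -29-16i  7+i  -10+i )
!
!                                 The statements below are reconstructed
!                                 from the matrix and right side above
      DATA A/(2.0,2.0), (-3.0,0.0), (1.0,4.0), (6.0,-2.0), (0.0,1.0),&
           (4.0,2.0), (3.0,-4.0)/
      DATA B/(6.0,65.0), (-29.0,-16.0), (7.0,1.0), (-10.0,1.0)/
!                                 Solve AX = B
      CALL LSLTC (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END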
Output
                                    X
               1                 2                 3                 4
 (-2.000, 0.000)   (-1.000,-5.000)   ( 7.000, 2.000)   ( 0.000, 4.000)
LSLCC
Solves a complex circulant linear system.
Required Arguments
A Complex vector of length N containing the first row of the coefficient matrix. (Input)
B Complex vector of length N containing the right-hand side of the linear system. (Input)
X Complex vector of length N containing the solution of the linear system. (Output)
Optional Arguments
N Order of the matrix represented by A. (Input)
Default: N = size (A,1).
IPATH Integer flag. (Input)
IPATH = 1 means the system Ax = B is solved.
IPATH = 2 means the system $A^T x = B$ is solved.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Circulant matrices have the property that each row is obtained by shifting the row above it one
place to the right. Entries that are shifted off at the right re-enter at the left. For example,
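$$A = \begin{bmatrix} p_1 & p_2 & p_3 & p_4 \\ p_4 & p_1 & p_2 & p_3 \\ p_3 & p_4 & p_1 & p_2 \\ p_2 & p_3 & p_4 & p_1 \end{bmatrix}$$
With the subscripts on p and q interpreted modulo N and $q_k = p_{2-k}$,
$$(Ax)_j = \sum_{i=1}^{N} p_{i-j+1}\, x_i = \sum_{i=1}^{N} q_{j-i+1}\, x_i = (q * x)_j$$
where $q * x$ is the convolution of q and x. By the convolution theorem, if $q * x = b$, then
$\hat{q} \otimes \hat{x} = \hat{b}$, where $\hat{q}$
is the discrete Fourier transform of q as computed by the IMSL routine FFTCF and $\otimes$ denotes
elementwise multiplication. By division,
$$\hat{x} = \hat{b} \oslash \hat{q}$$
where $\oslash$ denotes elementwise division. The vector x is recovered from $\hat{x}$ by the inverse transform.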
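The transform-and-divide idea can be checked on the example of this section. The Python sketch below is not IMSL code: it uses a plain O(N²) discrete Fourier transform from the standard library instead of FFTCF, works with 0-based subscripts, and the names `dft` and `solve_circulant` are hypothetical.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform; the inverse includes the 1/N factor."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                    for m, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def solve_circulant(first_row, b):
    """Solve A x = b where A is circulant with the given first row."""
    n = len(first_row)
    # 0-based: A[j][i] = first_row[(i - j) mod n] = q[(j - i) mod n],
    # so A x is the circular convolution of q and x.
    q = [first_row[(-k) % n] for k in range(n)]
    q_hat = dft(q)
    b_hat = dft(b)
    x_hat = [bh / qh for bh, qh in zip(b_hat, q_hat)]  # elementwise division
    return dft(x_hat, inverse=True)


first_row = [2 + 2j, -3 + 0j, 1 + 4j, 6 - 2j]
b = [6 + 65j, -41 - 10j, -8 - 30j, 63 - 3j]
x = solve_circulant(first_row, b)
print([complex(round(v.real, 3), round(v.imag, 3)) for v in x])
```

A zero entry in the transform of q would make the division fail, which corresponds to a singular circulant matrix.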
Comments
1.
2.
Informational error
Type
Code
4
3. Because of the special structure of circulant matrices, the first row of a circulant matrix
completely characterizes the matrix. Hence, only the elements A(1, 1), ..., A(1, N) need
to be stored.
Example
A system of four linear equations is solved. Note that only the first row of the matrix A is entered.
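      USE LSLCC_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    N
      PARAMETER  (N=4)
      COMPLEX    A(N), B(N), X(N)
!                                 Set values for A, and B
!
!                                 A = ( 2+2i  -3+0i  1+4i  6-2i )
!
!                                 B = ( 6+65i  -41-10i  -8-30i  63-3i )
!
!                                 The statements below are reconstructed
!                                 from the values of A and B above
      DATA A/(2.0,2.0), (-3.0,0.0), (1.0,4.0), (6.0,-2.0)/
      DATA B/(6.0,65.0), (-41.0,-10.0), (-8.0,-30.0), (63.0,-3.0)/
!                                 Solve AX = B
      CALL LSLCC (A, B, X)
!                                 Print results
      CALL WRCRN ('X', X, 1, N, 1)
      END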
Output
                                    X
               1                 2                 3                 4
 (-2.000, 0.000)   (-1.000,-5.000)   ( 7.000, 2.000)   ( 0.000, 4.000)
PCGRC
Solves a real symmetric definite linear system using a preconditioned conjugate gradient method
with reverse communication.
Required Arguments
IDO Flag indicating task to be done. (Input/Output)
On the initial call IDO must be 0. If the routine returns with IDO = 1, then set Z = AP,
where A is the matrix, and call PCGRC again. If the routine returns with IDO = 2, then
set Z to the solution of the system MZ = R, where M is the preconditioning matrix, and
call PCGRC again. If the routine returns with IDO = 3, then the iteration has converged
and X contains the solution.
X Array of length N containing the solution. (Input/Output)
On input, X contains the initial guess of the solution. On output, X contains the solution
to the system.
P Array of length N. (Output)
Its use is described under IDO.
R Array of length N. (Input/Output)
On initial input, it contains the right-hand side of the linear system. On output, it
contains the residual.
Z Array of length N. (Input)
When IDO = 1, it contains AP, where A is the linear system. When IDO = 2, it contains
the solution of MZ = R, where M is the preconditioning matrix. When IDO = 0, it is
ignored. Its use is described under IDO.
Optional Arguments
N Order of the linear system. (Input)
Default: N = size (X,1).
RELERR Relative error desired. (Input)
Default: RELERR = 1.e-5 for single precision and 1.d-10 for double precision.
ITMAX Maximum number of iterations allowed. (Input)
Default: ITMAX = N.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine PCGRC solves the symmetric definite linear system Ax = b using the preconditioned
conjugate gradient method. This method is described in detail by Golub and Van Loan (1983,
Chapter 10), and in Hageman and Young (1981, Chapter 7).
The preconditioning matrix, M, is a matrix that approximates A, and for which the linear system
Mz = r is easy to solve. These two properties are in conflict; balancing them is a topic of much
current research.
The number of iterations needed depends on the matrix and the error tolerance RELERR. As a
rough guide, ITMAX = $N^{1/2}$ is often sufficient when N >> 1. See the references for further
information.
Let M be the preconditioning matrix, let b, p, r, x and z be vectors and let $\tau$ be the desired relative
error. Then the algorithm used is as follows.
    $\lambda = -1$
    $p_0 = x_0$
    $r_1 = b - Ap$
    For $k = 1, \ldots, \mathrm{itmax}$
        $z_k = M^{-1} r_k$
        If $k = 1$ then
            $\beta_k = 1$
            $p_k = z_k$
        Else
            $\beta_k = z_k^T r_k / z_{k-1}^T r_{k-1}$
            $p_k = z_k + \beta_k p_k$
        End if
        $z_k = Ap$
        $\alpha_k = z_{k-1}^T r_{k-1} / z_k^T p_k$
        $x_k = x_k + \alpha_k p_k$
        $r_k = r_k - \alpha_k z_k$
        If ($\|z_k\|_2 \le \tau (1 - \lambda) \|x_k\|_2$) Then
            Recompute $\lambda$
            If ($\|z_k\|_2 \le \tau (1 - \lambda) \|x_k\|_2$) Exit
        End if
    end loop
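The reverse-communication pattern (the routine returns to the caller whenever it needs a matrix-vector product or a preconditioner solve) can be mimicked with a Python generator. This is a sketch under stated assumptions, not the PCGRC implementation: the request names "matvec" and "precond" are hypothetical stand-ins for IDO = 1 and IDO = 2, the textbook preconditioned conjugate gradient recurrences are used, and the stopping test is a plain relative residual test rather than the eigenvalue-based criterion described in this section. The system solved is the 3 × 3 one from the first example, with the Jacobi (diagonal) preconditioner.

```python
import math


def pcg_reverse(b, x0, relerr=1e-10, itmax=100):
    """Preconditioned CG that asks its caller for A*p and for M^{-1} r."""
    x = list(x0)
    ax = yield ("matvec", x)                 # caller replies with A*x
    r = [bi - axi for bi, axi in zip(b, ax)]
    b_norm = math.sqrt(sum(bi * bi for bi in b)) or 1.0
    p = None
    rz_old = 1.0
    for _ in range(itmax):
        if math.sqrt(sum(ri * ri for ri in r)) <= relerr * b_norm:
            break
        z = yield ("precond", r)             # caller replies with M^{-1} r
        rz = sum(ri * zi for ri, zi in zip(r, z))
        if p is None:
            p = list(z)
        else:
            beta = rz / rz_old
            p = [zi + beta * pi for zi, pi in zip(z, p)]
        ap = yield ("matvec", p)             # caller replies with A*p
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rz_old = rz
    return x


def solve_jacobi(a, b):
    """Driver loop: answers each request until the solver finishes."""
    n = len(b)
    solver = pcg_reverse(b, [0.0] * n)
    request, vec = next(solver)
    try:
        while True:
            if request == "matvec":
                reply = [sum(a[i][j] * vec[j] for j in range(n)) for i in range(n)]
            else:                            # Jacobi: divide by the diagonal of A
                reply = [vec[i] / a[i][i] for i in range(n)]
            request, vec = solver.send(reply)
    except StopIteration as done:
        return done.value


a = [[1.0, -3.0, 2.0],
     [-3.0, 10.0, -5.0],
     [2.0, -5.0, 6.0]]
b = [27.0, -78.0, 64.0]
solution = solve_jacobi(a, b)
print([round(v, 6) for v in solution])
```

The solver never sees the matrix or the preconditioner, only their action on a vector, so the caller is free to store A in any form (full, band, or sparse).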
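Here $\lambda$ is an estimate of $\lambda_{\max}(G)$, the largest eigenvalue of the iteration matrix $G = I - M^{-1}A$. The
stopping criterion is based on the result (Hageman and Young, 1981, pages 148-151)
$$\frac{\|x_k - x\|_M}{\|x\|_M} \le \frac{1}{1 - \lambda_{\max}(G)} \, \frac{\|z_k\|_M}{\|x_k\|_M}$$
where
$$\|x\|_M^2 = x^T M x$$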
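It is known that
$$\lambda_{\max}(T_1) \le \lambda_{\max}(T_2) \le \cdots \le \lambda_{\max}(G) < 1$$
where the $T_k$ are the symmetric, tridiagonal matrices
$$T_k = \begin{bmatrix} \mu_1 & \omega_2 & & \\ \omega_2 & \mu_2 & \omega_3 & \\ & \omega_3 & \mu_3 & \omega_4 \\ & & \ddots & \ddots \end{bmatrix}$$
with
$$\mu_k = 1 - \beta_k/\alpha_{k-1} - 1/\alpha_k, \qquad \mu_1 = 1 - 1/\alpha_1$$
and
$$\omega_k = \sqrt{\beta_k}/\alpha_{k-1}$$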
The largest eigenvalue of Tk is found using the routine EVASB. Usually this eigenvalue
computation is needed for only a few of the iterations.
Comments
1.
2.
Informational errors
Type
Code
4
4
4
4
4
1
2
3
4
5
Example
In this example, the solution to a linear system is found. The coefficient matrix A is stored as a full
matrix. The preconditioning matrix is the diagonal of A. This is called the Jacobi preconditioner.
It is also used by the IMSL routine JCGRC.
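      USE PCGRC_INT
      USE MURRV_INT
      USE WRRRN_INT
      USE SCOPY_INT
      INTEGER    LDA, N
      PARAMETER  (N=3, LDA=N)
!
      INTEGER    IDO, ITMAX, J
      REAL       A(LDA,N), B(N), P(N), R(N), X(N), Z(N)
!                                 (  1, -3,  2 )
!                             A = ( -3, 10, -5 )
!                                 (  2, -5,  6 )
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!                             B = ( 27.0, -78.0, 64.0 )
      DATA B/27.0, -78.0, 64.0/
!                                 The statements from here to END are
!                                 reconstructed after the loop in
!                                 Example 2; the initial guess X = B
!                                 is an assumption
!                                 Set R to right side
      CALL SCOPY (N, B, 1, R, 1)
!                                 Initial guess for X
      CALL SCOPY (N, B, 1, X, 1)
      ITMAX = 100
      IDO   = 0
   10 CALL PCGRC (IDO, X, P, R, Z, ITMAX=ITMAX)
      IF (IDO .EQ. 1) THEN
!                                 Set z = Ap
         CALL MURRV (A, P, Z)
         GO TO 10
      ELSE IF (IDO .EQ. 2) THEN
!                                 Jacobi preconditioner:
!                                 z = inv(diag(A))*r
         DO J=1, N
            Z(J) = R(J)/A(J,J)
         END DO
         GO TO 10
      END IF
!                                 Print the solution
      CALL WRRRN ('Solution', X)
      END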
Output
   Solution
 1    1.001
 2   -4.000
 3    7.000
Example 2
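In this example, a more complicated preconditioner is used to find the solution of a linear system
which occurs in a finite-difference solution of Laplace's equation on a 4 × 4 grid. The matrix is
$$A = \begin{bmatrix}
 4 & -1 &  0 & -1 &    &    &    &    &    \\
-1 &  4 & -1 &  0 & -1 &    &    &    &    \\
 0 & -1 &  4 & -1 &  0 & -1 &    &    &    \\
-1 &  0 & -1 &  4 & -1 &  0 & -1 &    &    \\
   & -1 &  0 & -1 &  4 & -1 &  0 & -1 &    \\
   &    & -1 &  0 & -1 &  4 & -1 &  0 & -1 \\
   &    &    & -1 &  0 & -1 &  4 & -1 &  0 \\
   &    &    &    & -1 &  0 & -1 &  4 & -1 \\
   &    &    &    &    & -1 &  0 & -1 &  4
\end{bmatrix}$$
$$M = \begin{bmatrix}
 4 & -1 &    &    &    &    &    &    &    \\
-1 &  4 & -1 &    &    &    &    &    &    \\
   & -1 &  4 & -1 &    &    &    &    &    \\
   &    & -1 &  4 & -1 &    &    &    &    \\
   &    &    & -1 &  4 & -1 &    &    &    \\
   &    &    &    & -1 &  4 & -1 &    &    \\
   &    &    &    &    & -1 &  4 & -1 &    \\
   &    &    &    &    &    & -1 &  4 & -1 \\
   &    &    &    &    &    &    & -1 &  4
\end{bmatrix}$$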
      USE IMSL_LIBRARIES
!                                 Parameter values reconstructed from the
!                                 sizes of the DATA statements below
      INTEGER    LDA, LDPRE, N, NCODA, NCOPRE
      PARAMETER  (N=9, NCODA=3, NCOPRE=1, LDA=2*NCODA+1,&
                 LDPRE=NCOPRE+1)
!
      INTEGER    IDO, ITMAX
      REAL       A(LDA,N), P(N), PRECND(LDPRE,N), PREFAC(LDPRE,N),&
                 R(N), RCOND, RELERR, X(N), Z(N)
!                                 Set A in band form
      DATA A/3*0.0, 4.0, -1.0, 0.0, -1.0, 2*0.0, -1.0, 4.0, -1.0, 0.0,&
          -1.0, 2*0.0, -1.0, 4.0, -1.0, 0.0, -1.0, -1.0, 0.0, -1.0,&
          4.0, -1.0, 0.0, -1.0, -1.0, 0.0, -1.0, 4.0, -1.0, 0.0,&
          -1.0, -1.0, 0.0, -1.0, 4.0, -1.0, 0.0, -1.0, -1.0, 0.0,&
          -1.0, 4.0, -1.0, 2*0.0, -1.0, 0.0, -1.0, 4.0, -1.0, 2*0.0,&
          -1.0, 0.0, -1.0, 4.0, 3*0.0/
!                                 Set PRECND in band symmetric form
      DATA PRECND/0.0, 4.0, -1.0, 4.0, -1.0, 4.0, -1.0, 4.0, -1.0, 4.0,&
          -1.0, 4.0, -1.0, 4.0, -1.0, 4.0, -1.0, 4.0/
!                                 Right side is (1, ..., 1)
      R = 1.0E0
!                                 Initial guess for X is 0
      X = 0.0E0
!                                 Factor the preconditioning matrix
      CALL LFCQS (PRECND, NCOPRE, PREFAC, RCOND)
      ITMAX  = 100
      RELERR = 1.0E-4
      IDO    = 0
   10 CALL PCGRC (IDO, X, P, R, Z, RELERR=RELERR, ITMAX=ITMAX)
      IF (IDO .EQ. 1) THEN
!                                 Set z = Ap
         CALL MURBV (A, NCODA, NCODA, P, Z)
         GO TO 10
      ELSE IF (IDO .EQ. 2) THEN
!                                 Solve PRECND*z = r for z
         CALL LSLQS (PREFAC, NCOPRE, R, Z)
         GO TO 10
      END IF
!                                 Print the solution
      CALL WRRRN ('Solution', X)
      END
Output
 Solution
1    0.955
2    1.241
3    1.349
4    1.578
5    1.660
6    1.578
7    1.349
8    1.241
9    0.955
JCGRC
Solves a real symmetric definite linear system using the Jacobi-preconditioned conjugate gradient
method with reverse communication.
Required Arguments
IDO Flag indicating task to be done. (Input/Output)
On the initial call IDO must be 0. If the routine returns with IDO = 1, then set
Z = A P, where A is the matrix, and call JCGRC again. If the routine returns with
IDO = 2, then the iteration has converged and X contains the solution.
DIAGNL Vector of length N containing the diagonal of the matrix. (Input)
Its elements must be all strictly positive or all strictly negative.
X Array of length N containing the solution. (Input/Output)
On input, X contains the initial guess of the solution. On output, X contains the solution
to the system.
P Array of length N. (Output)
Its use is described under IDO.
R Array of length N. (Input/Output)
On initial input, it contains the right-hand side of the linear system. On output, it
contains the residual.
Z Array of length N. (Input)
When IDO = 1, it contains AP, where A is the coefficient matrix. When IDO = 0, it is
ignored. Its use is described under IDO.
Optional Arguments
N Order of the linear system. (Input)
Default: N = size (X,1).
RELERR Relative error desired. (Input)
Default: RELERR = 1.e-5 for single precision and 1.d-10 for double precision.
ITMAX Maximum number of iterations allowed. (Input)
Default: ITMAX = 100.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine JCGRC solves the symmetric definite linear system Ax = b using the Jacobi conjugate
gradient method. This method is described in detail by Golub and Van Loan (1983, Chapter 10),
and in Hageman and Young (1981, Chapter 7).
This routine is a special case of the routine PCGRC, with the diagonal of the matrix A used as the
preconditioning matrix. For details of the algorithm see PCGRC.
The number of iterations needed depends on the matrix and the error tolerance RELERR. As a
rough guide, ITMAX = N is often sufficient when N ≫ 1. See the references for further information.
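The reverse-communication loop that JCGRC expects can be illustrated with a small stand-in solver. The sketch below is not the IMSL implementation: the module jcg_rc_sketch and its routine jcg_rc are hypothetical names, it works in double precision, and it replaces the library's eigenvalue-based stopping rule with a plain relative-residual test. It only mirrors the IDO protocol described above (IDO = 1 asks the caller for Z = A P, IDO = 2 signals completion) on the 3 by 3 system used in the example.

```fortran
module jcg_rc_sketch
  ! Hypothetical stand-in for a reverse-communication Jacobi CG solver.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  private
  public :: dp, jcg_rc
  ! State preserved between calls (initialized module variables are saved).
  integer  :: stage = 0, iter = 0
  real(dp) :: rho = 0.0_dp, bnorm = 0.0_dp
contains
  subroutine jcg_rc(ido, diagnl, x, p, r, z, relerr, itmax)
    integer,  intent(inout) :: ido
    real(dp), intent(in)    :: diagnl(:), relerr
    real(dp), intent(inout) :: x(:), p(:), r(:), z(:)
    integer,  intent(in)    :: itmax
    real(dp) :: alpha, rho_new

    if (ido == 0) then
      ! First call: r holds b. Request z = A*x to form the residual.
      bnorm = sqrt(dot_product(r, r))
      iter  = 0
      stage = 1
      p     = x
      ido   = 1
      return
    end if

    if (stage == 1) then
      ! z = A*x has arrived: r = b - A*x, first direction p = inv(D)*r.
      r = r - z
      if (sqrt(dot_product(r, r)) <= relerr*bnorm) then
        ido = 2
        return
      end if
      p     = r/diagnl
      rho   = dot_product(r, p)
      stage = 2
      return                       ! ido stays 1: caller supplies z = A*p
    end if

    ! z = A*p has arrived: take one conjugate gradient step.
    alpha = rho/dot_product(p, z)
    x     = x + alpha*p
    r     = r - alpha*z
    iter  = iter + 1
    if (sqrt(dot_product(r, r)) <= relerr*bnorm .or. iter >= itmax) then
      ido = 2                      ! converged, or iteration limit reached
      return
    end if
    z       = r/diagnl             ! Jacobi preconditioner solve
    rho_new = dot_product(r, z)
    p       = z + (rho_new/rho)*p
    rho     = rho_new
  end subroutine jcg_rc
end module jcg_rc_sketch

program jcg_rc_demo
  use jcg_rc_sketch
  implicit none
  real(dp) :: a(3,3), b(3), diagnl(3), x(3), p(3), r(3), z(3)
  integer  :: ido, i

  a = reshape([ 1.0_dp, -3.0_dp,  2.0_dp, &
               -3.0_dp, 10.0_dp, -5.0_dp, &
                2.0_dp, -5.0_dp,  6.0_dp], [3, 3])
  b = [27.0_dp, -78.0_dp, 64.0_dp]
  r = b                            ! right-hand side goes in R
  x = b                            ! initial guess, as in the example
  do i = 1, 3
    diagnl(i) = a(i, i)
  end do

  ido = 0
  do
    call jcg_rc(ido, diagnl, x, p, r, z, 1.0e-10_dp, 100)
    if (ido /= 1) exit
    z = matmul(a, p)               ! the caller owns the matrix
  end do
  print '(3f10.4)', x              ! should be close to 1, -4, 7
end program jcg_rc_demo
```

The solver never sees the matrix itself, only its diagonal and the products the caller returns in Z. That is the point of the design: A can be stored in any format, or not stored at all.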
Comments
1.
2. Informational errors
   Type   Code
   4      1
   4      2
   4      3
   4      4
   4      5
Example
In this example, the solution to a linear system is found. The coefficient matrix A is stored as a full
matrix.
      USE IMSL_LIBRARIES
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
!
      INTEGER    IDO, ITMAX
      REAL       A(LDA,N), B(N), DIAGNL(N), P(N), R(N), X(N), &
                 Z(N)
!                                 (  1, -3,  2 )
!                             A = ( -3, 10, -5 )
!                                 (  2, -5,  6 )
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!                             B = ( 27.0, -78.0, 64.0 )
      DATA B/27.0, -78.0, 64.0/
!                                 Set R to right side
      CALL SCOPY (N, B, 1, R, 1)
!                                 Initial guess for X is B
      CALL SCOPY (N, B, 1, X, 1)
!                                 Copy diagonal of A to DIAGNL
      CALL SCOPY (N, A(:, 1), LDA+1, DIAGNL, 1)
!                                 Set parameters
      ITMAX = 100
      IDO   = 0
   10 CALL JCGRC (IDO, DIAGNL, X, P, R, Z, ITMAX=ITMAX)
      IF (IDO .EQ. 1) THEN
!                                 Set z = Ap
         CALL MURRV (A, P, Z)
         GO TO 10
      END IF
!                                 Print the solution
      CALL WRRRN ('Solution', X)
      END
Output
 Solution
1    1.001
2   -4.000
3    7.000
GMRES
Uses the Generalized Minimal Residual Method with reverse communication to generate an
approximate solution of Ax = b.
Required Arguments
IDO Flag indicating task to be done. (Input/Output)
On the initial call IDO must be 0. If the routine returns with IDO = 1, then set Z = AP,
where A is the matrix, and call GMRES again. If the routine returns with IDO = 2, then
set Z to the solution of the system MZ = P, where M is the preconditioning matrix, and
call GMRES again. If the routine returns with IDO = 3, set $Z = AM^{-1}P$, and call GMRES
again. If the routine returns with IDO = 4, the iteration has converged, and X contains
the approximate solution to the linear system.
X Array of length N containing an approximate solution. (Input/Output)
On input, X contains an initial guess of the solution. On output, X contains the
approximate solution.
P Array of length N. (Output)
Its use is described under IDO.
R Array of length N. (Input/Output)
On initial input, it contains the right-hand side of the linear system. On output, it
contains the residual, b − Ax.
Z Array of length N. (Input)
When IDO = 1, it contains AP, where A is the coefficient matrix. When IDO = 2, it
contains $M^{-1}P$. When IDO = 3, it contains $AM^{-1}P$. When IDO = 0, it is ignored.
TOL Stopping tolerance. (Input/Output)
The algorithm attempts to generate a solution x such that |b − Ax| ≤ TOL*|b|. On output,
TOL contains the final residual norm.
Optional Arguments
N Order of the linear system. (Input)
Default: N = size (X,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine GMRES implements restarted GMRES with reverse communication to generate an
approximate solution to Ax = b. It is based on GMRESD by Homer Walker.
There are four distinct GMRES implementations, selectable through the parameter vector INFO. The
first Gram-Schmidt implementation, INFO(1) = 1, is essentially the original algorithm by Saad
and Schultz (1986). The second Gram-Schmidt implementation, developed by Homer Walker and
Lou Zhou, is simpler than the first implementation. The least squares problem is constructed in
upper-triangular form and the residual vector updating at the end of a GMRES cycle is cheaper. The
first Householder implementation is algorithm 2.2 of Walker (1988), but with more efficient
correction accumulation at the end of each GMRES cycle. The second Householder implementation
is algorithm 3.1 of Walker (1988). The products of Householder transformations are expanded as
sums, allowing most work to be formulated as large scale matrix-vector operations. Although
BLAS are used wherever possible, extensive use of Level 2 BLAS in the second Householder
implementation may yield a performance advantage on certain computing environments.
The Gram-Schmidt implementations are less expensive than the Householder, the latter requiring
about twice as much arithmetic beyond the coefficient matrix/vector products. However, the
Householder implementations may be more reliable near the limits of residual reduction. See
Walker (1988) for details. Issues such as the cost of coefficient matrix/vector products, availability
of effective preconditioners, and features of particular computing environments may serve to
mitigate the extra expense of the Householder implementations.
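One way to read the IDO values is through the standard right-preconditioned form of the system. The block below is a sketch of that textbook reformulation, under the assumption that GMRES applies M on the right when right preconditioning is selected; the auxiliary vector y is introduced here only for illustration and is not an argument of the routine.

```latex
% Right-preconditioned system and the products requested through IDO.
\[
  A M^{-1} y = b, \qquad x = M^{-1} y
\]
\[
  \mathrm{IDO}=1:\; z = A p, \qquad
  \mathrm{IDO}=2:\; z = M^{-1} p, \qquad
  \mathrm{IDO}=3:\; z = A M^{-1} p
\]
% Stopping test stated under the argument TOL.
\[
  | b - A x | \le \mathrm{TOL} \cdot | b |
\]
```

Under this reading, products with $AM^{-1}$ build the Krylov subspace, a solve with M alone maps y back to x, and a product with A alone evaluates a residual. When no preconditioner is used, M = I and only the IDO = 1 request remains.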
Comments
1.
For any components INFO(1) ... INFO(7) with value zero on input, the default
value is used.
INFO(1) = IMP, the flag indicating the desired implementation.
   IMP   Action
   1     first Gram-Schmidt implementation
Default: IMP = 1
INFO(2) = KDMAX, the maximum Krylov subspace dimension, i.e., the
maximum allowable number of GMRES iterations before restarting. It
must satisfy 1 ≤ KDMAX ≤ N.
Default: KDMAX = min(N, 20)
INFO(3) = ITMAX, the maximum number of GMRES iterations allowed.
Default: ITMAX = 1000
INFO(4) = IRP, the flag indicating whether right preconditioning is used.
If IRP = 0, no right preconditioning is performed. If IRP = 1, right
preconditioning is performed. If IRP = 0, then IDO = 2 or 3 will not
occur.
Default: IRP = 0
INFO(5) = IRESUP, the flag that indicates the desired residual vector updating
   IRESUP   Action
   1        update by linear combination, restarting only
Updating by direct evaluation requires an otherwise unnecessary matrix-vector
product. The alternative is to update by forming a linear
combination of various available vectors. This may or may not be
cheaper and may be less reliable if the residual vector has been greatly
reduced. If IRESUP = 2 or 4, then the residual vector is returned in
WORK(1), ..., WORK(N). This is useful in some applications but costs
another unnecessary residual update. It is recommended that IRESUP = 1
or 2 be used, unless matrix-vector products are inexpensive or great
residual reduction is required. In this case use IRESUP = 3 or 4. The
meaning of "inexpensive" varies with IMP as follows:
   IMP   Flops
   1     (KDMAX + 1)*N
   2     N
   3     (2*KDMAX + 1)*N
   4     (2*KDMAX + 1)*N
"Great residual reduction" means that TOL is only a few orders of magnitude
larger than machine epsilon.
Default: IRESUP = 1
INFO(6) = flag for indicating the inner product and norm used in the Gram-Schmidt
implementations. If INFO(6) = 0, sdot and snrm2, from BLAS,
are used. If INFO(6) = 1, the user must provide the routines, as specified
under arguments USRNPR and USRNRM.
Default: INFO(6) = 0
INFO(7) = IPRINT, the print flag. If IPRINT = 0, no printing is performed. If
IPRINT = 1, print the iteration numbers and residuals.
Default: IPRINT = 0
INFO(8) = the total number of GMRES iterations on output.
INFO(9) = the total number of matrix-vector products in GMRES on output.
INFO(10) = the total number of right preconditioner solves in GMRES on output if
IRP = 1.
length of WORK
N*(KDMAX + 2) + 3 *KDMAX + 2
Example 1
This is a simple example of GMRES usage. A solution to a small linear system is found. The
coefficient matrix A is stored as a full matrix, and no preconditioning is used. Typically,
preconditioning is required to achieve convergence in a reasonable number of iterations.
      USE IMSL_LIBRARIES
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (N=3, LDA=N)
!                                 Specifications for local variables
      INTEGER    IDO, NOUT
      REAL       P(N), TOL, X(N), Z(N)
      REAL       A(LDA,N), R(N)
      SAVE       A, R
!                                 Specifications for intrinsics
      INTRINSIC  SQRT
      REAL       SQRT
!                                 (  33.0   16.0   72.0 )
!                             A = ( -24.0  -10.0  -57.0 )
!                                 (  18.0  -11.0    7.0 )
!
!                             B = ( 129.0  -96.0    8.5 )
!
      DATA A/33.0, -24.0, 18.0, 16.0, -10.0, -11.0, 72.0, -57.0, 7.0/
      DATA R/129.0, -96.0, 8.5/
!
      CALL UMACH (2, NOUT)
!                                 Initial guess = (1 ... 1)
      X = 1.0E0
!                                 Set stopping tolerance to
!                                 square root of machine epsilon
      TOL = AMACH(4)
      TOL = SQRT(TOL)
      IDO = 0
   10 CONTINUE
      CALL GMRES (IDO, X, P, R, Z, TOL)
      IF (IDO .EQ. 1) THEN
!                                 Set z = A*p
         CALL MURRV (A, P, Z)
         GO TO 10
      END IF
!
      CALL WRRRN ('Solution', X, 1, N, 1)
      WRITE (NOUT,'(A11, E15.5)') 'Residual = ', TOL
      END
Output
        Solution
    1       2       3
1.000   1.500   1.000
Residual =     0.29746E-05
Additional Examples
Example 2
This example solves a linear system with a coefficient matrix stored in coordinate form, the same
problem as in the document example for LSLXG. Jacobi preconditioning is used, i.e. the
preconditioning matrix M is the diagonal matrix with $M_{ii} = A_{ii}$, for i = 1, …, n.
      USE IMSL_LIBRARIES
      INTEGER    N, NZ
      PARAMETER  (N=6, NZ=15)
!                                 Specifications for local variables
      INTEGER    IDO, INFO(10), NOUT
      REAL       P(N), TOL, WORK(1000), X(N), Z(N)
      REAL       DIAGIN(N), R(N)
!                                 Specifications for intrinsics
      INTRINSIC  SQRT
      REAL       SQRT
!                                 Specifications for subroutines
      EXTERNAL   AMULTP
!                                 Specifications for functions
      EXTERNAL   G8RES, G9RES
!
!
      INFO    = 0
      INFO(4) = 1
!
!
      TOL = AMACH(4)
      TOL = SQRT(TOL)
      IDO = 0
   10 CONTINUE
      CALL G2RES (IDO, N, X, P, R, Z, TOL, INFO, G8RES, G9RES, WORK)
      IF (IDO .EQ. 1) THEN
!                                 Set z = A*p
         CALL AMULTP (P, Z)
         GO TO 10
      ELSE IF (IDO .EQ. 2) THEN
!                                 Set z = inv(M)*p
!                                 The diagonal of inv(M) is stored
!                                 in DIAGIN
         CALL SHPROD (N, DIAGIN, 1, P, 1, Z, 1)
         GO TO 10
      ELSE IF (IDO .EQ. 3) THEN
!                                 Set z = A*inv(M)*p
         CALL SHPROD (N, DIAGIN, 1, P, 1, Z, 1)
         P = Z
         CALL AMULTP (P, Z)
         GO TO 10
      END IF
      CALL WRRRN ('Solution', X)
      WRITE (NOUT,'(A11, E15.5)') 'Residual = ', TOL
      END
!
      SUBROUTINE AMULTP (P, Z)
      USE IMSL_LIBRARIES
      INTEGER    NZ
      PARAMETER  (NZ=15)
!
      REAL       P(*), Z(*)
      INTEGER    N
      PARAMETER  (N=6)
!
      INTEGER    I
      INTEGER    IROW(NZ), JCOL(NZ)
      REAL       A(NZ)
      SAVE       A, IROW, JCOL
!                                 Coefficient matrix in coordinate form
      DATA A/6.0, 10.0, 15.0, -3.0, 10.0, -1.0, -1.0, -3.0, -5.0, 1.0, &
           10.0, -1.0, -2.0, -1.0, -2.0/
      DATA IROW/6, 2, 3, 2, 4, 4, 5, 5, 5, 5, 1, 6, 6, 2, 4/
      DATA JCOL/6, 2, 3, 3, 4, 5, 1, 6, 4, 5, 1, 1, 2, 4, 1/
!                                 Clear z
      CALL SSET(N, 0.0, Z, 1)
!                                 Accumulate the product A*p in z
      DO 10  I=1, NZ
         Z(IROW(I)) = Z(IROW(I)) + A(I)*P(JCOL(I))
   10 CONTINUE
      RETURN
      END
Output
 Solution
1    1.000
2    2.000
3    3.000
4    4.000
5    5.000
6    6.000
Residual =     0.25882E-05
Example 3
The coefficient matrix in this example corresponds to the five-point discretization of the 2-d
Poisson equation with the Dirichlet boundary condition. Assuming the natural ordering of the
unknowns, and moving all boundary terms to the right-hand side, we obtain the block tridiagonal
matrix
\[ A = \begin{bmatrix} T & -I & & \\ -I & \ddots & \ddots & \\ & \ddots & \ddots & -I \\ & & -I & T \end{bmatrix} \]
where
\[ T = \begin{bmatrix} 4 & -1 & & \\ -1 & \ddots & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 4 \end{bmatrix} \]
and I is the identity matrix. Discretizing on a k × k grid implies that T and I are both k × k, and thus
the coefficient matrix A is $k^2 \times k^2$.
The problem is solved twice, with discretization on a 50 × 50 grid. During both solutions, use the
second Householder implementation to take advantage of the large scale matrix/vector operations
done in Level 2 BLAS. Also choose to update the residual vector by direct evaluation since the
small tolerance will require large residual reduction.
The first solution uses no preconditioning. For the second solution, we construct a block diagonal
preconditioning matrix
\[ M = \begin{bmatrix} T & & \\ & \ddots & \\ & & T \end{bmatrix} \]
M is factored once, and these factors are used in the forward solves and back substitutions
necessary when GMRES returns with IDO = 2 or 3.
Timings are obtained for both solutions, and the ratio of the time for the solution with no
preconditioning to the time for the solution with preconditioning is printed. Though the exact
results are machine dependent, we see that the savings realized by faster convergence from using a
preconditioner exceed the cost of factoring M and performing repeated forward and back solves.
USE IMSL_LIBRARIES
INTEGER
K, N
PARAMETER (K=50, N=K*K)
!
!
!
!
!
!
!
!
!
!
!
!
!
!                                 Set z = A*p
         CALL AMULTP (K, P, Z)
         GO TO 10
      END IF
      TNOPRE = CPSEC() - TNOPRE
!
      WRITE (NOUT,'(A32, I4)') 'Iterations, no preconditioner = ', &
            INFO(8)
!
!                                 Define M
      CALL SSET (N-1, -1.0, B, 1)
      CALL SSET (N-1, -1.0, C, 1)
      CALL SSET (N, 4.0, A, 1)
      INFO(4) = 1
      TOL     = AMACH(4)
      TOL     = 100.0*TOL
      IDO     = 0
      TPRE    = CPSEC()
!
!
!
!
!
!                                 Set z = A*p
         CALL AMULTP (K, P, Z)
         GO TO 20
      ELSE IF (IDO .EQ. 2) THEN
!                                 Set z = inv(M)*p
         CALL SCOPY (N, P, 1, Z, 1)
         CALL LSLCR (C, A, B, Z, U, IR, IS, IJOB=5)
         GO TO 20
      ELSE IF (IDO .EQ. 3) THEN
!                                 Set z = A*inv(M)*p
         CALL LSLCR (C, A, B, P, U, IR, IS, IJOB=5)
         CALL AMULTP (K, P, Z)
         GO TO 20
      END IF
      TPRE = CPSEC() - TPRE
      WRITE (NOUT,'(A35, I4)') 'Iterations, with preconditioning = ',&
            INFO(8)
      WRITE (NOUT,'(A45, F10.5)') '(Precondition time)/(No '// &
            'precondition time) = ', TPRE/TNOPRE
!
END
!
SUBROUTINE AMULTP (K, P, Z)
USE IMSL_LIBRARIES
!
      INTEGER    K
      REAL       P(*), Z(*)
      INTEGER    I, N
!
N = K*K
!
!
!
!
!
!
!
!
!
RETURN
END
Output
Iterations, no preconditioner = 329
Iterations, with preconditioning = 192
(Precondition time)/(No precondition time) =    0.66278
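The product z = Ap for the block tridiagonal matrix of Example 3 can be formed without storing A. The sketch below is one possible way to do this and is not the AMULTP listing of the example: the routine name amultp_sketch, the grid size of 4, and the test vector of ones are all assumptions chosen for illustration. It applies the five-point stencil directly, using the natural ordering in which unknown i has neighbours i ± 1 within a grid row and i ± k in the adjacent rows.

```fortran
program poisson_matvec_sketch
  implicit none
  integer, parameter :: kgrid = 4, ntot = kgrid*kgrid
  real :: pvec(ntot), zvec(ntot)

  pvec = 1.0
  call amultp_sketch(kgrid, pvec, zvec)
  ! One grid row per output line.
  print '(4f6.1)', zvec
contains
  subroutine amultp_sketch(k, p, z)
    ! Hypothetical matrix-free product z = A*p for the k*k by k*k
    ! five-point matrix with diagonal blocks T and off-diagonal blocks -I.
    integer, intent(in)  :: k
    real,    intent(in)  :: p(:)
    real,    intent(out) :: z(:)
    integer :: i, n

    n = k*k
    do i = 1, n
      z(i) = 4.0*p(i)
      ! Neighbours inside the same grid row (off-diagonals of T).
      if (mod(i-1, k) /= 0) z(i) = z(i) - p(i-1)
      if (mod(i, k)   /= 0) z(i) = z(i) - p(i+1)
      ! Neighbours in the adjacent grid rows (the -I blocks).
      if (i > k)            z(i) = z(i) - p(i-k)
      if (i <= n - k)       z(i) = z(i) - p(i+k)
    end do
  end subroutine amultp_sketch
end program poisson_matvec_sketch
```

With p set to all ones, each entry of z equals 4 minus the number of neighbours inside the grid: 2 at the corners, 1 along the edges, and 0 in the interior. This gives a quick check that the boundary handling is right.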
LSQRR
Required Arguments
A NRA by NCA matrix containing the coefficient matrix of the least-squares system to be
solved. (Input)
B Vector of length NRA containing the right-hand side of the least-squares system. (Input)
X Vector of length NCA containing the solution vector with components corresponding to
the columns not used set to zero. (Output)
RES Vector of length NRA containing the residual vector B − A * X. (Output)
KBASIS Scalar containing the number of columns used in the solution. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = size (A,1).
NCA Number of columns of A. (Input)
Default: NCA = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
TOL Scalar containing the nonnegative tolerance used to determine the subset of columns
of A to be included in the solution. (Input)
If TOL is zero, a full complement of min(NRA, NCA) columns is used. See Comments.
Default: TOL = 0.0
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Routine LSQRR solves the linear least-squares problem. The underlying code is based on either
LINPACK , LAPACK, or ScaLAPACK code depending upon which supporting libraries are used
during linking. For a detailed explanation see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual. The routine LQRRR is first used to compute
the QR decomposition of A. Pivoting, with all rows free, is used. Column k is in the basis if
\[ |R_{kk}| \ge \tau |R_{11}| \]
with τ = TOL. The truncated least-squares problem is then solved using IMSL routine LQRSL.
Finally, the components in the solution, with the same index as columns that are not in the basis,
are set to zero; and then, the permutation determined by the pivoting in IMSL routine LQRRR is
applied.
Comments
1.
2.
Routine LSQRR calculates the QR decomposition with pivoting of a matrix A and tests
the diagonal elements against a user-supplied tolerance TOL. The first integer
KBASIS = k is determined for which
\[ |r_{k+1,k+1}| < \mathrm{TOL} \cdot |r_{11}| \]
In effect, this condition implies that a set of columns with a condition number
approximately bounded by 1.0/TOL is used. Then, LQRSL performs a truncated fit of
the first KBASIS columns of the permuted A to an input vector B. The coefficient of this
fit is unscrambled to correspond to the original columns of A, and the coefficients
corresponding to unused columns are set to zero. It may be helpful to scale the rows
and columns of A so that the error estimates in the elements of the scaled matrix are
roughly equal to TOL.
3.
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2QRR the leading dimension of QR is increased by IVAL(3)
when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
This option has two values that determine if the L1 condition number is to be
computed. Routine LSQRR temporarily replaces IVAL(2) by IVAL(1). The
routine L2CRG computes the condition number if IVAL(2) = 2. Otherwise L2CRG
skips this computation. LSQRR restores the option. Default values for the option
are IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA, MXLDX, and MXCOL can be obtained
through a call to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see
Utilities) has been made. See the ScaLAPACK Example below.
Example
Consider the problem of finding the coefficients ci in
$f(x) = c_0 + c_1 x + c_2 x^2$
given data at x = 2, 4, 6 and 8, using the method of least squares. Each row of the matrix A contains
the values of 1, x and $x^2$ at one data point. The vector b contains the data, chosen such that
$c_0 \approx 1$, $c_1 \approx 2$ and $c_2 \approx 0$. The routine LSQRR solves this least-squares problem.
      USE LSQRR_INT
      USE UMACH_INT
      USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (NRA=4, NCA=3, LDA=NRA)
      REAL       A(LDA,NCA), B(NRA), X(NCA), RES(NRA), TOL
!
!                                 A = (  1    2     4   )
!                                     (  1    4    16   )
!                                     (  1    6    36   )
!                                     (  1    8    64   )
!
      DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!
      DATA B/4.999, 9.001, 12.999, 17.001/
!
END
Output
KBASIS =
            X
    1       2       3
0.999   2.000   0.000

                     RES
        1           2           3           4
-0.000400    0.001200   -0.001200    0.000400
ScaLAPACK Example
The previous example is repeated here as a distributed computing example. Consider the problem
of finding the coefficients ci in
$f(x) = c_0 + c_1 x + c_2 x^2$
given data at x = 2, 4, 6 and 8, using the method of least squares. Each row of the matrix A contains
the values of 1, x and $x^2$ at one data point. The vector b contains the data, chosen such that
$c_0 \approx 1$, $c_1 \approx 2$ and $c_2 \approx 0$. The routine LSQRR solves this least-squares problem. SCALAPACK_MAP
and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities) used to map and
unmap arrays to and from the processor grid. They are used here for brevity. DESCINIT is a
ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LSQRR_INT
USE UMACH_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    LDA, NRA, NCA, DESCA(9), DESCX(9), DESCR(9)
      INTEGER    INFO, KBASIS, MXCOL, MXLDA, MXCOLX, MXLDX, NOUT
      REAL       TOL
      REAL, ALLOCATABLE ::    A(:,:), B(:), X(:), RES(:)
      REAL, ALLOCATABLE ::    A0(:,:), B0(:), X0(:), RES0(:)
      PARAMETER  (NRA=4, NCA=3, LDA=NRA)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,NCA), B(NRA), X(NCA), RES(NRA))
!                                 Set values for A and B
          A(1,:) = (/ 1.0, 2.0,  4.0/)
          A(2,:) = (/ 1.0, 4.0, 16.0/)
          A(3,:) = (/ 1.0, 6.0, 36.0/)
          A(4,:) = (/ 1.0, 8.0, 64.0/)
!
          B = (/4.999, 9.001, 12.999, 17.001/)
      ENDIF
!
Output
KBASIS =
            X
    1       2       3
0.999   2.000   0.000

                     RES
        1           2           3           4
-0.000400    0.001200   -0.001200    0.000400
LQRRV
Computes the least-squares solution using Householder transformations applied in blocked form.
Required Arguments
A Real LDA by (NCA + NUMEXC) array containing the matrix and right-hand sides. (Input)
The right-hand sides are input in A(1 : NRA, NCA + j), j = 1, …, NUMEXC. The array A
is preserved upon output. The Householder factorization of the matrix is computed and
used to solve the systems.
X Real LDX by NUMEXC array containing the solution. (Output)
Optional Arguments
NRA Number of rows in the matrix. (Input)
Default: NRA = size (A,1).
NCA Number of columns in the matrix. (Input)
Default: NCA = size (A,2) - NUMEXC.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
The routine LQRRV computes the QR decomposition of a matrix A using blocked Householder
transformations. The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK
code depending upon which supporting libraries are used during linking. For a detailed
explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction
section of this manual. The standard algorithm is based on the storage-efficient WY representation
for products of Householder transformations. See Schreiber and Van Loan (1989).
The routine LQRRV determines an orthogonal matrix Q and an upper triangular matrix R such that
A = QR. The QR factorization of a matrix A having NRA rows and NCA columns is as follows:
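Initialize $A_0 \leftarrow A$
For k = 1, min(NRA - 1, NCA)
Determine a Householder transformation for column k of $A_{k-1}$ having the form
\[ H_k = I - \tau_k u_k u_k^T \]
Update
\[ A_k \leftarrow H_k A_{k-1} = A_{k-1} - \tau_k u_k \left( A_{k-1}^T u_k \right)^T \]
End k
Thus,
\[ A_p = H_p H_{p-1} \cdots H_1 A = Q^T A = R \]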
where p = min(NRA − 1, NCA). The matrix Q is not produced directly by LQRRV. The information
needed to construct the Householder transformations is saved instead. If the matrix Q is needed
explicitly, $Q^T$ can be determined while the matrix is factored. No pivoting among the columns is
done. The primary purpose of LQRRV is to give the user a high-performance QR least-squares
solver. It is intended for least-squares problems that are well-posed. For background, see Golub
and Van Loan (1989, page 225). During the QR factorization, the most time-consuming step is
computing the matrix-vector update $A_k \leftarrow H_k A_{k-1}$. The routine LQRRV constructs blocks of NB
Householder transformations in which the update is rich in matrix multiplication. The product of
NB Householder transformations is written in the form
\[ H_k H_{k+1} \cdots H_{k+nb-1} = I + Y T Y^T \]
where Y (NRA × NB) is a lower trapezoidal matrix and T (NB × NB) is upper triangular. The optimal choice of
the block size parameter NB varies among computer systems. Users may want to change it from its
default value of 1.
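The unblocked form of the factorization loop above can be written out in a few lines. The following is a sketch under stated assumptions, not the LQRRV implementation: the program name householder_ls_sketch is hypothetical, no blocking (NB) is attempted, double precision is used, and the data are those of the example for this routine (a 5 by 3 matrix with two right-hand sides stored as extra columns). Each step builds u with zeros in its first k − 1 entries, applies H = I − τuuᵀ to the remaining columns and to the right-hand sides, and a back substitution with R then gives the least-squares solutions.

```fortran
program householder_ls_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 5, n = 3, nrhs = 2
  real(dp) :: a(m, n+nrhs), u(m), x(n, nrhs), s, tau
  integer  :: j, k

  ! Columns 1..n hold the matrix, columns n+1..n+nrhs the right-hand sides.
  a = reshape([ 1.0_dp,  1.0_dp,  1.0_dp,  1.0_dp,   1.0_dp, &
                2.0_dp,  4.0_dp,  6.0_dp,  8.0_dp,  10.0_dp, &
                4.0_dp, 16.0_dp, 36.0_dp, 64.0_dp, 100.0_dp, &
                7.0_dp, 21.0_dp, 43.0_dp, 73.0_dp, 111.0_dp, &
               10.0_dp, 10.0_dp,  9.0_dp, 10.0_dp,  10.0_dp], &
              [m, n+nrhs])

  do k = 1, min(m-1, n)
    ! u is zero in its first k-1 entries; H = I - tau*u*u^T maps the
    ! trailing part of column k onto a multiple of the unit vector e_k.
    u      = 0.0_dp
    u(k:m) = a(k:m, k)
    s      = sqrt(dot_product(u, u))
    if (s == 0.0_dp) cycle
    if (u(k) > 0.0_dp) s = -s        ! sign chosen to avoid cancellation
    u(k) = u(k) - s
    tau  = 2.0_dp/dot_product(u, u)
    ! Apply H to the remaining matrix columns and to every right-hand side.
    do j = k, n + nrhs
      a(:, j) = a(:, j) - tau*dot_product(u, a(:, j))*u
    end do
  end do

  ! Back substitution with the upper triangular factor R = a(1:n, 1:n).
  do j = 1, nrhs
    do k = n, 1, -1
      x(k, j) = (a(k, n+j) - dot_product(a(k, k+1:n), x(k+1:n, j)))/a(k, k)
    end do
  end do

  ! One row of the solution matrix per line.
  do k = 1, n
    print '(2f10.4)', x(k, :)
  end do
end program householder_ls_sketch
```

The first right-hand side is reproduced exactly by 1 + x + x², so its solution column should be close to (1, 1, 1). The second column should agree with the example output (10.80, −0.43, 0.04) to the digits shown there. The column-by-column update in the inner loop is the matrix-vector operation that the blocked formulation replaces with matrix products.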
Comments
1.
2. Informational errors
   Type   Code
   4      1     The input matrix is singular.
3.
This option allows the user to reset the blocking factor used in computing the
factorization. On some computers, changing IVAL(*) to a value larger than 1
will result in greater efficiency. The value IVAL(*) is the maximum value to use.
(The software is specialized so that IVAL(*) is reset to an optimal used value
within routine L2RRV.) The user can control the blocking by resetting IVAL(*)
to a smaller value than the default. Default values are IVAL(*) = 1, IMACH(5).
This option is the vector dimension where a shift is made from in-line level-2
loops to the use of level-2 BLAS in forming the partial product of Householder
transformations. Default value is IVAL(*) = IMACH(5).
10
This option allows the user to control the factorization step. If the value is 1 the
Householder factorization will be computed. If the value is 2, the factorization
will not be computed. In this latter case the decomposition has already been
computed. Default value is IVAL(*) = 1.
11
This option allows the user to control the solving steps. The rules for IVAL(*)
are:
1. Compute $b \leftarrow Q^T b$, and $x \leftarrow R^{+} b$.
2. Compute $b \leftarrow Q^T b$.
3. Compute $b \leftarrow Q b$.
4. Compute $x \leftarrow R^{+} b$.
Default value is IVAL (*) = 1. Note that IVAL (*) = 2 or 3 may only be set when
calling L2RRV/DL2RRV.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA, MXLDX, MXCOL, and MXCOLX can
be obtained through a call to SCALAPACK_GETDIM (see Utilities) after a call to
SCALAPACK_SETUP (see Utilities) has been made. See the ScaLAPACK Example below.
Example
Given a real m × k matrix B it is often necessary to compute the k least-squares solutions of the
linear system AX = B, where A is an m × n real matrix. When m > n the system is considered
overdetermined. A solution with a zero residual normally does not exist. Instead the minimization
problem
\[ \min_{x_j \in \mathbb{R}^n} \| A x_j - b_j \|_2 \]
is solved k times where $x_j$, $b_j$ are the j-th columns of the matrices X, B respectively. When A is of
full column rank there exists a unique solution $X_{LS}$ that solves the above minimization problem. By
using the routine LQRRV, $X_{LS}$ is computed.
USE LQRRV_INT
USE WRRRN_INT
USE SGEMM_INT
!                                 Declare variables
      INTEGER    LDA, LDX, NCA, NRA, NUMEXC
      PARAMETER  (NCA=3, NRA=5, NUMEXC=2, LDA=NRA, LDX=NCA)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      REAL       X(LDX,NUMEXC)
!                                 SPECIFICATIONS FOR SAVE VARIABLES
      REAL       A(LDA,NCA+NUMEXC)
      SAVE       A
!                                 SPECIFICATIONS FOR SUBROUTINES
!
!                                 Set values for A and the
!                                 righthand sides.
!
!                                 A = (  1    2     4   |   7   10)
!                                     (  1    4    16   |  21   10)
!                                     (  1    6    36   |  43    9)
!                                     (  1    8    64   |  73   10)
!                                     (  1   10   100   | 111   10)
!
      DATA A/5*1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 4.0, 16.0, 36.0, 64.0, &
           100.0, 7.0, 21.0, 43.0, 73.0, 111.0, 2*10., 9., 2*10./
!
!
!
!
!
END
Output
   SOLUTIONS 1-2
        1       2
1    1.00   10.80
2    1.00   -0.43
3    1.00    0.04

   RESIDUALS 1-2
          1         2
1    0.0000    0.0857
2    0.0000   -0.3429
3    0.0000    0.5143
4    0.0000   -0.3429
5    0.0000    0.0857
ScaLAPACK Example
The previous example is repeated here as a distributed computing example. Given a real m × k
matrix B it is often necessary to compute the k least-squares solutions of the linear system
AX = B, where A is an m × n real matrix. When m > n the system is considered overdetermined. A
solution with a zero residual normally does not exist. Instead the minimization problem
\[ \min_{x_j \in \mathbb{R}^n} \| A x_j - b_j \|_2 \]
is solved k times where $x_j$, $b_j$ are the j-th columns of the matrices X, B respectively. When A is of
full column rank there exists a unique solution $X_{LS}$ that solves the above minimization problem. By
using the routine LQRRV, XLS is computed. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL
utility routines (see Chapter 11, Utilities) used to map and unmap arrays to and from the
processor grid. They are used here for brevity. DESCINIT is a ScaLAPACK tools routine which
initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LQRRV_INT
USE SGEMM_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    LDA, LDX, NCA, NRA, NUMEXC, DESCA(9), DESCX(9)
      INTEGER    INFO, MXCOL, MXLDA, MXLDX, MXCOLX
      INTEGER    K
      REAL, ALLOCATABLE ::    A(:,:), X(:,:)
      REAL, ALLOCATABLE ::    A0(:,:), X0(:,:)
      PARAMETER  (NRA=5, NCA=3, NUMEXC=2, LDA=NRA, LDX=NCA)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,NCA+NUMEXC), X(LDX, NUMEXC))
!                                 Set values for A and the righthand sides
          A(1,:) = (/ 1.0,  2.0,   4.0,   7.0, 10.0/)
          A(2,:) = (/ 1.0,  4.0,  16.0,  21.0, 10.0/)
          A(3,:) = (/ 1.0,  6.0,  36.0,  43.0,  9.0/)
          A(4,:) = (/ 1.0,  8.0,  64.0,  73.0, 10.0/)
          A(5,:) = (/ 1.0, 10.0, 100.0, 111.0, 10.0/)
      ENDIF
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
LSBRR
Solves a linear least-squares problem with iterative refinement.
Required Arguments
A Real NRA by NCA matrix containing the coefficient matrix of the least-squares system to
be solved. (Input)
B Real vector of length NRA containing the right-hand side of the least-squares system.
(Input)
X Real vector of length NCA containing the solution vector with components corresponding
to the columns not used set to zero. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = size (A,1).
NCA Number of columns of A. (Input)
Default: NCA = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
TOL Real scalar containing the nonnegative tolerance used to determine the subset of
columns of A to be included in the solution. (Input)
If TOL is zero, a full complement of min(NRA, NCA) columns is used. See Comments.
Default: TOL = 0.0
RES Real vector of length NRA containing the residual vector B − AX. (Output)
KBASIS Integer scalar containing the number of columns used in the solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LSBRR solves the linear least-squares problem using iterative refinement. The iterative
refinement algorithm is due to Björck (1967, 1968). It is also described by Golub and Van Loan
(1983, pages 182-183).
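As a reminder of what iterative refinement means for least squares, the following sketch states the augmented-system formulation usually attributed to Björck. It is a textbook outline and not a description of the internals of LSBRR; the symbols f, g, δr and δx are introduced here only for illustration.

```latex
% Least-squares solution x and residual r = b - Ax as one linear system.
\[
  \begin{bmatrix} I & A \\ A^T & 0 \end{bmatrix}
  \begin{bmatrix} r \\ x \end{bmatrix}
  =
  \begin{bmatrix} b \\ 0 \end{bmatrix}
\]
% Residuals of the augmented system for the current (r, x).
\[
  f = b - r - A x, \qquad g = -A^T r
\]
% Correction step, solved with the QR factors already computed.
\[
  \begin{bmatrix} I & A \\ A^T & 0 \end{bmatrix}
  \begin{bmatrix} \delta r \\ \delta x \end{bmatrix}
  =
  \begin{bmatrix} f \\ g \end{bmatrix},
  \qquad
  r \leftarrow r + \delta r, \quad x \leftarrow x + \delta x
\]
```

The second block row of the first system is $A^T r = 0$, the normal-equation condition for a least-squares solution. In this formulation, f and g are usually accumulated in higher precision than the factorization, and the correction step is repeated until the updates become negligible.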
Comments
1.
2.
Informational error
Type
Code
4
3.
Routine LSBRR calculates the QR decomposition with pivoting of a matrix A and tests
the diagonal elements against a user-supplied tolerance TOL. The first integer
KBASIS = k is determined for which
\[ |r_{k+1,k+1}| < \mathrm{TOL} \cdot |r_{11}| \]
In effect, this condition implies that a set of columns with a condition number
approximately bounded by 1.0/TOL is used. Then, LQRSL performs a truncated fit of the
first KBASIS columns of the permuted A to an input vector B. The coefficient of this fit
is unscrambled to correspond to the original columns of A, and the coefficients
corresponding to unused columns are set to zero. It may be helpful to scale the rows
and columns of A so that the error estimates in the elements of the scaled matrix are
roughly equal to TOL. The iterative refinement method of Björck is then applied to this
factorization.
4.
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2BRR the leading dimension of QR is increased by IVAL(3)
when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSBRR.
Additional memory allocation for QR and option value restoration are done
This option has two values that determine if the L1 condition number is to be
computed. Routine LSBRR temporarily replaces IVAL(2) by IVAL(1). The
routine L2CRG computes the condition number if IVAL(2) = 2. Otherwise L2CRG
skips this computation. LSBRR restores the option. Default values for the option
are IVAL(*) = 1, 2.
Example
This example solves the linear least-squares problem with A, an 8 × 4 matrix. Note that the second
and fourth columns of A are identical. Routine LSBRR determines that there are three columns in
the basis.
USE LSBRR_INT
USE UMACH_INT
USE WRRRN_INT
!                                 Declare variables
      PARAMETER  (NRA=8, NCA=4, LDA=NRA)
      REAL       A(LDA,NCA), B(NRA), X(NCA), RES(NRA), TOL
!
!                                 Set values for A
!
!                                 A = (  1    5    15    5  )
!                                     (  1    4    17    4  )
!                                     (  1    7    14    7  )
!                                     (  1    3    18    3  )
!                                     (  1    1    15    1  )
!                                     (  1    8    11    8  )
!                                     (  1    3     9    3  )
!                                     (  1    4    10    4  )
!
      DATA A/8*1, 5., 4., 7., 3., 1., 8., 3., 4., 15., 17., 14., &
           18., 15., 11., 9., 10., 5., 4., 7., 3., 1., 8., 3., 4. /
!
!
!
!
!
!
!
END
Chapter 1: Linear Systems
LSBRR 461
Output
KBASIS =   3

              X
    1       2       3       4
0.636   2.845   1.058   0.000

                             RES
     1       2       3       4       5       6       7       8
-0.733   0.996  -0.365   0.783  -1.353  -0.036   1.306  -0.597
LCLSQ
Solves a linear least-squares problem with linear constraints.
Required Arguments
A Matrix of dimension NRA by NCA containing the coefficients of the NRA least squares
equations. (Input)
B Vector of length NRA containing the right-hand sides of the least squares equations.
(Input)
C Matrix of dimension NCON by NCA containing the coefficients of the NCON constraints.
(Input)
If NCON = 0, C is not referenced.
BL Vector of length NCON containing the lower limit of the general constraints. (Input)
If there is no lower limit on the I-th constraint, then BL(I) will not be referenced.
BU Vector of length NCON containing the upper limit of the general constraints. (Input)
If there is no upper limit on the I-th constraint, then BU(I) will not be referenced. If
there is no range constraint, BL and BU can share the same storage locations.
IRTYPE Vector of length NCON indicating the type of constraints exclusive of simple
bounds, where IRTYPE(I) = 0, 1, 2, 3 indicates .EQ., .LE., .GE., and range
constraints respectively. (Input)
XLB Vector of length NCA containing the lower bound on the variables. (Input)
If there is no lower bound on the I-th variable, then XLB(I) should be set to 1.0E30.
XUB Vector of length NCA containing the upper bound on the variables. (Input)
If there is no upper bound on the I-th variable, then XUB(I) should be set to 1.0E30.
X Vector of length NCA containing the approximate solution. (Output)
Optional Arguments
NRA Number of least-squares equations. (Input)
Default: NRA = size (A,1).
NCA Number of variables. (Input)
Default: NCA = size (A,2).
NCON Number of constraints. (Input)
Default: NCON = size (C,1).
LDA Leading dimension of A exactly as specified in the dimension statement of the
calling program. (Input)
LDA must be at least NRA.
Default: LDA = size (A,1).
LDC Leading dimension of C exactly as specified in the dimension statement of the calling
program. (Input)
LDC must be at least NCON.
Default: LDC = size (C,1).
RES Vector of length NRA containing the residuals B - AX of the least-squares equations at
the approximate solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LCLSQ (NRA, NCA, NCON, A, LDA, B, C, LDC, BL, BU, IRTYPE, XLB, XUB,
X, RES)
Double:
Description
The routine LCLSQ solves linear least-squares problems with linear constraints. These are systems
of least-squares equations of the form $Ax \cong b$
subject to
$b_l \le Cx \le b_u$
$x_l \le x \le x_u$
Here, A is the coefficient matrix of the least-squares equations, b is the right-hand side, and C is
the coefficient matrix of the constraints. The vectors bl, bu, xl and xu are the lower and upper
bounds on the constraints and the variables, respectively. The system is solved by defining
dependent variables y = Cx and then solving the least squares system with the lower and upper
bounds on x and y. The equation Cx - y = 0 is a set of equality constraints. These constraints are
realized by heavy weighting, i.e., a penalty method; see Hanson (1986, pages 826-834).
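As a hedged illustration of the heavy-weighting idea, one common way to write such a penalty formulation is shown below. The weight $w$ is a hypothetical symbol introduced here; the manual does not state how LCLSQ chooses it, so this is a sketch of the general technique rather than the routine's exact formulation.

```latex
\min_{x,\,y}\;
\left\|
\begin{bmatrix} A & 0 \\ wC & -wI \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
-
\begin{bmatrix} b \\ 0 \end{bmatrix}
\right\|_2
\quad \text{subject to} \quad
x_l \le x \le x_u, \qquad b_l \le y \le b_u .
```

For a large weight $w$, the second block row forces $Cx - y \approx 0$, so the simple bounds on $y$ act as the general constraints $b_l \le Cx \le b_u$.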
Comments
1.
2.
Informational errors
Type
Code
3
4
4
4
3.
4.
1
2
3
4
Debug output flag. If more detailed output is desired, set this option to the value
1. Otherwise, set it to 0. Default value is 0.
14
The value of this option is the relative rank determination tolerance to be used.
Default value is sqrt(AMACH (4)).
The value of this option is the absolute rank determination tolerance to be used.
Default value is sqrt(AMACH (4)).
Example
A linear least-squares problem with linear constraints is solved.
USE LCLSQ_INT
USE UMACH_INT
USE SNRM2_INT
!
!     Solve the following in the least squares sense:
!           3x1 + 2x2 + x3 = 3.3
!           4x1 + 2x2 + x3 = 2.3
!           2x1 + 2x2 + x3 = 1.3
!            x1 +  x2 + x3 = 1.0
!
!     Subject to:  x1 + x2 + x3 <= 1
!                  0 <= x1 <= .5
!                  0 <= x2 <= .5
!                  0 <= x3 <= .5
!
! ----------------------------------------------------------------------
!                                 Declaration of variables
!
INTEGER    NRA, NCA, MCON, LDA, LDC
PARAMETER  (NRA=4, NCA=3, MCON=1, LDC=MCON, LDA=NRA)
!
INTEGER    IRTYPE(MCON), NOUT
REAL       A(LDA,NCA), B(NRA), BC(MCON), C(LDC,NCA), RES(NRA), &
           RESNRM, XSOL(NCA), XLB(NCA), XUB(NCA)
!                                 Data initialization
!
DATA A/3.0E0, 4.0E0, 2.0E0, 1.0E0, 2.0E0, &
     2.0E0, 2.0E0, 1.0E0, 1.0E0, 1.0E0, 1.0E0, 1.0E0/, &
     B/3.3E0, 2.3E0, 1.3E0, 1.0E0/, &
     C/3*1.0E0/, &
     BC/1.0E0/, IRTYPE/1/, XLB/3*0.0E0/, XUB/3*.5E0/
!
!                                 Solve the bounded, constrained
!                                 least squares problem.
!
CALL LCLSQ (A, B, C, BC, BC, IRTYPE, XLB, XUB, XSOL, RES=RES)
!                                 Compute the 2-norm of the residuals.
RESNRM = SNRM2 (NRA, RES, 1)
!                                 Print results
CALL UMACH (2, NOUT)
WRITE (NOUT, 999) XSOL, RES, RESNRM
!
999 FORMAT (' The solution is ', 3F9.4, //, ' The residuals ', &
           'evaluated at the solution are ', /, 18X, 4F9.4, //, &
           ' The norm of the residual vector is ', F8.4)
!
END
Output
The solution is      0.5000   0.3000   0.2000

The residuals evaluated at the solution are
                    -1.0000   0.5000   0.5000   0.0000

The norm of the residual vector is   1.2247
LQRRR
CAPABLE
Required Arguments
A Real NRA by NCA matrix containing the matrix whose QR factorization is to be
computed. (Input)
QR Real NRA by NCA matrix containing information required for the QR factorization.
(Output)
The upper trapezoidal part of QR contains the upper trapezoidal part of R with its
diagonal elements ordered in decreasing magnitude. The strict lower trapezoidal part of
QR contains information to recover the orthogonal matrix Q of the factorization.
Arguments A and QR can occupy the same storage locations. In this case, A will not be
preserved on output.
QRAUX Real vector of length NCA containing information about the orthogonal part of the
decomposition in the first min(NRA, NCA) position. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = size (A,1).
NCA Number of columns of A. (Input)
Default: NCA = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
PIVOT Logical variable. (Input)
PIVOT = .TRUE. means column pivoting is enforced.
PIVOT = .FALSE. means column pivoting is not done.
Default: PIVOT = .TRUE.
IPVT Integer vector of length NCA containing information that controls the final order of
the columns of the factored matrix A. (Input/Output)
On input, if IPVT(K) > 0, then the K-th column of A is an initial column. If IPVT(K) = 0,
then the K-th column of A is a free column. If IPVT(K) < 0, then the K-th column of A is
a final column. See Comments.
On output, IPVT(K) contains the index of the column of A that has been interchanged
into the K-th column. This defines the permutation matrix P. The array IPVT is
referenced only if PIVOT is equal to .TRUE.
Default: IPVT = 0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LQRRR (NRA, NCA, A, LDA, PIVOT, IPVT, QR, LDQR, QRAUX, CONORM)
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
The routine LQRRR computes the QR decomposition of a matrix using Householder
transformations. The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK
code depending upon which supporting libraries are used during linking. For a detailed
explanation see Using ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction
section of this manual.
LQRRR determines an orthogonal matrix Q, a permutation matrix P, and an upper trapezoidal
matrix R with diagonal elements of nonincreasing magnitude, such that AP = QR. The
Householder transformation for column k is of the form
$I - \frac{u_k u_k^T}{p_k}$
for k = 1, 2, ..., min(NRA, NCA), where u has zeros in the first k - 1 positions. The matrix Q is not
produced directly by LQRRR. Instead the information needed to reconstruct the Householder
transformations is saved. If the matrix Q is needed explicitly, the subroutine LQERR can be called
after LQRRR. This routine accumulates Q from its factored form.
Before the decomposition is computed, initial columns are moved to the beginning of the array A
and the final columns to the end. Both initial and final columns are frozen in place during the
computation. Only free columns are pivoted. Pivoting, when requested, is done on the free
columns of largest reduced norm.
Comments
1.
2.
$I - u_k^{-1} u u^T$
where u has zeros in the first k - 1 positions. If the explicit matrix Q is needed, the user
can call routine LQERR after calling LQRRR. This routine accumulates Q from its
factored form.
3.
Before the decomposition is computed, initial columns are moved to the beginning and
the final columns to the end of the array A. Both initial and final columns are not
moved during the computation. Only free columns are moved. Pivoting, if requested, is
done on the free columns of largest reduced norm.
4.
When pivoting has been selected by having entries of IPVT initialized to zero, an
estimate of the condition number of A can be obtained from the output by computing
the magnitude of the number QR(1, 1)/QR(K, K), where K = MIN(NRA, NCA). This
estimate can be used to select the number of columns, KBASIS, used in the solution
step computed with routine LQRSL.
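The condition estimate in the last comment can be sketched as below. This is a self-contained illustration, not a call into the library: the QR array is filled by hand with hypothetical diagonal values standing in for LQRRR output, and the names COND_SKETCH and CONDEST are invented for the example.

```fortran
! Hypothetical sketch: condition estimate |QR(1,1)/QR(K,K)|, K = MIN(NRA,NCA).
PROGRAM COND_SKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: NRA = 4, NCA = 3
  REAL    :: QR(NRA,NCA), CONDEST
  INTEGER :: K
  ! Stand-in for LQRRR output; only the diagonal of R matters here.
  QR = 0.0
  QR(1,1) = -75.26
  QR(2,2) = -2.65
  QR(3,3) = 0.36
  K = MIN(NRA, NCA)
  ! Assumes QR(K,K) is nonzero, i.e. the matrix has full column rank.
  CONDEST = ABS(QR(1,1)/QR(K,K))
  WRITE (*,*) 'Estimated condition number = ', CONDEST
END PROGRAM COND_SKETCH
```

For these sample values the estimate is roughly 209. A large estimate suggests truncating the basis by passing a smaller KBASIS to LQRSL.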
A0 MXLDA by MXCOL local matrix containing the local portions of the distributed matrix
A. A contains the matrix whose QR factorization is to be computed. (Input)
QR0 MXLDA by MXCOL local matrix containing the local portions of the distributed
matrix QR. QR contains the information required for the QR factorization. (Output)
The upper trapezoidal part of QR contains the upper trapezoidal part of R with its
diagonal elements ordered in decreasing magnitude. The strict lower trapezoidal part of
QR contains information to recover the orthogonal matrix Q of the factorization.
Arguments A and QR can occupy the same storage locations. In this case, A will not be
preserved on output.
QRAUX0 Real vector of length MXCOL containing the local portions of the distributed
matrix QRAUX. QRAUX contains information about the orthogonal part of the
decomposition in the first MIN(NRA, NCA) position. (Output)
IPVT0 Integer vector of length MXLDB containing the local portions of the distributed
vector IPVT. IPVT contains the information that controls the final order of the
columns of the factored matrix A. (Input/Output)
On input, if IPVT(K) > 0, then the K-th column of A is an initial column. If IPVT(K) = 0,
then the K-th column of A is a free column. If IPVT(K) < 0, then the K-th column of A is
a final column. See Comments.
On output, IPVT(K) contains the index of the column of A that has been interchanged
into the K-th column. This defines the permutation matrix P. The array IPVT is
referenced only if PIVOT is equal to .TRUE.
Default: IPVT = 0.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA, MXLDB, and MXCOL can be obtained
through a call to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see
Utilities) has been made. See the ScaLAPACK Example below.
Example
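In various statistical algorithms it is necessary to compute $q = x^T (A^T A)^{-1} x$, where A is a rectangular
matrix of full column rank. By using the QR decomposition, q can be computed without forming
$A^T A$. Note that
$A^T A = (Q R P^{-1})^T (Q R P^{-1}) = P^{-T} R^T (Q^T Q) R P^{-1} = P R^T R P^T$
since Q is orthogonal and $P^{-1} = P^T$. Hence $q = \| R^{-T} P^{-1} x \|_2^2$. The program first sets $t = P^{-1} x$ and then overwrites t with
$t := R^{-T} t$
Finally,
$q = \| t \|_2^2$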
USE IMSL_LIBRARIES
!                                 Declare variables
INTEGER    LDA, LDQR, NCA, NRA
PARAMETER  (NCA=3, NRA=4, LDA=NRA, LDQR=NRA)
!                                 SPECIFICATIONS FOR PARAMETERS
INTEGER    LDQ
PARAMETER  (LDQ=NRA)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
INTEGER    IPVT(NCA), NOUT
REAL       CONORM(NCA), Q, QR(LDQR,NCA), QRAUX(NCA), T(NCA)
LOGICAL    PIVOT
REAL       A(LDA,NCA), X(NCA)
!
!                                 Set values for A
!
!                                 A = (  1    2    4  )
!                                     (  1    4   16  )
!                                     (  1    6   36  )
!                                     (  1    8   64  )
!
DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!
!                                 Set values for X
!
!                                 X = (  1    2    3  )
DATA X/1.0, 2.0, 3.0/
!
!                                 QR factorization
PIVOT = .TRUE.
IPVT = 0
CALL LQRRR (A, QR, QRAUX, PIVOT=PIVOT, IPVT=IPVT)
!                                 Set t = inv(P)*x
CALL PERMU (X, IPVT, T, IPATH=1)
!                                 Compute t = inv(trans(R))*t
CALL LSLRT (QR, T, T, IPATH=4)
!                                 Compute 2-norm of t, squared.
Q = SDOT(NCA,T,1,T,1)
!                                 Print result
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'Q = ', Q
!
END
Output
Q =   0.840624
ScaLAPACK Example
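The previous example is repeated here as a distributed computing example. In various statistical
algorithms it is necessary to compute $q = x^T (A^T A)^{-1} x$, where A is a rectangular matrix of full
column rank. By using the QR decomposition, q can be computed without forming $A^T A$. Note that
$A^T A = (Q R P^{-1})^T (Q R P^{-1}) = P^{-T} R^T (Q^T Q) R P^{-1} = P R^T R P^T$
since Q is orthogonal and $P^{-1} = P^T$. Hence $q = \| R^{-T} P^{-1} x \|_2^2$. The program first sets $t = P^{-1} x$ and then overwrites t with $t := R^{-T} t$.
Finally,
$q = \| t \|_2^2$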
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities)
used to map and unmap arrays to and from the processor grid. They are used here for brevity.
DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LQRRR_INT
USE PERMU_INT
USE LSLRT_INT
USE UMACH_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE 'mpif.h'
!                                 Declare variables
INTEGER    LDA, LDQR, NCA, NRA, DESCA(9), DESCB(9), DESCL(9)
INTEGER    INFO, MXCOL, MXLDA, MXLDB, MXCOLB, NOUT
INTEGER, ALLOCATABLE ::    IPVT(:), IPVT0(:)
LOGICAL    PIVOT
REAL       Q
REAL, ALLOCATABLE ::       A(:,:), X(:), T(:)
REAL, ALLOCATABLE ::       A0(:,:), T0(:), QR0(:,:), QRAUX0(:)
REAL (KIND(1E0)) SDOT
PARAMETER  (NRA=4, NCA=3, LDA=NRA, LDQR=NRA)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,NCA), X(NCA), T(NCA), IPVT(NCA))
!                                 Set values for A and the righthand side
   A(1,:) = (/ 1.0, 2.0,  4.0/)
   A(2,:) = (/ 1.0, 4.0, 16.0/)
   A(3,:) = (/ 1.0, 6.0, 36.0/)
   A(4,:) = (/ 1.0, 8.0, 64.0/)
!
   X = (/ 1.0, 2.0, 3.0/)
!
   IPVT = 0
ENDIF
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
Q =   0.840624
LQERR
CAPABLE
Accumulates the orthogonal matrix Q from its factored form given the QR factorization of a
rectangular matrix A.
Required Arguments
QR Real NRQR by NCQR matrix containing the factored form of the matrix Q in the first
min(NRQR, NCQR) columns of the strict lower trapezoidal part of QR as output from
subroutine LQRRR/DLQRRR. (Input)
QRAUX Real vector of length NCQR containing information about the orthogonal part of
the decomposition in the first min(NRQR, NCQR) position as output from routine
LQRRR/DLQRRR. (Input)
Q Real NRQR by NRQR matrix containing the accumulated orthogonal matrix Q; Q and QR
can share the same storage locations if QR is not needed. (Output)
Optional Arguments
NRQR Number of rows in QR. (Input)
Default: NRQR = size (QR,1).
NCQR Number of columns in QR. (Input)
Default: NCQR = size (QR,2).
LDQR Leading dimension of QR exactly as specified in the dimension statement of the
calling program. (Input)
Default: LDQR = size (QR,1).
LDQ Leading dimension of Q exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDQ = size (Q,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
The routine LQERR accumulates the Householder transformations computed by IMSL routine
LQRRR to produce the orthogonal matrix Q.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
Q0 MXLDA by MXLDA local matrix containing the local portions of the distributed matrix
Q. Q contains the accumulated orthogonal matrix; Q and QR can share the same storage
locations if QR is not needed. (Output)
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA and MXCOL can be obtained through a
call to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see Utilities)
has been made. See the ScaLAPACK Example below.
Example
In this example, the orthogonal matrix Q in the QR decomposition of a matrix A is computed. The
product X = QR is also computed. Note that X can be obtained from A by reordering the columns
of A according to IPVT.
USE IMSL_LIBRARIES
!                                 Declare variables
INTEGER    LDA, LDQ, LDQR, NCA, NRA
PARAMETER  (NCA=3, NRA=4, LDA=NRA, LDQ=NRA, LDQR=NRA)
!
INTEGER    IPVT(NCA), J
REAL       A(LDA,NCA), CONORM(NCA), Q(LDQ,NRA), QR(LDQR,NCA), &
           QRAUX(NCA), R(NRA,NCA), X(NRA,NCA)
LOGICAL    PIVOT
!
!                                 A = (  1    2    4  )
!                                     (  1    4   16  )
!                                     (  1    6   36  )
!                                     (  1    8   64  )
!
DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!
!                                 QR factorization
!                                 Set IPVT = 0 (all columns free)
IPVT = 0
PIVOT = .TRUE.
CALL LQRRR (A, QR, QRAUX, IPVT=IPVT, PIVOT=PIVOT)
!                                 Accumulate Q
CALL LQERR (QR, QRAUX, Q)
!                                 R is the upper trapezoidal part of QR
R = 0.0E0
DO 10 J=1, NCA
   CALL SCOPY (J, QR(:,J), 1, R(:,J), 1)
10 CONTINUE
!                                 Compute X = Q*R
CALL MRRRR (Q, R, X)
!                                 Print results
CALL WRIRN ('IPVT', IPVT, 1, NCA, 1)
CALL WRRRN ('Q', Q)
CALL WRRRN ('R', R)
CALL WRRRN ('X = Q*R', X)
!
END
Output
     IPVT
 1   2   3
 3   2   1

                   Q
         1        2        3        4
1  -0.0531  -0.5422   0.8082  -0.2236
2  -0.2126  -0.6574  -0.2694   0.6708
3  -0.4783  -0.3458  -0.4490  -0.6708
4  -0.8504   0.3928   0.2694   0.2236

            R
        1       2       3
1  -75.26  -10.63   -1.59
2    0.00   -2.65   -1.15
3    0.00    0.00    0.36
4    0.00    0.00    0.00

        X = Q*R
       1      2      3
1   4.00   2.00   1.00
2  16.00   4.00   1.00
3  36.00   6.00   1.00
4  64.00   8.00   1.00
ScaLAPACK Example
In this example, the orthogonal matrix Q in the QR decomposition of a matrix A is computed.
SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines (see Chapter 11, Utilities)
used to map and unmap arrays to and from the processor grid. They are used here for brevity.
DESCINIT is a ScaLAPACK tools routine which initializes the descriptors for the local arrays.
USE MPI_SETUP_INT
USE LQRRR_INT
USE LQERR_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE 'mpif.h'
!                                 Declare variables
INTEGER    LDA, LDQR, NCA, NRA, DESCA(9), DESCL(9), DESCQ(9)
INTEGER    INFO, MXCOL, MXLDA, LDQ
INTEGER, ALLOCATABLE ::    IPVT(:), IPVT0(:)
LOGICAL    PIVOT
REAL, ALLOCATABLE ::       A(:,:), QR(:,:), Q(:,:), QRAUX(:)
REAL, ALLOCATABLE ::       A0(:,:), QR0(:,:), Q0(:,:), QRAUX0(:)
PARAMETER  (NRA=4, NCA=3, LDA=NRA, LDQR=NRA, LDQ=NRA)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
!
   IPVT = 0
ENDIF
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
LQRSL
CAPABLE
Computes the coordinate transformation and projection, and completes the solution of the least-squares
problem Ax = b.
Required Arguments
KBASIS Number of columns of the submatrix Ak of A. (Input)
The value KBASIS must not exceed min(NRA, NCA), where NCA is the number of
columns in matrix A. The value NCA is an argument to routine LQRRR. The value of
KBASIS is normally NCA unless the matrix is rank-deficient. The user must analyze the
problem data and determine the value of KBASIS. See Comments.
QR NRA by NCA array containing information about the QR factorization of A as output
from routine LQRRR/DLQRRR. (Input)
QRAUX Vector of length NCA containing information about the QR factorization of A as
output from routine LQRRR/DLQRRR. (Input)
B Vector b of length NRA to be manipulated. (Input)
IPATH Option parameter specifying what is to be computed. (Input)
The value IPATH has the decimal expansion IJKLM, such that:
I ≠ 0 means compute Qb;
J ≠ 0 means compute $Q^T b$;
K ≠ 0 means compute $Q^T b$ and x;
L ≠ 0 means compute $Q^T b$ and b - Ax;
M ≠ 0 means compute $Q^T b$ and Ax.
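To make the decimal expansion concrete, the following self-contained sketch splits an IPATH value into its five digits, each nonzero digit switching on one of the computations listed above. The program and variable names are hypothetical; LQRSL performs its own decoding internally, and this only illustrates the convention.

```fortran
! Hypothetical sketch: split IPATH into the digits I, J, K, L, M.
PROGRAM IPATH_SKETCH
  IMPLICIT NONE
  INTEGER :: IPATH, I, J, K, L, M
  ! Value used in the LQRSL example: request x and the residual b - Ax.
  IPATH = 00110
  I = MOD(IPATH/10000, 10)
  J = MOD(IPATH/1000, 10)
  K = MOD(IPATH/100, 10)
  L = MOD(IPATH/10, 10)
  M = MOD(IPATH, 10)
  WRITE (*,'(5(A,I1))') ' I=', I, ' J=', J, ' K=', K, ' L=', L, ' M=', M
END PROGRAM IPATH_SKETCH
```

For IPATH = 00110 only K and L are nonzero, which matches a call that supplies just the optional arguments X and RES.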
Optional Arguments
NRA Number of rows of matrix A. (Input)
Default: NRA = size (QR,1).
LDQR Leading dimension of QR exactly as specified in the dimension statement of the
calling program. (Input)
Default: LDQR = size (QR,1).
QB Vector of length NRA containing Qb if requested in the option IPATH. (Output)
QTB Vector of length NRA containing $Q^T b$ if requested in the option IPATH. (Output)
X Vector of length KBASIS containing the solution of the least-squares problem $A_k x = b$, if
this is requested in the option IPATH. (Output)
If pivoting was requested in routine LQRRR/DLQRRR, then the J-th entry of X will be
associated with column IPVT(J) of the original matrix A. See Comments.
RES Vector of length NRA containing the residuals (b - Ax) of the least-squares problem if
requested in the option IPATH. (Output)
This vector is the orthogonal projection of b onto the orthogonal complement of the
column space of A.
AX Vector of length NRA containing the least-squares approximation Ax if requested in the
option IPATH. (Output)
This vector is the orthogonal projection of b onto the column space of A.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LQRSL (NRA, KBASIS, QR, LDQR, QRAUX, B, IPATH, QB, QTB, X, RES,
AX)
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
The most important use of LQRSL is for solving the least-squares problem Ax = b, with coefficient
matrix A and data vector b. This problem can be formulated, using the normal equations method,
as $A^T A x = A^T b$. Using LQRRR the QR decomposition of A, AP = QR, is computed. Here P is a
Comments
1.
Informational error
Type
Code
4
2.
This routine is designed to be used together with LQRRR. It assumes that LQRRR/DLQRRR
has been called to get QR, QRAUX and IPVT. The submatrix $A_k$ mentioned above is
actually equal to $A_k$ = (A(IPVT(1)), A(IPVT(2)), ..., A(IPVT(KBASIS))), where
A(IPVT(I)) is the IPVT(I)-th column of the original matrix.
X0 Real vector of length MXLDX containing the local portions of the distributed vector X. X
contains the solution of the least-squares problem $A_k x = b$, if this is requested in the
option IPATH. (Output)
If pivoting was requested in routine LQRRR/DLQRRR, then the J-th entry of X will be
associated with column IPVT(J) of the original matrix A. See Comments.
RES0 Real vector of length MXLDA containing the local portions of the distributed vector
RES. RES contains the residuals (b - Ax) of the least-squares problem if requested in
the option IPATH. (Output)
This vector is the orthogonal projection of b onto the orthogonal complement of the
column space of A.
AX0 Real vector of length MXLDA containing the local portions of the distributed vector
AX. AX contains the least-squares approximation Ax if requested in the option IPATH.
(Output)
This vector is the orthogonal projection of b onto the column space of A.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA, MXLDX and MXCOL can be obtained
through a call to SCALAPACK_GETDIM (see Utilities) after a call to SCALAPACK_SETUP (see
Utilities) has been made. See the ScaLAPACK Example below.
Example
Consider the problem of finding the coefficients $c_i$ in
$f(x) = c_0 + c_1 x + c_2 x^2$
given data at $x_i = 2i$, i = 1, 2, 3, 4, using the method of least squares. The i-th row of the matrix A
contains the values of 1, $x_i$ and $x_i^2$
at the data points. The vector b contains the data. The routine LQRRR is used to compute the QR
decomposition of A. LQRSL is then used to solve the least-squares problem and compute the
residual vector.
USE IMSL_LIBRARIES
!                                 Declare variables
PARAMETER  (NRA=4, NCA=3, KBASIS=3, LDA=NRA, LDQR=NRA)
INTEGER    IPVT(NCA)
REAL       A(LDA,NCA), QR(LDQR,NCA), QRAUX(NCA), CONORM(NCA), &
           X(KBASIS), QB(1), QTB(NRA), RES(NRA), &
           AX(1), B(NRA)
LOGICAL    PIVOT
!
!                                 Set values for A
!
!                                 A = (  1    2    4  )
!                                     (  1    4   16  )
!                                     (  1    6   36  )
!                                     (  1    8   64  )
!
DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!
!                                 Set values for B
!
!                                 B = ( 16.99  57.01  120.99  209.01 )
DATA B/ 16.99, 57.01, 120.99, 209.01 /
!
!                                 QR factorization
PIVOT = .TRUE.
IPVT = 0
CALL LQRRR (A, QR, QRAUX, PIVOT=PIVOT, IPVT=IPVT)
!                                 Solve the least squares problem
IPATH = 00110
CALL LQRSL (KBASIS, QR, QRAUX, B, IPATH, X=X, RES=RES)
!                                 Print results
CALL WRIRN ('IPVT', IPVT, 1, NCA, 1)
CALL WRRRN ('X', X, 1, KBASIS, 1)
CALL WRRRN ('RES', RES, 1, NRA, 1)
!
END
Output
     IPVT
 1   2   3
 3   2   1

           X
    1       2       3
3.000   2.002   0.990

                 RES
       1         2         3         4
-0.00400   0.01200  -0.01200   0.00400
Note that since IPVT is (3, 2, 1), the array X contains the solution coefficients $c_i$ in reverse order.
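The reordering can be undone with a short loop, since the J-th entry of X belongs to column IPVT(J) of the original matrix. The sketch below is self-contained and uses the printed values from this example; the array C and the program name are hypothetical and not part of the IMSL interface.

```fortran
! Hypothetical sketch: map the pivoted solution X back to original column order.
PROGRAM UNSCRAMBLE_SKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: KBASIS = 3, NCA = 3
  INTEGER :: IPVT(NCA), J
  REAL    :: X(KBASIS), C(NCA)
  ! Values as printed in the example output above.
  IPVT = (/ 3, 2, 1 /)
  X = (/ 3.000, 2.002, 0.990 /)
  ! Columns outside the basis keep a zero coefficient.
  C = 0.0
  DO J = 1, KBASIS
     C(IPVT(J)) = X(J)
  END DO
  ! C(1), C(2), C(3) now hold c0, c1, c2 in natural order.
  WRITE (*,'(3F8.3)') C
END PROGRAM UNSCRAMBLE_SKETCH
```

The same loop applies when KBASIS is smaller than NCA: the coefficients of the unused columns simply stay at zero.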
ScaLAPACK Example
The previous example is repeated here as a distributed example. Consider the problem of finding
the coefficients $c_i$ in
$f(x) = c_0 + c_1 x + c_2 x^2$
given data at $x_i = 2i$, i = 1, 2, 3, 4, using the method of least squares. The i-th row of the matrix A
contains the values of 1, $x_i$ and $x_i^2$
at the data points. The vector b contains the data. The routine LQRRR is used to compute the QR
decomposition of A. LQRSL is then used to solve the least-squares problem and compute the
residual vector. SCALAPACK_MAP and SCALAPACK_UNMAP are IMSL utility routines
(see Chapter 11, Utilities) used to map and unmap arrays to and from the processor grid. They
are used here for brevity. DESCINIT is a ScaLAPACK tools routine which initializes the
descriptors for the local arrays.
USE MPI_SETUP_INT
USE LQRRR_INT
USE LQRSL_INT
USE WRIRN_INT
USE WRRRN_INT
USE SCALAPACK_SUPPORT
IMPLICIT NONE
INCLUDE 'mpif.h'
!                                 Declare variables
INTEGER    KBASIS, LDA, LDQR, NCA, NRA, DESCA(9), DESCL(9), &
           DESCX(9), DESCB(9)
INTEGER    INFO, MXCOL, MXCOLX, MXLDA, MXLDX, LDQ, IPATH
INTEGER, ALLOCATABLE ::    IPVT(:), IPVT0(:)
REAL, ALLOCATABLE ::       A(:,:), B(:), QR(:,:), QRAUX(:), X(:), &
                           RES(:)
REAL, ALLOCATABLE ::       A0(:,:), QR0(:,:), QRAUX0(:), X0(:), &
                           RES0(:), B0(:), QTB0(:)
LOGICAL    PIVOT
PARAMETER  (NRA=4, NCA=3, LDA=NRA, LDQR=NRA, KBASIS=3)
!                                 Set up for MPI
MP_NPROCS = MP_SETUP()
IF(MP_RANK .EQ. 0) THEN
   ALLOCATE (A(LDA,NCA), B(NRA), QR(LDQR,NCA), &
             QRAUX(NCA), IPVT(NCA), X(NCA), RES(NRA))
!                                 Set values for A and the righthand sides
   A(1,:) = (/ 1.0, 2.0,  4.0/)
   A(2,:) = (/ 1.0, 4.0, 16.0/)
   A(3,:) = (/ 1.0, 6.0, 36.0/)
   A(4,:) = (/ 1.0, 8.0, 64.0/)
!
   B = (/ 16.99, 57.01, 120.99, 209.01 /)
!
   IPVT = 0
ENDIF
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
     IPVT
 1   2   3
 3   2   1

           X
    1       2       3
3.000   2.002   0.990

                 RES
       1         2         3         4
-0.00400   0.01200  -0.01200   0.00400
Note that since IPVT is (3, 2, 1), the array X contains the solution coefficients $c_i$ in reverse order.
LUPQR
Computes an updated QR factorization after the rank-one matrix $\alpha x y^T$ is added.
Required Arguments
ALPHA Scalar determining the rank-one update to be added. (Input)
Optional Arguments
NROW Number of rows in the matrix A = Q * R. (Input)
Default: NROW = size (W,1).
NCOL Number of columns in the matrix A = Q * R. (Input)
Default: NCOL = size (Y,1).
Q Matrix of order NROW containing the Q matrix from the QR factorization. (Input)
Ignored if IPATH = 0.
Default: Q is 1x1 and un-initialized.
LDQ Leading dimension of Q exactly as specified in the dimension statement of the calling
program. (Input)
Ignored if IPATH = 0.
Default: LDQ = size (Q,1).
LDR Leading dimension of R exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDR = size (R,1).
QNEW Matrix of order NROW containing the updated Q matrix in the QR factorization.
(Output)
Ignored if J = 0, see IPATH for definition of J.
LDQNEW Leading dimension of QNEW exactly as specified in the dimension statement of
the calling program. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let A be an m × n matrix and let A = QR be its QR decomposition. (In the program, m is called
NROW and n is called NCOL.) Then
$A + \alpha x y^T = QR + \alpha x y^T = Q(R + \alpha Q^T x y^T) = Q(R + \alpha w y^T)$
is orthogonal and
R = GH
is upper triangular.
If the last k components of w are zero, then the number of Givens rotations needed to construct
J or G is m - k - 1 instead of m - 1.
For further information, see Dennis and Schnabel (1983, pages 55-58 and 311-313), or Golub and
Van Loan (1983, pages 437-439).
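A single Givens rotation, the building block mentioned above, can be sketched as follows. This is the standard textbook construction for zeroing one component of a two-element vector, shown as a self-contained illustration; it is not taken from the LUPQR source, and all names and sample values are hypothetical.

```fortran
! Hypothetical sketch: one Givens rotation that zeroes the second component.
PROGRAM GIVENS_SKETCH
  IMPLICIT NONE
  REAL :: A, B, C, S, R
  A = 3.0
  B = 4.0
  R = SQRT(A*A + B*B)
  IF (R == 0.0) THEN
     ! Nothing to rotate: use the identity.
     C = 1.0
     S = 0.0
  ELSE
     C = A/R
     S = B/R
  END IF
  ! Apply the rotation [C S; -S C] to (A, B): the result is (R, 0).
  WRITE (*,*) 'Rotated vector: ', C*A + S*B, -S*A + C*B
END PROGRAM GIVENS_SKETCH
```

Applying such rotations to successive pairs of components reduces a vector of length m to a multiple of the first unit vector in m - 1 steps. Trailing zero components need no rotation, which is why fewer are required when the last k components of w vanish.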
Comments
1.
Example
The QR factorization of A is found. It is then used to find the QR factorization of $A + \alpha x y^T$. Since
pivoting is used, the QR factorization routine finds AP = QR, where P is a permutation matrix
determined by IPVT. We compute
$AP + \alpha x y^T = (A + \alpha x (Py)^T) P = \tilde{Q} \tilde{R}$
The IMSL routine PERMU (see Utilities) is used to compute Py. As a check
$\tilde{Q} \tilde{R}$
is computed and printed. It can also be obtained from $A + \alpha x y^T$ by permuting its columns using the
order given by IPVT.
USE IMSL_LIBRARIES
!                                 Declare variables
INTEGER    LDA, LDAQR, LDQ, LDQNEW, LDQR, LDR, LDRNEW, NCOL, NROW
PARAMETER  (NCOL=3, NROW=4, LDA=NROW, LDAQR=NROW, LDQ=NROW, &
           LDQNEW=NROW, LDQR=NROW, LDR=NROW, LDRNEW=NROW)
!
INTEGER    IPATH, IPVT(NCOL), J, MIN0
REAL       A(LDA,NCOL), ALPHA, AQR(LDAQR,NCOL), CONORM(NCOL), &
           Q(LDQ,NROW), QNEW(LDQNEW,NROW), QR(LDQR,NCOL), &
           QRAUX(NCOL), R(LDR,NCOL), RNEW(LDRNEW,NCOL), W(NROW), &
           Y(NCOL)
LOGICAL    PIVOT
INTRINSIC  MIN0
!
!                                 A = (  1    2    4  )
!                                     (  1    4   16  )
!                                     (  1    6   36  )
!                                     (  1    8   64  )
!
DATA A/4*1.0, 2.0, 4.0, 6.0, 8.0, 4.0, 16.0, 36.0, 64.0/
!                                 Set values for W and Y
DATA W/1., 2., 3., 4./
!
!                                 QR factorization
!                                 Set IPVT = 0 (all columns free)
IPVT = 0
PIVOT = .TRUE.
CALL LQRRR (A, QR, QRAUX, IPVT=IPVT, PIVOT=PIVOT)
!                                 Accumulate Q
CALL LQERR (QR, QRAUX, Q)
!                                 Permute Y
CALL PERMU (Y, IPVT, Y)
!                                 R is the upper trapezoidal part of QR
R = 0.0E0
DO 10 J=1, NCOL
   CALL SCOPY (MIN0(J,NROW), QR(:,J), 1, R(:,J), 1)
10 CONTINUE
!                                 Update Q and R
ALPHA = 1.0
IPATH = 01
CALL LUPQR (ALPHA, W, Y, R, IPATH, RNEW, Q=Q, QNEW=QNEW)
!                                 Compute AQR = Q*R
CALL MRRRR (QNEW, RNEW, AQR)
!                                 Print results
CALL WRIRN ('IPVT', IPVT, 1, NCOL,1)
CALL WRRRN ('QNEW', QNEW)
CALL WRRRN ('RNEW', RNEW)
CALL WRRRN ('QNEW*RNEW', AQR)
END
Output
     IPVT
 1   2   3
 3   2   1

                 QNEW
         1        2        3        4
1  -0.0620  -0.5412   0.8082  -0.2236
2  -0.2234  -0.6539  -0.2694   0.6708
3  -0.4840  -0.3379  -0.4490  -0.6708
4  -0.8438   0.4067   0.2694   0.2236

           RNEW
        1       2       3
1  -80.59  -21.34  -17.62
2    0.00   -4.94   -4.83
3    0.00    0.00    0.36
4    0.00    0.00    0.00

       QNEW*RNEW
       1       2       3
1   5.00    4.00    4.00
2  18.00    8.00    7.00
3  39.00   12.00   10.00
4  68.00   16.00   13.00
LCHRG
Computes the Cholesky decomposition of a symmetric positive definite matrix with optional
column pivoting.
Required Arguments
A N by N symmetric positive definite matrix to be decomposed. (Input)
Only the upper triangle of A is referenced.
FACT N by N matrix containing the Cholesky factor of the permuted matrix in its upper
triangle. (Output)
If A is not needed, A and FACT can share the same storage locations.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
PIVOT Logical variable. (Input)
PIVOT = .TRUE. means column pivoting is done. PIVOT = .FALSE. means no
pivoting is done.
Default: PIVOT = .TRUE.
IPVT Integer vector of length N containing information that controls the selection of the
pivot columns. (Input/Output)
On input, if IPVT(K) > 0, then the K-th column of A is an initial column; if
IPVT(K) = 0, then the K-th column of A is a free column; if IPVT(K) < 0, then the K-th
column of A is a final column. See Comments. On output, IPVT(K) contains the index
of the diagonal element of A that was moved into the K-th position. IPVT is only
referenced when PIVOT is equal to .TRUE..
LDFACT Leading dimension of FACT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFACT = size (FACT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine LCHRG is based on the LINPACK routine SCHDC; see Dongarra et al. (1979).
Before the decomposition is computed, initial elements are moved to the leading part of A and
final elements to the trailing part of A. During the decomposition only rows and columns
corresponding to the free elements are moved. The result of the decomposition is an upper
triangular matrix R and a permutation matrix P that satisfy $P^T A P = R^T R$, where P is represented
by IPVT.
Comments
1. Informational error
   Type   Code
   4
2. Before the decomposition is computed, initial elements are moved to the leading part
   of A and final elements to the trailing part of A. During the decomposition only rows
   and columns corresponding to the free elements are moved. The result of the
   decomposition is an upper triangular matrix R and a permutation matrix P that satisfy
   $P^T A P = R^T R$, where P is represented by IPVT.
3. LCHRG can be used together with subroutines PERMU and LSLDS to solve the positive
   definite linear system AX = B with the solution X overwriting the right-hand side B as
   follows:
      CALL ISET  (N, 0, IPVT, 1)
      CALL LCHRG (A, FACT, N, LDA, .TRUE., IPVT, LDFACT)
      CALL PERMU (B, IPVT, B, N, 1)
      CALL LSLDS (FACT, B, B, N, LDFACT)
      CALL PERMU (B, IPVT, B, N, 2)
Example
Routine LCHRG can be used together with the IMSL routines PERMU (see Chapter 11) and LFSDS
to solve a positive definite linear system Ax = b. Since $A = P R^T R P^{-1}$ (for a permutation
matrix, $P^{-1} = P^T$), the system Ax = b is equivalent to $R^T R (P^{-1} x) = P^{-1} b$. LFSDS is used to
solve $R^T R y = P^{-1} b$ for y. The routine PERMU is used to compute both $P^{-1} b$ and x = Py.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (N=3, LDA=N, LDFACT=N)
      INTEGER    IPVT(N)
      REAL       A(LDA,N), FACT(LDFACT,N), B(N), X(N)
      LOGICAL    PIVOT
!
!                                 Set values for A and B
!
!                                 A = (  1  -3   2 )
!                                     ( -3  10  -5 )
!                                     (  2  -5   6 )
!
!                                 B = ( 27 -78  64 )
!
      DATA A/1.,-3.,2.,-3.,10.,-5.,2.,-5.,6./
      DATA B/27.,-78.,64./
!                                 Pivot using all columns
      PIVOT = .TRUE.
      IPVT  = 0
!                                 Compute Cholesky factorization
      CALL LCHRG (A, FACT, PIVOT=PIVOT, IPVT=IPVT)
!                                 Permute B and store in X
      CALL PERMU (B, IPVT, X, IPATH=1)
!                                 Solve for X
      CALL LFSDS (FACT, X, X)
!                                 Inverse permutation
      CALL PERMU (X, IPVT, X, IPATH=2)
!                                 Print X
      CALL WRRRN ('X', X, 1, N, 1)
!
      END
Output
            X
      1       2       3
  1.000  -4.000   7.000
LUPCH
Updates the $R^T R$ Cholesky factorization of a real symmetric positive definite matrix after a
rank-one matrix is added.
Required Arguments
R N by N upper triangular matrix containing the upper triangular factor to be updated.
(Input)
Only the upper triangle of R is referenced.
X Vector of length N determining the rank-one matrix to be added to the factorization
$R^T R$. (Input)
RNEW N by N upper triangular matrix containing the updated triangular factor of
$R^T R + X X^T$. (Output)
Chapter 1: Linear Systems
Only the upper triangle of RNEW is referenced. If R is not needed, R and RNEW can share
the same storage locations.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (R,2).
LDR Leading dimension of R exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDR = size (R,1).
LDRNEW Leading dimension of RNEW exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDRNEW = size (RNEW,1).
CS Vector of length N containing the cosines of the rotations. (Output)
SN Vector of length N containing the sines of the rotations. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine LUPCH is based on the LINPACK routine SCHUD; see Dongarra et al. (1979).
The Cholesky factorization of a matrix is $A = R^T R$, where R is an upper triangular matrix. Given
this factorization, LUPCH computes the factorization
$$A + x x^T = \tilde{R}^T \tilde{R}$$
In the program $\tilde{R}$ is called RNEW.
LUPCH determines an orthogonal matrix U as the product $G_N \cdots G_1$ of Givens rotations, such that
$$U \begin{bmatrix} R \\ x^T \end{bmatrix} = \begin{bmatrix} \tilde{R} \\ 0 \end{bmatrix}$$
By multiplying this equation by its transpose, and noting that $U^T U = I$, the desired result
$$R^T R + x x^T = \tilde{R}^T \tilde{R}$$
is obtained.
Each Givens rotation, $G_i$, is chosen to zero out an element in $x^T$. The matrix
$G_i$ is (N + 1) × (N + 1) and has the form
$$G_i = \begin{bmatrix} I_{i-1} & 0 & 0 & 0 \\ 0 & c_i & 0 & s_i \\ 0 & 0 & I_{N-i} & 0 \\ 0 & -s_i & 0 & c_i \end{bmatrix}$$
where $I_k$ is the identity matrix of order k and $c_i = \cos\theta_i$ = CS(I), $s_i = \sin\theta_i$ = SN(I) for some $\theta_i$.
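The rotation scheme described above can be sketched in a few lines of standard Fortran. This is a minimal illustration only, not the LUPCH or SCHUD source: it assumes the rotation angles are chosen so that each new diagonal entry is nonnegative, and the names r0, w, and rho are hypothetical. It uses the factor and the vector x from the example in this section, so the upper triangle it prints should agree with the FACNEW output up to the sign of a row.

```fortran
program cholesky_update_sketch
  implicit none
  integer, parameter :: n = 3
  real :: r(n,n), r0(n,n), x(n), w(n), cs(n), sn(n), rho, t
  integer :: i, j

  ! Upper triangular factor R of the example matrix A (printed as FACT)
  r0 = reshape([1., 0., 0.,  -3., 1., 0.,  2., 1., 1.], [n, n])
  x  = [3., 2., 1.]

  r = r0
  w = x                          ! working copy of the extra row x^T
  do i = 1, n
     ! Choose G_i so that it zeroes w(i) against the diagonal r(i,i)
     rho   = sqrt(r(i,i)**2 + w(i)**2)
     cs(i) = r(i,i)/rho
     sn(i) = w(i)/rho
     do j = i, n
        t      =  cs(i)*r(i,j) + sn(i)*w(j)
        w(j)   = -sn(i)*r(i,j) + cs(i)*w(j)
        r(i,j) = t
     end do
  end do

  do i = 1, n
     print '(3f9.3)', r(i,:)
  end do

  ! Self-check: RNEW^T RNEW should equal R^T R + x x^T
  if (maxval(abs(matmul(transpose(r), r) - matmul(transpose(r0), r0) &
       - spread(x, 2, n)*spread(x, 1, n))) > 1.0e-4) error stop 'update failed'
end program cholesky_update_sketch
```

Because a sign change of any row of the factor leaves the product of its transpose with itself unchanged, the self-check compares the products rather than the factors.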
Example
A linear system Az = b is solved using the Cholesky factorization of A. This factorization is then
updated and the system $(A + x x^T) z = b$ is solved using this updated factorization.
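      USE IMSL_LIBRARIES
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      REAL       A(LDA,LDA), FACT(LDFACT,LDFACT), FACNEW(LDFACT,LDFACT), &
                 X(N), B(N), CS(N), SN(N), Z(N)
!
!                                 Set values for A
!                                 A = (  1.0  -3.0   2.0)
!                                     ( -3.0  10.0  -5.0)
!                                     (  2.0  -5.0   6.0)
!
      DATA A/1.0, -3.0, 2.0, -3.0, 10.0, -5.0, 2.0, -5.0, 6.0/
!
!                                 Set values for X and B
      DATA X/3.0, 2.0, 1.0/
      DATA B/53.0, 20.0, 31.0/
!                                 Factor the matrix A
      CALL LFTDS (A, FACT)
!                                 Solve the original system
      CALL LFSDS (FACT, B, Z)
!                                 Print the results
      CALL WRRRN ('FACT', FACT, ITRING=1)
      CALL WRRRN ('Z', Z, 1, N, 1)
!                                 Update the factorization
      CALL LUPCH (FACT, X, FACNEW)
!                                 Solve the updated system
      CALL LFSDS (FACNEW, B, Z)
!                                 Print the results
      CALL WRRRN ('FACNEW', FACNEW, ITRING=1)
      CALL WRRRN ('Z', Z, 1, N, 1)
!
      END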
Output
          FACT
         1        2        3
1    1.000   -3.000    2.000
2             1.000    1.000
3                      1.000

             Z
       1        2        3
  1860.0    433.0   -254.0

         FACNEW
         1        2        3
1    3.162    0.949    1.581
2             3.619   -1.243
3                     -1.719

             Z
       1        2        3
   4.000    1.000    2.000
LDNCH
Downdates the $R^T R$ Cholesky factorization of a real symmetric positive definite matrix after a
rank-one matrix is removed.
Required Arguments
R N by N upper triangular matrix containing the upper triangular factor to be downdated.
(Input)
Only the upper triangle of R is referenced.
X Vector of length N determining the rank-one matrix to be subtracted from the
factorization $R^T R$. (Input)
RNEW N by N upper triangular matrix containing the downdated triangular factor of
$R^T R - X X^T$. (Output)
Only the upper triangle of RNEW is referenced. If R is not needed, R and RNEW can share
the same storage locations.
Optional Arguments
N Order of the matrix. (Input)
Default: N = size (R,2).
LDR Leading dimension of R exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDR = size (R,1).
LDRNEW Leading dimension of RNEW exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDRNEW = size (RNEW,1).
CS Vector of length N containing the cosines of the rotations. (Output)
SN Vector of length N containing the sines of the rotations. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine LDNCH is based on the LINPACK routine SCHDD; see Dongarra et al. (1979).
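The Cholesky factorization of a matrix is $A = R^T R$, where R is an upper triangular matrix. Given
this factorization, LDNCH computes the factorization
$$A - x x^T = \tilde{R}^T \tilde{R}$$
In the program $\tilde{R}$ is called RNEW. This is not always possible, since $A - x x^T$ may not be
positive definite.
LDNCH determines an orthogonal matrix U as the product $G_N \cdots G_1$ of Givens rotations, such that
$$U \begin{bmatrix} R \\ 0 \end{bmatrix} = \begin{bmatrix} \tilde{R} \\ x^T \end{bmatrix}$$
By multiplying this equation by its transpose and noting that $U^T U = I$, the desired result
$$R^T R - x x^T = \tilde{R}^T \tilde{R}$$
is obtained.
Let a be the solution of the linear system $R^T a = x$ and let
$$\alpha = \sqrt{1 - \|a\|_2^2}$$
The Givens rotations, $G_i$, are chosen such that
$$G_1 \cdots G_N \begin{bmatrix} a \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
The $G_i$ are (N + 1) × (N + 1) matrices of the form
$$G_i = \begin{bmatrix} I_{i-1} & 0 & 0 & 0 \\ 0 & c_i & 0 & -s_i \\ 0 & 0 & I_{N-i} & 0 \\ 0 & s_i & 0 & c_i \end{bmatrix}$$
where $I_k$ is the identity matrix of order k; and $c_i = \cos\theta_i$ = CS(I), $s_i = \sin\theta_i$ = SN(I) for some $\theta_i$.
The Givens rotations are then used to form $\tilde{R}$:
$$G_1 \cdots G_N \begin{bmatrix} R \\ 0 \end{bmatrix} = \begin{bmatrix} \tilde{R} \\ \tilde{x}^T \end{bmatrix}$$
The matrix $\tilde{R}$ is upper triangular and $\tilde{x} = x$ because
$$x = \begin{pmatrix} R^T & 0 \end{pmatrix} \begin{bmatrix} a \\ \alpha \end{bmatrix} = \begin{pmatrix} R^T & 0 \end{pmatrix} U^T U \begin{bmatrix} a \\ \alpha \end{bmatrix} = \begin{pmatrix} \tilde{R}^T & \tilde{x} \end{pmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \tilde{x}$$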
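The step "multiplying this equation by its transpose" can be written out explicitly. The lines below are a worked sketch of that algebra only; the tilde on the downdated factor (called RNEW in the program) is notation assumed here.

```latex
\begin{aligned}
\begin{bmatrix} R^{T} & 0 \end{bmatrix} U^{T} U \begin{bmatrix} R \\ 0 \end{bmatrix}
  &= \begin{bmatrix} \tilde{R}^{T} & x \end{bmatrix}
     \begin{bmatrix} \tilde{R} \\ x^{T} \end{bmatrix} \\
R^{T} R &= \tilde{R}^{T}\tilde{R} + x x^{T} \\
\tilde{R}^{T}\tilde{R} &= R^{T} R - x x^{T}
\end{aligned}
```

The first line uses only the orthogonality of U, so the left side collapses to the original product of the factor with its transpose.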
Comments
Informational error
Type   Code
4
Example
A linear system Az = b is solved using the Cholesky factorization of A. This factorization is then
downdated, and the system $(A - x x^T) z = b$ is solved using this downdated factorization.
      USE LDNCH_INT
      USE LFTDS_INT
      USE LFSDS_INT
      USE WRRRN_INT
!                                 Declare variables
      INTEGER    LDA, LDFACT, N
      PARAMETER  (LDA=3, LDFACT=3, N=3)
      REAL       A(LDA,LDA), FACT(LDFACT,LDFACT), FACNEW(LDFACT,LDFACT), &
                 X(N), B(N), CS(N), SN(N), Z(N)
!
!                                 Set values for A
!                                 A = ( 10.0   3.0   5.0)
!                                     (  3.0  14.0  -3.0)
!                                     (  5.0  -3.0   7.0)
!
      DATA A/10.0, 3.0, 5.0, 3.0, 14.0, -3.0, 5.0, -3.0, 7.0/
!
Output
          FACT
         1        2        3
1    3.162    0.949    1.581
2             3.619   -1.243
3                      1.719

             Z
       1        2        3
   4.000    1.000    2.000

         FACNEW
         1        2        3
1    1.000   -3.000    2.000
2             1.000    1.000
3                      1.000

             Z
       1        2        3
  1859.9    433.0   -254.0
LSVRR
Required Arguments
A NRA by NCA matrix whose singular value decomposition is to be computed. (Input)
IPATH Flag used to control the computation of the singular vectors. (Input)
IPATH has the decimal expansion IJ such that:
I = 0 means do not compute the left singular vectors;
I = 1 means return the NRA left singular vectors in U;
NOTE: This option is not available for the ScaLAPACK interface. If this option is
chosen for ScaLAPACK usage, the min(NRA, NCA) left singular vectors will be
returned.
I = 2 means return only the min(NRA, NCA) left singular vectors in U;
J = 0 means do not compute the right singular vectors,
J = 1 means return the right singular vectors in V.
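The two-digit encoding of IPATH can be unpacked with integer division and MOD. The short program below is a sketch of that decoding only; the program name and the sample value 21 are chosen here for illustration.

```fortran
program decode_ipath
  implicit none
  integer :: ipath, i, j

  ipath = 21             ! sample value: I = 2, J = 1
  i = ipath / 10         ! tens digit controls the left singular vectors
  j = mod(ipath, 10)     ! units digit controls the right singular vectors

  print '(a,i0,a,i0,a,i0)', 'IPATH = ', ipath, ': I = ', i, ', J = ', j
  if (i /= 2 .or. j /= 1) error stop 'unexpected decoding'
end program decode_ipath
```

With this encoding, IPATH = 11 requests all left and right singular vectors, and IPATH = 0 requests singular values only.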
Optional Arguments
NRA Number of rows in the matrix A. (Input)
Default: NRA = size (A,1).
NCA Number of columns in the matrix A. (Input)
Default: NCA = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
TOL Scalar containing the tolerance used to determine when a singular value is negligible.
(Input)
If TOL is positive, then a singular value $\sigma_i$ is considered negligible if $\sigma_i \le$ TOL. If TOL is
negative, then a singular value $\sigma_i$ is considered negligible if $\sigma_i \le |\mathrm{TOL}| \cdot \|A\|_\infty$. In this case,
|TOL| generally contains an estimate of the level of the relative error in the data.
Default: TOL = 1.0e-5 for single precision and 1.0d-10 for double precision.
IRANK Scalar containing an estimate of the rank of A. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LSVRR (NRA, NCA, A, LDA, IPATH, TOL, IRANK, S, U, LDU, V, LDV)
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
The underlying code is based on either LINPACK , LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
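Let n = NRA (the number of rows in A) and let p = NCA (the number of columns in A). For any
n × p matrix A, there exists an n × n orthogonal matrix U and a p × p orthogonal matrix V such that
$$U^T A V = \begin{cases} \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} & \text{if } n \ge p \\[4pt] \begin{bmatrix} \Sigma & 0 \end{bmatrix} & \text{if } n \le p \end{cases}$$
where $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_m)$, and m = min(n, p). The scalars
$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_m \ge 0$ are called the singular values of A. The columns of U are
called the left singular vectors of A. The columns of V are called the right singular vectors of A.
The estimated rank of A is the number of $\sigma_k$ that are larger than a tolerance $\eta$. If $\tau$ is the
parameter TOL in the program, then
$$\eta = \begin{cases} \tau & \text{if } \tau > 0 \\ |\tau| \, \|A\|_\infty & \text{if } \tau < 0 \end{cases}$$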
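The rank rule can be illustrated with a short, self-contained sketch. It does not call LSVRR: the singular values are copied from the example output in this section, est_rank is a hypothetical helper, and EPSILON(1.0) stands in for AMACH(4). The treatment of TOL = 0 (threshold zero) is an assumption of the sketch, since the text defines the threshold only for nonzero TOL.

```fortran
program rank_estimate
  implicit none
  real :: a(6,4), s(4), tol

  ! Matrix and singular values of the 6 x 4 example in this section
  a = reshape([1.,3.,4.,2.,1.,1., 2.,2.,3.,1.,5.,2., 1.,1.,1.,3.,2.,2., &
               4.,3.,4.,1.,2.,3.], [6, 4])
  s = [11.49, 3.27, 2.65, 2.09]

  tol = 10.0*epsilon(1.0)                 ! absolute tolerance (TOL > 0)
  print *, 'rank with TOL > 0: ', est_rank(s, tol, a)
  print *, 'rank with TOL < 0: ', est_rank(s, -0.25, a)   ! relative tolerance

contains

  integer function est_rank(s, tol, a)
    real, intent(in) :: s(:), tol, a(:,:)
    real :: eta
    if (tol > 0.0) then
       eta = tol
    else
       ! infinity norm of A: maximum absolute row sum
       eta = abs(tol)*maxval(sum(abs(a), dim=2))
    end if
    est_rank = count(s > eta)             ! singular values above the threshold
  end function est_rank

end program rank_estimate
```

For this matrix the largest absolute row sum is 12, so TOL = -0.25 gives a threshold of 3.0 and only the two largest singular values are counted.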
Comments
1.
2. Informational error
   Type   Code
   4             Convergence cannot be achieved for all the singular values and their
                 corresponding singular vectors.
3. When NRA is much greater than NCA, it might not be reasonable to store the whole
   matrix U. In this case, IPATH with I = 2 allows a singular value factorization of A to be
   computed in which only the first NCA columns of U are computed, and in many
   applications those are all that are needed.
4. This option uses four values to solve memory bank conflict (access inefficiency)
   problems. In routine L2VRR the leading dimension of ACOPY is increased by
   IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
   temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSVRR.
   Additional memory allocation for ACOPY and option value restoration are done
   automatically in LSVRR. Users directly calling L2VRR can allocate additional
   space for ACOPY and set IVAL(3) and IVAL(4) so that memory bank conflicts no
   longer cause inefficiencies.

   This option has two values that determine if the L1 condition number is to be
   computed. Routine LSVRR temporarily replaces IVAL(2) by IVAL(1). The
   routine L2CRG computes the condition number if IVAL(2) = 2. Otherwise L2CRG
   skips this computation. LSVRR restores the option. Default values for the option
   are IVAL(*) = 1, 2.
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA, MXCOL, MXLDU, MXCOLU, MXLDV and MXCOLV
can be obtained through a call to ScaLAPACK_GETDIM (see Chapter 11, Utilities) after a call to
ScaLAPACK_SETUP (see Chapter 11, Utilities) has been made. See the ScaLAPACK Example
below.
Example
This example computes the singular value decomposition of a 6 × 4 matrix A. The matrices U and
V containing the left and right singular vectors, respectively, and the diagonal of $\Sigma$, containing
singular values, are printed. On some systems, the signs of some of the columns of U and V may
be reversed.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (NRA=6, NCA=4, LDA=NRA, LDU=NRA, LDV=NCA)
      REAL       A(LDA,NCA), U(LDU,NRA), V(LDV,NCA), S(NCA)
!
!                                 Set values for A
!
!                                 A = ( 1  2  1  4 )
!                                     ( 3  2  1  3 )
!                                     ( 4  3  1  4 )
!                                     ( 2  1  3  1 )
!                                     ( 1  5  2  2 )
!                                     ( 1  2  2  3 )
!
      DATA A/1., 3., 4., 2., 1., 1., 2., 2., 3., 1., 5., 2., 3*1., &
             3., 2., 2., 4., 3., 4., 1., 2., 3./
!                                 Compute all singular vectors
      IPATH = 11
      TOL   = AMACH(4)
      TOL   = 10.*TOL
      CALL LSVRR (A, IPATH, S, TOL=TOL, IRANK=IRANK, U=U, V=V)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT, *) 'IRANK = ', IRANK
      CALL WRRRN ('U', U, NRA, NCA)
      CALL WRRRN ('S', S, 1, NCA, 1)
      CALL WRRRN ('V', V)
!
      END
Output
 IRANK =   4

                    U
          1        2        3        4
1   -0.3805   0.1197   0.4391  -0.5654
2   -0.4038   0.3451  -0.0566   0.2148
3   -0.5451   0.4293   0.0514   0.4321
4   -0.2648  -0.0683  -0.8839  -0.2153
5   -0.4463  -0.8168   0.1419   0.3213
6   -0.3546  -0.1021  -0.0043  -0.5458

                 S
      1       2       3       4
  11.49    3.27    2.65    2.09

                    V
          1        2        3        4
1   -0.4443   0.5555  -0.4354   0.5518
2   -0.5581  -0.6543   0.2775   0.4283
3   -0.3244  -0.3514  -0.7321  -0.4851
4   -0.6212   0.3739   0.4444  -0.5261
ScaLAPACK Example
The previous example is repeated here as a distributed example. This example computes the
singular value decomposition of a 6 × 4 matrix A. The matrices U and V containing the left and
right singular vectors, respectively, and the diagonal of S, containing singular values, are printed.
On some systems, the signs of some of the columns of U and V may be reversed.
      USE MPI_SETUP_INT
      USE IMSL_LIBRARIES
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    KBASIS, LDA, LDQR, NCA, NRA, DESCA(9), DESCU(9), &
                 DESCV(9), MXLDV, MXCOLV, NSZ, MXLDU, MXCOLU
      INTEGER    INFO, MXCOL, MXLDA, LDU, LDV, IPATH, IRANK
      REAL       TOL, AMACH
      REAL, ALLOCATABLE ::  A(:,:),U(:,:), V(:,:), S(:)
      REAL, ALLOCATABLE ::  A0(:,:), U0(:,:), V0(:,:), S0(:)
      PARAMETER  (NRA=6, NCA=4, LDA=NRA, LDU=NRA, LDV=NCA)
      NSZ = MIN(NRA,NCA)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,NCA), U(LDU,NCA), V(LDV,NCA), S(NCA))
!                                 Set values for A
          A(1,:) = (/ 1.0, 2.0, 1.0, 4.0/)
          A(2,:) = (/ 3.0, 2.0, 1.0, 3.0/)
          A(3,:) = (/ 4.0, 3.0, 1.0, 4.0/)
          A(4,:) = (/ 2.0, 1.0, 3.0, 1.0/)
          A(5,:) = (/ 1.0, 5.0, 2.0, 2.0/)
          A(6,:) = (/ 1.0, 2.0, 2.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 MXCOL, MXLDU, MXCOLU, MXLDV, AND MXCOLV
      CALL SCALAPACK_GETDIM(NRA, NCA, MP_MB, MP_NB, MXLDA, MXCOL)
      CALL SCALAPACK_GETDIM(NRA, NSZ, MP_MB, MP_NB, MXLDU, MXCOLU)
      CALL SCALAPACK_GETDIM(NSZ, NCA, MP_MB, MP_NB, MXLDV, MXCOLV)
!                                 Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
                    MXLDA, INFO)
      CALL DESCINIT(DESCU, NRA, NSZ, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
                    MXLDU, INFO)
      CALL DESCINIT(DESCV, NSZ, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
                    MXLDV, INFO)
!                                 Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), U0(MXLDU,MXCOLU), V0(MXLDV,MXCOLV), S(NCA))
!                                 Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                 Compute all singular vectors
      IPATH = 11
      TOL = AMACH(4)
      TOL = 10. * TOL
      CALL LSVRR (A0, IPATH, S, TOL=TOL, IRANK=IRANK, U=U0, V=V0)
!                                 Unmap the results from the distributed
!                                 array back to a non-distributed array.
!                                 After the unmap, only Rank=0 has the full
!                                 array.
      CALL SCALAPACK_UNMAP(U0, DESCU, U)
      CALL SCALAPACK_UNMAP(V0, DESCV, V)
!                                 Print results.
!                                 Only Rank=0 has the solution.
Output
 IRANK =   4

                    U
          1        2        3        4
1   -0.3805   0.1197   0.4391  -0.5654
2   -0.4038   0.3451  -0.0566   0.2148
3   -0.5451   0.4293   0.0514   0.4321
4   -0.2648  -0.0683  -0.8839  -0.2153
5   -0.4463  -0.8168   0.1419   0.3213
6   -0.3546  -0.1021  -0.0043  -0.5458

                 S
      1       2       3       4
  11.49    3.27    2.65    2.09

                    V
          1        2        3        4
1   -0.4443   0.5555  -0.4354   0.5518
2   -0.5581  -0.6543   0.2775   0.4283
3   -0.3244  -0.3514  -0.7321  -0.4851
4   -0.6212   0.3739   0.4444  -0.5261
LSVCR
Computes the singular value decomposition of a complex matrix.
Required Arguments
A Complex NRA by NCA matrix whose singular value decomposition is to be computed.
(Input)
IPATH Integer flag used to control the computation of the singular vectors. (Input)
IPATH has the decimal expansion IJ such that:
I=0 means do not compute the left singular vectors;
I=1 means return the NRA left singular vectors in U;
I=2 means return only the min(NRA, NCA) left singular vectors in U;
J=0 means do not compute the right singular vectors;
J=1 means return the right singular vectors in V.
Optional Arguments
NRA Number of rows in the matrix A. (Input)
Default: NRA = size (A,1).
NCA Number of columns in the matrix A. (Input)
Default: NCA = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
TOL Real scalar containing the tolerance used to determine when a singular value is
negligible. (Input)
If TOL is positive, then a singular value SI is considered negligible if SI ≤ TOL. If TOL
is negative, then a singular value SI is considered negligible if
SI ≤ |TOL|*(Infinity norm of A). In this case |TOL| should generally contain an estimate
of the level of relative error in the data.
Default: TOL = 1.0e-5 for single precision and 1.0d-10 for double precision.
IRANK Integer scalar containing an estimate of the rank of A. (Output)
U Complex NRA by NRA if I = 1 or NRA by min(NRA, NCA) if I = 2 matrix containing the
left singular vectors of A. (Output)
U will not be referenced if I is equal to zero. If NRA is less than or equal to NCA or
IPATH = 2, then U can share the same storage locations as A.
LDU Leading dimension of U exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDU = size (U,1).
V Complex NCA by NCA matrix containing the right singular vectors of A. (Output)
V will not be referenced if J is equal to zero. If NCA is less than or equal to NRA, then V
can share the same storage locations as A; however U and V cannot both coincide with A
simultaneously.
LDV Leading dimension of V exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDV = size (V,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL LSVCR (NRA, NCA, A, LDA, IPATH, TOL, IRANK, S, U, LDU, V, LDV)
Double:
Description
The underlying code is based on either LINPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
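Let n = NRA (the number of rows in A) and let p = NCA (the number of columns in A). For any n × p
matrix A there exists an n × n orthogonal matrix U and a p × p orthogonal matrix V such that
$$U^T A V = \begin{cases} \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} & \text{if } n \ge p \\[4pt] \begin{bmatrix} \Sigma & 0 \end{bmatrix} & \text{if } n \le p \end{cases}$$
where $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_m)$, and m = min(n, p). The scalars
$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_m \ge 0$ are called the singular values of A. The columns of U are
called the left singular vectors of A. The columns of V are called the right singular vectors of A.
The estimated rank of A is the number of $\sigma_k$ which are larger than a tolerance $\eta$. If $\tau$ is the
parameter TOL in the program, then
$$\eta = \begin{cases} \tau & \text{if } \tau > 0 \\ |\tau| \, \|A\|_\infty & \text{if } \tau < 0 \end{cases}$$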
Comments
1.
2. Informational error
   Type   Code
   4             Convergence cannot be achieved for all the singular values and their
                 corresponding singular vectors.
3.
When NRA is much greater than NCA, it might not be reasonable to store the whole
matrix U. In this case IPATH with I = 2 allows a singular value factorization of A to be
computed in which only the first NCA columns of U are computed, and in many
applications those are all that are needed.
4.
This option uses four values to solve memory bank conflict (access inefficiency)
problems. In routine L2VCR the leading dimension of ACOPY is increased by
IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and IVAL(4) are
temporarily replaced by IVAL(1) and IVAL(2), respectively, in LSVCR.
Additional memory allocation for ACOPY and option value restoration are done
automatically in LSVCR. Users directly calling L2VCR can allocate additional
space for ACOPY and set IVAL(3) and IVAL(4) so that memory bank conflicts no
longer cause inefficiencies. There is no requirement that users change existing
applications that use LSVCR or L2VCR. Default values for the option are
IVAL(*) = 1, 16, 0, 1.
17
This option has two values that determine if the L1 condition number is to be
computed. Routine LSVCR temporarily replaces IVAL(2) by IVAL(1). The
routine L2CCG computes the condition number if IVAL(2) = 2. Otherwise L2CCG
skips this computation. LSVCR restores the option. Default values for the option
are IVAL(*) = 1, 2.
Example
This example computes the singular value decomposition of a 6 × 3 matrix A. The matrices U and
V containing the left and right singular vectors, respectively, and the diagonal of $\Sigma$, containing
singular values, are printed. On some systems, the signs of some of the columns of U and V may
be reversed.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (NRA=6, NCA=3, LDA=NRA, LDU=NRA, LDV=NCA)
      COMPLEX    A(LDA,NCA), U(LDU,NRA), V(LDV,NCA), S(NCA)
!
!                                 Set values for A
!
!                                 A = ( 1+2i   3+2i   1-4i )
!                                     ( 3-2i   2-4i   1+3i )
!                                     ( 4+3i  -2+1i   1+4i )
!                                     ( 2-1i   3+0i   3-1i )
!                                     ( 1-5i   2-5i   2+2i )
!                                     ( 1+2i   4-2i   2-3i )
!
      END
Output
 IRANK =   3

                               U
                   1                  2                  3
1  ( 0.1968, 0.2186)  ( 0.5011, 0.0217)  (-0.2007,-0.1003)
2  ( 0.3443,-0.3542)  (-0.2933, 0.0248)  ( 0.1155,-0.2338)
3  ( 0.1457, 0.2307)  (-0.5424, 0.1381)  (-0.4361,-0.4407)
4  ( 0.3016,-0.0844)  ( 0.2157, 0.2659)  (-0.0523,-0.0894)
5  ( 0.2283,-0.6008)  (-0.1325, 0.1433)  ( 0.3152,-0.0090)
6  ( 0.2876,-0.0350)  ( 0.4377,-0.0400)  ( 0.0458,-0.6205)

                               S
                1                2                3
 ( 11.77,  0.00)  (  9.30,  0.00)  (  4.99,  0.00)

                               V
                   1                  2                  3
1  ( 0.6616, 0.0000)  (-0.2651, 0.0000)  (-0.7014, 0.0000)
2  ( 0.7355, 0.0379)  ( 0.3850,-0.0707)  ( 0.5482, 0.0624)
3  ( 0.0507,-0.1317)  ( 0.1724, 0.8642)  (-0.0173,-0.4509)
LSGRR
Required Arguments
A NRA by NCA matrix whose generalized inverse is to be computed. (Input)
GINVA NCA by NRA matrix containing the generalized inverse of A. (Output)
Optional Arguments
NRA Number of rows in the matrix A. (Input)
Default: NRA = size (A,1).
NCA Number of columns in the matrix A. (Input)
Default: NCA = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
TOL Scalar containing the tolerance used to determine when a singular value (from the
singular value decomposition of A) is negligible. (Input)
If TOL is positive, then a singular value $\sigma_i$ is considered negligible if $\sigma_i \le$ TOL. If TOL is
negative, then a singular value $\sigma_i$ is considered negligible if $\sigma_i \le |\mathrm{TOL}| \cdot \|A\|_\infty$. In this case,
|TOL| generally contains an estimate of the level of the relative error in the data.
Default: TOL = 1.0e-5 for single precision and 1.0d-10 for double precision.
IRANK Scalar containing an estimate of the rank of A. (Output)
LDGINV Leading dimension of GINVA exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDGINV = size (GINVA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
ScaLAPACK Interface
Generic:
Specific:
See the ScaLAPACK Usage Notes below for a description of the arguments for distributed
computing.
Description
Let k = IRANK, the rank of A; let n = NRA, the number of rows in A; let p = NCA, the number of
columns in A; and let
$A^{\dagger}$ = GINV
be the generalized inverse of A.
The underlying code is based on either LINPACK, LAPACK, or ScaLAPACK code depending
upon which supporting libraries are used during linking. For a detailed explanation see Using
ScaLAPACK, LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2. Informational error
   Type   Code
   4             Convergence cannot be achieved for all the singular values and their
                 corresponding singular vectors.
A0 MXLDA by MXCOL local matrix containing the local portions of the distributed
matrix A. A contains the matrix for which the generalized inverse is to be computed.
(Input)
GINVA0 MXLDG by MXCOLG local matrix containing the local portions of the distributed
matrix GINVA. GINVA contains the generalized inverse of matrix A. (Output)
All other arguments are global and are the same as described for the standard version of the
routine. In the argument descriptions above, MXLDA, MXCOL, MXLDG, and MXCOLG can
be obtained through a call to SCALAPACK_GETDIM (see Chapter 11, Utilities) after a call to
SCALAPACK_SETUP (see Chapter 11, Utilities) has been made. See the ScaLAPACK
Example below.
Example
This example computes the generalized inverse of a 3 × 2 matrix A. The rank k = IRANK and the
inverse
$A^{\dagger}$ = GINV
are printed.
      USE IMSL_LIBRARIES
!                                 Declare variables
      PARAMETER  (NRA=3, NCA=2, LDA=NRA, LDGINV=NCA)
      REAL       A(LDA,NCA), GINV(LDGINV,NRA)
!
!                                 Set values for A
!
!                                 A = (   1     0 )
!                                     (   1     1 )
!                                     ( 100   -50 )
!
      DATA A/1., 1., 100., 0., 1., -50./
!
      END
Output
 IRANK =   2

              GINV
         1        2        3
1   0.1000   0.3000   0.0060
2   0.2000   0.6000  -0.0080
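The printed GINV can be cross-checked without the library. The sketch below assumes that A has full column rank, so that the generalized inverse equals the inverse of the product of A transposed with A, multiplied by A transposed; this closed form is an assumption of the sketch and is not how LSGRR is described as computing it. Double precision is used because the 2 × 2 determinant involves cancellation between large numbers.

```fortran
program ginv_check
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a(3,2), ata(2,2), inv(2,2), g(2,3), det

  ! The 3 x 2 example matrix, stored by columns
  a = reshape([1._dp, 1._dp, 100._dp, 0._dp, 1._dp, -50._dp], [3, 2])

  ! Explicit inverse of the 2 x 2 matrix A^T A
  ata = matmul(transpose(a), a)
  det = ata(1,1)*ata(2,2) - ata(1,2)*ata(2,1)
  inv = reshape([ata(2,2), -ata(2,1), -ata(1,2), ata(1,1)], [2, 2]) / det

  g = matmul(inv, transpose(a))          ! candidate generalized inverse
  print '(3f9.4)', g(1,:)
  print '(3f9.4)', g(2,:)

  ! One of the Moore-Penrose conditions: A G A = A
  if (maxval(abs(matmul(a, matmul(g, a)) - a)) > 1.0e-6_dp) &
       error stop 'A G A differs from A'
end program ginv_check
```

The two printed rows should agree with the GINV output above to the four decimals shown.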
ScaLAPACK Example
This example computes the generalized inverse of a 6 × 4 matrix A as a distributed example. The
rank k = IRANK and the inverse
$A^{\dagger}$ = GINV
are printed.
      USE MPI_SETUP_INT
      USE IMSL_LIBRARIES
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                 Declare variables
      INTEGER    IRANK, LDA, NCA, NRA, DESCA(9), DESCG(9), &
                 LDGINV, MXLDG, MXCOLG, NOUT
      INTEGER    INFO, MXCOL, MXLDA
      REAL       TOL, AMACH
      REAL, ALLOCATABLE ::  A(:,:),GINVA(:,:)
      REAL, ALLOCATABLE ::  A0(:,:), GINVA0(:,:)
      PARAMETER  (NRA=6, NCA=4, LDA=NRA, LDGINV=NCA)
!                                 Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF(MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,NCA), GINVA(NCA,NRA))
!                                 Set values for A
          A(1,:) = (/ 1.0, 2.0, 1.0, 4.0/)
          A(2,:) = (/ 3.0, 2.0, 1.0, 3.0/)
          A(3,:) = (/ 4.0, 3.0, 1.0, 4.0/)
          A(4,:) = (/ 2.0, 1.0, 3.0, 1.0/)
          A(5,:) = (/ 1.0, 5.0, 2.0, 2.0/)
          A(6,:) = (/ 1.0, 2.0, 2.0, 3.0/)
      ENDIF
!                                 Set up a 1D processor grid and define
!                                 its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA, .TRUE., .TRUE.)
!                                 Get the array descriptor entities MXLDA,
!                                 MXCOL, MXLDG, and MXCOLG
      CALL SCALAPACK_GETDIM(NRA, NCA, MP_MB, MP_NB, MXLDA, MXCOL)
Routines
2.1.  Eigenvalue Decomposition
Usage Notes
This chapter includes routines for linear eigensystem analysis. Many of these are for matrices with
special properties. Some routines compute just a portion of the eigensystem. Use of the
appropriate routine can substantially reduce computing time and storage requirements compared to
computing a full eigensystem for a general complex matrix.
An ordinary linear eigensystem problem is represented by the equation $Ax = \lambda x$ where A denotes
an n × n matrix. The value $\lambda$ is an eigenvalue and $x \ne 0$ is the corresponding eigenvector. The
eigenvector is determined up to a scalar factor. In all routines, we have chosen this factor so that x
has Euclidean length with value one, and the component of x of smallest index and largest
magnitude is positive. In case x is a complex vector, this largest component is real and positive.
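The normalization convention for a real eigenvector can be sketched as follows. This is an illustration of the stated rule only, not library code; the sample vector is chosen here so that two components tie for the largest magnitude, which exercises the "smallest index" part of the rule.

```fortran
program normalize_eigvec
  implicit none
  real :: x(3)
  integer :: k

  x = [1., -2., 2.]              ! sample unnormalized eigenvector
  x = x/sqrt(sum(x**2))          ! scale to Euclidean length one

  ! MAXLOC returns the first (smallest) index among the largest magnitudes
  k = maxloc(abs(x), dim=1)
  if (x(k) < 0.0) x = -x         ! make that component positive

  print '(3f8.4)', x
end program normalize_eigvec
```

Here components 2 and 3 tie in magnitude, so component 2 decides the sign and the whole vector is negated.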
Similar comments hold for the use of the remaining Level 1 routines in the following tables in
those cases where the second character of the Level 2 routine name is no longer the character "2".
                              Symmetric   Symmetric   Hermitian
                              Full        Band        Full
All eigenvalues               EVLSF       EVLSB       EVLHF
All eigenvalues
and eigenvectors              EVCSF       EVCSB       EVCHF
Extreme eigenvalues           EVASF       EVASB       EVAHF
Extreme eigenvalues
and eigenvectors              EVESF       EVESB       EVEHF
Eigenvalues in
an interval                   EVBSF       EVBSB       EVBHF
Eigenvalues and
eigenvectors in an interval   EVFSF       EVFSB       EVFHF
Performance index             EPISF       EPISB       EPIHF

General Eigensystems
                              Real        Complex     Real         Complex
                              General     General     Hessenberg   Hessenberg
All eigenvalues               EVLRG       EVLCG       EVLRH        EVLCH
All eigenvalues
and eigenvectors              EVCRG       EVCCG       EVCRH        EVCCH
Performance index             EPIRG       EPICG       EPIRG        EPICG

Generalized Eigensystems $Ax = \lambda Bx$
                              Real        Complex     A Symmetric
                              General     General     B Positive Definite
All eigenvalues               GVLRG       GVLCG       GVLSP
All eigenvalues
and eigenvectors              GVCRG       GVCCG       GVCSP
Performance index             GPIRG       GPICG       GPISP
for all $\lambda$ in $\sigma(A)$,
where $\sigma(A)$ is the set of all eigenvalues of A (called the spectrum of A), X is the matrix of
eigenvectors, $\|\cdot\|_2$ is the 2-norm, and $\kappa(X)$ is the condition number of X defined as
$\kappa(X) = \|X\|_2 \, \|X^{-1}\|_2$. If A is a real symmetric or complex Hermitian matrix, then its eigenvector
matrix X is respectively orthogonal or unitary. For these matrices, $\kappa(X) = 1$.
The eigenvalues $\lambda_j$ and eigenvectors $x_j$ computed by EVC** can be checked by computing their
performance index $\tau$ using EPI**. The performance index is defined by Smith et al. (1976,
pages 124–126) to be
$$\tau = \max_{1 \le j \le n} \frac{\|A x_j - \lambda_j x_j\|_1}{10\, n\, \varepsilon\, \|A\|_1 \, \|x_j\|_1}$$
where $\varepsilon$ is the machine precision.
No significance should be attached to the factor of 10 used in the denominator. For a real vector x,
the symbol $\|x\|_1$ represents the usual 1-norm of x. For a complex vector x, the symbol $\|x\|_1$ is
defined by
$$\|x\|_1 = \sum_{k=1}^{N} \left( |\Re x_k| + |\Im x_k| \right)$$
$= A x_j - \lambda_j x_j$
calculation. There are also similar routines GPI** to compute the performance index for
generalized eigenvalue problems.
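A performance index of this kind can be evaluated directly from a set of eigenpairs. The sketch below does not call EPI**: it uses a 2 × 2 symmetric matrix whose eigenpairs are known in closed form, takes EPSILON(1.0) as the machine precision, and uses the maximum absolute column sum as the matrix 1-norm. The helper norm1 and all variable names are chosen here for illustration.

```fortran
program perf_index_sketch
  implicit none
  integer, parameter :: n = 2
  real :: a(n,n), x(n,n), lam(n), tau, res
  integer :: j

  a   = reshape([2., 1., 1., 2.], [n, n])     ! symmetric test matrix
  lam = [3., 1.]                              ! its eigenvalues
  x(:,1) = [1.,  1.]/sqrt(2.)                 ! eigenvector for lambda = 3
  x(:,2) = [1., -1.]/sqrt(2.)                 ! eigenvector for lambda = 1

  tau = 0.0
  do j = 1, n
     ! 1-norm of the residual A x_j - lambda_j x_j
     res = sum(abs(matmul(a, x(:,j)) - lam(j)*x(:,j)))
     tau = max(tau, res/(10.0*n*epsilon(1.0)*norm1(a)*sum(abs(x(:,j)))))
  end do
  print *, 'performance index tau =', tau

contains

  real function norm1(m)          ! matrix 1-norm: maximum absolute column sum
    real, intent(in) :: m(:,:)
    norm1 = maxval(sum(abs(m), dim=1))
  end function norm1

end program perf_index_sketch
```

Because the eigenpairs here are exact up to rounding, the residuals are at the level of machine precision and the index comes out small.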
If the condition number $\kappa(X)$ of the eigenvector matrix X is large, there can be large errors in the
eigenvalues even if $\tau$ is small. In particular, it is often difficult to recognize near multiple
eigenvalues or unstable mathematical problems from numerical results. This facet of the
eigenvalue problem is difficult to understand: A user often asks for the accuracy of an individual
eigenvalue. This can be answered approximately by computing the condition number of an
individual eigenvalue. See Golub and Van Loan (1989, pages 344–345). For matrices A such that
the computed array of normalized eigenvectors X is invertible, the condition number of $\lambda_j$ is $\kappa_j$,
the Euclidean length of row j of the inverse matrix $X^{-1}$. Users can choose to compute this matrix
with routine LINCG; see Chapter 1, Linear Systems. An approximate bound for the accuracy of a
computed eigenvalue is then given by $\kappa_j \varepsilon \|A\|$, where $\varepsilon$ is the machine precision. To compute
an approximate bound for the relative accuracy of an eigenvalue, divide this bound by $|\lambda_j|$.
$\mu_j = \lambda_j^{-1}$
The generalized eigenvectors for $\lambda_j$ correspond to those for $\mu_j$. Other reformulations can be made:
If B is nonsingular, the user can solve the ordinary eigenvalue problem $Cx \equiv B^{-1}Ax = \lambda x$. This is
not recommended as a computational algorithm for two reasons. First, it is generally less efficient
than solving the generalized problem directly. Second, the matrix C will be subject to
perturbations due to ill-conditioning and rounding errors when computing $B^{-1}A$. Computing the
condition numbers of the eigenvalues for C may, however, be helpful for analyzing the accuracy
of results for the generalized problem.
There is another method that users can consider to reduce the generalized problem to an alternate
ordinary problem. This technique is based on first computing a matrix decomposition B = PQ,
where both P and Q are matrices that are simple to invert. Then, the given generalized problem
is equivalent to the ordinary eigenvalue problem $Fy = \lambda y$. The matrix $F \equiv P^{-1}AQ^{-1}$. The
unnormalized eigenvectors of the generalized problem are given by $x = Q^{-1}y$. An example of this
reformulation is used in the case where A and B are real and symmetric with B positive definite.
The IMSL routines GVLSP and GVCSP use $P = R^T$ and Q = R where R is an upper triangular matrix
obtained from a Cholesky decomposition, $B = R^T R$. The matrix $F = R^{-T} A R^{-1}$ is symmetric and
real. Computation of the eigenvalue-eigenvector expansion for F is based on routine EVCSF.
LIN_EIG_SELF
Computes the eigenvalues of a self-adjoint (i.e. real symmetric or complex Hermitian) matrix, A.
Optionally, the eigenvectors can be computed. This gives the decomposition $A = V D V^T$, where V
is an n × n orthogonal matrix and D is a real diagonal matrix.
Required Arguments
A Array of size n × n containing the matrix. (Input [/Output])
D Array of size n containing the eigenvalues. The values are in order of decreasing
absolute value. (Output)
Optional Arguments
NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix.
Default: n = size(A, 1)
v = v(:,:) (Output)
Array of the same type and kind as A(1:n, 1:n). It contains the n × n orthogonal matrix
V.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix; used for passing
optional data to the routine. The options are as follows:
Packaged Options for LIN_EIG_SELF
Option Prefix = ?
Option Name
Option Value
Lin_eig_self_set_small
Lin_eig_self_overwrite_input
Lin_eig_self_scan_for_NaN
Lin_eig_self_use_QR
Lin_eig_self_skip_Orth
Lin_eig_self_use_Gauss_elim
Lin_eig_self_set_perf_ratio
Examines each input array entry to find the first value such that
isNaN(a(i,j)) == .true.
If the eigenvectors are computed using inverse iteration, skips the final
orthogonalization of the vectors. This will result in a more efficient computation but
the eigenvectors, while a complete set, may be far from orthogonal.
Default: the eigenvectors are normally orthogonalized if obtained using inverse
iteration.
iopt(IO) = ?_options(?_lin_eig_self_use_Gauss_elim, ?_dummy)
If the eigenvectors are computed using inverse iteration, uses standard elimination with
partial pivoting to solve the inverse iteration problems.
Default: the eigenvectors are computed using cyclic reduction
iopt(IO) = ?_options(?_lin_eig_self_set_perf_ratio, perf_ratio)
FORTRAN 90 Interface
Generic:
Specific:
Description
Routine LIN_EIG_SELF is an implementation of the QR algorithm for self-adjoint matrices. An
orthogonal similarity reduction of the input matrix to self-adjoint tridiagonal form is performed.
Then, the eigenvalue-eigenvector decomposition of a real tridiagonal matrix is calculated. The expansion of the matrix as AV = VD results from a product of these matrix factors. See Golub and
Van Loan (1989, Chapter 8) for details.
Output
Example 1 for LIN_EIG_SELF is correct.
Additional Examples
Example 2: Eigenvalue-Eigenvector Expansion of a Square Matrix
A self-adjoint matrix is generated and the eigenvalues and eigenvectors are computed. Thus,
$A = VDV^T$, where V is orthogonal and D is a real diagonal matrix. The matrix V is obtained using
an optional argument. Also, see operator_ex26, Chapter 10.
use lin_eig_self_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_EIG_SELF.
integer, parameter :: n=8
real(kind(1e0)), parameter :: one=1e0
real(kind(1e0)) :: a(n,n), d(n), v_s(n,n), y(n*n)
! Generate a random self-adjoint matrix.
call rand_gen(y)
a = reshape(y,(/n,n/))
a = a + transpose(a)
! Compute the eigenvalues and eigenvectors.
call lin_eig_self(a, d, v=v_s)
! Check the results for small residuals.
if (sum(abs(matmul(a,v_s)-v_s*spread(d,1,n)))/d(1) <= &
sqrt(epsilon(one))) then
write (*,*) 'Example 2 for LIN_EIG_SELF is correct.'
end if
end
Output
Example 2 for LIN_EIG_SELF is correct.
$(A - d_i I)\, v_i = b_i$
The solutions are then orthogonalized as in Hanson et al. (1991) to comprise a partial decomposition
$AV = VD$, where V is an n × k matrix resulting from the orthogonalized $\{v_i\}$ and D is the k × k
diagonal matrix of the distinguished eigenvalues. It is necessary to suppress the error message when
the matrix is singular. Since these singularities are desirable, it is appropriate to ignore the
exceptions and not print the message text. Also, see operator_ex27, supplied with the product
examples.
use lin_eig_self_int
use lin_sol_self_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 3 for LIN_EIG_SELF.
integer i, j
integer, parameter :: n=64, k=8
real(kind(1d0)), parameter :: one=1d0, zero=0d0
real(kind(1d0)) big, err
real(kind(1d0)) :: a(n,n), b(n,1), d(n), res(n,k), temp(n,n), &
v(n,k), y(n*n)
type(d_options) :: iopti(2)=d_options(0,zero)
! Generate a random self-adjoint matrix.
call rand_gen(y)
a = reshape(y,(/n,n/))
a = a + transpose(a)
! Compute just the eigenvalues.
call lin_eig_self(a, d)
do i=1, k
! Define a temporary array to hold the matrices A - eigenvalue*I.
temp = a
do j=1, n
temp(j,j) = temp(j,j) - d(i)
end do
! Use packaged option to reset the value of a small diagonal.
iopti(1) = d_options(d_lin_sol_self_set_small,&
epsilon(one)*abs(d(i)))
! Use packaged option to skip singularity messages.
iopti(2) = d_options(d_lin_sol_self_no_sing_mess,&
zero)
call rand_gen(b(1:n,1))
call lin_sol_self(temp, b, v(1:,i:i),&
iopt=iopti)
end do
! Orthogonalize the eigenvectors.
do i=1, k
big = maxval(abs(v(1:,i)))
v(1:,i) = v(1:,i)/big
v(1:,i) = v(1:,i)/sqrt(sum(v(1:,i)**2))
if (i == k) cycle
v(1:,i+1:k) = v(1:,i+1:k) + &
spread(-matmul(v(1:,i),v(1:,i+1:k)),1,n)* &
spread(v(1:,i),2,k-i)
end do
do i=k-1, 1, -1
v(1:,i+1:k) = v(1:,i+1:k) + &
spread(-matmul(v(1:,i),v(1:,i+1:k)),1,n)* &
spread(v(1:,i),2,k-i)
end do
! Check the results for both orthogonality of vectors and small
! residuals.
res(1:k,1:k) = matmul(transpose(v),v)
do i=1,k
res(i,i)=res(i,i)-one
end do
err = sum(abs(res))/k**2
res = matmul(a,v) - v*spread(d(1:k),1,n)
if (err <= sqrt(epsilon(one))) then
if (sum(abs(res))/abs(d(1)) <= sqrt(epsilon(one))) then
write (*,*) 'Example 3 for LIN_EIG_SELF is correct.'
end if
end if
end
Output
Example 3 for LIN_EIG_SELF is correct.
where
$D = S^{-1/2}$
The relationship between x and y is summarized as $X = VDY$, computed after the ordinary
eigenvalue problem is solved for the eigenvectors Y of C. The matrix X is normalized so that each
column has Euclidean length of value one. This solution method is nonstandard for any but the
most ill-conditioned matrices B. The standard approach is to compute an ordinary self-adjoint problem
following computation of the Cholesky decomposition
$B = R^T R$
where R is upper triangular. The computation of C can also be completed efficiently by exploiting
its self-adjoint property. See Golub and Van Loan (1989, Chapter 8) for more information. Also,
see operator_ex28, Chapter 10.
use lin_eig_self_int
use rand_gen_int
implicit none
! This is Example 4 for LIN_EIG_SELF.
integer i
integer, parameter :: n=64
real(kind(1e0)), parameter :: one=1e0
real(kind(1e0)) b_sum
real(kind(1e0)), dimension(n,n) :: A, B, C, D(n), lambda(n), &
S(n), vb_d, X, ytemp(n*n), res
! Generate random self-adjoint matrices.
call rand_gen(ytemp)
A = reshape(ytemp,(/n,n/))
A = A + transpose(A)
call rand_gen(ytemp)
B = reshape(ytemp,(/n,n/))
B = B + transpose(B)
b_sum = sqrt(sum(abs(B**2))/n)
! Add a scalar matrix so B is positive definite.
do i=1, n
B(i,i) = B(i,i) + b_sum
end do
! Get the eigenvalues and eigenvectors for B.
call lin_eig_self(B, S, v=vb_d)
! For full rank problems, convert to an ordinary self-adjoint
! problem. (All of these examples are full rank.)
if (S(n) > epsilon(one)) then
D = one/sqrt(S)
C = spread(D,2,n)*matmul(transpose(vb_d), &
matmul(A,vb_d))*spread(D,1,n)
! Get the eigenvalues and eigenvectors for C.
call lin_eig_self(C, lambda, v=X)
! Compute the generalized eigenvectors.
X = matmul(vb_d,spread(D,2,n)*X)
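! Normalize the eigenvectors for the generalized problem
! (each column of X is scaled to unit Euclidean length).
X = X * spread(one/sqrt(sum(X**2,dim=1)),1,n)
res = matmul(A,X) - &
matmul(B,X)*spread(lambda,1,n)
! Check the results for small residuals.
if (sum(abs(res))/(sum(abs(A))+sum(abs(B))) <= &
sqrt(epsilon(one))) then
write (*,*) 'Example 4 for LIN_EIG_SELF is correct.'
end if
end if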
end
Output
Example 4 for LIN_EIG_SELF is correct.
LIN_EIG_GEN
Computes the eigenvalues of an n × n matrix, A. Optionally, the eigenvectors of A or $A^T$ are
computed. Using the eigenvectors of A gives the decomposition $AV = VE$, where V is an n × n
complex matrix of eigenvectors, and E is the complex diagonal matrix of eigenvalues. Other
options include the reduction of A to upper triangular or Schur form, reduction to block upper
triangular form with 2 × 2 or unit sized diagonal block matrices, and reduction to upper
Hessenberg form.
Required Arguments
A Array of size n × n containing the matrix. (Input [/Output])
E Array of size n containing the eigenvalues. These complex values are in order of
decreasing absolute value. The signs of imaginary parts of the eigenvalues are in no
predictable order. (Output)
Optional Arguments
NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix.
Default: n = SIZE(A, 1)
v = V(:,:) (Output)
Returns the complex array of eigenvectors for the matrix A.
v_adj = U(:,:) (Output)
Returns the complex array of eigenvectors for the matrix $A^T$. Thus the residuals
$S = A^TU - UE$
are small.
tri = T(:,:) (Output)
Returns the complex upper-triangular matrix T associated with the reduction of the
matrix A to Schur form. Optionally a unitary matrix W is returned in array V(:,:)
such that the residuals $Z = AW - WT$ are small.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix. Used for passing
optional data to the routine. The options are as follows:
Packaged Options for LIN_EIG_GEN
Option Prefix = ?
Option Name                     Option Value
lin_eig_gen_set_small           1
lin_eig_gen_overwrite_input     2
lin_eig_gen_scan_for_NaN        3
lin_eig_gen_no_balance          4
lin_eig_gen_set_iterations      5
lin_eig_gen_in_Hess_form        6
lin_eig_gen_out_Hess_form       7
lin_eig_gen_out_block_form      8
lin_eig_gen_out_tri_form        9
lin_eig_gen_continue_with_V     10
lin_eig_gen_no_sorting          11
This is the tolerance used to declare off-diagonal values effectively zero compared with
the size of the numbers involved in the computation of a shift.
Default: Small = epsilon(), the relative accuracy of arithmetic
Examines each input array entry to find the first value such that
isNaN(a(i,j)) == .true.
The input matrix is not preprocessed searching for isolated eigenvalues followed by
rescaling. See Golub and Van Loan (1989, Chapter 7) for references. With some
optional uses of the routine, this option flag is required.
Default: The matrix is first balanced.
Resets the maximum number of iterations permitted to isolate each diagonal block
matrix.
Default: The maximum number of iterations is 52.
The input matrix is in upper Hessenberg form. This flag is used to avoid the initial
reduction phase which may not be needed for some problem classes.
Default: The matrix is first reduced to Hessenberg form.
iopt(IO) = ?_options(?_lin_eig_gen_out_Hess_form, ?_dummy)
The output matrix is transformed to upper Hessenberg form, $H_2$, which is block upper
triangular. The dimensions of the blocks are either 2 × 2 or unit sized. Nonzero
subdiagonal values of $H_2$ determine the size of the blocks. If the optional argument
v=V(:,:) is passed by the calling program unit, then the array V(:,:) contains an
orthogonal matrix $Q_2$ such that
$AQ_2 - Q_2H_2 \approx 0$
As a convenience or for maintaining efficiency, the calling program unit sets the
optional argument v=V(:,:) to a matrix that has transformed a problem to the
similar matrix, $\tilde{A}$. The contents of V(:,:) are updated by the transformations used in
the algorithm. Requires the simultaneous use of option ?_lin_eig_gen_no_balance.
Default: The array V(:,:) is initialized to the identity matrix.
iopt(IO) = ?_options(?_lin_eig_gen_no_sorting, ?_dummy)
Does not sort the eigenvalues as they are isolated by solving the 2 × 2 or unit sized
blocks. This will have the effect of guaranteeing that complex conjugate pairs of
FORTRAN 90 Interface
Generic:
Specific:
Description
The input matrix A is first balanced. The resulting similar matrix is transformed to upper
Hessenberg form using orthogonal transformations. The double-shifted QR algorithm transforms the
Hessenberg matrix so that 2 × 2 or unit sized blocks remain along the main diagonal. Any off-diagonal
that is classified as small in order to achieve this block form is set to the value zero. Next the
block upper triangular matrix is transformed to upper triangular form with unitary rotations. The
eigenvectors of the upper triangular matrix are computed using back substitution. Care is taken to
avoid overflows during this process. At the end, eigenvectors are normalized to have Euclidean
length one, with the largest component real and positive. This algorithm follows that given in
Golub and Van Loan (1989, Chapter 7), with some novel organizational details for additional
options, efficiency and robustness.
Output
Example 1 for LIN_EIG_GEN is correct.
Additional Examples
Example 2: Complex Polynomial Equation Roots
The roots of a complex polynomial equation,
$$f(z) \equiv \sum_{k=1}^{n} b_k z^{n-k} + z^n = 0$$
are required. This algebraic equation is formulated as a matrix eigenvalue problem. The equivalent
matrix eigenvalue problem is solved using the upper Hessenberg matrix which has the value zero
except in row number 1 and along the first subdiagonal. The entries in the first row are given by
$a_{1,j} = -b_j$, $j = 1, \ldots, n$, while those on the first subdiagonal have the value one. This is a companion
matrix for the polynomial. The results are checked by testing for small values of $|f(e_i)|$, $i = 1, \ldots, n$,
at the eigenvalues of the matrix, which are the roots of f(z). Also, see operator_ex30, supplied
with the product examples.
use lin_eig_gen_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_EIG_GEN.
integer i
integer, parameter :: n=12
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
real(kind(1d0)) err, t(2*n)
type(d_options) :: iopti(1)=d_options(0,zero)
complex(kind(1d0)) a(n,n), b(n), e(n), f(n), fg(n)
call rand_gen(t)
b = cmplx(t(1:n),t(n+1:),kind(one))
Output
Example 2 for LIN_EIG_GEN is correct.
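The companion-matrix formulation used in Example 2 can be sketched without any library calls. The module below is an illustration only: the names companion_sketch, companion, and polyval are hypothetical and are not part of the IMSL library. It builds the matrix whose first row holds the negated coefficients and whose first subdiagonal holds ones, and evaluates f(z) by Horner's rule so that candidate roots can be tested.

```fortran
module companion_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
contains
  ! Companion matrix of f(z) = z**n + b(1)*z**(n-1) + ... + b(n).
  ! The first row holds -b(j); the first subdiagonal holds ones.
  function companion(b) result(a)
    complex(dp), intent(in) :: b(:)
    complex(dp) :: a(size(b),size(b))
    integer :: i
    a = (0.0_dp, 0.0_dp)
    a(1,:) = -b
    do i = 2, size(b)
      a(i,i-1) = (1.0_dp, 0.0_dp)
    end do
  end function companion

  ! Evaluate f(z) by Horner's rule, starting from the leading
  ! coefficient one.
  function polyval(b, z) result(f)
    complex(dp), intent(in) :: b(:), z
    complex(dp) :: f
    integer :: k
    f = (1.0_dp, 0.0_dp)
    do k = 1, size(b)
      f = f*z + b(k)
    end do
  end function polyval
end module companion_sketch
```

If e is an eigenvalue of companion(b), the vector with components e**(n-1), ..., e, 1 is a corresponding eigenvector, and abs(polyval(b, e)) should be small. This mirrors the check described for the example.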
$(T + hI)\,y = \overline{W}^{T} b$
Then, $x = x(h) = Wy$. This is an efficient and accurate method for such parametric systems
provided the expense of computing the Schur form has a pay-off in later efficiency. Using the Schur
form in this way, it is not required to compute an LU factorization of A + hI with each new value
of h. Note that even if the data A, h, and b are real, subexpressions for the solution may involve
complex intermediate values, with x(h) finally a real quantity. Also, see operator_ex31,
supplied with the product examples.
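The reduction to a triangular solve follows directly from the Schur relation. The following is a worked derivation, assuming the factorization $AW = WT$ with W unitary and T upper triangular, as returned through the v= and tri= arguments; $W^{H}$ denotes the conjugate transpose of W.

```latex
\begin{aligned}
(A + hI)\,x &= b \\
\left(WTW^{H} + hWW^{H}\right)x &= b \\
W\,(T + hI)\,W^{H}x &= b \\
(T + hI)\,y &= W^{H}b, \qquad y = W^{H}x \\
x &= Wy
\end{aligned}
```

Only the diagonal of the triangular system depends on h, so each new value of h costs one matrix-vector product with $W^{H}$, one back substitution, and one product with W.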
use lin_eig_gen_int
use lin_sol_gen_int
use rand_gen_int
implicit none
! This is Example 3 for LIN_EIG_GEN.
integer i
integer, parameter :: n=32, k=2
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1e0)) a(n,n), b(n,k), x(n,k), temp(n*max(n,k)), h, err
type(s_options) :: iopti(2)
complex(kind(1e0)) w(n,n), t(n,n), e(n), z(n,k)
call rand_gen(temp)
a = reshape(temp,(/n,n/))
call rand_gen(temp)
b = reshape(temp,(/n,k/))
iopti(1) = s_options(s_lin_eig_gen_out_tri_form,zero)
iopti(2) = s_options(s_lin_eig_gen_no_balance,zero)
! Compute the Schur decomposition of the matrix.
call lin_eig_gen(a, e, v=w, tri=t, &
iopt=iopti)
! Choose a value so that A+h*I is non-singular.
h = one
! Solve for (A+h*I)x=b using the Schur decomposition.
z = matmul(conjg(transpose(w)),b)
! Solve intermediate upper-triangular system with implicit
! additive diagonal, h*I. This is the only dependence on
! h in the solution process.
do i=n,1,-1
z(i,1:k) = z(i,1:k)/(t(i,i)+h)
z(1:i-1,1:k) = z(1:i-1,1:k) + &
spread(-t(1:i-1,i),dim=2,ncopies=k)* &
spread(z(i,1:k),dim=1,ncopies=i-1)
end do
! Compute the solution. It should be the same as x, but will not be
! exact due to rounding errors. (The quantity real(z,kind(one)) is
! the real-valued answer when the Schur decomposition method is used.)
z = matmul(w,z)
! Compute the solution by solving for x directly.
do i=1, n
a(i,i) = a(i,i) + h
end do
call lin_sol_gen(a, b, x)
! Check that x and z agree approximately.
err = sum(abs(x-z))/sum(abs(x))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 3 for LIN_EIG_GEN is correct.'
end if
end
Output
Example 3 for LIN_EIG_GEN is correct.
$|u_i^{*} v_i|$
is satisfied. The vectors $u_i$ and $v_i$ are the ordinary and adjoint eigenvectors associated respectively
with $e_i$ and its complex conjugate. This gives an upper bound on the size of the change to each
$e_i$ due to changing the matrix data. The reciprocal
$|u_i^{*} v_i|^{-1}$
integer i
integer, parameter :: n=17
real(kind(1d0)), parameter :: one=1d0
real(kind(1d0)) a(n,n), c(n,n), variation(n), y(n*n), temp(n), &
norm_of_a, eta
complex(kind(1d0)), dimension(n,n) :: e(n), d(n), u, v
! Generate a random matrix.
call rand_gen(y)
a = reshape(y,(/n,n/))
! Compute the eigenvalues, left- and right- eigenvectors.
call lin_eig_gen(a, e, v=v, v_adj=u)
! Compute condition numbers and variations of eigenvalues.
norm_of_a = sqrt(sum(a**2)/n)
do i=1, n
variation(i) = norm_of_a/abs(dot_product(u(1:n,i), &
v(1:n,i)))
end do
!
!
!
!
Output
Example 4 for LIN_EIG_GEN is correct.
LIN_GEIG_GEN
Computes the generalized eigenvalues of an n × n matrix pencil, $Av = \lambda Bv$. Optionally, the
generalized eigenvectors are computed. If either of A or B is nonsingular, there are diagonal
matrices $\alpha$ and $\beta$, and a complex matrix V, all computed such that $AV\beta = BV\alpha$.
Required Arguments
A Array of size n × n containing the matrix A. (Input [/Output])
B Array of size n × n containing the matrix B. (Input [/Output])
ALPHA Array of size n containing diagonal matrix factors of the generalized
eigenvalues. These complex values are in order of decreasing absolute value. (Output)
BETAV Array of size n containing diagonal matrix factors of the generalized
eigenvalues. These real values are in order of decreasing value. (Output)
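The roles of ALPHA and BETAV can be illustrated with the residual test that Example 4 of this routine applies to its output. The sketch below assumes real double precision A and B; the names pencil_check_sketch and pencil_residual are hypothetical and are not part of the IMSL library.

```fortran
module pencil_check_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
contains
  ! Scaled residual of A*V*diag(betav) - B*V*diag(alpha).
  ! Column j of V is multiplied by betav(j) or alpha(j).
  function pencil_residual(a, b, v, alpha, betav) result(err)
    real(dp),    intent(in) :: a(:,:), b(:,:), betav(:)
    complex(dp), intent(in) :: v(:,:), alpha(:)
    real(dp) :: err
    integer :: n
    n = size(a,1)
    err = sum(abs(matmul(a,v)*spread(betav,dim=1,ncopies=n) - &
                  matmul(b,v)*spread(alpha,dim=1,ncopies=n))) / &
          sum(abs(a)+abs(b))
  end function pencil_residual
end module pencil_check_sketch
```

A generalized eigenvalue is recovered as alpha(j)/betav(j) when betav(j) is not negligible. Keeping the two factors separate lets a zero in BETAV represent an infinite eigenvalue without a division.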
Optional Arguments
NROWS = n (Input)
Uses arrays A(1:n, 1:n) and B(1:n, 1:n) for the input matrix pencil.
Default: n = SIZE(A, 1)
v = V(:,:) (Output)
Returns the complex array of generalized eigenvectors for the matrix pencil.
iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix. Used for passing
optional data to the routine. The options are as follows:
Option Prefix = ?
Option Value
lin_geig_gen_set_small
lin_geig_gen_overwrite_input
lin_geig_gen_scan_for_NaN
lin_geig_gen_self_adj_pos
lin_geig_gen_for_lin_sol_self
lin_geig_gen_for_lin_eig_self
lin_geig_gen_for_lin_sol_lsq
lin_geig_gen_for_lin_eig_gen
This tolerance, multiplied by the sum of absolute value of the matrix B, is used to
define a small diagonal term in the routines lin_sol_lsq and lin_sol_self. That
value can be replaced using the option flags lin_geig_gen_for_lin_sol_lsq, and
lin_geig_gen_for_lin_sol_self.
Default: Small = epsilon(.), the relative accuracy of arithmetic
Examines each input array entry to find the first value such that
isNaN(a(i,j)) .or. isNaN(b(i,j)) == .true.
If both matrices A and B are self-adjoint and additionally B is positive-definite, then the
Cholesky algorithm is used to reduce the matrix pencil to an ordinary self-adjoint
eigenvalue problem.
iopt(IO) = ?_options(?_lin_geig_gen_for_lin_sol_self, ?_dummy)
iopt(IO+1) = ?_options((k=size of options for lin_sol_self), ?_dummy)
The options for lin_sol_self follow as data in iopt().
iopt(IO) = ?_options(?_lin_geig_gen_for_lin_eig_self, ?_dummy)
iopt(IO+1) = ?_options((k=size of options for lin_eig_self), ?_dummy)
The options for lin_eig_self follow as data in iopt().
iopt(IO) = ?_options(?_lin_geig_gen_for_lin_sol_lsq, ?_dummy)
iopt(IO+1) = ?_options((k=size of options for lin_sol_lsq), ?_dummy)
The options for lin_sol_lsq follow as data in iopt().
iopt(IO) = ?_options(?_lin_geig_gen_for_lin_eig_gen, ?_dummy)
iopt(IO+1) = ?_options((k=size of options for lin_eig_gen), ?_dummy)
The options for lin_eig_gen follow as data in iopt().
FORTRAN 90 Interface
Generic:
Specific:
Description
Routine LIN_GEIG_GEN implements a standard algorithm that reduces a generalized eigenvalue or
matrix pencil problem to an ordinary eigenvalue problem. An orthogonal decomposition is computed
$BP^T = HR$
and
$RPv = x$
If the matrices A and B are self-adjoint and if, in addition, B is positive-definite, then a more
efficient reduction than the default algorithm can be optionally used to solve the problem: A
Cholesky decomposition is obtained, $R^TR = PBP^T$. The matrix R is upper triangular and P is a
permutation matrix. This is equivalent to the ordinary self-adjoint eigenvalue problem $Cx = \lambda x$,
where $RPv = x$ and
$C = R^{-T}PAP^TR^{-1}$
are small. Note that when the matrix B is nonsingular $\beta = I$, the identity matrix. When B is singular
and A is nonsingular, some diagonal entries of $\beta$ are essentially zero. This corresponds to infinite
eigenvalues of the matrix pencil. This random matrix pencil example has all finite eigenvalues.
Also, see operator_ex33, Chapter 10.
use lin_geig_gen_int
use rand_gen_int
implicit none
! This is Example 1 for LIN_GEIG_GEN.
integer, parameter :: n=32
real(kind(1d0)), parameter :: one=1d0
real(kind(1d0)) A(n,n), B(n,n), betav(n), beta_t(n), err, y(n*n)
complex(kind(1d0)) alpha(n), alpha_t(n), V(n,n)
Output
Example 1 for LIN_GEIG_GEN is correct.
Additional Examples
Example 2: Self-Adjoint, Positive-Definite Generalized Eigenvalue Problem
This example illustrates the use of optional flags for the special case where A and B are complex
self-adjoint matrices, and B is positive-definite. For purposes of maximum efficiency an option is
passed to routine LIN_SOL_SELF so that pivoting is not used in the computation of the Cholesky
decomposition of matrix B. This example does not require that secondary option. Also, see
operator_ex34, supplied with the product examples.
use lin_geig_gen_int
use lin_sol_self_int
use rand_gen_int
implicit none
! This is Example 2 for LIN_GEIG_GEN.
integer i
integer, parameter :: n=32
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
real(kind(1d0)) betav(n), temp_c(n,n), temp_d(n,n), err
type(d_options) :: iopti(4)=d_options(0,zero)
complex(kind(1d0)), dimension(n,n) :: A, B, C, D, V, alpha(n)
Output
Example 2 for LIN_GEIG_GEN is correct.
use lin_geig_gen_int
use rand_gen_int
use error_option_packet
use isnan_int
implicit none
! This is Example 3 for LIN_GEIG_GEN.
integer, parameter :: n=6
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
real(kind(1d0)) a(n,n), b(n,n), betav(n), y(n*n)
type(d_options) iopti(1)
type(d_error) epack(1)
complex(kind(1d0)) alpha(n)
! Generate random matrices for both A and B.
call rand_gen(y)
a = reshape(y,(/n,n/))
call rand_gen(y)
b = reshape(y,(/n,n/))
! Make columns of A and B zero, so both are singular.
a(1:n,n) = 0; b(1:n,n) = 0
! Set internal tolerance for a small diagonal term.
iopti(1) = d_options(d_lin_geig_gen_set_small,sqrt(epsilon(one)))
! Compute the generalized eigenvalues.
call lin_geig_gen(a, b, alpha, betav, &
iopt=iopti,epack=epack)
! See if singular DAE system is detected.
! (The size of epack() is too small for the message, so
! output is blocked with NaNs.)
if (isnan(alpha)) then
write (*,*) 'Example 3 for LIN_GEIG_GEN is correct.'
end if
end
Output
Example 3 for LIN_GEIG_GEN is correct.
solved. We anticipate that B might be singular and detect this fact. Also, see operator_ex36,
Chapter 10.
use lin_geig_gen_int
use lin_sol_lsq_int
use rand_gen_int
use isNaN_int
implicit none
! This is Example 4 for LIN_GEIG_GEN.
integer, parameter :: n=32
real(kind(1d0)), parameter :: one=1d0, zero=0d0
real(kind(1d0)) a(n,n), b(n,n), betav(n), y(n*n), err
type(d_options) iopti(4)
type(d_error) epack(1)
complex(kind(1d0)) alpha(n), v(n,n)
! Generate random matrices for both A and B.
call rand_gen(y)
a = reshape(y,(/n,n/))
call rand_gen(y)
b = reshape(y,(/n,n/))
! Set the option, a larger tolerance than default for lin_sol_lsq.
iopti(1) = d_options(d_lin_geig_gen_for_lin_sol_lsq,zero)
! Number of secondary optional data items
iopti(2) = d_options(2,zero)
iopti(3) = d_options(d_lin_sol_lsq_set_small,sqrt(epsilon(one))*&
sqrt(sum(b**2)/n))
iopti(4) = d_options(d_lin_sol_lsq_no_sing_mess,zero)
! Compute the generalized eigenvalues.
call lin_geig_gen(A, B, alpha, betav, v=v, &
iopt=iopti, epack=epack)
if(.not. isNaN(alpha)) then
! Check the residuals.
err = sum(abs(matmul(A,V)*spread(betav,dim=1,ncopies=n) - &
matmul(B,V)*spread(alpha,dim=1,ncopies=n))) / &
sum(abs(a)+abs(b))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 4 for LIN_GEIG_GEN is correct.'
end if
end if
end
Output
Example 4 for LIN_GEIG_GEN is correct.
EVLRG
Computes all of the eigenvalues of a real matrix.
Required Arguments
A Real full matrix of order N. (Input)
EVAL Complex vector of length N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVLRG computes the eigenvalues of a real matrix. The matrix is first balanced.
Elementary or Gauss similarity transformations with partial pivoting are used to reduce this
balanced matrix to a real upper Hessenberg matrix. A hybrid double-shifted LR-QR algorithm is
used to compute the eigenvalues of the Hessenberg matrix; see Watkins and Elsner (1990).
The underlying code is based on either EISPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation, see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. The LR-QR
algorithm is based on software work of Watkins and Haag. Further details, some timing data, and
credits are given in Hanson et al. (1990).
Comments
1.
2.
Informational error
Type
Code
4
3.
This option uses eight values to solve memory bank conflict (access
inefficiency) problems. In routine E3LRG, the internal or working leading
dimension of ACOPY is increased by IVAL(3) when N is a multiple of IVAL(4).
The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and
IVAL(2), respectively, in routine EVLRG . Additional memory allocation and
option value restoration are automatically done in EVLRG. There is no
requirement that users change existing applications that use EVLRG or E3LRG.
Default values for the option are
IVAL(*) = 1, 16, 0, 1, 1, 16, 0, 1. Items 5-8 in IVAL(*) are for the generalized
eigenvalue problem and are not used in EVLRG.
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 85). The eigenvalues of this real matrix are computed and printed. The exact eigenvalues are
known to be {4, 3, 2, 1}.
      USE EVLRG_INT
      USE WRCRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (N=4, LDA=N)
!
      REAL       A(LDA,N)
      COMPLEX    EVAL(N)
!                                 Set values of A
!
!                                 A = ( -2.0   2.0   2.0   2.0 )
!                                     ( -3.0   3.0   2.0   2.0 )
!                                     ( -2.0   0.0   4.0   2.0 )
!                                     ( -1.0   0.0   0.0   5.0 )
!
      DATA A/-2.0, -3.0, -2.0, -1.0, 2.0, 3.0, 0.0, 0.0, 2.0, 2.0, &
            4.0, 0.0, 2.0, 2.0, 2.0, 5.0/
!
!                                 Find eigenvalues of A
      CALL EVLRG (A, EVAL)
!                                 Print results
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      END
Output
                                  EVAL
               1                  2                  3                  4
(  4.000,  0.000)  (  3.000,  0.000)  (  2.000,  0.000)  (  1.000,  0.000)
EVCRG
Computes all of the eigenvalues and eigenvectors of a real matrix.
Required Arguments
A Floating-point array containing the matrix. (Input)
EVAL Complex array of size N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
EVEC Complex array containing the matrix of eigenvectors. (Output)
The J-th eigenvector, corresponding to EVAL(J), is stored in the J-th column. Each
vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVCRG computes the eigenvalues and eigenvectors of a real matrix. The matrix is first
balanced. Orthogonal similarity transformations are used to reduce the balanced matrix to a real
upper Hessenberg matrix. The implicit double-shifted QR algorithm is used to compute the
eigenvalues and eigenvectors of this Hessenberg matrix. The eigenvectors are normalized such
that each has Euclidean length of value one. The largest component is real and positive.
The underlying code is based on either EISPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation, see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. Further details,
some timing data, and credits are given in Hanson et al. (1990).
Comments
1.
2.
Informational error
Type
Code
4
3.
This option uses eight values to solve memory bank conflict (access
inefficiency) problems. In routine E8CRG, the internal or working leading
dimensions of ACOPY and ECOPY are both increased by IVAL(3) when N is a
multiple of IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced
by IVAL(1) and IVAL(2), respectively, in routine EVCRG. Additional memory
allocation and option value restoration are automatically done in EVCRG. There
is no requirement that users change existing applications that use EVCRG or
E8CRG. Default values for the option are IVAL(*) = 1, 16, 0, 1, 1, 16, 0, 1. Items
5-8 in IVAL(*) are for the generalized eigenvalue problem and are not used in
EVCRG.
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 82). The eigenvalues and eigenvectors of this real matrix are computed and printed. The
performance index is also computed and printed. This serves as a check on the computations. For
more details, see IMSL routine EPIRG.
      USE EVCRG_INT
      USE EPIRG_INT
      USE UMACH_INT
      USE WRCRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N
      PARAMETER  (N=3, LDA=N, LDEVEC=N)
      INTEGER    NOUT
      REAL       PI
      COMPLEX    EVAL(N), EVEC(LDEVEC,N)
      REAL       A(LDA,N)
!                                 Define values of A:
!
!                                 A = (  8.0  -1.0  -5.0 )
!                                     ( -4.0   4.0  -2.0 )
!                                     ( 18.0  -5.0  -7.0 )
!
      DATA A/8.0, -4.0, 18.0, -1.0, 4.0, -5.0, -5.0, -2.0, -7.0/
!
!                                 Find eigenvalues and vectors of A
      CALL EVCRG (A, EVAL, EVEC)
!                                 Compute performance index
      PI = EPIRG(N,A,EVAL,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      CALL WRCRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END
Output
                        EVAL
               1                  2                  3
(  2.000,  4.000)  (  2.000, -4.000)  (  1.000,  0.000)

                            EVEC
                   1                    2                    3
1  ( 0.3162, 0.3162)    ( 0.3162,-0.3162)    ( 0.4082, 0.0000)
2  (-0.0000, 0.6325)    (-0.0000,-0.6325)    ( 0.8165, 0.0000)
3  ( 0.6325, 0.0000)    ( 0.6325, 0.0000)    ( 0.4082, 0.0000)

Performance index =  0.026
EPIRG
This function computes the performance index for a real eigensystem.
Required Arguments
NEVAL Number of eigenvalue/eigenvector pairs on which the performance index
computation is based. (Input)
A Matrix of order N. (Input)
EVAL Complex vector of length NEVAL containing eigenvalues of A. (Input)
EVEC Complex N by NEVAL array containing eigenvectors of A. (Input)
The eigenvector corresponding to the eigenvalue EVAL(J) must be in the J-th column
of EVEC.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let M = NEVAL, $\lambda$ = EVAL, and $x_j$ = EVEC(*,J), the j-th column of EVEC. Also, let $\varepsilon$ be the machine
precision given by AMACH(4). The performance index, $\tau$, is defined to be
$$\tau = \max_{1 \le j \le M} \frac{\|Ax_j - \lambda_j x_j\|_1}{10 N \varepsilon \|A\|_1 \|x_j\|_1}$$
The norms used are a modified form of the 1-norm. The norm of the complex vector v is
$$\|v\|_1 = \sum_{i=1}^{N} \left\{ |\Re v_i| + |\Im v_i| \right\}$$
While the exact value of $\tau$ is highly machine dependent, the performance of EVCRG is considered
excellent if $\tau < 1$, good if $1 \le \tau \le 100$, and poor if $\tau > 100$.
The performance index was first developed by the EISPACK project at Argonne National
Laboratory; see Smith et al. (1976, pages 124-125).
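The definition of the performance index translates almost directly into array code. The sketch below is an illustration, not the library's implementation: the names perf_index_sketch, norm1c, and perf_index are hypothetical, epsilon(1.0) stands in for AMACH(4), and the matrix norm is assumed to be the maximum absolute column sum.

```fortran
module perf_index_sketch
  implicit none
contains
  ! Modified 1-norm of a complex vector: sum of |Re| + |Im|.
  pure function norm1c(v) result(s)
    complex, intent(in) :: v(:)
    real :: s
    s = sum(abs(real(v)) + abs(aimag(v)))
  end function norm1c

  ! Performance index over the size(eval) eigenpairs supplied.
  ! Column j of evec must pair with eval(j).
  function perf_index(a, eval, evec) result(tau)
    real,    intent(in) :: a(:,:)
    complex, intent(in) :: eval(:), evec(:,:)
    real :: tau, anorm, eps
    complex :: r(size(a,1))
    integer :: j, n
    n = size(a,1)
    eps = epsilon(1.0)                  ! assumed stand-in for AMACH(4)
    anorm = maxval(sum(abs(a), dim=1))  ! assumed matrix 1-norm
    tau = 0.0
    do j = 1, size(eval)
      r = matmul(a, evec(:,j)) - eval(j)*evec(:,j)
      tau = max(tau, norm1c(r)/(10.0*n*eps*anorm*norm1c(evec(:,j))))
    end do
  end function perf_index
end module perf_index_sketch
```

Because the residual is divided by a multiple of the machine precision, an exact eigenpair gives an index of zero, while an eigenvalue that is wrong in its leading digits gives an index far above 100.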
Comments
1.
2.
Informational errors
Type
Code
3
3
3
1
2
3
Example
For an example of EPIRG, see IMSL routine EVCRG.
EVLCG
Computes all of the eigenvalues of a complex matrix.
Required Arguments
A Complex matrix of order N. (Input)
EVAL Complex vector of length N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVLCG computes the eigenvalues of a complex matrix. The matrix is first balanced.
Unitary similarity transformations are used to reduce this balanced matrix to a complex upper
Hessenberg matrix. The shifted QR algorithm is used to compute the eigenvalues of this
Hessenberg matrix.
The underlying code is based on either EISPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation, see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2.
Informational error
Type
Code
4
3.
This option uses eight values to solve memory bank conflict (access
inefficiency) problems. In routine E3LCG, the internal or working leading
dimension of ACOPY is increased by IVAL(3) when N is a multiple of IVAL(4).
The values IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and
IVAL(2), respectively, in routine EVLCG. Additional memory allocation and
option value restoration are automatically done in EVLCG. There is no
requirement that users change existing applications that use EVLCG or E3LCG.
Default values for the option are
IVAL(*) = 1, 16, 0, 1, 1, 16, 0, 1. Items 5-8 in IVAL(*) are for the generalized
eigenvalue problem and are not used in EVLCG.
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney
(1969, page 115). The program computes the eigenvalues of this matrix.
      USE EVLCG_INT
      USE WRCRN_INT
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (N=3, LDA=N)
!
      COMPLEX    A(LDA,N), EVAL(N)
!                                 Set values of A
!
!                                 A = (  1+2i    3+4i   21+22i )
!                                     ( 43+44i  13+14i  15+16i )
!                                     (  5+6i    7+8i   25+26i )
!
      DATA A/(1.0,2.0), (43.0,44.0), (5.0,6.0), (3.0,4.0), &
          (13.0,14.0), (7.0,8.0), (21.0,22.0), (15.0,16.0), &
          (25.0,26.0)/
!
!                                 Find eigenvalues of A
      CALL EVLCG (A, EVAL)
!                                 Print results
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      END
Output
                         EVAL
               1                  2                  3
(  39.78,  43.00)  (   6.70,  -7.88)  (  -7.48,   6.88)
EVCCG
Computes all of the eigenvalues and eigenvectors of a complex matrix.
Required Arguments
A Complex matrix of order N. (Input)
EVAL Complex vector of length N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
EVEC Complex matrix of order N. (Output)
The J-th eigenvector, corresponding to EVAL(J), is stored in the J-th column. Each
vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
552 Chapter 2: Eigensystem Analysis
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVCCG computes the eigenvalues and eigenvectors of a complex matrix. The matrix is
first balanced. Unitary similarity transformations are used to reduce this balanced matrix to a
complex upper Hessenberg matrix. The QR algorithm is used to compute the eigenvalues and
eigenvectors of this Hessenberg matrix. The eigenvectors of the original matrix are computed by
transforming the eigenvectors of the complex upper Hessenberg matrix.
The underlying code is based on either EISPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation, see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2.
Informational error
Type
Code
4
3.
This option uses eight values to solve memory bank conflict (access inefficiency)
problems. In routine E6CCG, the internal or working leading dimensions of ACOPY and
ECOPY are both increased by IVAL(3) when N is a multiple of IVAL(4). The values
IVAL(3) and IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2), respectively,
in routine EVCCG. Additional memory allocation and option value restoration are
automatically done in EVCCG. There is no requirement that users change existing
applications that use EVCCG or E6CCG. Default values for the option are
IVAL(*) = 1, 16, 0, 1, 1, 16, 0, 1. Items 5-8 in IVAL(*) are for the generalized
eigenvalue problem and are not used in EVCCG.
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 116). Its eigenvalues are known to be {1 + 5i, 2 + 6i, 3 + 7i, 4 + 8i}. The program computes
the eigenvalues and eigenvectors of this matrix. The performance index is also computed and
printed. This serves as a check on the computations. For more details, see IMSL routine EPICG.
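      USE EVCCG_INT
      USE EPICG_INT
      USE WRCRN_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N
      PARAMETER  (N=4, LDA=N, LDEVEC=N)
!
      INTEGER    NOUT
      REAL       PI
      COMPLEX    A(LDA,N), EVAL(N), EVEC(LDEVEC,N)
!                                 Set values of A
!
!                                 A = ( 5+9i   5+5i  -6-6i  -7-7i )
!                                     ( 3+3i   6+10i -5-5i  -6-6i )
!                                     ( 2+2i   3+3i  -1+3i  -5-5i )
!                                     ( 1+i    2+2i  -3-3i     4i )
!
      DATA A/(5.0,9.0), (3.0,3.0), (2.0,2.0), (1.0,1.0), (5.0,5.0), &
          (6.0,10.0), (3.0,3.0), (2.0,2.0), (-6.0,-6.0), (-5.0,-5.0), &
          (-1.0,3.0), (-3.0,-3.0), (-7.0,-7.0), (-6.0,-6.0), &
          (-5.0,-5.0), (0.0,4.0)/
!
!                                 Find eigenvalues and vectors of A
      CALL EVCCG (A, EVAL, EVEC)
!                                 Compute performance index
      PI = EPICG(N,A,EVAL,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      CALL WRCRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END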
Output
                                   EVAL
                1                  2                  3                  4
( 4.000, 8.000)  ( 3.000, 7.000)  ( 2.000, 6.000)  ( 1.000, 5.000)

                                     EVEC
                   1                   2                   3                   4
1  ( 0.5774, 0.0000)  ( 0.5774, 0.0000)  ( 0.3780, 0.0000)  ( 0.7559, 0.0000)
2  ( 0.5774,-0.0000)  ( 0.5773,-0.0000)  ( 0.7559, 0.0000)  ( 0.3780, 0.0000)
3  ( 0.5774,-0.0000)  (-0.0000,-0.0000)  ( 0.3780, 0.0000)  ( 0.3780, 0.0000)
4  ( 0.0000, 0.0000)  ( 0.5774, 0.0000)  ( 0.3780, 0.0000)  ( 0.3780, 0.0000)

Performance index =  0.016
EPICG
This function computes the performance index for a complex eigensystem.
Required Arguments
NEVAL Number of eigenvalue/eigenvector pairs on which the performance index
computation is based. (Input)
A Complex matrix of order N. (Input)
EVAL Complex vector of length N containing the eigenvalues of A. (Input)
EVEC Complex matrix of order N containing the eigenvectors of A. (Input)
The J-th eigenvalue/eigenvector pair should be in EVAL(J) and in the J-th column of
EVEC.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let M = NEVAL, $\lambda$ = EVAL, $x_j$ = EVEC(*, J), the j-th column of EVEC. Also, let $\varepsilon$ be the machine
precision given by AMACH(4). The performance index, $\tau$, is defined to be
$$\tau = \max_{1 \le j \le M} \frac{\| A x_j - \lambda_j x_j \|_1}{10 N \varepsilon \| A \|_1 \| x_j \|_1}$$
The norms used are a modified form of the 1-norm. The norm of the complex vector v is
$$\| v \|_1 = \sum_{i=1}^{N} \left\{ |\Re v_i| + |\Im v_i| \right\}$$
While the exact value of $\tau$ is highly machine dependent, the performance of EVCSF is considered
excellent if $\tau < 1$, good if $1 \le \tau \le 100$, and poor if $\tau > 100$. The performance index was first
developed by the EISPACK project at Argonne National Laboratory; see Smith et al. (1976, pages
124-125).
Comments
1.
2.
Informational errors
Type  Code
3     1
3     2
Example
For an example of EPICG, see IMSL routine EVCCG.
EVLSF
Computes all of the eigenvalues of a real symmetric matrix.
Required Arguments
A Real symmetric matrix of order N. (Input)
EVAL Real vector of length N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVLSF computes the eigenvalues of a real symmetric matrix. Orthogonal similarity
transformations are used to reduce the matrix to an equivalent symmetric tridiagonal matrix. Then,
an implicit rational QR algorithm is used to compute the eigenvalues of this tridiagonal matrix.
The underlying code is based on either EISPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation, see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2.
Informational error
Type
Code
3
Example
In this example, the eigenvalues of a real symmetric matrix are computed and printed. This matrix
is given by Gregory and Karney (1969, page 56).
      USE EVLSF_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (N=4, LDA=N)
!
      REAL       A(LDA,N), EVAL(N)
!                                 Set values of A
!
!                                 A = (  6.0   4.0   4.0   1.0 )
!                                     (  4.0   6.0   1.0   4.0 )
!                                     (  4.0   1.0   6.0   4.0 )
!                                     (  1.0   4.0   4.0   6.0 )
!
      DATA A /6.0, 4.0, 4.0, 1.0, 4.0, 6.0, 1.0, 4.0, 4.0, 1.0, 6.0, &
          4.0, 1.0, 4.0, 4.0, 6.0 /
!
!                                 Find eigenvalues of A
      CALL EVLSF (A, EVAL)
!                                 Print results
      CALL WRRRN ('EVAL', EVAL, 1, N, 1)
      END
Output
              EVAL
    1       2       3       4
15.00    5.00    5.00   -1.00
EVCSF
Computes all of the eigenvalues and eigenvectors of a real symmetric matrix.
Required Arguments
A Real symmetric matrix of order N. (Input)
EVAL Real vector of length N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
EVEC Real matrix of order N. (Output)
The J-th eigenvector, corresponding to EVAL(J), is stored in the J-th column. Each
vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVCSF computes the eigenvalues and eigenvectors of a real symmetric matrix. Orthogonal
similarity transformations are used to reduce the matrix to an equivalent symmetric tridiagonal
matrix. These transformations are accumulated. An implicit rational QR algorithm is used to
compute the eigenvalues of this tridiagonal matrix. The eigenvectors are computed using the
eigenvalues as perfect shifts, Parlett (1980, pages 169, 172). The underlying code is based on
either EISPACK or LAPACK code depending upon which supporting libraries are used during
linking. For a detailed explanation, see Using ScaLAPACK, LAPACK, LINPACK, and
EISPACK in the Introduction section of this manual. Further details, some timing data, and
credits are given in Hanson et al. (1990).
Comments
1.
2.
Informational error
Type
Code
3
Example
The eigenvalues and eigenvectors of this real symmetric matrix are computed and printed. The
performance index is also computed and printed. This serves as a check on the computations. For
more details, see EPISF.
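      USE EVCSF_INT
      USE EPISF_INT
      USE UMACH_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N
      PARAMETER  (N=3, LDA=N, LDEVEC=N)
!
      INTEGER    NOUT
      REAL       A(LDA,N), EVAL(N), EVEC(LDEVEC,N), PI
!                                 Set values of A
!
!                                 A = (  7.0   -8.0   -8.0 )
!                                     ( -8.0  -16.0  -18.0 )
!                                     ( -8.0  -18.0   13.0 )
!
      DATA A/7.0, -8.0, -8.0, -8.0, -16.0, -18.0, -8.0, -18.0, 13.0/
!
!                                 Find eigenvalues and vectors of A
      CALL EVCSF (A, EVAL, EVEC)
!                                 Compute performance index
      PI = EPISF(N,A,EVAL,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRRRN ('EVAL', EVAL, 1, N, 1)
      CALL WRRRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END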
Output
           EVAL
     1        2        3
-27.90    22.68     9.22

               EVEC
         1         2         3
1   0.2945   -0.2722    0.9161
2   0.8521   -0.3591   -0.3806
3   0.4326    0.8927    0.1262

Performance index =  0.019
EVASF
Computes the largest or smallest eigenvalues of a real symmetric matrix.
Required Arguments
NEVAL Number of eigenvalues to be computed. (Input)
A Real symmetric matrix of order N. (Input)
SMALL Logical variable. (Input)
If .TRUE., the smallest NEVAL eigenvalues are computed. If .FALSE., the largest NEVAL
eigenvalues are computed.
EVAL Real vector of length NEVAL containing the eigenvalues of A in decreasing order of
magnitude. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVASF computes the largest or smallest eigenvalues of a real symmetric matrix.
Orthogonal similarity transformations are used to reduce the matrix to an equivalent symmetric
tridiagonal matrix. Then, an implicit rational QR algorithm is used to compute the eigenvalues of
this tridiagonal matrix.
The reduction routine is based on the EISPACK routine TRED2. See Smith et al. (1976). The
rational QR algorithm is called the PWK algorithm. It is given in Parlett (1980, page 169).
Comments
1.
2.
Informational error
Type
Code
3
Example
In this example, the three largest eigenvalues of the computed Hilbert matrix $a_{ij} = 1/(i + j - 1)$ of
order N = 10 are computed and printed.
      USE EVASF_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N, NEVAL
      PARAMETER  (N=10, NEVAL=3, LDA=N)
!
      INTEGER    I, J
      REAL       A(LDA,N), EVAL(NEVAL), REAL
      LOGICAL    SMALL
      INTRINSIC  REAL
!                                 Set up Hilbert matrix
      DO 20 J=1, N
         DO 10 I=1, N
            A(I,J) = 1.0/REAL(I+J-1)
   10    CONTINUE
   20 CONTINUE
!                                 Find the 3 largest eigenvalues
      SMALL = .FALSE.
      CALL EVASF (NEVAL, A, SMALL, EVAL)
!                                 Print results
      CALL WRRRN ('EVAL', EVAL, 1, NEVAL, 1)
      END
Output
         EVAL
    1       2       3
1.752   0.343   0.036
EVESF
Computes the largest or smallest eigenvalues and the corresponding eigenvectors of a real
symmetric matrix.
Required Arguments
NEVEC Number of eigenvalues to be computed. (Input)
A Real symmetric matrix of order N. (Input)
SMALL Logical variable. (Input)
If .TRUE., the smallest NEVEC eigenvalues are computed. If .FALSE., the largest NEVEC
eigenvalues are computed.
EVAL Real vector of length NEVEC containing the eigenvalues of A in decreasing order of
magnitude. (Output)
EVEC Real matrix of dimension N by NEVEC. (Output)
The J-th eigenvector, corresponding to EVAL(J), is stored in the J-th column. Each
vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVESF computes the largest or smallest eigenvalues and the corresponding eigenvectors
of a real symmetric matrix. Orthogonal similarity transformations are used to reduce the matrix to
an equivalent symmetric tridiagonal matrix. Then, an implicit rational QR algorithm is used to
compute the eigenvalues of this tridiagonal matrix. Inverse iteration is used to compute the
eigenvectors of the tridiagonal matrix. This is followed by orthogonalization of these vectors. The
eigenvectors of the original matrix are computed by back transforming those of the tridiagonal
matrix.
The reduction routine is based on the EISPACK routine TRED2. See Smith et al. (1976). The
rational QR algorithm is called the PWK algorithm. It is given in Parlett (1980, page 169). The
inverse iteration and orthogonalization computation is discussed in Hanson et al. (1990). The back
transformation routine is based on the EISPACK routine TRBAK1.
Comments
1.
2.
Informational errors
Type
Code
3
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 55). The largest two eigenvalues and their eigenvectors are computed and printed. The
performance index is also computed and printed. This serves as a check on the computations. For
more details, see IMSL routine EPISF.
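      USE EVESF_INT
      USE EPISF_INT
      USE UMACH_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N
      PARAMETER  (N=4, LDA=N, LDEVEC=N)
!
      INTEGER    NEVEC, NOUT
      REAL       A(LDA,N), EVAL(N), EVEC(LDEVEC,N), PI
      LOGICAL    SMALL
!                                 Set values of A
!
!                                 A = (  5.0   4.0   1.0   1.0 )
!                                     (  4.0   5.0   1.0   1.0 )
!                                     (  1.0   1.0   4.0   2.0 )
!                                     (  1.0   1.0   2.0   4.0 )
!
      DATA A/5.0, 4.0, 1.0, 1.0, 4.0, 5.0, 1.0, 1.0, 1.0, 1.0, 4.0, &
          2.0, 1.0, 1.0, 2.0, 4.0/
!
!                                 Find the 2 largest eigenvalues
!                                 and their eigenvectors
      NEVEC = 2
      SMALL = .FALSE.
      CALL EVESF (NEVEC, A, SMALL, EVAL, EVEC)
!                                 Compute performance index
      PI = EPISF(NEVEC,A,EVAL,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRRRN ('EVAL', EVAL, 1, NEVEC, 1)
      CALL WRRRN ('EVEC', EVEC, N, NEVEC, LDEVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END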
Output
     EVAL
    1       2
10.00    5.00

          EVEC
         1         2
1   0.6325   -0.3162
2   0.6325   -0.3162
3   0.3162    0.6325
4   0.3162    0.6325

Performance index =  0.031
EVBSF
Computes selected eigenvalues of a real symmetric matrix.
Required Arguments
MXEVAL Maximum number of eigenvalues to be computed. (Input)
A Real symmetric matrix of order N. (Input)
ELOW Lower limit of the interval in which the eigenvalues are sought. (Input)
EHIGH Upper limit of the interval in which the eigenvalues are sought. (Input)
NEVAL Number of eigenvalues found. (Output)
EVAL Real vector of length MXEVAL containing the eigenvalues of A in the interval (ELOW,
EHIGH) in decreasing order of magnitude. (Output)
Only the first NEVAL elements of EVAL are significant.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVBSF computes the eigenvalues in a given interval for a real symmetric matrix.
Orthogonal similarity transformations are used to reduce the matrix to an equivalent symmetric
tridiagonal matrix. Then, an implicit rational QR algorithm is used to compute the eigenvalues of
this tridiagonal matrix. The reduction step is based on the EISPACK routine TRED1. See Smith et
al. (1976). The rational QR algorithm is called the PWK algorithm. It is given in Parlett (1980,
page 169).
Comments
1.
2.
Informational error
Type
Code
3
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 56). The eigenvalues of A are known to be -1, 5, 5 and 15. The eigenvalues in the interval
[1.5, 5.5] are computed and printed. As a test, this example uses MXEVAL = 4. The routine EVBSF
computes NEVAL, the number of eigenvalues in the given interval. The value of NEVAL is 2.
      USE EVBSF_INT
      USE UMACH_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, MXEVAL, N
      PARAMETER  (MXEVAL=4, N=4, LDA=N)
!
      INTEGER    NEVAL, NOUT
      REAL       A(LDA,N), EHIGH, ELOW, EVAL(MXEVAL)
!                                 Set values of A
!
!                                 A = (  6.0   4.0   4.0   1.0 )
!                                     (  4.0   6.0   1.0   4.0 )
!                                     (  4.0   1.0   6.0   4.0 )
!                                     (  1.0   4.0   4.0   6.0 )
!
      DATA A/6.0, 4.0, 4.0, 1.0, 4.0, 6.0, 1.0, 4.0, 4.0, 1.0, 6.0, &
          4.0, 1.0, 4.0, 4.0, 6.0/
!
!                                 Find eigenvalues of A
      ELOW  = 1.5
      EHIGH = 5.5
      CALL EVBSF (MXEVAL, A, ELOW, EHIGH, NEVAL, EVAL)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,'(/,A,I2)') ' NEVAL = ', NEVAL
      CALL WRRRN ('EVAL', EVAL, 1, NEVAL, 1)
      END
Output
NEVAL =  2

     EVAL
    1       2
5.000   5.000
EVFSF
Computes selected eigenvalues and eigenvectors of a real symmetric matrix.
Required Arguments
MXEVAL Maximum number of eigenvalues to be computed. (Input)
A Real symmetric matrix of order N. (Input)
ELOW Lower limit of the interval in which the eigenvalues are sought. (Input)
EHIGH Upper limit of the interval in which the eigenvalues are sought. (Input)
NEVAL Number of eigenvalues found. (Output)
EVAL Real vector of length MXEVAL containing the eigenvalues of A in the interval (ELOW,
EHIGH) in decreasing order of magnitude. (Output)
Only the first NEVAL elements of EVAL are significant.
EVEC Real matrix of dimension N by MXEVAL. (Output)
The J-th eigenvector, corresponding to EVAL(J), is stored in the J-th column. Only the
first NEVAL columns of EVEC are significant. Each vector is normalized to have
Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL EVFSF (N, MXEVAL, A, LDA, ELOW, EHIGH, NEVAL, EVAL, EVEC,
LDEVEC)
Double:
Description
Routine EVFSF computes the eigenvalues in a given interval and the corresponding eigenvectors
of a real symmetric matrix. Orthogonal similarity transformations are used to reduce the matrix to
an equivalent symmetric tridiagonal matrix. Then, an implicit rational QR algorithm is used to
compute the eigenvalues of this tridiagonal matrix. Inverse iteration is used to compute the
eigenvectors of the tridiagonal matrix. This is followed by orthogonalization of these vectors. The
eigenvectors of the original matrix are computed by back transforming those of the tridiagonal
matrix.
The reduction step is based on the EISPACK routine TRED1. The rational QR algorithm is called
the PWK algorithm. It is given in Parlett (1980, page 169). The inverse iteration and
orthogonalization processes are discussed in Hanson et al. (1990). The transformation back to the
user's input matrix is based on the EISPACK routine TRBAK1. See Smith et al. (1976) for the
EISPACK routines.
Comments
1.
2.
Informational errors
Type
Code
3
Example
In this example, A is set to the computed Hilbert matrix. The eigenvalues in the interval [0.001, 1]
and their corresponding eigenvectors are computed and printed. This example uses MXEVAL = 3.
The routine EVFSF computes the number of eigenvalues NEVAL in the given interval. The value of
NEVAL is 2. The performance index is also computed and printed. For more details, see IMSL
routine EPISF.
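      USE EVFSF_INT
      USE EPISF_INT
      USE WRRRN_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, MXEVAL, N, J, I
      PARAMETER  (MXEVAL=3, N=3, LDA=N, LDEVEC=N)
!
      INTEGER    NEVAL, NOUT
      REAL       A(LDA,N), EHIGH, ELOW, EVAL(MXEVAL), &
                 EVEC(LDEVEC,MXEVAL), PI
!                                 Compute Hilbert matrix
      DO 20 J=1, N
         DO 10 I=1, N
            A(I,J) = 1.0/REAL(I+J-1)
   10    CONTINUE
   20 CONTINUE
!                                 Find eigenvalues and vectors
      ELOW  = 0.001
      EHIGH = 1.0
      CALL EVFSF (MXEVAL, A, ELOW, EHIGH, NEVAL, EVAL, EVEC)
!                                 Compute performance index
      PI = EPISF(NEVAL,A,EVAL,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,'(/,A,I2)') ' NEVAL = ', NEVAL
      CALL WRRRN ('EVAL', EVAL, 1, NEVAL, 1)
      CALL WRRRN ('EVEC', EVEC, N, NEVAL, LDEVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END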
Output
NEVAL =  2

      EVAL
     1        2
0.1223   0.0027

          EVEC
         1         2
1  -0.5474   -0.1277
2   0.5283    0.7137
3   0.6490   -0.6887

Performance index =  0.008
EPISF
This function computes the performance index for a real symmetric eigensystem.
Required Arguments
NEVAL Number of eigenvalue/eigenvector pairs on which the performance index
computation is based. (Input)
A Symmetric matrix of order N. (Input)
EVAL Vector of length NEVAL containing eigenvalues of A. (Input)
EVEC N by NEVAL array containing eigenvectors of A. (Input)
The eigenvector corresponding to the eigenvalue EVAL(J) must be in the J-th column
of EVEC.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let M = NEVAL, $\lambda$ = EVAL, $x_j$ = EVEC(*,J), the j-th column of EVEC. Also, let $\varepsilon$ be the machine
precision, given by AMACH(4), see the Reference chapter of this manual. The performance index, $\tau$,
is defined to be
$$\tau = \max_{1 \le j \le M} \frac{\| A x_j - \lambda_j x_j \|_1}{10 N \varepsilon \| A \|_1 \| x_j \|_1}$$
While the exact value of $\tau$ is highly machine dependent, the performance of EVCSF is considered
excellent if $\tau < 1$, good if $1 \le \tau \le 100$, and poor if $\tau > 100$. The performance index was first
developed by the EISPACK project at Argonne National Laboratory; see Smith et al. (1976, pages
124-125).
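As a rough worked illustration of how the index scales, consider the 4 by 4 matrix used in the EVLSF example, whose largest absolute column sum is 6 + 4 + 4 + 1 = 15. Suppose, hypothetically, that a computed eigenvector $x_j$ is exact but its eigenvalue carries an error $\delta$. The value used for $\varepsilon$ below is an assumption (IEEE single precision); AMACH(4) may differ on a given machine.

```latex
\begin{align*}
A x_j - (\lambda_j + \delta) x_j &= -\delta\, x_j
  && \text{(since } A x_j = \lambda_j x_j\text{)} \\
\frac{\| -\delta\, x_j \|_1}{10 N \varepsilon \| A \|_1 \| x_j \|_1}
  &= \frac{|\delta|}{10 N \varepsilon \| A \|_1}
   = \frac{|\delta|}{10 \cdot 4 \cdot 15\, \varepsilon}
   = \frac{|\delta|}{600\, \varepsilon} \\
\frac{|\delta|}{600\, \varepsilon} < 1
  &\iff |\delta| < 600\, \varepsilon \approx 7 \times 10^{-5}
  && \text{(assuming } \varepsilon \approx 1.2 \times 10^{-7}\text{)}
\end{align*}
```

Under these assumptions, an eigenvalue error below roughly $7 \times 10^{-5}$ for this pair would still fall in the "excellent" range, which shows that the index measures the residual relative to the size of the matrix and the working precision.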
Comments
1.
2.
Informational errors
Type  Code
3     1
3     2
3     3
Example
For an example of EPISF, see routine EVCSF.
EVLSB
Computes all of the eigenvalues of a real symmetric matrix in band symmetric storage mode.
Required Arguments
A Band symmetric matrix of order N. (Input)
NCODA Number of codiagonals in A. (Input)
EVAL Vector of length N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVLSB computes the eigenvalues of a real band symmetric matrix. Orthogonal similarity
transformations are used to reduce the matrix to an equivalent symmetric tridiagonal matrix. The
implicit QL algorithm is used to compute the eigenvalues of the resulting tridiagonal matrix.
The reduction routine is based on the EISPACK routine BANDR; see Garbow et al. (1977). The QL
routine is based on the EISPACK routine IMTQL1; see Smith et al. (1976).
Comments
1.
2.
Informational error
Type
Code
4
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 77). The eigenvalues of this matrix are given by
$$\lambda_k = \left( 1 - 2 \cos \frac{k\pi}{N+1} \right)^2 - 3$$
Since the eigenvalues returned by EVLSB are in decreasing magnitude, the above formula for
$k = 1, \ldots, N$ gives the values in a different order. The eigenvalues of this real band symmetric
matrix are computed and printed.
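The ordering remark can be checked with a small stand-alone sketch that evaluates the closed form $\lambda_k = (1 - 2\cos(k\pi/(N+1)))^2 - 3$ for N = 5 and then sorts the values by decreasing magnitude. The module and function names are hypothetical, and no IMSL routine is called.

```fortran
! Sketch only: evaluates the closed-form eigenvalues, then reorders them.
module evlsb_formula_sketch
  implicit none
contains
  ! Closed-form eigenvalue lambda_k for the matrix of order n.
  pure function lambda_k(k, n) result(lam)
    integer, intent(in) :: k, n
    real :: lam
    real, parameter :: pi = 3.1415927
    lam = (1.0 - 2.0*cos(real(k)*pi/real(n + 1)))**2 - 3.0
  end function lambda_k
end module evlsb_formula_sketch

program demo
  use evlsb_formula_sketch
  implicit none
  integer, parameter :: n = 5
  real :: lam(n), tmp
  integer :: i, j, k
  do k = 1, n
    lam(k) = lambda_k(k, n)
  end do
  print '(a,5f8.3)', ' Formula order: ', lam
  ! Insertion sort by decreasing absolute value.
  do i = 2, n
    tmp = lam(i)
    j = i - 1
    do while (j >= 1)
      if (abs(lam(j)) >= abs(tmp)) exit
      lam(j+1) = lam(j)
      j = j - 1
    end do
    lam(j+1) = tmp
  end do
  print '(a,5f8.3)', ' By magnitude:  ', lam
end program demo
```

In formula order the values come out near -2.464, -3.000, -2.000, 1.000, 4.464; after sorting by magnitude they should agree with the order shown in the example output.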
      USE EVLSB_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N, NCODA
      PARAMETER  (N=5, NCODA=2, LDA=NCODA+1, LDEVEC=N)
!
      REAL       A(LDA,N), EVAL(N)
!                                 Define values of A:
!                                 A = ( -1   2   1         )
!                                     (  2   0   2   1     )
!                                     (  1   2   0   2   1 )
!                                     (      1   2   0   2 )
!                                     (          1   2  -1 )
!                                 Represented in band symmetric
!                                 form this is:
!                                 A = (  0   0   1   1   1 )
!                                     (  0   2   2   2   2 )
!                                     ( -1   0   0   0  -1 )
!
      DATA A/0.0, 0.0, -1.0, 0.0, 2.0, 0.0, 1.0, 2.0, 0.0, 1.0, 2.0, &
          0.0, 1.0, 2.0, -1.0/
!
      CALL EVLSB (A, NCODA, EVAL)
!                                 Print results
      CALL WRRRN ('EVAL', EVAL, 1, N, 1)
      END
Output
                   EVAL
     1        2        3        4        5
 4.464   -3.000   -2.464   -2.000    1.000
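The band symmetric layout shown in the comments of the EVLSB example can be reproduced with a short sketch. It assumes, based only on that example, that element (i, j) with i <= j of the full matrix is stored in row NCODA + 1 + i - j of column j, with unused leading entries set to zero. The module and function names are hypothetical.

```fortran
! Sketch only: packs the full 5 by 5 example matrix into band form.
module band_storage_sketch
  implicit none
contains
  ! Row of the band array holding element (i,j), i <= j, of the
  ! full symmetric matrix when ncoda codiagonals are stored.
  pure integer function band_row(i, j, ncoda)
    integer, intent(in) :: i, j, ncoda
    band_row = ncoda + 1 + i - j
  end function band_row
end module band_storage_sketch

program demo
  use band_storage_sketch
  implicit none
  integer, parameter :: n = 5, ncoda = 2
  real :: full(n,n), band(ncoda+1,n)
  integer :: i, j
  ! Build the full symmetric matrix from the EVLSB example.
  full = 0.0
  do j = 1, n
    if (j < n) then
      full(j,j+1) = 2.0
      full(j+1,j) = 2.0
    end if
    if (j < n - 1) then
      full(j,j+2) = 1.0
      full(j+2,j) = 1.0
    end if
  end do
  full(1,1) = -1.0
  full(n,n) = -1.0
  ! Pack the upper triangle; unused leading entries stay zero.
  band = 0.0
  do j = 1, n
    do i = max(1, j - ncoda), j
      band(band_row(i, j, ncoda), j) = full(i,j)
    end do
  end do
  do i = 1, ncoda + 1
    print '(5f6.1)', band(i,:)
  end do
end program demo
```

The three printed rows should match the band form listed in the example: the second codiagonal in the first row, the first codiagonal in the second row, and the main diagonal in the last row.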
EVCSB
Computes all of the eigenvalues and eigenvectors of a real symmetric matrix in band symmetric
storage mode.
Required Arguments
A Band symmetric matrix of order N. (Input)
NCODA Number of codiagonals in A. (Input)
EVAL Vector of length N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
EVEC Matrix of order N containing the eigenvectors. (Output)
The J-th eigenvector, corresponding to EVAL(J), is stored in the J-th column. Each
vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVCSB computes the eigenvalues and eigenvectors of a real band symmetric matrix.
Orthogonal similarity transformations are used to reduce the matrix to an equivalent symmetric
tridiagonal matrix. These transformations are accumulated. The implicit QL algorithm is used to
compute the eigenvalues and eigenvectors of the resulting tridiagonal matrix.
The reduction routine is based on the EISPACK routine BANDR; see Garbow et al. (1977). The QL
routine is based on the EISPACK routine IMTQL2; see Smith et al. (1976).
Comments
1.
2.
Informational error
Type
Code
4
3.
Example
In this example, a DATA statement is used to set A to a band matrix given by Gregory and Karney
(1969, page 75). The eigenvalues, $\lambda_k$, of this matrix are given by
$$\lambda_k = 16 \sin^4 \left( \frac{k\pi}{2N+2} \right)$$
The eigenvalues and eigenvectors of this real band symmetric matrix are computed and printed.
The performance index is also computed and printed. This serves as a check on the computations.
For more details, see IMSL routine EPISB.
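For reference, the closed form $\lambda_k = 16 \sin^4(k\pi/(2N+2))$ can be evaluated by hand for N = 6, the order used in this example. The values below are rounded to two decimals and are a hand calculation, not output from the library.

```latex
\begin{align*}
\lambda_1 &= 16 \sin^4(\pi/14)  \approx 0.04,  &
\lambda_2 &= 16 \sin^4(2\pi/14) \approx 0.57,  \\
\lambda_3 &= 16 \sin^4(3\pi/14) \approx 2.42,  &
\lambda_4 &= 16 \sin^4(4\pi/14) \approx 5.98,  \\
\lambda_5 &= 16 \sin^4(5\pi/14) \approx 10.54, &
\lambda_6 &= 16 \sin^4(6\pi/14) \approx 14.45.
\end{align*}
```

The formula lists the eigenvalues in increasing order of k, whereas EVCSB returns them in decreasing order of magnitude, so the computed EVAL should start at about 14.45 and end at about 0.04.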
      USE EVCSB_INT
      USE EPISB_INT
      USE UMACH_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N, NCODA
      PARAMETER  (N=6, NCODA=2, LDA=NCODA+1, LDEVEC=N)
!
      INTEGER    NOUT
      REAL       A(LDA,N), EVAL(N), EVEC(LDEVEC,N), PI
!                                 Define values of A:
!                                 A = (  5  -4   1             )
!                                     ( -4   6  -4   1         )
!                                     (  1  -4   6  -4   1     )
!                                     (      1  -4   6  -4   1 )
!                                     (          1  -4   6  -4 )
!                                     (              1  -4   5 )
!                                 Represented in band symmetric
!                                 form this is:
!                                 A = (  0   0   1   1   1   1 )
!                                     (  0  -4  -4  -4  -4  -4 )
!                                     (  5   6   6   6   6   5 )
!
      DATA A/0.0, 0.0, 5.0, 0.0, -4.0, 6.0, 1.0, -4.0, 6.0, 1.0, -4.0, &
          6.0, 1.0, -4.0, 6.0, 1.0, -4.0, 5.0/
!
!                                 Find eigenvalues and vectors
      CALL EVCSB (A, NCODA, EVAL, EVEC)
!                                 Compute performance index
      PI = EPISB(N,A,NCODA,EVAL,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRRRN ('EVAL', EVAL, 1, N, 1)
      CALL WRRRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END
Output
                        EVAL
    1       2       3       4       5       6
14.45   10.54    5.98    2.42    0.57    0.04

                              EVEC
         1         2         3         4         5         6
1  -0.2319   -0.4179   -0.5211    0.5211   -0.4179    0.2319
2   0.4179    0.5211    0.2319    0.2319   -0.5211    0.4179
3  -0.5211   -0.2319    0.4179   -0.4179   -0.2319    0.5211
4   0.5211   -0.2319   -0.4179   -0.4179    0.2319    0.5211
5  -0.4179    0.5211   -0.2319    0.2319    0.5211    0.4179
6   0.2319   -0.4179    0.5211    0.5211    0.4179    0.2319

Performance index =  0.029
EVASB
Computes the largest or smallest eigenvalues of a real symmetric matrix in band symmetric
storage mode.
Required Arguments
NEVAL Number of eigenvalues to be computed. (Input)
A Band symmetric matrix of order N. (Input)
NCODA Number of codiagonals in A. (Input)
SMALL Logical variable. (Input)
If .TRUE., the smallest NEVAL eigenvalues are computed. If .FALSE., the largest NEVAL
eigenvalues are computed.
EVAL Vector of length NEVAL containing the computed eigenvalues in decreasing order of
magnitude. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVASB computes the largest or smallest eigenvalues of a real band symmetric matrix.
Orthogonal similarity transformations are used to reduce the matrix to an equivalent symmetric
tridiagonal matrix. The rational QR algorithm with Newton corrections is used to compute the
extreme eigenvalues of this tridiagonal matrix.
The reduction routine is based on the EISPACK routine BANDR; see Garbow et al. (1977). The QR
routine is based on the EISPACK routine RATQR; see Smith et al. (1976).
Comments
1.
2.
Informational error
Type
Code
3
Example
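The following example is given in Gregory and Karney (1969, page 63). The smallest four
eigenvalues of the matrix
$$A = \left( \begin{array}{ccccccccccc}
5 & 2 & 1 & 1 &   &   &   &   &   &   &   \\
2 & 6 & 3 & 1 & 1 &   &   &   &   &   &   \\
1 & 3 & 6 & 3 & 1 & 1 &   &   &   &   &   \\
1 & 1 & 3 & 6 & 3 & 1 & 1 &   &   &   &   \\
  & 1 & 1 & 3 & 6 & 3 & 1 & 1 &   &   &   \\
  &   & 1 & 1 & 3 & 6 & 3 & 1 & 1 &   &   \\
  &   &   & 1 & 1 & 3 & 6 & 3 & 1 & 1 &   \\
  &   &   &   & 1 & 1 & 3 & 6 & 3 & 1 & 1 \\
  &   &   &   &   & 1 & 1 & 3 & 6 & 3 & 1 \\
  &   &   &   &   &   & 1 & 1 & 3 & 6 & 2 \\
  &   &   &   &   &   &   & 1 & 1 & 2 & 5
\end{array} \right)$$
are computed and printed.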
      USE EVASB_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N, NCODA, NEVAL
      PARAMETER  (N=11, NCODA=3, NEVAL=4, LDA=NCODA+1)
!
      REAL       A(LDA,N), EVAL(NEVAL)
      LOGICAL    SMALL
!                                 Set up matrix in band symmetric
!                                 storage mode
      CALL SSET (N, 6.0, A(4:,1), LDA)
      CALL SSET (N-1, 3.0, A(3:,2), LDA)
      CALL SSET (N-2, 1.0, A(2:,3), LDA)
      CALL SSET (N-3, 1.0, A(1:,4), LDA)
      CALL SSET (NCODA, 0.0, A(1:,1), 1)
      CALL SSET (NCODA-1, 0.0, A(1:,2), 1)
      CALL SSET (NCODA-2, 0.0, A(1:,3), 1)
      A(4,1) = 5.0
      A(4,N) = 5.0
      A(3,2) = 2.0
      A(3,N) = 2.0
!                                 Find the 4 smallest eigenvalues
      SMALL = .TRUE.
      CALL EVASB (NEVAL, A, NCODA, SMALL, EVAL)
!                                 Print results
      CALL WRRRN ('EVAL', EVAL, 1, NEVAL, 1)
      END
Output
             EVAL
    1       2       3       4
4.000   3.172   1.804   0.522
EVESB
Computes the largest or smallest eigenvalues and the corresponding eigenvectors of a real
symmetric matrix in band symmetric storage mode.
Required Arguments
NEVEC Number of eigenvectors to be calculated. (Input)
A Band symmetric matrix of order N. (Input)
NCODA Number of codiagonals in A. (Input)
SMALL Logical variable. (Input)
If .TRUE. , the smallest NEVEC eigenvectors are computed. If .FALSE. , the largest
NEVEC eigenvectors are computed.
EVAL Vector of length NEVEC containing the eigenvalues of A in decreasing order of
magnitude. (Output)
EVEC Real matrix of dimension N by NEVEC. (Output)
The J-th eigenvector, corresponding to EVAL(J), is stored in the J-th column. Each
vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL EVESB (N, NEVEC, A, LDA, NCODA, SMALL, EVAL, EVEC, LDEVEC)
Double:
Description
Routine EVESB computes the largest or smallest eigenvalues and the corresponding eigenvectors
of a real band symmetric matrix. Orthogonal similarity transformations are used to reduce the
matrix to an equivalent symmetric tridiagonal matrix. The rational QR algorithm with Newton
corrections is used to compute the extreme eigenvalues of this tridiagonal matrix. Inverse iteration
and orthogonalization are used to compute the eigenvectors of the given band matrix. The
reduction routine is based on the EISPACK routine BANDR; see Garbow et al. (1977). The QR
routine is based on the EISPACK routine RATQR; see Smith et al. (1976). The inverse iteration and
orthogonalization steps are based on EISPACK routine BANDV using the additional steps given in
Hanson et al. (1990).
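The band symmetric storage mode expected for A can be illustrated with a short, self-contained sketch. It assumes the layout implied by the examples in this chapter: A is held in an (NCODA+1) by N array, with the main diagonal in row NCODA+1 and the k-th codiagonal in row NCODA+1-k, so that element (j-k, j) of the full matrix lands in position (NCODA+1-k, j). The helper name full_to_band is hypothetical and is not an IMSL routine.

```fortran
program band_storage_demo
  implicit none
  integer, parameter :: n = 6, ncoda = 2
  real :: full(n,n), band(ncoda+1,n)
  integer :: i, j

  ! Build the full symmetric matrix used in the EVESB example:
  ! 6 on the diagonal (5 in the two corners), -4 and 1 on the codiagonals.
  full = 0.0
  do j = 1, n
     full(j,j) = 6.0
     if (j > 1) then
        full(j-1,j) = -4.0
        full(j,j-1) = -4.0
     end if
     if (j > 2) then
        full(j-2,j) = 1.0
        full(j,j-2) = 1.0
     end if
  end do
  full(1,1) = 5.0
  full(n,n) = 5.0

  call full_to_band(full, n, ncoda, band)

  ! Print the (NCODA+1) by N band array row by row
  do i = 1, ncoda + 1
     print '(6f6.1)', band(i,:)
  end do

contains

  ! Hypothetical helper (not part of IMSL): copy the upper band of a
  ! full symmetric matrix into band symmetric storage.
  subroutine full_to_band(f, n, ncoda, b)
    integer, intent(in) :: n, ncoda
    real, intent(in)    :: f(n,n)
    real, intent(out)   :: b(ncoda+1,n)
    integer :: j, k
    b = 0.0
    do j = 1, n
       do k = 0, min(ncoda, j-1)
          ! k = 0 is the main diagonal, k = 1 the first codiagonal, ...
          b(ncoda+1-k, j) = f(j-k, j)
       end do
    end do
  end subroutine full_to_band

end program band_storage_demo
```

The three printed rows should match the band symmetric form shown in the example for this routine; the unused leading entries of the codiagonal rows are set to zero.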
Comments
1.
2.
3.
Informational errors
Type
Code
3
Inverse iteration did not converge. Eigenvector is not correct for the
specified eigenvalue.
The eigenvectors have lost orthogonality.
Example
The following example is given in Gregory and Karney (1969, page 75). The largest three
eigenvalues and the corresponding eigenvectors of the matrix are computed and printed.
      USE EVESB_INT
      USE EPISB_INT
      USE UMACH_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N, NCODA, NEVEC
      PARAMETER  (N=6, NCODA=2, NEVEC=3, LDA=NCODA+1, LDEVEC=N)
!
      INTEGER    NOUT
      REAL       A(LDA,N), EVAL(NEVEC), EVEC(LDEVEC,NEVEC), PI
      LOGICAL    SMALL
!                                 Define values of A:
!                                 A = (  5  -4   1             )
!                                     ( -4   6  -4   1         )
!                                     (  1  -4   6  -4   1     )
!                                     (      1  -4   6  -4   1 )
!                                     (          1  -4   6  -4 )
!                                     (              1  -4   5 )
!                                 Represented in band symmetric
!                                 form this is:
!                                 A = (  0   0   1   1   1   1 )
!                                     (  0  -4  -4  -4  -4  -4 )
!                                     (  5   6   6   6   6   5 )
!
      DATA A/0.0, 0.0, 5.0, 0.0, -4.0, 6.0, 1.0, -4.0, 6.0, 1.0, -4.0, &
          6.0, 1.0, -4.0, 6.0, 1.0, -4.0, 5.0/
!
!
!
!
!
Output
          EVAL
     1       2       3
 14.45   10.54    5.98

             EVEC
         1        2        3
1   0.2319  -0.4179   0.5211
2  -0.4179   0.5211  -0.2319
3   0.5211  -0.2319  -0.4179
4  -0.5211  -0.2319   0.4179
5   0.4179   0.5211   0.2319
6  -0.2319  -0.4179  -0.5211

Performance index =  0.175
EVBSB
Computes the eigenvalues in a given interval of a real symmetric matrix stored in band symmetric
storage mode.
Required Arguments
MXEVAL Maximum number of eigenvalues to be computed. (Input)
A Band symmetric matrix of order N. (Input)
NCODA Number of codiagonals in A. (Input)
ELOW Lower limit of the interval in which the eigenvalues are sought. (Input)
EHIGH Upper limit of the interval in which the eigenvalues are sought. (Input)
NEVAL Number of eigenvalues found. (Output)
EVAL Real vector of length MXEVAL containing the eigenvalues of A in the interval (ELOW,
EHIGH) in decreasing order of magnitude. (Output)
Only the first NEVAL elements of EVAL are set.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL EVBSB (N, MXEVAL, A, LDA, NCODA, ELOW, EHIGH, NEVAL, EVAL)
Double:
Description
Routine EVBSB computes the eigenvalues in a given range of a real band symmetric matrix.
Orthogonal similarity transformations are used to reduce the matrix to an equivalent symmetric
tridiagonal matrix. A bisection algorithm is used to compute the eigenvalues of the tridiagonal
matrix in a given range.
The reduction routine is based on the EISPACK routine BANDR; see Garbow et al. (1977). The
bisection routine is based on the EISPACK routine BISECT; see Smith et al. (1976).
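The bisection step can be illustrated on a small symmetric tridiagonal matrix. The sketch below is a generic Sturm-sequence bisection written for illustration only; it is not the EISPACK BISECT code and not the IMSL implementation. The test matrix (diagonal 2, off-diagonal -1, order 4), the interval (1, 3), the fixed iteration count, and the function name count_below are all assumptions made for the example.

```fortran
program sturm_bisection_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: d(n), e(n-1), elow, ehigh, lo, hi, mid
  integer  :: nlow, nhigh, k, it

  d = 2.0_dp          ! diagonal of the tridiagonal matrix
  e = -1.0_dp         ! off-diagonal
  elow  = 1.0_dp
  ehigh = 3.0_dp

  ! Number of eigenvalues inside the interval = difference of the counts
  nlow  = count_below(d, e, elow)
  nhigh = count_below(d, e, ehigh)
  print '(a,i0)', 'eigenvalues in interval: ', nhigh - nlow

  ! Refine eigenvalue number k (counted from the smallest) by bisection
  do k = nlow + 1, nhigh
     lo = elow
     hi = ehigh
     do it = 1, 60
        mid = 0.5_dp*(lo + hi)
        if (count_below(d, e, mid) >= k) then
           hi = mid      ! k-th eigenvalue lies below mid
        else
           lo = mid      ! k-th eigenvalue lies at or above mid
        end if
     end do
     print '(a,i0,a,f10.6)', 'eigenvalue ', k, ' = ', 0.5_dp*(lo + hi)
  end do

contains

  ! Count the eigenvalues of the tridiagonal matrix (d, e) that are less
  ! than x, using the signs of the pivots of T - x*I.
  integer function count_below(d, e, x) result(cnt)
    real(dp), intent(in) :: d(:), e(:), x
    real(dp) :: q
    integer  :: i
    cnt = 0
    q = d(1) - x
    if (q < 0.0_dp) cnt = cnt + 1
    do i = 2, size(d)
       if (q == 0.0_dp) q = epsilon(q)   ! guard against division by zero
       q = d(i) - x - e(i-1)**2/q
       if (q < 0.0_dp) cnt = cnt + 1
    end do
  end function count_below

end program sturm_bisection_demo
```

For this test matrix the eigenvalues are 2 - 2 cos(k pi/5), k = 1, ..., 4, so two of them (approximately 1.381966 and 2.618034) fall inside (1, 3). The count at each endpoint is what allows a routine of this kind to report the number of eigenvalues found before any of them is refined.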
Comments
1.
2.
Informational error
Type
Code
3
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 77). The eigenvalues in the range (-2.5, 1.5) are computed and printed. As a test, this
example uses MXEVAL = 5. The routine EVBSB computes NEVAL, the number of eigenvalues in the
given range, which here has the value 3.
      USE EVBSB_INT
      USE UMACH_INT
      USE WRRRN_INT
!
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, MXEVAL, N, NCODA
      PARAMETER  (MXEVAL=5, N=5, NCODA=2, LDA=NCODA+1)
!
      INTEGER    NEVAL, NOUT
      REAL       A(LDA,N), EHIGH, ELOW, EVAL(MXEVAL)
!                                 Define values of A:
!                                 A = ( -1   2   1         )
!                                     (  2   0   2   1     )
!                                     (  1   2   0   2   1 )
!                                     (      1   2   0   2 )
!                                     (          1   2  -1 )
!                                 Represented in band symmetric
!                                 form this is:
!                                 A = (  0   0   1   1   1 )
!                                     (  0   2   2   2   2 )
!                                     ( -1   0   0   0  -1 )
!
      DATA A/0.0, 0.0, -1.0, 0.0, 2.0, 0.0, 1.0, 2.0, 0.0, 1.0, 2.0, &
          0.0, 1.0, 2.0, -1.0/
!
      ELOW  = -2.5
      EHIGH = 1.5
      CALL EVBSB (MXEVAL, A, NCODA, ELOW, EHIGH, NEVAL, EVAL)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,'(/,A,I1)') ' NEVAL = ', NEVAL
      CALL WRRRN ('EVAL', EVAL, 1, NEVAL, 1)
      END
Output
NEVAL = 3
          EVAL
     1        2        3
-2.464   -2.000    1.000
EVFSB
Computes the eigenvalues in a given interval and the corresponding eigenvectors of a real
symmetric matrix stored in band symmetric storage mode.
Required Arguments
MXEVAL Maximum number of eigenvalues to be computed. (Input)
A Band symmetric matrix of order N. (Input)
NCODA Number of codiagonals in A. (Input)
ELOW Lower limit of the interval in which the eigenvalues are sought. (Input)
EHIGH Upper limit of the interval in which the eigenvalues are sought. (Input)
NEVAL Number of eigenvalues found. (Output)
EVAL Real vector of length MXEVAL containing the eigenvalues of A in the interval (ELOW,
EHIGH) in decreasing order of magnitude. (Output)
Only the first NEVAL elements of EVAL are significant.
EVEC Real matrix containing in its first NEVAL columns the eigenvectors associated with
the eigenvalues found and stored in EVAL. Eigenvector J corresponds to eigenvalue J
for J = 1 to NEVAL. Each vector is normalized to have Euclidean length equal to the
value one. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
CALL EVFSB (MXEVAL, A, NCODA, ELOW, EHIGH, NEVAL, EVAL, EVEC [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL EVFSB (N, MXEVAL, A, LDA, NCODA, ELOW, EHIGH, NEVAL, EVAL, EVEC,
LDEVEC)
Double:
Description
Routine EVFSB computes the eigenvalues in a given range and the corresponding eigenvectors of a
real band symmetric matrix. Orthogonal similarity transformations are used to reduce the matrix to
an equivalent tridiagonal matrix. A bisection algorithm is used to compute the eigenvalues of the
tridiagonal matrix in the required range. Inverse iteration and orthogonalization are used to
compute the eigenvectors of the given band symmetric matrix.
The reduction routine is based on the EISPACK routine BANDR; see Garbow et al. (1977). The
bisection routine is based on the EISPACK routine BISECT; see Smith et al. (1976). The inverse
iteration and orthogonalization steps are based on the EISPACK routine BANDV using remarks
from Hanson et al. (1990).
Comments
1.
CALL E3FSB (N, MXEVAL, A, LDA, NCODA, ELOW, EHIGH, NEVAL, EVAL, EVEC,
LDEVEC, ACOPY, WK1, WK2, IWK)
2.
Informational errors
Type
Code
3
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 75). The eigenvalues in the range [1, 6] and their corresponding eigenvectors are computed
and printed. As a test, this example uses MXEVAL = 4. The routine EVFSB computes NEVAL, the
number of eigenvalues in the given range, which here has the value 2. As a check on the computations, the
performance index is also computed and printed. For more details, see IMSL routine EPISB.
      USE EVFSB_INT
      USE EPISB_INT
      USE WRRRN_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, MXEVAL, N, NCODA
      PARAMETER  (MXEVAL=4, N=6, NCODA=2, LDA=NCODA+1, LDEVEC=N)
!
      INTEGER    NEVAL, NOUT
      REAL       A(LDA,N), EHIGH, ELOW, EVAL(MXEVAL), &
                 EVEC(LDEVEC,MXEVAL), PI
!                                 Define values of A:
!                                 A = (  5  -4   1             )
!                                     ( -4   6  -4   1         )
!                                     (  1  -4   6  -4   1     )
!                                     (      1  -4   6  -4   1 )
!                                     (          1  -4   6  -4 )
!                                     (              1  -4   5 )
!                                 Represented in band symmetric
!                                 form this is:
!                                 A = (  0   0   1   1   1   1 )
!                                     (  0  -4  -4  -4  -4  -4 )
!                                     (  5   6   6   6   6   5 )
!
      DATA A/0.0, 0.0, 5.0, 0.0, -4.0, 6.0, 1.0, -4.0, 6.0, 1.0, -4.0, &
          6.0, 1.0, -4.0, 6.0, 1.0, -4.0, 5.0/
!
!
!
!
Output
NEVAL = 2

      EVAL
    1       2
5.978   2.418

        EVEC
         1        2
1   0.5211   0.5211
2  -0.2319   0.2319
3  -0.4179  -0.4179
4   0.4179  -0.4179
5   0.2319   0.2319
6  -0.5211   0.5211

Performance index =  0.083
EPISB
This function computes the performance index for a real symmetric eigensystem in band
symmetric storage mode.
Function Return Value
EPISB Performance index. (Output)
Required Arguments
NEVAL Number of eigenvalue/eigenvector pairs on which the performance is based.
(Input)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let M = NEVAL, λ = EVAL, x_j = EVEC(*,J), the j-th column of EVEC. Also, let ε be the machine
precision, given by AMACH(4); see the Reference chapter of the manual. The performance index, τ,
is defined to be
\[ \tau = \max_{1 \le j \le M} \frac{\| A x_j - \lambda_j x_j \|}{10 N \varepsilon \, \| A \| \, \| x_j \|} \]
While the exact value of τ is highly machine dependent, the performance of EVCSF is considered
excellent if τ < 1, good if 1 ≤ τ ≤ 100, and poor if τ > 100. The performance index was first
developed by the EISPACK project at Argonne National Laboratory; see Smith et al. (1976, pages
124-125).
Comments
1.
2.
Informational errors
Type
Code
3
3
3
1
2
3
Example
For an example of EPISB, see IMSL routine EVCSB.
EVLHF
Computes all of the eigenvalues of a complex Hermitian matrix.
Required Arguments
A Complex Hermitian matrix of order N. (Input)
Only the upper triangle is used.
EVAL Real vector of length N containing the eigenvalues of A in decreasing order
of magnitude. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVLHF computes the eigenvalues of a complex Hermitian matrix. Unitary similarity
transformations are used to reduce the matrix to an equivalent real symmetric tridiagonal matrix.
The implicit QL algorithm is used to compute the eigenvalues of this tridiagonal matrix.
The underlying code is based on either EISPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation, see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
2.
3.
Informational errors
Type
Code
3
4
4
1
2
This option uses eight values to solve memory bank conflict (access
inefficiency) problems. In routine E3LHF, the internal or working leading
dimensions of ACOPY and ECOPY are both increased by IVAL(3) when N is a
multiple of IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 114). The eigenvalues of this complex Hermitian matrix are computed and printed.
      USE EVLHF_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (N=2, LDA=N)
!
      REAL       EVAL(N)
      COMPLEX    A(LDA,N)
!                                 Set values of A
!
!                                 A = ( 1  -i )
!                                     ( i   1 )
!
      DATA A/(1.0,0.0), (0.0,1.0), (0.0,-1.0), (1.0,0.0)/
!
!                                 Find eigenvalues of A
      CALL EVLHF (A, EVAL)
!                                 Print results
      CALL WRRRN ('EVAL', EVAL, 1, N, 1)
      END
Output
      EVAL
    1       2
2.000   0.000
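The printed eigenvalues can be checked by hand. A short derivation, assuming the 2 by 2 Hermitian matrix of this example has 1 on the diagonal and off-diagonal entries -i and i:

```latex
\[
A = \begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix},
\qquad
\det(A - \lambda I) = (1-\lambda)^2 - (-i)(i) = (1-\lambda)^2 - 1 = \lambda(\lambda - 2).
\]
\[
\text{Hence } \lambda_1 = 2, \quad \lambda_2 = 0 .
\]
```

Both roots are real, as expected for a Hermitian matrix, and they appear in decreasing order of magnitude in EVAL.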
EVCHF
Computes all of the eigenvalues and eigenvectors of a complex Hermitian matrix.
Required Arguments
A Complex Hermitian matrix of order N. (Input)
Only the upper triangle is used.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVCHF computes the eigenvalues and eigenvectors of a complex Hermitian matrix.
Unitary similarity transformations are used to reduce the matrix to an equivalent real symmetric
tridiagonal matrix. The implicit QL algorithm is used to compute the eigenvalues and eigenvectors
of this tridiagonal matrix. These eigenvectors and the transformations used to reduce the matrix to
tridiagonal form are combined to obtain the eigenvectors for the user's problem. The underlying
code is based on either EISPACK or LAPACK code depending upon which supporting libraries
are used during linking. For a detailed explanation, see Using ScaLAPACK, LAPACK,
LINPACK, and EISPACK in the Introduction section of this manual.
Comments
1.
CALL E5CHF (N, A, LDA, EVAL, EVEC, LDEVEC, ACOPY, RWK, CWK, IWK)
2.
Informational error
Type
Code
3
4
4
1
2
3.
4.
This option uses eight values to solve memory bank conflict (access
inefficiency) problems. In routine E5CHF, the internal or working leading
dimensions of ACOPY and ECOPY are both increased by IVAL(3) when N is a
multiple of IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced
by IVAL(1) and IVAL(2), respectively, in routine EVCHF. Additional memory
allocation and option value restoration are automatically done in EVCHF. There
is no requirement that users change existing applications that use EVCHF or
E5CHF. Default values for the option are IVAL(*) = 1, 16, 0, 1, 1, 16, 0, 1. Items
5-8 in IVAL(*) are for the generalized eigenvalue problem and are not used in
EVCHF.
Example
In this example, a DATA statement is used to set A to a complex Hermitian matrix. The eigenvalues
and eigenvectors of this matrix are computed and printed. The performance index is also
computed and printed. This serves as a check on the computations; for more details, see routine
EPIHF.
      USE IMSL_libraries

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N
      PARAMETER  (N=3, LDA=N, LDEVEC=N)
!
      INTEGER    NOUT
      REAL       EVAL(N), PI
      COMPLEX    A(LDA,N), EVEC(LDEVEC,N)
!                                 Set values of A
!
!                                 A = ((1, 0)   ( 1,-7i)   ( 0,- i))
!                                     ((1,7i)   ( 5,  0)   (10,-3i))
!                                     ((0, i)   (10, 3i)   (-2,  0))
!
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRRRN ('EVAL', EVAL, 1, N, 1)
      CALL WRCRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END
Output
          EVAL
     1        2        3
 15.38   -10.63    -0.75

                           EVEC
                   1                  2                  3
1  ( 0.0631,-0.4075)  (-0.0598,-0.3117)  ( 0.8539, 0.0000)
2  ( 0.7703, 0.0000)  (-0.5939, 0.1841)  (-0.0313,-0.1380)
3  ( 0.4668, 0.1366)  ( 0.7160, 0.0000)  ( 0.0808,-0.4942)

Performance index =  0.093
EVAHF
Computes the largest or smallest eigenvalues of a complex Hermitian matrix.
Required Arguments
NEVAL Number of eigenvalues to be calculated. (Input)
A Complex Hermitian matrix of order N. (Input)
Only the upper triangle is used.
SMALL Logical variable. (Input)
If .TRUE., the smallest NEVAL eigenvalues are computed. If .FALSE., the largest
NEVAL eigenvalues are computed.
EVAL Real vector of length N containing the extreme eigenvalues of A in decreasing order
of magnitude in the first NEVAL elements. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVAHF computes the largest or smallest eigenvalues of a complex Hermitian matrix.
Unitary transformations are used to reduce the matrix to an equivalent symmetric tridiagonal
matrix. The rational QR algorithm with Newton corrections is used to compute the extreme
eigenvalues of this tridiagonal matrix.
The reduction routine is based on the EISPACK routine HTRIDI. The QR routine is based on the
EISPACK routine RATQR. See Smith et al. (1976) for the EISPACK routines.
Comments
1.
2.
Informational errors
Type
Code
3
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 114). Its largest eigenvalue is computed and printed.
      USE EVAHF_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (N=2, LDA=N)
!
      INTEGER    NEVAL
      REAL       EVAL(N)
      COMPLEX    A(LDA,N)
      LOGICAL    SMALL
!                                 Set values of A
!
!                                 A = ( 1  -i )
!                                     ( i   1 )
!
Output
EVAL
2.000
EVEHF
Computes the largest or smallest eigenvalues and the corresponding eigenvectors of a complex
Hermitian matrix.
Required Arguments
NEVEC Number of eigenvectors to be computed. (Input)
A Complex Hermitian matrix of order N. (Input)
Only the upper triangle is used.
SMALL Logical variable. (Input)
If .TRUE., the smallest NEVEC eigenvectors are computed. If .FALSE., the largest
NEVEC eigenvectors are computed.
EVAL Real vector of length NEVEC containing the eigenvalues of A in decreasing order of
magnitude. (Output)
EVEC Complex matrix of dimension N by NEVEC. (Output)
The J-th eigenvector corresponding to EVAL(J), is stored in the J-th column. Each
vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVEHF computes the largest or smallest eigenvalues and the corresponding eigenvectors
of a complex Hermitian matrix. Unitary transformations are used to reduce the matrix to an
equivalent real symmetric tridiagonal matrix. The rational QR algorithm with Newton corrections
is used to compute the extreme eigenvalues of the tridiagonal matrix. Inverse iteration is used to
compute the eigenvectors of the tridiagonal matrix. Eigenvectors of the original matrix are found
by back transforming the eigenvectors of the tridiagonal matrix.
The reduction routine is based on the EISPACK routine HTRIDI. The QR routine used is based on
the EISPACK routine RATQR. The inverse iteration routine is based on the EISPACK routine
TINVIT. The back transformation routine is based on the EISPACK routine HTRIBK. See Smith et
al. (1976) for the EISPACK routines.
Comments
1.
2.
Informational errors
Type
Code
3
3.
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 115). The smallest eigenvalue and its corresponding eigenvector are computed and printed.
The performance index is also computed and printed. This serves as a check on the computations.
For more details, see IMSL routine EPIHF.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N, NEVEC
      PARAMETER  (N=3, NEVEC=1, LDA=N, LDEVEC=N)
!
      INTEGER    NOUT
      REAL       EVAL(N), PI
      COMPLEX    A(LDA,N), EVEC(LDEVEC,NEVEC)
      LOGICAL    SMALL
!                                 Set values of A
!
!                                 A = (  2  -i   0 )
!                                     (  i   2   0 )
!                                     (  0   0   3 )
!
!
Output
  EVAL
 1.000

          EVEC
                   1
1  ( 0.0000, 0.7071)
2  ( 0.7071, 0.0000)
3  ( 0.0000, 0.0000)

Performance index =  0.031
EVBHF
Computes the eigenvalues in a given range of a complex Hermitian matrix.
Required Arguments
MXEVAL Maximum number of eigenvalues to be computed. (Input)
A Complex Hermitian matrix of order N. (Input)
Only the upper triangle is used.
ELOW Lower limit of the interval in which the eigenvalues are sought. (Input)
EHIGH Upper limit of the interval in which the eigenvalues are sought. (Input)
NEVAL Number of eigenvalues found. (Output)
EVAL Real vector of length MXEVAL containing the eigenvalues of A in the interval (ELOW,
EHIGH) in decreasing order of magnitude. (Output)
Only the first NEVAL elements of EVAL are significant.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVBHF computes the eigenvalues in a given range of a complex Hermitian matrix. Unitary
transformations are used to reduce the matrix to an equivalent symmetric tridiagonal matrix. A
bisection algorithm is used to compute the eigenvalues in the given range of this tridiagonal
matrix.
The reduction routine is based on the EISPACK routine HTRIDI. The bisection routine used is
based on the EISPACK routine BISECT. See Smith et al. (1976) for the EISPACK routines.
Comments
1.
2.
Informational errors
Type
Code
3
Example
In this example, a DATA statement is used to set A to a matrix given by Gregory and Karney (1969,
page 114). The eigenvalues in the range [1.5, 2.5] are computed and printed. This example allows
a maximum number of eigenvalues MXEVAL = 2. The routine computes that there is one eigenvalue
in the given range. This value is returned in NEVAL.
      USE EVBHF_INT
      USE UMACH_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, MXEVAL, N
      PARAMETER  (MXEVAL=2, N=2, LDA=N)
!
      INTEGER    NEVAL, NOUT
      REAL       EHIGH, ELOW, EVAL(MXEVAL)
      COMPLEX    A(LDA,N)
!                                 Set values of A
!
!                                 A = ( 1  -i )
!                                     ( i   1 )
!
      DATA A/(1.0,0.0), (0.0,1.0), (0.0,-1.0), (1.0,0.0)/
!
!                                 Find eigenvalue
      ELOW  = 1.5
      EHIGH = 2.5
      CALL EVBHF (MXEVAL, A, ELOW, EHIGH, NEVAL, EVAL)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,'(/,A,I3)') ' NEVAL = ', NEVAL
      CALL WRRRN ('EVAL', EVAL, 1, NEVAL, 1)
      END
Output
 NEVAL =   1

  EVAL
 2.000
EVFHF
Computes the eigenvalues in a given range and the corresponding eigenvectors of a complex
Hermitian matrix.
Required Arguments
MXEVAL Maximum number of eigenvalues to be computed. (Input)
A Complex Hermitian matrix of order N. (Input)
Only the upper triangle is used.
ELOW Lower limit of the interval in which the eigenvalues are sought. (Input)
EHIGH Upper limit of the interval in which the eigenvalues are sought. (Input)
NEVAL Number of eigenvalues found. (Output)
EVAL Real vector of length MXEVAL containing the eigenvalues of A in the interval (ELOW,
EHIGH) in decreasing order of magnitude. (Output)
Only the first NEVAL elements of EVAL are significant.
EVEC Complex matrix containing in its first NEVAL columns the eigenvectors associated
with the eigenvalues found stored in EVAL. Each vector is normalized to have
Euclidean length equal to the value one. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL EVFHF (N, MXEVAL, A, LDA, ELOW, EHIGH, NEVAL, EVAL, EVEC,
LDEVEC)
Double:
Description
Routine EVFHF computes the eigenvalues in a given range and the corresponding eigenvectors of a
complex Hermitian matrix. Unitary transformations are used to reduce the matrix to an equivalent
symmetric tridiagonal matrix. A bisection algorithm is used to compute the eigenvalues in the
given range of this tridiagonal matrix. Inverse iteration is used to compute the eigenvectors of the
tridiagonal matrix. The eigenvectors of the original matrix are computed by back transforming the
eigenvectors of the tridiagonal matrix.
The reduction routine is based on the EISPACK routine HTRIDI. The bisection routine is based on
the EISPACK routine BISECT. The inverse iteration routine is based on the EISPACK routine
TINVIT. The back transformation routine is based on the EISPACK routine HTRIBK. See Smith et
al. (1976) for the EISPACK routines.
Comments
1.
2.
Informational errors
Type
Code
3
Example
In this example, a DATA statement is used to set A to a complex Hermitian matrix. The eigenvalues
in the range [-15, 0] and their corresponding eigenvectors are computed and printed. As a test, this
example uses MXEVAL = 3. The routine EVFHF computes the number of eigenvalues in the given
range. That value, NEVAL, is two. As a check on the computations, the performance index is also
computed and printed. For more details, see routine EPIHF.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, MXEVAL, N
      PARAMETER  (MXEVAL=3, N=3, LDA=N, LDEVEC=N)
!
      INTEGER    NEVAL, NOUT
      REAL       EHIGH, ELOW, EVAL(MXEVAL), PI
      COMPLEX    A(LDA,N), EVEC(LDEVEC,MXEVAL)
!                                 Set values of A
!
!                                 A = ((1, 0)   ( 1,-7i)   ( 0,- i))
!                                     ((1,7i)   ( 5,  0)   (10,-3i))
!                                     ((0, i)   (10, 3i)   (-2,  0))
!
!
Output
 NEVAL =   2

       EVAL
      1        2
 -10.63    -0.75

                  EVEC
                   1                  2
1  (-0.0598,-0.3117)  ( 0.8539, 0.0000)
2  (-0.5939, 0.1841)  (-0.0313,-0.1380)
3  ( 0.7160, 0.0000)  ( 0.0808,-0.4942)

Performance index =  0.057
EPIHF
This function computes the performance index for a complex Hermitian eigensystem.
Required Arguments
NEVAL Number of eigenvalue/eigenvector pairs on which the performance index
computation is based. (Input)
A Complex Hermitian matrix of order N. (Input)
EVAL Vector of length NEVAL containing eigenvalues of A. (Input)
EVEC Complex N by NEVAL array containing eigenvectors of A. (Input)
The eigenvector corresponding to the eigenvalue EVAL(J) must be in the J-th column
of EVEC.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let M = NEVAL, λ = EVAL, x_j = EVEC(*, J), the j-th column of EVEC. Also, let ε be the machine
precision, given by AMACH(4); see the Reference chapter of this manual. The performance index, τ,
is defined to be
\[ \tau = \max_{1 \le j \le M} \frac{\| A x_j - \lambda_j x_j \|_1}{10 N \varepsilon \, \| A \|_1 \, \| x_j \|_1} \]
The norms used are a modified form of the 1-norm. The norm of the complex vector v is
\[ \| v \|_1 = \sum_{i=1}^{N} \left\{ |\Re v_i| + |\Im v_i| \right\} \]
While the exact value of τ is highly machine dependent, the performance of EVCSF is considered
excellent if τ < 1, good if 1 ≤ τ ≤ 100, and poor if τ > 100. The performance index was first
developed by the EISPACK project at Argonne National Laboratory; see Smith et al. (1976, pages
124-125).
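The definition of the performance index can be made concrete with a small stand-alone computation. This is only a sketch under stated assumptions, not the EPIHF implementation: the matrix norm is taken to be the maximum column sum under the modified 1-norm, EPSILON(1.0) stands in for AMACH(4), the matrix is passed in full storage, and the function name norm1 is hypothetical. The data are the 2 by 2 Hermitian matrix with diagonal 1 and off-diagonal entries -i and i, together with its exact eigenpairs.

```fortran
program perf_index_demo
  implicit none
  integer, parameter :: n = 2, m = 2
  complex :: a(n,n), evec(n,m), r(n)
  real    :: eval(m), tau, anorm, s
  integer :: j

  ! A = ( 1 -i ; i 1 ), stored column by column
  a = reshape([(1.0,0.0), (0.0,1.0), (0.0,-1.0), (1.0,0.0)], [n,n])

  ! Exact eigenpairs: eigenvalue 2 with vector (-i, 1)/sqrt(2),
  ! eigenvalue 0 with vector (i, 1)/sqrt(2)
  eval = [2.0, 0.0]
  s = 1.0/sqrt(2.0)
  evec(:,1) = [cmplx(0.0, -s), cmplx(s, 0.0)]
  evec(:,2) = [cmplx(0.0,  s), cmplx(s, 0.0)]

  ! Assumed matrix norm: largest column sum in the modified 1-norm
  anorm = 0.0
  do j = 1, n
     anorm = max(anorm, norm1(a(:,j)))
  end do

  ! tau = max over j of ||A x_j - lambda_j x_j|| / (10 N eps ||A|| ||x_j||)
  tau = 0.0
  do j = 1, m
     r = matmul(a, evec(:,j)) - eval(j)*evec(:,j)
     tau = max(tau, norm1(r)/(10.0*n*epsilon(1.0)*anorm*norm1(evec(:,j))))
  end do

  print '(a,f8.3)', ' Performance index = ', tau

contains

  ! Modified 1-norm of a complex vector: sum of |Re v_i| + |Im v_i|
  real function norm1(v)
    complex, intent(in) :: v(:)
    norm1 = sum(abs(real(v)) + abs(aimag(v)))
  end function norm1

end program perf_index_demo
```

Because the eigenpairs supplied here are exact up to rounding, the residuals are at the level of machine precision and the resulting index falls well inside the range the text classifies as excellent.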
Comments
1.
2.
Informational errors
Type
Code
3
3
3
1
2
3
Example
For an example of EPIHF, see IMSL routine EVCHF.
EVLRH
Computes all of the eigenvalues of a real upper Hessenberg matrix.
Required Arguments
A Real upper Hessenberg matrix of order N. (Input)
EVAL Complex vector of length N containing the eigenvalues in decreasing order of
magnitude. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVLRH computes the eigenvalues of a real upper Hessenberg matrix by using the QR
algorithm. The QR algorithm routine is based on the EISPACK routine HQR; see Smith et al. (1976).
Comments
1.
2.
Informational error
Type
Code
4
Example
In this example, a DATA statement is used to set A to an upper Hessenberg matrix of integers. The
eigenvalues of this matrix are computed and printed.
      USE EVLRH_INT
      USE UMACH_INT
      USE WRCRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (N=4, LDA=N)
!
      INTEGER    NOUT
      REAL       A(LDA,N)
      COMPLEX    EVAL(N)
!                                 Set values of A
!
!                                 A = (  2.0   1.0   3.0   4.0  )
!                                     (  1.0   0.0   0.0   0.0  )
!                                     (        1.0   0.0   0.0  )
!                                     (              1.0   0.0  )
!
      DATA A/2.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 3.0, 0.0, 0.0, &
          1.0, 4.0, 0.0, 0.0, 0.0/
!
!                                 Find eigenvalues of A
      CALL EVLRH (A, EVAL)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      END
Output
                              EVAL
              1                2                3                4
( 2.878, 0.000)  ( 0.011, 1.243)  ( 0.011,-1.243)  (-0.900, 0.000)
EVCRH
Computes all of the eigenvalues and eigenvectors of a real upper Hessenberg matrix.
Required Arguments
A Real upper Hessenberg matrix of order N. (Input)
EVAL Complex vector of length N containing the eigenvalues in decreasing order of
magnitude. (Output)
EVEC Complex matrix of order N. (Output)
The J-th eigenvector, corresponding to EVAL(J), is stored in the J-th column. Each
vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVCRH computes the eigenvalues and eigenvectors of a real upper Hessenberg matrix by
using the QR algorithm. The QR algorithm routine is based on the EISPACK routine HQR2; see
Smith et al. (1976).
Comments
1.
2.
Informational error
Type
Code
4
Example
In this example, a DATA statement is used to set A to a Hessenberg matrix with integer entries. The
values are returned in decreasing order of magnitude. The eigenvalues, eigenvectors and
performance index of this matrix are computed and printed. See routine EPIRG for details.
      USE EVCRH_INT
      USE EPIRG_INT
      USE UMACH_INT
      USE WRCRN_INT
!
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N
      PARAMETER  (N=4, LDA=N, LDEVEC=N)
!
      INTEGER    NOUT
      REAL       A(LDA,N), PI
      COMPLEX    EVAL(N), EVEC(LDEVEC,N)
!                                 Define values of A:
!
!                                 A = ( -1.0  -1.0  -1.0  -1.0  )
!                                     (  1.0   0.0   0.0   0.0  )
!                                     (        1.0   0.0   0.0  )
!                                     (              1.0   0.0  )
!
      DATA A/-1.0, 1.0, 0.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 0.0, &
          1.0, -1.0, 0.0, 0.0, 0.0/
!
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      CALL WRCRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END
Output
                                    EVAL
                1                  2                  3                  4
(-0.8090, 0.5878)  (-0.8090,-0.5878)  ( 0.3090, 0.9511)  ( 0.3090,-0.9511)

                                    EVEC
                   1                  2                  3                  4
1  (-0.4045, 0.2939)  (-0.4045,-0.2939)  (-0.4045,-0.2939)  (-0.4045, 0.2939)
2  ( 0.5000, 0.0000)  ( 0.5000, 0.0000)  (-0.4045, 0.2939)  (-0.4045,-0.2939)
3  (-0.4045,-0.2939)  (-0.4045, 0.2939)  ( 0.1545, 0.4755)  ( 0.1545,-0.4755)
4  ( 0.1545, 0.4755)  ( 0.1545,-0.4755)  ( 0.5000, 0.0000)  ( 0.5000, 0.0000)

Performance index =  0.098
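These values can be verified analytically. The matrix defined by the DATA statement of this example has -1 in every entry of its first row and ones on the subdiagonal, so it is a companion matrix; the derivation below is a check under that reading of the data.

```latex
\[
A = \begin{pmatrix}
-1 & -1 & -1 & -1 \\
 1 &  0 &  0 &  0 \\
 0 &  1 &  0 &  0 \\
 0 &  0 &  1 &  0
\end{pmatrix},
\qquad
\det(\lambda I - A) = \lambda^4 + \lambda^3 + \lambda^2 + \lambda + 1
                    = \frac{\lambda^5 - 1}{\lambda - 1}.
\]
\[
\lambda = e^{\pm 2\pi i/5} \approx 0.3090 \pm 0.9511\,i,
\qquad
\lambda = e^{\pm 4\pi i/5} \approx -0.8090 \pm 0.5878\,i .
\]
\[
\text{For each root, } x \propto (\lambda^3,\ \lambda^2,\ \lambda,\ 1)^T
\text{ satisfies } A x = \lambda x, \text{ and } |\lambda| = 1
\text{ gives } |x_i| = \tfrac12 \text{ after normalization.}
\]
```

The eigenvalues are therefore the four primitive fifth roots of unity, and every component of each normalized eigenvector has modulus 0.5, in agreement with the printed EVEC.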
EVLCH
Computes all of the eigenvalues of a complex upper Hessenberg matrix.
Required Arguments
A Complex upper Hessenberg matrix of order N. (Input)
EVAL Complex vector of length N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVLCH computes the eigenvalues of a complex upper Hessenberg matrix using the QR
algorithm. This routine is based on the EISPACK routine COMQR2; see Smith et al. (1976).
Comments
1.
2.
Informational error
Type
Code
4
Example
In this example, a DATA statement is used to set the matrix A. The program computes and prints the
eigenvalues of this matrix.
      USE EVLCH_INT
      USE WRCRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (N=4, LDA=N)
      COMPLEX    A(LDA,N), EVAL(N)
!                                 Set values of A
!
!                                 A = (5+9i  5+5i   -6-6i  -7-7i)
!                                     (3+3i  6+10i  -5-5i  -6-6i)
!                                     ( 0    3+3i   -1+3i  -5-5i)
!                                     ( 0     0     -3-3i    4i)
!
!                                 Print results
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      END
Output
                                EVAL
               1                 2                 3                 4
(  8.22, 12.22)   (  3.40,  7.40)   (  1.60,  5.60)   ( -3.22,  0.78)
EVCCH
Computes all of the eigenvalues and eigenvectors of a complex upper Hessenberg matrix.
Required Arguments
A Complex upper Hessenberg matrix of order N. (Input)
EVAL Complex vector of length N containing the eigenvalues of A in decreasing order of
magnitude. (Output)
EVEC Complex matrix of order N. (Output)
The J-th eigenvector, corresponding to EVAL(J), is stored in the J-th column. Each
vector is normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine EVCCH computes the eigenvalues and eigenvectors of a complex upper Hessenberg matrix
using the QR algorithm. This routine is based on the EISPACK routine COMQR2; see Smith et al.
(1976).
Comments
1.
Informational error
Type
Code
4
3.
The results of EVCCH can be checked using EPICG. This requires that the matrix A
explicitly contain the zeros in A(I, J) for (I - 1) > J, which are assumed by EVCCH.
Example
In this example, a DATA statement is used to set the matrix A. The program computes the
eigenvalues and eigenvectors of this matrix. The performance index is also computed and printed.
This serves as a check on the computations; for more details, see IMSL routine EPICG. The zeros
in the lower part of the matrix are not referenced by EVCCH, but they are required by EPICG.
      USE EVCCH_INT
      USE EPICG_INT
      USE UMACH_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDEVEC, N
      PARAMETER  (N=4, LDA=N, LDEVEC=N)
!
      INTEGER    NOUT
      REAL       PI
      COMPLEX    A(LDA,N), EVAL(N), EVEC(LDEVEC,N)
!                                 Set values of A
!
!                                 A = (5+9i    5+5i   -6-6i   -7-7i)
!                                     (3+3i   6+10i   -5-5i   -6-6i)
!                                     ( 0      3+3i   -1+3i   -5-5i)
!                                     ( 0       0     -3-3i     4i)
!
      DATA A/(5.0,9.0), (3.0,3.0), (0.0,0.0), (0.0,0.0), (5.0,5.0), &
          (6.0,10.0), (3.0,3.0), (0.0,0.0), (-6.0,-6.0), (-5.0,-5.0), &
          (-1.0,3.0), (-3.0,-3.0), (-7.0,-7.0), (-6.0,-6.0), &
          (-5.0,-5.0), (0.0,4.0)/
!                                 Find eigenvalues and vectors of A
      CALL EVCCH (A, EVAL, EVEC)
!                                 Compute performance index
      PI = EPICG(N,A,EVAL,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      CALL WRCRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END
Output
                                EVAL
              1                2                3                4
(  8.22, 12.22)  (  3.40,  7.40)  (  1.60,  5.60)  ( -3.22,  0.78)

                                   EVEC
                   1                  2                  3                  4
1  ( 0.7167, 0.0000)  (-0.0704, 0.0000)  (-0.3678, 0.0000)  ( 0.5429, 0.0000)
2  ( 0.6402,-0.0000)  (-0.0046,-0.0000)  ( 0.6767, 0.0000)  ( 0.4298,-0.0000)
3  ( 0.2598, 0.0000)  ( 0.7477, 0.0000)  (-0.3005, 0.0000)  ( 0.5277,-0.0000)
4  (-0.0948,-0.0000)  (-0.6603,-0.0000)  ( 0.5625, 0.0000)  ( 0.4920,-0.0000)

Performance index =  0.020
GVLRG
Computes all of the eigenvalues of a generalized real eigensystem $Az = \lambda Bz$.
Required Arguments
A Real matrix of order N. (Input)
B Real matrix of order N. (Input)
ALPHA Complex vector of size N containing scalars $\alpha_i$, $i = 1, \ldots, n$. If $\beta_i \ne 0$, $\lambda_i = \alpha_i / \beta_i$
are the eigenvalues of the system in decreasing order of magnitude. (Output)
BETAV Vector of size N containing scalars $\beta_i$. (Output)
618 Chapter 2: Eigensystem Analysis
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine GVLRG computes the eigenvalues of the generalized eigensystem $Ax = \lambda Bx$ where A and B
are real matrices of order N. The eigenvalues for this problem can be infinite; so instead of
returning $\lambda$, GVLRG returns $\alpha$ and $\beta$. If $\beta$ is nonzero, then $\lambda = \alpha/\beta$.
The first step of the QZ algorithm is to simultaneously reduce A to upper Hessenberg form and B
to upper triangular form. Then, orthogonal transformations are used to reduce A to quasi-upper-triangular form while keeping B upper triangular. The generalized eigenvalues are then computed.
The underlying code is based on either EISPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation, see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
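The following is a minimal sketch of how a caller might turn returned $(\alpha_i, \beta_i)$ pairs into eigenvalues while guarding against $\beta_i = 0$. It is self-contained Fortran and does not call GVLRG; the ALPHA and BETAV values are made up for illustration, and the names alpha_beta_ratio and finite are hypothetical.

```fortran
program alpha_beta_ratio
  implicit none
  integer, parameter :: n = 3
  complex :: alpha(n), eval(n)
  real    :: betav(n)
  logical :: finite(n)
  integer :: i

  ! Hypothetical (ALPHA, BETAV) pairs, made up for illustration only.
  alpha = [ (1.0, 2.0), (3.0, 0.0), (1.0, 0.0) ]
  betav = [ 2.0, 1.5, 0.0 ]

  ! An eigenvalue is finite only when its beta is nonzero.
  finite = betav /= 0.0
  eval = (0.0, 0.0)
  where (finite) eval = alpha / betav

  do i = 1, n
     if (finite(i)) then
        print '(A,I0,A,2F8.3)', 'lambda(', i, ') = ', eval(i)
     else
        print '(A,I0,A)', 'lambda(', i, ') is infinite (beta = 0)'
     end if
  end do
end program alpha_beta_ratio
```

The masked assignment divides only where the mask is true, so the third pair is reported as infinite instead of triggering a division by zero.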
Comments
1.
ACOPY Work array of size $N^2$. The arrays A and ACOPY may be the same, in which
case the first $N^2$ elements of A will be destroyed.
BCOPY Work array of size $N^2$. The arrays B and BCOPY may be the same, in which
case the first $N^2$ elements of B will be destroyed.
RWK Real work array of size N.
CWK Complex work array of size N.
IWK Integer work array of size N.
2.
This option uses eight values to solve memory bank conflict (access inefficiency)
problems. In routine G3LRG, the internal or working leading dimension of ACOPY
is increased by IVAL(3) when N is a multiple of IVAL(4). The values IVAL(3) and
IVAL(4) are temporarily replaced by IVAL(1) and IVAL(2), respectively, in
routine GVLRG. Analogous comments hold for BCOPY and the values IVAL(5)
through IVAL(8). Additional memory allocation and option value restoration are
automatically done in GVLRG. There is no requirement that users change existing
applications that use GVLRG or G3LRG. Default values for the option are IVAL(*) =
1, 16, 0, 1, 1, 16, 0, 1.
Example
In this example, DATA statements are used to set A and B. The eigenvalues are computed and
printed.
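      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    LDA, LDB, N
      PARAMETER  (N=3, LDA=N, LDB=N)
!
      INTEGER    I
      REAL       A(LDA,N), B(LDB,N), BETAV(N)
      COMPLEX    ALPHA(N), EVAL(N)
!
!                                 Define values of A and B
!                                 A = (  1.0    0.5    0.0 )
!                                     (-10.0    2.0    0.0 )
!                                     (  5.0    1.0    0.5 )
!
!                                 B = (  0.5    0.0    0.0 )
!                                     (  3.0    3.0    0.0 )
!                                     (  4.0    0.5    1.0 )
!
      DATA A/1.0, -10.0, 5.0, 0.5, 2.0, 1.0, 0.0, 0.0, 0.5/
      DATA B/0.5, 3.0, 4.0, 0.0, 3.0, 0.5, 0.0, 0.0, 1.0/
!
      CALL GVLRG (A, B, ALPHA, BETAV)
!                                 Compute eigenvalues
      EVAL = ALPHA/BETAV
!                                 Print results
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      END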
Output
                       EVAL
              1                2                3
( 0.833, 1.993)  ( 0.833,-1.993)  ( 0.500, 0.000)
GVCRG
Computes all of the eigenvalues and eigenvectors of a generalized real eigensystem $Az = \lambda Bz$.
Required Arguments
A Real matrix of order N. (Input)
B Real matrix of order N. (Input)
ALPHA Complex vector of size N containing scalars $\alpha_i$. If
$\beta_i \ne 0$, $\lambda_i = \alpha_i / \beta_i$, $i = 1, \ldots, n$ are the eigenvalues of the system. (Output)
BETAV Vector of size N containing scalars $\beta_i$. (Output)
EVEC Complex matrix of order N. (Output)
The J-th eigenvector, corresponding to $\lambda_J$, is stored in the J-th column. Each vector is
normalized to have Euclidean length equal to the value one.
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine GVCRG computes the complex eigenvalues and eigenvectors of the generalized
eigensystem $Ax = \lambda Bx$ where A and B are real matrices of order N. The eigenvalues for this
problem can be infinite; so instead of returning $\lambda$, GVCRG returns complex numbers $\alpha$ and real
numbers $\beta$. If $\beta$ is nonzero, then $\lambda = \alpha/\beta$. For problems with small $|\beta|$, users can choose to solve the
mathematically equivalent problem $Bx = \mu Ax$ where $\mu = \lambda^{-1}$.
The first step of the QZ algorithm is to simultaneously reduce A to upper Hessenberg form and B
to upper triangular form. Then, orthogonal transformations are used to reduce A to quasi-upper-triangular form while keeping B upper triangular. The generalized eigenvalues and eigenvectors
for the reduced problem are then computed.
The underlying code is based on either EISPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation, see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual.
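The equivalence of the two formulations can be spelled out in one line. The sketch below assumes $\lambda \ne 0$ and uses only the symbols defined above; the numerical check uses the eigenvalue pair printed in the example output for this routine.

```latex
\begin{aligned}
Ax = \lambda Bx,\ \lambda \ne 0
  \;&\Longleftrightarrow\; Bx = \lambda^{-1} Ax = \mu Ax,
  \qquad \mu = \lambda^{-1}, \\[4pt]
\lambda = 0.833 + 1.993\,i
  \;&\Longrightarrow\;
  \mu = \frac{0.833 - 1.993\,i}{0.833^{2} + 1.993^{2}}
      \approx 0.179 - 0.427\,i, \\[4pt]
\lambda = 0.500
  \;&\Longrightarrow\; \mu = 2.000 .
\end{aligned}
```

The eigenvector x is the same in both formulations; only the roles of A and B are exchanged.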
Comments
1.
2.
This option uses eight values to solve memory bank conflict (access
inefficiency) problems. In routine G8CRG, the internal or working leading
dimensions of ACOPY and ECOPY are both increased by IVAL(3) when N is a
multiple of IVAL(4). The values IVAL(3) and IVAL(4) are temporarily replaced
by IVAL(1) and IVAL(2), respectively, in routine GVCRG. Analogous comments
hold for the array BCOPY and the option values IVAL(5) through IVAL(8). Additional
memory allocation and option value restoration are automatically done in
GVCRG. There is no requirement that users change existing applications that use
GVCRG or G8CRG. Default values for the option are IVAL(*) = 1, 16, 0, 1, 1, 16,
0, 1. Items 5 through 8 in IVAL(*) are for the generalized eigenvalue problem and are
not used in GVCRG.
Example
In this example, DATA statements are used to set A and B. The eigenvalues, eigenvectors and
performance index are computed and printed for the systems $Ax = \lambda Bx$ and $Bx = \mu Ax$ where
$\mu = \lambda^{-1}$. For more details about the performance index, see routine GPIRG.
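      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    LDA, LDB, LDEVEC, N
      PARAMETER  (N=3, LDA=N, LDB=N, LDEVEC=N)
!
      INTEGER    I, NOUT
      REAL       A(LDA,N), B(LDB,N), BETAV(N), PI
      COMPLEX    ALPHA(N), EVAL(N), EVEC(LDEVEC,N)
!
!                                 Define values of A and B
!                                 A = (  1.0    0.5    0.0 )
!                                     (-10.0    2.0    0.0 )
!                                     (  5.0    1.0    0.5 )
!
!                                 B = (  0.5    0.0    0.0 )
!                                     (  3.0    3.0    0.0 )
!                                     (  4.0    0.5    1.0 )
!
      DATA A/1.0, -10.0, 5.0, 0.5, 2.0, 1.0, 0.0, 0.0, 0.5/
      DATA B/0.5, 3.0, 4.0, 0.0, 3.0, 0.5, 0.0, 0.0, 1.0/
!
      CALL GVCRG (A, B, ALPHA, BETAV, EVEC)
!                                 Compute eigenvalues
      EVAL = ALPHA/BETAV
!                                 Compute performance index
      PI = GPIRG(N,A,B,ALPHA,BETAV,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      CALL WRCRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
!                                 Solve for reciprocals of values
      CALL GVCRG (B, A, ALPHA, BETAV, EVEC)
!                                 Compute reciprocals
      EVAL = ALPHA/BETAV
!                                 Compute performance index
      PI = GPIRG(N,B,A,ALPHA,BETAV,EVEC)
!                                 Print results
      CALL WRCRN ('EVAL reciprocals', EVAL, 1, N, 1)
      CALL WRCRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END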
Output
                       EVAL
              1                2                3
( 0.833, 1.993)  ( 0.833,-1.993)  ( 0.500, 0.000)

                        EVEC
                 1                2                3
1  (-0.197, 0.150)  (-0.197,-0.150)  (-0.000, 0.000)
2  (-0.069,-0.568)  (-0.069, 0.568)  (-0.000, 0.000)
3  ( 0.782, 0.000)  ( 0.782, 0.000)  ( 1.000, 0.000)

Performance index =  0.384

                 EVAL reciprocals
              1                2                3
( 2.000, 0.000)  ( 0.179, 0.427)  ( 0.179,-0.427)

                        EVEC
                 1                2                3
1  ( 0.000, 0.000)  (-0.197,-0.150)  (-0.197, 0.150)
2  ( 0.000, 0.000)  (-0.069, 0.568)  (-0.069,-0.568)
3  ( 1.000, 0.000)  ( 0.782, 0.000)  ( 0.782, 0.000)

Performance index =  0.283
GPIRG
This function computes the performance index for a generalized real eigensystem $Az = \lambda Bz$.
Required Arguments
NEVAL Number of eigenvalue/eigenvector pairs that the performance index computation is
based on. (Input)
A Real matrix of order N. (Input)
B Real matrix of order N. (Input)
ALPHA Complex vector of length NEVAL containing the numerators of eigenvalues.
(Input)
BETAV Real vector of length NEVAL containing the denominators of eigenvalues. (Input)
EVEC Complex N by NEVAL array containing the eigenvectors. (Input)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDB = SIZE (B,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let M = NEVAL, $x_j$ = EVEC(*,J), the j-th column of EVEC, $\alpha_j$ = ALPHA(J) and $\beta_j$ = BETAV(J). Also, let $\varepsilon$ be the machine precision
given by AMACH(4); see the Reference chapter of this manual. The performance index, $\tau$, is defined
to be
$$\tau = \max_{1 \le j \le M} \frac{\left\| \beta_j A x_j - \alpha_j B x_j \right\|_1}{\varepsilon \left( |\beta_j| \, \|A\|_1 + |\alpha_j| \, \|B\|_1 \right) \|x_j\|_1}$$
The norms used are a modified form of the 1-norm. The norm of the complex vector v is
$$\|v\|_1 = \sum_{i=1}^{N} \left\{ |\Re v_i| + |\Im v_i| \right\}$$
While the exact value of $\tau$ is highly machine dependent, the performance of GVCRG is considered
excellent if $\tau < 1$, good if $1 \le \tau \le 100$, and poor if $\tau > 100$. The performance index was first
developed by the EISPACK project at Argonne National Laboratory; see Garbow et al. (1977,
pages 77-79).
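The ratio $\| \beta_j A x_j - \alpha_j B x_j \|_1 / \left( \varepsilon ( |\beta_j| \|A\|_1 + |\alpha_j| \|B\|_1 ) \|x_j\|_1 \right)$ can be evaluated directly for a small case. The sketch below is self-contained Fortran and is not the GPIRG implementation: it assumes the matrix 1-norm is the maximum absolute column sum, uses the intrinsic EPSILON as a stand-in for AMACH(4), and takes a made-up diagonal problem whose eigenpairs are exact. The names perf_index and norm1c are hypothetical.

```fortran
program perf_index_sketch
  implicit none
  integer, parameter :: n = 2
  real    :: a(n,n), b(n,n), betav(n), tau
  complex :: alpha(n), evec(n,n)

  ! Made-up problem: A = diag(2,3), B = identity, exact eigenpairs.
  a = reshape([2.0, 0.0, 0.0, 3.0], [n, n])
  b = reshape([1.0, 0.0, 0.0, 1.0], [n, n])
  alpha = [ (2.0, 0.0), (3.0, 0.0) ]
  betav = [ 1.0, 1.0 ]
  evec  = reshape([ (1.0,0.0), (0.0,0.0), (0.0,0.0), (1.0,0.0) ], [n, n])

  tau = perf_index(a, b, alpha, betav, evec)
  print '(A,F8.3)', ' Performance index = ', tau
  if (tau >= 1.0) error stop 'unexpectedly large performance index'

contains

  ! Modified 1-norm of a complex vector: sum of |Re v_i| + |Im v_i|.
  real function norm1c(v)
    complex, intent(in) :: v(:)
    norm1c = sum(abs(real(v)) + abs(aimag(v)))
  end function norm1c

  real function perf_index(a, b, alpha, betav, evec) result(tau)
    real,    intent(in) :: a(:,:), b(:,:), betav(:)
    complex, intent(in) :: alpha(:), evec(:,:)
    complex :: r(size(a,1))
    real    :: anorm, bnorm, denom
    integer :: j
    ! Assumption: matrix 1-norm = maximum absolute column sum.
    anorm = maxval(sum(abs(a), dim=1))
    bnorm = maxval(sum(abs(b), dim=1))
    tau = 0.0
    do j = 1, size(alpha)
       ! Residual beta_j*A*x_j - alpha_j*B*x_j for the j-th pair.
       r = betav(j)*matmul(a, evec(:,j)) - alpha(j)*matmul(b, evec(:,j))
       denom = epsilon(1.0)*(abs(betav(j))*anorm + abs(alpha(j))*bnorm) &
               *norm1c(evec(:,j))
       tau = max(tau, norm1c(r)/denom)
    end do
  end function perf_index
end program perf_index_sketch
```

Because the eigenpairs supplied here are exact, every residual vanishes and the index is zero; computed eigenpairs would normally give a small positive value.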
Comments
1.
2.
Informational errors
Type  Code
3     1
3     2
3     3
3     4
3.
The J-th eigenvalue should be ALPHA(J)/BETAV(J); its eigenvector should be in the J-th column of EVEC.
Example
For an example of GPIRG, see routine GVCRG.
GVLCG
Computes all of the eigenvalues of a generalized complex eigensystem $Az = \lambda Bz$.
Required Arguments
A Complex matrix of order N. (Input)
B Complex matrix of order N. (Input)
ALPHA Complex vector of length N. Ultimately, alpha(i)/betav(i) (for i = 1, ..., n) will be the
eigenvalues of the system in decreasing order of magnitude. (Output)
BETAV Complex vector of length N. (Output)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine GVLCG computes the eigenvalues of the generalized eigensystem $Ax = \lambda Bx$, where A and B
are complex matrices of order n. The eigenvalues for this problem can be infinite; so instead of
returning $\lambda$, GVLCG returns $\alpha$ and $\beta$. If $\beta$ is nonzero, then $\lambda = \alpha/\beta$. If the eigenvectors are needed,
then use GVCCG.
The underlying code is based on either EISPACK or LAPACK code depending upon which
supporting libraries are used during linking. For a detailed explanation, see Using ScaLAPACK,
LAPACK, LINPACK, and EISPACK in the Introduction section of this manual. Some timing
information is given in Hanson et al. (1990).
Comments
1.
2.
Informational error
Type
Code
4
Example
In this example, DATA statements are used to set A and B. Then, the eigenvalues are computed and
printed.
      USE GVLCG_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declaration of variables
      INTEGER    LDA, LDB, N
      PARAMETER  (N=5, LDA=N, LDB=N)
!
      INTEGER    I
      COMPLEX    A(LDA,N), ALPHA(N), B(LDB,N), BETAV(N), EVAL(N)
!
!
!
!
!
Output
                                        EVAL
              1                2                3                4                5
(-1.000,-1.333)  ( 0.765, 0.941)  (-0.353, 0.412)  (-0.353,-0.412)  (-0.353,-0.412)
GVCCG
Computes all of the eigenvalues and eigenvectors of a generalized complex eigensystem $Az = \lambda Bz$.
Required Arguments
A Complex matrix of order N. (Input)
B Complex matrix of order N. (Input)
ALPHA Complex vector of length N. Ultimately, alpha(i)/betav(i) (for i = 1, ..., n) will be
the eigenvalues of the system in decreasing order of magnitude. (Output)
BETAV Complex vector of length N. (Output)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine GVCCG computes the eigenvalues and eigenvectors of the generalized eigensystem
$Ax = \lambda Bx$. Here, A and B are complex matrices of order n. The eigenvalues for this problem can be
infinite; so instead of returning $\lambda$, GVCCG returns $\alpha$ and $\beta$. If $\beta$ is nonzero, then $\lambda = \alpha/\beta$.
The routine GVCCG uses the QZ algorithm described by Moler and Stewart (1973). The
implementation is based on routines of Garbow (1978). Some timing results are given in Hanson
et al. (1990).
Comments
1.
CALL G6CCG (N, A, LDA, B, LDB, ALPHA, BETAV, EVEC, LDEVEC, ACOPY, BCOPY, CWK,
WK, IWK)
2.
Informational error
Type
Code
4
3.
Example
In this example, DATA statements are used to set A and B. The eigenvalues and eigenvectors are
computed and printed. The performance index is also computed and printed. This serves as a
check on the computations. For more details, see routine GPICG.
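      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    LDA, LDB, LDEVEC, N
      PARAMETER  (N=3, LDA=N, LDB=N, LDEVEC=N)
!
      INTEGER    I, NOUT
      REAL       PI
      COMPLEX    A(LDA,N), ALPHA(N), B(LDB,N), BETAV(N), EVAL(N), &
                 EVEC(LDEVEC,N)
!
!                                 Define values of A and B
!                                 A = (  1+0i    0.5+i    0+5i   )
!                                     (-10+0i     2+i     0+0i   )
!                                     (  5+i     1+0i    0.5+3i  )
!
!                                 B = ( 0.5+0i   0+0i     0+0i   )
!                                     (  3+3i    3+3i     0+i    )
!                                     (  4+2i   0.5+i     1+i    )
!
      DATA A/(1.0,0.0), (-10.0,0.0), (5.0,1.0), (0.5,1.0), (2.0,1.0), &
          (1.0,0.0), (0.0,5.0), (0.0,0.0), (0.5,3.0)/
      DATA B/(0.5,0.0), (3.0,3.0), (4.0,2.0), (0.0,0.0), (3.0,3.0), &
          (0.5,1.0), (0.0,0.0), (0.0,1.0), (1.0,1.0)/
!                                 Compute eigenvalues and eigenvectors
      CALL GVCCG (A, B, ALPHA, BETAV, EVEC)
      EVAL = ALPHA/BETAV
!                                 Compute performance index
      PI = GPICG(N,A,B,ALPHA,BETAV,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRCRN ('EVAL', EVAL, 1, N, 1)
      CALL WRCRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END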
Output
                       EVAL
              1                2                3
( -8.18,-25.38)  (  2.18,  0.61)  (  0.12, -0.39)

                          EVEC
                   1                  2                  3
1  (-0.3267,-0.1245)  (-0.3007,-0.2444)  ( 0.0371, 0.1518)
2  ( 0.1767, 0.0054)  ( 0.8959, 0.0000)  ( 0.9577, 0.0000)
3  ( 0.9201, 0.0000)  (-0.2019, 0.0801)  (-0.2215, 0.0968)

Performance index =  0.709
GPICG
This function computes the performance index for a generalized complex eigensystem $Az = \lambda Bz$.
Required Arguments
NEVAL Number of eigenvalue/eigenvector pairs that the performance index computation is
based on. (Input)
A Complex matrix of order N. (Input)
B Complex matrix of order N. (Input)
ALPHA Complex vector of length NEVAL containing the numerators of eigenvalues.
(Input)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDB = SIZE (B,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let M = NEVAL, $x_j$ = EVEC(*, J), the j-th column of EVEC, $\alpha_j$ = ALPHA(J) and $\beta_j$ = BETAV(J). Also, let $\varepsilon$ be the machine precision
given by AMACH(4). The performance index, $\tau$, is defined to be
$$\tau = \max_{1 \le j \le M} \frac{\left\| \beta_j A x_j - \alpha_j B x_j \right\|_1}{\varepsilon \left( |\beta_j| \, \|A\|_1 + |\alpha_j| \, \|B\|_1 \right) \|x_j\|_1}$$
The norms used are a modified form of the 1-norm. The norm of the complex vector v is
$$\|v\|_1 = \sum_{i=1}^{N} \left\{ |\Re v_i| + |\Im v_i| \right\}$$
While the exact value of $\tau$ is highly machine dependent, the performance of GVCCG is considered
excellent if $\tau < 1$, good if $1 \le \tau \le 100$, and poor if $\tau > 100$.
The performance index was first developed by the EISPACK project at Argonne National
Laboratory; see Garbow et al. (1977, pages 77-79).
Comments
1.
2.
Informational errors
Type  Code
3     1
3     2
3     3
3     4
3.
The J-th eigenvalue should be ALPHA(J)/BETAV(J); its eigenvector should be in the J-th column of EVEC.
Example
For an example of GPICG, see routine GVCCG.
GVLSP
Computes all of the eigenvalues of the generalized real symmetric eigenvalue problem $Az = \lambda Bz$,
with B symmetric positive definite.
Required Arguments
A Real symmetric matrix of order N. (Input)
B Positive definite symmetric matrix of order N. (Input)
EVAL Vector of length N containing the eigenvalues in decreasing order of magnitude.
(Output)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine GVLSP computes the eigenvalues of $Ax = \lambda Bx$ with A symmetric and B symmetric positive
definite. The Cholesky factorization $B = R^T R$, with R a triangular matrix, is used to transform the
equation $Ax = \lambda Bx$ to
$$(R^{-T} A R^{-1})(Rx) = \lambda (Rx)$$
The eigenvalues of $C = R^{-T} A R^{-1}$ are then computed. This development is found in Martin and
Wilkinson (1968). The Cholesky factorization of B is computed based on IMSL routine LFTDS
(see Chapter 1, Linear Systems). The eigenvalues of C are computed based on routine EVLSF.
Further discussion and some timing results are given in Hanson et al. (1990).
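The reduction to a standard symmetric problem follows in three short steps. This is a sketch of the algebra only, using the Cholesky factor R of B and the substitution $y = Rx$ (the symbol y is introduced here for illustration).

```latex
\begin{aligned}
Ax &= \lambda Bx = \lambda R^{T} R x
   && \text{(substitute } B = R^{T}R\text{)} \\
R^{-T} A x &= \lambda R x
   && \text{(multiply on the left by } R^{-T}\text{)} \\
\left(R^{-T} A R^{-1}\right)(Rx) &= \lambda (Rx)
   && \text{(insert } R^{-1}R = I \text{ after } A\text{)} \\
C y &= \lambda y, \qquad C = R^{-T} A R^{-1},\quad y = Rx .
\end{aligned}
```

Since $C^{T} = R^{-T} A^{T} R^{-1} = C$ when A is symmetric, C is symmetric and its eigenvalues are real; they coincide with the generalized eigenvalues, and each generalized eigenvector is recovered as $x = R^{-1} y$.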
Comments
1.
2.
Informational errors
Type  Code
4     1
4     2
Example
In this example, a DATA statement is used to set the matrices A and B. The eigenvalues of the
system are computed and printed.
      USE GVLSP_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N
      PARAMETER  (N=3, LDA=N, LDB=N)
!
      REAL       A(LDA,N), B(LDB,N), EVAL(N)
!
!
!                                 Find eigenvalues
      CALL GVLSP (A, B, EVAL)
!                                 Print results
      CALL WRRRN ('EVAL', EVAL, 1, N, 1)
      END
Output
          EVAL
     1        2        3
-4.717    4.393   -0.676
GVCSP
Computes all of the eigenvalues and eigenvectors of the generalized real symmetric eigenvalue
problem $Az = \lambda Bz$, with B symmetric positive definite.
Required Arguments
A Real symmetric matrix of order N. (Input)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDB = SIZE (B,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine GVCSP computes the eigenvalues and eigenvectors of $Az = \lambda Bz$, with A symmetric and B
symmetric positive definite. The Cholesky factorization $B = R^T R$, with R a triangular matrix, is
used to transform the equation $Az = \lambda Bz$ to
$$(R^{-T} A R^{-1})(Rz) = \lambda (Rz)$$
The eigenvalues and eigenvectors of $C = R^{-T} A R^{-1}$ are then computed. The generalized
eigenvectors of A are given by $z = R^{-1} x$, where x is an eigenvector of C. This development is
found in Martin and Wilkinson (1968). The Cholesky factorization is computed based on IMSL
routine LFTDS; see Chapter 1, Linear Systems. The eigenvalues and eigenvectors of C are
computed based on routine EVCSF. Further discussion and some timing results are given in Hanson
et al. (1990).
Comments
1.
2.
Informational errors
Type  Code
4     1
4     2
3.
Example
In this example, a DATA statement is used to set the matrices A and B. The eigenvalues,
eigenvectors and performance index are computed and printed. For details on the performance
index, see IMSL routine GPISP.
      USE GVCSP_INT
      USE GPISP_INT
      USE UMACH_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, LDEVEC, N
      PARAMETER  (N=3, LDA=N, LDB=N, LDEVEC=N)
!
      INTEGER    NOUT
      REAL       A(LDA,N), B(LDB,N), EVAL(N), EVEC(LDEVEC,N), PI
!                                 Define values of A:
!                                 A = ( 1.1   1.2   1.4 )
!                                     ( 1.2   1.3   1.5 )
!                                     ( 1.4   1.5   1.6 )
      DATA A/1.1, 1.2, 1.4, 1.2, 1.3, 1.5, 1.4, 1.5, 1.6/
!
!                                 Define values of B:
!                                 B = ( 2.0   1.0   0.0 )
!                                     ( 1.0   2.0   1.0 )
!                                     ( 0.0   1.0   2.0 )
      DATA B/2.0, 1.0, 0.0, 1.0, 2.0, 1.0, 0.0, 1.0, 2.0/
!                                 Find eigenvalues and vectors
      CALL GVCSP (A, B, EVAL, EVEC)
!                                 Compute performance index
      PI = GPISP(N,A,B,EVAL,EVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      CALL WRRRN ('EVAL', EVAL, 1, N, 1)
      CALL WRRRN ('EVEC', EVEC)
      WRITE (NOUT,'(/,A,F6.3)') ' Performance index = ', PI
      END
Output
          EVAL
     1        2        3
 1.386   -0.058   -0.003

             EVEC
         1         2         3
1   0.6431   -0.1147   -0.6817
2  -0.0224   -0.6872    0.7266
3   0.7655    0.7174   -0.0858

Performance index =  0.417
GPISP
This function computes the performance index for a generalized real symmetric eigensystem
problem.
Required Arguments
NEVAL Number of eigenvalue/eigenvector pairs that the performance index computation
is based on. (Input)
A Symmetric matrix of order N. (Input)
B Symmetric matrix of order N. (Input)
EVAL Vector of length NEVAL containing eigenvalues. (Input)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDB = SIZE (B,1).
LDEVEC Leading dimension of EVEC exactly as specified in the dimension statement in
the calling program. (Input)
Default: LDEVEC = SIZE (EVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let M = NEVAL, $\lambda$ = EVAL, $x_j$ = EVEC(*, J), the j-th column of EVEC. Also, let $\varepsilon$ be the machine
precision given by AMACH(4). The performance index, $\tau$, is defined to be
$$\tau = \max_{1 \le j \le M} \frac{\left\| A x_j - \lambda_j B x_j \right\|_1}{\varepsilon \left( \|A\|_1 + |\lambda_j| \, \|B\|_1 \right) \|x_j\|_1}$$
The norms used are a modified form of the 1-norm. The norm of the complex vector v is
$$\|v\|_1 = \sum_{i=1}^{N} \left\{ |\Re v_i| + |\Im v_i| \right\}$$
While the exact value of $\tau$ is highly machine dependent, the performance of GVCSP is considered
excellent if $\tau < 1$, good if $1 \le \tau \le 100$, and poor if $\tau > 100$. The performance index was first
developed by the EISPACK project at Argonne National Laboratory; see Garbow et al. (1977,
pages 77-79).
Comments
1.
2.
Informational errors
Type  Code
3     1
3     2
3     3
3     4
3.
The J-th eigenvalue should be EVAL(J); its eigenvector should be in the J-th column of EVEC.
Example
For an example of GPISP, see routine GVCSP.
Routines
B-spline Interpolation
Easy to use spline routine ........................................... SPLEZ
One-dimensional interpolation ........................................ BSINT
Piecewise Polynomial
Evaluation ........................................................... PPVAL
Evaluation of the derivative ......................................... PPDER
Evaluation on a grid ................................................. PP1GD
Integration .......................................................... PPITG
Least-Squares Approximation
Linear polynomial .................................................... RLINE
General polynomial ................................................... RCURV
General functions .................................................... FNLSQ
Splines with fixed knots ............................................. BSLSQ
Splines with variable knots .......................................... BSVLS
Splines with linear constraints ...................................... CONFT
Two-dimensional tensor-product splines with fixed knots .............. BSLS2
Three-dimensional tensor-product splines with fixed knots ............ BSLS3
Rational $L_\infty$ Approximation
Rational Chebyshev ................................................... RATCH
Usage Notes
The majority of the routines in this chapter produce piecewise polynomial or spline functions that
either interpolate or approximate given data, or are support routines for the evaluation, integration,
and conversion from one representation to another. Two major subdivisions of routines are
provided. The cubic spline routines begin with the letters CS and utilize the piecewise
polynomial representation described below. The B-spline routines begin with the letters BS and
utilize the B-spline representation described below. Most of the spline routines are based on
routines in the book by de Boor (1978).
Piecewise Polynomials
A univariate piecewise polynomial (function) p is specified by giving its breakpoint sequence
$\xi \in \mathbf{R}^n$, the order k (degree k - 1) of its polynomial pieces, and the $k \times (n - 1)$ matrix c of its local
polynomial coefficients. In terms of this information, the piecewise polynomial (pp) function is
given by
$$p(x) = \sum_{j=1}^{k} c_{ji} \frac{(x - \xi_i)^{j-1}}{(j-1)!} \quad \text{for } \xi_i \le x < \xi_{i+1}$$
The breakpoint sequence $\xi$ is assumed to be strictly increasing, and we extend the pp function to
the entire real axis by extrapolation from the first and last intervals. The subroutines in this chapter
will consistently make the following identifications for FORTRAN variables:
c = PPCOEF
$\xi$ = BREAK
k = KORDER
n = NBREAK
This representation is redundant when the pp function is known to be smooth. For example, if p is
known to be continuous, then we can compute $c_{1,i+1}$ from the $c_{ji}$ as follows
$$c_{1,i+1} = p(\xi_{i+1}) = c_{1i} + c_{2i} \, \Delta\xi_i + \cdots + c_{ki} \frac{(\Delta\xi_i)^{k-1}}{(k-1)!}$$
where $\Delta\xi_i := \xi_{i+1} - \xi_i$. For smooth pp, we prefer to use the irredundant representation in terms of
the B-(for basis)-splines, at least when such a function is first to be determined. The above pp
representation is employed for evaluation of the pp function at many points since it is more
efficient.
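As an illustration of the pp representation, the sketch below evaluates a piecewise polynomial from its breakpoints and local coefficients, where entry (j, i) of the coefficient array holds the (j-1)-th derivative of piece i at breakpoint i. It is self-contained Fortran and is not the library's PPVAL; the function name ppeval and the sample data (p(x) = x**2 stored as two quadratic pieces) are hypothetical.

```fortran
program pp_eval_sketch
  implicit none
  integer, parameter :: korder = 3, nbreak = 3
  real :: break(nbreak), ppcoef(korder, nbreak-1)

  ! Hypothetical data: p(x) = x**2, stored as two pieces on [0,1) and [1,2).
  break = [0.0, 1.0, 2.0]
  ! Column i holds p, p', p'' evaluated at break(i).
  ppcoef(:,1) = [0.0, 0.0, 2.0]
  ppcoef(:,2) = [1.0, 2.0, 2.0]

  print '(A,F8.4)', 'p(0.5) = ', ppeval(0.5)
  print '(A,F8.4)', 'p(1.5) = ', ppeval(1.5)
  if (abs(ppeval(1.5) - 2.25) > 1.0e-5) error stop 'pp evaluation mismatch'

contains

  real function ppeval(x) result(p)
    real, intent(in) :: x
    integer :: i, j
    real :: h
    ! Locate i with break(i) <= x < break(i+1); points outside the
    ! breakpoint range are extrapolated from the first or last piece.
    i = 1
    do while (i < nbreak - 1)
       if (x < break(i+1)) exit
       i = i + 1
    end do
    h = x - break(i)
    ! Nested (Horner-like) form of sum_j c(j,i) * h**(j-1) / (j-1)!
    p = ppcoef(korder, i)
    do j = korder - 1, 1, -1
       p = ppcoef(j, i) + p * h / real(j)
    end do
  end function ppeval
end program pp_eval_sketch
```

The sample coefficients also satisfy the continuity relation: the first entry of column 2 equals 0 + 0*1 + 2*1/2 = 1, the value of the first piece at the second breakpoint.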
t = XKNOT
m = NCOEF
M = NKNOT
[Figure panel titles: CSINT, CSAKM, CSCON]
Cubic Splines
Cubic splines are smooth (i.e., $C^1$ or $C^2$) fourth-order pp functions. For historical and other
reasons, cubic splines are the most heavily used pp functions. Therefore, we provide special
routines for their construction and evaluation. The routines for their determination use yet another
representation (in terms of value and slope at all the breakpoints) but output the pp representation
as described above for general pp functions.
We provide seven cubic spline interpolation routines: CSIEZ, CSINT, CSDEC, CSHER, CSAKM,
CSCON, and CSPER. The first routine, CSIEZ, is an easy-to-use version of CSINT coupled with
CSVAL. The routine CSIEZ will compute the value of the cubic spline interpolant (to given data
using the not-a-knot criterion) on a grid. The routine CSDEC allows the user to specify various
endpoint conditions (such as the value of the first or second derivative at the right and left points).
This means that the natural cubic spline can be obtained using this routine by setting the second
derivative to zero at both endpoints. If function values and derivatives are available, then the
Chapter 3: Interpolation and Approximation
Hermite cubic interpolant can be computed using CSHER. The two routines CSAKM and CSCON are
designed so that the shape of the curve matches the shape of the data. In particular, CSCON
preserves the convexity of the data while CSAKM attempts to minimize oscillations. If the data is
periodic, then CSPER will produce a periodic interpolant. The routine CONFT allows the user wide
latitude in enforcing shapes. This routine returns the B-spline representation.
It is possible that the cubic spline interpolation routines will produce unsatisfactory results. The
adventurous user should consider using the B-spline interpolation routine BSINT that allows one
to choose the knots and order of the spline interpolant.
In Figure 3-1, we display six spline interpolants to the same data. This data can be found in
Example 1 of the IMSL routine CSCON. Notice the different characteristics of the interpolants. The
interpolation routines CSAKM and CSCON are the only two that attempt to preserve the shape of the
data. The other routines tend to have extraneous inflection points, with the piecewise quartic
(k = 5) exhibiting the most oscillation.
$$\sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nm} B_{n,k_x,t_x}(x) B_{m,k_y,t_y}(y)$$
$\{x_i\}_{i=1}^{N_x}$ and $\{y_i\}_{i=1}^{N_y}$
for which the corresponding univariate interpolation problem could be solved, the tensor product
interpolation problem becomes: Find the coefficients $c_{nm}$ so that
$$\sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nm} B_{n,k_x,t_x}(x_i) B_{m,k_y,t_y}(y_j) = f_{ij}$$
This problem can be solved efficiently by repeatedly solving univariate interpolation problems as
described in de Boor (1978, page 347). Three-dimensional interpolation has analogous behavior.
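One way to see why univariate solves suffice is to write the interpolation conditions in matrix form. The following is a sketch under the notation introduced here for illustration: $B_x$, $B_y$ are the univariate collocation matrices, $C = (c_{nm})$, $F = (f_{ij})$, and G is an intermediate array; it is not claimed to be the exact organization used by the library routines.

```latex
\begin{aligned}
(B_x)_{in} &= B_{n,k_x,t_x}(x_i), \qquad (B_y)_{jm} = B_{m,k_y,t_y}(y_j), \\[4pt]
\sum_{m=1}^{N_y}\sum_{n=1}^{N_x} c_{nm}\,(B_x)_{in}\,(B_y)_{jm} = f_{ij}
  \;&\Longleftrightarrow\; B_x\, C\, B_y^{T} = F, \\[4pt]
\text{Step 1:}\quad B_x\, G &= F
  && \text{($N_y$ univariate problems in $x$, one per column of $F$)}, \\
\text{Step 2:}\quad B_y\, C^{T} &= G^{T}
  && \text{($N_x$ univariate problems in $y$, one per row of $G$)}.
\end{aligned}
```

Each step reuses a single factorization of $B_x$ or $B_y$, so the cost is that of two univariate factorizations plus $N_x + N_y$ back-substitutions instead of one solve of size $N_x N_y$.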
In this chapter, we provide routines that compute the two-dimensional tensor-product spline
coefficients given two-dimensional interpolation data (BS2IN), compute the three-dimensional
tensor-product spline coefficients given three-dimensional interpolation data (BS3IN), compute the
two-dimensional tensor-product spline coefficients for a tensor-product least squares problem
(BSLS2), and compute the three-dimensional tensor-product spline coefficients for a
tensor-product least squares problem (BSLS3). In addition, we provide evaluation, differentiation,
and integration routines for the two- and three-dimensional tensor-product spline functions. The
relevant routines are BS2VL, BS3VL, BS2DR, BS3DR, BS2GD, BS3GD, BS2IG, and BS3IG.
Quadratic Interpolation
The routines that begin with the letters QD in this chapter are designed to interpolate a one-,
two-, or three-dimensional (tensor product) table of values and return an approximation to the
value of the underlying function or one of its derivatives at a given point. These routines are all
based on quadratic polynomial interpolation.
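To illustrate what quadratic polynomial interpolation of a table means in the one-dimensional case, the sketch below evaluates the Lagrange form of the quadratic through three tabulated points. It is self-contained Fortran and is not the implementation of the QD routines (how those routines select the three points is not described here); the function name quad_interp and the sample table are hypothetical.

```fortran
program quad_interp_sketch
  implicit none
  real :: xd(3), fd(3), q

  ! Hypothetical table: samples of f(x) = x**2, which a quadratic
  ! interpolant reproduces exactly (up to rounding).
  xd = [1.0, 2.0, 4.0]
  fd = xd**2
  q = quad_interp(xd, fd, 3.0)
  print '(A,F8.4)', 'q(3.0) = ', q
  if (abs(q - 9.0) > 1.0e-4) error stop 'quadratic interpolation mismatch'

contains

  ! Lagrange form of the quadratic through three tabulated points.
  real function quad_interp(x, f, t) result(q)
    real, intent(in) :: x(3), f(3), t
    q = f(1)*(t - x(2))*(t - x(3))/((x(1) - x(2))*(x(1) - x(3))) &
      + f(2)*(t - x(1))*(t - x(3))/((x(2) - x(1))*(x(2) - x(3))) &
      + f(3)*(t - x(1))*(t - x(2))/((x(3) - x(1))*(x(3) - x(2)))
  end function quad_interp
end program quad_interp_sketch
```

A tensor-product version would apply the same one-dimensional formula along each coordinate direction in turn.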
Least Squares
Routines are provided to smooth noisy data: regression using linear polynomials (RLINE),
regression using arbitrary polynomials (RCURV), and regression using user-supplied functions
(FNLSQ). Additional routines compute the least-squares fit using splines with fixed knots (BSLSQ)
or free knots (BSVLS). These routines can produce a cubic-spline least-squares fit simply by setting
the order to 4. The routine CONFT computes a fixed-knot spline weighted least-squares fit subject
to linear constraints. This routine is very general and is recommended if issues of shape are
important. The two- and three-dimensional tensor-product spline regression routines are BSLS2
and BSLS3.
splines with the not-a-knot end condition, is to call CSINT to obtain the local coefficients of the
piecewise cubic interpolant and then call CSVAL to evaluate the interpolant. A more complicated
situation arises if one wants to compute a quadratic spline interpolant and then evaluate it
(efficiently) many times. Typically, the sequence of routines called might be BSNAK (get the
knots), BSINT (returns the B-spline coefficients of the interpolant), BSCPP (convert to pp form),
and PPVAL (evaluate). The last two calls could be replaced by a call to the B-spline grid evaluator
BS1GD, or the last call could be replaced with pp grid evaluator PP1GD. The interconnection of the
spline routines is summarized in Figure 3-2.
[Figure 3-2: Interconnection of the spline routines. Nodes between DATA and OUT: CSINT, CSDEC, CSHER, CSAKM, CSCON, CSPER, CSSMH, BSLSQ, BSVLS, CONFT, BSNAK, BSOPK, CSSCV, BSINT, BSCPP, CSVAL, CSDER, CSITG, PPVAL, PPDER, PPITG, PP1GD, BSVAL, BSDER, BSITG, BS1GD, CS1GD.]
[Figure: INTERPOLATION routine chart. Labels: univariate, multivariate, scattered data, shape preserving, periodic, tensor product, 2D, 3D, derivatives. Routines: CSAKM, CSCON, SURF, CSPER, CSHER, BS2IN, QD2VL, QD2DR, CSIEZ, CSINT, CSDEC, SPLEZ, BSINT, QDVAL, QDDER, BS3IN, QD3VL, QD3DR.]
SPLINE_CONSTRAINTS
This function returns the derived type array result, ?_SPLINE_CONSTRAINTS, given optional
input. There are optional arguments for the derivative index, the value applied to the spline, and
the periodic point for any periodic constraint.
The function is used, for entry number j,
?_SPLINE_CONSTRAINTS(J) = &
SPLINE_CONSTRAINTS([DERIVATIVE=DERIVATIVE_INDEX,] &
POINT = WHERE_APPLIED, [VALUE=VALUE_APPLIED,] &
TYPE = CONSTRAINT_INDICATOR, &
[PERIODIC_POINT = VALUE_APPLIED])
The square brackets enclose optional arguments. For each constraint either (but not both) the
VALUE = or the PERIODIC_POINT = optional arguments must be present.
Required Arguments
POINT = WHERE_APPLIED (Input)
The point in the data domain where a constraint is to be applied.
TYPE = CONSTRAINT_INDICATOR (Input)
The indicator for the type of constraint the spline function or its derivatives is to
satisfy at the point: where_applied. The choices are the character strings
'==', '<=', '>=', '.=.', and '.=-'. They respectively indicate that the
spline value or its derivatives will be equal to, not greater than, not less than,
equal to the value of the spline at another point, or equal to the negative of the
spline value at another point. These last two constraints are called periodic and
negative-periodic, respectively. The alternate independent variable point is
value_applied for either periodic constraint. There is a use of periodic
constraints in .
Optional Arguments
DERIVATIVE = DERIVATIVE_INDEX (Input)
This is the order of the derivative of the spline to which the constraint applies. The
value 0 corresponds to the function, the value 1 to the first derivative, etc. If this
argument is not present in the list, the value 0 is substituted automatically. Thus
a constraint without the derivative listed applies to the spline function.
PERIODIC_POINT = VALUE_APPLIED
FORTRAN 90 Interface
Generic:
Specific:
SPLINE_VALUES
This rank-1 array function returns an array result, given an array of input. Use the optional
argument for the covariance matrix when the square root of the variance function is required. The
result will be a scalar value when the input variable is scalar.
Required Arguments
DERIVATIVE = DERIVATIVE (Input)
The index of the derivative evaluated. Use non-negative integer values. For the
function itself use the value 0.
VARIABLES = VARIABLES (Input)
The independent variable values where the spline or its derivatives are
evaluated. Either a rank-1 array or a scalar can be used as this argument.
KNOTS = KNOTS (Input)
The derived type ?_spline_knots, used when the array COEFFS was obtained
with the function SPLINE_FITTING. This contains the polynomial spline
degree and the number of knots and the knots themselves for this spline
function.
COEFFS = C (Input)
$$f(x) = \sum_{j=1}^{N} c_j B_j(x).$$
Optional Arguments
COVARIANCE = G (Input)
This argument, when present, results in the evaluation of the square root of the
variance function
$$e(x) = \left( b(x)^T G\, b(x) \right)^{1/2}$$
where
$$b(x) = \left[ B_1(x), \ldots, B_N(x) \right]^T$$
and G is the covariance matrix associated with the coefficients of the spline
$$c = \left[ c_1, \ldots, c_N \right]^T$$
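The quadratic form behind the variance function can be sketched in a few lines of standard Fortran. The module and function names are hypothetical, and the caller is assumed to supply the vector b of B-spline basis values at the evaluation point; this illustrates the formula only and is not how SPLINE_VALUES is implemented.

```fortran
module spline_variance_sketch
  implicit none
  private
  public :: root_variance
contains
  ! e = sqrt(b^T G b) for basis values b(1:N) and an N-by-N covariance matrix g.
  pure function root_variance(b, g) result(e)
    real, intent(in) :: b(:), g(:,:)
    real :: e
    ! max() guards against a tiny negative quadratic form caused by rounding.
    e = sqrt(max(0.0, dot_product(b, matmul(g, b))))
  end function root_variance
end module spline_variance_sketch
```

With a diagonal covariance matrix the result reduces to the familiar root-sum-of-squares of the basis values weighted by the coefficient standard deviations.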
FORTRAN 90 Interface
Generic:
Specific:
SPLINE_FITTING
Weighted least-squares fitting by B-splines to discrete one-dimensional data is performed.
Constraints on the spline or its derivatives are optional. The spline function
$$f(x) = \sum_{j=1}^{N} c_j B_j(x),$$
its derivatives, or the square root of its variance function are evaluated after the fitting.
Required Arguments
DATA = DATA(1:3,:) (Input/Output)
An assumed-shape array with size(data,1) = 3. The data are placed in the array:
data(1,i) = $x_i$, data(2,i) = $y_i$, and data(3,i) = $\sigma_i$, $i = 1, \ldots, ndata$. If the
variances are not known but are proportional to an unknown value, users may set
data(3,i) = 1, $i = 1, \ldots, ndata$.
KNOTS = KNOTS (Input)
A derived type, ?_spline_knots, that defines the degree of the spline and the
Optional Arguments
CONSTRAINTS = SPLINE_CONSTRAINTS (Input)
A rank-1 array of derived type ?_spline_constraints that give constraints the
An assumed-shape rank-2 array of the same precision as the data. This output is the
covariance matrix of the coefficients. It is optionally used to evaluate the square root
of the variance function.
IOPT = IOPT(:) (Input/Output)
Derived type array with the same precision as the input array; used for passing optional
data to SPLINE_FITTING. The options are as follows:
Packaged Options for SPLINE_FITTING
Option Name
Prefix = None
Option Value
SPLINE_FITTING_TOL_EQUAL
SPLINE_FITTING_TOL_LEAST
IOPT(IO) = ?_OPTIONS(SPLINE_FITTING_TOL_EQUAL, ?_VALUE)
This resets the value for determining that equality constraint equations are rank-deficient.
The default is ?_value = $10^{-4}$.
IOPT(IO) = ?_OPTIONS(SPLINE_FITTING_TOL_LEAST, ?_VALUE)
This resets the value for determining that least-squares equations are rank-deficient.
The default is ?_value = $10^{-4}$.
FORTRAN 90 Interface
Generic:
Specific:
Description
This routine has similar scope to CONFT found in IMSL (2003, pp 734-743). We provide the
square root of the variance function, but we do not provide for constraints on the integral of the
spline. The least-squares matrix problem for the coefficients is banded, with band-width equal to
the spline order. This fact is used to obtain an efficient solution algorithm when there are no
constraints. When constraints are present the routine solves a linear least-squares problem with
equality and inequality constraints. The processed least-squares equations result in a banded and
upper triangular matrix, following accumulation of the spline fitting equations. The algorithm
used for solving the constrained least-squares system will handle rank-deficient problems. A set
of references is available in Hanson (1995) and Lawson and Hanson (1995). The CONFT routine
uses QPROG (loc. cit., p. 959), which requires that the least-squares equations be of full rank.
$$\frac{d^2 f}{dx^2}(x_i) = \frac{d^2 g}{dx^2}(x_i), \quad i = 0 \text{ and } ndata$$
Our program checks the term const. appearing in the maximum truncation error term
$$\text{error} \approx \text{const.} \times \Delta x^4$$
at a finer grid.
USE spline_fitting_int
USE show_int
USE norm_int
implicit none
! This is Example 1 for SPLINE_FITTING, Natural Spline
! Interpolation using cubic splines. Use the function
! exp(-x**2/2) to generate samples.
integer :: i
integer, parameter :: ndata=24, nord=4, ndegree=nord-1, &
nbkpt=ndata+2*ndegree, ncoeff=nbkpt-nord, nvalues=2*ndata
real(kind(1e0)), parameter :: zero=0e0, one=1e0, half=5e-1
real(kind(1e0)), parameter :: delta_x=0.15, delta_xv=0.4*delta_x
real(kind(1e0)), target :: xdata(ndata), ydata(ndata), &
spline_data (3, ndata), bkpt(nbkpt), &
ycheck(nvalues), coeff(ncoeff), &
xvalues(nvalues), yvalues(nvalues), diffs
real(kind(1e0)), pointer :: pointer_bkpt(:)
type (s_spline_knots) break_points
type (s_spline_constraints) constraints(2)
xdata = (/((i-1)*delta_x, i=1,ndata)/)
ydata = exp(-half*xdata**2)
xvalues =(/(0.03+(i-1)*delta_xv,i=1,nvalues)/)
ycheck= exp(-half*xvalues**2)
spline_data(1,:)=xdata
spline_data(2,:)=ydata
spline_data(3,:)=one
! Define the knots for the interpolation problem.
bkpt(1:ndegree) = (/(i*delta_x, i=-ndegree,-1)/)
bkpt(nord:nbkpt-ndegree) = xdata
bkpt(nbkpt-ndegree+1:nbkpt) = &
(/(xdata(ndata)+i*delta_x, i=1,ndegree)/)
! Assign the degree of the polynomial and the knots.
pointer_bkpt => bkpt
break_points=s_spline_knots(ndegree, pointer_bkpt)
656 Chapter 3: Interpolation and Approximation
Output
Example 1 for SPLINE_FITTING is correct.
Additional Examples
Example 2: Shaping a Curve and its Derivatives
The function
$$g(x) = \exp(-x^2/2)\,(1 + \text{noise})$$
is fit by cubic splines on the grid of equally spaced points
$$x_i = (i - 1)\,\Delta x, \quad i = 1, \ldots, ndata$$
The term noise is a uniform random number from the normalized interval
$[-\tau, \tau]$, where $\tau = 0.01$. The spline curve is constrained to be convex down for $0 \le x \le 1$,
convex upward for $1 < x \le 4$, and have the second derivative exactly equal to the value zero at
x = 1. The first derivative is constrained with the value zero at x = 0 and is non-negative at the
right end of the interval, x = 4. A sample table of independent variables, second derivatives and
square root of variance function values is printed.
use spline_fitting_int
use show_int
use rand_int
use norm_int
implicit none
! This is Example 2 for SPLINE_FITTING. Use 1st and 2nd derivative
! constraints to shape the splines.
integer :: i, icurv
integer, parameter :: nbkptin=13, nord=4, ndegree=nord-1, &
nbkpt=nbkptin+2*ndegree, ndata=21, ncoeff=nbkpt-nord
real(kind(1e0)), parameter :: zero=0e0, one=1e0, half=5e-1
real(kind(1e0)), parameter :: range=4.0, ratio=0.02, tol=ratio*half
real(kind(1e0)), parameter :: delta_x=range/(ndata-1), &
delta_b=range/(nbkptin-1)
real(kind(1e0)), target :: xdata(ndata), ydata(ndata), ynoise(ndata),&
sddata(ndata), spline_data (3, ndata), bkpt(nbkpt), &
values(ndata), derivat1(ndata), derivat2(ndata), &
coeff(ncoeff), root_variance(ndata), diffs
real(kind(1e0)), dimension(ncoeff,ncoeff) :: sigma_squared
real(kind(1e0)), pointer :: pointer_bkpt(:)
type (s_spline_knots) break_points
type (s_spline_constraints) constraints(nbkptin+2)
xdata = (/((i-1)*delta_x, i=1,ndata)/)
ydata = exp(-half*xdata**2)
ynoise = ratio*ydata*(rand(ynoise)-half)
ydata = ydata+ynoise
sddata = ynoise
spline_data(1,:)=xdata
spline_data(2,:)=ydata
spline_data(3,:)=sddata
bkpt=(/((i-nord)*delta_b, i=1,nbkpt)/)
! Assign the degree of the polynomial and the knots.
pointer_bkpt => bkpt
break_points=s_spline_knots(ndegree, pointer_bkpt)
icurv=int(one/delta_b)+1
! At first shape the curve to be convex down.
do i=1,icurv-1
constraints(i)=spline_constraints &
(derivative=2, point=bkpt(i+ndegree), type='<=', value=zero)
end do
! Force a curvature change.
constraints(icurv)=spline_constraints &
(derivative=2, point=bkpt(icurv+ndegree), type='==', value=zero)
! Finally, shape the curve to be convex up.
do i=icurv+1,nbkptin
constraints(i)=spline_constraints &
(derivative=2, point=bkpt(i+ndegree), type='>=', value=zero)
end do
! Make the slope zero and value non-negative at right.
constraints(nbkptin+1)=spline_constraints &
(derivative=1, point=bkpt(nord), type='==', value=zero)
constraints(nbkptin+2)=spline_constraints &
(derivative=0, point=bkpt(nbkptin+ndegree), type='>=', value=zero)
Output
Example 2 for SPLINE_FITTING is correct.
$$q(x) = \int_{-1}^{x} f(t)\,dt$$
Gauss-Legendre quadrature formulas, IMSL (1994, pp. 621-626), of order two are used on each
polynomial piece of f(t) to evaluate q(x) cheaply. After normalizing the cubic spline so that
q(1) = 1, we may then generate random numbers according to the distribution $f(x) \approx g(x)$. The
values of x are evaluated by solving q(x) = u, -1 < x < 1. Here u is a uniform random sample.
Newton's method, for a vector of unknowns, is used for the solution algorithm. Recalling the
relation
$$\frac{d}{dx}\bigl(q(x) - u\bigr) = f(x), \quad -1 < x < 1$$
we believe this illustrates a method for generating a vector of random numbers according to a
continuous distribution function having finite support.
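The Newton iteration for q(x) = u can be sketched without the library. The module below assumes a hypothetical closed-form density f(x) = 0.75(1 - x**2) on (-1, 1) in place of the fitted cubic spline, so that q(x) is available analytically; the module and function names and the clamping bound are assumptions made for this illustration only.

```fortran
module inverse_cdf_sketch
  implicit none
  private
  public :: density, cdf, sample_from_uniform
contains
  ! Hypothetical density on (-1, 1); it integrates to 1 over that interval.
  elemental function density(x) result(f)
    real, intent(in) :: x
    real :: f
    f = 0.75*(1.0 - x*x)
  end function density

  ! q(x) = integral of the density from -1 to x, in closed form.
  elemental function cdf(x) result(q)
    real, intent(in) :: x
    real :: q
    q = 0.5 + 0.75*(x - x**3/3.0)
  end function cdf

  ! Solve q(x) = u for every entry of u by Newton's method, using q'(x) = f(x).
  function sample_from_uniform(u) result(x)
    real, intent(in) :: u(:)
    real :: x(size(u))
    real, parameter :: bound = 0.999999
    integer :: iter
    x = 0.0                                  ! start every unknown at the midpoint
    do iter = 1, 50
      x = x - (cdf(x) - u)/density(x)        ! vector Newton update
      x = max(-bound, min(bound, x))         ! keep iterates inside the support
    end do
  end function sample_from_uniform
end module inverse_cdf_sketch
```

Passing a vector of uniform samples in (0, 1) returns a vector of the same length distributed according to the density. The fixed iteration count and the clamp are simple safeguards; a production version would test convergence instead.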
use spline_fitting_int
use linear_operators
use Numerical_Libraries
implicit none
Output
Example 3 for SPLINE_FITTING is correct.
Output
Example 4 for SPLINE_FITTING is correct.
SURFACE_CONSTRAINTS
To further shape a surface defined by a tensor product of B-splines, the routine SURFACE_FITTING
will least squares fit data with equality, inequality and periodic constraints. These can apply to the
surface function or its partial derivatives. Each constraint is packaged in the derived type
?_SURFACE_CONSTRAINTS. This function uses the data consisting of: the place where the
constraint is to hold, the partial derivative indices, and the type of the constraint. This object is
returned as the derived type function result ?_SURFACE_CONSTRAINTS. The function itself has
two required and two optional arguments. In a list of constraints, the j-th item will be:
?_SURFACE_CONSTRAINTS(j) = &
SURFACE_CONSTRAINTS&
([DERIVATIVE=DERIVATIVE_INDEX(1:2),] &
POINT = WHERE_APPLIED(1:2),[VALUE=VALUE_APPLIED,] &
TYPE = CONSTRAINT_INDICATOR, &
[PERIODIC_POINT = PERIODIC_POINT(1:2)])
The square brackets enclose optional arguments. For each constraint the arguments VALUE =
and PERIODIC_POINT = are not used at the same time.
Required Arguments
POINT = WHERE_APPLIED (Input)
The point in the data domain where a constraint is to be applied. Each point has
an x and y coordinate, in that order.
TYPE = CONSTRAINT_INDICATOR (Input)
The indicator for the type of constraint the tensor product spline function or its
partial derivatives is to satisfy at the point: where_applied. The choices are
the character strings '==', '<=', '>=', '.=.', and '.=-'. They
respectively indicate that the spline value or its derivatives will be equal to, not
greater than, not less than, equal to the value of the spline at another point, or
equal to the negative of the spline value at another point. These last two
constraints are called periodic and negative-periodic, respectively.
Optional Arguments
DERIVATIVE = DERIVATIVE_INDEX(1:2) (Input)
These are the orders of the partial derivatives of the tensor product spline to which
the constraint applies. The array (/0,0/) corresponds to the function, the value
(/1,0/) to the first partial derivative with respect to x, etc. If this argument is
not present in the list, the value (/0,0/) is substituted automatically. Thus a
constraint without the derivatives listed applies to the tensor product spline
function.
PERIODIC = PERIODIC_POINT(1:2)
FORTRAN 90 Interface
Generic:
Specific:
SURFACE_VALUES
This rank-2 array function returns a tensor product array result, given two arrays of independent
variable values. Use the optional input argument for the covariance matrix when the square root
of the variance function is evaluated. The result will be a scalar value when the input independent
variable is scalar.
Required Arguments
DERIVATIVE = DERIVATIVE(1:2) (Input)
The indices of the partial derivative evaluated. Use non-negative integer values.
For the function itself use the array (/0,0/).
VARIABLESX = VARIABLESX (Input)
The independent variable values in the first or x dimension where the spline or
its derivatives are evaluated. Either a rank-1 array or a scalar can be used as this
argument.
VARIABLESY = VARIABLESY (Input)
The independent variable values in the second or y dimension where the spline
or its derivatives are evaluated. Either a rank-1 array or a scalar can be used as
this argument.
KNOTSX = KNOTSX (Input)
The derived type ?_spline_knots, used when the array coeffs(:,:)was
obtained with the function SURFACE_FITTING. This contains the polynomial
spline degree and the number of knots and the knots themselves, in the x
dimension.
KNOTSY = KNOTSY (Input)
The derived type ?_spline_knots, used when the array coeffs(:,:) was
obtained with the function SURFACE_FITTING. This contains the polynomial
spline degree and the number of knots and the knots themselves, in the y
dimension.
COEFFS = C (Input)
$$f(x, y) = \sum_{j=1}^{N} \sum_{i=1}^{M} c_{ij} B_i(y) B_j(x)$$
The values M = size (C,1) and N = size (C,2) satisfy the respective identities
N - 1 + spline_degree = size (?_knotsx), and
M - 1 + spline_degree = size (?_knotsy), where the two right-most quantities in
both equations refer to components of the arguments knotsx and knotsy. The
same value of spline_degree must be used for both knotsx and knotsy.
Optional Arguments
COVARIANCE = G (Input)
This argument, when present, results in the evaluation of the square root of the
variance function
$$e(x, y) = \left( b(x, y)^T G\, b(x, y) \right)^{1/2}$$
where
$$b(x, y) = \left[ B_1(x) B_1(y), \ldots, B_N(x) B_1(y), \ldots \right]^T$$
and G is the covariance matrix associated with the coefficients of the spline
$$c = \left[ c_{11}, \ldots, c_{N1}, \ldots \right]^T$$
FORTRAN 90 Interface
Generic:
Specific:
SURFACE_FITTING
Weighted least-squares fitting by tensor product B-splines to discrete two-dimensional data is
performed. Constraints on the spline or its partial derivatives are optional. The spline function
$$f(x, y) = \sum_{j=1}^{N} \sum_{i=1}^{M} c_{ij} B_i(y) B_j(x),$$
its derivatives, or the square root of its variance function are evaluated after the fitting.
Required Arguments
DATA = DATA(1:4,:) (Input/Output)
An assumed-shape array with size(data,1) = 4. The data are placed in the array:
data(1,i) = $x_i$,
data(2,i) = $y_i$,
data(3,i) = $z_i$,
data(4,i) = $\sigma_i$, $i = 1, \ldots, ndata$.
If the variances are not known, but are proportional to an unknown value, use
data(4,i) = 1, i = 1,..., ndata .
KNOTSX = KNOTSX (Input)
A derived type, ?_SPLINE_KNOTS, that defines the degree of the spline and the
Optional Arguments
CONSTRAINTS = SURFACE_CONSTRAINTS (Input)
A rank-1 array of derived type ?_SURFACE_CONSTRAINTS that defines constraints the
An assumed-shape rank-2 array of the same precision as the data. This output is the
covariance matrix of the coefficients. It is optionally used to evaluate the square root
of the variance function.
IOPT = IOPT(:) (Input/Output)
Derived type array with the same precision as the input array; used for passing optional
data to SURFACE_FITTING. The options are as follows:
Packaged Options for SURFACE_FITTING
Prefix = None
Option Name
Option Value
SURFACE_FITTING_SMALLNESS
SURFACE_FITTING_FLATNESS
SURFACE_FITTING_TOL_EQUAL
SURFACE_FITTING_TOL_LEAST
SURFACE_FITTING_RESIDUALS
SURFACE_FITTING_THINNESS
IOPT(IO) = ?_OPTIONS&
(SURFACE_FITTING_SMALLNESS, ?_VALUE)
This resets the square root of the regularizing parameter multiplying the squared
integral of the unknown function. The argument ?_value is replaced by the default
value. The default is ?_value = 0.
IOPT(IO) = ?_OPTIONS&
(SURFACE_FITTING_FLATNESS, ?_VALUE)
This resets the square root of the regularizing parameter multiplying the squared
integral of the partial derivatives of the unknown function. The argument ?_VALUE
is replaced by the default value. The default is
?_VALUE = SQRT(EPSILON(?_VALUE))*SIZE, where
size = | data (3,:) / data(4,:) | / ( ndata + 1) .
IOPT(IO) = ?_OPTIONS&
(SURFACE_FITTING_TOL_EQUAL, ?_VALUE)
This resets the value for determining that equality constraint equations are rank-deficient.
The default is ?_VALUE = $10^{-4}$.
IOPT(IO) = ?_OPTIONS&
(SURFACE_FITTING_TOL_LEAST, ?_VALUE)
This resets the value for determining that least-squares equations are rank-deficient.
The default is ?_VALUE = $10^{-4}$.
IOPT(IO) = ?_OPTIONS&
(SURFACE_FITTING_RESIDUALS, DUMMY)
This option returns the residuals = surface - data, in data(4,:). That row of the
array is overwritten by the residuals. The data is returned in the order of cell
processing order, or left-to-right in x and then increasing in y. The allocation of a
temporary for data(1:4,:) is avoided, which may be desirable for problems with
large amounts of data. The default is to not evaluate the residuals and to leave
data(1:4,:) as input.
IOPT(IO) = ?_OPTIONS&
(SURFACE_FITTING_PRINT, DUMMY)
This option prints the knots or breakpoints for x and y, and the count of data points in
cell processing order. The default is to not print these arrays.
IOPT(IO) = ?_OPTIONS&
(SURFACE_FITTING_THINNESS, ?_VALUE)
This resets the square root of the regularizing parameter multiplying the squared
integral of the second partial derivatives of the unknown function. The argument
?_VALUE is replaced by the default value. The default is ?_VALUE = $10^{-3} \times$ SIZE,
where
size = | data (3,:) / data(4,:) | / ( ndata + 1) .
FORTRAN 90 Interface
Generic:
Specific:
Description
The coefficients are obtained by solving a least-squares system of linear algebraic equations,
subject to linear equality and inequality constraints. The system is the result of the weighted data
equations and regularization. If there are no constraints, the solution is computed using a banded
least-squares solver. Details are found in Hanson (1995).
[0, 2] [0, 2]
There are ndata random pairs of values for the independent variables. Each datum is given unit
uncertainty. The grid of knots in both x and y dimensions are equally spaced, in the interior cells,
and identical to each other. After the coefficients are computed a check is made that the surface
approximately agrees with g(x,y) at a tensor product grid of equally spaced values.
USE surface_fitting_int
USE rand_int
USE norm_int
implicit none
! This is Example 1 for SURFACE_FITTING, tensor product
Output
Example 1 for SURFACE_FITTING is correct.
Additional Examples
Example 2: Parametric Representation of a Sphere
From Struik (1961), the parametric representation of points (x,y,z) on the surface of a sphere of
radius a > 0 is expressed in terms of spherical coordinates,
$$x(u, v) = a \cos(u) \cos(v), \quad -\pi \le 2u \le \pi$$
$$y(u, v) = a \cos(u) \sin(v), \quad -\pi \le v \le \pi$$
$$z(u, v) = a \sin(u)$$
The parameters are radians of latitude (u) and longitude (v). The example program fits the same
ndata random pairs of latitude and longitude in each coordinate. We have covered the sphere
twice by allowing
$$-\pi \le u \le \pi$$
for latitude. We solve three data fitting problems, one for each coordinate function. Periodic
constraints on the value of the spline are used for both u and v. We could reduce the
computational effort by fitting a spline function in one variable for the z coordinate. To illustrate
the representation of more general surfaces than spheres, we did not do this. When the surface is
evaluated we compute latitude, moving from the South Pole to the North Pole,
$$-\pi \le 2u \le \pi$$
These residuals are checked at a rectangular mesh of latitude and longitude pairs. To illustrate the
use of some options, we have reset the three regularization parameters to the value zero, the
least-squares system tolerance to a smaller value than the default, and obtained the residuals for each
parametric coordinate function at the data points.
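The three coordinate functions being fitted can be written as one small function. This is a sketch in standard Fortran only; the names sphere_param_sketch and sphere_point are hypothetical and are not part of the example program.

```fortran
module sphere_param_sketch
  implicit none
  private
  public :: sphere_point
contains
  ! Cartesian point (x, y, z) on a sphere of radius a
  ! for latitude u and longitude v, both in radians.
  pure function sphere_point(a, u, v) result(p)
    real, intent(in) :: a, u, v
    real :: p(3)
    p(1) = a*cos(u)*cos(v)
    p(2) = a*cos(u)*sin(v)
    p(3) = a*sin(u)
  end function sphere_point
end module sphere_param_sketch
```

Every returned point has Euclidean length a, which gives a convenient check on any fitted surface: the sum of squares of the three fitted coordinate functions should stay close to a**2.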
USE surface_fitting_int
USE rand_int
USE norm_int
USE Numerical_Libraries
implicit none
Output
Example 2 for SURFACE_FITTING is correct.
Output
Example 3 for SURFACE_FITTING is correct.
USE surface_fitting_int
USE rand_int
USE norm_int
implicit none
 3.961 , -14.418, 3.5 ,  7.45 ,  12.003, 2.5,&
 7.35  ,   6.012, 3.5 ,  7.251,   0.018, 3.0,&
 7.151 ,  -5.973, 2.0 ,  7.051, -11.967, 2.5,&
 10.901,   9.015, 2.0 , 10.751,   4.536, 1.925,&
 10.602,   0.06 , 1.85, 10.453,  -4.419, 1.576,&
 10.304,  -8.895, 1.7 , 14.055,  10.509, 1.5,&
 14.194,   6.783, 1.3 , 14.331,   3.054, 1.7,&
 14.469,  -0.672, 2.1 , 14.607,  -4.398, 1.75,&
 15.0  ,  12.0  , 0.5 , 15.729,   8.067, 0.5,&
 16.457,   4.134, 0.7 , 17.185,   0.198, 1.1,&
 17.914,  -3.735, 1.7/)
spline_data(1:3,:)=reshape(data,(/3,ndata/)); spline_data(4,:)=one
! Define the knots for the tensor product data fitting problem.
! Use the data limits to the knot sequences.
bkptx(1:ndegree) = minval(spline_data(1,:))
bkptx(nbkpt-ndegree+1:nbkpt) = maxval(spline_data(1,:))
delta=(bkptx(nbkpt)-bkptx(ndegree))/(ngrid-1)
bkptx(nord:nbkpt-ndegree)=(/(bkptx(1)+i*delta,i=0,ngrid-1)/)
! Assign the degree of the polynomial and the knots for x.
pointer_bkpt => bkptx
knotsx=d_spline_knots(ndegree, pointer_bkpt)
bkpty(1:ndegree) = minval(spline_data(2,:))
bkpty(nbkpt-ndegree+1:nbkpt) = maxval(spline_data(2,:))
delta=(bkpty(nbkpt)-bkpty(ndegree))/(ngrid-1)
bkpty(nord:nbkpt-ndegree)=(/(bkpty(1)+i*delta,i=0,ngrid-1)/)
! Assign the degree of the polynomial and the knots for y.
pointer_bkpt => bkpty
knotsy=d_spline_knots(ndegree, pointer_bkpt)
! Fit the data and obtain the coefficients.
coeff = surface_fitting(spline_data, knotsx, knotsy)
delta=(bkptx(nbkpt)-bkptx(1))/(nvalues+1)
x=(/(bkptx(1)+i*delta,i=1,nvalues)/)
delta=(bkpty(nbkpt)-bkpty(1))/(nvalues+1)
y=(/(bkpty(1)+i*delta,i=1,nvalues)/)
! Evaluate the function at a rectangular grid.
! Use non-positive values to a constraint.
values=surface_values((/0,0/), x, y, knotsx, knotsy, coeff)
! Count the number of values <= zero. Then constrain the spline
! so that it is >= TOLERANCE at those points where it was <= zero.
q=count(values <= zero)
allocate (C(q))
DO I=1,nvalues
DO J=1,nvalues
IF(values(I,J) <= zero) THEN
C(q)=surface_constraints(point=(/x(i),y(j)/), type='>=',&
value=TOLERANCE)
q=q-1
END IF
END DO
END DO
! Fit the data with constraints and obtain the coefficients.
coeff = surface_fitting(spline_data, knotsx, knotsy,&
CONSTRAINTS=C)
deallocate(C)
! Evaluate the surface at a grid and check, once again, for
! non-positive values. All values should now be positive.
values=surface_values((/0,0/), x, y, knotsx, knotsy, coeff)
if (count(values <= zero) == 0) then
write(*,*) 'Example 4 for SURFACE_FITTING is correct.'
end if
end
Output
Example 4 for SURFACE_FITTING is correct.
CSIEZ
Computes the cubic spline interpolant with the not-a-knot condition and returns values of the
interpolant at specified points.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
The data point abscissas must be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
XVEC Array of length N containing the points at which the spline is to be evaluated.
(Input)
VALUE Array of length N containing the values of the spline at the points in XVEC.
(Output)
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least 2.
Default: NDATA = size (XDATA,1).
N Length of vector XVEC. (Input)
Default: N = size (XVEC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
This routine is designed to let the user easily compute the values of a cubic spline interpolant. The
routine CSIEZ computes a spline interpolant to a set of data points $(x_i, f_i)$ for $i = 1, \ldots,$ NDATA. The
output for this routine consists of a vector of values of the computed cubic spline. Specifically, let
n = N, v = XVEC, and y = VALUE, then if s is the computed spline we set
$$y_j = s(v_j), \quad j = 1, \ldots, n$$
Additional documentation can be found by referring to the IMSL routines CSINT or SPLEZ.
Comments
Workspace may be explicitly provided, if desired, by use of C2IEZ/DC2IEZ. The reference is:
CALL C2IEZ (NDATA, XDATA, FDATA, N, XVEC, VALUE, IWK, WK1, WK2)
Example
In this example, a cubic spline interpolant to a function F is computed. The values of this spline
are then compared with the exact function values.
USE CSIEZ_INT
USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=11)
      INTEGER    I, NOUT
      REAL
Output
        X    INTERPOLANT        ERROR
    0.000          0.000     0.000000
    0.050          0.809    -0.127025
    0.100          0.997     0.000000
    0.150          0.723     0.055214
    0.200          0.141     0.000000
    0.250         -0.549    -0.022789
    0.300         -0.978     0.000000
    0.350         -0.843    -0.016246
    0.400         -0.279     0.000000
    0.450          0.441     0.009348
    0.500          0.938     0.000000
    0.550          0.903     0.019947
    0.600          0.412     0.000000
    0.650         -0.315    -0.004895
    0.700         -0.880     0.000000
    0.750         -0.938    -0.029541
    0.800         -0.537     0.000000
    0.850          0.148     0.034693
    0.900          0.804     0.000000
    0.950          1.086    -0.092559
    1.000          0.650     0.000000
CSINT
Computes the cubic spline interpolant with the not-a-knot condition.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
The data point abscissas must be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
BREAK Array of length NDATA containing the breakpoints for the piecewise cubic
representation. (Output)
CSCOEF Matrix of size 4 by NDATA containing the local coefficients of the cubic pieces.
(Output)
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least 2.
Default: NDATA = size (XDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSINT computes a $C^2$ cubic spline interpolant to a set of data points $(x_i, f_i)$ for
$i = 1, \ldots,$ NDATA = N. The breakpoints of the spline are the abscissas. Endpoint conditions are
automatically determined by the program. These conditions correspond to the not-a-knot
condition (see de Boor 1978), which requires that the third derivative of the spline be continuous
at the second and next-to-last breakpoint. If N is 2 or 3, then the linear or quadratic interpolating
polynomial is computed, respectively.
If the data points arise from the values of a smooth (say $C^4$) function f, i.e. $f_i = f(x_i)$, then the error
will behave in a predictable fashion. Let $\xi$ be the breakpoint vector for the above spline
interpolant. Then, the maximum absolute error satisfies
$$\| f - s \|_{[\xi_1, \xi_N]} \le C \, \| f^{(4)} \|_{[\xi_1, \xi_N]} \, |\xi|^4$$
where
$$|\xi| := \max_{i = 2, \ldots, N} | \xi_i - \xi_{i-1} |$$
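As a worked illustration of the fourth-order behavior, consider the setting of the example in this section: f(x) = sin(15x) sampled at NDATA = 11 equally spaced points on [0, 1]. The constant C is left unspecified, as in the bound itself, and $\xi$ denotes the breakpoint vector with largest spacing $|\xi|$, so the numbers below show only how the bound scales and are not error predictions.

```latex
% f(x) = sin(15x) gives f^{(4)}(x) = 15^4 sin(15x), whose maximum on [0,1] is 15^4 = 50625.
\[
  |\xi| = \tfrac{1}{10}: \qquad
  \| f - s \| \;\le\; C \cdot 50625 \cdot \left(\tfrac{1}{10}\right)^{4} = 5.0625\, C .
\]
% Doubling the number of intervals (NDATA = 21) halves the mesh size:
\[
  |\xi| = \tfrac{1}{20}: \qquad
  \| f - s \| \;\le\; C \cdot 50625 \cdot \left(\tfrac{1}{20}\right)^{4} \approx 0.3164\, C ,
\]
% so the bound shrinks by the factor 2^4 = 16.
```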
Comments
1.
2. The cubic spline can be evaluated using CSVAL; its derivative can be evaluated using
CSDER.
3.
Example
In this example, a cubic spline interpolant to a function F is computed. The values of this spline
are then compared with the exact function values.
USE CSINT_INT
USE UMACH_INT
USE CSVAL_INT
      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=11)
!                                 Specifications
      INTEGER    I, NINTV, NOUT
      REAL       BREAK(NDATA), CSCOEF(4,NDATA), F,&
                 FDATA(NDATA), FLOAT, SIN, X, XDATA(NDATA)
      INTRINSIC  FLOAT, SIN
!                                 Define function
      F(X) = SIN(15.0*X)
!                                 Set up a grid
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I-1)/FLOAT(NDATA-1)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Compute cubic spline interpolant
      CALL CSINT (XDATA, FDATA, BREAK, CSCOEF)
!                                 Get output unit number.
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
99999 FORMAT (13X, 'X', 9X, 'Interpolant', 5X, 'Error')
      NINTV = NDATA - 1
!                                 Print the interpolant and the error
!                                 on a finer grid
      DO 20  I=1, 2*NDATA - 1
         X = FLOAT(I-1)/FLOAT(2*NDATA-2)
         WRITE (NOUT,'(2F15.3,F15.6)') X, CSVAL(X,BREAK,CSCOEF),&
                                       F(X) - CSVAL(X,BREAK,&
                                       CSCOEF)
   20 CONTINUE
      END
Output
        X    Interpolant        Error
    0.000          0.000     0.000000
    0.050          0.809    -0.127025
    0.100          0.997     0.000000
    0.150          0.723     0.055214
    0.200          0.141     0.000000
    0.250         -0.549    -0.022789
    0.300         -0.978     0.000000
    0.350         -0.843    -0.016246
    0.400         -0.279     0.000000
    0.450          0.441     0.009348
    0.500          0.938     0.000000
    0.550          0.903     0.019947
    0.600          0.412     0.000000
    0.650         -0.315    -0.004895
    0.700         -0.880     0.000000
    0.750         -0.938    -0.029541
    0.800         -0.537     0.000000
    0.850          0.148     0.034693
    0.900          0.804     0.000000
    0.950          1.086    -0.092559
    1.000          0.650     0.000000
CSDEC
Computes the cubic spline interpolant with specified derivative endpoint conditions.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input) The data
point abscissas must be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
ILEFT Type of end condition at the left endpoint. (Input)
Condition
ILEFT
Not-a-knot condition
Condition
Not-a-knot condition
Optional Arguments
NDATA Number of data points. (Input)
Default: NDATA = size (XDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSDEC computes a $C^2$ cubic spline interpolant to a set of data points $(x_i, f_i)$ for
$i = 1, \ldots, \text{NDATA} = N$. The breakpoints of the spline are the abscissas. Endpoint conditions are to be
selected by the user. The user may specify not-a-knot, first derivative, or second derivative at each
endpoint (see de Boor 1978, Chapter 4).
If the data (including the endpoint conditions) arise from the values of a smooth (say $C^4$) function
f, i.e. $f_i = f(x_i)$, then the error will behave in a predictable fashion. Let $\xi$ be the breakpoint vector
for the above spline interpolant. Then, the maximum absolute error satisfies
$$\|f - s\|_{[\xi_1, \xi_N]} \le C \, \|f^{(4)}\|_{[\xi_1, \xi_N]} \, |\xi|^4$$
where
$$|\xi| := \max_{i=2,\ldots,N} |\xi_i - \xi_{i-1}|$$
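As a rough illustration of what a bound of this kind implies, the following worked ratio treats the constant $C$ and the norm of $f^{(4)}$ as fixed and assumes the bound is of fourth order in the largest breakpoint spacing $|\xi|$ (the standard rate for cubic spline interpolation of $C^4$ data). Under those assumptions, halving the largest spacing reduces the bound by a factor of 16:

```latex
\frac{C \, \|f^{(4)}\| \, \left(|\xi|/2\right)^4}{C \, \|f^{(4)}\| \, |\xi|^4}
  = \frac{1}{2^4}
  = \frac{1}{16}
```

For the uniform grids used in the examples of this chapter, XDATA(I) = (I-1)/(NDATA-1), so $|\xi| = 1/(\text{NDATA}-1) = 0.1$ when NDATA = 11. The ratio describes the bound only; the observed error at a particular point may decrease faster or slower.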
Comments
1.
2.
The cubic spline can be evaluated using CSVAL; its derivative can be evaluated using
CSDER.
3.
Example 1
In Example 1, a cubic spline interpolant to a function f is computed. The value of the derivative at
the left endpoint and the value of the second derivative at the right endpoint are specified. The
values of this spline are then compared with the exact function values.
      USE CSDEC_INT
      USE UMACH_INT
      USE CSVAL_INT

      IMPLICIT   NONE
      INTEGER    ILEFT, IRIGHT, NDATA
      PARAMETER  (ILEFT=1, IRIGHT=2, NDATA=11)
!
      INTEGER    I, NINTV, NOUT
      REAL       BREAK(NDATA), COS, CSCOEF(4,NDATA), DLEFT,&
                 DRIGHT, F, FDATA(NDATA), FLOAT, SIN, X, XDATA(NDATA)
      INTRINSIC  COS, FLOAT, SIN
!                                 Define function
      F(X) = SIN(15.0*X)
!                                 Initialize DLEFT and DRIGHT
      DLEFT  = 15.0*COS(15.0*0.0)
      DRIGHT = -15.0*15.0*SIN(15.0*1.0)
!                                 Set up a grid
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I-1)/FLOAT(NDATA-1)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Compute cubic spline interpolant
      CALL CSDEC (XDATA, FDATA, ILEFT, DLEFT, IRIGHT, &
                  DRIGHT, BREAK, CSCOEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
99999 FORMAT (13X, 'X', 9X, 'Interpolant', 5X, 'Error')
      NINTV = NDATA - 1
!                                 Print the interpolant on a finer grid
      DO 20  I=1, 2*NDATA - 1
         X = FLOAT(I-1)/FLOAT(2*NDATA-2)
         WRITE (NOUT,'(2F15.3,F15.6)') X, CSVAL(X,BREAK,CSCOEF),&
                                       F(X) - CSVAL(X,BREAK,&
                                       CSCOEF)
   20 CONTINUE
      END
Output
             X         Interpolant     Error
          0.000          0.000       0.000000
          0.050          0.675       0.006332
          0.100          0.997       0.000000
          0.150          0.759       0.019485
          0.200          0.141       0.000000
          0.250         -0.558      -0.013227
          0.300         -0.978       0.000000
          0.350         -0.840      -0.018765
          0.400         -0.279       0.000000
          0.450          0.440       0.009859
          0.500          0.938       0.000000
          0.550          0.902       0.020420
          0.600          0.412       0.000000
          0.650         -0.312      -0.007301
          0.700         -0.880       0.000000
          0.750         -0.947      -0.020391
          0.800         -0.537       0.000000
          0.850          0.182       0.000497
          0.900          0.804       0.000000
          0.950          0.959       0.035074
          1.000          0.650       0.000000
Additional Examples
Example 2
In Example 2, we compute the natural cubic spline interpolant to a function f by forcing the
second derivative of the interpolant to be zero at both endpoints. As in the previous example, we
compare the exact function values with the values of the spline.
      USE CSDEC_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    ILEFT, IRIGHT, NDATA, NOUT
      PARAMETER  (ILEFT=2, IRIGHT=2, NDATA=11)
!
      INTEGER    I, NINTV
      REAL       BREAK(NDATA), CSCOEF(4,NDATA), DLEFT, DRIGHT,&
                 F, FDATA(NDATA), FLOAT, SIN, X, XDATA(NDATA), CSVAL
      INTRINSIC  FLOAT, SIN
!                                 Initialize DLEFT and DRIGHT
      DATA DLEFT/0./, DRIGHT/0./
!                                 Define function
      F(X) = SIN(15.0*X)
!                                 Set up a grid
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I-1)/FLOAT(NDATA-1)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Compute cubic spline interpolant
      CALL CSDEC (XDATA, FDATA, ILEFT, DLEFT, IRIGHT, DRIGHT,&
                  BREAK, CSCOEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
99999 FORMAT (13X, 'X', 9X, 'Interpolant', 5X, 'Error')
      NINTV = NDATA - 1
!                                 Print the interpolant on a finer grid
      DO 20  I=1, 2*NDATA - 1
         X = FLOAT(I-1)/FLOAT(2*NDATA-2)
         WRITE (NOUT,'(2F15.3,F15.6)') X, CSVAL(X,BREAK,CSCOEF),&
                                       F(X) - CSVAL(X,BREAK,&
                                       CSCOEF)
   20 CONTINUE
      END
Output
             X         Interpolant     Error
          0.000          0.000       0.000000
          0.050          0.667       0.015027
          0.100          0.997       0.000000
          0.150          0.761       0.017156
          0.200          0.141       0.000000
          0.250         -0.559      -0.012609
          0.300         -0.978       0.000000
          0.350         -0.840      -0.018907
          0.400         -0.279       0.000000
          0.450          0.440       0.009812
          0.500          0.938       0.000000
          0.550          0.902       0.020753
          0.600          0.412       0.000000
          0.650         -0.311      -0.008586
          0.700         -0.880       0.000000
          0.750         -0.952      -0.015585
          0.800         -0.537       0.000000
CSHER
Computes the Hermite cubic spline interpolant.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
The data point abscissas must be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
DFDATA Array of length NDATA containing the values of the derivative. (Input)
BREAK Array of length NDATA containing the breakpoints for the piecewise cubic
representation. (Output)
CSCOEF Matrix of size 4 by NDATA containing the local coefficients of the cubic pieces.
(Output)
Optional Arguments
NDATA Number of data points. (Input)
Default: NDATA = size (XDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
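The routine CSHER computes a $C^1$ cubic spline interpolant to the set of data points
$(x_i, f_i)$ and $(x_i, f'_i)$
for $i = 1, \ldots, \text{NDATA} = N$. The breakpoints of the spline are the abscissas.
If the data points arise from the values of a smooth (say $C^4$) function f, i.e.,
$f_i = f(x_i)$ and $f'_i = f'(x_i)$,
then the error will behave in a predictable fashion. Let $\xi$ be the breakpoint vector for the above
spline interpolant. Then, the maximum absolute error satisfies
$$\|f - s\|_{[\xi_1, \xi_N]} \le C \, \|f^{(4)}\|_{[\xi_1, \xi_N]} \, |\xi|^4$$
where
$$|\xi| := \max_{i=2,\ldots,N} |\xi_i - \xi_{i-1}|$$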
Comments
1.
2.
Informational error
Type
Code
4
3.
The cubic spline can be evaluated using CSVAL; its derivative can be evaluated using
CSDER.
4.
Example
In this example, a cubic spline interpolant to a function f is computed. The values of the function f
and its derivative f' are computed on the interpolation nodes and passed to CSHER. The values of
this spline are then compared with the exact function values.
      USE CSHER_INT
      USE UMACH_INT
      USE CSVAL_INT

      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=11)
!
      INTEGER    I, NINTV, NOUT
      REAL       BREAK(NDATA), COS, CSCOEF(4,NDATA), DF,&
                 DFDATA(NDATA), F, FDATA(NDATA), FLOAT, SIN, X,&
                 XDATA(NDATA)
      INTRINSIC  COS, FLOAT, SIN
!                                 Define function and derivative
      F(X)  = SIN(15.0*X)
      DF(X) = 15.0*COS(15.0*X)
!                                 Set up a grid
      DO 10  I=1, NDATA
         XDATA(I)  = FLOAT(I-1)/FLOAT(NDATA-1)
         FDATA(I)  = F(XDATA(I))
         DFDATA(I) = DF(XDATA(I))
   10 CONTINUE
!                                 Compute cubic spline interpolant
      CALL CSHER (XDATA, FDATA, DFDATA, BREAK, CSCOEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
99999 FORMAT (13X, 'X', 9X, 'Interpolant', 5X, 'Error')
      NINTV = NDATA - 1
!                                 Print the interpolant on a finer grid
      DO 20  I=1, 2*NDATA - 1
         X = FLOAT(I-1)/FLOAT(2*NDATA-2)
         WRITE (NOUT,'(2F15.3, F15.6)') X, CSVAL(X,BREAK,CSCOEF)&
                                        , F(X) - CSVAL(X,BREAK,&
                                        CSCOEF)
   20 CONTINUE
      END
Output
             X         Interpolant     Error
          0.000          0.000       0.000000
          0.050          0.673       0.008654
          0.100          0.997       0.000000
          0.150          0.768       0.009879
          0.200          0.141       0.000000
          0.250         -0.564      -0.007257
          0.300         -0.978       0.000000
          0.350         -0.848      -0.010906
          0.400         -0.279       0.000000
          0.450          0.444       0.005714
          0.500          0.938       0.000000
          0.550          0.911       0.011714
          0.600          0.412       0.000000
          0.650         -0.315      -0.004057
          0.700         -0.880       0.000000
          0.750         -0.956      -0.012288
          0.800         -0.537       0.000000
          0.850          0.180       0.002318
          0.900          0.804       0.000000
          0.950          0.981       0.012616
          1.000          0.650       0.000000
CSAKM
Computes the Akima cubic spline interpolant.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
The data point abscissas must be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
BREAK Array of length NDATA containing the breakpoints for the piecewise cubic
representation. (Output)
CSCOEF Matrix of size 4 by NDATA containing the local coefficients of the cubic pieces.
(Output)
Optional Arguments
NDATA Number of data points. (Input)
Default: NDATA = size (XDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSAKM computes a $C^1$ cubic spline interpolant to a set of data points $(x_i, f_i)$ for
$i = 1, \ldots, \text{NDATA} = N$. The breakpoints of the spline are the abscissas. Endpoint conditions are
automatically determined by the program; see Akima (1970) or de Boor (1978).
If the data points arise from the values of a smooth (say $C^4$) function f, i.e. $f_i = f(x_i)$, then the error
will behave in a predictable fashion. Let $\xi$ be the breakpoint vector for the above spline
interpolant. Then, the maximum absolute error satisfies
$$\|f - s\|_{[\xi_1, \xi_N]} \le C \, \|f^{(2)}\|_{[\xi_1, \xi_N]} \, |\xi|^2$$
where
$$|\xi| := \max_{i=2,\ldots,N} |\xi_i - \xi_{i-1}|$$
The routine CSAKM is based on a method by Akima (1970) to combat wiggles in the interpolant.
The method is nonlinear; and although the interpolant is a piecewise cubic, cubic polynomials are
not reproduced. (However, linear polynomials are reproduced.)
Comments
1.
2.
The cubic spline can be evaluated using CSVAL; its derivative can be evaluated using
CSDER.
3.
Example
In this example, a cubic spline interpolant to a function f is computed. The values of this spline are
then compared with the exact function values.
      USE CSAKM_INT
      USE UMACH_INT
      USE CSVAL_INT

      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=11)
!
      INTEGER    I, NINTV, NOUT
      REAL       BREAK(NDATA), CSCOEF(4,NDATA), F,&
Output
             X         Interpolant     Error
          0.000          0.000       0.000000
          0.050          0.818      -0.135988
          0.100          0.997       0.000000
          0.150          0.615       0.163487
          0.200          0.141       0.000000
          0.250         -0.478      -0.093376
          0.300         -0.978       0.000000
          0.350         -0.812      -0.046447
          0.400         -0.279       0.000000
          0.450          0.386       0.064491
          0.500          0.938       0.000000
          0.550          0.854       0.068274
          0.600          0.412       0.000000
          0.650         -0.276      -0.043288
          0.700         -0.880       0.000000
          0.750         -0.889      -0.078947
          0.800         -0.537       0.000000
          0.850          0.149       0.033757
          0.900          0.804       0.000000
          0.950          0.932       0.061260
          1.000          0.650       0.000000
CSCON
Computes a cubic spline interpolant that is consistent with the concavity of the data.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
The data point abscissas must be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
IBREAK The number of breakpoints. (Output)
It will be less than 2 * NDATA.
BREAK Array of length IBREAK containing the breakpoints for the piecewise cubic
representation in its first IBREAK positions. (Output)
The dimension of BREAK must be at least 2 * NDATA.
CSCOEF Matrix of size 4 by N where N is the dimension of BREAK. (Output)
The first IBREAK - 1 columns of CSCOEF contain the local coefficients of the cubic
pieces.
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least 3.
Default: NDATA = size (XDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSCON computes a cubic spline interpolant to n = NDATA data points $\{x_i, f_i\}$ for
$i = 1, \ldots, n$. For ease of explanation, we will assume that $x_i < x_{i+1}$, although it is not necessary for
the user to sort these data values. If the data are strictly convex, then the computed spline is
convex, $C^2$, and minimizes the expression
$$\int_{x_1}^{x_n} (g'')^2$$
over all convex $C^1$ functions that interpolate the data. In the general case when the data have both
convex and concave regions, the convexity of the spline is consistent with the data and the above
integral is minimized under the appropriate constraints. For more information on this interpolation
scheme, we refer the reader to Micchelli et al. (1985) and Irvine et al. (1986).
One important feature of the splines produced by this subroutine is that it is not possible, a priori,
to predict the number of breakpoints of the resulting interpolant. In most cases, there will be
breakpoints at places other than data locations. The method is nonlinear; and although the
interpolant is a piecewise cubic, cubic polynomials are not reproduced. (However, linear
polynomials are reproduced.) This routine should be used when it is important to preserve the
convex and concave regions implied by the data.
Comments
1.
Informational errors
Type
Code
3
4
3.
The cubic spline can be evaluated using CSVAL; its derivative can be evaluated using
CSDER.
4.
The default value for ITMAX is 25. This can be reset by calling C2CON/DC2CON directly.
Example
We first compute the shape-preserving interpolant using CSCON, and display the coefficients and
breakpoints. Second, we interpolate the same data using CSINT in a program not shown and
overlay the two results. The graph of the result from CSINT is represented by the dashed line.
Notice the extra inflection points in the curve produced by CSINT.
      USE CSCON_INT
      USE UMACH_INT
      USE WRRRL_INT

      IMPLICIT   NONE
!                                 Specifications
      INTEGER    NDATA
      PARAMETER  (NDATA=9)
      INTEGER    IBREAK, NOUT
      REAL       BREAK(2*NDATA), CSCOEF(4,2*NDATA), FDATA(NDATA),&
                 XDATA(NDATA)
      CHARACTER  CLABEL(14)*2, RLABEL(4)*2
!
DATA
DATA
DATA
DATA
!
!
!
!
!
Output
IBREAK = 13

                                               BREAK
       1       2       3       4       5       6       7       8       9      10      11      12      13
1  0.000   0.100   0.136   0.200   0.259   0.300   0.400   0.436   0.500   0.600   0.609   0.800   1.000

                                   CSCOEF
            1          2          3          4          5          6
1       0.000      0.900      0.942      0.950      0.958      0.900
2      11.886      3.228      0.131      0.131      0.131     -4.434
3       0.000   -173.170      0.000      0.000      0.000    220.218
4   -1731.699   4841.604      0.000      0.000  -5312.082   4466.875

            7          8          9         10         11         12
1       0.100      0.050      0.050      0.050      0.050      0.200
2      -4.121      0.000      0.000      0.000      0.000      2.356
3     226.470      0.000      0.000      0.000      0.000     24.664
4   -6222.348      0.000      0.000      0.000    129.115    123.321

           13
1       1.000
2       0.000
3       0.000
4       0.000
CSPER
Computes the cubic spline interpolant with periodic boundary conditions.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
The data point abscissas must be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
BREAK Array of length NDATA containing the breakpoints for the piecewise cubic
representation. (Output)
CSCOEF Matrix of size 4 by NDATA containing the local coefficients of the cubic pieces.
(Output)
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least 4.
Default: NDATA = size (XDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSPER computes a $C^2$ cubic spline interpolant to a set of data points $(x_i, f_i)$ for
$i = 1, \ldots, \text{NDATA} = N$. The breakpoints of the spline are the abscissas. The program enforces
periodic endpoint conditions. This means that the spline s satisfies $s(a) = s(b)$, $s'(a) = s'(b)$, and
$s''(a) = s''(b)$, where a is the leftmost abscissa and b is the rightmost abscissa. If the ordinate values
corresponding to a and b are not equal, then a warning message is issued. The ordinate value at b
is set equal to the ordinate value at a and the interpolant is computed.
If the data points arise from the values of a smooth (say $C^4$) periodic function f, i.e. $f_i = f(x_i)$, then
the error will behave in a predictable fashion. Let $\xi$ be the breakpoint vector for the above spline
interpolant. Then, the maximum absolute error satisfies
$$\|f - s\|_{[\xi_1, \xi_N]} \le C \, \|f^{(4)}\|_{[\xi_1, \xi_N]} \, |\xi|^4$$
where
$$|\xi| := \max_{i=2,\ldots,N} |\xi_i - \xi_{i-1}|$$
Comments
1.
2. Informational error
   Type   Code
   3             The data set is not periodic, i.e., the function values at the smallest
                 and largest XDATA points are not equal. The value at the smallest
                 XDATA point is used.
3. The cubic spline can be evaluated using CSVAL and its derivative can be evaluated
   using CSDER.
Example
In this example, a cubic spline interpolant to a function f is computed. The values of this spline are
then compared with the exact function values.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=11)
!
      INTEGER    I, NINTV, NOUT
      REAL       BREAK(NDATA), CSCOEF(4,NDATA), F,&
                 FDATA(NDATA), FLOAT, H, PI, SIN, X, XDATA(NDATA)
      INTRINSIC  FLOAT, SIN
!
!                                 Define function
      F(X) = SIN(15.0*X)
!                                 Set up a grid
      PI = CONST('PI')
      H  = 2.0*PI/15.0/10.0
      DO 10  I=1, NDATA
         XDATA(I) = H*FLOAT(I-1)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 FDATA(11).
      FDATA(NDATA) = FDATA(1)
!
Output
             X         Interpolant     Error
          0.000          0.000       0.000000
          0.021          0.309       0.000138
          0.042          0.588       0.000000
          0.063          0.809       0.000362
          0.084          0.951       0.000000
          0.105          1.000       0.000447
          0.126          0.951       0.000000
          0.147          0.809       0.000362
          0.168          0.588       0.000000
          0.188          0.309       0.000138
          0.209          0.000       0.000000
          0.230         -0.309      -0.000138
          0.251         -0.588       0.000000
          0.272         -0.809      -0.000362
          0.293         -0.951       0.000000
          0.314         -1.000      -0.000447
          0.335         -0.951       0.000000
          0.356         -0.809      -0.000362
          0.377         -0.588       0.000000
          0.398         -0.309      -0.000138
          0.419          0.000       0.000000
CSVAL
This function evaluates a cubic spline.
Required Arguments
X Point at which the spline is to be evaluated. (Input)
BREAK Array of length NINTV + 1 containing the breakpoints for the piecewise cubic
representation. (Input)
BREAK must be strictly increasing.
CSCOEF Matrix of size 4 by NINTV + 1 containing the local coefficients of the cubic
pieces. (Input)
Optional Arguments
NINTV Number of polynomial pieces. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSVAL evaluates a cubic spline at a given point. It is a special case of the routine
PPDER, which evaluates the derivative of a piecewise polynomial. (The value of a piecewise
polynomial is its zero-th derivative and a cubic spline is a piecewise polynomial of order 4.) The
routine PPDER is based on the routine PPVALU in de Boor (1978, page 89).
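The evaluation described above can be sketched in a few lines of standard Fortran. The program below is an illustration only, not the library's implementation: PPEVAL is a hypothetical helper, and it assumes that column i of CSCOEF holds the value and the first three derivatives of the i-th cubic piece at BREAK(i), so that each piece is a Taylor expansion about its left breakpoint. That convention is consistent with the CSCON output listed earlier in this chapter, whose first two breakpoints and first coefficient column are reused here as test data.

```fortran
program ppeval_demo
  implicit none
  ! First two breakpoints and first coefficient column taken from the
  ! CSCON example output; the other columns are not needed for this check.
  real, parameter :: brk(2) = (/ 0.000, 0.100 /)
  real, parameter :: coef(4,1) = &
       reshape((/ 0.000, 11.886, 0.000, -1731.699 /), (/ 4, 1 /))

  ! Evaluate at the right end of the first piece. The CSCON output lists
  ! 0.900 as the spline value at the second breakpoint.
  print '(A,F8.3)', ' s(0.1) = ', ppeval(0.1, brk, coef, 1)

contains

  ! Hypothetical helper (not an IMSL routine). Evaluates
  !   s(x) = sum over j=1..4 of c(j,i) * (x - brk(i))**(j-1) / (j-1)!
  ! on the piece i with brk(i) <= x < brk(i+1). Points outside the
  ! breakpoint range use the first or last piece (an assumption).
  real function ppeval(x, brk, c, nintv)
    real, intent(in)    :: x, brk(:), c(:,:)
    integer, intent(in) :: nintv
    integer :: i
    real    :: h
    ! Locate the piece containing x by a linear scan
    i = 1
    do while (i < nintv)
       if (x < brk(i+1)) exit
       i = i + 1
    end do
    h = x - brk(i)
    ! Horner form of the Taylor expansion about brk(i)
    ppeval = c(1,i) + h*(c(2,i) + h*(c(3,i)/2.0 + h*c(4,i)/6.0))
  end function ppeval

end program ppeval_demo
```

The linear scan keeps the sketch short; a production evaluator would more likely use a binary search or remember the last interval found.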
Example
For an example of the use of CSVAL, see IMSL routine CSINT.
CSDER
This function evaluates the derivative of a cubic spline.
Required Arguments
IDERIV Order of the derivative to be evaluated. (Input)
In particular, IDERIV = 0 returns the value of the polynomial.
X Point at which the polynomial is to be evaluated. (Input)
BREAK Array of length NINTV + 1 containing the breakpoints for the piecewise cubic
representation. (Input)
BREAK must be strictly increasing.
CSCOEF Matrix of size 4 by NINTV + 1 containing the local coefficients of the cubic
pieces. (Input)
Optional Arguments
NINTV Number of polynomial pieces. (Input)
Default: NINTV = size (BREAK,1) - 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function CSDER evaluates the derivative of a cubic spline at a given point. It is a special case
of the routine PPDER, which evaluates the derivative of a piecewise polynomial. (A cubic spline is
a piecewise polynomial of order 4.) The routine PPDER is based on the routine PPVALU in de Boor
(1978, page 89).
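As a worked illustration, assume (as the CSCON output listed earlier in this chapter suggests) that column i of CSCOEF stores the value and first three derivatives of the i-th piece at BREAK(i). Writing $\xi_i$ for BREAK(i), $c_{j,i}$ for CSCOEF(j,i), and $h = x - \xi_i$ (symbols introduced here for the illustration), the piece on $[\xi_i, \xi_{i+1})$ and its derivatives would then be:

```latex
\begin{aligned}
s(x)   &= c_{1,i} + c_{2,i}\,h + c_{3,i}\,\frac{h^2}{2} + c_{4,i}\,\frac{h^3}{6} \\
s'(x)  &= c_{2,i} + c_{3,i}\,h + c_{4,i}\,\frac{h^2}{2} \\
s''(x) &= c_{3,i} + c_{4,i}\,h
\end{aligned}
```

With the first CSCON coefficient column $(0.000,\ 11.886,\ 0.000,\ -1731.699)$ and $h = 0.1$, these give $s'(0.1) = 11.886 - 1731.699 \times 0.005 \approx 3.2275$ and $s''(0.1) = -1731.699 \times 0.1 \approx -173.170$. Both agree, to within the rounding of the printed values, with the entries 3.228 and -173.170 in the second coefficient column, as expected for a $C^2$ spline.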
Example
In this example, we compute a cubic spline interpolant to a function f using IMSL routine CSINT.
The values of the spline and its first and second derivatives are computed using CSDER. These
values can then be compared with the corresponding values of the interpolated function.
      USE CSDER_INT
      USE CSINT_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=10)
!
      INTEGER    I, NINTV, NOUT
      REAL       BREAK(NDATA), CDDF, CDF, CF, COS, CSCOEF(4,NDATA),&
                 DDF, DF, F, FDATA(NDATA), FLOAT, SIN, X,&
                 XDATA(NDATA)
      INTRINSIC  COS, FLOAT, SIN
!                                 Define function and derivatives
      F(X)   = SIN(15.0*X)
      DF(X)  = 15.0*COS(15.0*X)
      DDF(X) = -225.0*SIN(15.0*X)
!                                 Set up a grid
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I-1)/FLOAT(NDATA-1)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Compute cubic spline interpolant
      CALL CSINT (XDATA, FDATA, BREAK, CSCOEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
99999 FORMAT (9X, 'X', 8X, 'S(X)', 5X, 'Error', 6X, 'S''(X)', 5X,&
             'Error', 6X, 'S''''(X)', 4X, 'Error', /)
      NINTV = NDATA - 1
!                                 Print the interpolant on a finer grid
      DO 20  I=1, 2*NDATA
         X    = FLOAT(I-1)/FLOAT(2*NDATA-1)
         CF   = CSDER(0,X,BREAK,CSCOEF)
         CDF  = CSDER(1,X,BREAK,CSCOEF)
         CDDF = CSDER(2,X,BREAK,CSCOEF)
         WRITE (NOUT,'(F11.3, 3(F11.3, F11.6))') X, CF, F(X) - CF,&
                                                 CDF, DF(X) - CDF,&
                                                 CDDF, DDF(X) - CDDF
   20 CONTINUE
      END
Output
         X        S(X)     Error      S'(X)     Error      S''(X)    Error

      0.000      0.000   0.000000     26.285 -11.284739   -379.458 379.457794
      0.053      0.902  -0.192203      8.841   1.722460   -283.411 123.664734
      0.105      1.019  -0.019333     -3.548   3.425718   -187.364 -37.628586
      0.158      0.617   0.081009    -10.882   0.146207    -91.317 -65.824875
      0.211     -0.037   0.021155    -13.160  -1.837700      4.730  -1.062027
      0.263     -0.674  -0.046945    -10.033  -0.355268    117.916  44.391640
      0.316     -0.985  -0.015060     -0.719   1.086203    235.999 -11.066727
      0.368     -0.682  -0.004651     11.314  -0.409097    154.861  -0.365387
      0.421      0.045  -0.011915     14.708   0.284042    -25.887  18.552732
      0.474      0.708   0.024292      9.508   0.702690   -143.785 -21.041260
      0.526      0.978   0.020854      0.161  -0.771948   -211.402 -13.411087
      0.579      0.673   0.001410    -11.394   0.322443   -163.483  11.674103
      0.632     -0.064   0.015118    -14.937  -0.045511     28.856 -17.856323
      0.684     -0.724  -0.019246     -8.859  -1.170871    163.866   3.435547
      0.737     -0.954  -0.044143      0.301   0.554493    184.217  40.417282
      0.789     -0.675   0.012143     10.307   0.928152    166.021 -16.939514
      0.842      0.027   0.038176     15.015  -0.047344     12.914 -27.575521
      0.895      0.764  -0.010112     11.666  -1.819128   -140.193 -29.538193
      0.947      1.114  -0.116304      0.258  -1.357680   -293.301  68.905701
      1.000      0.650   0.000000    -19.208   7.812407   -446.408 300.092896
CS1GD
Evaluates the derivative of a cubic spline on a grid.
Required Arguments
IDERIV Order of the derivative to be evaluated. (Input)
In particular, IDERIV = 0 returns the values of the cubic spline.
XVEC Array of length N containing the points at which the cubic spline is to be evaluated.
(Input)
The points in XVEC should be strictly increasing.
BREAK Array of length NINTV + 1 containing the breakpoints for the piecewise cubic
representation. (Input)
BREAK must be strictly increasing.
CSCOEF Matrix of size 4 by NINTV + 1 containing the local coefficients of the cubic
pieces. (Input)
VALUE Array of length N containing the values of the IDERIV-th derivative of the cubic
spline at the points in XVEC. (Output)
Optional Arguments
N Length of vector XVEC. (Input)
Default: N = size (XVEC,1).
NINTV Number of polynomial pieces. (Input)
Default: NINTV = size (BREAK,1) - 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CS1GD evaluates a cubic spline (or its derivative) at a vector of points. That is, given a
vector x of length n satisfying $x_i < x_{i+1}$ for $i = 1, \ldots, n - 1$, a derivative value j, and a cubic spline
s that is represented by a breakpoint sequence and coefficient matrix, this routine returns the values
$$s^{(j)}(x_i), \qquad i = 1, \ldots, n$$
in the array VALUE. The functionality of this routine is the same as that of CSDER called in a loop;
however, CS1GD should be much more efficient.
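One plausible source of such a saving (an assumption made for illustration, not a statement about how CS1GD is implemented) is that an increasing evaluation grid lets the search for the containing interval resume where the previous point left off, instead of starting over for every point. The sketch below shows that idea with a hypothetical subroutine PPGRID. It assumes each coefficient column stores the value and first three derivatives of a piece at its left breakpoint, and it uses made-up data representing $s(x) = x^2$ on the breakpoints 0, 1, 2.

```fortran
program ppgrid_demo
  implicit none
  integer, parameter :: nintv = 2, n = 5
  ! Illustrative data: s(x) = x**2 stored as two cubic pieces, each given
  ! by s, s', s'', s''' at its left breakpoint (assumed convention).
  real :: brk(nintv+1) = (/ 0.0, 1.0, 2.0 /)
  real :: coef(4,nintv) = &
       reshape((/ 0.0, 0.0, 2.0, 0.0,  1.0, 2.0, 2.0, 0.0 /), (/ 4, nintv /))
  real :: xvec(n) = (/ 0.0, 0.5, 1.0, 1.5, 2.0 /)
  real :: val(n)
  integer :: k

  call ppgrid(xvec, brk, coef, val)
  ! Each printed value should equal xvec(k)**2 up to rounding
  do k = 1, n
     print '(2F10.4)', xvec(k), val(k)
  end do

contains

  ! Hypothetical helper (not an IMSL routine): evaluates the piecewise
  ! cubic at every point of an increasing vector x in one forward pass.
  subroutine ppgrid(x, brk, c, val)
    real, intent(in)  :: x(:), brk(:), c(:,:)
    real, intent(out) :: val(:)
    integer :: i, k, npiece
    real    :: h
    npiece = size(brk) - 1
    i = 1
    do k = 1, size(x)
       ! x is increasing, so the interval index i never moves backward
       do while (i < npiece)
          if (x(k) < brk(i+1)) exit
          i = i + 1
       end do
       h = x(k) - brk(i)
       val(k) = c(1,i) + h*(c(2,i) + h*(c(3,i)/2.0 + h*c(4,i)/6.0))
    end do
  end subroutine ppgrid

end program ppgrid_demo
```

With this arrangement the total search cost grows with the number of pieces plus the number of points, rather than with their product.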
Comments
1.
2.
Informational error
Type
Code
4
Example
To illustrate the use of CS1GD, we modify the example program for CSINT. In this example, a
cubic spline interpolant to F is computed. The values of this spline are then compared with the
exact function values. The routine CS1GD is based on the routine PPVALU in de Boor (1978, page
89).
      USE CS1GD_INT
      USE CSINT_INT
      USE UMACH_INT
      USE CSVAL_INT

      IMPLICIT   NONE
!                                 Specifications
      INTEGER    NDATA, N, IDERIV, J
      PARAMETER  (NDATA=11, N=2*NDATA-1)
      INTEGER    I, NINTV, NOUT
!
!
REAL
!
!
10
!
20
!
!
99999
!
!
30
Output
             X         Interpolant     Error
          0.000          0.000       0.000000
          0.050          0.809      -0.127025
          0.100          0.997       0.000000
          0.150          0.723       0.055214
          0.200          0.141       0.000000
          0.250         -0.549      -0.022789
          0.300         -0.978       0.000000
          0.350         -0.843      -0.016246
          0.400         -0.279       0.000000
          0.450          0.441       0.009348
          0.500          0.938       0.000000
          0.550          0.903       0.019947
          0.600          0.412       0.000000
          0.650         -0.315      -0.004895
          0.700         -0.880       0.000000
          0.750         -0.938      -0.029541
          0.800         -0.537       0.000000
          0.850          0.148       0.034693
          0.900          0.804       0.000000
          0.950          1.086      -0.092559
          1.000          0.650       0.000000
CSITG
This function evaluates the integral of a cubic spline.
Required Arguments
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
BREAK Array of length NINTV + 1 containing the breakpoints for the piecewise cubic
representation. (Input)
BREAK must be strictly increasing.
CSCOEF Matrix of size 4 by NINTV + 1 containing the local coefficients of the cubic
pieces. (Input)
Optional Arguments
NINTV Number of polynomial pieces. (Input)
Default: NINTV = size (BREAK,1) - 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function CSITG evaluates the integral of a cubic spline over an interval. It is a special case of
the routine PPITG, which evaluates the integral of a piecewise polynomial. (A cubic spline is a
piecewise polynomial of order 4.)
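The piecewise integration can be sketched as follows. This is an illustration under stated assumptions, not the library's algorithm: PPINT and ANTI are hypothetical helpers, each coefficient column is assumed to hold the value and first three derivatives of a piece at its left breakpoint, the limits are assumed to satisfy a <= b and to lie inside the breakpoint range, and the data are made up to represent $s(x) = x^2$ on the breakpoints 0, 1, 2. Each cubic piece is integrated exactly through its antiderivative, and the contributions of the pieces that overlap [a, b] are summed.

```fortran
program ppint_demo
  implicit none
  integer, parameter :: nintv = 2
  ! Illustrative data: s(x) = x**2 as two cubic pieces, each stored as
  ! s, s', s'', s''' at its left breakpoint (assumed convention).
  real :: brk(nintv+1) = (/ 0.0, 1.0, 2.0 /)
  real :: coef(4,nintv) = &
       reshape((/ 0.0, 0.0, 2.0, 0.0,  1.0, 2.0, 2.0, 0.0 /), (/ 4, nintv /))

  ! The exact integrals of x**2 are 1/24 and 8/3; the printed values
  ! should match them up to single precision rounding.
  print '(A,F10.5)', ' Integral over [0.0,0.5] = ', ppint(0.0, 0.5, brk, coef)
  print '(A,F10.5)', ' Integral over [0.0,2.0] = ', ppint(0.0, 2.0, brk, coef)

contains

  ! Antiderivative of piece i, measured from brk(i), at offset h:
  !   c1*h + c2*h**2/2 + c3*h**3/6 + c4*h**4/24
  real function anti(c, i, h)
    real, intent(in)    :: c(:,:), h
    integer, intent(in) :: i
    anti = h*(c(1,i) + h*(c(2,i)/2.0 + h*(c(3,i)/6.0 + h*c(4,i)/24.0)))
  end function anti

  ! Hypothetical helper (not an IMSL routine): integral of the piecewise
  ! cubic over [a,b], with a <= b and both limits inside the breakpoints.
  real function ppint(a, b, brk, c)
    real, intent(in) :: a, b, brk(:), c(:,:)
    integer :: i
    real    :: lo, hi
    ppint = 0.0
    do i = 1, size(brk) - 1
       ! Portion of [a,b] that falls inside piece i
       lo = max(a, brk(i))
       hi = min(b, brk(i+1))
       if (hi > lo) then
          ppint = ppint + anti(c, i, hi - brk(i)) - anti(c, i, lo - brk(i))
       end if
    end do
  end function ppint

end program ppint_demo
```

Limits outside the breakpoint range, which would require extending the first or last piece, are deliberately left out of this sketch.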
Example
This example computes a cubic spline interpolant to the function $x^2$ using CSINT and evaluates its
integral over the intervals [0., .5] and [0., 2.]. Since CSINT uses the not-a-knot condition, the
interpolant reproduces $x^2$, hence the integral values are 1/24 and 8/3, respectively.
      USE CSITG_INT
      USE UMACH_INT
      USE CSINT_INT

      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=10)
!
      INTEGER    I, NINTV, NOUT
      REAL       A, B, BREAK(NDATA), CSCOEF(4,NDATA), ERROR,&
                 EXACT, F, FDATA(NDATA), FI, FLOAT, VALUE, X,&
                 XDATA(NDATA)
      INTRINSIC  FLOAT
!                                 Define function and integral
      F(X)  = X*X
      FI(X) = X*X*X/3.0
!                                 Set up a grid
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I-1)/FLOAT(NDATA-1)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Compute cubic spline interpolant
      CALL CSINT (XDATA, FDATA, BREAK, CSCOEF)
!                                 Compute the integral of F over
!                                 [0.0,0.5]
      A     = 0.0
      B     = 0.5
      NINTV = NDATA - 1
      VALUE = CSITG(A,B,BREAK,CSCOEF)
      EXACT = FI(B) - FI(A)
      ERROR = EXACT - VALUE
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Print the result
      WRITE (NOUT,99999) A, B, VALUE, EXACT, ERROR
!                                 Compute the integral of F over
!                                 [0.0,2.0]
      A     = 0.0
      B     = 2.0
      VALUE = CSITG(A,B,BREAK,CSCOEF)
      EXACT = FI(B) - FI(A)
      ERROR = EXACT - VALUE
!                                 Print the result
      WRITE (NOUT,99999) A, B, VALUE, EXACT, ERROR
99999 FORMAT (' On the closed interval (', F3.1, ',', F3.1,&
             ') we have :', /, 1X, 'Computed Integral = ', F10.5, /,&
             1X, 'Exact Integral    = ', F10.5, /, 1X, 'Error         '&
             , '    = ', F10.6, /, /)
      END
Output
 On the closed interval (0.0,0.5) we have :
 Computed Integral =    0.04167
 Exact Integral    =    0.04167
 Error             =   0.000000

 On the closed interval (0.0,2.0) we have :
 Computed Integral =    2.66666
 Exact Integral    =    2.66667
 Error             =   0.000006
SPLEZ
Computes the values of a spline that either interpolates or fits user-supplied data.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
The data point abscissas must be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
XVEC Array of length N containing the points at which the spline function values are
desired. (Input)
The entries of XVEC must be distinct.
VALUE Array of length N containing the spline values. (Output)
VALUE(I) = S(XVEC(I)) if IDER = 0, VALUE(I) = S'(XVEC(I)) if IDER = 1, and so
forth, where S is the computed spline.
Optional Arguments
NDATA Number of data points. (Input)
Default: NDATA = size (XDATA,1).
All choices of ITYPE are valid if NDATA is larger than 6. More specifically,
NDATA > ITYPE
or ITYPE = 1.
NDATA > 3
for ITYPE = 2, 3.
for ITYPE = 4, 5, 6, 7, 8.
NDATA > 3
ITYPE
 1    yields CSINT
 2    yields CSAKM
 3    yields CSCON
 4    yields BSINT-BSNAK K = 2
 5    yields BSINT-BSNAK K = 3
 6    yields BSINT-BSNAK K = 4
 7    yields BSINT-BSNAK K = 5
 8    yields BSINT-BSNAK K = 6
 9    yields CSSCV
10    yields BSLSQ K = 2
11    yields BSLSQ K = 3
12    yields BSLSQ K = 4
13    yields BSVLS K = 2
14    yields BSVLS K = 3
15    yields BSVLS K = 4
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
This routine is designed to let the user experiment with various interpolation and smoothing
routines in the library.
The routine SPLEZ computes a spline interpolant to a set of data points $(x_i, f_i)$ for $i = 1, \ldots, \text{NDATA}$
if ITYPE = 1, ..., 8. If ITYPE $\ge$ 9, various smoothing or least squares splines are computed. The
output for this routine consists of a vector of values of the computed spline or its derivatives.
Specifically, let i = IDER, n = N, v = XVEC, and y = VALUE, then if s is the computed spline we set
$$y_j = s^{(i)}(v_j), \qquad j = 1, \ldots, n$$
The routines called are listed above under the ITYPE heading. Additional documentation can be
found by referring to these routines.
Example
In this example, all the ITYPE parameters are exercised. The values of the spline are then
compared with the exact function values and derivatives.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
      INTEGER    NDATA, N
      PARAMETER  (NDATA=21, N=2*NDATA-1)
!                                 Specifications for local variables
      INTEGER    I, IDER, ITYPE, NOUT
      REAL       FDATA(NDATA), FPVAL(N), FVALUE(N),&
                 VALUE(N), XDATA(NDATA), XVEC(N), EMAX1(15),&
                 EMAX2(15), X
!                                 Specifications for intrinsics
      INTRINSIC  FLOAT, SIN, COS
      REAL       FLOAT, SIN, COS
!                                 Specifications for subroutines
!
      REAL       F, FP
!
!                                 Define a function
      F(X)  = SIN(X*X)
      FP(X) = 2*X*COS(X*X)
!
!                                 Set up a grid
      DO 10  I=1, NDATA
         XDATA(I) = 3.0*(FLOAT(I-1)/FLOAT(NDATA-1))
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
      DO 20  I=1, N
         XVEC(I)   = 3.0*(FLOAT(I-1)/FLOAT(2*NDATA-2))
         FVALUE(I) = F(XVEC(I))
         FPVAL(I)  = FP(XVEC(I))
   20 CONTINUE
!
      WRITE (NOUT,99999)
!
Output
   ITYPE
     1        0.014082
     2        0.024682
     3        0.020896
     4        0.083615
     5        0.010403
     6        0.014082
     7        0.004756
     8        0.001070
     9        0.020896
    10        0.392603
    11        0.162793
    12        0.045404
    13        0.588370
    14        0.752475
    15        0.049340
BSINT
Computes the spline interpolant, returning the B-spline coefficients.
Required Arguments
NDATA Number of data points. (Input)
XDATA Array of length NDATA containing the data point abscissas. (Input)
FDATA Array of length NDATA containing the data point ordinates. (Input)
KORDER Order of the spline. (Input)
KORDER must be less than or equal to NDATA.
XKNOT Array of length NDATA + KORDER containing the knot sequence. (Input)
XKNOT must be nondecreasing.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Following the notation in de Boor (1978, page 108), let $B_j = B_{j,k,t}$ denote the j-th B-spline of order
k with respect to the knot sequence t. Then, BSINT computes the vector a satisfying
$$\sum_{j=1}^{N} a_j B_j(x_i) = f_i$$
and returns the result in BSCOEF = a. This linear system is banded with at most $k - 1$ subdiagonals
and $k - 1$ superdiagonals. The matrix
$$A = (B_j(x_i))$$
is totally positive and is invertible if and only if the diagonal entries are nonzero. The routine
BSINT is based on the routine SPLINT by de Boor (1978, page 204).
The routine BSINT produces the coefficients of the B-spline interpolant of order KORDER with knot
sequence XKNOT to the data $(x_i, f_i)$ for i = 1 to NDATA, where x = XDATA and f = FDATA. Let
t = XKNOT, k = KORDER, and N = NDATA. First, BSINT sorts the XDATA vector and stores the result
in x. The elements of the FDATA vector are permuted appropriately and stored in f, yielding the
equivalent data $(x_i, f_i)$ for i = 1 to N. The following preliminary checks are performed on the data.
We verify that
$$x_i < x_{i+1}, \qquad i = 1, \ldots, N - 1$$
$$t_i < t_{i+k}, \qquad i = 1, \ldots, N$$
$$t_i \le t_{i+1}, \qquad i = 1, \ldots, N + k - 1$$
The first test checks to see that the abscissas are distinct. The second and third inequalities verify
that a valid knot sequence has been specified.
In order for the interpolation matrix to be nonsingular, we also check $t_k \le x_i \le t_{N+1}$ for i = 1 to N.
This first inequality in the last check is necessary since the method used to generate the entries of
the interpolation matrix requires that the k possibly nonzero B-splines at $x_i$,
$B_{j-k+1}, \ldots, B_j$ where j satisfies $t_j \le x_i < t_{j+1}$,
be well defined.
If the data arise from the values of a smooth (say $C^k$) function f, then the maximum absolute
error satisfies
$$\|f - s\|_{[t_k, t_{N+1}]} \le C \, \|f^{(k)}\|_{[t_k, t_{N+1}]} \, |t|^k$$
where
$$|t| := \max_{i=k,\ldots,N} |t_{i+1} - t_i|$$
For more information on this problem, see de Boor (1978, Chapter 13) and the references therein.
This routine can be used in place of the IMSL routine CSINT by calling BSNAK to obtain the
proper knots, then calling BSINT yielding the B-spline coefficients, and finally calling IMSL
routine BSCPP to convert to piecewise polynomial form.
Comments
1.
2.
Informational errors
Type
Code
3
4
4
4
4
1
3
4
5
15
16
17
3.
The spline can be evaluated using BSVAL, and its derivative can be evaluated using
BSDER.
Example
In this example, a spline interpolant s, to
$$f(x) = \sqrt{x}$$
is computed. The interpolated values are then compared with the exact function values using the
IMSL routine BSVAL.
      USE BSINT_INT
      USE BSNAK_INT
      USE UMACH_INT
      USE BSVAL_INT

      IMPLICIT   NONE
      INTEGER    KORDER, NDATA, NKNOT
      PARAMETER  (KORDER=3, NDATA=5, NKNOT=NDATA+KORDER)
!
      INTEGER    I, NCOEF, NOUT
      REAL       BSCOEF(NDATA), BT, F, FDATA(NDATA), FLOAT,&
                 SQRT, X, XDATA(NDATA), XKNOT(NKNOT), XT
      INTRINSIC  FLOAT, SQRT
!                                 Define function
      F(X) = SQRT(X)
!                                 Set up interpolation points
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I-1)/FLOAT(NDATA-1)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Generate knot sequence
      CALL BSNAK (NDATA, XDATA, KORDER, XKNOT)
!                                 Interpolate
      CALL BSINT (NDATA, XDATA, FDATA, KORDER, XKNOT, BSCOEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
!                                 Print on a finer grid
      NCOEF = NDATA
      XT    = XDATA(1)
!                                 Evaluate spline
      BT    = BSVAL(XT,KORDER,XKNOT,NCOEF,BSCOEF)
      WRITE (NOUT,99998) XT, BT, F(XT) - BT
      DO 20  I=2, NDATA
         XT = (XDATA(I-1)+XDATA(I))/2.0
!                                 Evaluate spline
         BT = BSVAL(XT,KORDER,XKNOT,NCOEF,BSCOEF)
         WRITE (NOUT,99998) XT, BT, F(XT) - BT
         XT = XDATA(I)
!                                 Evaluate spline
         BT = BSVAL(XT,KORDER,XKNOT,NCOEF,BSCOEF)
Output
      X        S(X)       Error

 0.0000     0.0000    0.000000
 0.1250     0.2918    0.061781
 0.2500     0.5000    0.000000
 0.3750     0.6247   -0.012311
 0.5000     0.7071    0.000000
 0.6250     0.7886    0.002013
 0.7500     0.8660    0.000000
 0.8750     0.9365   -0.001092
 1.0000     1.0000    0.000000
BSNAK
Computes the not-a-knot spline knot sequence.
Required Arguments
NDATA Number of data points. (Input)
XDATA Array of length NDATA containing the location of the data points. (Input)
KORDER Order of the spline. (Input)
XKNOT Array of length NDATA + KORDER containing the knot sequence. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Given the data points x = XDATA, the order of the spline k = KORDER, and the number N = NDATA of elements in XDATA, the subroutine BSNAK returns in t = XKNOT a knot sequence that is appropriate for interpolation of data on x by splines of order k. The vector t contains the knot sequence in its first N + k positions. If k is even and we assume that the entries in the input vector x are increasing, then t is returned as
\[ t_i = x_1 \quad \text{for } i = 1, \ldots, k \]
\[ t_i = x_{i-k/2} \quad \text{for } i = k+1, \ldots, N \]
\[ t_i = x_N + \varepsilon \quad \text{for } i = N+1, \ldots, N+k \]
where $\varepsilon$ is a small positive constant. There is some discussion concerning this selection of knots in de Boor (1978, page 211). If k is odd, then t is returned as
\[ t_i = x_1 \quad \text{for } i = 1, \ldots, k \]
\[ t_i = \left( x_{i-\frac{k-1}{2}} + x_{i-1-\frac{k-1}{2}} \right) / 2 \quad \text{for } i = k+1, \ldots, N \]
\[ t_i = x_N + \varepsilon \quad \text{for } i = N+1, \ldots, N+k \]
It is not necessary to sort the values in x since this is done in the routine BSNAK.
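The two cases above translate directly into code. The following Python sketch is an illustration under stated assumptions, not the BSNAK source: the function name `not_a_knot_sequence` is hypothetical, and the value of the small positive constant (`eps`) is an assumed default, since the text does not specify it.

```python
def not_a_knot_sequence(xdata, korder, eps=1.0e-6):
    """Build the knot sequence t_1..t_{N+k} from the formulas in the text.

    eps stands in for the unspecified small positive constant (assumption).
    """
    x = sorted(xdata)              # the routine sorts its input
    n, k = len(x), korder
    if k < 1 or n < k:
        raise ValueError("need 1 <= korder <= number of data points")

    def xi(i):                     # 1-based access, as in the formulas
        return x[i - 1]

    t = [xi(1)] * k                # t_i = x_1 for i = 1..k
    for i in range(k + 1, n + 1):  # interior knots, i = k+1..N
        if k % 2 == 0:
            t.append(xi(i - k // 2))
        else:
            h = (k - 1) // 2
            t.append((xi(i - h) + xi(i - 1 - h)) / 2.0)
    t.extend([xi(n) + eps] * k)    # t_i = x_N + eps for i = N+1..N+k
    return t
```

For even k the interior knots coincide with data points; for odd k they fall midway between neighboring data points.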
Comments
1.
2.
Informational error
Type
Code
4
3. The first knot is at the left endpoint and the last knot is slightly beyond the last endpoint. Both endpoints have multiplicity KORDER.
4. Interior knots have multiplicity one.
Example
In this example, we compute (for k = 3, …, 8) six spline interpolants $s_k$ to $F(x) = \sin(10x^3)$ on the interval [0, 1]. The routine BSNAK is used to generate the knot sequences for $s_k$ and then BSINT is called to obtain the interpolant. We evaluate the absolute error
\[ | s_k - F | \]
at 100 equally spaced points and print the maximum error for each k.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
      INTEGER    KMAX, KMIN, NDATA
      PARAMETER  (KMAX=8, KMIN=3, NDATA=20)
!
      INTEGER    I, K, KORDER, NOUT
      REAL       ABS, AMAX1, BSCOEF(NDATA), DIF, DIFMAX, F,&
                 FDATA(NDATA), FLOAT, FT, SIN, ST, T, X, XDATA(NDATA),&
                 XKNOT(KMAX+NDATA), XT
      INTRINSIC  ABS, AMAX1, FLOAT, SIN
!                                 Define function and tau function
      F(X) = SIN(10.0*X*X*X)
      T(X) = 1.0 - X*X
!                                 Set up data
      DO 10  I=1, NDATA
         XT       = FLOAT(I-1)/FLOAT(NDATA-1)
         XDATA(I) = T(XT)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
!                                 Loop over different orders
      DO 30  K=KMIN, KMAX
         KORDER = K
!                                 Generate knots
         CALL BSNAK (NDATA, XDATA, KORDER, XKNOT)
!                                 Interpolate
         CALL BSINT (NDATA, XDATA, FDATA, KORDER, XKNOT, BSCOEF)
         DIFMAX = 0.0
         DO 20  I=1, 100
            XT  = FLOAT(I-1)/99.0
!                                 Evaluate spline
            ST  = BSVAL(XT,KORDER,XKNOT,NDATA,BSCOEF)
            FT  = F(XT)
            DIF = ABS(FT-ST)
!                                 Compute maximum difference
            DIFMAX = AMAX1(DIF,DIFMAX)
   20    CONTINUE
!                                 Print maximum difference
         WRITE (NOUT,99998) KORDER, DIFMAX
   30 CONTINUE
!
99998 FORMAT (' ', I3, 5X, F9.4)
99999 FORMAT (' KORDER', 5X, 'Maximum difference', /)
      END
Output
 KORDER     Maximum difference

   3           0.0080
   4           0.0026
   5           0.0004
   6           0.0008
   7           0.0010
   8           0.0004
BSOPK
Computes the optimal spline knot sequence.
Required Arguments
NDATA Number of data points. (Input)
XDATA Array of length NDATA containing the location of the data points. (Input)
KORDER Order of the spline. (Input)
XKNOT Array of length NDATA + KORDER containing the knot sequence. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Given the abscissas x = XDATA for an interpolation problem and the order of the spline interpolant k = KORDER, BSOPK returns the knot sequence t = XKNOT that minimizes the constant c in the error estimate
\[ \| f - s \| \le c \, \| f^{(k)} \| \]
In the above formula, f is any function in $C^k$ and s is the spline interpolant to f at the abscissas x with knot sequence t.
The algorithm is based on a routine described in de Boor (1978, page 204), which in turn is based
on a theorem of Micchelli, Rivlin and Winograd (1976).
Comments
1.
2.
Informational errors
Type   Code
3      6
4      3
4      4
3. The default value for MAXIT is 10. This can be overridden by calling B2OPK/DB2OPK directly with a larger value.
Example
In this example, we compute (for k = 3, …, 8) six spline interpolants $s_k$ to $F(x) = \sin(10x^3)$ on the interval [0, 1]. The routine BSOPK is used to generate the knot sequences for $s_k$ and then BSINT is called to obtain the interpolant. We evaluate the absolute error
\[ | s_k - F | \]
at 100 equally spaced points and print the maximum error for each k.
      USE BSOPK_INT
      USE BSINT_INT
      USE UMACH_INT
      USE BSVAL_INT

      IMPLICIT   NONE
      INTEGER    KMAX, KMIN, NDATA
      PARAMETER  (KMAX=8, KMIN=3, NDATA=20)
!
      INTEGER    I, K, KORDER, NOUT
      REAL       ABS, AMAX1, BSCOEF(NDATA), DIF, DIFMAX, F,&
                 FDATA(NDATA), FLOAT, FT, SIN, ST, T, X, XDATA(NDATA),&
                 XKNOT(KMAX+NDATA), XT
      INTRINSIC  ABS, AMAX1, FLOAT, SIN
!                                 Define function and tau function
      F(X) = SIN(10.0*X*X*X)
      T(X) = 1.0 - X*X
!                                 Set up data
      DO 10  I=1, NDATA
         XT       = FLOAT(I-1)/FLOAT(NDATA-1)
         XDATA(I) = T(XT)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
!                                 Loop over different orders
      DO 30  K=KMIN, KMAX
         KORDER = K
!                                 Generate knots
         CALL BSOPK (NDATA, XDATA, KORDER, XKNOT)
!                                 Interpolate
         CALL BSINT (NDATA, XDATA, FDATA, KORDER, XKNOT, BSCOEF)
         DIFMAX = 0.0
         DO 20  I=1, 100
            XT  = FLOAT(I-1)/99.0
!                                 Evaluate spline
            ST  = BSVAL(XT,KORDER,XKNOT,NDATA,BSCOEF)
            FT  = F(XT)
            DIF = ABS(FT-ST)
!                                 Compute maximum difference
            DIFMAX = AMAX1(DIF,DIFMAX)
   20    CONTINUE
!                                 Print maximum difference
         WRITE (NOUT,99998) KORDER, DIFMAX
   30 CONTINUE
!
99998 FORMAT (' ', I3, 5X, F9.4)
99999 FORMAT (' KORDER', 5X, 'Maximum difference', /)
      END
Output
 KORDER     Maximum difference

   3           0.0096
   4           0.0018
   5           0.0005
   6           0.0004
   7           0.0007
   8           0.0035
BS2IN
Computes a two-dimensional tensor-product spline interpolant, returning the tensor-product B-spline coefficients.
Required Arguments
XDATA Array of length NXDATA containing the data points in the X-direction. (Input)
XDATA must be strictly increasing.
YDATA Array of length NYDATA containing the data points in the Y-direction. (Input)
YDATA must be strictly increasing.
FDATA Array of size NXDATA by NYDATA containing the values to be interpolated.
(Input)
FDATA (I, J) is the value at (XDATA (I), YDATA(J)).
KXORD Order of the spline in the X-direction. (Input)
KXORD must be less than or equal to NXDATA.
KYORD Order of the spline in the Y-direction. (Input)
KYORD must be less than or equal to NYDATA.
XKNOT Array of length NXDATA + KXORD containing the knot sequence in the X-direction.
(Input)
XKNOT must be nondecreasing.
YKNOT Array of length NYDATA + KYORD containing the knot sequence in the Y-direction.
(Input)
YKNOT must be nondecreasing.
BSCOEF Array of length NXDATA * NYDATA containing the tensor-product B-spline
coefficients. (Output)
BSCOEF is treated internally as a matrix of size NXDATA by NYDATA.
Optional Arguments
NXDATA Number of data points in the X-direction. (Input)
Default: NXDATA = size (XDATA,1).
NYDATA Number of data points in the Y-direction. (Input)
Default: NYDATA = size (YDATA,1).
LDF The leading dimension of FDATA exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDF = size (FDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL BS2IN (NXDATA, XDATA, NYDATA, YDATA, FDATA, LDF, KXORD, KYORD,
XKNOT, YKNOT, BSCOEF)
Double:
Description
The routine BS2IN computes a tensor product spline interpolant. The tensor product spline interpolant to data $\{(x_i, y_j, f_{ij})\}$, where $1 \le i \le N_x$ and $1 \le j \le N_y$, has the form
\[ \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nm} B_{n,k_x,t_x}(x) B_{m,k_y,t_y}(y) \]
where $k_x$ and $k_y$ are the orders of the splines. (These numbers are passed to the subroutine in KXORD and KYORD, respectively.) Likewise, $t_x$ and $t_y$ are the corresponding knot sequences (XKNOT and YKNOT). The algorithm requires that
\[ t_x(k_x) \le x_i \le t_x(N_x + 1), \quad 1 \le i \le N_x \]
\[ t_y(k_y) \le y_j \le t_y(N_y + 1), \quad 1 \le j \le N_y \]
Tensor product spline interpolants in two dimensions can be computed quite efficiently by solving (repeatedly) two univariate interpolation problems. The computation is motivated by the following observations. It is necessary to solve the system of equations
\[ \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nm} B_{n,k_x,t_x}(x_i) B_{m,k_y,t_y}(y_j) = f_{ij} \]
Setting
\[ h_{mi} = \sum_{n=1}^{N_x} c_{nm} B_{n,k_x,t_x}(x_i) \]
we note that for each fixed i from 1 to $N_x$, we have $N_y$ linear equations in the same number of unknowns, as can be seen below:
\[ \sum_{m=1}^{N_y} h_{mi} B_{m,k_y,t_y}(y_j) = f_{ij} \]
The matrix of this system,
\[ \left[ B_{m,k_y,t_y}(y_j) \right], \quad 1 \le m, j \le N_y \]
does not depend on i. Thus, we need only factor this matrix once and then apply this factorization to the $N_x$ right-hand sides. Once this is done and we have computed $h_{mi}$, then we must solve for the coefficients $c_{nm}$ using the relation
\[ \sum_{n=1}^{N_x} c_{nm} B_{n,k_x,t_x}(x_i) = h_{mi} \]
for m from 1 to $N_y$, which again involves one factorization and $N_y$ solutions to the different right-hand sides. The routine BS2IN is based on the routine SPLI2D by de Boor (1978, page 347).
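The two-stage solve can be illustrated with a short, self-contained Python sketch. It is not the BS2IN or SPLI2D implementation: the helper names (`basis`, `solve`, `tensor_interp`, `tensor_value`) are hypothetical, the B-splines are evaluated with the plain recurrence, plain Gaussian elimination stands in for a banded factorization, and the knots in the usage lines are chosen by hand with the last knot placed slightly beyond the data.

```python
def basis(j, k, t, x):
    """j-th (0-based) B-spline of order k for knots t, evaluated at x."""
    if k == 1:
        return 1.0 if t[j] <= x < t[j + 1] else 0.0
    value = 0.0
    if t[j + k - 1] > t[j]:
        value += (x - t[j]) / (t[j + k - 1] - t[j]) * basis(j, k - 1, t, x)
    if t[j + k] > t[j + 1]:
        value += (t[j + k] - x) / (t[j + k] - t[j + 1]) * basis(j + 1, k - 1, t, x)
    return value


def solve(a, rhs):
    """Solve a*X = rhs for all columns of rhs with one elimination pass."""
    n, nrhs = len(a), len(rhs[0])
    m = [list(a[i]) + list(rhs[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + nrhs):
                m[r][c] -= factor * m[col][c]
    x = [[0.0] * nrhs for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for c in range(nrhs):
            s = m[i][n + c] - sum(m[i][q] * x[q][c] for q in range(i + 1, n))
            x[i][c] = s / m[i][i]
    return x


def tensor_interp(xd, yd, f, kx, ky, tx, ty):
    """Return c[n][m] such that the tensor-product spline matches f[i][j]."""
    nx, ny = len(xd), len(yd)
    bx = [[basis(n, kx, tx, xi) for n in range(nx)] for xi in xd]
    by = [[basis(m, ky, ty, yj) for m in range(ny)] for yj in yd]
    # Stage 1: for every i, sum_m h[m][i] * by[j][m] = f[i][j].
    f_t = [[f[i][j] for i in range(nx)] for j in range(ny)]
    h = solve(by, f_t)
    # Stage 2: for every m, sum_n c[n][m] * bx[i][n] = h[m][i].
    h_t = [[h[m][i] for m in range(ny)] for i in range(nx)]
    return solve(bx, h_t)


def tensor_value(c, kx, ky, tx, ty, x, y):
    """Evaluate sum_m sum_n c[n][m] B_n(x) B_m(y)."""
    return sum(c[n][m] * basis(n, kx, tx, x) * basis(m, ky, ty, y)
               for n in range(len(c)) for m in range(len(c[0])))


# Usage: linear splines (order 2) reproduce a bilinear function.
xd, yd = [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0]
tx = [0.0, 0.0, 1.0, 2.0, 3.5, 3.5]
ty = [0.0, 0.0, 1.0, 2.5, 2.5]
g = lambda x, y: 1.0 + 2.0 * x + 3.0 * y + x * y
f = [[g(x, y) for y in yd] for x in xd]
c = tensor_interp(xd, yd, f, 2, 2, tx, ty)
print(tensor_value(c, 2, 2, tx, ty, 1.5, 0.5), g(1.5, 0.5))
```

Each stage eliminates one matrix once and carries all right-hand sides along, which is the saving the description points to.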
Comments
1.
2.
Informational errors
Type
Code
3
3
4
4
4
4
4
1
2
6
7
13
14
15
16
17
Example
In this example, a tensor product spline interpolant to a function f is computed. The values of the interpolant and the error on a 4 × 4 grid are displayed.
      USE BS2IN_INT
      USE BSNAK_INT
      USE BS2VL_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 SPECIFICATIONS FOR PARAMETERS
      INTEGER    KXORD, KYORD, LDF, NXDATA, NXKNOT, NXVEC, NYDATA,&
                 NYKNOT, NYVEC
      PARAMETER  (KXORD=5, KYORD=2, NXDATA=21, NXVEC=4, NYDATA=6,&
!
99999 FORMAT (13X, 'X', 14X, 'Y', 10X, 'S(X,Y)', 9X, 'Error')
      END
Output
        X          Y       S(X,Y)      Error
   0.0000     0.0000     0.0000     0.000000
   0.0000     0.3333     0.0000     0.000000
   0.0000     0.6667     0.0000     0.000000
   0.0000     1.0000     0.0000     0.000000
   0.3333     0.0000     0.0370     0.000000
   0.3333     0.3333     0.1481     0.000000
   0.3333     0.6667     0.2593     0.000000
   0.3333     1.0000     0.3704     0.000000
   0.6667     0.0000     0.2963     0.000000
   0.6667     0.3333     0.5185     0.000000
   0.6667     0.6667     0.7407     0.000000
   0.6667     1.0000     0.9630     0.000000
   1.0000     0.0000     1.0000     0.000000
   1.0000     0.3333     1.3333     0.000000
   1.0000     0.6667     1.6667     0.000000
   1.0000     1.0000     2.0000     0.000000
BS3IN
Computes a three-dimensional tensor-product spline interpolant, returning the tensor-product B-spline coefficients.
Required Arguments
XDATA Array of length NXDATA containing the data points in the x-direction. (Input)
XDATA must be increasing.
YDATA Array of length NYDATA containing the data points in the y-direction. (Input)
YDATA must be increasing.
ZDATA Array of length NZDATA containing the data points in the z-direction. (Input)
ZDATA must be increasing.
FDATA Array of size NXDATA by NYDATA by NZDATA containing the values to be
interpolated. (Input)
FDATA (I, J, K) contains the value at (XDATA (I), YDATA(J), ZDATA(K)).
KXORD Order of the spline in the x-direction. (Input)
KXORD must be less than or equal to NXDATA.
KYORD Order of the spline in the y-direction. (Input)
KYORD must be less than or equal to NYDATA.
Optional Arguments
NXDATA Number of data points in the x-direction. (Input)
Default: NXDATA = size (XDATA,1).
NYDATA Number of data points in the y-direction. (Input)
Default: NYDATA = size (YDATA,1).
NZDATA Number of data points in the z-direction. (Input)
Default: NZDATA = size (ZDATA,1).
LDF Leading dimension of FDATA exactly as specified in the dimension statement of the
calling program. (Input)
Default: LDF = size (FDATA,1).
MDF Middle dimension of FDATA exactly as specified in the dimension statement of the
calling program. (Input)
Default: MDF = size (FDATA,2).
FORTRAN 90 Interface
Generic:
CALL BS3IN (XDATA, YDATA, ZDATA, FDATA, KXORD, KYORD, KZORD, XKNOT,
YKNOT, ZKNOT, BSCOEF [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL BS3IN (NXDATA, XDATA, NYDATA, YDATA, NZDATA, ZDATA, FDATA, LDF,
MDF, KXORD, KYORD, KZORD, XKNOT, YKNOT, ZKNOT, BSCOEF)
Double:
Description
The routine BS3IN computes a tensor-product spline interpolant. The tensor-product spline interpolant to data $\{(x_i, y_j, z_k, f_{ijk})\}$, where $1 \le i \le N_x$, $1 \le j \le N_y$, and $1 \le k \le N_z$, has the form
\[ \sum_{l=1}^{N_z} \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nml} B_{n,k_x,t_x}(x) B_{m,k_y,t_y}(y) B_{l,k_z,t_z}(z) \]
where $k_x$, $k_y$, and $k_z$ are the orders of the splines (these numbers are passed to the subroutine in KXORD, KYORD, and KZORD, respectively). Likewise, $t_x$, $t_y$, and $t_z$ are the corresponding knot sequences (XKNOT, YKNOT, and ZKNOT). The algorithm requires that
\[ t_x(k_x) \le x_i \le t_x(N_x + 1), \quad 1 \le i \le N_x \]
\[ t_y(k_y) \le y_j \le t_y(N_y + 1), \quad 1 \le j \le N_y \]
\[ t_z(k_z) \le z_k \le t_z(N_z + 1), \quad 1 \le k \le N_z \]
Tensor-product spline interpolants can be computed quite efficiently by solving (repeatedly) three univariate interpolation problems. The computation is motivated by the following observations. It is necessary to solve the system of equations
\[ \sum_{l=1}^{N_z} \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nml} B_{n,k_x,t_x}(x_i) B_{m,k_y,t_y}(y_j) B_{l,k_z,t_z}(z_k) = f_{ijk} \]
Setting
\[ h_{lij} = \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nml} B_{n,k_x,t_x}(x_i) B_{m,k_y,t_y}(y_j) \]
we note that for each fixed pair ij we have $N_z$ linear equations in the same number of unknowns, as can be seen below:
\[ \sum_{l=1}^{N_z} h_{lij} B_{l,k_z,t_z}(z_k) = f_{ijk}, \quad 1 \le l, k \le N_z \]
Thus, we need only factor this matrix once and then apply it to the $N_x N_y$ right-hand sides. Once this is done and we have computed $h_{lij}$, then we must solve for the coefficients $c_{nml}$ using the relation
\[ \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nml} B_{n,k_x,t_x}(x_i) B_{m,k_y,t_y}(y_j) = h_{lij} \]
that is the bivariate tensor-product problem addressed by the IMSL routine BS2IN. The interested reader should consult the algorithm description in the two-dimensional routine if more detail is desired. The routine BS3IN is based on the routine SPLI2D by de Boor (1978, page 347).
Comments
1.
2.
Informational errors
Type
Code
3
3
4
4
4
1
2
13
14
15
16
17
4
4
4
18
19
20
Example
In this example, a tensor-product spline interpolant to a function f is computed. The values of the interpolant and the error on a 4 × 4 × 2 grid are displayed.
      USE BS3IN_INT
      USE BSNAK_INT
      USE UMACH_INT
      USE BS3GD_INT

      IMPLICIT   NONE
!
      NXCOEF = NXDATA
      NYCOEF = NYDATA
      NZCOEF = NZDATA
!                                 Write heading
      WRITE (NOUT,99999)
      DO 60  I=1, NXVEC
         XVEC(I) = 2.0*(FLOAT(I-1)/3.0) - 1.0
   60 CONTINUE
      DO 70  I=1, NYVEC
         YVEC(I) = FLOAT(I-1)/3.0
   70 CONTINUE
      DO 80  I=1, NZVEC
         ZVEC(I) = FLOAT(I-1)
   80 CONTINUE
!                                 Call the evaluation routine.
      CALL BS3GD (0, 0, 0, XVEC, YVEC, ZVEC,&
                  KXORD, KYORD, KZORD, XKNOT, YKNOT, ZKNOT, BSCOEF, VALUE)
      DO 110  I=1, NXVEC
         DO 100  J=1, NYVEC
            DO 90  K=1, NZVEC
               WRITE (NOUT,'(4F13.4, F13.6)') XVEC(I), YVEC(J),&
                                              ZVEC(K), VALUE(I,J,K),&
                                              F(XVEC(I),YVEC(J),ZVEC(K))&
                                              - VALUE(I,J,K)
   90       CONTINUE
  100    CONTINUE
  110 CONTINUE
99999 FORMAT (10X, 'X', 11X, 'Y', 10X, 'Z', 10X, 'S(X,Y,Z)', 7X,&
              'Error')
      END
Output
          X            Y            Z       S(X,Y,Z)        Error
    -1.0000       0.0000       0.0000      -1.0000       0.000000
    -1.0000       0.0000       1.0000      -1.0000       0.000000
    -1.0000       0.3333       0.0000      -1.0000       0.000000
    -1.0000       0.3333       1.0000      -1.3333       0.000000
    -1.0000       0.6667       0.0000      -1.0000       0.000000
    -1.0000       0.6667       1.0000      -1.6667       0.000000
    -1.0000       1.0000       0.0000      -1.0000       0.000000
    -1.0000       1.0000       1.0000      -2.0000       0.000000
    -0.3333       0.0000       0.0000      -0.0370       0.000000
    -0.3333       0.0000       1.0000      -0.0370       0.000000
    -0.3333       0.3333       0.0000      -0.0370       0.000000
    -0.3333       0.3333       1.0000      -0.1481       0.000000
    -0.3333       0.6667       0.0000      -0.0370       0.000000
    -0.3333       0.6667       1.0000      -0.2593       0.000000
    -0.3333       1.0000       0.0000      -0.0370       0.000000
    -0.3333       1.0000       1.0000      -0.3704       0.000000
     0.3333       0.0000       0.0000       0.0370       0.000000
     0.3333       0.0000       1.0000       0.0370       0.000000
     0.3333       0.3333       0.0000       0.0370       0.000000
     0.3333       0.3333       1.0000       0.1481       0.000000
     0.3333       0.6667       0.0000       0.0370       0.000000
     0.3333       0.6667       1.0000       0.2593       0.000000
     0.3333       1.0000       0.0000       0.0370       0.000000
     0.3333       1.0000       1.0000       0.3704       0.000000
     1.0000       0.0000       0.0000       1.0000       0.000000
     1.0000       0.0000       1.0000       1.0000       0.000000
     1.0000       0.3333       0.0000       1.0000       0.000000
     1.0000       0.3333       1.0000       1.3333       0.000000
     1.0000       0.6667       0.0000       1.0000       0.000000
     1.0000       0.6667       1.0000       1.6667       0.000000
     1.0000       1.0000       0.0000       1.0000       0.000000
     1.0000       1.0000       1.0000       2.0000       0.000000
BSVAL
This function evaluates a spline, given its B-spline representation.
Required Arguments
X Point at which the spline is to be evaluated. (Input)
KORDER Order of the spline. (Input)
XKNOT Array of length KORDER + NCOEF containing the knot sequence. (Input)
XKNOT must be nondecreasing.
NCOEF Number of B-spline coefficients. (Input)
BSCOEF Array of length NCOEF containing the B-spline coefficients. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function BSVAL evaluates a spline (given its B-spline representation) at a specific point. It is a
special case of the routine BSDER, which evaluates the derivative of a spline given its B-spline
representation. The routine BSDER is based on the routine BVALUE by de Boor (1978, page 144).
Specifically, given the knot vector t, the number of coefficients N, the coefficient vector a, and a point x, BSVAL returns the number
\[ \sum_{j=1}^{N} a_j B_{j,k}(x) \]
where Bj,k is the j-th B-spline of order k for the knot sequence t. Note that this function routine
arbitrarily treats these functions as if they were right continuous near XKNOT(KORDER) and left
continuous near XKNOT(NCOEF + 1). Thus, if we have KORDER knots stacked at the left or right end
point, and if we try to evaluate at these end points, then we will get the value of the limit from the
interior of the interval.
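The sum above, together with the end-point convention, can be sketched with de Boor's recurrence. The Python function below is an illustration only: the name `spline_value` is hypothetical, the arrays are 0-based, and it is not claimed to match BVALUE or BSVAL line by line.

```python
def spline_value(x, korder, xknot, bscoef):
    """Evaluate sum_j a_j B_{j,k}(x) by de Boor's recurrence (0-based arrays)."""
    k, t, a = korder, list(xknot), list(bscoef)
    n = len(a)
    if len(t) != n + k:
        raise ValueError("xknot must have length NCOEF + KORDER")
    if not (t[k - 1] < t[n] and t[k - 1] <= x <= t[n]):
        raise ValueError("x must lie in [t_k, t_{N+1}]")
    # Locate mu with t[mu] <= x < t[mu+1]; at the right end point use the
    # last nonempty interval, which gives the limit from the interior.
    if x < t[n]:
        mu = k - 1
        while not (t[mu] <= x < t[mu + 1]):
            mu += 1
    else:
        mu = n - 1
        while t[mu] == t[mu + 1]:
            mu -= 1
    # Only the coefficients a[mu-k+1 .. mu] can contribute at x.
    d = [a[mu - k + 1 + j] for j in range(k)]
    for r in range(1, k):
        for j in range(k - 1, r - 1, -1):
            i = mu - k + 1 + j
            alpha = (x - t[i]) / (t[i + k - r] - t[i])
            d[j] = (1.0 - alpha) * d[j - 1] + alpha * d[j]
    return d[k - 1]
```

With order 2, knots 0, 0, 1, 2, 2 and coefficients 1, 3, 2, the spline is the broken line through (0, 1), (1, 3), (2, 2), so evaluating at the right end point x = 2 returns the limit from the left.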
Comments
1.
2.
Informational errors
Type   Code
4      4     Multiplicity of the knots cannot exceed the order of the spline.
4      5     The knots must be nondecreasing.
Example
For an example of the use of BSVAL, see IMSL routine BSINT.
BSDER
This function evaluates the derivative of a spline, given its B-spline representation.
Required Arguments
IDERIV Order of the derivative to be evaluated. (Input)
In particular, IDERIV = 0 returns the value of the spline.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function BSDER produces the value of a spline or one of its derivatives (given its B-spline
representation) at a specific point. The function BSDER is based on the routine BVALUE by de Boor
(1978, page 144).
Specifically, given the knot vector t, the number of coefficients N, the coefficient vector a, the
order of the derivative i and a point x, BSDER returns the number
\[ \sum_{j=1}^{N} a_j B^{(i)}_{j,k}(x) \]
where Bj,k is the j-th B-spline of order k for the knot sequence t. Note that this function routine
arbitrarily treats these functions as if they were right continuous near XKNOT(KORDER) and left
continuous near XKNOT(NCOEF + 1). Thus, if we have KORDER knots stacked at the left or right end
point, and if we try to evaluate at these end points, then we will get the value of the limit from the
interior of the interval.
Comments
1.
2.
Informational errors
Type   Code
4      4     Multiplicity of the knots cannot exceed the order of the spline.
4      5     The knots must be nondecreasing.
Example
A spline interpolant to the function
\[ f(x) = \sqrt{x} \]
is constructed using BSINT. The B-spline representation, which is returned by the IMSL routine BSINT, is then used by BSDER to compute the value and derivative of the interpolant. The output consists of the interpolation values and the error at the data points and the midpoints. In addition, we display the value of the derivative and the error at these same points.
      USE BSDER_INT
      USE BSINT_INT
      USE BSNAK_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    KORDER, NDATA, NKNOT
      PARAMETER  (KORDER=3, NDATA=5, NKNOT=NDATA+KORDER)
!
      INTEGER    I, NCOEF, NOUT
      REAL       BSCOEF(NDATA), BT0, BT1, DF, F, FDATA(NDATA),&
                 FLOAT, SQRT, X, XDATA(NDATA), XKNOT(NKNOT), XT
      INTRINSIC  FLOAT, SQRT
!                                 Define function and derivative
      F(X)  = SQRT(X)
      DF(X) = 0.5/SQRT(X)
!                                 Set up interpolation points
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I)/FLOAT(NDATA)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Generate knot sequence
      CALL BSNAK (NDATA, XDATA, KORDER, XKNOT)
!                                 Interpolate
      CALL BSINT (NDATA, XDATA, FDATA, KORDER, XKNOT, BSCOEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
!                                 Print on a finer grid
      NCOEF = NDATA
      XT    = XDATA(1)
!                                 Evaluate spline
      BT0   = BSDER(0,XT,KORDER,XKNOT,NCOEF,BSCOEF)
      BT1   = BSDER(1,XT,KORDER,XKNOT,NCOEF,BSCOEF)
      WRITE (NOUT,99998) XT, BT0, F(XT) - BT0, BT1, DF(XT) - BT1
      DO 20  I=2, NDATA
         XT  = (XDATA(I-1)+XDATA(I))/2.0
!                                 Evaluate spline
         BT0 = BSDER(0,XT,KORDER,XKNOT,NCOEF,BSCOEF)
         BT1 = BSDER(1,XT,KORDER,XKNOT,NCOEF,BSCOEF)
         WRITE (NOUT,99998) XT, BT0, F(XT) - BT0, BT1, DF(XT) - BT1
         XT  = XDATA(I)
!                                 Evaluate spline
         BT0 = BSDER(0,XT,KORDER,XKNOT,NCOEF,BSCOEF)
         BT1 = BSDER(1,XT,KORDER,XKNOT,NCOEF,BSCOEF)
         WRITE (NOUT,99998) XT, BT0, F(XT) - BT0, BT1, DF(XT) - BT1
   20 CONTINUE
99998 FORMAT (' ', F6.4, 5X, F7.4, 3X, F10.6, 5X, F8.4, 3X, F10.6)
99999 FORMAT (6X, 'X', 8X, 'S(X)', 7X, 'Error', 8X, 'S''(X)', 8X,&
              'Error', /)
      END
Output
      X        S(X)       Error        S'(X)        Error

 0.2000     0.4472    0.000000      1.0423     0.075738
 0.3000     0.5456    0.002084      0.9262    -0.013339
 0.4000     0.6325    0.000000      0.8101    -0.019553
 0.5000     0.7077   -0.000557      0.6940     0.013071
 0.6000     0.7746    0.000000      0.6446     0.000869
 0.7000     0.8366    0.000071      0.5952     0.002394
 0.8000     0.8944    0.000000      0.5615    -0.002525
 0.9000     0.9489   -0.000214      0.5279    -0.000818
 1.0000     1.0000    0.000000      0.4942     0.005814
BS1GD
Evaluates the derivative of a spline on a grid, given its B-spline representation.
Required Arguments
IDERIV Order of the derivative to be evaluated. (Input)
In particular, IDERIV = 0 returns the value of the spline.
XVEC Array of length N containing the points at which the spline is to be evaluated.
(Input)
XVEC should be strictly increasing.
KORDER Order of the spline. (Input)
XKNOT Array of length NCOEF + KORDER containing the knot sequence. (Input)
XKNOT must be nondecreasing.
BSCOEF Array of length NCOEF containing the B-spline coefficients. (Input)
VALUE Array of length N containing the values of the IDERIV-th derivative of the spline
at the points in XVEC. (Output)
Optional Arguments
N Length of vector XVEC. (Input)
Default: N = size (XVEC,1).
NCOEF Number of B-spline coefficients. (Input)
Default: NCOEF = size (BSCOEF,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BS1GD evaluates a B-spline (or its derivative) at a vector of points. That is, given a vector x of length n satisfying $x_i < x_{i+1}$ for i = 1, …, n − 1, a derivative value j, and a B-spline s that is represented by a knot sequence and coefficient sequence, this routine returns the values
\[ s^{(j)}(x_i), \quad i = 1, \ldots, n \]
in the array VALUE. The functionality of this routine is the same as that of BSDER called in a loop; however, BS1GD should be much more efficient. This routine converts the B-spline representation to piecewise polynomial form using the IMSL routine BSCPP, and then uses the IMSL routine PPVAL for evaluation.
Comments
1.
2.
Informational error
Type
4
Code
5 The points in XVEC must be strictly increasing.
Example
To illustrate the use of BS1GD, we modify the example program for BSDER. In this example, a
quadratic (order 3) spline interpolant to F is computed. The values and derivatives of this spline
are then compared with the exact function and derivative values. The routine BS1GD is based on
the routines BSPLPP and PPVALU in de Boor (1978, page 89).
      USE BS1GD_INT
      USE BSINT_INT
      USE BSNAK_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    KORDER, NDATA, NKNOT, NFGRID
      PARAMETER  (KORDER=3, NDATA=5, NKNOT=NDATA+KORDER, NFGRID = 9)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    I, NCOEF, NOUT
      REAL       ANS0(NFGRID), ANS1(NFGRID), BSCOEF(NDATA),&
                 FDATA(NDATA),&
                 X, XDATA(NDATA), XKNOT(NKNOT), XVEC(NFGRID)
!                                 SPECIFICATIONS FOR INTRINSICS
      INTRINSIC  FLOAT, SQRT
      REAL       FLOAT, SQRT
!                                 SPECIFICATIONS FOR SUBROUTINES
      REAL       DF, F
!
      F(X)  = SQRT(X)
      DF(X) = 0.5/SQRT(X)
!
      CALL UMACH (2, NOUT)
!                                 Set up interpolation points
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I)/FLOAT(NDATA)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
      CALL BSNAK (NDATA, XDATA, KORDER, XKNOT)
!                                 Interpolate
      CALL BSINT (NDATA, XDATA, FDATA, KORDER, XKNOT, BSCOEF)
      WRITE (NOUT,99999)
!                                 Print on a finer grid
      NCOEF   = NDATA
      XVEC(1) = XDATA(1)
      DO 20  I=2, 2*NDATA - 2, 2
         XVEC(I)   = (XDATA(I/2+1)+XDATA(I/2))/2.0
         XVEC(I+1) = XDATA(I/2+1)
   20 CONTINUE
      CALL BS1GD (0, XVEC, KORDER, XKNOT, BSCOEF, ANS0)
      CALL BS1GD (1, XVEC, KORDER, XKNOT, BSCOEF, ANS1)
      DO 30  I=1, 2*NDATA - 1
         WRITE (NOUT,99998) XVEC(I), ANS0(I), F(XVEC(I)) - ANS0(I),&
                            ANS1(I), DF(XVEC(I)) - ANS1(I)
   30 CONTINUE
99998 FORMAT (' ', F6.4, 5X, F7.4, 5X, F8.4, 5X, F8.4, 5X, F8.4)
99999 FORMAT (6X, 'X', 8X, 'S(X)', 7X, 'Error', 8X, 'S''(X)', 8X,&
              'Error', /)
      END
Output
      X        S(X)       Error        S'(X)        Error

 0.2000     0.4472      0.0000       1.0423       0.0757
 0.3000     0.5456      0.0021       0.9262      -0.0133
 0.4000     0.6325      0.0000       0.8101      -0.0196
 0.5000     0.7077     -0.0006       0.6940       0.0131
 0.6000     0.7746      0.0000       0.6446       0.0009
 0.7000     0.8366      0.0001       0.5952       0.0024
 0.8000     0.8944      0.0000       0.5615      -0.0025
 0.9000     0.9489     -0.0002       0.5279      -0.0008
 1.0000     1.0000      0.0000       0.4942       0.0058
BSITG
This function evaluates the integral of a spline, given its B-spline representation.
Required Arguments
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function BSITG computes the integral of a spline given its B-spline representation. Specifically, given the knot sequence t = XKNOT, the order k = KORDER, the coefficients a = BSCOEF, n = NCOEF and an interval [a, b], BSITG returns the value
\[ \int_a^b \sum_{i=1}^{n} a_i B_{i,k,t}(x) \, dx \]
This routine uses the identity (22) on page 151 of de Boor (1978), and it assumes that $t_1 = \cdots = t_k$ and $t_{n+1} = \cdots = t_{n+k}$.
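A quick sanity check on such integrals uses the standard fact that a single B-spline of order k integrates to (t_{i+k} − t_i)/k over its support. Under the stacked end knots assumed above, the integral of the spline over the whole interval [t_k, t_{n+1}] is therefore the sum of a_i (t_{i+k} − t_i)/k. The Python sketch below covers only this full-range case; the name `integral_full_range` is hypothetical, and it is not claimed to be the identity cited from de Boor or the method BSITG uses for a general [a, b].

```python
def integral_full_range(korder, xknot, bscoef):
    """Integral of sum_i a_i B_{i,k,t} over [t_k, t_{n+1}] (0-based arrays).

    Assumes the first k and the last k knots coincide, as stated in the text.
    """
    k, t, a = korder, list(xknot), list(bscoef)
    n = len(a)
    if len(t) != n + k:
        raise ValueError("xknot must have length NCOEF + KORDER")
    if t[0] != t[k - 1] or t[n] != t[n + k - 1]:
        raise ValueError("end knots must have multiplicity k")
    # Each B-spline contributes its coefficient times (t_{i+k} - t_i) / k.
    return sum(a[i] * (t[i + k] - t[i]) / k for i in range(n))
```

For the broken line through (0, 1), (1, 3), (2, 2), given by order 2, knots 0, 0, 1, 2, 2 and coefficients 1, 3, 2, the trapezoid areas 2 and 2.5 add up to 4.5, which the formula reproduces.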
Comments
1.
2.
Informational errors
Type   Code
3      7
3      8
3      9
4      4
4      5
Example
We integrate the quartic (k = 5) spline that interpolates $x^3$ at the points {i/10 : i = −10, …, 10} over the interval [0, 1]. The exact answer is 1/4 since the interpolant reproduces cubic polynomials.
      USE BSITG_INT
      USE BSNAK_INT
      USE BSINT_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    KORDER, NDATA, NKNOT
      PARAMETER  (KORDER=5, NDATA=21, NKNOT=NDATA+KORDER)
!
      INTEGER    I, NCOEF, NOUT
      REAL       A, B, BSCOEF(NDATA), ERROR, EXACT, F,&
                 FDATA(NDATA), FI, FLOAT, VAL, X, XDATA(NDATA),&
                 XKNOT(NKNOT)
      INTRINSIC  FLOAT
!                                 Define function and integral
      F(X)  = X*X*X
      FI(X) = X**4/4.0
!                                 Set up interpolation points
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I-11)/10.0
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Generate knot sequence
      CALL BSNAK (NDATA, XDATA, KORDER, XKNOT)
!                                 Interpolate
      CALL BSINT (NDATA, XDATA, FDATA, KORDER, XKNOT, BSCOEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!
      NCOEF = NDATA
      A     = 0.0
      B     = 1.0
!                                 Integrate from A to B
      VAL   = BSITG(A,B,KORDER,XKNOT,NCOEF,BSCOEF)
      EXACT = FI(B) - FI(A)
      ERROR = EXACT - VAL
!                                 Print results
      WRITE (NOUT,99999) A, B, VAL, EXACT, ERROR
99999 FORMAT (' On the closed interval (', F3.1, ',', F3.1,&
              ') we have :', /, 1X, 'Computed Integral = ', F10.5, /,&
              1X, 'Exact Integral    = ', F10.5, /,&
              1X, 'Error             = ', F10.6, /)
      END
Output
 On the closed interval (0.0,1.0) we have :
 Computed Integral =    0.25000
 Exact Integral    =    0.25000
 Error             =   0.000000
BS2VL
This function evaluates a two-dimensional tensor-product spline, given its tensor-product B-spline
representation.
Required Arguments
X X-coordinate of the point at which the spline is to be evaluated. (Input)
Y Y-coordinate of the point at which the spline is to be evaluated. (Input)
KXORD Order of the spline in the X-direction. (Input)
KYORD Order of the spline in the Y-direction. (Input)
XKNOT Array of length NXCOEF + KXORD containing the knot sequence in the X-direction.
(Input)
XKNOT must be nondecreasing.
YKNOT Array of length NYCOEF + KYORD containing the knot sequence in the Y-direction.
(Input)
YKNOT must be nondecreasing.
NXCOEF Number of B-spline coefficients in the X-direction. (Input)
NYCOEF Number of B-spline coefficients in the Y-direction. (Input)
BSCOEF Array of length NXCOEF * NYCOEF containing the tensor-product B-spline
coefficients. (Input)
BSCOEF is treated internally as a matrix of size NXCOEF by NYCOEF.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function BS2VL evaluates a bivariate tensor product spline (represented as a linear
combination of tensor product B-splines) at a given point. This routine is a special case of the
routine BS2DR, which evaluates partial derivatives of such a spline. (The value of a spline is its
zero-th derivative.) For more information see de Boor (1978, pages 351–353).
This routine returns the value of the function s at a point (x, y) given the coefficients c by computing
\[ s(x, y) = \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nm} B_{n,k_x,t_x}(x) B_{m,k_y,t_y}(y) \]
where kx and ky are the orders of the splines. (These numbers are passed to the subroutine in
KXORD and KYORD, respectively.) Likewise, tx and ty are the corresponding knot sequences (XKNOT
and YKNOT).
Comments
Workspace may be explicitly provided, if desired, by use of B22VL/DB22VL. The reference
is:
CALL B22VL(X, Y, KXORD, KYORD, XKNOT, YKNOT, NXCOEF, NYCOEF, BSCOEF, WK)
Example
For an example of the use of BS2VL, see IMSL routine BS2IN.
BS2DR
This function evaluates the derivative of a two-dimensional tensor-product spline, given its tensor-product B-spline representation.
Required Arguments
IXDER Order of the derivative in the X-direction. (Input)
IYDER Order of the derivative in the Y-direction. (Input)
X X-coordinate of the point at which the spline is to be evaluated. (Input)
Y Y-coordinate of the point at which the spline is to be evaluated. (Input)
KXORD Order of the spline in the X-direction. (Input)
KYORD Order of the spline in the Y-direction. (Input)
XKNOT Array of length NXCOEF + KXORD containing the knot sequence in the X-direction. (Input)
XKNOT must be nondecreasing.
YKNOT Array of length NYCOEF + KYORD containing the knot sequence in the Y-direction.
(Input)
YKNOT must be nondecreasing.
NXCOEF Number of B-spline coefficients in the X-direction. (Input)
NYCOEF Number of B-spline coefficients in the Y-direction. (Input)
BSCOEF Array of length NXCOEF * NYCOEF containing the tensor-product B-spline
coefficients. (Input)
BSCOEF is treated internally as a matrix of size NXCOEF by NYCOEF.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BS2DR evaluates a partial derivative of a bivariate tensor-product spline (represented
as a linear combination of tensor product B-splines) at a given point; see de Boor (1978, pages 351–353).
This routine returns the value of $s^{(p,q)}$ at a point (x, y) given the coefficients c by computing
\[ s^{(p,q)}(x, y) = \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nm} B^{(p)}_{n,k_x,t_x}(x) B^{(q)}_{m,k_y,t_y}(y) \]
where kx and ky are the orders of the splines. (These numbers are passed to the subroutine in
KXORD and KYORD, respectively.) Likewise, tx and ty are the corresponding knot sequences (XKNOT
and YKNOT).
Comments
1.
2.
Informational errors
Type   Code
3      1     The point X does not satisfy XKNOT(KXORD) .LE. X .LE. XKNOT(NXCOEF + 1).
3      2     The point Y does not satisfy YKNOT(KYORD) .LE. Y .LE. YKNOT(NYCOEF + 1).
Example
In this example, a spline interpolant s to a function f is constructed. We use the IMSL routine BS2IN to compute the interpolant and then BS2DR is employed to compute $s^{(2,1)}(x, y)$. The values of this partial derivative and the error are computed on a 4 × 4 grid and then displayed.
      USE BS2DR_INT
      USE BSNAK_INT
      USE UMACH_INT
      USE BS2IN_INT
!
      IMPLICIT   NONE
      INTEGER
      PARAMETER
!
INTEGER
REAL
INTRINSIC
!
!
10
!
!
20
!
!
30
40
!
!
!
!
!
!
50
60
99999
Output
                                       (2,1)
             X              Y          S     (X,Y)     Error
         0.0000         0.0000         0.0000       0.000000
         0.0000         0.3333         0.0000       0.000000
         0.0000         0.6667         0.0000       0.000000
         0.0000         1.0000         0.0000       0.000001
         0.3333         0.0000         0.0000       0.000000
         0.3333         0.3333         1.3333       0.000002
         0.3333         0.6667         2.6667      -0.000002
         0.3333         1.0000         4.0000       0.000008
         0.6667         0.0000         0.0000       0.000006
         0.6667         0.3333         2.6667      -0.000011
         0.6667         0.6667         5.3333       0.000028
         0.6667         1.0000         8.0001      -0.000134
         1.0000         0.0000        -0.0004       0.000439
         1.0000         0.3333         4.0003      -0.000319
         1.0000         0.6667         7.9996       0.000363
         1.0000         1.0000        12.0005      -0.000458
BS2GD
Evaluates the derivative of a two-dimensional tensor-product spline, given its tensor-product
B-spline representation on a grid.
Required Arguments
IXDER Order of the derivative in the X-direction. (Input)
IYDER Order of the derivative in the Y-direction. (Input)
XVEC Array of length NX containing the X-coordinates at which the spline is to be
evaluated. (Input)
The points in XVEC should be strictly increasing.
YVEC Array of length NY containing the Y-coordinates at which the spline is to be
evaluated. (Input)
The points in YVEC should be strictly increasing.
KXORD Order of the spline in the X-direction. (Input)
KYORD Order of the spline in the Y-direction. (Input)
XKNOT Array of length NXCOEF + KXORD containing the knot sequence in the X-direction.
(Input)
XKNOT must be nondecreasing.
YKNOT Array of length NYCOEF + KYORD containing the knot sequence in the Y-direction.
(Input)
YKNOT must be nondecreasing.
Optional Arguments
NX Number of grid points in the X-direction. (Input)
Default: NX = size (XVEC,1).
NY Number of grid points in the Y-direction. (Input)
Default: NY = size (YVEC,1).
NXCOEF Number of B-spline coefficients in the X-direction. (Input)
Default: NXCOEF = size (XKNOT,1) - KXORD.
NYCOEF Number of B-spline coefficients in the Y-direction. (Input)
Default: NYCOEF = size (YKNOT,1) - KYORD.
LDVALU Leading dimension of VALUE exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDVALU = SIZE (VALUE,1).
FORTRAN 90 Interface
Generic:
CALL BS2GD (IXDER, IYDER, XVEC, YVEC, KXORD, KYORD, XKNOT, YKNOT,
BSCOEF, VALUE [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL BS2GD (IXDER, IYDER, NX, XVEC, NY, YVEC, KXORD, KYORD, XKNOT,
YKNOT, NXCOEF, NYCOEF, BSCOEF, VALUE, LDVALU)
Double:
Description
The routine BS2GD evaluates a partial derivative of a bivariate tensor-product spline (represented
as a linear combination of tensor-product B-splines) on a grid of points; see de Boor (1978, pages
351-353).
This routine returns the values of $s^{(p,q)}$ on the grid $(x_i, y_j)$ for $i = 1, \ldots, n_x$ and $j = 1, \ldots, n_y$ given the
coefficients $c$ by computing (for all $(x, y)$ in the grid)
$$s^{(p,q)}(x, y) = \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nm}\, B^{(p)}_{n,k_x,\mathbf{t}_x}(x)\, B^{(q)}_{m,k_y,\mathbf{t}_y}(y)$$
where $k_x$ and $k_y$ are the orders of the splines. (These numbers are passed to the subroutine in
KXORD and KYORD, respectively.) Likewise, $\mathbf{t}_x$ and $\mathbf{t}_y$ are the corresponding knot sequences (XKNOT
and YKNOT). The grid must be ordered in the sense that $x_i < x_{i+1}$ and $y_j < y_{j+1}$.
Comments
1.
Informational errors
Type
Code
3
4
4
3
4
Example
In this example, a spline interpolant s to a function f is constructed. We use the IMSL routine
BS2IN to compute the interpolant and then BS2GD is employed to compute $s^{(2,1)}(x, y)$ on a grid.
The values of this partial derivative and the error are computed on a 4 × 4 grid and then displayed.
      USE BS2GD_INT
      USE BS2IN_INT
      USE BSNAK_INT
      USE UMACH_INT
!
      IMPLICIT   NONE
!
!
!
10
!
20
!
!
!
30
40
!
!
!
!
      WRITE (NOUT,99999)
      DO 50 I=1, 4
         XVEC(I) = FLOAT(I-1)/3.0
         YVEC(I) = XVEC(I)
   50 CONTINUE
      CALL BS2GD (2, 1, XVEC, YVEC, KXORD, KYORD, DOCXK, DOCYK,&
                  DOCBSC, VALUE)
      DO 70 I=1, 4
         DO 60 J=1, 4
            WRITE (NOUT,'(3F15.4,F15.6)') XVEC(I), YVEC(J),&
                                          VALUE(I,J),&
                                          F21(XVEC(I),YVEC(J)) -&
                                          VALUE(I,J)
   60    CONTINUE
   70 CONTINUE
99999 FORMAT (39X, '(2,1)', /, 13X, 'X', 14X, 'Y', 10X, 'S     (X,Y)',&
              5X, 'Error')
      END
Output
                                       (2,1)
             X              Y          S     (X,Y)     Error
         0.0000         0.0000         0.0000       0.000000
         0.0000         0.3333         0.0000       0.000000
         0.0000         0.6667         0.0000       0.000000
         0.0000         1.0000         0.0000       0.000001
         0.3333         0.0000         0.0000      -0.000001
         0.3333         0.3333         1.3333       0.000001
         0.3333         0.6667         2.6667      -0.000004
         0.3333         1.0000         4.0000       0.000008
         0.6667         0.0000         0.0000      -0.000001
         0.6667         0.3333         2.6667      -0.000008
         0.6667         0.6667         5.3333       0.000038
         0.6667         1.0000         8.0001      -0.000113
         1.0000         0.0000        -0.0005       0.000488
         1.0000         0.3333         4.0004      -0.000412
         1.0000         0.6667         7.9995       0.000488
         1.0000         1.0000        12.0002      -0.000244
BS2IG
This function evaluates the integral of a tensor-product spline on a rectangular domain, given its
tensor-product B-spline representation.
Required Arguments
A Lower limit of the X-variable. (Input)
B Upper limit of the X-variable. (Input)
C Lower limit of the Y-variable. (Input)
D Upper limit of the Y-variable. (Input)
KXORD Order of the spline in the X-direction. (Input)
KYORD Order of the spline in the Y-direction. (Input)
XKNOT Array of length NXCOEF + KXORD containing the knot sequence in the X-direction.
(Input)
XKNOT must be nondecreasing.
YKNOT Array of length NYCOEF + KYORD containing the knot sequence in the Y-direction.
(Input)
YKNOT must be nondecreasing.
BSCOEF Array of length NXCOEF * NYCOEF containing the tensor-product B-spline
coefficients. (Input)
BSCOEF is treated internally as a matrix of size NXCOEF by NYCOEF.
Optional Arguments
NXCOEF Number of B-spline coefficients in the X-direction. (Input)
Default: NXCOEF = size (XKNOT,1) - KXORD.
NYCOEF Number of B-spline coefficients in the Y-direction. (Input)
Default: NYCOEF = size (YKNOT,1) - KYORD.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
BS2IG 751
Description
The function BS2IG computes the integral of a tensor-product two-dimensional spline given its
B-spline representation. Specifically, given the knot sequence $\mathbf{t}_x$ = XKNOT, $\mathbf{t}_y$ = YKNOT, the order
$k_x$ = KXORD, $k_y$ = KYORD, the coefficients $\beta$ = BSCOEF, the number of coefficients $n_x$ = NXCOEF,
$n_y$ = NYCOEF and a rectangle [a, b] by [c, d], BS2IG returns the value
$$\int_a^b \int_c^d \sum_{i=1}^{n_x} \sum_{j=1}^{n_y} \beta_{ij}\, B_{ij} \, dy \, dx$$
where
$$B_{i,j}(x, y) = B_{i,k_x,\mathbf{t}_x}(x)\, B_{j,k_y,\mathbf{t}_y}(y)$$
This routine uses the identity (22) on page 151 of de Boor (1978). It assumes (for all knot
sequences) that the first and last k knots are stacked, that is, $t_1 = \cdots = t_k$ and $t_{n+1} = \cdots = t_{n+k}$,
where k is the order of the spline in the x or y direction.
Comments
1.
2. Informational errors
   Type   Code
   3      1      The lower limit of the X-integration is less than XKNOT(KXORD).
   3      2      The upper limit of the X-integration is greater than XKNOT(NXCOEF + 1).
   3      3      The lower limit of the Y-integration is less than YKNOT(KYORD).
   3      4      The upper limit of the Y-integration is greater than YKNOT(NYCOEF + 1).
   4      13     Multiplicity of the knots cannot exceed the order of the spline.
   4      14     The knots must be nondecreasing.
Example
We integrate the two-dimensional tensor-product quartic ($k_x = 5$) by linear ($k_y = 2$) spline that
interpolates $x^3 + xy$ at the points $\{(i/10, j/5) : i = -10, \ldots, 10 \text{ and } j = 0, \ldots, 5\}$ over the rectangle
[0, 1] × [.5, 1]. The exact answer is 5/16.
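The quoted exact value can be checked independently of the library. The following is a minimal, self-contained sketch (it does not call any IMSL routine, and the program and variable names are illustrative only) that integrates x**3 + x*y over [a, b] × [c, d] term by term in closed form and compares the result with 5/16.

```fortran
program bs2ig_exact_check
  implicit none
  real :: a, b, c, d, exact

  ! Integration limits from the example: [0, 1] x [.5, 1]
  a = 0.0
  b = 1.0
  c = 0.5
  d = 1.0

  ! Integral of x**3 is (b**4 - a**4)/4 times (d - c);
  ! integral of x*y is (b**2 - a**2)/2 times (d**2 - c**2)/2.
  exact = 0.25*(b**4 - a**4)*(d - c) + 0.25*(b**2 - a**2)*(d**2 - c**2)

  ! Self-check against the value stated in the text
  if (abs(exact - 5.0/16.0) > 1.0e-6) error stop 'exact value check failed'
  print '(A, F10.5)', ' Exact Integral = ', exact
end program bs2ig_exact_check
```

Because the data function is a cubic in x and linear in y, it lies in the quartic-by-linear spline space, so the interpolant and its integral should agree with this value up to rounding.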
USE BS2IG_INT
USE BSNAK_INT
USE BS2IN_INT
USE UMACH_INT
!
      IMPLICIT   NONE
INTEGER
PARAMETER
!
INTEGER
REAL
!
!
10
!
!
20
!
!
30
40
!
!
!
!
!
99999
Output
 Computed Integral =    0.31250
 Exact Integral    =    0.31250
 Error             =   0.000000
BS3VL
This function evaluates a three-dimensional tensor-product spline, given its tensor-product B-spline representation.
Required Arguments
X X-coordinate of the point at which the spline is to be evaluated. (Input)
Y Y-coordinate of the point at which the spline is to be evaluated. (Input)
Z Z-coordinate of the point at which the spline is to be evaluated. (Input)
KXORD Order of the spline in the X-direction. (Input)
KYORD Order of the spline in the Y-direction. (Input)
KZORD Order of the spline in the Z-direction. (Input)
XKNOT Array of length NXCOEF + KXORD containing the knot sequence in the X-direction.
(Input)
XKNOT must be nondecreasing.
YKNOT Array of length NYCOEF + KYORD containing the knot sequence in the Y-direction.
(Input)
YKNOT must be nondecreasing.
ZKNOT Array of length NZCOEF + KZORD containing the knot sequence in the Z-direction.
(Input)
ZKNOT must be nondecreasing.
NXCOEF Number of B-spline coefficients in the X-direction. (Input)
NYCOEF Number of B-spline coefficients in the Y-direction. (Input)
NZCOEF Number of B-spline coefficients in the Z-direction. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function BS3VL evaluates a trivariate tensor-product spline (represented as a linear
combination of tensor-product B-splines) at a given point. This routine is a special case of the
IMSL routine BS3DR, which evaluates a partial derivative of such a spline. (The value of a spline
is its zero-th derivative.) For more information, see de Boor (1978, pages 351-353).
This routine returns the value of the function s at a point (x, y, z) given the coefficients c by
computing
$$s(x, y, z) = \sum_{l=1}^{N_z} \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nml}\, B_{n,k_x,\mathbf{t}_x}(x)\, B_{m,k_y,\mathbf{t}_y}(y)\, B_{l,k_z,\mathbf{t}_z}(z)$$
where $k_x$, $k_y$, and $k_z$ are the orders of the splines. (These numbers are passed to the subroutine in
KXORD, KYORD, and KZORD, respectively.) Likewise, $\mathbf{t}_x$, $\mathbf{t}_y$, and $\mathbf{t}_z$ are the corresponding knot
sequences (XKNOT, YKNOT, and ZKNOT).
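To make the tensor-product sum concrete, the sketch below evaluates it directly with a textbook Cox-de Boor recursion. It is an illustration under stated assumptions, not the algorithm BS3VL uses: the names bval and spline3 are hypothetical, the data are made up (order 2 splines with stacked end knots, coefficients chosen to reproduce x + 2y + 3z), and a production routine would evaluate only the basis functions that are nonzero at the point.

```fortran
program tensor_spline_sketch
  implicit none
  integer, parameter :: kord = 2, ncoef = 3
  real :: t(ncoef+kord), c(ncoef,ncoef,ncoef), s
  integer :: n, m, l

  ! Hypothetical data: piecewise-linear splines on [0,1], stacked end knots.
  ! For order 2 the n-th coefficient is the spline value at knot t(n+1),
  ! so these coefficients reproduce f(x,y,z) = x + 2y + 3z.
  t = (/ 0.0, 0.0, 0.5, 1.0, 1.0 /)
  do l = 1, ncoef
    do m = 1, ncoef
      do n = 1, ncoef
        c(n,m,l) = t(n+1) + 2.0*t(m+1) + 3.0*t(l+1)
      end do
    end do
  end do

  s = spline3(0.3, 0.6, 0.2, kord, kord, kord, t, t, t, c)
  if (abs(s - 2.1) > 1.0e-5) error stop 'tensor-product check failed'
  print '(A, F8.4)', ' s(0.3, 0.6, 0.2) = ', s

contains

  ! Cox-de Boor recursion for the B-spline B_{i,k,t}(x).
  ! Half-open intervals: the right end of the knot range evaluates to zero.
  recursive function bval(i, k, t, x) result(b)
    integer, intent(in) :: i, k
    real, intent(in) :: t(:), x
    real :: b
    b = 0.0
    if (k == 1) then
      if (t(i) <= x .and. x < t(i+1)) b = 1.0
    else
      if (t(i+k-1) > t(i)) &
        b = b + (x - t(i))/(t(i+k-1) - t(i))*bval(i, k-1, t, x)
      if (t(i+k) > t(i+1)) &
        b = b + (t(i+k) - x)/(t(i+k) - t(i+1))*bval(i+1, k-1, t, x)
    end if
  end function bval

  ! Direct triple sum over all coefficients
  function spline3(x, y, z, kx, ky, kz, tx, ty, tz, c) result(s)
    real, intent(in) :: x, y, z, tx(:), ty(:), tz(:), c(:,:,:)
    integer, intent(in) :: kx, ky, kz
    real :: s
    integer :: n, m, l
    s = 0.0
    do l = 1, size(c, 3)
      do m = 1, size(c, 2)
        do n = 1, size(c, 1)
          s = s + c(n,m,l)*bval(n, kx, tx, x)*bval(m, ky, ty, y) &
                          *bval(l, kz, tz, z)
        end do
      end do
    end do
  end function spline3
end program tensor_spline_sketch
```

The direct sum costs one basis evaluation per coefficient and direction; it is meant only to show how the coefficient array, orders, and knot sequences enter the formula.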
Comments
Workspace may be explicitly provided, if desired, by use of B23VL/DB23VL. The reference is:
CALL B23VL (X, Y, Z, KXORD, KYORD, KZORD, XKNOT, YKNOT, ZKNOT, NXCOEF,
NYCOEF, NZCOEF, BSCOEF, WK)
Example
For an example of the use of BS3VL, see IMSL routine BS3IN.
BS3DR
This function evaluates the derivative of a three-dimensional tensor-product spline, given its
tensor-product B-spline representation.
Required Arguments
IXDER Order of the X-derivative. (Input)
IYDER Order of the Y-derivative. (Input)
IZDER Order of the Z-derivative. (Input)
X X-coordinate of the point at which the spline is to be evaluated. (Input)
Y Y-coordinate of the point at which the spline is to be evaluated. (Input)
Z Z-coordinate of the point at which the spline is to be evaluated. (Input)
KXORD Order of the spline in the X-direction. (Input)
KYORD Order of the spline in the Y-direction. (Input)
KZORD Order of the spline in the Z-direction. (Input)
XKNOT Array of length NXCOEF + KXORD containing the knot sequence in the X-direction.
(Input)
XKNOT must be nondecreasing.
YKNOT Array of length NYCOEF + KYORD containing the knot sequence in the Y-direction.
(Input)
YKNOT must be nondecreasing.
ZKNOT Array of length NZCOEF + KZORD containing the knot sequence in the Z-direction.
(Input)
ZKNOT must be nondecreasing.
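NXCOEF Number of B-spline coefficients in the X-direction. (Input)
NYCOEF Number of B-spline coefficients in the Y-direction. (Input)
NZCOEF Number of B-spline coefficients in the Z-direction. (Input)
BSCOEF Array of length NXCOEF * NYCOEF * NZCOEF containing the tensor-product
B-spline coefficients. (Input)
BSCOEF is treated internally as a matrix of size NXCOEF by NYCOEF by NZCOEF.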
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function BS3DR evaluates a partial derivative of a trivariate tensor-product spline (represented
as a linear combination of tensor-product B-splines) at a given point. For more information, see de
Boor (1978, pages 351-353).
This routine returns the value of the function $s^{(p,q,r)}$ at a point (x, y, z) given the coefficients c by
computing
$$s^{(p,q,r)}(x, y, z) = \sum_{l=1}^{N_z} \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nml}\, B^{(p)}_{n,k_x,\mathbf{t}_x}(x)\, B^{(q)}_{m,k_y,\mathbf{t}_y}(y)\, B^{(r)}_{l,k_z,\mathbf{t}_z}(z)$$
where $k_x$, $k_y$, and $k_z$ are the orders of the splines. (These numbers are passed to the subroutine in
KXORD, KYORD, and KZORD, respectively.) Likewise, $\mathbf{t}_x$, $\mathbf{t}_y$, and $\mathbf{t}_z$ are the corresponding knot
sequences (XKNOT, YKNOT, and ZKNOT).
Comments
1.
2. Informational errors
   Type   Code
   3      1      The point X does not satisfy
                 XKNOT(KXORD) .LE. X .LE. XKNOT(NXCOEF + 1).
   3      2      The point Y does not satisfy
                 YKNOT(KYORD) .LE. Y .LE. YKNOT(NYCOEF + 1).
   3      3      The point Z does not satisfy
                 ZKNOT(KZORD) .LE. Z .LE. ZKNOT(NZCOEF + 1).
Example
In this example, a spline interpolant s to a function $f(x, y, z) = x^4 + y(xz)^3$ is constructed using
BS3IN. Next, BS3DR is used to compute $s^{(2,0,1)}(x, y, z)$. The values of this partial derivative and the
error are computed on a 4 × 4 × 2 grid and then displayed.
      USE BS3DR_INT
      USE BS3IN_INT
      USE BSNAK_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 SPECIFICATIONS FOR PARAMETERS
      INTEGER    KXORD, KYORD, KZORD, LDF, MDF, NXDATA, NXKNOT,&
                 NYDATA, NYKNOT, NZDATA, NZKNOT
      PARAMETER  (KXORD=5, KYORD=2, KZORD=3, NXDATA=21, NYDATA=6,&
                 NZDATA=8, LDF=NXDATA, MDF=NYDATA,&
                 NXKNOT=NXDATA+KXORD, NYKNOT=NYDATA+KYORD,&
                 NZKNOT=NZDATA+KZORD)
!
INTEGER
REAL
!
!
         ZDATA(I) = FLOAT(I-1)/FLOAT(NZDATA-1)
   30 CONTINUE
!                                 Generate knots
      CALL BSNAK (NXDATA, XDATA, KXORD, XKNOT)
      CALL BSNAK (NYDATA, YDATA, KYORD, YKNOT)
      CALL BSNAK (NZDATA, ZDATA, KZORD, ZKNOT)
!                                 Generate FDATA
      DO 50 K=1, NZDATA
         DO 40 I=1, NYDATA
            DO 40 J=1, NXDATA
               FDATA(J,I,K) = F(XDATA(J),YDATA(I),ZDATA(K))
   40    CONTINUE
   50 CONTINUE
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Interpolate
      CALL BS3IN (XDATA, YDATA, ZDATA, FDATA, KXORD, KYORD, KZORD, XKNOT, &
                  YKNOT, ZKNOT, BSCOEF)
!
      NXCOEF = NXDATA
      NYCOEF = NYDATA
      NZCOEF = NZDATA
!                                 Write heading
      WRITE (NOUT,99999)
      DO 80 I=1, 4
         DO 70 J=1, 4
            DO 60 L=1, 2
               X = 2.0*(FLOAT(I-1)/3.0) - 1.0
               Y = FLOAT(J-1)/3.0
               Z = FLOAT(L-1)
!                                 Evaluate spline
               S201 = BS3DR(2,0,1,X,Y,Z,KXORD,KYORD,KZORD,XKNOT,YKNOT,&
                      ZKNOT,NXCOEF,NYCOEF,NZCOEF,BSCOEF)
               WRITE (NOUT,'(3F12.4,2F12.6)') X, Y, Z, S201,&
                                              F201(X,Y,Z) - S201
   60       CONTINUE
   70    CONTINUE
   80 CONTINUE
99999 FORMAT (38X, '(2,0,1)', /, 9X, 'X', 11X,&
              'Y', 11X, 'Z', 4X, 'S       (X,Y,Z)  Error')
      END
Output
                                      (2,0,1)
         X           Y           Z    S       (X,Y,Z)  Error
     -1.0000      0.0000      0.0000   -0.000107    0.000107
     -1.0000      0.0000      1.0000    0.000053   -0.000053
     -1.0000      0.3333      0.0000    0.064051   -0.064051
     -1.0000      0.3333      1.0000   -5.935941   -0.064059
     -1.0000      0.6667      0.0000    0.127542   -0.127542
     -1.0000      0.6667      1.0000  -11.873034   -0.126966
     -1.0000      1.0000      0.0000    0.191166   -0.191166
     -1.0000      1.0000      1.0000  -17.808527   -0.191473
     -0.3333      0.0000      0.0000   -0.000002    0.000002
     -0.3333      0.0000      1.0000    0.000000    0.000000
     -0.3333      0.3333      0.0000    0.021228   -0.021228
     -0.3333      0.3333      1.0000   -1.978768   -0.021232
     -0.3333      0.6667      0.0000    0.042464   -0.042464
     -0.3333      0.6667      1.0000   -3.957536   -0.042464
     -0.3333      1.0000      0.0000    0.063700   -0.063700
     -0.3333      1.0000      1.0000   -5.936305   -0.063694
      0.3333      0.0000      0.0000   -0.000003    0.000003
      0.3333      0.0000      1.0000    0.000000    0.000000
      0.3333      0.3333      0.0000   -0.021229    0.021229
      0.3333      0.3333      1.0000    1.978763    0.021238
      0.3333      0.6667      0.0000   -0.042465    0.042465
      0.3333      0.6667      1.0000    3.957539    0.042462
      0.3333      1.0000      0.0000   -0.063700    0.063700
      0.3333      1.0000      1.0000    5.936304    0.063697
      1.0000      0.0000      0.0000   -0.000098    0.000098
      1.0000      0.0000      1.0000    0.000053   -0.000053
      1.0000      0.3333      0.0000   -0.063855    0.063855
      1.0000      0.3333      1.0000    5.936146    0.063854
      1.0000      0.6667      0.0000   -0.127631    0.127631
      1.0000      0.6667      1.0000   11.873067    0.126933
      1.0000      1.0000      0.0000   -0.191442    0.191442
      1.0000      1.0000      1.0000   17.807940    0.192060
BS3GD
Evaluates the derivative of a three-dimensional tensor-product spline, given its tensor-product
B-spline representation on a grid.
Required Arguments
IXDER Order of the X-derivative. (Input)
IYDER Order of the Y-derivative. (Input)
IZDER Order of the Z-derivative. (Input)
XVEC Array of length NX containing the x-coordinates at which the spline is to be
evaluated. (Input)
The points in XVEC should be strictly increasing.
YVEC Array of length NY containing the y-coordinates at which the spline is to be
evaluated. (Input)
The points in YVEC should be strictly increasing.
ZVEC Array of length NZ containing the z-coordinates at which the spline is to be
evaluated. (Input)
The points in ZVEC should be strictly increasing.
Optional Arguments
NX Number of grid points in the x-direction. (Input)
Default: NX = size (XVEC,1).
NY Number of grid points in the y-direction. (Input)
Default: NY = size (YVEC,1).
NZ Number of grid points in the z-direction. (Input)
Default: NZ = size (ZVEC,1).
NXCOEF Number of B-spline coefficients in the x-direction. (Input)
Default: NXCOEF = size (XKNOT,1) - KXORD.
NYCOEF Number of B-spline coefficients in the y-direction. (Input)
Default: NYCOEF = size (YKNOT,1) - KYORD.
NZCOEF Number of B-spline coefficients in the z-direction. (Input)
Default: NZCOEF = size (ZKNOT,1) - KZORD.
FORTRAN 90 Interface
Generic:
CALL BS3GD (IXDER, IYDER, IZDER, XVEC, YVEC, ZVEC, KXORD, KYORD,
KZORD, XKNOT, YKNOT, ZKNOT, BSCOEF, VALUE [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL BS3GD (IXDER, IYDER, IZDER, NX, XVEC, NY, YVEC, NZ, ZVEC, KXORD,
KYORD, KZORD, XKNOT, YKNOT, ZKNOT, NXCOEF, NYCOEF, NZCOEF, BSCOEF,
VALUE, LDVALU, MDVALU)
Double:
Description
The routine BS3GD evaluates a partial derivative of a trivariate tensor-product spline (represented
as a linear combination of tensor-product B-splines) on a grid. For more information, see de Boor
(1978, pages 351-353).
This routine returns the value of the function $s^{(p,q,r)}$ on the grid $(x_i, y_j, z_k)$ for $i = 1, \ldots, n_x$,
$j = 1, \ldots, n_y$, and $k = 1, \ldots, n_z$ given the coefficients c by computing (for all (x, y, z) on the grid)
$$s^{(p,q,r)}(x, y, z) = \sum_{l=1}^{N_z} \sum_{m=1}^{N_y} \sum_{n=1}^{N_x} c_{nml}\, B^{(p)}_{n,k_x,\mathbf{t}_x}(x)\, B^{(q)}_{m,k_y,\mathbf{t}_y}(y)\, B^{(r)}_{l,k_z,\mathbf{t}_z}(z)$$
where $k_x$, $k_y$, and $k_z$ are the orders of the splines. (These numbers are passed to the subroutine in
KXORD, KYORD, and KZORD, respectively.) Likewise, $\mathbf{t}_x$, $\mathbf{t}_y$, and $\mathbf{t}_z$ are the corresponding knot
sequences (XKNOT, YKNOT, and ZKNOT). The grid must be ordered in the sense that $x_i < x_{i+1}$,
$y_j < y_{j+1}$, and $z_k < z_{k+1}$.
Comments
1.
LDVALU, MDVALU, LEFTX, LEFTY, LEFTZ, A, B, C, DBIATX, DBIATY, DBIATZ, BX, BY,
BZ)
2. Informational errors
   Type   Code
   3      1      XVEC(I) does not satisfy
                 XKNOT(KXORD) ≤ XVEC(I) ≤ XKNOT(NXCOEF + 1).
   3      2      YVEC(I) does not satisfy
                 YKNOT(KYORD) ≤ YVEC(I) ≤ YKNOT(NYCOEF + 1).
   3      3      ZVEC(I) does not satisfy
                 ZKNOT(KZORD) ≤ ZVEC(I) ≤ ZKNOT(NZCOEF + 1).
   4      4      XVEC is not strictly increasing.
   4      5      YVEC is not strictly increasing.
   4      6      ZVEC is not strictly increasing.
Example
In this example, a spline interpolant s to a function $f(x, y, z) = x^4 + y(xz)^3$ is constructed using
BS3IN. Next, BS3GD is used to compute $s^{(2,0,1)}(x, y, z)$ on the grid. The values of this partial
derivative and the error are computed on a 4 × 4 × 2 grid and then displayed.
      USE BS3GD_INT
      USE BS3IN_INT
      USE BSNAK_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    KXORD, KYORD, KZORD, LDF, LDVAL, MDF, MDVAL, NXDATA,&
                 NXKNOT, NYDATA, NYKNOT, NZ, NZDATA, NZKNOT
      PARAMETER  (KXORD=5, KYORD=2, KZORD=3, LDVAL=4, MDVAL=4,&
                 NXDATA=21, NYDATA=6, NZ=2, NZDATA=8, LDF=NXDATA,&
                 MDF=NYDATA, NXKNOT=NXDATA+KXORD, NYKNOT=NYDATA+KYORD,&
                 NZKNOT=NZDATA+KZORD)
!
INTEGER
REAL
INTRINSIC
!
!
!
      F(X,Y,Z)    = X*X*X*X + X*X*X*Y*Z*Z*Z
!                                 Exact (2,0,1) partial derivative of F;
!                                 equals 18*X*Y*Z on the grid, where Z is 0 or 1
      F201(X,Y,Z) = 18.0*X*Y*Z*Z
!
CALL UMACH (2, NOUT)
!
!
NXCOEF = NXDATA
NYCOEF = NYDATA
NZCOEF = NZDATA
!
!
!
DO 60 I=1, 4
XVEC(I) = 2.0*(FLOAT(I-1)/3.0) - 1.0
60 CONTINUE
DO 70 J=1, 4
YVEC(J) = FLOAT(J-1)/3.0
70 CONTINUE
DO 80 L=1, 2
ZVEC(L) = FLOAT(L-1)
80 CONTINUE
CALL BS3GD (2, 0, 1, XVEC, YVEC, ZVEC, KXORD, KYORD,&
KZORD, XKNOT, YKNOT, ZKNOT, BSCOEF, VALUE)
!
      WRITE (NOUT,99999)
      DO 110 I=1, 4
         DO 100 J=1, 4
            DO 90 L=1, 2
               WRITE (NOUT,'(5F13.4)') XVEC(I), YVEC(J), ZVEC(L),&
                                       VALUE(I,J,L),&
                                       F201(XVEC(I),YVEC(J),ZVEC(L)) -&
                                       VALUE(I,J,L)
   90       CONTINUE
  100    CONTINUE
  110 CONTINUE
99999 FORMAT (44X, '(2,0,1)', /, 10X, 'X', 11X, 'Y', 10X, 'Z', 10X,&
              'S       (X,Y,Z) Error')
      STOP
      END
Output
                                            (2,0,1)
          X           Y          Z          S       (X,Y,Z) Error
      -1.0000       0.0000       0.0000      -0.0005       0.0005
      -1.0000       0.0000       1.0000       0.0002      -0.0002
      -1.0000       0.3333       0.0000       0.0641      -0.0641
      -1.0000       0.3333       1.0000      -5.9360      -0.0640
      -1.0000       0.6667       0.0000       0.1274      -0.1274
      -1.0000       0.6667       1.0000     -11.8730      -0.1270
      -1.0000       1.0000       0.0000       0.1911      -0.1911
      -1.0000       1.0000       1.0000     -17.8086      -0.1914
      -0.3333       0.0000       0.0000       0.0000       0.0000
      -0.3333       0.0000       1.0000       0.0000       0.0000
      -0.3333       0.3333       0.0000       0.0212      -0.0212
      -0.3333       0.3333       1.0000      -1.9788      -0.0212
      -0.3333       0.6667       0.0000       0.0425      -0.0425
      -0.3333       0.6667       1.0000      -3.9575      -0.0425
      -0.3333       1.0000       0.0000       0.0637      -0.0637
      -0.3333       1.0000       1.0000      -5.9363      -0.0637
       0.3333       0.0000       0.0000       0.0000       0.0000
       0.3333       0.0000       1.0000       0.0000       0.0000
       0.3333       0.3333       0.0000      -0.0212       0.0212
       0.3333       0.3333       1.0000       1.9788       0.0212
       0.3333       0.6667       0.0000      -0.0425       0.0425
       0.3333       0.6667       1.0000       3.9575       0.0425
       0.3333       1.0000       0.0000      -0.0637       0.0637
       0.3333       1.0000       1.0000       5.9363       0.0637
       1.0000       0.0000       0.0000      -0.0005       0.0005
       1.0000       0.0000       1.0000       0.0000       0.0000
       1.0000       0.3333       0.0000      -0.0637       0.0637
       1.0000       0.3333       1.0000       5.9359       0.0641
       1.0000       0.6667       0.0000      -0.1273       0.1273
       1.0000       0.6667       1.0000      11.8733       0.1267
       1.0000       1.0000       0.0000      -0.1912       0.1912
       1.0000       1.0000       1.0000      17.8096       0.1904
BS3IG
This function evaluates the integral of a tensor-product spline in three dimensions over a three-dimensional rectangle, given its tensor-product B-spline representation.
Required Arguments
A Lower limit of the X-variable. (Input)
B Upper limit of the X-variable. (Input)
C Lower limit of the Y-variable. (Input)
D Upper limit of the Y-variable. (Input)
E Lower limit of the Z-variable. (Input)
F Upper limit of the Z-variable. (Input)
KXORD Order of the spline in the X-direction. (Input)
KYORD Order of the spline in the Y-direction. (Input)
KZORD Order of the spline in the Z-direction. (Input)
XKNOT Array of length NXCOEF + KXORD containing the knot sequence in the X-direction.
(Input)
XKNOT must be nondecreasing.
YKNOT Array of length NYCOEF + KYORD containing the knot sequence in the Y-direction.
(Input)
YKNOT must be nondecreasing.
ZKNOT Array of length NZCOEF + KZORD containing the knot sequence in the Z-direction.
(Input)
ZKNOT must be nondecreasing.
NXCOEF Number of B-spline coefficients in the X-direction. (Input)
NYCOEF Number of B-spline coefficients in the Y-direction. (Input)
NZCOEF Number of B-spline coefficients in the Z-direction. (Input)
BSCOEF Array of length NXCOEF * NYCOEF * NZCOEF containing the tensor-product
B-spline coefficients. (Input)
BSCOEF is treated internally as a matrix of size NXCOEF by NYCOEF by NZCOEF.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BS3IG computes the integral of a tensor-product three-dimensional spline, given its
B-spline representation. Specifically, given the knot sequence $\mathbf{t}_x$ = XKNOT, $\mathbf{t}_y$ = YKNOT, $\mathbf{t}_z$ = ZKNOT,
the order $k_x$ = KXORD, $k_y$ = KYORD, $k_z$ = KZORD, the coefficients $\beta$ = BSCOEF, the number of
coefficients $n_x$ = NXCOEF, $n_y$ = NYCOEF, $n_z$ = NZCOEF, and a three-dimensional rectangle [a, b] by
[c, d] by [e, f], BS3IG returns the value
$$\int_a^b \int_c^d \int_e^f \sum_{i=1}^{n_x} \sum_{j=1}^{n_y} \sum_{m=1}^{n_z} \beta_{ijm}\, B_{ijm} \, dz \, dy \, dx$$
where
$$B_{ijm}(x, y, z) = B_{i,k_x,\mathbf{t}_x}(x)\, B_{j,k_y,\mathbf{t}_y}(y)\, B_{m,k_z,\mathbf{t}_z}(z)$$
This routine uses the identity (22) on page 151 of de Boor (1978). It assumes (for all knot
sequences) that the first and last k knots are stacked, that is, $t_1 = \cdots = t_k$ and $t_{n+1} = \cdots = t_{n+k}$,
where k is the order of the spline in the x, y, or z direction.
Comments
1.
2. Informational errors
   Type   Code
   3      1      The lower limit of the X-integration is less than XKNOT(KXORD).
   3      2      The upper limit of the X-integration is greater than
                 XKNOT(NXCOEF + 1).
   3      3      The lower limit of the Y-integration is less than YKNOT(KYORD).
   3      4      The upper limit of the Y-integration is greater than
                 YKNOT(NYCOEF + 1).
   3      5      The lower limit of the Z-integration is less than ZKNOT(KZORD).
   3      6      The upper limit of the Z-integration is greater than
                 ZKNOT(NZCOEF + 1).
   4      13     Multiplicity of the knots cannot exceed the order of the spline.
   4      14     The knots must be nondecreasing.
Example
We integrate the three-dimensional tensor-product quartic ($k_x = 5$) by linear ($k_y = 2$) by quadratic
($k_z = 3$) spline which interpolates $x^3 + xyz$ at the points
$$\{(i/10,\ j/5,\ m/7) : i = -10, \ldots, 10,\ j = 0, \ldots, 5,\ \text{and}\ m = 0, \ldots, 7\}$$
over the rectangle [0, 1] × [.5, 1] × [0, .5]. The exact answer is 11/128.
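The exact value quoted above follows from integrating each term of the data function separately over the rectangle; the short derivation below is a hand check and is not part of the library:

```latex
\int_0^1 \int_{.5}^{1} \int_0^{.5} \left( x^3 + xyz \right) dz \, dy \, dx
  = \underbrace{\frac{1}{4} \cdot \frac{1}{2} \cdot \frac{1}{2}}_{x^3 \text{ term}}
  + \underbrace{\frac{1}{2} \cdot \frac{3}{8} \cdot \frac{1}{8}}_{xyz \text{ term}}
  = \frac{8}{128} + \frac{3}{128}
  = \frac{11}{128} \approx 0.08594
```

Here $\int_{.5}^{1} y \, dy = 3/8$ and $\int_0^{.5} z \, dz = 1/8$. Since the data function is cubic in x and linear in y and z, it lies in the spline space used here, so the integral of the interpolant should match this value up to rounding.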
      USE BS3IG_INT
      USE BS3IN_INT
      USE BSNAK_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 SPECIFICATIONS FOR PARAMETERS
      INTEGER    KXORD, KYORD, KZORD, LDF, MDF, NXDATA, NXKNOT,&
                 NYDATA, NYKNOT, NZDATA, NZKNOT
      PARAMETER  (KXORD=5, KYORD=2, KZORD=3, NXDATA=21, NYDATA=6,&
INTEGER
REAL
!
!
10
!
20
!
!
30
!
!
40
50
!
!
!
      NXCOEF = NXDATA
      NYCOEF = NYDATA
      NZCOEF = NZDATA
      A      = 0.0
      B      = 1.0
      C      = 0.5
      D      = 1.0
      E      = 0.0
      FF     = 0.5
!                                 Integrate
      VAL    = BS3IG(A,B,C,D,E,FF,KXORD,KYORD,KZORD,XKNOT,YKNOT,ZKNOT,&
               NXCOEF,NYCOEF,NZCOEF,BSCOEF)
!                                 Calculate integral directly
      G      = .5*(B**4-A**4)
      H      = (B-A)*(B+A)
      RI     = G*(D-C)
      RJ     = .5*H*(D-C)*(D+C)
      FIG    = .5*(RI*(FF-E)+.5*RJ*(FF-E)*(FF+E))
!                                 Print results
      WRITE (NOUT,99999) VAL, FIG, FIG - VAL
99999 FORMAT (' Computed Integral = ', F10.5, /, ' Exact Integral    '&
             , '= ', F10.5,/, ' Error             '&
             , '= ', F10.6, /)
      END
Output
 Computed Integral =    0.08594
 Exact Integral    =    0.08594
 Error             =   0.000000
BSCPP
Converts a spline in B-spline representation to piecewise polynomial representation.
Required Arguments
KORDER Order of the spline. (Input)
XKNOT Array of length KORDER + NCOEF containing the knot sequence. (Input)
XKNOT must be nondecreasing.
NCOEF Number of B-spline coefficients. (Input)
BSCOEF Array of length NCOEF containing the B-spline coefficients. (Input)
NPPCF Number of piecewise polynomial pieces. (Output)
NPPCF is always less than or equal to NCOEF - KORDER + 1.
BREAK Array of length (NPPCF + 1) containing the breakpoints of the piecewise
polynomial representation. (Output)
BREAK must be dimensioned at least NCOEF - KORDER + 2.
PPCOEF Array of length KORDER * NPPCF containing the local coefficients of the
polynomial pieces. (Output)
PPCOEF is treated internally as a matrix of size KORDER by NPPCF.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BSCPP is based on the routine BSPLPP by de Boor (1978, page 140). This routine is
used to convert a spline in B-spline representation to a piecewise polynomial (pp) representation
which can then be evaluated more efficiently. There is some overhead in converting from the
B-spline representation to the pp representation, but the conversion to pp form is recommended
when 3 or more function values are needed per polynomial piece.
Comments
1.
2.
Informational errors
Type
Code
4
4
4
5
Example
For an example of the use of BSCPP, see PPDER.
PPVAL
This function evaluates a piecewise polynomial.
Required Arguments
X Point at which the polynomial is to be evaluated. (Input)
Optional Arguments
KORDER Order of the polynomial. (Input)
Default: KORDER = size (PPCOEF,1).
NINTV Number of polynomial pieces. (Input)
Default: NINTV = size (PPCOEF,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine PPVAL evaluates a piecewise polynomial at a given point. This routine is a special
case of the routine PPDER, which evaluates the derivative of a piecewise polynomial. (The value
of a piecewise polynomial is its zero-th derivative.)
The routine PPDER is based on the routine PPVALU in de Boor (1978, page 89).
Example
In this example, a spline interpolant to a function f is computed using the IMSL routine BSINT.
This routine represents the interpolant as a linear combination of B-splines. This representation is
then converted to piecewise polynomial representation by calling the IMSL routine BSCPP. The
piecewise polynomial is evaluated using PPVAL. These values are compared to the corresponding
values of f.
      USE PPVAL_INT
      USE BSNAK_INT
      USE BSCPP_INT
      USE BSINT_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    KORDER, NCOEF, NDATA, NKNOT
      PARAMETER  (KORDER=4, NCOEF=20, NDATA=20, NKNOT=NDATA+KORDER)
!
      INTEGER    I, NOUT, NPPCF
      REAL       BREAK(NCOEF), BSCOEF(NCOEF), EXP, F, FDATA(NDATA),&
                 FLOAT, PPCOEF(KORDER,NCOEF), S, X, XDATA(NDATA),&
                 XKNOT(NKNOT)
      INTRINSIC  EXP, FLOAT
!                                 Define function
      F(X) = X*EXP(X)
!                                 Set up interpolation points
      DO 30 I=1, NDATA
         XDATA(I) = FLOAT(I-1)/FLOAT(NDATA-1)
         FDATA(I) = F(XDATA(I))
   30 CONTINUE
!                                 Generate knot sequence
      CALL BSNAK (NDATA, XDATA, KORDER, XKNOT)
!                                 Compute the B-spline interpolant
      CALL BSINT (NCOEF, XDATA, FDATA, KORDER, XKNOT, BSCOEF)
!                                 Convert to piecewise polynomial
      CALL BSCPP (KORDER, XKNOT, NCOEF, BSCOEF, NPPCF, BREAK, PPCOEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
!                                 Print the interpolant on a uniform
!                                 grid
      DO 40 I=1, NDATA
         X = FLOAT(I-1)/FLOAT(NDATA-1)
!                                 Compute value of the piecewise
!                                 polynomial
         S = PPVAL(X,BREAK,PPCOEF)
         WRITE (NOUT,'(2F12.3, E14.3)') X, S, F(X) - S
   40 CONTINUE
99999 FORMAT (11X, 'X', 8X, 'S(X)', 7X, 'Error')
      END
Output
           X        S(X)       Error
       0.000       0.000     0.000E+00
       0.053       0.055    -0.745E-08
       0.105       0.117     0.000E+00
       0.158       0.185     0.000E+00
       0.211       0.260    -0.298E-07
       0.263       0.342     0.298E-07
       0.316       0.433     0.000E+00
       0.368       0.533     0.000E+00
       0.421       0.642     0.000E+00
       0.474       0.761     0.596E-07
       0.526       0.891     0.000E+00
       0.579       1.033     0.000E+00
       0.632       1.188     0.000E+00
       0.684       1.356     0.000E+00
       0.737       1.540    -0.119E-06
       0.789       1.739     0.000E+00
       0.842       1.955     0.000E+00
       0.895       2.189     0.238E-06
       0.947       2.443     0.238E-06
       1.000       2.718     0.238E-06
PPDER
This function evaluates the derivative of a piecewise polynomial.
Required Arguments
X Point at which the polynomial is to be evaluated. (Input)
BREAK Array of length NINTV + 1 containing the breakpoints of the piecewise
polynomial representation. (Input)
BREAK must be strictly increasing.
PPCOEF Array of size KORDER * NINTV containing the local coefficients of the piecewise
polynomial pieces. (Input)
PPCOEF is treated internally as a matrix of size KORDER by NINTV.
Optional Arguments
IDERIV Order of the derivative to be evaluated. (Input)
In particular, IDERIV = 0 returns the value of the polynomial.
Default: IDERIV = 1.
KORDER Order of the polynomial. (Input)
Default: KORDER = size (PPCOEF,1).
NINTV Number of polynomial pieces. (Input)
Default: NINTV = size (PPCOEF,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine PPDER evaluates the derivative of a piecewise polynomial function f at a given point.
This routine is based on the subroutine PPVALU by de Boor (1978, page 89). In particular, if the
breakpoint sequence is stored in $\xi$ (a vector of length N = NINTV + 1), and if the coefficients of the
piecewise polynomial representation are stored in c, then the value of the j-th derivative of f at x
in $[\xi_i, \xi_{i+1})$ is
$$f^{(j)}(x) = \sum_{m=j}^{k-1} c_{m+1,i}\, \frac{(x - \xi_i)^{m-j}}{(m-j)!}$$
when j = 0 to k - 1 and zero otherwise. Notice that this representation forces the function to be
right continuous. If x is less than $\xi_1$, then i is set to 1 in the above formula; if x is greater than or
equal to $\xi_N$, then i is set to N - 1. This has the effect of extending the piecewise polynomial
representation to the real axis by extrapolation of the first and last pieces.
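The formula above translates almost line for line into code. The following self-contained sketch is an illustration only (pp_deriv is a hypothetical name, not an IMSL routine, and the breakpoints and coefficients are made-up data for f(x) = x**2). It selects the piece i with the clamping rule described in the text and accumulates the sum, keeping the factor (x - xi_i)**(m-j)/(m-j)! in a running term.

```fortran
program pp_derivative_sketch
  implicit none
  integer, parameter :: k = 3, nintv = 2
  real :: brk(nintv+1), c(k,nintv), v

  ! Hypothetical data: f(x) = x**2 on [0,2], split at x = 1.
  ! Setting x = xi_i in the formula shows that column i holds
  ! f, f', f'' at the left breakpoint of piece i.
  brk = (/ 0.0, 1.0, 2.0 /)
  c(:,1) = (/ 0.0, 0.0, 2.0 /)
  c(:,2) = (/ 1.0, 2.0, 2.0 /)

  v = pp_deriv(0, 1.5, brk, c)        ! f(1.5)  = 2.25
  if (abs(v - 2.25) > 1.0e-6) error stop 'value check failed'
  v = pp_deriv(1, 1.5, brk, c)        ! f'(1.5) = 3
  if (abs(v - 3.0) > 1.0e-6) error stop 'derivative check failed'
  v = pp_deriv(1, 3.0, brk, c)        ! beyond last breakpoint: last piece, f' = 6
  if (abs(v - 6.0) > 1.0e-6) error stop 'extrapolation check failed'
  v = pp_deriv(3, 1.5, brk, c)        ! j > k-1 gives zero
  if (v /= 0.0) error stop 'high-order check failed'
  print *, 'all checks passed'

contains

  function pp_deriv(j, x, brk, c) result(val)
    integer, intent(in) :: j
    real, intent(in) :: x, brk(:), c(:,:)
    real :: val, h, term
    integer :: i, m, k, n

    k = size(c, 1)
    n = size(brk)
    val = 0.0
    if (j < 0 .or. j > k - 1) return

    ! Find i with brk(i) <= x < brk(i+1); clamp to 1 and n-1 at the ends
    i = 1
    do while (i < n - 1)
      if (x < brk(i+1)) exit
      i = i + 1
    end do

    h = x - brk(i)
    term = 1.0                         ! h**(m-j)/(m-j)! for m = j
    do m = j, k - 1
      val = val + c(m+1,i)*term
      term = term*h/real(m - j + 1)
    end do
  end function pp_deriv
end program pp_derivative_sketch
```

The linear search is chosen for brevity; a binary search over the breakpoints would be a natural alternative when NINTV is large.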
Example
In this example, a spline interpolant to a function f is computed using the IMSL routine BSINT.
This routine represents the interpolant as a linear combination of B-splines. This representation is
then converted to piecewise polynomial representation by calling the IMSL routine BSCPP. The
piecewise polynomial's zero-th and first derivatives are evaluated using PPDER. These values are
compared to the corresponding values of f.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
      INTEGER    KORDER, NCOEF, NDATA, NKNOT
      PARAMETER  (KORDER=4, NCOEF=20, NDATA=20, NKNOT=NDATA+KORDER)
!
      INTEGER    I, NOUT, NPPCF
      REAL       BREAK(NCOEF), BSCOEF(NCOEF), DF, DS, EXP, F,&
                 FDATA(NDATA), FLOAT, PPCOEF(KORDER,NCOEF), S,&
                 X, XDATA(NDATA), XKNOT(NKNOT)
      INTRINSIC  EXP, FLOAT
!
      F(X)  = X*EXP(X)
      DF(X) = (X+1.)*EXP(X)
!
Output
       X        S(X)       Error       S'(X)       Error
   0.000       0.000    0.000000       1.000   -0.000112
   0.053       0.055    0.000000       1.109    0.000030
   0.105       0.117    0.000000       1.228   -0.000008
   0.158       0.185    0.000000       1.356    0.000002
   0.211       0.260    0.000000       1.494    0.000000
   0.263       0.342    0.000000       1.643    0.000000
   0.316       0.433    0.000000       1.804   -0.000001
   0.368       0.533    0.000000       1.978    0.000002
   0.421       0.642    0.000000       2.165    0.000001
   0.474       0.761    0.000000       2.367    0.000000
   0.526       0.891    0.000000       2.584   -0.000001
   0.579       1.033    0.000000       2.817    0.000001
   0.632       1.188    0.000000       3.068    0.000001
   0.684       1.356    0.000000       3.338    0.000001
   0.737       1.540    0.000000       3.629    0.000001
   0.789       1.739    0.000000       3.941    0.000000
   0.842       1.955    0.000000       4.276   -0.000006
   0.895       2.189    0.000000       4.636    0.000024
   0.947       2.443    0.000000       5.022   -0.000090
   1.000       2.718    0.000000       5.436    0.000341
PP1GD
Evaluates the derivative of a piecewise polynomial on a grid.
Required Arguments
XVEC Array of length N containing the points at which the piecewise polynomial is to be
evaluated. (Input)
The points in XVEC should be strictly increasing.
BREAK Array of length NINTV + 1 containing the breakpoints for the piecewise
polynomial representation. (Input)
BREAK must be strictly increasing.
PPCOEF Matrix of size KORDER by NINTV containing the local coefficients of the
polynomial pieces. (Input)
VALUE Array of length N containing the values of the IDERIV-th derivative of the
piecewise polynomial at the points in XVEC. (Output)
Optional Arguments
IDERIV Order of the derivative to be evaluated. (Input)
In particular, IDERIV = 0 returns the values of the piecewise polynomial.
Default: IDERIV = 1.
N Length of vector XVEC. (Input)
Default: N = size (XVEC,1).
KORDER Order of the polynomial. (Input)
Default: KORDER = size (PPCOEF,1).
NINTV Number of polynomial pieces. (Input)
Default: NINTV = size (PPCOEF,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine PP1GD evaluates a piecewise polynomial function f (or its derivative) at a vector of
points. That is, given a vector x of length n satisfying $x_i < x_{i+1}$ for $i = 1, \ldots, n - 1$, a derivative
value j, and a piecewise polynomial function f that is represented by a breakpoint sequence and
coefficient matrix, this routine returns the values
$f^{(j)}(x_i), \quad i = 1, \ldots, n$
in the array VALUE. The functionality of this routine is the same as that of PPDER called in a loop;
however, PP1GD is much more efficient.
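The following is a minimal, self-contained sketch of one way such a grid evaluation can be organized. It is not the library's implementation. It assumes the coefficient convention of de Boor's PPVALU, in which column l of the coefficient matrix holds the value and successive derivatives of piece l at its left breakpoint; the names `brk`, `vals`, and `left` are illustrative only. Because the evaluation points are increasing, the interval index only ever moves forward, which is one plausible reason a grid routine can be cheaper than repeated single-point calls.

```fortran
program pp_grid_sketch
  implicit none
  integer, parameter :: korder = 3, nintv = 2, n = 5
  real :: brk(nintv+1), ppcoef(korder,nintv), xvec(n), vals(n)
  integer :: i, left, m, ideriv
  real :: h, v

  ! Piecewise polynomial equal to x**2 on [0,1] and on [1,2].
  ! ASSUMPTION: ppcoef(m,l) is the (m-1)-th derivative of piece l
  ! at its left breakpoint brk(l).
  brk = (/ 0.0, 1.0, 2.0 /)
  ppcoef(:,1) = (/ 0.0, 0.0, 2.0 /)
  ppcoef(:,2) = (/ 1.0, 2.0, 2.0 /)
  xvec = (/ 0.0, 0.5, 1.0, 1.5, 2.0 /)
  ideriv = 1

  left = 1
  do i = 1, n
     ! xvec is increasing, so the interval index never moves backward.
     do while (left < nintv)
        if (xvec(i) < brk(left+1)) exit
        left = left + 1
     end do
     h = xvec(i) - brk(left)
     ! Nested evaluation of the local Taylor series of the derivative.
     v = 0.0
     do m = korder, ideriv + 1, -1
        v = v*h/real(m - ideriv) + ppcoef(m,left)
     end do
     vals(i) = v
  end do

  do i = 1, n
     print '(2f10.4)', xvec(i), vals(i)
  end do
end program pp_grid_sketch
```

With `ideriv = 1` the second column should reproduce the derivative 2x of the test function; setting `ideriv = 0` gives the function values instead.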
Comments
1.
2. Informational error
   Type  Code
   4     4     The points in XVEC must be strictly increasing.
Example
To illustrate the use of PP1GD, we modify the example program for PPDER. In this example, a
piecewise polynomial interpolant to F is computed. The values of this polynomial are then
compared with the exact function values. The routine PP1GD is based on the routine PPVALU in de
Boor (1978, page 89).
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    KORDER, N, NCOEF, NDATA, NKNOT
      PARAMETER  (KORDER=4, N=20, NCOEF=20, NDATA=20,&
                 NKNOT=NDATA+KORDER)
!
      INTEGER    I, NINTV, NOUT, NPPCF
      REAL       BREAK(NCOEF), BSCOEF(NCOEF), DF, EXP, F,&
                 FDATA(NDATA), FLOAT, PPCOEF(KORDER,NCOEF),&
                 VALUE1(N), VALUE2(N), X, XDATA(NDATA),&
                 XKNOT(NKNOT), XVEC(N)
      INTRINSIC  EXP, FLOAT
!
      F(X)  = X*EXP(X)
      DF(X) = (X+1.)*EXP(X)
!
      DO 10  I=1, NDATA
         XDATA(I) = FLOAT(I-1)/FLOAT(NDATA-1)
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                  Generate knot sequence
      CALL BSNAK (NDATA, XDATA, KORDER, XKNOT)
!                                  Compute the B-spline interpolant
      CALL BSINT (NCOEF, XDATA, FDATA, KORDER, XKNOT, BSCOEF)
!                                  Convert to piecewise polynomial
      CALL BSCPP (KORDER, XKNOT, NCOEF, BSCOEF, NPPCF, BREAK, PPCOEF)
!                                  Compute evaluation points
      DO 20  I=1, N
         XVEC(I) = FLOAT(I-1)/FLOAT(N-1)
   20 CONTINUE
!                                  Compute values of the piecewise
!                                  polynomial
      NINTV = NPPCF
      CALL PP1GD (XVEC, BREAK, PPCOEF, VALUE1, IDERIV=0, NINTV=NINTV)
!                                  Compute the values of the first
!                                  derivative of the piecewise
!                                  polynomial
      CALL PP1GD (XVEC, BREAK, PPCOEF, VALUE2, IDERIV=1, NINTV=NINTV)
!                                  Get output unit number
      CALL UMACH (2, NOUT)
!                                  Write heading
      WRITE (NOUT,99998)
!                                  Print the results on a uniform
!                                  grid
      DO 30  I=1, N
         WRITE (NOUT,99999) XVEC(I), VALUE1(I), F(XVEC(I)) - VALUE1(I)&
                           , VALUE2(I), DF(XVEC(I)) - VALUE2(I)
   30 CONTINUE
99998 FORMAT (11X, 'X', 8X, 'S(X)', 7X, 'Error', 7X, 'S''(X)', 7X,&
             'Error')
99999 FORMAT (' ', 2F12.3, F12.6, F12.3, F12.6)
      END
Output
           X        S(X)       Error       S'(X)       Error
        0.000       0.000    0.000000       1.000   -0.000112
        0.053       0.055    0.000000       1.109    0.000030
        0.105       0.117    0.000000       1.228   -0.000008
        0.158       0.185    0.000000       1.356    0.000002
        0.211       0.260    0.000000       1.494    0.000000
        0.263       0.342    0.000000       1.643    0.000000
        0.316       0.433    0.000000       1.804   -0.000001
        0.368       0.533    0.000000       1.978    0.000002
        0.421       0.642    0.000000       2.165    0.000001
        0.474       0.761    0.000000       2.367    0.000000
        0.526       0.891    0.000000       2.584   -0.000001
        0.579       1.033    0.000000       2.817    0.000001
        0.632       1.188    0.000000       3.068    0.000001
        0.684       1.356    0.000000       3.338    0.000001
        0.737       1.540    0.000000       3.629    0.000001
        0.789       1.739    0.000000       3.941    0.000000
        0.842       1.955    0.000000       4.276   -0.000006
        0.895       2.189    0.000000       4.636    0.000024
        0.947       2.443    0.000000       5.022   -0.000090
        1.000       2.718    0.000000       5.436    0.000341
PPITG
This function evaluates the integral of a piecewise polynomial.
Required Arguments
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
BREAK Array of length NINTV + 1 containing the breakpoints for the piecewise
polynomial. (Input)
BREAK must be strictly increasing.
PPCOEF Array of size KORDER * NINTV containing the local coefficients of the piecewise
polynomial pieces. (Input)
PPCOEF is treated internally as a matrix of size KORDER by NINTV.
Optional Arguments
KORDER Order of the polynomial. (Input)
Default: KORDER = size (PPCOEF,1).
NINTV Number of piecewise polynomial pieces. (Input)
Default: NINTV = size (PPCOEF,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine PPITG evaluates the integral of a piecewise polynomial over an interval.
Example
In this example, we compute a quadratic spline interpolant to the function $x^2$ using the IMSL
routine BSINT. We then evaluate the integral of the spline interpolant over the intervals [0, 1/2]
and [0, 2]. The interpolant reproduces $x^2$, and hence, the values of the integrals are 1/24 and 8/3,
respectively.
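For reference, the two exact values follow directly from the antiderivative $x^3/3$ (the statement function FI in the program below); this short derivation only restates the arithmetic:

```latex
\int_0^{1/2} x^2 \, dx = \left[ \tfrac{x^3}{3} \right]_0^{1/2}
                       = \tfrac{1}{24} \approx 0.04167,
\qquad
\int_0^{2} x^2 \, dx = \left[ \tfrac{x^3}{3} \right]_0^{2}
                     = \tfrac{8}{3} \approx 2.66667 .
```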
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    KORDER, NDATA, NKNOT
      PARAMETER  (KORDER=3, NDATA=10, NKNOT=NDATA+KORDER)
!
      INTEGER    I, NOUT, NPPCF
      REAL       A, B, BREAK(NDATA), BSCOEF(NDATA), EXACT, F,&
                 FDATA(NDATA), FI, FLOAT, PPCOEF(KORDER,NDATA),&
                 VALUE, X, XDATA(NDATA), XKNOT(NKNOT)
      INTRINSIC  FLOAT
!
      F(X)  = X*X
      FI(X) = X*X*X/3.0
!
!
!
!
!
!
!
!
!
!
99999 FORMAT (' On the closed interval (', F3.1, ',', F3.1,&
             ') we have :', /, 1X, 'Computed Integral = ', F10.5, /,&
             1X, 'Exact Integral    = ', F10.5, /,&
             1X, 'Error             = ', F10.6, /, /)
!
      END
Output
 On the closed interval (0.0,0.5) we have :
 Computed Integral =    0.04167
 Exact Integral    =    0.04167
 Error             =   0.000000

 On the closed interval (0.0,2.0) we have :
 Computed Integral =    2.66667
 Exact Integral    =    2.66667
 Error             =   0.000001
QDVAL
This function evaluates a function defined on a set of points using quadratic interpolation.
Required Arguments
X Coordinate of the point at which the function is to be evaluated. (Input)
XDATA Array of length NDATA containing the location of the data points. (Input)
XDATA must be strictly increasing.
FDATA Array of length NDATA containing the function values. (Input)
FDATA(I) is the value of the function at XDATA(I).
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least 3.
Default: NDATA = size (XDATA,1).
CHECK Logical variable that is .TRUE. if checking of XDATA is required or .FALSE. if
checking is not required. (Input)
Default: CHECK = .TRUE.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function QDVAL interpolates a table of values, using quadratic polynomials, returning an
approximation to the tabulated function. Let $(x_i, f_i)$ for $i = 1, \ldots, n$ be the tabular data. Given a
number x at which an interpolated value is desired, we first find the nearest interior grid point $x_i$. A
quadratic interpolant q is then formed using the three points $(x_{i-1}, f_{i-1})$, $(x_i, f_i)$, and $(x_{i+1}, f_{i+1})$. The
number returned by QDVAL is q(x).
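The procedure just described can be sketched without the library as follows. This is an illustration under stated assumptions, not the QDVAL source: the quadratic is written in Lagrange form (the text does not say which form the routine uses), and the program and variable names are hypothetical. The table is the same 33-point sine table used in the example for this routine.

```fortran
program quad_interp_sketch
  implicit none
  integer, parameter :: ndata = 33
  real :: xdata(ndata), fdata(ndata), x, q
  integer :: i, k

  ! sin(x) tabulated at 33 equally spaced points on [0,1].
  do i = 1, ndata
     xdata(i) = real(i - 1)/32.0
     fdata(i) = sin(xdata(i))
  end do
  x = atan(1.0)          ! pi/4

  ! Nearest interior grid point: index k with 2 <= k <= ndata-1.
  k = 2
  do i = 3, ndata - 1
     if (abs(x - xdata(i)) < abs(x - xdata(k))) k = i
  end do

  ! Lagrange form of the quadratic through points k-1, k, k+1.
  q = fdata(k-1)*(x - xdata(k))*(x - xdata(k+1)) / &
        ((xdata(k-1) - xdata(k))*(xdata(k-1) - xdata(k+1))) &
    + fdata(k)*(x - xdata(k-1))*(x - xdata(k+1)) / &
        ((xdata(k) - xdata(k-1))*(xdata(k) - xdata(k+1))) &
    + fdata(k+1)*(x - xdata(k-1))*(x - xdata(k)) / &
        ((xdata(k+1) - xdata(k-1))*(xdata(k+1) - xdata(k)))

  print '(a,f8.5,a,f8.5)', ' q(x) = ', q, '   sin(x) = ', sin(x)
end program quad_interp_sketch
```

Restricting the search to interior indices guarantees that both neighbors k-1 and k+1 exist, which is why at least three data points are required.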
Comments
Informational error
Type  Code
4     3     The XDATA values must be strictly increasing.
Example
In this example, the value of sin x is approximated at π/4 by using QDVAL on a table of 33 equally
spaced values.
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=33)
!
      INTEGER    I, NOUT
      REAL       F, FDATA(NDATA), H, PI, QT, SIN, X,&
                 XDATA(NDATA)
      INTRINSIC  SIN
!                                  Define function
      F(X) = SIN(X)
!                                  Generate data points
      XDATA(1) = 0.0
      FDATA(1) = F(XDATA(1))
      H        = 1.0/32.0
      DO 10  I=2, NDATA
         XDATA(I) = XDATA(I-1) + H
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                  Get value for PI and set X
      PI = CONST('PI')
      X  = PI/4.0
!                                  Evaluate at PI/4
      QT = QDVAL(X,XDATA,FDATA)
!                                  Get output unit number
      CALL UMACH (2, NOUT)
!                                  Print results
      WRITE (NOUT,99999) X, F(X), QT, (F(X)-QT)
!
99999 FORMAT (15X, 'X', 6X, 'F(X)', 6X, 'QDVAL', 5X, 'ERROR', //, 6X,&
             4F10.3, /)
      END
Output
               X      F(X)      QDVAL     ERROR

           0.785     0.707     0.707     0.000
QDDER
This function evaluates the derivative of a function defined on a set of points using quadratic
interpolation.
Required Arguments
IDERIV Order of the derivative. (Input)
X Coordinate of the point at which the function is to be evaluated. (Input)
XDATA Array of length NDATA containing the location of the data points. (Input) XDATA
must be strictly increasing.
FDATA Array of length NDATA containing the function values. (Input)
FDATA(I) is the value of the function at XDATA(I).
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least three.
Default: NDATA = size (XDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function QDDER interpolates a table of values, using quadratic polynomials, returning an
approximation to the derivative of the tabulated function. Let $(x_i, f_i)$ for $i = 1, \ldots, n$ be the tabular
data. Given a number x at which an interpolated value is desired, we first find the nearest interior
grid point $x_i$. A quadratic interpolant q is then formed using the three points $(x_{i-1}, f_{i-1})$,
$(x_i, f_i)$, and $(x_{i+1}, f_{i+1})$. The number returned by QDDER is $q^{(j)}(x)$, where j = IDERIV.
Comments
1. Informational error
   Type  Code
   4     3     The XDATA values must be strictly increasing.
2. Because quadratic interpolation is used, if the order of the derivative is greater than two,
   then the returned value is zero.
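To see why derivatives of order three and higher vanish, it helps to write the interpolant explicitly. The Newton (divided-difference) form below is one convenient choice and is an assumption of this note; the symbols $d_1$ and $d_2$ are introduced here for illustration and do not appear in the routine's interface.

```latex
\begin{align*}
d_1 &= \frac{f_i - f_{i-1}}{x_i - x_{i-1}}, \qquad
d_2 = \frac{1}{x_{i+1} - x_{i-1}}
      \left( \frac{f_{i+1} - f_i}{x_{i+1} - x_i} - d_1 \right), \\
q(x)   &= f_{i-1} + d_1 (x - x_{i-1}) + d_2 (x - x_{i-1})(x - x_i), \\
q'(x)  &= d_1 + d_2 \, (2x - x_{i-1} - x_i), \\
q''(x) &= 2 d_2, \\
q^{(j)}(x) &= 0 \quad \text{for } j \ge 3 .
\end{align*}
```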
Example
In this example, the value of sin x and its derivatives are approximated at π/4 by using QDDER on a
table of 33 equally spaced values.
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=33)
!
      INTEGER    I, IDERIV, NOUT
      REAL       COS, F, F1, F2, FDATA(NDATA), H, PI,&
                 QT, SIN, X, XDATA(NDATA)
      LOGICAL    CHECK
      INTRINSIC  COS, SIN
      XDATA(1) = 0.0
      FDATA(1) = F(XDATA(1))
      H        = 1.0/32.0
      DO 10  I=2, NDATA
         XDATA(I) = XDATA(I-1) + H
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                  Check XDATA
      CHECK = .TRUE.
!                                  Write heading
      WRITE (NOUT,99998)
!
99998 FORMAT (33X, 'IDER', /, 15X, 'X', 6X, 'IDER', 6X, 'F    (X)',&
             5X, 'QDDER', 6X, 'ERROR', //)
99999 FORMAT (7X, F10.3, I8, 3F12.3/)
      END
Output
                                 IDER
               X      IDER      F    (X)     QDDER      ERROR

            0.785       0       0.707       0.707       0.000

            0.785       1       0.707       0.707       0.000

            0.785       2      -0.707      -0.704      -0.003
QD2VL
This function evaluates a function defined on a rectangular grid using quadratic interpolation.
Required Arguments
X x-coordinate of the point at which the function is to be evaluated. (Input)
Y y-coordinate of the point at which the function is to be evaluated. (Input)
XDATA Array of length NXDATA containing the location of the data points in the
x-direction. (Input)
XDATA must be increasing.
YDATA Array of length NYDATA containing the location of the data points in the
y-direction. (Input)
YDATA must be increasing.
FDATA Array of size NXDATA by NYDATA containing function values. (Input)
FDATA (I, J) is the value of the function at (XDATA (I), YDATA(J)).
Optional Arguments
NXDATA Number of data points in the x-direction. (Input)
NXDATA must be at least three.
Default: NXDATA = size (XDATA,1).
NYDATA Number of data points in the y-direction. (Input)
NYDATA must be at least three.
Default: NYDATA = size (YDATA,1).
LDF Leading dimension of FDATA exactly as specified in the dimension statement of the
calling program. (Input)
LDF must be at least as large as NXDATA.
Default: LDF = size (FDATA,1).
CHECK Logical variable that is .TRUE. if checking of XDATA and YDATA is required or
.FALSE. if checking is not required. (Input)
Default: CHECK = .TRUE.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function QD2VL interpolates a table of values, using quadratic polynomials, returning an
approximation to the tabulated function. Let $(x_i, y_j, f_{ij})$ for $i = 1, \ldots, n_x$ and $j = 1, \ldots, n_y$ be the
tabular data. Given a point (x, y) at which an interpolated value is desired, we first find the nearest
interior grid point $(x_i, y_j)$. A bivariate quadratic interpolant q is then formed using six points near
(x, y). Five of the six points are $(x_i, y_j)$, $(x_{i \pm 1}, y_j)$, and $(x_i, y_{j \pm 1})$. The sixth point is the nearest point
to (x, y) of the grid points $(x_{i \pm 1}, y_{j \pm 1})$. The value q(x, y) is returned by QD2VL.
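A six-point bivariate quadratic of this kind can be sketched as below. This is an illustrative construction, not the QD2VL source: it assumes the interpolant lies in the span of 1, x, y, x², y², and xy, so that it is the sum of the two one-dimensional quadratics through the center, minus the center value, plus a cross term fixed by the diagonal point. The helper `quad1` and all variable names are hypothetical.

```fortran
program quad2d_sketch
  implicit none
  integer, parameter :: nx = 5, ny = 6
  real :: xd(nx), yd(ny), fd(nx,ny), x, y, q, g
  integer :: i, j, ic, jc, is, js

  do i = 1, nx
     xd(i) = real(i - 1)/real(nx - 1)
  end do
  do j = 1, ny
     yd(j) = real(j - 1)/real(ny - 1)
  end do
  do j = 1, ny
     do i = 1, nx
        fd(i,j) = sin(xd(i) + yd(j))
     end do
  end do
  x = 0.4
  y = 0.65

  ! Nearest interior grid point (ic, jc).
  ic = 2
  do i = 3, nx - 1
     if (abs(x - xd(i)) < abs(x - xd(ic))) ic = i
  end do
  jc = 2
  do j = 3, ny - 1
     if (abs(y - yd(j)) < abs(y - yd(jc))) jc = j
  end do

  ! Sixth point: the diagonal neighbor nearest to (x, y). The squared
  ! distance is separable, so each coordinate is chosen independently.
  is = ic + 1
  if (abs(x - xd(ic-1)) < abs(x - xd(ic+1))) is = ic - 1
  js = jc + 1
  if (abs(y - yd(jc-1)) < abs(y - yd(jc+1))) js = jc - 1

  ! Cross-term coefficient fixed by the diagonal neighbor.
  g = (fd(is,js) - fd(is,jc) - fd(ic,js) + fd(ic,jc)) / &
      ((xd(is) - xd(ic))*(yd(js) - yd(jc)))

  q = quad1(xd(ic-1), xd(ic), xd(ic+1), &
            fd(ic-1,jc), fd(ic,jc), fd(ic+1,jc), x) &
    + quad1(yd(jc-1), yd(jc), yd(jc+1), &
            fd(ic,jc-1), fd(ic,jc), fd(ic,jc+1), y) &
    - fd(ic,jc) + g*(x - xd(ic))*(y - yd(jc))

  print '(a,f8.5,a,f8.5)', ' q(x,y) = ', q, '   sin(x+y) = ', sin(x + y)

contains

  ! One-dimensional quadratic through (a,fa), (b,fb), (c,fc) at t.
  real function quad1(a, b, c, fa, fb, fc, t)
    real, intent(in) :: a, b, c, fa, fb, fc, t
    quad1 = fa*(t - b)*(t - c)/((a - b)*(a - c)) &
          + fb*(t - a)*(t - c)/((b - a)*(b - c)) &
          + fc*(t - a)*(t - b)/((c - a)*(c - b))
  end function quad1
end program quad2d_sketch
```

By construction this q matches the table at all six points: the cross term vanishes on both grid lines through the center, and g is chosen so that the diagonal value is reproduced.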
Comments
Informational errors
Type  Code
4     6     The XDATA values must be strictly increasing.
4     7     The YDATA values must be strictly increasing.
Example
In this example, the value of sin(x + y) at x = y = π/4 is approximated by using QD2VL on a table of
size 21 × 42 equally spaced values on the unit square.
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    LDF, NXDATA, NYDATA
      PARAMETER  (NXDATA=21, NYDATA=42, LDF=NXDATA)
!
      INTEGER    I, J, NOUT
      REAL       F, FDATA(LDF,NYDATA), FLOAT, PI, Q, &
                 SIN, X, XDATA(NXDATA), Y, YDATA(NYDATA)
      INTRINSIC  FLOAT, SIN
!                                  Define function
      F(X,Y) = SIN(X+Y)
!                                  Set up X-grid
      DO 10  I=1, NXDATA
         XDATA(I) = FLOAT(I-1)/FLOAT(NXDATA-1)
   10 CONTINUE
!                                  Set up Y-grid
      DO 20  I=1, NYDATA
         YDATA(I) = FLOAT(I-1)/FLOAT(NYDATA-1)
   20 CONTINUE
!                                  Evaluate function on grid
      DO 30  I=1, NXDATA
         DO 30  J=1, NYDATA
            FDATA(I,J) = F(XDATA(I),YDATA(J))
   30 CONTINUE
!                                  Write heading
      WRITE (NOUT,99999)
!                                  Get value for PI and set X and Y
      PI = CONST('PI')
      X  = PI/4.0
      Y  = PI/4.0
!
Output
         X          Y       F(X,Y)      QD2VL        DIF
    0.7854     0.7854       1.0000     1.0000     0.0000
QD2DR
This function evaluates the derivative of a function defined on a rectangular grid using quadratic
interpolation.
Required Arguments
IXDER Order of the x-derivative. (Input)
IYDER Order of the y-derivative. (Input)
X X-coordinate of the point at which the function is to be evaluated. (Input)
Y Y-coordinate of the point at which the function is to be evaluated. (Input)
XDATA Array of length NXDATA containing the location of the data points in the
x-direction. (Input)
XDATA must be increasing.
YDATA Array of length NYDATA containing the location of the data points in the
y-direction. (Input)
YDATA must be increasing.
Optional Arguments
NXDATA Number of data points in the x-direction. (Input)
NXDATA must be at least three.
Default: NXDATA = size (XDATA,1).
NYDATA Number of data points in the y-direction. (Input)
NYDATA must be at least three.
Default: NYDATA = size (YDATA,1).
LDF Leading dimension of FDATA exactly as specified in the dimension statement of the
calling program. (Input)
LDF must be at least as large as NXDATA.
Default: LDF = size (FDATA,1).
CHECK Logical variable that is .TRUE. if checking of XDATA and YDATA is required or
.FALSE. if checking is not required. (Input)
Default: CHECK = .TRUE.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function QD2DR interpolates a table of values, using quadratic polynomials, returning an
approximation to the partial derivatives of the tabulated function. Let $(x_i, y_j, f_{ij})$ for $i = 1, \ldots, n_x$
and $j = 1, \ldots, n_y$ be the tabular data. Given a point (x, y) at which an interpolated value is desired,
we first find the nearest interior grid point $(x_i, y_j)$. A bivariate quadratic interpolant q is then formed
using six points near (x, y). Five of the six points are $(x_i, y_j)$, $(x_{i \pm 1}, y_j)$, and $(x_i, y_{j \pm 1})$. The sixth
point is the nearest point to (x, y) of the grid points $(x_{i \pm 1}, y_{j \pm 1})$. The value $q^{(p,r)}(x, y)$ is returned
by QD2DR, where p = IXDER and r = IYDER.
Comments
1. Informational errors
   Type  Code
   4     6     The XDATA values must be strictly increasing.
   4     7     The YDATA values must be strictly increasing.
2. Because quadratic interpolation is used, if the order of any derivative is greater than
   two, then the returned value is zero.
Example
In this example, the partial derivatives of sin(x + y) at x = y = π/3 are approximated by using
QD2DR on a table of size 21 × 42 equally spaced values on the rectangle [0, 2] × [0, 2].
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    LDF, NXDATA, NYDATA
      PARAMETER  (NXDATA=21, NYDATA=42, LDF=NXDATA)
INTEGER
REAL
!
!
!
!
!
!
!
!
!
      DO 40  IXDER=0, 1
         DO 40  IYDER=0, 1
            Q  = QD2DR(IXDER,IYDER,X,Y,XDATA,YDATA,FDATA)
            FU = FUNC(IXDER,IYDER,X,Y)
            WRITE (NOUT,99999) X, Y, IXDER, IYDER, FU, Q, (FU-Q)
   40 CONTINUE
!
99998 FORMAT (32X, '(IDX,IDY)', /, 8X, 'X', 8X, 'Y', 3X, 'IDX', 2X,&
             'IDY', 3X, 'F         (X,Y)', 3X, 'QD2DR', 6X, 'ERROR')
99999 FORMAT (2F9.4, 2I5, 3X, F9.4, 2X, 2F11.4)
      END
      REAL FUNCTION FUNC (IX, IY, X, Y)
      INTEGER    IX, IY
      REAL       X, Y
!
      REAL       COS, SIN
      INTRINSIC  COS, SIN
!
      IF (IX.EQ.0 .AND. IY.EQ.0) THEN
!                                  Define (0,0) derivative
         FUNC = SIN(X+Y)
      ELSE IF (IX.EQ.0 .AND. IY.EQ.1) THEN
!                                  Define (0,1) derivative
         FUNC = COS(X+Y)
      ELSE IF (IX.EQ.1 .AND. IY.EQ.0) THEN
!                                  Define (1,0) derivative
         FUNC = COS(X+Y)
      ELSE IF (IX.EQ.1 .AND. IY.EQ.1) THEN
!                                  Define (1,1) derivative
         FUNC = -SIN(X+Y)
      ELSE
         FUNC = 0.0
      END IF
      RETURN
      END
Output
                                (IDX,IDY)
        X        Y   IDX  IDY   F         (X,Y)   QD2DR      ERROR
   1.0472   1.0472    0    0      0.8660       0.8661    -0.0001
   1.0472   1.0472    0    1     -0.5000      -0.4993    -0.0007
   1.0472   1.0472    1    0     -0.5000      -0.4995    -0.0005
   1.0472   1.0472    1    1     -0.8660      -0.8634    -0.0026
QD3VL
This function evaluates a function defined on a rectangular three-dimensional grid using quadratic
interpolation.
Required Arguments
X x-coordinate of the point at which the function is to be evaluated. (Input)
Y y-coordinate of the point at which the function is to be evaluated. (Input)
Z z-coordinate of the point at which the function is to be evaluated. (Input)
XDATA Array of length NXDATA containing the location of the data points in the
x-direction. (Input)
XDATA must be increasing.
YDATA Array of length NYDATA containing the location of the data points in the
y-direction. (Input)
YDATA must be increasing.
ZDATA Array of length NZDATA containing the location of the data points in the
z-direction. (Input)
ZDATA must be increasing.
FDATA Array of size NXDATA by NYDATA by NZDATA containing function values. (Input)
FDATA(I, J, K) is the value of the function at (XDATA(I), YDATA(J), ZDATA(K)).
Optional Arguments
NXDATA Number of data points in the x-direction. (Input)
NXDATA must be at least three.
Default: NXDATA = size (XDATA,1).
NYDATA Number of data points in the y-direction. (Input)
NYDATA must be at least three.
Default: NYDATA = size (YDATA,1).
NZDATA Number of data points in the z-direction. (Input)
NZDATA must be at least three.
Default: NZDATA = size (ZDATA,1).
LDF Leading dimension of FDATA exactly as specified in the dimension statement of the
calling program. (Input)
LDF must be at least as large as NXDATA.
Default: LDF = size (FDATA,1).
MDF Middle (second) dimension of FDATA exactly as specified in the dimension
statement of the calling program. (Input)
MDF must be at least as large as NYDATA.
Default: MDF = size (FDATA,2).
CHECK Logical variable that is .TRUE. if checking of XDATA, YDATA, and ZDATA is
required or .FALSE. if checking is not required. (Input)
Default: CHECK = .TRUE.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function QD3VL interpolates a table of values, using quadratic polynomials, returning an
approximation to the tabulated function. Let $(x_i, y_j, z_k, f_{ijk})$ for $i = 1, \ldots, n_x$, $j = 1, \ldots, n_y$, and
$k = 1, \ldots, n_z$ be the tabular data. Given a point (x, y, z) at which an interpolated value is desired, we
first find the nearest interior grid point $(x_i, y_j, z_k)$. A trivariate quadratic interpolant q is then
formed. Ten points are needed for this purpose. Seven points have the form
$(x_i, y_j, z_k)$, $(x_{i \pm 1}, y_j, z_k)$, $(x_i, y_{j \pm 1}, z_k)$, and $(x_i, y_j, z_{k \pm 1})$
The last three points are drawn from the vertices of the octant containing (x, y, z). There are four of
these vertices remaining, and we choose to exclude the vertex farthest from the center. This has
the slightly deleterious effect of not reproducing the tabular data at the eight exterior corners of the
table. The value q(x, y, z) is returned by QD3VL.
Comments
Informational errors
Type  Code
4     9     The XDATA values must be strictly increasing.
4     10    The YDATA values must be strictly increasing.
4     11    The ZDATA values must be strictly increasing.
Example
In this example, the value of sin(x + y + z) at x = y = z = π/3 is approximated by using QD3VL on a
grid of size 21 × 42 × 18 equally spaced values on the cube $[0, 2]^3$.
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    LDF, MDF, NXDATA, NYDATA, NZDATA
      PARAMETER  (NXDATA=21, NYDATA=42, NZDATA=18, LDF=NXDATA,&
                 MDF=NYDATA)
!
      INTEGER    I, J, K, NOUT
      REAL       F, FDATA(LDF,MDF,NZDATA), FLOAT, PI, Q, &
                 SIN, X, XDATA(NXDATA), Y, YDATA(NYDATA), Z,&
                 ZDATA(NZDATA)
      INTRINSIC  FLOAT, SIN
!                                  Define function
      F(X,Y,Z) = SIN(X+Y+Z)
!                                  Set up X-grid
      DO 10  I=1, NXDATA
         XDATA(I) = 2.0*(FLOAT(I-1)/FLOAT(NXDATA-1))
   10 CONTINUE
!                                  Set up Y-grid
      DO 20  J=1, NYDATA
         YDATA(J) = 2.0*(FLOAT(J-1)/FLOAT(NYDATA-1))
   20 CONTINUE
!                                  Set up Z-grid
      DO 30  K=1, NZDATA
         ZDATA(K) = 2.0*(FLOAT(K-1)/FLOAT(NZDATA-1))
   30 CONTINUE
!                                  Evaluate function on grid
      DO 40  I=1, NXDATA
         DO 40  J=1, NYDATA
            DO 40  K=1, NZDATA
               FDATA(I,J,K) = F(XDATA(I),YDATA(J),ZDATA(K))
   40 CONTINUE
!                                  Get output unit number
      CALL UMACH (2, NOUT)
!                                  Write heading
      WRITE (NOUT,99999)
!                                  Get value for PI and set values
!                                  for X, Y, and Z
      PI = CONST('PI')
      X  = PI/3.0
      Y  = PI/3.0
      Z  = PI/3.0
!                                  Evaluate quadratic at (X,Y,Z)
      Q = QD3VL(X,Y,Z,XDATA,YDATA,ZDATA,FDATA)
!                                  Print results
      WRITE (NOUT,'(6F11.4)') X, Y, Z, F(X,Y,Z), Q, (Q-F(X,Y,Z))
99999 FORMAT (10X, 'X', 10X, 'Y', 10X, 'Z', 5X, 'F(X,Y,Z)', 4X,&
             'QD3VL', 6X, 'ERROR')
      END
Output
          X          Y          Z     F(X,Y,Z)    QD3VL      ERROR
     1.0472     1.0472     1.0472     0.0000     0.0001     0.0001
QD3DR
This function evaluates the derivative of a function defined on a rectangular three-dimensional
grid using quadratic interpolation.
Required Arguments
IXDER Order of the x-derivative. (Input)
IYDER Order of the y-derivative. (Input)
IZDER Order of the z-derivative. (Input)
X x-coordinate of the point at which the function is to be evaluated. (Input)
Y y-coordinate of the point at which the function is to be evaluated. (Input)
Z z-coordinate of the point at which the function is to be evaluated. (Input)
XDATA Array of length NXDATA containing the location of the data points in the
x-direction. (Input)
XDATA must be increasing.
YDATA Array of length NYDATA containing the location of the data points in the
y-direction. (Input)
YDATA must be increasing.
ZDATA Array of length NZDATA containing the location of the data points in the
z-direction. (Input)
ZDATA must be increasing.
FDATA Array of size NXDATA by NYDATA by NZDATA containing function values. (Input)
FDATA(I, J, K) is the value of the function at (XDATA(I), YDATA(J), ZDATA(K)).
Optional Arguments
NXDATA Number of data points in the x-direction. (Input)
NXDATA must be at least three.
Default: NXDATA = size (XDATA,1).
NYDATA Number of data points in the y-direction. (Input)
NYDATA must be at least three.
Default: NYDATA = size (YDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function QD3DR interpolates a table of values, using quadratic polynomials, returning an
approximation to the partial derivatives of the tabulated function. Let
$(x_i, y_j, z_k, f_{ijk})$
for $i = 1, \ldots, n_x$, $j = 1, \ldots, n_y$, and $k = 1, \ldots, n_z$ be the tabular data. Given a point (x, y, z) at which
an interpolated value is desired, we first find the nearest interior grid point $(x_i, y_j, z_k)$. A trivariate
quadratic interpolant q is then formed. Ten points are needed for this purpose. Seven points have
the form
$(x_i, y_j, z_k)$, $(x_{i \pm 1}, y_j, z_k)$, $(x_i, y_{j \pm 1}, z_k)$, and $(x_i, y_j, z_{k \pm 1})$
The last three points are drawn from the vertices of the octant containing (x, y, z). There are four of
these vertices remaining, and we choose to exclude the vertex farthest from the center. This has
the slightly deleterious effect of not reproducing the tabular data at the eight exterior corners of the
table. The value $q^{(p,r,t)}(x, y, z)$ is returned by QD3DR, where p = IXDER, r = IYDER, and t = IZDER.
Comments
1. Informational errors
   Type  Code
   4     9     The XDATA values must be strictly increasing.
   4     10    The YDATA values must be strictly increasing.
   4     11    The ZDATA values must be strictly increasing.
2. Because quadratic interpolation is used, if the order of any derivative is greater than
   two, then the returned value is zero.
Example
In this example, the derivatives of sin(x + y + z) at x = y = z = π/5 are approximated by using
QD3DR on a grid of size 21 × 42 × 18 equally spaced values on the cube $[0, 2]^3$.
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    LDF, MDF, NXDATA, NYDATA, NZDATA
      PARAMETER  (NXDATA=21, NYDATA=42, NZDATA=18, LDF=NXDATA,&
                 MDF=NYDATA)
!
INTEGER
REAL
!
!
10
20
!
30
!
40
!
!                                  Write heading
      WRITE (NOUT,99999)
      PI = CONST('PI')
      X  = PI/5.0
      Y  = PI/5.0
      Z  = PI/5.0
!
!
!
99998 FORMAT (3F7.4, 3I5, 4X, F7.4, 8X, 2F10.4)
99999 FORMAT (39X, '(IDX,IDY,IDZ)', /, 6X, 'X', 6X, 'Y', 6X,&
             'Z', 3X, 'IDX', 2X, 'IDY', 2X, 'IDZ', 2X, 'F             ',&
             '(X,Y,Z)', 3X, 'QD3DR', 5X, 'ERROR')
      END
!
      REAL FUNCTION FUNC (IX, IY, IZ, X, Y, Z)
      INTEGER    IX, IY, IZ
      REAL       X, Y, Z
!
      REAL       COS, SIN
      INTRINSIC  COS, SIN
!
      IF (IX.EQ.0 .AND. IY.EQ.0 .AND. IZ.EQ.0) THEN
!                                  Define (0,0,0) derivative
         FUNC = SIN(X+Y+Z)
      ELSE IF (IX.EQ.0 .AND. IY.EQ.0 .AND. IZ.EQ.1) THEN
!                                  Define (0,0,1) derivative
         FUNC = COS(X+Y+Z)
      ELSE IF (IX.EQ.0 .AND. IY.EQ.1 .AND. IZ.EQ.0) THEN
!                                  Define (0,1,0) derivative
         FUNC = COS(X+Y+Z)
      ELSE IF (IX.EQ.0 .AND. IY.EQ.1 .AND. IZ.EQ.1) THEN
!                                  Define (0,1,1) derivative
         FUNC = -SIN(X+Y+Z)
      ELSE IF (IX.EQ.1 .AND. IY.EQ.0 .AND. IZ.EQ.0) THEN
!                                  Define (1,0,0) derivative
         FUNC = COS(X+Y+Z)
      ELSE IF (IX.EQ.1 .AND. IY.EQ.0 .AND. IZ.EQ.1) THEN
!                                  Define (1,0,1) derivative
         FUNC = -SIN(X+Y+Z)
      ELSE IF (IX.EQ.1 .AND. IY.EQ.1 .AND. IZ.EQ.0) THEN
!                                  Define (1,1,0) derivative
         FUNC = -SIN(X+Y+Z)
      ELSE IF (IX.EQ.1 .AND. IY.EQ.1 .AND. IZ.EQ.1) THEN
!                                  Define (1,1,1) derivative
         FUNC = -COS(X+Y+Z)
      ELSE
         FUNC = 0.0
      END IF
      RETURN
      END
Output
                                       (IDX,IDY,IDZ)
      X      Y      Z   IDX  IDY  IDZ  F             (X,Y,Z)   QD3DR     ERROR
 0.6283 0.6283 0.6283    0    0    0     0.9511            0.9511   -0.0001
 0.6283 0.6283 0.6283    0    0    1    -0.3090           -0.3080   -0.0010
 0.6283 0.6283 0.6283    0    1    0    -0.3090           -0.3088    0.0002
 0.6283 0.6283 0.6283    0    1    1    -0.9511           -0.9587    0.0077
 0.6283 0.6283 0.6283    1    0    0    -0.3090           -0.3078   -0.0012
 0.6283 0.6283 0.6283    1    0    1    -0.9511           -0.9348   -0.0162
 0.6283 0.6283 0.6283    1    1    0    -0.9511           -0.9613    0.0103
 0.6283 0.6283 0.6283    1    1    1     0.3090            0.0000    0.3090
SURF
Computes a smooth bivariate interpolant to scattered data that is locally a quintic polynomial in
two variables.
Required Arguments
XYDATA A 2 by NDATA array containing the coordinates of the interpolation points.
(Input)
These points must be distinct. The x-coordinate of the I-th data point is stored in
XYDATA(1, I) and the y-coordinate of the I-th data point is stored in XYDATA(2, I).
FDATA Array of length NDATA containing the interpolation values. (Input) FDATA(I)
contains the value at (XYDATA(1, I), XYDATA(2, I)).
XOUT Array of length NXOUT containing an increasing sequence of points. (Input)
These points are the x-coordinates of a grid on which the interpolated surface is to be
evaluated.
YOUT Array of length NYOUT containing an increasing sequence of points. (Input)
These points are the y-coordinates of a grid on which the interpolated surface is to be
evaluated.
SUR Matrix of size NXOUT by NYOUT. (Output)
This matrix contains the values of the surface on the XOUT by YOUT grid, i.e. SUR(I, J)
contains the interpolated value at (XOUT(I), YOUT(J)).
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least four.
Default: NDATA = size (FDATA,1).
NXOUT The number of elements in XOUT. (Input)
Default: NXOUT = size (XOUT,1).
NYOUT The number of elements in YOUT. (Input)
Default: NYOUT = size (YOUT,1).
LDSUR Leading dimension of SUR exactly as specified in the dimension statement of the
calling program. (Input)
LDSUR must be at least as large as NXOUT.
Default: LDSUR = size (SUR,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL SURF (NDATA, XYDATA, FDATA, NXOUT, NYOUT, XOUT, YOUT, SUR,
LDSUR)
Double:
Description
This routine is designed to compute a $C^1$ interpolant to scattered data in the plane. Given the data
points
$\{(x_i, y_i, f_i)\}_{i=1}^{N}$ in $\mathbf{R}^3$
SURF returns (in SUR, the user-specified grid) the values of the interpolant s. The computation of s
starts from a triangulation of the points
$\{(x_i, y_i)\}_{i=1}^{N}$
On each triangle T of this triangulation, s has the form
$s(x, y) = \sum_{m+n \le 5} c_{mn}^{T} x^m y^n \quad \text{for all } (x, y) \in T$
Thus, s is a bivariate quintic polynomial on each triangle of the triangulation. In addition, we have
$s(x_i, y_i) = f_i$ for $i = 1, \ldots, N$
Comments
1.
2. Informational errors
   Type  Code
   4     5     The data point values must be distinct.
   4     6     The XOUT values must be strictly increasing.
   4     7     The YOUT values must be strictly increasing.
3.
Example
In this example, the interpolant to the linear function 3 + 7x + 2y is computed from 20 data points
equally spaced on the circle of radius 3. We then print the values on a 3 × 3 grid.
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    LDSUR, NDATA, NXOUT, NYOUT
      PARAMETER  (NDATA=20, NXOUT=3, NYOUT=3, LDSUR=NXOUT)
!
      INTEGER    I, J, NOUT
      REAL       ABS, COS, F, FDATA(NDATA), FLOAT, PI,&
                 SIN, SUR(LDSUR,NYOUT), X, XOUT(NXOUT),&
                 XYDATA(2,NDATA), Y, YOUT(NYOUT)
      INTRINSIC  ABS, COS, FLOAT, SIN
!                                  Define function
      F(X,Y) = 3.0 + 7.0*X + 2.0*Y
!                                  Get value for PI
      PI = CONST('PI')
!                                  Set up X, Y, and F data on a circle
      DO 10  I=1, NDATA
         XYDATA(1,I) = 3.0*SIN(2.0*PI*FLOAT(I-1)/FLOAT(NDATA))
         XYDATA(2,I) = 3.0*COS(2.0*PI*FLOAT(I-1)/FLOAT(NDATA))
         FDATA(I)    = F(XYDATA(1,I),XYDATA(2,I))
   10 CONTINUE
!                                  Set up XOUT and YOUT data on [0,1] by
!                                  [0,1] grid.
      DO 20  I=1, NXOUT
         XOUT(I) = FLOAT(I-1)/FLOAT(NXOUT-1)
   20 CONTINUE
      DO 30  I=1, NYOUT
         YOUT(I) = FLOAT(I-1)/FLOAT(NYOUT-1)
   30 CONTINUE
!                                  Interpolate scattered data
      CALL SURF (XYDATA, FDATA, XOUT, YOUT, SUR)
!                                  Get output unit number
      CALL UMACH (2, NOUT)
!                                  Write heading
      WRITE (NOUT,99998)
!                                  Print results
      DO 40  I=1, NYOUT
         DO 40  J=1, NXOUT
            WRITE (NOUT,99999) XOUT(J), YOUT(I), SUR(J,I),&
                              F(XOUT(J),YOUT(I)),&
                              ABS(SUR(J,I)-F(XOUT(J),YOUT(I)))
   40 CONTINUE
99998 FORMAT (' ', 10X, 'X', 11X, 'Y', 9X, 'SURF', 6X, 'F(X,Y)', 7X,&
             'ERROR', /)
99999 FORMAT (1X, 5F12.4)
      END
Output
           X           Y         SURF      F(X,Y)       ERROR

      0.0000      0.0000      3.0000      3.0000      0.0000
      0.5000      0.0000      6.5000      6.5000      0.0000
      1.0000      0.0000     10.0000     10.0000      0.0000
      0.0000      0.5000      4.0000      4.0000      0.0000
      0.5000      0.5000      7.5000      7.5000      0.0000
      1.0000      0.5000     11.0000     11.0000      0.0000
      0.0000      1.0000      5.0000      5.0000      0.0000
      0.5000      1.0000      8.5000      8.5000      0.0000
      1.0000      1.0000     12.0000     12.0000      0.0000
RLINE
Fits a line to a set of data points using least squares.
Required Arguments
XDATA Vector of length NOBS containing the x-values. (Input)
Optional Arguments
NOBS Number of observations. (Input)
Default: NOBS = size (XDATA,1).
STAT Vector of length 12 containing the statistics described below. (Output)
I    STAT(I)
1    Mean of XDATA
2    Mean of YDATA
3    Sample variance of XDATA
4    Sample variance of YDATA
5    Correlation
6    Estimated standard error of B0
7    Estimated standard error of B1
8    Degrees of freedom for regression
9    Sum of squares for regression
10   Degrees of freedom for error
11   Sum of squares for error
12   Number of (x, y) points containing NaN (not a number) as either the x or y value
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine RLINE fits a line to a set of (x, y) data points using the method of least squares. Draper
and Smith (1981, pages 1-69) discuss the method. The fitted model is
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
where $\hat{\beta}_0$ (stored in B0) is the estimated intercept and $\hat{\beta}_1$ (stored in B1) is the estimated slope. In
addition to the fit, RLINE produces some summary statistics, including the means, sample
variances, correlation, and the error (residual) sum of squares. The estimated standard errors of
$\hat{\beta}_0$ and $\hat{\beta}_1$ are computed under the simple linear regression model. The errors in the model are
assumed to be uncorrelated and with constant variance.
If the x values are all equal, the model is degenerate. In this case, RLINE sets $\hat{\beta}_1$
to zero and $\hat{\beta}_0$ to the mean of the y values.
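The least-squares estimates can be sketched with the textbook closed-form expressions, as below. This is not the RLINE source and makes no claim about how the library computes the fit or its statistics; the data are hypothetical and the variable names (`sxx`, `sxy`) are illustrative. The degenerate branch mirrors the behavior described above for equal x values.

```fortran
program line_fit_sketch
  implicit none
  integer, parameter :: nobs = 5
  real :: x(nobs), y(nobs), xbar, ybar, sxx, sxy, b0, b1

  ! Hypothetical data, for illustration only.
  x = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)
  y = (/ 2.1, 3.9, 6.2, 7.8, 10.1 /)

  xbar = sum(x)/real(nobs)
  ybar = sum(y)/real(nobs)
  sxx = sum((x - xbar)**2)            ! corrected sum of squares of x
  sxy = sum((x - xbar)*(y - ybar))    ! corrected cross product

  if (sxx > 0.0) then
     b1 = sxy/sxx
  else
     ! Degenerate model: all x values equal, so the slope is set to zero.
     b1 = 0.0
  end if
  ! With b1 = 0 this reduces to the mean of the y values.
  b0 = ybar - b1*xbar

  print '(a,f9.5,a,f9.5)', ' B0 = ', b0, '  B1 = ', b1
end program line_fit_sketch
```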
Comments
Informational error
Type  Code
4     1     Each (x, y) point contains NaN (not a number). There are no valid data.
Example
This example fits a line to a set of data discussed by Draper and Smith (1981, Table 1.1, pages
9-33). The response y is the amount of steam used per month (in pounds), and the independent
variable x is the average atmospheric temperature (in degrees Fahrenheit).
USE RLINE_INT
USE UMACH_INT
USE WRRRL_INT
IMPLICIT
INTEGER
PARAMETER
NONE
NOBS
(NOBS=25)
INTEGER
REAL
CHARACTER
NOUT
B0, B1, STAT(12), XDATA(NOBS), YDATA(NOBS)
CLABEL(13)*15, RLABEL(1)*4
!
DATA XDATA/35.3, 29.7, 30.8, 58.8, 61.4, 71.3, 74.4, 76.7, 70.7,&
57.5, 46.4, 28.9, 28.1, 39.1, 46.8, 48.5, 59.3, 70.0, 70.0,&
74.5, 72.1, 58.1, 44.6, 33.4, 28.6/
DATA YDATA/10.98, 11.13, 12.51, 8.4, 9.27, 8.73, 6.36, 8.5,&
7.82, 9.14, 8.24, 12.19, 11.88, 9.57, 10.94, 9.58, 10.09,&
8.11, 6.83, 8.88, 7.68, 8.47, 8.86, 10.36, 11.08/
DATA RLABEL/'NONE'/, CLABEL/' ', 'Mean of X', 'Mean of Y',&
'Variance X', 'Variance Y', 'Corr.', 'Std. Err. B0',&
'Std. Err. B1', 'DF Reg.', 'SS Reg.', 'DF Error',&
'SS Error', 'Pts. with NaN'/
!
CALL RLINE (XDATA, YDATA, B0, B1, STAT=STAT)
!
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) B0, B1
99999 FORMAT (' B0 = ', F7.2, ' B1 = ', F9.5)
CALL WRRRL ('%/STAT', STAT, RLABEL, CLABEL, 1, 12, 1, &
FMT = '(12W10.4)')
!
END
Output
B0 =   13.62 B1 =  -0.07983
                      STAT
Mean of X   Mean of Y   Variance X   Variance Y
     52.6       9.424        298.1        2.659
Std. Err. B1   DF Reg.   SS Reg.   DF Error
     0.01052         1     45.59         23
RCURV
Fits a polynomial curve using least squares.
Required Arguments
XDATA Vector of length NOBS containing the x values. (Input)
YDATA Vector of length NOBS containing the y values. (Input)
$\cdots + \hat{\beta}_k x^k$
Optional Arguments
NOBS Number of observations. (Input)
Default: NOBS = size (XDATA,1).
NDEG Degree of polynomial. (Input)
Default: NDEG = size (B,1) - 1.
SSPOLY Vector of length NDEG + 1 containing the sequential sums of squares. (Output)
SSPOLY(1) contains the sum of squares due to the mean. For i = 1, 2, …, NDEG,
SSPOLY(i + 1) contains the sum of squares due to $x^i$ adjusted for the mean, $x$, $x^2$, …,
and $x^{i-1}$.
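STAT Vector of length 10 containing statistics described below. (Output)
i    Statistics
1    Mean of x
2    Mean of y
3    Sample variance of x
4    Sample variance of y
5    R-squared (in percent)
6    Degrees of freedom for regression
7    Sum of squares for regression
8    Degrees of freedom for error
9    Sum of squares for error
10   Number of (x, y) points containing NaN (not a number)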
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine RCURV computes estimates of the regression coefficients in a polynomial (curvilinear)
regression model. In addition to the computation of the fit, RCURV computes some summary
statistics. Sequential sums of squares attributable to each power of the independent variable
(stored in SSPOLY) are computed. These are useful in assessing the importance of the higher order
powers in the fit. Draper and Smith (1981, pages 101–102) and Neter and Wasserman (1974,
pages 278–287) discuss the interpretation of the sequential sums of squares. The statistic $R^2$
(stored in STAT(5)) is the percentage of the sum of squares of y about its mean explained by the
polynomial curve. Specifically,
$$R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \times 100\%$$
where $\hat{y}_i$ is the fitted y value at $x_i$ and $\bar{y}$ (stored in STAT(2)) is the mean of y. This statistic is useful in assessing the overall fit of the
curve to the data. $R^2$ must be between 0% and 100%, inclusive. $R^2$ = 100% indicates a perfect fit to
the data.
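The percentage form of the R-squared statistic can be sketched directly from observed and fitted values. The module below is a stand-alone illustration and not part of RCURV; the names R2_SKETCH and R_SQUARED_PERCENT are hypothetical, and returning 0 for constant y is an assumption made here only to avoid dividing by zero.

```fortran
module r2_sketch
  implicit none
contains
  ! R-squared in percent:
  !   100 * sum((yhat - ybar)**2) / sum((y - ybar)**2)
  ! y holds observed values, yhat the fitted values at the same points.
  function r_squared_percent(y, yhat) result(r2)
    real, intent(in) :: y(:), yhat(:)
    real :: r2, ybar, sstot
    ybar = sum(y)/real(size(y))
    sstot = sum((y - ybar)**2)          ! total sum of squares about the mean
    if (sstot > 0.0) then
      r2 = 100.0*sum((yhat - ybar)**2)/sstot
    else
      r2 = 0.0                          ! assumption: constant y reported as 0
    end if
  end function r_squared_percent
end module r2_sketch
```

With y = (1, 2, 3), fitted values equal to y give 100, and fitted values (1.5, 2, 2.5) give 25.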
Routine RCURV computes estimates of the regression coefficients in a polynomial model using
orthogonal polynomials as the regressor variables. This reparameterization of the polynomial
model in terms of orthogonal polynomials has the advantage that the loss of accuracy resulting
from forming powers of the x-values is avoided. All results are returned to the user for the original
model.
The routine RCURV is based on the algorithm of Forsythe (1957). A modification to Forsythe's
algorithm suggested by Shampine (1975) is used for computing the polynomial coefficients. A
discussion of Forsythe's algorithm and Shampine's modification appears in Kennedy and Gentle
(1980, pages 342–347).
Comments
1.
CALL R2URV (NOBS, XDATA, YDATA, NDEG, B, SSPOLY, STAT, WK, IWK)
2. Informational errors
Type Code
4         Each (x, y) point contains NaN (not a number). There are no valid data.
          The x values are constant. At least NDEG + 1 distinct x values are needed to fit a NDEG polynomial.
          The y values are constant. A zero order polynomial is fit. High order coefficients are set to zero.
          There are too few observations to fit the desired degree polynomial. High order coefficients are set to zero.
          A perfect fit was obtained with a polynomial of degree less than NDEG. High order coefficients are set to zero.
3. If NDEG is greater than 10, the accuracy of the results may be questionable.
Example
A polynomial model is fitted to data discussed by Neter and Wasserman (1974, pages 279–285).
The data set contains the response variable y measuring coffee sales (in hundred gallons) and the
number of self-service coffee dispensers. Responses for fourteen similar cafeterias are in the data
set.
USE RCURV_INT
USE WRRRL_INT
USE WRRRN_INT
IMPLICIT NONE
INTEGER NDEG, NOBS
PARAMETER (NDEG=2, NOBS=14)
REAL
!
CHARACTER
!
DATA RLABEL/'NONE'/, CLABEL/' ', 'Mean of X', 'Mean of Y',&
'Variance X', 'Variance Y', 'R-squared',&
'DF Reg.', 'SS Reg.', 'DF Error', 'SS Error',&
'Pts. with NaN'/
DATA XDATA/0., 0., 1., 1., 2., 2., 4., 4., 5., 5., 6., 6., 7.,&
7./
DATA YDATA/508.1, 498.4, 568.2, 577.3, 651.7, 657.0, 755.3,&
758.9, 787.6, 792.1, 841.4, 831.8, 854.7, 871.4/
!
Output
           B
    1       2       3
503.3    78.9    -4.0

           SSPOLY
        1           2         3
7077152.0    220644.2    4387.7

                              STAT
Mean of X   Mean of Y   Variance X   Variance Y   R-squared   DF Reg.
    3.571       711.0        6.418      17364.8       99.69         2
 SS Reg.   DF Error   SS Error
225031.9         11      710.5
FNLSQ
Computes a least-squares approximation with user-supplied basis functions.
Required Arguments
F User-supplied function to evaluate basis functions. The form is F(K, X),
where
K Number of the basis function. (Input)
K may be equal to 1, 2, …, NBASIS.
X Argument for evaluation of the K-th basis function. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program. The data FDATA is approximated
by A(1) * F(1, X) + A(2) * F(2, X) + … + A(NBASIS) * F(NBASIS, X) if INTCEP = 0 and
is approximated by A(1) + A(2) * F(1, X) + … + A(NBASIS + 1) * F(NBASIS, X) if
INTCEP = 1.
XDATA Array of length NDATA containing the abscissas of the data points. (Input)
FDATA Array of length NDATA containing the ordinates of the data points. (Input)
A Array of length INTCEP + NBASIS containing the coefficients of the approximation.
(Output)
If INTCEP = 1, A(1) contains the intercept. A(INTCEP + I) contains the coefficient of
the I-th basis function.
SSE Sum of squares of the errors. (Output)
Optional Arguments
INTCEP Intercept option. (Input)
Default: INTCEP = 0.
INTCEP
Action
Action
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL FNLSQ (F, INTCEP, NBASIS, NDATA, XDATA, FDATA, IWT, WEIGHT, A,
SSE)
Double:
Description
The routine FNLSQ computes a best least-squares approximation to given univariate data of the
form
$$\{(x_i, f_i)\}_{i=1}^{N}$$
by M basis functions
$$\{F_j\}_{j=1}^{M}$$
(where M = NBASIS and N = NDATA). In particular, if INTCEP = 0, this routine returns the error sum of squares
SSE and the coefficients a which minimize
$$\sum_{i=1}^{N} w_i \Big( f_i - \sum_{j=1}^{M} a_j F_j(x_i) \Big)^2$$
If INTCEP = 1, the routine instead minimizes
$$\sum_{i=1}^{N} w_i \Big( f_i - a_1 - \sum_{j=1}^{M} a_{j+1} F_j(x_i) \Big)^2$$
That is, the first element of the vector a is now the coefficient of the function that is identically 1
and the coefficients of the $F_j$'s are now $a_{j+1}$.
One additional parameter in the calling sequence for FNLSQ is IWT. If IWT is set to 0, then $w_i = 1$ is
assumed. If IWT is set to 1, then the user must supply the weights.
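The weighted minimization for the case without an intercept can be sketched with the normal equations in standard Fortran. This is an illustrative assumption about one way to solve the problem, not a description of how FNLSQ works internally; the names BASIS_LSQ_SKETCH, FIT_BASIS, BASIS_FN, and SIN_BASIS are hypothetical. The sketch assumes the basis functions are linearly independent on the data, so no check for a singular system is made.

```fortran
module basis_lsq_sketch
  implicit none
  abstract interface
    real function basis_fn(k, x)       ! value of the k-th basis function at x
      integer, intent(in) :: k
      real, intent(in) :: x
    end function basis_fn
  end interface
contains
  ! Minimize sum_i w(i)*(f(i) - sum_j a(j)*fb(j, x(i)))**2 (no intercept)
  ! by forming and solving the normal equations. Illustrative only.
  subroutine fit_basis(fb, x, f, w, a, sse)
    procedure(basis_fn) :: fb
    real, intent(in)  :: x(:), f(:), w(:)
    real, intent(out) :: a(:), sse
    real :: d(size(x), size(a))            ! design matrix d(i,j) = fb(j, x(i))
    real :: g(size(a), size(a) + 1)        ! augmented normal equations
    real :: row(size(a) + 1)
    integer :: i, j, m, p
    m = size(a)
    do j = 1, m
      do i = 1, size(x)
        d(i, j) = fb(j, x(i))
      end do
    end do
    do j = 1, m
      do i = 1, m
        g(i, j) = sum(w*d(:, i)*d(:, j))
      end do
      g(j, m + 1) = sum(w*d(:, j)*f)
    end do
    ! Gaussian elimination with partial pivoting
    do j = 1, m
      p = j - 1 + maxloc(abs(g(j:m, j)), dim=1)
      row = g(j, :)
      g(j, :) = g(p, :)
      g(p, :) = row
      do i = j + 1, m
        g(i, :) = g(i, :) - (g(i, j)/g(j, j))*g(j, :)
      end do
    end do
    ! Back substitution
    do j = m, 1, -1
      a(j) = (g(j, m + 1) - sum(g(j, j+1:m)*a(j+1:m)))/g(j, j)
    end do
    sse = sum(w*(f - matmul(d, a))**2)     ! weighted error sum of squares
  end subroutine fit_basis

  ! Example basis sin(k*x), matching the basis used in the example.
  real function sin_basis(k, x)
    integer, intent(in) :: k
    real, intent(in) :: x
    sin_basis = sin(real(k)*x)
  end function sin_basis
end module basis_lsq_sketch
```

Passing SIN_BASIS with four coefficients and noise-free data sampled from sin x + 7 sin 3x should recover coefficients close to (1, 0, 7, 0) with an error sum of squares near zero, up to single-precision rounding. An intercept could be handled in the same way by adding a basis function that is identically 1.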
Comments
1.
2.
Informational errors
Type
Code
3
Example
In this example, we fit the following two functions (indexed by δ)
$$1 + \sin x + 7 \sin 3x + \delta\varepsilon$$
where ε is a random uniform deviate over the range [-1, 1], and δ is 0 for the first function and 1 for
the second. These functions are evaluated at 90 equally spaced points on the interval [0, 6]. We
use 4 basis functions, sin kx for k = 1, …, 4, with and without the intercept.
USE FNLSQ_INT
USE RNSET_INT
USE UMACH_INT
USE RNUNF_INT
IMPLICIT NONE
INTEGER NBASIS, NDATA
PARAMETER (NBASIS=4, NDATA=90)
!
INTEGER I, INTCEP, NOUT
REAL A(NBASIS+1), F, FDATA(NDATA), FLOAT, G, RNOISE,&
SIN, SSE, X, XDATA(NDATA)
INTRINSIC FLOAT, SIN
EXTERNAL F
!
!
!
!
!
!
!
!
!
INTCEP = 1
!
!
!
!
!
!
!
!
!
INTCEP = 1
!
!
! Write output
WRITE (NOUT,99998) SSE, A(1), (A(I),I=2,NBASIS+1)
!
99996 FORMAT (//, ' Without error introduced we have :', /,&
' SSE Intercept Coefficients ', /)
99997 FORMAT (//, ' With error introduced we have :', /, ' SSE '&
, ' Intercept Coefficients ', /)
99998 FORMAT (1X, F8.4, 5X, F9.4, 5X, 4F9.4, /)
99999 FORMAT (1X, F8.4, 14X, 5X, 4F9.4, /)
END
REAL FUNCTION F (K, X)
INTEGER K
REAL X
!
REAL SIN
INTRINSIC SIN
!
F = SIN(K*X)
RETURN
END
Output
Without error introduced we have :
     SSE   Intercept   Coefficients
 89.8776                1.0101    0.0199    7.0291    0.0374
  0.0000      1.0000    1.0000    0.0000    7.0000    0.0000

With error introduced we have :
     SSE   Intercept   Coefficients
                        0.9963   -0.0675    6.9825    0.0133
              0.9522    0.9867   -0.0864    6.9548   -0.0223
BSLSQ
Computes the least-squares spline approximation, and returns the B-spline coefficients.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
FDATA Array of length NDATA containing the data point ordinates. (Input)
KORDER Order of the spline. (Input)
KORDER must be less than or equal to NDATA.
XKNOT Array of length NCOEF + KORDER containing the knot sequence. (Input)
XKNOT must be nondecreasing.
Optional Arguments
NDATA Number of data points. (Input)
Default: NDATA = size(XDATA, 1)
WEIGHT Array of length NDATA containing the weights. (Input)
Default: WEIGHT = 1.0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BSLSQ is based on the routine L2APPR by de Boor (1978, page 255). The IMSL
routine BSLSQ computes a weighted discrete $L_2$ approximation from a spline subspace to a given
data set $(x_i, f_i)$ for i = 1, …, N (where N = NDATA). In other words, it finds B-spline coefficients,
a = BSCOEF, such that
$$\sum_{i=1}^{N} \Big| f_i - \sum_{j=1}^{m} a_j B_j(x_i) \Big|^2 w_i$$
is a minimum, where m = NCOEF and $B_j$ denotes the j-th B-spline for the given order, KORDER, and
knot sequence, XKNOT. This linear least squares problem is solved by computing and solving the
normal equations. While the normal equations can sometimes cause numerical difficulties, their
use here should not cause a problem because the B-spline basis generally leads to well-conditioned
banded matrices.
The choice of weights depends on the problem. In some cases, there is a natural choice for the
weights based on the relative importance of the data points. To approximate a continuous function
when the location of the data points can be chosen, the use of Gauss quadrature weights and
points is reasonable. This follows because BSLSQ is minimizing an approximation to the integral
$$\int |F - s|^2 \, dx$$
The Gauss quadrature weights and points can be obtained using the IMSL routine GQRUL (see
Chapter 4, Integration and Differentiation).
Comments
1.
2.
3.
Informational errors
Type Code
4    5
4    6
4    7
4    8
The B-spline representation can be evaluated using BSVAL, and its derivative can be
evaluated using BSDER.
Example
In this example, we try to recover a quadratic polynomial using a quadratic spline with one interior
knot from two different data sets. The first data set is generated by evaluating the quadratic at 50
equally spaced points in the interval (0, 1) and then adding uniformly distributed noise to the data.
The second data set includes the first data set, and, additionally, the values at 0 and at 1 with no
noise added. Since the first and last data points are uncontaminated by noise, we have chosen
weights equal to $10^5$ for these two points in this second problem. The quadratic, the first
approximation, and the second approximation are then evaluated at 11 equally spaced points. This
example illustrates the use of the weights to enforce interpolation at certain of the data points.
USE IMSL_LIBRARIES
IMPLICIT NONE
INTEGER KORDER, NCOEF
PARAMETER (KORDER=3, NCOEF=4)
!
INTEGER I, NDATA, NOUT
REAL ABS, BSCOF1(NCOEF), BSCOF2(NCOEF), F,&
FDATA1(50), FDATA2(52), FLOAT, RNOISE, S1,&
S2, WEIGHT(52), X, XDATA1(50), XDATA2(52),&
XKNOT(KORDER+NCOEF), XT, YT
INTRINSIC ABS, FLOAT
!
DATA WEIGHT/52*1.0/
! Define function
F(X) = 8.0*X*(1.0-X)
!
!
!
!
!
!
!
!
!
!
!
WEIGHT(1) = 1.0E5
XDATA2(1) = 0.0
FDATA2(1) = F(XDATA2(1))
WEIGHT(NDATA) = 1.0E5
XDATA2(NDATA) = 1.0
FDATA2(NDATA) = F(XDATA2(NDATA))
! Compute least squares B-spline
! representation.
CALL BSLSQ (XDATA2, FDATA2, KORDER, XKNOT, NCOEF, BSCOF2, &
WEIGHT=WEIGHT)
! Get output unit number
CALL UMACH (2, NOUT)
! Write heading
WRITE (NOUT,99998)
! Print the two interpolants
! at 11 points.
DO 40 I=1, 11
XT = FLOAT(I-1)/10.0
YT = F(XT)
! Evaluate splines
S1 = BSVAL(XT,KORDER,XKNOT,NCOEF,BSCOF1)
S2 = BSVAL(XT,KORDER,XKNOT,NCOEF,BSCOF2)
WRITE (NOUT,99999) XT, YT, S1, S2, (S1-YT), (S2-YT)
40 CONTINUE
!
99998 FORMAT (7X, 'X', 9X, 'F(X)', 6X, 'S1(X)', 5X, 'S2(X)', 7X,&
'F(X)-S1(X)', 7X, 'F(X)-S2(X)')
99999 FORMAT (' ', 4F10.4, 4X, F10.4, 7X, F10.4)
END
Output
     X      F(X)     S1(X)     S2(X)   F(X)-S1(X)   F(X)-S2(X)
0.0000    0.0000    0.0515    0.0000       0.0515       0.0000
0.1000    0.7200    0.7594    0.7490       0.0394       0.0290
0.2000    1.2800    1.3142    1.3277       0.0342       0.0477
0.3000    1.6800    1.7158    1.7362       0.0358       0.0562
0.4000    1.9200    1.9641    1.9744       0.0441       0.0544
0.5000    2.0000    2.0593    2.0423       0.0593       0.0423
0.6000    1.9200    1.9842    1.9468       0.0642       0.0268
0.7000    1.6800    1.7220    1.6948       0.0420       0.0148
0.8000    1.2800    1.2726    1.2863      -0.0074       0.0063
0.9000    0.7200    0.6360    0.7214      -0.0840       0.0014
1.0000    0.0000   -0.1878    0.0000      -0.1878       0.0000
BSVLS
Computes the variable knot B-spline least squares approximation to given data.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
FDATA Array of length NDATA containing the data point ordinates. (Input)
KORDER Order of the spline. (Input)
KORDER must be less than or equal to NDATA.
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least 2.
Default: NDATA = size(XDATA, 1)
WEIGHT Array of length NDATA containing the weights. (Input)
Default: WEIGHT = 1.0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BSVLS attempts to find the best placement of knots that will minimize the least-squares
error to given data by a spline of order k = KORDER with N = NCOEF coefficients. The user
provides the order k of the spline and the number of coefficients N. For this problem to make
sense, it is necessary that N > k. We then attempt to find the minimum of the functional
$$F(a, t) = \sum_{i=1}^{M} w_i \Big[ f_i - \sum_{j=1}^{N} a_j B_{j,k,t}(x_i) \Big]^2$$
The user must provide the weights w = WEIGHT, the data $x_i$ = XDATA and $f_i$ = FDATA, and
M = NDATA. The minimum is taken over all admissible knot sequences t.
The technique employed in BSVLS uses the fact that for a fixed knot sequence t the minimization
in a is a linear least-squares problem that can be solved by calling the IMSL routine BSLSQ. Thus,
we can think of our objective function F as a function of just t by setting
$$G(t) = \min_{a} F(a, t)$$
A Gauss-Seidel (cyclic coordinate) method is then used to reduce the value of the new objective
function G. In addition to this local method, there is a global heuristic built into the algorithm that
will be useful if the data arise from a smooth function. This heuristic is based on the routine
NEWNOT of de Boor (1978, pages 184 and 258–261).
The user must input an initial guess, $t^g$ = XGUESS, for the knot sequence. This guess must be a
valid knot sequence for the splines of order k with
$$t^g_1 \le \cdots \le t^g_k \le x_i \le t^g_{N+1} \le \cdots \le t^g_{N+k}, \quad i = 1, \ldots, M$$
$$t^g_i < t^g_{i+k}, \quad i = 1, \ldots, N$$
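Before calling a knot-optimizing routine, it can help to check an initial knot guess. The module below is a minimal sketch in standard Fortran, with the hypothetical names KNOT_CHECK_SKETCH and VALID_KNOT_GUESS. It assumes the admissibility conditions are: the knots are nondecreasing, every data abscissa lies between the k-th and (N+1)-st knots, no knot value is repeated more than k times (t(i) < t(i+k)), and N > k. It is not taken from the BSVLS source.

```fortran
module knot_check_sketch
  implicit none
contains
  ! Returns .true. when t (length n + k) looks like an admissible
  ! initial knot guess of order k for data abscissas x, under the
  ! assumed conditions:
  !   t nondecreasing,
  !   t(k) <= x(i) <= t(n+1) for every i,
  !   t(i) < t(i+k) for i = 1, ..., n,
  !   n > k.
  logical function valid_knot_guess(t, k, x) result(ok)
    real, intent(in)    :: t(:), x(:)
    integer, intent(in) :: k
    integer :: n
    n = size(t) - k                                  ! number of coefficients
    ok = .false.
    if (k < 1 .or. n <= k) return
    if (any(t(2:) < t(:size(t)-1))) return           ! must be nondecreasing
    if (any(x < t(k)) .or. any(x > t(n+1))) return   ! data inside the knot span
    if (any(t(1:n) >= t(1+k:n+k))) return            ! multiplicity at most k
    ok = .true.
  end function valid_knot_guess
end module knot_check_sketch
```

For a quadratic spline (k = 3) with five coefficients, the guess (0, 0, 0, .2, .8, 1, 1, 1) passes for data on [0, 1], while a guess with four knots stacked at 0 fails the multiplicity test.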
The routine BSVLS returns the B-spline representation of the best fit found by the algorithm as
well as the square root of the sum of squares error in SSQ. If this answer is unsatisfactory, you may
reinitialize BSVLS with the return from BSVLS to see if an improvement will occur. We have found
that this option does not usually (substantially) improve the result. In regard to execution speed,
this routine can be several orders of magnitude slower than one call to the least-squares routine
BSLSQ.
Comments
1.
2.
Informational errors
Type Code
3    12   The knots found to be optimal are stacked more than KORDER. This indicates fewer knots will produce the same error sum of squares. The knots have been separated slightly.
Example
In this example, we try to fit the function |x - .33| evaluated at 100 equally spaced points on [0, 1].
We first use quadratic splines with 2 interior knots initially at .2 and .8. The eventual error should
be zero since the function is a quadratic spline with two knots stacked at .33. As a second
example, we try to fit the same data with cubic splines with three interior knots initially located at
.1, .2, and .5. Again, the theoretical error is zero when the three knots are stacked at .33.
We include a graph of the initial least-squares fit using the IMSL routine BSLSQ for the above
quadratic spline example with knots at .2 and .8. This graph overlays the graph of the spline
computed by BSVLS, which is indistinguishable from the data.
USE BSVLS_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER KORD1, KORD2, NCOEF1, NCOEF2, NDATA
PARAMETER (KORD1=3, KORD2=4, NCOEF1=5, NCOEF2=7, NDATA=100)
!
INTEGER I, NOUT
REAL ABS, BSCOEF(NCOEF2), F, FDATA(NDATA), FLOAT, SSQ,&
WEIGHT(NDATA), X, XDATA(NDATA), XGUES1(NCOEF1+KORD1),&
XGUES2(KORD2+NCOEF2), XKNOT(NCOEF2+KORD2)
INTRINSIC ABS, FLOAT
!
!
!
!
!
!
!
!
!
!
!
!
!
99998 FORMAT (' Piecewise ', A, /)
99999 FORMAT (' Square root of the sum of squares : ', F9.4, /,&
' Knot sequence : ', /, 1X, 11(F9.4,/,1X))
END
Output
Piecewise quadratic
Square root of the sum of squares :    0.0008
Knot sequence :
0.0000
0.0000
0.0000
0.3137
0.3464
1.0001
1.0001
1.0001
Piecewise cubic
Square root of the sum of squares :    0.0005
Knot sequence :
0.0000
0.0000
0.0000
0.0000
0.3167
0.3273
0.3464
1.0001
1.0001
1.0001
1.0001
CONFT
Computes the least-squares constrained spline approximation, returning the B-spline coefficients.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
FDATA Array of size NDATA containing the values to be approximated. (Input)
FDATA(I) contains the value at XDATA(I).
XVAL Array of length NXVAL containing the abscissas at which the fit is to be constrained.
(Input)
NHARD Number of entries of XVAL involved in the hard constraints. (Input)
Note: 0 ≤ NHARD ≤ NXVAL. Setting NHARD to zero always results in a fit, while setting
NHARD to NXVAL forces all constraints to be met. The hard constraints must be
satisfied or else the routine signals failure. The soft constraints need not be satisfied,
but there will be an attempt to satisfy the soft constraints. The constraints must be
ordered in terms of priority with the most important constraints first. Thus, all of the
hard constraints must precede the soft constraints. If infeasibility is detected among
the soft constraints, we satisfy (in order) as many of the soft constraints as possible.
IDER Array of length NXVAL containing the derivative value of the spline that is to be
constrained. (Input)
If we want to constrain the integral of the spline over the closed interval [c, d], then we
set IDER(I) = IDER(I + 1) = -1 and XVAL(I) = c and XVAL(I + 1) = d. For
consistency, we insist that ITYPE(I) = ITYPE(I + 1) .GE. 0 and c .LE. d. Note that
every entry in IDER must be at least -1.
ITYPE Array of length NXVAL indicating the types of general constraints. (Input)
ITYPE(I)          I-th Constraint
1                 $\mathrm{BL}(I) = f^{(d_i)}(x_i)$
2                 $f^{(d_i)}(x_i) \le \mathrm{BU}(I)$
3                 $f^{(d_i)}(x_i) \ge \mathrm{BL}(I)$
4                 $\mathrm{BL}(I) \le f^{(d_i)}(x_i) \le \mathrm{BU}(I)$
($d_i = -1$) 1    $\mathrm{BL}(I) = \int_c^d f(t)\,dt$
($d_i = -1$) 2    $\int_c^d f(t)\,dt \le \mathrm{BU}(I)$
($d_i = -1$) 3    $\int_c^d f(t)\,dt \ge \mathrm{BL}(I)$
($d_i = -1$) 4    $\mathrm{BL}(I) \le \int_c^d f(t)\,dt \le \mathrm{BU}(I)$
10
99
In order to set two point constraints, we must have ITYPE(I) = ITYPE(I + 1) and ITYPE(I)
must be negative.
ITYPE(I)          I-th Constraint
-1                $\mathrm{BL}(I) = f^{(d_i)}(x_i) - f^{(d_{i+1})}(x_{i+1})$
-2                $f^{(d_i)}(x_i) - f^{(d_{i+1})}(x_{i+1}) \le \mathrm{BU}(I)$
-3                $f^{(d_i)}(x_i) - f^{(d_{i+1})}(x_{i+1}) \ge \mathrm{BL}(I)$
-4                $\mathrm{BL}(I) \le f^{(d_i)}(x_i) - f^{(d_{i+1})}(x_{i+1}) \le \mathrm{BU}(I)$
BL Array of length NXVAL containing the lower limit of the general constraints. If there is
no lower limit on the I-th constraint, then BL(I) is not referenced. (Input)
BU Array of length NXVAL containing the upper limit of the general constraints. If there is
no upper limit on the I-th constraint, then BU(I) is not referenced. If there is no range
constraint, BL and BU can share the same storage locations. (Input)
If the I-th constraint is an equality constraint, BU(I) is not referenced.
KORDER Order of the spline. (Input)
XKNOT Array of length NCOEF + KORDER containing the knot sequence. (Input)
The entries of XKNOT must be nondecreasing.
BSCOEF Array of length NCOEF containing the B-spline coefficients. (Output)
Optional Arguments
NDATA Number of data points. (Input)
Default: NDATA = size (XDATA,1).
WEIGHT Array of length NDATA containing the weights. (Input)
Default: WEIGHT = 1.0.
NXVAL Number of points in the vector XVAL. (Input)
Default: NXVAL = size (XVAL,1).
NCOEF Number of B-spline coefficients. (Input)
Default: NCOEF = size (BSCOEF,1).
FORTRAN 90 Interface
Generic:
CALL CONFT (XDATA, FDATA, XVAL, NHARD, IDER, ITYPE, BL, BU, KORDER,
XKNOT, BSCOEF [,])
Specific:
FORTRAN 77 Interface
Single:
CALL CONFT (NDATA, XDATA, FDATA, WEIGHT, NXVAL, XVAL, NHARD, IDER,
ITYPE, BL, BU, KORDER, XKNOT, NCOEF, BSCOEF)
Double:
Description
The routine CONFT produces a constrained, weighted least-squares fit to data from a spline
subspace. Constraints involving one point, two points, or integrals over an interval are allowed.
The routine supports four types of constraints:
$$E_p[f] = f^{(j_p)}(y_p)$$
or
$$E_p[f] = f^{(j_p)}(y_p) - f^{(j_{p+1})}(y_{p+1})$$
or
$$E_p[f] = \int_{y_p}^{y_{p+1}} f(t)\,dt$$
or periodic end conditions.
An interval, $I_p$, (which may be a point, a finite interval, or a semi-infinite interval) is associated
with each of these constraints.
The input for this routine consists of several items. First, the data set $(x_i, f_i)$ for i = 1, …, N (where
N = NDATA) is the data to be fit. Second, we have the weights to be used in the least
squares fit (w = WEIGHT). The vector XVAL of length NXVAL contains the abscissas of the points
involved in specifying the constraints. The algorithm tries to satisfy all the constraints, but if the
constraints are inconsistent then it will drop constraints, in the reverse order specified, until either
a consistent set of constraints is found or the hard constraints are determined to be inconsistent
(the hard constraints are those involving XVAL(1), …, XVAL(NHARD)). Thus, the algorithm
satisfies as many constraints as possible in the order specified by the user. In the case when
constraints are dropped, the user will receive a message explaining how many constraints had to
be dropped to obtain the fit. The next several arguments are related to the type of constraint and
the constraint interval. The last four arguments determine the spline solution. The user chooses the
spline subspace (KORDER, XKNOT, and NCOEF), and the routine returns the B-spline coefficients in
BSCOEF.
Let $n_f$ denote the number of feasible constraints as described above. Then, the routine solves the
problem
$$\min_{a} \sum_{i=1}^{N} \Big| f_i - \sum_{j=1}^{m} a_j B_j(x_i) \Big|^2 w_i$$
subject to
$$E_p\Big[ \sum_{j=1}^{m} a_j B_j \Big] \in I_p, \quad p = 1, \ldots, n_f$$
This linearly constrained least-squares problem is treated as a quadratic program and is solved by
invoking the IMSL routine QPROG (see Chapter 8, Optimization).
The choice of weights depends on the data uncertainty in the problem. In some cases, there is a
natural choice for the weights based on the estimates of errors in the data points.
Determining feasibility of linear constraints is a numerically sensitive task. If you encounter
difficulties, a quick fix would be to widen the constraint intervals $I_p$.
Comments
1.
2.
Informational errors
Type
Code
3
4
4
4
11
12
13
14
15
4
4
4
4
16
17
18
19
20
Example 1
This is a simple application of CONFT. We generate data from the function
$$\frac{x}{2} + \sin\left(\frac{x}{2}\right)$$
contaminated with random noise and fit it with cubic splines. The function is increasing so we
would hope that our least-squares fit would also be increasing. This is not the case for the
unconstrained least squares fit generated by BSLSQ. We then force the derivative to be greater than
0 at NXVAL = 15 equally spaced points and call CONFT. The resulting curve is monotone. We print
the error for the two fits averaged over 100 equally spaced points.
USE IMSL_LIBRARIES
IMPLICIT NONE
INTEGER KORDER, NCOEF, NDATA, NXVAL
PARAMETER (KORDER=4, NCOEF=8, NDATA=15, NXVAL=15)
INTEGER
REAL
INTRINSIC
!
!
!
!
!
!
!
!
!
!
!
X = GRDSIZ*FLOAT(I-1)/99.0
ERRNFT = ERRNFT + ABS(F1(X)-BSVAL(X,KORDER,XKNOT,NCOEF,BSCNFT)&
)
ERRLSQ = ERRLSQ + ABS(F1(X)-BSVAL(X,KORDER,XKNOT,NCOEF,BSCLSQ)&
)
50 CONTINUE
! Print results
WRITE (NOUT,99998) ERRLSQ/100.0
WRITE (NOUT,99999) ERRNFT/100.0
!
99998 FORMAT (' Average error with BSLSQ fit: ', F8.5)
99999 FORMAT (' Average error with CONFT fit: ', F8.5)
END
Output
Average error with BSLSQ fit:  0.20250
Average error with CONFT fit:  0.14334
Additional Examples
Example 2
We now try to recover the function
$$\frac{1}{1 + x^4}$$
from noisy data. We first try the unconstrained least-squares fit using BSLSQ. Finding that fit
somewhat unsatisfactory, we apply several constraints using CONFT. First, notice that the
unconstrained fit oscillates through the true function at both ends of the interval. This is common
for flat data. To remove this oscillation, we constrain the cubic spline to have zero second
derivative at the first and last four knots. This forces the cubic spline to reduce to a linear
polynomial on the first and last three knot intervals. In addition, we constrain the fit (which we
will call s) as follows:
$$s(-7) \ge 0$$
$$\int_{-7}^{7} s(x)\,dx \le 2.3$$
$$s(-7) = s(7)$$
Notice that the last constraint was generated using the periodic option (requiring only the zeroth
derivative to be periodic). We print the error for the two fits averaged over 100 equally spaced
points.
USE IMSL_LIBRARIES
IMPLICIT NONE
INTEGER KORDER, NCOEF, NDATA, NXVAL
PARAMETER (KORDER=4, NCOEF=13, NDATA=51, NXVAL=12)
INTEGER
REAL
INTRINSIC
!
F1(X) = 1.0/(1.0+X**4)
!
!
!
!
!
!
GRDSIZ = 14.0
DO 10 I=1, NDATA
XDATA(I) = GRDSIZ*((FLOAT(I-1)/FLOAT(NDATA-1))) - GRDSIZ/2.0
FDATA(I) = RNUNF()
FDATA(I) = F1(XDATA(I)) + 0.125*(FDATA(I)-.5)
10 CONTINUE
! Compute KNOTS
DO 20 I=1, NCOEF - KORDER + 2
XKNOT(I+KORDER-1) = GRDSIZ*((FLOAT(I-1)/FLOAT(NCOEF-KORDER+1))&
) - GRDSIZ/2.0
20 CONTINUE
!
!
!
DO 30 I=1, KORDER - 1
XKNOT(I) = XKNOT(KORDER)
XKNOT(I+NCOEF+1) = XKNOT(NCOEF+1)
30 CONTINUE
! Compute BSLSQ fit
CALL BSLSQ (XDATA, FDATA, KORDER, XKNOT, NCOEF, BSCLSQ)
! Construct the constraints for
! CONFT
DO 40 I=1, 4
XVAL(I) = XKNOT(KORDER+I-1)
XVAL(I+4) = XKNOT(NCOEF-3+I)
ITYPE(I) = 1
ITYPE(I+4) = 1
IDER(I) = 2
IDER(I+4) = 2
BL(I) = 0.0
BL(I+4) = 0.0
40 CONTINUE
!
XVAL(9) = -7.0
ITYPE(9) = 3
IDER(9) = 0
BL(9) = 0.0
!
XVAL(10) = -7.0
ITYPE(10) = 2
IDER(10) = -1
BU(10) = 2.3
XVAL(11) = 7.0
ITYPE(11) = 2
IDER(11) = -1
BU(11) = 2.3
!
XVAL(12) = -7.0
ITYPE(12) = 10
IDER(12) = 0
!
!
!
! Call CONFT
CALL CONFT (XDATA, FDATA, XVAL, NHARPT, IDER, ITYPE, BL, BU,&
KORDER, XKNOT, BSCNFT, NCOEF=NCOEF)
! Compute the average error
! of 100 points in the interval.
ERRLSQ = 0.0
ERRNFT = 0.0
DO 50 I=1, 100
X = GRDSIZ*FLOAT(I-1)/99.0 - GRDSIZ/2.0
ERRNFT = ERRNFT + ABS(F1(X)-BSVAL(X,KORDER,XKNOT,NCOEF,BSCNFT)&
)
ERRLSQ = ERRLSQ + ABS(F1(X)-BSVAL(X,KORDER,XKNOT,NCOEF,BSCLSQ)&
)
50 CONTINUE
! Print results
WRITE (NOUT,99998) ERRLSQ/100.0
WRITE (NOUT,99999) ERRNFT/100.0
!
99998 FORMAT (' Average error with BSLSQ fit: ', F8.5)
99999 FORMAT (' Average error with CONFT fit: ', F8.5)
END
Output
Average error with BSLSQ fit:  0.01783
Average error with CONFT fit:  0.01339
BSLS2
Computes a two-dimensional tensor-product spline approximant using least squares, returning the
tensor-product B-spline coefficients.
Required Arguments
XDATA Array of length NXDATA containing the data points in the X-direction. (Input)
XDATA must be nondecreasing.
YDATA Array of length NYDATA containing the data points in the Y-direction. (Input)
YDATA must be nondecreasing.
FDATA Array of size NXDATA by NYDATA containing the values on the X-Y grid to be
interpolated. (Input)
FDATA(I, J) contains the value at (XDATA(I), YDATA(J)).
KXORD Order of the spline in the X-direction. (Input)
KYORD Order of the spline in the Y-direction. (Input)
XKNOT Array of length KXORD + NXCOEF containing the knots in the X-direction. (Input)
XKNOT must be nondecreasing.
YKNOT Array of length KYORD + NYCOEF containing the knots in the Y-direction. (Input)
YKNOT must be nondecreasing.
BSCOEF Array of length NXCOEF * NYCOEF that contains the tensor product B-spline
coefficients. (Output)
BSCOEF is treated internally as an array of size NXCOEF by NYCOEF.
Optional Arguments
NXDATA Number of data points in the X-direction. (Input)
Default: NXDATA = size (XDATA,1).
NYDATA Number of data points in the Y-direction. (Input)
Default: NYDATA = size (YDATA,1).
LDF Leading dimension of FDATA exactly as specified in the dimension statement of
calling program. (Input)
Default: LDF = size (FDATA,1).
NXCOEF Number of B-spline coefficients in the X-direction. (Input)
Default: NXCOEF = size (XKNOT,1) - KXORD.
NYCOEF Number of B-spline coefficients in the Y-direction. (Input)
Default: NYCOEF = size (YKNOT,1) - KYORD.
XWEIGH Array of length NXDATA containing the positive weights of XDATA. (Input)
Default: XWEIGH = 1.0.
YWEIGH Array of length NYDATA containing the positive weights of YDATA. (Input)
Default: YWEIGH = 1.0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL BSLS2 (NXDATA, XDATA, NYDATA, YDATA, FDATA, LDF, KXORD, KYORD,
XKNOT, YKNOT, NXCOEF, NYCOEF, XWEIGH, YWEIGH, BSCOEF)
Double:
Description
The routine BSLS2 computes the coefficients of a tensor-product spline least-squares
approximation to weighted tensor-product data. The input for this subroutine consists of data
vectors to specify the tensor-product grid for the data, two vectors with the weights, the values of
the surface on the grid, and the specification for the tensor-product spline. The grid is specified by
the two vectors x = XDATA and y = YDATA of length n = NXDATA and m = NYDATA, respectively. A
two-dimensional array f = FDATA contains the data values that are to be fit. The two vectors
$w_x$ = XWEIGH and $w_y$ = YWEIGH contain the weights for the weighted least-squares problem. The
information for the approximating tensor-product spline must also be provided. This information
is contained in $k_x$ = KXORD, $t_x$ = XKNOT, and N = NXCOEF for the spline in the first variable, and in
$k_y$ = KYORD, $t_y$ = YKNOT and M = NYCOEF for the spline in the second variable. The coefficients of
the resulting tensor-product spline are returned in c = BSCOEF, which is an N × M array. The
procedure computes coefficients by solving the normal equations in tensor-product form as
discussed in de Boor (1978, Chapter 17). The interested reader might also want to study the paper by E.
Grosse (1980).
The final result produces coefficients c minimizing
$$\sum_{i=1}^{n} \sum_{j=1}^{m} w_x(i)\, w_y(j) \Big[ \sum_{k=1}^{N} \sum_{l=1}^{M} c_{kl} B_{kl}(x_i, y_j) - f_{ij} \Big]^2$$
where the function $B_{kl}$ is the tensor-product of two B-splines of order $k_x$ and $k_y$. Specifically, we
have
$$B_{kl}(x, y) = B_{k,k_x,t_x}(x)\, B_{l,k_y,t_y}(y)$$
The spline
$$\sum_{k=1}^{N} \sum_{l=1}^{M} c_{kl} B_{kl}$$
can be evaluated using BS2VL and its partial derivatives can be evaluated using BS2DR.
Comments
1.
CALL B2LS2 (NXDATA, XDATA, NYDATA, YDATA, FDATA, LDF, KXORD, KYORD, XKNOT,
YKNOT, NXCOEF, NYCOEF, XWEIGH, YWEIGH, BSCOEF, WK)
2.
Informational errors
Type Code
3    14   There may be less than one digit of accuracy in the least squares fit. Try using higher precision if possible.
4    5    Multiplicity of the knots cannot exceed the order of the spline.
4    6    The knots must be nondecreasing.
4    7    All weights must be greater than zero.
4    9    The data point abscissae must be nondecreasing.
4    10   The smallest element of the data point array must be greater than or equal to the K_ORDth knot.
4    11   The largest element of the data point array must be less than or equal to the (N_COEF + 1)st knot.
Example
The data for this example arise from the function $e^x \sin(x + y) + \varepsilon$ on the rectangle [0, 3] × [0, 5].
Here, ε is a uniform random variable with range [-1, 1]. We sample this function on a 100 × 50
grid and then try to recover it by using cubic splines in the x variable and quadratic splines in the y
variable. We print out the values of the function $e^x \sin(x + y)$ on a 3 × 5 grid and compare these
values with the values of the tensor-product spline that was computed using the IMSL routine
BSLS2.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
      INTEGER    KXORD, KYORD, LDF, NXCOEF, NXDATA, NXVEC, NYCOEF,&
                 NYDATA, NYVEC
      PARAMETER  (KXORD=4, KYORD=3, NXCOEF=15, NXDATA=100, NXVEC=4,&
                 NYCOEF=7, NYDATA=50, NYVEC=6, LDF=NXDATA)
!
      INTEGER    I, J, NOUT
      REAL       BSCOEF(NXCOEF,NYCOEF), EXP, F, FDATA(NXDATA,NYDATA),&
                 FLOAT, RNOISE, SIN, VALUE(NXVEC,NYVEC), X,&
                 XDATA(NXDATA), XKNOT(NXCOEF+KXORD), XVEC(NXVEC),&
                 XWEIGH(NXDATA), Y, YDATA(NYDATA),&
                 YKNOT(NYCOEF+KYORD), YVEC(NYVEC), YWEIGH(NYDATA)
      INTRINSIC  EXP, FLOAT, SIN
!                                 Define function
      F(X,Y) = EXP(X)*SIN(X+Y)
!                                 Set random number seed
      CALL RNSET (1234579)
!                                 Set up X knot sequence.
      DO 10  I=1, NXCOEF - KXORD + 2
         XKNOT(I+KXORD-1) = 3.0*(FLOAT(I-1)/FLOAT(NXCOEF-KXORD+1))
   10 CONTINUE
      XKNOT(NXCOEF+1) = XKNOT(NXCOEF+1) + 0.001
!                                 Stack knots.
      DO 20  I=1, KXORD - 1
         XKNOT(I) = XKNOT(KXORD)
         XKNOT(I+NXCOEF+1) = XKNOT(NXCOEF+1)
   20 CONTINUE
!                                 Set up Y knot sequence.
      DO 30  I=1, NYCOEF - KYORD + 2
         YKNOT(I+KYORD-1) = 5.0*(FLOAT(I-1)/FLOAT(NYCOEF-KYORD+1))
   30 CONTINUE
      YKNOT(NYCOEF+1) = YKNOT(NYCOEF+1) + 0.001
!                                 Stack knots.
      DO 40  I=1, KYORD - 1
         YKNOT(I) = YKNOT(KYORD)
         YKNOT(I+NYCOEF+1) = YKNOT(NYCOEF+1)
   40 CONTINUE
!                                 Set up X-grid.
      DO 50  I=1, NXDATA
         XDATA(I) = 3.0*(FLOAT(I-1)/FLOAT(NXDATA-1))
   50 CONTINUE
!                                 Set up Y-grid.
      DO 60  I=1, NYDATA
         YDATA(I) = 5.0*(FLOAT(I-1)/FLOAT(NYDATA-1))
   60 CONTINUE
!                                 Evaluate function on grid and
!                                 introduce random noise in [-1,1].
      DO 70  I=1, NYDATA
         DO 70  J=1, NXDATA
            RNOISE     = RNUNF()
            RNOISE     = 2.0*RNOISE - 1.0
            FDATA(J,I) = F(XDATA(J),YDATA(I)) + RNOISE
   70 CONTINUE
!                                 Use default weights equal to 1.
!                                 Compute least squares approximation.
      CALL BSLS2 (XDATA, YDATA, FDATA, KXORD, KYORD, &
                  XKNOT, YKNOT, BSCOEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Write heading
      WRITE (NOUT,99999)
!                                 Print interpolated values
!                                 on [0,3] x [0,5].
      DO 80  I=1, NXVEC
         XVEC(I) = FLOAT(I-1)
   80 CONTINUE
      DO 90  I=1, NYVEC
         YVEC(I) = FLOAT(I-1)
   90 CONTINUE
!                                 Evaluate spline
      CALL BS2GD (0, 0, XVEC, YVEC, KXORD, KYORD, XKNOT,&
                  YKNOT, BSCOEF, VALUE)
      DO 110  I=1, NXVEC
Output
        X           Y        F(X,Y)      S(X,Y)       Error
   0.0000      0.0000      0.0000      0.2782     -0.2782
   0.0000      1.0000      0.8415      0.7762      0.0653
   0.0000      2.0000      0.9093      0.8203      0.0890
   0.0000      3.0000      0.1411      0.1391      0.0020
   0.0000      4.0000     -0.7568     -0.5705     -0.1863
   0.0000      5.0000     -0.9589     -1.0290      0.0701
   1.0000      0.0000      2.2874      2.2678      0.0196
   1.0000      1.0000      2.4717      2.4490      0.0227
   1.0000      2.0000      0.3836      0.4947     -0.1111
   1.0000      3.0000     -2.0572     -2.0378     -0.0195
   1.0000      4.0000     -2.6066     -2.6218      0.0151
   1.0000      5.0000     -0.7595     -0.7274     -0.0321
   2.0000      0.0000      6.7188      6.6923      0.0265
   2.0000      1.0000      1.0427      0.8492      0.1935
   2.0000      2.0000     -5.5921     -5.5885     -0.0035
   2.0000      3.0000     -7.0855     -7.0955      0.0099
   2.0000      4.0000     -2.0646     -2.1588      0.0942
   2.0000      5.0000      4.8545      4.7339      0.1206
   3.0000      0.0000      2.8345      2.5971      0.2373
   3.0000      1.0000    -15.2008    -15.1079     -0.0929
   3.0000      2.0000    -19.2605    -19.1698     -0.0907
   3.0000      3.0000     -5.6122     -5.5820     -0.0302
   3.0000      4.0000     13.1959     12.6659      0.5300
   3.0000      5.0000     19.8718     20.5170     -0.6452
BSLS3
Computes a three-dimensional tensor-product spline approximant using least squares, returning
the tensor-product B-spline coefficients.
Required Arguments
XDATA Array of length NXDATA containing the data points in the x-direction. (Input)
XDATA must be nondecreasing.
YDATA Array of length NYDATA containing the data points in the y-direction. (Input)
YDATA must be nondecreasing.
ZDATA Array of length NZDATA containing the data points in the z-direction. (Input)
ZDATA must be nondecreasing.
FDATA Array of size NXDATA by NYDATA by NZDATA containing the values to be
interpolated. (Input)
FDATA(I, J, K) contains the value at (XDATA(I), YDATA(J), ZDATA(K)).
KXORD Order of the spline in the x-direction. (Input)
KYORD Order of the spline in the y-direction. (Input)
KZORD Order of the spline in the z-direction. (Input)
XKNOT Array of length KXORD + NXCOEF containing the knots in the x-direction. (Input)
XKNOT must be nondecreasing.
YKNOT Array of length KYORD + NYCOEF containing the knots in the y-direction. (Input)
YKNOT must be nondecreasing.
ZKNOT Array of length KZORD + NZCOEF containing the knots in the z-direction. (Input)
ZKNOT must be nondecreasing.
BSCOEF Array of length NXCOEF*NYCOEF*NZCOEF that contains the tensor product
B-spline coefficients. (Output)
Optional Arguments
NXDATA Number of data points in the x-direction. (Input)
NXDATA must be greater than or equal to NXCOEF.
Default: NXDATA = size (XDATA,1).
NYDATA Number of data points in the y-direction. (Input)
NYDATA must be greater than or equal to NYCOEF.
Default: NYDATA = size (YDATA,1).
NZDATA Number of data points in the z-direction. (Input)
NZDATA must be greater than or equal to NZCOEF.
Default: NZDATA = size (ZDATA,1).
LDFDAT Leading dimension of FDATA exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFDAT = size (FDATA,1).
MDFDAT Second dimension of FDATA exactly as specified in the dimension statement of
the calling program. (Input)
Default: MDFDAT = size (FDATA,2).
FORTRAN 90 Interface
Generic:
CALL BSLS3 (XDATA, YDATA, ZDATA, FDATA, KXORD, KYORD, KZORD, XKNOT,
YKNOT, ZKNOT, BSCOEF [,…])
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BSLS3 computes the coefficients of a tensor-product spline least-squares
approximation to weighted tensor-product data. The input for this subroutine consists of data
vectors to specify the tensor-product grid for the data, three vectors with the weights, the values of
the surface on the grid, and the specification for the tensor-product spline. The grid is specified by
the three vectors x = XDATA, y = YDATA, and z = ZDATA of length k = NXDATA, l = NYDATA, and
m = NZDATA, respectively. A three-dimensional array f = FDATA contains the data values which are
to be fit. The three vectors wx = XWEIGH, wy = YWEIGH, and wz = ZWEIGH contain the weights for
the weighted least-squares problem. The information for the approximating tensor-product spline
must also be provided. This information is contained in kx = KXORD, tx = XKNOT, and K = NXCOEF
for the spline in the first variable, in ky = KYORD, ty = YKNOT and L = NYCOEF for the spline in the
second variable, and in kz = KZORD, tz = ZKNOT and M = NZCOEF for the spline in the third
variable.
The coefficients of the resulting tensor-product spline are returned in c = BSCOEF, which is a
K × L × M array. The procedure computes coefficients by solving the normal equations in
tensor-product form as discussed in de Boor (1978, Chapter 17). The interested reader might also want to
study the paper by E. Grosse (1980).
The final result produces coefficients c minimizing
$$\sum_{i=1}^{k}\sum_{j=1}^{l}\sum_{p=1}^{m} w_x(i)\,w_y(j)\,w_z(p)\left[\sum_{s=1}^{K}\sum_{t=1}^{L}\sum_{u=1}^{M} c_{stu}\,B_{stu}(x_i,y_j,z_p)-f_{ijp}\right]^2$$
where the function $B_{stu}$ is the tensor-product of three B-splines of order $k_x$, $k_y$, and $k_z$. Specifically,
we have
$$B_{stu}(x,y,z)=B_{s,k_x,\mathbf{t}_x}(x)\,B_{t,k_y,\mathbf{t}_y}(y)\,B_{u,k_z,\mathbf{t}_z}(z)$$
The spline
$$\sum_{s=1}^{K}\sum_{t=1}^{L}\sum_{u=1}^{M} c_{stu}\,B_{stu}$$
can be evaluated at one point using BS3VL and its partial derivatives can be evaluated using
BS3DR. If the values on a grid are desired then we recommend BS3GD.
Comments
1.
2.
Informational errors
Type  Code
  3    13   There may be less than one digit of accuracy in the least squares fit.
            Try using higher precision if possible.
  4     7   Multiplicity of knots cannot exceed the order of the spline.
  4     8   The knots must be nondecreasing.
  4     9   All weights must be greater than zero.
  4    10   The data point abscissae must be nondecreasing.
  4    11   The smallest element of the data point array must be greater than or
            equal to the K_ORDth knot.
  4    12   The largest element of the data point array must be less than or equal
            to the (N_COEF + 1)st knot.
Example
The data for this example arise from the function $e^{(y-z)} \sin(x + y) + \varepsilon$ on the rectangle
[0, 3] × [0, 2] × [0, 1]. Here, $\varepsilon$ is a uniform random variable with range [−.5, .5]. We sample this
function on a 4 × 3 × 2 grid and then try to recover it by using tensor-product cubic splines in all
variables. We print out the values of the function $e^{(y-z)} \sin(x + y)$ on a 4 × 3 × 2 grid and compare
these values with the values of the tensor-product spline that was computed using the IMSL
routine BSLS3.
      USE BSLS3_INT
      USE RNSET_INT
      USE RNUNF_INT
      USE UMACH_INT
      USE BS3GD_INT

      IMPLICIT   NONE
      INTEGER    KXORD, KYORD, KZORD, LDFDAT, MDFDAT, NXCOEF, NXDATA,&
                 NXVAL, NYCOEF, NYDATA, NYVAL, NZCOEF, NZDATA, NZVAL
      PARAMETER  (KXORD=4, KYORD=4, KZORD=4, NXCOEF=8, NXDATA=15,&
                 NXVAL=4, NYCOEF=8, NYDATA=15, NYVAL=3, NZCOEF=8,&
                 NZDATA=15, NZVAL=2, LDFDAT=NXDATA, MDFDAT=NYDATA)
!
      INTEGER    I, J, K, NOUT
      REAL       BSCOEF(NXCOEF,NYCOEF,NZCOEF), EXP, F,&
                 FDATA(NXDATA,NYDATA,NZDATA), FLOAT, RNOISE,&
                 SIN, SPXYZ(NXVAL,NYVAL,NZVAL), X, XDATA(NXDATA),&
                 XKNOT(NXCOEF+KXORD), XVAL(NXVAL), XWEIGH(NXDATA), Y,&
                 YDATA(NYDATA), YKNOT(NYCOEF+KYORD), YVAL(NYVAL),&
                 YWEIGH(NYDATA), Z, ZDATA(NZDATA),&
                 ZKNOT(NZCOEF+KZORD), ZVAL(NZVAL), ZWEIGH(NZDATA)
      INTRINSIC  EXP, FLOAT, SIN
!                                 Define a function
      F(X,Y,Z) = EXP(Y-Z)*SIN(X+Y)
!
!
20
!
30
40
!
50
60
!
70
!
80
!
90
!
!
100
!
!
!
!                                 Compute least-squares
      CALL BSLS3 (XDATA, YDATA, ZDATA, FDATA, KXORD, KYORD, KZORD, XKNOT, &
                  YKNOT, ZKNOT, BSCOEF)
!                                 Set up grid for evaluation.
      DO 110  I=1, NXVAL
         XVAL(I) = FLOAT(I-1)
  110 CONTINUE
      DO 120  I=1, NYVAL
         YVAL(I) = FLOAT(I-1)
  120 CONTINUE
      DO 130  I=1, NZVAL
         ZVAL(I) = FLOAT(I-1)
  130 CONTINUE
!                                 Evaluate on the grid.
      CALL BS3GD (0, 0, 0, XVAL, YVAL, ZVAL, KXORD, KYORD, KZORD, XKNOT, &
                  YKNOT, ZKNOT, BSCOEF, SPXYZ)
!                                 Print results.
      WRITE (NOUT,99998)
      DO 140  I=1, NXVAL
         DO 140  J=1, NYVAL
            DO 140  K=1, NZVAL
               WRITE (NOUT,99999) XVAL(I), YVAL(J), ZVAL(K),&
                                  F(XVAL(I),YVAL(J),ZVAL(K)),&
                                  SPXYZ(I,J,K), F(XVAL(I),YVAL(J),ZVAL(K)&
                                  ) - SPXYZ(I,J,K)
  140 CONTINUE
99998 FORMAT (8X, 'X', 9X, 'Y', 9X, 'Z', 6X, 'F(X,Y,Z)', 3X,&
'S(X,Y,Z)', 4X, 'Error')
99999 FORMAT (' ', 3F10.3, 3F11.4)
END
Output
        X         Y         Z      F(X,Y,Z)   S(X,Y,Z)    Error
     0.000     0.000     0.000     0.0000     0.1987    -0.1987
     0.000     0.000     1.000     0.0000     0.1447    -0.1447
     0.000     1.000     0.000     2.2874     2.2854     0.0019
     0.000     1.000     1.000     0.8415     1.0557    -0.2142
     0.000     2.000     0.000     6.7188     6.4704     0.2484
     0.000     2.000     1.000     2.4717     2.2054     0.2664
     1.000     0.000     0.000     0.8415     0.8779    -0.0365
     1.000     0.000     1.000     0.3096     0.2571     0.0524
     1.000     1.000     0.000     2.4717     2.4015     0.0703
     1.000     1.000     1.000     0.9093     0.8995     0.0098
     1.000     2.000     0.000     1.0427     1.1330    -0.0902
     1.000     2.000     1.000     0.3836     0.4951    -0.1115
     2.000     0.000     0.000     0.9093     0.8269     0.0824
     2.000     0.000     1.000     0.3345     0.3258     0.0087
     2.000     1.000     0.000     0.3836     0.3564     0.0272
     2.000     1.000     1.000     0.1411     0.1905    -0.0494
     2.000     2.000     0.000    -5.5921    -5.5362    -0.0559
     2.000     2.000     1.000    -2.0572    -1.9659    -0.0913
     3.000     0.000     0.000     0.1411     0.4841    -0.3430
     3.000     0.000     1.000     0.0519    -0.4257     0.4776
     3.000     1.000     0.000    -2.0572    -1.9710    -0.0862
     3.000     1.000     1.000    -0.7568    -0.8479     0.0911
     3.000     2.000     0.000    -7.0855    -7.0957     0.0101
     3.000     2.000     1.000    -2.6066    -2.1650    -0.4416
CSSED
Smooths one-dimensional data by error detection.
Required Arguments
XDATA Array of length NDATA containing the abscissas of the data points. (Input)
FDATA Array of length NDATA containing the ordinates (function values) of the data
points. (Input)
DIS Proportion of the distance the ordinate in error is moved to its interpolating curve.
(Input)
It must be in the range 0.0 to 1.0. A suggested value for DIS is one.
SC Stopping criterion. (Input)
SC should be greater than or equal to zero. A suggested value for SC is zero.
Optional Arguments
NDATA Number of data points. (Input)
Default: NDATA = size (XDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSSED is designed to smooth a data set that is mildly contaminated with isolated
errors. In general, the routine will not work well if more than 25% of the data points are in error.
The routine CSSED is based on an algorithm of Guerra and Tapia (1974).
Setting NDATA = n, FDATA = f, SDATA = s and XDATA = x, the algorithm proceeds as follows.
Although the user need not input an ordered XDATA sequence, we will assume that x is increasing
for simplicity. The algorithm first sorts the XDATA values into an increasing sequence and then
continues. A cubic spline interpolant is computed for each of the 6-point data sets (initially setting
s = f)
$$(x_j, s_j) \qquad j = i-3, \ldots, i+3, \quad j \ne i,$$
where i = 4, …, n − 3, using CSAKM. For each i the interpolant, which we will call $S_i$, is compared
with the current value of $s_i$, and a point energy is computed as
$$pe_i = S_i(x_i) - s_i$$
Setting sc = SC, the algorithm terminates either if MAXIT iterations have taken place or if
$$|pe_i| \le sc\,(x_{i+3} - x_{i-3})/6 \qquad i = 4, \ldots, n-3$$
If the above inequality is violated for any i, then we update the i-th element of s by setting
$s_i = s_i + d(pe_i)$, where d = DIS. Note that neither the first three nor the last three data points are
changed. Thus, if these points are inaccurate, care must be taken to interpret the results.
The choice of the parameters d, sc and MAXIT is crucial to the successful usage of this
subroutine. If the user has specific information about the extent of the contamination, then the
parameters should be chosen as follows: d = 1, sc = 0 and MAXIT to be the number of data points
in error. On the other hand, if no such specific information is available, then choose d = .5,
MAXIT ≤ 2n, and
$$sc = .5\,\frac{\max s - \min s}{x_n - x_1}$$
In any case, we would encourage the user to experiment with these values.
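As a rough illustration of the "no specific information" recommendation above, the following Python sketch computes the suggested starting values for d, sc and MAXIT from a data set. The function name `cssed_default_parameters` is hypothetical and is not part of the IMSL library; it only evaluates the formula quoted in the text and uses the upper limit 2n for MAXIT.

```python
def cssed_default_parameters(xdata, sdata):
    """Suggested (d, sc, maxit) when nothing is known about the contamination.

    Hypothetical helper: it mirrors the recommendation d = .5, MAXIT <= 2n and
    sc = .5 * (max s - min s) / (x_n - x_1); it does not call CSSED itself.
    """
    n = len(xdata)
    if n < 2 or len(sdata) != n:
        raise ValueError("xdata and sdata must have the same length (at least 2)")
    x_sorted = sorted(xdata)            # CSSED sorts the abscissas first
    span = x_sorted[-1] - x_sorted[0]   # x_n - x_1
    if span <= 0.0:
        raise ValueError("abscissas must not all be equal")
    d = 0.5
    sc = 0.5 * (max(sdata) - min(sdata)) / span
    maxit = 2 * n                       # upper limit of the recommended range
    return d, sc, maxit
```

For n = 91 data points this gives MAXIT = 182. The values are only a starting point for experimentation.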
Comments
1.
2.
Informational error
Type
Code
3
3.
Example
We take 91 uniform samples from the function $5 + (5 + t^2 \sin t)/t$ on the interval [1, 10]. Then, we
contaminate 10 of the samples and try to recover the original function values.
      USE CSSED_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=91)
INTEGER
REAL
INTRINSIC
!
DATA ISB/6, 17, 26, 34, 42, 49, 56, 62, 75, 83/
DATA RNOISE/2.5, -3.0, -2.0, 2.5, 3.0, -2.0, -2.5, 2.0, -2.0, 3.0/
!
!
!
      SC    = 0.56
      MAXIT = 182
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
99997 FORMAT (' Case A - No specific information available', /,&
            '       F(X)        F(X)+NOISE          SDATA(X)', /)
99998 FORMAT (' Case B - Specific information available', /,&
            '       F(X)        F(X)+NOISE          SDATA(X)', /)
99999 FORMAT (' ', F7.3, 8X, F7.3, 11X, F7.3)
END
Output
Case A - No specific information available
      F(X)        F(X)+NOISE          SDATA(X)

   9.830         12.330              9.870
   8.263          5.263              8.215
   5.201          3.201              5.168
   2.223          4.723              2.264
   1.259          4.259              1.308
   3.167          1.167              3.138
   7.167          4.667              7.131
  10.880         12.880             10.909
  12.774         10.774             12.708
   7.594         10.594              7.639

*** WARNING ERROR 1 from CSSED. Maximum number of iterations limit MAXIT
***          =10 exceeded. The best answer found is returned.
Case B - Specific information available
      F(X)        F(X)+NOISE          SDATA(X)

   9.830         12.330              9.831
   8.263          5.263              8.262
   5.201          3.201              5.199
   2.223          4.723              2.225
   1.259          4.259              1.261
   3.167          1.167              3.170
   7.167          4.667              7.170
  10.880         12.880             10.878
  12.774         10.774             12.770
   7.594         10.594              7.592
CSSMH
Computes a smooth cubic spline approximation to noisy data.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input)
XDATA must be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
SMPAR A nonnegative number which controls the smoothing. (Input)
The spline function S returned is such that the sum from I = 1 to NDATA of
((S(XDATA(I)) - FDATA(I)) / WEIGHT(I))**2 is less than or equal to SMPAR. It is
recommended that SMPAR lie in the confidence interval of this sum, i.e.,
NDATA - SQRT(2 * NDATA) .LE. SMPAR .LE. NDATA + SQRT(2 * NDATA).
BREAK Array of length NDATA containing the breakpoints for the piecewise cubic
representation. (Output)
CSCOEF Matrix of size 4 by NDATA containing the local coefficients of the cubic pieces.
(Output)
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least 2.
Default: NDATA = size (XDATA,1).
WEIGHT Array of length NDATA containing estimates of the standard deviations of
FDATA. (Input)
All elements of WEIGHT must be positive.
Default: WEIGHT = 1.0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSSMH is designed to produce a $C^2$ cubic spline approximation to a data set in which
the function values are noisy. This spline is called a smoothing spline. It is a natural cubic spline
with knots at all the data abscissas x = XDATA, but it does not interpolate the data $(x_i, f_i)$. The
smoothing spline S is the unique $C^2$ function which minimizes
$$\int S''(x)^2\,dx$$
subject to the constraint
$$\sum_{i=1}^{N}\left|\frac{S(x_i)-f_i}{w_i}\right|^2 \le \sigma$$
where w = WEIGHT, $\sigma$ = SMPAR is the smoothing parameter, and N = NDATA.
The routine CSSMH is based on an algorithm of Reinsch (1967). This algorithm is also discussed in
de Boor (1978, pages 235–243).
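The recommended range for SMPAR and the weighted residual sum that it bounds can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the helper names `smpar_interval` and `weighted_residual_sum` are hypothetical and are not IMSL routines.

```python
import math


def smpar_interval(ndata):
    """Recommended range NDATA - sqrt(2*NDATA) <= SMPAR <= NDATA + sqrt(2*NDATA)."""
    if ndata < 2:
        raise ValueError("NDATA must be at least 2")
    half_width = math.sqrt(2.0 * ndata)
    return ndata - half_width, ndata + half_width


def weighted_residual_sum(svals, fdata, weight):
    """Sum of ((S(x_i) - f_i) / w_i)**2, the quantity that SMPAR bounds."""
    return sum(((s - f) / w) ** 2 for s, f, w in zip(svals, fdata, weight))
```

For NDATA = 300 the interval is roughly [275.5, 324.5]. This range assumes that WEIGHT holds realistic estimates of the standard deviations of FDATA.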
Comments
1.
2.
3.
Informational errors
Type
Code
3
The cubic spline can be evaluated using CSVAL; its derivative can be evaluated using
CSDER.
Example
In this example, function values are contaminated by adding a small random amount to the
correct values. The routine CSSMH is used to approximate the original, uncontaminated data.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=300)
!
      INTEGER    I, NOUT
      REAL       BREAK(NDATA), CSCOEF(4,NDATA), ERROR, F,&
                 FDATA(NDATA), FLOAT, FVAL, SDEV, SMPAR, SQRT,&
                 SVAL, WEIGHT(NDATA), X, XDATA(NDATA), XT, RN
      INTRINSIC  FLOAT, SQRT
!
      F(X) = 1.0/(.1+(3.0*(X-1.0))**4)
!                                 Set up a grid
      DO 10  I=1, NDATA
         XDATA(I) = 3.0*(FLOAT(I-1)/FLOAT(NDATA-1))
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Set the random number seed
      CALL RNSET (1234579)
!                                 Contaminate the data
      DO 20  I=1, NDATA
         RN = RNUNF()
         FDATA(I) = FDATA(I) + 2.0*RN - 1.0
   20 CONTINUE
!
!
!
!
!
!
99999 FORMAT (12X, 'X', 9X, 'Function', 7X, 'Smoothed', 10X,&
'Error')
END
Output
        X         Function       Smoothed          Error
   0.0000         0.0123         0.1118         0.0995
   0.3010         0.0514         0.0646         0.0131
   0.6020         0.4690         0.2972        -0.1718
   0.9030         9.3312         8.7022        -0.6289
   1.2040         4.1611         4.7887         0.6276
   1.5050         0.1863         0.2718         0.0856
   1.8060         0.0292         0.1408         0.1116
   2.1070         0.0082         0.0826         0.0743
   2.4080         0.0031         0.0076         0.0045
   2.7090         0.0014        -0.1789        -0.1803
CSSCV
Computes a smooth cubic spline approximation to noisy data using cross-validation to estimate the
smoothing parameter.
Required Arguments
XDATA Array of length NDATA containing the data point abscissas. (Input) XDATA must
be distinct.
FDATA Array of length NDATA containing the data point ordinates. (Input)
IEQUAL A flag alerting the subroutine that the data is equally spaced. (Input)
BREAK Array of length NDATA containing the breakpoints for the piecewise cubic
representation. (Output)
CSCOEF Matrix of size 4 by NDATA containing the local coefficients of the cubic pieces.
(Output)
Optional Arguments
NDATA Number of data points. (Input)
NDATA must be at least 3.
Default: NDATA = size (XDATA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSSCV is designed to produce a $C^2$ cubic spline approximation to a data set in which
the function values are noisy. This spline is called a smoothing spline. It is a natural cubic spline
with knots at all the data abscissas x = XDATA, but it does not interpolate the data $(x_i, f_i)$. The
smoothing spline $S_\sigma$ is the unique $C^2$ function that minimizes
$$\int S_\sigma''(x)^2\,dx$$
subject to the constraint
$$\sum_{i=1}^{N}\left|S_\sigma(x_i)-f_i\right|^2 \le \sigma$$
where $\sigma$ is the smoothing parameter and N = NDATA. The reader should consult Reinsch (1967) for
more information concerning smoothing splines. The IMSL subroutine CSSMH solves the above
problem when the user provides the smoothing parameter $\sigma$. This routine attempts to find the
optimal smoothing parameter using the statistical technique known as cross-validation. This
means that (in a very rough sense) one chooses the value of $\sigma$ so that the smoothing spline ($S_\sigma$)
best approximates the value of the data at $x_i$, if it is computed using all the data except the i-th; this
is true for all i = 1, …, N. For more information on this topic, we refer the reader to Craven and
Wahba (1979).
Comments
1.
2.
Informational error
Type
Code
4
Example
In this example, function values are computed and are contaminated by adding a small random
amount. The routine CSSCV is used to try to reproduce the original, uncontaminated data.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
      INTEGER    NDATA
      PARAMETER  (NDATA=300)
!
      INTEGER    I, IEQUAL, NOUT
      REAL       BREAK(NDATA), CSCOEF(4,NDATA), ERROR, F,&
                 FDATA(NDATA), FLOAT, FVAL, SVAL, X,&
                 XDATA(NDATA), XT, RN
      INTRINSIC  FLOAT
!
      F(X) = 1.0/(.1+(3.0*(X-1.0))**4)
!
      CALL UMACH (2, NOUT)
!                                 Set up a grid
      DO 10  I=1, NDATA
         XDATA(I) = 3.0*(FLOAT(I-1)/FLOAT(NDATA-1))
         FDATA(I) = F(XDATA(I))
   10 CONTINUE
!                                 Introduce noise on [-1,1]
!                                 Contaminate the data
      CALL RNSET (1234579)
      DO 20  I=1, NDATA
         RN = RNUNF ()
         FDATA(I) = FDATA(I) + 2.0*RN - 1.0
   20 CONTINUE
!
      IEQUAL = 1
!                                 Smooth data
      CALL CSSCV (XDATA, FDATA, IEQUAL, BREAK, CSCOEF)
!                                 Print results
      WRITE (NOUT,99999)
      DO 30  I=1, 10
         XT    = 90.0*(FLOAT(I-1)/FLOAT(NDATA-1))
         SVAL  = CSVAL(XT,BREAK,CSCOEF)
         FVAL  = F(XT)
         ERROR = SVAL - FVAL
         WRITE (NOUT,'(4F15.4)') XT, FVAL, SVAL, ERROR
   30 CONTINUE
99999 FORMAT (12X, 'X', 9X, 'Function', 7X, 'Smoothed', 10X,&
            'Error')
      END
Output
        X         Function       Smoothed          Error
   0.0000         0.0123         0.2528         0.2405
   0.3010         0.0514         0.1054         0.0540
   0.6020         0.4690         0.3117        -0.1572
   0.9030         9.3312         8.9461        -0.3850
   1.2040         4.1611         4.6847         0.5235
   1.5050         0.1863         0.3819         0.1956
   1.8060         0.0292         0.1168         0.0877
   2.1070         0.0082         0.0658         0.0575
   2.4080         0.0031         0.0395         0.0364
   2.7090         0.0014        -0.2155        -0.2169
RATCH
Computes a rational weighted Chebyshev approximation to a continuous function on an interval.
Required Arguments
F User-supplied FUNCTION to be approximated. The form is F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
Optional Arguments
N The degree of the numerator. (Input)
Default: N = size (P,1) - 1.
M The degree of the denominator. (Input)
Default: M = size (Q,1) - 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine RATCH is designed to compute the best weighted $L_\infty$ (Chebyshev) approximant to a
given function. Specifically, given a weight function w = WEIGHT, a monotone function $\phi$ = PHI,
and a function f to be approximated on the interval [a, b], the subroutine RATCH returns the
coefficients (in P and Q) for a rational approximation to f on [a, b]. The user must supply the
degree of the numerator N and the degree of the denominator M of the rational function $R_M^N$.
The quantity being minimized is the weighted Chebyshev norm of the error:
$$\left\|\frac{f-R_M^N}{w}\right\| := \max_{x\in[a,b]} \frac{\left|\,f(x)-\dfrac{\sum_{i=1}^{N+1}P_i\,\phi^{\,i-1}(x)}{\sum_{i=1}^{M+1}Q_i\,\phi^{\,i-1}(x)}\right|}{w(x)}$$
Notice that setting $\phi(x) = x$ yields ordinary rational approximation. A typical use of the function
$\phi$ occurs when one wants to approximate an even function on a symmetric interval, say [−a, a], using
ordinary rational functions. In this case, it is known that the answer must be an even function.
Hence, one can set $\phi(x) = x^2$, only approximate on [0, a], and decrease by one half the degrees in
the numerator and denominator.
The algorithm implemented in this subroutine is designed for fast execution. It assumes that the
best approximant has precisely N + M + 2 equi-oscillations. That is, that there exist N + M + 2
points $t_1 < \cdots < t_{N+M+2}$ satisfying
$$e(t_i) = -e(t_{i+1}) = \pm\left\|\frac{f-R_M^N}{w}\right\|$$
Such points are called alternants. Unfortunately, there are many instances in which the best
rational approximant to the given function has either fewer alternants or more alternants. In this
case, it is not expected that this subroutine will perform well. For more information on rational
Chebyshev approximation, the reader can consult Cheney (1966). The subroutine is based on work
of Cody, Fraser, and Hart (1968).
Comments
1.
2.
Informational errors
Type
Code
3
Example
In this example, we compute the best rational approximation to the gamma function, $\Gamma$, on the
interval [2, 3] with N = M = 2. The weight function supplied in the listing below is w = $\Gamma$, so the
relative error is minimized. We display the maximum error and the
coefficients. This problem is taken from the paper of Cody, Fraser, and Hart (1968). We compute
in double precision due to the conditioning of this problem.
      USE RATCH_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    M, N
      PARAMETER  (M=2, N=2)
!
      INTEGER    NOUT
      DOUBLE PRECISION A, B, ERROR, F, P(N+1), PHI, Q(M+1), WEIGHT
      EXTERNAL   F, PHI, WEIGHT
!
      A = 2.0D0
      B = 3.0D0
!
!
!
      DOUBLE PRECISION DGAMMA
      EXTERNAL   DGAMMA
!
      F = DGAMMA(X)
      RETURN
      END
! -----------------------------------------------------------------------
!
      DOUBLE PRECISION FUNCTION PHI (X)
      DOUBLE PRECISION X
!
      PHI = X
      RETURN
      END
! -----------------------------------------------------------------------
!
      DOUBLE PRECISION FUNCTION WEIGHT (X)
      DOUBLE PRECISION X
!
      DOUBLE PRECISION DGAMMA
      EXTERNAL   DGAMMA
!
      WEIGHT = DGAMMA(X)
      RETURN
      END
Output
In double precision we have:
P     =    1.265583562487   -0.650585004466    0.197868699191
Q     =    1.000000000000   -0.064342721236   -0.028851461855
ERROR =   -0.000026934190
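The printed coefficients can be sanity-checked independently of the library. The Python sketch below evaluates the rational function with the P and Q values printed above, taking φ(x) = x and the weight w = Γ from the example listing, and samples the weighted error on a uniform grid over [2, 3]. The function names are hypothetical, and a grid maximum only approximates the true maximum.

```python
import math

# Coefficients copied from the example output (P: numerator, Q: denominator).
P = [1.265583562487, -0.650585004466, 0.197868699191]
Q = [1.000000000000, -0.064342721236, -0.028851461855]


def horner(coeffs, t):
    """Evaluate sum(coeffs[i] * t**i) with Horner's scheme."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * t + c
    return acc


def weighted_error(x):
    """(f(x) - R(x)) / w(x) with f = w = gamma and phi(x) = x."""
    r = horner(P, x) / horner(Q, x)
    return (math.gamma(x) - r) / math.gamma(x)


def max_weighted_error(a, b, samples=2001):
    """Largest |weighted error| on a uniform grid that includes both endpoints."""
    step = (b - a) / (samples - 1)
    return max(abs(weighted_error(a + k * step)) for k in range(samples))
```

Evaluating `max_weighted_error(2.0, 3.0)` should give a value close to the magnitude of ERROR reported above, with the error at the two endpoints having opposite signs, as expected at alternants.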
Routines
4.1.  Univariate Quadrature
      Adaptive general-purpose endpoint singularities      QDAGS
      Adaptive general purpose                             QDAG
      Adaptive general-purpose points of singularity       QDAGP
      Adaptive general-purpose infinite interval           QDAGI
      Adaptive weighted oscillatory (trigonometric)        QDAWO
      Adaptive weighted Fourier (trigonometric)            QDAWF
      Adaptive weighted algebraic endpoint singularities   QDAWS
      Adaptive weighted Cauchy principal value             QDAWC
      Nonadaptive general purpose                          QDNG
4.2.  Multidimensional Quadrature
      Two-dimensional quadrature (iterated integral)       TWODQ
      Adaptive N-dimensional quadrature
      over a hyper-rectangle                               QAND
      Integrates a function over a hyperrectangle using a
      quasi-Monte Carlo method                             QMC
4.3.
4.4.  Differentiation
      Approximation to first, second, or third derivative  DERIV
Usage Notes
Univariate Quadrature
The first nine routines described in this chapter are designed to compute approximations to
integrals of the form
$$\int_a^b f(x)\,w(x)\,dx$$
The weight function w is used to incorporate known singularities (either algebraic or logarithmic),
to incorporate oscillations, or to indicate that a Cauchy principal value is desired. For general
purpose integration, we recommend the use of QDAGS (even if no endpoint singularities are
present). If more efficiency is desired, then the use of QDAG (or QDAG*) should be considered.
These routines are organized as follows:
$w = 1$
      QDAGS
      QDAG
      QDAGP
      QDAGI
      QDNG
$w(x) = (x-a)^{\alpha}(b-x)^{\beta}\ln(x-a)\ln(b-x)$, where the ln factors are optional
      QDAWS
$w(x) = 1/(x-c)$
      QDAWC
The calling sequences for these routines are very similar. The function to be integrated is always
F; the lower and upper limits are, respectively, A and B. The requested absolute error is ERRABS,
while the requested relative error is ERRREL. These quadrature routines return two numbers of
interest, namely, RESULT and ERREST, which are the approximate integral R and the error estimate
E, respectively. These numbers are related as follows:
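$$\left|\int_a^b f(x)\,w(x)\,dx - R\right| \le E \le \max\left\{\varepsilon,\ \rho\left|\int_a^b f(x)\,w(x)\,dx\right|\right\}$$
where $\varepsilon$ = ERRABS and $\rho$ = ERRREL.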
One situation that occasionally arises in univariate quadrature concerns the approximation of
integrals when only tabular data are given. The routines described above do not directly address
this question. However, the standard method for handling this problem is first to interpolate the
data and then to integrate the interpolant. This can be accomplished by using the IMSL spline
interpolation routines described in Chapter 3, Interpolation and Approximation, with one of the
integration routines CSINT, BSINT, or PPITG.
Multivariate Quadrature
Two routines are described in this chapter that are of use in approximating certain multivariate
integrals. In particular, the routine TWODQ returns an approximation to an iterated two-dimensional
integral of the form
$$\int_a^b\int_{g(x)}^{h(x)} f(x,y)\,dy\,dx$$
The second routine, QAND, returns an approximation to the integral of a function of n variables
over a hyper-rectangle
$$\int_{a_1}^{b_1}\cdots\int_{a_n}^{b_n} f(x_1,\ldots,x_n)\,dx_n\cdots dx_1$$
If one has two- or three-dimensional tensor-product tabular data, use the IMSL spline interpolation
routines BS2IN or BS3IN, followed by the IMSL spline integration routines BS2IG and BS3IG
that are described in Chapter 3, Interpolation and Approximation.
$$\int_a^b f(x)\,w(x)\,dx = \sum_{i=1}^{N} f(x_i)\,w_i$$
for all functions f that are polynomials of degree less than 2N. The weight functions w may be
selected from the following table:
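w(x)                              Interval     Name
$1$                               (−1, 1)      Legendre
$1/\sqrt{1-x^2}$                  (−1, 1)      Chebyshev 1st kind
$\sqrt{1-x^2}$                    (−1, 1)      Chebyshev 2nd kind
$e^{-x^2}$                        (−∞, ∞)      Hermite
$(1+x)^{\alpha}(1-x)^{\beta}$     (−1, 1)      Jacobi
$e^{-x}x^{\alpha}$                (0, ∞)       Generalized Laguerre
$1/\cosh(x)$                      (−∞, ∞)      Hyperbolic cosine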
Where permissible, GQRUL will also compute Gauss-Radau and Gauss-Lobatto quadrature rules.
The routine RECCF produces the three-term recurrence relation for the monic orthogonal
polynomials with respect to the above weight functions.
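The exactness property quoted above (a rule with N points integrates every polynomial of degree less than 2N exactly) can be illustrated with the smallest nontrivial case. The Python sketch below hard-codes the well-known two-point Gauss-Legendre rule on (−1, 1) with weight w = 1; it does not call GQRUL, and the function names are hypothetical.

```python
import math

# Two-point Gauss-Legendre rule (N = 2) for w(x) = 1 on (-1, 1).
NODES = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
WEIGHTS = (1.0, 1.0)


def gauss2(f):
    """Apply the two-point rule: sum of w_i * f(x_i)."""
    return sum(w * f(x) for x, w in zip(NODES, WEIGHTS))


def exact_monomial_integral(p):
    """Integral of x**p over (-1, 1)."""
    return 0.0 if p % 2 else 2.0 / (p + 1)
```

With N = 2 the rule reproduces the integrals of 1, x, x², and x³ to rounding error, but not that of x⁴ (degree 2N), for which it returns 2/9 instead of 2/5.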
Another routine, GQRCF, produces the Gauss, Gauss-Radau, or Gauss-Lobatto quadrature rule
from the three-term recurrence relation. This means Gauss rules for general weight functions may
be obtained if the three-term recursion for the orthogonal polynomials is known. The routine
RECQR is an inverse to GQRCF in the sense that it produces the recurrence coefficients given the
Gauss quadrature formula.
The last routine described in this section, FQRUL, generates the Fejér quadrature rules for the
following family of weights:
$$w(x) = 1$$
$$w(x) = 1/(x-\alpha)$$
$$w(x) = (b-x)^{\alpha}(x-a)^{\beta}$$
$$w(x) = (b-x)^{\alpha}(x-a)^{\beta}\ln(x-a)$$
$$w(x) = (b-x)^{\alpha}(x-a)^{\beta}\ln(b-x)$$
Numerical differentiation
We provide one routine, DERIV, for numerical differentiation. This routine provides an estimate
for the first, second, or third derivative of a user-supplied function.
QDAGS
Integrates a function (which may have endpoint singularities).
Required Arguments
F User-supplied FUNCTION to be integrated. The form is F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
RESULT Estimate of the integral from A to B of F. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QDAGS is a general-purpose integrator that uses a globally adaptive scheme to reduce
the absolute error. It subdivides the interval [A, B] and uses a 21-point Gauss-Kronrod rule to
estimate the integral over each subinterval. The error for each subinterval is estimated by
comparison with the 10-point Gauss quadrature rule. This routine is designed to handle functions
with endpoint singularities. However, the performance on functions that are well-behaved at the
endpoints is quite good also. In addition to the general strategy described in QDAG, this routine
uses an extrapolation procedure known as the ε-algorithm. The routine QDAGS is an
implementation of the routine QAGS, which is fully documented by Piessens et al. (1983). Should
QDAGS fail to produce acceptable results, then either IMSL routines QDAG or QDAG* may be
appropriate. These routines are documented in this chapter.
Comments
1.
2.
3.
Informational errors
Type
Code
4
3
1
2
3
3
3
4
If EXACT is the exact value, QDAGS attempts to find RESULT such that
|EXACT − RESULT| ≤ max(ERRABS, ERRREL * |EXACT|). To specify only a relative
error, set ERRABS to zero. Similarly, to specify only an absolute error, set ERRREL to
zero.
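The acceptance test described in this comment amounts to a one-line comparison. The Python sketch below is a hypothetical helper, not an IMSL routine. Its default tolerances of 1.0e-3 are the single-precision defaults listed for ERRABS and ERRREL under QDAG and are assumed here to apply in the same way.

```python
def meets_tolerance(exact, result, errabs=1.0e-3, errrel=1.0e-3):
    """True if |EXACT - RESULT| <= max(ERRABS, ERRREL * |EXACT|)."""
    return abs(exact - result) <= max(errabs, errrel * abs(exact))
```

Setting `errabs=0.0` leaves a purely relative test, and setting `errrel=0.0` leaves a purely absolute one, mirroring the advice above.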
Example
The value of
$$\int_0^1 \ln(x)\,x^{-1/2}\,dx = -4$$
is estimated. The values of the actual and estimated error are machine dependent.
      USE QDAGS_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    NOUT
      REAL       A, ABS, B, ERRABS, ERREST, ERROR, ERRREL, EXACT, F, &
                 RESULT
      INTRINSIC  ABS
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set limits of integration
      A = 0.0
      B = 1.0
Output
Computed =   -4.000      Exact =   -4.000
Error = 2.098E-05
QDAG
Integrates a function using a globally adaptive scheme based on Gauss-Kronrod rules.
Required Arguments
F User-supplied FUNCTION to be integrated. The form is F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
RESULT Estimate of the integral from A to B of F. (Output)
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
ERRREL Relative accuracy desired. (Input)
Default: ERRREL = 1.e-3 for single precision and 1.d-8 for double precision.
IRULE   Points
  1     7-15
  2     10-21
  3     15-31
  4     20-41
  5     25-51
  6     30-61
IRULE = 2 is recommended for most functions. If the function has a peak singularity, use
IRULE = 1. If the function is oscillatory, use IRULE = 6.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QDAG is a general-purpose integrator that uses a globally adaptive scheme in order to
reduce the absolute error. It subdivides the interval [A, B] and uses a (2k + 1)-point Gauss-Kronrod
rule to estimate the integral over each subinterval. The error for each subinterval is estimated by
comparison with the k-point Gauss quadrature rule. The subinterval with the largest estimated
error is then bisected and the same procedure is applied to both halves. The bisection process is
continued until either the error criterion is satisfied, roundoff error is detected, the subintervals
become too small, or the maximum number of subintervals allowed is reached. The routine QDAG
is based on the subroutine QAG by Piessens et al. (1983).
Should QDAG fail to produce acceptable results, then one of the IMSL routines QDAG* may be
appropriate. These routines are documented in this chapter.
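The bisection strategy described above can be sketched in a few dozen lines. This is an illustrative program, not the QDAG source: it uses only the 7-15 Gauss-Kronrod pair (the IRULE = 1 case), takes the plain difference between the Kronrod and Gauss results as the error estimate, and omits the roundoff and small-interval checks. The names ADAPTIVE_GK_SKETCH, GK15, and MAXSUB, the tolerance values, and the peaked test integrand are assumptions chosen for the example.

```fortran
program adaptive_gk_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), maxsub = 200
  real(dp) :: lo(maxsub), hi(maxsub), val(maxsub), err(maxsub)
  real(dp) :: errabs, errrel, total, toterr, mid
  integer :: n, k

  errabs = 0.0_dp
  errrel = 1.0e-8_dp
  n = 1
  lo(1) = 0.0_dp
  hi(1) = 2.0_dp
  call gk15(lo(1), hi(1), val(1), err(1))
  do
     total = sum(val(1:n))
     toterr = sum(err(1:n))
     if (toterr <= max(errabs, errrel*abs(total))) exit   ! tolerance met
     if (n == maxsub) exit                                ! subinterval limit
     k = maxloc(err(1:n), dim=1)                          ! largest error
     mid = 0.5_dp*(lo(k) + hi(k))
     n = n + 1                                            ! bisect interval k
     lo(n) = mid
     hi(n) = hi(k)
     hi(k) = mid
     call gk15(lo(k), hi(k), val(k), err(k))
     call gk15(lo(n), hi(n), val(n), err(n))
  end do
  print '(a, f12.8)', ' Computed       =', total
  print '(a, f12.8)', ' Exact          =', 0.2_dp*atan(10.0_dp)
  print '(a, es10.3)', ' Error estimate =', toterr
  print '(a, i0)', ' Subintervals   = ', n

contains

  ! Hypothetical test integrand with a sharp peak at x = 1
  real(dp) function f(x)
    real(dp), intent(in) :: x
    f = 1.0_dp/(1.0_dp + 100.0_dp*(x - 1.0_dp)**2)
  end function f

  ! 15-point Kronrod estimate of the integral over [a, b]; the error
  ! estimate is the difference from the embedded 7-point Gauss rule
  subroutine gk15(a, b, res, abserr)
    real(dp), intent(in) :: a, b
    real(dp), intent(out) :: res, abserr
    real(dp), parameter :: xgk(8) = [ &
         0.991455371120812639_dp, 0.949107912342758525_dp, &
         0.864864423359769073_dp, 0.741531185599394440_dp, &
         0.586087235467691130_dp, 0.405845151377397167_dp, &
         0.207784955007898468_dp, 0.0_dp]
    real(dp), parameter :: wgk(8) = [ &
         0.022935322010529225_dp, 0.063092092629978553_dp, &
         0.104790010322250184_dp, 0.140653259715525919_dp, &
         0.169004726639267903_dp, 0.190350578064785410_dp, &
         0.204432940075298892_dp, 0.209482141084727828_dp]
    real(dp), parameter :: wg(4) = [ &
         0.129484966168869693_dp, 0.279705391489276668_dp, &
         0.381830050505118945_dp, 0.417959183673469388_dp]
    real(dp) :: c, h, fsum, resk, resg
    integer :: j
    c = 0.5_dp*(a + b)
    h = 0.5_dp*(b - a)
    resk = wgk(8)*f(c)
    resg = wg(4)*f(c)
    do j = 1, 7
       fsum = f(c - h*xgk(j)) + f(c + h*xgk(j))
       resk = resk + wgk(j)*fsum
       if (mod(j, 2) == 0) resg = resg + wg(j/2)*fsum   ! Gauss nodes
    end do
    res = h*resk
    abserr = abs(h*(resk - resg))
  end subroutine gk15
end program adaptive_gk_sketch
```

Because the subinterval with the largest estimated error is always refined first, the subdivision concentrates around the peak at x = 1 rather than being spread evenly over [0, 2].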
866 Chapter 4: Integration and Differentiation
Comments
1.
2.
3.
Informational errors
Type
Code
4
3
1
2
If EXACT is the exact value, QDAG attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. MAX(ERRABS, ERRREL * ABS(EXACT)). To specify only a
relative error, set ERRABS to zero. Similarly, to specify only an absolute error, set
ERRREL to zero.
Example
The value of
$$\int_0^2 x e^x\, dx = e^2 + 1$$
is estimated. Since the integrand is not oscillatory, IRULE = 1 is used. The values of the actual and
estimated error are machine dependent.
      USE QDAG_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    IRULE, NOUT
      REAL       A, ABS, B, ERRABS, ERREST, ERROR, EXACT, EXP, &
                 F, RESULT
      INTRINSIC  ABS, EXP
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set limits of integration
      A = 0.0
      B = 2.0
!                                 Set error tolerances
      ERRABS = 0.0
!                                 Parameter for non-oscillatory
!                                 function
      IRULE = 1
      CALL QDAG (F, A, B, RESULT, ERRABS=ERRABS, IRULE=IRULE, ERREST=ERREST)
!                                 Print results
      EXACT = 1.0 + EXP(2.0)
      ERROR = ABS(RESULT-EXACT)
      WRITE (NOUT,99999) RESULT, EXACT, ERREST, ERROR
99999 FORMAT (' Computed =', F8.3, 13X, ' Exact =', F8.3, /, /, &
             ' Error estimate =', 1PE10.3, 6X, 'Error =', 1PE10.3)
      END
!
      REAL FUNCTION F (X)
      REAL       X
      REAL       EXP
      INTRINSIC  EXP
      F = X*EXP(X)
      RETURN
      END
Output
 Computed =   8.389               Exact =   8.389
 Error = 9.537E-07
QDAGP
Integrates a function with singularity points given.
Required Arguments
F User-supplied FUNCTION to be integrated. The form is F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
POINTS Array of length NPTS containing breakpoints in the range of integration. (Input)
Usually these are points where the integrand has singularities.
RESULT Estimate of the integral from A to B of F. (Output)
Optional Arguments
NPTS Number of break points given. (Input)
Default: NPTS = size (POINTS,1).
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
ERRREL Relative accuracy desired. (Input)
Default: ERRREL = 1.e-3 for single precision and 1.d-8 for double precision.
ERREST Estimate of the absolute value of the error. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QDAGP uses a globally adaptive scheme in order to reduce the absolute error. It
initially subdivides the interval [A, B] into NPTS + 1 user-supplied subintervals and uses a 21-point
Gauss-Kronrod rule to estimate the integral over each subinterval. The error for each subinterval is
estimated by comparison with the 10-point Gauss quadrature rule. This routine is designed to
handle endpoint as well as interior singularities. In addition to the general strategy described in the
IMSL routine QDAG, this routine employs an extrapolation procedure known as the ε-algorithm.
The routine QDAGP is an implementation of the subroutine QAGP, which is fully documented by
Piessens et al. (1983).
Comments
1.
user-provided break point or integration limit, then (AA, BB) has level L if
ABS(BB - AA) = ABS(P2 - P1) * 2**(-L).
2.
3.
Informational errors
Type
Code
4
3
1
2
3
3
3
4
If EXACT is the exact value, QDAGP attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. MAX(ERRABS, ERRREL * ABS(EXACT)). To specify only a
relative error, set ERRABS to zero. Similarly, to specify only an absolute error, set
ERRREL to zero.
Example
The value of
$$\int_0^3 x^3 \ln\left|(x^2 - 1)(x^2 - 2)\right| dx = 61 \ln 2 + \frac{77}{4} \ln 7 - 27$$
is estimated. The values of the actual and estimated error are machine dependent. Note that this
subroutine never evaluates the user-supplied function at the user-supplied breakpoints.
      USE QDAGP_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    NOUT, NPTS
      REAL       A, ABS, ALOG, B, ERRABS, ERREST, ERROR, ERRREL, &
                 EXACT, F, POINTS(2), RESULT, SQRT
      INTRINSIC  ABS, ALOG, SQRT
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set limits of integration
      A = 0.0
      B = 3.0
!                                 Set error tolerances
      ERRABS = 0.0
      ERRREL = 0.01
!                                 Set singularity parameters
      NPTS      = 2
      POINTS(1) = 1.0
      POINTS(2) = SQRT(2.0)
      CALL QDAGP (F, A, B, POINTS, RESULT, ERRABS=ERRABS, ERRREL=ERRREL, &
                  ERREST=ERREST)
!                                 Print results
      EXACT = 61.0*ALOG(2.0) + 77.0/4.0*ALOG(7.0) - 27.0
      ERROR = ABS(RESULT-EXACT)
      WRITE (NOUT,99999) RESULT, EXACT, ERREST, ERROR
99999 FORMAT (' Computed =', F8.3, 13X, ' Exact =', F8.3, /, /, &
             ' Error estimate =', 1PE10.3, 6X, 'Error =', 1PE10.3)
!
      END
!
      REAL FUNCTION F (X)
      REAL       X
      REAL       ABS, ALOG
      INTRINSIC  ABS, ALOG
      F = X**3*ALOG(ABS((X*X-1.0)*(X*X-2.0)))
      RETURN
      END
Output
 Computed =  52.741               Exact =  52.741
 Error = 6.104E-04
QDAGI
Integrates a function over an infinite or semi-infinite interval.
Required Arguments
F User-supplied FUNCTION to be integrated. The form is
F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
BOUND Finite bound of the integration range. (Input)
Ignored if INTERV = 2.
INTERV Flag indicating integration interval. (Input)
INTERV   Interval
  -1     (-∞, BOUND)
   1     (BOUND, +∞)
   2     (-∞, +∞)
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
ERRREL Relative accuracy desired. (Input)
Default: ERRREL = 1.e-3 for single precision and 1.d-8 for double precision.
ERREST Estimate of the absolute value of the error. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QDAGI uses a globally adaptive scheme in an attempt to reduce the absolute error. It
initially transforms an infinite or semi-infinite interval into the finite interval [0, 1]. Then, QDAGI
uses a 21-point Gauss-Kronrod rule to estimate the integral and the error. It bisects any interval
with an unacceptable error estimate and continues this process until termination. This routine is
designed to handle endpoint singularities. In addition to the general strategy described in QDAG,
this subroutine employs an extrapolation procedure known as the ε-algorithm. The routine QDAGI
is an implementation of the subroutine QAGI, which is fully documented by Piessens et al. (1983).
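For reference, the change of variable documented for QAGI by Piessens et al. (1983) can be written out explicitly. That QDAGI uses exactly this mapping is an assumption based on the statement above that it implements QAGI. Here a stands for BOUND and t is the new variable on (0, 1]:

```latex
\int_a^{\infty} f(x)\,dx = \int_0^1 f\!\left(a + \frac{1-t}{t}\right)\frac{dt}{t^2},
\qquad
\int_{-\infty}^{a} f(x)\,dx = \int_0^1 f\!\left(a - \frac{1-t}{t}\right)\frac{dt}{t^2},
\qquad
\int_{-\infty}^{\infty} f(x)\,dx = \int_0^{\infty} \bigl(f(x) + f(-x)\bigr)\,dx .
```

The factor 1/t^2 means the transformed integrand can be singular at t = 0 even when f is smooth, which is consistent with the routine being designed to handle endpoint singularities.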
Comments
1.
2.
3.
Informational errors
Type
Code
4
3
1
2
3
3
3
4
If EXACT is the exact value, QDAGI attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. MAX(ERRABS, ERRREL * ABS(EXACT)). To specify only a
relative error, set ERRABS to zero. Similarly, to specify only an absolute error, set
ERRREL to zero.
Example
The value of
$$\int_0^{\infty} \frac{\ln(x)}{1 + (10x)^2}\, dx = -\frac{\pi \ln(10)}{20}$$
is estimated. The values of the actual and estimated error are machine dependent. Note that we
have requested an absolute error of 0 and a relative error of .001. The effect of these requests, as
documented in Comment 3 above, is to ignore the absolute error requirement.
      USE QDAGI_INT
      USE UMACH_INT
      USE CONST_INT
      IMPLICIT   NONE
      INTEGER    INTERV, NOUT
      REAL       ABS, ALOG, BOUND, ERRABS, ERREST, ERROR, &
                 ERRREL, EXACT, F, PI, RESULT
      INTRINSIC  ABS, ALOG
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set limits of integration
      BOUND  = 0.0
      INTERV = 1
!                                 Set error tolerances
      ERRABS = 0.0
      CALL QDAGI (F, BOUND, INTERV, RESULT, ERRABS=ERRABS, &
                  ERREST=ERREST)
!                                 Print results
      PI    = CONST('PI')
      EXACT = -PI*ALOG(10.)/20.
      ERROR = ABS(RESULT-EXACT)
      WRITE (NOUT,99999) RESULT, EXACT, ERREST, ERROR
99999 FORMAT (' Computed =', F8.3, 13X, ' Exact =', F8.3//' Error ', &
             'estimate =', 1PE10.3, 6X, 'Error =', 1PE10.3)
      END
!
      REAL FUNCTION F (X)
      REAL       X
      REAL       ALOG
      INTRINSIC  ALOG
      F = ALOG(X)/(1.+(10.*X)**2)
      RETURN
      END
Output
 Computed =  -0.362               Exact =  -0.362
 Error = 5.960E-08
QDAWO
Integrates a function containing a sine or a cosine.
Required Arguments
F User-supplied function to be integrated. The form is F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
IWEIGH   Weight
   1     COS(OMEGA * X)
   2     SIN(OMEGA * X)
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
ERRREL Relative accuracy desired. (Input)
Default: ERRREL = 1.e-3 for single precision and 1.d-8 for double precision.
ERREST Estimate of the absolute value of the error. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QDAWO uses a globally adaptive scheme in an attempt to reduce the absolute error.
This routine computes integrals whose integrands have the special form w(x) f(x), where w(x) is
either cos(ωx) or sin(ωx). Depending on the length of the subinterval in relation to the size of ω,
either a modified Clenshaw-Curtis procedure or a Gauss-Kronrod 7/15 rule is employed to
approximate the integral on a subinterval. In addition to the general strategy described for the
IMSL routine QDAG, this subroutine uses an extrapolation procedure known as the ε-algorithm.
The routine QDAWO is an implementation of the subroutine QAWO, which is fully documented by
Piessens et al. (1983).
Comments
1.
2.
3.
Informational errors
Type
Code
4
3
1
2
3
3
3
4
If EXACT is the exact value, QDAWO attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. MAX(ERRABS, ERRREL * ABS(EXACT)). To specify only a
relative error, set ERRABS to zero. Similarly, to specify only an absolute error, set
ERRREL to zero.
Example
The value of
$$\int_0^1 \ln(x) \sin(10 \pi x)\, dx$$
is estimated. The values of the actual and estimated error are machine dependent. Notice that the
log function is coded to protect for the singularity at zero.
      USE QDAWO_INT
      USE UMACH_INT
      USE CONST_INT
      IMPLICIT   NONE
      INTEGER    IWEIGH, NOUT
      REAL       A, ABS, B, ERRABS, ERREST, ERROR, &
                 EXACT, F, OMEGA, PI, RESULT
      INTRINSIC  ABS
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set limits of integration
      A = 0.0
      B = 1.0
!                                 Weight function = sin(10.*pi*x)
      IWEIGH = 2
      PI     = CONST('PI')
      OMEGA  = 10.*PI
!                                 Set error tolerances
      ERRABS = 0.0
      CALL QDAWO (F, A, B, IWEIGH, OMEGA, RESULT, ERRABS=ERRABS, &
                  ERREST=ERREST)
!                                 Print results
      EXACT = -0.1281316
      ERROR = ABS(RESULT-EXACT)
      WRITE (NOUT,99999) RESULT, EXACT, ERREST, ERROR
99999 FORMAT (' Computed =', F8.3, 13X, ' Exact =', F8.3, /, /, &
Output
 Computed =  -0.128               Exact =  -0.128
 Error = 5.260E-06
QDAWF
Computes a Fourier integral.
Required Arguments
F User-supplied FUNCTION to be integrated. The form is F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
A Lower limit of integration. (Input)
IWEIGH Type of weight function used. (Input)
IWEIGH   Weight
   1     COS(OMEGA * X)
   2     SIN(OMEGA * X)
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QDAWF uses a globally adaptive scheme in an attempt to reduce the absolute error.
This routine computes integrals whose integrands have the special form w(x) f(x), where w(x) is
either cos(ωx) or sin(ωx). The integration interval is always semi-infinite of the form [A, ∞). These
Fourier integrals are approximated by repeated calls to the IMSL routine QDAWO followed by
extrapolation. The routine QDAWF is an implementation of the subroutine QAWF, which is fully
documented by Piessens et al. (1983).
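The "repeated calls followed by extrapolation" can be pictured as a sum over cycles of length C, where C is the quantity defined in the Comments for RSLIST. The block below is an illustrative restatement in mathematical notation, with ω standing for OMEGA; it is not taken from the QAWF documentation.

```latex
\int_A^{\infty} f(x)\,w(x)\,dx
  = \sum_{k=1}^{\infty} \int_{A+(k-1)C}^{A+kC} f(x)\,w(x)\,dx,
\qquad
C = \bigl(2\lfloor|\omega|\rfloor + 1\bigr)\,\frac{\pi}{|\omega|}
```

For example, with ω = π/2 the integer part of |ω| is 1, so C = 3π/(π/2) = 6. Each term of the sum is a finite-interval oscillatory integral of the kind QDAWO handles, and the partial sums form the sequence to which the extrapolation is applied.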
Comments
1.
RSLIST Array of length MAXCYL containing the contributions to the integral over
the interval (A + (k - 1) * C, A + k * C), for k = 1, ..., NCYCLE. (Output)
C = (2 * INT(ABS(OMEGA)) + 1) * PI/ABS(OMEGA).
ERLIST Array of length MAXCYL containing the error estimates for the intervals
defined in RSLIST. (Output)
IERLST Array of length MAXCYL containing error flags for the intervals defined in
RSLIST. (Output)
IERLST(K)
Meaning
2.
Informational errors
Type
Code
3
4
3
3.
1
2
3
If EXACT is the exact value, QDAWF attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. ERRABS.
Example
The value of
$$\int_0^{\infty} x^{-1/2} \cos(\pi x / 2)\, dx = 1$$
is estimated. The values of the actual and estimated error are machine dependent. Notice that F is
coded to protect for the singularity at zero.
      USE QDAWF_INT
      USE UMACH_INT
      USE CONST_INT
      IMPLICIT   NONE
      INTEGER    IWEIGH, NOUT
      REAL       A, ABS, ERRABS, ERREST, ERROR, EXACT, F, &
                 OMEGA, PI, RESULT
      INTRINSIC  ABS
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set lower limit of integration
      A = 0.0
!                                 Select weight W(X) = COS(PI*X/2)
      IWEIGH = 1
      PI     = CONST('PI')
      OMEGA  = PI/2.0
!                                 Set error tolerance
      CALL QDAWF (F, A, IWEIGH, OMEGA, RESULT, ERREST=ERREST)
!                                 Print results
      EXACT = 1.0
      ERROR = ABS(RESULT-EXACT)
      WRITE (NOUT,99999) RESULT, EXACT, ERREST, ERROR
99999 FORMAT (' Computed =', F8.3, 13X, ' Exact =', F8.3, /, /, &
             ' Error estimate =', 1PE10.3, 6X, 'Error =', 1PE10.3)
      END
!
      REAL FUNCTION F (X)
      REAL       X
      REAL       SQRT
      INTRINSIC  SQRT
      IF (X .GT. 0.0) THEN
         F = 1.0/SQRT(X)
      ELSE
         F = 0.0
      END IF
      RETURN
      END
Output
 Computed =   1.000               Exact =   1.000
 Error = 2.205E-06
QDAWS
Integrates a function with algebraic-logarithmic singularities.
Required Arguments
F User-supplied FUNCTION to be integrated. The form is F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
B must be greater than A
IWEIGH Type of weight function used. (Input)
IWEIGH   Weight
   1     (X - A)**ALPHA * (B - X)**BETAW
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
ERRREL Relative accuracy desired. (Input)
Default: ERRREL = 1.e-3 for single precision and 1.d-8 for double precision.
ERREST Estimate of the absolute value of the error. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QDAWS uses a globally adaptive scheme in an attempt to reduce the absolute error.
This routine computes integrals whose integrands have the special form w(x) f(x), where w(x) is a
weight function described above. A combination of modified Clenshaw-Curtis and Gauss-Kronrod
formulas is employed. In addition to the general strategy described for the IMSL routine QDAG,
this routine uses an extrapolation procedure known as the ε-algorithm. The routine QDAWS is an
implementation of the routine QAWS, which is fully documented by Piessens et al. (1983).
Comments
1.
ELIST Array of length MAXSUB containing the error estimates of the NSUBIN values
in RLIST. (Output)
IORD Array of length MAXSUB. Let k be NSUBIN if NSUBIN .LE.
(MAXSUB/2 + 2), MAXSUB + 1 - NSUBIN otherwise. The first k locations
contain pointers to the error estimates over the subintervals, such that
ELIST(IORD(1)), ..., ELIST(IORD(k)) form a decreasing sequence. (Output)
2.
3.
Informational errors
Type
Code
4
3
1
2
If EXACT is the exact value, QDAWS attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. MAX(ERRABS, ERRREL * ABS(EXACT)). To specify only a
relative error, set ERRABS to zero. Similarly, to specify only an absolute error, set
ERRREL to zero.
Example
The value of
$$\int_0^1 \left[(1 + x)(1 - x)\right]^{1/2} x \ln(x)\, dx = \frac{3 \ln(2) - 4}{9}$$
is estimated. The values of the actual and estimated error are machine dependent.
      USE QDAWS_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    IWEIGH, NOUT
      REAL       A, ABS, ALOG, ALPHA, B, BETAW, ERRABS, ERREST, ERROR, &
                 EXACT, F, RESULT
      INTRINSIC  ABS, ALOG
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set limits of integration
      A = 0.0
      B = 1.0
!                                 Select weight
      ALPHA  = 1.0
      BETAW  = 0.5
      IWEIGH = 2
!                                 Set error tolerances
      ERRABS = 0.0
      CALL QDAWS (F, A, B, IWEIGH, ALPHA, BETAW, RESULT, &
                  ERRABS=ERRABS, ERREST=ERREST)
!                                 Print results
      EXACT = (3.*ALOG(2.)-4.)/9.
      ERROR = ABS(RESULT-EXACT)
      WRITE (NOUT,99999) RESULT, EXACT, ERREST, ERROR
99999 FORMAT (' Computed =', F8.3, 13X, ' Exact =', F8.3, /, /, &
             ' Error estimate =', 1PE10.3, 6X, 'Error =', 1PE10.3)
      END
!
      REAL FUNCTION F (X)
      REAL       X
      REAL       SQRT
      INTRINSIC  SQRT
      F = SQRT(1.0+X)
      RETURN
      END
Output
 Computed =  -0.213               Exact =  -0.213
 Error = 2.980E-08
QDAWC
Integrates a function f(x)/(x-c) in the Cauchy principal value sense.
Required Arguments
F User-supplied FUNCTION to be integrated. The form is F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
C Singular point. (Input)
C must not equal A or B.
RESULT Estimate of the integral from A to B of F(X)/(X - C). (Output)
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
ERRREL Relative accuracy desired. (Input)
Default: ERRREL = 1.e-3 for single precision and 1.d-8 for double precision.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QDAWC uses a globally adaptive scheme in an attempt to reduce the absolute error.
This routine computes integrals whose integrands have the special form w(x) f(x), where
w(x) = 1/(x - c). If c lies in the interval of integration, then the integral is interpreted as a Cauchy
principal value. A combination of modified Clenshaw-Curtis and Gauss-Kronrod formulas is
employed. In addition to the general strategy described for the IMSL routine QDAG, this routine
uses an extrapolation procedure known as the ε-algorithm. The routine QDAWC is an
implementation of the subroutine QAWC, which is fully documented by Piessens et al. (1983).
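For readers who want the definition at hand, the Cauchy principal value referred to above is the standard symmetric limit around the singular point. The notation (P.V., ε) is supplied here for illustration and does not appear in the routine's interface:

```latex
\mathrm{P.V.}\!\int_a^b \frac{f(x)}{x-c}\,dx
  = \lim_{\varepsilon \to 0^+}
    \left[ \int_a^{c-\varepsilon} \frac{f(x)}{x-c}\,dx
         + \int_{c+\varepsilon}^{b} \frac{f(x)}{x-c}\,dx \right],
\qquad a < c < b .
```

The two pieces diverge individually when f(c) is nonzero, but their logarithmic singularities cancel in the symmetric limit, which is why the user supplies only the smooth factor F together with the location C.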
Comments
1.
2.
3.
Informational errors
Type
Code
4
3
1
2
If EXACT is the exact value, QDAWC attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. MAX(ERRABS, ERRREL * ABS(EXACT)). To specify only a
relative error, set ERRABS to zero. Similarly, to specify only an absolute error, set
ERRREL to zero.
Example
The Cauchy principal value of
$$\int_{-1}^{5} \frac{1}{x \left(5x^3 + 6\right)}\, dx = \frac{\ln(125/631)}{18}$$
is estimated. The values of the actual and estimated error are machine dependent.
      USE QDAWC_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    NOUT
      REAL       A, ABS, ALOG, B, C, ERRABS, ERREST, ERROR, EXACT, &
                 F, RESULT
      INTRINSIC  ABS, ALOG
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set limits of integration and C
      A = -1.0
      B = 5.0
      C = 0.0
!                                 Set error tolerances
      ERRABS = 0.0
Output
 Computed =  -0.090               Exact =  -0.090
 Error = 2.980E-08
QDNG
Integrates a smooth function using a nonadaptive rule.
Required Arguments
F User-supplied FUNCTION to be integrated. The form is F(X), where
X Independent variable. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
RESULT Estimate of the integral from A to B of F. (Output)
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
ERRREL Relative accuracy desired. (Input)
Default: ERRREL = 1.e-3 for single precision and 1.d-8 for double precision.
ERREST Estimate of the absolute value of the error. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QDNG is designed to integrate smooth functions. This routine implements a
nonadaptive quadrature procedure based on nested Patterson rules of order 10, 21, 43, and 87.
These rules are positive quadrature rules with degree of accuracy 19, 31, 64, and 130, respectively.
The routine QDNG applies these rules successively, estimating the error, until either the error
estimate satisfies the user-supplied constraints or the last rule is applied. The routine QDNG is based
on the routine QNG by Piessens et al. (1983).
This routine is not very robust, but for certain smooth functions it can be efficient. If QDNG does
not perform well, we recommend the use of the IMSL routine QDAGS.
Comments
1.
Informational error
Type
Code
4
2.
If EXACT is the exact value, QDNG attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. MAX(ERRABS, ERRREL * ABS(EXACT)). To specify only a
relative error, set ERRABS to zero. Similarly, to specify only an absolute error, set
ERRREL to zero.
3.
This routine is designed for efficiency, not robustness. If the above error is
encountered, try QDAGS.
Example
The value of
$$\int_0^2 x e^x\, dx = e^2 + 1$$
is estimated. The values of the actual and estimated error are machine dependent.
      USE QDNG_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    NOUT
      REAL       A, ABS, B, ERRABS, ERREST, ERROR, EXACT, EXP, &
                 F, RESULT
      INTRINSIC  ABS, EXP
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set limits of integration
      A = 0.0
      B = 2.0
!                                 Set error tolerances
      ERRABS = 0.0
      CALL QDNG (F, A, B, RESULT, ERRABS=ERRABS, ERREST=ERREST)
!                                 Print results
      EXACT = 1.0 + EXP(2.0)
      ERROR = ABS(RESULT-EXACT)
      WRITE (NOUT,99999) RESULT, EXACT, ERREST, ERROR
99999 FORMAT (' Computed =', F8.3, 13X, ' Exact =', F8.3, /, /, &
             ' Error estimate =', 1PE10.3, 6X, 'Error =', 1PE10.3)
      END
!
      REAL FUNCTION F (X)
      REAL       X
      REAL       EXP
      INTRINSIC  EXP
      F = X*EXP(X)
      RETURN
      END
Output
 Computed =   8.389               Exact =   8.389
 Error = 9.537E-07
TWODQ
Computes a two-dimensional iterated integral.
Required Arguments
F User-supplied FUNCTION to be integrated. The form is F(X,Y), where
X First argument of F. (Input)
Y Second argument of F. (Input)
F The function value. (Output)
F must be declared EXTERNAL in the calling program.
A Lower limit of outer integral. (Input)
B Upper limit of outer integral. (Input)
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
ERRREL Relative accuracy desired. (Input)
Default: ERRREL = 1.e-3 for single precision and 1.d-8 for double precision.
IRULE --- Choice of quadrature rule. (Input)
Default: IRULE = 2.
The Gauss-Kronrod rule is used with the following points:
IRULE   Points
  1     7-15
  2     10-21
  3     15-31
  4     20-41
  5     25-51
  6     30-61
If the function has a peak singularity, use IRULE = 1. If the function is oscillatory, use
IRULE = 6.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine TWODQ approximates the two-dimensional iterated integral
$$\int_a^b \int_{g(x)}^{h(x)} f(x, y)\, dy\, dx$$
with the approximation returned in RESULT. An estimate of the error is returned in ERREST. The
approximation is achieved by iterated calls to QDAG. Thus, this algorithm will share many of the
characteristics of the routine QDAG. As in QDAG, several options are available. The absolute and
relative error must be specified, and in addition, the Gauss-Kronrod pair must be specified
(IRULE). The lower-numbered rules are used for less smooth integrands while the higher-order
rules are more efficient for smooth (oscillatory) integrands.
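The idea of an iterated integral evaluated by nested one-dimensional quadrature can be sketched without the library. The program below is only an illustration under stated assumptions: it replaces the adaptive QDAG calls with a fixed composite 3-point Gauss-Legendre rule on NPANEL panels in each direction, and it uses the integrand and limits of Example 1 (g(x) = 1, h(x) = 3, outer limits 0 and 1). The names ITERATED_INTEGRAL_SKETCH, INNER, and NPANEL are hypothetical.

```fortran
program iterated_integral_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), npanel = 40
  real(dp) :: t(3), w(3)
  real(dp) :: a, b, total, hx, xc
  integer :: i, k

  ! 3-point Gauss-Legendre nodes and weights on [-1, 1]
  t = [-sqrt(0.6_dp), 0.0_dp, sqrt(0.6_dp)]
  w = [5.0_dp, 8.0_dp, 5.0_dp]/9.0_dp

  a = 0.0_dp
  b = 1.0_dp
  hx = (b - a)/npanel

  ! Outer integral over x; each sample needs a full inner integral
  total = 0.0_dp
  do i = 1, npanel
     xc = a + (i - 0.5_dp)*hx
     do k = 1, 3
        total = total + 0.5_dp*hx*w(k)*inner(xc + 0.5_dp*hx*t(k))
     end do
  end do
  print '(a, f8.3)', ' Result =', total

contains

  real(dp) function f(x, y)
    real(dp), intent(in) :: x, y
    f = y*cos(x + y*y)
  end function f

  ! Lower limit of the inner integral
  real(dp) function g(x)
    real(dp), intent(in) :: x
    g = 1.0_dp
  end function g

  ! Upper limit of the inner integral
  real(dp) function h(x)
    real(dp), intent(in) :: x
    h = 3.0_dp
  end function h

  ! Inner integral over y from g(x) to h(x) for a fixed x
  real(dp) function inner(x) result(s)
    real(dp), intent(in) :: x
    real(dp) :: hy, yc
    integer :: j, m
    hy = (h(x) - g(x))/npanel
    s = 0.0_dp
    do j = 1, npanel
       yc = g(x) + (j - 0.5_dp)*hy
       do m = 1, 3
          s = s + 0.5_dp*hy*w(m)*f(x, yc + 0.5_dp*hy*t(m))
       end do
    end do
  end function inner
end program iterated_integral_sketch
```

The cost is the product of the outer and inner sample counts, which is why an adaptive inner and outer rule, as in TWODQ, matters for harder integrands. Because the limits enter only through G and H, changing them to functions of x requires no change to the quadrature loops.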
Comments
1.
2.
3.
Informational errors
Type
Code
4
3
1
2
If EXACT is the exact value, TWODQ attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. MAX(ERRABS, ERRREL * ABS(EXACT)). To specify only a
relative error, set ERRABS to zero. Similarly, to specify only an absolute error, set
ERRREL to zero.
Example 1
In this example, we approximate the integral
$$\int_0^1 \int_1^3 y \cos\left(x + y^2\right) dy\, dx$$
!
      REAL FUNCTION F (X, Y)
      REAL       X, Y
      REAL       COS
      INTRINSIC  COS
      F = Y*COS(X+Y*Y)
      RETURN
      END
!
      REAL FUNCTION G (X)
      REAL       X
      G = 1.0
      RETURN
      END
!
      REAL FUNCTION H (X)
      REAL       X
      H = 3.0
      RETURN
      END
Output
 Result =  -0.514
Additional Examples
Example 2
We modify the above example by assuming that the limits for the inner integral depend on x and,
in particular, are g(x) = 2x and h(x) = 5x. The integral now becomes
$$\int_0^1 \int_{2x}^{5x} y \cos\left(x + y^2\right) dy\, dx$$
!                                 Declare F, G, H
      INTEGER    IRULE, NOUT
      REAL       A, B, ERRABS, ERREST, ERRREL, F, G, H, RESULT
      EXTERNAL   F, G, H
!
      A = 0.0
      B = 1.0
Output
 Computed =  -0.083
QAND
Integrates a function on a hyper-rectangle.
Required Arguments
F User-supplied FUNCTION to be integrated. The form is F(N, X), where
N The dimension of the hyper-rectangle. (Input)
X The independent variable of dimension N. (Input)
F The value of the integrand at X. (Output)
F must be declared EXTERNAL in the calling program.
N The dimension of the hyper-rectangle. (Input)
N must be less than or equal to 20.
A Vector of length N. (Input)
Lower limits of integration.
B Vector of length N. (Input)
Upper limits of integration.
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: ERRABS = 1.e-3 for single precision and 1.d-8 for double precision.
ERRREL Relative accuracy desired. (Input)
Default: ERRREL = 1.e-3 for single precision and 1.d-8 for double precision.
MAXFCN Approximate maximum number of function evaluations to be permitted.
(Input)
MAXFCN cannot be greater than 256**N or IMACH(5) if N is greater than 3.
Default: MAXFCN = 32**N.
ERREST Estimate of the absolute value of the error. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QAND approximates the n-dimensional iterated integral
$$\int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\, dx_n \cdots dx_1$$
with the approximation returned in RESULT. An estimate of the error is returned in ERREST. The
approximation is achieved by iterated applications of product Gauss formulas. The integral is first
estimated by a two-point tensor product formula in each direction. Then for i = 1, ..., n the routine
calculates a new estimate by doubling the number of points in the i-th direction, but halving the
number immediately afterwards if the new estimate does not change appreciably. This process is
repeated until either one complete sweep results in no increase in the number of sample points in
any dimension, or the number of Gauss points in one direction exceeds 256, or the number of
function evaluations needed to complete a sweep would exceed MAXFCN.
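A product Gauss formula of the kind described above can be written as follows. The symbols m_i (number of Gauss points in direction i), w and x with superscripts (the one-dimensional weights and nodes scaled to [a_i, b_i]) are notation introduced here for illustration:

```latex
\int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\, dx_n \cdots dx_1
  \approx \sum_{i_1=1}^{m_1} \cdots \sum_{i_n=1}^{m_n}
    w^{(1)}_{i_1} \cdots w^{(n)}_{i_n}\,
    f\bigl(x^{(1)}_{i_1}, \ldots, x^{(n)}_{i_n}\bigr)
```

One evaluation of the formula costs m_1 m_2 ... m_n function values, so the initial two-point formula uses 2^n evaluations and each doubling in one direction doubles the total. This is the quantity that is compared against MAXFCN.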
Comments
1.
Informational errors
Type
Code
3
4
2.
1
2
If EXACT is the exact value, QAND attempts to find RESULT such that
ABS(EXACT - RESULT) .LE. MAX(ERRABS, ERRREL * ABS(EXACT)). To specify only a
relative error, set ERRABS to zero. Similarly, to specify only an absolute error, set
ERRREL to zero.
Example
In this example, we approximate the integral of
$$e^{-\left(x_1^2 + x_2^2 + x_3^2\right)}$$
on an expanding cube. The values of the error estimates are machine dependent. The exact integral
over $R^3$ is $\pi^{3/2}$.
      USE QAND_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    I, J, MAXFCN, N, NOUT
      REAL       A(3), B(3), CNST, ERRABS, ERREST, ERRREL, F, RESULT
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!
      N      = 3
      MAXFCN = 100000
!
      DO 20 I=1, 6
         CNST = I/2.0
         DO 10 J=1, 3
            A(J) = -CNST
            B(J) = CNST
   10    CONTINUE
         CALL QAND (F, N, A, B, RESULT, ERRABS, ERRREL, MAXFCN, ERREST)
         WRITE (NOUT,99999) CNST, RESULT, ERREST
   20 CONTINUE
99999 FORMAT (1X, 'For CNST = ', F4.1, ', result = ', F7.3, ' with ', &
Output
 For CNST =  0.5, result =   0.785 with error estimate  3.934E-06
 For CNST =  1.0, result =   3.332 with error estimate  2.100E-03
 For CNST =  1.5, result =   5.021 with error estimate  1.192E-05
 For CNST =  2.0, result =   5.491 with error estimate  2.413E-04
 For CNST =  2.5, result =   5.561 with error estimate  4.232E-03
 For CNST =  3.0, result =   5.568 with error estimate  2.580E-04
QMC
Integrates a function over a hyper rectangle using a quasi-Monte Carlo method.
Required Arguments
FCN User-supplied FUNCTION to be integrated. The form is FCN(X), where
X - The independent variable. (Input)
FCN The value of the integrand at X. (Output)
FCN must be declared EXTERNAL in the calling program.
$$\int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\, dx_n \cdots dx_1$$
Optional Arguments
ERRABS Absolute accuracy desired. (Input)
Default: 1.0e-2.
integer.
FORTRAN 90 Interface
Generic:
Specific:
Description
Integration of functions over a hyper-rectangle by direct methods, such as QAND, is practical only
for fairly low-dimensional hypercubes. This is because the amount of work required increases
exponentially as the dimension increases.
An alternative to direct methods is QMC, in which the integral is evaluated as the value of the
function averaged over a sequence of randomly chosen points. Under mild assumptions on the
function, this method will converge like $1/\sqrt{k}$, where k is the number of points at which the
function is evaluated.
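The averaging idea can be sketched in a few lines. The program below is not the QMC implementation: it assumes a Halton sequence (radical inverses in the prime bases 2, 3, 5) as a stand-in for whatever point sequence the library uses, fixes the number of points in advance, and integrates a three-dimensional version of the alternating sum-of-products integrand over the unit cube. The names QMC_SKETCH, RADICAL_INVERSE, and NPTS are hypothetical.

```fortran
program qmc_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), ndim = 3, npts = 20000
  integer, parameter :: primes(ndim) = [2, 3, 5]
  real(dp) :: x(ndim), total
  integer :: k, j

  ! Average the integrand over the first npts Halton points in [0,1]^ndim
  total = 0.0_dp
  do k = 1, npts
     do j = 1, ndim
        x(j) = radical_inverse(k, primes(j))
     end do
     total = total + f(x)
  end do
  print '(a, f10.6)', ' Estimate =', total/npts
  print '(a, f10.6)', ' Exact    =', -0.375_dp   ! -1/2 + 1/4 - 1/8

contains

  ! Reflect the base-b digits of k about the radix point
  function radical_inverse(k, base) result(r)
    integer, intent(in) :: k, base
    real(dp) :: r, frac
    integer :: m
    r = 0.0_dp
    frac = 1.0_dp/base
    m = k
    do while (m > 0)
       r = r + mod(m, base)*frac
       m = m/base
       frac = frac/base
    end do
  end function radical_inverse

  ! Integrand: sum over i of (-1)**i * x(1)*...*x(i)
  function f(x) result(s)
    real(dp), intent(in) :: x(:)
    real(dp) :: s, sgn
    integer :: i
    s = 0.0_dp
    sgn = -1.0_dp
    do i = 1, size(x)
       s = s + sgn*product(x(1:i))
       sgn = -sgn
    end do
  end function f
end program qmc_sketch
```

The work grows linearly with the number of points and with the dimension, in contrast to the exponential growth of a tensor-product rule.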
Example
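This example evaluates the n-dimensional integral
$$\int_0^1 \cdots \int_0^1 \sum_{i=1}^{n} (-1)^i \prod_{j=1}^{i} x_j \; dx_1 \cdots dx_n = -\frac{1}{3}\left[1 - \left(-\frac{1}{2}\right)^n\right]$$
with n=10.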
      use qmc_int
      implicit none
      integer, parameter :: ndim=10
      real(kind(1d0))    :: a(ndim)
      real(kind(1d0))    :: b(ndim)
      real(kind(1d0))    :: result
      integer            :: I
      external fcn
      a = 0.d0
      b = 1.d0
      call qmc(fcn, a, b, result)
      write (*,*) 'result = ', result
      end
real(kind(1d0)) function fcn(x)
implicit none
real(kind(1d0)), dimension(:) :: x
integer :: i, j
real(kind(1d0)) :: prod, sum, sign
sign = -1.d0
sum = 0.d0
do i=1, size(x)
prod = 1.d0
prod = product(x(1:i))
sum = sum + (sign * prod)
sign = -sign
end do
fcn = sum
end function fcn
Output
result = -0.3334789
GQRUL
Computes a Gauss, Gauss-Radau, or Gauss-Lobatto quadrature rule with various classical weight
functions.
Required Arguments
N Number of quadrature points. (Input)
QX Array of length N containing quadrature points. (Output)
QW Array of length N containing quadrature weights. (Output)
Optional Arguments
IWEIGH Index of the weight function. (Input)
Default: IWEIGH = 1.
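IWEIGH   WT(X)                               Interval     Name
  1      1                                   (-1, +1)     Legendre
  2      1/SQRT(1 - X**2)                    (-1, +1)     Chebyshev 1st kind
  3      SQRT(1 - X**2)                      (-1, +1)     Chebyshev 2nd kind
  4      EXP(-X**2)                          (-∞, +∞)     Hermite
  5      (1 - X)**ALPHA * (1 + X)**BETAW     (-1, +1)     Jacobi
  6      EXP(-X) * X**ALPHA                  (0, +∞)      Generalized Laguerre
  7      1/COSH(X)                           (-∞, +∞)     Hyperbolic cosine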
ALPHA Parameter used in the weight function with some values of IWEIGH, otherwise it
is ignored. (Input)
Default: ALPHA = 2.0.
BETAW Parameter used in the weight function with some values of IWEIGH, otherwise it
is ignored. (Input)
Default: BETAW = 2.0.
NFIX Number of fixed quadrature points. (Input)
NFIX = 0, 1 or 2. For the usual Gauss quadrature rules, NFIX = 0.
Default: NFIX = 0.
QXFIX Array of length NFIX (ignored if NFIX = 0) containing the preset quadrature
point(s). (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL GQRUL (N, IWEIGH, ALPHA, BETAW, NFIX, QXFIX, QX, QW)
Double:
Description
The routine GQRUL produces the points and weights for the Gauss, Gauss-Radau, or Gauss-Lobatto
quadrature formulas for some of the most popular weights. In fact, it is slightly more general than
this suggests because the extra one or two points that may be specified do not have to lie at the
endpoints of the interval. This routine is a modification of the subroutine GAUSSQUADRULE (Golub
and Welsch 1969).
In the simple case when NFIX = 0, the routine returns points in x = QX and weights in w = QW so
that
$$\int_a^b f(x)\,w(x)\,dx = \sum_{i=1}^{N} f(x_i)\,w_i$$
for all functions f that are polynomials of degree less than 2N.
If NFIX = 1, then one of the above $x_i$ equals the first component of QXFIX. Similarly, if NFIX = 2,
then two of the components of x will equal the first two components of QXFIX. In general, the
accuracy of the above quadrature formula degrades when NFIX increases. The quadrature rule will
integrate all functions f that are polynomials of degree less than 2N - NFIX.
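The exactness property can be seen on the smallest nontrivial case. The sketch below does not call GQRUL; it hard-codes the standard two-point Gauss-Legendre rule (an assumption for illustration, N = 2, NFIX = 0) and compares the rule with the exact integral of x**k over (-1, 1). The program name and the helper function exact are hypothetical.

```fortran
program gauss2_demo
  implicit none
  integer, parameter :: dp = kind(1d0)
  real(dp) :: qx(2), qw(2)
  integer :: k

  ! Two-point Gauss-Legendre rule on (-1, 1): a standard textbook rule,
  ! hard-coded here rather than obtained from GQRUL.
  qx = [-1.0_dp/sqrt(3.0_dp), 1.0_dp/sqrt(3.0_dp)]
  qw = [1.0_dp, 1.0_dp]

  ! Compare the rule with the exact integral of x**k over (-1, 1).
  do k = 0, 4
    write (*,'(a,i1,a,f10.6,a,f10.6)') 'k = ', k, '  rule = ', &
         sum(qw*qx**k), '  exact = ', exact(k)
  end do
contains
  ! Exact value of the integral of x**k over (-1, 1).
  pure function exact(k) result(v)
    integer, intent(in) :: k
    real(dp) :: v
    if (mod(k, 2) == 1) then
      v = 0.0_dp
    else
      v = 2.0_dp/real(k + 1, dp)
    end if
  end function exact
end program gauss2_demo
```

With N = 2 the rule and the exact value agree for k = 0 through 3 (degree less than 2N = 4) and differ for k = 4, where the rule gives 2/9 and the exact value is 2/5.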
Comments
1.
2.
If IWEIGH specifies the weight WT(X) and the interval (a, b), then approximately
$$\int_a^b F(X)\,\mathrm{WT}(X)\,dX = \sum_{I=1}^{N} F(\mathrm{QX}(I))\,\mathrm{QW}(I)$$
3.
Gaussian quadrature is always the method of choice when the function F(X) behaves
like a polynomial. Gaussian quadrature is also useful on infinite intervals (with
appropriate weight functions), because other techniques often fail.
4.
The weight function 1/cosh(X) behaves like a polynomial near zero and like $e^{-|X|}$ far
from zero.
Example 1
In this example, we obtain the classical Gauss-Legendre quadrature formula, which is accurate for
polynomials of degree less than 2N, and apply this when N = 6 to the function $x^8$ on the interval
[-1, 1]. This quadrature rule is accurate for polynomials of degree less than 12.
USE GQRUL_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER N
PARAMETER (N=6)
INTEGER I, NOUT
REAL ANSWER, QW(N), QX(N), SUM
! Get output unit number
CALL UMACH (2, NOUT)
Output
QX(1) = -0.9325     QW(1) = 0.17132
QX(2) = -0.6612     QW(2) = 0.36076
QX(3) = -0.2386     QW(3) = 0.46791
QX(4) =  0.2386     QW(4) = 0.46791
QX(5) =  0.6612     QW(5) = 0.36076
QX(6) =  0.9325     QW(6) = 0.17132
The quadrature result making use of these points and weights is 2.2222E-01.
Additional Examples
Example 2
We modify Example 1 by requiring that both endpoints be included in the quadrature formulas and
again apply the new formulas to the function $x^8$ on the interval [-1, 1]. This quadrature rule is
accurate for polynomials of degree less than 10.
USE GQRUL_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER N
PARAMETER (N=6)
INTEGER I, IWEIGH, NFIX, NOUT
REAL ALPHA, ANSWER, BETAW, QW(N), QX(N), QXFIX(2), SUM
!
IWEIGH = 1
ALPHA = 0.0
BETAW = 0.0
NFIX = 2
QXFIX(1) = -1.0
QXFIX(2) = 1.0
Output
QX(1) = -1.0000     QW(1) = 0.06667
QX(2) = -0.7651     QW(2) = 0.37847
QX(3) = -0.2852     QW(3) = 0.55486
QX(4) =  0.2852     QW(4) = 0.55486
QX(5) =  0.7651     QW(5) = 0.37847
QX(6) =  1.0000     QW(6) = 0.06667
The quadrature result making use of these points and weights is 2.2222E-01.
GQRCF
Computes a Gauss, Gauss-Radau or Gauss-Lobatto quadrature rule given the recurrence
coefficients for the monic polynomials orthogonal with respect to the weight function.
Required Arguments
N Number of quadrature points. (Input)
B Array of length N containing the recurrence coefficients. (Input)
See Comments for definitions.
C Array of length N containing the recurrence coefficients. (Input)
See Comments for definitions.
Chapter 4: Integration and Differentiation
Optional Arguments
NFIX Number of fixed quadrature points. (Input)
NFIX = 0, 1 or 2. For the usual Gauss quadrature rules NFIX = 0.
Default: NFIX = 0.
QXFIX Array of length NFIX (ignored if NFIX = 0) containing the preset quadrature
point(s). (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine GQRCF produces the points and weights for the Gauss, Gauss-Radau, or Gauss-Lobatto
quadrature formulas given the three-term recurrence relation for the orthogonal polynomials. In
particular, it is assumed that the orthogonal polynomials are monic, and hence, the three-term
recursion may be written as
$$p_i(x) = (x - b_i)\,p_{i-1}(x) - c_i\,p_{i-2}(x) \quad \text{for } i = 1, \ldots, N$$
where $p_0 = 1$ and $p_{-1} = 0$. It is obvious from this representation that the degree of $p_i$ is i and that $p_i$
is monic. In order for the recurrence to give rise to a sequence of orthogonal polynomials (with
respect to a nonnegative measure), it is necessary and sufficient that $c_i > 0$. This routine is a
modification of the subroutine GAUSSQUADRULE (Golub and Welsch 1969). In the simple case
when NFIX = 0, the routine returns points in x = QX and weights in w = QW so that
$$\int_a^b f(x)\,w(x)\,dx = \sum_{i=1}^{N} f(x_i)\,w_i$$
for all functions f that are polynomials of degree less than 2N. Here, w is any weight function for
which the above recurrence produces the orthogonal polynomials $p_i$ on the interval [a, b] and w is
normalized by
$$\int_a^b w(x)\,dx = c_1$$
If NFIX = 1, then one of the above $x_i$ equals the first component of QXFIX. Similarly, if NFIX = 2,
then two of the components of x will equal the first two components of QXFIX. In general, the
accuracy of the above quadrature formula degrades when NFIX increases. The quadrature rule will
integrate all functions f that are polynomials of degree less than 2N - NFIX.
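For NFIX = 0, the link between recurrence coefficients and quadrature rule can be written compactly. The block below is a sketch of the classical Golub-Welsch construction on which GAUSSQUADRULE is based; whether GQRCF follows exactly these steps internally is an assumption. The symbols $J_N$ and $v_i$ are introduced here for illustration only.

```latex
J_N =
\begin{pmatrix}
b_1 & \sqrt{c_2} & & \\
\sqrt{c_2} & b_2 & \ddots & \\
 & \ddots & \ddots & \sqrt{c_N} \\
 & & \sqrt{c_N} & b_N
\end{pmatrix},
\qquad
J_N v_i = x_i v_i,\quad \|v_i\|_2 = 1,
\qquad
w_i = c_1 \left(v_{i,1}\right)^2

% Worked case: Legendre weight, N = 2, with b_1 = b_2 = 0, c_1 = 2, c_2 = 1/3
J_2 = \begin{pmatrix} 0 & 1/\sqrt{3} \\ 1/\sqrt{3} & 0 \end{pmatrix},
\qquad
x_{1,2} = \mp \frac{1}{\sqrt{3}},
\qquad
v_{1,2} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ \mp 1 \end{pmatrix},
\qquad
w_{1,2} = 2 \cdot \frac{1}{2} = 1
```

The worked case reproduces the familiar two-point Gauss-Legendre rule, with the quadrature points as eigenvalues of the tridiagonal matrix and the weights taken from the first components of the normalized eigenvectors.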
Comments
1.
2.
Informational error
Type
Code
4
3.
The recurrence coefficients B(I) and C(I) define the monic polynomials via the relation
P(I) = (X - B(I + 1)) * P(I - 1) - C(I + 1) * P(I - 2). C(1) contains the zero-th
moment
$$\int \mathrm{WT}(X)\,dX$$
of the weight function. Each element of C must be greater than zero.
4.
If WT(X) is the weight specified by the coefficients and the interval is (a, b), then
approximately
$$\int_a^b F(X)\,\mathrm{WT}(X)\,dX = \sum_{I=1}^{N} F(\mathrm{QX}(I))\,\mathrm{QW}(I)$$
5.
Gaussian quadrature is always the method of choice when the function F(X) behaves
like a polynomial. Gaussian quadrature is also useful on infinite intervals (with
appropriate weight functions) because other techniques often fail.
Example
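We compute the Gauss quadrature rule (with N = 6) for the Chebyshev weight, $(1 - x^2)^{-1/2}$, from
the recurrence coefficients. These coefficients are obtained by a call to the IMSL routine RECCF.
Note that RECCF is called here without IWEIGH, so its default (IWEIGH = 1, the Legendre weight)
applies, and the output shown coincides with the Gauss-Legendre rule of GQRUL Example 1.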
USE GQRCF_INT
USE UMACH_INT
USE RECCF_INT
IMPLICIT NONE
INTEGER N
PARAMETER (N=6)
INTEGER I, NFIX, NOUT
REAL B(N), C(N), QW(N), QX(N), QXFIX(2)
! Get output unit number
CALL UMACH (2, NOUT)
! Recursion coefficients will come from
! routine RECCF.
! The call to RECCF finds recurrence
! coefficients for Chebyshev
! polynomials of the 1st kind.
CALL RECCF (N, B, C)
Output
QX(1) = -0.9325     QW(1) = 0.17132
QX(2) = -0.6612     QW(2) = 0.36076
QX(3) = -0.2386     QW(3) = 0.46791
QX(4) =  0.2386     QW(4) = 0.46791
QX(5) =  0.6612     QW(5) = 0.36076
QX(6) =  0.9325     QW(6) = 0.17132
RECCF
Computes recurrence coefficients for various monic polynomials.
Required Arguments
N Number of recurrence coefficients. (Input)
B Array of length N containing recurrence coefficients. (Output)
C Array of length N containing recurrence coefficients. (Output)
Optional Arguments
IWEIGH Index of the weight function. (Input)
Default: IWEIGH = 1.
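IWEIGH   WT(X)                                                      Interval                Name
1        $1$                                                        $(-1, +1)$              Legendre
2        $1/\sqrt{1 - X^2}$                                         $(-1, +1)$              Chebyshev 1st kind
3        $\sqrt{1 - X^2}$                                           $(-1, +1)$              Chebyshev 2nd kind
4        $e^{-X^2}$                                                 $(-\infty, +\infty)$    Hermite
5        $(1 - X)^{\mathrm{ALPHA}} (1 + X)^{\mathrm{BETAW}}$        $(-1, +1)$              Jacobi
6        $e^{-X} X^{\mathrm{ALPHA}}$                                $(0, +\infty)$          Generalized Laguerre
7        $1/\cosh(X)$                                               $(-\infty, +\infty)$    Hyperbolic Cosine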
ALPHA Parameter used in the weight function with some values of IWEIGH, otherwise it
is ignored. (Input)
Default: ALPHA=1.0.
BETAW Parameter used in the weight function with some values of IWEIGH, otherwise it
is ignored. (Input)
Default: BETAW=1.0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine RECCF produces the recurrence coefficients for the orthogonal polynomials for some
of the most important weights. It is assumed that the orthogonal polynomials are monic; hence, the
three-term recursion may be written as
$$p_i(x) = (x - b_i)\,p_{i-1}(x) - c_i\,p_{i-2}(x) \quad \text{for } i = 1, \ldots, N$$
where $p_0 = 1$ and $p_{-1} = 0$. It is obvious from this representation that the degree of $p_i$ is i and that $p_i$
is monic. In order for the recurrence to give rise to a sequence of orthogonal polynomials (with
respect to a nonnegative measure), it is necessary and sufficient that $c_i > 0$.
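The recursion is easy to evaluate directly. The sketch below does not call RECCF; it hard-codes the first three Legendre coefficients (b = 0, c = 2, 1/3, 4/15, matching the rounded values in the RECCF example output) and evaluates the monic polynomials at one point, following the indexing of the formula in this Description. The program name and the evaluation point x = 0.5 are assumptions for illustration.

```fortran
program monic_recurrence_demo
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 3
  real(dp) :: b(n), c(n), x, p(-1:n)
  integer :: i

  ! Legendre recurrence coefficients, hard-coded for this sketch.
  ! c(1) is the zero-th moment; it multiplies p(-1) = 0 and so has no effect.
  b = 0.0_dp
  c = [2.0_dp, 1.0_dp/3.0_dp, 4.0_dp/15.0_dp]

  x = 0.5_dp
  p(-1) = 0.0_dp
  p(0) = 1.0_dp
  ! Three-term recursion p_i = (x - b_i) p_{i-1} - c_i p_{i-2}.
  do i = 1, n
    p(i) = (x - b(i))*p(i-1) - c(i)*p(i-2)
  end do

  ! Compare with the closed forms of the monic Legendre polynomials.
  write (*,'(a,f10.6,a,f10.6)') 'p2 = ', p(2), '  x**2 - 1/3     = ', &
       x**2 - 1.0_dp/3.0_dp
  write (*,'(a,f10.6,a,f10.6)') 'p3 = ', p(3), '  x**3 - (3/5)*x = ', &
       x**3 - 0.6_dp*x
end program monic_recurrence_demo
```

At x = 0.5 both columns agree: p2 evaluates to 1/4 - 1/3 and p3 to 1/8 - 3/10, which confirms that the recursion produces monic polynomials of the stated degree.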
Comments
The recurrence coefficients B(I) and C(I) define the monic polynomials via the relation
P(I) = (X - B(I + 1)) * P(I - 1) - C(I + 1) * P(I - 2). The zero-th moment
$$\int \mathrm{WT}(X)\,dX$$
of the weight function is returned in C(1).
Example
Here, we obtain the well-known recurrence relations for the first six monic Legendre polynomials,
Chebyshev polynomials of the first kind, and Laguerre polynomials.
USE RECCF_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER N
PARAMETER (N=6)
INTEGER I, IWEIGH, NOUT
REAL ALPHA, B(N), C(N), BETAW
! Get output unit number
CALL UMACH (2, NOUT)
!
FORMAT (1X, 'Legendre')
FORMAT (/, 1X, 'Chebyshev, first kind')
FORMAT (/, 1X, 'Laguerre')
FORMAT (6(6X,'B(',I1,') = ',F8.4,7X,'C(',I1,') = ',F8.5,/))
END
Output
Legendre
B(1) =   0.0000     C(1) =  2.00000
B(2) =   0.0000     C(2) =  0.33333
B(3) =   0.0000     C(3) =  0.26667
B(4) =   0.0000     C(4) =  0.25714
B(5) =   0.0000     C(5) =  0.25397
B(6) =   0.0000     C(6) =  0.25253

Chebyshev, first kind
B(1) =   0.0000     C(1) =  3.14159
B(2) =   0.0000     C(2) =  0.50000
B(3) =   0.0000     C(3) =  0.25000
B(4) =   0.0000     C(4) =  0.25000
B(5) =   0.0000     C(5) =  0.25000
B(6) =   0.0000     C(6) =  0.25000

Laguerre
B(1) =   1.0000     C(1) =  1.00000
B(2) =   3.0000     C(2) =  1.00000
B(3) =   5.0000     C(3) =  4.00000
B(4) =   7.0000     C(4) =  9.00000
B(5) =   9.0000     C(5) = 16.00000
B(6) =  11.0000     C(6) = 25.00000
RECQR
Computes recurrence coefficients for monic polynomials given a quadrature rule.
Required Arguments
QX Array of length N containing the quadrature points. (Input)
QW Array of length N containing the quadrature weights. (Input)
B Array of length NTERM containing recurrence coefficients. (Output)
C Array of length NTERM containing recurrence coefficients. (Output)
Optional Arguments
N Number of quadrature points. (Input)
Default: N = size (QX,1).
NTERM Number of recurrence coefficients. (Input)
NTERM must be less than or equal to N.
Default: NTERM = size (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine RECQR produces the recurrence coefficients for the orthogonal polynomials given the
points and weights for the Gauss quadrature formula. It is assumed that the orthogonal
polynomials are monic; hence the three-term recursion may be written
$$p_i(x) = (x - b_i)\,p_{i-1}(x) - c_i\,p_{i-2}(x) \quad \text{for } i = 1, \ldots, N$$
where $p_0 = 1$ and $p_{-1} = 0$. It is obvious from this representation that the degree of $p_i$ is i and that $p_i$
is monic. In order for the recurrence to give rise to a sequence of orthogonal polynomials (with
respect to a nonnegative measure), it is necessary and sufficient that $c_i > 0$.
This routine is an inverse routine to GQRCF. Given the recurrence coefficients, the routine GQRCF
produces the corresponding Gauss quadrature formula, whereas the routine RECQR produces the
recurrence coefficients given the quadrature formula.
Comments
1.
2.
The recurrence coefficients B(I) and C(I) define the monic polynomials via the relation
P(I) = (X - B(I + 1)) * P(I - 1) - C(I + 1) * P(I - 2). The zero-th moment
$$\int \mathrm{WT}(X)\,dX$$
of the weight function is returned in C(1).
Example
To illustrate the use of RECQR, we will input a simple choice of recurrence coefficients, call GQRCF
for the quadrature formula, put this information into RECQR, and recover the recurrence
coefficients.
USE RECQR_INT
USE UMACH_INT
USE GQRCF_INT
IMPLICIT NONE
INTEGER N
PARAMETER (N=5)
INTEGER I, J, NFIX, NOUT, NTERM
REAL B(N), C(N), FLOAT, QW(N), QX(N), QXFIX(2)
INTRINSIC FLOAT
! Get output unit number
CALL UMACH (2, NOUT)
! Set up simple recurrence coefficients
DO 10 J=1, N
B(J) = FLOAT(J)
C(J) = FLOAT(J)/2.0
10 CONTINUE
WRITE (NOUT,99995)
99995 FORMAT (1X, 'Original recurrence coefficients')
WRITE (NOUT,99996) (I,B(I),I,C(I),I=1,N)
99996 FORMAT (5(6X,'B(',I1,') = ',F8.4,7X,'C(',I1,') = ',F8.5,/))
!
! The call to GQRCF will compute the
! quadrature rule from the recurrence
! coefficients given above.
!
CALL GQRCF (N, B, C, QX, QW)
WRITE (NOUT,99997)
99997 FORMAT (/, 1X, 'Quadrature rule from the recurrence coefficients')
WRITE (NOUT,99998) (I,QX(I),I,QW(I),I=1,N)
99998 FORMAT (5(6X,'QX(',I1,') = ',F8.4,7X,'QW(',I1,') = ',F8.5,/))
!
! Call RECQR to recover the original
! recurrence coefficients
NTERM = N
CALL RECQR (QX, QW, B, C)
WRITE (NOUT,99999)
99999 FORMAT (/, 1X, 'Recurrence coefficients determined by RECQR')
WRITE (NOUT,99996) (I,B(I),I,C(I),I=1,N)
!
END
Output
Original recurrence coefficients
B(1) =   1.0000     C(1) =  0.50000
B(2) =   2.0000     C(2) =  1.00000
B(3) =   3.0000     C(3) =  1.50000
B(4) =   4.0000     C(4) =  2.00000
B(5) =   5.0000     C(5) =  2.50000

B(4) =   4.0000     C(4) =  2.00000
B(5) =   5.0000     C(5) =  2.50000
FQRUL
Computes a Fejér quadrature rule with various classical weight functions.
Required Arguments
N Number of quadrature points. (Input)
A Lower limit of integration. (Input)
B Upper limit of integration. (Input)
B must be greater than A.
QX Array of length N containing quadrature points. (Output)
QW Array of length N containing quadrature weights. (Output)
Optional Arguments
IWEIGH Index of the weight function. (Input)
Default: IWEIGH = 1.
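IWEIGH   WT(X)
1        $1$
2        $1/(X - \mathrm{ALPHA})$
3        $(B - X)^{\mathrm{ALPHA}} (X - A)^{\mathrm{BETAW}}$
4        $(B - X)^{\mathrm{ALPHA}} (X - A)^{\mathrm{BETAW}} \log(X - A)$
5        $(B - X)^{\mathrm{ALPHA}} (X - A)^{\mathrm{BETAW}} \log(B - X)$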
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FQRUL produces the weights and points for the Fejér quadrature rule. Since this
computation is based on a quarter-wave cosine transform, the computations are most efficient
when N, the number of points, is a product of small primes. These quadrature formulas may be an
intermediate step in a more complicated situation; see, for instance, Gautschi and Milovanović
(1985).
The Fejér quadrature rules are based on polynomial interpolation. First, choose classical abscissas
(in our case, the Gauss points for the Chebyshev weight function $(1 - x^2)^{-1/2}$), then derive the
quadrature rule for a different weight. In order to keep the presentation simple, we will describe
the case where the interval of integration is [-1, 1] even though FQRUL allows rescaling to an
arbitrary interval [a, b].
We are looking for quadrature rules of the form
$$Q(f) := \sum_{j=1}^{N} w_j f(x_j)$$
where the $\{x_j\}_{j=1}^{N}$
are the zeros of the N-th Chebyshev polynomial (of the first kind) $T_N(x) = \cos(N \arccos x)$. The
weights in the quadrature rule Q are chosen so that, for all polynomials p of degree less than N,
$$Q(p) = \sum_{j=1}^{N} w_j p(x_j) = \int_{-1}^{1} p(x)\,w(x)\,dx$$
for some weight function w. In FQRUL, the user has the option of choosing w from five families of
functions with various algebraic and logarithmic endpoint singularities.
These Fejér rules are important because they can be computed using specialized FFT quarter-wave
transform routines. This means that rules with a large number of abscissas may be computed
efficiently. If we insert $T_l$ for p in the above formula, we obtain
$$Q(T_l) = \sum_{j=1}^{N} w_j T_l(x_j) = \int_{-1}^{1} T_l(x)\,w(x)\,dx$$
for $l = 0, \ldots, N - 1$. This is a system of linear equations for the unknown weights $w_j$ that can be
simplified by noting that
$$x_j = \cos\frac{(2j - 1)\pi}{2N}, \quad j = 1, \ldots, N$$
and hence,
$$\int_{-1}^{1} T_l(x)\,w(x)\,dx = \sum_{j=1}^{N} w_j T_l(x_j) = \sum_{j=1}^{N} w_j \cos\frac{l(2j - 1)\pi}{2N}$$
The last expression is the cosine quarter-wave forward transform for the sequence $\{w_j\}_{j=1}^{N}$
that is implemented in Chapter 6, Transforms under the name QCOSF. More importantly, QCOSF
has an inverse QCOSB. It follows that if the integrals on the left in the last expression can be
computed, then the Fejér rule can be derived efficiently for highly composite integers N utilizing
QCOSB. For more information on this topic, consult Davis and Rabinowitz (1984, pages 84-86)
and Gautschi (1968, page 259).
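For the constant weight w(x) = 1 on [-1, 1], the linear system above has a well-known closed-form solution, which makes a small self-contained check possible. The sketch below does not call FQRUL or QCOSB; it uses the closed-form weights of Fejér's first rule (an assumption about which special case is illustrative, not a description of the library's code path) and verifies the exactness property for N = 4. The program and variable names are hypothetical.

```fortran
program fejer_demo
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 4
  real(dp), parameter :: pi = 3.141592653589793_dp
  real(dp) :: qx(n), qw(n), theta, s
  integer :: j, k

  do j = 1, n
    theta = real(2*j - 1, dp)*pi/real(2*n, dp)
    ! Abscissa: j-th zero of the Chebyshev polynomial T_N.
    qx(j) = cos(theta)
    ! Closed-form weight of Fejer's first rule for w(x) = 1 on [-1, 1].
    s = 0.0_dp
    do k = 1, n/2
      s = s + cos(2.0_dp*real(k, dp)*theta)/real(4*k*k - 1, dp)
    end do
    qw(j) = (2.0_dp/real(n, dp))*(1.0_dp - 2.0_dp*s)
  end do

  ! The rule is exact for polynomials of degree less than N.
  write (*,'(a,f12.8)') 'sum of weights   (exact 2)   = ', sum(qw)
  write (*,'(a,f12.8)') 'integral of x**2 (exact 2/3) = ', sum(qw*qx**2)
end program fejer_demo
```

For large N the transform-based approach described above is preferable, since the closed form costs on the order of N**2 operations.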
Comments
1.
2.
If IWEIGH specifies the weight WT(X) and the interval (A, B), then approximately
$$\int_A^B F(X)\,\mathrm{WT}(X)\,dX = \sum_{I=1}^{N} F(\mathrm{QX}(I))\,\mathrm{QW}(I)$$
3.
The routine FQRUL uses an FFT, so it is most efficient when N is the product of small
primes.
Example
Here, we obtain the Fejér quadrature rules using 10, 100, and 200 points. With these rules, we get
successively better approximations to the integral
$$\int_0^1 x \sin(41 \pi x^2)\,dx = \frac{1}{41\pi}$$
USE FQRUL_INT
USE UMACH_INT
USE CONST_INT
IMPLICIT NONE
INTEGER NMAX
PARAMETER (NMAX=200)
INTEGER I, K, N, NOUT
REAL A, ANSWER, B, F, QW(NMAX), &
QX(NMAX), SIN, SUM, X, PI, ERROR
INTRINSIC SIN, ABS
!
F(X) = X*SIN(41.0*PI*X**2)
!
99999 FORMAT (/, 1X, 'When N = ', I3, ', the quadrature result making ' &
, 'use of these points ', /, ' and weights is ', 1PE11.4, &
', with error ', 1PE9.2, '.')
END
Output
When N = 10, the quadrature result making use of these points and weights
is -1.6523E-01, with error 1.73E-01.
When N = 100, the quadrature result making use of these points and weights
is 7.7637E-03, with error 2.79E-08.
When N = 200, the quadrature result making use of these points and weights
is 7.7636E-03, with error 1.40E-08.
DERIV
This function computes the first, second or third derivative of a user-supplied function.
Required Arguments
FCN User-supplied FUNCTION whose derivative at X will be computed. The
form is FCN(X), where
X Independent variable. (Input)
FCN The function value. (Output)
FCN must be declared EXTERNAL in the calling program.
X Point at which the derivative is to be evaluated. (Input)
Optional Arguments
KORDER Order of the derivative desired (1, 2 or 3). (Input)
Default: KORDER = 1.
BGSTEP Beginning value used to compute the size of the interval used in computing the
derivative. (Input)
The interval used is the closed interval [X - 4 * BGSTEP, X + 4 * BGSTEP]. BGSTEP
must be positive.
Default: BGSTEP = .01.
TOL Relative error desired in the derivative estimate. (Input)
Default: TOL = 1.e-2 for single precision and 1.d-4 for double precision.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
DERIV produces an estimate to the first, second, or third derivative of a function. The estimate
originates from first computing a spline interpolant to the input function using values within the
interval (X - 4.0 * BGSTEP, X + 4.0 * BGSTEP), then differentiating the spline at X.
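As a point of comparison, a derivative estimate can also be formed from two function values. The sketch below is not the spline-based method used by DERIV; it is a plain central difference (a swapped-in, simpler technique) applied to f(x) = -2 sin(3x/2) at x = 2, with the step h = 1.0e-4 chosen here as an assumption. The program and function names are hypothetical.

```fortran
program central_difference_demo
  implicit none
  integer, parameter :: dp = kind(1d0)
  real(dp) :: x, h, approx, exact

  x = 2.0_dp
  h = 1.0e-4_dp
  ! Second-order central difference for the first derivative.
  approx = (f(x + h) - f(x - h))/(2.0_dp*h)
  ! Analytic derivative of f: -3 cos(3x/2).
  exact = -3.0_dp*cos(1.5_dp*x)
  write (*,'(a,es12.4)') 'central difference = ', approx
  write (*,'(a,es12.4)') 'analytic value     = ', exact
contains
  pure function f(t) result(v)
    real(dp), intent(in) :: t
    real(dp) :: v
    v = -2.0_dp*sin(1.5_dp*t)
  end function f
end program central_difference_demo
```

The step size plays the same balancing role as BGSTEP: too large and truncation error dominates, too small and roundoff dominates.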
Comments
1.
2.
Informational errors
Type
Code
3
3.
The initial step size, BGSTEP, must be chosen small enough that FCN is defined and
reasonably smooth in the interval (X - 4 * BGSTEP, X + 4 * BGSTEP), yet large enough
to avoid roundoff problems.
Example 1
In this example, we obtain the approximate first derivative of the function
$f(x) = -2 \sin(3x/2)$
at the point x = 2.
USE DERIV_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER KORDER, NCOUNT, NOUT
REAL BGSTEP, DERV, TOL, X
EXTERNAL FCN
! Get output unit number
CALL UMACH (2, NOUT)
!
X = 2.0
BGSTEP = 0.2
NCOUNT = 1
DERV = DERIV(FCN,X, BGSTEP=BGSTEP)
WRITE (NOUT,99999) DERV
99999 FORMAT (/, 1X, 'First derivative of FCN is ', 1PE10.3)
END
Output
First derivative of FCN is 2.970E+00
Additional Example
Example 2
In this example, we attempt to approximate in single precision the third derivative of the function
$f(x) = 2x^4 + 3x$
at the point x = 0.75. Although the function is well-behaved near x = 0.75, finding derivatives is
often computationally difficult on 32-bit machines. The difficulty is overcome in double precision.
USE IMSL_LIBRARIES
IMPLICIT NONE
INTEGER KORDER, NOUT
REAL BGSTEP, DERV, X, TOL
DOUBLE PRECISION DBGSTE, DDERV, DFCN, DTOL, DX
EXTERNAL DFCN, FCN
! Get output unit number
CALL UMACH (2, NOUT)
! Turn off stopping due to error
! condition
CALL ERSET (0, -1, 0)
!
X = 0.75
BGSTEP = 0.1
KORDER = 3
REAL FUNCTION FCN (X)
REAL X
FCN = 2.0*X**4 + 3.0*X
RETURN
END
DOUBLE PRECISION FUNCTION DFCN (X)
DOUBLE PRECISION X
DFCN = 2.0D0*X**4 + 3.0D0*X
RETURN
END
Output
*** FATAL
***
***
***
Routines
Sturm-Liouville Problems
Eigenvalues, eigenfunctions,
and spectral density functions .............................................. SLEIG
Indices of eigenvalues ......................................................... SLCNT
Usage Notes
A differential equation is an equation involving one or more dependent variables (called yi or ui),
their derivatives, and one or more independent variables (called t, x, and y). Users will typically
need to relabel their own model variables so that they correspond to the variables used in the
solvers described here. A differential equation with one independent variable is called an ordinary
differential equation (ODE). A system of equations involving derivatives in one independent
variable and other dependent variables is called a differential-algebraic system. A differential
equation with more than one independent variable is called a partial differential equation (PDE).
Chapter 5: Differential Equations
The order of a differential equation is the highest order of any of the derivatives in the equation.
Some of the routines in this chapter require the user to reduce higher-order problems to systems of
first-order differential equations.
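The reduction to a first-order system is mechanical. The following is a sketch for a generic second-order equation; the function name $g$ and the auxiliary variables $y_1$, $y_2$ are introduced here for illustration.

```latex
y'' = g(t, y, y'), \qquad y(t_0) = \eta_0,\quad y'(t_0) = \eta_1
\quad\Longrightarrow\quad
\begin{aligned}
y_1 &= y, & y_1' &= y_2, & y_1(t_0) &= \eta_0, \\
y_2 &= y', & y_2' &= g(t, y_1, y_2), & y_2(t_0) &= \eta_1 .
\end{aligned}
```

For example, $y'' + y = 0$ becomes $y_1' = y_2$, $y_2' = -y_1$, a system of N = 2 first-order equations.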
$$\frac{dy(t)}{dt} = f(t, y)$$
with initial values $y(t_0)$. Values of y(t) for $t > t_0$ or $t < t_0$ are required. The routines IVPRK, IVMRK,
and IVPAG solve the IVP for systems of ODEs of the form $y' = f(t, y)$ with $y(t = t_0)$ specified.
Here, f is a user-supplied function that must be evaluated at any set of values $(t, y_1, \ldots, y_N)$;
$i = 1, \ldots, N$. The routines IVPAG and DASPG will also solve implicit systems of the form $Ay' = f(t, y)$
where A is a user-supplied matrix. For IVPAG, the matrix A must be nonsingular.
The system $y' = f(t, y)$ is said to be stiff if some of the eigenvalues of the Jacobian matrix
$\{\partial f_i / \partial y_j\}$ have large, negative real parts. This is often the case for differential equations
representing the behavior of physical systems such as chemical reactions proceeding to
equilibrium where subspecies effectively complete their reaction in different epochs. An alternate
model concerns discharging capacitors such that different parts of the system have widely varying
decay rates (or time constants). This definition of stiffness, based on the eigenvalues of the
Jacobian matrix, is not satisfactory. Users typically identify stiff systems by the fact that numerical
differential equation solvers such as IVPRK, are inefficient, or else they fail. The most common
inefficiency is that a large number of evaluations of the functions fi are required. In such cases, use
routine IVPAG, or DASPG. For more about stiff systems, see Gear (1971, Chapter 11) or Shampine
and Gear (1979).
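A two-equation linear system illustrates why widely separated decay rates make an explicit method inefficient. The sketch below uses the forward Euler method purely as the simplest explicit scheme (it is not the method used by IVPRK), and the rate constants are chosen here as an assumption.

```latex
y' = \begin{pmatrix} -1 & 0 \\ 0 & -1000 \end{pmatrix} y,
\qquad
y_1(t) = y_1(0)\,e^{-t},\quad y_2(t) = y_2(0)\,e^{-1000 t}

% Forward Euler applied to y' = \lambda y with step h:
y_{n+1} = (1 + h\lambda)\,y_n,
\qquad
|1 + h\lambda| \le 1 \;\Longleftrightarrow\; h \le \frac{2}{|\lambda|} = 0.002
\quad (\lambda = -1000)
```

The fast component is negligible after a few thousandths of a time unit, yet it limits the step size for the whole integration, so following the slow component over a unit time interval takes at least 500 steps.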
In the boundary value problem (BVP) for ODEs, constraints on the dependent variables are given
at the endpoints of the interval of interest, [a, b]. The routines BVPFD and BVPMS solve the BVP
for systems of the form $y'(t) = f(t, y)$, subject to the conditions
$$h_i(y_1(a), \ldots, y_N(a), y_1(b), \ldots, y_N(b)) = 0, \quad i = 1, \ldots, N$$
Here, f and $h = [h_1, \ldots, h_N]^T$ are user-supplied functions.
Differential-algebraic Equations
Frequently, it is not possible or not convenient to express the model of a dynamical system as a set
of ODEs. Rather, an implicit equation is available in the form
$$g_i(t, y_1, \ldots, y_N, y_1', \ldots, y_N') = 0, \quad i = 1, \ldots, N$$
$$g(t, y, y') = [g_1(t, y, y'), \ldots, g_N(t, y, y')]^T = 0$$
with initial value $y(t_0)$. Any system of ODEs can be trivially written as a differential-algebraic
system by defining
$$g(t, y, y') = f(t, y) - y'$$
The routine DASPG solves differential-algebraic systems of index 1 or index 0. For a definition of
the index of a differential-algebraic system, see Brenan et al. (1989). Also, see Gear and Petzold
(1984) for an outline of the computing methods used.
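$$\frac{\partial u_i}{\partial t} = f_i\left(x, t, u_1, \ldots, u_N, \frac{\partial u_1}{\partial x}, \ldots, \frac{\partial u_N}{\partial x}, \frac{\partial^2 u_1}{\partial x^2}, \ldots, \frac{\partial^2 u_N}{\partial x^2}\right)$$
$$\alpha_1^{(i)} u_i(a) + \beta_1^{(i)} \frac{\partial u_i}{\partial x}(a) = \gamma_1(t)$$
$$\alpha_2^{(i)} u_i(b) + \beta_2^{(i)} \frac{\partial u_i}{\partial x}(b) = \gamma_2(t)$$
The coefficients $\alpha_j^{(i)}$ and $\beta_j^{(i)}$
are user-supplied, j = 1, 2.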
The routines FPS2H and FPS3H solve Laplace's, Poisson's, or Helmholtz's equation in two or
three dimensions. FPS2H uses a fast Poisson method to solve a PDE of the form
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + cu = f(x, y)$$
over a rectangle, subject to boundary conditions on each of the four sides. The scalar constant c
and the function f are user specified. FPS3H solves the three-dimensional analogue of this
problem.
Users wishing to solve more general PDEs in more general 2-d and 3-d regions are referred to
Visual Numerics' partner PDE2D (www.pde2d.com).
Summary
The following table summarizes the types of problems handled by the routines in this chapter.
With the exception of FPS2H and FPS3H, the routines can handle more than one differential
equation.
Problem                                                         Consideration               Routine
$Ay' = f(t, y)$, $y(t_0) = y_0$                                                             IVPAG
$y' = f(t, y)$, $y(t_0) = y_0$                                                              IVPAG, IVPRK
$y' = f(t, y)$, $h(y(a), y(b)) = 0$                                                         BVPFD
$g(t, y, y') = 0$; $y(t_0)$, $y'(t_0)$ given                                                DASPG
                                                                                            MOLCH
                                                                                            FPS2H
                                                                                            FPS3H
$-(pu')' + qu = \lambda r u$,                                   Sturm-Liouville problems    SLEIG
$\alpha_1 u(a) - \alpha_2 (pu'(a)) = \lambda(\alpha_1' u(a) - \alpha_2' (pu'(a)))$,
$\beta_1 u(b) + \beta_2 (pu'(b)) = 0$
IVPRK
Solves an initial-value problem for ordinary differential equations using the Runge-Kutta-Verner
fifth-order and sixth-order method.
Required Arguments
IDO Flag indicating the state of the computation. (Input/Output)
IDO   State
1     Initial entry
2     Normal re-entry
Normally, the initial call is made with IDO = 1. The routine then sets IDO = 2, and this
value is used for all but the last call that is made with IDO = 3. This final call is used to
release workspace, which was automatically allocated by the initial call with IDO = 1.
No integration is performed on this final call. See Comment 3 for a description of the
other interrupts.
FCN User-supplied SUBROUTINE to evaluate functions. The usage is
CALL FCN(N, T, Y, YPRIME), where
N Number of equations. (Input)
T Independent variable, t. (Input)
Y Array of size N containing the dependent variable values, y.
(Input)
YPRIME Array of size N containing the values of the vector y'
evaluated at (t, y). (Output)
FCN must be declared EXTERNAL in the calling program.
T Independent variable. (Input/Output)
On input, T contains the initial value. On output, T is replaced by TEND unless error
conditions have occurred. See IDO for details.
TEND Value of t where the solution is required. (Input)
The value TEND may be less than the initial value of t.
Y Array of size NEQ of dependent variables. (Input/Output)
On input, Y contains the initial values. On output, Y contains the approximate solution.
Optional Arguments
NEQ Number of differential equations. (Input)
Default: NEQ = size (Y,1).
TOL Tolerance for error control. (Input)
An attempt is made to control the norm of the local error such that the global error is
proportional to TOL.
Default: TOL = machine precision.
PARAM A floating-point array of size 50 containing optional parameters. (Input/ Output)
If a parameter is zero, then a default value is used. These default values are given
below. Parameters that concern values of step size are applied in the direction of
integration. The following parameters may be set by the user:
PARAM   Name     Meaning
1       HINIT    Initial value of the step size. Default: 10.0 * MAX (AMACH (1),
                 AMACH(4) * MAX(ABS(TEND), ABS(T)))
2       HMIN
3       HMAX
4       MXSTEP
5       MXFCN
6                Not used.
7       INTRP1
8       INTRP2
9       SCALE
10      INORM
11      FLOOR
12-30
31      HTRIAL   Current trial step size.
32      HMINC
33      HMAXC
34      NSTEP
35      NFCN
36-50
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine IVPRK finds an approximation to the solution of a system of first-order differential
equations of the form $y' = f(t, y)$ with given initial data. The routine attempts to keep the global
error proportional to a user-specified tolerance. This routine is efficient for nonstiff systems where
the derivative evaluations are not expensive.
The routine IVPRK is based on a code designed by Hull, Enright and Jackson (1976, 1977). It uses
Runge-Kutta formulas of order five and six developed by J. H. Verner.
Comments
1.
$$\sqrt{\sum_{i=1}^{\mathrm{NEQ}} e_i^2 / w_i^2}$$
Arg
Definition
YMAX
ENORM
WK Work array of size 10N using the working precision. The contents of WK must not be
changed from the first call with IDO = 1 until after the final call with IDO = 3.
2.
Informational errors
Type   Code
4      1      Cannot satisfy error condition. The value of TOL may be too small.
4      2      Too many function evaluations needed.
4      3      Too many steps needed. The problem may be stiff.
3.
If PARAM(7) is nonzero, the subroutine returns with IDO = 4 and will resume
calculation at the point of interruption if re-entered with IDO = 4. If PARAM(8) is
nonzero, the subroutine will interrupt the calculations immediately after it decides
whether or not to accept the result of the most recent trial step. The values used are
IDO = 5 if the routine plans to accept, or IDO = 6 if it plans to reject the step. The
values of IDO may be changed by the user (by changing IDO from 6 to 5) in order to
force acceptance of a step that would otherwise be rejected. Some parameters the user
might want to examine after return from an interrupt are IDO, HTRIAL, NSTEP, NFCN,
T, and Y. The array Y contains the newly computed trial value for y(t), accepted or not.
Example 1
Consider a predator-prey problem with rabbits and foxes. Let r be the density of rabbits and let
f be the density of foxes. In the absence of any predator-prey interaction, the rabbits would
increase at a rate proportional to their number, and the foxes would die of starvation at a rate
proportional to their number. Mathematically,
$$r' = 2r$$
$$f' = -f$$
The rate at which the rabbits are eaten by the foxes is 2rf, and the rate at which the foxes increase,
because they are eating the rabbits, is rf. So, the model to be solved is
$$r' = 2r - 2rf$$
$$f' = -f + rf$$
The initial conditions are r(0) = 1 and f(0) = 3 over the interval $0 \le t \le 10$.
In the program Y(1) = r and Y(2) = f. Note that the parameter vector PARAM is first set to zero with
IMSL routine SSET (Chapter 9, Basic Matrix/Vector Operations). Then, absolute error control is
selected by setting PARAM(10) = 1.0.
The last call to IVPRK with IDO = 3 deallocates IMSL workspace allocated on the first call to
IVPRK. It is not necessary to release the workspace in this example because the program ends after
solving a single problem. The call to release workspace is made as a model of what would be
needed if the program included further calls to IMSL routines.
USE IVPRK_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER MXPARM, N
PARAMETER (MXPARM=50, N=2)
! SPECIFICATIONS FOR LOCAL VARIABLES
INTEGER IDO, ISTEP, NOUT
REAL PARAM(MXPARM), T, TEND, TOL, Y(N)
! SPECIFICATIONS FOR SUBROUTINES
EXTERNAL FCN
!
TOL = 0.0005
PARAM = 0.E0
PARAM(10) = 1.0
Output
ISTEP     Time        Y1        Y2
    1    1.000     0.078     1.465
    2    2.000     0.085     0.578
    3    3.000     0.292     0.250
    4    4.000     1.449     0.187
    5    5.000     4.046     1.444
    6    6.000     0.176     2.256
    7    7.000     0.066     0.908
    8    8.000     0.148     0.367
    9    9.000     0.655     0.188
   10   10.000     3.157     0.352
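The predator-prey model can also be integrated without the library, which helps when checking a user-written FCN. The sketch below replaces the Runge-Kutta-Verner pair of IVPRK with the classical fixed-step fourth-order Runge-Kutta method (a swapped-in technique with no error control); the substep count nsub = 1000 and all names are assumptions made for this illustration.

```fortran
program predator_prey_rk4
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: nsub = 1000      ! fixed RK4 substeps per unit of time
  real(dp) :: y(2), t, h, k1(2), k2(2), k3(2), k4(2)
  integer :: istep, i

  ! Initial conditions r(0) = 1, f(0) = 3, stored as y(1) = r, y(2) = f.
  t = 0.0_dp
  y = [1.0_dp, 3.0_dp]
  h = 1.0_dp/real(nsub, dp)

  write (*,'(4x,a,5x,a,9x,a,11x,a)') 'ISTEP', 'Time', 'Y1', 'Y2'
  do istep = 1, 10
    ! Advance one unit of time with classical RK4 steps of size h.
    do i = 1, nsub
      k1 = rhs(y)
      k2 = rhs(y + 0.5_dp*h*k1)
      k3 = rhs(y + 0.5_dp*h*k2)
      k4 = rhs(y + h*k3)
      y = y + (h/6.0_dp)*(k1 + 2.0_dp*k2 + 2.0_dp*k3 + k4)
    end do
    t = real(istep, dp)
    write (*,'(i6,3f12.3)') istep, t, y
  end do
contains
  ! Right-hand side: r' = 2r - 2rf, f' = -f + rf.
  pure function rhs(u) result(du)
    real(dp), intent(in) :: u(2)
    real(dp) :: du(2)
    du(1) = 2.0_dp*u(1) - 2.0_dp*u(1)*u(2)
    du(2) = -u(2) + u(1)*u(2)
  end function rhs
end program predator_prey_rk4
```

A fixed step is adequate here because the problem is nonstiff and smooth; an adaptive routine such as IVPRK chooses the step size automatically from the requested tolerance.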
Additional Examples
Example 2
This is a mildly stiff problem (F2) from the test set of Enright and Pryce (1987). It is included here
because it illustrates the inefficiency of requiring more function evaluations with a nonstiff solver,
for a requested accuracy, than would be required using a stiff solver. Also, see IVPAG Example 2,
where the problem is solved using a BDF method. The number of function evaluations may vary,
depending on the accuracy and other arithmetic characteristics of the computer. The test problem
has n = 2 equations:
$$y_1' = -y_1 - y_1 y_2 + k_1 y_2$$
$$y_2' = -k_2 y_2 + k_3 (1 - y_2)\,y_1$$
$$y_1(0) = 1, \quad y_2(0) = 0$$
$$k_1 = 294, \quad k_2 = 3, \quad k_3 = 0.01020408, \quad t_{end} = 240$$
      USE IVPRK_INT
      USE UMACH_INT
!
      IMPLICIT   NONE
      INTEGER    MXPARM, N
      PARAMETER  (MXPARM=50, N=2)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    IDO, ISTEP, NOUT
      REAL       PARAM(MXPARM), T, TEND, TOL, Y(N)
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 SPECIFICATIONS FOR FUNCTIONS
      EXTERNAL   FCN
!
      CALL UMACH (2, NOUT)
!
      TOL = 0.001
!                                 Print header
      WRITE (NOUT,99998)
      IDO = 1
      ISTEP = 0
   10 CONTINUE
      ISTEP = ISTEP + 24
      TEND = ISTEP
      CALL IVPRK (IDO, FCN, T, TEND, Y, TOL=TOL, PARAM=PARAM)
      IF (ISTEP .LE. 240) THEN
         WRITE (NOUT,'(I6,3F12.3)') ISTEP/24, T, Y
!                                 Final call to release workspace
         IF (ISTEP .EQ. 240) IDO = 3
         GO TO 10
      END IF
!                                 Show number of function calls.
      WRITE (NOUT,99999) PARAM(35)
99998 FORMAT (4X, 'ISTEP', 5X, 'Time', 9X, 'Y1', 11X, 'Y2')
99999 FORMAT (4X, 'Number of fcn calls with IVPRK =', F6.0)
      END
      SUBROUTINE FCN (N, T, Y, YPRIME)
!                                 SPECIFICATIONS FOR ARGUMENTS
      INTEGER    N
      REAL       T, Y(N), YPRIME(N)
!                                 SPECIFICATIONS FOR DATA VARIABLES
      REAL       AK1, AK2, AK3
!
      DATA AK1, AK2, AK3/294.0E0, 3.0E0, 0.01020408E0/
!
      YPRIME(1) = -Y(1) - Y(1)*Y(2) + AK1*Y(2)
      YPRIME(2) = -AK2*Y(2) + AK3*(1.0E0-Y(2))*Y(1)
      RETURN
      END
Output
    ISTEP     Time        Y1        Y2
        1   24.000     0.688     0.002
        2   48.000     0.634     0.002
        3   72.000     0.589     0.002
        4   96.000     0.549     0.002
        5  120.000     0.514     0.002
        6  144.000     0.484     0.002
        7  168.000     0.457     0.002
        8  192.000     0.433     0.001
        9  216.000     0.411     0.001
       10  240.000     0.391     0.001
    Number of fcn calls with IVPRK = 2153.
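As an independent cross-check of the first output row, the sketch below integrates the same right-hand side with a fixed-step classical fourth-order Runge-Kutta scheme. It does not call IMSL and is not the algorithm used by IVPRK; the module name, the routine RK4_STEP, and the step size 0.01 are assumptions made only for this illustration.

```fortran
module f2_problem
  implicit none
  real, parameter :: ak1 = 294.0, ak2 = 3.0, ak3 = 0.01020408
contains
  ! Right-hand side of test problem F2, as in subroutine FCN above.
  subroutine fcn(n, t, y, yprime)
    integer, intent(in) :: n
    real, intent(in) :: t, y(n)
    real, intent(out) :: yprime(n)
    yprime(1) = -y(1) - y(1)*y(2) + ak1*y(2)
    yprime(2) = -ak2*y(2) + ak3*(1.0 - y(2))*y(1)
  end subroutine fcn

  ! One classical fourth-order Runge-Kutta step of size h (hypothetical helper).
  subroutine rk4_step(n, t, y, h)
    integer, intent(in) :: n
    real, intent(inout) :: t, y(n)
    real, intent(in) :: h
    real :: k1(n), k2(n), k3(n), k4(n)
    call fcn(n, t, y, k1)
    call fcn(n, t + 0.5*h, y + 0.5*h*k1, k2)
    call fcn(n, t + 0.5*h, y + 0.5*h*k2, k3)
    call fcn(n, t + h, y + h*k3, k4)
    y = y + (h/6.0)*(k1 + 2.0*k2 + 2.0*k3 + k4)
    t = t + h
  end subroutine rk4_step
end module f2_problem

program f2_rk4
  use f2_problem
  implicit none
  integer :: i
  real :: t, y(2)
  ! Initial conditions y1(0) = 1, y2(0) = 0 from the problem statement.
  t = 0.0
  y = [1.0, 0.0]
  ! 2400 steps of size 0.01 reach t = 24, the first output point.
  do i = 1, 2400
    call rk4_step(2, t, y, 0.01)
  end do
  print '(a,f8.3,2f10.4)', ' t, y1, y2 = ', t, y
end program f2_rk4
```

The fixed step is chosen well inside the stability region for the fast time scale of this problem, so the run spends 9600 derivative evaluations to reach t = 24 alone. That is the kind of cost an adaptive or stiff solver is meant to avoid.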
IVMRK
Solves an initial-value problem y′ = f(t, y) for ordinary differential equations using Runge-Kutta
pairs of various orders.
Required Arguments
IDO Flag indicating the state of the computation. (Input/Output)
    IDO   State
    1     Initial entry
    2     Normal re-entry
    3     Final call to release workspace
    4     Return after a step
    5     Return for function evaluation (reverse communication)
Normally, the initial call is made with IDO = 1. The routine then sets IDO = 2, and this
value is used for all but the last call that is made with IDO = 3. This final call is used to
release workspace, which was automatically allocated by the initial call with IDO = 1.
FCN User-supplied SUBROUTINE to evaluate functions. The usage is
CALL FCN (N, T, Y, YPRIME), where
N Number of equations. (Input)
T Independent variable. (Input)
Y Array of size N containing the dependent variable values, y. (Input)
YPRIME Array of size N containing the values of the vector y′ evaluated at (t, y).
(Output)
FCN must be declared EXTERNAL in the calling program.
T Independent variable. (Input/Output)
On input, T contains the initial value. On output, T is replaced by TEND unless error
conditions have occurred.
Optional Arguments
N Number of differential equations. (Input)
Default: N= size (Y,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine IVMRK finds an approximation to the solution of a system of first-order differential
equations of the form y′ = f(t, y) with given initial data. Relative local error is controlled according
to a user-supplied tolerance. For added efficiency, three Runge-Kutta formula pairs, of orders 3, 5,
and 8, are available.
Optionally, the values of the vector y′ can be passed to IVMRK by reverse communication,
avoiding the user-supplied subroutine FCN. Reverse communication is especially useful in
applications that have complicated algorithmic requirements for the evaluation of f(t, y). Another
option allows assessment of the global error in the integration.
The routine IVMRK is based on the codes contained in RKSUITE, developed by R. W. Brankin, I.
Gladwell, and L. F. Shampine (1991).
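The reverse-communication pattern described above can be sketched without IMSL. In the toy module below, a fixed-step explicit Euler integrator returns IDO = 5 whenever it needs a derivative, and the caller evaluates y′ inline and re-enters. The names RC_EULER and EULER_RC, the argument list, and the saved state are hypothetical; this mirrors only the calling protocol, not the RKSUITE-based method inside IVMRK.

```fortran
module rc_euler
  implicit none
  private
  public :: euler_rc
  ! State kept between calls (hypothetical layout, not IMSL's workspace).
  integer, save :: nsteps_left = 0
  real, save :: h = 0.0
contains
  ! ido = 1 on entry: initial call.
  ! ido = 5 on return: caller must store f(t, y) in yprime and call again.
  ! ido = 2 on return: t has reached tend.
  subroutine euler_rc(ido, t, tend, y, yprime, nstep)
    integer, intent(inout) :: ido
    real, intent(inout) :: t, y(:)
    real, intent(in) :: tend, yprime(:)
    integer, intent(in) :: nstep
    if (ido == 1) then
      nsteps_left = nstep
      h = (tend - t)/real(nstep)
    else if (ido == 5) then
      ! The caller has supplied the derivative; take one Euler step.
      y = y + h*yprime
      t = t + h
      nsteps_left = nsteps_left - 1
    end if
    if (nsteps_left > 0) then
      ido = 5
    else
      t = tend
      ido = 2
    end if
  end subroutine euler_rc
end module rc_euler

program rc_demo
  use rc_euler
  implicit none
  integer :: ido
  real :: t, y(1), yprime(1)
  t = 0.0
  y(1) = 1.0
  yprime = 0.0
  ido = 1
  do
    call euler_rc(ido, t, 1.0, y, yprime, 1000)
    if (ido /= 5) exit
    ! Derivative evaluated inline instead of in a separate FCN: y' = -y.
    yprime(1) = -y(1)
  end do
  print '(a,f8.5)', ' y(1) is approximately ', y(1)
end program rc_demo
```

The loop has the same shape as a driver that re-enters on IDO = 5. The derivative code lives in the caller, which is the property that makes reverse communication convenient when f(t, y) is hard to isolate in a subroutine.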
Comments
1.
    PARAM   Name     Meaning
    1       HINIT    Initial value of the step size. Must be chosen such that
                     0.01 ≥ HINIT ≥ 10.0 amach(4).
                     Default: automatic selection of stepsize.
    2       METHOD
    3       ERREST
    4       INTRP
    5       RCSTAT
    6 - 30           Not used
2.
Informational errors
Type
Code
4
If PARAM(4) is nonzero, the subroutine returns with IDO = 4 and will resume
calculation at the point of interruption if re-entered with IDO = 4. Some parameters the
user might want to examine are IDO, HTRIAL, NSTEP, NFCN, T, and Y. The array Y
contains the newly computed trial value for y(t), accepted or not.
If PARAM(5) is nonzero, the subroutine will return with IDO = 5. At this time, evaluate
the derivatives at T, place the result in YPRIME, and call IVMRK again. The dummy
function I40RK/DI40RK may be used in place of FCN.
Example 1
This example integrates the small system (A.2.B2) from the test set of Enright and Pryce (1987):
\[
\begin{aligned}
y_1' &= -y_1 + y_2, & y_1(0) &= 2 \\
y_2' &= y_1 - 2 y_2 + y_3, & y_2(0) &= 0 \\
y_3' &= y_2 - y_3, & y_3(0) &= 1
\end{aligned}
\]
      USE IVMRK_INT
      USE WRRRN_INT
!
      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=3)
!
      CALL WRRRN ('Y', Y)
      END
Output
       Y
    1   1.000
    2   1.000
    3   1.000
Additional Examples
Example 2
This problem is the same mildly stiff problem (A.1.F2) from the test set of Enright and Pryce as
Example 2 for IVPRK.
\[
\begin{aligned}
y_1' &= -y_1 - y_1 y_2 + k_1 y_2, & y_1(0) &= 1 \\
y_2' &= -k_2 y_2 + k_3 (1 - y_2)\, y_1, & y_2(0) &= 0 \\
k_1 &= 294, \quad k_2 = 3, \quad k_3 = 0.01020408, & t_{end} &= 240
\end{aligned}
\]
Although IVMRK is not a stiff solver, it is more efficient than IVPRK on this problem in terms of
derivative evaluations (1375 here, compared with 2153 in Example 2 for IVPRK). Reverse
communication is also used in this example. Users will find this feature particularly helpful if
their derivative evaluation scheme is difficult to isolate in a separate subroutine.
USE I2MRK_INT
USE UMACH_INT
USE AMACH_INT
!
!
!
!
!
!
!
!
IMPLICIT
INTEGER
NONE
N
PARAMETER
(N=2)
INTEGER
REAL
REAL
SAVE
INTRINSIC
REAL
EXTERNAL
vector
TOL = .001
PREC = AMACH(4)
THRES = SQRT (PREC)
PARAM = 0.0E0
LWORK = 1000
!
!
!
!
!                                 Print header
      WRITE (NOUT,99998)
   10 CONTINUE
      TEND = ISTEP
      CALL I2MRK (IDO, N, I40RK, T, TEND, Y, YPRIME, TOL, THRES, PARAM,&
                  YMAX, RMSERR, WORK, LWORK)
      IF (IDO .EQ. 5) THEN
!                                 Evaluate derivatives
         YPRIME(1) = -Y(1) - Y(1)*Y(2) + AK1*Y(2)
         YPRIME(2) = -AK2*Y(2) + AK3*(1.0-Y(2))*Y(1)
         GO TO 10
      ELSE IF (ISTEP .LE. 240) THEN
!
!
!
!
!
Output
    ISTEP     TIME        Y1        Y2
        1   24.000     0.688     0.002
        2   48.000     0.634     0.002
        3   72.000     0.589     0.002
        4   96.000     0.549     0.002
        5  120.000     0.514     0.002
        6  144.000     0.484     0.002
        7  168.000     0.457     0.002
        8  192.000     0.433     0.001
        9  216.000     0.411     0.001
       10  240.000     0.391     0.001
    NUMBER OF DERIVATIVE EVALUATIONS WITH IVMRK = 1375.
Example 3
This example demonstrates how exceptions may be handled. The problem is from Enright and
Pryce (A.2.F1), and has discontinuities. We choose this problem to force a failure in the global
error estimation scheme, which requires some smoothness in y. We also request an initial relative
error tolerance which happens to be unsuitably small in this precision.
If the integration fails because of problems in global error assessment, the assessment option is
turned off, and the integration is restarted. If the integration fails because the requested accuracy is
not achievable, the tolerance is increased, and global error assessment is requested again. Error
assessment is turned back on because earlier assessment failures may have been due more to
an overly stringent tolerance than to a lack of smoothness in the derivatives.
When the integration is successful, the example prints the final relative error tolerance, and
indicates whether or not global error estimation was possible.
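\[
\begin{aligned}
y_1' &= y_2 \\
y_2' &= \begin{cases}
2 a y_2 - (\pi^2 + a^2)\, y_1 + 1, & \lfloor x \rfloor \text{ even} \\
2 a y_2 - (\pi^2 + a^2)\, y_1 - 1, & \lfloor x \rfloor \text{ odd}
\end{cases} \\
y_1(0) &= 0, \qquad y_2(0) = 0, \qquad a = 0.1
\end{aligned}
\]
where ⌊x⌋ = largest integer ≤ x.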
USE IMSL_LIBRARIES
!
!
!
!
!
!
IMPLICIT
INTEGER
PARAMETER
INTEGER
REAL
NONE
N
(N=2)
INTRINSIC
REAL
SQRT
SQRT
EXTERNAL
FCN
!
!
CALL UMACH (2, NOUT)
!
!
!
      LWORK = 100
      PREC  = AMACH(4)
      TOL   = SQRT(PREC)
      PARAM = 0.0E01
      THRES = TOL
      TEND  = 20.0E0
      PARAM(3) = 1
!
!
!
!
!
!
!
!
!
!
!
!
   10 CONTINUE
!                                 Set initial values
      T    = 0.0E0
      Y(1) = 0.0E0
      Y(2) = 0.0E0
      IDO  = 1
      CALL I2MRK (IDO, N, FCN, T, TEND, Y, YPRIME, TOL, THRES, PARAM,&
                  YMAX, RMSERR, WORK, LWORK)
      IF (IERCD() .EQ. 32) THEN
!                                 Unable to achieve requested
!                                 accuracy, so increase tolerance.
!                                 Activate global error assessment
         TOL = 10.0*TOL
         PARAM(3) = 1
         WRITE (NOUT,99995) TOL
         GO TO 10
      ELSE IF (IERCD() .EQ. 34) THEN
!                                 Global error assessment has failed,
!                                 cannot continue from this point,
!                                 so restart integration
         WRITE (NOUT,99996)
         PARAM(3) = 0
         GO TO 10
      END IF
!                                 Summarize status
!
99995 FORMAT (/, 'CHANGING TOLERANCE TO ', E9.3, ' AND RESTARTING ...'&
Output
*** FATAL
***
***
***
Y
-12.30
0.95
IVPAG
Solves an initial-value problem for ordinary differential equations using either Adams-Moulton's
or Gear's BDF method.
Required Arguments
IDO Flag indicating the state of the computation. (Input/Output)
    IDO   State
    1     Initial entry
    2     Normal re-entry
Normally, the initial call is made with IDO = 1. The routine then sets IDO = 2, and this
value is then used for all but the last call that is made with IDO = 3. This final call is
only used to release workspace, which was automatically allocated by the initial call
with IDO = 1. See Comment 5 for a description of the interrupts.
When IDO = 7, the matrix A at t must be recomputed and IVPAG/DIVPAG called again.
No other argument (including IDO) should be changed. This value of IDO is returned
only if PARAM(19) = 2.
FCN User-supplied SUBROUTINE to evaluate functions. The usage is
CALL FCN (N, T, Y, YPRIME), where
N Number of equations. (Input)
T Independent variable, t. (Input)
Y Array of size N containing the dependent variable values, y.
(Input)
YPRIME Array of size N containing the values of the vector y′
evaluated at (t, y). (Output)
See Comment 3.
FCN must be declared EXTERNAL in the calling program.
FCNJ User-supplied SUBROUTINE to compute the Jacobian. The usage is
CALL FCNJ (N, T, Y, DYPDY) where
N Number of equations. (Input)
T Independent variable, t. (Input)
Y Array of size N containing the dependent variable values, y(t).
(Input)
DYPDY An array, with data structure and type determined by
PARAM(14) = MTYPE, containing the required partial derivatives ∂fi/∂yj. (Output)
These derivatives are to be evaluated at the current values of (t, y). When the
Jacobian is dense, MTYPE = 0 or = 2, the leading dimension of DYPDY has the
value N. When the Jacobian matrix is banded, MTYPE = 1, and the leading
dimension of DYPDY has the value 2 * NLC + NUC + 1. If the matrix is banded
positive definite symmetric, MTYPE = 3, and the leading dimension of DYPDY has
the value NUC + 1.
FCNJ must be declared EXTERNAL in the calling program. If PARAM(19) = IATYPE is
nonzero, then FCNJ should compute the Jacobian of the right-hand side of the equation
Ay′ = f(t, y). The subroutine FCNJ is used only if PARAM(13) = MITER = 1.
T Independent variable, t. (Input/Output)
On input, T contains the initial independent variable value. On output, T is replaced by
TEND unless error or other normal conditions arise. See IDO for details.
TEND Value of t = tend where the solution is required. (Input)
The value tend may be less than the initial value of t.
Y Array of size NEQ of dependent variables, y(t). (Input/Output)
On input, Y contains the initial values, y(t0). On output, Y contains the approximate
solution, y(t).
Optional Arguments
NEQ Number of differential equations. (Input)
Default: NEQ = size (Y,1)
A Matrix structure used when the system is implicit. (Input)
The matrix A is referenced only if PARAM(19) = IATYPE is nonzero. Its data structure is
determined by PARAM(14) = MTYPE. The matrix A must be nonsingular and MITER
must be 1 or 2. See Comment 3.
TOL Tolerance for error control. (Input)
An attempt is made to control the norm of the local error such that the global error is
proportional to TOL.
Default: TOL = .001
PARAM A floating-point array of size 50 containing optional parameters. (Input/Output)
If a parameter is zero, then the default value is used. These default values are given
below. Parameters that concern values of the step size are applied in the direction of
integration. The following parameters may be set by the user:
    PARAM   Name     Meaning
    1       HINIT    Initial value of the step size H. Always nonnegative.
                     Default: 0.001|tend − t0|.
    2       HMIN
    3       HMAX
    4       MXSTEP
    5       MXFCN
    6       MAXORD
    7       INTRP1
    8       INTRP2
    9       SCALE
    10      INORM    $\sum_{i=1}^{\mathrm{NEQ}} e_i^2 / w_i^2$
    11      FLOOR
    12      METH
    13      MITER
    14      MTYPE    Matrix type for A (if used) and the Jacobian (if MITER = 1
                     or = 2). When both are used, A and the Jacobian must be of
                     the same type.
                     0 = MTYPE selects full matrices.
                     1 = MTYPE selects banded matrices.
                     2 = MTYPE selects symmetric positive definite matrices.
                     3 = MTYPE selects banded symmetric positive definite
                     matrices.
                     Default: 0.
    15      NLC
    16      NUC
    17               Not used.
    18      EPSJ
    19      IATYPE
    20      LDA      if MTYPE = 0 or = 2
                     if MTYPE = 1
                     if MTYPE = 3
    21-30            Not used.
The following entries in the array PARAM are set by the program:
    PARAM   Name     Meaning
    31      HTRIAL   Current trial step size.
    32      HMINC
    33      HMAXC
    34      NSTEP
    35      NFCN
    36      NJE
    37-50            Not used.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine IVPAG solves a system of first-order ordinary differential equations of the form
y′ = f (t, y) or Ay′ = f (t, y) with initial conditions, where A is a square nonsingular matrix of order
N. Two classes of implicit linear multistep methods are available. The first is the implicit
Adams-Moulton method (up to order twelve); the second uses the backward differentiation
formulas, BDF (up to order five). The BDF method is often called Gear's stiff method. In both
cases, because the basic formulas are implicit, a system of nonlinear equations must be solved at
each step. The derivative matrix in this system has the form L = A + ηJ, where η is a small number
computed by IVPAG and J is the Jacobian. When the Jacobian is used, it is computed in the
user-supplied routine FCNJ or else it is approximated by divided differences as a default. Using
defaults, A is the identity matrix. The data structure for the matrix L may be identified to be real
general, real banded, symmetric positive definite, or banded symmetric positive definite. The
default structure for L is real general.
Comments
1.
None of the additional array arguments should be changed from the first call with
IDO = 1 until after the final call with IDO = 3. The additional arguments are as follows:
YTEMP Array of size NMETH. (Workspace)
YMAX Array of size NEQ containing the maximum Y-values computed so far.
(Output)
ERROR Array of size NEQ containing error estimates for each component of Y.
(Output)
SAVE1 Array of size NEQ. (Workspace)
SAVE2 Array of size NEQ. (Workspace)
PW Array of size NPW. (Workspace)
IPVT Array of size NEQ. (Workspace)
VNORM A Fortran SUBROUTINE to compute the norm of the error. (Input)
The routine may be provided by the user, or the IMSL routine I3PRK/DI3PRK
may be used. In either case, the name must be declared in a Fortran EXTERNAL
statement. If usage of the IMSL routine is intended, then the name
I3PRK/DI3PRK should be specified. The usage of the error norm routine is
CALL VNORM (NEQ, V, Y, YMAX, ENORM) where
Arg.
Definition
NEQ
YMAX
ENORM
2.
Informational errors
Type
Code
4
4
4
2
3
Integration was halted after failing to pass the error test even after
dividing the initial step size by a factor of 1.0E + 10. The value TOL
may be too small.
Integration was halted after failing to achieve corrector convergence
even after dividing the initial step size by a factor of 1.0E + 10. The
value TOL may be too small.
IATYPE is nonzero and the input matrix A multiplying y is singular.
3.
Both explicit systems, of the form y′ = f (t, y), and implicit systems, Ay′ = f (t, y), can
be solved. If the system is explicit, then PARAM(19) = 0; and the matrix A is not
referenced. If the system is implicit, then PARAM(14) determines the data structure of
the array A. If PARAM(19) = 1, then A is assumed to be a constant matrix. The value of A
used on the first call (with IDO = 1) is saved until after a call with IDO = 3. The value
of A must not be changed between these calls.
If PARAM(19) = 2, then the matrix is assumed to be a function of t.
4.
5.
If PARAM(7) is nonzero, the subroutine returns with IDO = 4 and will resume calculation
at the point of interruption if re-entered with IDO = 4. If PARAM(8) is nonzero, the
subroutine will interrupt immediately after it decides whether to accept the result of the
most recent trial step. The value IDO = 5 is returned if the routine plans to accept, or IDO = 6
if it plans to reject. The value IDO may be changed by the user (by changing IDO from
6 to 5) to force acceptance of a step that would otherwise be rejected. Relevant
parameters to observe after return from an interrupt are IDO, HTRIAL, NSTEP, NFCN,
NJE, T and Y. The array Y contains the newly computed trial value y(t).
Example 1
Euler's equation for the motion of a rigid body not subject to external forces is
\[
\begin{aligned}
y_1' &= y_2 y_3, & y_1(0) &= 0 \\
y_2' &= -y_1 y_3, & y_2(0) &= 1 \\
y_3' &= -0.51\, y_1 y_2, & y_3(0) &= 1
\end{aligned}
\]
Its solution is, in terms of Jacobi elliptic functions, y1(t) = sn(t; k), y2(t) = cn(t; k), y3(t) = dn(t; k)
where k² = 0.51. The Adams-Moulton method of IVPAG is used to solve this system, since this is
the default. All parameters are set to defaults.
The last call to IVPAG with IDO = 3 releases IMSL workspace that was reserved on the first call to
IVPAG. It is not necessary to release the workspace in this example because the program ends after
solving a single problem. The call to release workspace is made as a model of what would be
needed if the program included further calls to IMSL routines.
Because PARAM(13) = MITER = 0, functional iteration is used and so subroutine FCNJ is never
called. It is included only because the calling sequence for IVPAG requires it.
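A quick way to sanity-check a computed solution of this system is through its two quadratic invariants, which follow from the elliptic-function identities sn² + cn² = 1 and k² sn² + dn² = 1. The derivation below is a worked illustration; the numerical check uses the rounded values printed in the first row of the Output table for this example.

```latex
\[
\frac{d}{dt}\left( y_1^2 + y_2^2 \right) = 2 y_1 (y_2 y_3) + 2 y_2 (-y_1 y_3) = 0,
\qquad
\frac{d}{dt}\left( 0.51\, y_1^2 + y_3^2 \right) = 2 (0.51)\, y_1 (y_2 y_3) + 2 y_3 (-0.51\, y_1 y_2) = 0
\]
\[
t = 1: \quad 0.80220^2 + 0.59705^2 \approx 0.99999,
\qquad 0.51\,(0.80220)^2 + 0.81963^2 \approx 0.99999
\]
```

Both invariants equal 1 at t = 0 from the initial conditions, so deviations from 1 at later output points give a rough measure of the accumulated error.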
      USE IVPAG_INT
      USE UMACH_INT
!
      IMPLICIT   NONE
      INTEGER    N, NPARAM
      PARAMETER  (N=3, NPARAM=50)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    IDO, IEND, NOUT
      REAL       A(1,1), T, TEND, TOL, Y(N)
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 SPECIFICATIONS FOR FUNCTIONS
      EXTERNAL   FCN, FCNJ
!                                 Initialize
      IDO  = 1
      T    = 0.0
      Y(1) = 0.0
      Y(2) = 1.0
      Y(3) = 1.0
      TOL  = 1.0E-6
!                                 Write title
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99998)
!                                 Integrate ODE
      IEND = 0
   10 CONTINUE
      IEND = IEND + 1
      TEND = IEND
Output
           T        Y(1)        Y(2)        Y(3)
     1.00000     0.80220     0.59705     0.81963
     2.00000     0.99537    -0.09615     0.70336
     3.00000     0.64141    -0.76720     0.88892
     4.00000    -0.26961    -0.96296     0.98129
     5.00000    -0.91173    -0.41079     0.75899
     6.00000    -0.95751     0.28841     0.72967
     7.00000    -0.42877     0.90342     0.95197
     8.00000     0.51092     0.85963     0.93106
     9.00000     0.97567     0.21926     0.71730
    10.00000     0.87790    -0.47884     0.77906
Additional Examples
Example 2
The BDF method of IVPAG is used to solve Example 2 of IVPRK. We set PARAM(12) = 2 to
designate the BDF method. A chord or modified Newton method, with the Jacobian computed by
divided differences, is used to solve the nonlinear equations. Thus, we set PARAM(13) = 2. The
number of evaluations of y′ is printed after the last output point, showing the efficiency gained
when using a stiff solver compared to using IVPRK on this problem. The number of evaluations
may vary, depending on the accuracy and other arithmetic characteristics of the computer.
      USE IVPAG_INT
      USE UMACH_INT
!
      IMPLICIT   NONE
      INTEGER    MXPARM, N
      PARAMETER  (MXPARM=50, N=2)
!                                 SPECIFICATIONS FOR PARAMETERS
      INTEGER    MABSE, MBDF, MSOLVE
      PARAMETER  (MABSE=1, MBDF=2, MSOLVE=2)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    IDO, ISTEP, NOUT
      REAL       A(1,1), PARAM(MXPARM), T, TEND, TOL, Y(N)
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 SPECIFICATIONS FOR FUNCTIONS
      EXTERNAL   FCN, FCNJ
!
      PARAM = 0.0E0
!
      PARAM(10) = MABSE
!
      PARAM(12) = MBDF
!
      PARAM(13) = MSOLVE
!                                 Print header
      WRITE (NOUT,99998)
      IDO = 1
      ISTEP = 0
   10 CONTINUE
      ISTEP = ISTEP + 24
      TEND = ISTEP
Output
    ISTEP     Time        Y1        Y2
        1   24.000     0.689     0.002
        2   48.000     0.636     0.002
        3   72.000     0.590     0.002
        4   96.000     0.550     0.002
        5  120.000     0.515     0.002
        6  144.000     0.485     0.002
        7  168.000     0.458     0.002
        8  192.000     0.434     0.001
        9  216.000     0.412     0.001
       10  240.000     0.392     0.001
    Number of fcn calls with IVPAG =    73.
Example 3
The BDF method of IVPAG is used to solve the so-called Robertson problem:
\[
\begin{aligned}
y_1' &= -c_1 y_1 + c_2 y_2 y_3, & y_1(0) &= 1 \\
y_2' &= -y_1' - y_3', & y_2(0) &= 0 \\
y_3' &= c_3 y_2^2, & y_3(0) &= 0 \\
c_1 &= 0.04, \quad c_2 = 10^4, \quad c_3 = 3 \times 10^7, & 0 \le t &\le 10
\end{aligned}
\]
Output is obtained after each unit of the independent variable. A user-provided subroutine for the
Jacobian matrix is used. An absolute error tolerance of 10⁻⁵ is required.
      USE IVPAG_INT
      USE UMACH_INT
!
      IMPLICIT   NONE
      INTEGER    MXPARM, N
      PARAMETER  (MXPARM=50, N=3)
!                                 SPECIFICATIONS FOR PARAMETERS
      INTEGER    MABSE, MBDF, MSOLVE
      PARAMETER  (MABSE=1, MBDF=2, MSOLVE=1)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    IDO, ISTEP, NOUT
      REAL       A(1,1), PARAM(MXPARM), T, TEND, TOL, Y(N)
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 SPECIFICATIONS FOR FUNCTIONS
      EXTERNAL   FCN, FCNJ
!
      PARAM(12) = MBDF
!                                 Print header
      WRITE (NOUT,99998)
      IDO = 1
      ISTEP = 0
   10 CONTINUE
      ISTEP = ISTEP + 1
      TEND = ISTEP
!
DYPDY(2,1) = -DYPDY(1,1)
DYPDY(2,2) = -DYPDY(1,2) - DYPDY(3,2)
DYPDY(2,3) = -DYPDY(1,3)
RETURN
END
Output
    ISTEP     Time         Y1         Y2         Y3
        1     1.00    0.96647    0.00003    0.03350
        2     2.00    0.94164    0.00003    0.05834
        3     3.00    0.92191    0.00002    0.07806
        4     4.00    0.90555    0.00002    0.09443
        5     5.00    0.89153    0.00002    0.10845
        6     6.00    0.87928    0.00002    0.12070
        7     7.00    0.86838    0.00002    0.13160
        8     8.00    0.85855    0.00002    0.14143
        9     9.00    0.84959    0.00002    0.15039
       10    10.00    0.84136    0.00002    0.15862
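When a Jacobian is supplied by hand, as in Example 3, it is worth checking it against finite differences before handing it to a stiff solver. The sketch below does this for the Robertson right-hand side without calling IMSL. It assumes the usual Robertson constants (c1 = 0.04, c2 = 10^4, c3 = 3·10^7) and dense storage; the module name, the double-precision kind, the test point, and the step size are choices made only for this illustration.

```fortran
module robertson
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Assumed constants of the standard Robertson problem.
  real(dp), parameter :: c1 = 0.04_dp, c2 = 1.0e4_dp, c3 = 3.0e7_dp
contains
  ! Right-hand side: y2' is defined through y1' and y3'.
  subroutine fcn(n, t, y, yprime)
    integer, intent(in) :: n
    real(dp), intent(in) :: t, y(n)
    real(dp), intent(out) :: yprime(n)
    yprime(1) = -c1*y(1) + c2*y(2)*y(3)
    yprime(3) = c3*y(2)**2
    yprime(2) = -yprime(1) - yprime(3)
  end subroutine fcn

  ! Dense analytic Jacobian; row 2 is built from rows 1 and 3.
  subroutine fcnj(n, t, y, dypdy)
    integer, intent(in) :: n
    real(dp), intent(in) :: t, y(n)
    real(dp), intent(out) :: dypdy(n,n)
    dypdy(1,1) = -c1
    dypdy(1,2) = c2*y(3)
    dypdy(1,3) = c2*y(2)
    dypdy(3,1) = 0.0_dp
    dypdy(3,2) = 2.0_dp*c3*y(2)
    dypdy(3,3) = 0.0_dp
    dypdy(2,1) = -dypdy(1,1)
    dypdy(2,2) = -dypdy(1,2) - dypdy(3,2)
    dypdy(2,3) = -dypdy(1,3)
  end subroutine fcnj
end module robertson

program check_jacobian
  use robertson
  implicit none
  real(dp), parameter :: h = 1.0e-6_dp
  real(dp) :: y(3), yp(3), ym(3), fp(3), fm(3), jac(3,3), jfd(3,3)
  integer :: j
  ! Arbitrary test point of roughly the size seen in the output table.
  y = [0.9_dp, 3.0e-5_dp, 0.1_dp]
  call fcnj(3, 0.0_dp, y, jac)
  do j = 1, 3
    yp = y
    ym = y
    yp(j) = y(j) + h
    ym(j) = y(j) - h
    call fcn(3, 0.0_dp, yp, fp)
    call fcn(3, 0.0_dp, ym, fm)
    ! Central difference approximation to column j of the Jacobian.
    jfd(:,j) = (fp - fm)/(2.0_dp*h)
  end do
  print '(a,es10.2)', ' max |J - J_fd| = ', maxval(abs(jac - jfd))
end program check_jacobian
```

Because the right-hand side is quadratic in y, central differences are exact apart from rounding, so any sizeable discrepancy points to a coding error in the analytic Jacobian.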
Example 4
Solve the partial differential equation
\[
e^{-t}\, \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}
\]
\[
\sum_{k=1}^{N} c_k(t)\, \phi_k(x)
\]
where φk(x) is the piecewise linear function that equals 1 at xk and is zero at all of the other
breakpoints. We approximate the partial differential equation by a system of N ordinary
differential equations, A dc/dt = Rc where A and R are matrices of order N. The matrix A is given
by
\[
A_{ij} = e^{-t} \int_0^{\pi} \phi_i(x)\, \phi_j(x)\, dx =
\begin{cases}
e^{-t}\, 2h/3 & \text{if } i = j \\
e^{-t}\, h/6 & \text{if } i = j \pm 1 \\
0 & \text{otherwise}
\end{cases}
\]
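\[
R_{ij} =
\begin{cases}
-2/h & \text{if } i = j \\
1/h & \text{if } i = j \pm 1 \\
0 & \text{otherwise}
\end{cases}
\]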
are assigned the values of the integrals on the right-hand side, by using the boundary values and
integration by parts. Because this system may be stiff, Gear's BDF method is used.
In the following program, the array Y(1:N) corresponds to the vector of coefficients, c. Note that Y
contains N + 2 elements; Y(0) and Y(N + 1) are used to store the boundary values. The matrix A
depends on t so we set PARAM(19) = 2 and evaluate A when IVPAG returns with IDO = 7. The
subroutine FCN computes the vector Rc, and the subroutine FCNJ computes R. The matrices A and
R are stored as band-symmetric positive-definite structures having one upper co-diagonal.
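The entries 2h/3 and h/6 of A can be checked directly. The short calculation below assumes a uniform mesh with spacing h and the usual hat functions, each rising linearly from 0 to 1 over one subinterval and falling back to 0 over the next; the local coordinate s is introduced here only for the calculation.

```latex
\[
\int \phi_i(x)^2\, dx = 2 \int_0^h \left( \frac{s}{h} \right)^2 ds = \frac{2h}{3},
\qquad
\int \phi_i(x)\, \phi_{i+1}(x)\, dx = \int_0^h \frac{s}{h} \left( 1 - \frac{s}{h} \right) ds
= \frac{h}{2} - \frac{h}{3} = \frac{h}{6}
\]
```

Hat functions whose indices differ by more than one have disjoint supports, so all other entries vanish. This is why A fits the banded symmetric positive definite storage with one upper codiagonal.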
      USE IVPAG_INT
      USE CONST_INT
      USE WRRRN_INT
      USE SSET_INT
!
      IMPLICIT   NONE
      INTEGER    LDA, N, NPARAM, NUC
      PARAMETER  (N=9, NPARAM=50, NUC=1, LDA=NUC+1)
!                                 SPECIFICATIONS FOR PARAMETERS
      INTEGER    NSTEP
      PARAMETER  (NSTEP=4)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    I, IATYPE, IDO, IMETH, INORM, ISTEP, MITER, MTYPE
      REAL       A(LDA,N), C, HINIT, PARAM(NPARAM), PI, T, TEND, TMAX, &
                 TOL, XPOINT(0:N+1), Y(0:N+1)
      CHARACTER  TITLE*10
!                                 SPECIFICATIONS FOR COMMON /COMHX/
      COMMON     /COMHX/ HX
      REAL       HX
!                                 SPECIFICATIONS FOR INTRINSICS
      INTRINSIC  EXP, REAL, SIN
      REAL       EXP, REAL, SIN
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 SPECIFICATIONS FOR FUNCTIONS
      EXTERNAL   FCN, FCNJ
!                                 Initialize PARAM
      HINIT  = 1.0E-3
      INORM  = 1
      IMETH  = 2
      MITER  = 1
      MTYPE  = 3
      IATYPE = 2
      PARAM  = 0.0E0
      PARAM(1)  = HINIT
      PARAM(10) = INORM
      PARAM(12) = IMETH
      PARAM(13) = MITER
      PARAM(14) = MTYPE
      PARAM(16) = NUC
      PARAM(19) = IATYPE
FOR ARGUMENTS
FOR LOCAL VARIABLES
FOR COMMON /COMHX/
FOR SUBROUTINES
EXTERNAL
SSCAL
!
YPRIME(1) = -2.0*Y(1) + Y(2)
DO 10 I=2, N - 1
YPRIME(I) = -2.0*Y(I) + Y(I-1) + Y(I+1)
10 CONTINUE
YPRIME(N) = -2.0*Y(N) + Y(N-1)
CALL SSCAL (N, 1.0/HX, YPRIME, 1)
RETURN
END
!
!
!
!
!
CALL SSET (N-1, 1.0/HX, DYPDY(1,2), 2)
CALL SSET (N, -2.0/HX, DYPDY(2,1), 2)
RETURN
END
Output
                                      U(T=0.250)
       1       2       3       4       5       6       7       8       9      10      11
  0.0000  0.2321  0.4414  0.6076  0.7142  0.7510  0.7142  0.6076  0.4414  0.2321  0.0000

                                      U(T=0.500)
       1       2       3       4       5       6       7       8       9      10      11
  0.0000  0.1607  0.3056  0.4206  0.4945  0.5199  0.4945  0.4206  0.3056  0.1607  0.0000

                                      U(T=0.750)
       1       2       3       4       5       6       7       8       9      10      11
  0.0000  0.1002  0.1906  0.2623  0.3084  0.3243  0.3084  0.2623  0.1906  0.1002  0.0000

                                      U(T=1.000)
       1       2       3       4       5       6       7       8       9      10      11
  0.0000  0.0546  0.1039  0.1431  0.1682  0.1768  0.1682  0.1431  0.1039  0.0546  0.0000
BVPFD
Solves a (parameterized) system of differential equations with boundary conditions at two points,
using a variable order, variable step size finite difference method with deferred corrections.
Required Arguments
FCNEQN User-supplied SUBROUTINE to evaluate derivatives. The usage is CALL
FCNEQN (N, T, Y, P, DYDT), where
N Number of differential equations. (Input)
T Independent variable, t. (Input)
Y Array of size N containing the dependent variable values, y(t).
(Input)
P Continuation parameter, p. (Input)
See Comment 3.
DYDT Array of size N containing the derivatives y′(t). (Output)
The name FCNEQN must be declared EXTERNAL in the calling program.
FCNJAC User-supplied SUBROUTINE to evaluate the Jacobian. The usage is CALL
FCNJAC (N, T, Y, P, DYPDY), where
N Number of differential equations. (Input)
T Independent variable, t. (Input)
Y Array of size N containing the dependent variable values. (Input)
P Continuation parameter, p. (Input)
See Comment 3.
DYPDY N by N array containing the partial derivatives ai,j = ∂fi/∂yj
evaluated at (t, y). The values ai,j are returned in DYPDY(i, j).
(Output)
The name FCNJAC must be declared EXTERNAL in the calling program.
FCNBC User-supplied SUBROUTINE to evaluate the boundary conditions. The usage is
CALL FCNBC (N, YLEFT, YRIGHT, P, H), where
N Number of differential equations. (Input)
YLEFT Array of size N containing the values of the dependent
variable at the left endpoint. (Input)
YRIGHT Array of size N containing the values of the dependent
variable at the right endpoint. (Input)
P Continuation parameter, p. (Input)
See Comment 3.
(Output)
See Comment 3.
DYPDP Array of size N containing the derivative of y′ with respect to p,
evaluated at (t, y).
(Output)
The name FCNPEQ must be declared EXTERNAL in the calling program.
FCNPBC User-supplied SUBROUTINE to evaluate the derivative of the boundary
conditions with respect to the parameter p. The usage is
CALL FCNPBC (N, YLEFT, YRIGHT, P, H), where
N Number of differential equations. (Input)
YLEFT Array of size N containing the values of the dependent
variable at the left endpoint. (Input)
YRIGHT Array of size N containing the values of the dependent
variable at the right endpoint. (Input)
P Continuation parameter, p. (Input)
See Comment 3.
H Array of size N containing the derivative of fi with respect to p.
(Output)
Optional Arguments
N Number of differential equations. (Input)
Default: N = size (YINIT,1).
NINIT Number of initial grid points, including the endpoints. (Input)
It must be at least 4.
Default: NINIT = size (TINIT,1).
LDYINI Leading dimension of YINIT exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDYINI = size (YINIT,1).
PRINT Logical .TRUE. if intermediate output is to be printed. (Input)
Default: PRINT = .FALSE.
LDYFIN Leading dimension of YFINAL exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDYFIN = size (YFINAL,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BVPFD is based on the subprogram PASVA3 by M. Lentini and V. Pereyra (see Pereyra
1978). The basic discretization is the trapezoidal rule over a nonuniform mesh. This mesh is
chosen adaptively, to make the local error approximately the same size everywhere. Higher-order
discretizations are obtained by deferred corrections. Global error estimates are produced to control
the computation. The resulting nonlinear algebraic system is solved by Newton's method with step
control. The linearized system of equations is solved by a special form of Gauss elimination that
preserves the sparseness.
Comments
1.
2.
Informational errors
    Type   Code
    4      1      More than MXGRID grid points are needed to solve the problem.
    4      2      Newton's method diverged.
    3      3      Newton's method reached roundoff error level.
3.
If the value of PISTEP is greater than zero, then the routine BVPFD assumes that the
user has embedded the problem into a one-parameter family of problems:
\[
y' = y'(t, y, p), \qquad h(y_{t_{left}}, y_{t_{right}}, p) = 0
\]
such that for p = 0 the problem is simple. For p = 1, the original problem is recovered.
The routine BVPFD automatically attempts to increment from p = 0 to p = 1. The value
PISTEP is the beginning increment used in this continuation. The increment will
usually be changed by routine BVPFD, but an arbitrary minimum of 0.01 is imposed.
4.
5.
Example 1
This example solves the third-order linear equation
\[
y''' - 2 y'' + y' - y = \sin t
\]
subject to the boundary conditions y(0) = y(2π) and y′(0) = y′(2π) = 1. (Its solution is y = sin t.) To
use BVPFD, the problem is reduced to a system of first-order equations by defining
y1 = y, y2 = y′ and y3 = y″. The resulting system is
\[
\begin{aligned}
y_1' &= y_2, & y_2(0) - 1 &= 0 \\
y_2' &= y_3, & y_1(0) - y_1(2\pi) &= 0 \\
y_3' &= 2 y_3 - y_2 + y_1 + \sin t, & y_2(2\pi) - 1 &= 0
\end{aligned}
\]
Note that there is one boundary condition at the left endpoint t = 0 and one boundary condition
coupling the left and right endpoints. The final boundary condition is at the right endpoint. The
total number of boundary conditions must be the same as the number of equations (in this case 3).
Note that since the parameter p is not used in the call to BVPFD, the routines FCNPEQ and FCNPBC
are not needed. Therefore, in the call to BVPFD, FCNEQN and FCNBC were used in place of FCNPEQ
and FCNPBC.
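Two quick checks on this example can be done by hand. The first confirms that y = sin t satisfies the third-order equation; the second writes out the Jacobian of the first-order system, which is constant because the problem is linear. Both are worked illustrations based only on the equations of this example.

```latex
\[
y = \sin t: \quad y''' - 2y'' + y' - y = -\cos t + 2 \sin t + \cos t - \sin t = \sin t
\]
\[
\frac{\partial f}{\partial y} =
\begin{pmatrix}
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & -1 & 2
\end{pmatrix},
\qquad f = \left( y_2,\; y_3,\; 2 y_3 - y_2 + y_1 + \sin t \right)^T
\]
```

The matrix entries are the values a subroutine such as FCNJAC has to return in DYPDY(i, j) for this system.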
USE BVPFD_INT
USE UMACH_INT
USE CONST_INT
!
IMPLICIT
INTEGER
PARAMETER
INTEGER
REAL
NONE
SPECIFICATIONS FOR PARAMETERS
LDYFIN, LDYINI, MXGRID, NEQNS, NINIT
(MXGRID=45, NEQNS=3, NINIT=10, LDYFIN=NEQNS, &
LDYINI=NEQNS)
SPECIFICATIONS FOR LOCAL VARIABLES
I, J, NCUPBC, NFINAL, NLEFT, NOUT
ERREST(NEQNS), PISTEP, TFINAL(MXGRID), TINIT(NINIT), &
!
!
!
!
!
10
!
!
99997
99998
99999
!
!
!
      INTEGER    NEQNS
      REAL       T, P, Y(NEQNS), DYPDY(NEQNS,NEQNS)
!                                 Define d(DYDX)/dY
      DYPDY(1,1) = 0.0
      DYPDY(1,2) = 1.0
      DYPDY(1,3) = 0.0
      DYPDY(2,1) = 0.0
      DYPDY(2,2) = 0.0
      DYPDY(2,3) = 1.0
      DYPDY(3,1) = 1.0
      DYPDY(3,2) = -1.0
      DYPDY(3,3) = 2.0
      RETURN
      END
      SUBROUTINE FCNBC (NEQNS, YLEFT, YRIGHT, P, F)
!                                 SPECIFICATIONS FOR ARGUMENTS
      INTEGER    NEQNS
      REAL       P, YLEFT(NEQNS), YRIGHT(NEQNS), F(NEQNS)
!                                 Define boundary conditions
      F(1) = YLEFT(2) - 1.0
      F(2) = YLEFT(1) - YRIGHT(1)
      F(3) = YRIGHT(2) - 1.0
      RETURN
      END
Output
    I        T               Y1              Y2              Y3
    1    0.000000E+00   -1.123191E-04    1.000000E+00    6.242319E-05
    2    3.490659E-01    3.419107E-01    9.397087E-01   -3.419580E-01
    3    6.981317E-01    6.426908E-01    7.660918E-01   -6.427230E-01
    4    1.396263E+00    9.847531E-01    1.737333E-01   -9.847453E-01
    5    2.094395E+00    8.660529E-01   -4.998747E-01   -8.660057E-01
    6    2.792527E+00    3.421830E-01   -9.395474E-01   -3.420648E-01
    7    3.490659E+00   -3.417234E-01   -9.396111E-01    3.418948E-01
    8    4.188790E+00   -8.656880E-01   -5.000588E-01    8.658733E-01
    9    4.886922E+00   -9.845794E-01    1.734571E-01    9.847518E-01
   10    5.585054E+00   -6.427721E-01    7.658258E-01    6.429526E-01
   11    5.934120E+00   -3.420819E-01    9.395434E-01    3.423986E-01
   12    6.283185E+00   -1.123186E-04    1.000000E+00    6.743190E-04
   Error estimates       2.840430E-04    1.792939E-04    5.588399E-04
Additional Examples
Example 2
In this example, the following nonlinear problem is solved:
\[
y'' - y^3 + (1 + \sin^2 t) \sin t = 0
\]
with y(0) = y(π) = 0. Its solution is y = sin t. As in Example 1, this equation is reduced to a system
of first-order differential equations by defining y1 = y and y2 = y′. The resulting system is
\[
\begin{aligned}
y_1' &= y_2, & y_1(0) &= 0 \\
y_2' &= y_1^3 - (1 + \sin^2 t) \sin t, & y_1(\pi) &= 0
\end{aligned}
\]
In this problem, there is one boundary condition at the left endpoint and one at the right endpoint;
there are no coupled boundary conditions.
Note that since the parameter p is not used, in the call to BVPFD the routines FCNPEQ and FCNPBC
are not needed. Therefore, in the call to BVPFD, FCNEQN and FCNBC were used in place of FCNPEQ
and FCNPBC.
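Because the exact solution y = sin t is known for this example, the derivative routine can be checked on its own before it is passed to a solver. The sketch below, which does not call IMSL, evaluates the same expressions as the FCNEQN listing for this example along the exact solution and reports the largest mismatch with (y1′, y2′) = (cos t, −sin t). The module name, the sample points, and the driver are assumptions made for this illustration.

```fortran
module bvp_example2
  implicit none
contains
  ! Same right-hand side as the FCNEQN listing for this example.
  subroutine fcneqn(neqns, t, y, p, dydt)
    integer, intent(in) :: neqns
    real, intent(in) :: t, p, y(neqns)
    real, intent(out) :: dydt(neqns)
    dydt(1) = y(2)
    dydt(2) = y(1)**3 - sin(t)*(1.0 + sin(t)**2)
  end subroutine fcneqn
end module bvp_example2

program check_fcneqn
  use bvp_example2
  implicit none
  integer :: i
  real :: t, y(2), dydt(2), worst
  worst = 0.0
  ! Sample 11 equally spaced points on [0, pi].
  do i = 0, 10
    t = 3.141593*real(i)/10.0
    ! Exact solution y1 = sin t, y2 = cos t.
    y = [sin(t), cos(t)]
    call fcneqn(2, t, y, 0.0, dydt)
    ! Along the exact solution, y1' = cos t and y2' = -sin t.
    worst = max(worst, abs(dydt(1) - cos(t)), abs(dydt(2) + sin(t)))
  end do
  print '(a,es10.2)', ' largest residual = ', worst
end program check_fcneqn
```

A residual at the level of single-precision rounding indicates that the sign conventions in the routine agree with the reduced first-order system.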
USE BVPFD_INT
USE UMACH_INT
USE CONST_INT
IMPLICIT
!
NONE
!
!
!
      WRITE (NOUT,99997)
      WRITE (NOUT,99998) (I,TFINAL(I),(YFINAL(J,I),J=1,NEQNS),I=1, &
                        NFINAL)
      WRITE (NOUT,99999) (ERREST(J),J=1,NEQNS)
99997 FORMAT (4X, 'I', 7X, 'T', 14X, 'Y1', 13X, 'Y2')
99998 FORMAT (I5, 1P3E15.6)
99999 FORMAT (' Error estimates', 4X, 1P2E15.6)
      END
      SUBROUTINE FCNEQN (NEQNS, T, Y, P, DYDT)
!                                 SPECIFICATIONS FOR ARGUMENTS
      INTEGER    NEQNS
      REAL       T, P, Y(NEQNS), DYDT(NEQNS)
!                                 SPECIFICATIONS FOR INTRINSICS
      INTRINSIC  SIN
      REAL       SIN
!                                 Define PDE
      DYDT(1) = Y(2)
      DYDT(2) = Y(1)**3 - SIN(T)*(1.0+SIN(T)**2)
      RETURN
      END
      SUBROUTINE FCNJAC (NEQNS, T, Y, P, DYPDY)
!                                 SPECIFICATIONS FOR ARGUMENTS
      INTEGER    NEQNS
      REAL       T, P, Y(NEQNS), DYPDY(NEQNS,NEQNS)
!                                 Define d(DYDT)/dY
      DYPDY(1,1) = 0.0
      DYPDY(1,2) = 1.0
      DYPDY(2,1) = 3.0*Y(1)**2
      DYPDY(2,2) = 0.0
      RETURN
      END
      SUBROUTINE FCNBC (NEQNS, YLEFT, YRIGHT, P, F)
!                                 SPECIFICATIONS FOR ARGUMENTS
      INTEGER    NEQNS
      REAL       P, YLEFT(NEQNS), YRIGHT(NEQNS), F(NEQNS)
!                                 Define boundary conditions
      F(1) = YLEFT(1)
      F(2) = YRIGHT(1)
      RETURN
      END
Output
    I       T              Y1             Y2
    1   0.000000E+00   0.000000E+00   9.999277E-01
    2   2.855994E-01   2.817682E-01   9.594315E-01
    3   5.711987E-01   5.406458E-01   8.412407E-01
    4   8.567980E-01   7.557380E-01   6.548904E-01
    5   1.142397E+00   9.096186E-01   4.154530E-01
    6   1.427997E+00   9.898143E-01   1.423307E-01
    7   1.713596E+00   9.898143E-01  -1.423307E-01
    8   1.999195E+00   9.096185E-01  -4.154530E-01
    9   2.284795E+00   7.557380E-01  -6.548903E-01
   10   2.570394E+00   5.406460E-01  -8.412405E-01
   11   2.855994E+00   2.817683E-01  -9.594313E-01
   12   3.141593E+00   0.000000E+00  -9.999274E-01
 Error estimates       3.906105E-05   7.124186E-05
Example 3
In this example, the following nonlinear problem is solved:
$y'' - y^3 = \frac{40}{9}\left(t - \frac{1}{2}\right)^{2/3} - \left(t - \frac{1}{2}\right)^{8}$
with $y(0) = y(1) = \pi/2$. As in the previous examples, this equation is reduced to a system of
first-order differential equations by defining $y_1 = y$ and $y_2 = y'$. The resulting system is
$y_1' = y_2$, $y_1(0) = \pi/2$
$y_2' = y_1^3 + \frac{40}{9}\left(t - \frac{1}{2}\right)^{2/3} - \left(t - \frac{1}{2}\right)^{8}$, $y_1(1) = \pi/2$
The problem is embedded in a family of problems by introducing the parameter p and by changing
the second differential equation to
$y_2' = p\,y_1^3 + \frac{40}{9}\left(t - \frac{1}{2}\right)^{2/3} - \left(t - \frac{1}{2}\right)^{8}$
At p = 0, the problem is linear; and at p = 1, the original problem is recovered. The derivatives
$\partial y'/\partial p$ must now be specified in the subroutine FCNPEQ. The derivatives $\partial f/\partial p$ are zero in FCNPBC.
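The derivative that FCNPEQ has to return can be read off the parameterized equation: only the second component of the right-hand side depends on p, and its derivative with respect to p is y1**3. The following self-contained sketch is not the manual's listing. The program name, the test point, and the use of internal procedures are illustrative assumptions, and no IMSL routine is called. It compares the formula with a central difference in p, using the same right-hand side as FCNEQN in this example.

```fortran
program check_fcnpeq
   implicit none
   integer, parameter :: neqns = 2
   real :: t, p, dp
   real :: y(neqns), fm(neqns), fp(neqns), dypdp(neqns)

   ! Arbitrary test point (an assumption; any point would do).
   t  = 0.25
   p  = 0.5
   dp = 0.125
   y  = (/ 1.2, -0.4 /)

   call fcnpeq (neqns, t, y, p, dypdp)
   call fcneqn (neqns, t, y, p - dp, fm)
   call fcneqn (neqns, t, y, p + dp, fp)

   print '(a, 2es14.6)', " formula for d(y')/dp : ", dypdp
   print '(a, 2es14.6)', " central difference   : ", (fp - fm)/(2.0*dp)

contains

   ! Right-hand side of the parameterized system, as in the example.
   subroutine fcneqn (neqns, t, y, p, dydt)
      integer, intent(in) :: neqns
      real, intent(in)    :: t, p, y(neqns)
      real, intent(out)   :: dydt(neqns)
      dydt(1) = y(2)
      dydt(2) = p*y(1)**3 + 40./9.*((t-0.5)**2)**(1./3.) - (t-0.5)**8
   end subroutine fcneqn

   ! Partial derivative of the right-hand side with respect to p.
   subroutine fcnpeq (neqns, t, y, p, dypdp)
      integer, intent(in) :: neqns
      real, intent(in)    :: t, p, y(neqns)
      real, intent(out)   :: dypdp(neqns)
      dypdp(1) = 0.0
      dypdp(2) = y(1)**3
   end subroutine fcnpeq

end program check_fcnpeq
```

Because p enters the right-hand side linearly, the two printed rows should agree to within rounding error.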
USE BVPFD_INT
USE UMACH_INT
!
IMPLICIT   NONE
!                                 SPECIFICATIONS FOR PARAMETERS
INTEGER    LDYFIN, LDYINI, MXGRID, NEQNS, NINIT
PARAMETER  (MXGRID=45, NEQNS=2, NINIT=5, LDYFIN=NEQNS, &
           LDYINI=NEQNS)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
INTEGER    NCUPBC, NFINAL, NLEFT, NOUT
REAL       ERREST(NEQNS), PISTEP, TFINAL(MXGRID), TLEFT, TOL, &
           XRIGHT, YFINAL(LDYFIN,MXGRID)
LOGICAL    LINEAR, PRINT
!                                 SPECIFICATIONS FOR SAVE VARIABLES
INTEGER    I, J
REAL       TINIT(NINIT), YINIT(LDYINI,NINIT)
SAVE       I, J, TINIT, YINIT
!                                 SPECIFICATIONS FOR FUNCTIONS
EXTERNAL   FCNBC, FCNEQN, FCNJAC, FCNPBC, FCNPEQ
!
NCUPBC = 0
TOL    = .001
TLEFT  = 0.0
XRIGHT = 1.0
PISTEP = 0.1
PRINT  = .FALSE.
LINEAR = .FALSE.
!
CALL BVPFD (FCNEQN, FCNJAC, FCNBC, FCNPEQ, FCNPBC, NLEFT, &
           NCUPBC, TLEFT, XRIGHT, PISTEP, TOL, TINIT, &
           YINIT, LINEAR, MXGRID, NFINAL, TFINAL, YFINAL, ERREST)
!                                 Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99997)
WRITE (NOUT,99998) (I,TFINAL(I),(YFINAL(J,I),J=1,NEQNS),I=1, &
NFINAL)
WRITE (NOUT,99999) (ERREST(J),J=1,NEQNS)
99997 FORMAT (4X, 'I', 7X, 'T', 14X, 'Y1', 13X, 'Y2')
99998 FORMAT (I5, 1P3E15.6)
99999 FORMAT (' Error estimates', 4X, 1P2E15.6)
END
SUBROUTINE FCNEQN (NEQNS, T, Y, P, DYDT)
!                                 SPECIFICATIONS FOR ARGUMENTS
INTEGER    NEQNS
REAL       T, P, Y(NEQNS), DYDT(NEQNS)
!                                 Define PDE
DYDT(1) = Y(2)
DYDT(2) = P*Y(1)**3 + 40./9.*((T-0.5)**2)**(1./3.) - (T-0.5)**8
RETURN
END
SUBROUTINE FCNJAC (NEQNS, T, Y, P, DYPDY)
!                                 SPECIFICATIONS FOR ARGUMENTS
INTEGER    NEQNS
REAL       T, P, Y(NEQNS), DYPDY(NEQNS,NEQNS)
!                                 Define d(DYDT)/dY
DYPDY(1,1) = 0.0
DYPDY(1,2) = 1.0
DYPDY(2,1) = P*3.*Y(1)**2
DYPDY(2,2) = 0.0
RETURN
END
SUBROUTINE FCNBC (NEQNS, YLEFT, YRIGHT, P, F)
USE CONST_INT
!                                 SPECIFICATIONS FOR ARGUMENTS
INTEGER    NEQNS
REAL       P, YLEFT(NEQNS), YRIGHT(NEQNS), F(NEQNS)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
REAL       PI
!                                 Define boundary conditions
PI = CONST('PI')
F(1) = YLEFT(1) - PI/2.0
F(2) = YRIGHT(1) - PI/2.0
RETURN
END
SUBROUTINE FCNPEQ (NEQNS, T, Y, P, DYPDP)
!
!
!
Output
    I       T              Y1             Y2
    1   0.000000E+00   1.570796E+00  -1.949336E+00
    2   4.444445E-02   1.490495E+00  -1.669567E+00
    3   8.888889E-02   1.421951E+00  -1.419465E+00
    4   1.333333E-01   1.363953E+00  -1.194307E+00
    5   2.000000E-01   1.294526E+00  -8.958461E-01
    6   2.666667E-01   1.243628E+00  -6.373191E-01
    7   3.333334E-01   1.208785E+00  -4.135206E-01
    8   4.000000E-01   1.187783E+00  -2.219351E-01
    9   4.250000E-01   1.183038E+00  -1.584200E-01
   10   4.500000E-01   1.179822E+00  -9.973146E-02
   11   4.625000E-01   1.178748E+00  -7.233893E-02
   12   4.750000E-01   1.178007E+00  -4.638248E-02
   13   4.812500E-01   1.177756E+00  -3.399763E-02
   14   4.875000E-01   1.177582E+00  -2.205547E-02
   15   4.937500E-01   1.177480E+00  -1.061177E-02
   16   5.000000E-01   1.177447E+00  -1.479182E-07
   17   5.062500E-01   1.177480E+00   1.061153E-02
   18   5.125000E-01   1.177582E+00   2.205518E-02
   19   5.187500E-01   1.177756E+00   3.399727E-02
   20   5.250000E-01   1.178007E+00   4.638219E-02
   21   5.375000E-01   1.178748E+00   7.233876E-02
   22   5.500000E-01   1.179822E+00   9.973124E-02
   23   5.750000E-01   1.183038E+00   1.584199E-01
   24   6.000000E-01   1.187783E+00   2.219350E-01
   25   6.666667E-01   1.208786E+00   4.135205E-01
   26   7.333333E-01   1.243628E+00   6.373190E-01
   27   8.000000E-01   1.294526E+00   8.958461E-01
   28   8.666667E-01   1.363953E+00   1.194307E+00
   29   9.111111E-01   1.421951E+00   1.419465E+00
   30   9.555556E-01   1.490495E+00   1.669566E+00
   31   1.000000E+00   1.570796E+00   1.949336E+00
 Error estimates       3.448358E-06   5.549869E-05
BVPMS
Solves a (parameterized) system of differential equations with boundary conditions at two points,
using a multiple-shooting method.
Required Arguments
FCNEQN User-supplied SUBROUTINE to evaluate derivatives. The usage is CALL
FCNEQN (NEQNS, T, Y, P, DYDT), where
NEQNS Number of equations. (Input)
T Independent variable, t. (Input)
Y Array of length NEQNS containing the dependent variable. (Input)
P Continuation parameter used in solving highly nonlinear problems. (Input)
See Comment 4.
DYDT Array of length NEQNS containing y′ at T. (Output)
Optional Arguments
NEQNS Number of differential equations. (Input)
DTOL Differential equation error tolerance. (Input)
An attempt is made to control the local error in such a way that the global error is
proportional to DTOL.
Default: DTOL = 1.0e-4.
BTOL Boundary condition error tolerance. (Input)
The computed solution satisfies the boundary conditions, within BTOL tolerance.
Default: BTOL = 1.0e-4.
MAXIT Maximum number of Newton iterations allowed. (Input)
Iteration stops if convergence is achieved sooner. Suggested values are MAXIT = 2 for
linear problems and MAXIT = 9 for nonlinear problems.
Default: MAXIT = 9.
NINIT Number of shooting points supplied by the user. (Input)
It may be 0. A suggested value for the number of shooting points is 10.
Default: NINIT = 0.
TINIT Vector of length NINIT containing the shooting points supplied by the user.
(Input)
If NINIT = 0, then TINIT is not referenced and the routine chooses all of the shooting
points. This automatic selection of shooting points may be expensive and should only
be used for linear problems. If NINIT is nonzero, then the points must be an increasing
sequence with TINIT(1) = TLEFT and TINIT(NINIT) = TRIGHT. By default, TINIT is
not used.
YINIT Array of size NEQNS by NINIT containing an initial guess for the values of Y at the
points in TINIT. (Input)
YINIT is not referenced if NINIT = 0. By default, YINIT is not used.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Define N = NEQNS, M = NFINAL, $t_a$ = TLEFT and $t_b$ = TRIGHT. The routine BVPMS uses a multiple-shooting
technique to solve the differential equation system $y' = f(t, y)$ with boundary conditions
of the form
$h_k(y_1(t_a), \ldots, y_N(t_a), y_1(t_b), \ldots, y_N(t_b)) = 0$
for $k = 1, \ldots, N$.
A modified version of IVPRK is used to compute the initial-value problem at each shot. If there
are M shooting points (including the endpoints $t_a$ and $t_b$), then a system of NM simultaneous
nonlinear equations must be solved. Newton's method is used to solve this system, which has a
Jacobian matrix with a periodic band structure. Evaluation of the NM functions and the
NM × NM (almost banded) Jacobian for one iteration of Newton's method is accomplished in one
pass from $t_a$ to $t_b$ of the modified IVPRK, operating on a system of N(N + 1) differential equations.
For most problems, the total amount of work should not be highly dependent on M. Multiple
shooting avoids many of the serious ill-conditioning problems that plague simple shooting
methods. For more details on the algorithm, see Sewell (1982).
The boundary functions should be scaled so that all components hk are of comparable magnitude
since the absolute error in each is controlled.
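The count of NM equations can be made concrete with the usual textbook form of multiple shooting. The notation below ($s_i$ for the unknown state at shooting point $t_i$, and $y(t;\,t_i,s_i)$ for the initial-value solution started there) is introduced here for illustration only; the exact arrangement used inside BVPMS is described in Sewell (1982) and may differ in detail.

```latex
% Shooting points: t_a = t_1 < t_2 < \dots < t_M = t_b.
% Unknowns: s_i \in \mathbb{R}^N, an approximation to y(t_i), i = 1, \ldots, M.
% y(t; t_i, s_i) solves y' = f(t, y) with initial value y(t_i) = s_i.
\begin{aligned}
y(t_{i+1};\, t_i, s_i) - s_{i+1} &= 0, \qquad i = 1, \ldots, M-1
   && \text{(continuity: } N(M-1) \text{ equations)} \\
h_k(s_1, s_M) &= 0, \qquad k = 1, \ldots, N
   && \text{(boundary conditions: } N \text{ equations)}
\end{aligned}
```

Together these give N(M - 1) + N = NM equations in the NM unknowns. Each continuity equation couples only two neighboring shooting points, and the boundary conditions couple the first and last, which is consistent with the periodic band structure of the Jacobian mentioned above.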
Comments
1.
CALL B2PMS (FCNEQN, FCNJAC, FCNBC, NEQNS, TLEFT, TRIGHT, DTOL, BTOL,
MAXIT, NINIT, TINIT, YINIT, LDYINI, NMAX, NFINAL, TFINAL, YFINAL, LDYFIN,
WORK, IWK)
2.
Informational errors
Type
Code
1
4
4
2
3
3.
4.
5.
Example
The differential equations that model an elastic beam are (see Washizu 1968, pages 142-143):
$M_{xx} - \frac{NM}{EI} + L(x) = 0$
$EI\,W_{xx} + M = 0$
$EA_0\left(U_x + W_x^2/2\right) - N = 0$
$N_x = 0$
where U is the axial displacement, W is the transverse displacement, N is the axial force, M is the
bending moment, E is the elastic modulus, I is the moment of inertia, $A_0$ is the cross-sectional
area, and L(x) is the transverse load.
Assume we have a clamped cylindrical beam of radius 0.1 in, a length of 10 in, and an elastic
modulus $E = 10.6 \times 10^6$ lb/in$^2$. Then, $I = 0.784 \times 10^{-4}$, and $A_0 = \pi \times 10^{-2}$ in$^2$, and the boundary
conditions are $U = W = W_x = 0$ at each end. If we let $y_1 = U$, $y_2 = N/EA_0$, $y_3 = W$, $y_4 = W_x$,
$y_5 = M/EI$, and $y_6 = M_x/EI$, then the above nonlinear equations can be written as a system of six
first-order equations.
$y_1' = y_2 - \frac{y_4^2}{2}$
$y_2' = 0$
$y_3' = y_4$
$y_4' = -y_5$
$y_5' = y_6$
$y_6' = \frac{A_0\, y_2\, y_5}{I} - \frac{L(x)}{EI}$
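A quick way to guard against sign errors in a hand-coded Jacobian such as FCNJAC is to compare it with central differences of the right-hand side. The sketch below is self-contained and calls no IMSL routine. It assumes that the continuation parameter P multiplies the two nonlinear terms (this is inferred from the P factors in the FCNJAC listing of this example), and the load value, the test point, and the program name are hypothetical choices made only for the check.

```fortran
program check_beam_jacobian
   implicit none
   integer, parameter :: wp = kind(1.0d0)
   integer, parameter :: neqns = 6
   ! Material parameters as set in the example program.
   real(wp), parameter :: a0 = 3.14e-2_wp, a1 = 0.784e-4_wp, e = 10.6e6_wp
   ! Hypothetical constant transverse load, used only for this check.
   real(wp), parameter :: load = 2.0_wp
   real(wp) :: x, p, h, errmax
   real(wp) :: y(neqns), yp(neqns), ym(neqns), fp(neqns), fm(neqns)
   real(wp) :: dypdy(neqns,neqns)
   integer  :: i, j

   x = 5.0_wp
   p = 1.0_wp
   h = 1.0e-4_wp
   y = (/ 1.0e-3_wp, 2.0e-3_wp, -1.0e-2_wp, 3.0e-2_wp, 4.0e-2_wp, -5.0e-2_wp /)

   call fcnjac (x, y, p, dypdy)

   ! Compare each Jacobian column with a central difference of FCNEQN.
   errmax = 0.0_wp
   do j = 1, neqns
      yp = y
      ym = y
      yp(j) = y(j) + h
      ym(j) = y(j) - h
      call fcneqn (x, yp, p, fp)
      call fcneqn (x, ym, p, fm)
      do i = 1, neqns
         errmax = max(errmax, abs((fp(i) - fm(i))/(2.0_wp*h) - dypdy(i,j)))
      end do
   end do
   print '(a, es12.4)', ' largest Jacobian discrepancy: ', errmax

contains

   ! Six first-order beam equations (assumed placement of P).
   subroutine fcneqn (x, y, p, dydx)
      real(wp), intent(in)  :: x, p, y(neqns)
      real(wp), intent(out) :: dydx(neqns)
      dydx(1) = y(2) - 0.5_wp*p*y(4)**2
      dydx(2) = 0.0_wp
      dydx(3) = y(4)
      dydx(4) = -y(5)
      dydx(5) = y(6)
      dydx(6) = p*a0*y(2)*y(5)/a1 - load/(e*a1)
   end subroutine fcneqn

   ! Partial derivatives, entry by entry as in the FCNJAC listing.
   subroutine fcnjac (x, y, p, dypdy)
      real(wp), intent(in)  :: x, p, y(neqns)
      real(wp), intent(out) :: dypdy(neqns,neqns)
      dypdy = 0.0_wp
      dypdy(1,2) = 1.0_wp
      dypdy(1,4) = -p*y(4)
      dypdy(3,4) = 1.0_wp
      dypdy(4,5) = -1.0_wp
      dypdy(5,6) = 1.0_wp
      dypdy(6,2) = p*y(5)*a0/a1
      dypdy(6,5) = p*y(2)*a0/a1
   end subroutine fcnjac

end program check_beam_jacobian
```

Since the right-hand side is at most quadratic in y, central differences are exact apart from rounding, so the printed discrepancy should be tiny compared with the Jacobian entries themselves.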
!
IMPLICIT   NONE
INTEGER    LDY, NEQNS, NMAX
PARAMETER  (NEQNS=6, NMAX=21, LDY=NEQNS)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
INTEGER    I, MAXIT, NFINAL, NINIT, NOUT
REAL       TOL, X(NMAX), XLEFT, XRIGHT, Y(LDY,NMAX)
!                                 SPECIFICATIONS FOR COMMON /PARAM/
COMMON     /PARAM/ A0, A1, E
REAL       A0, A1, E
!                                 SPECIFICATIONS FOR INTRINSICS
INTRINSIC  REAL
REAL       REAL
!                                 SPECIFICATIONS FOR SUBROUTINES
EXTERNAL   FCNBC, FCNEQN, FCNJAC
!                                 Set material parameters
A0 = 3.14E-2
A1 = 0.784E-4
E  = 10.6E6
!
!
!
!
!
!
!
!
XLEFT = 0.0
XRIGHT = 10.0
MAXIT = 19
NINIT = NMAX
Y = 0.0E0
F(4) = YRIGHT(1)
F(5) = YRIGHT(3)
F(6) = YRIGHT(4)
RETURN
END
SUBROUTINE FCNJAC (NEQNS, X, Y, P, DYPDY)
!                                 SPECIFICATIONS FOR ARGUMENTS
INTEGER    NEQNS
REAL       X, P, Y(NEQNS), DYPDY(NEQNS,NEQNS)
!                                 SPECIFICATIONS FOR COMMON /PARAM/
COMMON     /PARAM/ A0, A1, E
REAL       A0, A1, E
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 Define partials, d(DYDX)/dY
DYPDY = 0.0E0
DYPDY(1,2) = 1.0
DYPDY(1,4) = -P*Y(4)
DYPDY(3,4) = 1.0
DYPDY(4,5) = -1.0
DYPDY(5,6) = 1.0
DYPDY(6,2) = P*Y(5)*A0/A1
DYPDY(6,5) = P*Y(2)*A0/A1
RETURN
END
!
!
!
!
Output
                  Displacement
     X         Axial      Transverse
    0.0     1.631E-11     -8.677E-10
    5.0     1.914E-05     -1.273E-03
   10.0     2.839E-05     -4.697E-03
   15.0     2.461E-05     -9.688E-03
   20.0     1.008E-05     -1.567E-02
   25.0    -9.550E-06     -2.206E-02
   30.0    -2.721E-05     -2.830E-02
   35.0    -3.644E-05     -3.382E-02
   40.0    -3.379E-05     -3.811E-02
   45.0    -2.016E-05     -4.083E-02
   50.0    -4.414E-08     -4.176E-02
   55.0     2.006E-05     -4.082E-02
   60.0     3.366E-05     -3.810E-02
   65.0     3.627E-05     -3.380E-02
   70.0     2.702E-05     -2.828E-02
   75.0     9.378E-06     -2.205E-02
   80.0    -1.021E-05     -1.565E-02
   85.0    -2.468E-05     -9.679E-03
   90.0    -2.842E-05     -4.692E-03
   95.0    -1.914E-05     -1.271E-03
  100.0     0.000E+00      0.000E+00
DASPG
Solves a first order differential-algebraic system of equations, g(t, y, y′) = 0, using the
Petzold-Gear BDF method.
Required Arguments
T Independent variable, t. (Input/Output)
Set T to the starting value t0 at the first step.
TOUT Final value of the independent variable. (Input)
Update this value when re-entering after output, IDO = 2.
IDO Flag indicating the state of the computation. (Input/Output)
IDO   State
1     Initial entry
3     Release workspace
The user sets IDO = 1 or IDO = 3. All other values of IDO are defined as output. The
initial call is made with IDO = 1 and T = t0. The routine then sets IDO = 2, and this
value is used for all but the last entry, which is made with IDO = 3. This final call
releases workspace and performs other final tasks. Values of IDO larger than 4 occur only when
calling the second-level routine D2SPG and using the options associated with reverse
communication.
Y Array of size NEQ containing the dependent variable values, y. This array must contain
initial values. (Input/Output)
YPR Array of size NEQ containing derivative values, y′. This array must contain initial
values. (Input/Output)
The routine will solve for consistent values of y′ to satisfy the equations at the starting
point.
GCN User-supplied SUBROUTINE to evaluate g(t, y, y′). The usage is
CALL GCN (NEQ, T, Y, YPR, GVAL), where GCN must be declared EXTERNAL in the
calling program. The routine will solve for values of y′(t0) so that
g(t0, y, y′) = 0. The user can signal that g is not defined at requested values of (t, y, y′)
using an option. This causes the routine to reduce the step size or else quit.
Optional Arguments
NEQ Number of differential equations. (Input)
Default: NEQ = size(y,1)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine DASPG finds an approximation to the solution of a system of differential-algebraic
equations g(t, y, y′) = 0, with given initial data for y and y′. The routine uses BDF formulas,
appropriate for systems of stiff ODEs, and attempts to keep the global error proportional to a user-specified
tolerance. See Brenan et al. (1989). This routine is efficient for stiff systems of index 1
or index 0. See Brenan et al. (1989) for a definition of index. Users are encouraged to use DOUBLE
PRECISION accuracy on machines with a short REAL precision accuracy. The examples given
below are in REAL accuracy because of the desire for consistency with the rest of IMSL
MATH/LIBRARY examples. The routine DASPG is based on the code DASSL designed by L.
Petzold (1982-1990).
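To illustrate how a BDF formula turns g(t, y, y′) = 0 into a nonlinear system at each step, the sketch below applies the first-order BDF (backward Euler) with a fixed step to a small index-1 test problem, g1 = y1′ + y1 and g2 = y2 - y1**2, whose exact solution is y1 = exp(-t), y2 = exp(-2t). This is not the DASSL algorithm used by DASPG, which varies both order and step size. The test problem, the fixed step, and all names are assumptions chosen for illustration, and no IMSL routine is called. The Newton iteration matrix has the form dg/dy + cj*dg/dy′ with cj = 1/h.

```fortran
program bdf1_dae_sketch
   implicit none
   integer, parameter :: wp = kind(1.0d0)
   integer, parameter :: n = 2, nsteps = 100, maxit = 10
   real(wp), parameter :: h = 0.01_wp
   real(wp), parameter :: cj = 1.0_wp/h
   real(wp) :: t, det
   real(wp) :: y(n), yold(n), ypr(n), g(n), a(n,n), dy(n)
   integer  :: istep, it

   t = 0.0_wp
   y = (/ 1.0_wp, 1.0_wp /)            ! consistent start: y2 = y1**2

   do istep = 1, nsteps
      yold = y
      t = t + h
      do it = 1, maxit
         ypr = cj*(y - yold)            ! BDF1 approximation of y'
         call gcn (n, t, y, ypr, g)
         ! Iteration matrix A = dg/dy + cj*dg/dy'
         a(1,1) = 1.0_wp + cj
         a(1,2) = 0.0_wp
         a(2,1) = -2.0_wp*y(1)
         a(2,2) = 1.0_wp
         ! Solve A*dy = g by Cramer's rule and apply the Newton update.
         det   = a(1,1)*a(2,2) - a(1,2)*a(2,1)
         dy(1) = (g(1)*a(2,2) - a(1,2)*g(2))/det
         dy(2) = (a(1,1)*g(2) - a(2,1)*g(1))/det
         y = y - dy
         if (maxval(abs(dy)) < 1.0e-12_wp) exit
      end do
   end do

   print '(a, f8.4)',    ' t       = ', t
   print '(a, 2es14.6)', ' BDF1 y  = ', y
   print '(a, 2es14.6)', ' exact y = ', exp(-t), exp(-2.0_wp*t)

contains

   ! Residual g(t, y, y') of the illustrative test problem.
   subroutine gcn (n, t, y, ypr, gval)
      integer, intent(in)   :: n
      real(wp), intent(in)  :: t, y(n), ypr(n)
      real(wp), intent(out) :: gval(n)
      gval(1) = ypr(1) + y(1)
      gval(2) = y(2) - y(1)**2
   end subroutine gcn

end program bdf1_dae_sketch
```

With a first-order formula and h = 0.01 the computed values agree with the exact ones only to about two digits; a production code reaches a requested tolerance by raising the order and adapting the step.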
Comments
Users can often get started using the routine DASPG/DDASPG without reading beyond this point in
the documentation. There is often no reason to use options when getting started. Those readers
who do not want to use options can turn directly to the first two examples. The following tables
give numbers and key phrases for the options. A detailed guide to the options is given below in
Comment 2.
Value
Value
IN(1)
IN(2)
IN(3)
IN(4)
IN(5)
IN(6)
IN(7)
IN(8)
IN(9)
Not Used
IN(10)
IN(11)
IN(12-15)
Not Used
IN(16)
Number of equations
IN(17)
IN(18)
IN(19)
IN(20)
IN(21)
Number of steps
IN(22)
Number of g evaluations
IN(23)
IN(24)
IN(25)
IN(26)
IN(27)
Where is g stored?
IN(28)
Panic flag
IN(29)
IN(30)
IN(31)
IN(32)
Not Used
IN(33)
IN(34)
IN(35)
IN(36-50)
Not used
Value
INR(1)
Value of t
INR(2)
INR(3)
Value of TOUT
INR(4)
INR(5)
INR(6)
INR(7)
INR(8)
INR(9)
INR(10)
INR(11)
INR(12-20)
Not Used
1.
Workspace may be explicitly provided, and many of the options utilized by directly
calling D2SPG/DD2SPG. The reference is:
CALL D2SPG (N, T, TOUT, IDO, Y, YPR, GCN, JGCN, IWK, WK)
IDO State
5
These values of IDO occur only when calling the second-level routine D2SPG and using
options associated with reverse communication. The routine D2SPG/DD2SPG is
reentered.
GCN A Fortran SUBROUTINE to compute g(t, y, y′). This routine is normally
provided by the user. That is the default case. The dummy IMSL routine
DGSPG/DDGSPG may be used as this argument when g(t, y, y′) is evaluated by
reverse communication. In either case, a name must be declared in a Fortran
EXTERNAL statement. If usage of the dummy IMSL routine is intended, then the
name DGSPG/DDGSPG should be specified. The dummy IMSL routine will never
If the user writes a routine with the fixed name DJSPG/DDJSPG, then partial derivatives
can be provided while calling DASPG. An option is used to signal that formulas for
partial derivatives are being supplied. This is illustrated in Example 3. The name of the
partial derivative routine must be declared in a Fortran EXTERNAL statement when
calling D2SPG. If usage of the dummy IMSL routine is intended, then the name
DJSPG/DDJSPG should be specified for this EXTERNAL name. Whenever the user
provides partial derivative evaluation formulas, by whatever means, that must be noted
with an option. Usage of the derivative evaluation routine is
CALL JGCN (N, T, Y, YPR, CJ, PDG, LDPDG) where
Arg
Definition
YPR
CJ
PDG
LDPDG
IWK Work array of integer values. The size of this array is 35 + N. The contents of
IWK must not be changed from the first call with IDO = 1 until after the final call
with IDO = 3.
WK Work array of floating-point values in the working precision. The size of this
array is 41 + (MAXORD + 6)N + (N + K)N(1 − L), where K is determined
from the values IVAL(3) and IVAL(4) of option 16 of LSLRG (Chapter 1,
Linear Systems). The value of L is 0 unless option IN(34) is used to avoid
allocation of the array containing the partial derivatives. With the use of this
option, L can be set to 1. The contents of array WK must not be changed from the
first call with IDO = 1 until after the final call.
2.
This is the list of numbers used for INTEGER options. Users will typically call
this option first to get the numbers, IN(I), I = 1, 50. This option has 50 entries.
The default values are IN(I) = I + 50, I = 1, 50.
This is the list of numbers used for REAL and DOUBLE PRECISION options.
Users will typically call this option first to get the numbers, INR(I), I = 1,20.
This option has 20 entries. The default values are INR(I) = I + 50, I = 1, 20.
IN(1) This is the first call to the routine DASPG or D2SPG. Value is 0 for the first call, 1
for further calls. Setting IDO = 1 resets this option to its default. Default value is
0.
IN(2) This flag controls the kind of tolerances to be used for the solution. Value is 0
for scalar values of absolute and relative tolerances applied to all components.
Value is 1 when arrays for both these quantities are specified. In this case, use
D2SPG. Increase the size of WK by 2*N, and supply the tolerance arrays at the
end of WK. Use option IN(33) to specify the offset into WK where the 2N array
values are to be placed: all ATOL values are followed by all RTOL values. Also
see IN(33). Default value is 0.
IN(3) This flag controls when the code returns to the user with output values of y and
y′. If the value is 0, it returns to the user at T = TOUT only. If the value is 1, it
returns to the user at an internal working step. Default value is 0.
IN(4) This flag controls whether the code should integrate past a special point, TSTOP,
and then interpolate to get y and y′ at TOUT. If the value is 0, this is permitted. If
the value is 1, the code assumes the equations either change on the alternate side
of TSTOP or they are undefined there. In this case, the code creeps up to TSTOP
in the direction of integration. The value of TSTOP is set with option INR(4).
Default value is 0.
IN(5) This flag controls whether partial derivatives are computed using one-sided
divided differences, or they are to be computed using user-supplied evaluation
formulas. If the value is 0, use divided differences. If the value is 1, use
formulas for the partial derivatives. See Example 3 for an illustration of one way
to do this. Default value is 0.
IN(6) The maximum number of steps. Default value is 500.
IN(7) This flag controls a maximum magnitude constraint for the step size. If the value
is 0, the routine picks its own maximum. If the value is 1, a maximum is
specified by the user. That value is set with option number INR(7). Default
value is 0.
IN(8) This flag controls an initial value for the step size. If the value is 0, the routine
picks its own initial step size. If the value is 1, a starting step size is specified by
the user. That value is set with option number INR(6). Default value is 0.
IN(9) Not used. Default value is 0.
IN(10) This flag controls attempts to constrain all components to be nonnegative. If
the value is 0, no constraints are enforced. If the value is 1, the constraint is enforced.
Default value is 0.
IN(11) This flag controls whether the initial values (t, y, y′) are consistent. If the
value is 0, g(t, y, y′) = 0 at the initial point. If the value is 1, the routine will try
to solve for y′ to make this equation satisfied. Default value is 0.
IN(12-15) Not used. Default value is 0 for each option.
IN(16)
IN(17)
Value
Explanation
Value
Explanation
10
The BDF corrector equation solver did not converge because the
evaluation failure flag was raised.
11
12
33
IN(18)
The maximum order of BDF formula the routine should use. Default value
is 5.
IN(19)
The order of the BDF method the routine will use on the next step. Default
value is IMACH(5).
IN(20)
The order of the BDF method used on the last step. Default value is
IMACH(5).
IN(21)
IN(22)
IN(23)
The number of times that the partial derivative matrix has been evaluated.
Default value is 0.
IN(24)
IN(25)
The total number of convergence test failures so far. This includes singular
iteration matrices. Default value is 0.
IN(26)
IN(27)
in
IN(28) This value is a panic flag. After an evaluation of g, this value is checked.
The value of g is used if the flag is 0. If it has the value 1, the routine reduces
the step size and possibly the order of the BDF. If the value is 2, the routine
returns control to the user immediately. This option is also used to signal a
singular or poorly conditioned partial derivative matrix encountered during the
factor phase in reverse communication. Use a nonzero value when the matrix is
singular. Default value is 0.
IN(29)
Use reverse communication to evaluate the partial derivative matrix when
this value is 0. If the value is 1, forward communication is used. Use the routine
D2SPG for reverse communication. With reverse communication, a return will
be made with IDO = 6. Compute the partial derivative matrix A and re-enter the
routine. If forward communication is used for the linear solver, return the
partials using the offset into the array WK. This offset value is obtained with
option IN(30). Default value is 1.
IN(30)
The user is to store the values of the partial derivative matrix A by columns
in the work array WK using this value as an offset. The option 16 for LSLRG is
used here to compute the row dimension of the internal working array that
contains A. Users can also choose to store this matrix in some convenient form
in their calling program if they are providing linear system solving using reverse
communication. See options IN(31) and IN(34). Default value is IMACH(5).
IN(31) Use reverse communication to solve the linear system AΔy = Δg if this
value is 0. If the value is 1, use forward communication into the routines L2CRG
and LFSRG (Chapter 1, Linear Systems) for the linear system solving. Return the
solution using the offset into the array WK where g is stored. This offset value is
obtained with option IN(27). With reverse communication, a return will be
made with IDO = 7 for factorization of A and with IDO = 8 for solving the
system. Re-enter the routine in both cases. If the matrix A is singular or poorly
conditioned, raise the panic flag, option IN(28), during the factorization.
Default value is 1.
IN(32)
IN(33)
This value is used when IN(2) is set to 1, indicating that arrays of absolute
and relative tolerances are input in the WK array of D2SPG. Set this parameter to
the offset, ioff, into WK where the tolerances are stored. Increase the size of WK
by 2*N , and store tolerance values beginning at ioff=size(WK)-2*N+1.
Absolute tolerances will be stored in WK(ioff+i-1) for i=1,N and relative
tolerances will be stored in WK(ioff+N+i-1) for i=1,N. Also, use IN(35) to
specify the size of the work arrays.
IN(34) This flag is used if the user has not allocated storage for the matrix A in the
array WK. If the value is 0, storage is allocated. If the value is 1, storage was not
allocated. In this case, the user must be using reverse communication to evaluate
the partial derivative matrix and to solve the linear systems AΔy = Δg. Default
value is 0.
IN(35) These two values are the sizes of the arrays IWK and WK allocated in the
user's program. The values are checked against the program requirements. These
checks are made only if the values are positive. Users will normally set this
option when directly calling D2SPG. Default values are (0, 0).
INR(1)
INR(2)
The farthest working t point the integration has reached. Default value is
AMACH(6) .
INR(3)
The next special point, TSTOP, before reaching TOUT. Default value is
AMACH(6). Used with option IN(4).
INR(4)
INR(5) The pair of scalar values ATOL and RTOL that apply to the error estimates of
all components of y. Default values for both are SQRT(AMACH(4)).
INR(6)
The initial step size if DASPG is not to compute it internally. Default value is
AMACH(6).
INR(7)
INR(8) This value is the reciprocal of the condition number of the matrix A. It is
defined when forward communication is used to solve for the linear updates to
the BDF corrector equation. No further program action, such as declaring a
singular system, is taken based on the condition number. Users can declare the system to
be singular by raising the panic flag using option IN(28). Default value is
AMACH(6).
INR(9) The value of cj used in the partial derivative matrix for reverse
communication evaluation. Default value is AMACH(6).
INR(10) The step size to be attempted on the next move. Default value is AMACH(6).
INR(11) The step size taken on the previous move. Default value is AMACH(6).
4.
$\mathrm{D10PG} = \sqrt{N^{-1} \sum_{i=1}^{N} \left( v_i / wt_i \right)^2}$
Users can replace this function with one of their own choice. This should be done only
for problem-related reasons.
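The formula is a weighted root-mean-square norm. A minimal sketch of such a function is shown below. The name WRMS_NORM, its argument list, and the driver program are hypothetical; they do not reproduce the actual D10PG interface, and the sketch omits any scaling that a library version might apply to guard against overflow.

```fortran
program wrms_norm_demo
   implicit none
   real :: v(3), wt(3)

   v  = (/ 3.0, 4.0, 0.0 /)
   wt = (/ 1.0, 2.0, 1.0 /)
   ! Here the scaled entries are 3, 2, 0, so the norm is sqrt(13/3).
   print '(a, f10.6)', ' weighted RMS norm = ', wrms_norm (3, v, wt)

contains

   ! Weighted root-mean-square norm: sqrt( (1/n) * sum( (v(i)/wt(i))**2 ) ).
   function wrms_norm (n, v, wt) result (vnorm)
      integer, intent(in) :: n
      real, intent(in)    :: v(n), wt(n)
      real                :: vnorm
      vnorm = sqrt(sum((v/wt)**2)/real(n))
   end function wrms_norm

end program wrms_norm_demo
```

Dividing each component by its weight before averaging lets components with very different magnitudes contribute comparably to the error test.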
Example 1
The Van der Pol equation $u'' + \mu(u^2 - 1)u' + u = 0$, $\mu > 0$, is a single ordinary differential equation
with a periodic limit cycle. See Hartman (1964, page 181). For the value $\mu = 5$, the equations are
integrated from t = 0 until the limit has clearly developed at t = 26. The (arbitrary) initial
conditions used here are $u(0) = 2$ and $u'(0) = -2/3$. Except for these initial conditions and the final
t value, this is problem (E2) of the Enright and Pryce (1987) test package. This equation is solved
as a differential-algebraic system by defining the first-order system:
$\varepsilon = 1/\mu$
$y_1 = u$
$y_2 = u'$
$g_1 = y_2 - y_1' = 0$
$g_2 = (1 - y_1^2)\,y_2 - \varepsilon\,(y_1 + y_2') = 0$
The initial value for $y_2'$ in the sample program is not consistent: $g_2 \neq 0$ at t = 0. The routine DASPG solves for this starting
value. No options need to be changed for this usage. The set of pairs $(u(t_j), u'(t_j))$ are accumulated
for the 260 values $t_j = 0.1, \ldots, 26$ (in steps of 0.1).
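For readers who want to see where the residual $g_2$ comes from, the short derivation below divides the Van der Pol equation by $\mu$ and substitutes the first-order variables. It only restates the definitions given above; nothing beyond them is assumed.

```latex
\begin{aligned}
u'' + \mu (u^2 - 1)\,u' + u &= 0
   && \text{(original equation)} \\
\varepsilon\,u'' + (u^2 - 1)\,u' + \varepsilon\,u &= 0
   && \text{(divide by } \mu,\ \varepsilon = 1/\mu\text{)} \\
(1 - u^2)\,u' - \varepsilon\,(u + u'') &= 0
   && \text{(multiply by } -1\text{)} \\
(1 - y_1^2)\,y_2 - \varepsilon\,(y_1 + y_2') &= 0
   && \text{(substitute } y_1 = u,\ y_2 = u'\text{)}
\end{aligned}
```

The last line is $g_2$, and it matches the assignment to GVAL(2) in the sample subroutine GCN with EPS = 0.2.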
USE UMACH_INT
USE DASPG_INT
!
!
!
!
!
!
!
IMPLICIT   NONE
INTEGER    N, NP, IDO
PARAMETER  (N=2, NP=260)
ISTEP = ISTEP + 1
CALL DASPG (T, T+DELT, IDO, Y, YPR, GCN)
!                                 Save solution for plotting
IF (ISTEP .LE. NSTEP) THEN
U(ISTEP) = Y(1)
UPR(ISTEP) = YPR(1)
!                                 Release work space
IF (ISTEP .EQ. NSTEP) IDO = 3
GO TO 10
END IF
WRITE (NOUT,99999) TEND, Y, YPR
99998 FORMAT (11X, 'T', 14X, 'Y(1)', 11X, 'Y(2)', 10X, 'Y''(1)', 10X, &
'Y''(2)')
99999 FORMAT (5F15.5)
!                                 Start plotting
!     CALL SCATR (NSTEP, U, UPR)
!     CALL EFSPLT (0, ' ')
END
!
SUBROUTINE GCN (N, T, Y, YPR, GVAL)
!                                 SPECIFICATIONS FOR ARGUMENTS
INTEGER    N
REAL       T, Y(N), YPR(N), GVAL(N)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
REAL       EPS
!
EPS = 0.2
!
GVAL(1) = Y(2) - YPR(1)
GVAL(2) = (1.0-Y(1)**2)*Y(2) - EPS*(Y(1)+YPR(2))
RETURN
END
Output
           T              Y(1)           Y(2)          Y'(1)          Y'(2)
       26.00000        1.45330       -0.24486       -0.24713       -0.09399
Additional Examples
Example 2
The first-order equations of motion of a point-mass m suspended on a massless wire of length $\ell$
under the influence of gravity force, mg, and tension value $\lambda$, in Cartesian coordinates, (p, q), are
$p' = u$
$q' = v$
$m u' = -p\lambda$
$m v' = -q\lambda - mg$
$p^2 + q^2 - \ell^2 = 0$
This is a genuine differential-algebraic system. The problem, as stated, has an index number equal
to the value 3. Thus, it cannot be solved with DASPG directly. Unfortunately, the fact that the index
is greater than 1 must be deduced indirectly. Typically there will be an error processed which
states that the (BDF) corrector equation did not converge. The user then differentiates and replaces
the constraint equation. This example is transformed to a problem of index number of value 1 by
differentiating the last equation twice. This resulting equation, which replaces the given equation,
is the total energy balance:
$m(u^2 + v^2) - mgq - \ell^2\lambda = 0$
With initial conditions and systematic definitions of the dependent variables, the system becomes:
$p(0) = \ell$, $q(0) = u(0) = v(0) = \lambda(0) = 0$
$y_1 = p$
$y_2 = q$
$y_3 = u$
$y_4 = v$
$y_5 = \lambda$
$g_1 = y_3 - y_1' = 0$
$g_2 = y_4 - y_2' = 0$
$g_3 = -y_1 y_5 - m y_3' = 0$
$g_4 = -y_2 y_5 - mg - m y_4' = 0$
$g_5 = m(y_3^2 + y_4^2) - mg\,y_2 - \ell^2 y_5 = 0$
The problem is given in English measurement units of feet, pounds, and seconds. The wire has
length 6.5 ft, and the mass at the end is 98 lb. Usage of the software does not require it, but
standard or SI units are used in the numerical model. This conversion of units is done as a first
step in the user-supplied evaluation routine, GCN. A set of initial conditions, corresponding to the
pendulum starting in a horizontal position, are provided as output for the input signal of n = 0. The
maximum magnitude of the tension parameter, $\lambda(t) = y_5(t)$, is computed at the output points,
$t = 0.1, \ldots, \pi$ (in steps of 0.1). This extreme value is converted to English units and printed.
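The index reduction described above can be written out step by step. The derivation below differentiates the length constraint twice and substitutes the equations of motion; it uses only the symbols already defined in this example.

```latex
\begin{aligned}
p^2 + q^2 - \ell^2 &= 0
   && \text{(constraint)} \\
2\,(p\,u + q\,v) &= 0
   && \text{(first derivative, using } p' = u,\ q' = v\text{)} \\
u^2 + v^2 + p\,u' + q\,v' &= 0
   && \text{(second derivative, divided by 2)} \\
m(u^2 + v^2) - p^2\lambda - q^2\lambda - mgq &= 0
   && \text{(multiply by } m\text{; use } m u' = -p\lambda,\ m v' = -q\lambda - mg\text{)} \\
m(u^2 + v^2) - mgq - \ell^2\lambda &= 0
   && \text{(use } p^2 + q^2 = \ell^2\text{)}
\end{aligned}
```

The final line is the total energy balance that replaces the constraint, and it corresponds term by term to the expression assigned to GVAL(5) in the sample routine GCN.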
USE DASPG_INT
USE CUNIT_INT
USE UMACH_INT
USE CONST_INT
!
IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=5)
TEND = CONST('pi')
DELT = 0.1
NSTEP = TEND/DELT
CALL UMACH (2, NOUT)
YPR(1) = 0.
YPR(2) = 0.
YPR(3) = 0.
YPR(4) = 0.
YPR(5) = 0.
RETURN
END IF
!                                 Compute residuals
GVAL(1) = Y(3) - YPR(1)
GVAL(2) = Y(4) - YPR(2)
GVAL(3) = -Y(1)*Y(5) - MASSKG*YPR(3)
GVAL(4) = -Y(2)*Y(5) - MASSKG*YPR(4) - MG
GVAL(5) = MASSKG*(Y(3)**2+Y(4)**2) - MG*Y(2) - LENSQ*Y(5)
RETURN
!                                 Change to meters
CALL CUNIT (FEETL, 'ft', METERL, 'meter')
!                                 Change to kilograms
CALL CUNIT (MASSLB, 'lb', MASSKG, 'kg')
!                                 Get standard gravity
GRAV = CONST('StandardGravity')
MG = MASSKG*GRAV
LENSQ = METERL**2
FIRST = .FALSE.
GO TO 10
END
Output
Extreme string tension of
2.50
Example 3
In this example, we solve a stiff ordinary differential equation (E5) from the test package of
Enright and Pryce (1987). The problem is nonlinear with nonreal eigenvalues. It is included as an
example because it is a stiff problem, and its partial derivatives are provided in the user-supplied
routine with the fixed name DJSPG. Users who require a variable routine name for partial
derivatives can use the routine D2SPG. Providing explicit formulas for partial derivatives is an
important consideration for problems where evaluations of the function g(t, y, y′) are expensive.
Signaling that a derivative matrix is provided requires a call to the Chapter 11 options manager
utility, IUMAG. In addition, an initial integration step-size is given for this test problem. A signal
for this is passed using the options manager routine IUMAG. The error tolerance is changed from
the defaults to a pure absolute tolerance of 0.1 * SQRT(AMACH(4)). Also see IUMAG, and
SUMAG/DUMAG in Chapter 11, Utilities, for further details about the options manager routines.
USE IMSL_LIBRARIES
IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=4)
!                                 SPECIFICATIONS FOR PARAMETERS
INTEGER    ICHAP, IGET, INUM, IPUT, IRNUM
PARAMETER  (ICHAP=5, IGET=1, INUM=6, IPUT=2, IRNUM=7)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
INTEGER    IDO, IN(50), INR(20), IOPT(2), IVAL(2), NOUT
REAL       C0, PREC, SVAL(3), T, TEND, Y(N), YPR(N)
!                                 SPECIFICATIONS FOR FUNCTIONS
EXTERNAL   GCN
!                                 Define initial data
IDO = 1
T = 0.0
TEND = 1000.0
!                                 Initial values
C0 = 1.76E-3
Y(1) = C0
Y(2) = 0.
Y(3) = 0.
Y(4) = 0.
!                                 Initial derivatives
YPR(1) = 0.
YPR(2) = 0.
YPR(3) = 0.
YPR(4) = 0.
!
IOPT(1) = -INR(5)
IOPT(2) = -INR(6)
!
!
!
!
PDG(4,1) = C2*Y(3)
PDG(4,3) = C2*Y(1)
PDG(4,4) = -C4 - CJ
RETURN
END
Output
          T             Y followed by Y'
  1000.00000   0.00162   0.00000   0.00000   0.00000
               0.00000   0.00000   0.00000   0.00000
Example 4
In this final example, we compute the solution of n = 10 ordinary differential equations,
$g = Hy - y'$, where $y(0) = y_0 = (1, 1, \ldots, 1)^T$. The value
$\sum_{i=1}^{n} y_i(t)$
The function g,
2.
3.
In addition to the use of reverse communication, we evaluate the partial derivatives using
formulas. No storage is allocated in the floating-point work array for the matrix. Instead, the
matrix A is stored in an array A within the main program unit. Signals for this organization are
passed using the routine IUMAG (Chapter 11, Utilities).
An algorithm appropriate for this matrix, Givens transformations applied from the right side, is
used to factor the matrix A. The rotations are reconstructed during the solve step. See SROTG
(Chapter 9, Basic Matrix/Vector Operations) for the formulas.
The routine D2SPG stores the value of cj. We get it with a call to the options manager routine
SUMAG (Chapter 11, Utilities). A pointer, or offset into the work array, is obtained as an integer
option. This gives the location of g and Δg. The solution vector Δy replaces Δg at that location.
Caution: If a user writes code wherein g is computed with reverse communication and partials are
evaluated with divided differences, then there will be two distinct places where g is to be stored.
This example shows a correct place to get this offset.
This example also serves as a prototype for large, structured (possibly nonlinear) DAE problems
where the user must use special methods to store and factor the matrix A and solve the linear
system AΔy = Δg. The word "factor" is used literally here. A user could, for instance, solve the
system using an iterative method. Generally, the factor step can be any preparatory phase required
for a later solve step.
USE IMSL_LIBRARIES
IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=10)
IOPT(6) = IN(35)
IVAL(6) = 35 + N
IVAL(7) = 41 + 11*N
180 CONTINUE
SUMY = 0.E0
DO 190 I=1, N
SUMY = SUMY + Y(I)
190 CONTINUE
WRITE (NOUT,99999) TEND, SUMY
!                                 Finish up internally
IDO = 3
GO TO 40
99998 FORMAT (11X, 'T', 6X, 'Sum of Y(i), i=1,n')
99999 FORMAT (2F15.5)
END
Output
T
1.00000
to values of $u_t$.
The integration method is noteworthy due to the maintenance of grid lines in the
space variable, x. Details for choosing new grid lines are given in Blom and Zegeling, (1994).
The class of problems solved with PDE_1D_MG is expressed by equations:
$$\sum_{k=1}^{NPDE} C_{j,k}(x,t,u,u_x)\,\frac{\partial u^k}{\partial t} = x^{-m}\frac{\partial}{\partial x}\left(x^m R_j(x,t,u,u_x)\right) - Q_j(x,t,u,u_x),$$
$$j = 1,\ldots,NPDE,\quad x_L < x < x_R,\quad t > t_0,\quad m \in \{0,1,2\}$$
Equation 2
The vector $u \equiv (u^1, \ldots, u^{NPDE})^T$ is the solution. The integer value NPDE ≥ 1 is the
number of differential equations. The functions $R_j$ and $Q_j$ can be regarded, in special cases,
as flux and source terms. The functions $u$, $C_{j,k}$, $R_j$ and $Q_j$ are continuous functions of
their arguments. The boundary conditions have the form
$$\beta_j(x,t)\,R_j(x,t,u,u_x) = \gamma_j(x,t,u,u_x),\quad \text{at } x = x_L \text{ and } x = x_R,\quad j = 1,\ldots,NPDE$$
Equation 3
In the two cases m > 0, when an endpoint occurs at 0, the finite value of the solution at x = 0
must be ensured. This requires the specification of the solution at x = 0, or implies that
$R_j|_{x=x_L} = 0$ or $R_j|_{x=x_R} = 0$.
The problem is specified by the functions $C_{j,k}$, $R_j$, $Q_j$, $\beta_j$, $\gamma_j$, and $u_0$.
These functions are provided to the routine PDE_1D_MG in the form of three subroutines.
Optionally, this information can be provided by reverse communication. These forms of the
interface are explained below and illustrated with examples. Users may turn directly to the
examples if they are comfortable with the description of the algorithm.
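As an illustration that is not taken from this manual, consider how a familiar problem fits the forms of Equation 2 and Equation 3. The sketch below assumes the linear heat equation on 0 < x < 1 with a fixed value at the left end and an insulated right end; the choice of problem and of boundary data is hypothetical.

```latex
% Heat equation: u_t = u_{xx}, with u(0,t) = 0 and u_x(1,t) = 0.
% Equation 2 with NPDE = 1 and m = 0 (Cartesian coordinates):
\[
  C_{1,1}\,\frac{\partial u}{\partial t}
    = \frac{\partial}{\partial x} R_1 - Q_1,
  \qquad C_{1,1} = 1,\quad R_1 = u_x,\quad Q_1 = 0 .
\]
% Equation 3, \beta_1 R_1 = \gamma_1, at each end:
\[
  x = x_L = 0:\ \beta_1 = 0,\ \gamma_1 = u
  \ \Longrightarrow\ 0 = u ,
  \qquad
  x = x_R = 1:\ \beta_1 = 1,\ \gamma_1 = 0
  \ \Longrightarrow\ u_x = 0 .
\]
```

A condition on the solution value is therefore posed with β = 0, while a condition on the flux R is posed with β ≠ 0.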
PDE_1D_MG
Invokes a module, with the statement USE PDE_1D_MG_INT, near the second line of the program unit.
The integrator is provided with single or double precision arithmetic, and a generic named
interface is provided. We do not recommend using 32-bit floating point arithmetic here. The
routine is called within the following loop, and is entered with each value of IDO. The loop
continues until a value of IDO results in an exit.
IDO=1
DO
   SELECT CASE (IDO)
   CASE (1)   ! Do required initialization steps
   CASE (2)   ! Save solution, update T0 and TOUT;
              ! if finished with integration, set IDO=3
   CASE (3)   ! Normal exit
      EXIT
   CASE (4)   ! Exit due to errors
      EXIT
   CASE (5)   ! Evaluate initial data
   CASE (6)   ! Evaluate differential equations
   CASE (7)   ! Evaluate boundary conditions
   CASE (8)   ! Prepare to solve banded system
   CASE (9)   ! Solve banded system
   END SELECT
!  Optional arguments are omitted in this outline.
   CALL PDE_1D_MG (T0, TOUT, IDO, U)
END DO
Required Arguments
T0(Input/Output)
This is the value of the independent variable t where the integration of ut begins. It is
set to the value TOUT on return.
TOUT(Input)
This is the value of the independent variable t where the integration of ut ends. Note:
Values of T0 < TOUT imply integration in the forward direction, while values of
T0 > TOUT imply integration in the backward direction. Either direction is permitted.
IDO(Input/Output)
This is an integer flag that directs program control and user action. Its value is used
for initialization, termination, and for directing user response during reverse
communication:
IDO=1 This value is assigned by the user for the start of a new problem. Internally it
condition has occurred, and error processing is set NOT to STOP for these
types of errors. It is not necessary to make a final call to the integrator with
IDO=3 in this case.
Values of IDO = 5,6,7,8,9 are reserved for applications that provide problem
information or linear algebra computations using reverse communication. When
problem information is provided using reverse communication, the differential
equations, boundary conditions and initial data must all be given. The absence
of optional subroutine names in the calling sequence directs the routine to use
reverse communication. In the module PDE_1D_MG_INT, scalars and arrays for
evaluating results are named below. The names are preceded by the prefix
s_pde_1d_mg_ or d_pde_1d_mg_, depending on the precision. We use
the prefix ?_pde_1d_mg_, for the appropriate choice.
IDO=5 This value is assigned by the integrator, requesting data for the initial
equations. Following this evaluation the integrator is re-entered. Evaluate the terms of
the system of Equation 2. A default value of m = 0 is assumed, but this can be changed
to one of the other choices m = 1 or 2 . Use the optional argument IOPT(:) for that
purpose. Put the values in the arrays as indicated.¹
$x$ ≡ ?_pde_1d_mg_x
$t$ ≡ ?_pde_1d_mg_t
$u^j$ ≡ ?_pde_1d_mg_u(j)
$\partial u^j/\partial x = u_x^j$ ≡ ?_pde_1d_mg_dudx(j)
?_pde_1d_mg_c(j,k) := $C_{j,k}(x,t,u,u_x)$
?_pde_1d_mg_r(j) := $R_j(x,t,u,u_x)$
?_pde_1d_mg_q(j) := $Q_j(x,t,u,u_x)$
$j, k = 1, \ldots, NPDE$
¹ The assign-to equality, a := b, used here and below, is read "the expression b is evaluated and
then assigned to the location a."
$x$ ≡ ?_pde_1d_mg_x
$t$ ≡ ?_pde_1d_mg_t
$u^j$ ≡ ?_pde_1d_mg_u(j)
$\partial u^j/\partial x = u_x^j$ ≡ ?_pde_1d_mg_dudx(j)
?_pde_1d_mg_beta(j) := $\beta_j(x,t,u,u_x)$
?_pde_1d_mg_gamma(j) := $\gamma_j(x,t,u,u_x)$
$j = 1, \ldots, NPDE$
The value x ∈ {xL, xR}, and the logical flag pde_1d_mg_LEFT=.TRUE. for x = xL. It
has the value pde_1d_mg_LEFT=.FALSE. for x = xR. If any of the functions cannot be
evaluated, set pde_1d_mg_ires=3. Otherwise do not change its value.
IDO=8 This value is assigned by the integrator, requesting the calling program to prepare for
solving a banded linear system of algebraic equations. This value will occur only when
the option for reverse communication solving is set in the array IOPT(:), with
option PDE_1D_MG_REV_COMM_FACTOR_SOLVE. The matrix data for this system is in
Band Storage Mode, described in the section: Reference Material for the IMSL Fortran
Numerical Libraries.
PDE_1D_MG_IBAND
PDE_1D_MG_LDA
?_PDE_1D_MG_A
PDE_1D_MG_PANIC_FLAG
IDO=9 This value is assigned by the integrator, requesting the calling program to solve a
linear system with the matrix defined as noted with IDO=8.
?_PDE_1D_MG_RHS
PDE_1D_MG_PANIC_FLAG
?_PDE_1D_MG_SOL
U(1:NPDE+1,1:N)(Input/Output)
This assumed-shape array specifies Input information about the problem size and
boundaries. The dimension of the problem is obtained from NPDE +1 = size(U,1). The
number of grid points is obtained by N = size(U,2). Limits for the variable x are
assigned as input in array locations, U(NPDE +1, 1) = xL, U(NPDE +1, N) =xR. It is
not required to define U(NPDE +1, j), j=2, …, N-1. At completion, the array
U(1:NPDE,1:N) contains the approximate solution value Ui(xj(TOUT),TOUT) in
location U(I,J). The grid value xj(TOUT) is in location U(NPDE+1,J). Normally the
grid values are equally spaced as the integration starts. Variable spaced grid values can
be provided by defining them as Output from the subroutine initial_conditions
or during reverse communication, IDO=5.
Optional Arguments
initial_conditions(Input)
The name of an external subroutine, written by the user, when using forward
communication. If this argument is not used, then reverse communication is used to
provide the problem information. The routine gives the initial values for the system at
the starting independent variable value T0. This routine can also provide a nonuniform grid at the initial value.
SUBROUTINE initial_conditions (NPDE,N,U)
Integer NPDE,N
REAL(kind(T0)) U(:,:)
END SUBROUTINE
pde_system_definition(Input)
The name of an external subroutine, written by the user, when using forward
communication. It gives the differential equation, as expressed in Equation 2.
SUBROUTINE pde_system_definition &
(t, x, NPDE, u, dudx, c, q, r, IRES)
Integer NPDE, IRES
REAL(kind(T0)) t, x, u(:), dudx(:)
REAL(kind(T0)) c(:,:), q(:), r(:)
END SUBROUTINE
Evaluate the terms of the system of Equation 2. A default value of m = 0 is assumed, but this can
be changed to one of the other choices m = 1 or 2. Use the optional argument IOPT(:) for that
purpose. Put the values in the arrays as indicated.
$u^j$ ≡ u(j)
$\partial u^j/\partial x = u_x^j$ ≡ dudx(j)
c(j,k) := $C_{j,k}(x,t,u,u_x)$
r(j) := $R_j(x,t,u,u_x)$
q(j) := $Q_j(x,t,u,u_x)$
$j, k = 1, \ldots, NPDE$
If any of the functions cannot be evaluated, set IRES=3. Otherwise do not change its value.
boundary_conditions(Input)
The name of an external subroutine, written by the user when using forward
communication. It gives the boundary conditions, as expressed in Equation 3.
SUBROUTINE BOUNDARY_CONDITIONS(T,BETA,GAMMA,U,DUDX,NPDE,LEFT,IRES)
real(kind(1d0)),intent(in)::t
real(kind(1d0)),intent(out),dimension(:)::BETA, GAMMA
real(kind(1d0)),intent(in),dimension(:)::U,DUDX
integer,intent(in)::NPDE
logical,intent(in)::LEFT
integer,intent(out)::IRES
END SUBROUTINE
$u^j$ ≡ u(j)
$\partial u^j/\partial x = u_x^j$ ≡ dudx(j)
beta(j) := $\beta_j(x,t,u,u_x)$
gamma(j) := $\gamma_j(x,t,u,u_x)$
$j = 1, \ldots, NPDE$
The value x ∈ {xL, xR}, and the logical flag LEFT=.TRUE. for x = xL. The flag has the value
LEFT=.FALSE. for x = xR.
IOPT(Input)
Derived type array s_options or d_options, used for passing optional data to
PDE_1D_MG. See the section Optional Data in the Introduction for an explanation of
the derived type and its use. It is necessary to invoke a module, with the statement USE
ERROR_OPTION_PACKET, near the second line of the program unit. Examples 2-8 use
this optional argument. The choices are as follows:
Option Prefix   Option Name                        Option Value
s_, d_          PDE_1D_MG_CART_COORDINATES         1
s_, d_          PDE_1D_MG_CYL_COORDINATES          2
s_, d_          PDE_1D_MG_SPH_COORDINATES          3
s_, d_          PDE_1D_MG_TIME_SMOOTHING           4
s_, d_          PDE_1D_MG_SPATIAL_SMOOTHING        5
s_, d_          PDE_1D_MG_MONITOR_REGULARIZING     6
s_, d_          PDE_1D_MG_RELATIVE_TOLERANCE       7
s_, d_          PDE_1D_MG_ABSOLUTE_TOLERANCE       8
s_, d_          PDE_1D_MG_MAX_BDF_ORDER            9
s_, d_          PDE_1D_MG_REV_COMM_FACTOR_SOLVE    10
s_, d_          PDE_1D_MG_NO_NULLIFY_STACK         11
IOPT(IO) = PDE_1D_MG_CART_COORDINATES
IOPT(IO) = PDE_1D_MG_CYL_COORDINATES
IOPT(IO) = PDE_1D_MG_SPH_COORDINATES
IOPT(IO) = ?_OPTIONS(PDE_1D_MG_TIME_SMOOTHING,TAU)
IOPT(IO) = ?_OPTIONS(PDE_1D_MG_SPATIAL_SMOOTHING,KAP)
IOPT(IO) = ?_OPTIONS(PDE_1D_MG_MONITOR_REGULARIZING,ALPH)
IOPT(IO) = ?_OPTIONS(PDE_1D_MG_RELATIVE_TOLERANCE,RTOL)
This option resets the value of the relative accuracy parameter used in DASPG. The
default value is RTOL=1E-2 for single precision and
RTOL=1D-4 for double precision.
IOPT(IO) = ?_OPTIONS(PDE_1D_MG_ABSOLUTE_TOLERANCE,ATOL)
This option resets the value of the absolute accuracy parameter used in DASPG. The
default value is ATOL=1E-2 for single precision and
ATOL=1D-4 for double precision.
IOPT(IO) = PDE_1D_MG_MAX_BDF_ORDER
IOPT(IO+1) = MAXBDF
Reset the maximum order for the BDF formulas used in DASPG. The default value is
MAXBDF=2. The new value can be any integer between 1 and 5. Some problems will
benefit by making this change. We used the default value due to the fact that DASPG
may cycle on its selection of order and step-size with orders higher than value 2.
IOPT(IO) = PDE_1D_MG_REV_COMM_FACTOR_SOLVE
The calling program unit will solve the banded linear systems required in the stiff
differential-algebraic equation integrator. Values of IDO=8, 9 will occur only when
this optional value is used.
IOPT(IO) = PDE_1D_MG_NO_NULLIFY_STACK
To maintain an efficient interface, the routine PDE_1D_MG collapses the subroutine call
stack with CALL_E1PSH(NULLIFY_STACK). This implies that the overhead of
maintaining the stack will be eliminated, which may be important with reverse
communication. It does not eliminate error processing. However, precise information
of which routines have errors will not be displayed. To see the full call chain, this
option should be used. Following completion of the integration, stacking is turned
back on with CALL_E1POP(NULLIFY_STACK).
FORTRAN 90 Interface
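Generic: CALL PDE_1D_MG (T0, TOUT, IDO, U [,…])
Specific: The specific interface names are S_PDE_1D_MG and D_PDE_1D_MG.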
Description
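The equation
$$u_t = f(u,x,t),\quad x_L < x < x_R,\quad t > t_0,$$
is rewritten along a time-dependent grid point $x(t)$. Because $du/dt = u_t + u_x\,dx/dt$ there,
the equation becomes
$$\frac{du}{dt} - u_x\frac{dx}{dt} = u_t = f(u,x,t).$$
Using central divided differences for the factor $u_x$ leads to the system of ordinary differential
equations in implicit form
$$\frac{dU_i}{dt} - \frac{U_{i+1}-U_{i-1}}{x_{i+1}-x_{i-1}}\,\frac{dx_i}{dt} = F_i,\quad t > t_0,\quad i = 1,\ldots,N.$$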
The terms $U_i$, $F_i$ respectively represent the approximate solution to the partial differential
equation and the value of f(u,x,t) at the point $(x,t) = (x_i(t),t)$. The truncation error is
second-order in the space variable, x. The above ordinary differential equations are
underdetermined, so additional equations are added for the variation of the time-dependent grid
points. It is necessary to discuss these equations, since they contain parameters that can be
adjusted by the user. Often it will be necessary to modify these parameters to solve a difficult
problem. For this purpose the following quantities are defined²:
$$\Delta x_i = x_{i+1} - x_i,\quad n_i = (\Delta x_i)^{-1}$$
$$\mu_i = n_i - \kappa(\kappa+1)\,(n_{i+1} - 2n_i + n_{i-1}),\quad 0 \le i \le N$$
$$n_{-1} \equiv n_0,\quad n_{N+1} \equiv n_N$$
The values $n_i$ are the so-called point concentration of the grid, and $\kappa \ge 0$ denotes a
spatial smoothing parameter. Now the grid points are defined implicitly so that
$$\frac{\mu_{i-1} + \tau\,\dfrac{d\mu_{i-1}}{dt}}{M_{i-1}} = \frac{\mu_i + \tau\,\dfrac{d\mu_i}{dt}}{M_i},\quad 1 \le i \le N,$$
where $\tau \ge 0$ is the time-smoothing parameter and the monitor values $M_i$ satisfy
$$M_i^2 = \alpha + NPDE^{-1}\sum_{j=1}^{NPDE}\frac{\left(U^j_{i+1} - U^j_i\right)^2}{(\Delta x_i)^2}.$$
The value $\kappa$ determines the level of clustering or spatial smoothing of the grid points.
Decreasing $\kappa$ from its default decreases the amount of spatial smoothing. The parameters
$M_i$ approximate arc length and help determine the shape of the grid or $x_i$-distribution. The
parameter $\tau$ prevents the grid movement from adjusting immediately to new values of the $M_i$,
thereby avoiding oscillations in the grid that cause large relative errors. This is important when
applied to solutions with steep gradients.
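The smoothed point concentrations can be computed directly from a grid. The module below is a minimal sketch, not part of the library: it assumes the definitions n_i = 1/(x_{i+1} - x_i) and mu_i = n_i - kappa*(kappa+1)*(n_{i+1} - 2*n_i + n_{i-1}), with the end values of n repeated outside the grid. The names grid_monitor_sketch and smoothed_concentration are hypothetical, and the value of kappa passed in is the caller's choice rather than the library default.

```fortran
module grid_monitor_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
contains
  ! Smoothed point concentrations mu(i) for the intervals of a grid x(:).
  ! Assumes size(x) >= 2 and strictly increasing grid values.
  pure function smoothed_concentration(x, kappa) result(mu)
    real(dp), intent(in) :: x(:)      ! grid points
    real(dp), intent(in) :: kappa     ! spatial smoothing parameter, >= 0
    real(dp) :: mu(size(x)-1)
    real(dp) :: n(0:size(x))          ! concentrations, padded at both ends
    integer :: m
    m = size(x) - 1                   ! number of intervals
    ! Point concentration: reciprocal of each interval length.
    n(1:m) = 1.0_dp/(x(2:m+1) - x(1:m))
    ! Repeat the end values so the second difference is defined at the ends.
    n(0) = n(1)
    n(m+1) = n(m)
    ! Subtract kappa*(kappa+1) times the second difference of n.
    mu = n(1:m) - kappa*(kappa + 1.0_dp)* &
         (n(2:m+1) - 2.0_dp*n(1:m) + n(0:m-1))
  end function smoothed_concentration
end module grid_monitor_sketch
```

On an equally spaced grid the second difference vanishes, so mu_i equals n_i for any kappa; the smoothing term only acts where neighboring intervals differ in length.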
The discrete form of the differential equation and the smoothing equations are combined to yield
the implicit system of differential equations
$$A(Y)\,\frac{dY}{dt} = L(Y).$$
² The three-tiered equal sign, used here and below, is read "a ≡ b" or "a and b are exactly the
same object or value."
This is frequently a stiff differential-algebraic system. It is solved using the integrator DASPG and
its subroutines, including D2SPG. These are documented in this chapter. Note that DASPG is
restricted to use within PDE_1D_MG until the routine exits with the flag IDO = 3. If DASPG is
needed during the evaluations of the differential equations or boundary conditions, use of a second
processor and inter-process communication is required. The only options for DASPG set by
PDE_1D_MG are the Maximum BDF Order, and the absolute and relative error values, ATOL and
RTOL. Users may set other options using the Options Manager. This is described in routine
DASPG and generally in Chapter 11 of this manual.
$$\beta_1 = 1,\ \beta_2 = 0,\ \gamma_1 = 0,\ \gamma_2 = v,\quad \text{at } x = x_L = 0$$
$$\beta_1 = 0,\ \beta_2 = 1,\ \gamma_1 = u - 1,\ \gamma_2 = 0,\quad \text{at } x = x_R = 1$$
Rationale: Example 1
This is a non-linear problem with sharply changing conditions near t = 0 . The default settings of
integration parameters allow the problem to be solved. The use of PDE_1D_MG with forward
communication requires three subroutines provided by the user to describe the initial conditions,
differential equations, and boundary conditions.
program PDE_EX1
! Electrodynamics Model:
USE PDE_1d_mg_int
IMPLICIT NONE
INTEGER, PARAMETER :: NPDE=2, N=51, NFRAMES=5
INTEGER I, IDO
! Define array space for the solution.
real(kind(1d0))
real(kind(1d0))
DELTA_T=10D0,
EXTERNAL IC_01,
Additional Examples
Example 2 - Inviscid Flow on a Plate
This example is a first order system from Pennington and Berzins, (1994). The equations are
$$u_t = -v_x$$
$$u\,u_t = -v\,u_x + w_{xx}$$
$$w = u_x,\ \text{implying that } u\,u_t = -v\,u_x + u_{xx}$$
$$u(0,t) = v(0,t) = 0,\quad u(\infty,t) = u(x_R,t) = 1,\quad t \ge 0$$
$$u(x,0) = 1,\quad v(x,0) = 0,\quad x \ge 0$$
Following elimination of w, there remain NPDE = 2 differential equations. The variable t is not
time, but a second space variable. The integration goes from t = 0 to t = 5 . It is necessary to
truncate the variable x at a finite value, say xmax = x R = 25 . In terms of the integrator, the system
is defined by letting m = 0 and
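$$C = \{C_{jk}\} = \begin{bmatrix} 1 & 0 \\ u & 0 \end{bmatrix},\quad R = \begin{bmatrix} -v \\ u_x \end{bmatrix},\quad Q = \begin{bmatrix} 0 \\ v\,u_x \end{bmatrix}$$
$$\beta = 0,\ \gamma = \begin{bmatrix} u - \exp(-20t) \\ v \end{bmatrix},\ \text{at } x = x_L$$
$$\beta = 0,\ \gamma = \begin{bmatrix} u - 1 \\ v_x \end{bmatrix},\ \text{at } x = x_R$$
We use N = 10 + 51 = 61 grid points and output the solution at steps of Δt = 0.1.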
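As a check on the definitions for this example, Equation 2 can be expanded row by row. The sketch below assumes m = 0 and the values C = [[1, 0], [u, 0]], R = (-v, u_x)^T, Q = (0, v u_x)^T, which match the boundary-layer equations after w is eliminated.

```latex
% Equation 2 with m = 0:  \sum_k C_{j,k}\, u^k_t = (R_j)_x - Q_j,
% where u^1 = u and u^2 = v.
\[
  j = 1:\quad 1\cdot u_t + 0\cdot v_t = (-v)_x - 0
  \ \Longrightarrow\ u_t = -v_x ,
\]
\[
  j = 2:\quad u\cdot u_t + 0\cdot v_t = (u_x)_x - v\,u_x
  \ \Longrightarrow\ u\,u_t = -v\,u_x + u_{xx} .
\]
```

The second column of C is zero, so no time derivative of v appears; this is what makes the discretized system differential-algebraic rather than a plain system of ordinary differential equations.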
Rationale: Example 2
This is a non-linear boundary layer problem with sharply changing conditions near t = 0 . The
problem statement was modified so that boundary conditions are continuous near t = 0 . Without
this change the underlying integration software, DASPG, cannot solve the problem. The
continuous blending function u − exp(−20t) is arbitrary and artfully chosen. This is a
mathematical change to the problem, required because of the stated discontinuity at t = 0 . Reverse
communication is used for the problem data. No additional user-written subroutines are required
when using reverse communication. We also have chosen 10 of the initial grid points to be
concentrated near x L = 0 , anticipating rapid change in the solution near that point. Optional
changes are made to use a pure absolute error tolerance and non-zero time-smoothing.
program PDE_1D_MG_EX02
! Inviscid Flow Over a Plate
USE PDE_1d_mg_int
USE ERROR_OPTION_PACKET
IMPLICIT NONE
INTEGER, PARAMETER :: NPDE=2, N1=10, N2=51, N=N1+N2
INTEGER I, IDO, NFRAMES
! Define array space for the solution.
real(kind(1d0)) U(NPDE+1,N), T0, TOUT, DX1, DX2, DIF
real(kind(1d0)) :: ZERO=0D0, ONE=1D0, DELTA_T=1D-1,&
TEND=5D0, XMAX=25D0
real(kind(1d0)) :: U0=1D0, U1=0D0, TDELTA=1D-1, TOL=1D-2
TYPE(D_OPTIONS) IOPT(3)
! Start loop to integrate and record solution values.
IDO=1
DO
SELECT CASE (IDO)
! Define values that determine limits and options.
CASE (1)
T0=ZERO
TOUT=DELTA_T
U(NPDE+1,1)=ZERO;U(NPDE+1,N)=XMAX
OPEN(FILE='PDE_ex02.out',UNIT=7)
NFRAMES=NINT((TEND+DELTA_T)/DELTA_T)
WRITE(7, "(3I5, 4D14.5)") NPDE, N, NFRAMES,&
DIF=EXP(-20D0*D_PDE_1D_MG_T)
! Blend the left boundary value down to zero.
D_PDE_1D_MG_GAMMA=(/D_PDE_1D_MG_U(1)-DIF,D_PDE_1D_MG_U(2)/)
ELSE
D_PDE_1D_MG_GAMMA=(/D_PDE_1D_MG_U(1)-ONE,D_PDE_1D_MG_DUDX(2)/)
END IF
END SELECT
! Reverse communication is used for the problem data.
CALL PDE_1D_MG (T0, TOUT, IDO, U, IOPT=IOPT)
END DO
end program
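$$I(t) = \int_0^a u(x,t)\,dx$$
$$u(x,0) = \frac{\exp(-x)}{2 - \exp(-a)}$$
$$u(0,t) = g\left(\int_0^a b(x, I(t))\,u(x,t)\,dx,\ t\right),\ \text{where}$$
$$b(x,y) = \frac{x\,y\,\exp(-x)}{(y+1)^2},\ \text{and}$$
$$g(z,t) = \frac{4z\,\left(2 - 2\exp(-a) + \exp(-t)\right)^2}{\left(1 - \exp(-a)\right)\left(1 - (1+2a)\exp(-2a)\right)\left(1 - \exp(-a) + \exp(-t)\right)}$$
This problem involves the unknown
$$u(x,t) = \frac{\exp(-x)}{1 - \exp(-a) + \exp(-t)}$$
across the entire domain. The software can solve the problem by introducing two dependent
algebraic equations:
$$v_1(t) = \int_0^a u(x,t)\,dx,\quad v_2(t) = \int_0^a x\exp(-x)\,u(x,t)\,dx$$
With these, the system becomes
$$u_t = -u_x - v_1 u,\quad 0 \le x \le a,\quad t \ge 0$$
$$u(0,t) = \frac{g(1,t)\,v_1 v_2}{(v_1 + 1)^2}$$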
In the interface to the evaluation of the differential equation and boundary conditions, it is
necessary to evaluate the integrals, which are computed with the values of u ( x, t ) on the grid.
The integrals are approximated using the trapezoid rule, commensurate with the truncation error in
the integrator.
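The trapezoid rule on a moving, possibly nonuniform grid is short enough to isolate. The module below is a minimal sketch with hypothetical names (trapezoid_sketch, trapz); it uses the same array expression that appears in the example program, applied to grid values x(:) and function values f(:).

```fortran
module trapezoid_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
contains
  ! Composite trapezoid rule on a possibly nonuniform grid.
  ! x(:) holds the grid points and f(:) the values at those points;
  ! both arrays are assumed to have the same size.
  pure function trapz(x, f) result(s)
    real(dp), intent(in) :: x(:), f(:)
    real(dp) :: s
    integer :: n
    n = size(x)
    ! Sum of (average of neighboring values) * (interval length).
    s = 0.5_dp*sum((f(1:n-1) + f(2:n))*(x(2:n) - x(1:n-1)))
  end function trapz
end module trapezoid_sketch
```

In the notation of the example, v_1 would correspond to trapz(U(2,:), U(1,:)) and v_2 to a call with the integrand x*exp(-x)*u sampled on the grid. The example program evaluates v_2 at interval midpoints instead, which is a different quadrature of the same order.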
Rationale: Example 3
This is a non-linear integro-differential problem involving non-local conditions for the differential
equation and boundary conditions. Access to evaluation of these conditions is provided using
reverse communication. It is not possible to solve this problem with forward communication,
given the current subroutine interface. Optional changes are made to use an absolute error
tolerance and non-zero time-smoothing. The time-smoothing value τ = 1 prevents grid lines from
crossing.
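A closed-form solution is useful for checking computed output from this example. The derivation below is a sketch under the assumption that the model reads u_t = -u_x - I(t) u with I(t) the integral of u over 0 ≤ x ≤ a, which agrees with the terms R = -u and Q = v_1 u set in the program; the symbol D(t) is introduced here only for brevity.

```latex
% Candidate solution and its denominator:
\[
  u(x,t) = \frac{e^{-x}}{D(t)}, \qquad D(t) = 1 - e^{-a} + e^{-t} .
\]
% Integral over the domain:
\[
  I(t) = \int_0^a u\,dx = \frac{1 - e^{-a}}{D(t)} .
\]
% Left side of the equation:
\[
  u_t = -\frac{e^{-x} D'(t)}{D(t)^2} = \frac{e^{-x} e^{-t}}{D(t)^2} .
\]
% Right side, using u_x = -u:
\[
  -u_x - I\,u = u\left(1 - \frac{1 - e^{-a}}{D}\right)
             = u\,\frac{e^{-t}}{D}
             = \frac{e^{-x} e^{-t}}{D(t)^2} .
\]
% At t = 0, D(0) = 2 - e^{-a}, which reproduces the initial data.
```

Both sides agree, so the candidate satisfies the differential equation and the initial condition.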
program PDE_1D_MG_EX03
! Population Dynamics Model.
USE PDE_1d_mg_int
USE ERROR_OPTION_PACKET
IMPLICIT NONE
INTEGER, PARAMETER :: NPDE=1, N=101
INTEGER IDO, I, NFRAMES
! Define array space for the solution.
real(kind(1d0)) U(NPDE+1,N), MID(N-1), T0, TOUT, V_1, V_2
real(kind(1d0)) :: ZERO=0D0, HALF=5D-1, ONE=1D0,&
TWO=2D0, FOUR=4D0, DELTA_T=1D-1,TEND=5D0, A=5D0
TYPE(D_OPTIONS) IOPT(3)
! Start loop to integrate and record solution values.
IDO=1
DO
SELECT CASE (IDO)
! Define values that determine limits.
CASE (1)
T0=ZERO
TOUT=DELTA_T
U(NPDE+1,1)=ZERO;U(NPDE+1,N)=A
OPEN(FILE='PDE_ex03.out',UNIT=7)
NFRAMES=NINT((TEND+DELTA_T)/DELTA_T)
WRITE(7, "(3I5, 4D14.5)") NPDE, N, NFRAMES,&
U(NPDE+1,1), U(NPDE+1,N), T0, TEND
IOPT(1)=D_OPTIONS(PDE_1D_MG_RELATIVE_TOLERANCE,ZERO)
IOPT(2)=D_OPTIONS(PDE_1D_MG_ABSOLUTE_TOLERANCE,1D-2)
IOPT(3)=D_OPTIONS(PDE_1D_MG_TIME_SMOOTHING,1D0)
! Update to the next output point.
! Write solution and check for final point.
CASE (2)
T0=TOUT
IF(T0 <= TEND) THEN
WRITE(7,"(F10.5)")TOUT
DO I=1,NPDE+1
WRITE(7,"(4E15.5)")U(I,:)
END DO
TOUT=MIN(TOUT+DELTA_T,TEND)
IF(T0 == TEND)IDO=3
END IF
! All completed. Solver is shut down.
CASE (3)
CLOSE(UNIT=7)
EXIT
! Define initial data values.
CASE (5)
U(1,:)=EXP(-U(2,:))/(TWO-EXP(-A))
WRITE(7,"(F10.5)")T0
DO I=1,NPDE+1
WRITE(7,"(4E15.5)")U(I,:)
END DO
! Define differential equations.
CASE (6)
D_PDE_1D_MG_C(1,1)=ONE
D_PDE_1D_MG_R(1)=-D_PDE_1D_MG_U(1)
! Evaluate the approximate integral, for this t.
V_1=HALF*SUM((U(1,1:N-1)+U(1,2:N))*&
(U(2,2:N) - U(2,1:N-1)))
D_PDE_1D_MG_Q(1)=V_1*D_PDE_1D_MG_U(1)
! Define boundary conditions.
CASE (7)
IF(PDE_1D_MG_LEFT) THEN
! Evaluate the approximate integral, for this t.
! A second integral is needed at the edge.
V_1=HALF*SUM((U(1,1:N-1)+U(1,2:N))*&
(U(2,2:N) - U(2,1:N-1)))
MID=HALF*(U(2,2:N)+U(2,1:N-1))
V_2=HALF*SUM(MID*EXP(-MID)*&
(U(1,1:N-1)+U(1,2:N))*(U(2,2:N)-U(2,1:N-1)))
D_PDE_1D_MG_BETA=ZERO
D_PDE_1D_MG_GAMMA=G(ONE,D_PDE_1D_MG_T)*V_1*V_2/(V_1+ONE)**2-&
D_PDE_1D_MG_U
ELSE
D_PDE_1D_MG_BETA=ZERO
D_PDE_1D_MG_GAMMA=D_PDE_1D_MG_DUDX(1)
END IF
END SELECT
! Reverse communication is used for the problem data.
CALL PDE_1D_MG (T0, TOUT, IDO, U, IOPT=IOPT)
END DO
CONTAINS
FUNCTION G(z,t)
IMPLICIT NONE
REAL(KIND(1d0)) Z, T, G
G=FOUR*Z*(TWO-TWO*EXP(-A)+EXP(-T))**2
G=G/((ONE-EXP(-A))*(ONE-(ONE+TWO*A)*&
EXP(-TWO*A))*(1-EXP(-A)+EXP(-T)))
END FUNCTION
end program
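$$T_z = r^{-1}\frac{\partial\left(\beta\, r\, T_r\right)}{\partial r} + \gamma \exp\left(\frac{T}{1 + \varepsilon T}\right)$$
$$T_r(0,z) = 0,\quad T(1,z) = 0,\quad z > 0$$
$$T(r,0) = 0,\quad 0 \le r < 1$$
$$\beta = 10^{-4},\quad \gamma = 1,\quad \varepsilon = 0.1$$
The axial direction z is treated as the time coordinate, and the radius r as the single space
variable.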
Rationale: Example 4
This is a non-linear problem in cylindrical coordinates. Our example illustrates assigning m = 1 in
Equation 2. We provide an optional argument that resets this value from its default, m = 0 .
Reverse communication is used to interface with the problem data.
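The role of m = 1 can be seen by matching the model to Equation 2 term by term. The sketch below assumes the identification C = 1, R = β T_r and Q = -γ exp(T/(1 + εT)), with z in the role of t and r in the role of x, which is how the program sets its arrays.

```latex
% Equation 2 with NPDE = 1, m = 1, x = r, t = z:
\[
  C\,T_z = r^{-1}\,\frac{\partial}{\partial r}\left(r\,R\right) - Q,
  \qquad C = 1,\quad R = \beta\,T_r,\quad
  Q = -\gamma \exp\left(\frac{T}{1 + \varepsilon T}\right).
\]
% Expanding the flux term shows the cylindrical Laplacian:
\[
  r^{-1}\left(r\,\beta\,T_r\right)_r
    = \beta\left(T_{rr} + \frac{1}{r}\,T_r\right).
\]
% Boundary conditions in the form of Equation 3, \beta_1 R = \gamma_1:
\[
  r = 0:\ \beta_1 = 1,\ \gamma_1 = 0 \ \Longrightarrow\ \beta\,T_r = 0,
  \qquad
  r = 1:\ \beta_1 = 0,\ \gamma_1 = T \ \Longrightarrow\ T = 0 .
\]
```

The coefficient β of the model and the boundary function β_1 of Equation 3 are different quantities; the program names the former BTA to keep them apart.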
program PDE_1D_MG_EX04
! Reactor-Diffusion problem in cylindrical coordinates.
USE pde_1d_mg_int
USE error_option_packet
IMPLICIT NONE
INTEGER, PARAMETER :: NPDE=1, N=41
INTEGER IDO, I, NFRAMES
! Define array space for the solution.
real(kind(1d0)) T(NPDE+1,N), Z0, ZOUT
real(kind(1d0)) :: ZERO=0D0, ONE=1D0, DELTA_Z=1D-1,&
ZEND=1D0, ZMAX=1D0, BTA=1D-4, GAMA=1D0, EPS=1D-1
TYPE(D_OPTIONS) IOPT(1)
! Start loop to integrate and record solution values.
IDO=1
DO
SELECT CASE (IDO)
! Define values that determine limits.
CASE (1)
Z0=ZERO
ZOUT=DELTA_Z
T(NPDE+1,1)=ZERO;T(NPDE+1,N)=ZMAX
OPEN(FILE='PDE_ex04.out',UNIT=7)
NFRAMES=NINT((ZEND+DELTA_Z)/DELTA_Z)
WRITE(7, "(3I5, 4D14.5)") NPDE, N, NFRAMES,&
T(NPDE+1,1), T(NPDE+1,N), Z0, ZEND
IOPT(1)=PDE_1D_MG_CYL_COORDINATES
! Update to the next output point.
! Write solution and check for final point.
CASE (2)
IF(Z0 <= ZEND) THEN
WRITE(7,"(F10.5)")ZOUT
DO I=1,NPDE+1
WRITE(7,"(4E15.5)")T(I,:)
END DO
ZOUT=MIN(ZOUT+DELTA_Z,ZEND)
IF(Z0 == ZEND)IDO=3
END IF
! All completed. Solver is shut down.
CASE (3)
CLOSE(UNIT=7)
EXIT
! Define initial data values.
CASE (5)
T(1,:)=ZERO
WRITE(7,"(F10.5)")Z0
DO I=1,NPDE+1
WRITE(7,"(4E15.5)")T(I,:)
END DO
! Define differential equations.
CASE (6)
D_PDE_1D_MG_C(1,1)=ONE
D_PDE_1D_MG_R(1)=BTA*D_PDE_1D_MG_DUDX(1)
D_PDE_1D_MG_Q(1)= -GAMA*EXP(D_PDE_1D_MG_U(1)/&
(ONE+EPS*D_PDE_1D_MG_U(1)))
! Define boundary conditions.
CASE (7)
IF(PDE_1D_MG_LEFT) THEN
D_PDE_1D_MG_BETA=ONE; D_PDE_1D_MG_GAMMA=ZERO
ELSE
D_PDE_1D_MG_BETA=ZERO; D_PDE_1D_MG_GAMMA=D_PDE_1D_MG_U(1)
END IF
END SELECT
! Reverse communication is used for the problem data.
! The optional derived type changes the internal model
! to use cylindrical coordinates.
CALL PDE_1D_MG (Z0, ZOUT, IDO, T, IOPT=IOPT)
END DO
end program
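$$u_t = u_{xx} - u f(v)$$
$$v_t = v_{xx} + u f(v),$$
$$\text{where } f(z) = \gamma \exp(-\beta/z),\quad \beta = 4,\quad \gamma = 3.52 \times 10^6$$
$$0 \le x \le 1,\quad 0 \le t \le 0.006$$
$$u(x,0) = 1,\quad v(x,0) = 0.2$$
$$u_x = v_x = 0,\quad x = 0$$
$$u_x = 0,\quad v = b(t),\quad x = 1,\ \text{where}$$
$$b(t) = 1.2,\ \text{for } t \ge 2 \times 10^{-4},\ \text{and}$$
$$b(t) = 0.2 + 5 \times 10^{3}\,t,\ \text{for } 0 \le t \le 2 \times 10^{-4}$$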
Rationale: Example 5
This is a non-linear problem. The example shows the model steps for replacing the banded solver
in the software with one of the user's choice. Reverse communication is used for the interface to
the problem data and the linear solver. Following the computation of the matrix factorization in
DL2CRB, we declare the system to be singular when the reciprocal of the condition number is
smaller than the working precision. This choice is not suitable for all problems. Attention must
be given to detecting a singularity when this option is used.
program PDE_1D_MG_EX05
! Flame propagation model
USE pde_1d_mg_int
USE ERROR_OPTION_PACKET
USE Numerical_Libraries, ONLY :&
dl2crb, dlfsrb
IMPLICIT NONE
INTEGER, PARAMETER :: NPDE=2, N=40, NEQ=(NPDE+1)*N
INTEGER I, IDO, NFRAMES, IPVT(NEQ)
! Define array space for the solution.
real(kind(1d0)) U(NPDE+1,N), T0, TOUT
! Define work space for the banded solver.
real(kind(1d0)) WORK(NEQ), RCOND
real(kind(1d0)) :: ZERO=0D0, ONE=1D0, DELTA_T=1D-4,&
TEND=6D-3, XMAX=1D0, BTA=4D0, GAMA=3.52D6
TYPE(D_OPTIONS) IOPT(1)
! Start loop to integrate and record solution values.
IDO=1
DO
SELECT CASE (IDO)
! Define values that determine limits.
CASE (1)
T0=ZERO
TOUT=DELTA_T
U(NPDE+1,1)=ZERO; U(NPDE+1,N)=XMAX
OPEN(FILE='PDE_ex05.out',UNIT=7)
NFRAMES=NINT((TEND+DELTA_T)/DELTA_T)
$$u(x,0) = 1$$
$$u_x = 0,\quad x = 0$$
$$u = 1,\quad x = 1$$
Rationale: Example 6
This is a non-linear problem. The output shows a case where a rapidly changing front, or hot-spot,
develops well into the integration. This causes rapid change to the grid. An option raises the
maximum order of the BDF formulas from its default value of 2 to the theoretically stable
maximum value of 5.
USE pde_1d_mg_int
USE error_option_packet
IMPLICIT NONE
Rationale: Example 7
This is a non-linear system of first order equations.
program PDE_1D_MG_EX07
! Traveling Waves
USE pde_1d_mg_int
USE error_option_packet
IMPLICIT NONE
INTEGER, PARAMETER :: NPDE=2, N=50
Example 8 - Black-Scholes
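The value of a European call option, c(s, t), with exercise price e, satisfies the Black-Scholes
differential equation
$$c_t + \frac{\sigma^2}{2}\,s^2 c_{ss} + r s c_s - r c \equiv c_t + \frac{\sigma^2}{2}\left(s^2 c_s\right)_s + \left(r - \sigma^2\right) s c_s - r c = 0.$$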
The parameters in the model are the risk-free interest rate, r, and the stock volatility, σ. The
boundary conditions are c(0, t) = 0 and c_s(s, t) ≈ 1, s → ∞. This development is described in
Wilmott, et al. (1995), pages 41-57. There are explicit solutions for this equation based on the
Normal Curve of Probability. The normal curve, and the solution itself, can be efficiently
computed with the IMSL function ANORDF, IMSL (1994), page 186. With numerical
integration the equation itself or the payoff can be readily changed to include other formulas,
c(s, T), and corresponding boundary conditions. We use
e = 100, r = 0.08, T − t = 0.25, σ² = 0.04, s_L = 0, and s_R = 150.
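The Black-Scholes equation is usually written with the term s² c_ss, while Equation 2 expects a flux under an x-derivative. The short derivation below is a sketch showing why the two forms agree; it uses only the product rule.

```latex
% Product rule applied to the flux-like term:
\[
  \left(s^2 c_s\right)_s = s^2 c_{ss} + 2 s\, c_s .
\]
% Multiply by \sigma^2/2 and add the adjusted first-order term:
\[
  \frac{\sigma^2}{2}\left(s^2 c_s\right)_s + \left(r - \sigma^2\right) s\, c_s
    = \frac{\sigma^2}{2}\, s^2 c_{ss} + \sigma^2 s\, c_s
      + r s\, c_s - \sigma^2 s\, c_s
    = \frac{\sigma^2}{2}\, s^2 c_{ss} + r s\, c_s .
\]
```

The extra σ² s c_s produced by differentiating s² is cancelled by replacing r with r − σ² in the first-order term, so the second-order part can be carried as a flux while the remaining terms are carried as a source.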
Rationale: Example 8
This is a linear problem but with initial conditions that are discontinuous. It is necessary to use a
positive time-smoothing value to prevent grid lines from crossing. We have used an absolute
tolerance of 10⁻³. In $US, this is one-tenth of a cent.
program PDE_1D_MG_EX08
! Black-Scholes call price
USE pde_1d_mg_int
USE error_option_packet
IMPLICIT NONE
INTEGER, PARAMETER :: NPDE=1, N=100
INTEGER I, IDO, NFRAMES
! Define array space for the solution.
real(kind(1d0)) U(NPDE+1,N), T0, TOUT, SIGSQ, XVAL
real(kind(1d0)) :: ZERO=0D0, HALF=5D-1, ONE=1D0, &
DELTA_T=25D-3, TEND=25D-2, XMAX=150, SIGMA=2D-1, &
R=8D-2, E=100D0
TYPE(D_OPTIONS) IOPT(5)
! Start loop to integrate and record solution values.
IDO=1
DO
SELECT CASE (IDO)
! Define values that determine limits.
CASE (1)
T0=ZERO
TOUT=DELTA_T
U(NPDE+1,1)=ZERO; U(NPDE+1,N)=XMAX
OPEN(FILE='PDE_ex08.out',UNIT=7)
NFRAMES=NINT((TEND+DELTA_T)/DELTA_T)
WRITE(7, "(3I5, 4D14.5)") NPDE, N, NFRAMES,&
U(NPDE+1,1), U(NPDE+1,N), T0, TEND
SIGSQ=SIGMA**2
! Illustrate allowing the BDF order to increase
! to its maximum allowed value.
IOPT(1)=PDE_1D_MG_MAX_BDF_ORDER
IOPT(2)=5
IOPT(3)=D_OPTIONS(PDE_1D_MG_TIME_SMOOTHING,5D-3)
IOPT(4)=D_OPTIONS(PDE_1D_MG_RELATIVE_TOLERANCE,ZERO)
IOPT(5)=D_OPTIONS(PDE_1D_MG_ABSOLUTE_TOLERANCE,1D-2)
! Update to the next output point.
! Write solution and check for final point.
CASE (2)
T0=TOUT
IF(T0 <= TEND) THEN
WRITE(7,"(F10.5)")TOUT
DO I=1,NPDE+1
WRITE(7,"(4E15.5)")U(I,:)
END DO
TOUT=MIN(TOUT+DELTA_T,TEND)
IF(T0 == TEND)IDO=3
END IF
! All completed. Solver is shut down.
CASE (3)
CLOSE(UNIT=7)
EXIT
MPI REQUIRED
For a detailed description of MPI Requirements see Dense Matrix Parallelism Using MPI in
Chapter 10 of this manual.
This example, described above in Example 1, is from Blom and Zegeling (1994). The system
parameters ε, p, and η, are varied, using uniform random numbers. The intervals studied are
0.1 ≤ ε ≤ 0.2, 0.1 ≤ p ≤ 0.2, and 10 ≤ η ≤ 20. Using N = 21 grid values and other program
options, the elapsed time, parameter values, and the value of v(x, t) at x = 1, t = 4 are sent to
the root node. This information is written on a file. The final summary includes the minimum
value of v(x, t) at x = 1, t = 4, and the maximum and average time per integration, per node.
Rationale: Example 9
This is a non-linear simulation problem. Using at least two integrating processors and MPI allows
more values of the parameters to be studied in a given time than with a single processor. This
code is valuable as a study guide when an application needs to estimate timing and other output
parameters. The simulation time is controlled at the root node. An integration is started, after
receiving results, within the first SIM_TIME seconds. The elapsed time will be longer than
SIM_TIME by the slowest processor's time for its last integration.
program PDE_1D_MG_EX09
! Electrodynamics Model, parameter study.
USE PDE_1d_mg_int
USE MPI_SETUP_INT
USE RAND_INT
USE SHOW_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: NPDE=2, N=21
INTEGER I, IDO, IERROR, CONTINUE, STATUS(MPI_STATUS_SIZE)
INTEGER, ALLOCATABLE :: COUNTS(:)
! Define array space for the solution.
real(kind(1d0)) :: U(NPDE+1,N), T0, TOUT
real(kind(1d0)) :: ZERO=0D0, ONE=1D0,DELTA_T=10D0, TEND=4D0
! SIM_TIME is the number of seconds to run the simulation.
real(kind(1d0)) :: EPS, P, ETA, Z, TWO=2D0, THREE=3D0, SIM_TIME=60D0
real(kind(1d0)) :: TIMES, TIMEE, TIMEL, TIME, TIME_SIM, V_MIN, &
DATA(5)
real(kind(1d0)), ALLOCATABLE :: AV_TIME(:), MAX_TIME(:)
TYPE(D_OPTIONS) IOPT(4), SHOW_IOPT(2)
TYPE(S_OPTIONS) SHOW_INTOPT(2)
MP_NPROCS=MP_SETUP(1)
MPI_NODE_PRIORITY=(/(I-1,I=1,MP_NPROCS)/)
! If MP_NPROCS=1, the program stops. Change
! MPI_ROOT_WORKS=.TRUE. if MP_NPROCS=1.
MPI_ROOT_WORKS=.FALSE.
IF(.NOT. MPI_ROOT_WORKS .and. MP_NPROCS == 1) STOP
ALLOCATE(AV_TIME(MP_NPROCS), MAX_TIME(MP_NPROCS), COUNTS(MP_NPROCS))
! Get time start for simulation timing.
TIME=MPI_WTIME()
IF(MP_RANK == 0) OPEN(FILE='PDE_ex09.out',UNIT=7)
SIMULATE: DO
! Pick random parameter values.
EPS=1D-1*(ONE+rand(EPS))
P=1D-1*(ONE+rand(P))
ETA=10D0*(ONE+rand(ETA))
! Start loop to integrate and communicate solution times.
IDO=1
! Get time start for each new problem.
DO
IF(.NOT. MPI_ROOT_WORKS .and. MP_RANK == 0) EXIT
SELECT CASE (IDO)
! Define values that determine limits.
CASE (1)
T0=ZERO
TOUT=1D-3
U(NPDE+1,1)=ZERO;U(NPDE+1,N)=ONE
IOPT(1)=PDE_1D_MG_MAX_BDF_ORDER
IOPT(2)=5
IOPT(3)=D_OPTIONS(PDE_1D_MG_RELATIVE_TOLERANCE,1D-2)
IOPT(4)=D_OPTIONS(PDE_1D_MG_ABSOLUTE_TOLERANCE,1D-2)
TIMES=MPI_WTIME()
! Update to the next output point.
! Write solution and check for final point.
CASE (2)
T0=TOUT;TOUT=TOUT*DELTA_T
IF(T0 >= TEND) IDO=3
TOUT=MIN(TOUT, TEND)
! All completed. Solver is shut down.
CASE (3)
TIMEE=MPI_WTIME()
EXIT
! Define initial data values.
CASE (5)
U(1,:)=1D0;U(2,:)=0D0
! Define differential equations.
CASE (6)
D_PDE_1D_MG_C=0D0;D_PDE_1D_MG_C(1,1)=1D0;D_PDE_1D_MG_C(2,2)=1D0
D_PDE_1D_MG_R=P*D_PDE_1D_MG_DUDX
D_PDE_1D_MG_R(1)=D_PDE_1D_MG_R(1)*EPS
Z=ETA*(D_PDE_1D_MG_U(1)-D_PDE_1D_MG_U(2))/THREE
D_PDE_1D_MG_Q(1)=EXP(Z)-EXP(-TWO*Z)
D_PDE_1D_MG_Q(2)=-D_PDE_1D_MG_Q(1)
! Define boundary conditions.
CASE (7)
IF(PDE_1D_MG_LEFT) THEN
D_PDE_1D_MG_BETA(1)=1D0;D_PDE_1D_MG_BETA(2)=0D0
D_PDE_1D_MG_GAMMA(1)=0D0;D_PDE_1D_MG_GAMMA(2)=D_PDE_1D_MG_U(2)
ELSE
D_PDE_1D_MG_BETA(1)=0D0;D_PDE_1D_MG_BETA(2)=1D0
D_PDE_1D_MG_GAMMA(1)=D_PDE_1D_MG_U(1)- &
1D0;D_PDE_1D_MG_GAMMA(2)=0D0
END IF
END SELECT
! Reverse communication is used for the problem data.
CALL PDE_1D_MG (T0, TOUT, IDO, U)
END DO
TIMEL=TIMEE-TIMES
DATA=(/EPS, P, ETA, U(2,N), TIMEL/)
IF(MP_RANK > 0) THEN
! Send parameters and time to the root.
CALL MPI_SEND(DATA, 5, MPI_DOUBLE_PRECISION,0, MP_RANK, &
MP_LIBRARY_WORLD, IERROR)
! Receive back a "go/stop" flag.
CALL MPI_RECV(CONTINUE, 1, MPI_INTEGER, 0, MPI_ANY_TAG, &
MP_LIBRARY_WORLD, STATUS, IERROR)
! If root notes that time is up, it sends node a quit flag.
IF(CONTINUE == 0) EXIT SIMULATE
ELSE
! If root is working, record its result and then stand ready
! for other nodes to send.
IF(MPI_ROOT_WORKS) WRITE(7,*) MP_RANK, DATA
! If all nodes have reported, then quit.
IF(COUNT(MPI_NODE_PRIORITY >= 0) == 0) EXIT SIMULATE
! See if time is up. Some nodes still must report.
IF(MPI_WTIME()-TIME >= SIM_TIME) THEN
CONTINUE=0
ELSE
CONTINUE=1
END IF
! Root receives simulation data and finds which node sent it.
IF(MP_NPROCS > 1) THEN
CALL MPI_RECV(DATA, 5, MPI_DOUBLE_PRECISION, &
MPI_ANY_SOURCE, MPI_ANY_TAG, MP_LIBRARY_WORLD, &
STATUS, IERROR)
WRITE(7,*) STATUS(MPI_SOURCE), DATA
! If time at the root has elapsed, nodes receive signal to stop.
! Send the reporting node the "go/stop" flag.
! Mark if a node has been stopped.
CALL MPI_SEND(CONTINUE, 1, MPI_INTEGER, &
STATUS(MPI_SOURCE), 0, MP_LIBRARY_WORLD, IERROR)
IF (CONTINUE == 0) MPI_NODE_PRIORITY(STATUS(MPI_SOURCE)+1)&
=- MPI_NODE_PRIORITY(STATUS(MPI_SOURCE)+1)-1
END IF
IF (CONTINUE == 0) MPI_NODE_PRIORITY(1)=-1
END IF
END DO SIMULATE
IF(MP_RANK == 0) THEN
ENDFILE(UNIT=7);REWIND(UNIT=7)
! Read the data. Find extremes and averages.
MAX_TIME=ZERO;AV_TIME=ZERO;COUNTS=0;V_MIN=HUGE(ONE)
DO
READ(7,*, END=10) I, DATA
COUNTS(I+1)=COUNTS(I+1)+1
AV_TIME(I+1)=AV_TIME(I+1)+DATA(5)
IF(MAX_TIME(I+1) < DATA(5)) MAX_TIME(I+1)=DATA(5)
V_MIN=MIN(V_MIN, DATA(4))
END DO
10 CONTINUE
CLOSE(UNIT=7)
! Set printing Index to match node numbering.
SHOW_IOPT(1)= SHOW_STARTING_INDEX_IS
SHOW_IOPT(2)=0
SHOW_INTOPT(1)=SHOW_STARTING_INDEX_IS
SHOW_INTOPT(2)=0
CALL SHOW(MAX_TIME,"Maximum Integration Time, per process:",IOPT=SHOW_IOPT)
AV_TIME=AV_TIME/MAX(1,COUNTS)
CALL SHOW(AV_TIME,"Average Integration Time, per process:",IOPT=SHOW_IOPT)
CALL SHOW(COUNTS,"Number of Integrations",IOPT=SHOW_INTOPT)
WRITE(*,"(1x,A,F6.3)") "Minimum value for v(x,t),at x=1,t=4: ",V_MIN
END IF
MP_NPROCS=MP_SETUP("Final")
end program
MOLCH
Solves a system of partial differential equations of the form $u_t = f(x, t, u, u_x, u_{xx})$ using the method
of lines. The solution is represented with cubic Hermite polynomials.
Required Arguments
IDO Flag indicating the state of the computation. (Input/Output)
IDO   State
1     Initial entry
2     Normal reentry
Normally, the initial call is made with IDO = 1. The routine then sets IDO = 2, and this
value is then used for all but the last call that is made with IDO = 3.
FCNUT User-supplied SUBROUTINE to evaluate the function ut. The usage is
CALL FCNUT (NPDES, X, T, U, UX, UXX, UT), where
NPDES Number of equations. (Input)
X Space variable, x. (Input)
T Time variable, t. (Input)
U Array of length NPDES containing the dependent variable values,
u. (Input)
UX Array of length NPDES containing the first derivatives ux.
(Input)
UXX Array of length NPDES containing the second derivative uxx.
(Input)
UT Array of length NPDES containing the computed derivatives, ut.
(Output)
(Input)
T Time variable, t. (Input)
ALPHA Array of length NPDES containing the $\alpha_k$ values. (Output)
$d\gamma_k/dt = \gamma_k'$
(Output)
The name FCNBC must be declared EXTERNAL in the calling program.
T Independent variable, t. (Input/Output)
On input, T supplies the initial time, t0. On output, T is set to the value to which the
integration has been updated. Normally, this new value is TEND.
TEND Value of t = tend at which the solution is desired. (Input)
XBREAK Array of length NX containing the break points for the cubic Hermite splines
used in the x discretization. (Input)
The points in the array XBREAK must be strictly increasing. The values XBREAK(1) and
XBREAK(NX) are the endpoints of the interval.
Y Array of size NPDES by NX containing the solution. (Input/Output)
The array Y contains the solution as Y(k, i) = $u_k(x, t_{end})$ at x = XBREAK(i). On input, Y
contains the initial values. It MUST satisfy the boundary conditions. On output, Y
contains the computed solution.
There is an optional application of MOLCH that uses derivative values, $u_x(x, t_0)$. The user
allocates twice the space for Y to pass this information. The optional derivative
information is input as
$$Y(k, i + NX) = \frac{\partial u_k}{\partial x}(x, t_0)$$
and is output as
$$Y(k, i + NX) = \frac{\partial u_k}{\partial x}(x, t_{end})$$
at x = X(i). To signal that this information is provided, use an options manager call as
outlined in Comment 3 and illustrated in Examples 3 and 4.
Optional Arguments
NPDES Number of differential equations. (Input)
Default: NPDES = size (Y,1).
NX Number of mesh points or lines. (Input)
Default: NX = size (Y,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
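Let M = NPDES, N = NX and $x_i$ = XBREAK(I). The routine MOLCH uses the method of lines to solve
the partial differential equation system
$$\frac{\partial u_k}{\partial t} = f_k\left(x, t, u_1, \ldots, u_M, \frac{\partial u_1}{\partial x}, \ldots, \frac{\partial u_M}{\partial x}, \frac{\partial^2 u_1}{\partial x^2}, \ldots, \frac{\partial^2 u_M}{\partial x^2}\right)$$
with the initial values $u_k(x, t_0)$ given at $t = t_0$ and the boundary conditions
$$\alpha_k u_k + \beta_k \frac{\partial u_k}{\partial x} = \gamma_k(t)$$
at $x = x_1$ and at $x = x_N$,
for $k = 1, \ldots, M$.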
Cubic Hermite polynomials are used in the x variable approximation so that the trial solution is
expanded in the series
$$u_k(x, t) = \sum_{i=1}^{N} \left( a_{i,k}(t)\,\phi_i(x) + b_{i,k}(t)\,\psi_i(x) \right)$$
where $\phi_i(x)$ and $\psi_i(x)$ are the standard basis functions for the cubic Hermite polynomials with the
knots $x_1 < x_2 < \cdots < x_N$. These are piecewise cubic polynomials with continuous first derivatives.
At the breakpoints, they satisfy
$$\phi_i(x_l) = \delta_{il}, \qquad \psi_i(x_l) = 0, \qquad \frac{d\phi_i}{dx}(x_l) = 0, \qquad \frac{d\psi_i}{dx}(x_l) = \delta_{il}$$
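According to the collocation method, the coefficients of the approximation are obtained so that the
trial solution satisfies the differential equation at the two Gaussian points in each subinterval,
$$p_{2j-1} = x_j + \frac{3 - \sqrt{3}}{6}\,(x_{j+1} - x_j)$$
$$p_{2j} = x_j + \frac{3 + \sqrt{3}}{6}\,(x_{j+1} - x_j)$$
The collocation approximation to the differential equation is
$$\sum_{i=1}^{N} \left( \frac{da_{i,k}}{dt}\,\phi_i(p_j) + \frac{db_{i,k}}{dt}\,\psi_i(p_j) \right) = f_k\left(p_j, t, u_1(p_j), \ldots, u_M(p_j), \ldots, (u_1)_{xx}(p_j), \ldots, (u_M)_{xx}(p_j)\right)$$
The boundary conditions are imposed in differentiated form,
$$\alpha_k \frac{da_k}{dt} + \beta_k \frac{db_k}{dt} = \frac{d\gamma_k}{dt}$$
for $k = 1, \ldots, M$.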
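The placement of the collocation points can be sketched with a short stand-alone program. This is an illustration only, not part of the library: it assumes a uniform mesh of NX = 8 breakpoints on [0, 1] (the mesh of Example 1), and the program name and the array P are hypothetical. The points are placed at the offsets (3 - sqrt(3))/6 and (3 + sqrt(3))/6 of each subinterval length, measured from the left breakpoint.

```fortran
program gauss_collocation_points
  ! Sketch only: lists the two Gaussian collocation points in each
  ! subinterval of a breakpoint mesh. Not an IMSL routine.
  implicit none
  integer, parameter :: nx = 8
  real :: xbreak(nx), p(2*(nx-1)), h, c1, c2
  integer :: i, j

  ! Uniform breakpoints on [0, 1] (assumed mesh, as in Example 1).
  do i = 1, nx
     xbreak(i) = real(i-1)/real(nx-1)
  end do

  ! The two offsets are symmetric about the subinterval midpoint.
  c1 = (3.0 - sqrt(3.0))/6.0
  c2 = (3.0 + sqrt(3.0))/6.0

  ! Two points per subinterval, 2*(NX-1) points in total.
  do j = 1, nx-1
     h        = xbreak(j+1) - xbreak(j)
     p(2*j-1) = xbreak(j) + c1*h
     p(2*j)   = xbreak(j) + c2*h
  end do

  do j = 1, 2*(nx-1)
     write (*,'(I4,F12.6)') j, p(j)
  end do
end program gauss_collocation_points
```

Every point lies strictly inside its subinterval, so the differential equation is never collocated at a breakpoint unless a boundary condition is switched off.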
The initial conditions $u_k(x, t_0)$ must satisfy the boundary conditions. Also, the $\gamma_k(t)$ must be
continuous and have a smooth derivative, or the boundary conditions will not be properly imposed
for $t > t_0$.
If $\alpha_k = \beta_k = 0$, it is assumed that no boundary condition is desired for the k-th unknown at the left
endpoint. A similar comment holds for the right endpoint. Thus, collocation is done at the
endpoint. This is generally a useful feature for systems of first-order partial differential equations.
If the number of partial differential equations is M = 1 and the number of breakpoints is N = 4,
then
$$A = \begin{bmatrix}
\alpha_1 & \beta_1 & & & & & & \\
\phi_1(p_1) & \psi_1(p_1) & \phi_2(p_1) & \psi_2(p_1) & & & & \\
\phi_1(p_2) & \psi_1(p_2) & \phi_2(p_2) & \psi_2(p_2) & & & & \\
 & & \phi_3(p_3) & \psi_3(p_3) & \phi_4(p_3) & \psi_4(p_3) & & \\
 & & \phi_3(p_4) & \psi_3(p_4) & \phi_4(p_4) & \psi_4(p_4) & & \\
 & & & & \phi_5(p_5) & \psi_5(p_5) & \phi_6(p_5) & \psi_6(p_5) \\
 & & & & \phi_5(p_6) & \psi_5(p_6) & \phi_6(p_6) & \psi_6(p_6) \\
 & & & & & & \alpha_4 & \beta_4
\end{bmatrix}$$
The vector c is
$$c = [a_1, b_1, a_2, b_2, a_3, b_3, a_4, b_4]^T$$
and the right side F is
$$F = [\gamma'(x_1), f(p_1), f(p_2), f(p_3), f(p_4), f(p_5), f(p_6), \gamma'(x_4)]^T$$
If M > 1, then each entry in the above matrix is replaced by an M × M diagonal matrix. The
element $\alpha_1$ is replaced by diag($\alpha_{1,1}, \ldots, \alpha_{1,M}$). The elements $\alpha_N$, $\beta_1$ and $\beta_N$ are handled in the same
manner. The $\phi_i(p_j)$ and $\psi_i(p_j)$ elements are replaced by $\phi_i(p_j) I_M$ and $\psi_i(p_j) I_M$ where $I_M$ is the
identity matrix of order M. See Madsen and Sincovec (1979) for further details about
discretization errors and Jacobian matrix structure.
The input/output array Y contains the values of the $a_{k,i}$. The initial values of the $b_{k,i}$ are obtained
by using the IMSL cubic spline routine CSINT (see Chapter 3, Interpolation and Approximation)
to construct functions
$$U_k(x, t_0)$$
such that
$$U_k(x_i, t_0) = a_{k,i}$$
The IMSL routine CSDER, see Chapter 3, Interpolation and Approximation, is used to approximate
the values
$$\frac{dU_k}{dx}(x_i, t_0) \equiv b_{k,i}$$
There is an optional usage of MOLCH that allows the user to provide the initial values of $b_{k,i}$.
The order of matrix A is 2MN and its maximum bandwidth is 6M − 1. The band structure of the
Jacobian of F with respect to c is the same as the band structure of A. This system is solved using
a modified version of IVPAG. Some of the linear solvers were removed. Numerical Jacobians are
used exclusively. The algorithm is unchanged. Gear's BDF method is used as the default because
the system is typically stiff.
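As a worked instance of the size formulas, take the order of A as 2MN and its maximum bandwidth as 6M − 1. The first line below uses the mesh of Example 1 (M = 1, N = 8). The second line uses M = 2 with N = 10, which is a hypothetical choice made only for illustration.

```latex
\begin{align*}
M = 1,\; N = 8:&\quad 2MN = 2 \cdot 1 \cdot 8 = 16, \qquad 6M - 1 = 5 \\
M = 2,\; N = 10:&\quad 2MN = 2 \cdot 2 \cdot 10 = 40, \qquad 6M - 1 = 11
\end{align*}
```

The bandwidth depends only on the number of equations M, so refining the mesh enlarges the system without widening the band.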
We now present four examples of PDEs that illustrate how users can interface their problems with
IMSL PDE solving software. The examples are small and not indicative of the complexities that
most practitioners will face in their applications. A set of seven sample application problems,
some of them with more than one equation, is given in Sincovec and Madsen (1975). Two further
examples are given in Madsen and Sincovec (1979).
Comments
1.
2.
3.
Informational errors
Type
Code
4
This option consists of the parameter PARAM, an array with 50 components. See
IVPAG for a more complete documentation of the contents of this array. To reset
this option, use the subprogram SUMAG for single precision, and DUMAG (see
Chapter 11, Utilities) for double precision. The entry PARAM(1) is assigned the
initial step, HINIT. The entries PARAM(15) and PARAM(16) are assigned the
values equal to the number of lower and upper diagonals that will occur in the
Newton method for solving the BDF corrector equations. The value
PARAM(17) = 1 is used to signal that the x derivatives of the initial data are
provided in the array Y. The output values PARAM(31)-PARAM(36), showing
technical data about the ODE integration, are available with another option
manager subroutine call. This call is made after the storage for MOLCH is
released. The default values for the first 20 entries of PARAM are (0, 0, amach(2),
500., 0., 5., 0, 0, 1., 3., 1., 2., 2., 1., amach(6), amach(6), 0, sqrt(amach(4)), 1.,
0.). Entries 21-50 are defaulted to amach(6).
Example 1
The normalized linear diffusion PDE, $u_t = u_{xx}$, $0 \le x \le 1$, $t > t_0$, is solved. The initial values are
$t_0 = 0$, $u(x, t_0) = u_0 = 1$. There is a zero-flux boundary condition at x = 1, namely $u_x(1, t) = 0$,
($t > t_0$). The boundary value of u(0, t) is abruptly changed from $u_0$ to the value $u_1 = 0.1$. This
transition is completed by $t = t_\delta = 0.09$.
Due to restrictions in the type of boundary conditions successfully processed by MOLCH, it is
necessary to provide the derivative boundary value function $\gamma'$ at x = 0 and at x = 1. The function $\gamma$
at x = 0 makes a smooth transition from the value $u_0$ at $t = t_0$ to the value $u_1$ at $t = t_\delta$. We compute
the transition phase for $\gamma'$ by evaluating a cubic interpolating polynomial. For this purpose, the
function subprogram CSDER, see Chapter 3, Interpolation and Approximation, is used. The
interpolation is performed as a first step in the user-supplied routine FCNBC. The function and
derivative values $\gamma(t_0) = u_0$, $\gamma'(t_0) = 0$, $\gamma(t_\delta) = u_1$, and $\gamma'(t_\delta) = 0$ are used as input to routine C2HER
to obtain the coefficients evaluated by CSDER. Notice that $\gamma'(t) = 0$, $t > t_\delta$. The evaluation routine
CSDER will not yield this value so logic in the routine FCNBC assigns $\gamma'(t) = 0$, $t > t_\delta$.
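The transition function can also be written in closed form. The sketch below is not the C2HER/CSDER code path used by the example. It is a hand-written cubic Hermite interpolant that matches the same four conditions (value u0 and zero slope at t0, value u1 and zero slope at the end of the transition), and it assumes t is not less than t0. The module, subroutine and argument names are hypothetical.

```fortran
module transition_sketch
  implicit none
contains
  ! Cubic Hermite transition from U0 at T0 to U1 at TDELTA with zero
  ! slope at both ends. GAM is the boundary value and GAMP is its time
  ! derivative. Assumes T >= T0.
  subroutine transition (t, t0, tdelta, u0, u1, gam, gamp)
    real, intent(in)  :: t, t0, tdelta, u0, u1
    real, intent(out) :: gam, gamp
    real :: s
    if (t >= tdelta) then
       ! Past the transition the value is held fixed, so the
       ! derivative is set to zero explicitly.
       gam  = u1
       gamp = 0.0
    else
       ! Normalized time in [0, 1).
       s    = (t - t0)/(tdelta - t0)
       gam  = u0 + (u1 - u0)*s*s*(3.0 - 2.0*s)
       gamp = (u1 - u0)*6.0*s*(1.0 - s)/(tdelta - t0)
    end if
  end subroutine transition
end module transition_sketch

program transition_demo
  use transition_sketch
  implicit none
  real, parameter :: tvals(4) = (/ 0.0, 0.045, 0.09, 0.2 /)
  real :: gam, gamp
  integer :: i
  ! Values from Example 1: t0 = 0, transition ends at 0.09,
  ! u0 = 1, u1 = 0.1.
  do i = 1, size(tvals)
     call transition (tvals(i), 0.0, 0.09, 1.0, 0.1, gam, gamp)
     write (*,'(3F12.5)') tvals(i), gam, gamp
  end do
end program transition_demo
```

The explicit branch for times past the transition mirrors the logic the example places in FCNBC, where the derivative is assigned zero rather than evaluated from the interpolant.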
      USE MOLCH_INT
      USE UMACH_INT
      USE AMACH_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    LDY, NPDES, NX
      PARAMETER  (NPDES=1, NX=8, LDY=NPDES)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    I, IDO, J, NOUT, NSTEP
      REAL       HINIT, PREC, T, TEND, TOL, XBREAK(NX), Y(LDY,NX), U0
      CHARACTER  TITLE*19
!                                 SPECIFICATIONS FOR INTRINSICS
      INTRINSIC  FLOAT
      REAL       FLOAT
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 SPECIFICATIONS FOR FUNCTIONS
      EXTERNAL   FCNBC, FCNUT
!                                 Set breakpoints and initial
!                                 conditions
      U0 = 1.0
      DO 10  I=1, NX
         XBREAK(I) = FLOAT(I-1)/(NX-1)
         Y(1,I)    = U0
   10 CONTINUE
!                                 Set parameters for MOLCH
      PREC  = AMACH(4)
      TOL   = SQRT(PREC)
      HINIT = 0.01*TOL
      T     = 0.0
      IDO   = 1
      NSTEP = 10
      CALL UMACH (2, NOUT)
      J = 0
   20 CONTINUE
      J    = J + 1
      TEND = FLOAT(J)/FLOAT(NSTEP)
!                                 This puts more output for small
!                                 t values where action is fastest.
      TEND = TEND**2
!                                 Solve the problem
      CALL MOLCH (IDO, FCNUT, FCNBC, T, TEND, XBREAK, Y, TOL=TOL, &
                  HINIT=HINIT)
      IF (J .LE. NSTEP) THEN
!                                 Print results
         WRITE (TITLE,'(A,F4.2)') 'Solution at T =', T
         CALL WRRRN (TITLE, Y)
!                                 Final call to release workspace
         IF (J .EQ. NSTEP) IDO = 3
         GO TO 20
      END IF
      END
      SUBROUTINE FCNUT (NPDES, X, T, U, UX, UXX, UT)
!                                 SPECIFICATIONS FOR ARGUMENTS
      INTEGER    NPDES
      REAL       X, T, U(*), UX(*), UXX(*), UT(*)
!
      UT(1) = UXX(1)
      RETURN
      END
10 CONTINUE
!
!
!
!
!
!
ALPHA(1) = 1.0
BTA(1) = 0.0
GAMP(1) = 0.
Output
                        Solution at T =0.01
      1       2       3       4       5       6       7       8
  0.969   0.997   1.000   1.000   1.000   1.000   1.000   1.000

                        Solution at T =0.04
      1       2       3       4       5       6       7       8
  0.625   0.871   0.963   0.991   0.998   1.000   1.000   1.000

                        Solution at T =0.09
      1       2       3       4       5       6       7       8
 0.0998  0.4603  0.7171  0.8673  0.9437  0.9781  0.9917  0.9951

                        Solution at T =0.16
      1       2       3       4       5       6       7       8
 0.0994  0.3127  0.5069  0.6680  0.7893  0.8708  0.9168  0.9316

                        Solution at T =0.25
      1       2       3       4       5       6       7       8
 0.0994  0.2564  0.4043  0.5352  0.6428  0.7223  0.7709  0.7873

                        Solution at T =0.36
      1       2       3       4       5       6       7       8
 0.0994  0.2172  0.3289  0.4289  0.5123  0.5749  0.6137  0.6268

                        Solution at T =0.49
      1       2       3       4       5       6       7       8
 0.0994  0.1847  0.2657  0.3383  0.3989  0.4445  0.4728  0.4824

                        Solution at T =0.64
      1       2       3       4       5       6       7       8
 0.0994  0.1583  0.2143  0.2644  0.3063  0.3379  0.3574  0.3641

                        Solution at T =0.81
      1       2       3       4       5       6       7       8
 0.0994  0.1382  0.1750  0.2080  0.2356  0.2563  0.2692  0.2736

                        Solution at T =1.00
      1       2       3       4       5       6       7       8
 0.0994  0.1237  0.1468  0.1674  0.1847  0.1977  0.2058  0.2085
Additional Examples
Example 2
In this example, using MOLCH, we solve the linear normalized diffusion PDE $u_t = u_{xx}$ but with an
optional usage that provides values of the derivatives, $u_x$, of the initial data. Due to errors in the
numerical derivatives computed by spline interpolation, more precise derivative values are
required when the initial data is $u(x, 0) = 1 + \cos[(2n - 1)\pi x]$, n > 1. The boundary conditions are
zero flux conditions $u_x(0, t) = u_x(1, t) = 0$ for t > 0. Note that the initial data is compatible with
these end conditions since the derivative function
$$u_x(x, 0) = \frac{du(x, 0)}{dx} = -(2n - 1)\pi \sin[(2n - 1)\pi x]$$
vanishes at x = 0 and x = 1.
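The compatibility claim can be checked directly. The short derivation below assumes the initial data contains the factor π inside the cosine, which is what makes the derivative vanish at x = 1 for every integer n.

```latex
\begin{align*}
u(x,0) &= 1 + \cos\bigl[(2n-1)\pi x\bigr] \\
u_x(x,0) &= -(2n-1)\pi \sin\bigl[(2n-1)\pi x\bigr] \\
u_x(0,0) &= -(2n-1)\pi \sin(0) = 0 \\
u_x(1,0) &= -(2n-1)\pi \sin\bigl[(2n-1)\pi\bigr] = 0
\end{align*}
```

The last line holds because 2n − 1 is an integer, so the sine is evaluated at an integer multiple of π.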
The example illustrates the use of the IMSL options manager subprograms SUMAG or, for double
precision, DUMAG, see Chapter 11, Utilities, to reset the array PARAM used for control of the
specialized version of IVPAG that integrates the system of ODEs. This optional usage signals that
the derivative of the initial data is passed by the user. The values $u(x, t_{end})$ and $u_x(x, t_{end})$ are
output at the breakpoints with the optional usage.
USE IMSL_LIBRARIES
!
!
!
IMPLICIT
INTEGER
PARAMETER
INTEGER
PARAMETER
NONE
SPECIFICATIONS FOR LOCAL VARIABLES
LDY, NPDES, NX, IAC
(NPDES=1, NX=10, LDY=NPDES)
SPECIFICATIONS FOR PARAMETERS
ICHAP, IGET, IPUT, KPARAM
(ICHAP=5, IGET=1, IPUT=2, KPARAM=11)
SPECIFICATIONS FOR LOCAL VARIABLES
INTEGER
REAL
!
!
!
!
!
!
10
!
!
!
!
!
!
!
!
20
30
40
!
!
!
!
!
!
!
!
!
!
!
!
!
!                                 Print results
         WRITE (TITLE,'(A,F5.3)') 'Solution and derivatives at T =', T
         CALL WRRRN (TITLE, Y)
!                                 Final call to release workspace
         IF (J .EQ. NSTEP) IDO = 3
         GO TO 40
      END IF
!                                 Show, for example, the maximum
!                                 step size used.
      JGO  = 3
      IACT = IGET
      GO TO 70
   50 CONTINUE
      WRITE (NOUT,*) ' Maximum step size used is: ', PARAM(33)
!                                 Reset option to defaults
      JGO     = 4
      IAC     = IPUT
      IOPT(1) = -IOPT(1)
      GO TO 70
   60 CONTINUE
      RETURN
!                                 Internal routine to work options
   70 CONTINUE
      CALL SUMAG ('math', ICHAP, IACT, IOPT, PARAM, numopt=1)
      GO TO (20, 30, 50, 60), JGO
      END
      SUBROUTINE FCNUT (NPDES, X, T, U, UX, UXX, UT)
!                                 SPECIFICATIONS FOR ARGUMENTS
      INTEGER    NPDES
      REAL       X, T, U(*), UX(*), UXX(*), UT(*)
!                                 Define the PDE
      UT(1) = UXX(1)
      RETURN
      END
      SUBROUTINE FCNBC (NPDES, X, T, ALPHA, BTA, GAMP)
!                                 SPECIFICATIONS FOR ARGUMENTS
      INTEGER    NPDES
      REAL       X, T, ALPHA(*), BTA(*), GAMP(*)
!
      ALPHA(1) = 0.0
      BTA(1)   = 1.0
      GAMP(1)  = 0.0
      RETURN
      END
Output
      1       2   ...       9      10      11   ...      20
  1.483   0.517   ...   1.483   0.517   0.000   ...   0.000
  1.233   0.767   ...   1.233   0.767   0.000   ...   0.000
  1.113   0.887   ...   1.113   0.887   0.000   ...   0.000
  1.054   0.946   ...   1.054   0.946   0.000   ...   0.000
  1.026   0.974   ...   1.026   0.974   0.000   ...   0.000
  1.012   0.988   ...   1.012   0.988   0.000   ...   0.000
  1.006   0.994   ...   1.006   0.994   0.000   ...   0.000
  1.003   0.997   ...   1.003   0.997   0.000   ...   0.000
  1.001   0.999   ...   1.001   0.999   0.000   ...   0.000
  1.001   0.999   ...   1.001   0.999   0.000   ...   0.000
 Maximum step size used is:     1.00000E-02
Example 3
In this example, we consider the linear normalized hyperbolic PDE, $u_{tt} = u_{xx}$, the vibrating string
equation. This naturally leads to a system of first order PDEs. Define a new dependent variable
$u_t = v$. Then, $v_t = u_{xx}$ is the second equation in the system. We take as initial data $u(x, 0) = \sin(\pi x)$
and $u_t(x, 0) = v(x, 0) = 0$. The ends of the string are fixed so u(0, t) = u(1, t) = v(0, t) = v(1, t) = 0.
The exact solution to this problem is $u(x, t) = \sin(\pi x)\cos(\pi t)$. Residuals are computed at the output
values of t for $0 < t \le 2$. Output is obtained at 200 steps in increments of 0.01.
Even though the sample code MOLCH gives satisfactory results for this PDE, users should be aware
that for nonlinear problems, shocks can develop in the solution. The appearance of shocks may
cause the code to fail in unpredictable ways. See Courant and Hilbert (1962), pages 488-490, for
an introductory discussion of shocks in hyperbolic systems.
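The reduction to a first-order system and the stated exact solution can be verified by hand. The check below assumes the initial data and exact solution carry the factor π, which is required for the fixed-end conditions at x = 1.

```latex
\begin{align*}
&\text{System: } u_t = v, \qquad v_t = u_{xx} \\
&\text{Candidate: } u(x,t) = \sin(\pi x)\cos(\pi t), \qquad
  v(x,t) = u_t = -\pi \sin(\pi x)\sin(\pi t) \\
&v_t = -\pi^2 \sin(\pi x)\cos(\pi t), \qquad
  u_{xx} = -\pi^2 \sin(\pi x)\cos(\pi t) \;\Rightarrow\; v_t = u_{xx} \\
&u(x,0) = \sin(\pi x), \qquad v(x,0) = 0, \qquad
  u(0,t) = u(1,t) = v(0,t) = v(1,t) = 0
\end{align*}
```

The solution is periodic in t with period 2, so the interval 0 < t ≤ 2 covers exactly one full oscillation of the string.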
USE IMSL_LIBRARIES
IMPLICIT
!
!
!
!
!
!
!
!
!
!
NONE
Y(2,I+NX) = 0.0
10 CONTINUE
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
Output
Maximum error in u(x,t) divided by TOL:
1.28094
Maximum step size used is:
9.99999E-02
FPS2H
Solves Poisson's or Helmholtz's equation on a two-dimensional rectangle using a fast Poisson
solver based on the HODIE finite-difference scheme on a uniform mesh.
Required Arguments
PRHS User-supplied FUNCTION to evaluate the right side of the partial differential
equation. The form is PRHS(X, Y), where
(Output)
BRHS User-supplied FUNCTION to evaluate the right side of the boundary conditions. The
form is BRHS(ISIDE, X, Y), where
ISIDE Side number. (Input)
See IBCTY below for the definition of the side numbers.
X X-coordinate value. (Input)
Y Y-coordinate value. (Input)
BRHS Value of the right side of the boundary condition at (X, Y). (Output)
BRHS must be declared EXTERNAL in the calling program.
(Input)
IBCTY Array of size 4 indicating the type of boundary condition on each side of the
domain or that the solution is periodic. (Input)
The sides are numbered 1 to 4 as follows:
Side          Location
1 - Right     (X = BX)
2 - Bottom    (Y = AY)
3 - Left      (X = AX)
4 - Top       (Y = BY)
IBCTY
Boundary Condition
Periodic.
Optional Arguments
IORDER Order of accuracy of the finite-difference approximation. (Input)
It can be either 2 or 4. Usually, IORDER = 4 is used.
Default: IORDER = 4.
LDU Leading dimension of U exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDU = size (U,1).
FORTRAN 90 Interface
Generic:
CALL FPS2H (PRHS, BRHS, COEFU, NX, NY, AX, BX, AY, BY,
IBCTY, U [,])
Specific:
FORTRAN 77 Interface
Single:
CALL FPS2H (PRHS, BRHS, COEFU, NX, NY, AX, BX, AY, BY,
IBCTY, IORDER, U, LDU)
Double:
Description
Let c = COEFU, $a_x$ = AX, $b_x$ = BX, $a_y$ = AY, $b_y$ = BY, $n_x$ = NX and $n_y$ = NY.
FPS2H is based on the code HFFT2D by Boisvert (1984). It solves the equation
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + cu = p$$
on the rectangular domain $(a_x, b_x) \times (a_y, b_y)$ with a user-specified combination of Dirichlet
(solution prescribed), Neumann (first-derivative prescribed), or periodic boundary conditions. The
sides are numbered clockwise, starting with the right side.
[Figure: the rectangle $(a_x, b_x) \times (a_y, b_y)$ with Side 1 on the right, Side 2 on the bottom, Side 3 on the left, and Side 4 on the top.]
When c = 0 and only Neumann or periodic boundary conditions are prescribed, then any constant
may be added to the solution to obtain another solution to the problem. In this case, the solution of
minimum ∞-norm is returned.
The solution is computed using either a second- or fourth-order accurate finite-difference
approximation of the continuous equation. The resulting system of linear algebraic equations is
solved using fast Fourier transform techniques. The algorithm relies upon the fact that $n_x - 1$ is
highly composite (the product of small primes). For details of the algorithm, see Boisvert (1984).
If $n_x - 1$ is highly composite then the execution time of FPS2H is proportional to $n_x n_y \log_2 n_x$. If
evaluations of p(x, y) are inexpensive, then the difference in running time between IORDER = 2
and IORDER = 4 is small.
Comments
1.
2.
The grid spacing is the distance between the (uniformly spaced) grid lines. It is given
by the formulas HX = (BX - AX)/(NX - 1) and HY = (BY - AY)/(NY - 1). The grid
spacings in the X and Y directions must be the same, i.e., NX and NY must be such that
HX equals HY. Also, as noted above, NX and NY must both be at least 4. To increase the
speed of the fast Fourier transform, NX - 1 should be the product of small primes. Good
Example
In this example, the equation
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + 3u = -2\sin(x + 2y) + 16e^{2x+3y}$$
is solved with the boundary conditions $\partial u/\partial y = 2\cos(x + 2y) + 3\exp(2x + 3y)$ on the bottom side and
$u = \sin(x + 2y) + \exp(2x + 3y)$ on the other three sides. The domain is the rectangle $[0, 1/4] \times [0, 1/2]$.
The output of FPS2H is a 17 × 33 table of U values. The quadratic interpolation routine
QD2VL is used to print a table of values.
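The right side of this example can be checked against the exact solution used for the Dirichlet data. The derivation below substitutes u = sin(x + 2y) + exp(2x + 3y) into the Helmholtz operator with coefficient c = 3.

```latex
\begin{align*}
u &= \sin(x+2y) + e^{2x+3y} \\
u_{xx} &= -\sin(x+2y) + 4e^{2x+3y} \\
u_{yy} &= -4\sin(x+2y) + 9e^{2x+3y} \\
u_{xx} + u_{yy} + 3u &= (-1 - 4 + 3)\sin(x+2y) + (4 + 9 + 3)e^{2x+3y} \\
 &= -2\sin(x+2y) + 16e^{2x+3y} \\
u_y &= 2\cos(x+2y) + 3e^{2x+3y}
\end{align*}
```

The last line reproduces the Neumann data prescribed on the bottom side.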
USE FPS2H_INT
USE QD2VL_INT
USE UMACH_INT
IMPLICIT
INTEGER
PARAMETER
NONE
NCVAL, NX, NXTABL, NY, NYTABL
(NCVAL=11, NX=17, NXTABL=5, NY=33, NYTABL=5)
!
INTEGER
REAL
!
!
!
!
DO 20 J=1, NY
YDATA(J) = AY + (BY-AY)*FLOAT(J-1)/FLOAT(NY-1)
20 CONTINUE
Print the solution
CALL UMACH (2, NOUT)
WRITE (NOUT,'(8X,A,11X,A,11X,A,8X,A)') 'X', 'Y', 'U', 'Error'
DO 40 J=1, NYTABL
DO 30 I=1, NXTABL
X
= AX + (BX-AX)*FLOAT(I-1)/FLOAT(NXTABL-1)
Y
= AY + (BY-AY)*FLOAT(J-1)/FLOAT(NYTABL-1)
UTABL = QD2VL(X,Y,XDATA,YDATA,U)
TRUE = SIN(X+2.*Y) + EXP(2.*X+3.*Y)
ERROR = TRUE - UTABL
WRITE (NOUT,'(4F12.4)') X, Y, UTABL, ERROR
30 CONTINUE
40 CONTINUE
END
REAL FUNCTION PRHS (X, Y)
REAL
X, Y
!
REAL
INTRINSIC
!
EXP, SIN
EXP, SIN
!
REAL FUNCTION BRHS (ISIDE, X, Y)
INTEGER
ISIDE
REAL
X, Y
!
!
REAL
INTRINSIC
Output
       X           Y           U         Error
  0.0000      0.0000      1.0000      0.0000
  0.0625      0.0000      1.1956      0.0000
  0.1250      0.0000      1.4087      0.0000
  0.1875      0.0000      1.6414      0.0000
  0.2500      0.0000      1.8961      0.0000
  0.0000      0.1250      1.7024      0.0000
  0.0625      0.1250      1.9562      0.0000
  0.1250      0.1250      2.2345      0.0000
  0.1875      0.1250      2.5407      0.0000
  0.2500      0.1250      2.8783      0.0000
  0.0000      0.2500      2.5964      0.0000
  0.0625      0.2500      2.9322      0.0000
  0.1250      0.2500      3.3034      0.0000
  0.1875      0.2500      3.7148      0.0000
  0.2500      0.2500      4.1720      0.0000
  0.0000      0.3750      3.7619      0.0000
  0.0625      0.3750      4.2163      0.0000
  0.1250      0.3750      4.7226      0.0000
  0.1875      0.3750      5.2878      0.0000
  0.2500      0.3750      5.9199      0.0000
  0.0000      0.5000      5.3232      0.0000
  0.0625      0.5000      5.9520      0.0000
  0.1250      0.5000      6.6569      0.0000
  0.1875      0.5000      7.4483      0.0000
  0.2500      0.5000      8.3380      0.0000
FPS3H
Solves Poisson's or Helmholtz's equation on a three-dimensional box using a fast Poisson solver
based on the HODIE finite-difference scheme on a uniform mesh.
Required Arguments
PRHS User-supplied FUNCTION to evaluate the right side of the partial differential
equation. The form is PRHS(X, Y, Z), where
X The x-coordinate value. (Input)
Y The y-coordinate value. (Input)
Z The z-coordinate value. (Input)
PRHS Value of the right side at (X, Y, Z). (Output)
BRHS User-supplied FUNCTION to evaluate the right side of the boundary conditions. The
form is BRHS(ISIDE, X, Y, Z), where
ISIDE Side number. (Input)
See IBCTY for the definition of the side numbers.
X The x-coordinate value. (Input)
Y The y-coordinate value. (Input)
Z The z-coordinate value. (Input)
BRHS Value of the right side of the boundary condition at (X, Y, Z). (Output)
BRHS must be declared EXTERNAL in the calling program.
Side          Location
1 - Right     (X = BX)
2 - Bottom    (Y = AY)
3 - Left      (X = AX)
4 - Top       (Y = BY)
5 - Front     (Z = BZ)
6 - Back      (Z = AZ)
Boundary Condition
Periodic.
Optional Arguments
IORDER Order of accuracy of the finite-difference approximation. (Input)
It can be either 2 or 4. Usually, IORDER = 4 is used.
Default: IORDER = 4.
LDU Leading dimension of U exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDU = size (U,1).
MDU Middle dimension of U exactly as specified in the dimension statement of the calling
program. (Input)
Default: MDU = size (U,2).
FORTRAN 90 Interface
Generic:
CALL FPS3H (PRHS, BRHS, COEFU, NX, NY, NZ, AX, BX, AY, BY,
AZ, BZ, IBCTY, U [,])
Specific:
FORTRAN 77 Interface
Single:
CALL FPS3H (PRHS, BRHS, COEFU, NX, NY, NZ, AX, BX, AY, BY,
AZ, BZ, IBCTY, IORDER, U, LDU, MDU)
Double:
Description
Let c = COEFU, $a_x$ = AX, $b_x$ = BX, $n_x$ = NX, $a_y$ = AY, $b_y$ = BY, $n_y$ = NY, $a_z$ = AZ, $b_z$ = BZ, and $n_z$ = NZ.
FPS3H is based on the code HFFT3D by Boisvert (1984). It solves the equation
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} + cu = p$$
on the domain $(a_x, b_x) \times (a_y, b_y) \times (a_z, b_z)$ (a box) with a user-specified combination of Dirichlet
(solution prescribed), Neumann (first derivative prescribed), or periodic boundary conditions. The
six sides are numbered as shown in the following diagram.
[Figure: the box drawn on x, y, z axes with its six sides labeled Right - 1, Bottom - 2, Left - 3, Top - 4, Front - 5, and Back - 6.]
When c = 0 and only Neumann or periodic boundary conditions are prescribed, then any constant
may be added to the solution to obtain another solution to the problem. In this case, the solution of
minimum ∞-norm is returned.
The solution is computed using either a second- or fourth-order accurate finite-difference
approximation of the continuous equation. The resulting system of linear algebraic equations is
solved using fast Fourier transform techniques. The algorithm relies upon the fact that $n_x - 1$ and
$n_z - 1$ are highly composite (the product of small primes). For details of the algorithm, see
Boisvert (1984). If $n_x - 1$ and $n_z - 1$ are highly composite, then the execution time of FPS3H is
proportional to
$$n_x n_y n_z \left( \log_2^2 n_x + \log_2^2 n_z \right)$$
If evaluations of p(x, y, z) are inexpensive, then the difference in running time between
IORDER = 2 and IORDER = 4 is small.
Comments
1.
2.
The grid spacing is the distance between the (uniformly spaced) grid lines. It is given
by the formulas
HX = (BX - AX)/(NX - 1),
HY = (BY - AY)/(NY - 1), and
HZ = (BZ - AZ)/(NZ - 1).
The grid spacings in the X, Y and Z directions must be the same, i.e., NX, NY and NZ
must be such that HX = HY = HZ. Also, as noted above, NX, NY and NZ must all be at
least 4. To increase the speed of the fast Fourier transform, NX - 1 and NZ - 1 should
be the product of small primes. Good choices for NX and NZ are 17, 33 and 65.
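The grid constraints in this comment are easy to test before calling the solver. The program below is a stand-alone sketch and not part of the library. It uses a 9 × 17 × 17 grid on the box [0, 1/4] × [0, 1/2] × [0, 1/2] as sample values, and the program name, the tolerance and the helper function LARGEST_PRIME_FACTOR are hypothetical.

```fortran
program check_fps3h_grid
  ! Sketch only: checks that HX = HY = HZ, that each dimension is at
  ! least 4, and reports the largest prime factor of NX-1 and NZ-1.
  implicit none
  integer, parameter :: nx = 9, ny = 17, nz = 17
  real, parameter :: ax = 0.0, bx = 0.25
  real, parameter :: ay = 0.0, by = 0.5
  real, parameter :: az = 0.0, bz = 0.5
  real, parameter :: tol = 1.0e-5
  real :: hx, hy, hz

  hx = (bx - ax)/real(nx - 1)
  hy = (by - ay)/real(ny - 1)
  hz = (bz - az)/real(nz - 1)
  write (*,'(A,3F10.6)') ' HX, HY, HZ =', hx, hy, hz

  if (min(nx, ny, nz) < 4) then
     write (*,*) 'NX, NY and NZ must all be at least 4'
  else if (abs(hx - hy) > tol*hx .or. abs(hx - hz) > tol*hx) then
     write (*,*) 'Grid spacings differ: adjust NX, NY, NZ'
  else
     write (*,*) 'Grid spacings agree'
  end if

  ! Small values here mean the transform lengths are highly composite.
  write (*,'(A,I4)') ' Largest prime factor of NX-1:', &
       largest_prime_factor(nx - 1)
  write (*,'(A,I4)') ' Largest prime factor of NZ-1:', &
       largest_prime_factor(nz - 1)

contains

  ! Trial division; returns 1 when N is 1.
  integer function largest_prime_factor (n) result (lpf)
    integer, intent(in) :: n
    integer :: m, d
    m   = n
    lpf = 1
    d   = 2
    do while (m > 1)
       if (mod(m, d) == 0) then
          lpf = d
          m   = m/d
       else
          d = d + 1
       end if
    end do
  end function largest_prime_factor
end program check_fps3h_grid
```

The spacings are compared with a relative tolerance rather than exact equality because the divisions are carried out in floating point.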
3.
Example
This example solves the equation
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} + 10u = -4\cos(3x + y - 2z) + 12e^{x-z} + 10$$
with the boundary conditions $\partial u/\partial z = 2\sin(3x + y - 2z) - \exp(x - z)$ on the front side and
$u = \cos(3x + y - 2z) + \exp(x - z) + 1$ on the other five sides. The domain is the box $[0, 1/4] \times [0, 1/2] \times [0, 1/2]$.
The output of FPS3H is a 9 × 17 × 17 table of U values. The quadratic interpolation
routine QD3VL is used to print a table of values.
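As with the two-dimensional case, the right side can be confirmed from the exact solution that supplies the Dirichlet data. The derivation below substitutes u = cos(3x + y − 2z) + exp(x − z) + 1 into the operator with c = 10, writing θ = 3x + y − 2z for brevity (θ is introduced here only as shorthand).

```latex
\begin{align*}
u &= \cos\theta + e^{x-z} + 1, \qquad \theta = 3x + y - 2z \\
u_{xx} + u_{yy} + u_{zz} &= -(9 + 1 + 4)\cos\theta + (1 + 0 + 1)e^{x-z}
   = -14\cos\theta + 2e^{x-z} \\
u_{xx} + u_{yy} + u_{zz} + 10u &= -4\cos\theta + 12e^{x-z} + 10 \\
u_z &= 2\sin\theta - e^{x-z}
\end{align*}
```

The last line is the Neumann data stated for the front side.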
USE FPS3H_INT
USE UMACH_INT
USE QD3VL_INT
IMPLICIT
!
INTEGER
PARAMETER
NONE
SPECIFICATIONS FOR PARAMETERS
LDU, MDU, NX, NXTABL, NY, NYTABL, NZ, NZTABL
(NX=5, NXTABL=4, NY=9, NYTABL=3, NZ=9, &
NZTABL=3, LDU=NX, MDU=NY)
INTEGER
REAL
INTRINSIC
EXTERNAL
AX = 0.0
BX = 0.125
AY = 0.0
BY = 0.25
AZ = 0.0
BZ = 0.25
!
=
=
=
=
=
=
1
1
1
1
2
1
COEFU = 10.0
Coefficient of U
Order of the method
IORDER = 4
!
!
10
20
30
!
40
50
60
!
!
!
COS, EXP
COS, EXP
REAL
X, Y, Z
REAL
INTRINSIC
!
!
Boundary conditions
IF (ISIDE .EQ. 5) THEN
BRHS = -2.0*SIN(3.0*X+Y-2.0*Z) - EXP(X-Z)
ELSE
BRHS = COS(3.0*X+Y-2.0*Z) + EXP(X-Z) + 1.0
END IF
RETURN
END
Output
       X           Y           Z           U         Error
  0.0000      0.0000      0.0000      3.0000      0.0000
  0.0417      0.0000      0.0000      3.0348      0.0000
  0.0833      0.0000      0.0000      3.0558      0.0001
  0.1250      0.0000      0.0000      3.0637      0.0001
  0.0000      0.1250      0.0000      2.9922      0.0000
  0.0417      0.1250      0.0000      3.0115      0.0000
  0.0833      0.1250      0.0000      3.0175      0.0000
  0.1250      0.1250      0.0000      3.0107      0.0000
  0.0000      0.2500      0.0000      2.9690      0.0001
  0.0417      0.2500      0.0000      2.9731      0.0000
  0.0833      0.2500      0.0000      2.9645      0.0000
  0.1250      0.2500      0.0000      2.9440     -0.0001
  0.0000      0.0000      0.1250      2.8514      0.0000
  0.0417      0.0000      0.1250      2.9123      0.0000
  0.0833      0.0000      0.1250      2.9592      0.0000
  0.1250      0.0000      0.1250      2.9922      0.0000
  0.0000      0.1250      0.1250      2.8747      0.0000
  0.0417      0.1250      0.1250      2.9211      0.0010
  0.0833      0.1250      0.1250      2.9524      0.0010
  0.1250      0.1250      0.1250      2.9689      0.0000
  0.0000      0.2500      0.1250      2.8825      0.0000
  0.0417      0.2500      0.1250      2.9123      0.0000
  0.0833      0.2500      0.1250      2.9281      0.0000
  0.1250      0.2500      0.1250      2.9305      0.0000
  0.0000      0.0000      0.2500      2.6314     -0.0249
  0.0417      0.0000      0.2500      2.7420     -0.0004
  0.0833      0.0000      0.2500      2.8112     -0.0042
  0.1250      0.0000      0.2500      2.8609     -0.0138
  0.0000      0.1250      0.2500      2.7093      0.0000
  0.0417      0.1250      0.2500      2.8153      0.0344
  0.0833      0.1250      0.2500      2.8628      0.0237
  0.1250      0.1250      0.2500      2.8825      0.0000
  0.0000      0.2500      0.2500      2.7351     -0.0127
  0.0417      0.2500      0.2500      2.8030     -0.0011
  0.0833      0.2500      0.2500      2.8424     -0.0040
  0.1250      0.2500      0.2500      2.8735     -0.0012
SLEIG
Determines eigenvalues, eigenfunctions and/or spectral density functions for Sturm-Liouville
problems in the form
$$-\frac{d}{dx}\left(p(x)\frac{du}{dx}\right) + q(x)\,u = \lambda\, r(x)\,u \quad \text{for } x \text{ in } (a, b)$$
Required Arguments
CONS Array of size eight containing
$a_1, a_1', a_2, a_2', b_1, b_2, a$ and $b$
in locations CONS(1) through CONS(8), respectively. (Input)
COEFFN User-supplied SUBROUTINE to evaluate the coefficient functions. The usage is
CALL COEFFN (X, PX, QX, RX)
X Independent variable. (Input)
PX The value of p(x) at X. (Output)
QX The value of q(x) at X. (Output)
RX The value of r(x) at X. (Output)
COEFFN must be declared EXTERNAL in the calling program.
ENDFIN Logical array of size two. ENDFIN(1) = .true. if the endpoint a is finite.
ENDFIN(2) = .true. if endpoint b is finite. (Input)
INDEX Vector of size NUMEIG containing the indices of the desired eigenvalues. (Input)
EVAL Array of length NUMEIG containing the computed approximations to the
eigenvalues whose indices are specified in INDEX. (Output)
Optional Arguments
NUMEIG The number of eigenvalues desired. (Input)
Default: NUMEIG = size (INDEX,1).
TEVLAB Absolute error tolerance for eigenvalues. (Input)
Default: TEVLAB = 10.* machine precision.
TEVLRL Relative error tolerance for eigenvalues. (Input)
Default: TEVLRL = SQRT(machine precision).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
This subroutine is designed for the calculation of eigenvalues, eigenfunctions and/or spectral
density functions for Sturm-Liouville problems in the form
$$-\frac{d}{dx}\left(p(x)\frac{du}{dx}\right) + q(x)\,u = \lambda\, r(x)\,u \quad \text{for } x \text{ in } (a, b) \tag{1}$$
with boundary conditions
$$a_1 u - a_2 (pu') = \lambda\left(a_1' u - a_2' (pu')\right) \quad \text{at } a$$
$$b_1 u + b_2 (pu') = 0 \quad \text{at } b$$
We assume that
1/p(x), q(x) and r(x) are locally integrable near the endpoints.
Otherwise the problem is called singular. The theory assumes that p, p', q, and r are at least
continuous on (a, b), though a finite number of jump discontinuities can be handled by suitably
defining an input mesh.
For regular problems, there are an infinite number of eigenvalues
$$\lambda_0 < \lambda_1 < \cdots < \lambda_k, \quad k \to \infty$$
Each eigenvalue has an associated eigenfunction which is unique up to a constant. For singular
problems, there is a wide range in the behavior of the eigenvalues.
As presented in Pruess and Fulton (1993) the approach is to replace (1) by a new problem
$$-(\hat p \hat u')' + \hat q \hat u = \hat\lambda\, \hat r \hat u \tag{2}$$
with boundary conditions
$$a_1 \hat u(a) - a_2 (\hat p \hat u')(a) = \hat\lambda \left[a_1' \hat u(a) - a_2' (\hat p \hat u')(a)\right]$$
$$b_1 \hat u(b) + b_2 (\hat p \hat u')(b) = 0$$
where $\hat p$, $\hat q$ and $\hat r$ are step function approximations to p, q, and r, respectively. Given the mesh
$a = x_1 < x_2 < \cdots < x_{N+1} = b$, the usual choice for the step functions uses midpoint interpolation,
i.e.,
$$\hat p(x) = p_n \equiv p\left(\frac{x_n + x_{n+1}}{2}\right)$$
for x in $(x_n, x_{n+1})$ and similarly for the other coefficient functions. This choice works well for
regular problems. Some singular problems require a more sophisticated technique to capture the
asymptotic behavior. For the midpoint interpolants, the differential equation (2) has the known
closed form solution in $(x_n, x_{n+1})$
$$\hat u(x) = \hat u(x_n)\,\phi_n'(x - x_n) + (\hat p \hat u')(x_n)\,\phi_n(x - x_n)/p_n$$
with
$$\phi_n(t) = \begin{cases} \sin(\omega_n t)/\omega_n, & \tau_n > 0 \\ \sinh(\omega_n t)/\omega_n, & \tau_n < 0 \\ t, & \tau_n = 0 \end{cases}$$
where
$$\tau_n = (\hat\lambda r_n - q_n)/p_n$$
and
$$\omega_n = \sqrt{|\tau_n|}$$
Starting with $\hat u(a)$ and $(\hat p \hat u')(a)$ consistent with the boundary condition,
$$\hat u(a) = a_2 - a_2' \hat\lambda$$
$$(\hat p \hat u')(a) = a_1 - a_1' \hat\lambda$$
the values at successive mesh points follow from the recurrences (with $h_n = x_{n+1} - x_n$)
$$\hat u(x_{n+1}) = \hat u(x_n)\,\phi_n'(h_n) + (\hat p \hat u')(x_n)\,\phi_n(h_n)/p_n$$
$$(\hat p \hat u')(x_{n+1}) = -\tau_n p_n \hat u(x_n)\,\phi_n(h_n) + (\hat p \hat u')(x_n)\,\phi_n'(h_n)$$
which is a shooting method. For a fixed mesh we can iterate on the approximate eigenvalue until
the boundary condition at b is satisfied. This will yield an $O(h^2)$ approximation
$\hat\lambda_k$ to some $\lambda_k$.
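The shooting recurrences described above can be sketched in a few lines of standard Fortran. This is only an illustration under stated assumptions, not the SLEIG implementation: the test problem ($p = r = 1$, $q = 0$ on $[0, \pi]$ with $u(0) = u(\pi) = 0$, lowest eigenvalue 1), the uniform mesh, the plain bisection on $\hat\lambda$, and the names `shoot_sketch`, `coeff` and `miss` are all hypothetical choices made for the example.

```fortran
program shoot_sketch
  implicit none
  integer, parameter :: nmesh = 20          ! number of mesh intervals (assumed)
  real(kind(1d0)), parameter :: pi = 3.141592653589793d0
  real(kind(1d0)) :: lo, hi, mid, flo, fmid
  integer :: it

  ! Bracket the lowest eigenvalue of -u'' = lambda*u, u(0) = u(pi) = 0.
  lo = 0.5d0
  hi = 1.5d0
  flo = miss(lo)
  do it = 1, 60
     mid = 0.5d0*(lo + hi)
     fmid = miss(mid)
     if (flo*fmid <= 0.0d0) then
        hi = mid                            ! sign change lies in [lo, mid]
     else
        lo = mid
        flo = fmid
     end if
  end do
  write (*,'(a,f12.8)') ' lambda estimate = ', 0.5d0*(lo + hi)

contains

  ! Hypothetical coefficient routine with the COEFFN calling sequence.
  subroutine coeff(x, px, qx, rx)
    real(kind(1d0)), intent(in) :: x
    real(kind(1d0)), intent(out) :: px, qx, rx
    px = 1.0d0
    qx = 0.0d0*x
    rx = 1.0d0
  end subroutine coeff

  ! Shoot from a = 0 to b = pi; return b1*u(b) + b2*(pu')(b) with b1 = 1, b2 = 0.
  function miss(lambda) result(ub)
    real(kind(1d0)), intent(in) :: lambda
    real(kind(1d0)) :: ub
    real(kind(1d0)) :: h, xm, pn, qn, rn, tau, om, phi, dphi, u, pu, unew
    integer :: n
    h = pi/nmesh
    u = 0.0d0                               ! u(a) = a2 - a2'*lambda, a2 = a2' = 0
    pu = 1.0d0                              ! (pu')(a) = a1 - a1'*lambda, a1 = 1, a1' = 0
    do n = 1, nmesh
       xm = (n - 0.5d0)*h                   ! midpoint of interval n
       call coeff(xm, pn, qn, rn)
       tau = (lambda*rn - qn)/pn
       om = sqrt(abs(tau))
       if (tau > 0.0d0) then
          phi = sin(om*h)/om
          dphi = cos(om*h)
       else if (tau < 0.0d0) then
          phi = sinh(om*h)/om
          dphi = cosh(om*h)
       else
          phi = h
          dphi = 1.0d0
       end if
       unew = u*dphi + pu*phi/pn            ! u at x(n+1)
       pu = -tau*pn*u*phi + pu*dphi         ! (pu') at x(n+1), using the old u
       u = unew
    end do
    ub = u
  end function miss
end program shoot_sketch
```

Because the coefficients of this test problem are constant, the step-function approximation is exact and the estimate should approach 1. For variable coefficients the error would be expected to shrink as the mesh is refined.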
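The problem (2) has a step spectral function given by
$$\hat\rho(t) = \sum \frac{1}{\int \hat r(x)\, \hat u_k^2(x)\, dx + \alpha}$$
where the sum is taken over all k such that $\hat\lambda_k \le t$, and
$$\alpha = a_1' a_2 - a_1 a_2'$$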
Comments
1.
If JOB(3) = .true.
IPRINT   Printed Output
1        the actual (a, b) used at each iteration and the total number
         of eigenvalues computed
2        the above and switchover points to the asymptotic
         formulas, and some intermediate $\hat\rho(t)$ approximations
3        the above and initial meshes for each iteration, the index
         of the largest eigenvalue which may be computed, and various
         eigenvalue and RN values
4        the above and
NUMX Integer whose value is the number of output points where each eigenfunction is to
be evaluated (the number of entries in XEF(*)) when JOB(2) = .true.. If JOB(5)= .false.
and NUMX is greater than zero, then NUMX is the number of points in the initial mesh
used. If JOB(5) = .false., the points in XEF(*) should be chosen with a reasonable
distribution. Since the endpoints a and b must be part of any mesh, NUMX cannot be one
in this case. If JOB(5) = .false. and JOB(3) = .true., then NUMX must be positive. On
output, NUMX is set to the number of points for eigenfunctions when input NUMX = 0,
and JOB(2) or JOB(5) = .true.. (Input/Output)
XEF Array of points on input where eigenfunction estimates are desired, if JOB(2) = .true..
Otherwise, if JOB(5) = .false. and NUMX is greater than zero, the user's initial mesh is
entered. The entries must be ordered so that
a = XEF(1) < XEF(2) < ... < XEF(NUMX) = b. If either endpoint is infinite, the
corresponding XEF(1) or XEF(NUMX) is ignored. However, it is required that XEF(2) be
negative when ENDFIN(1) = .false., and that XEF(NUMX-1) be positive when
ENDFIN(2) = .false.. On output, XEF(*) is changed only if JOB(2) and JOB(5) are true.
If JOB(2) = .false., this vector is not referenced. If JOB(2) = .true. and NUMX is greater
than zero on input, XEF(*) should be dimensioned at least NUMX + 16. If JOB(2) is true
and NUMX is zero on input, XEF(*) should be dimensioned at least 31.
NRHO The number of output values desired for the array RHO(*). NRHO is not used if
JOB(3) = .false.. (Input)
T Real vector of size NRHO containing values where the spectral function RHO(*) is desired.
The entries must be sorted in increasing order. The existence and location of a
continuous spectrum can be determined by calling SLEIG with the first four entries of
JOB set to false and IPRINT set to 1. T(*) is not used if JOB(3) = .false.. (Input)
TYPE 4 by 2 logical matrix. Column 1 contains information about endpoint a and column
2 refers to endpoint b.
TYPE(1,*) = .true. if and only if the endpoint is regular
TYPE(2,*) = .true. if and only if the endpoint is limit circle
TYPE(3,*) = .true. if and only if the endpoint is nonoscillatory for all eigenvalues
TYPE(4,*) = .true. if and only if the endpoint is oscillatory for all eigenvalues
Note: all of these values must be correctly input if JOB(4) = .true..
Otherwise, TYPE(*,*) is output. (Input/Output)
EF Array of eigenfunction values. EF((k-1)*NUMX + i) is the estimate of u(XEF(i))
corresponding to the eigenvalue in EV(k). If JOB(2) = .false. then this vector is not
referenced. If JOB(2) = .true. and NUMX is greater than zero on entry, then EF(*) should
be dimensioned at least NUMX * NUMEIG. If JOB(2) = .true. and NUMX is zero on input,
then EF(*) should be dimensioned 31 * NUMEIG. (Output)
PDEF Array of eigenfunction derivative values. PDEF((k-1)*NUMX + i) is the estimate of
(pu')(XEF(i)) corresponding to the eigenvalue in EV(k). If JOB(2) = .false. this vector is
not referenced. If JOB(2) = .true., it must be dimensioned the same as EF(*). (Output)
RHO Array of size NRHO containing values for the spectral density function $\rho(t)$,
RHO(I) = $\rho$(T(I)). This vector is not referenced if JOB(3) is false. (Output)
IFLAG Array of size max(1, NUMEIG) containing information about the output. IFLAG(K)
refers to the K-th eigenvalue, when JOB(1) or JOB(2) = .true.. Otherwise, only
IFLAG(1) is used. Negative values are associated with fatal errors, and the calculation
stops. Positive values indicate a warning. (Output)
IFLAG(K)
IFLAG(K)
Description
10
15
p(x) and r(x) are not positive in the interval (a, b).
20
Example 1
This example computes the first ten eigenvalues of the problem from Titchmarsh (1962) given by
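p(x) = r(x) = 1
q(x) = x
$[a, b] = [0, \infty]$
u(a) = u(b) = 0
The eigenvalues are the zeros of
$$f(\lambda) = J_{1/3}\left(\tfrac{2}{3}\lambda^{3/2}\right) + J_{-1/3}\left(\tfrac{2}{3}\lambda^{3/2}\right)$$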
IMPLICIT NONE
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
INTEGER I, INDEX(10), NUMEIG, NOUT
REAL CONS(8), EVAL(10), LAMBDA, TEVLAB,&
TEVLRL, XNU
COMPLEX CBS1(1), CBS2(1), Z
LOGICAL ENDFIN(2)
!                                 SPECIFICATIONS FOR INTRINSICS
INTRINSIC CMPLX, SQRT
REAL SQRT
COMPLEX CMPLX
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 SPECIFICATIONS FOR FUNCTIONS
EXTERNAL COEFF
!
!
CONS(1) = 1.0
CONS(2) = 0.0
CONS(3) = 0.0
CONS(4) = 0.0
CONS(5) = 1.0
CONS(6) = 0.0
CONS(7) = 0.0
CONS(8) = 0.0
!
ENDFIN(1) = .TRUE.
ENDFIN(2) = .FALSE.
!
!
!
!
!
99998 FORMAT(/, 2X, 'index', 5X, 'lambda', 5X, 'f(lambda)',/)
99999 FORMAT(I5, F13.4, E15.4)
END
!
SUBROUTINE COEFF (X, PX, QX, RX)
!
!                                 SPECIFICATIONS FOR ARGUMENTS
REAL X, PX, QX, RX
!
PX = 1.0
QX = X
RX = 1.0
RETURN
END
Output
index      lambda      f(lambda)
    0      2.3381    -0.8285E-05
    1      4.0879    -0.1651E-04
    2      5.5205     0.6843E-04
    3      6.7867    -0.4523E-05
    4      7.9440     0.8952E-04
    5      9.0227     0.1123E-04
    6     10.0401     0.1031E-03
    7     11.0084    -0.7913E-04
    8     11.9361    -0.5095E-04
    9     12.8293     0.4645E-03
Additional Examples
Example 2
In this problem from Scott, Shampine and Wing (1969),
p(x) = r(x) = 1
$q(x) = x^2 + x^4$
$[a, b] = [-\infty, \infty]$
u(a) = u(b) = 0
the first eigenvalue and associated eigenfunction, evaluated at selected points, are computed. As a
rough check of the correctness of the results, the magnitude of the residual
$$-\frac{d}{dx}\left(p(x)\frac{du}{dx}\right) + q(x)\,u - \lambda\, r(x)\,u$$
is printed. We compute a spline interpolant to u' and use the function CSDER to estimate the
quantity $-(p(x)u')'$.
USE S2EIG_INT
USE CSDER_INT
USE UMACH_INT
USE CSAKM_INT
IMPLICIT NONE
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
INTEGER I, IFLAG(1), INDEX(1), IWORK(100), NINTV, NOUT, NRHO, &
NUMEIG, NUMX
REAL BRKUP(61), CONS(8), CSCFUP(4,61), EF(61), EVAL(1), &
LAMBDA, PDEF(61), PX, QX, RESIDUAL, RHO(1), RX, T(1), &
!
!
!
!
ENDFIN(1) = .FALSE.
ENDFIN(2) = .FALSE.
!
!
!
!
!
!
!
!
NUMEIG = 1
INDEX(1) = 0
TEVLAB = 1.0E-3
TEVLRL = 1.0E-3
TOLS(1) = TEVLAB
TOLS(2) = TEVLRL
TOLS(3) = TEVLAB
TOLS(4) = TEVLRL
NRHO = 0
NUMX = 61
DO 10 I=1, NUMX
XEF(I) = 0.05*REAL(I-31)
10 CONTINUE
CALL S2EIG (CONS, COEFF, ENDFIN, NUMEIG, INDEX, TEVLAB, TEVLRL, &
EVAL, JOB, 0, TOLS, NUMX, XEF, NRHO, T, TYPE, EF, &
PDEF, RHO, IFLAG, WORK, IWORK)
LAMBDA = EVAL(1)
20 CONTINUE
!                                 Compute spline interpolant to u'
CALL CSAKM (XEF, PDEF, BRKUP, CSCFUP)
NINTV = NUMX - 1
CALL UMACH (2, NOUT)
!
!
!
!
DO 30 I=1, 41, 2
X = XEF(I+10)
CALL COEFF (X, PX, QX, RX)
!                                 Use the spline fit to u' to
!                                 estimate u'' with CSDER
RESIDUAL = ABS(-CSDER(1,X,BRKUP,CSCFUP)+QX*EF(I+10)- &
LAMBDA*EF(I+10))
WRITE (NOUT,99998) X, EF(I+10), PDEF(I+10), RESIDUAL
30 CONTINUE
!
99997 FORMAT (/, A14, F10.5, /)
99998 FORMAT (5X, F4.1, 3F15.5)
99999 FORMAT (7X, 'x', 11X, 'u(x)', 10X, 'u''(x)', 9X, 'residual', /)
END
!
SUBROUTINE COEFF (X, PX, QX, RX)
!
!                                 SPECIFICATIONS FOR ARGUMENTS
REAL X, PX, QX, RX
!
PX = 1.0
QX = X*X + X*X*X*X
RX = 1.0
RETURN
END
Output
lambda =    1.39247

       x           u(x)          u'(x)         residual
    -1.0        0.38632        0.65019        0.00189
    -0.9        0.45218        0.66372        0.00081
    -0.8        0.51837        0.65653        0.00023
    -0.7        0.58278        0.62827        0.00113
    -0.6        0.64334        0.57977        0.00183
    -0.5        0.69812        0.51283        0.00230
    -0.4        0.74537        0.42990        0.00273
    -0.3        0.78366        0.33393        0.00265
    -0.2        0.81183        0.22811        0.00273
    -0.1        0.82906        0.11570        0.00278
     0.0        0.83473        0.00000        0.00136
     0.1        0.82893       -0.11568        0.00273
     0.2        0.81170       -0.22807        0.00273
     0.3        0.78353       -0.33388        0.00267
     0.4        0.74525       -0.42983        0.00265
     0.5        0.69800       -0.51274        0.00230
     0.6        0.64324       -0.57967        0.00182
     0.7        0.58269       -0.62816        0.00113
     0.8        0.51828       -0.65641        0.00023
     0.9        0.45211       -0.66361        0.00081
     1.0        0.38626       -0.65008        0.00189
SLCNT
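Calculates the indices of eigenvalues of a Sturm-Liouville problem of the form
$$-\frac{d}{dx}\left(p(x)\frac{du}{dx}\right) + q(x)\,u = \lambda\, r(x)\,u \quad \text{for } x \text{ in } [a, b]$$
with boundary conditions
$$a_1 u - a_2 (pu') = \lambda\left(a_1' u - a_2' (pu')\right) \quad \text{at } a$$
$$b_1 u + b_2 (pu') = 0 \quad \text{at } b$$
in a specified subinterval of the real line, $[\alpha, \beta]$.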
Required Arguments
ALPHA Value of the left end point of the search interval. (Input)
BETAR Value of the right end point of the search interval. (Input)
CONS Array of size eight containing
$a_1, a_1', a_2, a_2', b_1, b_2, a$ and $b$
in locations CONS(1) through CONS(8), respectively. (Input)
COEFFN User-supplied SUBROUTINE to evaluate the coefficient functions. The usage is
CALL COEFFN (X, PX, QX, RX)
X Independent variable. (Input)
PX The value of p(x) at X. (Output)
QX The value of q(x) at X. (Output)
RX The value of r(x) at X. (Output)
COEFFN must be declared EXTERNAL in the calling program.
ENDFIN Logical array of size two. ENDFIN(1) = .true. if and only if the endpoint a is
finite. ENDFIN(2) = .true. if and only if endpoint b is finite. (Input)
IFIRST The index of the first eigenvalue greater than $\alpha$. (Output)
NTOTAL Total number of eigenvalues in the interval $[\alpha, \beta]$. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
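This subroutine computes the indices of eigenvalues, if any, in a subinterval of the real line for
Sturm-Liouville problems in the form
$$-\frac{d}{dx}\left(p(x)\frac{du}{dx}\right) + q(x)\,u = \lambda\, r(x)\,u \quad \text{for } x \text{ in } [a, b]$$
with boundary conditions
$$a_1 u - a_2 (pu') = \lambda\left(a_1' u - a_2' (pu')\right) \quad \text{at } a$$
$$b_1 u + b_2 (pu') = 0 \quad \text{at } b$$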
It is intended to be used in conjunction with SLEIG. SLCNT is based on the routine INTERV from
the package SLEDGE.
Example
Consider the harmonic oscillator (Titchmarsh) defined by
p(x) = 1
$q(x) = x^2$
r(x) = 1
$[a, b] = [-\infty, \infty]$
u(a) = 0
u(b) = 0
The eigenvalues of this problem are known to be $\lambda_k = 2k + 1$, k = 0, 1, ...
Therefore in the interval [10, 16] we expect SLCNT to note three eigenvalues, with the first of
these having index five.
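The expected answer can be checked by hand, or with a few lines of Fortran. The sketch below does not call SLCNT; it simply counts the known harmonic oscillator eigenvalues $2k + 1$ that fall in the search interval. The program name and the upper loop limit are arbitrary choices for the illustration.

```fortran
program count_sketch
  implicit none
  real, parameter :: alpha = 10.0, beta = 16.0
  integer :: k, ifirst, ntotal

  ifirst = -1
  ntotal = 0
  do k = 0, 20
     ! Eigenvalue k of -u'' + x**2 u = lambda u on the whole line is 2k + 1.
     if (2.0*k + 1.0 >= alpha .and. 2.0*k + 1.0 <= beta) then
        if (ifirst < 0) ifirst = k
        ntotal = ntotal + 1
     end if
  end do
  write (*,*) 'first index =', ifirst, '  count =', ntotal
end program count_sketch
```

The eigenvalues 11, 13 and 15 (indices 5, 6 and 7) lie in [10, 16], which agrees with the output shown for the example.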
USE SLCNT_INT
USE UMACH_INT
IMPLICIT NONE
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
INTEGER IFIRST, NOUT, NTOTAL
REAL ALPHA, BETAR, CONS(8)
LOGICAL ENDFIN(2)
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 SPECIFICATIONS FOR FUNCTIONS
EXTERNAL COEFFN
!
CALL UMACH (2, NOUT)
!
CONS(1) = 1.0E0
CONS(2) = 0.0E0
CONS(3) = 0.0E0
CONS(4) = 0.0E0
CONS(5) = 1.0E0
CONS(6) = 0.0E0
CONS(7) = 0.0E0
CONS(8) = 0.0E0
!
ENDFIN(1) = .FALSE.
ENDFIN(2) = .FALSE.
!
ALPHA = 10.0
BETAR = 16.0
!
!
!
99998 FORMAT (/, 'Index of first eigenvalue in [', F5.2, ',', F5.2, &
'] IS ', I2)
99999 FORMAT ('Total number of eigenvalues in this interval: ', I2)
!
END
!
SUBROUTINE COEFFN (X, PX, QX, RX)
!
!                                 SPECIFICATIONS FOR ARGUMENTS
REAL X, PX, QX, RX
!
PX = 1.0E0
QX = X*X
RX = 1.0E0
RETURN
END
Output
Index of first eigenvalue in [10.00,16.00] is 5
Total number of eigenvalues in this interval: 3
Chapter 6: Transforms
Routines
Laplace Transform
Inverse Laplace transform: INLAP
Inverse Laplace transform for smooth functions: SINLP
Usage Notes
Fast Fourier Transforms
A Fast Fourier Transform (FFT) is simply a discrete Fourier transform that can be computed
efficiently. Basically, the straightforward method for computing the Fourier transform takes
approximately $N^2$ operations where N is the number of points in the transform, while the FFT
(which computes the same values) takes approximately N log N operations. The algorithms in this
chapter are modeled on the Cooley-Tukey (1965) algorithm; hence, the computational savings
occur, not for all integers N, but for N which are highly composite. That is, N (or in certain cases
N + 1 or N - 1) should be a product of small primes.
All of the FFT routines compute a discrete Fourier transform. The routines accept a vector x of
length N and return a vector $\hat x$ defined by
$$\hat x_m := \sum_{n=1}^{N} x_n\, \omega_{nm}$$
The various transforms are determined by the selection of $\omega$. In the following table, we indicate
the selection of $\omega$ for the various transforms. This table should not be mistaken for a definition
since the precise transform definitions (at times) depend on whether N or m is even or odd.
Routine    $\omega_{nm}$
FFTRF      cos or sin $\dfrac{(m-1)(n-1)2\pi}{N}$
FFTRB      cos or sin $\dfrac{(m-1)(n-1)2\pi}{N}$
FFTCF      $\exp\left(-2\pi i (n-1)(m-1)/N\right)$
FFTCB      $\exp\left(2\pi i (n-1)(m-1)/N\right)$
FSINT      $\sin\dfrac{nm\pi}{N+1}$
FCOST      $\cos\dfrac{(n-1)(m-1)\pi}{N-1}$
QSINF      $2\sin\dfrac{(2m-1)n\pi}{2N}$
QSINB      $4\sin\dfrac{(2n-1)m\pi}{2N}$
QCOSF      $2\cos\dfrac{(2m-1)(n-1)\pi}{2N}$
QCOSB      $4\cos\dfrac{(2n-1)(m-1)\pi}{2N}$
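To make the definition concrete, the sum can be evaluated directly for the complex forward kernel listed for FFTCF. The sketch below is a plain $O(N^2)$ evaluation in standard Fortran, not a call into the library; the program name and the test sequence (a single complex exponential) are assumptions chosen for illustration.

```fortran
program dft_sketch
  implicit none
  integer, parameter :: n = 8
  real(kind(1e0)), parameter :: twopi = 6.2831853e0
  complex(kind(1e0)) :: x(n), xhat(n)
  integer :: m, k

  ! Test sequence: one complex exponential with one cycle over the record.
  do k = 1, n
     x(k) = exp(cmplx(0.0e0, twopi*real(k-1)/real(n)))
  end do

  ! Direct evaluation of xhat(m) = sum over k of x(k)*omega(k,m) with
  ! omega(k,m) = exp(-2*pi*i*(k-1)*(m-1)/N). The cost grows like N**2.
  do m = 1, n
     xhat(m) = (0.0e0, 0.0e0)
     do k = 1, n
        xhat(m) = xhat(m) + x(k)* &
             exp(cmplx(0.0e0, -twopi*real((k-1)*(m-1))/real(n)))
     end do
  end do

  do m = 1, n
     write (*,'(i4,2f10.4)') m, real(xhat(m)), aimag(xhat(m))
  end do
end program dft_sketch
```

For this input the second component of the result should be close to N and the others close to zero, up to single precision rounding. An FFT routine computes the same values in roughly N log N operations when N is highly composite.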
For many of the routines listed above, there is a corresponding "I" (for initialization) routine. Use
these routines only when repeatedly transforming sequences of the same length. In this situation,
the "I" routine will compute the initial setup once, and then the user will call the corresponding
"2" routine. This can result in substantial computational savings. For more information on the
usage of these routines, the user should consult the documentation under the appropriate routine
name.
In addition to the one-dimensional transformations described above, we also provide complex two
and three-dimensional FFTs and their inverses based on calls to either FFTCF or FFTCB. If you
need a higher dimensional transform, then you should consult the example program for FFTCI,
which suggests a basic strategy one could employ.
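The continuous Fourier transform is defined as
$$\hat f(\omega) = (\mathcal{F} f)(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-2\pi i \omega t}\, dt$$
Truncating the integral to an interval of length T gives the approximation
$$\hat f(\omega) \approx \int_{-T/2}^{T/2} f(t)\, e^{-2\pi i \omega t}\, dt = \int_{0}^{T} f(t - T/2)\, e^{-2\pi i \omega (t - T/2)}\, dt = e^{\pi i \omega T} \int_{0}^{T} f(t - T/2)\, e^{-2\pi i \omega t}\, dt$$
If we approximate the last integral using the rectangle rule with spacing h = T/N, we have
$$\hat f(\omega) \approx e^{\pi i \omega T}\, h \sum_{k=0}^{N-1} e^{-2\pi i \omega k h} f(kh - T/2)$$
Setting $\omega = j/T$ for j = 0, ..., N - 1 yields
$$\hat f(j/T) \approx e^{\pi i j}\, h \sum_{k=0}^{N-1} e^{-2\pi i jk/N} f(kh - T/2) = (-1)^j h \sum_{k=0}^{N-1} e^{-2\pi i jk/N} f_k^h$$
where the vector $f^h = (f(-T/2), \ldots, f((N-1)h - T/2))$. Thus, after scaling the components by
$(-1)^j h$, the discrete Fourier transform as computed in FFTCF (with input $f^h$) is related to an
approximation of the continuous Fourier transform by the above formula. This is seen more
clearly by making a change of variables in the last sum. Set
n = k + 1, m = j + 1, and $f_k^h = x_n$
to obtain
$$\hat f((m-1)/T) \approx (-1)^{m-1} h\, \hat x_m = (-1)^{m-1} h \sum_{n=1}^{N} e^{-2\pi i (m-1)(n-1)/N} x_n$$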
If the function f is expressed as a FORTRAN function routine, then the continuous Fourier
transform $\hat f$ can be approximated using the IMSL routine QDAWF (see Chapter 4, Integration and
Differentiation).
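The relation between the scaled discrete transform and the continuous transform can be tried on a function whose transform is known. The sketch below uses $f(t) = e^{-\pi t^2}$, for which $\hat f(\omega) = e^{-\pi \omega^2}$, and evaluates the rectangle-rule sum directly rather than through FFTCF. The truncation length T = 8, the number of points N = 32 and the program name are assumptions made for the illustration.

```fortran
program cft_sketch
  implicit none
  integer, parameter :: n = 32
  real(kind(1d0)), parameter :: pi = 3.141592653589793d0
  real(kind(1d0)), parameter :: t = 8.0d0        ! truncation length T (assumed)
  real(kind(1d0)) :: h, w
  complex(kind(1d0)) :: s
  integer :: j, k

  h = t/n                                        ! rectangle-rule spacing
  do j = 0, 4
     s = (0.0d0, 0.0d0)
     do k = 0, n - 1
        ! Sample f at kh - T/2 and apply the kernel exp(-2*pi*i*j*k/N).
        s = s + exp(cmplx(0.0d0, -2.0d0*pi*j*k/n, kind(1d0)))*f(k*h - 0.5d0*t)
     end do
     s = (-1)**j*h*s                             ! scale by (-1)**j * h
     w = j/t                                     ! frequency j/T
     write (*,'(f8.4,2f14.8)') w, real(s), exp(-pi*w*w)
  end do

contains

  function f(x) result(y)
    real(kind(1d0)), intent(in) :: x
    real(kind(1d0)) :: y
    y = exp(-pi*x*x)
  end function f
end program cft_sketch
```

Each output line lists the frequency, the scaled discrete approximation, and the exact transform value. For a rapidly decaying smooth function such as this one, the two columns are expected to agree closely at low frequencies.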
FAST_DFT
Computes the Discrete Fourier Transform (DFT) of a rank-1 complex array, x.
Required Arguments
No required arguments; pairs of optional arguments are required. These pairs are forward_in
and forward_out or inverse_in and inverse_out.
Optional Arguments
forward_in = x (Input)
forward_out = y (Output)
Stores the output complex array of rank-1 resulting from the transform.
inverse_in = y (Input)
inverse_out = x (Output)
Stores the output complex array of rank-1 resulting from the inverse transform.
ndata = n (Input)
ido = ido (Input/Output)
Integer flag that directs user action. Normally, this argument is used only when the
working variables required for the transform and its inverse are saved in the calling
program unit. Computing the working variables and saving them in internal arrays
within fast_dft is the default. This initialization step is expensive.
There is a two-step process to compute the working variables just once. Example 3
illustrates this usage. The general algorithm for this usage is to enter fast_dft
with ido = 0. A return occurs thereafter with ido < 0. The optional rank-1
complex array w(:) with size(w) >= -ido must be re-allocated. Then, re-enter
fast_dft. The next return from fast_dft has the output value ido = 1. The
variables required for the transform and its inverse are saved in w(:). Thereafter,
when the routine is entered with ido = 1 and for the same value of n, the contents
of w(:) will be used for the working variables. The expensive initialization step is
avoided. The optional arguments ido= and work_array= must be used
together.
work_array = w(:) (Input/Output)
Complex array of rank-1 used to store working variables and values between calls to
fast_dft. The value for size(w) must be at least as large as the value -ido for the
value of ido < 0.
iopt = iopt(:) (Input/Output)
Derived type array with the same precision as the input array; used for passing optional
data to fast_dft. The options are as follows:
Option Prefix    Option Name                  Option Value
c_, z_           fast_dft_scan_for_NaN
c_, z_           fast_dft_near_power_of_2
c_, z_           fast_dft_scale_forward
c_, z_           fast_dft_scale_inverse
Examines each input array entry to find the first value such that
isNaN(x(i)) ==.true.
FORTRAN 90 Interface
Generic:
None
Specific:
Description
The fast_dft routine is a Fortran 90 version of the FFT suite of IMSL (1994, pp. 772-776). The
maximum computing efficiency occurs when the size of the array can be factored in the form
$$n = 2^{i_1} 3^{i_2} 4^{i_3} 5^{i_4}$$
using non-negative integer values $\{i_1, i_2, i_3, i_4\}$. There is no further restriction on $n \ge 1$.
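Whether a given length has this favorable form can be checked before calling the transform. Since 4 is a power of 2, it is enough to divide out the primes 2, 3 and 5 and see whether 1 remains. The sketch below is a hypothetical helper written for this illustration and is not part of the library.

```fortran
program factor_sketch
  implicit none
  integer :: n

  do n = 60, 64
     write (*,'(i4,l3)') n, smooth(n)
  end do

contains

  ! Returns .true. when n >= 1 has no prime factor other than 2, 3 or 5.
  logical function smooth(n)
    integer, intent(in) :: n
    integer, parameter :: primes(3) = (/ 2, 3, 5 /)
    integer :: m, q, i
    m = n
    do i = 1, 3
       q = primes(i)
       do while (m > 1 .and. mod(m, q) == 0)
          m = m/q
       end do
    end do
    smooth = (m == 1)
  end function smooth
end program factor_sketch
```

Of the lengths tried here, 60 and 64 factor completely over 2, 3 and 5, while 61, 62 and 63 do not. Padding or trimming data to a nearby favorable length is one option when the transform speed matters.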
Output
Example 1 for FAST_DFT is correct.
Additional Examples
Example 2: Cyclical Data with a Linear Trend
This set of data is sampled from a function x(t) = at + b + y(t), where y(t) is a harmonic series. The
independent variable is normalized as $-1 \le t \le 1$. Thus, the data is said to have cyclical
components plus a linear trend. As a first step, the linear terms are effectively removed from the
data using the least-squares system solver LIN_SOL_LSQ, Chapter 1. Then, the residuals are
transformed and the resulting frequencies are analyzed.
use fast_dft_int
use lin_sol_lsq_int
use rand_gen_int
use sort_real_int
implicit none
! This is Example 2 for FAST_DFT.
integer i
integer, parameter :: n=64, k=4
integer ip(n)
real(kind(1e0)), parameter :: one=1e0, two=2e0, zero=0e0
real(kind(1e0)) delta_t, pi
real(kind(1e0)) y(k), z(2), indx(k), t(n), temp(n)
complex(kind(1e0)) a_trend(n,2), a, b_trend(n,1), b, c(k), f(n),&
r(n), x(n), x_trend(2,1)
! Generate random data for linear trend and harmonic series.
call rand_gen(z)
a = z(1); b = z(2)
call rand_gen(y)
! This emphasizes harmonics 2 through k+1.
c = y + one
! Determine sampling interval.
delta_t = two/n
t=(/(-one+i*delta_t, i=0,n-1)/)
! Compute pi.
pi = atan(one)*4E0
indx=(/(i*pi,i=1,k)/)
! Make up data set as a linear trend plus harmonics.
x = a + b*t + &
matmul(exp(cmplx(zero,spread(t,2,k)*spread(indx,1,n),kind(one))),c)
! Define least-squares matrix data for a linear trend.
a_trend(1:,1) = one
a_trend(1:,2) = t
b_trend(1:,1) = x
! Solve for a linear trend.
call lin_sol_lsq(a_trend, b_trend, x_trend)
! Compute harmonic residuals.
r = x - reshape(matmul(a_trend,x_trend),(/n/))
! Transform harmonic residuals.
call c_fast_dft(forward_in=r, forward_out=f)
ip=(/(i,i=1,n)/)
Output
Example 2 for FAST_DFT is correct.
Output
Example 3 for FAST_DFT is correct.
$$c_k = \sum_{j=0}^{n-1} a_j b_{k-j}, \quad k = 0, \ldots, n-1$$
The definition implies a matrix-vector product. A direct approach requires about $n^2$ operations
consisting of an add and multiply. An efficient method consisting of computing the products of
the transforms of the $\{a_j\}$ and $\{b_j\}$,
then inverting this product, is preferable to the matrix-vector approach for large problems. The
example is also illustrated in operator_ex37, Chapter 10 using the generic function interface
FFT and IFFT.
use fast_dft_int
use rand_gen_int
implicit none
! This is Example 4 for FAST_DFT.
integer j
Output
Example 4 for FAST_DFT is correct.
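The equivalence between the direct sum and the transform route can be illustrated without the library. The sketch below assumes cyclic indexing (the subscript k - j is taken modulo n) and uses a naive $O(n^2)$ transform as a stand-in for an FFT, so it shows the identity rather than the speed-up. The routine name `dft`, its `isign` argument and the test data are hypothetical.

```fortran
program conv_sketch
  implicit none
  integer, parameter :: n = 6
  real(kind(1d0)), parameter :: twopi = 6.283185307179586d0
  complex(kind(1d0)) :: a(0:n-1), b(0:n-1), c(0:n-1), d(0:n-1)
  complex(kind(1d0)) :: ah(0:n-1), bh(0:n-1)
  integer :: j, k

  do j = 0, n - 1
     a(j) = cmplx(j + 1, 0, kind(1d0))
     b(j) = cmplx(1.0d0/(j + 1), 0.0d0, kind(1d0))
  end do

  ! Direct cyclic convolution: about n**2 multiply-add pairs.
  do k = 0, n - 1
     c(k) = (0.0d0, 0.0d0)
     do j = 0, n - 1
        c(k) = c(k) + a(j)*b(modulo(k - j, n))
     end do
  end do

  ! Transform route: forward transforms, pointwise product, scaled inverse.
  call dft(a, ah, -1)
  call dft(b, bh, -1)
  ah = ah*bh
  call dft(ah, d, +1)
  d = d/n

  write (*,'(a,es10.2)') ' max difference = ', maxval(abs(c - d))

contains

  ! Naive discrete transform with kernel exp(isign*2*pi*i*l*m/n).
  subroutine dft(x, y, isign)
    complex(kind(1d0)), intent(in) :: x(0:)
    complex(kind(1d0)), intent(out) :: y(0:)
    integer, intent(in) :: isign
    integer :: m, l
    do m = 0, size(x) - 1
       y(m) = (0.0d0, 0.0d0)
       do l = 0, size(x) - 1
          y(m) = y(m) + x(l)* &
               exp(cmplx(0.0d0, isign*twopi*l*m/size(x), kind(1d0)))
       end do
    end do
  end subroutine dft
end program conv_sketch
```

The printed difference should be at the level of double precision rounding. Replacing the naive transform with a fast one is what makes the second route cheaper for large n.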
FAST_2DFT
Computes the Discrete Fourier Transform (2DFT) of a rank-2 complex array, x.
Required Arguments
No required arguments; pairs of optional arguments are required. These pairs are forward_in
and forward_out or inverse_in and inverse_out.
Optional Arguments
forward_in = x (Input)
forward_out = y (Output)
Stores the output complex array of rank-2 resulting from the transform.
inverse_in = y (Input)
inverse_out = x (Output)
Stores the output complex array of rank-2 resulting from the inverse transform.
mdata = m (Input)
ndata = n (Input)
Uses the sub-array in the second dimension of size n for the numbers.
Default value: n = size(x,2).
ido = ido (Input/Output)
Integer flag that directs user action. Normally, this argument is used only when the
working variables required for the transform and its inverse are saved in the calling
program unit. Computing the working variables and saving them in internal arrays
within fast_2dft is the default. This initialization step is expensive.
There is a two-step process to compute the working variables just once. Example 3
illustrates this usage. The general algorithm for this usage is to enter fast_2dft with
ido = 0. A return occurs thereafter with ido < 0. The optional rank-1 complex array w(:)
with size(w) >= -ido must be re-allocated. Then, re-enter fast_2dft. The next return
from fast_2dft has the output value ido = 1. The variables required for the transform
and its inverse are saved in w(:). Thereafter, when the routine is entered with ido = 1
and for the same values of m and n, the contents of w(:) will be used for the working
variables. The expensive initialization step is avoided. The optional arguments ido=
and work_array= must be used together.
work_array = w(:) (Input/Output)
Complex array of rank-1 used to store working variables and values between calls to
fast_2dft. The value for size(w) must be at least as large as the value -ido for the
value of ido < 0.
iopt = iopt(:) (Input/Output)
Derived type array with the same precision as the input array; used for passing optional
data to fast_2dft. The options are as follows:
Packaged Options for FAST_2DFT
Option Prefix = ?    Option Name                   Option Value
c_, z_               fast_2dft_scan_for_NaN
c_, z_               fast_2dft_near_power_of_2
c_, z_               fast_2dft_scale_forward
c_, z_               fast_2dft_scale_inverse
Examines each input array entry to find the first value such that
isNaN(x(i,j)) ==.true.
See the isNaN() function, Chapter 10.
FORTRAN 90 Interface
Generic:
None
Specific:
Description
The fast_2dft routine is a Fortran 90 version of the FFT suite of IMSL (1994, pp. 772-776).
Output
Example 1 for FAST_2DFT is correct.
Additional Examples
Example 2: Cyclical 2D Data with a Linear Trend
This set of data is sampled from a function x(s, t) = a + bs + ct + y(s, t), where y(s, t) is a
harmonic series. The independent variables are normalized as $-1 \le s \le 1$ and $-1 \le t \le 1$. Thus, the
data is said to have cyclical components plus a linear trend. As a first step, the linear terms are
effectively removed from the data using the least-squares system solver LIN_SOL_LSQ. Then, the
residuals are transformed and the resulting frequencies are analyzed.
use fast_2dft_int
use lin_sol_lsq_int
use sort_real_int
use rand_int
implicit none
! This is Example 2 for FAST_2DFT.
integer i
integer, parameter :: n=8, k=15
integer ip(n*n), order(k)
real(kind(1e0)), parameter :: one=1e0, two=2e0, zero=0e0
real(kind(1e0)) delta_t
real(kind(1e0)) rn(3), s(n), t(n), temp(n*n), new_order(k)
complex(kind(1e0)) a, b, c, a_trend(n*n,3), b_trend(n*n,1), &
f(n,n), r(n,n), x(n,n), x_trend(3,1)
complex(kind(1e0)), dimension(n,n) :: g=zero, h=zero
a_trend(1:,2) = reshape(spread(s,dim=2,ncopies=n),(/n*n/))
a_trend(1:,3) = reshape(spread(t,dim=1,ncopies=n),(/n*n/))
b_trend(1:,1) = reshape(x,(/n*n/))
! Solve for a linear trend.
call lin_sol_lsq(a_trend, b_trend, x_trend)
! Compute harmonic residuals.
r = x - reshape(matmul(a_trend,x_trend),(/n,n/))
! Transform harmonic residuals.
call c_fast_2dft(forward_in=r, forward_out=f)
ip = (/(i,i=1,n**2)/)
! Sort the magnitude of the transform.
call s_sort_real(-(abs(reshape(f,(/n*n/)))), &
temp, iperm=ip)
! The dominant frequencies are output in ip(1:k).
! Sort these values to compare with the original frequency order.
call s_sort_real(real(ip(1:k)), new_order)
order(1:n) = (/(i,i=1,n)/)
order(n+1:k) = (/((i-n)*n+1,i=n+1,k)/)
! Check the results.
if (count(order /= int(new_order)) == 0) then
write (*,*) 'Example 2 for FAST_2DFT is correct.'
end if
end
Output
Example 2 for FAST_2DFT is correct.
Output
Example 3 for FAST_2DFT is correct.
FAST_3DFT
Computes the Discrete Fourier Transform (3DFT) of a rank-3 complex array.
Required Arguments
No required arguments; pairs of optional arguments are required. These pairs are forward_in
and forward_out or inverse_in and inverse_out.
Optional Arguments
forward_in = x (Input)
forward_out = y (Output)
Stores the output complex array of rank-3 resulting from the transform.
inverse_in = y (Input)
inverse_out = x (Output)
Stores the output complex array of rank-3 resulting from the inverse transform.
mdata = m (Input)
ndata = n (Input)
Uses the sub-array in the second dimension of size n for the numbers.
Default value: n = size(x,2).
kdata = k (Input)
Uses the sub-array in the third dimension of size k for the numbers.
Default value: k = size(x,3).
ido = ido (Input/Output)
Integer flag that directs user action. Normally, this argument is used only when the
working variables required for the transform and its inverse are saved in the calling
program unit. Computing the working variables and saving them in internal arrays
within fast_3dft is the default. This initialization step is expensive.
There is a two-step process to compute the working variables just once. The general
algorithm for this usage is to enter fast_3dft with
ido = 0. A return occurs thereafter with ido < 0. The optional rank-1 complex array w(:)
with size(w) >= -ido must be re-allocated. Then, re-enter fast_3dft. The next return
from fast_3dft has the output value ido = 1. The variables required for the transform
and its inverse are saved in w(:). Thereafter, when the routine is entered with ido = 1
and for the same values of m and n, the contents of w(:) will be used for the working
variables. The expensive initialization step is avoided. The optional arguments ido=
and work_array= must be used together.
work_array = w(:) (Input/Output)
Complex array of rank-1 used to store working variables and values between calls to
fast_3dft. The value for size(w) must be at least as large as the value -ido for the
value of ido < 0.
iopt = iopt(:) (Input/Output)
Derived type array with the same precision as the input array; used for passing optional
data to fast_3dft. The options are as follows:
Packaged Options for FAST_3DFT
Option Prefix = ?    Option Name                   Option Value
c_, z_               fast_3dft_scan_for_NaN
c_, z_               fast_3dft_near_power_of_2
c_, z_               fast_3dft_scale_forward
c_, z_               fast_3dft_scale_inverse
Examines each input array entry to find the first value such that
isNaN(x(i,j,k)) ==.true.
FORTRAN 90 Interface
Generic:
None
Specific:
Description
The fast_3dft routine is a Fortran 90 version of the FFT suite of IMSL (1994, pp. 772-776).
Output
Example 1 for FAST_3DFT is correct.
FFTRF
Computes the Fourier coefficients of a real periodic sequence.
Required Arguments
N Length of the sequence to be transformed. (Input)
SEQ Array of length N containing the periodic sequence. (Input)
COEF Array of length N containing the Fourier coefficients. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFTRF computes the discrete Fourier transform of a real vector of size N. The method
used is a variant of the Cooley-Tukey algorithm that is most efficient when N is a product of small
prime factors. If N satisfies this condition, then the computational effort is proportional to N log N.
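Specifically, given an N-vector s = SEQ, FFTRF returns in c = COEF, if N is even:
$$c_{2m-2} = \sum_{n=1}^{N} s_n \cos\left[\frac{(m-1)(n-1)2\pi}{N}\right] \quad m = 2, \ldots, N/2 + 1$$
$$c_{2m-1} = -\sum_{n=1}^{N} s_n \sin\left[\frac{(m-1)(n-1)2\pi}{N}\right] \quad m = 2, \ldots, N/2$$
$$c_1 = \sum_{n=1}^{N} s_n$$
If N is odd, $c_m$ is defined as above for m from 2 to (N + 1)/2.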
We now describe a fairly common usage of this routine. Let f be a real valued function of time.
Suppose we sample f at N equally spaced time intervals of length $\Delta$ seconds starting at time $t_0$.
That is, we have
$$\mathrm{SEQ}_i := f(t_0 + (i-1)\Delta) \quad i = 1, 2, \ldots, N$$
The routine FFTRF treats this sequence as if it were periodic of period N. In particular, it assumes
that $f(t_0) = f(t_0 + N\Delta)$. Hence, the period of the function is assumed to be $T = N\Delta$.
Now, FFTRF accepts as input SEQ and returns as output coefficients c = COEF that satisfy the
following relation when N is odd (N even is similar):
$$\mathrm{SEQ}_i = \frac{1}{N}\left[c_1 + 2\sum_{n=2}^{(N+1)/2} c_{2n-2} \cos\left[\frac{2\pi(n-1)(i-1)}{N}\right] - 2\sum_{n=2}^{(N+1)/2} c_{2n-1} \sin\left[\frac{2\pi(n-1)(i-1)}{N}\right]\right]$$
This formula is very revealing. It can be interpreted in the following manner. The coefficients
produced by FFTRF produce an interpolating trigonometric polynomial to the data. That is, if we
define
$$g(t) := \frac{1}{N}\left[c_1 + 2\sum_{n=2}^{(N+1)/2} c_{2n-2} \cos\left[\frac{2\pi(n-1)(t-t_0)}{N\Delta}\right] - 2\sum_{n=2}^{(N+1)/2} c_{2n-1} \sin\left[\frac{2\pi(n-1)(t-t_0)}{N\Delta}\right]\right]$$
$$= \frac{1}{N}\left[c_1 + 2\sum_{n=2}^{(N+1)/2} c_{2n-2} \cos\left[\frac{2\pi(n-1)(t-t_0)}{T}\right] - 2\sum_{n=2}^{(N+1)/2} c_{2n-1} \sin\left[\frac{2\pi(n-1)(t-t_0)}{T}\right]\right]$$
then, we have
$$f(t_0 + (i-1)\Delta) = g(t_0 + (i-1)\Delta)$$
Now, suppose we want to discover the dominant frequencies. One forms the vector P of length
(N + 1)/2 as follows:
\[ P_1 := |c_1| \]
\[ P_k := c_{2k-2}^2 + c_{2k-1}^2, \quad k = 2, 3, \ldots, (N+1)/2 \]
These numbers correspond to the energy in the spectrum of the signal. In particular, P_k
corresponds to the energy level at frequency
\[ \frac{k-1}{T} = \frac{k-1}{N\Delta}, \quad k = 1, 2, \ldots, \frac{N+1}{2} \]
where \Delta is the sampling interval and T = N\Delta.
Furthermore, note that there are only (N + 1)/2 \approx T/(2\Delta) resolvable frequencies when N
observations are taken. This is related to the Nyquist phenomenon, which is induced by discrete
sampling of a continuous signal.
Similar relations hold for the case when N is even.
Finally, note that the Fourier transform has an (unnormalized) inverse that is implemented in
FFTRB. The routine FFTRF is based on the real FFT in FFTPACK. The package FFTPACK was
developed by Paul Swarztrauber at the National Center for Atmospheric Research.
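The storage layout of COEF and the energy vector P described above can be checked with a short direct summation. The following is a minimal sketch, not the FFTPACK algorithm and not an IMSL call: it costs O(N**2), the program name is made up, the sampling interval delta = 0.1 is a hypothetical value, and the guards for even N and the use of the magnitude of c(1) for P(1) are assumptions of this sketch.

```fortran
program fftrf_layout_sketch
  ! Direct O(N**2) summation, for checking the storage layout only.
  ! This is not the FFTPACK algorithm and it calls no IMSL routine.
  implicit none
  integer, parameter :: n = 7               ! same N as the FFTRF example
  real, parameter    :: twopi = 6.2831853
  real, parameter    :: delta = 0.1         ! hypothetical sampling interval (s)
  real    :: seq(n), coef(n), p(n/2+1), arg
  integer :: i, m, k

  ! One cosine period over N samples.
  do i = 1, n
     seq(i) = cos(real(i-1)*twopi/real(n))
  end do

  ! c(1) is the plain sum; cosine sums go to slots 2m-2 and negated
  ! sine sums to slots 2m-1 (the last sine slot is absent when N is even).
  coef    = 0.0
  coef(1) = sum(seq)
  do m = 2, n/2 + 1
     do i = 1, n
        arg = real((m-1)*(i-1))*twopi/real(n)
        coef(2*m-2) = coef(2*m-2) + seq(i)*cos(arg)
        if (2*m-1 <= n) coef(2*m-1) = coef(2*m-1) - seq(i)*sin(arg)
     end do
  end do

  ! Energy per resolvable frequency (k-1)/(N*delta).
  p(1) = abs(coef(1))
  do k = 2, n/2 + 1
     p(k) = coef(2*k-2)**2
     if (2*k-1 <= n) p(k) = p(k) + coef(2*k-1)**2
  end do

  print '(a)', '  INDEX     SEQ    COEF'
  print '(i7, 2f8.2)', (i, seq(i), coef(i), i = 1, n)
  print '(a)', '      K    FREQ       P'
  print '(i7, 2f8.2)', (k, real(k-1)/(real(n)*delta), p(k), k = 1, n/2 + 1)
end program fftrf_layout_sketch
```

For the cosine input used here, the only coefficient that is not close to zero should be COEF(2), with a value near N/2 = 3.5, so the energy concentrates in P(2).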
Comments
1.
2.
The routine FFTRF is most efficient when N is the product of small primes.
3.
4.
If FFTRF/FFTRB is used repeatedly with the same value of N, then call FFTRI followed
by repeated calls to F2TRF/F2TRB. This is more efficient than repeated calls to
FFTRF/FFTRB.
Example
In this example, a pure cosine wave is used as a data vector, and its Fourier series is recovered.
The Fourier series is a vector with all components zero except at the appropriate frequency where
it has the value N/2 (3.50 in the output below, since N = 7).
      USE FFTRF_INT
      USE CONST_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      REAL       COEF(N), COS, FLOAT, TWOPI, SEQ(N)
      INTRINSIC  COS, FLOAT
!
      TWOPI = CONST('PI')
      TWOPI = 2.0*TWOPI
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Fill SEQ with a pure cosine wave
      DO 10  I=1, N
         SEQ(I) = COS(FLOAT(I-1)*TWOPI/FLOAT(N))
   10 CONTINUE
!                                 Compute the Fourier transform of SEQ
      CALL FFTRF (N, SEQ, COEF)
!                                 Print results
      WRITE (NOUT,99998)
      WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
99998 FORMAT (9X, 'INDEX', 5X, 'SEQ', 6X, 'COEF')
99999 FORMAT (1X, I11, 5X, F5.2, 5X, F5.2)
      END
Output
         INDEX     SEQ      COEF
           1      1.00      0.00
           2      0.62      3.50
           3     -0.22      0.00
           4     -0.90      0.00
           5     -0.90      0.00
           6     -0.22      0.00
           7      0.62      0.00
FFTRB
Computes the real periodic sequence from its Fourier coefficients.
Required Arguments
N Length of the sequence to be transformed. (Input)
COEF Array of length N containing the Fourier coefficients. (Input)
SEQ Array of length N containing the periodic sequence. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFTRB is the unnormalized inverse of the routine FFTRF. This routine computes the
discrete inverse Fourier transform of a real vector of size N. The method used is a variant of the
Cooley-Tukey algorithm, which is most efficient when N is a product of small prime factors. If N
satisfies this condition, then the computational effort is proportional to N log N.
Specifically, given an N-vector c = COEF, FFTRB returns in s = SEQ, if N is even:
\[ s_m = c_1 + (-1)^{(m-1)} c_N + 2\sum_{n=2}^{N/2} c_{2n-2}\cos\left[\frac{(n-1)(m-1)2\pi}{N}\right] - 2\sum_{n=2}^{N/2} c_{2n-1}\sin\left[\frac{(n-1)(m-1)2\pi}{N}\right] \]
If N is odd:
\[ s_m = c_1 + 2\sum_{n=2}^{(N+1)/2} c_{2n-2}\cos\left[\frac{(n-1)(m-1)2\pi}{N}\right] - 2\sum_{n=2}^{(N+1)/2} c_{2n-1}\sin\left[\frac{(n-1)(m-1)2\pi}{N}\right] \]
The routine FFTRB is based on the inverse real FFT in FFTPACK. The package FFTPACK was
developed by Paul Swarztrauber at the National Center for Atmospheric Research.
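The statement that FFTRB is the unnormalized inverse of FFTRF can be illustrated by evaluating both sets of sums directly. This is a hedged sketch for odd N only: it uses O(N**2) loops rather than FFTPACK, calls no IMSL routine, and assumes the negated-sine storage convention for the odd-indexed coefficients. The program name is made up for illustration.

```fortran
program fftrb_roundtrip_sketch
  ! Direct evaluation of the forward and backward sums for odd N.
  ! Not the FFTPACK algorithm; no IMSL routine is called.
  implicit none
  integer, parameter :: n = 7               ! odd N only in this sketch
  real, parameter    :: twopi = 6.2831853
  real    :: x(n), coef(n), seq(n), arg
  integer :: i, m, k

  ! Alternating data, x(i) = (-1)**i.
  do i = 1, n
     x(i) = real((-1)**i)
  end do

  ! Forward coefficients stored as c(1), then (cosine, -sine) pairs.
  coef    = 0.0
  coef(1) = sum(x)
  do k = 2, (n+1)/2
     do i = 1, n
        arg = real((k-1)*(i-1))*twopi/real(n)
        coef(2*k-2) = coef(2*k-2) + x(i)*cos(arg)
        coef(2*k-1) = coef(2*k-1) - x(i)*sin(arg)
     end do
  end do

  ! Unnormalized inverse for odd N.
  do m = 1, n
     seq(m) = coef(1)
     do k = 2, (n+1)/2
        arg = real((k-1)*(m-1))*twopi/real(n)
        seq(m) = seq(m) + 2.0*coef(2*k-2)*cos(arg) - 2.0*coef(2*k-1)*sin(arg)
     end do
  end do

  print '(a)', '  INDEX       X    COEF     SEQ'
  print '(i7, 3f8.2)', (i, x(i), coef(i), seq(i), i = 1, n)
end program fftrb_roundtrip_sketch
```

The last column should come out as N times the first, which is the scaling to undo when a normalized inverse is wanted.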
Comments
1.
2.
The routine FFTRB is most efficient when N is the product of small primes.
3.
4.
If FFTRF/FFTRB is used repeatedly with the same value of N, then call FFTRI followed
by repeated calls to F2TRF/F2TRB. This is more efficient than repeated calls to
FFTRF/FFTRB.
Example
We compute the forward real FFT followed by the inverse operation. In this example, we first
compute the Fourier transform \hat{x} = COEF of the vector x, where x_j = (-1)^j for j = 1 to N.
This vector \hat{x} is now input into FFTRB with the resulting output s = Nx, that is,
s_j = (-1)^j N for j = 1 to N.
      USE FFTRB_INT
      USE CONST_INT
      USE FFTRF_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      REAL       COEF(N), FLOAT, SEQ(N), TWOPI, X(N)
      INTRINSIC  FLOAT
      TWOPI = CONST('PI')
!
      TWOPI = TWOPI
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Fill the data vector X
!                                 with X(I) = (-1)**I, I=1,N
      DO 10  I=1, N
         X(I) = FLOAT((-1)**I)
   10 CONTINUE
!                                 Compute the forward transform of X
      CALL FFTRF (N, X, COEF)
!                                 Print results
      WRITE (NOUT,99994)
      WRITE (NOUT,99995)
99994 FORMAT (9X, 'Result after forward transform')
99995 FORMAT (9X, 'INDEX', 5X, 'X', 8X, 'COEF')
      WRITE (NOUT,99996) (I, X(I), COEF(I), I=1,N)
99996 FORMAT (1X, I11, 5X, F5.2, 5X, F5.2)
!                                 Compute the backward transform of
!                                 COEF
      CALL FFTRB (N, COEF, SEQ)
!                                 Print results
      WRITE (NOUT,99997)
      WRITE (NOUT,99998)
99997 FORMAT (/, 9X, 'Result after backward transform')
99998 FORMAT (9X, 'INDEX', 4X, 'COEF', 6X, 'SEQ')
      WRITE (NOUT,99999) (I, COEF(I), SEQ(I), I=1,N)
99999 FORMAT (1X, I11, 5X, F5.2, 5X, F5.2)
      END
Output
         Result after forward transform
         INDEX     X        COEF
           1     -1.00     -1.00
           2      1.00     -1.00
           3     -1.00     -0.48
           4      1.00     -1.00
           5     -1.00     -1.25
           6      1.00     -1.00
           7     -1.00     -4.38

         Result after backward transform
         INDEX    COEF      SEQ
           1     -1.00     -7.00
           2     -1.00      7.00
           3     -0.48     -7.00
           4     -1.00      7.00
           5     -1.25     -7.00
           6     -1.00      7.00
           7     -4.38     -7.00
FFTRI
Computes parameters needed by FFTRF and FFTRB.
Required Arguments
N Length of the sequence to be transformed. (Input)
WFFTR Array of length 2N + 15 containing parameters needed by FFTRF and FFTRB.
(Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFTRI initializes the routines FFTRF and FFTRB. An efficient way to make multiple
calls for the same N to routine FFTRF or FFTRB, is to use routine FFTRI for initialization. (In this
case, replace FFTRF or FFTRB with F2TRF or F2TRB, respectively.) The routine FFTRI is based on
the routine RFFTI in FFTPACK. The package FFTPACK was developed by Paul Swarztrauber at the
National Center for Atmospheric Research.
Comments
Different WFFTR arrays are needed for different values of N.
Example
In this example, we compute three distinct real FFTs by calling FFTRI once and then calling
F2TRF three times.
      USE FFTRI_INT
      USE CONST_INT
      USE F2TRF_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, K, NOUT
      REAL       COEF(N), COS, FLOAT, TWOPI, WFFTR(29), SEQ(N)
      INTRINSIC  COS, FLOAT
!
      TWOPI = CONST('PI')
      TWOPI = 2*TWOPI
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set the work vector
      CALL FFTRI (N, WFFTR)
!
      DO 20  K=1, 3
!                                 Fill SEQ with a pure cosine wave
         DO 10  I=1, N
            SEQ(I) = COS(FLOAT(K*(I-1))*TWOPI/FLOAT(N))
   10    CONTINUE
!                                 Compute the Fourier transform of SEQ
         CALL F2TRF (N, SEQ, COEF, WFFTR)
!                                 Print results
         WRITE (NOUT,99998)
99998    FORMAT (/, 9X, 'INDEX', 5X, 'SEQ', 6X, 'COEF')
         WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
99999    FORMAT (1X, I11, 5X, F5.2, 5X, F5.2)
!
   20 CONTINUE
      END
Output
         INDEX     SEQ      COEF
           1      1.00      0.00
           2      0.62      3.50
           3     -0.22      0.00
           4     -0.90      0.00
           5     -0.90      0.00
           6     -0.22      0.00
           7      0.62      0.00

         INDEX     SEQ      COEF
           1      1.00      0.00
           2     -0.22      0.00
           3     -0.90      0.00
           4      0.62      3.50
           5      0.62      0.00
           6     -0.90      0.00
           7     -0.22      0.00

         INDEX     SEQ      COEF
           1      1.00      0.00
           2     -0.90      0.00
           3      0.62      0.00
           4     -0.22      0.00
           5     -0.22      0.00
           6      0.62      3.50
           7     -0.90      0.00
FFTCF
Computes the Fourier coefficients of a complex periodic sequence.
Required Arguments
N Length of the sequence to be transformed. (Input)
SEQ Complex array of length N containing the periodic sequence. (Input)
COEF Complex array of length N containing the Fourier coefficients. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFTCF computes the discrete complex Fourier transform of a complex vector of size
N. The method used is a variant of the Cooley-Tukey algorithm, which is most efficient when N is
a product of small prime factors. If N satisfies this condition, then the computational effort is
proportional to N log N. This considerable savings has historically led people to refer to this
algorithm as the fast Fourier transform or FFT.
Specifically, given an N-vector x, FFTCF returns in c = COEF
\[ c_m = \sum_{n=1}^{N} x_n e^{-2\pi i (n-1)(m-1)/N} \]
The transform can be inverted as follows:
\[ x_n = \frac{1}{N}\sum_{m=1}^{N} c_m e^{2\pi i (m-1)(n-1)/N} \]
This formula reveals the fact that, after properly normalizing the Fourier coefficients, one has the
coefficients for a trigonometric interpolating polynomial to the data. An unnormalized inverse is
implemented in FFTCB. FFTCF is based on the complex FFT in FFTPACK. The package
FFTPACK was developed by Paul Swarztrauber at the National Center for Atmospheric Research.
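As a cross-check of the forward formula, the sum can be evaluated directly for a pure exponential. The sketch below is an O(N**2) illustration that calls no IMSL routine; the negative sign in the forward exponent, the wave number 3, and the program name are assumptions chosen so that the single nonzero coefficient lands at index 4.

```fortran
program fftcf_direct_sketch
  ! Direct complex DFT sum; not the FFTPACK algorithm, no IMSL calls.
  implicit none
  integer, parameter :: n = 7
  real, parameter    :: twopi = 6.2831853
  complex, parameter :: ci = (0.0, 1.0)
  complex :: seq(n), coef(n)
  integer :: i, m

  ! Pure exponential at wave number 3 (assumed for illustration).
  do i = 1, n
     seq(i) = exp(ci*twopi*3.0*real(i-1)/real(n))
  end do

  ! c(m) = sum over i of x(i)*exp(-2*pi*i*(i-1)*(m-1)/N)
  do m = 1, n
     coef(m) = (0.0, 0.0)
     do i = 1, n
        coef(m) = coef(m) + seq(i)*exp(-ci*twopi*real((i-1)*(m-1))/real(n))
     end do
  end do

  do m = 1, n
     print '(i7, 2(2x, "(", f6.2, ",", f6.2, ")"))', m, seq(m), coef(m)
  end do
  ! The Euclidean norm grows by a factor sqrt(N) under the transform.
  print '(a, f8.4)', ' norm of SEQ  = ', sqrt(sum(abs(seq)**2))
  print '(a, f8.4)', ' norm of COEF = ', sqrt(sum(abs(coef)**2))
end program fftcf_direct_sketch
```

The two norms printed at the end should be close to sqrt(7) and 7, which is the sqrt(N) scaling to remove when a normalized transform is needed.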
Comments
1.
2.
The routine FFTCF is most efficient when N is the product of small primes.
3.
4.
If FFTCF/FFTCB is used repeatedly with the same value of N, then call FFTCI followed
by repeated calls to F2TCF/F2TCB. This is more efficient than repeated calls to
FFTCF/FFTCB.
Example
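In this example, we input a pure exponential data vector and recover its Fourier series, which is a
vector with all components zero except at the appropriate frequency where it has an N. Notice that
the norm of the input vector is \sqrt{N}, but the norm of the output vector is N.
      USE FFTCF_INT
      USE CONST_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      REAL       TWOPI
      COMPLEX    C, CEXP, COEF(N), H, SEQ(N)
      INTRINSIC  CEXP
!
      C     = (0.,1.)
      TWOPI = CONST('PI')
      TWOPI = 2.0 * TWOPI
!                                 Wave number 3, matching the output
      H     = (TWOPI*C/N)*3.0
!                                 Fill SEQ with a pure exponential
      DO 10  I=1, N
         SEQ(I) = CEXP((I-1)*H)
   10 CONTINUE
!                                 Compute the Fourier transform of SEQ
      CALL FFTCF (N, SEQ, COEF)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Print results
      WRITE (NOUT,99998)
      WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
99998 FORMAT (1X, 'INDEX', 8X, 'SEQ', 15X, 'COEF')
99999 FORMAT (1X, I5, 2X,'(',F5.2,',',F5.2,')', &
             5X,'(',F5.2,',',F5.2,')')
      END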
Output
 INDEX        SEQ               COEF
     1  ( 1.00, 0.00)     ( 0.00, 0.00)
     2  (-0.90, 0.43)     ( 0.00, 0.00)
     3  ( 0.62,-0.78)     ( 0.00, 0.00)
     4  (-0.22, 0.97)     ( 7.00, 0.00)
     5  (-0.22,-0.97)     ( 0.00, 0.00)
     6  ( 0.62, 0.78)     ( 0.00, 0.00)
     7  (-0.90,-0.43)     ( 0.00, 0.00)
FFTCB
Computes the complex periodic sequence from its Fourier coefficients.
Required Arguments
N Length of the sequence to be transformed. (Input)
COEF Complex array of length N containing the Fourier coefficients. (Input)
SEQ Complex array of length N containing the periodic sequence. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFTCB computes the inverse discrete complex Fourier transform of a complex vector
of size N. The method used is a variant of the Cooley-Tukey algorithm, which is most efficient
when N is a product of small prime factors. If N satisfies this condition, then the computational
effort is proportional to N log N. This considerable savings has historically led people to refer to
this algorithm as the fast Fourier transform or FFT.
Specifically, given an N-vector c = COEF, FFTCB returns in s = SEQ
\[ s_m = \sum_{n=1}^{N} c_n e^{2\pi i (n-1)(m-1)/N} \]
Finally, note that we can invert the inverse Fourier transform as follows:
\[ c_n = \frac{1}{N}\sum_{m=1}^{N} s_m e^{-2\pi i (n-1)(m-1)/N} \]
This formula reveals the fact that, after properly normalizing the Fourier coefficients, one has the
coefficients for a trigonometric interpolating polynomial to the data. FFTCB is based on the
complex inverse FFT in FFTPACK. The package FFTPACK was developed by Paul Swarztrauber
at the National Center for Atmospheric Research.
Comments
1.
2.
The routine FFTCB is most efficient when N is the product of small primes.
3.
4.
If FFTCF/FFTCB is used repeatedly with the same value of N, then call FFTCI followed
by repeated calls to F2TCF/F2TCB. This is more efficient than repeated calls to
FFTCF/FFTCB.
Example
In this example, we first compute the Fourier transform of the vector x, where x_j = j for j = 1 to N.
Note that the norm of x is (N[N + 1][2N + 1]/6)^{1/2}, and hence, the norm of the transformed vector
\hat{x} = c is N([N + 1][2N + 1]/6)^{1/2}. The vector \hat{x}
is used as input into FFTCB with the resulting output s = Nx, that is, s_j = jN, for j = 1 to N.
      USE FFTCB_INT
      USE FFTCF_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      COMPLEX    CMPLX, SEQ(N), COEF(N), X(N)
      INTRINSIC  CMPLX
!                                 This loop fills out the data vector
!                                 with X(I)=I, I=1,N
      DO 10  I=1, N
         X(I) = CMPLX(I,0)
   10 CONTINUE
!                                 Compute the forward transform of X
      CALL FFTCF (N, X, COEF)
!                                 Compute the backward transform of
!                                 COEF
      CALL FFTCB (N, COEF, SEQ)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Print results
      WRITE (NOUT,99998)
      WRITE (NOUT,99999) (I, X(I), COEF(I), SEQ(I), I=1,N)
99998 FORMAT (5X, 'INDEX', 9X, 'INPUT', 9X, 'FORWARD TRANSFORM', 3X, &
             'BACKWARD TRANSFORM')
99999 FORMAT (1X, I7, 7X,'(',F5.2,',',F5.2,')', &
             7X,'(',F5.2,',',F5.2,')', &
             7X,'(',F5.2,',',F5.2,')')
      END
Output
  INDEX        INPUT          FORWARD TRANSFORM    BACKWARD TRANSFORM
      1    ( 1.00, 0.00)        (28.00, 0.00)        ( 7.00, 0.00)
      2    ( 2.00, 0.00)        (-3.50, 7.27)        (14.00, 0.00)
      3    ( 3.00, 0.00)        (-3.50, 2.79)        (21.00, 0.00)
      4    ( 4.00, 0.00)        (-3.50, 0.80)        (28.00, 0.00)
      5    ( 5.00, 0.00)        (-3.50,-0.80)        (35.00, 0.00)
      6    ( 6.00, 0.00)        (-3.50,-2.79)        (42.00, 0.00)
      7    ( 7.00, 0.00)        (-3.50,-7.27)        (49.00, 0.00)
FFTCI
Computes parameters needed by FFTCF and FFTCB.
Required Arguments
N Length of the sequence to be transformed. (Input)
WFFTC Array of length 4N + 15 containing parameters needed by FFTCF and FFTCB.
(Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFTCI initializes the routines FFTCF and FFTCB. An efficient way to make multiple
calls for the same N to IMSL routine FFTCF or FFTCB is to use routine FFTCI for initialization. (In
this case, replace FFTCF or FFTCB with F2TCF or F2TCB, respectively.) The routine FFTCI is
based on the routine CFFTI in FFTPACK. The package FFTPACK was developed by Paul
Swarztrauber at the National Center for Atmospheric Research.
Comments
Different WFFTC arrays are needed for different values of N.
Example
In this example, we compute a two-dimensional complex FFT by making one call to FFTCI
followed by 2N calls to F2TCF.
      USE FFTCI_INT
      USE CONST_INT
      USE F2TCF_INT
      USE UMACH_INT
!
      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=4)
INTEGER
REAL
COMPLEX
INTRINSIC
!
TWOPI
TWOPI
IR
IS
!
!
10
20
!
!
30
!
40
50
60
=
=
=
=
CONST('PI')
2*TWOPI
3
1
DO 70 I=1, N
CALL F2TCF (N, COEF(1:,I), SEQ(1:,I), WFFTC, CPY)
70 CONTINUE
!
Take transpose of the result
DO 90 I=1, N
DO 80 J=I + 1, N
TEMP
= SEQ(I,J)
SEQ(I,J) = SEQ(J,I)
SEQ(J,I) = TEMP
80 CONTINUE
90 CONTINUE
!
Print results
WRITE (NOUT,99999)
DO 100 I=1, N
WRITE (NOUT,99998) (SEQ(I,J),J=1,N)
100 CONTINUE
!
99997 FORMAT (1X, 'The input matrix is below')
99998 FORMAT (1X, 4(' (',F5.2,',',F5.2,')'))
99999 FORMAT (/, 1X, 'Result of two-dimensional transform')
END
Output
The input matrix is below
 ( 1.00, 0.00) ( 1.00, 0.00) ( 1.00, 0.00) ( 1.00, 0.00)
 (-1.00, 0.00) (-1.00, 0.00) (-1.00, 0.00) (-1.00, 0.00)
 ( 1.00, 0.00) ( 1.00, 0.00) ( 1.00, 0.00) ( 1.00, 0.00)
 (-1.00, 0.00) (-1.00, 0.00) (-1.00, 0.00) (-1.00, 0.00)
0.00)
0.00)
0.00)
0.00)
(
(
(
(
0.00)
0.00)
0.00)
0.00)
0.00,
0.00,
0.00,
0.00,
FSINT
Computes the discrete Fourier sine transformation of an odd sequence.
Required Arguments
N Length of the sequence to be transformed. It must be greater than 1. (Input)
SEQ Array of length N containing the sequence to be transformed. (Input)
COEF Array of length N + 1 containing the transformed sequence. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FSINT computes the discrete Fourier sine transform of a real vector of size N. The
method used is a variant of the Cooley-Tukey algorithm, which is most efficient when N + 1 is a
product of small prime factors. If N satisfies this condition, then the computational effort is
proportional to N log N.
Specifically, given an N-vector s = SEQ, FSINT returns in c = COEF
\[ c_m = 2\sum_{n=1}^{N} s_n \sin\left(\frac{mn\pi}{N+1}\right) \]
Finally, note that the Fourier sine transform is its own (unnormalized) inverse. The routine FSINT
is based on the sine FFT in FFTPACK. The package FFTPACK was developed by Paul
Swarztrauber at the National Center for Atmospheric Research.
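The self-inverse property can be seen by applying the defining sum twice. The sketch below evaluates the sum directly in O(N**2) operations and calls no IMSL routine; the program name and the internal routine name sine_sum are made up for illustration.

```fortran
program fsint_twice_sketch
  ! Applies the sine-transform sum twice by direct summation.
  ! Not the FFTPACK algorithm; no IMSL routine is called.
  implicit none
  integer, parameter :: n = 7
  real, parameter    :: pi = 3.1415927
  real    :: s(n), c(n), t(n)
  integer :: i

  ! Pure sine wave, as in the FSINT example.
  do i = 1, n
     s(i) = sin(real(i)*pi/real(n+1))
  end do

  call sine_sum(s, c)          ! first application
  call sine_sum(c, t)          ! second application

  ! The last column divides by 2*(N+1) and should reproduce s.
  print '(a)', '  INDEX       S       C  T/(2(N+1))'
  print '(i7, 3f8.2)', (i, s(i), c(i), t(i)/real(2*(n+1)), i = 1, n)

contains

  subroutine sine_sum(a, b)
    ! b(m) = 2 * sum over k of a(k)*sin(m*k*pi/(N+1))
    real, intent(in)  :: a(n)
    real, intent(out) :: b(n)
    integer :: m, k
    do m = 1, n
       b(m) = 0.0
       do k = 1, n
          b(m) = b(m) + 2.0*a(k)*sin(real(m*k)*pi/real(n+1))
       end do
    end do
  end subroutine sine_sum

end program fsint_twice_sketch
```

For this input the first coefficient should be close to N + 1 = 8 and the rest close to zero.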
Comments
1.
2.
The routine FSINT is most efficient when N + 1 is the product of small primes.
3.
The routine FSINT is its own (unnormalized) inverse. Applying FSINT twice will
reproduce the original sequence multiplied by 2 * (N + 1).
4.
The arrays COEF and SEQ may be the same, if SEQ is also dimensioned at least N + 1.
5.
6.
If FSINT is used repeatedly with the same value of N, then call FSINI followed by
repeated calls to F2INT. This is more efficient than repeated calls to FSINT.
Example
In this example, we input a pure sine wave as a data vector and recover its Fourier sine series,
which is a vector with all components zero except at the appropriate frequency, where it has the
value N + 1 (8.00 in the output below, since N = 7).
      USE FSINT_INT
      USE CONST_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      REAL       COEF(N+1), FLOAT, PI, SIN, SEQ(N)
      INTRINSIC  FLOAT, SIN
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Fill the data vector SEQ
!                                 with a pure sine wave
      PI = CONST('PI')
      DO 10  I=1, N
         SEQ(I) = SIN(FLOAT(I)*PI/FLOAT(N+1))
   10 CONTINUE
!                                 Compute the transform of SEQ
      CALL FSINT (N, SEQ, COEF)
!                                 Print results
      WRITE (NOUT,99998)
      WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
99998 FORMAT (9X, 'INDEX', 6X, 'SEQ', 7X, 'COEF')
99999 FORMAT (1X, I11, 5X, F6.2, 5X, F6.2)
      END
Output
         INDEX      SEQ       COEF
           1       0.38       8.00
           2       0.71       0.00
           3       0.92       0.00
           4       1.00       0.00
           5       0.92       0.00
           6       0.71       0.00
           7       0.38       0.00
FSINI
Computes parameters needed by FSINT.
Required Arguments
N Length of the sequence to be transformed. N must be greater than 1. (Input)
WFSIN Array of length INT(2.5 * N + 15) containing parameters needed by FSINT. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FSINI initializes the routine FSINT. An efficient way to make multiple calls for the
same N to IMSL routine FSINT, is to use routine FSINI for initialization. (In this case, replace
FSINT with F2INT.) The routine FSINI is based on the routine SINTI in FFTPACK. The package
FFTPACK was developed by Paul Swarztrauber at the National Center for Atmospheric Research.
Comments
Different WFSIN arrays are needed for different values of N.
Example
In this example, we compute three distinct sine FFTs by calling FSINI once and then calling
F2INT three times.
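      USE FSINI_INT
      USE UMACH_INT
      USE CONST_INT
      USE F2INT_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, K, NOUT
      REAL       COEF(N+1), FLOAT, PI, SIN, WFSIN(32), SEQ(N)
      INTRINSIC  FLOAT, SIN
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Initialize the work vector WFSIN
      CALL FSINI (N, WFSIN)
!                                 Different frequencies of the same
!                                 wave will be transformed
      PI = CONST('PI')
      DO 20  K=1, 3
!                                 Fill the data vector SEQ
!                                 with a pure sine wave
         DO 10  I=1, N
            SEQ(I) = SIN(FLOAT(K*I)*PI/FLOAT(N+1))
   10    CONTINUE
!                                 Compute the transform of SEQ
         CALL F2INT (N, SEQ, COEF, WFSIN)
!                                 Print results
         WRITE (NOUT,99998)
         WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
   20 CONTINUE
99998 FORMAT (/, 9X, 'INDEX', 6X, 'SEQ', 7X, 'COEF')
99999 FORMAT (1X, I11, 5X, F6.2, 5X, F6.2)
      END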
Output
         INDEX      SEQ       COEF
           1       0.38       8.00
           2       0.71       0.00
           3       0.92       0.00
           4       1.00       0.00
           5       0.92       0.00
           6       0.71       0.00
           7       0.38       0.00

         INDEX      SEQ       COEF
           1       0.71       0.00
           2       1.00       8.00
           3       0.71       0.00
           4       0.00       0.00
           5      -0.71       0.00
           6      -1.00       0.00
           7      -0.71       0.00

         INDEX      SEQ       COEF
           1       0.92       0.00
           2       0.71       0.00
           3      -0.38       8.00
           4      -1.00       0.00
           5      -0.38       0.00
           6       0.71       0.00
           7       0.92       0.00
FCOST
Computes the discrete Fourier cosine transformation of an even sequence.
Required Arguments
N Length of the sequence to be transformed. It must be greater than 1. (Input)
SEQ Array of length N containing the sequence to be transformed. (Input)
COEF Array of length N containing the transformed sequence. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FCOST computes the discrete Fourier cosine transform of a real vector of size N. The
method used is a variant of the Cooley-Tukey algorithm, which is most efficient when N - 1 is a
product of small prime factors. If N satisfies this condition, then the computational effort is
proportional to N log N.
Specifically, given an N-vector s = SEQ, FCOST returns in c = COEF
\[ c_m = 2\sum_{n=2}^{N-1} s_n \cos\left[\frac{(m-1)(n-1)\pi}{N-1}\right] + s_1 + s_N (-1)^{(m-1)} \]
Finally, note that the Fourier cosine transform is its own (unnormalized) inverse. Two applications
of FCOST to a vector s produce (2N - 2)s. The routine FCOST is based on the cosine FFT in
FFTPACK. The package FFTPACK was developed by Paul Swarztrauber at the National Center
for Atmospheric Research.
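The claim that two applications of the cosine transform return (2N - 2)s can be illustrated by evaluating the defining sum twice. This is a sketch by direct O(N**2) summation that calls no IMSL routine; the program name and the internal routine name cosine_sum are made up for illustration.

```fortran
program fcost_twice_sketch
  ! Applies the cosine-transform sum twice by direct summation.
  ! Not the FFTPACK algorithm; no IMSL routine is called.
  implicit none
  integer, parameter :: n = 7
  real, parameter    :: pi = 3.1415927
  real    :: s(n), c(n), t(n)
  integer :: i

  ! Pure cosine wave, as in the FCOST example.
  do i = 1, n
     s(i) = cos(real(i-1)*pi/real(n-1))
  end do

  call cosine_sum(s, c)        ! first application
  call cosine_sum(c, t)        ! second application

  ! The last column divides by 2*(N-1) and should reproduce s.
  print '(a)', '  INDEX       S       C  T/(2(N-1))'
  print '(i7, 3f8.2)', (i, s(i), c(i), t(i)/real(2*(n-1)), i = 1, n)

contains

  subroutine cosine_sum(a, b)
    ! b(m) = a(1) + a(N)*(-1)**(m-1)
    !        + 2 * sum over k=2..N-1 of a(k)*cos((m-1)*(k-1)*pi/(N-1))
    real, intent(in)  :: a(n)
    real, intent(out) :: b(n)
    integer :: m, k
    do m = 1, n
       b(m) = a(1) + a(n)*real((-1)**(m-1))
       do k = 2, n - 1
          b(m) = b(m) + 2.0*a(k)*cos(real((m-1)*(k-1))*pi/real(n-1))
       end do
    end do
  end subroutine cosine_sum

end program fcost_twice_sketch
```

For this input the second coefficient should be close to N - 1 = 6 and the rest close to zero.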
Comments
1.
2.
The routine FCOST is most efficient when N - 1 is the product of small primes.
3.
The routine FCOST is its own (unnormalized) inverse. Applying FCOST twice will
reproduce the original sequence multiplied by 2 * (N - 1).
4.
If FCOST is used repeatedly with the same value of N, then call FCOSI followed by
repeated calls to F2OST. This is more efficient than repeated calls to FCOST.
5.
Example
In this example, we input a pure cosine wave as a data vector and recover its Fourier cosine series,
which is a vector with all components zero except at the appropriate frequency, where it has the
value N - 1.
      USE FCOST_INT
      USE CONST_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      REAL       COEF(N), COS, FLOAT, PI, SEQ(N)
      INTRINSIC  COS, FLOAT
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Fill the data vector SEQ
!                                 with a pure cosine wave
      PI = CONST('PI')
      DO 10  I=1, N
         SEQ(I) = COS(FLOAT(I-1)*PI/FLOAT(N-1))
   10 CONTINUE
!                                 Compute the transform of SEQ
      CALL FCOST (N, SEQ, COEF)
!                                 Print results
      WRITE (NOUT,99998)
      WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
99998 FORMAT (9X, 'INDEX', 6X, 'SEQ', 7X, 'COEF')
99999 FORMAT (1X, I11, 5X, F6.2, 5X, F6.2)
      END
Output
         INDEX      SEQ       COEF
           1       1.00       0.00
           2       0.87       6.00
           3       0.50       0.00
           4       0.00       0.00
           5      -0.50       0.00
           6      -0.87       0.00
           7      -1.00       0.00
FCOSI
Computes parameters needed by FCOST.
Required Arguments
N Length of the sequence to be transformed. N must be greater than 1. (Input)
WFCOS Array of length 3N + 15 containing parameters needed by FCOST. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FCOSI initializes the routine FCOST. An efficient way to make multiple calls for the
same N to IMSL routine FCOST is to use routine FCOSI for initialization. (In this case, replace
FCOST with F2OST.) The routine FCOSI is based on the routine COSTI in FFTPACK. The package
FFTPACK was developed by Paul Swarztrauber at the National Center for Atmospheric Research.
Comments
Different WFCOS arrays are needed for different values of N.
Example
In this example, we compute three distinct cosine FFTs by calling FCOSI once and then calling
F2OST three times.
      USE FCOSI_INT
      USE CONST_INT
      USE F2OST_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, K, NOUT
      REAL       COEF(N), COS, FLOAT, PI, WFCOS(36), SEQ(N)
      INTRINSIC  COS, FLOAT
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Initialize the work vector WFCOS
      CALL FCOSI (N, WFCOS)
!                                 Different frequencies of the same
!                                 wave will be transformed
      PI = CONST('PI')
      DO 20  K=1, 3
!                                 Fill the data vector SEQ
!                                 with a pure cosine wave
         DO 10  I=1, N
            SEQ(I) = COS(FLOAT(K*(I-1))*PI/FLOAT(N-1))
   10    CONTINUE
!                                 Compute the transform of SEQ
         CALL F2OST (N, SEQ, COEF, WFCOS)
!                                 Print results
         WRITE (NOUT,99998)
         WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
   20 CONTINUE
99998 FORMAT (/, 9X, 'INDEX', 6X, 'SEQ', 7X, 'COEF')
99999 FORMAT (1X, I11, 5X, F6.2, 5X, F6.2)
      END
Output
         INDEX      SEQ       COEF
           1       1.00       0.00
           2       0.87       6.00
           3       0.50       0.00
           4       0.00       0.00
           5      -0.50       0.00
           6      -0.87       0.00
           7      -1.00       0.00

         INDEX      SEQ       COEF
           1       1.00       0.00
           2       0.50       0.00
           3      -0.50       6.00
           4      -1.00       0.00
           5      -0.50       0.00
           6       0.50       0.00
           7       1.00       0.00

         INDEX      SEQ       COEF
           1       1.00       0.00
           2       0.00       0.00
           3      -1.00       0.00
           4       0.00       6.00
           5       1.00       0.00
           6       0.00       0.00
           7      -1.00       0.00
QSINF
Computes the coefficients of the sine Fourier transform with only odd wave numbers.
Required Arguments
N Length of the sequence to be transformed. (Input)
SEQ Array of length N containing the sequence. (Input)
COEF Array of length N containing the Fourier coefficients. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QSINF computes the discrete Fourier quarter sine transform of a real vector of size N.
The method used is a variant of the Cooley-Tukey algorithm, which is most efficient when N is a
product of small prime factors. If N satisfies this condition, then the computational effort is
proportional to N log N.
Specifically, given an N-vector s = SEQ, QSINF returns in c = COEF
\[ c_m = 2\sum_{n=1}^{N-1} s_n \sin\left[\frac{(2m-1)n\pi}{2N}\right] + s_N (-1)^{m-1} \]
Finally, note that the Fourier quarter sine transform has an (unnormalized) inverse, which is
implemented in the IMSL routine QSINB. The routine QSINF is based on the quarter sine FFT in
FFTPACK. The package FFTPACK was developed by Paul Swarztrauber at the National Center
for Atmospheric Research.
Comments
1.
2.
The routine QSINF is most efficient when N is the product of small primes.
3.
4.
If QSINF/QSINB is used repeatedly with the same value of N, then call QSINI followed
by repeated calls to Q2INF/Q2INB. This is more efficient than repeated calls to
QSINF/QSINB.
Example
In this example, we input a pure quarter sine wave as a data vector and recover its Fourier quarter
sine series.
      USE QSINF_INT
      USE CONST_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      REAL       COEF(N), FLOAT, PI, SIN, SEQ(N)
      INTRINSIC  FLOAT, SIN
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Fill the data vector SEQ
!                                 with a pure sine wave
      PI = CONST('PI')
      DO 10  I=1, N
         SEQ(I) = SIN(FLOAT(I)*(PI/2.0)/FLOAT(N))
   10 CONTINUE
!                                 Compute the transform of SEQ
      CALL QSINF (N, SEQ, COEF)
!                                 Print results
      WRITE (NOUT,99998)
      WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
99998 FORMAT (9X, 'INDEX', 6X, 'SEQ', 7X, 'COEF')
99999 FORMAT (1X, I11, 5X, F6.2, 5X, F6.2)
      END
Output
         INDEX      SEQ       COEF
           1       0.22       7.00
           2       0.43       0.00
           3       0.62       0.00
           4       0.78       0.00
           5       0.90       0.00
           6       0.97       0.00
           7       1.00       0.00
QSINB
Computes a sequence from its sine Fourier coefficients with only odd wave numbers.
Required Arguments
N Length of the sequence to be transformed. (Input)
COEF Array of length N containing the Fourier coefficients. (Input)
SEQ Array of length N containing the sequence. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QSINB computes the discrete (unnormalized) inverse Fourier quarter sine transform of
a real vector of size N. The method used is a variant of the Cooley-Tukey algorithm, which is most
efficient when N is a product of small prime factors. If N satisfies this condition, then the
computational effort is proportional to N log N.
Specifically, given an N-vector c = COEF, QSINB returns in s = SEQ
\[ s_m = 4\sum_{n=1}^{N} c_n \sin\left[\frac{(2n-1)m\pi}{2N}\right] \]
Furthermore, a vector x of length N that is first transformed by QSINF and then by QSINB will be
returned by QSINB as 4Nx. The routine QSINB is based on the inverse quarter sine FFT in
FFTPACK which was developed by Paul Swarztrauber at the National Center for Atmospheric
Research.
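The 4N scaling of the forward and backward quarter sine pair can be illustrated with the two sums written out directly. The sketch below uses O(N**2) loops, calls no IMSL routine, and takes the forward sum to be the one documented for QSINF; the program name is made up for illustration.

```fortran
program qsin_roundtrip_sketch
  ! Forward then backward quarter-wave sine sums by direct summation.
  ! Not the FFTPACK algorithm; no IMSL routine is called.
  implicit none
  integer, parameter :: n = 7
  real, parameter    :: pi = 3.1415927
  real    :: x(n), c(n), s(n)
  integer :: i, m, k

  do i = 1, n
     x(i) = real(i)
  end do

  ! Forward: c(m) = 2*sum_{k=1}^{N-1} x(k)*sin((2m-1)*k*pi/(2N))
  !                 + x(N)*(-1)**(m-1)
  do m = 1, n
     c(m) = x(n)*real((-1)**(m-1))
     do k = 1, n - 1
        c(m) = c(m) + 2.0*x(k)*sin(real((2*m-1)*k)*pi/real(2*n))
     end do
  end do

  ! Backward: s(m) = 4*sum_{k=1}^{N} c(k)*sin((2k-1)*m*pi/(2N))
  do m = 1, n
     s(m) = 0.0
     do k = 1, n
        s(m) = s(m) + 4.0*c(k)*sin(real((2*k-1)*m)*pi/real(2*n))
     end do
  end do

  print '(a)', '     INPUT   FORWARD  BACKWARD'
  print '(3f10.2)', (x(i), c(i), s(i), i = 1, n)
end program qsin_roundtrip_sketch
```

The third column should come out as 4N times the first, so dividing by 4N recovers the input.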
Comments
1.
2.
The routine QSINB is most efficient when N is the product of small primes.
3.
4.
If QSINF/QSINB is used repeatedly with the same value of N, then call QSINI followed
by repeated calls to Q2INF/Q2INB. This is more efficient than repeated calls to
QSINF/QSINB.
Example
In this example, we first compute the quarter wave sine Fourier transform c of the vector x where
xn = n for n = 1 to N. We then compute the inverse quarter wave Fourier transform of c which is
4Nx = s.
      USE QSINB_INT
      USE QSINF_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      REAL       FLOAT, SEQ(N), COEF(N), X(N)
      INTRINSIC  FLOAT
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Fill the data vector X
!                                 with X(I) = I, I=1,N
      DO 10  I=1, N
         X(I) = FLOAT(I)
   10 CONTINUE
!                                 Compute the forward transform of X
      CALL QSINF (N, X, COEF)
!                                 Compute the backward transform of COEF
      CALL QSINB (N, COEF, SEQ)
!                                 Print results
      WRITE (NOUT,99998)
      WRITE (NOUT,99999) (X(I), COEF(I), SEQ(I), I=1,N)
99998 FORMAT (5X, 'INPUT', 5X, 'FORWARD TRANSFORM', 3X, 'BACKWARD ', &
             'TRANSFORM')
99999 FORMAT (3X, F6.2, 10X, F6.2, 15X, F6.2)
      END
Output
     INPUT     FORWARD TRANSFORM   BACKWARD TRANSFORM
      1.00           39.88                 28.00
      2.00           -4.58                 56.00
      3.00            1.77                 84.00
      4.00           -1.00                112.00
      5.00            0.70                140.00
      6.00           -0.56                168.00
      7.00            0.51                196.00
QSINI
Computes parameters needed by QSINF and QSINB.
CALL QSINI (N, WQSIN)
Required Arguments
N Length of the sequence to be transformed. (Input)
WQSIN Array of length 3N + 15 containing parameters needed by QSINF and QSINB.
(Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QSINI initializes the routines QSINF and QSINB. An efficient way to make multiple
calls for the same N to IMSL routine QSINF or QSINB is to use routine QSINI for initialization.
(In this case, replace QSINF or QSINB with Q2INF or Q2INB, respectively.) The routine QSINI is
based on the routine SINQI in FFTPACK. The package FFTPACK was developed by Paul
Swarztrauber at the National Center for Atmospheric Research.
Comments
Different WQSIN arrays are needed for different values of N.
Example
In this example, we compute three distinct quarter sine transforms by calling QSINI once and then
calling Q2INF three times.
      USE QSINI_INT
      USE CONST_INT
      USE Q2INF_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, K, NOUT
      REAL       COEF(N), FLOAT, PI, SIN, WQSIN(36), SEQ(N)
      INTRINSIC  FLOAT, SIN
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Initialize the work vector WQSIN
      CALL QSINI (N, WQSIN)
!                                 Different frequencies of the same
!                                 wave will be transformed
      PI = CONST('PI')
      DO 20  K=1, 3
!                                 Fill the data vector SEQ
!                                 with a pure sine wave
         DO 10  I=1, N
            SEQ(I) = SIN(FLOAT((2*K-1)*I)*(PI/2.0)/FLOAT(N))
   10    CONTINUE
!                                 Compute the transform of SEQ
         CALL Q2INF (N, SEQ, COEF, WQSIN)
!                                 Print results
         WRITE (NOUT,99998)
         WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
   20 CONTINUE
99998 FORMAT (/, 9X, 'INDEX', 6X, 'SEQ', 7X, 'COEF')
99999 FORMAT (1X, I11, 5X, F6.2, 5X, F6.2)
      END
Output
         INDEX      SEQ       COEF
           1       0.22       7.00
           2       0.43       0.00
           3       0.62       0.00
           4       0.78       0.00
           5       0.90       0.00
           6       0.97       0.00
           7       1.00       0.00

         INDEX      SEQ       COEF
           1       0.62       0.00
           2       0.97       7.00
           3       0.90       0.00
           4       0.43       0.00
           5      -0.22       0.00
           6      -0.78       0.00
           7      -1.00       0.00

         INDEX      SEQ       COEF
           1       0.90       0.00
           2       0.78       0.00
           3      -0.22       7.00
           4      -0.97       0.00
           5      -0.62       0.00
           6       0.43       0.00
           7       1.00       0.00
QCOSF
Computes the coefficients of the cosine Fourier transform with only odd wave numbers.
Required Arguments
N Length of the sequence to be transformed. (Input)
SEQ Array of length N containing the sequence. (Input)
COEF Array of length N containing the Fourier coefficients. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QCOSF computes the discrete Fourier quarter cosine transform of a real vector of size
N. The method used is a variant of the Cooley-Tukey algorithm, which is most efficient when N is
a product of small prime factors. If N satisfies this condition, then the computational effort is
proportional to N log N.
Specifically, given an N-vector s = SEQ, QCOSF returns in c = COEF
\[ c_m = s_1 + 2\sum_{n=2}^{N} s_n \cos\left[\frac{(2m-1)(n-1)\pi}{2N}\right] \]
Finally, note that the Fourier quarter cosine transform has an (unnormalized) inverse which is
implemented in QCOSB. The routine QCOSF is based on the quarter cosine FFT in FFTPACK. The
package FFTPACK was developed by Paul Swarztrauber at the National Center for Atmospheric
Research.
Comments
1.
2.
The routine QCOSF is most efficient when N is the product of small primes.
3.
4.
If QCOSF/QCOSB is used repeatedly with the same value of N, then call QCOSI followed
by repeated calls to Q2OSF/Q2OSB. This is more efficient than repeated calls to
QCOSF/QCOSB.
Example
In this example, we input a pure quarter cosine wave as a data vector and recover its Fourier
quarter cosine series.
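      USE QCOSF_INT
      USE CONST_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      REAL       COEF(N), COS, FLOAT, PI, SEQ(N)
      INTRINSIC  COS, FLOAT
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Fill the data vector SEQ
!                                 with a pure cosine wave
      PI = CONST('PI')
      DO 10  I=1, N
         SEQ(I) = COS(FLOAT(I-1)*(PI/2.0)/FLOAT(N))
   10 CONTINUE
!                                 Compute the transform of SEQ
      CALL QCOSF (N, SEQ, COEF)
!                                 Print results
      WRITE (NOUT,99998)
      WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
99998 FORMAT (9X, 'INDEX', 6X, 'SEQ', 7X, 'COEF')
99999 FORMAT (1X, I11, 5X, F6.2, 5X, F6.2)
      END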
Output
         INDEX      SEQ       COEF
           1       1.00       7.00
           2       0.97       0.00
           3       0.90       0.00
           4       0.78       0.00
           5       0.62       0.00
           6       0.43       0.00
           7       0.22       0.00
QCOSB
Computes a sequence from its cosine Fourier coefficients with only odd wave numbers.
Required Arguments
N Length of the sequence to be transformed. (Input)
COEF Array of length N containing the Fourier coefficients. (Input)
SEQ Array of length N containing the sequence. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QCOSB computes the discrete (unnormalized) inverse Fourier quarter cosine transform
of a real vector of size N. The method used is a variant of the Cooley-Tukey algorithm, which is
most efficient when N is a product of small prime factors. If N satisfies this condition, then the
computational effort is proportional to N log N. Specifically, given an N-vector c = COEF, QCOSB
returns in s = SEQ
\[ s_m = 4\sum_{n=1}^{N} c_n \cos\left[\frac{(2n-1)(m-1)\pi}{2N}\right] \]
Furthermore, a vector x of length N that is first transformed by QCOSF and then by QCOSB will be
returned by QCOSB as 4Nx. The routine QCOSB is based on the inverse quarter cosine FFT in
FFTPACK. The package FFTPACK was developed by Paul Swarztrauber at the National Center
for Atmospheric Research.
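As a cross-check on the formula above, the following is a minimal sketch that evaluates the quarter cosine sums directly in O(N²) operations, without IMSL or FFTPACK. The module name `qcos_direct` and the functions `qcos_forward` and `qcos_backward` are hypothetical names introduced for illustration. The backward sum is the one given in this description. The forward sum is an assumption: it follows the FFTPACK COSQF convention, c_m = s_1 + 2 Σ_{n=2..N} s_n cos[(2m−1)(n−1)π/(2N)], which is not restated in this section.

```fortran
module qcos_direct
  implicit none
  private
  public :: dp, qcos_forward, qcos_backward
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: pi = 3.141592653589793_dp
contains
  ! Forward quarter cosine sum (assumed FFTPACK COSQF convention):
  !   c(m) = s(1) + 2 * sum_{n=2..N} s(n) * cos((2m-1)(n-1) pi / (2N))
  pure function qcos_forward(s) result(c)
    real(dp), intent(in) :: s(:)
    real(dp) :: c(size(s))
    integer :: m, n, nn
    nn = size(s)
    do m = 1, nn
      c(m) = s(1)
      do n = 2, nn
        c(m) = c(m) + 2.0_dp*s(n)*cos(real((2*m-1)*(n-1), dp)*pi/real(2*nn, dp))
      end do
    end do
  end function qcos_forward

  ! Backward quarter cosine sum, as in the QCOSB description:
  !   s(m) = 4 * sum_{n=1..N} c(n) * cos((2n-1)(m-1) pi / (2N))
  pure function qcos_backward(c) result(s)
    real(dp), intent(in) :: c(:)
    real(dp) :: s(size(c))
    integer :: m, n, nn
    nn = size(c)
    do m = 1, nn
      s(m) = 0.0_dp
      do n = 1, nn
        s(m) = s(m) + 4.0_dp*c(n)*cos(real((2*n-1)*(m-1), dp)*pi/real(2*nn, dp))
      end do
    end do
  end function qcos_backward
end module qcos_direct
```

Under these assumptions, `qcos_backward(qcos_forward(x))` applied to x = (1, 2, ..., 7) should return 28x up to rounding, which is the 4Nx scaling described above. The direct sums are only meant for checking small cases; they do not replace the N log N routines.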
Comments
1.
2. The routine QCOSB is most efficient when N is the product of small primes.
3.
4. If QCOSF/QCOSB is used repeatedly with the same value of N, then call QCOSI followed by repeated calls to Q2OSF/Q2OSB. This is more efficient than repeated calls to QCOSF/QCOSB.
Example
In this example, we first compute the quarter wave cosine Fourier transform c of the vector x, where x_n = n for n = 1 to N. We then compute the inverse quarter wave Fourier transform of c, which returns s = 4Nx.
      USE QCOSB_INT
      USE QCOSF_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, NOUT
      REAL       FLOAT, SEQ(N), COEF(N), X(N)
      INTRINSIC  FLOAT
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Fill the data vector X
!                                 with X(I) = I, I=1,N
      DO 10  I=1, N
         X(I) = FLOAT(I)
   10 CONTINUE
!                                 Compute the forward transform of X
      CALL QCOSF (N, X, COEF)
!                                 Compute the backward transform of
!                                 COEF
      CALL QCOSB (N, COEF, SEQ)
!                                 Print results
      WRITE (NOUT,99998)
      DO 20  I=1, N
         WRITE (NOUT,99999) X(I), COEF(I), SEQ(I)
   20 CONTINUE
99998 FORMAT (5X, 'INPUT', 5X, 'FORWARD TRANSFORM', 3X, 'BACKWARD ', &
             'TRANSFORM')
99999 FORMAT (3X, F6.2, 10X, F6.2, 15X, F6.2)
      END
Output
     INPUT     FORWARD TRANSFORM   BACKWARD TRANSFORM
     1.00           31.12                28.00
     2.00          -27.45                56.00
     3.00           10.97                84.00
     4.00           -9.00               112.00
     5.00            4.33               140.00
     6.00           -3.36               168.00
     7.00            0.40               196.00
QCOSI
Computes parameters needed by QCOSF and QCOSB.
Required Arguments
N Length of the sequence to be transformed. (Input)
WQCOS Array of length 3N + 15 containing parameters needed by QCOSF and QCOSB.
(Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QCOSI initializes the routines QCOSF and QCOSB. An efficient way to make multiple
calls for the same N to IMSL routine QCOSF or QCOSB is to use routine QCOSI for initialization.
(In this case, replace QCOSF or QCOSB with Q2OSF or Q2OSB , respectively.) The routine QCOSI is
based on the routine COSQI in FFTPACK, which was developed by Paul Swarztrauber at the
National Center for Atmospheric Research.
Comments
Different WQCOS arrays are needed for different values of N.
Example
In this example, we compute three distinct quarter cosine transforms by calling QCOSI once and
then calling Q2OSF three times.
      USE QCOSI_INT
      USE CONST_INT
      USE Q2OSF_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=7)
!
      INTEGER    I, K, NOUT
      REAL       COEF(N), COS, FLOAT, PI, WQCOS(36), SEQ(N)
      INTRINSIC  COS, FLOAT
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Initialize the work vector WQCOS
      CALL QCOSI (N, WQCOS)
!                                 Different frequencies of the same
!                                 wave will be transformed
      PI = CONST('PI')
      DO 20  K=1, 3
!                                 Fill the data vector SEQ
!                                 with a pure cosine wave
         DO 10  I=1, N
            SEQ(I) = COS(FLOAT((2*K-1)*(I-1))*(PI/2.0)/FLOAT(N))
   10    CONTINUE
!                                 Compute the transform of SEQ
         CALL Q2OSF (N, SEQ, COEF, WQCOS)
!                                 Print results
         WRITE (NOUT,99998)
         WRITE (NOUT,99999) (I, SEQ(I), COEF(I), I=1,N)
   20 CONTINUE
99998 FORMAT (/, 9X, 'INDEX', 6X, 'SEQ', 7X, 'COEF')
99999 FORMAT (1X, I11, 5X, F6.2, 5X, F6.2)
      END
Output
         INDEX      SEQ       COEF
           1       1.00       7.00
           2       0.97       0.00
           3       0.90       0.00
           4       0.78       0.00
           5       0.62       0.00
           6       0.43       0.00
           7       0.22       0.00

         INDEX      SEQ       COEF
           1       1.00       0.00
           2       0.78       7.00
           3       0.22       0.00
           4      -0.43       0.00
           5      -0.90       0.00
           6      -0.97       0.00
           7      -0.62       0.00

         INDEX      SEQ       COEF
           1       1.00       0.00
           2       0.43       0.00
           3      -0.62       7.00
           4      -0.97       0.00
           5      -0.22       0.00
           6       0.78       0.00
           7       0.90       0.00
FFT2D
Computes Fourier coefficients of a complex periodic two-dimensional array.
Required Arguments
A NRA by NCA complex matrix containing the periodic data to be transformed. (Input)
COEF NRA by NCA complex matrix containing the Fourier coefficients of A. (Output)
Optional Arguments
NRA The number of rows of A. (Input)
Default: NRA = size (A,1).
NCA The number of columns of A. (Input)
Default: NCA = size (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
LDCOEF Leading dimension of COEF exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDCOEF = size (COEF,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFT2D computes the discrete complex Fourier transform of a complex two-dimensional array of size (NRA = N) × (NCA = M). The method used is a variant of the Cooley-Tukey algorithm, which is most efficient when N and M are each products of small prime factors. If N and M satisfy this condition, then the computational effort is proportional to N M log N M. This considerable savings has historically led people to refer to this algorithm as the fast Fourier transform or FFT.
Specifically, given an N × M array a, FFT2D returns in c = COEF

$$c_{jk} = \sum_{n=1}^{N} \sum_{m=1}^{M} a_{nm}\, e^{-2\pi i (j-1)(n-1)/N}\, e^{-2\pi i (k-1)(m-1)/M}$$

Finally, note that an unnormalized inverse is implemented in FFT2B. The routine FFT2D is based
on the complex FFT in FFTPACK. The package FFTPACK was developed by Paul Swarztrauber
at the National Center for Atmospheric Research.
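The double sum in this description can be checked on small arrays with a direct O((NM)²) evaluation. The sketch below is not how FFT2D works internally; it only evaluates the definition. The module name `dft2d_direct` and the function `dft2d` are hypothetical, and the sign convention (negative exponent for the forward transform, positive for the unnormalized inverse) is an assumption inferred from the example output for this routine.

```fortran
module dft2d_direct
  implicit none
  private
  public :: dp, dft2d
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: twopi = 6.283185307179586_dp
contains
  ! Direct evaluation of the two-dimensional discrete Fourier sum.
  ! isign = -1 gives the forward sum, isign = +1 the unnormalized inverse.
  pure function dft2d(a, isign) result(c)
    complex(dp), intent(in) :: a(:,:)
    integer, intent(in)     :: isign
    complex(dp) :: c(size(a,1), size(a,2))
    integer  :: j, k, n, m, nr, nc
    real(dp) :: phase
    nr = size(a,1)
    nc = size(a,2)
    do k = 1, nc
      do j = 1, nr
        c(j,k) = (0.0_dp, 0.0_dp)
        do m = 1, nc
          do n = 1, nr
            ! phase = +/- 2 pi [ (j-1)(n-1)/N + (k-1)(m-1)/M ]
            phase = isign*twopi*(real((j-1)*(n-1), dp)/nr + &
                                 real((k-1)*(m-1), dp)/nc)
            c(j,k) = c(j,k) + a(n,m)*cmplx(cos(phase), sin(phase), dp)
          end do
        end do
      end do
    end do
  end function dft2d
end module dft2d_direct
```

For a pure frequency input a_nm = exp(2πi(n−1)2/N) exp(2πi(m−1)3/M) on a 5 × 4 grid, `dft2d(a, -1)` should be zero everywhere except at (j, k) = (3, 4), where it equals N·M = 20. Applying `dft2d(..., 1)` to that result should return 20a.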
Comments
1.
2. The routine FFT2D is most efficient when NRA and NCA are the product of small primes.
3.
4. If FFT2D/FFT2B is used repeatedly, with the same values for NRA and NCA, then use FFTCI to fill WFF1(N = NRA) and WFF2(N = NCA). Follow this with repeated calls to F2T2D/F2T2B. This is more efficient than repeated calls to FFT2D/FFT2B.
Example
In this example, we compute the Fourier transform of the pure frequency input for a 5 × 4 array

$$a_{nm} = e^{2\pi i (n-1) 2/N}\, e^{2\pi i (m-1) 3/M}$$

      IMPLICIT   NONE
      INTEGER    I, IR, IS, J, NCA, NRA
      REAL       FLOAT, TWOPI
      COMPLEX    A(5,4), C, CEXP, CMPLX, COEF(5,4), H
      CHARACTER  TITLE1*26, TITLE2*26
      INTRINSIC  CEXP, CMPLX, FLOAT
!
      TITLE1 =
      TITLE2 =
      NRA    =
      NCA    =
      IR     =
      IS     =
!
      CALL FFT2D (A, COEF)
!
      CALL WRCRN (TITLE2, COEF)
      END
Output
Input matrix A, columns 1 and 4
                   1                   4
1   ( 1.000, 0.000)     ( 0.000, 1.000)
2   (-0.809, 0.588)     (-0.588,-0.809)
3   ( 0.309,-0.951)     ( 0.951, 0.309)
4   ( 0.309, 0.951)     (-0.951, 0.309)
5   (-0.809,-0.588)     ( 0.588,-0.809)

Output matrix COEF, columns 3 and 4
                   3                   4
1   (  0.00,  0.00)     (  0.00,  0.00)
2   (  0.00,  0.00)     (  0.00,  0.00)
3   (  0.00,  0.00)     ( 20.00,  0.00)
4   (  0.00,  0.00)     (  0.00,  0.00)
5   (  0.00,  0.00)     (  0.00,  0.00)
FFT2B
Computes the inverse Fourier transform of a complex periodic two-dimensional array.
Required Arguments
COEF NRCOEF by NCCOEF complex array containing the Fourier coefficients to be
transformed. (Input)
A NRCOEF by NCCOEF complex array containing the Inverse Fourier coefficients of COEF.
(Output)
Optional Arguments
NRCOEF The number of rows of COEF. (Input)
Default: NRCOEF = size (COEF,1).
NCCOEF The number of columns of COEF. (Input)
Default: NCCOEF = size (COEF,2).
LDCOEF Leading dimension of COEF exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDCOEF = size (COEF,1).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFT2B computes the inverse discrete complex Fourier transform of a complex two-dimensional array of size (NRCOEF = N) × (NCCOEF = M). The method used is a variant of the Cooley-Tukey algorithm, which is most efficient when N and M are both products of small prime factors. If N and M satisfy this condition, then the computational effort is proportional to N M log N M. This considerable savings has historically led people to refer to this algorithm as the fast Fourier transform or FFT.
Specifically, given an N × M array c = COEF, FFT2B returns in a

$$a_{jk} = \sum_{n=1}^{N} \sum_{m=1}^{M} c_{nm}\, e^{2\pi i (j-1)(n-1)/N}\, e^{2\pi i (k-1)(m-1)/M}$$

Finally, note that an unnormalized inverse is implemented in FFT2D. The routine FFT2B is based
on the complex FFT in FFTPACK. The package FFTPACK was developed by Paul Swarztrauber
at the National Center for Atmospheric Research.
Comments
1.
2. The routine FFT2B is most efficient when NRCOEF and NCCOEF are the product of small primes.
3.
4. If FFT2D/FFT2B is used repeatedly, with the same values for NRCOEF and NCCOEF, then use FFTCI to fill WFF1(N = NRCOEF) and WFF2(N = NCCOEF). Follow this with repeated calls to F2T2D/F2T2B. This is more efficient than repeated calls to FFT2D/FFT2B.
Example
In this example, we first compute the Fourier transform of the 5 × 4 array

$$x_{nm} = n + 5(m - 1)$$

The result is then inverted by a call to FFT2B. Note that the result is an array a satisfying a = (5)(4)x = 20x. In general, FFT2B is an unnormalized inverse with expansion factor N M.
      USE FFT2B_INT
      USE FFT2D_INT
      USE WRCRN_INT

      IMPLICIT   NONE
      INTEGER    M, N, NCA, NRA
      COMPLEX    CMPLX, X(5,4), A(5,4), COEF(5,4)
      CHARACTER  TITLE1*26, TITLE2*26, TITLE3*26
      INTRINSIC  CMPLX
!
      TITLE1 =
      TITLE2 =
      TITLE3 =
      NRA    =
      NCA    =
!
      CALL FFT2B (COEF, A)
!
      CALL WRCRN (TITLE3, A)
      END
Output
The input matrix
                  1                  2                  3                  4
1   (  1.00,  0.00)   (  6.00,  0.00)   ( 11.00,  0.00)   ( 16.00,  0.00)
2   (  2.00,  0.00)   (  7.00,  0.00)   ( 12.00,  0.00)   ( 17.00,  0.00)
3   (  3.00,  0.00)   (  8.00,  0.00)   ( 13.00,  0.00)   ( 18.00,  0.00)
4   (  4.00,  0.00)   (  9.00,  0.00)   ( 14.00,  0.00)   ( 19.00,  0.00)
5   (  5.00,  0.00)   ( 10.00,  0.00)   ( 15.00,  0.00)   ( 20.00,  0.00)

After FFT2D
                  1                  2                  3                  4
1   ( 210.0,   0.0)   ( -50.0,  50.0)   ( -50.0,   0.0)   ( -50.0, -50.0)
2   ( -10.0,  13.8)   (   0.0,   0.0)   (   0.0,   0.0)   (   0.0,   0.0)
3   ( -10.0,   3.2)   (   0.0,   0.0)   (   0.0,   0.0)   (   0.0,   0.0)
4   ( -10.0,  -3.2)   (   0.0,   0.0)   (   0.0,   0.0)   (   0.0,   0.0)
5   ( -10.0, -13.8)   (   0.0,   0.0)   (   0.0,   0.0)   (   0.0,   0.0)

After FFT2B
                  1                  2                  3                  4
1   (  20.0,   0.0)   ( 120.0,   0.0)   ( 220.0,   0.0)   ( 320.0,   0.0)
2   (  40.0,   0.0)   ( 140.0,   0.0)   ( 240.0,   0.0)   ( 340.0,   0.0)
3   (  60.0,   0.0)   ( 160.0,   0.0)   ( 260.0,   0.0)   ( 360.0,   0.0)
4   (  80.0,   0.0)   ( 180.0,   0.0)   ( 280.0,   0.0)   ( 380.0,   0.0)
5   ( 100.0,   0.0)   ( 200.0,   0.0)   ( 300.0,   0.0)   ( 400.0,   0.0)
FFT3F
Computes Fourier coefficients of a complex periodic three-dimensional array.
Required Arguments
A Three-dimensional complex matrix containing the data to be transformed. (Input)
B Three-dimensional complex matrix containing the Fourier coefficients of A. (Output)
The matrices A and B may be the same.
Optional Arguments
N1 Limit on the first subscript of matrices A and B. (Input)
Default: N1 = size(A, 1)
N2 Limit on the second subscript of matrices A and B. (Input)
Default: N2 = size(A, 2)
N3 Limit on the third subscript of matrices A and B. (Input)
Default: N3 = size(A, 3)
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
MDA Middle dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: MDA = size (A,2).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = size (B,1).
MDB Middle dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: MDB = size (B,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFT3F computes the forward discrete complex Fourier transform of a complex three-dimensional array of size (N1 = N) × (N2 = M) × (N3 = L). The method used is a variant of the Cooley-Tukey algorithm, which is most efficient when N, M, and L are each products of small prime factors. If N, M, and L satisfy this condition, then the computational effort is proportional to N M L log N M L. This considerable savings has historically led people to refer to this algorithm as the fast Fourier transform or FFT.
Specifically, given an N × M × L array a, FFT3F returns in c = B

$$c_{jkl} = \sum_{n=1}^{N} \sum_{m=1}^{M} \sum_{p=1}^{L} a_{nmp}\, e^{-2\pi i (j-1)(n-1)/N}\, e^{-2\pi i (k-1)(m-1)/M}\, e^{-2\pi i (l-1)(p-1)/L}$$

Finally, note that an unnormalized inverse is implemented in FFT3B. The routine FFT3F is based
on the complex FFT in FFTPACK. The package FFTPACK was developed by Paul Swarztrauber
at the National Center for Atmospheric Research.
Comments
1.
2. The routine FFT3F is most efficient when N1, N2, and N3 are the product of small primes.
3. If FFT3F/FFT3B is used repeatedly with the same values for N1, N2 and N3, then use FFTCI to fill WFF1(N = N1), WFF2(N = N2), and WFF3(N = N3). Follow this with repeated calls to F2T3F/F2T3B. This is more efficient than repeated calls to FFT3F/FFT3B.
Example
In this example, we compute the Fourier transform of the pure frequency input for a 2 × 3 × 4 array

$$a_{nml} = e^{2\pi i (n-1) 1/2}\, e^{2\pi i (m-1) 2/3}\, e^{2\pi i (l-1) 2/4}$$

      USE FFT3F_INT
      USE UMACH_INT
      USE CONST_INT

      IMPLICIT   NONE
      INTEGER    LDA, LDB, MDA, MDB, NDA, NDB
      PARAMETER  (LDA=2, LDB=2, MDA=3, MDB=3, NDA=4, NDB=4)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    I, J, K, L, M, N, N1, N2, N3, NOUT
      REAL       PI
      COMPLEX    A(LDA,MDA,NDA), B(LDB,MDB,NDB), C, H
!                                 SPECIFICATIONS FOR INTRINSICS
      INTRINSIC  CEXP, CMPLX
      COMPLEX    CEXP, CMPLX
!                                 SPECIFICATIONS FOR SUBROUTINES
!                                 SPECIFICATIONS FOR FUNCTIONS
!                                 Get output unit number
      CALL UMACH (2, NOUT)
      PI = CONST('PI')
      C  = CMPLX(0.0,2.0*PI)
!                                 Set array A
      DO 30  N=1, 2
         DO 20  M=1, 3
            DO 10  L=1, 4
               H        = C*(N-1)*1/2 + C*(M-1)*2/3 + C*(L-1)*2/4
               A(N,M,L) = CEXP(H)
   10       CONTINUE
   20    CONTINUE
   30 CONTINUE
!
      CALL FFT3F (A, B)
!
      WRITE (NOUT,99996)
      DO 50  I=1, 2
         WRITE (NOUT,99998) I
         DO 40  J=1, 3
            WRITE (NOUT,99999) (A(I,J,K),K=1,4)
   40    CONTINUE
   50 CONTINUE
!
      WRITE (NOUT,99997)
      DO 70  I=1, 2
         WRITE (NOUT,99998) I
         DO 60  J=1, 3
            WRITE (NOUT,99999) (B(I,J,K),K=1,4)
   60    CONTINUE
   70 CONTINUE
!
99996 FORMAT
99997 FORMAT
99998 FORMAT
99999 FORMAT
      END
Output
The input for FFT3F is

Face no. 1
(  1.00,  0.00)   ( -1.00,  0.00)   (  1.00,  0.00)   ( -1.00,  0.00)
( -0.50, -0.87)   (  0.50,  0.87)   ( -0.50, -0.87)   (  0.50,  0.87)
( -0.50,  0.87)   (  0.50, -0.87)   ( -0.50,  0.87)   (  0.50, -0.87)

Face no. 2
( -1.00,  0.00)   (  1.00,  0.00)   ( -1.00,  0.00)   (  1.00,  0.00)
(  0.50,  0.87)   ( -0.50, -0.87)   (  0.50,  0.87)   ( -0.50, -0.87)
(  0.50, -0.87)   ( -0.50,  0.87)   (  0.50, -0.87)   ( -0.50,  0.87)

(array B, output of FFT3F)

Face no. 1
(  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)
(  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)
(  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)

Face no. 2
(  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)
(  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)
(  0.00,  0.00)   (  0.00,  0.00)   ( 24.00,  0.00)   (  0.00,  0.00)
FFT3B
Computes the inverse Fourier transform of a complex periodic three-dimensional array.
Required Arguments
A Three-dimensional complex matrix containing the data to be transformed. (Input)
B Three-dimensional complex matrix containing the inverse Fourier coefficients of A.
(Output)
The matrices A and B may be the same.
Optional Arguments
N1 Limit on the first subscript of matrices A and B. (Input)
Default: N1 = size (A,1).
N2 Limit on the second subscript of matrices A and B. (Input)
Default: N2 = size (A,2).
N3 Limit on the third subscript of matrices A and B. (Input)
Default: N3 = size (A,3).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = size (A,1).
MDA Middle dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: MDA = size (A,2).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = size (B,1).
MDB Middle dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: MDB = size (B,2).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FFT3B computes the inverse discrete complex Fourier transform of a complex three-dimensional array of size (N1 = N) × (N2 = M) × (N3 = L). The method used is a variant of the Cooley-Tukey algorithm, which is most efficient when N, M, and L are each products of small prime factors. If N, M, and L satisfy this condition, then the computational effort is proportional to N M L log N M L. This considerable savings has historically led people to refer to this algorithm as the fast Fourier transform or FFT.
Specifically, given an N × M × L array a, FFT3B returns in b

$$b_{jkl} = \sum_{n=1}^{N} \sum_{m=1}^{M} \sum_{p=1}^{L} a_{nmp}\, e^{2\pi i (j-1)(n-1)/N}\, e^{2\pi i (k-1)(m-1)/M}\, e^{2\pi i (l-1)(p-1)/L}$$

Finally, note that an unnormalized inverse is implemented in FFT3F. The routine FFT3B is based
on the complex FFT in FFTPACK. The package FFTPACK was developed by Paul Swarztrauber
at the National Center for Atmospheric Research.
Comments
1.
2. The routine FFT3B is most efficient when N1, N2, and N3 are the product of small primes.
3. If FFT3F/FFT3B is used repeatedly with the same values for N1, N2 and N3, then use FFTCI to fill WFF1(N = N1), WFF2(N = N2), and WFF3(N = N3). Follow this with repeated calls to F2T3F/F2T3B. This is more efficient than repeated calls to FFT3F/FFT3B.
Example
In this example, we compute the Fourier transform of the 2 × 3 × 4 array

$$x_{nml} = n + 2(m - 1) + 2(3)(l - 1)$$

The result is then inverted using FFT3B. Note that the result is an array b satisfying b = 2(3)(4)x = 24x. In general, FFT3B is an unnormalized inverse with expansion factor N M L.
      USE FFT3B_INT
      USE FFT3F_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    LDA, LDB, MDA, MDB, NDA, NDB
      PARAMETER  (LDA=2, LDB=2, MDA=3, MDB=3, NDA=4, NDB=4)
!                                 SPECIFICATIONS FOR LOCAL VARIABLES
      INTEGER    I, J, K, L, M, N, N1, N2, N3, NOUT
      COMPLEX    A(LDA,MDA,NDA), B(LDB,MDB,NDB), X(LDB,MDB,NDB)
      INTRINSIC  CEXP, CMPLX
      COMPLEX    CEXP, CMPLX
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Set array X
      DO 30  N=1, 2
         DO 20  M=1, 3
            DO 10  L=1, 4
               X(N,M,L) = N + 2*(M-1) + 2*3*(L-1)
   10       CONTINUE
   20    CONTINUE
   30 CONTINUE
!
      CALL FFT3F (X, A)
      CALL FFT3B (A, B)
!
      WRITE (NOUT,99996)
      DO 50  I=1, 2
         WRITE (NOUT,99998) I
         DO 40  J=1, 3
            WRITE (NOUT,99999) (X(I,J,K),K=1,4)
   40    CONTINUE
   50 CONTINUE
!
      WRITE (NOUT,99997)
      DO 70  I=1, 2
         WRITE (NOUT,99998) I
         DO 60  J=1, 3
            WRITE (NOUT,99999) (A(I,J,K),K=1,4)
   60    CONTINUE
   70 CONTINUE
80
90
99995
99996
99997
99998
99999
Output
The input for FFT3F is

Face no. 1
(  1.00,  0.00)   (  7.00,  0.00)   ( 13.00,  0.00)   ( 19.00,  0.00)
(  3.00,  0.00)   (  9.00,  0.00)   ( 15.00,  0.00)   ( 21.00,  0.00)
(  5.00,  0.00)   ( 11.00,  0.00)   ( 17.00,  0.00)   ( 23.00,  0.00)

Face no. 2
(  2.00,  0.00)   (  8.00,  0.00)   ( 14.00,  0.00)   ( 20.00,  0.00)
(  4.00,  0.00)   ( 10.00,  0.00)   ( 16.00,  0.00)   ( 22.00,  0.00)
(  6.00,  0.00)   ( 12.00,  0.00)   ( 18.00,  0.00)   ( 24.00,  0.00)

(array A, output of FFT3F)

Face no. 1
(300.00,  0.00)   (-72.00, 72.00)   (-72.00,  0.00)   (-72.00,-72.00)
(-24.00, 13.86)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)
(-24.00,-13.86)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)

Face no. 2
(-12.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)
(  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)
(  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)   (  0.00,  0.00)

(array B, output of FFT3B)

Face no. 1
( 24.00,  0.00)   (168.00,  0.00)   (312.00,  0.00)   (456.00,  0.00)
( 72.00,  0.00)   (216.00,  0.00)   (360.00,  0.00)   (504.00,  0.00)
(120.00,  0.00)   (264.00,  0.00)   (408.00,  0.00)   (552.00,  0.00)

Face no. 2
( 48.00,  0.00)   (192.00,  0.00)   (336.00,  0.00)   (480.00,  0.00)
( 96.00,  0.00)   (240.00,  0.00)   (384.00,  0.00)   (528.00,  0.00)
(144.00,  0.00)   (288.00,  0.00)   (432.00,  0.00)   (576.00,  0.00)
RCONV
Computes the convolution of two real vectors.
Required Arguments
X Real vector of length NX. (Input)
Y Real vector of length NY. (Input)
Z Real vector of length NZ containing the convolution of X and Y. (Output)
ZHAT Real vector of length NZ containing the discrete Fourier transform of Z. (Output)
Optional Arguments
IDO Flag indicating the usage of RCONV. (Input)
Default: IDO = 0.
IDO
Usage
If RCONV is called multiple times in sequence with the same NX, NY, and IPAD, IDO
should be set to
1
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine RCONV computes the discrete convolution of two sequences x and y. More precisely, let nx be the length of x and ny denote the length of y. If a circular convolution is desired, then IPAD must be set to zero. We set

$$nz := \max\{nx, ny\}$$

and we pad out the shorter vector with zeroes. Then, we compute

$$z_i = \sum_{j=1}^{nz} x_{i-j+1}\, y_j$$

where the index on x is interpreted as a positive number between 1 and nz, modulo nz.
The technique used to compute the z_i's is based on the fact that the (complex discrete) Fourier transform maps convolution into multiplication. Thus, the Fourier transform of z is given by

$$\hat{z}(n) = \hat{x}(n)\, \hat{y}(n)$$

where

$$\hat{z}(n) = \sum_{m=1}^{nz} z_m\, e^{-2\pi i (m-1)(n-1)/nz}$$

The technique used here to compute the convolution is to take the discrete Fourier transform of x and y, multiply the results together component-wise, and then take the inverse transform of this product. It is very important to make sure that nz is a product of small primes if IPAD is set to zero. If nz is a product of small primes, then the computational effort will be proportional to nz log(nz). If IPAD is one, then a good value is chosen for nz so that the Fourier transforms are efficient and nz ≥ nx + ny − 1. This will mean that both vectors will be padded with zeroes.
We point out that no complex transforms of x or y are taken. Since both sequences are real, we can take real transforms and simulate the complex transform above. This can produce a savings of a factor of six in time as well as save space over using the complex transform.
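The circular convolution sum defined above can be written out directly, which is useful for checking small cases against RCONV with IPAD = 0. The following is a minimal sketch, not the transform-based method the routine uses; the module name `conv_direct` and the function `circ_conv` are hypothetical names introduced here. It pads the shorter vector with zeroes to nz = max(nx, ny) and wraps the index on x into 1..nz.

```fortran
module conv_direct
  implicit none
  private
  public :: dp, circ_conv
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Direct circular convolution:
  !   z(i) = sum_{j=1..nz} x(i-j+1) * y(j),  index on x taken modulo nz
  pure function circ_conv(x, y) result(z)
    real(dp), intent(in) :: x(:), y(:)
    real(dp) :: z(max(size(x), size(y)))
    real(dp) :: xp(max(size(x), size(y))), yp(max(size(x), size(y)))
    integer :: i, j, idx, nz
    nz = size(z)
    ! Pad the shorter vector with zeroes
    xp = 0.0_dp
    xp(1:size(x)) = x
    yp = 0.0_dp
    yp(1:size(y)) = y
    do i = 1, nz
      z(i) = 0.0_dp
      do j = 1, nz
        idx = modulo(i - j, nz) + 1      ! i-j+1 wrapped into 1..nz
        z(i) = z(i) + xp(idx)*yp(j)
      end do
    end do
  end function circ_conv
end module conv_direct
```

For example, convolving x = (1, 2, 3, 4) with y = (1, 1) gives z_i = x_i + x_{i−1} with wraparound, that is z = (5, 3, 5, 7). The cost is proportional to nz², so this form is only practical for verification.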
Comments
1.
2. Informational error
Type  Code
4           The length of the vector Z must be large enough to hold the results. An acceptable length is returned in NZ.
Example
In this example, we compute both a periodic and a non-periodic convolution. The idea here is that
one can compute a moving average of the type found in digital filtering using this routine. The
averaging operator in this case is especially simple and is given by averaging five consecutive
points in the sequence. The periodic case tries to recover a noisy sin function by averaging five
nearby values. The nonperiodic case tries to recover the values of an exponential function
contaminated by noise. The large error for the last value printed has to do with the fact that the
convolution is averaging the zeroes in the pad rather than function values. Notice that the signal
size is 100, but we only report the errors at ten points.
USE IMSL_LIBRARIES
IMPLICIT
INTEGER
PARAMETER
NONE
NFLTR, NY, A
(NFLTR=5, NY=100)
!
INTEGER
REAL
!
!
!
!
!
!
!
      DO 20  I=1, NY
         X    = TWOPI*FLOAT(I-1)/FLOAT(NY-1)
         Y(I) = RNUNF()
         Y(I) = F1(X) + 0.5*Y(I) - 0.25
   20 CONTINUE
!                                 CALL THE CONVOLUTION ROUTINE FOR THE
!                                 PERIODIC CASE.
      NZ = 2*(NFLTR+NY-1)
      CALL RCONV (FLTR, Y, Z, ZHAT, IPAD=0, NZ=NZ)
!                                 PRINT RESULTS
      WRITE (NOUT,99993)
      WRITE (NOUT,99995)
      TOTAL1 = 0.0
      TOTAL2 = 0.0
      DO 30  I=1, NY
!                                 COMPUTE THE OFFSET FOR THE Z-VECTOR
         IF (I .GE. NY-1) THEN
            K = I - NY + 2
         ELSE
            K = I + 2
         END IF
!
         X      = TWOPI*FLOAT(I-1)/FLOAT(NY-1)
         ORIGER = ABS(Y(I)-F1(X))
         FLTRER = ABS(Z(K)-F1(X))
         IF (MOD(I,11) .EQ. 1) WRITE (NOUT,99997) X, F1(X), ORIGER, &
                                                  FLTRER
         TOTAL1 = TOTAL1 + ORIGER
         TOTAL2 = TOTAL2 + FLTRER
   30 CONTINUE
      WRITE (NOUT,99998) TOTAL1/FLOAT(NY)
      WRITE (NOUT,99999) TOTAL2/FLOAT(NY)
!                                 SET UP Y-VECTOR FOR THE NONPERIODIC
!                                 CASE.
      DO 40  I=1, NY
         A    = FLOAT(I-1)/FLOAT(NY-1)
         Y(I) = RNUNF()
         Y(I) = F2(A) + 0.5*Y(I) - 0.25
   40 CONTINUE
!                                 CALL THE CONVOLUTION ROUTINE FOR THE
!                                 NONPERIODIC CASE.
      NZ = 2*(NFLTR+NY-1)
      CALL RCONV (FLTR, Y, Z, ZHAT, IPAD=1)
!                                 PRINT RESULTS
      WRITE (NOUT,99994)
      WRITE (NOUT,99996)
      TOTAL1 = 0.0
      TOTAL2 = 0.0
      DO 50  I=1, NY
         X      = FLOAT(I-1)/FLOAT(NY-1)
         ORIGER = ABS(Y(I)-F2(X))
         FLTRER = ABS(Z(I+2)-F2(X))
         IF (MOD(I,11) .EQ. 1) WRITE (NOUT,99997) X, F2(X), ORIGER, &
                                                  FLTRER
         TOTAL1 = TOTAL1 + ORIGER
         TOTAL2 = TOTAL2 + FLTRER
   50 CONTINUE
      WRITE (NOUT,99998) TOTAL1/FLOAT(NY)
      WRITE (NOUT,99999) TOTAL2/FLOAT(NY)
99993 FORMAT (' Periodic Case')
99994 FORMAT (/,' Nonperiodic Case')
99995 FORMAT (8X, 'x', 9X, 'sin(x)', 6X, 'Original Error', 5X, &
             'Filtered Error')
99996 FORMAT (8X, 'x', 9X, 'exp(x)', 6X, 'Original Error', 5X, &
             'Filtered Error')
99997 FORMAT (1X, F10.4, F13.4, 2F18.4)
99998 FORMAT (' Average absolute error before filter:', F10.5)
99999 FORMAT (' Average absolute error after filter:', F11.5)
      END
Output
 Periodic Case
        x         sin(x)      Original Error     Filtered Error
     0.0000       0.0000            0.0811            0.0587
     0.6981       0.6428            0.0226            0.0781
     1.3963       0.9848            0.1526            0.0529
     2.0944       0.8660            0.0959            0.0125
     2.7925       0.3420            0.1747            0.0292
     3.4907      -0.3420            0.1035            0.0238
     4.1888      -0.8660            0.0402            0.0595
     4.8869      -0.9848            0.0673            0.0798
     5.5851      -0.6428            0.1044            0.0074
     6.2832       0.0000            0.0154            0.0018
 Average absolute error before filter:   0.12481
 Average absolute error after filter:    0.04778

 Nonperiodic Case
        x         exp(x)      Original Error     Filtered Error
     0.0000       1.0000            0.1476            0.3915
     0.1111       1.1175            0.0537            0.0326
     0.2222       1.2488            0.1278            0.0932
     0.3333       1.3956            0.1136            0.0987
     0.4444       1.5596            0.1617            0.0964
     0.5556       1.7429            0.0071            0.0662
     0.6667       1.9477            0.1248            0.0713
     0.7778       2.1766            0.1556            0.0158
     0.8889       2.4324            0.1529            0.0696
     1.0000       2.7183            0.2124            1.0562
 Average absolute error before filter:   0.12538
 Average absolute error after filter:    0.07764
CCONV
Computes the convolution of two complex vectors.
Required Arguments
X Complex vector of length NX. (Input)
Y Complex vector of length NY. (Input)
Z Complex vector of length NZ containing the convolution of X and Y. (Output)
ZHAT Complex vector of length NZ containing the discrete complex Fourier transform of
Z. (Output)
Optional Arguments
IDO Flag indicating the usage of CCONV. (Input)
Default: IDO = 0.
IDO
Usage
If CCONV is called multiple times in sequence with the same NX, NY, and IPAD, IDO
should be set to:
1
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The subroutine CCONV computes the discrete convolution of two complex sequences x and y. More precisely, let nx be the length of x and ny denote the length of y. If a circular convolution is desired, then IPAD must be set to zero. We set

$$nz := \max\{nx, ny\}$$

and we pad out the shorter vector with zeroes. Then, we compute

$$z_i = \sum_{j=1}^{nz} x_{i-j+1}\, y_j$$

where the index on x is interpreted as a positive number between 1 and nz, modulo nz.
The technique used to compute the z_i's is based on the fact that the (complex discrete) Fourier transform maps convolution into multiplication. Thus, the Fourier transform of z is given by

$$\hat{z}(n) = \hat{x}(n)\, \hat{y}(n)$$

where

$$\hat{z}(n) = \sum_{m=1}^{nz} z_m\, e^{-2\pi i (m-1)(n-1)/nz}$$

The technique used here to compute the convolution is to take the discrete Fourier transform of x and y, multiply the results together component-wise, and then take the inverse transform of this product. It is very important to make sure that nz is a product of small primes if IPAD is set to zero. If nz is a product of small primes, then the computational effort will be proportional to nz log(nz). If IPAD is one, then a good value is chosen for nz so that the Fourier transforms are efficient and nz ≥ nx + ny − 1. This will mean that both vectors will be padded with zeroes.
Comments
1.
2. Informational error
Type  Code
4           The length of the vector Z must be large enough to hold the results. An acceptable length is returned in NZ.
Example
In this example, we compute both a periodic and a non-periodic convolution. The idea here is that
one can compute a moving average of the type found in digital filtering using this routine. The
averaging operator in this case is especially simple and is given by averaging five consecutive
points in the sequence. The periodic case tries to recover a noisy function f1(x) = cos(x) + i sin(x) by averaging five nearby values. The nonperiodic case tries to recover the values of the function f2(x) = e^x f1(x) contaminated by noise. The large error for the first and last value printed has to do with the fact that the convolution is averaging the zeroes in the pad rather than function values. Notice that the signal size is 100, but we only report the errors at ten points.
USE IMSL_LIBRARIES
IMPLICIT
INTEGER
PARAMETER
NONE
NFLTR, NY
(NFLTR=5, NY=100)
!
INTEGER
REAL
!
!
!
!
!
!
!
!
FLTRER
         TOTAL1 = TOTAL1 + ORIGER
         TOTAL2 = TOTAL2 + FLTRER
   30 CONTINUE
      WRITE (NOUT,99998) TOTAL1/FLOAT(NY)
      WRITE (NOUT,99999) TOTAL2/FLOAT(NY)
!                                 SET UP Y-VECTOR FOR THE NONPERIODIC
!                                 CASE.
      DO 40  I=1, NY
         X    = FLOAT(I-1)/FLOAT(NY-1)
         T1   = RNUNF()
         T2   = RNUNF()
         Y(I) = F2(X) + CMPLX(0.5*T1-0.25,0.5*T2-0.25)
   40 CONTINUE
!                                 CALL THE CONVOLUTION ROUTINE FOR THE
!                                 NONPERIODIC CASE.
      NZ = 2*(NFLTR+NY-1)
      CALL CCONV (FLTR, Y, Z, ZHAT, IPAD=1)
!                                 PRINT RESULTS
      WRITE (NOUT,99994)
      WRITE (NOUT,99996)
      TOTAL1 = 0.0
      TOTAL2 = 0.0
      DO 50  I=1, NY
         X      = FLOAT(I-1)/FLOAT(NY-1)
         ORIGER = CABS(Y(I)-F2(X))
         FLTRER = CABS(Z(I+2)-F2(X))
         IF (MOD(I,11) .EQ. 1) WRITE (NOUT,99997) X, F2(X), ORIGER, &
                                                  FLTRER
         TOTAL1 = TOTAL1 + ORIGER
         TOTAL2 = TOTAL2 + FLTRER
   50 CONTINUE
      WRITE (NOUT,99998) TOTAL1/FLOAT(NY)
      WRITE (NOUT,99999) TOTAL2/FLOAT(NY)
99993 FORMAT (' Periodic Case')
99994 FORMAT (/, ' Nonperiodic Case')
99995 FORMAT (8X, 'x', 15X, 'f1(x)', 8X, 'Original Error', 5X, &
             'Filtered Error')
99996 FORMAT (8X, 'x', 15X, 'f2(x)', 8X, 'Original Error', 5X, &
             'Filtered Error')
99997 FORMAT (1X, F10.4, 5X, '(', F7.4, ',', F8.4, ' )', 5X, F8.4, &
             10X, F8.4)
99998 FORMAT (' Average absolute error before filter:', F11.5)
99999 FORMAT (' Average absolute error after filter:', F12.5)
      END
Output
 Periodic Case
        x               f1(x)          Original Error     Filtered Error
     0.0000     ( 1.0000,  0.0000 )        0.1666             0.0773
     0.6981     ( 0.7660,  0.6428 )        0.1685             0.1399
     1.3963     ( 0.1736,  0.9848 )        0.1756             0.0368
     2.0944     (-0.5000,  0.8660 )        0.2171             0.0142
     2.7925     (-0.9397,  0.3420 )        0.1147             0.0200
     3.4907     (-0.9397, -0.3420 )        0.0998             0.0331
     4.1888     (-0.5000, -0.8660 )        0.1137             0.0586
     4.8869     ( 0.1736, -0.9848 )        0.2217             0.0843
     5.5851     ( 0.7660, -0.6428 )        0.1831             0.0744
     6.2832     ( 1.0000,  0.0000 )        0.3234             0.0893
 Average absolute error before filter:    0.19315
 Average absolute error after filter:     0.08296

 Nonperiodic Case
        x               f2(x)          Original Error     Filtered Error
     0.0000     ( 1.0000,  0.0000 )        0.0783             0.4336
     0.1111     ( 1.1106,  0.1239 )        0.2434             0.0477
     0.2222     ( 1.2181,  0.2752 )        0.1819             0.0584
     0.3333     ( 1.3188,  0.4566 )        0.0703             0.1267
     0.4444     ( 1.4081,  0.6706 )        0.1458             0.0868
     0.5556     ( 1.4808,  0.9192 )        0.1946             0.0930
     0.6667     ( 1.5307,  1.2044 )        0.1458             0.0734
     0.7778     ( 1.5508,  1.5273 )        0.1815             0.0690
     0.8889     ( 1.5331,  1.8885 )        0.0805             0.0193
     1.0000     ( 1.4687,  2.2874 )        0.2396             1.1708
 Average absolute error before filter:    0.18549
 Average absolute error after filter:     0.09636
RCORL
Computes the correlation of two real vectors.
Required Arguments
X Real vector of length N. (Input)
Y Real vector of length N. (Input)
Z Real vector of length NZ containing the correlation of X and Y. (Output)
ZHAT Real vector of length NZ containing the discrete Fourier transform of Z. (Output)
Optional Arguments
IDO Flag indicating the usage of RCORL. (Input)
Default: IDO = 0.
IDO
Usage
If RCORL is called multiple times in sequence with the same NX, NY, and IPAD, IDO
should be set to:
1
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The subroutine RCORL computes the discrete correlation of two sequences x and y. More precisely, let n be the length of x and y. If a circular correlation is desired, then IPAD must be set to zero (for x and y distinct) and two (for x = y). We set (on output)

$$nz = \begin{cases} n & \text{if IPAD} = 0, 2 \\ 2^{\alpha} 3^{\beta} 5^{\gamma} \ge 2n - 1 & \text{if IPAD} = 1, 3 \end{cases}$$

where α, β, γ are nonnegative integers yielding the smallest number of the type 2^α 3^β 5^γ satisfying the inequality. Once nz is determined, we pad out the vectors with zeroes. Then, we compute

$$z_i = \sum_{j=1}^{nz} x_{i+j-1}\, y_j$$

where the index on x is interpreted as a positive number between one and nz, modulo nz. Note that this means that z_{nz−k} contains the correlation of x(k − 1) with y as k = 0, 1, …, nz/2. Thus, if x(k − 1) = y(k) for all k, then we would expect z_{nz} to be the largest component of z.
The technique used to compute the z_i's is based on the fact that the (complex discrete) Fourier transform maps correlation into multiplication. Thus, the Fourier transform of z is given by

$$\hat{z}_j = \hat{x}_j\, \overline{\hat{y}_j}$$

where

$$\hat{z}_j = \sum_{m=1}^{nz} z_m\, e^{-2\pi i (m-1)(j-1)/nz}$$

Thus, the technique used here to compute the correlation is to take the discrete Fourier transform of x and the conjugate of the discrete Fourier transform of y, multiply the results together component-wise, and then take the inverse transform of this product. It is very important to make sure that nz is a product of small primes if IPAD is set to zero or two. If nz is a product of small primes, then the computational effort will be proportional to nz log(nz). If IPAD is one or three, then a good value is chosen for nz so that the Fourier transforms are efficient and nz ≥ 2n − 1. This will mean that both vectors will be padded with zeroes.
We point out that no complex transforms of x or y are taken since both sequences are real, and we
can take real transforms and simulate the complex transform above. This can produce a savings of
a factor of six in time as well as save space over using the complex transform.
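The definition of the circular correlation can be checked on a small case by summing it directly. The following is a minimal sketch, not the FFT-based method used by RCORL: it costs on the order of nz² operations, and the data values and the program name are illustrative assumptions. Because y is built as x shifted circularly by two positions, the largest component of z is expected at i = 3, where z equals the sum of squares of x (35.25 for these data).

```fortran
program circular_correlation_sketch
  implicit none
  integer, parameter :: n = 8
  real :: x(n), y(n), z(n)
  integer :: i, j, k

  ! Illustrative data (an assumption, not taken from the manual).
  x = (/ 1.0, 3.0, -2.0, 0.5, 4.0, -1.0, 2.0, 0.0 /)

  ! y(j) = x(j+2), with the index on x wrapped into 1..n.
  do j = 1, n
     y(j) = x(modulo(j + 1, n) + 1)
  end do

  ! z(i) = sum over j of x(i+j-1)*y(j), index on x taken modulo n.
  do i = 1, n
     z(i) = 0.0
     do j = 1, n
        k = modulo(i + j - 2, n) + 1
        z(i) = z(i) + x(k)*y(j)
     end do
  end do

  print '(8f8.2)', z
  print '(a,i2)', ' largest component at i = ', maxloc(z, 1)
end program circular_correlation_sketch
```

For long vectors the direct sum becomes expensive, which is why the routine works through the Fourier transform instead.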
Comments
1.
2.  Informational error
    Type  Code
    4           The length of the vector Z must be large enough to hold the results.
                An acceptable length is returned in NZ.
Example
In this example, we compute both a periodic and a non-periodic correlation between two distinct
signals x and y. In the first case we have 100 equally spaced points on the interval [0, 2π] and
f1(x) = sin(x). We define x and y as follows
$$x_i = f_1\left(2\pi \frac{i-1}{n-1}\right), \quad i = 1, \ldots, n$$
$$y_i = f_1\left(2\pi \frac{i-1}{n-1} + \frac{\pi}{2}\right), \quad i = 1, \ldots, n$$
Note that the maximum value of z (the correlation of x with y) occurs at i = 26, which corresponds
to the offset.
The nonperiodic case uses the function f2(x) = sin(x²). The two input signals are on the interval
[0, 4π].
$$x_i = f_2\left(4\pi \frac{i-1}{n-1}\right), \quad i = 1, \ldots, n$$
$$y_i = f_2\left(4\pi \frac{i-1}{n-1} + \pi\right), \quad i = 1, \ldots, n$$
The offset of x to y is again (roughly) 26 and this is where z has its maximum value.
      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=100)
!
      INTEGER    I, IPAD, K, NOUT, NZ
      REAL       A, F1, F2, FLOAT, PI, SIN, X(N), XNORM, &
                 Y(N), YNORM, Z(4*N), ZHAT(4*N)
      INTRINSIC  FLOAT, SIN
!                                 Define functions
      F1(A) = SIN(A)
      F2(A) = SIN(A*A)
!
      CALL UMACH (2, NOUT)
      PI = CONST('pi')
!                                 Set up the vectors for the
!                                 periodic case.
      DO 10  I=1, N
         X(I) = F1(2.0*PI*FLOAT(I-1)/FLOAT(N-1))
         Y(I) = F1(2.0*PI*FLOAT(I-1)/FLOAT(N-1)+PI/2.0)
   10 CONTINUE
!                                 Call the correlation routine for the
!                                 periodic case (IPAD = 0 assumed, as
!                                 for periodic data with X and Y
!                                 different).
      CALL RCORL (X, Y, Z, ZHAT, IPAD=0)
!                                 Find the element of Z with the
!                                 largest normalized value.
      XNORM = SNRM2(N,X,1)
      YNORM = SNRM2(N,Y,1)
      DO 20  I=1, N
         Z(I) = Z(I)/(XNORM*YNORM)
   20 CONTINUE
      K = ISMAX(N,Z,1)
!                                 Print results for the periodic
!                                 case.
      WRITE (NOUT,99995)
      WRITE (NOUT,99994)
      WRITE (NOUT,99997)
      WRITE (NOUT,99998) K
      WRITE (NOUT,99999) K, Z(K)
!                                 Set up the vectors for the
!                                 nonperiodic case.
      DO 30  I=1, N
         X(I) = F2(4.0*PI*FLOAT(I-1)/FLOAT(N-1))
         Y(I) = F2(4.0*PI*FLOAT(I-1)/FLOAT(N-1)+PI)
   30 CONTINUE
!                                 Call the correlation routine for the
!                                 nonperiodic case.
      NZ = 4*N
      CALL RCORL (X, Y, Z, ZHAT, IPAD=1)
!                                 Find the element of Z with the
!                                 largest normalized value.
      XNORM = SNRM2(N,X,1)
      YNORM = SNRM2(N,Y,1)
      DO 40  I=1, N
         Z(I) = Z(I)/(XNORM*YNORM)
   40 CONTINUE
      K = ISMAX(N,Z,1)
!                                 Print results for the nonperiodic
!                                 case.
      WRITE (NOUT,99996)
      WRITE (NOUT,99994)
      WRITE (NOUT,99997)
      WRITE (NOUT,99998) K
      WRITE (NOUT,99999) K, Z(K)
99994 FORMAT (1X, 28('-'))
99995 FORMAT (' Case #1: Periodic data')
99996 FORMAT (/, ' Case #2: Nonperiodic data')
99997 FORMAT (' The element of Z with the largest normalized ')
99998 FORMAT (' value is Z(', I2, ').')
99999 FORMAT (' The normalized value of Z(', I2, ') is', F6.3)
      END
Output
CCORL
Computes the correlation of two complex vectors.
Required Arguments
X Complex vector of length N. (Input)
Y Complex vector of length N. (Input)
Z Complex vector of length NZ containing the correlation of X and Y. (Output)
ZHAT Complex vector of length NZ containing the inverse discrete complex Fourier
transform of Z. (Output)
Optional Arguments
IDO Flag indicating the usage of CCORL. (Input)
Default: IDO = 0.
IDO
Usage
If CCORL is called multiple times in sequence with the same NX, NY, and IPAD, IDO
should be set to:
1
and Y different. IPAD = 2 for periodic data with X and Y identical. IPAD = 3 for
nonperiodic data with X and Y identical.
Default: IPAD = 0.
NZ Length of the vector Z. (Input/Output)
Upon input: When IPAD is zero or two, NZ must be at least (2 * N - 1). When IPAD is
one or three, NZ must be greater than or equal to the smallest integer greater than or
equal to (2 * N - 1) of the form (2**alpha) * (3**beta) * (5**gamma) where alpha, beta, and gamma are
nonnegative integers. Upon output, the value for NZ that was used by CCORL.
Default: NZ = size (Z,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The subroutine CCORL computes the discrete correlation of two complex sequences x and y. More
precisely, let n be the length of x and y. If a circular correlation is desired, then IPAD must be set
to zero (for x and y distinct) or two (for x = y). We set (on output)
$$nz = n \quad \text{if IPAD} = 0, 2$$
$$nz = 2^{\alpha} 3^{\beta} 5^{\gamma} \ge 2n - 1 \quad \text{if IPAD} = 1, 3$$
where $\alpha$, $\beta$, $\gamma$ are nonnegative integers yielding the smallest number of the type $2^{\alpha}3^{\beta}5^{\gamma}$ satisfying
the inequality. Once nz is determined, we pad out the vectors with zeroes. Then, we compute
$$z_i = \sum_{j=1}^{nz} x_{i+j-1}\, \bar y_j$$
where the index on x is interpreted as a positive number between one and nz, modulo nz. Note that
this means that $z_{nz-k}$ contains the correlation of x(k − 1) with y as k = 0, 1, …, nz/2. Thus, if
x(k − 1) = y(k) for all k, then we would expect $z_{nz}$ to be the largest component of z.
The technique used to compute the $z_i$ is based on the fact that the (complex discrete) Fourier
transform maps correlation into multiplication. Thus, the Fourier transform of z is given by
$$\hat z_j = \hat x_j\, \overline{\hat y_j}$$
where
$$\hat z_j = \sum_{m=1}^{nz} z_m e^{-2\pi i (m-1)(j-1)/nz}$$
Thus, the technique used here to compute the correlation is to take the discrete Fourier transform
of x and the conjugate of the discrete Fourier transform of y, multiply the results together
component-wise, and then take the inverse transform of this product. It is very important to make
sure that nz is a product of small primes if IPAD is set to zero or two. If nz is a product of small
primes, then the computational effort will be proportional to nz log(nz). If IPAD is one or three,
then a good value is chosen for nz so that the Fourier transforms are efficient and $nz \ge 2n - 1$. This
will mean that both vectors will be padded with zeroes.
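The statement that the Fourier transform maps correlation into multiplication can be verified in a few lines. The derivation below is a sketch under two assumptions: the transform convention $\hat z_j = \sum_m z_m e^{-2\pi i (m-1)(j-1)/nz}$, and the definition $z_m = \sum_k x_{m+k-1}\bar y_k$ with the index on x taken modulo nz.

```latex
\begin{aligned}
\hat z_j
  &= \sum_{m=1}^{nz}\sum_{k=1}^{nz} x_{m+k-1}\,\bar y_k\,
     e^{-2\pi i (m-1)(j-1)/nz} \\
  &= \sum_{k=1}^{nz} \bar y_k\, e^{+2\pi i (k-1)(j-1)/nz}
     \sum_{m=1}^{nz} x_{m+k-1}\, e^{-2\pi i (m+k-2)(j-1)/nz} \\
  &= \overline{\hat y_j}\;\hat x_j .
\end{aligned}
```

The inner sum equals $\hat x_j$ for every k because the index m + k − 1 runs over all residues modulo nz and the exponential has period nz.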
Comments
1.
2.  Informational error
    Type  Code
    4           The length of the vector Z must be large enough to hold the results.
                An acceptable length is returned in NZ.
Example
In this example, we compute both a periodic and a non-periodic correlation between two distinct
signals x and y. In the first case, we have 100 equally spaced points on the interval [0, 2π] and
f1(x) = cos(x) + i sin(x). We define x and y as follows
$$x_i = f_1\left(2\pi \frac{i-1}{n-1}\right), \quad i = 1, \ldots, n$$
$$y_i = f_1\left(2\pi \frac{i-1}{n-1} + \frac{\pi}{2}\right), \quad i = 1, \ldots, n$$
Note that the maximum value of z (the correlation of x with y) occurs at i = 26, which corresponds
to the offset.
The nonperiodic case uses the function f2(x) = cos(x²) + i sin(x²). The two input signals are on the
interval [0, 4π].
$$x_i = f_2\left(4\pi \frac{i-1}{n-1}\right), \quad i = 1, \ldots, n$$
$$y_i = f_2\left(4\pi \frac{i-1}{n-1} + \pi\right), \quad i = 1, \ldots, n$$
The offset of x to y is again (roughly) 26 and this is where z has its maximum value.
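      USE IMSL_LIBRARIES
      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=100)
!
      INTEGER    I, IPAD, K, NOUT, NZ
      REAL       A, COS, FLOAT, PI, SIN, &
                 XNORM, YNORM, ZREAL1(4*N)
!                                 F1 and F2 return complex values, so
!                                 they are declared COMPLEX.
      COMPLEX    CMPLX, F1, F2, X(N), Y(N), Z(4*N), ZHAT(4*N)
      INTRINSIC  CMPLX, COS, FLOAT, SIN
!                                 Define functions
      F1(A) = CMPLX(COS(A),SIN(A))
      F2(A) = CMPLX(COS(A*A),SIN(A*A))
!
      CALL RNSET (1234579)
      CALL UMACH (2, NOUT)
      PI = CONST('pi')
!                                 Set up the vectors for the
!                                 periodic case. (The periodic-case
!                                 statements down to the first WRITE
!                                 pair are restored to mirror the
!                                 nonperiodic case; IPAD = 0 assumed.)
      DO 10  I=1, N
         X(I) = F1(2.0*PI*FLOAT(I-1)/FLOAT(N-1))
         Y(I) = F1(2.0*PI*FLOAT(I-1)/FLOAT(N-1)+PI/2.0)
   10 CONTINUE
!                                 Call the correlation routine for the
!                                 periodic case.
      NZ = 4*N
      CALL CCORL (X, Y, Z, ZHAT, IPAD=0, NZ=NZ)
!                                 Find the element of z with the
!                                 largest normalized real part.
      XNORM = SCNRM2(N,X,1)
      YNORM = SCNRM2(N,Y,1)
      DO 20  I=1, N
         ZREAL1(I) = REAL(Z(I))/(XNORM*YNORM)
   20 CONTINUE
      K = ISMAX(N,ZREAL1,1)
!                                 Print results for the periodic
!                                 case.
      WRITE (NOUT,99995)
      WRITE (NOUT,99994)
      WRITE (NOUT,99997)
      WRITE (NOUT,99998) K
      WRITE (NOUT,99999) K, ZREAL1(K)
!                                 Set up the vectors for the
!                                 nonperiodic case.
      DO 30  I=1, N
         X(I) = F2(4.0*PI*FLOAT(I-1)/FLOAT(N-1))
         Y(I) = F2(4.0*PI*FLOAT(I-1)/FLOAT(N-1)+PI)
   30 CONTINUE
!                                 Call the correlation routine for the
!                                 nonperiodic case.
      NZ = 4*N
      CALL CCORL (X, Y, Z, ZHAT, IPAD=1, NZ=NZ)
!                                 Find the element of z with the
!                                 largest normalized real part.
      XNORM = SCNRM2(N,X,1)
      YNORM = SCNRM2(N,Y,1)
      DO 40  I=1, N
         ZREAL1(I) = REAL(Z(I))/(XNORM*YNORM)
   40 CONTINUE
      K = ISMAX(N,ZREAL1,1)
!                                 Print results for the nonperiodic
!                                 case.
      WRITE (NOUT,99996)
      WRITE (NOUT,99994)
      WRITE (NOUT,99997)
      WRITE (NOUT,99998) K
      WRITE (NOUT,99999) K, ZREAL1(K)
99994 FORMAT (1X, 28('-'))
99995 FORMAT (' Case #1: periodic data')
99996 FORMAT (/, ' Case #2: nonperiodic data')
99997 FORMAT (' The element of Z with the largest normalized ')
99998 FORMAT (' real part is Z(', I2, ').')
99999 FORMAT (' The normalized value of real(Z(', I2, ')) is', F6.3)
      END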
Output
Example #1: periodic case
---------------------------
The element of Z with the largest normalized real part is Z(26).
The normalized value of real(Z(26)) is 1.000
Example #2: nonperiodic case
---------------------------
The element of Z with the largest normalized real part is Z(26).
The normalized value of real(Z(26)) is 0.638
INLAP
Computes the inverse Laplace transform of a complex function.
Required Arguments
F User-supplied FUNCTION for which the inverse Laplace transform will be computed. The
form is F(Z), where
Z Complex argument. (Input)
F The complex function value. (Output)
F must be declared EXTERNAL in the calling program. F should also be declared COMPLEX.
T Array of length N containing the points at which the inverse Laplace transform is
desired. (Input)
T(I) must be greater than zero for all I.
FINV Array of length N whose I-th component contains the approximate value of the
inverse Laplace transform at the point T(I). (Output)
Optional Arguments
N Number of points at which the inverse Laplace transform is desired. (Input)
Default: N = size (T,1).
ALPHA An estimate for the maximum of the real parts of the singularities of F. If
unknown, set ALPHA = 0. (Input)
Default: ALPHA = 0.0.
KMAX The number of function evaluations allowed for each T(I). (Input)
Default: KMAX = 500.
RELERR The relative accuracy desired. (Input)
Default: RELERR = 1.1920929e-5 for single precision and 2.22d-10 for double
precision.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine INLAP computes the inverse Laplace transform of a complex-valued function. Recall
that if f is a function that vanishes on the negative real axis, then we can define the Laplace
transform of f by
$$L[f](s) := \int_0^{\infty} e^{-sx} f(x)\,dx$$
Given a transform F, the routine approximates the inverse transform f by
$$g(t) = \frac{e^{\alpha t}}{T}\,\Re\left\{\frac{1}{2}F(\alpha) + \sum_{k=1}^{\infty} F\left(\alpha + \frac{ik\pi}{T}\right)\exp\left(\frac{ik\pi t}{T}\right)\right\}$$
This is the real part of the sum of a complex power series in z = exp(iπt/T), and the algorithm
accelerates the convergence of the partial sums of this power series by using the epsilon algorithm
to compute the corresponding diagonal Padé approximants. The algorithm attempts to choose the
order of the Padé approximant to obtain the specified relative accuracy while not exceeding the
maximum number of function evaluations allowed. The parameter α is an estimate for the
maximum of the real parts of the singularities of F, and an incorrect choice of α may give false
convergence. Even in cases where the correct value of α is unknown, the algorithm will attempt to
estimate an acceptable value. Assuming satisfactory convergence, the discretization error
E := g − f satisfies
$$E = \sum_{n=1}^{\infty} e^{-2n\alpha T} f(2nT + t)$$
It follows that if $|f(t)| \le M e^{\beta t}$, then we can estimate the expression above to obtain
(for 0 ≤ t ≤ 2T)
$$E \le \frac{M e^{\beta t}}{e^{2T(\alpha - \beta)} - 1}$$
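The bound on the discretization error is a geometric series estimate. The short derivation below is a sketch under the assumptions stated in the text, namely $E = \sum_{n \ge 1} e^{-2n\alpha T} f(2nT + t)$ and $|f(t)| \le M e^{\beta t}$, together with the additional assumption α > β, which is what makes the series converge.

```latex
\begin{aligned}
|E| &\le \sum_{n=1}^{\infty} e^{-2n\alpha T}\, M e^{\beta(2nT + t)}
      = M e^{\beta t} \sum_{n=1}^{\infty} \left(e^{-2T(\alpha-\beta)}\right)^{n} \\
    &= M e^{\beta t}\,
       \frac{e^{-2T(\alpha-\beta)}}{1 - e^{-2T(\alpha-\beta)}}
     = \frac{M e^{\beta t}}{e^{2T(\alpha-\beta)} - 1}.
\end{aligned}
```

The estimate shows why α should exceed the growth rate of f: the error decays like $e^{-2T(\alpha-\beta)}$ as either T or α − β grows.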
Comments
Informational errors
Type  Code
4     1     The algorithm was not able to achieve the accuracy requested within KMAX
            function evaluations for some T(I).
4     2     Overflow is occurring for a particular value of T.
Example
We invert the Laplace transform of the simple function $(s - 1)^{-2}$ and print the computed answer,
the true solution and the difference at five different points. The correct inverse transform is $x e^{x}$.
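      USE INLAP_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    I, KMAX, N, NOUT
      REAL       ALPHA, DIF(5), EXP, FINV(5), FLOAT, RELERR, T(5), &
                 TRUE(5)
      COMPLEX    F
      INTRINSIC  EXP, FLOAT
      EXTERNAL   F
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!
      DO 10  I=1, 5
         T(I) = FLOAT(I) - 0.5
   10 CONTINUE
      N      = 5
      ALPHA  = 1.0E0
      RELERR = 5.0E-4
      CALL INLAP (F, T, FINV, ALPHA=ALPHA, RELERR=RELERR)
!                                 Evaluate the true solution and the
!                                 difference
      DO 20  I=1, 5
         TRUE(I) = T(I)*EXP(T(I))
         DIF(I)  = TRUE(I) - FINV(I)
   20 CONTINUE
!                                 Print the results (output statements
!                                 and F as in the SINLP example)
      WRITE (NOUT,99999) (T(I),FINV(I),TRUE(I),DIF(I),I=1,5)
99999 FORMAT (7X, 'T', 8X, 'FINV', 9X, 'TRUE', 9X, 'DIFF', /, &
             5(1X,E9.1,3X,1PE10.3,3X,1PE10.3,3X,1PE10.3,/))
      END
!
      COMPLEX FUNCTION F (S)
      COMPLEX    S
!
      F = 1./(S-1.)**2
      RETURN
      END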
Output
      T         FINV          TRUE          DIFF
  0.5E+00    8.244E-01    8.244E-01    -4.768E-06
  1.5E+00    6.723E+00    6.723E+00    -3.481E-05
  2.5E+00    3.046E+01    3.046E+01    -1.678E-04
  3.5E+00    1.159E+02    1.159E+02    -6.027E-04
  4.5E+00    4.051E+02    4.051E+02    -2.106E-03
SINLP
Computes the inverse Laplace transform of a complex function.
Required Arguments
F User-supplied FUNCTION for which the inverse Laplace transform will be
computed. The form is F(Z), where
Z Complex argument. (Input)
F The complex function value. (Output)
F must be declared EXTERNAL in the calling program. F must also be declared
COMPLEX.
T Vector of length N containing points at which the inverse Laplace transform is desired.
(Input)
T(I) must be greater than zero for all I.
FINV Vector of length N whose I-th component contains the approximate value of the
inverse Laplace transform at the point T(I). (Output)
Optional Arguments
N The number of points at which the inverse Laplace transform is desired. (Input)
Default: N = size (T,1).
SIGMA0 An estimate for the maximum of the real parts of the singularities of F. (Input)
If unknown, set SIGMA0 = 0.0.
Default: SIGMA0 = 0.e0.
EPSTOL The required absolute uniform pseudo accuracy for the coefficients and inverse
Laplace transform values. (Input)
Default: EPSTOL = 1.1920929e-5 for single precision and 2.22d-10 for double
precision.
ERRVEC Vector of length eight containing diagnostic information. (Output)
All components depend on the intermediately generated Laguerre coefficients. See
Comments.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine SINLP computes the inverse Laplace transform of a complex-valued function. Recall
that if f is a function that vanishes on the negative real axis, then we can define the Laplace
transform of f by
$$L[f](s) := \int_0^{\infty} e^{-sx} f(x)\,dx$$
<
where ε := EPSTOL and σ := SIGMA > SIGMA0. The expression on the left is called the pseudo
error. An estimate of the pseudo error is available in ERRVEC(1).
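The first step in the method is to transform F to φ where
$$\phi(z) = \frac{b}{1-z}\, F\left(\frac{b}{1-z} - \frac{b}{2} + \sigma\right)$$
Then, if f is smooth, it is known that φ is analytic in the unit disc of the complex plane and hence
has a Taylor series expansion
$$\phi(z) = \sum_{s=0}^{\infty} a_s z^s$$
which converges for all z whose absolute value is less than the radius of convergence $R_c$. This
number is estimated in ERRVEC(6). In ERRVEC(5), we estimate the smallest number K which
satisfies
$$|a_s| < \frac{K}{R^s}$$
The coefficients $a_s$ are those of the Laguerre series for f,
$$f(t) = e^{\sigma t} \sum_{s=0}^{\infty} a_s e^{-bt/2} L_s(bt)$$
where $L_s$ is the Laguerre polynomial of degree s.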
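The link between the Taylor coefficients of φ and the Laguerre series for f can be checked term by term. The derivation below is a sketch, assuming the expansion $f(t) = e^{\sigma t}\sum_s a_s e^{-bt/2}L_s(bt)$ and the standard transform $L[L_s(bt)](p) = (p-b)^s/p^{s+1}$; the shorthand $p = s' - \sigma + b/2$ for the shifted transform variable is introduced here for illustration only.

```latex
\begin{aligned}
F(s') &= \sum_{s=0}^{\infty} a_s\,
         L\!\left[e^{(\sigma - b/2)t} L_s(bt)\right](s')
       = \sum_{s=0}^{\infty} a_s\, \frac{(p-b)^s}{p^{\,s+1}},
       \qquad p = s' - \sigma + \tfrac{b}{2}, \\
z &:= 1 - \frac{b}{p}
   \;\Longrightarrow\;
   p = \frac{b}{1-z}, \quad
   s' = \frac{b}{1-z} - \frac{b}{2} + \sigma, \\
\sum_{s=0}^{\infty} a_s z^s
  &= p\,F(s')
   = \frac{b}{1-z}\,F\!\left(\frac{b}{1-z} - \frac{b}{2} + \sigma\right)
   = \phi(z).
\end{aligned}
```

So the power series coefficients of φ are exactly the Laguerre coefficients of f, which is why diagnostics on the decay of $a_s$ say something about the quality of the computed inverse.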
Comments
1.
2.
3.
Informational errors
Type
Code
1
Laguerre expansion.
ERRVEC(6) = R, the base of the decay function for ACOEF. Here
abs(ACOEF (J + 1)).LE.K/R**J for J.GE.MACT/2, where MACT is the number of
0 = Normal termination.
1 = The value of the inverse Laplace transform is found to be too large to be
representable; FINV(I) is set to AMACH(6).
-1 = The value of the inverse Laplace transform is found to be too small to be
representable; FINV(I) is set to 0.0.
2 = The value of the inverse Laplace transform is estimated to be too large, even
before the series expansion, to be representable; FINV(I) is set to AMACH(6).
-2 = The value of the inverse Laplace transform is estimated to be too small, even
before the series expansion, to be representable; FINV(I) is set to 0.0.
Example
We invert the Laplace transform of the simple function $(s - 1)^{-2}$ and print the computed answer,
the true solution, and the difference at five different points. The correct inverse transform is $x e^{x}$.
      USE SINLP_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    I, NOUT
      REAL       DIF(5), ERRVEC(8), EXP, FINV(5), FLOAT, RELERR, &
                 SIGMA0, T(5), TRUE(5), EPSTOL
      COMPLEX    F
      INTRINSIC  EXP, FLOAT
      EXTERNAL   F
!
WRITE (NOUT,99999) (T(I),FINV(I),TRUE(I),DIF(I),I=1,5)
99999 FORMAT (7X, 'T', 8X, 'FINV', 9X, 'TRUE', 9X, 'DIFF', /, &
5(1X,E9.1,3X,1PE10.3,3X,1PE10.3,3X,1PE10.3,/))
END
!
COMPLEX FUNCTION F (S)
COMPLEX
S
!
F = 1./(S-1.)**2
RETURN
END
Output
      T         FINV          TRUE          DIFF
  0.5E+00    8.244E-01    8.244E-01    -2.086E-06
  1.5E+00    6.723E+00    6.723E+00    -8.583E-06
  2.5E+00    3.046E+01    3.046E+01     0.000E+00
  3.5E+00    1.159E+02    1.159E+02     2.289E-05
  4.5E+00    4.051E+02    4.051E+02    -2.136E-04
Routines
7.1.  Zeros of a Polynomial
      Real coefficients using Laguerre method ............................ ZPLRC
      Real coefficients using Jenkins-Traub method ....................... ZPORC
      Complex coefficients ............................................... ZPOCC
7.2.  Zero(s) of a Function
      Zeros of a complex analytic function ............................... ZANLY
      Zero of a real function with sign changes .......................... ZBREN
      Zeros of a real function ........................................... ZREAL
7.3.
Usage Notes
Zeros of a Polynomial
A polynomial function of degree n can be expressed as follows:
$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$$
where $a_n \ne 0$.
There are three routines for zeros of a polynomial. The routines ZPLRC and ZPORC find zeros of
the polynomial with real coefficients while the routine ZPOCC finds zeros of the polynomial with
complex coefficients.
The Jenkins-Traub method is used for the routines ZPORC and ZPOCC; whereas ZPLRC uses the
Laguerre method. Both methods perform well in comparison with other methods. The Jenkins-Traub
algorithm usually runs faster than the Laguerre method. Furthermore, the routine ZANLY in
the next section can also be used for the complex polynomial.
Zero(s) of a Function
The routines ZANLY and ZREAL use Müller's method to find the zeros of a complex analytic
function and real zeros of a real function, respectively. The routine ZBREN finds a zero of a real
function, using an algorithm that is a combination of interpolation and bisection. This algorithm
requires the user to supply two points such that the function values at these two points have
opposite sign. For functions where it is difficult to obtain two such points, ZREAL can be used.
where $x \in \mathbf{R}^n$.
The routines NEQNF and NEQNJ use a modified Powell hybrid method to find a zero of a system of
nonlinear equations. The difference between these two routines is that the Jacobian is estimated by
a finite-difference method in NEQNF, whereas the user has to provide the Jacobian for NEQNJ. It is
advised that the Jacobian-checking routine, CHJAC (see Chapter 8, Optimization), be used to ensure
the accuracy of the user-supplied Jacobian.
The routines NEQBF and NEQBJ use a secant method with Broyden's update to find a zero of a
system of nonlinear equations. The difference between these two routines is that the Jacobian is
estimated by a finite-difference method in NEQBF; whereas the user has to provide the Jacobian for
NEQBJ. For more details, see Dennis and Schnabel (1983, Chapter 8).
ZPLRC
Finds the zeros of a polynomial with real coefficients using Laguerre's method.
Required Arguments
COEFF Vector of length NDEG + 1 containing the coefficients of the polynomial in
increasing order by degree. (Input)
The polynomial is
COEFF(NDEG + 1) * Z**NDEG + COEFF(NDEG) * Z**(NDEG - 1) + … + COEFF(1).
ROOT Complex vector of length NDEG containing the zeros of the polynomial. (Output)
Optional Arguments
NDEG Degree of the polynomial. 1 ≤ NDEG ≤ 100 (Input)
Default: NDEG = size (COEFF,1) - 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine ZPLRC computes the n zeros of the polynomial
$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$$
where the coefficients $a_i$ for i = 0, 1, …, n are real and n is the degree of the polynomial.
The routine ZPLRC is a modification of B.T. Smith's routine ZERPOL (Smith 1967) that uses
Laguerre's method. Laguerre's method is cubically convergent for isolated zeros and linearly
convergent for multiple zeros. The maximum length of the step between successive iterates is
restricted so that each new iterate lies inside a region about the previous iterate known to contain a
zero of the polynomial. An iterate is accepted as a zero when the polynomial value at that iterate is
smaller than a computed bound for the rounding error in the polynomial value at that iterate. The
original polynomial is deflated after each real zero or pair of complex zeros is found. Subsequent
zeros are found using the deflated polynomial.
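To make the iteration concrete, the following is a minimal sketch of the basic Laguerre step applied to the example polynomial p(z) = z³ − 3z² + 4z − 2. It is not the ZPLRC implementation: the step-length restriction, the rounding-error acceptance test, and the deflation described above are all omitted, and the starting guess, the fixed tolerance, and the program name are assumptions made for illustration.

```fortran
program laguerre_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: ndeg = 3
  ! Coefficients in increasing order by degree, as for COEFF.
  real(dp), parameter :: coeff(ndeg+1) = &
       (/ -2.0_dp, 4.0_dp, -3.0_dp, 1.0_dp /)
  complex(dp) :: z, p, dp1, dp2, g, h, root, d1, d2, step
  integer :: it, k

  z = cmplx(0.0_dp, 0.5_dp, kind=dp)      ! arbitrary starting guess
  do it = 1, 50
     ! Horner's rule for p(z), p'(z) and p''(z)/2.
     p = coeff(ndeg+1)
     dp1 = 0
     dp2 = 0
     do k = ndeg, 1, -1
        dp2 = dp2*z + dp1
        dp1 = dp1*z + p
        p   = p*z + coeff(k)
     end do
     dp2 = 2.0_dp*dp2
     if (abs(p) < 1.0e-13_dp) exit        ! simple fixed tolerance
     g = dp1/p
     h = g*g - dp2/p
     root = sqrt((ndeg - 1)*(ndeg*h - g*g))
     ! Choose the sign that gives the larger denominator.
     d1 = g + root
     d2 = g - root
     if (abs(d2) > abs(d1)) d1 = d2
     step = ndeg/d1
     z = z - step
  end do
  print '(a,2f12.8)', ' zero found: ', real(z), aimag(z)
end program laguerre_sketch
```

The iterate converges to one of the three zeros 1, 1 + i, or 1 − i; which one depends on the starting guess.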
Comments
Informational errors
Type  Code
3     1     The first several coefficients of the polynomial are equal to zero. Several of
            the last roots will be set to machine infinity to compensate for this problem.
3     2     Fewer than NDEG zeros were found. The ROOT vector will contain the value
            for machine infinity in the locations that do not contain zeros.
Example
This example finds the zeros of the third-degree polynomial
$$p(z) = z^3 - 3z^2 + 4z - 2$$
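      USE ZPLRC_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NDEG
      PARAMETER  (NDEG=3)
!
      REAL       COEFF(NDEG+1)
      COMPLEX    ZERO(NDEG)
!                                 Set values of COEFF
!                                 COEFF = (-2.0  4.0  -3.0  1.0)
!
      DATA COEFF/-2.0, 4.0, -3.0, 1.0/
!
      CALL ZPLRC (COEFF, ZERO)
!
      CALL WRCRN ('The zeros found are', ZERO, 1, NDEG, 1)
!
      END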
Output
              The zeros found are
        1                  2                  3
( 1.000, 1.000)    ( 1.000,-1.000)    ( 1.000, 0.000)
ZPORC
Finds the zeros of a polynomial with real coefficients using the Jenkins-Traub three-stage
algorithm.
Required Arguments
COEFF Vector of length NDEG + 1 containing the coefficients of the polynomial in
increasing order by degree. (Input)
The polynomial is
COEFF(NDEG + 1) * Z**NDEG + COEFF(NDEG) * Z**(NDEG - 1) + … + COEFF(1).
ROOT Complex vector of length NDEG containing the zeros of the polynomial. (Output)
Optional Arguments
NDEG Degree of the polynomial. 1 ≤ NDEG ≤ 100 (Input)
Default: NDEG = size (COEFF,1) - 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine ZPORC computes the n zeros of the polynomial
$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$$
where the coefficients $a_i$ for i = 0, 1, …, n are real and n is the degree of the polynomial.
The routine ZPORC uses the Jenkins-Traub three-stage algorithm (Jenkins and Traub 1970; Jenkins
1975). The zeros are computed one at a time for real zeros or two at a time for complex conjugate
pairs. As the zeros are found, the real zero or quadratic factor is removed by polynomial deflation.
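The deflation step can be illustrated with synthetic division. The sketch below removes the known real zero z = 1 from the example polynomial p(z) = z³ − 3z² + 4z − 2; it shows only the division itself, not how ZPORC locates the zero, and the program and variable names are assumptions for illustration. The quotient coefficients come out as (2, −2, 1) in increasing order, that is z² − 2z + 2, with remainder 0; the remaining zeros 1 ± i are the zeros of this quadratic factor.

```fortran
program deflation_sketch
  implicit none
  integer, parameter :: ndeg = 3
  ! p(z) = z**3 - 3z**2 + 4z - 2, coefficients in increasing order.
  real :: coeff(ndeg+1) = (/ -2.0, 4.0, -3.0, 1.0 /)
  real :: q(ndeg), r, remainder
  integer :: k

  r = 1.0                         ! a known real zero of p

  ! Synthetic division of p by (z - r); q holds the quotient
  ! coefficients in increasing order by degree.
  q(ndeg) = coeff(ndeg+1)
  do k = ndeg - 1, 1, -1
     q(k) = coeff(k+1) + r*q(k+1)
  end do
  remainder = coeff(1) + r*q(1)

  print '(a,3f6.1)', ' deflated coefficients: ', q
  print '(a,f6.1)',  ' remainder:             ', remainder
end program deflation_sketch
```

A nonzero remainder would indicate that r is only an approximate zero; its size is one measure of how much error deflation passes on to the zeros found afterwards.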
Comments
Informational errors
Type  Code
3     1     The first several coefficients of the polynomial are equal to zero. Several of
            the last roots will be set to machine infinity to compensate for this problem.
3     2     Fewer than NDEG zeros were found. The ROOT vector will contain the value
            for machine infinity in the locations that do not contain zeros.
Example
This example finds the zeros of the third-degree polynomial
$$p(z) = z^3 - 3z^2 + 4z - 2$$
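      USE ZPORC_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NDEG
      PARAMETER  (NDEG=3)
!
      REAL       COEFF(NDEG+1)
      COMPLEX    ZERO(NDEG)
!                                 Set values of COEFF
!                                 COEFF = (-2.0  4.0  -3.0  1.0)
!
      DATA COEFF/-2.0, 4.0, -3.0, 1.0/
!
      CALL ZPORC (COEFF, ZERO)
!
      CALL WRCRN ('The zeros found are', ZERO, 1, NDEG, 1)
!
      END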
Output
              The zeros found are
        1                  2                  3
( 1.000, 0.000)    ( 1.000, 1.000)    ( 1.000,-1.000)
ZPOCC
Finds the zeros of a polynomial with complex coefficients.
Required Arguments
COEFF Complex vector of length NDEG + 1 containing the coefficients of the polynomial
in increasing order by degree. (Input)
The polynomial is
COEFF(NDEG + 1) * Z**NDEG + COEFF(NDEG) * Z**(NDEG - 1) + … + COEFF(1).
ROOT Complex vector of length NDEG containing the zeros of the polynomial. (Output)
Optional Arguments
NDEG Degree of the polynomial. 1 ≤ NDEG < 50 (Input)
Default: NDEG = size (COEFF,1) - 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine ZPOCC computes the n zeros of the polynomial
$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$$
where the coefficients $a_i$ for i = 0, 1, …, n are complex and n is the degree of the polynomial.
The routine ZPOCC uses the Jenkins-Traub three-stage complex algorithm (Jenkins and Traub
1970, 1972). The zeros are computed one at a time in roughly increasing order of modulus. As
each zero is found, the polynomial is deflated to one of lower degree.
Comments
Informational errors
Type  Code
3     1     The first several coefficients of the polynomial are equal to zero. Several of
            the last roots will be set to machine infinity to compensate for this problem.
3     2     Fewer than NDEG zeros were found. The ROOT vector will contain the value
            for machine infinity in the locations that do not contain zeros.
Example
This example finds the zeros of the third-degree polynomial
$$p(z) = z^3 - (3 + 6i)z^2 - (8 - 12i)z + 10$$
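      USE ZPOCC_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NDEG
      PARAMETER  (NDEG=3)
!
      COMPLEX    COEFF(NDEG+1), ZERO(NDEG)
!                                 Set values of COEFF
!                                 COEFF = ( 10.0 +  0.0i )
!                                         ( -8.0 + 12.0i )
!                                         ( -3.0 -  6.0i )
!                                         (  1.0 +  0.0i )
!
      DATA COEFF/(10.0,0.0), (-8.0,12.0), (-3.0,-6.0), (1.0,0.0)/
!
      CALL ZPOCC (COEFF, ZERO)
!
      CALL WRCRN ('The zeros found are', ZERO, 1, NDEG, 1)
!
      END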
Output
              The zeros found are
        1                  2                  3
( 1.000, 1.000)    ( 1.000, 2.000)    ( 1.000, 3.000)
ZANLY
Finds the zeros of a univariate complex function using Müller's method.
Required Arguments
F User-supplied COMPLEX FUNCTION to compute the value of the function
of which the zeros will be found. The form is F(Z), where
Optional Arguments
ERRABS First stopping criterion. (Input)
Let FP(Z) = F(Z)/P where P = (Z - Z(1)) * (Z - Z(2)) * … * (Z - Z(K - 1))
and Z(1), …, Z(K - 1) are previously found zeros.
If (CABS(F(Z)).LE.ERRABS.AND.CABS(FP(Z)).LE.ERRABS),
then Z is accepted as a zero.
Default: ERRABS = 1.e-4 for single precision and 1.d-8 for double precision.
ERRREL Second stopping criterion is the relative error. (Input)
A zero is accepted if the difference in two successive approximations to this zero is
within ERRREL. ERRREL must be less than 0.01; otherwise, 0.01 will be used.
Default: ERRREL = 1.e-4 for single precision and 1.d-8 for double precision.
NKNOWN The number of previously known zeros, if any, that must be stored in
ZINIT(1), …, ZINIT(NKNOWN) prior to entry to ZANLY. (Input)
NKNOWN must be set equal to zero if no zeros are known.
Default: NKNOWN = 0.
NNEW The number of new zeros to be found by ZANLY. (Input)
Default: NNEW = 1.
NGUESS The number of initial guesses provided. (Input)
These guesses must be stored in ZINIT(NKNOWN + 1), …, ZINIT(NKNOWN + NGUESS).
NGUESS must be set equal to zero if no guesses are provided.
Default: NGUESS = 0.
ITMAX The maximum allowable number of iterations per zero. (Input)
Default: ITMAX = 100.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Example
This example finds the zeros of the equation f(z) = z³ + 5z² + 9z + 45, where z is a complex
variable.
      USE ZANLY_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    INFO(3), NGUESS, NNEW
      COMPLEX    F, Z(3), ZINIT(3)
      EXTERNAL   F
!                                 Set the guessed zero values in ZINIT
      DATA ZINIT/3*(1.0,1.0)/
!
      NNEW   = 3
      NGUESS = 3
Output
                 The zeros are
        1                  2                  3
( 0.000, 3.000)    ( 0.000,-3.000)    (-5.000, 0.000)
ZBREN
Finds a zero of a real function that changes sign in a given interval.
Required Arguments
F User-supplied FUNCTION to compute the value of the function of which a zero will be
found. The form is F(X), where
X The point at which the function is evaluated. (Input)
X should not be changed by F.
F The computed function value at the point X. (Output)
F must be declared EXTERNAL in the calling program.
A See B. (Input/Output)
B On input, the user must supply two points, A and B, such that F(A) and F(B) are opposite
in sign. (Input/Output)
On output, both A and B are altered. B will contain the best approximation to the zero of
F.
Optional Arguments
ERRABS First stopping criterion. (Input)
A zero, B, is accepted if ABS(F(B)) is less than or equal to ERRABS. ERRABS may be set
to zero.
Default: ERRABS = 1.e-4 for single precision and 1.d-8 for double precision.
ERRREL Second stopping criterion is the relative error. (Input)
A zero is accepted if the change between two successive approximations to this zero is
within ERRREL.
Default: ERRREL = 1.e-4 for single precision and 1.d-8 for double precision.
MAXFN On input, MAXFN specifies an upper bound on the number of function evaluations
required for convergence. (Input/Output)
On output, MAXFN will contain the actual number of function evaluations used.
Default: MAXFN = 100.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The algorithm used by ZBREN is a combination of linear interpolation, inverse quadratic
interpolation, and bisection. Convergence is usually superlinear and is never much slower than the
rate for the bisection method. See Brent (1971) for a more detailed account of this algorithm.
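The idea of combining interpolation with bisection can be sketched in a few lines. The program below is a deliberately simplified illustration and is not Brent's algorithm as used by ZBREN: it alternates a linear-interpolation step with a bisection step, so the bracket is at least halved every two function evaluations, and it omits inverse quadratic interpolation entirely. The bracket [−10, 0], the tolerances, and all names are assumptions chosen for the function f(x) = x² + x − 2, and the stopping tests follow the ones listed in the Comments for ZBREN.

```fortran
program bracket_sketch
  implicit none
  real :: a, b, c, fa, fb, fc, t, errabs, errrel
  integer :: nfev, maxfn
  logical :: use_secant

  a = -10.0                      ! assumed bracket: f(a)*f(b) < 0
  b = 0.0
  errabs = 0.0
  errrel = 1.0e-5
  maxfn  = 100
  fa = f(a)
  fb = f(b)
  nfev = 2
  use_secant = .true.
  do
     if (abs(fb) <= errabs) exit
     if (abs(a - b) <= max(abs(b), 0.1)*errrel) exit
     if (nfev >= maxfn) exit
     if (use_secant .and. fa /= fb) then
        c = b - fb*(b - a)/(fb - fa)      ! linear interpolation
     else
        c = 0.5*(a + b)                   ! bisection
     end if
     use_secant = .not. use_secant
     fc = f(c)
     nfev = nfev + 1
     ! Keep the sub-interval on which f changes sign.
     if (fa*fc <= 0.0) then
        b = c
        fb = fc
     else
        a = c
        fa = fc
     end if
     ! Keep the endpoint with the smaller residual in b.
     if (abs(fa) < abs(fb)) then
        t = a;  a = b;  b = t
        t = fa; fa = fb; fb = t
     end if
  end do
  print '(a,f10.5)', ' approximate zero: ', b
  print '(a,i4)',    ' function evaluations: ', nfev
contains
  real function f(x)
    real, intent(in) :: x
    f = x**2 + x - 2.0
  end function f
end program bracket_sketch
```

With this bracket the iteration approaches the zero at −2; the other zero of f, at 1, lies outside the interval. The sign-change invariant is what guarantees that a zero stays inside [a, b] at every step.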
Comments
1.  Informational error
    Type  Code
    4     1     Failure to converge in MAXFN function evaluations.
2.  On exit from ZBREN without any error message, A and B satisfy the following:
    F(A) * F(B) ≤ 0.0
    |F(B)| ≤ |F(A)|, and
    either |F(B)| ≤ ERRABS or
    |A - B| ≤ max(|B|, 0.1) * ERRREL.
The presence of 0.1 in the stopping criterion causes leading zeros to the right of the
decimal point to be counted as significant digits. Scaling may be required in order to
accurately determine a zero of small magnitude.
3.
This is an upper bound on the number of evaluations. Rarely does the actual number of
evaluations used by ZBREN exceed
K
D can be computed as follows:
P = AMAX1(0.1, AMIN1(|A|, |B|))
IF((A - 0.1) * (B - 0.1) < 0.0) P = 0.1,
D = P * ERRREL
Example
This example finds a zero of the function
$$f(x) = x^2 + x - 2$$
      IMPLICIT   NONE
!                                 Declare variables
      REAL       ERRABS, ERRREL
!
      INTEGER    NOUT, MAXFN
      REAL       A, B, F
      EXTERNAL   F
!
      A      =
      B      =
      ERRABS =
      ERRREL =
      MAXFN  =
!
      CALL UMACH (2, NOUT)
!                                 Find zero of F
      CALL ZBREN (F, A, B, ERRABS=ERRABS, ERRREL=ERRREL, MAXFN=MAXFN)
Output
The best approximation to the zero of F is equal to -2.0.
The number of function evaluations required was 12.
ZREAL
Finds the real zeros of a real function using Müller's method.
Required Arguments
F User-supplied FUNCTION to compute the value of the function of which a zero will be
found. The form is F(X), where
X The point at which the function is evaluated. (Input)
X should not be changed by F.
F The computed function value at the point X. (Output)
F must be declared EXTERNAL in the calling program.
Optional Arguments
ERRABS First stopping criterion. (Input)
A zero X(I) is accepted if ABS(F(X(I))).LT.ERRABS.
Default: ERRABS = 1.e-4 for single precision and 1.d-8 for double precision.
ERRREL Second stopping criterion is the relative error. (Input)
A zero X(I) is accepted if the relative change of two successive approximations to X(I)
is less than ERRREL.
Default: ERRREL = 1.e-4 for single precision and 1.d-8 for double precision.
EPS See ETA. (Input)
Default: EPS = 1.e-4 for single precision and 1.d-8 for double precision.
ETA Spread criteria for multiple zeros. (Input)
If the zero X(I) has been computed and ABS(X(I) - X(J)).LT.EPS, where X(J) is a
previously computed zero, then the computation is restarted with a guess equal to
X(I) + ETA.
Default: ETA = .01.
NROOT The number of zeros to be found by ZREAL. (Input)
Default: NROOT = 1.
ITMAX The maximum allowable number of iterations per zero. (Input)
Default: ITMAX = 100.
XGUESS A vector of length NROOT. (Input)
XGUESS contains the initial guesses for the zeros.
Default: XGUESS = 0.0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine ZREAL computes n real zeros of a real function f. Given a user-supplied function f(x) and
an n-vector of initial guesses $x_1, x_2, \ldots, x_n$, the routine uses Müller's method to locate n real zeros
of f, that is, n real values of x for which f(x) = 0. The routine has two convergence criteria: the first
requires that $|f(x_i^m)|$ be less than ERRABS; the second requires that the relative change of any two
successive approximations to an $x_i$ be less than ERRREL. Here, $x_i^m$ is the m-th approximation to $x_i$.
Let ERRABS be $\varepsilon_1$, and ERRREL be $\varepsilon_2$. The criteria may be stated mathematically as follows:
Criterion 1:
$$\left| f(x_i^m) \right| < \varepsilon_1$$
Criterion 2:
$$\left| \frac{x_i^{m+1} - x_i^m}{x_i^m} \right| < \varepsilon_2$$
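The two criteria can be seen at work in a bare-bones Müller iteration. The sketch below is not the ZREAL implementation: it finds a single zero, uses real arithmetic only (a negative discriminant is clamped to zero), and has no restart logic for repeated zeros. The starting points x0 and x1 placed 0.5 on either side of the guess 4.6, the tolerances, and all names are assumptions for illustration, applied to f(x) = x² + 2x − 6.

```fortran
program muller_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: errabs = 1.0e-8_dp, errrel = 1.0e-8_dp
  integer, parameter :: itmax = 100
  real(dp) :: x0, x1, x2, x3, f0, f1, f2
  real(dp) :: h1, h2, d1, d2, a, b, c, s, e
  integer :: it

  x2 = 4.6_dp                    ! initial guess
  x0 = x2 - 0.5_dp               ! two nearby points (assumed offsets)
  x1 = x2 + 0.5_dp
  f0 = f(x0)
  f1 = f(x1)
  f2 = f(x2)
  do it = 1, itmax
     ! Parabola through the three most recent points, expanded
     ! about x2: a*(x-x2)**2 + b*(x-x2) + c.
     h1 = x1 - x0
     h2 = x2 - x1
     d1 = (f1 - f0)/h1
     d2 = (f2 - f1)/h2
     a = (d2 - d1)/(h2 + h1)
     b = a*h2 + d2
     c = f2
     s = sqrt(max(b*b - 4.0_dp*a*c, 0.0_dp))
     e = b + sign(s, b)          ! larger-magnitude denominator
     if (e == 0.0_dp) exit
     x3 = x2 - 2.0_dp*c/e
     x0 = x1
     f0 = f1
     x1 = x2
     f1 = f2
     x2 = x3
     f2 = f(x3)
     if (abs(f2) < errabs) exit                  ! criterion 1
     if (abs(x2 - x1) < errrel*abs(x1)) exit     ! criterion 2
  end do
  print '(a,f12.8)', ' approximate zero: ', x2
  print '(a,es10.2)', ' residual:         ', f2
contains
  real(dp) function f(x)
    real(dp), intent(in) :: x
    f = x*x + 2.0_dp*x - 6.0_dp
  end function f
end program muller_sketch
```

Because f is itself a quadratic, the interpolating parabola coincides with f and the first step already lands on the zero −1 + √7 ≈ 1.646 up to rounding, so criterion 1 is met immediately. For a general function several steps are needed.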
Comments
1.  Informational error
    Type  Code
    3     1     Failure to converge within ITMAX iterations for at least one of the
                NROOT roots.
2.
Routine ZREAL always returns the last approximation for zero J in X(J). If the
convergence criterion is satisfied, then INFO(J) is less than or equal to ITMAX. If the
convergence criterion is not satisfied, then INFO(J) is set to ITMAX + 1.
3.
The routine ZREAL assumes that there exist NROOT distinct real zeros for the function F
and that they can be reached from the initial guesses supplied. The routine is designed
so that convergence to any single zero cannot be obtained from two different initial
guesses.
4.
Scaling the X vector in the function F may be required, if any of the zeros are known to
be less than one.
Example
This example finds the real zeros of the second-degree polynomial
$$f(x) = x^2 + 2x - 6$$
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NROOT
      REAL       EPS, ERRABS, ERRREL
      PARAMETER  (NROOT=2)
!
      INTEGER    INFO(NROOT)
      REAL       F, X(NROOT), XGUESS(NROOT)
      EXTERNAL   F
!                                 Set values of initial guess
!                                 XGUESS = ( 4.6  -193.3)
Output
The zeros are
     1         2
 1.646    -3.646
NEQNF
Solves a system of nonlinear equations using a modified Powell hybrid algorithm and a finite-difference approximation to the Jacobian.
Required Arguments
FCN User-supplied SUBROUTINE to evaluate the system of equations to be solved. The
usage is CALL FCN (X, F, N), where
X The point at which the functions are evaluated. (Input)
X should not be changed by FCN.
F The computed function values at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
Optional Arguments
ERRREL Stopping criterion. (Input)
The root is accepted if the relative error between two successive approximations to this
root is less than ERRREL.
Default: ERRREL = 1.e-4 for single precision and 1.d-8 for double precision.
N The number of equations to be solved and the number of unknowns. (Input)
Default: N = size (X,1).
ITMAX The maximum allowable number of iterations. (Input)
The maximum number of calls to FCN is ITMAX * (N + 1). Suggested value
ITMAX = 200.
Default: ITMAX = 200.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine NEQNF is based on the MINPACK subroutine HYBRD1, which uses a modification of
M.J.D. Powell's hybrid algorithm. This algorithm is a variation of Newton's method, which uses a
finite-difference approximation to the Jacobian and takes precautions to avoid large step sizes or
increasing residuals. For further description, see Moré et al. (1980).
Because the Jacobian is estimated by finite differences, in single precision the estimate
may be so inaccurate that the algorithm terminates far from a root. In such cases, high
precision arithmetic is recommended. Also, whenever the exact Jacobian can be easily provided,
IMSL routine NEQNJ should be used instead.
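To make the finite-difference idea concrete, the following is a minimal sketch of a forward-difference Jacobian for the 3 × 3 system used in the example for this routine. It is not the code inside NEQNF or HYBRD1: the helper name FDJAC_FORWARD and the step-size rule (square root of machine epsilon, scaled by the size of the variable) are assumptions chosen for illustration.

```fortran
program fd_jacobian_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 3
  real(dp) :: x(n), fjac(n,n)
  integer :: i

  x = [4.0_dp, 4.0_dp, 4.0_dp]
  call fdjac_forward(x, fjac)
  do i = 1, n
    print '(3es14.5)', fjac(i,:)
  end do
  ! Spot check: the analytic entry d f3 / d x3 is exactly 1
  if (abs(fjac(3,3) - 1.0_dp) > 1.0e-6_dp) error stop 'unexpected Jacobian entry'

contains

  ! The system from the example, written as F(x) = 0
  subroutine fcn(x, f)
    real(dp), intent(in)  :: x(n)
    real(dp), intent(out) :: f(n)
    f(1) = x(1) + exp(x(1) - 1.0_dp) + (x(2) + x(3))**2 - 27.0_dp
    f(2) = exp(x(2) - 2.0_dp)/x(1) + x(3)**2 - 10.0_dp
    f(3) = x(3) + sin(x(2) - 2.0_dp) + x(2)**2 - 7.0_dp
  end subroutine fcn

  ! Hypothetical helper: column j holds (F(x + h e_j) - F(x)) / h
  subroutine fdjac_forward(x, fjac)
    real(dp), intent(in)  :: x(n)
    real(dp), intent(out) :: fjac(n,n)
    real(dp) :: f0(n), f1(n), xh(n), h
    integer :: j

    call fcn(x, f0)
    do j = 1, n
      ! Assumed step-size rule, not taken from the library
      h = sqrt(epsilon(1.0_dp))*max(abs(x(j)), 1.0_dp)
      xh = x
      xh(j) = x(j) + h
      call fcn(xh, f1)
      ! Divide by the step actually taken after rounding
      fjac(:,j) = (f1 - f0)/(xh(j) - x(j))
    end do
  end subroutine fdjac_forward

end program fd_jacobian_sketch
```

Each column costs one extra function evaluation, and roughly half of the working digits are lost to cancellation, which is why the text recommends higher precision or an analytic Jacobian.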
Comments
1.
2. Informational errors
Type  Code
4     1     The number of calls to FCN has exceeded ITMAX * (N + 1). A new initial guess may be tried.
4     2     ERRREL is too small. No further improvement in the approximate solution is possible.
4     3     The iteration has not made good progress. A new initial guess may be tried.
Example
The following 3 × 3 system of nonlinear equations
$$f_1(x) = x_1 + e^{x_1 - 1} + (x_2 + x_3)^2 - 27 = 0$$
$$f_2(x) = e^{x_2 - 2}/x_1 + x_3^2 - 10 = 0$$
$$f_3(x) = x_3 + \sin(x_2 - 2) + x_2^2 - 7 = 0$$
is solved with the initial guess (4.0, 4.0, 4.0).
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    N
      PARAMETER  (N=3)
!
      INTEGER    K, NOUT
      REAL       FNORM, X(N), XGUESS(N)
      EXTERNAL   FCN
!                                 Set values of initial guess
!                                 XGUESS = ( 4.0  4.0  4.0 )
!
      REAL       EXP, SIN
      INTRINSIC  EXP, SIN
!
F(1) = X(1) + EXP(X(1)-1.0) + (X(2)+X(3))*(X(2)+X(3)) - 27.0
F(2) = EXP(X(2)-2.0)/X(1) + X(3)*X(3) - 10.0
F(3) = X(3) + SIN(X(2)-2.0) + X(2)*X(2) - 7.0
RETURN
END
Output
The solution to the system is
X = ( 1.0 2.0 3.0)
with FNORM =.0000
NEQNJ
Solves a system of nonlinear equations using a modified Powell hybrid algorithm with a user-supplied Jacobian.
Required Arguments
FCN User-supplied SUBROUTINE to evaluate the system of equations to be solved. The
usage is CALL FCN (X, F, N), where
X The point at which the functions are evaluated. (Input)
X should not be changed by FCN.
F The computed function values at the point X. (Output)
N Length of X, F. (Input)
FCN must be declared EXTERNAL in the calling program.
Optional Arguments
ERRREL Stopping criterion. (Input)
The root is accepted if the relative error between two successive approximations to this
root is less than ERRREL.
Default: ERRREL = 1.e-4 for single precision and 1.d-8 for double precision.
N The number of equations to be solved and the number of unknowns. (Input)
Default: N = size (X,1).
ITMAX The maximum allowable number of iterations. (Input)
Suggested value = 200.
Default: ITMAX = 200.
XGUESS A vector of length N. (Input)
XGUESS contains the initial estimate of the root.
Default: XGUESS = 0.0.
FNORM A scalar that has the value $F(1)^2 + \cdots + F(N)^2$ at the point X. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine NEQNJ is based on the MINPACK subroutine HYBRDJ, which uses a modification of
M.J.D. Powell's hybrid algorithm. This algorithm is a variation of Newton's method, which takes
precautions to avoid large step sizes or increasing residuals. For further description, see Moré et al.
(1980).
Comments
1.
FVEC A vector of length N. FVEC contains the functions evaluated at the point X.
FJAC An N by N matrix. FJAC contains the orthogonal matrix Q produced by the
QR factorization of the final approximate Jacobian.
R A vector of length N * (N + 1)/2. R contains the upper triangular matrix
produced by the QR factorization of the final approximate Jacobian. R is stored
row-wise.
QTF A vector of length N. QTF contains the vector TRANS(Q) * FVEC.
WK A work vector of length 5 * N.
2. Informational errors
Type  Code
4     1     The number of calls to FCN has exceeded ITMAX. A new initial guess may be tried.
4     2     ERRREL is too small. No further improvement in the approximate solution is possible.
4     3     The iteration has not made good progress. A new initial guess may be tried.
Example
The following 3 × 3 system of nonlinear equations
$$f_1(x) = x_1 + e^{x_1 - 1} + (x_2 + x_3)^2 - 27 = 0$$
$$f_2(x) = e^{x_2 - 2}/x_1 + x_3^2 - 10 = 0$$
$$f_3(x) = x_3 + \sin(x_2 - 2) + x_2^2 - 7 = 0$$
is solved with the initial guess (4.0, 4.0, 4.0).
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    N
      PARAMETER  (N=3)
!
      INTEGER    K, NOUT
      REAL       FNORM, X(N), XGUESS(N)
      EXTERNAL   FCN, LSJAC
!                                 Set values of initial guess
!                                 XGUESS = ( 4.0  4.0  4.0 )
Output
The roots found are
X = ( 1.0 2.0 3.0)
with FNORM =.0000
NEQBF
Solves a system of nonlinear equations using factored secant update with a finite-difference
approximation to the Jacobian.
Required Arguments
FCN User-supplied SUBROUTINE to evaluate the system of equations to be solved. The
usage is CALL FCN (N, X, F), where
N Length of X and F. (Input)
X The point at which the functions are evaluated. (Input)
Optional Arguments
N Dimension of the problem. (Input)
Default: N = size (X,1).
XGUESS Vector of length N containing initial guess of the root. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the distance between two points. In the absence of
other information, set all entries to 1.0. If internal scaling is desired for XSCALE, set
IPARAM (6) to 1.
Default: XSCALE = 1.0.
FSCALE Vector of length N containing the diagonal scaling matrix for the functions.
(Input)
FSCALE is used mainly in scaling the function residuals. In the absence of other
information, set all entries to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 6. (Input/Output)
Set IPARAM (1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 5. (Input/Output)
See Comment 4.
FVEC Vector of length N containing the values of the functions at the approximate
solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine NEQBF uses a secant algorithm to solve a system of nonlinear equations, i.e.,
$$F(x) = 0$$
From a current point, the algorithm uses a double dogleg method to solve the following
subproblem approximately:
$$\min_{s \in \mathbb{R}^n} \left\| F(x_c) + J(x_c)\, s \right\|_2 \quad \text{subject to } \|s\|_2 \le \delta_c$$
to get a direction $s_c$, where $F(x_c)$ and $J(x_c)$ are the function values and the approximate Jacobian,
respectively, evaluated at the current point $x_c$. Then, the function values at the point $x_n = x_c + s_c$ are
evaluated and used to decide whether the new point $x_n$ should be accepted.
When the point $x_n$ is rejected, this routine reduces the trust region $\delta_c$ and goes back to solve the
subproblem again. This procedure is repeated until a better point is found.
The algorithm terminates if the new point satisfies the stopping criterion. Otherwise, $\delta_c$ is
adjusted, and the approximate Jacobian is updated by Broyden's formula,
$$J_n = J_c + \frac{(y - J_c s_c)\, s_c^T}{s_c^T s_c}$$
where $J_n = J(x_n)$, $J_c = J(x_c)$, and $y = F(x_n) - F(x_c)$. The algorithm then continues using the new
point as the current point, i.e., $x_c \leftarrow x_n$.
For more details, see Dennis and Schnabel (1983, Chapter 8).
Since a finite-difference method is used to estimate the initial Jacobian, for single precision
calculation, the Jacobian may be so incorrect that the algorithm terminates far from a root. In such
cases, high precision arithmetic is recommended. Also, whenever the exact Jacobian can be easily
provided, IMSL routine NEQBJ should be used instead.
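Broyden's rank-one update from the description can be written out in a few lines. The sketch below applies it to a small dense matrix and checks the secant condition, that the updated matrix maps the step onto the change in function values. It is an illustration under stated assumptions only: the function name BROYDEN_UPDATE and the test data are invented here, and the library's factored secant update works on a matrix factorization rather than on the dense matrix shown.

```fortran
program broyden_update_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 2
  real(dp) :: jc(n,n), jn(n,n), sc(n), y(n)
  integer :: i

  ! Illustrative data: current Jacobian approximation, step, and y = F(xn) - F(xc)
  jc = reshape([1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp], [n, n])
  sc = [1.0_dp, 2.0_dp]
  y  = [3.0_dp, 1.0_dp]

  jn = broyden_update(jc, sc, y)
  do i = 1, n
    print '(2f8.3)', jn(i,:)
  end do

  ! Secant condition: the updated matrix must satisfy Jn * sc = y
  if (maxval(abs(matmul(jn, sc) - y)) > 1.0e-12_dp) error stop 'secant condition violated'

contains

  ! Hypothetical helper: Jn = Jc + (y - Jc*sc) * sc^T / (sc^T * sc)
  function broyden_update(jc, sc, y) result(jn)
    real(dp), intent(in) :: jc(:,:), sc(:), y(:)
    real(dp) :: jn(size(jc,1), size(jc,2))
    real(dp) :: r(size(y)), ss
    integer :: i, j

    r  = y - matmul(jc, sc)      ! residual of the secant condition
    ss = dot_product(sc, sc)     ! squared 2-norm of the step
    do j = 1, size(sc)
      do i = 1, size(y)
        jn(i,j) = jc(i,j) + r(i)*sc(j)/ss
      end do
    end do
  end function broyden_update

end program broyden_update_sketch
```

The update changes the approximation only along the step direction, so no new function or Jacobian evaluations are needed beyond the values already computed at the new point.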
Comments
1.
2. Informational errors
Type  Code
3     1     The last global step failed to decrease the 2-norm of F(X) sufficiently;
            either the current point is close to a root of F(X) and no more
            accuracy is possible, or the secant approximation to the Jacobian is
            inaccurate, or the step tolerance is too large.
3     3     The scaled distance between the last two steps is less than the step
            tolerance; the current point is probably an approximate root of F(X)
            (unless STEPTL is too large).
3     4     Maximum number of iterations exceeded.
3     5     Maximum number of function evaluations exceeded.
3     7     Five consecutive steps of length STEPMX have been taken; either the
            2-norm of F(X) asymptotes from above to a finite value in some
            direction or the maximum allowable step size STEPMX is too small.
3.
The stopping criterion for NEQBF occurs when the scaled norm of the functions is less
than the scaled function tolerance (RPARAM(1)).
4.
If the default parameters are desired for NEQBF, then set IPARAM(1) to zero and call
routine NEQBF. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling NEQBF:
CALL N4QBJ (IPARAM, RPARAM)
Default: 100.
Default: 400.
Default: 0.
RPARAM Real vector of length 5.
RPARAM(1) = Scaled function tolerance.
$$\max_i \left( |f_i| \cdot fs_i \right)$$
where fi is the i-th component of the function vector F, and fsi is the i-th
component of FSCALE.
Default:
The scaled norm of the step between two points x and y is computed as
$$\max_i \left\{ \frac{|x_i - y_i|}{\max\left(|x_i|,\, 1/s_i\right)} \right\}$$
Default:
$$\varepsilon_1 = \sqrt{\sum_{i=1}^{n} (s_i t_i)^2}$$
Users wishing to override the default print/stop attributes associated with error messages
issued by this routine are referred to Error Handling in the Introduction.
Example
The following 3 × 3 system of nonlinear equations:
$$f_1(x) = x_1 + e^{x_1 - 1} + (x_2 + x_3)^2 - 27 = 0$$
$$f_2(x) = e^{x_2 - 2}/x_1 + x_3^2 - 10 = 0$$
$$f_3(x) = x_3 + \sin(x_2 - 2) + x_2^2 - 7 = 0$$
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    N
      PARAMETER  (N=3)
!
      INTEGER    K, NOUT
      REAL       X(N), XGUESS(N)
      EXTERNAL   FCN
Output
The solution to the system is
X = (  1.000  2.000  3.000)
NEQBJ
Solves a system of nonlinear equations using factored secant update with a user-supplied Jacobian.
Required Arguments
FCN User-supplied SUBROUTINE to evaluate the system of equations to be solved. The
usage is CALL FCN (N, X, F), where
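N Length of X and F. (Input)
X The point at which the functions are evaluated. (Input)
X should not be changed by FCN.
F The computed function values at the point X. (Output)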
JAC User-supplied SUBROUTINE to evaluate the Jacobian at a point X. The usage is CALL
JAC (N, X, FJAC, LDFJAC), where
N Length of X. (Input)
X Vector of length N at which point the Jacobian is evaluated. (Input)
X should not be changed by JAC.
FJAC The computed N by N Jacobian at the point X. (Output)
LDFJAC Leading dimension of FJAC. (Input)
JAC must be declared EXTERNAL in the calling program.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = size (X,1).
XGUESS Vector of length N containing initial guess of the root. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the distance between two points. In the absence of
other information, set all entries to 1.0. If internal scaling is desired for XSCALE, set
IPARAM(6) to 1.
Default: XSCALE = 1.0.
FSCALE Vector of length N containing the diagonal scaling matrix for the functions.
(Input)
FSCALE is used mainly in scaling the function residuals. In the absence of other
information, set all entries to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 6. (Input/Output)
Set IPARAM (1) to zero for default values of IPARAM and RPARAM.
See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 5. (Input/Output)
See Comment 4.
FVEC Vector of length N containing the values of the functions at the approximate
solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine NEQBJ uses a secant algorithm to solve a system of nonlinear equations, i.e.,
$$F(x) = 0$$
From a current point, the algorithm uses a double dogleg method to solve the following
subproblem approximately:
$$\min_{s \in \mathbb{R}^n} \left\| F(x_c) + J(x_c)\, s \right\|_2 \quad \text{subject to } \|s\|_2 \le \delta_c$$
to get a direction $s_c$, where $F(x_c)$ and $J(x_c)$ are the function values and the approximate Jacobian,
respectively, evaluated at the current point $x_c$. Then, the function values at the point $x_n = x_c + s_c$ are
evaluated and used to decide whether the new point $x_n$ should be accepted.
When the point $x_n$ is rejected, this routine reduces the trust region $\delta_c$ and goes back to solve the
subproblem again. This procedure is repeated until a better point is found.
The algorithm terminates if the new point satisfies the stopping criterion. Otherwise, $\delta_c$ is
adjusted, and the approximate Jacobian is updated by Broyden's formula,
$$J_n = J_c + \frac{(y - J_c s_c)\, s_c^T}{s_c^T s_c}$$
where $J_n = J(x_n)$, $J_c = J(x_c)$, and $y = F(x_n) - F(x_c)$. The algorithm then continues using the new
point as the current point, i.e., $x_c \leftarrow x_n$.
For more details, see Dennis and Schnabel (1983, Chapter 8).
Comments
1.
2. Informational errors
Type  Code
3     1     The last global step failed to decrease the 2-norm of F(X) sufficiently;
            either the current point is close to a root of F(X) and no more
            accuracy is possible, or the secant approximation to the Jacobian is
            inaccurate, or the step tolerance is too large.
3     3     The scaled distance between the last two steps is less than the step
            tolerance; the current point is probably an approximate root of F(X)
            (unless STEPTL is too large).
3     4     Maximum number of iterations exceeded.
3     5     Maximum number of function evaluations exceeded.
3     7     Five consecutive steps of length STEPMX have been taken; either the
            2-norm of F(X) asymptotes from above to a finite value in some
            direction or the maximum allowable step size STEPMX is too small.
3.
The stopping criterion for NEQBJ occurs when the scaled norm of the functions is less
than the scaled function tolerance (RPARAM(1)).
4.
If the default parameters are desired for NEQBJ, then set IPARAM(1) to zero and call
routine NEQBJ. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling NEQBJ:
CALL N4QBJ (IPARAM, RPARAM)
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of Jacobian evaluations.
Default: not used in NEQBJ.
IPARAM(6) = Internal variable scaling flag.
If IPARAM(6) = 1, then the values of XSCALE are set internally.
Default: 0.
RPARAM Real vector of length 5.
$$\max_i \left( |f_i| \cdot fs_i \right)$$
where fi is the i-th component of the function vector F, and fsi is the i-th component of
FSCALE.
Default:
The scaled norm of the step between two points x and y is computed as
$$\max_i \left\{ \frac{|x_i - y_i|}{\max\left(|x_i|,\, 1/s_i\right)} \right\}$$
Default:
$$\varepsilon_1 = \sqrt{\sum_{i=1}^{n} (s_i t_i)^2}$$
If double precision is desired, then DN4QBJ is called and RPARAM is declared double
precision.
5.
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
Example
The following 3 × 3 system of nonlinear equations
$$f_1(x) = x_1 + e^{x_1 - 1} + (x_2 + x_3)^2 - 27 = 0$$
$$f_2(x) = e^{x_2 - 2}/x_1 + x_3^2 - 10 = 0$$
$$f_3(x) = x_3 + \sin(x_2 - 2) + x_2^2 - 7 = 0$$
is solved.
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    N
      PARAMETER  (N=3)
!
      INTEGER    K, NOUT
      REAL       X(N), XGUESS(N)
      EXTERNAL   FCN, JAC
      FJAC(1,3) = 2.0*(X(2)+X(3))
      FJAC(2,1) = -EXP(X(2)-2.0)*(1.0/X(1)**2)
      FJAC(2,2) = EXP(X(2)-2.0)*(1.0/X(1))
      FJAC(2,3) = 2.0*X(3)
      FJAC(3,1) = 0.0
      FJAC(3,2) = COS(X(2)-2.0) + 2.0*X(2)
      FJAC(3,3) = 1.0
      RETURN
      END
Output
The solution to the system is
X = (  1.000  2.000  3.000)
Chapter 8: Optimization
Routines
8.1. Unconstrained Minimization
8.1.1 Univariate Function
Using function values only .................................................... UVMIF
Using function and first derivative values ............................ UVMID
Nonsmooth function ............................................................ UVMGS
8.1.2 Multivariate Function
Using finite-difference gradient ............................................. UMINF
Using analytic gradient ........................................................ UMING
Using finite-difference Hessian ............................................ UMIDH
Using analytic Hessian ........................................................ UMIAH
Using conjugate gradient with finite-difference gradient ..... UMCGF
Using conjugate gradient with analytic gradient ................ UMCGG
Nonsmooth function ............................................................ UMPOL
8.1.3
8.2.
8.3.
8.4.
8.5. Service Routines
Central-difference gradient ................................................. CDGRD
Forward-difference gradient ................................................ FDGRD
Forward-difference Hessian ................................................ FDHES
Forward-difference Hessian using analytic gradient ........... GDHES
Forward-difference Jacobian ................................................ FDJAC
Check user-supplied gradient ............................................ CHGRD
Check user-supplied Hessian ............................................. CHHES
Check user-supplied Jacobian ............................................ CHJAC
Generate starting points ..................................................... GGUES
Usage Notes
Unconstrained Minimization
The unconstrained minimization problem can be stated as follows:
$$\min_{x \in \mathbb{R}^n} f(x)$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is at least continuous. The routines for unconstrained minimization are grouped
into three categories: univariate functions (UV***), multivariate functions (UM***), and nonlinear
least squares (UNLS*).
For the univariate function routines, it is assumed that the function is unimodal within the
specified interval. Otherwise, only a local minimum can be expected. For further discussion on
unimodality, see Brent (1973).
A quasi-Newton method is used for the multivariate function routines UMINF and UMING, whereas
UMIDH and UMIAH use a modified Newton algorithm. The routines UMCGF and UMCGG make use of
a conjugate gradient approach, and UMPOL uses a polytope method. For more details on these
algorithms, see the documentation for the corresponding routines.
The nonlinear least squares routines use a modified Levenberg-Marquardt algorithm. If the
nonlinear least squares problem is a nonlinear data-fitting problem, then software that is designed
to deliver better statistical output may be useful; see IMSL (1991).
These routines are designed to find only a local minimum point. However, a function may have
many local minima. It is often possible to obtain a better local solution by trying different initial
points and intervals.
High precision arithmetic is recommended for the routines that use only function values. Also, it is
advised that the derivative-checking routines CH*** be used to ensure the accuracy of the user-supplied derivative evaluation subroutines.
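The idea behind checking a user-supplied derivative can be sketched by comparing it with a central-difference estimate. The program below does this for the univariate function used in the examples of this chapter. It is not the algorithm of the CH*** routines: the program name, the step-size rule, and the acceptance threshold are assumptions made for illustration.

```fortran
program check_derivative_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x, h, g_user, g_fd, relerr

  x = 1.0_dp
  ! Assumed step size: cube root of machine epsilon, scaled by the size of x
  h = epsilon(1.0_dp)**(1.0_dp/3.0_dp)*max(abs(x), 1.0_dp)

  g_user = g(x)                                ! user-supplied derivative
  g_fd   = (f(x + h) - f(x - h))/(2.0_dp*h)    ! central-difference estimate
  relerr = abs(g_user - g_fd)/max(abs(g_user), 1.0_dp)

  print '(a,es12.4)', 'user-supplied derivative: ', g_user
  print '(a,es12.4)', 'central difference:       ', g_fd
  print '(a,es12.4)', 'relative discrepancy:     ', relerr

  ! Assumed threshold; a coding error in g typically gives a discrepancy near 1
  if (relerr > 1.0e-6_dp) error stop 'derivative looks wrong'

contains

  real(dp) function f(x)
    real(dp), intent(in) :: x
    f = exp(x) - 5.0_dp*x
  end function f

  real(dp) function g(x)
    real(dp), intent(in) :: x
    g = exp(x) - 5.0_dp
  end function g

end program check_derivative_sketch
```

A check of this kind is cheap compared with a minimization that silently converges to the wrong point because of a sign or indexing mistake in the derivative code.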
subject to $l_i \le x_i \le u_i$, for $i = 1, 2, \ldots, n$
subject to $Ax = b$
subject to $g_i(x) = 0$, for $i = 1, 2, \ldots, m_1$
$g_i(x) \ge 0$, for $i = m_1 + 1, \ldots, m$
Selection of Routines
The following general guidelines are provided to aid in the selection of the appropriate routine.
Unconstrained Minimization
1.
For the univariate case, use UVMID when the gradient is available, and use UVMIF when
it is not. If discontinuities exist, then use UVMGS.
2.
For the multivariate case, use UMCG* when storage is a problem, and use UMPOL when
the function is nonsmooth. Otherwise, use UMI** depending on the availability of the
gradient and the Hessian.
3.
For least squares problems, use UNLSJ when the Jacobian is available, and use UNLSF
when it is not.
Use BCONF when only function values are available. When first derivatives are
available, use either BCONG or BCODH. If first and second derivatives are available, then
use BCOAH.
2.
For least squares, use BCLSF or BCLSJ depending on the availability of the Jacobian.
3.
Use BCPOL for nonsmooth functions that could not be solved satisfactorily by the other
routines.
[Figure: decision tree for selecting an unconstrained minimization routine. Branches: univariate or multivariate; smooth or nonsmooth; least squares; large-size problem; availability of first derivatives, second derivatives, or the Jacobian. Leaves: UVMIF, UVMID, UVMGS, UMINF, UMING, UMIDH, UMIAH, UMCGF, UMCGG, UMPOL, UNLSF, UNLSJ.]
UVMIF
Finds the minimum point of a smooth function of a single variable using only function
evaluations.
Required Arguments
F User-supplied function to compute the value of the function to be minimized. The form
is F(X), where
X The point at which the function is evaluated. (Input)
Optional Arguments
STEP An order of magnitude estimate of the required change in X. (Input)
Default: STEP = 1.0.
XACC The required absolute accuracy in the final value of X. (Input)
On a normal return there are points on either side of X within a distance XACC at which
F is no less than F(X).
Default: XACC = 1.e-4.
MAXFN Maximum number of function evaluations allowed. (Input)
Default: MAXFN = 1000.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UVMIF uses a safeguarded quadratic interpolation method to find a minimum point of
a univariate function. Both the code and the underlying algorithm are based on the routine ZXLSF
written by M.J.D. Powell at the University of Cambridge.
The routine UVMIF finds the least value of a univariate function, f, that is specified by the function
subroutine F. Other required data include an initial estimate of the solution, XGUESS, and a
positive number BOUND. Let $x_0$ = XGUESS and b = BOUND; then x is restricted to the interval
$[x_0 - b, x_0 + b]$. Usually, the algorithm begins the search by moving from $x_0$ to $x = x_0 + s$, where
s = STEP is also provided by the user and may be positive or negative. The first two function
evaluations indicate the direction to the minimum point, and the search strides out along this
direction until a bracket on a minimum point is found or until x reaches one of the bounds $x_0 \pm b$.
During this stage, the step length increases by a factor of between two and nine per function
evaluation; the factor depends on the position of the minimum point that is predicted by quadratic
interpolation of the three most recent function values.
When an interval containing a solution has been found, we will have three points, $x_1$, $x_2$, and $x_3$,
with $x_1 < x_2 < x_3$ and $f(x_2) \le f(x_1)$ and $f(x_2) \le f(x_3)$. There are three main ingredients in the
technique for choosing the new x from these three points. They are (i) the estimate of the
minimum point that is given by quadratic interpolation of the three function values, (ii) a tolerance
parameter $\varepsilon$ that depends on the closeness of f to a quadratic, and (iii) whether $x_2$ is near the center
of the range between $x_1$ and $x_3$ or is relatively close to an end of this range. In outline, the new
value of x is as near as possible to the predicted minimum point, subject to being at least $\varepsilon$ from $x_2$,
and subject to being in the longer interval between $x_1$ and $x_2$ or $x_2$ and $x_3$ when $x_2$ is particularly
close to $x_1$ or $x_3$. There is some elaboration, however, when the distance between these points is
close to the required accuracy; when the distance is close to the machine precision; or when $\varepsilon$ is
relatively large.
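The first ingredient, the minimum point predicted by quadratic interpolation of three function values, can be sketched in isolation. The program below takes one such step on the function used in the example for this routine, from a bracket chosen by hand. It omits the tolerance and interval-length safeguards described above, and the names PARABOLA_MIN and the bracket values are assumptions made for illustration, not part of UVMIF.

```fortran
program quad_interp_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x1, x2, x3, xnew

  ! A bracket with x1 < x2 < x3, f(x2) <= f(x1) and f(x2) <= f(x3)
  x1 = 1.0_dp
  x2 = 1.5_dp
  x3 = 2.0_dp

  xnew = parabola_min(x1, f(x1), x2, f(x2), x3, f(x3))
  print '(a,f8.4)', 'predicted minimum point:    ', xnew
  print '(a,f8.4)', 'exact minimum point (ln 5): ', log(5.0_dp)

  ! The prediction from a valid bracket must stay strictly inside it
  if (xnew <= x1 .or. xnew >= x3) error stop 'prediction left the bracket'

contains

  real(dp) function f(x)
    real(dp), intent(in) :: x
    f = exp(x) - 5.0_dp*x
  end function f

  ! Hypothetical helper: abscissa of the vertex of the parabola through
  ! (xa,fa), (xb,fb), (xc,fc); falls back to xb if the points are collinear.
  function parabola_min(xa, fa, xb, fb, xc, fc) result(xm)
    real(dp), intent(in) :: xa, fa, xb, fb, xc, fc
    real(dp) :: xm, p, q

    p = (xb - xa)**2*(fb - fc) - (xb - xc)**2*(fb - fa)
    q = (xb - xa)*(fb - fc) - (xb - xc)*(fb - fa)
    if (q == 0.0_dp) then
      xm = xb
    else
      xm = xb - 0.5_dp*p/q
    end if
  end function parabola_min

end program quad_interp_sketch
```

A production routine would then replace one of the three points with the new one and repeat, applying the safeguards so that the new point is neither too close to $x_2$ nor confined to the shorter subinterval.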
The algorithm is intended to provide fast convergence when f has a positive and continuous
second derivative at the minimum and to avoid gross inefficiencies in pathological cases, such as
f (x) = x + 1.001|x|
The algorithm can make $\varepsilon$ large automatically in the pathological cases. In this case, it is usual for
a new value of x to be at the midpoint of the longer interval that is adjacent to the least calculated
function value. The midpoint strategy is used frequently when changes to f are dominated by
computer rounding errors, which will almost certainly happen if the user requests an accuracy that
is less than the square root of the machine precision. In such cases, the routine claims to have
achieved the required accuracy if it knows that there is a local minimum point within distance $\delta$ of
x, where $\delta$ = XACC, even though the rounding errors in f may cause the existence of other local
minimum points nearby. This difficulty is inevitable in minimization routines that use only
function values, so high precision arithmetic is recommended.
Comments
Informational errors
Type  Code
3     1     Computer rounding errors prevent further refinement of X.
3     2     The final value of X is at a bound. The minimum is probably beyond the bound.
4     3     The number of function evaluations has exceeded MAXFN.
Example
A minimum point of $e^x - 5x$ is found.
USE UVMIF_INT
USE UMACH_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    MAXFN, NOUT
      REAL       BOUND, F, FX, STEP, X, XACC, XGUESS
      EXTERNAL   F
!                                 Initialize variables
      XGUESS = 0.0
      XACC   = 0.001
      BOUND  = 100.0
      STEP   = 0.1
      MAXFN  = 50
!
99999 FORMAT ('  The minimum is at ', 7X, F7.3, //, '  The function ' &
             , 'value is ', F7.3)
!
      END
!                                 Real function: F = EXP(X) - 5.0*X
      REAL FUNCTION F (X)
      REAL       X
!
      REAL       EXP
      INTRINSIC  EXP
!
      F = EXP(X) - 5.0E0*X
!
      RETURN
      END
Output
The minimum is at    1.609
The function value is   -3.047
UVMID
Finds the minimum point of a smooth function of a single variable using both function evaluations
and first derivative evaluations.
Required Arguments
F User-supplied function to define the function to be minimized. The form is F(X), where
X The point at which the function is to be evaluated. (Input)
G User-supplied function to compute the derivative of the function. The form is G(X),
where
X The point at which the derivative is to be computed. (Input)
G The computed value of the derivative at X. (Output)
G must be declared EXTERNAL in the calling program.
A A is the lower endpoint of the interval in which the minimum point of F is to be located.
(Input)
B B is the upper endpoint of the interval in which the minimum point of F is to be located.
(Input)
X The point at which a minimum value of F is found. (Output)
Optional Arguments
XGUESS An initial guess of the minimum point of F. (Input)
Default: XGUESS = (a + b) / 2.0.
ERRREL The required relative accuracy in the final value of X. (Input)
This is the first stopping criterion. On a normal return, the solution X lies in an interval
that contains a local minimum and whose length is less than or equal to MAX(1.0, ABS(X)) * ERRREL.
When the given ERRREL is less than machine epsilon, SQRT(machine epsilon) is used
as ERRREL.
Default: ERRREL = 1.e-4.
GTOL The derivative tolerance used to decide if the current point is a local minimum.
(Input)
This is the second stopping criterion. X is returned as a solution when GX is less than or
equal to GTOL. GTOL should be nonnegative; otherwise, zero is used.
Default: GTOL = 1.e-4.
MAXFN Maximum number of function evaluations allowed. (Input)
Default: MAXFN = 1000.
FX The function value at point X. (Output)
GX The derivative value at point X. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UVMID uses a descent method with either the secant method or cubic interpolation to
find a minimum point of a univariate function. It starts with an initial guess and two endpoints. If
any of the three points is a local minimum point and has least function value, the routine
terminates with a solution. Otherwise, the point with least function value will be used as the
starting point.
From the starting point, say $x_c$, the function value $f_c = f(x_c)$, the derivative value $g_c = g(x_c)$, and a
new point $x_n$ defined by $x_n = x_c - g_c$ are computed. The function $f_n = f(x_n)$ and the derivative
$g_n = g(x_n)$ are then evaluated. If either $f_n \ge f_c$ or $g_n$ has the opposite sign of $g_c$, then there exists a
minimum point between $x_c$ and $x_n$, and an initial interval is obtained. Otherwise, since $x_c$ is kept as
the point that has lowest function value, an interchange between $x_n$ and $x_c$ is performed. The secant
method is then used to get a new point
$$x_s = x_c - g_c \left( \frac{g_n - g_c}{x_n - x_c} \right)^{-1}$$
Let $x_n \leftarrow x_s$ and repeat this process until an interval containing a minimum is found or one of the
convergence criteria is satisfied. The convergence criteria are as follows:
Criterion 1:
$$|x_c - x_n| \le \varepsilon_c$$
Criterion 2:
$$|g_c| \le \varepsilon_g$$
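The secant step on the derivative and the two stopping tests can be sketched as follows for the function used in the example for this routine. This is a simplification, not the UVMID algorithm: it keeps the two most recent points instead of tracking the point with the lowest function value, it uses no bracketing or cubic interpolation, and the tolerance values and the form MAX(1, |x|) * ERRREL assumed for the first criterion are illustrative choices.

```fortran
program secant_derivative_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: maxit = 50
  ! Assumed tolerances, playing the roles of ERRREL and GTOL
  real(dp), parameter :: errrel = 1.0e-10_dp, gtol = 1.0e-10_dp
  real(dp) :: xc, xn, xs, gc, gn
  integer :: it

  xc = 0.0_dp            ! starting point
  gc = g(xc)
  xn = xc - gc           ! first trial point, xn = xc - gc
  gn = g(xn)

  do it = 1, maxit
    if (abs(gn) <= gtol) exit                               ! Criterion 2
    if (abs(xc - xn) <= max(1.0_dp, abs(xn))*errrel) exit   ! Criterion 1
    if (gn == gc) exit                                      ! secant slope undefined
    ! Secant step on the derivative: zero of the line through (xc,gc), (xn,gn)
    xs = xc - gc*(xn - xc)/(gn - gc)
    xc = xn
    gc = gn
    xn = xs
    gn = g(xn)
  end do

  print '(a,f10.6,a,i0,a)', 'stationary point near ', xn, ' after ', it, ' pass(es)'
  print '(a,f10.6)', 'function value        ', f(xn)
  if (abs(xn - log(5.0_dp)) > 1.0e-6_dp) error stop 'did not reach ln 5'

contains

  real(dp) function f(x)
    real(dp), intent(in) :: x
    f = exp(x) - 5.0_dp*x
  end function f

  real(dp) function g(x)
    real(dp), intent(in) :: x
    g = exp(x) - 5.0_dp
  end function g

end program secant_derivative_sketch
```

For this convex function the derivative is increasing, so the plain secant iteration settles on the zero of g at ln 5; a nonconvex f would need the bracketing logic that this sketch leaves out.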
Comments
Informational errors
Type  Code
3     1     The final value of X is at the lower bound. The minimum is probably beyond the bound.
3     2     The final value of X is at the upper bound. The minimum is probably beyond the bound.
4     3     The maximum number of function evaluations has been exceeded.
Example
A minimum point of $e^x - 5x$ is found.
USE UVMID_INT
USE UMACH_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    MAXFN, NOUT
      REAL       A, B, ERRREL, F, FX, G, GTOL, GX, X, XGUESS, FTOL
      EXTERNAL   F, G
!                                 Initialize variables
      XGUESS = 0.0
!                                 Set ERRREL to zero in order
!                                 to use SQRT(machine epsilon)
!                                 as relative error
      ERRREL = 0.0
      GTOL   = 0.0
      A      = -10.0
      B      = 10.0
      MAXFN  = 50
!
99999 FORMAT ('  The minimum is at ', 7X, F7.3, //, '  The function ' &
             , 'value is ', F7.3, //, '  The derivative is ', F7.3)
!
      END
!                                 Real function: F = EXP(X) - 5.0*X
      REAL FUNCTION F (X)
      REAL       X
!
      REAL       EXP
      INTRINSIC  EXP
!
      F = EXP(X) - 5.0E0*X
!
      RETURN
      END
!
      REAL FUNCTION G (X)
      REAL       X
!
      REAL       EXP
      INTRINSIC  EXP
!
      G = EXP(X) - 5.0E0
      RETURN
      END
Output
The minimum is at    1.609
The function value is   -3.047
The derivative is   -0.001
UVMGS
Finds the minimum point of a nonsmooth function of a single variable.
Required Arguments
F User-supplied function to compute the value of the function to be minimized. The form
is F(X), where
X The point at which the function is evaluated. (Input)
X should not be changed by F.
Optional Arguments
TOL The allowable length of the final subinterval containing the minimum point. (Input)
Default: TOL = 1.e-4.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UVMGS uses the golden section search technique to compute to the desired accuracy
the independent variable value that minimizes a unimodal function of one independent variable,
where a known finite interval contains the minimum.
Let $\tau$ = TOL. The number of iterations required to compute the minimizing value to accuracy $\tau$ is
the greatest integer less than or equal to
$$\frac{\ln\left(\tau/(b - a)\right)}{\ln(1 - c)} + 1$$
where
$$c = \left(3 - \sqrt{5}\right)/2$$
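As a worked instance of this bound, take the interval and tolerance used in the example for this routine, a = 0, b = 5, and TOL = 10^-3, and write $\tau$ for TOL. The arithmetic below is rounded to the digits shown.

```latex
\begin{aligned}
c &= \frac{3 - \sqrt{5}}{2} \approx 0.381966, \qquad 1 - c \approx 0.618034 \\[4pt]
\frac{\ln\left(\tau/(b-a)\right)}{\ln(1-c)} + 1
  &= \frac{\ln\left(2 \times 10^{-4}\right)}{\ln(0.618034)} + 1
   \approx \frac{-8.5172}{-0.4812} + 1 \approx 18.70 \\[4pt]
(b-a)\,(1-c)^{17} &\approx 1.4 \times 10^{-3} > \tau, \qquad
(b-a)\,(1-c)^{18} \approx 8.7 \times 10^{-4} \le \tau
\end{aligned}
```

The greatest integer not exceeding 18.70 is 18, which agrees with the last line: each iteration shrinks the interval by the factor 1 - c, and 18 reductions are the first to bring its length below the tolerance. The final interval printed in the example output has length 0.0009, consistent with this count.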
The first two test points are v1 and v2 that are defined as
$v_1 = a + c(b - a)$, and $v_2 = b - c(b - a)$
If $f(v_1) < f(v_2)$, then the minimizing value is in the interval $(a, v_2)$. In this case, $b \leftarrow v_2$, $v_2 \leftarrow v_1$,
and $v_1 \leftarrow a + c(b - a)$. If $f(v_1) \ge f(v_2)$, the minimizing value is in $(v_1, b)$. In this case, $a \leftarrow v_1$,
$v_1 \leftarrow v_2$, and $v_2 \leftarrow b - c(b - a)$.
The algorithm continues in an analogous manner where only one new test point is computed at
each step. This process continues until the desired accuracy is achieved. XMIN is set to the point
producing the minimum value for the current iteration.
Mathematically, the algorithm always produces the minimizing value to the desired accuracy;
however, numerical problems may be encountered. If f is too flat in part of the region of interest,
the function may appear to be constant to the computer in that region. Error code 2 indicates that
this problem has occurred. The user may rectify the problem by relaxing the requirement on the tolerance TOL,
modifying (scaling, etc.) the form of f or executing the program in a higher precision.
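The update rules for the two test points translate directly into a short loop. The sketch below applies them to the quadratic from the example for this routine on the interval (0, 5) with a tolerance of 10^-3. It follows the description only; it is not the UVMGS source, it does none of the rounding-error detection behind error code 2, and the program and variable names are assumptions made for illustration.

```fortran
program golden_section_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a, b, tol, c, v1, v2, f1, f2, xmin
  integer :: niter

  a   = 0.0_dp
  b   = 5.0_dp
  tol = 1.0e-3_dp
  c   = (3.0_dp - sqrt(5.0_dp))/2.0_dp

  ! First two test points
  v1 = a + c*(b - a)
  v2 = b - c*(b - a)
  f1 = f(v1)
  f2 = f(v2)

  niter = 0
  do while (b - a > tol)
    if (f1 < f2) then
      ! Minimum lies in (a, v2): the old v1 becomes the new v2
      b  = v2
      v2 = v1
      f2 = f1
      v1 = a + c*(b - a)
      f1 = f(v1)
    else
      ! Minimum lies in (v1, b): the old v2 becomes the new v1
      a  = v1
      v1 = v2
      f1 = f2
      v2 = b - c*(b - a)
      f2 = f(v2)
    end if
    niter = niter + 1
  end do

  ! Report the test point with the smaller function value
  if (f1 < f2) then
    xmin = v1
  else
    xmin = v2
  end if

  print '(a,f8.5)', 'minimum near x = ', xmin
  print '(a,f8.5)', 'function value = ', f(xmin)
  print '(a,i0)',   'interval reductions: ', niter
  if (abs(xmin - 1.0_dp/3.0_dp) > tol) error stop 'minimum not located'

contains

  real(dp) function f(x)
    real(dp), intent(in) :: x
    f = 3.0_dp*x*x - 2.0_dp*x + 4.0_dp
  end function f

end program golden_section_sketch
```

Only one new function evaluation is needed per pass because, with this value of c, the surviving test point already sits at the correct position in the reduced interval.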
Comments
1. Informational errors
Type  Code
3     1     TOL is too small to be satisfied.
4     2     Due to rounding errors F does not appear to be unimodal.
2. On exit from UVMGS without any error messages, the following conditions hold:
(B - A) ≤ TOL.
A ≤ XMIN and XMIN ≤ B
F(XMIN) ≤ F(A) and F(XMIN) ≤ F(B)
3. On exit from UVMGS with error code 2, the following conditions hold:
   A ≤ XMIN and XMIN ≤ B
   F(XMIN) ≥ F(A) and F(XMIN) ≥ F(B) (only one equality can hold).
   Further analysis of the function F is necessary to determine whether it is not
   unimodal in the mathematical sense or whether it only appears not to be unimodal to the
   routine because of rounding errors, in which case the A, B, and XMIN returned may be
   acceptable.
Example
A minimum point of $3x^2 - 2x + 4$ is found.
USE UVMGS_INT
USE UMACH_INT

IMPLICIT   NONE
!                                 Specification of variables
INTEGER    NOUT
REAL       A, B, FCN, FMIN, TOL, XMIN
EXTERNAL   FCN
!                                 Initialize variables
A   = 0.0E0
B   = 5.0E0
TOL = 1.0E-3
!                                 Minimize FCN
CALL UVMGS (FCN, A, B, XMIN, TOL=TOL)
FMIN = FCN(XMIN)
!                                 Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) XMIN, FMIN, A, B
99999 FORMAT (' The minimum is at ', F5.3, //, ' The ', &
      'function value is ', F5.3, //, ' The final ', &
      'interval is (', F6.4, ',', F6.4, ')', /)
!
END
!
!                                 REAL FUNCTION: F = 3*X**2 - 2*X + 4
REAL FUNCTION FCN (X)
REAL       X
!
FCN = 3.0E0*X*X - 2.0E0*X + 4.0E0
!
RETURN
END
Output
The minimum is at 0.333
The function value is 3.667
The final interval is (0.3331,0.3340)
UMINF
Minimizes a function of N variables using a quasi-Newton method and a finite-difference gradient.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing an initial guess of the computed solution. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. In
the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UMINF uses a quasi-Newton method to find the minimum of a function f(x) of n
variables. Only function values are required. The problem is stated as follows:
$$\min_{x \in \mathbb{R}^n} f(x)$$
Given a starting point $x_c$, the search direction is computed according to the formula
$$d = -B^{-1} g_c$$
where B is a positive definite approximation of the Hessian and $g_c$ is the gradient evaluated at $x_c$.
A line search is then used to find a new point
$$x_n = x_c + \lambda d, \quad \lambda > 0$$
such that
$$f(x_n) \le f(x_c) + \alpha g_c^T d, \quad \alpha \in (0, 0.5)$$
The Hessian approximation B is then updated according to the BFGS formula
$$B \leftarrow B - \frac{B s s^T B}{s^T B s} + \frac{y y^T}{y^T s}$$
where $s = x_n - x_c$ and $y = g_n - g_c$. Another search direction is then computed to begin the next
iteration. For more details, see Dennis and Schnabel (1983, Appendix A).
Since a finite-difference method is used to estimate the gradient, for some single precision
calculations, an inaccurate estimate of the gradient may cause the algorithm to terminate at a
noncritical point. In such cases, high precision arithmetic is recommended. Also, whenever the
exact gradient can be easily provided, IMSL routine UMING should be used instead.
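The effect of single precision on a finite-difference gradient can be seen with a small experiment. The sketch below is not the differencing scheme used inside UMINF; it is a plain forward difference with an assumed step of sqrt(machine epsilon) scaled by the size of each variable, applied to the Rosenbrock function used in the examples of this chapter. The names fd_gradient_sketch and fdgrad are hypothetical.

```fortran
program fd_gradient_sketch
  implicit none
  integer, parameter :: n = 2
  real :: x(n), gfd(n), gex(n)

  x = (/ -1.2e0, 1.0e0 /)

  ! Forward-difference estimate in single precision
  call fdgrad(n, x, gfd)

  ! Exact gradient of the Rosenbrock function for comparison
  gex(1) = -4.0e2*(x(2)-x(1)*x(1))*x(1) - 2.0e0*(1.0e0-x(1))
  gex(2) = 2.0e2*(x(2)-x(1)*x(1))

  print *, 'finite-difference gradient:', gfd
  print *, 'exact gradient:            ', gex
  print *, 'absolute error:            ', abs(gfd - gex)

contains

  subroutine rosbrk(n, x, f)
    integer, intent(in) :: n
    real, intent(in) :: x(n)
    real, intent(out) :: f
    f = 1.0e2*(x(2)-x(1)*x(1))**2 + (1.0e0-x(1))**2
  end subroutine rosbrk

  ! Forward differences: g(i) ~ (f(x + h*e_i) - f(x)) / h
  subroutine fdgrad(n, x, g)
    integer, intent(in) :: n
    real, intent(in) :: x(n)
    real, intent(out) :: g(n)
    real :: xp(n), f0, f1, h
    integer :: i
    call rosbrk(n, x, f0)
    do i = 1, n
       ! Assumed step size: sqrt(eps) times a typical magnitude of x(i)
       h = sqrt(epsilon(1.0e0)) * max(abs(x(i)), 1.0e0)
       xp = x
       xp(i) = x(i) + h
       h = xp(i) - x(i)          ! step actually taken after rounding
       call rosbrk(n, xp, f1)
       g(i) = (f1 - f0) / h
    end do
  end subroutine fdgrad

end program fd_gradient_sketch
```

Because the step is of order sqrt(eps), only about half of the working digits survive in the estimate, which is why a tight gradient tolerance may not be attainable in single precision.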
Comments
1.
2. Informational errors
   Type  Code
   4     2
   4     3
   4     4
   4     5
   4     6
3. The first stopping criterion for UMINF occurs when the infinity norm of the scaled
   gradient is less than the given gradient tolerance (RPARAM(1)). The second stopping
   criterion for UMINF occurs when the scaled distance between the last two steps is less
   than the step tolerance (RPARAM(2)).
4. If the default parameters are desired for UMINF, then set IPARAM(1) to zero and call the
   routine UMINF. Otherwise, if any nondefault parameters are desired for IPARAM or
   RPARAM, then the following steps should be taken before calling UMINF:
CALL U4INF (IPARAM, RPARAM)
IPARAM(3) = Maximum number of iterations.
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of gradient evaluations.
Default: 400.
IPARAM(6) = Hessian initialization parameter.
If IPARAM(6) = 0, the Hessian is initialized to the identity matrix;
$\max\left(|f(t)|, f_s\right) s_i^2$
$\max\left(|f(x)|, f_s\right)$
Default: $\sqrt{\varepsilon}$, $\sqrt[3]{\varepsilon}$ in double where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
$$\frac{|x_i - y_i|}{\max\left(|x_i|, 1/s_i\right)}$$
where s = XSCALE.
Default: $\varepsilon^{2/3}$ where $\varepsilon$ is the machine precision.
RPARAM(3) = Relative function tolerance.
$$\varepsilon_1 = \sqrt{\sum_{i=1}^{n} (s_i t_i)^2}$$
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
Example
The function
$$f(x) = 100\left(x_2 - x_1^2\right)^2 + (1 - x_1)^2$$
is minimized.
USE UMINF_INT
USE U4INF_INT
USE UMACH_INT

IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=2)
!
INTEGER    IPARAM(7), L, NOUT
REAL       F, RPARAM(7), X(N), XGUESS(N), &
           XSCALE(N)
EXTERNAL   ROSBRK
!
DATA XGUESS/-1.2E0, 1.0E0/
!
!
!
!
!
!
!
99999 FORMAT (' The solution is ', 6X, 2F8.3, //, ' The function ', &
'value is ', F8.3, //, ' The number of iterations is ', &
10X, I3, /, ' The number of function evaluations is ', &
I3, /, ' The number of gradient evaluations is ', I3)
!
END
!
SUBROUTINE ROSBRK (N, X, F)
INTEGER    N
REAL       X(N), F
!
F = 1.0E2*(X(2)-X(1)*X(1))**2 + (1.0E0-X(1))**2
!
RETURN
END
Output
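The solution is                         1.000   1.000
The function value is                   0.000
The number of iterations is             15
The number of function evaluations is   40
The number of gradient evaluations is   19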
UMING
Minimizes a function of N variables using a quasi-Newton method and a user-supplied gradient.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Length of X and G. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by GRAD.
G The gradient evaluated at the point X. (Output)
GRAD must be declared EXTERNAL in the calling program.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess of the minimum. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. In
the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
FSCALE Scalar containing the function scaling. (Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
FSCALE to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 7. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UMING uses a quasi-Newton method to find the minimum of a function f(x) of n
variables. Function values and first derivatives are required. The problem is stated as follows:
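$$\min_{x \in \mathbb{R}^n} f(x)$$
Given a starting point $x_c$, the search direction is computed according to the formula
$$d = -B^{-1} g_c$$
where B is a positive definite approximation of the Hessian and $g_c$ is the gradient evaluated at $x_c$.
A line search is then used to find a new point
$$x_n = x_c + \lambda d, \quad \lambda > 0$$
such that
$$f(x_n) \le f(x_c) + \alpha g_c^T d, \quad \alpha \in (0, 0.5)$$
The Hessian approximation B is then updated according to the BFGS formula
$$B \leftarrow B - \frac{B s s^T B}{s^T B s} + \frac{y y^T}{y^T s}$$
where $s = x_n - x_c$ and $y = g_n - g_c$. Another search direction is then computed to begin the next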
iteration. For more details, see Dennis and Schnabel (1983, Appendix A).
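The iteration described above (search direction, line search, BFGS update) can be sketched for the two-variable Rosenbrock function. This is a simplified illustration and not the UMING algorithm: the backtracking rule that halves the step, the value of alpha, the iteration limit, the gradient tolerance, the 2 x 2 solve by Cramer's rule, and the rule that skips the update when y'*s is not positive are all assumptions of the sketch. The sufficient-decrease test is applied to the actual step, that is, with the factor lambda included.

```fortran
program bfgs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 2
  real(dp), parameter :: alpha = 1.0e-4_dp    ! assumed value in (0, 0.5)
  real(dp) :: b(n,n), x(n), xn(n), g(n), gn(n), d(n), s(n), y(n), bs(n)
  real(dp) :: f, fn, lambda, det, ys, sbs
  integer :: iter, i, j

  x = (/ -1.2_dp, 1.0_dp /)
  b = 0.0_dp
  do i = 1, n
     b(i,i) = 1.0_dp                          ! B starts as the identity
  end do
  call rosbrk(x, f)
  call rosgrd(x, g)

  do iter = 1, 200
     if (maxval(abs(g)) < 1.0e-8_dp) exit
     ! Search direction d = -B**(-1) g, by Cramer's rule (n = 2 only)
     det = b(1,1)*b(2,2) - b(1,2)*b(2,1)
     d(1) = -(b(2,2)*g(1) - b(1,2)*g(2)) / det
     d(2) = -(b(1,1)*g(2) - b(2,1)*g(1)) / det
     ! Backtracking line search with a sufficient-decrease test
     lambda = 1.0_dp
     do
        xn = x + lambda*d
        call rosbrk(xn, fn)
        if (fn <= f + alpha*lambda*dot_product(g, d)) exit
        if (lambda < 1.0e-12_dp) exit
        lambda = 0.5_dp*lambda
     end do
     call rosgrd(xn, gn)
     ! BFGS update: B <- B - (B s s' B)/(s' B s) + (y y')/(y' s)
     s = xn - x
     y = gn - g
     bs = matmul(b, s)
     sbs = dot_product(s, bs)
     ys = dot_product(y, s)
     if (ys > 1.0e-12_dp .and. sbs > 0.0_dp) then
        do j = 1, n
           do i = 1, n
              b(i,j) = b(i,j) - bs(i)*bs(j)/sbs + y(i)*y(j)/ys
           end do
        end do
     end if
     x = xn
     f = fn
     g = gn
  end do

  print '(A,2F10.5)', ' x = ', x
  print '(A,ES12.4)', ' f = ', f

contains

  subroutine rosbrk(x, f)
    real(dp), intent(in) :: x(n)
    real(dp), intent(out) :: f
    f = 1.0e2_dp*(x(2)-x(1)*x(1))**2 + (1.0_dp-x(1))**2
  end subroutine rosbrk

  subroutine rosgrd(x, g)
    real(dp), intent(in) :: x(n)
    real(dp), intent(out) :: g(n)
    g(1) = -4.0e2_dp*(x(2)-x(1)*x(1))*x(1) - 2.0_dp*(1.0_dp-x(1))
    g(2) = 2.0e2_dp*(x(2)-x(1)*x(1))
  end subroutine rosgrd

end program bfgs_sketch
```

Skipping the update when y'*s is not positive keeps B positive definite, so d remains a descent direction and the backtracking loop can always find an acceptable step.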
Comments
1.
2. Informational errors
   Type  Code
   4     2
   4     3
   4     4
   4     5
   4     6
3. The first stopping criterion for UMING occurs when the infinity norm of the scaled
   gradient is less than the given gradient tolerance (RPARAM(1)). The second stopping
   criterion for UMING occurs when the scaled distance between the last two steps is less
   than the step tolerance (RPARAM(2)).
4. If the default parameters are desired for UMING, then set IPARAM(1) to zero and call
   routine UMING. Otherwise, if any nondefault parameters are desired for IPARAM or
   RPARAM, then the following steps should be taken before calling UMING:
CALL U4INF (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to U4INF will set IPARAM and RPARAM to their default values so only
nondefault values need to be set above.
The following is a list of the parameters and the default values:
IPARAM Integer vector of length 7.
IPARAM(1) = Initialization flag.
IPARAM(2) = Number of good digits in the function.
IPARAM(3) = Maximum number of iterations.
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of gradient evaluations.
Default: 400.
IPARAM(6) = Hessian initialization parameter
If IPARAM(6) = 0, the Hessian is initialized to the identity matrix;
$\max\left(|f(t)|, f_s\right) s_i^2$
$\max\left(|f(x)|, f_s\right)$
Default: $\sqrt{\varepsilon}$, $\sqrt[3]{\varepsilon}$ in double where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
$$\frac{|x_i - y_i|}{\max\left(|x_i|, 1/s_i\right)}$$
where s = XSCALE.
Default: $\varepsilon^{2/3}$ where $\varepsilon$ is the machine precision.
$$\varepsilon_1 = \sqrt{\sum_{i=1}^{n} (s_i t_i)^2}$$
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
Example
The function
$$f(x) = 100\left(x_2 - x_1^2\right)^2 + (1 - x_1)^2$$
is minimized.
IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=2)
!
INTEGER    IPARAM(7), L, NOUT
REAL       F, X(N), XGUESS(N)
EXTERNAL   ROSBRK, ROSGRD
!
DATA XGUESS/-1.2E0, 1.0E0/
!
IPARAM(1) = 0
!
!
!
99999 FORMAT (' The solution is ', 6X, 2F8.3, //, ' The function ', &
'value is ', F8.3, //, ' The number of iterations is ', &
10X, I3, /, ' The number of function evaluations is ', &
I3, /, ' The number of gradient evaluations is ', I3)
!
END
!
SUBROUTINE ROSBRK (N, X, F)
INTEGER    N
REAL       X(N), F
!
F = 1.0E2*(X(2)-X(1)*X(1))**2 + (1.0E0-X(1))**2
!
RETURN
END
!
SUBROUTINE ROSGRD (N, X, G)
INTEGER    N
REAL       X(N), G(N)
!
G(1) = -4.0E2*(X(2)-X(1)*X(1))*X(1) - 2.0E0*(1.0E0-X(1))
G(2) = 2.0E2*(X(2)-X(1)*X(1))
!
RETURN
END
Output
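The solution is                         1.000   1.000
The function value is                   0.000
The number of iterations is             18
The number of function evaluations is   31
The number of gradient evaluations is   22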
UMIDH
Minimizes a function of N variables using a modified Newton method and a finite-difference
Hessian.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Length of X and G. (Input)
X The point at which the gradient is evaluated. (Input)
X should not be changed by GRAD.
G The gradient evaluated at the point X. (Output)
GRAD must be declared EXTERNAL in the calling program.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing initial guess. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. In
the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
FSCALE Scalar containing the function scaling. (Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
FSCALE to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 7. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 7. (Input/Output)
See Comment 4.
FVALUE Scalar containing the value of the function at the computed solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UMIDH uses a modified Newton method to find the minimum of a function f (x) of n
variables. First derivatives must be provided by the user. The algorithm computes an optimal
locally constrained step (Gay 1981) with a trust region restriction on the step. It handles the case
that the Hessian is indefinite and provides a way to deal with negative curvature. For more details,
see Dennis and Schnabel (1983, Appendix A) and Gay (1983).
Since a finite-difference method is used to estimate the Hessian, for some single precision
calculations, an inaccurate estimate of the Hessian may cause the algorithm to terminate at a
noncritical point. In such cases, high precision arithmetic is recommended. Also, whenever the
exact Hessian can be easily provided, IMSL routine UMIAH should be used instead.
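A finite-difference Hessian of the kind mentioned above can be built from gradient values alone. The sketch below is not the differencing scheme used by UMIDH; it forms each column by a forward difference of the user gradient with an assumed step of sqrt(machine epsilon) scaled by the variable, then symmetrizes the result. The names fd_hessian_sketch and fdhess are hypothetical. For the Rosenbrock function at (-1.2, 1.0) the exact entries are H(1,1) = 1330, H(1,2) = H(2,1) = 480, and H(2,2) = 200, which gives a reference for judging the estimate.

```fortran
program fd_hessian_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 2
  real(dp) :: x(n), h(n,n)
  integer :: i

  x = (/ -1.2_dp, 1.0_dp /)
  call fdhess(x, h)

  print *, 'Finite-difference Hessian estimate:'
  do i = 1, n
     print '(2F14.5)', h(i,:)
  end do

contains

  ! Analytic gradient of the Rosenbrock function
  subroutine rosgrd(x, g)
    real(dp), intent(in) :: x(n)
    real(dp), intent(out) :: g(n)
    g(1) = -4.0e2_dp*(x(2)-x(1)*x(1))*x(1) - 2.0_dp*(1.0_dp-x(1))
    g(2) = 2.0e2_dp*(x(2)-x(1)*x(1))
  end subroutine rosgrd

  ! Column j of H ~ (g(x + step*e_j) - g(x)) / step
  subroutine fdhess(x, h)
    real(dp), intent(in) :: x(n)
    real(dp), intent(out) :: h(n,n)
    real(dp) :: xp(n), g0(n), g1(n), step
    integer :: j
    call rosgrd(x, g0)
    do j = 1, n
       ! Assumed step size: sqrt(eps) times a typical magnitude of x(j)
       step = sqrt(epsilon(1.0_dp)) * max(abs(x(j)), 1.0_dp)
       xp = x
       xp(j) = x(j) + step
       call rosgrd(xp, g1)
       h(:,j) = (g1 - g0) / (xp(j) - x(j))
    end do
    ! Average with the transpose so the estimate is symmetric
    h = 0.5_dp * (h + transpose(h))
  end subroutine fdhess

end program fd_hessian_sketch
```

Repeating the experiment with default REAL variables shows how few digits of the Hessian survive in single precision, which is the situation the paragraph above warns about.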
Comments
1.
2. Informational errors
   Type  Code
   3     1     Both the actual and predicted relative reductions in the function are
               less than or equal to the relative function convergence tolerance.
   4     2     The iterates appear to be converging to a noncritical point.
   4     3     Maximum number of iterations exceeded.
   4     4     Maximum number of function evaluations exceeded.
   4     5
   4     6
   4     7
   3     8
3. The first stopping criterion for UMIDH occurs when the norm of the gradient is less than
   the given gradient tolerance (RPARAM(1)). The second stopping criterion for UMIDH
   occurs when the scaled distance between the last two steps is less than the step
   tolerance (RPARAM(2)).
4. If the default parameters are desired for UMIDH, then set IPARAM(1) to zero and call
   routine UMIDH. Otherwise, if any nondefault parameters are desired for IPARAM or
   RPARAM, then the following steps should be taken before calling UMIDH:
CALL U4INF (IPARAM, RPARAM)
IPARAM(3) = Maximum number of iterations.
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of gradient evaluations.
Default: 400.
IPARAM(6) = Hessian initialization parameter
IPARAM(7) = Maximum number of Hessian evaluations.
Default: 100.
$\max\left(|f(x)|, f_s\right)$
Default: $\sqrt{\varepsilon}$, $\sqrt[3]{\varepsilon}$ in double where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
$$\frac{|x_i - y_i|}{\max\left(|x_i|, 1/s_i\right)}$$
where s = XSCALE.
Default: $\varepsilon^{2/3}$ where $\varepsilon$ is the machine precision.
RPARAM(3) = Relative function tolerance.
$$\varepsilon_1 = \sqrt{\sum_{i=1}^{n} (s_i t_i)^2}$$
If double precision is required, then DU4INF is called, and RPARAM is declared double
precision.
5. Users wishing to override the default print/stop attributes associated with error
   messages issued by this routine are referred to Error Handling in the Introduction.
Example
The function
$$f(x) = 100\left(x_2 - x_1^2\right)^2 + (1 - x_1)^2$$
is minimized.
USE UMIDH_INT
USE UMACH_INT

IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=2)
!
INTEGER    IPARAM(7), L, NOUT
REAL       F, X(N), XGUESS(N)
EXTERNAL   ROSBRK, ROSGRD
!
DATA XGUESS/-1.2E0, 1.0E0/
!
IPARAM(1) = 0
!                                 Minimize Rosenbrock function using
!                                 initial guesses of -1.2 and 1.0
CALL UMIDH (ROSBRK, ROSGRD, X, XGUESS=XGUESS, IPARAM=IPARAM, FVALUE=F)
!                                 Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) X, F, (IPARAM(L),L=3,5), IPARAM(7)
!
99999 FORMAT (' The solution is ', 6X, 2F8.3, //, ' The function ', &
'value is ', F8.3, //, ' The number of iterations is ', &
10X, I3, /, ' The number of function evaluations is ', &
I3, /, ' The number of gradient evaluations is ', I3, /, &
' The number of Hessian evaluations is ', I3)
!
END
!
SUBROUTINE ROSBRK (N, X, F)
INTEGER    N
REAL       X(N), F
!
F = 1.0E2*(X(2)-X(1)*X(1))**2 + (1.0E0-X(1))**2
!
RETURN
END
!
SUBROUTINE ROSGRD (N, X, G)
INTEGER    N
REAL       X(N), G(N)
!
G(1) = -4.0E2*(X(2)-X(1)*X(1))*X(1) - 2.0E0*(1.0E0-X(1))
G(2) = 2.0E2*(X(2)-X(1)*X(1))
!
RETURN
END
Output
The solution is                         1.000   1.000
The function value is                   0.000
The number of iterations is             21
The number of function evaluations is   30
The number of gradient evaluations is   22
The number of Hessian evaluations is    21
UMIAH
Minimizes a function of N variables using a modified Newton method and a user-supplied
Hessian.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Length of X and G. (Input)
X Vector of length N at which point the gradient is evaluated. (Input)
X should not be changed by GRAD.
G The gradient evaluated at the point X. (Output)
GRAD must be declared EXTERNAL in the calling program.
HESS User-supplied subroutine to compute the Hessian at the point X. The usage is
CALL HESS (N, X, H, LDH), where
N Length of X. (Input)
X Vector of length N at which point the Hessian is evaluated. (Input)
X should not be changed by HESS.
H The Hessian evaluated at the point X. (Output)
LDH Leading dimension of H exactly as specified in the dimension statement of the
calling program. (Input)
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing initial guess. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. In
the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
FSCALE Scalar containing the function scaling. (Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
FSCALE to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 7. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 7. (Input/Output)
See Comment 4.
FVALUE Scalar containing the value of the function at the computed solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UMIAH uses a modified Newton method to find the minimum of a function f(x) of n
variables. First and second derivatives must be provided by the user. The algorithm computes an
optimal locally constrained step (Gay 1981) with a trust region restriction on the step. This
algorithm handles the case where the Hessian is indefinite and provides a way to deal with
negative curvature. For more details, see Dennis and Schnabel (1983, Appendix A) and Gay
(1983).
Comments
1.
2. Informational errors
   Type  Code
   3     1     Both the actual and predicted relative reductions in the function are
               less than or equal to the relative function convergence tolerance.
   4     2     The iterates appear to be converging to a noncritical point.
   4     3     Maximum number of iterations exceeded.
   4     4     Maximum number of function evaluations exceeded.
   4     5     Maximum number of gradient evaluations exceeded.
   4     6     Five consecutive steps have been taken with the maximum step
               length.
   4     7
   3     8
3. The first stopping criterion for UMIAH occurs when the norm of the gradient is less than
   the given gradient tolerance (RPARAM(1)). The second stopping criterion for UMIAH
   occurs when the scaled distance between the last two steps is less than the step
   tolerance (RPARAM(2)).
4. If the default parameters are desired for UMIAH, then set IPARAM(1) to zero and call the
   routine UMIAH. Otherwise, if any nondefault parameters are desired for IPARAM or
   RPARAM, then the following steps should be taken before calling UMIAH:
CALL U4INF (IPARAM, RPARAM)
IPARAM(3) = Maximum number of iterations.
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of gradient evaluations.
Default: 400.
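IPARAM(6) = Hessian initialization parameter
IPARAM(7) = Maximum number of Hessian evaluations.
Default: 100.
RPARAM Real vector of length 7.
$\max\left(|f(x)|, f_s\right)$
Default: $\sqrt{\varepsilon}$, $\sqrt[3]{\varepsilon}$ in double where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
$$\frac{|x_i - y_i|}{\max\left(|x_i|, 1/s_i\right)}$$
where s = XSCALE.
Default: $\varepsilon^{2/3}$ where $\varepsilon$ is the machine precision.
RPARAM(3) = Relative function tolerance.
$$\varepsilon_1 = \sqrt{\sum_{i=1}^{n} (s_i t_i)^2}$$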
5. Users wishing to override the default print/stop attributes associated with error
   messages issued by this routine are referred to Error Handling in the Introduction.
Example
The function
$$f(x) = 100\left(x_2 - x_1^2\right)^2 + (1 - x_1)^2$$
is minimized.
IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=2)
!
INTEGER    IPARAM(7), L, NOUT
REAL       F, FSCALE, RPARAM(7), X(N), &
           XGUESS(N), XSCALE(N)
!
EXTERNAL   ROSBRK, ROSGRD, ROSHES
!
99999 FORMAT (' The solution is ', 6X, 2F8.3, //, ' The function ', &
'value is ', F8.3, //, ' The number of iterations is ', &
10X, I3, /, ' The number of function evaluations is ', &
I3, /, ' The number of gradient evaluations is ', I3, /, &
' The number of Hessian evaluations is ', I3)
!
END
!
SUBROUTINE ROSBRK (N, X, F)
INTEGER    N
REAL       X(N), F
!
F = 1.0E2*(X(2)-X(1)*X(1))**2 + (1.0E0-X(1))**2
!
RETURN
END
!
SUBROUTINE ROSGRD (N, X, G)
INTEGER    N
REAL       X(N), G(N)
!
G(1) = -4.0E2*(X(2)-X(1)*X(1))*X(1) - 2.0E0*(1.0E0-X(1))
G(2) = 2.0E2*(X(2)-X(1)*X(1))
!
RETURN
END
!
SUBROUTINE ROSHES (N, X, H, LDH)
INTEGER    N, LDH
REAL       X(N), H(LDH,N)
!
H(1,1) = -4.0E2*X(2) + 1.2E3*X(1)*X(1) + 2.0E0
H(2,1) = -4.0E2*X(1)
H(1,2) = H(2,1)
H(2,2) = 2.0E2
!
RETURN
END
Output
The solution is                         1.000   1.000
The function value is                   0.000
The number of iterations is             21
The number of function evaluations is   31
The number of gradient evaluations is   22
The number of Hessian evaluations is    21
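For reference, the derivatives that user routines such as ROSGRD and ROSHES must return for this example follow directly from the definition of f. The derivation below is a worked check rather than part of the routine's documentation; the symbols g and H are used here for the gradient and Hessian.

```latex
\begin{aligned}
f(x) &= 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2 \\[4pt]
g(x) &= \begin{pmatrix}
  -400\,x_1\,(x_2 - x_1^2) - 2\,(1 - x_1) \\
  200\,(x_2 - x_1^2)
\end{pmatrix} \\[4pt]
H(x) &= \begin{pmatrix}
  -400\,x_2 + 1200\,x_1^2 + 2 & -400\,x_1 \\
  -400\,x_1 & 200
\end{pmatrix} \\[4pt]
H(1,1) &= \begin{pmatrix} 802 & -400 \\ -400 & 200 \end{pmatrix},
\qquad \det H(1,1) = 802 \cdot 200 - 400^2 = 400 > 0
\end{aligned}
```

At the minimizer (1, 1) the gradient vanishes and the Hessian is positive definite, consistent with the solution reported in the output.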
UMCGF
Minimizes a function of N variables using a conjugate gradient algorithm and a finite-difference
gradient.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess of the minimum. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
Default: XSCALE = 1.0.
GRADTL Convergence criterion. (Input)
The calculation ends when the sum of squares of the components of G is less than
GRADTL.
Default: GRADTL = 1.e-4.
MAXFN Maximum number of function evaluations. (Input)
If MAXFN is set to zero, then no restriction on the number of function evaluations is set.
Default: MAXFN = 0.
G Vector of length N containing the components of the gradient at the final parameter
estimates. (Output)
FVALUE Scalar containing the value of the function at the computed solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UMCGF uses a conjugate gradient method to find the minimum of a function f (x) of n
variables. Only function values are required.
The routine is based on the version of the conjugate gradient algorithm described in Powell
(1977). The main advantage of the conjugate gradient technique is that it provides a fast rate of
convergence without the storage of any matrices. Therefore, it is particularly suitable for
unconstrained minimization calculations where the number of variables is so large that matrices of
dimension n cannot be stored in the main memory of the computer. For smaller problems,
however, a routine such as UMINF is usually more efficient because each iteration makes
use of additional information from previous iterations.
Since a finite-difference method is used to estimate the gradient, for some single precision
calculations, an inaccurate estimate of the gradient may cause the algorithm to terminate at a
noncritical point. In such cases, high precision arithmetic is recommended. Also, whenever the
exact gradient can be easily provided, routine UMCGG should be used instead.
Comments
1.
2. Informational errors
   Type  Code
   4     1     The line search of an iteration was abandoned. This error may be
               caused by an error in gradient.
   4     2     The calculation cannot continue because the search is uphill.
   4     3     The iteration was terminated because MAXFN was exceeded.
   3     4     The calculation was terminated because two consecutive iterations
               failed to reduce the function.
3. Because of the close relation between the conjugate-gradient method and the method of
   steepest descent, it is very helpful to choose the scale of the variables in a way that
   balances the magnitudes of the components of a typical gradient vector. It can be
   particularly inefficient if a few components of the gradient are much larger than the
   rest.
4. If the value of the parameter GRADTL in the argument list of the routine is set to zero,
   then the subroutine will continue its calculation until it stops reducing the objective
   function. In this case, the usual behavior is that changes in the objective function
   become dominated by computer rounding errors before precision is lost in the gradient
   vector. Therefore, because the point of view has been taken that the user requires the
   least possible value of the function, a value of the objective function that is small due
   to computer rounding errors can prevent further progress. Hence, the precision in the
   final values of the variables may be only about half the number of significant digits in
   the computer arithmetic, but the least value of FVALUE is usually found to be quite
   accurate.
Example
The function
$$f(x) = 100\left(x_2 - x_1^2\right)^2 + (1 - x_1)^2$$
is minimized.
IMPLICIT   NONE
!                                 Declaration of variables
INTEGER    N
PARAMETER  (N=2)
!
INTEGER    I, MAXFN, NOUT
REAL       DFPRED, FVALUE, G(N), GRADTL, X(N), XGUESS(N)
EXTERNAL   ROSBRK
!
!
RETURN
END
Output
The solution is
0.999
0.998
-0.001
0.000
0.000
UMCGG
Minimizes a function of N variables using a conjugate gradient algorithm and a user-supplied
gradient.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Length of X and G. (Input)
X The point at which the gradient is evaluated. (Input)
X should not be changed by GRAD.
G The gradient evaluated at the point X. (Output)
GRAD must be declared EXTERNAL in the calling program.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess of the minimum. (Input)
Default: XGUESS = 0.0.
GRADTL Convergence criterion. (Input)
The calculation ends when the sum of squares of the components of G is less than
GRADTL.
Default: GRADTL = 1.e-4.
MAXFN Maximum number of function evaluations. (Input)
Default: MAXFN = 100.
G Vector of length N containing the components of the gradient at the final parameter
estimates. (Output)
FVALUE Scalar containing the value of the function at the computed solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UMCGG uses a conjugate gradient method to find the minimum of a function f (x) of n
variables. Function values and first derivatives are required.
The routine is based on the version of the conjugate gradient algorithm described in Powell
(1977). The main advantage of the conjugate gradient technique is that it provides a fast rate of
convergence without the storage of any matrices. Therefore, it is particularly suitable for
unconstrained minimization calculations where the number of variables is so large that matrices of
dimension n cannot be stored in the main memory of the computer. For smaller problems,
however, a routine such as IMSL routine UMING is usually more efficient because each
iteration makes use of additional information from previous iterations.
Comments
1.
2. Informational errors
   Type  Code
   4     1     The line search of an iteration was abandoned. This error may be
               caused by an error in gradient.
   4     2     The calculation cannot continue because the search is uphill.
   4     3     The iteration was terminated because MAXFN was exceeded.
   3     4     The calculation was terminated because two consecutive iterations
               failed to reduce the function.
3. The routine includes no thorough checks on the part of the user program that calculates
   the derivatives of the objective function. Therefore, because derivative calculation is a
   frequent source of error, the user should verify independently the correctness of the
   derivatives that are given to the routine.
4. Because of the close relation between the conjugate-gradient method and the method of
   steepest descent, it is very helpful to choose the scale of the variables in a way that
   balances the magnitudes of the components of a typical gradient vector. It can be
   particularly inefficient if a few components of the gradient are much larger than the
   rest.
5. If the value of the parameter GRADTL in the argument list of the routine is set to zero,
   then the subroutine will continue its calculation until it stops reducing the objective
   function. In this case, the usual behavior is that changes in the objective function
   become dominated by computer rounding errors before precision is lost in the gradient
   vector. Therefore, because the point of view has been taken that the user requires the
   least possible value of the function, a value of the objective function that is small due
   to computer rounding errors can prevent further progress. Hence, the precision in the
   final values of the variables may be only about half the number of significant digits in
   the computer arithmetic, but the least value of FVALUE is usually found to be quite
   accurate.
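Comment 3 recommends verifying user-supplied derivatives independently. One simple way, sketched below, is to compare each gradient component with a central difference of the function before calling the minimizer. This is an illustration only and not an IMSL facility: the program name, the assumed step of (machine epsilon)**(1/3), and the relative error measure are choices made for the sketch. ROSBRK and ROSGRD have the same argument lists as in the example programs, written here as internal procedures in double precision.

```fortran
program check_gradient_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 2
  real(dp) :: x(n), g(n), xp(n), xm(n), fp, fm, h, gcd, relerr
  integer :: i

  x = (/ -1.2_dp, 1.0_dp /)
  call rosgrd(n, x, g)

  print *, ' comp     supplied       central diff.   rel. error'
  do i = 1, n
     ! Assumed step size for central differences: eps**(1/3)
     h = epsilon(1.0_dp)**(1.0_dp/3.0_dp) * max(abs(x(i)), 1.0_dp)
     xp = x
     xp(i) = x(i) + h
     xm = x
     xm(i) = x(i) - h
     call rosbrk(n, xp, fp)
     call rosbrk(n, xm, fm)
     gcd = (fp - fm) / (2.0_dp*h)
     relerr = abs(gcd - g(i)) / max(abs(g(i)), 1.0_dp)
     print '(I5,3ES16.6)', i, g(i), gcd, relerr
  end do

contains

  subroutine rosbrk(n, x, f)
    integer, intent(in) :: n
    real(dp), intent(in) :: x(n)
    real(dp), intent(out) :: f
    f = 1.0e2_dp*(x(2)-x(1)*x(1))**2 + (1.0_dp-x(1))**2
  end subroutine rosbrk

  subroutine rosgrd(n, x, g)
    integer, intent(in) :: n
    real(dp), intent(in) :: x(n)
    real(dp), intent(out) :: g(n)
    g(1) = -4.0e2_dp*(x(2)-x(1)*x(1))*x(1) - 2.0_dp*(1.0_dp-x(1))
    g(2) = 2.0e2_dp*(x(2)-x(1)*x(1))
  end subroutine rosgrd

end program check_gradient_sketch
```

A relative error far above the accuracy expected from the differencing step, at more than one test point, usually indicates a coding error in the gradient routine.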
Example
The function
$$f(x) = 100\left(x_2 - x_1^2\right)^2 + (1 - x_1)^2$$
is minimized.
IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=2)
!                                 Declaration of variables
INTEGER    I, NOUT
REAL       DFPRED, FVALUE, G(N), GRADTL, X(N), &
           XGUESS(N)
EXTERNAL   ROSBRK, ROSGRD
!
DATA XGUESS/-1.2E0, 1.0E0/
!
DFPRED = 0.2
GRADTL = 1.0E-7
!
!
G(1) = -4.0E2*(X(2)-X(1)*X(1))*X(1) - 2.0E0*(1.0E0-X(1))
G(2) = 2.0E2*(X(2)-X(1)*X(1))
!
RETURN
END
!
SUBROUTINE ROSBRK (N, X, F)
INTEGER    N
REAL       X(N), F
!
F = 1.0E2*(X(2)-X(1)*X(1))**2 + (1.0E0-X(1))**2
RETURN
END
!
SUBROUTINE ROSGRD (N, X, G)
INTEGER    N
REAL       X(N), G(N)
!
G(1) = -4.0E2*(X(2)-X(1)*X(1))*X(1) - 2.0E0*(1.0E0-X(1))
G(2) = 2.0E2*(X(2)-X(1)*X(1))
!
RETURN
END
Output
The solution is
1.000
1.000
0.000
0.000
-0.000
UMPOL
Minimizes a function of N variables using a direct search polytope algorithm.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
X Real vector of length N containing the best estimate of the minimum found. (Output)
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Real vector of length N which contains an initial guess to the minimum. (Input)
Default: XGUESS = 0.0.
S On input, real scalar containing the length of each side of the initial simplex.
(Input/Output)
If no reasonable information about S is known, S could be set to a number less than or
equal to zero and UMPOL will generate the starting simplex from the initial guess with a
random number generator. On output, the average distance from the vertices to the
centroid that is taken to be the solution; see Comment 4.
Default: S = 0.0.
FTOL First convergence criterion. (Input)
The algorithm stops when a relative error in the function values is less than FTOL, i.e.
when (F(worst) - F(best)) < FTOL * (1 + ABS(F(best))) where F(worst) and F(best) are
the function values of the current worst and best points, respectively. Second
convergence criterion. The algorithm stops when the standard deviation of the function
values at the N + 1 current points is less than FTOL. If the subroutine terminates
prematurely, try again with a smaller value for FTOL.
Default: FTOL = 1.e-7.
MAXFCN On input, maximum allowed number of function evaluations. (Input/Output)
On output, actual number of function evaluations needed.
Default: MAXFCN = 200.
FVALUE Function value at the computed solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UMPOL uses the polytope algorithm to find a minimum point of a function f(x) of n
variables. The polytope method is based on function comparison; no smoothness is assumed. It
starts with n + 1 points x1, x2, ..., xn+1. At each iteration, a new point is generated to replace the
worst point xj, which has the largest function value among these n + 1 points. The new point is
constructed by the following formula:
$$x_k = c + \alpha (c - x_j)$$
where
$$c = \frac{1}{n} \sum_{i \neq j} x_i$$
and $\alpha > 0$ is the reflection coefficient.
Criterion 2:
$$\sum_{i=1}^{n+1} \left( f_i - \frac{\sum_{j=1}^{n+1} f_j}{n+1} \right)^2 \le \varepsilon_f$$
where fi = f(xi), fj = f(xj), and $\varepsilon_f$ is a given tolerance. For a complete description, see Nelder and
Mead (1965) or Gill et al. (1981).
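The reflection formula above can be illustrated with a minimal sketch. It is not the UMPOL implementation: the starting simplex, the value alpha = 1.0, and the names polytope_reflection and rosenbrock are assumptions chosen for illustration. The program performs one reflection of the worst vertex through the centroid of the remaining vertices and evaluates the left-hand side of Criterion 2.

```fortran
program polytope_reflection
  implicit none
  integer, parameter :: n = 2
  real, parameter :: alpha = 1.0       ! assumed reflection coefficient
  real :: v(n, n+1), f(n+1), c(n), xk(n), fk, fmean, crit2
  integer :: i, j

  ! Hypothetical simplex, one vertex per column
  v(:, 1) = [-1.2, 1.0]
  v(:, 2) = [-1.0, 1.0]
  v(:, 3) = [-1.2, 0.8]
  do i = 1, n + 1
     f(i) = rosenbrock(v(:, i))
  end do

  j  = maxloc(f, dim=1)                      ! index of the worst vertex
  c  = (sum(v, dim=2) - v(:, j)) / real(n)   ! centroid of the other n vertices
  xk = c + alpha*(c - v(:, j))               ! reflected point
  fk = rosenbrock(xk)

  ! Left-hand side of Criterion 2 for the current vertices
  fmean = sum(f) / real(n + 1)
  crit2 = sum((f - fmean)**2)

  print *, 'worst vertex index and value: ', j, f(j)
  print *, 'reflected point:              ', xk
  print *, 'value at reflected point:     ', fk
  print *, 'Criterion 2 left-hand side:   ', crit2

contains

  pure function rosenbrock(x) result(fx)
    real, intent(in) :: x(:)
    real :: fx
    fx = 100.0*(x(2) - x(1)**2)**2 + (1.0 - x(1))**2
  end function rosenbrock

end program polytope_reflection
```

For this simplex the third vertex is the worst, and its reflection lands near (-1.0, 1.2) with a smaller function value, so it would replace the worst vertex. A full polytope method also includes expansion and contraction steps, which this sketch omits.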
Comments
1.
2.
Informational error
Type Code
4    1    Maximum number of function evaluations exceeded.
3.
Since UMPOL uses only function-value information at each step to determine a new
approximate minimum, it can be quite inefficient on smooth problems compared to
other methods, such as those implemented in routine UMINF, that take
derivative information into account at each iteration. Hence, routine UMPOL should be used only as a
last resort. Briefly, a set of N + 1 points in an N-dimensional space is called a simplex.
The minimization process iterates by replacing the point with the largest function value
by a new point with a smaller function value. The iteration continues until all the points
cluster sufficiently close to a minimum.
4.
The value returned in S is useful for assessing the flatness of the function near the
computed minimum. The larger its value for a given value of FTOL, the flatter the
function tends to be in the neighborhood of the returned point.
Example
The function
$$f(x) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$$
IMPLICIT   NONE
!                                 Variable declarations
INTEGER    N
PARAMETER  (N=2)
!
INTEGER    K, NOUT
REAL       FTOL, FVALUE, S, X(N), XGUESS(N)
EXTERNAL   FCN
!
!                                 Initializations
!                                 XGUESS = ( -1.2, 1.0)
!
DATA XGUESS/-1.2, 1.0/
!
FTOL = 1.0E-10
S    = 1.0
!
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) (X(K),K=1,N), FVALUE
99999 FORMAT (' The best estimate for the minimum value of the', /, &
' function is X = (', 2(2X,F4.2), ')', /, ' with ', &
'function value FVALUE = ', E12.6)
!
END
!                                 External function to be minimized
SUBROUTINE FCN (N, X, F)
INTEGER    N
REAL       X(N), F
!
F = 100.0*(X(1)*X(1)-X(2))**2 + (1.0-X(1))**2
RETURN
END
Output
The best estimate for the minimum value of the
function is X = ( 1.00 1.00)
with function value FVALUE = 0.502496E-10
UNLSF
Solves a nonlinear least-squares problem using a modified Levenberg-Marquardt algorithm and a
finite-difference Jacobian.
Required Arguments
FCN User-supplied subroutine to evaluate the function that defines the least-squares
problem. The usage is
CALL FCN (M, N, X, F), where
M Length of F. (Input)
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
F Vector of length M containing the function values at X. (Output)
FCN must be declared EXTERNAL in the calling program.
Optional Arguments
N Number of variables. N must be less than or equal to M. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. By
default, the values for XSCALE are set internally. See IPARAM(6) in Comment 4.
Default: XSCALE = 1.0.
FSCALE Vector of length M containing the diagonal scaling matrix for the functions.
(Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
all entries to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 7. (Input/Output)
See Comment 4.
FVEC Vector of length M containing the residuals at the approximate solution. (Output)
FJAC M by N matrix containing a finite difference approximate Jacobian at the
approximate solution. (Output)
LDFJAC Leading dimension of FJAC exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFJAC = SIZE (FJAC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UNLSF is based on the MINPACK routine LMDIF by Moré et al. (1980). It uses a
modified Levenberg-Marquardt method to solve nonlinear least squares problems. The problem is
stated as follows:
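$$\min_{x \in \mathbb{R}^n} \frac{1}{2} F(x)^T F(x) = \frac{1}{2} \sum_{i=1}^{m} f_i(x)^2$$
where m ≥ n, F : R^n → R^m, and fi(x) is the i-th component function of F(x). From a current point,
the algorithm uses the trust region approach:
$$\min_{x_n \in \mathbb{R}^n} \left\| F(x_c) + J(x_c)(x_n - x_c) \right\|_2 \quad \text{subject to} \quad \| x_n - x_c \|_2 \le \delta_c$$
to get a new point xn, which is computed as
$$x_n = x_c - \left( J(x_c)^T J(x_c) + \mu_c I \right)^{-1} J(x_c)^T F(x_c)$$
where $\mu_c = 0$ if $\delta_c \ge \| (J(x_c)^T J(x_c))^{-1} J(x_c)^T F(x_c) \|_2$ and $\mu_c > 0$ otherwise. F(xc) and J(xc) are the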
function values and the Jacobian evaluated at the current point xc. This procedure is repeated until
the stopping criteria are satisfied. For more details, see Levenberg (1944), Marquardt (1963), or
Dennis and Schnabel (1983, Chapter 10).
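As a rough illustration of the update formula above, the following sketch takes a single damped step for the two Rosenbrock residuals used in this chapter's examples. It is not the UNLSF algorithm: the damping value mu = 1.0 is an arbitrary assumption (UNLSF chooses it from the trust region radius), and the names lm_step, residuals, and jacobian are hypothetical.

```fortran
program lm_step
  implicit none
  real :: xc(2), xn(2), f(2), jac(2,2), a(2,2), g(2), step(2)
  real :: mu, det, ss_old, ss_new

  xc  = [-1.2, 1.0]
  mu  = 1.0                        ! assumed damping value, for illustration only
  f   = residuals(xc)
  jac = jacobian(xc)

  a = matmul(transpose(jac), jac)  ! J^T J
  a(1,1) = a(1,1) + mu             ! add mu*I
  a(2,2) = a(2,2) + mu
  g = matmul(transpose(jac), f)    ! J^T F

  ! Solve the 2-by-2 system a*step = g by Cramer's rule
  det     = a(1,1)*a(2,2) - a(1,2)*a(2,1)
  step(1) = (a(2,2)*g(1) - a(1,2)*g(2)) / det
  step(2) = (a(1,1)*g(2) - a(2,1)*g(1)) / det
  xn = xc - step

  ss_old = sum(f**2)
  ss_new = sum(residuals(xn)**2)
  print *, 'new point:                ', xn
  print *, 'sum of squares before:    ', ss_old
  print *, 'sum of squares after:     ', ss_new

contains

  pure function residuals(x) result(r)
    real, intent(in) :: x(2)
    real :: r(2)
    r(1) = 10.0*(x(2) - x(1)**2)
    r(2) = 1.0 - x(1)
  end function residuals

  pure function jacobian(x) result(j)
    real, intent(in) :: x(2)
    real :: j(2,2)
    j(1,1) = -20.0*x(1)
    j(2,1) = -1.0
    j(1,2) = 10.0
    j(2,2) = 0.0
  end function jacobian

end program lm_step
```

Setting mu to zero in this sketch gives the undamped Gauss-Newton step, which corresponds to the case $\mu_c = 0$ in the text.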
Since a finite-difference method is used to estimate the Jacobian, in some single-precision
calculations an inaccurate estimate of the Jacobian may cause the algorithm to terminate at a
noncritical point. In such cases, higher-precision arithmetic is recommended. Also, whenever the
exact Jacobian can be easily provided, routine UNLSJ should be used instead.
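The accuracy concern can be seen in a small single-precision experiment. The sketch below builds a forward-difference Jacobian of the Rosenbrock residuals and compares it with the analytic one. The step size sqrt(epsilon)*max(|x_j|, 1) is a common textbook choice assumed here; it is not claimed to be the rule UNLSF uses, and the names fd_jacobian and residuals are hypothetical.

```fortran
program fd_jacobian
  implicit none
  integer, parameter :: m = 2, n = 2
  real :: x(n), xh(n), f0(m), f1(m), fjac(m,n), exact(m,n), h
  integer :: j

  x  = [-1.2, 1.0]
  f0 = residuals(x)

  do j = 1, n
     h     = sqrt(epsilon(1.0)) * max(abs(x(j)), 1.0)   ! assumed step size
     xh    = x
     xh(j) = x(j) + h
     f1    = residuals(xh)
     ! Divide by the step actually taken to limit rounding error
     fjac(:, j) = (f1 - f0) / (xh(j) - x(j))
  end do

  ! Analytic Jacobian, stored column by column
  exact = reshape([-20.0*x(1), -1.0, 10.0, 0.0], [m, n])

  print *, 'finite-difference Jacobian: ', fjac
  print *, 'analytic Jacobian:          ', exact
  print *, 'largest absolute error:     ', maxval(abs(fjac - exact))

contains

  pure function residuals(x) result(r)
    real, intent(in) :: x(:)
    real :: r(2)
    r(1) = 10.0*(x(2) - x(1)**2)
    r(2) = 1.0 - x(1)
  end function residuals

end program fd_jacobian
```

In single precision only a few digits of each entry agree, which is the kind of error that motivates higher precision or a user-supplied Jacobian.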
Comments
1.
2.
Informational errors
Type Code
3    1    Both the actual and predicted relative reductions in the function are
          less than or equal to the relative function convergence tolerance.
3    2    The iterates appear to be converging to a noncritical point.
4    3    Maximum number of iterations exceeded.
4    4    Maximum number of function evaluations exceeded.
3    6    Five consecutive steps have been taken with the maximum step
          length.
3.
The first stopping criterion for UNLSF occurs when the norm of the function is less than
the absolute function tolerance (RPARAM(4)). The second stopping criterion occurs
when the norm of the scaled gradient is less than the given gradient tolerance
(RPARAM(1)). The third stopping criterion for UNLSF occurs when the scaled distance
between the last two steps is less than the step tolerance (RPARAM(2)).
4.
If the default parameters are desired for UNLSF, then set IPARAM(1) to zero and call the
routine UNLSF. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling UNLSF:
CALL U4LSF (IPARAM, RPARAM)
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of Jacobian evaluations.
Default: 1.
RPARAM Real vector of length 7.
RPARAM(1) = Scaled gradient tolerance.
$$\frac{|g_i| \, \max(|x_i|, 1/s_i)}{\|F(x)\|_2^2}$$
where
$$g_i = \left( J(x)^T F(x) \right)_i (f_s)_i^2$$
Default: $\sqrt{\varepsilon}$, $\sqrt[3]{\varepsilon}$ in double where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
$$\frac{|x_i - y_i|}{\max(|x_i|, 1/s_i)}$$
where s = XSCALE.
Default: $\varepsilon^{2/3}$ where $\varepsilon$ is the machine precision.
RPARAM(3) = Relative function tolerance.
1 =
(s t )
n
i =1
i i
5.
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
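The scaled step defined under RPARAM(2) in Comment 4 can be evaluated directly. The sketch below is illustrative only: the two points are made up, the machine precision is taken as EPSILON(1.0), and combining the components with a maximum is an assumption, since the text does not say how UNLSF reduces them to a single number.

```fortran
program scaled_step
  implicit none
  integer, parameter :: n = 2
  real :: x(n), y(n), s(n), sstep(n), steptl

  x = [1.0000, 1.0000]      ! hypothetical current point
  y = [0.9999, 1.0002]      ! hypothetical previous point
  s = 1.0                   ! XSCALE = 1.0 for every variable

  ! i-th component: |x_i - y_i| / max(|x_i|, 1/s_i)
  sstep = abs(x - y) / max(abs(x), 1.0/s)

  ! Default tolerance eps**(2/3), with eps assumed to be EPSILON(1.0)
  steptl = epsilon(1.0)**(2.0/3.0)

  print *, 'scaled step components: ', sstep
  print *, 'step tolerance:         ', steptl
  print *, 'step test satisfied:    ', maxval(sstep) <= steptl

end program scaled_step
```

With these values the scaled step is still well above the default tolerance, so the step criterion would not yet stop the iteration.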
Example
The nonlinear least squares problem
$$\min_{x \in \mathbb{R}^2} \frac{1}{2} \sum_{i=1}^{2} f_i(x)^2$$
where
$$f_1(x) = 10\,(x_2 - x_1^2) \quad \text{and} \quad f_2(x) = 1 - x_1$$
IMPLICIT   NONE
!                                 Declaration of variables
INTEGER    LDFJAC, M, N
PARAMETER  (LDFJAC=2, M=2, N=2)
!
INTEGER    IPARAM(6), NOUT
REAL       FVEC(M), RPARAM(7), X(N), XGUESS(N)
EXTERNAL   ROSBCK
!                                 Compute the least squares for the
!                                 Rosenbrock function.
DATA XGUESS/-1.2E0, 1.0E0/
!
99999 FORMAT (' The solution is ', 2F9.4, //, ' The function ', &
'evaluated at the solution is ', /, 18X, 2F9.4, //, &
' The number of iterations is ', 10X, I3, /, ' The ', &
'number of function evaluations is ', I3, /)
END
!
SUBROUTINE ROSBCK (M, N, X, F)
INTEGER    M, N
REAL       X(N), F(M)
!
F(1) = 10.0E0*(X(2)-X(1)*X(1))
F(2) = 1.0E0 - X(1)
RETURN
END
Output
The solution is     1.0000   1.0000
The number of iterations is            24
The number of function evaluations is  33
UNLSJ
Solves a nonlinear least squares problem using a modified Levenberg-Marquardt algorithm and a
user-supplied Jacobian.
Required Arguments
FCN User-supplied subroutine to evaluate the function which defines the least-squares
problem. The usage is
CALL FCN (M, N, X, F), where
M Length of F. (Input)
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
F Vector of length M containing the function values at X. (Output)
FCN must be declared EXTERNAL in the calling program.
Optional Arguments
N Number of variables. N must be less than or equal to M. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. By
default, the values for XSCALE are set internally. See IPARAM(6) in Comment 4.
Default: XSCALE = 1.0.
FSCALE Vector of length M containing the diagonal scaling matrix for the functions.
(Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
all entries to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 7. (Input/Output)
See Comment 4.
FVEC Vector of length M containing the residuals at the approximate solution. (Output)
FJAC M by N matrix containing a finite-difference approximate Jacobian at the
approximate solution. (Output)
LDFJAC Leading dimension of FJAC exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFJAC = SIZE (FJAC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine UNLSJ is based on the MINPACK routine LMDER by Moré et al. (1980). It uses a
modified Levenberg-Marquardt method to solve nonlinear least squares problems. The problem is
stated as follows:
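$$\min_{x \in \mathbb{R}^n} \frac{1}{2} F(x)^T F(x) = \frac{1}{2} \sum_{i=1}^{m} f_i(x)^2$$
where m ≥ n, F : R^n → R^m, and fi(x) is the i-th component function of F(x). From a current point,
the algorithm uses the trust region approach:
$$\min_{x_n \in \mathbb{R}^n} \left\| F(x_c) + J(x_c)(x_n - x_c) \right\|_2 \quad \text{subject to} \quad \| x_n - x_c \|_2 \le \delta_c$$
to get a new point xn, which is computed as
$$x_n = x_c - \left( J(x_c)^T J(x_c) + \mu_c I \right)^{-1} J(x_c)^T F(x_c)$$
where $\mu_c = 0$ if $\delta_c \ge \| (J(x_c)^T J(x_c))^{-1} J(x_c)^T F(x_c) \|_2$ and $\mu_c > 0$ otherwise. F(xc) and J(xc) are the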
function values and the Jacobian evaluated at the current point xc. This procedure is repeated until
the stopping criteria are satisfied. For more details, see Levenberg (1944), Marquardt(1963), or
Dennis and Schnabel (1983, Chapter 10).
Comments
1.
2.
Informational errors
Type Code
3    1    Both the actual and predicted relative reductions in the function are
          less than or equal to the relative function convergence tolerance.
3    2    The iterates appear to be converging to a noncritical point.
4    3    Maximum number of iterations exceeded.
4    4    Maximum number of function evaluations exceeded.
4    5    Maximum number of Jacobian evaluations exceeded.
3    6    Five consecutive steps have been taken with the maximum step
          length.
3.
The first stopping criterion for UNLSJ occurs when the norm of the function is less than
the absolute function tolerance (RPARAM(4)). The second stopping criterion occurs
when the norm of the scaled gradient is less than the given gradient tolerance
(RPARAM(1)). The third stopping criterion for UNLSJ occurs when the scaled distance
between the last two steps is less than the step tolerance (RPARAM(2)).
4.
If the default parameters are desired for UNLSJ, then set IPARAM(1) to zero and call the
routine UNLSJ. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling UNLSJ:
CALL U4LSF (IPARAM, RPARAM)
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of Jacobian evaluations.
Default: 100.
IPARAM(6) = Internal variable scaling flag.
If IPARAM(6) = 1, then the values for XSCALE are set internally.
Default: 1.
RPARAM Real vector of length 7.
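RPARAM(1) = Scaled gradient tolerance.
$$\frac{|g_i| \, \max(|x_i|, 1/s_i)}{\|F(x)\|_2^2}$$
where
$$g_i = \left( J(x)^T F(x) \right)_i (f_s)_i^2$$
Default: $\sqrt{\varepsilon}$, $\sqrt[3]{\varepsilon}$ in double where $\varepsilon$ is the machine precision.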
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
$$\frac{|x_i - y_i|}{\max(|x_i|, 1/s_i)}$$
where s = XSCALE.
Default: $\varepsilon^{2/3}$ where $\varepsilon$ is the machine precision.
RPARAM(3) = Relative function tolerance.
(s t )
n
i =1
i i
If double precision is desired, then DU4LSF is called and RPARAM is declared double
precision.
5.
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
Example
The nonlinear least-squares problem
$$\min_{x \in \mathbb{R}^2} \frac{1}{2} \sum_{i=1}^{2} f_i(x)^2$$
where
$$f_1(x) = 10\,(x_2 - x_1^2) \quad \text{and} \quad f_2(x) = 1 - x_1$$
IMPLICIT   NONE
!                                 Declaration of variables
INTEGER    LDFJAC, M, N
PARAMETER  (LDFJAC=2, M=2, N=2)
!
INTEGER    IPARAM(6), NOUT
REAL       FVEC(M), X(N), XGUESS(N)
EXTERNAL   ROSBCK, ROSJAC
!                                 Compute the least squares for the
!                                 Rosenbrock function.
DATA XGUESS/-1.2E0, 1.0E0/
IPARAM(1) = 0
!
99999 FORMAT (' The solution is ', 2F9.4, //, ' The function ', &
'evaluated at the solution is ', /, 18X, 2F9.4, //, &
' The number of iterations is ', 10X, I3, /, ' The ', &
'number of function evaluations is ', I3, /, ' The ', &
'number of Jacobian evaluations is ', I3, /)
END
!
SUBROUTINE ROSBCK (M, N, X, F)
INTEGER    M, N
REAL       X(N), F(M)
!
F(1) = 10.0E0*(X(2)-X(1)*X(1))
F(2) = 1.0E0 - X(1)
RETURN
END
!
SUBROUTINE ROSJAC (M, N, X, FJAC, LDFJAC)
INTEGER    M, N, LDFJAC
REAL       X(N), FJAC(LDFJAC,N)
!
FJAC(1,1) = -20.0E0*X(1)
FJAC(2,1) = -1.0E0
FJAC(1,2) = 10.0E0
FJAC(2,2) = 0.0E0
RETURN
END
Output
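The solution is     1.0000   1.0000
The number of iterations is            23
The number of function evaluations is  32
The number of Jacobian evaluations is  24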
BCONF
Minimizes a function of N variables subject to bounds on the variables using a quasi-Newton
method and a finite-difference gradient.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
Action
User supplies only the bounds on 1st variable, all other variables will have
the same bounds.
XLB Vector of length N containing the lower bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
XUB Vector of length N containing the upper bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
X Vector of length N containing the computed solution. (Output)
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing an initial guess of the computed solution. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. In
the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
FSCALE Scalar containing the function scaling. (Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
FSCALE to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 7. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 7. (Input/Output)
See Comment 4.
FVALUE Scalar containing the value of the function at the computed solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BCONF uses a quasi-Newton method and an active set strategy to solve minimization
problems subject to simple bounds on the variables. The problem is stated as follows:
$$\min_{x \in \mathbb{R}^n} f(x)$$
subject to $l \le x \le u$
From a given starting point xc, an active set IA, which contains the indices of the variables at their
bounds, is built. A variable is called a free variable if it is not in the active set. The routine then
computes the search direction for the free variables according to the formula
$$d = -B^{-1} g_c$$
where B is a positive definite approximation of the Hessian and gc is the gradient evaluated at xc;
both are computed with respect to the free variables. The search direction for the variables in IA is
set to zero. A line search is used to find a new point xn,
$$x_n = x_c + \lambda d, \quad \lambda \in (0, 1]$$
such that
$$f(x_n) \le f(x_c) + \alpha g^T d, \quad \alpha \in (0, 0.5)$$
Finally, the optimality conditions are checked, where $\varepsilon$ is a gradient tolerance. When optimality is not achieved, B is updated
according to the BFGS formula:
$$B \leftarrow B - \frac{B s s^T B}{s^T B s} + \frac{y y^T}{y^T s}$$
where s = xn - xc and y = gn - gc. Another search direction is then computed to begin the next
iteration.
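The BFGS update above can be checked numerically. In the sketch below, B starts as the identity, the new point xn is a made-up value rather than the result of a line search, and the names bfgs_update, rosgrd, and outer are hypothetical. After the update the secant condition B s = y should hold up to rounding, which the program prints for comparison.

```fortran
program bfgs_update
  implicit none
  integer, parameter :: n = 2
  real :: b(n,n), xc(n), xn(n), s(n), y(n), bs(n)
  integer :: i

  ! Start from the identity as the Hessian approximation
  b = 0.0
  do i = 1, n
     b(i,i) = 1.0
  end do

  xc = [-1.2, 1.0]
  xn = [-1.0, 1.1]                 ! hypothetical new point
  s  = xn - xc
  y  = rosgrd(xn) - rosgrd(xc)

  ! Update only when y^T s > 0, which keeps B positive definite
  if (dot_product(y, s) > 0.0) then
     bs = matmul(b, s)             ! B is symmetric, so B s s^T B = (B s)(B s)^T
     b  = b - outer(bs, bs)/dot_product(s, bs) &
            + outer(y, y)/dot_product(y, s)
  end if

  print *, 'updated B times s: ', matmul(b, s)
  print *, 'y:                 ', y

contains

  pure function rosgrd(x) result(g)
    real, intent(in) :: x(:)
    real :: g(2)
    g(1) = -400.0*(x(2) - x(1)**2)*x(1) - 2.0*(1.0 - x(1))
    g(2) = 200.0*(x(2) - x(1)**2)
  end function rosgrd

  pure function outer(u, v) result(w)
    real, intent(in) :: u(:), v(:)
    real :: w(size(u), size(v))
    w = spread(u, dim=2, ncopies=size(v)) * spread(v, dim=1, ncopies=size(u))
  end function outer

end program bfgs_update
```

In BCONF the update applies only to the free variables; this sketch treats both variables as free.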
The active set is changed only when a free variable hits its bounds during an iteration or the
optimality condition is met for the free variables but not for all variables in IA, the active set. In
the latter case, a variable that violates the optimality condition will be dropped out of IA. For more
details on the quasi-Newton method and line search, see Dennis and Schnabel (1983). For more
detailed information on active set strategy, see Gill and Murray (1976).
Since a finite-difference method is used to estimate the gradient, in some single-precision
calculations an inaccurate estimate of the gradient may cause the algorithm to terminate at a
noncritical point. In such cases, higher-precision arithmetic is recommended. Also, whenever the
exact gradient can be easily provided, routine BCONG should be used instead.
Comments
1.
2.
Informational errors
Type Code
3    1    Both the actual and predicted relative reductions in the function are
          less than or equal to the relative function convergence tolerance.
4    2    The iterates appear to be converging to a noncritical point.
4    3    Maximum number of iterations exceeded.
4    4    Maximum number of function evaluations exceeded.
4    5    Maximum number of gradient evaluations exceeded.
4    6    Five consecutive steps have been taken with the maximum step
          length.
2    7    Scaled step tolerance satisfied; the current point may be an
          approximate local solution, or the algorithm is making very slow
          progress and is not near a solution, or STEPTL is too big.
3    8    The last global step failed to locate a lower point than the current X
          value.
3.
The first stopping criterion for BCONF occurs when the norm of the gradient is less than
the given gradient tolerance (RPARAM(1)). The second stopping criterion for BCONF
occurs when the scaled distance between the last two steps is less than the step
tolerance (RPARAM(2)).
4.
If the default parameters are desired for BCONF, then set IPARAM(1) to zero and call the
routine BCONF. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling BCONF:
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of gradient evaluations.
Default: 400.
IPARAM(6) = Hessian initialization parameter.
If IPARAM(6) = 0, the Hessian is initialized to the identity matrix;
otherwise, it is initialized to a diagonal matrix containing
$$\max(|f(t)|, f_s) \cdot s_i^2$$
$$\max(|f(x)|, f_s)$$
Default: $\sqrt{\varepsilon}$, $\sqrt[3]{\varepsilon}$ in double where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
$$\frac{|x_i - y_i|}{\max(|x_i|, 1/s_i)}$$
where s = XSCALE.
Default: $\varepsilon^{2/3}$ where $\varepsilon$ is the machine precision.
RPARAM(3) = Relative function tolerance.
(s t )
n
i =1
i i
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
Example
The problem
$$\min f(x) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$$
subject to
$$-2 \le x_1 \le 0.5$$
$$-1 \le x_2 \le 2$$
is solved with an initial guess (-1.2, 1.0) and default values for parameters.
USE BCONF_INT
USE UMACH_INT
IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=2)
!
INTEGER
REAL
EXTERNAL
!
!
!
!
!
!
!
99999 FORMAT (' The solution is ', 6X, 2F8.3, //, ' The function ', &
'value is ', F8.3, //, ' The number of iterations is ', &
10X, I3, /, ' The number of function evaluations is ', &
I3, /, ' The number of gradient evaluations is ', I3)
!
END
!
SUBROUTINE ROSBRK (N, X, F)
INTEGER    N
REAL       X(N), F
!
F = 1.0E2*(X(2)-X(1)*X(1))**2 + (1.0E0-X(1))**2
!
RETURN
END
Output
The solution is          0.500   0.250

The function value is    0.250

The number of iterations is            24
The number of function evaluations is  34
The number of gradient evaluations is  26
BCONG
Minimizes a function of N variables subject to bounds on the variables using a quasi-Newton
method and a user-supplied gradient.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Length of X and G. (Input)
X Vector of length N at which point the gradient is evaluated. (Input)
X should not be changed by GRAD.
G The gradient evaluated at the point X. (Output)
GRAD must be declared EXTERNAL in the calling program.
IBTYPE
Action
User supplies only the bounds on 1st variable, all other variables
will have the same bounds.
XLB Vector of length N containing the lower bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
XUB Vector of length N containing the upper bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
X Vector of length N containing the computed solution. (Output)
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess of the minimum. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. In
the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
FSCALE Scalar containing the function scaling. (Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
FSCALE to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 7. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 7. (Input/Output)
See Comment 4.
FVALUE Scalar containing the value of the function at the computed solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL BCONG (FCN, GRAD, N, XGUESS, IBTYPE, XLB, XUB, XSCALE, FSCALE,
IPARAM, RPARAM, X, FVALUE)
Double:
Description
The routine BCONG uses a quasi-Newton method and an active set strategy to solve minimization
problems subject to simple bounds on the variables. The problem is stated as follows:
$$\min_{x \in \mathbb{R}^n} f(x)$$
subject to $l \le x \le u$
From a given starting point xc, an active set IA, which contains the indices of the variables at their
bounds, is built. A variable is called a free variable if it is not in the active set. The routine then
computes the search direction for the free variables according to the formula
$$d = -B^{-1} g_c$$
where B is a positive definite approximation of the Hessian and gc is the gradient evaluated at xc;
both are computed with respect to the free variables. The search direction for the variables in IA is
set to zero. A line search is used to find a new point xn,
$$x_n = x_c + \lambda d, \quad \lambda \in (0, 1]$$
such that
$$f(x_n) \le f(x_c) + \alpha g^T d, \quad \alpha \in (0, 0.5)$$
Finally, the optimality conditions are checked, where $\varepsilon$ is a gradient tolerance. When optimality is not achieved, B is updated
according to the BFGS formula:
$$B \leftarrow B - \frac{B s s^T B}{s^T B s} + \frac{y y^T}{y^T s}$$
where s = xn - xc and y = gn - gc. Another search direction is then computed to begin the next
iteration.
The active set is changed only when a free variable hits its bounds during an iteration or the
optimality condition is met for the free variables but not for all variables in IA, the active set. In
the latter case, a variable that violates the optimality condition will be dropped out of IA. For more
details on the quasi-Newton method and line search, see Dennis and Schnabel (1983). For more
detailed information on active set strategy, see Gill and Murray (1976).
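The role of the active set can be sketched for the bounds used in the example for this routine. The program below is a simplified illustration, not the BCONG algorithm: it uses the diagonal of the exact Hessian in place of B, it never releases a variable from IA, and the names active_set_step and rosgrd are hypothetical. It marks variables at their bounds as active, zeroes their search direction, and limits the step length so that no free variable leaves its bounds.

```fortran
program active_set_step
  implicit none
  integer, parameter :: n = 2
  real, parameter :: xlb(n) = [-2.0, -1.0]
  real, parameter :: xub(n) = [ 0.5,  2.0]
  real :: xc(n), xn(n), g(n), hdiag(n), d(n), lam
  logical :: active(n)
  integer :: i

  xc = [0.5, 0.0]                        ! x1 sits on its upper bound
  g  = rosgrd(xc)

  ! Active set: variables currently at a bound
  active = (xc <= xlb) .or. (xc >= xub)

  ! Assumed stand-in for B: diagonal of the exact Hessian
  hdiag = [1200.0*xc(1)**2 - 400.0*xc(2) + 2.0, 200.0]

  ! Zero direction for active variables, scaled descent for free ones
  d = merge(0.0, -g/hdiag, active)

  ! Largest step in (0, 1] that keeps every free variable within bounds
  lam = 1.0
  do i = 1, n
     if (d(i) > 0.0) lam = min(lam, (xub(i) - xc(i)) / d(i))
     if (d(i) < 0.0) lam = min(lam, (xlb(i) - xc(i)) / d(i))
  end do
  xn = xc + lam*d

  print *, 'active set mask: ', active
  print *, 'direction:       ', d
  print *, 'step length:     ', lam
  print *, 'new point:       ', xn

contains

  pure function rosgrd(x) result(gr)
    real, intent(in) :: x(:)
    real :: gr(2)
    gr(1) = -400.0*(x(2) - x(1)**2)*x(1) - 2.0*(1.0 - x(1))
    gr(2) = 200.0*(x(2) - x(1)**2)
  end function rosgrd

end program active_set_step
```

From this starting point the free variable moves to 0.25 while x1 stays at 0.5, which agrees with the solution reported in the example output for this problem.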
Comments
1.
CALL B2ONG (FCN, GRAD, N, XGUESS, IBTYPE, XLB, XUB, XSCALE, FSCALE, IPARAM,
RPARAM, X, FVALUE, WK, IWK)
2.
Informational errors
Type Code
3    1    Both the actual and predicted relative reductions in the function are
          less than or equal to the relative function convergence tolerance.
4    2    The iterates appear to be converging to a noncritical point.
4    3    Maximum number of iterations exceeded.
4    4    Maximum number of function evaluations exceeded.
4    5    Maximum number of gradient evaluations exceeded.
4    6    Five consecutive steps have been taken with the maximum step
          length.
2    7    Scaled step tolerance satisfied; the current point may be an
          approximate local solution, or the algorithm is making very slow
          progress and is not near a solution, or STEPTL is too big.
3    8    The last global step failed to locate a lower point than the current X
          value.
3.
The first stopping criterion for BCONG occurs when the norm of the gradient is less than
the given gradient tolerance (RPARAM(1)). The second stopping criterion for BCONG
occurs when the scaled distance between the last two steps is less than the step
tolerance (RPARAM(2)).
4.
If the default parameters are desired for BCONG, then set IPARAM(1) to zero and call
the routine BCONG. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling BCONG:
CALL U4INF (IPARAM, RPARAM)
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of gradient evaluations.
Default: 400.
IPARAM(6) = Hessian initialization parameter.
If IPARAM(6) = 0, the Hessian is initialized to the identity matrix;
otherwise, it is initialized to a diagonal matrix containing
$$\max(|f(t)|, f_s) \cdot s_i^2$$
$$\max(|f(x)|, f_s)$$
Default: $\sqrt{\varepsilon}$, $\sqrt[3]{\varepsilon}$ in double where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
$$\frac{|x_i - y_i|}{\max(|x_i|, 1/s_i)}$$
where s = XSCALE.
Default: $\varepsilon^{2/3}$ where $\varepsilon$ is the machine precision.
RPARAM(3) = Relative function tolerance.
(s t )
n
i =1
i i
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
Example
The problem
$$\min f(x) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$$
subject to
$$-2 \le x_1 \le 0.5$$
$$-1 \le x_2 \le 2$$
is solved with an initial guess (-1.2, 1.0) and default values for parameters.
USE BCONG_INT
USE UMACH_INT
IMPLICIT   NONE
INTEGER    N
PARAMETER  (N=2)
INTEGER
REAL
EXTERNAL
!
!
!
!
!
!
!
99999 FORMAT (' The solution is ', 6X, 2F8.3, //, ' The function ', &
'value is ', F8.3, //, ' The number of iterations is ', &
10X, I3, /, ' The number of function evaluations is ', &
I3, /, ' The number of gradient evaluations is ', I3)
!
END
!
SUBROUTINE ROSBRK (N, X, F)
INTEGER    N
REAL       X(N), F
!
F = 1.0E2*(X(2)-X(1)*X(1))**2 + (1.0E0-X(1))**2
!
RETURN
END
!
SUBROUTINE ROSGRD (N, X, G)
INTEGER    N
REAL       X(N), G(N)
!
G(1) = -4.0E2*(X(2)-X(1)*X(1))*X(1) - 2.0E0*(1.0E0-X(1))
G(2) = 2.0E2*(X(2)-X(1)*X(1))
!
RETURN
END
Output
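The solution is          0.500   0.250

The function value is    0.250

The number of iterations is            22
The number of function evaluations is  32
The number of gradient evaluations is  23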
BCODH
Minimizes a function of N variables subject to bounds on the variables using a modified Newton
method and a finite-difference Hessian.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Length of X and G. (Input)
X Vector of length N at which point the gradient is evaluated. (Input)
X should not be changed by GRAD.
G The gradient evaluated at the point X. (Output)
GRAD must be declared EXTERNAL in the calling program.
Action
User supplies only the bounds on 1st variable, all other variables will have
the same bounds.
XLB Vector of length N containing the lower bounds on the variables. (Input)
XUB Vector of length N containing the upper bounds on the variables. (Input)
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess of the minimum. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. In
the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
FSCALE Scalar containing the function scaling. (Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
FSCALE to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 7. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 7. (Input/Output)
See Comment 4.
FVALUE Scalar containing the value of the function at the computed solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BCODH uses a modified Newton method and an active set strategy to solve
minimization problems subject to simple bounds on the variables. The problem is stated as follows:
$$\min_{x \in \mathbb{R}^n} f(x)$$
subject to $l \le x \le u$
From a given starting point xc, an active set IA, which contains the indices of the variables at their
bounds, is built. A variable is called a free variable if it is not in the active set. The routine then
computes the search direction for the free variables according to the formula
$$d = -H^{-1} g_c$$
where H is the Hessian and gc is the gradient evaluated at xc; both are computed with respect to the
free variables. The search direction for the variables in IA is set to zero. A line search is used to
find a new point xn,
$$x_n = x_c + \lambda d, \quad \lambda \in (0, 1]$$
such that
$$f(x_n) \le f(x_c) + \alpha g^T d, \quad \alpha \in (0, 0.5)$$
Finally, the optimality conditions are checked, where $\varepsilon$ is a gradient tolerance. When optimality is not achieved, another search
direction is computed to begin the next iteration. This process is repeated until the optimality
criterion is met.
The active set is changed only when a free variable hits its bounds during an iteration or the
optimality condition is met for the free variables but not for all variables in IA, the active set. In
the latter case, a variable that violates the optimality condition will be dropped out of IA. For more
details on the modified Newton method and line search, see Dennis and Schnabel (1983). For
more detailed information on active set strategy, see Gill and Murray (1976).
Since a finite-difference method is used to estimate the Hessian for some single precision
calculations, an inaccurate estimate of the Hessian may cause the algorithm to terminate at a
noncritical point. In such cases, high precision arithmetic is recommended. Also, whenever the
exact Hessian can be easily provided, routine BCOAH should be used instead.
Comments
1.
third N locations contain the last Newton step. The fourth N locations contain an
estimate of the gradient at the solution. The final N^2 locations contain the
Hessian at the approximate solution.
IWK Integer work vector of length N.
2.
Informational errors
Type   Code
 3      1    Both the actual and predicted relative reductions in the function are
             less than or equal to the relative function convergence tolerance.
 4      2    The iterates appear to be converging to a noncritical point.
 4      3    Maximum number of iterations exceeded.
 4      4    Maximum number of function evaluations exceeded.
 4      5    Maximum number of gradient evaluations exceeded.
 4      6    Five consecutive steps have been taken with the maximum step
             length.
 2      7    Scaled step tolerance satisfied; the current point may be an
             approximate local solution, or the algorithm is making very slow
             progress and is not near a solution, or STEPTL is too big.
 4      7    Maximum number of Hessian evaluations exceeded.
3.
The first stopping criterion for BCODH occurs when the norm of the gradient is less than
the given gradient tolerance (RPARAM(1)). The second stopping criterion for BCODH
occurs when the scaled distance between the last two steps is less than the step
tolerance (RPARAM(2)).
4.
If the default parameters are desired for BCODH, then set IPARAM(1) to zero and call the
routine BCODH. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling BCODH:
CALL U4INF (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to U4INF will set IPARAM and RPARAM to their default values so only
nondefault values need to be set above.
The following is a list of the parameters and the default values:
IPARAM Integer vector of length 7.
IPARAM(1) = Initialization flag.
IPARAM(2) = Number of good digits in the function.
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of gradient evaluations.
Default: 400.
IPARAM(6) = Hessian initialization parameter.
Default: 100.
RPARAM Real vector of length 7.
RPARAM(1) = Scaled gradient tolerance.
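The i-th component of the scaled gradient at x is calculated as
\[ \frac{|g_i| \, \max(|x_i|, 1/s_i)}{\max(|f(x)|, f_s)} \]
where $g = \nabla f(x)$, $s$ = XSCALE, and $f_s$ = FSCALE.
Default: $\sqrt{\varepsilon}$ ($\sqrt[3]{\varepsilon}$ in double), where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
\[ \frac{|x_i - y_i|}{\max(|x_i|, 1/s_i)} \]
where $s$ = XSCALE.
Default: $\varepsilon^{2/3}$, where $\varepsilon$ is the machine precision.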
RPARAM(3) = Relative function tolerance.
(s t )
n
i i
i =1
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
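The scaled quantities that RPARAM(1) and RPARAM(2) are compared against can be evaluated by hand. The following is a minimal, library-independent sketch, not the routine's internal code: the two points, the unit scalings, and the use of EPSILON as the machine precision are assumptions made for illustration.

```fortran
program scaled_tolerances
   implicit none
   integer, parameter :: n = 2
   real :: x(n), y(n), g(n), s(n), sgrad(n), sstep(n)
   real :: f, fs, eps
   integer :: i

   ! Illustrative data (assumptions, not taken from the library):
   ! x is the current iterate, y the previous one.
   x  = (/ 0.5, 0.25 /)
   y  = (/ 0.4999, 0.2499 /)
   s  = 1.0          ! XSCALE, default value
   fs = 1.0          ! FSCALE, default value

   ! Rosenbrock function and gradient at x
   f    = 100.0*(x(2) - x(1)**2)**2 + (1.0 - x(1))**2
   g(1) = -400.0*(x(2) - x(1)**2)*x(1) - 2.0*(1.0 - x(1))
   g(2) = 200.0*(x(2) - x(1)**2)

   do i = 1, n
      ! |g_i| * max(|x_i|, 1/s_i) / max(|f(x)|, f_s)
      sgrad(i) = abs(g(i))*max(abs(x(i)), 1.0/s(i)) / max(abs(f), fs)
      ! |x_i - y_i| / max(|x_i|, 1/s_i)
      sstep(i) = abs(x(i) - y(i)) / max(abs(x(i)), 1.0/s(i))
   end do

   eps = epsilon(1.0)   ! assumed to play the role of the machine precision
   print '(a,2es12.4)', ' scaled gradient   : ', sgrad
   print '(a,2es12.4)', ' scaled step       : ', sstep
   print '(a,es12.4)',  ' sqrt(eps)         : ', sqrt(eps)
   print '(a,es12.4)',  ' eps**(2/3)        : ', eps**(2.0/3.0)
end program scaled_tolerances
```

In this sketch the first scaled gradient component is not small because the first variable sits on its upper bound; the gradient test described above applies to the free variables.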
Example
The problem
\[ \min f(x) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2 \]
subject to
\[ -2 \le x_1 \le 0.5 \]
\[ -1 \le x_2 \le 2 \]
is solved with an initial guess (-1.2, 1.0), and default values for parameters.
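      USE BCODH_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=2)
!
      INTEGER    IP, IPARAM(7), L, NOUT
      REAL       F, X(N), XGUESS(N), XLB(N), XUB(N)
      EXTERNAL   ROSBRK, ROSGRD
!
      DATA XGUESS/-1.2E0, 1.0E0/
      DATA XLB/-2.0E0, -1.0E0/, XUB/0.5E0, 2.0E0/
!
      IPARAM(1) = 0
      IP        = 0
!                                 Minimize Rosenbrock function using
!                                 initial guesses of -1.2 and 1.0
      CALL BCODH (ROSBRK, ROSGRD, IP, XLB, XUB, X, XGUESS=XGUESS, &
                  IPARAM=IPARAM, FVALUE=F)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) X, F, (IPARAM(L),L=3,5)
!
99999 FORMAT (' The solution is ', 6X, 2F8.3, //, ' The function ', &
             'value is ', F8.3, //, ' The number of iterations is ', &
             10X, I3, /, ' The number of function evaluations is ', &
             I3, /, ' The number of gradient evaluations is ', I3)
!
      END
!
      SUBROUTINE ROSBRK (N, X, F)
      INTEGER    N
      REAL       X(N), F
!
      F = 1.0E2*(X(2)-X(1)*X(1))**2 + (1.0E0-X(1))**2
!
      RETURN
      END
!
      SUBROUTINE ROSGRD (N, X, G)
      INTEGER    N
      REAL       X(N), G(N)
!
      G(1) = -4.0E2*(X(2)-X(1)*X(1))*X(1) - 2.0E0*(1.0E0-X(1))
      G(2) = 2.0E2*(X(2)-X(1)*X(1))
!
      RETURN
      END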
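Output
 The solution is          0.500   0.250

 The function value is    0.250

 The number of iterations is            17
 The number of function evaluations is  26
 The number of gradient evaluations is  18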
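The reported solution can be checked by hand. The following worked check is an added illustration, not part of the library's output: it evaluates the Rosenbrock function and its gradient at x = (0.5, 0.25) and compares the result with the optimality conditions used for bound-constrained problems in this chapter (a negative gradient component at an active upper bound, a zero component for a free variable).

\[ f(0.5, 0.25) = 100\,(0.25 - 0.5^2)^2 + (1 - 0.5)^2 = 0.25 \]
\[ g_1 = -400\,x_1 (x_2 - x_1^2) - 2(1 - x_1) = -1 < 0, \qquad x_1 = u_1 = 0.5 \]
\[ g_2 = 200\,(x_2 - x_1^2) = 0, \qquad l_2 < x_2 < u_2 \]

The first variable sits on its upper bound with a negative gradient component, and the second variable is free with a zero gradient component, so the point satisfies the conditions even though the unconstrained minimizer (1, 1) lies outside the bounds.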
BCOAH
Minimizes a function of N variables subject to bounds on the variables using a modified Newton
method and a user-supplied Hessian.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Length of X and G. (Input)
X Vector of length N at which point the gradient is evaluated. (Input)
X should not be changed by GRAD.
G The gradient evaluated at the point X. (Output)
GRAD must be declared EXTERNAL in the calling program.
HESS User-supplied subroutine to compute the Hessian at the point X. The usage is
CALL HESS (N, X, H, LDH), where
N Length of X. (Input)
X Vector of length N at which point the Hessian is evaluated. (Input)
X should not be changed by HESS.
H The Hessian evaluated at the point X. (Output)
LDH Leading dimension of H exactly as specified in the dimension statement of the
calling program. (Input)
Action
User supplies only the bounds on 1st variable, all other variables will have
the same bounds.
XLB Vector of length N containing the lower bounds on the variables. (Input)
XUB Vector of length N containing the upper bounds on the variables. (Input)
X Vector of length N containing the computed solution. (Output)
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. In
the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
FSCALE Scalar containing the function scaling. (Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
FSCALE to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 7. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM = 0.
RPARAM Parameter vector of length 7. (Input/Output)
See Comment 4.
FVALUE Scalar containing the value of the function at the computed solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BCOAH uses a modified Newton method and an active set strategy to solve
minimization problems subject to simple bounds on the variables. The problem is stated as
follows:
\[ \min_{x \in \mathbb{R}^n} f(x) \]
subject to $l \le x \le u$.
From a given starting point $x^c$, an active set IA, which contains the indices of the variables at their
bounds, is built. A variable is called a free variable if it is not in the active set. The routine then
computes the search direction for the free variables according to the formula
\[ d = -H^{-1} g^c \]
where $H$ is the Hessian and $g^c$ is the gradient evaluated at $x^c$; both are computed with respect to the
free variables. The search direction for the variables in IA is set to zero. A line search is used to
find a new point $x^n$,
\[ x^n = x^c + \lambda d, \quad \lambda \in (0, 1] \]
such that
\[ f(x^n) \le f(x^c) + \alpha g^T d, \quad \alpha \in (0, 0.5). \]
Finally, the optimality conditions
\[ \|g(x_i)\| \le \varepsilon, \quad l_i < x_i < u_i \]
\[ g(x_i) < 0, \quad x_i = u_i \]
\[ g(x_i) > 0, \quad x_i = l_i \]
are checked, where $\varepsilon$ is a gradient tolerance. When optimality is not achieved, another search
direction is computed to begin the next iteration. This process is repeated until the optimality
criterion is met.
The active set is changed only when a free variable hits its bounds during an iteration or the
optimality condition is met for the free variables but not for all variables in IA, the active set. In
the latter case, a variable that violates the optimality condition will be dropped out of IA. For more
details on the modified Newton method and line search, see Dennis and Schnabel (1983). For
more detailed information on active set strategy, see Gill and Murray (1976).
Comments
1.
2.
Informational errors
Type   Code
 3      1    Both the actual and predicted relative reductions in the function are
             less than or equal to the relative function convergence tolerance.
 4      2    The iterates appear to be converging to a noncritical point.
 4      3    Maximum number of iterations exceeded.
 4      4    Maximum number of function evaluations exceeded.
 4      5    Maximum number of gradient evaluations exceeded.
 4      6    Five consecutive steps have been taken with the maximum step
             length.
 2      7    Scaled step tolerance satisfied; the current point may be an
             approximate local solution, or the algorithm is making very slow
             progress and is not near a solution, or STEPTL is too big.
 4      7    Maximum number of Hessian evaluations exceeded.
 3      8    The last global step failed to locate a lower point than the current X
             value.
3.
The first stopping criterion for BCOAH occurs when the norm of the gradient is less than
the given gradient tolerance (RPARAM(1)). The second stopping criterion for BCOAH
occurs when the scaled distance between the last two steps is less than the step
tolerance (RPARAM(2)).
4.
If the default parameters are desired for BCOAH, then set IPARAM(1) to zero and call the
routine BCOAH. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling BCOAH:
CALL U4INF (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to U4INF will set IPARAM and RPARAM to their default values so only
nondefault values need to be set above.
The following is a list of the parameters and the default values:
IPARAM Integer vector of length 7.
IPARAM(1) = Initialization flag.
IPARAM(2) = Number of good digits in the function.
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of gradient evaluations.
Default: 400.
IPARAM(6) = Hessian initialization parameter.
Default: 100.
RPARAM Real vector of length 7.
RPARAM(1) = Scaled gradient tolerance.
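The i-th component of the scaled gradient at x is calculated as
\[ \frac{|g_i| \, \max(|x_i|, 1/s_i)}{\max(|f(x)|, f_s)} \]
where $g = \nabla f(x)$, $s$ = XSCALE, and $f_s$ = FSCALE.
Default: $\sqrt{\varepsilon}$ ($\sqrt[3]{\varepsilon}$ in double), where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
\[ \frac{|x_i - y_i|}{\max(|x_i|, 1/s_i)} \]
where $s$ = XSCALE.
Default: $\varepsilon^{2/3}$, where $\varepsilon$ is the machine precision.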
RPARAM(3) = Relative function tolerance.
(s t )
n
i =1
i i
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
Example
The problem
\[ \min f(x) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2 \]
subject to
\[ -2 \le x_1 \le 0.5 \]
\[ -1 \le x_2 \le 2 \]
is solved with an initial guess (-1.2, 1.0), and default values for parameters.
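      USE BCOAH_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=2)
!
      INTEGER    IP, IPARAM(7), L, NOUT
      REAL       F, X(N), XGUESS(N), XLB(N), XUB(N)
      EXTERNAL   ROSBRK, ROSGRD, ROSHES
!
      DATA XGUESS/-1.2E0, 1.0E0/
      DATA XLB/-2.0E0, -1.0E0/, XUB/0.5E0, 2.0E0/
!
      IPARAM(1) = 0
      IP        = 0
!                                 Minimize Rosenbrock function using
!                                 initial guesses of -1.2 and 1.0
      CALL BCOAH (ROSBRK, ROSGRD, ROSHES, IP, XLB, XUB, X, &
                  XGUESS=XGUESS, IPARAM=IPARAM, FVALUE=F)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) X, F, (IPARAM(L),L=3,5), IPARAM(7)
!
99999 FORMAT (' The solution is ', 6X, 2F8.3, //, ' The function ', &
             'value is ', F8.3, //, ' The number of iterations is ', &
             10X, I3, /, ' The number of function evaluations is ', &
             I3, /, ' The number of gradient evaluations is ', I3, /, &
             ' The number of Hessian evaluations is ', I3)
!
      END
!
      SUBROUTINE ROSBRK (N, X, F)
      INTEGER    N
      REAL       X(N), F
!
      F = 1.0E2*(X(2)-X(1)*X(1))**2 + (1.0E0-X(1))**2
!
      RETURN
      END
!
      SUBROUTINE ROSGRD (N, X, G)
      INTEGER    N
      REAL       X(N), G(N)
!
      G(1) = -4.0E2*(X(2)-X(1)*X(1))*X(1) - 2.0E0*(1.0E0-X(1))
      G(2) = 2.0E2*(X(2)-X(1)*X(1))
!
      RETURN
      END
!
      SUBROUTINE ROSHES (N, X, H, LDH)
      INTEGER    N, LDH
      REAL       X(N), H(LDH,N)
!
      H(1,1) = -4.0E2*X(2) + 1.2E3*X(1)*X(1) + 2.0E0
      H(2,1) = -4.0E2*X(1)
      H(1,2) = H(2,1)
      H(2,2) = 2.0E2
!
      RETURN
      END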
Output
 The solution is          0.500   0.250

 The function value is    0.250

 The number of iterations is            18
 The number of function evaluations is  29
 The number of gradient evaluations is  19
 The number of Hessian evaluations is   18
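The entries that a user-supplied HESS routine must return for this example follow from differentiating the objective twice. The derivation below is an added worked illustration; the evaluation point (0.5, 0.25) is simply the solution reported above.

\[ f(x) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2 \]
\[ \frac{\partial f}{\partial x_1} = -400\,x_1 (x_2 - x_1^2) - 2(1 - x_1), \qquad \frac{\partial f}{\partial x_2} = 200\,(x_2 - x_1^2) \]
\[ \frac{\partial^2 f}{\partial x_1^2} = -400\,x_2 + 1200\,x_1^2 + 2, \qquad \frac{\partial^2 f}{\partial x_1 \partial x_2} = -400\,x_1, \qquad \frac{\partial^2 f}{\partial x_2^2} = 200 \]
\[ H(0.5, 0.25) = \begin{pmatrix} 202 & -200 \\ -200 & 200 \end{pmatrix} \]

The matrix is symmetric, so H(1,2) and H(2,1) hold the same value.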
BCPOL
Minimizes a function of N variables subject to bounds on the variables using a direct search
complex algorithm.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
Action
User supplies only the bounds on the first variable. All other variables will
have the same bounds.
XLB Vector of length N containing the lower bounds on the variables. (Input, if
IBTYPE = 0; output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
XUB Vector of length N containing the upper bounds on the variables. (Input, if
IBTYPE = 0; output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
X Real vector of length N containing the best estimate of the minimum found. (Output)
Optional Arguments
N The number of variables. (Input)
Default: N = SIZE (XGUESS,1).
XGUESS Real vector of length N that contains an initial guess to the minimum. (Input)
Default: XGUESS = 0.0.
FTOL First convergence criterion. (Input)
The algorithm stops when a relative error in the function values is less than FTOL, i.e.
when (F(worst) - F(best)) < FTOL * (1 + ABS(F(best))) where F(worst) and F(best) are
the function values of the current worst and best points, respectively. Second
convergence criterion. The algorithm stops when the standard deviation of the function
values at the 2 * N current points is less than FTOL. If the subroutine terminates
prematurely, try again with a smaller value of FTOL.
Default: FTOL = 1.0e-4 for single and 1.0d-8 for double precision.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BCPOL uses the complex method to find a minimum point of a function of n variables.
The method is based on function comparison; no smoothness is assumed. It starts with 2n points
$x_1, x_2, \ldots, x_{2n}$. At each iteration, a new point is generated to replace the worst point $x_j$, which has
the largest function value among these 2n points. The new point is constructed by the following
formula:
\[ x_k = c + \alpha (c - x_j) \]
where
\[ c = \frac{1}{2n - 1} \sum_{i \ne j} x_i \]
and $\alpha$ ($\alpha > 0$) is the reflection coefficient.
Criterion 2:
\[ \sum_{i=1}^{2n} \Bigl( f_i - \frac{\sum_{j=1}^{2n} f_j}{2n} \Bigr)^2 \le \varepsilon_f \]
where $f_i = f(x_i)$, $f_j = f(x_j)$, and $\varepsilon_f$ is a given tolerance. For a complete description, see Nelder and
Mead (1965) or Gill et al. (1981).
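The replacement of the worst point can be illustrated with a short, library-independent sketch. It is not BCPOL's implementation: the four starting points, the reflection coefficient value 1.3, and the clipping of the new point to the bounds are assumptions chosen for illustration.

```fortran
program complex_reflection
   implicit none
   integer, parameter :: n = 2, npts = 2*n
   real, parameter :: alpha = 1.3      ! reflection coefficient (assumed value)
   real :: pts(n,npts), fvals(npts), c(n), xnew(n)
   real :: xlb(n), xub(n)
   integer :: i, j

   xlb = (/ -2.0, -1.0 /)
   xub = (/  0.5,  2.0 /)

   ! Four hypothetical points forming the current complex
   pts(:,1) = (/ -1.2,  1.0 /)
   pts(:,2) = (/ -0.5,  0.5 /)
   pts(:,3) = (/  0.0,  1.5 /)
   pts(:,4) = (/  0.3, -0.5 /)

   do i = 1, npts
      fvals(i) = rosen(pts(:,i))
   end do

   ! Index of the worst point (largest function value)
   j = maxloc(fvals, dim=1)

   ! Centroid of the remaining 2n - 1 points
   c = (sum(pts, dim=2) - pts(:,j)) / real(npts - 1)

   ! Reflect the worst point through the centroid
   xnew = c + alpha*(c - pts(:,j))

   ! Assumption: keep the new point inside the simple bounds
   xnew = max(xlb, min(xub, xnew))

   print '(a,i2)',     ' worst point index : ', j
   print '(a,2f10.4)', ' centroid          : ', c
   print '(a,2f10.4)', ' new point         : ', xnew
   print '(a,2f12.4)', ' f(worst), f(new)  : ', fvals(j), rosen(xnew)

contains

   real function rosen(x)
      real, intent(in) :: x(:)
      rosen = 100.0*(x(2) - x(1)**2)**2 + (1.0 - x(1))**2
   end function rosen

end program complex_reflection
```

Only function values are compared, which is why the method needs no derivative information.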
Comments
1.
2.
Informational error
Type   Code
 3      1    The maximum number of function evaluations is exceeded.
3.
Since BCPOL uses only function-value information at each step to determine a new
approximate minimum, it could be quite inefficient on smooth problems compared to
other methods such as those implemented in routine BCONF, which takes into account
derivative information at each iteration. Hence, routine BCPOL should only be used as a
last resort. Briefly, a set of 2 * N points in an N-dimensional space is called a complex.
The minimization process iterates by replacing the point with the largest function value
by a new point with a smaller function value. The iteration continues until all the points
cluster sufficiently close to a minimum.
Example
The problem
\[ \min f(x) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2 \]
subject to
\[ -2 \le x_1 \le 0.5 \]
\[ -1 \le x_2 \le 2 \]
is solved with an initial guess (-1.2, 1.0), and the solution is printed.
      USE BCPOL_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Variable declarations
      INTEGER    N
      PARAMETER  (N=2)
!
      INTEGER    IBTYPE, K, NOUT
      REAL       FTOL, FVALUE, X(N), XGUESS(N), XLB(N), XUB(N)
      EXTERNAL   FCN
!
!                                 Initializations
!                                 XGUESS = (-1.2,  1.0)
!                                 XLB    = (-2.0, -1.0)
!                                 XUB    = ( 0.5,  2.0)
      DATA XGUESS/-1.2, 1.0/, XLB/-2.0E0, -1.0E0/, XUB/0.5E0, 2.0E0/
!
      FTOL   = 1.0E-5
      IBTYPE = 0
!
      CALL BCPOL (FCN, IBTYPE, XLB, XUB, X, xguess=xguess, ftol=ftol, &
                  fvalue=fvalue)
!
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) (X(K),K=1,N), FVALUE
99999 FORMAT (' The best estimate for the minimum value of the', /, &
             ' function is X = (', 2(2X,F4.2), ')', /, ' with ', &
             'function value FVALUE = ', E12.6)
!
      END
!                                 External function to be minimized
      SUBROUTINE FCN (N, X, F)
      INTEGER    N
      REAL       X(N), F
!
      F = 100.0*(X(2)-X(1)*X(1))**2 + (1.0-X(1))**2
      RETURN
      END
Output
The best estimate for the minimum value of the
function is X = ( 0.50 0.25)
with function value FVALUE = 0.250002E+00
BCLSF
Solves a nonlinear least squares problem subject to bounds on the variables using a modified
Levenberg-Marquardt algorithm and a finite-difference Jacobian.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (M, N, X, F), where
M Length of F. (Input)
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function at the point X. (Output)
Action
User supplies only the bounds on 1st variable, all other variables will have
the same bounds.
XLB Vector of length N containing the lower bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
XUB Vector of length N containing the upper bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
X Vector of length N containing the approximate solution. (Output)
Optional Arguments
N Number of variables. (Input)
N must be less than or equal to M.
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. By
default, the values for XSCALE are set internally. See IPARAM(6) in Comment 4.
FSCALE Vector of length M containing the diagonal scaling matrix for the functions.
(Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
all entries to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM= 0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BCLSF uses a modified Levenberg-Marquardt method and an active set strategy to
solve nonlinear least squares problems subject to simple bounds on the variables. The problem is
stated as follows:
\[ \min_{x \in \mathbb{R}^n} \frac{1}{2} F(x)^T F(x) = \frac{1}{2} \sum_{i=1}^{m} f_i(x)^2 \]
subject to $l \le x \le u$
where $m \ge n$, $F : \mathbb{R}^n \to \mathbb{R}^m$, and $f_i(x)$ is the i-th component function of $F(x)$. From a given starting
point, an active set IA, which contains the indices of the variables at their bounds, is built. A
variable is called a free variable if it is not in the active set. The routine then computes the
search direction for the free variables according to the formula
\[ d = -(J^T J + \mu I)^{-1} J^T F \]
where $\mu$ is the Levenberg-Marquardt parameter, $F = F(x)$, and $J$ is the Jacobian with respect to the
free variables. The search direction for the variables in IA is set to zero. The trust region approach
discussed by Dennis and Schnabel (1983) is used to find the new point. Finally, the optimality
conditions are checked. The conditions are
\[ \|g(x_i)\| \le \varepsilon, \quad l_i < x_i < u_i \]
\[ g(x_i) < 0, \quad x_i = u_i \]
\[ g(x_i) > 0, \quad x_i = l_i \]
where $\varepsilon$ is a gradient tolerance. This process is repeated until the optimality criterion is achieved.
The active set is changed only when a free variable hits its bounds during an iteration or the
optimality condition is met for the free variables but not for all variables in IA, the active set. In
the latter case, a variable that violates the optimality condition will be dropped out of IA. For more
detail on the Levenberg-Marquardt method, see Levenberg (1944), or Marquardt (1963). For more
detailed information on active set strategy, see Gill and Murray (1976).
Since a finite-difference method is used to estimate the Jacobian for some single precision
calculations, an inaccurate estimate of the Jacobian may cause the algorithm to terminate at a
noncritical point. In such cases, high precision arithmetic is recommended. Also, whenever the
exact Jacobian can be easily provided, routine BCLSJ should be used instead.
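The Levenberg-Marquardt direction d = -(J^T J + mu I)^(-1) J^T F can be computed directly for a small problem. The sketch below is an illustration only, not the routine's internal algorithm: it uses the two residuals of the Rosenbrock least-squares example, treats both variables as free, and picks an arbitrary value for the Levenberg-Marquardt parameter (the name MU and its value are assumptions).

```fortran
program lm_step
   implicit none
   real :: x(2), f(2), jac(2,2), a(2,2), rhs(2), d(2)
   real :: mu, det

   x  = (/ -1.2, 1.0 /)     ! starting point of the example
   mu = 1.0e-2              ! Levenberg-Marquardt parameter (assumed value)

   ! Residuals f1 = 10*(x2 - x1**2), f2 = 1 - x1
   f(1) = 10.0*(x(2) - x(1)**2)
   f(2) = 1.0 - x(1)

   ! Analytic Jacobian, jac(i,j) = d f_i / d x_j
   jac(1,1) = -20.0*x(1)
   jac(1,2) =  10.0
   jac(2,1) =  -1.0
   jac(2,2) =   0.0

   ! Form A = J^T J + mu*I and the right-hand side -J^T F
   a = matmul(transpose(jac), jac)
   a(1,1) = a(1,1) + mu
   a(2,2) = a(2,2) + mu
   rhs = -matmul(transpose(jac), f)

   ! Solve the 2-by-2 system A d = rhs by Cramer's rule
   det  = a(1,1)*a(2,2) - a(1,2)*a(2,1)
   d(1) = (rhs(1)*a(2,2) - a(1,2)*rhs(2)) / det
   d(2) = (a(1,1)*rhs(2) - a(2,1)*rhs(1)) / det

   print '(a,2f12.5)', ' LM step d        = ', d
   print '(a,f12.5)',  ' old 0.5*||F||**2 = ', 0.5*dot_product(f, f)
end program lm_step
```

As mu grows the direction turns toward the negative gradient and shortens; as mu approaches zero it approaches the Gauss-Newton direction.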
Comments
1.
2.
Informational errors
Type   Code
 3      1    Both the actual and predicted relative reductions in the function are
             less than or equal to the relative function convergence tolerance.
 3      2    The iterates appear to be converging to a noncritical point.
 4      3    Maximum number of iterations exceeded.
 4      4    Maximum number of function evaluations exceeded.
 3      6    Five consecutive steps have been taken with the maximum step
             length.
 2      7    Scaled step tolerance satisfied; the current point may be an
             approximate local solution, or the algorithm is making very slow
             progress and is not near a solution, or STEPTL is too big.
3.
The first stopping criterion for BCLSF occurs when the norm of the function is less than
the absolute function tolerance. The second stopping criterion occurs when the norm of
the scaled gradient is less than the given gradient tolerance. The third stopping criterion
for BCLSF occurs when the scaled distance between the last two steps is less than the
step tolerance.
4.
If the default parameters are desired for BCLSF, then set IPARAM(1) to zero and call the
routine BCLSF. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling BCLSF:
CALL U4LSF (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to U4LSF will set IPARAM and RPARAM to their default values so only
nondefault values need to be set above.
The following is a list of the parameters and the default values:
IPARAM Integer vector of length 6.
IPARAM(1) = Initialization flag.
IPARAM(2) = Number of good digits in the function.
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of Jacobian evaluations.
Default: 100.
IPARAM(6) = Internal variable scaling flag.
If IPARAM(6) = 1, then the values for XSCALE are set internally.
Default: 1.
RPARAM Real vector of length 7.
RPARAM(1) = Scaled gradient tolerance.
The i-th component of the scaled gradient at x is calculated as
\[ \frac{|g_i| \, \max(|x_i|, 1/s_i)}{\|F(x)\|_2^2} \]
where
\[ g_i = \bigl( J(x)^T F(x) \bigr)_i \, (f_s)_i^2 \]
$J(x)$ is the Jacobian, $s$ = XSCALE, and $f_s$ = FSCALE.
Default: $\sqrt{\varepsilon}$ ($\sqrt[3]{\varepsilon}$ in double), where $\varepsilon$ is the machine precision.
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
\[ \frac{|x_i - y_i|}{\max(|x_i|, 1/s_i)} \]
where $s$ = XSCALE.
Default: $\varepsilon^{2/3}$, where $\varepsilon$ is the machine precision.
RPARAM(3) = Relative function tolerance.
1 =
(s t )
n
i =1
i i
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to Error Handling in the Introduction.
Example
The nonlinear least squares problem
\[ \min_{x \in \mathbb{R}^2} \frac{1}{2} \sum_{i=1}^{2} f_i(x)^2 \]
subject to
\[ -2 \le x_1 \le 0.5 \]
\[ -1 \le x_2 \le 2 \]
where
\[ f_1(x) = 10 (x_2 - x_1^2) \quad \text{and} \quad f_2(x) = (1 - x_1) \]
is solved with an initial guess (-1.2, 1.0) and default values for parameters.
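      USE BCLSF_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declaration of variables
      INTEGER    M, N
      PARAMETER  (M=2, N=2)
!
      INTEGER    IPARAM(6), ITP, NOUT
      REAL       FVEC(M), X(N), XGUESS(N), XLB(N), XUB(N)
      EXTERNAL   ROSBCK
!
      DATA XGUESS/-1.2E0, 1.0E0/
      DATA XLB/-2.0E0, -1.0E0/, XUB/0.5E0, 2.0E0/
!                                 All the bounds are provided
      ITP = 0
!                                 Default parameters are used
      IPARAM(1) = 0
!
      CALL BCLSF (ROSBCK, M, ITP, XLB, XUB, X, XGUESS=XGUESS, &
                  IPARAM=IPARAM, FVEC=FVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) X, FVEC, IPARAM(3), IPARAM(4)
!
99999 FORMAT (' The solution is ', 2F9.4, //, ' The function ', &
             'evaluated at the solution is ', /, 18X, 2F9.4, //, &
             ' The number of iterations is ', 10X, I3, /, ' The ', &
             'number of function evaluations is ', I3, /)
      END
!
      SUBROUTINE ROSBCK (M, N, X, F)
      INTEGER    M, N
      REAL       X(N), F(M)
!
      F(1) = 1.0E1*(X(2)-X(1)*X(1))
      F(2) = 1.0E0 - X(1)
      RETURN
      END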
Output
 The solution is    0.5000   0.2500

 The number of iterations is            15
 The number of function evaluations is  20
BCLSJ
Solves a nonlinear least squares problem subject to bounds on the variables using a modified
Levenberg-Marquardt algorithm and a user-supplied Jacobian.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (M, N, X, F), where
M Length of F. (Input)
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
Action
User supplies only the bounds on 1st variable, all other variables will have
the same bounds.
XLB Vector of length N containing the lower bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
XUB Vector of length N containing the upper bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
X Vector of length N containing the approximate solution. (Output)
Optional Arguments
N Number of variables. (Input)
N must be less than or equal to M.
Default: N = SIZE (X,1).
XGUESS Vector of length N containing the initial guess. (Input)
Default: XGUESS = 0.0.
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
XSCALE is used mainly in scaling the gradient and the distance between two points. By
default, the values for XSCALE are set internally. See IPARAM(6) in Comment 4.
FSCALE Vector of length M containing the diagonal scaling matrix for the functions.
(Input)
FSCALE is used mainly in scaling the gradient. In the absence of other information, set
all entries to 1.0.
Default: FSCALE = 1.0.
IPARAM Parameter vector of length 6. (Input/Output)
Set IPARAM(1) to zero for default values of IPARAM and RPARAM. See Comment 4.
Default: IPARAM= 0.
RPARAM Parameter vector of length 7. (Input/Output)
See Comment 4.
FVEC Vector of length M containing the residuals at the approximate solution. (Output)
FJAC M by N matrix containing a finite difference approximate Jacobian at the
approximate solution. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BCLSJ uses a modified Levenberg-Marquardt method and an active set strategy to
solve nonlinear least squares problems subject to simple bounds on the variables. The problem is
stated as follows:
\[ \min_{x \in \mathbb{R}^n} \frac{1}{2} F(x)^T F(x) = \frac{1}{2} \sum_{i=1}^{m} f_i(x)^2 \]
subject to $l \le x \le u$
where $m \ge n$, $F : \mathbb{R}^n \to \mathbb{R}^m$, and $f_i(x)$ is the i-th component function of $F(x)$. From a given starting
point, an active set IA, which contains the indices of the variables at their bounds, is built. A
variable is called a free variable if it is not in the active set. The routine then computes the
search direction for the free variables according to the formula
\[ d = -(J^T J + \mu I)^{-1} J^T F \]
where $\mu$ is the Levenberg-Marquardt parameter, $F = F(x)$, and $J$ is the Jacobian with respect to the
free variables. The search direction for the variables in IA is set to zero. The trust region approach
discussed by Dennis and Schnabel (1983) is used to find the new point. Finally, the optimality
conditions are checked. The conditions are
\[ \|g(x_i)\| \le \varepsilon, \quad l_i < x_i < u_i \]
\[ g(x_i) < 0, \quad x_i = u_i \]
\[ g(x_i) > 0, \quad x_i = l_i \]
where $\varepsilon$ is a gradient tolerance. This process is repeated until the optimality criterion is achieved.
The active set is changed only when a free variable hits its bounds during an iteration or the
optimality condition is met for the free variables but not for all variables in IA, the active set. In
the latter case, a variable that violates the optimality condition will be dropped out of IA. For more
detail on the Levenberg-Marquardt method, see Levenberg (1944) or Marquardt (1963). For more
detailed information on active set strategy, see Gill and Murray (1976).
Comments
1.
2.
Informational errors
Type   Code
 3      1    Both the actual and predicted relative reductions in the function are
             less than or equal to the relative function convergence tolerance.
 3      2    The iterates appear to be converging to a noncritical point.
 4      3    Maximum number of iterations exceeded.
 4      4    Maximum number of function evaluations exceeded.
 3      6    Five consecutive steps have been taken with the maximum step
             length.
 4      5    Maximum number of Jacobian evaluations exceeded.
 2      7    Scaled step tolerance satisfied; the current point may be an
             approximate local solution, or the algorithm is making very slow
             progress and is not near a solution, or STEPTL is too big.
3.
The first stopping criterion for BCLSJ occurs when the norm of the function is less than
the absolute function tolerance. The second stopping criterion occurs when the norm of
the scaled gradient is less than the given gradient tolerance. The third stopping criterion
for BCLSJ occurs when the scaled distance between the last two steps is less than the
step tolerance.
4.
If the default parameters are desired for BCLSJ, then set IPARAM(1) to zero and call the
routine BCLSJ. Otherwise, if any nondefault parameters are desired for IPARAM or
RPARAM, then the following steps should be taken before calling BCLSJ:
CALL U4LSF (IPARAM, RPARAM)
Set nondefault values for desired IPARAM, RPARAM elements.
Note that the call to U4LSF will set IPARAM and RPARAM to their default values so only
nondefault values need to be set above.
The following is a list of the parameters and the default values:
IPARAM Integer vector of length 6.
IPARAM(1) = Initialization flag.
IPARAM(2) = Number of good digits in the function.
Default: 100.
IPARAM(4) = Maximum number of function evaluations.
Default: 400.
IPARAM(5) = Maximum number of Jacobian evaluations.
Default: 100.
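IPARAM(6) = Internal variable scaling flag.
If IPARAM(6) = 1, then the values for XSCALE are set internally.
Default: 1.
RPARAM Real vector of length 7.
RPARAM(1) = Scaled gradient tolerance.
The i-th component of the scaled gradient at x is calculated as
\[ \frac{|g_i| \, \max(|x_i|, 1/s_i)}{\|F(x)\|_2^2} \]
where
\[ g_i = \bigl( J(x)^T F(x) \bigr)_i \, (f_s)_i^2 \]
$J(x)$ is the Jacobian, $s$ = XSCALE, and $f_s$ = FSCALE.
Default: $\sqrt{\varepsilon}$ ($\sqrt[3]{\varepsilon}$ in double), where $\varepsilon$ is the machine precision.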
RPARAM(2) = Scaled step tolerance. (STEPTL)
The i-th component of the scaled step between two points x and y is
computed as
\[ \frac{|x_i - y_i|}{\max(|x_i|, 1/s_i)} \]
where $s$ = XSCALE.
Default: $\varepsilon^{2/3}$, where $\varepsilon$ is the machine precision.
RPARAM(3) = Relative function tolerance.
1 =
(s t )
n
i =1
i i
Users wishing to override the default print/stop attributes associated with error
messages issued by this routine are referred to ERROR HANDLING in the Introduction.
Example
The nonlinear least squares problem
\[ \min_{x \in \mathbb{R}^2} \frac{1}{2} \sum_{i=1}^{2} f_i(x)^2 \]
subject to
\[ -2 \le x_1 \le 0.5 \]
\[ -1 \le x_2 \le 2 \]
where
\[ f_1(x) = 10 (x_2 - x_1^2) \quad \text{and} \quad f_2(x) = (1 - x_1) \]
is solved with an initial guess (-1.2, 1.0) and default values for parameters.
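      USE BCLSJ_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declaration of variables
      INTEGER    LDFJAC, M, N
      PARAMETER  (LDFJAC=2, M=2, N=2)
!
      INTEGER    IPARAM(6), ITP, NOUT
      REAL       FVEC(M), X(N), XGUESS(N), XLB(N), XUB(N)
      EXTERNAL   ROSBCK, ROSJAC
!
      DATA XGUESS/-1.2E0, 1.0E0/
      DATA XLB/-2.0E0, -1.0E0/, XUB/0.5E0, 2.0E0/
!                                 All the bounds are provided
      ITP = 0
!                                 Default parameters are used
      IPARAM(1) = 0
!
      CALL BCLSJ (ROSBCK, ROSJAC, M, ITP, XLB, XUB, X, XGUESS=XGUESS, &
                  IPARAM=IPARAM, FVEC=FVEC)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) X, FVEC, IPARAM(3), IPARAM(4)
!
99999 FORMAT (' The solution is ', 2F9.4, //, ' The function ', &
             'evaluated at the solution is ', /, 18X, 2F9.4, //, &
             ' The number of iterations is ', 10X, I3, /, ' The ', &
             'number of function evaluations is ', I3, /)
      END
!
      SUBROUTINE ROSBCK (M, N, X, F)
      INTEGER    M, N
      REAL       X(N), F(M)
!
      F(1) = 1.0E1*(X(2)-X(1)*X(1))
      F(2) = 1.0E0 - X(1)
      RETURN
      END
!
      SUBROUTINE ROSJAC (M, N, X, FJAC, LDFJAC)
      INTEGER    M, N, LDFJAC
      REAL       X(N), FJAC(LDFJAC,N)
!
      FJAC(1,1) = -20.0E0*X(1)
      FJAC(2,1) = -1.0E0
      FJAC(1,2) = 10.0E0
      FJAC(2,2) = 0.0E0
      RETURN
      END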
Output
 The solution is    0.5000   0.2500

 The number of iterations is            13
 The number of function evaluations is  21
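A user-supplied Jacobian is a common source of errors, so it can be worth comparing it with forward differences before passing it to the solver. The following is a library-independent sketch of such a check, not a facility of BCLSJ: the step-size rule and the test point are assumptions, and ROSBCK and ROSJAC are restated here as internal procedures so that the program is self-contained.

```fortran
program check_jacobian
   implicit none
   integer, parameter :: m = 2, n = 2
   real :: x(n), xh(n), f0(m), f1(m), fjac(m,n), fd(m,n), h
   integer :: j

   x = (/ -1.2, 1.0 /)      ! test point (the example's initial guess)

   call rosbck(m, n, x, f0)
   call rosjac(m, n, x, fjac, m)

   ! Forward-difference approximation, one column per variable
   do j = 1, n
      h = sqrt(epsilon(1.0))*max(abs(x(j)), 1.0)   ! assumed step rule
      xh = x
      xh(j) = x(j) + h
      call rosbck(m, n, xh, f1)
      fd(:,j) = (f1 - f0)/h
   end do

   print '(a,2f12.4)', ' analytic row 1 : ', fjac(1,:)
   print '(a,2f12.4)', ' analytic row 2 : ', fjac(2,:)
   print '(a,es10.2)', ' max abs difference : ', maxval(abs(fd - fjac))

contains

   subroutine rosbck(m, n, x, f)
      integer, intent(in) :: m, n
      real, intent(in)    :: x(n)
      real, intent(out)   :: f(m)
      f(1) = 1.0e1*(x(2) - x(1)*x(1))
      f(2) = 1.0e0 - x(1)
   end subroutine rosbck

   subroutine rosjac(m, n, x, fjac, ldfjac)
      integer, intent(in) :: m, n, ldfjac
      real, intent(in)    :: x(n)
      real, intent(out)   :: fjac(ldfjac,n)
      fjac(1,1) = -20.0e0*x(1)
      fjac(2,1) = -1.0e0
      fjac(1,2) = 10.0e0
      fjac(2,2) = 0.0e0
   end subroutine rosjac

end program check_jacobian
```

A difference that is large relative to the Jacobian entries usually points to a sign or index error in the analytic routine; in single precision, agreement to a few significant digits is all that forward differences can show.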
BCNLS
Solves a nonlinear least-squares problem subject to bounds on the variables and general linear
constraints.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (M, N, X, F), where
M Number of functions. (Input)
N Number of variables. (Input)
X Array of length N containing the point at which the function will be evaluated.
(Input)
F Array of length M containing the computed function at the point X. (Output)
The routine FCN must be declared EXTERNAL in the calling program.
M Number of functions. (Input)
C MCON by N matrix containing the coefficients of the MCON general linear constraints.
(Input)
BL Vector of length MCON containing the lower limit of the general constraints. (Input).
BU Vector of length MCON containing the upper limit of the general constraints. (Input).
IRTYPE Vector of length MCON indicating the types of general constraints in the matrix C.
(Input)
Let R(I) = C(I, 1)*X(1) + ... + C(I, N)*X(N). Then the value of IRTYPE(I)
signifies the following:
IRTYPE(I)   I-th CONSTRAINT
0           BL(I).EQ.R(I).EQ.BU(I)
1           R(I).LE.BU(I)
2           R(I).GE.BL(I)
3           BL(I).LE.R(I).LE.BU(I)
XLB Vector of length N containing the lower bounds on variables; if there is no lower
bound on a variable, then 1.0E30 should be set as the lower bound. (Input)
XUB Vector of length N containing the upper bounds on variables; if there is no upper
bound on a variable, then -1.0E30 should be set as the upper bound. (Input)
X Vector of length N containing the approximate solution. (Output)
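The IRTYPE coding described above can be sketched as a small checker. This is only an illustration, not part of the library: the module name irtype_check, the two procedure names, and the tolerance argument tol are all hypothetical.

```fortran
module irtype_check
  implicit none
contains
  ! Returns .true. when the constraint value r satisfies the constraint
  ! type coded by irtype (0, 1, 2 or 3, as in the IRTYPE table).
  ! tol is an illustrative comparison tolerance, not an IMSL parameter.
  pure logical function constraint_ok(irtype, r, bl, bu, tol)
    integer, intent(in) :: irtype
    real, intent(in) :: r, bl, bu, tol
    select case (irtype)
    case (0)   ! BL(I) = R(I) = BU(I)
      constraint_ok = abs(r - bl) <= tol .and. abs(r - bu) <= tol
    case (1)   ! R(I) <= BU(I)
      constraint_ok = r <= bu + tol
    case (2)   ! R(I) >= BL(I)
      constraint_ok = r >= bl - tol
    case (3)   ! BL(I) <= R(I) <= BU(I)
      constraint_ok = r >= bl - tol .and. r <= bu + tol
    case default
      constraint_ok = .false.
    end select
  end function constraint_ok

  ! R(I) = C(I,1)*X(1) + ... + C(I,N)*X(N) for every constraint row I.
  pure function constraint_values(c, x) result(r)
    real, intent(in) :: c(:,:), x(:)
    real :: r(size(c,1))
    r = matmul(c, x)
  end function constraint_values
end module irtype_check
```

A caller would first form R with constraint_values(C, X) and then test each row with constraint_ok(IRTYPE(I), R(I), BL(I), BU(I), tol).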
Optional Arguments
N Number of variables. (Input)
Default: N = SIZE (C,2).
MCON The number of general linear constraints for the system, not including simple
bounds. (Input)
Default: MCON = SIZE (C,1).
LDC Leading dimension of C exactly as specified in the dimension statement of the calling
program. (Input)
LDC must be at least MCON.
Default: LDC = SIZE (C,1).
XGUESS Vector of length N containing the initial guess. (Input)
Default: XGUESS = 0.0.
RNORM The Euclidean length of the components of the function f(x) after the approximate
solution has been found. (Output)
ISTAT Scalar indicating further information about the approximate solution X. (Output)
See the Comments section for a description of the tolerances and the vectors IPARAM
and RPARAM.
ISTAT  Meaning
1      The function f(x) has a length less than TOLF = RPARAM(1). This is the expected
       value for ISTAT when an actual zero value of f(x) is anticipated.
2      The function f(x) has reached a local minimum. This is the expected value for
       ISTAT when a nonzero value of f(x) is anticipated.
3      A small change (absolute) was noted for the vector x. A full model problem step
       was taken. The condition for ISTAT = 2 may also be satisfied, so that a
       minimum has been found. However, this test is made before the test for
       ISTAT = 2.
4      A small change (relative) was noted for the vector x. A full model problem step
       was taken. The condition for ISTAT = 2 may also be satisfied, so that a
       minimum has been found. However, this test is made before the test for
       ISTAT = 2.
5      The number of terms in the quadratic model is being restricted by the amount of
       storage allowed for that purpose. It is suggested, but not required, that
       additional storage be given for the quadratic model parameters. This is
       accessed through the vector IPARAM, documented below.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine BCNLS solves the nonlinear least squares problem
$$\min \sum_{i=1}^{m} f_i(x)^2$$
subject to
$$b_l \le Cx \le b_u$$
$$x_l \le x \le x_u$$
BCNLS is based on the routine DQED by R.J. Hanson and F.T. Krogh. The section of BCNLS that
approximates, using finite differences, the Jacobian of f(x) is a modification of JACBF by D.E.
Salane.
Comments
1.
If the default parameters are desired for B2NLS, set IPARAM(1) to zero.
Otherwise, if any nondefault parameters are desired for IPARAM or RPARAM, the
following steps should be taken before calling B2NLS:
CALL B7NLS (IPARAM, RPARAM)
Set nondefault values for IPARAM and RPARAM.
If double precision is being used, DB7NLS should be called instead. Following is
a list of parameters and the default values.
IPARAM(1) = Initialization flag.
IPARAM(2) = ITMAX, the maximum number of iterations allowed.
Default: 75
IPARAM(3) = a flag that suppresses the use of the quadratic model in the inner
loop. If set to one, then the quadratic model is never used. Otherwise use
the quadratic model where appropriate. This option decreases the amount
of workspace as well as the computing overhead required. A user may
wish to determine if the application really requires the use of the
quadratic model.
Default: 0
IPARAM(4) = NTERMS, one more than the maximum number of terms used in the
quadratic model.
Default: 5
IPARAM(5) = RCSTAT, a flag that determines whether forward or reverse
Default: 0
RPARAM Real vector of length 7 used to change certain default attributes of
BCNLS. (Input)
For the description of RPARAM, we make the following definitions:
FC current value of the length of f (x)
FB best value of length of f (x)
FL value of length of f (x) at the previous step
PV predicted value of length of f (x), after the step is taken, using the
Default : min(1.E 5, )
RPARAM(2) = TOLX, tolerance for stopping when change to x values has length
less than or equal to TOLX*length of x values.
Default : min(1.E 5, )
RPARAM(3) = TOLD, tolerance for stopping when change to x values has length
less than or equal to TOLD.
Default : min(1.E 5, )
RPARAM(4) = TOLSNR, tolerance used in stopping condition ISTAT = 2.
Default: 1.E-5
RPARAM(5) = TOLP, tolerance used in stopping condition ISTAT = 2.
Default: 1.E-5
RPARAM(6) = TOLUSE, tolerance used to avoid values of x in the quadratic
RPARAM(7) = COND, largest condition number to allow when solving for the
fi
xj
2. Informational errors
Type Code
3    1    The function f(x) has reached a value that may be a local minimum.
          However, the bounds on the trust region defining the size of the step
          are being hit at each step. Thus, the situation is suspect. (Situations of
          this type can occur when the solution is at infinity at some of the
          components of the unknowns, x).
     2    The model problem solver has noted a value for the linear or
          quadratic model problem residual vector length that is greater than or
          equal to the current value of the function, i.e. the Euclidean length of
          f(x). This situation probably means that the evaluation of f(x) has
          more uncertainty or noise than is possible to account for in the
          tolerances used to note a local minimum. The value of x is suspect, but
          a minimum has probably been found.
     3    More than ITMAX iterations were taken to obtain the solution. The
          value obtained for x is suspect, although it is the best set of x values
          that occurred in the entire computation. The value of ITMAX can be
          increased through the IPARAM vector.
Example 1
This example finds the four variables x1, x2, x3, x4 that are in the model function
$$h(t) = x_1 e^{x_2 t} + x_3 e^{x_4 t}$$
There are also the constraints that $x_2, x_4 \le 0$, $x_1, x_3 \ge 0$, and x2 and x4 must be separated by at least
0.05. Nothing more about the values of the parameters is known, so the initial guess is 0.
USE BCNLS_INT
USE UMACH_INT
USE WRRRN_INT
IMPLICIT NONE
INTEGER MCON, N
PARAMETER (MCON=1, N=4)
INTEGER LDC, M
PARAMETER (M=5, LDC=MCON)
!
INTEGER
REAL
!
!
EXTERNAL
!
CALL UMACH (2, NOUT)
!
!
XLB(1) = 0.0
XLB(2) = 1.0E30
XLB(3) = 0.0
XLB(4) = 1.0E30
XUB(1) = -1.0E30
XUB(2) = 0.0
XUB(3) = -1.0E30
XUB(4) = 0.0
!
CALL BCNLS (FCN, M, C, BL, BL, IRTYPE, XLB, XUB, X, RNORM=RNORM)
CALL WRRRN ('X', X, 1, N, 1)
Output
              X
    1        2        3        4
1.999   -1.000    0.500   -9.954

rnorm = .42425E-03
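The function evaluation behind this example can be sketched as follows. This is a hedged reconstruction, not the routine from the example itself: the data values are copied from the DATA statements for T and H in Example 2 (which solves the same problem), while the module name bcnls_model and the helper residual_norm are hypothetical. Whether a module procedure may be passed where BCNLS expects an EXTERNAL routine is not verified here.

```fortran
module bcnls_model
  implicit none
  ! Data points (t_i, h_i), as listed in the DATA statements of Example 2.
  real, parameter :: t(5) = [0.05, 0.1, 0.4, 0.5, 1.0]
  real, parameter :: h(5) = [2.206, 1.994, 1.35, 1.216, 0.7358]
contains
  ! Same argument list as the FCN described under Required Arguments.
  ! f(i) is the misfit between the model h(t_i) and the datum h_i.
  ! This sketch assumes m <= 5 and n = 4.
  subroutine fcn(m, n, x, f)
    integer, intent(in) :: m, n
    real, intent(in) :: x(n)
    real, intent(out) :: f(m)
    integer :: i
    do i = 1, m
      f(i) = x(1)*exp(x(2)*t(i)) + x(3)*exp(x(4)*t(i)) - h(i)
    end do
  end subroutine fcn

  ! Euclidean length of f(x), the quantity reported in RNORM.
  real function residual_norm(x)
    real, intent(in) :: x(4)
    real :: f(5)
    call fcn(5, 4, x, f)
    residual_norm = sqrt(sum(f**2))
  end function residual_norm
end module bcnls_model
```

Evaluating residual_norm at the rounded solution printed above gives a small value, but it need not reproduce the printed rnorm exactly because the printed X is rounded to three decimals.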
Additional Examples
Example 2
This example solves the same problem as the last example, but reverse communication is used to
evaluate f(x) and the Jacobian of f(x). The use of the quadratic model is turned off.
USE B2NLS_INT
USE UMACH_INT
USE WRRRN_INT
IMPLICIT NONE
INTEGER LDC, LDFJ, M, MCON, N
PARAMETER (M=5, MCON=1, N=4, LDC=MCON, LDFJ=M)
! Specifications for local variables
INTEGER I, IPARAM(6), IRTYPE(MCON), ISTAT, IWORK(1000), &
LIWORK, LWORK, NOUT
REAL BL(MCON), C(MCON,N), F(M), FJ(M,N), RNORM, RPARAM(7), &
WORK(1000), X(N), XGUESS(N), XLB(N), XUB(N)
REAL H(5), T(5)
SAVE H, T
INTRINSIC EXP
REAL EXP
EXTERNAL B7NLS
EXTERNAL B10LS, B11LS
!
!
!
DATA T/0.05, 0.1, 0.4, 0.5, 1.0/
DATA H/2.206, 1.994, 1.35, 1.216, 0.7358/
!
CALL UMACH (2, NOUT)
!
!
C(1,1) = 0.0
C(1,2) = 1.0
C(1,3) = 0.0
C(1,4) = -1.0
BL(1) = 0.05
IRTYPE(1) = 2
XLB(1) = 0.0
XLB(2) = 1.0E30
XLB(3) = 0.0
XLB(4) = 1.0E30
XUB(1) = -1.0E30
XUB(2) = 0.0
XUB(3) = -1.0E30
XUB(4) = 0.0
XGUESS = 0.0E0
=
=
=
=
=
1
1
1
1000
1000
!
!
!
!
!
!
FJ(I,3) = EXP(X(4)*T(I))
FJ(I,4) = T(I)*X(3)*FJ(I,3)
F(I) = X(1)*FJ(I,1) + X(3)*FJ(I,3) - H(I)
20 CONTINUE
GO TO 10
END IF
!
CALL WRRRN ('X', X, 1, N, 1)
WRITE (NOUT,99999) RNORM
99999 FORMAT (/, 'rnorm = ', E10.5)
END
Output
              X
    1        2        3        4
1.999   -1.000    0.500   -9.954

rnorm = .42413E-03
READ_MPS
This subroutine reads an MPS file containing a linear programming problem or a quadratic
programming problem.
Required Arguments
FILENAME Character string containing the name of the MPS file to be read. (Input)
MPS A structure of IMSL defined derived type s_MPS containing the data read from the
MPS file. (Output)
The IMSL defined derived type s_MPS consists of the following components:
Component
Description
integer nrows
integer ncolumns
integer nonzeros
integer nhessian
integer ninteger
integer nbinary
(0 or 1).
constraint(:)
type(s_SparseMatrixElement), allocatable ::
hessian(:)
Continuous
Integer
Binary (0 or 1)
Semicontinuous
This derived type stores the constraint and Hessian matrices in a simple sparse matrix format of
derived type s_SparseMatrixElement defined in the interface module mp_types.
s_SparseMatrixElement consists of three components; a row index, a column index, and a
value. For each non-zero element in the constraint and Hessian matrices an element of derived
type s_SparseMatrixElement is stored. The following code fragment expands the sparse
constraint matrix of the derived type s_SparseMatrixElement contained in mps, a derived type
of type s_MPS, into a dense matrix:
! allocate a matrix
integer :: nr, nc, i, j, k
real (kind(1e0)), allocatable :: matrix(:,:)
nr = mps%nrows
nc = mps%ncolumns
allocate(matrix(nr,nc))
matrix = 0.0e0
! expand the sparse matrix
do k = 1, mps%nonzeros
i = mps%constraint(k)%row
j = mps%constraint(k)%column
matrix(i,j) = mps%constraint(k)%value
end do
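The same expansion can be tried without the library by using a stand-in element type. The sketch below is an assumption-laden illustration: sparse_elem is a hypothetical substitute for s_SparseMatrixElement (same three components: row, column, value), and the module name sparse_expand and function to_dense are not IMSL names.

```fortran
module sparse_expand
  implicit none
  ! Hypothetical stand-in for s_SparseMatrixElement from mp_types:
  ! a row index, a column index, and a value.
  type :: sparse_elem
    integer :: row
    integer :: column
    real :: value
  end type sparse_elem
contains
  ! Expands a list of (row, column, value) triples into a dense
  ! nrows by ncolumns matrix; entries not listed stay zero.
  function to_dense(nrows, ncolumns, elems) result(matrix)
    integer, intent(in) :: nrows, ncolumns
    type(sparse_elem), intent(in) :: elems(:)
    real :: matrix(nrows, ncolumns)
    integer :: k
    matrix = 0.0
    do k = 1, size(elems)
      matrix(elems(k)%row, elems(k)%column) = elems(k)%value
    end do
  end function to_dense
end module sparse_expand
```

With the real derived type, the call would correspond to passing mps%nrows, mps%ncolumns and mps%constraint(1:mps%nonzeros).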
The IMSL derived type d_MPS is the double precision counterpart to s_MPS. The IMSL derived
type d_SparseMatrixElement is the double precision counterpart to
s_SparseMatrixElement.
To release the space allocated for this derived type use the following statement:
call mps_free(mps)
Optional Arguments
NUNIT The unit number for reading an MPS file opened by the user. If NUNIT is not used,
this subroutine opens the file indicated by FILENAME for reading and then closes it
after reading. (Input)
By default, 7 is used.
OBJ Character string of length 8 containing the name of the objective function set to be
used. (Input)
An MPS file can contain multiple objective function sets.
By default, the first objective function set in the MPS file is used. This name is case
sensitive.
RHS Character string of length 8 containing the name of the RHS set to be used. (Input)
An MPS file can contain multiple RHS sets.
By default, the first RHS set in the MPS file is used. This name is case sensitive.
RANGES Character string of length 8 containing the name of the RANGES set to be used.
(Input)
An MPS file can contain multiple RANGES sets.
By default, the first RANGES set in the MPS file is used. This name is case sensitive.
BOUNDS Character string of length 8 containing the name of the BOUNDS set to be used.
(Input)
An MPS file can contain multiple BOUNDS sets.
By default, the first BOUNDS set in the MPS file is used. This name is case sensitive.
POS_INF Value used for a constraint or bound upper limit when the constraint or bound is
unbounded above. (Input)
Default: 1.0e+30.
NEG_INF Value used for a constraint or bound lower limit when the constraint or bound
is unbounded below. (Input)
Default: -1.0e+30.
FORTRAN 90 Interface
Generic:
Specific:
Description
An MPS file defines a linear or quadratic programming problem.
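A linear programming problem is assumed to have the form:
$$\min_x c^T x$$
$$b_l \le Ax \le b_u$$
$$x_l \le x \le x_u$$
A quadratic programming problem is assumed to have the form:
$$\min_x \tfrac{1}{2} x^T Q x + c^T x$$
$$b_l \le Ax \le b_u$$
$$x_l \le x \le x_u$$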
The following table maps this notation into the components in the derived type returned by
READ_MPS:
Notation   Component
c          objective
A          constraint
Q          hessian
bl         lower_range
bu         upper_range
xl         lower_bound
xu         upper_bound
If the MPS file specifies an equality constraint or bound, the corresponding lower and upper
values in the returned derived type will be exactly equal.
The problem formulation assumes that the constraints and bounds are two-sided. If a particular
constraint or bound has no lower limit, then the corresponding component of the derived type is
set to -1.0e+30. If the upper limit is missing, then the corresponding component of the derived
type is set to +1.0e+30.
Field Number   Columns   Contents
1              2-3       Indicator
2              5-12      Name
3              15-22     Name
4              25-36     Value
5              40-47     Name
6              50-61     Value
The format limits MPS names to 8 characters and values to 12 characters. The names in fields 2, 3
and 5 are case sensitive. Leading and trailing blanks are ignored, but internal spaces are
significant.
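The fixed column layout can be sketched as a small field extractor. This is an illustration under stated assumptions, not the parser used by READ_MPS: it assumes the six fields are numbered in column order (columns 2-3, 5-12, 15-22, 25-36, 40-47, 50-61), and the names mps_fields, first_col, last_col and mps_field are hypothetical.

```fortran
module mps_fields
  implicit none
  ! First and last column of fields 1..6 in a fixed-format MPS data line.
  integer, parameter :: first_col(6) = [2, 5, 15, 25, 40, 50]
  integer, parameter :: last_col(6)  = [3, 12, 22, 36, 47, 61]
contains
  ! Returns field k (1..6) of one data line. Leading blanks are dropped
  ! and the result is blank-padded, so trailing blanks do not matter in
  ! comparisons; internal spaces are preserved.
  function mps_field(line, k) result(field)
    character(len=*), intent(in) :: line
    integer, intent(in) :: k
    character(len=12) :: field
    character(len=61) :: padded
    padded = line   ! pads short lines with blanks, truncates long ones
    field = adjustl(padded(first_col(k):last_col(k)))
  end function mps_field
end module mps_fields
```

Numeric fields (4 and 6) come back as text here; converting them with an internal READ is left to the caller.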
The sections in an MPS file are as follows.
NAME
ROWS
COLUMNS
RHS
RANGES (optional)
BOUNDS (optional)
QUADRATIC (optional)
ENDATA
NAME Section
The NAME section contains a single line. A problem name can occur anywhere on the line after
NAME and before column 62. The problem name is truncated to 8 characters.
ROWS Section
The ROWS section defines the name and type for each row. Field 1 contains the row type and
field 2 contains the row name. Row type values are not case sensitive. Row names are case
sensitive. The following row types are allowed:
Row Type   Meaning
E          Equality constraint.
G          Greater than or equal constraint.
COLUMNS Section
The COLUMNS section defines the nonzero entries in the objective and the constraint matrix. The
row names here must have been defined in the ROWS section.
Field
Contents
Column name.
Row name.
Row name.
The COLUMNS section can also contain markers. These are indicated by the name 'MARKER'
(with the quotes) in field 3 and the marker type in field 4 or 5.
Marker type 'INTORG' (with the quotes) begins an integer group. The marker type 'INTEND' (with
the quotes) ends this group. The variables corresponding to the columns defined within this group
are required to be integer.
RHS Section
The RHS section defines the right-hand side of the constraints. An MPS file can contain more than
one RHS set, distinguished by the RHS set name. The row names here must be defined in the
ROWS section.
Field
Contents
Row name.
Row name.
RANGES Section
The optional RANGES section defines two-sided constraints. An MPS file can contain more than
one range set, distinguished by the range set name. The row names here must have been defined in
the ROWS section.
Field
Contents
2
Row name.
Row name.
Ranges change one-sided constraints, defined in the RHS section, into two-sided constraints. The
two-sided constraint for row i depends on the range value, $r_i$, defined in this section. The
right-hand side value, $b_i$, is defined in the RHS section. The two-sided constraints for row i are given
in the following table:
Row Type   Lower Constraint        Upper Constraint
G          $b_i$                   $b_i + |r_i|$
L          $b_i - |r_i|$           $b_i$
E          $b_i + \min(0, r_i)$    $b_i + \max(0, r_i)$
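The range rule can be sketched in code. The sketch assumes the usual MPS convention: a G row keeps its right-hand side as the lower limit and adds |r| for the upper limit, an L row subtracts |r| for the lower limit, and an E row uses the sign of r. The module name mps_ranges and the routine range_limits are hypothetical, and the fallback for other row types is an arbitrary choice for illustration.

```fortran
module mps_ranges
  implicit none
contains
  ! Computes the two-sided limits [lower, upper] for one row from its
  ! row type, right-hand side b, and range value r.
  subroutine range_limits(row_type, b, r, lower, upper)
    character(len=1), intent(in) :: row_type
    real, intent(in) :: b, r
    real, intent(out) :: lower, upper
    select case (row_type)
    case ('G', 'g')
      lower = b
      upper = b + abs(r)
    case ('L', 'l')
      lower = b - abs(r)
      upper = b
    case ('E', 'e')
      lower = b + min(0.0, r)
      upper = b + max(0.0, r)
    case default
      ! Row types not covered by the table: hypothetical fallback that
      ! leaves the degenerate interval [b, b].
      lower = b
      upper = b
    end select
  end subroutine range_limits
end module mps_ranges
```

For instance, an E row with b = 4 and r = -2 yields the interval [2, 4], while a G row with the same values yields [4, 6].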
BOUNDS Section
The optional BOUNDS section defines bounds on the variables. By default, the bounds
are $0 \le x_i \le \infty$. The bounds can also be used to indicate that a variable must be an integer.
More than one bound can be set for a single variable. For example, to set $2 \le x_i \le 6$ use a LO
bound with value 2 to set $2 \le x_i$ and a UP bound with value 6 to add the condition $x_i \le 6$.
An MPS file can contain more than one bounds set, distinguished by the bound set name.
Field
Contents
Bounds type.
Column name
Column name.
The bound types are as follows. Here $b_i$ are the bound values defined in this section, the $x_i$ are the
variables, and I is the set of integers.
Bound Type   Definition       Formula
LO           Lower bound      $b_i \le x_i$
UP           Upper bound      $x_i \le b_i$
FX           Fixed variable   $x_i = b_i$
FR           Free variable    $-\infty \le x_i \le \infty$
MI                            $-\infty \le x_i$
PL                            $x_i \le \infty$
BV                            $x_i \in \{0, 1\}$
UI                            $x_i \le b_i$ and $x_i \in I$
LI                            $b_i \le x_i$ and $x_i \in I$
SC           Semicontinuous   0 or $b_i \le x_i$
QUADRATIC Section
The optional QUADRATIC section defines the Hessian for quadratic programming problems. The
names HESSIAN, QUADS, QUADOBJ, QSECTION, and QMATRIX are also recognized as
beginning the QUADRATIC section.
Field
Contents
Column name.
Column name.
Column name.
ENDATA Section
The ENDATA section ends the MPS file.
Comments
Informational errors
Type
Code
No bounds found.
Invalid number.
11
12
13
Out-of-order marker.
14
15
16
17
18
Example 1
use read_mps_int
implicit none
TYPE(S_MPS) mps
CALL read_mps ('test.mps', mps)
End
Additional Examples
Example 2
See Example 2 of DENSE_LP.
MPS_FREE
Deallocates the space allocated for the IMSL derived type s_MPS. This routine is usually used in
conjunction with READ_MPS.
Required Arguments
MPS A structure of IMSL defined derived type s_MPS containing the data read from the
MPS file. (Input/Output)
The allocated components of s_MPS will be deallocated on output.
The IMSL defined derived type s_MPS consists of the following components:
Component
Description
integer nrows
integer ncolumns
integer nonzeros
integer nhessian
integer ninteger
integer nbinary
Component
type(s_SparseMatrixElement), allocatable ::
hessian(:)
Description
containing the objective vector.
Continuous
Integer
Binary (0 or 1)
Semicontinuous
Component
Description
This derived type stores the constraint and Hessian matrices in a simple sparse matrix format of
derived type s_SparseMatrixElement defined in the interface module mp_types.
s_SparseMatrixElement consists of three components; a row index, a column index, and a
value. For each non-zero element in the constraint and Hessian matrices an element of derived
type s_SparseMatrixElement is stored. The following code fragment expands the sparse
constraint matrix of the derived type s_SparseMatrixElement contained in mps, a derived type
of type s_MPS, into a dense matrix:
! allocate a matrix
integer :: nr, nc, i, j, k
real (kind(1e0)), allocatable :: matrix(:,:)
nr = mps%nrows
nc = mps%ncolumns
allocate(matrix(nr,nc))
matrix = 0.0e0
! expand the sparse matrix
do k = 1, mps%nonzeros
i = mps%constraint(k)%row
j = mps%constraint(k)%column
matrix(i,j) = mps%constraint(k)%value
end do
The IMSL derived type d_MPS is the double precision counterpart to s_MPS. The IMSL derived
type d_SparseMatrixElement is the double precision counterpart to
s_SparseMatrixElement.
FORTRAN 90 Interface
Generic:
Specific:
Description
This subroutine simply issues deallocate statements for each of the arrays allocated in the IMSL
derived type s_MPS defined above. It is supplied as a convenience utility to the user of
READ_MPS.
Example
In the following example, the space that had been allocated to accommodate the IMSL derived
type S_MPS is deallocated with a call to MPS_FREE after a call to READ_MPS was made.
use read_mps_int
use mps_free_int
implicit none
TYPE(S_MPS) mps
CALL read_mps ('test.mps', mps)
.
.
.
call mps_free (mps)
end
DENSE_LP
Solves a linear programming problem.
NOTE: DENSE_LP is available in double precision only.
Required Arguments
A M by NVAR matrix containing the coefficients of the M constraints. (Input)
BL Vector of length M containing the lower limit of the general constraints; if there is no
lower limit on the I-th constraint, then BL(I) is not referenced. (Input)
BU Vector of length M containing the upper limit of the general constraints; if there is no
upper limit on the I-th constraint, then BU(I) is not referenced; if there are no range
constraints, BL and BU can share the same storage locations. (Input)
C Vector of length NVAR containing the coefficients of the objective function. (Input)
IRTYPE Vector of length M indicating the types of general constraints in the matrix A.
(Input)
Let R(I) = A(I, 1) * XSOL(1) + ... + A(I, NVAR) * XSOL(NVAR). Then, the value of
IRTYPE(I) signifies the following:
IRTYPE(I)   I-th Constraint
0           BL(I) = R(I) = BU(I)
1           R(I) ≤ BU(I)
2           R(I) ≥ BL(I)
3           BL(I) ≤ R(I) ≤ BU(I)
Optional Arguments
M Number of constraints. (Input)
Default: M = SIZE (A,1).
NVAR Number of variables. (Input)
Default: NVAR = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
LDA must be at least M.
Default: LDA = SIZE (A,1).
XLB Vector of length NVAR containing the lower bound on the variables; if there is no
lower bound on a variable, then 1.0D30 should be set as the lower bound. (Input)
Default: XLB = 0.0D0.
XUB Vector of length NVAR containing the upper bound on the variables; if there is no
upper bound on a variable, then -1.0D30 should be set as the upper bound. (Input)
Default: No upper bound enforced.
ITREF The type of iterative refinement used. (Input)
ITREF
Refinement
No refinement
Iterative refinement
Default: ITREF = 0.
ITERS Number of iterations. (Output)
IERR Status flag indicating which warning conditions were set upon completion.
(Output)
IERR
Status
FORTRAN 90 Interface
Generic:
CALL DENSE_LP (A, BL, BU, C, IRTYPE, OBJ, XSOL, DSOL [,…])
Specific:
Description
The routine DENSE_LP solves the linear programming problem
$$\min_{x \in R^n} c^T x$$
subject to
$$b_l \le Ax \le b_u$$
$$x_l \le x \le x_u$$
where c is the objective coefficient vector, A is the coefficient matrix, and the vectors bl, bu, xl and
xu are the lower and upper bounds on the constraints and the variables, respectively.
DENSE_LP uses an active set strategy.
Refer to the following paper for further information: Krogh, Fred T. (2005), An Algorithm for
Linear Programming, http://mathalacarte.com/fkrogh/pub/lp.pdf, Tujunga, CA.
Comments
1.
Informational errors
Type Code
1    1    Multiple solutions giving essentially the same solution exist.
3    1    Some constraints were discarded because they were too linearly
          dependent on other active constraints.
3    2    All constraints are not satisfied.
3    3    The algorithm appears to be cycling.
4    1    The problem appears vacuous.
4    2    The problem is unbounded.
4    3    An acceptable pivot could not be found.
4    4    The constraint bounds are inconsistent.
4    5    The variable bounds are inconsistent.
Example 1
The linear programming problem in the standard form
$$\min f(x) = -x_1 - 3x_2$$
subject to
$$\begin{aligned}
x_1 + x_2 + x_3 &= 1.5 \\
x_1 + x_2 - x_4 &= 0.5 \\
x_1 + x_5 &= 1.0 \\
x_2 + x_6 &= 1.0 \\
x_i &\ge 0, \quad \text{for } i = 1, \ldots, 6
\end{aligned}$$
is solved.
USE UMACH_INT
USE WRRRN_INT
USE DENSE_LP_INT
IMPLICIT NONE
INTEGER NOUT, M, NVAR
PARAMETER (M=4, NVAR=6)
DOUBLE PRECISION A(M, NVAR), B(M), C(NVAR), XSOL(NVAR), &
DSOL(M), BL(M), BU(M), OBJ
INTEGER IRTYPE(M)
DATA A/1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, -1, &
0, 0, 0, 0, 1, 0, 0, 0, 0, 1/
DATA
DATA
DATA
DATA
DATA
Output
Objective
-3.5000

                Solution
    1       2       3       4       5       6
0.500   1.000   0.000   1.000   0.500   0.000
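The printed solution can be checked independently of the solver. The sketch below is not part of the example program: the matrix is copied from the DATA A statement (column-major), while the right-hand side b = (1.5, 0.5, 1.0, 1.0) and the objective c = (-1, -3, 0, 0, 0, 0) are assumptions read off the problem statement, and the names lp_example_check, objective and max_violation are hypothetical.

```fortran
module lp_example_check
  implicit none
  integer, parameter :: dp = kind(1d0)
  ! Constraint matrix in column-major order, as in the DATA A statement.
  real(dp), parameter :: a(4,6) = reshape(real([ &
       1, 1, 1, 0,   1, 1, 0, 1,   1, 0, 0, 0, &
       0,-1, 0, 0,   0, 0, 1, 0,   0, 0, 0, 1], dp), [4, 6])
  ! Assumed right-hand sides and objective coefficients (see lead-in).
  real(dp), parameter :: b(4) = [1.5_dp, 0.5_dp, 1.0_dp, 1.0_dp]
  real(dp), parameter :: c(6) = [-1.0_dp, -3.0_dp, 0.0_dp, 0.0_dp, &
                                 0.0_dp, 0.0_dp]
contains
  ! Objective value c^T x.
  pure function objective(x) result(val)
    real(dp), intent(in) :: x(6)
    real(dp) :: val
    val = dot_product(c, x)
  end function objective

  ! Largest absolute violation of the equality constraints A x = b.
  pure function max_violation(x) result(val)
    real(dp), intent(in) :: x(6)
    real(dp) :: val
    val = maxval(abs(matmul(a, x) - b))
  end function max_violation
end module lp_example_check
```

Calling both functions with the solution vector from the output above is a quick way to confirm that the equality constraints hold and that the objective matches the printed value.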
Additional Examples
Example 2
This example demonstrates how READ_MPS can be used together with DENSE_LP to solve a linear
programming problem defined in an MPS file. The MPS file used in this example is an
uncompressed version of the file afiro, available from http://www.netlib.org/lp/data/.
USE UMACH_INT
USE WRRRN_INT
USE READ_MPS_INT
USE DENSE_LP_INT
IMPLICIT NONE
REAL(KIND(1D0)) OBJ
REAL(KIND(1D0)), ALLOCATABLE :: XSOL(:)
REAL(KIND(1D0)), ALLOCATABLE :: DSOL(:)
REAL(KIND(1D0)), ALLOCATABLE :: A(:,:)
INTEGER, ALLOCATABLE :: IRTYPE(:)
TYPE(D_MPS) PROBLEM
CHARACTER NAME*256
INTEGER I,J, K, NOUT
CALL UMACH(2, NOUT)
!
A = 0
IRTYPE = 3
! FILL DENSE A
DO K = 1, PROBLEM%NONZEROS
I = PROBLEM%CONSTRAINT(K)%ROW
J = PROBLEM%CONSTRAINT(K)%COLUMN
A(I,J) = PROBLEM%CONSTRAINT(K)%VALUE
ENDDO
! CALL THE LP SOLVER
CALL DENSE_LP (A, PROBLEM%LOWER_RANGE, PROBLEM%UPPER_RANGE, &
PROBLEM%OBJECTIVE, IRTYPE, OBJ, XSOL, DSOL, &
XLB=PROBLEM%LOWER_BOUND, XUB=PROBLEM%UPPER_BOUND)
WRITE(NOUT, 99999) OBJ
CALL WRRRN('Solution', XSOL, 1, PROBLEM%NROWS, 1)
DEALLOCATE(A)
DEALLOCATE(IRTYPE)
DEALLOCATE(XSOL)
DEALLOCATE(DSOL)
99999 FORMAT('Objective: ', E16.7)
END
Output
Objective: -0.4647531E+03

                                Solution
    1      2      3      4      5      6      7      8      9     10
 80.0   25.5   54.5   84.8   57.9    0.0    0.0    0.0    0.0    0.0

   11     12     13     14     15     16     17     18     19     20
  0.0    0.0   18.2   39.7   61.3  500.0  475.9   24.1    0.0  215.0

   21     22     23     24     25     26     27
363.9    0.0    0.0    0.0    0.0    0.0    0.0
DLPRS
Solves a linear programming problem via the revised simplex algorithm.
Required Arguments
A M by NVAR matrix containing the coefficients of the M constraints. (Input)
BL Vector of length M containing the lower limit of the general constraints; if there is no
lower limit on the I-th constraint, then BL(I) is not referenced. (Input)
BU Vector of length M containing the upper limit of the general constraints; if there is no
upper limit on the I-th constraint, then BU(I) is not referenced; if there are no range
constraints, BL and BU can share the same storage locations. (Input)
C Vector of length NVAR containing the coefficients of the objective function. (Input)
IRTYPE Vector of length M indicating the types of general constraints in the matrix A.
(Input)
Let R(I) = A(I, 1) * XSOL(1) + ... + A(I, NVAR) * XSOL(NVAR). Then, the value of
IRTYPE(I) signifies the following:
IRTYPE(I)   I-th Constraint
0           BL(I).EQ.R(I).EQ.BU(I)
1           R(I).LE.BU(I)
2           R(I).GE.BL(I)
3           BL(I).LE.R(I).LE.BU(I)
Optional Arguments
M Number of constraints. (Input)
Default: M = SIZE (A,1).
NVAR Number of variables. (Input)
Default: NVAR = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
LDA must be at least M.
Default: LDA = SIZE (A,1).
XLB Vector of length NVAR containing the lower bound on the variables; if there is no
lower bound on a variable, then 1.0E30 should be set as the lower bound. (Input)
Default: XLB = 0.0.
XUB Vector of length NVAR containing the upper bound on the variables; if there is no
upper bound on a variable, then -1.0E30 should be set as the upper bound. (Input)
Default: XUB = 3.4e38 for single precision and 1.79d+308 for double precision.
FORTRAN 90 Interface
Generic:
CALL DLPRS (A, BL, BU, C, IRTYPE, OBJ, XSOL, DSOL [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL DLPRS (M, NVAR, A, LDA, BL, BU, C, IRTYPE, XLB, XUB,
OBJ, XSOL, DSOL)
Double:
Description
The routine DLPRS uses a revised simplex method to solve linear programming problems, i.e.,
problems of the form
$$\min_{x \in R^n} c^T x$$
subject to
$$b_l \le Ax \le b_u$$
$$x_l \le x \le x_u$$
where c is the objective coefficient vector, A is the coefficient matrix, and the vectors bl, bu, xl and
xu are the lower and upper bounds on the constraints and the variables, respectively.
For a complete description of the revised simplex method, see Murtagh (1981) or Murty (1983).
Comments
1.
2.
Informational errors
Type
3
4
Code
1 The problem is unbounded.
2 Maximum number of iterations exceeded.
3
4
3
4
Example
A linear programming problem is solved.
USE DLPRS_INT
USE UMACH_INT
USE SSCAL_INT
IMPLICIT NONE
INTEGER LDA, M, NVAR
PARAMETER (M=2, NVAR=2, LDA=M)
! M = number of constraints
! NVAR = number of variables
INTEGER I, IRTYPE(M), NOUT
REAL A(LDA,NVAR), B(M), C(NVAR), DSOL(M), OBJ, XLB(NVAR), &
XSOL(NVAR), XUB(NVAR)
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
99999 FORMAT (//, '
Objective
Primal ',&
Dual solution
=', 2F9.4)
!
END
Output
Objective       =   3.5000
Primal Solution =   0.5000   1.0000
Dual solution   =   1.0000   0.0000
SLPRS
Solves a sparse linear programming problem via the revised simplex algorithm.
Required Arguments
A Vector of length NZ containing the coefficients of the M constraints. (Input)
IROW Vector of length NZ containing the row numbers of the corresponding element in A.
(Input)
JCOL Vector of length NZ containing the column numbers of the corresponding elements
in A. (Input)
BL Vector of length M containing the lower limit of the general constraints; if there is no
lower limit on the I-th constraint, then BL(I) is not referenced. (Input)
BU Vector of length M containing the upper limit of the general constraints; if there
is no upper limit on the I-th constraint, then BU(I) is not referenced. (Input)
C Vector of length NVAR containing the coefficients of the objective function. (Input)
IRTYPE Vector of length M indicating the types of general constraints in the matrix A.
(Input)
Let R(I) = A(I, 1)*XSOL(1) + ... + A(I, NVAR)*XSOL(NVAR). Then the value of
IRTYPE(I) signifies the following:
IRTYPE(I)   I-th Constraint
0           BL(I) = R(I) = BU(I)
1           R(I) ≤ BU(I)
2           R(I) ≥ BL(I)
3           BL(I) ≤ R(I) ≤ BU(I)
Optional Arguments
M Number of constraints. (Input)
Default: M = SIZE (IRTYPE,1).
NVAR Number of variables. (Input)
Default: NVAR = SIZE (C,1).
NZ Number of nonzero coefficients in the matrix A. (Input)
Default: NZ = SIZE (A,1).
XLB Vector of length NVAR containing the lower bound on the variables; if there is no
lower bound on a variable, then 1.0E30 should be set as the lower bound. (Input)
Default: XLB = 0.0.
XUB Vector of length NVAR containing the upper bound on the variables; if there is no
upper bound on a variable, then -1.0E30 should be set as the upper bound. (Input)
Default: XUB = 3.4e38 for single precision and 1.79d+308 for double precision.
FORTRAN 90 Interface
Generic:
CALL SLPRS (A, IROW, JCOL, BL, BU, C, IRTYPE, OBJ, XSOL,
DSOL [,…])
Specific:
FORTRAN 77 Interface
Single:
CALL SLPRS (M, NVAR, NZ, A, IROW, JCOL, BL, BU, C, IRTYPE,
XLB, XUB, OBJ, XSOL, DSOL)
Double:
Description
This subroutine solves problems of the form
$$\min c^T x$$
subject to
$$b_l \le Ax \le b_u$$
$$x_l \le x \le x_u$$
where c is the objective coefficient vector, A is the coefficient matrix, and the vectors bl, bu, xl, and
xu are the lower and upper bounds on the constraints and the variables, respectively. SLPRS is
designed to take advantage of sparsity in A. The routine is based on DPLO by Hanson and Hiebert.
Comments
Workspace may be explicitly provided, if desired, by use of S2PRS/DS2PRS. The
reference is:
CALL S2PRS (M, NVAR, NZ, A, IROW, JCOL, BL, BU, C, IRTYPE, XLB, XUB, OBJ, XSOL,
DSOL, IPARAM, RPARAM, COLSCL, ROWSCL, WORK, LW, IWORK, LIW)
Note that the call to S5PRS will set IPARAM and RPARAM to their default values
so only nondefault values need to be set above.
IPARAM(1) = 0 indicates that a minimization problem is solved. If set to 1, a
If set to zero, the routine uses the steepest edge pricing strategy which is
the best local move. If set to one, the minimum reduced cost pricing
strategy is used. The steepest edge pricing strategy generally uses fewer
iterations than the minimum reduced cost pricing, but each iteration costs
more in terms of the amount of calculation performed. However, this is
very problem-dependent.
Default: IPARAM(3) = 0
IPARAM(4) = MXITBR, the number of iterations between recalculating the error
in the primal solution is used to monitor the error in solving the linear
system. This is an expensive calculation and every tenth iteration is
generally enough.
Default: IPARAM(4) = 10
IPARAM(5) = NPP, the number of negative reduced costs (at most) to be found at
NPP = NVARS will be used, implying that all of the reduced costs are
computed at each such step. This partial pricing may increase the total
number of iterations required. However, it decreases the number of
calculations required at each iteration. The effect on overall efficiency is
very problem-dependent. If set to some positive number, that value is
used as NPP.
Default: IPARAM(5) = 0
IPARAM(6) = IREDFQ, the number of steps between basis matrix
IPARAM(7) = LAMAT, the length of the portion of WORK that is allocated to sparse
matrix storage and decomposition. LAMAT must be greater than NZ +
NVARS + 4.
IPARAM(10) = switch indicating that partial results have been computed and
stored on unit number IPARAM(10), if greater than zero. If IPARAM(10) is
IPARAM(12) = switch indicating that the user supplied scale factors for the rows
of the matrix A. If IPARAM(12) is set to zero, no row scaling is done. If
IPARAM(12) is set to 1, element I of the vector ROWSCL is used as the
scale factor for row I of the matrix A. The scaling is implicit, so no input
Default: IPARAM(12) = 0
RPARAM Real parameter vector of length 7.
RPARAM(1) = COSTSC, a scale factor for the vector of costs. Normally
SLPRS computes this scale factor to be the reciprocal of the max norm of
the vector of costs after the column scaling has been applied. If RPARAM(1)
is zero, SLPRS computes COSTSC.
range [0.01, 0.1], particularly on machines with short word length and
working precision when solving a large problem. If RPARAM(5) is
nonzero, that value is used as PHI, otherwise the default value is used.
COLSCL Array of length NVARS containing column scale factors for the matrix A.
(Input).
COLSCL is not used if IPARAM(11) is set to zero.
ROWSCL Array of length M containing row scale factors for the matrix A. (Input)
ROWSCL is not used if IPARAM(12) is set to zero.
WORK Work array of length LW.
LW Length of real work array. LW must be at least
2 + 2NZ + 9NVAR + 27M + MAX(NZ + NVAR + 8, 4NVAR + 7).
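The workspace bound above is easy to mis-evaluate by hand. The following is a minimal sketch in standard Fortran that evaluates it; the helper name MINLW is hypothetical (it is not an IMSL routine), and NVAR is assumed to mean the same quantity that the text elsewhere calls NVARS. The sizes are those of the SLPRS example, where the loop over J = 2, ..., M stores two entries per pass.

```fortran
program lw_bound
   implicit none
   integer :: nz, nvar, m

   ! Sizes taken from the SLPRS example: M = NVAR = 200, and the
   ! loop over J = 2, ..., M stores 2*(M-1) = 398 nonzeros.
   m = 200
   nvar = 200
   nz = 2*(m - 1)
   print '(a, i0)', 'Minimum LW = ', minlw(nz, nvar, m)

contains

   ! Hypothetical helper: smallest LW allowed by the documented bound
   ! 2 + 2NZ + 9NVAR + 27M + MAX(NZ + NVAR + 8, 4NVAR + 7).
   pure integer function minlw(nz, nvar, m)
      integer, intent(in) :: nz, nvar, m
      minlw = 2 + 2*nz + 9*nvar + 27*m + max(nz + nvar + 8, 4*nvar + 7)
   end function minlw

end program lw_bound
```

Declaring WORK with a length computed this way, rather than a hard-coded constant, keeps the array consistent when M, NVAR or NZ change.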
Example
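Solve a linear programming problem, with

$$
A = \begin{bmatrix}
0 & 0.5 &        &        &     \\
  & 1   & 0.5    &        &     \\
  &     & \ddots & \ddots &     \\
  &     &        & 1      & 0.5 \\
  &     &        &        & 1
\end{bmatrix}
$$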
      IMPLICIT   NONE
      INTEGER    M, NVAR
      PARAMETER  (M=200, NVAR=200)
!
CALL UMACH (2, NOUT)
!
!                                 Define A
      INDEX = 1
      DO 10 J=2, M
!                                 Superdiagonal element
         IROW(INDEX) = J - 1
         JCOL(INDEX) = J
         A(INDEX)    = 0.5
!                                 Diagonal element
         IROW(INDEX+1) = J
         JCOL(INDEX+1) = J
         A(INDEX+1)    = 1.0
         INDEX         = INDEX + 2
10 CONTINUE
NZ = INDEX - 1
!
!
XL(4) = 0.2
CALL SLPRS (A, IROW, JCOL, B, B, C, IRTYPE, OBJ, XSOL, DSOL, &
NZ=NZ, XLB=XL, XUB=XU)
!
WRITE (NOUT,99999) OBJ
!
99999 FORMAT (/, 'The value of the objective function is ', E12.6)
!
END
Output
The value of the objective function is -.280971E+03
QPROG
Solves a quadratic programming problem subject to linear equality/inequality constraints.
Required Arguments
NEQ The number of linear equality constraints. (Input)
A NCON by NVAR matrix. (Input)
The matrix contains the equality constraints in the first NEQ rows followed by the
inequality constraints.
B Vector of length NCON containing right-hand sides of the linear constraints. (Input)
G Vector of length NVAR containing the coefficients of the linear term of the objective
function. (Input)
H NVAR by NVAR matrix containing the Hessian matrix of the objective function. (Input)
H should be symmetric positive definite; if H is not positive definite, the algorithm
attempts to solve the QP problem with H replaced by H + DIAGNL * I, where DIAGNL is chosen
so that H + DIAGNL * I is positive definite. See Comment 3.
Optional Arguments
NVAR The number of variables. (Input)
Default: NVAR = SIZE (A,2).
NCON The number of linear constraints. (Input)
Default: NCON = SIZE (A,1).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDH Leading dimension of H exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDH = SIZE (H,1).
DIAGNL Scalar equal to the multiple of the identity matrix added to H to give a positive
definite matrix. (Output)
NACT Final number of active constraints. (Output)
IACT Vector of length NVAR containing the indices of the final active constraints in the
first NACT positions. (Output)
ALAMDA Vector of length NVAR containing the Lagrange multiplier estimates of the final
active constraints in the first NACT positions. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine QPROG is based on M.J.D. Powell's implementation of the Goldfarb and Idnani (1983)
dual quadratic programming (QP) algorithm for convex QP problems subject to general linear
equality/inequality constraints, i.e., problems of the form
$$
\min_{x \in \mathbb{R}^n} \; g^T x + \frac{1}{2} x^T H x
$$
subject to
$$
A_1 x = b_1, \qquad A_2 x \ge b_2
$$
given the vectors b1, b2, and g and the matrices H, A1, and A2. H is required to be positive definite.
In this case, a unique x solves the problem or the constraints are inconsistent. If H is not positive
definite, a positive definite perturbation of H is used in place of H. For more details, see Powell
(1983, 1985).
Comments
1.
2. Informational errors
   Type  Code
   3     1     Due to the effect of computer rounding error, a change in the
               variables fails to improve the objective function value; usually the
               solution is close to optimum.
   4     2     The system of equations is inconsistent. There is no solution.
3.
Example
The quadratic programming problem
$$
\min f(x) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 - 2 x_2 x_3 - 2 x_4 x_5 - 2 x_1
$$
subject to
$$
x_1 + x_2 + x_3 + x_4 + x_5 = 5
$$
$$
x_3 - 2 x_4 - 2 x_5 = -3
$$
is solved.
      USE QPROG_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDH, NCON, NEQ, NVAR
      PARAMETER  (NCON=2, NEQ=2, NVAR=5, LDA=NCON, LDH=NVAR)
!
      INTEGER    K, NACT, NOUT
      REAL       A(LDA,NVAR), ALAMDA(NVAR), B(NCON), G(NVAR), &
                 H(LDH,LDH), SOL(NVAR)
!
!                                 Set values of A, B, G and H.
!                                 A = ( 1.0   1.0   1.0   1.0   1.0)
!                                     ( 0.0   0.0   1.0  -2.0  -2.0)
!
!                                 B = ( 5.0  -3.0)
!
!                                 G = (-2.0   0.0   0.0   0.0   0.0)
!
!                                 H = ( 2.0   0.0   0.0   0.0   0.0)
!                                     ( 0.0   2.0  -2.0   0.0   0.0)
!                                     ( 0.0  -2.0   2.0   0.0   0.0)
!                                     ( 0.0   0.0   0.0   2.0  -2.0)
!                                     ( 0.0   0.0   0.0  -2.0   2.0)
!
      DATA A/1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, -2.0, 1.0, -2.0/
      DATA B/5.0, -3.0/
      DATA G/-2.0, 4*0.0/
      DATA H/2.0, 5*0.0, 2.0, -2.0, 3*0.0, -2.0, 2.0, 5*0.0, 2.0, &
           -2.0, 3*0.0, -2.0, 2.0/
!
CALL QPROG (NEQ, A, B, G, H, SOL)
!
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) (SOL(K),K=1,NVAR)
99999 FORMAT (' The solution vector is', /, '   SOL = (', 5F6.1, &
             ' )')
!
END
Output
The solution vector is
SOL = (   1.0   1.0   1.0   1.0   1.0 )
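The reported solution can be checked without calling the library. The sketch below is plain Fortran, not part of the IMSL example: it rebuilds A, B, G and H from the DATA statements above and evaluates the objective g^T x + (1/2) x^T H x, the equality residuals A x - b, and the objective gradient g + H x at x = (1, 1, 1, 1, 1). The program and variable names are illustrative only.

```fortran
program check_qprog_example
   implicit none
   integer, parameter :: ncon = 2, nvar = 5
   real :: a(ncon,nvar), b(ncon), g(nvar), h(nvar,nvar), x(nvar)
   real :: obj, resid(ncon), grad(nvar)

   ! Same column-major data as the DATA statements in the example.
   a = reshape([1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, -2.0, 1.0, -2.0], &
               [ncon, nvar])
   b = [5.0, -3.0]
   g = [-2.0, 0.0, 0.0, 0.0, 0.0]
   h = reshape([2.0,  0.0,  0.0,  0.0,  0.0, &
                0.0,  2.0, -2.0,  0.0,  0.0, &
                0.0, -2.0,  2.0,  0.0,  0.0, &
                0.0,  0.0,  0.0,  2.0, -2.0, &
                0.0,  0.0,  0.0, -2.0,  2.0], [nvar, nvar])
   x = 1.0                               ! reported solution

   obj   = dot_product(g, x) + 0.5*dot_product(x, matmul(h, x))
   resid = matmul(a, x) - b              ! equality residuals A*x - b
   grad  = g + matmul(h, x)              ! gradient of the objective

   print '(a, f8.3)',  'Objective value     : ', obj
   print '(a, 2f8.3)', 'Constraint residual : ', resid
   print '(a, 5f8.3)', 'Objective gradient  : ', grad
end program check_qprog_example
```

Both residuals and every gradient component evaluate to zero at this point, so the unconstrained stationary point of the objective happens to satisfy the two equality constraints.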
LCONF
Minimizes a general objective function subject to linear equality/inequality constraints.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Value of NVAR. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
Optional Arguments
NVAR The number of variables. (Input)
Default: NVAR = SIZE (A,2).
NCON The number of linear constraints (excluding simple bounds). (Input)
Default: NCON = SIZE (A,1).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
XGUESS Vector of length NVAR containing the initial guess of the minimum. (Input)
Default: XGUESS = 0.0.
ACC The nonnegative tolerance on the first order conditions at the calculated solution.
(Input)
Default: ACC = 1.e-4 for single precision and 1.d-8 for double precision.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine LCONF is based on M.J.D. Powell's TOLMIN, which solves linearly constrained
optimization problems, i.e., problems of the form
$$
\min_{x \in \mathbb{R}^n} f(x)
$$
subject to
$$
A_1 x = b_1, \qquad A_2 x \le b_2, \qquad x_l \le x \le x_u
$$
given the vectors b1, b2, xl and xu and the matrices A1 and A2.
The algorithm starts by checking the equality constraints for inconsistency and redundancy. If the
equality constraints are consistent, the method will revise x0, the initial guess provided by the user,
to satisfy
A1x = b1
Next, x0 is adjusted to satisfy the simple bounds and inequality constraints. This is done by solving
a sequence of quadratic programming subproblems to minimize the sum of the constraint or bound
violations.
Now, for each iteration with a feasible $x^k$, let $J_k$ be the set of indices of inequality constraints that
have small residuals. Here, the simple bounds are treated as inequality constraints. Let $I_k$ be the set
of indices of active constraints. The following quadratic programming problem
$$
\min \; f(x^k) + d^T \nabla f(x^k) + \frac{1}{2} d^T B^k d
$$
subject to
$$
a_j d = 0, \quad j \in I_k
$$
$$
a_j d \le 0, \quad j \in J_k
$$
is solved for the search direction $d^k$. A line search along $d^k$ then selects the step length $\alpha^k$; among the conditions the step must satisfy is
$$
(d^k)^T \nabla f(x^k + \alpha^k d^k) \ge 0.7 \, (d^k)^T \nabla f(x^k)
$$
The main idea in forming the set $J_k$ is that, if any of the inequality constraints restricts the step length $\alpha^k$, then its index is not in $J_k$. Therefore, small steps are likely to be avoided.
Finally, the second derivative approximation, $B^k$, is updated by the BFGS formula if the
condition
$$
(d^k)^T \left[ \nabla f(x^k + \alpha^k d^k) - \nabla f(x^k) \right] > 0
$$
holds. The iteration is repeated until the first-order stopping criterion is satisfied to within $\tau$; here, $\tau$ is a user-supplied tolerance. For more details, see Powell (1988, 1989).
Since a finite-difference method is used to estimate the gradient for some single precision
calculations, an inaccurate estimate of the gradient may cause the algorithm to terminate at a
noncritical point. In such cases, high precision arithmetic is recommended. Also, whenever the
exact gradient can be easily provided, routine LCONG should be used instead.
Comments
1.
2. Informational errors
   Type  Code
   4     4     The equality constraints are inconsistent.
   4     5     The equality constraints and the bounds on the variables are found to
               be inconsistent.
   4     6     No vector X satisfies all of the constraints. In particular, the current
               active constraints prevent any change in X that reduces the sum of
               constraint violations.
   4     7     Maximum number of function evaluations exceeded.
   4     9     The variables are determined by the equality constraints.
3.
IPRINT This argument must be set by the user to specify the frequency of printing during
the execution of the routine LCONF. There is no printed output if IPRINT = 0.
Otherwise, after ensuring feasibility, information is given every IABS(IPRINT)
iterations and whenever a parameter called TOL is reduced. The printing provides the
values of X(.), F(.) and G(.) = GRAD(F) if IPRINT is positive. If IPRINT is
negative, this information is augmented by the current values of IACT(K) K = 1, ...,
NACT, PAR(K) K = 1, ..., NACT and RESKT(I) I = 1, ..., N. The reason for returning to
the calling program is also displayed when IPRINT is nonzero.
INFO On exit from L2ONF, INFO will have one of the following integer values to indicate
the reason for leaving the routine:
INFO = 1 SOL is feasible, and the condition that depends on ACC is satisfied.
INFO = 2 SOL is feasible, and rounding errors are preventing further progress.
INFO = 3 SOL is feasible, but the objective function fails to decrease although a
INFO = 4 In this case, the calculation cannot begin because LDA is less than NCON or
because the lower bound on a variable is greater than the upper bound.
INFO = 5 This value indicates that the equality constraints are inconsistent. These
constraints include any components of X(.) that are frozen by setting
XL(I) = XU(I).
INFO = 6 In this case there is an error return because the equality constraints and the
INFO = 7 This value indicates that there is no vector of variables that satisfies all of
the constraints. Specifically, when this return or an INFO = 6 return occurs, the
current active constraints (whose indices are IACT(K), K = 1, ..., NACT) prevent
any change in X(.) that reduces the sum of constraint violations. Bounds are
only included in this sum if INFO = 6.
INFO = 8 Maximum number of function evaluations exceeded.
INFO = 9 The variables are determined by the equality constraints.
Example
The problem from Schittkowski (1987)
$$
\min f(x) = -x_1 x_2 x_3
$$
subject to
$$
-x_1 - 2x_2 - 2x_3 \le 0
$$
$$
x_1 + 2x_2 + 2x_3 \le 72
$$
$$
0 \le x_1 \le 20, \qquad 0 \le x_2 \le 11, \qquad 0 \le x_3 \le 42
$$
is solved with an initial guess x1 = 10, x2 = 10 and x3 = 10.
      USE LCONF_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declaration of variables
      INTEGER    NCON, NEQ, NVAR
      PARAMETER  (NCON=2, NEQ=0, NVAR=3)
!
      INTEGER    MAXFCN, NOUT
      REAL       A(NCON,NVAR), ACC, B(NCON), OBJ, &
                 SOL(NVAR), XGUESS(NVAR), XLB(NVAR), XUB(NVAR)
      EXTERNAL   FCN
!
!                                 Min  -X(1)*X(2)*X(3)
!
!                                 -X(1) - 2*X(2) - 2*X(3)  .LE.   0
!                                  X(1) + 2*X(2) + 2*X(3)  .LE.  72
!                                  0  .LE.  X(1)  .LE.  20
!                                  0  .LE.  X(2)  .LE.  11
!                                  0  .LE.  X(3)  .LE.  42
!
WRITE (NOUT,99998) 'Solution:'
WRITE (NOUT,99999) SOL
WRITE (NOUT,99998) 'Function value at solution:'
WRITE (NOUT,99999) OBJ
WRITE (NOUT,99998) 'Number of function evaluations:', MAXFCN
STOP
99998 FORMAT (//, ' ', A, I4)
99999 FORMAT (1X, 5F16.6)
END
!
      SUBROUTINE FCN (N, X, F)
      INTEGER    N
      REAL       X(*), F
!
      F = -X(1)*X(2)*X(3)
      RETURN
      END
Output
Solution:
        20.000000       11.000000       15.000000
LCONG
Minimizes a general objective function subject to linear equality/inequality constraints.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Value of NVAR. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by FCN.
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Value of NVAR. (Input)
X Vector of length N at which point the function is evaluated. (Input)
X should not be changed by GRAD.
G Vector of length N containing the values of the gradient of the objective function
Optional Arguments
NVAR The number of variables. (Input)
Default: NVAR = SIZE (A,2).
NCON The number of linear constraints (excluding simple bounds). (Input)
Default: NCON = SIZE (A,1).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
XGUESS Vector of length NVAR containing the initial guess of the minimum. (Input)
Default: XGUESS = 0.0.
ACC The nonnegative tolerance on the first order conditions at the calculated solution.
(Input)
Default: ACC = 1.e-4 for single precision and 1.d-8 for double precision.
MAXFCN On input, maximum number of function evaluations allowed. (Input/Output)
On output, actual number of function evaluations needed.
Default: MAXFCN = 400.
OBJ Value of the objective function. (Output)
NACT Final number of active constraints. (Output)
IACT Vector containing the indices of the final active constraints in the first NACT
positions. (Output)
Its length must be at least NCON + 2 * NVAR.
ALAMDA Vector of length NVAR containing the Lagrange multiplier estimates of the final
active constraints in the first NACT positions. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine LCONG is based on M.J.D. Powell's TOLMIN, which solves linearly constrained
optimization problems, i.e., problems of the form
$$
\min_{x \in \mathbb{R}^n} f(x)
$$
subject to
$$
A_1 x = b_1, \qquad A_2 x \le b_2, \qquad x_l \le x \le x_u
$$
given the vectors b1, b2, xl and xu and the matrices A1 and A2.
The algorithm starts by checking the equality constraints for inconsistency and redundancy. If the
equality constraints are consistent, the method will revise x0, the initial guess provided by the user,
to satisfy
A1x = b1
Next, x0 is adjusted to satisfy the simple bounds and inequality constraints. This is done by solving
a sequence of quadratic programming subproblems to minimize the sum of the constraint or bound
violations.
Now, for each iteration with a feasible $x^k$, let $J_k$ be the set of indices of inequality constraints that
have small residuals. Here, the simple bounds are treated as inequality constraints. Let $I_k$ be the set
of indices of active constraints. The following quadratic programming problem
$$
\min \; f(x^k) + d^T \nabla f(x^k) + \frac{1}{2} d^T B^k d
$$
subject to
$$
a_j d = 0, \quad j \in I_k
$$
$$
a_j d \le 0, \quad j \in J_k
$$
is solved for the search direction $d^k$. A line search along $d^k$ then selects the step length $\alpha^k$; among the conditions the step must satisfy is
$$
(d^k)^T \nabla f(x^k + \alpha^k d^k) \ge 0.7 \, (d^k)^T \nabla f(x^k)
$$
The main idea in forming the set $J_k$ is that, if any of the inequality constraints restricts the step length $\alpha^k$, then its index is not in $J_k$. Therefore, small steps are likely to be avoided.
Finally, the second derivative approximation, $B^k$, is updated by the BFGS formula if the condition
$$
(d^k)^T \left[ \nabla f(x^k + \alpha^k d^k) - \nabla f(x^k) \right] > 0
$$
holds. The iteration is repeated until the first-order stopping criterion is satisfied to within $\tau$; here, $\tau$ is a user-supplied tolerance. For more details, see Powell (1988, 1989).
Comments
1.
2. Informational errors
   Type  Code
   4     4     The equality constraints are inconsistent.
   4     5     The equality constraints and the bounds on the variables are found to
               be inconsistent.
   4     6     No vector X satisfies all of the constraints. In particular, the current
               active constraints prevent any change in X that reduces the sum of
               constraint violations.
   4     7     Maximum number of function evaluations exceeded.
   4     9     The variables are determined by the equality constraints.
3.
INFO On exit from L2ONG, INFO will have one of the following integer
values to indicate the reason for leaving the routine:
INFO = 1 SOL is feasible and the condition that depends on ACC is satisfied.
INFO = 2 SOL is feasible and rounding errors are preventing further progress.
INFO = 3 SOL is feasible but the objective function fails to decrease although
INFO = 4 In this case, the calculation cannot begin because LDA is less than
NCON or because the lower bound on a variable is greater than the
upper bound.
INFO = 5 This value indicates that the equality constraints are inconsistent.
These constraints include any components of X(.) that are frozen
by setting XL(I) = XU(I).
INFO = 6 In this case, there is an error return because the equality constraints
Example
The problem from Schittkowski (1987)
$$
\min f(x) = -x_1 x_2 x_3
$$
subject to
$$
-x_1 - 2x_2 - 2x_3 \le 0
$$
$$
x_1 + 2x_2 + 2x_3 \le 72
$$
$$
0 \le x_1 \le 20, \qquad 0 \le x_2 \le 11, \qquad 0 \le x_3 \le 42
$$
is solved with an initial guess x1 = 10, x2 = 10 and x3 = 10.
      USE LCONG_INT
      USE UMACH_INT
!
      IMPLICIT   NONE
!                                 Declaration of variables
      INTEGER    NCON, NEQ, NVAR
      PARAMETER  (NCON=2, NEQ=0, NVAR=3)
!
      INTEGER    MAXFCN, NOUT
      REAL       A(NCON,NVAR), ACC, B(NCON), OBJ, &
                 SOL(NVAR), XGUESS(NVAR), XLB(NVAR), XUB(NVAR)
      EXTERNAL   FCN, GRAD
!
!                                 Min  -X(1)*X(2)*X(3)
!
!                                 -X(1) - 2*X(2) - 2*X(3)  .LE.   0
!                                  X(1) + 2*X(2) + 2*X(3)  .LE.  72
!                                  0  .LE.  X(1)  .LE.  20
!                                  0  .LE.  X(2)  .LE.  11
!                                  0  .LE.  X(3)  .LE.  42
!
CALL LCONG (FCN, GRAD, NEQ, A, B, XLB, XUB, SOL, XGUESS=XGUESS, &
ACC=ACC, MAXFCN=MAXFCN, OBJ=OBJ)
!
WRITE (NOUT,99998) 'Solution:'
WRITE (NOUT,99999) SOL
WRITE (NOUT,99998) 'Function value at solution:'
WRITE (NOUT,99999) OBJ
WRITE (NOUT,99998) 'Number of function evaluations:', MAXFCN
STOP
99998 FORMAT (//, ' ', A, I4)
99999 FORMAT (1X, 5F16.6)
END
!
      SUBROUTINE FCN (N, X, F)
      INTEGER    N
      REAL       X(*), F
!
      F = -X(1)*X(2)*X(3)
      RETURN
      END
!
      SUBROUTINE GRAD (N, X, G)
      INTEGER    N
      REAL       X(*), G(*)
!
      G(1) = -X(2)*X(3)
      G(2) = -X(1)*X(3)
      G(3) = -X(1)*X(2)
      RETURN
      END
Output
Solution:
        20.000000       11.000000       15.000000
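Because LCONG trusts the user-supplied GRAD, it can be worth comparing it against finite differences before running the optimizer. The following is a minimal sketch in plain Fortran, not part of the IMSL example: FCN and GRAD are copied from the example as internal procedures, and the step size 1.0E-2 is an assumed value that suits single precision for this particular function.

```fortran
program check_gradient
   implicit none
   integer, parameter :: n = 3
   real :: x(n), g(n), gfd(n), xp(n), xm(n), fp, fm, h
   integer :: i

   x = [10.0, 10.0, 10.0]                ! initial guess used in the example
   h = 1.0e-2                            ! assumed step for single precision
   call grad(n, x, g)

   do i = 1, n
      xp = x
      xm = x
      xp(i) = x(i) + h
      xm(i) = x(i) - h
      call fcn(n, xp, fp)
      call fcn(n, xm, fm)
      gfd(i) = (fp - fm) / (2.0*h)       ! central difference
   end do

   print '(a, 3f12.4)', 'Analytic gradient   : ', g
   print '(a, 3f12.4)', 'Central differences : ', gfd

contains

   ! Objective from the LCONG example.
   subroutine fcn(n, x, f)
      integer, intent(in) :: n
      real, intent(in)    :: x(*)
      real, intent(out)   :: f
      f = -x(1)*x(2)*x(3)
   end subroutine fcn

   ! Gradient from the LCONG example.
   subroutine grad(n, x, g)
      integer, intent(in) :: n
      real, intent(in)    :: x(*)
      real, intent(out)   :: g(*)
      g(1) = -x(2)*x(3)
      g(2) = -x(1)*x(3)
      g(3) = -x(1)*x(2)
   end subroutine grad

end program check_gradient
```

The two printed rows should agree to within rounding; a large discrepancy in one component usually points to a sign or index error in the corresponding line of GRAD.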
NNLPF
Solves a general nonlinear programming problem using a sequential equality constrained quadratic
programming method.
Required Arguments
FCN User-supplied subroutine to evaluate the objective function and constraints at a given
point. The internal usage is CALL FCN (X, IACT, RESULT, IERR),
where
X The point at which the objective function or constraint is evaluated. (Input)
IACT Integer indicating whether evaluation of the objective function is requested or
evaluation of a constraint is requested. If IACT is zero, then an objective
function evaluation is requested. If IACT is nonzero then the value of IACT
indicates the index of the constraint to evaluate. (Input)
RESULT If IACT is zero, then RESULT is the computed function value at the point
X. If IACT is nonzero, then RESULT is the computed constraint value at the
point X. (Output)
IERR Logical variable. On input IERR is set to .FALSE. If an error or other
undesirable condition occurs during evaluation, then IERR should be set to
.TRUE. Setting IERR to .TRUE. will result in the step size being reduced and
the step being tried again. (If IERR is set to .TRUE. for XGUESS, then an error is
issued.)
The routine FCN must be use-associated in a user module that uses NNLPF_INT, or else declared
EXTERNAL in the calling program. If FCN is a separately compiled routine, not in a module, then it
must be declared EXTERNAL.
M Total number of constraints. (Input)
IBTYPE  Action
3       User supplies only the bounds on 1st variable; all other variables will have
        the same bounds.
XLB Vector of length N containing the lower bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3)
If there is no lower bound for a variable, then the corresponding XLB value should be
set to -HUGE(X(1)).
XUB Vector of length N containing the upper bounds on variables. (Input, if IBTYPE = 0;
output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3).
If there is no upper bound for a variable, then the corresponding XUB value should be
set to Huge(X(1)).
X Vector of length N containing the computed solution. (Output)
Optional Arguments
N Number of variables. (Input)
Default: N = SIZE(X).
XGUESS Vector of length N containing an initial guess of the solution. (Input)
Default: XGUESS = x (with the smallest value of $\|x\|_2$) that satisfies the bounds.
XSCALE Vector of length N setting the internal scaling of the variables. The initial value
given and the objective function and gradient evaluations however are always in the
original unscaled variables. The first internal variable is obtained by dividing values
X(I) by XSCALE(I). (Input)
In the absence of other information, set all entries to 1.0.
Default: XSCALE(:) = 1.0.
IPRINT Parameter indicating the desired output level. (Input)
IPRINT  Action
0       No output printed.
1       Lines of intermediate results summarizing the most important data for each
        step are printed.
2       Lines of detailed intermediate results showing all primal and dual variables,
        the relevant values from the working set, progress in the backtracking,
        etc. are printed.
3       Lines of detailed intermediate results showing all primal and dual variables,
        the relevant values from the working set, progress in the backtracking, the
        gradients in the working set, the quasi-Newton update, etc. are printed.
Default: IPRINT = 0.
MAXITN Maximum number of iterations allowed. (Input)
Default: MAXITN = 200.
EPSDIF Relative precision in gradients. (Input)
Default: EPSDIF = EPSILON(x(1))
TAU0 A universal bound describing how much the unscaled penalty-term may deviate
from zero. (Input)
NNLPF assumes that within the region described by
$$
\sum_{i=1}^{M_e} |g_i(x)| - \sum_{i=M_e+1}^{M} \min(0, g_i(x)) \le \mathrm{TAU0}
$$
all functions may be evaluated safely. The initial guess, however, may violate these
requirements. In that case an initial feasibility improvement phase is run by NNLPF
until such a point is found. A small TAU0 diminishes the efficiency of NNLPF, because
the iterates then will follow the boundary of the feasible set closely. Conversely, a large
TAU0 may degrade the reliability of the code.
Default: TAU0 = 1.E0
DEL0 In the initial phase of minimization a constraint is considered binding if
$$
\frac{g_i(x)}{\max\{1, \|\nabla g_i(x)\|\}} \le \mathrm{DEL0}, \qquad i = M_e + 1, \ldots, M
$$
Good values are between .01 and 1.0. If DEL0 is chosen too small then identification
of the correct set of binding constraints may be delayed. Conversely, if DEL0 is too large,
then the method will often escape to the full regularized SQP method, using individual
slack variables for any active constraint, which is quite costly. For well-scaled
problems DEL0=1.0 is reasonable. (Input)
Default: DEL0 = .5*TAU0
FORTRAN 90 Interface
Generic:
Specific:
Description
The routine NNLPF provides an interface to a licensed version of subroutine DONLP2, a FORTRAN
code developed by Peter Spellucci (1998). It uses a sequential equality constrained quadratic
programming method with an active set technique, and an alternative usage of a fully regularized
mixed constrained subproblem in case of nonregular constraints (i.e. linearly dependent gradients in
the working sets). It uses a slightly modified version of the Pantoja-Mayne update for the
Hessian of the Lagrangian, variable dual scaling and an improved Armijo-type stepsize algorithm.
Bounds on the variables are treated in a gradient-projection like fashion. Details may be found in
the following two papers:
P. Spellucci: An SQP method for general nonlinear programs using only equality constrained
subproblems. Math. Prog. 82, (1998), 413-448.
P. Spellucci: A new technique for inconsistent problems in the SQP method. Math. Meth. of Oper.
Res. 47, (1998), 355-500. (published by Physica Verlag, Heidelberg, Germany).
The problem is stated as follows:
$$
\min_{x \in \mathbb{R}^n} f(x)
$$
subject to
$$
g_j(x) = 0, \quad \text{for } j = 1, \ldots, m_e
$$
$$
g_j(x) \ge 0, \quad \text{for } j = m_e + 1, \ldots, m
$$
$$
x_l \le x \le x_u
$$
Although default values are provided for optional input arguments, it may be necessary to adjust
these values for some problems. Through the use of optional arguments, NNLPF allows for several
parameters of the algorithm to be adjusted to account for specific characteristics of problems.
The DONLP2 User's Guide provides detailed descriptions of these parameters as well as strategies
for maximizing the performance of the algorithm. The DONLP2 User's Guide is available in the
help subdirectory of the main IMSL product installation directory. In addition, the following are
a number of guidelines to consider when using NNLPF.
A good initial starting point is very problem specific and should be provided by the calling
program whenever possible. See optional argument XGUESS.
Gradient approximation methods can have an effect on the success of NNLPF. Selecting a
higher order approximation method may be necessary for some problems. See optional
argument IDTYPE.
estimate for that value. This will increase the efficiency of the algorithm. See optional
argument DEL0.
The parameter IERR provided in the interface to the user supplied function FCN can be very
useful in cases when evaluation is requested at a point that is not possible or reasonable. For
example, if evaluation at the requested point would result in a floating point exception, then
setting IERR to .TRUE. and returning without performing the evaluation will avoid the
exception. NNLPF will then reduce the stepsize and try the step again. Note, if IERR is set to
.TRUE. for the initial guess, then an error is issued.
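The IERR mechanism described above can be sketched as follows. This is an illustration in plain Fortran, not IMSL code: the subroutine uses the documented argument list (X, IACT, RESULT, IERR), but the objective X(2)**2 - LOG(X(1)) and the single constraint are hypothetical, chosen only so that the objective is undefined for X(1) <= 0. The small driver calls FCN directly, standing in for the calls NNLPF would make.

```fortran
program ierr_demo
   implicit none
   real    :: x(2), result
   logical :: ierr

   x = [-1.0, 2.0]                       ! LOG(X(1)) is undefined here
   ierr = .false.                        ! IERR is .FALSE. on input
   call fcn(x, 0, result, ierr)
   print '(a, l1)', 'IERR at an invalid point: ', ierr

   x = [1.0, 2.0]
   ierr = .false.
   call fcn(x, 0, result, ierr)
   print '(a, l1, a, f8.4)', 'IERR at a valid point: ', ierr, &
                             '  RESULT = ', result

contains

   ! Same argument list as the FCN described for NNLPF. The objective
   ! and constraint are hypothetical examples.
   subroutine fcn(x, iact, result, ierr)
      real,    intent(in)    :: x(*)
      integer, intent(in)    :: iact
      real,    intent(out)   :: result
      logical, intent(inout) :: ierr

      result = 0.0
      select case (iact)
      case (0)
         if (x(1) <= 0.0) then
            ierr = .true.                ! flag the point and skip LOG
            return
         end if
         result = x(2)**2 - log(x(1))    ! objective
      case (1)
         result = x(1) + x(2) - 1.0      ! constraint 1
      end select
   end subroutine fcn

end program ierr_demo
```

Guarding the evaluation this way avoids the floating-point exception entirely, which is the behavior the guideline recommends over letting the exception occur.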
Example
The problem
$$
\min F(x) = (x_1 - 2)^2 + (x_2 - 1)^2
$$
subject to
$$
g_1(x) = x_1 - 2x_2 + 1 = 0
$$
$$
g_2(x) = -x_1^2/4 - x_2^2 + 1 \ge 0
$$
is solved.
      USE NNLPF_INT
      USE WRRRN_INT

      IMPLICIT   NONE
      INTEGER    IBTYPE, M, ME
      PARAMETER  (IBTYPE=0, M=2, ME=1)
!
      REAL(KIND(1E0)) FVALUE, X(2), XGUESS(2), XLB(2), XUB(2)
      EXTERNAL   FCN
!
      XLB = -HUGE(X(1))
      XUB = HUGE(X(1))
!
      CALL NNLPF (FCN, M, ME, IBTYPE, XLB, XUB, X)
!
      CALL WRRRN ('The solution is', X)
      END

      SUBROUTINE FCN (X, IACT, RESULT, IERR)
      INTEGER    IACT
      REAL(KIND(1E0)) X(*), RESULT
      LOGICAL    IERR
!
      SELECT CASE (IACT)
      CASE(0)
         RESULT = (X(1)-2.0E0)**2 + (X(2)-1.0E0)**2
      CASE(1)
         RESULT = X(1) - 2.0E0*X(2) + 1.0E0
      CASE(2)
         RESULT = -(X(1)**2)/4.0E0 - X(2)**2 + 1.0E0
      END SELECT
      RETURN
      END
Output
The solution is
  1   0.8229
  2   0.9114
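As a quick plausibility check, the printed solution can be substituted back into the objective and the two constraints from the example. The sketch below is plain Fortran and independent of IMSL; since the solution is printed to only four digits, the constraint values are expected to be small rather than exactly zero.

```fortran
program check_nnlpf_example
   implicit none
   real :: x(2), f, g1, g2

   x = [0.8229, 0.9114]                  ! solution as printed (4 digits)

   f  = (x(1) - 2.0)**2 + (x(2) - 1.0)**2
   g1 = x(1) - 2.0*x(2) + 1.0            ! equality constraint, near zero
   g2 = -(x(1)**2)/4.0 - x(2)**2 + 1.0   ! inequality constraint, g2 >= 0

   print '(a, f10.5)',  'F(x)  = ', f
   print '(a, es10.2)', 'g1(x) = ', g1
   print '(a, es10.2)', 'g2(x) = ', g2
end program check_nnlpf_example
```

Both constraint values come out on the order of the rounding in the printed digits, which indicates that the inequality constraint is active at the solution as well as the equality.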
NNLPG
Solves a general nonlinear programming problem using a sequential equality constrained quadratic
programming method with user supplied gradients.
Required Arguments
FCN User-supplied subroutine to evaluate the objective function and constraints at a given
point. The internal usage is CALL FCN (X, IACT, RESULT, IERR),
where
X The point at which the objective function or constraint is evaluated. (Input)
IACT Integer indicating whether evaluation of the objective function is requested or
evaluation of a constraint is requested. If IACT is zero, then an objective
function evaluation is requested. If IACT is nonzero then the value of IACT
indicates the index of the constraint to evaluate. (Input)
RESULT If IACT is zero, then RESULT is the computed objective function value at
the point X. If IACT is nonzero, then RESULT is the computed constraint value
at the point X. (Output)
IERR Logical variable. On input IERR is set to .FALSE. If an error or other
undesirable condition occurs during evaluation, then IERR should be set to
.TRUE. Setting IERR to .TRUE. will result in the step size being reduced and
the step being tried again. (If IERR is set to .TRUE. for XGUESS, then an error is
issued.)
The routine FCN must be use-associated in a user module that uses NNLPG_INT, or else
declared EXTERNAL in the calling program. If FCN is a separately compiled routine, not in a
module, then it must be declared EXTERNAL.
GRAD User-supplied subroutine to evaluate the gradients at a given point. The usage is
CALL GRAD (X, IACT, RESULT), where
X The point at which the gradient of the objective function or gradient of a constraint
is evaluated. (Input)
IACT Integer indicating whether evaluation of the function gradient is requested or
evaluation of a constraint gradient is requested. If IACT is zero, then an
objective function gradient evaluation is requested. If IACT is nonzero then the
value of IACT indicates the index of the constraint gradient to evaluate.
(Input)
RESULT If IACT is zero, then RESULT is the computed gradient of the
objective function at the point X. If IACT is nonzero, then RESULT is the
computed gradient of the requested constraint value at the point X. (Output)
The routine GRAD must be use-associated in a user module that uses NNLPG_INT, or else
declared EXTERNAL in the calling program. If GRAD is a separately compiled routine, not in a
module, then it must be declared EXTERNAL.
M Total number of constraints. (Input)
ME Number of equality constraints. (Input)
IBTYPE Scalar indicating the types of bounds on variables. (Input)
IBTYPE  Action
3       User supplies only the bounds on 1st variable; all other variables will have
        the same bounds.
XLB Vector of length N containing the lower bounds on the variables. (Input, if
IBTYPE = 0; output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3) If there is no
lower bound on a variable, then the corresponding XLB value should be set to
-huge(x(1)).
XUB Vector of length N containing the upper bounds on the variables. (Input, if
IBTYPE = 0; output, if IBTYPE = 1 or 2; input/output, if IBTYPE = 3) If there is no
upper bound on a variable, then the corresponding XUB value should be set to
huge(x(1)).
X Vector of length N containing the computed solution. (Output)
Optional Arguments
N Number of variables. (Input)
Default: N = SIZE(X).
IPRINT Parameter indicating the desired output level. (Input)
IPRINT  Action
0       No output printed.
1       Lines of intermediate results summarizing the most important data for each
        step are printed.
2       Lines of detailed intermediate results showing all primal and dual variables,
        the relevant values from the working set, progress in the backtracking,
        etc. are printed.
3       Lines of detailed intermediate results showing all primal and dual variables,
        the relevant values from the working set, progress in the backtracking, the
        gradients in the working set, the quasi-Newton update, etc. are printed.
Default: IPRINT = 0.
MAXITN Maximum number of iterations allowed. (Input)
Default: MAXITN = 200.
XGUESS Vector of length N containing an initial guess of the solution. (Input)
Default: XGUESS = x (with the smallest value of $\|x\|_2$) that satisfies the bounds.
TAU0 A universal bound describing how much the unscaled penalty-term may deviate
from zero. (Input)
NNLPG assumes that within the region described by
$$
\sum_{i=1}^{M_e} |g_i(x)| - \sum_{i=M_e+1}^{M} \min(0, g_i(x)) \le \mathrm{TAU0}
$$
all functions may be evaluated safely. The initial guess, however, may violate these
requirements. In that case an initial feasibility improvement phase is run by NNLPG
until such a point is found. A small TAU0 diminishes the efficiency of NNLPG, because
the iterates then will follow the boundary of the feasible set closely. Conversely, a large
TAU0 may degrade the reliability of the code.
Default: TAU0 = 1.E0
DEL0 In the initial phase of minimization a constraint is considered binding if
$$
\frac{g_i(x)}{\max\{1, \|\nabla g_i(x)\|\}} \le \mathrm{DEL0}, \qquad i = M_e + 1, \ldots, M
$$
Good values are between .01 and 1.0. If DEL0 is chosen too small then identification
of the correct set of binding constraints may be delayed. Conversely, if DEL0 is too large,
then the method will often escape to the full regularized SQP method, using individual
slack variables for any active constraint, which is quite costly. For well-scaled
problems DEL0=1.0 is reasonable. (Input)
Default: DEL0 = .5*TAU0
SMALLW Scalar containing the error allowed in the multipliers. For example, a negative
multiplier of an inequality constraint is accepted (as zero) if its absolute value is less
than SMALLW. (Input)
Default: SMALLW = exp(2*log(epsilon(x(1)/3)))
DELMIN Scalar which defines allowable constraint violations of the final accepted result.
Constraints are satisfied if $|g_i(x)| \le$ DELMIN, and $g_j(x) \ge$ (-DELMIN), respectively.
(Input)
Default: DELMIN = min(DEL0/10, max(1.E-6*DEL0, SMALLW))
SCFMAX Scalar containing the bound for the internal automatic scaling of the objective
function. (Input)
Default: SCFMAX = 1.0E4
FVALUE Scalar containing the value of the objective function at the computed solution.
(Output)
FORTRAN 90 Interface
Generic:
Specific:
Description
The routine NNLPG provides an interface to a licensed version of subroutine DONLP2, a FORTRAN
code developed by Peter Spellucci (1998). It uses a sequential equality constrained quadratic
programming method with an active set technique, and an alternative usage of a fully regularized
mixed constrained subproblem in case of nonregular constraints (i.e. linearly dependent gradients in
the working sets). It uses a slightly modified version of the Pantoja-Mayne update for the
Hessian of the Lagrangian, variable dual scaling and an improved Armijo-type stepsize algorithm.
Bounds on the variables are treated in a gradient-projection like fashion. Details may be found in
the following two papers:
P. Spellucci: An SQP method for general nonlinear programs using only equality constrained
subproblems. Math. Prog. 82, (1998), 413-448.
P. Spellucci: A new technique for inconsistent problems in the SQP method. Math. Meth. of Oper.
Res. 47, (1998), 355-500. (published by Physica Verlag, Heidelberg, Germany).
The problem is stated as follows:
$$\min_{x \in \mathbb{R}^n} f(x)$$
subject to
$$g_j(x) = 0, \quad \text{for } j = 1, \ldots, m_e$$
$$g_j(x) \ge 0, \quad \text{for } j = m_e + 1, \ldots, m$$
$$x_l \le x \le x_u$$
Although default values are provided for optional input arguments, it may be necessary to adjust
these values for some problems. Through the use of optional arguments, NNLPG allows for several
parameters of the algorithm to be adjusted to account for specific characteristics of problems.
The DONLP2 Users Guide provides detailed descriptions of these parameters as well as strategies
for maximizing the performance of the algorithm. The DONLP2 Users Guide is available in the
help subdirectory of the main IMSL product installation directory. In addition, the following are
a number of guidelines to consider when using NNLPG.
A good initial starting point is very problem specific and should be provided by the
calling program whenever possible. See optional argument XGUESS.
estimate for that value. This will increase the efficiency of the algorithm. See optional
argument DEL0.
The parameter IERR provided in the interface to the user supplied function FCN can be
very useful in cases when evaluation is requested at a point that is not possible or
reasonable. For example, if evaluation at the requested point would result in a floating
point exception, then setting IERR to .TRUE. and returning without performing the
evaluation will avoid the exception. NNLPG will then reduce the stepsize and try the step
again. Note, if IERR is set to .TRUE. for the initial guess, then an error is issued.
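The IERR mechanism described above can be sketched as follows. This is a minimal illustration only: the objective log(x1) + x2**2, the constraint, and the driver program are hypothetical, and the driver merely mimics how a caller might inspect IERR. The sketch assumes IERR is .FALSE. on entry, so FCN only needs to set it when evaluation is not possible.

```fortran
PROGRAM IERR_SKETCH
  IMPLICIT NONE
  REAL(KIND(1E0)) :: X(2), RESULT
  LOGICAL :: IERR

  ! A point where the hypothetical objective can be evaluated.
  X = (/ 2.0E0, 1.0E0 /)
  IERR = .FALSE.
  CALL FCN (X, 0, RESULT, IERR)
  WRITE (*,*) 'IERR at ( 2, 1): ', IERR

  ! A point where LOG(X(1)) is undefined; FCN returns without evaluating.
  X = (/ -1.0E0, 1.0E0 /)
  IERR = .FALSE.
  CALL FCN (X, 0, RESULT, IERR)
  WRITE (*,*) 'IERR at (-1, 1): ', IERR

CONTAINS

  SUBROUTINE FCN (X, IACT, RESULT, IERR)
    REAL(KIND(1E0)), INTENT(IN)  :: X(*)
    INTEGER, INTENT(IN)          :: IACT
    REAL(KIND(1E0)), INTENT(OUT) :: RESULT
    LOGICAL, INTENT(INOUT)       :: IERR

    RESULT = 0.0E0
    SELECT CASE (IACT)
    CASE (0)
       ! Hypothetical objective, defined only for x1 > 0.
       IF (X(1) <= 0.0E0) THEN
          IERR = .TRUE.     ! signal that this point cannot be evaluated
          RETURN
       END IF
       RESULT = LOG(X(1)) + X(2)**2
    CASE (1)
       ! Hypothetical equality constraint.
       RESULT = X(1) - 2.0E0*X(2) + 1.0E0
    END SELECT
  END SUBROUTINE FCN
END PROGRAM IERR_SKETCH
```

The guard is placed before the offending operation so that no floating-point exception is raised; the first call leaves IERR false and the second sets it to true.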
Comments
1.
Informational errors
Type
4
4
4
4
4
4
4
4
4
4
4
Code
1
2
3
4
5
6
7
8
9
10
11
12
Example 1
The problem
$$\min F(x) = (x_1 - 2)^2 + (x_2 - 1)^2$$
subject to
$$g_1(x) = x_1 - 2x_2 + 1 = 0$$
$$g_2(x) = -x_1^2/4 - x_2^2 + 1 \ge 0$$
is solved.
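      USE NNLPG_INT
      USE WRRRN_INT
      IMPLICIT   NONE
      INTEGER    IBTYPE, M, ME
      PARAMETER  (IBTYPE=0, M=2, ME=1)
!
      REAL(KIND(1E0)) X(2), XLB(2), XUB(2)
      EXTERNAL   FCN, GRAD
!
      XLB = -HUGE(X(1))
      XUB = HUGE(X(1))
!
      CALL NNLPG (FCN, GRAD, M, ME, IBTYPE, XLB, XUB, X)
!
      CALL WRRRN ('The solution is', X)
      END
      SUBROUTINE FCN (X, IACT, RESULT, IERR)
      INTEGER    IACT
      REAL(KIND(1E0)) X(*), RESULT
      LOGICAL IERR
!
      SELECT CASE (IACT)
      CASE(0)
         RESULT = (X(1)-2.0E0)**2 + (X(2)-1.0E0)**2
      CASE(1)
         RESULT = X(1) - 2.0E0*X(2) + 1.0E0
      CASE(2)
         RESULT = -(X(1)**2)/4.0E0 - X(2)**2 + 1.0E0
      END SELECT
      RETURN
      END
      SUBROUTINE GRAD (X, IACT, RESULT)
      INTEGER    IACT
      REAL(KIND(1E0)) X(*), RESULT(*)
!                                 Analytic gradients of the
!                                 functions evaluated in FCN
      SELECT CASE (IACT)
      CASE(0)
         RESULT (1) = 2.0E0*(X(1)-2.0E0)
         RESULT (2) = 2.0E0*(X(2)-1.0E0)
      CASE(1)
         RESULT (1) = 1.0E0
         RESULT (2) = -2.0E0
      CASE(2)
         RESULT (1) = -0.5E0*X(1)
         RESULT (2) = -2.0E0*X(2)
      END SELECT
      RETURN
      END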
Output
The solution is
   1   0.8229
   2   0.9114
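As a plausibility check on the printed solution, the objective and both constraints can be evaluated at the rounded values. The short program below is a sketch written for this purpose and is not part of the NNLPG example; it uses only the formulas of Example 1 and the four-digit output.

```fortran
PROGRAM CHECK_SOLUTION
  IMPLICIT NONE
  REAL :: X(2), F, G1, G2

  X = (/ 0.8229E0, 0.9114E0 /)              ! solution as printed, four digits
  F  = (X(1)-2.0E0)**2 + (X(2)-1.0E0)**2    ! objective
  G1 = X(1) - 2.0E0*X(2) + 1.0E0            ! equality constraint
  G2 = -(X(1)**2)/4.0E0 - X(2)**2 + 1.0E0   ! inequality constraint
  WRITE (*,'(A,F10.5)')  ' F  = ', F
  WRITE (*,'(A,ES12.3)') ' G1 = ', G1
  WRITE (*,'(A,ES12.3)') ' G2 = ', G2
END PROGRAM CHECK_SOLUTION
```

Both constraint values come out below 1.0E-3 in magnitude, which is what rounding the solution to four digits allows. That g2 is also close to zero suggests the inequality constraint is active at the solution.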
Additional Examples
Example 2
The same problem from Example 1 is solved, but here we use central differences to compute the
gradient of the first constraint. This example demonstrates how NNLPG can be used in cases when
analytic gradients are known for only a portion of the constraints and/or objective function. The
subroutine CDGRD is used to compute an approximation to the gradient of the first constraint.
      USE NNLPG_INT
      USE CDGRD_INT
      USE WRRRN_INT
      IMPLICIT   NONE
      INTEGER    IBTYPE, M, ME
      PARAMETER  (IBTYPE=0, M=2, ME=1)
!
      REAL(KIND(1E0)) X(2), XLB(2), XUB(2)
      EXTERNAL   FCN, GRAD
!
      XLB = -HUGE(X(1))
      XUB = HUGE(X(1))
!
      CALL NNLPG (FCN, GRAD, M, ME, IBTYPE, XLB, XUB, X)
!
      CALL WRRRN ('The solution is', X)
      END
      SUBROUTINE FCN (X, IACT, RESULT, IERR)
      INTEGER    IACT
      REAL(KIND(1E0)) X(2), RESULT
      LOGICAL IERR
      EXTERNAL CONSTR1
!
      SELECT CASE (IACT)
      CASE(0)
         RESULT = (X(1)-2.0E0)**2 + (X(2)-1.0E0)**2
      CASE(1)
         CALL CONSTR1(2, X, RESULT)
      CASE(2)
         RESULT = -(X(1)**2)/4.0E0 - X(2)**2 + 1.0E0
      END SELECT
      RETURN
      END
      SUBROUTINE GRAD (X, IACT, RESULT)
      USE CDGRD_INT
      INTEGER    IACT
      REAL(KIND(1E0)) X(2),RESULT(2)
      EXTERNAL CONSTR1
!
      SELECT CASE (IACT)
      CASE(0)
         RESULT (1) = 2.0E0*(X(1)-2.0E0)
         RESULT (2) = 2.0E0*(X(2)-1.0E0)
      CASE(1)
!                                 Central-difference approximation
         CALL CDGRD(CONSTR1, X, RESULT)
      CASE(2)
         RESULT (1) = -0.5E0*X(1)
         RESULT (2) = -2.0E0*X(2)
      END SELECT
      RETURN
      END
      SUBROUTINE CONSTR1 (N, X, RESULT)
      INTEGER N
      REAL(KIND(1E0)) X(*), RESULT
      RESULT = X(1) - 2.0E0*X(2) + 1.0E0
      RETURN
      END
Output
The solution is
   1   0.8229
   2   0.9114
CDGRD
Approximates the gradient using central differences.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (XC,1).
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
In the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
EPSFCN Estimate for the relative noise in the function. (Input)
EPSFCN must be less than or equal to 0.1. In the absence of other information, set
EPSFCN to 0.0.
Default: EPSFCN = 0.0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CDGRD uses the following finite-difference formula to estimate the gradient of a
function of n variables at x:
$$\frac{f(x + h_i e_i) - f(x - h_i e_i)}{2h_i} \quad \text{for } i = 1, \ldots, n$$
where $h_i = \varepsilon^{1/2} \max\{|x_i|, 1/s_i\}\,\mathrm{sign}(x_i)$, $\varepsilon$ is the machine epsilon, $s_i$ is the scaling factor of the i-th
variable, and $e_i$ is the i-th unit vector. For more details, see Dennis and Schnabel (1983).
Since the finite-difference method has truncation error, cancellation error, and rounding error,
users should be aware of possible poor performance. When possible, high precision arithmetic is
recommended.
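The central-difference formula can be tried out without the library. The program below is a hand-written sketch of the formula as stated above, not the CDGRD source; it assumes unit scaling factors and uses the machine epsilon returned by the EPSILON intrinsic, and the test function is the one used in the example for this routine.

```fortran
PROGRAM CENTRAL_DIFF_SKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 2
  REAL :: XC(N), XSCALE(N), GC(N), XP(N), XM(N), H, FP, FM
  INTEGER :: I

  XC = 1.0E0          ! point of evaluation
  XSCALE = 1.0E0      ! scaling factors s_i (assumed to be 1)
  DO I = 1, N
     ! h_i = sqrt(eps) * max(|x_i|, 1/s_i) * sign(x_i)
     H = SQRT(EPSILON(H)) * MAX(ABS(XC(I)), 1.0E0/XSCALE(I))
     H = SIGN(H, XC(I))
     XP = XC
     XM = XC
     XP(I) = XC(I) + H
     XM(I) = XC(I) - H
     CALL FCN (N, XP, FP)
     CALL FCN (N, XM, FM)
     GC(I) = (FP - FM) / (2.0E0*H)
  END DO
  WRITE (*,'(A,2F8.2)') ' The gradient is', GC

CONTAINS

  SUBROUTINE FCN (N, X, F)
    INTEGER, INTENT(IN) :: N
    REAL, INTENT(IN)    :: X(N)
    REAL, INTENT(OUT)   :: F
    F = X(1) - X(1)*X(2) - 2.0E0
  END SUBROUTINE FCN
END PROGRAM CENTRAL_DIFF_SKETCH
```

Each component costs two function evaluations. For this test function the exact gradient at (1, 1) is (0, -1), so the printed values should agree with it to the two decimals shown.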
Comments
This is Description A5.6.4, Dennis and Schnabel, 1983, page 323.
Example
In this example, the gradient of $f(x) = x_1 - x_1 x_2 - 2$ is estimated by the finite-difference method at
the point (1.0, 1.0).
      USE CDGRD_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    I, N, NOUT
      PARAMETER  (N=2)
      REAL       EPSFCN, GC(N), XC(N)
      EXTERNAL   FCN
!                                 Initialization.
      DATA XC/2*1.0E0/
!                                 Set function noise.
      EPSFCN = 0.01
!
      CALL CDGRD (FCN, XC, GC, EPSFCN=EPSFCN)
!
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) (GC(I),I=1,N)
99999 FORMAT (' The gradient is', 2F8.2, /)
!
      END
!
      SUBROUTINE FCN (N, X, F)
      INTEGER    N
      REAL       X(N), F
!
      F = X(1) - X(1)*X(2) - 2.0E0
!
      RETURN
      END
Output
The gradient is    0.00   -1.00
FDGRD
Approximates the gradient using forward differences.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (XC,1).
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
In the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
EPSFCN Estimate of the relative noise in the function. (Input)
EPSFCN must be less than or equal to 0.1. In the absence of other information, set
EPSFCN to 0.0.
Default: EPSFCN = 0.0.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FDGRD uses the following finite-difference formula to estimate the gradient of a
function of n variables at x:
$$\frac{f(x + h_i e_i) - f(x)}{h_i} \quad \text{for } i = 1, \ldots, n$$
where $h_i = \varepsilon^{1/2} \max\{|x_i|, 1/s_i\}\,\mathrm{sign}(x_i)$, $\varepsilon$ is the machine epsilon, $e_i$ is the i-th unit vector, and $s_i$ is
the scaling factor of the i-th variable. For more details, see Dennis and Schnabel (1983).
Since the finite-difference method has truncation error, cancellation error, and rounding error,
users should be aware of possible poor performance. When possible, high precision arithmetic is
recommended. When accuracy of the gradient is important, IMSL routine CDGRD should be used.
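The preference for CDGRD when accuracy matters can be motivated by a standard Taylor expansion. The following is a textbook derivation, given here under the assumption that f is three times continuously differentiable; it is not taken from the routine's source.

```latex
\begin{aligned}
\frac{f(x + h_i e_i) - f(x)}{h_i}
  &= \frac{\partial f}{\partial x_i}(x)
   + \frac{h_i}{2}\,\frac{\partial^2 f}{\partial x_i^2}(x) + O(h_i^2), \\
\frac{f(x + h_i e_i) - f(x - h_i e_i)}{2h_i}
  &= \frac{\partial f}{\partial x_i}(x)
   + \frac{h_i^2}{6}\,\frac{\partial^3 f}{\partial x_i^3}(x) + o(h_i^2).
\end{aligned}
```

The truncation error of the forward difference is of order $h_i$, while that of the central difference is of order $h_i^2$, at the cost of a second function evaluation per variable.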
Comments
This is Description A5.6.3, Dennis and Schnabel, 1983, page 322.
Example
In this example, the gradient of $f(x) = x_1 - x_1 x_2 - 2$ is estimated by the finite-difference method at
the point (1.0, 1.0).
      USE FDGRD_INT
      USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    I, N, NOUT
      PARAMETER  (N=2)
      REAL       EPSFCN, FC, GC(N), XC(N)
      EXTERNAL   FCN
!                                 Initialization.
      DATA XC/2*1.0E0/
!                                 Set function noise.
      EPSFCN = 0.01
!                                 Get function value at current
!                                 point.
      CALL FCN (N, XC, FC)
!
      CALL FDGRD (FCN, XC, FC, GC, EPSFCN=EPSFCN)
!
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) (GC(I),I=1,N)
99999 FORMAT (' The gradient is', 2F8.2, /)
!
      END
!
      SUBROUTINE FCN (N, X, F)
      INTEGER    N
      REAL       X(N), F
!
      F = X(1) - X(1)*X(2) - 2.0E0
!
      RETURN
      END
Output
The gradient is    0.00   -1.00
FDHES
Approximates the Hessian using forward differences and function values.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (XC,1).
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
In the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
EPSFCN Estimate of the relative noise in the function. (Input)
EPSFCN must be less than or equal to 0.1. In the absence of other information, set
EPSFCN to 0.0.
Default: EPSFCN = 0.0.
LDH Row dimension of H exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDH = SIZE (H,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FDHES uses the following finite-difference formula to estimate the Hessian matrix of
function f at x:
$$\frac{f(x + h_i e_i + h_j e_j) - f(x + h_i e_i) - f(x + h_j e_j) + f(x)}{h_i h_j}$$
where $h_i = \varepsilon^{1/3} \max\{|x_i|, 1/s_i\}\,\mathrm{sign}(x_i)$, $h_j = \varepsilon^{1/3} \max\{|x_j|, 1/s_j\}\,\mathrm{sign}(x_j)$, $\varepsilon$ is the machine epsilon or
user-supplied estimate of the relative noise, $s_i$ and $s_j$ are the scaling factors of the i-th and j-th
variables, and $e_i$ and $e_j$ are the i-th and j-th unit vectors, respectively. For more details, see Dennis
and Schnabel (1983).
Since the finite-difference method has truncation error, cancellation error, and rounding error,
users should be aware of possible poor performance. When possible, high precision arithmetic is
recommended.
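The second-difference formula can be exercised on its own. The program below is a sketch of the formula as stated above rather than the FDHES source. It assumes unit scaling factors, takes the machine epsilon from the EPSILON intrinsic, and works in double precision, a choice made here because the division by $h_i h_j$ magnifies rounding error. The test function and point are those of the example for this routine.

```fortran
PROGRAM FD_HESSIAN_SKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 2
  DOUBLE PRECISION :: XC(N), XSCALE(N), H(N), HES(N,N)
  DOUBLE PRECISION :: XI(N), XJ(N), XIJ(N), FC, FI, FJ, FIJ
  INTEGER :: I, J

  XC = (/ 1.0D0, -1.0D0 /)
  XSCALE = 1.0D0                       ! scaling factors (assumed to be 1)
  DO I = 1, N
     ! h_i = eps**(1/3) * max(|x_i|, 1/s_i) * sign(x_i)
     H(I) = EPSILON(1.0D0)**(1.0D0/3.0D0) * MAX(ABS(XC(I)), 1.0D0/XSCALE(I))
     H(I) = SIGN(H(I), XC(I))
  END DO
  CALL FCN (N, XC, FC)
  DO I = 1, N
     XI = XC
     XI(I) = XI(I) + H(I)
     CALL FCN (N, XI, FI)              ! f(x + h_i e_i)
     DO J = 1, I
        XJ = XC
        XJ(J) = XJ(J) + H(J)
        CALL FCN (N, XJ, FJ)           ! f(x + h_j e_j)
        XIJ = XI
        XIJ(J) = XIJ(J) + H(J)
        CALL FCN (N, XIJ, FIJ)         ! f(x + h_i e_i + h_j e_j)
        HES(I,J) = (FIJ - FI - FJ + FC) / (H(I)*H(J))
        HES(J,I) = HES(I,J)            ! fill the upper triangle by symmetry
     END DO
  END DO
  WRITE (*,'(A)') ' The lower triangle of the Hessian is'
  DO I = 1, N
     WRITE (*,'(5X,2F10.2)') (HES(I,J), J = 1, I)
  END DO

CONTAINS

  SUBROUTINE FCN (N, X, F)
    INTEGER, INTENT(IN)           :: N
    DOUBLE PRECISION, INTENT(IN)  :: X(N)
    DOUBLE PRECISION, INTENT(OUT) :: F
    F = X(1)*(X(1) - X(2)) - 2.0D0
  END SUBROUTINE FCN
END PROGRAM FD_HESSIAN_SKETCH
```

For this quadratic function the exact Hessian is constant, with entries 2, -1 and 0 in the lower triangle, so the printed values should be close to those numbers.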
Comments
1.
2.
This is Description A5.6.2 from Dennis and Schnabel, 1983; page 321.
Example
The Hessian is estimated for the following function at (1, -1):
$$f(x) = x_1^2 - x_1 x_2 - 2$$
      USE FDHES_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declaration of variables
      INTEGER    I, J, N, LDHES, NOUT
      PARAMETER  (N=2, LDHES=2)
      REAL       XC(N), FVALUE, HES(LDHES,N), EPSFCN
      EXTERNAL   FCN
!                                 Initialization
      DATA XC/1.0E0,-1.0E0/
!                                 Set function noise
      EPSFCN = 0.001
!                                 Evaluate the function at
!                                 current point
      CALL FCN (N, XC, FVALUE)
!                                 Get Hessian forward difference
!                                 approximation
      CALL FDHES (FCN, XC, FVALUE, HES, EPSFCN=EPSFCN)
!
      CALL UMACH (2, NOUT)
      WRITE (NOUT,99999) ((HES(I,J),J=1,I),I=1,N)
99999 FORMAT (' The lower triangle of the Hessian is', /,&
             5X,F10.2,/,5X,2F10.2,/)
!
      END
!
      SUBROUTINE FCN (N, X, F)
!                                 SPECIFICATIONS FOR ARGUMENTS
      INTEGER    N
      REAL       X(N), F
!
      F = X(1)*(X(1) - X(2)) - 2.0E0
!
      RETURN
      END
Output
The lower triangle of the Hessian is
           2.00
          -1.00      0.00
GDHES
Approximates the Hessian using forward differences and a user-supplied gradient.
Required Arguments
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Length of X and G. (Input)
X The point at which the gradient is evaluated. (Input)
X should not be changed by GRAD.
G The gradient evaluated at the point X. (Output)
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (XC,1).
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
In the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
EPSFCN Estimate of the relative noise in the function. (Input)
EPSFCN must be less than or equal to 0.1. In the absence of other information, set
EPSFCN to 0.0.
Default: EPSFCN = 0.0.
LDH Leading dimension of H exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDH = SIZE (H,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine GDHES uses the following finite-difference formula to estimate the Hessian matrix of
function F at x:
$$\frac{g(x + h_j e_j) - g(x)}{h_j}$$
where $h_j = \varepsilon^{1/2} \max\{|x_j|, 1/s_j\}\,\mathrm{sign}(x_j)$, $\varepsilon$ is the machine epsilon, $s_j$ is the scaling factor of the j-th
variable, g is the analytic gradient of F at x, and $e_j$ is the j-th unit vector. For more details, see
Dennis and Schnabel (1983).
Since the finite-difference method has truncation error, cancellation error, and rounding error,
users should be aware of possible poor performance. When possible, high precision arithmetic is
recommended.
Comments
1.
2.
Example
The Hessian is estimated by the finite-difference method at point (1.0, 1.0) from the following
gradient functions:
$$g_1 = 2x_1 x_2 - 2$$
$$g_2 = x_1 x_1 + 1$$
      USE GDHES_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declaration of variables
      INTEGER    N, LDHES, NOUT
      PARAMETER  (N=2, LDHES=2)
      REAL       XC(N), GC(N), HES(LDHES,N)
      EXTERNAL   GRAD
!
      DATA XC/2*1.0E0/
!
Output
THE HESSIAN IS
    2.00    2.00
    2.00    0.00
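As a cross-check on this output, the exact Hessian can be obtained by differentiating the gradient functions by hand. The calculation assumes the gradient functions read $g_1 = 2x_1x_2 - 2$ and $g_2 = x_1 x_1 + 1$, which is the reading consistent with the printed result.

```latex
H(x) =
\begin{pmatrix}
\partial g_1/\partial x_1 & \partial g_1/\partial x_2 \\
\partial g_2/\partial x_1 & \partial g_2/\partial x_2
\end{pmatrix}
=
\begin{pmatrix}
2x_2 & 2x_1 \\
2x_1 & 0
\end{pmatrix},
\qquad
H(1, 1) =
\begin{pmatrix}
2 & 2 \\
2 & 0
\end{pmatrix}.
```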
FDJAC
Approximates the Jacobian of M functions in N unknowns using forward differences.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (M, N, X, F), where
M Length of F. (Input)
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
Optional Arguments
M The number of functions. (Input)
Default: M = SIZE (FC,1).
N The number of variables. (Input)
Default: N = SIZE (XC,1).
XSCALE Vector of length N containing the diagonal scaling matrix for the variables.
(Input)
In the absence of other information, set all entries to 1.0.
Default: XSCALE = 1.0.
EPSFCN Estimate for the relative noise in the function. (Input)
EPSFCN must be less than or equal to 0.1. In the absence of other information, set
EPSFCN to 0.0.
Default: EPSFCN = 0.0.
LDFJAC Leading dimension of FJAC exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDFJAC = SIZE (FJAC,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine FDJAC uses the following finite-difference formula to estimate the Jacobian matrix of
function f at x:
$$\frac{f(x + h_j e_j) - f(x)}{h_j}$$
where $e_j$ is the j-th unit vector, $h_j = \varepsilon^{1/2} \max\{|x_j|, 1/s_j\}\,\mathrm{sign}(x_j)$, $\varepsilon$ is the machine epsilon, and $s_j$ is
the scaling factor of the j-th variable. For more details, see Dennis and Schnabel (1983).
Since the finite-difference method has truncation error, cancellation error, and rounding error,
users should be aware of possible poor performance. When possible, high precision arithmetic is
recommended.
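The column-by-column structure of this formula is sketched below. The program is written for illustration and is not the FDJAC source. It assumes unit scaling factors and takes the two test functions to be $f_1 = x_1x_2 - 2$ and $f_2 = x_1 - x_1x_2 + 1$, evaluated at (1, 1).

```fortran
PROGRAM FD_JACOBIAN_SKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: M = 2, N = 2
  REAL :: XC(N), XSCALE(N), FC(M), FP(M), XP(N), FJAC(M,N), H
  INTEGER :: I, J

  XC = 1.0E0
  XSCALE = 1.0E0                 ! scaling factors s_j (assumed to be 1)
  CALL FCN (M, N, XC, FC)        ! f(x), reused for every column
  DO J = 1, N
     ! h_j = sqrt(eps) * max(|x_j|, 1/s_j) * sign(x_j)
     H = SQRT(EPSILON(H)) * MAX(ABS(XC(J)), 1.0E0/XSCALE(J))
     H = SIGN(H, XC(J))
     XP = XC
     XP(J) = XC(J) + H
     CALL FCN (M, N, XP, FP)     ! f(x + h_j e_j)
     FJAC(:,J) = (FP - FC) / H   ! j-th column of the Jacobian
  END DO
  WRITE (*,'(A)') ' The Jacobian is'
  DO I = 1, M
     WRITE (*,'(2F10.2)') (FJAC(I,J), J = 1, N)
  END DO

CONTAINS

  SUBROUTINE FCN (M, N, X, F)
    INTEGER, INTENT(IN) :: M, N
    REAL, INTENT(IN)    :: X(N)
    REAL, INTENT(OUT)   :: F(M)
    F(1) = X(1)*X(2) - 2.0E0
    F(2) = X(1) - X(1)*X(2) + 1.0E0
  END SUBROUTINE FCN
END PROGRAM FD_JACOBIAN_SKETCH
```

One perturbed evaluation of the whole vector function yields a full column, so N + 1 evaluations suffice for the entire matrix. The exact Jacobian at (1, 1) has rows (1, 1) and (0, -1).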
Comments
1.
2.
Example
In this example, the Jacobian matrix of
$$f_1(x) = x_1 x_2 - 2$$
$$f_2(x) = x_1 - x_1 x_2 + 1$$
is estimated by the finite-difference method at the point (1.0, 1.0).
      IMPLICIT   NONE
!                                 Declaration of variables
      INTEGER    N, M, LDFJAC, NOUT
      PARAMETER  (N=2, M=2, LDFJAC=2)
      REAL       FJAC(LDFJAC,N), XC(N), FC(M), EPSFCN
      EXTERNAL   FCN
!
      DATA XC/2*1.0E0/
!
!
      RETURN
      END
Output
The Jacobian is
    1.00    1.00
    0.00   -1.00
CHGRD
Checks a user-supplied gradient of a function.
Required Arguments
FCN User-supplied subroutine to evaluate the function of which the gradient will be
checked. The usage is
CALL FCN (N, X, F), where
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
INFO(I) = 3 means the user-supplied gradient and the numerical gradient are both zero
at X(I), and, therefore, the gradient should be rechecked at a different point.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CHGRD uses the following finite-difference formula to estimate the gradient of a
function of n variables at x:
$$g_i(x) = \frac{f(x + h_i e_i) - f(x)}{h_i} \quad \text{for } i = 1, \ldots, n$$
where $h_i = \varepsilon^{1/2} \max\{|x_i|, 1/s_i\}\,\mathrm{sign}(x_i)$, $\varepsilon$ is the machine epsilon, $e_i$ is the i-th unit vector, and $s_i$ is
the scaling factor of the i-th variable.
The routine CHGRD checks the user-supplied gradient $\nabla f(x)$ by comparing it with the
finite-difference gradient g(x). If
$$|g_i(x) - (\nabla f(x))_i| < \tau\,|(\nabla f(x))_i|$$
where $\tau = \varepsilon^{1/4}$, then $(\nabla f(x))_i$, which is the i-th element of $\nabla f(x)$, is declared correct; otherwise,
CHGRD computes the bounds of calculation error and approximation error. When both bounds are
too small to account for the difference, $(\nabla f(x))_i$ is reported as incorrect. In the case of a large error
bound, CHGRD uses a nearly optimal stepsize to recompute $g_i(x)$ and reports that $(\nabla f(x))_i$ is correct
if
$$|g_i(x) - (\nabla f(x))_i| < 2\tau\,|(\nabla f(x))_i|$$
Otherwise, $(\nabla f(x))_i$ is considered incorrect unless the error bound for the optimal step is greater
than $|(\nabla f(x))_i|$. In this case, the numeric gradient may be impossible to compute correctly. For
more details, see Schnabel (1985).
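The first stage of this test can be sketched in a few lines. The program below is an illustration only: it implements just the initial comparison against $\tau = \varepsilon^{1/4}$, not the error-bound analysis or the recomputation with a nearly optimal stepsize that CHGRD goes on to perform. The test point is hypothetical, unit scaling is assumed, double precision is a choice made for the sketch, and the flag values 1 and 0 are used here simply to mean "passed" and "not confirmed by this stage".

```fortran
PROGRAM GRADIENT_CHECK_SKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 4
  DOUBLE PRECISION, PARAMETER :: T = 2.125D0
  DOUBLE PRECISION :: X(N), GRAD(N), XP(N), FC, FP, H, GNUM, TAU
  INTEGER :: I, INFO(N)

  X = (/ 1.0D0, 2.0D0, 3.0D0, 4.0D0 /)      ! hypothetical test point
  TAU = EPSILON(TAU)**0.25D0                ! tau = eps**(1/4)
  FC = FCN(X)
  CALL DRIV (X, GRAD)                       ! user-supplied gradient
  DO I = 1, N
     ! h_i = sqrt(eps) * max(|x_i|, 1) * sign(x_i), with s_i = 1 assumed
     H = SIGN(SQRT(EPSILON(H)) * MAX(ABS(X(I)), 1.0D0), X(I))
     XP = X
     XP(I) = X(I) + H
     FP = FCN(XP)
     GNUM = (FP - FC) / H                   ! forward-difference estimate g_i
     IF (ABS(GNUM - GRAD(I)) < TAU*ABS(GRAD(I))) THEN
        INFO(I) = 1                         ! passes the first-stage test
     ELSE
        INFO(I) = 0                         ! not confirmed by this stage
     END IF
  END DO
  WRITE (*,'(A,4I4)') ' INFO =', INFO

CONTAINS

  DOUBLE PRECISION FUNCTION FCN (X)
    DOUBLE PRECISION, INTENT(IN) :: X(:)
    FCN = X(1) + X(2)*EXP(-(T - X(3))**2/X(4))
  END FUNCTION FCN

  SUBROUTINE DRIV (X, GRAD)
    DOUBLE PRECISION, INTENT(IN)  :: X(:)
    DOUBLE PRECISION, INTENT(OUT) :: GRAD(:)
    DOUBLE PRECISION :: E
    E = EXP(-(T - X(3))**2/X(4))
    GRAD(1) = 1.0D0
    GRAD(2) = E
    GRAD(3) = X(2)*E*2.0D0*(T - X(3))/X(4)
    GRAD(4) = X(2)*E*(T - X(3))**2/(X(4)*X(4))
  END SUBROUTINE DRIV
END PROGRAM GRADIENT_CHECK_SKETCH
```

Because the coded gradient is the exact derivative of the coded function, all four components should pass. Introducing a sign error into one line of DRIV is a quick way to see the flag for that component drop to 0.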
Comments
1.
2.
Informational errors
Type
4
Code
1 The user-supplied gradient is a poor estimate of the numerical
gradient.
Example
The user-supplied gradient of
$$f(x) = x_1 + x_2\, e^{-(t - x_3)^2 / x_4}$$
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    N
      PARAMETER  (N=4)
!
      INTEGER    INFO(N)
      REAL       GRAD(N), X(N)
      EXTERNAL   DRIV, FCN
!
      CALL DRIV (N, X, GRAD)
!
      END
!
      SUBROUTINE FCN (N, X, FX)
      INTEGER    N
      REAL       X(N), FX
!
      REAL       EXP
      INTRINSIC  EXP
!
      FX = X(1) + X(2)*EXP(-1.0E0*(2.125E0-X(3))**2/X(4))
      RETURN
      END
!
      SUBROUTINE DRIV (N, X, GRAD)
      INTEGER    N
      REAL       X(N), GRAD(N)
!
      REAL       EXP
      INTRINSIC  EXP
!
      GRAD(1) = 1.0E0
      GRAD(2) = EXP(-1.0E0*(2.125E0-X(3))**2/X(4))
      GRAD(3) = X(2)*EXP(-1.0E0*(2.125E0-X(3))**2/X(4))*2.0E0/X(4)* &
                (2.125E0-X(3))
      GRAD(4) = X(2)*EXP(-1.0E0*(2.125E0-X(3))**2/X(4))* &
                (2.125E0-X(3))**2/(X(4)*X(4))
      RETURN
      END
Output
The information vector
   1   2   3   4
   1   1   1   1
CHHES
Checks a user-supplied Hessian of an analytic function.
Required Arguments
GRAD User-supplied subroutine to compute the gradient at the point X. The usage is
CALL GRAD (N, X, G), where
N Length of X and G. (Input)
X The point at which the gradient is evaluated. X should not be changed by GRAD.
(Input)
HESS User-supplied subroutine to compute the Hessian at the point X. The usage is
CALL HESS (N, X, H, LDH), where
N Length of X. (Input)
X The point at which the Hessian is evaluated. (Input)
X should not be changed by HESS.
H The Hessian evaluated at the point X. (Output)
LDH Leading dimension of H exactly as specified in the dimension statement of the
X Vector of length N containing the point at which the Hessian is to be checked. (Input)
INFO Integer matrix of dimension N by N. (Output)
INFO(I, J) = 0 means the Hessian is a poor estimate for function I at the point X(J).
INFO(I, J) = 1 means the Hessian is a good estimate for function I at the point X(J).
INFO(I, J) = 2 means the Hessian disagrees with the numerical Hessian for function I
at the point X(J), but it might be impossible to calculate the numerical Hessian.
INFO(I, J) = 3 means the Hessian for function I at the point X(J) and the numerical
Hessian are both zero, and, therefore, the gradient should be rechecked at a
different point.
Optional Arguments
N Dimension of the problem. (Input)
Default: N = SIZE (X,1).
LDINFO Leading dimension of INFO exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDINFO = SIZE (INFO,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CHHES uses the following finite-difference formula to estimate the Hessian of a
function of n variables at x:
$$B_{ij}(x) = \frac{g_i(x + h_j e_j) - g_i(x)}{h_j} \quad \text{for } j = 1, \ldots, n$$
where $h_j = \varepsilon^{1/2} \max\{|x_j|, 1/s_j\}\,\mathrm{sign}(x_j)$, $\varepsilon$ is the machine epsilon, $e_j$ is the j-th unit vector, $s_j$ is the
scaling factor of the j-th variable, and $g_i(x)$ is the gradient of the function with respect to the i-th
variable.
Next, CHHES checks the user-supplied Hessian H(x) by comparing it with the finite difference
approximation B(x). If
$$|B_{ij}(x) - H_{ij}(x)| < \tau\,|H_{ij}(x)|$$
where $\tau = \varepsilon^{1/4}$, then $H_{ij}(x)$ is declared correct; otherwise, CHHES computes the bounds of
calculation error and approximation error. When both bounds are too small to account for the
difference, $H_{ij}(x)$ is reported as incorrect. In the case of a large error bound, CHHES uses a nearly
optimal stepsize to recompute $B_{ij}(x)$ and reports that $H_{ij}(x)$ is correct if
$$|B_{ij}(x) - H_{ij}(x)| < 2\tau\,|H_{ij}(x)|$$
Otherwise, $H_{ij}(x)$ is considered incorrect unless the error bound for the optimal step is greater than
$|H_{ij}(x)|$. In this case, the numeric approximation may be impossible to compute correctly. For
more details, see Schnabel (1985).
Comments
Workspace may be explicitly provided, if desired, by use of C2HES/DC2HES. The reference is
CALL C2HES (GRAD, HESS, N, X, INFO, LDINFO, G, HX, HS, XSCALE, EPSFCN, INFT,
NEWX)
Example
The user-supplied Hessian of
$$f(x) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$$
      IMPLICIT   NONE
      INTEGER    LDINFO, N
      PARAMETER  (N=2, LDINFO=N)
!
      INTEGER    INFO(LDINFO,N)
      REAL       X(N)
      EXTERNAL   GRD, HES
!
      CALL CHHES (GRD, HES, X, INFO)
!
      END
!
      SUBROUTINE GRD (N, X, UG)
      INTEGER    N
      REAL       X(N), UG(N)
!
      UG(1) = -400.0*X(1)*(X(2)-X(1)*X(1)) + 2.0*X(1) - 2.0
      UG(2) = 200.0*X(2) - 200.0*X(1)*X(1)
      RETURN
      END
!
      SUBROUTINE HES (N, X, HX, LDHS)
      INTEGER    N, LDHS
      REAL       X(N), HX(LDHS,N)
!
!
      RETURN
      END
Output
*** FATAL
***
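Differentiating the gradient coded in GRD gives the analytic Hessian that a routine such as HES would be expected to return. This is a hand derivation from the example function, offered as a cross-check rather than as the original HES code.

```latex
\nabla f(x) =
\begin{pmatrix}
-400\,x_1 (x_2 - x_1^2) + 2x_1 - 2 \\
200\,x_2 - 200\,x_1^2
\end{pmatrix},
\qquad
H(x) =
\begin{pmatrix}
1200\,x_1^2 - 400\,x_2 + 2 & -400\,x_1 \\
-400\,x_1 & 200
\end{pmatrix}.
```

The matrix is symmetric, and only the entry $H_{22}$ is independent of the evaluation point.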
CHJAC
Checks a user-supplied Jacobian of a system of equations with M functions in N unknowns.
Required Arguments
FCN User-supplied subroutine to evaluate the function to be minimized. The usage is
CALL FCN (M, N, X, F), where
M Length of F. (Input)
N Length of X. (Input)
X The point at which the function is evaluated. (Input)
X should not be changed by FCN.
F The computed function value at the point X. (Output)
FCN must be declared EXTERNAL in the calling program.
X Vector of length N containing the point at which the Jacobian is to be checked. (Input)
INFO Integer matrix of dimension M by N. (Output)
numerical Jacobian.
INFO(I, J) = 3 means the user-supplied Jacobian for function I at the point X(J) and
the numerical Jacobian are both zero. Therefore, the gradient should be
rechecked at a different point.
Optional Arguments
M The number of functions in the system of equations. (Input)
Default: M = SIZE (INFO,1).
N The number of unknowns in the system of equations. (Input)
Default: N = SIZE (X,1).
LDINFO Leading dimension of INFO exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDINFO = SIZE (INFO,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CHJAC uses the following finite-difference formula to estimate the gradient of the i-th
function of n variables at x:
$$g_{ij}(x) = \frac{f_i(x + h_j e_j) - f_i(x)}{h_j} \quad \text{for } j = 1, \ldots, n$$
where $h_j = \varepsilon^{1/2} \max\{|x_j|, 1/s_j\}\,\mathrm{sign}(x_j)$, $\varepsilon$ is the machine epsilon, $e_j$ is the j-th unit vector, and $s_j$ is the
scaling factor of the j-th variable.
Next, CHJAC checks the user-supplied Jacobian J(x) by comparing it with the finite difference
gradient $g_i(x)$. If
$$|g_{ij}(x) - J_{ij}(x)| < \tau\,|J_{ij}(x)|$$
where $\tau = \varepsilon^{1/4}$, then $J_{ij}(x)$ is declared correct; otherwise, CHJAC computes the bounds of calculation
error and approximation error. When both bounds are too small to account for the difference, $J_{ij}(x)$
is reported as incorrect. In the case of a large error bound, CHJAC uses a nearly optimal stepsize to
recompute $g_{ij}(x)$ and reports that $J_{ij}(x)$ is correct if
$$|g_{ij}(x) - J_{ij}(x)| < 2\tau\,|J_{ij}(x)|$$
Otherwise, $J_{ij}(x)$ is considered incorrect unless the error bound for the optimal step is greater than
$|J_{ij}(x)|$. In this case, the numeric gradient may be impossible to compute correctly. For more
details, see Schnabel (1985).
Comments
1.
2.
Informational errors
Type
4
Code
1 The user-supplied Jacobian is a poor estimate of the numerical
Jacobian.
Example
The user-supplied Jacobian of
$$f_1 = 1 - x_1$$
$$f_2 = 10\,(x_2 - x_1^2)$$
      IMPLICIT   NONE
      INTEGER    LDINFO, M, N
      PARAMETER  (M=2,N=2,LDINFO=M)
!
      INTEGER    INFO(LDINFO,N)
      REAL       X(N)
      EXTERNAL   FCN, JAC
!
!
      SUBROUTINE FCN (M, N, X, F)
      INTEGER    M, N
      REAL       X(N), F(M)
!
      F(1) = 1.0 - X(1)
      F(2) = 10.0*(X(2)-X(1)*X(1))
      RETURN
      END
!
      SUBROUTINE JAC (M, N, X, FJAC, LDFJAC)
      INTEGER    M, N, LDFJAC
      REAL       X(N), FJAC(LDFJAC,N)
!
      FJAC(1,1) = -1.0
      FJAC(1,2) = 0.0
      FJAC(2,1) = -20.0*X(1)
      FJAC(2,2) = 10.0
      RETURN
      END
Output
*** WARNING
***
***
***
***
the user-supplied value are both zero. The Jacobian for this
function should probably be re-checked at another value for
this point.
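For reference, the analytic Jacobian coded in JAC can be written out by hand, assuming $f_1 = 1 - x_1$ and $f_2 = 10(x_2 - x_1^2)$ as evaluated in FCN:

```latex
J(x) =
\begin{pmatrix}
\partial f_1/\partial x_1 & \partial f_1/\partial x_2 \\
\partial f_2/\partial x_1 & \partial f_2/\partial x_2
\end{pmatrix}
=
\begin{pmatrix}
-1 & 0 \\
-20\,x_1 & 10
\end{pmatrix}.
```

The entry $J_{12}$ is identically zero, which appears consistent with the warning in the output: when the supplied and the numerical value both vanish, the relative comparison carries no information, and INFO(I, J) = 3 is the documented outcome.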
GGUES
Generates points in an N-dimensional space.
Required Arguments
A Vector of length N. (Input)
See B.
B Real vector of length N. (Input)
A and B define the rectangular region in which the points will be generated, i.e.,
A(I) < S(I) < B(I) for I = 1, 2, …, N. Note that if B(I) < A(I), then B(I) < S(I) < A(I).
K The number of points to be generated. (Input)
IDO Initialization parameter. (Input/Output)
IDO must be set to zero for the first call. GGUES resets IDO to 1 and returns the first
generated point in S. Subsequent calls should be made with IDO = 1.
S Vector of length N containing the generated point. (Output)
Each call results in the next generated point being stored in S.
Optional Arguments
N Dimension of the space. (Input)
Default: N = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine GGUES generates starting points for algorithms that optimize functions of several
variables or, almost equivalently, algorithms that solve simultaneous nonlinear equations.
The routine GGUES is based on systematic placement of points to optimize the dispersion of the
set. For more details, see Aird and Rice (1977).
Comments
1.
2. Informational error
Type Code
4 1 Attempt to generate more than K points.
3. The routine GGUES may be used with any nonlinear optimization routine that requires
starting points. The rectangle to be searched (defined by A, B, and N) must be
determined; and the number of starting points, K, must be chosen. One possible use for
GGUES would be to call GGUES to generate a point in the chosen rectangle. Then, call
the nonlinear optimization routine using this point as an initial guess for the solution.
Repeat this process K times. The number of iterations that the optimization routine is
allowed to perform should be quite small (5 to 10) during this search process. The best
(or best several) point(s) found during the search may be used as an initial guess to
allow the optimization routine to determine the optimum more accurately. In this
manner, an N dimensional rectangle may be effectively searched for a global optimum
of a nonlinear function. The choice of K depends upon the nonlinearity of the function
being optimized. A function with many local optima requires a larger value than a
function with only a few local optima.
Example
We want to search the rectangle with vertices at coordinates (1, 1), (3, 1), (3, 2), and (1, 2) ten
times for a global optimum of a nonlinear function. To do this, we need to generate starting points.
The following example illustrates the use of GGUES in this process:
      USE GGUES_INT
      USE UMACH_INT
!
      IMPLICIT   NONE
!                                 Variable Declarations
      INTEGER    N
      PARAMETER  (N=2)
!
      INTEGER    IDO, J, K, NOUT
      REAL       A(N), B(N), S(N)
!                                 Initializations
!
      A = (/ 1.0, 1.0 /)
      B = (/ 3.0, 2.0 /)
!
Output
Point Number    Generated Point
      1         (1.5, 1.125)
      2         (2.0, 1.500)
      3         (2.5, 1.750)
      4         (1.5, 1.375)
      5         (2.0, 1.750)
      6         (1.5, 1.625)
      7         (2.5, 1.250)
      8         (1.5, 1.875)
      9         (2.0, 1.250)
     10         (2.5, 1.500)
Routines
9.1.
9.2.
9.2.1 Matrix Copy
Real general ..... CRGRG
Complex general ..... CCGCG
Real band ..... CRBRB
Complex band ..... CCBCB
9.2.2 Matrix Conversion
Real general to real band ..... CRGRB
Real band to real general ..... CRBRG
Complex general to complex band ..... CCGCB
Complex band to complex general ..... CCBCG
Real general to complex general ..... CRGCG
Real rectangular to complex rectangular ..... CRRCR
Real band to complex band ..... CRBCB
Real symmetric to real general ..... CSFRG
Complex Hermitian to complex general ..... CHFCG
Real symmetric band to real band ..... CSBRB
Complex Hermitian band to complex band ..... CHBCB
Real rectangular matrix to its transpose ..... TRNRR
9.2.3 Matrix Multiplication
Compute X^T X ..... MXTXF
Compute X^T Y ..... MXTYF
Compute X Y^T ..... MXYTF
Multiply two real rectangular matrices ..... MRRRR
Multiply two complex rectangular matrices ..... MCRCR
Compute matrix Hadamard product ..... HRRRR
Compute the bilinear form x^T A y ..... BLINF
Compute the matrix polynomial p(A) ..... POLRG
9.2.4 Matrix-Vector Multiplication
Real rectangular matrix times a real vector ..... MURRV
Real band matrix times a real vector ..... MURBV
Complex rectangular matrix times a complex vector ..... MUCRV
Complex band matrix times a complex vector ..... MUCBV
9.2.5 Matrix Addition
Real band matrix plus a real band matrix ..... ARBRB
Complex band matrix plus a complex band matrix ..... ACBCB
9.2.6 Matrix Norm
∞-norm of a real rectangular matrix ..... NRIRR
1-norm of a real rectangular matrix ..... NR1RR
Frobenius norm of a real rectangular matrix ..... NR2RR
1-norm of a real band matrix ..... NR1RB
1-norm of a complex band matrix ..... NR1CB
9.2.7
9.2.8 Vector Convolutions
Convolution of real vectors ..... VCONR
Convolution of complex vectors ..... VCONC
9.3.
Integer
Real
Complex
Double
Double complex
SD
CZ
DQ
ZQ
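Vector arguments have an increment parameter that specifies the storage space or stride between
elements. The correspondence between the vectors x and y and the arguments SX and SY, and
INCX and INCY is
$$x_i = \begin{cases} \mathrm{SX}((I-1)\cdot\mathrm{INCX} + 1) & \text{if INCX} \ge 0 \\ \mathrm{SX}((I-N)\cdot\mathrm{INCX} + 1) & \text{if INCX} < 0 \end{cases}$$
$$y_i = \begin{cases} \mathrm{SY}((I-1)\cdot\mathrm{INCY} + 1) & \text{if INCY} \ge 0 \\ \mathrm{SY}((I-N)\cdot\mathrm{INCY} + 1) & \text{if INCY} < 0 \end{cases}$$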
Function subprograms SXYZ and DXYZ refer to a third vector argument z. The storage increment
INCZ for z is defined like INCX, INCY. In the Level 1 BLAS, only positive values of INCX are
allowed for operations that have a single vector argument. The loops in all of the Level 1 BLAS
process the vector arguments in order of increasing i. For INCX, INCY, INCZ < 0, this implies
processing in reverse storage order.
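The increment rule can be checked with a few lines of standard Fortran. The following is only a sketch: the helper function vec_index is a hypothetical name introduced here for illustration and is not part of the library. It evaluates the storage index of logical element i for a positive and a negative increment.

```fortran
program stride_demo
  implicit none
  integer, parameter :: n = 3
  real :: sx(5)
  integer :: i

  sx = (/ 10.0, 20.0, 30.0, 40.0, 50.0 /)

  ! INCX = 2: x_i = SX((i-1)*INCX + 1), so x = (10, 30, 50)
  print *, (sx(vec_index(i, n, 2)), i = 1, n)

  ! INCX = -2: x_i = SX((i-n)*INCX + 1), so x = (50, 30, 10)
  print *, (sx(vec_index(i, n, -2)), i = 1, n)

contains

  ! Storage index of logical element i of an n-vector with increment incx
  ! (hypothetical helper, written from the correspondence stated above).
  pure integer function vec_index(i, n, incx)
    integer, intent(in) :: i, n, incx
    if (incx >= 0) then
      vec_index = (i - 1)*incx + 1
    else
      vec_index = (i - n)*incx + 1
    end if
  end function vec_index

end program stride_demo
```

With a negative increment the same storage locations are visited, but logical element 1 sits at the highest address, which is why processing in order of increasing i walks the array backwards.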
The function subprograms in the Level 1 BLAS are all illustrated by means of an assignment
statement. For example, see SDOT. Any value of a function subprogram can be used in an
expression or as a parameter passed to a subprogram as long as the data types agree.
The array arguments used in the descriptions are declared as follows:
      INTEGER            IX(MX)
      REAL               SX(MX), SY(MY), SZ(MZ), SPARAM(5)
      DOUBLE PRECISION   DX(MX), DY(MY), DZ(MZ), DPARAM(5)
      DOUBLE PRECISION   DACC(2), DZACC(4)
      COMPLEX            CX(MX), CY(MY)
      DOUBLE COMPLEX     ZX(MX), ZY(MY)
Since FORTRAN 77 does not include the type DOUBLE COMPLEX, subprograms with DOUBLE
COMPLEX arguments are not available for all systems. Some systems use the declaration
COMPLEX*16 instead of DOUBLE COMPLEX.
In the following descriptions, the original BLAS are marked with an * in the left column.
Table 9.1: Level 1 Basic Linear Algebra Subprograms
Operation                                      Integer  Real     Double   Complex  Double Complex  Pg.
$x_i \leftarrow a$                             ISET     SSET     DSET     CSET     ZSET            1425
$y_i \leftarrow x_i$                           ICOPY    SCOPY    DCOPY    CCOPY    ZCOPY           1425
$x_i \leftarrow a x_i$                                  SSCAL    DSCAL    CSCAL    ZSCAL           1425
$x_i \leftarrow a x_i$, $a \in \mathbb{R}$                                CSSCAL   ZDSCAL
$y_i \leftarrow a x_i$                                  SVCAL    DVCAL    CVCAL    ZVCAL           1425
$y_i \leftarrow a x_i$, $a \in \mathbb{R}$                                CSVCAL   ZDVCAL
$x_i \leftarrow x_i + a$                       IADD     SADD     DADD     CADD     ZADD            1425
$x_i \leftarrow a - x_i$                       ISUB     SSUB     DSUB     CSUB     ZSUB            1426
$y_i \leftarrow a x_i + y_i$                            SAXPY    DAXPY    CAXPY    ZAXPY           1426
$y_i \leftrightarrow x_i$                      ISWAP    SSWAP    DSWAP    CSWAP    ZSWAP           1426
$x^T y$                                                 SDOT     DDOT     CDOTU    ZDOTU           1426
$\bar{x}^T y$                                                             CDOTC    ZDOTC
$x^T y$ (higher precision)                                       DSDOT    CZDOTU   ZQDOTU          1427
$\bar{x}^T y$ (higher precision)                                          CZDOTC   ZQDOTC
$a + x^T y$ (higher precision)                          SDSDOT   DQDDOT   CZUDOT   ZQUDOT          1427
$a + \bar{x}^T y$ (higher precision)                                      CZCDOT   ZQCDOT
$b + x^T y$ (higher precision)                          SDDOTI   DQDOTI   CZDOTI   ZQDOTI
ACC $+ b + x^T y$ (higher precision)                    SDDOTA   DQDOTA   CZDOTA   ZQDOTA
$z_i \leftarrow x_i y_i$                                SHPROD   DHPROD                            1428
$\sum x_i y_i z_i$                                      SXYZ     DXYZ                              1428
$\sum x_i$                                     ISUM     SSUM     DSUM                              1428
$\sum |x_i|$                                            SASUM    DASUM    SCASUM   DZASUM          1429
$\|x\|_2$                                               SNRM2    DNRM2    SCNRM2   DZNRM2          1429
$\prod x_i$                                             SPRDCT   DPRDCT                            1429
$i : x_i = \min_j x_j$                         IIMIN    ISMIN    IDMIN                             1429
$i : x_i = \max_j x_j$                         IIMAX    ISMAX    IDMAX                             1430
$i : |x_i| = \min_j |x_j|$                              ISAMIN   IDAMIN   ICAMIN   IZAMIN          1430
$i : |x_i| = \max_j |x_j|$                              ISAMAX   IDAMAX   ICAMAX   IZAMAX          1430
Construct Givens rotation                               SROTG    DROTG                             1430
Apply Givens rotation                                   SROT     DROT     CSROT    ZDROT           1431
Construct modified Givens transform                     SROTMG   DROTMG                            1432
Apply modified Givens transform                         SROTM    DROTM    CSROTM   ZDROTM          1433
ISET
SSET
DSET
CSET
ZSET
Copy a Vector
CALL ICOPY (N, IX, INCX, IY, INCY)
*CALL SCOPY (N, SX, INCX, SY, INCY)
*CALL DCOPY (N, DX, INCX, DY, INCY)
*CALL CCOPY (N, CX, INCX, CY, INCY)
CALL ZCOPY (N, ZX, INCX, ZY, INCY)
Scale a Vector
*CALL SSCAL (N, SA, SX, INCX)
*CALL DSCAL (N, DA, DX, INCX)
*CALL CSCAL (N, CA, CX, INCX)
CALL ZSCAL (N, ZA, ZX, INCX)
*CALL CSSCAL (N, SA, CX, INCX)
CALL ZDSCAL (N, DA, ZX, INCX)
These subprograms set $x_i \leftarrow a x_i$ for $i = 1, 2, \ldots, N$. If $N \le 0$, then the subprograms return
immediately. CAUTION: For CSSCAL and ZDSCAL, the scalar quantity a is real and the vector x is
complex.
ISUB
SSUB
DSUB
CSUB
ZSUB
Dot Product
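*SW = SDOT (N, SX, INCX, SY, INCY)
*DW = DDOT (N, DX, INCX, DY, INCY)
*CW = CDOTU (N, CX, INCX, CY, INCY)
*CW = CDOTC (N, CX, INCX, CY, INCY)
ZW = ZDOTU (N, ZX, INCX, ZY, INCY)
ZW = ZDOTC (N, ZX, INCX, ZY, INCY)
The function subprograms SDOT, DDOT, CDOTU, and ZDOTU compute
$$\sum_{i=1}^{N} x_i y_i$$
The function subprograms CDOTC and ZDOTC compute
$$\sum_{i=1}^{N} \bar{x}_i y_i$$
The suffix C indicates that the complex conjugates of $x_i$ are used. The suffix U indicates that the
unconjugated values of $x_i$ are used. If $N \le 0$, then the subprograms return zero.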
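The difference between the U and C forms can be illustrated with Fortran intrinsics alone. This is a sketch that does not call the library: it assumes only that the intrinsic DOT_PRODUCT conjugates its first argument for complex data (as the Fortran standard specifies), which makes it the analog of the C form, while SUM(x*y) gives the U form.

```fortran
program dot_demo
  implicit none
  complex :: x(2), y(2)

  x = (/ (1.0, 2.0), (3.0, -1.0) /)
  y = (/ (2.0, 0.0), (0.0, 1.0) /)

  ! Unconjugated sum (the CDOTU/ZDOTU form): sum of x_i * y_i = 3 + 7i
  print *, 'unconjugated:', sum(x*y)

  ! Conjugated sum (the CDOTC/ZDOTC form): sum of conjg(x_i) * y_i = 1 - i
  print *, 'conjugated:  ', dot_product(x, y)
  print *, 'check:       ', sum(conjg(x)*y)
end program dot_demo
```

For real data the two forms coincide, which is why only the complex routines carry the U and C suffixes.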
DSDOT, CZDOTC, CZDOTU, ZQDOTC, ZQDOTU
The function subprogram DSDOT computes
$$\sum_{i=1}^{N} x_i y_i$$
using double precision accumulation. The function subprograms CZDOTU and ZQDOTU compute
$$\sum_{i=1}^{N} x_i y_i$$
using double and quadruple complex accumulation, respectively. The function subprograms
CZDOTC and ZQDOTC compute
$$\sum_{i=1}^{N} \bar{x}_i y_i$$
using double and quadruple complex accumulation, respectively. If $N \le 0$, then the subprograms
return zero.
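The effect of accumulating in a wider type can be seen without the library. The sketch below is an illustration under stated assumptions, not the library's implementation: it forms the same single precision products twice, once with a single precision accumulator and once with a double precision accumulator in the manner described for DSDOT. The vector length and data values are arbitrary choices made here.

```fortran
program accum_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 100000
  real, allocatable :: x(:), y(:)
  real :: s_single
  real(dp) :: s_double
  integer :: i

  allocate (x(n), y(n))
  x = 0.1
  y = 0.1

  s_single = 0.0
  s_double = 0.0_dp
  do i = 1, n
    ! Working-precision accumulator: rounding error grows with n.
    s_single = s_single + x(i)*y(i)
    ! Wider accumulator: each product is formed and summed in double precision.
    s_double = s_double + real(x(i), dp)*real(y(i), dp)
  end do

  print *, 'single accumulation:', s_single
  print *, 'double accumulation:', s_double
end program accum_demo
```

The two printed values typically differ in the trailing digits; the exact figures depend on the compiler and optimization level, so none are quoted here.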
The function subprograms SDSDOT, DQDDOT, CZUDOT, and ZQUDOT compute
$$a + \sum_{i=1}^{N} x_i y_i$$
using higher precision accumulation, where SDSDOT uses double precision accumulation, DQDDOT
uses quadruple precision accumulation, CZUDOT uses double complex accumulation, and ZQUDOT
uses quadruple complex accumulation. The function subprograms CZCDOT and ZQCDOT compute
$$a + \sum_{i=1}^{N} \bar{x}_i y_i$$
using double complex and quadruple complex accumulation, respectively. If $N \le 0$, then the
subprograms return zero.
SDDOTI, SDDOTA, CZDOTI, CZDOTA, DQDOTI, DQDOTA, ZQDOTI, ZQDOTA
The variable DACC, a double precision array of length two, is used as a quadruple precision
accumulator. DZACC, a double precision array of length four, is its complex analog. The function
subprograms with a name ending in I initialize DACC to zero. All of the function subprograms
then compute
$$\text{DACC} + b + \sum_{i=1}^{N} x_i y_i$$
and store the result in DACC. The result, converted to the precision of the function, is also returned
as the function value. If $N \le 0$, then the function subprograms return zero.
Hadamard Product
CALL SHPROD (N, SX, INCX, SY, INCY, SZ, INCZ)
CALL DHPROD (N, DX, INCX, DY, INCY, DZ, INCZ)
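These subprograms set $z_i \leftarrow x_i y_i$ for $i = 1, 2, \ldots, N$.
The function subprograms SXYZ and DXYZ compute
$$\sum_{i=1}^{N} x_i y_i z_i$$
The function subprograms ISUM, SSUM, and DSUM compute
$$\sum_{i=1}^{N} x_i$$
The function subprograms SASUM and DASUM compute
$$\sum_{i=1}^{N} |x_i|$$
The function subprograms SCASUM and DZASUM compute
$$\sum_{i=1}^{N} \left( |\Re x_i| + |\Im x_i| \right)$$
If $N \le 0$, then the subprograms return zero. CAUTION: For SCASUM and DZASUM, the function
subprogram returns a real value.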
Euclidean or $\ell_2$ Norm of a Vector
*SW = SNRM2 (N, SX, INCX)
*DW = DNRM2 (N, DX, INCX)
*SW = SCNRM2 (N, CX, INCX)
DW = DZNRM2 (N, ZX, INCX)
These function subprograms compute
$$\left[ \sum_{i=1}^{N} |x_i|^2 \right]^{1/2}$$
If $N \le 0$, then the subprograms return zero. CAUTION: For SCNRM2 and DZNRM2, the function
subprogram returns a real value.
These function subprograms compute the smallest index i such that $x_i = \min_{1 \le j \le N} x_j$. If $N \le 0$, then
the subprograms return zero.
These function subprograms compute the smallest index i such that $x_i = \max_{1 \le j \le N} x_j$. If $N \le 0$, then
the subprograms return zero.
ISAMIN, IDAMIN, ICAMIN, IZAMIN
The function subprograms ISAMIN and IDAMIN compute the smallest index i such that
$|x_i| = \min_{1 \le j \le N} |x_j|$. The function subprograms ICAMIN and IZAMIN compute the smallest index i
such that
$$|\Re x_i| + |\Im x_i| = \min_{1 \le j \le N} \left( |\Re x_j| + |\Im x_j| \right)$$
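Given the values a and b, the subprograms SROTG and DROTG compute
$$c = \begin{cases} a/r & \text{if } r \ne 0 \\ 1 & \text{if } r = 0 \end{cases}$$
and
$$s = \begin{cases} b/r & \text{if } r \ne 0 \\ 0 & \text{if } r = 0 \end{cases}$$
where $r = \sigma \sqrt{a^2 + b^2}$ and $\sigma$ is the sign of a if $|a| > |b|$ and the sign of b otherwise. The values
c, s and r then satisfy
$$\begin{bmatrix} c & s \\ -s & c \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} r \\ 0 \end{bmatrix}$$
The introduction of $\sigma$ is not essential to the computation of the Givens rotation matrix; but its use
permits later stable reconstruction of c and s from just one stored number, an idea due to Stewart
(1976). For this purpose, the subprogram also computes
$$z = \begin{cases} s & \text{if } |s| < c \text{ or } c = 0 \\ 1/c & \text{if } 0 < |c| \le s \end{cases}$$
The subprograms SROT, DROT, CSROT, and ZDROT apply the rotation:
$$\begin{bmatrix} x_i \\ y_i \end{bmatrix} \leftarrow \begin{bmatrix} c & s \\ -s & c \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} \quad \text{for } i = 1, \ldots, N$$
If $N \le 0$, then the subprograms return immediately. CAUTION: For CSROT and ZDROT, the scalar
quantities c and s are real, and x and y are complex.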
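The construction and application of a plane rotation can be sketched in standard Fortran as follows. This is an illustration written from the textbook definition ($r = \sigma\sqrt{a^2+b^2}$, $c = a/r$, $s = b/r$), not the library source; in particular it forms $a^2 + b^2$ directly and omits the scaling that a production routine would use to avoid overflow. The input values are arbitrary.

```fortran
program givens_demo
  implicit none
  real :: a, b, c, s, r, sigma
  real :: x(2), y(2), t(2)

  a = 3.0
  b = 4.0

  ! sigma carries the sign of whichever of a, b is larger in magnitude.
  if (abs(a) > abs(b)) then
    sigma = sign(1.0, a)
  else
    sigma = sign(1.0, b)
  end if
  r = sigma*sqrt(a*a + b*b)

  if (r /= 0.0) then
    c = a/r
    s = b/r
  else
    c = 1.0
    s = 0.0
  end if

  ! Apply the rotation to the pairs (x_i, y_i); the first pair is (a, b).
  x = (/ a, 1.0 /)
  y = (/ b, 2.0 /)
  t = c*x + s*y
  y = -s*x + c*y
  x = t

  print *, 'c, s, r =', c, s, r
  print *, 'x =', x      ! x(1) equals r
  print *, 'y =', y      ! y(1) is zero up to rounding
end program givens_demo
```

For a = 3 and b = 4 this gives r = 5, c = 0.6 and s = 0.8, so the first pair is rotated to (5, 0).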
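The input quantities $d_1$, $d_2$, $x_1$ and $y_1$ define a 2-vector $[w_1, z_1]^T$ by
$$\begin{bmatrix} w_i \\ z_i \end{bmatrix} = \begin{bmatrix} \sqrt{d_1} & 0 \\ 0 & \sqrt{d_2} \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix}$$
The subprograms determine the modified Givens rotation matrix H that transforms $y_1$, and thus, $z_1$
to zero. They also replace $d_1$, $d_2$ and $x_1$ with $\tilde{d}_1$, $\tilde{d}_2$ and $\tilde{x}_1$, where
$$\begin{bmatrix} \tilde{x}_1 \\ 0 \end{bmatrix} = H \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}$$
A representation of this matrix is stored in the array SPARAM or DPARAM. The form of the matrix H
is flagged by PARAM(1).
PARAM(1) = 1. In this case, $|d_1 x_1^2| \le |d_2 y_1^2|$ and
$$H = \begin{bmatrix} \text{PARAM(2)} & 1 \\ -1 & \text{PARAM(5)} \end{bmatrix}$$
PARAM(1) = 0. In this case, $|d_1 x_1^2| > |d_2 y_1^2|$ and
$$H = \begin{bmatrix} 1 & \text{PARAM(4)} \\ \text{PARAM(3)} & 1 \end{bmatrix}$$
PARAM(1) = -1. In this case,
$$H = \begin{bmatrix} \text{PARAM(2)} & \text{PARAM(4)} \\ \text{PARAM(3)} & \text{PARAM(5)} \end{bmatrix}$$
PARAM(1) = -2. In this case, H = I where I is the identity matrix. The elements PARAM(2),
PARAM(3), PARAM(4) and PARAM(5) are not changed.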
The values of d1, d2 and x1 are changed to represent the effect of the transformation. The quantity
y1, which would be zeroed by the transformation, is left unchanged.
The input value of d1 should be nonnegative, but d2 can be negative for the purpose of removing
data from a least-squares problem.
See Lawson et al. (1979) for further details.
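The subprograms SROTM, DROTM, CSROTM, and ZDROTM apply the modified Givens transformation.
If PARAM(1) = 1.0, they compute
$$\begin{bmatrix} x_i \\ y_i \end{bmatrix} \leftarrow \begin{bmatrix} \text{PARAM(2)} & 1 \\ -1 & \text{PARAM(5)} \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} \quad \text{for } i = 1, \ldots, N$$
If PARAM(1) = 0.0, they compute
$$\begin{bmatrix} x_i \\ y_i \end{bmatrix} \leftarrow \begin{bmatrix} 1 & \text{PARAM(4)} \\ \text{PARAM(3)} & 1 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} \quad \text{for } i = 1, \ldots, N$$
If PARAM(1) = -1.0, they compute
$$\begin{bmatrix} x_i \\ y_i \end{bmatrix} \leftarrow \begin{bmatrix} \text{PARAM(2)} & \text{PARAM(4)} \\ \text{PARAM(3)} & \text{PARAM(5)} \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} \quad \text{for } i = 1, \ldots, N$$
If $N \le 0$ or if PARAM(1) = -2.0, then the subprograms return immediately. CAUTION: For CSROTM
and ZDROTM, the scalar quantities PARAM(*) are real and x and y are complex.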
Prefix   Data type
S        Real
C        Complex
D        Double
Z        Double Complex

Root     Kind of matrix
GE       General
GB       General Band
SY       Symmetric
SB       Symmetric Band
HE       Hermitian
HB       Hermitian Band
TR       Triangular
TB       Triangular Band

Suffix   Type of operation
MV       Matrix-Vector Product
SV       Solve for Vector
R        Rank-One Update
RU       Rank-One Update, Unconjugated
RC       Rank-One Update, Conjugated
R2       Rank-Two Update
MM       Matrix-Multiply
SM       Solve for Matrix
RK       Rank-K Update
R2K      Rank 2K Update
IMSL does not support the Packed Symmetric, Packed-Hermitian, or Packed-Triangular data
structures, with respective root names SP, HP or TP, nor any extended precision versions of the
Level 2 BLAS.
The specifications of the operations are provided by subprogram arguments of CHARACTER*1 data
type. Both lower and upper case of the letter have the same meaning:
Argument                 Value   Meaning
TRANS, TRANSA, TRANSB    'N'     No Transpose
                         'T'     Transpose
                         'C'     Conjugate and Transpose
UPLO                     'L'     Lower Triangular
                         'U'     Upper Triangular
DIAGNL                   'N'     Non-unit Triangular
                         'U'     Unit Triangular
SIDE                     'L'     Matrix A multiplies from the left
                         'R'     Matrix A multiplies from the right
Note: See the Triangular Mode section in the Reference Material for definitions of these terms.
      REAL               SALPHA, SBETA, SX(*), SY(*), SA(LDA,*)
      DOUBLE PRECISION   DALPHA, DBETA, DX(*), DY(*), DA(LDA,*)
      COMPLEX            CALPHA, CBETA, CX(*), CY(*), CA(LDA,*)
      DOUBLE COMPLEX     ZALPHA, ZBETA, ZX(*), ZY(*), ZA(LDA,*)
There is a lower bound on the leading dimension LDA. It must be $\ge$ the number of rows in the
matrix that is contained in this array. Vector arguments have an increment parameter that specifies
the storage space or stride between elements. The correspondence between the vectors x, y and the
arguments SX, SY and INCX, INCY is

$$x_i = \begin{cases} \text{SX}((I-1)*\text{INCX} + 1) & \text{if INCX} > 0 \\ \text{SX}((I-N)*\text{INCX} + 1) & \text{if INCX} < 0 \end{cases}$$

$$y_i = \begin{cases} \text{SY}((I-1)*\text{INCY} + 1) & \text{if INCY} > 0 \\ \text{SY}((I-N)*\text{INCY} + 1) & \text{if INCY} < 0 \end{cases}$$

In the Level 2 BLAS, only nonzero values of INCX, INCY are allowed for operations that have
vector arguments. The Level 3 BLAS do not refer to INCX, INCY.
Each of the integers K, M, N must be $\ge 0$. It is an error if any of them are < 0. If any of them are = 0,
the subprograms return immediately. There are lower bounds on the leading dimensions LDA, LDB,
LDC. Each must be $\ge$ the number of rows in the matrix that is contained in this array.
Table 9.2: Level 2 and 3 Basic Linear Algebra Subprograms
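Operation                                                Real    Double  Complex  Double Complex  Pg.
Matrix-Vector Multiply, General                          SGEMV   DGEMV   CGEMV    ZGEMV           1436
Matrix-Vector Multiply, Banded                           SGBMV   DGBMV   CGBMV    ZGBMV           1437
Matrix-Vector Multiply, Hermitian                                        CHEMV    ZHEMV           1437
Matrix-Vector Multiply, Hermitian and Banded                             CHBMV    ZHBMV           1437
Matrix-Vector Multiply, Symmetric and Real               SSYMV   DSYMV                            1437
Matrix-Vector Multiply, Symmetric and Banded             SSBMV   DSBMV                            1438
Matrix-Vector Multiply, Triangular                       STRMV   DTRMV   CTRMV    ZTRMV           1438
Matrix-Vector Multiply, Triangular and Banded            STBMV   DTBMV   CTBMV    ZTBMV           1438
Matrix-Vector Solve, Triangular                          STRSV   DTRSV   CTRSV    ZTRSV           1438
Matrix-Vector Solve, Triangular and Banded               STBSV   DTBSV   CTBSV    ZTBSV           1439
Rank-One Matrix Update, General and Real                 SGER    DGER                             1439
Rank-One Matrix Update, General, Unconjugated                            CGERU    ZGERU           1439
Rank-One Matrix Update, General, Conjugated                              CGERC    ZGERC           1439
Rank-One Matrix Update, Hermitian                                        CHER     ZHER            1439
Rank-Two Matrix Update, Hermitian                                        CHER2    ZHER2           1440
Rank-One Matrix Update, Symmetric and Real               SSYR    DSYR                             1440
Rank-Two Matrix Update, Symmetric and Real               SSYR2   DSYR2                            1440
Matrix-Matrix Multiply, General                          SGEMM   DGEMM   CGEMM    ZGEMM           1440
Matrix-Matrix Multiply, Symmetric                        SSYMM   DSYMM   CSYMM    ZSYMM           1441
Matrix-Matrix Multiply, Hermitian                                        CHEMM    ZHEMM           1441
Rank-K Update, Symmetric                                 SSYRK   DSYRK   CSYRK    ZSYRK           1441
Rank-K Update, Hermitian                                                 CHERK    ZHERK           1441
Rank-2K Update, Symmetric                                SSYR2K  DSYR2K  CSYR2K   ZSYR2K          1442
Rank-2K Update, Hermitian                                                CHER2K   ZHER2K          1442
Matrix-Matrix Multiply, Triangular                       STRMM   DTRMM   CTRMM    ZTRMM           1442
Matrix-Matrix Solve, Triangular                          STRSM   DTRSM   CTRSM    ZTRSM           1443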
SGEMV
DGEMV
CGEMV
ZGEMV
For all data types, A is an M × N matrix. These subprograms set y to one of the expressions:
$y \leftarrow \alpha A x + \beta y$, $y \leftarrow \alpha A^T x + \beta y$, or for complex data,
$$y \leftarrow \alpha \bar{A}^T x + \beta y$$
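The two real forms of this update can be written with the intrinsics MATMUL and TRANSPOSE. The sketch below is not a call to the library; it only illustrates the arithmetic and the vector lengths implied by each form (x has length N and y length M without transposition, and the reverse with transposition). The data values are arbitrary.

```fortran
program gemv_demo
  implicit none
  integer, parameter :: m = 2, n = 3
  real :: a(m, n), x(n), y(m), xt(m), yt(n)
  real :: alpha, beta

  ! Column-major fill: A = ( 1 2 3 )
  !                        ( 4 5 6 )
  a = reshape((/ 1.0, 4.0, 2.0, 5.0, 3.0, 6.0 /), (/ m, n /))
  alpha = 2.0
  beta = 1.0

  ! No transpose: y <- alpha*A*x + beta*y, giving (13, 31)
  x = (/ 1.0, 1.0, 1.0 /)
  y = (/ 1.0, 1.0 /)
  y = alpha*matmul(a, x) + beta*y
  print *, y

  ! Transpose: y <- alpha*A^T*x + beta*y, giving (10, 14, 18)
  xt = (/ 1.0, 1.0 /)
  yt = 0.0
  yt = alpha*matmul(transpose(a), xt) + beta*yt
  print *, yt
end program gemv_demo
```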
For all data types, A is an M × N matrix with NLCA lower codiagonals and NUCA upper
codiagonals. The matrix is stored in band storage mode. These subprograms set y to one of the
expressions: $y \leftarrow \alpha A x + \beta y$, $y \leftarrow \alpha A^T x + \beta y$, or for complex data,
$$y \leftarrow \alpha \bar{A}^T x + \beta y$$
For all data types, A is an N × N matrix with NCODA codiagonals. The matrix is stored in band
Hermitian storage mode. These subprograms set $y \leftarrow \alpha A x + \beta y$. The matrix A is either referenced
using its upper or lower triangular part. The character flag UPLO determines the part used.
For all data types, A is an N × N matrix with NCODA codiagonals. The matrix is stored in band
symmetric storage mode. These subprograms set $y \leftarrow \alpha A x + \beta y$. The matrix A is either referenced
using its upper or lower triangular part. The character flag UPLO determines the part used.
STRMV
DTRMV
CTRMV
ZTRMV
For all data types, A is an N × N triangular matrix. These subprograms set x to one of the
expressions: $x \leftarrow A x$, $x \leftarrow A^T x$, or for complex data,
$$x \leftarrow \bar{A}^T x$$
The matrix A is either referenced using its upper or lower triangular part and is unit or nonunit
triangular. The character flags UPLO, TRANS, and DIAGNL determine the part of the matrix used
and the operation performed.
STBMV
DTBMV
CTBMV
ZTBMV
For all data types, A is an N × N matrix with NCODA codiagonals. The matrix is stored in band
triangular storage mode. These subprograms set x to one of the expressions: $x \leftarrow A x$, $x \leftarrow A^T x$, or
for complex data,
$$x \leftarrow \bar{A}^T x$$
The matrix A is either referenced using its upper or lower triangular part and is unit or nonunit
triangular. The character flags UPLO, TRANS, and DIAGNL determine the part of the matrix used
and the operation performed.
STRSV
DTRSV
CTRSV
ZTRSV
For all data types, A is an N × N triangular matrix. These subprograms solve x for one of the
expressions: $x \leftarrow A^{-1} x$, $x \leftarrow (A^{-1})^T x$, or for complex data,
$$x \leftarrow (\bar{A}^T)^{-1} x$$
The matrix A is either referenced using its upper or lower triangular part and is unit or nonunit
triangular. The character flags UPLO, TRANS, and DIAGNL determine the part of the matrix used
and the operation performed.
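As a concrete instance of the solve $x \leftarrow A^{-1} x$, the sketch below performs forward substitution for a lower triangular, non-unit matrix in standard Fortran. It is an illustration of the operation only and does not reproduce the library routine; the matrix and right-hand side are arbitrary values chosen so that the solution is easy to verify by hand.

```fortran
program trsv_demo
  implicit none
  integer, parameter :: n = 3
  real :: a(n, n), x(n)
  integer :: i

  ! Lower triangular A (column-major fill); entries above the diagonal
  ! are never referenced.
  !   A = ( 2 0 0 )
  !       ( 1 1 0 )
  !       ( 3 2 4 )
  a = reshape((/ 2.0, 1.0, 3.0, 0.0, 1.0, 2.0, 0.0, 0.0, 4.0 /), (/ n, n /))

  ! On entry x holds the right-hand side; on exit it holds the solution.
  x = (/ 2.0, 3.0, 15.0 /)

  ! Forward substitution: lower triangle, no transpose, non-unit diagonal.
  do i = 1, n
    x(i) = (x(i) - dot_product(a(i, 1:i-1), x(1:i-1)))/a(i, i)
  end do

  print *, x      ! solution is (1, 2, 2)
end program trsv_demo
```

A unit triangular solve would skip the division by a(i, i), and an upper triangular solve would run the loop from n down to 1.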
STBSV
DTBSV
CTBSV
ZTBSV
For all data types, A is an N × N triangular matrix with NCODA codiagonals. The matrix is stored in
band triangular storage mode. These subprograms solve x for one of the expressions: $x \leftarrow A^{-1} x$,
$x \leftarrow (A^{-1})^T x$, or for complex data,
$$x \leftarrow (\bar{A}^T)^{-1} x$$
The matrix A is either referenced using its upper or lower triangular part and is unit or nonunit
triangular. The character flags UPLO, TRANS, and DIAGNL determine the part of the matrix used
and the operation performed.
For all data types, A is an N × N matrix. These subprograms set $A \leftarrow A + \alpha x \bar{x}^T$
where A is Hermitian. The matrix A is either referenced by its upper or lower triangular part. The
character flag UPLO determines the part used. CAUTION: Notice the scalar parameter $\alpha$ is real,
and the data in the matrix and vector are complex.
For all data types, A is an N × N matrix. These subprograms set
$A \leftarrow A + \alpha x \bar{y}^T + \bar{\alpha} y \bar{x}^T$
where A is an Hermitian matrix. The matrix A is either referenced by its upper or lower triangular
part. The character flag UPLO determines the part used.
For all data types, A is an N × N matrix. These subprograms set $A \leftarrow A + \alpha x x^T$ where A is a
symmetric matrix. The matrix A is either referenced by its upper or lower triangular part. The
character flag UPLO determines the part used.
For all data types, A is an N × N matrix. These subprograms set $A \leftarrow A + \alpha x y^T + \alpha y x^T$ where A is a
symmetric matrix. The matrix A is referenced by its upper or lower triangular part. The character
flag UPLO determines the part used.
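For all data types, these subprograms set $C_{M \times N}$ to one of the expressions:
$C \leftarrow \alpha AB + \beta C$, $C \leftarrow \alpha A^T B + \beta C$, $C \leftarrow \alpha A B^T + \beta C$, $C \leftarrow \alpha A^T B^T + \beta C$,
or for complex data, $C \leftarrow \alpha A \bar{B}^T + \beta C$, $C \leftarrow \alpha \bar{A}^T B + \beta C$, $C \leftarrow \alpha A^T \bar{B}^T + \beta C$,
$C \leftarrow \alpha \bar{A}^T B^T + \beta C$, $C \leftarrow \alpha \bar{A}^T \bar{B}^T + \beta C$
The character flags TRANSA and TRANSB determine the operation to be performed. Each matrix
product has dimensions that follow from the fact that C has dimension M × N.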
CALL SSYMM (SIDE, UPLO, M, N, SALPHA, SA, LDA, SB, LDB, SBETA, SC, LDC)
CALL DSYMM (SIDE, UPLO, M, N, DALPHA, DA, LDA, DB, LDB, DBETA, DC, LDC)
CALL CSYMM (SIDE, UPLO, M, N, CALPHA, CA, LDA, CB, LDB, CBETA, CC, LDC)
CALL ZSYMM (SIDE, UPLO, M, N, ZALPHA, ZA, LDA, ZB, LDB, ZBETA, ZC, LDC)
For all data types, these subprograms set $C_{M \times N}$ to one of the expressions: $C \leftarrow \alpha AB + \beta C$ or
$C \leftarrow \alpha BA + \beta C$, where A is a symmetric matrix. The matrix A is referenced either by its upper or
lower triangular part. The character flags SIDE and UPLO determine the part of the matrix used
and the operation performed.
For all data types, these subprograms set $C_{M \times N}$ to one of the expressions: $C \leftarrow \alpha AB + \beta C$ or
$C \leftarrow \alpha BA + \beta C$, where A is an Hermitian matrix. The matrix A is referenced either by its upper or
lower triangular part. The character flags SIDE and UPLO determine the part of the matrix used
and the operation performed.
SSYRK
DSYRK
CSYRK
ZSYRK
For all data types, these subprograms set $C_{N \times N}$ to one of the expressions: $C \leftarrow \alpha A A^T + \beta C$ or
$C \leftarrow \alpha A^T A + \beta C$. The matrix C is referenced either by its upper or lower triangular part. The
character flags UPLO and TRANS determine the part of the matrix used and the operation
performed. In subprograms CSYRK and ZSYRK, only values 'N' or 'T' are allowed for TRANS;
'C' is not acceptable.
For all data types, these subprograms set $C_{N \times N}$ to one of the expressions:
$C \leftarrow \alpha A \bar{A}^T + \beta C$ or $C \leftarrow \alpha \bar{A}^T A + \beta C$
The matrix C is referenced either by its upper or lower triangular part. The character flags UPLO
and TRANS determine the part of the matrix used and the operation performed. CAUTION: Notice
the scalar parameters $\alpha$ and $\beta$ are real, and the data in the matrices are complex. Only values
'N' or 'C' are allowed for TRANS; 'T' is not acceptable.
For all data types, these subprograms set $C_{N \times N}$ to one of the expressions:
$C \leftarrow \alpha A B^T + \alpha B A^T + \beta C$ or $C \leftarrow \alpha A^T B + \alpha B^T A + \beta C$
The matrix C is referenced either by its upper or lower triangular part. The character flags UPLO
and TRANS determine the part of the matrix used and the operation performed. In subprograms
CSYR2K and ZSYR2K, only values 'N' or 'T' are allowed for TRANS; 'C' is not acceptable.
For all data types, these subprograms set $C_{N \times N}$ to one of the expressions:
$C \leftarrow \alpha A \bar{B}^T + \bar{\alpha} B \bar{A}^T + \beta C$ or $C \leftarrow \alpha \bar{A}^T B + \bar{\alpha} \bar{B}^T A + \beta C$
The matrix C is referenced either by its upper or lower triangular part. The character flags UPLO
and TRANS determine the part of the matrix used and the operation performed. CAUTION: Notice
the scalar parameter $\beta$ is real, and the data in the matrices are complex. In subprograms CHER2K
and ZHER2K, only values 'N' or 'C' are allowed for TRANS; 'T' is not acceptable.
STRMM
DTRMM
CTRMM
ZTRMM
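For all data types, these subprograms set $B_{M \times N}$ to one of the expressions:
$B \leftarrow \alpha AB$, $B \leftarrow \alpha A^T B$, $B \leftarrow \alpha BA$, $B \leftarrow \alpha B A^T$,
or for complex data, $B \leftarrow \alpha \bar{A}^T B$ or $B \leftarrow \alpha B \bar{A}^T$,
where A is a triangular matrix. The matrix A is either referenced using its upper or lower triangular
part and is unit or nonunit triangular. The character flags SIDE, UPLO, TRANSA, and DIAGNL
determine the part of the matrix used and the operation performed.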
STRSM
DTRSM
CTRSM
ZTRSM
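For all data types, these subprograms set $B_{M \times N}$ to one of the expressions:
$B \leftarrow \alpha A^{-1} B$, $B \leftarrow \alpha B A^{-1}$, $B \leftarrow \alpha (A^{-1})^T B$, $B \leftarrow \alpha B (A^{-1})^T$,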
where A is a triangular matrix. The matrix A is either referenced using its upper or lower triangular
part and is unit or nonunit triangular. The character flags SIDE, UPLO, TRANSA, and DIAGNL
determine the part of the matrix used and the operation performed.
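Matrix copy and conversion routines, by source type (rows) and destination type (columns):

From \ To         Real General  Complex General  Real Band  Complex Band
Real General      CRGRG         CRGCG            CRGRB
Complex General                 CCGCG                       CCGCB
Real Band         CRBRG                          CRBRB      CRBCB
Complex Band                    CCBCG                       CCBCB
Symmetric Full    CSFRG
Hermitian Full                  CHFCG
Symmetric Band                                   CSBRB
Hermitian Band                                              CHBCB

Matrix multiplication routines for the product AB, by type of A (rows) and type of B (columns):

A \ B             Vector   Real Rect.  Complex Rect.
Real Rect.        MURRV    MRRRR
Complex Rect.     MUCRV                MCRCR
Real Band         MURBV
Complex Band      MUCBV

Matrix norm routines for ||A||:

                  Real Rectangular  Real Band  Complex Band
$\infty$-norm     NRIRR
1-norm            NR1RR             NR1RB      NR1CB
Frobenius         NR2RR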
CRGRG
Copies a real general matrix.
Required Arguments
A Matrix of order N. (Input)
B Matrix of order N containing a copy of A. (Output)
Optional Arguments
N Order of the matrices. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CRGRG copies the real N × N general matrix A into the real N × N general matrix B.
Example
A real 3 × 3 general matrix is copied into another real 3 × 3 general matrix.
      USE CRGRG_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N
      PARAMETER  (LDA=3, LDB=3, N=3)
!
      REAL       A(LDA,N), B(LDB,N)
!                                 Set values for A
!                                 A = (  0.0   1.0   1.0 )
!                                     ( -1.0   0.0   1.0 )
!                                     ( -1.0  -1.0   0.0 )
!
      DATA A/0.0, -1.0, -1.0, 1.0, 0.0, -1.0, 1.0, 1.0, 0.0/
!                                 Copy real matrix A to real matrix B
      CALL CRGRG (A, B)
!                                 Print results
      CALL WRRRN ('B', B)
      END
Output
             B
         1        2        3
1    0.000    1.000    1.000
2   -1.000    0.000    1.000
3   -1.000   -1.000    0.000
CCGCG
Copies a complex general matrix.
Required Arguments
A Complex matrix of order N. (Input)
B Complex matrix of order N containing a copy of A. (Output)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CCGCG copies the complex N × N general matrix A into the complex N × N general
matrix B.
Example
A complex 3 × 3 general matrix is copied into another complex 3 × 3 general matrix.
      USE CCGCG_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N
      PARAMETER  (LDA=3, LDB=3, N=3)
!
      COMPLEX    A(LDA,N), B(LDB,N)
!                                 Set values for A
!                                 A = (  0.0+0.0i   1.0+1.0i   1.0+1.0i )
!                                     ( -1.0-1.0i   0.0+0.0i   1.0+1.0i )
!                                     ( -1.0-1.0i  -1.0-1.0i   0.0+0.0i )
!
      DATA A/(0.0,0.0), (-1.0,-1.0), (-1.0,-1.0), (1.0,1.0), &
             (0.0,0.0), (-1.0,-1.0), (1.0,1.0), (1.0,1.0), (0.0,0.0)/
!                                 Copy matrix A to matrix B
      CALL CCGCG (A, B)
!                                 Print results
      CALL WRCRN ('B', B)
      END
Output
                           B
                  1                 2                 3
1   ( 0.000, 0.000)   ( 1.000, 1.000)   ( 1.000, 1.000)
2   (-1.000,-1.000)   ( 0.000, 0.000)   ( 1.000, 1.000)
3   (-1.000,-1.000)   (-1.000,-1.000)   ( 0.000, 0.000)
CRBRB
Copies a real band matrix stored in band storage mode.
Required Arguments
A Real band matrix of order N. (Input)
NLCA Number of lower codiagonals in A. (Input)
NUCA Number of upper codiagonals in A. (Input)
B Real band matrix of order N containing a copy of A. (Output)
NLCB Number of lower codiagonals in B. (Input)
NLCB must be at least as large as NLCA.
NUCB Number of upper codiagonals in B. (Input)
NUCB must be at least as large as NUCA.
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CRBRB copies the real band matrix A in band storage mode into the real band matrix B
in band storage mode.
Example
A real band matrix of order 3, in band storage mode with one upper codiagonal, and one lower
codiagonal is copied into another real band matrix also in band storage mode.
      USE CRBRB_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NLCA, NLCB, NUCA, NUCB
      PARAMETER  (LDA=3, LDB=3, N=3, NLCA=1, NLCB=1, NUCA=1, NUCB=1)
!
      REAL       A(LDA,N), B(LDB,N)
!                                 Set values for A (in band mode)
!                                 A = ( 0.0   1.0   1.0 )
!                                     ( 1.0   1.0   1.0 )
!                                     ( 1.0   1.0   0.0 )
!
      DATA A/0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0/
!                                 Copy A to B
      CALL CRBRB (A, NLCA, NUCA, B, NLCB, NUCB)
!                                 Print results
      CALL WRRRN ('B', B)
      END
Output
             B
         1        2        3
1    0.000    1.000    1.000
2    1.000    1.000    1.000
3    1.000    1.000    0.000
CCBCB
Copies a complex band matrix stored in complex band storage mode.
Required Arguments
A Complex band matrix of order N. (Input)
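NLCA Number of lower codiagonals in A. (Input)
NUCA Number of upper codiagonals in A. (Input)
B Complex band matrix of order N containing a copy of A. (Output)
NLCB Number of lower codiagonals in B. (Input)
NLCB must be at least as large as NLCA.
NUCB Number of upper codiagonals in B. (Input)
NUCB must be at least as large as NUCA.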
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CCBCB copies the complex band matrix A in band storage mode into the complex band
matrix B in band storage mode.
Example
A complex band matrix of order 3 in band storage mode with one upper codiagonal and one lower
codiagonal is copied into another complex band matrix in band storage mode.
      USE CCBCB_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NLCA, NLCB, NUCA, NUCB
      PARAMETER  (LDA=3, LDB=3, N=3, NLCA=1, NLCB=1, NUCA=1, NUCB=1)
!
      COMPLEX    A(LDA,N), B(LDB,N)
!                                 Set values for A (in band mode)
!                                 A = ( 0.0+0.0i   1.0+1.0i   1.0+1.0i )
!                                     ( 1.0+1.0i   1.0+1.0i   1.0+1.0i )
!                                     ( 1.0+1.0i   1.0+1.0i   0.0+0.0i )
!
      DATA A/(0.0,0.0), (1.0,1.0), (1.0,1.0), (1.0,1.0), (1.0,1.0), &
             (1.0,1.0), (1.0,1.0), (1.0,1.0), (0.0,0.0)/
!                                 Copy A to B
      CALL CCBCB (A, NLCA, NUCA, B, NLCB, NUCB)
!                                 Print results
      CALL WRCRN ('B', B)
      END
Output
                           B
                  1                 2                 3
1   ( 0.000, 0.000)   ( 1.000, 1.000)   ( 1.000, 1.000)
2   ( 1.000, 1.000)   ( 1.000, 1.000)   ( 1.000, 1.000)
3   ( 1.000, 1.000)   ( 1.000, 1.000)   ( 0.000, 0.000)
CRGRB
Converts a real general matrix to a matrix in band storage mode.
Required Arguments
A Real N by N matrix. (Input)
NLC Number of lower codiagonals in B. (Input)
NUC Number of upper codiagonals in B. (Input)
B Real (NUC + 1 + NLC) by N array containing the band matrix in band storage mode.
(Output)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CRGRB converts the real general N × N matrix A with $m_u$ = NUC upper codiagonals and
$m_l$ = NLC lower codiagonals into the real band matrix B of order N. The first $m_u$ rows of B then
contain the upper codiagonals of A, the next row contains the main diagonal of A, and the last $m_l$
rows of B contain the lower codiagonals of A.
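The row layout described above amounts to storing element (i, j) of the full matrix in row NUC + 1 + i - j of column j of the band array. The following sketch performs that conversion in standard Fortran without calling the library; the index formula is inferred from the description and from the example output for this routine, and the program name is chosen here for illustration.

```fortran
program full_to_band
  implicit none
  integer, parameter :: n = 4, nlc = 3, nuc = 1
  real :: a(n, n), b(nuc + 1 + nlc, n)
  integer :: i, j

  ! Same matrix as the CRGRB example (column-major order).
  a = reshape((/ 1.0, -2.0, 0.0, -7.0,  2.0, 1.0, -3.0, 0.0, &
                 0.0, 3.0, 1.0, -4.0,  0.0, 0.0, 4.0, 1.0 /), (/ n, n /))

  b = 0.0
  do j = 1, n
    do i = max(1, j - nuc), min(n, j + nlc)
      ! Element A(i,j) lands in band row NUC + 1 + i - j of column j.
      b(nuc + 1 + i - j, j) = a(i, j)
    end do
  end do

  do i = 1, nuc + 1 + nlc
    print '(4f8.3)', b(i, :)
  end do
end program full_to_band
```

Positions of the band array that fall outside the matrix, such as the first entry of the upper codiagonal row, are left at zero.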
Example
A real 4 × 4 matrix with one upper codiagonal and three lower codiagonals is copied to a real band
matrix of order 4 in band storage mode.
      USE CRGRB_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NLC, NUC
      PARAMETER  (LDA=4, LDB=5, N=4, NLC=3, NUC=1)
!
      REAL       A(LDA,N), B(LDB,N)
!                                 Set values for A
!                                 A = (  1.0   2.0   0.0   0.0 )
!                                     ( -2.0   1.0   3.0   0.0 )
!                                     (  0.0  -3.0   1.0   4.0 )
!                                     ( -7.0   0.0  -4.0   1.0 )
!
      DATA A/1.0, -2.0, 0.0, -7.0, 2.0, 1.0, -3.0, 0.0, 0.0, 3.0, 1.0, &
             -4.0, 0.0, 0.0, 4.0, 1.0/
!                                 Convert A to band matrix B
      CALL CRGRB (A, NLC, NUC, B)
!                                 Print results
      CALL WRRRN ('B', B)
      END
Output
                  B
         1        2        3        4
1    0.000    2.000    3.000    4.000
2    1.000    1.000    1.000    1.000
3   -2.000   -3.000   -4.000    0.000
4    0.000    0.000    0.000    0.000
5   -7.000    0.000    0.000    0.000
CRBRG
Converts a real matrix in band storage mode to a real general matrix.
Required Arguments
A Real (NUC + 1 + NLC) by N array containing the band matrix in band storage mode.
(Input)
NLC Number of lower codiagonals in A. (Input)
NUC Number of upper codiagonals in A. (Input)
B Real N by N array containing the matrix. (Output)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CRBRG converts the real band matrix A of order N in band storage mode into the real
N × N general matrix B with $m_u$ = NUC upper codiagonals and $m_l$ = NLC lower codiagonals. The
first $m_u$ rows of A are copied to the upper codiagonals of B, the next row of A is copied to the
diagonal of B, and the last $m_l$ rows of A are copied to the lower codiagonals of B.
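The reverse mapping reads element (i, j) of the full matrix from row NUC + 1 + i - j of column j of the band array. The sketch below does this in standard Fortran without calling the library; the index formula is inferred from the description above and from the example data for this routine, and the program name is chosen here for illustration.

```fortran
program band_to_full
  implicit none
  integer, parameter :: n = 3, nlc = 1, nuc = 1
  real :: a(nuc + 1 + nlc, n), b(n, n)
  integer :: i, j

  ! Band-mode data from the CRBRG example (column-major order).
  a = reshape((/ 0.0, 4.0, 2.0,  1.0, 3.0, 2.0,  1.0, 2.0, 0.0 /), &
              (/ nuc + 1 + nlc, n /))

  b = 0.0
  do j = 1, n
    do i = max(1, j - nuc), min(n, j + nlc)
      ! Band row NUC + 1 + i - j of column j holds element (i,j).
      b(i, j) = a(nuc + 1 + i - j, j)
    end do
  end do

  do i = 1, n
    print '(3f8.3)', b(i, :)
  end do
end program band_to_full
```

Elements of the full matrix outside the band are set to zero, and the unused corner entries of the band array are never read.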
Example
A real band matrix of order 3 in band storage mode with one upper codiagonal and one lower
codiagonal is copied to a 3 × 3 real general matrix.
      USE CRBRG_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NLC, NUC
      PARAMETER  (LDA=3, LDB=3, N=3, NLC=1, NUC=1)
!
      REAL       A(LDA,N), B(LDB,N)
!                                 Set values for A (in band mode)
!                                 A = ( 0.0   1.0   1.0 )
!                                     ( 4.0   3.0   2.0 )
!                                     ( 2.0   2.0   0.0 )
!
      DATA A/0.0, 4.0, 2.0, 1.0, 3.0, 2.0, 1.0, 2.0, 0.0/
!                                 Convert band matrix A to matrix B
      CALL CRBRG (A, NLC, NUC, B)
!                                 Print results
      CALL WRRRN ('B', B)
      END
Output
             B
         1        2        3
1    4.000    1.000    0.000
2    2.000    3.000    1.000
3    0.000    2.000    2.000
CCGCB
Converts a complex general matrix to a matrix in complex band storage mode.
Required Arguments
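A Complex N by N array containing the matrix. (Input)
NLC Number of lower codiagonals in B. (Input)
NUC Number of upper codiagonals in B. (Input)
B Complex (NUC + 1 + NLC) by N array containing the band matrix in band storage mode.
(Output)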
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CCGCB converts the complex general matrix A of order N with mu = NUC upper
codiagonals and ml = NLC lower codiagonals into the complex band matrix B of order N in band
storage mode. The first mu rows of B then contain the upper codiagonals of A, the next row
contains the main diagonal of A, and the last ml rows of B contain the lower codiagonals of A.
Example
A complex general matrix of order 4 with one upper codiagonal and three lower codiagonals is
copied to a complex band matrix of order 4 in band storage mode.
      USE CCGCB_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NLC, NUC
      PARAMETER  (LDA=4, LDB=5, N=4, NLC=3, NUC=1)
!
      COMPLEX    A(LDA,N), B(LDB,N)
!                                 Set values for A
!                       A = (  1.0+0.0i   2.0+1.0i   0.0+0.0i   0.0+0.0i )
!                           ( -2.0+1.0i   1.0+0.0i   3.0+2.0i   0.0+0.0i )
!                           (  0.0+0.0i  -3.0+2.0i   1.0+0.0i   4.0+3.0i )
!                           ( -7.0+1.0i   0.0+0.0i  -4.0+3.0i   1.0+0.0i )
!
      DATA A/(1.0,0.0), (-2.0,1.0), (0.0,0.0), (-7.0,1.0), (2.0,1.0), &
             (1.0,0.0), (-3.0,2.0), (0.0,0.0), (0.0,0.0), (3.0,2.0), &
             (1.0,0.0), (-4.0,3.0), (0.0,0.0), (0.0,0.0), (4.0,3.0), &
             (1.0,0.0)/
!                                 Convert A to band matrix B
      CALL CCGCB (A, NLC, NUC, B)
!                                 Print results
      CALL WRCRN ('B', B)
      END
Output
                                    B
                  1                 2                 3                 4
1   ( 0.000, 0.000)   ( 2.000, 1.000)   ( 3.000, 2.000)   ( 4.000, 3.000)
2   ( 1.000, 0.000)   ( 1.000, 0.000)   ( 1.000, 0.000)   ( 1.000, 0.000)
3   (-2.000, 1.000)   (-3.000, 2.000)   (-4.000, 3.000)   ( 0.000, 0.000)
4   ( 0.000, 0.000)   ( 0.000, 0.000)   ( 0.000, 0.000)   ( 0.000, 0.000)
5   (-7.000, 1.000)   ( 0.000, 0.000)   ( 0.000, 0.000)   ( 0.000, 0.000)
CCBCG
Converts a complex matrix in band storage mode to a complex matrix in full storage mode.
Required Arguments
A Complex (NUC + 1 + NLC) by N matrix containing the band matrix in band mode.
(Input)
NLC Number of lower codiagonals in A. (Input)
NUC Number of upper codiagonals in A. (Input)
B Complex N by N matrix containing the band matrix in full mode. (Output)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CCBCG converts the complex band matrix A of order N with mu = NUC upper
codiagonals and ml = NLC lower codiagonals into the N × N complex general matrix B. The first
mu rows of A are copied to the upper codiagonals of B, the next row of A is copied to the diagonal
of B, and the last ml rows of A are copied to the lower codiagonals of B.
Example
A complex band matrix of order 4 in band storage mode with one upper codiagonal and three
lower codiagonals is copied into a 4 × 4 complex general matrix.
      USE CCBCG_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NLC, NUC
      PARAMETER  (LDA=5, LDB=4, N=4, NLC=3, NUC=1)
!
      COMPLEX    A(LDA,N), B(LDB,N)
!                                 Set values for A (in band mode)
!                       A = (  0.0+0.0i   2.0+1.0i   3.0+2.0i   4.0+3.0i )
!                           (  1.0+0.0i   1.0+0.0i   1.0+0.0i   1.0+0.0i )
!                           ( -2.0+1.0i  -3.0+2.0i  -4.0+3.0i   0.0+0.0i )
!                           (  0.0+0.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!                           ( -7.0+1.0i   0.0+0.0i   0.0+0.0i   0.0+0.0i )
!
      DATA A/(0.0,0.0), (1.0,0.0), (-2.0,1.0), (0.0,0.0), (-7.0,1.0), &
             (2.0,1.0), (1.0,0.0), (-3.0,2.0), (0.0,0.0), (0.0,0.0), &
             (3.0,2.0), (1.0,0.0), (-4.0,3.0), (0.0,0.0), (0.0,0.0), &
             (4.0,3.0), (1.0,0.0), (0.0,0.0), (0.0,0.0), (0.0,0.0)/
!                                 Convert band matrix A to matrix B
      CALL CCBCG (A, NLC, NUC, B)
!                                 Print results
      CALL WRCRN ('B', B)
      END
Output
                                    B
                  1                 2                 3                 4
1   ( 1.000, 0.000)   ( 2.000, 1.000)   ( 0.000, 0.000)   ( 0.000, 0.000)
2   (-2.000, 1.000)   ( 1.000, 0.000)   ( 3.000, 2.000)   ( 0.000, 0.000)
3   ( 0.000, 0.000)   (-3.000, 2.000)   ( 1.000, 0.000)   ( 4.000, 3.000)
4   (-7.000, 1.000)   ( 0.000, 0.000)   (-4.000, 3.000)   ( 1.000, 0.000)
CRGCG
Copies a real general matrix to a complex general matrix.
Required Arguments
A Real matrix of order N. (Input)
B Complex matrix of order N containing a copy of A. (Output)
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CRGCG copies a real N × N matrix to a complex N × N matrix.
Example
A 3 × 3 real matrix is copied to a 3 × 3 complex matrix.
      USE CRGCG_INT
      USE WRCRN_INT
!
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N
      PARAMETER  (LDA=3, LDB=3, N=3)
      REAL       A(LDA,N)
      COMPLEX    B(LDB,N)
!                                 Set values for A
!                                 A = (  2.0  1.0  3.0 )
!                                     (  4.0  1.0  0.0 )
!                                     ( -1.0  2.0  0.0 )
!
      DATA A/2.0, 4.0, -1.0, 1.0, 1.0, 2.0, 3.0, 0.0, 0.0/
!                                 Convert real A to complex B
      CALL CRGCG (A, B)
!                                 Print results
      CALL WRCRN ('B', B)
      END
Output
                           B
                  1                 2                 3
1  ( 2.000, 0.000)  ( 1.000, 0.000)  ( 3.000, 0.000)
2  ( 4.000, 0.000)  ( 1.000, 0.000)  ( 0.000, 0.000)
3  (-1.000, 0.000)  ( 2.000, 0.000)  ( 0.000, 0.000)
CRRCR
Copies a real rectangular matrix to a complex rectangular matrix.
Required Arguments
A Real NRA by NCA rectangular matrix. (Input)
B Complex NRB by NCB rectangular matrix containing a copy of A. (Output)
Optional Arguments
NRA Number of rows in A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns in A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NRB Number of rows in B. (Input)
It must be the same as NRA.
Default: NRB = SIZE (B,1).
NCB Number of columns in B. (Input)
It must be the same as NCA.
Default: NCB = SIZE (B,2).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CRRCR copies a real rectangular matrix to a complex rectangular matrix.
Example
A 3 × 2 real matrix is copied to a 3 × 2 complex matrix.
USE CRRCR_INT
USE WRCRN_INT
IMPLICIT
NONE
!
Chapter 9: Basic Matrix/Vector Operations
Declare variables
INTEGER
PARAMETER
REAL
COMPLEX
A(LDA,NCA)
B(LDB,NCB)
!
!
!
!
!
!
!
!
)
)
)
Output
                  B
                  1                 2
1  ( 1.000, 0.000)  ( 4.000, 0.000)
2  ( 2.000, 0.000)  ( 5.000, 0.000)
3  ( 3.000, 0.000)  ( 6.000, 0.000)
CRBCB
Converts a real matrix in band storage mode to a complex matrix in band storage mode.
Required Arguments
A Real band matrix of order N. (Input)
NLCA Number of lower codiagonals in A. (Input)
NUCA Number of upper codiagonals in A. (Input)
B Complex matrix of order N containing a copy of A. (Output)
NLCB Number of lower codiagonals in B. (Input)
NLCB must be at least as large as NLCA.
NUCB Number of upper codiagonals in B. (Input)
NUCB must be at least as large as NUCA.
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CRBCB converts a real band matrix in band storage mode with NUCA upper codiagonals
and NLCA lower codiagonals into a complex band matrix in band storage mode with NUCB upper
codiagonals and NLCB lower codiagonals.
Example
A real band matrix of order 3 in band storage mode with one upper codiagonal and one lower
codiagonal is copied into another complex band matrix in band storage mode.
      USE CRBCB_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NLCA, NLCB, NUCA, NUCB
      PARAMETER  (LDA=3, LDB=3, N=3, NLCA=1, NLCB=1, NUCA=1, NUCB=1)
!
      REAL       A(LDA,N)
      COMPLEX    B(LDB,N)
!                                 Set values for A (in band mode)
!                                 A = ( 0.0  1.0  1.0)
!                                     ( 1.0  1.0  1.0)
!                                     ( 1.0  1.0  0.0)
!
      DATA A/0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0/
!                                 Convert real band matrix A
!                                 to complex band matrix B
      CALL CRBCB (A, NLCA, NUCA, B, NLCB, NUCB)
!                                 Print results
      CALL WRCRN ('B', B)
      END
Output
                           B
                  1                 2                 3
1  ( 0.000, 0.000)  ( 1.000, 0.000)  ( 1.000, 0.000)
2  ( 1.000, 0.000)  ( 1.000, 0.000)  ( 1.000, 0.000)
3  ( 1.000, 0.000)  ( 1.000, 0.000)  ( 0.000, 0.000)
CSFRG
Extends a real symmetric matrix defined in its upper triangle to its lower triangle.
Required Arguments
A N by N symmetric matrix of order N to be filled out. (Input/Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSFRG converts an N × N matrix A in symmetric mode into a general matrix by filling
in the lower triangular portion of A using the values defined in its upper triangular portion.
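The fill-in step amounts to copying each upper-triangle entry to its mirror position. The following is a minimal sketch in standard Fortran, not the library code; the program name and the -1.0 placeholders in the lower triangle are assumptions chosen to show that those entries are overwritten. The upper triangle is the one from the example output in this section.

```fortran
program fill_symmetric
  implicit none
  integer, parameter :: n = 3
  real :: a(n, n)
  integer :: i, j

  ! Column-major data; only the upper triangle is meaningful,
  ! the -1.0 entries are placeholders that get overwritten
  a = reshape([ 0.0, -1.0, -1.0,  &
                3.0,  1.0, -1.0,  &
                4.0,  5.0,  2.0 ], [n, n])

  ! Mirror the upper triangle into the lower triangle
  do j = 1, n - 1
     do i = j + 1, n
        a(i, j) = a(j, i)
     end do
  end do

  do i = 1, n
     print '(3f8.3)', a(i, :)
  end do
end program fill_symmetric
```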
Example
The lower triangular portion of a real 3 × 3 symmetric matrix is filled with the values defined in its
upper triangular portion.
      USE CSFRG_INT
      USE WRRRN_INT
!
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      REAL       A(LDA,N)
!
!
!
!
!
!
!
!
4.0
5.0
2.0
)
)
)
Output
              A
        1       2       3
1   0.000   3.000   4.000
2   3.000   1.000   5.000
3   4.000   5.000   2.000
CHFCG
Extends a complex Hermitian matrix defined in its upper triangle to its lower triangle.
Required Arguments
A Complex Hermitian matrix of order N. (Input/Output)
On input, the upper triangle of A defines a Hermitian matrix. On output, the lower
triangle of A is defined so that A is Hermitian.
Optional Arguments
N Order of the matrix. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CHFCG converts an N × N complex matrix A in Hermitian mode into a complex
general matrix by filling in the lower triangular portion of A using the values defined in its upper
triangular portion.
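For a Hermitian matrix the mirrored entry is the complex conjugate of its upper-triangle counterpart. The sketch below illustrates this with the intrinsic CONJG; it is not the CHFCG implementation, the program name is an assumption, and it omits the diagonal checks behind the informational errors. The upper triangle is the one from the example output in this section.

```fortran
program fill_hermitian
  implicit none
  integer, parameter :: n = 3
  complex :: a(n, n)
  integer :: i, j

  ! Define the upper triangle only; the rest starts as zero
  a = (0.0, 0.0)
  a(1, 1) = (1.0, 0.0)
  a(1, 2) = (1.0, 1.0)
  a(1, 3) = (1.0, 2.0)
  a(2, 2) = (2.0, 0.0)
  a(2, 3) = (2.0, 2.0)
  a(3, 3) = (3.0, 0.0)

  ! Lower triangle is the conjugate of the mirrored upper entry
  do j = 1, n - 1
     do i = j + 1, n
        a(i, j) = conjg(a(j, i))
     end do
  end do

  do i = 1, n
     print '(3("(",f6.3,",",f6.3,") "))', a(i, :)
  end do
end program fill_hermitian
```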
Comments
Informational errors
Type   Code
  3      1    The matrix is not Hermitian. It has a diagonal entry with a small imaginary part.
  4      2    The matrix is not Hermitian. It has a diagonal entry with an imaginary part.
Example
A complex 3 × 3 Hermitian matrix defined in its upper triangle is extended to its lower triangle.
      USE CHFCG_INT
      USE WRCRN_INT
!
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N
      PARAMETER  (LDA=3, N=3)
      COMPLEX    A(LDA,N)
!
!
!
!
!
!
A = (
(
(
1.0+2.0i
2.0+2.0i
3.0+0.0i
)
)
)
!                                 Print results
      CALL WRCRN ('A', A)
      END
Output
                           A
                  1                 2                 3
1  ( 1.000, 0.000)  ( 1.000, 1.000)  ( 1.000, 2.000)
2  ( 1.000,-1.000)  ( 2.000, 0.000)  ( 2.000, 2.000)
3  ( 1.000,-2.000)  ( 2.000,-2.000)  ( 3.000, 0.000)
CSBRB
Copies a real symmetric band matrix stored in band symmetric storage mode to a real band matrix
stored in band storage mode.
Required Arguments
A Real band symmetric matrix of order N. (Input)
NUCA Number of codiagonals in A. (Input)
B Real band matrix of order N containing a copy of A. (Output)
NLCB Number of lower codiagonals in B. (Input)
NLCB must be at least as large as NUCA.
NUCB Number of upper codiagonals in B. (Input)
NUCB must be at least as large as NUCA.
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CSBRB copies a real matrix A stored in symmetric band mode to a matrix B stored in
band mode. The lower codiagonals of B are set using the values from the upper codiagonals of A.
Example
A real matrix of order 4 in band symmetric storage mode with 2 upper codiagonals is copied to a
real matrix in band storage mode with 2 upper codiagonals and 2 lower codiagonals.
      USE CSBRB_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NLCB, NUCA, NUCB
      PARAMETER  (N=4, NUCA=2, LDA=NUCA+1, NLCB=NUCA, NUCB=NUCA, &
                 LDB=NLCB+NUCB+1)
!
      REAL       A(LDA,N), B(LDB,N)
!                                 Set values for A, in band mode
!                                 A = ( 0.0  0.0  2.0  1.0 )
!                                     ( 0.0  2.0  3.0  1.0 )
!                                     ( 1.0  2.0  3.0  4.0 )
!
      DATA A/2*0.0, 1.0, 0.0, 2.0, 2.0, 2.0, 3.0, 3.0, 1.0, 1.0, 4.0/
!                                 Copy A to B
      CALL CSBRB (A, NUCA, B, NLCB, NUCB)
!                                 Print results
      CALL WRRRN ('B', B)
      END
Output
                  B
        1       2       3       4
1   0.000   0.000   2.000   1.000
2   0.000   2.000   3.000   1.000
3   1.000   2.000   3.000   4.000
4   2.000   3.000   1.000   0.000
5   2.000   1.000   0.000   0.000
CHBCB
Copies a complex Hermitian band matrix stored in band Hermitian storage mode to a complex
band matrix stored in band storage mode.
Required Arguments
A Complex band Hermitian matrix of order N. (Input)
NUCA Number of codiagonals in A. (Input)
B Complex band matrix of order N containing a copy of A. (Output)
NLCB Number of lower codiagonals in B. (Input)
NLCB must be at least as large as NUCA.
NUCB Number of upper codiagonals in B. (Input)
NUCB must be at least as large as NUCA.
Optional Arguments
N Order of the matrices A and B. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine CHBCB copies a complex matrix A stored in Hermitian band mode to a matrix B stored
in complex band mode. The lower codiagonals of B are filled using the values in the upper
codiagonals of A.
Comments
Informational errors
Type   Code
  3      1    An element on the diagonal has a complex part that is near zero; the complex part is set to zero.
  4      1    An element on the diagonal has a complex part that is not zero.
Example
A complex Hermitian matrix of order 3 in band Hermitian storage mode with one upper
codiagonal is copied to a complex matrix in band storage mode.
      USE CHBCB_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NLCB, NUCA, NUCB
      PARAMETER  (N=3, NUCA=1, LDA=NUCA+1, NLCB=NUCA, NUCB=NUCA, &
                 LDB=NLCB+NUCB+1)
!
      COMPLEX    A(LDA,N), B(LDB,N)
!                                 Set values for A (in band mode)
!                       A = (  0.0+0.0i  -1.0+1.0i  -2.0+2.0i )
!                           (  1.0+0.0i   1.0+0.0i   1.0+0.0i )
!
Output
                           B
                  1                 2                 3
1  ( 0.000, 0.000)  (-1.000, 1.000)  (-2.000, 2.000)
2  ( 1.000, 0.000)  ( 1.000, 0.000)  ( 1.000, 0.000)
3  (-1.000,-1.000)  (-2.000,-2.000)  ( 0.000, 0.000)
TRNRR
Transposes a rectangular matrix.
Required Arguments
A Real NRA by NCA matrix in full storage mode. (Input)
B Real NRB by NCB matrix in full storage mode containing the transpose of A. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NRB Number of rows of B. (Input)
NRB must be equal to NCA.
Default: NRB = SIZE (B,1).
NCB Number of columns of B. (Input)
NCB must be equal to NRA.
Default: NCB = SIZE (B,2).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine TRNRR computes the transpose B = A^T of a real rectangular matrix A.
Comments
If LDA = LDB and NRA = NCA, then A and B can occupy the same storage locations; otherwise, A
and B must be stored separately.
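For comparison, the same result can be obtained with the standard Fortran intrinsic TRANSPOSE, which always returns a separate array. This is a small cross-check sketch using the data of the example in this section, not a description of how TRNRR works internally; the program name is an assumption.

```fortran
program transpose_check
  implicit none
  real :: a(5, 3), b(3, 5)
  integer :: i

  ! Column-major fill: A(i,j) = 10*i + j
  a = reshape([ 11.0, 21.0, 31.0, 41.0, 51.0, &
                12.0, 22.0, 32.0, 42.0, 52.0, &
                13.0, 23.0, 33.0, 43.0, 53.0 ], [5, 3])

  ! B(j,i) = A(i,j)
  b = transpose(a)

  do i = 1, 3
     print '(5f7.2)', b(i, :)
  end do
end program transpose_check
```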
Example
Transpose the 5 × 3 real rectangular matrix A into the 3 × 5 real rectangular matrix B.
      USE TRNRR_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NCB, NRA, NRB
      PARAMETER  (NCA=3, NCB=5, NRA=5, NRB=3)
!
      REAL       A(NRA,NCA), B(NRB,NCB)
!                                 Set values for A
!                                 A = ( 11.0  12.0  13.0 )
!                                     ( 21.0  22.0  23.0 )
!                                     ( 31.0  32.0  33.0 )
!                                     ( 41.0  42.0  43.0 )
!                                     ( 51.0  52.0  53.0 )
!
      DATA A/11.0, 21.0, 31.0, 41.0, 51.0, 12.0, 22.0, 32.0, 42.0,&
          52.0, 13.0, 23.0, 33.0, 43.0, 53.0/
!                                 B = transpose(A)
      CALL TRNRR (A, B)
!                                 Print results
      CALL WRRRN ('B = trans(A)', B)
      END
Output
                 B = trans(A)
        1       2       3       4       5
1   11.00   21.00   31.00   41.00   51.00
2   12.00   22.00   32.00   42.00   52.00
3   13.00   23.00   33.00   43.00   53.00
MXTXF
Computes the transpose product of a matrix, A^T A.
Required Arguments
A Real NRA by NCA rectangular matrix. (Input)
The transpose product of A is to be computed.
B Real NB by NB symmetric matrix containing the transpose product A^T A. (Output)
Optional Arguments
NRA Number of rows in A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns in A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NB Order of the matrix B. (Input)
NB must be equal to NCA.
Default: NB = SIZE (B,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine MXTXF computes the real general matrix B = A^T A given the real rectangular matrix A.
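The product can be cross-checked with the standard intrinsics MATMUL and TRANSPOSE. The sketch below uses the matrix from the example in this section; the program name is an assumption and the code says nothing about how MXTXF itself is implemented. The printed matrix is symmetric, as the argument description of B requires.

```fortran
program ata_check
  implicit none
  real :: a(3, 4), b(4, 4)
  integer :: i

  ! Column-major data for the 3 x 4 matrix A
  a = reshape([ 3.0, 0.0, 6.0,  &
                1.0, 2.0, 1.0,  &
                4.0, 1.0, 3.0,  &
                2.0, -1.0, 2.0 ], [3, 4])

  ! B = trans(A)*A is 4 x 4 and symmetric
  b = matmul(transpose(a), a)

  do i = 1, 4
     print '(4f8.2)', b(i, :)
  end do
end program ata_check
```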
Example
Multiply the transpose of a 3 × 4 real matrix by itself. The output matrix will be a 4 × 4 real
symmetric matrix.
      USE MXTXF_INT
      USE WRRRN_INT
!
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NB, NCA, NRA
      PARAMETER  (NB=4, NCA=4, NRA=3)
!
      REAL       A(NRA,NCA), B(NB,NB)
!                                 Set values for A
!                                 A = ( 3.0  1.0  4.0   2.0 )
!                                     ( 0.0  2.0  1.0  -1.0 )
!                                     ( 6.0  1.0  3.0   2.0 )
!
      DATA A/3.0, 0.0, 6.0, 1.0, 2.0, 1.0, 4.0, 1.0, 3.0, 2.0, -1.0, &
          2.0/
!                                 Compute B = trans(A)*A
      CALL MXTXF (A, B)
!                                 Print results
      CALL WRRRN ('B = trans(A)*A', B)
      END
Output
            B = trans(A)*A
        1       2       3       4
1   45.00    9.00   30.00   18.00
2    9.00    6.00    9.00    2.00
3   30.00    9.00   26.00   13.00
4   18.00    2.00   13.00    9.00
MXTYF
Multiplies the transpose of matrix A by matrix B, A^T B.
Required Arguments
A Real NRA by NCA matrix. (Input)
B Real NRB by NCB matrix. (Input)
C Real NCA by NCB matrix containing the transpose product A^T B. (Output)
Optional Arguments
NRA Number of rows in A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns in A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NRB Number of rows in B. (Input)
NRB must be the same as NRA.
Default: NRB = SIZE (B,1).
NCB Number of columns in B. (Input)
Default: NCB = SIZE (B,2).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
NRC Number of rows of C. (Input)
NRC must be equal to NCA.
Default: NRC = SIZE (C,1).
NCC Number of columns of C. (Input)
NCC must be equal to NCB.
Default: NCC = SIZE (C,2).
LDC Leading dimension of C exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDC = SIZE (C,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL MXTYF (NRA, NCA, A, LDA, NRB, NCB, B, LDB, NRC, NCC,
C, LDC)
Double:
Description
The routine MXTYF computes the real general matrix C = A^T B given the real rectangular matrices A
and B.
Example
Multiply the transpose of a 3 × 4 real matrix by a 3 × 3 real matrix. The output matrix will be a
4 × 3 real matrix.
      USE MXTYF_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NCB, NCC, NRA, NRB, NRC
      PARAMETER  (NCA=4, NCB=3, NCC=3, NRA=3, NRB=3, NRC=4)
!
!
REAL
!
!
!
!
!
!
!
!
!
!
!
!
0.0 )
0.0 )
1.0 )
Output
        C = trans(A)*B
        1       2       3
1    8.00   12.00    1.00
2   12.00    5.00   -2.00
3   -5.00   14.00    5.00
4    0.00    5.00    2.00
MXYTF
Multiplies a matrix A by the transpose of a matrix B, A B^T.
Required Arguments
A Real NRA by NCA rectangular matrix. (Input)
B Real NRB by NCB rectangular matrix. (Input)
C Real NRC by NCC rectangular matrix containing the transpose product A B^T. (Output)
Optional Arguments
NRA Number of rows in A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns in A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NRB Number of rows in B. (Input)
Default: NRB = SIZE (B,1).
NCB Number of columns in B. (Input)
NCB must be the same as NCA.
Default: NCB = SIZE (B,2).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
NRC Number of rows of C. (Input)
NRC must be equal to NRA.
Default: NRC = SIZE (C,1).
NCC Number of columns of C. (Input)
NCC must be equal to NRB.
Default: NCC = SIZE (C,2).
LDC Leading dimension of C exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDC = SIZE (C,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL MXYTF (NRA, NCA, A, LDA, NRB, NCB, B, LDB, NRC, NCC,
C, LDC)
Double:
Description
The routine MXYTF computes the real general matrix C = A B^T given the real rectangular matrices A
and B.
Example
Multiply a 3 × 4 real matrix by the transpose of a 3 × 4 real matrix. The output matrix will be a
3 × 3 real matrix.
      USE MXYTF_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NCB, NCC, NRA, NRB, NRC
      PARAMETER  (NCA=4, NCB=4, NCC=3, NRA=3, NRB=3, NRC=3)
!
!
REAL
!
!
!
!
!
!
!
!
!
!
!
!
0.0 )
0.0 )
1.0 )
Output
        C = A*trans(B)
        1       2       3
1   -1.00    1.00    4.00
2    5.00   10.00   18.00
3    2.00    3.00   14.00
MRRRR
Multiplies two real rectangular matrices, AB.
Required Arguments
A Real NRA by NCA matrix in full storage mode. (Input)
B Real NRB by NCB matrix in full storage mode. (Input)
C Real NRC by NCC matrix containing the product AB in full storage mode. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NRB Number of rows of B. (Input)
NRB must be equal to NCA.
Default: NRB = SIZE (B,1).
NCB Number of columns of B. (Input)
Default: NCB = SIZE (B,2).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
NRC Number of rows of C. (Input)
NRC must be equal to NRA.
Default: NRC = SIZE (C,1).
NCC Number of columns of C. (Input)
NCC must be equal to NCB.
Default: NCC = SIZE (C,2).
LDC Leading dimension of C exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDC = SIZE (C,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL MRRRR (NRA, NCA, A, LDA, NRB, NCB, B, LDB, NRC, NCC,
C, LDC)
Double:
Description
Given the real rectangular matrices A and B, MRRRR computes the real rectangular matrix C = AB.
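The definition C = AB can be checked against the standard intrinsic MATMUL. The following sketch uses the matrices of the example in this section; the program name is an assumption, and the code is offered only as a cross-check, not as the MRRRR algorithm. MATMUL requires the same conformance that the argument list states: the number of columns of A must equal the number of rows of B.

```fortran
program matmul_check
  implicit none
  real :: a(3, 4), b(4, 3), c(3, 3)
  integer :: i

  ! Column-major data for A (3 x 4) and B (4 x 3)
  a = reshape([ 1.0, 3.0, 2.0,   0.0, 4.0, 1.0, &
                2.0, -1.0, 2.0,  0.0, 0.0, 1.0 ], [3, 4])
  b = reshape([ -1.0, 3.0, 0.0, 2.0,  &
                 0.0, 5.0, 0.0, -1.0, &
                 2.0, 2.0, -1.0, 5.0 ], [4, 3])

  ! C(i,j) = sum over k of A(i,k)*B(k,j)
  c = matmul(a, b)

  do i = 1, 3
     print '(3f8.2)', c(i, :)
  end do
end program matmul_check
```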
Example
Multiply a 3 × 4 real matrix by a 4 × 3 real matrix. The output matrix will be a 3 × 3 real matrix.
      USE MRRRR_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NCB, NCC, NRA, NRB, NRC
      PARAMETER  (NCA=4, NCB=3, NCC=3, NRA=3, NRB=4, NRC=3)
!
      REAL       A(NRA,NCA), B(NRB,NCB), C(NRC,NCC)
!                                 Set values for A
!                                 A = ( 1.0  0.0   2.0  0.0 )
!                                     ( 3.0  4.0  -1.0  0.0 )
!                                     ( 2.0  1.0   2.0  1.0 )
!
!                                 Set values for B
!                                 B = ( -1.0   0.0   2.0 )
!                                     (  3.0   5.0   2.0 )
!                                     (  0.0   0.0  -1.0 )
!                                     (  2.0  -1.0   5.0 )
!
      DATA A/1.0, 3.0, 2.0, 0.0, 4.0, 1.0, 2.0, -1.0, 2.0, 0.0, 0.0, &
          1.0/
      DATA B/-1.0, 3.0, 0.0, 2.0, 0.0, 5.0, 0.0, -1.0, 2.0, 2.0, -1.0, &
          5.0/
!                                 Compute C = A*B
      CALL MRRRR (A, B, C)
!                                 Print results
      CALL WRRRN ('C = A*B', C)
      END
Output
           C = A*B
        1       2       3
1   -1.00    0.00    0.00
2    9.00   20.00   15.00
3    3.00    4.00    9.00
MCRCR
Multiplies two complex rectangular matrices, AB.
Required Arguments
A Complex NRA by NCA rectangular matrix. (Input)
B Complex NRB by NCB rectangular matrix. (Input)
C Complex NRC by NCC rectangular matrix containing the product A * B. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NRB Number of rows of B. (Input)
NRB must be equal to NCA.
Default: NRB = SIZE (B,1).
NCB Number of columns of B. (Input)
Default: NCB = SIZE (B,2).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
NRC Number of rows of C. (Input)
NRC must be equal to NRA.
Default: NRC = SIZE (C,1).
NCC Number of columns of C. (Input)
NCC must be equal to NCB.
Default: NCC = SIZE (C,2).
LDC Leading dimension of C exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDC = SIZE (C,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL MCRCR (NRA, NCA, A, LDA, NRB, NCB, B, LDB, NRC, NCC,
C, LDC)
Double:
Description
Given the complex rectangular matrices A and B, MCRCR computes the complex rectangular matrix
C = AB.
Example
Multiply a 3 × 4 complex matrix by a 4 × 3 complex matrix. The output matrix will be a 3 × 3
complex matrix.
      USE MCRCR_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NCB, NCC, NRA, NRB, NRC
      PARAMETER  (NCA=4, NCB=3, NCC=3, NRA=3, NRB=4, NRC=3)
!
!
COMPLEX
!
!
!
!
!
!
!
!
!
!
!
!
!
2.0
2.0
1.0
2.0
+
+
+
1.0i
1.0i
0.0i
1.0i
B
1.0i
3.0i
1.0i
1.0i
0.0 - 2.0i )
0.0 + 1.0i )
0.0 + 0.0i )
)
)
)
)
END
Output
                         C = A*B
                  1                 2                 3
1  (  3.00,  5.00)  (  6.00, 13.00)  (  0.00, 17.00)
2  (  8.00,  4.00)  (  8.00, -2.00)  ( 22.00,-12.00)
3  (  0.00, -4.00)  (  3.00, -6.00)  (  2.00,-14.00)
HRRRR
Computes the Hadamard product of two real rectangular matrices.
Required Arguments
A Real NRA by NCA rectangular matrix. (Input)
B Real NRB by NCB rectangular matrix. (Input)
C Real NRC by NCC rectangular matrix containing the Hadamard product of A and B.
(Output)
If A is not needed, then C can share the same storage locations as A. Similarly, if B is
not needed, then C can share the same storage locations as B.
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NRB Number of rows of B. (Input)
NRB must be equal to NRA.
Default: NRB = SIZE (B,1).
NCB Number of columns of B. (Input)
NCB must be equal to NCA.
Default: NCB = SIZE (B,2).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
CALL HRRRR (NRA, NCA, A, LDA, NRB, NCB, B, LDB, NRC, NCC,
C, LDC)
Double:
Description
The routine HRRRR computes the Hadamard product of two real matrices A and B and returns a
real matrix C, where C_ij = A_ij B_ij.
Example
Compute the Hadamard product of two 4 × 4 real matrices. The output matrix will be a 4 × 4 real
matrix.
      USE HRRRR_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NCB, NCC, NRA, NRB, NRC
      PARAMETER  (NCA=4, NCB=4, NCC=4, NRA=4, NRB=4, NRC=4)
!
      REAL       A(NRA,NCA), B(NRB,NCB), C(NRC,NCC)
!                                 Set values for A
!                                 A = ( -1.0   0.0  -3.0   8.0 )
!                                     (  2.0   1.0   7.0   2.0 )
!                                     (  3.0  -2.0   2.0  -6.0 )
!                                     (  4.0   1.0  -5.0  -8.0 )
!
!                                 Set values for B
!                                 B = (  2.0   3.0   0.0 -10.0 )
!                                     (  1.0  -1.0   4.0   2.0 )
!                                     ( -1.0  -2.0   7.0   1.0 )
!                                     (  2.0   1.0   9.0   0.0 )
!
      DATA A/-1.0, 2.0, 3.0, 4.0, 0.0, 1.0, -2.0, 1.0, -3.0, 7.0, 2.0, &
          -5.0, 8.0, 2.0, -6.0, -8.0/
      DATA B/2.0, 1.0, -1.0, 2.0, 3.0, -1.0, -2.0, 1.0, 0.0, 4.0, 7.0, &
          9.0, -10.0, 2.0, 1.0, 0.0/
!                                 Compute Hadamard product of A and B
      CALL HRRRR (A, B, C)
!                                 Print results
      CALL WRRRN ('C = A (*) B', C)
      END
Output
             C = A (*) B
        1       2       3       4
1   -2.00    0.00    0.00  -80.00
2    2.00   -1.00   28.00    4.00
3   -3.00    4.00   14.00   -6.00
4    8.00    1.00  -45.00    0.00
BLINF
This function computes the bilinear form x^T A y.
Required Arguments
A Real NRA by NCA matrix. (Input)
X Real vector of length NRA. (Input)
Y Real vector of length NCA. (Input)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Given the real rectangular matrix A and two vectors x and y, BLINF computes the bilinear form
x^T A y.
Comments
The quadratic form can be computed by calling BLINF with the vector X in place of the vector Y.
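The bilinear form can also be written with the standard intrinsics DOT_PRODUCT and MATMUL, which is convenient for checking a result. The sketch below uses the data of the example in this section; the program and variable names are assumptions, and it is not the BLINF implementation.

```fortran
program bilinear_check
  implicit none
  real :: a(5, 2), x(5), y(2), val

  ! Column-major data for the 5 x 2 matrix A
  a = reshape([ -2.0, 3.0, -4.0, 1.0, 0.0, &
                 2.0, -6.0, 7.0, -8.0, 10.0 ], [5, 2])
  x = [ 1.0, -2.0, 3.0, -4.0, -5.0 ]
  y = [ -6.0, 3.0 ]

  ! trans(x)*A*y evaluated as x . (A*y)
  val = dot_product(x, matmul(a, y))

  print '(a, f10.3)', ' trans(x)*A*y = ', val
end program bilinear_check
```

For a square matrix, passing the same vector twice, as in dot_product(x, matmul(a, x)), gives the quadratic form mentioned in the comments.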
Example
Compute the bilinear form x^T A y, where x is a vector of length 5, A is a 5 × 2 matrix and y is a
vector of length 2.
      USE BLINF_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NRA
      PARAMETER  (NCA=2, NRA=5)
!
      INTEGER    NOUT
      REAL       A(NRA,NCA), VALUE, X(NRA), Y(NCA)
!                                 Set values for A
!                                 A = ( -2.0   2.0 )
!                                     (  3.0  -6.0 )
!                                     ( -4.0   7.0 )
!                                     (  1.0  -8.0 )
!                                     (  0.0  10.0 )
!
!                                 Set values for X
!                                 X = ( 1.0  -2.0  3.0  -4.0  -5.0 )
!
!                                 Set values for Y
!                                 Y = ( -6.0  3.0 )
!
      DATA A/-2.0, 3.0, -4.0, 1.0, 0.0, 2.0, -6.0, 7.0, -8.0, 10.0/
      DATA X/1.0, -2.0, 3.0, -4.0, -5.0/
      DATA Y/-6.0, 3.0/
!                                 Compute bilinear form
      VALUE = BLINF(A,X,Y)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' The bilinear form trans(x)*A*y = ', VALUE
      END
Output
 The bilinear form trans(x)*A*y =       195.000
POLRG
Evaluates a real general matrix polynomial.
Required Arguments
A N by N matrix for which the polynomial is to be computed. (Input)
COEF Vector of length NCOEF containing the coefficients of the polynomial in order of
increasing power. (Input)
B N by N matrix containing the value of the polynomial evaluated at A. (Output)
Optional Arguments
N Order of the matrix A. (Input)
Default: N = SIZE (A,1).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NCOEF Number of coefficients. (Input)
Default: NCOEF = SIZE (COEF,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Let m = NCOEF and c = COEF.
The routine POLRG computes the matrix polynomial

    B = \sum_{k=1}^{m} c_k A^{k-1}

in the nested (Horner) form

    B = ( ( c_m A + c_{m-1} I ) A + c_{m-2} I ) A + \cdots + c_1 I

where I is the N × N identity matrix.
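The nested evaluation can be sketched in a few lines of standard Fortran. The program below is an illustration under stated assumptions, not the POLRG code: the 2 × 2 test matrix is hypothetical, the names are invented for the sketch, and the intrinsic MATMUL stands in for whatever multiplication the library uses. It evaluates 3I + A + 2A**2 both by the nested form and directly, then prints the largest difference between the two results.

```fortran
program horner_matrix
  implicit none
  integer, parameter :: n = 2, m = 3
  real :: a(n, n), b(n, n), direct(n, n), eye(n, n), c(m)
  integer :: i, k

  ! Hypothetical test data: A = (1 2; 3 4), coefficients of 3I + A + 2A**2
  a = reshape([ 1.0, 3.0, 2.0, 4.0 ], [n, n])
  c = [ 3.0, 1.0, 2.0 ]

  ! Identity matrix
  eye = 0.0
  do i = 1, n
     eye(i, i) = 1.0
  end do

  ! Nested form: start with c(m)*I, then B <- B*A + c(k)*I
  b = c(m) * eye
  do k = m - 1, 1, -1
     b = matmul(b, a) + c(k) * eye
  end do

  ! Direct evaluation of the same polynomial for comparison
  direct = c(1) * eye + c(2) * a + c(3) * matmul(a, a)

  do i = 1, n
     print '(2f10.3)', b(i, :)
  end do
  print '(a, es10.3)', ' max difference = ', maxval(abs(b - direct))
end program horner_matrix
```

The nested form needs m - 1 matrix products and never forms a power of A explicitly.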
Comments
Workspace may be explicitly provided, if desired, by use of P2LRG/DP2LRG. The reference is
CALL P2LRG (N, A, LDA, NCOEF, COEF, B, LDB, WORK)
Example
This example evaluates the matrix polynomial 3I + A + 2A^2, where A is a 3 × 3 matrix.
      USE POLRG_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, N, NCOEF
      PARAMETER  (N=3, NCOEF=3, LDA=N, LDB=N)
!
!
!
!
!
!
!
!
!
!
REAL
3.0
1.0
5.0
2.0
7.0
-4.0
)
)
)
!                                 Evaluate B = 3I + A + 2*A**2
      CALL POLRG (A, COEF, B)
!                                 Print B
      CALL WRRRN ('B = 3I + A + 2*A**2', B)
      END
Output
      B = 3I + A + 2*A**2
        1       2       3
1   -20.0    35.0    32.0
2   -11.0    46.0   -55.0
3   -55.0   -19.0   105.0
MURRV
Multiplies a real rectangular matrix by a vector.
Required Arguments
A Real NRA by NCA rectangular matrix. (Input)
X Real vector of length NX. (Input)
Y Real vector of length NY containing the product A * X if IPATH is equal to 1 and the
product trans(A) * X if IPATH is equal to 2. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NX Length of the vector X. (Input)
NX must be equal to NCA if IPATH is equal to 1. NX must be equal to NRA if IPATH is
equal to 2.
Default: NX = SIZE (X,1).
IPATH Integer flag. (Input)
IPATH = 1 means the product Y = A * X is computed. IPATH = 2 means the product
Y = trans(A) * X is computed, where trans(A) is the transpose of A.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
If IPATH = 1, MURRV computes y = Ax, where A is a real general matrix and x and y are real
vectors. If IPATH = 2, MURRV computes y = A^T x.
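The two cases selected by IPATH correspond to the following expressions with the standard intrinsics MATMUL and TRANSPOSE. This is a sketch for checking results, with an assumed program name, and not the MURRV implementation; the data are those of the example in this section. With these values, the first case gives (3, 6, 8) and the transposed case gives (5, 7, 4).

```fortran
program matvec_ipath
  implicit none
  real :: a(3, 3), x(3), y(3)
  integer :: ipath

  ! Column-major data for A, and the vector x
  a = reshape([ 1.0, 0.0, 4.0,  &
                0.0, 3.0, 1.0,  &
                2.0, 0.0, 2.0 ], [3, 3])
  x = [ 1.0, 2.0, 1.0 ]

  do ipath = 1, 2
     if (ipath == 1) then
        y = matmul(a, x)              ! y = A*x
     else
        y = matmul(transpose(a), x)   ! y = trans(A)*x
     end if
     print '(a, i1, a, 3f8.3)', ' ipath = ', ipath, ': y =', y
  end do
end program matvec_ipath
```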
Example
Multiply a 3 × 3 real matrix by a real vector of length 3. The output vector will be a real vector of
length 3.
      USE MURRV_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, NCA, NRA, NX, NY
      PARAMETER  (NCA=3, NRA=3, NX=3, NY=3)
!
      INTEGER    IPATH
      REAL       A(NRA,NCA), X(NX), Y(NY)
!                                 Set values for A and X
!                                 A = ( 1.0  0.0  2.0 )
!                                     ( 0.0  3.0  0.0 )
!                                     ( 4.0  1.0  2.0 )
!
!                                 X = ( 1.0  2.0  1.0 )
!
      DATA A/1.0, 0.0, 4.0, 0.0, 3.0, 1.0, 2.0, 0.0, 2.0/
      DATA X/1.0, 2.0, 1.0/
!                                 Compute y = Ax
      IPATH = 1
      CALL MURRV (A, X, Y)
!
!                                 Print results
      CALL WRRRN ('y = Ax', Y, 1, NY, 1)
      END
Output
         y = Ax
      1       2       3
  3.000   6.000   8.000
MURBV
Multiplies a real band matrix in band storage mode by a real vector.
Required Arguments
A Real NLCA + NUCA + 1 by N band matrix stored in band mode. (Input)
NLCA Number of lower codiagonals in A. (Input)
NUCA Number of upper codiagonals in A. (Input)
X Real vector of length NX. (Input)
Y Real vector of length NY containing the product A * X if IPATH is equal to 1 and the
product trans(A) * X if IPATH is equal to 2. (Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NX Length of the vector X. (Input)
NX must be equal to N.
Default: NX = SIZE (X,1).
IPATH Integer flag. (Input)
IPATH = 1 means the product Y = A * X is computed. IPATH = 2 means the product
Y = trans(A) * X is computed, where trans(A) is the transpose of A.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
If IPATH = 1, MURBV computes y = Ax, where A is a real band matrix and x and y are real vectors.
If IPATH = 2, MURBV computes y = A^T x.
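A band matrix-vector product only needs to visit the entries inside the band. The sketch below shows the IPATH = 1 case in standard Fortran, assuming the band storage convention that entry (i, j) of the full matrix is held in A(NUCA+1+i-j, j). It is an illustration with an assumed program name, not the MURBV code, and it uses the data of the example in this section.

```fortran
program band_matvec
  implicit none
  integer, parameter :: n = 6, nlca = 2, nuca = 2
  integer, parameter :: lda = nlca + nuca + 1
  real :: a(lda, n), x(n), y(n)
  integer :: i, j

  ! Band-storage matrix, listed column by column
  a = reshape([ 0.0, 0.0, 1.0, -1.0, -5.0,  &
                0.0, 1.0, 2.0, -2.0, -6.0,  &
                1.0, 2.0, 3.0, -3.0, -7.0,  &
                2.0, 3.0, 4.0, -4.0, -8.0,  &
                3.0, 4.0, 5.0, -5.0,  0.0,  &
                4.0, 5.0, 6.0,  0.0,  0.0 ], [lda, n])
  x = [ -1.0, 2.0, -3.0, 4.0, -5.0, 6.0 ]

  ! Accumulate y = A*x column by column, touching only in-band rows
  y = 0.0
  do j = 1, n
     do i = max(1, j - nuca), min(n, j + nlca)
        y(i) = y(i) + a(nuca + 1 + i - j, j) * x(j)
     end do
  end do

  print '(6f8.2)', y
end program band_matvec
```

The work is proportional to N times the bandwidth rather than N squared, which is the reason for multiplying directly in band storage.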
Example
Multiply a real band matrix of order 6, with two upper codiagonals and two lower codiagonals
stored in band mode, by a real vector of length 6. The output vector will be a real vector of length
6.
      USE MURBV_INT
      USE WRRRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N, NLCA, NUCA, NX, NY
      PARAMETER  (LDA=5, N=6, NLCA=2, NUCA=2, NX=6, NY=6)
!
      INTEGER    IPATH
      REAL       A(LDA,N), X(NX), Y(NY)
!                                 Set values for A (in band mode)
!                      A = (  0.0   0.0   1.0   2.0   3.0   4.0 )
!                          (  0.0   1.0   2.0   3.0   4.0   5.0 )
!                          (  1.0   2.0   3.0   4.0   5.0   6.0 )
!                          ( -1.0  -2.0  -3.0  -4.0  -5.0   0.0 )
!                          ( -5.0  -6.0  -7.0  -8.0   0.0   0.0 )
!
!                                 Set values for X
!                      X = ( -1.0   2.0  -3.0   4.0  -5.0   6.0 )
!
      DATA A/0.0, 0.0, 1.0, -1.0, -5.0, 0.0, 1.0, 2.0, -2.0, -6.0, &
          1.0, 2.0, 3.0, -3.0, -7.0, 2.0, 3.0, 4.0, -4.0, -8.0, 3.0, &
          4.0, 5.0, -5.0, 0.0, 4.0, 5.0, 6.0, 0.0, 0.0/
      DATA X/-1.0, 2.0, -3.0, 4.0, -5.0, 6.0/
!                                 Compute y = Ax
      IPATH = 1
      CALL MURBV (A, NLCA, NUCA, X, Y)
!                                 Print results
      CALL WRRRN ('y = Ax', Y, 1, NY, 1)
      END
Output
                     y = Ax
      1       2       3       4       5       6
  -2.00    7.00  -11.00   17.00   10.00   29.00
MUCRV
Multiplies a complex rectangular matrix by a complex vector.
Required Arguments
A Complex NRA by NCA rectangular matrix. (Input)
X Complex vector of length NX. (Input)
Y Complex vector of length NY containing the product A * X if IPATH is equal to 1 and the
product trans(A) * X if IPATH is equal to 2. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NX Length of the vector X. (Input)
NX must be equal to NCA if IPATH is equal to 1. NX must be equal to NRA if IPATH is
equal to 2.
Default: NX = SIZE (X,1).
IPATH Integer flag. (Input)
IPATH = 1 means the product Y = A * X is computed. IPATH = 2 means the product
Y = trans(A) * X is computed, where trans(A) is the transpose of A.
Default: IPATH = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
If IPATH = 1, MUCRV computes y = Ax, where A is a complex general matrix and x and y are
complex vectors. If IPATH = 2, MUCRV computes y = A^T x.
Example
Multiply a 3 × 3 complex matrix by a complex vector of length 3. The output vector will be a
complex vector of length 3.
      USE MUCRV_INT
      USE WRCRN_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NRA, NX, NY
      PARAMETER  (NCA=3, NRA=3, NX=3, NY=3)
!
      INTEGER    IPATH
      COMPLEX    A(NRA,NCA), X(NX), Y(NY)
!
!                                 Set values for A and X
!                                 A = ( 1.0 + 2.0i   ...   ... 0.0i )
!                                     ( 2.0 + 1.0i   ...   ... 1.0i )
!                                     ( 2.0 - 1.0i   ...   ... 1.0i )
!
!                                 X = ( 1.0 - 1.0i   2.0 - 2.0i   0.0 - 1.0i )
!
!                                 Print results
      CALL WRCRN ('y = Ax', Y, 1, NY, 1)
      END
Output
                      y = Ax
              1                2                3
( 17.00,  2.00)  ( 12.00, -3.00)  (  4.00, -5.00)
MUCBV
Multiplies a complex band matrix in band storage mode by a complex vector.
Required Arguments
A Complex NLCA + NUCA + 1 by N band matrix stored in band mode. (Input)
NLCA Number of lower codiagonals in A. (Input)
NUCA Number of upper codiagonals in A. (Input)
X Complex vector of length NX. (Input)
Y Complex vector of length NY containing the product A * X if IPATH is equal to 1 and the
product trans(A) * X if IPATH is equal to 2. (Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
NX Length of the vector X. (Input)
NX must be equal to N.
Default: NX = SIZE (X,1).
IPATH Integer flag. (Input)
IPATH = 1 means the product Y = A * X is computed. IPATH = 2 means the product
Y = trans(A) * X is computed, where trans(A) is the transpose of A.
Default: IPATH = 1.
NY Length of vector Y. (Input)
NY must be equal to N.
Default: NY = SIZE (Y,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
If IPATH = 1, MUCBV computes $y = Ax$, where A is a complex band matrix and x and y are complex
vectors. If IPATH = 2, MUCBV computes $y = A^T x$.
Example
Multiply the transpose of a complex band matrix of order 4, with one upper codiagonal and two
lower codiagonals stored in band mode, by a complex vector of length 4. The output vector will be
a complex vector of length 4.
      USE MUCBV_INT
      USE WRCRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N, NLCA, NUCA, NX, NY
      PARAMETER  (LDA=4, N=4, NLCA=2, NUCA=1, NX=4, NY=4)
!
      INTEGER    IPATH
      COMPLEX    A(LDA,N), X(NX), Y(NY)
!
!                                 Set values for A (in band mode)
!                                 A = (  0.0+ 0.0i   1.0+ 2.0i   3.0+ 4.0i   5.0+ 6.0i )
!                                     ( -1.0- 1.0i  -1.0- 1.0i  -1.0- 1.0i  -1.0- 1.0i )
!                                     ( -1.0+ 2.0i  -1.0+ 3.0i  -2.0+ 1.0i   0.0+ 0.0i )
!                                     (  2.0+ 0.0i   0.0+ 2.0i   0.0+ 0.0i   0.0+ 0.0i )
!
!                                 X = ( 3.0 + 4.0i   ...   -2.0 - 1.0i )
!
END
Output
                              y = Ax
              1                2                3                4
(  3.00, -3.00)  (-10.00,  7.00)  (  6.00, -3.00)  ( -6.00, 19.00)
ARBRB
Adds two band matrices, both in band storage mode.
Required Arguments
A N by N band matrix with NLCA lower codiagonals and NUCA upper codiagonals stored in
band mode with dimension (NLCA + NUCA + 1) by N. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
B N by N band matrix with NLCB lower codiagonals and NUCB upper codiagonals stored in
band mode with dimension (NLCB + NUCB + 1) by N. (Input)
NLCB Number of lower codiagonals of B. (Input)
NUCB Number of upper codiagonals of B. (Input)
C N by N band matrix with NLCC lower codiagonals and NUCC upper codiagonals
containing the sum A + B in band mode with dimension (NLCC + NUCC + 1) by N.
(Output)
NLCC Number of lower codiagonals of C. (Input)
NLCC must be at least as large as max(NLCA, NLCB).
NUCC Number of upper codiagonals of C. (Input)
NUCC must be at least as large as max(NUCA, NUCB).
Optional Arguments
N Order of the matrices A, B and C. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
LDC Leading dimension of C exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDC = SIZE (C,1).
FORTRAN 90 Interface
Generic:
CALL ARBRB (A, NLCA, NUCA, B, NLCB, NUCB, C, NLCC, NUCC [,…])
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine ARBRB adds two real matrices stored in band mode, returning a real matrix stored in
band mode.
Example
Add two real matrices of order 4 stored in band mode. Matrix A has one upper codiagonal and one
lower codiagonal. Matrix B has no upper codiagonals and two lower codiagonals. The output
matrix C has one upper codiagonal and two lower codiagonals.
      USE ARBRB_INT
      USE WRRRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, LDC, N, NLCA, NLCB, NLCC, NUCA, NUCB, NUCC
      PARAMETER  (LDA=3, LDB=3, LDC=4, N=4, NLCA=1, NLCB=2, NLCC=2, &
                 NUCA=1, NUCB=0, NUCC=1)
!
      REAL       A(LDA,N), B(LDB,N), C(LDC,N)
!                                 Set values for A (in band mode)
!                                 A = (  0.0   2.0   3.0  -1.0 )
!                                     (  1.0   1.0   1.0   1.0 )
!                                     (  0.0   3.0   4.0   0.0 )
!
!                                 Set values for B (in band mode)
!                                 B = (  3.0   3.0   3.0   3.0 )
!                                     (  1.0  -2.0   1.0   0.0 )
!                                     ( -1.0   2.0   0.0   0.0 )
!
      DATA A/0.0, 1.0, 0.0, 2.0, 1.0, 3.0, 3.0, 1.0, 4.0, -1.0, 1.0, &
          0.0/
      DATA B/3.0, 1.0, -1.0, 3.0, -2.0, 2.0, 3.0, 1.0, 0.0, 3.0, 0.0, &
          0.0/
!                                 Add A and B to obtain C (in band
!                                 mode)
      CALL ARBRB (A, NLCA, NUCA, B, NLCB, NUCB, C, NLCC, NUCC)
!                                 Print results
      CALL WRRRN ('C = A+B', C)
      END
Output
                 C = A+B
          1        2        3        4
1     0.000    2.000    3.000   -1.000
2     4.000    4.000    4.000    4.000
3     1.000    1.000    5.000    0.000
4    -1.000    2.000    0.000    0.000
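The way the codiagonals of A and B line up inside C can be illustrated with a stand-alone sketch that uses only standard Fortran. It assumes that the diagonal with offset k (negative above the main diagonal, positive below it) occupies row NUC + 1 + k of a band array with NUC upper codiagonals, which is consistent with the data and output above. The program is an illustration and is not the ARBRB implementation.

```fortran
program band_add_sketch
  implicit none
  integer, parameter :: n = 4
  integer, parameter :: nlca = 1, nuca = 1, nlcb = 2, nucb = 0
  integer, parameter :: nlcc = 2, nucc = 1
  real :: a(nlca+nuca+1, n), b(nlcb+nucb+1, n), c(nlcc+nucc+1, n)
  integer :: i, k

  ! Band-mode data copied from the ARBRB example.
  data a / 0.0, 1.0, 0.0,  2.0, 1.0, 3.0,  3.0, 1.0, 4.0, -1.0, 1.0, 0.0 /
  data b / 3.0, 1.0, -1.0, 3.0, -2.0, 2.0, 3.0, 1.0, 0.0,  3.0, 0.0, 0.0 /

  c = 0.0
  ! Add each diagonal of A into the matching diagonal of C.
  do k = -nuca, nlca
     c(nucc+1+k, :) = c(nucc+1+k, :) + a(nuca+1+k, :)
  end do
  ! Add each diagonal of B into the matching diagonal of C.
  do k = -nucb, nlcb
     c(nucc+1+k, :) = c(nucc+1+k, :) + b(nucb+1+k, :)
  end do

  do i = 1, nlcc+nucc+1
     print '(4f9.3)', c(i, :)
  end do
end program band_add_sketch
```

The requirement that NLCC and NUCC be at least max(NLCA, NLCB) and max(NUCA, NUCB) ensures that every row index used on the left-hand side exists in C.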
ACBCB
Adds two complex band matrices, both in band storage mode.
Required Arguments
A N by N complex band matrix with NLCA lower codiagonals and NUCA upper codiagonals
stored in band mode with dimension (NLCA + NUCA + 1) by N. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
B N by N complex band matrix with NLCB lower codiagonals and NUCB upper codiagonals
stored in band mode with dimension (NLCB + NUCB + 1) by N. (Input)
NLCB Number of lower codiagonals of B. (Input)
NUCB Number of upper codiagonals of B. (Input)
C N by N complex band matrix with NLCC lower codiagonals and NUCC upper codiagonals
containing the sum A + B in band mode with dimension (NLCC + NUCC + 1) by N.
(Output)
NLCC Number of lower codiagonals of C. (Input)
NLCC must be at least as large as max(NLCA, NLCB).
NUCC Number of upper codiagonals of C. (Input)
NUCC must be at least as large as max(NUCA, NUCB).
Optional Arguments
N Order of the matrices A, B and C. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
LDB Leading dimension of B exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDB = SIZE (B,1).
LDC Leading dimension of C exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDC = SIZE (C,1).
FORTRAN 90 Interface
Generic:
CALL ACBCB (A, NLCA, NUCA, B, NLCB, NUCB, C, NLCC, NUCC [,…])
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine ACBCB adds two complex matrices stored in band mode, returning a complex matrix
stored in band mode.
Example
Add two complex matrices of order 3 stored in band mode. Matrix A has two upper codiagonals
and no lower codiagonals. Matrix B has no upper codiagonals and two lower codiagonals. The
output matrix C has two upper codiagonals and two lower codiagonals.
      USE ACBCB_INT
      USE WRCRN_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, LDB, LDC, N, NLCA, NLCB, NLCC, NUCA, NUCB, NUCC
      PARAMETER  (LDA=3, LDB=3, LDC=5, N=3, NLCA=0, NLCB=2, NLCC=2, &
                 NUCA=2, NUCB=0, NUCC=2)
!
COMPLEX
!
!
!
!
!
!
!
!
!
!
!
!
Output
                       C = A+B
                 1                2                3
1  (  0.00,  0.00)  (  0.00,  0.00)  (  3.00, -2.00)
2  (  0.00,  0.00)  ( -1.00,  3.00)  (  6.00,  0.00)
3  (  4.00,  5.00)  (  9.00, -1.00)  ( 10.00,  0.00)
4  ( -1.00, -4.00)  (  9.00,  3.00)  (  0.00,  0.00)
5  (  2.00, -1.00)  (  0.00,  0.00)  (  0.00,  0.00)
NRIRR
Computes the infinity norm of a real matrix.
Required Arguments
A Real NRA by NCA matrix whose infinity norm is to be computed. (Input)
ANORM Real scalar containing the infinity norm of A. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine NRIRR computes the infinity norm of a real rectangular matrix A. If m = NRA and
n = NCA, then the ∞-norm of A is
$$\|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^{n} |A_{ij}|$$
This is the maximum of the sums of the absolute values of the row elements.
Example
Compute the infinity norm of a 3 × 4 real rectangular matrix.
      USE NRIRR_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NRA
      PARAMETER  (NCA=4, NRA=3)
!
      INTEGER    NOUT
      REAL       A(NRA,NCA), ANORM
!
!                                 Set values for A
!                                 A = (  1.0   0.0   2.0   0.0 )
!                                     (  3.0   4.0  -1.0   0.0 )
!                                     (  2.0   1.0   2.0   1.0 )
!
      DATA A/1.0, 3.0, 2.0, 0.0, 4.0, 1.0, 2.0, -1.0, 2.0, 0.0, 0.0, &
          1.0/
!                                 Compute the infinity norm of A
      CALL NRIRR (A, ANORM)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' The infinity norm of A is ', ANORM
      END
Output
 The infinity norm of A is       8.00000
NR1RR
Computes the 1-norm of a real matrix.
Required Arguments
A Real NRA by NCA matrix whose 1-norm is to be computed. (Input)
ANORM Real scalar containing the 1-norm of A. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine NR1RR computes the 1-norm of a real rectangular matrix A. If m = NRA and n = NCA,
then the 1-norm of A is
$$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{m} |A_{ij}|$$
This is the maximum of the sums of the absolute values of the column elements.
Example
Compute the 1-norm of a 3 × 4 real rectangular matrix.
      USE NR1RR_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NCA, NRA
      PARAMETER  (NCA=4, NRA=3)
!
      INTEGER    NOUT
      REAL       A(NRA,NCA), ANORM
!
!                                 Set values for A
!                                 A = (  1.0   0.0   2.0   0.0 )
!                                     (  3.0   4.0  -1.0   0.0 )
!                                     (  2.0   1.0   2.0   1.0 )
!
      DATA A/1.0, 3.0, 2.0, 0.0, 4.0, 1.0, 2.0, -1.0, 2.0, 0.0, 0.0, &
          1.0/
!                                 Compute the L1 norm of A
      CALL NR1RR (A, ANORM)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' The 1-norm of A is ', ANORM
      END
Output
 The 1-norm of A is       6.00000
NR2RR
Computes the Frobenius norm of a real rectangular matrix.
Required Arguments
A Real NRA by NCA rectangular matrix. (Input)
ANORM Frobenius norm of A. (Output)
Optional Arguments
NRA Number of rows of A. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns of A. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine NR2RR computes the Frobenius norm of a real rectangular matrix A. If m = NRA and
n = NCA, then the Frobenius norm of A is
$$\|A\|_2 = \left[ \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij}^2 \right]^{1/2}$$
Example
Compute the Frobenius norm of a 3 × 4 real rectangular matrix.
      USE NR2RR_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, NCA, NRA
      PARAMETER  (LDA=3, NCA=4, NRA=3)
!
      INTEGER    NOUT
      REAL       A(LDA,NCA), ANORM
!
!                                 Set values for A
!                                 A = (  1.0   0.0   2.0   0.0 )
!                                     (  3.0   4.0  -1.0   0.0 )
!                                     (  2.0   1.0   2.0   1.0 )
!
      DATA A/1.0, 3.0, 2.0, 0.0, 4.0, 1.0, 2.0, -1.0, 2.0, 0.0, 0.0, &
          1.0/
!                                 Compute Frobenius norm of A
      CALL NR2RR (A, ANORM)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' The Frobenius norm of A is ', ANORM
      END
Output
 The Frobenius norm of A is       6.40312
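The NRIRR, NR1RR and NR2RR examples all use the same 3 × 4 matrix, so the three definitions can be compared side by side. The following sketch evaluates them with standard Fortran intrinsics only; it is meant as a cross-check of the formulas and makes no claim about how the IMSL routines are implemented.

```fortran
program matrix_norms_sketch
  implicit none
  integer, parameter :: nra = 3, nca = 4
  real :: a(nra, nca)

  ! Same data as the NRIRR, NR1RR and NR2RR examples (column-major order).
  data a / 1.0, 3.0, 2.0,  0.0, 4.0, 1.0,  2.0, -1.0, 2.0,  0.0, 0.0, 1.0 /

  ! Infinity norm: largest row sum of absolute values.
  print *, 'infinity norm  =', maxval(sum(abs(a), dim=2))
  ! 1-norm: largest column sum of absolute values.
  print *, '1-norm         =', maxval(sum(abs(a), dim=1))
  ! Frobenius norm: square root of the sum of squared entries.
  print *, 'Frobenius norm =', sqrt(sum(a**2))
end program matrix_norms_sketch
```

The row sums are 3, 8 and 6, the column sums are 6, 5, 5 and 1, and the sum of squares is 41, which agrees with the three outputs printed above.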
NR1RB
Computes the 1-norm of a real band matrix in band storage mode.
Required Arguments
A Real (NUCA + NLCA + 1) by N array containing the N by N band matrix in band storage
mode. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
ANORM Real scalar containing the 1-norm of A. (Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine NR1RB computes the 1-norm of a real band matrix A. The 1-norm of a matrix A is
$$\|A\|_1 = \max_{1 \le j \le N} \sum_{i=1}^{N} |A_{ij}|$$
This is the maximum of the sums of the absolute values of the column elements.
Example
Compute the 1-norm of a 4 × 4 real band matrix stored in band mode.
      USE NR1RB_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    LDA, N, NLCA, NUCA
      PARAMETER  (LDA=4, N=4, NLCA=2, NUCA=1)
!
      INTEGER    NOUT
      REAL       A(LDA,N), ANORM
!
!
!
!
!
!
!
!
!
!
!
Output
 The 1-norm of A is       7.00000
NR1CB
Computes the 1-norm of a complex band matrix in band storage mode.
Required Arguments
A Complex (NUCA + NLCA + 1) by N array containing the N by N band matrix in band
storage mode. (Input)
NLCA Number of lower codiagonals of A. (Input)
NUCA Number of upper codiagonals of A. (Input)
ANORM Real scalar containing the 1-norm of A. (Output)
Optional Arguments
N Order of the matrix. (Input)
Default: N = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine NR1CB computes the 1-norm of a complex band matrix A. The 1-norm of a complex
matrix A is
N
Example
Compute the 1-norm of a complex matrix of order 4 in band storage mode.
USE NR1CB_INT
USE UMACH_INT
!
IMPLICIT
NONE
Declare variables
INTEGER
PARAMETER
INTEGER
REAL
COMPLEX
NOUT
ANORM
A(LDA,N)
!
!
!
!
!
!
!
!
!
)
)
)
)
Output
 The 1-norm of A is       19.0000
DISL2
This function computes the Euclidean (2-norm) distance between two points.
Required Arguments
X Vector of length max(N * |INCX|, 1). (Input)
Y Vector of length max(N * |INCY|, 1). (Input)
Optional Arguments
N Length of the vectors X and Y. (Input)
Default: N = SIZE (X,1).
INCX Displacement between elements of X. (Input)
The I-th element of X is X(1 + (I - 1) * INCX) if INCX is greater than or equal to zero
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function DISL2 computes the Euclidean (2-norm) distance between two points x and y. The
Euclidean distance is defined to be
$$\left[ \sum_{i=1}^{N} (x_i - y_i)^2 \right]^{1/2}$$
Example
Compute the Euclidean (2-norm) distance between two vectors of length 4.
      USE DISL2_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    INCX, INCY, N
      PARAMETER  (N=4)
!
      INTEGER    NOUT
      REAL       VAL, X(N), Y(N)
!
!
!
!
!
!
!
2.0
1.0 -3.0 )
!                                 Compute L2 distance
      VAL = DISL2(X,Y)
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' The 2-norm distance is ', VAL
      END
Output
 The 2-norm distance is       6.63325
DISL1
This function computes the 1-norm distance between two points.
Required Arguments
X Vector of length max(N * |INCX|, 1). (Input)
Y Vector of length max(N * |INCY|, 1). (Input)
Optional Arguments
N Length of the vectors X and Y. (Input)
Default: N = SIZE (X,1).
INCX Displacement between elements of X. (Input)
The I-th element of X is X(1 + (I - 1) * INCX) if INCX is greater than or equal to zero
or X(1 + (I - N) * INCX) if INCX is less than zero.
Default: INCX = 1.
INCY Displacement between elements of Y. (Input)
The I-th element of Y is Y(1 + (I - 1) * INCY) if INCY is greater than or equal to zero
or Y(1 + (I - N) * INCY) if INCY is less than zero.
Default: INCY = 1.
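The indexing rule for INCX and INCY can be made concrete with a small sketch in standard Fortran. The array contents and the program name are illustrative assumptions; only the two index formulas are taken from the argument descriptions above.

```fortran
program stride_sketch
  implicit none
  integer, parameter :: n = 3
  real :: x(6)                 ! length max(N*|INCX|, 1) with |INCX| = 2
  integer :: i, incx

  x = (/ 10.0, 20.0, 30.0, 40.0, 50.0, 60.0 /)

  ! INCX >= 0: the I-th element is X(1 + (I-1)*INCX).
  incx = 2
  do i = 1, n
     print *, i, x(1 + (i-1)*incx)     ! visits X(1), X(3), X(5)
  end do

  ! INCX < 0: the I-th element is X(1 + (I-N)*INCX).
  incx = -2
  do i = 1, n
     print *, i, x(1 + (i-n)*incx)     ! visits X(5), X(3), X(1)
  end do
end program stride_sketch
```

A negative increment therefore walks through the same storage locations in reverse order rather than addressing memory before the start of the array.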
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function DISL1 computes the 1-norm distance between two points x and y. The 1-norm
distance is defined to be
$$\sum_{i=1}^{N} |x_i - y_i|$$
Example
Compute the 1-norm distance between two vectors of length 4.
      USE DISL1_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    INCX, INCY, N
      PARAMETER  (N=4)
!
      INTEGER    NOUT
      REAL       VAL, X(N), Y(N)
!
!
!
!
!
!
!
!
2.0
1.0 -3.0 )
!                                 Compute L1 distance
      VAL = DISL1(X,Y)
!
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' The 1-norm distance is ', VAL
      END
Output
 The 1-norm distance is       12.0000
DISLI
This function computes the infinity norm distance between two points.
Required Arguments
X Vector of length max(N * |INCX|, 1). (Input)
Y Vector of length max(N * |INCY|, 1). (Input)
Optional Arguments
N Length of the vectors X and Y. (Input)
Default: N = SIZE (X,1).
INCX Displacement between elements of X. (Input)
The I-th element of X is X(1 + (I - 1) * INCX) if INCX is greater than or equal to zero
or X(1 + (I - N) * INCX) if INCX is less than zero.
Default: INCX = 1.
INCY Displacement between elements of Y. (Input)
The I-th element of Y is Y(1 + (I - 1) * INCY) if INCY is greater than or equal to zero
or Y(1 + (I - N) * INCY) if INCY is less than zero.
Default: INCY = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The function DISLI computes the ∞-norm distance between two points x and y. The ∞-norm
distance is defined to be
$$\max_{1 \le i \le N} |x_i - y_i|$$
Example
Compute the ∞-norm distance between two vectors of length 4.
      USE DISLI_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    INCX, INCY, N
      PARAMETER  (N=4)
!
      INTEGER    NOUT
      REAL       VAL, X(N), Y(N)
!
!
!
!
!
!
!
2.0
1.0 -3.0 )
!                                 Print results
      CALL UMACH (2, NOUT)
      WRITE (NOUT,*) ' The infinity-norm distance is ', VAL
      END
Output
 The infinity-norm distance is       5.00000
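The three distance definitions used by DISL2, DISL1 and DISLI differ only in how the componentwise differences are combined. The sketch below evaluates all three with standard Fortran intrinsics for unit increments. The data values are illustrative assumptions chosen for this sketch and are not taken from the examples above.

```fortran
program distance_sketch
  implicit none
  integer, parameter :: n = 4
  real :: x(n), y(n), d(n)

  ! Illustrative data (an assumption made for this sketch).
  x = (/ 1.0, -1.0, 0.0,  2.0 /)
  y = (/ 4.0,  2.0, 1.0, -3.0 /)

  d = x - y
  ! 2-norm distance: square root of the sum of squared differences.
  print *, '2-norm distance        =', sqrt(sum(d**2))
  ! 1-norm distance: sum of absolute differences.
  print *, '1-norm distance        =', sum(abs(d))
  ! Infinity-norm distance: largest absolute difference.
  print *, 'infinity-norm distance =', maxval(abs(d))
end program distance_sketch
```

For any pair of vectors the infinity-norm distance is no larger than the 2-norm distance, which in turn is no larger than the 1-norm distance.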
VCONR
Computes the convolution of two real vectors.
Required Arguments
X Vector of length NX. (Input)
Y Vector of length NY. (Input)
Z Vector of length NZ containing the convolution Z = X * Y. (Output)
Optional Arguments
NX Length of the vector X. (Input)
Default: NX = SIZE (X,1).
NY Length of the vector Y. (Input)
Default: NY = SIZE (Y,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine VCONR computes the convolution z of two real vectors x and y. Let nx = NX, ny = NY
and nz = NZ. The vector z is defined to be
$$z_j = \sum_{k=1}^{n_x} x_{j-k+1}\, y_k \quad \text{for } j = 1, 2, \ldots, n_z$$
The complex vector u of length nz is defined to be
$$u = (x_1, x_2, \ldots, x_{n_x}, 0, \ldots, 0)$$
The complex vector v, also of length nz, is defined similarly using y. Then, by the Fourier
convolution theorem,
$$\hat{w}_i = \hat{u}_i \hat{v}_i \quad \text{for } i = 1, 2, \ldots, n_z$$
where $\hat{u}$ indicates the Fourier transform of u computed via IMSL routine FFTCF. IMSL routine
FFTCB (see Chapter 6, Transforms) is used to compute the complex vector w from $\hat{w}$. The vector
z is then found by taking the real part of the vector w.
Comments
Workspace may be explicitly provided, if desired, by use of V2ONR/DV2ONR. The reference is
CALL V2ONR (NX, X, NY, Y, NZ, Z, XWK, YWK, ZWK, WK)
Example
In this example, the convolution of a vector x of length 8 and a vector y of length 3 is computed.
The resulting vector z is of length 8 + 3 - 1 = 10. (The vector y is sometimes called a filter.)
      USE VCONR_INT
      USE WRRRN_INT

      IMPLICIT   NONE
      INTEGER    NX, NY, NZ
      PARAMETER  (NX=8, NY=3, NZ=NX+NY-1)
REAL
!
!
!
!
!
!
6.0
7.0
8.0)
!
!
!
Output
                                  Z = X (*) Y
      1       2       3       4       5       6       7       8       9      10
  0.000   0.000   1.000   2.000   3.000   4.000   5.000   6.000   7.000   8.000
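For short vectors the convolution can also be formed directly from its definition, without Fourier transforms. The sketch below does this in standard Fortran. It sums over the ny entries of y and treats any term whose x index falls outside 1, ..., nx as zero; both choices, along with the data values, are assumptions made for this illustration and are not taken from the VCONR source.

```fortran
program direct_convolution_sketch
  implicit none
  integer, parameter :: nx = 8, ny = 3, nz = nx + ny - 1
  real :: x(nx), y(ny), z(nz)
  integer :: j, k

  ! Illustrative data (an assumption made for this sketch).
  x = (/ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 /)
  y = (/ 0.0, 0.0, 1.0 /)

  z = 0.0
  do j = 1, nz
     do k = 1, ny
        ! Terms whose x index lies outside 1..nx contribute nothing.
        if (j-k+1 >= 1 .and. j-k+1 <= nx) z(j) = z(j) + x(j-k+1)*y(k)
     end do
  end do

  print '(10f7.3)', z
end program direct_convolution_sketch
```

With this filter, whose only nonzero entry is the third, the result is x delayed by two positions. The direct sum costs on the order of nx * ny operations, which is why a transform-based approach becomes attractive for long vectors.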
VCONC
Computes the convolution of two complex vectors.
Required Arguments
X Complex vector of length NX. (Input)
Y Complex vector of length NY. (Input)
Z Complex vector of length NZ containing the convolution Z = X * Y. (Output)
Optional Arguments
NX Length of the vector X. (Input)
Default: NX = SIZE (X,1).
NY Length of the vector Y. (Input)
Default: NY = SIZE (Y,1).
NZ Length of the vector Z. (Input)
NZ must be at least NX + NY - 1.
Default: NZ = SIZE (Z,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The routine VCONC computes the convolution z of two complex vectors x and y. Let nx = NX,
ny = NY and nz = NZ. The vector z is defined to be
$$z_j = \sum_{k=1}^{n_x} x_{j-k+1}\, y_k \quad \text{for } j = 1, 2, \ldots, n_z$$
The complex vector u of length nz is defined to be
$$u = (x_1, x_2, \ldots, x_{n_x}, 0, \ldots, 0)$$
The complex vector v, also of length nz, is defined similarly using y. Then, by the Fourier
convolution theorem,
$$\hat{z}_i = \hat{u}_i \hat{v}_i \quad \text{for } i = 1, 2, \ldots, n_z$$
where $\hat{u}$ indicates the Fourier transform of u computed using IMSL routine FFTCF (see
Chapter 6, Transforms). The complex vector z is computed from $\hat{z}$ via IMSL routine FFTCB (see
Chapter 6, Transforms).
Comments
Workspace may be explicitly provided, if desired, by use of V2ONC/DV2ONC. The reference is
CALL V2ONC (NX, X, NY, Y, NZ, Z, XWK, YWK, WK)
Example
In this example, the convolution of a vector x of length 4 and a vector y of length 3 is computed.
The resulting vector z is of length 4 + 3 - 1 = 6. (The vector y is sometimes called a filter.)
      USE VCONC_INT
      USE WRCRN_INT

      IMPLICIT   NONE
      INTEGER    NX, NY, NZ
      PARAMETER  (NX=4, NY=3, NZ=NX+NY-1)
COMPLEX
!
!
!
!
!
!
!
!
!
Output
                             Z = X (*) Y
              1                2                3                4
(  0.00,  0.00)  (  0.00,  0.00)  ( -1.00,  3.00)  ( -1.00,  7.00)
              5                6
( -1.00, 11.00)  ( -1.00, 15.00)
two double precision numbers. An array called the accumulator stores the result of this
multiplication. The result of the multiplication is added to the current contents of the accumulator.
It is also possible to add a double precision number to the accumulator or to store a double
precision approximation in the accumulator.
The mixed double precision arithmetic routines are described below. The accumulator array,
QACC, is a double precision array of length 2. Double precision variables are denoted by DA and
DB. Available operations are:
There are also mixed double complex arithmetic versions of the above routines. The accumulator,
ZACC, is a double precision array of length 4. Double complex variables are denoted by ZA and ZB.
Available operations are:
Initialize a complex accumulator, ZACC ← ZA.
CALL ZQINI (ZA, ZACC)
Example
In this example, the value of 1.0D0/3.0D0 is computed in quadruple precision using Newton's
method. Four iterations of
$$x_{k+1} = x_k + (x_k - a x_k^2)$$
with a = 3 are taken. The error ax - 1 is then computed. The results are accurate to approximately
twice the usual double precision accuracy, as given by the IMSL routine DMACH(4), in the
Reference Material section of this manual. Since DMACH is machine dependent, the actual accuracy
obtained is also machine dependent.
      USE IMSL_LIBRARIES

      IMPLICIT   NONE
      INTEGER    I, NOUT
      DOUBLE PRECISION A, DACC(2), DMACH, ERROR, SACC(2), X(2), X1, X2, EPSQ
!
      CALL UMACH (2, NOUT)
      A = 3.0D0
      CALL DQINI (1.0001D0/A, X)
!
      DO 10  I=1, 4
         X1 = X(1)
         X2 = X(2)
!                                 Compute X + X
         CALL DQADD (X1, X)
         CALL DQADD (X2, X)
!                                 Compute X*X
         CALL DQINI (0.0D0, DACC)
         CALL DQMUL (X1, X1, DACC)
         CALL DQMUL (X1, X2, DACC)
         CALL DQMUL (X1, X2, DACC)
         CALL DQMUL (X2, X2, DACC)
!                                 Compute -A*(X*X)
         CALL DQINI (0.0D0, SACC)
         CALL DQMUL (-A, DACC(1), SACC)
         CALL DQMUL (-A, DACC(2), SACC)
!                                 Compute -A*(X*X) + (X + X)
         CALL DQADD (SACC(1), X)
         CALL DQADD (SACC(2), X)
   10 CONTINUE
!                                 Compute A*X - 1
      CALL DQINI (0.0D0, SACC)
      CALL DQMUL (A, X(1), SACC)
      CALL DQMUL (A, X(2), SACC)
      CALL DQADD (-1.0D0, SACC)
      CALL DQSTO (SACC, ERROR)
!
99999 FORMAT ('
END
Output
 A*X - 1 =   0.6162976D-32 =   0.12500*MACHEPS**2
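The Newton iteration used in this example can be followed in ordinary double precision with a few lines of standard Fortran. The sketch below is not the extended-precision computation shown above; it only illustrates the recurrence with the same starting guess, and the program name is an assumption.

```fortran
program newton_reciprocal_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a, x
  integer :: k

  a = 3.0_dp
  x = 1.0001_dp/a              ! same starting guess as the example
  do k = 1, 4
     ! Newton step for 1/a:  x_{k+1} = x_k + (x_k - a*x_k**2)
     x = x + (x - a*x*x)
     print *, k, a*x - 1.0_dp  ! residual after step k
  end do
end program newton_reciprocal_sketch
```

Because 1 - a x_{k+1} = (1 - a x_k)^2, the residual is squared at every step. In plain double precision it reaches the rounding level after about two steps, which is why the example carries the iterate in the two-word accumulator format instead.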
Routines
Operators
    .x.          Computes matrix-matrix or matrix-vector product
    .tx.         Computes transpose matrix-matrix product
    .xt.         Computes matrix-transpose matrix product
    .hx.         Computes conjugate transpose matrix-matrix product
    .xh.         Computes matrix-conjugate transpose matrix product
    .t.          Computes the transpose of a matrix
    .h.          Computes conjugate transpose of a matrix
    .i.          Computes the inverse matrix
    .ix.         Computes inverse matrix-matrix product
    .xi.         Computes matrix-inverse matrix product
Functions
    CHOL         Computes the Cholesky factorization of a positive-definite,
                 symmetric or self-adjoint matrix
    COND         Computes the condition number of a matrix
    DET          Computes the determinant of a rectangular matrix
    DIAG         Constructs a square diagonal matrix
    DIAGONALS    Extracts the diagonal terms of a matrix
    EIG          Computes the eigenvalue-eigenvector decomposition of an
                 ordinary or generalized eigenvalue problem
    EYE          Creates the identity matrix
    FFT          Computes the Discrete Fourier Transform of one
                 complex sequence
    FFT_BOX      Discrete Fourier Transform of
                 several complex or real sequences
    IFFT         Computes the inverse of the Discrete Fourier
                 Transform of one complex sequence
    IFFT_BOX     Computes the inverse Discrete Fourier Transform of
                 several complex or real sequences
    isNaN        Tests for NaN
    NaN          Returns the value for NaN
    NORM         Computes the norm of an array
Usage Notes
This chapter describes numerical linear algebra, Fourier transforms, random number generation,
and other utility software packaged as defined operations that are executed with a function
notation similar to standard mathematics. The resulting interface alters the way libraries are
presented to the user. Many computations of numerical linear algebra are documented here as
operators and generic functions. A notation is developed reminiscent of matrix algebra. This
allows the Fortran user to express mathematical formulas in terms of operators. The operators can
be used with both dense and sparse matrices.
A comprehensive Fortran module, linear_operators, defines the operators and functions. Its use
provides this simplification. Subroutine calls and the use of type-dependent procedure names are
largely avoided. This makes a rapid development cycle possible, at least for the purposes of
experiments and proof-of-concept. The goal is to provide the Fortran programmer with an
interface, operators, and functions that are useful and succinct. The modules can be used with or
added to existing Fortran programs, but the operators provide a more readable program whenever
they apply. This approach may require more hidden working storage. The size of the executable
program may be larger than alternatives using subroutines. There are applications wherein the
operator and function interface does not have the functionality that is available using subroutine
libraries. To retain greater flexibility, some users will continue to require the techniques of calling
subroutines.
A parallel computation for many of the defined operators and functions has been implemented.
The type of problem solved is a simple one: several independent problems of the same data type
and size. Most of the detailed communication for parallel computation is hidden from the user.
Those functions having this data type computed in parallel are marked in bold type. The section
"Dense Matrix Parallelism Using MPI" gives an introduction on how users should write their
codes to use machines on a network.
A number of examples, in addition to those shown in this document, are supplied in the product
examples directory. The name of the example code is shown in parentheses in the example
heading, for those examples that are included with the product.
sqrt(epsilon(A))*sum(abs(A))/(n*n+1)
If the system is singular, a generalized matrix inverse is computed with the QR factorization code
LIN_SOL_LSQ using this same tolerance. Both row and column pivoting are used. If the system
is singular, an error message will be printed and a Fortran 90 STOP is executed. Users may want
to change this rule. This is illustrated by continuing and not printing the error message. The
following additional source code accomplishes this for all subsequent invocations of the operator
.i.:
allocate(s_inv_options(1))
s_inv_options (1) = skip_error_processing
B = .i. A
Operator_ex36.f90
use linear_operators
implicit none
! This is the equivalent of Example 4 for LIN_GEIG_GEN (using operators).
integer, parameter :: n=32
real(kind(1d0)), parameter :: one=1d0, zero=0d0
real(kind(1d0)) a(n,n), b(n,n), bta(n), err
complex(kind(1d0)) alpha(n), v(n,n)
! Generate random matrices for both A and B.
A = rand(A); B = rand(B)
! Set the option, a larger tolerance than default for lin_sol_lsq.
allocate(d_eig_options(6))
d_eig_options(1) = options_for_lin_geig_gen
d_eig_options(2) = 4
d_eig_options(3) = d_lin_geig_gen_for_lin_sol_lsq
d_eig_options(4) = 2
d_eig_options(5) = d_options(d_lin_sol_lsq_set_small,&
sqrt(epsilon(one))*norm(B,1))
d_eig_options(6) = d_lin_sol_lsq_no_sing_mess
! Compute the generalized eigenvalues.
alpha = EIG(A, B=B, D=bta, W=V)
! Check the residuals.
err = norm((A .x. V .x. diag(bta)) - (B .x. V .x. diag(alpha)),1)/&
(norm(A,1)*norm(bta,1)+norm(B,1)*norm(alpha,1))
Note that in this example one first allocates the array by which the user will pass the new options
for EIG to use. This array is named d_eig_options in accordance with the name of the
unallocated option array specified in the documentation for EIG. A size of 6 is specified because a
total of six options must be passed to EIG to accomplish the resetting of the singular value
tolerance and to turn off the printing of the error message when the matrix is singular. The first
entry of d_eig_options specifies which of the options for EIG will be set. The next entry
designates the number of entries which follows that apply to options_for_lin_geig_gen.
The third entry specifies the option value of LIN_GEIG_GEN to be set,
d_lin_geig_gen_for_lin_sol_lsq. The fourth entry specifies the number of entries that
follow which apply to LIN_SOL_LSQ. Finally, the fifth and sixth entries set the two LIN_SOL_LSQ
options that we desire.
For a detailed description of MPI Capability see Dense Matrix Parallelism Using MPI.
This section is concerned with methods for computing with dense matrices. Consider a Fortran 90
code fragment that solves a linear system of algebraic equations, $Ay = b$, then computes the
residual $r = b - Ay$. A standard mathematical notation is often used to write the solution,
$$y = A^{-1} b$$
A user thinks: "matrix and right-hand side yields solution." The code shows the computation of
this mathematical solution using a defined Fortran operator ".ix.", and random data obtained
with the function rand. This operator is read "inverse matrix times." The residuals are computed
with another defined Fortran operator ".x.", read "matrix times vector." Once a user understands
the equivalence of a mathematical formula with the corresponding Fortran operator, it is possible
to write this program with little effort. The last line of the example before end is discussed below.
USE linear_operators
integer,parameter :: n=3; real A(n,n), y(n), b(n), r(n)
A=rand(A); b=rand(b); y = A .ix. b
r = b - (A .x. y ) ! Parentheses are needed
end
The IMSL Fortran Numerical Library provides additional lower-level software that implements
the operation .ix., the function rand, matrix multiply .x., and others not used in this
example. Standard matrix products and inverse operations of matrix algebra are shown in the
following table:
Defined Array Operation                Matrix Operation    Alternative in Fortran 90
A .x. B                                $AB$                matmul(A, B)
.i. A                                  $A^{-1}$            LIN_SOL_GEN, LIN_SOL_LSQ
.t. A, .h. A                           $A^T$, $A^H$        transpose(A), conjg(transpose(A))
A .ix. B                               $A^{-1}B$           LIN_SOL_GEN, LIN_SOL_LSQ
B .xi. A                               $BA^{-1}$           LIN_SOL_GEN, LIN_SOL_LSQ
A .tx. B, or (.t. A) .x. B; A .hx. B   $A^T B$, $A^H B$    matmul(transpose(A), B),
                                                           matmul(conjg(transpose(A)), B)
B .xt. A; B .xh. A                     $BA^T$, $BA^H$      matmul(B, transpose(A)),
                                                           matmul(B, conjg(transpose(A)))
The IMSL operators apply generically to all standard precisions and floating-point data types
(real and complex) and to objects that are broader in scope than arrays with a fixed number of
dimensions. For example, the matrix product ".x." applies to matrix times vector and matrix times
matrix represented as Fortran 90 arrays. It also applies to "independent matrix products." For
this, use the notion: a "box" of problems to refer to independent linear algebra computations, of the
same kind and dimension, but different data. The "racks" of the box are the distinct problems. In
terms of Fortran 90 arrays, a rank-3, assumed-shape array is the data structure used for a box. The
first two dimensions are the data for a matrix; the third dimension is the rack number. Each
problem is independent of other problems in consecutive racks of the box. We use parallelism of
an underlying network of processors, and MPI, when computing these disjoint problems.
In addition to the operators .ix., .xi., .i., and .x., additional operators .t., .h., .tx.,
.hx., .xt., and .xh. are provided for complex matrices. Since the transpose matrix is defined
for complex matrices, this meaning is kept for the defined operations. In order to write one defined
operation for both real and complex matrices, use the conjugate-transpose in all cases. This will
result in only real operations when the data arrays are real.
For sums and differences of vectors and matrices, the intrinsic array operations + and - are
available. It is not necessary to have separate defined operations. A parsing rule in Fortran 90
states that a defined operation involving two quantities has a lower precedence than
any intrinsic operation. This explains the parentheses around the sub-expression A .x. y in the
next-to-last line of the example. Users are advised to always include
parentheses around array expressions that are mixed with defined operations, or whenever there is
possible confusion without them. The next-to-last line of the example computes the
residual associated with the solution, namely $r = b - Ay$. Ideally, this residual is zero when the
system has a unique solution. It will be computed as a non-zero vector due to rounding errors and
conditioning of the problem.
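The precedence rule can be seen with a small stand-alone sketch that does not use the IMSL library. The module name precedence_demo and the operator .times. are hypothetical names introduced only for this illustration; the operator simply multiplies two integers so the effect of the parsing rule is easy to check by hand.

```fortran
module precedence_demo
  implicit none
  ! Hypothetical defined binary operator, used only for this illustration.
  interface operator(.times.)
    module procedure int_times
  end interface
contains
  pure function int_times(a, b) result(c)
    integer, intent(in) :: a, b
    integer :: c
    c = a*b
  end function int_times

  ! Without parentheses the intrinsic + binds first, because a defined
  ! binary operator has the lowest precedence: (2 + 3) .times. 4
  function without_parens() result(v)
    integer :: v
    v = 2 + 3 .times. 4
  end function without_parens

  ! With parentheses the defined operation is evaluated first: 2 + (3*4)
  function with_parens() result(v)
    integer :: v
    v = 2 + (3 .times. 4)
  end function with_parens
end module precedence_demo
```

Calling without_parens() and with_parens() from a program that uses the module gives different values, which is the reason the residual in the example is written as b - (A .x. y).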
For a detailed description of MPI Capability see Dense Matrix Parallelism Using MPI.
Several decompositions and functions required for numerical linear algebra follow. The
convention of enclosing optional quantities in brackets, [ ] is used. The functions that use MPI
for parallel execution of the box data type are marked in bold.
Defined Array Functions                          Matrix Operation
S=SVD(A [,U=U, V=V])                             $A = USV^T$
E=EIG(A [… V=V, W=W])
R=CHOL(A)                                        $A = R^T R$
Q=ORTH(A [,R=R])                                 $(A = QR)$, $Q^T Q = I$
U=UNIT(A)                                        $[u_1, \ldots] = [a_1/\|a_1\|, \ldots]$
F=DET(A)                                         $\det(A)$ = determinant
K=RANK(A)                                        $\mathrm{rank}(A)$ = rank
P=NORM(A[,[type=]i])                             $p = \|A\|_1 = \max_j \sum_{i=1}^{m} |a_{ij}|$
                                                 $p = \|A\|_{\mathrm{huge}(1)} = \max_i \sum_{j=1}^{n} |a_{ij}|$
C=COND(A)                                        $s_1 / s_{\mathrm{rank}(A)}$
Z=EYE(N)                                         $Z = I_N$
A=DIAG(X)                                        $A = \mathrm{diag}(x_1, \ldots)$
X=DIAGONALS(A)                                   $x = (a_{11}, \ldots)$
Y=FFT(X,[WORK=W]); X=IFFT(Y,[WORK=W])
Y=FFT_BOX(X,[WORK=W]); X=IFFT_BOX(Y,[WORK=W])
A=RAND(A)
L=isNaN(A)
In certain functions, some optional arguments are inputs while other optional arguments are outputs.
To illustrate the box SVD function, a code is given that computes the singular
value decomposition and the reconstruction of the random matrix box, A, using the computed
factors, $R = USV^T$. Mathematically $R = A$, but this will be true only approximately, due to
rounding errors. The value units_of_error $= \|A - R\| / (\|A\|\,\varepsilon)$ shows the merit of this
approximation.
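A minimal sketch of such an error measure for a single rack is shown below, using intrinsic Fortran only. The module name svd_error_check, the use of the Frobenius norm, and the scaling by the machine epsilon of double precision are assumptions made for this illustration; the library's own example may use a different norm.

```fortran
module svd_error_check
  implicit none
contains
  ! Relative reconstruction error of R against A, expressed in units of
  ! machine epsilon. The Frobenius norm is an assumption of this sketch.
  function units_of_error(A, R) result(u)
    real(kind(1d0)), intent(in) :: A(:,:), R(:,:)
    real(kind(1d0)) :: u
    u = sqrt(sum((A - R)**2)) / (sqrt(sum(A**2))*epsilon(1d0))
  end function units_of_error
end module svd_error_check
```

A value of modest size, say a small multiple of the matrix dimension, suggests that the reconstruction agrees with A to about working precision.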
General Remarks
The central theme we use for the computing functions of the box data type is that of delivering
results to a distinguished node of the machine. One of the design goals was to shield much of the
complexity of distributed computing from the user.
The nodes are numbered by their ranks. Each node has rank value MP_RANK. There are
MP_NPROCS nodes, so MP_RANK = 0, 1,...,MP_NPROCS-1. The root node has
MP_RANK = 0. Most of the elementary MPI material is found in Gropp, Lusk, and Skjellum
(1994) and Snir, Otto, Huss-Lederman, Walker, and Dongarra (1996). Although IMSL Fortran
Numerical Library users are for the most part shielded from the complexity of MPI, it is desirable
for some users to learn this important topic. Users should become familiar with any referenced
MPI routines and the documentation of their usage. MPI routines are not discussed here, because
they are best covered in the references above.
The IMSL Fortran Numerical Library algorithm for allocating the racks of the box to the
processors consists of creating a schedule for the processors, followed by communication and
execution of this schedule. The efficiency may be improved by using the nodes according to a
specific priority order. This order can reflect information such as a powerful machine on the
network other than the user's workstation, or even transient network behavior. The IMSL Fortran
Numerical Library allows users to define this order, but a default order is provided. A setup
function establishes an order based on timing matrix products of a size given by the user. See
Parallel Example 4 for an illustration of this usage.
When the function MP_SETUP() is called with no arguments, the following events occur:
If MPI has not been initialized, it is first initialized. This step uses the routines
MPI_Initialized() and possibly MPI_Init(). Users who choose not to call
MP_SETUP() must make the required initialization call before using any IMSL Fortran
Numerical Library code that relies on MPI for its execution. If the user's code calls an IMSL
Fortran Numerical Library function utilizing the box data type and MPI has not been
initialized, then the computations are performed on the root node. The only MPI routine
always called in this context is MPI_Initialized(). The name MP_SETUP is pushed onto
the subprogram or call stack.
The integers MP_RANK and MP_NPROCS are, respectively, the node's rank and the number of
nodes in the communicator, MP_LIBRARY_WORLD. Their values require the routines
MPI_Comm_size() and MPI_Comm_rank(). The default values are important when MPI
is not initialized and a box data type is computed. In this case the root node is the only node
and it will do all the work. No calls to MPI communication routines are made when
MP_NPROCS = 1 when computing the box data type functions. A program can temporarily
assign this value to force box data type computation entirely at the root node. This is
desirable for problems where using many nodes would be less efficient than using the root
node exclusively.
The array MPI_NODE_PRIORITY(:) is unallocated unless the user allocates it. The IMSL
Fortran Numerical Library codes use this array for assigning tasks to processors, if it is
allocated. If it is not allocated, the default priority of the nodes is
(0,1,...,MP_NPROCS-1). Use of the function call MP_SETUP(N) allocates the array, as
explained below. Once the array is allocated, its size is MP_NPROCS. The contents of the array
are a permutation of the integers 0,...,MP_NPROCS-1. Nodes appearing at the start of the
list are used first for parallel computing. A node other than the root can avoid any
computing, except receiving the schedule, by setting the value MPI_NODE_PRIORITY(I)< 0.
This means that node |MPI_NODE_PRIORITY(I)| will be sent the task schedule but will
not perform any significant work as part of box data type function evaluations.
The LOGICAL flag MPI_ROOT_WORKS designates whether or not the root node participates in
the major computation of the tasks. The root node communicates with the other nodes to
complete the tasks but can be designated to do no other work. Since there may be only one
processor, this flag has the default value .TRUE., assuring that one node exists to do work.
When more than one processor is available users can consider assigning
MPI_ROOT_WORKS=.FALSE. This is desirable when the alternate nodes have equal or greater
computational resources compared with the root node. Parallel Example 4 illustrates this
usage. A single problem is given a box data type, with one rack. The computing is done at
the node, other than the root, with highest priority. This example requires more than one
processor since the root does no work.
When the generic function MP_SETUP(N) is called, where N is a positive integer, a call to
MP_SETUP() is first made, using no argument. Use just one of these calls to MP_SETUP(). This
initializes the MPI system and the other parameters described above. The array
MPI_NODE_PRIORITY(:) is allocated with size MP_NPROCS. Then DOUBLE PRECISION matrix
products C = AB, where A and B are N by N matrices, are computed at each node and the elapsed
time is recorded. These elapsed times are sorted and the contents of MPI_NODE_PRIORITY(:)
are permuted in accordance with the shortest times yielding the highest priority. All the nodes in
the communicator MP_LIBRARY_WORLD are timed. The array MPI_NODE_PRIORITY(:) is then
broadcast from the root to the remaining nodes of MP_LIBRARY_WORLD using the routine
MPI_Bcast(). Timing matrix products to define the node priority is relevant because the effort to
compute C is comparable to that of many linear algebra computations of similar size. Users are
free to define their own node priority and broadcast the array MPI_NODE_PRIORITY(:) to the
alternate nodes in the communicator.
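The ordering step described above, in which the shortest elapsed time yields the highest priority, can be sketched without MPI as follows. The module and function names are hypothetical, and the elapsed times are assumed to have been gathered already into an array indexed by node rank; this is not the library's implementation.

```fortran
module node_priority_sketch
  implicit none
contains
  ! Return the node ranks 0..nprocs-1 ordered so that the node with the
  ! shortest elapsed time comes first. elapsed(r) is the time at rank r.
  function priority_from_times(elapsed) result(priority)
    real(kind(1d0)), intent(in) :: elapsed(0:)
    integer :: priority(size(elapsed))
    integer :: i, j, tmp
    ! Start from the default order (0, 1, ..., nprocs-1).
    do i = 1, size(priority)
      priority(i) = i - 1
    end do
    ! Insertion sort of the ranks, keyed by elapsed time. Ties keep the
    ! default order.
    do i = 2, size(priority)
      tmp = priority(i)
      j = i - 1
      do while (j >= 1)
        if (elapsed(priority(j)) <= elapsed(tmp)) exit
        priority(j+1) = priority(j)
        j = j - 1
      end do
      priority(j+1) = tmp
    end do
  end function priority_from_times
end module node_priority_sketch
```

A user who defines a custom order in this way would still need to broadcast the resulting array to the other nodes, as the text notes.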
To print any IMSL Fortran Numerical Library error messages that have occurred at any node, and
to finalize MPI, use the function call MP_SETUP('Final'). The case of the string 'Final' is not
important. Any pending error messages will be discarded after printing on the root node. This is
triggered by popping the name MP_SETUP from the subprogram stack or returning to Level 1 in
the stack. Users can obtain error messages by popping the stack to Level 1 and still continuing
with MPI calls. This requires executing call e1pop('MP_SETUP'). To continue on after
summarizing errors, execute call e1psh('MP_SETUP'). More details about the error
processor are found in the Reference Material chapter of this manual.
Messages are printed by nodes from largest rank to smallest, which is the root node. The
routine MPI_Finalize() is called within MP_SETUP('Final'), which shuts down MPI. After
MPI_Finalize() is called, the value of MP_NPROCS = 0. This flags that MPI has been
initialized and terminated. It cannot be initialized again in the same program unit execution. No
MPI routine is defined when MP_NPROCS has this value.
Using Processors
There are certain pitfalls to avoid when using IMSL Fortran Numerical Library and box data types
as implemented with MPI. A fundamental requirement is to allow all processors to participate in
parts of the program where their presence is needed for correctness. It is incorrect to have a
program unit that restricts nodes from executing a block of code required when computing with
the box data type. On the other hand it is appropriate to restrict computations with rank-2 arrays
to the root node. This is not required, but the results for the alternate nodes are normally
discarded. This will avoid gratuitous error messages that may appear at alternate nodes.
Observe that only the root has a correct result for a box data type function. Alternate nodes have
the constant value one as the result. The reason for this is that during the computation of the
functions, sub-problems are allocated to the alternate nodes by the root, but for only the root to
utilize the result. If a user needs a value at the other nodes, then the root must send it to the nodes.
See Parallel Example 3 for an illustration of this usage. Convergence information is computed at
the root node and broadcast to the others. Without this step some nodes would not terminate the
loop even when corrections at the root become small. This would cause the program to be
incorrect.
The definition of the sparse matrices starts with a triplet consisting of the row and column indices
and a value at that entry. By setting a flag in the derived type SLU_Options, repeated values
may be accumulated to yield a value that is the sum of all triplets for that matrix entry. A diagram
for constructing a single precision sparse 10000 × 10000 matrix, H, is illustrated with the
pseudocode fragment:
Use linear_operators
Integer I, J; Real(Kind(1.e0)) value, x(10000)
Type(s_sparse) A
Type(s_hbc_sparse) H
1.
2.
3.
x = H .ix. x.
A basic feature is that there are four sparse matrix derived types, Types (s_hbc_sparse),
(d_hbc_sparse), (c_hbc_sparse), and (z_hbc_sparse). These respectively handle single, double,
complex and double-complex data. The defined operators work with a sparse matrix and a
corresponding dense array of the same precision and data type. There is no mixing of data types
such as a sparse double precision matrix multiplied by a single precision vector. To accommodate
that case an intermediate double precision quantity will be created that ascends the single precision
vector to a double precision vector. The table below shows the operations that are valid with
sparse matrix types.
Mathematical Operation                                    Operation Notation  Input Terms                                  Output Terms
$y = H^{-1}x$                                             y = H .ix. x        H n×n sparse, x(1:k), k ≥ n                  y(1:n)
$y = x^T H^{-1} \equiv H^{-T}x$                           y = x .xi. H        H n×n sparse, x(1:k), k ≥ n                  y(1:n)
$Y = H^{-1}X_{n \times r}$                                Y = H .ix. X        H n×n sparse, X(1:k,1:r), k ≥ n              Y(1:n,1:r)
$Y = X_{r \times n}H^{-1} \equiv (H^{-T}X^T)^T$           Y = X .xi. H        H n×n sparse, X(1:r,1:k), k ≥ n              Y(1:r,1:n)
$y = Hx$                                                  y = H .x. x         H m×n sparse, x(1:k), k ≥ n                  y(1:m)
$y = x^T H \equiv H^T x$                                  y = x .x. H         H m×n sparse, x(1:k), k ≥ m                  y(1:n)
$Y = HX_{n \times r}$                                     Y = H .x. X         H m×n sparse, X(1:k,1:r), k ≥ n              Y(1:m,1:r)
$Y = X_{r \times m}H$                                     Y = X .x. H         H m×n sparse, X(1:r,1:k), k ≥ m              Y(1:r,1:n)
$K = H^T$                                                 K = .t. H           H m×n sparse                                 K n×m sparse
$K = H^H = \bar{H}^T$                                     K = .h. H           H m×n sparse, complex                        K n×m sparse
$y = H^T x$                                               y = H .tx. x        H m×n sparse, x(1:k), k ≥ m                  y(1:n)
$Y = H^T X_{m \times r}$                                  Y = H .tx. X        H m×n sparse, X(1:k,1:r), k ≥ m              Y(1:n,1:r)
$y = x^T H$                                               y = x .tx. H        H m×n sparse, x(1:k), k ≥ m                  y(1:n)
$Y = X^T_{r \times m}H$                                   Y = X .tx. H        H m×n sparse, X(1:k,1:r), k ≥ m              Y(1:r,1:n)
$y = Hx^T$                                                y = H .xt. x        H m×n sparse, x(1:k), k ≥ n                  y(1:m)
$Y = HX^T_{n \times r}$                                   Y = H .xt. X        H m×n sparse, X(1:k,1:r), k ≥ n              Y(1:m,1:r)
$y = xH^T$                                                y = x .xt. H        H m×n sparse, x(1:k), k ≥ n                  y(1:m)
$Y = X_{r \times n}H^T$                                   Y = X .xt. H        H m×n sparse, X(1:r,1:k), k ≥ n              Y(1:r,1:m)
$y = H^H x = \bar{H}^T x$                                 y = H .hx. x        H m×n sparse (see note), x(1:k), k ≥ m       y(1:n)
$Y = H^H X_{m \times r} = \bar{H}^T X_{m \times r}$       Y = H .hx. X        H m×n sparse, X(1:k,1:r), k ≥ m              Y(1:n,1:r)
$y = x^H H = \bar{x}^T H$                                 y = x .hx. H        H m×n sparse, x(1:k), k ≥ m                  y(1:n)
$Y = X^H_{r \times m}H = \bar{X}^T_{r \times m}H$         Y = X .hx. H        H m×n sparse, X(1:k,1:r), k ≥ m              Y(1:r,1:n)
$y = Hx^H = H\bar{x}^T$                                   y = H .xh. x        H m×n sparse, x(1:k), k ≥ n                  y(1:m)
$Y = HX^H_{n \times r} = H\bar{X}^T_{n \times r}$         Y = H .xh. X        H m×n sparse, X(1:k,1:r), k ≥ n              Y(1:m,1:r)
$y = xH^H = x\bar{H}^T$                                   y = x .xh. H        H m×n sparse, x(1:k), k ≥ n                  y(1:m)
$Y = X_{r \times n}H^H = X_{r \times n}\bar{H}^T$         Y = X .xh. H        H m×n sparse, X(1:r,1:k), k ≥ n              Y(1:r,1:m)
Note: The operators .hx. and .xh. apply to sparse complex matrices only. For real matrices use
the .tx. and .xt. operators.
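The row for y = x .x. H in the table states that a vector on the left is treated as a row vector, so the result equals the transpose of H applied to x. The following sketch mirrors that row with a dense array and the intrinsic matmul. The module and function names are hypothetical, and the choice to use only the first m entries of x when k > m is this sketch's reading of the input condition "x(1:k), k ≥ m", not a documented behavior.

```fortran
module vec_times_matrix
  implicit none
contains
  ! Dense analogue of y = x .x. H for H of size m by n and x of
  ! length k >= m. Only x(1:m) takes part (an assumption of the sketch).
  function x_times_H(x, H) result(y)
    real(kind(1d0)), intent(in) :: x(:), H(:,:)
    real(kind(1d0)) :: y(size(H,2))
    integer :: m
    m = size(H,1)
    ! matmul with a rank-1 left argument computes the row-vector product.
    y = matmul(x(1:m), H)
  end function x_times_H
end module vec_times_matrix
```

The same values are obtained from matmul(transpose(H), x(1:m)), which is the equivalence stated in the first column of the table.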
Additionally, type (d_entry), type (c_entry), and type (z_entry) are defined similarly. These
support double precision, complex, and double-complex precision and types.
Thus for a sparse matrix A, the entry at the intersection of row irow and column jcol is the
scalar value. We define a sparse matrix representation in terms of a collection of triplets. This is
a convenient way for a user to define a sparse matrix. This representation is used to define the
matrix entries in a user's program using overloaded assignment. There is no implied order on the
collection of triplets that define this sparse matrix. Our experience shows that for writing
application code the technique of using triplets to define the matrix entries is convenient and
provides a workable transition from mathematical definitions of the entries to computer code.
Also note that there is generally no need for the programmer to allocate the components of a
matrix of type s_sparse when using the overloaded assignment: s_sparse = s_entry. The
software handles this detail by reallocating and expanding those components of the s_sparse
matrix as required. (For this task we use the Fortran 2003 intrinsic subroutine move_alloc(),
when it is available. This routine provides an efficient way to perform a reallocation.) The
amount reallocated is controlled by an expansion factor that is a component of the derived type
SLU_options.
type s_sparse
integer :: mrows = 0
integer :: ncols = 0
integer :: numnz = 0
integer, allocatable, dimension(:) :: irow
integer, allocatable, dimension(:) :: jcol
real(kind(1.e0)), allocatable, dimension(:) :: value
type (SLU_options) options
end type
When performing matrix computations we use the Harwell-Boeing column-oriented derived type.
The row indices, for each column, are unique and increasing. The values in the
colptr(1:ncols) component mark the start of the row indices and corresponding matrix entries
for that column. The value colptr(ncols+1)-1 will equal the value numnz after the matrix is
defined with non-zero entries. The row indices for each column are in array irow(:). They are
unique and sorted into increasing order.
type s_hbc_sparse
integer :: mrows = 0
integer :: ncols = 0
integer :: numnz = 0
integer, allocatable, dimension(:) :: irow
integer, allocatable, dimension(:) :: colptr
real(kind(1.e0)), allocatable, dimension(:) :: value
type(SLU_options) options
end type
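The relationship between the triplet form and the column-oriented form can be sketched with plain arrays. The routine below is an illustration only, not the library's overloaded assignment: the names triplet_to_hbc and to_hbc are hypothetical, the triplets are assumed to contain no repeated (row, column) pairs, and the arrays stand in for the colptr, irow, and value components shown above.

```fortran
module triplet_to_hbc
  implicit none
contains
  ! Convert unordered triplets (irow, jcol, val) into column-oriented
  ! storage with increasing row indices inside each column.
  subroutine to_hbc(ncols, irow, jcol, val, colptr, rowind, values)
    integer, intent(in) :: ncols, irow(:), jcol(:)
    real, intent(in) :: val(:)
    integer, allocatable, intent(out) :: colptr(:), rowind(:)
    real, allocatable, intent(out) :: values(:)
    integer, allocatable :: next(:)
    integer :: nz, j, k, p, q, itmp
    real :: vtmp

    nz = size(val)
    allocate (colptr(ncols+1), rowind(nz), values(nz), next(ncols))
    ! Count the entries in each column; the count for column j is held
    ! temporarily in colptr(j+1).
    colptr = 0
    do k = 1, nz
      colptr(jcol(k)+1) = colptr(jcol(k)+1) + 1
    end do
    ! A running sum turns the counts into 1-based column start positions.
    colptr(1) = 1
    do j = 1, ncols
      colptr(j+1) = colptr(j) + colptr(j+1)
    end do
    ! Scatter each triplet into the next free slot of its column.
    next = colptr(1:ncols)
    do k = 1, nz
      j = jcol(k)
      rowind(next(j)) = irow(k)
      values(next(j)) = val(k)
      next(j) = next(j) + 1
    end do
    ! Insertion sort of the row indices within each column.
    do j = 1, ncols
      do p = colptr(j) + 1, colptr(j+1) - 1
        itmp = rowind(p)
        vtmp = values(p)
        q = p - 1
        do while (q >= colptr(j))
          if (rowind(q) <= itmp) exit
          rowind(q+1) = rowind(q)
          values(q+1) = values(q)
          q = q - 1
        end do
        rowind(q+1) = itmp
        values(q+1) = vtmp
      end do
    end do
  end subroutine to_hbc
end module triplet_to_hbc
```

On return, colptr(ncols+1)-1 equals the number of non-zero entries, which is the property stated for the Harwell-Boeing type.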
Type SLU_options
Sequence
Integer :: unique = 1 ! Each new entry is unique IMSL
Integer :: Accumulate = 0
! Accumulate or assemble duplicated entries in
! a ?_sparse matrix. This flag is checked
! when executing an overloaded assignment
! with a Harwell-Boeing = ?_sparse matrix.
! The default is not to accumulate (0)
! Assign the value 1 to accumulate.
Integer :: handle(2) = 0
SuperLU is used to support the defined operations .ix. and .xi., and the condition number
function, cond(). SuperLU is well-tested. Distributed and threaded versions are available but
these are not used here in our software at present.
Overloaded Assignments
A natural way to define a sparse matrix is in terms of its triplets. The basic tool used here to
define all the non-zero entries is overloaded assignment. Fortran 90, and later revisions of the
standard, support a hidden subroutine call, packaged in a module, when an assignment is
executed between differing derived types. Thus if a Fortran program has a declaration
type(s_sparse) A, then the overloaded assignment statement
A = s_entry(I, J, value)
has the effect of calling subroutines that result in joining the matrix entry value at the intersection
of row I and column J. The components of A are managed to hold any number of values. The
number of rows, columns and non-zero values are updated as new triplets are assigned. Also the
arrays that hold the triplets are re-allocated and expanded, as required, to hold newly assigned
triplets.
The code snippet for this operation, and others that follow, will require use of the module
linear_operators. If new space is required in the assignment, a reallocation of the
components of the matrix A will occur. The user does not have to manage the details.
Use linear_operators
Type(s_sparse) A
H = 0
Sparse = Dense
The non-zero entries of the dense array are converted to a Harwell-Boeing sparse matrix. As a
first step any allocated components are cleared and then allocated as needed to hold the non-zero
values of the dense array. The specific dimensions of array D are arbitrary.
Use linear_operators
Type(s_hbc_sparse) H
Integer, parameter :: M=1000, N=1000
Real (kind(1.e0)) D(M,N)
! Define entries of D here.
H = D
Dense = Sparse
For some applications it is convenient to expand a sparse matrix into a dense matrix. The specific
dimensions of array D are arbitrary.
Use linear_operators
Type(s_hbc_sparse) H
Integer, parameter :: M=1000, N=1000
Real (kind(1.e0)) D(M,N)
! Define entries of H here.
D = H
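What the expansion to a dense array amounts to can be sketched with plain arrays that stand in for the colptr, irow, and value components of the Harwell-Boeing type. The module and function names are hypothetical, and this is an illustration of the storage layout rather than the library's assignment routine.

```fortran
module hbc_expand
  implicit none
contains
  ! Expand column-oriented sparse storage into a dense mrows by ncols
  ! array. Entries that are not stored are zero.
  function hbc_to_dense(mrows, ncols, colptr, irow, val) result(D)
    integer, intent(in) :: mrows, ncols, colptr(:), irow(:)
    real, intent(in) :: val(:)
    real :: D(mrows, ncols)
    integer :: j, p
    D = 0.0
    do j = 1, ncols
      ! Entries of column j occupy positions colptr(j) .. colptr(j+1)-1.
      do p = colptr(j), colptr(j+1) - 1
        D(irow(p), j) = val(p)
      end do
    end do
  end function hbc_to_dense
end module hbc_expand
```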
Scalar = s_hbc_entry(Sparse, I, J)
This assignment gets the value at the intersection of row I and column J of the Harwell-Boeing
sparse matrix. There must be type agreement with the function and sparse matrix type. Use a
prefix of d_, c_, or z_ for double, complex, or double complex values.
Use linear_operators
Type(s_hbc_sparse) H
Real (kind(1.e0)) value
! Define entries of H, I and J here.
value = s_hbc_entry(H, I, J)
.x.
Required Operands
A Left operand matrix or vector. This is an array of rank 1, 2, or 3. It may be real, double,
complex, double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Note that A and B cannot both be ?_hbc_sparse.
B Right operand matrix or vector. This is an array of rank 1, 2, or 3. It may be real, double,
complex, double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Note that A and B cannot both be ?_hbc_sparse.
FORTRAN 90 Interface
A .x. B
Description
Computes the product of matrix or vector A and matrix or vector B. The results are in a precision
and data type that ascends to the most accurate or complex operand.
Rank three operation is defined as follows:
do i = 1, min(size(A,3), size(B,3))
X(:,:,i) = A(:,:,i) .x. B(:,:,i)
end do
.x. can be used with either dense or sparse matrices. It is MPI capable for dense matrices only.
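For dense real data, the rank-3 definition above can be mimicked with the intrinsic matmul, one rack at a time. This serial sketch uses hypothetical names (box_product, box_matmul) and is not the library's MPI implementation; it only shows the shape rules implied by the loop, including the use of the smaller of the two rack counts.

```fortran
module box_product
  implicit none
contains
  ! Rack-by-rack matrix product of two boxes (rank-3 arrays).
  function box_matmul(A, B) result(X)
    real(kind(1d0)), intent(in) :: A(:,:,:), B(:,:,:)
    real(kind(1d0)) :: X(size(A,1), size(B,2), min(size(A,3), size(B,3)))
    integer :: i
    ! Each rack is an independent problem; the third index is the rack.
    do i = 1, size(X,3)
      X(:,:,i) = matmul(A(:,:,i), B(:,:,i))
    end do
  end function box_matmul
end module box_product
```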
Examples
Dense Matrix Example (operator_ex03.f90)
use linear_operators
implicit none
! This is the equivalent of Example 3 for LIN_SOL_GEN using operators.
integer, parameter :: n=32
real(kind(1e0)) :: one=1e0, zero=0e0, A(n,n), b(n), x(n)
real(kind(1e0)) change_new, change_old
real(kind(1d0)) :: d_zero=0d0, c(n), d(n,n), y(n)
! Generate a random matrix and right-hand side.
A = rand(A); b= rand(b)
! Save double precision copies of the matrix and right-hand side.
D = A
c = b
! Compute single precision inverse to compute the iterative refinement.
A = .i. A
! Start solution at zero. Update it to an accurate solution
! with each iteration.
y = d_zero
change_old = huge(one)
iterative_refinement: do
! Compute the residual with higher accuracy than the data.
b = c - (D .x. y)
! Compute the update in single precision.
x = A .x. b
y = x + y
change_new = norm(x)
! Exit when changes are no longer decreasing.
if (change_new >= change_old) exit iterative_refinement
change_old = change_new
end do iterative_refinement
write (*,*) 'Example 3 for LIN_SOL_GEN (operators) is correct.'
end
A standard approach to solving this problem approximates the second derivative operator
with central divided differences:
$$\frac{d^2u}{dx^2} \approx \frac{u_{i-1} - 2u_i + u_{i+1}}{h^2}, \quad h = (b - a)/(N - 1), \quad i = 2, \ldots, N-1, \quad N > 2$$
This leads to the sparse linear algebraic system $Mu = w$. The definitions for these terms are
implied in the following Fortran program.
Subroutine document_ex1
! Illustrate a 1D Poisson equation with Dirichlet boundary conditions.
! This module defines the structures and overloaded assignment code.
Use linear_operators
Implicit None
!
Integer :: I
Integer, Parameter :: N = 1000
Real (Kind(1.d0)) :: f, h, r, w (N), a = 0.d0, b = 1.d0, &
u_a = 0.d0, u_b = 1.d0, u (N)
Type (d_sparse) M
Type (d_hbc_sparse) K
External f
! Define the difference used.
h = (b-a) / (N-1)
r = 1.d0 / h ** 2
! Fill in the matrix entries.
! Isolated equation for the left boundary condition.
M = d_entry (1, 1, r)
Do I = 2, N - 1
M = d_entry (I, I-1, r)
M = d_entry (I, I,-2*r)
M = d_entry (I, I+1, r)
End Do
! Isolated equation for the right boundary condition.
M = d_entry (N, N, r)
! Fill in the right-hand side (a dense vector).
Do I = 2, N - 1
w (I) = f (a+(I-1)*h)
End Do
! Insert the known end conditions. These should be satisfied
! almost exactly, up to rounding errors.
w (1) = u_a * r
w (N) = u_b * r
! Ready to solve
! Conversion to Harwell-Boeing format using overloaded assignment
K = M
! Solve the system using an IMSL defined operator.
u = K .ix. w
! The parentheses are needed because of precedence rules.
! Compute residuals and overwrite w(:) with these values.
w = w - (K .x. u)
End Subroutine
!
Function f (x)
Real (Kind(1.d0)) :: f, x
! Define a hat function, peaked at x=0.5.
If (x <= 0.5d0) Then
f = x
Else
f = 1.d0 - x
End If
End Function
.tx.
Required Operands
A Left operand matrix. This is an array of rank 2, or 3. It may be real, double, complex,
double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Note that A and B cannot both be ?_hbc_sparse.
B Right operand matrix or vector. This is an array of rank 1, 2, or 3. It may be real, double,
complex, double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Note that A and B cannot both be ?_hbc_sparse.
FORTRAN 90 Interface
A .tx. B
Description
Computes the product of the transpose of matrix A and matrix or vector B. The results are in a
precision and data type that ascends to the most accurate or complex operand.
Rank three operation is defined as follows:
do i = 1, min(size(A,3), size(B,3))
X(:,:,i) = A(:,:,i) .tx. B(:,:,i)
end do
.tx. can be used with either dense or sparse matrices. It is MPI capable for dense matrices only.
Examples
Dense Matrix Example (operator_ex05.f90)
use linear_operators
implicit none
! This is the equivalent of Example 1 for LIN_SOL_SELF using operators
! and functions.
integer, parameter :: m=64, n=32
real(kind(1e0)) :: one=1.0e0, err
real(kind(1e0)) A(n,n), b(n,n), C(m,n), d(m,n), x(n,n)
! Generate two rectangular random matrices.
C = rand(C); d=rand(d)
! Form the normal equations for the rectangular system.
A = C .tx. C; b = C .tx. d
! Compute the solution for Ax = b, A is symmetric.
x = A .ix. b
! Check the results.
err = norm(b - (A .x. x))/(norm(A)+norm(b))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 1 for LIN_SOL_SELF (operators) is correct.'
end if
end
Output
                H
         1        2        3
1    2.000    0.000    1.000
2    0.000    4.000    0.000
3    0.000    0.000    6.000

                B
         1        2        3
1   0.8711   0.4467   0.4743
2   0.8315   0.7257   0.4518
3   0.6839   0.0561   0.6972

            H .tx. B
         1        2        3
1    1.742    0.893    0.949
2    3.326    2.903    1.807
3    4.975    0.784    4.657
Sparse example for .tx. operator is correct.
.xt.
Required Operands
A Left operand matrix or vector. This is an array of rank 1, 2, or 3. It may be real, double,
complex, double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Note that A and B cannot both be ?_hbc_sparse.
B Right operand matrix. This is an array of rank 2, or 3. It may be real, double, complex,
double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
FORTRAN 90 Interface
A .xt. B
Description
Computes the product of matrix or vector A and the transpose of matrix B. The results are in a
precision and data type that ascends to the most accurate or complex operand.
Rank three operation is defined as follows:
do i = 1, min(size(A,3), size(B,3))
X(:,:,i) = A(:,:,i) .xt. B(:,:,i)
end do
.xt. can be used with either dense or sparse matrices. It is MPI capable for dense matrices only.
Examples
Dense Matrix Example
(operator_ex14.f90)
use linear_operators
implicit none
!
integer, parameter :: n=32
real(kind(1d0)) :: one=1d0, zero=0d0
real(kind(1d0)) A(n,n), P(n,n), Q(n,n), &
S_D(n), U_D(n,n), V_D(n,n)
! Generate a random matrix.
A = rand(A)
! Compute the singular value decomposition.
S_D = SVD(A, U=U_D, V=V_D)
! Compute the (left) orthogonal factor.
P = U_D .xt. V_D
! Compute the (right) self-adjoint factor.
Q = V_D .x. diag(S_D) .xt. V_D
! Check the results.
if (norm( EYE(n) - (P .xt. P)) &
<= sqrt(epsilon(one))) then
if (norm(A - (P .x. Q))/norm(A) &
<= sqrt(epsilon(one))) then
Output
                A
         1        2        3
1   0.5423   0.2380   0.9250
2   0.0844   0.1323   0.1937
3   0.4146   0.3135   0.7757

                H
         1        2        3
1    2.000    0.000    1.000
2    0.000    4.000    0.000
3    0.000    0.000    6.000

            A .xt. H
         1        2        3
1    2.010    0.952    5.550
2    0.363    0.529    1.162
3    1.605    1.254    4.654
Sparse example for .xt. operator is correct.
.hx.
Required Operands
A Left operand matrix. This is an array of rank 2 or 3. It may be real, double, complex,
double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Note that A and B cannot both be ?_hbc_sparse.
B Right operand matrix or vector. This is an array of rank 1, 2, or 3. It may be real, double,
complex, double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Note that A and B cannot both be ?_hbc_sparse.
FORTRAN 90 Interface
A .hx. B
Description
Computes the product of the conjugate transpose of matrix A and matrix or vector B. The results
are in a precision and data type that ascends to the most accurate or complex operand.
Rank three operation is defined as follows:
do i = 1, min(size(A,3), size(B,3))
X(:,:,i) = A(:,:,i) .hx. B(:,:,i)
end do
.hx. can be used with either dense or sparse matrices. It is MPI capable for dense matrices only.
Examples
Dense Matrix Example (operator_ex32.f90)
use linear_operators
implicit none
! This is the equivalent of Example 4 (using operators) for LIN_EIG_GEN.
integer, parameter :: n=17
real(kind(1d0)), parameter :: one=1d0
real(kind(1d0)), dimension(n,n) :: A, C
real(kind(1d0)) variation(n), eta
complex(kind(1d0)), dimension(n,n) :: U, V, e(n), d(n)
! Generate a random matrix.
A = rand(A)
! Compute the eigenvalues, left- and right- eigenvectors.
D = EIG(A, W=V); E = EIG(.t.A, W=U)
! Compute condition numbers and variations of eigenvalues.
variation = norm(A)/abs(diagonals( U .hx. V))
!
!
!
!
Output
                              H
                  1                  2                  3
1  ( 2.000, 1.000)    ( 0.000, 0.000)    ( 1.000, 3.000)
2  ( 0.000, 0.000)    ( 4.000,-1.000)    ( 0.000, 0.000)
3  ( 0.000, 0.000)    ( 0.000, 0.000)    ( 6.000, 2.000)

                              A
                  1                  2                  3
1  ( 0.6278, 0.8475)  ( 0.8007, 0.4179)  ( 0.4512, 0.2601)
2  ( 0.1249, 0.4675)  ( 0.7957, 0.1609)  ( 0.4228, 0.0507)
3  ( 0.4608, 0.0891)  ( 0.3181, 0.9180)  ( 0.9961, 0.1939)

                          H .hx. A
                  1                  2                  3
1  ( 2.103, 1.067)    ( 2.019, 0.035)    ( 1.163, 0.069)
2  ( 0.032, 1.995)    ( 3.022, 1.439)    ( 1.640, 0.626)
3  ( 6.113,-1.423)    ( 5.799, 2.888)    ( 7.596,-1.922)
Sparse example for .hx. operator is correct.
Parallel Example
use linear_operators
use mpi_setup_int
integer, parameter :: N=32, nr=4
complex (kind(1.e0)) A(N,N,nr), B(N,N,nr), Y(N,N,nr)
! Setup for MPI
mp_nprocs = mp_setup()
if (mp_rank == 0) then
A = rand(A)
B = rand(B)
end if
Y = A .hx. B
mp_nprocs = mp_setup ('Final')
end
.xh.
Required Operands
A Left operand matrix or vector. This is an array of rank 1, 2, or 3. It may be real, double,
complex, double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Note that A and B cannot both be ?_hbc_sparse.
B Right operand matrix. This is an array of rank 2, or 3. It may be real, double, complex,
double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Note that A and B cannot both be ?_hbc_sparse.
FORTRAN 90 Interface
A .xh. B
Description
Computes the product of matrix or vector A and the conjugate transpose of matrix B. The results
are in a precision and data type that ascends to the most accurate or complex operand.
Rank three operation is defined as follows:
do i = 1, min(size(A,3), size(B,3))
X(:,:,i) = A(:,:,i) .xh. B(:,:,i)
end do
.xh. can be used with either dense or sparse matrices. It is MPI capable for dense matrices only.
Examples
Dense Matrix Example
use wrcrn_int
use linear_operators
integer, parameter :: N=3
complex (kind(1.e0)) A(N,N), B(N,N), Y(N,N)
A = rand(A)
B = rand(B)
Y = A .xh. B
call wrcrn ( 'A', a)
call wrcrn ( 'B', b)
call wrcrn ( 'A .xh. B ', y)
end
Output
                              A
                  1                  2                  3
1  ( 0.8071, 0.0054)  ( 0.5617, 0.2508)  ( 0.0223, 0.5555)
2  ( 0.9380, 0.5181)  ( 0.8895, 0.9512)  ( 0.7951, 0.6010)
3  ( 0.8349, 0.7291)  ( 0.4162, 0.5255)  ( 0.7388, 0.0309)

                              B
                  1                  2                  3
1  ( 0.5342, 0.2246)  ( 0.9045, 0.0550)  ( 0.4576, 0.3173)
2  ( 0.5531, 0.3362)  ( 0.0757, 0.3970)  ( 0.6807, 0.8625)
3  ( 0.3553, 0.9157)  ( 0.0951, 0.7807)  ( 0.4853, 0.0617)

                          A .xh. B
                  1                  2                  3
1  ( 1.141, 0.265)    ( 1.085,-0.113)    ( 0.586,-0.884)
2  ( 2.029, 0.900)    ( 2.198,-0.587)    ( 2.058,-1.036)
3  ( 1.363, 0.434)    ( 1.477,-0.619)    ( 1.775,-0.811)
Output
                           A
                 1                 2                 3
1 ( 0.8526, 0.3532) ( 0.1822, 0.3938) ( 0.8008, 0.1308)
2 ( 0.5599, 0.8914) ( 0.7541, 0.5163) ( 0.8713, 0.9580)
3 ( 0.9947, 0.2735) ( 0.6237, 0.2137) ( 0.3802, 0.8903)
                           H
                1               2               3
1 ( 2.000, 1.000) ( 0.000, 0.000) ( 1.000, 3.000)
2 ( 0.000, 0.000) ( 4.000,-1.000) ( 0.000, 0.000)
3 ( 0.000, 0.000) ( 0.000, 0.000) ( 6.000, 2.000)
                       A .xh. H
                1               2               3
1 ( 3.252,-2.418) ( 0.335, 1.757) ( 5.066,-0.817)
2 ( 5.757,-0.433) ( 2.500, 2.819) ( 7.144, 4.005)
3 ( 5.314,-0.698) ( 2.281, 1.478) ( 4.062, 4.581)
Sparse example for .xh. operator is correct.
Parallel Example
use linear_operators
use mpi_setup_int
integer, parameter :: N=32, nr=4
complex (kind(1.e0)) A(N,N,nr), B(N,N,nr), Y(N,N,nr)
! Setup for MPI
mp_nprocs = mp_setup()
if (mp_rank == 0) then
A = rand(A)
B = rand(B)
end if
Y = A .xh. B
mp_nprocs = mp_setup ('Final')
end
.t.
Computes the transpose of a matrix.
Required Operand
A Matrix for which the transpose is to be computed. This is a real, double, complex,
double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input).
FORTRAN 90 Interface
.t. A
Description
Computes the transpose of matrix A. The operation may be read "transpose," and the result has
the same precision and data type as the operand. Because .t. is a defined unary operator, it
has higher Fortran 90 precedence than any intrinsic operator.
.t. can be used with either dense or sparse matrices.
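The precedence rule can be checked with a small stand-alone program. This is a sketch under stated assumptions: the module transpose_op and its procedure t_real are hypothetical stand-ins for the library's .t. operator, restricted to real rank-2 arrays and implemented with the TRANSPOSE intrinsic. It shows that .t. A + B is parsed as (.t. A) + B, not as .t. (A + B).

```fortran
module transpose_op
  implicit none
  ! Hypothetical stand-in for the library's .t. operator (real rank-2 arrays only).
  interface operator(.t.)
    module procedure t_real
  end interface
contains
  function t_real(a) result(at)
    real, intent(in) :: a(:,:)
    real :: at(size(a,2), size(a,1))
    at = transpose(a)
  end function t_real
end module transpose_op

program precedence_sketch
  use transpose_op
  implicit none
  real :: A(2,2), B(2,2), C(2,2)

  ! Column-major fill: A = [1 2; 3 4], B = [0 1; 0 0].
  A = reshape((/ 1.0, 3.0, 2.0, 4.0 /), (/ 2, 2 /))
  B = reshape((/ 0.0, 0.0, 1.0, 0.0 /), (/ 2, 2 /))

  ! A defined unary operator binds more tightly than the intrinsic "+".
  C = .t. A + B

  if (all(C == transpose(A) + B) .and. any(C /= transpose(A + B))) then
    print *, 'ok: .t. A + B is evaluated as (.t. A) + B'
  else
    error stop 'unexpected precedence'
  end if
end program precedence_sketch
```

In practice this means an expression such as (A + .t.A)/2 needs no extra parentheses around .t.A, while transposing a sum does require them.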
Examples
Dense Matrix Example (operator_ex07.f90)
use linear_operators
implicit none
! This is the equivalent of Example 3 (using operators) for LIN_SOL_SELF.
integer tries
integer, parameter :: m=8, n=4, k=2
integer ipivots(n+1)
real(kind(1d0)) :: one=1.0d0, err
real(kind(1d0)) a(n,n), b(n,1), c(m,n), x(n,1), &
e(n), ATEMP(n,n)
type(d_options) :: iopti(4)
! Generate a random rectangular matrix.
C = rand(C)
! Generate a random right hand side for use in the inverse
! iteration.
b = rand(b)
! Compute the positive definite matrix.
A = C .tx. C; A = (A+.t.A)/2
! Obtain just the eigenvalues.
E = EIG(A)
! Use packaged option to reset the value of a small diagonal.
iopti(4) = 0
iopti(1) = d_options(d_lin_sol_self_set_small,&
epsilon(one)*abs(E(1)))
! Use packaged option to save the factorization.
iopti(2) = d_lin_sol_self_save_factors
! Suppress error messages and stopping due to singularity
! of the matrix, which is expected.
iopti(3) = d_lin_sol_self_no_sing_mess
ATEMP = A
! Compute A-eigenvalue*I as the coefficient matrix.
! Use eigenvalue number k.
A = A - e(k)*EYE(n)
do tries=1,2
call lin_sol_self(A, b, x, &
pivots=ipivots, iopt=iopti)
! When code is re-entered, the already computed factorization
! is used.
iopti(4) = d_lin_sol_self_solve_A
! Reset right-hand side in the direction of the eigenvector.
B = UNIT(x)
end do
! Normalize the eigenvector.
x = UNIT(x)
! Check the results.
b=ATEMP .x. x
err = dot_product(x(1:n,1), b(1:n,1)) - e(k)
! If any result is not accurate, quit with no printing.
if (abs(err) <= sqrt(epsilon(one))*E(1)) then
write (*,*) 'Example 3 for LIN_SOL_SELF (operators) is correct.'
end if
end
Output
             H
        1       2       3
1   2.000   0.000   1.000
2   0.000   4.000   0.000
3   0.000   0.000   6.000
        H Transpose
        1       2       3
1   2.000   0.000   0.000
2   0.000   4.000   0.000
3   1.000   0.000   6.000
Sparse example for .t. operator is correct.
.h.
Computes the conjugate transpose of a matrix.
Required Operand
A Matrix for which the conjugate transpose is to be computed. This is an array of rank 2,
or 3. It may be real, double, complex, double complex, or one of the computational
sparse matrix derived types, ?_hbc_sparse. (Input)
FORTRAN 90 Interface
.h. A
Description
Computes the conjugate transpose of matrix A. The operation may be read "adjoint," and the
result has the same precision and data type as the operand. Because .h. is a defined unary
operator, it has higher Fortran 90 precedence than any intrinsic operator.
.h. can be used with either dense or sparse matrices.
Examples
Dense Matrix Example (operator_ex34.f90)
use linear_operators
implicit none
! This is the equivalent of Example 2 (using operators) for LIN_GEIG_GEN.
integer, parameter :: n=32
real(kind(1d0)), parameter :: one=1d0, zero=0d0
real(kind(1d0)) err, alpha(n)
complex(kind(1d0)), dimension(n,n) :: A, B, C, D, V
! Generate random matrices for both A and B.
C = rand(C); D = rand(D)
A = C + .h.C; B = D .hx. D; B = (B + .h.B)/2
ALPHA = EIG(A, B=B, W=V)
! Check that residuals are small. Use a real array for alpha
! since the eigenvalues are known to be real.
err= norm((A .x. V) - (B .x. V .x. diag(alpha)),1)/&
(norm(A,1)+norm(B,1)*norm(alpha,1))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 2 for LIN_GEIG_GEN (operators) is correct.'
end if
end
Output
                           H
                1               2               3
1 ( 2.000, 1.000) ( 0.000, 0.000) ( 1.000, 3.000)
2 ( 0.000, 0.000) ( 4.000,-1.000) ( 0.000, 0.000)
3 ( 0.000, 0.000) ( 0.000, 0.000) ( 6.000, 2.000)
                  H Conjugate Transpose
                1               2               3
1 ( 2.000,-1.000) ( 0.000, 0.000) ( 0.000, 0.000)
2 ( 0.000, 0.000) ( 4.000, 1.000) ( 0.000, 0.000)
3 ( 1.000,-3.000) ( 0.000, 0.000) ( 6.000,-2.000)
Sparse example for .h. operator is correct.
.i.
MPI CAPABLE
Required Operand
A Matrix for which the inverse is to be computed. This is an array of rank 2 or 3. It may be
real, double, complex, or double complex. (Input)
Option Names for .i.
Use_lin_sol_lsq_only
I_options_for_lin_sol_gen
I_options_for_lin_sol_lsq
Skip_error_processing

Option Array               Derived Type
?_inv_options(:)           ?_options
?_inv_options_once(:)      ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_SOL_GEN and LIN_SOL_LSQ located in Chapter 1, Linear Systems for the specific options
for these routines.
FORTRAN 90 Interface
.i. A
Description
Computes the inverse matrix for square non-singular matrices using LIN_SOL_GEN, or the
Moore-Penrose generalized inverse matrix for singular square matrices or rectangular matrices
using LIN_SOL_LSQ. The operation may be read inverse or generalized inverse, and the results
are in a precision and data type that matches the operand.
This operator requires a single operand. Since this is a unary operation, it has higher Fortran 90
precedence than any other intrinsic array operation.
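For reference, the Moore-Penrose generalized inverse mentioned above is the unique matrix $X = A^{+}$ that satisfies the four Penrose conditions. They are stated here as standard background, not quoted from this guide:

```latex
A X A = A, \qquad
X A X = X, \qquad
(A X)^{H} = A X, \qquad
(X A)^{H} = X A .
```

When A is square and non-singular, $X = A^{-1}$ satisfies all four conditions, so the two cases handled by .i. agree wherever both apply.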
Examples
Dense Matrix Example (operator_ex02.f90)
use linear_operators
implicit none
! This is the equivalent of Example 2 for LIN_SOL_GEN using operators
! and functions.
integer, parameter :: n=32
real(kind(1e0)) :: one=1e0, err, det_A, det_i
real(kind(1e0)), dimension(n,n) :: A, inv
! Generate a random matrix.
A = rand(A)
! Compute the matrix inverse and its determinant.
inv = .i.A; det_A = det(A)
! Compute the determinant for the inverse matrix.
det_i = det(inv)
! Check the quality of both left and right inverses.
err = (norm(EYE(n)-(A .x. inv))+norm(EYE(n)-(inv.x.A)))/cond(A)
if (err <= sqrt(epsilon(one)) .and. abs(det_A*det_i - one) <= &
sqrt(epsilon(one))) &
write (*,*) 'Example 2 for LIN_SOL_GEN (operators) is correct.'
end
.ix.
MPI CAPABLE
Required Operands
A Left operand matrix. This is an array of rank 2, or 3. It may be real, double, complex,
double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
B Right operand matrix or vector. This is an array of rank 1, 2, or 3. It may be real, double,
complex, or double complex. (Input)
Option Names for .ix.
use_lin_sol_lsq_only
ix_options_for_lin_sol_gen
ix_options_for_lin_sol_lsq
Skip_error_processing

Option Array               Derived Type
?_invx_options(:)          ?_options
?_invx_options_once(:)     ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_SOL_GEN and LIN_SOL_LSQ located in Chapter 1, Linear Systems for the specific options
for these routines.
FORTRAN 90 Interface
A .ix. B
Description
Computes the product of the inverse of matrix A and vector or matrix B, for square non-singular
matrices or the corresponding Moore-Penrose generalized inverse matrix for singular square
matrices or rectangular matrices. The operation may be read generalized inverse times. The results
are in a precision and data type that matches the most accurate or complex operand.
.ix. can be used with either dense or sparse matrices. It is MPI capable for dense matrices only.
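In formulas, the two cases described above can be summarized as follows. This is standard background on generalized inverses, stated as a reading aid: for a singular or rectangular A, the product $A^{+} b$ is the least-squares solution of smallest Euclidean norm.

```latex
x = A^{-1} b \quad \text{(square, non-singular } A\text{)}, \qquad
x = A^{+} b = \arg\min_{z} \left\{ \|z\|_2 \;:\; z \text{ minimizes } \|A z - b\|_2 \right\}
\quad \text{(otherwise)} .
```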
Examples
Dense Matrix Example (operator_ex01.f90)
use linear_operators
implicit none
! This is the equivalent of Example 1 for LIN_SOL_GEN, with operators
! and functions.
integer, parameter :: n=32
real(kind(1e0)) :: one=1.0e0, err
real(kind(1e0)), dimension(n,n) :: A, b, x
! Generate random matrices for A and b:
A = rand(A); b=rand(b)
! Compute the solution matrix of Ax = b.
x = A .ix. b
! Check the results.
err = norm(b - (A .x. x))/(norm(A)*norm(x)+norm(b))
if (err <= sqrt(epsilon(one))) &
write (*,*) 'Example 1 for LIN_SOL_GEN (operators) is correct.'
end
X = H      ! dense equivalent of H
B= rand(B)
Y = H .ix. B
call wrrrn ( 'H', X)
call wrrrn ( 'B', b)
call wrrrn ( 'H .ix. B ', y)
! Check the results.
err = norm(y - (X .ix. B))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Sparse example for .ix. operator is correct.'
end if
end
Output
             H
        1       2       3
1   2.000   0.000   1.000
2   0.000   4.000   0.000
3   0.000   0.000   6.000
             B
         1        2        3
1   0.8292   0.5697   0.1687
2   0.9670   0.7296   0.0603
3   0.1458   0.2726   0.8809
         H .ix. B
         1        2        3
1   0.4025   0.2621   0.0109
2   0.2417   0.1824   0.0151
3   0.0243   0.0454   0.1468
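We want to calculate a numerical solution that approximates the true solution of the Poisson
(boundary value) problem in the solution domain $\Omega$, a rectangle in $\mathbb{R}^2$.
The equation is
$$\Delta u \equiv \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f \quad \text{in } \Omega .$$
There are Dirichlet boundary conditions
$$u = g \quad \text{on } \partial\Omega_1$$
and Neumann boundary conditions
$$\frac{\partial u}{\partial n} = h \quad \text{on } \partial\Omega_2 .$$
The boundary arcs comprising $\partial\Omega_1 \cup \partial\Omega_2 = \partial\Omega$ are mutually exclusive of each other. The
functions f, g, h are defined on their respective domains.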
We will solve an instance of this problem by using finite differences to approximate the
derivatives. This will lead to a sparse system of linear algebraic equations. Note that particular
cases of this problem can be solved with methods that are likely to be more efficient or more
appropriate than the one illustrated here. We use this method to illustrate our matrix data handling
routines and defined operators.
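The area of the rectangle is $a \times b$ with the origin fixed at the lower left or SW corner. The
dimension along the x axis is a and along the y axis is b. A rectangular $n \times m$ uniform grid is
defined on $\Omega$ where each sub-rectangle in the grid has sides
$\Delta x = a/(n-1)$ and $\Delta y = b/(m-1)$. What is perhaps novel in our development is that the
boundary values are written into the $mn \times mn$ linear system as trivial equations. This leads to
more unknowns than standard approaches to this problem but the complexity of describing the
equations into computer code is reduced. The boundary conditions are naturally in place when the
solution is obtained. No reshaping is required.
We number the approximate values of u at the grid points and collapse them into a single vector.
In the code, the index function JJ(I, J) maps grid point (i, j) to its position in this vector.
On the East edge the outward normal derivative is $\partial u/\partial n = \partial u/\partial x$, and we impose the
smooth interface $h = 0$.
Our use of finite differences is standard. For the differential equation we approximate
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \approx \frac{u_{i-1,j} - 2u_{i,j} + u_{i+1,j}}{\Delta x^2} + \frac{u_{i,j-1} - 2u_{i,j} + u_{i,j+1}}{\Delta y^2} = f(x_i, y_j)$$
at the interior grid points. For the Neumann condition we approximate
$$\frac{\partial u}{\partial x} \approx \frac{u_{n,j} - u_{n-1,j}}{\Delta x} = 0, \quad j = 1, \ldots, m .$$
Three problem cases are solved:
1. A Laplace equation with the constant Dirichlet values set in function g of the code: u = 0 on
the South edge, u = 0.3 on the West edge, u = 1 on the North edge, and u = 0.7 on the East edge.
The function f = 0 for all (x, y). Graphical results are shown below with the title Problem
Case 1.
2. A Poisson equation with the boundary conditions u = 0 on all of the edges and
$f(x, y) = -\sin(\pi x)\sin(\pi y)$. This problem has the solution $u(x, y) = -f(x, y)/(2\pi^2)$,
which provides a check that the accuracy is within the truncation error implied by the
difference equations. Graphical results are shown with the title Problem Case 2. The residual
function verifies the expected accuracy.
3. The Laplace Equation with the boundary conditions of Problem Case 1 except that the
boundary condition on the East Edge is replaced by the Neumann condition
$\partial u/\partial x = 0$.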
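The accuracy claim for Problem Case 2 can be checked without any library routine. The program below is a minimal sketch under stated assumptions: the rectangle is taken as the unit square (a = b = 1), the grid is 41 by 41, and the stencil weights are assumed to be r = 1/delx**2 and s = 1/dely**2, none of which are fixed by the text in this excerpt. It applies the five-point difference formula to the known solution u = -f/(2*pi**2) and confirms that the largest residual is of second order in the grid spacing.

```fortran
program stencil_check
  implicit none
  integer, parameter :: dp = kind(1.d0)
  integer, parameter :: n = 41, m = 41                 ! assumed grid size
  real(dp), parameter :: a = 1.0_dp, b = 1.0_dp        ! assumed rectangle sides
  real(dp) :: delx, dely, r, s, pi, resid, maxres
  real(dp) :: u(n,m)
  integer :: i, j

  pi = 4.0_dp*atan(1.0_dp)
  delx = a/(n - 1)
  dely = b/(m - 1)
  r = 1.0_dp/delx**2        ! assumed weight of the x second difference
  s = 1.0_dp/dely**2        ! assumed weight of the y second difference

  ! Known solution of Problem Case 2 sampled on the grid.
  do j = 1, m
    do i = 1, n
      u(i,j) = -f((i-1)*delx, (j-1)*dely)/(2.0_dp*pi**2)
    end do
  end do

  ! Residual of the difference equation at every interior grid point.
  maxres = 0.0_dp
  do j = 2, m - 1
    do i = 2, n - 1
      resid = r*(u(i-1,j) + u(i+1,j)) + s*(u(i,j-1) + u(i,j+1)) &
              - 2.0_dp*(r + s)*u(i,j) - f((i-1)*delx, (j-1)*dely)
      maxres = max(maxres, abs(resid))
    end do
  end do

  print *, 'max residual =', maxres
  if (maxres < delx**2 + dely**2) then
    print *, 'ok: residual is within the second-order truncation bound'
  else
    error stop 'residual larger than expected'
  end if

contains

  function f(x, y)
    ! Right-hand side of Problem Case 2.
    real(dp), intent(in) :: x, y
    real(dp) :: f
    f = -sin(pi*x)*sin(pi*y)
  end function f

end program stencil_check
```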
Do J = 2, M - 1
! Write entries for second partials WRT x and y.
C = d_entry (JJ(I, J), JJ(I-1, J), r)
C = d_entry (JJ(I, J), JJ(I+1, J), r)
C = d_entry (JJ(I, J), JJ(I, J),-2*(r+s))
C = d_entry (JJ(I, J), JJ(I, J-1), s)
C = d_entry (JJ(I, J), JJ(I, J+1), s)
!
! Define components of the right-hand side.
w (JJ(I, J)) = f((I-1)*delx, (J-1)*dely, MY_CASE)
End Do
End Do
! Write entries for Dirichlet boundary conditions.
! First do the South edge, then the West, then the North.
Select Case (MY_CASE)
Case (1:2)
Do I = 1, N
C = d_entry (JJ(I, 1), JJ(I, 1), r+s)
w (JJ(I, 1)) = g ((I-1)*delx, 0.d0, MY_CASE) * (r+s)
End Do
Do J = 2, M - 1
C = d_entry (JJ(1, J), JJ(1, J), r+s)
w (JJ(1, J)) = g (0.d0, (J-1)*dely, MY_CASE) * (r+s)
End Do
Do I = 1, N
C = d_entry (JJ(I, M), JJ(I, M), r+s)
w (JJ(I, M)) = g ((I-1)*delx, b, MY_CASE) * (r+s)
End Do
Do J = 2, M - 1
C = d_entry (JJ(N, J), JJ(N, J), (r+s))
w (JJ(N, J)) = g (a, (J-1)*dely, MY_CASE) * (r+s)
End Do
Case (3)
! Write entries for the boundary values but avoid the East edge.
Do I = 1, N - 1
C = d_entry (JJ(I, 1), JJ(I, 1), r+s)
w (JJ(I, 1)) = g ((I-1)*delx, 0.d0, MY_CASE) * (r+s)
End Do
Do J = 2, M - 1
C = d_entry (JJ(1, J), JJ(1, J), r+s)
w (JJ(1, J)) = g (0.d0, (J-1)*dely, MY_CASE) * (r+s)
End Do
Do I = 1, N - 1
C = d_entry (JJ(I, M), JJ(I, M), r+s)
w (JJ(I, M)) = g ((I-1)*delx, b, MY_CASE) * (r+s)
End Do
! Write entries for the Neumann condition on the East edge.
Do J = 1, M
C = d_entry (JJ(N, J), JJ(N, J), 1.d0/delx)
C = d_entry (JJ(N, J), JJ(N-2, J),-1.d0/delx)
w (JJ(N, J)) = 0.d0
End Do
End Select
!
! Convert to Harwell-Boeing format for solving.
D = C
!
Call cpu_time (TE)
Write (*,'(A,F6.2," S. - ",A)') "Time to build matrix = ", &
TE - TS, PR_LABEL(MY_CASE)
! Clear sparse triplets.
C = 0
!
! Turn off iterative refinement for maximal performance.
! This is generally not recommended unless
! the problem is known not to require it.
If (MY_CASE == 2) D%options%iterRefine = 0
! This is the solve step.
Call cpu_time (TS)
u = D .ix. w
Call cpu_time (TE)
Write (*,'(A,I6," is",F6.2," S")') &
"Time to solve system of size = ", N * M, TE - TS
! This is a second solve step using the factorization
! from the first step.
Call cpu_time (TS)
u = D .ix. w
Call cpu_time (TE)
!
If(MY_CASE == 1) then
Write (*,'(A,I6," is",F6.2," S")') &
"Time for a 2nd system of size (iterative refinement) =", &
N * M, TE - TS
Else
Write (*,'(A,I6," is",F6.2," S")') &
"Time for a 2nd system of size (without refinement) =", &
N * M, TE - TS
End if
! Convert solution vector to a 2D array of values.
P = reshape (u , (/ N, M /))
If (MY_CASE == 2) Then
pi = dconst ('pi')
!
scale = - 0.5 / pi ** 2
Do I = 1, N
Do J = 1, M
! This uses the known form of the solution to compute residuals.
P (I, J) = P (I, J) - scale * f ((I-1)*delx, &
(J-1)*dely, MY_CASE)
End Do
End Do
!
write (*,*) minval (P), " = min solution error "
write (*,*) maxval (P), " = max solution error "
End If
Write (*,'(A,1pE12.4/)') "Condition number of matrix", cond (D)
! Clear all matrix data for next problem case.
D = 0
!
End Do ! MY_CASE
Contains
Function f (x, y, MY_CASE)
implicit none
! Define the right-hand side function associated with the
! "del" operator.
Real (Kind(1.d0)) x, y, f, pi
Integer MY_CASE
if(MY_CASE == 2) THEN
pi = dconst ('pi')
f = - Sin (pi*x) * Sin (pi*y)
Else
f = 0.d0
End If
End Function
!
Function g (x, y, MY_CASE)
implicit none
! Define the edge values, except along East edge, x = a.
Real (Kind(1.d0)) x, y, g
Integer MY_CASE
! Fill in a constant value along each edge.
If (MY_CASE == 1 .Or. MY_CASE == 3) Then
If (y == 0.d0) Then
g = 0.d0
Return
End If
If (y == b) Then
g = 1.d0
Return
End If
If (x == 0.d0) Then
g = 0.3d0
Return
End If
If (x == a) Then
g = 0.7d0
End If
Else
g = 0.d0
!
End If
!
End Function
End Subroutine
[Figures: Problem Case 1, Problem Case 2, Problem Case 3]
end
.xi.
MPI CAPABLE
Required Operands
A Left operand matrix or vector. This is an array of rank 1, 2, or 3. It may be real, double,
complex, or double complex. (Input)
B Right operand matrix. This is an array of rank 2 or 3. It may be real, double, complex,
double complex, or one of the computational sparse matrix derived types,
?_hbc_sparse. (Input)
Option Names for .xi.
use_lin_sol_gen_only
use_lin_sol_lsq_only
xi_options_for_lin_sol_gen
xi_options_for_lin_sol_lsq
Skip_error_processing

Option Array               Derived Type
?_xinv_options(:)          ?_options
?_xinv_options_once(:)     ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_SOL_GEN and LIN_SOL_LSQ located in Chapter 1, Linear Systems for the specific options
for these routines.
FORTRAN 90 Interface
A .xi. B
Description
Computes the product of matrix A and the inverse of matrix B, for square non-singular matrices
or the corresponding Moore-Penrose generalized inverse matrix for singular square matrices or
rectangular matrices. The operation may be read times generalized inverse. The results are in a
precision and data type that matches the most accurate or complex operand.
.xi. can be used with either dense or sparse matrices. It is MPI capable for dense matrices only.
Examples
Dense Matrix Example
use linear_operators
implicit none
integer, parameter :: n=32
real(kind(1e0)) :: one=1.0e0, err
real(kind(1e0)), dimension(n,n) :: A, b, x
! Generate random matrices for A and b:
A = rand(A); b=rand(b)
! Compute the solution matrix of xA = b.
x = b .xi. A
! Check the results.
err = norm(b - (x .x. A))/(norm(A)*norm(x)+norm(b))
if (err <= sqrt(epsilon(one))) &
write (*,*) 'Example for .xi. operator is correct.'
end
type (s_sparse) S
type (s_hbc_sparse) H
integer, parameter :: N=3
real (kind(1.e0)) x(N,N), y(N,N), a(N,N)
real (kind(1.e0)) err
S = s_entry (1, 1, 2.0)
S = s_entry (1, 3, 1.0)
S = s_entry (2, 2, 4.0)
S = s_entry (3, 3, 6.0)
H = S      ! sparse
X = H      ! dense equivalent of H
A = rand(A)
Y = A .xi. H
call wrrrn ( 'A', A)
call wrrrn ( 'H', X)
call wrrrn ( 'A .xi. H', y)
! Check the results.
err = norm(y - (A .xi. X))
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Sparse example for .xi. operator is correct.'
end if
end
Output
             A
         1        2        3
1   0.5926   0.5015   0.5368
2   0.4001   0.9529   0.6988
3   0.0412   0.0633   0.3821
             H
        1       2       3
1   2.000   0.000   1.000
2   0.000   4.000   0.000
3   0.000   0.000   6.000
         A .xi. H
         1        2        3
1   0.2963   0.1254   0.0401
2   0.2001   0.2382   0.0831
3   0.0206   0.0158   0.0602
Sparse example for .xi. operator is correct.
Parallel Example
use linear_operators
use mpi_setup_int
implicit none
! This is the equivalent of Parallel Example 1 for .xi., with box data types
! and functions.
CHOL
MPI CAPABLE
Required Argument
A Matrix to be factored. This argument must be a rank-2 or rank-3 array that contains a
positive-definite, symmetric or self-adjoint matrix. It may be real, double, complex, or
double complex. (Input)
For rank-3 arrays each rank-2 array, (for fixed third subscript), is a positive-definite,
symmetric or self-adjoint matrix. In this case, the output is a rank-3 array of Cholesky
factors for the individual problems.
The option and derived type names are given in the following tables:
Option Names for CHOL
Use_lin_sol_lsq_only

Option Array               Derived Type
?_chol_options(:)          ?_options
?_chol_options_once(:)     ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_SOL_SELF located in Chapter 1, Linear Systems for the specific options for this routine.
FORTRAN 90 Interface
CHOL(A)
Description
Computes the Cholesky factorization of a positive-definite, symmetric or self-adjoint matrix, A.
The factor is upper triangular, $R^T R = A$.
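The relation between A and its upper triangular factor can be illustrated with a short hand-written factorization. This is a sketch for the real symmetric case only; it is a textbook row-by-row algorithm without pivoting or error handling, not the library's CHOL, and the 3 by 3 test matrix is made up for the example.

```fortran
program chol_sketch
  implicit none
  integer, parameter :: dp = kind(1.d0), n = 3
  real(dp) :: A(n,n), R(n,n)
  integer :: i, j

  ! Hypothetical symmetric positive-definite test matrix (column-major fill).
  A = reshape((/ 4.0_dp, 2.0_dp, 2.0_dp, &
                 2.0_dp, 5.0_dp, 3.0_dp, &
                 2.0_dp, 3.0_dp, 6.0_dp /), (/ n, n /))

  ! Build the upper triangular factor one row at a time.
  R = 0.0_dp
  do j = 1, n
    ! Diagonal entry of row j.
    R(j,j) = sqrt(A(j,j) - sum(R(1:j-1,j)**2))
    ! Remaining entries of row j, to the right of the diagonal.
    do i = j + 1, n
      R(j,i) = (A(j,i) - sum(R(1:j-1,j)*R(1:j-1,i)))/R(j,j)
    end do
  end do

  ! Verify that transpose(R) times R reproduces A.
  if (maxval(abs(matmul(transpose(R), R) - A)) < 1.0e-12_dp) then
    print *, 'ok: transpose(R) times R reproduces A'
  else
    error stop 'factorization check failed'
  end if
end program chol_sketch
```

For this matrix the factor has rows (2, 1, 1), (0, 2, 1), and (0, 0, 2).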
Examples
Dense Matrix Example (operator_ex06.f90)
use linear_operators
implicit none
! This is the equivalent of Example 2 for LIN_SOL_SELF using operators
! and functions.
integer, parameter :: m=64, n=32
real(kind(1e0)) :: one=1e0, zero=0e0, err
real(kind(1e0)) A(n,n), b(n), C(m,n), d(m), cov(n,n), x(n)
! Generate a random rectangular matrix and right-hand side.
C = rand(C); d=rand(d)
! Form the normal equations for the rectangular system.
A = C .tx. C; b = C .tx. d
COV = .i. CHOL(A); COV = COV .xt. COV
! Compute the least-squares solution.
x = C .ix. d
COND
MPI CAPABLE
Required Argument
A Matrix for which the condition number is to be computed. The matrix may be real,
double, complex, double-complex, or one of the computational sparse matrix derived
types, ?_hbc_sparse. For an array of type real, double, complex, or double-complex
the array may be of rank-2 or rank-3.
For a dense rank-3 array, each rank-2 array section, (for fixed third subscript), is a
separate problem. In this case, the output is a rank-1 array of condition numbers for
each problem. (Input)
norm_choice    Norm    Square Matrix       Rectangular Matrix
                       Dense    Sparse     Dense    Sparse
1              l1      Yes      Yes        No       No
2 (Default)    l2      Yes      Yes        Yes      No
huge(1)        l∞      Yes      Yes        No       No
Option Names for COND
?_cond_set_small
?_cond_for_lin_sol_svd

Option Array               Derived Type
?_cond_options(:)          ?_options
?_cond_options_once(:)     ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_SOL_SVD located in Chapter 1, Linear Systems for the specific options for this routine.
FORTRAN 90 Interface
COND (A [,…])
Description
The mathematical definitions of the condition numbers which this routine estimates are:
l1 condition number: $\kappa_1(A) = \|A\|_1 \cdot \|A^{-1}\|_1$
l2 condition number: $\kappa_2(A) = \|A\|_2 \cdot \|A^{-1}\|_2$
l∞ condition number: $\kappa_\infty(A) = \|A\|_\infty \cdot \|A^{-1}\|_\infty$

        Square Matrix       Rectangular Matrix
        Dense    Sparse     Dense    Sparse
l1      Yes      Yes        No       No
l2      Yes      Yes        Yes      No
l∞      Yes      Yes        No       No
The generic function COND can be used with either dense or sparse square matrices. This function
uses LIN_SOL_SVD for dense square and rectangular matrices in computing $\kappa_2(A) = s_1/s_n$, where
$s_1$ and $s_n$ are the largest and smallest singular values of A. The function uses LIN_SOL_GEN for
dense square matrices in computing $\kappa_1(A)$ and $\kappa_\infty(A)$. For sparse square matrices, the values
returned for $\kappa_1(A)$ and $\kappa_\infty(A)$ are provided by the SuperLU linear equation solver. The
condition number $\kappa_2(A) = s_1/s_n$ is computed by an algorithm that first approximates $s_1$ by
computing the singular values of the $k \times k$ bidiagonal matrix obtained using the Lanczos method
found in Golub and Van Loan, Ed. 3, p. 495. Here k is set using the value
A%Options%Cond_Iteration_Max, which has the default value of 30.
The value $s_n$ is obtained using the power method, Golub and Van Loan, p. 330, iterating with the
inverse matrix; the dominant eigenvalue of this inverse matrix determines $s_n$. The number of
iterations is limited by the parameter value k or relative accuracy equal to the cube root of
machine epsilon. Some timing tests indicate that computing $\kappa_2(A)$ for sparse matrices by this
algorithm typically requires about twice the time as for a single linear solve using the defined
operator A .ix. b.
For computation of $\kappa_2(A)$ with rectangular sparse matrices one can use a dense matrix
representation for the matrix. This is not recommended except for small problem sizes. For
overdetermined systems of sparse least-squares equations $Ax \approx b$ a related square system is given
by
$$C \begin{bmatrix} x \\ r \end{bmatrix} \equiv \begin{bmatrix} A & I_{m \times m} \\ 0_{n \times n} & A^T \end{bmatrix} \begin{bmatrix} x \\ r \end{bmatrix} = \begin{bmatrix} b \\ 0 \end{bmatrix}$$
One can form C, which has more than twice the number of non-zeros as A. But C is still sparse.
One can use the condition number of C as an estimate of the accuracy for the solution vector x and
the residual vector r. Note that this version of the condition number is not the same as
the l2 condition number of A but is relevant to determining the accuracy of the least-squares
system.
Examples
Dense Matrix Example (operator_ex02.f90)
use wrrrn_int
use linear_operators
integer, parameter :: N=3
real (kind(1.e0)) A(N,N)
real (kind(1.e0)) C1, C2, CINF
DATA A/2.0, 2.0, -4.0, 0.0, -1.0, 2.0, 0.0, 0.0, 5.0/
CINF = COND (A, norm_choice=huge(1))
C1 = COND (A, norm_choice=1)
C2 = COND (A)
call wrrrn ( 'A', A)
write (*,*) 'L1 condition number= ', C1
write (*,*) 'L2 condition number= ', C2
write (*,*) 'L infinity condition number= ', CINF
end
Output
             A
        1       2       3
1   2.000   0.000   0.000
2   2.000  -1.000   0.000
3  -4.000   2.000   5.000
L1 condition number=   12.0
L2 condition number=   10.405088
L infinity condition number=   22.0
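The printed l1 and l∞ values for the dense example can be reproduced directly from the definitions of the condition numbers. The sketch below uses only standard Fortran; the explicit Gauss-Jordan inverse is an illustrative choice suited to a 3 by 3 matrix and is not how COND obtains its estimates.

```fortran
program cond_sketch
  implicit none
  integer, parameter :: dp = kind(1.d0), n = 3
  real(dp) :: A(n,n), Ainv(n,n), W(n,2*n), tmp(2*n), c1, cinf
  integer :: i, j, p

  ! Same column-major data as the DATA statement in the dense example.
  A = reshape((/ 2.0_dp, 2.0_dp, -4.0_dp, 0.0_dp, -1.0_dp, 2.0_dp, &
                 0.0_dp, 0.0_dp, 5.0_dp /), (/ n, n /))

  ! Gauss-Jordan elimination with partial pivoting on the augmented matrix [A | I].
  W = 0.0_dp
  W(:,1:n) = A
  do i = 1, n
    W(i,n+i) = 1.0_dp
  end do
  do j = 1, n
    p = j - 1 + maxloc(abs(W(j:n,j)), dim=1)      ! row of the largest pivot candidate
    tmp = W(j,:);  W(j,:) = W(p,:);  W(p,:) = tmp ! swap rows j and p
    W(j,:) = W(j,:)/W(j,j)
    do i = 1, n
      if (i /= j) W(i,:) = W(i,:) - W(i,j)*W(j,:)
    end do
  end do
  Ainv = W(:,n+1:2*n)

  ! l1 norm = largest column sum; l-infinity norm = largest row sum.
  c1   = maxval(sum(abs(A), dim=1))*maxval(sum(abs(Ainv), dim=1))
  cinf = maxval(sum(abs(A), dim=2))*maxval(sum(abs(Ainv), dim=2))

  print '(a,f8.3)', ' l1 condition number         = ', c1
  print '(a,f8.3)', ' l infinity condition number = ', cinf
  if (abs(c1 - 12.0_dp) < 1.0e-10_dp .and. abs(cinf - 22.0_dp) < 1.0e-10_dp) then
    print *, 'ok: values agree with the example output'
  else
    error stop 'condition numbers differ from the example output'
  end if
end program cond_sketch
```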
Output
        1       2       3
1   2.000   0.000   0.000
2   2.000  -1.000   0.000
3  -4.000   2.000   5.000
integer J
real(kind(1e0)) :: one=1e0
real(kind(1e0)), dimension(nr) :: err, det_A, det_i
real(kind(1e0)), dimension(n,n,nr) :: A, inv, R, S
! Setup for MPI.
MP_NPROCS=MP_SETUP()
! Generate a random matrix.
A = rand(A)
! Compute the matrix inverse and its determinant.
inv = .i.A; det_A = det(A)
! Compute the determinant for the inverse matrix.
det_i = det(inv)
! Check the quality of both left and right inverses.
DO J=1,nr; R(:,:,J)=EYE(N); END DO
S=R; R=R-(A .x. inv); S=S-(inv .x. A)
err = (norm(R)+norm(S))/cond(A)
if (ALL(err <= sqrt(epsilon(one)) .and. &
abs(det_A*det_i - one) <= sqrt(epsilon(one)))&
.and. MP_RANK == 0) &
write (*,*) 'Parallel Example 2 is correct.'
! See to any error messages and quit MPI.
MP_NPROCS=MP_SETUP('Final')
end
DET
MPI CAPABLE
Required Argument
A Matrix for which the determinant is to be computed. This argument must be a rank-2 or
rank-3 array that contains a rectangular matrix. It may be real, double, complex,
double complex. (Input)
For rank-3 arrays, each rank-2 array (for fixed third subscript), is a separate matrix. In
this case, the output is a rank-1 array of determinant values for each problem.
Option Names for DET
?_det_for_lin_sol_lsq

Option Array               Derived Type
?_det_options(:)           ?_options
?_det_options_once(:)      ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_SOL_LSQ located in Chapter 1, Linear Systems for the specific options for this routine.
FORTRAN 90 Interface
DET (A)
Description
Computes the determinant of a rectangular matrix, A. The evaluation is based on the QR decomposition,
$$QAP = \begin{bmatrix} R_{k \times k} & 0 \\ 0 & 0 \end{bmatrix}$$
Examples
Dense Matrix Example (operator_ex02.f90)
use linear_operators
implicit none
! This is Example 2 for LIN_SOL_GEN using operators and functions.
DIAG
Constructs a square diagonal matrix.
Required Argument
A This is a rank-1 or rank-2 array of type real, double, complex, or double complex,
containing the diagonal elements. The output is a rank-2 or rank-3 array,
respectively. (Input)
FORTRAN 90 Interface
DIAG (A)
Description
Constructs a square diagonal matrix from a rank-1 array or several diagonal matrices from a rank-2
array. The dimension of the matrix is the value of the size of the rank-1 array.
The use of DIAG may be obviated by observing that the defined operations C = diag(x) .x. A
or D = B .x. diag(x) are respectively the array operations C = spread(x,
DIM=2,NCOPIES=size(A,2))*A, and D = B*spread(x,DIM=1,NCOPIES=size(B,1)).
These array products are not as easy to read as the defined operations using DIAG and matrix
multiply, but their use results in a more efficient code.
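The correspondence between diagonal scaling and SPREAD can be verified with intrinsics alone. In this sketch the dense matrix Dx is a hand-built stand-in for diag(x), and MATMUL stands in for the .x. operator; the data are made up for the example. Multiplying by a diagonal matrix on the left scales row i by x(i), which matches SPREAD with DIM=2, and multiplying on the right scales column j by x(j), which matches SPREAD with DIM=1.

```fortran
program diag_spread_sketch
  implicit none
  integer, parameter :: n = 3
  real :: A(n,n), Dx(n,n), x(n)
  integer :: i, j

  ! Integer-valued test data so that all comparisons are exact.
  x = (/ 2.0, 3.0, 5.0 /)
  do j = 1, n
    do i = 1, n
      A(i,j) = real(10*i + j)
    end do
  end do

  ! Dense diagonal matrix, a stand-in for diag(x).
  Dx = 0.0
  do i = 1, n
    Dx(i,i) = x(i)
  end do

  ! Left multiplication scales rows; right multiplication scales columns.
  if (all(matmul(Dx, A) == spread(x, dim=2, ncopies=size(A,2))*A) .and. &
      all(matmul(A, Dx) == A*spread(x, dim=1, ncopies=size(A,1)))) then
    print *, 'ok: diagonal scaling matches the SPREAD products'
  else
    error stop 'SPREAD products do not match'
  end if
end program diag_spread_sketch
```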
Examples
Dense Matrix Example (operator_ex13.f90)
use linear_operators
implicit none
! This is the equivalent of Example 1 for LIN_SOL_SVD using operators
! and functions.
integer, parameter :: m=128, n=32
real(kind(1d0)) :: one=1d0, err
real(kind(1d0)) A(m,n), b(m), x(n), U(m,m), V(n,n), S(n), g(m)
! Generate a random matrix and right-hand side.
A = rand(A); b = rand(b)
! Compute the least-squares solution matrix of Ax=b.
S = SVD(A, U = U, V = V)
g = U .tx. b; x = V .x. diag(one/S) .x. g(1:n)
! Check the results.
err = norm(A .tx. (b - (A .x. x)))/(norm(A)+norm(x))
DIAGONALS
Extracts the diagonal terms of a matrix.
Required Argument
A Matrix from which to extract the diagonal. This is a rank-2 or rank-3 array of type real,
double, complex, or double complex. The output is a rank-1 or rank-2 array,
respectively. (Input)
FORTRAN 90 Interface
DIAGONALS (A)
Description
Extracts a rank-1 array whose values are the diagonal terms of the rank-2 array A. The size of the
array is the smaller of the two dimensions of the rank-2 array.
Examples
Dense Matrix Example (operator_ex32.f90)
use linear_operators
implicit none
! This is the equivalent of Example 4 (using operators) for LIN_EIG_GEN.
integer, parameter :: n=17
real(kind(1d0)), parameter :: one=1d0
real(kind(1d0)), dimension(n,n) :: A, C
real(kind(1d0)) variation(n), eta
complex(kind(1d0)), dimension(n,n) :: U, V, e(n), d(n)
! Generate a random matrix.
A = rand(A)
! Compute the eigenvalues, left- and right- eigenvectors.
D = EIG(A, W=V); E = EIG(.t.A, W=U)
EIG
MPI CAPABLE
Required Argument
A Matrix for which the eigenexpansion is to be computed. This is a square rank-2 array or
a rank-3 array with square first rank-2 sections of type single, double, complex, or
double complex. (Input)
Option Names for EIG
Options_for_lin_eig_gen
Options_for_lin_geig_gen
Skip_error_processing

Option Array               Derived Type
?_eig_options(:)           ?_options
?_eig_options_once(:)      ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_EIG_SELF, LIN_EIG_GEN, and LIN_GEIG_GEN located in Chapter 2, Eigensystems
Analysis for the specific options for these routines.
FORTRAN 90 Interface
EIG (A [,…])
Description
Computes the eigenvalue-eigenvector decomposition of an ordinary or generalized eigenvalue
problem.
For the ordinary eigenvalue problem, Ax = ex, the optional input B= is not used. With the
generalized problem, Ax = eBx, the matrix B is passed as the array in the right-side of B=. The
optional output D= is an array required only for the generalized problem and then only when
the matrix B is singular.
The array of real eigenvectors is an optional output for both the ordinary and the generalized
problem. It is used as V= where the right-side array will contain the eigenvectors. If any
eigenvectors are complex, the optional output W= must be present. In that case V= should not
be used.
Examples
Dense Matrix Example 1 (operator_ex26.f90)
use linear_operators
implicit none
Here an alternate node is used to compute the majority of a single application, and the user does
not need to make any explicit calls to MPI routines. The time-consuming parts are the evaluation
of the eigenvalue-eigenvector expansion, the solving step, and the residuals. To do this, the rank-2 arrays are changed to a box data type with a unit third dimension. This uses parallel computing.
The node priority order is established by the initial function call, MP_SETUP(n). The root is
restricted from working on the box data type by assigning MPI_ROOT_WORKS=.false. This
example anticipates that the most efficient node, other than the root, will perform the heavy
computing. Two nodes are required to execute.
use linear_operators
use mpi_setup_int
implicit none
! This is the equivalent of Parallel Example 4 for matrix exponential.
! The box dimension has a single rack.
integer, parameter :: n=32, k=128, nr=1
integer i
real(kind(1e0)), parameter :: one=1e0, t_max=one, delta_t=t_max/(k-1)
real(kind(1e0)) err(nr), sizes(nr), A(n,n,nr)
real(kind(1e0)) t(k), y(n,k,nr), y_prime(n,k,nr)
complex(kind(1e0)), dimension(n,nr) :: x(n,n,nr), z_0, &
Z_1(n,nr,nr), y_0, d
!
!
!
!
EYE
Creates the identity matrix.
Required Argument
N Size of output identity matrix. (Input)
FORTRAN 90 Interface
EYE (N)
Description
Creates a rank-2 square array whose diagonals are all the value one. The off-diagonals all have
value zero.
Examples
Dense Matrix Example (operator_ex07.f90)
use linear_operators
implicit none
! This is the equivalent of Example 3 (using operators) for LIN_SOL_SELF.
integer tries
integer, parameter :: m=8, n=4, k=2
integer ipivots(n+1)
real(kind(1d0)) :: one=1.0d0, err
real(kind(1d0)) a(n,n), b(n,1), c(m,n), x(n,1), &
e(n), ATEMP(n,n)
type(d_options) :: iopti(4)
! Generate a random rectangular matrix.
C = rand(C)
! Generate a random right hand side for use in the inverse
! iteration.
b = rand(b)
! Compute the positive definite matrix.
A = C .tx. C; A = (A+.t.A)/2
! Obtain just the eigenvalues.
E = EIG(A)
! Use packaged option to reset the value of a small diagonal.
iopti(4) = 0
iopti(1) = d_options(d_lin_sol_self_set_small,&
epsilon(one)*abs(E(1)))
! Use packaged option to save the factorization.
iopti(2) = d_lin_sol_self_save_factors
! Suppress error messages and stopping due to singularity
! of the matrix, which is expected.
iopti(3) = d_lin_sol_self_no_sing_mess
ATEMP = A
! Compute A-eigenvalue*I as the coefficient matrix.
! Use eigenvalue number k.
A = A - e(k)*EYE(n)
do tries=1,2
call lin_sol_self(A, b, x, &
pivots=ipivots, iopt=iopti)
! When code is re-entered, the already computed factorization
! is used.
iopti(4) = d_lin_sol_self_solve_A
! Reset right-hand side in the direction of the eigenvector.
B = UNIT(x)
end do
! Normalize the eigenvector.
x = UNIT(x)
! Check the results.
b=ATEMP .x. x
err =
FFT
Computes the Discrete Fourier Transform of one complex sequence.
Required Argument
X Array containing the sequence for which the transform is to be computed. X is an
assumed shape complex array of rank 1, 2 or 3. If X is real or double, it is converted to
complex internally prior to the computation. (Input)
The option and derived type names are given in the following tables:
Option Names for FFT
Option Value
Options_for_fast_dft
Option Array                Derived Type
?_fft_options(:)            ?_options
?_fft_options_once(:)       ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
FAST_DFT located in Chapter 6, Transforms for the specific options for this routine.
FORTRAN 90 Interface
FFT (X [, …])
Description
Computes the Discrete Fourier Transform of a complex sequence. This function uses FAST_DFT,
FAST_2DFT, and FAST_3DFT from Chapter 6.
Examples (operator_ex37.f90)
use rand_gen_int
use fft_int
use ifft_int
use linear_operators
implicit none
! This is Example 4 for FAST_DFT (using operators).
integer j
integer, parameter :: n=40
real(kind(1e0)) :: err, one=1e0
real(kind(1e0)), dimension(n) :: a, b, c, yy(n,n)
complex(kind(1e0)), dimension(n) :: f, fa, fb
! Generate two random periodic sequences 'a' and 'b'.
a=rand(a); b=rand(b)
! Compute the convolution 'c' of 'a' and 'b'.
yy(1:,1)=b
do j=2,n
yy(2:,j)=yy(1:n-1,j-1)
yy(1,j)=yy(n,j-1)
end do
c=yy .x. a
! Compute f=inverse(transform(a)*transform(b)).
fa = fft(a)
fb = fft(b)
f=ifft(fa*fb)
! Check the Convolution Theorem:
! inverse(transform(a)*transform(b)) = convolution(a,b).
err = norm(c-f)/norm(c)
if (err <= sqrt(epsilon(one))) then
write (*,*) 'Example 4 for FAST_DFT (operators) is correct.'
end if
end
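The check made in this example can be reproduced without the library. The sketch below forms the same circular convolution directly, then compares it with the inverse transform of the product of two transforms computed from an explicit DFT matrix. It assumes an unscaled forward transform with kernel exp(-2*pi*i*(j-1)*(k-1)/n) and an inverse scaled by 1/n; that convention is an assumption made for the illustration, so see FAST_DFT in Chapter 6 for the scaling the library actually uses. All names are hypothetical.

```fortran
program conv_theorem_sketch
  implicit none
  integer, parameter :: n = 8
  real(kind(1d0)), parameter :: two_pi = 8d0*atan(1d0)
  real(kind(1d0)) :: a(n), b(n), c(n)
  complex(kind(1d0)) :: w(n,n), fa(n), fb(n), f(n)
  integer :: i, j

  call random_number(a)
  call random_number(b)

  ! Circular convolution, the same quantity as c = yy .x. a in the example.
  do i = 1, n
    c(i) = 0d0
    do j = 1, n
      c(i) = c(i) + b(modulo(i - j, n) + 1)*a(j)
    end do
  end do

  ! Explicit DFT matrix (assumed convention: unscaled forward transform).
  do j = 1, n
    do i = 1, n
      w(i,j) = exp(cmplx(0d0, -two_pi*real((i - 1)*(j - 1), kind(1d0))/n, kind(1d0)))
    end do
  end do

  fa = matmul(w, cmplx(a, kind=kind(1d0)))
  fb = matmul(w, cmplx(b, kind=kind(1d0)))
  ! Inverse transform of the elementwise product, scaled by 1/n.
  f = matmul(conjg(w), fa*fb)/n

  write (*,*) 'max |c - f| =', maxval(abs(c - real(f)))
end program conv_theorem_sketch
```

The printed difference should be at the level of rounding error. The O(n^2) matrix form is used only for clarity; a fast transform gives the same values at lower cost.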
FFT_BOX
MPI CAPABLE
Required Argument
X Box containing the sequences for which the transform is to be computed. X is an
assumed shape complex array of rank 2, 3 or 4. If X is real or double, it is converted to
complex internally prior to the computation. (Input)
The option and derived type names are given in the following tables:
Option Names for FFT
Option Value
Options_for_fast_dft
Option Array                Derived Type
?_fft_box_options(:)        ?_options
?_fft_box_options_once(:)   ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
FAST_DFT located in Chapter 6, Transforms for the specific options for this routine.
FORTRAN 90 Interface
FFT_BOX (X [, …])
Description
Computes the Discrete Fourier Transform of a box of complex sequences. This function uses
FAST_DFT, FAST_2DFT, and FAST_3DFT from Chapter 6.
Examples
Parallel Example
use rand_gen_int
use fft_box_int
use ifft_box_int
use linear_operators
use mpi_setup_int
implicit none
! This is FFT_BOX example.
integer i,j
integer, parameter :: n=40, nr=4
real(kind(1e0)) :: err(nr), one=1e0
real(kind(1e0)) :: a(n,1,nr), b(n,nr), c(n,1,nr), yy(n,n,nr)
complex(kind(1e0)), dimension(n,nr) :: f, fa, fb, cc, aa
real(kind(1e0)),parameter::zero_par=0.e0
real(kind(1e0))::dummy_par(0)
integer iseed_par
type(s_options)::iopti_par(2)
! setup for MPI
MP_NPROCS = MP_SETUP()
! Set Random Number generator seed
iseed_par = 53976279
iopti_par(1)=s_options(s_rand_gen_generator_seed,zero_par)
iopti_par(2)=s_options(iseed_par,zero_par)
call rand_gen(dummy_par,iopt=iopti_par)
! Generate two random periodic sequences 'a' and 'b'.
a=rand(a); b=rand(b)
! Compute the convolution 'c' of 'a' and 'b'.
do i=1,nr
aa(1:,i) = a(1:,1,i)
yy(1:,1,i)=b(1:,i)
do j=2,n
yy(2:,j,i)=yy(1:n-1,j-1,i)
yy(1,j,i)=yy(n,j-1,i)
end do
end do
c=yy .x. a
! Compute f=inverse(transform(a)*transform(b)).
fa = fft_box(aa)
fb = fft_box(b)
f=ifft_box(fa*fb)
! Check the Convolution Theorem:
! inverse(transform(a)*transform(b)) = convolution(a,b).
do i=1,nr
cc(1:,i) = c(1:,1,i)
end do
err = norm(cc-f)/norm(cc)
if (ALL(err <= sqrt(epsilon(one))) .AND. MP_RANK == 0) then
write (*,*) 'FFT_BOX is correct.'
end if
MP_NPROCS = MP_SETUP('Final')
end
IFFT
Computes the inverse of the Discrete Fourier Transform of one complex sequence.
Required Argument
X Array containing the sequence for which the inverse transform is to be computed. X is
an assumed shape complex array of rank 1, 2 or 3. If X is real or double, it is converted
to complex internally prior to the computation. (Input)
The option and derived type names are given in the following tables:
Option Name for IFFT
Option Value
options_for_fast_dft
Option Array                Derived Type
?_ifft_options(:)           ?_options
?_ifft_options_once(:)      ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
FAST_DFT located in Chapter 6, Transforms for the specific options for this routine.
FORTRAN 90 Interface
IFFT (X [, …])
Description
Computes the inverse of the Discrete Fourier Transform of a complex sequence. This function
uses FAST_DFT, FAST_2DFT, and FAST_3DFT from Chapter 6.
Example (operator_ex37.f90)
use rand_gen_int
use fft_int
use ifft_int
use linear_operators
implicit none
! This is the equivalent of Example 4 for FAST_DFT (using operators).
integer j
integer, parameter :: n=40
real(kind(1e0)) :: err, one=1e0
real(kind(1e0)), dimension(n) :: a, b, c, yy(n,n)
complex(kind(1e0)), dimension(n) :: f, fa, fb
! Generate two random periodic sequences 'a' and 'b'.
a=rand(a); b=rand(b)
! Compute the convolution 'c' of 'a' and 'b'.
yy(1:,1)=b
do j=2,n
yy(2:,j)=yy(1:n-1,j-1)
yy(1,j)=yy(n,j-1)
end do
c=yy .x. a
! Compute f=inverse(transform(a)*transform(b)).
fa = fft(a)
fb = fft(b)
f=ifft(fa*fb)
IFFT_BOX
MPI CAPABLE
Computes the inverse Discrete Fourier Transform of several complex or real sequences.
Required Argument
X Box containing the sequences for which the inverse transform is to be computed. X is
an assumed shape complex array of rank 2, 3 or 4. If X is real or double, it is converted
to complex internally prior to the computation. (Input)
The option and derived type names are given in the following tables:
Option Names for IFFT
Option Value
Options_for_fast_dft
Option Array                Derived Type
?_ifft_box_options(:)       ?_options
?_ifft_box_options_once(:)  ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
FAST_DFT located in Chapter 6, Transforms for the specific options for this routine.
FORTRAN 90 Interface
IFFT_BOX (X [, …])
Description
Computes the inverse of the Discrete Fourier Transform of a box of complex sequences. This
function uses FAST_DFT, FAST_2DFT, and FAST_3DFT from Chapter 6.
Parallel Example
use rand_gen_int
use fft_box_int
use ifft_box_int
use linear_operators
use mpi_setup_int
implicit none
! This is FFT_BOX example.
integer i,j
integer, parameter :: n=40, nr=4
real(kind(1e0)) :: err(nr), one=1e0
real(kind(1e0)) :: a(n,1,nr), b(n,nr), c(n,1,nr), yy(n,n,nr)
complex(kind(1e0)), dimension(n,nr) :: f, fa, fb, cc, aa
real(kind(1e0)),parameter::zero_par=0.e0
real(kind(1e0))::dummy_par(0)
integer iseed_par
type(s_options)::iopti_par(2)
! setup for MPI
MP_NPROCS = MP_SETUP()
! Set Random Number generator seed
iseed_par = 53976279
iopti_par(1)=s_options(s_rand_gen_generator_seed,zero_par)
iopti_par(2)=s_options(iseed_par,zero_par)
call rand_gen(dummy_par,iopt=iopti_par)
! Generate two random periodic sequences 'a' and 'b'.
a=rand(a); b=rand(b)
! Compute the convolution 'c' of 'a' and 'b'.
do i=1,nr
aa(1:,i) = a(1:,1,i)
yy(1:,1,i)=b(1:,i)
do j=2,n
yy(2:,j,i)=yy(1:n-1,j-1,i)
yy(1,j,i)=yy(n,j-1,i)
end do
end do
c=yy .x. a
! Compute f=inverse(transform(a)*transform(b)).
fa = fft_box(aa)
fb = fft_box(b)
f=ifft_box(fa*fb)
! Check the Convolution Theorem:
! inverse(transform(a)*transform(b)) = convolution(a,b).
do i=1,nr
cc(1:,i) = c(1:,1,i)
end do
err = norm(cc-f)/norm(cc)
if (ALL(err <= sqrt(epsilon(one))) .AND. MP_RANK == 0) then
write (*,*) 'FFT_BOX is correct.'
end if
MP_NPROCS = MP_SETUP('Final')
end
isNaN
Tests for NaN.
Required Argument
A The argument can be a scalar or array of rank-1, rank-2 or rank-3. The values can be any
of the four intrinsic floating-point types. (Input)
FORTRAN 90 Interface
isNaN( A)
Description
This is a generic logical function used to test scalars or arrays for occurrence of an IEEE 754
Standard format of floating point (ANSI/IEEE 1985) NaN, or not-a-number. Either quiet or
signaling NaNs are detected without an exception occurring in the test itself. The individual array
entries are each examined, with bit manipulation, until the first NaN is located. For non-IEEE
formats, the bit pattern tested for single precision is transfer(not(0),1). For double
precision numbers x, the bit pattern tested is equivalent to assigning the integer array
i(1:2) = not(0), then testing this array with the bit pattern of the integer array
transfer(x,i). This function is likely to be required whenever there is the possibility that a
subroutine blocked the output with NaNs in the presence of an error condition.
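For comparison, the same kind of test can be written with the standard Fortran 2003 intrinsic module ieee_arithmetic. This is a sketch of an alternative technique, not the bit-manipulation approach that isNaN is described as using; it assumes a compiler that supports IEEE arithmetic for the kinds involved.

```fortran
program isnan_sketch
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan, ieee_value, &
                                           ieee_quiet_nan
  implicit none
  real(kind(1e0)) :: a(3,3)

  a = 1e0
  ! No NaN is present yet, so this prints F.
  write (*,*) 'before: ', any(ieee_is_nan(a))

  ! Plant a single quiet NaN in the array.
  a(2,3) = ieee_value(a(2,3), ieee_quiet_nan)

  ! One NaN anywhere in the array is enough, so this prints T.
  write (*,*) 'after:  ', any(ieee_is_nan(a))
end program isnan_sketch
```

As with isNaN applied to an array, the result is a single logical value that is true when at least one entry is a NaN.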
Example
use isnan_int
implicit none
! This is the equivalent of Example 1 for NaN.
integer, parameter :: n=3
real(kind(1e0)) A(n,n); real(kind(1d0)) B(n,n)
real(kind(1e0)), external :: s_NaN
real(kind(1d0)), external :: d_NaN
! Assign NaNs to both A and B:
A = s_Nan(1e0); B = d_Nan(1d0)
! Check that NaNs are noted in both A and B:
if (isNan(A) .and. isNan(B)) then
write (*,*) 'Example 1 for NaN is correct.'
end if
end
NaN
Returns the value for NaN.
Required Argument
X Scalar value of the same type and precision as the desired result, NaN. This input value
is used only to match the type of output. (Input)
FORTRAN 90 Interface
NaN (X)
Description
NaN returns, as a scalar, a value corresponding to the IEEE 754 Standard format of floating point
(ANSI/IEEE 1985) for NaN.
The bit pattern used for single precision is transfer (not(0),1). For double precision, the bit
pattern for single precision is replicated by assigning the temporary integer array
i(1:2) = not(0), and then using the double-precision bit pattern transfer(i,x) for the
output value.
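The bit patterns described above can be tried directly in standard Fortran. The sketch below assumes 32-bit default integers, a 32-bit single-precision real, and a 64-bit double-precision real, all in IEEE format; it illustrates the transfer idiom only and is not the library source. The variable names are hypothetical.

```fortran
program nan_sketch
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
  implicit none
  real(kind(1e0)) :: x
  real(kind(1d0)) :: y
  integer :: bits, i(2)

  ! Single precision: reinterpret an all-ones integer as a real.
  bits = not(0)
  x = transfer(bits, x)

  ! Double precision: replicate the pattern in two integers first.
  i = not(0)
  y = transfer(i, y)

  ! With every exponent and fraction bit set, both values are NaNs.
  write (*,*) ieee_is_nan(x), ieee_is_nan(y)
end program nan_sketch
```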
Example
Arrays are assigned all NaN values, using single and double-precision formats. These are tested
using the logical function routine, isNaN.
use isnan_int
implicit none
! This is the equivalent of Example 1 for NaN.
integer, parameter :: n=3
real(kind(1e0)) A(n,n); real(kind(1d0)) B(n,n)
real(kind(1e0)), external :: s_NaN
real(kind(1d0)), external :: d_NaN
! Assign NaNs to both A and B:
A = s_Nan(1e0); B = d_Nan(1d0)
! Check that NaNs are noted in both A and B:
if (isNan(A) .and. isNan(B)) then
write (*,*) 'Example 1 for NaN is correct.'
end if
end
NORM
MPI CAPABLE
Required Argument
A An array of rank-1, rank-2, or rank-3, containing the values for which the norm is to be
computed. It may be real, double, complex, or double complex. (Input)
Use of the option number ?_reset_default_norm will switch the default from the $l_2$ to
the $l_1$ or $l_\infty$ norms. (Input)
The option and derived type names are given in the following tables:
Option Value
?_reset_default_norm
Option Array                Derived Type
?_norm_options(:)           ?_options
?_norm_options_once(:)      ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_SOL_SVD located in Chapter 1, Linear Systems for the specific options for this
routine.
FORTRAN 90 Interface
NORM (A [, …])
Description
Computes the $l_2$, $l_1$ or $l_\infty$ norm. The $l_1$ and $l_\infty$ norms are likely to be less expensive to compute
than the $l_2$ norm.
$$\|A\|_1 = \max_j \sum_{i=1}^{m} |a_{ij}|$$
If the l2 norm is required, this function uses LIN_SOL_SVD (see Chapter 1, Linear Systems), to
compute the largest singular value of A. For the other norms, Fortran 90 intrinsics are used.
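The norms that are computed with Fortran 90 intrinsics can be sketched as follows. These are the textbook definitions written out by hand, assuming that NORM follows them; the program does not call the library, and the data values are made up for the illustration.

```fortran
program norm_sketch
  implicit none
  real(kind(1e0)) :: v(5), a(3,2)

  v = (/ 3e0, -4e0, 0e0, 1e0, -2e0 /)
  ! Vector norms: sum of magnitudes, Euclidean length, largest magnitude.
  write (*,*) 'l1   =', sum(abs(v))
  write (*,*) 'l2   =', sqrt(sum(v**2))
  write (*,*) 'linf =', maxval(abs(v))

  a = reshape((/ 1e0, -2e0, 3e0, 4e0, 0e0, -1e0 /), (/ 3, 2 /))
  ! Matrix l1 norm: the largest column sum of magnitudes.
  write (*,*) 'matrix l1   =', maxval(sum(abs(a), dim=1))
  ! Matrix l-infinity norm: the largest row sum of magnitudes.
  write (*,*) 'matrix linf =', maxval(sum(abs(a), dim=2))
end program norm_sketch
```

The matrix $l_2$ norm is omitted because it needs the largest singular value, which is why it is the more expensive case.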
Examples
Compute three norms of an array
use norm_int
use rand_int
implicit none
real (kind(1e0)) A(5), n_1, n_2, n_inf
A = rand (A)
! l1 norm
n_1 = norm(A, TYPE=1)
write (*,*) n_1
! l2 norm (the default)
n_2 = norm(A)
write (*,*) n_2
! l-infinity norm
n_inf = norm(A, TYPE=huge(1))
write (*,*) n_inf
end
A polar decomposition of several matrices is computed. The box data type and the SVD()
function are used. Orthogonality and small residuals are checked to verify that the results are
correct.
use linear_operators
use mpi_setup_int
implicit none
! This is Parallel Example 15 using operators and
! functions for a polar decomposition.
integer, parameter :: n=33, nr=3
real(kind(1d0)) :: one=1d0, zero=0d0
real(kind(1d0)),dimension(n,n,nr) :: A, P, Q, &
S_D(n,nr), U_D, V_D
real(kind(1d0)) TEMP1(nr), TEMP2(nr)
! Setup for MPI:
mp_nprocs = mp_setup()
! Generate a random matrix.
if(mp_rank == 0) A = rand(A)
! Compute the singular value decomposition.
ORTH
MPI CAPABLE
Required Argument
A Matrix A to be decomposed. Must be an array of rank-2 or rank-3 (box data) of type real,
double, complex, or double complex. (Input)
The option and derived type names are given in the following tables:
Option Name for ORTH        Option Value
Skip_error_processing       5

Option Array                Derived Type
?_orth_options(:)           ?_options
?_orth_options_once(:)      ?_options
For a description on how to use these options, see Matrix Optional Data Changes.
FORTRAN 90 Interface
ORTH (A [, …])
Description
Orthogonalizes the columns of a matrix. The decomposition A = QR is computed using a forward
and backward sweep of the Modified Gram-Schmidt algorithm.
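A single forward sweep of the Modified Gram-Schmidt algorithm can be sketched in standard Fortran as shown below. The sketch assumes a real matrix with full column rank and performs one sweep only, whereas ORTH is described as using a forward and a backward sweep; it is an illustration of the technique, not the library code. All names are hypothetical.

```fortran
program mgs_sketch
  implicit none
  integer, parameter :: m = 6, n = 4
  real(kind(1d0)) :: a(m,n), q(m,n), r(n,n), g(n,n)
  integer :: j, k

  call random_number(a)
  q = a
  r = 0d0

  do k = 1, n
    ! Normalize column k.
    r(k,k) = sqrt(sum(q(:,k)**2))
    q(:,k) = q(:,k)/r(k,k)
    ! Remove the component along column k from the later columns.
    do j = k + 1, n
      r(k,j) = dot_product(q(:,k), q(:,j))
      q(:,j) = q(:,j) - r(k,j)*q(:,k)
    end do
  end do

  ! Check orthogonality of Q and the residual of A = QR.
  g = matmul(transpose(q), q)
  do k = 1, n
    g(k,k) = g(k,k) - 1d0
  end do
  write (*,*) 'max |Q^T Q - I| =', maxval(abs(g))
  write (*,*) 'max |A - Q R|   =', maxval(abs(a - matmul(q, r)))
end program mgs_sketch
```

Both printed values should be near rounding error for a well-conditioned matrix. A second sweep is a common way to restore orthogonality when the columns are nearly dependent.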
Examples
(Operator_ex19.f90)
use linear_operators
use lin_sol_tri_int
use rand_int
use Numerical_Libraries
implicit none
! This is the equivalent of Example 3 (using operators) for LIN_SOL_TRI.
integer i, nopt
integer, parameter :: n=128, k=n/4, ncoda=1, lda=2
real(kind(1e0)), parameter :: s_one=1e0, s_zero=0e0
real(kind(1e0)) A(lda,n), EVAL(k)
type(s_options) :: iopt(2)
real(kind(1e0)) d(n), b(n), d_t(2*n,k), c_t(2*n,k), perf_ratio, &
b_t(2*n,k), y_t(2*n,k), eval_t(k), res(n,k)
logical small
! This flag is used to get the k largest eigenvalues.
small = .false.
! Generate the main diagonal and the co-diagonal of the
! tridiagonal matrix.
b=rand(b); d=rand(d)
A(1,1:)=b; A(2,1:)=d
! Use Numerical Libraries routine for the calculation of k
! largest eigenvalues.
CALL EVASB (N, K, A, LDA, NCODA, SMALL, EVAL)
EVAL_T = EVAL
end
Parallel Example
use linear_operators
use mpi_setup_int
integer, parameter :: N=32, nr=4
real (kind(1.e0)) A(N,N,nr), Q(N,N,nr)
! Setup for MPI
mp_nprocs = mp_setup()
if (mp_rank == 0) then
A = rand(A)
end if
Q = orth(A)
mp_nprocs = mp_setup ('Final')
end
RAND
Generates a scalar, rank-1, rank-2 or rank-3 array of random numbers.
Required Argument
A The argument must be a scalar, rank-1, rank-2, or rank-3 array of type single, double,
complex, or double complex. Used only to determine the type and rank of the output. (Input)
Option Array                Derived Type
?_rand_options(:)           ?_options
?_rand_options_once(:)      ?_options
FORTRAN 90 Interface
RAND(A)
Description
Generates a scalar, rank-1, rank-2 or rank-3 array of random numbers. Each component number is
positive and strictly less than one in value.
This function uses rand_gen to obtain the number of values required by the argument. The
values are then copied using the RESHAPE intrinsic.
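The generate-then-reshape pattern described above can be sketched with the standard intrinsic random_number standing in for rand_gen. That substitution is an assumption made so that the sketch runs without the library, and random_number may return exactly zero, unlike the strictly positive values described for RAND.

```fortran
program rand_sketch
  implicit none
  integer, parameter :: m = 3, n = 4
  real(kind(1e0)) :: flat(m*n), a(m,n)

  ! Obtain as many values as the target array holds (stand-in for rand_gen).
  call random_number(flat)

  ! Copy the flat list into the shape of the argument.
  a = reshape(flat, shape(a))

  write (*,*) 'shape =', shape(a)
  write (*,*) 'all values in [0,1): ', all(a >= 0e0 .and. a < 1e0)
end program rand_sketch
```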
Example
use show_int
use rand_int
implicit none
! This is the equivalent of Example 1 for SHOW.
integer, parameter :: n=7, m=3
real(kind(1e0)) s_x(-1:n), s_m(m,n)
real(kind(1d0)) d_x(n), d_m(m,n)
complex(kind(1e0)) c_x(n), c_m(m,n)
complex(kind(1d0)) z_x(n),z_m(m,n)
integer i_x(n), i_m(m,n)
type (s_options) options(3)
! The data types printed are real(kind(1e0)), real(kind(1d0)),
! complex(kind(1e0)), complex(kind(1d0)), and INTEGER. Fill with random
! numbers and then print the contents, in each case with a label.
s_x=rand(s_x); s_m=rand(s_m)
d_x=rand(d_x); d_m=rand(d_m)
c_x=rand(c_x); c_m=rand(c_m)
z_x=rand(z_x); z_m=rand(z_m)
i_x=100*rand(s_x(1:n)); i_m=100*rand(s_m)
call show (s_x, 'Rank-1, REAL')
call show (s_m, 'Rank-2, REAL')
call show (d_x, 'Rank-1, DOUBLE')
call show (d_m, 'Rank-2, DOUBLE')
call show (c_x, 'Rank-1, COMPLEX')
call show (c_m, 'Rank-2, COMPLEX')
call show (z_x, 'Rank-1, DOUBLE COMPLEX')
call show (z_m, 'Rank-2, DOUBLE COMPLEX')
RANK
MPI CAPABLE
Required Argument
A Matrix for which the rank is to be computed. The argument must be rank-2 or rank-3
(box) array of type single, double, complex, or double complex. (Input)
Option Value
?_rank_set_small
?_rank_for_lin_sol_svd
Option Array                Derived Type
?_rank_options(:)           ?_options
?_rank_options_once(:)      ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_SOL_SVD located in Chapter 1, Linear Systems for the specific options for this routine.
FORTRAN 90 Interface
RANK (A)
Description
Computes the mathematical rank of a rank-2 or rank-3 array. The output function value is an
integer with a value equal to the number of singular values that are greater than a tolerance. The
default value for this tolerance is $\epsilon^{1/2} s_1$, where $\epsilon$ is machine precision and $s_1$ is the largest
singular value of the matrix.
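The counting rule can be sketched as follows, given a list of singular values. The values below are made up for the illustration, and epsilon(1e0) is taken as the single-precision machine precision; the sketch shows only the default tolerance test, not how RANK obtains the singular values.

```fortran
program rank_sketch
  implicit none
  real(kind(1e0)) :: s(4), tol

  ! Hypothetical singular values, ordered so that s(1) is the largest.
  s = (/ 5.0e0, 1.0e-1, 1.0e-6, 0.0e0 /)

  ! Default tolerance: square root of machine precision times s(1).
  tol = sqrt(epsilon(1e0))*s(1)

  ! Only 5.0 and 0.1 exceed the tolerance, so the rank reported is 2.
  write (*,*) 'tolerance =', tol
  write (*,*) 'rank      =', count(s > tol)
end program rank_sketch
```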
Examples
use linear_operators
real (kind(1e0)) A(5,5)
A = rand (A)
write (*,*) rank(A)
A=1.0
write (*,*) rank(A)
end
Output
5
1
Parallel Example
use linear_operators
use mpi_setup_int
integer, parameter :: N=2, nr=4
integer r(nr)
real (kind(1.e0)) s_mat(N,N), s_box(N,N,nr)
! Setup for MPI
mp_nprocs = mp_setup()
if (mp_rank == 0) then
s_mat = reshape((/1.,0.,0.,epsilon(1.0e0)/),(/n,n/))
s_box = spread(s_mat,dim=3,ncopies=nr)
end if
r = rank(s_box)
mp_nprocs = mp_setup ('Final')
end
SVD
MPI CAPABLE
Required Argument
A Array of size m x n to be decomposed. Must be rank-2 or rank-3 array of type single,
double, complex, or double complex. (Input)
The option and derived type names are given in the following tables:
Option Names for SVD
Option Value
Options_for_lin_svd
Options_for_lin_sol_svd
skip_error_processing
Option Array                Derived Type
?_svd_options(:)            ?_options
?_svd_options_once(:)       ?_options
For a description on how to use these options, see Matrix Optional Data Changes. See
LIN_SVD and LIN_SOL_SVD located in Chapter 1, Linear Systems for the specific options for
these routines.
FORTRAN 90 Interface
SVD (A [, …])
Description
Computes the singular value decomposition of a rank-2 or rank-3 array, $A = USV^T$.
This function uses one of the routines LIN_SVD and LIN_SOL_SVD. If a complete decomposition
is required, LIN_SVD is used. If singular values only, or singular values and one of the right and
left singular vectors are required, then LIN_SOL_SVD is called.
Examples
operator_ex14.f90
use linear_operators
implicit none
! This is the equivalent of Example 2 for LIN_SOL_SVD using operators
! and functions.
integer, parameter :: n=32
real(kind(1d0)) :: one=1d0, zero=0d0
real(kind(1d0)) A(n,n), P(n,n), Q(n,n), &
S_D(n), U_D(n,n), V_D(n,n)
! Generate a random matrix.
A = rand(A)
! Compute the singular value decomposition.
S_D = SVD(A, U=U_D, V=V_D)
! Compute the (left) orthogonal factor.
P = U_D .xt. V_D
! Compute the (right) self-adjoint factor.
Q = V_D .x. diag(S_D) .xt. V_D
! Check the results.
if (norm( EYE(n) - (P .xt. P)) &
<= sqrt(epsilon(one))) then
if (norm(A - (P .x. Q))/norm(A) &
<= sqrt(epsilon(one))) then
write (*,*) 'Example 2 for LIN_SOL_SVD (operators) is correct.'
end if
end if
end
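The two checks in this example follow from the decomposition itself. Writing $U$, $S$, and $V$ for the arrays U_D, diag(S_D), and V_D, and assuming a real square matrix so that $U$ and $V$ are orthogonal, the factors formed in the code satisfy:

```latex
\begin{aligned}
A &= U S V^T, \qquad P = U V^T, \qquad Q = V S V^T,\\
P P^T &= U V^T V U^T = U U^T = I,\\
P Q &= U V^T V S V^T = U S V^T = A.
\end{aligned}
```

The first identity is what the test on EYE(n) - (P .xt. P) verifies, and the second is the residual test on A - (P .x. Q). Because the singular values are nonnegative, $Q$ is symmetric and positive semidefinite.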
Systems of least-squares problems are solved, but now using the SVD() function. A box data
type is used. This is an example which uses optional arguments and a generic function overloaded
for parallel execution of a box data type. Any number of nodes can be used.
use linear_operators
use mpi_setup_int
implicit none
! This is the equivalent of Parallel Example 14
! for SVD, .tx. , .x. and NORM.
integer, parameter :: m=128, n=32, nr=4
real(kind(1d0)) :: one=1d0, err(nr)
real(kind(1d0)) A(m,n,nr), b(m,1,nr), x(n,1,nr), U(m,m,nr), &
V(n,n,nr), S(n,nr), g(m,1,nr)
! Setup for MPI:
mp_nprocs=mp_setup()
if(mp_rank == 0) then
! Generate a random matrix and right-hand side.
A = rand(A); b = rand(b)
endif
! Compute
S =
g =
x =
UNIT
Normalizes the columns of a matrix so each has Euclidean length of value one.
Required Argument
A Matrix to be normalized. The argument must be a rank-2 or rank-3 array of type single,
double, complex, or double complex. (Input)
FORTRAN 90 Interface
UNIT (A)
Description
Normalizes the columns of a rank-2 or rank-3 array so each has Euclidean length of value one.
This function uses a rank-2 Euclidean length subroutine to compute the lengths of the nonzero
columns, which are then normalized to have lengths of value one. The subroutine carefully avoids
overflow or damaging underflow by rescaling the sums of squares as required.
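The rescaling idea can be sketched in standard Fortran as shown below. Each column is divided by its largest magnitude before the squares are summed, so the sum cannot overflow. This illustrates the general technique under that assumption and is not the subroutine UNIT actually calls; the names big and length are hypothetical.

```fortran
program unit_sketch
  implicit none
  integer, parameter :: m = 3, n = 3
  real(kind(1e0)) :: a(m,n), big, length
  integer :: j

  a(:,1) = (/ 3e0, 4e0, 0e0 /)
  ! Squaring these entries directly would overflow in single precision.
  a(:,2) = (/ 1e30, 1e30, 1e30 /)
  ! A zero column has no direction and is left unchanged.
  a(:,3) = 0e0

  do j = 1, n
    big = maxval(abs(a(:,j)))
    if (big > 0e0) then
      ! Scale before squaring, then undo the scaling on the length.
      length = big*sqrt(sum((a(:,j)/big)**2))
      a(:,j) = a(:,j)/length
    end if
  end do

  do j = 1, n
    write (*,*) a(:,j)
  end do
end program unit_sketch
```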
Example (operator_ex28.f90)
use linear_operators
implicit none
! This is the equivalent of Example 4 (using operators) for LIN_EIG_SELF.
integer, parameter :: n=64
real(kind(1e0)), parameter :: one=1e0
real(kind(1e0)), dimension(n,n) :: A, B, C, D(n), lambda(n), &
S(n), vb_d, X, res
! Generate random self-adjoint matrices.
A = rand(A); A = A + .t.A
B = rand(B); B = B + .t.B
! Add a scalar matrix so B is positive definite.
B = B + norm(B)*EYE(n)
! Get the eigenvalues and eigenvectors for B.
S = EIG(B,V=vb_d)
! For full rank problems, convert to an ordinary self-adjoint
! problem. (All of these examples are full rank.)
if (S(n) > epsilon(one)) then
D = one/sqrt(S)
C = diag(D) .x. (vb_d .tx. A .x. vb_d) .x. diag(D)
C = (C + .t.C)/2
! Get the eigenvalues and eigenvectors for C.
lambda = EIG(C,v=X)
! Compute and normalize the generalized eigenvectors.
X = UNIT(vb_d .x. diag(D) .x. X)
res = (A .x. X) - (B .x. X .x. diag(lambda))
! Check the results.
if(norm(res)/(norm(A)+norm(B)) <= &
sqrt(epsilon(one))) then
write (*,*) 'Example 4 for LIN_EIG_SELF (operators) is correct.'
end if
end if
end
Chapter 11: Utilities
Routines
11.1. ScaLAPACK Utilities
Sets up a processor grid and calculates default values for use in mapping arrays to the processor grid ....... ScaLAPACK_SETUP
Calculates the array dimensions needed for local arrays ....... ScaLAPACK_GETDIM
Reads matrix data from a file and transmits it into the two-dimensional block-cyclic form ....... ScaLAPACK_READ
Writes the matrix data to a file ....... ScaLAPACK_WRITE
Reads matrix data from an array and transmits it into the two-dimensional block-cyclic form ....... ScaLAPACK_MAP
Writes the matrix data to a global array ....... ScaLAPACK_UNMAP
Exits ScaLAPACK usage ....... ScaLAPACK_EXIT
11.2. Print
Prints error messages ....... ERROR_POST
Prints rank-1 or rank-2 arrays of numbers in a readable format ....... SHOW
Real rectangular matrix with integer row and column labels ....... WRRRN
Real rectangular matrix with given format and labels ....... WRRRL
Integer rectangular matrix with integer row and column labels ....... WRIRN
Integer rectangular matrix with given format and labels ....... WRIRL
Complex rectangular matrix with row and column labels ....... WRCRN
Complex rectangular matrix with given format and labels ....... WRCRL
Sets or retrieves options for printing a matrix ....... WROPT
Sets or retrieves page width and length ....... PGOPT
11.3. Permute
Elements of a vector ....... PERMU
Rows/columns of a matrix ....... PERMA
11.4. Sort
Sorts a rank-1 array of real numbers x so the y results are algebraically nondecreasing, $y_1 \le y_2 \le \cdots \le y_n$ ....... SORT_REAL
Real vector by algebraic value ....... SVRGN
Real vector by algebraic value and permutations returned ....... SVRGP
Integer vector by algebraic value ....... SVIGN
Integer vector by algebraic value and permutations returned ....... SVIGP
Real vector by absolute value ....... SVRBN
Real vector by absolute value and permutations returned ....... SVRBP
Integer vector by absolute value ....... SVIBN
Integer vector by absolute value and permutations returned ....... SVIBP
11.5. Search
Sorted real vector for a number ....... SRCH
Sorted integer vector for a number ....... ISRCH
Sorted character vector for a string ....... SSRCH
11.12. Miscellaneous
Decomposes an integer into its prime factors ......................PRIME
Returns mathematical and physical constants ................... CONST
Converts a quantity to different units .................................... CUNIT
For a detailed description of MPI Requirements see Dense Matrix Parallelism Using MPI in
Chapter 10 of this manual.
This section describes the use of ScaLAPACK, a suite of dense linear algebra solvers, applicable
when a single problem size is large. We have integrated usage of IMSL Fortran Library with
ScaLAPACK. However, the ScaLAPACK library, including libraries for BLACS and PBLAS, is
not part of this Library. To use ScaLAPACK software, the required libraries must be installed on
the user's computer system. We adhered to the specification of Blackford, et al. (1997), but use
only MPI for communication. The ScaLAPACK library includes certain LAPACK routines,
Anderson, et al. (1995), redesigned for distributed memory parallel computers. It is written in a
Single Program, Multiple Data (SPMD) style using explicit message passing for communication.
Matrices are laid out in a two-dimensional block-cyclic decomposition. Using High Performance
Fortran (HPF) directives, Koelbel, et al. (1994), and a static $p \times q$ processor array, and following
declaration of the array, A(*,*), this is illustrated by:
INTEGER, PARAMETER :: N=500, P= 2, Q=3, MB=32, NB=32
!HPF$ PROCESSORS PROC(P,Q)
!HPF$ DISTRIBUTE A(cyclic(MB), cyclic(NB)) ONTO PROC
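What the cyclic(MB) distribution means for an individual row can be sketched with integer arithmetic. The sketch below uses the values N=500, P=2, and MB=32 from the declaration above and assumes that the first block is assigned to process row 0; the same arithmetic applies to columns with Q and NB. The variable names are hypothetical, and the program does not call ScaLAPACK.

```fortran
program block_cyclic_sketch
  implicit none
  integer, parameter :: n = 500, p = 2, mb = 32
  integer, parameter :: samples(4) = (/ 1, 32, 33, n /)
  integer :: i, k, blk, prow, lrow

  do k = 1, size(samples)
    i = samples(k)
    ! Zero-based index of the block that holds global row i.
    blk = (i - 1)/mb
    ! Blocks are dealt out to the process rows in round-robin order.
    prow = mod(blk, p)
    ! One-based position of the row inside the owning process.
    lrow = (blk/p)*mb + mod(i - 1, mb) + 1
    write (*,'(3(a,i4))') ' global row', i, ' -> process row', prow, &
                          ', local row', lrow
  end do
end program block_cyclic_sketch
```

Rows 1 through 32 stay on process row 0, row 33 starts the first block on process row 1, and the pattern repeats every P*MB rows.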
Our integration work provides modules that describe the interface to the ScaLAPACK library. We
recommend that users include these modules when using ScaLAPACK or ancillary packages,
including BLACS and PBLAS. For the job of distributing data within a user's application to the
block-cyclic decomposition required by ScaLAPACK solvers, we provide a utility that reads data
from an external file and arranges the data within the distributed machines for a computational
step. Another utility writes the results into an external file. We also provide similar utilities that
map/unmap global arrays to/from local arrays. These utilities are used in our ScaLAPACK
examples for brevity.
The data types supported for these utilities are integer; single precision, real; double precision,
real; single precision, complex; and double precision, complex.
A ScaLAPACK library normally includes routines for:
condition number estimation and iterative refinement for LU and Cholesky factorization,
matrix inversion,
ScaLAPACK routines are available in four data types: single precision, real; double precision,
real; single precision, complex; and double precision, complex. At present, the non-symmetric
eigenproblem is only available in single and double precision. More background information and
user documentation is available on the World Wide Web at location
www.netlib.org/scalapack/slug/scalapack_slug.html.
For users with rank deficiency or simple constraints in their linear systems or least-squares
problem, we have routines for:
full or deficient rank least-squares problems with simple upper and lower bound constraints
These are available in two data types: single precision, real, and double precision, real, and they
are not part of ScaLAPACK. The matrices are distributed in a general block-column layout.
We also provide generic interfaces to a number of ScaLAPACK routines through the standard
IMSL Library routines. These are listed in Table D in the Introduction of this manual.
The global arrays which are to be distributed across the processor grid for use by the ScaLAPACK
routines require that an array descriptor be defined for each of them. We use the ScaLAPACK
TOOLS routine DESCINIT to set up array descriptors in our examples. A typical call to
DESCINIT:
CALL DESCINIT(DESCA, M, N, MB, NB, IRSRC, ICSRC, ICTXT, LLD, INFO)
Where the arguments in the above call are defined as follows for the matrix being described:
DESCA An input integer vector of length 9 which is to contain the array descriptor
information.
M An input integer which indicates the row size of the global array which is being
described.
N An input integer which indicates the column size of the global array which is being
described.
MB An input integer which indicates the blocking factor used to distribute the rows of the
matrix being described.
NB An input integer which indicates the blocking factor used to distribute the columns of
the matrix being described.
IRSRC An input integer which indicates the processor grid row over which the first row of
the array being described is distributed.
ICSRC An input integer which indicates the processor grid column over which the first
column of the array being described is distributed.
ICTXT An input integer which indicates the BLACS context handle.
LLD An input integer indicating the leading dimension of the local array which is to be
used for storing the local blocks of the array being described
INFO An output integer indicating whether or not the call was successful. INFO = 0
indicates a successful exit. INFO = -i indicates the i-th argument had an illegal value.
DESCA(2) = ICTXT
DESCA(3) = M
DESCA(4) = N
DESCA(5) = MB
DESCA(6) = NB
DESCA(7) = IRSRC
DESCA(8) = ICSRC
DESCA(9) = LLD
The IMSL Library routines which interface with ScaLAPACK routines use IRSRC = 0 and
ICSRC = 0 for the internal calls to DESCINIT.
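The value passed as LLD must be large enough for the rows that a process actually stores. The sketch below counts those rows for a block-cyclic layout with IRSRC = 0, using the same arithmetic as the ScaLAPACK TOOLS function NUMROC. The function local_count is a hypothetical stand-in written for this illustration, and the sizes M=500, MB=32, and two process rows are assumed values.

```fortran
program local_size_sketch
  implicit none
  integer, parameter :: m = 500, mb = 32, nprow = 2
  integer :: iprow, rows, total

  total = 0
  do iprow = 0, nprow - 1
    rows = local_count(m, mb, iprow, 0, nprow)
    write (*,*) 'process row', iprow, 'stores', rows, 'rows'
    total = total + rows
  end do
  ! The local counts add up to the global row dimension M.
  write (*,*) 'total =', total

contains

  ! Number of rows (or columns) of a block-cyclic array held by one process.
  pure integer function local_count(n, nb, iproc, isrc, nprocs) result(cnt)
    integer, intent(in) :: n, nb, iproc, isrc, nprocs
    integer :: dist, nblocks, extra

    ! Distance of this process from the one holding the first block.
    dist = mod(nprocs + iproc - isrc, nprocs)
    ! Whole blocks in the array, shared out evenly first.
    nblocks = n/nb
    cnt = (nblocks/nprocs)*nb
    ! Leftover whole blocks go to the first processes in order.
    extra = mod(nblocks, nprocs)
    if (dist < extra) then
      cnt = cnt + nb
    else if (dist == extra) then
      ! The next process receives the final partial block, if any.
      cnt = cnt + mod(n, nb)
    end if
  end function local_count
end program local_size_sketch
```

A leading dimension of at least max(1, rows) on each process is then a safe choice for LLD.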
ScaLAPACK_Support
ScaLAPACK_Int
PBLAS_Int
BLACS_Int
TOOLS_Int
LAPACK_Int
ScaLAPACK_IO_Int
MPI_Node_Int
GRIDINFO_Int
The module holding data describing the processor grid and information
required to map the target array to the processors. See the Description
section of ScaLAPACK_SETUP below.
ScaLAPACK_MAP_Int
ScaLAPACK_UNMAP_Int
ScaLAPACK_SETUP
MPI REQUIRED
For a detailed description of MPI Requirements see Using ScaLAPACK Enhanced Routines in
the Introduction of this manual.
This routine sets up a processor grid and calculates default values for various entities to be used in
mapping a global array to the processor grid. All processors in the BLACS context call the routine.
Required Arguments
M The row dimension of the global array for which the local array dimensions are to be
calculated. (Input)
N The column dimension of the global array for which the local array dimensions are to be
calculated. (Input)
NSQUARE Input logical which indicates whether the block used for mapping the global
array to the processor grid must be square. If the block must be square, set NSQUARE to
.TRUE., otherwise, set it to .FALSE. (Input)
GRID1D Input logical which indicates whether the processor grid is to be one dimensional
or two dimensional. Set GRID1D to .TRUE. if the grid is to be one dimensional.
Otherwise, set GRID1D to .FALSE. (Input)
FORTRAN 90 Interface
Generic: CALL ScaLAPACK_SETUP (M, N, NSQUARE, GRID1D)
Description
Subroutine ScaLAPACK_SETUP creates a processor grid based on the number of processors being
used and the GRID1D logical supplied by the user. The argument, NSQUARE, is supplied because
some ScaLAPACK routines require that the row and column blocking factors be equal. GRID1D
is supplied for those routines which require that the processor grid be one dimensional.
ScaLAPACK_SETUP also establishes values for MP_M, MP_N, MP_NPROW, MP_NPCOL, MP_MB,
MP_NB, MP_PIGRID, MP_ICTXT, MP_NSQUARE, and MP_GRID1D in the IMSL Fortran Library
module GRIDINFO_INT. The above entities are defined as follows:
MP_M The row dimension of the primary array which is to be distributed among the processors.
MP_N The column dimension of the primary array which is to be distributed among the
processors.
MP_NPROW The number of rows in the processor grid.
MP_NPCOL The number of columns in the processor grid.
MP_MB The row blocking factor to be used in distributing the array.
MP_NB The column blocking factor to be used in distributing the array.
MP_PIGRID The pointer to the processor grid, MP_IGRID.
Example
See ScaLAPACK_WRITE.
ScaLAPACK_GETDIM
REQUIRED
For a detailed description of MPI Requirements see Using ScaLAPACK Enhanced Routines in
the Introduction of this manual.
This routine calculates the row and column dimensions of a local distributed array based on the
size of the array to be distributed and the row and column blocking factors to be used. All
processors in the BLACS context call the routine.
Required Arguments
M The row dimension of the global array for which the local array dimensions are to be
calculated. (Input)
N The column dimension of the global array for which the local array dimensions are to be
calculated. (Input)
MB The row blocking factor to be used in distributing the array. (Input)
NB The column blocking factor to be used in distributing the array. (Input)
MXLDA The row dimension of the local array. (Output)
MXCOL The column dimension of the local array. (Output)
FORTRAN 90 Interface
Generic: CALL ScaLAPACK_GETDIM (M, N, MB, NB, MXLDA, MXCOL)
Description
Subroutine ScaLAPACK_GETDIM calculates the row and column dimensions of a local array by
using the ScaLAPACK utility NUMROC.
Note that ScaLAPACK_SETUP must be called prior to calling this routine because
ScaLAPACK_GETDIM will use some of the global entities defined by ScaLAPACK_SETUP.
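The kind of count that NUMROC returns can be sketched in plain Fortran as follows. This is an illustration only, not the library source: the function name local_count, the sample sizes, and the choice of three processes are hypothetical, and the first block is assumed to start on process 0. The function returns how many rows (or columns) of a dimension of length n land on a given process when blocks of size nb are dealt out cyclically.

```fortran
program numroc_sketch
  implicit none
  ! Hypothetical sizes: 33 rows, blocks of 4 rows, 3 process rows.
  integer, parameter :: n = 33, nb = 4, nprocs = 3
  integer :: iproc, total

  total = 0
  do iproc = 0, nprocs - 1
     print '(a,i0,a,i0)', 'process ', iproc, ' holds rows: ', &
          local_count(n, nb, iproc, 0, nprocs)
     total = total + local_count(n, nb, iproc, 0, nprocs)
  end do
  ! The local counts must add up to the global dimension.
  if (total /= n) error stop 'local counts do not sum to n'

contains

  ! Number of rows (or columns) of a block-cyclically distributed
  ! dimension of length n that land on process iproc, when blocks of
  ! size nb are dealt out starting at process isrc.
  pure integer function local_count(n, nb, iproc, isrc, nprocs) result(cnt)
    integer, intent(in) :: n, nb, iproc, isrc, nprocs
    integer :: mydist, nblocks, extra
    mydist  = mod(nprocs + iproc - isrc, nprocs)  ! distance from source
    nblocks = n / nb                              ! number of full blocks
    cnt     = (nblocks / nprocs) * nb             ! full rounds of blocks
    extra   = mod(nblocks, nprocs)                ! leftover full blocks
    if (mydist < extra) then
       cnt = cnt + nb                             ! one more full block
    else if (mydist == extra) then
       cnt = cnt + mod(n, nb)                     ! the partial block
    end if
  end function local_count

end program numroc_sketch
```

With these sample sizes the three processes hold 12, 12, and 9 rows, which sum to the global dimension of 33.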
Example
See ScaLAPACK_WRITE.
ScaLAPACK_READ
REQUIRED
For a detailed description of MPI Requirements see Using ScaLAPACK Enhanced Routines in
the Introduction of this manual.
This routine reads matrix data from a file and transmits it into the two-dimensional block-cyclic
form required by ScaLAPACK routines. This routine contains a call to a barrier routine so that if
one process is writing the file and an alternate process is to read it, the results will be
synchronized.
All processors in the BLACS context call the routine.
Required Arguments
File_Name A character variable naming the file containing the matrix data. (Input)
This file is opened with STATUS=OLD. If the name is misspelled, the file does not
exist, or an access violation occurs, an error message of type terminal is issued.
After the contents are read, the file is closed. This file is read with a loop logically
equivalent to groups of reads:
READ() ((BUFFER(I,J), I=1,M), J=1, NB)
or (optionally):
READ() ((BUFFER(I,J), J=1,N), I=1, MB)
DESC_A(*) The nine integer parameters associated with the ScaLAPACK matrix
descriptor. Values for NB, MB, and LDA are contained in this array. (Input)
A(LDA,*) This is an assumed-size array, with leading dimension LDA, that will contain
this processor's piece of the block-cyclic matrix. The data type for A(*,*) is any of five
Fortran intrinsic types: integer; single precision, real; double precision, real; single
precision, complex; and double precision, complex. (Output)
Optional Arguments
Format A character variable containing a format to be used for reading the file containing
matrix data. If this argument is not present, an unformatted or list-directed read is
used. (Input)
iopt Derived type array with the same precision as the array A(*,*), used for passing
optional data to ScaLAPACK_READ. (Input)
The options are as follows:
Packaged Options for ScaLAPACK_READ
Option Prefix = ?   Option Name                    Option Value
s_, d_              ScaLAPACK_READ_UNIT
s_, d_              ScaLAPACK_READ_FROM_PROCESS
s_, d_              ScaLAPACK_READ_BY_ROWS
iopt(IO) = ScaLAPACK_READ_UNIT
Sets the unit number to the value in iopt(IO + 1)%idummy. The default unit
number is the value 11.
iopt(IO) = ScaLAPACK_READ_FROM_PROCESS
Sets the process number that reads the named file to the value in
iopt(IO + 1)%idummy. The default process number is the value 0.
iopt(IO) = ScaLAPACK_READ_BY_ROWS
Read the matrix by rows from the named file. By default the matrix is read by
columns.
FORTRAN 90 Interface
Generic:
Specific:
Description
Subroutine ScaLAPACK_READ reads columns or rows of a problem matrix so that it is usable by a
ScaLAPACK routine. It uses the two-dimensional block-cyclic array descriptor for the matrix to
place the data in the desired assumed-size arrays on the processors. The blocks of data are read,
then transmitted and received. The block sizes, contained in the array descriptor, determine the
data set size for each blocking send and receive pair. The number of these synchronization points
is proportional to $M \times N / (MB \times NB)$. A temporary local buffer is allocated for staging the
matrix data. It is of size M by NB, when reading by columns, or N by MB, when reading by rows.
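The placement rule behind the block-cyclic form can be sketched for a single dimension in plain Fortran. The sketch below is an illustration under stated assumptions, not the library's implementation: the sizes are hypothetical, the first block is assumed to sit on process 0 (matching IRSRC = 0 and ICSRC = 0), and processes are numbered from 0. For each global column it computes the owning process and the local column index, then inverts the mapping as a consistency check.

```fortran
program block_cyclic_index_sketch
  implicit none
  ! Hypothetical sizes: 10 columns, column blocking factor 2, 2 processes.
  integer, parameter :: n = 10, nb = 2, nprocs = 2
  integer :: j, blk, owner, jloc, back

  do j = 1, n
     blk   = (j - 1) / nb                        ! zero-based block number
     owner = mod(blk, nprocs)                    ! process holding the block
     jloc  = (blk / nprocs) * nb + mod(j - 1, nb) + 1   ! local column
     ! Invert the mapping as a consistency check.
     back  = ((jloc - 1) / nb * nprocs + owner) * nb + mod(jloc - 1, nb) + 1
     if (back /= j) error stop 'mapping is not invertible'
     print '(3(a,i0))', 'global ', j, ' -> process ', owner, ', local ', jloc
  end do
end program block_cyclic_index_sketch
```

The same rule applies independently to rows (with MB and the number of process rows) and to columns (with NB and the number of process columns).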
Example
See ScaLAPACK_WRITE.
ScaLAPACK_WRITE
REQUIRED
For a detailed description of MPI Requirements see Using ScaLAPACK Enhanced Routines in
the Introduction of this manual.
This routine writes the matrix data to a file. The data is transmitted from the two-dimensional
block-cyclic form used by ScaLAPACK. This routine contains a call to a barrier routine so that if
one process is writing the file and an alternate process is to read it, the results will be
synchronized. All processors in the BLACS context call the routine.
Required Arguments
File_Name A character variable naming the file to receive the matrix data. (Input)
This file is opened with STATUS=UNKNOWN. If any access violation occurs, an
error message of type terminal is issued. If the file already exists it will be
overwritten. After the contents are written, the file is closed. This file is written with a
loop logically equivalent to groups of writes:
WRITE() ((BUFFER(I,J), I=1,M), J=1, NB)
or (optionally):
WRITE() ((BUFFER(I,J), J=1,N), I=1, MB)
DESC_A(*) The nine integer parameters associated with the ScaLAPACK matrix
descriptor. Values for NB, MB, LDA are contained in this array. (Input)
A(LDA,*) This is an assumed-size array, with leading dimension LDA, containing this
processor's piece of the block-cyclic matrix. The data type for A(*,*) is any of five
Fortran intrinsic types: integer; single precision, real; double precision, real; single
precision, complex; or double precision, complex. (Input)
Optional Arguments
Format A character variable containing a format to be used for writing the file that receives
matrix data. If this argument is not present, an unformatted or list-directed write is
used. (Input)
iopt Derived type array with the same precision as the array A(*,*), used for passing
optional data to ScaLAPACK_WRITE. Use single precision when A(*,*) is type
INTEGER. (Input)
The options are as follows:
Option Prefix = ?   Option Name                     Option Value
s_, d_              ScaLAPACK_WRITE_UNIT
s_, d_              ScaLAPACK_WRITE_FROM_PROCESS
s_, d_              ScaLAPACK_WRITE_BY_ROWS
iopt(IO) = ScaLAPACK_WRITE_UNIT
Sets the unit number to the value in iopt(IO + 1)%idummy.
iopt(IO) = ScaLAPACK_WRITE_FROM_PROCESS
Sets the process number that writes the named file to the integer component of
iopt(IO + 1)%idummy. The default process number is the value 0.
iopt(IO) = ScaLAPACK_WRITE_BY_ROWS
Write the matrix by rows to the named file. By default the matrix is written by
columns.
FORTRAN 90 Interface
Generic:
Specific:
Description
Subroutine ScaLAPACK_WRITE writes columns or rows of a problem matrix output by a
ScaLAPACK routine. It uses the two-dimensional block-cyclic array descriptor for the matrix to
extract the data from the assumed-size arrays on the processors. The blocks of data are
transmitted and received, then written. The block sizes, contained in the array descriptor,
determine the data set size for each blocking send and receive pair. The number of these
synchronization points is proportional to $M \times N / (MB \times NB)$. A temporary local buffer is
allocated for staging the matrix data. It is of size M by NB, when writing by columns, or N by MB,
when writing by rows.
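The file layout implied by the column-wise loop can be tried out without MPI. The following minimal sketch, in plain Fortran, writes a small matrix in panels of NB columns and reads it back with the matching loop. It is an illustration only: the sizes are hypothetical, a scratch file stands in for the named file, and list-directed I/O is assumed, as when the Format argument is absent.

```fortran
program panel_io_sketch
  implicit none
  ! Hypothetical sizes: a 6 by 6 matrix written in panels of NB = 4 columns.
  integer, parameter :: m = 6, n = 6, nb = 4
  real(kind(1d0)) :: a(m, n), b(m, n)
  integer :: i, j, l, iu

  ! Entry (i,j) is stored as i + j/10 so that it is easy to recognize.
  do j = 1, n
     do i = 1, m
        a(i, j) = real(i, kind(1d0)) + real(j, kind(1d0)) / 10d0
     end do
  end do

  ! A scratch file stands in for the named file used by the library.
  open (newunit=iu, status='scratch', form='formatted')

  ! Write by columns, one panel of at most NB columns per WRITE statement.
  do j = 1, n, nb
     write (iu, *) ((a(i, l), i = 1, m), l = j, min(n, j + nb - 1))
  end do

  ! Read the panels back in the same order.
  rewind (iu)
  do j = 1, n, nb
     read (iu, *) ((b(i, l), i = 1, m), l = j, min(n, j + nb - 1))
  end do
  close (iu)

  if (maxval(abs(a - b)) > 1d-12) error stop 'round trip failed'
  print '(a)', 'panel round trip is consistent'
end program panel_io_sketch
```

The last panel is narrower than NB whenever N is not a multiple of NB, which is why the inner loop bound uses min(n, j + nb - 1).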
! block-cyclic matrix.
USE ScaLAPACK_SUPPORT
USE ERROR_OPTION_PACKET
USE MPI_SETUP_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: M=6, N=6, NIN=10
INTEGER DESC_A(9), IERROR, INFO, I, J, K, L, MXLDA, MXCOL
LOGICAL :: GRID1D = .TRUE., NSQUARE = .TRUE.
real(kind(1d0)), allocatable :: A(:,:), A0(:,:)
real(kind(1d0)) ERROR
TYPE(d_OPTIONS) IOPT(1)
MP_NPROCS=MP_SETUP()
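! Set up a 1D processor grid and define its context ID, MP_ICTXT.
CALL SCALAPACK_SETUP(M, N, NSQUARE, GRID1D)
! Get the local array dimensions MXLDA and MXCOL.
CALL SCALAPACK_GETDIM(M, N, MP_MB, MP_NB, MXLDA, MXCOL)
! Set up the array descriptor.
CALL DESCINIT(DESC_A, M, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
MXLDA, INFO)
! Allocate space for the local array.
ALLOCATE(A0(MXLDA,MXCOL))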
! A root process is used to create the matrix data for the test.
IF(MP_RANK == 0) THEN
ALLOCATE(A(M,N))
! Fill array with a pattern that is easy to recognize.
K=0
DO
K=K+1; IF(10**K > N) EXIT
END DO
DO J=1,N
DO I=1,M
! The values will appear, as decimals I.J, where I is
! the row and J is the column.
A(I,J)=REAL(I)+REAL(J)*10d0**(-K)
END DO
END DO
OPEN(UNIT=NIN, FILE='test.dat', STATUS='UNKNOWN')
! Write the data by columns.
DO J=1,N,MP_NB
WRITE(NIN,*) ((A(I,L),I=1,M),L=J,min(N,J+MP_NB-1))
END DO
CLOSE(NIN)
DEALLOCATE(A)
ALLOCATE(A(N,M))
END IF
! Read the matrix into the local arrays.
CALL ScaLAPACK_READ('test.dat', DESC_A, A0)
Output
Example 1 for BLACS is correct.
Additional Examples
Example 2: Distributed Matrix Product with PBLAS
The program SCPK_EX2 illustrates computation of the matrix product $C_{m \times n} = A_{m \times k} B_{k \times n}$. The
matrices on the right-hand side are random. Three temporary files are created and deleted.
BLACS and PBLAS are used. The problem size is such that the results are checked on one process.
program scpk_ex2
! This is Example 2 for ScaLAPACK_READ and ScaLAPACK_WRITE.
! The product of two matrices is computed with PBLAS
! and checked for correctness.
USE ScaLAPACK_SUPPORT
USE MPI_SETUP_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: K=32, M=33, N=34, NIN=10
INTEGER INFO, IA, JA, IB, JB, IC, JC, MXLDA, MXCOL, MXLDB, &
MXCOLB, MXLDC, MXCOLC, IERROR, I, J, L,&
DESC_A(9), DESC_B(9), DESC_C(9)
LOGICAL :: GRID1D = .TRUE., NSQUARE = .TRUE.
real(kind(1d0)) :: ALPHA, BETA, ERROR=1d0, SIZE_C
real(kind(1d0)), allocatable, dimension(:,:) :: A,B,C,X(:),&
A0, B0, C0
MP_NPROCS=MP_SETUP()
! Set up a 1D processor grid and define its context ID, MP_ICTXT
CALL SCALAPACK_SETUP(M, N, NSQUARE, GRID1D)
! Get the array descriptor entities
CALL SCALAPACK_GETDIM(M, K, MP_MB, MP_NB, MXLDA, MXCOL)
CALL SCALAPACK_GETDIM(K, N, MP_NB, MP_MB, MXLDB, MXCOLB)
CALL SCALAPACK_GETDIM(M, N, MP_MB, MP_NB, MXLDC, MXCOLC)
! Set up the array descriptors
CALL DESCINIT(DESC_A, M, K, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
MXLDA, INFO)
CALL DESCINIT(DESC_B, K, N, MP_NB, MP_NB, 0, 0, MP_ICTXT, &
MXLDB, INFO)
CALL DESCINIT(DESC_C, M, N, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
MXLDC, INFO)
ALLOCATE(A0(MXLDA,MXCOL), B0(MXLDB,MXCOLB),C0(MXLDC,MXCOLC))
! A root process is used to create the matrix data for the test.
IF(MP_RANK == 0) THEN
ALLOCATE(A(M,K), B(K,N), C(M,N), X(M))
CALL RANDOM_NUMBER(A); CALL RANDOM_NUMBER(B)
OPEN(UNIT=NIN, FILE='Atest.dat', STATUS='UNKNOWN')
! Write the data by columns.
DO J=1,K,MP_NB
WRITE(NIN,*) ((A(I,L),I=1,M),L=J,min(K,J+MP_NB-1))
END DO
CLOSE(NIN)
OPEN(UNIT=NIN, FILE='Btest.dat', STATUS='UNKNOWN')
! Write the data by columns.
DO J=1,N,MP_NB
WRITE(NIN,*) ((B(I,L),I=1,K),L=J,min(N,J+MP_NB-1))
END DO
CLOSE(NIN)
END IF
! Read the factors into the local arrays.
CALL ScaLAPACK_READ('Atest.dat', DESC_A, A0)
CALL ScaLAPACK_READ('Btest.dat', DESC_B, B0)
! Compute the distributed product C = A x B.
ALPHA=1d0; BETA=0d0
IA=1; JA=1; IB=1; JB=1; IC=1; JC=1
C0=0
CALL pdGEMM &
("No", "No", M, N, K, ALPHA, A0, IA, JA,&
DESC_A, B0, IB, JB, DESC_B, BETA,&
C0, IC, JC, DESC_C )
! Put the product back on the root node.
Call ScaLAPACK_WRITE('Ctest.dat', DESC_C, C0)
IF(MP_RANK == 0) THEN
! Read the residuals and check them for size.
OPEN(UNIT=NIN, FILE='Ctest.dat', STATUS='OLD')
! Read the data by columns.
DO J=1,N,MP_NB
READ(NIN,*) ((C(I,L),I=1,M),L=J,min(N,J+MP_NB-1))
END DO
CLOSE(NIN,STATUS='DELETE')
SIZE_C=SUM(ABS(C)); C=C-matmul(A,B)
ERROR=SUM(ABS(C))/SIZE_C
! Open other temporary files and delete them.
OPEN(UNIT=NIN, FILE='Atest.dat', STATUS='OLD')
CLOSE(NIN,STATUS='DELETE')
OPEN(UNIT=NIN, FILE='Btest.dat', STATUS='OLD')
CLOSE(NIN,STATUS='DELETE')
END IF
! See to any error messages.
call e1pop("Mp_Setup")
! Deallocate storage arrays and exit from BLACS.
IF(ALLOCATED(A)) DEALLOCATE(A)
IF(ALLOCATED(B)) DEALLOCATE(B)
IF(ALLOCATED(C)) DEALLOCATE(C)
IF(ALLOCATED(X)) DEALLOCATE(X)
IF(ALLOCATED(A0)) DEALLOCATE(A0)
IF(ALLOCATED(B0)) DEALLOCATE(B0)
IF(ALLOCATED(C0)) DEALLOCATE(C0)
! Check the results.
IF(ERROR <= SQRT(EPSILON(ALPHA)) .and. &
MP_RANK == 0) THEN
write(*,*) " Example 2 for BLACS and PBLAS is correct."
END IF
! Exit from using this process grid.
CALL SCALAPACK_EXIT( MP_ICTXT )
! Shut down MPI
MP_NPROCS = MP_SETUP('FINAL')
END
Output
Example 2 for BLACS and PBLAS is correct.
Output
Example 3 for BLACS and ScaLAPACK is correct.
ScaLAPACK_MAP
REQUIRED
For a detailed description of MPI Requirements see Using ScaLAPACK Enhanced Routines in
the Introduction of this manual.
This routine maps array data from a global array to local arrays in the two-dimensional
block-cyclic form required by ScaLAPACK routines.
All processors in the BLACS context call the routine.
Required Arguments
A Global rank-1 or rank-2 array which is to be mapped to the processor grid. The data type
for A is any of five Fortran intrinsic types: integer; single precision, real; double
precision, real; single precision, complex; double precision, complex. Normally, the
user defines A to be valid only on the MP_RANK = 0 processor. (Input)
DESC_A An integer vector containing the nine parameters associated with the
ScaLAPACK matrix descriptor for array A. See Usage Notes for ScaLAPACK
Utilities for a description of the nine parameters. (Input)
A0 This is a local rank-1 or rank-2 array that will contain this processor's piece of the
block-cyclic array. The data type for A0 is any of five Fortran intrinsic types: integer;
single precision, real; double precision, real; single precision, complex; and double
precision, complex. (Output)
Optional Arguments
LDA Leading dimension of A as specified in the calling program. If this argument is not
present, SIZE(A,1) is used. (Input)
COLMAP Input logical which indicates whether the global array should be mapped in
column major form or row major form. COLMAP set to .TRUE. will result in the array
being mapped in column major form while setting COLMAP to .FALSE. will result in
the array being mapped in row major form. The default value of COLMAP is .TRUE.
(Input)
FORTRAN 90 Interface
Generic:
Description
Subroutine ScaLAPACK_MAP maps columns or rows of a global array on
MP_RANK = 0 to local distributed arrays so that the problem array is usable by a ScaLAPACK
routine. It uses the two-dimensional block-cyclic array descriptor for the matrix to place the data
in the desired assumed-size arrays on the processors. The block sizes, contained in the array
descriptor, determine the data set size for each blocking send and receive pair. The number of
these synchronization points is proportional to $M \times N / (MB \times NB)$. A temporary local buffer is
allocated for staging the array data. It is of size M by NB, when mapping by columns, or N by MB,
when mapping by rows.
Example
See ScaLAPACK_UNMAP.
ScaLAPACK_UNMAP
REQUIRED
For a detailed description of MPI Requirements see Using ScaLAPACK Enhanced Routines in
the Introduction of this manual.
This routine unmaps array data from local distributed arrays to a global array. The data in the local
arrays must have been stored in the two-dimensional block-cyclic form required by ScaLAPACK
routines. All processors in the BLACS context call the routine.
Required Arguments
A0 This is a local rank-1 or rank-2 array that contains this processor's piece of the
block-cyclic array. The data type for A0 is any of five Fortran intrinsic types: integer; single
precision, real; double precision, real; single precision, complex; or double
precision, complex. (Input)
DESC_A An integer vector containing the nine parameters associated with the
ScaLAPACK matrix descriptor for array A. See Usage Notes for ScaLAPACK
Utilities for a description of the nine parameters. (Input)
A Global rank-1 or rank-2 array which is to receive the array which had been mapped to
the processor grid. The data type for A is any of five Fortran intrinsic types: integer;
single precision, real; double precision, real; single precision, complex; or double
precision, complex. A is only valid on MP_RANK = 0 after ScaLAPACK_UNMAP has
been called. (Output)
Optional Arguments
LDA Leading dimension of A as specified in the calling program. If this argument is not
present, SIZE(A,1) is used. (Input)
COLMAP Input logical which indicates whether the global array should be mapped in
column major form or row major form. COLMAP set to .TRUE. will result in the array
being mapped in column major form while setting COLMAP to .FALSE. will result in
the array being mapped in row major form. The default value of COLMAP is .TRUE.
(Input)
FORTRAN 90 Interface
Generic:
Description
Subroutine ScaLAPACK_UNMAP unmaps columns or rows of local distributed arrays to a global
array on MP_RANK = 0. It uses the two-dimensional block-cyclic array descriptor for the matrix
to retrieve the data from the assumed-size arrays on the processors. The block sizes, contained in
the array descriptor, determine the data set size for each blocking send and receive pair. The
number of these synchronization points is proportional to $M \times N / (MB \times NB)$. A temporary
local buffer is allocated for staging the array data. It is of size M by NB, when mapping by
columns, or N by MB, when mapping by rows.
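The relationship between mapping and unmapping can be sketched serially, without MPI, for a rank-1 array. In the illustration below the "processes" are simply columns of a two-dimensional array, and all names and sizes are hypothetical. The sketch deals blocks of NB entries to the simulated processes in turn, collects them again, and checks that the original array is restored. It is not the library's implementation, which moves the data with blocking sends and receives.

```fortran
program map_unmap_sketch
  implicit none
  ! Hypothetical sizes: 11 entries, blocking factor 3, 2 simulated processes.
  integer, parameter :: n = 11, nb = 3, nprocs = 2
  ! Largest local length: blocks per process (rounded up) times nb.
  integer, parameter :: max_local = ((n + nb - 1) / nb + nprocs - 1) / nprocs * nb
  integer :: a(n), a_back(n), a0(max_local, 0:nprocs - 1)
  integer :: j, blk, owner, jloc

  a = [(10 * j, j = 1, n)]
  a0 = 0
  a_back = 0

  ! "Map": deal blocks of nb entries to the processes in turn.
  do j = 1, n
     blk   = (j - 1) / nb
     owner = mod(blk, nprocs)
     jloc  = (blk / nprocs) * nb + mod(j - 1, nb) + 1
     a0(jloc, owner) = a(j)
  end do

  ! "Unmap": collect the local pieces back into global order.
  do j = 1, n
     blk   = (j - 1) / nb
     owner = mod(blk, nprocs)
     jloc  = (blk / nprocs) * nb + mod(j - 1, nb) + 1
     a_back(j) = a0(jloc, owner)
  end do

  if (any(a_back /= a)) error stop 'unmap did not restore the array'
  do owner = 0, nprocs - 1
     print '(a,i0,a,*(i4))', 'process ', owner, ':', a0(:, owner)
  end do
end program map_unmap_sketch
```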
Output
Example 1 for ScaLAPACK_MAP and ScaLAPACK_UNMAP is correct.
ScaLAPACK_EXIT
REQUIRED
For a detailed description of MPI Requirements see Using ScaLAPACK Enhanced Routines in
the Introduction of this manual.
This routine exits ScaLAPACK mode for the IMSL Library routines. All processors in the BLACS
context call the routine.
Required Arguments
ICTXT The BLACS context ID to which the processor grid is associated. (Input)
FORTRAN 90 Interface
Generic: CALL ScaLAPACK_EXIT (ICTXT)
Description
Subroutine ScaLAPACK_EXIT exits ScaLAPACK mode for the IMSL Library routines. The
following actions occur when this routine is called:
ERROR_POST
Prints error messages that are generated by IMSL routines using EPACK.
Required Argument
EPACK (Input [/Output])
Derived type array of size p containing the array of message numbers and associated
data for the messages. The definition of this derived type is packaged within the
modules used as interfaces for each suite of routines. The declaration is:
type ?_error
integer idummy; real(kind(?_)) rdummy
end type
Optional Arguments
new_unit = nunit (Input)
Unit number, of type integer, associated for reading the direct-access file of error
messages for the IMSL Fortran 90 routines.
Default: nunit = 4
Pathname in the local file space, of type character*64, needed for reading the
direct-access file of error messages. Default string for path is defined during the installation
procedure for certain IMSL Fortran Library routines.
FORTRAN 90 Interface
Generic:
Specific:
Description
A default direct-access error message file (.daf file) is supplied with this product. This file is read
by error_post using the contents of the derived type argument epack, containing the message
number, error severity level, and associated data. The message is converted into character strings
accepted by the error processor and then printed. The number of pending messages that print
depends on the settings of the parameters PRINT and STOP in the Reference Material in the IMSL
MATH/LIBRARY User's Manual. These values are initialized to defaults such that any Level 5 or
Level 4 message causes a STOP within the error processor after a print of the text. To change these
defaults so that more than one error message prints, use the routine ERSET documented and
illustrated with examples in the Reference Material in the IMSL MATH/LIBRARY User's
Manual. The method of using a message file to store the messages is required to support sharedmemory parallelism.
New system-wide messages have been developed for applications using this Library.
A subset of users need to add a specific message file for their applications using this Library.
Following is information on changing the contents of the message file, and information on how to
create and access a message file for a private application.
Changing Messages
In order to change messages, two files are required:
To change messages, first make a backup copy of messages.gls. Use a text editor to edit
messages.gls. The format of this file is a series of pairs of statements:
message_number=<nnnn>
message='message string'
%(i<n>) for an integer substitution, where n is the nth integer output in this message.
%(r<n>) for single precision real number substitution, where n is the nth real number output
in this message.
%(d<n>) for double precision real number substitution, where n is the nth double precision
number output in this message.
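The effect of the substitution markers can be imitated in plain Fortran. The sketch below is only an illustration of the %(i<n>) notation: the message text, the function name substitute, and the values are hypothetical, and the actual expansion is performed inside the library's error processor, not by user code.

```fortran
program message_substitution_sketch
  implicit none
  character(len=:), allocatable :: text

  ! A hypothetical message using two integer substitution markers.
  text = 'Row %(i1) of the matrix has %(i2) zero entries.'
  text = substitute(text, '%(i1)', 3)
  text = substitute(text, '%(i2)', 12)
  print '(a)', text
  if (text /= 'Row 3 of the matrix has 12 zero entries.') &
       error stop 'unexpected substitution result'

contains

  ! Replace the first occurrence of marker in msg by the integer ival.
  function substitute(msg, marker, ival) result(out)
    character(len=*), intent(in) :: msg, marker
    integer, intent(in) :: ival
    character(len=:), allocatable :: out
    character(len=32) :: buf
    integer :: pos

    write (buf, '(i0)') ival
    pos = index(msg, marker)
    if (pos == 0) then
       out = msg                     ! marker absent: leave text unchanged
    else
       out = msg(:pos - 1) // trim(buf) // msg(pos + len(marker):)
    end if
  end function substitute

end program message_substitution_sketch
```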
New messages added to the system-wide error message file should be placed at the end of the file.
Message numbers 5000 through 10000 have been reserved for user-added messages. Currently,
messages 1 through 1400 are used by IMSL. Gaps in message number ranges are permitted;
however, the message numbers must be in ascending order within the file. The message numbers
used for each IMSL Fortran Library subroutine are documented in this manual and in online help.
If existing messages are being edited or translated, make sure not to alter the message_number
lines. (This prevents conflicts with any new messages.gls file supplied with future versions of this
Library.)
A new messages.daf file is created. Edit the prepmess_output file and look near the end of
the file for the new error messages. The prepmess program processes each message through the
error message system as a validity check. There should be no FATAL error announcement within
the prepmess_output file.
SHOW
Prints rank-1 or rank-2 arrays of numbers in a readable format.
Required Arguments
X Rank-1 or rank-2 array containing the numbers to be printed. (Input)
Optional Arguments
text = CHARACTER (Input)
CHARACTER(LEN=*) string used for labeling the array.
present the output is converted to characters and packed. The lines are separated by an
end-of-line sequence. The length of buffer is estimated by the line width in effect,
times the number of lines for the array.
Derived type array with the same precision as the input array; used for passing optional
data to the routine. Use the REAL(KIND(1E0)) precision for output of INTEGER
arrays. The options are as follows:
Packaged Options for SHOW
Prefix is blank
Option Name                      Option Value
show_significant_digits_is_4
show_significant_digits_is_7
show_significant_digits_is_16
show_line_width_is_44
show_line_width_is_72
show_line_width_is_128
show_end_of_line_sequence_is
show_starting_index_is
show_starting_row_index_is
show_starting_col_index_is       10
iopt(IO) = show_significant_digits_is_4
iopt(IO) = show_significant_digits_is_7
iopt(IO) = show_significant_digits_is_16
These options allow more precision to be displayed. The default is 4D for each
value. The other possible choices display 7D or 16D.
iopt(IO) = show_line_width_is_44
iopt(IO) = show_line_width_is_72
iopt(IO) = show_line_width_is_128
These options allow varying the output line width. The default is 72 characters per
line. This allows output on many work stations or terminals to be read without
wrapping of lines.
iopt(IO) = show_end_of_line_sequence_is
The sequence of characters ending a line when it is placed into the internal
character buffer corresponding to the optional argument IMAGE = buffer.
iopt(IO) = show_starting_index_is
This is used to reset the starting index for a rank-1 array to a value different from
the default value, which is 1.
iopt(IO) = show_starting_row_index_is
iopt(IO) = show_starting_col_index_is
These are used to reset the starting row and column indices to values different from
their defaults, each 1.
FORTRAN 90 Interface
Generic:
Specific:
Description
The show routine is a generic subroutine interface to separate low-level subroutines for each data
type and array shape. Output is directed to the unit number IUNIT. That number is obtained with
the subroutine UMACH, IMSL MATH/LIBRARY User's Manual. Thus the user must open this unit
in the calling program if it is desired to be different from the standard output unit. If the optional
argument IMAGE = buffer is present, the output is not sent to a file but to a character string
within buffer. These characters are available to output or be used in the application.
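The idea of sending output to a character buffer instead of a unit can be sketched with standard Fortran internal writes. This is an illustration only and does not call show: the program name, the line layout, and the choice of ASCII 10 as the end-of-line sequence are assumptions. Each line of the array is converted to text, and the lines are packed into one string separated by the end-of-line sequence.

```fortran
program buffer_image_sketch
  implicit none
  integer, parameter :: n = 7, per_line = 4
  real :: x(n)
  character(len=*), parameter :: eol = achar(10)   ! assumed end-of-line
  character(len=:), allocatable :: buffer
  character(len=64) :: line
  integer :: i, last

  x = [(real(i) / 2.0, i = 1, n)]
  buffer = ''

  ! Convert the array to text, per_line values at a time, and pack the
  ! lines into one string separated by the end-of-line sequence.
  do i = 1, n, per_line
     last = min(n, i + per_line - 1)
     write (line, '(i3,a,i3,4es12.4)') i, ' to', last, x(i:last)
     buffer = buffer // trim(line) // eol
  end do

  ! The packed image can be written later, or processed further.
  write (*, '(a)', advance='no') buffer
  if (len(buffer) == 0) error stop 'buffer is empty'
end program buffer_image_sketch
```

Because the buffer already contains its own line separators, it is written with non-advancing output so that no extra record terminator is appended.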
call show (s_x, 'Rank-1, REAL')
call show (s_m, 'Rank-2, REAL')
call show (d_x, 'Rank-1, DOUBLE')
call show (d_m, 'Rank-2, DOUBLE')
call show (c_x, 'Rank-1, COMPLEX')
call show (c_m, 'Rank-2, COMPLEX')
call show (z_x, 'Rank-1, DOUBLE COMPLEX')
call show (z_m, 'Rank-2, DOUBLE COMPLEX')
call show (i_x, 'Rank-1, INTEGER')
call show (i_m, 'Rank-2, INTEGER')
Output
Example 1 for SHOW is correct.
Additional Examples
Example 2: Writing an Array to a Character Variable
This example prepares a rank-1 array for further processing, in this case delayed writing to the
standard output unit. The indices and the amount of precision are reset from their defaults, as in
Example 1. An end-of-line sequence of the characters CR-NL (ASCII 10,13) is used in place of
the standard ASCII 10. This is not required for writing this array, but is included for an illustration
of the option.
use show_int
use rand_int
implicit none
! This is Example 2 for SHOW.
integer, parameter :: n=7
real(kind(1e0)) s_x(-1:n)
type (s_options) options(7)
CHARACTER (LEN=(72+2)*4) BUFFER
! The data types printed are real(kind(1e0)) random numbers.
s_x=rand(s_x)
!
!
!
!
Output
Example 2 for SHOW is correct.
WRRRN
Prints a real rectangular matrix with integer row and column labels.
Required Arguments
TITLE Character string specifying the title. (Input)
TITLE set equal to a blank character(s) suppresses printing of the title. Use %/
within the title to create a new line. Long titles are automatically wrapped.
Optional Arguments
NRA Number of rows. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
ITRING Triangle option. (Input)
Default: ITRING = 0.
ITRING Action
FORTRAN 90 Interface
Generic:
Specific:
The specific interface names are S_WRRRN and D_WRRRN for two dimensional
arrays, and S_WRRRN1D and D_WRRRN1D for one dimensional arrays.
FORTRAN 77 Interface
Single:
Double:
Description
Routine WRRRN prints a real rectangular matrix with the rows and columns labeled 1, 2, 3, and so
on. WRRRN can restrict printing to the elements of the upper or lower triangles of matrices via the
ITRING option. Generally, ITRING ≠ 0 is used with symmetric matrices.
In addition, one-dimensional arrays can be printed as column or row vectors. For a column vector,
set NRA to the length of the array and set NCA = 1. For a row vector, set NRA = 1 and set NCA to the
length of the array. In both cases, set LDA = NRA and set ITRING = 0.
Comments
1.
2. Horizontal centering, a method for printing large matrices, paging, printing a title on
each page, and many other options can be selected by invoking WROPT.
3. A page width of 78 characters is used. Page width and page length can be reset by
invoking PGOPT.
4. Output is written to the unit specified by UMACH (see the Reference Material).
Example
The following example prints all of a $3 \times 4$ matrix A where $a_{ij} = i + j/10$.
USE WRRRN_INT
IMPLICIT NONE
INTEGER ITRING, LDA, NCA, NRA
PARAMETER (ITRING=0, LDA=10, NCA=4, NRA=3)
!
INTEGER I, J
REAL A(LDA,NCA)
!
DO 20 I=1, NRA
DO 10 J=1, NCA
A(I,J) = I + J*0.1
10 CONTINUE
20 CONTINUE
! Write A matrix.
CALL WRRRN ('A', A, NRA=NRA)
END
Output
              A
        1       2       3       4
1   1.100   1.200   1.300   1.400
2   2.100   2.200   2.300   2.400
3   3.100   3.200   3.300   3.400
WRRRL
Prints a real rectangular matrix with a given format and labels.
Required Arguments
TITLE Character string specifying the title. (Input)
TITLE set equal to a blank character(s) suppresses printing of the title.
A NRA by NCA matrix to be printed. (Input)
RLABEL CHARACTER * (*) vector of labels for rows of A. (Input)
If rows are to be numbered consecutively 1, 2, ..., NRA, use RLABEL(1) = 'NUMBER'. If
no row labels are desired, use RLABEL(1) = 'NONE'. Otherwise, RLABEL is a vector of
length NRA containing the labels.
CLABEL CHARACTER * (*) vector of labels for columns of A. (Input)
If columns are to be numbered consecutively 1, 2, ..., NCA, use
CLABEL(1) = 'NUMBER'. If no column labels are desired, use CLABEL(1) = 'NONE'.
Otherwise, CLABEL(1) is the heading for the row labels, and either CLABEL(2) must be
'NUMBER' or 'NONE', or CLABEL must be a vector of length NCA + 1 with
CLABEL(1 + j) containing the column heading for the j-th column.
Optional Arguments
NRA Number of rows. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
ITRING Triangle option. (Input)
Default: ITRING = 0.
ITRING Action
FORTRAN 90 Interface
Generic:
Specific:
The specific interface names are S_WRRRL and D_WRRRL for two dimensional
arrays, and S_WRRRL1D and D_WRRRL1D for one dimensional arrays.
FORTRAN 77 Interface
Single:
Double:
Description
Routine WRRRL prints a real rectangular matrix (stored in A) with row and column labels (specified
by RLABEL and CLABEL, respectively) according to a given format (stored in FMT). WRRRL can
restrict printing to the elements of upper or lower triangles of matrices via the ITRING option.
Generally, ITRING ≠ 0 is used with symmetric matrices.
In addition, one-dimensional arrays can be printed as column or row vectors. For a column vector,
set NRA to the length of the array and set NCA = 1. For a row vector, set NRA = 1 and set NCA to the
length of the array. In both cases, set LDA = NRA, and set ITRING = 0.
Comments
1.
CALL W2RRL (TITLE, NRA, NCA, A, LDA, ITRING, FMT, RLABEL, CLABEL, CHWK)
2.
RLABEL(1)  Xxxxx  Xxxxx  Xxxxx
RLABEL(2)  Xxxxx  Xxxxx  Xxxxx
3.
Use % / within titles or labels to create a new line. Long titles or labels are
automatically wrapped.
4.
For printing numbers whose magnitudes are unknown, the G format in FORTRAN is
useful; however, the decimal points will generally not be aligned when printing a
column of numbers. The V and W formats are special formats used by this routine to
select a D, E, F, or I format so that the decimal points will be aligned. The V and W
formats are specified as Vn.d and Wn.d. Here, n is the field width and d is the number
of significant digits generally printed. Valid values for n are 3, 4, …, 40. Valid values
for d are 1, 2, …, n − 2. If FMT specifies one format and that format is a V or W format,
all elements of the matrix A are examined to determine one FORTRAN format for
printing. If FMT specifies more than one format, FORTRAN formats are generated
separately from each V or W format.
5.
A page width of 78 characters is used. Page width and page length can be reset by
invoking PGOPT .
6.
Horizontal centering, method for printing large matrices, paging, method for printing
NaN (not a number), printing a title on each page, and many other options can be
selected by invoking WROPT .
7.
Example
The following example prints all of a 3 × 4 matrix A where $a_{ij} = (i + j/10)\,10^{j-3}$.
      USE WRRRL_INT
      IMPLICIT   NONE
      INTEGER    ITRING, LDA, NCA, NRA
      PARAMETER  (ITRING=0, LDA=10, NCA=4, NRA=3)
!
      INTEGER    I, J
      REAL       A(LDA,NCA)
      CHARACTER  CLABEL(5)*5, FMT*8, RLABEL(3)*5
!
      DATA FMT/'(W10.6)'/
      DATA CLABEL/'     ', 'Col 1', 'Col 2', 'Col 3', 'Col 4'/
      DATA RLABEL/'Row 1', 'Row 2', 'Row 3'/
!
      DO 20  I=1, NRA
         DO 10  J=1, NCA
            A(I,J) = (I+J*0.1)*10.0**(J-3)
   10    CONTINUE
   20 CONTINUE
!                                 Write A matrix.
      CALL WRRRL ('A', A, RLABEL, CLABEL, NRA=NRA, FMT=FMT)
      END
Output
                     A
         Col 1    Col 2    Col 3    Col 4
Row 1    0.011    0.120    1.300   14.000
Row 2    0.021    0.220    2.300   24.000
Row 3    0.031    0.320    3.300   34.000
WRIRN
Prints an integer rectangular matrix with integer row and column labels.
Required Arguments
TITLE Character string specifying the title. (Input)
TITLE set equal to a blank character(s) suppresses printing of the title. Use % /
within the title to create a new line. Long titles are automatically wrapped.
MAT NRMAT by NCMAT matrix to be printed. (Input)
Optional Arguments
NRMAT Number of rows. (Input)
Default: NRMAT = SIZE (MAT,1).
NCMAT Number of columns. (Input)
Default: NCMAT = SIZE (MAT,2).
LDMAT Leading dimension of MAT exactly as specified in the dimension statement in the
calling program. (Input)
Default: LDMAT = SIZE (MAT,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine WRIRN prints an integer rectangular matrix with the rows and columns labeled 1, 2, 3, and
so on. WRIRN can restrict printing to elements of the upper and lower triangles of matrices via the
ITRING option. Generally, ITRING ≠ 0 is used with symmetric matrices.
In addition, one-dimensional arrays can be printed as column or row vectors. For a column vector,
set NRMAT to the length of the array and set NCMAT = 1. For a row vector, set NRMAT = 1 and set
NCMAT to the length of the array. In both cases, set LDMAT = NRMAT and set ITRING = 0.
Comments
1.
All the entries in MAT are printed using a single I format. The field width is determined
by the largest absolute entry.
2.
Horizontal centering, a method for printing large matrices, paging, printing a title on
each page, and many other options can be selected by invoking WROPT.
3.
A page width of 78 characters is used. Page width and page length can be reset by
invoking PGOPT .
4.
Example
The following example prints all of a 3 × 4 matrix A = MAT where $a_{ij} = 10i + j$.
      USE WRIRN_INT
      IMPLICIT   NONE
      INTEGER    ITRING, LDMAT, NCMAT, NRMAT
      PARAMETER  (ITRING=0, LDMAT=10, NCMAT=4, NRMAT=3)
!
      INTEGER    I, J, MAT(LDMAT,NCMAT)
!
      DO 20  I=1, NRMAT
         DO 10  J=1, NCMAT
            MAT(I,J) = I*10 + J
   10    CONTINUE
   20 CONTINUE
!                                 Write MAT matrix.
      CALL WRIRN ('MAT', MAT, NRMAT=NRMAT)
      END
Output
         MAT
     1    2    3    4
1   11   12   13   14
2   21   22   23   24
3   31   32   33   34
WRIRL
Prints an integer rectangular matrix with a given format and labels.
Required Arguments
TITLE Character string specifying the title. (Input)
TITLE set equal to a blank character(s) suppresses printing of the title.
MAT NRMAT by NCMAT matrix to be printed. (Input)
RLABEL CHARACTER * (*) vector of labels for rows of MAT. (Input)
If rows are to be numbered consecutively 1, 2, …, NRMAT, use
RLABEL(1) = 'NUMBER'. If no row labels are desired, use RLABEL(1) = 'NONE'.
Otherwise, RLABEL is a vector of length NRMAT containing the labels.
CLABEL CHARACTER * (*) vector of labels for columns of MAT. (Input)
If columns are to be numbered consecutively 1, 2, …, NCMAT, use
CLABEL(1) = 'NUMBER'. If no column labels are desired, use CLABEL(1) = 'NONE'.
Otherwise, CLABEL(1) is the heading for the row labels, and either CLABEL(2) must be
'NUMBER' or 'NONE', or CLABEL must be a vector of length
NCMAT + 1 with CLABEL(1 + j) containing the column heading for the j-th column.
Optional Arguments
NRMAT Number of rows. (Input)
Default: NRMAT = SIZE (MAT,1).
NCMAT Number of columns. (Input)
Default: NCMAT = SIZE (MAT,2).
LDMAT Leading dimension of MAT exactly as specified in the dimension statement in the
calling program. (Input)
Default: LDMAT = SIZE (MAT,1).
ITRING Triangle option. (Input)
Default: ITRING = 0.
ITRING Action
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine WRIRL prints an integer rectangular matrix (stored in MAT) with row and column labels
(specified by RLABEL and CLABEL, respectively), according to a given format (stored in FMT).
WRIRL can restrict printing to the elements of upper or lower triangles of matrices via the ITRING
option. Generally, ITRING ≠ 0 is used with symmetric matrices. In addition, one-dimensional
arrays can be printed as column or row vectors. For a column vector, set NRMAT to the length of
the array and set NCMAT = 1. For a row vector, set NRMAT = 1 and set NCMAT to the length of the
array. In both cases, set LDMAT = NRMAT, and set ITRING = 0.
Comments
1.
                        TITLE
CLABEL(1)   CLABEL(2)   CLABEL(3)   CLABEL(4)
RLABEL(1)   Xxxxx       Xxxxx       Xxxxx
RLABEL(2)   Xxxxx       Xxxxx       Xxxxx
2.
Use % / within titles or labels to create a new line. Long titles or labels are
automatically wrapped.
3.
A page width of 78 characters is used. Page width and page length can be reset by
invoking PGOPT.
4.
Horizontal centering, a method for printing large matrices, paging, printing a title on
each page, and many other options can be selected by invoking WROPT.
5.
Output is written to the unit specified by UMACH (see the Reference Material).
Example
The following example prints all of a 3 × 4 matrix A = MAT where $a_{ij} = 10i + j$.
      USE WRIRL_INT
      IMPLICIT   NONE
      INTEGER    ITRING, LDMAT, NCMAT, NRMAT
      PARAMETER  (ITRING=0, LDMAT=10, NCMAT=4, NRMAT=3)
!
      INTEGER    I, J, MAT(LDMAT,NCMAT)
      CHARACTER  CLABEL(5)*5, FMT*8, RLABEL(3)*5
!
      DATA FMT/'(I2)'/
      DATA CLABEL/'     ', 'Col 1', 'Col 2', 'Col 3', 'Col 4'/
      DATA RLABEL/'Row 1', 'Row 2', 'Row 3'/
!
      DO 20  I=1, NRMAT
         DO 10  J=1, NCMAT
            MAT(I,J) = I*10 + J
   10    CONTINUE
   20 CONTINUE
!                                 Write MAT matrix.
      CALL WRIRL ('MAT', MAT, RLABEL, CLABEL, NRMAT=NRMAT, FMT=FMT)
      END
Output
                  MAT
        Col 1  Col 2  Col 3  Col 4
Row 1      11     12     13     14
Row 2      21     22     23     24
Row 3      31     32     33     34
WRCRN
Prints a complex rectangular matrix with integer row and column labels.
Required Arguments
TITLE Character string specifying the title. (Input)
TITLE set equal to a blank character(s) suppresses printing of the title. Use % /
within the title to create a new line. Long titles are automatically wrapped.
A Complex NRA by NCA matrix to be printed. (Input)
Optional Arguments
NRA Number of rows. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
ITRING Triangle option. (Input)
Default: ITRING = 0.
ITRING Action
FORTRAN 90 Interface
Generic:
Specific:
The specific interface names are S_WRCRN and D_WRCRN for two dimensional
arrays, and S_WRCRN1D and D_WRCRN1D for one dimensional arrays.
FORTRAN 77 Interface
Single:
Double:
Description
Routine WRCRN prints a complex rectangular matrix with the rows and columns labeled 1, 2, 3, and
so on. WRCRN can restrict printing to the elements of the upper or lower triangles of matrices via
the ITRING option. Generally, ITRING ≠ 0 is used with Hermitian matrices.
In addition, one-dimensional arrays can be printed as column or row vectors. For a column vector,
set NRA to the length of the array, and set NCA = 1. For a row vector, set NRA = 1, and set NCA to
the length of the array. In both cases, set LDA = NRA, and set ITRING = 0.
Comments
1.
2.
Horizontal centering, a method for printing large matrices, paging, method for printing
NaN (not a number), and printing a title on each page can be selected by invoking
WROPT.
3.
A page width of 78 characters is used. Page width and page length can be reset by
invoking subroutine PGOPT .
4.
Example
This example prints all of a 3 × 4 complex matrix A with elements
$a_{mn} = m + ni$, where $i = \sqrt{-1}$.
      USE WRCRN_INT
      IMPLICIT   NONE
      INTEGER    ITRING, LDA, NCA, NRA
      PARAMETER  (ITRING=0, LDA=10, NCA=4, NRA=3)
!
      INTEGER    I, J
      COMPLEX    A(LDA,NCA), CMPLX
      INTRINSIC  CMPLX
!
      DO 20  I=1, NRA
         DO 10  J=1, NCA
            A(I,J) = CMPLX(I,J)
   10    CONTINUE
   20 CONTINUE
!                                 Write A matrix.
      CALL WRCRN ('A', A, NRA=NRA)
      END
Output
                                  A
                 1                 2                 3                 4
1  ( 1.000, 1.000)   ( 1.000, 2.000)   ( 1.000, 3.000)   ( 1.000, 4.000)
2  ( 2.000, 1.000)   ( 2.000, 2.000)   ( 2.000, 3.000)   ( 2.000, 4.000)
3  ( 3.000, 1.000)   ( 3.000, 2.000)   ( 3.000, 3.000)   ( 3.000, 4.000)
WRCRL
Prints a complex rectangular matrix with a given format and labels.
Required Arguments
TITLE Character string specifying the title. (Input)
TITLE set equal to a blank character(s) suppresses printing of the title.
A Complex NRA by NCA matrix to be printed. (Input)
RLABEL CHARACTER * (*) vector of labels for rows of A. (Input)
If rows are to be numbered consecutively 1, 2, …, NRA, use RLABEL(1) = 'NUMBER'. If
no row labels are desired, use RLABEL(1) = 'NONE'. Otherwise, RLABEL is a vector of
length NRA containing the labels.
CLABEL CHARACTER * (*) vector of labels for columns of A. (Input)
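If columns are to be numbered consecutively 1, 2, …, NCA, use
CLABEL(1) = 'NUMBER'. If no column labels are desired, use CLABEL(1) = 'NONE'.
Otherwise, CLABEL(1) is the heading for the row labels, and either CLABEL(2) must be
'NUMBER' or 'NONE', or CLABEL must be a vector of length NCA + 1 with
CLABEL(1 + j) containing the column heading for the j-th column.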
Optional Arguments
NRA Number of rows. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement in the calling
program. (Input)
Default: LDA = SIZE (A,1).
ITRING Triangle option. (Input)
Default: ITRING = 0.
ITRING
Action
digits. While the V format prints trailing zeroes and a trailing decimal point, the W
format does not. See Comment 4 for general descriptions of the V and W formats. FMT
may contain only D, E, F, G, I, V, or W edit descriptors, e.g., the X descriptor is not
allowed.
Default: FMT = .
FORTRAN 90 Interface
Generic:
Specific:
The specific interface names are S_WRCRL and D_WRCRL for two dimensional
arrays, and S_WRCRL1D and D_WRCRL1D for one dimensional arrays.
FORTRAN 77 Interface
Single:
Double:
Description
Routine WRCRL prints a complex rectangular matrix (stored in A) with row and column labels
(specified by RLABEL and CLABEL, respectively) according to a given format (stored in FMT).
Routine WRCRL can restrict printing to the elements of upper or lower triangles of matrices via the
ITRING option. Generally, ITRING ≠ 0 is used with Hermitian matrices.
In addition, one-dimensional arrays can be printed as column or row vectors. For a column vector,
set NRA to the length of the array, and set NCA = 1. For a row vector, set NRA = 1, and set NCA to
the length of the array. In both cases, set LDA = NRA, and set ITRING = 0.
Comments
1.
2.
CLABEL(1)   CLABEL(2)   CLABEL(3)   CLABEL(4)
RLABEL(1)
RLABEL(2)
3.
Use % / within titles or labels to create a new line. Long titles or labels are
automatically wrapped.
4.
For printing numbers whose magnitudes are unknown, the G format in FORTRAN is
useful; however, the decimal points will generally not be aligned when printing a
column of numbers. The V and W formats are special formats used by this routine to
select a D, E, F, or I format so that the decimal points will be aligned. The V and W
formats are specified as Vn.d and Wn.d. Here, n is the field width, and d is the number
of significant digits generally printed. Valid values for n are 3, 4, …, 40. Valid values
for d are 1, 2, …, n − 2. If FMT specifies one format and that format is a V or W format,
all elements of the matrix A are examined to determine one FORTRAN format for
printing. If FMT specifies more than one format, FORTRAN formats are generated
separately from each V or W format.
5.
A page width of 78 characters is used. Page width and page length can be reset by
invoking PGOPT.
6.
Horizontal centering, a method for printing large matrices, paging, method for printing
NaN (not a number), printing a title on each page, and many other options can be
selected by invoking WROPT.
7.
Output is written to the unit specified by UMACH (see the Reference Material).
Example
The following example prints all of a 3 × 4 matrix A with elements
$a_{mn} = (m + 0.123456) + ni$, where $i = \sqrt{-1}$.
      USE WRCRL_INT
      IMPLICIT   NONE
      INTEGER    ITRING, LDA, NCA, NRA
      PARAMETER  (ITRING=0, LDA=10, NCA=4, NRA=3)
!
      INTEGER    I, J
      COMPLEX    A(LDA,NCA), CMPLX
      CHARACTER  CLABEL(5)*5, FMT*8, RLABEL(3)*5
      INTRINSIC  CMPLX
!
      DATA FMT/'(W12.6)'/
      DATA CLABEL/'     ', 'Col 1', 'Col 2', 'Col 3', 'Col 4'/
      DATA RLABEL/'Row 1', 'Row 2', 'Row 3'/
!
      DO 20  I=1, NRA
         DO 10  J=1, NCA
            A(I,J) = CMPLX(I,J) + 0.123456
   10    CONTINUE
   20 CONTINUE
!                                 Write A matrix.
      CALL WRCRL ('A', A, RLABEL, CLABEL, NRA=NRA, FMT=FMT)
      END
Output
                                   A
                        Col 1                       Col 2
Row 1  (    1.12346,    1.00000)   (    1.12346,    2.00000)
Row 2  (    2.12346,    1.00000)   (    2.12346,    2.00000)
Row 3  (    3.12346,    1.00000)   (    3.12346,    2.00000)

                        Col 3                       Col 4
Row 1  (    1.12346,    3.00000)   (    1.12346,    4.00000)
Row 2  (    2.12346,    3.00000)   (    2.12346,    4.00000)
Row 3  (    3.12346,    3.00000)   (    3.12346,    4.00000)
WROPT
Sets or retrieves an option for printing a matrix.
Required Arguments
IOPT Indicator of option type. (Input)
IOPT       Description of Option Type
−1, 1
−2, 2
−3, 3      Paging
−4, 4      Method for printing NaN (not a number), and negative and positive
           machine infinity.
−5, 5      Title option
−6, 6
−7, 7
−8, 8
−9, 9
−10, 10    Hot zone option for determining line breaks for row labels
−11, 11
−12, 12    Hot zone option for determining line breaks for column labels
−13, 13
−14, 14    Option for the label that appears in the upper left hand corner that can be
           used as a heading for the row numbers or a label for the column headings
           for WR**N routines
−15, 15
−16, 16    Option for vertical alignment of the matrix values relative to the associated
           row labels that occupy more than one line
0          Reset all the current settings saved in internal variables back to their last
           setting made with an invocation of WROPT with ISCOPE = 1. (This option is
           used internally by routines printing a matrix and is not useful otherwise.)
If IOPT is negative, ISETNG and ISCOPE are input and are saved in internal variables. If IOPT
is positive, ISETNG is output and receives the currently active setting for the option
(if ISCOPE = 0) or the last global setting for the option (if ISCOPE = 1). If IOPT = 0, ISETNG
and ISCOPE are not referenced.
ISETNG Setting for option selected by IOPT. (Input, if IOPT is negative; output, if IOPT
is positive; not referenced if IOPT = 0)
IOPT
ISETNG
1, 1
Meaning
Matrix is left justified
2, 2
3, 3
4, 4
Format is (1PE12.5 ).
7, 7
8, 8
K2
9, 9
K3
10, 10
K4
11, 11
K5
12, 12
K6
13, 13
K7
14
5, 5
6, 6
15
16, 16
ISCOPE Indicator of the scope of the option. (Input if IOPT is nonzero; not referenced if
IOPT = 0)
ISCOPE Action
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine WROPT allows the user to set or retrieve an option for printing a matrix. The options
controlled by WROPT include the following: horizontal centering, a method for printing large
matrices, paging, method for printing NaN (not a number) and positive and negative machine
infinities, printing titles, default formats for numbers, spacing between columns, maximum widths
reserved for row and column labels, indentation of row labels that continue beyond one line,
widths of hot zones for breaking of labels and titles, the default heading for row labels, whether to
print a blank line between invocations of routines, and vertical alignment of matrix entries with
respect to row labels continued beyond one line. (NaN and positive and negative machine
infinities can be retrieved by AMACH and DMACH that are documented in the section Machine-Dependent Constants in the Reference Material.) Options can be set globally (ISCOPE = 1) or
temporarily for the next call to a printing routine (ISCOPE = 0).
Comments
1.
This program can be invoked repeatedly before using a WR*** routine to print a matrix.
The matrix printing routines retrieve these settings to determine the printing options. It
is not necessary to call WROPT if a default value of a printing option is desired. The
defaults are as follows.
Default
Value for
ISET
IOPT
Meaning
Left justified
1000000
No paging
10
11
13
10
14
15
Default
Value for
ISET
IOPT
16
Meaning
For IOPT = 8, the default depends on the current value for the page width, IPAGEW (see
PGOPT).
2.
The V and W formats are special formats that can be used to select a D, E, F, or I format
so that the decimal points will be aligned. The V and W formats are specified as Vn.d
and Wn.d. Here, n is the field width and d is the number of significant digits generally
printed. Valid values for n are 3, 4, …, 40. Valid values for d are 1, 2, …, n − 2. While
the V format prints trailing zeroes and a trailing decimal point, the W format does not.
Example
The following example illustrates the effect of WROPT when printing a 3 × 4 real matrix A with
WRRRN where $a_{ij} = i + j/10$. The first call to WROPT sets horizontal printing so that the matrix is first
printed horizontally centered on the page. In the next invocation of WRRRN, the left-justification
option has been set via routine WROPT so the matrix is left justified when printed. Finally, because
the scope of left justification was only for the next call to a printing routine, the last call to WRRRN
results in horizontally centered printing.
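      USE WROPT_INT
      USE WRRRN_INT
      IMPLICIT   NONE
      INTEGER    ITRING, LDA, NCA, NRA
      PARAMETER  (ITRING=0, LDA=10, NCA=4, NRA=3)
!
      INTEGER    I, IOPT, ISCOPE, ISETNG, J
      REAL       A(LDA,NCA)
!
      DO 20  I=1, NRA
         DO 10  J=1, NCA
            A(I,J) = I + J*0.1
   10    CONTINUE
   20 CONTINUE
!                                 Activate centering option.
!                                 Scope is global.
      IOPT   = -1
      ISETNG = 1
      ISCOPE = 1
      CALL WROPT (IOPT, ISETNG, ISCOPE)
!                                 Write A matrix.
      CALL WRRRN ('A', A, NRA=NRA)
!                                 Activate left justification.
!                                 Scope is the next printing call only.
      IOPT   = -1
      ISETNG = 0
      ISCOPE = 0
      CALL WROPT (IOPT, ISETNG, ISCOPE)
      CALL WRRRN ('A', A, NRA=NRA)
      CALL WRRRN ('A', A, NRA=NRA)
      END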
Output
                                      A
                          1       2       3       4
                      1   1.100   1.200   1.300   1.400
                      2   2.100   2.200   2.300   2.400
                      3   3.100   3.200   3.300   3.400

              A
        1       2       3       4
1   1.100   1.200   1.300   1.400
2   2.100   2.200   2.300   2.400
3   3.100   3.200   3.300   3.400

                                      A
                          1       2       3       4
                      1   1.100   1.200   1.300   1.400
                      2   2.100   2.200   2.300   2.400
                      3   3.100   3.200   3.300   3.400
PGOPT
Sets or retrieves page width and length for printing.
Required Arguments
IOPT Page attribute option. (Input)
IOPT      Description of Attribute
−1, 1     Page width.
−2, 2     Page length.
Negative values of IOPT indicate the setting IPAGE is input. Positive values
of IOPT indicate the setting IPAGE is output.
IPAGE Value of page attribute. (Input, if IOPT is negative; output, if IOPT is positive.)
IOPT      Description of Attribute    Settings for IPAGE
−1, 1     Page width                  10, 11, …
−2, 2     Page length                 10, 11, …
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine PGOPT is used to set or retrieve the page width or the page length for routines that perform
printing.
Example
The following example illustrates the use of PGOPT to set the page width at 20 characters. Routine
WRRRN is then used to print a 3 × 4 matrix A where $a_{ij} = i + j/10$.
      USE PGOPT_INT
      USE WRRRN_INT
      IMPLICIT   NONE
      INTEGER    ITRING, LDA, NCA, NRA
      PARAMETER  (ITRING=0, LDA=3, NCA=4, NRA=3)
!
      INTEGER    I, IOPT, IPAGE, J
      REAL       A(LDA,NCA)
!
      DO 20  I=1, NRA
         DO 10  J=1, NCA
            A(I,J) = I + J*0.1
   10    CONTINUE
   20 CONTINUE
!                                 Set page width.
      IOPT  = -1
      IPAGE = 20
      CALL PGOPT (IOPT, IPAGE)
!                                 Print the matrix A.
      CALL WRRRN ('A', A)
      END
Output
        A
        1       2
1   1.100   1.200
2   2.100   2.200
3   3.100   3.200

        3       4
1   1.300   1.400
2   2.300   2.400
3   3.300   3.400
PERMU
Rearranges the elements of an array as specified by a permutation.
Required Arguments
X Real vector of length N containing the array to be permuted. (Input)
IPERMU Integer vector of length N containing a permutation
IPERMU(1), …, IPERMU(N) of the integers 1, …, N. (Input)
XPERMU Real vector of length N containing the array X permuted. (Output)
If X is not needed, X and XPERMU can share the same storage locations.
Optional Arguments
N Length of the arrays X and XPERMU. (Input)
Default: N = SIZE (IPERMU,1).
IPATH Integer flag. (Input)
Default: IPATH = 1.
IPATH = 1 means IPERMU represents a forward permutation, i.e., X(IPERMU(I)) is
moved to XPERMU(I). IPATH = 2 means IPERMU represents a backward permutation,
i.e., X(I) is moved to XPERMU(IPERMU(I)).
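The two IPATH conventions can be illustrated without the library. The following is a minimal sketch in standard Fortran, not the PERMU implementation; the program name, the arrays FWD, BWD, and BACK, and the data values are hypothetical and chosen only for illustration.

```fortran
program permutation_sketch
  implicit none
  integer, parameter :: n = 4
  ! Illustrative data (not taken from the PERMU example).
  real    :: x(n)      = (/ 10.0, 20.0, 30.0, 40.0 /)
  integer :: ipermu(n) = (/ 3, 1, 4, 2 /)
  real    :: fwd(n), bwd(n), back(n)
  integer :: i

  ! Forward rule (IPATH = 1 in the text): X(IPERMU(I)) is moved to XPERMU(I).
  do i = 1, n
    fwd(i) = x(ipermu(i))
  end do

  ! Backward rule (IPATH = 2 in the text): X(I) is moved to XPERMU(IPERMU(I)).
  do i = 1, n
    bwd(ipermu(i)) = x(i)
  end do

  ! Applying the backward rule to the forward result restores X.
  do i = 1, n
    back(ipermu(i)) = fwd(i)
  end do

  write (*, '(A, 4F6.1)') ' forward : ', fwd
  write (*, '(A, 4F6.1)') ' backward: ', bwd
  write (*, *) 'backward undoes forward: ', all(back == x)
end program permutation_sketch
```

With these values the forward result is 30, 10, 40, 20 and the backward result is 20, 40, 10, 30. The two rules are inverses of each other for the same permutation vector, which is what the final check confirms.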
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine PERMU rearranges the elements of an array according to a permutation vector. It has the
option to do both forward and backward permutations.
Example
This example rearranges the array X using IPERMU; forward permutation is performed.
      USE PERMU_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    IPATH, N
      PARAMETER  (IPATH=1, N=4)
!
      INTEGER    IPERMU(N), J, NOUT
      REAL       X(N), XPERMU(N)
!                                 Set values for X, IPERMU
!
99999 FORMAT ('
      END
Output
 The Output vector is:
   1.00   5.00   4.00   6.00
PERMA
Permutes the rows or columns of a matrix.
Required Arguments
A NRA by NCA matrix to be permuted. (Input)
IPERMU Vector of length K containing a permutation IPERMU(1), …, IPERMU(K) of the
integers 1, …, K where K = NRA if the rows of A are to be permuted and K = NCA if the
columns of A are to be permuted. (Input)
Optional Arguments
NRA Number of rows. (Input)
Default: NRA = SIZE (A,1).
NCA Number of columns. (Input)
Default: NCA = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
IPATH Option parameter. (Input)
IPATH = 1 means the rows of A will be permuted. IPATH = 2 means the columns of A
will be permuted.
Default: IPATH = 1.
LDAPER Leading dimension of APER exactly as specified in the dimension statement of
the calling program. (Input)
Default: LDAPER = SIZE (APER,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine PERMA interchanges the rows or columns of a matrix using a permutation vector such as
the one obtained from routines SVRBP or SVRGP.
The routine PERMA permutes a column (row) at a time by calling PERMU. This process is continued
until all the columns (rows) are permuted. On completion, let B = APER and $p_i$ = IPERMU(I), then
$B_{ij} = A_{p_i j}$
for all i, j.
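The relation above can be written with standard Fortran vector subscripts. This is only a sketch of the index mapping, not the PERMA implementation; the column form (B(i,j) = A(i, p_j)) is stated here by analogy and agrees with the example that follows, and the names PROW, PCOL, BY_ROW, and BY_COL and the test matrix are hypothetical.

```fortran
program permute_matrix_sketch
  implicit none
  integer, parameter :: nra = 3, nca = 5
  real    :: a(nra,nca), by_row(nra,nca), by_col(nra,nca)
  integer :: prow(nra) = (/ 2, 3, 1 /)        ! illustrative row permutation
  integer :: pcol(nca) = (/ 3, 4, 1, 5, 2 /)  ! illustrative column permutation
  integer :: i, j

  ! Fill A so that each entry encodes its own position: A(i,j) = 10*i + j.
  do j = 1, nca
    do i = 1, nra
      a(i,j) = 10.0*i + j
    end do
  end do

  ! Row permutation: B(i,j) = A(p_i, j).
  by_row = a(prow, :)

  ! Column permutation: B(i,j) = A(i, p_j).
  by_col = a(:, pcol)

  do i = 1, nra
    write (*, '(5F6.1)') by_col(i, :)
  end do

  ! Row 1 of the row-permuted matrix is row PROW(1) of A, and
  ! column 1 of the column-permuted matrix is column PCOL(1) of A.
  write (*, *) all(by_row(1,:) == a(prow(1),:)), all(by_col(:,1) == a(:,pcol(1)))
end program permute_matrix_sketch
```

Encoding the position in each entry makes it easy to read off from the printed result which source row or column ended up where.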
Comments
1.
Example
This example permutes the columns of a matrix A.
      USE PERMA_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    IPATH, LDA, LDAPER, NCA, NRA
      PARAMETER  (IPATH=2, LDA=3, LDAPER=3, NCA=5, NRA=3)
!
      INTEGER    I, IPERMU(5), J, NOUT
      REAL       A(LDA,NCA), APER(LDAPER,NCA)
!                                 Set values for A, IPERMU
!                                 A = ( 3.0 5.0 1.0 2.0 4.0 )
!                                     ( 3.0 5.0 1.0 2.0 4.0 )
!                                     ( 3.0 5.0 1.0 2.0 4.0 )
!
!                                 IPERMU = ( 3 4 1 5 2 )
!
      DATA A/3*3.0, 3*5.0, 3*1.0, 3*2.0, 3*4.0/, IPERMU/3, 4, 1, 5, 2/
!                                 Perform column permutation on A,
!                                 giving APER
      CALL PERMA (A, IPERMU, APER, IPATH=IPATH)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Print results
      WRITE (NOUT,99999) ((APER(I,J),J=1,NCA),I=1,NRA)
!
99999 FORMAT ('
      END
Output
 The Output matrix is:
     1.0     2.0     3.0     4.0     5.0
     1.0     2.0     3.0     4.0     5.0
     1.0     2.0     3.0     4.0     5.0
SORT_REAL
Sorts a rank-1 array of real numbers x so the y results are algebraically nondecreasing,
$y_1 \le y_2 \le \cdots \le y_n$.
Required Arguments
X Rank-1 array containing the numbers to be sorted. (Input)
Y Rank-1 array containing the sorted numbers. (Output)
Optional Arguments
NSIZE = n (Input)
Uses the sub-array of size n for the numbers.
Default value: n = SIZE(x)
IPERM = iperm (Input/Output)
Applies interchanges of elements that occur to the entries of iperm(:). If the values
iperm(i)=i,i=1,n are assigned prior to call, then the output array is moved to its
proper order by the subscripted array assignment y = x(iperm(1:n)).
ICYCLE = icycle (Output)
Permutations applied to the input data are converted to cyclic interchanges. Thus, the
output array y is given by the following elementary interchanges, where :=: denotes a
swap:
j = icycle(i)
y(j) :=: y(i), i = 1,n
Option Name
Option Value
s_, d_
Sort_real_scan_for_NaN
Examines each input array entry to find the first value such that
isNaN(x(i)) == .true.
See the isNaN() function, Chapter 10.
FORTRAN 90 Interface
Generic:
Specific:
Description
For a detailed description, see the Description section of routine SVRGN, which appears later in
this chapter.
Output
Example 1 for SORT_REAL is correct.
Additional Examples
Example 2: Sort and Final Move with a Permutation
A set of n random numbers is sorted so the results are nonincreasing. The columns of an n × n
random matrix are moved to the order given by the permutation defined by the interchange of the
entries. Since the routine sorts the results to be algebraically nondecreasing, the array of negative
values is used as input. Thus, the negative value of the sorted output order is nonincreasing. The
optional argument iperm= records the final order and is used to move the matrix columns to
that order. This example illustrates the principle of sorting record keys, followed by direct
movement of the records to sorted order.
use sort_real_int
use rand_gen_int
implicit none
! This is Example 2 for SORT_REAL.
integer i
integer, parameter :: n=100
integer ip(n)
real(kind(1e0)) a(n,n), x(n), y(n), temp(n*n)
! Generate a random array and matrix of values.
call rand_gen(x)
call rand_gen(temp)
a = reshape(temp,(/n,n/))
! Initialize permutation to the identity.
do i=1, n
ip(i) = i
end do
! Sort using negative values so the final order is
! non-increasing.
call sort_real(-x, y, iperm=ip)
! Final movement of keys and matrix columns.
y = x(ip(1:n))
a = a(:,ip(1:n))
! Check the results.
if (count(y(1:n-1) < y(2:n)) == 0) then
write (*,*) 'Example 2 for SORT_REAL is correct.'
end if
end
Output
Example 2 for SORT_REAL is correct.
SVRGN
Sorts a real array by algebraically increasing value.
Required Arguments
RA Vector of length N containing the array to be sorted. (Input)
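RB Vector of length N containing the sorted array. (Output)
If RA is not needed, RA and RB can share the same storage locations.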
Optional Arguments
N Number of elements in the array to be sorted. (Input)
Default: N = SIZE (RA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine SVRGN sorts the elements of an array, A, into ascending order by algebraic value. The
array A is divided into two parts by picking a central element T of the array. The first and last
elements of A are compared with T and exchanged until the three values appear in the array in
ascending order. The elements of the array are rearranged until all elements greater than or equal
to the central element appear in the second part of the array and all those less than or equal to the
central element appear in the first part. The upper and lower subscripts of one of the segments are
saved, and the process continues iteratively on the other segment. When one segment is finally
sorted, the process begins again by retrieving the subscripts of another unsorted portion of the
array. On completion, $A_j \le A_i$ for j < i. For more details, see Singleton (1969), Griffin and Redish
(1970), and Petro (1970).
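The steps described above (pick a central element, order the first, central, and last elements, partition, save one segment's subscripts, and continue on the other) can be sketched in standard Fortran as follows. This is an illustrative quicksort written for this guide's description, not the SVRGN source; the names SORT_ASCENDING, LO_SAVED, and HI_SAVED are hypothetical, and the details of the partition loop are an assumption.

```fortran
program quicksort_sketch
  implicit none
  integer, parameter :: n = 10
  real :: a(n) = (/ -1.0, 2.0, -3.0, 4.0, -5.0, 6.0, -7.0, 8.0, -9.0, 10.0 /)

  call sort_ascending(a)
  write (*, '(10F6.1)') a
  write (*, *) 'sorted: ', all(a(1:n-1) <= a(2:n))

contains

  subroutine sort_ascending(a)
    real, intent(inout) :: a(:)
    integer :: lo_saved(64), hi_saved(64)   ! saved subscripts of pending segments
    integer :: top, lo, hi, mid, i, j
    real    :: t, tmp

    top = 1
    lo_saved(1) = 1
    hi_saved(1) = size(a)
    do while (top > 0)
      ! Retrieve the subscripts of another unsorted segment.
      lo = lo_saved(top)
      hi = hi_saved(top)
      top = top - 1
      do while (lo < hi)
        ! Put the first, central, and last elements in ascending order.
        mid = lo + (hi - lo)/2
        if (a(mid) < a(lo)) then
          tmp = a(lo); a(lo) = a(mid); a(mid) = tmp
        end if
        if (a(hi) < a(lo)) then
          tmp = a(lo); a(lo) = a(hi); a(hi) = tmp
        end if
        if (a(hi) < a(mid)) then
          tmp = a(mid); a(mid) = a(hi); a(hi) = tmp
        end if
        t = a(mid)
        ! Partition: elements <= T go to the first part, >= T to the second.
        i = lo
        j = hi
        do
          do while (a(i) < t)
            i = i + 1
          end do
          do while (a(j) > t)
            j = j - 1
          end do
          if (i <= j) then
            tmp = a(i); a(i) = a(j); a(j) = tmp
            i = i + 1
            j = j - 1
          end if
          if (i > j) exit
        end do
        ! Save the larger segment and continue on the smaller one.
        if (j - lo < hi - i) then
          if (i < hi) then
            top = top + 1
            lo_saved(top) = i
            hi_saved(top) = hi
          end if
          hi = j
        else
          if (lo < j) then
            top = top + 1
            lo_saved(top) = lo
            hi_saved(top) = j
          end if
          lo = i
        end if
      end do
    end do
  end subroutine sort_ascending

end program quicksort_sketch
```

Continuing on the smaller segment while saving the larger one keeps the number of saved subscript pairs at roughly log2(N), so a small fixed-size stack is enough.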
Example
This example sorts the 10-element array RA algebraically.
      USE SVRGN_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    N, NOUT, J
      PARAMETER  (N=10)
      REAL       RA(N), RB(N)
!                                 RA = ( -1.0 2.0 -3.0 4.0 -5.0 6.0 -7.0 8.0 -9.0 10.0 )
!
      DATA RA/-1.0, 2.0, -3.0, 4.0, -5.0, 6.0, -7.0, 8.0, -9.0, 10.0/
!                                 Sort RA by algebraic value into RB
      CALL SVRGN (RA, RB)
!                                 Print results
      CALL UMACH (2,NOUT)
      WRITE (NOUT, 99999) (RB(J),J=1,N)
!
99999 FORMAT ('
      END
Output
 The Output vector is:
  -9.0  -7.0  -5.0  -3.0  -1.0   2.0   4.0   6.0   8.0  10.0
SVRGP
Sorts a real array by algebraically increasing value and returns the permutation that rearranges the
array.
Required Arguments
RA Vector of length N containing the array to be sorted. (Input)
RB Vector of length N containing the sorted array. (Output)
If RA is not needed, RA and RB can share the same storage locations.
IPERM Vector of length N. (Input/Output)
On input, IPERM should be initialized to the values 1, 2, …, N. On output, IPERM
contains a record of permutations made on the vector RA.
Optional Arguments
N Number of elements in the array to be sorted. (Input)
Default: N = SIZE (IPERM,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine SVRGP sorts the elements of an array, A, into ascending order by algebraic value, keeping
a record in P of the permutations to the array A. That is, the elements of P are moved in the same
manner as are the elements in A as A is being sorted. The routine SVRGP uses the algorithm
discussed in SVRGN. On completion, $A_j \le A_i$ for j < i.
Comments
For wider applicability, integers (1, 2, …, N) that are to be associated with RA(I) for I = 1, 2, …, N
may be entered into IPERM(I) in any order. Note that these integers must be unique.
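The idea of moving the entries of the permutation vector in step with the keys can be sketched in standard Fortran. A plain insertion sort stands in for the library algorithm here, so this is an illustration of the bookkeeping only, not of SVRGP itself; the variable names KEY and P are hypothetical, and the data are those of the example below.

```fortran
program sort_with_permutation
  implicit none
  integer, parameter :: n = 10
  real    :: ra(n) = (/ 10.0, -9.0, 8.0, -7.0, 6.0, 5.0, 4.0, -3.0, -2.0, -1.0 /)
  real    :: rb(n), key
  integer :: iperm(n), i, j, p

  ! Start from a copy of the keys and the identity permutation.
  rb = ra
  do i = 1, n
    iperm(i) = i
  end do

  ! Insertion sort (a stand-in algorithm): every move applied to RB
  ! is applied to IPERM as well.
  do i = 2, n
    key = rb(i)
    p   = iperm(i)
    j   = i - 1
    do while (j >= 1)
      if (rb(j) <= key) exit
      rb(j+1)    = rb(j)
      iperm(j+1) = iperm(j)
      j = j - 1
    end do
    rb(j+1)    = key
    iperm(j+1) = p
  end do

  write (*, '(10F6.1)') rb
  write (*, '(10I6)') iperm
  ! The recorded permutation reproduces the sorted array from the input.
  write (*, *) all(rb == ra(iperm))
end program sort_with_permutation
```

Because the keys in this data set are distinct, the resulting permutation (2, 4, 8, 9, 10, 7, 6, 5, 3, 1) does not depend on which sorting algorithm is used, and RB(I) = RA(IPERM(I)) holds on completion.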
Example
This example sorts the 10-element array RA algebraically.
      USE SVRGP_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    N, NOUT, J
      PARAMETER  (N=10)
      REAL       RA(N), RB(N)
      INTEGER    IPERM(N)
!                                 RA    = ( 10.0 -9.0 8.0 -7.0 6.0 5.0 4.0 -3.0 -2.0 -1.0 )
!
!                                 IPERM = ( 1 2 3 4 5 6 7 8 9 10 )
!
      DATA RA/10.0, -9.0, 8.0, -7.0, 6.0, 5.0, 4.0, -3.0, -2.0, -1.0/
      DATA IPERM/1, 2, 3, 4, 5, 6, 7, 8, 9, 10/
!                                 Sort RA by algebraic value into RB
      CALL SVRGP (RA, RB, IPERM)
!                                 Print results
      CALL UMACH (2,NOUT)
      WRITE (NOUT, 99998) (RB(J),J=1,N)
      WRITE (NOUT, 99999) (IPERM(J),J=1,N)
!
99998 FORMAT ('
99999 FORMAT ('
      END
Output
 The output vector is:
  -9.0  -7.0  -3.0  -2.0  -1.0   4.0   5.0   6.0   8.0  10.0
 The permutation vector is:
     2     4     8     9    10     7     6     5     3     1
SVIGN
Sorts an integer array by algebraically increasing value.
Required Arguments
IA Integer vector of length N containing the array to be sorted. (Input)
IB Integer vector of length N containing the sorted array. (Output)
If IA is not needed, IA and IB can share the same storage locations.
Optional Arguments
N Number of elements in the array to be sorted. (Input)
Default: N = SIZE (IA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine SVIGN sorts the elements of an integer array, A, into ascending order by algebraic value.
The routine SVIGN uses the algorithm discussed in SVRGN. On completion, $A_j \le A_i$ for j < i.
Example
This example sorts the 10-element array IA algebraically.
USE SVIGN_INT
USE UMACH_INT
      IMPLICIT   NONE
      INTEGER    N, NOUT, J
      PARAMETER  (N=10)
      INTEGER    IA(N), IB(N)
!
!
!
IA = ( -1
-3
-5
Declare variables
Print results
CALL UMACH (2,NOUT)
WRITE (NOUT, 99999) (IB(J),J=1,N)
!
99999 FORMAT ('
END
Output
The Output vector is:
-9
-7
-5
-3
-1
10
SVIGP
Sorts an integer array by algebraically increasing value and returns the permutation that rearranges
the array.
Required Arguments
IA Integer vector of length N containing the array to be sorted. (Input)
IB Integer vector of length N containing the sorted array. (Output)
If IA is not needed, IA and IB can share the same storage locations.
IPERM Vector of length N. (Input/Output)
On input, IPERM should be initialized to the values 1, 2, …, N. On output, IPERM
contains a record of permutations made on the vector IA.
Optional Arguments
N Number of elements in the array to be sorted. (Input)
Default: N = SIZE (IPERM,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine SVIGP sorts the elements of an integer array, A, into ascending order by algebraic value,
keeping a record in P of the permutations to the array A. That is, the elements of P are moved in
the same manner as are the elements in A as A is being sorted. The routine SVIGP uses the
algorithm discussed in SVRGN. On completion, $A_j \le A_i$ for j < i.
Comments
For wider applicability, integers (1, 2, …, N) that are to be associated with IA(I) for I = 1, 2, …, N
may be entered into IPERM(I) in any order. Note that these integers must be unique.
Example
This example sorts the 10-element array IA algebraically.
      USE SVIGP_INT
      USE UMACH_INT
      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    N, J, NOUT
      PARAMETER  (N=10)
      INTEGER    IA(N), IB(N), IPERM(N)
!                                 Set values for IA and IPERM
!                                 IA    = ( 10 -9 8 -7 6 5 4 -3 -2 -1 )
!
!                                 IPERM = ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 )
!
      DATA IA/10, -9, 8, -7, 6, 5, 4, -3, -2, -1/
      DATA IPERM/1, 2, 3, 4, 5, 6, 7, 8, 9, 10/
!                                 Sort IA by algebraic value into IB
      CALL SVIGP (IA, IB, IPERM)
!                                 Print results
      CALL UMACH (2,NOUT)
      WRITE (NOUT, 99998) (IB(J),J=1,N)
      WRITE (NOUT, 99999) (IPERM(J),J=1,N)
!
99998 FORMAT (' The output vector is:', /, 10(1X,I5))
99999 FORMAT (' The permutation vector is:', /, 10(1X,I5))
      END
Output
The output vector is:
   -9    -7    -3    -2    -1     4     5     6     8    10
The permutation vector is:
    2     4     8     9    10     7     6     5     3     1
SVRBN
Sorts a real array by nondecreasing absolute value.
Required Arguments
RA Vector of length N containing the array to be sorted. (Input)
RB Vector of length N containing the sorted array. (Output)
If RA is not needed, RA and RB can share the same storage locations.
Optional Arguments
N Number of elements in the array to be sorted. (Input)
Default: N = SIZE (RA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine SVRBN sorts the elements of an array, A, into ascending order by absolute value. The
routine SVRBN uses the algorithm discussed in SVRGN. On completion, $|A_j| \le |A_i|$ for $j < i$.
Example
This example sorts the 10-element array RA by absolute value.
USE SVRBN_INT
USE UMACH_INT
IMPLICIT NONE
! Declare variables
INTEGER N, J, NOUT
PARAMETER (N=10)
REAL RA(N), RB(N)
! Set values for RA
! RA = ( -1.0 3.0 -4.0 2.0 -1.0 0.0 -7.0 6.0 10.0 -7.0 )
DATA RA/-1.0, 3.0, -4.0, 2.0, -1.0, 0.0, -7.0, 6.0, 10.0, -7.0/
! Sort RA by absolute value into RB
CALL SVRBN (RA, RB)
! Print results
CALL UMACH (2,NOUT)
WRITE (NOUT, 99999) (RB(J),J=1,N)
!
99999 FORMAT ('
END
Output
The output vector is:
0.0 -1.0 -1.0 2.0 3.0 -4.0 6.0 -7.0 -7.0 10.0
SVRBP
Sorts a real array by nondecreasing absolute value and returns the permutation that rearranges the
array.
Required Arguments
RA Vector of length N containing the array to be sorted. (Input)
RB Vector of length N containing the sorted array. (Output)
If RA is not needed, RA and RB can share the same storage locations.
IPERM Vector of length N. (Input/Output)
On input, IPERM should be initialized to the values 1, 2, ..., N. On output, IPERM
contains a record of permutations made on the vector RA.
Optional Arguments
N Number of elements in the array to be sorted. (Input)
Default: N = SIZE (IPERM,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine SVRBP sorts the elements of an array, A, into ascending order by absolute value, keeping a
record in P of the permutations to the array A. That is, the elements of P are moved in the same
manner as are the elements in A as A is being sorted. The routine SVRBP uses the algorithm
discussed in SVRGN. On completion, $|A_j| \le |A_i|$ for $j < i$.
Comments
For wider applicability, integers (1, 2, ..., N) that are to be associated with RA(I) for I = 1, 2, ..., N
may be entered into IPERM(I) in any order. Note that these integers must be unique.
Example
This example sorts the 10-element array RA by absolute value.
USE SVRBP_INT
USE UMACH_INT
IMPLICIT NONE
! Declare variables
INTEGER N, J, NOUT, I
PARAMETER (N=10)
REAL RA(N), RB(N)
INTEGER IPERM(N)
! Set values for RA and IPERM
! RA = ( 10.0 9.0 8.0 7.0 6.0 5.0 -4.0 3.0 -2.0 1.0 )
! IPERM = ( 1 2 3 4 5 6 7 8 9 10 )
DATA RA/10.0, 9.0, 8.0, 7.0, 6.0, 5.0, -4.0, 3.0, -2.0, 1.0/
DATA IPERM/1, 2, 3, 4, 5, 6, 7, 8, 9, 10/
! Sort RA by absolute value into RB
CALL SVRBP (RA, RB, IPERM)
! Print results
CALL UMACH (2,NOUT)
WRITE (NOUT, 99998) (RB(J),J=1,N)
WRITE (NOUT, 99999) (IPERM(I),I=1,N)
!
99998 FORMAT ('
99999 FORMAT ('
END
Output
The output vector is:
1.0 -2.0 3.0 -4.0 5.0 6.0 7.0 8.0 9.0 10.0
The permutation vector is:
10 9 8 7 6 5 4 3 2 1
SVIBN
Sorts an integer array by nondecreasing absolute value.
Required Arguments
IA Integer vector of length N containing the array to be sorted. (Input)
IB Integer vector of length N containing the sorted array. (Output)
If IA is not needed, IA and IB can share the same storage locations.
Optional Arguments
N Number of elements in the array to be sorted. (Input)
Default: N = SIZE (IA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine SVIBN sorts the elements of an integer array, A, into ascending order by absolute value.
The routine SVIBN uses the algorithm discussed in SVRGN. On completion, $|A_j| \le |A_i|$ for $j < i$.
Example
This example sorts the 10-element array IA by absolute value.
USE SVIBN_INT
USE UMACH_INT
IMPLICIT NONE
! Declare variables
INTEGER I, J, NOUT, N
PARAMETER (N=10)
INTEGER IA(N), IB(N)
!
!
!
!
!
IA = ( -1
-4
-1
IA
!
99999 FORMAT ('
END
Output
The Output vector is:
0
-1
-1
2
-4
-7
-7
10
SVIBP
Sorts an integer array by nondecreasing absolute value and returns the permutation that rearranges
the array.
Required Arguments
IA Integer vector of length N containing the array to be sorted. (Input)
IB Integer vector of length N containing the sorted array. (Output)
If IA is not needed, IA and IB can share the same storage locations.
IPERM Vector of length N. (Input/Output)
On input, IPERM should be initialized to the values 1, 2, ..., N. On output, IPERM
contains a record of permutations made on the vector IA.
Optional Arguments
N Number of elements in the array to be sorted. (Input)
Default: N = SIZE (IA,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine SVIBP sorts the elements of an integer array, A, into ascending order by absolute value,
keeping a record in P of the permutations to the array A. That is, the elements of P are moved in
the same manner as are the elements in A as A is being sorted. The routine SVIBP uses the
algorithm discussed in SVRGN. On completion, $|A_j| \le |A_i|$ for $j < i$.
Comments
For wider applicability, integers (1, 2, ..., N) that are to be associated with IA(I) for I = 1, 2, ..., N
may be entered into IPERM(I) in any order. Note that these integers must be unique.
Example
This example sorts the 10-element array IA by absolute value.
USE SVIBP_INT
USE UMACH_INT
IMPLICIT NONE
! Declare variables
INTEGER N, U, NOUT, J
PARAMETER (N=10)
INTEGER IA(N), IB(N), IPERM(N)
! Set values for IA
! IA = ( 10 9 8 7 6 5 -4 3 -2 1 )
! IPERM = ( 1 2 3 4 5 6 7 8 9 10 )
!
99998 FORMAT ('
99999 FORMAT ('
END
Output
The output vector is:
1 -2 3 -4 5 6 7 8 9 10
The permutation vector is:
10 9 8 7 6 5 4 3 2 1
SRCH
Searches a sorted vector for a given scalar and returns its index.
Required Arguments
VALUE Scalar to be searched for in Y. (Input)
INDEX          Location of VALUE
1 thru N       VALUE = Y(INDEX)
-1             VALUE < Y(1) or N = 0
-N thru -2     Y(-INDEX - 1) < VALUE < Y(-INDEX)
-(N + 1)       VALUE > Y(N)
Optional Arguments
N Length of vector Y. (Input)
Default: N = (SIZE (X,1)) / INCX.
INCX Displacement between elements of X. (Input)
INCX must be greater than zero.
Default: INCX = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine SRCH searches a real vector x (stored in X), whose n elements are sorted in ascending
order, for a real number c (stored in VALUE). If c is found in x, its index i (stored in INDEX) is
returned so that $x_i = c$. Otherwise, a negative number i is returned for the index. Specifically,
if $1 \le i \le n$, then $x_i = c$
if $i = -1$, then $c < x_1$ or $n = 0$
if $-n \le i \le -2$, then $x_{-i-1} < c < x_{-i}$
if $i = -(n + 1)$, then $c > x_n$
The argument INCX is useful if a row of a matrix, for example, row number I of a matrix X, must
be searched. The elements of row I are assumed to be in ascending order. In this case, set INCX
equal to the leading dimension of X exactly as specified in the dimension statement in the calling
program. With X declared
REAL X(LDX,N)
the invocation
CALL SRCH (N, VALUE, X(I,1), LDX, INDEX)
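The index convention and the role of INCX can be illustrated with a short binary search in standard Fortran. This is a sketch under stated assumptions, not the IMSL implementation: the names srch_sketch and srch_index are hypothetical, and the argument order is chosen for this illustration only.

```fortran
module srch_sketch
  implicit none
contains
  ! Hypothetical helper, not the IMSL routine. The searched values are
  ! y(k) = x(1 + (k-1)*incx) for k = 1..n and must be in ascending order.
  ! Returns k > 0 if value equals y(k); otherwise returns the negative of
  ! the position at which value would have to be inserted.
  integer function srch_index(value, x, n, incx) result(idx)
    real,    intent(in) :: value
    real,    intent(in) :: x(*)
    integer, intent(in) :: n, incx
    integer :: lo, hi, mid

    lo = 1
    hi = n
    do while (lo <= hi)
      mid = lo + (hi - lo)/2
      if (x(1 + (mid-1)*incx) < value) then
        lo = mid + 1
      else if (x(1 + (mid-1)*incx) > value) then
        hi = mid - 1
      else
        idx = mid
        return
      end if
    end do
    ! Not found: lo lies in 1..n+1, so idx lies in -(n+1)..-1.
    idx = -lo
  end function srch_index
end module srch_sketch
```

With the 16-element vector of the example that follows, srch_index(653.0, x, 16, 1) gives 11, a value below 61.0 gives -1, and a value above 908.0 gives -17. Passing X(I,1) with incx equal to the leading dimension searches row I of a matrix.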
Example
This example searches a real vector sorted in ascending order for the value 653.0. The problem is
discussed by Knuth (1973, pages 407-409).
USE SRCH_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER N
PARAMETER (N=16)
INTEGER INDEX, NOUT
REAL VALUE, X(N)
!
!
DATA X/61.0, 87.0, 154.0, 170.0, 275.0, 426.0, 503.0, 509.0, &
512.0, 612.0, 653.0, 677.0, 703.0, 765.0, 897.0, 908.0/
!
VALUE = 653.0
CALL SRCH (VALUE, X, INDEX)
!
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'INDEX = ', INDEX
END
Output
INDEX = 11
ISRCH
Searches a sorted integer vector for a given integer and returns its index.
Required Arguments
IVALUE Scalar to be searched for in IY. (Input)
IX Vector of length N * INCX. (Input)
IY is obtained from IX for I = 1, 2, ..., N by
IY(I) = IX(1 + (I - 1) * INCX). IY(1), IY(2), ..., IY(N) must be in ascending order.
INDEX Index of IY pointing to IVALUE. (Output)
If INDEX is positive, IVALUE is found in IY. If INDEX is negative, IVALUE is not found
in IY.
INDEX          Location of VALUE
1 thru N       IVALUE = IY(INDEX)
-1             IVALUE < IY(1) or N = 0
-N thru -2     IY(-INDEX - 1) < IVALUE < IY(-INDEX)
-(N + 1)       IVALUE > IY(N)
Optional Arguments
N Length of vector IY. (Input)
Default: N = SIZE (IX,1) / INCX.
INCX Displacement between elements of IX. (Input)
INCX must be greater than zero.
Default: INCX = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine ISRCH searches an integer vector x (stored in IX), whose n elements are sorted in
ascending order, for an integer c (stored in IVALUE). If c is found in x, its index i (stored in INDEX)
is returned so that $x_i = c$. Otherwise, a negative number i is returned for the index. Specifically,
if $1 \le i \le n$, then $x_i = c$
if $i = -1$, then $c < x_1$ or $n = 0$
if $-n \le i \le -2$, then $x_{-i-1} < c < x_{-i}$
if $i = -(n + 1)$, then $c > x_n$
The argument INCX is useful if a row of a matrix, for example, row number I of a matrix IX, must
be searched. The elements of row I are assumed to be in ascending order. Here, set INCX equal to
the leading dimension of IX exactly as specified in the dimension statement in the calling
program. With IX declared
INTEGER IX(LDIX,N)
the invocation
CALL ISRCH (N, IVALUE, IX(I,1), LDIX, INDEX)
Example
This example searches an integer vector sorted in ascending order for the value 653. The problem
is discussed by Knuth (1973, pages 407-409).
USE ISRCH_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER N
PARAMETER (N=16)
INTEGER INDEX, NOUT
INTEGER IVALUE, IX(N)
!
DATA IX/61, 87, 154, 170, 275, 426, 503, 509, 512, 612, 653, 677, &
703, 765, 897, 908/
!
IVALUE = 653
CALL ISRCH (IVALUE, IX, INDEX)
!
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'INDEX = ', INDEX
END
Output
INDEX = 11
SSRCH
Searches a character vector, sorted in ascending ASCII order, for a given string and returns its
index.
Required Arguments
N Length of vector CHY. (Input)
Default: N = SIZE (CHX,1) / INCX.
STRING Character string to be searched for in CHY. (Input)
CHX Vector of length N * INCX containing character strings. (Input)
CHY is obtained from CHX for I = 1, 2, ..., N by CHY(I) = CHX(1 + (I - 1) * INCX).
CHY(1), CHY(2), ..., CHY(N) must be in ascending ASCII order.
INCX Displacement between elements of CHX. (Input)
INCX must be greater than zero.
Default: INCX = 1.
INDEX Index of CHY pointing to STRING. (Output)
If INDEX is positive, STRING is found in CHY. If INDEX is negative, STRING is not
found in CHY.
INDEX          Location of STRING
1 thru N       STRING = CHY(INDEX)
-1             STRING < CHY(1) or N = 0
-N thru -2     CHY(-INDEX - 1) < STRING < CHY(-INDEX)
-(N + 1)       STRING > CHY(N)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine SSRCH searches a vector of character strings x (stored in CHX), whose n elements are sorted
in ascending ASCII order, for a character string c (stored in STRING). If c is found in x, its index i
(stored in INDEX) is returned so that $x_i = c$. Otherwise, a negative number i is returned for the
index. Specifically,
if $1 \le i \le n$, then $x_i = c$
if $i = -1$, then $c < x_1$ or $n = 0$
if $-n \le i \le -2$, then $x_{-i-1} < c < x_{-i}$
if $i = -(n + 1)$, then $c > x_n$
Here, < and > are in reference to the ASCII collating sequence. For comparisons made
between character strings c and xi with different lengths, the shorter string is considered as if it
were extended on the right with blanks to the length of the longer string. (SSRCH uses FORTRAN
intrinsic functions LLT and LGT.)
The argument INCX is useful if a row of a matrix, for example, row number I of a matrix CHX,
must be searched. The elements of row I are assumed to be in ascending ASCII order. In this case,
set INCX equal to the leading dimension of CHX exactly as specified in the dimension statement in
the calling program. With CHX declared
CHARACTER * 7 CHX(LDCHX,N)
the invocation
CALL SSRCH (N, STRING, CHX(I,1), LDCHX, INDEX)
Example
This example searches a CHARACTER * 2 vector containing 9 character strings, sorted in ascending
ASCII order, for the value CC.
USE SSRCH_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER N, INCX
PARAMETER (N=9)
INTEGER INDEX, NOUT
CHARACTER CHX(N)*2, STRING*2
!
!
DATA CHX/'AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', &
'II'/
!
INCX = 1
STRING = 'CC'
CALL SSRCH (N, STRING, CHX, INCX, INDEX)
!
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'INDEX = ', INDEX
END
Output
INDEX = 3
ACHAR
This function returns a character given its ASCII value.
Required Arguments
I Integer ASCII value of the character desired. (Input)
I must be greater than or equal to zero and less than or equal to 127.
FORTRAN 90 Interface
Generic:
ACHAR (I)
Specific:
FORTRAN 77 Interface
Single:
ACHAR (I)
Description
Routine ACHAR returns the character of the input ASCII value. The input value should be between
0 and 127. If the input value is out of range, the value returned in ACHAR is machine dependent.
Example
This example returns the character of the ASCII value 65.
USE ACHAR_INT
USE UMACH_INT
!
IMPLICIT NONE
INTEGER I, NOUT
!
CALL UMACH (2, NOUT)
!
!
!
99999 FORMAT (' For the ASCII value of ', I2, ', the character is : ', &
A1)
END
Output
For the ASCII value of 65, the character is : A
IACHAR
This function returns the integer ASCII value of a character argument.
Required Arguments
CH Character argument for which the integer ASCII value is desired. (Input)
FORTRAN 90 Interface
Generic:
IACHAR (CH)
Specific:
FORTRAN 77 Interface
Single:
IACHAR (CH)
Description
Routine IACHAR returns the ASCII value of the input character.
Example
This example gives the ASCII value of character A.
USE IACHAR_INT
IMPLICIT NONE
INTEGER NOUT
CHARACTER CH
!
CALL UMACH (2, NOUT)
!
!
!
99999 FORMAT (' For the character
I3)
END
Output
For the character
65
ICASE
This function returns the ASCII value of a character converted to uppercase.
Required Arguments
CH Character to be converted. (Input)
FORTRAN 90 Interface
Generic:
ICASE (CH)
Specific:
FORTRAN 77 Interface
Single:
ICASE (CH)
Description
Routine ICASE converts a character to its integer ASCII value. The conversion is case insensitive;
that is, it returns the ASCII value of the corresponding uppercase letter for a lowercase letter.
Example
This example shows the case insensitive conversion.
USE ICASE_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER NOUT
CHARACTER CHR
!
!
!
CHR = 'a'
WRITE (NOUT,99999) CHR, ICASE(CHR)
!
99999 FORMAT (' For the character
I3)
END
Output
For the character
65
IICSR
This function compares two character strings using the ASCII collating sequence but without
regard to case.
Meaning
Required Arguments
STR1 First character string. (Input)
STR2 Second character string. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine IICSR compares two character strings. It returns -1 if the first string is less than the
second string, 0 if they are equal, and 1 if the first string is greater than the second string. The
comparison is case insensitive.
Comments
If the two strings, STR1 and STR2, are of unequal length, the shorter string is considered as if it
were extended with blanks to the length of the longer string.
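A comparison with these properties can be sketched with the standard intrinsics IACHAR, ACHAR, LLT, and LGT. The module and function names below (nocase_compare_sketch, to_upper, compare_nocase) are hypothetical; the sketch assumes that only the ASCII letters a through z need case folding and makes no claim about how IICSR is implemented.

```fortran
module nocase_compare_sketch
  implicit none
contains
  ! Hypothetical helper: uppercase copy of s (ASCII letters a-z only).
  pure function to_upper(s) result(u)
    character(len=*), intent(in) :: s
    character(len=len(s)) :: u
    integer :: k, code
    u = s
    do k = 1, len(s)
      code = iachar(s(k:k))
      if (code >= iachar('a') .and. code <= iachar('z')) then
        u(k:k) = achar(code - 32)
      end if
    end do
  end function to_upper

  ! Hypothetical stand-in for the comparison described above: returns
  ! -1, 0, or 1. LLT and LGT use the ASCII collating sequence and treat
  ! the shorter operand as if padded on the right with blanks.
  pure integer function compare_nocase(str1, str2) result(ires)
    character(len=*), intent(in) :: str1, str2
    if (llt(to_upper(str1), to_upper(str2))) then
      ires = -1
    else if (lgt(to_upper(str1), to_upper(str2))) then
      ires = 1
    else
      ires = 0
    end if
  end function compare_nocase
end module nocase_compare_sketch
```

Under these assumptions, compare_nocase('abc', 'ABD') is -1 and compare_nocase('abc', 'ABC   ') is 0, since trailing blanks do not affect the result.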
Example
This example shows different cases on comparing two strings.
USE IICSR_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER NOUT
CHARACTER STR1*6, STR2*6
!
!
!
!
!
!
!
99999 FORMAT (' For String1 = ', A6, 'and String2 = ', A6, &
' IICSR = ', I2, /)
END
Output
For String1 = ABc 1 and String2 =
IICSR =
IICSR =
IICSR = -1
IIDEX
This function determines the position in a string at which a given character sequence begins
without regard to case.
Required Arguments
CHRSTR Character string to be searched. (Input)
KEY Character string that contains the key sequence. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine IIDEX searches for a key string in a given string and returns the index of the starting
element at which the key character string begins. It returns 0 if there is no match. The comparison
is case insensitive. For a case-sensitive version, use the FORTRAN 77 intrinsic function INDEX.
Comments
If the length of KEY is greater than the length of CHRSTR, IIDEX returns a zero.
Example
This example locates a key string.
USE IIDEX_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER NOUT
CHARACTER KEY*5, STRING*10
!
!
!
KEY = 'F'
WRITE (NOUT,99999) STRING, KEY, IIDEX(STRING,KEY)
!
99999 FORMAT (' For STRING = ', A10, ' and KEY = ', A5, ' IIDEX = ', I2, &
/)
END
Output
For STRING = a1b2c3d4e5 and KEY = C3d4E IIDEX =
IIDEX =
CVTSI
Converts a character string containing an integer number into the corresponding integer form.
Required Arguments
STRING Character string containing an integer number. (Input)
NUMBER The integer equivalent of STRING. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine CVTSI converts a character string containing an integer to an INTEGER variable. Leading
and trailing blanks in the string are ignored. If the string contains something other than an integer,
a terminal error is issued. If the string contains an integer larger than can be represented by an
INTEGER variable as determined from routine IMACH (see the Reference Material), a terminal
error is issued.
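The conversion rules above (blanks ignored at both ends, anything other than an integer rejected) can be sketched with an internal READ in standard Fortran. The names cvtsi_sketch and string_to_int are hypothetical. Unlike CVTSI, which issues a terminal error, this sketch reports failure through a logical flag, and it relies on the processor setting IOSTAT when the value does not fit in a default INTEGER.

```fortran
module cvtsi_sketch
  implicit none
contains
  ! Hypothetical helper, not the IMSL routine. On return, ok is .true.
  ! and number holds the value if string is an optionally signed integer
  ! surrounded only by blanks; otherwise ok is .false. and number is 0.
  subroutine string_to_int(string, number, ok)
    character(len=*), intent(in)  :: string
    integer,          intent(out) :: number
    logical,          intent(out) :: ok
    character(len=len(string)) :: s
    integer :: n, first, ios

    number = 0
    ok = .false.
    s = adjustl(string)              ! move leading blanks to the end
    n = len_trim(s)                  ! length without trailing blanks
    if (n == 0) return               ! blank string: not an integer
    first = 1
    if (s(1:1) == '+' .or. s(1:1) == '-') first = 2
    if (first > n) return            ! a sign with no digits
    if (verify(s(first:n), '0123456789') /= 0) return
    read (s(1:n), *, iostat=ios) number
    ok = (ios == 0)
    if (.not. ok) number = 0
  end subroutine string_to_int
end module cvtsi_sketch
```

A caller that wants the behavior described for CVTSI would stop with an error message whenever ok is returned as .false.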
Example
The string 12345 is converted to an INTEGER variable.
USE CVTSI_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER NOUT, NUMBER
CHARACTER STRING*10
!
DATA STRING/'12345'/
!
CALL CVTSI (STRING, NUMBER)
!
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'NUMBER = ', NUMBER
END
Output
NUMBER = 12345
CPSEC
This function returns CPU time used in seconds.
Required Arguments
None
FORTRAN 90 Interface
Generic:
CPSEC ()
Specific:
FORTRAN 77 Interface
Single:
CPSEC (1)
Comments
1.
2. The accuracy of this routine depends on the hardware and the operating system. On some
systems, identical runs can produce timings differing by more than 10 percent.
TIMDY
Gets time of day.
Required Arguments
IHOUR Hour of the day. (Output)
IHOUR is between 0 and 23 inclusive.
MINUTE Minute within the hour. (Output)
MINUTE is between 0 and 59 inclusive.
ISEC Second within the minute. (Output)
ISEC is between 0 and 59 inclusive.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine TIMDY is used to retrieve the time of day.
Example
The following example uses TIMDY to return the current time. Obviously, the output is dependent
upon the time at which the program is run.
USE TIMDY_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER IHOUR, IMIN, ISEC, NOUT
!
CALL TIMDY (IHOUR, IMIN, ISEC)
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'Hour:Minute:Second = ', IHOUR, ':', IMIN, &
':', ISEC
IF (IHOUR .EQ. 0) THEN
WRITE (NOUT,*) 'The time is ', IMIN, ' minute(s), ', ISEC, &
' second(s) past midnight.'
ELSE IF (IHOUR .LT. 12) THEN
WRITE (NOUT,*) 'The time is ', IMIN, ' minute(s), ', ISEC, &
' second(s) past ', IHOUR, ' am.'
ELSE IF (IHOUR .EQ. 12) THEN
WRITE (NOUT,*) 'The time is ', IMIN, ' minute(s), ', ISEC, &
' second(s) past noon.'
ELSE
WRITE (NOUT,*) 'The time is ', IMIN, ' minute(s), ', ISEC, &
' second(s) past ', IHOUR-12, ' pm.'
END IF
END
Output
Hour:Minute:Second = 14 : 34 : 30
The time is 34 minute(s), 30 second(s) past 2 pm.
TDATE
Gets today's date.
Required Arguments
IDAY Day of the month. (Output)
IDAY is between 1 and 31 inclusive.
MONTH Month of the year. (Output)
MONTH is between 1 and 12 inclusive.
IYEAR Year. (Output)
For example, IYEAR = 1985.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine TDATE is used to retrieve today's date. Obviously, the output is dependent upon the date
the program is run.
Example
The following example uses TDATE to return today's date.
USE TDATE_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER IDAY, IYEAR, MONTH, NOUT
!
CALL TDATE (IDAY, MONTH, IYEAR)
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'Day-Month-Year = ', IDAY, '-', MONTH, &
'-', IYEAR
END
Output
Day-Month-Year = 7 - 7 - 2006
NDAYS
This function computes the number of days from January 1, 1900, to the given date.
Required Arguments
IDAY Day of the input date. (Input)
MONTH Month of the input date. (Input)
IYEAR Year of the input date. (Input)
1950 would correspond to the year 1950 A.D. and 50 would correspond to year 50
A.D.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Function NDAYS returns the number of days from January 1, 1900, to the given date. The function
NDAYS returns negative values for days prior to January 1, 1900. A negative IYEAR can be used
to specify B.C. Input dates in year 0 and for October 5, 1582, through October 14, 1582, inclusive,
do not exist; consequently, in these cases, NDAYS issues a terminal error.
Comments
1. Informational error
   Type  Code
   1     1     The Julian calendar, the first modern calendar, went into use in 45
               B.C. No calendar prior to 45 B.C. was as universally used nor as
               accurate as the Julian. Therefore, it is assumed that the Julian
               calendar was in use prior to 45 B.C.
2. The number of days from one date to a second date can be computed by two references
   to NDAYS and then calculating the difference.
3. The beginning of the Gregorian calendar was the first day after October 4, 1582, which
   became October 15, 1582. Prior to that, the Julian calendar was in use. NDAYS makes
   the proper adjustment for the change in calendars.
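The calendar adjustment can be illustrated with the standard integer formulas for the Julian day number. The following is a minimal sketch, not the NDAYS implementation: the names ndays_sketch, julian_day_number, and days_since_1900 are hypothetical, January 1, 1900 is assumed to count as day 0, years are assumed to lie after 4800 B.C., and nonexistent dates are not diagnosed.

```fortran
module ndays_sketch
  implicit none
contains
  ! Hypothetical helper, not the IMSL routine. Julian day number of a
  ! calendar date. Dates on or after October 15, 1582 are treated as
  ! Gregorian and earlier dates as Julian. A negative iyear means B.C.
  ! The date itself is not validated.
  integer function julian_day_number(iday, month, iyear) result(jdn)
    integer, intent(in) :: iday, month, iyear
    integer :: a, y, m, yastro

    yastro = iyear
    if (iyear < 0) yastro = iyear + 1     ! 1 B.C. is astronomical year 0
    a = (14 - month)/12                   ! 1 for January and February
    y = yastro + 4800 - a
    m = month + 12*a - 3                  ! March = 0, ..., February = 11
    jdn = iday + (153*m + 2)/5 + 365*y + y/4
    if (yastro*10000 + month*100 + iday >= 15821015) then
      jdn = jdn - y/100 + y/400 - 32045   ! Gregorian calendar
    else
      jdn = jdn - 32083                   ! Julian calendar
    end if
  end function julian_day_number

  ! Days from January 1, 1900 (assumed here to be day 0) to the date.
  integer function days_since_1900(iday, month, iyear) result(n)
    integer, intent(in) :: iday, month, iyear
    n = julian_day_number(iday, month, iyear) - 2415021
  end function days_since_1900
end module ndays_sketch
```

Under these assumptions, the difference between days_since_1900(28, 2, 1986) and days_since_1900(15, 1, 1986) is 44, and October 15, 1582 is exactly one day after October 4, 1582.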
Example
The following example uses NDAYS to compute the number of days from January 15, 1986, to
February 28, 1986:
USE NDAYS_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER IDAY, IYEAR, MONTH, NDAY0, NDAY1, NOUT
!
IDAY = 15
MONTH = 1
IYEAR = 1986
NDAY0 = NDAYS(IDAY,MONTH,IYEAR)
IDAY = 28
MONTH = 2
IYEAR = 1986
NDAY1 = NDAYS(IDAY,MONTH,IYEAR)
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'Number of days = ', NDAY1 - NDAY0
END
Output
Number of days = 44
NDYIN
Gives the date corresponding to the number of days since January 1, 1900.
Required Arguments
NDAYS Number of days since January 1, 1900. (Input)
IDAY Day of the input date. (Output)
MONTH Month of the input date. (Output)
IYEAR Year of the input date. (Output)
1950 would correspond to the year 1950 A.D. and -50 would correspond to year 50
B.C.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine NDYIN computes the date corresponding to the number of days since January 1, 1900. For
an input value of NDAYS that is negative, the date computed is prior to January 1, 1900. The
routine NDYIN is the inverse of NDAYS.
Comments
The beginning of the Gregorian calendar was the first day after October 4, 1582, which became
October 15, 1582. Prior to that, the Julian calendar was in use. Routine NDYIN makes the proper
adjustment for the change in calendars.
Example
The following example uses NDYIN to compute the date for the 100th day of 1986. This is
accomplished by first using NDAYS to get the day number for December 31, 1985.
USE NDYIN_INT
USE NDAYS_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER IDAY, IYEAR, MONTH, NDAYO, NOUT, NDAY0
!
NDAY0 = NDAYS(31,12,1985)
CALL NDYIN (NDAY0+100, IDAY, MONTH, IYEAR)
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'Day 100 of 1986 is (day-month-year) ', IDAY, &
'-', MONTH, '-', IYEAR
END
Output
Day 100 of 1986 is (day-month-year) 10- 4- 1986
IDYWK
This function computes the day of the week for a given date.
Required Arguments
IDAY Day of the input date. (Input)
MONTH Month of the input date. (Input)
IYEAR Year of the input date. (Input)
1950 would correspond to the year 1950 A.D. and 50 would correspond to year 50
A.D.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Function IDYWK returns an integer code that specifies the day of week for a given date. Sunday
corresponds to 1, Monday corresponds to 2, and so forth.
A negative IYEAR can be used to specify B.C. Input dates in year 0 and for October 5, 1582,
through October 14, 1582, inclusive, do not exist; consequently, in these cases, IDYWK issues a
terminal error.
Comments
1. Informational error
   Type  Code
   1     1     The Julian calendar, the first modern calendar, went into use in 45
               B.C. No calendar prior to 45 B.C. was as universally used nor as
               accurate as the Julian. Therefore, it is assumed that the Julian
               calendar was in use prior to 45 B.C.
2. The beginning of the Gregorian calendar was the first day after October 4, 1582, which
   became October 15, 1582. Prior to that, the Julian calendar was in use. Function IDYWK
   makes the proper adjustment for the change in calendars.
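For dates in the Gregorian calendar, the day-of-week code can be sketched with a short table-based congruence (a method commonly attributed to Sakamoto). The names idywk_sketch and day_of_week are hypothetical, and, unlike IDYWK, the sketch assumes a positive year and makes no adjustment for the Julian calendar.

```fortran
module idywk_sketch
  implicit none
contains
  ! Hypothetical helper, not the IMSL routine. Valid only for Gregorian
  ! dates with a positive year. Returns 1 for Sunday, 2 for Monday, ...,
  ! 7 for Saturday.
  integer function day_of_week(iday, month, iyear) result(idx)
    integer, intent(in) :: iday, month, iyear
    ! Month offsets for a year that is taken to begin in March.
    integer, parameter :: t(12) = (/ 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 /)
    integer :: y

    y = iyear
    if (month < 3) y = y - 1     ! January and February belong to the prior year
    idx = modulo(y + y/4 - y/100 + y/400 + t(month) + iday, 7) + 1
  end function day_of_week
end module idywk_sketch
```

For February 24, 1963, the date used in the example that follows, this sketch gives 1, that is, Sunday.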
Example
The following example uses IDYWK to return the day of the week for February 24, 1963.
USE IDYWK_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER IDAY, IYEAR, MONTH, NOUT
!
IDAY = 24
MONTH = 2
IYEAR = 1963
CALL UMACH (2, NOUT)
WRITE (NOUT,*) 'IDYWK (index for day of week) = ', &
IDYWK(IDAY,MONTH,IYEAR)
END
Output
IDYWK (index for day of week) = 1
VERML
This function obtains IMSL MATH/LIBRARY-related version, system and serial numbers.
Required Arguments
ISELCT Option for the information to retrieve. (Input)
ISELCT
VERML
Operating system (and version number) for which the library was produced.
Fortran compiler (and version number) for which the library was produced.
FORTRAN 90 Interface
Generic:
VERML(ISELCT)
Specific:
FORTRAN 77 Interface
Single:
VERML(ISELCT)
Example
In this example, we print all of the information returned by VERML on a particular machine. The
output is omitted because the results are system dependent.
USE UMACH_INT
USE VERML_INT
IMPLICIT NONE
INTEGER ISELCT, NOUT
CHARACTER STRING(4)*50, TEMP*32
!
STRING(1) = '(''
STRING(2) = '(''
STRING(3) = '(''
STRING(4) = '(''
!
Output
IMSL MATH/LIBRARY Version Number: IMSL Fortran Numerical Library, Version 6.0.0
Operating System ID Number: Solaris Version 10
Fortran Compiler Version Number: Sun Fortran 95 8.1 2005/01/07 (Workshop 10.0)
IMSL MATH/LIBRARY Serial Number: 999999
RAND_GEN
Generates a rank-1 array of random numbers. The output array entries are positive and less than 1
in value.
Required Argument
X Rank-1 array containing the random numbers. (Output)
Optional Arguments
IRND = IRND (Output)
Rank-1 integer array. These integers are the internal results of the Generalized
Feedback Shift Register (GFSR) algorithm. The values are scaled to yield the floating-point
array X. The output array entries are between 1 and $2^{31} - 1$ in value.
ISTATE_IN = ISTATE_IN (Input)
Rank-1 integer array of size 3p + 2, where p = 521, that defines the ensuing state of the
GFSR generator. It is used to reset the internal tables to a previously defined state. It is
the result of a previous use of the ISTATE_OUT= optional argument.
ISTATE_OUT = ISTATE_OUT (Output)
Rank-1 integer array of size 3p + 2 that describes the current state of the GFSR
generator. It is normally used to later reset the internal tables to the state defined
following a return from the GFSR generator. It is the result of a use of the generator
without a user initialization, or it is the result of a previous use of the optional
argument ISTATE_IN= followed by updates to the internal tables from newly
generated values. Example 2 illustrates use of ISTATE_IN and ISTATE_OUT for
setting and then resetting RAND_GEN so that the sequence of integers, irnd, is
repeatable.
IOPT = IOPT(:) (Input)
Derived type array with the same precision as the array x; used for passing optional
data to rand_gen. The options are as follows:
Option Prefix   Option Name
s_, d_          Rand_gen_generator_seed
s_, d_          Rand_gen_LCM_modulus
s_, d_          Rand_gen_use_Fushimi_start
Rand_gen_generator_seed
Sets the initial values for the GFSR. The present value of the seed, obtained by default
from the real-time clock as described below, swaps places with
iopt(IO + 1)%idummy. If the seed is set before any current usage of RAND_GEN, the
exchanged value will be zero.
Rand_gen_LCM_modulus
Sets the initial values for the GFSR. The present value of the LCM, with default value
k = 16807, swaps places with iopt(IO+1)%idummy.
Rand_gen_use_Fushimi_start
Starts the GFSR sequence as suggested by Fushimi (1990). The default starting
sequence is with the LCM recurrence described below.
FORTRAN 90 Interface
Generic:
Specific:
Description
This GFSR algorithm is based on the recurrence
$x_t = x_{t-3p} \oplus x_{t-3q}$
where $a \oplus b$ is the exclusive OR operation on two integers a and b. This operation is performed
until SIZE(x) numbers have been generated. The subscripts in the recurrence formula are
computed modulo 3p. These numbers are converted to floating point by effectively multiplying
the positive integer quantity
$x_t \lor 1$
by a scale factor slightly smaller than 1./(huge(1)). The values p = 521 and q = 32 yield a sequence
with a period approximately
$2^p > 10^{156.8}$
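The recurrence and the scaling step can be sketched in standard Fortran with the IEOR and IOR intrinsics and a circular table of length 3p. This is an illustration under stated assumptions, not the RAND_GEN implementation: the names gfsr_sketch, gfsr_init, and gfsr_fill are hypothetical, the table is filled by a plain multiplicative congruential sequence with multiplier 16807 and modulus $2^{31} - 1$ (an assumed start-up), and the integer is combined with 1 by a bitwise OR before scaling so that the result is never zero.

```fortran
module gfsr_sketch
  use iso_fortran_env, only: int32, int64
  implicit none
  integer, parameter :: p = 521, q = 32
  integer, parameter :: len3p = 3*p, lag = 3*q
  integer(int32) :: table(0:len3p-1)   ! holds x_{t-3p}, ..., x_{t-1}
  integer :: pos = 0                   ! slot that holds x_{t-3p}
contains
  ! Assumed start-up, for illustration only: fill the table with a
  ! congruential sequence s <- 16807*s mod (2**31 - 1). The seed should
  ! be a positive integer smaller than 2**31 - 1.
  subroutine gfsr_init(seed)
    integer, intent(in) :: seed
    integer(int64) :: s
    integer :: i
    s = int(seed, int64)
    do i = 0, len3p - 1
      s = mod(16807_int64*s, 2147483647_int64)
      table(i) = int(s, int32)
    end do
    pos = 0
  end subroutine gfsr_init

  ! Fill x with values strictly between 0 and 1. gfsr_init must be
  ! called first.
  subroutine gfsr_fill(x)
    real, intent(out) :: x(:)
    real :: scale
    integer :: i, j
    ! A scale factor slightly smaller than 1/huge(1).
    scale = nearest(1.0/real(huge(1_int32)), -1.0)
    do i = 1, size(x)
      j = modulo(pos - lag, len3p)                ! slot that holds x_{t-3q}
      table(pos) = ieor(table(pos), table(j))     ! x_t = x_{t-3p} xor x_{t-3q}
      x(i) = real(ior(table(pos), 1_int32))*scale
      pos = modulo(pos + 1, len3p)                ! subscripts modulo 3p
    end do
  end subroutine gfsr_fill
end module gfsr_sketch
```

Calling gfsr_init twice with the same seed reproduces the same sequence, which is the property that the seed option described in this section is meant to provide.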
The default initial values for the sequence of integers {xt} are created by a congruential generator
starting with an odd integer seed
An error condition is noted if the value of CLRATE=0. This indicates that the processor does not
have a functioning real-time clock. In this exceptional case a starting seed must be provided by the
user with the optional argument iopt= and option number ?_rand_gen_generator_seed. The
value v is the current clock for this day, in milliseconds. This value is obtained using the date
routine:
CALL DATE_AND_TIME(VALUES=values)
The default value of k = 16807. Using the optional argument iopt= and the packaged option
number ?_rand_gen_LCM_modulus, k can be given an alternate value. The option number
?_rand_gen_generator_seed can be used to set the initial value of m instead of using the
asynchronous value given by the system clock. This is illustrated in Example 2. If the default
choice of m results in an unsatisfactory starting sequence or it is necessary to duplicate the
sequence, then it is recommended that users set the initial seed value to one of their own choosing.
Resetting the seed complicates the usage of the routine.
This software is based on Fushimi (1990), who gives a more elaborate starting sequence for the
{xt} . The starting sequence suggested by Fushimi can be used with the option number
?_rand_gen_use_Fushimi_start. Fushimi's starting process is more expensive than the
default method, and it is equivalent to starting in another place of the sequence with period $2^p$.
Output
Example 1 for RAND_GEN is correct.
Additional Examples
Example 2: Seeding, Using, and Restoring the Generator
use rand_gen_int
implicit none
! This is Example 2 for RAND_GEN.
integer i
integer, parameter :: n=34, p=521
Output
Example 2 for RAND_GEN is correct.
representative of the histogram in the sense of looking at 20 integers during generation of a large
number of samples.
use rand_gen_int
use show_int
implicit none
! This is Example 3 for RAND_GEN.
integer i, i_bin, i_map, i_left, i_right
integer, parameter :: n_work=1000
integer, parameter :: n_bins=10
integer, parameter :: scale=1000
integer, parameter :: total_counts=100
integer, parameter :: n_samples=total_counts*scale
integer, dimension(n_bins) :: histogram= &
(/4, 6, 8, 14, 20, 17, 12, 9, 7, 3 /)
integer, dimension(n_work) :: working=0
integer, dimension(n_bins) :: distribution=0
integer break_points(0:n_bins)
real(kind(1e0)) rn(n_samples)
real(kind(1e0)), parameter :: tolerance=0.005
integer, parameter :: n_samples_20=20
integer rand_num_20(n_samples_20)
real(kind(1e0)) rn_20(n_samples_20)
! Compute the normalized cumulative distribution.
break_points(0)=0
do i=1,n_bins
break_points(i)=break_points(i-1)+histogram(i)
end do
break_points=break_points*n_work/total_counts
! Obtain uniform random numbers.
call rand_gen(rn)
! Set up the secondary mapping array.
do i_bin=1,n_bins
i_left=break_points(i_bin-1)+1
i_right=break_points(i_bin)
do i=i_left, i_right
working(i)=i_bin
end do
end do
! Map the random numbers into the 'distribution' array.
! This is made approximately proportional to the histogram.
do i=1,n_samples
i_map=nint(rn(i)*(n_work-1)+1)
distribution(working(i_map))= &
distribution(working(i_map))+1
end do
! Check the agreement between the distribution of the
! generated random numbers and the original histogram.
write (*, '(A)', advance='no') 'Original: '
write (*, '(10I6)') histogram*scale
write (*, '(A)', advance='no') 'Generated:'
write (*, '(10I6)') distribution
if (maxval(abs(histogram(1:)*scale-distribution(1:))) &
<= tolerance*n_samples) then
write(*, '(A/)') 'Example 3 for RAND_GEN is correct.'
end if
! Generate 20 integers in 1, 10 according to the distribution
! induced by the histogram.
call rand_gen(rn_20)
! Map from the uniform distribution to the induced distribution.
do i=1,n_samples_20
i_map=nint(rn_20(i)*(n_work-1)+1)
rand_num_20(i)=working(i_map)
end do
call show(rand_num_20,&
'Twenty integers generated according to the histogram:')
end
Output
Example 3 for RAND_GEN is correct.
we generate the samples by obtaining uniform samples u, 0 < u < 1, and solving the equation
$q(x) - u = 0, \quad -\pi < x < \pi.$
These are evaluated in vector form, that is, all entries at one time, using Newton's method:
$x \leftarrow x - dx, \quad dx = (q(x) - u) / p(x).$
An iteration counter forces the loop to terminate. The counter is seldom needed, but it is an
important safeguard.
use rand_gen_int
use show_int
use Numerical_Libraries
IMPLICIT NONE
! This is Example 4 for RAND_GEN.
integer i, i_map, k
integer, parameter :: n_bins=36
integer, parameter :: offset=18
integer, parameter :: n_samples=10000
integer, parameter :: n_samples_30=30
integer, parameter :: COUNT=15
real(kind(1e0)) probabilities(n_bins)
real(kind(1e0)), dimension(n_bins) :: counts=0.0
real(kind(1e0)), dimension(n_samples) :: rn, x, f, fprime, dx
real(kind(1e0)), dimension(n_samples_30) :: rn_30, &
x_30, f_30, fprime_30, dx_30
real(kind(1e0)), parameter :: one=1e0, zero=0e0, half=0.5e0
real(kind(1e0)), parameter :: tolerance=0.01
real(kind(1e0)) two_pi, omega
! Initialize values of 'two_pi' and 'omega'.
two_pi=2.0*const((/'pi'/))
omega=two_pi/n_bins
! Compute the probabilities for each bin according to
! the probability density (cos(x)+1)/(2*pi), -pi<x<pi.
do i=1,n_bins
probabilities(i)=(sin(omega*(i-offset)) &
-sin(omega*(i-offset-1))+omega)/two_pi
end do
! Obtain uniform random numbers in (0,1).
call rand_gen(rn)
! Use Newton's method to solve the nonlinear equation:
! accumulated_distribution_function - random_number = 0.
x=zero; k=0
solve_equation: do
f=(sin(x)+x)/two_pi+half-rn
fprime=(one+cos(x))/two_pi
dx=f/fprime
x=x-dx; k=k+1
if (maxval(abs(dx)) <= sqrt(epsilon(one)) &
.or. k > COUNT) exit solve_equation
end do solve_equation
! Map the random numbers 'x' array into the 'counts' array.
do i=1,n_samples
i_map=int(x(i)/omega+offset)+1
counts(i_map)=counts(i_map)+one
end do
Output
Example 4 for RAND_GEN is correct.
RNGET
Retrieves the current value of the seed used in the IMSL random number generators.
Required Arguments
ISEED The seed of the random number generator. (Output)
ISEED is in the range (1, 2147483646).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine RNGET retrieves the current value of the seed used in the IMSL random number
generators. A reason for doing this would be to restart a simulation, using RNSET to reset the seed.
Example
The following FORTRAN statements illustrate the use of RNGET:
INTEGER ISEED
CALL RNSET(123457)
! Do some simulations.
...
...
CALL RNGET(ISEED)
RNSET
Initializes a random seed for use in the IMSL random number generators.
Required Arguments
ISEED The seed of the random number generator. (Input)
ISEED must be in the range (0, 2147483646). If ISEED is zero, a value is computed
using the system clock; and, hence, the results of programs using the IMSL random
number generators will be different at different times.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine RNSET is used to initialize the seed used in the IMSL random number generators. If the
seed is not initialized prior to invocation of any of the routines for random number generation by
calling RNSET, the seed is initialized via the system clock. The seed can be reinitialized to a
clock-dependent value by calling RNSET with ISEED set to 0.
The effect of RNSET is to set some values in a FORTRAN COMMON block that is used by the
random number generators.
A common use of RNSET is in conjunction with RNGET to restart a simulation.
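The restart pattern can be sketched without the library. The program below is only an illustration under stated assumptions: set_seed, get_seed, and next_uniform are hypothetical stand-ins for RNSET, RNGET, and a uniform generator, built on the default multiplier 16807 and the modulus 2^31 - 1. The clock-based seeding for ISEED = 0 and the library's COMMON block are not modeled.

```fortran
module mcg_seed_demo
  implicit none
  private
  public :: set_seed, get_seed, next_uniform
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: dp = kind(1d0)
  integer(i8), parameter :: modulus = 2147483647_i8     ! 2**31 - 1
  integer(i8), parameter :: multiplier = 16807_i8
  integer(i8), save :: seed = 123457_i8                 ! generator state
contains
  subroutine set_seed(iseed)          ! hypothetical stand-in for RNSET
    integer, intent(in) :: iseed
    seed = int(iseed, i8)
  end subroutine set_seed

  subroutine get_seed(iseed)          ! hypothetical stand-in for RNGET
    integer, intent(out) :: iseed
    iseed = int(seed)
  end subroutine get_seed

  function next_uniform() result(u)   ! one step of the congruential generator
    real(dp) :: u
    seed = mod(multiplier*seed, modulus)
    u = real(seed, dp)/real(modulus, dp)
  end function next_uniform
end module mcg_seed_demo

program restart_demo
  use mcg_seed_demo
  implicit none
  integer :: saved, i
  real(kind(1d0)) :: first(3), again(3), dummy
  call set_seed(123457)
  dummy = next_uniform()              ! do some simulations
  call get_seed(saved)                ! checkpoint the seed
  do i = 1, 3
    first(i) = next_uniform()
  end do
  call set_seed(saved)                ! restart from the checkpoint
  do i = 1, 3
    again(i) = next_uniform()
  end do
  print '(a,3f8.4)', 'Before restart:', first
  print '(a,3f8.4)', 'After restart: ', again
end program restart_demo
```

Both printed rows should agree, since the generator state in this sketch is fully described by the single saved integer.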
Example
The following FORTRAN statements illustrate the use of RNSET:
INTEGER ISEED
! Do some simulations.
...
...
RNOPT
Selects the uniform (0, 1) multiplicative congruential pseudorandom number generator.
Required Arguments
IOPT Indicator of the generator. (Input)
The random number generator is a multiplicative congruential generator with
modulus $2^{31} - 1$, a GFSR generator, or a Mersenne Twister generator. IOPT is used to
choose the multiplier and whether or not shuffling is done, or to choose the GFSR method,
or to choose the Mersenne Twister generator.
IOPT   Generator
8      A 32-bit Mersenne Twister generator is used. The real and double random
       numbers are generated from 32-bit integers.
9      A 64-bit Mersenne Twister generator is used. The real and double random
       numbers are generated from 64-bit integers. This ensures that all bits of both
       float and double are random.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
The uniform pseudorandom number generators use a multiplicative congruential method, with or
without shuffling or a GFSR method, or the Mersenne Twister method. Routine RNOPT determines
which method is used; and in the case of a multiplicative congruential method, it determines the
value of the multiplier and whether or not to use shuffling. The description of RNUN may provide
some guidance in the choice of the form of the generator. If no selection is made explicitly, the
generators use the multiplier 16807 without shuffling. This form of the generator has been in use
for some time (see Lewis, Goodman, and Miller, 1969). This is the generator formerly known as
GGUBS in the IMSL Library. It is the minimal standard generator discussed by Park and Miller
(1988).
Both of the Mersenne Twister generators have a period of $2^{19937} - 1$ and a 624-dimensional equidistribution property. See Matsumoto et al. 1998 for details.
The IMSL Mersenne Twister generators are derived from code copyright (C) 1997 - 2002, Makoto
Matsumoto and Takuji Nishimura, All rights reserved. It is subject to the following notice:
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The IMSL 32-bit Mersenne Twister generator is based on the Matsumoto and Nishimura code
mt19937ar and the 64-bit code is based on mt19937-64.
Example
The FORTRAN statement
CALL RNOPT(1)
would select the simple multiplicative congruential generator with multiplier 16807. Since this is
the same as the default, this statement would have no effect unless RNOPT had previously been
called in the same program to select a different generator.
RNIN32
Initializes the 32-bit Mersenne Twister generator using an array.
Required Arguments
KEY Integer array of length LEN used to initialize the 32-bit Mersenne Twister generator.
(Input)
Optional Arguments
LEN Length of the array key. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
By default, the Mersenne Twister random number generator is initialized using the current seed
value (see RNGET). The seed is limited to one integer for initialization. This function allows an
arbitrary length array to be used for initialization. This subroutine completely replaces the use of the
seed for initialization of the 32-bit Mersenne Twister generator.
Example
See routine RNGE32.
RNGE32
Retrieves the current table used in the 32-bit Mersenne Twister generator.
Required Arguments
MTABLE Integer array of length 625 containing the table used in the 32-bit Mersenne
Twister generator. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
The values in the table contain the state of the 32-bit Mersenne Twister random number generator.
The table can be used by RNSE32 to set the generator back to this state.
Example
In this example, four simulation streams are generated. The first series is generated with the seed
used for initialization. The second series is generated using an array for initialization. The third
series is obtained by resetting the generator back to the state it had at the beginning of the second
stream. Therefore, the second and third streams are identical. The fourth stream is obtained by
resetting the generator back to its original, uninitialized state, and having it reinitialize using the
seed. The first and fourth streams are therefore the same.
USE RNIN32_INT
USE RNGE32_INT
USE RNSET_INT
USE UMACH_INT
USE RNUN_INT
IMPLICIT NONE
INTEGER I, ISEED, NOUT
INTEGER INIT(4)
DATA INIT/291,564,837,1110/
DATA ISEED/123457/
INTEGER NR
REAL R(5)
INTEGER MTABLE(625)
CHARACTER CLABEL(5)*5, FMT*8, RLABEL(3)*5
RLABEL(1)='NONE'
CLABEL(1)='NONE'
DATA FMT/'(W10.4)'/
NR=5
CALL UMACH (2, NOUT)
ISEED = 123457
CALL RNOPT(8)
CALL RNSET(ISEED)
CALL RNUN(R)
CALL WRRRL('FIRST STREAM OUTPUT',1,5,R,1,0, &
FMT, RLABEL, CLABEL)
! REINITIALIZE MERSENNE TWISTER SERIES WITH AN ARRAY
CALL RNIN32(INIT)
! SAVE THE STATE OF THE SERIES
CALL RNGE32(MTABLE)
CALL RNUN(R)
CALL WRRRL('SECOND STREAM OUTPUT',1,5,R,1,0, &
FMT, RLABEL, CLABEL)
! RESTORE THE STATE OF THE TABLE
CALL RNSE32(MTABLE)
CALL RNUN(R)
CALL WRRRL('THIRD STREAM OUTPUT',1,5,R,1,0, &
FMT, RLABEL, CLABEL)
! RESET THE SERIES - IT WILL REINITIALIZE FROM THE SEED
MTABLE(1)=1000
CALL RNSE32(MTABLE)
CALL RNUN(R)
CALL WRRRL('FOURTH STREAM OUTPUT',1,5,R,1,0, &
FMT, RLABEL, CLABEL)
END
Output
First stream output
0.4347 0.3522 0.0139 0.2091 0.4956
Second stream output
0.2486 0.2226 0.1111 0.9563 0.9846
Third stream output
0.2486 0.2226 0.1111 0.9563 0.9846
Fourth stream output
0.4347 0.3522 0.0139 0.2091 0.4956
RNSE32
Sets the current table used in the 32-bit Mersenne Twister generator.
Required Arguments
MTABLE Integer array of length 625 containing the table used in the 32-bit Mersenne
Twister generator. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
The values in MTABLE are the state of the 32-bit Mersenne Twister random number generator
obtained by a call to RNGE32. The values in the table can be used to restore the state of the
generator.
Alternatively, if MTABLE(1) > 625, the generator is set to its original, uninitialized state.
Example
See routine RNGE32.
RNIN64
Initializes the 64-bit Mersenne Twister generator using an array.
Required Arguments
KEY Integer(kind=8) array of length LEN used to initialize the 64-bit Mersenne Twister
generator. (Input)
Optional Arguments
LEN Length of the array key. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
By default, the Mersenne Twister random number generator is initialized using the current seed
value (see RNGET). The seed is limited to one integer for initialization. This function allows an
arbitrary length array to be used for initialization. This subroutine completely replaces the use of the
seed for initialization of the 64-bit Mersenne Twister generator.
RNGE64
Retrieves the current table used in the 64-bit Mersenne Twister generator.
Required Arguments
MTABLE Integer(kind=8) array of length 313 containing the table used in the 64-bit
Mersenne Twister generator. (Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
The values in the table contain the state of the 64-bit Mersenne Twister random number generator.
The table can be used by RNSE64 to set the generator back to this state.
Example
In this example, four simulation streams are generated. The first series is generated with the seed
used for initialization. The second series is generated using an array for initialization. The third
series is obtained by resetting the generator back to the state it had at the beginning of the second
stream. Therefore, the second and third streams are identical. The fourth stream is obtained by
resetting the generator back to its original, uninitialized state, and having it reinitialize using the
seed. The first and fourth streams are therefore the same.
USE RNIN64_INT
USE RNGE64_INT
USE RNSET_INT
USE UMACH_INT
USE RNUN_INT
IMPLICIT NONE
INTEGER I, ISEED, NOUT
INTEGER(KIND=8) INIT(4)
DATA INIT/291,564,837,1110/
DATA ISEED/123457/
INTEGER NR
REAL R(5)
INTEGER(KIND=8) MTABLE(313)
CHARACTER CLABEL(5)*5, FMT*8, RLABEL(3)*5
RLABEL(1)='NONE'
CLABEL(1)='NONE'
DATA FMT/'(W10.4)'/
NR=5
CALL UMACH (2, NOUT)
ISEED = 123457
CALL RNOPT(9)
CALL RNSET(ISEED)
CALL RNUN(R)
CALL WRRRL('FIRST STREAM OUTPUT',1,5,R,1,0, &
FMT, RLABEL, CLABEL)
! REINITIALIZE MERSENNE TWISTER SERIES WITH AN ARRAY
CALL RNIN64(INIT)
! SAVE THE STATE OF THE SERIES
CALL RNGE64(MTABLE)
CALL RNUN(R)
CALL WRRRL('SECOND STREAM OUTPUT',1,5,R,1,0, &
FMT, RLABEL, CLABEL)
! RESTORE THE STATE OF THE TABLE
CALL RNSE64(MTABLE)
CALL RNUN(R)
CALL WRRRL('THIRD STREAM OUTPUT',1,5,R,1,0, &
FMT, RLABEL, CLABEL)
! RESET THE SERIES - IT WILL REINITIALIZE FROM THE SEED
MTABLE(1)=1000
CALL RNSE64(MTABLE)
CALL RNUN(R)
CALL WRRRL('FOURTH STREAM OUTPUT',1,5,R,1,0, &
FMT, RLABEL, CLABEL)
END
Output
First stream output
0.5799 0.9401 0.7102 0.1640 0.5457
Second stream output
0.4894 0.7397 0.5725 0.0863 0.7588
Third stream output
0.4894 0.7397 0.5725 0.0863 0.7588
Fourth stream output
0.5799 0.9401 0.7102 0.1640 0.5457
RNSE64
Sets the current table used in the 64-bit Mersenne Twister generator.
Required Arguments
MTABLE Integer (kind=8) array of length 313 containing the table used in the 64-bit
Mersenne Twister generator. (Input)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
The values in MTABLE are the state of the 64-bit Mersenne Twister random number generator
obtained by a call to RNGE64. The values in the table can be used to restore the state of the
generator. Alternatively, if MTABLE(1) > 313, the generator is set to its original,
uninitialized state.
Example
See function RNGE64.
RNUNF
This function generates a pseudorandom number from a uniform (0, 1) distribution.
Required Arguments
None
FORTRAN 90 Interface
Generic:
RNUNF ()
Specific:
FORTRAN 77 Interface
Single:
RNUNF ()
Double:
Description
Routine RNUNF is the function form of RNUN. The routine RNUNF generates pseudorandom
numbers from a uniform (0, 1) distribution. The algorithm used is determined by RNOPT. The
values returned by RNUNF are positive and less than 1.0.
If several uniform deviates are needed, it may be more efficient to obtain them all at once by a call
to RNUN rather than by several references to RNUNF.
Comments
1.
If the generic version of this function is used, the immediate result must be stored in a
variable before use in an expression. For example:
X = RNUNF()
Y = SQRT(X)
If this is too much of a restriction on the programmer, then the specific name can be
used without this restriction.
2.
Routine RNSET can be used to initialize the seed of the random number generator. The
routine RNOPT can be used to select the form of the generator.
3.
This function has a side effect: it changes the value of the seed, which is passed
through a common block.
Example
In this example, RNUNF is used to generate five pseudorandom uniform numbers. Since RNOPT is
not called, the generator used is a simple multiplicative congruential one with a multiplier of
16807.
USE RNUNF_INT
USE RNSET_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER I, ISEED, NOUT
REAL R(5)
!
CALL UMACH (2, NOUT)
ISEED = 123457
CALL RNSET (ISEED)
DO 10 I=1, 5
R(I) = RNUNF()
10 CONTINUE
WRITE (NOUT,99999) R
99999 FORMAT (' Uniform random deviates: ', 5F8.4)
END
Output
Uniform random deviates: 0.9662 0.2607 0.7663 0.5693 0.8448
RNUN
Generates pseudorandom numbers from a uniform (0, 1) distribution.
Required Arguments
R Vector of length NR containing the random uniform (0, 1) deviates. (Output)
Optional Arguments
NR Number of random numbers to generate. (Input)
Default: NR = SIZE (R,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine RNUN generates pseudorandom numbers from a uniform (0,1) distribution using either a
multiplicative congruential method or a generalized feedback shift register (GFSR) method, or the
Mersenne Twister generator. The form of the multiplicative congruential generator is
$x_i \equiv c\,x_{i-1} \bmod (2^{31} - 1)$
Each $x_i$ is then scaled into the unit interval (0,1). The possible values for c in the IMSL generators
are 16807, 397204094, and 950706376. The selection is made by the routine RNOPT. The choice
of 16807 will result in the fastest execution time. If no selection is made explicitly, the routines
use the multiplier 16807.
The user can also select a shuffled version of the multiplicative congruential generators. In this
scheme, a table is filled with the first 128 uniform (0,1) numbers resulting from the simple
multiplicative congruential generator. Then, for each $x_i$ from the simple generator, the low-order
bits of $x_i$ are used to select a random integer, j, from 1 to 128. The j-th entry in the table is then
delivered as the random number; and $x_i$, after being scaled into the unit interval, is inserted into the
j-th position in the table.
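The shuffling scheme described above can be sketched as follows. This is a minimal illustration, not the library's code: the module and procedure names are hypothetical, the multiplier 16807 is assumed, and taking the low-order 7 bits with IAND is one plausible reading of "the low-order bits".

```fortran
module shuffled_mcg
  implicit none
  private
  public :: shuffle_init, shuffle_next
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: dp = kind(1d0)
  integer(i8), parameter :: modulus = 2147483647_i8   ! 2**31 - 1
  integer(i8), parameter :: c = 16807_i8              ! assumed multiplier
  integer, parameter :: table_size = 128
  integer(i8), save :: x = 1_i8
  real(dp), save :: table(table_size) = 0.0_dp
contains
  function mcg_step() result(u)       ! simple generator, scaled to (0,1)
    real(dp) :: u
    x = mod(c*x, modulus)
    u = real(x, dp)/real(modulus, dp)
  end function mcg_step

  subroutine shuffle_init(iseed)      ! fill the table with the first 128 deviates
    integer, intent(in) :: iseed
    integer :: j
    x = int(iseed, i8)
    do j = 1, table_size
      table(j) = mcg_step()
    end do
  end subroutine shuffle_init

  function shuffle_next() result(u)
    real(dp) :: u, fresh
    integer :: j
    fresh = mcg_step()
    j = int(iand(x, 127_i8)) + 1      ! low-order 7 bits pick a slot in 1..128
    u = table(j)                      ! deliver the stored entry
    table(j) = fresh                  ! replace it with the new scaled value
  end function shuffle_next
end module shuffled_mcg

program shuffle_demo
  use shuffled_mcg
  implicit none
  integer :: i
  real(kind(1d0)) :: r(5)
  call shuffle_init(123457)
  do i = 1, 5
    r(i) = shuffle_next()
  end do
  print '(5f8.4)', r
end program shuffle_demo
```

Every delivered value was produced by the simple generator at some earlier step, so the shuffle changes the order of the stream but not the set of values it can take.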
The GFSR method is based on the recursion $X_t = X_{t-1563} \oplus X_{t-96}$. This generator, which is different
from earlier GFSR generators, was proposed by Fushimi (1990), who discusses the theory behind
the generator and reports on several empirical tests of it.
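The recursion can be illustrated with a circular buffer that holds the last 1563 words. The sketch below is an assumption-laden illustration rather than the library's implementation: the names are hypothetical, and the buffer is filled with a simple congruential sequence, which is not Fushimi's starting procedure and carries none of its guarantees.

```fortran
module gfsr_demo
  implicit none
  private
  public :: gfsr_init, gfsr_next, lag_long, lag_short
  integer, parameter :: lag_long = 1563, lag_short = 96
  integer, save :: state(0:lag_long-1) = 0
  integer, save :: pos = 0            ! slot holding the oldest word, X(t-1563)
contains
  subroutine gfsr_init(iseed)
    ! Crude fill with 31-bit words; NOT Fushimi's starting sequence.
    integer, intent(in) :: iseed
    integer, parameter :: i8 = selected_int_kind(18)
    integer(i8) :: s
    integer :: k
    s = int(iseed, i8)
    do k = 0, lag_long - 1
      s = mod(16807_i8*s, 2147483647_i8)
      state(k) = int(s)
    end do
    pos = 0
  end subroutine gfsr_init

  function gfsr_next() result(word)
    integer :: word
    ! X(t) = X(t-1563) xor X(t-96); X(t-96) sits 96 slots behind pos.
    word = ieor(state(pos), state(modulo(pos - lag_short, lag_long)))
    state(pos) = word                 ! the new word overwrites the oldest one
    pos = modulo(pos + 1, lag_long)
  end function gfsr_next
end module gfsr_demo

program gfsr_check
  use gfsr_demo
  implicit none
  integer, parameter :: n = 2000
  integer :: w(n), t
  logical :: ok
  call gfsr_init(123457)
  do t = 1, n
    w(t) = gfsr_next()
  end do
  ! Outputs past the first 1563 can be checked against the recursion directly.
  ok = all(w(lag_long+1:n) == &
           ieor(w(1:n-lag_long), w(lag_long+1-lag_short:n-lag_short)))
  print *, 'recursion holds: ', ok
  print '(5f8.4)', (real(w(1:5), kind(1d0)) + 0.5d0)/2.0d0**31
end program gfsr_check
```

The last line scales the 31-bit words into (0,1); the half-unit offset is a choice made here to keep the result strictly inside the interval.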
Mersenne Twister (MT) is a pseudorandom number generating algorithm developed by Makoto
Matsumoto and Takuji Nishimura in 1996-1997. MT has a far longer period and a far higher order of
equidistribution than any of the other implemented generators. The values returned in R by RNUN are
positive and less than 1.0. Values in R may be smaller than the smallest relative spacing, however.
Hence, it may be the case that some value R(i) is such that 1.0 - R(i) = 1.0.
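The last remark is a property of floating-point arithmetic rather than of the generator. A small sketch, assuming IEEE single precision and using a hypothetical helper absorbed_by_one, shows a positive value that vanishes when subtracted from 1.0:

```fortran
module spacing_demo
  implicit none
contains
  logical function absorbed_by_one(r)
    ! True when 1.0 - r rounds back to exactly 1.0.
    real, intent(in) :: r
    real :: d
    d = 1.0 - r
    absorbed_by_one = (d == 1.0)
  end function absorbed_by_one
end module spacing_demo

program tiny_deviate
  use spacing_demo
  implicit none
  real :: r
  r = 0.125*epsilon(1.0)    ! positive, but well below the spacing near 1.0
  print *, 'r > 0.0:         ', r > 0.0
  print *, '1.0 - r == 1.0:  ', absorbed_by_one(r)
  print *, '1.0 - 0.5 == 1.0:', absorbed_by_one(0.5)
end program tiny_deviate
```

Code that computes quantities such as LOG(1.0 - R(i)) should therefore be prepared for an argument of exactly 1.0.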
Deviates from the distribution with uniform density over the interval (A, B) can be obtained by
scaling the output from RNUN. The following statements (in single precision) would yield random
deviates from a uniform (A, B) distribution:
CALL RNUN (NR, R)
CALL SSCAL (NR, B-A, R, 1)
CALL SADD (NR, A, R, 1)
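With Fortran 90 array syntax the same scaling can be written in one statement. The sketch below is self-contained and makes two assumptions: the intrinsic RANDOM_NUMBER stands in for RNUN so that the program runs without the library, and scale_uniform is a hypothetical helper name.

```fortran
module uniform_scaling
  implicit none
contains
  elemental function scale_uniform(u, a, b) result(x)
    ! Map a uniform (0,1) deviate u onto the interval (a,b).
    real, intent(in) :: u, a, b
    real :: x
    x = a + (b - a)*u
  end function scale_uniform
end module uniform_scaling

program scale_demo
  use uniform_scaling
  implicit none
  real :: r(5)
  call random_number(r)            ! stand-in for RNUN; values in [0,1)
  r = scale_uniform(r, 2.0, 5.0)   ! whole-array scaling to (2,5)
  print '(5f8.4)', r
end program scale_demo
```

Because the function is elemental, the same call works for a scalar deviate or for a whole vector of deviates.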
Comments
The routine RNSET can be used to initialize the seed of the random number generator. The routine
RNOPT can be used to select the form of the generator.
Example
In this example, RNUN is used to generate five pseudorandom uniform numbers. Since RNOPT is
not called, the generator used is a simple multiplicative congruential one with a multiplier of
16807.
USE RNUN_INT
USE RNSET_INT
USE UMACH_INT
IMPLICIT NONE
INTEGER ISEED, NOUT, NR
REAL R(5)
!
CALL UMACH (2, NOUT)
NR = 5
ISEED = 123457
Output
Uniform random deviates: 0.9662 0.2607 0.7663 0.5693 0.8448
FAURE_INIT
Shuffled Faure sequence initialization.
Required Arguments
NDIM The dimension of the hyper-rectangle. (Input)
STATE An IMSL_FAURE pointer for the derived type created by the call to FAURE_INIT.
The output contains information about the sequence. Use ?_IMSL_FAURE as the type,
where ?_ is S_ or D_ depending on precision. (Output)
Optional Arguments
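NBASE The base of the Faure sequence. (Input)
NSKIP The number of points to be skipped at the beginning of the Faure sequence. (Input)
Default: $\lfloor \mathrm{base}^{m/2-1} \rfloor$, where $m = \lfloor \log B / \log \mathrm{base} \rfloor$ and B is the largest machine
representable integer.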
FORTRAN 90 Interface
Generic:
Specific:
FAURE_FREE
Frees the structure containing information about the Faure sequence.
Required Arguments
STATE An IMSL_FAURE pointer containing the structure created by the call to
FAURE_INIT. (Input/Output)
FORTRAN 90 Interface
Generic:
Specific:
FAURE_NEXT
Computes a shuffled Faure sequence.
Required Arguments
STATE An IMSL_FAURE pointer containing the structure created by the call to
FAURE_INIT. The structure contains information about the sequence. The structure
should be freed using FAURE_FREE after it is no longer needed. (Input/Output)
NEXT_PT Vector of length NDIM containing the next point in the shuffled Faure
sequence, where NDIM is the dimension of the hyper-rectangle specified in
FAURE_INIT.
(Output)
Optional Arguments
IMSL_RETURN_SKIP Returns the current point in the sequence. The sequence can be
restarted by calling FAURE_INIT using this value for NSKIP, and using the same value
for NDIM. (Input)
FORTRAN 90 Interface
Generic:
Specific:
Description
The routines FAURE_INIT and FAURE_NEXT are used to generate a shuffled Faure sequence of
low-discrepancy n-dimensional points. Low-discrepancy series fill an n-dimensional cube more
uniformly than pseudorandom sequences, and are used in multivariate quadrature, simulation, and
global optimization. Because of this uniformity, use of low-discrepancy series is generally more
efficient than pseudorandom series for multivariate Monte Carlo methods. See the IMSL routine
QMC (Chapter 4, Integration and Differentiation) for a discussion of quasi-Monte Carlo quadrature
based on low-discrepancy series.
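The discrepancy of the point set $x_1, \ldots, x_n \in [0,1]^d$ is
$D_n^{(d)} = \sup_E \left| \frac{A(E;n)}{n} - \lambda(E) \right|,$
where the supremum is over all subsets of $[0,1]^d$ of the form
$E = [0,t_1) \times \cdots \times [0,t_d), \quad 0 \le t_j \le 1, \quad 1 \le j \le d,$
$\lambda$ is the Lebesgue measure, and $A(E;n)$ is the number of the points $x_1, \ldots, x_n$ contained in E.
The sequence $x_1, x_2, \ldots$ of points in $[0,1]^d$ is a low-discrepancy sequence if there exists a constant
c(d), depending only on d, such that
$D_n^{(d)} \le c(d) \frac{(\log n)^d}{n}$
for all n > 1.
The generalized Faure sequence is computed as follows. Write the positive integer n in its b-ary expansion,
$n = \sum_{i=0}^{\infty} a_i(n) b^i,$
where the $a_i(n)$ are integers, $0 \le a_i(n) < b$. The j-th coordinate of $x_n$ is
$x_n^{(j)} = \sum_{k=0}^{\infty} \sum_{d=0}^{\infty} c_{kd}^{(j)} a_d(n) b^{-k-1}, \quad 1 \le j \le d.$
The generator matrix for the series, $c_{kd}^{(j)}$, is defined to be
$c_{kd}^{(j)} = j^{d-k} c_{kd},$
where $c_{kd}$ is an element of the Pascal matrix,
$c_{kd} = \begin{cases} \dfrac{d!}{k!\,(d-k)!} & k \le d \\ 0 & k > d \end{cases}$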
It is faster to compute a shuffled Faure sequence than to compute the Faure sequence itself. It can
be shown that this shuffling preserves the low-discrepancy property.
The shuffling used is the b-ary Gray code. The function G(n) maps the positive integer n into the
integer given by its b-ary expansion.
The sequence computed by this function is x(G(n)), where x is the generalized Faure sequence.
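A b-ary Gray code can be sketched as follows. The definition used here is an assumption: it is the common modular form, in which digit i of G(n) is (a_i(n) - a_{i+1}(n)) mod b, and the text above does not confirm that the library uses exactly this form. The name gray_b is hypothetical.

```fortran
module gray_code_demo
  implicit none
contains
  function gray_b(n, b) result(g)
    ! Modular b-ary Gray code of a nonnegative integer n (assumed definition).
    integer, intent(in) :: n, b
    integer :: g, m, place, digit, next_digit
    g = 0
    m = n
    place = 1
    do while (m > 0)
      digit = mod(m, b)                 ! a_i(n)
      next_digit = mod(m/b, b)          ! a_{i+1}(n)
      g = g + modulo(digit - next_digit, b)*place
      place = place*b
      m = m/b
    end do
  end function gray_b
end module gray_code_demo

program gray_demo
  use gray_code_demo
  implicit none
  integer :: n
  ! List G(n) for n = 0..8 in base 3.
  print '(9i3)', (gray_b(n, 3), n = 0, 8)
end program gray_demo
```

For base 3 this prints 0 1 2 5 3 4 7 8 6. Written in base 3, consecutive entries differ in exactly one digit, which is the property that makes stepping from x(G(n)) to x(G(n+1)) cheap.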
Example
In this example, five points in the Faure sequence are computed. The points are in the three-dimensional unit cube.
Note that FAURE_INIT is used to create a structure that holds the state of the sequence. Each call
to FAURE_NEXT returns the next point in the sequence and updates the IMSL_FAURE structure. The
final call to FAURE_FREE frees data items, stored in the structure, that were allocated by
FAURE_INIT.
use faure_int
implicit none
type (s_imsl_faure), pointer :: state
real(kind(1e0)) :: x(3)
integer,parameter :: ndim=3
integer :: k
! CREATE THE STRUCTURE THAT HOLDS
! THE STATE OF THE SEQUENCE.
call faure_init(ndim, state)
! GET THE NEXT POINT IN THE SEQUENCE
do k=1,5
call faure_next(state, x)
write(*,'(3F15.3)') x(1), x(2) , x(3)
enddo
! FREE DATA ITEMS STORED IN
! state STRUCTURE
call faure_free(state)
end
Output
0.334 0.493 0.064
0.667 0.826 0.397
0.778 0.270 0.175
0.111 0.604 0.509
0.445 0.937 0.842
IUMAG
This routine handles MATH/LIBRARY and STAT/LIBRARY type INTEGER options.
Required Arguments
PRODNM Product name. Use either MATH or STAT. (Input)
ICHP Chapter number of the routine that uses the options. (Input)
IACT 1 if user desires to get or read options, or 2 if user desires to put or write
options. (Input)
NUMOPT Size of IOPTS. (Input)
IOPTS Integer array of size NUMOPT containing the option numbers to get or put.
(Input)
IVALS Integer array containing the option values. These values are arrays corresponding
to the individual options in IOPTS in sequential order. The size of IVALS is the sum of
the sizes of the individual options. (Input/Output)
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
The Options Manager routine IUMAG reads or writes INTEGER data for some MATH/LIBRARY
and STAT/LIBRARY codes. See Atchison and Hanson (1991) for more complete details.
There are MATH/LIBRARY routines in Chapters 1, 2, and 5 that now use IUMAG to communicate
optional data from the user.
Comments
1.
Users can normally avoid reading about options when first using a routine that calls
IUMAG.
2.
Let I be any value between 1 and NUMOPT. A negative value of IOPTS(I) refers to
option number IOPTS(I) but with a different effect: For a get operation, the default
values are returned in IVALS. For a put operation, the default values replace the
current values. In the case of a put, entries of IVALS are not allocated by the user and
are not used by IUMAG.
3.
4.
INTEGER Options
If the value is positive, print the next activity for any library routine that uses the
Options Manager codes IUMAG, SUMAG, or DUMAG. Each printing step
decrements the value if it is positive.
Default value is 0.
If the value is 2, perform error checking in IUMAG, SUMAG , and DUMAG such as
the verifying of valid option numbers and the validity of input data. If the value
is 1, do not perform error checking.
Default value is 2.
This value is used for testing the installation of IUMAG by other IMSL software.
Default value is 3.
Example
The number of iterations allowed for the constrained least squares solver LCLSQ that calls L2LSQ
is changed from the default value of max(nra, nca) to the value 6. The default value is restored
after the call to LCLSQ. This change has no effect on the solution. It is used only for illustration.
The first two arguments required for the call to IUMAG are defined by the product name, MATH,
and chapter number, 1, where LCLSQ is documented. The argument IACT denotes a write or put
operation. There is one option to change so NUMOPT has the value 1. The arguments for the option
number, 14, and the new value, 6, are defined by reading the documentation for LCLSQ.
USE IUMAG_INT
USE LCLSQ_INT
USE UMACH_INT
USE SNRM2_INT
IMPLICIT NONE
!
! Solve the following in the least squares sense:
!    3x1 + 2x2 + x3 = 3.3
!    4x1 + 2x2 + x3 = 2.3
!    2x1 + 2x2 + x3 = 1.3
!     x1 +  x2 + x3 = 1.0
!
! Subject to: x1 + x2 + x3 <= 1
!             0 <= x1 <= .5
!             0 <= x2 <= .5
!             0 <= x3 <= .5
!
! ----------------------------------------------------------------------
! Declaration of variables
!
INTEGER ICHP, IPUT, LDA, LDC, MCON, NCA, NEWMAX, NRA, NUMOPT
PARAMETER (ICHP=1, IPUT=2, MCON=1, NCA=3, NEWMAX=14, NRA=4, &
NUMOPT=1, LDA=NRA, LDC=MCON)
!
INTEGER IOPT(1), IRTYPE(MCON), IVAL(1), NOUT
REAL A(LDA,NCA), B(NRA), BC(MCON), C(LDC,NCA), RES(NRA), &
RESNRM, XLB(NCA), XSOL(NCA), XUB(NCA)
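!
! Data initialization
DATA A/3.0E0, 4.0E0, 2.0E0, 1.0E0, 2.0E0, 2.0E0, 2.0E0, 1.0E0, &
1.0E0, 1.0E0, 1.0E0, 1.0E0/, B/3.3E0, 2.3E0, 1.3E0, 1.0E0/, &
C/3*1.0E0/, BC/1.0E0/, IRTYPE/1/, XLB/3*0.0E0/, XUB/3*.5E0/
! ----------------------------------------------------------------------
! Reset the maximum number of
! iterations used in the solver.
! The value 14 is the option number.
! The value 6 is the new maximum.
IOPT(1) = NEWMAX
IVAL(1) = 6
CALL IUMAG ('math', ICHP, IPUT, NUMOPT, IOPT, IVAL)
! ----------------------------------------------------------------------
! Solve the bounded, constrained
! least squares problem.
CALL LCLSQ (A, B, C, BC, BC, IRTYPE, XLB, XUB, XSOL, RES=RES)
! Compute the 2-norm of the residuals.
RESNRM = SNRM2(NRA,RES,1)
! Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) XSOL, RES, RESNRM
! ----------------------------------------------------------------------
! Reset the maximum number of
! iterations to its default value.
! This is not required but is
! recommended programming practice.
IOPT(1) = -IOPT(1)
CALL IUMAG ('math', ICHP, IPUT, NUMOPT, IOPT, IVAL)
! ----------------------------------------------------------------------
!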
99999 FORMAT (' The solution is ', 3F9.4, //, ' The residuals ', &
'evaluated at the solution are ', /, 18X, 4F9.4, //, &
' The norm of the residual vector is ', F8.4)
!
END
Output
The solution is
0.5000
0.3000
0.2000
0.0000
1.2247
UMAG
This routine handles MATH/LIBRARY and STAT/LIBRARY type REAL and double precision
options.
Required Arguments
PRODNM Product name. Use either MATH or STAT. (Input)
ICHP Chapter number of the routine that uses the options. (Input)
IACT 1 if user desires to get or read options, or 2 if user desires to put or write
options. (Input)
IOPTS Integer array of size NUMOPT containing the option numbers to get or put.
(Input)
SVALS Array containing the option values. These values are arrays corresponding to the
individual options in IOPTS in sequential order. The size of SVALS is the sum of the
sizes of the individual options. (Input/Output)
Optional Arguments
NUMOPT Size of IOPTS. (Input)
Default: NUMOPT = SIZE (IOPTS,1).
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
The Options Manager routine SUMAG reads or writes REAL data for some MATH/LIBRARY and
STAT/LIBRARY codes. See Atchison and Hanson (1991) for more complete details. There are
MATH/LIBRARY routines in Chapters 1 and 5 that now use SUMAG to communicate optional data
from the user.
Comments
1.
Users can normally avoid reading about options when first using a routine that calls
SUMAG.
2.
Let I be any value between 1 and NUMOPT. A negative value of IOPTS(I) refers to
option number IOPTS(I) but with a different effect: For a get operation, the default
values are returned in SVALS. For a put operation, the default values replace the
current values. In the case of a put, entries of SVALS are not allocated by the user and
are not used by SUMAG.
3.
4.
This value is used for testing the installation of SUMAG by other IMSL software.
Default value is 3.0E0.
Example
The rank determination tolerance for the constrained least squares solver LCLSQ that calls L2LSQ
is changed from the default value of SQRT(AMACH(4)) to the value 0.01. The default value is
restored after the call to LCLSQ. This change has no effect on the solution. It is used only for
illustration. The first two arguments required for the call to SUMAG are defined by the product
name, MATH, and chapter number, 1, where LCLSQ is documented. The argument IACT
denotes a write or put operation. There is one option to change so NUMOPT has the value 1. The
arguments for the option number, 2, and the new value, 0.01E+0, are defined by reading the
documentation for LCLSQ.
USE UMAG_INT
USE LCLSQ_INT
USE UMACH_INT
USE SNRM2_INT
IMPLICIT NONE
!
! Solve the following in the least squares sense:
!    3x1 + 2x2 + x3 = 3.3
!    4x1 + 2x2 + x3 = 2.3
!    2x1 + 2x2 + x3 = 1.3
!     x1 +  x2 + x3 = 1.0
!
! Subject to: x1 + x2 + x3 <= 1
!             0 <= x1 <= .5
!             0 <= x2 <= .5
!             0 <= x3 <= .5
!
! ----------------------------------------------------------------------
! Declaration of variables
!
INTEGER ICHP, IPUT, LDA, LDC, MCON, NCA, NEWTOL, NRA, NUMOPT
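PARAMETER (ICHP=1, IPUT=2, MCON=1, NCA=3, NEWTOL=2, NRA=4, &
NUMOPT=1, LDA=NRA, LDC=MCON)
!
INTEGER IOPT(1), IRTYPE(MCON), NOUT
REAL A(LDA,NCA), B(NRA), BC(MCON), C(LDC,NCA), RES(NRA), &
RESNRM, SVAL(1), XLB(NCA), XSOL(NCA), XUB(NCA)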
!
DATA A/3.0E0, 4.0E0, 2.0E0, 1.0E0, 2.0E0, 2.0E0, 2.0E0, 1.0E0, &
1.0E0, 1.0E0, 1.0E0, 1.0E0/, B/3.3E0, 2.3E0, 1.3E0, 1.0E0/, &
C/3*1.0E0/, BC/1.0E0/, IRTYPE/1/, XLB/3*0.0E0/, XUB/3*.5E0/
! ----------------------------------------------------------------------
! Reset the rank determination
! tolerance used in the solver.
! The value 2 is the option number.
! The value 0.01 is the new tolerance.
!
IOPT(1) = NEWTOL
SVAL(1) = 0.01E+0
CALL UMAG ('math', ICHP, IPUT, IOPT, SVAL)
! ----------------------------------------------------------------------
! Solve the bounded, constrained
! least squares problem.
!
CALL LCLSQ (A, B, C, BC, BC, IRTYPE, XLB, XUB, XSOL, RES=RES)
! Compute the 2-norm of the residuals.
RESNRM = SNRM2(NRA,RES,1)
! Print results
CALL UMACH (2, NOUT)
WRITE (NOUT,99999) XSOL, RES, RESNRM
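! ----------------------------------------------------------------------
! Reset the rank determination
! tolerance to its default value.
! This is not required but is
! recommended programming practice.
IOPT(1) = -IOPT(1)
CALL UMAG ('math', ICHP, IPUT, IOPT, SVAL)
! ----------------------------------------------------------------------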
99999 FORMAT (' The solution is ', 3F9.4, //, ' The residuals ', &
'evaluated at the solution are ', /, 18X, 4F9.4, //, &
' The norm of the residual vector is ', F8.4)
!
END
Output
 The solution is    0.5000   0.3000   0.2000

 The residuals evaluated at the solution are
                    -1.0000   0.5000   0.5000   0.0000

 The norm of the residual vector is   1.2247
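The printed norm can be checked independently of the library. The following is a minimal sketch in standard Fortran (no IMSL calls) that evaluates the four equations at the printed solution. The program name and the sign convention A*x - b for the residuals are assumptions made for illustration.

```fortran
program residual_check
  implicit none
  real :: a(4,3), b(4), x(3), res(4)

  ! Column-major fill, in the same order as the DATA statement of the example.
  a = reshape([3.0, 4.0, 2.0, 1.0, &
               2.0, 2.0, 2.0, 1.0, &
               1.0, 1.0, 1.0, 1.0], [4, 3])
  b = [3.3, 2.3, 1.3, 1.0]
  x = [0.5, 0.3, 0.2]          ! solution printed by the example

  res = matmul(a, x) - b       ! assumed sign convention: A*x - b
  print '(a,4f9.4)', ' Residuals A*x - b:', res
  print '(a,f8.4)',  ' 2-norm of residual:', norm2(res)
end program residual_check
```

The 2-norm does not depend on the sign convention, so it should agree with the value 1.2247 reported by the example (the exact value is the square root of 1.5).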
SUMAG/DUMAG
See UMAG.
PLOTP
Prints a plot of up to 10 sets of points.
Required Arguments
X Vector of length NDATA containing the values of the independent variable. (Input)
A Matrix of dimension NDATA by NFUN containing the NFUN sets of dependent variable
values. (Input)
SYMBOL CHARACTER string of length NFUN. (Input)
SYMBOL(I : I) is the symbol used to plot function I.
XTITLE CHARACTER string used to label the x-axis. (Input)
YTITLE CHARACTER string used to label the y-axis. (Input)
TITLE CHARACTER string used to label the plot. (Input)
Optional Arguments
NDATA Number of independent variable data points. (Input)
Default: NDATA = SIZE (X,1).
NFUN Number of sets of points. (Input)
NFUN must be less than or equal to 10.
Default: NFUN = SIZE (A,2).
LDA Leading dimension of A exactly as specified in the dimension statement of the calling
program. (Input)
Default: LDA = SIZE (A,1).
INC Increment between elements of the data to be used. (Input)
PLOTP plots X(1 + (I - 1) * INC) for I = 1, 2, ..., NDATA.
Default: INC = 1.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine PLOTP produces a line printer plot of up to ten sets of points superimposed upon the same
plot. A character M is printed to indicate multiple points. The user may specify the x and y-axis
plot ranges and plotting symbols. Plot width and length may be reset in advance by calling PGOPT.
Comments
1.
Informational errors
Type   Code
  3      7    NFUN is greater than 10. Only the first 10 functions are plotted.
  3      8    TITLE is too long. TITLE is truncated from the right side.
  3      9    YTITLE is too long. YTITLE is truncated from the right side.
  3     10    XTITLE is too long. XTITLE is truncated from the right side.
3.
For multiple plots, the character M is used if the same print position is shared by two or
more data sets.
5.
Default page width is 78 and default page length is 60. They may be changed by
calling PGOPT in advance.
Example
This example plots the sine and cosine functions from -3.5 to +3.5 and sets page width and
length to 78 and 40, respectively, by calling PGOPT in advance.
      USE PLOTP_INT
      USE CONST_INT
      USE PGOPT_INT

      IMPLICIT   NONE
      INTEGER    I, IPAGE
      REAL       A(200,2), DELX, PI, RANGE(4), X(200)
      CHARACTER  SYMBOL*2
      INTRINSIC  COS, SIN
!
      DATA SYMBOL/'SC'/
      DATA RANGE/-3.5, 3.5, -1.2, 1.2/
!
      PI   = 3.14159
      DELX = 2.*PI/199.
      DO 10  I= 1, 200
         X(I)   = -PI + FLOAT(I-1) * DELX
         A(I,1) = SIN(X(I))
         A(I,2) = COS(X(I))
   10 CONTINUE
!                                 Set page width and length
      IPAGE = 78
      CALL PGOPT (-1, IPAGE)
      IPAGE = 40
      CALL PGOPT (-2, IPAGE)
      CALL PLOTP (X, A, SYMBOL, 'X AXIS', 'Y AXIS', ' C = COS, S = SIN', &
                  RANGE=RANGE)
!
      END
Output
[Line-printer plot. Title: "C = COS, S = SIN". Vertical axis: "Y AXIS", from -1.2 to 1.2 with labels every 0.4. Horizontal axis: "X AXIS", labeled at -3, -1, 1, and 3. The sine curve is drawn with the symbol S, the cosine curve with C, and print positions shared by both curves with M.]
PRIME
Decomposes an integer into its prime factors.
Required Arguments
N Integer to be decomposed. (Input)
NPF Number of different prime factors of ABS(N). (Output)
If N is equal to -1, 0, or 1, NPF is set to 0.
IPF Integer vector of length 13. (Output)
IPF(I) contains the prime factors of the absolute value of N, for I = 1, ..., NPF. The
remaining 13 - NPF locations are not used.
IEXP Integer vector of length 13. (Output)
IEXP(I) is the exponent of IPF(I), for I = 1, ..., NPF. The remaining 13 - NPF
locations are not used.
IPW Integer vector of length 13. (Output)
IPW(I) contains the quantity IPF(I)**IEXP(I), for I = 1, ..., NPF. The remaining
13 - NPF locations are not used.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine PRIME decomposes an integer into its prime factors. The number to be factored, N, may
not have more than 13 distinct prime factors. The smallest number with more than 13 distinct
prime factors is about 1.3 × 10^16. Most computers do not allow integers of this size.
The routine PRIME is based on a routine by Brenner (1973).
Comments
The output from PRIME should be interpreted in the following way:
ABS(N) = IPF(1)**IEXP(1) * ... * IPF(NPF)**IEXP(NPF).
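The relationship between IPF, IEXP, and IPW can be illustrated with a short, library-independent sketch. It uses plain trial division, which is an assumption chosen for brevity and is not the Brenner (1973) algorithm on which PRIME is based. The module name prime_sketch and the subroutine name factor are hypothetical.

```fortran
module prime_sketch
  implicit none
contains
  ! Hypothetical helper (not the IMSL routine): factor ABS(n) by trial division.
  subroutine factor(n, npf, ipf, iexp, ipw)
    integer, intent(in)  :: n
    integer, intent(out) :: npf, ipf(13), iexp(13), ipw(13)
    integer :: m, p

    npf = 0
    ipf = 0
    iexp = 0
    ipw = 0
    m = abs(n)
    if (m <= 1) return                 ! -1, 0, and 1 have no prime factors

    p = 2
    do while (p <= m / p)              ! same as p*p <= m, but cannot overflow
      if (mod(m, p) == 0) then
        npf = npf + 1
        ipf(npf) = p
        ipw(npf) = 1
        do while (mod(m, p) == 0)      ! divide out every power of p
          m = m / p
          iexp(npf) = iexp(npf) + 1
          ipw(npf) = ipw(npf) * p
        end do
      end if
      p = p + 1
    end do

    if (m > 1) then                    ! whatever is left is a single prime
      npf = npf + 1
      ipf(npf) = m
      iexp(npf) = 1
      ipw(npf) = m
    end if
  end subroutine factor
end module prime_sketch

program demo_factor
  use prime_sketch
  implicit none
  integer :: npf, ipf(13), iexp(13), ipw(13)

  call factor(144, npf, ipf, iexp, ipw)
  print '(a,i6)',  ' NPF  =', npf
  print '(a,2i6)', ' IPF  =', ipf(1:npf)
  print '(a,2i6)', ' IEXP =', iexp(1:npf)
  print '(a,2i6)', ' IPW  =', ipw(1:npf)
  print '(a,i6)',  ' product of IPW =', product(ipw(1:npf))
end program demo_factor
```

The product of the IPW entries reproduces ABS(N), which is the interpretation stated in the Comments.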
Example
This example factors the integer 144 = 2^4 · 3^2.
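      USE PRIME_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    N
      PARAMETER  (N=144)
!
      INTEGER    IEXP(13), IPF(13), IPW(13), NOUT, NPF
!                                 Get prime factors of 144
      CALL PRIME (N, NPF, IPF, IEXP, IPW)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Print results
      WRITE (NOUT,99999) N, IPF(1), IPF(2), IEXP(1), IEXP(2), IPW(1), &
                        IPW(2), NPF
!
99999 FORMAT (' The prime factors for', I5, ' are: ', /, 10X, 2I6, // &
            ' IEXP =', 2I6, /, ' IPW =', 2I6, /, ' NPF =', I6, /)
      END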
Output
 The prime factors for  144 are:
               2     3

 IEXP =     4     2
 IPW =    16     9
 NPF =     2
CONST
This function returns the value of various mathematical and physical constants.
Required Arguments
NAME Character string containing the name of the desired constant. (Input)
See Comment 3 for a list of valid constants.
FORTRAN 90 Interface
Generic:
CONST (NAME)
Specific:
FORTRAN 77 Interface
Single:
CONST (NAME)
Double:
Description
Routine CONST returns the value of various mathematical and physical quantities. For all of the
physical values, Système International d'Unités (SI) units are used.
The reference for each constant is indicated by the code in [ ] in the table in Comment 3.
[1] Cohen and Taylor (1986)
[2] Liepman (1964)
[3] Precomputed mathematical constants
The constants marked with an E before the [ ] are exact (to machine precision).
To change the units of the values returned by CONST, see CUNIT.
Comments
1.
If the generic version of this function is used, the immediate result must be stored in a
variable before use in an expression. For example:
X = CONST('PI')
Y = COS(X)
If this is too much of a restriction on the programmer, then the specific name can be used
without this restriction.
2.
The case of the character string in NAME does not matter. The names 'PI', 'Pi', 'pI',
and 'pi' are equivalent.
3.
Name               Description              Value                     Ref.
AMU                                         1.6605402E-27 kg          [1]
ATM                                         1.01325E+5 N/m^2 E        [2]
AU                 Astronomical unit        1.496E+11 m               [ ]
Avogadro           Avogadro's number        6.0221367E+23 1/mole      [1]
Boltzman           Boltzmann's constant     1.380658E-23 J/K          [1]
                   Speed of light           2.997924580E+8 m/s E      [1]
Catalan            Catalan's constant       0.915965... E             [3]
                                            2.718... E                [3]
ElectronCharge     Electron charge          1.60217733E-19 C          [1]
ElectronMass       Electron mass            9.1093897E-31 kg          [1]
ElectronVolt       Electron volt            1.60217733E-19 J          [1]
Euler                                       0.577... E                [3]
Faraday            Faraday constant         9.6485309E+4 C/mole       [1]
FineStructure      Fine structure           7.29735308E-3             [1]
Gamma              Euler's constant         0.577... E                [3]
Gas                Gas constant             8.314510 J/mole/K         [1]
Gravity            Gravitational constant                             [1]
Hbar               Planck constant / 2 pi   1.05457266E-34 J*s        [1]
PerfectGasVolume                            2.241383E-2 m^3/mole      [*]
Pi                 Pi                       3.141... E                [3]
Planck             Planck's constant h      6.6260755E-34 J*s         [1]
ProtonMass         Proton mass              1.6726231E-27 kg          [1]
Rydberg            Rydberg's constant       1.0973731534E+7 /m        [1]
SpeedLight         Speed of light           2.997924580E+8 m/s E      [1]
StandardGravity    Standard g               9.80665 m/s^2 E           [2]
StandardPressure                            1.01325E+5 N/m^2 E        [2]
StefanBoltzmann    Stefan-Boltzmann         5.67051E-8 W/K^4/m^2      [1]
WaterTriple                                 2.7316E+2 K E             [2]
Example
In this example, Euler's constant γ is obtained and printed. Euler's constant is defined to be
$$\gamma = \lim_{n \to \infty} \left[ \sum_{k=1}^{n-1} \frac{1}{k} - \ln n \right]$$
      USE CONST_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    NOUT
      REAL       GAMA
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Get gamma
      GAMA = CONST('GAMMA')
!                                 Print gamma
      WRITE (NOUT,*) 'GAMMA = ', GAMA
      END
Output
 GAMMA =   0.5772157
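The limit in the definition can also be approximated directly, which gives a rough cross-check of the value returned by CONST. The following sketch uses only standard Fortran. The program name and the choice n = 1000000 are assumptions for illustration. The truncation error of the partial expression is roughly 1/(2n), so only about six digits can be expected to agree.

```fortran
program euler_gamma_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer :: k, n
  real(dp) :: s

  n = 1000000
  s = 0.0_dp
  do k = n - 1, 1, -1            ! add the smallest terms first to limit rounding error
    s = s + 1.0_dp / real(k, dp)
  end do

  ! Partial expression: sum_{k=1}^{n-1} 1/k - ln n
  print '(a,f12.8)', ' Estimate of gamma = ', s - log(real(n, dp))
end program euler_gamma_sketch
```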
CUNIT
Converts X in units XUNITS to Y in units YUNITS.
Required Arguments
X Value to be converted. (Input)
XUNITS Character string containing the name of the units for X. (Input)
See Comments for a description of units allowed.
Y Value in YUNITS corresponding to X in XUNITS. (Output)
YUNITS Character string containing the name of the units for Y. (Input)
See Comments for a description of units allowed.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Double:
Description
Routine CUNIT converts a value expressed in one set of units to a value expressed in another set of
units.
The input and output units are checked for consistency unless the input unit is 'SI'. SI means the
Système International d'Unités. This is the meter-kilogram-second form of the metric system. If
the input units are 'SI', then the input is assumed to be expressed in the SI units consistent with
the output units.
Comments
1.
Strings XUNITS and YUNITS have the form U1 * U2 * ... * Um/V1 ... Vn, where Ui and Vi
are the names of basic units or are the names of basic units raised to a power. Examples
are 'METER * KILOGRAM/SECOND', 'M * KG/S', 'METER', or 'M/KG2'.
2.
The case of the character string in XUNITS and YUNITS does not matter. The names
METER, Meter and meter are equivalent.
3.
Units of mass:                AMU, g = gram, lb = pound, ounce = oz, slug
Units of distance:            Angstrom, AU, feet = foot = ft, in = inch, m = meter = metre,
                              micron, mile, mill, parsec, yard
Units of area:                acre
Units of volume:              l = liter = litre
Units of force:               dyne, N = Newton, poundal
Units of energy:              BTU(thermochemical), Erg, J = Joule
Units of work:                W = watt
Units of pressure:            ATM = atmosphere, bar, Pascal
Units of temperature:         degC = Celsius, degF = Fahrenheit, degK = Kelvin
Units of viscosity:           poise, stoke
Units of charge:              Abcoulomb, C = Coulomb, statcoulomb
Units of current:             A = ampere, abampere, statampere
Units of voltage:             Abvolt, V = volt
Units of magnetic induction:  T = Tesla, Wb = Weber
Other units:                  1, farad, mole, Gauss, Henry, Maxwell, Ohm
4.
The following metric prefixes may be used with the above units. Note that the one or two letter
prefixes may only be used with one letter unit abbreviations.
      Atto     1.E-18
      Femto    1.E-15
      Pico     1.E-12
      Nano     1.E-9
      Micro    1.E-6
      Milli    1.E-3
      Centi    1.E-2
      Deci     1.E-1
DK    Deca     1.E+2
      Kilo     1.E+3
      Myriad
      Mega
      Giga     1.E+9
      Tera     1.E+12
5.
Informational error
Type   Code
  3      8    A conversion of units of mass to units of force was required for consistency.
Example
The routine CONST is used to obtain the speed of light, c, in SI units. CUNIT is then used to
convert c to mile/second and to parsec/year. An example involving substitution of force for mass
is required in conversion of Newton/Meter**2 to Pound/Inch**2.
      USE CONST_INT
      USE CUNIT_INT
      USE UMACH_INT

      IMPLICIT   NONE
      INTEGER    NOUT
      REAL       CMH, CMS, CPY, CPSI
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Get speed of light in SI (m/s)
      CMS = CONST('SpeedLight')
      WRITE (NOUT,*) 'Speed of Light = ', CMS, ' meter/second'
!                                 Get speed of light in mile/second
      CALL CUNIT (CMS, 'SI', CMH, 'Mile/Second')
      WRITE (NOUT,*) 'Speed of Light = ', CMH, ' mile/second'
!                                 Get speed of light in parsec/year
      CALL CUNIT (CMS, 'SI', CPY, 'Parsec/Year')
      WRITE (NOUT,*) 'Speed of Light = ', CPY, ' Parsec/Year'
!                                 Convert Newton/Meter**2 to
!                                 Pound/Inch**2.
      CALL CUNIT(1.E0, 'Newton/Meter**2', CPSI, &
                 'Pound/Inch**2')
      WRITE(NOUT,*)' Atmospheres, in Pound/Inch**2 = ',CPSI
      END
Output
 Speed of Light =    299792440.0 meter/second
 Speed of Light =       186282.39 mile/second
 Speed of Light =      0.3063872 Parsec/Year
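As a plausibility check on the second line of output, the conversion to mile/second can be worked by hand. This assumes the international mile of exactly 1609.344 m, which may differ from the definition used inside CUNIT:

```latex
c = \frac{2.99792458 \times 10^{8}\ \mathrm{m/s}}{1609.344\ \mathrm{m/mile}}
  \approx 1.86282397 \times 10^{5}\ \mathrm{mile/s}
```

This agrees with the printed value to within single-precision accuracy.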
HYPOT
This function computes SQRT(A**2 + B**2) without underflow or overflow.
Required Arguments
A First parameter. (Input)
B Second parameter. (Input)
FORTRAN 90 Interface
Generic:
HYPOT (A, B)
Specific:
FORTRAN 77 Interface
Single:
HYPOT (A, B)
Double:
Description
Routine HYPOT is based on the routine PYTHAG, used in EISPACK 3. This is an update of the work
documented in Garbow et al. (1972).
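To see why a naive evaluation can fail and how scaling avoids it, consider the following sketch in standard Fortran. It divides by the larger magnitude before squaring. This is a common textbook technique and an assumption here; it is not the PYTHAG iteration on which HYPOT is based. The names hypot_sketch and scaled_hypot are hypothetical.

```fortran
module hypot_sketch
  implicit none
contains
  ! Hypothetical helper (not the IMSL routine): sqrt(a**2 + b**2) with scaling.
  pure function scaled_hypot(a, b) result(c)
    real, intent(in) :: a, b
    real :: c, big, small, ratio

    big = max(abs(a), abs(b))
    small = min(abs(a), abs(b))
    if (big == 0.0) then
      c = 0.0
    else
      ratio = small / big                  ! lies in [0, 1], so ratio**2 cannot overflow
      c = big * sqrt(1.0 + ratio*ratio)
    end if
  end function scaled_hypot
end module hypot_sketch

program demo_hypot
  use hypot_sketch
  implicit none
  real :: a, b

  a = 1.0e+20
  b = 2.0e+20
  ! a**2 = 1.0e+40 exceeds the largest single-precision value (about 3.4e+38),
  ! so sqrt(a**2 + b**2) evaluated directly would overflow.
  print '(a,1pe10.4)', ' C = ', scaled_hypot(a, b)
end program demo_hypot
```

The scaled form costs one extra division but keeps every intermediate result within range whenever the final result is representable.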
Example
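Computes
$$c = \sqrt{a^2 + b^2}$$
      USE HYPOT_INT
      USE UMACH_INT

      IMPLICIT   NONE
!                                 Declare variables
      INTEGER    NOUT
      REAL       A, B, C
!
      A = 1.0E+20
      B = 2.0E+20
      C = HYPOT(A,B)
!                                 Get output unit number
      CALL UMACH (2, NOUT)
!                                 Print the results
      WRITE (NOUT,'(A,1PE10.4)') ' C = ', C
      END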
Output
C = 2.2361E+20
MP_SETUP
REQUIRED
Required Argument
None.
Optional Arguments
NOTE Character string 'Final'. (Input)
With 'Final' all pending error messages are sent from the nodes to the root and
printed. If any node should STOP after printing messages, then MPI_Finalize() and
a STOP are executed. Otherwise, only MPI_Finalize() is called. The character
string 'Final' is the only valid string for this argument.
N Size of array to be allocated for timing. (Input)
When this argument is supplied, the array MPI_NODE_PRIORITY is allocated with
MP_NPROCS components. The matrix products A .x. B are timed individually at each
node of the machine. The elapsed time is noted and sorted to determine the node
priority order. A and B are allocated to size N by N, and initialized with random data.
The priority order is finally broadcast to the other nodes.
FORTRAN 90 Interface
MP_SETUP ([,...])
Description
Following a call to the function MP_SETUP(), the module MPI_node_int will contain
information about the number of processors, the rank of a processor, the communicator for
IMSL Fortran Numerical Library, and the usage priority order of the node machines:
MODULE MPI_NODE_INT
INTEGER, ALLOCATABLE :: MPI_NODE_PRIORITY(:)
INTEGER, SAVE :: MP_LIBRARY_WORLD = huge(1)
LOGICAL, SAVE :: MPI_ROOT_WORKS = .TRUE.
INTEGER, SAVE :: MP_RANK = 0, MP_NPROCS = 1
END MODULE
When the function MP_SETUP() is called with no arguments, the following events occur:
If MPI has not been initialized, it is first initialized. This step uses the routines
MPI_Initialized() and possibly MPI_Init(). Users who choose not to call
MP_SETUP() must make the required initialization call before using any IMSL Fortran
Numerical Library code that relies on MPI for its execution. If the user's code calls an IMSL
Fortran Numerical Library function utilizing the box data type and MPI has not been
initialized, then the computations are performed on the root node. The only MPI routine
always called in this context is MPI_Initialized(). The name MP_SETUP is pushed onto
the subprogram or call stack.
The integers MP_RANK and MP_NPROCS are respectively the node's rank and the number of
nodes in the communicator, MP_LIBRARY_WORLD. Their values require the routines
MPI_Comm_size() and MPI_Comm_rank(). The default values are important when MPI is
not initialized and a box data type is computed. In this case the root node is the only node
and it will do all the work. No calls to MPI communication routines are made when
MP_NPROCS = 1 when computing the box data type functions. A program can temporarily
assign this value to force box data type computation entirely at the root node. This is
desirable for problems where using many nodes would be less efficient than using the root
node exclusively.
The array MPI_NODE_PRIORITY(:) is not allocated unless the user allocates it. The IMSL
Fortran Numerical Library codes use this array for assigning tasks to processors, if it is
allocated. If it is not allocated, the default priority of the nodes is
(0,1,...,MP_NPROCS-1). Use of the function call MP_SETUP(N) allocates the array, as
explained below. Once the array is allocated its size is MP_NPROCS. The contents of the array
are a permutation of the integers 0,...,MP_NPROCS-1. Nodes appearing at the start of the list
are used first for parallel computing. A node other than the root can avoid any computing,
except receiving the schedule, by setting the value MPI_NODE_PRIORITY(I) < 0. This
means that node |MPI_NODE_PRIORITY(I)| will be sent the task schedule but will not
perform any significant work as part of box data type function evaluations.
The LOGICAL flag MPI_ROOT_WORKS designates whether or not the root node participates in
the major computation of the tasks. The root node communicates with the other nodes to
complete the tasks but can be designated to do no other work. Since there may be only one
processor, this flag has the default value .TRUE., assuring that one node exists to do work.
When more than one processor is available users can consider assigning
MPI_ROOT_WORKS=.FALSE. This is desirable when the alternate nodes have equal or greater
computational resources compared with the root node. Parallel Example 4 illustrates this
usage. A single problem is given a box data type, with one rack. The computing is done at
the node, other than the root, with highest priority. This example requires more than one
processor since the root does no work.
When the generic function MP_SETUP(N) is called, where N is a positive integer, a call to
MP_SETUP() is first made, using no argument. Use just one of these calls to MP_SETUP(). This
initializes the MPI system and the other parameters described above. The array
MPI_NODE_PRIORITY(:) is allocated with size MP_NPROCS. Then DOUBLE PRECISION matrix
products C = AB, where A and B are N by N matrices, are computed at each node and the elapsed
time is recorded. These elapsed times are sorted and the contents of MPI_NODE_PRIORITY(:)
are permuted in accordance with the shortest times yielding the highest priority. All the nodes in
the communicator MP_LIBRARY_WORLD are timed. The array MPI_NODE_PRIORITY(:) is then
broadcast from the root to the remaining nodes of MP_LIBRARY_WORLD using the routine
MPI_Bcast(). Timing matrix products to define the node priority is relevant because the effort
to compute C is comparable to that of many linear algebra computations of similar size. Users are
free to define their own node priority and broadcast the array MPI_NODE_PRIORITY(:) to the
alternate nodes in the communicator.
To print any IMSL Fortran Numerical Library error messages that have occurred at any node, and
to finalize MPI, use the function call MP_SETUP('Final'). The case of the string 'Final' is
not important. Any error messages pending will be discarded after printing on the root node. This
is triggered by popping the name 'MP_SETUP' from the subprogram stack or returning to Level 1
in the stack. Users can obtain error messages by popping the stack to Level 1 and still continuing
with MPI calls. This requires executing call e1pop ('MP_SETUP'). To continue on after
summarizing errors execute call e1psh ('MP_SETUP'). More details about the error
processor are found in the Reference Material chapter of this manual.
Messages are printed by nodes from largest rank to smallest, which is the root node. Use of the
routine MPI_Finalize() is made within MP_SETUP('Final'), which shuts down MPI. After
MPI_Finalize() is called, the value of MP_NPROCS is 0. This flags that MPI has been
initialized and terminated. It cannot be initialized again in the same program unit execution. No
MPI routine is defined when MP_NPROCS has this value.
Examples
Parallel Example (parallel_ex01.f90)
use linear_operators
use mpi_setup_int
implicit none
! This is the equivalent of Parallel Example 1 for .ix., with box data types
! and functions.
integer, parameter :: n=32, nr=4
real(kind(1e0)) :: one=1e0
real(kind(1e0)), dimension(n,n,nr) :: A, b, x, err(nr)
! Setup for MPI.
MP_NPROCS=MP_SETUP()
! Generate random matrices for A and b:
A = rand(A); b=rand(b)
! Compute the box solution matrix of Ax = b.
x = A .ix. b
! Check the results.
err = norm(b - (A .x. x))/(norm(A)*norm(x)+norm(b))
if (ALL(err <= sqrt(epsilon(one))) .and. MP_RANK == 0) &
write (*,*) 'Parallel Example 1 is correct.'
! See to any error messages and quit MPI.
MP_NPROCS=MP_SETUP('Final')
end
Here an alternate node is used to compute the majority of a single application, and the user does
not need to make any explicit calls to MPI routines. The time-consuming parts are the evaluation
of the eigenvalue-eigenvector expansion, the solving step, and the residuals. To do this, the
rank-2 arrays are changed to a box data type with a unit third dimension. This uses parallel
computing. The node priority order is established by the initial function call, MP_SETUP(n).
The root is restricted from working on the box data type by assigning
MPI_ROOT_WORKS=.false. This example anticipates that the most efficient node, other than the
root, will perform the heavy computing. Two nodes are required to execute.
use linear_operators
use mpi_setup_int
implicit none
! This is the equivalent of Parallel Example 4 for matrix exponential.
! The box dimension has a single rack.
integer, parameter :: n=32, k=128, nr=1
integer i
real(kind(1e0)), parameter :: one=1e0, t_max=one, delta_t=t_max/(k-1)
real(kind(1e0)) err(nr), sizes(nr), A(n,n,nr)
real(kind(1e0)) t(k), y(n,k,nr), y_prime(n,k,nr)
complex(kind(1e0)), dimension(n,nr) :: x(n,n,nr), z_0, &
Z_1(n,nr,nr), y_0, d
Reference Material
Contents
User Errors
Automatic Workspace Allocation
Machine-Dependent Constants
Matrix Storage Modes
Reserved Names
Deprecated and Renamed Routines
User Errors
IMSL routines attempt to detect user errors and handle them in a way that provides as much
information to the user as possible. To do this, we recognize various levels of severity of errors,
and we also consider the extent of the error in the context of the purpose of the routine; a trivial
error in one situation may be serious in another. IMSL routines attempt to report as many errors as
they can reasonably detect. Multiple errors present a difficult problem in error detection because
input is interpreted in an uncertain context after the first error is detected.
Terminal errors
If the user's input is regarded as meaningless, such as N = -1 when N is the number of equations,
the routine prints a message giving the value of the erroneous input argument(s) and the reason for
the erroneous input. The routine will then cause the user's program to stop. An error in which the
user's input is meaningless is the most severe error and is called a terminal error. Multiple
terminal error messages may be printed from a single routine.
Informational errors
In many cases, the best way to respond to an error condition is simply to correct the input and
rerun the program. In other cases, the user may want to take actions in the program itself based on
errors that occur. An error that may be used as the basis for corrective action within the program is
called an informational error. If an informational error occurs, a user-retrievable code is set. A
routine can return at most one informational error for a single reference to the routine. The codes
for informational errors are printed in the error messages.
Other errors
In addition to informational errors, IMSL routines issue error messages for which no user-retrievable code is set. Multiple error messages for this kind of error may be printed. These errors,
which generally are not described in the documentation, include terminal errors as well as less
serious errors. Corrective action within the calling program is not possible for these errors.
In such cases, the user is advised to compare carefully the actual arguments passed to the
routine with the dummy argument descriptions given in the documentation. Special
attention should be given to checking argument order and data types.
A terminal error is not an informational error because corrective action within the
program is generally not reasonable. In normal usage, execution is terminated
immediately when a terminal error occurs. Messages relating to more than one terminal
error are printed if they occur. Default attributes: PRINT=YES, STOP=YES
The user can set PRINT and STOP attributes by calling ERSET as described in Routines for Error
Handling.
ERSET
Change the default printing or stopping actions when errors of a particular error severity level
occur.
Required Arguments
IERSVR Error severity level indicator. (Input)
If IERSVR = 0, actions are set for levels 1 to 5. If IERSVR is 1 to 5, actions are set for
errors of the specified severity level.
IPACT Printing action. (Input)
IPACT
Action
Do not print.
Print.
Action
Do not stop.
Stop.
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
The function IERCD retrieves the code set by the most recently called IMSL routine.
N1RTY retrieves the error type set by the most recently called IMSL routine. It is used in the
following way:
ITYPE = N1RTY(1)
ITYPE = 1, 2, 4, and 5 correspond to error severity levels 1, 2, 4, and 5, respectively. ITYPE = 3
and ITYPE = 6 are both warning errors, error severity level 3. While ITYPE = 3 errors are
informational errors (IERCD( ) ≠ 0), ITYPE = 6 errors are not informational errors (IERCD( ) = 0).
For software developers requiring additional interaction with the IMSL error handling system, see
Aird and Howell (1991).
Examples
Changes to default actions
Some possible changes to the default actions are illustrated below. The default actions remain in
effect for the kinds of errors not included in the call to ERSET.
      .
      .
      .
      CALL LFTDS (A, FACT)
      IF (IERCD() .EQ. 2) THEN
!        Handle matrix that is not nonnegative definite
         .
         .
         .
      END IF
Examples of errors
The program below illustrates each of the different types of errors detected by the
MATH/LIBRARY routines.
The error messages refer to the argument names that are used in the documentation for the routine,
rather than the user's name of the variable used for the argument. In the message generated by
IMSL routine LINRG in this example, reference is made to N, whereas in the program a literal was
used for this argument.
      USE IMSL_LIBRARIES
      INTEGER    N
      PARAMETER  (N=2)
!
REAL
!
!
!
!
!
Output
*** FATAL
***
***
*** TERMINAL ERROR 1 from LINRG.  The order of the matrix must be positive
***          while N = -1 is given.
Example of traceback
The next program illustrates a situation in which a traceback is produced. The program uses the
IMSL quadrature routines QDAG and QDAGS to evaluate the double integral
$$\int_0^1 \int_0^1 (x + y)\,dx\,dy = \int_0^1 g(y)\,dy$$
where
$$g(y) = \int_0^1 (x + y)\,dx = \int_0^1 f(x)\,dx, \quad \text{with } f(x) = x + y$$
Since both QDAG and QDAGS need 2500 numeric storage units of workspace, and since the
workspace allocator uses some space to keep track of the allocations, 6000 numeric storage units
of space are explicitly allocated for workspace. Although the traceback shows an error code
associated with a terminal error, this code has no meaning to the user; the printed message
contains all relevant information. It is not assumed that the user would take corrective action based
on knowledge of the code.
USE QDAGS_INT
!
!
!
      REAL FUNCTION G (ARGY)
      USE QDAG_INT
      REAL       ARGY
!
      INTEGER    IRULE
      REAL       C, D, ERRABS, ERREST, ERRREL, F, Y
      COMMON     /COMY/ Y
      EXTERNAL   F
!
      Y      = ARGY
      C      = 0.0
      D      = 1.0
      ERRABS = 0.0
      ERRREL = -0.001
      IRULE  = 1
!
      CALL QDAG (F, C, D, G, ERRABS, ERRREL, IRULE, ERREST)
      RETURN
      END
!
      REAL FUNCTION F (X)
      REAL       X
!
      REAL       Y
      COMMON     /COMY/ Y
!
      F = X + Y
      RETURN
      END
Output
*** TERMINAL ERROR 4 from Q2AG.  The relative error desired ERRREL =
***          -1.000000E-03.  It must be at least zero.
Here is a traceback of subprogram calls in reverse order:
Routine name                    Error type  Error code
------------                    ----------  ----------
Q2AG                                 5           4    (Called internally)
QDAG                                 0           0
Q2AGS                                0           0    (Called internally)
QDAGS                                0           0
USER                                 0           0
Machine-Dependent Constants
The function subprograms in this section return machine-dependent information and can be used
to enhance portability of programs between different computers. The routines IMACH and AMACH
describe the computer's arithmetic. The routine UMACH describes the input, output, and error output
unit numbers.
IMACH
This function retrieves machine integer constants that define the arithmetic used by the computer.
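The machine model assumes that integers are represented in M-digit, base A form as
$$\sigma \sum_{k=0}^{M} x_k A^k$$
where σ is the sign.
The machine model assumes that floating-point numbers are represented in normalized
N-digit, base B form as
$$\sigma B^E \sum_{k=1}^{N} x_k B^{-k}$$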
Required Arguments
I Index of the desired constant. (Input)
FORTRAN 90 Interface
Generic:
IMACH (I)
Specific:
FORTRAN 77 Interface
Single:
IMACH (I)
AMACH
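The function subprogram AMACH retrieves machine constants that define the computer's single-precision or double-precision arithmetic. Such floating-point numbers are represented in
normalized N-digit, base B form as
$$\sigma B^E \sum_{k=1}^{N} x_k B^{-k}$$
where σ is the sign and E_min ≤ E ≤ E_max. Then
AMACH(1) = B^(E_min - 1), the smallest normalized positive number.
AMACH(2) = B^(E_max) (1 - B^(-N)), the largest number.
AMACH(3) = B^(-N), the smallest relative spacing.
AMACH(4) = B^(1 - N), the largest relative spacing.
AMACH(5) = log10(B).
AMACH(6) = NaN (quiet not a number).
AMACH(7) = positive machine infinity.
AMACH(8) = negative machine infinity.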
See Comment 1 for a description of the use of the generic version of this function.
See Comment 2 for a description of E_min, E_max, and N.
Required Arguments
I Index of the desired constant. (Input)
FORTRAN 90 Interface
Generic:
AMACH (I)
Specific:
FORTRAN 77 Interface
Single:
AMACH (I)
Double:
Comments
1.
If the generic version of this function is used, the immediate result must be stored in a
variable before use in an expression. For example:
X = AMACH(I)
Y = SQRT(X)
If this is too much of a restriction on the programmer, then the specific name can be
used without this restriction.
2.
3.
The IEEE standard for binary arithmetic (see IEEE 1985) specifies quiet NaN (not a
number) as the result of various invalid or ambiguous operations, such as 0/0. The intent
is that AMACH(6) return a quiet NaN. On IEEE format computers that do not support a
quiet NaN, a special value near AMACH(2) is returned for AMACH(6). On computers that do
not have a special representation for infinity, AMACH(7) returns the same value as
AMACH(2).
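The floating-point model used here has the same form as the model behind the standard Fortran numeric inquiry intrinsics. The sketch below prints the intrinsic quantities that correspond to the model expressions for single precision. Whether AMACH returns exactly these values on a given platform is an assumption that this sketch does not verify, and the program name is hypothetical.

```fortran
program machine_model_sketch
  implicit none
  real :: x
  integer :: b, n

  x = 1.0
  b = radix(x)        ! base B of the model
  n = digits(x)       ! number of base-B digits N

  print *, 'B**(Emin-1)            =', tiny(x)
  print *, 'B**Emax * (1 - B**(-N)) =', huge(x)
  print *, 'B**(-N)                =', epsilon(x) / real(b)
  print *, 'B**(1-N)               =', epsilon(x)
  print *, 'log10(B)               =', log10(real(b))
  ! Cross-check of the model expression for the largest relative spacing.
  print *, 'real(B)**(1-N)         =', real(b)**(1 - n)
end program machine_model_sketch
```

For example, the default tolerance SQRT(AMACH(4)) mentioned in the UMAG example corresponds in this model to the square root of B^(1 - N).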
DMACH
See AMACH.
IFNAN(X)
This logical function checks if the argument X is NaN (not a number).
Required Arguments
X Argument for which the test for NAN is desired. (Input)
FORTRAN 90 Interface
Generic:
IFNAN(X)
Specific:
FORTRAN 77 Interface
Single:
IFNAN (X)
Double:
Example
      USE IFNAN_INT
      USE AMACH_INT
      USE UMACH_INT

      INTEGER    NOUT
      REAL       X
!
      CALL UMACH (2, NOUT)
!
      X = AMACH(6)
      IF (IFNAN(X)) THEN
         WRITE (NOUT,*) ' X is NaN (not a number).'
      ELSE
         WRITE (NOUT,*) ' X = ', X
      END IF
!
      END
Output
X is NaN (not a number).
Description
The logical function IFNAN checks if the single or double precision argument X is NaN (not a
number). The function IFNAN is provided to facilitate the transfer of programs across computer
systems. This is because the check for NaN can be tricky and not portable across computer
systems that do not adhere to the IEEE standard. For example, on computers that support the IEEE
standard for binary arithmetic (see IEEE 1985), NaN is specified as a bit format not equal to itself.
Thus, the check is performed as
IFNAN = X .NE. X
On other computers that do not use IEEE floating-point format, the check can be performed as:
IFNAN = X .EQ. AMACH(6)
The function IFNAN is equivalent to the specification of the function Isnan listed in the Appendix,
(IEEE 1985). The example above illustrates the use of IFNAN. If X is NaN, a message is
printed instead of X. (Routine UMACH, which is described in the following section, is used to
retrieve the output unit number for printing the message.)
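The self-comparison test described above can be tried without the library. The following sketch assumes a compiler that provides the Fortran 2003 intrinsic module ieee_arithmetic and a processor with IEEE arithmetic. It compares the X .NE. X idiom with the standard ieee_is_nan function. The program name is hypothetical.

```fortran
program nan_check_sketch
  use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_quiet_nan, ieee_is_nan
  implicit none
  real :: x

  x = ieee_value(x, ieee_quiet_nan)      ! build a quiet NaN
  print *, 'NaN: x /= x     ->', x /= x
  print *, 'NaN: ieee_is_nan ->', ieee_is_nan(x)

  x = 1.0
  print *, '1.0: x /= x     ->', x /= x
  print *, '1.0: ieee_is_nan ->', ieee_is_nan(x)
end program nan_check_sketch
```

Some compilers may simplify x /= x to false under aggressive floating-point optimization, which is one reason a dedicated test function is more portable than the comparison idiom.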
UMACH
Routine UMACH sets or retrieves the input, output, or error output device unit numbers.
Required Arguments
N Integer value indicating the action desired. If the value of N is negative, the input, output, or
error output unit number is reset to NUNIT. If the value of N is positive, the input, output, or error
output unit number is returned in NUNIT. See the table in argument NUNIT for legal values of N.
(Input)
NUNIT The unit number that is either retrieved or set, depending on the value of input
argument N. (Input/Output)
Effect
FORTRAN 90 Interface
Generic:
Specific:
FORTRAN 77 Interface
Single:
Description
Routine UMACH sets or retrieves the input, output, or error output device unit numbers. UMACH is
set automatically so that the default FORTRAN unit numbers for standard input, standard output,
and standard error are used. These unit numbers can be changed by inserting a call to UMACH at the
beginning of the main program that calls MATH/LIBRARY routines. If these unit numbers are
changed from the standard values, the user should insert an appropriate OPEN statement in the
calling program.
Example
In the following example, a terminal error is issued from the MATH/LIBRARY AMACH function
since the argument is invalid. With a call to UMACH, the error message will be written to a local
file named CHECKERR.
USE AMACH_INT
USE UMACH_INT
INTEGER N, NUNIT
REAL X
! Set Parameter
N = 0
NUNIT = 9
!
CALL UMACH (-3, NUNIT)
OPEN (UNIT=NUNIT,FILE='CHECKERR')
X = AMACH(N)
END
Output
The output from this example, written to CHECKERR, is:
*** TERMINAL ERROR 5 from AMACH. The argument must be between 1 and 8
***          inclusive. N = 0
General Mode
A general matrix is an N × N matrix A. It is stored in a FORTRAN array that is declared by the
following statement:
DIMENSION A(LDA,N)
The parameter LDA is called the leading dimension of A. It must be at least as large as N. IMSL
general matrix subprograms only refer to values Aij for i = 1, ..., N and j = 1, ..., N. The data type
of a general array can be one of REAL, DOUBLE PRECISION, or COMPLEX. If your FORTRAN
compiler allows, the nonstandard data type DOUBLE COMPLEX can also be declared.
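The role of the leading dimension can be seen in a short sketch. It assumes only standard Fortran column-major storage; the sizes N = 3 and LDA = 5 and the name FLAT are illustrative choices, not values from the library.

```fortran
PROGRAM LDA_DEMO
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 3, LDA = 5     ! LDA may exceed N
  REAL    :: A(LDA, N), FLAT(LDA*N)
  INTEGER :: I, J

  A = -1.0                        ! rows N+1..LDA are padding
  DO J = 1, N
    DO I = 1, N
      A(I, J) = REAL(10*I + J)    ! A(I,J) holds the value "IJ"
    END DO
  END DO

  ! RESHAPE copies in array element order, which is column-major.
  FLAT = RESHAPE(A, (/ LDA*N /))

  ! Element (I,J) therefore sits at linear position (J-1)*LDA + I.
  DO J = 1, N
    DO I = 1, N
      IF (FLAT((J-1)*LDA + I) /= A(I, J)) ERROR STOP 'unexpected layout'
    END DO
  END DO

  PRINT *, 'Column 2 starts at linear position', LDA + 1
  PRINT '(3F6.1)', (A(I, 1:N), I = 1, N)   ! the N-by-N matrix, row by row
END PROGRAM LDA_DEMO
```

Because consecutive columns are LDA elements apart, a subprogram that receives A needs LDA as well as N to locate Aij; the padding rows are never referenced.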
Rectangular Mode
A rectangular matrix is an M × N matrix A. It is stored in a FORTRAN array that is declared by
the following statement:
DIMENSION A(LDA,N)
The parameter LDA is called the leading dimension of A. It must be at least as large as M. IMSL
rectangular matrix subprograms only refer to values Aij for i = 1, ..., M and j = 1, ..., N. The data
type of a rectangular array can be REAL, DOUBLE PRECISION, or COMPLEX. If your FORTRAN
compiler allows, you can declare the nonstandard data type DOUBLE COMPLEX.
Symmetric Mode
A symmetric matrix is a square N × N matrix A such that A^T = A. (A^T is the transpose of A.) It is
stored in a FORTRAN array that is declared by the following statement:
DIMENSION A(LDA,N)
The parameter LDA is called the leading dimension of A. It must be at least as large as N. IMSL
symmetric matrix subprograms only refer to the upper or to the lower half of A (i.e., to values Aij
for i = 1, ..., N and j = i, ..., N, or Aij for j = 1, ..., N and i = j, ..., N). The data type of a
symmetric array can be one of REAL or DOUBLE PRECISION. Use of the upper half of the array is
denoted in the BLAS that compute with symmetric matrices (see Chapter 9, Basic Matrix/Vector
Operations) using the CHARACTER*1 flag UPLO = 'U'. Otherwise, UPLO = 'L' denotes that the
lower half of the array is used.
Hermitian Mode
A Hermitian matrix is a square N × N matrix A such that $\bar{A}^T = A$, where the matrix
$\bar{A}$ is the complex conjugate of A. It is stored in a FORTRAN array that is declared by the
following statement:
DIMENSION A(LDA,N)
The parameter LDA is called the leading dimension of A. It must be at least as large as N. IMSL
Hermitian matrix subprograms only refer to the upper or to the lower half of A (i.e., to values Aij
for i = 1, ..., N and j = i, ..., N, or Aij for j = 1, ..., N and i = j, ..., N). Use of the upper half of the
array is denoted in the BLAS that compute with Hermitian matrices (see Chapter 9, Basic
Matrix/Vector Operations) using the CHARACTER*1 flag UPLO = 'U'. Otherwise, UPLO = 'L'
denotes that the lower half of the array is used. The data type of a Hermitian array can be
COMPLEX or, if your FORTRAN compiler allows, the nonstandard data type DOUBLE COMPLEX.
Triangular Mode
A triangular matrix is a square N × N matrix A such that values Aij = 0 for i < j or Aij = 0 for i > j.
The first condition defines a lower triangular matrix while the second condition defines an upper
triangular matrix. A lower triangular matrix A is stored in the lower triangular part of a
FORTRAN array A. An upper triangular matrix is stored in the upper triangular part of a
FORTRAN array. Triangular matrices are called unit triangular whenever Ajj = 1, j = 1, ..., N. For
unit triangular matrices, only the strictly lower or upper parts of the array are referenced. This is
denoted in the BLAS that compute with triangular matrices (see Chapter 9, Basic Matrix/Vector
Operations) using the CHARACTER*1 flag DIAGNL = 'U'. Otherwise, DIAGNL = 'N' denotes
that the diagonal array terms should be used. For unit triangular matrices, the diagonal terms are
each used with the mathematical value 1. The array diagonal term does not need to be 1.0 in this
usage. Use of the upper half of the array is denoted in the BLAS that compute with triangular
matrices (see Chapter 9, Basic Matrix/Vector Operations) using the CHARACTER*1 flag
UPLO = 'U'. Otherwise, UPLO = 'L' denotes that the lower half of the array is used. The data
type of an array that contains a triangular matrix can be one of REAL, DOUBLE PRECISION, or
COMPLEX. If your FORTRAN compiler allows, the nonstandard data type DOUBLE COMPLEX can
also be declared.
The parameter LDA is called the leading dimension of A. It must be at least as large as m. The data
type of a band matrix array can be one of REAL, DOUBLE PRECISION, COMPLEX or, if your
FORTRAN compiler allows, the nonstandard data type DOUBLE COMPLEX. Use of the
CHARACTER*1 flag TRANS = 'N' in the BLAS (see Chapter 9, Basic Matrix/Vector Operations)
specifies that the matrix A is used. The flag value TRANS = 'T' uses $A^T$, while
TRANS = 'C' uses $\bar{A}^T$.
For example, consider a real 5 × 5 band matrix with 1 lower and 2 upper codiagonals, stored in the
FORTRAN array declared by the following statements:
PARAMETER (N=5, NLCA=1, NUCA=2)
REAL A(NLCA+NUCA+1, N)
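This matrix has the form
\[
A = \begin{bmatrix}
A_{11} & A_{12} & A_{13} & 0 & 0 \\
A_{21} & A_{22} & A_{23} & A_{24} & 0 \\
0 & A_{32} & A_{33} & A_{34} & A_{35} \\
0 & 0 & A_{43} & A_{44} & A_{45} \\
0 & 0 & 0 & A_{54} & A_{55}
\end{bmatrix}
\]
As a FORTRAN array, it is
\[
A = \begin{bmatrix}
\times & \times & A_{13} & A_{24} & A_{35} \\
\times & A_{12} & A_{23} & A_{34} & A_{45} \\
A_{11} & A_{22} & A_{33} & A_{44} & A_{55} \\
A_{21} & A_{32} & A_{43} & A_{54} & \times
\end{bmatrix}
\]
The entries marked with an × in the above array are not referenced by the IMSL band
subprograms.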
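The band layout of this example can be reproduced with a short packing loop. This is a sketch only: the index mapping BAND(NUCA+1+I-J, J) = Aij is inferred from the example array, and the names FULL and BAND and the placeholder values 10*I + J are illustrative.

```fortran
PROGRAM BAND_PACK
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 5, NLCA = 1, NUCA = 2
  REAL    :: FULL(N, N), BAND(NLCA+NUCA+1, N)
  INTEGER :: I, J

  ! Build a full matrix whose entry (I,J) inside the band is "IJ".
  FULL = 0.0
  DO J = 1, N
    DO I = MAX(1, J-NUCA), MIN(N, J+NLCA)
      FULL(I, J) = REAL(10*I + J)
    END DO
  END DO

  ! Pack: the main diagonal lands in row NUCA+1, upper codiagonals
  ! above it and lower codiagonals below it.
  BAND = 0.0                      ! corner entries are never referenced
  DO J = 1, N
    DO I = MAX(1, J-NUCA), MIN(N, J+NLCA)
      BAND(NUCA + 1 + I - J, J) = FULL(I, J)
    END DO
  END DO

  ! Spot checks against the array shown in the text.
  IF (BAND(1, 3) /= FULL(1, 3)) ERROR STOP 'A13 misplaced'
  IF (BAND(3, 1) /= FULL(1, 1)) ERROR STOP 'A11 misplaced'
  IF (BAND(4, 1) /= FULL(2, 1)) ERROR STOP 'A21 misplaced'

  DO I = 1, NLCA + NUCA + 1
    PRINT '(5F6.1)', BAND(I, :)
  END DO
END PROGRAM BAND_PACK
```

The zeros printed in the corners correspond to the unreferenced positions of the band array.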
The parameter LDA is called the leading dimension of A. It must be at least as large as NCODA + 1.
The data type of a band symmetric array can be REAL or DOUBLE PRECISION.
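For example, consider a real 5 × 5 band matrix with 2 codiagonals. Its FORTRAN declaration is
PARAMETER (N=5, NCODA=2)
REAL A(NCODA+1, N)
The matrix A has the form
\[
A = \begin{bmatrix}
A_{11} & A_{12} & A_{13} & 0 & 0 \\
A_{12} & A_{22} & A_{23} & A_{24} & 0 \\
A_{13} & A_{23} & A_{33} & A_{34} & A_{35} \\
0 & A_{24} & A_{34} & A_{44} & A_{45} \\
0 & 0 & A_{35} & A_{45} & A_{55}
\end{bmatrix}
\]
As a FORTRAN array, it is
\[
A = \begin{bmatrix}
\times & \times & A_{13} & A_{24} & A_{35} \\
\times & A_{12} & A_{23} & A_{34} & A_{45} \\
A_{11} & A_{22} & A_{33} & A_{44} & A_{55}
\end{bmatrix}
\]
The entries marked with an × in the above array are not referenced by the IMSL band symmetric
subprograms.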
An alternate storage mode for band symmetric matrices is designated using the CHARACTER*1 flag
UPLO = 'L' in Level 2 BLAS that compute with band symmetric matrices (see Chapter 9, Basic
Matrix/Vector Operations). In that case, the example matrix is represented as
\[
A = \begin{bmatrix}
A_{11} & A_{22} & A_{33} & A_{44} & A_{55} \\
A_{12} & A_{23} & A_{34} & A_{45} & \times \\
A_{13} & A_{24} & A_{35} & \times & \times
\end{bmatrix}
\]
The parameter LDA is called the leading dimension of A. It must be at least as large as
(NCODA + 1). The data type of a band Hermitian array can be COMPLEX or, if your FORTRAN
compiler allows, the nonstandard data type DOUBLE COMPLEX.
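For example, consider a complex 5 × 5 band matrix with 2 codiagonals. Its FORTRAN declaration
is
PARAMETER (N=5, NCODA = 2)
COMPLEX A(NCODA + 1, N)
The matrix A has the form
\[
A = \begin{bmatrix}
A_{11} & A_{12} & A_{13} & 0 & 0 \\
\bar{A}_{12} & A_{22} & A_{23} & A_{24} & 0 \\
\bar{A}_{13} & \bar{A}_{23} & A_{33} & A_{34} & A_{35} \\
0 & \bar{A}_{24} & \bar{A}_{34} & A_{44} & A_{45} \\
0 & 0 & \bar{A}_{35} & \bar{A}_{45} & A_{55}
\end{bmatrix}
\]
As a FORTRAN array, it is
\[
A = \begin{bmatrix}
\times & \times & A_{13} & A_{24} & A_{35} \\
\times & A_{12} & A_{23} & A_{34} & A_{45} \\
A_{11} & A_{22} & A_{33} & A_{44} & A_{55}
\end{bmatrix}
\]
The entries marked with an × in the above array are not referenced by the IMSL band Hermitian
subprograms.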
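An alternate storage mode for band Hermitian matrices is designated using the CHARACTER*1 flag
UPLO = 'L' in Level 2 BLAS that compute with band Hermitian matrices (see Chapter 9, Basic
Matrix/Vector Operations). In that case, the example matrix is represented as
\[
A = \begin{bmatrix}
A_{11} & A_{22} & A_{33} & A_{44} & A_{55} \\
\bar{A}_{12} & \bar{A}_{23} & \bar{A}_{34} & \bar{A}_{45} & \times \\
\bar{A}_{13} & \bar{A}_{24} & \bar{A}_{35} & \times & \times
\end{bmatrix}
\]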
The parameter LDA is called the leading dimension of A. It must be at least as large as
(NCODA + 1).
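For example, consider a 5 × 5 band upper triangular matrix with 2 codiagonals. Its FORTRAN
declaration is
PARAMETER (N = 5, NCODA = 2)
COMPLEX A(NCODA + 1, N)
The matrix A has the form
\[
A = \begin{bmatrix}
A_{11} & A_{12} & A_{13} & 0 & 0 \\
0 & A_{22} & A_{23} & A_{24} & 0 \\
0 & 0 & A_{33} & A_{34} & A_{35} \\
0 & 0 & 0 & A_{44} & A_{45} \\
0 & 0 & 0 & 0 & A_{55}
\end{bmatrix}
\]
As a FORTRAN array, it is
\[
A = \begin{bmatrix}
\times & \times & A_{13} & A_{24} & A_{35} \\
\times & A_{12} & A_{23} & A_{34} & A_{45} \\
A_{11} & A_{22} & A_{33} & A_{44} & A_{55}
\end{bmatrix}
\]
This corresponds to the CHARACTER*1 flags DIAGNL = 'N' and UPLO = 'U'. The matrix $A^T$ is
represented as the FORTRAN array
\[
A = \begin{bmatrix}
A_{11} & A_{22} & A_{33} & A_{44} & A_{55} \\
A_{12} & A_{23} & A_{34} & A_{45} & \times \\
A_{13} & A_{24} & A_{35} & \times & \times
\end{bmatrix}
\]
This corresponds to the CHARACTER*1 flags DIAGNL = 'N' and UPLO = 'L'. In both examples,
the entries indicated with an × are not referenced by IMSL subprograms.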
The parameter LDA is the leading positive dimension of A. It must be at least as large as
N + NCODA.
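Consider a real symmetric 5 × 5 matrix with 2 codiagonals
\[
A = \begin{bmatrix}
A_{11} & A_{12} & A_{13} & 0 & 0 \\
A_{12} & A_{22} & A_{23} & A_{24} & 0 \\
A_{13} & A_{23} & A_{33} & A_{34} & A_{35} \\
0 & A_{24} & A_{34} & A_{44} & A_{45} \\
0 & 0 & A_{35} & A_{45} & A_{55}
\end{bmatrix}
\]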
A FORTRAN declaration for the array to hold this matrix and right-hand-side vector is
PARAMETER (N = 5, NCODA = 2, LDA = N + NCODA)
REAL A(LDA, NCODA + 2)
The matrix and right-hand-side entries are placed in the FORTRAN array A as follows:
\[
A = \begin{bmatrix}
\times & \times & \times & \times \\
\times & \times & \times & \times \\
A_{11} & \times & \times & b_1 \\
A_{22} & A_{12} & \times & b_2 \\
A_{33} & A_{23} & A_{13} & b_3 \\
A_{44} & A_{34} & A_{24} & b_4 \\
A_{55} & A_{45} & A_{35} & b_5
\end{bmatrix}
\]
Entries marked with an × do not need to be defined. Certain of the IMSL band symmetric
subprograms will initialize and use these values during the solution process. When a solution is
computed, the bi, i = 1, ..., 5, are replaced by xi, i = 1, ..., 5.
The nonzero Aij, j ≥ i, are stored in array locations A(j + NCODA, (j − i) + 1). The right-hand-side
entries bj are stored in locations A(j + NCODA, NCODA + 2). The solution entries xj are returned in
A(j + NCODA, NCODA + 2).
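The storage rule just stated can be exercised directly. The sketch below applies the two index formulas from the text to a 5 × 5 example; the array shape A(LDA, NCODA + 2) is an assumption inferred from the largest column index used, and the names FULL and B and the placeholder values are illustrative.

```fortran
PROGRAM CODIAG_PACK
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 5, NCODA = 2, LDA = N + NCODA
  REAL    :: FULL(N, N), B(N), A(LDA, NCODA+2)   ! shape of A is assumed
  INTEGER :: I, J

  ! Symmetric band matrix: entry (I,J), J >= I, holds the value "IJ".
  FULL = 0.0
  DO J = 1, N
    DO I = MAX(1, J-NCODA), J
      FULL(I, J) = REAL(10*I + J)
      FULL(J, I) = FULL(I, J)
    END DO
    B(J) = REAL(J)                ! right-hand side b_j = j
  END DO

  ! Pack with the rule from the text:
  !   Aij (j >= i) -> A(j + NCODA, (j - i) + 1)
  !   bj           -> A(j + NCODA, NCODA + 2)
  A = 0.0                         ! entries that need not be defined
  DO J = 1, N
    DO I = MAX(1, J-NCODA), J
      A(J + NCODA, (J - I) + 1) = FULL(I, J)
    END DO
    A(J + NCODA, NCODA + 2) = B(J)
  END DO

  IF (A(3, 1) /= FULL(1, 1)) ERROR STOP 'A11 misplaced'
  IF (A(5, 3) /= FULL(1, 3)) ERROR STOP 'A13 misplaced'
  IF (A(7, 4) /= B(5))       ERROR STOP 'b5 misplaced'

  DO I = 1, LDA
    PRINT '(4F6.1)', A(I, :)
  END DO
END PROGRAM CODIAG_PACK
```

The first NCODA rows stay unset here, which matches the statement that those entries do not need to be defined by the caller.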
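where U and V are real matrices. They satisfy the conditions $U = U^T$ and $V = -V^T$. The
right-hand side is
$b = c + \sqrt{-1}\,d$
where c and d are real vectors. The solution vector is $x = u + \sqrt{-1}\,v$, where u and v are
real. The storage is declared with the following statement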
DIMENSION A(LDA, 2*NCODA + 3)
The parameter LDA is the leading positive dimension of A. It must be at least as large as
N + NCODA.
The diagonal terms Ujj are stored in array locations A(j + NCODA, 1). The diagonal terms Vjj are zero and
are not stored. The nonzero Uij, j > i, are stored in locations A(j + NCODA, 2*(j − i)).
The nonzero Vij are stored in locations A(j + NCODA, 2*(j − i) + 1). The right-side vector b is stored
with cj and dj in locations A(j + NCODA, 2*NCODA + 2) and A(j + NCODA, 2*NCODA + 3),
respectively. The real and imaginary parts of the solution, uj and vj, respectively overwrite cj and
dj.
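Consider a complex Hermitian 5 × 5 matrix with 2 codiagonals
\[
A = \begin{bmatrix}
U_{11} & U_{12} & U_{13} & 0 & 0 \\
U_{12} & U_{22} & U_{23} & U_{24} & 0 \\
U_{13} & U_{23} & U_{33} & U_{34} & U_{35} \\
0 & U_{24} & U_{34} & U_{44} & U_{45} \\
0 & 0 & U_{35} & U_{45} & U_{55}
\end{bmatrix}
+ \sqrt{-1}
\begin{bmatrix}
0 & V_{12} & V_{13} & 0 & 0 \\
-V_{12} & 0 & V_{23} & V_{24} & 0 \\
-V_{13} & -V_{23} & 0 & V_{34} & V_{35} \\
0 & -V_{24} & -V_{34} & 0 & V_{45} \\
0 & 0 & -V_{35} & -V_{45} & 0
\end{bmatrix}
\]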
A FORTRAN declaration for the array to hold this matrix and right-hand-side vector is
PARAMETER (N = 5, NCODA = 2, LDA = N + NCODA)
REAL A(LDA,2*NCODA + 3)
The matrix and right-hand-side entries are placed in the FORTRAN array A as follows:
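\[
A = \begin{bmatrix}
\times & \times & \times & \times & \times & \times & \times \\
\times & \times & \times & \times & \times & \times & \times \\
U_{11} & \times & \times & \times & \times & c_1 & d_1 \\
U_{22} & U_{12} & V_{12} & \times & \times & c_2 & d_2 \\
U_{33} & U_{23} & V_{23} & U_{13} & V_{13} & c_3 & d_3 \\
U_{44} & U_{34} & V_{34} & U_{24} & V_{24} & c_4 & d_4 \\
U_{55} & U_{45} & V_{45} & U_{35} & V_{35} & c_5 & d_5
\end{bmatrix}
\]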
$A_{\mathrm{IROW}(i),\,\mathrm{JCOL}(i)} = \mathrm{A}(i), \quad i = 1, \ldots, \mathrm{NZ}$
The data type for A(*) can be one of REAL, DOUBLE PRECISION, or COMPLEX. If your FORTRAN
compiler allows, the nonstandard data type DOUBLE COMPLEX can also be declared.
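For example, consider a real 5 × 5 sparse matrix with 11 nonzero entries. The matrix A has the
form
\[
A = \begin{bmatrix}
A_{11} & 0 & A_{13} & A_{14} & 0 \\
A_{21} & A_{22} & 0 & 0 & 0 \\
0 & A_{32} & A_{33} & A_{34} & 0 \\
0 & 0 & A_{43} & 0 & 0 \\
0 & 0 & 0 & A_{54} & A_{55}
\end{bmatrix}
\]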
Declarations of arrays and definitions of the values for this sparse matrix are
PARAMETER (NZ = 11, N = 5)
DIMENSION IROW(NZ), JCOL(NZ), A(NZ)
DATA IROW /1,1,1,2,2,3,3,3,4,5,5/
DATA JCOL /1,3,4,1,2,2,3,4,3,4,5/
DATA A /A11,A13,A14,A21,A22,A32,A33,A34, &
        A43,A54,A55/
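To see how the three arrays describe the matrix, the coordinate entries can be scattered into a dense array. This is a minimal sketch using the IROW and JCOL values from the example; the numeric values 10*i + j stand in for the symbolic entries Aij, and the name DENSE is illustrative.

```fortran
PROGRAM COORD_TO_DENSE
  IMPLICIT NONE
  INTEGER, PARAMETER :: NZ = 11, N = 5
  INTEGER :: IROW(NZ), JCOL(NZ), I, K
  REAL    :: A(NZ), DENSE(N, N)
  DATA IROW /1,1,1,2,2,3,3,3,4,5,5/
  DATA JCOL /1,3,4,1,2,2,3,4,3,4,5/

  ! Placeholder values: the entry in row i, column j is 10*i + j.
  DO K = 1, NZ
    A(K) = REAL(10*IROW(K) + JCOL(K))
  END DO

  ! Scatter each stored value to its (row, column) position.
  DENSE = 0.0
  DO K = 1, NZ
    DENSE(IROW(K), JCOL(K)) = A(K)
  END DO

  ! Every stored entry should occupy a distinct position.
  IF (COUNT(DENSE /= 0.0) /= NZ) ERROR STOP 'duplicate or lost entry'
  IF (DENSE(5, 4) /= 54.0)       ERROR STOP 'A54 misplaced'

  DO I = 1, N
    PRINT '(5F6.1)', DENSE(I, :)
  END DO
END PROGRAM COORD_TO_DENSE
```

The printed pattern of nonzeros matches the 5 × 5 matrix shown above, with zeros wherever no (IROW, JCOL) pair is listed.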
Reserved Names
When writing programs accessing the MATH/LIBRARY, the user should choose FORTRAN
names that do not conflict with names of IMSL subroutines, functions, or named common blocks,
such as the workspace common block WORKSP (see Automatic Workspace Allocation). The user
needs to be aware of two types of name conflicts that can arise. The first type of name conflict
occurs when a name (technically a symbolic name) is not uniquely defined within a program unit
(either a main program or a subprogram). For example, such a name conflict exists when the name
RCURV is used to refer both to a type REAL variable and to the IMSL subroutine RCURV in a single
program unit. Such errors are detected during compilation and are easy to correct. The second type
of name conflict, which can be more serious, occurs when names of program units and named
common blocks are not unique. For example, such a name conflict would be caused by the user
defining a subroutine named WORKSP and also referencing a MATH/LIBRARY subroutine that
uses the named common block WORKSP. Likewise, the user must not define a subprogram with the
same name as a subprogram in the MATH/LIBRARY that is referenced directly by the user's
program or is referenced indirectly by other MATH/LIBRARY subprograms.
The MATH/LIBRARY consists of many routines, some that are described in the User's Manual
and others that are not intended to be called by the user and, hence, that are not documented. If the
choice of names were completely random over the set of valid FORTRAN names, and if a
program uses only a small subset of the MATH/LIBRARY, the probability of name conflicts is
very small. Since names are usually chosen to be mnemonic, however, the user may wish to take
some precautions in choosing FORTRAN names.
Many IMSL names consist of a root name that may have a prefix to indicate the type of the
routine. For example, the IMSL single precision subroutine for fitting a polynomial by least
squares has the name RCURV, which is the root name, and the corresponding IMSL double
precision routine has the name DRCURV. Associated with these two routines are R2URV and
DR2URV. RCURV is listed in the Alphabetical Index of Routines, but DRCURV, R2URV, and DR2URV
are not. The user of RCURV must consider both names RCURV and R2URV to be reserved; likewise,
the user of DRCURV must consider both names DRCURV and DR2URV to be reserved. The root
names of all routines and named common blocks that are used by the MATH/LIBRARY and that
do not have a numeral in the second position of the root name are listed in the Alphabetical Index
of Routines. Some of the routines in this Index (such as the Level 2 BLAS) are not intended to
be called by the user and so are not documented.
The careful user can avoid any conflicts with IMSL names if the following rules are observed:
Do not choose a name that appears in the Alphabetical Summary of Routines in the User's
Manual, nor one of these names preceded by a D, S_, D_, C_, or Z_.
Do not choose a name of three or more characters with a numeral in the second or third
position.
These simplified rules include many combinations that are, in fact, allowable. However, if the user
selects names that conform to these rules, no conflict will be encountered.
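The second rule above is mechanical enough to check in code. The following sketch flags any name of three or more characters that has a numeral in the second or third position; the function names RISKY and IS_DIGIT are hypothetical helpers, not library routines, and the first rule (lookup in the Alphabetical Summary) is not covered.

```fortran
PROGRAM NAME_RULE
  IMPLICIT NONE

  ! Names taken from the text: R2URV and DR2URV break the rule.
  IF (.NOT. RISKY('R2URV'))  ERROR STOP 'R2URV should be flagged'
  IF (.NOT. RISKY('DR2URV')) ERROR STOP 'DR2URV should be flagged'
  ! RCURV has no numeral, so this rule alone does not flag it.
  IF (RISKY('RCURV'))        ERROR STOP 'RCURV wrongly flagged'
  ! Names shorter than three characters are outside the rule.
  IF (RISKY('X1'))           ERROR STOP 'X1 wrongly flagged'

  PRINT *, 'R2URV  flagged: ', RISKY('R2URV')
  PRINT *, 'DR2URV flagged: ', RISKY('DR2URV')
  PRINT *, 'MYSUB  flagged: ', RISKY('MYSUB')

CONTAINS

  ! True when NAME has at least three characters and a numeral
  ! in the second or third position.
  LOGICAL FUNCTION RISKY(NAME)
    CHARACTER(LEN=*), INTENT(IN) :: NAME
    RISKY = .FALSE.
    IF (LEN_TRIM(NAME) < 3) RETURN
    RISKY = IS_DIGIT(NAME(2:2)) .OR. IS_DIGIT(NAME(3:3))
  END FUNCTION RISKY

  LOGICAL FUNCTION IS_DIGIT(C)
    CHARACTER(LEN=1), INTENT(IN) :: C
    IS_DIGIT = INDEX('0123456789', C) > 0
  END FUNCTION IS_DIGIT

END PROGRAM NAME_RULE
```

As the text notes, the rule is deliberately conservative, so a flagged name is not necessarily in conflict with an IMSL name.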
By default, the total amount of space allocated in the common area for storage of numeric data is
5000 numeric storage units. (A numeric storage unit is the amount of space required to store an
integer or a real number. By comparison, a double precision unit is twice this amount. Therefore
the total amount of space allocated in the common area for storage of numeric data is 2500 double
precision units.) This space is allocated as needed for INTEGER, REAL, or other numeric data. For
larger problems in which the default amount of workspace is insufficient, the user can change the
allocation by supplying the FORTRAN statements to define the array in the named common block
and by informing the IMSL workspace allocation system of the new size of the common array. To
request 7000 units, the statements are
COMMON /WORKSP/ RWKSP
REAL RWKSP(7000)
CALL IWKIN(7000)
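The unit arithmetic in the preceding paragraph (5000 numeric storage units holding 2500 double precision values) can be checked with a small sketch. It assumes a compiler that supports the Fortran 2008 STORAGE_SIZE intrinsic and uses the reported bit sizes as a stand-in for numeric storage units; the program name and variables are illustrative.

```fortran
PROGRAM STORAGE_UNITS
  IMPLICIT NONE
  INTEGER, PARAMETER :: NUNITS = 5000   ! default workspace size from the text
  INTEGER :: PER_DP

  ! Numeric storage units occupied by one DOUBLE PRECISION value,
  ! estimated from the bit sizes reported by the compiler.
  PER_DP = STORAGE_SIZE(1.0D0) / STORAGE_SIZE(1.0)

  PRINT '(A,I0)', 'Units per DOUBLE PRECISION value: ', PER_DP
  PRINT '(A,I0)', 'DOUBLE PRECISION values in default workspace: ', &
                  NUNITS / PER_DP
END PROGRAM STORAGE_UNITS
```

When the ratio is 2, as the text states, the second line reports 2500.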
If an IMSL routine attempts to allocate workspace in excess of the amount available in the
common stack, the routine issues a fatal error message that indicates how much space is needed
and prints statements like those above to guide the user in allocating the necessary amount. The
program below uses IMSL routine PERMA to permute rows or columns of a matrix. This routine
requires workspace equal to the number of columns, which in this example is too large. (Note that
the work vector RWKSP must also provide extra space for bookkeeping.)
USE PERMA_INT
!
INTEGER
REAL
!
!
NRA = 2
NCA = 6000
LDA = 2
!
Output
*** TERMINAL ERROR 10 from PERMA. Insufficient workspace for current
***          allocation(s). Correct by calling IWKIN from main program with
***          the three following statements: (REGARDLESS OF PRECISION)
***                COMMON /WORKSP/ RWKSP
***                REAL RWKSP(6018)
***                CALL IWKIN(6018)
*** TERMINAL ERROR 10 from PERMA.
***          6000.
In most cases, the amount of workspace is dependent on the parameters of the problem so the
amount needed is known exactly. In a few cases, however, the amount of workspace is dependent
on the data (for example, if it is necessary to count all of the unique values in a vector), so the
IMSL routine cannot tell in advance exactly how much workspace is needed. In such cases the
amount reported in the error message is an estimate of the space required.
Character Workspace
Since character arrays cannot be equivalenced with numeric arrays, a separate named common
block WKSPCH is provided for character workspace. In most respects this stack is managed in the
same way as the numeric stack. The default size of the character workspace is 2000 character
units. (A character unit is the amount of space required to store one character.) The routine
analogous to IWKIN used to change the default allocation is IWKCIN.
The routines in the following list are being deprecated in Version 2.0 of MATH/LIBRARY. A
deprecated routine is one that is no longer used by anything in the library but is being included in
the product for those users who may be currently referencing it in their application. However, any
future versions of MATH/LIBRARY will not include these routines. If any of these routines are
being called within an application, it is recommended that you change your code or retain the
deprecated routine before replacing this library with the next version. Most of these routines were
called by users only when they needed to set up their own workspace. Thus, the impact of these
changes should be limited.
CZADD     DE2LRH    DNCONF    E3CRG
CZINI     DE2LSB    DNCONG    E4CRG
CZMUL     DE3CRG    E2ASF     E4ESF
CZSTO     DE3CRH    E2AHF     E5CRG
DE2AHF    DE3LSF    E2BHF     E7CRG
DE2ASF    DE4CRG    E2BSB     G2CCG
DE2BHF    DE4ESF    E2BSF     G2CRG
DE2BSB    DE5CRG    E2CCG     G2LCG
DE2BSF    DE7CRG    E2CCH     G2LRG
DE2CCG    DG2CCG    E2CHF     G3CCG
DE2CCH    DG2CRG    E2CRG     G4CCG
DE2CHF    DG2DF     E2CRH     G5CCG
DE2CRG    DG2IND    E2CSB     G7CRG
DE2CRH    DG2LCG    E2EHF     N0ONF
DE2CSB    DG2LRG    E2ESB     NCONF
DE2EHF    DG3CCG    E2FHF     NCONG
DE2ESB    DG3DF     E2FSB     SDADD
DE2FHF    DG4CCG    E2FSF     SDINI
DE2FSB    DG5CCG    E2LCG     SDMUL
DE2FSF    DG7CRG    E2LCH     SDSTO
DE2LCG    DHOUAP    E2LHF     SHOUAP
DE2LCH    DHOUTR    E2LRG     SHOUTR
DE2LHF    DIVPBS    E2LRH
DE2LRG    DN0ONF    E2LSB
The following routines have been renamed due to naming conflicts with other software
manufacturers.
CTIME replaced with CPSEC
DTIME replaced with TIMDY
PAGE replaced with PGOPT
Description
This index lists routines in MATH/LIBRARY by a tree-structured classification scheme known as
GAMS Version 2.0 (Boisvert, Howe, Kahaner, and Springmann 1990). Only the GAMS classes
that contain MATH/LIBRARY routines are included in the index. The page number for the
documentation and the purpose of the routine appear alongside the routine name.
The first level of the full classification scheme contains the following major subject areas:
A. Arithmetic, Error Analysis
B. Number Theory
C. Elementary and Special Functions
D. Linear Algebra
E. Interpolation
F. Solution of Nonlinear Equations
G. Optimization
H. Differentiation and Integration
I. Differential and Integral Equations
J. Integral Transforms
K. Approximation
L. Statistics, Probability
M. Simulation, Stochastic Modeling
N. Data Handling
O. Symbolic Computation
P. Computational Geometry
Q. Graphics
R. Service Routines
S. Software Development Tools
Z. Other
There are seven levels in the classification scheme. Classes in the first level are identified by a
capital letter as is given above. Classes in the remaining levels are identified by alternating
letter-and-number combinations. A single letter (a-z) is used with the odd-numbered levels. A number
(1-26) is used within the even-numbered levels.
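The alternating scheme can be illustrated by splitting a class code into its levels. The sketch below is only an illustration of the description above, not part of the library; the routine name SPLIT_CLASS is hypothetical, and it assumes that a code consists of single letters alternating with one- or two-digit numbers.

```fortran
PROGRAM GAMS_LEVELS
  IMPLICIT NONE
  CHARACTER(LEN=2) :: PARTS(7)      ! at most seven levels
  INTEGER :: NLEV, K

  CALL SPLIT_CLASS('H2a1a1', PARTS, NLEV)

  IF (NLEV /= 6) ERROR STOP 'expected six levels'
  IF (PARTS(1) /= 'H' .OR. PARTS(2) /= '2' .OR. PARTS(3) /= 'a') &
    ERROR STOP 'unexpected split'

  DO K = 1, NLEV
    PRINT '(A,I0,2A)', 'level ', K, ': ', TRIM(PARTS(K))
  END DO

CONTAINS

  ! Split CODE into levels: each letter is one level, and each run
  ! of digits (a number from 1 to 26) is one level.
  SUBROUTINE SPLIT_CLASS(CODE, PARTS, NLEV)
    CHARACTER(LEN=*), INTENT(IN)  :: CODE
    CHARACTER(LEN=*), INTENT(OUT) :: PARTS(:)
    INTEGER, INTENT(OUT)          :: NLEV
    INTEGER :: POS, START, L

    L = LEN_TRIM(CODE)
    PARTS = ' '
    NLEV = 0
    POS = 1
    DO WHILE (POS <= L .AND. NLEV < SIZE(PARTS))
      START = POS
      IF (INDEX('0123456789', CODE(POS:POS)) > 0) THEN
        ! Extend over any following digits of the same number.
        DO WHILE (POS < L)
          IF (INDEX('0123456789', CODE(POS+1:POS+1)) == 0) EXIT
          POS = POS + 1
        END DO
      END IF
      NLEV = NLEV + 1
      PARTS(NLEV) = CODE(START:POS)
      POS = POS + 1
    END DO
  END SUBROUTINE SPLIT_CLASS

END PROGRAM GAMS_LEVELS
```

For the class H2a1a1 used in the index, the levels alternate between letters (levels 1, 3, 5) and numbers (levels 2, 4, 6), as the description says.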
IMSL MATH/LIBRARY
A...........ARITHMETIC, ERROR ANALYSIS
A3.........Real
A3c.......Extended precision
DQADD Adds a double-precision scalar to the accumulator in
extended precision.
DQINI Initializes an extended-precision accumulator with a
double-precision scalar.
DQMUL Multiplies double-precision scalars in extended precision.
DQSTO Stores a double-precision approximation to an extended-precision scalar.
A4.........Complex
A4c.......Extended precision
ZQADD Adds a double complex scalar to the accumulator in
extended precision.
ZQINI Initializes an extended-precision complex accumulator to a
double complex scalar.
ZQMUL Multiplies double complex scalars using extended
precision.
ZQSTO Stores a double complex approximation to an extended-precision complex scalar.
A6.........Change of representation
A6c.......Decomposition, construction
PRIME Decomposes an integer into its prime factors.
B...........NUMBER THEORY
PRIME Decomposes an integer into its prime factors.
C...........ELEMENTARY AND SPECIAL FUNCTIONS
C2.........Powers, roots, reciprocals
HYPOT
y ay.
SSUM
SXYZ
CHERK
CSYR2K
CSYRK
CTBSV
x A1 x, x ( A1 ) x, or x ( AT ) x ,
1
CTRSM
$C \leftarrow \alpha A\bar{B}^T + \bar{\alpha} B\bar{A}^T + \beta C$ or
$C \leftarrow \alpha \bar{A}^T B + \bar{\alpha} \bar{B}^T A + \beta C$,
where C is an n by n Hermitian matrix and A and B are n
by k matrices in the first case and k by n matrices in the
second case.
Computes one of the Hermitian rank k operations:
$C \leftarrow \alpha A\bar{A}^T + \beta C$ or $C \leftarrow \alpha \bar{A}^T A + \beta C$,
where C is an n by n Hermitian matrix and A is an n by k
matrix in the first case and a k by n matrix in the second
case.
Computes one of the symmetric rank 2k operations:
$C \leftarrow \alpha AB^T + \alpha BA^T + \beta C$ or
$C \leftarrow \alpha A^T B + \alpha B^T A + \beta C$,
where C is an n by n symmetric matrix and A and B are n
by k matrices in the first case and k by n matrices in the
second case.
Computes one of the symmetric rank k operations:
$C \leftarrow \alpha AA^T + \beta C$ or $C \leftarrow \alpha A^T A + \beta C$,
where C is an n by n symmetric matrix and A is an n by k
matrix in the first case and a k by n matrix in the second
case.
Solves one of the complex triangular systems:
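$B \leftarrow \alpha A^{-1}B$, $B \leftarrow \alpha BA^{-1}$, $B \leftarrow \alpha (A^{-1})^T B$,
$B \leftarrow \alpha B(A^{-1})^T$, $B \leftarrow \alpha (\bar{A}^T)^{-1} B$, or
$B \leftarrow \alpha B(\bar{A}^T)^{-1}$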
CTRSV
HRRRR
SGER
SSYR
SSYR2
SSYR2K
SSYRK
STBSV
STRSM
$B \leftarrow \alpha A^{-1}B$, $B \leftarrow \alpha BA^{-1}$, $B \leftarrow \alpha (A^{-1})^T B$,
or $B \leftarrow \alpha B(A^{-1})^T$,
STRSV
D1b3..... Transpose
TRNRR
D1b4
Multiplication by vector
BLINF
CGBMV
CGEMV
CHBMV
CHEMV
CTBMV
CTRMV
MUCBV
MUCRV
MURBV
MURRV
SGBMV
SGEMV
SSBMV
SSYMV
STBMV
$x \leftarrow Ax$ or $x \leftarrow A^T x$,
where A is a triangular matrix in band storage mode.
STRMV Computes one of the matrix-vector operations:
$x \leftarrow Ax$ or $x \leftarrow A^T x$,
where A is a triangular matrix.
D1b5.....Addition, subtraction
A-8 Appendix A: GAMS Index
D1b6..... Multiplication
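CGEMM Computes one of the matrix-matrix operations:
$C \leftarrow \alpha AB + \beta C$, $C \leftarrow \alpha A^T B + \beta C$,
$C \leftarrow \alpha AB^T + \beta C$, $C \leftarrow \alpha A^T B^T + \beta C$,
$C \leftarrow \alpha A\bar{B}^T + \beta C$, $C \leftarrow \alpha \bar{A}^T B + \beta C$,
$C \leftarrow \alpha A^T \bar{B}^T + \beta C$, $C \leftarrow \alpha \bar{A}^T B^T + \beta C$,
or $C \leftarrow \alpha \bar{A}^T \bar{B}^T + \beta C$
CHEMM Computes one of the matrix-matrix operations:
$C \leftarrow \alpha AB + \beta C$ or $C \leftarrow \alpha BA + \beta C$,
where A is an Hermitian matrix and B and C are m by n
matrices.
CSYMM Computes one of the matrix-matrix operations:
$C \leftarrow \alpha AB + \beta C$ or $C \leftarrow \alpha BA + \beta C$,
where A is a symmetric matrix and B and C are m by n
matrices.
CTRMM Computes one of the matrix-matrix operations:
$B \leftarrow \alpha AB$, $B \leftarrow \alpha A^T B$, $B \leftarrow \alpha BA$,
$B \leftarrow \alpha BA^T$, $B \leftarrow \alpha \bar{A}^T B$, or $B \leftarrow \alpha B\bar{A}^T$,
where B is an m by n matrix and A is a triangular matrix.
MCRCR Multiplies two complex rectangular matrices, AB.
MRRRR Multiplies two real rectangular matrices, AB.
MXTXF Computes the transpose product of a matrix, $A^T A$.
MXTYF Multiplies the transpose of matrix A by matrix B, $A^T B$.
MXYTF Multiplies a matrix A by the transpose of a matrix B, $AB^T$.
SGEMM Computes one of the matrix-matrix operations:
$C \leftarrow \alpha AB + \beta C$, $C \leftarrow \alpha A^T B + \beta C$,
$C \leftarrow \alpha AB^T + \beta C$, or $C \leftarrow \alpha A^T B^T + \beta C$
SSYMM Computes one of the matrix-matrix operations:
$C \leftarrow \alpha AB + \beta C$ or $C \leftarrow \alpha BA + \beta C$,
where A is a symmetric matrix and B and C are m by n
matrices.
STRMM Computes one of the matrix-matrix operations:
$B \leftarrow \alpha AB$, $B \leftarrow \alpha A^T B$, $B \leftarrow \alpha BA$, or
$B \leftarrow \alpha BA^T$,
where B is an m by n matrix and A is a triangular matrix.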
D2a2..... Banded
LFCRB
LFIRB
LFSRB
LFTRB
LSARB
LSLRB
STBSV
D2a3..... Triangular
LFCRT
LINRT
LSLRT
STRSM
$B \leftarrow \alpha A^{-1}B$, $B \leftarrow \alpha BA^{-1}$, $B \leftarrow \alpha (A^{-1})^T B$,
or $B \leftarrow \alpha B(A^{-1})^T$,
LFCSF
LFISF
LFSSF
LFTSF
LSASF
LSLSF
LIN_SOL_SELF
D2b4.....Sparse
JCGRC
LFSXD
LNFXD
LSCXD
LSLXD
PCGRC
D2c2..... Banded
CTBSV
LFCCB
LFICB
LFSCB
LFTCB
LSACB
LSLCB
D2c3..... Triangular
CTRSM
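$B \leftarrow \alpha A^{-1}B$, $B \leftarrow \alpha BA^{-1}$, $B \leftarrow \alpha (A^{-1})^T B$,
$B \leftarrow \alpha B(A^{-1})^T$, $B \leftarrow \alpha (\bar{A}^T)^{-1} B$, or
$B \leftarrow \alpha B(\bar{A}^T)^{-1}$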
CTRSV
D2c4.....Sparse
LFSZG Solves a complex sparse system of linear equations given
the LU factorization of the coefficient matrix.
LFTZG Computes the LU factorization of a complex general
sparse matrix.
LSLZG Solves a complex sparse system of linear equations by
Gaussian elimination.
D2d1b...Positive definite
LFCDH Computes the $R^H R$ factorization of a complex Hermitian
positive definite matrix and estimates its L1 condition
number.
LFIDH Uses iterative refinement to improve the solution of a
complex Hermitian positive definite system of linear
equations.
LFSDH Solves a complex Hermitian positive definite system of
linear equations given the $R^H R$ factorization of the
coefficient matrix.
LFTDH Computes the $R^H R$ factorization of a complex Hermitian
positive definite matrix.
D3......... Determinants
D3a....... Real nonsymmetric matrices
D3a1..... General
Fortran Numerical MATH LIBRARY
LFDRG
D3a2.....Banded
LFDRB
D3a3.....Triangular
LFDRT
D3b1b...Positive definite
LFDDS Computes the determinant of a real symmetric positive
definite matrix given the RH R Cholesky factorization of
the matrix.
D3c.......Complex non-Hermitian matrices
D3c1.....General
LFDCG
D3c2.....Banded
LFDCB
D3c3.....Triangular
LFDCT
EVBSF
EVCSF
EVESF
EVFSF
EVLSF
LIN_EIG_SELF
D4a6.....Banded
EVASB
EVBSB
EVCSB
EVESB
EVFSB
EVLSB
EVLRH
D7c....... QR
LUPQR
BAND_
ACCUMALATION
BAND_SOLVE
HOUSE_HOLDER
Householder transformations.
Computes the least-squares solution using Householder
transformations applied in blocked form.
LQRSL Computes the coordinate transformation, projection, and
complete the solution of the least-squares problem Ax = b.
LSBRR Solves a linear least-squares problem with iterative
refinement.
LSQRR Solves a linear least-squares problem without iterative
refinement.
LIN_SOL_LSQ Solves a rectangular system of linear equations $Ax \cong b$, in a
least-squares sense. Using optional arguments, any of
several related computations can be performed. These
extra tasks include computing and saving the factorization
of A using column and row pivoting, representing the
determinant of A, computing the generalized inverse
matrix $A^{\dagger}$, or computing the least-squares solution of
$Ax \cong b$ or $A^T y \cong d$ given the factorization of A. An optional
argument is provided for computing the following
unscaled covariance matrix: $C = (A^T A)^{-1}$.
LIN_SOL_SVD Solves a rectangular least-squares system of linear
equations $Ax \cong b$ using singular value decomposition,
$A = USV^T$. Using optional arguments, any of several
related computations can be performed. These extra tasks
include computing the rank of A, the orthogonal m × m and
n × n matrices U and V, and the m × n diagonal matrix of
singular values, S.
LQRRV
D9b....... Constrained
D9b1..... Least squares (L2) solution
LCLSQ Solves a linear least-squares problem with linear
constraints.
D9c....... Generalized inverses
LSGRR Computes the generalized inverse of a real matrix.
LIN_SOL_LSQ Solves a rectangular system of linear equations $Ax \cong b$, in a
least-squares sense. Using optional arguments, any of
several related computations can be performed. These
extra tasks include computing and saving the factorization
of A using column and row pivoting, representing the
determinant of A, computing the generalized inverse
BS3IN
QD2DR
QD2VL
QD3DR
QD3VL
SURFACE_FITTING
BSDER
CS1GD
CSDER
PP1GD
PPDER
QDDER
E3a3 .....Quadrature
BS2IG Evaluates the integral of a tensor-product spline on a
rectangular domain, given its tensor-product B-spline
representation.
BS3IG Evaluates the integral of a tensor-product spline in three
dimensions over a three-dimensional rectangle, given its
tensor-product B-spline representation.
BSITG Evaluates the integral of a spline, given its B-spline
representation.
CSITG Evaluates the integral of a cubic spline.
G2h3b1b
H2a.......One-dimensional integrals
H2a1.....Finite interval (general integrand)
H2a1a ...Integrand available via user-defined procedure
H2a1a1. Automatic (user need only specify required accuracy)
QDAG Integrates a function using a globally adaptive scheme
based on Gauss-Kronrod rules.
QDAGS Integrates a function (which may have endpoint
singularities).
QDNG Integrates a smooth function using a nonadaptive rule.
H2a2.....Finite interval (specific or special type integrand including weight
functions, oscillating and singular integrands, principal value integrals,
splines, etc.)
H2a2a ...Integrand available via user-defined procedure
H2a2a1 .Automatic (user need only specify required accuracy)
QDAGP Integrates a function with singularity points given.
QDAWC Integrates a function F(X)/(X − C) in the Cauchy principal
value sense.
QDAWO Integrates a function containing a sine or a cosine.
QDAWS Integrates a function with algebraic-logarithmic
singularities.
H2a2b ...Integrand available only on grid
H2a2b1 .Automatic (user need only specify required accuracy)
BSITG Evaluates the integral of a spline, given its B-spline
representation.
H2a3.....Semi-infinite interval (including $e^{-x}$ weight function)
H2a3a. ..Integrand available via user-defined procedure
H2a3a1. Automatic (user need only specify required accuracy)
QDAGI Integrates a function over an infinite or semi-infinite
interval.
QDAWF Computes a Fourier integral.
H2b.......Multidimensional integrals
H2b1.....One or more hyper-rectangular regions (including iterated integrals)
QMC Integrates a function over a hyperrectangle using a
quasi-Monte Carlo method.
H2b1a. ..Integrand available via user-defined procedure
H2b1a1 .Automatic (user need only specify required accuracy)
QAND Integrates a function on a hyper-rectangle.
TWODQ Computes a two-dimensional iterated integral.
H2b1b...Integrand available only on grid
H2b1b2.Nonautomatic
H2c....... Service routines (compute weight and nodes for quadrature formulas)
FQRUL Computes a Fejér quadrature rule with various classical
weight functions.
GQRCF Computes a Gauss, Gauss-Radau or Gauss-Lobatto
quadrature rule given the recurrence coefficients for the
monic polynomials orthogonal with respect to the weight
function.
GQRUL Computes a Gauss, Gauss-Radau, or Gauss-Lobatto
quadrature rule with various classical weight functions.
RECCF Computes recurrence coefficients for various monic
polynomials.
RECQR Computes recurrence coefficients for monic polynomials
given a quadrature rule.
I ............ DIFFERENTIAL AND INTEGRAL EQUATIONS
I1 .......... Ordinary differential equations (ODEs)
I1a. ....... Initial value problems
I1a1 ...... General, nonstiff or mildly stiff
I1a1a..... One-step methods (e.g., Runge-Kutta)
IVMRK Solves an initial-value problem y' = f(t, y) for ordinary
differential equations using Runge-Kutta pairs of various
orders.
IVPRK Solves an initial-value problem for ordinary differential
equations using the Runge-Kutta-Verner fifth-order and
sixth-order method.
I1a1b. ... Multistep methods (e.g., Adams predictor-corrector)
IVPAG Solves an initial-value problem for ordinary differential
equations using either Adams-Moulton's or Gear's BDF
method.
I1a2 ...... Stiff and mixed algebraic-differential equations
DASPG Solves a first order differential-algebraic system of
equations, g(t, y, y') = 0, using the Petzold-Gear BDF method.
I1b ........ Multipoint boundary value problems
I1b2 ...... Nonlinear
BVPFD
BVPMS
J1a2 ......Complex
FAST-DFT
PARALLEL_&
NONONEGATIVE_LSQ Solves multiple systems of linear equations
PARALLEL_BOUNDED_LSQ
Parallel routines for simple bounded constrained linear least squares based on a descent algorithm.
K1a2a ... Linear constraints
LCLSQ Solves a linear least-squares problem with linear
constraints.
PARALLEL_NONNEGATIVE_LSQ
PARALLEL_BOUNDED_LSQ
WRCRL
WRCRN
WRIRL
WRIRN
WROPT
WRRRL
WRRRN
SCALAPACK_READ
SCALAPACK_WRITE
SHOW
N3.........Character manipulation
ACHAR Returns a character given its ASCII value.
CVTSI Converts a character string containing an integer number
into the corresponding integer form.
IACHAR Returns the integer ASCII value of a character argument.
ICASE Returns the ASCII value of a character converted to
uppercase.
IICSR Compares two character strings using the ASCII collating
sequence but without regard to case.
IIDEX Determines the position in a string at which a given
character sequence begins without regard to case.
N4.........Storage management (e.g., stacks, heaps, trees)
IWKCIN Initializes bookkeeping locations describing the character
workspace stack.
IWKIN Initializes bookkeeping locations describing the workspace
stack.
ScaLAPACK_READ Moves data from a file to Block-Cyclic form, for use in
ScaLAPACK.
ScaLAPACK_WRITE Moves data from Block-Cyclic form, following use in
ScaLAPACK, to a file.
N5.........Searching
N5b.......Insertion position
ISRCH Searches a sorted integer vector for a given integer and
returns its index.
SRCH Searches a sorted vector for a given scalar and returns its
index.
SSRCH Searches a character vector, sorted in ascending ASCII
order, for a given string and returns its index.
N5c....... On a key
IIDEX Determines the position in a string at which a given
character sequence begins without regard to case.
ISRCH Searches a sorted integer vector for a given integer and
returns its index.
SRCH Searches a sorted vector for a given scalar and returns its
index.
SSRCH Searches a character vector, sorted in ascending ASCII
order, for a given string and returns its index.
N6......... Sorting
N6a....... Internal
N6a1..... Passive (i.e., construct pointer array, rank)
N6a1a ... Integer
SVIBP Sorts an integer array by nondecreasing absolute value and
returns the permutation that rearranges the array.
SVIGP Sorts an integer array by algebraically increasing value and
returns the permutation that rearranges the array.
N6a1b... Real
SVRBP Sorts a real array by nondecreasing absolute value and
returns the permutation that rearranges the array.
SVRGP Sorts a real array by algebraically increasing value and
returns the permutation that rearranges the array.
SORT_REAL Sorts a rank-1 array of real numbers x so the y results are
algebraically nondecreasing, $y_1 \le y_2 \le \ldots \le y_n$.
N6a2..... Active
N6a2a ... Integer
SVIBN Sorts an integer array by nondecreasing absolute value.
SVIBP Sorts an integer array by nondecreasing absolute value and
returns the permutation that rearranges the array.
SVIGN Sorts an integer array by algebraically increasing value.
SVIGP Sorts an integer array by algebraically increasing value and
returns the permutation that rearranges the array.
N6a2b... Real
SVRBN Sorts a real array by nondecreasing absolute value.
SVRBP Sorts a real array by nondecreasing absolute value and
returns the permutation that rearranges the array.
SVRGN Sorts a real array by algebraically increasing value.
SVRGP Sorts a real array by algebraically increasing value and
returns the permutation that rearranges the array.
N8......... Permuting
PERMA
PERMU
PLOTP
R...........SERVICE ROUTINES
IDYWK Computes the day of the week for a given date.
IUMAG Sets or retrieves MATH/LIBRARY integer options.
NDAYS Computes the number of days from January 1, 1900, to the
given date.
NDYIN Gives the date corresponding to the number of days since
January 1, 1900.
SUMAG Sets or retrieves MATH/LIBRARY single-precision
options.
TDATE Gets today's date.
TIMDY Gets time of day.
VERML Obtains IMSL MATH/LIBRARY-related version, system
and license numbers.
R1.........Machine-dependent constants
AMACH Retrieves single-precision machine constants.
IFNAN Checks if a value is NaN (not a number).
IMACH Retrieves integer machine constants.
ISNAN Detects an IEEE NaN (not-a-number).
NAN Returns, as a scalar function, a value corresponding to the
IEEE 754 Standard format of floating point (ANSI/IEEE
1985) for NaN.
UMACH Sets or retrieves input or output device unit numbers.
R3.........Error handling
BUILD_ERROR_STRUCTURE
Routines
Function/Page
Purpose Statement
A
ACBCB see page 1497
B
BCLSF see page 1310
gradient.
BCPOL see page 1306
Evaluates the derivative of a two-dimensional tensor-product spline, given its tensor-product B-spline
representation.
Evaluates the derivative of a two-dimensional tensor-product spline, given its tensor-product B-spline
representation on a grid.
Evaluates the derivative of a three-dimensional tensor-product spline, given its tensor-product B-spline
representation.
Evaluates the derivative of a three-dimensional tensor-product spline, given its tensor-product B-spline
representation on a grid.
C
CADD
CAXPY
CCOPY
CDOTC
CDOTU
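CGBMV
$y \leftarrow \alpha A x + \beta y$, $y \leftarrow \alpha A^T x + \beta y$, or $y \leftarrow \alpha \bar{A}^T x + \beta y$,
where A is a matrix stored in band storage mode.
CGEMM
$C \leftarrow \alpha A B + \beta C$, $C \leftarrow \alpha A^T B + \beta C$, $C \leftarrow \alpha A B^T + \beta C$,
$C \leftarrow \alpha A^T B^T + \beta C$, $C \leftarrow \alpha A \bar{B}^T + \beta C$,
$C \leftarrow \alpha \bar{A}^T B + \beta C$, $C \leftarrow \alpha A^T \bar{B}^T + \beta C$,
$C \leftarrow \alpha \bar{A}^T B^T + \beta C$, or $C \leftarrow \alpha \bar{A}^T \bar{B}^T + \beta C$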
Fortran Numerical MATH LIBRARY
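CGEMV
$y \leftarrow \alpha A x + \beta y$, $y \leftarrow \alpha A^T x + \beta y$, or $y \leftarrow \alpha \bar{A}^T x + \beta y$
CGERC
$A \leftarrow A + \alpha x \bar{y}^T$.
CGERU
$A \leftarrow A + \alpha x y^T$.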
CHBCB see page 1467
CHBMV
CHEMM
CHEMV
CHER
$A \leftarrow A + \alpha x \bar{y}^T + \bar{\alpha} y \bar{x}^T$.
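CHER2K
$C \leftarrow \alpha A \bar{B}^T + \bar{\alpha} B \bar{A}^T + \beta C$ or $C \leftarrow \alpha \bar{A}^T B + \bar{\alpha} \bar{B}^T A + \beta C$,
where C is an n by n Hermitian matrix and A and B are n by k
matrices in the first case and k by n matrices in the second case.
CHERK
$C \leftarrow \alpha A \bar{A}^T + \beta C$ or $C \leftarrow \alpha \bar{A}^T A + \beta C$,
where C is an n by n Hermitian matrix and A is an n by k matrix
in the first case and a k by n matrix in the second case.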
CHFCG see page 1463
CSCAL
CSET
CSROT
CSROTM
CSSCAL
CSUB
CSVCAL
CSWAP
CSYMM
CSYR2K
$C \leftarrow \alpha A B^T + \alpha B A^T + \beta C$ or $C \leftarrow \alpha A^T B + \alpha B^T A + \beta C$,
where C is an n by n symmetric matrix and A and B are n by k
matrices in the first case and k by n matrices in the second case.
CSYRK
$C \leftarrow \alpha A A^T + \beta C$ or $C \leftarrow \alpha A^T A + \beta C$,
where C is an n by n symmetric matrix and A is an n by k matrix
in the first case and a k by n matrix in the second case.
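CTBMV
$x \leftarrow A x$, $x \leftarrow A^T x$, or $x \leftarrow \bar{A}^T x$,
where A is a triangular matrix in band storage mode.
CTBSV
$x \leftarrow A^{-1} x$, $x \leftarrow (A^{-1})^T x$, or $x \leftarrow (\bar{A}^T)^{-1} x$
$B \leftarrow \alpha A B$, $B \leftarrow \alpha A^T B$, $B \leftarrow \alpha B A$, $B \leftarrow \alpha B A^T$,
$B \leftarrow \alpha \bar{A}^T B$, or $B \leftarrow \alpha B \bar{A}^T$
$x \leftarrow A x$, $x \leftarrow A^T x$, or $x \leftarrow \bar{A}^T x$,
where A is a triangular matrix.
CTRSM
$B \leftarrow \alpha A^{-1} B$, $B \leftarrow \alpha B A^{-1}$, $B \leftarrow \alpha (A^{-1})^T B$, $B \leftarrow \alpha B (A^{-1})^T$,
$B \leftarrow \alpha (\bar{A}^T)^{-1} B$, or $B \leftarrow \alpha B (\bar{A}^T)^{-1}$
$x \leftarrow A^{-1} x$, $x \leftarrow (A^{-1})^T x$, or $x \leftarrow (\bar{A}^T)^{-1} x$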
CVCAL
CZCDOT
CZDOTC
CZDOTI
CZDOTU
CZUDOT
D
DASPG see page 980
See AMACH.
E
EIG see page 1586
F
FAURE_FREE see page 1736
sequences.
FFT2B see page 1142
Solves Poisson's or Helmholtz's equation on a two-dimensional rectangle using a fast Poisson solver based on
the HODIE finite-difference scheme on a uniform mesh.
Solves Poisson's or Helmholtz's equation on a three-dimensional box using a fast Poisson solver based on the
HODIE finite-difference scheme on a uniform mesh.
G
GDHES see page 1397
H
HRRRR see page 1481
HYPOT see page 1757
I
IACHAR see page 1699
IADD
ICAMAX
ICAMIN
ICOPY
IIMAX
IIMIN
ISAMAX
Finds the smallest index of the component of a single-precision vector having maximum absolute value.
ISAMIN
Finds the smallest index of the component of a single-precision vector having minimum absolute value.
ISET
ISMAX
Finds the smallest index of the component of a single-precision vector having maximum value.
ISMIN
Finds the smallest index of the component of a single-precision vector having minimum value.
ISUB
ISUM
ISWAP
J
JCGRC see page 433
L
LCHRG see page 489
M
MCRCR see page 1479
N
NAN see page 1601
O
OPERATORS:
.h. see page 1556
P
PCGRC see page 427
PARALLEL_NONNEGATIVE_LSQ see page 66
PARALLEL_BOUNDED_LSQ see page 74
Q
QAND see page 896
R
RAND see page 1608
S
SADD
SASUM
precision vector.
SAXPY
ScaLAPACK_GETDIM see page 1624
Reads matrix data from a file and transmits it into the two-dimensional block-cyclic form required by ScaLAPACK
routines.
SCASUM
Sums the absolute values of the real part together with the
absolute values of the imaginary part of the components of a
complex vector.
SCNRM2
SCOPY
SDDOTA
Computes the sum of a single-precision scalar, a single-precision dot product and the double-precision accumulator,
which is set to the result $ACC \leftarrow ACC + a + x^T y$.
SDDOTI
SDOT
SDSDOT
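SGBMV
$y \leftarrow \alpha A x + \beta y$, or $y \leftarrow \alpha A^T x + \beta y$,
where A is a matrix stored in band storage mode.
SGEMM
$C \leftarrow \alpha A B + \beta C$, $C \leftarrow \alpha A^T B + \beta C$, $C \leftarrow \alpha A B^T + \beta C$,
or $C \leftarrow \alpha A^T B^T + \beta C$
SGEMV
$y \leftarrow \alpha A x + \beta y$, or $y \leftarrow \alpha A^T x + \beta y$
SGER
$A \leftarrow A + \alpha x y^T$.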
SHOW see page 1643
SHPROD
SNRM2
SPLINE_CONSTRAINTS see page 652
SPRDCT
SROT
SROTG
SROTM
SROTMG
SSBMV
SSCAL
SSET
SSUM
SSWAP
SSYMM
SSYMV
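SSYR
$A \leftarrow A + \alpha x x^T$.
SSYR2
$A \leftarrow A + \alpha x y^T + \alpha y x^T$.
SSYR2K
$C \leftarrow \alpha A B^T + \alpha B A^T + \beta C$ or $C \leftarrow \alpha A^T B + \alpha B^T A + \beta C$,
where C is an n by n symmetric matrix and A and B are n by
k matrices in the first case and k by n matrices in the second
case.
SSYRK
$C \leftarrow \alpha A A^T + \beta C$ or $C \leftarrow \alpha A^T A + \beta C$,
where C is an n by n symmetric matrix and A is an n by k
matrix in the first case and a k by n matrix in the second
case.
STBMV
STBSV
$x \leftarrow A^{-1} x$ or $x \leftarrow (A^{-1})^T x$
STRSM
$B \leftarrow \alpha A^{-1} B$, $B \leftarrow \alpha B A^{-1}$, $B \leftarrow \alpha (A^{-1})^T B$,
or $B \leftarrow \alpha B (A^{-1})^T$
STRSV
$x \leftarrow A^{-1} x$ or $x \leftarrow (A^{-1})^T x$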
SURFACE_CONSTRAINTS see page 664
SVCAL
SXYZ
T
TDATE see page 1707
U
UMACH see page 1774
V
VCONC see page 1514
W
WRCRL see page 1660
Z
ZANLY see page 1189
ZQADD
ZQINI
ZQMUL
ZQSTO
Appendix C: References
Akima
Akima, H. (1970), A new method of interpolation and smooth curve fitting based on local
procedures, Journal of the ACM, 17, 589-602.
Akima, H. (1978), A method of bivariate interpolation and smooth surface fitting for irregularly
distributed data points, ACM Transactions on Mathematical Software, 4, 148-159.
Anderson et al.
Anderson, E., Bai, Z., Bishop, C., Blackford, S., Demmel, J., Dongarra, J., DuCroz, J.,
Greenbaum, A., Hammarling, S., McKenney, A., and Sorensen, D. (1999), LAPACK Users'
Guide, SIAM, 3rd ed., Philadelphia.
Arushanian et al.
Arushanian, O.B., M.K. Samarin, V.V. Voevodin, E.E. Tyrtyshnikov, B.S. Garbow, J.M. Boyle,
W.R. Cowell, and K.W. Dritz (1983), The TOEPLITZ Package Users' Guide, Argonne National
Laboratory, Argonne, Illinois.
Ashcraft
Ashcraft, C. (1987), A vector implementation of the multifrontal method for large sparse,
symmetric positive definite linear systems, Technical Report ETA-TR-51, Engineering
Technology Applications Division, Boeing Computer Services, Seattle, Washington.
Ashcraft et al.
Ashcraft, C., R. Grimes, J. Lewis, B. Peyton, and H. Simon (1987), Progress in sparse matrix
methods for large linear systems on vector supercomputers. Intern. J. Supercomputer Applic.,
1(4), 10-29.
Atkinson
Atkinson, Ken (1978), An Introduction to Numerical Analysis, John Wiley & Sons, New York.
Bischof et al.
Bischof, C., J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, D. Sorensen
(1988), LAPACK Working Note #5: Provisional Contents, Argonne National Laboratory Report
ANL-88-38, Mathematics and Computer Science.
Bjorck
Bjorck, Ake (1967), Iterative refinement of linear least squares solutions I, BIT, 7, 322-337.
Bjorck, Ake (1968), Iterative refinement of linear least squares solutions II, BIT, 8, 8-30.
Boisvert (1984)
Boisvert, Ronald (1984), A fourth order accurate fast direct method for the Helmholtz equation,
Elliptic Problem Solvers II, (edited by G. Birkhoff and A. Schoenstadt), Academic Press, Orlando,
Florida, 35-44.
Blackford et al.
Blackford, L. S., Choi, J., Cleary, A., D'Azevedo, E., Demmel, J., Dhillon, I., Dongarra, J.,
Hammarling, S., Henry, G., Petitet, A., Stanley, K., Walker, D. and Whaley, R. C. (1997),
ScaLAPACK Users' Guide, Society for Industrial and Applied Mathematics, Philadelphia, PA.
Brankin et al.
Brankin, R.W., I. Gladwell, and L.F. Shampine, RKSUITE: a Suite of Runge-Kutta Codes for the
Initial Value Problem for ODEs, Softreport 91-1, Mathematics Department, Southern Methodist
University, Dallas, Texas, 1991.
Brenner
Brenner, N. (1973), Algorithm 467: Matrix transposition in place [F1], Communications of the ACM,
16, 692-694.
Brent
Brent, R.P. (1971), An algorithm with guaranteed convergence for finding a zero of a function,
The Computer Journal, 14, 422-425.
Brent, Richard P. (1973), Algorithms for Minimization without Derivatives, Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey.
Brigham
Brigham, E. Oran (1974), The Fast Fourier Transform, Prentice-Hall, Englewood Cliffs, New
Jersey.
Cheney
Cheney, E.W. (1966), Introduction to Approximation Theory, McGraw-Hill, New York.
Cline et al.
Cline, A.K., C.B. Moler, G.W. Stewart, and J.H. Wilkinson (1979), An estimate for the condition
number of a matrix, SIAM Journal on Numerical Analysis, 16, 368-375.
Crowe et al.
Crowe, Keith, Yuan-An Fan, Jing Li, Dale Neaderhouser, and Phil Smith (1990), A direct sparse
linear equation solver using linked list storage, IMSL Technical Report 9006, IMSL, Houston.
Crump
Crump, Kenny S. (1976), Numerical inversion of Laplace transforms using a Fourier series
approximation, Journal of the Association for Computing Machinery, 23, 89-96.
de Boor
de Boor, Carl (1978), A Practical Guide to Splines, Springer-Verlag, New York.
Dongarra et al.
Dongarra, J.J., and C.B. Moler (1977), EISPACK - A package for solving matrix eigenvalue
problems, Argonne National Laboratory, Argonne, Illinois.
Dongarra, J.J., J.R. Bunch, C.B. Moler, and G.W. Stewart (1979), LINPACK Users' Guide, SIAM,
Philadelphia.
Dongarra, J.J., J. DuCroz, S. Hammarling, R. J. Hanson (1988), An Extended Set of Fortran basic
linear algebra subprograms, ACM Transactions on Mathematical Software, 14, 1-17.
Dongarra, J.J., J. DuCroz, S. Hammarling, I. Duff (1990), A set of level 3 basic linear algebra
subprograms, ACM Transactions on Mathematical Software, 16, 1-17.
Du Croz et al.
Du Croz, Jeremy, P. Mayes, and G. Radicati (1990), Factorization of band matrices using Level-3
BLAS, Proceedings of CONPAR 90 VAPP IV, Lecture Notes in Computer Science, Springer,
Berlin, 222.
Duff et al.
Duff, I.S., A.M. Erisman, and J.K. Reid (1986), Direct Methods for Sparse Matrices, Clarendon
Press, Oxford.
Fabijonas
B. R. Fabijonas, Algorithm 838: Airy Functions, ACM Transactions on Mathematical Software,
Vol. 30, No. 4, December 2004, Pages 491-501.
Fabijonas et al.
B. R. Fabijonas, D. W. Lozier, and F. W. J. Olver, Computation of Complex Airy Functions and
Their Zeros Using Asymptotics and the Differential Equation, ACM Transactions on
Mathematical Software, Vol. 30, No. 4, December 2004, 471-490.
Forsythe
Forsythe, G.E. (1957), Generation and use of orthogonal polynomials for fitting data with a digital
computer, SIAM Journal on Applied Mathematics, 5, 74-88.
Garbow
Garbow, B.S. (1978), CALGO Algorithm 535: The QZ algorithm to solve the generalized eigenvalue
problem for complex matrices, ACM Transactions on Mathematical Software, 4, 404-410.
Garbow et al.
Garbow, B.S., J.M. Boyle, J.J. Dongarra, and C.B. Moler (1972), Matrix eigensystem Routines:
EISPACK Guide Extension, Springer-Verlag, New York.
Garbow, B.S., J.M. Boyle, J.J. Dongarra, and C.B. Moler (1977), Matrix Eigensystem Routines
EISPACK Guide Extension, Springer-Verlag, New York.
Garbow, B.S., G. Giunta, J.N. Lyness, and A. Murli (1988), Software for an implementation of
Weeks' method for the inverse Laplace transform problem, ACM Transactions on Mathematical
Software, 14, 163-170.
Gautschi
Gautschi, Walter (1968), Construction of Gauss-Christoffel quadrature formulas, Mathematics of
Computation, 22, 251-270.
Gay
Gay, David M. (1981), Computing optimal locally constrained steps, SIAM Journal on Scientific
and Statistical Computing, 2, 186-197.
Gay, David M. (1983), Algorithm 611: Subroutine for unconstrained minimization using a
model/trust-region approach, ACM Transactions on Mathematical Software, 9, 503-524.
Gear
Gear, C.W. (1971), Numerical Initial Value Problems in Ordinary Differential Equations,
Prentice-Hall, Englewood Cliffs, New Jersey.
Gill et al.
Gill, Philip E., and Walter Murray (1976), Minimization subject to bounds on the variables, NPL
Report NAC 72, National Physical Laboratory, England.
Gill, Philip E., Walter Murray, and Margaret Wright (1981), Practical Optimization, Academic
Press, New York.
Gill, P.E., W. Murray, M.A. Saunders, and M.H. Wright (1985), Model building and practical
aspects of nonlinear programming, in Computational Mathematical Programming, (edited by K.
Schittkowski), NATO ASI Series, 15, Springer-Verlag, Berlin, Germany.
Golub
Golub, G.H. (1973), Some modified matrix eigenvalue problems, SIAM Review, 15, 318-334.
Grosse
Grosse, Eric (1980), Tensor spline approximation, Linear Algebra and its Applications, 34, 29-41.
Hanson
Hanson, Richard J. (1986), Least squares with bounds and linear constraints, SIAM Journal Sci.
Stat. Computing, 7, #3.
Hanson, Richard J. (1990), A cyclic reduction solver for the IMSL Mathematics Library, IMSL
Technical Report 9002, IMSL, Houston.
Hanson et al.
Hanson, Richard J., R. Lehoucq, J. Stolle, and A. Belmonte (1990), Improved performance of
certain matrix eigenvalue computations for the IMSL/MATH Library, IMSL Technical Report
9007, IMSL, Houston.
Hartman
Hartman, Philip (1964) Ordinary Differential Equations, John Wiley and Sons, New York, NY.
Hausman
Hausman, Jr., R.F. (1971), Function Optimization on a Line Segment by Golden Section,
Lawrence Radiation Laboratory, University of California, Livermore.
Hindmarsh
Hindmarsh, A.C. (1974), GEAR: Ordinary differential equation system solver, Lawrence
Livermore Laboratory Report UCID-30001, Revision 3.
Hull et al.
Hull, T.E., W.H. Enright, and K.R. Jackson (1976), User's guide for DVERK - A subroutine for
solving non-stiff ODEs, Department of Computer Science Technical Report 100, University of
Toronto.
IEEE
ANSI/IEEE Std 754-1985 (1985), IEEE Standard for Binary Floating-Point Arithmetic, The
IEEE, Inc., New York.
IMSL (1991)
IMSL (1991), IMSL STAT/LIBRARY User's Manual, Version 2.0, IMSL, Houston.
Irvine et al.
Irvine, Larry D., Samuel P. Marin, and Philip W. Smith (1986), Constrained interpolation and
smoothing, Constructive Approximation, 2, 129-151.
Jenkins
Jenkins, M.A. (1975), Algorithm 493: Zeros of a real polynomial, ACM Transactions on
Mathematical Software, 1, 178-189.
Kershaw
Kershaw, D. (1982), Solution of tridiagonal linear systems and vectorization of the ICCG
algorithm on the Cray-1, Parallel Computations, Academic Press, Inc., 85-99.
Knuth
Knuth, Donald E. (1973), The Art of Computer Programming, Volume 3: Sorting and Searching,
Addison-Wesley Publishing Company, Reading, Mass.
Krogh
Krogh, Fred T. (2005), An Algorithm for Linear Programming,
http://mathalacarte.com/fkrogh/pub/lp.pdf, Tujunga, CA.
Lawson et al.
Lawson, C.L., R.J. Hanson, D.R. Kincaid, and F.T. Krogh (1979), Basic linear algebra
subprograms for Fortran usage, ACM Transactions on Mathematical Software, 5, 308-323.
Leavenworth
Leavenworth, B. (1960), Algorithm 25: Real zeros of an arbitrary function, Communications of the
ACM, 3, 602.
Levenberg
Levenberg, K. (1944), A method for the solution of certain problems in least squares, Quarterly of
Applied Mathematics, 2, 164-168.
Lewis et al.
Lewis, P.A.W., A.S. Goodman, and J.M. Miller (1969), A pseudo-random number generator for
the System/360, IBM Systems Journal, 8, 136-146.
Liepman
Liepman, David S. (1964), Mathematical constants, in Handbook of Mathematical Functions,
Dover Publications, New York.
Liu
Liu, J.W.H. (1986), On the storage requirement in the out-of-core multifrontal method for sparse
factorization. ACM Transactions on Mathematical Software, 12, 249-264.
Liu, J.W.H. (1987), A collection of routines for an implementation of the multifrontal method,
Technical Report CS-87-10, Department of Computer Science, York University, North York,
Ontario, Canada.
Liu, J.W.H. (1989), The multifrontal method and paging in sparse Cholesky factorization. ACM
Transactions on Mathematical Software, 15, 310-325.
Liu, J.W.H. (1990), The multifrontal method for sparse matrix solution: theory and practice,
Technical Report CS-90-04, Department of Computer Science, York University, North York,
Ontario, Canada.
Marquardt
Marquardt, D. (1963), An algorithm for least-squares estimation of nonlinear parameters, SIAM
Journal on Applied Mathematics, 11, 431-441.
Micchelli et al.
Micchelli, C.A., T.J. Rivlin, and S. Winograd (1976), The optimal recovery of smooth functions,
Numerische Mathematik, 26, 279-285.
Micchelli, C.A., Philip W. Smith, John Swetits, and Joseph D. Ward (1985), Constrained $L_p$
approximation, Constructive Approximation, 1, 93-102.
More et al.
More, Jorge, Burton Garbow, and Kenneth Hillstrom (1980), User guide for MINPACK-1,
Argonne National Labs Report ANL-80-74, Argonne, Illinois.
Muller
Muller, D.E. (1956), A method for solving algebraic equations using an automatic computer,
Mathematical Tables and Aids to Computation, 10, 208-215.
Murtagh
Murtagh, Bruce A. (1981), Advanced Linear Programming: Computation and Practice,
McGraw-Hill, New York.
Murty
Murty, Katta G. (1983), Linear Programming, John Wiley and Sons, New York.
Parlett
Parlett, B.N. (1980), The Symmetric Eigenvalue Problem, Prentice-Hall, Inc., Englewood Cliffs,
New Jersey.
Pereyra
Pereyra, Victor (1978), PASVA3: An adaptive finite-difference FORTRAN program for first
order nonlinear boundary value problems, in Lecture Notes in Computer Science, 76,
Springer-Verlag, Berlin, 67-88.
Petro
Petro, R. (1970), Remark on Algorithm 347: An efficient algorithm for sorting with minimal
storage, Communications of the ACM, 13, 624.
Petzold
Petzold, L.R. (1982), A description of DASSL: A differential/ algebraic system solver,
Proceedings of the IMACS World Congress, Montreal, Canada.
Piessens et al.
Piessens, R., E. deDoncker-Kapenga, C.W. Uberhuber, and D.K. Kahaner (1983), QUADPACK,
Springer-Verlag, New York.
Powell
Powell, M.J.D. (1977), Restart procedures for the conjugate gradient method, Mathematical
Programming, 12, 241-254.
Powell, M.J.D. (1978), A fast algorithm for nonlinearly constrained optimization calculations, in
Numerical Analysis Proceedings, Dundee 1977, Lecture Notes in Mathematics, (edited by G.A.
Watson), 630, Springer-Verlag, Berlin, Germany, 144-157.
Powell, M.J.D. (1983), ZQPCVX a FORTRAN subroutine for convex quadratic programming,
DAMTP Report NA17, Cambridge, England.
Powell, M.J.D. (1985), On the quadratic programming algorithm of Goldfarb and Idnani,
Mathematical Programming Study, 25, 46-61.
Powell, M.J.D. (1988), A tolerant algorithm for linearly constrained optimization calculations,
DAMTP Report NA17, University of Cambridge, England.
Powell, M.J.D. (1989), TOLMIN: A fortran package for linearly constrained optimization
calculations, DAMTP Report NA2, University of Cambridge, England.
Reinsch
Reinsch, Christian H. (1967), Smoothing by spline functions, Numerische Mathematik, 10,
177-183.
Rice
Rice, J.R. (1983), Numerical Methods, Software, and Analysis, McGraw-Hill, New York.
Schittkowski
Schittkowski, K. (1987), More test examples for nonlinear programming codes, Springer-Verlag,
Berlin, 74.
Schnabel
Schnabel, Robert B. (1985), Finite Difference Derivatives Theory and Practice, Report, National
Bureau of Standards, Boulder, Colorado.
Scott et al.
Scott, M.R., L.F. Shampine, and G.M. Wing (1969), Invariant Embedding and the Calculation of
Eigenvalues for Sturm-Liouville Systems, Computing, 4, 10-23.
Sewell
Sewell, Granville (1982), IMSL software for differential equations in one space variable, IMSL
Technical Report 8202, IMSL, Houston.
Shampine
Shampine, L.F. (1975), Discrete least-squares polynomial fits, Communications of the ACM, 18,
179-180.
Singleton
Singleton, R.C. (1969), Algorithm 347: An efficient algorithm for sorting with minimal storage,
Communications of the ACM, 12, 185-187.
Smith
Smith, B.T. (1967), ZERPOL, A Zero Finding Algorithm for Polynomials Using Laguerre's
Method, Department of Computer Science, University of Toronto.
Smith et al.
Smith, B.T., J.M. Boyle, J.J. Dongarra, B.S. Garbow, Y. Ikebe, V.C. Klema, and C.B. Moler
(1976), Matrix Eigensystem Routines EISPACK Guide, Springer-Verlag, New York.
Spang
Spang, III, H.A. (1962), A review of minimization techniques for non-linear functions, SIAM
Review, 4, 357-359.
Stewart
Stewart, G.W. (1973), Introduction to Matrix Computations, Academic Press, New York.
Stewart, G.W. (1976), The economical storage of plane rotations, Numerische Mathematik, 25,
137-139.
Stoer
Stoer, J. (1985), Principles of sequential quadratic programming methods for solving nonlinear
programs, in Computational Mathematical Programming, (edited by K. Schittkowski), NATO
ASI Series, 15, Springer-Verlag, Berlin, Germany.
Titchmarsh
Titchmarsh, E. Eigenfunction Expansions Associated with Second Order Differential Equations,
Part I, 2d Ed., Oxford University Press, London, 1962.
Trench
Trench, W.F. (1964), An algorithm for the inversion of finite Toeplitz matrices, Journal of the
Society for Industrial and Applied Mathematics, 12, 515-522.
Walker
Walker, H.F. (1988), Implementation of the GMRES method using Householder transformations,
SIAM J. Sci. Stat. Comput., 9, 152-163.
Washizu
Washizu, K. (1968), Variational Methods in Elasticity and Plasticity, Pergamon Press, New York.
Weeks
Weeks, W.T. (1966), Numerical inversion of Laplace transforms using Laguerre functions, J.
ACM, 13, 419-429.
Wilkinson
Wilkinson, J.H. (1965), The Algebraic Eigenvalue Problem, Oxford University Press, London, 635.
Wilmott et al.
Wilmott, P., Howison, S., and Dewynne, J., The Mathematics of Financial Derivatives: A Student
Introduction, Cambridge University Press, NY, 41-57.
Appendix D: Benchmarking or Timing Programs
NSIZE NTRIES PREC
Description
NSIZE NTRIES PREC
Description
...
QUIT
The parameters NSIZE and NTRIES appear in the summary tables. The parameter PREC has the value
1, 2 or 3, according to whether the single-precision version, the double-precision version, or both
are to be timed. The array functions return a summary table with these 6 values:
1. Average time
2. Standard deviation
3. Total time
4. nsize
5. ntries
6. Time Units/Sec.
As an example, the program time_rand_gend is compiled and linked with the single and double
precision timing functions s_rand_gen_bench and d_rand_gen_bench.
The two lines of input are:
100000
QUIT
This routine evaluates the elapsed time to compute 100,000 random numbers obtained with
rand_gen and rnun (drnun). The Average is the mean of the individual elapsed times for 5
calls to the routines, obtaining 100,000 random numbers in each call. The St. Dev. is the
standard deviation for that Average; it indicates the variability of the Average. For this
value to be informative, |NTRIES| > 1 is required. The value |NTRIES| = 1 is acceptable, but
then only one time sample is taken and no standard deviation is obtained.
Values of NTRIES > 0 result in the printing of results as shown in Table A. The numbers in the
table will vary depending on the machine and other factors that affect the performance of Fortran
codes.
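The six summary values described above can be sketched with standard Fortran intrinsics alone. This is a minimal illustration, not the library's benchmark code: the program name is hypothetical, `random_number` stands in for the IMSL generators, and dividing by the number of samples (rather than by one less) is an assumption. That divisor is consistent with the printed Table A figures, where an average of 3.6 ticks and a total of 18 ticks over 5 samples accompany a St. Dev. of 0.4899.

```fortran
program bench_summary_sketch
  implicit none
  integer, parameter :: nsize = 10000   ! problem size (NSIZE)
  integer, parameter :: ntries = 5      ! number of timed samples (|NTRIES|)
  real    :: x(nsize), ticks(ntries)
  real    :: average, st_dev, total
  integer :: i, t0, t1, ticks_per_sec

  ! Clock resolution in ticks per second (platform dependent).
  call system_clock(count_rate=ticks_per_sec)

  do i = 1, ntries
     call system_clock(count=t0)
     call random_number(x)              ! stand-in workload, NOT the IMSL rand_gen
     call system_clock(count=t1)
     ticks(i) = real(t1 - t0)           ! elapsed ticks (ignores clock wrap-around)
  end do

  total   = sum(ticks)
  average = total / real(ntries)
  ! Assumption: divide by ntries, not ntries - 1, to agree with Table A.
  st_dev  = sqrt(sum((ticks - average)**2) / real(ntries))

  print '(es12.4,2x,a)', average,             'Average'
  print '(es12.4,2x,a)', st_dev,              'St. Dev.'
  print '(es12.4,2x,a)', total,               'Total Ticks'
  print '(es12.4,2x,a)', real(nsize),         'Size'
  print '(es12.4,2x,a)', real(ntries),        'Repeats'
  print '(es12.4,2x,a)', real(ticks_per_sec), 'Time Units/Sec.'
end program bench_summary_sketch
```

On a fast machine the stand-in workload may finish within a single clock tick, so a larger `nsize` may be needed to see nonzero times.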
Benchmark of rand_gend and rnun:
Date of benchmark, (Y, Mo, D, H, M, S): 2006 5 11 8 58 58
  3.6000E+00  3.2000E+00  Average
  4.8990E-01  4.0000E-01  St. Dev.
  1.8000E+01  1.6000E+01  Total Ticks
  1.0000E+04  1.0000E+04  Size
  5.0000E+00  5.0000E+00  Repeats
  5.0000E+01  5.0000E+01  Time Units/Sec.
8 58 58
  2.8000E+00  3.2000E+00  Average
  4.0000E-01  4.0000E-01  St. Dev.
  1.4000E+01  1.6000E+01  Total Ticks
  1.0000E+04  1.0000E+01  Size
  5.0000E+00  5.0000E+00  Repeats
  5.0000E+01  5.0000E+01  Time Units/Sec.
If NTRIES < 0, the functions return the 6 × 2 array of tabular values shown, computed from
|NTRIES| samples, and no printing is performed.
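To compute a related benchmark, such as the rate in random numbers per second for single
precision rand_gen, separately calculate
$$
\begin{aligned}
\text{rate} &= \text{size} \times \text{ticks per sec.} / \text{average} \\
&= 10^4 \times 50 / 3.6 \\
&= 138{,}889 \text{ numbers/sec.} \\
&= 0.139 \text{ million numbers/sec.}
\end{aligned}
$$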
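The same arithmetic can be written as a short Fortran sketch. The program and variable names are illustrative only; the three constants are the Size, Time Units/Sec. and Average entries used in the calculation above.

```fortran
program rate_sketch
  implicit none
  real, parameter :: nsize         = 1.0e4   ! "Size" entry
  real, parameter :: ticks_per_sec = 50.0    ! "Time Units/Sec." entry
  real, parameter :: average       = 3.6     ! "Average" entry, in ticks
  real :: rate

  ! (numbers per call) / (seconds per call), where seconds = ticks / ticks_per_sec
  rate = nsize * ticks_per_sec / average

  print '(a,f10.0)', 'numbers/sec:         ', rate
  print '(a,f10.3)', 'million numbers/sec: ', rate / 1.0e6
end program rate_sketch
```

Dividing the average by the tick rate first converts it to seconds per call; the rate is then simply the problem size over that time.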
Table B: Scalar Benchmark Comparisons
Number  Program Units                                                  Routines      Timed for Comparison
1       time_dft.f90, s_dft_bench.f90, d_dft_bench.f90                 fast_dft      fftcf, fftcb, dfftcf, dfftcb
2       time_eig_gen.f90, s_eig_gen_bench.f90, d_eig_gen_bench.f90     lin_eig_gen   e8crg, de8crg
3       time_eig_self.f90, s_eig_self_bench.f90, d_eig_self_bench.f90  lin_eig_self  e5csf, de5csf
4       time_geig_gen.f90, s_geig_gen_bench.f90, d_geig_gen_bench.f90  lin_geig_gen  g8crg, dg8crg
5       time_inv_chol.f90, s_inv_chol_bench.f90, d_inv_chol_bench.f90  lin_sol_self  l2nds, dl2nds
6       time_inv_gen.f90, s_inv_gen_bench.f90, d_inv_gen_bench.f90     lin_sol_gen   l2nrg, dl2nrg
7       time_inv_lsq.f90, s_inv_lsq_bench.f90, d_inv_lsq_bench.f90     lin_sol_lsq   lsgrr, dlsgrr
8       time_inv_self.f90, s_inv_self_bench.f90, d_inv_self_bench.f90  lin_sol_self  lftsf, lfssf, dlftsf, dlfssf
9       time_rand_gen.f90, s_inv_rand_bench.f90, d_inv_rand_bench.f90  rand_gen      rnun, drnun
10      time_sol_chol.f90, s_inv_sol_chol.f90, d_inv_sol_chol.f90      lin_sol_self  lftds, lfsds, dlftds, dlfsds
11      time_sol_gen.f90, s_sol_gen_bench.f90, d_sol_gen_bench.f90     lin_sol_gen   lftrg, lfsrg, dlftrg, dlfsrg
12      time_sol_lsq.f90, s_sol_lsq_bench.f90, d_sol_lsq_bench.f90     lin_sol_lsq   l2rrv, dl2rrv
13      time_sol_self.f90, s_sol_self_bench.f90, d_sol_self_bench.f90  lin_sol_self  lftsf, lfssf, dlftsf, dlfssf
14      time_svd.f90, s_svd_bench.f90, d_svd_bench.f90                 lin_svd       lsvrr, dlsvrr
15      time_tri.f90, s_tri_bench.f90, d_tri_bench.f90                 lin_sol_tri   lslcr, dlslcr
16      time_mult.f90, s_mult_bench.f90, d_mult_bench.f90              A .x. B       matmul(D,E)
1. Perform forward and backward DFT of a random complex sequence of size NSIZE.
2.
3.
4.
5. Compute the inverse of a positive definite real matrix of dimension NSIZE × NSIZE.
Uses Cholesky method.
6. Compute the inverse of a general real random matrix of dimension NSIZE × NSIZE.
Uses LU factorization.
7.
8.
9.
10. Solve a single system of linear equations with a positive definite real random matrix of
dimension NSIZE × NSIZE.
11. Solve a single system of linear equations with a general real random matrix of
dimension NSIZE × NSIZE.
12. Solve a single least-squares system of linear equations with a real random matrix of
dimension (2 × NSIZE) × NSIZE.
13. Solve a single system of linear equations with a symmetric real random matrix of
dimension NSIZE × NSIZE.
14. Compute the full singular value decomposition of a general real random matrix of
dimension NSIZE × NSIZE.
15.
16. Compute products of square matrices of size NSIZE × NSIZE. Compare the IMSL
defined operation C = A .x. B with F = matmul(D,E). The arrays are assumed
shape. Identical problems A = D and B = E are timed.
17. Compare times to use SHOW() for writing a random array of size NSIZE to a
CHARACTER buffer vs. writing the same array to a scratch file.
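As an illustration of item 16, the intrinsic side of that comparison can be timed with a few lines of standard Fortran. This is only a sketch under stated assumptions: the program name and the value of `nsize` are arbitrary, and the IMSL defined operation `A .x. B` is not shown because it requires the library. It would be timed in the same way, on identical input arrays.

```fortran
program matmul_timing_sketch
  implicit none
  integer, parameter :: nsize = 200          ! illustrative NSIZE
  real, allocatable :: d(:,:), e(:,:), f(:,:)
  integer :: t0, t1, ticks_per_sec

  allocate(d(nsize,nsize), e(nsize,nsize), f(nsize,nsize))
  call random_number(d)                      ! random square matrices
  call random_number(e)

  call system_clock(t0, ticks_per_sec)       ! start count and clock rate
  f = matmul(d, e)                           ! intrinsic product being timed
  call system_clock(t1)

  print '(a,es12.4)', 'matmul seconds: ', real(t1 - t0) / real(ticks_per_sec)
  ! Printing a value derived from f keeps the product from being optimized away.
  print '(a,es12.4)', 'checksum:       ', sum(f)
end program matmul_timing_sketch
```

A single sample is taken here for brevity; repeating the timed section several times gives the average and standard deviation reported in the summary tables.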
Two initial lines of output echo the Description field, whether or not the root is working, and the
number of processors in the MPI communicator. The parameters NSIZE, NTRIES and NRACKS
appear in the summary tables. The parameter PREC has the value 1, 2 or 3, according to
whether the single-precision version, the double-precision version, or both are to be timed.
The array functions return a 7 × 2 summary table of values. The (1:6,1) and (1:6,2) elements of
this array hold the results and parameters of the benchmark for the parallel and non-parallel
versions, respectively. The (7,1) and (7,2) elements hold the ratio of the parallel to the scalar
times and a first-order approximation to the variation in the ratio.
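The text does not state which first-order approximation is used for the variation. One common choice, given here only as an assumption and not as the library's formula, is first-order error propagation for a quotient of independent quantities. The symbols are introduced for illustration: $\bar{t}_p$ and $\bar{t}_s$ are the average parallel and scalar times, and $s_p$ and $s_s$ are their standard deviations.

$$
r = \frac{\bar{t}_p}{\bar{t}_s}, \qquad
\delta r \approx r \sqrt{\left(\frac{s_p}{\bar{t}_p}\right)^{2} + \left(\frac{s_s}{\bar{t}_s}\right)^{2}}
$$

A ratio below 1 means the parallel version was faster on average. The variation shows whether that ratio is well separated from 1 given the scatter in the samples.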
    Parallel Box Version     Non-Parallel Box Version
1.  Average time             Average time
2.  Standard deviation       Standard deviation
3.  Total Seconds            Total Seconds
4.  nsize                    nsize
5.  nracks                   nracks
6.  ntries                   ntries
7.  Parallel/Scalar Ratio    Variation in Ratio
As an example, the program time_parallel_i is compiled and linked with the single and
double precision timing functions s_parallel_i_bench and d_parallel_i_bench.
This routine measures the time to compute 4 inverse matrices of size 600 by 600 using the defined
operator .i. The Average is the mean of the individual elapsed times for 5 calls to the routines,
with 4 inverses obtained in each call. The St. Dev. is the standard deviation for that Average and
indicates its variability. For this value to be informative, |NTRIES| > 1 is required. The value
|NTRIES| = 1 is acceptable, but it yields only one time sample and no standard deviation.
Values of NTRIES > 0 result in the printing of results as shown in Table C. The numbers in the
table will vary depending on the machine and other factors that affect the performance of Fortran
codes. If NTRIES < 0, the 7 by 2 functions return the tabular values shown, using |NTRIES|
samples, and no printing is performed.
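The Average, St. Dev. and Total Seconds rows can be illustrated with a short, self-contained timing loop. This is a minimal sketch under stated assumptions: it times the intrinsic matmul rather than any IMSL operator, the program and variable names are hypothetical, the values of nsize, nracks and ntries are arbitrary, and the sample standard deviation (divisor ntries - 1) is an assumed choice that the text does not confirm.

```fortran
program time_box_sketch
  ! Hypothetical stand-in for a benchmark driver; not one of the IMSL programs.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: nsize = 200    ! matrix dimension (assumed value)
  integer, parameter :: nracks = 4     ! matrices per timed call (assumed value)
  integer, parameter :: ntries = 5     ! number of time samples (assumed value)
  real(dp), allocatable :: a(:,:,:), b(:,:,:), c(:,:,:)
  real(dp) :: t(ntries), average, st_dev, total
  integer(i8) :: c0, c1, rate
  integer :: i, k

  allocate (a(nsize,nsize,nracks), b(nsize,nsize,nracks), c(nsize,nsize,nracks))
  call random_number(a)
  call random_number(b)

  ! Each sample times nracks matrix products, one per rack of the box.
  do i = 1, ntries
    call system_clock(c0, rate)
    do k = 1, nracks
      c(:,:,k) = matmul(a(:,:,k), b(:,:,k))
    end do
    call system_clock(c1)
    t(i) = real(c1 - c0, dp) / real(max(rate, 1_i8), dp)
  end do

  total = sum(t)
  average = total / ntries
  if (ntries > 1) then
    ! Sample standard deviation of the elapsed times (assumed definition).
    st_dev = sqrt(sum((t - average)**2) / (ntries - 1))
  else
    st_dev = 0.0_dp                    ! a single sample gives no deviation
  end if

  print '(a, es12.4)', 'Average       ', average
  print '(a, es12.4)', 'St. Dev.      ', st_dev
  print '(a, es12.4)', 'Total Seconds ', total
  print '(a, es12.4)', 'Checksum      ', sum(c)   ! keeps the products live
end program time_box_sketch
```

With ntries = 1 the sketch reports a zero deviation, which mirrors the remark that a single sample carries no information about variability.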
8 58 58
Average          1.5815E+00   4.0241E+00
St. Dev.         2.5031E-01   1.8035E-02
Total Seconds    7.9077E+00   2.0121E+01
Size             5.0000E+01   5.0000E+01
                 5.0000E+00   5.0000E+00
Repeats          5.0000E+00   5.0000E+00
                 3.9129E-01
8 58 59
Average          1.6985D+00   4.0372D+00
St. Dev.         9.8576D-01   2.3836D-02
Total Seconds    8.4923D+00   2.0186D+01
Size             5.0000D+01   5.0000D+01
                 5.0000D+00   5.0000D+00
Repeats          5.0000D+00   5.0000D+00
                 1.2392D-01
Below is a list of the performance evaluation programs that time the box data computations using
parallel and non-parallel resources.
Number  Program Units                   Function Timed
1       time_parallel_i.f90,            .i. A
        s_parallel_i_bench.f90,
        d_parallel_i_bench.f90
2       time_parallel_ix.f90,           A .ix. B
        s_parallel_ix_bench.f90,
        d_parallel_ix_bench.f90
3       time_parallel_xi.f90,           B .xi. A
        s_parallel_xi_bench.f90,
        d_parallel_xi_bench.f90
4       time_parallel_x.f90,            A .x. B
        s_parallel_x_bench.f90,
        d_parallel_x_bench.f90
5       time_parallel_tx.f90,           A .tx. B
        s_parallel_tx_bench.f90,
        d_parallel_tx_bench.f90
6       time_parallel_xt.f90,           A .xt. B
        s_parallel_xt_bench.f90,
        d_parallel_xt_bench.f90
7       time_parallel_hx.f90,           A .hx. B
        s_parallel_hx_bench.f90,
        d_parallel_hx_bench.f90
8       time_parallel_xh.f90,           A .xh. B
        s_parallel_xh_bench.f90,
        d_parallel_xh_bench.f90
9       time_parallel_chol.f90,         CHOL(A)
        s_parallel_chol_bench.f90,
        d_parallel_chol_bench.f90
10      time_parallel_cond.f90,         COND(A)
        s_parallel_cond_bench.f90,
        d_parallel_cond_bench.f90
11      time_parallel_rank.f90,         RANK(A)
        s_parallel_rank_bench.f90,
        d_parallel_rank_bench.f90
12      time_parallel_det.f90,          DET(A)
        s_parallel_det_bench.f90,
        d_parallel_det_bench.f90
13      time_parallel_orth.f90,         ORTH(A,R=R)
        s_parallel_orth_bench.f90,
        d_parallel_orht_bench.f90
14      time_parallel_svd.f90,          SVD(A,U=U,V=V)
        s_parallel_svd_bench.f90,
        d_parallel_svd_bench.f90
15      time_parallel_norm.f90,         NORM(A,TYPE=I)
        s_parallel_norm_bench.f90,
        d_parallel_norm_bench.f90
16      time_parallel_eig.f90,          EIG(A,W=W)
        s_parallel_eig_bench.f90,
        d_parallel_eig_bench.f90
17      time_parallel_fft.f90,          FFT_BOX(A)
        s_parallel_fft_bench.f90,       IFFT_BOX(A)
        d_parallel_fft_bench.f90
Table D: Parallel and non-Parallel Box Comparisons
Product Support
Clarity of documentation
Not included in these topics are mathematical/statistical consulting and debugging of your
program.
Refer to the following for Visual Numerics Product Support contact information:
http://www.vni.com/tech/imsl/phone.html
The following describes the procedure for consultation with Visual Numerics:
1. Include your Visual Numerics license number
2. Include the product name and version number: IMSL Fortran Numerical Library Version 6.0
3. Include compiler and operating system version numbers
4. Include the name of the routine for which assistance is needed and a description of the problem
Index
1
1-norm 1501, 1504, 1505, 1509
2
2DFT (Discrete Fourier Transform)
1093, 9
3
3DFT (Discrete Fourier Transform)
9
A
Aasen's method 19, 20
accuracy estimates of eigenvalues,
example 534
Adams xix
Adams-Moulton's method 944
adjoint eigenvectors, example 534
adjoint matrix xxiii
ainv= optional argument xxvi
Akima interpolant 690
algebraic-logarithmic singularities
883
ANSI xix, 1601, 1602, 12
arguments, optional subprogram xxvi
array permutation 1673
ASCII collating sequence 1701
ASCII values 1698, 1699, 1700
B
band Hermitian storage mode 344,
346, 352, 355, 358, 360, 362,
1779
band storage mode 280, 282, 287,
295, 298, 324, 327, 330, 338,
342, 1447, 1448, 1450, 1452,
1453, 1455, 1460, 1467, 1489,
1493, 1495, 1497, 1504, 1505,
1777
C
Campbell 54
Cauchy principal value 860, 886
central differences 1390
changing messages 1642
character arguments 1699
character sequence 1703
character string 1704
character workspace 1787
Chebyshev approximation 649, 854
Chebyshev polynomials 30
Cholesky
algorithm 20
decomposition 18, 525, 538
factorization 1574, 1575
method 22
Cholesky decomposition 489
Cholesky factorization 185, 190,
194, 204, 305, 308, 311, 318,
349, 362, 395, 399, 404, 412,
417, 421, 491, 494
circulant linear system 425
circulant matrices 8
classical weight functions 901, 914
codiagonal band Hermitian storage
mode 349
codiagonal band Hermitian storage
mode 1782
D
DASPG routine 54
data fitting
polynomial 30
two dimensional 33
data points 803
data, optional xxvii
date 1707, 1708, 1710, 1711
decomposition, singular value 36, 16
degree of accuracy 1763
DENSE_LP 1346
deprecated routines 1787
determinant 1581, 1582, 7
determinant of A 9
determinants 113, 148, 161, 163,
204, 224, 274, 298, 318, 342,
362
determinants 7
DFT (Discrete Fourier Transform)
1086, 1099
differential algebraic equations 924
Differential Algebraic Equations 540
differential equations 923, 961
differential-algebraic solver 54
diffusion equation 53
direct-access message file 1643
direct search complex algorithm
1306
direct search polytope algorithm
1263
discrete Fourier cosine
transformation 1122
discrete Fourier sine transformation
1118
discrete Fourier transform 1085,
1592, 1593, 1594, 1595, 1597,
1598, 1599, 9, 11
inverse 1596, 11
dot product 1426, 1427, 1428
double precision xix, 1517
DOUBLE PRECISION types xxii
E
efficient solution method 532
eigensystem
complex 555, 627, 629, 632
Hermitian 607
real 548, 571, 618, 621, 625
symmetric 589, 639
eigenvalue 1586, 1587, 8
eigenvalue-eigenvector
decomposition 521, 525, 1586,
1587, 8
expansion (eigenexpansion) 523
eigenvalues 543, 545, 550, 552, 557,
559, 561, 563, 566, 568, 573,
575, 578, 581, 584, 586, 591,
593, 596, 599, 602, 604, 609,
611, 614, 616, 618, 621, 627,
629, 634, 636
F
factored secant update 1204, 1210
factorization, LU 9
Fast Fourier Transforms 1084
Faure 1736, 1738, 37, 9
Faure sequence 1736, 1737, 37, 9
Fejer quadrature rule 914
FFT (Fast Fourier Transform) 1088,
1096, 1102
finite difference gradient 1377
finite-difference approximation
1198, 1204
finite-difference gradient 1232,
1255, 1279
finite-difference Hessian 1243
finite-difference Jacobian 1267
first derivative 918
first derivative evaluations 1225
first order differential 980
FORTRAN 77
combining with Fortran 90 xix
Fortran 90
language xix
rank-2 array xxvi
real-time clock 1716
G
Galerkin principle 54
Gauss quadrature 861
Gauss quadrature rule 901, 905
Gaussian elimination 364, 369, 374,
377, 391, 408, 412
Gauss-Kronrod rules 865
Gauss-Lobatto quadrature rule 901,
905
Gauss-Radau quadrature rule 901,
905
Gear's BDF method 944
generalized
eigenvalue 525, 538, 1586, 1587, 8
feedback shift register (GFSR)
1714
inverse
matrix 27, 28, 32
generalized inverse
system solving 32
generator 1717, 1720
getting started xxvi
GFSR algorithm 1715
Givens plane rotation 1430
Givens transformations 1432, 1433
globally adaptive scheme 865
Golub 12, 20, 30, 35, 59, 62, 64, 521,
525, 530
gradient 1390, 1392, 1397, 1403
Gray code 1739
GSVD 62
H
Hadamard product 1428, 1481
Hanson 521
harmonic series 1089, 1097
Helmholtz's equation 1053
Helmholtz's equation 1059
Hermite interpolant 687
Hermite polynomials 1038
Hermitian positive definite system
226, 231, 246, 251, 344, 346,
358, 360
Hermitian system 258, 261, 269, 271
Hessenberg matrix, upper 527, 531
K
Kershaw 47
I
IEEE 1601, 1602, 12
infinite eigenvalues 538
infinite interval 872
infinity norm 1499
infinity norm distance 1510
informational errors 1764
initialization, several 2D transforms
1098
initialization, several transforms
1091
initial-value problem 927, 934, 944
integer options 1739
INTEGER types xxii
integrals 706
integration 862, 865, 869, 872, 875,
883, 886, 889, 896
interface block xix
internal write 1646
interpolation 651
cubic spline 677, 680
quadratic 649
scattered data 649
inverse 9
iteration, computing eigenvectors
23, 51, 523
matrix xxvi, 10, 17, 18, 22
generalized 27, 28
transform 1087, 1094, 1100
inverse matrix 9
isNaN 1601, 1602
ISO xix
iterated integral 891
iterative refinement xxvii, 6, 7, 48,
82, 107, 142, 176, 181, 185,
190, 194, 199, 204, 205, 209,
212, 221, 251, 271, 295, 315,
338, 344, 360, 446, 458
IVPAG routine 54
L
low-discrepancy 1739
LU factorization 93, 98, 103, 113,
127, 133, 138, 148, 287, 290,
293, 298, 330, 333, 336, 342,
369, 374, 382, 387
LU factorization of A 9, 10, 11, 18,
1522
M
machine-dependent constants 1769
mathematical constants 1751
matrices 1444, 1445, 1447, 1448,
1450, 1452, 1453, 1455, 1457,
1458, 1460, 1465, 1467, 1469,
1476, 1479, 1487, 1489, 1491,
1497, 1502, 1504, 1505, 1647,
1649, 1653, 1655, 1658, 1660,
1664
adjoint xxiii
complex 330, 333, 342, 504, 550,
552, 1455, 1460
band 1448, 1493, 1497, 1505
general 127, 148, 149, 1445,
1453, 1457
general sparse 382
Hermitian 236, 263, 266, 274,
349, 352, 355, 362, 591, 593,
596, 599, 602, 604, 1463, 1467
rectangular 1458, 1479, 1491,
1658, 1660
tridiagonal 321
upper Hessenberg 614, 616
copying 1444, 1445, 1447, 1448,
1457, 1458, 1465, 1467
covariance 22, 27, 28
general 1775
Hermitian 1776
inverse xxvi, 9, 10, 17, 18, 22
generalized 27, 28, 32
inversion and determinant 13
multiplying 1474, 1476, 1479,
1487, 1489, 1491
orthogonal xxiii
permutation 1674
poorly conditioned 38
printing 1647, 1649, 1653, 1655,
1658, 1660, 1664
real 287, 290, 298, 508, 543, 545,
1452, 1460
band 1447, 1489, 1504
general 93, 98, 113, 114, 1444,
1450, 1457
general sparse 369
rectangular 1458, 1476, 1481,
1487, 1502, 1647, 1649
sparse 6
N
naming conventions xxii
NaN (Not a Number) 1601
quiet 1601
signaling 1601
Newton algorithm 1218
Newton method 1243, 1249, 1293,
1299
Newton's method 42, 60
noisy data 848, 851
nonadaptive rule 889
nonlinear equations 1198, 1201,
1204, 1210
nonlinear least-squares problem
1218, 1267, 1273, 1310, 1317,
1324
nonlinear programming 1377, 1383
norm 1602
normalize 1614
not-a-knot condition 677, 680
numerical differentiation 862
O
odd sequence 1118
odd wave numbers 1126, 1129,
1133, 1135
optional argument xxvi
optional data xxvi, xxvii
optional subprogram arguments xxvi
ordinary differential equations 923,
924, 927, 934, 944
ordinary eigenvectors, example 534
orthogonal
decomposition 59
factorization 29
matrix xxiii
orthogonal matrix 473
orthogonalized 51, 523
overflow xxiv
P
page length 1671
page width 1671
Q
QR algorithm 59, 521
double-shifted 530
QR decomposition 8, 466, 1582
QR factorization 473, 484
quadratic interpolation 782, 784,
786, 789, 792, 796
quadratic polynomial interpolation
649
quadrature formulas 861
quadrature rule 911
quadruple precision 1516
quasi-Monte Carlo 899
quasi-Newton method 1232, 1237,
1279, 1286
quintic polynomial 800
R
radial-basis functions 33
random complex numbers,
transforming an array 1089,
1096, 1102
random number generator 1727,
1729, 1730, 1732
random number generators 1722,
1723
random numbers 1714
rank-2k update 1442
rank-k update 1441
rank-one matrix 484, 491, 494
rank-one matrix update 1439, 1440
rank-two matrix update 1440
rational weighted Chebyshev
approximation 854
READ_MPS 1333, 1343
real numbers, sorting 1677
real periodic sequence 1103, 1106
real sparse symmetric positive
definite system 404
real symmetric definite linear system
427, 433
real symmetric positive definite
system 176, 181, 194, 199,
300, 303, 313, 315
real symmetric system 209, 212, 219,
221
real triangular system 154
real tridiagonal system 275
REAL types xxii
real vectors 1153, 1163
record keys, sorting 1679
rectangular domain 750
rectangular grid 786, 789, 792, 796
recurrence coefficients 905, 908, 911
reduction
array of black and white 40
regularizing term 47
Reid xix
required arguments xxvi
reserved names 1784
reverse communication 54
ridge regression 64
cross-validation
example 64
Rodrigue 47
row and column pivoting 27, 30
row vector, heavily weighted 35
Runge-Kutta-order method 934
Runge-Kutta-Verner fifth-order
method 927
Runge-Kutta-Verner sixth-order
method 927
S
ScaLAPACK
contents 1620
data types 1620
definition of library 1619
interface modules 1622
reading utility
block-cyclic distributions 1625,
1636, 1637
scattered data 800
scattered data interpolation 649
Schur form 527, 532
search 1691, 1694, 1696
second derivative 918
self-adjoint
eigenvalue problem 525
linear system 25
matrix 17, 20, 521, 523, 525, 16
eigenvalues 23, 520, 527, 16
tridiagonal 20
semi-infinite interval 872
sequence 1129, 1135
serial number 1713
Shared-Memory Multiprocessors and
xxx
simplex algorithm 1351, 1355
sine 875
sine Fourier coefficients 1129
sine Fourier transform 1126
single precision xix
SINGLE PRECISION options 1743
Single Program, Multiple Data
SPMD 1619
singular value decomposition 504
singular value decomposition (SVD)
36, 1612, 1613, 16
singularity 7
singularity points 869
smooth bivariate interpolant 800
smoothing 844
smoothing formulas 32
smoothing spline routines 649
solvable 540
solving
general system 9
linear equations 17
rectangular
least squares 36
system 27
solving linear equations 5
sorting 1679, 1681, 1683, 1684,
1685, 1687, 1688, 1690, 1691,
1694, 1696
sorting an array, example 1678
Sparse Matrix, Complex
Harwell-Boeing column-oriented
sparse form 1533
T
tensor product splines 648
tensor-product B-spline coefficients
720, 725, 833, 838
tensor-product B-spline
representation 741, 742, 746,
750, 754, 756, 760, 766
tensor-product spline 741, 742, 746,
750, 754, 756, 760, 766
tensor-product spline approximant
833, 838
tensor-product spline interpolant 725
terminal errors 1763
third derivative 918
time 1706
Timing
Benchmarking
U
unconstrained minimization 1218
underflow xxiv
uniform (0, 1) distribution 1732,
1734
uniform mesh 1059
unitary matrix xxiii
univariate functions 1218
univariate quadrature 860
upper Hessenberg matrix 531
user errors 1763
user interface xix
user-supplied function 918
user-supplied gradient 1259, 1286,
1383
Using LAPACK, LINPACK, and
EISPACK xxxii
using library subprograms xxiii
V
Van Loan 12, 20, 30, 35, 59, 62, 64,
521, 525, 530
variable knot B-spline 819
variable order 961
variances 1716
variational equation 53
vectors 1425, 1426, 1428, 1429,
1437, 1491, 1493, 1512, 1514
complex 1514
real 1512
version 1713
W
workspace allocation 1785, 1786
World Wide Web
URL for ScaLAPACK User's
Guide 1620
Z
zero of a real function 1192
zeros of a polynomial 1184, 1186,
1188
zeros of a univariate complex
function 1189
zeros of the polynomial 1183