Parallel Programming
with Co-arrays
Robert W. Numrich
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
Contents
Preface
1 Prologue
3 Partition Operators
3.1 Uniform partitions
3.2 Non-uniform partitions
3.3 Row-partitioned matrix-vector multiplication
3.4 Input/output in the co-array model
3.5 Exercises
5 Collective Operations
5.1 Reduction to root
5.2 Broadcast from root
5.3 The sum-to-all operation
5.4 The max-to-all and min-to-all operations
5.5 Vector norms
5.6 Collectives with array arguments
5.7 The scatter and gather operations
5.8 A cautionary note about functions with side effects
5.9 Exercises
6 Performance Modeling
6.1 Execution time for the sum-to-all operation
6.2 Execution time for the dot-product operation
6.3 Speedup and efficiency
6.4 Strong scaling under a fixed-size constraint
6.5 Weak scaling under a fixed-time constraint
6.6 Weak scaling under a fixed-work constraint
6.7 Weak scaling under a fixed-efficiency constraint
6.8 Some remarks on computer performance modeling
6.9 Exercises
9 Blocked Matrices
9.1 Partitioned dense matrices
9.2 An abstract class for dense matrices
9.3 The dense matrix class
9.4 Matrix-matrix multiplication
9.5 LU decomposition
9.6 Partial pivoting
9.7 Solving triangular systems of equations
9.8 Exercises
16 Epilogue
Bibliography
Index
Preface
Robert Numrich
Minneapolis, Minnesota
Chapter 1
Prologue
This book describes a set of fundamental techniques for writing parallel appli-
cation codes. These techniques form the basis for parallel algorithms frequently
used in scientific and engineering applications. Parallel computing is a very large topic, as is scientific computing. The book makes no claim of comprehensive coverage of either topic; it offers a basic outline of how to write parallel code for scientific applications.
All the examples in the book employ two fundamental techniques that are
part of every parallel programming model in one form or another:
• data decomposition
• execution control
The programmer must master these two techniques and may find them the
hardest part of designing a parallel application. The book applies these two
fundamental techniques to five fundamental algorithms:
• matrix-vector multiplication
• matrix factorization
• matrix transposition
• collective operations
• halo exchanges
It is not a complete list, but it is a list that every parallel code developer must
understand.
The book describes these techniques in terms of partition operators. The
programmer frequently encounters partition operators, either explicitly or im-
plicitly, in scientific application codes, and the techniques needed for new codes
are often variations of techniques encountered in previous codes. The specific
form of the partition operators becomes progressively more complicated as
the examples become more complicated. The book’s goal is to show that all
its examples fit into a single unified framework.
The book encourages the use of Fortran as a modern object-oriented lan-
guage. Parallel programming is an exercise in transcribing mathematical def-
initions for partition operators into small functions associated with Fortran
objects. The exchange of one data distribution for another is the exchange
of one set of functions with another set. This technique follows one of the
fundamental principles of object-oriented design.
This book demonstrates that the programmer needs to learn just a handful
of techniques that are variations on a common theme. As always, however, the
only way to learn to write parallel code is to write parallel code. And to test
it.
The interplay between two sets of indices makes the programmer’s job
difficult. One set describes the decomposition of a data structure. The other
set describes how the first set is distributed across a parallel computer. Perhaps
the statement attributed to Kronecker applies:
“God created the integers, all else is the work of man.”
Chapter 2
The Co-array Programming Model
! ---- begin segment 4 ----!
   :
   :
end program First            ! ---- end segment 4 ----!
The programmer needs to know the number of images and the value of the
local image index at run-time. Each image obtains these values by invoking
two intrinsic functions added to the language to support the co-array model
as replicated in Listing 2.2.
The first function returns the number of images, and the second function
returns the image index of the image that invokes it. The image index has
a value between one and the number of images and uniquely identifies the
image. In all the examples that follow, the book uses the symbol p for the
number of images and the symbol me for the value of the local image index.
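The two intrinsic functions are num_images and this_image; a minimal sketch of the calls, using the book's symbols p and me:

   integer :: p, me
   p  = num_images()   ! total number of images
   me = this_image()   ! image index of the invoking image, between 1 and p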
The allocation statement, replicated in Listing 2.3, sets the upper and lower
co-bounds for the co-dimension. Specifying the co-bounds with an asterisk in
square brackets follows the Fortran convention for declaring a normal variable
with an assumed size. This convention allows the programmer to write code
that can run on different numbers of images without changing the source code,
without re-compiling and without re-linking. The implied lower co-bound is
one and the implied upper co-bound equals the number of images at run-time.
The programmer can override the default co-bound values as described in
Section A.3. If the lower co-bound is zero, for example, the upper co-bound
is the number of images minus one. Whatever the value for the lower co-
bound, the programmer is responsible for using valid co-dimension indices.
Compiler vendors may provide an option to check the run-time value of a co-
dimension index, but an out-of-bound value for a co-dimension index results
in undefined, often disastrous, behavior just as it does for an out-of-bound
value for a normal dimension index.
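A minimal sketch of the declaration and allocation syntax described here; the array names and the size n are illustrative, not taken from Listing 2.3:

   real, allocatable :: x(:)[:], y(:)[:]
   allocate( x(n)[*] )     ! implied co-bounds 1 through num_images()
   allocate( y(n)[0:*] )   ! lower co-bound overridden to zero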
In the program shown in Listing 2.1, each image picks a partner with image
index one greater than its own. Since the partner’s index cannot be greater
than the number of images, the last image picks the first image as its partner.
Each image sets the value of its local co-array variable to its own image index
as shown in Listing 2.4.
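A sketch of the partner calculation and the assignment just described; Listing 2.4 is not reproduced, so the code below only follows the book's conventions:

   partner = me + 1
   if (partner > p) partner = 1   ! the last image wraps around to image one
   x = me                         ! each image stores its own image index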
Execution of the program divides into four segments, separated by three execution control statements plus the statement that ends the program.
The first segment begins with the first executable statement and ends with
the allocation statement. Every image must execute this statement because
the variable being allocated is a co-array variable. There is an implied barrier
that no image may cross until all have allocated the variable.
The second segment consists of all statements following the allocation
statement up to and including the sync all statement. Without this con-
trol statement, an image might try to obtain a value from its partner before
the partner has defined the value. The sync all statement guarantees that
each image has defined its value before any image references the value. When
an image executes a statement in the third segment, as shown in Listing 2.6,
it obtains the value defined by its partner in the second segment.
The third segment ends with execution of the deallocation statement. Be-
cause the variable is a co-array variable, every image waits until all images
reach this statement before deallocating the variable. Otherwise, an image
might try to reference the value of a variable that no longer exists. The fourth
segment consists of other statements between the deallocation statement and
the end of program statement assuming there are no more control statements
in the program.
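A minimal sketch of a complete program with the four segments marked; it is an assumption that Listing 2.1 has exactly this shape, but the sketch is consistent with the description above:

   program First
      implicit none
      real, allocatable :: x[:]
      integer :: p, me, partner
      p  = num_images()
      me = this_image()
      allocate( x[*] )              ! segment 1 ends: implied barrier
      partner = me + 1
      if (partner > p) partner = 1
      x = me
      sync all                      ! segment 2 ends: every image has defined x
      write(*,*) me, x[partner]     ! segment 3: obtain the partner's value
      deallocate( x )               ! implied barrier before deallocation
   end program First                ! segment 4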
Execution of this program using seven images might result in the output
shown in Listing 2.7. Each image has obtained the correct value from its
partner, but output from different images appears randomly in the standard
output file. The reason for this behavior is that the run-time system executes
the output statement independently for each image. It is required to write a
complete record for each image, with no intermixing of records from different
images, and to merge records from different images into a single file. The order
in which it merges the records, however, is implementation specific, and the
order may vary from execution to execution even on the same system.
To guarantee order in an output file, the programmer may need to restrict
execution of output statements to a single image that gathers data from other
images before writing to the file in a specific order as shown, for example, in
Listing 2.8. The alternative form of the output statement avoids the need for
an additional variable to hold the value obtained from the remote image.
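A sketch of such an ordered write, with image one referencing each remote value directly inside the output statement so that no extra variable is needed; the loop and format are illustrative:

   if (me == 1) then
      do i = 1, p
         write(*,*) 'image', i, 'holds', x[i]   ! records appear in image order
      end do
   end if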
2.2 Exercises
1. Change the example program shown in Listing 2.1 in the following ways.
Chapter 3
Partition Operators
The global index set is an ordered set of n objects,
\[ O = \{ o_1 , \ldots , o_n \} . \tag{3.2} \]
The order relationship is the natural integer order, but for some applications
a permutation may change the order relationship.
By default, the Fortran language assumes the value one for the lower bound
of the index set. It allows the programmer to override this default value replac-
ing it, for example, with zero when interfacing with other languages. Changing
the default lower bound, however, usually adds little benefit for algorithm de-
sign.
Parallel applications require the programmer to partition the global index
set into local index sets. If n, the size of the global index set, is a multiple of
p, the number of partitions, the size of the local index set,
\[ m = n/p , \tag{3.3} \]
is the same for each partition. The global base index for each partition has
the value,
\[ k_0^\alpha = ( \alpha - 1 ) m , \qquad \alpha = 1 , \ldots , p , \tag{3.4} \]
and the local index set is a contiguous set of integers added to the base,
\[ \{ k_0^\alpha + 1 , \ldots , k_0^\alpha + m \} . \]
The partition of the global index set induces a partition of the vector elements,
\[ \begin{bmatrix} x_1^\alpha \\ \vdots \\ x_m^\alpha \end{bmatrix} = \begin{bmatrix} x_{k_0^\alpha + 1} \\ \vdots \\ x_{k_0^\alpha + m} \end{bmatrix} . \tag{3.7} \]
The numerical values of the vector elements are the same, but they are labeled by the local index set,
\[ L = \{ 1 , \ldots , m \} , \tag{3.8} \]
rather than by the global index set.
Partitioning a vector is an operation in linear algebra,
\[ x^\alpha = P^\alpha x , \qquad \alpha = 1 , \ldots , p . \tag{3.9} \]
The partition operator has the block form,
\[ P^\alpha = \begin{bmatrix} 0 & \cdots & I^\alpha & \cdots & 0 \end{bmatrix} , \tag{3.10} \]
with m rows and n columns. Each zero symbol represents a zero matrix of size m × m, and the symbol $I^\alpha$ represents the identity matrix of the same size.
The partition operation defined by formula (3.9), then, is the matrix-vector multiplication,
\[ \begin{bmatrix} x_{k_0^\alpha + 1} \\ \vdots \\ x_{k_0^\alpha + m} \end{bmatrix} = \begin{bmatrix} 0 & \cdots & I^\alpha & \cdots & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_{k_0^\alpha + 1} \\ \vdots \\ x_{k_0^\alpha + m} \\ \vdots \\ x_n \end{bmatrix} . \tag{3.11} \]
A global element labeled by a global index k belongs to the partition with index,
\[ \alpha = \left\lfloor \frac{k-1}{m} \right\rfloor + 1 , \tag{3.12} \]
where the notation ⌊·⌋ denotes the floor function. The same element considered
as a local element assigned to image α is labeled with a local index i according
to the formula,
\[ i = k - k_0^\alpha . \tag{3.13} \]
On the other hand, a local index i assigned to partition α maps back to the
global index k according to the formula,
\[ k = k_0^\alpha + i , \qquad i = 1 , \ldots , m . \tag{3.14} \]
The partition operator, therefore, defines a forward map from one global
index to two local indices, a partition index and a local index,
\[ k \to ( \alpha , i ) , \tag{3.15} \]
according to formulas (3.12) and (3.13), and a backward map from the two
local indices to the global index,
\[ ( \alpha , i ) \to k . \tag{3.16} \]
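A sketch of the two maps in Fortran, where integer division implements the floor function for these positive operands; the variable names are illustrative:

   alpha = (k - 1)/m + 1         ! forward map (3.12)
   i     = k - (alpha - 1)*m     ! local index (3.13), since k0 = (alpha - 1)*m
   k     = (alpha - 1)*m + i     ! backward map (3.14)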
3.2 Non-uniform partitions
When the global size n is not a multiple of the number of images p, define the remainder,
\[ r = n \bmod p . \tag{3.17} \]
When the remainder is zero, the partition is uniform and the size is the same
for all images. When the remainder is not zero, the partition size for images
with indices less than or equal to the remainder equals the ceiling value. The
partition size for images with indices greater than the remainder is one less.
A global index 1 ≤ k ≤ n maps to partition index α by the formula,
\[ \alpha = \begin{cases} \left\lfloor \frac{k-1}{m} \right\rfloor + 1 & r = 0 \\[4pt] \left\lfloor \frac{k-1}{m} \right\rfloor + 1 & k \le m r \\[4pt] \left\lfloor \frac{k-r-1}{m-1} \right\rfloor + 1 & k > m r \end{cases} \tag{3.20} \]
The global base index for each partition has the value,
\[ k_0^\alpha = \begin{cases} ( \alpha - 1 ) m_\alpha & r = 0 , \; \alpha = 1 , \ldots , p \\ ( \alpha - 1 ) m_\alpha & r > 0 , \; \alpha = 1 , \ldots , r \\ ( \alpha - 1 ) m_\alpha + r & r > 0 , \; \alpha = r+1 , \ldots , p \end{cases} \tag{3.21} \]
and the local index is,
\[ i = k - k_0^\alpha . \tag{3.22} \]
The local index maps back to the global index according to the rule,
\[ k = k_0^\alpha + i , \qquad i = 1 , \ldots , m_\alpha . \tag{3.23} \]
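A sketch of the non-uniform maps in integer arithmetic, assuming m is the ceiling value and the partition sizes of Section 3.2; the variable names are illustrative:

   r = mod(n, p)
   m = (n - 1)/p + 1                    ! ceiling(n/p)
   if (r == 0 .or. k <= m*r) then
      alpha = (k - 1)/m + 1             ! first two cases of formula (3.20)
   else
      alpha = (k - r - 1)/(m - 1) + 1   ! images beyond the remainder
   end if
   if (r == 0 .or. alpha <= r) then
      k0 = (alpha - 1)*m                ! base index, formula (3.21)
   else
      k0 = (alpha - 1)*(m - 1) + r
   end if
   i = k - k0                           ! local index, formula (3.22)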
FIGURE 3.1: Partition size based on the floor function as a function of the number of images for a fixed index set n = 1000. The solid line is a step function representing the partition size assigned to images not equal to the last image. The bullets represent the partition size assigned to the last image. (The vertical axis is logarithmic, running from 10^0 to 10^3.)
For the case n = 41, p = 9, the original definition of the partition oper-
ator yields sizes (5, 5, 5, 5, 5, 4, 4, 4, 4). The alternative definition yields sizes
(4, 4, 4, 4, 4, 4, 4, 4, 9). The disadvantage of the alternative definition is that all
images must allocate co-array variables with size 9 to accommodate the last
image even though all the other images need only size 4. In addition, since the
last image owns more data than the others, a workload imbalance may develop
as images wait for the last image. This problem becomes more important as
the number of images increases as shown in Figure 3.1.
It might be tempting to use the ceiling function in place of the floor func-
tion to define the partition size for all but the last image. Formulas for the
partition size, however, become more complicated. Cases exist where the par-
tition size is zero for the last image and something smaller than the ceiling
function for some images below the last one. This rule can still be used with
caution, but it may lead to errors and it may waste one image with nothing
to do.
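A short check of the two cases just mentioned, under the assumption that every image except the last takes the ceiling size:

   m_first = (n - 1)/p + 1          ! ceiling(n/p) on images 1 through p-1
   m_last  = n - (p - 1)*m_first    ! whatever remains on image p
   ! n = 41, p = 9: m_first = 5 and m_last = 1
   ! n = 40, p = 9: m_first = 5 and m_last = 0, an image with nothing to do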
3.3 Row-partitioned matrix-vector multiplication
Consider the matrix-vector multiplication,
\[ y = A x . \tag{3.30} \]
Applying the partition operator to both sides,
\[ P^\alpha y = P^\alpha A x , \qquad \alpha = 1 , \ldots , p , \tag{3.31} \]
yields the formula,
\[ y^\alpha = A^\alpha x , \tag{3.32} \]
for the result vector $y^\alpha$ assigned to image α where the partitioned matrix,
\[ A^\alpha = P^\alpha A , \tag{3.33} \]
consists of the subset of rows assigned to image α. Figure 3.2 shows the matrix-vector multiplication operation partitioned by rows with a one-to-one correspondence between the partition index α and the image index.
FIGURE 3.2: The row-partitioned matrix-vector multiplication, $y^\alpha = A^\alpha x$.
Once an image has initialized its piece of the matrix and the vector on the
right side, it computes its matrix-vector multiplication using only local data.
Each image executes the code independently for its own piece of the problem
with no interaction between images.
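The local computation is a single call to the intrinsic function matmul; a minimal sketch with illustrative names:

   y_local(1:m) = matmul( A_local(1:m,1:n), x(1:n) )   ! local rows times the full vector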
Listing 3.2 shows the function that computes the size of the partition
assigned to the image that invokes it. It is a direct transcription of formula
(3.19). Placing this calculation in a function of its own may seem superfluous for such a straightforward formula. But the calculation occurs frequently in a code and is subject
to errors that may be hard to find. Furthermore, the programmer can change
this function to use an alternative definition for the partition size without
changing the rest of the code. In later applications, this kind of function
becomes a procedure associated with a class of objects.
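Listing 3.2 itself is not reproduced here; a minimal sketch of such a function, consistent with the partition sizes of Section 3.2, might read:

   function partitionSize(n) result(m)
      integer, intent(in) :: n
      integer :: m, p, me, r
      p  = num_images()
      me = this_image()
      r  = mod(n, p)
      m  = (n - 1)/p + 1                 ! ceiling(n/p)
      if (r /= 0 .and. me > r) m = m - 1 ! images beyond the remainder hold one less
   end function partitionSize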
The statement,
m = (n-1)/p + 1   ! -- ceiling(n/p) --!
computes the ceiling of n/p in integer arithmetic: for positive integers, the division (n-1)/p truncates, and adding one yields the ceiling without any conversion to floating point.
Listing 3.3 shows code that uses a temporary co-array buffer to hold each
column of the matrix as image one reads one column at a time from a file.
Each image executes the sync all statement waiting for image one to finish
reading the data, and then each image pulls its piece of the matrix into its
own local memory. The second sync all statement guarantees that image
one does not read new data into the buffer before all images have obtained
their data. Notice the calculation of the global base index defined by formula
(3.21) as each image determines its piece of the current column.
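A sketch of the technique described for Listing 3.3; the unit number, the read statement, and the variable names are all illustrative assumptions:

   allocate( buffer(n)[*] )
   do j = 1, n
      if (me == 1) read(10) buffer(1:n)        ! image one reads one column
      sync all                                 ! the column is ready for all images
      A_local(1:m,j) = buffer(k0+1:k0+m)[1]    ! each image pulls its own piece
      sync all                                 ! image one may now overwrite the buffer
   end do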
The function shown in Listing 3.3 assigns image one to open a file, read the
data, and make data available to the other images. This technique is very com-
mon in parallel applications, but serialized input may become a performance
bottleneck. To avoid this problem, the programmer may want to associate
a procedure pointer to a library procedure that performs input in parallel
perhaps from a library using another programming model.
The current version of the co-array model allows only one image at a time
to open a particular file. It need not be image one, but only one at a time. Some
future version of the language may allow the programmer to use direct access
files with each row, or column, of the matrix held in a separate record. Each
image could then position itself at the records corresponding to its partition
and read from the file independently in parallel. Section A.13 contains a more
detailed description of input/output for the co-array model.
3.5 Exercises
1. For the case n = 1000 with p = 37, use the ceiling function dn/pe to
define a base partition size. Use that size for as many images as possible.
What is the size on the last image? On the second to last image?
2. Modify the code sample in Listing 3.3 so that image one sends the data
to the other images.
3. Modify the code in Listing 3.3 such that image p reads the data from
the file.
4. Each image obtains only its own piece of the result vector from the row-
partitioned matrix-vector multiplication code. How could each image
obtain the full result?
Chapter 4
Reverse Partition Operators
Corresponding to the forward partition operators, there is a set of reverse partition operators,
\[ P_\alpha , \qquad \alpha = 1 , \ldots , p , \tag{4.1} \]
that satisfy the constraint,
\[ \sum_{\alpha=1}^{p} P_\alpha P^\alpha = I , \tag{4.3} \]
where I is the n × n identity operator and n is the size of the global index set.
This constraint is called a partition of unity. It describes the recovery of
a global data structure from the local data structures. Indeed, application of
the forward operator to the global vector yields the local piece,
\[ x^\alpha = P^\alpha x , \tag{4.5} \]
and the partition of unity yields the global vector back as a sum,
\[ x = \sum_{\alpha=1}^{p} P_\alpha x^\alpha . \tag{4.6} \]
To recover the global vector, therefore, each image must apply the reverse
operator to its local piece of the vector and then it must sum together pieces
from the other images.
The partition of unity is a fundamental tool for the development of parallel
algorithms. Whenever a formula, like formula (4.6), involves partition indices
not equal to the local partition index, the formula implies communication
between images. The mathematical expression for a partitioned algorithm,
therefore, explicitly contains the interactions between processors within itself.
The definitions for these operators are not unique, but given the forward
operator as in Section 3.1,
\[ P^\alpha = \begin{bmatrix} 0 & \cdots & I_\alpha & \cdots & 0 \end{bmatrix} , \tag{4.7} \]
the reverse operator is its transpose,
\[ P_\alpha = ( P^\alpha )^T , \tag{4.8} \]
under the constraint imposed by the partition of unity. This matrix representation of the reverse operator has n rows and $m_\alpha$ columns. The symbol 0 represents a zero matrix and the matrix $I_\alpha$ is the identity matrix for partition α. Applied to a vector $x^\alpha$ of length $m_\alpha$, the reverse operator produces a vector of length n,
\[ \begin{bmatrix} 0 \\ \vdots \\ x^\alpha \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ I_\alpha \\ \vdots \\ 0 \end{bmatrix} x^\alpha , \tag{4.9} \]
with all zeros except in the part corresponding to partition α. The reader
may readily verify, by direct matrix multiplication and summation, that the
partition of unity (4.3) is satisfied for these definitions of the forward and
reverse operators.
Partition operators are not projection operators. Nor are they inverses of each other. The forward operator maps a vector of length n to a vector of length $m_\alpha$,
\[ P^\alpha : V_n \to V_{m_\alpha} . \tag{4.10} \]
The reverse operator maps a vector of length $m_\alpha$ to a vector of length n,
\[ P_\alpha : V_{m_\alpha} \to V_n . \tag{4.11} \]
The product operators,
\[ P_\alpha P^\alpha : V_n \to V_n , \tag{4.12} \]
form a set of orthogonal projection operators [79]. Summed over the index α,
these projection operators yield the identity operator. Nothing is lost during
the forward and reverse partition operations.
The forward partition operation is an example of the scatter operation,
and the reverse partition operation is an example of the gather operation.
Chapter 5 discusses these operations in more detail implemented as collective
operations.
The local vector,
\[ x^\alpha = P^\alpha x , \tag{4.16} \]
holds the piece assigned to image α. To verify the meaning of the symbol $A P_\alpha$, observe that the definition,
\[ P_\alpha = ( P^\alpha )^T , \tag{4.17} \]
implies the identity,
\[ A P_\alpha = ( P^\alpha A^T )^T . \tag{4.18} \]
The matrix $P^\alpha A^T$ on the right consists of the rows of $A^T$ assigned to partition α. These rows of the transposed matrix are just the columns of the original matrix A assigned to image α.
FIGURE 4.1: The column-partitioned matrix-vector multiplication, $y_\alpha = A_\alpha x^\alpha$.
Figure 4.1 shows how each image computes a partial result vector,
\[ y_\alpha = A_\alpha x^\alpha , \tag{4.19} \]
with length equal to the full dimension. The images do not interact, but to
obtain the full result, they must sum together the partial results. Listing
4.1 shows code for the summation. The code employs staggered references to
remote images to reduce pressure on local memory should every image try to
access the same memory at the same time.
buffer = matmul(A, x)                      ! local partial result of length n
y(1:n) = buffer(1:n)                       ! start the sum with the local piece
sync all                                   ! all partial results are now defined
alpha = me
do q = 1, p-1
   alpha = alpha + 1
   if (alpha > p) alpha = alpha - p        ! staggered order of remote references
   y(1:n) = y(1:n) + buffer(1:n)[alpha]
end do
and use the associative property of multiplication to write the dot product in partitioned form,
\[ d = \sum_{\alpha=1}^{p} ( u^T P_\alpha )( P^\alpha v ) . \tag{4.22} \]
From definition (4.17) for the reverse partition operator, the dot product becomes,
\[ d = \sum_{\alpha=1}^{p} ( P^\alpha u )^T ( P^\alpha v ) . \tag{4.23} \]
Define the partitioned vectors, $u^\alpha = P^\alpha u$ and $v^\alpha = P^\alpha v$, and write,
\[ d = \sum_{\alpha=1}^{p} ( u^\alpha )^T v^\alpha . \tag{4.24} \]
This formula says that each image first computes its local dot product,
\[ d^\alpha = ( u^\alpha )^T v^\alpha , \tag{4.25} \]
and then participates in a sum,
\[ d = \sum_{\alpha=1}^{p} d^\alpha , \tag{4.26} \]
across all images such that they all obtain the same value of the dot product.
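In Fortran 2018 the final sum can be written with the intrinsic collective co_sum; a minimal sketch with illustrative names (the book instead develops its own sum-to-all operation in Chapter 5):

   d = dot_product( u_local(1:m), v_local(1:m) )   ! local dot product (4.25)
   call co_sum(d)                                  ! every image obtains the global value (4.26)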
The extended operators mix the elements of a vector before cutting it into
pieces,
\[ v^\alpha = Q^\alpha v = P^\alpha Q v . \tag{4.32} \]
The product operators,
\[ Q_\alpha Q^\alpha : V_n \to V_n , \tag{4.33} \]
still define a set of orthogonal projection operators. Summing them together
unmixes the elements of the partitioned vectors and reassembles the original
vector. The analysis of partitioned algorithms does not change; only the spe-
cific implementation changes because the index maps are more complicated.
The full generality of these extended operators is seldom used in parallel
codes. One particular case, however, where the unitary operator is a permu-
tation,
\[ Q = \Pi , \tag{4.34} \]
is quite common. It occurs in Chapter 13 for cyclic distributions and again in
Chapter 14 for finite element meshes.