Text Analytics Toolbox™
User's Guide

R2020a
How to Contact MathWorks

Latest news: www.mathworks.com

Sales and services: www.mathworks.com/sales_and_services

User community: www.mathworks.com/matlabcentral

Technical support: www.mathworks.com/support/contact_us

Phone: 508-647-7000

The MathWorks, Inc.


1 Apple Hill Drive
Natick, MA 01760-2098
Text Analytics Toolbox™ User's Guide
© COPYRIGHT 2017–2020 by The MathWorks, Inc.
The software described in this document is furnished under a license agreement. The software may be used or copied
only under the terms of the license agreement. No part of this manual may be photocopied or reproduced in any form
without prior written consent from The MathWorks, Inc.
FEDERAL ACQUISITION: This provision applies to all acquisitions of the Program and Documentation by, for, or through
the federal government of the United States. By accepting delivery of the Program or Documentation, the government
hereby agrees that this software or documentation qualifies as commercial computer software or commercial computer
software documentation as such terms are used or defined in FAR 12.212, DFARS Part 227.72, and DFARS 252.227-7014.
Accordingly, the terms and conditions of this Agreement and only those rights specified in this Agreement, shall pertain
to and govern the use, modification, reproduction, release, performance, display, and disclosure of the Program and
Documentation by the federal government (or other entity acquiring for or through the federal government) and shall
supersede any conflicting contractual terms or conditions. If this License fails to meet the government's needs or is
inconsistent in any respect with federal procurement law, the government agrees to return the Program and
Documentation, unused, to The MathWorks, Inc.
Trademarks
MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See
www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be
trademarks or registered trademarks of their respective holders.
Patents
MathWorks products are protected by one or more U.S. patents. Please see www.mathworks.com/patents for
more information.
Revision History
March 2018 Online Only New for Version 1.1 (Release 2018a)
September 2018 Online Only Revised for Version 1.2 (Release 2018b)
March 2019 Online Only Revised for Version 1.3 (Release 2019a)
September 2019 Online Only Revised for Version 1.4 (Release 2019b)
March 2020 Online Only Revised for Version 1.5 (Release 2020a)
Contents

Text Data Preparation


1
Extract Text Data from Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2

Prepare Text Data for Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10

Parse HTML and Extract Text Content . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-17

Correct Spelling in Documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-21

Create Extension Dictionary for Spelling Correction . . . . . . . . . . . . . . . . 1-23

Create Custom Spelling Correction Function Using Edit Distance Searchers . . . . . . . . . 1-27

Data Sets for Text Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-33

Modeling and Prediction


2
Create Simple Text Model for Classification . . . . . . . . . . . . . . . . . . . . . . . . 2-2

Analyze Text Data Using Multiword Phrases . . . . . . . . . . . . . . . . . . . . . . . . 2-7

Analyze Text Data Using Topic Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-13

Choose Number of Topics for LDA Model . . . . . . . . . . . . . . . . . . . . . . . . . 2-19

Compare LDA Solvers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-23

Create Co-occurrence Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-28

Analyze Text Data Containing Emojis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-32

Create Simple Preprocessing Function . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-38

Train a Sentiment Classifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-41

........................................................... 2-48

Classify Text Data Using Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . 2-49

Classify Text Data Using Convolutional Neural Network . . . . . . . . . . . . . 2-57

Multilabel Text Classification Using Deep Learning . . . . . . . . . . . . . . . . . 2-66

Sequence-to-Sequence Translation Using Attention . . . . . . . . . . . . . . . . 2-86

Classify Out-of-Memory Text Data Using Deep Learning . . . . . . . . . . . . 2-106

Pride and Prejudice and MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-112

Word-By-Word Text Generation Using Deep Learning . . . . . . . . . . . . . . 2-118

Classify Out-of-Memory Text Data Using Custom Mini-Batch Datastore . . . . . . . . . . . 2-124

Display and Presentation


3
Visualize Text Data Using Word Clouds . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2

Visualize Word Embeddings Using Text Scatter Plots . . . . . . . . . . . . . . . . 3-8

Language Support
4
Language Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
Language-Independent Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3

Japanese Language Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5


Tokenization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Part of Speech Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Named Entity Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Stop Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
Lemmatization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
Language-Independent Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8

Analyze Japanese Text Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-10

German Language Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-20


Tokenization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-20
Sentence Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-20
Part of Speech Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-21
Named Entity Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-22
Stop Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-23
Stemming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-23
Language-Independent Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-24

Analyze German Text Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-25

Korean Language Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36
Tokenization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36
Part of Speech Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36
Named Entity Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36
Stop Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36
Lemmatization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36
Language-Independent Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36

Language-Independent Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-38


Word and N-Gram Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-38
Modeling and Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-38

Glossary
5
Text Analytics Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Documents and Tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
Modeling and Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5

1

Text Data Preparation

• “Extract Text Data from Files” on page 1-2
• “Prepare Text Data for Analysis” on page 1-10
• “Parse HTML and Extract Text Content” on page 1-17
• “Correct Spelling in Documents” on page 1-21
• “Create Extension Dictionary for Spelling Correction” on page 1-23
• “Create Custom Spelling Correction Function Using Edit Distance Searchers” on page 1-27
• “Data Sets for Text Analytics” on page 1-33
1 Text Data Preparation

Extract Text Data from Files


This example shows how to extract the text data from text, HTML, Microsoft® Word, PDF, CSV, and
Microsoft Excel® files and import it into MATLAB® for analysis.

Usually, the easiest way to import text data into MATLAB is to use the extractFileText function.
This function extracts the text data from text, PDF, HTML, and Microsoft Word files. To import text
from CSV and Microsoft Excel files, use readtable. To extract text from HTML code, use
extractHTMLText. To read data from PDF forms, use readPDFFormData.

Text File

Extract the text from sonnets.txt using extractFileText. The file sonnets.txt contains
Shakespeare's sonnets in plain text.

filename = "sonnets.txt";
str = extractFileText(filename);

View the first sonnet by extracting the text between the two titles "I" and "II".

start = " I" + newline;
fin = " II";
sonnet1 = extractBetween(str,start,fin)

sonnet1 =
"
From fairest creatures we desire increase,
That thereby beauty's rose might never die,
But as the riper should by time decease,
His tender heir might bear his memory:
But thou, contracted to thine own bright eyes,
Feed'st thy light's flame with self-substantial fuel,
Making a famine where abundance lies,
Thy self thy foe, to thy sweet self too cruel:
Thou that art now the world's fresh ornament,
And only herald to the gaudy spring,
Within thine own bud buriest thy content,
And tender churl mak'st waste in niggarding:
Pity the world, or else this glutton be,
To eat the world's due, by the grave and thee.

"
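The extracted text keeps the blank lines that surrounded it in the file. As a minimal sketch, assuming a made-up three-line string in place of the contents of sonnets.txt, the surrounding whitespace can be trimmed with strip and the result split into lines with splitlines (both are base MATLAB string functions):

```matlab
% Hypothetical stand-in for the contents of sonnets.txt.
str = " I" + newline + newline + ...
    "First line," + newline + "second line." + newline + newline + ...
    " II" + newline + newline + "Other text.";

% Extract the text between the two titles, as in the example above.
start = " I" + newline;
fin = " II";
sonnet = extractBetween(str,start,fin);

% Remove leading and trailing whitespace, including newline characters.
sonnet = strip(sonnet);

% Split the text into one string per line.
sonnetLines = splitlines(sonnet);
```

Working with one string per line can make it easier to count or index individual lines of the extracted text.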

Microsoft Word Document

Extract the text from exampleSonnets.docx using extractFileText. The file exampleSonnets.docx
contains Shakespeare's sonnets in a Microsoft Word document.

filename = "exampleSonnets.docx";
str = extractFileText(filename);

View the second sonnet by extracting the text between the two titles "II" and "III".

start = " II" + newline;
fin = " III";
sonnet2 = extractBetween(str,start,fin)


sonnet2 =
"
When forty winters shall besiege thy brow,

And dig deep trenches in thy beauty's field,

Thy youth's proud livery so gazed on now,

Will be a tatter'd weed of small worth held:

Then being asked, where all thy beauty lies,

Where all the treasure of thy lusty days;

To say, within thine own deep sunken eyes,

Were an all-eating shame, and thriftless praise.

How much more praise deserv'd thy beauty's use,

If thou couldst answer 'This fair child of mine

Shall sum my count, and make my old excuse,'

Proving his beauty by succession thine!

This were to be new made when thou art old,

And see thy blood warm when thou feel'st it cold.

"

The example Microsoft Word document uses two newline characters between each line. To replace
these characters with a single newline character, use the replace function.

sonnet2 = replace(sonnet2,[newline newline],newline)

sonnet2 =
"
When forty winters shall besiege thy brow,
And dig deep trenches in thy beauty's field,
Thy youth's proud livery so gazed on now,
Will be a tatter'd weed of small worth held:
Then being asked, where all thy beauty lies,
Where all the treasure of thy lusty days;
To say, within thine own deep sunken eyes,
Were an all-eating shame, and thriftless praise.
How much more praise deserv'd thy beauty's use,
If thou couldst answer 'This fair child of mine
Shall sum my count, and make my old excuse,'
Proving his beauty by succession thine!
This were to be new made when thou art old,
And see thy blood warm when thou feel'st it cold.
"
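The replace call above handles runs of exactly two newline characters. When the number of blank lines between lines varies, a regular expression is one possible alternative. A minimal sketch, assuming a made-up input string rather than the example document:

```matlab
% Hypothetical text with runs of two and three newline characters.
txt = "line one" + newline + newline + "line two" + ...
    newline + newline + newline + "line three";

% Collapse every run of one or more newline characters into a single newline.
% In the pattern, \n is interpreted by the regular expression engine.
txt = regexprep(txt,"\n+",string(newline));
```

This is a sketch of a different technique from the one shown in the example; for text that uses a fixed number of newline characters, replace is simpler.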


PDF Files

Extract text from PDF documents and data from PDF forms.

PDF Document

Extract the text from exampleSonnets.pdf using extractFileText. The file exampleSonnets.pdf
contains Shakespeare's sonnets in a PDF.

filename = "exampleSonnets.pdf";
str = extractFileText(filename);

View the third sonnet by extracting the text between the two titles "III" and "IV". This PDF has a
space before each newline character.

start = " III " + newline;
fin = "IV";
sonnet3 = extractBetween(str,start,fin)

sonnet3 =
"
Look in thy glass and tell the face thou viewest
Now is the time that face should form another;
Whose fresh repair if now thou not renewest,
Thou dost beguile the world, unbless some mother.
For where is she so fair whose unear'd womb
Disdains the tillage of thy husbandry?
Or who is he so fond will be the tomb,
Of his self-love to stop posterity?
Thou art thy mother's glass and she in thee
Calls back the lovely April of her prime;
So thou through windows of thine age shalt see,
Despite of wrinkles this thy golden time.
But if thou live, remember'd not to be,
Die single and thine image dies with thee.

"

PDF Form

To read text data from PDF forms, use readPDFFormData. The function returns a struct containing
the data from the PDF form fields.

filename = "weatherReportForm1.pdf";
data = readPDFFormData(filename)

data = struct with fields:


event_type: "Thunderstorm Wind"
event_narrative: "Large tree down between Plantersville and Nettleton."

HTML

Extract text from HTML files, HTML code, and the web.


HTML File

To extract text data from a saved HTML file, use extractFileText.

filename = "exampleSonnets.html";
str = extractFileText(filename);

View the fourth sonnet by extracting the text between the two titles "IV" and "V".

start = newline + "IV" + newline;
fin = newline + "V" + newline;
sonnet4 = extractBetween(str,start,fin)

sonnet4 =
"
Unthrifty loveliness, why dost thou spend
Upon thy self thy beauty's legacy?
Nature's bequest gives nothing, but doth lend,
And being frank she lends to those are free:
Then, beauteous niggard, why dost thou abuse
The bounteous largess given thee to give?
Profitless usurer, why dost thou use
So great a sum of sums, yet canst not live?
For having traffic with thy self alone,
Thou of thy self thy sweet self dost deceive:
Then how when nature calls thee to be gone,
What acceptable audit canst thou leave?
Thy unused beauty must be tombed with thee,
Which, used, lives th' executor to be.
"

HTML Code

To extract text data from a string containing HTML code, use extractHTMLText.

code = "<html><body><h1>THE SONNETS</h1><p>by William Shakespeare</p></body></html>";
str = extractHTMLText(code)

str =
"THE SONNETS

by William Shakespeare"

From the Web

To extract text data from a web page, first read the HTML code using webread, and then use
extractHTMLText.

url = "https://www.mathworks.com/help/textanalytics";
code = webread(url);
str = extractHTMLText(code)

str =
'Text Analytics Toolbox™ provides algorithms and visualizations for preprocessing, analyzing,

Text Analytics Toolbox includes tools for processing raw text from sources such as equipment

Using machine learning techniques such as LSA, LDA, and word embeddings, you can find cluste

Parse HTML Code

To find particular elements of HTML code, parse the code using htmlTree and use findElement.
Parse the HTML code and find all the hyperlinks. The hyperlinks are nodes with element name "A".

tree = htmlTree(code);
selector = "A";
subtrees = findElement(tree,selector);

View the first 10 subtrees and extract the text using extractHTMLText.

subtrees(1:10)

ans =
10×1 htmlTree:

<A class="svg_link navbar-brand" href="https://www.mathworks.com?s_tid=gn_logo"><IMG alt="Mat
<A href="https://www.mathworks.com/products.html?s_tid=gn_ps">Products</A>
<A href="https://www.mathworks.com/solutions.html?s_tid=gn_sol">Solutions</A>
<A href="https://www.mathworks.com/academia.html?s_tid=gn_acad">Academia</A>
<A href="https://www.mathworks.com/support.html?s_tid=gn_supp">Support</A>
<A href="https://www.mathworks.com/matlabcentral/?s_tid=gn_mlc">Community</A>
<A href="https://www.mathworks.com/company/events.html?s_tid=gn_ev">Events</A>
<A href="https://www.mathworks.com/company/aboutus/contact_us.html?s_tid=gn_cntus">Contact Us
<A href="https://www.mathworks.com/products/get-matlab.html?s_tid=gn_getml">Get MATLAB</A>
<A class="svg_link pull-left" href="https://www.mathworks.com?s_tid=gn_logo"><IMG alt="MathWo

str = extractHTMLText(subtrees);

View the extracted text of the first 10 hyperlinks.

str(1:10)

ans = 10×1 string


""
"Products"
"Solutions"
"Academia"
"Support"
"Community"
"Events"
"Contact Us"
"Get MATLAB"
""

To get the link targets, use getAttribute and specify the attribute "href" (hyperlink reference).
Get the link targets of the first 10 subtrees.

attr = "href";
str = getAttribute(subtrees(1:10),attr)

str = 10×1 string


"https://www.mathworks.com?s_tid=gn_logo"
"https://www.mathworks.com/products.html?s_tid=gn_ps"
"https://www.mathworks.com/solutions.html?s_tid=gn_sol"
"https://www.mathworks.com/academia.html?s_tid=gn_acad"
"https://www.mathworks.com/support.html?s_tid=gn_supp"
"https://www.mathworks.com/matlabcentral/?s_tid=gn_mlc"
"https://www.mathworks.com/company/events.html?s_tid=gn_ev"
"https://www.mathworks.com/company/aboutus/contact_us.html?s_tid=gn_cntus"
"https://www.mathworks.com/products/get-matlab.html?s_tid=gn_getml"
"https://www.mathworks.com?s_tid=gn_logo"
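The parsing steps in this section can also be tried without a network connection. The following is a minimal sketch, assuming a made-up HTML snippet with two hyperlinks, that collects the link text and link targets into one table. The variable names and the snippet are illustrative only.

```matlab
% Hypothetical HTML snippet with two hyperlinks.
code = "<html><body><p>Read the " + ...
    "<a href=""https://www.mathworks.com/help"">documentation</a> or visit the " + ...
    "<a href=""https://www.mathworks.com/matlabcentral"">community</a>.</p></body></html>";

% Parse the code and find the hyperlink elements, as in the steps above.
tree = htmlTree(code);
links = findElement(tree,"A");

% Extract the visible text and the link target of each hyperlink.
linkText = extractHTMLText(links);
linkTargets = getAttribute(links,"href");

% Collect both in a table with one row per hyperlink.
linkTable = table(linkText,linkTargets,'VariableNames',{'Text','Target'});
```

Keeping the text and the target side by side makes it easier to filter links, for example to discard entries whose text is empty.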

CSV and Microsoft Excel Files

To extract text data from CSV and Microsoft Excel files, use readtable and extract the text data
from the table that it returns.

Extract the table data from factoryReports.csv using the readtable function and view the first
few rows of the table.

T = readtable('factoryReports.csv','TextType','string');
head(T)

ans=8×5 table
                                 Description                                       Category
    _____________________________________________________________________    ____________________

    "The materials get jammed in the mixer."                                 "Mechanical Failure"
    "The controller software keeps crashing."                                "Software Failure"
    "Some drips of liquid are appearing underneath the assembler."           "Leak"
    "There are some high pitched electrical sounds emitted by the mixer."    "Electronic Failure"
    "Severe coolant leak underneath mixer."                                  "Leak"
    "The mixer is smoking during operation."                                 "Electronic Failure"
    "The scanner spools are sometimes jamming."                              "Mechanical Failure"
    "Some liquid pooling underneath output of assembler."                    "Leak"

Extract the text data from the Description column and view the first few strings.

str = T.Description;
str(1:10)

ans = 10×1 string


"The materials get jammed in the mixer."
"The controller software keeps crashing."
"Some drips of liquid are appearing underneath the assembler."
"There are some high pitched electrical sounds emitted by the mixer."
"Severe coolant leak underneath mixer."
"The mixer is smoking during operation."
"The scanner spools are sometimes jamming."
"Some liquid pooling underneath output of assembler."
"The products occasionally leave the scanner cracked."
"The sorter motor keeps getting jammed."
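Because readtable returns a table, the text can be filtered by the other variables before analysis. A minimal sketch, assuming a small hand-built table with the same Description and Category variables as the example file (the three rows are copied from the output above; the filter itself is illustrative):

```matlab
% Hypothetical miniature of the table returned by readtable.
T = table( ...
    ["The materials get jammed in the mixer."; ...
     "Severe coolant leak underneath mixer."; ...
     "Some liquid pooling underneath output of assembler."], ...
    ["Mechanical Failure"; "Leak"; "Leak"], ...
    'VariableNames',{'Description','Category'});

% Logical index of the rows labeled "Leak".
isLeak = T.Category == "Leak";

% Text descriptions of the leak reports only.
leakReports = T.Description(isLeak);
```

The comparison with == works here because the 'TextType','string' option in the example imports text columns as string arrays.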

Extract Text from Multiple Files

If your text data is contained in multiple files in a folder, then you can import the text data into
MATLAB using a file datastore.


Create a file datastore for the example sonnet text files. The example files are named
"exampleSonnetN.txt", where N is the number of the sonnet. Specify the file name using the
wildcard "*" to find all file names of this structure. To specify the read function to be
extractFileText, input this function to fileDatastore using a function handle.

location = fullfile(matlabroot,"examples","textanalytics","data","exampleSonnet*.txt");
fds = fileDatastore(location,'ReadFcn',@extractFileText)

fds =
FileDatastore with properties:

Files: {
' ...\matlab\examples\textanalytics\data\exampleSonnet1.txt';
' ...\matlab\examples\textanalytics\data\exampleSonnet2.txt';
' ...\matlab\examples\textanalytics\data\exampleSonnet3.txt'
... and 2 more
}
Folders: {
' ...\matlab\examples\textanalytics\data'
}
UniformRead: 0
ReadMode: 'file'
BlockSize: Inf
PreviewFcn: @extractFileText
SupportedOutputFormats: ["txt" "csv" "xlsx" "xls" "parquet" "parq" "png"
ReadFcn: @extractFileText
AlternateFileSystemRoots: {}

Loop over the files in the datastore and read each text file.

str = [];
while hasdata(fds)
    textData = read(fds);
    str = [str; textData];
end

View the extracted text.

str

str = 5×1 string


" From fairest creatures we desire increase,↵ That thereby beauty's rose might never die,↵
" When forty winters shall besiege thy brow,↵ And dig deep trenches in thy beauty's field,↵
" Look in thy glass and tell the face thou viewest↵ Now is the time that face should form a
" Unthrifty loveliness, why dost thou spend↵ Upon thy self thy beauty's legacy?↵ Nature's
"from fairest creatures we desire increase that thereby beautys rose might never die but as t
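The loop above grows str on every iteration. As a sketch of one alternative, the output can be preallocated from the number of files in the datastore. The temporary folder and the note*.txt file names below are hypothetical and exist only to make the sketch self-contained.

```matlab
% Create three small text files in a temporary folder (hypothetical data).
folder = tempname;
mkdir(folder);
for k = 1:3
    fid = fopen(char(fullfile(folder,"note" + k + ".txt")),'w');
    fprintf(fid,'Text of note %d.',k);
    fclose(fid);
end

% Create a file datastore that reads each file using extractFileText.
fds = fileDatastore(fullfile(folder,"note*.txt"),'ReadFcn',@extractFileText);

% Preallocate one string per file, then fill the array while reading.
numFiles = numel(fds.Files);
str = strings(numFiles,1);
idx = 0;
while hasdata(fds)
    idx = idx + 1;
    str(idx) = read(fds);
end

% Remove the temporary files.
rmdir(folder,'s');
```

For a handful of files the difference is negligible; preallocation mainly helps when the folder contains many files.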

See Also
extractFileText | extractHTMLText | readPDFFormData | tokenizedDocument

Related Examples
• “Prepare Text Data for Analysis” on page 1-10
• “Create Simple Text Model for Classification” on page 2-2

• “Visualize Text Data Using Word Clouds” on page 3-2
• “Analyze Text Data Containing Emojis” on page 2-32
• “Analyze Text Data Using Topic Models” on page 2-13
• “Analyze Text Data Using Multiword Phrases” on page 2-7
• “Classify Text Data Using Deep Learning” on page 2-49
• “Train a Sentiment Classifier” on page 2-41


Prepare Text Data for Analysis


This example shows how to create a function which cleans and preprocesses text data for analysis.

Text data can be large and can contain a lot of noise that negatively affects statistical analysis. For
example, text data can contain the following:

• Variations in case, for example "new" and "New"


• Variations in word forms, for example "walk" and "walking"
• Words which add noise, for example stop words such as "the" and "of"
• Punctuation and special characters
• HTML and XML tags

These word clouds illustrate word frequency analysis applied to some raw text data from factory
reports, and a preprocessed version of the same text data.

Load and Extract Text Data

Load the example data. The file factoryReports.csv contains factory reports, including a text
description and categorical labels for each event.

filename = "factoryReports.csv";
data = readtable(filename,'TextType','string');


Extract the text data from the field Description, and the label data from the field
Category.

textData = data.Description;
labels = data.Category;
textData(1:10)

ans = 10×1 string


"Items are occasionally getting stuck in the scanner spools."
"Loud rattling and banging sounds are coming from assembler pistons."
"There are cuts to the power when starting the plant."
"Fried capacitors in the assembler."
"Mixer tripped the fuses."
"Burst pipe in the constructing agent is spraying coolant."
"A fuse is blown in the mixer."
"Things continue to tumble off of the belt."
"Falling items from the conveyor belt."
"The scanner reel is split, it will soon begin to curve."

Create Tokenized Documents

Create an array of tokenized documents.

cleanedDocuments = tokenizedDocument(textData);
cleanedDocuments(1:10)

ans =
10×1 tokenizedDocument:

10 tokens: Items are occasionally getting stuck in the scanner spools .


11 tokens: Loud rattling and banging sounds are coming from assembler pistons .
11 tokens: There are cuts to the power when starting the plant .
6 tokens: Fried capacitors in the assembler .
5 tokens: Mixer tripped the fuses .
10 tokens: Burst pipe in the constructing agent is spraying coolant .
8 tokens: A fuse is blown in the mixer .
9 tokens: Things continue to tumble off of the belt .
7 tokens: Falling items from the conveyor belt .
13 tokens: The scanner reel is split , it will soon begin to curve .
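The token counts shown in the display can also be obtained programmatically. A minimal sketch, assuming two of the report strings from the output above as input, using doclength to count tokens and string to list the tokens of a single document:

```matlab
% Two of the example reports, entered by hand for illustration.
textData = ["Mixer tripped the fuses."; "A fuse is blown in the mixer."];
documents = tokenizedDocument(textData);

% Number of tokens in each document (punctuation counts as a token).
numTokens = doclength(documents);

% Tokens of the first document as a string array.
firstTokens = string(documents(1));
```

The counts include the final period of each sentence, which is why punctuation is erased in a later step.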

To improve lemmatization, add part-of-speech details to the documents using
addPartOfSpeechDetails. Use the addPartOfSpeechDetails function before removing stop words and
lemmatizing.

cleanedDocuments = addPartOfSpeechDetails(cleanedDocuments);
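The added details are not visible in the document display. As a sketch, assuming one of the report sentences as input, the tokenDetails function returns a table in which the part-of-speech tags can be inspected:

```matlab
% One example report, entered by hand for illustration.
documents = tokenizedDocument("The scanner reel is split, it will soon begin to curve.");
documents = addPartOfSpeechDetails(documents);

% Table with one row per token, including the PartOfSpeech variable.
tdetails = tokenDetails(documents);

% Keep only the token text and its part-of-speech tag.
posTable = tdetails(:,{'Token','PartOfSpeech'});
```

Inspecting the tags is a quick way to check how a token such as "split" was classified before lemmatizing it.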

Words like "a", "and", "to", and "the" (known as stop words) can add noise to data. Remove a list of
stop words using the removeStopWords function. Use the removeStopWords function before using
the normalizeWords function.

cleanedDocuments = removeStopWords(cleanedDocuments);
cleanedDocuments(1:10)

ans =
10×1 tokenizedDocument:

7 tokens: Items occasionally getting stuck scanner spools .


8 tokens: Loud rattling banging sounds coming assembler pistons .
5 tokens: cuts power starting plant .
4 tokens: Fried capacitors assembler .
4 tokens: Mixer tripped fuses .
7 tokens: Burst pipe constructing agent spraying coolant .
4 tokens: fuse blown mixer .
6 tokens: Things continue tumble off belt .
5 tokens: Falling items conveyor belt .
8 tokens: scanner reel split , soon begin curve .
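The list that removeStopWords uses by default is available through the stopWords function, and removeWords removes any list of words that you supply. A minimal sketch, assuming a made-up lowercase sentence and treating "plant" as a hypothetical domain-specific stop word:

```matlab
% Default list of English stop words.
words = stopWords;

% Check whether individual words are in the list.
isStop = ismember(["the" "mixer"],words);

% Extend the list with a hypothetical domain-specific word.
customStopWords = [words(:); "plant"];

% Remove the extended list from a lowercase example document.
documents = tokenizedDocument("the mixer tripped the fuses in the plant");
documents = removeWords(documents,customStopWords);
```

The example text is lowercase because removeWords matches case by default, whereas removeStopWords ignores case by default.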

Lemmatize the words using normalizeWords.


cleanedDocuments = normalizeWords(cleanedDocuments,'Style','lemma');
cleanedDocuments(1:10)

ans =
10×1 tokenizedDocument:

7 tokens: items occasionally get stuck scanner spool .


8 tokens: loud rattle bang sound come assembler piston .
5 tokens: cut power start plant .
4 tokens: fry capacitor assembler .
4 tokens: mixer trip fuse .
7 tokens: burst pipe constructing agent spray coolant .
4 tokens: fuse blow mixer .
6 tokens: thing continue tumble off belt .
5 tokens: fall item conveyor belt .
8 tokens: scanner reel split , soon begin curve .
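Lemmatization is one of two normalization styles that normalizeWords supports; the other is stemming. A minimal sketch, assuming one of the report sentences as input, that places the original tokens, their stems, and their lemmas side by side for comparison:

```matlab
% One example report, entered by hand for illustration.
documents = tokenizedDocument("The scanner spools are sometimes jamming");

% Stem the words.
stemmed = normalizeWords(documents,'Style','stem');

% Lemmatize the words, adding part-of-speech details first.
tagged = addPartOfSpeechDetails(documents);
lemmatized = normalizeWords(tagged,'Style','lemma');

% One row per token for side-by-side comparison.
comparison = table(string(documents)',string(stemmed)',string(lemmatized)', ...
    'VariableNames',{'Original','Stem','Lemma'});
```

Stems are not always dictionary words, while lemmas are, which is one reason this example uses the 'lemma' style.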

Erase the punctuation from the documents.


cleanedDocuments = erasePunctuation(cleanedDocuments);
cleanedDocuments(1:10)

ans =
10×1 tokenizedDocument:

6 tokens: items occasionally get stuck scanner spool


7 tokens: loud rattle bang sound come assembler piston
4 tokens: cut power start plant
3 tokens: fry capacitor assembler
3 tokens: mixer trip fuse
6 tokens: burst pipe constructing agent spray coolant
3 tokens: fuse blow mixer
5 tokens: thing continue tumble off belt
4 tokens: fall item conveyor belt
6 tokens: scanner reel split soon begin curve

Remove words with 2 or fewer characters, and words with 15 or more characters.
cleanedDocuments = removeShortWords(cleanedDocuments,2);
cleanedDocuments = removeLongWords(cleanedDocuments,15);
cleanedDocuments(1:10)

ans =
10×1 tokenizedDocument:

6 tokens: items occasionally get stuck scanner spool


7 tokens: loud rattle bang sound come assembler piston
4 tokens: cut power start plant
3 tokens: fry capacitor assembler
3 tokens: mixer trip fuse
6 tokens: burst pipe constructing agent spray coolant
3 tokens: fuse blow mixer
5 tokens: thing continue tumble off belt
4 tokens: fall item conveyor belt
6 tokens: scanner reel split soon begin curve
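
The first ten documents are unchanged by this step because none of their words fall outside the length limits. As a small sketch of the thresholds in action, the following code applies the same two calls to an illustrative sentence (an assumption, not from the data set) that contains very short and very long words.

```matlab
% Illustrative sentence containing short and long words.
documents = tokenizedDocument("an ox ate a supercalifragilisticexpialidocious meal");

% Remove words with 2 or fewer characters ("an", "ox", "a").
documents = removeShortWords(documents,2);

% Remove words with 15 or more characters (the 34-character word).
documents = removeLongWords(documents,15);

% Convert the remaining tokens to a string array.
words = string(documents)
```

Only the words "ate" and "meal" should remain.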

Create Bag-of-Words Model

Create a bag-of-words model.


cleanedBag = bagOfWords(cleanedDocuments)

cleanedBag =
bagOfWords with properties:

Counts: [480×352 double]


Vocabulary: [1×352 string]
NumWords: 352
NumDocuments: 480

Remove words that do not appear more than two times in the bag-of-words model.
cleanedBag = removeInfrequentWords(cleanedBag,2)

cleanedBag =
bagOfWords with properties:

Counts: [480×163 double]


Vocabulary: [1×163 string]
NumWords: 163
NumDocuments: 480

Some preprocessing steps such as removeInfrequentWords leave empty documents in the bag-of-words
model. To ensure that no empty documents remain in the bag-of-words model after preprocessing, use
removeEmptyDocuments as the last step.

Remove empty documents from the bag-of-words model and the corresponding labels from labels.
[cleanedBag,idx] = removeEmptyDocuments(cleanedBag);
labels(idx) = [];
cleanedBag

cleanedBag =
bagOfWords with properties:

Counts: [480×163 double]


Vocabulary: [1×163 string]
NumWords: 163
NumDocuments: 480
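
In this data set no documents are empty, so NumDocuments stays at 480. The following sketch uses a small illustrative collection (the text and the label names are assumptions) to show how the index output keeps the labels aligned with the remaining documents.

```matlab
% Illustrative documents; the second one is empty.
documents = tokenizedDocument(["mixer trip fuse"; ""; "fuse blow mixer"]);
labels = categorical(["Electrical"; "Unknown"; "Electrical"]);

bag = bagOfWords(documents);

% idx holds the indices of the documents that were removed.
[bag,idx] = removeEmptyDocuments(bag);

% Delete the labels of the removed documents.
labels(idx) = [];

bag.NumDocuments
numel(labels)
```

After this step the bag-of-words model and labels both describe the same two documents.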


Create a Preprocessing Function

It can be useful to create a function which performs preprocessing so you can prepare different
collections of text data in the same way. For example, you can use a function so that you can
preprocess new data using the same steps as the training data.

Create a function which tokenizes and preprocesses the text data so it can be used for analysis. The
function preprocessText performs the following steps:

1 Tokenize the text using tokenizedDocument.


2 Remove a list of stop words (such as "and", "of", and "the") using removeStopWords.
3 Lemmatize the words using normalizeWords.
4 Erase punctuation using erasePunctuation.
5 Remove words with 2 or fewer characters using removeShortWords.
6 Remove words with 15 or more characters using removeLongWords.

Use the example preprocessing function preprocessText to prepare the text data.

newText = "The sorting machine is making lots of loud noises.";


newDocuments = preprocessText(newText)

newDocuments =
tokenizedDocument:

6 tokens: sorting machine make lot loud noise
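
One reason to preprocess new data in the same way as the training data is so that it can be encoded against the training vocabulary. The following is a minimal sketch of that idea using the encode function on a small stand-in bag-of-words model. The variable names trainBag and newCounts and the sample text are assumptions for illustration.

```matlab
% Small stand-in for a bag-of-words model built from preprocessed training data.
trainDocuments = tokenizedDocument(["mixer trip fuse"; "loud noise assembler"]);
trainBag = bagOfWords(trainDocuments);

% Stand-in for new text that has been preprocessed with the same steps.
newDocuments = tokenizedDocument("loud noise mixer conveyor");

% Count the words of the new document using the training vocabulary.
% Words that are not in the vocabulary ("conveyor") are not counted.
newCounts = encode(trainBag,newDocuments);
size(newCounts)
```

The result has one row per new document and one column per word in the training vocabulary.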

Compare with Raw Data

Compare the preprocessed data with the raw data.

rawDocuments = tokenizedDocument(textData);
rawBag = bagOfWords(rawDocuments)

rawBag =
bagOfWords with properties:

Counts: [480×555 double]


Vocabulary: [1×555 string]
NumWords: 555
NumDocuments: 480

Calculate the reduction in data.

numWordsCleaned = cleanedBag.NumWords;
numWordsRaw = rawBag.NumWords;
reduction = 1 - numWordsCleaned/numWordsRaw

reduction = 0.7063
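
As a quick check of the arithmetic, the following sketch repeats the calculation with the vocabulary sizes reported above (163 words after cleaning, 555 words in the raw data) and prints the result as a percentage.

```matlab
% Vocabulary sizes taken from the outputs shown in this example.
numWordsCleaned = 163;
numWordsRaw = 555;

% Fraction of the raw vocabulary removed by preprocessing.
reduction = 1 - numWordsCleaned/numWordsRaw;
fprintf("Vocabulary reduced by %.2f%%\n",100*reduction)
```

That is, preprocessing removes about 70% of the distinct words.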

Compare the raw data and the cleaned data by visualizing the two bag-of-words models using word
clouds.

figure
subplot(1,2,1)


wordcloud(rawBag);
title("Raw Data")
subplot(1,2,2)
wordcloud(cleanedBag);
title("Cleaned Data")

Preprocessing Function

The function preprocessText performs the following steps in order:

1 Tokenize the text using tokenizedDocument.


2 Remove a list of stop words (such as "and", "of", and "the") using removeStopWords.
3 Lemmatize the words using normalizeWords.
4 Erase punctuation using erasePunctuation.
5 Remove words with 2 or fewer characters using removeShortWords.
6 Remove words with 15 or more characters using removeLongWords.

function documents = preprocessText(textData)

% Tokenize the text.
documents = tokenizedDocument(textData);

% Remove a list of stop words then lemmatize the words. To improve
% lemmatization, first use addPartOfSpeechDetails.
documents = addPartOfSpeechDetails(documents);
documents = removeStopWords(documents);
documents = normalizeWords(documents,'Style','lemma');

% Erase punctuation.
documents = erasePunctuation(documents);

% Remove words with 2 or fewer characters, and words with 15 or more
% characters.
documents = removeShortWords(documents,2);
documents = removeLongWords(documents,15);

end

See Also
addPartOfSpeechDetails | bagOfWords | erasePunctuation | normalizeWords |
removeEmptyDocuments | removeInfrequentWords | removeLongWords | removeShortWords |
removeStopWords | tokenizedDocument | wordcloud

Related Examples
• “Extract Text Data from Files” on page 1-2
• “Create Simple Text Model for Classification” on page 2-2
• “Visualize Text Data Using Word Clouds” on page 3-2
• “Analyze Text Data Containing Emojis” on page 2-32
• “Analyze Text Data Using Topic Models” on page 2-13
• “Analyze Text Data Using Multiword Phrases” on page 2-7
• “Classify Text Data Using Deep Learning” on page 2-49
• “Train a Sentiment Classifier” on page 2-41


Parse HTML and Extract Text Content


This example shows how to parse HTML code and extract the text content from particular elements.

Parse HTML Code

Read HTML code from the URL https://www.mathworks.com/help/textanalytics using webread.

url = "https://www.mathworks.com/help/textanalytics";
code = webread(url);

Parse the HTML code using htmlTree.


tree = htmlTree(code);

View the HTML element name of the tree.


tree.Name

ans =
"HTML"

View the child elements of the tree. The children are subtrees of tree.
tree.Children

ans =
4×1 htmlTree:

" "
<HEAD><TITLE>Text Analytics Toolbox Documentation</TITLE><META charset="utf-8"/><META content
" "
<BODY id="responsive_offcanvas"><!-- Mobile TopNav: Start --><DIV class="header visible-xs vi

Extract Text from HTML Tree

To extract text directly from the HTML tree, use extractHTMLText.


str = extractHTMLText(tree)

str =
"Text Analytics Toolbox™ provides algorithms and visualizations for preprocessing, analyzing,

Text Analytics Toolbox includes tools for processing raw text from sources such as equipment

Using machine learning techniques such as LSA, LDA, and word embeddings, you can find cluste
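
The same two steps can be tried without a network connection. The following minimal sketch parses a short HTML string and extracts its text. The HTML content is illustrative and is not taken from the MathWorks page.

```matlab
% Illustrative HTML code (an assumption for this sketch).
code = "<html><body><h1>Report</h1><p>Mixer tripped the fuses.</p></body></html>";

% Parse the code and extract the text content.
tree = htmlTree(code);
str = extractHTMLText(tree)
```

The extracted string contains the heading and paragraph text without the HTML tags.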

Find HTML Elements

To find particular elements of an HTML tree, use findElement. Find all the hyperlinks in the HTML
tree. In HTML, hyperlinks use the "A" tag.
selector = "A";
subtrees = findElement(tree,selector);

View the first few subtrees.


subtrees(1:20)

ans =
20×1 htmlTree:

<A class="svg_link navbar-brand" href="https://www.mathworks.com?s_tid=gn_logo"><IMG alt="Mat
<A class="mwa-nav_login" href="https://www.mathworks.com/login?uri=http://www.mathworks.com/h
<A href="https://www.mathworks.com/products.html?s_tid=gn_ps">Products</A>
<A href="https://www.mathworks.com/solutions.html?s_tid=gn_sol">Solutions</A>
<A href="https://www.mathworks.com/academia.html?s_tid=gn_acad">Academia</A>
<A href="https://www.mathworks.com/support.html?s_tid=gn_supp">Support</A>
<A href="https://www.mathworks.com/matlabcentral/?s_tid=gn_mlc">Community</A>
<A href="https://www.mathworks.com/company/events.html?s_tid=gn_ev">Events</A>
<A href="https://www.mathworks.com/company/aboutus/contact_us.html?s_tid=gn_cntus">Contact Us
<A href="https://www.mathworks.com/store?s_cid=store_top_nav&amp;s_tid=gn_store">How to Buy</
<A href="https://www.mathworks.com/company/aboutus/contact_us.html?s_tid=gn_cntus">Contact Us
<A href="https://www.mathworks.com/store?s_cid=store_top_nav&amp;s_tid=gn_store">How to Buy</
<A class="mwa-nav_login" href="https://www.mathworks.com/login?uri=http://www.mathworks.com/h
<A class="svg_link pull-left" href="https://www.mathworks.com?s_tid=gn_logo"><IMG alt="MathWo
<A href="https://www.mathworks.com/products.html?s_tid=gn_ps">Products</A>
<A href="https://www.mathworks.com/solutions.html?s_tid=gn_sol">Solutions</A>
<A href="https://www.mathworks.com/academia.html?s_tid=gn_acad">Academia</A>
<A href="https://www.mathworks.com/support.html?s_tid=gn_supp">Support</A>
<A href="https://www.mathworks.com/matlabcentral/?s_tid=gn_mlc">Community</A>
<A href="https://www.mathworks.com/company/events.html?s_tid=gn_ev">Events</A>

Create a word cloud from the text of the hyperlinks.

str = extractHTMLText(subtrees);
figure
wordcloud(str);
title("Hyperlinks")
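
Besides the link text, the link targets are often of interest. The following sketch, which uses an illustrative HTML string rather than the downloaded page, combines findElement with getAttribute to read the href attribute of each hyperlink.

```matlab
% Illustrative HTML code with one link that has a target and one that does not.
code = "<html><body><a href=""https://www.mathworks.com"">MathWorks</a><a>No target</a></body></html>";
tree = htmlTree(code);

% Find the hyperlink elements and read their href attributes.
links = findElement(tree,"A");
hrefs = getAttribute(links,"href")
```

Elements that do not have the requested attribute return a missing value.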


Get HTML Attributes

Get the class attributes from the paragraph elements in the HTML tree.
subtrees = findElement(tree,'p');
attr = "class";
str = getAttribute(subtrees,attr)

str = 21×1 string array


<missing>
<missing>
"add_margin_5"
<missing>
<missing>
<missing>
<missing>
<missing>
"category_desc"
"category_desc"
"category_desc"
"category_desc"
<missing>
<missing>
<missing>
"text-center"
<missing>
<missing>
<missing>


"copyright"
<missing>

Create a word cloud from the text contained in paragraph elements with class "category_desc".

subtrees = findElement(tree,'p.category_desc');
str = extractHTMLText(subtrees);
figure
wordcloud(str);
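
The selector 'p.category_desc' combines an element name with a class name. As a minimal sketch of how such a selector narrows the match, the following code applies it to an illustrative HTML string (an assumption, not the downloaded page) that contains two paragraphs, only one of which has the class.

```matlab
% Illustrative HTML code with two paragraph elements.
code = "<html><body><p class=""category_desc"">Prepare text data</p><p>Footer</p></body></html>";
tree = htmlTree(code);

% Select only the paragraphs with class "category_desc".
subtrees = findElement(tree,"p.category_desc");
str = extractHTMLText(subtrees)
```

Only the first paragraph matches, so the result contains a single string.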

See Also
extractHTMLText | findElement | getAttribute | htmlTree | tokenizedDocument

Related Examples
• “Prepare Text Data for Analysis” on page 1-10
• “Create Simple Text Model for Classification” on page 2-2
• “Visualize Text Data Using Word Clouds” on page 3-2
• “Analyze Text Data Using Topic Models” on page 2-13
• “Analyze Text Data Using Multiword Phrases” on page 2-7
• “Classify Text Data Using Deep Learning” on page 2-49
• “Train a Sentiment Classifier” on page 2-41


Correct Spelling in Documents


This example shows how to correct spelling in documents using Hunspell.

Load Text Data

Create an array of tokenized documents.

str = [
"Use MATLAB to correct spelling of words."
"Correctly spelled worrds are important for lemmatization."
"Text Analytics Toolbox providesfunctions for spelling correction."];
documents = tokenizedDocument(str)

documents =
3x1 tokenizedDocument:

8 tokens: Use MATLAB to correct spelling of words .


8 tokens: Correctly spelled worrds are important for lemmatization .
8 tokens: Text Analytics Toolbox providesfunctions for spelling correction .

Correct Spelling

Correct the spelling of the documents using the correctSpelling function.

updatedDocuments = correctSpelling(documents)

updatedDocuments =
3x1 tokenizedDocument:

9 tokens: Use MAT LAB to correct spelling of words .


8 tokens: Correctly spelled words are important for solemnization .
9 tokens: Text Analytic Toolbox provides functions for spelling correction .

Notice that:

• The input word "MATLAB" has been split into the two words "MAT" and "LAB".
• The input word "worrds" has been changed to "words".
• The input word "lemmatization" has been changed to "solemnization".
• The input word "Analytics" has been changed to "Analytic".
• The input word "providesfunctions" has been split into the two words "provides" and "functions".

Specify Custom Words

To prevent the software from updating particular words, you can provide a list of known words using
the 'KnownWords' option of the correctSpelling function.

Correct the spelling of the documents again and specify the words "MATLAB", "Analytics", and
"lemmatization" as known words.

updatedDocuments = correctSpelling(documents,'KnownWords',["MATLAB" "Analytics" "lemmatization"])

updatedDocuments =
3x1 tokenizedDocument:


8 tokens: Use MATLAB to correct spelling of words .


8 tokens: Correctly spelled words are important for lemmatization .
9 tokens: Text Analytics Toolbox provides functions for spelling correction .

Notice here that the words "MATLAB", "Analytics", and "lemmatization" remain unchanged.

See Also
correctSpelling | tokenizedDocument

More About
• “Create Extension Dictionary for Spelling Correction” on page 1-23
• “Create Custom Spelling Correction Function Using Edit Distance Searchers” on page 1-27
• “Prepare Text Data for Analysis” on page 1-10
• “Create Simple Text Model for Classification” on page 2-2
• “Analyze Text Data Using Topic Models” on page 2-13


Create Extension Dictionary for Spelling Correction


This example shows how to create a Hunspell extension dictionary for spelling correction.

When using the correctSpelling function, the function may update some correctly spelled words.
To provide a list of known words, you can use the 'KnownWords' option directly with a string array
of known words. Alternatively, you can specify a Hunspell extension dictionary (also known as a
personal dictionary), which can specify not only a list of known words but also forbidden words
and words alongside affix rules.

Specify Known Words

Create an array of tokenized documents.

str = [
"Use MATLAB to correct spelling of words."
"Correctly spelled worrds are important for lemmatizing."
"Text Analytics Toolbox providesfunctions for spelling correction."];
documents = tokenizedDocument(str);

Correct the spelling of the documents using the correctSpelling function.

updatedDocuments = correctSpelling(documents)

updatedDocuments =
3x1 tokenizedDocument:

9 tokens: Use MAT LAB to correct spelling of words .


8 tokens: Correctly spelled words are important for legitimatizing .
9 tokens: Text Analytic Toolbox provides functions for spelling correction .

The function has corrected the spelling of the words "worrds" and "providesfunctions", though it has
also updated some correctly spelled words:

• The input word "MATLAB" has been split into the two words "MAT" and "LAB".
• The input word "lemmatizing" has been changed to "legitimatizing".
• The input word "Analytics" has been changed to "Analytic".

To create a Hunspell extension dictionary containing a list of known words, create a .dic file
containing these words with one word per line. Create an extension dictionary named
knownWords.dic containing the words "MATLAB", "Analytics", and "lemmatizing".

MATLAB
Analytics
lemmatizing
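
You can create this file in any text editor. Alternatively, the following sketch writes it from MATLAB using low-level file functions. It assumes that the current folder is writable and that the file should be created there.

```matlab
% Known words, one per line in the extension dictionary.
knownWords = ["MATLAB" "Analytics" "lemmatizing"];

% Write each word on its own line.
fileID = fopen("knownWords.dic","w");
fprintf(fileID,"%s\n",knownWords);
fclose(fileID);

% Display the file contents to confirm.
type knownWords.dic
```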

Correct the spelling of the documents again and specify the extension dictionary knownWords.dic.

updatedDocuments = correctSpelling(documents,'ExtensionDictionary','knownWords.dic')

updatedDocuments =
3x1 tokenizedDocument:

8 tokens: Use MATLAB to correct spelling of words .


8 tokens: Correctly spelled words are important for lemmatizing .

It always fought well, showing great energy in the offensive and preserving a great tenacity in the
defensive.
Nevertheless, the fighting value of this division appears to have diminished during the course of the
year 1917.

1918.

1. About January 1 the division was relieved and went into training in the region Fournes-Chimay, where
it remained for four weeks.

St. Gobain.

2. The division relieved the 47th Reserve Division near Septvaux about February 1, and occupied the
line until March 28.
3. Retired from the front on the 28th; the division was sent toward Chauny-La Fere, where it constituted
the reserve division of the 8th Reserve Corps.

Noyon.

4. In April the division alternated between short periods in line and brief rests. North of Plemont it
relieved the 7th Reserve Division about April 2, was relieved by the 1st Bavarian Division a few days
later, and returned to line about April 11, relieving the 1st Bavarian Division. About this time the division
received a draft of 900 men of the 1919 class.
5. The division was withdrawn from the Lassigny front about May 25.

Battle of the Oise.

6. The division participated in the Oise fighting of June, although it did not take a direct part in the
opening attack. It supported the effort of the 3d Bavarian Reserve Division, lending some battalions,
from which prisoners were taken. About the middle of June the division passed to the second line,
rested two weeks, and returned to the Montdidier-Noyon front about June 30.

Lassigny.

7. The division remained in line throughout July and encountered the Allied attack of middle August.
About August 21 it was withdrawn.
8. Between August 21 and October 7 the division was not satisfactorily identified. Elements were
reported near Terguier in September, near Ypres, and in the region of St. Etienne-Arnes.

Woevre.

9. The division entered the Woevre line on October 7, near Manheulles, where it remained until the
armistice.

VALUE—1918 ESTIMATE.
The division was used during 1918 as a sector-holding division. It took no prominent part in the
offensives of the year.
3d Reserve Division.
COMPOSITION.
1914 1915 1916 1917
Brigade. Regiment. Brigade. Regiment. Brigade. Regiment. Brigade. Regiment. Brigad
Infantry. 5 Res. 2 Res. 5 Res. 2 Res. 5 Res. 2 Res. 5 Res. 2 Res. 5 Res.
9 Res. 9 Res. 9 Res. 49 Res.
6 Res. 34 Res. 6 Res. 34 Res. 6 Res. 20 Landst. 34 Fus.
49 Res. 49 Res. 49 Res.
Cavalry. 5 Res. Dragoon Rgt. (3 5 Res. Dragoon Rgt. 1 Sqn.
Sqns.).
Artillery. 3 Res. F. A. Rgt. (6 3 Res. F. A. Rgt. 3 Res. F. A. Rgt. 73 Art. Command: 73 Art.
Btries.).
3 Res. F. A. Rgt. (9 3 Res
Btries.).
4 Abt
Rgt.
865 L
1177
1195
Engineers 2d Pion. Btn. No. 2: 2d Pion. Btn. No. 2: 303 Pion. Btn. (?): 303 Pio
and
Liaisons.
Field Co. 2 Pions. 2 Res. Co. 2 Pions. 2 Res. Co. 2 Pions. 2 Res
3 Res. Pont. Engs. 203 T. M. Co. 2 Co. 34 Res. Pions. 203 T
3 Res. Tel. Detch. 3 Res. Pont. Engs. 203 M. T. Co. 196 S
Sectio
3 Res. Tel. Detch. 403 Tel. Detch. 403 Sig
3 Res. Pont. Engs. 403 T
33 W
Medical and 502 Ambulance Co. 502 Am
Veterinary.
14 Field Hospital. 14 Res.
15 Field Hospital. 15 Res.
16 Field Hospital. 163 Vet
163 Vet. Hospital.
Transport. 704 M. T. Col. 704 M.
Attached. 154 Cyclist Co.
HISTORY.
(2d District—Pomerania.)

1914.

East Prussia-Russia.

1. At the beginning of the war the 3d Reserve Division, recruited in the 2d District (Pomerania), formed
a part of the 8th German Army (Hindenburg). It fought with this army in eastern Prussia; it was
engaged in the battle of Tannenberg (Aug. 26–28), in the battles of Biallo, Lyck, Suwalki, and
Augustowo (September-October).

1915.

1. In February, 1915, the 3d Reserve Division participated in the battle of the Mazurian Lakes, and in
May in the battles on the Polish frontier.
2. During the great offensive of the summer of 1915 the division was engaged in the operations on the
Bobr, which resulted in the taking of Ossovietz. In August it fought in the vicinity of Kovno. It
participated in the siege of this city (Aug. 13–18) and in the battle of the Niemen (Aug. 19-Sept. 8). When the
front was stabilized it took position to the north of Smorgoni (southeast of Vilna).

1916.

1. The 3d Reserve Division occupied this sector (north of Smorgoni) up to March, 1917. At this time it
was placed in reserve in the Vilna sector.

Belgium.

2. At the beginning of May, 1917, it was sent to the western front. It entrained May 13 at Soly (east of
Vilna), and was transported via Vilna, Wirballen, Gumbinnen, Berlin, Hanover, Aix-la-Chapelle, Liege,
Louvain, and Brussels up to Bruges, where it detrained May 18. It was sent to rest in this district until
June 4.
3. On this date the division was transported to the district north of St. Quentin and went into the line on
the 8th in the Vendhuille-Bellicourt sector (west of Catelet), where it habituated itself to the western
front.

1917.

Ypres.

4. The division was relieved the end of July. After having been in reserve for several days it engaged in
the battle of Ypres on the Frezenberg front on August 4; here it was severely tried by artillery fire.
5. It was withdrawn from the front August 18 and sent to rest, first at Tournai and later in the
Moorslede District.
6. On September 23 it was again sent into the line in the battle of Flanders to the south of Zonnebeke
(Polygone wood), and again suffered serious losses on the 26th.

Alsace.

7. The 3d Reserve Division was relieved September 28 and transported to Alsace (Mulhouse District),
where it remained in repose up to the middle of October.
8. About the 10th or 15th of October it occupied the sector north of the canal from the Rhone to the
Rhine, and remained there till the end of October.
9. At this time it was withdrawn from the front. It entrained for Metz November 10. In December it was
in the vicinity of Sissone.

Aisne.

10. About December 13 it entered the line in the Craonne sector (Juvincourt area). At the beginning of
January it took over the neighboring sector (Bouconville).

VALUE—1917 ESTIMATE.

Very mediocre morale. The 49th Reserve Regiment was very severely tested by losses and desertions to
such a point that it had to be returned to the rear after August 18, 1917. September 26 the 8th
Company of the same regiment refused to take part in the attack. The relatively high proportion of men
of the 2d Landsturm levy may be responsible for these facts, since they formed part of the regiments of
the Second District.
According to prisoners captured in February, 1918, the 3d Reserve Division seemed to be of mediocre
quality: “6,000 men lost in Flanders, poorly replaced by men 50 per cent of whom were old, many being
above 40, and by 30 per cent Poles.”
Nevertheless, despite the mediocrity of its personnel, it must be noted that the 49th Reserve was
subjected to a special training for attack troops in November and December.

1918.

Laon.

1. The division held the line in the Craonne sector until about April 20, when it was relieved.

Oise.

2. It reappeared on May 1 near Hainvillers (southeast of Montdidier), where it remained until about
June 20. The division was in the thick of the June fighting on the Oise and lost heavily.
3. About June 20 the division went to rest in the region of Guise.

Marne.
4. The division participated in the fighting between the Marne and Soissons when the Allies delivered
their attack on the Marne salient. It relieved the 115th Division at Longpont on July 18 and withstood
the attack until July 31. The 49th Reserve Regiment was almost annihilated in the course of the fighting
near Mery. The other regiments were reduced to 70–80 rifles per company.
5. Retired from the front on July 31, the division rested at La Capelle until September 1.

Cambrai.

6. The division came into line east of Chevisy on September 2. Its composition had been altered by the
disbandment of the 2d Reserve Regiment and the addition of the 2d Grenadier Regiment from the 109th
Division. The British attack on the Somme of September 12 engulfed the division, which lost 1,300
prisoners.

Belgium.

7. It was withdrawn about September 27 and transferred to Belgium, where it entered the line near
Dixmude on September 29. It held the line in this sector until October 16, when it passed into the
second line for a week’s rest. Returning to line on the 23d, it remained in line until the armistice.

VALUE—1918 ESTIMATE.

The division is rated as a third-class division. Its morale was on the whole bad. The Polish elements
deserted freely. In July pillaging of supply trains was apparently prevalent in the divisional area.
Elements of the division refused to fight in the Oise battle in June, and the German command appeared
to have little confidence in its fighting value.
3d Naval Division.
COMPOSITION.
1917 1918
Brigade. Regiment. Brigade. Regiment.
Infantry 4 Nav. 1 Mar. Mar. 1 Mar.
2 Mar. Inf. Brig. 2 Mar.
3 Mar. 3 Mar.
Cavalry 3 Sqn. 7 Hus. Rgt.
Artillery 9 F. A. Rgt. 2 Matr. F. A. Rgt.
925 Light Am. Col.
1234 Light Am. Col.
1292 Light Am. Col.
Engineers and Liaisons 1 Co. Mar. Pion. Btn. 115 Pion. Btn.
3 Co. Mar. Pion. Btn. 1 Res. Co. 24 Pions.
337 Pion. Co. 293 Signal Command:
165 T. M. Co. 293 Tel. Detch.
66 Wireless Detch.
Medical and Veterinary 610 Ambulance Co.
2 Mar. Field Hospital.
390 Field Hospital.
569 Vet. Hospital.
Transports 679 M. T. Col.
Attached Coast Defense Btn.
HISTORY.

1917.

1. The 3d Naval Division was organized in April, 1917. Its Regiments
(1st, 2d, and 3d Naval Infantry) were detached from the Naval
Corps, before the constitution of the division, to take part in the
attacks upon Steenstraat on April 22, 1915, and on the Somme from
September, 1916, to April, 1917. Since its formation the 3d Naval
Division has scarcely left the coast.

Flanders.

2. In August, 1917, the 3d Naval Division occupied the sector of
Lombartzyde.
3. In October it was in action on the Ypres front at Poelcappelle.
4. In December it again took over the sector of Lombartzyde.

RECRUITING.

The 3d Naval Division is recruited from the entire German Empire,
the naval troops being imperial troops.

VALUE—1917 ESTIMATE.

Before the war the troops of the 3d Naval Division were landing and
occupying troops for the German colonies. They are good units
whose recruiting has been kept up to a high standard.
1918.

Albert.

1. The division was relieved north of St. Georges about the 1st of
March and moved to Valenciennes, where it arrived about the 13th.
From March 18 to 23 it moved up to the front by stages via Haussy-
Cattenieres-Lesdain. On the 23d it followed up the advance, passing
through Fins and Manancourt on the 24th–25th and coming into
action at Contalmaison on the 25th. It captured Albert on the 26th.
The division held a sector west of Albert until mid-April, and on April
24 returned to its former sector west of Anthuille. It was relieved
about the end of May by the 24th Division.
2. On June 20 the division returned to relieve the 24th Division in
the Aveluy sector. In mid-July the company strength was low. No
drafts had been received recently and sickness was prevalent. This,
together with the August spell in line, had considerably reduced the
morale of the division. It was relieved on August 19 by the 83d
Division.

Scarpe-Somme.

3. The division rested at Flers for five days, when it came into line
west of Grevillers on the night of August 23–24 to reinforce the line.
It was withdrawn in a few days (Aug. 26) and rested at Cambrai.
Five hundred prisoners were taken from the division in this period.
4. The division rested at Thourout during the first half of September.
On the 27th it was engaged west of Marcoing and fought in that
area until the end of the month. The total prisoners captured from
the division was 700.
5. After two weeks’ rest in the Cambrai area, the division returned to
line at Molain on October 17. It fought in the Molain-Catillon area
until October 23, when it was relieved by the 19th Reserve Division.
On November 1 it was again in line, northwest of the Hattencourt
Farm. The last identification was at Any, on November 7.

VALUE—1918 ESTIMATE.

The division was rated as third class. Its use in the Somme March
offensive and as an intervention division in the Scarpe-Somme battle
suggests that the division was a second class division.
4th Guard Division.
COMPOSITION.
1915 1916 1917 1918[3]
Brigade. Regiment. Brigade. Regiment. Brigade. Regiment. Brigade. Regiment.
Infantry. 5 Ft. 5 Gd. 5 Ft. 5 Gd. 5 Ft. 5 Gd. 5 Ft.
5 Gren. 5 Gren. 5 Gren. 5 Gren.
93 Res. 93 Res. 93 Res. 93 Res.
Cavalry. 3d Sqn. Gd. Res. Ulan 2d Sqn. Gd. Res. Drag. 2d Sqn. Gd. Res. Drag.
Regt. Rgt. Rgt.
Artillery. (z) 2d Gd. Res. F. A. 4 Gd. Art. Command: 4 Gd. Art. Command:
Rgt.
6 Gd. F. A. Rgt. 6 Gd. F. A. Rgt. 6 Gd. F. A. Rgt.
3 Abt. 1 Gd. Ft. A.
Rgt. (5, 6, and 10
Btries.).
1208 Light Am. Col.
1285 Light Am. Col.
1359 Light Am. Col.
Engineers (z) Co. 3 Gd. Pions. 106 Pion. Btn.: 106 Pion. Btn.:
and
Liaisons.
261 Pion. Co. 261 Pion. Co. 261 Pion. Co.
4 Gd. T. M. Co. 269 Pion. Co. 269 Pion. Co.
4 Gd. Pont. Engs. 4 Gd. T. M. Co. 4 Gd. T. M. Co.
4 Gd. Tel. Detch. 315 Searchlight 4 Gd. Signals
Section. Command:
4 Gd. Tel. Detch. 4 Gd. Tel. Detch.
61 Wireless Detch.
Medical and 267 Ambulance Co. 267 Ambulance Co.
Veterinary.
392 Field Hospital. 392 Field Hospital.
397 Field Hospital. 397 Field Hospital.
Vet. Hospital. 4 Gd. Vet. Hospital.
Transports. 13 Gd. Truck Train. 533 M. T. Col.
533 M. T. Col.
Attached. 32 M. G. S. S. Detch. 44 Observation Group.
70 Anti-Aircraft 72 Sound Ranging
Section. Section.
244 Reconnaissance
Flight.
20 Balloon Sqn.

3. According to a document of Aug. 21, 1918.


HISTORY.

1915.

The 4th Guard Division was formed on the Russian front in March, 1915.

Russia.

1. From March 14 to July 12 the 4th Guard Division was in line near Przasnysz. It belonged to
Gallwitz’s army, which was operating north of the Vistula.
2. From July 13 to September 28 the division took part in many fights, notably on the Narew, and
took part in the pursuit as far as the region of the marshes of Lithuania.
3. Withdrawn from the front and reached Kovno on foot, where it entrained for the Western
Front on October 10 via Koenigsberg, Luebeck, Hamburg, Aix-la-Chapelle, Namur. Detrained at
Douai and sent to rest.

France.

4. From November 14 to 26 it occupied a sector near Arras, then went to rest near Cambrai.
5. From December 15, 1915, to January 4, 1916, it built entrenchments in the region of
Wytschaete-Messines.

1916.

1. During January and February, 1916, the 4th Guard Division continued its entrenching work in
the sector Wytschaete-Messines and held the sector at the same time.
2. Until the end of April, 1916, the 4th Guard Division, together with the 1st Reserve Guard
Division, formed the reserve corps of the guard. Both these divisions were put through a course
of training with a view to active operations.
3. From May 9 to July 23 the division remained in line northeast of Neuville-St. Vaast.

Somme.

4. Engaged in the battle of the Somme July 25 (Estres sector), suffered heavy losses and was
withdrawn August 19. Engaged again after a few days of rest and fought some severe local
battles until September 10 (Thiepval sector).
5. After seven days of rest behind the Flanders front it held a quiet sector north of Ypres from
September 17 to October 25.
6. From November 6 to 25 it was again sent to the Somme, where it was subjected to several
heavy local attacks (Warlencourt sector).
1917.

1. Remained in the Warlencourt sector until March 17, 1917. It was relieved immediately after it
had retired to the Hindenburg line.

Lens.

2. After three weeks’ rest in the region of Tournai it was sent by stages to the south of Lens,
where it went back in the lines. It suffered considerable losses there. Withdrawn from the front
July 11.
3. At rest in the region of Pont-a-Vendin and Meurchin. On August 15 the division was hurried up
to the north of Lens. It attacked to regain the lost ground but in vain. Its losses were extremely
heavy.
4. The division stayed in line until September 15.

Flanders.

5. At rest for a week behind the front. Entrained September 23 and 24 at Carvin for Flanders.
6. It was at first in the reserve of the army, but went into line September 27 east of Zonnebeke.
After one of its regiments had attacked and was stopped by the British artillery fire (Oct. 22), the
division obtained replacements and on October 4 renewed its attempt to retake the heights lost
on September 26. Warned by a British attack, they became demoralized and fled in disorder
toward Becelaere. The losses of the 4th Guard Division were so heavy that it had to be relieved
on October 5 to 7.
7. Entrained for Guise and arrived there October 10. Went into line on the 14th in the sector of
Itancourt, southeast of St. Quentin, and was still holding it in December. Its forces were much
reduced by the attacks in Flanders and were reinforced by neighboring units (13th Landwehr
Division).

VALUE—1917 ESTIMATE.

Formerly an excellent combat unit, having that traditional esprit de corps which animated the
regiments of the Prussian Guards. At the present time (November, 1917) it has lost a good part
of its fighting value. It seems to have been much weakened by the battle of Ypres (October,
1917).

1918.

Guise.

1. The division rested during January near d’Origny Ste. Benoite (west of Guise).

Somme.
2. On February 4 the division came into line northwest of Bellenglise. It was relieved about the
middle of February.
3. Upon relief, it marched via Bohain to St. Souplet, near Le Cateau. Here the division underwent
a course of training until March 18, when it marched via Bohain-Brancourt-
Montbrehain-Ramicourt back to its old sector at Bellenglise, arriving in line March 20.

Battle of Picardy.

4. The division attacked in the first line and advanced by Hesbecourt March 21–26. Passing into
support for eight days it was reengaged April 3–8 near Bouzencourt and le Hamel, suffering very
heavy losses. Between the 8th and the 24th the division rested. It was in line again near
Marcelcave from the 22d to the end of April, participating in the attack at Villers-Bretonneux on
the 24th. Heavy losses were again sustained.
5. Again the division went to rest at St. Souplet, near le Cateau. The 2d Battalion of the 427th
Regiment, dissolved, arrived as a reinforcement for the division on May 27. The division was
moved by rail to Flavy le Martel on the night of June 1. It marched by nights to Canny sur Matz (by
Golancourt, Guiscard, and Candor) and entered the line on the night of June 8–9.

Battle of the Oise.

6. The division attacked on the 9th between Roye sur Matz and Canny sur Matz. It penetrated by
Marquelise to Antheuil. The French counterattack threw it back north of Antheuil on the 11th.
The division stayed in line until the 19th.

Lorraine.

7. After resting at Bohain until June 29 the division was moved to Lorraine by Valenciennes-
Brussels-Namur-Saarburg. Here it was rested and reconstituted.
8. The division returned by rail to Athies sur Laon on July 22. From there it marched to Mousey
sur Aisne by stages and then in trucks to Mareuil en Dole on July 25.

Battle of the Marne, Vesle, Aisne.

9. The division was engaged July 27 southeast of Fere en Tardenois. It fell back toward Fismes
on August 1–2, from where it was shifted into the Courlandon-Breuil sector, which it held from
August 14 to the beginning of September. On the 5th it moved to the south of Glennes,
remaining there until the 30th, when it fell back across the canal. The division was relieved on
October 2, but returned to line on the 5th to cover the retreat near Berry au Bac. On the 7th
it went to rest for a week.

Ardennes.
10. Reengaged west of Chateau Porcien from October 14 to November 5. The 93d Regiment was
mentioned in the German communique of November 2 as fighting especially well. In the retreat
the division passed through Renneville and Rubigny, where it was last identified on November 11.

VALUE—1918 ESTIMATE.

The division was always regarded as a first-class fighting division, although the losses on the
Somme in March and the setback on the Oise in June lowered its value. Constant fighting
impaired the morale and kept the effectives low, but the division was always to be included in the
first-class divisions.
4th Division.
COMPOSITION.
1914 1915 1916 1917
Brigade. Regiment. Brigade. Regiment. Brigade. Regiment. Brigade. Regiment. Brigad
Infantry. 7. 14. 7. 14. 7. 14. 8. 14. 8.
149. 149. 149. 49.
8. 49. 8. 49. 8. 49. 140.
140. 140. 140.
Cavalry. 12 Drag. Rgt. (v. 12 Drag. Regt. 3d Sqn. Horse Gren. 2 Sqn.
Arnim). Rgt. Rgt.
Artillery. 4 Brig.: 4 Brig.: 4 Brig.: 4 Art. Command: 4 Art. C
17 F. A. Rgt. 17 F. A. Rgt. 17 F. A. Rgt. 53 F. A. Rgt. 53 F.
53 F. A. Rgt. 53 F. A. Rgt. 53 F. A. Rgt. 48 Ft
939 L
945 L
1319
Engineers 1 Pion. Btn. No. 2: 1 Pion. Btn. No. 2: 114 Pion. Btn.: 114 Pio
and
Liaisons.
Field Co. 1 Pion. Btn. 2 Co. 2 Pions. 2 Co. 2 Pions. 2 Co.
No. 2.
4 Pont. Engs. 4 T. M. Co. 5 Co. 2 Pions. 5 Co.
4 Tel. Detch. 4 Pont. Engs. 2 Co. 114 Pions. 4 T. M
4 Tel. Detch. 4 T. M. Co. 55 Se
Sectio
7 Searchlight 4 Sig
Section.
4 Tel. Detch. 4 Tel
4 Pont. Engs. 72 W
Medical and 6 Ambulance Co. 6 Ambu
Veterinary.
17 Field Hospital. 17 Field
Vet. Hospital. 19 Field
131 Vet
Transport. 8 Truck Train. 537 M.
9 Truck Train.
537 M. T. Col.
Attached. Construction Co.
HISTORY.
(Second District—Pomerania.)

1914.

France.

1. At the beginning of the campaign the 4th Division fought on the Western Front until November, 1914.
It detrained at Rheydt on August 9 and 10, and entered Belgium on the 14th and France on the 25th.
Fought at Sailly-Saillisel on the 28th; reached Grand-Morin September 5 and fought at Acy en Multien
on the 6th. After retreating to the north of Soissons it remained south of Roye from the end of
September to the end of October, and was near Ypres in November.

Russia.

2. Sent to Russia and took part in the second offensive on Warsaw.

1915.

1. In January it took part in the battles of Bolimow. In February it went to the Carpathians (Army of the
South under Linsingen). Took part in the offensive on Lemberg.
2. About September 27, 1915, it was relieved in the region south of Baranovitchi and entrained at
Kobryn for the Western Front.

France.

3. It arrived in the vicinity of Sedan at the beginning of October. After a few days’ rest it marched to the
north of Tahure.
4. On October 30 the division took part in the attack on the Butte de Tahure and suffered severe losses.
5. At the beginning of November it left Champagne for the region of Reims where its units went into the
trenches on November 8. Until the beginning of April, 1916, it held the sector northwest of Prunay.

1916.

1. At the beginning of April the division was sent to rest in the vicinity of Rethel. During this period
(November, 1915, to April, 1916) its losses were light.

Verdun.

2. At the beginning of May the division was sent to the region of Verdun. On May 4 it took part in the
attack on Hill 304, where it suffered heavy losses.
3. Relieved May 15 and sent to rest in the region of Mouzon-Carignan, from where it went to the region
of Damvillers.
4. At the beginning of July it was sent to hold the sector of Thiaumont at the moment when the French
recommenced their offensive in that region. Its losses were very heavy.
5. On August 3 it left Thiaumont for the region of Cumieres, on the left bank of the Meuse (Aug. 5).
6. At the end of September it held the sector Malancourt-Avocourt.
7. Relieved at the end of October and trained at Dun. After a short rest it went into line in December
northeast of Vaux.

1917.

1. The division remained in the Vaux sector until April 17.


2. It relieved the 10th Reserve Division in the region of Sapigneul (night of Apr. 15–16) a few hours
before the beginning of our attack. It remained in this sector until May 5 and was subjected to French
attacks of April 16 and May 4.
3. Beginning May 5, it was relieved and went into camp in the region of Caurel.

Champagne.

4. On May 7 and the following days it went into the sector of Grille Mont Haut and held this until June
19.
5. The division was put in reserve on this date in the region Epoye-Warmeriville.
6. Went into line in the sector Moronvilliers (July 19 and days following) until the end of October.

Belgium.

7. At the end of October it entrained at Juniville and went to Belgium, where it held the sector
Poelcapelle until November 24.
8. It went into line again east of Armentieres on November 30 and was still in that sector on January
11, 1918.

RECRUITING.

In spite of heavy losses suffered several times, it would seem that they wished to keep up the
Pomeranian character of the 4th Division, although it received in September, 1915, some men of the
1915 class from Hesse-Nassau, and later on a number of Brandenburgers and Silesians, as the third and
sixth districts often furnished their ratio to the districts temporarily out of men. A great majority of men,
however, came from Pomerania, and as the resources of this Province in men are limited it was
necessary, to keep up the provincial composition of this division, to draw from the Landwehr depots and
the battalions of Pomeranian Landsturm. Since it was impossible to maintain the quality of the division,
it seems that they were anxious to maintain its nationality.

VALUE.
The 4th Division was always a very good division and gave proof of very fine military qualities in all the
battles in which it took part, especially in the sector of Sapigneul during the offensives of April 16 and
May 4, 1917. It would seem that the nature of the replacements they received, especially the most
recent ones, has considerably altered the value of this division.

1918.

1. The division was relieved from the front of Armentieres on January 23, and went to rest and
instruction in the Olsene area (southwest of Deynze). After four weeks the division entrained at Roubaix
on March 16 and detrained at Douai on the following day. From there it marched by stages to Neuville St.
Remy, a suburb of Cambrai. The division was concentrated south of Inchy on the night of March 20–21.

Battle of Picardy.

2. Engaged on March 21, the division advanced by Doignies and Hermies. It passed to rest on the 24th
and was reengaged from March 26 to April 6 at Miraumont, Hebuterne, and Colincamps. The division
suffered very heavy losses in the engagement.
3. Relieved from the Hebuterne front on April 6, the division rested two weeks in the Bapaume-Cambrai
area. The division moved north to the Lys front via Douai-Lille.

Battle of the Lys.

4. The division was in line west of Merville from April 23 to May 14.
5. While at rest north of Tournai, the division was reconstituted and prepared for another heavy
engagement.
6. The division entrained for Loos on June 30 and moved on to Sailly sur la Lys on July 18.

The Lys Withdrawal.

7. The division came into line near Merris on July 27. It lost 500 prisoners south of Meteren on August
18. On the 30th the division fell back on Bailleul and later to Bac St. Maur and Fleurbaix. It was relieved
at Fleurbaix on October 11.
8. The division rested from the 11th to the 21st near Denain.
9. Again the division was engaged to the east and northeast of Solesmes and near Le Quesnoy,
retreating to Beaurain, Ghissignies, and Ruesnes. It passed into the second line on November 1, but came
back to the line south of Le Quesnoy about November 5. It retreated by Locquignol toward Maubeuge,
where it was last identified on November 9.

VALUE—1918 ESTIMATE.

The 4th Division was a very good division. In 1918 its morale was mediocre, due to the young recruits.
4th Ersatz Division.
COMPOSITION.
1914 1915 1916 1917
Brigade. Regiment. Brigade. Regiment. Brigade. Regiment. Brigade. Regiment. Brigad
Infantry. 9 Ers. 9, 10, 11, 9 Ers. 9, 10, 11, 9 Mixed 359. 13 Ers. 360. 13 Ers.
and 12, and 12, Ers.
Brig. Ers. Brig. Ers.
Btns. Btns.
360. 361.
13 Mixed 361. 362.
Ers.
362.
13 Ers. 13, 14, 15, 13 Ers. 13, 14, 15,
and 16, and 16,
Brig. Ers. Brig. Ers.
Btns. Btns.
33 Ers. 33, 34, 35, 33 Ers. 33, 34, 35,
36, and and 81,
81, Brig. Brig. Ers.
Ers. Btns.
Btns.
Cavalry. Ers. Detchs. of the 9, Ers. Detchs. of the 9, 4 Ers. Cav. Sqn. Ers. Cav. Detch. (3d C. 3 Sqn.
13, and 33 Ers. 13, and 33 Ers. Dist.).
Brigs. Brigs.
Artillery. 1 Ers. Abt. of the 18 1 Ers. Abts. of the 18 90 F. A. Rgt. 139 Art. Command: 139 Art
and 39 F. A. Rgt. and 39 F. A. Rgt.
1 Ers. Abts. of the 40 1 Ers. Abts. of the 40 91 F. A. Rgt. 90 F. A. Rgt. 90 F.
and 75 F. A. Rgt. and 75 F. A. Rgt.
1 Ers. Abts. of the 45 1 Ers. Abts. of the 45 119 F
and 60 F. A. Rgt. and 60 F. A. Rgt.
1052
1059
1323
Engineers 1 Ers. Co., 3 Pions. 1 Ers. Co., 3 Pions. 303 Pion. Co. 504 Pion. Btn.: 504 Pio
and
Liaisons.
1 Ers. Co., 4 Pions. 1 Ers. Co., 4 Pions. 304 Pion. Co. 304 Pion. Co. 304 P
1 Ers. Co., 9 Pions. 1 Ers. Co., 9 Pions. 305 Pion. Co. 305 Pion. Co. 305 P
161 T. M. Co. 161 T. M. Co. 161 T
251 Searchlight 59 Se
Section. Sectio
554 Tel. Detch. 554 Sig
554 T
123 W
Medical and 64 Ambulance Co. 64 Amb
Veterinary.
135 Field Hospital. 135 Fie
136 Field Hospital. 136 Fie
Vet. Hospital. 136 Vet
Transports. 762 M.
Attached. 1 Res. Co., 25 Pions. 103 Antiaircraft Section
2 Res. Co., 25 Pions.