
Network Programmability and Automation Fundamentals
Khaled Abuelenain, CCIE No. 27401
Jeff Doyle, CCIE No. 1919
Anton Karneliuk, CCIE No. 49412
Vinit Jain, CCIE No. 22854

Cisco Press
Hoboken, New Jersey


Network Programmability and Automation Fundamentals
Copyright © 2021 Cisco Systems, Inc.

The Cisco Press logo is a trademark of Cisco Systems, Inc.

Published by:
Cisco Press

All rights reserved. This publication is protected by copyright, and permission must be obtained from the
publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form
or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding
permissions, request forms, and the appropriate contacts within the Pearson Education Global Rights &
Permissions Department, please visit www.pearson.com/permissions.

No patent liability is assumed with respect to the use of the information contained herein. Although
every precaution has been taken in the preparation of this book, the publisher and author assume no
responsibility for errors or omissions. Nor is any liability assumed for damages resulting from the use of
the information contained herein.


Library of Congress Control Number: 2020922839

ISBN-13: 978-1-58714-514-8
ISBN-10: 1-58714-514-6

Warning and Disclaimer


This book is designed to provide information about network programmability and automation. Every
effort has been made to make this book as complete and as accurate as possible, but no warranty or
fitness is implied.

The information is provided on an “as is” basis. The authors, Cisco Press, and Cisco Systems, Inc. shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book or from the use of the discs or programs that may accompany it.

The opinions expressed in this book belong to the authors and are not necessarily those of Cisco Systems, Inc.

Trademark Acknowledgments
All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Cisco Press or Cisco Systems, Inc. cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.


Feedback Information
At Cisco Press, our goal is to create in-depth technical books of the highest quality and value. Each book
is crafted with care and precision, undergoing rigorous development that involves the unique expertise of
members from the professional technical community.

Readers’ feedback is a natural continuation of this process. If you have any comments regarding how we
could improve the quality of this book or otherwise alter it to better suit your needs, you can contact us
through email at [email protected]. Please make sure to include the book title and ISBN in your
message.

We greatly appreciate your assistance.

Editor-in-Chief: Mark Taub
Director, ITP Product Management: Brett Bartow
Alliances Manager, Cisco Press: Arezou Gol
Managing Editor: Sandra Schroeder
Development Editor: Ellie C. Bru
Project Editor: Mandie Frank
Copy Editor: Kitty Wilson
Technical Editors: Jeff Tantsura, Viktor Osipchuk
Editorial Assistant: Cindy Teeters
Designer: Chuti Prasertsith
Composition: codeMantra
Indexer: Ken Johnson
Proofreader: Abigail Bass

Americas Headquarters: Cisco Systems, Inc., San Jose, CA
Asia Pacific Headquarters: Cisco Systems (USA) Pte. Ltd., Singapore
Europe Headquarters: Cisco Systems International BV, Amsterdam, The Netherlands
Cisco has more than 200 offices worldwide. Addresses, phone numbers, and fax numbers are listed on the Cisco Website at www.cisco.com/go/offices.

Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks,
go to this URL: www.cisco.com/go/trademarks. Third party trademarks mentioned are the property of their respective owners. The use of the word partner does
not imply a partnership relationship between Cisco and any other company. (1110R)


Credits

“HTTP is not designed to be a transport protocol. It is a transfer protocol in which the messages reflect the semantics of the Web architecture by performing actions on resources through the transfer and manipulation of representations of those resources [Section 6.5.2]” (© Roy Thomas Fielding, 2000)

“This specification [HTTP/2.0] is an alternative to, but does not obsolete, the HTTP/1.1 message syntax. HTTP’s existing semantics remain unchanged.” (Hypertext Transfer Protocol Version 2)

“a sequence of octets, along with representation metadata describing those octets that constitutes a record of the state of the resource at the time when the representation is generated.” (Uniform Resource Identifier (URI): Generic Syntax, Copyright © The Internet Society (2005))

“While RFC 2396, section 1.2, attempts to address the distinction between URIs, URLs and URNs, it has not been successful in clearing up the confusion.” (IETF (Internet Engineering Task Force). Architectural Principles of Uniform Resource Name Resolution, ed. K. Sollins. 1998)

“1. Device state metrics; 2. Data from shared services such as DDI (DNS, DHCP and IPAM) and Active Directory; 3. Network flows from sources such as NetFlow; and 4. Configuration data normalized into key value pairs.” (Shamus McGillicudy, “A Network Source of Truth Promotes Trust in Network Automation,” Enterprise Management Associates, May 2020)

“Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.” (©2020 Agile Alliance)

“allows client/server applications to communicate over the Internet in a way that is designed to prevent eavesdropping, tampering, and message forgery.” (Copyright © 2020 IETF)

“the most important security protocol on the internet” (Copyright © 2020 IETF)

“there was more TLS 1.3 use in the first five months after RFC 8446 was published than in the first five years after the last version of TLS was published as an RFC” (Copyright © 2020 IETF)


Figure 1-6, CI/CD: ©2021 Red Hat, Inc., www.redhat.com
Figure 6-2, Accessing a Django Application in a Web Browser: © 2005-2021 Django Software Foundation
Figure 7-1, The SupportApache-small.png Image, as It Appears in a Web Browser: Copyright © 2020 The Apache Software Foundation
Figure 7-6, The Postman Interface: ©2020 Postman, Inc.
Figure 7-7, A GET Request Using Postman: ©2020 Postman, Inc.
Figure 7-8, Viewing the Response Headers in Postman: ©2020 Postman, Inc.
Figure 13-1, GitHub Website with YANG Modules: © 2020 GitHub, Inc.
Figure 17-3, Comparing the Body of the Response in the Developer Sandbox and Postman: screenshot ©2020 Postman, Inc.
Figure 18-3, Converting a username:password Tuple to Base64 Format: screenshot © Cisco Systems
Figure 18-4, Arista YANG Modules: screenshot © 2020 GitHub, Inc.
Figure 19-3, Available Ansible Module Categories: Copyright © 2020 Red Hat, Inc.


About the Authors


Khaled Abuelenain, CCIE No. 27401 (R&S, SP), is currently the Consulting Director at Acuative, a Cisco Managed Services Master Partner. Khaled has spent the past 18 years designing, implementing, operating, and automating networks and clouds. He specializes in service provider technologies, SD-WAN, data center technologies, programmability, automation, and cloud architectures. Khaled is especially interested in Linux and OpenStack.

Khaled is a contributing author of the best-selling Cisco Press book Routing TCP/IP, Volume II, 2nd edition, by Jeff Doyle. He also blogs frequently on network programmability and automation on blogs.cisco.com. Khaled is also a member of the DevNet500 group, being one of the first 500 individuals in the world to become DevNet certified.

Khaled lives in Riyadh, Saudi Arabia, and when not working or writing, he likes to run marathons and skydive. He can be reached at [email protected], on Twitter at @kabuelenain, or on LinkedIn at linkedin.com/in/kabuelenain.

Jeff Doyle, CCIE No. 1919, is a Member of Technical Staff at Apstra. Specializing in IP routing protocols, complex BGP policy, SDN/NFV, data center fabrics, IBN, EVPN, MPLS, and IPv6, Jeff has designed or assisted in the design of large-scale IP and IPv6 service provider networks in 26 countries across 6 continents.

Jeff is the author of CCIE Professional Development: Routing TCP/IP, Volumes I and II and OSPF and IS-IS: Choosing an IGP for Large-Scale Networks; a co-author of Software-Defined Networking: Anatomy of OpenFlow; and an editor and contributing author of Juniper Networks Routers: The Complete Reference. Jeff is currently writing CCIE Professional Development: Switching TCP/IP. He also writes for Forbes and blogs for both Network World and Network Computing. Jeff is one of the founders of the Rocky Mountain IPv6 Task Force, is an IPv6 Forum Fellow, and serves on the executive board of the Colorado chapter of the Internet Society (ISOC).

Anton Karneliuk, CCIE No. 49412 (R&S, SP), is a Network Engineer and Manager at THG Hosting, responsible for the development, operation, and automation of networks in numerous data centers across the globe and the international backbone. Prior to joining THG, Anton was a team lead in Vodafone Group Network Engineering and Delivery, focusing on the introduction of SDN and NFV projects in Germany. Anton has 15 years of extensive experience in the design, rollout, operation, and optimization of large-scale service provider and converged networks, focusing on IP/MPLS, BGP, network security, and data center Clos fabrics built using EVPN/VXLAN. He also has several years of full-stack software development experience for network management and automation.

Anton holds a B.S. in telecommunications and an M.S. in information security from Belarusian State University of Informatics and Radio Electronics. You can find him actively blogging about network automation and running online training at Karneliuk.com. Anton lives with his wife in London.


Vinit Jain, CCIE No. 22854 (R&S, SP, Security & DC), is a Network Development Engineer at Amazon, managing the Amazon network backbone operations team. Previously, he worked as a technical leader with the Cisco Technical Assistance Center (TAC), providing escalation support in routing and data center technologies. Vinit is a speaker at various networking forums, including Cisco Live! events. He has co-authored several Cisco Press titles, such as Troubleshooting BGP, Troubleshooting Cisco Nexus Switches and NX-OS, and LISP Network Deployment and Troubleshooting, and has authored and co-authored several video courses, including BGP Troubleshooting, the CCNP DCCOR Complete video course, and the CCNP ENCOR Complete video course. In addition to his CCIEs, Vinit holds multiple certifications related to programming and databases. Vinit graduated from Delhi University with a degree in mathematics and earned a master's in information technology from Kuvempu University in India. Vinit can be found on Twitter as @VinuGenie.


About the Technical Reviewers


Jeff Tantsura, CCIE No. 11416 (R&S), has been in the networking space for over 25 years. He has authored and contributed to many RFCs and patents and has worked in both service provider and vendor environments.

He is co-chair of the IETF Routing Working Group, chartered to work on new network architectures and technologies, including protocol-independent YANG models and next-generation routing protocols. He is also co-chair of the RIFT (Routing in Fat Trees) Working Group, chartered to work on a new routing protocol that specifically addresses the fat tree topologies typically seen in the data center environment.

Jeff serves on the Internet Architecture Board (IAB). His focus has been on 5G transport and integration with RAN, IoT, MEC, low-latency networking, and data modeling. He is also a board member of the San Francisco Bay Area ISOC chapter.

Jeff is Head of Networking Strategy at Apstra, a leader in intent-based networking, where he defines networking strategy and technologies. He also holds the Ericsson Certified Expert IP Networking certification.

Jeff lives in Palo Alto, California, with his wife and youngest child.

Viktor Osipchuk, CCIE No. 38256 (R&S, SP), is a Senior Network Engineer at Google,
focusing on automation and improving one of the largest production networks in the
world. Before joining Google, Viktor spent time at DigitalOcean and Equinix, helping to
architect and run their worldwide infrastructures. Viktor spent many years at Cisco,
supporting customers and focusing on automation, telemetry, data models, and APIs
for large-scale web and service provider deployments. Viktor has around 15 years of
diverse network experience, an M.S. in telecommunications, and associated industry
certifications.


Dedications
Khaled Abuelenain: To my mother, the dearest person to my heart, who invested all
the years of her life so I can be who I am today. I owe you more than any words can
express. To my father, my role model, who always led by example and showed me the
real meaning of work ethic. Nothing I do or say will ever be enough to thank you both.

And to the love of my life, my soulmate, and my better half, Mai, for letting me work
and write while you take care of, literally, everything else. This book would not have
happened if not for your phenomenal support, patience and love. I will forever be
grateful for the blessing of having you in my life.

Jeff Doyle: I would like to dedicate this book to my large and growing herd of grandchildren: Claire, Samuel, Caroline, Elsie, and Amelia. While they are far too young to comprehend or care about the contents of this book, perhaps someday they will look at it and appreciate that Grampa is more than a nice old man and itinerant babysitter.

Anton Karneliuk: I dedicate this book to my family, who have tremendously supported me during the writing process. First of all, many thanks to my amazing wife, Julia, who took on the huge burden of sorting out many things in our lives, allowing me to concentrate on the book. You acted as a navigation star during this journey, and you are my beauty. I'd also like to thank my parents and brother for helping me form the habit of working hard and completing the tasks I've committed to, no matter how badly I want to drop them.

Vinit Jain: I would like to dedicate this book to the woman who has been a great influence and inspiration in my life: Sonal Sethia (Sonpari). You are one of the most brilliant, talented, courageous, and humble people I have ever known. You have always inspired me to push myself beyond what I thought I was capable of. You have been there for me during difficult times and believed in me when even I did not. You are my rock. This is a small token of my appreciation, gratitude, and love for you. I am really glad to have found my best friend in you and know that I will always be there for you.


Acknowledgments
Khaled: First and foremost, I would like to thank Jeff Doyle, my co-author, mentor, and friend, for getting me started with writing and for his continuous assistance and guidance. Jeff has played a fundamental role in my professional life as well as in the lives of many other network engineers; he probably doesn't realize the magnitude of this role! Despite all that he has done for this industry and the network engineering community, Jeff remains one of the most humble and amiable human beings I have ever come across. Thank you, Jeff; I owe you a lot!
I am grateful to Anton and Vinit for agreeing to work with me on this project. It has been
challenging at times, but it has been seriously fun most of the time.
I would also like to thank Jeff Tantsura and Viktor Osipchuk for their thorough technical
reviews and feedback. I bothered Viktor very frequently with discussions and questions
over email, and never once did he fail to reply and add a ton of value!
I especially want to thank Brett Bartow and Eleanor Bru for their immense support and phenomenal patience. And I'm grateful to Mandie Frank, Kitty Wilson, and everyone else at Cisco Press who worked hard to bring this book to light. Such an amazing team.
Jeff Doyle: I would like to express my thanks to my friend Khaled Abuelenain for bringing me into this project, and thanks to Anton and Vinit for letting me be a part of their excellent work. Thanks also to Brett Bartow and everyone at Pearson; I've worked with them for many years, and I continue to tell everyone who will listen that this is the best publishing team any technical writer could hope to work for. Finally, thanks to my wife, Sara, who, as always, puts up with my obsessiveness. When she sees me sitting and staring into nothingness, she knows there's writing going on in my head.
Anton: Special thanks to Schalk Van Der Merwe, CTO, and Andrew Mutty, CIO, at The Hut Group for believing in me and giving me the freedom and responsibility to implement my automation ideas in a high-scale data center environment. Thanks to all my brothers-in-arms from The Hut Group hosting networks for constantly sharing with me ideas about what use cases to focus on for automation. I want to thank my previous manager in Vodafone Group, Tamas Almasi, who supported me during my initial steps in network automation and helped me create an appropriate mindset during numerous testbeds and proofs of concept. Last but not least, I'm very grateful to Khaled Abuelenain for his invitation to co-author this book, and to the whole author and technical reviewer team; it was a pleasure to work with you.
Vinit: A special thanks to Khaled for asking me to co-author this book and for being amazingly patient and supportive of me as I faced challenges during this project. I would like to thank Jeff Doyle and Anton Karneliuk for their amazing collaboration on this project. I learned a lot from all of you guys and look forward to working with all of you in the future.

I would also like to thank our technical reviewers, Jeff Tantsura and Viktor Osipchuk, and our editor, Eleanor Bru, for your in-depth verification of the content and insightful input to make this project a successful one.
This project wouldn’t have been possible without the support of Brett Bartow and other
members of the editorial team.


Contents at a Glance
Introduction   xxix

Part I Introduction
Chapter 1 The Network Programmability and Automation Ecosystem 1

Part II Linux
Chapter 2 Linux Fundamentals 21

Chapter 3 Linux Storage, Security, and Networks 119

Chapter 4 Linux Scripting 183

Part III Python


Chapter 5 Python Fundamentals 249

Chapter 6 Python Applications 311

Part IV Transport
Chapter 7 HTTP and REST 387

Chapter 8 Advanced HTTP 469

Chapter 9 SSH 509

Part V Encoding
Chapter 10 XML 553

Chapter 11 JSON 591

Chapter 12 YAML 615

Part VI Modeling
Chapter 13 YANG 639

Part VII Protocols


Chapter 14 NETCONF and RESTCONF 689

Chapter 15 gRPC, Protobuf, and gNMI 781

Chapter 16 Service Provider Programmability 819


Part VIII Programmability Applications


Chapter 17 Programming Cisco Platforms 881

Chapter 18 Programming Non-Cisco Platforms 957

Chapter 19 Ansible 989

Part IX Looking Ahead


Chapter 20 Looking Ahead 1109

Index   1121


Contents
Introduction   xxix

Part I Introduction

Chapter 1 The Network Programmability and Automation Ecosystem   1


First, a Few Definitions   2
Network Management   3
Automation   5
Orchestration   6
Programmability   7
Virtualization and Abstraction   8
Software-Defined Networking   13
Intent-Based Networking   13
Your Network Programmability and Automation Toolbox   14
Python   15
Ansible   15
Linux   16
Virtualization   17
YANG   17
Protocols   18
Encoding the Protocols   18
Transporting the Protocols   18
Software and Network Engineers: The New Era   19

Part II Linux

Chapter 2 Linux Fundamentals   21


The Story of Linux   21
History   21
Linux Today   22
Linux Development   22
Linux Architecture   23
Linux Distributions   26
The Linux Boot Process   26
A Linux Command Shell Primer   28
Finding Help in Linux   31


Files and Directories in Linux   35


The Linux File System   35
File and Directory Operations   38
Navigating Directories   38
Viewing Files   41
File Operations   46
Directory Operations   48
Hard and Soft Links   51
Hard Links   51
Soft Links   55
Input and Output Redirection   57
Archiving Utilities   67
Linux System Maintenance   73
Job, Process, and Service Management   73
Resource Utilization   83
System Information   85
System Logs   91
Installing and Maintaining Software on Linux   94
Manual Compilation and Installation   96
RPM   97
YUM   101
DNF   117
Summary   118

Chapter 3 Linux Storage, Security, and Networks   119


Linux Storage   119
Physical Storage   119
Logical Volume Manager   128
Linux Security   135
User and Group Management   136
File Security Management   143
Access Control Lists   148
Linux System Security   155
Linux Networking   158
The ip Utility   159
The NetworkManager Service   168


Network Scripts and Configuration Files   174


Network Services: DNS   179
Summary   181

Chapter 4 Linux Scripting   183


Regular Expressions and the grep Utility   184
The AWK Programming Language   193
The sed Utility   196
General Structure of Shell Scripts   203
Output and Input   207
Output   207
Input   211
Variables   215
Integers and Strings   216
Indexed and Associative Arrays   220
Conditional Statements   223
The if-then Construct   224
The case-in Construct   230
Loops   232
The for-do Loop   232
The while-do Loop   236
The until-do Loop   237
Functions   238
Expect   242
Summary   246

Part III Python

Chapter 5 Python Fundamentals   249


Scripting Languages Versus Programming Languages   250
Network Programmability   253
Computer Science Concepts   255
Object-Oriented Programming   256
Algorithms   258
Python Fundamentals   260
Python Installation   260
Python Code Execution   263
Python Data Types   270


Variables   270
Numbers   273
Strings   276
Operators   281
Python Data Structures   286
List   286
Dictionaries   290
Tuples   292
Sets   294
Control Flow   295
if-else Statements   296
for Loops   301
while Loops   304
Functions   306
Summary   309
References 310

Chapter 6 Python Applications   311


Organizing the Development Environment   311
Git   312
Docker   317
The virtualenv Tool   331
Python Modules   333
Python Applications   336
Web/API Development   336
Django   337
Flask   345
Network Automation   353
NAPALM   354
Nornir   359
Templating with Jinja2   363
Orchestration   375
Docker   376
Kubernetes   378
Machine Learning   382
Summary   385


Part IV Transport

Chapter 7 HTTP and REST   387


HTTP Overview   387
The REST Framework   392
The HTTP Connection   394
Client/Server Communication   394
HTTP/1.1 Connection Enhancements   395
Persistent Connections   395
Pipelining   396
Compression   396
HTTP Transactions   397
Client Requests   397
GET   398
HEAD   398
POST   399
PUT   402
DELETE   405
CONNECT   407
OPTIONS   407
TRACE   408
Server Status Codes   408
1xx: Informational Status Codes   411
2xx: Successful Status Codes   411
3xx: Redirection Status Codes   412
4xx: Client Error Status Codes   413
5xx: Server Error Status Codes   414
Server Status Codes on Cisco Devices   414
HTTP Messages   415
HTTP General Header Fields   418
Cache Servers: Cache-Control and Pragma   418
Connection   420
Date   420
Upgrade   420
Via   421
Transfer-Encoding   421
Trailer   422
Client Request Header Fields   422


Content Negotiation Header Fields: Accept, Accept-Charset, Accept-Encoding and Accept-Language   423
Client Authentication Credentials: Authorization, Proxy-Authorization and Cookie   423
Host   424
Expect   424
Max-Forwards   424
Request Context: From, Referer and User-Agent   424
TE   425
Server Response Header Fields   425
Age   425
Validator Header Fields: ETag and Last-Modified   425
Response Authentication Challenges: X-Authenticate and Set-Cookie   426
Response Control Header Fields: Location, Retry-After, and Vary   426
Response Context: Server   427
The HTTP Entity Header Fields   427
Control Header Fields: Allow   428
Representation Metadata Header Fields: Content-X   428
Content-Length   430
Expires   430
Resource Identification   431
URI, URL, and URN   431
URI Syntax   432
URI Components   432
Characters   435
Absolute and Relative References   436
Postman   436
Downloading and Installing Postman   438
The Postman Interface   438
Using Postman   441
HTTP and Bash   447
HTTP and Python   455
TCP Over Python: The socket Module   455
The urllib Package   458
The requests Package   464
Summary   467


Chapter 8 Advanced HTTP   469


HTTP/1.1 Authentication   469
Basic Authentication   472
OAuth and Bearer Tokens   474
Client Registration   476
Authorization Grant   477
Access Token   481
API Call to the Resource Server   483
State Management Using Cookies   483
Transport Layer Security (TLS) and HTTPS   487
Cryptography Primer   488
Key Generation and Exchange   488
Stream and Block Data Encryption   492
Message Integrity and Authenticity   493
Encryption and Message Integrity and Authenticity Combined   495
Digital Signatures and Peer Authentication   496
TLS 1.3 Protocol Operation   498
The TLS Version 1.3 Handshake   500
0-RTT and Early Data   502
The Record Protocol   503
HTTP over TLS (HTTPS)   503
HTTP/2   503
Streams, Messages, and Frames   504
Frame Multiplexing   505
Binary Message Framing   506
Other HTTP/2 Optimizations   507
Summary   508

Chapter 9 SSH   509


SSH Overview   509
SSH1   510
SSH2   512
SSH Transport Layer Protocol   513
SSH Authentication Protocol   514
SSH Connection Protocol   518


Setting Up SSH   521


Setting Up SSH on CentOS   521
Enabling SSH on Cisco Devices   526
Configuring and Verifying SSH on Cisco IOS XE   526
Configuring SSH on IOS XR   532
Configuring SSH on NX-OS   537
Secure File Transfer   540
Setting Up SFTP on Cisco Devices   545
Secure Copy Protocol   549
Summary   551
References 551

Part V Encoding

Chapter 10 XML   553


XML Overview, History, and Usage   553
XML Syntax and Components   554
XML Document Building Blocks   554
XML Attributes, Comments, and Namespaces   558
XML Formatting Rules   561
Making XML Valid   562
XML DTD   563
XSD   565
Brief Comparison of XSD and DTD   574
Navigating XML Documents   574
XPath   574
XML Stylesheet Language Transformations (XSLT)   578
Processing XML Files with Python   580
Summary   588

Chapter 11 JSON   591


JavaScript Object Notation (JSON)   591
JSON Data Format and Data Types   592
JSON Schema Definition (JSD)   595
Structure of the JSON Schema   595
Repetitive Objects in the JSON Schema   598
Referencing External JSON Schemas   602
Using JSON Schemas for Data Validation   609
Summary   614


Chapter 12 YAML   615


YAML Structure   616
Collections   618
Scalars   620
Tags   621
Anchors   624
YAML Example   625
Handling YAML Data Using Python   626
Summary   637

Part VI Modeling

Chapter 13 YANG   639


A Data Modeling Primer   639
What Is a Data Model?   639
Why Data Modeling Matters   640
YANG Data Models   642
Structure of a YANG Module   644
Data Types in a YANG Module   646
Built-in Data Types   647
Derived Data Types   648
Data Modeling Nodes   649
Leaf Nodes   649
Leaf-List Nodes   651
Container Nodes   652
List Nodes   653
Grouping Nodes   654
Augmentations in YANG Modules   656
Deviations in YANG Modules   658
YANG 1.1   662
Types of YANG Modules   663
The Home of YANG Modules   664
Native (Vendor-Specific) YANG Modules   666
IETF YANG Modules   670
OpenConfig YANG Modules   671
YANG Tools   673
Using pyang   673


Using pyangbind   679


Using pyang to Create JTOX Drivers   683
Summary   688

Part VII Protocols

Chapter 14 NETCONF and RESTCONF   689


NETCONF   689
NETCONF Overview   689
NETCONF Architecture   692
The NETCONF Transport Layer   693
NETCONF Transport Protocol Requirements   693
NETCONF over SSH   694
The NETCONF Messages Layer   695
Hello Messages   696
rpc Messages   698
rpc-reply Messages   699
The NETCONF Operations Layer   701
Retrieving Data: <get> and <get-config>   702
Changing Configuration: <edit-config>, <copy-config>, and <delete-config>   712
Datastore Operations: <lock> and <unlock>   720
Session Operations: <close-session> and <kill-session>   721
Candidate Configuration Operations: <commit>, <discard-changes>, and <cancel-commit>   722
Configuration Validation: <validate>   724
The NETCONF Content Layer   725
NETCONF Capabilities   731
The Writable Running Capability   732
The Candidate Configuration Capability   732
The Confirmed Commit Capability   732
The Rollback-on-Error Capability   732
The Validate Capability   733
The Distinct Startup Capability   733
The URL Capability   733
The XPath Capability   735
NETCONF Using Python: ncclient   735
RESTCONF   739
Protocol Overview   739


Protocol Architecture   742


The RESTCONF Transport Layer   743
The RESTCONF Messages Layer   743
Request Messages   743
Response Messages   744
Constructing RESTCONF Messages   745
RESTCONF HTTP Headers   745
RESTCONF Error Reporting   746
Resources   746
The API Resource   747
The Datastore Resource   749
The Schema Resource   750
The Data Resource   753
The Operations Resource   756
The YANG Library Version Resource   758
Methods and the RESTCONF Operations Layer   759
Retrieving Data: OPTIONS, GET, and HEAD   759
Editing Data: POST, PUT, PATCH, and DELETE   763
Query Parameters   771
RESTCONF and Python   777
Summary   779

Chapter 15 gRPC, Protobuf, and gNMI   781


Requirements for Efficient Transport   781
History and Principles of gRPC   782
gRPC as a Transport   784
The Protocol Buffers Data Format   786
Working with gRPC and Protobuf in Python   790
The gNMI Specification   798
The Anatomy of gNMI   799
The Get RPC   801
The Set RPC   807
The Capabilities RPC   810
The Subscribe RPC   811
Managing Network Elements with gNMI/gRPC   814
Summary   818


Chapter 16 Service Provider Programmability   819


The SDN Framework for Service Providers   819
Requirements for Service Provider Networks of the Future   819
SDN Controllers for Service Provider Networks   821
Segment Routing (SR)   823
Segment Routing Basics   823
Segment Routing Traffic Engineering   832
BGP Link State (BGP-LS)   843
BGP-LS Basics   843
BGP-LS Route Types   850
Node NLRI   854
Link NLRI   856
Prefix NLRI   858
Path Computation Element Protocol (PCEP)   859
Typical PCEP Call Flow   861
PCEP Call Flow with Delegation   865
Configuring PCEP in Cisco IOS XR   867
Summary   880

Part VIII Programmability Applications

Chapter 17 Programming Cisco Platforms   881


API Classification   882
Network Platforms   883
Networking APIs   884
Open NX-OS Programmability   884
IOS XE Programmability   885
IOS XR Programmability   886
Use Cases   887
Use Case 1: Linux Shells   887
Use Case 2: NX-API CLI   893
Use Case 3: NX-API REST   898
Use Case 4: NETCONF   905
Meraki   922
Meraki APIs   922
Meraki Use Case: Dashboard API   923
DNA Center   931
DNA Center APIs   933
Intent API   934


Device Management   934


Event Notifications and Webhooks   935
Integration API   935
Use Case: Intent API   936
Collaboration Platforms   942
Cisco’s Collaboration Portfolio   942
Collaboration APIs   944
Cisco Unified Communications Manager (CUCM)   944
Webex Meetings   945
Webex Teams   945
Webex Devices   946
Finesse   946
Use Case: Webex Teams   948
Summary   954

Chapter 18 Programming Non-Cisco Platforms   957


General Approaches to Programming Networks   957
The Vendor/API Matrix   957
Programmability via the CLI   958
Programmability via SNMP   959
Programmability via the Linux Shell   960
Programmability via NETCONF   960
Programmability via RESTCONF and REST APIs   961
Programmability via gRPC/gNMI   961
Implementation Examples   962
Converting the Traditional CLI to a Programmable One   962
Classical Linux-Based Programmability   967
Managing Network Devices with NETCONF/YANG   973
Managing Network Devices with RESTCONF/YANG   978
Summary   987

Chapter 19 Ansible   989


Ansible Basics   989
How Ansible Works   990
Ad Hoc Commands and Playbooks   996
The World of Ansible Modules   1000
Extending Ansible Capabilities   1003
Connection Plugins   1003


Variables and Facts   1005


Filters   1013
Conditionals   1016
Loops   1024
Jinja2 Templates   1034
The Need for Templates   1034
Variables, Loops, and Conditions   1040
Using Python Functions in Jinja2   1049
The join() Function   1050
The split() Function   1051
The map() Function   1054
Using Ansible for Cisco IOS XE   1055
Operational Data Verification Using the ios_command Module   1058
General Configuration Using the ios_config Module   1061
Configuration Using Various ios_* Modules   1069
Using Ansible for Cisco IOS XR   1073
Operational Data Verification Using the iosxr_command Module   1075
General Configuration Using the iosxr_config Module   1078
Configuration Using Various iosxr_* Modules   1083
Using Ansible for Cisco NX-OS   1084
Operational Data Verification Using the nxos_command Module   1086
General Configuration Using the nxos_config Module   1090
Configuration Using Various nxos_* Modules   1093
Using Ansible in Conjunction with NETCONF   1095
Operational Data Verification Using the netconf_get Module   1098
General Configuration Using the netconf_config Module   1103
Summary   1108

Part IX Looking Ahead

Chapter 20 Looking Ahead   1109


Some Rules of Thumb   1109
Automate the Painful Stuff   1109
Don’t Automate a Broken Process   1110
Clean Up Your Network   1110
Find Your Sources of Truth   1110
Avoid Automation You Can’t Reuse   1111


Document What You Do   1111


Understand What Level of Complexity You’re Willing to Handle   1111
Do a Cost/Benefit Analysis   1112
What Do You Study Next?   1112
Model-Driven Telemetry   1113
Containers: Docker and Kubernetes   1114
Application Hosting   1115
Software Development Methodologies   1116
Miscellaneous Topics   1117
What Does All This Mean for Your Career?   1118

Index   1121


Icons Used in This Book

[Figure: legend of the device icons used in this book, including Laptop, Cisco Carrier Routing System, Mobile PC with Software, Customer Router, Database, Wireless Switch, Cloud, Wireless Connectivity, Modem/Wireless Gateway, Server, Cisco Nexus 7000, and File Server]

Command Syntax Conventions


The conventions used to present command syntax in this book are the same conventions
used in Cisco’s Command Reference. The Command Reference describes these conven-
tions as follows:

■ Boldface indicates commands and keywords that are entered literally as shown. In actual configuration examples and output (not general command syntax), boldface indicates commands that are manually input by the user (such as a show command).

■ Italics indicate arguments for which you supply actual values.

■ Vertical bars (|) separate alternative, mutually exclusive elements.

■ Square brackets [ ] indicate optional elements.

■ Braces { } indicate a required choice.

■ Braces within brackets [{ }] indicate a required choice within an optional element.
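For example, a composite syntax line like the following (a hypothetical illustration constructed for this page, not an entry from an actual command reference) combines several of these conventions:

show ip route [vrf vrf-name] {static | connected}

Here, show, ip, route, static, and connected are keywords entered literally; vrf-name is an argument you replace with an actual VRF name; the braces require choosing exactly one of static or connected; and the square brackets make the entire vrf clause optional.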

Note This book covers multiple operating systems, and in each example, icons and router names indicate the OS that is being used. IOS and IOS XE use router names like R1 and R2 and are referenced by the IOS router icon. IOS XR routers use router names like XR1 and XR2 and are referenced by the IOS XR router icon.


Introduction
For more than three decades, network management has been entirely based on the command-line interface (CLI) and legacy protocols such as SNMP. These protocols and methods are severely limited. The CLI, for example, is vendor specific, lacks a unified data hierarchy (sometimes even for platforms from the same vendor), and was designed primarily as a human interface. SNMP suffers major scaling problems, is not fit for writing configuration to devices, and overall, is very complex to implement and customize.

In essence, automation aims at offloading as much work from humans as possible and delegating that work to machines. But with the aforementioned legacy interfaces and protocols, machine-to-machine communication is neither effective nor efficient and is, at times, close to impossible.

Moreover, device configuration and operational data have traditionally lacked a proper
hierarchy and failed to follow a data model. In addition, network management workflows
have always been far from mature, compared to software development workflows in
terms of versioning, collaboration, testing, and automated deployments.

Enter network programmability. Programmability revolves around programmable interfaces, commonly referred to as application programming interfaces (APIs). APIs are interfaces that are designed primarily to be used for machine-to-machine communication. A Python program accessing a network router to retrieve or push configuration, without human intervention, is an example of a machine-to-machine interaction. Contrast this with the CLI, where a human needs to manually enter commands on a device and then visually inspect the output.
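To make the contrast concrete, the following minimal Python sketch retrieves interface data from a router over its RESTCONF API using the requests library; the device address and credentials are hypothetical placeholders, and both the requests library (Chapter 7) and RESTCONF (Chapter 14) are covered in depth later in this book:

import requests

# Hypothetical router address; RESTCONF (RFC 8040) is served over HTTPS
URL = "https://10.0.0.1/restconf/data/ietf-interfaces:interfaces"

response = requests.get(
    URL,
    headers={"Accept": "application/yang-data+json"},  # request JSON encoding
    auth=("admin", "admin"),  # placeholder credentials
    verify=False,             # lab only: skip TLS certificate validation
)
print(response.status_code)   # 200 indicates success
print(response.json())        # interface data parsed into a Python dictionary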

Network equipment vendors (for both physical and virtual equipment) are placing ever-increasing emphasis on the importance of managing their equipment using programmable interfaces, and Cisco is at the forefront of this new world. This new approach to managing a network provides several benefits over legacy methods, including the following:

■ Normalizing the interface for interaction with network platforms by abstracting communication with these platforms and breaking the dependency of this communication on specific network OS scripting languages (for example, NX-OS, IOS XR, and Junos OS)

■ Providing new methods of interacting with network platforms and, in the process, enabling and aligning with new technologies and architectures, such as SDN, NFV, and cloud

■ Harnessing the power of programming to automate manual tasks and perform repetitive tasks efficiently

■ Enabling rapid infrastructure and service deployment by using workflows for service provisioning

■ Increasing the reliability of the network configuration process by leveraging error checking, validation, and rollback and minimizing human involvement in the configuration process


■ Using common software development tools and techniques for network configuration management, such as software development methodologies, versioning, staging, collaboration, testing, and continuous integration/continuous delivery

This book covers all the major programmable interfaces used in the market today for network management. The book discusses the protocols, tools, techniques, and technologies on which network programmability is based. Programming, operating systems, and APIs are not new technologies. However, programmable interfaces on network platforms, and using these programmable interfaces to fully operate and maintain a network, along with the culture accompanying these new methods and protocols, may be (relatively) new. This book explains, in detail, all the major components of this new ecosystem.

Goals and Methods of This Book


This is a “fundamentals” book aimed at transitioning network engineers from a legacy
network-based mindset to a software-based (and associated technologies) mindset. A
book covering fundamentals generally struggles to cover as many subjects as possible
with just enough detail. The fine balance between breadth and depth is challenging, but
this book handles this challenge very well.

This book introduces the emerging network programmability and automation ecosystem
based on programmable interfaces. It covers each protocol individually, in some significant
detail, using the relevant RFCs as guiding documents. Protocol workflows, messages, and
other protocol nuances tend to be dry, and at times boring, so to keep things interesting,
practical examples are given wherever possible and relevant. You, the reader, can follow
and implement these examples on your machine, which can be as simple as a Linux virtual
machine with Python Version 3.x installed, and free tools to work with APIs, such as Postman
and cURL. This book makes heavy use of the Cisco DevNet sandboxes, so in the majority of
cases, you do not need a home lab to test and experiment with physical equipment.

A whole section of the book is dedicated to putting the knowledge and skills learned
throughout the book to good use. One chapter covers programming Cisco platforms and
another covers programming non-Cisco platforms. A third chapter in that same section is
dedicated exclusively to Ansible. This book provides an abundance of hands-on practice.

The last chapter provides a way forward, discussing tools and technologies that you
might want to explore after you are done with this book.

Who This Book Is For


This book is meant for the following individuals and roles, among others:

■ Network architects and engineers who want to integrate programmability into their network designs

■ NOC engineers monitoring and operating programmable networks or those who rely on network management systems that utilize programmability protocols

■ Network engineers designing, implementing, and deploying new network services

■ Software engineers or programmers developing applications for network management systems

■ Network and software engineers working with networks or systems involving SDN, NFV, or cloud technologies

■ Network engineers pursuing their Cisco DevNet certifications

Whether you are an expert network engineer with no prior programming experience or knowledge, or a software engineer looking to utilize your expertise in the network automation domain, after reading this book, you will fully understand the most commonly used protocols, tools, technologies, and techniques related to the subject, and you will be capable of effectively using the newly learned material to design, implement, and operate full-fledged programmable networks and the associated network automation systems.

How This Book Is Organized


This book covers the information you need to transition from having a focus on networking technology to focusing on software and network programmability. This book covers seven main focus areas:

■ Operating systems: Linux

■ Software development: Python

■ Transport: HTTP, REST, and SSH

■ Encoding: XML, JSON, and YAML

■ Modeling: YANG

■ Protocols: NETCONF, RESTCONF, gRPC, and service provider programmability

■ Practical programmability: Cisco platforms, non-Cisco platforms, and Ansible

Each chapter in this book either explicitly covers one of these focus areas or prepares you for one of them. Special consideration has been given to the ordering of topics to minimize forward referencing. Following an introduction to the programmability landscape, Linux is covered first because to get anything done in network programmability, you will almost always find yourself working with Linux. The book next covers Python because the vast majority of the rest of the book includes coverage of Python in the context of working with various protocols. The following chapters present an organic flow of topics: transport, encoding, modeling, and the protocols that build on all the previous sections. For example, understanding NETCONF requires you to understand SSH, XML, and YANG, and understanding RESTCONF requires that you understand HTTP, XML/JSON, and YANG. Both NETCONF and RESTCONF require knowledge of Python, most likely running on a Linux machine.


How This Book Is Structured


The book is organized into nine parts, described in the following sections.

PART I, “Introduction”
Chapter 1, “The Network Programmability and Automation Ecosystem”: This chapter introduces the concepts and defines the terms that are necessary to understand the protocols and technologies covered in the following chapters. It also introduces the network programmability stack and explores the different components of the stack that constitute a typical network programmability and automation toolbox.

PART II, “Linux”


Chapter 2, “Linux Fundamentals”: Linux is the predominant operating system used for running software for network programmability and automation. Linux is also the underlying operating system for the vast majority of network device software, such as IOS XR, NX-OS, and Cumulus Linux. Therefore, to be able to effectively work with programmable devices, it is of paramount importance to master the fundamentals of Linux. This chapter introduces Linux, including its architecture and boot process, and covers the basics of working with Linux through the Bash shell, such as working with files and directories, redirecting input and output, performing system maintenance, and installing software.

Chapter 3, “Linux Storage, Security, and Networks”: This chapter builds on Chapter 2 and covers more advanced Linux topics. It starts with storage on Linux systems and the Linux Logical Volume Manager. It then covers Linux user, group, file, and system security. Finally, it explains three different methods for managing networking in Linux: the ip utility, the NetworkManager service, and network configuration files.

Chapter 4, “Linux Scripting”: This chapter builds on Chapters 2 and 3 and covers Linux
scripting using the Bash shell. The chapter introduces the grep, awk, and sed utilities and
covers the syntax and semantics of Bash scripting. The chapter covers comments, input
and output, variables and arrays, expansion, operations and comparisons, how to execute
system commands from a Bash script, conditional statements, loops, and functions. It also
touches on the Expect programming language.

PART III, “Python”


Chapter 5, “Python Fundamentals”: This chapter assumes no prior knowledge of programming and starts with an introduction to programming, covering some very important software and computer science concepts, including algorithms and object-oriented programming. It also discusses why programming is a foundational skill for learning network programmability and covers the fundamentals of the Python programming language,

including installing Python Version 3.x, executing Python programs, input and output,
data types, data structures, operators, conditional statements, loops, and functions.

Chapter 6, “Python Applications”: This chapter builds on Chapter 5 and covers the application of Python to different domains. The chapter illustrates the use of Python for creating web applications using Django and Flask, for network programmability using NAPALM and Nornir, and for orchestration and machine learning. The chapter also covers some very important tools and protocols used in software development in general, such as Git, containers, Docker, and virtual environments.

PART IV, “Transport”


Chapter 7, “HTTP and REST”: This is one of the most important chapters in this book.
It introduces the HTTP protocol and the REST architectural framework, as well as the
relationship between them. This chapter covers HTTP connections based on TCP. It
also covers the anatomy of HTTP messages and dives into the details of HTTP request
methods and response status codes. It also provides a comprehensive explanation of the
most common header fields. The chapter discusses the syntax rules that govern the use
of URIs and then walks through working with HTTP, using tools such as Postman, cURL,
and Python libraries, such as the requests library.

Chapter 8, “Advanced HTTP”: Building on Chapter 7, this chapter moves to more advanced HTTP topics, including HTTP authentication and how state can be maintained over HTTP connections by using cookies. This chapter provides a primer on cryptography for engineers who know nothing on the subject and builds on that to cover TLS and HTTP over TLS (aka HTTPS). It also provides a glimpse into HTTP/2 and HTTP/3 and the enhancements introduced by these newer versions of HTTP.

Chapter 9, “SSH”: Despite being a rather traditional protocol, SSH is still an integral component of the programmability stack. SSH remains one of the most widely used protocols, and having a firm understanding of it is crucial. This chapter discusses the three sub-protocols that constitute SSH and covers the lifecycle of an SSH connection: the SSH Transport Layer Protocol, User Authentication Protocol, and Connection Protocol. It also discusses how to set up SSH on Linux systems as well as how to work with SSH on the three major network operating systems: IOS XR, IOS XE, and NX-OS. Finally, it covers SFTP, which is a version of FTP based on SSH.

PART V, “Encoding”
Chapter 10, “XML”: This chapter covers XML, the first of three encoding protocols
covered in this book. XML is the oldest of the three protocols and is probably the most
sophisticated. This chapter describes the general structure of an XML document as well
as XML elements, attributes, comments, and namespaces. It also covers advanced XML
topics such as creating document templates using DTD and XML-based schemas using
XSD, and it compares the two. This chapter also covers XPath, XSLT, and working with
XML using Python.


Chapter 11, “JSON”: JSON is less sophisticated, newer, and more human-readable than XML, and it is therefore a little more popular than XML. This chapter covers JSON data formats and data types, as well as the general format of a JSON-encoded document. The chapter also covers JSON Schema Definition (JSD) for data validation and how JSD coexists with YANG.

Chapter 12, “YAML”: YAML is frequently described as a superset of JSON. YAML is slightly more human-readable than JSON, but data encoded in YAML tends to be significantly lengthier than its JSON-encoded counterpart. YAML is a very popular encoding format and is required for effective use of tools such as Ansible. This chapter covers the differences between XML, JSON, and YAML and discusses the structure of a YAML document. It also explains collections, scalars, tags, and anchors. Finally, the chapter discusses working with YAML in Python.

PART VI, “Modeling”


Chapter 13, “YANG”: At the heart of the new paradigm of network programmability is data modeling. This is a very important chapter that covers both generic modeling and the YANG modeling language. This chapter starts with a data modeling primer, explaining what a data model is and why it is important to have data models. Then it explains the structure of a data model. This chapter describes the different node types in YANG and their place in a data model hierarchy. It also delves into more advanced topics, such as augmentations and deviations in YANG. It describes the difference between open-standard and vendor-specific YANG models and where to get each type. Finally, the chapter covers a number of tools for working with YANG modules, including pyang and pyangbind.

PART VII, “Protocols”


Chapter 14, “NETCONF and RESTCONF”: NETCONF was the first protocol developed
to replace SNMP. RESTCONF was developed later and is commonly referred to as the
RESTful version of NETCONF. Building on earlier chapters, this chapter takes a deep
dive into both NETCONF and RESTCONF. The chapter covers the protocol architecture
as well as the transport, message, operations, and content layers of each of the two
protocols. It also covers working with these protocols using Python.

Chapter 15, “gRPC, Protobuf, and gNMI”: The gRPC protocol, initially developed by Google, is a protocol for network programmability that borrows its operational concepts from the communication models of distributed applications. This chapter provides an overview of the motivation that drove the development of gRPC. It covers the communication flow of gRPC and the protocol buffers (Protobuf) format used to serialize data for gRPC communications. The chapter also shows how to work with gRPC using Python. The chapter then takes a deep dive into gNMI, a gRPC-based specification. Finally, the chapter shows how gRPC and gNMI are used to manage a Cisco IOS XE device.


Chapter 16, “Service Provider Programmability”: Service providers face unique challenges due to the typical scale of their operations and the stringent KPIs that must be imposed on their networks, especially given the heated race to adopt 5G and associated technologies. This chapter discusses how such challenges influence the programmability and automation in service provider networks and provides in-depth coverage of Segment Routing, BGP-LS, and PCEP.

PART VIII, “Programmability Applications”


Chapter 17, “Programming Cisco Platforms”: This chapter explores the programmability capabilities of several Cisco platforms, covering a wide range of technology domains. In addition, this chapter provides several practical examples and makes heavy use of Cisco's DevNet sandboxes. This chapter covers the programmability of IOS XE, IOS XR, NX-OS, Meraki, DNA Center, and Cisco's collaboration platforms, with a use case covering Webex Teams.

Chapter 18, “Programming Non-Cisco Platforms”: This chapter covers the program-
mability of a number of non-Cisco platforms, such as the Cumulus Linux and Arista EOS
platforms. This chapter shows that the knowledge and skills gained in the previous chap-
ters are truly vendor neutral and global. In addition, this chapter shows that programma-
bility using APIs does in fact abstract network configuration and management and breaks
the dependency on vendor-specific CLIs.

Chapter 19, “Ansible”: This chapter covers a very popular tool that has become synony-
mous with network automation: Ansible. As a matter of fact, Ansible is used in the appli-
cation and compute automation domains as well. Ansible is a very simple, yet extremely
powerful, automation tool that provides a not-so-steep learning curve, and hence a quick
and effective entry point into network automation. This is quite a lengthy chapter that
takes you from zero to hero in Ansible.

PART IX, “Looking Ahead”


Chapter 20, “Looking Ahead”: This chapter builds on the foundation covered in the pre-
ceding chapters and discusses more advanced technologies and tools that you might want
to explore to further your knowledge and skills related to network programmability and
automation.



Chapter 1

The Network Programmability and Automation Ecosystem

We all have that one story we tell on ourselves about some stupid mistake that brought
down a network segment or even an entire network. Here’s mine.

Thirty years ago, I was sitting in an office in Albuquerque, logged in to a router in Santa
Fe, making some minor, supposedly nondisruptive modifications to the WAN interface.
I wanted to see the changes I had made to the config, and I got as far as typing sh of the
IOS show command before realizing I was still in interface config mode and needed to
back out of it before entering the show command. But instead of backspacing or taking
some other moderately intelligent action, I reflexively hit Enter.

The router, of course, interpreted sh as shutdown, did exactly what it was told to do,
and shut down the WAN interface—the only interface by which the router was remotely
accessible. There was no warning message. No “You don’t want to do that, you idiot.” The
WAN interface just went down, leaving me no choice but to drive the 60 miles to Santa
Fe to get physical access to the router, endure the sour looks of the workers in the office
I had isolated, and turn the interface back up.

There are other stories. Like the time not too many years after The Santa Fe Incident
when I mistyped a router ID, causing the OSPF network to have duplicate RIDs and
consequently misbehave in some interesting ways. I think that one later became a trouble-
shooting exercise in one of my books.

My point is that configuration mistakes cause everything from annoying little link fail-
ures to catastrophic outages that take hours or days to correct and put your company on
the front pages of the news. Depending on the study you read, human error accounts for
60% to 75% of network outages.

Every network outage has a price, whether it’s the cost of a little branch office being
offline for an hour or a multinational corporation suffering millions of dollars in lost rev-
enue and damaged reputation.


Even when we’re not making configuration mistakes, we Homo sapiens tend to be a
troublesome and expensive feature of any network.

The cost of building a network (CAPEX) has always been outweighed by the cost of
running that network (OPEX). And that operational cost is more than just paying people
to configure, change, monitor, and troubleshoot the network. There are costs associated
with direct human operations, such as the following:

■■ Configuration mistakes, large and small, which are exacerbated by working under
pressure during network outages

■■ Failure to comply with configuration standards

■■ Failure to even have configuration standards

■■ Failure to see and correctly interpret network telemetry that indicates impending
trouble

■■ Failure to maintain accurate network documentation

■■ Having network experts constantly in “firefighting mode” rather than performing
steady-state network analysis and advanced planning

It’s important to emphasize that network automation and programmability do not neces-
sarily mean reducing the workforce, although workers are going to require some retrain-
ing. At its best, automation makes network staff more valuable by removing their daily
“firefighting drills,” allowing them to spend their time thinking about the 3- and 5-year
network plan; evaluating new technologies, vendor solutions, and industry trends; analyz-
ing whether the network can better serve company objectives; and just keeping a better
eye on the big picture.

Pilots of Boeing 777s report that on an average flight, they spend just 7 minutes manu-
ally flying the plane. They are quick to emphasize, however, that while the autopilot is
doing the flying, the pilots are still very much in control. They input instructions and
expectations, and then supervise the automated processes. The autopilot performs the
mundane physical tasks necessary to fly the plane, and it probably performs those tasks
more quickly and accurately than most pilots do. The pilots, freed from the distractions
of manual flying, apply their expertise to monitoring approaching weather and flight pat-
terns, keeping an eye on the overall health of the plane, and even looking over the shoul-
der of the autopilot to be sure it is correctly executing the instructions they gave it. The
pilot’s role is expanded, not diminished.

The pilot tells the airplane what he wants (that is, programming), and the plane does
what it is told (that is, automation). We don’t have this level of artificial intelligence and
machine learning in our networks yet, but that’s where we’re headed.

First, a Few Definitions


There’s a fair amount of confusion around the concepts discussed in this book. Is
automation just a part of network management? Are automation and programmability the
same thing? How does orchestration fit in? And does SDN really stand for “Still Does
Nothing”?

Network Management
The terms automation, programmability, orchestration, virtualization, SDN, and
intent—all of which are defined in this section—apply, in one way or another, to
network management. So let’s start by defining that:

Network management is how you make a network meet whatever expectations you have
of it.

This is about as simple a definition as you can get, but behind this one sentence is
arrayed an extensive repository of systems, processes, methodologies, rules, and stan-
dards pertaining to the management of all aspects of the network.

One framework for sorting out all the aspects of a network to be managed is FCAPS,
which represents the following areas:

■■ Fault management

■■ Configuration management

■■ Accounting management

■■ Performance management

■■ Security management

It’s doubtful that you would be reading this book if you didn’t already know what
these five areas represent. You probably also know that there are deep aspects of each.
Configuration management, for example, covers not just provisioning but configuration
standards and procedures, change management, configuration change tracking, reference
designs, configuration file archiving, and the specialized systems to support all that stuff.
You probably also hear ITIL discussed regularly in the context of network management.
ITIL, which stands for Information Technology Infrastructure Library, is a library of
principles, processes, and procedures that support FCAPS but that also go beyond
that framework to apply to personnel and organizations, IT products, partners, suppliers,
practices, and services that go into managing the network. Whereas FCAPS is system
oriented, ITIL is services and governance oriented.

The outline of the ITIL 4 management practices provides an example of the complexity
of ITIL:

■■ General management practices

    ■■ Architecture management

    ■■ Continual service improvement

    ■■ Information security management

    ■■ Knowledge management

    ■■ Measurement and reporting

    ■■ Organizational change management

    ■■ Portfolio management

    ■■ Project management

    ■■ Relationship management

    ■■ Risk management

    ■■ Service financial management

    ■■ Strategy management

    ■■ Supplier management

    ■■ Workforce and talent management

■■ Service management practices

    ■■ Availability management

    ■■ Business analysis

    ■■ Capacity and performance management

    ■■ Change control

    ■■ Incident management

    ■■ IT asset management

    ■■ Monitoring and event management

    ■■ Problem management

    ■■ Release management

    ■■ Service catalog management

    ■■ Service configuration management

    ■■ Service continuity management

    ■■ Service design

    ■■ Service desk

    ■■ Service level management

    ■■ Service request management

    ■■ Service validation and testing

■■ Technical management practices

    ■■ Deployment management

    ■■ Infrastructure and platform management

    ■■ Software development and management

This is quite a list, and it covers only the top-level topics. Fortunately for our definitions,
we don’t have to go into all of them. I just wanted to show you how extensive and formal-
ized network management can be. For the purposes of this book we don’t need to go into
the highly structured, highly detailed ITIL specifications. Most of the topics in this book
support the simpler FCAPS framework; in fact, most topics in the book support configu-
ration management.

Managing a network system means interacting with the system in some way. It usually
involves the following:

■■ Accessing the CLI via SSH (don’t use Telnet!) or directly via a console port for
configuration, monitoring, and troubleshooting

■■ Monitoring (and sometimes changing) the system through Simple Network
Management Protocol (SNMP) agents and Management Information Bases (MIBs)

■■ Collecting system logs via syslog

■■ Collecting traffic flow statistics with NetFlow or IP Flow Information Export
(IPFIX)

■■ Sending information to and extracting information from network devices through
Application Programming Interfaces (APIs), whether the APIs are RESTful (such as
RESTCONF) or not (such as NETCONF or gRPC)

Automation
Automation, very simply, means using software to perform a task you would otherwise
do manually. And automation is nothing new or unfamiliar to you. Routing protocols, for
example, are automation programs that save you the work of manually entering routes at
every network node. DNS is an automation program that saves you from having to look
up the IP address of any destination you want to talk to. You get the point.

Automation software might be built into a network node, might be a purchased software
platform, or might be a program or script you create yourself.

That last bit, creating your own automation routines, is what this book is all about: It
gives you the fundamentals to be able to understand and operate the underlying protocols
used by products, as well as utilize those protocols in your scripts and programs.
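
To make this concrete, here is a minimal sketch of such a homegrown script, using the
popular open-source Netmiko library; the device address, credentials, and VLAN shown
are placeholder values, not a real device:

from netmiko import ConnectHandler

# Placeholder device details; substitute a lab device of your own.
device = {
    "device_type": "cisco_ios",
    "host": "192.0.2.1",
    "username": "admin",
    "password": "admin",
}

# Push the same two configuration lines you would otherwise type at the CLI.
with ConnectHandler(**device) as connection:
    output = connection.send_config_set(["vlan 100", "name USERS"])
    print(output)

Run the same few lines against a list of hundreds of devices instead of one, and the
appeal of even a tiny script like this becomes obvious.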


Besides the obvious benefit of making life easier, automation provides the following
advantages:

■■ Fast rollout of network changes

■■ Relief from performing routine repetitive tasks

■■ Consistent, reliable, tested, standards-compliant system changes

■■ Reduced human errors and network misconfigurations

■■ Better integration with change control policies

■■ Better network documentation and change analysis

Out of all of these advantages, you might be inclined to choose speed of deployment
as the most important. Being able to deploy a network change “with a push of a button”
definitely is less expensive than visiting each network node and manually reconfiguring.
The time savings increase dramatically as the number of affected nodes increases.

However, consistent and reliable network changes, along with reduced human error (that
is, accuracy) are of even greater benefit than speed of deployment. The significance of
accuracy becomes more obvious as the number of times a change has to be implemented
increases. Implementing a network change on five devices can be done fairly accurately
using primitive tools and elevated vigilance. This may not be possible when implementing
the same change on 1000 devices. Speed saves operational expense during deployment,
but accuracy provides cumulative benefits over the life of the network.

Orchestration
Orchestration, in the traditional musical sense, is the composition of musical parts for a
diversity of instruments. When the instrumentalists play their individual parts together—
usually under the direction of a conductor—you get Beethoven’s Fifth or the theme to
Lion of the Desert or Lawrence of Arabia.

Orchestration in the IT sense is very much the same: Individual elements work together,
following their own instructions, to create a useful service or set of services. For exam-
ple, the deployment of a certain application in a data center is likely to require compute,
storage, security, and network resources. Orchestration enables you to coordinate and
deploy all of those resources to accomplish one goal.

Does this sound like automation? Well, yes and no. It’s true that the differences between
automation and orchestration can sometimes get fuzzy, but here’s the difference:

■■ Automation is the performance of a single task, such as a configuration change
across a set of switches or routers or the deployment of a virtual machine on
a server, without manual (human) intervention.

■■ Orchestration is the coordination of many automated tasks, in a specific sequence,
across disparate systems to accomplish a single objective. Another term for this is
workflow.


So, automation performs individual tasks, and orchestration automates workflows. Both
automation and orchestration save time and reduce human error.

A wealth of orchestration tools are available on the market, including the following:

■■ VMware vRealize Orchestrator, for VMware environments

■■ OpenStack Heat, for OpenStack

■■ Google Cloud Composer, for (you guessed it) orchestrating Google Cloud

■■ Cisco Network Services Orchestrator (NSO), which, as the name implies, focuses on
network services

■■ RedHat Ansible, which is usually used as a simple automation tool but can also
perform some workflow automation

■■ Kubernetes, a specialized platform for orchestrating containerized workloads and
services

Programmability
It’s an understandable misconception that programmability is a part of automation. After
all, most automation does not work unless you give it operating parameters. A routing
protocol, for instance, doesn’t do anything unless you tell it what interfaces to run on,
perhaps what neighbors to negotiate adjacencies with, and what authentication factors
to use.

Is network programmability, then, just providing instructions to automation software?
No, that’s configuration.

Programmability is the ability to customize your network to your own standards,
policies, and practices. Programmability enables you to operate your network and the
services it supports as a complete entity, built to support the specifics of your business.
In this age in which most businesses depend on their applications and are built around
their networks, that’s huge.

Isn’t that the way it should always have been? Your network should comply with your
requirements; you should not have to adjust your requirements to comply with your net-
work. Once you can customize your network to your own standards, you have the power
to innovate, to quickly adapt to competitive challenges, and to create advantages over
your competitors. These are all far more important advantages than just operational sav-
ings, reduced downtime, and faster problem remediation. (Although you get all that, too.)

But programmability, as a technical marketing term today, has a slightly different mean-
ing. Programmability, used in this context, is the ability to monitor devices, retrieve data,
and configure devices through a programmable interface, which is a software interface
to your device through which other software can speak with your device. This interface is
formally known as an Application Programming Interface (API).


What is the difference between a legacy interface such as the CLI and an API? For one,
a CLI was created with a human operator in mind. A human types commands into the
terminal and receives output to the screen. On the other hand, an API is used by other
software, such as automation products or custom Python scripts to speak with a device
without any human interaction, apart from writing the Python script itself or configuring
the device parameters on that automation product.

APIs are covered in a lot of detail in this book because they are a foundational building
block for any software-to-software interaction. Instead of reading an exhaustive compari-
son between legacy interfaces and APIs, you will see for yourself the major advantages of
interacting with your network through programmable interfaces as you progress through
the chapters of this book.
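
As a small preview, the following sketch uses Python’s requests library to pull interface
data from a device’s RESTCONF API; the address and credentials are placeholders, and it
assumes RESTCONF is enabled on the device:

import requests

# Placeholder RESTCONF URL and credentials; certificate verification is
# disabled here, which is acceptable only in a lab.
url = "https://192.0.2.1/restconf/data/ietf-interfaces:interfaces"
headers = {"Accept": "application/yang-data+json"}

response = requests.get(url, headers=headers, auth=("admin", "admin"), verify=False)

# The device answers with structured JSON rather than screen-formatted
# text, so there is no CLI output to parse.
print(response.json())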

Virtualization and Abstraction


Virtualization is one of those words you’ve known and understood throughout your
career. First there are those V acronyms: VPN, VPLS, VLAN, VXLAN, VC, VM, VRF,
VTEP, OTV, NVE, VSAN, and more. There is often virtualization even when the word
itself isn’t used. TCP, for example, provides a virtual point-to-point connection over con-
nectionless IP by using handshaking, sequence numbers, and acknowledgments.

Virtualization is the creation of a service that behaves like a physical service but is not.
We use virtualization to share resources, such as consolidating multiple data networks
over a shared MPLS cloud, communicating routing tables (VRFs) for multiple isolated
networks and security zones over a shared MP-BGP core, implementing multiple VLANs
on one physical LAN, or creating virtualized servers on a single physical server. The moti-
vation might be to create a bunch of different services when you have only one physical
resource to work with, or it might be to more efficiently use that resource by divvying
it up among multiple users, each of whom gets the impression that they are the only one
using the resource.

Figure 1-1 VLANs Connected to a Switch Are Not Aware of Each Other


Boiling all this down to a simple definition, virtualization is a software-only or software-
defined service built on top of one or more hardware devices. In the case of a virtualized
network, the network might look quite different from the underlying physical network.
For example, in Figure 1-1, from the individual perspectives of VLANs 100, 200, and
300, each is connected to a single switch, and none is aware of the other two VLANs.
In Figure 1-2, the Layer 3 VPNs Red, Green, and Blue are built on top of a single MPLS
infrastructure but are aware only of their own VPN peers.

And here are a couple more definitions related to the networks pictured here: An overlay
network is a software-defined network running over a physical underlay network. You’ll
encounter overlays and underlays particularly in data center networking.

Figure 1-2 VPNs Built on a Single MPLS Infrastructure That Are Aware Only of Their
Own VPN Peers

Abstraction is a term you may not understand clearly, although you have certainly
heard it used in the context of network abstractions. You also likely use the concept
often when you’re whiteboarding some network, and you draw a cloud to represent the
Internet, an MPLS core, or some other part of a network, where you just mean that pack-
ets go in at one edge and come out at some other edge.

Abstraction goes hand-in-hand with virtualization because we build virtualized services
on top of abstractions. The “whiteboard cloud” example illustrates this: Our whiteboard
discussion is focused on the details of ingress and egress packet flows, not on the magic
that happens in the cloud to get the packets to the right place.

Virtual machines are, for instance, built on an abstraction of the underlying physical
server (see Figure 1-3). The server abstraction is the CPU, storage, memory, and I/O allot-
ted to the VM rather than the server itself.


Figure 1-3 A Server Abstracted into the Components Allotted to Each VM

Another example, sticking with servers, is a container platform such as Docker (see
Figure 1-4), which packages up application code and its dependencies into containers that
are isolated from the underlying server hardware. The advantage of both VMs and con-
tainerized applications is that they can be deployed, changed, and moved independently
of the physical infrastructure.

Figure 1-4 Using Containers to Abstract Away the Underlying Server and Operating
System for Individual Applications


Network abstraction is the same idea but with more elements. By adding an “abstraction
layer”—or abstracting away the network—you focus only on the virtualized network:
adding, changing, and removing services independently of the network infrastructure
(see Figure 1-5). Just as a VM uses some portion of the actual server resources, virtual
network services use some portion of the physical network resources. Network abstrac-
tion can also allow you to change infrastructure elements without changing the virtual
network.

Figure 1-5 Data Center Infrastructure Abstracted Away by VXLAN

A network abstraction layer is essential for efficient automation and programmability
because you want to be able to control your network independently of the specifics of
vendors and operating systems. One of the things you will learn in Chapter 6, “Python
Applications,” for example, is how to use a Python library called NAPALM (Network
Automation and Programmability Abstraction Layer with Multivendor Support). In
Chapter 13, “YANG,” you’ll learn about YANG, a network modeling language (that is, a
language for specifying a model, or an abstraction, of the network).
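
As a taste of what an abstraction layer looks like in practice, here is a minimal NAPALM
sketch; the driver name, address, and credentials are placeholders, and the same
get_facts() call works unchanged across the vendors NAPALM supports:

from napalm import get_network_driver

# The driver name is the only vendor-specific detail in this script.
driver = get_network_driver("ios")
device = driver(hostname="192.0.2.10", username="admin", password="admin")

device.open()
facts = device.get_facts()  # returns a normalized, vendor-neutral dictionary
print(facts["hostname"], facts["os_version"], facts["uptime"])
device.close()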

The abstraction, or model, of the network serves as a single source of truth for your
automation and orchestration. Do you have enough resources for a service that is about
to be deployed? What’s the available bandwidth? How will RIBs and FIBs change, and is
there enough memory capacity to support the changes? What effect will the added ser-
vice have on existing Layer 2 or Layer 3 forwarding behavior?


Without a single source of truth, the “intelligent” part of your automation or orchestra-
tion must reach out and touch every element in the network to gather the information it
needs for pre-deployment verification. Each element is its own source of truth and might
or might not express its truth consistently with other elements—especially in multiven-
dor networks. A single source of truth, if built properly, continuously collects network
state and telemetry to provide a real-time, accurate, and relevant model of the network.
Every service you want to deploy can then be verified against this abstraction before it is
deployed, increasing your confidence and decreasing failures.

But don’t confuse this perspective of a single source of truth with a Configuration
Management Database (CMDB), which is a repository of what the network state should
be and, therefore, is not updated from the live network. Instead, the live network state is
compared to the CMDB to verify its compliance.

Network abstraction gives rise to Network as Code (NaC) or the broader Infrastructure
as Code, which encompasses network, storage, and compute. NaC is the code that ties
together network abstraction, virtualization, programming, and automation to create an
intelligent interface to your network.

NaC also brings networking into the DevOps realm and enables the application of proven
software practices such as Continuous Integration/Continuous Delivery/Continuous
Deployment (CI/CD), illustrated in Figure 1-6. Among the many tools you can use for
developing NaC is RedHat Ansible, which is covered in Chapter 19, “Ansible.”

Figure 1-6 CI/CD: Build, Test, and Merge (Continuous Integration); Automatically
Release to Repository (Continuous Delivery); Automatically Deploy to Production
(Continuous Deployment)

With your new network code developed, tested, and merged with existing code and
passed to the virtualization layer, the last bit of the workflow is to translate the general-
ized code into verified, vendor- and operating system–specific configurations and push
them to physical network elements. Interactive scripts using languages such as Expect,
TCL, and Perl were—and sometimes still are—used to log in to the network devices and
configure them; these scripts just automate the actions an operator would take to manu-
ally configure the device via the CLI.

These days, automation tools interact with networking devices through APIs, which are
themselves abstractions of the underlying physical device. The difference is that the
APIs reside on the individual devices and are specific to their own device. Automation
software usually communicates with the APIs via eXtensible Markup Language (XML)
or JavaScript Object Notation (JSON), covered in Chapters 10, “Extensible Markup
Language (XML) and XML Schema Definition (XSD),” and 11, “JavaScript Object
Notation (JSON) and JSON Schema Definition (JSD).” You’ll find that even the CLIs of
modern routers and switches are actually applications running on top of the local APIs
rather than direct interfaces to the operating systems.

Software-Defined Networking
Software-Defined Networking (SDN) isn’t covered in this book, but all this discussion
of automation, programmability, network abstraction, and APIs merits at least a mention
of SDN.

The “SDN 101” concept of the technology is that SDN is a centralized control plane
on top of a distributed forwarding plane. Instead of a network of switches and routers
that each have their own control planes, SDN “pops the control planes off” and central-
izes them in one controller or a controller cluster. The control plane is greatly simplified
because individual control planes no longer have to synchronize with each other to main-
tain consistent forwarding behavior.

This concept embodies much of what we’ve been discussing in the previous sections:
separation of physical infrastructure from service workflows, a network abstraction layer,
and a single source of truth. Incorporating everything we’ve previously discussed pro-
vides a more refined definition of SDN:

SDN is a conceptual framework in which networks are treated as abstractions and
are controlled programmatically, with minimal direct touch of individual network
components.

This definition still adheres to the idea of centralized control, but it encompasses a wider
set of SDN solutions, such as SD-WAN, which virtualizes the wide-area network and
places it under centralized control, as well as vendor SDN solutions such as Cisco’s
Application Centric Infrastructure (ACI) and VMware’s NSX. The definition also takes in
products such as Cisco’s Network Services Orchestrator (NSO) that don’t really fit in the
more traditional definition of SDN.

Note ACI and NSX are often lumped together when giving examples of SDN solutions.
While they do many of the same things, there are also some significant differences in
how they work and what they do. For instance, ACI has a different approach to network
abstraction from the approach described here.

Intent-Based Networking
Intent-Based Networking (IBN) is the next evolutionary step beyond SDN. Like SDN, it isn’t
covered in this book, but it is certainly based on the concepts described so far. Chapter 20,
“Looking Ahead,” has more to say about IBN; you’ll also get some exposure to it in the
discussion of Cisco DNA Intent APIs in Chapter 17, “Programming Cisco Platforms.”

SDN gives you a centralized control point for your network, but you still have to provide
most of the intelligence to deploy or change the underlay and overlay. In other words,
you still have to tell the control plane how to do what you want it to do.


IBN adds an interpretive layer on top of the control plane that enables you to just express
what you want—that is, your intent—and IBN translates your expressed intent into how
to do it. Depending on the implementation, the IBN system either then pushes the devel-
oped configurations to a controller for deployment to the infrastructure or (more often)
acts as the control plane itself and pushes configurations directly to the infrastructure.

Once your intent is configured (intent fulfillment), an IBN system uses closed-loop
telemetry and state tables to monitor the network and ensure that it does not drift from
your expressed intent (intent assurance).

IBN is still in its infancy as this book is being written, but it holds enormous potential for
transforming the way we operate our networks. You’ll learn more about this in Chapter 20.

Your Network Programmability and Automation Toolbox
All of the definitions so far in this chapter bring us to an important question: What tools
does an adept network engineer and architect need to carry? And with that question we
arrive at the entire purpose of this book.

One of the reasons for spending so much time on definitions is to be able to classify
various tools and to understand the relationships among those classifications. Figure 1-7
offers one perspective on how you might classify tools within the functions discussed in
the previous section and a number of functions not discussed in this chapter.

[Figure 1-7 depicts the ecosystem as a stack of layers, each with example tools:
Programming (Python, Go, R, JavaScript); Applications (Custom Scripts, Ansible, Chef,
Puppet, Salt, CFEngine); Platform (Linux, Windows, macOS); Virtualization (L3VPN/
VPLS, VRF, L2VPN, VLAN, VXLAN/EVPN, MPLS, Segment Routing); Abstraction
(YANG Models); Protocol (NETCONF, RESTCONF, PCEP, gNMI, gRPC); Encoding
(XML, JSON, YAML); Transport (SSH, HTTP/REST); and Infrastructure (Device 1
through Device n, some fronted by APIs).]

Figure 1-7 The Network Programmability and Automation Ecosystem


Before going further, it’s important to note that Figure 1-7 is just one perspective. The
order in which the Application, Automation, Platform, Virtualization, and Abstraction
layers appear and how they interact can vary according to the network environment.
What’s more important are the tools available to you within the various layers.

It’s also important to note that I’ve provided examples of more tools than are covered
in this book. And that gets us to why the authors of this book have chosen the tools we
have for you to learn.

Python
At the top of the programmability and automation ecosystem are programming lan-
guages. Python, Go, R, and JavaScript are given as examples in Figure 1-7. There are, of
course, other programming languages that could be added here, C and C++ being the
most prominent of them, although they are used more by people making their living
at software development than by people making their living at other things—like
networking—who need to be able to write programs and scripts to make their job easier.
There are also a number of languages that we could add to the list, such as Perl, Expect,
and TCL, that are still around to one degree or another but that have been overshadowed
by newer, more powerful languages. Like Python.

Which brings us to why this book exclusively covers Python: It’s by far the most widely
used programming language for network automation, supporting a terrific number of
libraries, modules, and packages specific to networking. Python is easy to learn, easy to
use, and easy to debug, which fits the bill for networkers who just need to get their job
done without having to become professional programmers. That said, Python is far from
a “beginner” language. It’s used extensively by companies such as Facebook, Netflix,
Instagram, Reddit, Dropbox, and Spotify. Google software developers even have a saying:
“Use Python where we can, C++ where we must.”

Python is versatile, working equally well for scripting and as a glue language (for tying
together modules written in other languages). It’s also highly portable to different plat-
forms. Once you know a little Python, you might even find yourself using it for quick
little tasks such as running math calculations.
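
For instance, here is the kind of quick, throwaway task Python handles well; the standard
library’s ipaddress module does the subnet math for you:

import ipaddress

# Quick subnet math, no calculator required.
network = ipaddress.ip_network("10.10.0.0/22")
print(network.netmask)                       # 255.255.252.0
print(network.num_addresses)                 # 1024
print(list(network.subnets(new_prefix=24)))  # the four /24 subnets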

Finally, the more you use packaged automation products such as Ansible or Cisco ACI
or interact with network devices through their APIs rather than directly with their CLI,
the more you’ll find Python to be an essential tool in your toolbox. Chapter 5, “Python
Fundamentals,” covers the basics of Python, and Chapter 6, “Python Applications,”
covers some useful libraries and tools that you will want to use when automating your
network using Python.

Ansible
The next category of tools in your programmability and automation toolbox is applica-
tions. And first on that list are custom scripts. If you are already wielding a programming
language such as Python to perform your job, you almost certainly have a collection of

scripts that you use to automate everyday repetitive tasks. The more proficient you are,
the more useful your scripts become. You’ll learn how to script some of the boring parts
of your job in this book.

Also in the applications category are a number of prebuilt automation platforms that you
can either download for free or purchase: Ansible, Salt, Chef, Puppet, and CFEngine are
examples, but there are many others. What they have in common is that they all began
life as platforms for automating server management. If you’re in a DevOps shop or any
environment that orchestrates large numbers of end systems, your organization probably
already has a favorite automation platform from this list.

We’ve chosen Ansible as the automation engine to familiarize you with in this book. Not
only is Ansible open source and available for the very reasonable price of free, it is the
most popular automation framework among networkers. It’s easy to learn and integrates
well as a Python module; in fact, Ansible is written in Python. Even if you end up using
some other framework within your organization, having a grounding in Ansible is valu-
able and will give you a head start in understanding the concepts of any of this class of
automation platforms. Ansible is covered in Chapter 19.

Linux
The next tool in the lineup is the platform on which you’re doing your programming and
running your automation. Not the hardware itself but the operating system on the hard-
ware. Figure 1-7 lists the three most well-known operating systems: Linux, Windows,
and macOS. For each of these, there are specific versions and distributions. For example,
Linux includes Fedora, CentOS, SuSE, Ubuntu, and many others. Under Windows are
the many incarnations of Windows Server, Windows 7, 8, 10, and so on. There are also
platform-specific operating systems on which your automation applications can run (for
example, Cisco IOS XE and NX-OS).

Recall the earlier comment about Figure 1-7 being just one perspective on how the pro-
grammability and automation ecosystem is organized. The Programming, Applications,
and Platform tools might be running on a management server. They might be running on
your laptop. One or more of the layers might be running directly on top of an infrastruc-
ture device or themselves might be part of the infrastructure. So, don’t take Figure 1-7 as
the only way the various elements of the ecosystem might interact with each other.

For network programmability and automation, you need to have a strong working knowl-
edge of Linux. Three chapters in this book are dedicated to Linux: Chapter 2, “Linux
Fundamentals,” Chapter 3, “Linux Storage, Security, and Networks,” and Chapter 4, “Linux
Scripting.” Here are just a few of the reasons Linux needs to be part of your toolbox:

■■ Linux is the most widely used operating system in IT environments, running more
than two-thirds of the servers on the Internet. Linux is used as a server OS and also
for the following:

    ■■ Automation

    ■■ Virtualization and containers

    ■■ Programming and scripting

    ■■ Software-Defined Networking

    ■■ Big Data systems

    ■■ Cloud computing

■■ Linux supports a huge number of built-in networking features.

■■ Linux supports a huge number of development tools, such as Git.

■■ Linux supports a number of automation tools and supporting capabilities, including
almost everything shown in Figure 1-7.

■■ Python interpreters (along with many other languages) run natively on Linux, and
many Linux distributions come with Python already built in.

■■ The vast majority of network operating systems today (such as Cisco NX-OS, IOS
XE, and IOS XR) run as applications on top of some Linux distribution. Some entire
cloud platforms, such as OpenStack, are supported in Linux. Even macOS is very
Linux-like under the hood.

■■ Although there are paid versions of Linux, such as Red Hat Enterprise Linux (RHEL),
what you’re primarily paying for is support. Linux distributions for the most part are
free to download and use.

■■ Because Linux is open source, with enormous development support worldwide, the
source code is tremendously reliable, stable, and secure.

Virtualization
“Wait a minute,” you might say, “the services you show for the virtualization layer run on
individual infrastructure devices. What are they doing separated from the infrastructure?”

You’re right, the services themselves run on network devices. But what all of them repre-
sent are different forms of virtualized overlays to the physical underlay network. Think of
the overlay and the underlay as the top and bottom of your network data plane. Between
them are sandwiched all the layers that implement the virtualized overlay onto the
physical underlay.

YANG
You’ve already read about abstraction in this chapter: Abstraction means a generic model
of your network. Hence, it is closely associated with the virtualization layer. In Figure 1-7,
the only modeling language shown is YANG (Yet Another Next Generation). There are
other data modeling languages, such as Unified Modeling Language (UML) and NEMO
(NEtwork MOdeling), but YANG is used so extensively for network modeling that it is
the only language shown in Figure 1-7. You’ll learn all about using YANG in Chapter 13.


Protocols
The protocols layer dictates a programmatic interface for accessing or building the
abstraction of a network. Protocols may be RESTful, such as RESTCONF, or not, such as
NETCONF or gRPC. A protocol uses a particular encoding for its messages. NETCONF,
for example, uses XML only, whereas RESTCONF supports both XML and JSON. A
protocol uses a particular transport to reach a device. RESTCONF uses HTTP, while
NETCONF uses SSH. A protocol uses Remote Procedure Calls (RPCs) to install, manipu-
late, and delete configurations based on your model or retrieve configuration or opera-
tional data based on your model. Models are described in YANG. The protocols shown in
Figure 1-7 are all covered in this book in Chapters 14, “NETCONF and RESTCONF,” 15,
“gRPC, Protobuf, and gNMI,” and 16, “Service Provider Programmability.”
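
As a preview of that deeper coverage, the following sketch uses the ncclient library to
retrieve a device’s running configuration over NETCONF; the host and credentials are
placeholders:

from ncclient import manager

# Placeholder connection details; NETCONF typically listens on TCP port 830.
with manager.connect(
    host="192.0.2.1",
    port=830,
    username="admin",
    password="admin",
    hostkey_verify=False,
) as session:
    # <get-config> is one of the NETCONF protocol operations; the reply
    # comes back as XML-encoded data.
    reply = session.get_config(source="running")
    print(reply)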

Encoding the Protocols


The protocols themselves need a common language to communicate with the infrastruc-
ture, and this is the purpose of the encoding layer. eXtensible Markup Language (XML),
JavaScript Object Notation (JSON), and Yet Another Markup Language (YAML) are
the most common encoding languages in use for network automation and configuration
management.

One of the major advantages of encoding languages is that they provide structured
input and output to which data models easily map. Encoding languages provide data in a
standard format, where a piece of data usually has a name or tag, and a value, where the
tag is defined in a data model in a well-defined hierarchy. When you need to search for
data, you search for the tag and then simply read the value. This paradigm maps well to
programming data structures such as arrays. In contrast, with the ASCII format of typical
network configuration files, you have to parse through text files and match strings to be
able to find a piece of information. You’ll learn about all three of the encoding languages
in this book in Chapters 10, 11, and 12, “YAML.”
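
The following sketch illustrates the point with a fragment of JSON-encoded interface
data (invented here purely for illustration); retrieving a value is a simple key lookup
rather than string matching against screen output:

import json

# A hypothetical JSON-encoded reply from a device API.
payload = '''
{
  "interface": {
    "name": "GigabitEthernet1",
    "enabled": true,
    "ipv4": {"address": "192.0.2.1", "netmask": "255.255.255.0"}
  }
}
'''

data = json.loads(payload)

# Read values by tag instead of parsing text.
print(data["interface"]["name"])
print(data["interface"]["ipv4"]["address"])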

Transporting the Protocols


After your protocol messages are encoded, they must be transported to the discrete network nodes.
As a networker, you’re certainly familiar with the concept of transporting data across
a network—particularly via UDP and TCP. Secure Shell (SSH) and Hypertext Transfer
Protocol (HTTP) are the most common transports for getting data to and from network
devices.

Notice in Figure 1-7 that some devices at the infrastructure level have APIs, and some do
not. As the name implies, an API exposes a programmatic interface to applications that
need to communicate with the device, such as for automation, configuration manage-
ment, telemetry collection, and security monitoring. An API becomes a communication
socket from an application to the device.

Not all network devices have APIs; old devices often do not have them, for example.
When a device does not have an API, an application needs to mimic a human operator by
logging in through the CLI. In fact, most scripts 20 or more years ago did just this: Telnet
or SSH to the device (20 years ago it was most likely Telnet) and then perform a series of
commands in the device OS’s expected syntax, look for the correct response, go to the
next command, and so on. It was much like cutting and pasting a configuration to the
command line.

CLI-centric scripts are a major operational headache for two reasons:

■■ Unstructured data: When you use a CLI, you know what you’re looking for and can
quickly adapt to variations in the data you see. Suppose, for example, that you need
to see the administrative state of an interface. You type show interface or some simi-
lar command and read the output. Your mind is immensely adaptable and capable of
quickly reading through the data and finding the data that you need. Scripts cannot
do that as easily. They must parse the data they receive to find what they need.

■■ Changing CLIs: A script expects to find the information it needs in a certain place
and in a certain format. If you upgrade your OS or (heaven forbid) change the device
to a different OS altogether, the data might be presented differently, and your script
has to be rewritten to accommodate the change.

For all that headache, using CLI-oriented automation scripts is still better than managing
a large infrastructure manually.

The value of APIs is that they deliver structured data. While CLIs are designed for human
operators, APIs are designed for applications. There can still be headaches, but they’re
greatly reduced.

Software and Network Engineers: The New Era


If you talk about automation and related concepts such as SDN and IBN, you get a cou-
ple different responses:

■■ Older engineers: “So I need to be a programmer now? Are software developers
going to take my job if I don’t?”

■■ Newer engineers: “I’ve invested enormous time and money into earning the certifi-
cations that will set me on the career path I want. Most of my study time has been
spent configuring and troubleshooting through the CLI. Is all that a waste of time?”

We have good news and bad news for you, whether you’re an old network hand or an
engineer proudly displaying your freshly earned certifications. The bad news is that
yes, if you want to keep up with where the industry is going, you need to acquire some
programming skills and understand the protocols supporting modern automation trends.
That’s what this book is here for, along with a mountain of other resources to help you
get up to speed. If you’re a seasoned engineer, none of this is different from what you’ve
done your entire career: keeping up with new technologies by keeping up with the lat-
est RFCs, reading the right trade journals and blogs, and attending industry events like
Cisco Live and your regional network operators’ groups. If you’re just starting out, you’re

already in deep learning mode, and you’ll find that enhancing your growing skill set is not
that hard at all. And we guarantee that it makes you more valuable as an engineer.

The good news is that no, automation and programmability do not mean that your jobs
are going to be eliminated or taken over by software developers. Software developers’
programming abilities go far beyond what’s needed for networking, and for the most
part, they know little about networking itself. You only have to know enough about pro-
gramming to make your own job easier. As mentioned at the beginning of this chapter,
automating the mundane parts of your job just means you have more time to utilize your
deep knowledge of networking. The network is better for it, and you are most certainly
the better for it.

So let’s get started adding some shiny new tools to your toolbox!



Chapter 2

Linux Fundamentals

Chapter 1, “The Network Programmability and Automation Ecosystem,” discusses
where operating systems (such as Windows, UNIX, and Linux) fit in the big picture of
programmability and automation. As indicated in Chapter 1, today Linux is the predomi-
nant operating system used by developers and network engineers alike—and for good
reasons. This chapter is dedicated to Linux fundamentals. It starts with an assumption
that you know nothing about Linux. By the end of the chapter, you will have gained
enough knowledge and hands-on experience to successfully install, operate, and maintain
a Linux-based system. This system will be the first building block in the development
environment you will use to apply most of the material covered in subsequent chapters of
this book.

The Story of Linux

This section introduces the Linux operating system: how it started, where it stands today,
and where it is headed in the future. It also touches on the architecture of the operating
system and introduces the concept of Linux distributions.

History
The Linux operating system was first developed in 1991 by a Finnish computer science
student at the University of Helsinki called Linus Torvalds. His motivation was to provide
a free alternative to the UNIX-like operating system MINIX that would run on Intel’s
80386 chipset. The majority of the Linux kernel was written in the C programming
language.

The first release of Linux consisted of only a kernel. A kernel is the lowest-level software
component of an operating system and is the layer that acts as an interface between the
hardware and the rest of the operating system. A kernel on its own is not very useful.
Therefore, the Linux kernel was bundled with a set of free software utilities developed

under a project called GNU (which is a recursive acronym for GNU’s Not Unix). In 1992,
Linux was relicensed using the General Public License Version 2 (GPLv2), which is also
a part of the GNU project. Together, the kernel and GNU utilities made up the Linux
operating system. A group of developers worked on developing the Linux kernel as well
as integrating the kernel with GNU software components in order to release the first
stable version of Linux, Linux 1.0, in March 1994. In the following few years, most of the
big names in the industry, such as IBM, Oracle, and Dell, announced their support for
Linux.

Even though Linux is licensed under the GPL and is, therefore, free, companies have built
businesses around Linux and made a lot of money out of it. Companies like Red Hat
make money by packaging the free Linux kernel along with other software components,
bundled with subscription-based support services. This product is then sold to customers
who do not want to have to depend on the goodwill of the open source community to
receive support for their Linux servers that are running mission-critical applications.

Linux Today
Today, Linux is supported on virtually any hardware platform, and most commercial
application developers provide versions of their software that run on Linux. Linux powers
more than half of the servers on the Internet. More than 85% of smartphones shipped in
2017 ran on Android, a Linux-based operating system. More smart TVs, home appliances,
and even cars are running some version of Linux every day. All supercomputers today run
on Linux. Most network devices today either run on Linux or on a Linux-like network
operating system (NOS), and many vendors expose a Linux shell so that network engi-
neers can interact directly with it. The Linux shell is covered in detail later in this chapter.

Linux Development
Linux is an open-source operating system that is developed collaboratively by a vast num-
ber of software developers all over the world and sponsored by a nonprofit organization
called the Linux Foundation.

Developers interested in introducing changes to the Linux kernel submit their changes to
the relevant mailing list in units called patches. The developers on the mailing list respond
with feedback on a patch, and the patch goes through a cycle of enhancements and feed-
back. Once a patch is ready to be integrated into the kernel, a Linux maintainer who is
responsible for one section of the kernel signs off on the patch and forwards it to Linus
Torvalds, who is also a Linux Foundation fellow, for final approval. If approved, the patch
is integrated into the next release of the Linux kernel. A new major release of the kernel is
made available approximately every three months.

When Linux was released in March 1994, the kernel consisted of just 176,250 lines of
code. At the time of writing this book, version 5.0 of the Linux kernel consists of more
than 25 million lines of code.


Linux Architecture
A detailed discussion of the Linux OS architecture is beyond the scope of this book.
However, this section describes a few of the important characteristics of the different
Linux OS components.

Figure 2-1 provides an architectural block diagram of Linux. It shows that applications
are in the top layer, presenting the software interface through which the user interacts
with the device, hardware is at the bottom, and the kernel is in between.

[Figure 2-1 shows application software (such as routing, firewall, database, NMS, and
HTTP server software) at the top; system software (libraries, shells, tools, and system
daemons) beneath it; the kernel (process scheduler, device drivers, memory management,
and security) below that; and the hardware at the bottom.]

Figure 2-1 Architectural Block Diagram of Linux

The kernel is the part of the operating system that interfaces the different software
components with the hardware. It translates instructions from the application software
to a language that the hardware understands through the device drivers that are part of
the Linux kernel. So, if a user decides to send a paper print request, this instruction is
received by the application, passed to the kernel, and then passed straight to the hard-
ware driver. The same applies to networking: When a user tries to visit an Internet web
page, or if an application such as BGP on a router tries to establish peering with a distant
router (assuming that the NOS is based on Linux), the browser or BGP opens a network
socket (request) that the kernel handles, and translates the request into instructions that
the device driver of the network card can understand. This all happens before the request
gets converted into electrical signals leaving the system over the network cable.
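
As a simple illustration of the path just described, this Python sketch asks the kernel to
open a TCP connection to a web server; everything below the create_connection() call
(the socket handling, the device driver, the signals on the wire) is the kernel’s job:

import socket

# socket() and connect() are handed to the kernel, which in turn drives
# the network card through its device driver.
sock = socket.create_connection(("www.example.com", 80), timeout=5)
sock.sendall(b"HEAD / HTTP/1.1\r\nHost: www.example.com\r\n\r\n")
print(sock.recv(200).decode(errors="replace"))
sock.close()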

The operating system needs to make sure that applications and the kernel do not share
valuable system resources and, in doing so, disrupt each other’s operations. Therefore,
each is run in its own space—that is, a different segregated and protected part of mem-
ory. The kernel has its own allocation of memory, the kernel space, to prevent the kernel
from crashing the system if something goes wrong. Alternatively, when a user executes
an application, it runs in what is called user space, or userland, where each running
application and its data are stored.

Applications come from different sources, and they may be poorly and/or recklessly
developed, leading to software bugs. When you run such applications separately from
kernel space, they can’t interact with the kernel resources, which means they can’t cause
the system to halt or crash.

Figure 2-2 illustrates the communication between the application software, the different
components of Linux, and the hardware. It also highlights the important architectural
concepts of kernel and user space.

[Figure 2-2 shows application software in user space calling the GNU C Library (glibc),
which crosses the System Call Interface (SCI) into the kernel; the kernel, in kernel space,
reaches the hardware through device drivers.]

Figure 2-2 Communication Between Applications, Linux, and System Hardware

Applications and daemons must make what is called a system call to the kernel in order
to access hardware resources such as memory or network devices. A daemon is an appli-
cation that provides a specific service such as HTTP, NTP, or log collection and runs in
the background (without hogging the user interface). Daemon process names typically
end with the letter d, such as httpd or syslogd. When a daemon is performing a system-
level function such as log collection, the daemon is commonly referred to as a system
daemon. Daemons are covered in further detail later in this chapter and in Chapter 3,
“Linux Storage, Security, and Networks.”

A system call is made by using a group of subroutines provided by a library called the
GNU C library (glibc). These subroutines access the kernel through the kernel’s system
call interface (SCI). The kernel’s SCI and the GNU C library are collectively known as
the Linux application programming interface (Linux API). The kernel accesses the
hardware via device drivers. If this sounds a little overwhelming to you, do not worry. All
the components just mentioned are simply pieces of software, each providing a particular
function.
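
To make this concrete, here is a minimal Python sketch (run it on a Linux host) in which
ordinary library calls translate directly into kernel system calls; a tool such as strace will
show the underlying open(), read(), and close() calls as the script runs:

import os

print(os.getpid())  # the getpid() system call

# Reading a file means asking the kernel, via the C library, to open,
# read, and close it on your behalf.
fd = os.open("/etc/hostname", os.O_RDONLY)
data = os.read(fd, 64)
os.close(fd)
print(data.decode().strip())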


The Linux kernel is a monolithic kernel, which means the kernel functionality is run
in the kernel space, including, for example, the device drivers. This is in contrast to the
microkernel architecture, in which only a minimal set of services are run in the kernel
space, while the rest of the services are run in the user space.

The kernel includes software modules for the following:

■■ Process scheduling

■■ Interprocess communication (IPC)

■■ Memory management

■■ Network services

■■ Virtual file system (VFS) management

■■ Device drivers

■■ File system drivers

■■ Security

■■ System call interfaces that are used by applications in the user space to make system
calls to the kernel

Loadable kernel modules (LKMs) are software packages that are not part of the kernel
but that are loaded into the kernel space in order to extend the functionality of the ker-
nel, without requiring rebuilding and rebooting of the kernel every time this functionality
is required.
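You can list the LKMs currently loaded on a system with the lsmod command and inspect
any one of them with modinfo {module_name}. The module names and sizes below are
illustrative only:

! Illustrative output; loaded modules vary by system
[NetProg@localhost ~]$ lsmod | head -n 4
Module                  Size  Used by
fuse                  131072  3
xt_CHECKSUM            16384  1
ipt_MASQUERADE         16384  1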

Software components that are commonly a part of a Linux operating system installation
but that are not part of the Linux kernel typically include the following:

■■ Daemons

■■ Window system for implementing the WIMP (windows, icons, menus, pointer) user
interface

■■ Vendor-proprietary device drivers

■■ User applications such as word processing applications and Internet browsers

■■ Command shells that accept commands from the users, parse and validate these
commands, and then interpret these commands into a lower-level language to be
passed to the Linux kernel for execution

■■ Utilities that provide common system tasks through the shell, such as the ls, sort,
and cp utilities

Many of these components are revisited and covered in more detail later in this chapter.


Linux Distributions
A Linux distribution, usually referred to as a distro, is the actual package of software
that you install on a device as the Linux OS. As mentioned in the previous section, a
kernel on its own is not very useful, and in order to have a functioning, usable operat-
ing system, the kernel is packaged with other software components that run in the user
space. Some distros come fully loaded with bleeding-edge software packages and drivers,
which translates into a significant OS footprint. Other distros are composed of minimal
software packages. A Linux distro is typically tailored for a specific audience or function,
such as the scientific community or the software developer community, or to run applica-
tion servers, such as email or web servers. A distro typically includes a kernel, loadable
kernel modules, and other software components that run in the user space.

There are more than 300 different distros in active development today. However, there are
fewer than 10 distro “families” from which all other distros are spun off. These are the
major distro families:

■■ Slackware: This is the parent distro for SuSE.

■■ Debian: This is the parent distro for Ubuntu.

■■ Red Hat: This is the parent distro for Red Hat Enterprise Linux (RHEL), CentOS, and
Fedora.

■■ Enoch: This is the parent distro for Gentoo.

■■ Arch: This is a Linux distro optimized for the x86_64 architecture and follows
a “do-it-yourself” philosophy.

■■ Wind River: This is the Linux distro on which Cisco’s NX-OS, IOS XE, and IOS XR
run as applications.

■■ Android: This is the operating system running on more than 70% of smart mobile
phones today.

Knowledge of any particular distro can be ported to any other distro without the steep
learning curve associated with studying a subject for the first time. For the purpose of
this book, the distro of choice is CentOS, a member of the Red Hat family that targets
server environments and is a free version of RHEL, without the support provided by
Red Hat. RHEL and CentOS have been developed from the start with commercial use in mind.
At the time of writing, the latest version of CentOS is 8.x. Detailed steps for
installing CentOS 8.2 on different platforms are provided in the online documentation
for the OS.

The Linux Boot Process


After you install Linux on a computer, when you power up the computer, the Linux OS
goes through a boot process in which the different components of the OS are loaded into
memory. Before you begin to use Linux, you need to have a basic understanding of what
happens from the minute you power on your computer until the login screen appears and
the OS is ready to be used. This section helps you learn what you need to know, without
going into too much detail. The discussion focuses on the Intel x86 architecture since
modern servers predominantly use CPUs based on x86 architecture. Figure 2-3 provides
a high-level view of the process.

[Figure 2-3 depicts the boot sequence as a flow: power-on; the BIOS is loaded into RAM;
control passes to the boot loader on the boot sector; the boot loader loads the
compressed kernel image and initramfs; initramfs creates a temporary file system and the
kernel image is decompressed; device drivers are loaded, initramfs is unmounted, and the
physical drive is mounted; the init or systemd process is loaded in user space; init or
systemd starts all other user space processes; the boot process is completed.]

Figure 2-3 The Linux OS Boot Process

When you press the power-on button of your computer, system software, or firmware,
saved on non-volatile flash memory on the computer's motherboard, is run in order to
initialize the computer's hardware and perform power-on self-tests (POST) to confirm
that the hardware is functioning properly. This firmware is called the BIOS, which
stands for basic input/output system. After the hardware is initialized and the POST
completed, based on the boot order that is set in the BIOS configuration, the BIOS
starts searching for a boot sector on each of the drives listed in the configuration, in
the order configured. The boot sector comes in several types, based on the drive type
you are booting from. However, the BIOS has no understanding of the kind of boot sector
it is accessing or the partitioning of the drive on which the boot sector resides. All
it knows is that the boot sector is a bootable sector (because of the boot sector
signature in its last 2 bytes), and it passes control to whatever software resides there
(in this case, the boot loader). A master boot record (MBR) is a special type of boot
sector that resides before the first partition and not on any one partition.

The boot loader then assumes control. The boot loader’s primary function is to load the
kernel image into memory and pass control to it in order to proceed with the rest of the
boot process. A boot loader can also be configured to present the user with options in
multi-boot environments, where the loader prompts the user to choose which of sev-
eral different operating systems to boot. There are several boot loaders available, such
as LILO, GRUB, and SYSLINUX, and the choice of which one to use depends on your
requirements. Boot loaders can work in one or more stages, where the first stage is
usually OS independent and the later stages are OS specific. Different boot loaders can
also be chain-loaded (by configuration), one after the other, depending on what you (or
the software implementation) need to accomplish.

The boot loader searches for a kernel image to load based on the boot loader's
configuration and, possibly, user input. Once the correct kernel image is identified, it
is loaded into memory in a compressed state. The boot loader also loads an initial RAM
disk image, called initrd or initramfs, which provides a temporary file system in memory
and allows the kernel to decompress and create a root file system without mounting any
physical storage devices. (This is discussed further in the next section.) The kernel
then decompresses in RAM and loads hardware device drivers as loadable kernel modules.
Then initrd or initramfs is unmounted, and the physical drive is mounted instead.

Recall from earlier in this chapter that the Linux software components are classified as
kernel space programs or user space programs. Up to this point, no user space programs
have run. The first user space program to run is the parent process, which is the
/usr/sbin/init process or the /lib/systemd/systemd process in some systems. All other
user space processes or programs are invoked by the init (or systemd) process.
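You can confirm which parent process a system uses by asking ps about process ID 1,
which always belongs to the parent process. On a systemd-based distro such as CentOS 8,
you would expect output similar to the following sketch:

[NetProg@localhost ~]$ ps -p 1 -o pid,comm
  PID COMMAND
    1 systemd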

Based on which components you chose to install, the init process starts a command shell
and, optionally, a graphical user interface. At this point, you are prompted to enter your
username and password in order to log in to the system.

To switch from the GUI to the command shell and back on CentOS, you need to log in to
the GUI that boots up by default and then press Alt+Ctrl+F2 (or any function key from
F3 to F6). The GUI then switches to full command-line mode. To switch back to the GUI,
press Alt+Ctrl+F1. CentOS starts five command-line terminals and one graphical user
interface.

A Linux Command Shell Primer


An interpreter is a program that accepts commands written in a high-level language, such
as Python, and converts them into lower-level code, either to be executed directly by
the hardware or to be passed on to another program (such as the Python virtual machine)
for further processing. Similarly, a command shell is a program that accepts commands
from the user, parses and validates those commands, one by one, and then interprets the
commands into a lower-level language to be passed to the Linux kernel for execution. Of
course, the Linux shell communications model is a little more involved than this. This
section focuses on the user interface of the command shell.

But why use the command-line interface (CLI) when you can use the graphical user
interface (GUI)? There is nothing wrong with the GUI, but whether you want to use the
CLI or the GUI depends primarily on what you are trying to accomplish. This book is
about network automation and programmability. You will never tap into the true power
of automation that Linux provides without relying heavily on the CLI (aka the command
shell), whose use is described throughout this chapter. The significant value that automa-
tion provides applies to repeatable tasks; the key word here is repeatable. Automation
in essence involves breaking up a task into smaller, repeatable tasks and then applying
automation to those tasks, and this is where the CLI comes into play. Chapter 4, “Linux
Scripting,” builds on the CLI commands covered in this chapter and shows how to use
Linux scripting to automate repeatable tasks, among other things.

There are numerous shells available today, some of which are platform independent and
others of which are available for particular operating systems only. Some shells are GPL
licensed, and others are not. The shell covered here is the Bash shell, where Bash stands
for Bourne-again shell. Bash is a UNIX shell and command language written by Brian
Fox for the GNU Project as a free software replacement for the Bourne shell, and it is
the default shell on the vast majority of Linux distros in active development today.

To get started with Bash, log in to a CentOS machine and start the Terminal program,
which is the interface to the Bash shell. You can start Terminal in several ways; the
most straightforward method on CentOS 8 is to click Activities at the top left corner of
the screen. A search window appears. Type terminal and click the Terminal icon that
appears right under the search text box.

If you created the user NetProg during the installation and have logged in as that user,
you should see a prompt similar to the one in Example 2-1.

Example 2-1 The Terminal Program Prompt

[NetProg@localhost ~]$

Throughout this chapter, the Terminal program window will be referred to as Terminal,
the terminal, the Bash shell, the command-line shell, or just the shell, interchangeably.
The command prompt in Terminal is a great source of information. The username of the
current user is shown first. In this case, it is user NetProg. Then, after the @ comes
the computer (host) name, which in this case is the default localhost. Next comes the
~ (tilde), which represents the home directory of the current user, NetProg. Each
regular user in Linux has a home directory that is named after the user and is located
under the /home directory. In this case, this directory is /home/NetProg. If you use the
pwd command, which stands for print working directory, the shell prints out the current
working directory, which in this case is /home/NetProg, as you can see in Example 2-2.
This is referred to as the working directory.

Example 2-2 Using the pwd Command

[NetProg@localhost ~]$ pwd


/home/NetProg

The last piece of information that you can extract from the prompt is the fact that this
is not user root, signified by the $ sign at the end of the prompt line. Example 2-3
introduces the command su, which stands for switch user. When you type su and press
Enter, you are prompted for the root password that you set during the CentOS
installation. Notice that the prompt changes to a # when you switch to user root.

Example 2-3 Using the su Command

[NetProg@localhost ~]$ su
Password:
[root@localhost NetProg]# pwd
/home/NetProg

The basic syntax for the su command is su {username}. When no username is specified
in the command, it defaults to user root. Notice also in Example 2-3 that while the cur-
rent user changed to root, the current directory is not the home directory of user root. In
fact, it is the home directory of user NetProg, as shown in the pwd command output in
Example 2-3. To switch to user root as well as the home directory for root, you use the su
command with the - option, as shown in Example 2-4.

Example 2-4 Using the su - Command

[NetProg@localhost ~]$ su -
Password:
[root@localhost ~]# pwd
/root

If a user wants to run a command that requires root privileges, the user has two options.
The first is to use the su - command to switch to the root user and then execute the
command as root. The second option is to use the sudo utility, using the syntax sudo
{command}. The sudo utility is used to execute a command as a superuser, provided that
the user invoking the sudo command is authorized to do so. In other words, the user
invoking the sudo command should be a member of the superusers group on the system,
more formally known as the sudoers group. When the sudo utility is invoked, the invoking
user is checked against the sudoers group, and if she is a member, the user is prompted
to enter her password. If the authorization is successful, the command that requires
root privileges is executed. Users and groups are covered in more detail in Chapter 3.
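The following sketch shows what a sudo invocation typically looks like; the command
being run here, a dnf update check, is just an example, and any privileged command could
take its place:

! Illustrative example
[NetProg@localhost ~]$ sudo dnf check-update
[sudo] password for NetProg:
--------- OUTPUT TRUNCATED FOR BREVITY ---------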

Whenever you need to clear the terminal screen, you use the command clear. This command
clears the current terminal screen and all of the scrollback buffer except for one
screen length of buffer history.

When you press the up arrow once at the terminal prompt, the last command you entered
is recalled. Pressing the up arrow once more recalls the command before that. Each
time you press the arrow key, one older command is recalled. To see a list of your previ-
ously entered commands, type the command history, which lists, by default, the last
1000 commands you entered. The number of previously entered commands that can be
retained is configurable. If you are using the Bash shell, the history is maintained in the
~/.bash_history file.
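For example, in Bash, history 3 lists only the last three commands, and typing ! followed
by an entry number re-executes that entry. The entry numbers below are illustrative:

! Illustrative history entries
[NetProg@localhost ~]$ history 3
  101  pwd
  102  clear
  103  history 3
[NetProg@localhost ~]$ !101
pwd
/home/NetProg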


Finding Help in Linux


Before proceeding any further, let’s look at how to find help in Linux. Covering every
option and argument of every command in Linux in this single chapter would simply not
be possible. However, Linux provides an easy way to get help that enables you to further
investigate and experiment with the commands covered in the subsequent sections and
chapters so you can expand your knowledge beyond what is covered here. Linux has
built-in documentation for virtually every Linux command and feature. It makes compre-
hensive information readily available to Linux users.

The simplest way to get help for a command is by using the --help option, right after the
command. Example 2-5 shows the help provided for the command ls, which stands for
list. As stated in the help output, this command is used to "List information about the
FILEs (the current directory by default)." As you can see, the output from the command
help is quite detailed. The output in Example 2-5 has been truncated for brevity. Don't
worry if some or most of this output does not make much sense to you at this point.

Example 2-5 Help for the ls Command

[NetProg@localhost ~]$ ls --help


Usage: ls [OPTION]... [FILE]...
List information about the FILEs (the current directory by default).
Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.

Mandatory arguments to long options are mandatory for short options too.
-a, --all do not ignore entries starting with .
-A, --almost-all do not list implied . and ..
--author with -l, print the author of each file
-b, --escape print C-style escapes for nongraphic characters
--block-size=SIZE scale sizes by SIZE before printing them; e.g.,
'--block-size=M' prints sizes in units of
1,048,576 bytes; see SIZE format below
-B, --ignore-backups do not list implied entries ending with ~

--------- OUTPUT TRUNCATED FOR BREVITY ---------

To illustrate the output of the command ls and how the help output from Example 2-5
can be put to good use, Example 2-6 shows the output of the command when entered
while in the home directory of user NetProg. Three different variations of arguments are
used. The first is plain vanilla ls. The second is ls -l, which forces ls to use a long listing
format. The final variation is ls -la, which tells ls to not ignore entries starting with a
period, which are hidden files and directories; this argument basically tells ls to list all
files and directories, including hidden ones.


Example 2-6 Using Three Different Variations of the ls Command

[NetProg@localhost ~]$ ls
Desktop Documents Downloads Music Pictures Public Templates Videos

[NetProg@localhost ~]$ ls -l
total 32
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Desktop
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Documents
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Downloads
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Music
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Pictures
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Public
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Templates
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Videos

[NetProg@localhost ~]$ ls -la


total 80
drwx------. 14 NetProg NetProg 4096 Feb 13 04:48 .
drwxr-xr-x. 5 root root 4096 Feb 13 04:07 ..
-rw-------. 1 NetProg NetProg 4 Feb 13 04:08 .bash_history
-rw-r--r--. 1 NetProg NetProg 18 Jan 4 12:45 .bash_logout
-rw-r--r--. 1 NetProg NetProg 193 Jan 4 12:45 .bash_profile
-rw-r--r--. 1 NetProg NetProg 231 Jan 4 12:45 .bashrc
drwx------. 9 NetProg NetProg 4096 Feb 13 04:48 .cache
drwxr-xr-x. 11 NetProg NetProg 4096 Feb 13 04:48 .config
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Desktop
drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Documents
--------- OUTPUT TRUNCATED FOR BREVITY ---------

Again, don’t worry if some of this output does not make sense to you. The ls command is
covered in detail later in this chapter, in the section “File and Directory Operations.”
The second way Linux provides help to users is through the manual pages, also known
as the man pages. Man pages are documentation pages for Linux built-in commands and
programs. Applications that are not built-in also have the option to add their own man
pages during installation. To access the man pages for a command, you enter man
{command}. Example 2-7 shows the first man page for the ls command.

Example 2-7 The First Man Page for the ls Command

LS(1) User Commands LS(1)

NAME
ls - list directory contents

SYNOPSIS


ls [OPTION]... [FILE]...

DESCRIPTION
List information about the FILEs (the current directory by default).
Sort entries alphabetically if none of -cftuvSUX nor --sort
 is specified.

Mandatory arguments to long options are mandatory for short options


too.

-a, --all
do not ignore entries starting with .

-A, --almost-all
do not list implied . and ..

--author
with -l, print the author of each file

-b, --escape
print C-style escapes for nongraphic characters

--block-size=SIZE
scale sizes by SIZE before printing them; e.g., '--block-size=M'
prints sizes in units of 1,048,576 bytes; see SIZE format below

-B, --ignore-backups
do not list implied entries ending with ~

Manual page ls(1) line 1 (press h for help or q to quit)

You should use the down arrow or Enter key to scroll down through the man page one
line at a time. You should press the spacebar or Page Down key to scroll down one page
at a time. To scroll up, you should either use the up arrow key to scroll one line at a time
or the Page Up key to scroll up one page at a time. To exit the man page, press q. To
search through the man pages, type / followed by the search phrase you are looking for.
What you type, including the /, appears at the bottom of the page. If you press Enter,
the phrase you are looking for is highlighted throughout the man pages. The line contain-
ing the first search result appears at the top of the page. You should press the letter n to
move forward through the search results or N (capital n) to move backward to previous
search results. To get to the top of the man page, you should press g, and to go to the end
of the man page, you should press G (capital g). Being able to jump to the start or end of
a man page with a single keypress is handy when you’re dealing with a man page that is
thousands of lines long.


All available man pages on a Linux distro are classified into sections, and the number of
sections depends on the distro you are using. CentOS has nine sections. Each section
consists of the man pages for a different category of components of the Linux OS. In
Example 2-7, notice the LS(1) on the first line of the output, on both the left and right
sides. This indicates that this man page is for the command ls, and this is Section 1 of the
man pages.

From the output of the man man command, which brings up the manual pages for the man
command itself, you can see that the man pages are classified into the following sections:

■■ Section 1: Executable programs or shell commands

■■ Section 2: System calls (that is, functions provided by the kernel)

■■ Section 3: Library calls (that is, functions within program libraries)

■■ Section 4: Special files (usually found in /dev)

■■ Section 5: File formats and conventions, such as /etc/passwd

■■ Section 6: Games

■■ Section 7: Miscellaneous (including macro packages and conventions), such as
man(7) and groff(7)

■■ Section 8: System administration commands (usually only for root)

■■ Section 9: Kernel routines (nonstandard)

Why are the man pages categorized into different sections? Consider this scenario: tar
is both a utility that archives files and a file format for archived files. The man
pages for the archiving utility are provided in Section 1 (executable programs or shell
commands), while the man pages for the file format are provided in Section 5 (file
formats and conventions). When you type man tar, you invoke the man pages for the tar
utility under Section 1, by default. In order to invoke the man pages for the tar file
format, you need to type man 5 tar. To see all possible man pages for a specific phrase,
you use the form man -k {phrase}, as shown in Example 2-8 for the phrase tar. Note that
the phrase tar is enclosed in quotes with a ^ before and a $ after tar. This is an
example of putting regular expressions to good use to avoid getting results that you do
not need. In this case, you are only looking for the phrase tar, not for words that
start or end with tar or any other variation, such as words that contain tar in them.
Regular expressions are discussed in detail in Chapter 4. For now, you just need to know
that regular expressions make it possible to match on certain strings using special
symbols, such as the ^ symbol, which represents the beginning of a line, and the
$ symbol, which represents the end of a line.

Example 2-8 Man Pages in Different Sections for tar

[NetProg@localhost ~]$ man -k "^tar$"


tar (1) - an archiving utility
tar (5) - format of tape archive files


An interesting—and maybe more intuitive—alternative to the man pages is the GNU info
documentation. The info pages are help pages similar to the man pages, but the info
pages are more detailed, documentation-style (rather than command-reference-style)
hypertext documents composed of nodes. The hyperlinks in the info pages enhance the
experience of a user looking for information or help. The GNU info files can be accessed
using either the info or pinfo commands. You can pass a phrase to one of these commands
as an argument, where the phrase is what you are looking for. Or you can just type the
command with no argument and then search the output for what you are looking for by
typing / and then the search phrase. You can navigate through the info files by using
the up and down arrow keys. You can go to the next node by pressing n or to the previous
node by pressing p. Experiment with the GNU info pages by trying to locate the help for
the commands covered so far and comparing the info pages with the man pages.
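For instance, the GNU documentation's suggested way of jumping straight to the node
describing ls is the following (compare this with the info coreutils cat form used later
in this chapter):

[NetProg@localhost ~]$ info coreutils 'ls invocation'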

Files and Directories in Linux


By now, you should be familiar with the Linux Bash shell prompt and know where
to go to find help. This section takes a closer look at the Linux file system, files,
and directories.

The Linux File System


A disk (or any other storage medium, for that matter) is organized into one or more parti-
tions. A partition is simply a logical section or slice of a disk. Each partition is logically
separated from the other. In order to start using a particular partition, you need to create
a file system on that partition. A file system defines how data is stored and retrieved from
a disk. It defines a block of related data that has a beginning and an end and, most impor-
tantly, a name by which the block of data is identified. This block of data is called a file.
Files are further grouped into directories, and directories are grouped into other directo-
ries, creating a tree-like hierarchy. Among other things, a file system does the following:

■■ Defines the size of the allocation unit, which is the minimum amount of physical
storage space that can be allocated to a file

■■ Manages the space allocation to files, which may be composed of discontiguous


allocations, as files are created, deleted, or changed in size

■■ Defines how to map between files and directories

■■ Defines the naming schemes of files and how to map the names to the actual loca-
tions of the files on the physical medium

■■ Maintains the metadata for files and directories, that is, the information about
those files or directories (for example, file size and time of last modification)

Linux supports several file systems, including ext2, ext3, ext4, XFS, Btrfs, JFS, and NTFS.
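To see which of these file systems each mounted partition on a machine actually uses,
df with the -T option adds a Type column. The device name and sizes below are
illustrative:

! Illustrative output; device names and sizes vary
[NetProg@localhost ~]$ df -Th /
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sda2      xfs    50G  4.2G   46G   9% /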


Linux organizes its files and directories following the Filesystem Hierarchy Standard
(FHS), which was developed specifically for Linux. This standard ensures that the
­different Linux distros follow the same guidelines when implementing their file system
hierarchy so that application developers can develop software that is portable between
distros and meet other needs for standardization. The detailed standard can be found at
http://refspecs.linuxbase.org/fhs.shtml.

Figure 2-4 illustrates the very basic directory tree structure in Linux. This hierarchy starts
at the root, represented by a /, and all other files and directories branch from this root
directory.

[Figure 2-4 depicts a tree rooted at /, with the top-level directories boot, dev, etc,
home, media, mnt, opt, proc, root, run, srv, sys, tmp, usr, and var, each branching into
subdirectories such as /dev/sda, /etc/rsyslog.d, /home/user1, /usr/bin, and /var/log.]

Figure 2-4 The Linux Directory Tree Structure, Starting at the Root (/) Directory and
Branching Out

Everything in Linux is represented by a file somewhere in the file hierarchy. It is
important that you become familiar with the Linux file system hierarchy and know which
files to view or edit in order to get a particular task accomplished. Your knowledge
will gradually increase as you progress through this chapter and Chapter 3. For now, the
following is a high-level description of the main default directories on a CentOS Linux
system:

■■ /: This is the root directory that is at the top of the file hierarchy and from which all
other directories branch. This is not to be confused with the /root directory, which is
the home directory of user root.

■■ boot: This directory contains the boot loader, kernel image, and initial RAM
disk files.

■■ dev: This directory contains all the files required to access the devices. For
example, when a hard disk is attached to the system, the path to this disk is something
similar to /dev/sda.

■■ etc: This directory contains the system configuration files and the configuration
files of any application installed using the package management system of the distro
being used (yum or dnf in the case of CentOS).

■■ home: Each regular user in Linux has a home directory that is named after the user's
username. All of these home directories reside under this directory. A user's
home directory contains all the subdirectories and files owned by this user. User
NetProg's home directory, for example, is /home/NetProg.


■■ media: This directory contains subdirectories that are used as mount points for
(temporary) removable media such as floppy disks and CD-ROMs.

■■ mnt: This directory is provided so that the system administrator can temporarily
mount a file system, as needed.

■■ opt: This directory is reserved for the installation of add-on application software
packages.

■■ proc: This directory is used for the storage and retrieval of process information as
well as other kernel and memory information.

■■ root: This is the home directory of the user root that has superuser privileges. Note
that this is not the root directory, which is the / directory.

■■ run: This directory contains files that are re-created each time the system reboots.
The information in these files is about the running system and is as old as the last
time the system was rebooted.

■■ srv: This directory contains site-specific data that is served by this system.

■■ sys: This directory contains information about devices, drivers, and some kernel
features. Its underlying structure is determined by the particular Linux kernel being
used.

■■ tmp: This directory contains temporary files that are used by users and applications.
All the contents of this directory are flushed every configurable period of time
(which is, by default, 10 days for CentOS).

■■ usr: This directory contains the files for installed applications. Application-shared
libraries are also placed here. This directory contains the following subdirectories:

■■ usr/bin: This subdirectory contains the binary files for the commands that are
used by any user on the system, such as pwd, ls, cp, and mv.

■■ usr/sbin: This subdirectory contains the command binary files for commands that
may be executed by users of the system with administrator privileges.

■■ usr/local: This directory is used for the installed application files, similar to the
Program Files directory in Windows.

■■ var: This directory contains files that are constantly changing in size, such as system
log files.

■■ .: The dot is a representation of the current working directory. The value of . is equiv-
alent to the output of the pwd command.

■■ ..: The double-dot notation is a representation of the parent directory of the working
directory. That is, the directory that is one level higher in the file system hierarchy
than the working directory.
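To see the top level of this hierarchy on a live system, you can simply list the root
directory itself. On a CentOS 8 machine, the output should resemble the following:

[NetProg@localhost ~]$ ls /
bin  boot  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var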


File and Directory Operations


This section introduces a number of commands for navigating, creating, deleting,
copying, and viewing files and directories using the Linux command-line shell.

Navigating Directories
The command cd stands for change directory and is used to change the working
directory from one directory to another. The syntax for cd is cd {path}, where path is the
destination that you want to become your working directory. The path argument can
be provided in one of two forms: either as a relative path or as an absolute path.
Example 2-9 illustrates the use of both forms.

The relative path can be used when the destination directory is a subdirectory under
the current working directory. In this case, the first part of the path (which is the
absolute path to the current working directory) is implied. In Example 2-9, because
the current working directory is /home/NetProg and you want to navigate to /home/
NetProg/LinuxStudies, you can use the command cd LinuxStudies, where the first part
of the path, /home/NetProg/, is implied because this is the current working directory.
Obviously, the relative path does not work if you need to navigate to a directory that is
not under your current working directory. In Example 2-9, for example, you could not
navigate to /home/NetProg/Documents from /home/NetProg/LinuxStudies by simply
entering cd Documents. In this case, the absolute path must be used.

Example 2-9 Relative and Absolute Paths

[NetProg@localhost ~]$ ls
Desktop Downloads Music Public Videos
Documents LinuxStudies Pictures Templates

! Using the relative path to navigate to LinuxStudies


[NetProg@localhost ~]$ cd LinuxStudies
[NetProg@localhost LinuxStudies]$ pwd
/home/NetProg/LinuxStudies

! Now the relative path does not work when attempting to navigate to
/home/NetProg/Documents
[NetProg@localhost LinuxStudies]$ cd Documents
-bash: cd: Documents: No such file or directory

! Using the absolute path to navigate to Documents


[NetProg@localhost LinuxStudies]$ cd /home/NetProg/Documents
[NetProg@localhost Documents]$ pwd
/home/NetProg/Documents


At any point in your navigation, entering cd without any arguments takes you back to
your home directory, characterized by the tilde (~) in the command prompt.
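For instance, from anywhere in the file system:

[NetProg@localhost Documents]$ cd
[NetProg@localhost ~]$ pwd
/home/NetProg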

When you have navigated to the desired directory, you typically need to display its
contents. The ls command stands for list and, as the name implies, ls is used to list
the directory contents of the current working directory. When used without any options,
it lists the files, side by side, without displaying any information apart from the file
or subdirectory name. The -a option causes ls to display all files, including hidden
files. The name of a hidden file starts with a dot (.). The -l option displays the files
in a list format, along with the attributes of each file. The -i option adds the inode
number to the displayed information. Example 2-10 displays the output of the ls command
with all three options added, inside the home directory of user NetProg.

Example 2-10 ls Command Output

[NetProg@localhost ~]$ ls -lai


total 84
31719425 drwx------. 14 NetProg NetProg 4096 Feb 13 17:41 .
2 drwxr-xr-x. 5 root root 4096 Feb 13 04:07 ..
31719432 -rw-------. 1 NetProg NetProg 293 Feb 14 09:55 .bash_history
31719426 -rw-r--r--. 1 NetProg NetProg 18 Jan 4 12:45 .bash_logout
31719427 -rw-r--r--. 1 NetProg NetProg 193 Jan 4 12:45 .bash_profile
31719428 -rw-r--r--. 1 NetProg NetProg 231 Jan 4 12:45 .bashrc
31719433 drwx------. 9 NetProg NetProg 4096 Feb 13 04:48 .cache
31719434 drwxr-xr-x. 11 NetProg NetProg 4096 Feb 13 04:48 .config
31719485 drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Desktop
31719489 drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Documents
31719486 drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Downloads

--------- OUTPUT TRUNCATED FOR BREVITY ---------

A lot of information can be extracted from the output in Example 2-10. The phrase total
84 indicates the total number of disk blocks allocated to store all the files in that
directory. The next two lines of the output are for the current directory (.) and the
directory one level above the current directory (..).

To elaborate on the use of the . and .., assume that the current working directory is
/home/NetProg/LinuxStudies, and you want to navigate to /home/NetProg/Documents.
You have two options: Either enter cd /home/NetProg/Documents, which is the absolute
path, or use the shorthand notation cd ../Documents, where the .. substitutes for
/home/NetProg. You use the dot (.) notation similarly, but for the current working
directory. The value of using shorthand notation for the current working directory may
not be immediately obvious, considering the availability of relative paths. However, by
the end of Chapter 4 you will see how useful this notation is.


Next, all the files and subdirectories are listed; by default, they appear in alphabetical
order. As noted earlier, the name of a hidden file or directory starts with a dot (.). The
information from the beginning of each line all the way up to the file or directory name
is collectively known as the file or directory attributes. Here is a description of each
attribute:

■■ Inode number: The inode number is also called the file serial number or file index
number. As per the info description for ls, the inode number is a number that
“uniquely identifies each file within a particular file system.”

■■ File type: The first character right after the inode number defines the file type.
Three characters are commonly used in this field:

■■ - stands for a regular file.

■■ d stands for a directory.

■■ l stands for a soft link.

There are several other file types that are not discussed here.

■■ File permissions: Also called the file mode bits, the file permissions define who is
allowed to read (r), write (w), or execute (x) the file. Users are classified into three
categories: the owner of the file (u), the group of the file (g), and everyone else, or
other (o). File permissions are covered in detail later in this chapter. For now, you
need to know that the first three letters belong to the file owner, the second three
belong to the group of the file, and the last three belong to everyone else. So,
rwxr-xr-- means that the file owner with permissions rwx can read, write, and
execute the file. Users who are members of the same group as the file group, with
permissions r-x, can read and execute the file but not write to it. Everyone else,
with permissions r--, can only read the file but can neither write to it nor execute
it. While the meanings of read, write, and execute are self-explanatory for files, they
may not be so obvious for directories. Reading from a directory means listing its
contents using the ls command, and writing to a directory means creating files or
subdirectories under that directory. Executing directory X means changing the
current working directory to directory X by using the cd command.

■■ Alternate access method: Notice the dot (.) right after the file permissions. A
character in this position indicates that an alternate access method applies to the
file: a dot signifies an SELinux security context, while a plus sign (+) signifies an
additional mechanism such as an access control list (ACL). ACLs are covered in detail
in Chapter 3.

■■ Number of hard links: The number to the right of the file permissions is the total
number of hard links to a file or to all files inside a directory. This is discussed in
detail in section “Hard and Soft Links,” later in this chapter.

■■ File/directory owner: This is the name of the file owner. In Example 2-10, it is
NetProg.


■■ File/directory group: This is the file’s group name. In Example 2-10, it is also
NetProg. The file owner may or may not be part of this group. For example, the file
owner could be NetProg, the group of the file could be Sales, and user NetProg may
not be a member of the group Sales. Chapter 3 discusses how file access works in
each case.

■■ Size: This is the file size, in bytes.

■■ Time stamp: This is the time when the file was last modified.
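Putting these attributes together, take the Desktop line from Example 2-10 as a worked
example:

31719485 drwxr-xr-x. 2 NetProg NetProg 4096 Feb 13 04:48 Desktop

Reading from left to right: the inode number is 31719485; the d marks a directory; the
owner (user NetProg) can read, write, and cd into it (rwx), while members of group
NetProg and everyone else can read and cd into it but not write to it (r-x); the
trailing dot indicates an SELinux security context; the entry has 2 hard links, is 4096
bytes in size, and was last modified on Feb 13 at 04:48.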

Viewing Files
In this section you will see how to display the contents of files on the terminal by using
the commands cat, more, less, head, and tail.

The cat command, which stands for concatenate, writes out a file to standard output
(that is, the screen). Example 2-11 shows how to use the cat command to display the
contents of the PIM.txt file.

Example 2-11 cat Command Output

[NetProg@localhost LinuxStudies]$ cat PIM.txt


!
Router-1# show ip pim neighbor
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
P - Proxy Capable, S - State Refresh Capable, G - GenID Capable
Neighbor Interface Uptime/Expires Ver DR
Address Prio/Mode
192.168.10.10 TenGigabitEthernet1/2 7w0d/00:01:26 v2 1 / G
192.168.20.20 TenGigabitEthernet2/1 2w2d/00:01:32 v2 1 / P G

PE-L3Agg-Mut-303-3# show ip pim interface

Address Interface Ver/ Nbr Query DR DR


Mode Count Intvl Prior
192.168.10.11 TenGigabitEthernet1/2 v2/S 1 30 1 192.168.10.10
192.168.20.21 TenGigabitEthernet2/1 v2/S 1 30 1 192.168.20.20
!
[NetProg@localhost LinuxStudies]$

Several useful options can be used with cat. For example, cat -n inserts a line number
at the beginning of each line. cat -b, on the other hand, inserts a line number for
non-empty lines only. cat -s is the squeeze option, which squeezes multiple consecutive
empty lines into a single empty line. Example 2-12 shows the output of the cat command
on the file PIM.txt, using the -sn option to squeeze any consecutive empty lines in the
file into one empty line and then number all lines, including the empty lines. For a
more comprehensive list of options, you can visit the cat info page by using the command
info coreutils cat.

Example 2-12 cat -sn Command Output

[NetProg@localhost LinuxStudies]$ cat -sn PIM.txt


1 !
2 Router-1# show ip pim neighbor
3 PIM Neighbor Table
4 Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
5 P - Proxy Capable, S - State Refresh Capable, G - GenID Capable
6 Neighbor Interface Uptime/Expires Ver DR
7 Address Prio/Mode
8 192.168.10.10 TenGigabitEthernet1/2 7w0d/00:01:26 v2 1 / G
9 192.168.20.20 TenGigabitEthernet2/1 2w2d/00:01:32 v2 1 / P G
10
11 PE-L3Agg-Mut-303-3# show ip pim interface
12
13 Address Interface Ver/ Nbr Query DR DR
14 Mode Count Intvl Prior
15 192.168.10.11 TenGigabitEthernet1/2 v2/S 1 30 1 192.168.10.10
16 192.168.20.21 TenGigabitEthernet2/1 v2/S 1 30 1 192.168.20.20
17 !
[NetProg@localhost LinuxStudies]$

One of the major drawbacks of cat is that the file being displayed is output to the screen
all at once, without a pause. The next two commands, more and less, provide a more
readable form of output, where just one section of the file is displayed on the screen,
and then the user is prompted for input in order to proceed with the following section of
the file, and so forth. Therefore, both of these commands are handy tools for displaying
files that are longer than the current screen length. more is the original utility and is very
compact, so it is ideal for systems with limited resources. However, the major drawback
of more is that it does not allow you to move backward in a file; you can only move for-
ward. Therefore, the less utility was eventually developed to allow users to move forward
and backward over the content of a file. Over time, several developers contributed to the
less program, adding more features in the process. One other distinctive feature of less is
that it does not have to read the whole file before it starts displaying output; it is there-
fore much faster than many other programs, including the prominent vi text editor.

Example 2-13 shows the more command being used to display the contents of the file
InternetRoutes.txt.


Example 2-13 more Command Output

[NetProg@localhost LinuxStudies]$ more InternetRoutes.txt


Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP
+ - replicated route, % - next hop override

Gateway of last resort is 67.16.148.37 to network 0.0.0.0

S* 0.0.0.0/0 [1/0] via 67.16.148.37


1.0.0.0/8 is variably subnetted, 2511 subnets, 16 masks
B 1.0.4.0/22 [200/100] via 67.16.148.37, 6d02h
B 1.0.4.0/24 [200/100] via 67.16.148.37, 6d02h
B 1.0.5.0/24 [200/100] via 67.16.148.37, 6d02h
B 1.0.6.0/24 [200/100] via 67.16.148.37, 6d02h
B 1.0.7.0/24 [200/100] via 67.16.148.37, 6d02h
B 1.0.16.0/24 [200/150] via 67.16.148.37, 7w0d
--More--(0%)

Notice the text --More--(0%) at the end of the output, which indicates how much of the
file has been displayed so far. You can perform the following operations while viewing a
file by using more:

■■ In order to keep scrolling down the file contents, press the Enter key to scroll one
line at a time or press the Spacebar to scroll one screenful at a time.

■■ Type a number and press s to skip that number of lines forward in the file.

■■ Similarly, type a number and then press f to skip forward that number of screens.

■■ If you press =, the line number where you are currently located is displayed in place
of the percentage at the bottom of the screen. This is the line number of the last line
of the output at the bottom of the screen.

■■ To search for a specific pattern using regular expressions, type /{pattern} and press
Enter. The output jumps to the first occurrence of the pattern you are searching for.

■■ To quit the output and return to the terminal prompt, press q.

Now let’s look at an example of using the less command. Example 2-14 shows the con-
tents of the InternetRoutes.txt file displayed using less. Notice the filename at the end of
the output.


Example 2-14 less Command Output

Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP


D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP
+ - replicated route, % - next hop override

Gateway of last resort is 67.16.148.37 to network 0.0.0.0

S* 0.0.0.0/0 [1/0] via 67.16.148.37


1.0.0.0/8 is variably subnetted, 2511 subnets, 16 masks
B 1.0.4.0/22 [200/100] via 67.16.148.37, 6d02h
B 1.0.4.0/24 [200/100] via 67.16.148.37, 6d02h
B 1.0.5.0/24 [200/100] via 67.16.148.37, 6d02h
B 1.0.6.0/24 [200/100] via 67.16.148.37, 6d02h
B 1.0.7.0/24 [200/100] via 67.16.148.37, 6d02h
B 1.0.16.0/24 [200/150] via 67.16.148.37, 7w0d
InternetRoutes.txt

The following are some operations you can perform while viewing the file by using less:

■■ Use Enter, e, or j to scroll forward through the file one line at a time or use the
Spacebar or z to scroll forward one screenful at a time.

■■ Press y to scroll backward one line at a time or b to scroll backward one screen
at a time. Type a number before the y or b to scroll that many lines or screens,
respectively.

■■ Press g to go to the beginning of the file or G to go to the end of the file.

■■ Press = to see the filename and the range of line numbers currently displayed on the
screen, out of the total number of lines in the file, partial and full data size informa-
tion, as well as your location in the file as a percentage.

■■ To search for a specific pattern using regular expressions, type /{pattern} and press
Enter. The output jumps to the first occurrence of the pattern you are searching for.

■■ To quit the output and return to the terminal prompt, press q.

For a complete list of operations you can perform while viewing the file by using less,
visit the man or info pages for the less command.

It is generally recommended to use less instead of more because the latter is no longer
under active development. Keep in mind, however, that you might run into systems with
limited resources that support only more.


The final two commands covered in this section are head and tail. As their names may
imply, these simple commands or utilities print a set number of lines from the start of the
file or from the end of the file. Example 2-15 shows both commands being used to dis-
play selected output from the start or end of the InternetRoutes.txt file.

Example 2-15 head and tail Command Output

! displaying the first 15 lines of the file


[NetProg@localhost LinuxStudies]$ head -n 15 InternetRoutes.txt
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP
+ - replicated route, % - next hop override

Gateway of last resort is 67.16.148.37 to network 0.0.0.0

S* 0.0.0.0/0 [1/0] via 67.16.148.37


1.0.0.0/8 is variably subnetted, 2511 subnets, 16 masks
B 1.0.4.0/22 [200/100] via 67.16.148.37, 6d02h

! displaying the last 10 lines of the file


[NetProg@localhost LinuxStudies]$ tail -n 10 InternetRoutes.txt
B 110.204.0.0/17 [200/100] via 67.16.148.37, 6d02h
B 110.204.128.0/17 [200/100] via 67.16.148.37, 6d02h
B 110.205.0.0/16 [200/100] via 67.16.148.37, 6d02h
B 110.205.0.0/17 [200/100] via 67.16.148.37, 6d02h
B 110.205.128.0/17 [200/100] via 67.16.148.37, 6d02h
B 110.206.0.0/17 [200/100] via 67.16.148.37, 6d02h
B 110.206.128.0/17 [200/100] via 67.16.148.37, 6d02h

Connection closed by foreign host.


[NetProg@localhost LinuxStudies]$

The first section of the output in Example 2-15 shows how to extract the first 15 lines
of the file by using head, and the second section of the output shows how to display
the last 10 lines of the same file by using tail. A very useful variation is to use the
head command with a negative number, such as -20, after the -n option. When this form is
used, all of the file is displayed except for the last 20 lines. Instead of specifying a
number of lines, you can specify the first or last number of bytes of the file to be
displayed (using head or tail, respectively) by replacing the option -n with -c.
Finally, to keep a live view of a file, you can use the tail command with the -f option.
With this option, if a new line is added to the file, it appears on the screen. This
comes in handy when viewing log files that are expected to change, and these changes
need to be monitored as they happen; this is a very common scenario when troubleshooting
system incidents. To quit live mode, press Ctrl+C.
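For example, to watch a system log file as new entries arrive (the path shown is typical
for CentOS, and reading it requires root privileges):

! Typical CentOS system log path; requires root privileges
[NetProg@localhost ~]$ sudo tail -f /var/log/messages
--------- OUTPUT TRUNCATED FOR BREVITY ---------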

File Operations
This section covers the most common file operations: creating, copying, deleting, and
moving files.

In Example 2-16, the directory LinuxStudies has three empty subdirectories under it. The
example shows how to use the touch command to create a file and call it PolicyMap.txt.
When you pass a filename to the touch command as an argument, that file is created if it
does not already exist. If the file already exists, the access and modification time stamps
of the file are changed to the time when the touch command was issued. You can see this
file in the output of the ls command.

Example 2-16 Creating a File by Using touch

[NetProg@localhost LinuxStudies]$ ls -l
total 12
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:14 configurations
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:19 operational
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:06 temp

[NetProg@localhost LinuxStudies]$ touch PolicyMap.txt

[NetProg@localhost LinuxStudies]$ ls -l
total 12
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:14 configurations
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:19 operational
-rw-rw-r--. 1 NetProg NetProg 0 Feb 17 12:22 PolicyMap.txt
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:06 temp
[NetProg@localhost LinuxStudies]$

Note three things in Example 2-16:

■■ The file created is empty and has a size of zero bytes.

■■ The file’s time stamp is the time at which the touch command was issued.

■■ Linux is case sensitive. The files PolicyMap.txt and policymap.txt are two entirely
different files. The same case sensitivity applies to commands.


Example 2-17 shows how to copy the file PolicyMap.txt to the operational directory
by using the cp command. Because this is a copy operation, now both the LinuxStudies
directory and the subdirectory operational contain copies of the file, as shown by using
the ls -l command in each of the directories. Remember that the dot (.) and double dot (..)
notations, combined with relative paths, are often used to refer to the current work-
ing directory and the parent directory, respectively. The file is then deleted from the
LinuxStudies directory by using the rm command. Issuing the ls command again shows
that the file was indeed deleted. Following that, the file is moved (not copied) with the
mv command from the operational subdirectory to the configurations subdirectory.
Issuing the ls command in both directories shows that the file was moved to the latter,
and the former is empty now. Finally, the file is renamed PolicyMapConfig.txt: The mv
command renames and moves the old file to the new one, in the same location. The ls
command, issued one final time, confirms that the file renaming was successful.

Example 2-17 File Operations: Copy, Delete, and Move

! File copy operation


[NetProg@localhost LinuxStudies]$ cp PolicyMap.txt operational
[NetProg@localhost LinuxStudies]$ ls -l
total 16
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:14 configurations
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:19 operational
-rw-rw-r--. 1 NetProg NetProg 0 Feb 17 12:24 PolicyMap.txt
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:06 temp
[NetProg@localhost LinuxStudies]$ ls -l operational
total 4
-rw-rw-r--. 1 NetProg NetProg 361 Feb 17 12:24 PolicyMap.txt

! File delete operation


[NetProg@localhost LinuxStudies]$ rm PolicyMap.txt
[NetProg@localhost LinuxStudies]$ ls -l
total 12
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:14 configurations
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:19 operational
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:06 temp
[NetProg@localhost LinuxStudies]$ cd operational/
[NetProg@localhost operational]$ ls -l
total 4
-rw-rw-r--. 1 NetProg NetProg 361 Feb 17 12:24 PolicyMap.txt

! File move operation


[NetProg@localhost operational]$ mv PolicyMap.txt ../configurations/
[NetProg@localhost operational]$ ls -l
total 0


[NetProg@localhost operational]$ cd ../configurations/


[NetProg@localhost configurations]$ ls -l
total 4
-rw-rw-r--. 1 NetProg NetProg 361 Feb 17 12:24 PolicyMap.txt

! Renaming a file by moving it to the same location but with a different name
[NetProg@localhost configurations]$ mv PolicyMap.txt PolicyMapConfig.txt
[NetProg@localhost configurations]$ ls -l
total 4
-rw-rw-r--. 1 NetProg NetProg 361 Feb 17 12:24 PolicyMapConfig.txt
[NetProg@localhost configurations]$

A more secure alternative to the rm command is the shred command, which overwrites
the file a configurable number of times (three by default) in order to eliminate the
possibility of recovering the deleted file, even via direct hardware access.

The following is a summary of the commands used for file operations:

■■ touch {file_name}: Use this syntax to create a file.

■■ cp {source} {destination}: Use this syntax to copy a file to another location.

■■ mv {source} {destination}: Use this syntax to move a file from one location to
another.

■■ mv {old_file_name} {new_file_name}: Use this syntax to rename a file. The old and
new files can be collocated in the same directory or located in different directories.

■■ rm {file_name}: Use this syntax to delete a file.

■■ shred {file_name}: Use this syntax to securely delete a file.

When operating on files, it is important to be careful about what the current directory
is, what the destination directory is, and where the file currently resides. When using the
commands listed here, you need to use absolute paths, relative paths, or no path at all,
whichever is applicable at the time. Also, notice the use of the shorthand dot and double
dot notations in the previous examples and how they make a command line both shorter
and easier to read.
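To make this concrete, the following short sketch expresses the same copy operation
three ways, reusing the directories created earlier in this chapter (the exact home
directory path is an assumption):

! Absolute path
[NetProg@localhost LinuxStudies]$ cp /home/NetProg/LinuxStudies/PolicyMap.txt /home/NetProg/LinuxStudies/operational
! Relative path from the current working directory
[NetProg@localhost LinuxStudies]$ cp PolicyMap.txt operational
! Dot notation: copy the file from the parent directory into the current directory
[NetProg@localhost operational]$ cp ../PolicyMap.txt .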

Directory Operations
This section discusses directory operations. Some of the commands here are the same as
those used with files, but with added options for directories; other commands are
exclusive to directories.

The next few examples demonstrate how to create a directory, copy it to a new location,
move it to a new location, and rename it. The examples also show what you need to do to
delete a directory in two different cases: either the directory is empty or it is not.
Example 2-18 shows two directories being created under the LinuxStudies directory with
the mkdir command: EmptyDir and NonEmptyDir. By using the cp command, the file
PolicyMapConfig.txt is then copied to the directory NonEmptyDir (not shown in the
example). One directory is now empty, and the other directory contains one file.

Example 2-18 Creating Directories

[NetProg@localhost LinuxStudies]$ ls
configurations operational temp
[NetProg@localhost LinuxStudies]$ mkdir EmptyDir
[NetProg@localhost LinuxStudies]$ mkdir NonEmptyDir
[NetProg@localhost LinuxStudies]$ ls
configurations EmptyDir NonEmptyDir operational temp
[NetProg@localhost LinuxStudies]$

The next example shows a different hierarchy: a new directory is created, and the two
directories created in the previous example are moved into it. In Example 2-19, a new
directory called MasterDir is created using the mkdir command, and then the mv
command is used to move both directories under the newly created MasterDir directory.
The output of the ls command shows that both directories were successfully moved to
the new location.

Example 2-19 Moving Directories

[NetProg@localhost LinuxStudies]$ mkdir MasterDir


[NetProg@localhost LinuxStudies]$ ls
configurations EmptyDir MasterDir NonEmptyDir operational temp
[NetProg@localhost LinuxStudies]$ mv EmptyDir MasterDir
[NetProg@localhost LinuxStudies]$ mv NonEmptyDir MasterDir
[NetProg@localhost LinuxStudies]$ ls MasterDir
EmptyDir NonEmptyDir
[NetProg@localhost LinuxStudies]$

Example 2-20 shows a new directory called MasterDirReplica being created. The cp
command is then used in an attempt to copy both EmptyDir and NonEmptyDir to the new
directory. As shown in the example, the operation fails; the error message indicates that
when copying directories, you need to issue the cp command with the -r option, which
stands for recursive. When the cp command is issued with the correct option, the copy
operation is successful, as indicated by the output of the ls command. Notice that the -r
option needs to be added to the cp command whether the directory is empty or not.

Example 2-20 Copying Directories

[NetProg@localhost LinuxStudies]$ mkdir MasterDirReplica


[NetProg@localhost LinuxStudies]$ ls
configurations MasterDir MasterDirReplica operational temp
[NetProg@localhost LinuxStudies]$ cp MasterDir/EmptyDir MasterDirReplica
cp: -r not specified; omitting directory 'MasterDir/EmptyDir'
[NetProg@localhost LinuxStudies]$ cp MasterDir/NonEmptyDir MasterDirReplica
cp: -r not specified; omitting directory 'MasterDir/NonEmptyDir'
[NetProg@localhost LinuxStudies]$ cp -r MasterDir/EmptyDir MasterDirReplica
[NetProg@localhost LinuxStudies]$ cp -r MasterDir/NonEmptyDir MasterDirReplica
[NetProg@localhost LinuxStudies]$ ls MasterDirReplica/
EmptyDir NonEmptyDir
[NetProg@localhost LinuxStudies]$

The command to delete an empty directory in Linux is rmdir. For historical reasons,
rmdir works only for empty directories, and the command rm -r is required to delete
non-empty directories. In Example 2-21, an attempt is made to delete both the EmptyDir
and NonEmptyDir directories by using the rmdir command, but as expected, it does not
work on the directory NonEmptyDir. Using rm -r works fine, as the final ls command
confirms.

Example 2-21 Deleting Directories

[NetProg@localhost LinuxStudies]$ ls MasterDir


EmptyDir NonEmptyDir
[NetProg@localhost LinuxStudies]$ rmdir MasterDir/EmptyDir
[NetProg@localhost LinuxStudies]$ ls MasterDir
NonEmptyDir
[NetProg@localhost LinuxStudies]$ rmdir MasterDir/NonEmptyDir
rmdir: failed to remove 'MasterDir/NonEmptyDir': Directory not empty
[NetProg@localhost LinuxStudies]$ ls MasterDir
NonEmptyDir
[NetProg@localhost LinuxStudies]$ rm -r MasterDir/NonEmptyDir
[NetProg@localhost LinuxStudies]$ ls MasterDir
[NetProg@localhost LinuxStudies]$

Finally, Example 2-22 shows the use of the mv command to rename the directory
NonEmptyDir to NonEmptyDirRenamed.

Example 2-22 Renaming Directories

[NetProg@localhost MasterDirReplica]$ ls
EmptyDir NonEmptyDir
[NetProg@localhost MasterDirReplica]$ mv NonEmptyDir NonEmptyDirRenamed
[NetProg@localhost MasterDirReplica]$ ls
EmptyDir NonEmptyDirRenamed

The following is a summary of the commands used for directory operations:

■■ mkdir {directory_name}: Use this syntax to create directories.

■■ cp -r {source} {destination}: Use this syntax to copy directories to another location.

■■ mv {source} {destination}: Use this syntax to move directories from one location to
another.

■■ mv {old_dir_name} {new_dir_name}: Use this syntax to rename directories. A renamed
directory can be collocated (in the same path) with the original directory, or it can
be in a different path (moved and renamed in the same operation).

■■ rmdir {directory_name}: Use this syntax to delete empty directories.

■■ rm -r {directory_name}: Use this syntax to delete non-empty directories.

Hard and Soft Links


Linux provides the facility to create a link from one file to another file. A link is basically
a relationship between two files. This relationship means that changes to one file affect
the linked file in one way or another. There are two types of links in Linux: hard links and
soft, or symbolic, links. You create links in Linux by using the ln command. A link is
created between the original file, referred to as the target, and a newly created file,
referred to as the link.

Hard Links
You create a hard link between a target and a link by using the syntax (in its simplest
form) ln {target-file} {link-file}. A hard link is characterized by the following:

■■ Any changes to the contents of the target file are reflected in the link file and
vice versa.

■■ Any changes to the target file attributes, such as the file permissions, are reflected in
the link file and vice versa.

■■ Deleting the target file does not delete the link file.

■■ A target file can have one or more link files linked to it. The target and all its hard
links have the same inode number.

■■ Hard links are allowed for files only, not for directories.

Example 2-23 shows a hard link named HL-1-to-Access-List.txt created to the file Access-
List.txt. The command ls -li is used to list the files in the directory LinuxStudies,
including the file attributes. Notice that apart from the different name, the hard link file
is basically a replica of the target: both have the same file size, permissions, and inode
number. Then a second hard link, named HL-2-to-Access-List.txt, is created. Notice the
number 1 to the right of the file permissions of the original target, Access-List.txt, before
any hard links are created. This number increments by 1 every time a hard link is created.


Then the first hard link is deleted. Notice that the target file and the second hard link stay
intact. The target file is then deleted, and the second hard link still stays intact. As
mentioned previously, deleting the target or one of the hard links does not affect the
other hard links.

Example 2-23 Creating and Deleting Hard Links

[NetProg@localhost ~]$ cd LinuxStudies


[NetProg@localhost LinuxStudies]$ ls -li
total 2304
57934070 -rw-r--r--. 1 NetProg NetProg 470 Feb 14 10:08 Access-List.txt
57934069 -rw-r--r--. 2 NetProg NetProg 2353097 Feb 14 10:09 showrun.txt

! Create the first hardlink


[NetProg@localhost LinuxStudies]$ ln Access-List.txt HL-1-to-Access-List.txt
[NetProg@localhost LinuxStudies]$ ls -li
total 2308
57934070 -rw-r--r--. 2 NetProg NetProg 470 Feb 14 10:08 Access-List.txt
57934070 -rw-r--r--. 2 NetProg NetProg 470 Feb 14 10:08 HL-1-to-Access-List.txt
57934069 -rw-r--r--. 2 NetProg NetProg 2353097 Feb 14 10:09 showrun.txt

! Create the second hard link


[NetProg@localhost LinuxStudies]$ ln Access-List.txt HL-2-to-Access-List.txt
[NetProg@localhost LinuxStudies]$ ls -li
total 2312
57934070 -rw-r--r--. 3 NetProg NetProg 470 Feb 14 10:08 Access-List.txt
57934070 -rw-r--r--. 3 NetProg NetProg 470 Feb 14 10:08 HL-1-to-Access-List.txt
57934070 -rw-r--r--. 3 NetProg NetProg 470 Feb 14 10:08 HL-2-to-Access-List.txt
57934069 -rw-r--r--. 2 NetProg NetProg 2353097 Feb 14 10:09 showrun.txt

! Remove the first hard link - target and second hard link stay intact
[NetProg@localhost LinuxStudies]$ rm HL-1-to-Access-List.txt
[NetProg@localhost LinuxStudies]$ ls -li
total 2308
57934070 -rw-r--r--. 2 NetProg NetProg 470 Feb 14 10:08 Access-List.txt
57934070 -rw-r--r--. 2 NetProg NetProg 470 Feb 14 10:08 HL-2-to-Access-List.txt
57934069 -rw-r--r--. 2 NetProg NetProg 2353097 Feb 14 10:09 showrun.txt

! Delete the target - second hard link stays intact


[NetProg@localhost LinuxStudies]$ rm Access-List.txt
[NetProg@localhost LinuxStudies]$ ls -li
total 2304
57934070 -rw-r--r--. 1 NetProg NetProg 470 Feb 14 10:08 HL-2-to-Access-List.txt
57934069 -rw-r--r--. 2 NetProg NetProg 2353097 Feb 14 10:09 showrun.txt
[NetProg@localhost LinuxStudies]$


Example 2-24 shows how hard-linked files change together. Reverting to the original
state, where the file Access-List.txt and two hard links to it exist, the permissions of the
target are changed from -rw-r--r-- to -rw-rw-r-- by using the chmod command. This
command is covered in detail in Chapter 3. For now, notice that the new permissions
change for the target as well as all the hard links. To take this a step further, the
permissions of the second hard link are changed to -rw-rw-rw-. Notice now how the
permissions change for the target as well as the other hard link.

Example 2-24 How Attributes Are Reflected Across Hard Links

[NetProg@localhost LinuxStudies]$ ls -li


total 2312
57934070 -rw-r--r--. 3 NetProg NetProg 470 Feb 14 10:45 Access-List.txt
57934070 -rw-r--r--. 3 NetProg NetProg 470 Feb 14 10:45 HL-1-to-Access-List.txt
57934070 -rw-r--r--. 3 NetProg NetProg 470 Feb 14 10:45 HL-2-to-Access-List.txt
57934069 -rw-r--r--. 2 NetProg NetProg 2353097 Feb 14 10:09 showrun.txt

! Changing the file permissions for the target


[NetProg@localhost LinuxStudies]$ chmod g+w Access-List.txt
[NetProg@localhost LinuxStudies]$ ls -li
total 2312
57934070 -rw-rw-r--. 3 NetProg NetProg 470 Feb 14 10:45 Access-List.txt
57934070 -rw-rw-r--. 3 NetProg NetProg 470 Feb 14 10:45 HL-1-to-Access-List.txt
57934070 -rw-rw-r--. 3 NetProg NetProg 470 Feb 14 10:45 HL-2-to-Access-List.txt
57934069 -rw-r--r--. 2 NetProg NetProg 2353097 Feb 14 10:09 showrun.txt

! Changing the file permissions for the second hard link


[NetProg@localhost LinuxStudies]$ chmod o+w HL-2-to-Access-List.txt
[NetProg@localhost LinuxStudies]$ ls -li
total 2312
57934070 -rw-rw-rw-. 3 NetProg NetProg 470 Feb 14 10:45 Access-List.txt
57934070 -rw-rw-rw-. 3 NetProg NetProg 470 Feb 14 10:45 HL-1-to-Access-List.txt
57934070 -rw-rw-rw-. 3 NetProg NetProg 470 Feb 14 10:45 HL-2-to-Access-List.txt
57934069 -rw-r--r--. 2 NetProg NetProg 2353097 Feb 14 10:09 showrun.txt
[NetProg@localhost LinuxStudies]$

Similarly, content changes in one file are automatically reflected in the target and all
other hard links, as shown in Example 2-25. The command cat displays the contents of
the file Access-List.txt on the terminal, and then cat is used again to display the contents
of HL-1-to-Access-List.txt. The contents of the two files are, as expected, the same. Next,
the text editor vi is used to add a new line with sequence number 5 at the top of the
access list Test-Access-List in the file Access-List.txt. The file contents are then viewed
using cat to confirm that the changes were successfully saved. Viewing the contents of
both hard-linked files shows that the new line was also added to the ACL
Test-Access-List in both files.


Example 2-25 How Content Changes Are Reflected Across Hard Links

[NetProg@localhost LinuxStudies]$ ls -li


total 2312
57934145 -rw-rw-r--. 3 NetProg NetProg 512 Feb 14 14:45 Access-List.txt
57934145 -rw-rw-r--. 3 NetProg NetProg 512 Feb 14 14:45 HL-1-to-Access-List.txt
57934145 -rw-rw-r--. 3 NetProg NetProg 512 Feb 14 14:45 HL-2-to-Access-List.txt
57934069 -rw-r--r--. 2 NetProg NetProg 2353097 Feb 14 10:09 showrun.txt

! Identical file content before editing


[NetProg@localhost LinuxStudies]$ cat Access-List.txt
!
ipv4 access-list Test-Access-List
10 permit ipv4 192.168.10.0 0.0.0.255 any
20 permit ipv4 192.168.20.0 0.0.3.255 any
30 permit ipv4 192.168.30.0 0.0.0.255 any
!
[NetProg@localhost LinuxStudies]$ cat HL-1-to-Access-List.txt
!
ipv4 access-list Test-Access-List
10 permit ipv4 192.168.10.0 0.0.0.255 any
20 permit ipv4 192.168.20.0 0.0.3.255 any
30 permit ipv4 192.168.30.0 0.0.0.255 any
!

! Content changes in target are automatically reflected in both hard links


[NetProg@localhost LinuxStudies]$ vi Access-List.txt
[NetProg@localhost LinuxStudies]$ cat Access-List.txt
!
ipv4 access-list Test-Access-List
5 permit ipv4 192.168.10.0 0.0.0.255 any
10 permit ipv4 192.168.10.0 0.0.0.255 any
20 permit ipv4 192.168.20.0 0.0.3.255 any
30 permit ipv4 192.168.30.0 0.0.0.255 any
!
[NetProg@localhost LinuxStudies]$ cat HL-1-to-Access-List.txt
!
ipv4 access-list Test-Access-List
5 permit ipv4 192.168.10.0 0.0.0.255 any
10 permit ipv4 192.168.10.0 0.0.0.255 any
20 permit ipv4 192.168.20.0 0.0.3.255 any
30 permit ipv4 192.168.30.0 0.0.0.255 any
!

[NetProg@localhost LinuxStudies]$ cat HL-2-to-Access-List.txt
!
ipv4 access-list Test-Access-List
5 permit ipv4 192.168.10.0 0.0.0.255 any
10 permit ipv4 192.168.10.0 0.0.0.255 any
20 permit ipv4 192.168.20.0 0.0.3.255 any
30 permit ipv4 192.168.30.0 0.0.0.255 any
!
[NetProg@localhost LinuxStudies]$

In very simple terms, a hard link creates a new live copy of a file that changes as the
target or any of the other hard links change. More accurately, a hard link creates a new
pointer, with a different name (filename), to the same inode. The inode number is the
same across all hard-linked files, and hard links cannot span different file systems because
inode numbers may not be unique across different file systems on separate partitions.

One last thing to note in the output of the ls -l command is that each file has one hard
link by default, before you manually create any hard links to the file. This first hard link
is the original target file itself. This reinforces the concept that a hard link is just a
pointer to an inode number, and the first pointer to a particular inode number is the
target itself.
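Because every hard link to a file shares the same inode number, that number can be used
to find all the names linked to one file. The following is a brief sketch, assuming the
inode number from Example 2-23 is still valid:

! List the inode number of the target, then find every name that points to it
[NetProg@localhost LinuxStudies]$ ls -i Access-List.txt
57934070 Access-List.txt
[NetProg@localhost LinuxStudies]$ find . -inum 57934070
./Access-List.txt
./HL-1-to-Access-List.txt
./HL-2-to-Access-List.txt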

One use case for hard links is as a utility to distribute data. Consider a configuration file
or a device inventory with one or more hard links, each being used by a different device
or application. Every time the file or one of the hard links is updated by one of the
applications or devices, the updates are automatically reflected in all the other files.
Think of all the possibilities that this functionality provides in the real world of
automation!

Soft Links
A soft link, commonly referred to as a symbolic link, or symlink for short, does not
create a live copy of a target file as a hard link does. Instead, a symbolic link, as the name
implies, is just a pointer, or a shortcut, to the original file, not to the inode, as is the case
with hard links. Symlinks are created using the command ln -s {target_file} {link_file}.
Symlinks are characterized by the following:

■■ The target file and the link file have different inode numbers.

■■ Symlink file permissions are always rwxrwxrwx, regardless of the permissions of the
target file.

■■ Symlinks have the letter l to the left of the file permissions in the output of the ls -l
command and an arrow pointing to the target file at the end of the line of the same
output.

■■ A symlink does not disappear when the target file is deleted. Instead, the output of
the command ls -l shows the target file highlighted in red and flashing to indicate
that the symlink is broken.


■■ The symlink references the target file by name. If the target file is replaced by any
other file that has the same name, the symlink points to that new file.

■■ Unlike hard links, symlinks can be created for directories as well as files.

Example 2-26 shows symlinks at work. First, a symlink named SL-1-to-Access-List.txt is
created for the file Access-List.txt. Notice the different inode numbers and file
permissions between the target and link files. Notice also the l that is prepended to the
file permissions of the soft link and the arrow pointing to the target at the end of the
line; both the l and the arrow indicate that this is a soft link. The target is then deleted
using the rm command. However, the soft link file still appears in the output of the ls
command. On a computer screen, the target would also be highlighted in red to indicate
a broken link. Next in the example, a new empty file is created using the touch command,
but it has the same name as the file that was deleted, Access-List.txt. When the ls
command is issued, it shows that the symlink is operational again, and it points to the
newly created text file. To further confirm that the symlink is working, the cat command
is issued, and it shows both files being empty. The vi editor is then used to add an ACL,
Test-Access-List, to the symlink file, and after the cat command is issued for both files, it
turns out that the changes made to the symlink have been reflected to the target,
Access-List.txt.

Example 2-26 Symlinks at Work

[NetProg@localhost LinuxStudies]$ ln -s Access-List.txt SL-1-to-Access-List.txt


[NetProg@localhost LinuxStudies]$ ls -li
total 2312
57934145 -rw-rw-r--. 3 NetProg NetProg 512 Feb 14 14:45 Access-List.txt
57934143 lrwxrwxrwx. 1 NetProg NetProg 15 Feb 14 14:55 SL-1-to-Access-List.txt
-> Access-List.txt

! Deleting the target does not delete the symlink


[NetProg@localhost LinuxStudies]$ rm Access-List.txt
[NetProg@localhost LinuxStudies]$ ls -li
total 2308
57934143 lrwxrwxrwx. 1 NetProg NetProg 15 Feb 14 14:55 SL-1-to-Access-List.txt
-> Access-List.txt

! Creating a new file with the same name as the deleted target reinstates the
symlink
[NetProg@localhost LinuxStudies]$ touch Access-List.txt
[NetProg@localhost LinuxStudies]$ ls -li
total 2308
57934146 -rw-rw-r--. 1 NetProg NetProg 0 Feb 14 14:58 Access-List.txt
57934143 lrwxrwxrwx. 1 NetProg NetProg 15 Feb 14 14:55 SL-1-to-Access-List.txt
-> Access-List.txt

[NetProg@localhost LinuxStudies]$ cat Access-List.txt


[NetProg@localhost LinuxStudies]$ cat SL-1-to-Access-List.txt

[NetProg@localhost LinuxStudies]$ vi SL-1-to-Access-List.txt


[NetProg@localhost LinuxStudies]$ cat SL-1-to-Access-List.txt
!
ipv4 access-list Test-Access-List
10 permit ipv4 192.168.10.0 0.0.0.255 any
20 permit ipv4 192.168.20.0 0.0.0.255 any
30 permit ipv4 192.168.30.0 0.0.0.255 any
!
[NetProg@localhost LinuxStudies]$ cat Access-List.txt
!
ipv4 access-list Test-Access-List
10 permit ipv4 192.168.10.0 0.0.0.255 any
20 permit ipv4 192.168.20.0 0.0.0.255 any
30 permit ipv4 192.168.30.0 0.0.0.255 any
!
[NetProg@localhost LinuxStudies]$

Soft links provide similar functionality to Windows shortcuts. One use case for symlinks
is to consolidate all your work in one directory. The directory contains symlinks to all
files from other directories. Changes made to any symlink are reflected to the original
file, and you do not have to move the original file from its place in the file system.
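The following minimal sketch illustrates this use case; the Workspace directory and the
linked paths are hypothetical:

! Gather symlinks to scattered files in a single working directory
[NetProg@localhost ~]$ mkdir Workspace
[NetProg@localhost ~]$ ln -s ~/LinuxStudies/configurations/PolicyMapConfig.txt ~/Workspace/PolicyMapConfig.txt
[NetProg@localhost ~]$ ln -s /etc/hosts ~/Workspace/hosts
! Editing either symlink changes the corresponding original file in place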

Input and Output Redirection


Earlier in this chapter, you briefly learned about the GNU utilities that are bundled with
the Linux kernel to form the Linux operating system. Utilities are a collection of software
tools that enable a user to perform common system tasks without having to write their
own tool set.

All the commands introduced so far in this chapter (as well as in the remainder of this
chapter) are actually utilities, and each is invoked via the respective command. For
example, the ls, cat, more, less, head, tail, cp, mv, rm, mkdir, rmdir, and ln commands
covered so far are actually utilities, and you run each utility by typing the corresponding
command in the shell. Most utilities are grouped together in packages. When a package
is installed, all constituent utilities are installed. Two popular packages are coreutils and
util-linux.

The true power of Linux lies not only in its architecture but also in the vast number of
utilities that come prepackaged with it, the new utilities that can be easily installed and
immediately add to the power and usability of the system, and, finally, the option of
programming your own custom utilities. Utilities are introduced in this section because
input and output redirection are arguably two of the most powerful features of Linux
that act on utilities. Redirection stretches the flexibility and usability of utilities and
combines the workings of two or more utilities in ways unique to Linux, as you will see
in this section.

The Linux and UNIX philosophy has been inspired by the experience of the software
development leaders who developed programming languages. Ken Thompson and Dennis
Ritchie developed the C language as well as UNIX. Ken and Dennis, among others,
realized early on that the operating system should present an interface to the user that
facilitates a productive and interactive experience. Mimicking programming languages,
they wanted the user to be able to filter the input and output of programs and apply
control to the flow of standard input, output, and errors between utilities.

The UNIX forefathers applied software engineering methods traditionally used in
programming languages to their operating system user experience. These engineering
methods are reflected in the powerful command-line utilities of both UNIX and Linux,
along with pipes and redirection, to smoothly integrate tools.

The power of the UNIX and Linux command line is achieved with the following design
philosophies:

■■ Make each program do one thing well. To do a new job, build afresh rather than
complicate old programs by adding new features.

■■ Expect the output of every program to become the input to another, as yet
unknown, program. Don’t clutter output with extraneous information.

Thanks to these design philosophies for the command line, an administrator is
immediately equipped with a powerful and infinitely flexible tool chain for all sorts of
operations.

The community of Linux developers around the world is continuously contributing to
the long list of available utilities, creating small blocks that can work together to produce
powerful results and making it easy to automate mundane administrative tasks. A good
way to demonstrate this power is to show an advanced example that illustrates the full
potential of utilities and pipes. Example 2-27 is a relatively complex example that pings
the gateways in the Linux routing table and inserts the unreachable ones into a file. This
file is then sent via email. In this example, the output of one command is piped to
another by using the | (pipe) symbol.

Note You do not need to worry about the particular semantics of this example as its goal
is to illustrate the sheer power of piping the output of one command to be used as input to
another command.

Example 2-27 A Relatively Complex Example of Piping

[NetProg@localhost]$ netstat -nr | awk '{print $2}' | grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' | xargs -n1 ping -c1 | grep -b1 100 | mail -s "Unreachable gateways" [email protected]


This section covers input and output redirection in detail. For now, here is a brief
explanation of each command in Example 2-27:

■■ netstat -nr: Displays the routing table and pipes it to the next command (awk).

■■ awk '{print $2}': Filters the second column only (gateways) from the output of the
previous command (netstat), and pipes the result to the next command (grep).

■■ grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}': Shows only the IP
addresses from the output of the previous command (awk) and pipes the result to
the next command (xargs).

■■ xargs -n1 ping -c1: Pings IP addresses that were provided by the previous command
(grep) and pipes the ping results to the next command (grep).

■■ grep -b1 100: Filters the unreachable IP addresses from the ping performed in the
previous command (xargs) and pipes the result to the next command (mail).

■■ mail -s "Unreachable gateways" [email protected]: Sends an email with the
output of the previous command (grep), with the subject “Unreachable gateways”,
to user NetProg’s email address.

In Linux, for each command that you execute or utility that you run, there are three files,
each of which contains a different stream of data:

■■ stdin (standard input): This is the file that the command reads to get its input. stdin
has the file handle 0.

■■ stdout (standard output): This is the file to which the command sends its output.
stdout has the file handle 1.

■■ stderr (standard error): This is the file to which the command sends any errors, also
known as exceptions. stderr has the file handle 2.

A file handle is a number assigned to a file by the OS when that file is opened.

stdin is, by default, what you type on your keyboard. Similarly, stdout and stderr are, by
default, displayed onscreen.

Input and output redirection are powerful capabilities in Linux that are very important
pieces of the automation puzzle. Output that is normally seen onscreen can be redirected
to a file. Output from a command can also be split into regular output and error, which
can then be redirected separately. The contents of a file or the output of a command can
be redirected to another command as input to that command.

The sort utility accepts input through stdin (via the keyboard), sorts the input in
alphabetical order, and then sends the output to stdout (to the screen). In Example 2-28,
after the user types the command sort and presses Enter, the shell waits for input from
the user through the keyboard. The user types the letters q, w, e, r, t, and y on the
keyboard, one by one, pressing Enter after each letter to start a new line. The user then
executes the sort command by pressing the Ctrl+d key combination. As shown in the
example, the lines are sorted in alphabetical order, as expected.

Example 2-28 Using the sort Utility and Providing the Input Through the Default stdin
Stream, the Keyboard

[NetProg@localhost LinuxStudies]$ sort


q
w
e
r
t
y ! Press ctrl+d here
e
q
r
t
w
y
[NetProg@localhost LinuxStudies]$

Input redirection can be used to change a command’s stdin to a file instead of the
keyboard. One way to do this is to specify the file as an argument to the command.
Another way is to use the syntax {command} < {file}, where the contents of file are used
as input to command. Example 2-29 shows how stdin to the sort command is changed to
the file qwerty.txt using both methods.

Example 2-29 Changing stdin for the sort Command from the Keyboard to a File by
Providing the File as an Argument, and by Using Input Redirection

[NetProg@localhost LinuxStudies]$ cat qwerty.txt


q
w
e
r
t
y
[NetProg@localhost LinuxStudies]$ sort qwerty.txt
e
q
r
t
w
y
[NetProg@localhost LinuxStudies]$ sort < qwerty.txt

e
q
r
t
w
y
[NetProg@localhost LinuxStudies]$

How to change stdout and stderr may be a bit more obvious than how to change stdin
because the output from commands is usually expected to appear on the screen. Output
redirection can be used to redirect the output to a file instead. In Example 2-30, the
output from the sort command is redirected to the file qwerty-sorted.txt, and then the
cat command is used to display the contents of the sorted file.

Example 2-30 Redirecting stdout to the File qwerty-sorted.txt by Using >

[NetProg@localhost LinuxStudies]$ ls
configurations operational QoS.txt qwerty.txt temp
[NetProg@localhost LinuxStudies]$ sort qwerty.txt > qwerty-sorted.txt
[NetProg@localhost LinuxStudies]$ ls
configurations operational QoS.txt qwerty-sorted.txt qwerty.txt temp
[NetProg@localhost LinuxStudies]$ cat qwerty-sorted.txt
e
q
r
t
w
y
[NetProg@localhost LinuxStudies]$

Notice that file qwerty-sorted.txt did not exist before the sort command was executed.
The file was created before it was used to store the redirected output. Similarly, if the file
had existed before the command was executed, it would have been overwritten.

What if you need to append the output to the file instead of overwriting it? Example 2-31
shows how to append output to an existing file by using the >> notation. As you saw
earlier, the cat command outputs the contents of a file to the screen. In Example 2-31,
instead of displaying the contents of QoS.txt on the screen, the cat command with the >>
notation is used to redirect the file’s contents to the qwerty-sorted.txt file, but this time
the output is appended to the existing content of qwerty-sorted.txt instead of
overwriting it.


Example 2-31 Appending Command Output by Using >>

[NetProg@localhost LinuxStudies]$ cat QoS.txt >> qwerty-sorted.txt


[NetProg@localhost LinuxStudies]$ cat qwerty-sorted.txt
e
q
r
t
w
y
!
policy-map MOBILE_RAN_QOS_OUT
!
class MOBILE_VOICE_CLASS
priority level 1
police rate percent 50
conform-action transmit
exceed-action drop
!
set cos 5
!
class MOBILE_BROADBAND
bandwidth percent 35
set cos 3
random-detect default
!
class class-default
bandwidth percent 15
set cos 0
random-detect default
!
end-policy-map
!
[NetProg@localhost LinuxStudies]$

stderr is also, by default, displayed on the screen. If you want to redirect stderr to a file
instead, you use the syntax {command} 2> {file}, where the regular output goes to the
screen while the errors or exception messages are redirected to the file specified in the
command. Example 2-32 shows how the error message from issuing the stat command
on a nonexistent file is redirected to the file error.txt. The stat command gives you
important information about a file, such as the file size, inode number, permissions, user
ID of the file owner, group ID of the file, and the times the file was last accessed,
modified (content changed), and changed (metadata such as permissions changed).


Example 2-32 Redirecting stderr to a File by Using 2>

[NetProg@localhost LinuxStudies]$ stat QoS.txt


File: QoS.txt
Size: 361 Blocks: 8 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 31719574 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 1001/ NetProg) Gid: ( 1001/ NetProg)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2018-02-23 18:04:59.919457898 +0300
Modify: 2018-02-23 18:00:21.881647657 +0300
Change: 2018-02-23 18:00:21.881647657 +0300
Birth: -
[NetProg@localhost LinuxStudies]$ stat WrongFile.txt
stat: cannot stat 'WrongFile.txt': No such file or directory
[NetProg@localhost LinuxStudies]$ stat WrongFile.txt 2> error.txt
[NetProg@localhost LinuxStudies]$ cat error.txt
stat: cannot stat 'WrongFile.txt': No such file or directory
[NetProg@localhost LinuxStudies]$

So far, you have seen how to redirect stdout to a file and how to do the same for stderr,
but not both together. To redirect both stdout and stderr to a file, you use the syntax
{command} &> {file}. Example 2-33 shows the cat command being used to concatenate
three files: QoS.txt, WrongFile.txt, and qwerty.txt. However, one of these files,
WrongFile.txt, does not exist, and so an error message is generated. As a result, the
contents of QoS.txt and qwerty.txt are concatenated, and then both stdout and stderr
are redirected to the same file, OutandErr.txt.

Example 2-33 Redirecting Both stdout and stderr to OutandErr.txt

[NetProg@localhost LinuxStudies]$ cat QoS.txt WrongFile.txt qwerty.txt &> OutandErr.txt
[NetProg@localhost LinuxStudies]$ cat OutandErr.txt
!
policy-map MOBILE_RAN_QOS_OUT
!
class MOBILE_VOICE_CLASS
priority level 1
police rate percent 50
conform-action transmit
exceed-action drop
!
set cos 5
!
class MOBILE_BROADBAND
bandwidth percent 35

set cos 3
random-detect default
!
class class-default
bandwidth percent 15
set cos 0
random-detect default
!
end-policy-map
!
cat: WrongFile.txt: No such file or directory
q
w
e
r
t
y
[NetProg@localhost LinuxStudies]$

To split stdout and stderr into their own separate files, you can use the syntax
{command} > {output_file} 2> {error_file}, as shown in Example 2-34.

Example 2-34 Redirecting stdout and stderr Each to Its Own File

[NetProg@localhost LinuxStudies]$ cat QoS.txt WrongFile.txt qwerty.txt > output.txt 2> error.txt
[NetProg@localhost LinuxStudies]$ cat output.txt
!
policy-map MOBILE_RAN_QOS_OUT
!
class MOBILE_VOICE_CLASS
priority level 1
police rate percent 50
conform-action transmit
exceed-action drop
!
set cos 5
!
class MOBILE_BROADBAND
bandwidth percent 35
set cos 3
random-detect default
!

class class-default
bandwidth percent 15
set cos 0
random-detect default
!
end-policy-map
!
q
w
e
r
t
y
[NetProg@localhost LinuxStudies]$ cat error.txt
cat: WrongFile.txt: No such file or directory
[NetProg@localhost LinuxStudies]$

To ignore or discard an error altogether and not save it to a file, you can simply redirect
it to /dev/null. The file /dev/null is a special device file that discards any data redirected
to it.
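For example, the error from the stat command in Example 2-32 could be silently
discarded as follows:

[NetProg@localhost LinuxStudies]$ stat WrongFile.txt 2> /dev/null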

You can also append both stdout and stderr to an existing file by using the syntax
{command} >> {file} 2>&1.
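As a brief sketch reusing the files from Example 2-33, the following appends both
streams to OutandErr.txt; the 2>&1 notation redirects stderr (file handle 2) to wherever
stdout (file handle 1) is currently pointing:

[NetProg@localhost LinuxStudies]$ cat QoS.txt WrongFile.txt >> OutandErr.txt 2>&1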

As mentioned earlier in the chapter, Linux provides a facility to redirect the output of
one command to be used as input for another command. This is done using the | (pipe)
operator. Example 2-35 shows how the output of the stat command for the QoS.txt file
is piped to the sort command, which sorts the output in alphabetical order. The result is
then piped again to the head command to extract the first line of the output.

Example 2-35 Piping Command Output to Another Command

[NetProg@localhost LinuxStudies]$ stat QoS.txt


File: QoS.txt
Size: 361 Blocks: 8 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 31719574 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 1001/ NetProg) Gid: ( 1001/ NetProg)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2018-02-23 18:04:59.919457898 +0300
Modify: 2018-02-23 18:00:21.881647657 +0300
Change: 2018-02-23 18:00:21.881647657 +0300
Birth: -

! Stat output piped to sort


[NetProg@localhost LinuxStudies]$ stat QoS.txt | sort

Access: (0664/-rw-rw-r--) Uid: ( 1001/ NetProg) Gid: ( 1001/ NetProg)
Access: 2018-02-23 18:04:59.919457898 +0300
Birth: -
Change: 2018-02-23 18:00:21.881647657 +0300
Context: unconfined_u:object_r:user_home_t:s0
Device: fd03h/64771d Inode: 31719574 Links: 1
File: QoS.txt
Modify: 2018-02-23 18:00:21.881647657 +0300
Size: 361 Blocks: 8 IO Block: 4096 regular file

! Double piping to sort and then head


[NetProg@localhost LinuxStudies]$ stat QoS.txt | sort | head -n 1
Access: (0664/-rw-rw-r--) Uid: ( 1001/ NetProg) Gid: ( 1001/ NetProg)
[NetProg@localhost LinuxStudies]$

You have seen how stdout is by default displayed on the screen and how to redirect it
to a file. But can you display it on the screen and simultaneously redirect it to a file?
Yes. You can use the pipe operator coupled with the tee command to do just that. In
Example 2-36, the output of the command ls -l is piped to the tee command, and as a
result, the output is both displayed on the screen and saved to the file lsoutput.txt.

Example 2-36 Piping Command Output to the tee Command to Display It on the
Screen and Save It to the File lsoutput.txt

[NetProg@localhost LinuxStudies]$ ls -l | tee lsoutput.txt


total 40
-rw-rw-r--. 1 NetProg NetProg 46 Feb 23 20:28 colors.txt
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 23 16:59 configurations
-rw-rw-r--. 1 NetProg NetProg 61 Feb 23 18:15 error.txt
-rw-rw-r--. 1 NetProg NetProg 475 Feb 23 19:43 Existing.txt
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:34 operational
-rw-rw-r--. 1 NetProg NetProg 419 Feb 23 19:25 OutandErr.txt
-rw-rw-r--. 1 NetProg NetProg 361 Feb 23 18:00 QoS.txt
-rw-rw-r--. 1 NetProg NetProg 373 Feb 23 18:08 qwerty-sorted.txt
-rw-rw-r--. 1 NetProg NetProg 12 Feb 23 17:28 qwerty.txt
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 15:03 temp

[NetProg@localhost LinuxStudies]$ cat lsoutput.txt


total 40
-rw-rw-r--. 1 NetProg NetProg 46 Feb 23 20:28 colors.txt
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 23 16:59 configurations
-rw-rw-r--. 1 NetProg NetProg 61 Feb 23 18:15 error.txt
-rw-rw-r--. 1 NetProg NetProg 475 Feb 23 19:43 Existing.txt
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 12:34 operational

-rw-rw-r--. 1 NetProg NetProg 419 Feb 23 19:25 OutandErr.txt
-rw-rw-r--. 1 NetProg NetProg 361 Feb 23 18:00 QoS.txt
-rw-rw-r--. 1 NetProg NetProg 373 Feb 23 18:08 qwerty-sorted.txt
-rw-rw-r--. 1 NetProg NetProg 12 Feb 23 17:28 qwerty.txt
drwxrwxr-x. 2 NetProg NetProg 4096 Feb 17 15:03 temp
[NetProg@localhost LinuxStudies]$

The tee command overwrites the output file (in this case, the lsoutput.txt file). You can
use the tee command with the -a option to append to the file instead of overwriting it.
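A one-line sketch of the append variant:

[NetProg@localhost LinuxStudies]$ ls -l | tee -a lsoutput.txt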

Archiving Utilities
An archiving utility takes a file or a group of files as input and encodes the file or files
into one single file, commonly known as an archive. The archiving utility also makes it
possible to add files to the archive, remove files from the archive, update the files in the
archive, or de-archive the archive file into its constituent files. Archiving utilities have
historically been used for backup purposes and to package several files into one file that
can be easily distributed, downloaded, and so on. The most commonly used archiving
utility in Linux is the tar utility, whose name stands for tape archive. Archive files
produced by the tar utility have a .tar extension and are commonly referred to as tarballs.

In contrast to an archiving utility, a compression utility takes a file or a group of files as
input and compresses the file or files into another format that is smaller than the original
file. This compression is lossless, meaning that no information is lost in the process. The
compressed file can be decompressed and returned to its original state without any data
or metadata being lost. In Linux, the most popular compression utilities are gzip, bzip2,
and xz. The performance of compression utilities is measured based on several criteria,
two of which are how quickly the compression happens and the compression ratio (how
small the new compressed file is in comparison with the original uncompressed file). The
xz utility is the best when it comes to compression ratio, but it is the slowest. The gzip
utility is the fastest but has the lowest (worst) compression ratio. As you may have
already concluded, bzip2 lies in the middle with respect to speed and compression ratio.

We cover archiving and compression utilities together in this section because a very
common use case involves compressing files by using one of the compression utilities
listed here and then archiving the compressed files by using the tar utility. In addition to
covering these utilities, this section also illustrates how compression and archiving can
be performed using a single command.
Example 2-37 shows how to use the gzip utility to compress the InternetRoutes.txt
file. You simply issue the command gzip InternetRoutes.txt, and the utility creates
another file, InternetRoutes.txt.gz, which is the compressed file, and removes the original
uncompressed file. To decompress the file back to its original form, you use the
command gunzip InternetRoutes.txt.gz. What if you want to keep the original file as
well as the compressed file after compression? You use gzip with the -k option, which
stands for keep, as shown in the example. Similarly, you can use the -k option with
gunzip to decompress the file and keep the compressed file intact.

Example 2-37 Using the gzip Utility to Compress a Text File

[NetProg@localhost LinuxStudies]$ ls -l
total 17212
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 InternetRoutes.txt

[NetProg@localhost LinuxStudies]$ gzip InternetRoutes.txt


[NetProg@localhost LinuxStudies]$ ls -l
total 1268
-rw-rw-r--. 1 NetProg NetProg 1296408 Feb 24 12:27 InternetRoutes.txt.gz

[NetProg@localhost LinuxStudies]$ gunzip InternetRoutes.txt.gz


[NetProg@localhost LinuxStudies]$ ls -l
total 17212
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 InternetRoutes.txt

[NetProg@localhost LinuxStudies]$ gzip -k InternetRoutes.txt


[NetProg@localhost LinuxStudies]$ ls -l
total 18480
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 InternetRoutes.txt
-rw-rw-r--. 1 NetProg NetProg 1296408 Feb 24 12:27 InternetRoutes.txt.gz

[NetProg@localhost LinuxStudies]$ rm InternetRoutes.txt


[NetProg@localhost LinuxStudies]$ ls -l
total 1268
-rw-rw-r--. 1 NetProg NetProg 1296408 Feb 24 12:27 InternetRoutes.txt.gz

[NetProg@localhost LinuxStudies]$ gunzip -k InternetRoutes.txt.gz


[NetProg@localhost LinuxStudies]$ ls -l
total 18480
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 InternetRoutes.txt
-rw-rw-r--. 1 NetProg NetProg 1296408 Feb 24 12:27 InternetRoutes.txt.gz
[NetProg@localhost LinuxStudies]$

Notice in the example that the size of the original uncompressed file is approximately
17 MB, and the size of the compressed file is approximately 1.2 MB; this represents a
compression ratio of about 13.6.


Example 2-38 shows how the bzip2 utility is used to compress the same
InternetRoutes.txt file by using the command bzip2 InternetRoutes.txt and then
uncompress the file by using the command bunzip2 InternetRoutes.txt.bz2.

Example 2-38 Using the bzip2 Utility to Compress a Text File

[NetProg@localhost LinuxStudies]$ ls -l
total 17212
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 InternetRoutes.txt

[NetProg@localhost LinuxStudies]$ bzip2 -kv InternetRoutes.txt


InternetRoutes.txt: 19.386:1, 0.413 bits/byte, 94.84% saved, 17622037 in,
909025 out.
[NetProg@localhost LinuxStudies]$ ls -l
total 18100
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 InternetRoutes.txt
-rw-rw-r--. 1 NetProg NetProg 909025 Feb 24 12:27 InternetRoutes.txt.bz2

[NetProg@localhost LinuxStudies]$ rm InternetRoutes.txt


[NetProg@localhost LinuxStudies]$ ls -l
total 888
-rw-rw-r--. 1 NetProg NetProg 909025 Feb 24 12:27 InternetRoutes.txt.bz2

[NetProg@localhost LinuxStudies]$ bunzip2 InternetRoutes.txt.bz2


[NetProg@localhost LinuxStudies]$ ls -l
total 17212
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 InternetRoutes.txt
[NetProg@localhost LinuxStudies]$

Notice that the -k option also works with bzip2; when it is used, the original
uncompressed file is left intact. The option works equally well with bunzip2 to leave the
compressed file intact. As shown in Example 2-38, the -v option, which stands for
verbose, provides some information and statistics on the compression process. It is worth
noting that the verbose option is available for the vast majority of Linux commands, and
it is available for use with all archiving and compression utilities in this chapter. If you
look at the sizes of the original and compressed files, you see that the compression ratio
in the example is approximately 19.4, which is in line with the verbose output.

Example 2-39 shows how to use the xz utility to compress the same InternetRoutes.txt
file, using the command xz InternetRoutes.txt, and then uncompress the file by using
the command xz -d InternetRoutes.txt.xz. The -v option is used here as well to provide
some insight into the compression process.


Example 2-39 Using the xz Utility to Compress a Text File

[NetProg@localhost LinuxStudies]$ ls -l
total 17212
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 InternetRoutes.txt

[NetProg@localhost LinuxStudies]$ xz -v InternetRoutes.txt


InternetRoutes.txt (1/1)
100 % 711.9 KiB / 16.8 MiB = 0.041 2.3 MiB/s 0:07
[NetProg@localhost LinuxStudies]$ ls -l
total 712
-rw-rw-r--. 1 NetProg NetProg 728936 Feb 24 12:27 InternetRoutes.txt.xz

[NetProg@localhost LinuxStudies]$ xz -dv InternetRoutes.txt.xz


InternetRoutes.txt.xz (1/1)
100 % 711.9 KiB / 16.8 MiB = 0.041
[NetProg@localhost LinuxStudies]$ ls -l
total 17212
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 InternetRoutes.txt
[NetProg@localhost LinuxStudies]$

As you can see from the output in Example 2-39, xz provides a compression ratio of
about 24.2, which is the best compression ratio so far. However, the xz utility takes a
substantial amount of time (7 seconds, according to the verbose output) to compress the
file. As shown in the earlier examples, compression using gzip is almost instantaneous,
while bzip2 takes a couple of seconds to compress the same file.

As mentioned at the beginning of this section, tar is an archiving utility that is used
to group several files into a single archive file. Example 2-40 shows how tar is used to
archive three files into one. To archive a number of files, you issue the command tar -cvf
{Archive_File.tar} {file1} {file2} .. {fileX}. The option c is for create, v is for verbose, and
f is for stating the archive filename in the command. To view the constituent files of the
archive, you use the command tar -tf {Archive_File.tar}. This does not extract the files in
the archive. It only lists the files that make up the archive, as shown by the ls -l command
in the example, right after this command is used. Finally, in order to extract the files from
the archive, you use the command tar -xvf {Archive_File.tar}.

Example 2-40 Using the tar Utility to Archive Three Files into One

[NetProg@localhost LinuxStudies]$ ls -l
total 17260
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 BGP.txt
-rw-rw-r--. 1 NetProg NetProg 43147 Feb 24 13:33 IPRoute.txt
-rw-r--r--. 1 NetProg NetProg 796 Feb 24 13:33 QoS.txt


! Archive three files into one tarball


[NetProg@localhost LinuxStudies]$ tar -cvf Archive.tar BGP.txt IPRoute.txt QoS.txt
BGP.txt
IPRoute.txt
QoS.txt
[NetProg@localhost LinuxStudies]$ ls -l
total 34520
-rw-rw-r--. 1 NetProg NetProg 17674240 Feb 24 15:22 Archive.tar
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 BGP.txt
-rw-rw-r--. 1 NetProg NetProg 43147 Feb 24 13:33 IPRoute.txt
-rw-r--r--. 1 NetProg NetProg 796 Feb 24 13:33 QoS.txt

[NetProg@localhost LinuxStudies]$ rm BGP.txt IPRoute.txt QoS.txt


[NetProg@localhost LinuxStudies]$ ls -l
total 17260
-rw-rw-r--. 1 NetProg NetProg 17674240 Feb 24 15:22 Archive.tar

! Display the constituent files in the archive without de-archiving the tarball
[NetProg@localhost LinuxStudies]$ tar -tf Archive.tar
BGP.txt
IPRoute.txt
QoS.txt
[NetProg@localhost LinuxStudies]$ ls -l
total 17260
-rw-rw-r--. 1 NetProg NetProg 17674240 Feb 24 15:22 Archive.tar

! De-archive the tarball into its constituent files


[NetProg@localhost LinuxStudies]$ tar -xvf Archive.tar
BGP.txt
IPRoute.txt
QoS.txt
[NetProg@localhost LinuxStudies]$ ls -l
total 34520
-rw-rw-r--. 1 NetProg NetProg 17674240 Feb 24 15:22 Archive.tar
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 BGP.txt
-rw-rw-r--. 1 NetProg NetProg 43147 Feb 24 13:33 IPRoute.txt
-rw-r--r--. 1 NetProg NetProg 796 Feb 24 13:33 QoS.txt
[NetProg@localhost LinuxStudies]$

Notice that the size of the archive file is actually a little bigger than the sizes of the
constituent files added together. This is because archiving utilities do not compress files.
Moreover, the archive file contains extra metadata that is required for describing the
archive file contents, as well as metadata related to the archiving process.


Luckily, the tar command can be used with certain options to summon compression
utilities to compress the files before archiving them:

■■ To compress the files using the gzip utility before the tar archive is created, use
the syntax tar -zcvf {archive-file.tar.gz} {file1} {file2} .. {fileX}. To de-archive
the tarball and then decompress the constituent files, use the syntax tar -zxvf
{archive-file.tar.gz}.

■■ To compress the files using the bzip2 utility before the tar archive is created, use
the syntax tar -jcvf {archive-file.tar.bz2} {file1} {file2} .. {fileX}. To de-archive
the tarball and then decompress the constituent files, use the syntax tar -jxvf
{archive-file.tar.bz2}.

■■ To compress the files using the xz utility before the tar archive is created, use
the syntax tar -Jcvf {archive-file.tar.xz} {file1} {file2} .. {fileX}. To de-archive
the tarball and then decompress the constituent files, use the syntax tar -Jxvf
{archive-file.tar.xz}.
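As a quick sketch of the gzip variant, reusing the three files from Example 2-40:

! Compress with gzip while archiving, then extract and decompress
[NetProg@localhost LinuxStudies]$ tar -zcvf Archive.tar.gz BGP.txt IPRoute.txt QoS.txt
BGP.txt
IPRoute.txt
QoS.txt
[NetProg@localhost LinuxStudies]$ tar -zxvf Archive.tar.gz
BGP.txt
IPRoute.txt
QoS.txt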

Example 2-41 shows the tar command being used with the -J option, which summons
the xz utility to compress the files before tar archives these files. Then the same option is
used to decompress the files after they are extracted from the tar archive.

Example 2-41 Using the tar Utility with xz to Compress and Archive Three Files
into One

[NetProg@localhost LinuxStudies]$ ls -l
total 17260
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 BGP.txt
-rw-rw-r--. 1 NetProg NetProg 43147 Feb 24 13:33 IPRoute.txt
-rw-r--r--. 1 NetProg NetProg 796 Feb 24 13:33 QoS.txt

[NetProg@localhost LinuxStudies]$ tar -Jcvf Archive.tar.xz BGP.txt IPRoute.txt QoS.txt
BGP.txt
IPRoute.txt
QoS.txt
[NetProg@localhost LinuxStudies]$ ls -l
total 17980
-rw-rw-r--. 1 NetProg NetProg 735548 Feb 24 16:01 Archive.tar.xz
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 BGP.txt
-rw-rw-r--. 1 NetProg NetProg 43147 Feb 24 13:33 IPRoute.txt
-rw-r--r--. 1 NetProg NetProg 796 Feb 24 13:33 QoS.txt

[NetProg@localhost LinuxStudies]$ rm BGP.txt IPRoute.txt QoS.txt


[NetProg@localhost LinuxStudies]$ ls -l
total 720


-rw-rw-r--. 1 NetProg NetProg 735548 Feb 24 16:01 Archive.tar.xz

[NetProg@localhost LinuxStudies]$ tar -Jxvf Archive.tar.xz


BGP.txt
IPRoute.txt
QoS.txt
[NetProg@localhost LinuxStudies]$ ls -l
total 17980
-rw-rw-r--. 1 NetProg NetProg 735548 Feb 24 16:01 Archive.tar.xz
-rw-rw-r--. 1 NetProg NetProg 17622037 Feb 24 12:27 BGP.txt
-rw-rw-r--. 1 NetProg NetProg 43147 Feb 24 13:33 IPRoute.txt
-rw-r--r--. 1 NetProg NetProg 796 Feb 24 13:33 QoS.txt
[NetProg@localhost LinuxStudies]$

Notice that the size of the tar file is now smaller than the sizes of the constituent files
added together. This is due to the compression preceding the archiving. Notice also that
the archive file is named with the file extension .tar.xz. This is not mandatory, and the
command works just fine if the archive name has no extension. However, this extension
enables a user to identify the file as a tar archive that has been compressed using the xz
compression utility.

Linux System Maintenance


This section discusses general maintenance of a Linux system. In order to maintain a
healthy Linux system as well as troubleshoot and resolve system incidents, you need to
understand how to do the following:

■■ Manage jobs and processes

■■ Monitor utilization of CPU, memory, and other resources

■■ Collect system information

■■ Locate, read, and analyze system logs

The following sections discuss these points in some depth. For further details, you can
consult the man or info pages for each command.

Job, Process, and Service Management


In operating systems jargon, a thread is a sequence of instructions that are executed by
the CPU. A thread is a basic building block and cannot be broken up into smaller
components to be executed simultaneously.

A process is a group of threads. Two or more of those threads may be executed
simultaneously in a multithreaded system in order to run a process faster. A process has
its own address space in memory. Linux virtualizes memory such that each process thinks
that it has exclusive access to all the physical memory on the system, even though it
actually has access only to its own process address space. Utilities such as ls, cp, and cat
run as processes.

A job may be composed of two or more processes. For example, running the command
ls starts a process, while piping ls to less using the command ls | less starts a job
composed of more than one process.

A service is composed of one or more processes and provides a specific function;
examples include the HTTP, NTP, and SSH services. A service usually runs in the
background and is therefore referred to as a daemon. Service names in Linux almost
always end with the letter d. Services are briefly introduced earlier in this chapter.

As you progress through this section, the differences between processes, jobs, and
services will become more apparent.

The command ps lists the processes currently running on the system. Without any
options or arguments, the command lists the running processes associated with the
current user and terminal, as shown in Example 2-42.

Example 2-42 ps Command Output

[NetProg@localhost ~]$ ps
PID TTY TIME CMD
2897 pts/0 00:00:00 bash
2954 pts/0 00:00:00 ps
[NetProg@localhost ~]$

For each process, the ps command output lists the following fields:

■■ PID: This is the process ID, which is a number that uniquely identifies each process.

■■ TTY: This is the terminal number from which the process was started. pts/0 in the
output stands for pseudo-terminal slave 0. The first terminal window you open will
be pts/0, the second pts/1, and so forth.

Note Use of the terms “master” and “slave” is ONLY in association with the official
terminology used in industry specifications and standards, and in no way diminishes
Pearson’s commitment to promoting diversity, equity, and inclusion, and challenging,
countering and/or combating bias and stereotyping in the global population of the learners
we serve.

■■ TIME: This is the total amount of time the process spent consuming the CPU
throughout the duration of its lifetime.

■■ CMD: This is the process name.

For more detailed output, several options can be added to the ps command. Adding the
-A or -e option lists all processes running on the system, for all users and all TTY lines,
as shown in Example 2-43. The output is in the same format as the vanilla ps command
output. In order to compare the output from both commands, you can pipe the output
to wc -l. The command wc stands for word count, and when used with the -l option, the
command returns the number of lines in its input (in this case, the output
of the ps command). As you can see, both commands return 189 lines of output. The
purpose of this example is twofold: to display the output of the ps command using both
options and to introduce the very handy command wc -l.

Example 2-43 ps -e and ps -A Commands

[NetProg@localhost ~]$ ps -e
PID TTY TIME CMD
1 ? 00:00:02 systemd
2 ? 00:00:00 kthreadd
3 ? 00:00:00 ksoftirqd/0
5 ? 00:00:00 kworker/0:0H
7 ? 00:00:00 migration/0
8 ? 00:00:00 rcu_bh
9 ? 00:00:01 rcu_sched
10 ? 00:00:00 watchdog/0
12 ? 00:00:00 kdevtmpfs

--------- OUTPUT TRUNCATED FOR BREVITY ---------

[NetProg@localhost ~]$ ps -A
PID TTY TIME CMD
1 ? 00:00:02 systemd
2 ? 00:00:00 kthreadd
3 ? 00:00:00 ksoftirqd/0
5 ? 00:00:00 kworker/0:0H
7 ? 00:00:00 migration/0
8 ? 00:00:00 rcu_bh
9 ? 00:00:01 rcu_sched
10 ? 00:00:00 watchdog/0
12 ? 00:00:00 kdevtmpfs

--------- OUTPUT TRUNCATED FOR BREVITY ---------

[NetProg@localhost ~]$ ps -e | wc -l
189
[NetProg@localhost ~]$ ps -A | wc -l
189
[NetProg@localhost ~]$

Notice in Example 2-43 that the TTY field shows a question mark (?) throughout the
output. This indicates that these processes are not associated with a terminal window,
referred to as a controlling terminal.

The command ps -u lists all the processes owned by the current user and adds to the
information displayed for each process. To display the processes for any other user, you
use the syntax ps -u {username}. Example 2-44 shows the output of ps -u, which lists the
processes owned by the user NetProg (the current user).

Example 2-44 ps -u Command Output

[NetProg@localhost ~]$ ps -u
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
NetProg 4312 0.0 0.0 116564 3280 pts/1 Ss 12:22 0:00 bash
NetProg 4517 0.0 0.0 116564 3280 pts/2 Ss 12:33 0:00 bash
NetProg 5609 0.0 0.0 119552 2284 pts/2 S+ 13:59 0:00 man ps
NetProg 5620 0.0 0.0 110260 944 pts/2 S+ 13:59 0:00 less -s
NetProg 7990 0.0 0.0 119552 2284 pts/1 S+ 17:40 0:00 man ps
NetProg 8001 0.0 0.0 110260 948 pts/1 S+ 17:40 0:00 less -s
NetProg 8827 0.0 0.0 116564 3288 pts/0 Ss 18:28 0:00 bash
NetProg 10108 0.0 0.0 151064 1792 pts/0 R+ 19:49 0:00 ps –u
[NetProg@localhost ~]$

Notice that the output of ps -u adds seven more fields to the output:

■■ User: The user ID of the process owner

■■ %CPU: The CPU time the process used divided by the process runtime (process
lifetime), expressed as a percentage

■■ %MEM: The ratio of the main memory used by the process (resident set size) to the
total amount of main memory on the system, expressed as a percentage

■■ VSZ: Virtual memory size, the amount of virtual memory used by the process,
expressed in kilobytes

■■ RSS: Resident set size, the amount of main memory (RAM) used by the process,
expressed in kilobytes

■■ STAT: The state of the process

■■ START: The start time of the process

The process STAT field contains one or more of the 14 characters describing the state of the
process. For example, state S indicates that the process is in the sleep state (that is,
waiting for an event to happen in order to resume running, such as waiting for input from the
user). The + indicates that the process is running in the foreground rather than running in
the background.
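
If you want to examine just the state information for selected processes, the -o option of
ps lets you choose the output columns. The following is a minimal sketch; the column
keywords pid, stat, and comm are standard in the procps implementation of ps, and PID
3173 is used purely as an example:

# Show only the PID, process state, and command name
ps -o pid,stat,comm

# The same columns for one specific process
ps -o pid,stat,comm -p 3173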

Processes are grouped into process groups, and one or more process groups make up a
session. All the processes in one pipeline, such as cat, sort, and tail in cat file.txt | sort |
tail -n 10, are in the same process group and have the same process group ID (PGID). The
process whose PID is the same as its PGID is the process group leader, and it is the first
member of the process group. On the other hand, all process groups started by a shell
are in the same session, and they have the same session ID (SID). The process whose PID
is the same as its SID is the session leader. In Example 2-44, as expected, the three shell
processes (bash) are session leaders, as indicated by the s in their STAT field. To check the
PID, PGID, and SID of a process all at once, issue the command ps -j.
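
As a quick sketch of these relationships, you can start a short pipeline in the background
and then inspect the IDs with ps -j; the PIDs shown will of course differ on your system:

# Start a two-process pipeline in the background
sleep 300 | sleep 300 &

# PID, PGID, and SID columns: both sleep processes share one PGID,
# and every process group started from this shell shares the shell's SID
ps -j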

To list all processes that have a specific name, use the -C option. In Example 2-45, ps -C
bash lists all processes that are named bash.

Example 2-45 ps -C bash Command Output

[NetProg@localhost ~]$ ps -C bash


PID TTY TIME CMD
4312 pts/1 00:00:00 bash
4517 pts/2 00:00:00 bash
8827 pts/0 00:00:00 bash
[NetProg@localhost ~]$

Finally, to see the parent process ID (PPID) of a process, you can use the option -f.
As the name implies, the parent process is the process that started this process. In
Example 2-46, the command ps -ef | head -n 10 is used to display the first 10 lines of the
process list (the header plus the first nine processes), along with the PPID of each process.

Example 2-46 ps -ef | head -n 10 Command Output

[NetProg@localhost ~]$ ps -ef | head -n 10


UID PID PPID C STIME TTY TIME CMD
root 1 0 0 09:29 ? 00:00:02 /usr/lib/systemd/systemd --switched-
root --system --deserialize 21
root 2 0 0 09:29 ? 00:00:00 [kthreadd]
root 3 2 0 09:29 ? 00:00:00 [ksoftirqd/0]
root 5 2 0 09:29 ? 00:00:00 [kworker/0:0H]
root 7 2 0 09:29 ? 00:00:00 [migration/0]
root 8 2 0 09:29 ? 00:00:00 [rcu_bh]
root 9 2 0 09:29 ? 00:00:00 [rcu_sched]
root 10 2 0 09:29 ? 00:00:00 [watchdog/0]
root 12 2 0 09:29 ? 00:00:00 [kdevtmpfs]
[NetProg@localhost ~]$

Note that the process with PID 0 is the kernel, and the process with PID 1 is the systemd
process, or the init process in some systems (recall the Linux boot process?). Knowing
this, the output in Example 2-46 should make more sense.
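
If you are ever unsure whether a system runs systemd or a legacy init as PID 1, one quick
check is to print the command name of PID 1; the trailing = in -o comm= suppresses the
column header:

# Print only the command name of the process with PID 1
ps -p 1 -o comm=
# Typically prints "systemd" on systemd-based distributions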

As you have seen from the output of ps -e | wc -l in Example 2-43, the list of running
processes can be very long. While the output can be piped to grep in order to list
specific lines of the command output, the use of the command pgrep may be a little
more intuitive. Example 2-47 shows the output of the command pgrep -u NetProg -l
bash, showing all processes owned by user NetProg and named bash.

Example 2-47 pgrep -u NetProg -l bash Command Output

[NetProg@localhost ~]$ pgrep -u NetProg -l bash


3173 bash
5815 bash
6561 bash
6667 bash
6730 bash
[NetProg@localhost ~]$

You can pause, resume, and terminate processes by using the kill command. The kill
command sends 1 of 64 signals to a process or process group. This signal may be a SIGTERM
signal to request the process to terminate gracefully, or it may be a SIGKILL signal to force
the termination of the process. The signals SIGSTOP and SIGCONT are also used to pause
and resume a process, respectively. To view all the available signals, you can issue the
command kill -l, as shown in Example 2-48. The default signal SIGTERM is used if no
signal is explicitly specified in the command. The command kill may be used with the
signal numeric value or signal name.

Example 2-48 Using kill -l to List Signals Used with the kill Command

[NetProg@localhost ~]$ kill -l


1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP
6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1
11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM
16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR
31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3
38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8
43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7
58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2
63) SIGRTMAX-1 64) SIGRTMAX
[NetProg@localhost ~]$
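
Since each signal can be referenced by number or by name, the following invocations are
all equivalent ways of sending SIGKILL (signal 9) to a hypothetical process with PID 3173:

# By full signal name
kill -SIGKILL 3173
# By short name (the SIG prefix may be omitted)
kill -KILL 3173
# By numeric value
kill -9 3173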

Example 2-49 shows how to pause, resume, and kill the process bash with process
ID 3173. The -p option is used with the ps command to list a specific process using
its PID.

Example 2-49 Pausing, Resuming, and Killing the Process bash Using the kill
Command

[NetProg@localhost ~]$ ps -C bash


PID TTY TIME CMD
3173 pts/1 00:00:00 bash
5815 ? 00:00:00 bash
8501 pts/3 00:00:00 bash
8980 pts/4 00:00:00 bash
9233 pts/2 00:00:00 bash
[NetProg@localhost ~]$ ps -up 3173
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
NetProg 3173 0.0 0.0 116696 3440 pts/1 Ss 11:41 0:00 bash
[NetProg@localhost ~]$ kill -SIGSTOP 3173
[NetProg@localhost ~]$ ps -up 3173
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
NetProg 3173 0.0 0.0 116696 3440 pts/1 Ts 11:41 0:00 bash
[NetProg@localhost ~]$ kill -SIGCONT 3173
[NetProg@localhost ~]$ ps -up 3173
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
NetProg 3173 0.0 0.0 116696 3440 pts/1 Ss 11:41 0:00 bash
[NetProg@localhost ~]$ kill -SIGTERM 3173
[NetProg@localhost ~]$ ps -up 3173
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
NetProg 3173 0.0 0.0 116696 3440 pts/1 Ss 11:41 0:00 bash
[NetProg@localhost ~]$ kill -SIGKILL 3173
[NetProg@localhost ~]$ ps -up 3173
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
[NetProg@localhost ~]$ ps -C bash
PID TTY TIME CMD
5815 ? 00:00:00 bash
8501 pts/3 00:00:00 bash
8980 pts/4 00:00:00 bash
9233 pts/2 00:00:00 bash
[NetProg@localhost ~]$

As you can see from Example 2-49, when process bash with PID 3173 receives the
SIGSTOP signal, its state changes from Ss (interruptible sleep, indicated by S, and session
leader, indicated by s) to Ts (stopped by job control signal, indicated by T, and session
leader, indicated by s).

The process returns to the Ss state when it receives the SIGCONT signal. When the
SIGTERM signal is then used in an attempt to terminate the process, its state does not
change; therefore, SIGKILL is used, and it successfully forces the process to terminate. It
should be noted, however, that it is generally not recommended to terminate a process by
using the SIGKILL signal unless the process is suspected to be malicious or is not properly
responding.

Jobs, on the other hand, can be displayed by using the jobs command. The jobs
command lists all jobs run by the current shell. In Example 2-50, a simple while loop is used
to create a job that runs indefinitely. (Loops and control structures in Bash are covered
in Chapter 4.) In addition, you can enter the command gedit to start the text editor
program. An & is added at the end of both command lines shown in Example 2-50. This
instructs the shell to run both jobs in the background, so the running process will not
hog the shell prompt, and the prompt will be available for you to enter other commands.
A third job is created by running the ping command to google.com in the foreground.
The ping command is then stopped (paused) by using the Ctrl+z key combination. The
command jobs then lists all three jobs.

Example 2-50 Using the jobs Command to Display Job Status

[NetProg@localhost ~]$ jobs


[NetProg@localhost ~]$ i=0; while true; do ((i++)); sleep 5; done &
[1] 19332
[NetProg@localhost ~]$ gedit &
[2] 19347
[NetProg@localhost ~]$ ping google.com
PING google.com (172.217.18.46) 56(84) bytes of data.
64 bytes from ham02s12-in-f46.1e100.net (172.217.18.46): icmp_seq=1 ttl=63 time=220 ms
64 bytes from ham02s12-in-f46.1e100.net (172.217.18.46): icmp_seq=2 ttl=63 time=280 ms
^Z
[3]+ Stopped ping google.com
[NetProg@localhost ~]$ jobs
[1] Running while true; do ((i++)); sleep 5; done &
[2]- Running gedit &
[3]+ Stopped ping google.com
[NetProg@localhost ~]$

The number in brackets in Example 2-50 is the job number, and the number after that is
the PID. The jobs in the example are numbered 1 to 3. The first two jobs are in Running
state, and the ping job is in the Stopped state. To list the running jobs only, you use the
jobs -r command, and to display the stopped jobs only, you use the jobs -s command.

To resume a stopped job, you bring it to the foreground by using the command
fg {job_number}. To send it to the background again, you use the command bg
{job_number}. When the job is running in the foreground, you can terminate it by using the
Ctrl+c key combination. Example 2-51 shows a ping job brought to the foreground
and terminated.

Example 2-51 Bringing a ping Job to the Foreground and Stopping It

[NetProg@localhost ~]$ jobs


[1] Running while true; do ((i++)); sleep 5; done &
[2]- Running gedit &
[3]+ Stopped ping google.com
[NetProg@localhost ~]$ fg 3
ping google.com
64 bytes from ham02s12-in-f46.1e100.net (172.217.18.46): icmp_seq=3 ttl=63 time=5463 ms
64 bytes from ham02s12-in-f46.1e100.net (172.217.18.46): icmp_seq=4 ttl=63 time=5261 ms
64 bytes from ham02s12-in-f46.1e100.net (172.217.18.46): icmp_seq=5 ttl=63 time=4698 ms
64 bytes from ham02s12-in-f46.1e100.net (172.217.18.46): icmp_seq=6 ttl=63 time=5178 ms
^C
--- google.com ping statistics ---
11 packets transmitted, 6 received, 45% packet loss, time 633961ms
rtt min/avg/max/mdev = 220.614/3517.335/5463.953/2321.238 ms, pipe 6
[NetProg@localhost ~]$

If a job is running in the background and you want to stop it without bringing it to the
foreground first, you use the kill command in exactly the same way you use it with
processes. Example 2-52 shows the kill command being used to terminate the two running
jobs. Notice that the -l option is used with the jobs command to add a PID column to the
output.

Example 2-52 Terminating a Job Using the kill Command

[NetProg@localhost ~]$ jobs -l


[1]- 19332 Running while true; do ((i++)); sleep 5; done &
[2]+ 19347 Running gedit &
[NetProg@localhost ~]$ kill 19332
[1]- Terminated while true; do ((i++)); sleep 5; done
[NetProg@localhost ~]$ jobs
[2]+ Running gedit &
[NetProg@localhost ~]$ kill 19347
[2]+ Terminated gedit
[NetProg@localhost ~]$ jobs
[NetProg@localhost ~]$

To view service status, start and stop services, and carry out other service-related
operations, you can use the command systemctl. The general syntax of the command
is systemctl {options} {service_name}. These are the most common options for this
command:

■■ status: Displays the status of the service

■■ start: Starts the service

■■ stop: Stops the service

■■ enable: Enables the service so that it is automatically started at system startup

■■ disable: Disables the service so that it is not automatically started at system startup

Example 2-53 shows how to check the status of the httpd service, enable it, start it, and
then stop it.

Example 2-53 Using the systemctl Command to View and Change the Status of the
httpd Service

[NetProg@localhost ~]$ systemctl status httpd


● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset:
disabled)
Active: inactive (dead)
Docs: man:httpd(8)
man:apachectl(8)
[NetProg@localhost ~]$ sudo systemctl enable httpd
[sudo] password for NetProg:
Created symlink from /etc/systemd/system/multi-user.target.wants/httpd.service to /
usr/lib/systemd/system/httpd.service.
[NetProg@localhost ~]$ systemctl status httpd
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset:
disabled)
Active: inactive (dead)
Docs: man:httpd(8)
man:apachectl(8)
[NetProg@localhost ~]$ sudo systemctl start httpd
[NetProg@localhost ~]$ systemctl status httpd
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset:
disabled)
Active: active (running) since Sun 2018-04-08 17:35:40 +03; 3s ago
Docs: man:httpd(8)
man:apachectl(8)
Main PID: 2921 (httpd)
Status: "Processing requests..."
CGroup: /system.slice/httpd.service
├─2921 /usr/sbin/httpd -DFOREGROUND
├─2925 /usr/sbin/httpd -DFOREGROUND
├─2926 /usr/sbin/httpd -DFOREGROUND
├─2927 /usr/sbin/httpd -DFOREGROUND
├─2928 /usr/sbin/httpd -DFOREGROUND
└─2929 /usr/sbin/httpd -DFOREGROUND

Apr 08 17:35:40 localhost.localdomain systemd[1]: Starting The Apache HTTP Ser....


Apr 08 17:35:40 localhost.localdomain httpd[2921]: AH00558: httpd: Could not r...e
Apr 08 17:35:40 localhost.localdomain systemd[1]: Started The Apache HTTP Server.
Hint: Some lines were ellipsized, use -l to show in full.
[NetProg@localhost ~]$ sudo systemctl stop httpd
[sudo] password for NetProg:

[NetProg@localhost ~]$ systemctl status httpd


● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset:
disabled)
Active: inactive (dead) since Sun 2018-04-08 17:41:14 +03; 8s ago
Docs: man:httpd(8)
man:apachectl(8)
Process: 3132 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
Process: 2921 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited,
status=0/SUCCESS)
Main PID: 2921 (code=exited, status=0/SUCCESS)
Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"

Apr 08 17:35:40 localhost.localdomain systemd[1]: Starting The Apache HTTP Ser....


Apr 08 17:35:40 localhost.localdomain httpd[2921]: AH00558: httpd: Could not r...e
Apr 08 17:35:40 localhost.localdomain systemd[1]: Started The Apache HTTP Server.
Apr 08 17:41:13 localhost.localdomain systemd[1]: Stopping The Apache HTTP Ser....
Apr 08 17:41:14 localhost.localdomain systemd[1]: Stopped The Apache HTTP Server.
Hint: Some lines were ellipsized, use -l to show in full.
[NetProg@localhost ~]$

In Example 2-53, the httpd service is initially both inactive and disabled. When the enable option
is used, the httpd service changes its state to enabled, which means the service will be
automatically started when the system is booted. However, the service is still inactive;
that is, it is not currently running. When you use the start option, the httpd service
becomes active. Finally, the service is stopped using the stop option. Note that starting
and stopping the service is independent of the service's enabled/disabled status. The
former describes the current status of the service, while the latter describes whether the
service should be started automatically at system startup time.
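
Because these two statuses are independent, systemctl also provides subcommands to
query each one separately. The following brief sketch uses the httpd service from
Example 2-53:

# Is the service currently running? Typically prints "active" or "inactive"
systemctl is-active httpd

# Will the service be started automatically at boot? Prints "enabled" or "disabled"
systemctl is-enabled httpd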

Resource Utilization
Resource utilization, at a very basic level, refers to CPU, memory, and storage utilization.
While checking disk space on a system tends to be a straightforward process,
checking the CPU and memory utilization can be quite challenging if you don't know exactly
what tools to use. The single most important Linux command to use to check resource
utilization is top.
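
For reference, the straightforward disk check mentioned above is typically done with df,
and a quick memory summary is available from free; a minimal sketch (-h prints
human-readable sizes in both commands):

# Disk space per mounted file system, in human-readable units
df -h

# Total, used, and free memory and swap, in human-readable units
free -h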

Example 2-54 shows the output of the top command. The list of processes is live—that
is, updated in real time as the output is being viewed. The output is also limited by
the shell window size. The bigger the window, the longer the list of processes that you
can view.

Example 2-54 Output of the top Command

top - 23:52:06 up 3 min, 3 users, load average: 0.23, 0.34, 0.16


Tasks: 205 total, 1 running, 204 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.9 us, 0.6 sy, 0.0 ni, 98.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 8010152 total, 6713388 free, 783364 used, 513400 buff/cache
KiB Swap: 5242876 total, 5242876 free, 0 used. 6940600 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND


2051 NetProg 20 0 2520892 226276 47980 S 4.3 2.8 0:17.71 gnome-shell
1183 root 20 0 354540 55024 10848 S 2.6 0.7 0:05.44 X
2675 NetProg 20 0 723988 25624 15312 S 1.3 0.3 0:02.18 gnome-terminal-
2788 NetProg 20 0 157860 2368 1532 R 1.0 0.0 0:01.06 top
2789 NetProg 20 0 157944 2316 1532 S 1.0 0.0 0:01.00 top
1992 NetProg 20 0 214904 1312 880 S 0.3 0.0 0:00.41 VBoxClient
1 root 20 0 193708 6844 4068 S 0.0 0.1 0:01.85 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.02 ksoftirqd/0
4 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0

--------- OUTPUT TRUNCATED FOR BREVITY ---------

The following is a list of keys that, when pressed, change the formatting of the output
while the top command is running:

■■ m: Pressing this key shows memory usage, as a percentage.

■■ t: Pressing this key shows CPU usage, as a percentage.

■■ 1: Pressing this key shows all processors on the system.

■■ Shift+m: Pressing this key combination sorts processes by memory usage, in
descending order.

■■ Shift+p: Pressing this key combination sorts processes by CPU usage, in descending
order.

■■ Shift+r: Pressing this key combination sorts processes by PID.

■■ k-{PID}-{Signal_No|Signal_Name}: Pressing k starts a dialog above the first column
of the process list. This dialog requests a PID. After you type a PID and press Enter,
it requests the signal you want to send to that process. This can be used to send any
of the 64 signals to any of the processes on the system.

Example 2-55 shows the output you get when you use the top command and press 1
followed by t. As you can see, all four processors on the system are listed, with the
percentage utilization of each.

Example 2-55 Output of the top Command Showing Each of the Four CPUs Being
Used

top - 00:06:33 up 17 min, 4 users, load average: 0.11, 0.08, 0.10


Tasks: 201 total, 2 running, 198 sleeping, 1 stopped, 0 zombie
%Cpu0 : 6.8/2.3 9[||||| ]
%Cpu1 : 8.2/0.0 8[|||| ]
%Cpu2 : 4.2/2.1 6[||| ]
%Cpu3 : 1.9/0.0 2[| ]
KiB Mem : 8010152 total, 6692660 free, 803756 used, 513736 buff/cache
KiB Swap: 5242876 total, 5242876 free, 0 used. 6920176 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND


2051 NetProg 20 0 2521404 226756 47984 S 31.4 2.8 0:44.63 /usr/bin/gn+
1183 root 20 0 359576 59968 10900 S 9.8 0.7 0:15.58 /usr/bin/X +
2675 NetProg 20 0 725372 27116 15320 D 3.9 0.3 0:07.04 /usr/libexe+
692 root 20 0 6472 652 540 S 2.0 0.0 0:00.16 /sbin/rngd +
1992 NetProg 20 0 214904 1312 880 S 2.0 0.0 0:02.36 /usr/bin/VB+
2212 NetProg 20 0 1520856 28752 17476 S 2.0 0.4 0:01.20 /usr/libexe+
2788 NetProg 20 0 157888 2404 1564 R 2.0 0.0 0:06.34 top
1 root 20 0 193708 6844 4068 S 0.0 0.1 0:02.05 /usr/lib/sy+

--------- OUTPUT TRUNCATED FOR BREVITY ---------

When troubleshooting an incident, it is sometimes useful to have top run with a refresh
rate that is faster than the default. The command top -d {N} runs top and refreshes the
output every N seconds. N does not have to be an integer; it can be a fraction of
a second.
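
For example, the following runs top with a half-second refresh. Combined with the -b
(batch mode) and -n (number of iterations) options, this is also a convenient way to
capture a few rapid samples to a file during an incident; snapshot.txt is an arbitrary file
name used here for illustration:

# Refresh the interactive display every half second
top -d 0.5

# Capture three snapshots, half a second apart, to a file
top -b -d 0.5 -n 3 > snapshot.txt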

System Information
Linux provides several ways to collect information describing the hardware and soft-
ware of the system it is running on, as well as set and change this information, where
­applicable.

The date command, as shown in Example 2-56, displays the date and time configured on
the system. Adding the -R option to the command displays the same information in RFC
2822 format.

Example 2-56 Output of the date Command

[NetProg@localhost ~]$ date


Sun Apr 8 01:38:48 +03 2018
[NetProg@localhost ~]$ date -R
Sun, 08 Apr 2018 01:38:53 +0300
[NetProg@localhost ~]$

From left to right, the first command output in Example 2-56 displays the following
information:

■■ Day of the week

■■ Month

■■ Day

■■ Time, in hh:mm:ss format

■■ Time zone

■■ Year

Another command you can use to view and set the system time and date is the
timedatectl command, shown in Example 2-57.

Example 2-57 Output of the timedatectl Command

[NetProg@localhost ~]$ timedatectl


Local time: Sun 2018-04-08 11:40:50 +03
Universal time: Sun 2018-04-08 08:40:50 UTC
RTC time: Sun 2018-04-08 08:40:48
Time zone: Asia/Riyadh (+03, +0300)
NTP enabled: no
NTP synchronized: no
RTC in local TZ: no
DST active: n/a
[NetProg@localhost ~]$

The uptime command displays how long the system has been running as well as CPU
load average. Example 2-58 shows the output from the uptime command.

Example 2-58 Output of the uptime Command

[NetProg@localhost ~]$ uptime


22:39:25 up 41 min, 2 users, load average: 0.02, 0.06, 0.11
[NetProg@localhost ~]$

The uptime command output in Example 2-58 displays the following information:

■■ The system time when the command was issued (in this example, 10:39:25 p.m.)

■■ How long the system has been up (in this case 41 minutes)

■■ How many users are logged in (in this case 2 users)

■■ The load average over the past 1 minute, 5 minutes, and 15 minutes

The load average is an indication of the average system utilization over a specific
duration. The load average factors in all processes that are either using the CPU or waiting
to use the CPU (runnable state), as well as processes waiting for I/O access, such as disk
access (uninterruptible state). If one process is in either of these states for a duration of
1 minute, then the load average over 1 minute is 1 for a single-processor system.

The output of the uptime command shows the load average over the past 1, 5, and 15
minutes. A 0.2 in the first load average field of the output of the uptime command
indicates an average load of 20% over the past 1 minute if the system has a single processor.
For multiprocessor systems, the load average should be divided by the number of
processors in the system. Therefore, a 0.2 value in a system with four processors means a load
average of 5%. A value of 1 in a four-processor system indicates a load average of 25%.
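
Putting this arithmetic into practice, the following sketch reads the 1-minute load average
from /proc/loadavg and divides it by the processor count reported by nproc; awk and
nproc are assumed to be available, as they are on virtually all modern distributions:

# Normalize the 1-minute load average by the number of processors
awk -v cpus="$(nproc)" '{ printf "1-min load per CPU: %.0f%%\n", $1 / cpus * 100 }' /proc/loadavg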

When you have multiple processors on a system, you can view detailed processor
information by viewing the contents of the file /proc/cpuinfo. Example 2-59 displays part of
the contents of this file. The cpuinfo file lists each processor and details for each of them.
Processors are numbered 0 to the number of processors minus 1. A quick way to display
the number of processors on a system is to use the command cat /proc/cpuinfo | grep
processor | wc -l.
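
Equivalently, grep can count matching lines itself with the -c option, and the nproc
utility from GNU coreutils reports the processor count directly:

# Count "processor" entries without piping to wc
grep -c '^processor' /proc/cpuinfo

# Report the number of processing units available to the current process
nproc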

Example 2-59 CPU Information from the /proc/cpuinfo File

[root@localhost ~]# cat /proc/cpuinfo


processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
stepping : 9
cpu MHz : 2837.118
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
