Shell Samurai
Master the Linux Command Line
Stetson Blake
Cincinnati, Ohio
Table of Contents
Preface 1
Introduction and Thank You 8
Acknowledgements 4
Who Am I and Why Am I Worth Listening to? 9
Why You Should Learn Linux 9
What is a Shell Anyway? 10
Why You Should Use the Command-Line 11
Who This Book is For 12
What You Should Know Already 12
What is a Shell Samurai? 14
How this book is structured 7
Syntax Used In This Book 6
Book Changelog 2
Need Help? 3
1. Setting Up a Lab Environment 15
Lab Setup Intro 15
Install Linux on an old Laptop or Desktop computer 15
Use Windows Subsystem for Linux or WSL 16
Use Virtualization Software 17
Dual Boot 17
Use a Server from A Cloud Host 17
Using Cloud Hosting Setup Tutorial 19
2. Linux Fundamentals 31
What is Linux? 31
Linux Distributions 33
What is The Kernel? 34
Kernel Space vs User Space 35
Interrupts 35
What is a System Call? 36
Signals 37
Everything is a file? Always has been 39
Linux Directory Structure 41
An Overview of Filesystems 42
Devices 43
What is a daemon? 45
Boot Process and Init, systemd 45
What's a TTY? 51
inodes 51
3. White Belt Command Line Skills - The basics, file manipulation, etc 55
Intro to the Shell - pwd, cd and basic ls 55
more ls, mkdir, cat, touch, copy and move 61
Users, Groups, Root and Sudo 68
File Permissions 74
Flying Around the Command Line with Shortcuts 78
file 80
rm — Proceed with extreme caution 80
Helping Yourself with help, man and tldr 83
Installing Packages and Programs 88
Compressing Files and Uncompressing Them 91
The Environment 95
whereis and which 96
Shutting the System Down 97
Aliases - Not just for Criminals, Spies and Detectives 97
4. Black Belt Command Line Skills - Output Redirection, Text and More 99
Black Belt Intro 99
Compiling Code from Source 99
Managing Processes 102
stdout, stdin and stderr 110
Symlinks and Hardlinks 115
Chaining Commands in the Shell with ; and && Operators 117
Text Processing 119
Spying on syscalls with strace 142
Finding Files with Find 143
Swap Space 145
Scheduling tasks with cron 146
Secure Shell or SSH 149
CURL - Not just for Olympians and Hair Stylists 153
jq - Processing JSON 158
Tmux - Secret Background Windows 161
lsof 165
The /proc directory 169
Troubleshooting Low Disk 171
Logs - Not Just for Lumberjacks 174
Log Rotate - Don’t let the disk fill up! 175
5. The Network 179
DNS - The Yellow Pages of the Internet 188
Transferring files with rsync and scp 193
6. Real World Samurai Skills and Interview Questions 197
Outro 226
Preface
Book Changelog
This book is a living document. Over time, I’ll add sections to it and make it
even better. This section serves as a log of the changes that were made over time.
If you bought a copy, you’ll have access to all future updates.
Need Help?
As a purchaser of this e-book, I invite you to join our Discord Server where
you can connect with fellow Shell Samurai alumni. I’ll also hang out there from
time-to-time and try to provide help as I can:
https://discord.gg/vBv8VCS63Y
Acknowledgements
It seems that no book is complete without gratitude for those who helped to
shape its contents and the author’s life. With that said, I’d like to give some
gratitude of my own, without sounding too sappy. I consider each person
mentioned below a friend and someone who helped me achieve my goals or
impacted my career in one way or another.
I want to first thank YOU, the reader, for taking a chance on an e-book by a
self-published author who has never put out a massive book like this before. You
took a risk and I’m doing my best to make sure it pays off for you! You the real
MVP.
I want to thank Ryan Kulp for inspiring me to ship a product, own my
direction in life and go after my dreams.
I want to thank Reilly Chase for his feedback on all the projects and business
ideas I’ve run by him over the years.
I’d like to thank James Keating for being an excellent manager and developer
of people.
I want to thank Aaron Shultz for giving me confidence in my skills and
abilities.
I want to thank Ryan Chewning for introducing me to VIM many years ago
and making me think about DNS as the Yellow Pages of the internet. Ryan is one
of the first Linux wizards I met.
Thank you to Dave Mezera for building an excellent business and local ISP
and taking a chance on hiring me as an engineer.
I want to thank JP Harvey for being a tough boss but forcing me to grow and
take ownership.
I want to thank Brad Kramer for his leadership and operational excellence in
eliminating toilsome processes for engineering teams.
I want to thank Edward Nevard for his sharp mind and detailed eye in
editing and helping grow our Facebook group, This is an IT Support Group.
Thank you to Jordan Andrews for his speedy graphic design work and keen
eye.
I want to thank John Williams for his friendship and support.
I want to thank Andrew Baker for introducing me to new software projects
like Node.js before I even wrote code.
I want to thank my parents for pushing me to look into this computer thing as
a career.
I want to thank Chuntel Murawski for supporting me in my goals.
And thank you to all the people I’ve forgotten to mention, because I’m sure
there’s a few!
Syntax Used In This Book
Throughout the book, we’ll give “code” samples for you to read through and
try yourself. Code samples will be formatted in a monospace font. Here’s an
example:
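For instance, a command you might type at the shell, shown in that style (the text being echoed here is just a placeholder):

```shell
# Print a short greeting to the terminal
echo "Hello, Shell Samurai"    # prints: Hello, Shell Samurai
```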
Hot Tips
Occasionally, you’ll see a Hot Tips section in the middle of a page. This
indicates a tip pertaining to the topic that I’m throwing in as an extra.
Suggestions
If you find that any portion of this book is not formatted well, please reach out
to me and I will fix it! I’ve done my best to make the book as readable and
accessible as possible in PDF and ePub formats.
How this book is structured
This book is structured into chapters which are meant to ease you into Linux
and then quickly ramp up with using Linux and learning commands to
administer systems.
Here’s a brief overview; be sure to also check out the Table of Contents to find
specific sections.
Preface
This chapter! We cover who I am, why you should learn Linux and what skills
you should have already to read this book.
Linux Fundamentals
This chapter introduces the basics of how Linux operates. We’ll dive into the
kernel, directory structures, how the system boots and more! We recommend
coming back to this chapter later as a reference. This section is mostly theory and
doesn’t focus on too many commands.
Black Belt Command Line Skills
This chapter builds upon the previous one and goes more in-depth on
advanced commands and concepts you’ll use in your Shell Samurai journey.
You’ll learn about compiling code from source, standard streams, managing
processes, symlinks, regex and more.
The Network
We’ll briefly cover some networking concepts, then dive right into Linux
network commands including capturing packets, working with DNS and
transferring files over the network.
Outro
This chapter provides some reference materials and links along with
suggestions on where to take your Linux journey next.
Who Am I and Why Am I Worth Listening
to?
phones to super-computers, fridges, cars and satellites, Linux is running
everywhere. In 2017, it was estimated that 90% of the public cloud was running
Linux. Every time you use the internet, you’re using Linux.
By mastering Linux, you will gain a deeper understanding of how computers
work in general and how to build career skills that will serve you. Learning
Linux is an investment in your career and your future.
To prove my point, here’s a photo I took of Linux running (or…not running
correctly) on my plane’s entertainment system. Pretty nifty, huh? Also kinda
scary to see ANYTHING not working right on a plane, even if it’s just the thing
that lets me watch movies.
layer that wraps the operating system and lets us talk to it. The Linux Shell is a
program that provides a Command-Line Interface (CLI) to perform various tasks
like running programs, configuring the network and adding users. In contrast
with a GUI (Graphical User Interface), a CLI is text only, baby!
Finally, it’s just plain fun! Learning how things work is fun. Typing away at a
shell in a dark room can make you feel like a super-hacker. Unlocking new
abilities and skills is fun too. I imagine you work with computers because you
think in a similar way. There’s much foundational knowledge we’ll begin to
build about Linux, so let’s get started!
This book will not cover every single little detail of Linux. You will have to do
some homework on your own. Things will go wrong sometimes. You will have to
Google and make decisions on your own. Commands might not give you the
expected output you thought they might. That’s how the real world is! Although,
if you bought this book, you have access to our Discord server where you can ask
for some help and you CAN feel free to reach out to me and I’ll do my best to
help you! (After you’ve tried Google first.)
Finally, hats off to you for taking the time and effort to better yourself and
understand something new! That attitude will take you far. I’m clapping for you
right now from my couch.
What is a Shell Samurai?
Thank you for joining me and taking the path to becoming a Shell Samurai!
Let’s learn Linux and kick ass at it together!
- Stetson Blake
1. Setting Up a Lab Environment
Installing Linux on an old system is a great option to get started using Linux.
Installing Linux and using it as your daily driver is a great way to dive into the
deep end. Just be cautious as many newbies may install Linux on their “main
computer” and then break things beyond repair. This isn’t a big deal because you
can simply reinstall and the only cost is your time. If you go this route, be sure to
install Linux on a machine that isn’t your primary computer.
If you decide to take this path, use Ubuntu Desktop available at Ubuntu.com.
Grab the latest version, which is currently 22.04.2 LTS. LTS stands for
Long Term Support and means that this edition will be supported for a few years.
You’ll need a USB drive to install Ubuntu from a bootable disk or if you’re a
little more old school, you can burn the Ubuntu image to a disc and boot from
that. Ubuntu has setup instructions for installing on a Bootable USB stick here.
password will not show up on the terminal! This is called blind typing and
is normal. Enter your password again to confirm.
• Congrats you should now have your very own Ubuntu terminal up and
running!
You can access WSL in a few ways. First, through the Start menu: search for
Ubuntu, then pin it to Start or the taskbar if you like. The other way you can
access your shell is by opening Command Prompt or Powershell, simply typing
ubuntu and hitting enter.
Check out the Microsoft Documentation for more info on Windows Subsystem
for Linux.
Dual Boot
Dual Booting is a way of installing Linux alongside another operating system
on the same machine. This is a similar route to just installing Linux on a
computer like I mentioned above. You’ll have to do some partitioning of your
hard drive and do some other configuration. Due to that, it’s an option, but not a
very good one in my opinion due to the complexity involved for a newbie.
This option is my favorite. Today, the “Cloud” is ubiquitous and cheap. For
many years, you’ve been able to rent someone else’s computer in the form of a
“VPS” or Virtual Private Server. There are tons of platforms like Digital Ocean,
Vultr and AWS. Be careful with AWS, because you can really run your bill up if
you don’t know what you’re doing!
Point is, any basic Linux server shouldn’t cost more than $5/month. For some
cloud providers, you can even shut down your server when you aren’t using it
and just pay for the amount of time it’s turned on! Billing changes constantly, so
be sure to read your provider’s FAQ and Billing Documentation to be sure.
If you use this option, be sure to check out the section on SSH in this book as
that is the primary way you’ll connect to a cloud server! When you install your
server, be sure to use the latest version of Ubuntu, which is the Linux
Distribution we’ll use in this book.
The path you take to get access to a Linux Shell doesn’t matter too much. All
that matters is that you start and get a basic bash shell up and running.
Using Cloud Hosting Setup Tutorial
Creating a Droplet
Next, you’ll be taken to a wizard that allows you to select various settings
related to your virtual machine. For the region, select the one geographically
closest to you so that you have lower latency connecting to it. I chose New York.
Scroll down to Choose an Image and select Ubuntu and pick the latest
(highest version number) that is currently offered. That’s 22.10 x64 as of this
writing. The other options under marketplace, snapshots and backups have
images that others have created with additional software pre-installed. You can
ignore those for now.
Under size, choose the smallest and cheapest plan you can find. That’s
currently the $4/month tier which has 512 MB / 1 CPU, 10 GB disk and 500 GB
transfer. I like Digital Ocean because their pricing is simple! Digital Ocean
currently bills hourly, so technically you could spin your VPS up for an hour,
play with it and delete it when you’re done and pay almost half a cent. Not too
bad.
Next, we’ll setup authentication for your VPS. Scroll down to Choose
Authentication Method and choose SSH Key or Password. My account shows a
few keys that already exist for other accounts. If you’re using Password
authentication, just keep in mind that this is a less secure option. Probably fine
for your Linux lab, but certainly a no-no in production environments.
Click New SSH Key and you’ll see a window pop up asking for a Public SSH
Key along with some instructions on how to create one. We cover SSH more in-
depth in Black Belt Command Line Skills here, but we can go ahead and make a
public/private key-pair now and introduce the basics.
If you’re on Windows, you can open Command Prompt or Powershell (Start
button > search for Command Prompt or Powershell). If you’re on a Mac, open
the terminal from Applications and use that. Once you have a prompt open, type
ssh-keygen and walk through the instructions to create a new key. You’ll be
asked for an optional passphrase (you can skip this by just hitting enter) and a
location to save your key-pairs.
Once you’re done, you can use the cat command in Powershell or Mac
Terminal, or the type command in Command Prompt, to print the contents of the
PUBLIC key file, likely named id_rsa.pub. Copy the output from that command
into your Public SSH Key prompt.
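If it helps to see the whole key-generation step in one place, here’s a sketch (assumptions: an Ed25519 key type and a temporary file location are used purely for illustration; accepting ssh-keygen’s defaults gives you id_rsa and id_rsa.pub in ~/.ssh instead):

```shell
# Create a throwaway directory so this demo doesn't touch your real ~/.ssh
keydir="$(mktemp -d)"

# Generate a key pair non-interactively: -t picks the key type,
# -N "" sets an empty passphrase, -f sets the output path, -q is quiet
ssh-keygen -q -t ed25519 -N "" -f "$keydir/lab_key"

# The .pub file is the PUBLIC key -- its contents are what you paste
# into the "Public SSH Key" prompt
cat "$keydir/lab_key.pub"
```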
If this is confusing right now, read up in the SSH section more, check out the
Digital Ocean documents or as a worst-case scenario, use Password based
authentication. The tl;dr is that we just made a public and private key for
authenticating to your server. This is a cryptographically secure way to connect
to a server through Public Key Cryptography or asymmetric cryptography.
Finally, give the key a memorable name and click Add SSH Key.
We’re almost there! In the finalize details section, give your VPS a hostname
(no underscores allowed!) and any optional tags. Tags could be things like
“database” or “myWebApp”. We don’t need them for now, but you can add
some if you’d like. There should also be a default project where you can place
your droplet for organization purposes.
Once you’re satisfied, go ahead and hit Create Droplet!
Connecting To Your VPS
Digital Ocean will provision your VPS and it usually doesn’t take long. Like,
under a minute. You can monitor the status from your Digital Ocean dashboard:
If you click your VPS’ name you’ll be taken to another dashboard that has all
kinds of data on the health of your instance and other details. Your VPS should
be up once you see an ipv4 address assigned. Go ahead and click the ipv4
address to copy it to your clipboard. Mine is 137.184.198.176 in this screenshot:
Back in your Command Prompt, Powershell or Mac terminal, we can connect
to our VPS with the ssh command.
The syntax and format you’ll use is ssh root@<IP_Address>. For my example,
I would perform ssh [email protected]
We use the root user to connect to our server. Later, we can set it up so that we
can only connect as the Ubuntu user, which is more secure.
You’ll then get a message that seems scary if you’ve never seen it before:
➜ ~ ssh [email protected]
The authenticity of host '137.184.198.176 (137.184.198.176)' can't
be established.
ED25519 key fingerprint is SHA256:JPW506TBNx9PIZoxtug+AVhsO/
eHG+0Cr/UfP0UheoA.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/
[fingerprint])?
Go ahead and type yes and hit enter. Our computer is telling us that we’ve
never connected to this VPS or machine before and it can’t verify it knows who it
is. It also gives us a fingerprint that identifies this server uniquely. This can be
used to verify the server is who it says it is.
If all has gone well, you’ll be dumped into your new Ubuntu system and see
some system statistics and various links to support. If something went wrong
with authentication instead, you may see a Permission denied error like this one:
➜ ~ ssh [email protected]
[email protected]: Permission denied (publickey).
This typically indicates that you are connecting as the wrong user (I’m not,
I’m using root here), that your id_rsa private key has the wrong permissions, or
that you may not be using the correct key! You can get some more info on the
error by providing the -vvv flag to the ssh command to indicate you’d like
verbose output:
ssh [email protected] -vvv
This will generate a ton of output that you can skim through and maybe find
something obvious that’s gone wrong.
You don’t have to use the command-line to connect to your server, though. You
can also use an SSH client like PuTTY, which allows you to configure and save
your server settings so you don’t have to remember them every time. PuTTY will
save your server name and IP address, but you’ll have to configure it to point at
your private key, if you didn’t choose password authentication:
Congratulations! You’ve spun up your first Ubuntu server on Digital Ocean!
You now have a light-weight, instantly accessible Linux server that you can
connect to from anywhere in the world. You could run a web server on it, start a
side project or in our case…Learn Linux more deeply and become a Shell
Samurai!
2. Linux Fundamentals
What is Linux?
The word Linux is a bit tricky to fully explain. Linux actually refers to the
Linux Kernel, which is the core brains of Linux operating systems. We’ll get into
the kernel more in later sections. Mostly, you’ll hear people say Linux and they
broadly mean a Linux operating system, like Ubuntu.
Linux was first released in September, 1991 by Linus Torvalds from Finland as
a free and open-source alternative to commercial operating systems. Since then,
it’s grown to massive popularity with over 90% of the world’s super computers
running Linux. Linus, of course, was standing on the shoulders of giants. Prior to
Linux’s development, Ken Thompson and Dennis Ritchie developed the Unix
operating system at Bell Labs along with tons of other utilities which would
make it into Linux. Linux itself took inspiration from, and was initially developed
on, an operating system called Minix. Minix was developed by a professor,
Andrew Tanenbaum, for teaching his students how operating systems work.
What is Open-Source?
In the context of Linux, open source refers to the concept of making the
underlying source code of the operating system freely available to anyone to use,
modify and distribute. This has some awesome implications. One being that the
code can be modified and tuned to specific use-cases. Another being that it can be
audited for security. In addition, anyone can read Linux’s source-code. This
means that Linux is developed quickly and for a wide-range of use-cases.
Linux’s open source philosophy has indeed led to a massive community of
thousands of developers working on improving Linux, fixing bugs and
developing new features. Due to this, Linux has become a stable, reliable and
secure operating system running all over the world.
Unix vs Linux
It’s important to call out that Unix and Linux are not the same. This can be
confusing for beginners. Unix was developed at Bell Labs in the late 1960s, while
Linus Torvalds developed Linux in the 90’s, 20-something years later. Much of
Linux’s design was inspired by Unix. Linux is even called Unix-like. Linus picked
up parts of Unix that he thought were helpful and dropped other pieces. Unix
systems are still developed today and running many production systems.
Linux Distributions
Since Linux is open source, lots of different developers have worked on it and
forked their work into separate distributions or “distros” based on the opinions
they have on how the operating system should work. A fork is what happens
when software is cloned from other software and takes a different direction.
Linux distributions each have their own set of unique features, tools and pre-
installed applications.
It’s important to keep in mind that Linux actually refers to the Linux kernel,
while distributions using the kernel are known as Linux operating systems.
For this book, all of our examples will use Debian or Ubuntu tooling. Let’s
briefly cover some of the most popular Linux distributions.
Debian
Debian is one of the oldest Distributions based on the Linux kernel. It was
first released in 1993 and has a lot of momentum to this day! Debian has different
“branches”, which are just different versions of the same operating system. Stable,
Testing and Unstable are the three main Debian branches. Stable is the branch
that most people should use, while testing and unstable are more frequently
updated with changes that eventually make it over to Stable. Debian is the basis
for a lot of other distributions, with the most popular being Ubuntu. Debian, like
many distributions, utilizes a lot of free and open-source software.
Ubuntu
Ubuntu was initially released in 2004 and is developed by a British company
called Canonical. It is very popular among beginners and professionals alike. It’s
the distribution we’ll use in this book’s examples. Ubuntu is based on Debian
and even uses Debian’s package management system for installing software.
Ubuntu has many versions of its OS for desktops, servers and even a version
used in Internet of Things devices and robots called Ubuntu Core.
Ubuntu was my first introduction to Linux and I think you’ll enjoy using it
too.
RHEL
RHEL, or Red Hat Enterprise Linux is developed by none other than…a
company called Red Hat. Red Hat, as of this writing, is owned by IBM. Red Hat
Enterprise Linux has more restrictions on its distribution and is more commercial
than other distributions. As well, RHEL uses a different packaging system for
installing software than Debian and other distributions called RPM or Red Hat
Package Manager.
Fedora
Fedora is in some ways the pre-cursor and basis for Red Hat Enterprise Linux.
Red Hat is a fork of Fedora. Fedora shares many similarities with Red Hat, but
comes with a lot fewer restrictions and commercial agreements. Fedora also uses
RPM or Red Hat Package Manager since it’s based on Red Hat.
Arch
Arch is worth a mention in this book, though it isn’t commonly used on
servers. Arch is usually installed on Desktops and Laptops for users that want
deep control over their system. Arch is very opinionated about many of its design
decisions. This just means that the developers have strong convictions about the
way to do things. I don’t recommend Arch for beginners, but learning it could
give you a deeper understanding of the Linux operating system if you’re up for
the challenge.
stetson@linux_rocks:~$ uname -r
5.15.0-56-generic
There are many kernel versions and because the Linux kernel is open-source,
we can inspect the kernel source code, modify it and run it on our own systems.
We won’t dive that deep in this course, but know that it’s something you could
do. For more info on the Linux kernel, check out Kernel.org
Interrupts
An interrupt is a signal sent by hardware or software to the kernel indicating
that immediate attention is needed. Interrupts, like in real life, call attention to
something and STOP the current execution of whatever else is going on.
A simple example of a hardware interrupt is input from a device like your
keyboard or mouse. Any time you type a key, the kernel suspends operation,
saves its state and executes an interrupt handler or code for the device driver
associated with your keyboard. Once the signal is handled, typically by
outputting a character to your screen, control is returned back to the kernel.
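You can watch interrupt activity on a live Linux system yourself: the kernel keeps running counts in /proc/interrupts (shown here as a sketch; the rows and numbers will differ on your machine):

```shell
# Each row is one interrupt source; the columns are per-CPU counts since boot
head -n 5 /proc/interrupts
```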
• write() - writes data to a file descriptor
If system calls don’t quite make sense right now, don’t worry. Just know that
they are a process that happens behind the scenes to help programs do their job.
This is a pretty advanced concept for this early in the book, but I feel I’d be doing
you a disservice if I didn’t introduce system calls!
Signals
Signals are a kind of software interrupt. A signal is a way that processes can
communicate with each other (sometimes called interprocess communication)
and with the operating system. We can even send our own signals to processes
from the command-line! A signal might be sent for lots of different reasons, like
the completion of a piece of code or asking a process to shut down and terminate
itself.
"
accessing memory it isn’t allowed to.
• SIGTERM or 15: Software Termination. Gracefully terminate a process and
give it a chance to cleanup.
• SIGSTOP: Suspend execution until a SIGCONT signal is received
• SIGCONT: Tells a process to continue execution
These are just a few examples of signals and their names; Linux currently
implements about 30 in total.
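You can ask your shell for the full list yourself with kill -l:

```shell
kill -l      # list every signal name the system implements, with numbers
kill -l 9    # translate a signal number to its name (9 is SIGKILL)
```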
I mentioned that we can also send our own signals to processes. We do this
with the kill command. By default, you can use the kill command and pass it a
process ID to send a SIGTERM signal, which asks the process to shut down
gracefully.
If you’d like to send a different signal to a process, you can specify its signal
number as a flag to the kill command. Eg:
$ kill -9 <process_id>
The above command sends the SIGKILL signal, which forcefully terminates the
process immediately, without giving it a chance to clean up. This is very useful
for “runaway processes”, or processes that you’ve tried to shut down and that
won’t stop when we’ve asked nicely. You could instead specify signal ids 1, 2, 11
or any of the others as a flag.
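As a safe, hands-on demo, you can start a harmless background process and terminate it yourself (sleep is used here just to have something to kill):

```shell
sleep 300 &                      # start a long-running process in the background
pid=$!                           # $! holds the process ID of the last background job
kill "$pid"                      # send the default SIGTERM, asking it to exit
wait "$pid" 2>/dev/null || true  # reap the process; its exit status reflects the signal
echo "process $pid is gone"
```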
Hot Tip : If you ever want some process that’s running on your terminal to
immediately stop in Linux, you can hit Control+C on your keyboard to send a
SIGINT or Signal Interrupt. This requests that the current process stop. If you
have a script that is taking too long to run or you entered a command you didn’t
mean to run, try hitting Control+C to stop execution and return control to the
shell.
Everything is a file? Always has been
A phrase you’ll hear a lot about Linux is that “Everything is a File”. What?
How could that be? Is my CPU a file? Yes. My Keyboard? Yes, also. The
Monitor?! Yes! While the components of your system aren’t literally a file, that’s
how Linux sees them and interacts with them. This is a design philosophy of
Linux. Almost every single aspect of the system, including devices like monitors
and keyboards, processes and more are represented as a file.
For example: A printer is represented as a file and printing can be performed
by writing to that file. Similarly, processes are represented by files in the /proc
directory on the file system. Detailed information about a process can be
obtained by reading the contents of that file.
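You can try this yourself: each process gets a directory under /proc named after its process ID, and /proc/self is a shortcut to whichever process does the reading:

```shell
# /proc/self/status describes the reading process itself -- here, head.
# The first few lines include the process name and state.
head -n 3 /proc/self/status
```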
This abstraction of system resources as files makes Linux highly flexible and
powerful. It enables the system and kernel to treat a wide range of resources in a
consistent way. It also makes it easier for users and developers like us to interact
with these resources, as they can use familiar file operations, such as reading,
writing, and copying, to interact with them.
In short, the "Everything is a file" philosophy in Linux provides a consistent
way of interacting with and manipulating system resources, making the system
both user-friendly and highly flexible.
Linux Directory Structure
Every Linux install has a directory structure with folders and files that help
the system run properly. If you have shell access, you can list these directories by
issuing ls -l /
Almost every Linux system has the following directories under its root
filesystem:
• /usr - used for user installed programs and utilities.
• /var - Variable data. Usually logs are output somewhere in this directory
We’ll go deeper into what’s kept in some of these directories later, for now just
keep the general structure in mind.
An Overview of Filesystems
Below are some of the most common types of filesystems used today:
• ext4 - The most recent version of the native Linux filesystems. This filesystem
succeeds the ext2 and ext3 versions. Supports disk volumes up to 1
exabyte and file sizes up to 16 terabytes.
• btrfs or B-tree filesystem - Sometimes called Butter FS or Better FS. New
filesystem type that is looking to expand on features from ext4.
• XFS - A high-performance journaling filesystem. Typically used on systems
like media-servers which have large files.
• NTFS and FAT - Windows filesystems
• HFS+ and APFS - Apple standards used on most Mac systems
You can also use df -T to see info about storage on your system and what
types of filesystems are in use:
root@linux_rocks:/tmp# df -T
Filesystem     Type     1K-blocks    Used Available Use% Mounted on
overlay        overlay   61255492 3491404  54622764   7% /
tmpfs          tmpfs        65536       0     65536   0% /dev
tmpfs          tmpfs      1018268       0   1018268   0% /sys/fs/cgroup
shm            tmpfs        65536       0     65536   0% /dev/shm
/dev/vda1      ext4      61255492 3491404  54622764   7% /etc/hosts
tmpfs          tmpfs      1018268       0   1018268   0% /sys/firmware
Keeping an eye on storage space in your Linux shell is essential! You should
also be familiar with the disks and filesystem types they’re using.
Devices
Devices like keyboards, mice and disks all need drivers to function on your
system correctly. Since everything is a file on Linux, we can interact with devices
through device files. Device files are stored in the /dev directory on your system.
Go ahead and perform an ls -l command on the /dev directory on your system.
(This command simply lists in long format -l every file and directory in /dev):
stetson@linux_rocks:/$ ls -l /dev
You’ll get back a ton of devices on each line. Some of them represent disks,
others represent virtual terminals. One file is called /dev/null which is a special
type of device on our system that takes input and throws it all away. You can
echo some text to /dev/null now and then cat the file and observe that it
contains nothing:
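Here’s what that looks like (the echoed text is just an example; anything you send disappears):

```shell
echo "shell samurai" > /dev/null   # the > operator redirects the output into /dev/null
cat /dev/null                      # prints nothing: the device discarded our text
```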
The Linux Kernel knows that /dev/null should return nothing since it acts as
a kind of “black hole” for anything we throw into it. Above, we echo a simple
string of text and use the > redirect operator to send our output to /dev/null. We
cover command redirection more in Black Belt Command Line Skills.
Looking back to our ls -l /dev output, we’ll see a few columns of data. The
first character on each line has a special meaning:
stetson@linux_rocks:/$ ls -l /dev
crw--w---- 1 root tty 4, 0 Dec 30 00:38 tty0
brw-rw---- 1 root disk 252, 1 Dec 30 00:38 vda1
prw-r--r-- 1 root root 0 Dec 30 20:13 fdata
srw-rw-rw- 1 root root 0 Dec 30 20:13 log
See each letter to the far left of the column? The c, b, p and s?
• c - character
• b - block
• p - pipe
• s - socket
Character devices transfer data one character at a time and work with data
streams. /dev/null is a character device. Printers are also represented by a
character device.
Block Devices are accessed by programs in fixed-size chunks called blocks. Hard
drives and other storage devices are block devices; filesystems are built on top of them.
Pipe Devices or named pipes are similar to character devices. They allow two or
more processes to communicate.
Socket Devices also allow communication between processes like pipe devices
do. They typically allow communication with many processes or programs at
once. These files are also often found outside of the /dev directory we’ve been
looking at.
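You can create a named pipe yourself with mkfifo and spot the leading p in the listing (the path here is arbitrary):

```shell
mkfifo /tmp/demo_pipe        # create a named pipe (FIFO)
ls -l /tmp/demo_pipe         # the first character of the mode will be p
rm /tmp/demo_pipe            # clean up
```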
Device Naming
If you explore the device names in /dev, you’ll see some funny looking
names. Some common device names for storage are /dev/sda, /dev/sdb and
/dev/sda2. These names mean SCSI device a, SCSI device b and SCSI device a
partition 2. These names come from the SCSI (pronounced like “scuzzy“)
protocol or Small Computer System Interface. Although SCSI interfaces are
largely not used anymore, this naming scheme still sticks around.
What is a daemon?
A daemon (pronounced DAY-mon or sometimes DEE-mon) is a
background process that runs on the Linux OS continuously. When we say
background, we mean that the program runs without any user interaction. It
runs “behind the scenes“ so to speak.
An example of a Daemon would be processes like Apache (web server),
Nginx (also a web server), PostgreSQL (a database server) and SSHD (a secure
shell daemon). If you’re a business or individual who maintains a web service of
any kind, it’s likely you want it to stay running all the time, and without much
drama or management. You want it to start when your server starts and keep
running forever. This process would act as a daemon.
How does Linux Boot?
Every operating system has a boot process. Linux is no different. In this
section, we’ll go over the basics of the boot process and explain each component.
There are a ton of processes happening between powering up a Linux system
and when it finally gets to a login prompt. The boot process can be explained
very simply at first; then we’ll go deeper.
BIOS
BIOS stands for Basic Input/Output System. BIOS is a firmware used in the
boot procedure to find the bootloader among some other secondary system
functions like changing boot disk order and network settings. BIOS firmware is
stored on an EEPROM or flash chip, which allows for firmware updates from the
manufacturer.
Once BIOS boots up, its goal is to search for the Master Boot Record (MBR)
and execute it. The MBR contains the primary bootloader in the first 512 bytes of
the disk.
UEFI
UEFI or Unified Extensible Firmware Interface is newer and has the same goal
of starting the system as BIOS does. However, UEFI stores the initialization data
in a .efi file, rather than storing it on the firmware. This .efi file is stored on the
disk of the system on a partition called the EFI System Partition, alongside the
bootloader. The GUID Partition Table (GPT) format is used in UEFI systems.
UEFI was developed to improve on some of the shortcomings of BIOS, such
as its 16-bit operating mode, slow boot times and the 2 TB disk size limit that
comes with MBR partitioning. Today, UEFI is the standard and BIOS has become
archaic and dated.
The Bootloader
Once BIOS or UEFI has found the bootloader and executed it, the bootloader
takes over. The bootloader’s main goal is to start the Linux kernel. It does this by
finding the kernel, loading it into memory and passing in any kernel parameters
that may be needed.
During this process, the kernel has still not been loaded, so accessing the disk
must be done via firmware routines that are slower than the device drivers the
system uses once it’s booted.
The most common and ubiquitous Bootloader is GNU GRUB or GRand
Unified Bootloader. There are also other Linux bootloaders like LILO (Linux
Loader), BURG, Syslinux and coreboot. These bootloaders were developed to be
faster, simpler or have different features than GRUB.
Init and User Space Start
After the bootloader loads the kernel into memory, the kernel starts its first
user-space process, init. This kicks off the rest of the work to bring the system completely
up by starting essential services. The init process will always be assigned a
process id of 1. Once init starts, its goal is to start up the rest of the programs that
are necessary to run the system.
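You can confirm init's process id on a running system:

```shell
ps -p 1 -o pid,comm
# on a typical systemd distribution, this shows PID 1 with the command name systemd
```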
Systemd is the newest and current standard for init on modern Linux
distributions. System V init (SysV) is found on older Red Hat Enterprise Linux
distributions. Upstart is the init that was used on older Ubuntu distributions (pre
15.04). There are a few other init systems used for embedded applications,
cellphones and other devices. The three I mentioned are the ones you should be
most familiar with: Systemd, SysV and Upstart. We’ll be working with and
covering systemd.
systemd
systemd works with targets which can be thought of as goals for systemd to
reach. Targets are a type of unit used to group other units and define the system
state that should be reached. A unit is a file that has instructions for how to start a
daemon or system service and the dependencies for starting it. The dependencies
for a service are also units.
For example, consider a Web Server like Nginx. It doesn’t make sense to start
up a web server if the system doesn’t already have its network online and
functional. A Web Server or Nginx unit file will have a dependency on the network
unit file.
Systemd works with several types of units: services, sockets, devices and
more. Units are started in parallel based on the dependencies they define.
• Services are the most common unit type; they define daemons or background
processes running on the system. Systemd also starts these when they
are a dependency of other unit files.
Here’s a real-world unit file for the Nginx web server:
[Unit]
Description=The NGINX HTTP and reverse proxy server
After=syslog.target network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
Type=forking
PIDFile=/run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t
ExecStart=/usr/sbin/nginx
ExecReload=/usr/sbin/nginx -s reload
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target
Managing systemd Services
This config file looks very similar to .INI file formats you might have seen in
the past. The sections are denoted by square brackets for Unit, Service and
Install. Each of the keys and values underneath the sections are known as
directives.
The Unit section contains a description of the service along with an After and
Wants directive. The After directive specifies units that should be started before
this unit file. The Wants directive is similar to a directive called Requires, but less
strict. It tells systemd to attempt to start any units listed here when this unit file is
activated.
Under the Service section there are instructions for how systemd should start
the daemon. The nginx binary lives at /usr/sbin/nginx. The ExecStartPre,
ExecStart, ExecReload and ExecStop directives tell systemd what program it
should run and with what options before the unit starts, when the unit starts,
when it reloads and finally when the unit is stopped.
The last section is Install. It’s an optional section. The WantedBy directive
sets up a dependency to specify target units that should be started when the
current unit is enabled.
This is just the beginning of Unit Files. Hopefully, this gives you a real-world
example of a Unit file you might find running in production. Next, we’ll look at
interacting with systemd unit files via systemctl.
Systemctl
Systemctl is a command-line tool used to control systemd. We most often use
it to start, stop, restart, enable or disable a service. Starting and stopping are
pretty clear, but what about restarting and enabling or disabling?
Restarting a service stops it and starts it again. You might restart a service
when it isn’t working properly or when you’ve changed a configuration file that
you want to be applied.
Enabling or disabling a service will mark the service to start at boot or to not
start at boot. Pretty simple.
systemctl enable <service> - Enable a service to start on boot
systemctl disable <service> - Disable a service from starting on boot
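Putting the common subcommands together (nginx stands in for any service name):

```shell
sudo systemctl start nginx     # start the service now
sudo systemctl stop nginx      # stop it
sudo systemctl restart nginx   # stop, then start again
sudo systemctl status nginx    # show current state and recent log lines
sudo systemctl enable nginx    # start automatically at boot
sudo systemctl disable nginx   # stop starting at boot
```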
Systemd and the systemctl command are essential pieces of the Linux
operating system today. Understand them well and you will thrive in your Linux
journey!
What's a TTY?
You’ll soon run into the term TTY on your Linux journey, which stands for
TeleTYpewriter. It refers to a physical or virtual terminal you use to interact with
Linux. Originally, TTYs were physical teletype terminals with a keyboard and
printer that sent your commands to a remote machine.
On modern systems, TTYs are usually virtual terminals that you access
through the command line. Each TTY gets an associated device and file located in
the /dev directory. An example device might be named /dev/tty1.
Any time you log in to Linux via the command line, the system will create a login
shell process. Multiple TTYs mean that multiple users can be logged in!
You’ll also see pseudo-TTYs (or PTYs) used for remote logins and terminal
emulators. PTYs provide a virtual terminal that appears to be connected to a
physical TTY device.
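The tty command prints which terminal device your shell is attached to:

```shell
tty
# prints something like /dev/pts/0 for a pseudo-terminal, or /dev/tty1 on a raw console
```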
inodes
Behind the scenes, Linux uses a concept known as inodes or index nodes to
store metadata about files and directories. In fact, every single file and directory
on the system is represented by an inode. Inodes contain info on the file’s
permissions, owner, size, timestamps and location on the disk. When your
filesystem is created, space for inodes is also allocated. Inodes are tracked in a
data structure known as the inode table.
Inodes are assigned unique numbers when they are created. If a file is deleted,
the inode id gets recycled and used again. You can view inode numbers by
running ls -li in any directory.
stetson@linux_rocks:~$ ls -li
total 16
397530 -rw-rw-r-- 1 stetson stetson 10240 Feb 27 00:28 important_file
397528 -rw-rw-r-- 1 stetson stetson     0 Feb 27 00:27 moon_landing_photos.png
397522 -rw-rw-r-- 1 stetson stetson   209 Feb 27 00:44 myArchive.tar.gz
397525 -rw-rw-r-- 1 stetson stetson     0 Feb 27 00:27 secret_documents
Above, the numbers in the first column are inode ids.
If you want to see detailed information about a file and its inode, you can use
the stat command to do so:
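For example (the file name is just a placeholder):

```shell
touch important_file         # make sure the file exists for this demo
stat important_file          # shows size, inode number, permissions and timestamps
stat -c '%i' important_file  # with GNU coreutils, print only the inode number
```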
Inodes contain important metadata about files, but Linux also uses inodes to
locate files on disk and manage them. If you read or write to a file, Linux uses the
inode id to look the file up and retrieve its contents for you in the background.
Something surprising is that the number of inodes available is limited. It’s
possible to run out of inodes even if your system still has tons of space left on the
disk. This isn’t a very common scenario, but one you should be aware of! You can
check how many inodes are available with the df -i command:
stetson@linux_rocks:/var/log$ df -i
Filesystem      Inodes  IUsed   IFree IUse% Mounted on
tmpfs            59672    744   58928    2% /run
/dev/vda1      1290240 239867 1050373   19% /
tmpfs            59672      3   59669    1% /dev/shm
tmpfs            59672      3   59669    1% /run/lock
/dev/vda15           0      0       0     - /boot/efi
tmpfs            11934     26   11908    1% /run/user/0
tmpfs            11934     28   11906    1% /run/user/1000
Above, my system has 1050373 inodes free on its main /dev/vda1 disk, with
239867 used.
3. White Belt Command Line Skills - The
basics, file manipulation, etc
Your shell prompt follows a structure like this:
username@hostname:current_folder $
For example:
stetson@linux_rocks:~/$
The prompt is composed of: the current user you are logged in as, the
hostname of the machine you’re working on and the current working directory.
In the example above, stetson is the user, linux_rocks is the hostname and ~/ is
the current working directory. ~ is a shortcut that represents your home directory
which is located at /home/user_name.
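Commands often take flags; consider a made-up command (launch_server and its flag names are invented purely for illustration):

```shell
launch_server --memory 4096 --cpu 2
```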
This command is not real but serves to show how flags could be used to
provide additional parameters like 4096 for the memory value and 2 for the CPU
value. The command we’re running, launch_server will then parse the
arguments we’ve given it and act in a different way!
Later on, we’ll meet the root user, which has permissions to do anything.
The prompt sits there, waiting for us to take control and command Linux to
do our bidding. Let’s try our first command, hostname
stetson@linux_rocks:/home/stetson$ hostname
linux_rocks
Whoa! We got some text back. Sweet. The bash interpreter took our command,
sent it to the operating system and replied with the output that the hostname
command returned. Now check this out, it’s going to blow your mind. Hit the up
arrow on your keyboard. The same hostname command should be back in your
prompt, waiting for you to execute it again. Hit enter again, if you want. Or hit
the down arrow key, and the prompt will be empty again. You can also hit the
left and right arrow keys and delete letters with backspace like you could do in a
GUI (Graphical User Interface). Maybe that wasn’t that mind-blowing, but let’s
venture forward anyway.
Before we start navigating, here’s the top level of the Linux directory structure:
/
|-- bin -> usr/bin
|-- boot
|-- dev
|-- etc
|-- home
|-- lib -> usr/lib
|-- media
|-- mnt
|-- opt
|-- proc
|-- root
|-- run
|-- sbin -> usr/sbin
|-- srv
|-- sys
|-- tmp
|-- usr
|-- var
Try the pwd command, which stands for print working directory:
stetson@linux_rocks:~$ pwd
/home/stetson
You’ll get back a path to indicate where your prompt or shell is currently
focused in the Linux operating system.
Earlier we mentioned that your prompt shows the current working directory
after the colon and before the $ sign. You might instead see a tilde or ~. The tilde
indicates that the current path is your home directory. Your user might be ubuntu,
so the home path would be /home/ubuntu instead of /home/stetson. The tilde is
simply a shortcut to refer to the current user’s home directory.
Enter ls now:
stetson@linux_rocks:~$ ls
Desktop Documents Music Pictures Public Templates Videos
Depending on where you ran ls from, you’ll see nothing or a ton of output.
In the example above, every name we see is a directory. We’ll go deeper into the
ls command later.
Absolute Paths
Absolute Paths are the “full” directory path from the root directory. Just earlier,
we saw that a user’s home folder might reside somewhere like /home/stetson.
This is an example of an Absolute Path. Whenever I say absolute path to myself, I
slam my fist on the table to emphasize absolute. I’m probably weird for this but it
helps me to remember the difference vs Relative Paths.
Relative Paths
Relative Paths are the path referenced from your current directory. For
example, if we’re in /home/stetson and we’d instead like to change our
directory to /home/stetson/Documents, we don’t need to tell cd to go to
/home/stetson/Documents. We can instead tell it to take us to Documents with cd
Documents. The path is relative from where we currently are. It also saves us
time from typing!
With that out of the way, let’s do some filesystem navigation and folder
surfing!
stetson@linux_rocks:~$ cd /usr/bin
stetson@linux_rocks:/usr/bin$
We won’t get any text returned back other than a new empty prompt with the
path now changed to /usr/bin.
Let’s cd back to our home directory, this time using either /home/user_name or
just the ~ tilde. Remember that you should see your username at the far left of
the prompt.
cd ~ or cd /home/ubuntu will work the same (assuming that your current
user is named ubuntu).
Once there, issue pwd again to verify your path was changed:
stetson@linux_rocks:/usr/bin$ cd ~
stetson@linux_rocks:~$ pwd
/home/stetson
Let’s learn a few more cd tricks. Just like we can specify ~ to refer to our home
directory, we have a few other shortcuts available to us:
cd ..
.. refers to the directory “above” our current directory. Give it a try from
your user’s home directory to “go back” or “up” a directory. The pwd command
should return /home. Note that the double period .. refers to the directory
above and a single period . refers to the current directory.
cd -
Performing cd - will take you back to the directory you were just in. So if
you were in /home/stetson, performed cd .. and then a cd -, you should land
back in /home/stetson again. Try it out now!
Hot Tip : The text you enter after cd is known as an argument. When we
write an argument after a command, we would say that we’re passing an
argument to a command.
Your Homework:
• Try cd-ing your way around the Linux filesystem. Don’t worry about
breaking things. You can’t do any harm with only cd. What’s the “longest”
path you can find? What about the shortest?
• Try using cd without any arguments. What happens? Verify with pwd.
Learning More ls
We previously used the ls command to list directories and files. However,
there’s one thing I didn’t tell you. Some files and directories are hidden! Any file
in Linux whose name starts with a period is hidden. Luckily for us, we can reveal
them with ls. We’re going to use the -a flag, which stands for all.
stetson@linux_rocks:~$ ls -a
. .. .bash_history .bash_logout .bashrc .profile
We’ll see the . and .., which again reference the current directory and
directory above us. We also should see a few other files that are prefixed with a
period. Like .bashrc, .profile. If you perform ls without the -a flag, you’ll
no longer see the hidden files.
Let’s try adding another flag to ls. We’re using two flags here, a for all and l
for long.
stetson@linux_rocks:~$ ls -la
total 24
drwxr-xr-x 2 stetson stetson 4096 Feb 14 23:36 .
drwxr-xr-x 1 root root 4096 Feb 14 23:33 ..
-rw------- 1 stetson stetson 170 Feb 14 23:37 .bash_history
-rw-r--r-- 1 stetson stetson 220 Feb 14 23:33 .bash_logout
-rw-r--r-- 1 stetson stetson 3771 Feb 14 23:33 .bashrc
-rw-r--r-- 1 stetson stetson 807 Feb 14 23:33 .profile
There’s a lot here! The long listing output from ls gives us the file’s
permissions, links to the file, owner name, owner group, size of the file, last
modified time stamp and finally, the name of the file or directory.
The first character of the permissions lets us know if the item is a file or a
directory, indicated by d for a directory and - for a file. E.g. drwxr-xr-x is a
directory, while -rw-r--r-- is a file.
We’ll break down the weird r, w and x flags we see in a later chapter; just keep
in mind they relate to the file’s permissions.
Hot Tip : We don’t need to write ls -l -a, we can instead use ls -la.
Give it a try.
One more ls tip. Imagine you have a directory filled with other directories
and files. How can you see the entire output? You can use the -R or recursive
flag to do that.
Observe:
stetson@linux_rocks:~/more_files$ ls -R cloud_files/
cloud_files/:
secret_files
cloud_files/secret_files:
aws
cloud_files/secret_files/aws:
s3
cloud_files/secret_files/aws/s3:
cloud_spend container_audit security_audits
cloud_files/secret_files/aws/s3/cloud_spend:
cloud_files/secret_files/aws/s3/container_audit:
cloud_files/secret_files/aws/s3/security_audits:
Reading Files
Remember that .bash_history file from earlier? Let’s read it. We’ll use the
cat (short for concatenate) command and pass in .bash_history as an argument
to read the hidden file:
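Something like this (your history will contain whatever you've typed so far):

```shell
cat .bash_history
# hostname
# ls
# cd /usr/bin
```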
Whoa! We get back the commands that we entered earlier. Bash keeps a
history of the commands we entered earlier in .bash_history. Cool!
The touch command creates a new, empty file when you give it a name that
doesn’t exist yet. That’s just one way to use touch. We can also update modified
timestamps on existing files and directories. Try using touch again, this time
specifying a file that already exists.
Use ls -l again and verify the modified timestamp was changed:
stetson@linux_rocks:~$ ls -l
total 4
-rw-rw-r-- 1 stetson stetson 0 Mar 14 19:59 file.txt
drwxrwxr-x 3 stetson stetson 4096 Mar 9 19:44 projects
stetson@linux_rocks:~$ touch file.txt
stetson@linux_rocks:~$ ls -l
total 4
-rw-rw-r-- 1 stetson stetson 0 Mar 15 21:22 file.txt
drwxrwxr-x 3 stetson stetson 4096 Mar 9 19:44 projects
Above, I used touch to change the modified time stamp on file.txt from
Mar 14 19:59 to Mar 15 21:22.
Creating Directories
Just as we can create a new file with touch, we can make our own directories
with the mkdir command and an argument. Give it a try now! Then, cd into that
directory, make more files with touch and more directories with mkdir. The
world is your oyster! Err — shell? The terminal is your shell? Never mind.
Hot Tip(s) : You can use mkdir with the -p flag to create nested subdirectories
automatically. For example, mkdir -p my_directory/another_one/secret_files
will create my_directory and another_one along the way without you having to
make each one manually. Just another way we can save ourselves some typing!
You can also give mkdir multiple directories as arguments separated by space
and it’ll make them all next to each other.
For example:
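A quick sketch with invented directory names:

```shell
mkdir one_fish two_fish red_fish   # three sibling directories in one command
ls
```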
Copying and Moving Files
Often you’ll want to copy or move files around the filesystem. For this we’ll use
the cp and mv commands, which stand for copy and move, respectively.
Each command takes two arguments: a source and a destination:
stetson@linux_rocks:~/my_awesome_directory$ ls
file1.png
stetson@linux_rocks:~/my_awesome_directory$ mv file1.png /tmp
stetson@linux_rocks:~/my_awesome_directory$ ls /tmp
file1.png
stetson@linux_rocks:~/my_awesome_directory$ ls
stetson@linux_rocks:~/my_awesome_directory$
Using mv will entirely move the file! If we use cp instead, then…you guessed
it, we’ll copy the file. This retains the original file and creates a copy where we
tell it to.
stetson@linux_rocks:~/my_awesome_directory$ ls
file1.png
stetson@linux_rocks:~/my_awesome_directory$ cp file1.png /tmp
stetson@linux_rocks:~/my_awesome_directory$ ls /tmp
file1.png
stetson@linux_rocks:~/my_awesome_directory$ ls
file1.png
Move can also be used to rename a file. Simply give a name as the second
argument. You can move a file and rename it at the same time, too!
Observe:
stetson@linux_rocks:~/my_awesome_directory$ ls
file1.png
stetson@linux_rocks:~/my_awesome_directory$ mv file1.png my_awesome_file.png
stetson@linux_rocks:~/my_awesome_directory$ ls
my_awesome_file.png
stetson@linux_rocks:~/my_awesome_directory$ mv my_awesome_file.png /tmp/my_renamed_awesome_file.png
stetson@linux_rocks:~/my_awesome_directory$ ls /tmp
my_renamed_awesome_file.png
Copying Directories
Let’s say we want to move an entire directory that contains some files:
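Reproducing the situation (directory name carried over from the earlier examples; the message shown is GNU coreutils'):

```shell
mkdir -p my_awesome_directory
touch my_awesome_directory/file1.png
cp my_awesome_directory another_directory
# cp: -r not specified; omitting directory 'my_awesome_directory'
```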
We get this strange error message about an -r flag. What gives? Linux wants
us to use the recursive flag to copy any files and directories inside the directory.
stetson@linux_rocks:~$ cp -r my_awesome_directory/ another_directory
stetson@linux_rocks:~$ ls -l
total 8
drwxrwxr-x 2 stetson stetson 4096 Feb 16 00:48 another_directory
drwxrwxr-x 2 stetson stetson 4096 Feb 16 00:46 my_awesome_directory
Nice! That time it worked!
Hot Tip : Wildcards can be used in many places in the bash shell. Let’s say
we have a directory with a bunch of different file extensions, but we only want to
move the files with .png extensions. We can use a wildcard to do so:
stetson@linux_rocks:~/important_files$ ls -l
total 0
-rw-rw-r-- 1 stetson stetson 0 Feb 16 00:53 AWS_Users.pdf
-rw-rw-r-- 1 stetson stetson 0 Feb 16 00:53 TPS_Reports.pdf
-rw-rw-r-- 1 stetson stetson 0 Feb 16 00:53 file1.png
-rw-rw-r-- 1 stetson stetson 0 Feb 16 00:53 file2.jpeg
-rw-rw-r-- 1 stetson stetson 0 Feb 16 00:53 house_inspection.jpeg
-rw-rw-r-- 1 stetson stetson 0 Feb 16 00:53 jazz.mp3
-rw-rw-r-- 1 stetson stetson 0 Feb 16 00:53 my_dog.png
-rw-rw-r-- 1 stetson stetson 0 Feb 16 00:53 rock_n_roll.mp3
-rw-rw-r-- 1 stetson stetson 0 Feb 16 00:53 vacation.png
stetson@linux_rocks:~/important_files$
stetson@linux_rocks:~/important_files$ mv *.png /tmp
stetson@linux_rocks:~/important_files$ ls /tmp
file1.png my_dog.png my_renamed_awesome_file.png vacation.png
We simply used mv *.png /tmp to move ALL of the png files to /tmp.
Well done on learning ls, mkdir, cat, touch, copy and move. You’re well on
your way to honing your Shell Samurai katana.
Hot Tip : You should know about tab completion. Almost any time
while using the Linux shell, you can hit your tab key and your shell will attempt
to auto-complete whatever it is you’re typing. For example, if you’re in a
directory and trying to cat a file called
super_long_file_name_that_you_dont_want_to_type.txt, you can type super and
whack the tab key and bash will auto-complete the file name for you. Note that if
you have other files in that directory also starting with super, bash can only
complete up to the point where the names differ. If there are two other files,
superman and super_awesome.txt, hit tab twice and bash will list the candidates;
type a few more characters to disambiguate and hit tab again.
Users, Groups, Root and Sudo
To see your assigned user id and group id, simply issue the id command:
stetson@linux_rocks:~$ id
uid=1000(stetson) gid=1000(stetson) groups=1000(stetson),27(sudo)
You’ll see that my user has id 1000, group id 1000 and belongs to two groups:
stetson and sudo.
We as users and administrators work with usernames, but the Kernel works
with user ids.
Introducing Root
You may have come across the term root access before. The root user has
permissions to do anything: root can read any file or terminate any process. For
this reason, root access should be heavily guarded and secured! You wouldn’t
want anything bad to happen to your system, would
you? (Not a threat).
As your regular non-root user, try now to cat the /etc/shadow file:
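The attempt should fail with something like this (exact wording varies slightly by distro):

```shell
cat /etc/shadow
# cat: /etc/shadow: Permission denied
```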
Depending on how you installed Linux, you may not have the sudo command
available! If you’re on Ubuntu and get a command not found error, you’ll want
to install sudo with apt-get install sudo. You’ll also want to ensure that your
user has sudo permissions! Let’s see an example.
Sudo
Sudo or Super User Do is an effective tool that lets non-root users run programs
as super user or root. It’s best practice to not manage Linux systems while
logged in as root. As root, you have unlimited power, so what if you
accidentally deleted a bunch of directories you didn’t mean to? Sudo can act as a
temporary elevation of privileges that allows us to install software, delete files or
run other programs.
Many of the commands in this book may require sudo to be used right before
the command in order for it to run successfully. For example, you might run sudo
apt-get install nginx to install the Nginx web server. The same command
would fail if we only used apt-get install nginx and ran it as a non-root user.
Hot Tip : Many times while writing commands in my own terminal, I’ll
forget to pre-pend them with sudo, and I get a permission denied error. Rather
than scroll up and put sudo before them or rewriting the entire command, I’ll
instead just use sudo !!. The double-exclamation !! characters tell bash to re-
run the previous command with sudo. Performing sudo !! replaces the !! with
my original command on the previous line. This allows me to be speedier on the
command line.
Once you have sudo installed, you’ll need to add your normal user to the
sudo group from the root account. Use adduser <user_name> sudo to do so:
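From the root account, it looks like this (stetson is a placeholder user name; Red Hat-style distros use the wheel group instead of sudo):

```shell
adduser stetson sudo          # Debian/Ubuntu: add the user to the sudo group
# or, equivalently on most distros:
usermod -aG sudo stetson
```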
Running as Root…Cautiously
I mentioned that you shouldn’t run most commands as root, but sometimes
you just don’t have a choice. Another way to perform tasks that require
permissions to be elevated is to simply become root. You can do so by running
su, short for switch user. You’ll then need to enter the root password to become
root:
stetson@linux_rocks:~$ su
Password:
root@linux_rocks:/home/stetson# whoami
root
With great power comes great responsibility. You want to be extra careful
while running commands as root because you could make a mistake that may
alter or break the system.
As root, you can also use su to switch back to another user:
root@linux_rocks:/# su stetson
$ whoami
stetson
Adding Users
Adding users is straightforward. The primary command we can use is
useradd. This command allows us to specify any attributes we’d like along with
the user’s name.
Let’s go ahead and add a user now (you’ll need root or superuser permissions
to do so):
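A sketch of the sequence described below (bob is a placeholder; output abbreviated):

```shell
useradd bob            # as a regular user: fails with a permission error
sudo useradd bob       # works with elevated privileges
grep bob /etc/passwd   # confirm the new account exists
```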
Above, I added bob, but without sudo permissions, so I was denied access. I
used sudo to then add the user again and then checked the /etc/passwd file
to verify that bob shows up. What is the /etc/passwd file?
/etc/passwd
The /etc/passwd file is a special type of file that serves as the central area for
storing user account information. Each line in the file represents a user account
and has a series of fields separated by colons, which provide additional details
about the user.
The primary use of /etc/passwd is to allow the system to match usernames
with corresponding user ids during authentication. When a user logs in, Linux
checks that the credentials match what’s stored in /etc/passwd. The file also does
a few other administrative jobs, like recording a user’s home directory, default
shell and primary group membership.
/etc/passwd used to be used to store encrypted passwords, but modern
systems store encrypted passwords in a file called /etc/shadow we worked with
earlier. /etc/shadow has a few more security restrictions in place.
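The steps described below look roughly like this (bob is a placeholder):

```shell
sudo passwd bob    # prompts twice for the new password
su bob             # switch to bob, entering that password
whoami             # prints the current user: bob
```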
Above, I tell Linux I want to give bob a new password. After confirming the
password, I’m able to use su or switch user to switch to bob with the new
password. Finally, I use the whoami command to verify I’m logged in as bob.
You can check which shell you’re currently using with ps -p $$. The $$ variable
holds the process id of the current shell instance. We’ll get deeper into processes
later, so don’t worry if this doesn’t make sense right now!
$ ps -p $$
PID TTY TIME CMD
1139 pts/0 00:00:00 sh
You can then use usermod --shell /bin/bash <user_name> as either root or
a sudo user to change a user’s default shell. Below, I change my default shell as
stetson, exit out of the shell and then become stetson again with su:
$ whoami
stetson
$ sudo usermod --shell /bin/bash stetson
$ exit
root@linux_rocks:/# su stetson
stetson@linux_rocks:/$
File Permissions
If you recall earlier, we performed an ls with the -l or long flag and got back
a bunch of output. The first column returned had a combination of r, w, and x
characters to indicate file permissions. Let’s take a closer look and demystify file
permissions.
On your command line, perform ls -l /etc or just follow along with the
example here:
stetson@linux_rocks:~/important_files$ ls -larth
total 36K
drwxr-xr-x 5 stetson stetson 4.0K Feb 16 00:52 ..
-rw-rw-r-- 1 stetson stetson   12 Feb 16 22:46 jazz.mp3
-rwxrwxrwx 1 stetson stetson   12 Feb 16 22:46 rock_n_roll.mp3
-rw-rw-r-- 1 stetson stetson   13 Feb 16 22:47 file2.jpeg
-rw-rw-r-- 1 news    news      13 Feb 16 22:47 house_inspection.jpeg
-rw-rw-r-- 1 root    root      13 Feb 16 22:47 AWS_Users.pdf
---x--x--x 1 stetson stetson   13 Feb 16 22:47 TPS_Reports.pdf
drwxrwxr-x 2 stetson stetson 4.0K Feb 16 22:58 folder_one
drwxrwxr-x 3 stetson stetson 4.0K Feb 16 22:58 .
The third and fourth columns tell us the file’s owner and group owner.
Most of these files are owned by the user stetson and the group stetson. The
AWS_Users.pdf file is owned by the root user and the root group, and the
house_inspection.jpeg file is owned by the news user and the news group.
Introducing chmod
If we’d like to change file or directory permissions, we can do so with the
chmod command in two ways. The first way is to use chmod ugo+rwx <file>.
Here, the u, g and o characters stand for user, group, other and rwx for read,
write and execute. Each of these characters are optional, depending on the
permissions we desire. We then use the + to add the permissions. If instead, we
wanted to remove permissions, we would replace the + with a - to do so. Here’s
an example:
stetson@linux_rocks:~/more_files$ ls -l
total 276
-rw-rw-r-- 1 stetson stetson 280245 Feb 17 18:19 jazz.mp3
stetson@linux_rocks:~/more_files$ chmod ugo+rwx jazz.mp3
stetson@linux_rocks:~/more_files$ ls -l
total 276
-rwxrwxrwx 1 stetson stetson 280245 Feb 17 18:19 jazz.mp3
We added all permissions to the file jazz.mp3. This is usually a very bad idea
because it gives any user on the system the ability to read and modify our file.
Let’s remove our permissions from the file so no one can modify it, and then add
back some permissions just for the user:
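Here is one way that might look; jazz.mp3 is just a scratch file here, and touch creates it if you're starting fresh:

```shell
# Start with a scratch copy of the file
touch jazz.mp3

# Strip every permission from user, group and other
chmod ugo-rwx jazz.mp3

# Give just the user (owner) back read and write
chmod u+rw jazz.mp3

# The long listing now shows -rw-------
ls -l jazz.mp3
```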
I also mentioned that there’s another way to change file and directory
permissions. The second way is to use chmod with a binary format to change
permissions. Take a look at the following command:
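The command in question uses chmod's numeric (octal) mode. A minimal sketch, again on our scratch jazz.mp3:

```shell
touch jazz.mp3        # make sure the file exists

# 7 = 4 (read) + 2 (write) + 1 (execute), once each for user, group and other
chmod 777 jazz.mp3

ls -l jazz.mp3        # shows -rwxrwxrwx
```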
This will give jazz.mp3 FULL permissions for anyone. Each number corresponds to user, group or other, and each number tells Linux what that class of users can do by adding up the permission bits:
• 4: read bit
• 2: write bit
• 1: execute bit
Introducing chown
We’ve just learned how we might change access permissions on a file. How
could we change the group and owner of a file? Recall that ls -l gives us back
the user and group in the third and fourth columns:
stetson@linux_rocks:~/important_files$ ls -l
total 28
-rw-rw-r-- 1 root root 13 Feb 16 22:47 AWS_Users.pdf
---x--x--x 1 stetson stetson 13 Feb 16 22:47 TPS_Reports.pdf
-rw-rw-r-- 1 stetson stetson 13 Feb 16 22:47 file2.jpeg
drwxrwxr-x 2 stetson stetson 4096 Feb 16 22:58 folder_one
-rw-rw-r-- 1 news news 13 Feb 16 22:47 house_inspection.jpeg
-rw-rw-r-x 1 stetson stetson 12 Feb 16 22:46 jazz.mp3
-rwxrwxrwx 1 stetson stetson 12 Feb 16 22:46 rock_n_roll.mp3
stetson@linux_rocks:~/important_files$ sudo chown alice rock_n_roll.mp3
[sudo] password for stetson:
stetson@linux_rocks:~/important_files$ ls -l
total 28
-rwxrwxrwx 1 alice stetson 12 Feb 16 22:46 rock_n_roll.mp3
Above, we used chown to set alice as the user owner of the file rock_n_roll.mp3. Observe in the listing that alice is now the owner of the file.
How can we change the group?
You can use the chgrp command (change group):
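The book's example here changes the group to root with sudo chgrp root rock_n_roll.mp3. A runnable sketch that doesn't need sudo uses your own primary group instead:

```shell
touch rock_n_roll.mp3

# Change the group owner; we use the current user's own group so the
# command works without sudo (the example in the text used the root group)
chgrp "$(id -gn)" rock_n_roll.mp3

ls -l rock_n_roll.mp3
```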
Alice is now the owner of the file and the group root owns the file.
One more trick and we’ll move on. If we want to set the group and the user at
the same time, we can use chown.
Check it out, just give the user and then the group separated by a colon:
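The example in the text was sudo chown bob:stetson rock_n_roll.mp3. A sketch that runs without sudo uses your own user and group:

```shell
touch rock_n_roll.mp3

# user:group in one shot; swap in real names like bob:stetson when running as root
chown "$(id -un):$(id -gn)" rock_n_roll.mp3

ls -l rock_n_roll.mp3
```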
bob is now the user owner and stetson is the group owner.
Flying Around the Command Line with Shortcuts
If you’ve ever watched a seasoned systems administrator at work on the command-line, you’ll notice that they seem to fly around it with ease. They’re able to delete words quickly, move their cursor to the front and back of commands and quickly replace text. How do they do it? They’re most likely using shortcuts. We can’t usually use the mouse in the Linux CLI, so shortcuts help us get around quickly.
Let’s try a few shortcuts out. It’s helpful if you type some text on your shell
first, so that we can try these examples. Anything will work, maybe a sentence or
two about your day or even a real world command like sudo apt-get install
nginx. Go ahead and type your sentence, but don’t hit enter yet:
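Here are a few of the classics; these are the standard bash/readline bindings, so they work in most shells:
• Ctrl+A: jump to the beginning of the line
• Ctrl+E: jump to the end of the line
• Ctrl+W: delete the word before the cursor
• Ctrl+U: delete everything before the cursor
• Ctrl+K: delete everything after the cursor
• Alt+B / Alt+F: move backward and forward one word at a time
• Ctrl+R: search back through your command history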
There are more Linux shortcuts, and even more that you can custom-code yourself, but knowing these few will have you flying around the shell like a Shell Samurai in no time.
file
The file command can quickly tell us what type of file a given file is. Here’s a
few examples:
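A few examples; the files here are created on the spot, so your output may differ slightly:

```shell
# Make a couple of files to inspect
echo "hello" > notes.txt
touch empty_file

file notes.txt    # reports ASCII text
file empty_file   # reports empty
file /bin/ls      # reports an ELF executable on most Linux systems
```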
file determines the file type and sometimes gives back even more detail. For a .jpg photo I took on my iPhone, it returns the phone model, resolution and more. The -z flag even tells file to look inside compressed files, for example: file -z foo.zip
rm — Proceed with extreme caution
We’ve so far learned all sorts of ways to create files, copy them and move them around. How can we remove a file? Simple! Just use rm (short for remove):
stetson@linux_rocks:~/more_files$ touch jazz.mp3
stetson@linux_rocks:~/more_files$ ls -l
total 0
-rw-rw-r-- 1 stetson stetson 0 Feb 18 09:18 jazz.mp3
stetson@linux_rocks:~/more_files$ rm jazz.mp3
stetson@linux_rocks:~/more_files$ ls -l
total 0
stetson@linux_rocks:~/more_files$
Above, I create jazz.mp3 with the touch command. Then, I list it to show
that it exists. Once I’m tired of listening to jazz, I decide to delete the file with rm
jazz.mp3. Try creating and deleting some random files on your system now.
A word of caution! Linux isn’t like other operating systems. There is no trash can or recycle bin for deleted files. When you delete them, they’re gone for good. However, some files and directories are write-protected to guard against accidental deletion; rm will ask you to confirm before deleting them:
stetson@linux_rocks:~/more_files$ rm jazz.mp3
rm: remove write-protected regular empty file 'jazz.mp3'? y
If you’d just like to delete it instead without being asked, supply the -f or
force flag. It should work if you have the right permissions:
stetson@linux_rocks:~/more_files$ rm -f jazz.mp3
stetson@linux_rocks:~/more_files$
If you’d like to write-protect your own files, we can use the chmod tricks we picked up in the previous section. Use chmod a-w file.txt to remove the write permission for everyone, including yourself; rm will then ask before deleting the file.
To delete many files at once, we can use wildcards or the asterisk to specify
what to delete. This is very dangerous, but hey, you can still do it!
Check it out, let’s blow away all the files inside the secret_files directory
now:
stetson@linux_rocks:~/secret_files$ ls
AWS_Report.pdf Classified_Documents.pdf Home_Inspection.pdf
My_dogs.png jazz.mp3 rock_n_roll.mp3
stetson@linux_rocks:~/secret_files$ rm *
stetson@linux_rocks:~/secret_files$ ls
stetson@linux_rocks:~/secret_files$
If instead we wanted to delete all .pdf files, we could use rm *.pdf. This feature is called shell globbing, which is used to match and expand patterns.
Be very careful, and don’t use root unless absolutely necessary. As a regular user you won’t be able to delete most important system files; only root can.
Deleting Directories
Deleting directories isn’t hard, but we’ll need to use the -r flag (recursive) if
there’s any files or other directories inside the directory:
stetson@linux_rocks:~/more_files$ ls
secret_files
stetson@linux_rocks:~/more_files$ rm secret_files/
rm: cannot remove 'secret_files/': Is a directory
stetson@linux_rocks:~/more_files$ rm -r secret_files/
stetson@linux_rocks:~/more_files$ ls
stetson@linux_rocks:~/more_files$
If a directory is already empty, we can also remove it with the rmdir command:
stetson@linux_rocks:~/more_files$ ls
stetson@linux_rocks:~/more_files$ mkdir empty_directory
stetson@linux_rocks:~/more_files$ ls
empty_directory
stetson@linux_rocks:~/more_files$ rmdir empty_directory
stetson@linux_rocks:~/more_files$ ls
stetson@linux_rocks:~/more_files$
Helping Yourself with help, man and tldr
When you want to learn more about a command, you can use man with ls or any other command we’ve learned to view its manual page. For example: man ls
You’ll notice that when you use the man command, your entire screen may be
taken up with text! Man is using the less command to read text. It’s a little
intimidating, but you can navigate it easily with some practice. Simply use the
up and down arrow keys (or j and k keys) to browse up and down. If you want
to leave the man page, press q and you’ll return back to your friendly and
welcoming shell.
DESCRIPTION
man is the system's manual pager. Each page argument
given to man is normally the name of a program, utility or function.
The manual page associated with each of these arguments is then
found and displayed. A section, if provided, will direct man to
look only in that section of the manual. The default action is to
search in all of the available sections following a pre-defined order
(see DEFAULTS), and to show only the first page found, even if the page exists in several sections.
The table below shows the section numbers of the manual followed by
the types of pages they contain.
What the heck? Why are there so many categories? We didn’t provide a
section number for our previous man commands like man ls or man man. What’s
going on here?
Not all commands have an entry in every section; man pages generally only include the sections that are appropriate to that command. When you type man and a command, man will search through the sections until it finds a match and then return that page. If you’d like to see which sections a specific command has an entry in, you can provide the -f flag, as in man -f open. The pages themselves live on disk under /usr/share/man, with one directory per section:
ls /usr/share/man/man2/
You can also work backwards and look up the files for each man page with
man -wa <command>:
stetson@linux_rocks:~$ man -wa open
/usr/share/man/man2/open.2.gz
stetson@linux_rocks:~$ man -wa cat
/usr/share/man/man1/cat.1.gz
Ok, we’ll leave it here! To be quite honest, I never use man page sections myself. I generally just use man without a section number and read through whatever documentation I need that way. I wanted to explain manual sections because they’ve always been a confusing topic and I didn’t want to leave you missing out!
TLDR
One more option for reading docs I love is a package called tldr. Sometimes,
man pages are super long. Sometimes we just don’t have time to read all of that!
(Ironically, if we did, we probably wouldn’t need them as frequently).
Introducing TLDR. TLDR is short for Too Long, Didn’t Read. It’s a community-maintained set of help pages that give just the bare essentials needed to use a command. If you’re on Ubuntu, you can install it with sudo apt-get install tldr
Official Documentation
Two other great resources are the Official Ubuntu Documentation and the
Debian Handbook. These online resources really go deep and into more than just
commands, but they’re worth knowing about and perusing.
Together with help, man, tldr and online documentation, you should be
able to learn the inner-workings of any Linux command! Helping yourself is the
Shell Samurai way!
Installing Packages and Programs
Introducing…Package Managers
Linux and any other operating system would be pretty useless if we couldn’t
install or update software. Luckily, there’s a solution for this built right in. We’ll
explore how to use a package manager to install system software. Although
software is usually installed via a package manager, it can also be compiled from
source, too. We’ll cover that in the black belt section of the book.
Depending on the distribution of Linux you’re using, there will be a different package manager for installing software. Debian (.deb) and Red Hat (.rpm) packages are the two most popular package formats. Other formats exist too, used by distributions like Gentoo and Slackware. Debian packages are used by distributions like Debian, Ubuntu and Linux Mint. Red Hat packages are used by Fedora, CentOS, RHEL, SUSE and others.
Package Managers are configured to download and install their packages
from a repository. Linux distributions are already pointing at software repositories
that the distributors have configured. These are web servers that are well-known
and contain multiple versions of the software. If you’re on a Debian system like
Ubuntu, you can take a look at those repositories at /etc/apt/sources.list
Install a Package!
Ok, now we are all set to actually install our first package. We’ll use the command apt-get install <package-name>. Let’s try installing nmap, a network scanning package. Remember to use sudo! Our package manager will show us the dependencies it needs to install and the size of the new package. Go ahead and hit Y and then enter to accept the install.
0 upgraded, 7 newly installed, 0 to remove and 0 not upgraded.
Need to get 5668 kB of archives.
After this operation, 26.9 MB of additional disk space will be
used.
Do you want to continue? [Y/n] Y
I’ve omitted the full output here, but you’ll see APT doing its thing and
unpacking and setting up the package. If you don’t see any error messages, the
install should’ve worked fine. Confirm by running which nmap or running the
nmap command. If you’re feeling up for a real challenge, read the nmap
documentation and use it to scan your local network!
Hot Tip : apt can be used in place of many apt-get commands. Instead of running apt-get install nmap above, we could have performed apt install nmap. So what’s the difference? apt was introduced to be a bit more user-friendly than apt-get. apt has fewer configuration options and is considered a bit easier to use, while apt-get has more configuration options and offers some lower-level control. For basic commands like installing packages, it really doesn’t matter which you use!
There are a ton of packages available. How can we search through the haystack for packages to install on our systems? Easy, just use apt search <search_query>. You’ll notice that this searches through both the description and the name of the package. If you’d instead like to search only package names, use apt search --names-only <search_query>
Libraries are collections of pre-written code that perform specific tasks.
Dependencies are just software that is needed by another piece of software to
work. Dependencies could also be device drivers or other software packages.
If you’ve downloaded a .deb package file directly, you can install it with dpkg:
root@linux_rocks:~# dpkg -i my_package_to_be_installed.deb
What about if we’d like to see all the packages currently installed? Easy.
root@linux_rocks:~# dpkg -l
Hot Tip : This book has been covering Debian-based Linux distributions like Ubuntu. Red Hat or .rpm based distributions have a similar package management process to .deb systems. However, instead of dpkg doing the work and APT enhancing it, rpm (the Red Hat Package Manager) does the bulk of the work and yum (Yellowdog Updater, Modified) enhances it.
Compressing Files and Uncompressing Them
You’ve probably worked with compressed files before; formats like .zip have been used to zip up songs, movies and office documents for years. Linux can unzip those types of files, but it also has its own built-in utilities and ways of compressing and archiving files.
We should note an important distinction between compression and archival. Compression is the act of encoding a file so it takes up less space. Archival is the process of creating a single file that contains one or more files and directories. Yes, compression and archival are often used together, but they serve two different purposes.
When working with compressed files in Linux, we’ll use the gzip and tar tools for our compression and archival needs. There are some other utilities like zip, unzip, compress and bzip2, but tar and gzip are the most common tools for compression and archival!
If we’d like to unzip the file and remove the .gz extension, simply use gunzip:
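A quick round trip with gzip and gunzip; jazz.txt is a throwaway file for the demo:

```shell
echo "take five" > jazz.txt

gzip jazz.txt       # compresses in place, leaving jazz.txt.gz
ls                  # jazz.txt is gone, jazz.txt.gz remains

gunzip jazz.txt.gz  # decompresses and removes the .gz extension
cat jazz.txt        # the original contents are back
```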
"
output (this means we get output) and specify a filename. Be sure to use the .tar
extension for your file name!
Notice that tar keeps the original files intact and creates a new archive. Next, I’ve deleted the original files so we can prove the archive still holds them.
Now, we can extract the contents of the tar file with the x, v and f flags to
extract, be verbose and specify the file name:
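For example, rebuilding the archive and then restoring the deleted originals from it:

```shell
# Build an archive, delete the originals, then restore them
touch important_file secret_documents
tar cvf my_archive.tar important_file secret_documents
rm important_file secret_documents

# x = extract, v = verbose, f = archive file name
tar xvf my_archive.tar
ls   # the files are back
```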
Above, we see that our tar command shows each file that was extracted right
after we run the command.
Hot Tip : You will sometimes hear .tar files referred to as a tarball. This is an
informal term coming from the name of the tar utility together with ball, which
refers to bundling or balling up multiple files together.
All Together Now — Compression and Archival
Compressed tar files have the extension .tar.gz, implying that we’ll need to first gunzip to remove the .gz extension and then use tar to un-archive the files. We work from the outside in.
Indeed, we can use the tools in order, first gunzip and then tar:
stetson@linux_rocks:~$ ls
tar_file.tar.gz
stetson@linux_rocks:~$ gunzip tar_file.tar.gz
stetson@linux_rocks:~$ ls
tar_file.tar
stetson@linux_rocks:~$ tar xvf tar_file.tar
important_file
secret_documents
moon_landing_photos.png
stetson@linux_rocks:~$ ls
important_file moon_landing_photos.png secret_documents
tar_file.tar
An easier option is to use the z flag with tar, which tells tar to use gzip/
gunzip to compress and un-compress, saving us a step:
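For example, both directions in one step each:

```shell
touch important_file

# Compress Zee Files: create a gzipped tarball in one step
tar czf tar_file.tar.gz important_file
rm important_file

# eXtract Zee Files: uncompress and un-archive in one step
tar xzf tar_file.tar.gz
ls
```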
Tar can be a bit confusing to remember all the options for! Try this trick. To compress files: Compress Zee Files. To uncompress: eXtract Zee Files. tar czf and tar xzf. Simple. Should your life depend on extracting a file with tar, you’ll want to remember this.
The Environment
Let’s jump right into trying a few exercises and learn about the environment.
Type the following commands in your shell and observe the output:
echo $USER
echo $HOME
These values are called environment variables. To see them all, go ahead and
type env in your shell now. You’ll see a bunch of values come back!
Environment variables are used for programs and scripts running on the
system to learn about the environment in which they’re running. Environment
variables might also be used to set the location of a remote file server, set API
keys, or locations of other files and programs.
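For example, setting and reading a variable of our own; BACKUP_SERVER and its value are made up for illustration:

```shell
# Set a variable for this shell session
export BACKUP_SERVER="backups.example.com"

# Any program we launch from this shell can now read it
echo "$BACKUP_SERVER"
```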
We’ll now look at another important environment variable, called PATH.
This environment variable, PATH, is used to specify where Linux should look
for executable files any time a shell command is entered. The order of directories
is the order that bash will look for the command that you want to run. Each path
is separated by a colon. You might update your PATH if you download a
program and it’s installed in a non-standard location on your system. An
indicator that you might need to update PATH is if bash returns a command not
found error.
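For example, here /opt/myapp/bin is a hypothetical install location:

```shell
# See the current search path, one directory per colon-separated entry
echo "$PATH"

# Append a new location (this only lasts for the current session; put it
# in .bashrc to make it permanent)
export PATH="$PATH:/opt/myapp/bin"
echo "$PATH"
```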
whereis and which
Two commands can help us find where an executable lives: whereis and which. Try them on a few programs:
stetson@linux_rocks:/$ whereis python3
python3: /usr/bin/python3 /usr/lib/python3 /etc/python3 /usr/share/python3
stetson@linux_rocks:/$ which python3
/usr/bin/python3
stetson@linux_rocks:/$ whereis ls
ls: /usr/bin/ls
stetson@linux_rocks:/$ whereis cat
cat: /usr/bin/cat
In the example above, whereis returns all of the paths it finds for the python3 executable, while which outputs the single executable path that is actually used when you run the command. which and whereis are a great help for troubleshooting commands and executables!
Aliases - Not just for Criminals, Spies and Detectives
Along your shell journey, you’ll find that there are certain commands you begin to type over and over again that are long and tedious. The Linux Shell Samurai mindset does not conform to the will of such tedious commands! Wouldn’t it be nice if there was some way to instead use a shortcut and type way less?
You’re in great luck, friend: aliases do just that!
Let’s write our first alias. Enter the following commands at your shell:
stetson@linux_rocks:/tmp$ alias sayhello="echo Hello Shell
Samurais!"
stetson@linux_rocks:/tmp$
stetson@linux_rocks:/tmp$ sayhello
Hello Shell Samurais!
An alias is a shortcut to a short snippet of shell code. Above, we set our alias
sayhello to run the echo command along with some text. Then, we can call
sayhello any time we’d like and get the same echo output. Usually aliases do a
lot more than just echo some text to the command-line. They might search and
parse through history or run a backup job.
One problem with our alias: when our shell session ends (or we reboot), we’ll lose it!
If we decide our alias is pretty decent and worth keeping around, we’ll need
to make sure it’s loaded again. To do this, we can use the .bashrc file in our
home directory. .bashrc is a hidden file that’s executed every time we login to
our shell. It’s commonly used for setting up shell configurations like colors,
completion and aliases!
Go to your home directory if you aren’t already there. You can use the tilde ~
shortcut to cd to your home directory with cd ~. If you list all files and use the
-a flag to show hidden files, you will see a .bashrc file.
Use your favorite text editor to edit the .bashrc file and add your own alias now. You can use the same format as above and put any Linux command inside the quotes to the right of the equals sign:
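For example, reusing the sayhello alias from earlier, the line in your .bashrc would look like this:

```shell
alias sayhello="echo Hello Shell Samurais!"
```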
Save the file. Next, you’ll need to re-load the .bashrc file for it to take effect in your current shell. You can do so with source ~/.bashrc
Remember that bash loads .bashrc on every new login; we are just forcing it to load again after making our changes. Now, try running your new alias, sayhello!
4. Black Belt Command Line Skills -
Output Redirection, Text and More
In this chapter, we will dive deeper into the world of Linux and explore advanced topics that will help you become a Shell Samurai. From managing processes and services on your system to working with system directories like /proc, you will learn the ins and outs of how Linux works behind the scenes. We will also cover useful tools like strace and jq, which can help you debug and process API information more effectively. In addition, we’ll learn how to schedule tasks with cron, access remote systems with SSH, and work with APIs using curl. So buckle up and get ready to become a Linux black belt!
Compiling Code from Source
Let’s compile our first piece of Linux software. We’re going to download and compile GNU Hello. Hello is a simple piece of software written in C, built exactly for the purpose of teaching beginners to compile code.
Before we get started, we’ll need the build tools that will allow our system to
compile the code. Go ahead and run sudo apt install build-essential to
install these now. Build-essential includes a C compiler, make and a few other
utilities.
Next, make sure you have wget installed with sudo apt-get install wget. Wget is a downloader tool we’ll use to download the Hello source.
Now navigate to the GNU Hello home page. Scroll down to download and
click the HTTPS link. You’ll see a ton of pairs of files named similarly:
hello-2.12.tar.gz and hello-2.12.tar.gz.sig. 2.12 is the latest version as of this book’s
writing. The file ending in tar.gz.sig is a signature file used to verify that the
source code hasn’t been tampered with. We want the file ending in tar.gz. Scroll
down to the latest version of the Hello source code and copy the link.
From our terminal, run wget to download the compressed source code:
stetson@linux_rocks:~$ wget https://ftp.gnu.org/gnu/hello/hello-2.12.tar.gz
Resolving ftp.gnu.org (ftp.gnu.org)... 209.51.188.20,
2001:470:142:3::b
Connecting to ftp.gnu.org (ftp.gnu.org)|209.51.188.20|:443...
connected.
HTTP request sent, awaiting response... 200 OK
Length: 1017723 (994K) [application/x-gzip]
Saving to: 'hello-2.12.tar.gz'
The source code is compressed as a tar.gz formatted file. Recall that we can
uncompress tar files with tar -zxvf hello-2.12.tar.gz. Go ahead and
uncompress the file, then cd into the directory that was created with cd
hello-2.12 or whatever the version is that you downloaded.
There are a few README files inside this directory; we can check them out with the less command. The file called INSTALL tells us how to compile the code. It’s always a good idea to read the documentation that ships with source code.
The first thing we’ll run inside this directory is the configure shell script. Feel free to read any of the files we’re running with less or cat. We’ll run configure with ./configure. Configure will check for dependencies on our system and throw an error if we’re missing anything. Configure also generates a Makefile, which is a script that defines how the software should be compiled and installed.
After running configure, we run make. The make command uses Makefiles to
understand how to build the software. We installed make with the build tools
earlier. Go ahead and run make now. You’ll see a flurry of text fly by your screen.
If you don’t see any scary looking errors, it’s a good sign that make ran correctly.
Verify by running ls -ltr. This command lists files sorted by modification time, newest last, so you should see a freshly built hello file at the bottom of the list. Go ahead and test it works by running ./hello:
stetson@linux_rocks:~/hello-2.12$ ./hello
Hello, world!
We compiled the code and ran it, but to complete the process we’ll run sudo
make install. This command installs the compiled software onto your system
so that you don’t have to run it from the directory you downloaded the code to
every time. Run sudo make install and then hello without the ./ at the
beginning:
stetson@linux_rocks:~/hello-2.12$ hello
Hello, world!
Hello was installed to our system! You can verify this by running which
hello. This shows us the path that the command hello is running from:
& "
source!
Hot Tip : Be cautious about installing with make install. Compiling code from source may install other files we don’t know about. Instead of using make install, you can sudo apt-get install checkinstall and run checkinstall in its place to keep track of all the files created by the install script. It’ll even generate a .deb package file. This makes uninstalling the compiled source easier. Take a look at the docs for checkinstall.
Managing Processes
What is a process?
A process is simply an instance of a running program. Multiple processes run
on Linux at the same time and the kernel manages and oversees them all!
Each process has a unique ID, called the PID (Process ID) that identifies it. You
can list running processes with the ps command. The process id is assigned in
sequential order as processes are created.
At your shell, run the ps command to see the currently running processes on
your system:
stetson@linux_rocks:~$ ps
PID TTY TIME CMD
5030 pts/0 00:00:00 sh
5031 pts/0 00:00:00 bash
5034 pts/0 00:00:00 ps
ps, or process status gives a very plain and simple overview of currently
running processes. In order from left to right:
• PID: Process ID
• TTY: The controlling terminal of the process
• TIME: Total CPU time used
• CMD: Name of the command or executable that spawned the process
Our ps output is only giving us a few processes back; how come? By default, ps only shows processes owned by the current user and attached to the current terminal, so we need to provide additional flags to see everything that’s running. A very popular command is ps aux, which will show us processes belonging to every user. Note that we don’t give ps aux a dash - before the aux flags. The reasoning for this goes back to an older, BSD-style flag syntax. For more info on this, check out this StackExchange post.
Back to ps aux. The a flag tells ps that we’d like to see all processes, including other users’ processes. The u flag gives us additional details. The x flag shows processes that don’t have a controlling TTY, which are usually daemons running in the background or started at system boot.
Try out ps aux now in your terminal:
stetson@linux_rocks:/var/log$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME
COMMAND
root 1 0.0 1.7 167772 8132 ? Ss 2022
6:29 /lib/systemd/systemd --system --deserialize 39
root 2 0.0 0.0 0 0 ? S 2022 0:00
[kthreadd]
root 3 0.0 0.0 0 0 ? I< 2022 0:00
[rcu_gp]
root 4 0.0 0.0 0 0 ? I< 2022 0:00
[rcu_par_gp]
root 5 0.0 0.0 0 0 ? I< 2022 0:00
[netns]
root 7 0.0 0.0 0 0 ? I< 2022 0:00
[kworker/0:0H-events_highpri]
root 9 0.2 0.0 0 0 ? I< 2022 238:40
[kworker/0:1H-events_highpri]
We now see a ton of output! No need to memorize everything, but it’s good to
be familiar with some of these columns:
• USER: The user who owns the process
• PID: Process ID
• %CPU: The process’ share of CPU time as a percentage
• %MEM: The process’ share of physical memory on the system
• VSZ: - Virtual Memory Usage of the Process in KiB
• RSS: Resident Set Size - non-swapped physical memory a process is using
in KiB
• TTY: Controlling Terminal
• STAT: Process’ state. For example: Z - Zombie, S - Sleeping, R - Running
• START: Starting time of the process
• TIME: cumulative used CPU Time
• COMMAND: The command that launched the process
Another essential process tool is top, which gives a live, continuously updating view of processes and overall system load. Try it now:
stetson@linux_rocks:/home$ top
top - 07:36:40 up 8 days, 7:55, 1 users, load average: 0.00, 0.00,
0.00
Tasks: 12 total, 1 running, 11 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.4 us, 0.2 sy, 0.0 ni, 99.2 id, 0.0 wa, 0.0 hi, 0.1
si, 0.0 st
MiB Mem : 1988.8 total, 111.1 free, 399.4 used, 1478.2 buff/
cache
MiB Swap: 1024.0 total, 961.5 free, 62.5 used. 1164.6 avail
Mem
4415 www-data 20 0 50864 1488 1372 S 0.0 0.1 0:00.00
nginx
4416 www-data 20 0 50864 1492 1372 S 0.0 0.1 0:00.00
nginx
18522 root 20 0 3884 3064 2568 S 0.0 0.2 0:00.74
bash
20611 root 20 0 12220 3468 3184 S 0.0 0.2 0:00.02
sshd
29968 root 20 0 6640 2860 2440 S 0.0 0.1 0:00.07
su
29969 stetson 20 0 2060 452 384 S 0.0 0.0 0:00.00
sh
29970 stetson 20 0 3884 3100 2588 S 0.0 0.2 0:00.05
bash
29979 stetson 20 0 5724 2648 2224 R 0.0 0.1 0:00.04
top
Here’s a breakdown:
0.4 us: 0.4% user processes.
0.2 sy: 0.2% system (kernel) processes.
0.0 ni: 0.0% niced processes.
99.2 id: 99.2% idle.
0.0 wa: 0.0% wait.
0.0 hi: 0.0% hardware interrupts.
0.1 si: 0.1% software interrupts.
0.0 st: 0.0% steal time.
The fourth and fifth lines show memory usage in MiB:
Top supports a few keyboard shortcuts for sorting each process by pressing
Shift and one of the following keys:
N - Sort by the PID column
M - Sort by the %MEM column
P - Sort by the %CPU column
T - Sort by the TIME+ column
Htop is another popular package for viewing process usage. The name comes
from the author’s name, Hisham H. Muhammad or Hisham’s top. htop
provides a bit more readable interface for viewing process utilization.
root 31333 0.0 0.0 2844 652 pts/3 S+ 07:56 0:00
grep --color=auto nano
We get back the nano process we’re looking for, plus an extra line for the grep command we just ran!
In your same terminal window, run kill <process-id>. My nano process id
is 31337. Verify that the process has been killed by grepping for it again with ps
aux | grep nano. We just sent a kill signal to the process and told it to
terminate!
Now run nano again, this time with an ampersand at the end: nano &. This time, you won’t get a text editor, but instead two numbers. The first is a job id: 1. The second is our process id: 31340. We’ve just sent nano to run in the background, hidden away from our command-line interface. If you run ps aux | grep nano you can find nano still running as a process:
Run the command jobs and we’ll see a list returned with nano:
stetson@linux_rocks:/$ jobs
[1]+ Stopped nano
Let’s get back control of nano. Run fg %1 or simply fg for foreground and
you’ll have the text editor window back. We run fg with the %1 argument to tell
fg we want control of job id 1. Since nano is the only job that is running, we can
also just run fg without arguments.
Hot Tip : The ampersand & isn’t the only way to get a process out of your way. You can also press Control+Z on your keyboard, which sends a SIGTSTP signal to the process, suspending it and returning you to your prompt. From there, bg resumes it in the background and fg gives you back control. Give it a try now!
Sending a process to run in the background can be helpful when you don’t
have the process managed by systemd and want to get some log output from it
or observe its behavior.
Processes vs Threads
When talking about processes, you’ve no doubt heard of the term threads or
multi-threading. In Linux, processes and threads are both abstractions used by the
system to execute code and allocate resources. There are a couple of differences
between the two concepts in how they are structured, share resources and are
isolated:
Process:
• Has its own isolated memory space
• Heavier-weight to create and switch between
• Communicates with other processes through inter-process communication (IPC)
Thread:
• Lives inside a process and shares that process’ memory space
• Lighter-weight to create and switch between
• Can communicate with other threads through shared memory
stdout, stdin and stderr
Think of the physical cables that connect devices to a computer: for example, you type a key and the keyboard’s cable carries your input to the computer. Standard streams take this idea of physically connected cables and apply the same design to programs. Data goes from process to process via standard streams.
With these concepts in mind, let’s take a look at output redirection.
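Try it yourself; the quote text here is just an example:

```shell
# Send echo's standard output into a file instead of the screen
echo "The shell is mightier than the sword" > quote.txt

cat quote.txt   # the text landed in the file
```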
Wow! The text was placed in the quote.txt file. How? Output redirection! We used the > redirection operator to take the standard output of the echo command and redirect it into the quote.txt file. Instead of echo printing to our shell, it sent its stdout to the file.
By the way, if you were wondering if you can redirect the output of cat to a
file, you totally can. Check it out:
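For example:

```shell
echo "first file" > a.txt

# cat's stdout is redirected into a brand new file
cat a.txt > b.txt
cat b.txt
```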
One thing to keep in mind is that using the > output redirection command
will overwrite a file if it already exists. If the file doesn’t exist, it’ll be created. Try
overwriting the quote.txt file now:
If instead you’d like to append to the file so that it doesn’t get overwritten, you can simply use two of the output redirection operators (>>). Try it now and append to the quote.txt file:
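For example:

```shell
echo "line one" > quote.txt    # > overwrites (or creates) the file
echo "line two" >> quote.txt   # >> appends instead of overwriting
cat quote.txt                  # both lines survive
```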
Standard In - stdin
Just as we have stdout streams, we also have stdin streams at our disposal.
This is the standard stream that carries input data into programs. Interactive
programs like text editors read from stdin. Stdin is connected to the keyboard
by default.
From our quote.txt file, let’s change it up a bit and flip our output redirection
operator the other way and use <. The < operator indicates stdin redirection.
Here’s an example:
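A sketch, assuming a quote.txt that mentions the word shell:

```shell
echo "the shell is your friend" > quote.txt
grep shell < quote.txt   # prints the matching line: the shell is your friend
```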
Here, we redirect the contents of our quote.txt file to grep instead of grep
waiting for keyboard input. Grep highlights shell, since it found it in the file.
Stdin redirection can be used with tons of commands! Onwards to stderr.
Where is our output? Our redirect operator > only redirects stdout. Stderr
outputs to the screen, but we can’t redirect it like this. We’ll have to do something
else.
Instead, we'll use file descriptors. We'll use the 2> operator instead of > to
catch the standard error.
Hot Tip : File descriptors are unique identifiers that represent an open
file or an input/output stream. When we use > to redirect output, it's the
equivalent of writing 1>, because 1 is the file descriptor for stdout. Below, we
instead use 2>, the descriptor for stderr. Stdin's file descriptor is 0.
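A sketch of catching stderr, assuming the path really doesn't exist on your machine:

```shell
# The complaint from ls goes to stderr (fd 2), so 2> is what captures it;
# a plain > would have left the file empty
ls /tmp/does_not_exist 2> error.txt
cat error.txt   # something like: ls: cannot access '/tmp/does_not_exist': No such file or directory
```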
Nice! Now we’ve got the error message output to our file. This gave us the
error, but if there was any standard output, we wouldn’t see that.
To remedy this, we can use:
2>&1
Let’s break it down. 2> will redirect the stderr stream somewhere. The new
part, &1 will specify that our stderr stream should be the same as the destination
of stdout. We use 1 because it is the file descriptor for stdout.
Here’s an example:
stetson@linux_rocks:~$ ls /tmp/samurai > output.txt 2>&1
stetson@linux_rocks:~$
stetson@linux_rocks:~$ cat output.txt
/tmp/samurai
The order is important here. We run a command and redirect its output to
output.txt. Then with 2>&1 we tell stderr that it should output to the same place
that stdout is outputting. That place is output.txt here.
In conclusion, understanding and utilizing stdin, stdout, and stderr are crucial
for effective Linux usage. Although grasping these concepts might be challenging
and abstract initially, experimenting with various examples and redirection
techniques will help you become proficient in handling standard streams.
Symlinks and Hardlinks
If you’ve used Windows before, you’re probably familiar with the concept of a
shortcut. They’re essentially links to other files on the system. Linux has a similar
concept called symlinks or symbolic links. A symlink acts as a shortcut or
reference to another file or directory on the system. There is also the concept of a
hard link, which is a link to an inode. You might remember that an inode is a data
structure used by the filesystem to store metadata about a file or directory.
When a program or user attempts to access a symlink, they are redirected to
the original file or directory’s “real” location on disk. Symlinks are extremely
handy for creating shortcuts to frequently used files and directories.
Let’s try a quick example. From your shell, create a new file with echo “hello
Shell Samurais!” > /tmp/shell_samurais.txt. Next, create a symlink with ln
-s /tmp/shell_samurais.txt ~/my_first_symlink. The first argument is the
file we’re linking to, and the second argument is the symlink name.
If you list the directory contents of your home directory at ~, you’ll see an
arrow -> indicating that my_first_symlink points to /tmp/shell_samurais.txt.
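On my machine the listing looks roughly like this (your user, date and size will differ):

```shell
ls -l ~
# lrwxrwxrwx 1 stetson stetson 23 Mar  4 01:34 my_first_symlink -> /tmp/shell_samurais.txt
```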
To drive this concept home, open up the symlink in your home directory with
your favorite text editor and delete the contents. Then, add some new text. I’ll
add Hello from VIM!!! - Stetson.
Cat the file in /tmp/shell_samurais.txt and you’ll see that the text has been
updated! Whenever a symlink is modified, the file that it points to is modified as
well. The reverse is also true. Modifying a file that a symlink points to will also
update the symlink.
For example, below I update the symlink in the /tmp directory and then view
the symlink at ~/my_first_symlink:
Hard links
Hard links are similar to symlinks, but not the same thing. Whereas a symlink
points to the path name of a file, a hard link points to the same physical data on
disk as the original file. It uses an inode to do this. Let’s create a hard link now.
We still use the ln command, but we don’t provide the -s flag.
Below, I create a hard link named my_first_hard_link in my home directory,
pointing to the same data as /tmp/shell_samurais.txt:
stetson@linux_rocks:~$ ln /tmp/shell_samurais.txt my_first_hard_link
You should also know the difference between how a symlink or hardlink
behaves when the file being pointed to is deleted. A symlink will not work any
longer. If you try to cat the contents, Linux will complain that the file doesn’t
exist. A hard link will still show the data, since it’s pointing to the physical data
on disk. The only way to delete the data completely is to remove the original file
and all links to it. Observe what happens when we delete /tmp/
shell_samurais.txt:
stetson@linux_rocks:~$ rm /tmp/shell_samurais.txt
stetson@linux_rocks:~$ cat my_first_hard_link
Hello from VIM!!! - Stetson
this was edited from tmp!
stetson@linux_rocks:~$ cat my_first_symlink
cat: my_first_symlink: No such file or directory
stetson@linux_rocks:~$ ls -l
total 4
-rw-rw-r-- 1 stetson stetson 56 Mar 4 01:38 my_first_hard_link
lrwxrwxrwx 1 stetson stetson 23 Mar 4 01:34 my_first_symlink -> /tmp/shell_samurais.txt
Another tip to keep in mind is that hard links cannot span file systems, since
inode numbers are only unique within a single file system.
This is all great to know, but when might we actually use a symlink or hard
link? Links are useful whenever the same file needs to be accessed from different
directories. Symlinks shine when you want a shortcut to a file or directory in
another location, or an alias for something you access frequently. Hard links are
handy when you need a second name for a file, or a "copy" in a different location,
without duplicating the data on disk (within the same file system).
Understanding the difference between these two types of links is important
for efficient file management and optimizing disk space usage. With this
knowledge, you can confidently create and manage links in your Linux system.
Chaining Commands in the Shell with ; and && Operators
Two very handy operators in the shell are the semi-colon (;) and the double-
ampersand && operators. These operators make it easier to execute multiple
commands in sequence or control their execution based on the success of
previous commands.
First, the semi-colon operator can be used to chain commands together that
would normally be all separate commands. Here's an example of how it works:
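A sketch, with directory and file names of my own choosing:

```shell
# Three commands on one line; each runs in order,
# whether or not the previous one succeeded
mkdir dojo; cd dojo; touch scroll.txt; ls   # prints: scroll.txt
```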
Next, the double ampersand (&&) operator allows you to execute the second
command only if the first command exits successfully. To see it in action, let's try
a command that produces an error:
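Something like this (assuming the path doesn't exist):

```shell
# The first ls succeeds, so echo runs too
ls /tmp > /dev/null && echo "ls worked!"   # prints: ls worked!

# This ls fails, so the echo after && is skipped
ls /tmp/does_not_exist && echo "you will never see this"
```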
Text Processing
Confirm your file was written by cat-ing it from your terminal or using less
to read it.
Nice Work! Nano is a great way to quickly edit text files, but it isn’t the text
editor we’d recommend for long-term use.
For that…we’ll talk about VIM.
Text Editing Like a Boss with Vim
The age-old debate between Vim and Emacs continues to divide text editor
enthusiasts. Both have been around for years, offering rapid editing speed and
extensive customization options. Advocates for each editor often passionately
argue for their preferred choice, leading to heated online debates.
That said, the choice is yours. Explore all your options and weigh out the
tradeoffs and decide what works best for you. For now, let’s focus on learning
Vim, our recommended text editor for Shell Samurai users.
What is Vim?
Vim is a popular text editor that builds upon the vi text editor, whose name
comes from "visual". Vim is "Vi IMproved" and it probably comes installed with
your system already. If not, we can install it with our trusty friend: apt-get
install vim.
Vim has a notorious reputation for being difficult to learn and even harder to
exit. Escaping Vim and getting back to the command line is so famously tricky
that it has become a running joke online!
Despite Vim’s learning curve, I think that it is worth learning since it can help
edit text files much more quickly and efficiently. Don’t feel like you need to learn
Vim all at once. Our goal in Shell Samurai is to get you to the bare minimum
proficiency and then guide you to learn more on your own time, as needed. We’ll
provide some resources at the end of this section that should help to take your
Vim journey to the next level. With that, let’s dive into Vim.
A Very Brief Introduction To Vim
You’ve installed Vim or it’s come with your system already. Start Vim with the
simple command, vim. Whoa! Immediately, we’ll see a screen with a bunch of
tildes along the left-hand side and some text containing the version and other
details:
How do we actually start editing text? If you just start typing characters,
likely nothing will happen or you’ll get weird errors. Before doing anything, type
the i character on your keyboard. This will get you into insert mode. You can
verify this by looking in the bottom left corner. Now, you can type text like
normal. Go ahead and type a few sentences of whatever you’d like. Make sure to
type multiple lines.
You can use the arrow keys while in “insert” mode to move around and use
delete as normal. Once you're happy with your text, hit the Escape key. The
-- INSERT -- text will disappear. To save the file, hit the : colon key on your
keyboard. You'll see the colon show up in the bottom left corner. Type write and
then a name to call the file. Hit enter. You've saved the file!
Now, we’ll quit Vim. Hit Escape again and type the colon character and then
quit! Like so :quit!
The exclamation mark indicates we want to close Vim without saving. We just
saved, so that’s fine. Hit enter and we’re returned back to our familiar prompt
again. Well done!
Back at the terminal, feel free to cat or less your file to verify it was written.
Quickly Moving to the Top, Bottom, Left and Right
Vim has a ton of shortcuts that let you edit text super quick. Let’s introduce a
few more. Put your cursor on the first character of a line of text. Now, type the $
symbol. You’ll see that the cursor has moved to the last character on the line.
Type the 0 key and notice that your cursor moves to the beginning of the line
again. Try moving back and forth a few times.
Let’s go the other direction now. Put your cursor anywhere on the first line of
text. Now, type capital G with Shift+G. If you have multiple lines of text, your
character will quickly shoot to the bottom of the text. Let’s go to the top again.
Type the lower-case g character twice. gg. One after another. Your cursor should
move back to the top of the text.
Like the HJKL keys, you might be thinking that this sounds insanely hard to
learn. It does take some practice and muscle memory, but it’s worth it in the
long-run. Try to stick with it.
but there's a shortcut for both of these. Write can be abbreviated as w and quit
can be abbreviated as q. We can also combine them! Make sure you're out of insert
mode, then type :wq and hit enter. You have just written and quit your file.
• vimtutor — This command comes with Vim. Open your terminal, type
vimtutor and hit enter. Vimtutor is interactive and runs right in your
terminal. If you follow along, you're bound to learn enough about Vim
to become a Vim ninja.
• Steps to Learn Vim - a short blog post with more Vim resources.
• Vim Adventures - an awesome interactive game for learning Vim, right in
your browser.
• Vim Genius - This short interactive course guides beginners through basic
Vim shortcuts and commands in the browser. No Linux install needed.
Pipe -- Not Just for Plumbers
Let’s look at an example now. We can ls a directory and we’ll get back a
bunch of text. Try running ls on the /etc directory now. Then, let’s make that
text a little easier to read and pipe it to less. Run this:
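The command looks like this:

```shell
# Page through the listing of /etc instead of letting it scroll past
ls /etc | less   # move with the arrow keys (or j/k), quit with q
```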
Now you can scroll through the files with the arrow keys (or j and k keys).
Remember, you can hit q to quit. We’ve just redirected the output of the ls
command to the input of the less command. Simple!
We can also use multiple pipe operators to keep chaining commands. We’ll
look at a little bit of an advanced use here, you might have not used all of these
commands yet, but stick with me:
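A sketch of the kind of pipeline described below:

```shell
# All processes -> sorted numerically on column 4 (%MEM, ascending) -> last 10 lines,
# i.e. the ten biggest memory users end up at the bottom
ps aux | sort -nk 4 | tail
```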
Above, we use ps aux to list all running processes on our system. We then
pipe that output to the sort command, telling it to sort on the 4th
column, which is memory usage. -k 4 selects the 4th column, while the -n flag
sorts numerically, in ascending order. Finally, we pipe the output to the tail
command, which displays the tail end of the input. Tail gives us the last 10
lines by default.
That was a pretty advanced use of pipe, so don’t feel intimidated if it was a bit
much to handle. The important thing to remember here is that we can use
multiple pipe operators. We could’ve kept chaining 10 or 50 pipes together. That
would’ve been a real doozy!
Oh, yeah, we can also use our redirect operator along with the pipe operator
together. One more quick example and we’ll move on:
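The same pipeline, with its result saved to a file name of my choosing:

```shell
# Pipe through sort and tail as before, then redirect the final output to a file
ps aux | sort -nk 4 | tail > processes.txt
cat processes.txt
```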
This is the same example as above, only we’re sending the output to the
processes.txt file at the end.
Pipe is a powerful operator! By using it, the output of one
command becomes the input of the next command in the chain. This enables us
to create complex command pipelines that perform multiple tasks efficiently,
without the need for intermediate files or manual intervention.
have only output to the file with no output on your screen. Tee offers an easy
way to view and store command output, which can be helpful for
troubleshooting and documentation purposes.
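A sketch of tee in action, with a file name of my own choosing:

```shell
# tee writes its input to the file AND passes it through to the screen
ls /etc | tee etc_listing.txt
cat etc_listing.txt   # the same listing, read back from the file
```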
grep
Grep is an essential tool for Shell Samurais to process and search for text. With
grep, we can search files and text for certain words or characters that match a
pattern we specify.
Let’s make a file for ourselves to use grep on:
We’re using printf here instead of echo because printf will interpret our \n
characters as a new line. Now, let’s search the file using grep:
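Along these lines (the file name and contents are my own invention):

```shell
# Build a small multi-line file, then search it
printf "the shell is a samurai's best friend\nswords are overrated\n" > samurai.txt
grep shell samurai.txt   # prints the matching line
# grep katana samurai.txt would print nothing -- no match in the file
```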
Grep quickly located the word in the file! If we searched for a word that didn’t
exist, we’d get no results back:
There are a few other uses of grep that are very helpful.
We can grep recursively in directories, so that we check directories within the
directory we specify:
We can even grep and ignore case sensitivity on our earlier example:
Lastly, we can pipe output to grep and search that! Here, I grep for security in
the /etc directory and find the security directory.
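Sketches of all three uses (samurai.txt is a file name of my own choosing):

```shell
printf "the shell is a samurai's best friend\n" > samurai.txt

grep -r security /etc 2>/dev/null   # search every file under /etc, recursively
grep -i SHELL samurai.txt           # -i ignores case, so this still matches "shell"
ls /etc | grep security             # filter another command's output through grep
```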
Grep is great to filter text, search for text strings and more! Grep even
supports regex, so we can find almost any text we want. Learn grep and become
an effective Shell Samurai.
Let's try some more advanced regex. Create a file called words.txt and add the
following content:
babble
bubble
bobble
We can now use [] brackets to search for characters that match inside. Here’s
an example:
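Like so, with words.txt created first so the example stands on its own:

```shell
printf "babble\nbubble\nbobble\n" > words.txt
grep "b[au]bble" words.txt
# babble
# bubble
```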
Our regex returned babble and bubble because we specified the a and u
characters inside the brackets. It did not return bobble, since "o" is not in the brackets.
We could use the ^ character inside our brackets to search for anything that
does NOT contain that character.
Let’s look for b[^u]bble now:
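Using the same words.txt:

```shell
printf "babble\nbubble\nbobble\n" > words.txt
grep "b[^u]bble" words.txt
# babble
# bobble
```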
We get babble and bobble, but NOT bubble, since we told grep we don't want
a "u" in that position.
cut
Cut is a text processing utility we can use to very quickly cut or chop text up
and output it to stdout. Let's dive into some examples now. Write the following
command in your shell:
echo -e "This is my sword\nThere are many like it\nBut this one is mine" > shell_creed.txt
Note, the -e flag tells echo to listen to backslash escapes, like the \n one we’ve
used to specify a new line. Then, we redirected the output to a new file called
shell_creed.txt.
The -c flag tells cut to output the character at a given position, so -c 1 gives
the first character of each line. Go ahead and try changing the -c flag's argument now:
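For instance, with shell_creed.txt recreated so the example stands alone:

```shell
printf "This is my sword\nThere are many like it\nBut this one is mine\n" > shell_creed.txt

cut -c 1 shell_creed.txt   # T, T, B -- the first character of each line
cut -c 4 shell_creed.txt   # s, r and a space -- the 4th character of each line
```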
Above, I’ve marked the space character with \s to show that cut will interpret
spaces and output them! The 4th character of the bottom line happened to be a
space.
What if we wanted the second word from every line? We can use the -f or
field flag with cut to ask for the second word in each line:
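You'd expect the second word back, but:

```shell
printf "This is my sword\nThere are many like it\nBut this one is mine\n" > shell_creed.txt

cut -f 2 shell_creed.txt
# This is my sword
# There are many like it
# But this one is mine
```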
Wait! That didn’t work. We just got back the whole file. Why? By default, cut
is set to look for TAB delimiters, so we’ll have to tell it that we want it to use
spaces instead. Do so now with the -d or delimiter flag:
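Like this:

```shell
printf "This is my sword\nThere are many like it\nBut this one is mine\n" > shell_creed.txt

# -d ' ' splits on spaces, -f 2 picks the second field of each line
cut -d ' ' -f 2 shell_creed.txt
# is
# are
# this
```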
Much better!
As you can see, cut is another text processing utility we can add to our tool
belt. As with any Linux shell tool, it becomes more powerful when it’s combined
with other commands via the redirect (>) or pipe operators (|).
At some point, you may find that you want to count the number of lines in a
file. It's simple to do so, just use the wc or Word Count command. Without any
arguments, wc returns the line count, word count and byte count, in that order.
Here’s an example:
stetson@linux_rocks:/tmp$ wc file.txt
8059 8059 32236 file.txt
If you only want the lines in a file, you can use the -l flag:
stetson@linux_rocks:/tmp$ wc -l file.txt
8059 file.txt
sed
Sed stands for stream editor and at this point, it's an ancient and
powerful tool. Sed was developed in the early 1970s at Bell Labs, the birthplace of
Unix. Sed is actually considered a programming language, and it's even
Turing-complete. It's just that powerful.
Let’s take a brief side quest to learn about Turing completeness.
Hot Tip : Turing Complete is a computer science term meaning that a system
can perform any computation that a Turing machine can do. Wait a second. What
is a Turing machine? A Turing machine is a hypothetical, imaginary computer
that would work by writing to and reading from a tape. The machine can move
left or right along a tape and perform operations on values it reads from each cell
on the tape.
A Turing Machine is simple but can perform any mathematical operation that
a modern computer can. If a programming language is Turing complete, this
means it can do any operation that any other Turing complete language can do.
This concept is named after the British mathematician Alan Turing, who
introduced the idea of the Turing machine as a way to understand the limits of
computation. I’d recommend reading more about Turing Machines if you’re
interested in the history of computing!
Sed actually works in a simple way. It reads text line by line into a buffer and
performs instructions on each line. For example, sed could read a whole book as
text and replace every instance of the with samurai. Let’s take a look at our first
sed command and instructions.
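A sketch (the sample sentence is my own):

```shell
echo "my sword is my honor" | sed 's/sword/shell/'
# my shell is my honor
```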
Here, we've used sed with some simple instructions to replace instances of
sword with shell. The text after sed is the sed command: s stands for substitute,
sword is the text we want to look for and shell is what we want to replace it with.
Let’s look at a few other examples. Create a new text file with your favorite
editor and write the following text, separated by new lines. Save it as
shell_creed.txt:
This is my sword
There are many like it
But this one is mine
We already looked at replacing text. We’ll do the same thing but modify our
file in the process:
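Like so, with shell_creed.txt recreated first so the example stands alone:

```shell
printf "This is my sword\nThere are many like it\nBut this one is mine\n" > shell_creed.txt

# -i edits the file in place; g replaces every match on each line
sed -i 's/sword/shell/g' shell_creed.txt
cat shell_creed.txt
# This is my shell
# There are many like it
# But this one is mine
```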
Let me explain. The -i flag stands for in-place; it tells sed to modify the file
and save the changes to the original file.
Again, we are substituting with s and specifying the text to search for and the
text to replace. This time, though, we also give it the g flag to tell sed to perform
the substitution globally.
Let’s try another sed command to instead delete any lines containing shell:
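For example, assuming shell_creed.txt now reads "This is my shell" after the substitution above:

```shell
printf "This is my shell\nThere are many like it\nBut this one is mine\n" > shell_creed.txt

# d deletes every line matching the /shell/ pattern
sed -i '/shell/d' shell_creed.txt
cat shell_creed.txt
# There are many like it
# But this one is mine
```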
This time, we’re still calling sed with the -i or in-place flag, but the d
command we give sed tells it to delete every line containing shell.
Sed is very powerful, but becomes more powerful when we combine it with
awk.
AWK
AWK is also considered a programming language. Where sed is often used to
modify and process text, awk is used to analyze text. AWK was also developed at
Bell Labs in the 70's and carries ancient and powerful history with it. Awk's
name comes from its developers, Alfred Aho, Peter Weinberger and Brian
Kernighan.
Like sed, AWK reads text, line by line and scans each line for a user-specified
pattern. If it finds a match, it performs an action.
Let’s take a quick look at an awk command. Here, I’ll use awk to print only the
9th column from the output of ls -lh:
stetson@linux_rocks:~$ ls -lh
total 16K
-rw-rw-r-- 1 stetson stetson 10K Feb 27 00:28 important_file
-rw-rw-r-- 1 stetson stetson 0 Feb 27 00:27
moon_landing_photos.png
-rw-rw-r-- 1 stetson stetson 209 Feb 27 00:44 myArchive.tar.gz
-rw-rw-r-- 1 stetson stetson 0 Feb 27 00:27 secret_documents
stetson@linux_rocks:~$
stetson@linux_rocks:~$ ls -lh | awk '{print $9}'
important_file
moon_landing_photos.png
myArchive.tar.gz
secret_documents
Above, I feed the output of ls -lh to awk with the pipe operator. Then, I use
the awk action ‘{print $9}’ to tell awk that I only want the 9th column of each
line, where each column is separated by whitespace.
AWK can also perform a few other tricks. Suppose you have a file with a list
of numbers called numbers.txt:
10
20
30
40
50
To sum them, we can give awk the program '{sum += $1} END {print sum}',
which adds the first field of every line to a running total and prints it once the
input ends. With numbers.txt, that would output:
150
Swapping the final action for {print sum/NR} divides the total by NR, awk's
built-in count of records (lines) read, and outputs the average:
30
Sed and awk also combine nicely. Suppose you have an inventory file called
inventory.txt:
item001, Laptop, 5
item002, Mouse, 12
item003, Keyboard, 8
item004, Monitor, 6
Let's say you want to replace all commas with tabs and then print only the
lines with a quantity greater than 10. You can achieve this using sed and AWK
together as follows:
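One way to write it (inventory.txt created first so the example stands alone):

```shell
printf "item001, Laptop, 5\nitem002, Mouse, 12\nitem003, Keyboard, 8\nitem004, Monitor, 6\n" > inventory.txt

# sed turns ", " into tabs; awk keeps only lines whose 3rd field exceeds 10
sed 's/, /\t/g' inventory.txt | awk '$3 > 10'
# item002	Mouse	12
```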
In this example, the sed command replaces all commas with tabs, and then
the output is piped to the awk command. The awk command checks if the third
field (quantity) is greater than 10 and, if so, prints the entire line.
This has been a brief intro to using sed and awk. Combining the two tools
creates a powerhouse combo for any Shell Samurai to automate away painful
processes. Whether it’s processing web server logs or analyzing transaction
records, sed and awk should be in your tool belt.
First, sort. sort…sorts lines. Given a file fruits.txt which contains the
contents:
carrot
banana
apple
orange
kiwi
tomato
We can run the above text file through sort to get the list returned back in
alphabetical order:
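Like this, with fruits.txt created first:

```shell
printf "carrot\nbanana\napple\norange\nkiwi\ntomato\n" > fruits.txt
sort fruits.txt
# apple
# banana
# carrot
# kiwi
# orange
# tomato
```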
Provide the -r flag and we’ll get the list back in reverse alphabetical order:
tomato
orange
kiwi
carrot
banana
apple
Now, let’s pretend that our file has duplicate lines we want to eliminate. Let’s
not pretend, let’s put a few duplicates in:
carrot
banana
apple
tomato
apple
orange
orange
kiwi
tomato
carrot
kiwi
carrot
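Now run uniq over the duplicate-filled file:

```shell
printf "carrot\nbanana\napple\ntomato\napple\norange\norange\nkiwi\ntomato\ncarrot\nkiwi\ncarrot\n" > fruits.txt
uniq fruits.txt
# Only the adjacent pair of orange lines collapses into one;
# the scattered carrot, tomato, apple and kiwi duplicates all survive
```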
Wait a minute! That didn't remove the duplicate lines at all! Or maybe it did…
take a closer look. Uniq, by default, will only remove duplicated lines that are
next to each other. Our repeated lines containing orange were "condensed" to a
single line. However, we still have multiple carrots and tomatoes.
That’s where sort comes in. We’ll run sort on the file first and then pipe the
output to uniq:
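Like so, assuming the duplicate-filled fruits.txt from above:

```shell
printf "carrot\nbanana\napple\ntomato\napple\norange\norange\nkiwi\ntomato\ncarrot\nkiwi\ncarrot\n" > fruits.txt
sort fruits.txt | uniq
# apple
# banana
# carrot
# kiwi
# orange
# tomato
```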
Much better! sort first made our file alphabetically ordered, and then we
piped the output to uniq, which got rid of the duplicate lines!
If we wanted only the unique lines, we can get those by providing the -u flag:
And if we want to get only the lines that are duplicates, we can give uniq the
-d flag:
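Sketches of both flags, again assuming the duplicate-filled fruits.txt:

```shell
printf "carrot\nbanana\napple\ntomato\napple\norange\norange\nkiwi\ntomato\ncarrot\nkiwi\ncarrot\n" > fruits.txt

sort fruits.txt | uniq -u   # lines appearing exactly once (here: only banana)
sort fruits.txt | uniq -d   # lines appearing more than once, printed once each
```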
uniq and sort are indeed two more powerful tools for processing text. Like
other Linux utilities, they become even more powerful when we utilize the pipe
operator | to send their output to other commands. Are you beginning to feel the
power of Linux's modular design? One tool does one thing very well and has an
interface to talk to other tools.
Rather than cat-ing an entire file and scrolling through, head allows you to
instead read just the first 10 lines:
If instead you'd like to adjust the number of lines returned, give head
the -n flag and the number of lines you'd like to get back. Let's try just three lines
of output:
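For instance (lines.txt is a file of my own invention):

```shell
printf "one\ntwo\nthree\nfour\nfive\n" > lines.txt
head -n 3 lines.txt
# one
# two
# three
```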
What's the opposite of head? Tail! Tail gives us the last 10 lines of a
given file.
Again, if we'd like to specify the number of lines returned, we can use the -n
flag with the number of lines as an argument.
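Using the same kind of sample file:

```shell
printf "one\ntwo\nthree\nfour\nfive\n" > lines.txt
tail -n 3 lines.txt
# three
# four
# five
```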
Spying on syscalls with strace
Run strace followed by a command, like strace ls, and you'll get back a
ton of alien output. Don't be scared, you don't need to understand everything
returned right now.
Let’s try to take a look at the first line that comes back from strace:
The first line we see shows execve, which is the execute program system call
and it’s what executes the program ls.
All the text between the parentheses is the arguments passed to the
execve system call. Among the arguments we can see the path of the ls program,
/usr/bin/ls.
The last section, = 0, is the return value. 0 is considered a successful return
value and tells the system that the execve system call worked properly.
Feel free to read through the rest of the strace output and see if you can find
out more on your own. Each line you see is a system call and its arguments. If
you don’t know what a specific system call does, google it!
Strace can give you superpowers for spying on programs and processes. My
intent isn’t to go deep with it, but to make you aware of strace and how it can be
used to better understand the Linux operating system and debug issues.
Hot Tip : You should almost never use strace on a production web server or
database. It is very resource-intensive and can grind your application to a halt. Be
very certain and careful before using strace in production.
The basic syntax of the find command is as follows:
find [path] [expression]
The [path] is the starting directory where the search will begin, and
[expression] is a set of options and tests that define the search criteria. For
example, to find all files named "file.txt" in the current directory and its
subdirectories, you would use the following command:
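The command is:

```shell
# Search the current directory (.) and everything under it by name
find . -name "file.txt"
```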
Try creating a file in your home directory and then using find to find it now.
Find doesn’t just make us use the file name as a search parameter. For
instance, we can search for files based on their size using the -size option
followed by a size value and unit (c for bytes, k for kilobytes, M for megabytes,
and so on). The following command will find all files larger than 10 megabytes in
the / directory (you may need to run this as root or sudo):
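Something like this (run with sudo for full coverage; 2>/dev/null hides permission errors):

```shell
# Every file larger than 10 megabytes, starting from /
find / -size +10M 2>/dev/null
```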
Find doesn’t just search one directory deep, we can search from the root
directory for any file:
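For example:

```shell
# Search the whole filesystem for a file by name
find / -name "file.txt" 2>/dev/null
```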
Another useful feature of find is its ability to execute actions on the files that
match our search criteria. By using the -exec option, you can specify a command
to run on each file that was found.
The following example will find all files with the ".log" extension in the /var/
log directory and delete them:
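The book's example targets /var/log; here's the same idea on a scratch directory, so nothing important gets deleted:

```shell
# Make a scratch directory with a couple of files to act on
mkdir -p scratch && touch scratch/old.log scratch/keep.txt

# Delete every .log file found; {} becomes each match, \; ends the -exec command
find scratch -name "*.log" -exec rm {} \;
ls scratch   # prints: keep.txt
```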
Above, {} is replaced with the file path of each matching file, and \; signifies
the end of the -exec command.
In conclusion, find is a valuable asset for navigating the Linux filesystem and
performing operations on files and directories. By mastering find’s options,
you’ll be much better equipped to handle file management tasks efficiently and
effectively. Remember to consult the man pages (man find) for more information
and examples of the find command's extensive capabilities.
Swap Space
Swap space is not a topic unique to Linux; it's fundamental to operating
systems in general, and Linux is no different! Swap space is
space on a hard disk that can be substituted for physical memory or RAM. It’s a
kind of virtual memory used like an “overflow” for RAM. When RAM runs out
of space, swap space is used instead.
It’s generally not considered a good thing if your system is frequently using
swap space rather than RAM. Hard disk storage is much slower to access than
RAM, so if our system is using it all of the time, it’s going to operate more
slowly!
You can use the swapon command to show where your swap space is
currently configured and the configured size:
stetson@linux_rocks:/var/log$ swapon
NAME   TYPE  SIZE   USED  PRIO
/swap  file  1024M  0B    -2
Here, my swap space is located at /swap and is 1024 MB. If we’d instead like
to see how much swap is being currently used, we can use the same command
that we’d use to check memory statistics, free -h:
stetson@linux_rocks:/var/log$ free -h
       total  used   free   shared  buff/cache  available
Mem:   1.9Gi  388Mi  706Mi  386Mi   894Mi       1.1Gi
Swap:  1.0Gi  0B     1.0Gi
Check the second row: you'll see that I have 1.0 GiB of swap in total, with
nothing used (second column) and all of it free (third column).
I estimate that much of the world’s economy and modern technology are held
together by cron. I’m serious. It’s everywhere. What is cron? Cron is a time-based
scheduler built into Linux. It's available on nearly every system by default. Cron
is not a command, but a system-managed process and daemon that runs in the
background to schedule tasks.
Cron allows administrators to define schedules and tasks in a file known as a
“cron job”.
As an example, a cron job might do any of the following.
• Email users every week with a report of their usage of a Software license
• Backup a database every night at midnight
• Delete old log entries every Wednesday at 12 PM UTC
• Send a report of the system’s memory and disk-usage every single minute
to a monitoring service
As you can see above, cron jobs always define a schedule and a task to
perform. The Cron table, or crontab is a configuration file that defines each cron
job. Crontab files are created for each user, and cron jobs run with the same
permissions as the user that created them.
Logged in as any user, you can quickly display the crontab file with crontab
-l
Hot Tip : In the rare event cron is not available on your system, you may
need to install it! Run apt-get update && apt-get upgrade and then apt-get
install cron.
Once you list a crontab output, you should see something like the following:
0 0 * * * /path/to/backup-script.sh
This looks a bit alien. Let me explain. The first five fields, 0 0 * * *, specify
the minute, hour, day of the month, month and day of the week. The asterisk is a
wildcard meaning "any". The above example tells cron to run backup-script.sh at
minute 0 and hour 0 (midnight) every day of the month, every month and every
day of the week.
Let’s try making our own crontab entry. Type the command crontab -e and
hit enter. The e flag is for edit. Cron should then either throw you directly into
an editor or ask you to select a text editor. You may get vim as your default text
editor. Use vim if you’re comfortable with it (be sure to review the vim chapter in
this book). If you’re not comfortable with vim, exit out of it by hitting Escape and
typing :q! and enter. Type select-editor and you can freely choose a different
editor, like nano:
stetson@linux_rocks:~$ select-editor
With our text editor selection out of the way, let’s edit our crontab file!
On a new line, type the following contents and save the file:
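The exact line from the book's screenshot isn't shown here, but it would look something like this (>> appends, so each run adds a new timestamp to the /tmp/current_date.txt file used below):

```shell
*/2 * * * * date >> /tmp/current_date.txt
```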
This crontab entry is going to run every 2 minutes (*/2 ) for as long as our
system is up and print the current date out to a file at /tmp/current_date.txt.
After saving the crontab, you should see a message confirming that the new
crontab was installed:
stetson@linux_rocks:/tmp$ crontab -e
no crontab for stetson - using an empty one
crontab: installing new crontab
Now let’s wait for cron to run our command. After about 2 minutes have
passed, run cat /tmp/current_date.txt to verify the cron job ran:
Keep in mind that we could have given cron any shell script or file to run, but
we chose to instead directly configure our command in the crontab file.
Now, let’s get rid of the cron entry so that our disk doesn’t fill up. Open the
crontab once again with crontab -e, delete the line we just created and save the
file. Without doing this, the cronjob would run for as long as our system is up.
Hot Tip : Cron syntax can be very confusing. If you asked me to create a
cron job that ran on every third Wednesday at 3:45 PM, I couldn’t do it. I would
guess that most Linux systems administrators couldn’t either! That’s alright,
though, because we have tools to help us! There are many “cron calculator” web
apps that help users translate from English to crontab. Simply Google “cron
calculator” or use a site like CronTab.Guru or CronHub. These tools help us
configure Cron schedules like a true Shell Samurai!
Although these examples were simple, cron is a powerful and fundamental
Shell Samurai skill. Many, many business operations run on cron today. Why
should we do manual work when we can get machines and robots to do it for us?
Cron can be helpful for automating processes like backing up files, sending
emails and even more. Use cron and become productive and dangerous.
Installing SSH and Connecting
With that bit of theory out of the way, let’s setup SSH from scratch and
connect to our server!
First, install OpenSSH with sudo apt-get install openssh-server if you
don’t already have it. You can check by running which sshd and checking if you
get a path back like /usr/sbin/sshd.
Once you have OpenSSH installed, you’ll need to configure it. Using sudo or
the root user, open up the config file for sshd at /etc/ssh/sshd_config with
any text editor. Lines that begin with a # are comments and will not be applied.
Search for and uncomment the following lines and set the values to below. If a
line doesn’t exist already, be sure to add it.
Port 22
ListenAddress 0.0.0.0
PermitRootLogin no
PubkeyAuthentication yes
PasswordAuthentication yes
After saving the sshd configuration, go ahead and restart the SSH server with
systemctl restart sshd
From your local system, run ssh-keygen with no arguments. There are
additional flags available for cryptographic algorithm, key-length and other
settings. We’ll stick with the defaults for now. ssh-keygen will also ask for a
password to protect the private key. You can hit enter to skip the password, but
you might want to provide one on production systems for additional security!
stetson@linux_rocks:/$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/stetson/.ssh/id_rsa):
Created directory '/home/stetson/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/stetson/.ssh/id_rsa
Your public key has been saved in /home/stetson/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:hbT1XTCvr5cZ3JLEvjdeawG/QS+kgKtMfyupJ2T52lQ
stetson@linux_rocks
The key's randomart image is:
+---[RSA 3072]----+
| . . o..|
| . + . . + |
| o.. ... .|
| ... .+o |
| .S.E. =*.o|
| = .. . =Bo|
| = +.. +X|
| +o* . .B=|
| o=.o.. +=.|
+----[SHA256]-----+
Check out the randomart image at the end. This is a visual fingerprint of the public key. It's an additional security feature that lets you verify a public key by quickly glancing at its fingerprint.
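If you ever need to script key creation, ssh-keygen can also run non-interactively. A minimal sketch, with an example output path of /tmp/demo_key (-t picks the algorithm, ed25519 here; -f the file; -N the passphrase; -C the comment):

```shell
# Generate a modern ed25519 keypair without any prompts
ssh-keygen -t ed25519 -f /tmp/demo_key -N "" -C "stetson@linux_rocks"

# The private key, and the shareable .pub half:
ls -l /tmp/demo_key /tmp/demo_key.pub
```

Many systems also ship ssh-copy-id, which appends your public key to a remote user's authorized_keys file in one step.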
Next, you'll need access to the server you want to transfer your key to. I'll assume you do, so you
can go ahead and cat your public key (not private!) out and copy it to your
clipboard.
The ssh-rsa at the beginning of the file specifies the algorithm used to
generate the key. The stetson@linux_rocks at the end of the file is a comment
that was auto-generated. We could put anything here.
On your remote server running OpenSSH, we’ll paste this key into our user’s
authorized keys file at /home/<user_name>/.ssh/authorized_keys. Create this
file if it does not exist already.
Finally, we’re ready to connect!
From your client, you can run ssh <user_name>@IP_Address — If all went
well, you’ll be asked to verify the public key is one you recognize:
After entering yes, you should be logged in to the remote server! If your local
ssh agent is not using your local private key for some reason, you can specify it
with the -i flag explicitly and provide the path:
stetson@linux_rocks:/$ ssh -i ~/.ssh/id_rsa.pem [email protected]
This has been a very brief primer on SSH! SSH has a few other tricks up its
sleeves, like copying files with Secure Copy (scp), network tunneling and
proxying and more!
Knowing how to use SSH is an absolute requirement for any systems
administrator and Shell Samurai. It takes a little time to come up to speed with
and I don’t know all of the intricacies myself, even! Be sure to reference online
documentation, man pages and other resources if you need help!
curl - Client URL
curl or client URL is a tool for transferring data from the web. We can use curl to download data or call HTTP APIs. Curl's webpage boasts that it's used in
over ten billion places from routers, cars, phones, tablets and audio equipment to
virtually any other digital device you can think of. Curl comes with just about
every Linux installation by default.
We’ll use curl in this section to learn about APIs.
Curl allows us to specify a URL and data we’d like to send to that URL. Let’s
make sure we have curl installed with sudo apt-get install curl, although it
should be on your system by default.
Curl's general syntax is curl [options] <url>. In its simplest form, we just give it a URL:
curl https://jsonplaceholder.typicode.com/todos/1
stetson@linux_rocks:/$ curl https://jsonplaceholder.typicode.com/
todos/1
{
"userId": 1,
"id": 1,
"title": "delectus aut autem",
"completed": false
}stetson@linux_rocks:/$
We’ll immediately get back a response, followed by our prompt. The response
doesn’t have a newline at the end, which is why the prompt got tacked on to the
end.
This API response came back to us in JSON or Javascript Object Notation
format. It’s an extremely common format for the modern web, although lately
some websites and applications are using the newer GraphQL instead.
You’ll notice the response has keys like userId, id, title and completed.
Each key has a value like 1, 1, delectus aut autem and false.
APIs like this one are built around HTTP methods, each with its own job:
• GET - Retrieves a representation of the resource
• POST - Creates a new resource on the server
• PUT - Replaces the resource with a new representation of that data
• PATCH - Applies modifications to a resource
• DELETE - Deletes the resource from the server
CRUD
You’ll also hear the term CRUD when describing web applications and in
discussions about APIs. CRUD stands for Create, Read, Update and Delete.
These are the 4 operations used in databases and web apps to transform data and form a model used to describe web apps. Each of our 5 HTTP methods lines up with Create, Read, Update or Delete.
For example, GET is a Read operation. POST is a Create operation. PUT and PATCH update data and of course DELETE…deletes data.
With the CRUD design, we can create web apps that accomplish almost any business goal.
The HTTP endpoint or path would look something like:
https://bobssurfboards.com/api/v1/surfboards/1
In the above made-up example, we call everything after the .com a path
component. The API portion specifies that we’re talking to an API. v1 might be
used to specify a version of the API. surfboards represents our resource or thing
we want to use the API to operate on. Finally, the 1 integer represents a specific
surfboard resource in the database.
In our example, we might send a GET request to the above URL and get back
something that looks like this in JSON format:
{
"id": "1",
"name": "The Cruiser",
"description": "A versatile board that's great for beginners and
advanced surfers alike. Comes in a variety of sizes and colors.",
"price": 599.99,
"length": 8,
"width": 22,
"thickness": 3,
"material": "Epoxy",
"image_url": "https://www.bobssurfboards.com/images/surfboards/
cruiser.jpg"
}
If we later wanted to update just part of this surfboard, a PATCH request could carry only the fields that change:
{
"description": "ON SALE FOR A LIMITED TIME! A versatile board that's great for beginners and advanced surfers alike. Comes in a variety of sizes and colors.",
"price": 599.99
}
I hope this has been a helpful primer for understanding more about APIs.
Note that not every API is the same or acts the same. Each API provider should have documentation that describes how they'd like for you to authenticate to the API and what format requests and responses should follow.
The -X flag is followed by the HTTP verb or method and the -d flag is
followed by the data we want to post! You can try POSTing to the
jsonplaceholder API now, except this time we'll give the data in JSON format:
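As a sketch, a JSON POST adds a Content-Type header so the server knows how to parse the body. jsonplaceholder is a test API, so it echoes the data back with a new id rather than storing it (the field names below are just what this toy API accepts):

```shell
# -X sets the method, -H adds a header declaring the body is JSON,
# and -d carries the JSON payload itself
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"title": "learn linux", "body": "shell samurai", "userId": 1}' \
  https://jsonplaceholder.typicode.com/posts
```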
Beyond Basics
Curl is a great tool for testing APIs, downloading data like images and videos
and saving web pages. You can use curl in your own shell scripts and programs
to automate tasks, get data from the web and more. Curl can do so much and I
invite you to read up on curl’s official website to learn more about the powers of
this tiny tool.
jq - Processing JSON
jq is like sed for JSON data. JSON, or Javascript Object Notation is an
extremely common format for log files, API calls and the Javascript programming
language. Earlier, we looked at using curl to call APIs. When we curl an API
endpoint, we’ll normally get back a bit of JSON that’s a bit…ugly. jq makes
formatting and searching through JSON data easy. Let’s look at how to use it.
We'll start by using a free API at https://api.publicapis.org/entries — this API endpoint is provided by publicapis.org and simply returns a JSON document of other APIs we could use. Go ahead and run curl https://api.publicapis.org/entries now in your terminal.
Whoa! As is tradition, we get back a LOAD of text in JSON format. This data
really sucks to parse through manually and try to read. Enter…jq.
First, make sure you have jq installed with apt-get install jq.
Then, run the prior curl command again, but attach a pipe and send it to jq:
stetson@linux_rocks:~$ curl https://api.publicapis.org/entries | jq '.'
This time, you'll notice that jq has formatted the text neatly, a term typically called pretty printing. We use the '.' at the end to tell jq to take the input and print it, unchanged, to our terminal. jq is useful enough with just that trick,
but let’s try some more!
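You don't need an API to practice, either; jq reads JSON from stdin, so you can try filters on echoed strings:

```shell
# Pretty-print a small object
echo '{"dojo": {"name": "shell samurai", "students": 3}}' | jq '.'

# Pull out a nested field with a dotted path
echo '{"dojo": {"name": "shell samurai", "students": 3}}' | jq '.dojo.name'
# → "shell samurai"
```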
You'll notice if you scroll all the way to the top of the file at https://api.publicapis.org/entries (or load it with your browser) that we see
something like this:
{
"count": 1425,
"entries": [
{
"API": "AdoptAPet",
"Description": "Resource to help get pets adopted",
"Auth": "apiKey",
"HTTPS": true,
…
Let's dive into that entries key. JSON is a key-value object store, and entries contains a list of all the APIs that publicapis.org is providing us. Try this command out now:
stetson@linux_rocks:~$ curl https://api.publicapis.org/entries | jq '.entries'
We could also specify the count key if we'd like, which returns the number of entries (1425 as of this writing):
stetson@linux_rocks:~$ curl https://api.publicapis.org/entries | jq '.count'
Back to the entries filter, let’s tell jq that we want just the first item:
stetson@linux_rocks:~$ curl https://api.publicapis.org/entries | jq
'.entries[0]'
% Total % Received % Xferd Average Speed Time Time
Time Current
Dload Upload Total Spent
Left Speed
100 274k 0 274k 0 0 848k 0 --:--:-- --:--:--
--:--:-- 845k
{
"API": "AdoptAPet",
"Description": "Resource to help get pets adopted",
"Auth": "apiKey",
"HTTPS": true,
"Cors": "yes",
"Link": "https://www.adoptapet.com/public/apis/pet_list.html",
"Category": "Animals"
}
Diving deeper, if we only want one field from the returned data, we can chain it on to the end and get it back. Let's grab the API field of entry 500:
stetson@linux_rocks:~$ curl https://api.publicapis.org/entries | jq
'.entries[500].API'
% Total % Received % Xferd Average Speed Time Time
Time Current
Dload Upload Total Spent
Left Speed
100 274k 0 274k 0 0 892k 0 --:--:-- --:--:--
--:--:-- 892k
"Hotstoks"
How about the description of every API in the list? Use the following:
stetson@linux_rocks:~$ curl https://api.publicapis.org/entries | jq '.entries[] | .Description'
The above jq query uses [] to specify that entries is a list or array. The pipe character pipes each item in the list to the next filter, .Description.
We won’t go into jq any deeper today, but hopefully this has shown some of
the possibilities of the tool! As always, jq becomes even more useful when
combined with other Linux tools and utilities! When you’re working with JSON
data and want to chop it up quickly, reach for jq!
tmux - The Terminal Multiplexer
The name terminal multiplexer gives a hint to its usage. From one shell window, we can arrange multiple windows or panes. Each of these windows is opened in a session. We can save those sessions for later by detaching from the window and later re-attaching to pick our work back up again. This will all make sense soon, so let's get started with tmux!
First, install tmux with apt-get install tmux. Then, simply run the tmux
command. You’ll immediately see a “status bar” at the bottom of your screen,
but otherwise, your terminal should look the same. There’s some other
components in the status bar like the date as well. You can even customize the
status bar further, but let’s keep it simple for now.
Tmux starts us off with a single window and a single pane. Tmux uses the
concept of a prefix key and a command key. By default, the prefix key is Control+b.
This is often notated as C-b in documentation for tmux. Press Control+b now.
Tmux is now listening for a command. Press % on your keyboard and the screen
will split in two vertically! If we instead wanted to split our window
horizontally, we could press Control+b and then the double quote key “.
Your screen is now split in half vertically or horizontally, so how do you get to
the other pane? Simple! Hit Control+b again, and then an arrow key in the
direction of the pane you’d like to navigate to. Tmux will highlight the currently
used pane.
Closing Panes
Feel free to go crazy and split panes horizontally and vertically until your
terminal is an unusable mess. Then, to close a pane, we can use Control+d or just
type exit to close the terminal pane.
Creating Windows
Panes are great but I honestly use windows more often. Let’s make one now.
Hit the prefix command again (Control+b) and then c on your keyboard for
create. You’ll see that your status bar has now updated to reflect the new
window:
Navigating Windows
To select a window, hit Control+b and then the number of the window (0 or 1
in my screenshot above). You can also use n and p for next and previous.
Renaming Windows
One more tip, the default window name sh or bash isn’t very descriptive. You
can hit Control+b and then the comma to rename your window to something
more memorable. My windows below are named hacker stuff and Shell
samurais. Nice!
Detaching from Sessions
To detach from your tmux session, hit Control+b and then d for detach. Instantly your status bar is gone and you're thrust back into your "regular" shell. Nice! Where did the session go?
To get our session back, we have to re-attach to it. If you run tmux again, you’ll
get another session, but not the one we were just in! First, list your sessions with
tmux ls:
stetson@linux_rocks:/$ tmux ls
0: 1 windows (created Tue Mar 7 03:15:24 2023)
Then, we can re-attach to the session with tmux attach -t 0. The 0 is the
session number we saw along the left of the tmux ls output. Like the default window names, 0 isn't a very memorable name.
We can rename the session we listed out with tmux rename-session -t 0
hacking if we’d like. Then we’d be able to attach with tmux attach -t hacking
If instead, you wanted to start a session and name it, just type tmux new -s
<session-name>
Congrats! You’ve learned yet another Linux utility to add to your Shell
Samurai belt! I hope tmux comes in handy on your journey. Like many tools,
we’ve only just scratched the surface of what they can do. With just this basic
knowledge, you can do a lot!
lsof
Linux uses integers or numbers to track files that are currently open as well as
their associated processes. When I say open, I don’t just mean a file that’s open in
your text editor, but any file that is currently being used by a process, daemon or
program running on the system. Remember that in Linux, everything is a file.
This includes processes, network devices, disks and more. This means lsof can
track all of those resources!
lsof allows us to retrieve information about open files by accessing the
kernel’s internal file table. We can see details like the process id, user id, file
descriptor, file type, device numbers and more. This makes it an invaluable tool
for troubleshooting problems related to file access, network connections, and
resource management.
To use lsof, simply type the command followed by various options and
arguments to filter the results as needed. For example, to list all open files by a
specific process, use the -p flag followed by the process ID.
I’ll give this a try now by opening vim in one terminal session with vim
my_file.
Then, in another terminal session, I'll run lsof with the process id (lsof -p 31449):
vim 31449 root mem REG 254,1 1054193 /usr/lib/
aarch64-linux-gnu/libc-2.31.so (path dev=0,162)
vim 31449 root mem REG 254,1 1054322 /usr/lib/
aarch64-linux-gnu/libpthread-2.31.so (path dev=0,162)
vim 31449 root mem REG 254,1 1054503 /usr/lib/
aarch64-linux-gnu/libpython3.8.so.1.0 (path dev=0,162)
vim 31449 root mem REG 254,1 1054201 /usr/lib/
aarch64-linux-gnu/libdl-2.31.so (path dev=0,162)
vim 31449 root mem REG 254,1 1054493 /usr/lib/
aarch64-linux-gnu/libgpm.so.2 (path dev=0,162)
vim 31449 root mem REG 254,1 1051099 /usr/lib/
aarch64-linux-gnu/libacl.so.1.1.2253 (path dev=0,162)
vim 31449 root mem REG 254,1 1053850 /usr/lib/
aarch64-linux-gnu/libcanberra.so.0.2.5 (path dev=0,162)
vim 31449 root mem REG 254,1 1051215 /usr/lib/
aarch64-linux-gnu/libselinux.so.1 (path dev=0,162)
vim 31449 root mem REG 254,1 1053484 /usr/lib/
aarch64-linux-gnu/libtinfo.so.6.2 (path dev=0,162)
vim 31449 root mem REG 254,1 1054224 /usr/lib/
aarch64-linux-gnu/libm-2.31.so (path dev=0,162)
vim 31449 root mem REG 254,1 1054134 /usr/lib/
aarch64-linux-gnu/ld-2.31.so (path dev=0,162)
vim 31449 root 0u CHR 136,2 0t0 5 /dev/pts/2
vim 31449 root 1u CHR 136,2 0t0 5 /dev/pts/2
vim 31449 root 2u CHR 136,2 0t0 5 /dev/pts/2
vim 31449 root 3u REG 0,162 12288 792734 /.myfile.swp
Above, we see output generated by running lsof and the process ID 31449.
The output shows information about the open files and file descriptors associated
with the vim process running as the root user. The numbers that Linux uses to
track open files are known as file descriptors.
Let's break down the output:
• COMMAND: The name of the command associated with the process.
• PID: The process ID.
• USER: The user the process is running as.
• FD: The file descriptor or resource type. 0 is standard input (stdin), 1 is standard output (stdout) and 2 is standard error (stderr). Other numbers represent additional open files or resources.
• TYPE: The type of the file or resource (e.g., DIR for a directory, REG for a
regular file, CHR for a character device).
• DEVICE: The device number for the file or resource.
• SIZE/OFF: The file size or offset.
• NODE: The inode number.
• NAME: The name or path of the file or resource.
In this output, you can see that the vim process has several file descriptors:
• cwd and rtd: The current working directory and the root directory of the
process, both pointing to the root directory /.
• txt: The executable (text) file associated with the process, located at /usr/
bin/vim.basic.
• Multiple mem entries: These represent memory-mapped files, such as
shared libraries and the executable itself, loaded into the process's address
space.
• 0u, 1u, and 2u: Standard input (stdin), standard output (stdout), and
standard error (stderr), respectively. They are all connected to the same
character device, /dev/pts/2, which is a pseudo-terminal.
• 3u: An additional open file descriptor pointing to a regular
file /.myfile.swp. This is likely a swap file created by vim to store changes
made to a file (.myfile) during an editing session.
When a process wants to read from or write to one of its open files, it hands the kernel the file descriptor via a system call like read() or write(). The operating system then translates the file descriptor into the appropriate internal data structures to carry out the requested operation.
/proc Directory
Once again, in Linux, everything is a file, even processes! Information about
processes is stored in a special directory (also called a virtual filesystem) under the
root directory (/) called /proc. Try listing the contents of /proc now:
stetson@linux_rocks:~$ ls -l /proc
total 0
dr-xr-xr-x 9 root root 0 Dec 30 00:38 1
dr-xr-xr-x 9 root root 0 Mar  2 00:27 10
dr-xr-xr-x 9 root root 0 Mar  2 00:27 1025510
dr-xr-xr-x 9 root root 0 Mar  2 00:27 105
dr-xr-xr-x 9 root root 0 Mar  2 00:27 108
dr-xr-xr-x 9 root root 0 Mar  2 00:27 109
dr-xr-xr-x 9 root root 0 Mar  2 00:27 11
You’ll see a ton of output and you should also see a directory for every
process that you’d see with the ps command.
The /proc directory contains a set of files and directories, with each directory
usually named after a running process's ID. These directories contain files that
represent different components of the process's state, such as memory usage, file
descriptors, command line arguments, and more.
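Because it's all just files, ordinary tools work on it. The self entry is a symlink to whichever process opens it, so each command below inspects itself:

```shell
grep '^Name:' /proc/self/status        # name of the process reading the file (here, grep itself)
tr '\0' ' ' < /proc/self/cmdline; echo # command-line args are NUL-separated; make them readable
ls /proc/self/fd                       # this process's open file descriptors
```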
Some common files and directories found in /proc include /proc/cpuinfo (processor details), /proc/meminfo (memory statistics), /proc/<pid>/cmdline (a process's command-line arguments), /proc/<pid>/fd (its open file descriptors) and /proc/<pid>/status (a human-readable summary of its state).
Let's go deeper and look at a single process's status file in /proc. Type cat /proc/1/status or replace 1 with any process directory on your system:
I’ve truncated most of the output here for brevity since there are around 60
lines of output.
You'll get back tons of output that wasn't available in the ps output! The /proc file system lets us peer into what the Linux kernel sees when it manages
resources and processes. Take a look at the man page on proc for more info with
man proc
The takeaway here is that the /proc directory exposes extra details about the
kernel's internal data structures. It provides tons of valuable information about
the system's hardware, running processes, and kernel settings. You can use /proc for diagnostics and monitoring not possible from other simple commands!
Checking Disk Space with df
The df (disk free) command shows how much space each mounted filesystem is using. The -h flag makes the sizes human-readable:
root@broken-waterfall-4525:~# df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 47M 1.7M 45M 4% /run
/dev/vda1 9.6G 9.5G 6.9M 100% /
tmpfs 234M 1.1M 233M 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
/dev/vda15 105M 5.3M 100M 5% /boot/efi
tmpfs 47M 4.0K 47M 1% /run/user/0
tmpfs 47M 4.0K 47M 1% /run/user/1000
Looking over the summary from df, we can see that this system has a few
disks. /dev/vda15 is mounted on /boot/efi so it’s used as part of the boot
process. We can ignore the tmpfs lines because they are temporary storage
created in memory. /dev/vda1 looks interesting though…
Looking at the Avail column, it has only 6.9M of disk free. The Use% column
shows 100% as well, that could certainly get us into some trouble! Let’s see if we
can find out what’s going on.
There’s got to be some files somewhere that are getting larger and causing this
issue. After all, our server worked previously, so what changed? Let’s try to track
those files down!
We’ll use a similar command to df called du or disk usage. You can use du -h
on common directories where we might expect large files to gather, like /var/log. The h flag tells du to make the output human readable.
Let’s run du -h on our /var/log directory now:
root@broken-waterfall-4525:/var/log# du -h /var/log
4.0K /var/log/landscape
1.2G /var/log/journal/278483eb55e3c7011468e370639f28de
17M /var/log/journal
4.0K /var/log/private
4.0K /var/log/dist-upgrade
4.0K /var/log/caddy
104K /var/log/apt
48K /var/log/unattended-upgrades
4.0K /var/log/sysstat
105M /var/log/postgresql
1.3G /var/log
To surface the biggest entries quickly, pipe du's output into sort. The -h flag tells sort to compare human-readable sizes correctly. For example, if the input contained sizes like 2.5K, 10M, and 1G, sort -h orders them by their true magnitude rather than alphabetically. The -r option sorts the input in reverse order, so we get the largest entries in our output first:
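As a sketch, the full pipeline might look like this (your sizes will differ), plus a tiny synthetic demo of what -h comparisons do:

```shell
# Largest five entries under /var/log
du -h /var/log 2>/dev/null | sort -hr | head -5

# The same trick works on any human-readable sizes:
printf '2.5K\n1G\n10M\n' | sort -hr
# → 1G, then 10M, then 2.5K
```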
Now that we've found our culprit, we have a few options to deal with it: delete the offending logs, shrink the journal, and cap its size so it can't fill the disk again.
If you encounter a situation like this in the real world, you should do all of these. First, delete the file, after confirming it isn't needed. Obviously don't go
deleting files that may be critical to the system. This is just a log file, so it’s
probably ok to delete. In certain situations, logs might have to be kept for audit
purposes, so use your common sense. Also, the logs could contain other details
that might help us diagnose further issues.
Assuming the low disk space is causing our server to not function, it’s helpful
to “stop the bleeding” and remove the log files immediately, putting the server
back into normal operation.
We could just delete the directories with rm, but we'll instead use sudo journalctl --vacuum-size=100M. This command deletes the oldest archived journal logs until the journal's total disk usage drops below the specified size (in this example, 100 MB). You can adjust the size as needed to
free up the desired amount of disk space.
To cap the journal's size permanently, open /etc/systemd/journald.conf and set:
SystemMaxUse=100M
Then restart the journal service with systemctl restart systemd-journald for the change to take effect.
Logging
Logs are a system's running diary, and they're often the first place to look when something breaks. A web server, for example, records details like IP address, path requested, HTTP status code and more. If you have a path that isn't working, like myexamplewebsite.com/login, you can use the logs to troubleshoot and drill deeper into the issue.
In this section, we'll talk about what kinds of things can be logged, how to view logs and how to configure them.
Logging Locations
Generally, logs are written to the /var/log directory. However, some
applications are configured to write to different areas of the file system. The
syslog service is what generally sends logs to the system logger.
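For a sense of what's already being logged on your machine, poke around /var/log (exact file names vary by distribution; syslog is the Debian/Ubuntu name):

```shell
ls /var/log                      # the conventional home for system and service logs
sudo tail -n 5 /var/log/syslog   # recent system messages on Debian/Ubuntu systems
```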
Logrotate
Every sysadmin has run into this scenario: Your Linux server is serving a web
application or doing some other background process and suddenly starts acting
kinda funny. You check the logs and see mysterious entries about disk space. Oh
no. So you dig in further and find that the disk has no space left! Do you simply
add more disk to remedy the problem? No, you keep troubleshooting and find
that the logs are filling your disk. Those devious logs. What was meant to help us
administer our system and keep an eye on health has suddenly betrayed us and
made our system unhealthy. No good.
Logrotate remedies this by…rotating logs. Well, not spinning them in circles,
but cleaning up old logs that we no longer need by compressing them, deleting
them or splitting them up. Generally, if your app is important, you’ll be keeping
logs somewhere else that is not on the Linux server. It could be in the cloud or
another log server you manage yourself. Logrotate lets us set certain parameters
like the max amount of disk space we’d like to keep for logs before it starts
deleting or compressing old ones.
Logrotate is a handy friend and an essential tool for administering any
production web app. Let’s take a look at how to install and configure logrotate.
Logrotate might already be running on your system! We can verify this by
looking in the logs directory at /var/log. List that directory with ls, do you see
files titled syslog.1.gz, syslog.2.gz, kern.log.1.gz, etc? If so, logrotate is
probably already running!
How does our system know to rotate logs without us doing anything? Look in
the /etc/cron.daily directory for a file titled logrotate or similar. Go ahead
and cat that file and take a look. Cron will run this file daily and take care of
your log rotation!
Configuring Logrotate
If logrotate is already installed, the configuration will be stored in the /etc/logrotate.conf file and the /etc/logrotate.d directory. The contents of that file and directory are combined to make a complete configuration.
Here’s an example configuration for a program called fail2ban, which keeps
random people from logging into your server:
/var/log/fail2ban.log {
weekly
rotate 4
compress
# Do not rotate if empty
notifempty
delaycompress
missingok
postrotate
fail2ban-client flushlogs 1>/dev/null
endscript
}
When a package is installed, a logrotate configuration might not always be created for it, so it's good to know how to configure it yourself!
"
5. The Network
The Network
Linux and computing in general would be practically useless if we couldn’t
make computers talk to each other. Linux provides a robust system for managing
network connections and configuring the network. Understanding Linux
networking is essential for optimizing network performance and troubleshooting connectivity issues.
The OSI model describes networking as seven stacked layers: Physical, Data Link, Network, Transport, Session, Presentation and Application. Learning the OSI model can be difficult. There are many mnemonics used to remember it. The one I used is too lewd to mention in this book, but I'll recommend All People Seem to Need Data Processing for remembering the layers.
Hot Tip : When working with other systems administrators, you’ll hear the
OSI model referenced from time-to-time. For example: “That sounds like a layer
8 issue to me” is used jokingly to imply that there is user error. Layer 8 doesn’t
really exist on the OSI model. Also, network devices often are thought of as
either Layer 3 devices like routers or Layer 2 devices, like switches. Routers work
on the network layer because they make intelligent decisions about where to
send packets based on IP addresses, which are a layer 3 concept.
Switches, on the other hand, are layer 2 devices. They are usually quicker than routers because they don't make forwarding decisions based on IP addresses, but rather on MAC addresses. A MAC address is a physical ID that is burned into a network interface. Switches keep a local database called the MAC address table, which maps MAC addresses to switch ports, and they forward frames based on the destination MAC address rather than routing them. Hosts and routers, meanwhile, maintain an ARP (Address Resolution Protocol) table that maps IP addresses to MAC addresses.
Viewing Interfaces with ifconfig
To see your system's network interfaces, their addresses and their traffic statistics, run ifconfig:
root@linux_rocks:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.17.0.2 netmask 255.255.0.0 broadcast
172.17.255.255
ether 02:42:ac:11:00:02 txqueuelen 0 (Ethernet)
RX packets 24273 bytes 35763327 (35.7 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 9121 bytes 509036 (509.0 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
You’ll see that the command gives sections for each interface on the system.
First, eth0, and then lo or loopback. In addition to the IP address and subnet
configurations we'd expect, we can see other statistics about the interface like its MAC address (ether 02:42:ac:11:00:02), received packet counts, transmitted
packets, errors and other statistics. You might also see an interface called wlan0,
which is a wireless interface.
The loopback address on the bottom at 127.0.0.1 is a special kind of IP
address typically used for testing and ensuring that the network stack works on
the system. The loopback address is only reachable locally and cannot be reached
from outside the system.
The Routing Table
The routing table tells the kernel where to send packets. Display it with ip route:
root@linux_rocks:~# ip route
default via 172.17.0.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.0.2
The Linux routing table is pretty simple by default. We see 2 entries, first, a
default route which says that any packet that doesn’t match other entries will get
routed to 172.17.0.1. The device used to send those packets is eth0.
Second, there is an entry that defines the local network, 172.17.0.0/16. This rule means that the host can reach the subnet directly from its eth0 interface.
Another way to display routes is via the route -n command. This method is
a little older and generally ip route is used, but route -n is still supported. The
output we get from route -n is largely the same:
root@linux_rocks:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref
Use Iface
0.0.0.0 172.17.0.1 0.0.0.0 UG 0 0
0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0
0 eth0
We do see some different fields, though. Namely, the Flags column has U and
UG underneath. U means the network is up. UG means that the network is up and
is a gateway.
ARP Table
ARP stands for Address Resolution Protocol. The ARP table is a local database that maps IP addresses to MAC addresses so the system knows where on the local network to send packets. We can view our Linux ARP table with arp -a:
root@linux_rocks:~# arp -a
? (172.17.0.1) at 02:42:71:e3:bd:c5 [ether] on eth0
Ping
Ping sends ICMP echo requests to a host and reports each reply, making it the quickest way to check whether a host is reachable. Try ping -c 4 google.com (the -c flag limits how many packets are sent).
Ping shows the domain and IP address that was resolved along with a line for
each packet it sends with a sequence number, time to live and time in
milliseconds it took to get a reply. Ping is also handy for checking the latency to a
host by observing the reply times.
Ping gives us a tidy summary at the end showing how many packets it sent
and received, along with round-trip time (rtt) statistics.
Remember that many hosts may block ICMP or ping packets. So if you’re
sending a ping to a host and it isn’t replying, it may still be online but just
ignoring your requests.
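If the ping invocation itself is hard to picture, here's a minimal sketch. I'm using the loopback address so it works even without outside connectivity, and the -c flag caps the number of packets sent:

```shell
# Send three echo requests to the loopback address, then stop:
ping -c 3 127.0.0.1
```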
Traceroute
You might already be familiar with the tracert command on Windows
systems. Traceroute accomplishes the same goal of observing how packets are
routed from your local host to a destination address. The way traceroute works
is really interesting. Traceroute sends packets with an ever-increasing TTL, or Time
to Live, value. The TTL field is used by networks to prevent packets from
being sent around in a loop forever. When a device receives a packet, it
decrements the TTL by 1. If the value ever reaches 0, the packet is discarded
and a message is sent back to the source.
For example, traceroute sends its first packet with a TTL of 1 to the
destination. When the first router or layer 3 device gets that packet, it will
decrement the TTL field in the packet by 1 and thus the TTL is 0. Since the value
of the TTL is 0, the router will discard the packet and send back an ICMP time
exceeded message to the source. Traceroute then records the IP address of that
router and time it took to get a message back. Traceroute increases the TTL to 2
and sends the packet again, repeating the process with the next router in the
“chain”. This process repeats until the packet reaches its destination.
That’s a lot of theory, let’s see an example!
Use traceroute and any domain name or IP address now:
3 142-254-144-193.inf.spectrum.com (142.254.144.193) 39.544 ms
49.353 ms 49.090 ms
4 lag-63.mdtwohrs02h.netops.charter.com (24.29.4.197) 56.695 ms
56.641 ms 56.612 ms
5 lag-44.fpkhoh0301r.netops.charter.com (65.29.38.110) 48.874 ms
48.833 ms 48.782 ms
6 lag-11.mcr11blasoh56.netops.charter.com (65.29.33.66) 54.551
ms lag-11.dytnoh5501r.netops.charter.com (65.29.33.206) 25.436 ms
19.323 ms
7 lag-28.rcr01clmkohpe.netops.charter.com (65.29.1.44) 25.019 ms
21.453 ms 26.597 ms
8 lag-25.chctilwc00w-bcr00.netops.charter.com (107.14.17.252)
35.699 ms lag-15.chctilwc00w-bcr00.netops.charter.com (66.109.6.68)
28.653 ms lag-27.vinnva0510w-bcr00.netops.charter.com (66.109.6.66)
31.493 ms
9 lag-11.chcgildt87w-bcr00.netops.charter.com (66.109.6.20)
33.576 ms lag-31.chcgildt87w-bcr00.netops.charter.com (66.109.10.82)
33.445 ms lag-31.asbnva1611w-bcr00.netops.charter.com (107.14.18.82)
36.200 ms
10 74.125.147.54 (74.125.147.54) 38.337 ms 24.30.201.159
(24.30.201.159) 38.136 ms 74.125.50.44 (74.125.50.44) 36.028 ms
11 * * *
12 108.170.246.33 (108.170.246.33) 34.563 ms 142.251.52.64
(142.251.52.64) 33.461 ms 108.170.240.97 (108.170.240.97) 33.444 ms
13 142.251.60.205 (142.251.60.205) 29.742 ms 108.170.246.66
(108.170.246.66) 30.687 ms 108.170.243.175 (108.170.243.175) 27.983
ms
14 ord37s35-in-f14.1e100.net (142.250.190.110) 27.861 ms
142.251.49.21 (142.251.49.21) 39.974 ms ord37s35-in-f14.1e100.net
(142.250.190.110) 33.954 ms
Each line in this output is a router between me and google.com. We see the
DNS name of the device and IP address along with response times to that device.
What’s interesting about traceroute output is that you can glimpse into
network infrastructure a little bit. For example, taking one of these first lines,
lag-63.mdtwohrs02h.netops.charter.com, I can tell that this is probably a link
aggregation port with id 63 and the device name is likely mdtwohrs02h. Netops is
most likely the team at Charter doing network engineering. While I can't
decipher the exact syntax of Charter's device names, it's fun to peer inside the
network a bit.
Another line has asbnva which stands for Ashburn, Virginia. There’s a really
great slide deck I'll recommend, put out by NANOG (the North American Network
Operators' Group), called A Practical Guide to (Correctly) Troubleshooting with
Traceroute, that goes way in-depth on some of these hostnames and their history.
3979911459, win 64240, options [mss 1460,sackOK,TS val 3904437079 ecr
0,nop,wscale 7], length 0
tcpdump puts out a lot of data, so I’ve omitted a lot above, but we can quickly
start to interpret what tcpdump is telling us. The first few lines of the output
above we can see my system querying DNS for shellsamurai.com’s IP address.
If we want to save output from tcpdump, we can write it to a file with sudo
tcpdump -w /tmp/my_tcpdump. Tcpdump will dump packets to the file until
you send an interrupt with Control+C. Then, you can move the file to
another system and analyze it with Wireshark.
This is just a quick primer on tcpdump! There are tons of options and
configurations that can be used to filter traffic. One of my favorite resources for
tcpdump is Daniel Miessler’s blog post “A tcpdump Tutorial with examples” —
be sure to check it out if you are interested in learning tcpdump more effectively!
DNS - The Yellow Pages of the Internet
DNS was once described to me as the “Yellow Pages of the Internet”. Like a
phone book (people still use these?) that matches someone's name to their phone
number, DNS, the Domain Name System, maps domain names to IP
addresses.
Instead of internet users needing to have Rainman-like ability to remember
that reddit is at 151.101.1.140, instagram is at 157.240.254.174 and amazon is
located at 205.251.242.103, we can have DNS do the heavy lifting!
DNS is a huge distributed database of all of these mappings between IP
address and domain name. In this section, we’ll look at the basic operations of
DNS, how Linux handles DNS and a few tools to configure DNS.
Zone files hold the actual information stored for a domain. Each zone file
record has a name, TTL, class, type and data. Let's look at an example:
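A hypothetical zone file snippet might look like this. Note that example.com and the addresses below are illustration only (203.0.113.10 comes from a range reserved for documentation):

```text
example.com.      3600  IN  A      203.0.113.10
www.example.com.  3600  IN  CNAME  example.com.
example.com.      3600  IN  MX     10 mail.example.com.
```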
• root nameserver - Responsible for all domains. Does not contain zone
records. Contains a list of TLD nameservers
• .com TLD nameserver - Responsible for all domains ending in .com
• example.com nameserver - Responsible for example.com and all
subdomains
The root nameserver delegates queries down to the .com nameserver, which in turn
delegates queries to the example.com nameserver. This design allows DNS to be
decentralized.
might find the record for shellsamurai.com. When your system makes a
request to shellsamurai.com, the process is kicked off:
First, your system will look at its local cache to see if there is already an IP
address stored for the domain. If there isn’t, the request gets forwarded on to a
recursive DNS resolver.
Then, the recursive resolver checks its local cache to see if it has an entry for
the domain. If it does, it responds with it. If it doesn’t have the entry cached, it
queries the root DNS servers to find who the authoritative DNS server is that
handles .com domain names.
The recursive resolver then queries the TLD (Top Level Domain) DNS server
that handles .com domains and obtains the IP address of the nameserver that is
responsible for shellsamurai.com
The recursive resolver finally then queries the authoritative DNS server to
find the IP address mapped to the website.
Once the authoritative server finds the address, it provides it to the recursive
server who will cache the result and send the IP address back to the local DNS
resolver, who also caches it and provides it to your local system.
/etc/hosts
The file at /etc/hosts acts as a kind of override for host to IP address
mappings. Before Linux calls out to DNS servers, it looks locally to see if there
are any domain to IP address mappings already in place. Here’s an example:
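An /etc/hosts entry pairs an IP address with one or more names, one mapping per line. Here's a hypothetical example (the second entry is a made-up LAN host, not a real one):

```text
127.0.0.1      localhost
192.168.1.50   fileserver.local fileserver
```

With that second line in place, the system would resolve fileserver to 192.168.1.50 without ever making a DNS query.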
/etc/resolv.conf
The file at /etc/resolv.conf used to be the primary way to adjust your DNS
server settings. It’s usually managed for you by the system these days. If you
want to add a name server and change the config, you can use the nameserver
keyword:
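A minimal resolv.conf entry is just the nameserver keyword followed by an IP address, for example:

```text
nameserver 192.168.65.5
```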
Here, my system has its DNS server set to 192.168.65.5. You probably don’t
need to adjust this setting on your own.
;; QUESTION SECTION:
;shellsamurai.com. IN A
;; ANSWER SECTION:
shellsamurai.com. 377 IN A 23.21.157.88
shellsamurai.com. 377 IN A 23.21.234.173
Dig gives us a lot of output by default, which we can filter with other flags
and options. For now, let’s stick with the output we’ve got:
record.
• ;; Query time: 106 msec: This line shows the time it took for the DNS
query to be completed.
• ;; SERVER: 192.168.65.5#53(192.168.65.5): This line shows the IP
address and port number of the DNS server that provided the response.
• ;; WHEN: Sat Mar 11 14:49:20 EST 2023: This line shows the date and
time when the DNS response was received.
• ;; MSG SIZE rcvd: 66: This line shows the size of the DNS response in
bytes. In this case, the response contained 66 bytes of data.
Overall, dig can give us a ton of information about DNS servers that resolve
our query and is very handy for troubleshooting any kind of DNS issue.
Options are any additional flags we’d pass to change how scp works.
Source is the source file or directory we want to transfer.
Destination refers to the location where we want to copy our file or directory.
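Putting those three pieces together, the general shape of an scp command looks like this. The first line uses placeholders rather than literal values, and the remote address in the second is hypothetical:

```shell
scp [options] source destination

# For example, copy a local file to /tmp on a remote host:
scp myfile.txt root@192.168.1.50:/tmp/
```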
We’ll need SSH setup and configured on the systems we’re transferring files
to or from so be sure to brush up on the SSH section of the book to get SSH up
and running. To follow these examples, you’ll need two systems to transfer files
between. Setting up two linux virtual machines might be a good idea. Windows
has a few packages for SCP that can be used as a client as well.
Once you’ve ensured SSH is up and running, let’s try transferring a file:
root@linux_rocks:/tmp# ls -l
total 4
-rw-r--r-- 1 root root 22 Mar 11 15:56 shell_samurais.txt
root@linux_rocks:/tmp# cat shell_samurais.txt
Hello Shell Samurais!
stetson@localhost:~$ mkdir -p my_dir/shell/samurais/secrets/cloudkeys
stetson@localhost:~$ touch my_dir/shell/samurais/secrets/cloudkeys/secret_key.txt
stetson@localhost:~$ scp -r my_dir root@localhost:/tmp
Above, I create a few nested directories and a file inside them. Then, I use scp
with the -r flag to transfer them recursively to my Linux server.
rsync
rsync is yet another file transfer tool we can use, short for remote
synchronization. It's very similar to scp but has a major distinction: rsync
computes what's called a diff and only transfers the differences. If you were
making an upload that got interrupted, you can simply run rsync again and the
files that didn't get transferred will be copied over. rsync can even transfer
just the changed parts of individual files, making it much quicker
and more efficient than scp. rsync is often configured in cron as a simple backup
solution.
rsync also uses SSH to transfer files. rsync’s syntax is simple to use and
similar to scp:
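rsync's general shape mirrors scp. A sketch with a hypothetical remote host (-a preserves permissions and timestamps, -v is verbose, -z compresses data in transit):

```shell
rsync -avz my_dir/ root@192.168.1.50:/tmp/my_dir/
```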
Well done! This has been a brief overview of a few methods of transferring
files with Linux tools that are built right into most distributions. Indeed, file
backup and transfer is an important concept for keeping critical files safe. Learn
it, live it, love it.
6. Real World Samurai Skills and
Interview Questions
Intro
In this chapter, we'll look at what I call some “real-world” Shell Samurai
skills. Although almost everything in this book so far is applicable in the real
world, our goal in this chapter is to run through a few tutorials that you can use
in a career working with Linux.
In addition, this chapter contains some common Interview Questions and
scenarios you might run into!
Installing Nginx Web server
Nginx (pronounced engine-x) is one of the most popular pieces of Web Server
software. In addition to just serving content, Nginx can do much more, like
acting as a load balancer and reverse proxy. Nginx is free and open-source and
was first released in 2004. Nginx was ranked first in web server usage in June 2022,
ahead of Apache and Cloudflare. There's a 1 in 3 chance any website you
visit is using Nginx!
Nginx boasts some additional features and tricks, including its ability to
handle over 10k simultaneous connections while using a very low amount of
memory.
Enough talk about Nginx’s popularity and shiny features, let’s take a look at a
simple Nginx install.
Installing Nginx
As is customary and should be familiar to you by now, we need to first install
Nginx. We can do so with apt-get install nginx. You’ll be prompted to
review new packages and dependencies that will be installed before being asked
if you’d like to continue. Hit Y and then enter to confirm:
libcap2 libcap2-bin libelf1 libfontconfig1 libfreetype6 libgd3
libicu66 libjbig0 libjpeg-turbo8 libjpeg8 libnginx-mod-http-
image-filter libnginx-mod-http-xslt-filter libnginx-mod-mail
libnginx-mod-stream libpam-cap libpng16-16 libtiff5 libwebp6
libx11-6 libx11-data libxau6 libxcb1 libxdmcp6 libxml2 libxpm4
libxslt1.1 nginx nginx-common nginx-core tzdata
0 upgraded, 35 newly installed, 0 to remove and 0 not upgraded.
Need to get 13.9 MB of archives.
After this operation, 54.9 MB of additional disk space will be
used.
Do you want to continue? [Y/n] Y
The installer will then ask a few questions to confirm your timezone and
region, answer accordingly:
Once Nginx has been installed, verify that it’s up and running with systemctl
status nginx. If you see output saying nginx has been disabled or isn’t
running, go ahead and run systemctl start nginx and systemctl enable
nginx to enable it to start on boot up.
Next, navigate to your system's IP address in your browser, such as http://
192.168.1.1, to check if Nginx is up and running! As a reminder, you can check
your IP address using ifconfig. You can also use curl ifconfig.me to return
your system's public IP address. Ifconfig.me is a free service that simply returns
your public IP.
If all went well, you should see Nginx’s default landing page:
This is great but doesn’t help us much if we can only see the default config
page! We want to serve our own webpages on Nginx!
If we navigate to /var/www/html and ls the contents, we'll see an
index.nginx-debian.html file there:
root@linux_rocks:/var/www/html# ls
index.nginx-debian.html
Let’s rename it with .bak on the end so we can keep that file. Then, touch a
new index.html file inside of /var/www/html:
root@linux_rocks:/var/www/html# mv index.nginx-debian.html
index.nginx-debian.html.bak
root@linux_rocks:/var/www/html# touch index.html
Now, open your favorite text editor and we’ll write some HTML of our own
inside index.html! I’m using Tailwind for some basic CSS styling in this
example, but feel free to omit that tag if you just want to see a page quickly:
<!DOCTYPE html>
<html lang="en">
<head>
<title>My first Nginx Web Page!</title>
<script src="https://cdn.tailwindcss.com"></script>
</head>
<body>
<div class="container">
Welcome to my web page
</div>
</body>
</html>
Save the file and close it. Go back to your browser and hit refresh!
Nice! We’ve just served our own web page from scratch with Nginx, well
done! There are many, many more ways to configure Nginx, but I wanted to keep
it brief and simply introduce you to this software.
As mentioned, around 30% of websites today are using Nginx in some form.
Nginx can be configured in many more ways, acting as a proxy or load
balancer, among other use-cases. Adding Nginx to your tool belt is a
straightforward way to make yourself more valuable as a Shell Samurai!
An Overview of Git
In addition to developing the Linux kernel, Linus Torvalds also originally
developed Git in 2005 to help with development on Linux! Git, the software,
is open-source. You may have heard that Microsoft acquired GitHub and assume
this means they own Git too. GitHub is simply a Git hosting platform. There are
other Git hosting platforms, like GitLab and Bitbucket. You don't have to host
code on GitHub.
Git, the software, is a distributed version control system. This means that every
developer keeps a complete copy of the codebase on their local machine. They
don’t need to be connected to a file share or central server to make changes or
contributions to code.
Git has a concept of branching and merging, which allows developers to make
isolated changes to the codebase to add new features or bug fixes. Then, they can
merge their code back into the main or master codebase.
Every single change that is made to the codebase is logged and tracked over
time. If anything goes wrong or you want to see the history of a piece of code, Git
makes it easy to do.
You’ll see a page with a myriad of options. Name your Repository anything
you’d like and select Private. I’ve decided to set my repository to public and call
it my_first_repo.
When you’re happy with your repository name and privacy settings, hit
Create repository and you’ll be greeted with some instructions:
Creating a New Repository on the Command Line
Hit the copy button underneath the instructions “…or create a new
repository on the command line” to save those commands to your clipboard.
links. Putting a # before a piece of text will make it a large Heading 1.
git init
This command will create a new git repository on our local system.
Now that we know how to create a repository, let’s actually do it! Create a
new folder anywhere on your system. You can call it whatever you want, but I
like to name the folder the same as my repository, my_first_repo in this case. I’ll
put my folder under my home directory and then a projects folder:
stetson@linux_rocks:~$ mkdir -p projects/my_first_repo
I provide the -p flag here because the projects folder didn’t exist yet and I
wanted to create both it and the my_first_repo folder. Go ahead and cd into
your new folder.
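Inside the new folder, initializing the repository is a single command. A quick sketch using the directory names from above:

```shell
# Move into the project folder and create an empty git repository there:
cd ~/projects/my_first_repo
git init
```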
Configuring Git
We’re almost ready to commit our changes, but git needs a little more setup.
First, we’ll set our git email and name:
Click Developer Settings on the left-hand pane on the bottom.
Finally, click tokens (classic):
Now click Generate new token, then Generate new token (classic).
Now, give the access token a name and expiration date. Github will ask what
permissions the key should have. I’ll select everything under repo and
write:packages. The other permissions shouldn’t be necessary. Scroll to the
bottom and click generate token.
Copy the token and keep it somewhere safe! Github won’t show it to us again
but you can always generate another. This token will act as our password to
authenticate from the command-line.
Here’s mine with a few characters censored in-case I forget to delete this
token and some devious reader steals it!
Back at our command line, run the previous commands we copied from the
repository page. Those commands will add a README.md, commit it and push
it up to Github. Either type these by hand to get some extra practice or copy and
paste them into your shell. When you get to the step to git push -u origin
main, git will ask for your username and password. Enter your username and
then enter the personal access token as the password.
If all went well, you should see your code being uploaded to the remote
origin we setup:
Back in your browser window, refresh the repository we created and you’ll
see our first file, README.md listed along with the text inside it displayed:
Let’s get some more practice in. Go back to your command line now and edit
the README.md file and push it back up to Github. If you want to get fancy,
research Markdown formatting and implement some styling in your file!
Once again, here’s a reminder on how to push to Github once you’ve made
your changes:
Add our file, README.md to staging to tell git we’re about to add it to a
commit.
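As a sketch, the three commands behind that reminder look like this (the commit message is just an example; use whatever describes your change):

```shell
# Stage the file, record the change, and push it to the remote:
git add README.md
git commit -m "Update README.md"
git push
```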
Counting objects: 100% (5/5), done.
Delta compression using up to 4 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 362 bytes | 362.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local
object.
To https://github.com/shell-samurai/my_first_repo.git
889beec..a8d6010 main -> main
Push the code! Note that we don’t have to set the upstream as origin and
branch as main again since we already did that previously. Entering our personal
access token every time is kind of a pain though…I recommend setting up one of
the other authentication methods like SSH or using the gh client if you’re going
to be using git more than for just this tutorial.
Pulling Code
Let’s say that someone else has modified our code base and we want to get
those changes to our local system. How can we do that?
First, let’s make some changes we don’t have on our system. Go to your
README.md file and click the pencil icon to edit the markdown file in your
browser.
Back to your shell, run git pull:
1 file changed, 2 insertions(+)
If you examine the README.md file, you’ll see that git has pulled the changes
down to your local codebase! Nice!
Want to see the repository we worked on together? It’s available here:
https://github.com/shell-samurai/my_first_repo
Interview Questions and Scenarios
questions like “Describe how the Linux boot process works”. When interviewers
blindly toss questions at you that are abstract, it’s usually a red flag for me.
Situation:
Well, my team maintained a legacy server that was constantly experiencing
downtime issues and causing all sorts of trouble.
Task:
I was tasked with taking the lead on the project, planning the migration and
executing on it.
Action:
I put together a task force of 3 people, met with them weekly, drafted a
maintenance plan, led the migration and moved the application to the cloud.
Result:
As a result, the application’s stability was improved! The move from the old
hardware was a success. The company application is much more stable now. My
supervisor commended me in my annual review and I was able to build
leadership skills and rapport with my team.
The STAR method helps you think about and provide a clear, structured
response that sells your skills and experience. It also allows you to show how you
worked well with a team in the past.
Hopefully you have some examples where projects went well for you. Maybe
you can't think of a project that went well. That's alright, as long as you're able
to reflect on what went wrong and what could go better next time. No matter what, be
sure to highlight the successful parts and reflect on what could have gone better.
STAR is just one component of the interview process but it’s a tactic I have
used over and over again along with millions of other professionals who want to
sell their skillset to organizations. Use it.
What is Linux?
Linux is an open-source, Unix-like operating system developed in the 1990s
by Linus Torvalds. It’s widely used in servers, supercomputers, mobile devices,
and other embedded systems.
most Linux distributions by default.
What is the difference between a process and a thread in Linux?
A process is an instance of a program that is executing on the system. It has its
own memory space and resources, and it can communicate with other processes
through inter-process communication mechanisms. A thread is a lightweight
process that shares the same memory space as its parent process. It allows for
concurrent execution of multiple tasks within a single process.
Author’s Obvious tip: You probably won’t fire all of these commands off from
memory, but I’d be surprised if you did!
Advanced Linux Interview Questions
What is systemd and what does it do?
systemd is a system and service manager used to control the startup process
and manage system services in Linux. It provides a consistent and unified way to
start, stop, and monitor services, and it can be used to manage system resources
like timers and sockets. systemd is used by many modern Linux distributions,
including Red Hat, Fedora, Debian, and Ubuntu.
messages encrypted with the corresponding public key, which can be freely
distributed. The security of this system is based on the mathematical relationship
between the two keys, making it computationally infeasible to derive the private
key from the public key. Private and public keys can be used to secure
communication over untrusted channels and for authentication.
can be set up to allow or block traffic based on the source or destination IP
address, port number, protocol, and other factors.
iptables can also be used to forward traffic to other hosts or machines on the
network.
What is a container and how does it work?
A container is a lightweight, isolated runtime environment that allows
multiple applications or services to run on a single host without interfering with
each other. It achieves this isolation by using Linux namespaces to create
separate network, process, and file system environments, and by using cgroups
to limit resource usage. Containers can be created and managed using tools like
Docker and Kubernetes.
Outro
Shell Samurai
Master the Linux Command Line
I’d love to hear your feedback on Shell Samurai! You can reach out at
[email protected] or join our Private Discord Server.