0% found this document useful (0 votes)
63 views

Navigating Linux

Uploaded by

PRANAV TIWARI
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
63 views

Navigating Linux

Uploaded by

PRANAV TIWARI
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 118

Navigating Linux System Commands

A guide for beginners to the Linux Shell and GNU coreutils

Sayan Ghosh

June 13, 2024

IIT Madras
BS Data Science and Applications
Disclaimer
This document is a companion activity book for the System Commands (BSSE2001) course taught by Prof.
Gandham Phanikumar at IIT Madras BS Program. This book contains resources, references, questions
and solutions to some common questions on Linux commands, shell scripting, grep, sed, awk, and other
system commands.

This was prepared with the help and guidance of the course instructors:

Santhana Krishnan and Sushil Pachpinde

Copyright
© This book is released under the public domain, meaning it is freely available for use and distribution
without restriction. However, while the content itself is not subject to copyright, it is requested that proper
attribution be given if any part of this book is quoted or referenced. This ensures recognition of the original
authorship and helps maintain transparency in the dissemination of information.
Colophon
This document was typeset with the help of KOMA-Script and LATEX using the kaobook class.
The source code of this book is available at:
https://github.com/sayan01/se2001-book
(You are welcome to contribute!)
Edition
Compiled on June 13, 2024
UNIX is basically a simple operating system, but you have to be a genius to
understand the simplicity.

– Dennis Ritchie
Preface

Through this work I have tried to make learning and understanding the basics of Linux fun and easy. I have
tried to make the book as practical as possible, with many examples and exercises. The structure of the book
follows the structure of the course BSSE2001 - System Commands, taught by Prof. Gandham Phanikumar at
IIT Madras BS Program. .
The book takes inspiration from the previous works done for the course,

▶ Sanjay Kumar’s Github Repository


▶ Cherian George’s Github Repository
▶ Prabuddh Mathur’s TA Sessions

as well as external resources like:

▶ Robert Elder’s Blogs and Videos


▶ Aalto University, Finland’s Scientific Computing - Linux Shell Crash Course

The book covers basic commands, their motivation, use cases, and examples. The book also covers some
advanced topics like shell scripting, regular expressions, and text processing using sed and awk.
This is not a substitute for the course, but a companion to it. The book is a work in progress and any
contribution is welcome at https://github.com/sayan01/se2001-book

Sayan Ghosh
Contents

Preface v

Contents vii

1 Essentials of Linux 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 What is Linux? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Desktop Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.3 Window Managers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.4 Why Linux? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.5 What is Shell? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.6 Shell vs Terminal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.7 Why the Command Line? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.8 Command Prompt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Simple Commands in GNU Core Utils . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 File System Navigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.2 Manuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2.3 System Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.4 File Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2.5 Text Processing and Pagers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.2.6 Aliases and Types of Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.2.7 User Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.2.8 Date and Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3 Navigating the File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.3.1 What is a File System? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.3.2 In Memory File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.3.3 Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.3.4 Basic Commands for Navigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.4 File Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.4.1 Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.4.2 Write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.4.3 Execute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.4.4 Interesting Caveats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.4.5 Changing Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.4.6 Special Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.4.7 Octal Representation of Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
1.5 Types of Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.5.1 Regular Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.5.2 Directories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.5.3 Symbolic Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.5.4 Character Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.5.5 Block Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
1.5.6 Named Pipes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
1.5.7 Sockets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
1.5.8 Types of Regular Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
1.6 Inodes and Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
1.6.1 Inodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
1.6.2 Separation of Data, Metadata, and Filename . . . . . . . . . . . . . . . . . . . . . . . 48
1.6.3 Directory Entries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
1.6.4 Hard Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
1.6.5 Symbolic Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
1.6.6 Symlink vs Hard Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
1.6.7 Identifying Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
1.6.8 What are . and ..? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

2 Command Line Editors 55


2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.1.1 Types of Editors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.1.2 Why Command Line Editors? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.1.3 Mouse Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.1.4 Editor war . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.1.5 Differences between Vim and Emacs . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.1.6 Nano: The peacemaker amidst the editor war . . . . . . . . . . . . . . . . . . . . . . 58
2.2 Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.2.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.2.2 Ed Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.2.3 Exploring Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
2.3 Emacs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.3.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.3.2 Exploring Emacs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.4 Nano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
2.4.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
2.4.2 Exploring Nano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
2.4.3 Editing A Script in Nano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

3 Networking and SSH 87


3.1 Networking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.1.1 What is networking? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.1.2 Types of Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.1.3 Devices in a Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.1.4 IP Addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.1.5 Subnetting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.1.6 Private and Public IP Addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.1.7 CIDR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.1.8 Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.1.9 Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.1.10 Firewalls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.1.11 SELinux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.1.12 Network Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.2 SSH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.2.1 What is SSH? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.2.2 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.2.3 How does SSH work? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.2.4 Key-based Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
3.2.5 Configuring your SSH keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
3.2.6 Sharing your public key . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
3.2.7 How to login to a remote server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
3.2.8 Call an exorcist, there’s a daemon in my computer . . . . . . . . . . . . . . . . . . . . 104
List of Figures

1.1 Linux Distributions Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2


1.2 Desktop Environment Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Operating System Onion Rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 GNU Core Utils Logo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 ls -l Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.6 Linux Filesystem Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.7 Relative Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.8 File Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.9 Octal Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
1.10 System Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
1.11 Inode and Directory Entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
1.12 Directed Acyclic Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
1.13 Abstract Representation of Symbolic Links and Hard Links . . . . . . . . . . . . . . . . . . . . . 52
1.14 Symbolic Links and Hard Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

2.1 A Teletype . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.2 Ken Thompson . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.3 Dennis Ritchie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.4 Xerox Alto, one of the first VDU terminals with a GUI, released in 1973 . . . . . . . . . . . . . . 60
2.5 A first generation Dectape (bottom right corner, white round tape) being used with a PDP-11
computer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.6 George Coulouris . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.7 Bill Joy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.9 Stevie Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.8 The Keyboard layout of the ADM-3A terminal . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.10 Bram Moolenaar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.11 The inital version of Vim, when it was called Vi IMitation . . . . . . . . . . . . . . . . . . . . . . 64
2.12 Vim 9.0 Start screen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.13 Neo Vim Window editing this book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.14 Simplified Modes in Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
2.15 Detailed Modes in Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.16 Vim Cheat Sheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.18 Richard Stallman - founder of GNU and FSF projects . . . . . . . . . . . . . . . . . . . . . . . . 80
2.17 Emacs Logo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.19 Guy L. Steele Jr. combined many divergent TECO with macros to create EMACS . . . . . . . . . 81
2.20 James Gosling - creator of Gosling Emacs and later Java . . . . . . . . . . . . . . . . . . . . . . . 81
2.21 ADM-3A terminal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
2.22 Space Cadet Keyboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.23 Nano Text Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.1 Types of Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.2 Growth Rate of Different Functions - Note how quickly 𝑛 2 grows . . . . . . . . . . . . . . . . . 87

3.3 Hub and Spoke Model Employed by Airlines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88


3.4 LAN and WAN connecting to the Internet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.5 Hub, Switch, Router connecting to the Internet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.6 SELinux Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.7 Symmetric Encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.8 Symmetric Encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.9 Asymmetric Encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

List of Tables

1.1 Basic Shortcuts in Terminal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5


1.2 Basic Commands in GNU Core Utils . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Manual Page Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4 Keys in Info Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.5 Escape Characters in echo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.6 Date Format Specifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.7 Linux Filesystem Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.8 Linux Filesystem Directory Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.9 Octal Representation of Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
1.10 Types of Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.11 Metadata of a File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
1.12 Symlink vs Hard Link . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

2.1 History of Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58


2.2 Ed Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.3 Commands for location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.4 Commands for Editing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.5 Ex Commands in Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.6 Navigation Commands in Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
2.7 Moving the Screen Commands in Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
2.8 Replacing Text Commands in Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
2.9 Toggling Case Commands in Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
2.10 Deleting Text Commands in Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
2.11 Deleting Text Commands in Vim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
2.12 Address Types in Search and Replace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2.13 Keys to enter Insert Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2.14 History of Emacs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.15 Navigation Commands in Emacs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.16 Exiting Emacs Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.17 Searching Text Commands in Emacs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.18 Copying and Pasting Commands in Emacs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.19 File Handling Commands in Nano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
2.20 Editing Commands in Nano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

3.1 Private IP Address Ranges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92


3.2 Well-known Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.3 Network Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Essentials of Linux 1
1.1 Introduction . . . . . . . . 1
1.1 Introduction 1.1.1 What is Linux? . . . . . . . 1
1.1.2 Desktop Environments . 2
1.1.1 What is Linux? 1.1.3 Window Managers . . . . 3
1.1.4 Why Linux? . . . . . . . . 4
Definition 1.1.1 (Linux) Linux is a kernel that is used in many 1.1.5 What is Shell? . . . . . . . 4
1.1.6 Shell vs Terminal . . . . . 4
operating systems. It is open source and free to use. Linux is not an
1.1.7 Why the Command Line? 5
operating system unto itself, but the core component of it.
1.1.8 Command Prompt . . . . 6
1.2 Simple Commands in
So what is Ubuntu? Ubuntu is one of the many distributions that use the GNU Core Utils . . . . . . 7
Linux kernel. It is a complete operating system that is free to use and 1.2.1 File System Navigation . 8
open source. It is based on the Debian distribution of Linux. There are 1.2.2 Manuals . . . . . . . . . . . 9
many other distributions of Linux, such as: 1.2.3 System Information . . . 11
1.2.4 File Management . . . . . 14
▶ Debian - Used primarily on servers, it is known for its stability. 1.2.5 Text Processing and
Pagers . . . . . . . . . . . . 17
• Ubuntu - A commercial distribution based on Debian which
1.2.6 Aliases and Types of
is popular among new users.
Commands . . . . . . . . . 21
• Linux Mint - A distribution based on Ubuntu which is known 1.2.7 User Management . . . . 24
for its ease of use. It is one of the distributions recommended 1.2.8 Date and Time . . . . . . . 25
to new users.
1.3 Navigating the File Sys-
• Pop OS - A distribution based on Ubuntu which is known for tem . . . . . . . . . . . . . . 28
its focus on developers, creators, and gamers. 1.3.1 What is a File System? . . 28
• and many more 1.3.2 In Memory File System . 29
▶ Red Hat Enterprise Linux (RHEL) - A commercial distribution 1.3.3 Paths . . . . . . . . . . . . . 32
1.3.4 Basic Commands for
used primarily in enterprises. It is owned by Red Hat and is targeted
Navigation . . . . . . . . . 33
primarily to companies with their free OS paid support model.
1.4 File Permissions . . . . . . 35
• Fedora - A community-driven distribution sponsored by Red 1.4.1 Read . . . . . . . . . . . . . 36
Hat. It is known for its cutting-edge features and is used by 1.4.2 Write . . . . . . . . . . . . . 36
developers. It remains on the upstream of RHEL, receiving 1.4.3 Execute . . . . . . . . . . . 36
new features before RHEL. 1.4.4 Interesting Caveats . . . . 36
• CentOS - A discontinued distribution based on RHEL. It 1.4.5 Changing Permissions . . 38
was known for its stability and was used in servers. It was 1.4.6 Special Permissions . . . 38
downstream from RHEL. 1.4.7 Octal Representation of
• CentOS Stream - It is a midstream between the upstream de- Permissions . . . . . . . . 40
velopment in Fedora Linux and the downstream development 1.5 Types of Files . . . . . . . 43
for Red Hat Enterprise Linux. 1.5.1 Regular Files . . . . . . . . 43
1.5.2 Directories . . . . . . . . . 43
• Rocky Linux - A distribution created by the Rocky Enterprise
1.5.3 Symbolic Links . . . . . . 43
Software Foundation after the announcement of discontinu-
1.5.4 Character Devices . . . . . 43
ation of CentOS. It is a downstream of RHEL that provides 1.5.5 Block Devices . . . . . . . 44
feature parity and binary compatibility with RHEL. 1.5.6 Named Pipes . . . . . . . . 44
• Alma Linux - A distribution created by the CloudLinux team 1.5.7 Sockets . . . . . . . . . . . 45
after the announcement of discontinuation of CentOS. It is a 1.5.8 Types of Regular Files . . 45
downstream of RHEL that provides feature parity and binary 1.6 Inodes and Links . . . . . 47
compatibility with RHEL. 1.6.1 Inodes . . . . . . . . . . . . 47
▶ Arch Linux - A community-driven distribution known for its 1.6.2 Separation of Data, Meta-
data, and Filename . . . . 48
simplicity and customizability. It is a rolling release distribution,
1.6.3 Directory Entries . . . . . 49
which means that it is continuously updated. It is a bare-bones
1.6.4 Hard Links . . . . . . . . . 49
1.6.5 Symbolic Links . . . . . . 50
1.6.6 Symlink vs Hard Links . 52
1.6.7 Identifying Links . . . . . 52
1.6.8 What are . and ..? . . . . . 53
2 1 Essentials of Linux

distribution that lets the user decide which packages they want to
install.
• Manjaro - A distribution based on Arch Linux which is known
for its user-friendliness. It is a rolling release distribution that
is easier to install for new users. It uses a different repository
for packages with additional testing.
• EndeavourOS - A distribution based on Arch Linux which is
known for its simplicity and minimalism. It is a rolling release
distribution that is easier to install for new users. It uses the
same repository for packages as Arch Linux.
• Artix Linux - It uses the OpenRC init system instead of
systemd. It also offers other init systems like runit, s6, dinit.
▶ openSUSE - It is a free and open-source Linux distribution devel-
oped by the openSUSE project. It is offered in two main variations:
1: “Free software” means software that Tumbleweed, an upstream rolling release distribution, and Leap,
respects users’ freedom and community. a stable release distribution which is sourced from SUSE Linux
Roughly, it means that the users have
Enterprise.
the freedom to run, copy, distribute,
study, change and improve the software. • Tumbleweed - Rolling Release upstream.
Thus, “free software” is a matter of
• Leap - Stable Release downstream.
liberty, not price. To understand the
concept, you should think of “free” as ▶ Gentoo - A distribution known for its customizability and per-
in “free speech,” not as in “free beer.” formance. It is a source-based distribution, which means that the
We sometimes call it “libre software,”
borrowing the French or Spanish word user compiles the software from source code. It is known for its
for “free” as in freedom, to show we do performance optimizations for the user’s hardware.
not mean the software is gratis. ▶ Void - It is an independent rolling-release Linux distribution that
uses the X Binary Package System package manager, which was
You may have paid money to get
copies of a free program, or you may designed and implemented from scratch, and the runit init system.
have obtained copies at no charge. But Excluding binary kernel blobs, a base install is composed entirely
regardless of how you got your copies, of free1 software.
you always have the freedom to copy
and change the software, even to sell
copies.
1.1.2 Desktop Environments
- GNU on Free Software

Linux Distributions Share Definition 1.1.2 (Desktop Environment) A desktop environment is


a collection of software designed to give functionality and a certain
Ubuntu
look and feel to a desktop operating system. It is a combination of a
Debian 34% window manager, a file manager, a panel, and other software that
16% provides a graphical user interface and utilities to a regular desktop
user, such as volume and brightness control, multimedia applications,
10% settings panels, etc. This is only required by desktop (and laptop)
RHEL 0% uses and are not present on server instances.
39%
Gentoo
Others There are many desktop environments available for Linux, but the
important ones are:
Figure 1.1: Linux Distributions Usage in
2024
▶ GNOME - One of the most popular desktop environments for
Linux. It is known for its simplicity and ease of use. It is the default
2: GTK is a free software cross-platform
desktop environment for many distributions, including Ubuntu.
widget toolkit for creating graphical user
interfaces. It is based on the GTK Toolkit. 2 Popular distros shipping by
3: Ubuntu used to ship with Unity as default with GNOME are Fedora, RHEL, CentOS, Debian, Zorin,
the default desktop environment, but and Ubuntu. 3
switched to GNOME in 2017.
▶ KDE Plasma - A highly customizable desktop environment based
4: Qt is cross-platform application devel-
opment framework for creating graphi-
on the Qt Tookit. 4 Many distributions like Slackware and Open-
cal user interfaces.
1.1 Introduction 3

SUSE ship with KDE Plasma as the default desktop environment,


and most others have the option to install with KDE Plasma.
Ubuntu’s KDE Plasma variant is called Kubuntu.
▶ Xfce - A lightweight desktop environment known for its speed
and simplicity. It is based on the GTK Toolkit. It is used in many
distributions like Xubuntu, Manjaro, and Fedora.
▶ LXQt - A lightweight desktop environment known for its speed
and simplicity. It is based on the Qt Toolkit. It is used in many
distributions like Lubuntu.
▶ Cinnamon
▶ MATE
Desktop Environment Share
Xfce
It is important to note that although some distributions come pre-bundled
with certain Desktop Environments, it doesn’t mean that you cannot use 36%

another DE with it. DE are simply packages installed on your distribution, KDE Plasma 28%
and almost all the popular DEs can be installed on all distributions. Many 8%
Others
distributions also come with multiple pre-bundled desktop environments 11%
17%
due to user preferences. Most server distributions and some enthusiast Cinnamon
distributions come with no pre-bundled desktop environment, and let Gnome Shell
the user determine which one is required, or if one is required. Figure 1.2: Desktop Environment Usage
in 2022

1.1.3 Window Managers

Definition 1.1.3 (Window Manager) A window manager is system


software that controls the placement and appearance of windows
within a windowing system in a graphical user interface. It is a part
of the desktop environment, but can also be used standalone. It is
responsible for the appearance and placement of windows, and can
also provide additional functionality like virtual desktops, window
decorations, window title bars, and tiling.

Although usually bundled with a desktop environment, many window


managers are also standalone and installed separately by the user if they
don’t want to use all the application from a single desktop environment.

Some popular window managers are:


5: A tiling window manager is a win-
dow manager that automatically splits
the screen into non-overlapping frames,
▶ Openbox - A lightweight window manager known for its speed
which are used to display windows. Most
and simplicity. It is used in many distributions like Lubuntu. desktop environments ship with a float-
▶ i3 - It is a tiling window manager 5 which is usually one of the first ing window manager instead, which
window managers that users try when they want to move away users of other operating systems are
more familiar with.
from a desktop environment and to a tiling window manager.
▶ awesome - A tiling window manager that is highly configurable
and extensible. It is written in Lua and is known for its beautiful
configurations.
▶ bspwm - A tiling window manager. It is based on binary space
partitioning.
▶ dwm - A dynamic tiling window manager that is known for
its simplicity and minimalism. It is written in C and is highly
configurable.
4 1 Essentials of Linux

1.1.4 Why Linux?

You might be wondering "Why should I use Linux?" Most people use
either Windows or Mac on their personal computers. Although these
consumer operating systems get the job done, they don’t let the user
6: Although Linux is just a kernel and completely control their own hardware and software. Linux 6 is a free and
not an entire operating system, through-
open-source operating system that gives the user complete control over
out this book I would be referring to
GNU/Linux, the combination of GNU their system. It is highly customizable and can be tailored to the user’s
core utilities and the Linux kernel, as needs. It is also known for its stability and security. It is used in almost all
Linux in short. servers, supercomputers, and embedded systems. It is also used in many
consumer devices like Android phones, smart TVs, and smartwatches.

In this course we will be covering how to navigate the linux file system,
how to manage files, how to manage the system, and how to write scripts
to automate tasks. In the later part of the course we go over concepts
such as pattern matching and text processing.

This course does not go into details of the linux kernel, but rather attempts
to make the reader familiar with the GNU core utils and able to navigate
around a linux server easily.

1.1.5 What is Shell?

The kernel is the core of the operating system. It is responsible for


managing the hardware and providing services to the user programs.
The shell is the interface between the user and the kernel (Figure 1.3).
Through the shell we can run many commands and utilities, as well as
some inbuilt features of the shell.

Definition 1.1.4 (Shell) A shell is a command-line interpreter that


Figure 1.3: Operating System Onion provides a way for the user to interact with the operating system. It
Rings - The layers of an operating system
takes commands from the user and executes them. It is a program
that provides the user with a way to interact with the operating
system.

The most popular shell in Linux is the bash shell. It is the default shell in
7: POSIX, or Portable Operating System
Interface, is a set of standards that de- most distributions. It is a POSIX-compliant 7 shell. There are many other
fine the interfaces and environment that shells available, such as zsh, fish 8 , dash, csh, ksh, and tcsh. Each shell
operating systems use to access POSIX- has its own features and syntax, but most of the keywords and syntax are
compliant applications. POSIX standards
are based on the Unix operating system
the same. In this course we will be covering only the bash shell and its
and were released in the late 1980s. syntax, but most of what we learn here is also applicable on other shells
8: Fish is a non-POSIX compliant shell as well.
that is known for its features like auto-
suggestion, syntax highlighting, and tab
completions. Although a useful alterna-
tive to other shells for scripting, it should 1.1.6 Shell vs Terminal
not be set as the default shell.
Definition 1.1.5 (Terminal) A terminal is a program that provides a
way to interact with the shell. It is a program that provides a text-based
interface to the shell. It is also known as a terminal emulator.

The terminal is the window that you see when you open a terminal
program. It provides a way to interact with the shell. The shell is the
program that interprets the commands that you type in the terminal.
1.1 Introduction 5

The terminal is the window that you see, and the shell is the program
that runs in that window. Whereas the shell is the application that
is parsing your input and running the commands and keywords, the
terminal is the application that lets you see the shell graphically. There are
multiple different terminal emulators, providing a lot of customization
and beautification to the terminal, as well as providing useful features
such as scroll back, copying and pasting, and so on.
Some popular terminal emulators are:

▶ gnome-terminal - The default terminal emulator for the GNOME


desktop environment.
▶ konsole - The default terminal emulator for the KDE desktop
environment.
▶ xfce4-terminal - The default terminal emulator for the Xfce desktop
environment.
▶ alacritty - A terminal emulator known for its speed and simplicity.
▶ terminator - A terminal emulator known for its features like split-
ting the terminal into multiple panes.
▶ tilix - A terminal emulator known for its features like splitting the
terminal into multiple panes.
▶ st - A simple terminal emulator known for its simplicity and
minimalism.
▶ urxvt
▶ kitty
▶ terminology

In most terminal emulators, there are some basic shortcuts that can be
used to make the terminal experience more efficient. Some of the basic
shortcuts are listed in Table Table 1.1.

Table 1.1: Basic Shortcuts in Terminal


Shortcut Description
Ctrl + C Terminate the current process
Ctrl + D Exit the shell
Ctrl + L Clear the terminal screen
Ctrl + A Move the cursor to the beginning of the line
Ctrl + E Move the cursor to the end of the line
Ctrl + U Delete from the cursor to the beginning of the line
Ctrl + K Delete from the cursor to the end of the line
Ctrl + W Delete the word before the cursor
Ctrl + Y Paste the last deleted text
Ctrl + R Search the command history
Ctrl + Z Suspend the current process
Ctrl + \ Terminate the current process
Ctrl + S Pause the terminal output
Ctrl + Q Resume the terminal output

1.1.7 Why the Command Line?

Both the command line interface (CLI) and the graphical user interface
(GUI) are simply shells over the operating system’s kernel. They let you
interact with the kernel, perform actions and run applications.
GUI:
6 1 Essentials of Linux

The GUI requires a mouse and a keyboard, and is more intuitive and
easier to use for beginners. But it is also slower and less efficient than
the CLI. The GUI severely limits the user’s ability to automate tasks and
perform complex operations. The user can only perform those operations
that the developers of the GUI have thought of and implemented.
CLI:
The CLI is faster and more efficient than the GUI as it lets the user use the
keyboard to perform actions. Instead of clicking on pre-defined buttons,
the CLI lets you construct your instruction to the computer using syntax
and semantics. The CLI lets you combine simple commands that do one
thing well to perform complex operations. The biggest advantage of the
CLI is that it lets you automate tasks. It might be faster for some users
to rename a file from file1 to file01 using the GUI, but it will always be
faster to automate this using the CLI if you want to do this for thousands
of files in the folder.
In this course we will be learning how to use the CLI to navigate the file
system, manage files, manage the system, process text, and write scripts
to automate tasks.

1.1.8 Command Prompt

The command prompt is the text that is displayed in the terminal to


indicate that the shell is ready to accept commands. It usually ends with
a $ or a # symbol. The $ symbol indicates that the shell is running as a
normal user, and the # symbol indicates that the shell is running as the
root user. The root user has complete control over the system and can
perform any operation on the system.
An example of a command prompt is:
1 username@hostname:~$

Here, username is the name of the user, hostname is the name of the
computer, and $ indicates that the shell is running as a normal user. The
∼ symbol indicates that the current working directory is the user’s home
9: The home directory is the direc- directory. 9 This prompt can be changed and customized according to
tory where the user’s files and set- the user’s preferences using the PS1 variable discussed in Chapter ??.
tings are stored. It is usually located at
/home/username. This can be shorted to
~ in the shell.
1.2 Simple Commands in GNU Core Utils 7

1.2 Simple Commands in GNU Core Utils

Definition 1.2.1 (GNU Core Utils) The GNU Core Utilities are the
basic file, shell, and text manipulation utilities of the GNU operating
system. These are the utilities that are used to interact with the
operating system and perform basic operations. a
a GNU Core Utils
Figure 1.4: GNU Core Utils Logo

The shell lets you simply type in the name of the command and press
enter to run it. You can also pass arguments to the command to modify
its behavior. Although the commands are simple, they are powerful and
can be combined to perform complex operations. 10 10: The combination of commands to
perform complex operations is called pip-
ing. This will be covered later.
Some basic commands in the core-utils are listed in Table Table 1.2.
Table 1.2: Basic Commands in GNU Core
Command Description Utils
ls List the contents of a directory
cd Change the current working directory
pwd Print the current working directory
mkdir Create a new directory
rmdir Remove a directory
touch Create a new file
rm Remove a file
cp Copy a file
mv Move a file
echo Print a message
cat Concatenate and display the contents of a file
less Display the contents of a file one page at a time
head Display the first few lines of a file
tail Display the last few lines of a file
find Find files and directories
locate Find files and directories
which Find the location of a command
uname Print system information
ps Display information about running processes
kill Terminate a process
chmod Change the permissions of a file
chown Change the owner of a file
chgrp Change the group of a file
date Print the current date and time
cal, ncal Print a calendar
df Display disk space usage
du Display disk usage
free Display memory usage
top Display system information
history Display the command history
sleep Pause the shell for a specified time
true Do nothing, successfully
false Do nothing, unsuccessfully
tee Read from stdin and write to stdout and files
whoami Print the current user
groups Print the groups the user belongs to
clear Clear the terminal screen
exit Exit the shell
8 1 Essentials of Linux

1.2.1 File System Navigation

pwd:

The pwd command prints the current working directory. The current
working directory is the directory that the shell is currently in. The shell
starts in the user’s home directory when it is opened. The pwd command
prints the full path of the current working directory.
1 $ pwd
2 /home/username

ls:

The ls command lists the contents of a directory. By default, it lists the


contents of the current working directory. The ls command can take
arguments to list the contents of a different directory.
1 $ ls
2 Desktop Documents Downloads Music Pictures Videos
11: Hidden files are files whose names
start with a dot. They are hidden by de- We can also list hidden files 11 using the -a flag.
fault in the ls command.
1 $ ls -a
2 . .. .bashrc Desktop Documents Downloads Music Pictures
Videos

Here, the . and .. directories are special directories. The . directory is


the current directory, and the .. directory is the parent directory. The
.bashrc file is a configuration file for the shell which is a hidden file.

ls can also list the details of the files using the -l flag.

1 $ ls -l
2 total 24
3 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Desktop
4 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Documents
5 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Downloads
6 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Music
7 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Pictures
8 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Videos

As seen in Figure 1.5, the first column is the file type and permissions.
The second column is the number of links to the file or directory. The
third and fourth columns are the owner and group of the file or directory.
Figure 1.5: ls -l Output The fifth column is the size of the file or directory. The sixth, seventh,
and eighth columns are the last modified date and time of the file or
12: More details about the file permis-
directory. The ninth column is the name of the file or directory. 12
sions and the file types will be covered
later in the course. We can also list the inode numbers 13 using the -i flag.
13: An inode is a data structure on a
1 $ ls -i
filesystem on Linux and other Unix-like
2 123456 Desktop 123457 Documents 123458 Downloads 123459 Music
operating systems that stores all the in-
formation about a file except its name 123460 Pictures 123461 Videos
and its actual data. This includes the file
type, the file’s owner, the file’s group, Inodes will be discussed in detail later in the course.
the file’s permissions, the file’s size, the
file’s last modified date and time, and cd:
the file’s location on the disk. The inode
is the reference pointer to the data in the The cd command changes the current working directory. It takes the path
disk. to the directory as an argument.
1.2 Simple Commands in GNU Core Utils 9

1 $ cd Documents
2 $ pwd
3 /home/username/Documents

The cd command can also take the ~ symbol as an argument to change to


the user’s home directory. This is the default behavior of the cd command
when no arguments are passed.
1 $ cd
2 $ pwd
3 /home/username

If we want to go back to the previous directory, we can use the - symbol


as an argument to the cd command. 14 14: This internally uses the OLDPWD envi-
ronment variable to change the directory.
1 $ cd Documents More about variables will be covered
2 $ pwd later in the course.
3 /home/username/Documents
4 $ cd -
5 $ pwd
6 /home/username

Question 1.2.1 ls can only show files and directories in the cwd15 , 15: cwd means Current Working Direc-
not subdirectories. True or False? tory

Answer 1.2.1 False. ls can show files and directories in the cwd, and
also in subdirectories. The -R flag can be used to show files and
directories in subdirectories, recursively.

1.2.2 Manuals

man:

How to remember so many flags and options for each of the commands?
The man command is used to display the manual pages for a command.

Definition 1.2.2 (Manual Pages) Manual pages are a type of soft-


ware documentation that provides details about a command, utility,
function, or file format. They are usually written in a simple and con-
cise manner and provide information about the command’s syntax,
options, and usage.

1 $ man ls

This will display the manual page for the ls command. The manual page
is divided into sections, and you can navigate through the sections using
the arrow keys. Press q to exit the manual page.

Example manual page:


1
2 LS(1)
User Commands
LS(1)
3
10 1 Essentials of Linux

4 NAME
5 ls - list directory contents
6
7 SYNOPSIS
8 ls [OPTION]... [FILE]...
9
10 DESCRIPTION
11 List information about the FILEs (the current directory by
default). Sort entries alphabetically if none of -cftuvSUX
nor --sort is specified.
12
13 Mandatory arguments to long options are mandatory for short
options too.
14
15 -a, --all
16 do not ignore entries starting with .
17
18 -A, --almost-all
19 do not list implied . and ..
20 ...

The manual page provides information about the command, its syntax,
options, and usage. It is a good practice to refer to the manual page of a
command before using it.
To exit the manual page, press q.
There are multiple sections in the manual page, man takes the section
number as an argument to display the manual page from that section.
1 $ man 1 ls

This will display the manual page for the ls command from section 1.
The details of the sections can be seen in Table Table 1.3.
Table 1.3: Manual Page Sections
Section Description
1 User Commands
2 System Calls
3 Library Functions
4 Special Files usually found in /dev
5 File Formats and conventions
6 Games
7 Miscellaneous
8 System Administration
9 Kernel Developer’s Manual

Man pages only provide information about the commands and utilities
that are installed on the system. They do not provide information about
the shell builtins or the shell syntax. For that, you can refer to the shell’s
documentation or use the help command.
Some commands also have a --help flag that displays the usage and
options of the command.
Some commands have their own info pages, which are more detailed
than the man pages.
To be proficient with shell commands, one needs to read the man, info,
16: An useful video by Robert Elder
about the differences between man, info,
and help pages. 16
and help can be found on YouTube.
1.2 Simple Commands in GNU Core Utils 11

Exercise 1.2.1 Run man, info, and --help on all the commands
discussed in this section. Note the differences in the information
provided by each of them. Read the documentations carefully and try
to understand how each command works, and the pattern in which
the documentations are written.

info:
The info command is used to display the info pages for a command. The
info pages are more detailed than the man pages for some commands. It
is navigable like a hypertext document. There are links to chapters inside
the info pages that can be followed using the arrow keys and entered
using the enter key. The Table Table 1.4 lists some of the keys that can be
used to navigate the info pages.
Table 1.4: Keys in Info Pages
Key Description
h Display the help page
q Exit the info page
n Move to the next node
p Move to the previous node
u Move up one level
d Move down one level
l Move to the last node
t Move to the top node
g Move to a specific node
<enter> Follow the link
m Display the menu
s Search for a string
S Search for a string (case-sensitive)

help:
The help command is a shell builtin that displays information about the
shell builtins and the shell syntax.
1 $ help read

This will list the information about the read builtin command.
The help command can also be used to display information about the
shell syntax.
1 $ help for

This will list the information about the for loop in the shell.
Help pages are not paged, and the output is displayed in the terminal.
To page the output, one can use the less command.
1 $ help read | less

1.2.3 System Information

uname:
The uname command prints system information. It can take flags to print
specific information about the system. By default, it prints only the kernel
name.
12 1 Essentials of Linux

1 $ uname
2 Linux

17: Here rex is the hostname of the sys- The -a flag prints all the system information. 17
tem, 6.8.2-arch2-1 is the kernel ver-
sion, x86_64 is the architecture, and 1 $ uname -a
GNU/Linux is the operating system. 2 Linux rex 6.8.2-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 28 Mar 2024
17:06:35 +0000 x86_64 GNU/Linux

ps:

The ps command displays information about running processes. By


default, it displays information about the processes run by the current
18: The PID is the process ID, the TTY user that are running from a terminal. 18
is the terminal the process is running
from, the TIME is the time the process 1 $ ps
has been running, and the CMD is the 2 PID TTY TIME CMD
command that is running.
3 12345 pts/0 00:00:00 bash
4 12346 pts/0 00:00:00 ps

There are a lot of flags that can be passed to the ps command to display
more information about the processes. These will be covered in Chapter
??.

Remark 1.2.1 ps has three types of options:


▶ UNIX options
▶ BSD options
▶ GNU options
The UNIX options are the preceded by a hyphen (-) and may be
grouped. The BSD options can be grouped, but should not be preceded
by a hyphen (-). The GNU options are preceded by two hyphens (–).
These are also called long options.
The same action can be performed by using different options, for
example, ps -ef and ps aux are equivalent, although first is using
UNIX options, and the latter is using BSD options.
Another difference in GNU core utils and BSD utils is that the GNU
utils have long options, whereas the BSD utils do not.
BSD utils also usually do not support having flags after the positional
arguments, whereas most GNU utils are fine with this.

kill:

The kill command is used to terminate a process. It takes the process


ID as an argument.
1 $ kill 12345

The kill command can also take the signal number as an argument to
send a signal to the process. For example, the SIGKILL signal can be sent
19: The SIGKILL signal is used to termi-
nate a process immediately. It cannot be to the process to terminate it. 19
caught or ignored by the process. It is
1 $ kill -9 12345
numbered as 9.

free:

The free command is used to display the amount of free and used
memory in the system.
1.2 Simple Commands in GNU Core Utils 13

1 $ free
2 total used free shared buff/
cache available
3 Mem: 8167840 1234560 4567890 123456
2367890 4567890
4 Swap: 2097148 0 2097148

The free command can take the -h flag to display the memory in
human-readable format.
1 $ free -h
2 total used free shared buff/
cache available
3 Mem: 7.8Gi 1.2Gi 4.3Gi 120Mi
2.3Gi 4.3Gi
4 Swap: 2.0Gi 0B 2.0Gi

df:
The df command is used to display the amount of disk space available
on the filesystems.
1 $ df
2 Filesystem 1K-blocks Used Available Use% Mounted on
3 /dev/sda1 12345678 1234567 11111111 10% /
4 /dev/sda2 12345678 1234567 11111111 10% /home

The df command can take the -h flag to display the disk space in
human-readable format.
1 $ df -h
2 Filesystem Size Used Avail Use% Mounted on
3 /dev/sda1 12G 1.2G 9.9G 11% /
4 /dev/sda2 12G 1.2G 9.9G 11% /home

du:
The du command is used to display the disk usage of directories and files.
By default, it displays the disk usage of the current directory.
1 $ du
2 4 ./Desktop
3 4 ./Documents
4 4 ./Downloads
5 4 ./Music
6 4 ./Pictures
7 4 ./Videos
8 28

The du command can take the -h flag to display the disk usage in
human-readable format. The -s flag displays the total disk usage of the
directory.
1 $ du -sh
2 28K .

Question 1.2.2 How to print the kernel version of your system?

Answer 1.2.2 uname -r will print the kernel version of your system.
14 1 Essentials of Linux

uname is a command to print system information. The -r flag is to


print the kernel release. There are other flags to print other system
information.
We can also run uname -a to get all fields and extract only the kernel
info using commands taught in later weeks.

Question 1.2.3 How to see how long your system is running for?
What about the time it was booted up?

Answer 1.2.3 uptime will show how long the system is running for.
uptime -s will show the time the system was booted up.
The -s flag is to show the time of last boot.

Question 1.2.4 How to see the amount of free memory? What about
free hard disk space? If we are unable to understand the big numbers,
how to convert them to human readable format? What is difference
between MB and MiB?

Answer 1.2.4 free will show the amount of free memory.


df will show the amount of free hard disk space.
df -h and free -h will convert the numbers to human readable
format.
MB is Megabyte, and MiB is Mebibyte.
1 MB = 1000 KB, 1 GB = 1000 MB, 1 TB = 1000 GB, this is SI unit.
1 MiB = 1024 KiB, 1 GiB = 1024 MiB, 1 TiB = 1024 GiB, this is 210 unit.

1.2.4 File Management

file:
The file command is used to determine the type of a file. It can take
multiple file names as arguments.
1 $ file file1
2 file1: ASCII text
3 $ file /bin/bash
4 /bin/bash: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV
), dynamically linked, interpreter /lib64/ld-linux-x86-64.so
.2, for GNU/Linux 3.2.0, BuildID[sha1]=1234567890abcdef,
stripped

mkdir:
The mkdir command is used to create new directories. It can take multiple
directory names as arguments.
1 $ mkdir a b c
2 $ ls -F
3 a/ b/ c/
1.2 Simple Commands in GNU Core Utils 15

Exercise 1.2.2 Run man ls to find out what the -F flag does, and why
we used it in the above example.

touch:

The touch command is used to create new files. It can take multiple file
names as arguments. If a file is already present, the touch command
updates the last modified date and time of the file, but does not modify
the contents of the file.
1 $ touch file1 file2 file3
2 $ ls -l
3 -rw-r--r-- 1 username group 0 Mar 1 12:00 file1
4 -rw-r--r-- 1 username group 0 Mar 1 12:00 file2
5 -rw-r--r-- 1 username group 0 Mar 1 12:00 file3
6 $ sleep 60
7 $ touch file3
8 $ ls -l
9 -rw-r--r-- 1 username group 0 Mar 1 12:00 file1
10 -rw-r--r-- 1 username group 0 Mar 1 12:00 file2
11 -rw-r--r-- 1 username group 0 Mar 1 12:01 file3

Exercise 1.2.3 Notice the difference in the last modified date and time
of the file3 file from the other files. Also notice the sleep command
used to pause the shell for 60 seconds.

rmdir:

The rmdir command is used to remove directories. It can take multiple


directory names as arguments.
1 $ mkdir a b c d
2 $ rmdir a b c
3 $ ls -F
4 d/

Remark 1.2.2 The rmdir command can only remove empty directories.
This is a safety feature so that users dont accidentally delete directories
with files in them. To remove directories with files in them along with
those files, use the rm command.

rm:

The rm command is used to remove files and directories. It can take


multiple file and directory names as arguments.
1 $ touch file1 file2 file3
2 $ ls -F
3 file1 file2 file3
4 $ rm file1 file2
5 $ ls -F
6 file3

However, using rm to delete a directory will give an error.


1 $ mkdir a
2 $ rm a
16 1 Essentials of Linux

3 rm: cannot remove ’a’: Is a directory

This is because the rm command does not remove directories by default.


This is a safety feature to prevent users from accidentally deleting
directories with files in them.

To remove directories along with their files, use the -r flag.


1 $ rm -r a

To force the removal of files and directories without a confirmation, use


the -f flag.

Warning 1.2.1 The rm command is a dangerous command. It does


not move the files to the trash, but permanently deletes them. Be
extremely careful when using the rm command. Only use the -f flag
if you are absolutely sure that you want to delete the files.

To force rm to always ask for confirmation before deleting files, use the
-i flag.

1 $ rm -i file3
2 rm: remove regular empty file ’file3’? y

cp:

The cp command is used to copy files. It takes the source file and the
destination file as arguments.
1 $ touch file1
2 $ ls -F
3 file1
4 $ cp file1 file2
5 $ ls -F
6 file1 file2

The cp command can also take the -r flag to copy directories.


1 $ mkdir a
2 $ touch a/file1
3 $ cp -r a b
4 $ ls -R
5 .:
6 a/ b/
7
8 ./a:
9 file1
10
11 ./b:
12 file1

Exercise 1.2.4 Why did we use the -R flag in the above example?
What does it do?

There are three ways to copy files using cp:


1 SYNOPSIS
2 cp [OPTION]... [-T] SOURCE DEST
3 cp [OPTION]... SOURCE... DIRECTORY
1.2 Simple Commands in GNU Core Utils 17

4 cp [OPTION]... -t DIRECTORY SOURCE...

Exercise 1.2.5 There are three ways of running the cp command to


copy a file. Here we have demonstrated only one. Read the manual
page of the cp command to find out the other two ways and try them
out yourself.

mv:

The mv command is used to move files. The syntax is similar to the cp


20: This means that mv also has three
command. 20 It is used to move files from one location to another, or to
ways of running it.
rename files.
1 $ touch file1
2 $ ls -F
3 file1
4 $ mv file1 file2
5 $ ls -F
6 file2

Exercise 1.2.6 Create a directory dir1 using the mkdir command,


then create a file file1 inside dir1. Now move (rename) the dir1
directory to dir2 using the mv command. The newly created directory
should be named dir2 and should contain the file1 file. Were you
require to use the -r flag with the mv command like you would have
in cp command?

1.2.5 Text Processing and Pagers

echo:

The echo command is used to print a message to the terminal. It can take
multiple arguments and print them to the terminal.
1 $ echo Hello, World!
2 Hello, World!

The echo command can also take the -e flag to interpret backslash
escapes.
1 $ echo -e "Hello, \nWorld!"
2 Hello,
3 World!

Some escape characters in echo are listed in Table Table 1.5 on the
following page.

Exercise 1.2.7 Run the command echo -e "\x41=\0101" and try to


understand the output and the escape characters used.

cat:

The cat command is used to concatenate and display the contents of


files.
18 1 Essentials of Linux

Table 1.5: Escape Characters in echo


Escape Description
\\ backslash
\a alert (BEL)
\b backspace
\c produce no further output
\e escape
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\0NNN byte with octal value NNN (1 to 3 digits)
\xHH byte with hexadecimal value HH (1 to 2 digits)

1 $ cat file1
2 Hello, World! from file1

The cat command can take multiple files as arguments and display their
contents one after another.
1 $ cat file1 file2
2 Hello, World! from file1
3 Hello, World! from file2

less:

Sometimes the contents of a file are too large to be displayed at once.


Nowadays modern terminal emulators can scroll up and down to view
the contents of the file, but actual ttys cannot do that. To view the contents
21: more is another pager program that
of a file one page at a time, use the less command. less is a pager program
displays the contents of a file one page at that displays the contents of a file one page at a time. 21
a time. It is older than less and has fewer
features. less is an improved version of 1 $ less file1
more and is more commonly used. Due
to this, it is colloquially said that "less is To scroll up and down, use the arrow keys, or the j and k keys. 22 Press q
more", as it has more features. to exit the less command.
22: Using j and k to move the cursor
up and down is a common keybinding head:
in many terminal applications. This orig-
inates from the vi text editor which will The head command is used to display the first few lines of a file. By
be covered later in the course.
default, it displays the first 10 lines of a file.
1 $ head /etc/passwd
2 root:x:0:0:root:/root:/usr/bin/bash
3 bin:x:1:1::/:/usr/bin/nologin
4 daemon:x:2:2::/:/usr/bin/nologin
5 mail:x:8:12::/var/spool/mail:/usr/bin/nologin
6 ftp:x:14:11::/srv/ftp:/usr/bin/nologin
7 http:x:33:33::/srv/http:/usr/bin/nologin
8 nobody:x:65534:65534:Kernel Overflow User:/:/usr/bin/nologin
9 dbus:x:81:81:System Message Bus:/:/usr/bin/nologin
10 systemd-coredump:x:984:984:systemd Core Dumper:/:/usr/bin/
nologin
11 systemd-network:x:982:982:systemd Network Management:/:/usr/bin/
nologin
23: We can also directly run head -5
/etc/passwd to display the first 5 lines The head command can take the -n flag to display the first n lines of a
of the file. file. 23
1.2 Simple Commands in GNU Core Utils 19

1 $ head -n 5 /etc/passwd
2 root:x:0:0:root:/root:/usr/bin/bash
3 bin:x:1:1::/:/usr/bin/nologin
4 daemon:x:2:2::/:/usr/bin/nologin
5 mail:x:8:12::/var/spool/mail:/usr/bin/nologin
6 ftp:x:14:11::/srv/ftp:/usr/bin/nologin

Remark 1.2.3 Here we are listing the file /etc/passwd which contains
information about the users on the system. The file will usually be
present on all Unix-like systems and have a lot of system users. 24 24: A system user is a user that is used by
the system to run services and daemons.
It does not belong to any human and
tail: usually is not logged into. System users
have a user ID less than 1000.
The tail command is used to display the last few lines of a file. By
default, it displays the last 10 lines of a file.
1 $ tail /etc/passwd
2 rtkit:x:133:133:RealtimeKit:/proc:/usr/bin/nologin
3 sddm:x:964:964:SDDM Greeter Account:/var/lib/sddm:/usr/bin/
nologin
4 usbmux:x:140:140:usbmux user:/:/usr/bin/nologin
5 sayan:x:1000:1001:Sayan:/home/sayan:/bin/bash
6 qemu:x:962:962:QEMU user:/:/usr/bin/nologin
7 cups:x:209:209:cups helper user:/:/usr/bin/nologin
8 dhcpcd:x:959:959:dhcpcd privilege separation:/:/usr/bin/nologin
9 saned:x:957:957:SANE daemon user:/:/usr/bin/nologin

The tail command can take the -n flag to display the last n lines of a
file.
1 $ tail -n 5 /etc/passwd
2 sayan:x:1000:1001:Sayan:/home/sayan:/bin/bash
3 qemu:x:962:962:QEMU user:/:/usr/bin/nologin
4 cups:x:209:209:cups helper user:/:/usr/bin/nologin
5 dhcpcd:x:959:959:dhcpcd privilege separation:/:/usr/bin/nologin
6 saned:x:957:957:SANE daemon user:/:/usr/bin/nologin

Exercise 1.2.8 Notice that the UID (3rd column) of the sayan user
is 1000. The last column is /bin/bash instead of /usr/bin/nologin
like others. This is because it is a regular user and not a system user.

wc:
The wc command is used to count the number of lines, words, and
characters in a file. By default, it displays the number of lines, words,
and characters in a file.
1 $ wc /etc/passwd
2 43 103 2426 /etc/passwd

We can also use the -l, -w, and -c flags to display only the number of
lines, words, and characters respectively.
1 $ wc -l /etc/passwd
2 43 /etc/passwd
20 1 Essentials of Linux

Question 1.2.5 Can we print contents of multiple files using a single


command?

Answer 1.2.5 cat file1 file2 file3 will print the contents of
file1, file2, and file3 in the order given. The contents of the files
will be printed one after the other.

Question 1.2.6 Can cat also be used to write to a file?

Answer 1.2.6 Yes, cat > file1 will write to file1. The input will
be taken from the terminal and written to file1. The input will be
written to file1 until the user presses Ctrl+D to indicate end of
input. This is redirection, which we see in later weeks.

Question 1.2.7 How to list only first 10 lines of a file? How about first
5? Last 5? How about lines 105 to lines 152?

Answer 1.2.7 head filename will list the first 10 lines of filename.
head -n 5 filename will list the first 5 lines of filename.
tail -n 5 filename will list the last 5 lines of filename.
head -n 152 filename | tail -n 48 will list lines 105 to 152 of
filename. This uses | which is a pipe, which we will see in later
weeks.

Question 1.2.8 Do you know how many lines a file contains? How
can we count it? What about words? Characters?

Answer 1.2.8 wc filename will count the number of lines, words,


and characters in filename.
wc -l filename will count the number of lines in filename.
wc -w filename will count the number of words in filename.
wc -c filename will count the number of characters in filename.

Question 1.2.9 How to delete an empty directory? What about non-


empty directory?

Answer 1.2.9 rmdir dirname will delete an empty directory.


rm -r dirname will delete a non-empty directory.

Question 1.2.10 How to copy an entire folder to another name? What


about moving?
Why the difference in flags?
1.2 Simple Commands in GNU Core Utils 21

Answer 1.2.10 cp -r sourcefolder targetfolder will copy an en-


tire folder to another name.
mv sourcefolder targetfolder will move an entire folder to an-
other name.
The difference in flags is because cp is used to copy, and mv is used
to move or rename a file or folder. The -r flag is to copy recursively,
and is not needed for mv as it is not recursive and simply changes the
name of the folder (or the path).

1.2.6 Aliases and Types of Commands

alias:

The alias command is used to create an alias for a command. An alias is


a custom name given to a command that can be used to run the command.
25 25: Aliases are used to create shortcuts
for long commands. They can also be
used to create custom commands.
1 $ alias ll=’ls -l’
2 $ ll
3 total 24
4 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Desktop
5 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Documents
6 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Downloads
7 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Music
8 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Pictures
9 drwxr-xr-x 2 username group 4096 Mar 1 12:00 Videos

The alias ll is created for the ls -l command.

Warning 1.2.2 Be careful when creating aliases. Do not create aliases


for existing commands. This can lead to confusion and errors.

But this alias is temporary and will be lost when the shell is closed. To
make the alias permanent, add it to the shell configuration file. For bash,
this is the .bashrc file in the home directory.
1 $ echo "alias ll=’ls -l’" >> ~/.bashrc

This will add the alias to the .bashrc file. To make the changes take
effect, either close the terminal and open a new one, or run the source
command.
1 $ source ~/.bashrc

Warning 1.2.3 Be careful when editing the shell configuration files.


A small mistake can lead to the shell not working properly. Always
keep a backup of the configuration files before editing them.

We can see the aliases that are currently set using the alias command.
1 $ alias
2 alias ll=’ls -l’

We can also see what a particular alias expands to using the alias
command with the alias name as an argument.
22 1 Essentials of Linux

1 $ alias ll
2 alias ll=’ls -l’

To remove an alias, use the unalias command.


1 $ unalias ll

Exercise 1.2.9 Create an alias la for the ls -a command. Make it


permanent by adding it to the .bashrc file. Check if the alias is set
using the alias command.

Exercise 1.2.10 Create an alias rm for the rm -i command. Make it


permanent by adding it to the .bashrc file. Check if the alias is set
using the alias command. Try to delete a file using the rm command.
What happens?

which:

The which command is used to show the path of the command that will
be executed.
1 $ which ls
2 /sbin/ls

We can also list all the paths where the command is present using the -a
flag.
1 $ which -a ls
2 /sbin/ls
3 /bin/ls
4 /usr/bin/ls
5 /usr/sbin/ls

This means that if we delete the /sbin/ls file, the /bin/ls file will be
executed when we run the ls command.

whatis:

The whatis command is used to show a short description of the com-


mand.
1 $ whatis ls
2 ls (1) - list directory contents
3 ls (1p) - list directory contents

Here the brackets show the section of the manual where the command is
present. This short exerpt is taken from its man page itself.

whereis:

The whereis command is used to show the location of the command,


source files, and man pages.
1 $ whereis ls
2 ls: /usr/bin/ls /usr/share/man/man1/ls.1.gz /usr/share/man/man1p/
ls.1p.gz

Here we can see that the ls command is present in /usr/bin/ls,


and its man pages are present in /usr/share/man/man1/ls.1.gz and
/usr/share/man/man1p/ls.1p.gz.
1.2 Simple Commands in GNU Core Utils 23

locate:

The locate command is used to find files by name. The file can be present
anywhere in the system and if it is indexed by the mlocate database, it
can be found using the locate command.
1 $ touch tmp/48/hellohowareyou
2 $ pwd
3 /home/sayan
4 $ locate hellohowareyou
5 /home/sayan/tmp/48/hellohowareyou

Note: you may have to run updatedb to update the database before using
locate. This can only be run by the root user or using sudo.

type:

The type command is used to show how the shell will interpret a
command. Usually some commands are both an executable and a shell
built-in. The type command will show which one will be executed.
1 $ type ls
2 ls is hashed (/sbin/ls)
3 $ type cd
4 cd is a shell builtin

This shows that the ls command is an executable, and the cd command


is a shell built-in.

We can also use the -a flag to show all the ways the command can be
interpreted.
1 $ type -a pwd
2 pwd is a shell builtin
3 pwd is /sbin/pwd
4 pwd is /bin/pwd
5 pwd is /usr/bin/pwd
6 pwd is /usr/sbin/pwd

Here we can see that the pwd command is a shell built-in, and is also
present in multiple locations in the system. But if we run the pwd com-
mand, the shell built-in will be executed.

type is also useful when you are not sure whether to use man or help
for a command. Generally for a shell built-in, help is used, and for an
executable the info and the man pages are used.

Types of Commands: A command can be an alias, a shell built-in, a


shell function, keyword, or an executable.

The type command will show which type the command is.

▶ alias: A command that is an alias to another command defined by


the user or the system.
▶ builtin: A shell built-in command is a command that is built into
the shell itself. It is executed internally by the shell. This is usually
faster than an external command.
▶ file: An executable file that is stored in the file system. It has to be
stored somewhere in the PATH variable.
▶ function: A shell function is a set of commands that are executed
when the function is called.
24 1 Essentials of Linux

▶ keyword: A keyword is a reserved word that is part of the shell


syntax. It is not a command, but a part of the shell syntax.

Exercise 1.2.11 Find the path of the true command using which.
Find a short description of the true command using whatis. Is the
executable you found actually the one that is executed when you run
true? Check using type true

Question 1.2.11 How to create aliases? How to make them permanent?


How to unset them?

Answer 1.2.11 alias name=’command’ will create an alias.


unalias name will unset the alias.
To make them permanent, add the alias to the ∼/.bashrc file.
The ∼/.bashrc file is a script that is executed whenever a new
terminal is opened.

Question 1.2.12 How to run the normal version of a command if it is


aliased?

Answer 1.2.12 \command will run the normal version of command if it


is aliased.

Question 1.2.13 What is the difference between which, whatis, whereis,


locate, and type?

Answer 1.2.13 Each of the commands serve a different purpose:


▶ which will show the path of the command that will be executed.
▶ whatis will show a short description of the command.
▶ whereis will show the location of the command, source files,
and man pages.
▶ locate is used to find files by name.
▶ type will show how the command will be interpreted by the
shell.

1.2.7 User Management

whoami:

The whoami command is used to print the username of the current user.
1 $ whoami
2 sayan

groups:

The groups command is used to display the groups that the current user
belongs to.
1.2 Simple Commands in GNU Core Utils 25

1 $ groups
2 sys wheel rfkill autologin sayan

passwd

The passwd command is used to change the password of the current user.
26: This executable is a special one, as
The root user can also change the password of other users. 26 it is a setuid program. This will be dis-
cussed in detail in Subsection 1.4.6.
who:

The who command is used to display the users who are currently logged
in.
1 $ who
2 sayan tty2 2024-05-22 13:49
3 sayan pts/0 2024-05-22 15:58 (:0)
4 sayan pts/1 2024-05-22 15:58 (tmux(1082933).%2)
5 sayan pts/2 2024-05-22 15:58 (tmux(1082933).%1)
6 sayan pts/3 2024-05-22 15:58 (tmux(1082933).%3)
7 sayan pts/4 2024-05-22 15:58 (tmux(1082933).%4)
8 sayan pts/5 2024-05-22 15:58 (tmux(1082933).%5)
9 sayan pts/6 2024-05-22 15:58 (tmux(1082933).%6)
10 sayan pts/7 2024-05-22 15:58 (tmux(1082933).%7)
11 sayan pts/8 2024-05-22 15:58 (tmux(1082933).%8)
12 sayan pts/9 2024-05-22 15:58 (tmux(1082933).%9)
13 sayan pts/10 2024-05-22 17:58 (:0)
14 sayan pts/11 2024-05-22 18:24 (tmux(1082933).%10)
15 sayan pts/12 2024-05-22 18:24 (tmux(1082933).%11)

Exercise 1.2.12 Run the who command on the system commands VM.
What is the output?

w:

The w command is used to display the users who are currently logged in
and what they are doing.
1 $ w
2 19:47:07 up 5:57, 1 user, load average: 0.77, 0.80, 0.68
3 USER TTY LOGIN@ IDLE JCPU PCPU WHAT
4 sayan tty2 13:49 5:57m 19:10 21.82s dwm

This is different from the who command as it only considers the login
shell. Here dwm is the window manager running on the tty2 terminal.

1.2.8 Date and Time

date:

The date command is used to print formatted date and time information.
Without any arguments, it prints the current date and time.
1 $ date
2 Mon May 20 06:23:07 PM IST 2024

We can specify the date and time to be printed using the -d flag.
26 1 Essentials of Linux

1 $ date -d "2020-05-20 00:30:45"


2 Wed May 20 12:30:45 AM IST 2020
3 $ date -d "2019-02-29"
4 date: invalid date ’2019-02-29’
5 $ date -d "2020-02-29"
6 Sat Feb 29 12:00:00 AM IST 2020

Exercise 1.2.13 Why did we get an error when trying to print the
date 2019-02-29?

We can also modify the format of the date and time using the + flag and
different format specifiers. Some of the important format specifiers are
listed in Table Table 1.6. Rest of the format specifiers can be found in the
date manual page.

1 $ date +"%Y-%m-%d %H:%M:%S"


2 2024-05-20 18:23:07
3 $ date +"%A, %B %d, %Y"
4 Monday, May 20, 2024

Table 1.6: Date Format Specifiers


Specifier Description
%Y Year
%m Month
%d Day
%H Hour
%M Minute
%S Second
%A Full weekday name
%B Full month name
%a Abbreviated weekday name
%b Abbreviated month name

We can even mention relative dates and times using the date command.
1 $ date -d "next year"
2 Tue May 19 06:23:07 PM IST 2025
3 $ date -d "next month"
4 Thu Jun 20 06:23:07 PM IST 2024
5 $ date -d "tomorrow"
6 Tue May 21 06:23:07 PM IST 2024
7 $ date -d "yesterday"
8 Sun May 19 06:23:07 PM IST 2024

cal:

The cal command is used to print a calendar. By default, it prints the


calendar of the current month.
1 $ cal
2 May 2024
3 Su Mo Tu We Th Fr Sa
4 1 2 3 4
5 5 6 7 8 9 10 11
6 12 13 14 15 16 17 18
7 19 20 21 22 23 24 25
8 26 27 28 29 30 31
1.2 Simple Commands in GNU Core Utils 27

We can specify the month and year to print the calendar of that month
and year.
1 $ cal 2 2024
2 February 2024
3 Su Mo Tu We Th Fr Sa
4 1 2 3
5 4 5 6 7 8 9 10
6 11 12 13 14 15 16 17
7 18 19 20 21 22 23 24
8 25 26 27 28 29

There are multiple flags that can be passed to the cal command to display
different types of calendars and of multiple months or of entire year.

Remark 1.2.4 In Linux, there are sometimes multiple implementations


of the same command. For example, there are two implementations
of the cal command, one in the bsdmainutils package, which is the
BSD implementation and also includes another binary named ncal
for printing the calendar in vertical format. The other implementation
is in the util-linux package, which does not contain a ncal binary. The
flags and the output of the cal command can differ between the two
implementations.

Question 1.2.14 How to print the current date and time in some
custom format?

Answer 1.2.14 date -d today +%Y-%m-%d will print the current date
in the format YYYY-MM-DD. The format can be changed by changing
the format specifiers. The format specifiers are given in the man date
page. The -d today can be dropped, but is mentioned to show that
the date can be changed to any date. It can be strings like ’2024-01-01’
or ’5 days ago’ or ’yesterday’, etc.

These are some of the basic commands that are used in the terminal.
Each of these commands has many more options and flags that can be
used to customize their behavior. It is left as an exercise to the reader to
explore the manual pages of these commands and try out the different
options and flags.
Many of the commands that we have
discussed here are also explained in the
form of short videos on Robert Elder’s
Youtube Channel.
28 1 Essentials of Linux

1.3 Navigating the File System

1.3.1 What is a File System?

Unlike Windows which has different drive letters for different partitions,
Linux follows a unified file structure. The filesystem hierarchy is a tree of
27: A non-directory is a leaf node of a directories and files27 . The root28 of the filesystem tree is the directory
tree.
/. The basic filesystem hierarchy structure can be seen in Figure 1.6 and
28: The root of a tree is the first node
from which the tree originates. A tree Table Table 1.7.
can have only one root.
But what does so many directories mean? What do they do? What is the
purpose of each directory?

Table 1.7: Linux Filesystem Hierarchy


Directory Path Description
/ Root directory
/bin Essential command binaries
/boot Static files of the bootloader
/dev Device files
/etc Host-specific system configuration
/home User home directories
/lib Essential shared libraries and kernel modules
/media Mount point for removable media
/mnt Mount point for mounting a filesystem temporarily
/opt Add-on application software packages
/proc Virtual filesystem providing process information
/root Home directory for the root user
/run Run-time variable data
/sbin Essential system binaries
/srv Data for services provided by the system
/sys Kernel and system information
/tmp Temporary files
/usr Secondary hierarchy
/var Variable data

Some directories do not store data on the disk, but are used to store
information about the system. These directories are called virtual directo-
ries. For example, the /proc directory is a virtual directory that provides
information about the running processes. The /sys directory is another
virtual directory that provides information about the system. The /tmp is
a volatile directory whose data is deleted as soon as the system is turned
off. The /run directory is another volatile directory that stores runtime
data.
Figure 1.6: Linux Filesystem Hierarchy Rest directories are stored on the disk. The reason for having so many
directories is to categorize the type of files they store. For example, all
29: In Linux, you do not install appli- the executable binaries of different applications and utilities installed in
cations by downloading them from the the system is stored in /bin and /sbin directories. All the shared libraries
internet and running an installer like in
installed on the system are stored in /lib directory. Sometimes some
Windows. You use a package manager
to install applications. The package man- applications are installed in /opt directory which are not installed directly
ager downloads the application from the by the package manager. 29
internet and installs it on your system
automatically. This way the package man- We also need to store the user’s documents and files. This is done in the
ager can also keep track of the installed /home directory. Each user has their own directory in the /home directory.
applications and their dependencies and
whether they should be updated. This
The root user’s directory is /root. All the application’s configuration files
is similar to the Play Store on mobile are stored in the user’s home directory in the /home directory itself. This
phones.
1.3 Navigating the File System 29

separation of application binary and per-user application settings helps


people to easily change systems but keep their own /home directory
constant and in turn, also all their application settings.
Some settings however needs to be applied system-wide and for all users.
These settings are stored in the /etc directory. This directory contains all
the system-wide configuration files.
To boot up the system, the bootloader needs some files. These files are
stored in the /boot directory. 30 The bootloader is the first program that 30: Modern systems use UEFI instead
runs when the computer is turned on. It loads the operating system into of BIOS to boot up the system. The boot-
loader is stored in the /boot/EFI directory
memory and starts it. or in the /efi directory directly.
Although the file system is a unified tree hierarchy, this doesn’t mean
that we cannot have multiple partitions on Linux: au contraire, it is easier
to manage partitions on Unix. We simply need to mention which empty
directory in the hierarchy should be used to mount a partition. As soon
as that partition is mounted, it gets populated with the data stored on
that disk with all the files and subdirectories, and when the device is
unmounted the directory again becomes empty. Although a partition can
be mounted on any directory, there are some dedicated folders in / as
well for this purpose. For example, the /mnt directory is used to mount
a filesystem temporarily, and the /media directory is used to mount
removable media like USB drives, however it is not required to strictly
follow this.
Finally, the biggest revelation in Linux is that, everything is a file. Not
only are all the system configurations stored as plain text files which can
be read by humans, but the processes running on your system are also
stored as files in proc. Your kernel’s interfaces to the applications or users
are also simple files stored in sys. Biggest of all, even your hardware
31: Device files are not stored as normal
devices are stored as files in dev. 31
files on the disk, but are special files that
The /usr directory is a secondary hierarchy that contains subdirectories the kernel uses to communicate with
the hardware devices. These are either
similar to those present in /. This was created as the olden system block or character devices. They are used
had started running out of disk space for the /bin and /lib directories. to read and write data to and from the
Thus another directory named usr was made, and subdirectiores like hardware devices.
/usr/bin and /usr/lib were made to store half of the binaries and libraries.
There wasn’t however any rigid definition of which binary should go
where. Modern day systems have more than enough disk space to store
everything on one partition, thus the /bin and /lib dont really exist any
more. If they do, they are simply shortcuts 32 to the /usr/bin and /usr/lib 32: Shortcuts in Linux are called sym-
bolic links or symlinks.
directories which are still kept for backwards compatibility.
These can also be loosely classified into sharable and non-sharable directo-
ries and static and variable directories as shown in Table Table 1.8.
Table 1.8: Linux Filesystem Directory
Sharable Non-sharable Classification
Static /usr, /opt /etc, /boot
Variable /var/spool /tmp, /var/log

1.3.2 In Memory File System

Some file systems like proc, sys, dev, run, and tmp are not stored on the
disk, but are stored in memory.
30 1 Essentials of Linux

They have a special purpose and are used to store information about the
system. These are called virtual directories.
These cannot be stored in a disk as it would be too slow to access them.
Many of these files are very short lived yet are accessed very frequently.
So these are stored in memory to speed up the access.
/dev and /run are mounted as tmpfs filesystems.
This can be seen by running the mount command or the df command.
1 $ mount
2 /dev/sda1 on / type ext4 (rw,noatime)
3 devtmpfs on /dev type devtmpfs (rw,nosuid,size=4096k,nr_inodes
=990693,mode=755,inode64)
4 tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
5 devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,
mode=620,ptmxmode=000)
6 sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
7 securityfs on /sys/kernel/security type securityfs (rw,nosuid,
nodev,noexec,relatime)
8 cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,
relatime,nsdelegate,memory_recursiveprot)
9 pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,
relatime)
10 efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,
nodev,noexec,relatime)
11 bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode
=700)
12 configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,
noexec,relatime)
13 proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
14 tmpfs on /run type tmpfs (rw,nosuid,nodev,size=1590108k,nr_inodes
=819200,mode=755,inode64)
15 systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd
=36,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino
=5327)
16 hugetlbfs on /dev/hugepages type hugetlbfs (rw,nosuid,nodev,
relatime,pagesize=2M)
17 mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime
)
18 debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,
relatime)
19 tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,
noexec,relatime)
20 fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,
noexec,relatime)
21 systemd-1 on /data type autofs (rw,relatime,fd=47,pgrp=1,timeout
=60,minproto=5,maxproto=5,direct,pipe_ino=2930)
22 tmpfs on /tmp type tmpfs (rw,noatime,inode64)
23 /dev/sda4 on /efi type vfat (rw,relatime,fmask=0137,dmask=0027,
codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=
remount-ro)
24 /dev/sda2 on /home type ext4 (rw,noatime)
25 binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,
nosuid,nodev,noexec,relatime)
26 tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size
=795052k,nr_inodes=198763,mode=700,uid=1000,gid=1001,inode64)
27 portal on /run/user/1000/doc type fuse.portal (rw,nosuid,nodev,
relatime,user_id=1000,group_id=1001)
1.3 Navigating the File System 31

28 /dev/sdb3 on /data type ext4 (rw,noatime,x-systemd.automount,x-


systemd.idle-timeout=1min)

Here we can see that the /dev directory is mounted as a devtmpfs


filesystem. The /run directory is mounted as a tmpfs filesystem. The
/proc directory is mounted as a proc filesystem. The /sys directory is
mounted as a sysfs filesystem.

These are all virtual filesystems that are stored in memory.

proc:

Proc is an old filesystem that is used to store information about the


running processes. The /proc directory contains a directory for each
running process. The directories are named as the process id of the
process.
1 $ ls -l /proc | head
2 total 0
3 dr-xr-xr-x 9 root root 0 May 23 13:01 1
4 dr-xr-xr-x 9 root root 0 May 23 13:01 100
5 dr-xr-xr-x 9 sayan sayan 0 May 23 13:06 1004
6 dr-xr-xr-x 9 sayan sayan 0 May 23 13:06 1009
7 dr-xr-xr-x 9 root root 0 May 23 13:01 102
8 dr-xr-xr-x 9 root sayan 0 May 23 13:06 1029
9 dr-xr-xr-x 9 sayan sayan 0 May 23 13:06 1038
10 dr-xr-xr-x 9 sayan sayan 0 May 23 13:06 1039
11 dr-xr-xr-x 9 sayan sayan 0 May 23 13:06 1074

These folders are simply for information and do not store any data. This
is why they have a size of 0. Each folder is owned by the user who started
the process.

Inside each of these directories, there are files that contain information
about the process.

You can enter the folder of a process that is started by you and see the
information about the process.
1 $ cd /proc/301408
2 $ ls -l | head -n15
3 total 0
4 -r--r--r-- 1 sayan sayan 0 May 23 16:55 arch_status
5 dr-xr-xr-x 2 sayan sayan 0 May 23 16:55 attr
6 -rw-r--r-- 1 sayan sayan 0 May 23 16:55 autogroup
7 -r-------- 1 sayan sayan 0 May 23 16:55 auxv
8 -r--r--r-- 1 sayan sayan 0 May 23 16:55 cgroup
9 --w------- 1 sayan sayan 0 May 23 16:55 clear_refs
10 -r--r--r-- 1 sayan sayan 0 May 23 16:55 cmdline
11 -rw-r--r-- 1 sayan sayan 0 May 23 16:55 comm
12 -rw-r--r-- 1 sayan sayan 0 May 23 16:55 coredump_filter
13 -r--r--r-- 1 sayan sayan 0 May 23 16:55 cpu_resctrl_groups
14 -r--r--r-- 1 sayan sayan 0 May 23 16:55 cpuset
15 lrwxrwxrwx 1 sayan sayan 0 May 23 16:55 cwd -> /home/sayan/docs/
projects/sc-handbook
16 -r-------- 1 sayan sayan 0 May 23 16:55 environ
17 lrwxrwxrwx 1 sayan sayan 0 May 23 13:41 exe -> /usr/bin/entr

Here you can see that the command line of the process is stored in the
cmdline file. Here the process is of a command called entr.
32 1 Essentials of Linux

You can also see the current working directory (cwd) of the process.

There are some other files in the /proc directory that contain information
about the system.

▶ cpuinfo - stores cpu information.


▶ version - stores system information, content similar to uname -a
command.
▶ meminfo - Diagnostic information about memory. Check free
command.
▶ partitions - Disk partition information. Check df.
▶ kcore - The astronomical size ( 247 bits) tells the maximum virtual
memory (47 bits) the current Linux OS is going to handle.

sys:

Sys is a newer filesystem that is used to store information about the system.
It is neatly organized and is easier to navigate than proc. This highly uses
33: A symlink is a special type of file symlinks to organize the folders while maintaining redundancy as well.
that points to another file or directory. It 33
is similar to a shortcut in Windows. This
will be discussed in detail in Subsection Try running the following code snippet in a terminal if you have a
??.
caps lock key on your keyboard and are running linux directly on your
34: This will only work on your own
system, not on the system commands bare-metal. 34
VM, since you do not have the priviledge
1 $ cd /sys/class/leds
to modify the files there. Make sure you
have the ability to run commands as root 2 $ echo 1 | sudo tee *capslock/brightness
and are able to use sudo. It is also unlikely
to work on a virtual machine. It will also If you are running a linux system directly on your hardware, you will
not work on linux systems older than 2.6. see the caps lock key light up. Most modern keyboards will quickly turn
off the light again as the capslock is technically not turned on, only the
led was turned on manually by you.

35: Unix System V is one of the first /sys vs /proc:


commercial versions of the Unix operat-
ing system. It was originally developed The /proc tree originated in System V Unix 35 , where it only gave
by AT&T and first released in 1983. information about each running process, using a /proc/$PID/ format.
Linux greatly extended that, adding all sorts of information about the
running kernel’s status. In addition to these read-only information files,
36: BSD, or Berkeley Software Distri-
bution, is a Unix-like operating system
Linux’s /proc also has writable virtual files that can change the state of
that was developed at the University of the running kernel. BSD 36 type OSes generally do not have /proc at all,
California, Berkeley. It was first released so much of what you find under proc is non-portable.
in 1977 and was based on the original
Unix source code from AT&T. BSD is not The intended solution for this mess in Linux’s /proc is /sys. Ideally, all
linux, it is a totally different kernel, with
the non-process information that got crammed into the /proc tree should
similar core utils to GNU.
have moved to /sys by now, but historical inertia has kept a lot of stuff in
/proc. Often there are two ways to effect a change in the running kernel:
the old /proc way, kept for backwards compatibility, and the new /sys
way that you’re supposed to be using now.

1.3.3 Paths

Whenever we open a terminal on a Linux system, we are placed in


a directory. This is called the current working directory. All shells and
applications have a current working directory from where they are
launched.
1.3 Navigating the File System 33

To refer to and identify the directory you are talking about, we use a
path.

Definition 1.3.1 (Path) Path is a traversal in the filesystem tree. It is a


way to refer to a file or directory.

Absolute Path:
The traversal to the directory from the root directory is called the absolute
path. For example, if we want to refer to the directory named alex inside
the directory home in the root of the file system, then it is qualified as:
1 /home/alex

Relative Path:
The traversal to the directory from the current working directory is called
the relative path. For example, if we want to refer to the directory named
alex inside the directory home from the /usr/share directory, then it will
be qualified as:
1 ../../home/alex

Remark 1.3.1 The .. in the path refers to the parent directory. It is used
Figure 1.7: Relative Path
in relative paths to refer to directories whose path requires travelling
up the tree.

1.3.4 Basic Commands for Navigation

The file system can be navigated in the Linux command line using the
following commands:
▶ pwd: Print the current working directory
▶ ls: List the contents of the current directory
▶ cd: Change the current working directory
▶ mkdir: Create a new directory
▶ rmdir: Remove a directory
▶ touch: Create a new file
▶ rm: Remove a file
▶ pushd: Push the current directory to a stack
▶ popd: Pop the current directory from a stack37 37: pushd and popd are useful for
quickly switching between directories
More details about these commands can be found in their respective man in scripts.
pages. For example, to find more about the ls command, you can type
man ls.

Question 1.3.1 What is the command to list the contents of the current
directory?

Answer 1.3.1 ls
34 1 Essentials of Linux

Question 1.3.2 What is the command to list the contents of the current
directory including hidden files?

Answer 1.3.2 ls -a

Question 1.3.3 What is the command to list the contents of the current
directory in a long list format? (show permissions, owner, group, size,
and time)

Answer 1.3.3 ls -l

Question 1.3.4 What is the command to list the contents of the current
directory in a long list format and also show hidden files?

Answer 1.3.4 ls -al or ls -la or ls -l -a or ls -a -l

Question 1.3.5 The output of ls gives multiple files and directories


in a single line. How can you make it print one file or directory per
line?

Answer 1.3.5 ls -1
This can also be done by passing the output of ls to cat or storing
the output of ls in a file and then using cat to print it. We will see
38: that is a one, not an L these in later weeks.38
1.4 File Permissions 35

1.4 File Permissions

Figure 1.8: File Permissions

Definition 1.4.1 (File Permissions) File permissions define the access


rights of a file or directory. There are three basic permissions: read,
write, and execute. These permissions can be set for the owner of the
file, the group of the file, and others.

We have already briefly seen how to see the permissions of a file using
the ls -l command.
1 $ touch hello.txt
2 $ mkdir world
3 $ ls -l
4 total 4
5 -rw-r--r-- 1 sayan sayan 0 May 21 15:20 hello.txt
6 drwxr-xr-x 2 sayan sayan 4096 May 21 15:21 world

Here, the first column of the output of ls -l shows the permissions of


the file or directory. As seen in Figure 1.8, the permissions are divided
into four parts:

▶ The first character shows the type of the file. - for a regular file and
39: There are other types of files as well,
d for a directory and more. 39 like l for a symbolic link, c for a character
▶ The next three characters show the permissions for the owner of device, b for a block device, s for a socket,
the file. and p for a pipe. These will be discussed
later.
▶ The next three characters show the permissions for the group of
the file.
▶ The last three characters show the permissions for others.

Definition 1.4.2 (Owner) Although this can be changed, the owner


of a file is usually the user who created it. All files in the filesystem
have an owner. This is symbolically coded as u.

Definition 1.4.3 (Group) The group of a file is usually the group


of the user who created it. But it can also be changed to any other
existing group in the system. All users in the group a have the same
permissions on the file. This is symbolically coded as g.
a except the owner of the file

Definition 1.4.4 (Others) Others are all the users who are not the
owner of the file and are not in the group of the file. This is symboli-
cally coded as o.
36 1 Essentials of Linux

There are three actions that can be performed on a file: read, write, and
execute.

▶ Read: The read permission allows the file to be read. This is


symbolically coded as r.
▶ Write: The write permission allows the file to be modified. This is
symbolically coded as w.
40: Executing a file means running the ▶ Execute: The execute permission allows the file to be executed. 40
file as a program. For a directory, the This is symbolically coded as x.
execute permission allows the directory
to be traversed into.
These however, have different meanings for files and directories.

1.4.1 Read

▶ For a file, the read permission allows the file to be read. You can
use commands like cat or less to read the contents of the file if
the user has read permissions.
▶ For a directory, the read permission allows the directory to be listed
using ls.

1.4.2 Write

▶ For a file, the write permission allows the file to be modified. You
41: Redirection is a way to send the
can use commands like echo along with redirection 41 or a text
output of a command to a file.
editor like vim or nano to write to the file if the user has write
permissions.
▶ For a directory, the write permission allows the directory to be
modified. You can create, delete, or rename files in the directory if
the user has write permissions.

1.4.3 Execute

▶ For a file, the execute permission allows the file to be executed. This
is usually only needed for special files like executables, scripts, or
libraries. You can run the file as a program if the user has execute
permissions.
▶ For a directory, the execute permission allows the directory to
be traversed into. You can change to the directory if the user
has execute permissions using cd. You can also only long-list the
contents of the directory if the user has execute permissions on
that directory.

1.4.4 Interesting Caveats

This causes some interesting edge-cases that one needs to be familiar


with.
Cannot modify a file? Think again!
If you have write and execute permissions on a directory, even if you do
not have write permission on a file inside the directory, you can delete
the file due to your write permission on the directory, and then re-create
1.4 File Permissions 37

the modified version of the file with the same name. But if you try to
simply modify the file directly, you will get permission error.
1 $ mkdir test
2 $ cd test
3 $ echo "hello world" > file1
4 $ chmod 400 file1 # 400 means read permission only
5 $ cat file1
6 hello world
7 $ echo "hello universe" > file1 # unable to write
8 -bash: file1: Permission denied
9 $ rm file1 # can remove as we have write permission on folder
10 rm: remove write-protected regular file ’file1’? y
11 $ echo "hello universe" > file1 # can create new file
12 $ cat file1
13 hello universe

However, this only works on files. You cannot remove a directory if you
do not have write permission on the directory, even if you have write
permission on its parent directory.

Can list names but not metadata?

If you have read permission on a directory but not execute permission,


you cannot traverse into the directory, but you can still use ls to list the
contents of the directory. However, you cannot use ls -l to long-list the
contents of the directory. That is, you only have access to the name of the
files inside, not their metadata.
1 $ mkdir test
2 $ touch test/1 test/2
3 $ chmod 600 test # removing execute permission from folder
4 $ ls test # we can still list the files due to read permission
5 1 2
6 $ ls -l test # but cannot long-list the files
7 ls: cannot access ’test/2’: Permission denied
8 ls: cannot access ’test/1’: Permission denied
9 total 0
10 -????????? ? ? ? ? ? 1
11 -????????? ? ? ? ? ? 2

Cannot list names but can traverse?

If you have execute permission on a directory but not read permission,


you can traverse into the directory but you cannot list the contents of the
directory.
1 $ mkdir test
2 $ touch test/1 test/2
3 $ chmod 300 test # removing read permission from folder
4 $ ls test # we cannot list the files
5 ls: cannot open directory ’test’: Permission denied
6 $ cd test # but we can traverse into the folder
7 $ pwd
8 /home/sayan/test

Subdirectories with all permissions, still cannot access?

If you have all the permissions to a directory, but dont have execute
permission on its parent directory, you cannot access the subdirectory, or
even list its contents.
38 1 Essentials of Linux

1 $ mkdir test
2 $ mkdir test/test2 # subdirectory
3 $ touch test/test2/1 # file inside subdirectory
4 $ chmod 700 test/test2 # all permissions to subdirectory
5 $ chmod 600 test # removing execute permission from parent
directory
6 $ ls test
7 test2
8 $ cd test/test2 # cannot access subdirectory
9 -bash: cd: test/test2: Permission denied
10 $ ls test/test2 # cannot even list contents of subdirectory
11 ls: cannot access ’test/test2’: Permission denied

1.4.5 Changing Permissions

The permissions of a file can be changed using the chmod command.

Synopsis:
1 chmod [OPTION]... MODE[,MODE]... FILE...
2 chmod [OPTION]... OCTAL-MODE FILE...

OCTAL-MODE is a 3 or 4 digit octal number where the first digit is for


the owner, the second digit is for the group, and the third digit is for
others. We will discuss how the octal representation of permissions is
calculated in the next section.

The MODE can be in the form of ugoa+-=rwxXst where:

▶ u is the user who owns the file


▶ g is the group of the file
▶ o is others
▶ a is all
▶ + adds the permission
▶ - removes the permission
▶ = sets the permission
▶ r is read
▶ w is write
▶ x is execute
▶ X is execute only if its a directory or already has execute permission.
▶ s is setuid/setgid
▶ t is restricted deletion flag or sticky bit

We are already familiar with what r, w, and x permissions mean, but


what are the other permissions?

1.4.6 Special Permissions

Definition 1.4.5 (SetUID/SetGID) The setuid and setgid bits are


special permissions that can be set on executable files. When an
executable file has the setuid bit set, the file will be executed with
the privileges of the owner of the file. When an executable file has
the setgid bit set, the file will be executed with the privileges of the
group of the files.
1.4 File Permissions 39

SetUID:

This is useful for programs that need to access system resources that are
only available to the owner or group of the file.

A very notable example is the passwd command. This command is used


to set the password of an user. Although changing password of a user is
a priviledged action that only the root user can do, the passwd command
can be run by any user to change their password. This is possible due to
the setuid bit set on the passwd command. When the passwd command
is run, it is run with the privileges of the root user, and thus can change
the password of that user.

You can check this out by running ls -l /usr/bin/passwd and seeing


the s in the permissions.
1 $ ls -l /usr/bin/passwd
2 -rwsr-xr-x 1 root root 80912 Apr 1 15:49 /usr/bin/passwd

SetGID:

The behaviour of SetGID is similar to SetUID, but the file is executed


with the privileges of the group of the file.

However, SetGID can also be applied to a directory. When a directory


has the SetGID bit set, all the files and directories created inside that
directory will inherit the group of the directory, not the group of the
user who created the file or directory. This is highly useful when you
have a directory where multiple users need to work on the same files and
directories, but you want to restrict the access to only a certain group
of users. The primary group of each user is different from each other,
but since they are also part of another group (which is the group owner
of the directory) they are able to read and write the files present in the
directory. However, if the user creates a file in the directory, the file will
be owned by the user’s primary group, not the group of the directory.
So other users would not be able to access the file. This is fixed by the
SetGID bit on the directory.
1 $ mkdir test
2 $ ls -ld test # initially the folder is owned by the user’s
primary group
3 drwxr-xr-x 2 sayan sayan 4096 May 22 16:27 test
4 $ chgrp wheel test # we change the group of the folder to wheel,
which is a group that the user is part of
5 $ ls -ld test
6 drwxr-xr-x 2 sayan wheel 4096 May 22 16:27 test
7 $ whoami # this is the current user
8 sayan
9 $ groups # this is the users groups, first one is its primary
group
10 sayan wheel
11 $ touch test/file1 # before setting the SetGID bit, a new file
will have group owner as the primary group of the user
creating it
12 $ ls -l test/file1 # notice the group owner is sayan
13 -rw-r--r-- 1 sayan sayan 0 May 22 16:29 test/file1
14 $ chmod g+s test # we set the SetGID bit on the directory
15 $ ls -ld test # now the folder has a s in the group permissions
16 drwxr-sr-x 2 sayan wheel 4096 May 22 16:29 test
40 1 Essentials of Linux

17 $ touch test/file2 # now if we create another new file, it will


have the group owner as the group of the directory
18 $ ls -l test/file2 # notice the group owner is wheel
19 -rw-r--r-- 1 sayan wheel 0 May 22 16:29 test/file2

Restricted Deletion Flag or Sticky Bit:


The restricted deletion flag or sticky bit is a special permission that can
be set on directories.
Historically, this bit was to be applied on executable files to keep the
program in memory after it has finished executing. This was done to
speed up the execution of the program as the program would not have
to be loaded into memory again. This was called sticky bit because the
42: The part of memory where the pro- program would stick in memory. 42
gram’s text segment is stored is called
the swap. However, this is no longer how this bit is used.
When the sticky bit is set on a directory, only the owner of the file, the
owner of the directory, or the root user can delete or rename files in the
directory.
This is useful when you have a directory where multiple users need to
write files, but you want to restrict the deletion of files to only the owner
of the file or the owner of the directory.
The most common example of this is the /tmp directory. The /tmp
directory is a directory where temporary files are stored. You want to let
any user create files in the /tmp directory, but you do not want any user
to delete files created by other users.
1 $ ls -ld /tmp
2 drwxrwxrwt 20 root root 600 May 22 16:43 /tmp

Exercise 1.4.1 Log into the system commands VM and cd into the
/tmp directory. Create a file in the /tmp directory. Try to find if there
are files created by other users in the /tmp directory using ls -l
command. If there are files created by other users, try to delete them.
a

a You can create a file normally, or using the mktemp command.

1.4.7 Octal Representation of Permissions

The permissions of a file for the file’s owner, group, and others can be
43: If the octal is 4 digits, the first digit is represented as a 3 or 4 digit octal number. 43 Each of the octal digits is
for special permissions like setuid, setgid,
and sticky bit.
the sum of the permissions for the owner, group, and others.

▶ Read permission is represented by 4


▶ Write permission is represented by 2
▶ Execute permission is represented by 1

Thus if a file has read, write, and execute permissions for the owner, read
and execute permissions for the group, and only read permission for
others, the octal representation of the permissions would be 751.
The octal format is usually used more than the symbolic format as it is
easier to understand and remember and it is more concise.
1.4 File Permissions 41

Figure 1.9: Octal Permissions

Table 1.9: Octal Representation of Permissions

Octal Read Write Execute Symbolic Representation Description


0 0 0 0 — No permissions
1 0 0 1 –x Execute only
2 0 1 0 -w- Write only
3 0 1 1 -wx Write and execute
4 1 0 0 r– Read only
5 1 0 1 r-x Read and execute
6 1 1 0 rw- Read and write
7 1 1 1 rwx Read, write, and execute

1 $ chmod 754 myscript.sh # this sets the permissions of myscript.sh


to rwxr-xr--
2 $ ./myscript.sh
3 Hello World!

However, if you want to add or remove a permission without changing


the other permissions, the symbolic format is more useful.
1 $ chmod u+x myscript.sh # this adds execute permission to the
owner of myscript.sh
2 $ ./myscript.sh
3 Hello World!

Question 1.4.1 How to list the permissions of a file?

Answer 1.4.1 ls -l
The permissions are the first 10 characters of the output.
stat -c %A filename will list only the permissions of a file.
There are other format specifiers of stat to show different statistics
which can be found in man stat.

Question 1.4.2 How to change permissions of a file? Let’s say we


want to change file1’s permissions to rwxr-xr- What is the octal
form of that?
42 1 Essentials of Linux

Answer 1.4.2 chmod u=rwx,g=rx,o=r file1 will change the permis-


sions of file1
The octal form of rwxr-xr- is 754.
So we can also use chmod 754 file1
Providing the octal is same as using = to set the permissions.
We can also use + to add permissions and - to remove permissions.
1.5 Types of Files 43

1.5 Types of Files

We had briefly seen that the output of ls -l shows the type of the file as
the first character of the permissions.
There are 7 types of files in a linux file system as shown in Table
Table 1.10.
Table 1.10: Types of Files

1.5.1 Regular Files Type Symbol


Regular Files -
Regular files are the most common type of file. Almost all files are Directories d
regular files. Scripts and executable binaries are also regular files. All the Symbolic Links l
configuration files of the system are regular files as well. The regular files Character Devices c
are actually the only files that contain data and are stored on the disk. Block Devices b
Named Pipes p
Sockets s
1.5.2 Directories

Directories are files that contain a list of other files. Directories do not
contain data, they contain references to other files. Usually the size of
a directory is equal to the block size of the filesystem. Directories have
some special permissions that are different from regular files as discussed
in Section 1.4.

1.5.3 Symbolic Links

Symbolic links are files that point to other files. They only consume the
space of the path they are pointing to. Symlinks 44 are useful to create 44: Symlinks is short for symbolic links.

shortcuts to files or directories. They are dependent on the original file


and will stop working if the original file is deleted or moved. They are
discussed in detail in Section 1.6. There are another type of links called
hard links. However, hard links are not
files, they are pointers to the same inode.
1.5.4 Character Devices They do not consume extra space, and
are not dependent on the original file.
Hard links do not have a separate type,
Character devices are files that represent devices that are accessed as a they are just regular files.
stream of bytes. For example the keyboard, mouse, webcams, and most
USB devices are character devices. These are not real files stored on the
disk, but are files that represent devices. They can interacted with like a
file using the read and write system calls to interact with the hardware
directly. These files are made available by the kernel and are stored in
the /dev directory. Any read/write operation on a character device is
monitored by the kernel and the data is sent to the device.
1 $ cd /dev/input
2 $ ls -l
3 total 0
4 drwxr-xr-x 2 root root 220 May 22 13:49 by-id
5 drwxr-xr-x 2 root root 420 May 22 13:49 by-path
6 crw-rw---- 1 root input 13, 64 May 22 13:49 event0
7 crw-rw---- 1 root input 13, 65 May 22 13:49 event1
8 crw-rw---- 1 root input 13, 74 May 22 13:49 event10
9 crw-rw---- 1 root input 13, 75 May 22 13:49 event11
10 crw-rw---- 1 root input 13, 76 May 22 13:49 event12
44 1 Essentials of Linux

11 crw-rw---- 1 root input 13, 77 May 22 13:49 event13


12 crw-rw---- 1 root input 13, 78 May 22 13:49 event14
13 crw-rw---- 1 root input 13, 79 May 22 13:49 event15
14 crw-rw---- 1 root input 13, 80 May 22 13:49 event16
15 crw-rw---- 1 root input 13, 81 May 22 13:49 event17
16 crw-rw---- 1 root input 13, 82 May 22 13:49 event18
17 crw-rw---- 1 root input 13, 83 May 22 13:49 event19
18 crw-rw---- 1 root input 13, 66 May 22 13:49 event2
19 crw-rw---- 1 root input 13, 84 May 22 13:49 event20
20 crw-rw---- 1 root input 13, 67 May 22 13:49 event3
21 crw-rw---- 1 root input 13, 68 May 22 13:49 event4
22 crw-rw---- 1 root input 13, 69 May 22 13:49 event5
23 crw-rw---- 1 root input 13, 70 May 22 13:49 event6
24 crw-rw---- 1 root input 13, 71 May 22 13:49 event7
25 crw-rw---- 1 root input 13, 72 May 22 13:49 event8
26 crw-rw---- 1 root input 13, 73 May 22 13:49 event9
27 crw-rw---- 1 root input 13, 63 May 22 13:49 mice
28 crw-rw---- 1 root input 13, 32 May 22 13:49 mouse0
29 crw-rw---- 1 root input 13, 33 May 22 13:49 mouse1

Here the event and mouse files are character devices that represent input
devices like the keyboard and mouse. Note the c in the permissions,
which indicates that these are character devices.

1.5.5 Block Devices

Block devices are files that represent devices that are accessed as a block
of data. For example hard drives, SSDs, and USB drives are block devices.
These files also do not store actual data on the disk, but represent devices.
Any block file can be mounted as a filesystem. We can interact with
block devices using the read and write system calls to interact with the
hardware directly. For example, the /dev/sda file represents the first
hard drive in the system.

This makes it easy to write an image to a disk directly using the dd


45: The dd command is a powerful tool
that can be used to copy and convert files.
command. 45
It is acronym of data duplicator. However,
it is also known as the disk destroyer com-
The following example shows how we can use the dd command to write
mand, as it can be used to overwrite the an image 46 to a USB drive. It is this easy to create a bootable USB drive
entire disk if you are not careful with for linux.
which disk you are writing the image to.
1 $ dd if=~/Downloads/archlinux.iso of=/dev/sdb bs=4M status=
46: ISO file
progress

Here if is the input file, of is the output file, bs is the block size, and
status is to show the progress of the operation.

Warning 1.5.1 Be very careful when using the dd command. Make


sure you are writing to the correct disk. Writing to the wrong disk
can cause data loss.

1.5.6 Named Pipes


47: Also known as FIFOs
Named pipes 47 are files that are used for inter-process communication.
They do not store the data that you write to them, but instead pass the
1.5 Types of Files 45

data to another process. A process can only write data to a named pipe if
another process is reading from the named pipe. 48 48: and vice versa

1 $ mkfifo pipe1
2 $ ls -l pipe1
3 prw-r--r-- -1 sayan sayan 0 May 22 18:22 pipe1

Here the p in the permissions indicates that this is a named pipe. If you
now try to write to the named pipe, the command will hang until another
process reads from the named pipe. Try the following in two different
terminals:
Terminal 1:
1 $ echo "hello" > pipe1

Terminal 2:
1 $ cat pipe1

You will notice that whichever command you run first will hang until
the other command is run.

1.5.7 Sockets

Sockets are a special file type, similar to TCP/IP sockets, providing


inter-process networking protected by the file system’s access control.
This is similar to named pipes, but the difference is that named pipes are
meant for IPC between processes in the same machine, whereas sockets
can be used for communication across machines.
Try out the following in two different terminals:
Terminal 1:
1 $ nc -lU socket.sock

Terminal 2:
1 $ echo "hello" | nc -U socket.sock

Notice here, that if you run the command in terminal 2 first, it will error
out with the text:
1 nc: socket.sock: Connection refused

Only if we run them in correct order can you see the message "hello"
49: The nc command is the netcat com-
being printed in terminal 1. 49 mand. It is a powerful tool for network
debugging and exploration. It can be
You can press Ctrl+C to stop the nc command in both terminals.
used to create sockets, listen on ports,
and send and receive data over the net-
work. This will be discussed in more
1.5.8 Types of Regular Files detail in the networking section and in
the Modern Application Development
course.
Regular files can be further classified into different types based on the
data they contain. In Linux systems, the type of a file is determined by
its MIME type. The extension of a file does not determine its type, the
The bytes at the start of a file used to
contents of the file do. It is thus common to have files without extensions identify the type of file are called the
in Linux systems, as they provide no value. magic bytes.
More details can be found at:
The file command can be used to determine the type of a file. https://en.wikipedia.org/wiki/
List_of_file_signatures
46 1 Essentials of Linux

1 $ file /etc/passwd
2 /etc/passwd: ASCII text
3 $ file /bin/bash
4 /bin/bash: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV)
, dynamically linked, interpreter /lib64/ld-linux-x86-64.so
.2, BuildID[sha1]=165d3a5ffe12a4f1a9b71c84f48d94d5e714d3db,
for GNU/Linux 4.4.0, stripped

Question 1.5.1 What types of files are possible in a linux file system?

Answer 1.5.1 There are 7 types of files in a linux file system:


▶ Regular Files (starts with -)
▶ Directories (starts with d)
▶ Symbolic Links (starts with l)
▶ Character Devices (starts with c)
▶ Block Devices (starts with b)
▶ Named Pipes (starts with p)
▶ Sockets (starts with s)

Question 1.5.2 How to know what kind of file a file is? Can we
determine using its extension? Can we determine using its contents?
What does MIME mean? How to get that?

Answer 1.5.2 The file command can be used to determine the type
of a file.
The extension of a file does not determine its type.
The contents of a file can be used to determine its type.
MIME stands for Multipurpose Internet Mail Extensions.
It is a standard that indicates the nature and format of a document.
file -i filename will give the MIME type of filename.
1.6 Inodes and Links 47

1.6 Inodes and Links

1.6.1 Inodes

Definition 1.6.1 (Inodes) An inode is an index node. It serves as a


unique identifier for a specific piece of metadata on a given filesystem.

Whenever you run ls -l and see all the details of a file, you are seeing
the metadata of the file. These metadata, however, are not stored in the
file itself. These data about the files are stored in a special data structure
called an inode.

Each inode is stored in a common table and each filesystem mounted to


your computer has its own inodes. An inode number may be used more
than once but never by the same filesystem. The filesystem id combines
with the inode number to create a unique identification label.

You can check how many inodes are used in a filesystem using the df
-i command.

1 $ df -i
2 Filesystem Inodes IUsed IFree IUse% Mounted on
3 /dev/sda1 6397952 909213 5488739 15% /
4 /dev/sda4 0 0 0 - /efi
5 /dev/sda2 21569536 2129841 19439695 10% /home
6 /dev/sdb3 32776192 2380 32773812 1% /data
7 $ df
8 Filesystem 1K-blocks Used Available Use% Mounted on
9 /dev/sda1 100063312 63760072 31174076 68% /
10 /dev/sda4 1021952 235760 786192 24% /efi
11 /dev/sda2 338553420 273477568 47805068 86% /home
12 /dev/sdb3 514944248 444194244 44518760 91% /data

You can notice the number of inodes present, number of inodes used, and
number of inodes that are free. The IUse% column shows the percentage
of inodes used. This however, does not mean how much of space is used,
but how many files can be created.

Observe that although the /data partition has only 1% of inodes used, it
has 91% of space used. This is because the files in the /data partition are
large files, and thus the number of inodes used is less. Remember that
a file will take up one inode, no matter how large it is. But the space it
takes up will be the size of the file.

We can also see the inode number of a file using the ls -i command.
1 $ ls -i
2 1234567 file1
3 1234568 file2
4 1234569 file3

Here the first column is the inode number of the file.

Remark 1.6.1 The inode number is unique only within the filesystem.
If you copy a file from one filesystem to another, the inode number
will change.
48 1 Essentials of Linux

1.6.2 Separation of Data, Metadata, and Filename

In UNIX systems, the data of a file and the metadata of a file are stored
separately. The inodes are stored in a inode-array or table, and contain
the metadata of the file and the pointer to it in the storage block. These
50: A system call is a request in a op- metadata can be retrieved using the stat system call. 50
erating system made via a software in-
terrupt by an active process for a service
Conveniently, a userland utility to list the metadata of a file is also called
performed by the kernel. The diagram in
Figure 1.10 shows how system calls work. stat.

1 $ stat /etc/profile
2 File: /etc/profile
3 Size: 993 Blocks: 8 IO Block: 4096 regular
file
4 Device: 8,1 Inode: 2622512 Links: 1
5 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/
root)
6 Access: 2024-05-21 18:30:27.000000000 +0530
7 Modify: 2024-04-07 23:32:30.000000000 +0530
Figure 1.10: System Calls 8 Change: 2024-05-21 18:30:27.047718323 +0530
9 Birth: 2024-05-21 18:30:27.047718323 +0530

We can also specify the format of the output of the stat command using
the -format or -c flag to print only the metadata we want.

The data of the file is stored in the storage block. The inode number
indexes a table of inodes on the file system. From the inode number, the
kernel’s file system driver can access the inode contents, including the
location of the file, thereby allowing access to the file.

On many older file systems, inodes are stored in one or more fixed-size
areas that are set up at file system creation time, so the maximum number
of inodes is fixed at file system creation, limiting the maximum number
of files the file system can hold.

Some Unix-style file systems such as JFS, XFS, ZFS, OpenZFS, ReiserFS,
btrfs, and APFS omit a fixed-size inode table, but must store equivalent
data in order to provide equivalent capabilities. Common alternatives to
the fixed-size table include B-trees and the derived B+ trees.

Remark 1.6.2 Although the inodes store the metadata of the file, the
filename is not stored in the inode. It is stored in the directory entry.
Thus the filename, file metdata, and file data are stored separately.

Table 1.11: Metadata of a File


Metadata Description
Size Size of the file in bytes
Blocks Number of blocks used by the file
IO Block Block size of the file system
Device Device ID of the file system
Inode Inode number of the file
Links Number of hard links to the file
Access Access time of the file (atime)
Modify Modification time of the file (mtime)
Change Change time of the inode (ctime)
Birth Creation time of the file
1.6 Inodes and Links 49

1.6.3 Directory Entries

Figure 1.11: Inodes and Directory Entry

Unix directories are lists of association structures, each of which contains


one filename and one inode number. The file system driver must search
a directory for a particular filename and then convert the filename to the
correct corresponding inode number.
Thus to read a file from a directory, first the directory’s directory entry is
read which stores the name of the file and its inode number. The kernel
then follows the inode number to find the inode of the file. The inode
stores all the metadata of the file, and the location of the data of the
file. The kernel then follows the inode to find the data of the file. This is
shown in Figure 1.11.
So what happens if two directory entries point to the same inode? This is
called a hard link.

1.6.4 Hard Links

If multiple entries of the directory entry points to the same inode, they
are called hard links. Hard links can have different names, but they are
the same file. As they point to the same inode, they also have the same
metadata.
This is useful if you want to have the same file in multiple directories
without taking up more space. It is also useful if you want to keep a
backup of a important file which is accessed by many people. If someone
accidentally deletes the file, the other hard links will still be there and
able to access the file.

Definition 1.6.2 (Hard Links) Hard Links are just pointers to the
same inode. They are the same file. They are not pointers to the path
of the file. They are pointers to the file itself. They are not affected
by the deletion of the other file. When creating a hard link, you need
to provide the path of the original file, and thus it has to be either
50 1 Essentials of Linux

absolute path, or relative from the current working directory, not


relative from the location of the hard link.

Hard links can be created for files only, and not directories. It can be
created using the ln command.
1 $ ln file1 file2

This will create a hard link named file2 that points to the same inode
as file1.

Remark 1.6.3 Hard links are not dependent on the original file. They
are the same file and equivalent. The first link to be created has no
special status.

Historically directories could also have hard links, but this would cause
the file tree to stop being a Directed Acyclic Graph 51 and become
a Directed Cyclic Graph if a hardlink of an ancestor was put as a
subdirectory. This would create confusions and infinite walks in the file
system. Modern systems generally prohibit this confusing state, except
that the parent of root is still defined as root. 52

As hard links depend on the inode, they can only exist in the same
filesystem as inodes are unique to a filesystem only.

If we want to create shortcuts across filesystems, or if we want to create a


link to a directory, we can use symbolic links.

1.6.5 Symbolic Links


Figure 1.12: Directed Acyclic Graph

A symbolic link contains a text string that is automatically interpreted


51: A Directed Acyclic Graph is a graph and followed by the operating system as a path to another file or directory.
that has no cycles as seen in Figure 1.12.
This other file or directory is called the "target". The symbolic link is
52: The most notable exception to this
a second file that exists independently of its target. If a symbolic link
prohibition is found in Mac OS X (ver-
sions 10.5 and higher) which allows hard is deleted, its target remains unaffected. If a symbolic link points to a
links of directories to be created by the target, and sometime later that target is moved, renamed or deleted, the
superuser. symbolic link is not automatically updated or deleted, but continues to
exist and still points to the old target, now a non-existing location or file.
Symbolic links pointing to moved or non-existing targets are sometimes
called broken, orphaned, dead, or dangling.

Definition 1.6.3 (Soft Links) Soft Links are special kinds of files that
just store the path given to them. Thus the path given while making
soft links should either be an absolute path, or relative from the
location of the soft link to the location of the original file. It should
not be relative from current working directory.a
a This is a common mistake.

Symlinks are created using the symlink system call. This can be done
using the ln -s command.
1 $ echo "hello" > file1
2 $ ln -s file1 file2
3 $ ls -l
1.6 Inodes and Links 51

4 total 4
5 -rw-r--r-- 1 sayan sayan 6 May 23 15:27 file1
6 lrwxrwxrwx 1 sayan sayan 5 May 23 15:27 file2 -> file1
7 $ cat file2
8 hello

Interesting Observation:
Usually we have seen that if we use ls -l with a directory as its argument,
it lists the contents of the directory.
The only way to list the directory itself is to use ls -ld.
But if a symlink is made to a directory, then ls -l on that symlink will
list only the symlink.
To list the contents of the symlinked directory we have to append a / to
the symlink.
1 $ ln -s /etc /tmp/etc
2 $ ls -l /tmp/etc
3 lrwxrwxrwx 1 sayan sayan 4 May 23 15:30 /tmp/etc -> /etc
4 $ ls -l /tmp/etc/ | head -n5
5 total 1956
6 -rw-r--r-- 1 root root 44 Mar 18 21:50 adjtime
7 drwxr-xr-x 3 root root 4096 Nov 17 2023 alsa
8 -rw-r--r-- 1 root root 541 Apr 8 20:53 anacrontab
9 drwxr-xr-x 4 root root 4096 May 19 00:44 apparmor.d

Here I used head to limit the number of lines shown as the directory is
large. 53 53: This way of combining commands
will be discussed later.
The symlink file stores only the path provided to it while creating it. This
was historically stored in the data block which was pointed to by the
inode. But this made it slower to access the symlink.
Modern systems store the symlink value in the inode itself if its not too
large. Inodes usually have a limited space allocated for each of them, so
a symlink with a small target path is stored directly in the inode. This is
called a fast symlink.
However if the target path is too large, it is stored in the data block
pointed to by the inode. This is retroactively called a slow symlink.
This act of storing the target path in the inode is called inlining.
Symlinks do not have a permission set, thus they always report lrwxrwxrwx
as their permissions.
The size reported of a symlink file is independent of the actual file’s
size.
1 $ echo "hello" > file1
2 $ ln -s file1 file2
3 $ ls -l
4 -rw-r--r-- 1 sayan sayan 6 May 23 15:27 file1
5 lrwxrwxrwx 1 sayan sayan 5 May 23 15:27 file2 -> file1
6 $ echo "a very big file" > file2
7 $ ls -l
8 -rw-r--r-- 1 sayan sayan 16 May 23 15:40 file1
9 lrwxrwxrwx 1 sayan sayan 5 May 23 15:27 file2 -> file1

Rather, the size of a symlink is the length of the target path.


52 1 Essentials of Linux

1 $ ln -s /a/very/long/and/non-existant/path link1
2 $ ln -s small link2
3 $ ls -l
4 total 0
5 lrwxrwxrwx 1 sayan sayan 34 May 23 15:41 link1 -> /a/very/long/and
/non-existant/path
6 lrwxrwxrwx 1 sayan sayan 5 May 23 15:41 link2 -> small

Notice that the size of link1 is 34, the length of the target path, and the
size of link2 is 5, the length of the target path.

1.6.6 Symlink vs Hard Links

Figure 1.13: Abstract Representation of


Symbolic Links and Hard Links

Figure 1.14: Symbolic Links and Hard


Links

The difference between a symlink and a hard link is that a symlink is


a pointer to the original file, while a hard link is the same file. Other
differences are listed in Table 1.12.
Table 1.12: Symlink vs Hard Link

Property Symlink Hard Link


File Type Special File Regular File
Size Length of the target path Size of the file
Permissions lrwxrwxrwx Same as the original file
Inode Different Same
Dependency Dependent on the original file Independent of the original file
Creation Can be created across filesystems Can only be created in the same filesystem
Target Can point to directories Can only point to files

1.6.7 Identifying Links

Soft Links:
To identify if a file is a symlink or a hard link, you can use the ls -l
command. If the file is a symlink, the first character of the permissions
will be l. ls -l will also show the target of the symlink after a -> symbol.
However, you cannot acertain if a file has a sofy link pointing to it
somewhere else or not.
Hard Links:
To identify if a file is a hard link, you can use the ls -i command. Hard
links will have the same inode number as each other. The inode number
is the first column of the output of ls -i.
1.6 Inodes and Links 53

Also the number of links to the file will be more than 1. The number of
links is the second 54 column of the output of ls -l. 54: third if using ls -li

Even if a hard link is not present in current directory, you can acertain
that a file has a hard link pointing to it somewhere else using the number
of hardlinks column of ls -l.
1 $ touch file1
2 $ ln -s file1 file2
3 $ ln file1 file3
4 $ ls -li
5 total 0
6 4850335 -rw-r--r-- 2 sayan sayan 0 May 23 15:56 file1
7 4851092 lrwxrwxrwx 1 sayan sayan 5 May 23 15:56 file2 -> file1
8 4850335 -rw-r--r-- 2 sayan sayan 0 May 23 15:56 file3

1.6.8 What are . and ..?

. and .. are special directory entries. They are hard links to the current
directory and the parent directory respectively. Each directory has a .
entry pointing to itself and a .. entry pointing to its parent directory.
Due to this, the number of hard links to a directory is exactly equal to
the number of subdirectories it has plus 2.
This is because each subdirectory has a .. entry pointing to the parent
directory, and the parent directory has a . entry pointing to itself.
So the directory’s name in its parent directory is 1 link, the . in the
directory is 1 link, and all the subdirectories have a .. entry pointing to
the directory, which is 1 link each.

Number of links to a directory = Number of subdirectories + 2

This formula always stands because a user cannot create additional hard
links to a directory.

Question 1.6.1 How to list the inodes of a file?

Answer 1.6.1 ls -i will list the inodes of a file. The inodes are the
first column of the output of ls -i This can be combined with other
flags like -l or -a to show more details.

Question 1.6.2 How to create soft link of a file?

Answer 1.6.2 ln -s sourcefile targetfile will create a soft link


of sourcefile named targetfile. The soft link is a pointer to the
original file.

Question 1.6.3 How to create hard link of a file?


54 1 Essentials of Linux

Answer 1.6.3 ln sourcefile targetfile will create a hard link of


sourcefile named targetfile. The hard link is same as the original
file. It does not depend on the original file anymore after creation.
They are equals, both are hardlinks of each other. There is no parent-
child relationship. The other file can be deleted and the original file
will still work.

Question 1.6.4 How to get the real path of a file?


Assume three files:
▶ file1 is a soft link to file2
▶ file2 is a soft link to file3
▶ file3 is a regular file
Real path of all these three should be the same. How to get that?

Answer 1.6.4 realpath filename will give the real path of filename.
You can also use readlink -f filename to get the real path.
Command Line Editors 2
2.1 Introduction 2.1 Introduction . . . . . . . . 55
2.1.1 Types of Editors . . . . . . 55
2.1.2 Why Command Line
Now that we know how to go about navigating the linux based operating Editors? . . . . . . . . . . . 55
systems, we might want to view and edit files. This is where command 2.1.3 Mouse Support . . . . . . 55
line editors come in. 2.1.4 Editor war . . . . . . . . . 56
2.1.5 Differences between Vim
and Emacs . . . . . . . . . 56
Definition 2.1.1 A command line editor is a type of text editor that
2.1.6 Nano: The peacemaker
operates entirely from the command line interface. They usually do
amidst the editor war . . 58
not require a graphical user interface or a mouse. a
2.2 Vim . . . . . . . . . . . . . 58
a This means that CLI editors are the only way to edit files when you are connected to 2.2.1 History . . . . . . . . . . . 58
a remote server. Since remote servers do not have any graphical server like X11, you 2.2.2 Ed Commands . . . . . . . 65
cannot use graphical editors like gedit or kate.
2.2.3 Exploring Vim . . . . . . . 72
2.3 Emacs . . . . . . . . . . . . 80
2.3.1 History . . . . . . . . . . . 80
2.1.1 Types of Editors 2.3.2 Exploring Emacs . . . . . 82
2.4 Nano . . . . . . . . . . . . . 84
▶ Graphical Editors: These are editors that require a graphical user 2.4.1 History . . . . . . . . . . . 84
2.4.2 Exploring Nano . . . . . . 85
interface. Examples include gedit ‗ , kate † , vs code, etc.
2.4.3 Editing A Script in Nano 85
▶ Command Line Editors: These are editors that operate entirely
from the command line interface.

We will be only discussing command line editors in this chapter.

2.1.2 Why Command Line Editors?

Command line editors are very powerful and efficient. They let you edit
files without having to leave the terminal. This is usually faster than
opening a graphical editor. In many cases like sshing 1 into a remote 1: SSH stands for Secure Shell. It is a
server, command line editors are the only way to edit files. Another cryptographic network protocol for op-
erating network services securely over
reason for the popularity of command line editors is that they are very an unsecured network. This will be dis-
lightweight. cussed in detail in a later chapter.

2.1.3 Mouse Support

Most command line editors do not have mouse support, and others do
not encourage it. But wont it be difficult to navigate without a mouse?
Not really. Once you get used to the keyboard shortcuts, you will find
that you can navigate way faster than with a mouse.
Mouse editors usually require the user to click on certain buttons, or
to follow multi-click procedures in nested menus to perform certain
tasks.

‗ Gedit is the default text editor in GNOME desktop environment.


† Kate is the default text editor in KDE desktop environment.
56 2 Command Line Editors

Whereas in keyboard based editors, all the actions that can be performed
are mapped to some keyboard shortcuts.

Modern CLI editors usually also allow the user to totally customize the
keyboard shortcuts.

This being said, most modern CLI editors do have mouse support as well
if the user is running them in a terminal emulator that supports mouse
2: X11 and Wayland are display servers
over a X11 or Wayland display server. 2
that are used to render graphical appli-
cations. Although not directly covered,
these can be explored in details on the
internet. 2.1.4 Editor war

Although there are many command line editors available, the most
popular ones are vim and emacs.

Definition 2.1.2 The editor war is the rivalry between users of the
Emacs and vi (now usually Vim, or more recently Neovim) text
editors. The rivalry has become an enduring part of hacker culture
3: More on this including the history and the free software community. 3
and the humor can be found on the in-
ternet.
Vim

Vim is a modal editor, meaning that it has different modes for different
tasks. Most editors are modeless, this makes vim a bit difficult to learn.
However, once familiar with it, it is very powerful and efficient. Vim
heavily relies on alphanumeric keys for navigation and editing. Vim
4: qutebrowser is a browser that uses vim- keybindings are so popular that many other editors and even some
like keybindings. Firefox and Chromium browsers 4 have vim-like keybindings.
based browsers also have extensions that
provide vim-like keybindings. These al- Emacs
low the user to navigate the browser
using vim-like keybindings and without
Emacs is a modeless editor, meaning that it does not have different modes
ever touching the mouse.
for different tasks. Emacs is also very powerful and efficient. It uses
multi-key combinations for navigation and editing.

2.1.5 Differences between Vim and Emacs

Keystroke execution

Emacs commands are key combinations for which modifier keys are
held down while other keys are pressed; a command gets executed once
completely typed.

Vim retains each permutation of typed keys (e.g. order matters). This
creates a path in the decision tree which unambiguously identifies any
command.

Memory usage and customizability

Emacs executes many actions on startup, many of which may execute


arbitrary user code. This makes Emacs take longer to start up (even
compared to vim) and require more memory. However, it is highly
customizable and includes a large number of features, as it is essentially
an execution environment for a Lisp program designed for text-editing.
2.1 Introduction 57

Vi is a smaller and faster program, but with less capacity for customization.
vim has evolved from vi to provide significantly more functionality and
customization than vi, making it comparable to Emacs.

User environment

Emacs, while also initially designed for use on a console, had X11 GUI
support added in Emacs 18, and made the default in version 19. Current
Emacs GUIs include full support for proportional spacing and font-size
variation. Emacs also supports embedded images and hypertext.

Vi, like emacs, was originally exclusively used inside of a text-mode


console, offering no graphical user interface (GUI). Many modern vi
derivatives, e.g. MacVim and gVim, include GUIs. However, support for
proportionally spaced fonts remains absent. Also lacking is support for
different sized fonts in the same document.

Function/navigation interface

Emacs uses metakey chords. Keys or key chords can be defined as prefix
keys, which put Emacs into a mode where it waits for additional key
presses that constitute a key binding. Key bindings can be mode-specific,
further customizing the interaction style. Emacs provides a command
line accessed by M-x that can be configured to autocomplete in various
ways. Emacs also provides a defalias macro, allowing alternate names
for commands.

Vi uses distinct editing modes. Under "insert mode", keys insert characters
into the document. Under "normal mode" (also known as "command
mode", not to be confused with "command-line mode", which allows the
user to enter commands), bare keypresses execute vi commands.

Keyboard

The expansion of one of Emacs’ backronyms is Escape, Meta, Alt, Control,


Shift, which neatly summarizes most of the modifier keys it uses, only
leaving out Super. Emacs was developed on Space-cadet keyboards
that had more key modifiers than modern layouts. There are multiple
emacs packages, such as spacemacs or ergoemacs that replace these key
combinations with ones easier to type, or customization can be done ad
hoc by the user.

Vi does not use the Alt key and seldom uses the Ctrl key. vi’s keyset is
mainly restricted to the alphanumeric keys, and the escape key. This is
an enduring relic of its teletype heritage, but has the effect of making
most of vi’s functionality accessible without frequent awkward finger
reaches.

Language and script support

Emacs has full support for all Unicode-compatible writing systems and
allows multiple scripts to be freely intermixed.

Vi has rudimentary support for languages other than English. Modern


Vim supports Unicode if used with a terminal that supports Unicode.
58 2 Command Line Editors

2.1.6 Nano: The peacemaker amidst the editor war

Nano is a simple command line editor that is easy to use. It does not
have the steep learning curve of vim or emacs. But it is not as powerful
as vim or emacs as well. It is a common choice for beginners who just
want to append a few lines to a file or make a few changes. It is also a
non-modal editor like editor which uses modifier chording like emacs.
However, it mostly uses the control key for this purpose and has only
simple keybindings such as Ctrl+O to save and Ctrl+X to exit.

2.2 Vim

2.2.1 History

The history of Vim is a very long and interesting one.

Table 2.1: History of Vim


QED text editor by Butler
Lampson and Peter Deutsch
1967 · · · · · ·•
for Berkeley Timesharing
System.

Ken Thompson and Dennis


1967 · · · · · ·• Ritchie’s QED for MIT CTSS,
Multics, and GE-TSS.

Ken Thompson releases ed -


1969 · · · · · ·•
The Standard Text Editor.
George Coulouris and Patrick
1976 · · · · · ·• Mullaney release em - The
Editor for Mortals.
Bill Joy and Chuck Haley
1976 · · · · · ·• build upon em to make en,
which later becomes ex.
Bill Joy adds visual mode to
1977 · · · · · ·•
ex.
Bill Joy creates a hardlink ‘vi‘
1979 · · · · · ·•
for ex’s visual mode.
Tim Thompson develops a vi
clone for the Atari ST named
1987 · · · · · ·•
STevie (ST editor for VI
enthusiasts).

Bram Moolenaar makes a


1988 · · · · · ·• stevie clone for the Amiga
named Vim (Vi IMitation).

Teletypes
2.2 Vim 59

Definition 2.2.1 A teletype (TTY) or a teleprinter is a device that can


send and receive typed messages from a distance.

Very early computers used to use teletypes as the output device. These
were devices that used ink and paper to actually print the output of
the computer. These did not have an automatic refresh rate like modern
monitors. Only when the computer sent a signal to the teletype, would
Figure 2.1: A Teletype
the teletype print the output.

Due to these restrictions it was not economical or practical to print the


entire file on the screen. Thus most editors used to print only one line at
a time on the screen and did not have updating graphics.

QED

QED was a text editor developed by Butler Lampson and Peter Deutsch
in 1967 for the Berkeley Timesharing System. It was a character-oriented
editor that was used to create and edit text files. It used to print or edit
only one character at a time on the screen. This is because the computers
at that time used to use a teletype machine as the output device, and not
a monitor.

Ken Thompson used this QED at Berkeley before he came to Bell Labs,
and among the first things he did on arriving was to write a new version
for the MIT CTSS system. Written in IBM 7090 assembly language, it
differed from the Berkeley version most notably in introducing regular Figure 2.2: Ken Thompson
expressions 5 for specifying strings to seek within the document being
edited, and to specify a substring for which a substitution should be 5: Regular expressions are a sequence
made. Until that time, text editors could search for a literal string, and of characters that define a search pattern.
substitute for one, but not specify more general strings. Usually this pattern is used by string
searching algorithms for "find" or "find
Ken not only introduced a new idea, he found an inventive implemen- and replace" operations on strings. This
will be discussed in detail in a later chap-
tation: on-the-fly compiling. Ken’s QED compiled machine code for ter.
each regular expression that created a NDFA (non-deterministic finite
automaton) to do the search. He published this in C. ACM 11 #6, and also
received a patent for the technique: US Patent #3568156.

While the Berkeley QED was character-oriented, the CTSS version was
line-oriented. Ken’s CTSS qed adopted from the Berkeley one the notion
of multiple buffers to edit several files simultaneously and to move and
copy text among them, and also the idea of executing a given buffer as
editor commands, thus providing programmability.

When developing the MULTICS project, Ken Thompson wrote yet another
version of QED for that system, now in BCPL 6 and now created trees for
regular expressions instead of compiling to machine language.

In 1967 when Dennis Ritchie joined the project, Bell Labs had slowly
started to move away from Multics. While he was developing the initial
stages of Unix, he rewrote QED yet again, this time for the GE-TSS system
in Assembly language. This was well documented, and was originally Figure 2.3: Dennis Ritchie
intented to be published as a paper. 7
6: BCPL ("Basic Combined Program-
ED ming Language") is a procedural, im-
perative, and structured programming
After their experience with multiple implementations of QED, Ken language. Originally intended for writ-
Thompson wrote ed in 1969. ing compilers for other languages, BCPL
is no longer in common use.
60 2 Command Line Editors

7: At that time, systems did not have a This was now written in the newly developed B language, a predecessor
standardized CPU architecture or a gen- to C. This implementation was much simpler than QED, and was line
eralized low level compiler. Due to this,
applications were not portable across
oriented. It stripped out much of regular expression support, and only
systems. Each machine needed its own had the support for *. It also got rid of multiple buffers and executing
version of the application to be written contents of buffer.
from the scratch, mostly in assembly lan-
guage. Slowly, with time, Dennis Ritchie created the C language, which is widely
The reference manual for GE-TSS QED in use even today.
can still be found on Dennis Ritchie’s
website Much of this information is taken Ken Thompson re-wrote ed in C, and added back some of the complex
from his blog. features of QED, like back references in regular expressions.

Ed ended up being the Standard Text Editor for Unix systems.

Remark 2.2.1 Since all of Bell-Labs and AT&T’s software was propri-
etary, the source code for ed was not available to the public. Thus, the
ed editor accessible today in GNU/Linux, is another implementation
of the original ed editor by the GNU project.

However, ed was not very user friendly and it was very terse. Although
this was originally intented, since it would be very slow to print a lot
of diagnostic messages on a teletype, slowly, as people moved to faster
computers and monitors, they wanted a more user friendly editor.

VDU Terminals

Definition 2.2.2 A terminal that uses video display technology like


cathode rat tubes (CRT) or liquid crystal displays (LCD) to display
the terminal output is called a VDU terminal. (Video Display Unit)

These terminals were able to show video output, instead of just printing
the output on paper. Although initially these were very expensive, and
were not a household item, they were present in the research parks like
Figure 2.4: Xerox Alto, one of the first Xerox PARC.
VDU terminals with a GUI, released in
1973 EM

8: George named em as Editor for Mor- George Coulouris (not the actor) was one of the people who had access
tals because Ken Thompson visited his to these terminals in his work at the Queen Mary College in London.
lab at QMC while he was developing it
and said something like: "yeah, I’ve seen The drawbacks of ed were very apparent to him when using on these
editors like that, but I don’t feel a need machines.
for them, I don’t want to see the state
of the file when I’m editing". This made He found that the UNIX’s raw mode, which was at that time totally
George think that Ken was not a mortal,
unused, could be used to give some of the convenience and immediacy
and thus he named it Editor for Mortals.
of feedback for text editing.

He claimed that although the ed editor was groundbreaking in its time,


it was not very user friendly. He termed it as not being an editor for
mortals.

He thus wrote em in 1976, which was an Editor for Mortals. 8

Although em added a lot of features to ed, it was still a line editor, that is,
you could only see one line at a time. The difference from ed was that it
Figure 2.5: A first generation Dectape allowed visual editing, meaning you can see the state of the line as you
(bottom right corner, white round tape) are editing it.
being used with a PDP-11 computer
2.2 Vim 61

Whereas most of the development of Multics and Unix was done in the
United States, the development of em was done in the United Kingdom,
in the Queen Mary College, which was the first college in the UK to have
UNIX.

EN

In the summer of 1976, George Coulouris was a visiting professor at the


University of California, Berkeley. With him, he had brought a copy of his
em editor on a Dectape 9 and had installed it there on their departmental 9: A Dectape is a magnetic tape storage
device that was used in the 1970s. It was
computers which were still using teletype terminals. Although em was
used to store data and programs.
designed for VDU terminals, it was still able to run (albeit slowly) on the
teletype terminals by printing the current line every time.

There he met Bill Joy, who was a PhD student at Berkeley. On showing
him the editor, Bill Joy was very impressed and wanted to use it on the
PDP-11 computers at Berkeley. The system support team at Berkley were
using PDP-11 which used VDU Terminals, an environment where em
would really shine.

He explained that ’em’ was an extension of ’ed’ that gave key-stroke level
interaction for editing within a single line, displaying the up-to-date line
on the screen (a sort of single-line screen editor). This was achieved by
setting the terminal mode to ’raw’ so that single characters could be read
as they were typed - an eccentric thing for a program to do in 1976.

Although the system support team at Berkeley were impressed by this


editor, they knew that if this was made available to the general public, it
would take up too much resources by going to the raw mode on every
Figure 2.6: George Coulouris
keypress. But Bill and the team took a copy of the source code just to see
if they might use it.

George then took a vacation for a few weeks, but when he returned, he
found that Bill had taken his ex as a starting point and had added a lot of
features to it. He called it en initially, which later became ex.

EX

Bill Joy took inspiration from several other ed clones as well, and their
own tweaks to ed, although the primary inspiration was em. Bill and
Chuck Haley built upon em to make en, which later became ex.

This editor had a lot of improvements over em, such as adding the ability
to add abbreviations (using the ab command), and adding keybindings
(maps).

It also added the ability to mark some line using the k key followed by
any letter, and then jump to that line from any arbitrary line using the ’
key followed by the letter.

Slowly, with time, the modern systems were able able to handle the
raw mode, and real time editing more and more. This led to the natural
progression, What if we could see the entire file at once, and not just one
line at a time? Figure 2.7: Bill Joy

VI
10: This visual mode is not the same as
Bill added the visual mode to ex in 1977. 10 This was not a separate editor, the visual mode in vim.
but rather just another mode of the ex editor. You could open ex in visual
mode using the -v flag to ex.
62 2 Command Line Editors

1 $ ex -v filename

This visual mode was the first time a text editor was modal. This means
that the editor had different modes for different tasks. When you want
to edit text, you would go to the insert mode, and type the text. When
you want to navigate, you would go to the normal mode, and use the
navigation keys and other motions defined in vi.

11: This means that it did not take up


Slowly, as the visual mode became more and more popular, Bill added a
additional space on the disk, but was just hardlink to ex called vi. 11
another entry in the directory entry that
pointed to the same inode which stored The modal version of vi was also inspired from another editor called
the ex binary path. Upon execution, ex bravo, which was developed at Xerox PARC. 12
would detect if it was called as vi and
would start in visual mode by default. If you use vi/vim, you may notice that the key to exit the insert mode
We have covered hardlinks in Chapter 1.
is Esc. This may seem inconveniently placed at the top left corner, but
12: Xerox PARC has always been ahead
of its time. The first graphical user inter-
this was because the original vi was developed on a ADM-3A terminal,
face was developed at Xerox PARC. The which had the Esc key to the left of the Q key, where modern keyboards
bravo editor used bitmapped graphics have the Tab key. 13
to display the text, and had extensive
mouse support. The overdependence on Also, the choice of h,j,k,l for navigation was because the ADM-3A terminal
the mouse in such an early time was one
did not have arrow keys, rather, it had h,j,k,l keys for navigation.
of the reasons that the bravo editor was
not as popular as vi.
This can be seen in Figure 2.8.
13: Since the placement of the Escape
key is inconvenient in modern keyboard
layouts, many people remap the Escape
key to the Caps Lock key either in vim
or in the operating system itself.

Figure 2.8: The Keyboard layout of the


ADM-3A terminal

Bill Joy was also one of the people working on the Berkeley Software
Distribution (BSD) of Unix. Thus he bundled vi with the first BSD
distribution of UNIX released in 1978. The pre-installed nature of vi in
the BSD Distribution made it very popular.

However, since both the source code of ed was restricted by Bell Labs
- AT&T, and the source code of vi was restricted by the University of
California, Berkeley, they could not be modified by the users or distributed
freely.

This gave birth to a lot of clones of vi.

Vi Clones

The vi clones were written because the source code for the original
version was not freely available until recently. This made it impossible
to extend the functionality of vi. It also precluded porting vi to other
operating systems, including Linux.

▶ calvin: a freeware "partial clone of vi" for use on MS-DOS. It has


the advantages of small size (the .exe file is only 46.1KB!) and
fast execution but the disadvantage that it lacks many of the ex
commands, such as search and replace.
Figure 2.9: Stevie Editor
2.2 Vim 63

▶ lemmy: a shareware version of vi implemented for the Microsoft


Windows platforms which combines the interface of vi with the
look and feel of a Windows application.
▶ nvi: It is a re-implementation of the classic Berkeley vi editor,
derived from the original 4.4BSD version of vi. It is the "official"
Berkeley clone of vi, and it is included in FreeBSD and the other
BSD variants.
▶ stevie: ‘ST Editor for VI Enthusiasts‘ was developed by Tim Thomp-
son for the Atari ST. It is a clone of vi that runs on the Atari ST.
Tim Thompson wrote the code from scratch (not based on vi) and
posted its source code as a free software to comp.sys.atari.st on
June 1987. Later it was ported to UNIX, OS/2, and Amiga. Because
of this independence from vi and ed’s closed source license, most
vi clones would base their work off of stevie to keep it free and
open source.
▶ elvis: Elvis creator, Steve Kirkendall, started thinking of writing
his own editor after Stevie crashed on him, causing him to lose
hours of work and damaging his confidence in the editor. Stevie
stored the edit buffer in RAM, which Kirkendall believed to be
impractical on the MINIX operating system. One of Kirkendall’s
main motivation for writing his own vi clone was that his new
editor stored the edit buffer in a file instead of storing it in RAM.
Therefore, even if his editor crashed, the edited text could still be
retrieved from that external file. Elvis was one of the first vi clones
to offer support for GUI and syntax highlighting.

The clones add numerous new features which make them significantly
easier to use than the original vi, especially for neophytes. A particularly
useful feature in many of them is the ability to edit files in multiple
windows. This facilitates working on more than one file at the same time,
including cutting and pasting text among them.
Many of the clones also offer GUI versions of vi that operate under the X
Windows system and can take advantage of bit-mapped (high resolution)
displays and the mouse.
Vim
Bram Moolenaar, a Dutch programmer, was impressed by STeVIe, a vi
clone for the Atari ST. But he was working with the Commodore Amiga
at that time, and there was no vi clone for the Amiga. So Bram began
working on the stevie clone for the AmigaOS in 1988.
He released the first public release (v 1.14) in 1991 as visible in Figure
2.11.
Since Vim was based off of Stevie, and not ed or vi so it could be freely
distributed. It was licensed under a charityware license, named as Vim
License. The license stated that if you liked the software, you should
consider making a donation to a charity of your choice.
Moolenaar was an advocate of a NGO based in Kibaale, Uganda, which
he founded to support children whose parents have died of AIDS. In
1994, he volunteered as a water and sanitation engineer for the Kibaale Figure 2.10: Bram Moolenaar
Children’s Centre and made several return trips over the following
twenty-five years.
Later Vim was re-branded as ‘Vi IMproved‘ as seen in Figure 2.12.
64 2 Command Line Editors

Figure 2.11: The inital version of Vim,


when it was called Vi IMitation

Vim has been in development for over 30 years now, and is still actively
maintained. It has added a lot of features over the years, such as syntax
highlighting, plugins, completion, PCRE support, mouse support, etc.
neovim
Recently there have been efforts to modernize the vim codebase. Since
it is more than 30 years old, it has a lot of legacy code. The scripting
Figure 2.12: Vim 9.0 Start screen language of vim is also not a standard programming language, but rather
a custom language called vimscript.
To counter this, a new project called neovim has been started. It uses lua
as the scripting language, and has a lot of modern features like out of the
14: LSP stands for Language Server Pro- box support for LSP, 14 better mouse integration, etc.
tocol. It is a protocol that allows the editor
to communicate with a language server In this course, we will be learning only about basic vi commands and we
to provide features like autocompletion, will be using vim as the editor.
go to definition, etc. This makes vim
more like an IDE.

Figure 2.13: Neo Vim Window editing


this book
This book is written using neovim.
2.2 Vim 65

2.2.2 Ed Commands

Before we move on to Vi Commands, let us first learn about the basic ed


commands. These will also be useful in vim, since the ex mode of vim is
based on ed/ex where we can directly use ed commands on our file in
the buffer.
Table 2.2: Ed Commands
Description Commands
Show the Prompt P
Command Format [addr[,addr]]cmd[params]
Commands for location 1 . $ % + - , ; /RE/
Commands for editing fpacdijsmu
Execute a shell command !command
edit a file e filename
read file contents into buffer r filename
read command output into buffer r !command
write buffer to filename w filename
quit q

Commands for location


Table 2.3: Commands for location
Commands Description
a number like 2 refers to second line of file
. refers to current line
$ refers to last line
% refers to all the lines
+ line after the cursor (current line)
- line before the cursor (current line)
, refers to buffer holding the file or last line in buffer
; refers to current position to end of the file
/RE/ refers line matched by pattern specified by ’RE’

Commands for Editing

Table 2.4: Commands for Editing


Commands Description
f show name of file being edited
p print the current line
a append at the current line
c change the current line
d delete the current line
i insert line at the current position
j join lines
s search for regex pattern
m move current line to position
u undo latest change

Let us try out some of these commands in the ed editor.

Lets start with creating a file, which we will then open in the ed editor.
1 $ echo "line-1 hello world
2 line-2 welcome to line editor
3 line-3 ed is perhaps the oldest editor out there
4 line-4 end of file" > test.txt
66 2 Command Line Editors

This creates a file in the current working directory with the name
test.txt and the contents as given above.

We invoke ed by using the executable ed and providing the filename as


an argument.
1 $ ed test.txt
2 117

As soon as we run it, you will see a number, which is the number of
characters in the file. The terminal may seem hung, since there is no
prompt, of either the bash shell, or of the ed editor. This is because ed is
a line editor, and it does not print the contents of the file on the screen.

Off the bat, we can observe the terseness of the ed editor since it does
not even print a prompt. To turn it on, we can use the P command. The
default prompt is *.
1 ed test.txt
2 117
3 P
4 *

Now we can see the prompt * is always present, whenever the ed editor
expects a command from the user.

Lets go to the first line of the file using the 1 command. We can also go
to the last line of the file using the $ command.
1 *1
2 line-1 hello world
3 *$
4 line-4 end of file
5 *

To print out all the lines of the file, we can use the , or % with p command.
1 *,p
2 line-1 hello world
3 line-2 welcome to line editor
4 line-3 ed is perhaps the oldest editor out there
5 line-4 end of file

1 *%p
2 line-1 hello world
3 line-2 welcome to line editor
4 line-3 ed is perhaps the oldest editor out there
5 line-4 end of file

However, if we use the , command without the p command, it will not


print all the lines. Rather, it will just move the cursor to the last line and
print the last line.
1 *,
2 line-4 end of file

We can also print any arbitrary line range using the line numbers sepa-
rated by a comma and followed by the p command.
1 *2,3p
2 line-2 welcome to line editor
3 line-3 ed is perhaps the oldest editor out there
2.2 Vim 67

One of the pioneering features of ed was the ability to search for a pattern
in the file. Let us quickly explain the syntax of the search command. 15 15: The details of regular expressions
will be covered in a later chapter.
1 */hello/
2 line-1 hello world

We may or may not include the p command after the last / in the search
command.
We can advance to the next line using the + command.
1 *p
2 line-1 hello world
3 *+
4 line-2 welcome to line editor

And go to the previous line using the - command.


1 *p
2 line-2 welcome to line editor
3 *-
4 line-1 hello world

We can also print all the lines from the current line to the end of the file
using the ;p command.
1 *.
2 line-2 welcome to line editor
3 *;p
4 line-2 welcome to line editor
5 line-3 ed is perhaps the oldest editor out there
6 line-4 end of file

We can also run arbitrary shell commands using the ! command.


1 *! date
2 Mon Jun 10 11:36:34 PM IST 2024
3 !

The output of the command is shown to the screen, however, it is not


saved in the buffer.
To read the output of a command into the buffer, we can use the r
command.
1 *r !date
2 32
3 *%p
4 line-1 hello world
5 line-2 welcome to line editor
6 line-3 ed is perhaps the oldest editor out there
7 line-4 end of file
8 Mon Jun 10 11:37:42 PM IST 2024

The output after running the r !date command is the number of char-
acters read into the buffer. We can then print the entire buffer using the
%p command.

The read data is appended to the end of the file.


16: Remember that the buffer is the in-
We can write the buffer 16 to the disk using the w command.
memory copy of the file and any changes
made to the buffer are not saved to the
1 *w
file until we write the buffer to the file.
2 149
68 2 Command Line Editors

The output of the w command is the number of characters written to the


file.

To exit ed, we can use the q command.


1 *q

To delete a line, we can use the d command. Lets say we do not want the
date output in the file. We can re-open the file in ed and remove the last
line.
1 $ ed test.txt
2 149
3 P
4 *$
5 Mon Jun 10 11:38:49 PM IST 2024
6 *d
7 *%p
8 line-1 hello world
9 line-2 welcome to line editor
10 line-3 ed is perhaps the oldest editor out there
11 line-4 end of file
12 *wq
13 117

We can add lines to the file using the a command. This appends the line
after the current line. On entering this mode, the editor will keep on
taking input for as many lines as we want to add. To end the input, we
can use the . command on a new line.
1 $ ed test.txt
2 117
3 P
4 *3
5 line-3 ed is perhaps the oldest editor out there
6 *a
7 perhaps not, since we know it was inspired from QED
8 which was made multiple times by thompson and ritchie
9 before ed was made.
10 .
11 *%p
12 line-1 hello world
13 line-2 welcome to line editor
14 line-3 ed is perhaps the oldest editor out there
15 perhaps not, since we know it was inspired from QED
16 which was made multiple times by thompson and ritchie
17 before ed was made.
18 line-4 end of file
19 *

We can also utilize the regular expression support in ed to perform search


and replace operations. This lets us either search for a fixed string and
replace with another fixed string, or search for a pattern and replace it
with a fixed string.

Let us change hello world to hello universe.


1 *1
2 line-1 hello world
3 *s/world/universe/
4 line-1 hello universe
2.2 Vim 69

5 *%p
6 line-1 hello universe
7 line-2 welcome to line editor
8 line-3 ed is perhaps the oldest editor out there
9 perhaps not, since we know it was inspired from QED
10 which was made multiple times by thompson and ritchie
11 before ed was made.
12 line-4 end of file
13 *

We can print the name of the currently opened file using the f com-
mand.
1 *f
2 test.txt

If we wish to join two lines, we can use the j command. Let us join lines
4,5, and 6.
1 *4
2 perhaps not, since we know it was inspired from QED
3 *5
4 which was made multiple times by thompson and ritchie
5 *6
6 before ed was made.
7 *4,5j
8 *4
9 perhaps not, since we know it was inspired from QEDwhich was made
multiple times by thompson and ritchie
10 *5
11 before ed was made.
12 *4,5j
13 *4
14 perhaps not, since we know it was inspired from QEDwhich was made
multiple times by thompson and ritchiebefore ed was made.
15 *

Here we can see that we do the joining in two steps, first the lines 4 and
5 are joined, and then the newly modified line 4 and 5 are joined.

We can move a line from its current position to another line using the m
command.

Lets insert a line-0 at the end of the file and then move it to the beginning
of the file.
1 *7
2 line-4 end of file
3 *a
4 line-0 in the beginning, there was light
5 .
6 *8
7 line-0 in the beginning, there was light
8 *m0
9 *1,4p
10 line-0 in the beginning, there was light
11 line-1 hello universe
12 line-2 welcome to line editor
13 line-3 ed is perhaps the oldest editor out there
14 *
70 2 Command Line Editors

17: The undo command in ed is not as We can also undo the last change using the u command. 17
powerful as the undo command in vim.
In vim, we can undo multiple changes 1 *1
using the u command. In ed, we can 2 line-0 in the beginning, there was light
only undo the last change. If we run 3 *s/light/darkness
the u command multiple times, it will 4 line-0 in the beginning, there was darkness
undo the last change of undoing the last
change, basically redoing the last change.
5 *u
6 *.
7 line-0 in the beginning, there was light
8 *

If search and replace is not exactly what we want, and we want to totally
change the line, we can use the c command. It will let us type a new line,
which will replace the current line.
1 *%p
2 line-0 in the beginning, there was light
3 line-1 hello universe
4 line-2 welcome to line editor
5 line-3 ed is perhaps the oldest editor out there
6 perhaps not, since we know it was inspired from QEDwhich was made
multiple times by thompson and ritchiebefore ed was made.
7 line-4 end of file
8 *4
9 line-3 ed is perhaps the oldest editor out there
10 *c
11 line-4 ed is the standard editor for UNIX
12 .
13 *4
14 line-4 ed is the standard editor for UNIX
15 *

Just like the a command, we can also use the i command to insert a line
at the current position. This will move the current line to the next line.
1 *6
2 line-4 end of file
3 *i
4 before end of file
5 .
6 *6,$p
7 before end of file
8 line-4 end of file
9 *

Finally, we can also number the lines using the n command.


1 *%p
2 line-0 in the beginning, there was light
3 line-1 hello universe
4 line-2 welcome to line editor
5 line-4 ed is the standard editor for UNIX
6 perhaps not, since we know it was inspired from QEDwhich was made
multiple times by thompson and ritchiebefore ed was made.
7 before end of file
8 line-4 end of file
9 *%n
10 1 line-0 in the beginning, there was light
11 2 line-1 hello universe
12 3 line-2 welcome to line editor
2.2 Vim 71

13 4 line-4 ed is the standard editor for UNIX


14 5 perhaps not, since we know it was inspired from QEDwhich
was made multiple times by thompson and ritchiebefore ed was
made.
15 6 before end of file
16 7 line-4 end of file
17 *
72 2 Command Line Editors

2.2.3 Exploring Vim

There are a plethora of commands in vim. We wont be able to cover all


of them in this course. Only the basic commands required to get started
with using vim as your primary editor would be covered. A detailed
tutorial on vim can be found by running the command vimtutor in your
terminal.
1 $ vimtutor

This opens a temporary files that goes through a lot of sections of vim,
explaining the commands in detail. This opens the text file in vim itself,
so you can actually try out each exercise as and when you read it.
Many exercises are present in this file to help you remember and master
commands. Feel free to modify the file since it is a temporary file and
any changes made is lost if the command is re-run.

To open a file in vim, we provide the filename as an argument to the vim


executable.
1 $ vim test.txt

Sometimes the normal mode is called Modal Editor


command mode or escape mode, since
we can run commands in this mode and
Vim is a modal editor, which means that it has different modes that it
we press the Esc key to go to this mode.
However, the ex mode is also called com- operates in. The primary modes are:
mand mode, since we can run ex com-
mands in this mode. To avoid confusions, ▶ Normal/Command Mode - The default mode where we can navi-
we will refer to the navigational(default) gate around the file, and run vim commands.
mode as normal mode, since vim inter-
▶ Insert Mode - The mode where we can type text into the file.
nally also refers to it as normal mode,
and we will refer to the ex mode as ex ▶ Visual Mode - The mode where we can select text to copy, cut, or
mode. delete.
▶ Ex Mode - The mode where we can run ex commands.

Pressing the Esc key takes you to the normal mode from any other
mode.

Figure 2.14: Simplified Modes in Vim

18: This is a simplified version of the


modes in vim. There are other interim The figure Figure 2.14 demonstrates how to switch between the different
modes and other keystrokes that toggle modes in vim. 18
the modes. This is shown in detail in
Figure 2.15. Commands in Ex mode
2.2 Vim 73

Figure 2.15: Detailed Modes in Vim

Since we are already familar with commands in ed, most of the commands
are same/similar in ex mode of vim.
Table 2.5: Ex Commands in Vim
Key Description
:f show name of file
:p print current line
:a append at current line
:c change current line
:d delete current line
:i insert line at current position
:j join lines
:s search and replace regex pattern in current line
:m move current line to position
:u undo latest change
:w [filename] write buffer to filename
:q quit if no change
:wq write buffer to filename and quit
:x write buffer to filename and quit
:q! quit without saving
:r filename read file contents into buffer
:r !command read command output into buffer
:e filename edit a file
:sp [filename] split the screen and open another file
:vsp [filename] vertical split the screen and open another file

There are many more commands in the ex mode of vim. Along with
implementing the original ed commands, it also has a lot of additional
74 2 Command Line Editors

commands to make it more integrated with the vim editor, such as the
ability to open split windows, new tabs, buffers, and perform normal
mode commands in ex mode.
Basic Navigation
The basic keys for moving around in a text file in vim are the h,j,k,l
keys. They move the cursor one character to the left, down, up, and right
respectively. These keys are chosen because they are present on the home
row of the keyboard, and do not require the user to move their hands
19: These were historically chosen be- from the home row to navigate. 19
cause the ADM-3A terminal had these
keys for navigation as seen in Figure 2.8. Along with these, we have keys to navigate word by word, or to move to
the next pattern match, or the next paragraph, spelling error, etc.

Table 2.6: Navigation Commands in Vim


Key Description
h move cursor left
j move cursor down
k move cursor up
l move cursor right
w move to the beginning of the next word
e move to the end of the current word
b move to the beginning of the previous word
% move to the matching parenthesis, bracket, or brace
0 move to the beginning of the current line
$ move to the end of the current line
/ search forward for a pattern
? search backward for a pattern
n repeat the last search in the same direction
N repeat the last search in the opposite direction
gg move to the first line of the file
G move to the last line of the file
1G move to the first line of the file
1gg move to the first line of the file
:1 move to the first line of the file
{ move to the beginning of the current paragraph
} move to the end of the current paragraph
fg move cursor to next occurrence of ‘g’ in the line
Fg move cursor to previous occurrence of ‘g’ in the line

Wait you forgot your cursor behind!


All of the above commands move the cursor to the mentioned location.
However, if you want to move the entire screen, and keep the cursor at
the its current position, you can use the z command along with t to top
the line, b to bottom the line, and z to center the line.
There are other commands using the Ctrl key that moves the screen,
and not the cursor.
Table 2.7: Moving the Screen Commands
in Vim Key Description
Ctrl+F move forward one full screen
Ctrl+B move backward one full screen
Ctrl+D move forward half a screen
Ctrl+U move backward half a screen
Ctrl+E move screen up one line
Ctrl+Y move screen down one line
2.2 Vim 75

Replacing Text
Usually in other text editors, if you have a word, phrase, or line which
you want to replace with another, you would either press the backspace
or delete key to remove the text, and then type the new text. However, in
vim, there is a more efficient way to replace text.

Table 2.8: Replacing Text Commands in


Key Description Vim
r replace the character under the cursor
R replace the character from the cursor till escape is pressed
cw change the word under the cursor
c4w change the next 4 words
C delete from cursor till end of line and enter insert mode
cc delete entire line and enter insert mode
5cc delete next 5 lines and enter insert mode
S delete entire line and enter insert mode
s delete character under cursor and enter insert mode

Toggling Case
You can toggle the case of a character, word, line, or any arbitrary chunk
of the file using the or the g command.

Table 2.9: Toggling Case Commands in


Key Description Vim
~ toggle the case of the character under the cursor
g w toggle the case of the word under the cursor
g 0 toggle the case from cursor till beginning of line
g $ toggle the case from cursor till end of line
g { toggle the case from cursor till previous empty line
g { toggle the case from cursor till next empty line
g % toggle the case from the bracket, brace, or parenthesis till its pair

You might start to see a pattern emerging here. Many commands in vim
do a particular command, and on which text it operates is determined by
the character followed by it. Such as c to change, d to delete, y to yank,
etc. The text on which the command operates is mentioned using w for
the word, 0 for till beginning of line, etc.
This is not a coincidence, but rather a design of vim to make it more
efficient to use. The first command is called the operator command, and
the second command is called the motion command.
Vim follows a operator-count-motion pattern. For example: d2w deletes
the next 2 words. This makes it very easy to learn and remember
commands, since you are literally typing out what you want to do.
Deleting or Cutting Text
In Vim, the delete command is used to cut text from the file.
Motion - till, in, around
By now you should notice that dw doesn’t always delete the word under
the cursor. Technically dw means delete till the beginning of the next
word. So if you press dw at the beginning of a word, it will delete the
word under the cursor. But if your cursor is in the middle of a word and
you type dw, it will only delete the part of the word till the beginning of
the next word from the cursor position.
76 2 Command Line Editors

Table 2.10: Deleting Text Commands in


Vim Key Description
x delete the character under the cursor
X delete the character before the cursor
5x delete the next 5 characters
dw delete the word under the cursor
d4w delete the next 4 words
D delete from cursor till end of line
dd delete entire line
6dd delete next 6 lines

To delete the entire word under the cursor, regardless of where the cursor
is in the word, you can use the diw this means delete inside word.
However, now you may notice that diw doesn’t delete the space after
the word. This results in two consequtive spaces, one from the end of
the word, and one from the beginning of the word being left behind. To
delete the space as well, you can use daw which means delete around
word.
This works not just with w but with any other motion such as delete
inside paragraph, which will delete the entire paragraph under the
cursor, resulting in two empty lines being left behind, and delete around
paragraph, which will delete the entire paragraph under the cursor, and
only one empty line being left behind.
Try out the same with deleting inside other items, such as brackets,
parenthesis, braces, quotes, etc. The syntax remains the same, di{, di[,
di(, di", di’, etc.

Yanking and Pasting Text


Yes, copying is called yanking in vim. The command to yank is y and to
paste is p. You can combine y with all the motions and in and around
motions as earlier. You can also add the count to yank multiple lines or
words.
Table 2.11: Deleting Text Commands in
Vim Key Description
yy yank the entire line
yw yank the word under the cursor
.. ..
. .
p paste the yanked text after the cursor
P paste the yanked text before the cursor

Remark 2.2.2 Important to note that the commands


1 yy

and
1 0y$

are not the same. The first command yanks the entire line, including
the newline character at the end of the line. The second one yanks
the entire line, but does not include the newline character at the end
of the line. Thus if you directly press p after the first command, it
will paste the line below the current line, and if you press p after the
2.2 Vim 77

second command, it will paste the line at the end of the current line.

Undo and Redo


The undo command in vim is u and the redo command is Ctrl+R. You
can undo multiple changes, unlike ed.

Remark 2.2.3 If you want to use vim as your primary editor, it is


highly recommended to install the mbbill/undotree plugin. This
plugin will show you a tree of all the changes you have made in the
current buffer, and you can go to any point in the tree and undo
or redo changes. This becomes very useful if you undo too many
changes and by mistake make a new change, this changes your branch
in undo tree, and you cannot redo the changes you undid. With the
undotree plugin, you can switch branches of the undo tree and redo
the changes.

Searching and Replacing


The search command in vim is / for forward search and ? for backward
search. You can use the n command to repeat the last search in the same
direction, and the N command to repeat the last search in the opposite
direction. For example, if you perform forward search then using the
n command will search for the next occurrence of the pattern in the
forward direction, and using the N command will search for the previous
occurrence of the pattern in the backward direction. However if you
perform a backward search using the ? command, then using the n
command will search for the previous occurrence of the pattern in the
backward direction, and using the N command will search for the next
occurrence of the pattern in the forward direction.
You can also use the * command to search for the word under the cursor,
and the # command to search for the previous occurrence of the word
under the cursor.
You can perform search and replace using the :s command. The command
takes a line address on which to perform the search and replace. Usually
you can use the % address to search in the entire file, or the .,$ address
to search from cursor till the end of the file.
You can also use any line number to specify the address range, similar to
the ed editor.
1 :[addr]s/pattern/replace/[flags]

The flags at the end of the search and replace command can be g to
replace all occurrences in the line, and c to confirm each replacement.
The address can be a single line number, a range of line numbers, or a
pattern to search for. The pattern can be a simple string, or a regular
expression.
Some examples of addresses are shown in Table 2.12.
Insert Mode
You can enter insert mode from escape mode using the keys listed in Table
2.13. In insert mode, if you want to insert any non-graphical character,
you can do that by pressing Ctrl+V followed by the key combination for
78 2 Command Line Editors

Table 2.12: Address Types in Search and


Replace Key Description
m,n from line m to line n
m line m
m,$ from line m to end of file
.,$ from current line to end of file
1,n from line 1 to line n
/regex/,n from line containing regex to line n
m,/regex/ from line m to line containing regex
.,/regex/ from current line to line containing regex
/regex/,. from line containing regex to current line
1,/regex/ from the first line to line containing regex
/regex/,$ from line containing regex to the last line
/regex1/;/regex2/ from line containing regex1 to line containing regex2
% entire file

Table 2.13: Keys to enter Insert Mode


Key Description
i enter insert mode before the cursor
a enter insert mode after the cursor
I enter insert mode at the beginning of the line
A enter insert mode at the end of the line
o add new line below the current line and enter insert mode
O add new line above the current line and enter insert mode

the character. For example, to insert a newline character, you can press
Ctrl+V followed by Enter.

These are just the basic commands to get you started with vim. You can
refer to vim cheat sheets present online to get more familiar with the
commands.

▶ https://vim.rtorr.com/ is a good text based HTML cheat sheet


for vim.
▶ https://vimcheatsheet.com/ is a paid graphical cheat sheet for
20: A free version of the graphical cheat vim. 20
sheet is shown in Figure 2.16.
2.2 Vim 79

Figure 2.16: Vim Cheat Sheet


80 2 Command Line Editors

2.3 Emacs

2.3.1 History

Emacs was mostly developed by Richard Stallman and Guy Steele.

Table 2.14: History of Emacs


TECO (Tape Editor and
1962 · · · · · ·• Corrector) was developed at
MIT.
Richard Stallman visits
1976 · · · · · ·• Stanford AI Lab and sees Fred
Wright’s E editor.

Guy Steele accumulates a


1978 · · · · · ·• collection of TECO macros
into EMACS.
EMACS becomes MIT’s
1979 · · · · · ·•
standard text editor.
James Gosling writes Gosling
Figure 2.17: Emacs Logo 1981 · · · · · ·•
Emacs that runs on UNIX.
Richard Stallman starts GNU
1984 · · · · · ·• Emacs - a free software
alternative to Gosling Emacs.

TECO

TECO was developed at MIT in 1962. It was a text editor used to correct
the output of the PDP-1 computer. It is short for Tape Editor and Corrector.
Unlike most modern text editors, TECO used separate modes in which
the user would either add text, edit existing text, or display the document
One could not place characters directly into a document by typing them
into TECO, but would instead enter a character (’i’) in the TECO command
language telling it to switch to input mode, enter the required characters,
during which time the edited text was not displayed on the screen, and
finally enter a character (<esc>) to switch the editor back to command
Figure 2.18: Richard Stallman - founder mode. This is very similar to how vi works.
of GNU and FSF projects
Stallman’s Visit to Stanford

21: What You See Is What You Get


In 1976, Richard Stallman visited the Stanford AI Lab where he saw
Fred Wright’s E editor. He was impressed by E’s WYSIWYG 21 interface
where you do not need to tackle multiple modes to edit a text file. This
is the default behaviour of most modern editors now. He then returned
to MIT where he found that Carl Mikkelsen had added to TECO a
combined display/editing mode called Control-R that allowed the screen
display to be updated each time the user entered a keystroke. Stallman
reimplemented this mode to run efficiently and added a macro feature
to the TECO display-editing mode that allowed the user to redefine any
keystroke to run a TECO program.

Initially TECO was able to only edit the file sequentially, page by page.
This was due to earlier memory restrictions of the PDP-1. Stallman
2.3 Emacs 81

modified TECO to read the entire file into the buffer, and then edit the
buffer in memory allowing for random access to the file.
Too Many Macros!
The new version of TECO quickly became popular at the AI Lab and soon
accumulated a large collection of custom macros whose names often
ended in MAC or MACS, which stood for macro. This quickly got out of
hand as there were many divergent macros, and a user would be totally
lost when using a co-worker’s terminal.
In 1979, Guy Steele combined many of the popular macros into a single
file, which he called EMACS, which stood for Editing MACroS, or E with
MACroS.
To prevent thousands of forks of EMACS, Stallman declared that ‘EMACS
was distributed on a basis of communal sharing, which means all improve-
ments must be given back to me to be incorporated and distributed.’
Till now, the EMACS, like TECO, ran on the PDP-10 which ran the ITS Figure 2.19: Guy L. Steele Jr. combined
many divergent TECO with macros to
operating system and not UNIX. create EMACS
EINE ZWEI SINE and other clones
22: EINE stands for Eine Is Not EMACS
No, that is not German. These are some of the popular clones of EMACS 23: ZWEI stands for ZWEI Was Eine
made for other operating systems. Initially
These kinds of recursive acronyms are
EINE 22 was a text editor developed in the late 1970s. In terms of features, common in the nix world. For example,
its goal was to ‘do what Stallman’s PDP-10 (original) Emacs does’. Unlike GNU stands for GNU’s Not Unix, WINE
the original TECO-based Emacs, but like Multics Emacs, EINE was (A compatibility layer to run Windows
written in Lisp. It used Lisp Machine Lisp. applications) is short for WINE Is Not an
Emulator.
In the 1980s, EINE was developed into ZWEI 23 . Innovations included 24: SINE stands for SINE Is Not EINE
programmability in Lisp Machine Lisp, and a new and more flexible
doubly linked list method of internally representing buffers.
SINE 24 was written by Owen Theodore Anderson in 1981.
In 1978, Bernard Greenberg wrote a version of EMACS for the Multics
operating system called Multics EMACS. This used Multics Lisp.
Gosling Emacs
In 1981, James Gosling wrote Gosling Emacs for UNIX. It was written in
C and used Mocklisp, a language with lisp-like syntax, but not a lisp. It
was not free software.
GNU Emacs Figure 2.20: James Gosling - creator of
Gosling Emacs and later Java
In 1983, Stallman started the GNU project to create a free software
alternatives to proprietary softwares and ultimately to create a free 25
25: Recall from the previous chapter
operating system. that free software does not mean a soft-
ware provided gratis, but a software
In 1984, Stallman started GNU Emacs, a free software alternative to
which respects the user’s freedom to run,
Gosling Emacs. It was written in C and used a true Lisp dialect, Emacs copy, distribute, and modify the software.
Lisp as the extension language. Emacs Lisp was also implemented in C. It is like free speech, not free beer.
This is the version of Emacs that is most popular today and available on
most operating systems repositories.
How the developer’s keyboard influences the editors they make
Remember that ed was made while using ADM-3A which looked like
Figure 2.21.
Figure 2.21: ADM-3A terminal
82 2 Command Line Editors

Whereas emacs was made while the Knight keyboard and the Space
Cadet keyboard were in use, which can be seen in Figure 2.22.
Notice how the ADM-3A has very limited modifier keys, and does not
even have arrow keys. Instead it uses h,j,k,l keys as arrow keys with
a modifier. This is why vi uses mostly key combinations and a modal
interface. Vi also uses the Esc key to switch between modes, which is
Figure 2.22: Space Cadet Keyboard present conveniently in place of the Caps Lock or Tab key in modern
keyboard layouts.
The Space Cadet keyboard has many modifier keys, and even a key for
the Meta key. This is why emacs uses many key modifier combinations,
and has a lot of keybindings.

2.3.2 Exploring Emacs

This is not a complete overview of Emacs, or even its keybindings. A


more detailed reference card can be found on their website.
Opening a File
We can open a file in emacs by providing its filename as an argument to
the emacs executable.
1 $ emacs test.txt

Most of emacs keybindings use modifier keys such as the Ctrl key, and
the Meta key. The Meta key is usually the Alt key in modern keyboards.
In the reference manual and here, we will be representing the Meta key
as M- and the Ctrl key as C-.
Basic Navigation
These keys are used to move around in the file. Like vim, emacs also
focusses on keeping the hands free from the mouse, and on the keyboard.
All the navigation can be done through the keyboard.

Table 2.15: Navigation Commands in


Emacs Key Description
C-p move up one line
C-b move left one char
C-f move right one char
C-n move down one line
C-a goto beginning of current line
C-e goto end of current line
C-v move forward one screen
M-< move to first line of the file
M-b move left to previous word
M-f move right to next word
M-> move to last line of the file
M-a move to beginning of current sentence
M-e move to end of current sentence
M-v move back one screen

Exiting Emacs
We can exit emacs either with or without saving the file. We can also
suspend emacs and return to the shell. This is a keymapping of the shell,
and not of emacs.
2.3 Emacs 83

Table 2.16: Exiting Emacs Commands


Key Description
C-x C-s save buffer to file
C-z suspend emacs
C-x C-c exit emacs and stop it

Searching Text
Emacs can search for a fixed string, or a regular expression and replace it
with another string.

Table 2.17: Searching Text Commands in


Key Description Emacs
C-s search forward
C-r search backward
M-x replace string

Copying and Pasting


Copying can done by marking the region, and then copying it.

Table 2.18: Copying and Pasting Com-


Key Description mands in Emacs
M-backspace cut the word before cursor
M-d cut the word after cursor
M-w copy the region
C-w cut the region
C-y paste the region
C-k cut from cursor to end of line
M-k cut from cursor to end of sentence
84 2 Command Line Editors

2.4 Nano

Although vim and emacs are the most popular command line text editors,
nano is also a very useful text editor for beginners. It is very simple and
does not have a steep learning curve.

It is a non-modal text editor, which means that it does not have different
Figure 2.23: Nano Text Editor modes for different actions. You can directly start typing text as soon as
you open nano.

Although it uses modifier keys to invoke commands, it does not have a


lot of commands as vim or emacs.

2.4.1 History
26: It is believed that pine stands for
Pine is Not Elm, Elm being another text- Pine 26 was a text-based email client developed at the University of
based email client. However, the author
clarifies that it was not named with that
Washington. It was created in 1989. The email client also had a text editor
in mind. Although if a backronym was built in called Pico.
to be made, he prefered ‘Pine is Nearly
Elm’ or ‘Pine is No-longer Elm’ Although the license of Pine and Pico may seem open source, it was
27: Up to version 3.91, the Pine license not. The license was restrictive and did not allow for modification or
was similar to BSD, and it stated that redistribution. 27
‘Permission to use, copy, modify, and
distribute this software and its documen- Due to this, many people created clones of Pico with free software
tation for any purpose and without fee to
licenses. One of the most popular clones was TIP (TIP isn’t Pico) which
the University of Washington is hereby
granted ...’ The university registered a was created by Chris Allegretta in 1999. Later in 2000 the name was
trademark for the Pine name with respect changed to Nano. 28 In 2001, nano became part of the GNU project.
to ‘computer programs used in commu-
nication and electronic mail applications’ GNU nano implements several features that Pico lacks, including syntax
in March 1995. From version 3.92, the highlighting, line numbers, regular expression search and replace, line-
holder of the copyright, the University of
by-line scrolling, multiple buffers, indenting groups of lines, rebindable
Washington, changed the license so that
even if the source code was still avail- key support, and the undoing and redoing of edit changes.
able, they did not allow modifications
and changes to Pine to be distributed In most modern linux systems, the nano binary is present along with the
by anyone other than themselves. They pico binary, which is actually a symbolic link to the nano binary.
also claimed that even the old license
never allowed distribution of modified You can explore this by finding the path of the executable using the which
versions.
command and long-listing the executable.
28: Mathematically, nano is 10−9 or one
billionth. and pico is 10−12 or one tril- 1 $ which pico
lionth. or put relatively, nano is 1000 2 /usr/bin/pico
times bigger than pico, although the size 3 $ ls -l /usr/bin/pico
of nano binary is smaller than pico.
4 lrwxrwxrwx 1 root root 22 Sep 6 2023 /usr/bin/pico -> /etc/
alternatives/pico
5 $ ls -l /etc/alternatives/pico
6 lrwxrwxrwx 1 root root 9 Sep 6 2023 /etc/alternatives/pico -> /
bin/nano

Remark 2.4.1 Note that here we have a symlink to another symlink.


Theoretically, you can extend to as many levels of chained symlinks
as you want. Thus, to find the final sink of the symlink chain, you can
use the readlink -f command or the realpath command.

1 $ realpath $(which pico)


2 /usr/bin/nano
2.4 Nano 85

2.4.2 Exploring Nano

In nano, the Control key is represented by the ^symbol. The Meta or Alt
key is represented by the M-.

File Handling

You can open a file in nano by providing the filename as an argument to


the nano executable.
1 $ nano test.txt

Table 2.19: File Handling Commands in


Key Description Nano
^S save the file
^O save the file with a new name
^X exit nano

Editing

Nano is a simple editor, and you can do without learning any more
commands than the ones listed above, but here are some more basic
commands for editing text.

Table 2.20: Editing Commands in Nano


Key Description
^K cut current line and save in cutbuffer
M-6 copy current line and save in cutbuffer
^U paste contents of cutbuffer
M-T cut until end of buffer
^] complete current word
M-U undo last action
M-E redo last undone action
^J justify the current paragraph
M-J justify the entire file
M-: start/stop recording a macro
M-; run the last recorded macro
F12 invoke the spell checker, if available

There are many more commands in nano, but they are omitted from here You can also find third-party cheat sheets
for brevity. You can find the complete list of keybindings by pressing ^G online.
key in nano, or by running info nano.

2.4.3 Editing A Script in Nano

Since learning nano is mostly to be able to edit a text file even if you are
not familiar with either vim or emacs, let us try to edit a simple script file
to confirm that you can use nano.
1 $ touch myscript.sh
2 $ chmod u+x myscript.sh
3 $ nano myscript.sh

Now try to write a simple script in the file. An example script is shown
below.
86 2 Command Line Editors

1 #!/bin/bash
2 read -rp ’What is your name? ’ name
3 echo "Hello $name"
4 date=$(date "+%H:%M on a %A")
5 echo "Currently it is $date"
If you do not understand how the script
works, do not worry. It will be covered Now save the file by pressing ^S
in depth in later chapters.

Remark 2.4.2 In some systems, the ^S key will freeze the terminal.
Any key you press after this will seem to not have any effect. This is
because it is interpreted as the XOFF and is used to lock the scrolling
of the terminal. To unfreeze the terminal, press ^Q. In such a system,
you can save the file by pressing ^O and then typing out the name
of the file if not present already, and pressing Enter. To disable this
behaviour, you can add the line
1 set -ixon

to your .bashrc file.

and exit nano by pressing ^X.


Now you can run the script by typing
1 $ ./myscript.sh
2 What is your name? Sayan
3 Hello Sayan
4 Currently it is 21:50 on a Tuesday

Now that we are able to edit a text file using text editors, we are ready to
write scripts to solve problems.
Networking and SSH 3
3.1 Networking . . . . . . . 87
3.1 Networking 3.1.1 What is networking? . 87
3.1.2 Types of Networks . . 87
3.1.1 What is networking? 3.1.3 Devices in a Network . 88
3.1.4 IP Addresses . . . . . . 89
3.1.5 Subnetting . . . . . . . 89
Have you ever tried to get some work done on a computer while the
3.1.6 Private and Public IP
internet was down? It’s a nightmare. Modern day computing relies highly
Addresses . . . . . . . . 91
on networking. But what is networking? 3.1.7 CIDR . . . . . . . . . . . 92
3.1.8 Ports . . . . . . . . . . . 92
Definition 3.1.1 (Networking) A computer network comprises two 3.1.9 Protocols . . . . . . . . . 93
or more computers that are connected—either by cables (wired) or 3.1.10 Firewalls . . . . . . . . . 94
3.1.11 SELinux . . . . . . . . . 94
wifi (wireless)—with the purpose of transmitting, exchanging, or
3.1.12 Network Tools . . . . . 95
sharing data and resources.
3.2 SSH . . . . . . . . . . . . 101
3.2.1 What is SSH? . . . . . . 101
We have been using the computer, and linux, for a while now but the 3.2.2 History . . . . . . . . . . 101
utility of a computer increases exponentially when it is connected to 3.2.3 How does SSH work? . 101
a network. It allows computers to share files and resources, and to 3.2.4 Key-based Authentica-
communicate with each other. Current day world wide web is built on tion . . . . . . . . . . . . 102
the internet. 3.2.5 Configuring your SSH
keys . . . . . . . . . . . . 102
3.2.6 Sharing your public key 103
Definition 3.1.2 (Internet) Internet is a global network of networks 3.2.7 How to login to a re-
that connects millions of computers worldwide. It allows computers mote server . . . . . . . 104
to connect to other computers across the world through a hierarchy 3.2.8 Call an exorcist, there’s
of routers and servers. a daemon in my com-
puter . . . . . . . . . . . 104
Learning about networking and how networking works is useful, al- One succinct blogpost explaining how
though we won’t be devling into details in this book. It is left as an the internet works from which the figure
exercise for the reader to explore external resources if they are inter- Figure 3.1 is taken is available at https:
//www.highspeedinternet.com/
ested. resources/how-the-internet-works

3.1.2 Types of Networks

If the end goal is to connect computers with each other, one naive solution
might be to connect all the computers with each other. Although this
might seem intuitive at first, this quickly gets out of hand when the
number of computers keep increasing.
Figure 3.1: Types of Networks
If we have 𝑛 computers, then the number of connections required to
connect all the computers with each other is given by the formula

𝑛(𝑛 − 1) 𝑛 2 − 𝑛
=
2 2

This is a quadratic function and grows very quickly.

This means it will cost a lot to connect all the computers to each other.
This is applicable not only in computer networking with physical wires, Figure 3.2: Growth Rate of Different
Functions - Note how quickly 𝑛 2 grows
88 3 Networking and SSH

but in many other fields. Take an examples of airlines and airplane routes.
If there were 𝑛 airports, then the number of routes required to connect
all the airports is given by the same formula. This would be disastrous
for the economy and the environment if we ran so many airplanes daily.
So what gives?
Hub and Spoke Network
The solution to this problem is to use a hub and spoke model, where
there are one, or multiple, central hubs which connect to many other
nodes. Any path from any node to another goes through one or more
hubs. This reduces the number of connections required to connect all the
Figure 3.3: Hub and Spoke Model Em- nodes.
ployed by Airlines
This is the solution used in airlines, and also in most computer networks.
1
1: Although computer networks use a
hub model for the local area network, Due to this, networks can be classified into three broad categories based
the network of networks, especially the
on their geographical area coverage.
gateway routers follow a mesh model to
ensure redundancy and make the net-
▶ Local Area Network (LAN): A network that covers a small geo-
work more robust.
graphical area, like a home, office, or a building.
▶ Metropolitan Area Network (MAN): A network that covers a
larger geographical area, like a city or a town.
▶ Wide Area Network (WAN): A network that covers a large geo-
graphical area, like a country or the entire world.

To connect these networks to computers and also to each other, we require


some special devices.

3.1.3 Devices in a Network


Figure 3.4: LAN and WAN connecting
to the Internet
In computer networks, this hub of the Hub and Spoke Model can either
be a level 1 hub, a level 2 switch, or a level 3 router.
Hub
A hub will simply broadcast the message to all the connected nodes. This
causes a lot of traffic to be generated and is not very efficient. Hub does
not have the capability to identify which node is who. This is called a
2: To understand more about the levels,
level 1 hub. 2
refer OSI Model
Switch
A switch is smarter than a hub. It can identify each device connected to
it and can send the packets of data only to the intended recipient. This is
more efficient than a hub. This is called a level 2 switch since it uses the
level 2 of the OSI model (Data Link Layer) to identify the devices. This
means that the devices are identified by their MAC addresses. Using
this, you can only communicate with devices in your local network.
This is useful for a home network or a office network. But we cannot
communicate with the entire world using this, since it doesn’t understand
IP addresses.
Router
A router is even smarter than a switch. It can understand IP addresses
and can route the packets from one network to another. This is called a
Figure 3.5: Hub, Switch, Router connect- level 3 router since it uses the level 3 of the OSI model (Network Layer) to
ing to the Internet
3.1 Networking 89

identify the devices. This means that the networks are identified by their
IP addresses. This is what we use to connect to the internet. The internet
is nothing but a whole lot of routers communicating with each other
to find the optimal path to send the packets to its destination. Border
Gateway Protocol (BGP) is the protocol used by routers to communicate
with each other and find the optimal path. They usually are connected in
a mesh network to ensure redundancy and robustness.

Level 3 Switch

A level 3 switch, or a routing switch, is a switch with the capabilities of


a router. It can understand the language of IP addresses and can route
the packets to different networks. This is useful in large organizations
where there are many networks and each network can be divided into
subnetworks called VLANs. You can read more about the differences
between these devices online.
So, in short, internet is a network of network of . . . networks. It is a
hierarchical structure connecting all the computers in the world. Some
computers are connected earlier in the hierarchy (usually the ones closer
geographically) and some are connected later.

3.1.4 IP Addresses

So how do routers know which where to send the data packets? This
is where IP addresses come in. To communicate over the internet, two
computers need to know their public IP addresses. The routers then finds
the optimal path to send the data packets to the destination network.

IP addresses are of two types: IPv4 and IPv6. The most common one is
IPv4, which is a 32-bit address represented as four octets separated by
dots. For example,

162.136.73.21
3: An octet is a group of 8 bits. Since an
Here, each octet can take values from 0 to 255. 3 Technically all such IP address is 32 bits, it is represented as 4
groups of 8 bits. 8 bits can only represent
combinations are possible IP addresses, resulting in 232 = 4 , 294 , 967 , 296
numbers from 0 to 255. since 28 = 256.
possible IP addresses. That is a lot of IP addresses, but not enough for
the growing number of devices in the world.

This is where IPv6 comes in. IPv6 is a 128-bit address represented as 8


groups of 4 hexadecimal digits separated by colons. For example,

2001 : 0 𝑑𝑏 8 : 85 𝑎 3 : 0000 : 0000 : 8 𝑎 2 𝑒 : 0370 : 7334 Notice that there are some groups of
zeros in the address. These can be com-
pressed by writing only one zero in place
This results in 2128 = 340282366920938463463374607431768211456 possi- of multiple zeros in each group. Further,
ble IP addresses, which is a lot more than IPv4. any leading zeros can be omitted for
each group. Making the above address
as 2001:db8:85a3:0:0:8a2e:370:7334.
Further, if there are multiple groups
3.1.5 Subnetting of zeros, they can be compressed to
::. This can be used only once in
an address. Doing this, the above ad-
Legacy Classes
dress can be compressed further to
2001:db8:85a3::8a2e:370:7334.

10.125.42.62 → 00001010.01111101.00101010.00111110
90 3 Networking and SSH

Recall that an IP address, although represented as four octets, is actually


a 32-bit address. This means that in binary form, an IP address is a string
of 32 1s and 0s. Using the first four bits, we can classify an IP address
into five classes.

▶ Class A: The first bit is ‘0‘. The IP addresses in the range 0.0.0.0 to
127.255.255.255.
▶ Class B: The first two bits are ‘10‘. IP addresses in the range 128.0.0.0
to 191.255.255.255.
▶ Class C: The first three bits are ‘110‘. IP addresses in the range
192.0.0.0 to 223.255.255.255.
▶ Class D: The first four bits are ‘1110‘. IP addresses in the range
224.0.0.0 to 239.255.255.255. These are reserved for multicast
addresses.
▶ Class E: The first four bits are ‘1111‘. IP addresses in the range
240.0.0.0 to 255.255.255.255. These are reserved for experimental
purposes.

However, these classes do not simply assign an IP to each machine. They


are further divided into the network part and the host part.

Class A assigns the first octet to the network part, this is used to identify
which network the machine is in. The remaining three octets are used to
identify the host in that network. This means that a class A network can
have 224 − 2 = 16 , 777 , 214 hosts. However, there can only be 27 = 128
class A networks. Thus, class A networks are used by large organizations
which have many hosts, but not many large organizations exist, so 128
networks are enough.

Similarly, class B assigns the first two octets to identify the network and
the remaining two octets to identify the host. This means that a class B
network can have 216 − 2 = 65 , 534 hosts. And there can be 214 = 16 , 384
class B networks. These are used by medium-sized organizations, which
are plenty in number, and have a moderate number of hosts.

The same goes for class C networks, where the first three octets are used
to identify the network and the last octet is used to identify the host. This
network can have 28 − 2 = 254 hosts. And there can be 221 = 2 , 097 , 152
class C networks. These are used by small organizations, which are plenty
in number, and have a small number of hosts.

Subnet Masks

Definition 3.1.3 (Subnetting) The process of dividing a network into


smaller network sections is called subnetting.

Usually, each network has only one subnet, which contains all the hosts in
that network. However, the network can be further divided into smaller
subnetworks, each containing a subset of the hosts. This is useful in large
organizations where the network is divided into departments, and each
department is given a subnetwork.

To indicate which part of the IP address is the network part and which
part is the host part, we use a subnet mask. A subnet mask is a 32-bit
number where the first 𝑛 bits are 1s and the remaining bits are 0s. The
number of 1s in the subnet mask indicates the number of bits used to
3.1 Networking 91

identify the network. For example, for the IP Address 192.168.0.15, it


can be written in binary as

11000000 − 10101000 − 00000000 − 00001111

As we know, it belongs to the class C network, where the first three octets
are used to identify the network, and the rest is used to identify the host.
So the default network mask is

11111111 − 11111111 − 11111111 − 00000000

or 255.255.255.0 in decimal. The network portion of the IP address is


found by taking the bitwise AND 4 of the IP address and the subnet 4: The bitwise AND operation is a bi-
mask. This results in the network address nary operation that takes two equal-
length binary representations and per-
forms the logical AND operation on each
11000000 − 10101000 − 00000000 − 00000000 pair of corresponding bits. The result in
each position is 1 if the first bit is 1 and
which is 192.168.0.0 and the host address is 0000 1111 which is 15. the second bit is 1; otherwise, the result
is 0.
However, if we do not require all the 8 bits in the host space 5 then we 5: That is, if we have less than 254 hosts
can use some of the initial bits of the host space to identify subnetworks. in the network.
This is called subnetting.
For example, the netmask of 255.255.255.0 leaves 8 bits for the host
space, or 28 − 2 = 254 6 hosts. If we want to split this network into 6: We subtract 2 from the total number
two subnets, we can use the MSB of the host space for representing the of hosts to account for the network ad-
dress and the broadcast address, which
subnetworks. This results in each subnet having 27 − 2 = 126 hosts. are the first(0) and the last(255) addresses
in the network.
Remark 3.1.1 Observe that we effectively lost two available addresses
from the total number of hosts in the network. Earlier we could have
254 hosts, but now we can have only 126 × 2 = 252 hosts. This is
because each subnet also reserves the first and the last address for the
network address and the broadcast address.

To do this, we change the subnet mask to

11111111 − 11111111 − 11111111 − 10000000

which can be represented as 255.255.255.128.


This gives us two subnets, one with the address range of 192.168.0.1 to
192.168.0.127, and another of 192.168.0.129 to 192.168.0.255.

3.1.6 Private and Public IP Addresses

But what if we want to communicate with computers in our local


network? This is where private IP comes in. Some ranges of IP addresses
are reserved for private networks. These are not routable over the internet.
Each LAN has a private IP address range, and the router translates these
private addresses to the public IP address when sending the packets over 7: Dynamic Host Configuration Proto-
the internet. The assignment of these private IP addresses is done by the col (DHCP) is a network management
protocol used on Internet Protocol net-
DHCP server 7 in the router.
works whereby a DHCP server dynami-
Each class of networks has a range of IP addresses that are reserved for cally assigns an IP address and other net-
work configuration parameters to each
private networks. device on a network so they can commu-
nicate with other IP networks.
92 3 Networking and SSH

Table 3.1: Private IP Address Ranges

Class Network Bits (CIDR) Address Range Number of Addresses


Class A 8 10.0.0.0 - 10.255.255.255 16,777,216
Class B 12 172.16.0.0 - 172.31.255.255 1,048,576
Class C 16 192.168.0.0 - 192.168.255.255 65,536

3.1.7 CIDR

However, this practice of subdividing IP addresses into classes is a legacy


concept, and not followed anymore. Instead, we use CIDR (Classless
Inter-Domain Routing) to announce how many bits are used to identify
the network and how many bits are used to identify the host.
For example, we could express the idea that the IP address 192.168.0.15
is associated with the netmask 255.255.255.0 by using the CIDR no-
tation of 192.168.0.15/24. This means that the first 24 bits of the IP
address given are considered significant for the network routing.
This is helpful because not all organizations fit into the tight categorization
of the legacy classes.

3.1.8 Ports

Ports usually refer to physical holes in a computer where you can connect
a cable. However, in networking, ports refer to logical endpoints for
communication.

Definition 3.1.4 (Port) A port or port number is a number assigned


to uniquely identify a connection endpoint and to direct data to a
specific service. At the software level, within an operating system, a
port is a logical construct that identifies a specific process or a type
of network service.

For any communication between two computers, the data needs to be


sent to a specific port. This is because there are multiple services running
on a computer, and the operating system needs to know which service to
direct the data to.
There are 216 = 65 , 536 ports available for use. However, the first 1024
ports are reserved for well-known services. These are called the well-
known ports and are used by services like HTTP, FTP, SSH, etc. The well
known ports can be found in Table 3.2.
Other ports, from 1024 to 49151, are registered ports, these ports can be
registered with the Internet Assigned Numbers Authority (IANA) by
anyone who wants to use them for a specific service.
Ports from 49152 to 65535 are dynamic ports, these are used by the
operating system for temporary connections and are not registered.
Whenever you send a request to a web server for example, the request
is sent to the server’s IP address and the port number 80 which is the
8: or 443 for HTTPS.
default port for HTTP 8 but the port number from your (the client) side
is usually a random port number from the dynamic port range. This is a
3.1 Networking 93

short-lived port number and is used to establish a connection with the


server.

These kinds of ports which are short-lived and used to establish a


connection are called ephemeral ports.

Table 3.2: Well-known Ports

Port Number Service Protocol


20 File Transfer Protocol (FTP) Data Transfer TCP
21 File Transfer Protocol (FTP) Command Control TCP
22 Secure Shell (SSH) Secure Login TCP
23 Telnet remote login service, unencrypted text messages TCP
25 Simple Mail Transfer Protocol (SMTP) email delivery TCP
53 Domain Name System (DNS) service TCP/UDP
67, 68 Dynamic Host Configuration Protocol (DHCP) UDP
80 Hypertext Transfer Protocol (HTTP) used in the World Wide Web TCP
110 Post Office Protocol (POP3) TCP
119 Network News Transfer Protocol (NNTP) TCP
123 Network Time Protocol (NTP) UDP
143 Internet Message Access Protocol (IMAP) Management of digital mail TCP
161 Simple Network Management Protocol (SNMP) UDP
194 Internet Relay Chat (IRC) TCP
443 HTTP Secure (HTTPS) HTTP over TLS/SSL TCP
546, 547 DHCPv6 IPv6 version of DHCP UDP

3.1.9 Protocols

Definition 3.1.5 (Protocols) In computing, a protocol is a set of rules


that define how data is transmitted between devices in a network.

There are many protocols used in networking, some of the most common
ones are

▶ HTTP: HyperText Transfer Protocol


▶ HTTPS: HyperText Transfer Protocol Secure
▶ FTP: File Transfer Protocol
▶ SSH: Secure Shell
▶ SMTP: Simple Mail Transfer Protocol
▶ POP3: Post Office Protocol
▶ IMAP: Internet Message Access Protocol
▶ DNS: Domain Name System
▶ DHCP: Dynamic Host Configuration Protocol
▶ NTP: Network Time Protocol
▶ SNMP: Simple Network Management Protocol
▶ IRC: Internet Relay Chat
▶ BGP: Border Gateway Protocol
▶ TCP: Transmission Control Protocol
▶ UDP: User Datagram Protocol

These protocols act on the different layers of the OSI model. For example,
HTTP, HTTPS, FTP, SSH, etc. are application layer protocols, while TCP,
UDP, etc. are transport layer protocols.
94 3 Networking and SSH

3.1.10 Firewalls

Definition 3.1.6 (Firewall) A firewall is a network security system


that monitors and controls incoming and outgoing network traffic
based on predetermined security rules.

A fireewall acts as a barrier between your computer and the internet. It


monitors the incoming and outgoing traffic and blocks any traffic that
does not meet the security rules. This is useful to prevent unauthorized
access to your computer and to prevent malware from entering your
computer and/or communicating with the outside world.
1 $ sudo ufw enable # Enable the firewall
2 $ sudo ufw allow 22 # Allow SSH
3 $ sudo ufw allow 80 # Allow HTTP
4 $ sudo ufw allow 443 # Allow HTTPS
5 $ sudo ufw status # Check the status of the firewall

3.1.11 SELinux

We can have additional security by using SELinux in addition to the


firewall. SELinux is a security module that provides access control security
policies. SELinux is short for Security-Enhanced Linux. It provides a
flexible Mandatory Access Control (MAC) that restricts the access of
users and processes to files and directories.
Least Privilege Principle

Definition 3.1.7 (Least Privilege Principle) The principle of least priv-


ilege (POLP) is an important concept in computer security, promoting
minimal user profile privileges on computers based on users’ job
necessity.

This principle states that a user should have only the minimum privileges
required to perform their job. This principle is applied throughout linux,
and also in SELinux.
You can check if SELinux is enabled by running
1 $ sestatus

If SELinux is enabled, you can check the context of a file or a directory


using
1 $ ls -lZ

However, if SELinux is not enabled, it will show a ? in the context.


If SELinux is enabled, you can set the context of a file or a directory using
the chcon command.
RBAC Items

Definition 3.1.8 (Role-Based Access Control (RBAC)) Role-Based


Access Control (RBAC) is a policy-neutral access control mechanism
3.1 Networking 95

defined around roles and privileges. The components of RBAC such


as role-permissions, user-role, and role-role relationships make it
simple to perform user assignments.

SELinux uses the concept of RBAC to control the access of users and
processes to files and directories.

There are four components in the SELinux context that are used to control
the access of users and processes to files and directories, as shown in
Figure 3.6.

Figure 3.6: SELinux Context

▶ User: The user who is trying to access the file or directory.


▶ Role: The role of the user, which defines the permissions of the
user.
▶ Type: The type of the file or directory.
▶ Domain: The domain 9 of the process trying to access the file or 9: or type or sensitivity
directory.

Modes of SELinux

▶ Enforcing: In this mode, SELinux is enabled and actively enforcing


the security policies.
▶ Permissive: In this mode, SELinux is enabled but not enforcing the
security policies. It logs the violations but does not block them.
▶ Disabled: In this mode, SELinux is disabled and not enforcing any
security policies.

You can change the mode of SELinux by editing the /etc/selinux/config


file.
1 $ sudo vim /etc/selinux/config

Tools for SELinux

▶ sestatus: Check the status of SELinux.


▶ semamange: Manage the SELinux policy.
▶ restorecon: Restore the context of files and directories.

3.1.12 Network Tools

There are a lot of tools in GNU/Linux used for managing, configuring,


and troubleshooting networks. Some of the important tools are listed in
Table 3.3.

ip

To find out the private IP address of the NICs of your system, you can 10: ip a also works.
run the ip addr command. 10
96 3 Networking and SSH

Table 3.3: Network Tools

Tool Description
ip Show / manipulate routing, devices, policy routing and tunnels
ping To see if the remote machine is up
traceroute Diagnostics the hop timings to the remote machine
nslookup Ask for conversion of IP address to name
dig DNS lookup utility
netstat Print network connections
mxtoolbox Public accessibility of your server
whois Information about the domain
nmap Network port scanner
wireshark Network protocol analyzer and packet sniffer

1 $ ip addr
2 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state
UNKNOWN group default qlen 1000
3 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4 inet 127.0.0.1/8 scope host lo
5 valid_lft forever preferred_lft forever
6 inet6 ::1/128 scope host noprefixroute
7 valid_lft forever preferred_lft forever
8 2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel
state UP group default qlen 1000
9 link/ether 1c:1b:0d:e1:5d:61 brd ff:ff:ff:ff:ff:ff
10 altname enp3s0
11 inet 192.168.0.109/24 brd 192.168.0.255 scope global dynamic
eno1
12 valid_lft 7046sec preferred_lft 7046sec
13 inet6 fe80::68e2:97e0:38ec:4abc/64 scope link noprefixroute
14 valid_lft forever preferred_lft forever

Here you can see there are two interfaces, lo and eno1. The lo interface
is the loopback interface, and the eno1 interface is the actual network
interface. The IP address of the lo interface is usually always 127.0.0.1.
This address is used to refer to the same system in terms of IP address
without knowing the actual private IP of the system in the LAN.

The IP address of the eno1 interface is the private IP address allocated by


your router. This is not your public IP address, which is the address of
your router on the internet. Usually public IPs are statically assigned by
ISPs and are not changed often. It is configured in your router.

Private IPs however often needs to be assigned dynamically since devices


can connect and disconnect from the network at any time. This is done
by the DHCP server in your router.

Remark 3.1.2 The NIC name can be different in different systems. For
a ethernet connection, it is usually eno1 or eth0 which is the legacy
name. For a wifi NIC, it is usually wlan0.

Remark 3.1.3 Earlier the tool used to check the network status was
ifconfig. However, this tool is deprecated now and should not be
used. The new tool to check the network status is ip.
3.1 Networking 97

ping
The ping command is used to check if a remote server is up and running.
It sends an ICMP 11 packet to the remote server and waits for a response. 11: Internet Control Message Protocol
(ICMP) is a supporting protocol in the
Internet protocol suite. It is used by net-
Remark 3.1.4 Only a positive response from the server indicates work devices, including routers, to send
that the server is up and running. A negative response does not error messages and operational informa-
tion indicating success or failure when
necessarily mean that the server is down. Servers can be configured
communicating with another IP address.
to not respond to ICMP packets.

1 $ ping -c 4 google.com # Send 4 ICMP packets to google.com


2 PING google.com (172.217.163.206) 56(84) bytes of data.
3 64 bytes from maa05s06-in-f14.1e100.net (172.217.163.206):
icmp_seq=1 ttl=114 time=45.6 ms
4 64 bytes from maa05s06-in-f14.1e100.net (172.217.163.206):
icmp_seq=2 ttl=114 time=45.4 ms
5 64 bytes from maa05s06-in-f14.1e100.net (172.217.163.206):
icmp_seq=3 ttl=114 time=45.3 ms
6 64 bytes from maa05s06-in-f14.1e100.net (172.217.163.206):
icmp_seq=4 ttl=114 time=45.8 ms
7
8 --- google.com ping statistics ---
9 4 packets transmitted, 4 received, 0% packet loss, time 3004ms
10 rtt min/avg/max/mdev = 45.316/45.524/45.791/0.181 ms

The response of the ping command shows the time taken for the packet
to reach the server and also the resolved IP address of the server.
nslookup
Another tool to lookup the associated IP address of a domain name is
the nslookup command.
1 $ nslookup google.com
2 Server: 192.168.0.1
3 Address: 192.168.0.1#53
4
5 Non-authoritative answer:
6 Name: google.com
7 Address: 172.217.163.206
8 Name: google.com
9 Address: 2404:6800:4007:810::200e

Here you can see the resolved IP address of the domain is 172.217.163.206.
If you copy this IP address and paste it in your browser, you can see that
the website of google opens up. The second address returned is the IPv6
IP Address.
The first lines mentioning the Server is the DNS Server which returned
the resolution of the IP address from the queried domain name.

Remark 3.1.5 Notice that the DNS Server mentioned in the above
output is actually a private IP. This is the IP address of the router in
the LAN which acts as the DNS Server cache. However if you type
the domain of a website which you have not visited, or have visited
long ago into nslookup, then the DNS Server mentioned will be the
public address of the DNS Server, which might be your ISP’s DNS
98 3 Networking and SSH

Server, or some other public DNS Server.

You can also use mxtoolbox to check the IP address of your server from
the public internet.

dig

Another tool to lookup the associated IP address of a domain name is


the dig command. It can also reverse lookup the IP address to find the
associated domain name.
1 $ dig google.com
2
3 ; <<>> DiG 9.18.27 <<>> google.com
4 ;; global options: +cmd
5 ;; Got answer:
6 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31350
7 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL:
9
8
9 ;; OPT PSEUDOSECTION:
10 ; EDNS: version: 0, flags:; udp: 4096
11 ; COOKIE: 3e5ff6a57c0fe2b3b5ce91b3666ae859ec9b6471261cecef (good)
12 ;; QUESTION SECTION:
13 ;google.com. IN A
14
15 ;; ANSWER SECTION:
16 google.com. 50 IN A 172.217.163.206
17
18 ;; AUTHORITY SECTION:
19 google.com. 162911 IN NS ns1.google.com.
20 google.com. 162911 IN NS ns3.google.com.
21 google.com. 162911 IN NS ns4.google.com.
22 google.com. 162911 IN NS ns2.google.com.
23
24 ;; ADDITIONAL SECTION:
25 ns2.google.com. 163913 IN A 216.239.34.10
26 ns4.google.com. 163913 IN A 216.239.38.10
27 ns3.google.com. 337398 IN A 216.239.36.10
28 ns1.google.com. 340398 IN A 216.239.32.10
29 ns2.google.com. 163913 IN AAAA 2001:4860:4802:34::a
30 ns4.google.com. 163913 IN AAAA 2001:4860:4802:38::a
31 ns3.google.com. 2787 IN AAAA 2001:4860:4802:36::a
32 ns1.google.com. 158183 IN AAAA 2001:4860:4802:32::a
33
34 ;; Query time: 3 msec
35 ;; SERVER: 192.168.0.1#53(192.168.0.1) (UDP)
36 ;; WHEN: Thu Jun 13 18:18:52 IST 2024
37 ;; MSG SIZE rcvd: 331

And we can then feed the IP address to dig again, to find the domain
name associated with the IP address.
1 $ dig -x 172.217.163.206
2
3 ; <<>> DiG 9.18.27 <<>> -x 172.217.163.206
4 ;; global options: +cmd
5 ;; Got answer:
6 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15781
3.1 Networking 99

7 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL:


9
8
9 ;; OPT PSEUDOSECTION:
10 ; EDNS: version: 0, flags:; udp: 4096
11 ; COOKIE: 78a3c6e4e4b103f501380f62666ae89b3a3b52e8be388fe0 (good)
12 ;; QUESTION SECTION:
13 ;206.163.217.172.in-addr.arpa. IN PTR
14
15 ;; ANSWER SECTION:
16 206.163.217.172.in-addr.arpa. 83966 IN PTR maa05s06-in-f14.1e100.
net.
17
18 ;; AUTHORITY SECTION:
19 217.172.in-addr.arpa. 78207 IN NS ns4.google.com.
20 217.172.in-addr.arpa. 78207 IN NS ns2.google.com.
21 217.172.in-addr.arpa. 78207 IN NS ns1.google.com.
22 217.172.in-addr.arpa. 78207 IN NS ns3.google.com.
23
24 ;; ADDITIONAL SECTION:
25 ns1.google.com. 340332 IN A 216.239.32.10
26 ns2.google.com. 163847 IN A 216.239.34.10
27 ns3.google.com. 337332 IN A 216.239.36.10
28 ns4.google.com. 163847 IN A 216.239.38.10
29 ns1.google.com. 158117 IN AAAA 2001:4860:4802:32::a
30 ns2.google.com. 163847 IN AAAA 2001:4860:4802:34::a
31 ns3.google.com. 2721 IN AAAA 2001:4860:4802:36::a
32 ns4.google.com. 163847 IN AAAA 2001:4860:4802:38::a
33
34 ;; Query time: 3 msec
35 ;; SERVER: 192.168.0.1#53(192.168.0.1) (UDP)
36 ;; WHEN: Thu Jun 13 18:19:58 IST 2024
37 ;; MSG SIZE rcvd: 382

Note that the answer we got after running google.com through dig and
then through dig -x (maa05s06-in-f14.1e100.net) is different from the
original domain name.

This is because the domain name is resolved to an IP address, and then


the IP address is resolved to a different domain name. This is because
the the domain name is actually an alias to the cannonical name.

Remark 3.1.6 The IP address you would get by running dig or


nslookup on google would be different from the IP address you get
when using mxtoolbox. This is because google is a large company and
they have multiple servers which are load balanced. So someone in
India might get a different IP address compared to someone in the
US.

To get the output of dig in a more readable and concise format, you can
use the +short or +noall option.
1 $ dig +noall +answer google.com
2 google.com. 244 IN A 172.217.163.206

netstat

The netstat command is used to print network connections, routing


100 3 Networking and SSH

tables, interface statistics, masquerade connections, and multicast mem-


berships.
It is useful to find what connections are open on your system, and what
ports are being used by which applications.
1 $ netstat | head
2 Active Internet connections (w/o servers)
3 Proto Recv-Q Send-Q Local Address Foreign Address
State
4 tcp 0 0 rex:53584 24.224.186.35.bc.:
https TIME_WAIT
5 tcp 0 0 rex:56602 24.224.186.35.bc.:
https TIME_WAIT
6 tcp 0 0 localhost:5037 localhost:43267
TIME_WAIT
7 tcp 0 0 localhost:5037 localhost:46497
TIME_WAIT
8 tcp 0 0 rex:35198 24.224.186.35.bc.:
https TIME_WAIT
9 tcp 0 0 rex:44302 24.224.186.35.bc.:
https TIME_WAIT
10 tcp 0 0 localhost:5037 localhost:55529
TIME_WAIT
11 tcp 0 0 localhost:5037 localhost:38005
TIME_WAIT
3.2 SSH 101

3.2 SSH

3.2.1 What is SSH?

The Secure Shell (SSH) Protocol is a protocol for secure communication


12: like the internet.
between two computers over a compromised or untrusted network. 12
SSH uses encryption and authentication to secure the communication
between the two computers.

SSH is now the ubiquitous protocol for secure remote access to servers,
and is used by system administrators all over the world to manage their
servers.

SSH lets a user of a computer to log into the computer from another
computer over the network, to execute any command in the terminal that
they have access to.

SSH can also be used to transfer files between two computers using the
scp command.

3.2.2 History

It was initially developed by Tatu Ylönen in 1995 as a replacement for


the insecure Telnet and FTP protocols when he found that someone had
installed a packet sniffer on the server of his university.

There are multiple implementations of the SSH protocol, the most pop-
ular being OpenSSH, developed by the OpenBSD project. This is the
implementation that is used in most of the linux distributions as well.

3.2.3 How does SSH work?

SSH works by using symmetric and asymmetric encryption. The data


packets sent over the network are enctypted, usually using AES symmetric
encryption. This ensures that even if the data packets are intercepted
by a man-in-the-middle attacker, they cannot be read since they are Figure 3.7: Symmetric Encryption
encrypted.

To login into a remote server, all you need to do is provide the user-
name and the IP address or the domain name of the server to the ssh
command.
1 $ ssh username@ipaddress

OR
Figure 3.8: Symmetric Encryption

1 $ ssh username@domainname

SSH allows user to login to a remote server using their username and
password, but this is not encouraged since it lets the user to be vulnerable
to brute-force attacks.

Another way to authenticate is by using public-private key pairs.


102 3 Networking and SSH

3.2.4 Key-based Authentication

One of the most powerful features of SSH is its ability to use public-
private key pairs for authentication. In our course, we emphasize the
importance of this method. Instead of relying on passwords, which can
be vulnerable to brute-force attacks, a pair of cryptographic keys is used.
The public key is stored on the server, while the private key is kept secure
Figure 3.9: Asymmetric Encryption
on your local machine. This ensures a highly secure and convenient way
of accessing remote servers without the need for constantly entering
passwords.

3.2.5 Configuring your SSH keys

For this course, it is a must for you to not only create, but also understand
SSH keys. Let us quickly see how to create a ssh key-pair which can be
used to login into a remote server.

We need to use the ssh-keygen command to create a new public-private


key pair.
1 $ ssh-keygen
2 Generating public/private ed25519 key pair.
3 Enter file in which to save the key (/home/test1/.ssh/id_ed25519):

Here you have to either simply press enter to continue with the default
location, or you can also type a custom location where you want to save
the key. If this is your first time creating keys, it is recommended to use
the default location.

Remark 3.2.1 There are multiple algorithms that can be used to


generate a key pair. The most common ones are RSA, DSA, and
ED25519. The ED25519 algorithm is the new default algorithm used
by OpenSSH since it is shorter yet more secure than RSA. If you have
an outdated version of OpenSSH, you might get the default RSA
algorithm. To change the algorithm, you can use the -t flag along
with the ssh-keygen command.
1 $ ssh-keygen -t rsa

will create a RSA key pair and using


1 $ ssh-keygen -t ed25519

will create a ED25519 key pair.

Next, it will ask you to enter a passphrase. You can enter a passphrase
for added security, or you can simply press enter to continue without a
passphrase. If you do add a passphrase, you will have to always enter
the passphrase whenever you use the key. We can continue without a
passphrase for now by pressing enter.
1 Enter passphrase (empty for no passphrase):
2 Enter same passphrase again:
3
4 Your identification has been saved in /home/username/.ssh/
id_ed25519
5 Your public key has been saved in /home/username/.ssh/id_ed25519.
pub
3.2 SSH 103

6 The key fingerprint is:


7 SHA256:n4ounQd6v9uWXAtMyyq7CdncMsh1Zuac5jesWXrndeA test1@rex
8 The key’s randomart image is:
9 +--[ED25519 256]--+
10 | |
11 | |
12 | |
13 | . |
14 | . S+ . . |
15 | . *.O o=... . |
16 | =o=o*+++ .E .|
17 | o.=Bo*O o. . |
18 | +**XB.+. |
19 +----[SHA256]-----+

Our key pair has been generated. The private key is stored in /home/username/.ssh/id_-
ed25519 and the public key is stored in /home/username/.ssh/id_-
ed25519.pub.

Make sure to never share your private key with anyone. Ideally, you
dont even need to see the private key yourself. You should only share
the public key with the server you want to login to.

3.2.6 Sharing your public key

ssh-copy-id

Finally, to share the public key with the server, there are usually multiple
ways. If the server allows you to login using a password, you can simply
use the ssh-copy-id command. This command will take your username
and password to login to the server, and then copy the public key which
you provide to the server.
1 $ ssh-copy-id -i /key/to/public/key username@ipaddress

Remark 3.2.2 The -i flag is used to specify the path to the public key.
You can drop the .pub from the path as well (making it the path to the
private key), since ssh-copy-id will automatically look for the public
key. However, this flag is not required if you are using the default
location. This is why using the default location is recommended for
beginners. The simplified syntax then becomes
1 $ ssh-copy-id username@ipaddress

The same applies for logging into the server using the ssh command.

manual install

However, most servers do not allow password login at all, since it defeats
the purpose of using a public-private key pair. In such cases, you need to
somehow copy the public key to the server.

If you have physical access to the server, you can simply copy the public
key to the server in the /.ssh/authorized_keys file of the server.
1 $ file ~/someoneskey.pub
2 ~/someoneskey.pub: OpenSSH ED25519 public key
3 $ cat ~/someoneskey.pub >> ~/.ssh/authorized_keys
104 3 Networking and SSH

Remark 3.2.3 Make sure to use the » operator and not the > operator.
The » operator appends the contents of the file to the end of the file,
while the > operator overwrites the file, we do not want that.

System Commands Course

However, in case of our course, you do not have access to the server
as well. To submit your public key, you have to login into the web-
site https://se2001.ds.study.iitm.ac.in/passwordless using your
institute credentials, and then submit your public key in the form pro-
vided.

You can print out the contents of the public key using the cat command
and copy the contents into the form.
1 [test1@rex ~]$ cat .ssh/id_ed25519.pub
2 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDxh5EuvzQkGvsqlMQW3rOkY+wyo+2
d6Y5CSqNGlLs2a test1@rex

You should copy the entire contents of the file, including your username
and hostname.

3.2.7 How to login to a remote server

You can then login into the server using the ssh command.
1 $ ssh [email protected]

OR, if not using the default location of key

1 $ ssh -i /path/to/private/key [email protected].


in

If successful, you will be logged into the server and the prompt will
change to the server’s prompt.
1 [test1@rex ~]$ ssh [email protected]
2 Last login: Mon Jun 3 07:43:22 2024 from 192.168.2.3
3 29f1001234@se2001:~$

Notice that the prompt has changed from test1@rex which was the
prompt of your local machine, to rollnumber@se2001 which is the
prompt of the server.

3.2.8 Call an exorcist, there’s a daemon in my computer

What is sshd? It is a daemon.

Definition 3.2.1 (Daemon) In multitasking computer operating sys-


tems, a daemon is a computer program that runs as a background
process, rather than being under the direct control of an interactive
user.
3.2 SSH 105

There are many daemons running in your computer. You can use
systemctl status to see the loaded and active daemons in your com-
puter.
1 $ systemctl status
2 * rex
3 State: running
4 Units: 419 loaded (incl. loaded aliases)
5 Jobs: 0 queued
6 Failed: 0 units
7 Since: Thu 2024-06-13 12:55:42 IST; 7h ago
8 systemd: 255.6-1-arch
9 CGroup: /
10 |-init.scope
11 | ‘-1 /usr/lib/systemd/systemd --switched-root --system
--deserialize=43
12 |-system.slice
13 | |-NetworkManager.service
14 | | ‘-547 /usr/bin/NetworkManager --no-daemon
15 | |-adb.service
16 | | ‘-558 adb -L tcp:5037 fork-server server --reply-fd
4
17 | |-avahi-daemon.service
18 | | |-550 "avahi-daemon: running [rex.local]"
19 | | ‘-557 "avahi-daemon: chroot helper"
20 | |-cronie.service
21 | | ‘-621 /usr/sbin/crond -n
22 | |-cups.service
23 | | ‘-629 /usr/bin/cupsd -l
24 | |-dbus-broker.service
25 | | |-545 /usr/bin/dbus-broker-launch --scope system --
audit

Here you can see some of the important daemons running, such as
NetworkManager which is used to manage the network connections,
cronie which is used to run scheduled tasks, cups which is used to
manage printers, etc.

sshd

sshd is the daemon that runs on the server and listens to any incoming
SSH connections. It is the daemon that lets you login into the server using
the SSH protocol.

Your own system might not be running the sshd daemon, since you are
not running a server. However, you can check if the sshd daemon is
running using the systemctl command.
1 $ systemctl status sshd
2 * sshd.service - OpenSSH Daemon
3 Loaded: loaded (/usr/lib/systemd/system/sshd.service;
disabled; preset: disabled)
4 Active: inactive (dead)

Here you can see that the sshd daemon is currently inactive. This is
because I am not running a server and don’t usually login remotely to my
system. However, the output of the same command would be something
like as shown below if it is enabled on your system.
1 $ systemctl status sshd
106 3 Networking and SSH

2 * sshd.service - OpenSSH Daemon


3 Loaded: loaded (/usr/lib/systemd/system/sshd.service;
disabled; preset: disabled)
4 Active: active (running) since Thu 2024-06-13 19:48:44 IST;
12min ago
5 Main PID: 3583344 (sshd)
6 Tasks: 1 (limit: 9287)
7 Memory: 2.1M (peak: 2.3M)
8 CPU: 8ms
9 CGroup: /system.slice/sshd.service
10 ‘-3583344 "sshd: /usr/bin/sshd -D [listener] 0 of
10-100 startups"
11
12 Jun 13 19:48:44 rex systemd[1]: Started OpenSSH Daemon.
13 Jun 13 19:48:45 rex sshd[3583344]: Server listening on 0.0.0.0
port 22.
14 Jun 13 19:48:45 rex sshd[3583344]: Server listening on :: port 22.

If we run the same command on the server, we can see that it is running.
However, we wont be able to read the logs of the server, since we are not
authorized.
1 $ ssh [email protected]
2 username@se2001:~$ systemctl status sshd
3 * ssh.service - OpenBSD Secure Shell server
4 Loaded: loaded (/lib/systemd/system/ssh.service; enabled;
vendor preset: enabled)
5 Active: active (running) since Thu 2024-06-13 12:32:47 UTC; 1
h 57min ago
6 Docs: man:sshd(8)
7 man:sshd_config(5)
8 Process: 732 ExecStartPre=/usr/sbin/sshd -t (code=exited,
status=0/SUCCESS)
9 Main PID: 745 (sshd)
10 Tasks: 1 (limit: 4557)
11 Memory: 22.0M
12 CPU: 8.769s
13 CGroup: /system.slice/ssh.service
14 ‘-745 "sshd: /usr/sbin/sshd -D [listener] 0 of 10-100
startups"
15
16 Warning: some journal files were not opened due to insufficient
permissions.

Remark 3.2.4 Notice that there are some differences in the output
when run from my local system and from the system commands
server. Such as, the name of the service is ssh on the server, while
it is sshd on my local system. Also the full name is OpenBSD Secure
Shell server on the server, while it is OpenSSH Daemon on my local
system. The path of the service file is also different. This is because
the server is running a ubuntu distribution whereas my local system
runs an arch distribution. They have different packages for ssh, and
hence the differences.

I have set the LC_ALL environment variable to the locale C while generating the above
outputs to prevent latex errors. Ideally if you run the command, you will see a prettier
unicode output.

You might also like