A file system is a structured method of storing and managing data—including files, directories, and metadata—on your machine. Think of it like a library. If thousands of books were scattered around, finding one would be hard. But in an organized structure, like labeled shelves, locating a book becomes easy.
This article aims to simplify the complexities of Linux file systems, guiding beginners through their layers, characteristics, and implementations. By shedding light on these nuances, we empower users to make informed choices in navigating the dynamic landscape of Linux operating systems.
What is the Linux File System
A Linux file system is the set of rules and data structures that controls how, where, and when data is stored on and retrieved from storage devices. It manages data systematically on disk drives or partitions, and each partition has its own file system. Where Windows uses drive letters such as C: and D:, Linux attaches each file system at a mount point, so everything appears in a single tree under the root directory /. Linux also treats everything as a file, including devices and applications.
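To make the "everything is a file" idea concrete, here is a minimal sketch that classifies a few paths with the shell's built-in file tests. The function name `describe_path` is hypothetical and exists only for this illustration; the paths are ones that are normally present on a Linux system.

```shell
# Classify a path using the shell's built-in file tests.
# describe_path is a hypothetical helper written for this example.
describe_path() {
    if   [ -d "$1" ]; then echo "directory"
    elif [ -c "$1" ]; then echo "character device"
    elif [ -b "$1" ]; then echo "block device"
    elif [ -f "$1" ]; then echo "regular file"
    else                   echo "other or missing"
    fi
}

describe_path /            # prints: directory
describe_path /dev/null    # prints: character device
describe_path /etc/passwd  # usually prints: regular file
```

The device `/dev/null` is reached through an ordinary path under `/`, just like a directory or a text file, which is what lets the same tools work on all of them.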
In this article, we focus on file systems for hard disks on a Linux OS and discuss which type of file system suits which use.
Linux File System Structure
The architecture of a file system comprises three layers mentioned below.
1. Logical File System:
The Logical File System acts as the interface between the user applications and the file system itself. It facilitates essential operations such as opening, reading, and closing files. Essentially, it serves as the user-friendly front-end, ensuring that applications can interact with the file system in a way that aligns with user expectations.
2. Virtual File System:
The Virtual File System (VFS) is a crucial layer that enables the concurrent operation of multiple instances of physical file systems. It provides a standardized interface, allowing different file systems to coexist and operate simultaneously. This layer abstracts the underlying complexities, ensuring compatibility and cohesion between various file system implementations.
3. Physical File System:
The Physical File System is responsible for managing and storing the physical blocks of data on the disk. It handles the low-level details of storing and retrieving data, interacting directly with the hardware components. This layer ensures the efficient allocation and utilization of physical storage resources, contributing to the overall performance and reliability of the file system.
Together, these layers form a cohesive architecture, orchestrating the organized and efficient handling of data in the Linux operating system.
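One way to see the Virtual File System layer at work is to ask the kernel which file system types it currently has registered. The sketch below reads `/proc/filesystems`, where entries marked `nodev` are virtual file systems not backed by a block device. The two function names are hypothetical helpers for this example, and the output depends on which kernel modules are loaded.

```shell
# List every file system type registered with the running kernel.
# The type name is always the last field on each line.
supported_filesystems() {
    awk '{ print $NF }' /proc/filesystems
}

# List only the types that need a block device (no "nodev" marker).
disk_filesystems() {
    awk '$1 != "nodev" { print $1 }' /proc/filesystems
}

echo "All registered file system types:"
supported_filesystems
echo "Types backed by a block device:"
disk_filesystems
```

All of these types sit behind the same open, read, and write calls, which is why a program does not need to know whether a file lives on ext4, XFS, or a virtual file system such as proc.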
Architecture of a File System (figure)
Characteristics of a File System
A file system defines the rules and structures for how data is organized, stored, accessed, and managed on a storage device.
- Space Management: How data is laid out on the storage device, including how blocks are allocated and how fragmentation is handled.
- Filename: A file system may restrict file names, for example in length, in the special characters allowed, and in case sensitivity.
- Directory: The directories/folders may store files in a linear or hierarchical manner while maintaining an index table of all the files contained in that directory or subdirectory.
- Metadata: For each file stored, the file system stores various information about that file's existence such as its data length, its access permissions, device type, modified date-time, and other attributes. This is called metadata.
- Utilities: File systems provide features for initializing, deleting, renaming, moving, copying, backing up, recovering, and controlling access to files and folders.
- Design: Due to their implementations, file systems have limitations on the amount of data they can store.
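Several of these characteristics can be inspected from the command line. The sketch below assumes GNU `stat` and the `getconf` utility are installed; it creates a throwaway file, asks for the filename length limit of the file system holding it, and prints some of the metadata stored for the file. The variable names are chosen for this example only.

```shell
# Work in a temporary directory so nothing else is touched.
tmp_dir=$(mktemp -d)
sample="$tmp_dir/sample.txt"
printf 'hello\n' > "$sample"
chmod 640 "$sample"

# Filename characteristic: longest name (in bytes) allowed on this file system.
name_max=$(getconf NAME_MAX "$tmp_dir")
echo "Longest filename allowed here: $name_max bytes"

# Metadata characteristic: size, permissions, owner, and modification time.
stat -c 'size=%s bytes  mode=%A  owner=%U  modified=%y' "$sample"

# Keep two values for later inspection, then clean up.
file_size=$(stat -c '%s' "$sample")
file_mode=$(stat -c '%a' "$sample")
rm -rf "$tmp_dir"
```

On most Linux file systems the reported filename limit is 255 bytes, which matches the comparison table later in this article.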
Some important terms:
Understanding these key terms is essential before exploring various Linux file system implementations for disk storage.
1) Journaling:
Journaling file systems keep a log, called the journal, that tracks changes that have been made to a file but not yet permanently committed to the disk, so that after a system failure the lost changes can be replayed. Journaling works like a checklist:
- Log changes in the journal.
- Apply changes to the disk.
- Mark them as complete.
Journaling can be configured in three different modes, each offering a trade-off between reliability and performance. The Journal mode is the most reliable as it logs both file data and metadata, ensuring the highest level of data integrity. However, it is also the slowest mode due to the extensive logging process. The Ordered mode, on the other hand, logs only the metadata, with the file data being written before the metadata. This provides a balanced approach between data safety and system performance. Lastly, the Writeback mode logs only metadata without enforcing any order between file data and metadata writes. While it is the fastest journaling mode, it is also the least safe, as it increases the risk of data corruption in the event of a crash.
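The checklist above can be sketched as a toy simulation. This is not how any real file system is implemented: the journal is a plain text file, the "disk" is a directory of small files, and every name below is hypothetical. It only illustrates why a change that was logged but never marked complete can be replayed after a crash.

```shell
# Toy write-ahead journal. All names are hypothetical, for illustration only.
work_dir=$(mktemp -d)
journal="$work_dir/journal.log"
disk="$work_dir/disk"
mkdir -p "$disk"
: > "$journal"

# Step 1: log the intended change (transaction id, block name, new contents).
journal_log() { printf 'BEGIN %s %s %s\n' "$1" "$2" "$3" >> "$journal"; }

# Step 2: apply the change to the "disk". Overwriting a block is idempotent,
# so replaying the same change twice is harmless.
apply_change() { printf '%s\n' "$2" > "$disk/$1"; }

# Step 3: mark the transaction as complete.
journal_commit() { printf 'COMMIT %s\n' "$1" >> "$journal"; }

# After a crash: replay every logged change that has no COMMIT record.
recover() {
    grep '^BEGIN ' "$journal" > "$work_dir/pending"
    while read -r tag txid block data; do
        if ! grep -q "^COMMIT $txid\$" "$journal"; then
            apply_change "$block" "$data"
            journal_commit "$txid"
        fi
    done < "$work_dir/pending"
}

# Transaction 1 completes all three steps.
journal_log 1 block_a "first write"
apply_change block_a "first write"
journal_commit 1

# Transaction 2 is logged, then the "system crashes" before it is applied.
journal_log 2 block_b "second write"

recover
replayed=$(cat "$disk/block_b")
commits=$(grep -c '^COMMIT ' "$journal")
echo "block_b after recovery: $replayed"   # prints: block_b after recovery: second write
rm -rf "$work_dir"
```

Because this toy logs the full contents of each change, it behaves most like the Journal mode described above; the Ordered and Writeback modes log less and therefore have less to replay.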
2) Versioning:
Versioning file systems keep earlier versions of a file: each time the file is saved, or at regular intervals such as every minute or hour, the previous copy is retained so that it can serve as a backup.
3) Inode:
An inode (index node) is the data structure that represents a file or directory. It records attributes such as size, permissions, ownership, and the location of the data on disk.
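A short experiment shows that a file name is only a label pointing at an inode. The sketch below assumes GNU `stat`; it creates a file, adds a hard link to it, and compares the inode numbers. The file and variable names are chosen for this example.

```shell
tmp_dir=$(mktemp -d)
printf 'data\n' > "$tmp_dir/original.txt"

# A hard link is a second directory entry pointing at the same inode.
ln "$tmp_dir/original.txt" "$tmp_dir/hardlink.txt"

inode_original=$(stat -c '%i' "$tmp_dir/original.txt")
inode_link=$(stat -c '%i' "$tmp_dir/hardlink.txt")
link_count=$(stat -c '%h' "$tmp_dir/original.txt")

echo "original inode: $inode_original"
echo "hardlink inode: $inode_link"
echo "link count:     $link_count"

# ls -i prints the inode number next to each name.
ls -i "$tmp_dir"
rm -rf "$tmp_dir"
```

Both names report the same inode number and a link count of 2, because the size, permissions, and ownership live in the inode rather than in the directory entry.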
Now we come to the part where we discuss the various implementations of file systems in Linux for disk storage devices.
Linux File Systems:
Here are some Linux file systems:
Note: Cluster and distributed file systems will not be included for simplicity.
Types of File System in Linux (figure)
1) ext (Extended File System):
Implemented in 1992, it is the first file system specifically designed for Linux. It is the first member of the ext family of file systems.
2) ext2:
The second ext was developed in 1993. It is a non-journaling file system, which keeps the number of writes low and made it a common choice for flash drives and SSDs. It added separate timestamps for access, inode modification, and data modification, which ext lacked. Because it has no journal, it needs a lengthy consistency check at boot after an improper shutdown.
3) Xiafs:
Also developed in 1993, Xiafs was developed as an alternative but lacked the power and functionality of ext2. Due to limited features and scalability, it is no longer in use.
4) ext3:
Introduced in 1999, ext3 brought in journaling capabilities, offering improved reliability. Unlike ext2, it avoided long boot-time checks after an improper shutdown. It also supported online file system growth and HTree indexing, making it efficient for large directories.
5) JFS (Journaled File System):
First created by IBM in 1990, the original JFS was released as open source and ported to Linux in the 1990s. JFS performs well under varied loads, but it is not commonly used anymore following the release of ext4 in 2006, which gives better performance.
6) ReiserFS:
It is a journaling file system developed in 2001. Despite its early issues, it offers tail packing, a scheme that reduces internal fragmentation. It uses a B+ tree, which gives sub-linear time for directory lookups and updates. It was the default file system in SUSE Linux from version 6.4 until the switch to ext3 in 2006 for version 10.2.
7) XFS:
XFS is a 64-bit journaling file system that was ported to Linux in 2001. It is now the default file system for many Linux distributions. It provides features like online defragmentation, sparse files, variable block sizes, and very large capacity, and it relies on a volume manager such as LVM for snapshots. It also excels at parallel I/O operations.
8) SquashFS:
Developed in 2002, this file system is compressed and read-only. It is used mainly where low overhead is needed, such as embedded systems.
9) Reiser4:
It is the successor to ReiserFS and was developed in 2004. However, it is not widely adopted or supported on many Linux distributions.
10) ext4:
The fourth ext, developed in 2006, is a journaling file system. It is backward compatible with ext3 and ext2 and provides several other features, including persistent pre-allocation, an unlimited number of subdirectories, metadata checksumming, and large file sizes. ext4 is the default file system for many Linux distributions and can also be accessed from Windows and macOS with third-party tools.
11) btrfs (Better/Butter/B-tree FS):
It was developed in 2007. It provides many features such as snapshotting, drive pooling, data scrubbing, self-healing and online defragmentation. It is the default file system for Fedora Workstation.
12) bcachefs:
This is a copy-on-write file system that was first announced in 2015 with the goal of performing better than btrfs and ext4. Its features include full file system encryption, native compression, snapshots, and 64-bit checksumming.
13) Others:
Linux also supports file systems from other operating systems, such as NTFS and exFAT, but these do not support standard Unix permission settings. They are mostly used for interoperability with other operating systems.
File Systems Comparison:
Please note that there are more criteria than the ones listed in the table. This table is supposed to give you an idea of how file systems have evolved.
Parameters | ext | ext2 | Xiafs | ext3 | JFS | ReiserFS | XFS | Reiser4 | ext4 | btrfs |
---|---|---|---|---|---|---|---|---|---|---|
Max. filename length (bytes) | 255 | 255 | 248 | 255 | 255 | 4032 (255 characters) | 255 | 3976 | 255 | 255 |
Allowable characters in directory entries | Any byte except NUL | Any byte except NUL, / | Any byte except NUL | Any byte except NUL, / | Any Unicode except NUL | Any byte except NUL, / | Any byte except NUL | Any byte except NUL, / | Any byte except NUL, / | Any byte except NUL, / |
Max. pathname length | Undefined | Undefined | Undefined | Undefined | Undefined | Undefined | Undefined | Undefined | Undefined | Undefined |
Max. file size | 2 GB | 16 GB - 2 TB | 64 MB | 16 GB - 2 TB | 4 PB | 8 TB | 8 EB | 8 TB (on x86) | 16 GB - 16 TB | 16 EB |
Max. volume size | 2 GB | 2 TB - 32 TB | 2 GB | 2 TB - 32 TB | 32 PB | 16 TB | 8 EB | - | 1 EB | 16 EB |
Max. no. of files | - | - | - | - | - | - | - | - | 2^32 | 2^64 |
Metadata-only journaling | No | No | No | Yes | Yes | Yes | Yes | No | Yes | No |
Compression | No | No | No | No | No | No | No | Yes | No | Yes |
Block sub-allocation | No | No | No | No | Yes | Yes | No | Yes | No | Yes |
Online grow | No | No | - | Yes | No | Yes | Yes | Yes | Yes | Yes |
Encryption | No | No | No | No | No | No | No | Yes | Yes (experimental) | No |
Checksum | No | No | No | No | No | No | Partial | No | Partial | Yes |
Observations:
We see that XFS, ext4 and btrfs offer the most capabilities of the file systems compared. In fact, btrfs looks as if it's almost the best. Despite that, the ext family of file systems has been the default for most Linux distributions for a long time. So, what is it that made the developers choose ext4 as the default rather than btrfs or XFS? Since ext4 is so important for this discussion, let's describe it a bit more.
ext4 in Linux File System
Ext4 was designed to be backward compatible with ext3 and ext2, its previous generations. It's better than the previous generations in the following ways:
- It provides a large file system as described in the table above.
- Uses extents, which improve large-file performance and reduce fragmentation.
- Provides persistent pre-allocation, which guarantees that space is reserved and is likely to be contiguous on disk.
- Delayed allocation improves performance and reduces fragmentation by allocating larger amounts of data at a time.
- Uses HTree indices to allow an unlimited number of subdirectories.
- Performs journal checksumming, which allows the file system to realize that some of its entries are invalid or out of order after a crash.
- Supports time-of-creation timestamps and finer, nanosecond-granularity timestamps.
- Transparent encryption.
- Allows inode tables to be cleaned in the background, which speeds up initialization. This process is called lazy initialization.
- Enables write barriers by default, which ensure that file system metadata is correctly written and ordered on disk, even when write caches lose power.
Some features were added later in ext4's development, such as metadata checksumming, first-class quota support, and large allocation blocks.
However, ext4 has some limitations. It does not guarantee the integrity of your data: if data is corrupted while already on disk, ext4 has no way of detecting or repairing the corruption. ext4 also does not support secure deletion, which is meant to overwrite a file's contents when the file is deleted, so sensitive data can remain on disk, including in the file system journal.
XFS performs very well on large file systems and under high degrees of concurrency. XFS is stable, yet there is no clear dividing line that would make you choose it over ext4, since both perform about the same, unless you need a file system that directly solves a limitation of ext4, such as capacity above 50 TiB.
Btrfs on the other hand, despite offering features like multiple device management, per-block checksumming, asynchronous replication and inline compression, does not perform the best in many common use cases as compared to ext4 and XFS. Several of its features can be buggy and result in reduced performance and data loss.
A Hands-On Example with the Linux File System
Suppose our use case is to set up a server that will store and serve large multimedia files (video and audio). In that case we have to prioritize speed and efficient use of storage space.
Given this requirement, the XFS file system is a better choice, because XFS is optimized for large files and handles high volumes of data transfer well, which makes it a good fit for media servers.
Follow these steps to use it:
Step 1: Install the XFS utilities package on the Linux system (the command below is for Debian- and Ubuntu-based distributions).
sudo apt-get install xfsprogs
Step 2: Create a partition to format as XFS.
For example: `/dev/sda1`
This can be done using a tool like `fdisk`.
Step 3: Format the partition as XFS. This erases any data already on the partition.
sudo mkfs.xfs -f /dev/sda1
We have formatted the partition with the XFS file system. The -f option forces the format even if the partition already contains a file system.
Step 4: Mount the XFS partition on a directory of our choice.
sudo mkdir -p /mnt/jayesh_xfs_partition
sudo mount /dev/sda1 /mnt/jayesh_xfs_partition
We have mounted the XFS partition on the directory `/mnt/jayesh_xfs_partition` (you can create your own directory instead).
Step 5: To verify the mount.
df -h
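The output of `df -h` lists every mounted file system, so the new mount has to be picked out by eye. As an alternative, here is a minimal sketch that looks up one mount point in `/proc/mounts` and reports its file system type. The function name `fs_type_of_mount` is a hypothetical helper, and the mount point is the example directory used in the steps above.

```shell
# Print the file system type mounted at exactly the given mount point.
# Prints nothing if the path is not a mount point.
# /proc/mounts columns: device, mount point, type, options, dump, pass.
fs_type_of_mount() {
    awk -v target="$1" '$2 == target { fs = $3 } END { if (fs != "") print fs }' /proc/mounts
}

mount_point=/mnt/jayesh_xfs_partition
fs_type=$(fs_type_of_mount "$mount_point")

if [ "$fs_type" = "xfs" ]; then
    echo "$mount_point is mounted as XFS"
elif [ -n "$fs_type" ]; then
    echo "$mount_point is mounted, but as $fs_type"
else
    echo "$mount_point is not a mount point"
fi
```

If several file systems have been mounted on the same directory, the function reports the most recent one, since that is the one currently visible.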
Conclusion:
In this article we discussed the Linux file system: its layers, its characteristics, and its architecture. We explored various options, from ext to contemporary choices like ext4, XFS, and btrfs. The comparison table shows that XFS, ext4, and btrfs offer the most capabilities, with ext4 standing out for its backward compatibility and design enhancements. ext4 is a sound default for general users unless specific needs dictate an alternative, such as XFS for serving large media files.