Linux File System Types Explained, Which One Should You Use
Linux supports a variety of file systems such as ext4, ZFS, XFS, Btrfs, Reiser4, and so on. Different types of file systems solve different kinds of problems and their usage is application specific.
Choosing a Linux file system that is appropriate for your application is an important decision. This tutorial describes some of the major Linux file systems and provides recommendations on the right file system to suit your application.
What is a Linux file system
Almost every bit of data and programming that is needed to boot a Linux system and keep it working is saved in the file system. This includes the operating system itself, compilers, application programs, shared libraries, configuration files, log files, media mount points, and so on.
File systems operate in the background. Like the rest of an operating system’s kernel, they’re largely invisible in everyday use.
A Linux file system is the built-in layer of the operating system that manages data on storage. It controls how data is stored and retrieved, and it keeps track of each file’s name, size, timestamps, and much more information about the file.
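To make the idea of per-file metadata concrete, here is a minimal Python sketch that asks the file system what it knows about a file. The helper names `describe` and `demo` are made up for this illustration, and the sketch reports the modification time because a creation time is not always available through the classic `stat` interface on Linux.

```python
import os
import tempfile
import time


def describe(path):
    """Return a few pieces of metadata the file system keeps for `path`."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": st.st_size,
        "modified": time.ctime(st.st_mtime),   # last modification time
        "inode": st.st_ino,                    # the file's inode number
        "mode": oct(st.st_mode & 0o777),       # permission bits
    }


def demo():
    """Create a small temporary file, describe it, then clean up."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("hello")
        path = f.name
    try:
        return describe(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

None of this information lives inside the file’s contents; it is all bookkeeping done by the file system.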
ext4 file system
In 1992 the Extended File System, or ext, was launched specifically for the Linux operating system. It has its roots in the Minix operating system. In 1993 an update called Extended File System 2, or ext2, was released, and for many years it was the default file system in many Linux distros. In 2001 ext2 was upgraded to ext3, which introduced journaling to protect against corruption in the event of crashes or power failures.
Ext4 (Fourth Extended Filesystem) was introduced in 2008 and has been the default file system on many Linux distributions since around 2010. It was designed as a progressive revision of the ext3 file system and overcomes a number of limitations in ext3. It has significant advantages over its predecessor such as improved design, better performance, reliability, and new features.
With the standard 4 KiB block size, ext4 can support individual files of up to 16 TiB and file systems of up to 1 EiB. It also removes the ext3 limit of 32,000 sub-directories per directory. Further, ext4 is backward compatible with ext3 and ext2, allowing file systems in these older formats to be mounted with the ext4 driver.
There is a reason ext4 is the default choice for most Linux distributions. It’s tried, tested, stable, performs great, and is widely supported. If you are looking for stability, ext4 is the best Linux filesystem for you.
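Despite all of its features, ext4 does not support transparent compression or data deduplication. File-level encryption is available through the kernel’s fscrypt support (Linux 4.1 and later), but it has to be enabled explicitly.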
XFS file system
XFS is a highly scalable file system that was developed by Silicon Graphics and first deployed in the Unix-based IRIX operating system in 1994. It is a journaling file system and, as such, keeps track of changes in a log before committing the changes to the main file system. The advantage is guaranteed consistency of the file system and expedited recovery in the event of power failures or system crashes.
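The journaling idea can be sketched in a few lines of Python. This is a toy model written for this article, not how XFS is implemented: the class `JournaledStore` and its `crash_before_apply` flag are hypothetical, and real XFS journals metadata changes rather than arbitrary key/value pairs.

```python
class JournaledStore:
    """Toy key/value "disk" protected by a write-ahead journal."""

    def __init__(self):
        self.data = {}      # stands in for the main file system
        self.journal = []   # the log; assumed to survive a crash

    def write(self, key, value, crash_before_apply=False):
        # Step 1: record the intended change in the journal first.
        self.journal.append((key, value))
        if crash_before_apply:
            return          # simulate power loss before the main update
        # Step 2: apply the change, then retire the journal entry.
        self.data[key] = value
        self.journal.pop()

    def recover(self):
        # After a crash, replay whatever is still recorded in the journal.
        for key, value in self.journal:
            self.data[key] = value
        self.journal.clear()


store = JournaledStore()
store.write("a", 1)
store.write("b", 2, crash_before_apply=True)   # interrupted write
store.recover()                                # replay brings "b" back
print(store.data)
```

Recovery only has to look at the journal, not scan the whole disk, which is why journaled file systems come back quickly after a crash.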
Originally XFS was created to support extremely large filesystems with sizes of up to 16 exabytes and file sizes of up to 8 exabytes. It has a long history of running on large servers and storage arrays.
One notable feature of XFS, in its original IRIX implementation, is Guaranteed Rate I/O (GRIO). This allows applications to reserve bandwidth. The file system calculates the available performance and adjusts its operation according to the existing reservations.
XFS has a reputation for doing well in environments that require high performance and scalability, and it is routinely measured as one of the highest performing file systems on large systems with enterprise workloads.
Today XFS is supported by most Linux distributions and has now become the default filesystem on Red Hat Enterprise Linux, Oracle Linux, CentOS and many other distributions.
Best use cases for XFS file system
So, do you have a large server? Do you have large storage requirements or have a local, slow SATA drive?
If both your server and your storage device are large, and there is no need to reduce the file system size later (XFS has traditionally supported growing a file system but not shrinking it), XFS is likely to be the best choice. XFS is a great file system that scales well for large servers. But even with smaller storage arrays, XFS performs very well when the average file size is large, for example, hundreds of megabytes.
Btrfs file system
Btrfs is the next generation general purpose Linux file system that offers unique features like advanced integrated device management, scalability and reliability. It is licensed under the GPL and open for contribution from anyone. Different names are used for the file system, including “Butter FS”, “B-tree FS”, and “Better FS”.
Btrfs development began at Oracle in 2007. It was merged into the mainline Linux kernel in the beginning of 2009 and debuted in the Linux 2.6.29 release.
Btrfs is not a successor to the default ext4 file system used in most Linux distributions, but it offers better scalability and reliability. Btrfs is a copy-on-write (CoW) file system intended to address various weaknesses in current Linux file systems. It focuses primarily on fault tolerance, self-healing properties, and easy administration.
Btrfs can support a partition of up to 16 exbibytes (EiB) and a file of the same size. If you are confused by the numbers, all you need to know is that Btrfs can hold up to sixteen times as much data as ext4.
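For readers who do want the numbers, the short Python calculation below spells them out. It only uses the two figures quoted above (16 EiB for Btrfs and the factor of sixteen); the ext4 limit it prints is what those two figures imply, not an independently measured value.

```python
TIB = 2**40   # bytes in one tebibyte
EIB = 2**60   # bytes in one exbibyte

btrfs_max_bytes = 16 * EIB

# "Sixteen times" ext4 implies this ext4 volume limit.
implied_ext4_max_bytes = btrfs_max_bytes // 16

print("Btrfs limit in bytes:", btrfs_max_bytes)
print("Btrfs limit in TiB:  ", btrfs_max_bytes // TIB)
print("Implied ext4 limit in EiB:", implied_ext4_max_bytes // EIB)
```

Sixteen exbibytes is exactly 2 to the power of 64 bytes, the largest size a 64-bit byte offset can address.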
How does Copy-on-Write work and why would you want it
On a traditional file system, modifying a file means reading the data, changing it, and then writing it back to the same place. A copy-on-write file system reads the data, changes it, and writes it to a new location. This prevents the loss of data during the read-modify-write transaction because the old data stays intact on disk until the new copy is complete.
Since you don’t “repoint” until the new block is completely written out, if you lose power or crash in the middle of a write, you end up with either the old block or the new block, but not a half-written corrupted block. So you don’t need to fsck filesystems on startup and you lower your risk of data corruption.
You can snapshot the filesystem at any point, creating a snapshot entry in the metadata with the current set of pointers. This protects old blocks from being garbage collected later on and allows the filesystem to present a volume as it was during the snapshot. In other words, you have instant rollback capabilities. You can even clone that volume to make it a writable volume based on the snapshot.
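The write, repoint, and snapshot steps described above can be sketched as a toy in-memory model. The class `CowVolume` and its method names are invented for this illustration and do not correspond to any Btrfs API; a real file system works on trees of blocks rather than a flat dictionary.

```python
class CowVolume:
    """Toy copy-on-write volume: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}      # block id -> data
        self.pointers = {}    # file name -> block id (the live view)
        self.snapshots = {}   # snapshot label -> saved copy of pointers
        self._next_id = 0

    def write(self, name, data):
        # Write the new data to a fresh block first...
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        # ...and only then repoint the file to it. The old block is untouched.
        self.pointers[name] = block_id

    def read(self, name, snapshot=None):
        table = self.pointers if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]

    def snapshot(self, label):
        # A snapshot is just a saved set of pointers; no data is copied.
        self.snapshots[label] = dict(self.pointers)

    def rollback(self, label):
        self.pointers = dict(self.snapshots[label])


vol = CowVolume()
vol.write("report.txt", b"version 1")
vol.snapshot("before-edit")
vol.write("report.txt", b"version 2")
print(vol.read("report.txt"))                          # live view
print(vol.read("report.txt", snapshot="before-edit"))  # snapshot view
```

Taking the snapshot is cheap because it copies pointers only, which is why snapshots on a CoW file system are close to instant.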
Your other choice for a copy-on-write file system is ZFS on Linux, which may be more stable, but requires a few more steps to install on typical Linux distributions.
Btrfs features include:
- Copy on Write (CoW) and snapshotting – Make incremental backups painless even from a “hot” filesystem or virtual machine (VM).
- Checksums – Btrfs stores checksums for data and metadata blocks and uses them to detect errors, and to repair them when a good redundant copy exists.
- Compression – Files may be compressed and decompressed on the fly, which can speed up reads when storage is the bottleneck.
- Auto defragmentation – The filesystems are tuned by a background thread while they are in use.
- Subvolumes – Filesystems can share a single pool of space instead of being put into their own partitions.
- RAID – Btrfs has its own RAID implementation, so LVM or mdadm is not required in order to have RAID. Currently RAID 0, 1 and 10 are supported. RAID 5 and 6 are considered unstable.
- Partitions are optional – While Btrfs can work with partitions, it has the potential to use raw devices (/dev/<device>) directly.
- Data deduplication – There is limited data deduplication support; however, deduplication is expected to eventually become a standard feature in Btrfs. It enables Btrfs to save space by storing identical data only once and sharing it between files.
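As an illustration of the deduplication idea in the last item, the sketch below stores each distinct block once and lets files share it. It is a simplified model under stated assumptions: `DedupStore`, the tiny 4-byte block size, and the use of SHA-256 as the block fingerprint are all choices made for this example, not a description of how Btrfs does it.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps each distinct block only once."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}   # digest -> block data, stored once
        self.files = {}    # file name -> ordered list of digests

    def put(self, name, data):
        digests = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Only store the block if an identical one is not already present.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


store = DedupStore()
store.put("a.bin", b"AAAABBBB")
store.put("b.bin", b"AAAACCCC")   # shares its first block with a.bin
print(store.stored_bytes())        # fewer than the 16 bytes written
```

Both files read back unchanged, but the shared block occupies space only once.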
Btrfs is designed to need little administration once it has been set up; in normal operation you should not have to run an fsck on it. When errors or inconsistencies arise, it is meant to detect them and, where a redundant copy exists, repair them on its own.
While it is true that Btrfs is still under active development and some of its features are still considered experimental, the time when Btrfs will become the default filesystem for Linux systems is getting closer. Some Linux distributions have already begun to switch to it with their current releases.
If you aren’t afraid of having to deal with a somewhat less mature ecosystem, though, Btrfs may be the better option for you.
ZFS file system
ZFS (Zettabyte File System) has remained one of the most technically advanced and feature-complete filesystems since it appeared in October 2005. It combines a local filesystem (like ext4) and a logical volume manager (like LVM), and was created by Sun Microsystems. ZFS was published under an open source license (the CDDL); after Oracle bought Sun Microsystems, Oracle stopped publishing its ZFS source code, and open development continues in the OpenZFS project.
You can think of ZFS as a volume manager and a RAID array in one: extra disks can be added to your ZFS pool, and the extra space becomes available to your file system all at once. In addition, ZFS comes with some other features that traditional RAID doesn’t have.
ZFS depends heavily on memory, and 8 GB is a commonly suggested starting point. In practice, use as much as you can get for your hardware and budget.
ZFS is commonly used by data hoarders, NAS users, and other geeks who prefer to put their trust in a redundant storage system of their own rather than the cloud. It’s a great file system to use for managing multiple disks of data and rivals some of the greatest RAID setups.
ZFS is similar to other storage management approaches, but in some ways, it’s radically different. ZFS does not normally use the Linux Logical Volume Manager (LVM) or disk partitions, and it’s usually convenient to delete partitions and LVM structures prior to preparing media for a zpool.
The zpool is the analog of an LVM volume group. A zpool spans one or more storage devices, and its members may be of several types. The basic storage elements are single devices, mirrors, and raidz groups. All of these storage elements are called vdevs.
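The relationship between a zpool and its vdevs can be modelled roughly as follows. This is a sketch for intuition only: the `Vdev` and `Zpool` classes are hypothetical, only three vdev kinds are modelled, and the capacity rules are the usual rules of thumb that ignore metadata overhead and padding.

```python
class Vdev:
    """One storage element of a pool: a single disk, a mirror, or raidz1."""

    def __init__(self, kind, disks_tb):
        self.kind = kind              # "single", "mirror", or "raidz1"
        self.disks_tb = list(disks_tb)

    def usable_tb(self):
        smallest = min(self.disks_tb)
        if self.kind == "single":
            return self.disks_tb[0]
        if self.kind == "mirror":
            # Every disk holds the same data, so one disk's worth is usable.
            return smallest
        if self.kind == "raidz1":
            # Roughly one disk's worth of space goes to parity.
            return smallest * (len(self.disks_tb) - 1)
        raise ValueError(f"unknown vdev kind: {self.kind}")


class Zpool:
    """A pool is a collection of vdevs whose capacity adds up."""

    def __init__(self, name, vdevs):
        self.name = name
        self.vdevs = list(vdevs)

    def usable_tb(self):
        return sum(v.usable_tb() for v in self.vdevs)


pool = Zpool("tank", [
    Vdev("mirror", [4.0, 4.0]),
    Vdev("raidz1", [2.0, 2.0, 2.0]),
])
print(pool.name, pool.usable_tb(), "TB usable (approximate)")
```

The pool name `tank` is just a conventional example name; redundancy is decided per vdev, while capacity is pooled across all of them.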
ZFS is able to enforce storage integrity far better than any RAID controller, as it has intimate knowledge of the structure of the filesystem. Data safety is an important design feature of ZFS. All blocks written in a zpool are aggressively checksummed to ensure the data’s consistency and correctness.
For server use where you want to eliminate almost entirely any possibility of data loss and stability is the name of the game, you may want to look into ZFS.
- Endless scalability – Well, it’s not technically endless, but it’s a 128-bit file system that’s capable of managing zettabytes (one billion terabytes) of data. Therefore, no matter how much hard drive space you have, ZFS will be suitable for managing it.
- Maximum integrity – Everything you do inside of ZFS uses a checksum to ensure file integrity. You can rest assured that your files and their redundant copies will not encounter silent data corruption. Also, while ZFS is busy quietly checking your data for integrity, it will do automatic repairs anytime it can.
- Drive pooling – The creators of ZFS want you to think of it as being similar to the way your computer uses RAM. When you need more memory in your computer, you put in another stick and you’re done. Similarly with ZFS, when you need more hard drive space, you put in another hard drive and you’re done. No need to spend time partitioning, formatting, initializing, or doing anything else to your disks. When you need a bigger storage “pool”, just add disks.
- RAID – ZFS is capable of many different RAID levels, all while delivering performance that’s comparable to that of hardware RAID controllers. This allows you to save money, make setup easier, and have access to superior RAID levels that ZFS has improved upon.
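The checksum-and-repair behaviour mentioned under integrity can be illustrated with a small Python model of a two-way mirror. Everything here is an assumption made for the example: the `MirroredBlock` class is hypothetical, and SHA-256 simply stands in for whichever checksum algorithm a real pool is configured to use.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Toy block stored on two disks, with its checksum kept separately."""

    def __init__(self, data):
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)

    def read(self):
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                # Self-heal: rewrite any copy that fails verification.
                for i, other in enumerate(self.copies):
                    if checksum(bytes(other)) != self.expected:
                        self.copies[i] = bytearray(good)
                return good
        raise IOError("all copies failed checksum verification")


block = MirroredBlock(b"important data")
block.copies[0][0] ^= 0xFF        # simulate silent corruption on one disk
print(block.read())               # served from the healthy copy
print(bytes(block.copies[0]))     # the damaged copy has been repaired
```

A plain RAID mirror without checksums could not tell which of the two copies was the damaged one; the separately stored checksum is what makes the repair possible.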
Reiser4 file system
ReiserFS is a general-purpose, journaled computer file system initially designed and implemented by a team at Namesys led by Hans Reiser. Introduced in version 2.4.1 of the Linux kernel, it was the first journaling file system to be included in the standard kernel.
With the exception of security updates and critical bug fixes, Namesys has ceased development on ReiserFS. Reiser4 is the successor filesystem for ReiserFS. It has added encryption, improved performance, and much more.
Reiser4 requires a patched kernel. It is still not included in the official Linux kernel, but patches for Linux 5.x are available. The reason usually given for Reiser4 not being in the Linux kernel today is that further testing is required.
According to its developers, Reiser4 makes very efficient use of disk space across a wide range of scenarios and workloads. Like ReiserFS, it offers advantages over other file systems, especially when it comes to handling a large number of small files. It supports journaling for fast recovery in case of problems. The file system structure is based on trees. On the downside, Reiser4 consumes a little more CPU than other filesystems.
Reiser4 has a unique ability to optimize disk space occupied by small files (less than one block). They are stored entirely in their inode, without allocation of blocks in the data area.
As well as implementing the traditional Linux filesystem functions, Reiser4 provides users with a number of additional features: transparent compression and encryption of files, full data journaling, and almost unlimited extensibility through its plug-in architecture.
However, there is currently no support for direct IO (work has begun on implementation), quotas, and POSIX ACL.
Choosing the file system that satisfies your specific application needs requires research into various parameters. This article has outlined the benefits of the ext4, XFS, Btrfs, ZFS, and Reiser4 file systems to help you decide on the right file system for your application environment. Thank you for spending your time here.