A filesystem is a critical part of any computing device. It provides a way to classify and organize files and to store data. A filesystem manages the space available on a device efficiently, so that the required information can be retrieved whenever necessary. Data and metadata (data about the data) are accessed through files and directories, using the mechanisms the filesystem provides. Filesystems are used on storage devices such as optical discs and magnetic disks. In short, a filesystem is a set of data structures and methods employed for:
- Data storage
- Hierarchical categorization
- Data management
- File navigation
- Accessing the data
- Recovery of data
Before exploring the extended filesystems of Linux (ext2, ext3 and ext4), it is necessary to know the basics of the Linux filesystem architecture. The architecture can be divided into three parts.
1. User space: Applications run in user space and send system calls to the system call interface. A system call is a request sent to the kernel of the operating system for a service.
2. Kernel space: The kernel is the core of the operating system. It answers system calls from user space by providing the requested resources and by managing I/O (input/output) devices, memory, files and so on.
3. Disk space: The device driver in kernel space sends the I/O request to the hard disk of the system, which holds the actual file data.
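The path from user space to the disk can be illustrated with a minimal Python sketch. It assumes that Python's low-level `os` functions are thin wrappers around the corresponding system calls, which is the case on Linux. The helper name `write_and_read_back` is hypothetical and used only for illustration.

```python
import os
import tempfile


def write_and_read_back(payload: bytes) -> bytes:
    """Write payload to a temporary file and read it back with low-level calls."""
    # mkstemp creates the file and returns an open file descriptor (read/write).
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)             # user space asks the kernel to write
        os.fsync(fd)                      # ask the kernel to flush data to the device
        os.lseek(fd, 0, os.SEEK_SET)      # move the file offset back to the start
        return os.read(fd, len(payload))  # user space asks the kernel to read
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(write_and_read_back(b"hello, filesystem"))
```

Each `os` call here crosses from user space into kernel space; the kernel's filesystem code and device driver then decide how and when the data reaches the disk.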
Filesystems of Linux
Linux operating systems use various filesystems, such as ext2, ext3, ext4, sysfs, procfs and NFS. We will now discuss the basics of the ext2, ext3 and ext4 filesystems.
Second Extended Filesystem (Ext2)
The ext2 filesystem was developed by Rémy Card and was introduced in Linux in 1993. Ext2 was one of the most efficient and widely used filesystems in Linux. In Debian and Red Hat Linux, ext2 was the default filesystem until ext3 was introduced. Even now, ext2 is used on flash-based storage media such as USB flash drives and SD cards. The space in an ext2 filesystem is divided into blocks; a file occupies a whole number of blocks, so only its last block may be partially filled with data. Compression and decompression on ext2 is supported by e2compr. The maximum file size on ext2 ranges from 16 gigabytes to 2 terabytes, depending on the block size, and the maximum length of a file name (metadata about a file) is 255 bytes.
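The 16 gigabyte to 2 terabyte range can be reproduced with a short calculation. This is a sketch under assumptions that are not stated in the text above: the commonly documented ext2 inode layout with 12 direct block pointers plus one single, one double and one triple indirect pointer, 4-byte block numbers, and a 32-bit per-file count of 512-byte sectors. The function name is hypothetical.

```python
SECTOR_SIZE = 512        # assumed unit of the per-file block counter
MAX_SECTORS = 2 ** 32    # assumed 32-bit counter of 512-byte sectors
DIRECT_POINTERS = 12     # assumed number of direct block pointers in an inode
POINTER_SIZE = 4         # assumed size of one block number, in bytes


def ext2_max_file_size(block_size: int) -> int:
    """Estimate the largest ext2 file, in bytes, for a given block size."""
    pointers_per_block = block_size // POINTER_SIZE
    # Blocks reachable through direct, single, double and triple indirection.
    addressable_blocks = (
        DIRECT_POINTERS
        + pointers_per_block
        + pointers_per_block ** 2
        + pointers_per_block ** 3
    )
    limit_from_block_map = addressable_blocks * block_size
    limit_from_sector_count = MAX_SECTORS * SECTOR_SIZE
    return min(limit_from_block_map, limit_from_sector_count)


if __name__ == "__main__":
    for size in (1024, 2048, 4096):
        gib = ext2_max_file_size(size) / 2 ** 30
        print(f"block size {size}: about {gib:.1f} GiB")
```

Under these assumptions, 1 KiB blocks give a limit of roughly 16 GiB (set by the block map), while 4 KiB blocks give 2 TiB (set by the sector counter).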
Third Extended Filesystem (Ext3)
The ext3 filesystem was developed by Stephen Tweedie. Its main addition over ext2 is journaling: changes to the filesystem are first recorded in a journal, which is a circular log kept within the filesystem. In a non-journaled filesystem, detecting errors and recovering data takes more time, because the entire directory and data structure may have to be checked. In a journaled filesystem, the journal keeps track of the changes made to the filesystem, so detecting errors or recovering data after a crash only requires reading the journal instead of processing the whole data structure. The maximum file size and the maximum filename length of ext3 are the same as those of ext2.
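The idea behind journaling can be sketched with a toy in-memory model. This is not ext3's on-disk format or its actual recovery algorithm; the class `ToyJournaledStore` and its methods are hypothetical names used only to show why replaying a journal is enough to recover after a crash.

```python
class ToyJournaledStore:
    """Hypothetical in-memory model of a journaled store (not ext3 itself)."""

    def __init__(self):
        self.data = {}     # stands in for the main filesystem structures
        self.journal = []  # stands in for the journal

    def begin_write(self, key, value):
        """Record the intended change in the journal before touching the data."""
        self.journal.append({"key": key, "value": value, "committed": False})
        return len(self.journal) - 1

    def commit(self, txn):
        """Mark a journal entry as complete and safe to replay."""
        self.journal[txn]["committed"] = True

    def apply(self, txn):
        """Copy a journaled change into the main data."""
        entry = self.journal[txn]
        self.data[entry["key"]] = entry["value"]

    def recover(self):
        """Replay committed entries, discard incomplete ones, empty the journal."""
        for entry in self.journal:
            if entry["committed"]:
                self.data[entry["key"]] = entry["value"]
        self.journal.clear()


if __name__ == "__main__":
    store = ToyJournaledStore()
    first = store.begin_write("a.txt", "hello")
    store.commit(first)                   # crash happens before apply()
    store.begin_write("b.txt", "world")   # crash happens before commit()
    store.recover()
    print(store.data)
```

Recovery only walks the journal: the committed change to `a.txt` is replayed and the uncommitted change to `b.txt` is dropped, without scanning the rest of the data.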
Features of Ext3 over Ext2
- An ext2 filesystem can be converted to ext3 in place, so backing up and restoring data is not required
- HTree indexing is implemented for larger directories when the feature is enabled
- Journaling filesystem
Fourth Extended Filesystem (Ext4)
The stable version of the ext4 filesystem was introduced in Linux in 2008. The maximum volume size supported by ext4 is 1 exbibyte (1 exbibyte = 2^60 bytes) and the maximum file size is 16 tebibytes. The maximum length of a filename is 255 bytes. The block mapping scheme used by ext2 and ext3, in which each physical block that stores data is tracked individually, is replaced by extents. This modification, which was not available in ext2 and ext3, increased the performance of the filesystem. An extent is a contiguous data storage area, which reduces file fragmentation and file scattering. A single extent in the filesystem can be up to 128 mebibytes (1 mebibyte = 2^20 bytes) when each block in the extent is 4 kibibytes (1 kibibyte = 2^10 bytes).
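The difference between tracking individual blocks and tracking extents can be sketched as follows. This is a simplified illustration, not ext4's on-disk extent tree: the helper `blocks_to_extents` is hypothetical and only coalesces runs of consecutive block numbers into (start, length) pairs, capped at the number of 4 kibibyte blocks that fit into a 128 mebibyte extent.

```python
BLOCK_SIZE = 4 * 1024                               # 4 KiB per block
MAX_EXTENT_BYTES = 128 * 1024 * 1024                # 128 MiB per extent
MAX_EXTENT_BLOCKS = MAX_EXTENT_BYTES // BLOCK_SIZE  # 32768 blocks


def blocks_to_extents(blocks):
    """Coalesce ascending block numbers into (start, length) extents."""
    extents = []
    for block in blocks:
        if extents:
            start, length = extents[-1]
            # Extend the current extent if the block is contiguous and
            # the extent has not reached its maximum length.
            if block == start + length and length < MAX_EXTENT_BLOCKS:
                extents[-1] = (start, length + 1)
                continue
        extents.append((block, 1))
    return extents


if __name__ == "__main__":
    print(MAX_EXTENT_BLOCKS)
    print(blocks_to_extents([100, 101, 102, 200, 201]))
```

Five block numbers collapse into two extents here, which is why a large contiguous file needs far less mapping metadata with extents than with a per-block list.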
Features of Ext4 over Ext2 and Ext3
- Introduction of extents
- HTree indexes, a specialized tree data structure used for directory indexing, are enabled in ext4 by default
- Backward compatibility: ext4 can work with filesystems created as ext2 or ext3
- Pre-allocation: on-disk space for a file can be reserved in advance, in a contiguous area where possible, which is useful for media streaming and databases
- The allocate-on-flush technique is implemented in ext4, which reduces disk fragmentation and CPU usage
- Sequential writing of data is much faster than in the older filesystems
- Timestamps are recorded and measured in nanoseconds in ext4. This finer timestamp granularity caters to the processing speed of modern computers
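Two of the features above, pre-allocation and nanosecond timestamps, are visible from user space. The sketch below uses Python's standard `os.posix_fallocate`, `os.utime` and `os.stat` calls; the helper names are hypothetical. Whether space is actually reserved contiguously and whether nanoseconds are preserved depends on the underlying filesystem, so the code does not assume it is running on ext4.

```python
import os
import tempfile


def preallocate(path: str, size: int) -> int:
    """Reserve size bytes for the file at path and return its resulting size."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if hasattr(os, "posix_fallocate"):
            os.posix_fallocate(fd, 0, size)  # ask the filesystem to reserve space
        else:
            os.ftruncate(fd, size)           # fallback: sets the size, reserves nothing
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


def set_and_read_mtime_ns(path: str, mtime_ns: int) -> int:
    """Set the modification time in nanoseconds and read it back."""
    os.utime(path, ns=(mtime_ns, mtime_ns))  # (access time, modification time)
    return os.stat(path).st_mtime_ns


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "stream.bin")
        print(preallocate(target, 1024 * 1024))
        print(set_and_read_mtime_ns(target, 1_700_000_000_123_456_789))
```

On a filesystem with nanosecond timestamps the value read back keeps its sub-second digits; on a coarser filesystem they are rounded or truncated.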