Understanding Linux Filesystems

A filesystem is a method of organizing data on a storage device so that it can be stored and retrieved efficiently. Linux uses filesystems to manage the data on its storage devices, and each filesystem maintains a map that locates every file placed on the device. On Windows, a path indicates which drive holds the data (for example, C:\ or D:\). Linux maps data locations differently: it uses a virtual directory structure that merges the file paths from all storage devices installed on the system into one single directory tree.

Virtual directory: this directory structure has a single base directory called the root directory (denoted by the / sign). Linux places physical devices (HDDs and SSDs, for example) in the virtual directory using mount points. A mount point is a directory placeholder within the virtual directory that points to a specific device.
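
To make the idea of mount points concrete, here is a minimal sketch of how a path can be matched to the device that holds it: the longest mount point that is a leading directory of the path wins. The helper name owning_mount and the three-line sample table are assumptions made for illustration, not part of any standard tool.

```shell
# owning_mount PATH: hypothetical helper.
# Reads mount-table lines ("device mountpoint type ...") on stdin and prints
# the device and mount point that hold PATH.
owning_mount() {
    awk -v path="$1" '
        {
            mp = $2
            # A mount point holds the path if it is "/", equals the path,
            # or is a leading directory of the path.
            if (mp == "/" || path == mp || index(path, mp "/") == 1) {
                if (length(mp) > length(best)) { best = mp; dev = $1 }
            }
        }
        END { if (best != "") print dev, best }
    '
}

# Sample table (an assumption, modelled on a three-partition layout)
sample='/dev/sda2 / ext4 rw 0 0
/dev/sda3 /home ext4 rw 0 0
/dev/sda1 /boot vfat rw 0 0'

printf '%s\n' "$sample" | owning_mount /home/aldin/notes.txt   # prints: /dev/sda3 /home
```

On a real system the same function could read /proc/mounts instead of the sample, for example owning_mount /var/log < /proc/mounts. Mount points that contain spaces are escaped in that file, so this sketch would not handle them.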

Here are the most important directories in a Linux filesystem:

/boot        Contains boot loader files used to boot the system
/etc         Contains system and application configuration files
/home        Contains user data files
/media       Used as a mount point for removable devices
/mnt         Also used as a mount point, traditionally for temporarily mounted filesystems
/opt         Contains data for optional third-party applications
/tmp         Contains temporary files created by system users
/usr         Contains data for standard Linux programs
/usr/bin     Contains user programs
/usr/local   Contains data for programs unique to the installation
/usr/sbin    Contains system administration programs
/var         Contains variable data files, including system and application logs

Common Filesystem Types

Before a drive partition can be assigned to a mount point in the virtual directory, it must be formatted with a filesystem. Each filesystem type has its own way of indexing files and directories and of tracking file access, so they are similar in purpose but different in detail. First, let’s discuss the types of filesystem Linux supports (both Linux and non-Linux filesystems). Then we will see how to create a filesystem, mount it, and manage it. Here are the Linux filesystem types we can use:

btrfs      Supports files up to about 18,000 petabytes (just insane). Provides built-in RAID and LVM-like volume management, snapshots for backup, improved fault tolerance, and on-the-fly data compression
ecryptfs   Applies a POSIX-compliant encryption protocol to data before storing it on the device. Only a system that holds the encryption key (normally the one that created the data) can read it
ext3       Supports files up to 2 terabytes. Total filesystem size can reach about 17 terabytes (16 TiB). Supports journaling, faster startup and recovery
ext4       Supports files up to about 17 terabytes (16 TiB). Total filesystem size can reach 1 exbibyte with 4 KiB blocks. Supports journaling, improved performance
swap       Not intended for storing persistent data. A swap area provides virtual memory using space on a physical drive: 2 GB of swap gives the system 2 GB of extra memory to use, but it lives on the drive rather than in RAM, so it is much slower

Those were the Linux filesystems. Now let’s see which non-Linux filesystems Linux supports:

CIFS       Filesystem protocol created by Microsoft. Used to read and write data across a network using a network storage device
exFAT      Commonly used to format USB devices and SD cards
HFS        Developed by Apple for Mac systems. Linux can interact with both HFS and HFS+ filesystems
ISO-9660   Weird name. It is the standard used to create filesystems on CD-ROM devices
NFS        Standard for reading and writing data across a network using a network storage device
NTFS       Filesystem used by Microsoft. Linux can read and write data on NTFS partitions
SMB        Filesystem created by Microsoft. Used for network storage and for interacting with other network devices (such as printers). Support for SMB allows Linux clients and servers to interact with Microsoft clients and servers on a network
UDF        Used on DVD-ROM devices to store data. Linux can read data from and write data to a DVD using this filesystem
VFAT       Extension of Microsoft’s FAT filesystem. Often used for removable devices such as USB memory sticks
ZFS        Created by Sun Microsystems for Unix workstations and servers. Has features similar to the btrfs Linux filesystem

Creating filesystems

The program that creates a filesystem is called mkfs. Strictly speaking, there are multiple programs, such as mkfs.ext4 for creating an ext4 filesystem. Instead of learning all the individual program names, use mkfs with the -t option to specify the filesystem type. Before running mkfs, be careful to specify the right partition or device. If you make a mistake, you will format the wrong partition or disk and lose all the data on it. To create a swap area, use the mkswap command.
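
Because formatting the wrong device is destructive, it can help to check the target before running mkfs. The following is a minimal sketch of that idea, not a complete safety net: the helper name plan_mkfs and the sample mount table are assumptions for illustration, and the function only prints the mkfs command instead of running it.

```shell
# plan_mkfs TYPE DEVICE: hypothetical dry-run helper.
# Reads a mount table ("device mountpoint type ...") on stdin, refuses devices
# that appear in it, and otherwise prints the mkfs command without running it.
plan_mkfs() {
    if awk -v dev="$2" '$1 == dev { found = 1 } END { exit !found }'; then
        echo "refusing: $2 is currently mounted" >&2
        return 1
    fi
    echo "mkfs -t $1 $2"
}

# Sample mount table (an assumption; /proc/mounts has the same shape)
table='/dev/sda2 / ext4 rw 0 0
/dev/sda3 /home ext4 rw 0 0'

printf '%s\n' "$table" | plan_mkfs ext4 /dev/sdb1            # prints: mkfs -t ext4 /dev/sdb1
printf '%s\n' "$table" | plan_mkfs ext4 /dev/sda3 || true    # refuses, message goes to stderr
```

A device that is absent from the mount table can still hold data you care about, so the printed command should be read carefully before it is run by hand as root.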

After formatting a drive partition with a filesystem, add it to the virtual directory on your Linux system. This process is called mounting the filesystem. You can do it manually from the command line, or let Linux do it automatically at boot:

1. Manually mounting devices to a filesystem:

This mounts the filesystem into the Linux virtual directory temporarily. We will use the mount command.

# mount -t ext4 /dev/sdb1 /media/usb1

We called the mount command, specified the filesystem type with the -t flag (ext4 in this case), and mounted /dev/sdb1 at /media/usb1.

The downside of the mount command is that the mount is temporary. When you reboot the system, you must mount the device again. This makes the method a good fit for removable devices. To unmount the device, use the umount command.
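
The usual order of steps for a removable device can be sketched as a small script. This is only an illustration: the helpers run and mount_cycle are hypothetical names, the device and directory are the ones used in the example above, and the script defaults to a dry run that prints the commands rather than executing them.

```shell
# run: print the command when DRY_RUN is 1 (the default here), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# mount_cycle DEVICE DIR: hypothetical helper showing the order of the steps.
mount_cycle() {
    run mkdir -p "$2"               # the mount point directory must exist first
    run mount -t ext4 "$1" "$2"     # attach the filesystem
    run umount "$2"                 # detach it again when done
}

mount_cycle /dev/sdb1 /media/usb1
# prints:
# + mkdir -p /media/usb1
# + mount -t ext4 /dev/sdb1 /media/usb1
# + umount /media/usb1
```

Setting DRY_RUN=0 would execute the commands for real, which requires root privileges and an actual ext4 partition at the given device path.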

2. Automatically mount devices to a filesystem:

For permanent storage devices, Linux uses the /etc/fstab file. It lists the devices that are mounted automatically at boot time. Let’s take a look at what is inside:

[root@arch ~]# cat /etc/fstab 
# Static information about the filesystems.
# See fstab(5) for details.

# <file system> <dir> <type> <options> <dump> <pass>
# /dev/sdb2
UUID=3898728b-a9f9-4964-8a23-6d3bbec99030	/         	ext4      	rw,relatime	0 1

# /dev/sdb1
UUID=CAAB-5AD3      	/boot     	vfat      	rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro	0 2

# /dev/sdb3
UUID=66d3a8df-476b-4c5d-85c5-2f02bac66026	/home     	ext4      	rw,relatime	0 2

Aight, it looks a bit messy, but let’s dissect it into little pieces. Each entry has six fields:

  1. Device (a raw device file such as /dev/sdb2, a udev file, or a UUID= identifier)
  2. Mount point location
  3. Filesystem type
  4. Additional options required to mount the drive
  5. Dump flag (0 means the dump backup utility skips this filesystem)
  6. fsck pass number (1 for the root filesystem, 2 for other filesystems, 0 to skip the check at boot)

Devices are identified by their UUID values, ensuring that the correct drive partition is mounted. Here the first entry (/dev/sdb2) is mounted at the root / of the virtual directory and uses the ext4 filesystem type. The second entry (/dev/sdb1) is mounted at /boot and uses vfat. The third entry (/dev/sdb3) is mounted at /home and also uses ext4.

If you want something mounted automatically, add an entry for it to the /etc/fstab file.
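
To see the fields in action, here is a small sketch that reads fstab-formatted text and prints one summary line per entry. The helper name fstab_entries is an assumption for illustration; the sample input reuses two entries from the listing above, with tabs replaced by spaces.

```shell
# fstab_entries: hypothetical helper that reads fstab-formatted text on stdin,
# skips comments and blank lines, and prints one summary line per entry.
fstab_entries() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }     # blank line or comment
        { printf "%s -> %s (%s, fsck pass %s)\n", $1, $2, $3, $6 }
    '
}

fstab_entries <<'EOF'
# /dev/sdb2
UUID=3898728b-a9f9-4964-8a23-6d3bbec99030  /      ext4  rw,relatime  0 1

# /dev/sdb3
UUID=66d3a8df-476b-4c5d-85c5-2f02bac66026  /home  ext4  rw,relatime  0 2
EOF
# prints:
# UUID=3898728b-a9f9-4964-8a23-6d3bbec99030 -> / (ext4, fsck pass 1)
# UUID=66d3a8df-476b-4c5d-85c5-2f02bac66026 -> /home (ext4, fsck pass 2)
```

On a real system the same function can read the file directly with fstab_entries < /etc/fstab.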

Managing Filesystems

Filesystem management tools are used to monitor disk performance, disk usage, partition sizes, mount points, and so on. These are the most commonly used ones:

Managing with df command

The df command displays disk usage by partition. Let’s take a look at what my machine prints. With the -h argument, df prints human-readable sizes in megabytes, gigabytes, and so on. As you can see, several filesystems are mounted. The most important ones are /dev/sda[123], which are the real partitions on my SSD. The df command shows the total size of each partition, the used and available space, and where the partition is mounted in the virtual directory.

[root@arch ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
dev             3.9G     0  3.9G   0% /dev
run             3.9G  1.4M  3.9G   1% /run
/dev/sda2        98G   12G   82G  13% /
tmpfs           3.9G  138M  3.8G   4% /dev/shm
tmpfs           4.0M     0  4.0M   0% /sys/fs/cgroup
tmpfs           3.9G  4.0K  3.9G   1% /tmp
/dev/sda3       122G  5.7G  110G   5% /home
/dev/sda1        99M   63M   36M  64% /boot
/dev/loop2      5.9M  5.9M     0 100% /var/lib/snapd/snap/tor/2
/dev/loop1       56M   56M     0 100% /var/lib/snapd/snap/core18/1885
/dev/loop0       31M   31M     0 100% /var/lib/snapd/snap/snapd/9279
tmpfs           788M  112K  788M   1% /run/user/1000
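
A common use of this output is spotting filesystems that are filling up. The sketch below filters df output by the Use% column; the helper name over_threshold and the 60 percent limit are assumptions chosen for illustration, and the sample rows are taken from the output above.

```shell
# over_threshold LIMIT: hypothetical helper. Reads df output on stdin and
# prints the mount point and usage of filesystems at or above LIMIT percent.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "64%" -> "64"
            if (use + 0 >= limit + 0) print $6, use "%"
        }
    '
}

over_threshold 60 <<'EOF'
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        98G   12G   82G  13% /
/dev/sda3       122G  5.7G  110G   5% /home
/dev/sda1        99M   63M   36M  64% /boot
EOF
# prints: /boot 64%
```

With live data this becomes df -hP | over_threshold 90, where -P keeps each filesystem on a single line. Mount points that contain spaces would need extra handling.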

Managing filesystem with du

The du command displays disk usage by directory. Use the -h flag to see the output in human-readable form. Let’s take a look at what it looks like:

[root@arch ~]# du -h
4.0K	./.@PACKAGE_NAME@/app
16K	./.@PACKAGE_NAME@
4.0K	./.cache/yay
8.0K	./.cache
4.0K	./.config/yay
8.0K	./.config
4.0K	./opt
8.0K	./.gnupg/crls.d
12K	./.gnupg
8.8M	./snap/tor/2/.tor
[snip]

Here are the most used flags for du:

-0    End each output line with NUL instead of a newline
-a    Output disk usage for files as well as directories (massive output)
-ch   Print usage of all directories followed by a grand total (human-readable)
-sh   Just print a disk usage summary (human-readable)
-x    Skip directories on other filesystems
-X    Exclude files that match any pattern listed in the given file
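
Since du prints one line per directory, it combines well with sort to find the biggest space consumers. The following is a minimal sketch: the helper name largest_dirs is an assumption, and the sample sizes are illustrative values in kilobytes rather than real measurements.

```shell
# largest_dirs N: hypothetical helper. Reads "size<TAB>path" lines, as printed
# by du -k, on stdin and keeps the N largest entries.
largest_dirs() {
    sort -rn | head -n "$1"
}

# Sample lines in du -k format (sizes in kilobytes, illustrative values)
printf '%s\t%s\n' 4 ./.cache/yay 8 ./.cache 12 ./.gnupg 9011 ./snap/tor/2/.tor |
    largest_dirs 2
```

With real data the pipeline would be du -k | largest_dirs 10. The -k flag is used instead of -h so that the sizes stay plain numbers that sort -n can compare.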

Managing filesystems with iostat

The iostat command displays disk statistics by device and partition. It is part of the sysstat package, so install that first. The command monitors the load on system input/output devices and is very versatile. Here’s an example of iostat output:

[aldin@arch ~]$ iostat -ph
Linux 5.8.12-arch1-1 (arch) 	10/19/2020 	_x86_64_	(6 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.6%    0.0%    1.9%    0.0%    0.0%   91.5%

      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     7.05        81.5k       157.1k         0.0k       1.3G       2.4G       0.0k sda
     0.01         0.4k         0.0k         0.0k       6.7M       1.0k       0.0k sda1
     1.82        56.2k         6.2k         0.0k     890.6M      98.3M       0.0k sda2
     5.20        24.8k       150.9k         0.0k     393.3M       2.3G       0.0k sda3
     0.00         0.0k         0.0k         0.0k     344.0k       0.0k       0.0k loop0
     0.01         0.1k         0.0k         0.0k       2.2M       0.0k       0.0k loop1
     0.01         0.2k         0.0k         0.0k       2.4M       0.0k       0.0k loop2

The iostat command generates a CPU utilization report and a device utilization report (older sysstat versions could also generate a network filesystem report). Let’s take a look at the “CPU utilization report”:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.6%    0.0%    1.9%    0.0%    0.0%   91.5%
%user     Percentage of CPU utilization that occurred while executing at the user level (application)
%nice     Percentage of CPU utilization that occurred while executing at the user level with nice priority
%system   Percentage of CPU utilization that occurred while executing at the system level (kernel)
%iowait   Percentage of time that the CPUs were idle while the system had an outstanding disk I/O request
%steal    Percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor
%idle     Percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request
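
The six columns add up to 100%, so everything that is not %idle can be computed with one subtraction. The sketch below does that for the report shown above; the helper name cpu_busy is an assumption, and note that the result also counts %iowait, which is a deliberate simplification.

```shell
# cpu_busy: hypothetical helper. Reads an iostat CPU report on stdin and
# prints 100 minus the %idle column (user + nice + system + iowait + steal).
cpu_busy() {
    awk '
        /^avg-cpu:/ {
            if ((getline) > 0) {      # the values are on the following line
                idle = $NF
                sub(/%/, "", idle)    # tolerate the "%" suffix added by -h
                printf "%.1f\n", 100 - idle
            }
        }
    '
}

cpu_busy <<'EOF'
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.6%    0.0%    1.9%    0.0%    0.0%   91.5%
EOF
# prints: 8.5
```

The result matches the sum of the other columns in the sample: 6.6 + 0.0 + 1.9 + 0.0 + 0.0 = 8.5.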

The device utilization report follows. This block shows statistics for each physical device or partition. Let’s examine each column:

      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     7.05        81.5k       157.1k         0.0k       1.3G       2.4G       0.0k sda
     0.01         0.4k         0.0k         0.0k       6.7M       1.0k       0.0k sda1
     1.82        56.2k         6.2k         0.0k     890.6M      98.3M       0.0k sda2
     5.20        24.8k       150.9k         0.0k     393.3M       2.3G       0.0k sda3
     0.00         0.0k         0.0k         0.0k     344.0k       0.0k       0.0k loop0
     0.01         0.1k         0.0k         0.0k       2.2M       0.0k       0.0k loop1
     0.01         0.2k         0.0k         0.0k       2.4M       0.0k       0.0k loop2
tps         Transfers per second sent to the device. One transfer is one I/O request to the device
kB_read/s   Amount of data read from the device (kilobytes per second)
kB_wrtn/s   Amount of data written to the device (kilobytes per second)
kB_dscd/s   Amount of data discarded for the device (kilobytes per second)
kB_read     Total number of kilobytes read
kB_wrtn     Total number of kilobytes written
kB_dscd     Total number of kilobytes discarded
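
These columns can be combined into a rough derived figure: dividing the read plus write rate by tps gives the average kilobytes moved per transfer. The sketch below is an illustration under stated assumptions: the helper name avg_request_kb is hypothetical, the input is expected to use plain numbers (iostat without -h), discards are ignored, and the cumulative columns in the sample are made-up values.

```shell
# avg_request_kb: hypothetical helper. Reads an iostat device report on stdin
# and prints "device average-kB-per-transfer" for devices with activity.
avg_request_kb() {
    awk '
        # Header line: remember which column holds each value
        /tps/ && /Device/ {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("tps" in col) && NF >= col["Device"] && $(col["tps"]) + 0 > 0 {
            kb = $(col["kB_read/s"]) + $(col["kB_wrtn/s"])
            printf "%s %.1f\n", $(col["Device"]), kb / $(col["tps"])
        }
    '
}

avg_request_kb <<'EOF'
Device  tps   kB_read/s  kB_wrtn/s  kB_dscd/s  kB_read  kB_wrtn  kB_dscd
sda     7.05  81.50      157.10     0.00       1363149  2516582  0
loop0   0.00  0.00       0.00       0.00       344      0        0
EOF
# prints: sda 33.8
```

For sda this is (81.5 + 157.1) / 7.05, or about 33.8 kB per transfer. Columns are located by header name, so the sketch works whether the Device column comes first or last.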

Managing filesystems with lsblk

The lsblk command displays current partition sizes and mount points for all block devices. It shows all block devices (except RAM disks) in a tree-like format. Here are the most used flags for lsblk:

-a   Print empty devices
-b   Print SIZE in bytes, not in human-readable format
-d   Don’t print device holders or partitions. Just print the device
-f   Print information about filesystems
-i   Use ASCII characters to create a tree
-m   Show device owner, group and file permissions information
-n   No header line shown

And here is a quick little example of the lsblk command. The -n flag suppresses the header line, the -m flag shows the device owner, group, and file permissions, and the -i flag uses ASCII characters to draw the little tree of /dev/sda partitions:

[aldin@arch ~]$ lsblk -nmi
loop0   30.3M root  disk  brw-rw----
loop1   55.3M root  disk  brw-rw----
loop2    5.8M root  disk  brw-rw----
sda    223.6G root  disk  brw-rw----
|-sda1   100M root  disk  brw-rw----
|-sda2   100G root  disk  brw-rw----
`-sda3 123.5G root  disk  brw-rw----
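
For scripting, the -b flag is handy because byte counts are easy to process. The sketch below converts partition sizes to GiB; the helper name partition_sizes_gib is an assumption, and the sample input mimics what lsblk -b -n -l -o NAME,SIZE,TYPE would print, with byte counts chosen to match the 100M and 100G partitions above.

```shell
# partition_sizes_gib: hypothetical helper. Reads "NAME SIZE TYPE" lines
# (size in bytes) on stdin and prints each partition with its size in GiB.
partition_sizes_gib() {
    awk '$3 == "part" { printf "%s %.1f GiB\n", $1, $2 / (1024 * 1024 * 1024) }'
}

partition_sizes_gib <<'EOF'
sda  240057409536 disk
sda1    104857600 part
sda2 107374182400 part
EOF
# prints:
# sda1 0.1 GiB
# sda2 100.0 GiB
```

Whole disks and loop devices are skipped because only rows whose TYPE column is part are printed.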
