Inode
An inode (index node, pronounced "eye-node") is the basic data structure used to manage files in the file systems of Unix-like operating systems. Each inode is uniquely identified within a partition by its inode number. Each name entry in a directory refers to exactly one inode. The inode contains the file's metadata and refers to the file's data or, for a directory, to its list of files.
Because of the inode concept, application software does not need to distinguish between device drivers and regular files when reading or writing data: everything counts as a file in the Unix variants ("On UNIX systems it is reasonably safe to say that everything is a file…"; see Everything is a file). As a result, such operating systems manage their data storage differently from other systems such as Microsoft Windows with NTFS, and also from VMS or MVS.
Basics
When a file is saved on a computer, not only the file content (user data) has to be stored, but also metadata such as the time the file was created or the owner of the file. At the same time, the computer must be able to efficiently map a file name, including its path, to the corresponding user data and metadata. The specification of how this data is organized and stored on a data carrier is called the file system. Different file systems exist for different areas of application and operating systems. The file system specification is implemented by a driver, which can be realized either as a module of the operating system kernel or, more rarely, as a normal program in user space.
File systems of Unix-like operating systems, such as Linux and macOS, use inodes. These contain the metadata as well as references to where the user data is stored. The size, number and location of the inodes are recorded in a special place in the file system, the superblock. The inodes are numbered and stored contiguously on the data carrier. The root directory of a file system has a fixed inode number. Subdirectories are "normal" files whose user data is a list of the files they contain together with the associated inode numbers.
If, for example, the file /bin/ls is to be opened, the process is, in simplified form, as follows:
- The file system driver reads the superblock and thereby learns the starting position of the inodes and their size, so that any inode can now be located and read.
- The inode of the root directory is opened. Since it is a directory, it refers to the location of the list of all files in it, including their inode numbers. This list is searched for the entry bin.
- The inode of the bin directory can now be read and, as in the previous step, the inode number of the file ls is found.
- Since ls is not a directory but a regular file, its inode contains a reference to the storage location of the desired data.
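The lookup steps above can be sketched with a toy in-memory model. This is only an illustration: the inode table, the inode numbers 11 and 42 and the function name resolve are hypothetical, and the root inode number 2 is an assumption borrowed from ext2.

```python
# Toy model of path resolution; all names and numbers here are illustrative.
ROOT_INODE = 2  # assumption: fixed root inode number, as used by ext2

# inode number -> inode; a directory's "data" is a name -> inode-number table
INODES = {
    2: {"type": "dir", "entries": {".": 2, "..": 2, "bin": 11}},
    11: {"type": "dir", "entries": {".": 11, "..": 2, "ls": 42}},
    42: {"type": "file", "data": b"program bytes"},
}

def resolve(path):
    """Return the inode number for an absolute path, one component at a time."""
    current = ROOT_INODE
    for name in path.split("/"):
        if not name:  # skip empty components from leading or doubled slashes
            continue
        inode = INODES[current]
        if inode["type"] != "dir":
            raise NotADirectoryError(path)
        if name not in inode["entries"]:
            raise FileNotFoundError(path)
        current = inode["entries"][name]
    return current

print(resolve("/bin/ls"))  # prints 42
```

A real driver additionally has to read the superblock and fetch each inode from the data carrier; the sketch only shows the alternation between reading a directory table and following an inode number.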
Structure
Each name in a path delimited by slashes (/) is assigned to exactly one inode. The inode stores the following metadata about the file, but not the name itself:
- The type of file (regular file, directory, symbolic link, ...), see below;
- the numeric identifier of the owner (UID, user id ) and the group (GID, group id );
- the access rights for the owner (user), the group (group) and all others (others). Classic user and rights management is done with the programs chown (change owner), chgrp (change group) and chmod (change mode). Access Control Lists (ACLs) allow a finer-grained assignment of rights;
- various timestamps of the file: creation (on file systems that record it), last access (access time, atime) and last modification (modification time, mtime);
- the time of the last status change of the inode ( status , ctime );
- the size of the file;
- the link counter (see below );
- one or more references to the blocks in which the actual data is stored.
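Most of the fields listed above can be read from user space. The following is a minimal sketch, assuming a Unix-like system and Python's standard os.stat call; the helper name describe_inode is made up for this example.

```python
import os
import stat
import tempfile
import time

def describe_inode(path):
    """Collect the inode metadata that os.stat exposes for a path."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,
        "type_and_mode": stat.filemode(st.st_mode),  # file type plus access rights
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "links": st.st_nlink,
        "atime": time.ctime(st.st_atime),
        "mtime": time.ctime(st.st_mtime),
        "ctime": time.ctime(st.st_ctime),  # last status change on Unix-like systems
    }

# Demonstration on a temporary file with five bytes of content
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    for field, value in describe_inode(tmp.name).items():
        print(f"{field}: {value}")
```

Note that the result contains no file name: the name lives in the directory entry, not in the inode.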
Regular files
Regular files hold both user data and executable programs. The latter are marked by the execute permission (x) and are started in a separate process when invoked. Not only compiled programs are "executable", but also scripts whose shebang line specifies the interpreter to be used. For sparse files, the logical size differs from the disk space actually occupied by the data blocks.
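The difference between logical size and allocated space for a sparse file can be observed as follows. This is a sketch under assumptions: it relies on a Unix-like system where st_blocks is reported in 512-byte units, and whether the hole really stays unallocated depends on the file system. The function name sparse_sizes is hypothetical.

```python
import os
import tempfile

def sparse_sizes(logical_size):
    """Create a file consisting of one hole; return (logical size, allocated bytes)."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.truncate(logical_size)  # extend the file without writing data blocks
        tmp.flush()
        st = os.stat(tmp.name)
        # st_blocks counts 512-byte units on Unix-like systems
        return st.st_size, st.st_blocks * 512

logical, allocated = sparse_sizes(1024 * 1024)
print("logical size:", logical, "bytes; allocated:", allocated, "bytes")
```

On file systems that support sparse files, the allocated value is typically far smaller than the logical size of 1 MiB.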
Directories
Directories are files whose "file content" is a table of the files they contain. The table has one column with the file names and one with the associated inode numbers. In some file systems the table holds further information; ext4, for example, also stores the file type of each entry, so that it does not have to be read from the inode of every file when a directory is listed. The entries . and .. always exist in every directory; they refer to the directory itself and to its parent directory.
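The name-to-inode table of a directory can be read with Python's os.scandir. The following is a minimal sketch for a Unix-like system; list_directory is a name chosen for this example.

```python
import os
import tempfile

def list_directory(path):
    """Return {name: inode number} for the entries of a directory."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}

with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "a.txt"), "w").close()
    os.mkdir(os.path.join(d, "sub"))
    print(list_directory(d))
    # os.scandir omits "." and "..", but both names still resolve:
    print(os.path.samefile(os.path.join(d, "."), d))
    print(os.path.samefile(os.path.join(d, "sub", ".."), d))
```

The two samefile checks compare device and inode numbers, which shows that . and sub/.. are just further names for the same directory inode.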
Hard and symbolic links
Symbolic links are special files that contain, instead of data, the file path to which the link refers. Depending on the file system and the length of the path, the path is stored either directly in the inode or in a data block to which the inode refers.
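That a symbolic link is a file of its own which merely stores a path can be checked with a short sketch, assuming a Unix-like system where creating symbolic links is permitted. The function name inspect_symlink and the file names are hypothetical.

```python
import os
import tempfile

def inspect_symlink():
    """Create a symbolic link and report what it stores and which inodes are involved."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")
        link = os.path.join(d, "link")
        with open(target, "w") as f:
            f.write("data")
        os.symlink(target, link)
        return {
            "stores_path": os.readlink(link) == target,
            # lstat looks at the link itself, stat follows it to the target
            "own_inode": os.lstat(link).st_ino != os.stat(target).st_ino,
            "follows_to_target": os.stat(link).st_ino == os.stat(target).st_ino,
        }

print(inspect_symlink())
```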
Hard links, by contrast, are not special files. A hard link exists when an inode is referenced more than once, from different directories or under different file names. All references (links) to the inode are equivalent; there is no original. The link counter in the inode indicates how many file names refer to it: it is 1 after a file has been created and is incremented whenever a further hard link to the file is created. For a directory, it is two more than the number of subdirectories it contains, because in addition to the entry in the parent directory and the entry . in the directory itself, the .. entries of all subdirectories refer to it. When a file is deleted, its entry is removed from the containing directory and the link counter is decremented by one. If the link counter then reaches 0, the system waits until no program has the file open any longer and only then releases the storage space.
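The behaviour of the link counter for regular files can be traced with os.link and os.unlink. This is a sketch, assuming a file system that supports hard links; link_counts is a hypothetical helper.

```python
import os
import tempfile

def link_counts():
    """Track st_nlink while a hard link is created and one name is removed."""
    counts = []
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first")
        second = os.path.join(d, "second")
        with open(first, "w") as f:
            f.write("data")
        counts.append(os.stat(first).st_nlink)   # one name after creation
        os.link(first, second)                   # second name for the same inode
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino
        counts.append(os.stat(first).st_nlink)   # two names now
        os.unlink(first)                         # remove one of the names
        counts.append(os.stat(second).st_nlink)  # one name left
        with open(second) as f:
            content = f.read()                   # data is still reachable
    return counts, same_inode, content

print(link_counts())
```

Removing the name first does not affect the data, because second refers to the same inode and the counter has not dropped to 0.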
More types of files
In addition to these common files, there are other file types:
- Block-oriented device files
- Character-oriented device files
- Named pipes
- Sockets
- Doors (only under Solaris)
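The file type is encoded in the mode field of the inode and can be decoded with the predicates of Python's stat module. The following is a sketch; the function name file_type and the labels are chosen for this example.

```python
import os
import stat
import tempfile

def file_type(path):
    """Classify a path by the type bits in its inode, without following symlinks."""
    mode = os.lstat(path).st_mode
    checks = [
        (stat.S_ISREG, "regular file"),
        (stat.S_ISDIR, "directory"),
        (stat.S_ISLNK, "symbolic link"),
        (stat.S_ISBLK, "block device"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISFIFO, "named pipe"),
        (stat.S_ISSOCK, "socket"),
        (stat.S_ISDOOR, "door"),  # always false on systems without doors
    ]
    for predicate, label in checks:
        if predicate(mode):
            return label
    return "unknown"

with tempfile.TemporaryDirectory() as d:
    print(file_type(d))
```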
Reference to the data
In older file systems, the inode usually contains a limited number of references to the data blocks that hold the file's user data. Since the fixed number of references in the inode would severely limit the file size, for larger files the inode additionally stores a reference to a data block that contains, instead of user data, a list of references to further data blocks. In principle, this scheme can be nested to any depth.
As an alternative, more modern file systems use extents: only the numbers of the first and the last block of a contiguous run of data blocks are stored in the inode. If the file is not fragmented, only two block numbers need to be stored regardless of the file size.
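The saving achieved by extents can be illustrated by collapsing a list of block numbers into (first, last) pairs. This is a simplified sketch, not the on-disk format of any particular file system; to_extents is a hypothetical function.

```python
def to_extents(blocks):
    """Collapse an ordered list of block numbers into (first, last) extents."""
    extents = []
    for block in blocks:
        if extents and block == extents[-1][1] + 1:
            extents[-1] = (extents[-1][0], block)  # extend the current run
        else:
            extents.append((block, block))         # start a new run
    return extents

# Unfragmented file of 1000 blocks: a single pair suffices
print(to_extents(list(range(100, 1100))))    # [(100, 1099)]
# Fragmented file: one pair per contiguous run
print(to_extents([10, 11, 12, 20, 21, 30]))  # [(10, 12), (20, 21), (30, 30)]
```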
Example
In the ext2 file system, the default size of an inode is 128 bytes. For a regular file, up to 12 entries in the inode each refer directly to a data block holding the actual content. If these 12 blocks are not sufficient, a further entry in the inode points to a block that in turn contains references to the actual data blocks. Such a block is known as a singly indirect block. Up to three levels of indirection are possible (singly, doubly and triply indirect blocks), so that the maximum file size is between 16 GiB and 4 TiB, depending on the block size.
Example of an inode structure with 12 KiB in directly addressed data blocks and 256 KiB + 65,536 KiB + approximately 16 million KiB in indirectly addressed data blocks, each block being 1 KiB in size.
The 256 entries in each indirect block follow from the fact that a 1 KiB block holds exactly 256 addresses of 4 bytes each (32-bit block addresses).
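The figures in this example can be recomputed from the pointer layout. The sketch below assumes 12 direct pointers, 4-byte block addresses and three levels of indirection as described above, and considers only the limit imposed by the pointer structure; other limits of a real implementation are ignored. The function names are hypothetical.

```python
DIRECT_POINTERS = 12  # direct block references in the inode
POINTER_SIZE = 4      # bytes per block address (32 bit)

def max_blocks(block_size):
    """Addressable data blocks: direct + singly + doubly + triply indirect."""
    per_block = block_size // POINTER_SIZE  # addresses that fit in one block
    return DIRECT_POINTERS + per_block + per_block**2 + per_block**3

def max_file_size(block_size):
    """Largest file size in bytes that the pointer structure can address."""
    return max_blocks(block_size) * block_size

for block_size in (1024, 4096):
    gib = max_file_size(block_size) / 2**30
    print(f"{block_size} byte blocks: about {gib:.1f} GiB")
```

With 1 KiB blocks this gives 12 + 256 + 65,536 + 16,777,216 blocks, slightly more than 16 GiB; with 4 KiB blocks the result is slightly more than 4 TiB.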
Practice
The inode number of a file can be displayed with the command ls -i filename. The program find offers the option -inum to search for files with a specific inode number.
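A search by inode number, similar in spirit to find with the option -inum, can be sketched in Python. The function find_by_inode is hypothetical and, unlike a careful implementation, does not check whether the walk crosses into another file system, where the same inode number may denote a different file.

```python
import os
import tempfile

def find_by_inode(root, inode_number):
    """Return all paths below root whose inode number matches."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_ino == inode_number:
                matches.append(path)
    return sorted(matches)

with tempfile.TemporaryDirectory() as root:
    original = os.path.join(root, "a")
    open(original, "w").close()
    os.mkdir(os.path.join(root, "sub"))
    os.link(original, os.path.join(root, "sub", "b"))  # second name, same inode
    print(find_by_inode(root, os.stat(original).st_ino))
```

Because both names are hard links to one inode, the search returns two paths.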
The metadata stored in an inode can be displayed with the command stat. The tool debugfs provides the command stat <inode_number> to display the data stored in the inode.
The number of possible inodes, and thus of possible files, is limited in some file systems; once the maximum is reached, no further files can be created. When ext2/ext3/ext4 file systems are created, the number of inodes can be set. For data carriers that will hold a large number of small files, care must be taken when formatting that the number of inodes is high enough. With the option -i, the program df (display free disk space) shows the number of used and available inodes on all mounted file systems.
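The same inode counters are available to programs through the statvfs call. This is a minimal sketch for a Unix-like system; inode_usage is a hypothetical helper, and some file systems without a fixed inode table report 0 for both values.

```python
import os

def inode_usage(path="/"):
    """Return (total, used, free) inode counts for the file system holding path."""
    vfs = os.statvfs(path)
    total = vfs.f_files  # total number of inodes
    free = vfs.f_ffree   # number of free inodes
    return total, total - free, free

total, used, free = inode_usage("/")
print(f"inodes: {total} total, {used} used, {free} free")
```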
File systems usually come with checking software which, among other things, checks the inodes for inconsistencies. On Linux and some other Unix-like systems, fsck is used for this. fsck tries to determine the file system type and then checks all inodes for correctness.
For most file systems, the number of available inodes is fixed when the file system is created and usually cannot be changed afterwards. The number of inodes thus also limits the maximum number of files that can be stored on the file system (including all directories, special files, etc.).
Literature
- Æleen Frisch: Essential System Administration. 3rd edition. O'Reilly Media, 2002, ISBN 978-0-596-00343-2.
References
- Æleen Frisch: Essential System Administration. 2nd edition. O'Reilly Verlag, 1995, ISBN 1-56592-127-5, p. 37.
- Æleen Frisch: Essential System Administration. 2nd edition. O'Reilly Verlag, 1995, ISBN 1-56592-127-5, p. 38.
- The Second Extended File System, Internal Layout by Dave Poirier, section s_inode_size.
- The Second Extended File System, Internal Layout by Dave Poirier, section i_block.