ZFS (file system)

ZFS
Developer: Sun Microsystems
Full name: Zettabyte File System (obsolete)
Initial release: June 2006 (Solaris 10)
Partition identifier: 6A898CC3-1DD2-11B2-99A6-080020736631 (GPT), among others

Maximum values
Size of a file: 2^64 − 1 bytes
Number of files: 2^48
File system size: 2^128 bytes

Properties
File permission management: POSIX, ACLs
Transparent compression: yes (LZJB and gzip)
Transparent encryption: yes (introduced with Oracle Solaris 11 Express 2010.11)

Supported operating systems: Solaris, OpenSolaris, illumos; other Unix-like systems: FreeBSD, FreeNAS, TrueOS, NetBSD, macOS, Linux; Windows NT (from Windows 10)

ZFS is a transactional file system developed by Sun Microsystems that contains numerous extensions for use in servers and data centers. These include the comparatively large maximum file system size, simple management of even complex configurations, integrated RAID functionality, volume management and checksum-based protection against data transmission errors. The name ZFS originally stood for Zettabyte File System, but is now a pseudo-acronym: the long form is no longer in use.

Properties

ZFS is a 128-bit copy-on-write file system with significantly expanded functionality compared to conventional file systems. With a conventional file system, exactly one file system manages exactly one partition. If several physical partitions are to be combined into logical partitions, additional logical volume manager software must be installed. To protect against failures, many file systems can additionally be protected by an optional software-based RAID subsystem (software RAID). ZFS combines these three functions and supplements them with checksum-based protection against data transmission errors.

Disk pools

In practice, logical units, so-called pools (or zpools), are first formed from physical storage devices; these pools can optionally be made fail-safe (RAID). Within a pool, any number of logical partitions (each with its own file system) can then be created; these grow dynamically as far as the size of the pool allows, but can also be shrunk. To enforce administrative limits, a minimum and maximum size can be specified for each logical partition. The logical partitions can be arranged in a hierarchical structure, within which these and other parameters can be inherited. It is also possible to expose data areas from pools as dedicated block devices (see also block-oriented devices).

The second special feature of ZFS is its particularly simple administration. Creating a pool spanning several hard disks and creating a partition on it requires only two simply structured commands (zpool create and zfs create). Partitioning, building the logical volume and finally mounting it into the existing file system hierarchy happen automatically, but can also be done manually if necessary.

Resilience

Redundancy

As with a classic volume manager, the underlying pools can be made fail-safe. For this purpose, redundancy groups are formed from several physical storage devices using software RAID; one or more of these redundancy groups then form a fail-safe pool. The RAID subsystem integrated into the file system has an advantage over classic hardware or software RAID implementations: it can distinguish between occupied and free data blocks, so only occupied disk space has to be mirrored when reconstructing a RAID volume. In the event of damage, this results in enormous time savings, especially for file systems that are not very full. ZFS offers several RAID levels. With mirroring (RAID-1), two (or more) hard drives form a mirrored redundancy group in which the data is stored twice or more. There are also implementations called RAID-Z. RAID-Z1 works similarly to RAID-5, and RAID-Z2 largely corresponds to RAID-6. With RAID-Z1, three (or more) hard disks form the redundancy group, with the data parity-protected as in a RAID-5 system, so that one of the hard disks can fail without data loss. Due to the integrated design of ZFS, however, in contrast to RAID-5, no battery-backed memory (NVRAM) is required, since there is no write hole between writing the data and writing the parity. The write-hole-free implementation of RAID-6, called RAID-Z2, has been available since Solaris Express build 47. Since July 2009, RAID-Z3, a RAID-Z implementation with triple parity, has also been available. Speed optimization through parallel access (RAID-0 striping) is carried out automatically by ZFS.
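
The single-parity scheme of RAID-Z1 works like RAID-5's XOR parity: the parity block is the XOR of the data blocks, so any one missing block can be recomputed from the rest. A minimal sketch (toy code, not the actual ZFS implementation, which uses variable-width stripes):

```python
from functools import reduce

def xor_parity(blocks: list) -> bytes:
    """XOR all blocks together; with data blocks this yields the parity block."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks))

def reconstruct(surviving: list, parity: bytes) -> bytes:
    """Recover the single missing block: XOR of the parity and all survivors."""
    return xor_parity(surviving + [parity])

# Three data disks plus one parity disk (a RAID-Z1-like geometry).
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
p = xor_parity([d0, d1, d2])

# Disk holding d1 fails; its contents are rebuilt from the remaining disks.
assert reconstruct([d0, d2], p) == d1
```

RAID-Z2 and RAID-Z3 add further parity blocks computed with different coefficients (Reed-Solomon-style), so two or three failures can be tolerated.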

Snapshots

Copy-on-write allows snapshots to be created very efficiently: this happens practically instantaneously, and the file system remains online. A snapshot freezes the current file system state; subsequent write operations represent the differences from the last snapshot. ZFS snapshots can be mounted read-only or archived (zfs send). There are also ZFS clones, which correspond to writable snapshots.
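
Why copy-on-write makes snapshots cheap can be illustrated with a toy model (this is not ZFS's actual on-disk structure; in ZFS a snapshot is essentially a retained copy of the root block pointer, which is O(1)):

```python
class CowStore:
    """Toy copy-on-write block store: snapshots keep frozen block maps,
    while the live file system keeps writing new block versions."""

    def __init__(self):
        self.blocks = {}     # live map: block number -> current data
        self.snapshots = []  # each snapshot is a frozen block map

    def write(self, blkno, data):
        # Overwrite only the live map; snapshots keep their frozen copies.
        self.blocks[blkno] = data

    def snapshot(self):
        # Freeze the current mapping (O(1) in real ZFS; copied here for clarity).
        self.snapshots.append(dict(self.blocks))
        return len(self.snapshots) - 1

fs = CowStore()
fs.write(0, b"v1")
snap = fs.snapshot()
fs.write(0, b"v2")                       # the live file system moves on ...
assert fs.snapshots[snap][0] == b"v1"    # ... the snapshot still sees v1
assert fs.blocks[0] == b"v2"
```

A writable clone would simply start a new live map from a snapshot's frozen map.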

Automatic data error correction

In addition to protecting data against hard drive failures, each individual block in the file system is provided with a checksum, so that data errors in the file system (e.g. caused by data transmission errors) can be detected automatically and, if redundancy is available, corrected without manual intervention. The loss of performance is minimal. ZFS also ensures that the state of the file system is consistent at all times, so no file system check (via fsck) is necessary even after a power failure.
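
The self-healing read path can be sketched as follows (a toy illustration, not ZFS code; ZFS stores the checksum in the parent block pointer and supports several checksum algorithms, including fletcher and SHA-256):

```python
import hashlib

def checksum(block: bytes) -> bytes:
    # SHA-256 stands in for whichever checksum the pool is configured with.
    return hashlib.sha256(block).digest()

def read_with_repair(copies: list, expected: bytes) -> bytes:
    """Return the first copy whose checksum matches, and rewrite the others
    from it (self-healing); fail only if every copy is corrupt."""
    for blk in copies:
        if checksum(blk) == expected:
            good = blk
            for j in range(len(copies)):   # repair the damaged mirror sides
                copies[j] = good
            return good
    raise IOError("all copies corrupt: unrecoverable data error")

data = b"important payload"
mirror = [b"importXnt payload", data]      # one mirror side is silently corrupt
assert read_with_repair(mirror, checksum(data)) == data
assert mirror[0] == data                   # the corrupt copy has been rewritten
```

The key point is that the checksum is verified on every read, so silent corruption is caught before bad data reaches an application.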

Deduplication

In October 2009, deduplication was released for ZFS. With it, blocks with identical content are physically stored only once, which saves disk space. A typical use case is hosting virtual hard disks for virtual machines, each of which contains an installation of a virtualized operating system. Another is removing redundant information from backups of the same kind. However, deduplication requires a lot of RAM, which gave ZFS a reputation for being resource-hungry. In OpenZFS, LZ4 compression is often the better choice; it is designed more for speed than for compression ratio and does not require additional memory.
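
The mechanism behind deduplication is a content-addressed block store: blocks are keyed by their checksum, and a reference count replaces repeated copies. A minimal sketch (illustrative only; the real deduplication table, the DDT, is what consumes the RAM mentioned above):

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical blocks are stored once."""

    def __init__(self):
        self.blocks = {}   # checksum -> (data, refcount); models the in-RAM DDT

    def put(self, data: bytes) -> bytes:
        key = hashlib.sha256(data).digest()
        stored, refs = self.blocks.get(key, (data, 0))
        self.blocks[key] = (stored, refs + 1)
        return key         # callers keep only the reference, not a copy

store = DedupStore()
k1 = store.put(b"identical OS image block")
k2 = store.put(b"identical OS image block")  # second write costs no extra space
assert k1 == k2
assert len(store.blocks) == 1                # one physical block ...
assert store.blocks[k1][1] == 2              # ... with two references
```

This also shows why the table must stay in fast memory: every write requires a lookup by checksum.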

Performance

ZFS is, moreover, a relatively fast file system; however, due to the integrated RAID functions and end-to-end checksums, its speed on older or slower systems cannot match that of simpler file systems. The performance of ZFS also depends on which RAID functionality is used and on whether the individual disks can transfer data independently and simultaneously.

Data capacity

ZFS is designed for very large amounts of data, which is achieved through the consistent use of 128-bit pointers. In practice, however, the limits are comparable to those of a 64-bit file system: the implementations on Solaris and, for example, FreeBSD use 64-bit data types, since there are currently no 128-bit data types in C that are usable across different architectures and compilers. Essentially, the first 64 bits of each pointer are stored together with 64 zero bits, which are ignored during processing. This allows existing file systems to be used later as true 128-bit file systems. The capacity of ZFS is designed to last forever.
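
The practical and theoretical limits quoted in this article can be checked with a few lines of arithmetic (variable names are illustrative):

```python
# Limits of the current 64-bit implementations vs. the 128-bit design.
EiB = 2 ** 60

max_file       = 2 ** 64 - 1   # maximum size of a single file, in bytes
max_filesystem = 2 ** 64       # per-file-system limit of the 64-bit implementations
max_pool       = 2 ** 128      # theoretical limit of the 128-bit pool design

assert max_filesystem == 16 * EiB       # 2^64 bytes = 16 EiB
assert max_pool == (2 ** 64) ** 2       # the 128-bit design squares the 64-bit limit
print(f"16 EiB = {max_filesystem:,} bytes")
```

The gap between 16 EiB and 2^128 bytes is exactly the headroom reserved by the 64 zero bits described above.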

Further development

Sun developed ZFS for the Solaris operating system from 2001 and officially released it in 2006 with Solaris 10 6/06, including commercial support. In addition, Sun made ZFS available under the Common Development and Distribution License (CDDL) for OpenSolaris (from build 27a). ZFS was available on all architectures supported by Solaris: SPARC and IA-32 (i.e. 32-bit x86 and 64-bit x86, x64). The project was designed and implemented by a Sun team under the direction of Jeff Bonwick.

Based on the Sun release, ZFS was ported to FreeBSD early on by Pawel Jakub Dawidek with the support of Sun developers, and has been included in the base system since FreeBSD 7.0 (released in early 2008), where it was initially classified as experimental; as of FreeBSD 8.0 (late 2009) it is considered stable.

Apple, too, integrated ZFS support, initially read-only, into Mac OS X Leopard (released in late 2007). Full support was announced for the server edition of version 10.6 (Snow Leopard, 2009), but never shipped. Instead, source code and binaries of the ZFS port were published on Apple's open source project website Mac OS Forge. On October 23, 2009, Apple announced there that the ZFS project had been discontinued. Don Brady, who had been responsible for ZFS development at Apple, founded the company Ten's Complement after leaving Apple and continued to develop the file system there under the name ZEVO. In 2012 the company was taken over by GreenBytes, which in turn was acquired by Oracle in 2014. Although ZEVO was paid commercial software, GreenBytes released a free community edition in 2012.

Direct support within the Linux kernel is problematic due to licensing issues, so there is no ZFS implementation integrated into the official kernel sources. However, the ZFS on FUSE project created an implementation that made ZFS usable under Linux as well. It ran in userspace, however, and therefore had various disadvantages, including reduced data throughput. ZFS on FUSE has not been developed further since 2012; the last version, 0.7.0, was released on March 9, 2011. Its successor is the OpenZFS port ZFS on Linux.

OpenZFS

Under the name OpenZFS, an effort was started in September 2013 to bring together all developments independent of Sun and Oracle in a single project. The file system is also to be standardized across operating systems.

Existing developments, such as the one from FreeBSD, which in turn is based on Sun's CDDL release, served as the basis. OpenZFS is therefore largely, but not completely, compatible with Oracle ZFS. OpenZFS has individual development branches for the various operating systems:

The free operating system illumos, which emerged as a spin-off of OpenSolaris, serves as the basis for the OpenZFS ports. New functions and further developments are committed directly to illumos and adopted by the ports, and thus made available across operating systems.

ZFS is also integrated in FreeBSD and can be used optionally. Some distributions even use ZFS as their primary file system, such as FreeNAS. An enterprise version of this distribution, called TrueNAS, is also available, optionally with certified hardware. The company iXsystems maintains both versions.

While ZFS on FUSE, developed in 2006, was originally the only option under Linux, an alternative has since become available with OpenZFS. Under the name ZFS on Linux, the necessary kernel modules are maintained outside the kernel source tree. Since this implementation runs in kernel space, the disadvantages that previously resulted from the (unavoidable) use of FUSE no longer apply. According to the developers, the project has been ready for production use since version 0.6.1, released in April 2013. It has been included in Canonical's Linux distribution Ubuntu since version 16.04 ("Xenial Xerus"), but must be installed by the user; the OpenZFS binary packages are obtained directly from the official repository.

For Apple's BSD-based macOS there is also a port, OpenZFS on OS X (OS X was the name of macOS until 2016), which makes ZFS usable as a file system driver in macOS from Mountain Lion (version 10.8, 2012) onwards.

The port to Microsoft Windows is called ZFSin. The first alpha version, ZFSin 0.1, was released on September 20, 2017 for Windows 10 x64.

ZFS from Oracle

Since Oracle's takeover of Sun (2009 to 2010), further development of ZFS has continued within Solaris. However, since this development is not public, it is difficult to tell how much effort Oracle actually puts into it.

Technical specifications

Word length: 128 bit
Volume manager: integrated
Resilience: RAID-1, RAID-Z1 (single parity, ~ RAID-5), RAID-Z2 (double parity, ~ RAID-6) and RAID-Z3 (triple parity), all integrated
Max. file system size: 16 EiB (= 2^64 bytes)¹
Max. number of files in a file system: 2^48
Max. size of a file: 16 EiB (= 2^64 bytes)¹
Max. size of each pool: 2^128 bytes
Max. number of files in a directory: 2^48
Max. number of devices in a pool: 2^64
Max. number of file systems in a pool: 2^64
Max. number of pools in a system: 2^64
  ¹ These limits result only from the current implementations. By design, the file system could address much larger amounts of data.

Criticism

ZFS was designed for server and data center use, and its strengths lie there; this sometimes results in disadvantages when it is used on workstations and embedded systems.

Processing the 128-bit pointers (see Properties) is comparatively complex, because their width does not match the word size of current CPUs: typically 32 bits in appliances and older personal computers, and 64 bits in current desktop computers and most servers. Performance on such systems is therefore not optimal. In general, the 128-bit design is only advantageous where unusually large amounts of data are to be stored. In the SOHO sector, on the other hand, 32- or 64-bit file systems are sufficient for the amounts of data involved, depending on the size of the storage media (see Btrfs, ext2, FAT32, HFS+, NTFS, UFS, etc.): with 32-bit data types, file systems of almost 16 terabytes can be managed (e.g. ext2), and with 64-bit pointers considerably more, e.g. approx. 8 exabytes (8 million terabytes) with XFS. There, the 128-bit design only adds computing time and a slightly increased space requirement on the medium.
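
The capacity figures above follow directly from pointer width and block size; a quick check (illustrative arithmetic, assuming the common 4 KiB block size for ext2):

```python
# Per-file-system capacity limits implied by pointer width.
TiB, EiB = 2 ** 40, 2 ** 60

ext2_max = 2 ** 32 * 4096   # 32-bit block numbers x 4 KiB blocks
xfs_max  = 2 ** 63          # usable range of XFS's signed 64-bit design

assert ext2_max == 16 * TiB   # "almost 16 terabytes"
assert xfs_max == 8 * EiB     # "approx. 8 exabytes"
```

So even the 32-bit case comfortably covers typical SOHO storage sizes, which is the point of the criticism.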

ZFS uses copy-on-write and a journal (ZIL, ZFS intent log). As a result, ZFS can always fall back on a consistent file system state. After crashes such as a power failure, no block backups, restores or file system checks are necessary. Inconsistencies in metadata and data are detected automatically on every read and, where redundant information is available, corrected automatically. However, as with all file systems of this kind, performance decreases noticeably once occupancy exceeds about 80%.

Trivia

The following quote about the theoretical capacity of ZFS is widely circulated:

"Populating 128-bit file systems would exceed the quantum limits of earth-based storage. You couldn't fill a 128-bit storage pool without boiling the oceans."

- Jeff Bonwick, ZFS chief developer

To understand the quote, note that storing or transmitting a unit of information, e.g. a bit, is tied to storing or transmitting energy, since information cannot exist without a medium, i.e. information is bound to the existence of distinguishable states. Filling a storage pool with 128-bit addressing would require more energy than is needed to evaporate the Earth's oceans. At the same time, "boiling the ocean" is an English idiom for attempting something impossible. Bonwick thus illustrates that ZFS offers enough capacity for any future need.

Comment

The minimum energy required to store 2^128 bytes is (at 20 °C) about 2100 TWh, and it follows not from quantum mechanics but from thermodynamics. It goes back to Boltzmann and Planck (see Boltzmann constant). This amount of energy would evaporate almost 3 cubic kilometers of water at 20 °C, considerably less than a millionth of the water in the oceans.
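
Both figures can be reproduced from the Landauer limit, E = N · k_B · T · ln 2, and the heat of vaporization of water (constants below are standard physical values; the water figure assumes heating from 20 °C to 100 °C plus vaporization):

```python
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K
T   = 293.15         # 20 degrees C in kelvin

bits   = 8 * 2 ** 128                    # bits needed to fill a 2^128-byte pool
energy = bits * k_B * T * math.log(2)    # Landauer minimum, in joules

twh = energy / 3.6e15                    # 1 TWh = 3.6e15 J
assert 2000 < twh < 2200                 # the ~2100 TWh figure from the text

# Energy to heat water from 20 C to 100 C and vaporize it: ~2.59 MJ/kg
per_kg = 4186 * 80 + 2.257e6
km3 = energy / per_kg / 1e12             # 1 km^3 of water is about 1e12 kg
assert 2.5 < km3 < 3.5                   # "almost 3 cubic kilometers"
```

So the quoted 2100 TWh and the roughly 3 km³ of water are mutually consistent.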

