IBM General Parallel File System

GPFS is an abbreviation for General Parallel File System, a cluster file system from IBM. It arose from several research projects on parallel file systems and has been sold under several trade names:

  • IBM General Parallel File System
  • Elastic Storage
  • Spectrum Scale

History

GPFS emerged from the IBM research projects "Tiger Shark File System" and "Vesta File System" and was originally referred to as a "multimedia" file system, a designation that can still be found in internal names today. It quickly became apparent that GPFS was particularly suitable for high-performance computers because of its parallel architecture. In 1998 GPFS appeared as an official IBM product and the successor to Vesta/PIOFS, as a POSIX-compliant file system.

It became the file system behind the ASCI White and ASCI Purple supercomputers at the Lawrence Livermore National Laboratory. It was later ported to other operating systems, including Linux and Windows.

Other network protocols such as Windows CIFS were supported over time. Originally the file system behind large storage installations, GPFS was later sold as a software product independently of the hardware. Capabilities such as shared-nothing clusters have more recently been added. On July 14, 2014, IBM announced a cloud service called Elastic Storage. On February 17, 2015, IBM renamed GPFS to Spectrum Scale.

GPFS in supercomputing

GPFS is used as a cluster file system with high read/write bandwidth in several installations on the TOP500 list of supercomputers.

Functions

IBM also offers integrated storage systems, consisting of hardware and software, that run GPFS under the Linux operating system.

GPFS/Spectrum Scale has the following functional properties:

  • Several NAS computers can mount a cluster volume for writing at the same time (in parallel), so the file system scales to a large number of clients.
  • Striping, and thus parallel reading and writing, is supported at the level of both the mass storage devices and individual files. This parallelism enables very high throughput rates.
  • Distributed lock manager: parallel writing to the file system is coordinated so that a file can only be written by one process at a time (a minimal application-level sketch follows this list).
  • Metadata and data can be distributed across different disks to improve performance.
  • Several GPFS servers (also called nodes) work together as a highly available cluster, so failures are tolerated.
  • As of version 4.1, GPFS can also operate as a shared-nothing cluster (FPO, File Placement Optimizer) and can thus serve as a Hadoop Distributed File System (HDFS).
  • Very large limits for file size (8 EB), directory size, file system size (8 YB), and number of files per file system (2^64).
  • Support for HSM (Hierarchical Storage Management).
  • Volumes can be shared via the CIFS and NFS protocols at the same time, and from version 4.1 also as a Hadoop Distributed File System.
  • Access control works with POSIX file permissions for NFS (Unix systems) and with ACLs for CIFS (Windows systems). These access rights can be managed independently of one another.
  • The file system works according to the copy-on-write principle. Similar to Windows "shadow copies", snapshots can be accessed from any exported directory, both via NFS and via CIFS.
  • Asynchronous replication between different GPFS volumes is possible (Active File Management).
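
Because GPFS presents a POSIX-compliant interface, the coordination of concurrent writes described above is visible to applications through ordinary POSIX calls. The following minimal sketch is not GPFS-specific: it only illustrates how cooperating processes could each lock and write their own byte range of one shared file, while on GPFS the distributed lock manager performs the equivalent coordination across cluster nodes. The mount point, file name, and record size below are hypothetical placeholders.

```python
import fcntl
import os

# Hypothetical file on a GPFS/Spectrum Scale mount (placeholder path).
PATH = "/gpfs/fs1/shared.dat"
RECORD_SIZE = 4096  # each writer owns one 4 KiB region of the file


def write_region(rank: int, payload: bytes) -> None:
    """Lock and write the byte range belonging to writer `rank`."""
    offset = rank * RECORD_SIZE
    fd = os.open(PATH, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Exclusive POSIX byte-range lock on this writer's region only;
        # writers targeting other regions are not blocked.
        fcntl.lockf(fd, fcntl.LOCK_EX, RECORD_SIZE, offset, os.SEEK_SET)
        os.pwrite(fd, payload.ljust(RECORD_SIZE, b"\0"), offset)
        fcntl.lockf(fd, fcntl.LOCK_UN, RECORD_SIZE, offset, os.SEEK_SET)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Example: the process acting as writer 0 fills the first region.
    write_region(0, b"hello from writer 0")
```

Since each writer locks only its own region, processes writing disjoint regions can proceed in parallel, which is the application-level behaviour that GPFS's striping and distributed locking are designed to serve.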
