Long-term archiving
Long-term archiving (LTA) is the acquisition, long-term storage, and preservation of the permanent availability of information. New problems arise especially in the long-term archiving of digitally available information (digital preservation). For the preservation of digital resources, “long-term” does not mean issuing a guarantee for five or fifty years, but rather the responsible development of strategies that can cope with the constant change caused by the information market.
Definition
A generally applicable definition of the term does not yet exist. Since archives in principle keep their holdings “for eternity”, the term long-term archive is a pleonasm and, according to the contribution by Reinhard Altenhöner and Sabine Schrimpf, it also suggests a static state. Both therefore advocate the term “long-term availability” (LZV).
Many of the problems of digital long-term archiving, such as large version jumps in the software used, only occur after about ten years, so this period is used as the threshold above which archiving is considered long-term. Long-term archiving is also to be distinguished from data backup.
Problems
While physical objects have long been kept and preserved in archives, museums and libraries, electronic publications raise completely new problems. If data is stored in analog form, its quality deteriorates as the medium degrades, which is why the focus is on maintaining the medium. Digitally stored data, by contrast, can be reconstructed through suitable encoding when the medium has only small errors, so that constant data quality can be guaranteed despite the deterioration of the medium. If the errors in the medium become too large, the data can no longer be completely reconstructed and is irretrievably lost ("digital forgetting"). In the long-term archiving of digital data, the focus is therefore no longer on preserving the medium but on copying it in good time, before data is lost. Because the media (e.g. magnetic tape and DVD), formats and read/write devices for digital storage change rapidly, regular testing and continuity across these changes require constant attention and long-term planning. Proprietary formats and copyright restrictions, among other things, cause problems when transferring data to new systems.
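The idea that redundancy lets small media errors be repaired, up to a limit, can be illustrated with a deliberately simple repetition code. This is only a sketch: the function names are hypothetical, and real storage media use far more efficient error-correcting codes than storing everything three times.

```python
from collections import Counter


def encode(data: bytes, copies: int = 3) -> list:
    """Store the payload several times (a simple repetition code)."""
    return [bytes(data) for _ in range(copies)]


def decode(stored: list) -> bytes:
    """Rebuild the payload by majority vote at each byte position.

    Raises ValueError when no value has a strict majority at some
    position, i.e. when the damage is too large to repair.
    Note: if most copies carry the same wrong value, the vote
    silently returns that wrong value.
    """
    length = len(stored[0])
    if any(len(copy) != length for copy in stored):
        raise ValueError("copies differ in length")
    out = bytearray()
    for position in range(length):
        votes = Counter(copy[position] for copy in stored)
        value, count = votes.most_common(1)[0]
        if count <= len(stored) // 2:
            raise ValueError(f"byte {position} cannot be reconstructed")
        out.append(value)
    return bytes(out)
```

With three copies, one damaged copy per byte position is repaired; once two copies disagree with the third in different ways, decoding fails, which corresponds to the point at which data is lost for good.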
Shelf life of the carrier media
While old parchment and paper can be kept for hundreds of years if stored properly, this does not apply to newer storage media. Most publications from the first half of the 20th century are printed on paper that is being degraded by acid corrosion. Older printed works and manuscripts pose other problems: if iron gall ink was used, an unbalanced mixture of the ink's ingredients can cause ink corrosion. This occurs when the ink contains an excess of gallic acid or of vitriol. The cellulose is attacked in a similar way to acid corrosion, and the paper can break along the lines of writing as moisture levels vary.
Analog films, photos and magnetic tapes also have a limited shelf life. The service life of digital storage media such as floppy disks, hard drives and burned CDs/DVDs is even shorter. Digital data carriers lose their data in one of two ways: through environmental influences (for example, sufficiently strong magnetic fields near floppy disks and magnetic tapes), or because chemical or physical influences alter the medium so much that no more data can be written to it or data already written can no longer be read (for example, when CD-ROMs have been exposed to UV radiation for long enough). Often, however, data becomes unreadable only because the appropriate reading devices and programs are no longer available, because older data format standards can no longer be interpreted, or because the technical interfaces of very old reading devices are no longer supported. To avoid these problems, it can make sense to convert selected electronically stored data (back) into non-electronic form and, as a modern equivalent of our ancestors' practice of chiselling important records in stone, to engrave them on an almost indestructible nickel plate with an ion beam.
Another method of permanently storing images and texts in analog, legible form is to fire them onto stoneware slabs using ceramic pigments. The Memory of Mankind (MOM) project records images of museum cultural assets as well as everyday cultural products on stoneware slabs and stores them in chambers in the salt mine at Hallstatt. The theoretical durability is given as hundreds of thousands of years. The durability of ceramic data carriers has been demonstrated for at least 5,000 years (cuneiform tablets).
Medium | Expected life | Recording density (kbit/kg) |
---|---|---|
Ceramic panels | 5,000 years (confirmed), probably several tens of thousands of years | |
Stoneware panels with fired ceramic color printing | several hundred thousand years if protected against erosion (presumed) | |
Stone tablets and stone paintings | several thousand years (confirmed) | 1 × 10⁻³ to 1 |
Nickel plate | several thousand years (presumed) | |
Books and manuscripts made from acid-free paper and with acid-free and non-ferrous ink | several hundred years (confirmed) | 3 × 10³ to 3 × 10⁴ |
Books and manuscripts made from acidic paper (especially printed works from the 19th and early 20th centuries) | 70 to 100 years | |
Newsprint | as for acidic or acid-free book paper | |
Films on celluloid (cellulose nitrate) | more than 100 years (confirmed), probably up to 400 years | |
Films on cellulose triacetate | 44 years (confirmed) | |
Films on polyethylene terephthalate (PET) | color film up to 150 years (presumed); black-and-white film up to 700 years (presumed) | |
Optical storage media (burned) | | |
Optical storage media (pressed) | | |
Diskettes as archive media (stored) | 10 to 30 years (depending on data density?) | |
Hard disk drives in operation | 2 to 10 years, depending on daily operating time; 5 years on average | |
Hard disk drives as archive media (stored) | 10 to 30 years | |
USB stick | 30 years | |
SD card | 30 years | |
Magnetic tapes | more than 30 years (confirmed) | |
Magneto-optical disk (MO disk) | 30 to 50 years | |
Iomega REV removable drive | up to 30 years (presumed) | |
Rapid media and system change
Digitally stored information in particular faces the additional problem that the data can become inaccessible even though the medium itself has been preserved.
Readability of the storage medium
To access stored information, the respective carrier medium must be readable. With some media, such as stone tablets or books, a person can do this without aids. Digital media usually require an appropriate reading device, often a drive. If reading devices are no longer available, for example because of technological change, the data can be read out only with difficulty or not at all. Outdated tape formats are one example.
Outdated data formats
Even if the storage medium has been preserved and is still readable, it may be impossible to access the stored data. Because digitally stored data is not directly accessible but is digitally encoded and structured in a media-specific manner, it can only be read if a program and an operating system are available that “understand” the content of a file. Since many operating systems and programs use their own (proprietary) methods to encode data, readability can no longer be guaranteed once an operating system or program is no longer maintained. The problem is exacerbated by the policy of many software manufacturers of publishing new program versions with changed storage formats that can no longer fully read the older formats of the same program.
Other restrictions
Proprietary systems and copyright restrictions make it difficult to copy and migrate data, as long-term archiving requires, because the necessary steps are either not known or not permitted. The introduction of digital rights management (DRM) in particular is expected to exacerbate the problem. A set of rules for digital data and documents is nevertheless necessary because, just as with conventional material, copyright issues must be clarified before archiving. The difference between conventional and electronic documents is that, for the latter, copy and original are practically indistinguishable. Migration in particular requires making copies and, where necessary, changing the original documents, so the author's consent to such measures must be obtained in advance. Further copies handed out to readers are to be appropriately remunerated and, where free forwarding is not permitted, marked with blocking notices.
Finding information
It is not enough simply to copy the original data: it must also be possible to find it again on the new medium. Additional data describing the structure and content of the original data, so-called metadata, must therefore be entered in catalogs, databases or other finding aids so that it is available for later retrieval or search.
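As a rough illustration of such a finding aid, the following sketch keeps one metadata record per archived object and searches the records by title or keyword. The record fields, class name and function name are assumptions chosen for the example and are not modeled on any particular metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical metadata record describing one archived object."""
    identifier: str
    title: str
    file_format: str
    location: str  # where the object is stored on the current medium
    keywords: list = field(default_factory=list)


def search(catalog: list, term: str) -> list:
    """Return records whose title contains the term or whose keywords match it."""
    needle = term.lower()
    return [
        record
        for record in catalog
        if needle in record.title.lower()
        or any(needle == keyword.lower() for keyword in record.keywords)
    ]
```

Because the catalog is separate from the stored objects, it has to be migrated together with them; otherwise the copied data exists but can no longer be found.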
Data consistency
An often overlooked problem with long-term archiving as well as with short-term archiving is checking that the data is free of errors. Data can be modified on purpose, but it can also be changed unnoticed by system errors.
One way out is distributed storage at different locations in different organizations, protected by cryptographic checksums that are themselves stored in a distributed manner. This is practiced, among others, by the open-source solution LOCKSS. In Germany, the LuKII project also meets this requirement.
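The principle of comparing cryptographic checksums across independently held copies can be sketched as follows. This is not the LOCKSS protocol, only a minimal illustration that assumes the copies from all sites can be read in one place; the site names and the `audit` function are hypothetical.

```python
import hashlib
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Cryptographic checksum of one stored copy."""
    return hashlib.sha256(data).hexdigest()


def audit(copies: dict) -> dict:
    """Map each site name to True if its copy matches the majority checksum.

    Raises ValueError when no checksum is held by a strict majority,
    since the comparison alone then cannot say which copy is intact.
    """
    digests = {site: sha256_hex(data) for site, data in copies.items()}
    majority, count = Counter(digests.values()).most_common(1)[0]
    if count <= len(copies) // 2:
        raise ValueError("no majority checksum; manual inspection needed")
    return {site: digest == majority for site, digest in digests.items()}
```

A site reported as `False` would then be repaired from one of the sites that agree. The checksum detects both deliberate modification and unnoticed system errors, but only the distribution across sites makes it possible to decide which copy to trust.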
Procedure
In electronic archiving, a basic distinction can be made between methods of migration/conversion and emulation.
Using open standards, such as graphic formats (TIFF, PNG, JFIF) or free document formats (XML, PDF/A, OpenDocument), which are considered relatively long-lived and whose structure is publicly known, lengthens the cycles after which stored data has to be reformatted. The probability that systems and programs able to read such data will still exist in a few years' time is therefore significantly higher.
To prevent the loss of data through aging of data carriers, the data must be copied to new data carriers regularly, within the period for which the medium guarantees data security. This also makes it possible to switch to a new carrier format as soon as the one previously used has become obsolete through technical developments.
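Because the structure of these formats is publicly documented, even a very small program can recognize some of them from their first bytes, for example when taking stock of what an archive holds. The sketch below checks a few well-known file signatures; the `identify` function is a hypothetical helper, and it covers only a subset of the formats named above.

```python
from typing import Optional

# Leading bytes ("magic numbers") of a few openly documented formats.
SIGNATURES = {
    "PNG": (b"\x89PNG\r\n\x1a\n",),
    "TIFF": (b"II*\x00", b"MM\x00*"),  # little-endian and big-endian variants
    "JPEG": (b"\xff\xd8\xff",),        # JFIF files start with this JPEG marker
    "PDF": (b"%PDF-",),                # does not tell PDF/A from other PDF
}


def identify(header: bytes) -> Optional[str]:
    """Return the format name whose signature starts the header, else None."""
    for name, magics in SIGNATURES.items():
        if header.startswith(magics):
            return name
    return None
```

A signature match only indicates the container format. Confirming that a file is valid PDF/A, for instance, requires a full validator and is outside the scope of this sketch.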
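The copying rule can be expressed as a simple date check. The following sketch assumes that the write date and the guaranteed period of each medium are recorded; the safety margin and the function name are illustrative assumptions, not values taken from any standard.

```python
from datetime import date


def refresh_due(written: date, guaranteed_years: int, today: date,
                margin_years: int = 1) -> bool:
    """True once the medium is within margin_years of the end of its guaranteed period."""
    deadline_year = written.year + guaranteed_years - margin_years
    try:
        deadline = written.replace(year=deadline_year)
    except ValueError:
        # written on 29 February and the deadline year is not a leap year
        deadline = written.replace(year=deadline_year, day=28)
    return today >= deadline
```

For example, `refresh_due(date(2010, 1, 1), 10, date(2019, 1, 1))` reports that a medium written in 2010 with a ten-year guarantee should be copied by the start of 2019 when a one-year margin is used.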
However, the high costs of maintaining data holdings in this way mean that only the most important data can be preserved. Today's flood of data and metadata, produced not least by the steadily increasing use of digital data processing systems, makes it even harder to decide which data is worth storing. The proportion of data stored for the long term will necessarily be relatively small, which places high technical and subject-specific demands on the selection of the information to be preserved. An additional problem is the widening gap between data volume and data bandwidth: the volume grows significantly faster than the bandwidth available to transfer data from one medium to another.
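The effect of volume outgrowing bandwidth can be made concrete with a small calculation. All figures and growth rates passed to the functions below are hypothetical inputs chosen by the caller; the sketch only shows that the time needed for one complete copy grows whenever volume grows faster than bandwidth.

```python
def copy_days(volume_tb: float, bandwidth_mb_s: float) -> float:
    """Days needed to copy the whole holding once at a sustained rate."""
    seconds = volume_tb * 1_000_000 / bandwidth_mb_s  # 1 TB = 1,000,000 MB
    return seconds / 86_400


def copy_days_after(years: int, volume_tb: float, bandwidth_mb_s: float,
                    volume_growth: float, bandwidth_growth: float) -> float:
    """Copy time after compound annual growth of volume and bandwidth."""
    future_volume = volume_tb * (1 + volume_growth) ** years
    future_bandwidth = bandwidth_mb_s * (1 + bandwidth_growth) ** years
    return copy_days(future_volume, future_bandwidth)
```

If both quantities grow at the same rate, the copy time stays constant. With, say, 40 % annual volume growth against 20 % bandwidth growth, each migration takes longer than the one before, which eventually limits how much of a holding can be migrated within the lifetime of its medium.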
This doesn't just affect government and commercial data. In the private sector, too, conventional media, which can often be stored for a long time, are being replaced by more manageable digital media (photographs and negatives by digital images on a CD-ROM).
In Germany, the legal deposit libraries and the archives are responsible for long-term archiving.
See also
- ArchiSig and ArchiSafe - German government projects for long-term archiving
- Barbarastollen - Largest European archive for long-term archiving
- Information lifecycle management
- Internet archives, web archiving
- KEO - satellite as a time capsule
- Microfilm - An alternative and older form of long-term archiving
- Memory of Mankind
- OPENARCHIVE - open source long-term archive software
- Long-term study
- Rosetta Stone - example of long-term archiving over centuries, modern attempt at improvement in the Rosetta project
Literature
- Holger Schneider: Digital Amnesia: Long-term archiving of digital documents in a business environment, Herne 2012, ISBN 978-3844811445
- Reinhard Altenhöner, Sabine Schrimpf: Preservation and long-term availability of digital resources: strategy, organization and techniques. In: Rolf Griebel, Hildegard Schäffler and Konstanze Söllner (eds.): Praxishandbuch Bibliotheksmanagement. De Gruyter Saur, Berlin 2014, ISBN 978-3-11-030293-6, pp. 850–872.
- Ralf Blittkowsky: Archiving the calculation formulas? In: Telepolis. Heise-Verlag, February 14, 2004.
- Uwe M. Borghoff, Peter Rödig, Jan Scheffczyk, Lothar Schmitz: Long-term archiving. Dpunkt Verlag, 2003, ISBN 3-89864-245-3.
- Bernward Helfer, Karl-Ernst Lupprian (arr.): File formats. Properties and suitability for archiving electronic documents. A handout for archivists. Wiesbaden and Munich 2004.
- Georg Hohmann: Digital Eternity and Virtual Museums. In: Telepolis. Heise-Verlag, October 30, 2003.
- Ulrich Kampffmeyer, Jörg Rogalla: Principles of electronic archiving. VOI Compendium Volume 3. VOI Association Organizational and Information Systems e. V., Darmstadt 1997, ISBN 3-932898-03-6.
- Heike Neuroth, Achim Oßwald, Regine Scheffel, Stefan Strathmann, Mathias Jehn: nestor Handbook: A small encyclopedia of digital long-term archiving, Hülsbusch, May 2009, ISBN 3-940317-48-9.
- Roy Rosenzweig: Scarcity or Abundance? Preserving the Past in a Digital Era. In: American Historical Review 108, no. 3, June 2003, pp. 735–762.
- Sabine Schrimpf, Tobias Steinke: Long-term Archiving Policy of the German National Library , Version 1.2, as of May 4, 2018.
- Katherine Skinner, Matt Schultz: A Guide to Distributed Digital Preservation (PDF, 156 pp.; 3.1 MB), Educopia Institute, Atlanta 2010, License: CC-BY-NC-ND-3.0, ISBN 978-0-9826653-0-5.
- Ute Schwens, Hans Liegmann: Long-term archiving of digital resources . In: Rainer Kuhlen, Thomas Seeger, Dietmar Strauch (eds.): Basics of practical information and documentation. 5th, completely revised edition. Munich: Saur, 2004.
- Thorsten Wetzenstein: Digital long-term archiving under the aspect of access. Thesis. University of Heidelberg. 2010.
- Federal Office for Information Security (Ed.), IT Baseline Protection Manual. (Here: Section M 4.170: Selection of suitable data formats for archiving documents, status 2007).
- BSI technical guideline 03125: Preservation of evidential value of cryptographically signed documents.
- Digital Preservation Tutorial. Cornell.
- Digital archiving of photographic collections - a basic report by the University of Basel and the Swiss Cultural Property Protection Agency.
- Guidelines for the Preservation of Digital Heritage. UNESCO, March 2003.
Web links
- NESTOR competence network for long-term archiving of digital sources in Germany
- kopal - Cooperative development of a long-term archive of digital information
- The coordination office for the permanent archiving of electronic documents of the Swiss Archives Association, including details on archive formats
- DFG Practice Rules "Digitization"
References
- Ute Schwens, Hans Liegmann: Long-term archiving of digital resources. In: Rainer Kuhlen, Thomas Seeger, Dietmar Strauch (eds.): Basics of practical information and documentation. 5th, completely revised edition. Munich: Saur, 2004, p. 567.
- Reinhard Altenhöner, Sabine Schrimpf: Preservation and long-term availability of digital resources: strategy, organization and techniques. In: Rolf Griebel, Hildegard Schäffler and Konstanze Söllner (eds.): Praxishandbuch Bibliotheksmanagement. De Gruyter Saur, Berlin 2014, ISBN 978-3-11-030293-6, pp. 850–872.
- Lothar Schmitz, Uwe M. Borghoff, Peter Rödig, Jan Scheffczyk: Long-term archiving. In: Informatik-Spektrum. Vol. 28, no. 6, December 1, 2005, ISSN 1432-122X, p. 489, doi:10.1007/s00287-005-0039-7.
- Archive DVDs in the long-term test, c't-Archiv, 16/2008, page 116. In: heise.de. August 16, 2011, archived from the original on July 23, 2008; accessed on February 20, 2015.
- mp: A uniform standard for the flood of digital data. March 10, 2008, accessed October 27, 2012.
- Michael W. Gilbert: Digital Media Life Expectancy and Care. University of Massachusetts Amherst, 1998, archived from the original on December 22, 2003; accessed on January 4, 2011.
- Bit Rot. Software Preservation Society, May 7, 2009, accessed January 4, 2011.
- Google study on the cause of failure of hard drives. In: heise.de. February 16, 2007, accessed February 20, 2015.
- Google study on the durability of hard drives in continuous operation (archived February 13, 2009 in the Internet Archive) (PDF; 247 kB): Section 3.1, Figure 2 (English).
- UNC: Hard disks & flash memory: fun with risk potential. In: speicherguide.de. June 29, 2006, archived from the original on September 24, 2015; accessed on September 17, 2015.
- Shelf life of storage media: where data is right. In: netzwelt.de. April 22, 2007, accessed February 20, 2015.
- Andreas Hitzig: Hard disk, SSD, USB & Co. - What is the best memory? In: PC World Online. March 20, 2017, accessed November 22, 2019.
- Henrik Stamm: MO technology. Institute for Computer Science at Humboldt University Berlin, May 26, 2001, accessed on September 17, 2015.
- Hartmut Gieselmann: DVDs in the long-term test - c't. In: heise.de. July 21, 2008, accessed February 20, 2015.
- Uwe M. Borghoff et al.: Long-term archiving. Methods for preserving digital documents. dpunkt.-Verl., Heidelberg 2003, p. 21.
- Frank Dickmann: Long-term archiving of research data: How do you deal with peta and exabytes? In: Deutsches Ärzteblatt Supplement: Practice. Vol. 108, no. 41, 2011, pp. 6–8 (uni-goettingen.de [accessed March 24, 2020]).
- Heike Neuroth, Stefan Strathmann, Achim Oßwald, Regine Scheffel, Jens Klump, Jens Ludwig (eds.): Long-term archiving of research data. An inventory. Verlag Werner Hülsbusch, Universitätsverlag Göttingen, Boizenburg 2012, ISBN 978-3-86488-008-7, p. 16, urn:nbn:de:hbz:79pbc-opus-4204 (th-koeln.de [PDF]).
- Asko Lehmuskallio, Edgar Gómez Cruz: Why material visual practices? In: Digital Photography and Everyday Life: Empirical Studies on Material Visual Practices. Routledge, 2016, ISBN 978-1-317-44778-8, p. 1.
- Natascha Schumann: Introduction to digital long-term archiving. Scivero Verl., 2012, ISBN 978-3-944417-00-4, p. 46 (ssoar.info [accessed March 24, 2020]).