Dark data

from Wikipedia, the free encyclopedia

Dark data refers to data that is recorded and stored by information systems but not used. With large amounts of data (big data), much of it may never be analyzed, viewed, or used, to the point that its owners are no longer aware that it exists.

Reasons for Dark Data

Dark data can emerge, or be deliberately tolerated, for a variety of reasons. These include, for example:

  • All data is to be backed up and archived, regardless of how often it is used
  • Legal and security reasons (e.g. an obligation not to delete certain data)
  • Superfluous or incorrect data has been created and fades into the background
  • Data is linked to other data but is not used itself
  • Data cannot be found, or is hidden, corrupted, or encrypted, and is therefore ignored
  • Data is to be kept for later analysis and is therefore not considered any further for now (time delay or waiting for better technologies)
  • Outdated data and data remnants (data that has not been kept up to date is classified as irrelevant and then forgotten or ignored)
  • Storage capacity keeps growing and is used more heavily, and data can be compressed better
  • Lack of search, classification, sorting, and categorization of data
  • High cost and time expenditure for evaluation

Importance and outlook

IBM estimates that approximately 90% of the data generated by sensors and analog-to-digital converters is never used. On average, most companies analyze only about 1% of their data. In companies, this is mostly due to the sheer volume of data, which can no longer be managed and would be very costly to process. According to Computer Weekly, 60% of the organizations surveyed believed that their business intelligence systems were inadequate, and 65% said that their content management was very disorganized. In addition, according to The New York Times, data centers waste around 90% of the energy they consume, so dark data adds to the burden on the environment and causes additional costs.

For these reasons, many companies are trying to evaluate dark data with artificial intelligence. A well-known example is Watson from IBM. If important data is recognized too late, this can have dire consequences for a company. It is also difficult to assess how to handle sensitive data that is to be transferred or passed on but has not yet been analyzed, or what happens to such data in the event of data theft.

According to some companies, data that is not used today may become important for individual applications or analyses in the future.

References

  1. Dark Data. In: ITwissen.info. DATACOM Buchverlag GmbH (accessed October 14, 2019).
  2. Digging up dark data: What puts IBM at the forefront of insight economy | #IBMinsight. In: SiliconANGLE. October 30, 2015 (online, accessed February 1, 2018).
  3. The big data challenge of transformation for the manufacturing (accessed February 1, 2018).
  4. Dark data could stop big data's path to success. In: ComputerWeekly.com (online, accessed February 1, 2018).
  5. James Glanz: Data Centers Waste Vast Amounts of Energy, Belying Industry Image. In: The New York Times. September 22, 2012, ISSN 0362-4331 (online, accessed February 1, 2018).
  6. IBM Cognitive Colloquium Spotlights Uncovering Dark Data. In: InformationWeek (online, accessed February 1, 2018).
  7. Deriving Value from Data Before It Goes Dark. In: insideBIGDATA. October 12, 2015 (online, accessed February 1, 2018).