Data masking

Data masking is the technical term for the anonymization or obfuscation of data. The methods used are therefore also data protection measures.

Alternative terms with the same meaning are data obfuscation, data sanitization and data scrambling. Data masking differs from encryption in that there does not have to be a 1:1 mapping between original data and masked data. In addition, the masked data mostly remains legible.

Data masking is not limited to personal data and is therefore more broadly defined than the pure anonymization and pseudonymization of personal and address data. Rather, all conceivable data types can be "masked". The aim of masking the original data is data leakage prevention (the prevention of data leaks). Data masking addresses data theft, data abuse and other forms of data crime by changing the stored data itself: databases that are accessible to external persons, such as test or training systems, do not hold original production data but the information as altered by data masking.

Masking methods (examples)

Blacklist: a list of replacement words that are substituted for original words or tokens (find and replace).

Free text: a freely definable string of letters, digits and special characters.

Anagram: an anagram formed from the content of the original value.

Technically valid credit card number: a credit card number that is formally valid, i.e. that passes the Luhn check.

Random company name: a random company name made up of words from a sample database plus a legal-form suffix (e.g. GmbH).

Random first name: a random first name taken from a sample database or library of first names.

Random last name: a random last name taken from a sample database of last names.

Random name: a random full name (i.e. first and last name).

Random e-mail address: an example e-mail address consisting of a name (firstname.lastname is common), '@', a company name, '.' and an ending such as com or de (e.g. Michael.Müller@meine-firma.com).

Telephone number: a realistic telephone, fax or mobile number.

Replace each x-th char with y: words are masked by covering every x-th character with a special character such as *.

Replace the first and last x chars with y: words are masked by covering the beginning and end of the word with a special character such as *.

Random number between x% and y% of original value: a random numerical value within predefined limits that depend on a base value.

x% of original value: a numerical value that is a fixed percentage of a base value.

Shuffle values in attribute: the original values of a column are randomly reassigned to the individual table rows.

Adjust in the same proportion: numerical values are changed in the same proportion as the numerical values in another attribute.

Adjust inversely proportional: numerical values are changed in the opposite way (i.e. exactly inversely) to the numerical values in another attribute.

Random date between x and y days of original value: a random date within predefined limits that depend on a reference date.
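A few of the methods listed above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of any particular masking product: all function names, parameters and defaults (for example `mask_every_nth` or the default mask character `*`) are hypothetical, and the generated card numbers only pass the Luhn check; they do not carry a real issuer prefix.

```python
import random
from datetime import date, timedelta


def luhn_checksum_ok(number: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counted from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def random_card_number(rng: random.Random, length: int = 16) -> str:
    """Random digits plus a check digit chosen so that the Luhn check passes."""
    body = "".join(str(rng.randrange(10)) for _ in range(length - 1))
    for check in "0123456789":
        if luhn_checksum_ok(body + check):
            return body + check
    raise AssertionError("unreachable: exactly one check digit always fits")


def mask_every_nth(value: str, n: int, mask: str = "*") -> str:
    """Replace every n-th character (counting from 1) with the mask character."""
    return "".join(mask if (i + 1) % n == 0 else ch for i, ch in enumerate(value))


def mask_ends(value: str, n: int, mask: str = "*") -> str:
    """Replace the first and last n characters with the mask character."""
    if 2 * n >= len(value):
        return mask * len(value)
    return mask * n + value[n:len(value) - n] + mask * n


def random_within_percent(rng: random.Random, original: float,
                          low_pct: float, high_pct: float) -> float:
    """Random value between low_pct % and high_pct % of the original value."""
    return original * rng.uniform(low_pct, high_pct) / 100.0


def shuffle_column(rng: random.Random, values: list) -> list:
    """Return the column values randomly reassigned to the rows."""
    shuffled = list(values)     # copy, so the original column stays untouched
    rng.shuffle(shuffled)
    return shuffled


def random_date_near(rng: random.Random, original: date,
                     min_days: int, max_days: int) -> date:
    """Random date between min_days and max_days away from the original date."""
    return original + timedelta(days=rng.randint(min_days, max_days))
```

With these definitions, `mask_every_nth("Mustermann", 3)` yields `Mu*te*ma*n` and `mask_ends("Mustermann", 2)` yields `**sterma**`. Passing an explicit `random.Random` instance is a design choice of this sketch: a fixed seed makes a masking run reproducible, which can be convenient when a test system has to be rebuilt.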

Application examples

Prevention of data abuse and data theft

With data masking, data is not encrypted but altered in such a way that it remains legible and retains its context and information structure as far as possible. This is used, for example, to fill test, demo or training systems with non-security-critical data derived (masked) from original data. The security problem is thus solved directly at the data source.
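How a test data set might be derived from original data while keeping its structure can be sketched as follows. The record layout, the column names, the sample rows and the per-column rules are all hypothetical and chosen purely for illustration; a real system would read from and write to databases rather than in-memory lists.

```python
import random


def mask_rows(rows, rules, seed=None):
    """Return masked copies of the rows.

    `rules` maps a column name to a function (value, rng) -> masked value.
    Columns without a rule are copied unchanged, so the table structure
    (columns, row count, keys) is preserved.
    """
    rng = random.Random(seed)
    masked = []
    for row in rows:
        masked.append({
            col: rules[col](val, rng) if col in rules else val
            for col, val in row.items()
        })
    return masked


# Hypothetical production rows (assumption: made-up sample data).
production = [
    {"id": 1, "last_name": "Müller", "salary": 52000.0},
    {"id": 2, "last_name": "Schmidt", "salary": 61000.0},
]

# Hypothetical per-column masking rules.
rules = {
    # keep the first letter, cover the rest with '*'
    "last_name": lambda v, rng: v[:1] + "*" * (len(v) - 1),
    # random value between 90% and 110% of the original
    "salary": lambda v, rng: round(v * rng.uniform(0.9, 1.1), 2),
}

test_data = mask_rows(production, rules, seed=42)
```

The masked rows keep the same columns and keys as the originals, so an application under test can work with them unchanged, while names and salaries no longer match the production values.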

Compliance with data protection regulations

Other areas of application relate to compliance with numerous data protection laws and guidelines worldwide, such as HIPAA, HITECH, PHI, GLBA, PCI DSS, SOA, the Dodd-Frank Wall Street Reform and Consumer Protection Act, SB 1386, the European Union (EU) Data Protection Directive, PIPEDA, the ISO 27000 series and the USA Patriot Act. Companies in the countries affected by these regulations are obliged to selectively protect certain (mostly personal) data. This can be done using encryption or data masking measures.

References

See Section 3 (6) of the Federal Data Protection Act (Bundesdatenschutzgesetz) or corresponding state law.

Literature

Roning, Gerd and Gnoss, Roland (2003). Anonymization of individual economic statistics, in: Series of publications "Forum der Bundesstatistik", Volume 42, published by the Federal Statistical Office, Wiesbaden.
