Obfuscation (software)

from Wikipedia, the free encyclopedia

Obfuscation (from English "to obfuscate": to obscure, confuse, or conceal) is a term used in software engineering for the deliberate modification of program code so that the source code becomes hard for humans to understand or difficult to recover. The aim is to greatly increase the effort required for reverse engineering, in order to make modification, unwanted copying of program parts, or theft of intellectual property more difficult, or to disguise functionality, for example that of malware.

In the case of interpreted or scripting languages, where the source text itself is delivered, obfuscation means making the delivered copy of the source text hard for humans to read. For a compiled program, an obfuscator does not scramble the source code itself but rather the compiled output, or a copy of the source code immediately before compilation. Above all, automatic decompilation is to be prevented, or its result made as incomprehensible as possible.

Properties

Obfuscation changes executable program code without changing what the program does. For example, variables and functions are renamed. In a compiled program, the machine code or bytecode can also be scrambled so that the instruction sequences belonging to one high-level statement are interleaved with those of the preceding or following statement; often, additional unnecessary (machine) instructions are inserted as well. This can make automatic decompilation into the original high-level language much more difficult or even impossible.

Depending on the nature of the code, a side effect can be a slight reduction in its storage requirements (especially for scripting-language programs), e.g. because long identifiers are renamed to shorter ones. This benefits website scripts by reducing the transfer volume. It can also be advantageous for application programs on devices with little storage capacity or computing power. Since errors are harder to find in programs modified in this way, many projects do without obfuscation.
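The size effect of renaming can be illustrated with a minimal sketch. The helper `shorten_identifiers` and the sample script are hypothetical, and the approach assumes that the generated one-letter names do not already occur in the script and that at most 26 names are replaced; a real tool would parse the code rather than use regular expressions.

```python
import re

def shorten_identifiers(source: str, names: list[str]) -> str:
    """Replace each listed identifier with a short generated name (a, b, c, ...)."""
    for index, name in enumerate(names):
        short = chr(ord("a") + index)
        # \b keeps the replacement from touching longer names that merely contain `name`.
        source = re.sub(rf"\b{re.escape(name)}\b", short, source)
    return source

# A small script-language snippet with long, descriptive identifiers.
original = (
    "function total(invoiceAmount, taxRate) {\n"
    "  return invoiceAmount + invoiceAmount * taxRate;\n"
    "}\n"
)
renamed = shorten_identifiers(original, ["invoiceAmount", "taxRate"])

# The renamed script is shorter, so less data has to be transferred.
print(len(original), len(renamed))
```

The function name `total` is left untouched here because callers outside the snippet may depend on it; obfuscators typically keep such public names for the same reason.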

Demarcation

Obfuscation renames identifiers, among other things. Since it does not encrypt the entire program, it is not an application of steganography and, in general, not of cryptography either. However, character strings, files, or entire classes stored in the program may be encrypted so that they cannot be read in plain text (see below).
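As an illustration of hiding a single string, the following sketch stores only a scrambled byte sequence and restores the text at run time. The repeating-key XOR, the key `KEY`, and the function names are assumptions chosen for brevity; because the key ships with the program, this only hinders casual inspection and is not real encryption.

```python
from itertools import cycle

KEY = b"k3y"  # hypothetical key; it is embedded in the program, so it offers no real secrecy

def xor_bytes(data: bytes, key: bytes = KEY) -> bytes:
    """XOR every byte with the repeating key; applying it twice restores the input."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

# In a real build this value would be precomputed, so the plain text
# would never appear in the delivered program.
HIDDEN_GREETING = xor_bytes("licence check passed".encode("utf-8"))

def greeting() -> str:
    """Decode the string at run time, just before it is needed."""
    return xor_bytes(HIDDEN_GREETING).decode("utf-8")
```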

Examples of code obfuscation methods

Equivalent formulas and constant transformations
For example, an addition +10 can be replaced by “add 15 and subtract 5”.
Changing the control flow
The order in which program instructions are executed can sometimes be rearranged without affecting functionality. This can be done in the source code as well as in the compilation (then with machine commands).
Variable substitution
Simple renaming of variables such as “invoice amount” or “address” to generated names such as “ax7zhgr”.
Conditional instructions and jumps
This includes superfluous comparisons that always evaluate to true or always to false, as well as additional references or pointers.
Change in the functional hierarchy
Contrary to the logical structure, individual instructions or blocks can be moved out into subroutines, or copied from subroutines into the calling point.
Insertion of redundant code
Superfluous code is inserted into the sequence of instructions that only performs irrelevant calculations.
Insertion of code that hinders decompilation
For example, code is inserted after the end of a method, which causes some decompilers to crash.
Encryption
Encryption is particularly suitable for camouflaging individual bytes or strings such as hard-coded passwords or files supplied in the code, or even entire classes and libraries.
Mixing functions
The (machine) instructions of two functions or high-level statements can be interleaved. This blurs the boundaries between the functions.
Splitting of variables
Restructuring of arrays or lists
  • a one-dimensional array can be split into several one-dimensional arrays
  • a one-dimensional array can be expanded into a multi-dimensional array
  • a multi-dimensional array can be shrunk to a one-dimensional array
  • two or more one-dimensional arrays can be merged into one one-dimensional array.
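Two of the array restructurings above can be sketched as follows. The function names and the even/odd splitting rule are illustrative assumptions; any index mapping works as long as every access to the array goes through the matching lookup.

```python
def split_array(values: list[int]) -> tuple[list[int], list[int]]:
    """Split one array into two: even indices go to the first, odd indices to the second."""
    return values[0::2], values[1::2]

def read_split(evens: list[int], odds: list[int], index: int) -> int:
    """Read logical element `index` from the split representation."""
    half, is_odd = divmod(index, 2)
    return odds[half] if is_odd else evens[half]

def fold_array(values: list[int], width: int) -> list[list[int]]:
    """Expand a one-dimensional array into rows of `width` elements."""
    return [values[i:i + width] for i in range(0, len(values), width)]

def read_folded(rows: list[list[int]], width: int, index: int) -> int:
    """Read logical element `index` from the two-dimensional representation."""
    row, column = divmod(index, width)
    return rows[row][column]
```

A reader of the transformed program sees only the split or folded arrays and the index arithmetic, not the original layout.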
Anti-debugging
Routines that detect a debugger and then terminate the program early. To do this, they may, for example, scan memory for strings characteristic of various debuggers.
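Several of the source-level methods listed above (constant transformation, renaming, a comparison that is always true, and redundant code) can be combined in one small sketch. Both functions are hypothetical; the second is a hand-written obfuscated form of the first, not the output of any particular tool.

```python
def invoice_total(amount: int, shipping: int) -> int:
    """Readable original: add a fixed fee of 10 and the shipping cost."""
    return amount + 10 + shipping

def ax7zhgr(q1: int, q2: int) -> int:
    """Obfuscated variant with the same result for all integer inputs."""
    z9 = (q1 * 3) ^ 0x5A          # redundant code: z9 never influences the result
    if (q1 * q1 + q1) % 2 == 0:   # always true: n*n + n = n*(n + 1) is even
        r = q1 + 15               # constant transformation: +10 becomes +15 ...
        r -= 5                    # ... followed by -5
    else:
        r = z9                    # unreachable branch, present only to mislead
    return r + q2
```

Calling `ax7zhgr(100, 7)` and `invoice_total(100, 7)` yields the same value, while the obfuscated version reveals neither the meaning of its parameters nor the fee of 10 at first glance.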

Programs

The number of available obfuscators varies with the programming language and platform. Many are designed to be applied directly to source code, or target platforms whose source code would otherwise be easy to recover, for example because a bytecode-like intermediate language is used before execution. However, there are also obfuscators for programs written in languages that compile directly to executable code.

C/C++

The following obfuscators for C/C++ are actively maintained: Stunnix C++ Obfuscator, StarForce C++ Obfuscate, Morpher C/C++ Obfuscator, Semantic Design C and C++ Obfuscators.

Windows Script Encoder

To obscure scripts such as JScript, VBScript and, in particular, ASP files, Microsoft recommends the Windows Script Encoder. If the web server is compromised, the attacker should then be unable to understand how the ASP application works. However, decoders that undo this obfuscation now exist.

Java bytecode and MSIL

There are a number of proprietary and open source obfuscators for Java bytecode and the .NET Common Intermediate Language.

The following obfuscators for Java bytecode are actively maintained: DashO, JavaGuard, ProGuard, yGuard and Zelix Klassmaster. ProGuard is recommended by Google for obfuscating Android programs.

JavaScript

A large number of obfuscators exist for JavaScript code. Most of them also support minification of the code, and conversely many minifiers include obfuscation techniques. The following programs have the obfuscation of JavaScript code as their main feature: JScrambler, JSObfuscator, Javascript Obfuscator, UglifyJS, Compressor and Minimizer, Stunnix, Jasob.

Disadvantages of obfuscation

Obfuscation can make reverse engineering of a program more difficult or time-consuming, but it does not necessarily make it impossible. In addition, it restricts the use of reflection on obfuscated code.
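The restriction on reflection can be illustrated with a short sketch. The classes and the helper `call_by_name` are hypothetical; the second class stands in for the result of a renaming pass that was not told which names are looked up at run time.

```python
class Invoice:
    def compute_total(self) -> int:
        return 42

class ObfuscatedInvoice:
    # The same method after a (hypothetical) renaming pass.
    def a(self) -> int:
        return 42

def call_by_name(obj: object, method_name: str) -> int:
    """Look a method up by its name at run time, as reflective code does."""
    return getattr(obj, method_name)()
```

`call_by_name(Invoice(), "compute_total")` succeeds, whereas the same call on `ObfuscatedInvoice()` raises `AttributeError`, because the string still carries the old name. Obfuscators therefore usually allow such names to be excluded from renaming.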

Some antivirus programs, such as AVG, alert the user when visiting a website with obfuscated JavaScript code, since obfuscation can also be used to hide malicious code.

Obfuscation and the copyleft license

Whether it is legal to circumvent a copyleft software license by releasing obfuscated source code has been the subject of debate within the open source community. This kind of circumvention occurs when an author is reluctant to publish the source code of their own program but is obliged to do so by the license of the original program. The GNU General Public License addresses the issue by defining source code as the preferred form of the work for making modifications to it. The GNU website states that obfuscated source code is not real source code and does not count as source code, which means that, in the eyes of the GNU project, using obfuscators on GPL-licensed source code constitutes a license violation.

Others

There are programming contests for creatively obfuscated source code; however, these correspond to obfuscation only as it is practised for scripting languages.
