Universal binary


Universal binaries, often abbreviated to UB, are, in Apple jargon, executables (i.e. programs) that contain native machine code for more than one processor architecture. Apple used universal binaries during its transition from PowerPC to Intel x86 processors from 2005 onwards. The technology was also integrated into Xcode, so that applications created and compiled accordingly could run natively on both PowerPC and Intel Macs. With the switch from Intel to ARM, announced in 2020, the same technology is used again as Universal Binary 2 and is likewise integrated into Xcode.

History

Universal binaries are based on the Mach-O binary format, which was designed for Mach in the mid-1980s. Under NeXTSTEP, developed on the basis of Mach version 2 in 1987, the format was extended (from NeXTSTEP 3.1 in 1993) to cover the architectures supported by the operating system, namely m68k, IA-32 "i386" (32-bit x86), PA-RISC and SPARC, and the result was called Multi-Architecture Binaries.

From 1994 onwards, Apple also switched the processor architecture under the System 7 operating system from m68k to PowerPC and implemented a similar concept with fat binaries. However, the binary format used there is not related to Mach-O.

In 1997 Apple acquired NeXT together with its Mach-based operating system, which had been renamed from NeXTSTEP to OPENSTEP, and ported it to the PowerPC platform used by Apple at the time in the Rhapsody project. Rhapsody was meant to replace the classic Mac OS completely, but it was an entirely different operating system and not compatible with existing Mac OS programs. When Rhapsody was not accepted by the makers of important application software, Apple ported large parts of the Mac OS programming interface to the new operating system under the name Carbon, and Rhapsody was renamed Mac OS X. The Mach-O format has remained in use since Mac OS X, but initially only for a single architecture: PowerPC.

In 2005, with the switch from PowerPC to the IA-32 architecture, Apple revived the existing multi-architecture binary technology: at the Worldwide Developers Conference (WWDC) it was presented to the public under the new name Universal Binaries. Apple integrated the technology into its own Xcode development environment to make it easier for application developers to ship native binary code for both architectures in their software products. In this context it also became possible to include both 32-bit and 64-bit binary code for the same architecture in one universal binary, that is, 32-bit and 64-bit x86 as well as 32-bit and 64-bit PowerPC. After the transition to IA-32 was complete, PowerPC support was removed from Xcode again. iOS also supports universal binaries, which allows mobile apps to contain code for several ARM architectures.

Implementation

In order to use universal binaries, the kernel of an operating system must be able to handle the Mach-O binary format as extended by NeXT. Compared with the simple Mach-O format, a multi-architecture binary consists of several Mach-O files wrapped in a single file, together with additional metadata. The header structure of Mach-O itself was not changed; instead, additional flags were added:

  • Architecture magic numbers
    • MH_MAGIC stands for 32-bit binary code in big-endian byte order
    • MH_CIGAM stands for 32-bit binary code in little-endian byte order
    • MH_MAGIC_64 stands for 64-bit binary code in big-endian byte order
    • MH_CIGAM_64 stands for 64-bit binary code in little-endian byte order
  • CPU type, for example:
    • CPU_TYPE_POWERPC for 32-bit PowerPC
    • CPU_TYPE_POWERPC64 for 64-bit PowerPC
    • CPU_TYPE_I386 for 32-bit x86, also called IA-32 (named after the Intel 80386, hence i386)
    • CPU_TYPE_X86_64 for 64-bit x86, also called x64 or x86-64
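
As a rough illustration of how these fields can be read, the following Python sketch decodes the first eight bytes of a single-architecture Mach-O header. The numeric values are those of Apple's mach-o/loader.h and mach/machine.h headers; the function name describe_macho_header is hypothetical and chosen only for this example. The sketch tries both byte orders: the order under which the magic reads as MH_MAGIC or MH_MAGIC_64 is taken as the byte order of the file, while a reader using the opposite order would see the swapped CIGAM value.

```python
import struct

# Magic numbers as defined in <mach-o/loader.h>.
MH_MAGIC = 0xFEEDFACE      # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # 64-bit Mach-O

# CPU types as defined in <mach/machine.h>.
CPU_ARCH_ABI64 = 0x01000000
CPU_TYPES = {
    7: "CPU_TYPE_I386",
    7 | CPU_ARCH_ABI64: "CPU_TYPE_X86_64",
    18: "CPU_TYPE_POWERPC",
    18 | CPU_ARCH_ABI64: "CPU_TYPE_POWERPC64",
}


def describe_macho_header(data):
    """Return (word size in bits, byte order, CPU type name).

    Hypothetical helper for illustration; only the first 8 bytes
    (magic and cputype) are inspected.
    """
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    # Try both byte orders; the one under which the magic reads as
    # MH_MAGIC or MH_MAGIC_64 is the byte order of the file.
    for fmt, order in ((">", "big"), ("<", "little")):
        magic, cputype = struct.unpack(fmt + "Ii", data[:8])
        if magic in (MH_MAGIC, MH_MAGIC_64):
            bits = 64 if magic == MH_MAGIC_64 else 32
            name = CPU_TYPES.get(cputype, "unknown ({:#x})".format(cputype))
            return bits, order, name
    raise ValueError("not a Mach-O header")


# Hand-built header bytes for a little-endian 32-bit x86 file.
header = struct.pack("<Ii", MH_MAGIC, 7)
print(describe_macho_header(header))  # (32, 'little', 'CPU_TYPE_I386')
```

Only the four CPU types from the list above are mapped here; any other value is reported as unknown.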

When a universal binary is executed, the operating system recognizes it as such from its header and can then load the executable code that matches the architecture it is running on. Even today, open source components of macOS contain references to m68k, SPARC and other CPUs. In 2020, 17 different architectures were counted.
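
The selection step can be sketched in Python as follows. The layout assumed here is the one from Apple's mach-o/fat.h header: a big-endian header with the magic 0xCAFEBABE and a slice count, followed by one table entry per architecture (cputype, cpusubtype, offset, size, align). The function names list_slices and select_slice are hypothetical, the demo payloads are placeholder bytes rather than real Mach-O images, and the real loader also takes the CPU subtype into account, which this simplified sketch ignores.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # from <mach-o/fat.h>; the fat header is big endian
CPU_TYPE_I386 = 7
CPU_TYPE_POWERPC = 18


def list_slices(data):
    """Return a list of (cputype, offset, size), one per contained slice."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal (fat) binary")
    slices = []
    for i in range(nfat_arch):
        # struct fat_arch: cputype, cpusubtype, offset, size, align
        cputype, _subtype, offset, size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * i
        )
        slices.append((cputype, offset, size))
    return slices


def select_slice(data, host_cputype):
    """Return the bytes of the slice whose CPU type matches the host."""
    for cputype, offset, size in list_slices(data):
        if cputype == host_cputype:
            return data[offset:offset + size]
    raise LookupError("no slice for this architecture")


# Demo with a hand-built two-slice container (placeholder payloads).
payload_i386, payload_ppc = b"INTEL", b"POWERPC"
header = struct.pack(">II", FAT_MAGIC, 2)
table = struct.pack(">iiIII", CPU_TYPE_I386, 3, 48, len(payload_i386), 0)
table += struct.pack(">iiIII", CPU_TYPE_POWERPC, 0, 53, len(payload_ppc), 0)
fat = header + table + payload_i386 + payload_ppc
print(select_slice(fat, CPU_TYPE_POWERPC))  # b'POWERPC'
```

Because only the matching slice is read, the other architectures in the file are simply skipped.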

This mechanism makes it possible to run an application without any loss of speed on an Apple computer with either PowerPC or Intel architecture (Universal Binary from 2005) or with either Intel or ARM architecture (Universal Binary from 2020).

Technical

Universal binaries are implemented using the Mach-O binary format, which, in contrast to the ELF format common in Linux and other Unix-like operating systems, can contain binary code for several architectures. The tools lipo from Xcode and objdump from GNU Binutils can be used to read out the binary code contained in a universal binary. The file command also gives an overview of the architectures included.

The universal build of Apple Safari contains code for both Intel (i386) and PowerPC (powerpc:common). "mach-o-le" and "mach-o-be" stand for the little-endian and big-endian byte orders.

$ objdump -f /Applications/Safari.app/Contents/MacOS/Safari 
In archive /Applications/Safari.app/Contents/MacOS/Safari:

/Applications/Safari.app/Contents/MacOS/Safari:     file format mach-o-le
architecture: i386, flags 0x000001ff:
HAS_RELOC, EXEC_P, HAS_LINENO, HAS_DEBUG, HAS_SYMS, HAS_LOCALS, DYNAMIC, WP_TEXT, D_PAGED
start address 0x0000000000000000


/Applications/Safari.app/Contents/MacOS/Safari:     file format mach-o-be
architecture: powerpc:common, flags 0x000001ff:
HAS_RELOC, EXEC_P, HAS_LINENO, HAS_DEBUG, HAS_SYMS, HAS_LOCALS, DYNAMIC, WP_TEXT, D_PAGED
start address 0x0000000000061830

There are also programs that are offered in two separate versions (an Intel binary and a PPC binary). In that case the right file has to be chosen when downloading (unless both are downloaded), but this has the advantage of smaller files.

For programs built for only one architecture, only a single binary code is listed accordingly:

$ objdump -f /Applications/VLC.app/Contents/MacOS/VLC 

/Applications/VLC.app/Contents/MacOS/VLC:     file format mach-o-le
architecture: i386, flags 0x000001ff:
HAS_RELOC, EXEC_P, HAS_LINENO, HAS_DEBUG, HAS_SYMS, HAS_LOCALS, DYNAMIC, WP_TEXT, D_PAGED
start address 0x0000000000000000

Without additional programs, the contained architectures can be read out with the file command; for the Safari example above: file /Applications/Safari.app/Contents/MacOS/Safari.

"Classic" Fat Binaries

When switching from the Motorola 68k to the PowerPC processor architecture, Apple already used the concept of putting code for several processors in the same file. At that time the term fat binary was used. Under the classic Mac OS, however, this was not implemented as a Mach-O file but with the PEF file format (Preferred Executable Format), which is actually the more modern of the two. The m68k code was in the resource fork and the PowerPC code in the data fork.

Trivia

Theoretically, it is possible to pack far more than two architectures into a so-called super-universal binary, so that the resulting program can run natively on numerous architectures. In practice this was done, for example, during the transition from PowerPC to Intel with up to four architectures.
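
To illustrate that the container itself places no special meaning on the number two, the following Python sketch packs four slices (PowerPC, PowerPC64, i386 and x86-64) into one file, using the header and table layout from Apple's mach-o/fat.h. It is a simplified sketch, not a reimplementation of lipo: the function name build_fat is hypothetical, the payloads are placeholder bytes instead of real Mach-O images, the CPU subtype values are placeholders, and the 2**12-byte alignment is an assumption made for this example.

```python
import struct

FAT_MAGIC = 0xCAFEBABE       # from <mach-o/fat.h>
CPU_ARCH_ABI64 = 0x01000000  # marks the 64-bit variant of a CPU type
ALIGN_EXP = 12               # assumption: slices start on 2**12-byte boundaries


def build_fat(slices):
    """Pack [(cputype, cpusubtype, payload), ...] into one fat container."""
    align = 1 << ALIGN_EXP
    table_end = 8 + 20 * len(slices)  # header plus one table entry per slice
    header = struct.pack(">II", FAT_MAGIC, len(slices))
    table, body = b"", b""
    offset = table_end
    for cputype, cpusubtype, payload in slices:
        # Round the offset up to the next aligned boundary.
        offset = (offset + align - 1) & ~(align - 1)
        table += struct.pack(">iiIII", cputype, cpusubtype, offset,
                             len(payload), ALIGN_EXP)
        # Pad with zero bytes so the payload lands exactly at `offset`.
        body += b"\0" * (offset - table_end - len(body)) + payload
        offset += len(payload)
    return header + table + body


four = build_fat([
    (18, 0, b"ppc"),                     # PowerPC
    (18 | CPU_ARCH_ABI64, 0, b"ppc64"),  # PowerPC64
    (7, 3, b"i386"),                     # i386
    (7 | CPU_ARCH_ABI64, 3, b"x86_64"),  # x86-64
])
print(struct.unpack_from(">II", four, 0)[1])  # 4
```

The slice count in the header is an ordinary 32-bit integer, so adding a fifth or sixth architecture only means appending further table entries and payloads.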
