Common Object File Format


The Common Object File Format (COFF) is a binary format for programs and object files. It was introduced by AT&T for the Unix System V operating system and is now used mainly in the form of the PE format for Windows, which is based on it (see Portable Executable). Apart from the extensions used for PE, typical file extensions, where one is used at all, are "cof", "obj" or "lib".

History

The a.out format was originally used for executable files on Unix. However, it did not support later developments such as embedded debugging information or dynamic libraries. AT&T therefore developed the Common Object File Format for Release 3 of Unix System V. Because the original COFF was limited in its design, the Unix vendors developed different variants (e.g. XCOFF from IBM for AIX, ECOFF from SGI and others). With Release 4 of System V in 1989, AT&T replaced COFF with the new ELF format (Executable and Linking Format), developed together with Sun Microsystems.

Properties

COFF made it possible to embed debugging information directly in a binary file. Libraries can be linked dynamically and handled as separate files, so they do not have to become a fixed, non-exchangeable part of a program file. To this end, all addresses in the relocation entries are given relative to the actual address at which the section is loaded into the application's virtual memory. This section address therefore does not have to be fixed when the program is written; it can be determined later. Formats derived from COFF also have these capabilities.

Use

Modern Unix and Linux versions no longer support COFF, but it is still used for embedded systems. Under Windows NT and later, the COFF variant Portable Executable (PE, sometimes also PE/COFF) is the standard file format for libraries and executables, although this variant differs slightly from the original COFF.

Structure

A COFF file consists of several parts. It begins with the file header and an optional header. These are followed by a number of sections, each consisting of a header, a data section, an area for line number entries and an area for relocation entries. A symbol table and a string table follow at the end of the file.

File header

The file header is located at the beginning of the file. It stores data that describes the structure of the entire file. This includes the magic number, which differs between the variants (PE, XCOFF etc.), a Unix timestamp giving the time the file was created, and the position and size of other parts of the file. In addition, flags define various properties of the file (e.g. whether it is executable).

struct filehdr {
    unsigned short  f_magic;        /* Magic number */
    unsigned short  f_nscns;        /* Number of sections in the file */
    long            f_timdat;       /* Time stamp of creation */
    long            f_symptr;       /* Pointer to the symbol table */
    long            f_nsyms;        /* Size of the symbol table (number of entries) */
    unsigned short  f_opthdr;       /* Size of the optional header */
    unsigned short  f_flags;        /* Flags */
};
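
The structure above uses `long`, whose width depends on the compiler, so it cannot always be read directly from a file. The following is a minimal sketch of decoding the header field by field from a byte buffer. It assumes 32-bit `long` fields (20 bytes in total) and little-endian byte order; both are assumptions that depend on the COFF variant, and the names `coff_filehdr` and `coff_parse_filehdr` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-width mirror of struct filehdr (hypothetical name). */
struct coff_filehdr {
    uint16_t f_magic;   /* magic number */
    uint16_t f_nscns;   /* number of sections */
    int32_t  f_timdat;  /* creation time stamp */
    int32_t  f_symptr;  /* file offset of the symbol table */
    int32_t  f_nsyms;   /* number of symbol table entries */
    uint16_t f_opthdr;  /* size of the optional header */
    uint16_t f_flags;   /* flags */
};

/* Assumed on-disk size: 2 + 2 + 4 + 4 + 4 + 2 + 2 bytes. */
enum { COFF_FILEHDR_SIZE = 20 };

/* Little-endian readers; the byte order is an assumption. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the header from buf. Returns 0 on success, -1 if buf is too short. */
int coff_parse_filehdr(const unsigned char *buf, size_t len,
                       struct coff_filehdr *out)
{
    if (len < COFF_FILEHDR_SIZE)
        return -1;
    out->f_magic  = rd16(buf + 0);
    out->f_nscns  = rd16(buf + 2);
    out->f_timdat = (int32_t)rd32(buf + 4);
    out->f_symptr = (int32_t)rd32(buf + 8);
    out->f_nsyms  = (int32_t)rd32(buf + 12);
    out->f_opthdr = rd16(buf + 16);
    out->f_flags  = rd16(buf + 18);
    return 0;
}
```

Decoding byte by byte avoids any dependence on the padding and integer sizes of the host compiler.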

Optional header

The optional header contains different data depending on the COFF variant. It is often used for further information required for execution (e.g. the entry address). Since its length can vary, its size is stored in the file header.
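
Because the size of the optional header is stored in `f_opthdr`, a reader can skip it without understanding its contents. The sketch below computes where a section header would start. It assumes that the section headers form a contiguous table directly after the optional header, and that the file header and each section header occupy 20 and 40 bytes respectively (the sizes that result from 32-bit `long` fields). The function name is hypothetical.

```c
#include <stdint.h>

enum {
    COFF_FILEHDR_SIZE = 20,  /* assumed size of the file header */
    COFF_SCNHDR_SIZE  = 40   /* assumed size of one section header */
};

/* File offset of the section header with the given zero-based index. */
uint32_t coff_scnhdr_offset(uint16_t f_opthdr, uint16_t index)
{
    return (uint32_t)COFF_FILEHDR_SIZE + f_opthdr
         + (uint32_t)index * COFF_SCNHDR_SIZE;
}
```

With an optional header of 28 bytes, for instance, the first section header would start at offset 48 under these assumptions.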

Section header

The section header contains data about a section, in particular how large it is and where it should be loaded into virtual memory. For executable files this is usually the beginning of memory, i.e. the first section is loaded at address 0; this can be different for linked data. The section header also contains a pointer to, and the number of, the line number entries and the relocation entries.

struct sectionhdr {
    char           s_name[8];  /* Name of the section */
    unsigned long  s_paddr;    /* Memory address at which this section is to be loaded */
    unsigned long  s_vaddr;    /* Virtual address at which this section is to be loaded */
    unsigned long  s_size;     /* Size of the section (including header) */
    unsigned long  s_scnptr;   /* Pointer to the data of this section */
    unsigned long  s_relptr;   /* Pointer to the relocation entries of this section */
    unsigned long  s_lnnoptr;  /* Pointer to the line number entries of this section */
    unsigned short s_nreloc;   /* Number of relocation entries */
    unsigned short s_nlnno;    /* Number of line number entries */
    unsigned long  s_flags;    /* Flags */
};
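
Two small helpers illustrate how these fields might be used; both are sketches with hypothetical names. The first assumes that a name filling all eight bytes of `s_name` has no terminating zero byte, so it copies the field into a larger buffer before treating it as a C string. The second checks whether a virtual address falls inside the range described by `s_vaddr` and `s_size`.

```c
#include <string.h>

/* Copy the 8-byte s_name field into a 9-byte buffer and terminate it.
   Assumption: shorter names are padded with zero bytes. */
void coff_section_name(const char s_name[8], char out[9])
{
    memcpy(out, s_name, 8);
    out[8] = '\0';
}

/* Return 1 if addr lies in [s_vaddr, s_vaddr + s_size), else 0.
   Written with a subtraction so that the range check cannot overflow. */
int coff_section_contains(unsigned long s_vaddr, unsigned long s_size,
                          unsigned long addr)
{
    return addr >= s_vaddr && addr - s_vaddr < s_size;
}
```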

Data section

The data section can vary in length. It contains the actual data of the file. These are usually machine code instructions, space for variables, and data required for execution; in short, the actual program.

Relocation entry

A relocation entry defines where the symbols can be found in the data section. This is defined individually for each symbol.

typedef struct reloc {
    unsigned long  r_vaddr;   /* Address at which the relocation is applied */
    unsigned long  r_symndx;  /* Symbol to which the relocation refers */
    unsigned short r_type;    /* Type of relocation */
} reloc;
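
How a relocation entry is applied depends on `r_type`, whose values are specific to the target machine and are not listed here. As an illustration only, the sketch below handles one hypothetical relocation type that adds the value of the referenced symbol to a 32-bit little-endian word in the section data. It assumes that `r_vaddr` is a virtual address inside the section, so the offset into the data is `r_vaddr - s_vaddr`. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-width mirror of the relocation entry (hypothetical name). */
struct coff_reloc {
    uint32_t r_vaddr;   /* address at which the relocation is applied */
    uint32_t r_symndx;  /* index of the symbol the relocation refers to */
    uint16_t r_type;    /* relocation type (target specific) */
};

/* Add sym_value to the 32-bit little-endian word that the entry points at.
   data/size describe the section contents, s_vaddr its virtual address.
   Returns 0 on success, -1 if the word lies outside the section. */
int coff_apply_abs32(unsigned char *data, size_t size, uint32_t s_vaddr,
                     const struct coff_reloc *r, uint32_t sym_value)
{
    if (r->r_vaddr < s_vaddr)
        return -1;
    uint32_t off = r->r_vaddr - s_vaddr;
    if (size < 4 || off > size - 4)
        return -1;

    unsigned char *p = data + off;
    uint32_t word = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    word += sym_value;
    p[0] = (unsigned char)(word & 0xff);
    p[1] = (unsigned char)((word >> 8) & 0xff);
    p[2] = (unsigned char)((word >> 16) & 0xff);
    p[3] = (unsigned char)((word >> 24) & 0xff);
    return 0;
}
```

A real linker or loader would first look up the symbol through `r_symndx` and then dispatch on `r_type`; both steps are left out of this sketch.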

Line number entry

A line number entry defines which line in the source code corresponds to which instruction in the machine code. This is especially important for debugging applications. Each section has its own table of line numbers. The lines are counted individually for each function in the section.

typedef struct lineno {
    union {
        unsigned long l_symndx;  /* Index of the function's name */
        unsigned long l_paddr;   /* Address of the line number */
    } l_addr;
    unsigned short l_lnno;  /* Line number */
} lineno;

Line numbers are counted from 0 at the beginning of each function. For the line on which a function begins, an entry is created with l_lnno = 0 and the symbol of the function as l_symndx. For each further line in the function, an entry is created with the number of lines since the start of the function as l_lnno and the address of the first instruction of that line as l_paddr.
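
The scheme described above can be sketched as a lookup from an instruction address to a function-relative line number. The sketch scans one section's table, remembers the most recent function entry (`l_lnno` equal to 0), and picks the line entry with the greatest address that does not exceed the address searched for. The fixed-width structure and the function name are hypothetical, and nothing is assumed about the table being sorted.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-width mirror of the line number entry (hypothetical name). */
struct coff_lineno {
    union {
        uint32_t l_symndx;  /* valid if l_lnno == 0: symbol of the function */
        uint32_t l_paddr;   /* valid otherwise: address of the line */
    } l_addr;
    uint16_t l_lnno;        /* 0 marks the start of a function */
};

/* Return the function-relative line number covering addr, or 0 if no
   entry applies. On success *func_sym receives the symbol index of the
   enclosing function; otherwise it is left unchanged. */
uint16_t coff_line_for_addr(const struct coff_lineno *tab, size_t n,
                            uint32_t addr, uint32_t *func_sym)
{
    uint16_t best_line = 0;
    uint32_t best_addr = 0;
    uint32_t cur_func = 0;

    for (size_t i = 0; i < n; i++) {
        if (tab[i].l_lnno == 0) {          /* function start entry */
            cur_func = tab[i].l_addr.l_symndx;
            continue;
        }
        uint32_t a = tab[i].l_addr.l_paddr;
        if (a <= addr && (best_line == 0 || a >= best_addr)) {
            best_line = tab[i].l_lnno;
            best_addr = a;
            *func_sym = cur_func;
        }
    }
    return best_line;
}
```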

Symbol table

The symbol table contains information about the symbols in the file. Symbols are, for example, functions or variables that can be used by other programs. The size and position of the symbol table are specified in the file header. The symbol table consists of entries of the form

typedef struct sysent {
  union {                       /* anonymous union and struct (C11) */
    char e_name[8];             /* Name of the symbol */
    struct {
      unsigned long e_zeroes;   /* If 0, the name of the symbol is stored in the string table */
      unsigned long e_offset;   /* Position of the symbol in the string table */
    };
  };
  unsigned long e_value;        /* Value (usually the address) of the symbol */
  short e_scnum;                /* Section */
  unsigned short e_type;        /* Data type */
  unsigned char e_sclass;       /* Storage class */
  unsigned char e_numaux;       /* Number of additional entries */
} sysent;

The name of the symbol is stored in e_name if it is eight characters or shorter. Otherwise it is stored in the string table; in that case e_zeroes = 0 and e_offset gives the position of the entry in the string table. The "value" of the symbol is stored in e_value. This is usually the address at which the symbol is stored, which in turn depends on the data type and on the storage class stored in e_sclass. e_type defines the data type of the symbol. This can be either an elementary type (int, float etc.) or a composite type (struct, union). In addition, the symbol can denote a value, a pointer, an array, or a function that returns such a value. e_sclass defines the storage class, i.e. where and how the symbol is stored (e.g. it can be an external symbol, a function argument, a global or static variable, etc.). Depending on the type of symbol, additional entries may follow. The number of these entries is given in e_numaux.
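
The name lookup described above can be sketched as follows. The sketch assumes that the two integers overlaying the name are 32 bits wide, that a name of exactly eight characters has no terminating zero byte, and that e_offset is counted from the start of the string table. The type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 8-byte name field of a symbol entry, with 32-bit members. */
union coff_symname {
    char e_name[8];
    struct {
        uint32_t e_zeroes;  /* 0: the name is in the string table */
        uint32_t e_offset;  /* offset of the name in the string table */
    } e;
};

/* Copy the symbol name into out (capacity out_len, always terminated).
   strtab/strtab_len describe the string table as loaded into memory.
   Returns 0 on success, -1 on a bad offset or a buffer that is too small. */
int coff_symbol_name(const union coff_symname *n,
                     const char *strtab, size_t strtab_len,
                     char *out, size_t out_len)
{
    if (n->e.e_zeroes != 0) {           /* short name stored inline */
        if (out_len < 9)
            return -1;
        memcpy(out, n->e_name, 8);
        out[8] = '\0';
        return 0;
    }

    uint32_t off = n->e.e_offset;       /* long name in the string table */
    if (off >= strtab_len)
        return -1;
    const char *s = strtab + off;
    const char *end = memchr(s, '\0', strtab_len - off);
    if (end == NULL)
        return -1;                      /* unterminated string */
    size_t len = (size_t)(end - s);
    if (len + 1 > out_len)
        return -1;
    memcpy(out, s, len + 1);
    return 0;
}
```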

String table

The string table is located at the end of the file. It begins with an integer that stores the length of the table. The strings then follow one after another. To read a string, you have to know its position and start reading from there. The strings are null-terminated.
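
Reading a string by position might look like the following sketch. It assumes that the leading length is a 32-bit little-endian integer, that the length includes those four bytes, and that positions are counted from the start of the table; these details can differ between COFF variants. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return a pointer to the string at offset off, or NULL if the table or
   the offset is invalid. tab points at the start of the string table and
   avail is the number of bytes actually available there. */
const char *coff_strtab_get(const unsigned char *tab, size_t avail,
                            uint32_t off)
{
    if (avail < 4)
        return NULL;

    /* Length field at the start of the table (little-endian assumed). */
    uint32_t total = (uint32_t)tab[0] | ((uint32_t)tab[1] << 8) |
                     ((uint32_t)tab[2] << 16) | ((uint32_t)tab[3] << 24);
    if (total < 4 || total > avail)
        return NULL;

    /* The first four bytes hold the length, so no string starts there. */
    if (off < 4 || off >= total)
        return NULL;

    /* Make sure the string is terminated inside the table. */
    if (memchr(tab + off, '\0', total - off) == NULL)
        return NULL;

    return (const char *)(tab + off);
}
```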
