Intel HEX


The Intel HEX format is a data format for storing and transmitting binary data. Today it is mainly used to store programming data for microcontrollers, microprocessors, EPROMs and similar components. It can also be used to store load modules. The HEX format is the oldest data format of its kind and has been in use since the 1970s. Later extensions specifically support the segmented addressing of the Intel 80x86 processors.

An Intel HEX file is an ASCII text file. Each byte of the encoded binary data is represented as a hexadecimal number made up of two ASCII characters (0..9 and A..F). HEX files can therefore be opened and modified with a text editor. A HEX file is more than twice as large as the binary data it contains, since each byte is represented by two characters and every record adds its own framing fields. Each record carries a checksum so that transmission errors can be detected.
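As a minimal illustration of this encoding (a Python sketch, not part of the format definition; the sample bytes are arbitrary and chosen only for demonstration), each byte turns into exactly two hexadecimal characters:

```python
# Arbitrary sample bytes, chosen only for illustration.
payload = bytes([0x3A, 0x7F, 0x00, 0xC2])

# Each byte becomes two ASCII hexadecimal characters (0..9, A..F).
encoded = payload.hex().upper()

print(encoded)                     # 3A7F00C2
print(len(payload), len(encoded))  # 4 8
```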

History

The Intel HEX format (originally Intellec Hex) was designed by Intel in 1973 for the Intellec development systems (MDS) to load and start programs from punched tape. It was also intended to simplify the transfer of data for ROM production. At the same time, it was used to program (E)PROMs by means of EPROM programmers controlled by punched tape or punched cards. With the introduction of floppy disk drives with the MCS Series II under ISIS II (1975), files were also created in this format. Since then, HEX has been used as the file extension.

Format

The format described here corresponds to the Hexadecimal Object File Format Specification from Intel.

Structure of a data record

The encoding is 7-bit ASCII. Each record begins with a colon (":"), consists of an even number of characters after the colon, and is terminated by an end of line. The form of the line ending is not defined by the format and depends on the medium. Intel tools for stream-oriented media always generate CR/LF (0D 0A hex).

Every two characters represent one data byte. The notation is hexadecimal and big-endian, using the characters 0..9 and A..F, i.e. the more significant nibble comes first. The address fields are likewise big-endian. Lowercase letters (a..f) are not mentioned in the definition but are accepted by most implementations.

Intel designation  Content          Use
RECORD MARK        Start of record  ":" (colon, ASCII code 3A hex)
RECLEN             Data length      Number of user data bytes, as two hexadecimal digits
LOAD OFFSET        Load address     16-bit address (big-endian)
RECTYP             Record type      Record type (00..05)
INFO or DATA       Data             User data (RECLEN x 2 characters)
CHKSUM             Checksum         Checksum over the record (excluding the record mark)
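The field layout above can be sketched in Python as follows. This is an illustrative parser under stated assumptions, not a reference implementation: the names `HexRecord` and `parse_record` are hypothetical, and the sketch only splits the fields without verifying the checksum.

```python
from typing import NamedTuple


class HexRecord(NamedTuple):
    reclen: int       # RECLEN: number of data bytes
    load_offset: int  # LOAD OFFSET: 16-bit address, big-endian
    rectyp: int       # RECTYP: record type
    data: bytes       # INFO or DATA
    chksum: int       # CHKSUM as stored in the record (not verified here)


def parse_record(line: str) -> HexRecord:
    """Split one Intel HEX line into its fields."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    # bytes.fromhex accepts upper- and lowercase digits and
    # rejects an odd number of characters.
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5:
        raise ValueError("record too short")
    reclen = raw[0]
    if len(raw) != reclen + 5:
        raise ValueError("RECLEN does not match the record length")
    return HexRecord(
        reclen=reclen,
        load_offset=int.from_bytes(raw[1:3], "big"),
        rectyp=raw[3],
        data=raw[4:4 + reclen],
        chksum=raw[-1],
    )


rec = parse_record(":020000021000EC")
print(rec.reclen, rec.load_offset, rec.rectyp, rec.data.hex(), hex(rec.chksum))
# 2 0 2 1000 0xec
```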

Record types

Overview

There are six record types:

Type  Designation                      Use
00    Data record                      User data
01    End of file record               End of file (and start address for 8-bit data)
02    Extended segment address record  Extended segment address for subsequent user data
03    Start segment address record     Start segment address (CS:IP registers)
04    Extended linear address record   Extended linear address: the more significant 16 bits of the address for subsequent user data
05    Start linear address record      Linear start address (EIP register)

The data records can appear in any order; an end record (type 01) ends processing.

Data record (type 00)

The data record contains the 16-bit load address and the user data.

         Start code   Byte count  Address   Type      Data field  Checksum
Length   1 character  2 digits    4 digits  2 digits  2n digits   2 digits
Content  :            n           Address   00        Data        Checksum

n: number of bytes in the data field
Address: 16-bit address at which the data of the record is stored
Data: data field, n bytes
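As a sketch of how such a record could be produced, the following Python function assembles a type 00 record from a load address and a block of data. The function name `make_data_record` is hypothetical. The checksum is computed as the two's complement of the low byte of the sum of all bytes after the start code, as described in the section on the checksum.

```python
def make_data_record(address: int, data: bytes) -> str:
    """Build one type 00 record for a 16-bit load address and up to 255 data bytes."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if len(data) > 0xFF:
        raise ValueError("at most 255 data bytes per record")
    # Byte count, big-endian address, record type 00, then the data field.
    body = bytes([len(data)]) + address.to_bytes(2, "big") + b"\x00" + data
    checksum = (-sum(body)) & 0xFF  # two's complement of the low byte of the sum
    return ":" + (body + bytes([checksum])).hex().upper()


data = bytes.fromhex("214601360121470136007EFE09D21901")
print(make_data_record(0x0100, data))
# :10010000214601360121470136007EFE09D2190140
```

The output matches the second line of the example file in this article.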

End of file record (type 01)

This record marks the end of the file. In the original (8-bit) definition, the address field specifies the start address of the program (PC) for loadable formats. In the 16/32-bit formats the address field must be 0000.

         Start code   Byte count  Address   Type      Checksum
Length   1 character  2 digits    4 digits  2 digits  2 digits
Content  :            00          0000      01        FF

Extended segment address record (type 02)

The data field of the Extended Segment Address Record contains bits 4-19 (counting from 0) of the segment base address for the following data records. It is used when a 16-bit address space (64 KB) is insufficient. The value in the data field is shifted left by 4 bits (equivalent to multiplying by 2^4 = 16) and added to the 16-bit addresses contained in the following data records (type 00). An Extended Segment Address Record remains in effect until it is superseded by another Extended Segment Address Record. The address field of a type 02 record is always 0000 and the length is 02.

         Start code   Byte count  Address   Type      Data field  Checksum
Length   1 character  2 digits    4 digits  2 digits  4 digits    2 digits
Content  :            02          0000      02        Segment     Checksum
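The address arithmetic described above can be sketched as follows. The function name `segmented_address` is hypothetical, and this minimal version does not model any address wrap-around; it only shows the shift-and-add step.

```python
def segmented_address(segment: int, offset: int) -> int:
    """Combine a type 02 segment value with the 16-bit address of a data record."""
    if not 0 <= segment <= 0xFFFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("segment and offset must each fit in 16 bits")
    return (segment << 4) + offset  # segment * 16 + offset


# Segment 1000 hex followed by a data record at address 0100 hex,
# as in the example file of this article.
print(f"{segmented_address(0x1000, 0x0100):05X}")  # 10100
```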

Start segment address record (type 03)

This record specifies the start address for load modules. For x86 processors this is the content of the CS:IP registers. The record can appear at any position in the file. The start address is calculated as segment * 16 + offset. The address field is always 0000 and the length is 04.

         Start code   Byte count  Address   Type      Data field          Checksum
Length   1 character  2 digits    4 digits  2 digits  4 digits  4 digits  2 digits
Content  :            04          0000      03        Segment   Offset    Checksum
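A minimal sketch of decoding such a record, assuming a hypothetical helper `start_segment_address` and a record constructed for this illustration (CS = 1000 hex, IP = 0100 hex). The checksum is not verified here.

```python
from typing import Tuple


def start_segment_address(line: str) -> Tuple[int, int, int]:
    """Return (CS, IP, CS * 16 + IP) from a type 03 record."""
    raw = bytes.fromhex(line.strip()[1:])
    # Length 04, record type 03, 4 data bytes, plus checksum: 9 bytes in total.
    if len(raw) != 9 or raw[0] != 0x04 or raw[3] != 0x03:
        raise ValueError("not a start segment address record")
    cs = int.from_bytes(raw[4:6], "big")  # segment
    ip = int.from_bytes(raw[6:8], "big")  # offset
    return cs, ip, cs * 16 + ip


# Record built for this example only; it is not taken from a real program.
cs, ip, start = start_segment_address(":0400000310000100E8")
print(f"CS={cs:04X} IP={ip:04X} start={start:05X}")
# CS=1000 IP=0100 start=10100
```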

Extended linear address record (type 04)

The Extended Linear Address Record (also called a 32-bit address record or HEX386 record) supports a 32-bit address space with a limit of 4 GB. Its data field contains bits 16-31 (counting from 0), i.e. the more significant 16 bits (ULBA, Upper Linear Base Address), of a 32-bit address (LBA, Linear Base Address). The record applies to all subsequent type 00 data records until it is replaced by another Extended Linear Address Record. The absolute memory address of a type 00 data record is obtained by placing the ULBA in front of the 16-bit address field of that data record. If a type 00 data record in a 32-bit address space is not preceded by a type 04 record, the upper 16 address bits default to 0000.

The address field of the Extended Linear Address Record itself is always 0000, and the length is 02:

         Start code   Byte count  Address   Type      Data field                 Checksum
Length   1 character  2 digits    4 digits  2 digits  4 digits                   2 digits
Content  :            02          0000      04        ULBA, address (high word)  Checksum
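Placing the ULBA in front of the 16-bit address field amounts to a 16-bit shift, which can be sketched like this. The function name `linear_address` and the sample values are assumptions made for illustration.

```python
def linear_address(ulba: int, offset: int) -> int:
    """Combine the upper 16 bits from a type 04 record with a data record address."""
    if not 0 <= ulba <= 0xFFFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("ulba and offset must each fit in 16 bits")
    return (ulba << 16) | offset


# Hypothetical values: ULBA 0800 hex and a data record at address 1234 hex.
print(f"{linear_address(0x0800, 0x1234):08X}")  # 08001234

# Without a preceding type 04 record the upper 16 bits default to 0000.
print(f"{linear_address(0x0000, 0x1234):08X}")  # 00001234
```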

Start linear address record (type 05)

This record specifies the start address for load modules. For x86 processors, this is the content of the EIP register. The address field is always 0000 and the length is 04.

         Start code   Byte count  Address   Type      Data field  Checksum
Length   1 character  2 digits    4 digits  2 digits  8 digits    2 digits
Content  :            04          0000      05        EIP         Checksum
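A sketch of how a type 05 record could be generated from a 32-bit start address. The function name `make_start_linear_record` and the sample EIP value are hypothetical; the checksum is the two's complement of the low byte of the sum of all preceding bytes of the record.

```python
def make_start_linear_record(eip: int) -> str:
    """Build a type 05 record carrying a 32-bit start address."""
    if not 0 <= eip <= 0xFFFFFFFF:
        raise ValueError("EIP must fit in 32 bits")
    # Length 04, address 0000, type 05, then the 32-bit value big-endian.
    body = bytes([0x04, 0x00, 0x00, 0x05]) + eip.to_bytes(4, "big")
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()


# Sample start address chosen for illustration.
print(make_start_linear_record(0x000000CD))  # :04000005000000CD2A
```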

Calculation of the checksum

The checksum is calculated over the entire record, excluding the start code and the checksum itself. The bytes of the record are summed, the low-order byte of the sum is taken, and its two's complement is formed.

The two's complement is formed by inverting the bits of the low byte and then adding 1. This can be done, for example, by an exclusive OR (XOR) with FF hex followed by adding 01 hex. Thus 00 hex remains unchanged, 01 hex becomes FF hex, and so on.

The two's complement expresses a negative number in the binary system. Since the checksum represents the negative sum of the remaining bytes, checking a record for errors is very easy: add up all bytes of the record including the checksum, and the low-order byte of the result is 00 hex if the record is correct.
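Both directions, computing the checksum and checking a complete record, can be sketched in a few lines of Python. The function names `checksum` and `is_valid` are hypothetical; the computation follows the XOR-and-add description above.

```python
def checksum(body: bytes) -> int:
    """Two's complement of the low byte of the sum of all bytes after the start code."""
    low = sum(body) & 0xFF
    return ((low ^ 0xFF) + 1) & 0xFF  # invert the bits, add 1, keep one byte


def is_valid(line: str) -> bool:
    """A record is correct if all its bytes, checksum included, sum to 00 in the low byte."""
    raw = bytes.fromhex(line.strip()[1:])
    return sum(raw) & 0xFF == 0


print(f"{checksum(bytes.fromhex('020000021000')):02X}")  # EC
print(is_valid(":020000021000EC"))                       # True
print(is_valid(":020000021000ED"))                       # False
```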

Variants

Intel

In the course of processor development from the Intel 4004 to the present day, several variants have been defined:

Variant  Use                          Permitted record types
I08HEX   4/8-bit CPUs (4004..8085)    00 (Data), 01 (End of File)
I16HEX   16-bit CPUs (8086/186/286)   00 (Data), 01 (End of File), 02 (Extended Segment Address), 03 (Start Segment Address)
I32HEX   32-bit CPUs (from 80386)     00 (Data), 01 (End of File), 02 (Extended Segment Address), 03 (Start Segment Address), 04 (Extended Linear Address), 05 (Start Linear Address)
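Based on the record types permitted for each variant, the smallest variant that can represent a given file can be determined from the record types it uses. The following is a sketch only; the function name `minimal_variant` is hypothetical and not part of any Intel tool.

```python
def minimal_variant(record_types) -> str:
    """Return the smallest variant whose permitted record types cover the given ones."""
    types = set(record_types)
    if not types <= {0, 1, 2, 3, 4, 5}:
        raise ValueError("unknown record type")
    if types & {4, 5}:
        return "I32HEX"  # linear address records require the 32-bit variant
    if types & {2, 3}:
        return "I16HEX"  # segment address records require the 16-bit variant
    return "I08HEX"      # only data and end of file records


print(minimal_variant([0, 1]))     # I08HEX
print(minimal_variant([0, 1, 2]))  # I16HEX
print(minimal_variant([0, 1, 4]))  # I32HEX
```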

Other manufacturers

The HEX format became widely used as a de facto standard. Some implementations changed the byte order in the data field, i.e. the order of the bytes does not match their position in the address space. Some manufacturers (e.g. Texas Instruments) have also changed the addressing: there, an address does not refer to a byte but to a unit as wide as a register of the processor.

Example

:020000021000EC
:10010000214601360121470136007EFE09D2190140
:100110002146017EB7C20001FF5F16002148011988
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
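Each record consists of the following fields, in this order:

  • Start code
  • Byte count
  • Address
  • Type
  • Data field
  • Checksum

In the first record, ":" is the start code, 02 the byte count, 0000 the address, 02 the type, 1000 the data field and EC the checksum.

The checksum of the first example record is calculated as follows: 02 + 00 + 00 + 02 + 10 + 00 = 14 hex, and the two's complement of 14 hex is EC hex.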
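To tie the record types together, the example file can be loaded into a memory image with a short Python sketch. This is a simplified illustration, not a complete loader: the name `load_hex` is hypothetical, start address records (types 03 and 05) are skipped, and address wrap-around is not modelled.

```python
from typing import Dict

EXAMPLE = """\
:020000021000EC
:10010000214601360121470136007EFE09D2190140
:100110002146017EB7C20001FF5F16002148011988
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
"""


def load_hex(text: str) -> Dict[int, int]:
    """Return a mapping from absolute address to byte value."""
    memory: Dict[int, int] = {}
    base = 0  # set by type 02 or type 04 records
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"line {lineno}: missing start code")
        raw = bytes.fromhex(line[1:])
        if len(raw) < 5 or len(raw) != raw[0] + 5:
            raise ValueError(f"line {lineno}: bad record length")
        if sum(raw) & 0xFF != 0:
            raise ValueError(f"line {lineno}: checksum error")
        offset = int.from_bytes(raw[1:3], "big")
        rectyp, data = raw[3], raw[4:-1]
        if rectyp == 0x00:    # data record
            for i, byte in enumerate(data):
                memory[base + offset + i] = byte
        elif rectyp == 0x01:  # end of file record ends processing
            break
        elif rectyp == 0x02:  # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif rectyp == 0x04:  # extended linear address
            base = int.from_bytes(data, "big") << 16
        elif rectyp in (0x03, 0x05):
            pass              # start addresses are not part of the image
        else:
            raise ValueError(f"line {lineno}: unknown record type")
    return memory


image = load_hex(EXAMPLE)
print(len(image), f"{min(image):05X}", f"{max(image):05X}")
# 64 10100 1013F
```

The 64 data bytes land at 10100 hex to 1013F hex because the first record sets the segment to 1000 hex, which is shifted left by 4 bits and added to the record addresses 0100 hex to 013F hex.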

Related file formats

The Motorola S-record format (also S-Record, SREC or S19 for short) is very similar. Other formats for this area of application include plain binary images and the JEDEC format.

References

1. Hexadecimal Object File Format Specification, Revision A, 6 January 1988.
2. General: Intel Hex File Format. ARM Germany GmbH, accessed on 6 September 2017.