Multipurpose Internet Mail Extensions

The Multipurpose Internet Mail Extensions (MIME) are extensions of the Internet standard RFC 822 (replaced by RFC 5322 in 2008), which defines the data format of e-mails. RFC 822 only provides for the American Standard Code for Information Interchange (ASCII). MIME adds support for additional characters such as umlauts as well as for multimedia content (e.g. mail attachments). The extensions were defined in RFC 2045, RFC 2046, RFC 2047, RFC 2048, and RFC 2049. RFC 2048 has only been rated as Best Current Practice by the Internet Engineering Task Force.

In addition, MIME is used to declare content in various Internet protocols such as HTTP and in desktop environments such as KDE, GNOME, Xfce or Aqua.

General description

MIME makes it possible to exchange information about the type of data transmitted between sender and recipient (Content-Type field, Internet media type) and at the same time to define an encoding suitable for the transmission path used (Content-Transfer-Encoding).

Several encoding methods are specified that enable the transmission of non-ASCII characters in text, as well as non-text documents such as images, speech and video, over text-based transmission systems such as e-mail or Usenet. The non-text elements are encoded by the sender and decoded again by the recipient.

Characters outside 7-bit ASCII are often encoded as quoted-printable, whereas binary data is usually Base64-encoded. With Base64, the total size of the attached files increases by 33–36% (see Base64 for an explanation): 752 KiB becomes about 1 MiB (1024 KiB), and 1 MiB becomes about 1393 KiB. Alternatively, for text data it is also possible to transfer the non-ASCII characters directly using Content-Transfer-Encoding: 8bit (the character encoding must be specified, e.g. UTF-8, or ISO 8859-15 for German texts).
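The size increase can be estimated with a short calculation. The following Python sketch is illustrative only: the helper name mime_base64_size and its defaults (76-character lines, each terminated by a two-byte CRLF) are assumptions, and the exact overhead depends on the line length and line-ending convention actually used.

```python
import base64


def mime_base64_size(n_bytes: int, line_len: int = 76, eol_len: int = 2) -> int:
    """Estimate the size in bytes of n_bytes of data after Base64 encoding.

    Hypothetical helper: assumes output lines of line_len characters, each
    followed by a line ending of eol_len bytes (2 for CRLF, 1 for LF).
    """
    encoded = 4 * ((n_bytes + 2) // 3)   # every 3 input bytes become 4 characters
    lines = -(-encoded // line_len)      # ceiling division: number of output lines
    return encoded + lines * eol_len


one_mib = 1024 * 1024
ratio = mime_base64_size(one_mib) / one_mib   # a little above 4/3 because of line breaks
print(f"growth factor for 1 MiB: {ratio:.3f}")

# Cross-check against the standard library, which uses single-byte "\n" line endings.
sample = bytes(1000)
assert mime_base64_size(len(sample), eol_len=1) == len(base64.encodebytes(sample))
```

The pure 4/3 ratio accounts for the lower end of the quoted range; the line breaks inserted into the encoded text account for the rest.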

When MIME is used in other protocols such as HTTP, the binary transfer encoding can also be used, with which arbitrary bytes can be sent directly without any special encoding. This is not allowed for e-mails.

There is an extension of this standard called S/MIME (Secure MIME), which also allows messages to be encrypted and digitally signed. In addition, PGP/MIME (described in RFC 2015 and RFC 3156) provides a PGP-compatible extension for secure data exchange.

A multipart message contains several body parts that are delimited by boundary lines (boundary). The boundary string must be chosen so that it does not occur anywhere inside the body parts themselves. Often this is done by choosing a random string that is unlikely to appear in the rest of the body. Example of a simple multipart message (with a shortened boundary, defined here as frontier):

 MIME-Version: 1.0
 Content-Type: multipart/mixed; boundary=frontier

 This is a multi-part message in MIME format.
 
 --frontier
 Content-Type: text/plain

 This is the body of the message.
 --frontier
 Content-Type: text/html
 Content-Transfer-Encoding: base64

 PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGhpcyBpcyB0aGUg
 Ym9keSBvZiB0aGUgbWVzc2FnZS48L3A+CiAgPC9ib2R5Pgo8L2h0bWw+Cg==
 --frontier--

Each body part is introduced by a line consisting of two hyphens (hyphen-minus characters) followed by the boundary string. The end of the last part is marked by the same sequence followed by two additional hyphens.
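As an illustration, the example message can be taken apart with the email package of the Python standard library. This is only a sketch of one possible way to process such a message; the variable names are chosen for illustration.

```python
from email import message_from_string
from email.policy import default

# The multipart example from above, with the boundary "frontier".
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=frontier\n"
    "\n"
    "This is a multi-part message in MIME format.\n"
    "\n"
    "--frontier\n"
    "Content-Type: text/plain\n"
    "\n"
    "This is the body of the message.\n"
    "--frontier\n"
    "Content-Type: text/html\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGhpcyBpcyB0aGUg\n"
    "Ym9keSBvZiB0aGUgbWVzc2FnZS48L3A+CiAgPC9ib2R5Pgo8L2h0bWw+Cg==\n"
    "--frontier--\n"
)

msg = message_from_string(raw, policy=default)
print(msg.get_boundary())   # frontier

# Collect (media type, decoded content) for each body part;
# get_content() reverses the Base64 encoding of the second part.
parts = [(part.get_content_type(), part.get_content()) for part in msg.iter_parts()]
for media_type, content in parts:
    print(media_type)
    print(content)
```

The text before the first boundary line is the preamble; it is not one of the body parts and is only shown by programs that do not understand MIME.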

Details of the specification

MIME Part 1 - Format of Internet Message Bodies

This first part of the specification, RFC 2045, introduces basic additional fields in the header of e-mails:

  1. MIME-Version
  2. Content-Type
  3. Content-Transfer-Encoding

The Content-Transfer-Encoding field specifies whether the transmission is to take place (or has taken place) unencoded according to the Internet standard RFC 6152, or whether an encoding for the Internet standard RFC 822 has been applied, which must be reversed at the recipient:

  • 7bit - no encoding, text contains only ASCII characters
  • 8bit - no encoding, text also contains non-ASCII characters, transmission via Extended SMTP
  • binary - no encoding, binary content
  • quoted-printable - encoding of control characters and non-ASCII characters by replacing each with an equals sign followed by its hexadecimal value
  • base64 - encoding through transformation into a 6-bit representation
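The two transforming encodings can be tried out with the quopri and base64 modules of the Python standard library. The sample word is an assumption chosen for illustration and is not part of the specification.

```python
import base64
import quopri

# "Grüße" in ISO 8859-15: the bytes for "ü" and "ß" lie outside 7-bit ASCII.
raw = "Grüße".encode("iso-8859-15")

qp = quopri.encodestring(raw)   # non-ASCII bytes become "=" plus two hex digits
b64 = base64.b64encode(raw)     # every 3 bytes become 4 characters of a 64-symbol alphabet

print(qp)    # b'Gr=FC=DFe'
print(b64)   # b'R3L832U='

# Both encodings are reversible at the recipient.
assert quopri.decodestring(qp) == raw
assert base64.b64decode(b64) == raw
```

Quoted-printable leaves the ASCII letters readable, which is why it is preferred for text that is mostly ASCII, while Base64 makes no such distinction.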

If the ESMTP server does not accept binary data according to RFC 3030 (BDAT command), anything that is not text has to be encoded in any case, which is then done with Base64. E-mails that contain nothing but text, however, do not require any transformation:

Example

From: <adam@example.org>
To: <eva@example.org>
Subject: Umlaute dank MIME
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: 8bit

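Wären die drei zusätzlichen Kopfzeilen nicht, wäre diese Zeile nicht leserlich.

(The German body text reads: "Were it not for the three additional header lines, this line would not be legible.")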
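A message with these three header fields can be produced, for example, with the email package of the Python standard library. This is a minimal sketch, not the only way to do it; the short body text is an assumption chosen for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "adam@example.org"
msg["To"] = "eva@example.org"
msg["Subject"] = "Umlaute dank MIME"

# Request the charset and the transfer encoding explicitly; set_content()
# fills in Content-Type and Content-Transfer-Encoding and adds MIME-Version.
msg.set_content("Schöne Grüße", charset="iso-8859-15", cte="8bit")

print(msg["MIME-Version"])               # 1.0
print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])  # 8bit
```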

MIME Part 2 - Media Types

This second part of the specification, RFC 2046, defines main types and subtypes of content for the Content-Type field.

In general, however, other Internet media types can also be used, the specific processing of which is then left to the mail program.
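How a Content-Type value breaks down into main type, subtype and parameters can be seen with the email package of the Python standard library. The header value below is just an example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Content-Type"] = "text/plain; charset=iso-8859-15"

print(msg.get_content_maintype())   # text
print(msg.get_content_subtype())    # plain
print(msg.get_content_charset())    # iso-8859-15
```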

text

A character set can be specified as a parameter of this main type. Plain text without formatting is predefined as the subtype:

  • text/plain

image

JPEG is predefined as a subtype for images:

  • image/jpeg

audio

The codec used by ISDN is predefined as a subtype for sound:

  • audio/basic

video

MPEG is predefined as a subtype for video:

  • video/mpeg

application

This main type is intended for data from application programs. Two subtypes are predefined:

  • application/octet-stream
    This subtype should lead to the data being saved and expressly not to an application program being started.
  • application/postscript
    This subtype should lead to the data being printed.

multipart

This main type is intended for combinations of several contents. Five subtypes are predefined:

  • multipart/mixed
    This subtype is intended for compilations in a specific order.
  • multipart/alternative
    This subtype is intended for the same content in different formats, of which only the most appropriate should be presented. Typically one of the formats is a predefined subtype of MIME.
  • multipart/digest
    This subtype is intended for digests, i.e. collections of messages in which each part is by default of the type message/rfc822.
  • multipart/parallel
    This subtype is intended for systems that can present all types of content at the same time.
  • multipart/related
    This subtype is defined separately in RFC 2387 and is intended to combine several contents that only make sense together. The MIME Encapsulation of Aggregate Documents defined in RFC 2557 is based on this. It was a consequence of the fact that the Hypertext Markup Language could not become a standard for MIME. In addition, the term Internet media type and finally the term XHTML media type were coined for alignment with the Hypertext Transfer Protocol instead of the Simple Mail Transfer Protocol.
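The multipart/alternative case can be sketched with the email package of the Python standard library. The message texts are assumptions chosen for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("This is the body of the message.")   # text/plain version
msg.add_alternative(
    "<p>This is the body of the message.</p>", subtype="html"   # text/html version
)

print(msg.get_content_type())   # multipart/alternative

# A receiving program picks the most suitable variant; here HTML is preferred.
best = msg.get_body(preferencelist=("html", "plain"))
print(best.get_content_type())  # text/html
```

A program that cannot display HTML would pass a different preference list and receive the text/plain part instead.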

message

This main type is intended for handling other e-mails. Three subtypes are predefined:

  • message/rfc822
    This subtype is intended to encapsulate another traditional e-mail.
  • message/partial
    This subtype is intended to split a large e-mail into several parts, send them one after the other and automatically reassemble them.
  • message/external-body
    This subtype is intended to contain only a reference to body data stored elsewhere instead of the data itself.
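Embedding one e-mail in another as message/rfc822 can be sketched with the email package of the Python standard library. Subjects and texts are assumptions chosen for illustration.

```python
from email.message import EmailMessage

inner = EmailMessage()
inner["Subject"] = "Original"
inner.set_content("Text of the original message.")

outer = EmailMessage()
outer["Subject"] = "Fwd: Original"
outer.set_content("See the attached message.")
outer.add_attachment(inner)   # an EmailMessage is attached as message/rfc822

attached = next(outer.iter_attachments())
print(outer.get_content_type())            # multipart/mixed
print(attached.get_content_type())         # message/rfc822
print(attached.get_content()["Subject"])   # Original
```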

MIME Part 3 - Header Extensions for Non-ASCII Text

This third part of the specification, RFC 2047, also removes the restriction to the English character set for the subject and other fields in the header of e-mails.

Originally, umlauts and other special characters were not allowed in the subject of e-mails, only the characters defined in ASCII. A subject such as "Schöne Grüße" (German for "Kind regards") could then arrive as "Sch?ne Gr??e", "Sch ne Gr e" or "Schvne Gr|_e", depending on the program that transmitted the e-mail. To solve these problems, RFC 2047 defines a procedure by which the subject is encoded at the sender and decoded again at the recipient without the data being corrupted during transmission. It uses the following scheme:

=?charset?encoding?encoded text?=

According to RFC 2047 there are many equivalent variants for encoding the subject line "Schöne Grüße":

  • =?UTF-8?B?U2Now7ZuZSBHcsO8w59l?=
  • =?ISO-8859-1?B?U2No9m5lIEdy/N9l?=
  • =?UTF-8?Q?Sch=C3=B6ne_Gr=C3=BC=C3=9Fe?=
  • =?ISO-8859-1?Q?Sch=F6ne_Gr=FC=DFe?=

In all of these variants, the umlauts no longer appear as such, so transmission is safe. However, the subject is no longer directly human-readable. With the two lower variants (with the encoding Q for quoted-printable) the text can still be guessed; with the upper two (with the encoding B for Base64) nothing is recognizable at all. All information is nevertheless included, so that the original subject can be decoded again at the recipient.
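That the four encoded variants listed above are equivalent can be checked with the email.header module of the Python standard library. This is a minimal sketch of decoding at the recipient.

```python
from email.header import decode_header, make_header

variants = [
    "=?UTF-8?B?U2Now7ZuZSBHcsO8w59l?=",
    "=?ISO-8859-1?B?U2No9m5lIEdy/N9l?=",
    "=?UTF-8?Q?Sch=C3=B6ne_Gr=C3=BC=C3=9Fe?=",
    "=?ISO-8859-1?Q?Sch=F6ne_Gr=FC=DFe?=",
]

# decode_header() yields (bytes, charset) pairs; make_header() turns them
# back into a header object whose string form is the original text.
decoded = [str(make_header(decode_header(v))) for v in variants]
print(decoded)   # four times 'Schöne Grüße'
```

In the Q encoding the underscore stands for a space, which is why no literal space appears inside the encoded text.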

MIME Part 4 - Registration Procedures

This fourth part of the specification, now RFC 4289, describes the registration of additional extensions with the Internet Assigned Numbers Authority. The media types registered there are diverse and also include types that are expressly obsolete or deprecated. Registrations were already being accepted in 1994 without taking MIME into account. Since 1995, the registration procedure as a whole has only had the status of Best Current Practice. At the end of 2005, the registration of media types was removed from the MIME specification in order to counteract common misunderstandings. How a registered media type relates to MIME can only be determined from the respective specification.

MIME Part 5 - Conformance Criteria and Examples

This fifth part of the specification, RFC 2049, defines minimum requirements for e-mail programs:

  • Mandatory additional header field in every e-mail created:
    MIME-Version: 1.0
  • Sending all e-mails that do not conform to RFC 822 with MIME encodings and header fields.
  • Reporting ISO 8859 character sets in received e-mails.
  • Recognizing and presenting the message/rfc822 content type.
  • Extensive recognition and display of the multipart content type.
  • Processing all unrecognized content types as the application/octet-stream content type.
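The last requirement, the fallback for unrecognized content types, can be sketched as follows. The set SUPPORTED_TYPES and the function effective_type are hypothetical names chosen for illustration; a real mail program would know many more types.

```python
# Hypothetical set of media types a mail program knows how to display.
SUPPORTED_TYPES = {"text/plain", "text/html", "image/jpeg", "message/rfc822"}


def effective_type(declared: str) -> str:
    """Return the media type a part is processed as, falling back for unknown types."""
    media_type = declared.split(";", 1)[0].strip().lower()   # drop parameters
    if media_type in SUPPORTED_TYPES:
        return media_type
    return "application/octet-stream"   # treat as opaque data to be saved


print(effective_type("text/plain; charset=utf-8"))   # text/plain
print(effective_type("application/x-unknown"))       # application/octet-stream
```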

Encryption

RFC 1847 defines the basic framework for encryption and electronic signatures using MIME. Two additional media types are provided for this:

  • multipart/signed
  • multipart/encrypted

The Secure/Multipurpose Internet Mail Extensions (S/MIME) defined in RFC 5751 are based on the Cryptographic Message Syntax.

The MIME Security with Pretty Good Privacy (PGP/MIME) defined in RFC 2015 uses Pretty Good Privacy (PGP) instead.

Specifications

  • RFC 2045 Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies
  • RFC 2046 Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types
  • RFC 2047 MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text
  • RFC 2048 Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures
  • RFC 2049 Multipurpose Internet Mail Extensions (MIME) Part Five: Conformance Criteria and Examples
