Control characters

A control character (also called a control code) is a character of a character set that does not represent a displayable character. Displayable characters include letters, digits, and punctuation marks.

Originally, control characters were used to control text output devices such as text printers, automatic typewriters, telegraphs, or teleprinters. With control characters, control commands for the output device can be transferred within the character set itself instead of sending the control information via a separate protocol.

Today only a few control characters still have a meaning (e.g. Line Feed, Form Feed, Carriage Return, Escape); most are practically no longer used. Sometimes they are also used to mark up characters to be transferred that are not otherwise defined in the character set in use.

A character table usually defines both displayable characters and control characters; in the widespread ASCII code, the control characters are characters 0 to 31 and character 127. To make control characters visible as graphic symbols, e.g. for checking a data transmission, Unicode provides the characters of the Control Pictures block (U+2400 to U+243F).
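The relationship between the ASCII control codes and the Control Pictures block can be illustrated with a minimal Python sketch. The function names `control_picture` and `make_visible` are hypothetical and chosen only for this example; the sketch relies on the fact that the symbols for codes 0 to 31 sit at U+2400 plus the code value, and that the symbol for DEL is U+2421.

```python
def control_picture(ch: str) -> str:
    """Return the Control Pictures symbol for an ASCII control character."""
    code = ord(ch)
    if 0x00 <= code <= 0x1F:
        # U+2400..U+241F mirror the control codes 0..31 one to one.
        return chr(0x2400 + code)
    if code == 0x7F:
        # DEL has its own symbol at U+2421.
        return "\u2421"
    # Displayable characters are passed through unchanged.
    return ch


def make_visible(text: str) -> str:
    """Replace every ASCII control character in text by its graphic symbol."""
    return "".join(control_picture(c) for c in text)


print(make_visible("Tab\there\r\n"))  # prints: Tab␉here␍␊
```

Such a mapping is useful for logging or debugging a data stream, because the control characters then show up on screen instead of being executed by the terminal.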

C0 control characters

Legend for the following table
Dec	Code value of the character in the decimal number system
Hex	Code value of the character in the hexadecimal number system
Ctrl	Usual notation ("caret notation") as a control code. The control character can be entered on the keyboard: the leading symbol `^` stands for the `Ctrl` (control) key, which is labelled `Strg` on German keyboards. This key is held down while the second character is entered.
C	The `\x` sequences give the spelling of this character in the C programming language and in languages derived from it, such as C++, Java and, above all, scripting languages, shells, and others. This notation is usually interpreted inside character strings, e.g. `printf("Ein\tTab\nZeilenumbruch\rWagenrücklauf");`
ISO	official abbreviation for the control character (according to ISO-646 standard)
U	Graphic Unicode symbol from the block U+2400 to U+243F
Type	Character type: `CC` = Communication Control (protocol characters), `FE` = Format Effector (output formatting characters), `IS` = Information Separator (separator characters)
English	official name for which the abbreviation stands (according to ASCII standard)
German	unofficial German translation of this English name
(original) meaning	Meaning of the control character. The italic explanations describe the obsolete meaning, which is now to be regarded as historical and is no longer used.
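
The caret notation described in the legend follows a simple rule: the letter after `^` is the character whose code differs from the control code by 0x40. The following Python sketch shows this; the function names `caret_notation` and `from_caret` are hypothetical. Note that `^?` for DEL is a common convention that the table below does not list.

```python
def caret_notation(code: int) -> str:
    """Return the caret notation (e.g. '^G') for an ASCII control code."""
    if not (0x00 <= code <= 0x1F or code == 0x7F):
        raise ValueError("not an ASCII control character")
    # Flipping bit 6 maps 0..31 to '@'..'_' and 127 to '?'.
    return "^" + chr(code ^ 0x40)


def from_caret(notation: str) -> int:
    """Return the control code for a caret notation such as '^M'."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError("expected a caret followed by one character")
    # Lowercase letters are accepted, as Ctrl+g and Ctrl+G are equivalent.
    return ord(notation[1].upper()) ^ 0x40


print(caret_notation(0x07))  # prints: ^G
print(from_caret("^["))      # prints: 27
```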

ASCII or C0 control characters
Dec	Hex	Ctrl	C	ISO	U	Type	English	German	(original) meaning
0	0x00	^@	\0	NUL	␀		Null	Null character	Character without information content. Can be inserted into a message at will and is discarded by the recipient. Marks the end of a string in C.
1	0x01	^A		SOH	␁	CC	Start of heading	Beginning of the header	Marks the beginning of the machine-readable destination address or routing information. The header is ended by the character STX.
2	0x02	^B		STX	␂	CC	Start of text	Beginning of the message	Marks the beginning of the message to be transmitted and thus the end of the header.
3	0x03	^C		ETX	␃	CC	End of text	End of message	Marks the end of the message to be transmitted. Used as the "abort" character for terminal input.
4	0x04	^D		EOT	␄	CC	End of transmission	End of transmission	Marks the end of the entire transmission, which can consist of several messages including headers. Used as "program termination" by some command interpreters. Used as "end of input" for terminal input.
5	0x05	^E		ENQ	␅	CC	Enquiry	Enquiry	A request on a bidirectional communication device. The other station can respond with its identification or with its status. Usually called "Wer da?" ("Who is there?") on German teleprinters.
6	0x06	^F		ACK	␆	CC	Acknowledge	Acknowledgment of receipt	Control character that expresses the positive response to a previous request.
7	0x07	^G	\a	BEL	␇		Bell	Bell	Generates an acoustic signal (bell or beep) on the receiving terminal. Used as an alarm or warning signal.
8	0x08	^H	\b	BS	␈	FE	Backspace	Backward step	Moves the print head/cursor back one position. The sequence e Backspace ´ generates an é on a printer, often just an e on a terminal.
9	0x09	^I	\t	HT	␉	FE	Horizontal tab	Horizontal tab character	Moves the print head/cursor to the next predefined position (tab stop) in the current line.
10	0x0A	^J	\n	LF	␊	FE	Line feed	Line feed	Moves the print head/cursor to the next line. If agreed between sender and recipient, it means "New Line", in which case the first print position of the next line is approached. Used, among other things, as the end-of-line character on Unix systems (Unix, BSD, macOS, Linux). Under MS-DOS or Windows, the combination "Carriage Return" + "Line Feed" ends a line.
11	0x0B	^K	\v	VT	␋	FE	Vertical tab	Vertical tab character	Moves the print head/cursor to the next predefined line.
12	0x0C	^L	\f	FF	␌	FE	Form feed	Form feed	Moves the print head/cursor to the first print position on the next page (page break). Ejects the current page or clears the screen.
13	0x0D	^M	\r	CR	␍	FE	Carriage return	Carriage return	Moves the print head/cursor back to the first print position of the current line. Used as a line break in BASIC. Used in classic Mac OS up to version 9 as the end-of-line character ("New Line"). Under MS-DOS or Windows, the combination "Carriage Return" + "Line Feed" ends a line. On terminals or printers, a carriage return can be used to write over the same line several times (e.g. for a progress bar).
14	0x0E	^N		SO	␎		Shift Out	Switching	Switches to a special display mode, e.g. bold on a printer.
15	0x0F	^O		SI	␏		Shift In	Switching back	Switches back to the normal display mode.
16	0x10	^P		DLE	␐	CC	Data Link Escape	"Data link escape character" (literally translated)	Gives the following characters a special meaning. May only be used for additional protocol characters.
17	0x11	^Q		DC1	␑		Device Control 1	Device control character 1	Device-specific control characters, e.g. to switch certain device functions (such as the font on a printer) on and off. ^S (XOFF) and ^Q (XON) are also used for XON/XOFF flow control.
18	0x12	^R		DC2	␒		Device Control 2	Device control character 2
19	0x13	^S		DC3	␓		Device Control 3	Device control character 3
20	0x14	^T		DC4	␔		Device Control 4	Device control character 4
21	0x15	^U		NAK	␕	CC	Negative Acknowledge	Negative acknowledgment	Expresses the negative response to a previous request.
22	0x16	^V		SYN	␖	CC	Synchronous idle	Synchronization signal	In synchronous data transmission, enables synchronization even when there are no signals to be transmitted.
23	0x17	^W		ETB	␗	CC	End of Transmission Block	End of the transmission block	Indicates the end of a block of transmitted data when the end of the block cannot be recognized from the data itself.
24	0x18	^X		CAN	␘		Cancel	Cancellation	Indicates that the data just transmitted is or was incorrect and must be discarded.
25	0x19	^Y		EM	␙		End of medium	End of medium	Indicates the end of the storage medium (physical or logical).
26	0x1A	^Z		SUB	␚		Substitute	Replacement	Replaces a character that is invalid or incorrect, e.g. because of a parity error in the transmission. End-of-file character (EOF) for text files under CP/M, which lacks byte-exact file lengths; initially also common under DOS, although unnecessary there.
27	0x1B	^[		ESC	␛		Escape	Escape character	The following characters have a special meaning; an escape sequence begins.
28	0x1C	^\		FS	␜	IS	File separator	File separator	Separators that logically divide blocks of data. The exact meaning of the logical units "File", "Group", "Record", and "Unit" is not specified, but they should be ordered from "File" as the highest structural unit down to "Unit" as the lowest.
29	0x1D	^]		GS	␝	IS	Group separator	Group separator
30	0x1E	^^		RS	␞	IS	Record separator	Record separator
31	0x1F	^_		US	␟	IS	Unit separator	Unit separator
127	0x7F			DEL	␡		Delete	Delete character	The DEL character has a binary code consisting entirely of ones. There is a historical reason for this: once a hole has been punched in a punched tape, it cannot be refilled. However, all the remaining holes of a character can be punched out afterwards, turning it into the non-printing control character 'BU' (in the 5-channel Baudot code) and thereby overwriting an incorrect entry. This is why the character also stands for "deleted character" or "deleted".
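
The hierarchy of the information separators in the table can be illustrated with a small Python sketch. It assumes a hypothetical convention, agreed between sender and recipient, in which US separates the fields of a record and RS separates records; the function names are invented for this example and no standard prescribes this exact use.

```python
# The four information separators, from highest to lowest structural unit.
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"


def encode_records(records):
    """Join fields with US and records with RS (assumed convention)."""
    return RS.join(US.join(fields) for fields in records)


def decode_records(data):
    """Split a string produced by encode_records back into records."""
    return [record.split(US) for record in data.split(RS)]


rows = [["Ada", "London, UK"], ["Grace", "New York\nUSA"]]
packed = encode_records(rows)
assert decode_records(packed) == rows
```

Because the separators are control characters, the fields themselves may contain commas, tabs, or line breaks without any quoting, which is the practical advantage over comma-separated text.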

C1 control characters

The control characters that ISO 8859 newly defined for all of its parts are rarely used and are now only of historical interest. Most Windows character sets, including CP 1252, occupy these code positions with printable characters that are not contained in the corresponding ISO standard, e.g. ISO 8859-1.
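
The different treatment of these code positions can be observed with Python's built-in codecs. The following sketch decodes the same two bytes once as ISO 8859-1 and once as CP 1252; the byte values 0x80 and 0x85 are chosen only as examples.

```python
import unicodedata

raw = bytes([0x80, 0x85])

# ISO 8859-1 maps the bytes to the C1 control characters U+0080 and U+0085.
as_latin1 = raw.decode("iso-8859-1")
categories_latin1 = [unicodedata.category(c) for c in as_latin1]

# CP 1252 maps the same bytes to printable characters.
as_cp1252 = raw.decode("cp1252")
categories_cp1252 = [unicodedata.category(c) for c in as_cp1252]

print(categories_latin1)  # prints: ['Cc', 'Cc']
print(as_cp1252)          # prints: €…
```

Here `Cc` is the Unicode general category for control characters, while the CP 1252 result consists of a currency symbol and a punctuation mark.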

All C1 control characters can also be represented with C0 control characters by means of escape sequences; see ANSI escape sequence.
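
In the 7-bit representation, a C1 control character is written as ESC followed by the byte whose value is the C1 code minus 0x40, so that, for example, CSI (0x9B) becomes ESC `[`. A minimal Python sketch of this conversion follows; the function names are hypothetical.

```python
ESC = 0x1B


def c1_to_escape_sequence(code: int) -> bytes:
    """Return the two-byte 7-bit form (ESC + final byte) of a C1 code."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control code")
    # 0x80..0x9F correspond to the final bytes 0x40..0x5F ('@'..'_').
    return bytes([ESC, code - 0x40])


def escape_sequence_to_c1(seq: bytes) -> int:
    """Return the C1 code for a two-byte sequence ESC + 0x40..0x5F."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not a 7-bit representation of a C1 code")
    return seq[1] + 0x40


print(c1_to_escape_sequence(0x9B))         # prints: b'\x1b['
print(hex(escape_sequence_to_c1(b"\x1b]")))  # prints: 0x9d
```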

ISO 8859 or C1 control characters
Dec	Hex	IETF	ISO	Character name	Comment
128	0x80	PA	PAD	Padding Character	Reserved control character; considered in a draft of DIS 10646, but never included in the ISO 10646 standard. Marked as XXX in Unicode.
129	0x81	HO	HOP	High Octet Preset
130	0x82	BH	BPH	Break Permitted Here	A position where a line break may occur. Similar to the zero-width space, Unicode U+200B ZERO WIDTH SPACE.
131	0x83	NH	NBH	No Break Here	A position where no line break is wanted. Comparable to Unicode U+2060 WORD JOINER.
132	0x84	IN	IND	Index	Moves the current position one line down while maintaining the horizontal position. The index function was declared obsolete in the 4th edition of ECMA-48 (1986) and was deleted in the 5th edition (1991).
133	0x85	NL	NEL	Next Line	Moves the current position to the beginning of the next line, alternatively to the home or line limit position. NEL is at the same position as the EBCDIC NL (New Line) character.
134	0x86	SA	SSA	Start of Selected Area
135	0x87	ES	ESA	End of Selected Area
136	0x88	HS	HTS	Character Tabulation Set	Sets a tab stop at the active position. Referred to as "Horizontal Tabulation Set" before ECMA-48 (4th edition, 1986).
137	0x89	HJ	HTJ	Character Tabulation with Justification	Moves text to the next tab stop position. The text is understood as the part from the previous tab stop to the active position. Referred to as "Horizontal Tabulation with Justify" before ECMA-48 (4th edition, 1986).
138	0x8A	VS	VTS	Line Tabulation Set	Sets a vertical tab stop at the active line. Referred to as "Vertical Tabulation Set" before ECMA-48 (4th edition, 1986).
139	0x8B	PD	PLD	Partial Line Forward	Referred to as "Partial Line Down" before ECMA-48 (5th edition, 1991).
140	0x8C	PU	PLU	Partial Line Backward	Referred to as "Partial Line Up" before ECMA-48 (5th edition, 1991).
141	0x8D	RI	RI	Reverse Line Feed	Moves to the previous line while maintaining the horizontal position. Referred to as "Reverse Index" before ECMA-48 (4th edition, 1986).
142	0x8E	S2	SS2	Single Shift 2	Loads character set G2 into GL for one character.
143	0x8F	S3	SS3	Single Shift 3	Loads character set G3 into GL for one character.
144	0x90	DC	DCS	Device Control String	Start character of a control sequence that ends with `ST` ("String Terminator"); can contain a command for the receiving device or a status report from the sending device.
145	0x91	P1	PU1	Private Use One	Reserved, no standardized meaning.
146	0x92	P2	PU2	Private Use Two	Reserved, no standardized meaning.
147	0x93	TS	STS	Set Transmit State
148	0x94	CC	CCH	Cancel Character
149	0x95	MW	MW	Message Waiting	Sets a "message waiting" indicator in the receiving device.
150	0x96	SG	SPA	Start Protected Area	Together with the following character string, which contains a list of character positions, defines an area that is protected against manual modification or transmission; deletion protection is optional. The character string must end with EPA ("End Protected Area"). The function is called "Start of Protected Area" according to ANSI X3.64 and ECMA-48 (1979), "Start of Guarded Protected Area" according to ISO 6429 (1983) and ECMA-48 (1984), or "Start of Guarded Area" according to ISO 6429 (1992) and ECMA-48 (1986 and 1991).
151	0x97	EG	EPA	End Protected Area	Specifies the end of an area that was started with SPA. The function is called "End of Protected Area" according to ANSI X3.64 and ECMA-48 (1979), "End of Guarded Protected Area" according to ISO 6429 (1983) and ECMA-48 (1984), or "End of Guarded Area" according to ISO 6429 (1992) and ECMA-48 (1986 and 1991).
152	0x98	SS	SOS	Start of String	Marks the beginning of a control string that is ended with `ST` ("String Terminator"). The character string must not contain any further `SOS` character (152 decimal or 98 hexadecimal). The interpretation of the character string is up to the respective program.
153	0x99	GC	SGCI	Single Graphic Character Introducer	Reserved control character; considered in a draft of DIS 10646, but never included in the ISO 10646 standard. Marked as XXX in Unicode.
154	0x9A	SC	SCI	Single Character Introducer	Executes the function defined by a single subsequent byte, which, however, has not been standardized. Also introduces a proprietary VT100 control sequence.
155	0x9B	CI	CSI	Control Sequence Introducer	Introduces a control sequence. See ANSI escape sequence.
156	0x9C	ST	ST	String Terminator	Marks the end of a string that was started with `APC`, `DCS`, `OSC`, `PM`, or `SOS`.
157	0x9D	OC	OSC	Operating System Command	Marks the beginning of an "Operating System Command" character string that is ended with `ST` ("String Terminator"). The interpretation of the character string is up to the respective operating system.
158	0x9E	PM	PM	Privacy Message	Marks the beginning of a "Privacy Message" that is ended with `ST` ("String Terminator").
159	0x9F	AC	APC	Application Program Command	Marks the beginning of an "Application Program Command" character string that is ended with `ST` ("String Terminator"). The interpretation of the character string is up to the respective program.

Unicode

The control characters of the ASCII range 0x00 to 0x1F can be found in Unicode under C0 Controls (U+0000 to U+001F), those of the ISO 8859 range 0x80 to 0x9F under C1 Controls (U+0080 to U+009F). The first 128 characters in the Unicode encoding UTF-8 correspond to those of the ASCII and ISO 8859 encodings, so this also applies to the control characters in the range 0x00 to 0x1F. In addition to these characters, Unicode contains a number of other control characters.
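
These statements can be checked with Python's `unicodedata` module, as in the following sketch. It counts the characters of general category `Cc` in the first 256 code points and compares the UTF-8 encodings of a C0 and a C1 control character. That the further Unicode control characters mentioned above correspond to format characters of category `Cf`, such as U+200B, is an assumption made for this illustration.

```python
import unicodedata

# Control characters (general category Cc) among the first 256 code points:
# 32 C0 controls, DEL, and 32 C1 controls.
controls = [cp for cp in range(0x100) if unicodedata.category(chr(cp)) == "Cc"]
print(len(controls))  # prints: 65

# A C0 control character is a single byte in UTF-8, identical to ASCII.
print("\u000a".encode("utf-8"))  # prints: b'\n'

# A C1 control character needs two bytes in UTF-8.
print("\u0085".encode("utf-8"))  # prints: b'\xc2\x85'

# Format characters such as ZERO WIDTH SPACE belong to category Cf.
print(unicodedata.category("\u200b"))  # prints: Cf
```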

Graphic symbols for the control characters can be found in the Unicode block Control Pictures (U+2400 to U+243F).

Input under MS Windows or DOS

As a test, control characters can also be entered under Windows. Holding down the (left) Alt key and typing the decimal code of a control character on the numeric keypad enters that control character at the prompt.

Example: Open the command prompt. Alt+(0 and 7 on the numeric keypad) puts the character ^G at the prompt, which also makes clear that Ctrl+G does the same thing. If you now press Enter (or ^M), this control character is executed in the terminal window and a beep sounds from the system loudspeaker (if available), which corresponds to the bell character (BEL) (see the table above). Likewise, Alt+(0 and 8), like pressing Backspace (or Ctrl+H), deletes a character. BASIC interpreters that use their own keyboard drivers (e.g. GW-BASIC) also accept hexadecimal ASCII codes in the form &hZZ, where Z stands for a hexadecimal digit (e.g. &h0D for carriage return).

Web links

Character tables on different systems (memento of March 10, 2010 in the Internet Archive)
ASCII control codes in detail (English)
