Char (data type)

Char or Character (pronounced [kæɹ] or [kʌɹ], short for "character") is a data type in many programming languages for data areas or fields whose elements each represent a single character.

Data type

As a data type, char specifies that each element of a memory area (generally) consists of 8 bits and represents one displayable character (letter, digit, special character, ...). Which character it is follows from the content of the memory location: under the character encoding in use, each byte value (in the range 00 hex to FF hex) is assigned a particular character, e.g. 48 hex = 'H' and 30 hex = '0' in ASCII. Unlike values in numeric formats, characters as such carry no sign (positive / negative), although some languages, such as C, treat char as a small integer type whose signedness is implementation-defined.
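The mapping from byte value to character can be tried out with a short Python sketch. Python has no separate char type, so one-character strings stand in for it here; the helper name `decode_byte` and the choice of the codecs "ascii" and "cp500" (an EBCDIC variant shipped with Python) are assumptions made for illustration only.

```python
def decode_byte(value: int, encoding: str) -> str:
    """Return the character that a single byte value stands for in `encoding`."""
    return bytes([value]).decode(encoding)

# The two examples from the text, interpreted as ASCII.
ascii_h = decode_byte(0x48, "ascii")
ascii_zero = decode_byte(0x30, "ascii")

# The same letter has a different byte value under an EBCDIC code page.
ebcdic_code_of_h = "H".encode("cp500")[0]

print(ascii_h, ascii_zero, hex(ebcdic_code_of_h))
```

The point of the last line is only that the byte value alone does not determine the character; the encoding has to be known as well.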

The char type is only of limited use for calculations and indexing, because it is 1 byte wide in most programming languages and therefore covers only 256 values. Variables that need a larger value range should use other data types, such as Integer.

Character encoding

Most programming languages store a character in one byte (8 bits); ASCII and its derivatives such as ISO 8859-1, along with EBCDIC, are the most common encodings. Newer programming languages such as C# or Java use two bytes per char (Unicode) and encode characters in UTF-16; characters outside the Basic Multilingual Plane then occupy two such units. Established languages such as C and C++ have been extended with the wide-character data type wchar_t (UnicodeString in Object Pascal).
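The storage sizes mentioned here can be checked with a small Python sketch. The helper `encoded_size` is a hypothetical name; "latin-1" is Python's codec name for ISO 8859-1, and "utf-16-le" is used so that no byte order mark is added to the result.

```python
def encoded_size(ch: str, encoding: str) -> int:
    """Number of bytes needed to store the single character `ch` in `encoding`."""
    return len(ch.encode(encoding))

e_acute = "\u00e9"          # LATIN SMALL LETTER E WITH ACUTE
grinning = "\U0001F600"     # a character outside the Basic Multilingual Plane

size_latin1 = encoded_size(e_acute, "latin-1")          # one byte per character
size_utf16 = encoded_size(e_acute, "utf-16-le")         # one 16-bit code unit
size_utf16_astral = encoded_size(grinning, "utf-16-le") # two 16-bit code units

print(size_latin1, size_utf16, size_utf16_astral)
```

The third value illustrates why a two-byte char in a UTF-16 language holds a code unit rather than, in every case, a complete character.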

Literals, constants, variables

Characters can be written directly in source code as literals. In many programming languages they are enclosed in single quotation marks, e.g. 'a'. Alternatively, characters can be defined as constants when fields are declared, or assigned to a variable by a corresponding statement, either directly or as part of a higher-level data structure, e.g. fieldA = fieldB or areaX (which contains fieldA) = input.

So-called escape sequences can be used to represent special characters. The backslash is often used as the escape character; for example, a horizontal tab is written as '\t'.
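As a rough illustration in Python, where a character literal is simply a string of length one, each of the following escape sequences takes two keystrokes in the source text but denotes exactly one character. The variable names are chosen for this example only.

```python
# Each literal below is written with a backslash escape but holds one character.
tab = "\t"            # horizontal tab
newline = "\n"        # line feed
backslash = "\\"      # the escape character itself
single_quote = '\''   # a quote inside a single-quoted literal

for name, ch in [("tab", tab), ("newline", newline),
                 ("backslash", backslash), ("single_quote", single_quote)]:
    # ord() gives the numeric code of the character the escape stands for.
    print(f"{name}: length={len(ch)}, code={ord(ch)}")
```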

Operations

Characters are ordered according to the encoding in use. Most programming languages therefore offer comparison operators such as equal (e.g. "=", "==" or "IS EQUAL"), not equal (e.g. "!=", "<>" or "IS NOT EQUAL"), less than (e.g. "<" or "IS LESS THAN") and greater than (e.g. ">" or "IS GREATER THAN").

There are usually also operators or functions to increment (e.g. "++", "SUCC") and decrement (e.g. "--", "PRED") a character, i.e. to determine its successor or predecessor.
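Comparison and successor/predecessor can be sketched in Python as follows. Python has no "++" operator and no built-in SUCC or PRED, so the helpers `succ` and `pred` below are hypothetical stand-ins built on the built-in functions `ord` and `chr`; they do no bounds checking at the ends of the code range.

```python
def succ(ch: str) -> str:
    """Successor of a character: the character with the next higher code."""
    return chr(ord(ch) + 1)

def pred(ch: str) -> str:
    """Predecessor of a character: the character with the next lower code."""
    return chr(ord(ch) - 1)

# Ordering follows the numeric codes, so in ASCII and Unicode all
# uppercase letters sort before all lowercase letters.
a_before_b = "a" < "b"
upper_before_lower = "Z" < "a"

print(a_before_b, upper_before_lower, succ("a"), pred("b"))
```

Under a different encoding, such as EBCDIC, the same comparisons could give a different order, which is why the ordering is said to depend on the encoding.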

Since each character is represented by a specific value that depends on its encoding, many programming languages also allow characters to be converted into numbers and vice versa. This can happen implicitly, e.g. by assigning a character to a numeric variable, or explicitly through a function, which may for example be called "ord" or "chr".
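Python is an example of a language that offers only the explicit route, through its built-in functions `ord` and `chr`. The sketch below shows the round trip, a common use of it (turning a digit character into its numeric value), and the fact that mixing a character and a number without conversion is rejected. The variable names are illustrative.

```python
# Explicit conversion in both directions.
code = ord("A")       # character -> number
letter = chr(code)    # number -> character

# Digit characters have consecutive codes, so subtracting the code
# of "0" yields the numeric value of the digit.
digit_value = ord("7") - ord("0")

# Python performs no implicit conversion between characters and numbers.
try:
    "A" + 1
except TypeError:
    implicit_conversion_allowed = False
else:
    implicit_conversion_allowed = True

print(code, letter, digit_value, implicit_conversion_allowed)
```

In C, by contrast, assigning a char to an int variable compiles without any conversion function, which corresponds to the implicit case described above.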
