C (programming language)

from Wikipedia, the free encyclopedia
Basic data
Paradigms: imperative, structured
First appeared: 1972
Designer: Dennis Ritchie
Developer: Dennis Ritchie & Bell Labs
Important implementations: GCC, MSVC, Borland C, Portland Group, Intel, Clang
Influenced by: B, BCPL, Algol 68
Influenced: awk, C++, C--, C#, Objective-C, D, Go, Java, JavaScript, PHP, Perl, Python, Vala, Seed7
Operating system: platform-independent

C is an imperative and procedural programming language that the computer scientist Dennis Ritchie developed at Bell Laboratories in the early 1970s. Since then it has been one of the most widely used programming languages.

C is used in a wide variety of areas, for both system and application programming. The basic programs of all Unix systems and the kernels of many operating systems are written in C. Numerous languages, such as C++, Objective-C, C#, D, Java, JavaScript, LSL, PHP, Vala and Perl, borrow their syntax and other properties from C.

History

Ken Thompson (left) and Dennis Ritchie (right)

Emergence

C was developed between 1969 and 1973 by Dennis Ritchie at Bell Laboratories for programming the then new Unix operating system. He based it on the programming language B, which Ken Thompson and Dennis Ritchie had written in 1969 and 1970; the name C arose as the successor to B. B in turn goes back to the programming language BCPL, developed by Martin Richards in the mid-1960s. Ritchie also wrote the first compiler for C. By 1973 the language had matured to the point where the Unix kernel for the PDP-11 could be rewritten in C.

Further development

K&R C extended the language with new keywords such as long and unsigned, and introduced the standard I/O library developed by Mike Lesk and, on the recommendation of Alan Snyder, the preprocessor.

Standards

C is available on almost all computer systems. To curb the proliferation of dialects, C has been standardized several times (C89/C90, C99, C11). Apart from the microcontroller field, which has its own dialects, most current PC and server implementations follow the standard closely, although complete implementations of the current standards are rare. Most C systems with a runtime environment also provide the standardized C standard library. As a result, C programs that do not contain very hardware-specific code can usually be ported to other target systems.

The first edition of The C Programming Language , published in 1978, contains the former unofficial standard K&R C

K&R C

Until 1989 there was no official standard for the language. From 1978, however, the book The C Programming Language, published that year by Brian W. Kernighan and Dennis Ritchie, served as an informal de facto standard. This specification is referred to as K&R C.

In the following years the number of extensions to the language grew steadily, no agreement could be reached on a common standard library, and not even the UNIX compilers implemented K&R C completely. It was therefore decided to define an official standard. After this was finally published in 1989, K&R C remained the de facto standard for many programmers for a few more years, but then quickly lost importance.

ANSI C

In 1983 the American National Standards Institute (ANSI) set up a committee called X3J11, which completed its work in 1989 and adopted the standard ANSI X3.159-1989 Programming Language C. This version of the language is also known as ANSI C, Standard C, or C89 for short.

A year later, the International Organization for Standardization (ISO) adopted the hitherto purely American standard as an international standard, ISO/IEC 9899:1990, known as C90 for short. The names C89 and C90 therefore refer to the same version of C.

After this first round of work by ANSI and ISO, the language standard hardly changed for several years. Normative Amendment 1 to C90 was not published until 1995. It is called ISO/IEC 9899/AMD1:1995 and is also referred to as C95 for short. Besides correcting some details, C95 improved support for international character sets.

C99

After a few minor revisions, the new standard ISO/IEC 9899:1999, or C99 for short, appeared in 1999. It was largely compatible with C90 and introduced several new features, some adopted from C++ and some already implemented by various compilers. Over the years, C99 was supplemented by three Technical Corrigenda.

C11

Work on a new standard with the unofficial working title C1X began in 2007. It was published in December 2011 and is known as C11 for short. In addition to better compatibility with C++, it added new features to the language.

Since the first international standard, C90, C has been developed further by the international working group ISO/IEC JTC1/SC22/WG14. The national standardization bodies adopt the publications of the international standard in a form adapted to their needs.

C18

This standard corresponds to C11 apart from error corrections and a new value of __STDC_VERSION__, and is therefore supported to the same extent as C11. It was published in June 2018 as ISO/IEC 9899:2018.

Use

Despite its age, C is still widespread today and is used in universities, in industry and in open-source projects.

System and application programming

The main area of application of C is system programming, especially of embedded systems, drivers and operating system kernels. The reason is that C combines desirable characteristics such as portability and efficiency with the ability to address hardware directly, and therefore places few demands on a runtime environment.

Application software is also often written in C, although here the language has lost ground to others, which is particularly evident on mobile platforms. Many programming interfaces for application programs and operating system APIs are provided as C interfaces, for example Win32.

Implementation of other languages

Because of the high execution speed and small code size, compilers, program libraries and interpreters for other high-level programming languages (such as the Java Virtual Machine) are often implemented in C.

Some high-level language implementations use C as intermediate code: the source program is first translated into C code, which is then compiled. This approach increases portability without requiring a machine-specific code generator, since C compilers exist for almost every platform. Compilers that use C in this way include Chicken, EiffelStudio, Esterel, PyPy, Sather, Squeak and Vala.

However, C was designed as a programming language and not as a target language for compilers, so it is rather poorly suited as an intermediate language. This led to C-based intermediate languages such as C--.

C is often used to create bindings (for example, the Java Native Interface). Bindings allow programs written in another high-level language to call functions implemented in C. The reverse is often possible as well and can be used to extend programs written in C with another language (e.g. mod_perl).

Syntax

C is case-sensitive.

In addition, C has a very small number of keywords. The number is so low because almost all tasks for which other languages have dedicated keywords are handled by functions of the C standard library (for example console and file input and output, dynamic memory management, etc.).

There are 32 keywords in C89:

auto
break
case
char
const
continue
default
do
double
else
enum
extern
float
for
goto
if
int
long
register
return
short
signed
sizeof
static
struct
switch
typedef
union
unsigned
void
volatile
while

C99 added five more:

_Bool
_Complex
_Imaginary
inline
restrict

Seven more were added with C11:

_Alignas
_Alignof
_Atomic
_Generic
_Noreturn
_Static_assert
_Thread_local

Hello world program

A simple version of the Hello World program in C is the one Ritchie and Kernighan themselves used in the second edition of their book The C Programming Language. Note that the older ANSI C standard does not require a return type to be specified, because the compiler assumes an implicit int as the return type.

#include <stdio.h>
main()
{
  printf("hello, world\n");
}

Data types

char

To store a character (or a small number), one usually uses the data type character, written as char.

The computer does not actually store the character itself (such as "A"), but an equivalent binary number that is at least eight bits long (e.g. 01000001). This number is held in memory and can be converted into the corresponding character at any time using a table; which table applies depends on the current character set or code page of the system environment. For example, 01000001 stands for the character "A" in the ASCII table.

To support character sets that contain more characters than the relatively small ASCII character set, a second data type designed for characters, wchar_t, was soon introduced.

// what is stored is not the character "A" itself but usually one byte ("01000001")
char zeichen = 'A';

// prints the character with code 65 (an "A" in ASCII)
printf("%c", 65);

int

To store an integer (such as 3), one uses a variable of the data type integer, written as int. Depending on the processor architecture and operating system, an int today is usually 32 bits wide, but may also be 64 or, on small systems, 16 bits. 16 bits can hold 65536 different values. To allow negative numbers, the value range for 16 bits is usually -32768 to 32767. If negative numbers are not required, the programmer can use an unsigned integer with unsigned int. With 16 bits, this gives a value range from 0 to 65535.

To reduce or enlarge the value range of an integer, one of the qualifiers short, long or long long is placed in front of it. The keyword int can then be omitted, so long is synonymous with long int. To switch between signed and unsigned integers, there are the two qualifiers signed and unsigned. For a signed integer the qualifier can be omitted, so signed int is equivalent to int. The C standard library supplements these data types with the platform-independent header file <stdint.h>, which defines a set of integer types with fixed widths.

char ganzzahl = 1;      // at least 8 bits, i.e. at least 256 possible values
short ganzzahl = 2;     // at least 16 bits, i.e. at least 65536 possible values
int ganzzahl = 3;       // at least 16 bits, i.e. at least 65536 possible values
long ganzzahl = 4;      // at least 32 bits, i.e. at least 4294967296 possible values
long long ganzzahl = 5; // at least 64 bits, i.e. at least 18446744073709551616 possible values

float and double

Numbers with a fractional part are stored in one of the three data types float, double and long double. In most C implementations, float and double conform to the international standard for binary floating-point arithmetic (IEC 559, which emerged in 1989 from the older American standard IEEE 754). A float then implements the single-precision format and a double the double-precision format, so a float comprises 32 bits and a double 64 bits, which makes doubles more precise. For this reason, floats are used only in special cases. The size of a long double varies with the implementation, but a long double must never be smaller than a double. The exact properties and value ranges on a given architecture can be determined from the header file <float.h>.

// the precision of each type is implementation-dependent

float kommazahl = 0.000001f;
double kommazahl = 0.000000000000002;
long double kommazahl = 0.3L;

void

The C standard calls the data type void an "incomplete type". Variables of this type cannot be created. void is used, first, when a function should not return a value; second, when an empty parameter list is explicitly required for a function; and third, when a pointer should point to "objects of any type".

// declaration of a function that returns no value
void funktionsname();

// declaration of a function that returns int and accepts no parameters
int funktionsname(void);

// pointer to an object of any type
void* zeigername;

Pointers

As in other programming languages, pointers in C are variables that store a memory address (such as address 170234) instead of a directly usable value (such as the character "A" or the number 5). Memory addresses are numbered consecutively. For example, the value 00000001 (the binary representation of the decimal number 1) could be stored at memory address 170234. Pointers make it possible to access the value located at a memory address. That value can in turn be an address that points to another memory location. A pointer declaration states the data type of the object pointed to, then an asterisk, then the name of the pointer.

char* zeiger;   // can store the address of a char
double* zeiger; // can store the address of a double

Arrays

As in other programming languages, arrays (fields) in C are used to store several values of the same data type. The values of an array occupy consecutive memory addresses. The number of elements is fixed when the array is defined, either by a size given in square brackets or by the initializer list, and individual elements are addressed by an index. Since C has no separate data type for strings, arrays are also used to store strings.

// definition of an array with 3 integer values
int zahlen[] = { 17, 0, 3 };

// array used to store a string
char string[] = "Hallo, Welt!\n";

struct

Structures, written as struct, are used to store data of different types in one variable. In this way, variables of different data types can be combined.

struct person {
    char* vorname;
    char nachname[20];
    int alter;
    double groesse;
};

enum

As in other programming languages, an enum in C is used to combine several constant values ​​into one type.

enum Temperatur { WARM, KALT, MITTEL };

enum Temperatur heutige_temperatur = WARM;

if (heutige_temperatur == KALT)
    printf("Warm anziehen!"); // not printed, because today it is WARM

typedef

The keyword typedef is used to create an alias for an existing data type.

// creates the alias "Ganzzahl" for the data type "int"
typedef int Ganzzahl;

// now equivalent to: int a, b;
Ganzzahl a, b;

_Bool

Before the C99 standard there was no data type for storing a truth value. Only since 1999 can variables be declared as _Bool and hold one of the two values 0 (false) or 1 (true).

_Bool a = 1; // since C99

Explicitly including the header stdbool.h makes the widely used spelling bool available for the logical data type, with the two possible values true and false:

#include <stdbool.h>

bool a = true; // since C99

_Complex and _Imaginary

Since C99 there have been three floating-point data types for complex numbers, derived from the three floating-point types: float _Complex, double _Complex and long double _Complex. C99 also introduced floating-point data types for purely imaginary numbers: float _Imaginary, double _Imaginary and long double _Imaginary.

Functions

A C program consists of the main function and, optionally, further functions. Further functions can either be defined by the programmer or taken from the C standard library.

Main

Every C program must have a function called main; otherwise no executable program can be produced. The main function is the entry point of a C program, that is, program execution always begins with this function.

// the shortest possible standard-conforming C89 program
main(){return 0;}
// the shortest possible standard-conforming C99 program
int main(){}

Apart from the main function, a C program need not contain any other functions. If other functions are to be executed, they must be called from main, directly or indirectly. The main function is therefore also referred to as the main program, and all other functions as subprograms.

Self-defined functions

Any number of functions can be defined in C. A function definition consists of, first, the data type of the return value; second, the name of the function; third, a parenthesized list of parameters; and fourth, a function body enclosed in braces, which contains the code the function executes.

// data type of the return value, function name and two parameters
int summe(int x, int y) {
    // function body: the sum is computed and returned here
    return x + y;
}

int main() {
    // the function is called with the values 2 and 3; the return value
    // is stored in the variable "ergebnis"
    int ergebnis = summe(2, 3);

    // main returns the value of "ergebnis"
    return ergebnis;
}

The keyword void is used to define a function that returns nothing. It is likewise used when no parameters are to be passed to the function.

#include <stdio.h>

void begruessung() {
    puts("Hi!");

    return;
}

Functions of the C standard library

The functions of the standard library are not part of the C language proper. They are supplied with every standard-conforming compiler in a hosted environment and can be used as soon as the corresponding header file has been included. For example, the function printf outputs text. It can be used after including the header file stdio.h.

#include <stdio.h>

int main() {
    printf("hello world!\n");

    return 0;
}

Statements

A function consists of statements. As in most programming languages, the most important statements are declarations and definitions, assignments, conditional statements, loops and function calls. The following, rather pointless, program contains examples.

// subprograms
void funktion_die_nichts_tut() { // definition
    return;                      // return statement
}

int plus_eins_funktion(int argument) { // definition
    return argument + 1;               // return statement
}

// main program
int main() {                         // definition
    int zahl;                        // definition
    funktion_die_nichts_tut();       // function call
    zahl = 5;                        // assignment
    zahl = plus_eins_funktion(zahl); // function call and assignment

    if (zahl > 5)  // conditional statement
        zahl -= 1; // assignment: the value of "zahl" is "5" again

    return 0; // return statement
}

Naming

When naming your own variables, constants, functions and data types, you have to adhere to some naming rules. First, the first character of an identifier must be a letter or an underscore. Second, the following characters can only be the letters A through Z and a through z, digits and the underscore. Third, the name cannot be any of the keywords.

Since C95, characters from the Universal Coded Character Set have also been allowed in identifiers, provided the implementation supports them. The permitted characters are listed in Annex D of the ISO C standard. Put simply, they are all characters that are used as letters or letter-like characters in any language.

As of C99, these characters can be written in a platform-independent way using an escape sequence:

  • \uXXXX (where X stands for a hexadecimal digit) for characters with a code from 00A0 to FFFF (hexadecimal).
  • \UXXXXXXXX for all characters with a code of 00A0 (hexadecimal) or above.

Certain identifiers are also reserved for the implementation:

  • Identifiers that begin with two consecutive underscores.
  • Identifiers that begin with an underscore followed by an uppercase letter.

Extensions to the core language that require new keywords also take their names from this reserved range so that they do not collide with identifiers in existing C programs, e.g. _Complex, _Generic, _Thread_local.

Standard library

The C standard library is an integral part of a hosted C implementation. Among other things, it contains macros and functions that are made available through the standard header files. In freestanding implementations, however, the scope of the standard library may be limited.

The standard library is divided into several standard header files, but the linked library is often a single large file.

  • "Hosted": The C compiler and program run in an operating system environment that offers the usual services (e.g. a file system, textual input and output channels, memory management).
  • "Freestanding": The C program does not run under an operating system but has to implement all device functions itself. Often, however, at least some libraries are available in advance. Cross compilers (also known as "target compilers") are often used here.

Modules

Modularization in C takes place at the file level. A file forms a translation unit; functions and variables needed only internally can thus be hidden from other files. The public interfaces are announced in so-called header files. C therefore has only a weak module concept.

The overall language design provides that a program can consist of several modules. For each module there is a source file (with the extension .c) and a header file (with the extension .h). The source file essentially contains the implementation, the header file the external interface. Keeping the two files consistent is the programmer's task in C (as in C++, but no longer in C#).

Modules that use functions from other modules include their header files and thus give the compiler the necessary information about the available functions, calling conventions, types and constants.

Each module can be compiled separately and yields an object file. Several object files can be combined into a library or used individually.

Several object files and libraries (which are themselves just collections of object files) can be combined by the linker into an executable program.

Compiler

The most widespread is the free C compiler of the GNU Compiler Collection, which has existed since 1987. On Windows, the Visual C++ compiler, developed since 1993, is also widely used. In addition to these two, numerous other compilers are available.

Because C has comparatively few keywords, a compiler for it can be very simple and small. C is therefore often the first programming language available on new computer systems (after machine code and assembler).

Relationship to assembler, portability

C was developed with the aim of providing a genuine language abstraction over assembly language. Language constructs were to map directly onto a few machine instructions in order to minimize dependence on a runtime environment. As a result of this design, C code can be written at a level very close to the hardware, analogous to assembler instructions. Porting a C compiler to a new processor platform is not very complex compared with other languages. For example, the free GNU C compiler (gcc) is available for a large number of processors and operating systems. For the developer this means that a C compiler almost always exists, regardless of the target platform. C thus significantly supports the portability of programs, provided the programmer can do without assembler parts in the source code and hardware-specific C constructs. In microcontroller programming, C is by far the most frequently used high-level language.

Safety

Conceptually, C is designed for simple compilation of the source code and fast execution of the program code. As a rule, however, compilers generate little code to ensure data security and operational reliability while programs are running. Increasingly, therefore, attempts are made to uncover and correct these deficiencies through formal verification, or to remedy them through additional source code written by the programmer.

C hardly restricts direct memory access. As a result, the compiler (unlike in Pascal, for example) can help only to a very limited extent in finding errors. For this reason, C is less suitable for safety-critical applications (medical technology, traffic control technology, space travel). If C is nevertheless used in these areas, an attempt is usually made to raise the quality of the programs through additional checks such as code coverage testing.

C contains some safety-critical functions. For example, gets(), a standard library function in the older standards, overwrites memory outside the buffer (buffer overflow) if it encounters unsuitable (too long) input. The error can be neither detected nor handled within C. So as not to lose one of C's great advantages, the existence of a large body of older source code, current implementations continue to support these and similar functions, but usually issue a warning at compile time when they are used. gets() was finally removed from the language specification with C11.

C is not type-safe, because values of different data types can be treated as assignment-compatible.

Literature

Advanced

  • Andrew Koenig: The C-Expert: Programming without glitches. Addison Wesley, 1989, ISBN 978-3-89319-233-5 (German translation from: C Traps and Pitfalls. Addison Wesley, 1989.)
  • Peter van der Linden: Expert C programming. Verlag Heinz Heise, 1995, ISBN 978-3-88229-047-9 (German translation from: Expert C Programming. Prentice Hall, 1994.)

Manuals

  • Rolf Isernhagen, Hartmut Helmke: Software technology in C and C++. The compendium. Modular, object-oriented and generic programming. ISO-C90, ISO-C99, ISO-C++98, MS-C++.NET. 4th, completely revised edition, Hanser, Munich / Vienna 2004, ISBN 3-446-22715-6.
  • Jürgen Wolf: C from A to Z. The comprehensive manual . 3rd updated and expanded edition 2009, 4th, corrected reprint 2015, Rheinwerk, Bonn 2015, ISBN 978-3-8362-1411-7 .

K&R C

  • Brian Kernighan, Dennis Ritchie: The C Programming Language . Prentice Hall, Englewood Cliffs (NJ) 1978, ISBN 0-13-110163-3 . (German translation: Brian Kernighan, Dennis Ritchie: Programming in C. With the reference manual in German . Hanser, Munich / Vienna 1983)

K&R2

  • Brian Kernighan, Dennis Ritchie: The C Programming Language . 2nd edition, Prentice Hall, Englewood Cliffs (NJ) 1988, ISBN 0-13-110362-8 . (German translation: Brian Kernighan, Dennis Ritchie: Programming in C. With the C-Reference Manual in German . 2nd edition, Hanser, Munich / Vienna 1990, ISBN 978-3-446-15497-1 )
