Thread (computer science)

from Wikipedia, the free encyclopedia

In computer science, a thread [θɹɛd] (English for 'thread' or 'strand'), also called an activity carrier or lightweight process, denotes a strand or sequence of execution within the run of a program. A thread is part of a process.

A distinction is made between two types of threads:

  1. Threads in the narrower sense, the so-called kernel threads, run under the control of the operating system.
  2. In contrast, so-called user threads must be managed entirely by the user's computer program itself.

This article deals with threads in the narrower sense, i.e. kernel threads.

Threads from the perspective of the operating system

Several kernel threads in one process (task)

A (kernel) thread is a sequential execution run within a process and shares a number of resources with the other threads of the associated process (multithreading).

Historically, Friedrich L. Bauer coined the term sequential process for this.

Threads within the same process can be assigned to different processors. Each thread has its own so-called thread context:

  • an independent register set, including the instruction pointer,
  • its own stack, though usually within the shared process address space,
  • as a special feature, resources that can or may be used only by the creating thread (examples: thread-local storage, window handles).

Other resources are shared by all threads. The sharing of resources can also lead to conflicts. These must be resolved through the use of synchronization mechanisms.

Since threads that belong to the same process use the same address space, communication between them is very simple from the outset (compare interprocess communication for processes).

Each “thread” is responsible for performing a specific task. The execution strands of the program functions can thus be divided into manageable units and extensive tasks can be distributed over several processor cores.

In most operating systems, a thread can be in an inactive state in addition to the states active (running), ready and blocked (waiting). In the running state, the thread's instructions are executed on the CPU. In the ready state, the thread has been paused so that another thread can compute. In the blocked state, the thread waits for an event (usually the completion of an operating system service). A thread in the inactive state is usually either still being set up by the operating system or has finished computing and can now be removed from the thread list by the operating system or reused in some other way.
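
As an illustration only, Java exposes a similar (though not identical) set of states through Thread.getState(). The following minimal sketch observes the state before a thread is started and after it has finished; the class name ThreadStateDemo and the method observeStates are hypothetical, and the states in between depend on scheduling and are therefore not recorded.

```java
// Minimal sketch: observing thread life-cycle states in Java.
// ThreadStateDemo and observeStates are illustrative names only.
final class ThreadStateDemo {

    // Returns the state seen before start() and the state seen after join().
    static Thread.State[] observeStates() {
        Thread worker = new Thread(() -> {
            // Trivial work; while this runs, the state is RUNNABLE.
            Math.sqrt(2.0);
        });
        Thread.State before = worker.getState();   // NEW: created, not yet started
        worker.start();
        try {
            worker.join();                          // caller blocks until the worker ends
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Thread.State after = worker.getState();     // TERMINATED: run() has returned
        return new Thread.State[] { before, after };
    }

    public static void main(String[] args) {
        Thread.State[] states = observeStates();
        System.out.println(states[0] + " -> " + states[1]);
    }
}
```

Java's NEW and TERMINATED roughly correspond to the inactive state described above, while BLOCKED, WAITING and TIMED_WAITING are finer-grained variants of the blocked state.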

The difference in meaning between (kernel) thread and process, task and user thread

A process describes the execution of a computer program on one or more processors. An address space and other operating system resources are assigned to a process. In particular, processes are shielded from each other: if a process tries to access addresses or resources that have not been assigned to it (and that possibly belong to another process), the access fails and the operating system terminates the process. A process can contain several threads or, if the program does not provide for parallel processing, only a single thread. Threads within a process share processors, memory and other operating-system-dependent resources such as files and network connections. Because of this, the administrative effort for threads is usually lower than for processes. Threads have two significant efficiency advantages over processes: switching between threads does not require a complete change of the process context, since all threads use a common part of it, and communication and data exchange between threads are simple and fast.

So-called multitasking operating systems already existed in the 1980s, since, in contrast to job-oriented systems, several tasks had to be carried out in parallel, particularly in what was then known as process computing. At that time, the term task was used for a task from the operating system's point of view, as a synonym for process. However, the term task (German: Aufgabe) is also often used in software architecture for a group of related activities and, more rarely, as a synonym for thread.

Literally, a thread is a single strand of execution of a program, but the term is used for a strand of execution from the point of view of the operating system (kernel thread). Within user software, this strand can be subdivided further into independent individual strands through suitable programming. In the English-speaking world, the term user thread (Microsoft: fiber; German: Faser) has become established for such an individual strand of execution in user software. With user threads, the user software alone is responsible for managing its strands of execution.

Examples


The following table shows examples of the various combinations of process, kernel and user thread:

Process | Kernel thread | User thread | Example
No | No | No | A computer program running under MS-DOS. The program can perform only one action at a time.
No | No | Yes | Windows 3.1 on top of DOS. All Windows programs run in a single process; a program can corrupt the memory of another program, but this is noticed and results in a General Protection Fault (GPF).
No | Yes | No | Original implementation of AmigaOS. The operating system fully supports threads and allows several applications to run independently of one another, scheduled by the operating system kernel. Because of the lack of process support, the system is more efficient (it avoids the additional cost of memory protection), at the price that application errors can paralyze the entire computer.
No | Yes | Yes | DR-DOS 7.01, 7.03, 8.0; Enhanced DR-DOS, all versions. Mac OS 9 supports user threads through Apple's Thread Manager and kernel threads through Apple's Multiprocessing Services, which works with the nanokernel introduced in Mac OS 8.6. Threads are thus supported, but the MultiFinder method is still used to manage applications.
Yes | No | No | Most known implementations of Unix (except Linux). The operating system can execute more than one program at a time, and the program executions are protected against each other. When a program misbehaves, it can disrupt its own process, which can result in that one process being killed without disrupting the operating system or other processes. However, the exchange of information between processes can be either error-prone (when using techniques such as shared memory) or costly (when using techniques such as message passing). The asynchronous execution of tasks requires a complex fork() system call.
Yes | No | Yes | SunOS (Solaris) from Sun. SunOS is Sun Microsystems' version of Unix. SunOS implements user threads as so-called green threads to enable a single process to execute several tasks asynchronously, for example playing a sound, repainting a window, or reacting to an operator event such as the selection of the stop button. Although processes are managed preemptively, the green threads work cooperatively. This model is often used instead of real threads and is still current and very common in microcontrollers and embedded devices. Windows 3.x in enhanced mode also falls into this category when DOS boxes are used, since the DOS boxes are independent processes with separate address spaces.
Yes | Yes | No | This is the general case for applications under Windows NT from 3.51 SP3 onward, Windows 2000, Windows XP, Mac OS X, Linux, and other modern operating systems. All of these operating systems allow the programmer to use user threads or libraries that use their own user threads, but not all programs make use of that possibility. Furthermore, user threads can also be created automatically by the operating system for each started application (for example to operate the graphical user interface) without the programmer having to do this explicitly; such programs are then automatically multithreaded. It is also necessary to use several threads if the application is to use several processors or processor cores.
Yes | Yes | Yes | Most operating systems since 1995 fall into this category. Using threads for concurrent execution is the common choice, although multi-process and multi-fiber applications exist as well. Threads are used, for example, so that a program can service its graphical user interface while it is waiting for input from the user or doing other work in the background.

Remarks:

  • The use of user threads is in principle independent of the operating system and is therefore possible with any operating system. The only requirement is that the complete state of the processor can be read out and written back again (user threads have even been implemented in some 8-bit operating systems, e.g. GEOS on the C64/C128). The values in the table should therefore be seen as guidelines.
  • Some modern operating systems (e.g. Linux) no longer draw a strict distinction between processes and kernel threads. Both are created with the same system call (clone(2)), whose arguments specify which resources are shared and which are not (exception: CPU registers and stack). For a number of resources, this can even be changed while the thread is running (memory: TLS vs. shared memory; file handles: socketpair).

Implementations

Java

Java was designed for working with multiple threads from the outset. Multithreading also works if the operating system does not support it or supports it only inadequately. This is possible because the Java virtual machine can take over thread switching, including stack management. On operating systems with thread support, the operating system's facilities can be used directly. Which approach is taken is decided by the implementation of the virtual machine.

In Java, the class Thread is part of the basic package java.lang. Instances of this class are the administrative units for threads. Thread can either be used as the base class of a user class, or an instance of Thread can hold a reference to an instance of any user class. In the second case, the user class must implement the java.lang.Runnable interface and therefore provide a run() method.

A thread is started by calling thread.start(). The associated run() method is then executed in the new thread. The thread is active for as long as run() is running.
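
The two variants described above, subclassing Thread and handing a Runnable to a Thread, can be sketched as follows. This is a minimal illustration, not a recommended structure; the names StartDemo, SubclassWorker, RunnableWorker and runBoth are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Minimal sketch of the two ways to define a thread's work in Java.
// StartDemo, SubclassWorker and RunnableWorker are hypothetical names.
final class StartDemo {

    // Variant 1: Thread is the base class of the user class.
    static final class SubclassWorker extends Thread {
        private final List<String> log;
        SubclassWorker(List<String> log) { this.log = log; }
        @Override public void run() { log.add("subclass"); }
    }

    // Variant 2: the user class implements Runnable and is passed to a Thread.
    static final class RunnableWorker implements Runnable {
        private final List<String> log;
        RunnableWorker(List<String> log) { this.log = log; }
        @Override public void run() { log.add("runnable"); }
    }

    // Starts one thread of each kind and waits for both to finish.
    static List<String> runBoth() {
        List<String> log = new CopyOnWriteArrayList<>();   // thread-safe list
        Thread first = new SubclassWorker(log);
        Thread second = new Thread(new RunnableWorker(log));
        first.start();    // start() returns at once; run() executes in the new thread
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;       // order of the two entries depends on scheduling
    }
}
```

Calling run() directly instead of start() would execute the method in the calling thread and create no new thread.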

In the run() method, or in methods called from it, the user can call wait() to make the thread wait either for a limited period (specified in milliseconds) or indefinitely. The wait is ended by a notify() from another thread. This is an important mechanism for inter-thread communication. wait() and notify() are methods of the class Object and can therefore be called on any object. Matching wait() and notify() calls must be made on the same instance (of a user class). It makes sense to place the data that one thread wants to pass to the other in that instance.
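
A minimal sketch of this mechanism follows, assuming a hypothetical class Mailbox that both carries the data and serves as the object on which wait() and notify() are called. Note that Java requires both calls to be made while holding the object's monitor, which is why the methods are declared synchronized.

```java
// Minimal sketch of inter-thread communication with wait() and notify().
// Mailbox is a hypothetical class; it holds the data to be handed over,
// so wait() and notify() are called on the same instance.
final class Mailbox {
    private String message;          // null means "nothing delivered yet"

    synchronized void put(String text) {
        message = text;
        notify();                    // wake one thread waiting on this mailbox
    }

    synchronized String take() throws InterruptedException {
        while (message == null) {    // loop guards against spurious wakeups
            wait();                  // releases the monitor while waiting
        }
        String result = message;
        message = null;
        return result;
    }

    // Starts a producer thread and receives its message in the calling thread.
    static String exchange(String text) {
        Mailbox box = new Mailbox();
        Thread producer = new Thread(() -> box.put(text));
        producer.start();
        try {
            String received = box.take();
            producer.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The condition is checked in a loop because a waiting thread may wake up without a matching notify(); the sketch works regardless of whether put() or take() happens to run first.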

Critical sections are implemented with the synchronized keyword.
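
As a minimal sketch, the following hypothetical class SharedCounter protects a read-modify-write operation with synchronized so that concurrent increments are not lost. The helper countWith and its parameters are assumptions made for the illustration.

```java
// Minimal sketch of a critical section protected by synchronized.
// SharedCounter and countWith are hypothetical names.
final class SharedCounter {
    private int value;

    // Only one thread at a time may execute a synchronized method of this object.
    synchronized void increment() { value++; }
    synchronized int get() { return value; }

    // Lets several threads increment the same counter and returns the total.
    static int countWith(int threads, int incrementsPerThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

Without synchronized, value++ would be an unprotected read-modify-write sequence, and updates from different threads could overwrite each other.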

The first version of Java introduced methods of the class Thread for suspending, resuming and aborting a thread from outside: suspend(), resume() and stop(). However, these methods were soon marked as deprecated in later versions. The detailed explanation stated that a system is unsafe if a thread can be suspended or aborted from outside. The reason, in brief, is as follows: a thread may be inside a critical section and may already have changed some data. If it is suspended, the critical section stays locked, and deadlocks are the result. If it is aborted and the system releases the lock, the data is left inconsistent. A runtime system cannot make this decision on its own; only the user program itself can control how a thread is suspended or aborted.
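
A common alternative, sketched here under assumptions, is cooperative cancellation: another thread only requests the stop with interrupt(), and the worker decides itself when to end and cleans up before it does. The names CooperativeStop and runAndCancel are hypothetical, and the sleep call merely stands in for real work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of cooperative cancellation via interrupt().
// CooperativeStop and runAndCancel are hypothetical names.
final class CooperativeStop {

    // Returns true if the worker ended after the request and ran its cleanup.
    static boolean runAndCancel() {
        AtomicBoolean cleanedUp = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            started.countDown();                   // signal that run() has begun
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(1);               // stands in for one unit of work
                }
            } catch (InterruptedException e) {
                // Interrupted while sleeping: treat it as the stop request.
            } finally {
                cleanedUp.set(true);               // the worker tidies up its own state
            }
        });
        worker.start();
        try {
            started.await();
            worker.interrupt();                    // a request, not a forced abort
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cleanedUp.get() && !worker.isAlive();
    }
}
```

Because the worker checks for the request only at points it chooses, it is never stopped in the middle of a critical section.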

.NET

.NET natively supports thread programming. This is implemented by the classes in the System.Threading namespace.

In addition to the process and thread constructs mentioned above, there is also the concept of an application domain (AppDomain). A process can contain several application domains, which the runtime isolates from each other ("logical process"); a resource provided by the .NET Framework is bound to the application domain that created it. Resources of the underlying operating system (including kernel threads) are not tied to these logical process boundaries.

The .NET runtime also offers a runtime-managed thread pool, which the runtime uses to process asynchronous events and input/output operations.

The .NET runtime also distinguishes between foreground threads and background threads. A thread becomes a background thread when its IsBackground property is set to true. A process ends when its last foreground thread has finished. All background threads that are still running are then terminated automatically. Thread pool threads are started as background threads.

An independent thread is started via a new instance of the Thread class, to whose constructor a callback function (delegate) is passed. The thread is then started with the Start() instance method. The thread terminates when the callback function returns control to the caller.

Alternatively, the thread pool of the .NET runtime can be used for brief background processing. It holds a certain number of threads that can be used for processing via ThreadPool.QueueUserWorkItem(). After the callback function has returned, the thread is not destroyed by the operating system but is kept for later use. The advantage of this class is its optimized, limited use of the underlying resources.

External control of threads is possible (Abort(), Suspend(), Resume()), but can lead to unpredictable events such as deadlocks or aborts of the application domain. Therefore Suspend and Resume are marked as obsolete in newer versions of .NET.

Threads are synchronized with a WaitHandle or, most commonly, via the Monitor class, which uses a lock made available by every .NET object. In C#, the lock (object) { instruction; } construct can be used for this. Many classes of the .NET Framework also exist in a thread-safe variant that can be created using a static Synchronized() method.

Unix / Linux

Unix has always had easy-to-use system calls for creating parallel processes (fork). This is the traditional means of implementing parallel processing under Unix/Linux. Threads were added in later Unix versions, but portability between the earlier derivatives was not guaranteed. The POSIX threads standard finally stipulated a uniform minimum range of functions and a uniform API, which current Linux versions also support through the Native POSIX Thread Library (NPTL). Compared to a process, a thread is also referred to as a lightweight process (Solaris), since switching between processes requires more effort (computing time) in the operating system than switching between threads of one process.

Windows

To create a thread of your own in C or C++ under Windows, you can call the Windows API directly. A simple pattern looks like this:

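#include <windows.h>
DWORD threadId;
// p is passed to runInThread as its argument
HANDLE hThread = CreateThread (NULL, 0, runInThread, p, 0, &threadId);
CloseHandle (hThread);   // releases only the handle; the thread keeps running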

runInThread is the subroutine that is to run in this thread; it is called immediately afterwards. When runInThread returns, the thread ends, similar to Thread.run() in Java.

This API is a C-oriented interface. To program threads in an object-oriented manner, a method of a class can be called from within the subroutine runInThread according to the following scheme:

DWORD WINAPI runInThread(LPVOID runnableInstance)
{
   Runnable* runnable = static_cast <Runnable*> (runnableInstance);
                        // pointer to the class or to a base class
   return(runnable->run());  // the run method of this class is called
}

The class that contains the run() method for the thread is here a class named Runnable, which can also be a base class of a larger class. The pointer to the instance of Runnable, or of a class derived from it, must be passed to CreateThread as the parameter (p), cast to (Runnable*). This gives you the same technique as in Java. The universal base class (an interface) for all classes whose run() methods are to run in a separate thread is defined as follows:

class Runnable   // abstract base class (usable as an interface)
   {
      virtual DWORD fachMethode()=0; // API for inheriting: implemented by derived classes
   public:
      DWORD run() { return(fachMethode()); } // API for calling
      virtual ~Runnable() {} // virtual destructor, since the class is meant to be inherited from
   };

The user class with the domain-specific method (fachMethode) is defined below:

class MyThreadClass : public Runnable
   {
      DWORD fachMethode(); // overrides/implements the domain-specific method
   };

The user class is then instantiated and the thread started:

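MyThreadClass myThreadObject;
// cast to Runnable* first, so that runInThread can safely cast the void pointer back
hThread = CreateThread (NULL, 0, runInThread, static_cast <Runnable*> (&myThreadObject), 0, &threadId);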

Because of dynamic binding, the desired method, fachMethode() of MyThreadClass, is called. Caution: the lifetime of myThreadObject must be taken into account. It must not be destroyed implicitly (for example by going out of scope) while the new thread is still working with it. Thread synchronization is required here.

Further accesses to the thread at API level can be carried out with knowledge of the returned HANDLE, for example

SetThreadPriority (hThread, THREAD_PRIORITY_BELOW_NORMAL);

or to query the return value of runInThread, i.e. of the called method (0 in the example):

DWORD dwExitCode;
GetExitCodeThread (hThread, &dwExitCode);

Problems

The use of threads and simple synchronization mechanisms such as mutexes and semaphores has proven to be demanding in practice. Because the program flow is no longer simply sequential, it is difficult for a developer to predict. Because the execution order and the switching between threads are controlled by the scheduler, and the developer has little influence over them, a concurrent program can easily end up in an unintended overall state, which manifests itself in deadlocks, livelocks, data errors and crashes. These effects occur sporadically and are therefore hard to reproduce, which makes troubleshooting in an application difficult.
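
One widely used precaution against one class of deadlock, not taken from this article's sources, is to acquire locks in the same global order in every thread. The minimal sketch below assumes a hypothetical Account class with a unique id that defines that order; two threads transfer amounts in opposite directions without being able to block each other permanently.

```java
// Minimal sketch: avoiding deadlock by always locking in one global order.
// LockOrdering, Account and their fields are hypothetical names.
final class LockOrdering {

    static final class Account {
        final int id;        // unique id, used only to define the lock order
        long balance;
        Account(int id, long balance) { this.id = id; this.balance = balance; }
    }

    // Locks the account with the smaller id first, whatever the direction.
    static void transfer(Account from, Account to, long amount) {
        Account first = (from.id < to.id) ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Two threads transfer in opposite directions; returns the final balances.
    static long[] runOpposingTransfers(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance, b.balance };
    }
}
```

If each thread instead locked its own "from" account first, the two threads could each hold one lock and wait forever for the other, which is exactly the sporadic kind of failure described above.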

Thread representation in UML

In the Unified Modeling Language (UML), parallel processes are often represented with statecharts . In a state diagram, internal parallel partial state diagrams can be represented within a state. All state diagrams of the overall system are processed quasi-parallel. The quasi-parallelism is achieved in that each state transition is very short (in practice a few microseconds to milliseconds) and therefore the successive processing appears to be parallel. The transition from one state to another is typically triggered by an event that was previously written into the so-called event queue. According to the definition given above, this transition due to an event is a user thread. In principle, the parallelism implemented in this way can be achieved with just a single operating system thread.
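
The quasi-parallel processing described above can be sketched with a single thread that takes events from a queue and dispatches each one to its state machine. The example is an assumption-laden illustration, not generated UML code; the names EventLoopDemo, Lamp and Event and the "toggle" event are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Minimal sketch: two state machines driven by one event queue on one thread.
// EventLoopDemo, Lamp and Event are hypothetical names.
final class EventLoopDemo {
    enum State { OFF, ON }

    // A tiny state machine with two states that flips on every "toggle" event.
    static final class Lamp {
        final String name;
        State state = State.OFF;
        Lamp(String name) { this.name = name; }
        void handle(String event) {
            if (event.equals("toggle")) {
                state = (state == State.OFF) ? State.ON : State.OFF;
            }
        }
    }

    // An event names the state machine it is addressed to.
    static final class Event {
        final Lamp target;
        final String name;
        Event(Lamp target, String name) { this.target = target; this.name = name; }
    }

    // Processes all queued events in order and records each resulting state.
    static List<String> run() {
        Lamp a = new Lamp("A");
        Lamp b = new Lamp("B");
        Queue<Event> queue = new ArrayDeque<>();
        queue.add(new Event(a, "toggle"));
        queue.add(new Event(b, "toggle"));
        queue.add(new Event(a, "toggle"));
        List<String> trace = new ArrayList<>();
        Event event;
        while ((event = queue.poll()) != null) {   // one short transition per event
            event.target.handle(event.name);
            trace.add(event.target.name + "=" + event.target.state);
        }
        return trace;
    }
}
```

Both state machines make progress in an interleaved fashion although only one operating system thread runs; a long-running transition would, however, delay every other state machine.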

If UML is used for fast systems, the question of time prioritization plays a role. If state transitions can take a long time, or if a transition also has to wait for conditions (which already happens when reading from or writing to a file), then parallelism must be implemented with threads. For this reason, it must be possible to assign the processing of state diagrams to several threads of the system, which may have different priorities. The UML tool Rhapsody uses the term active class for this. Each active class is assigned to its own thread.

In addition to the formulation of parallel work with state diagrams, parallelism with threads can also be modeled in UML-designed systems. The programming model offered by Java can be used for this. In this case, an explicit Thread class with the properties known from Java must be included in the user model. This makes strongly cyclic problems easier and more effective to handle, as the following example shows:

void run()
{
   while (not_abort)           // cyclic until aborted from outside
   {
      data.wait();             // the cycle begins when data is available
      dosomething();           // processing of several things
      if (condition)
      {
         doTheRightThing();    // processing depends on conditions
         partnerdata.notify(); // notify other threads
      }
   }
}

The run() method shown here is a method of a user class. In it, the entire processing in the thread is described in program lines, as is also usual in UML for functional processing. UML is used to show this user class, the associated Thread class and their relationships (class diagram), supplemented with sequence diagrams, for example. The programming is clear. A state diagram does not offer any better graphic options for this case.
