A program error, software error, or software anomaly, often called a bug, generally refers to any incorrect behavior of a computer program. It occurs when the programmer has not implemented a particular part of the specification or has implemented it incorrectly, or when the runtime environment works incorrectly or differently than expected. Incompleteness, inaccuracy, or ambiguity in the specification of the program can also lead to "errors".
To detect and eliminate program errors as completely as possible, software development processes usually include a software test phase before the actual "productive" use of the software, in which the software is validated. Errors found in this phase are normal, and finding them is the purpose of testing, whereas errors during operation may, depending on their effect, represent critical anomalies or faults. In practice, computer programs seldom ship without program errors. One quality measure for programs is the defect density, which describes the number of errors per 1,000 lines of code (kilo source lines of code) or per function point.
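The defect density described above is a simple ratio. The following is a minimal sketch, assuming that defects and lines of code have already been counted; the function name and the project figures are hypothetical and serve only as illustration.

```python
def defect_density(defect_count: int, lines_of_code: int) -> float:
    """Return defects per 1,000 source lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defect_count / (lines_of_code / 1000)


# Hypothetical project: 45 known defects in 30,000 lines of code.
density = defect_density(45, 30_000)
print(f"{density:.2f} defects per KLOC")  # prints "1.50 defects per KLOC"
```

The same calculation applies per function point if the divisor is replaced by the number of function points.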
Debuggers, with which a program can be executed and controlled step by step, are specialized tools for finding the causes of errors in programs. For particularly critical software (e.g. aircraft control), a costly formal verification is sometimes carried out.
Bug trackers (such as Bugzilla or Mantis) are used to record and document errors. They hold error reports as well as suggestions for improvement and requests from users (feature requests), or general tasks. See also defect management.
- "Failure to meet a requirement" (EN ISO 9000:2005).
More specifically, an error is then defined as
- "Deviation of the ACTUAL (observed, determined, calculated states or processes) from the TARGET (defined, correct states and processes), if it exceeds the predefined tolerance limit [which can also be 0]."
According to ISTQB, the term "error" is made up of the following related concepts:
- an erroneous action (English: error)
- "The human action that leads to an error condition ([according to IEEE 610])"
- ... which results in a fault state (English: defect)
- "Defect (internal fault condition) in a component or a system that can impair a required function of the product ..."
- ... which can lead to a failure effect (English: failure)
- "The manifestation of an internal error in the execution [of the program] as an incorrect behavior or result or as a failure of the system."
- Example, division by zero: Erroneous action: zero as a possible input value was not checked or excluded. Fault state: the program is (unnoticed) defective until the value zero is entered. Failure effect: a runtime error while executing the instruction.
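The division-by-zero example can be made concrete with a short sketch. The function names and values are hypothetical; the point is only to show that the defect exists in the code from the start but becomes a visible failure only for one particular input.

```python
def average_speed(distance_km: float, hours: float) -> float:
    # Fault state (defect): the erroneous action was forgetting to check
    # for hours == 0; the defect stays unnoticed for every other input.
    return distance_km / hours


def average_speed_checked(distance_km: float, hours: float) -> float:
    # Corrected version: zero is explicitly excluded as an input value.
    if hours == 0:
        raise ValueError("hours must not be zero")
    return distance_km / hours


print(average_speed(100.0, 2.0))  # 50.0, the defect is present but dormant

try:
    average_speed(100.0, 0.0)     # failure effect: the defect becomes visible
except ZeroDivisionError as exc:
    print(f"runtime error: {exc}")
```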
Terms such as problem, defect, deviation, anomaly, and deficiency are also used as synonyms for "error" or alongside it. This makes it possible to distinguish the severity of an error conceptually, e.g. a violation of programming style rules as opposed to the delivery of incorrect results or a program termination.
"Bug" as a synonym for program error
In English, the word bug means "true bug (Hemiptera)" and colloquially "terrestrial arthropod" or "(insect-like) vermin". In the jargon of American engineers, the meaning "malfunction" or "construction error" has been attested since the late 19th century; this use of the word is based on the (joking) idea that little crawling creatures are tampering with the gears, the wiring, and so on. The oldest evidence is two letters written by Thomas Edison in 1878 to William Orton, the president of the telegraph company Western Union, and to Tivadar Puskás, the inventor of the telephone exchange, which read:
“[...] I did find a 'bug' in my apparatus, but it was not in the telephone proper. It was of the genus 'callbellum.'”
“The first step [in all of my inventions] is an intuition, and comes with a burst, then difficulties arise - this thing gives out and [it is] then that 'Bugs' - as such little faults and difficulties are called - show themselves [...].”
Edison did not coin the word, but he is at least a key witness to a meaning that was already circulating at the time. The association of the term with computers goes back to the computer pioneer Grace Hopper. She spread the story that on September 9, 1945, a moth caused a malfunction in a relay of the Mark II Aiken Relay Calculator. The moth was removed, taped into the logbook, and given the following note: “First actual case of bug being found.” The legend that this is the origin of the term persists, although the logbook entry itself indicates that the term was already in common use. In addition, Grace Hopper was wrong about the year: the incident actually occurred on September 9, 1947. The corresponding page of the logbook was kept at the Naval Surface Warfare Center Computer Museum of the US Navy in Dahlgren, Virginia, until the early 1990s. The logbook page with the moth is now at the Smithsonian Institution.
Types of bugs
In software engineering, a distinction is made between the following types of errors in programs:
- Lexical errors are character strings that cannot be interpreted, i.e. undefined identifiers (variables, functions, literals ...).
- Syntax errors are violations of the grammatical rules of the programming language used, e.g. incorrect use of reserved symbols (such as missing brackets), type conflicts, or an incorrect number of parameters.
Lexical and syntax errors usually prevent the faulty program from being compiled and are therefore detected at an early stage. In programming languages that are interpreted sequentially, the program usually only aborts when it reaches the syntactically or lexically incorrect point.
- Semantic errors are errors in which a programmed instruction is syntactically correct but still wrong in terms of content, for example a mixed-up instruction code or a wrong parameter order that cannot be detected syntactically.
- Logical errors consist of a problem-solving approach that is wrong in detail, for example due to a faulty conclusion, a misinterpreted specification, or simply an oversight or typing error. Examples: plus instead of minus, "less than" instead of "less than or equal to", etc. How tolerant a language is of such errors, and the attribute grammar features intended to limit them, such as the assignment compatibility of data types, vary widely between programming languages. Such errors can cause security holes and program crashes that are difficult to understand.
- Design errors are errors in the basic concept, either in the definition of the requirements for the software or in the software design on which the program is based. Errors in the requirements definition often stem from a lack of knowledge of the subject area for which the software is written, or from misunderstandings between users and developers. Errors in the software design itself, by contrast, can often be traced back to a lack of experience on the part of the software developer, unstructured programming, or knock-on effects of errors in the requirements specification. In other cases the design has grown over time and become confusing, which in turn can lead to design errors when the program is developed further. Often, programming begins without a proper concept, which can lead to design errors, especially in more complex software. Cost or time pressure is frequently a factor in errors in both the requirements definition and the software design. A typical design error is code repetition, which does not lead directly to program errors, but is easily overlooked during maintenance, modification, or extension of the code and then inevitably leads to undesirable effects.
- Errors in the operating concept: the program behaves differently from what individual or many users expect, although technically it works flawlessly.
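The "less than" versus "less than or equal to" mix-up named among the logical errors above can be illustrated as follows. This is a minimal sketch with hypothetical function names and an assumed specification ("a score of at least the pass mark counts as a pass"); both versions are syntactically valid, so no compiler or interpreter will flag the faulty one.

```python
def count_passing_buggy(scores: list[int], pass_mark: int) -> int:
    # Logical error: the assumed specification says "at least pass_mark",
    # but ">" silently excludes scores exactly equal to the pass mark.
    return sum(1 for s in scores if s > pass_mark)


def count_passing(scores: list[int], pass_mark: int) -> int:
    # Intended behaviour: ">=" includes the boundary value.
    return sum(1 for s in scores if s >= pass_mark)


scores = [40, 50, 50, 75]
print(count_passing_buggy(scores, 50))  # 1
print(count_passing(scores, 50))        # 3
```

The two functions differ only at the boundary value, which is why such errors tend to surface only when test data include that boundary.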
Other error terms
- Runtime errors: While the errors mentioned above involve a genuinely faulty program that either cannot be executed or delivers incorrect results, a "correct" program can also produce errors when it is executed. Runtime errors are all types of errors that occur while the program is running. Depending on the situation, the cause can be, for example, an unsuitable program environment (e.g. an incorrect operating system version), incorrect parameters when calling the program (also as a subroutine), or incorrect input data.
- Runtime errors can show up in many different ways. The program often shows undesirable behavior; in extreme cases its execution is aborted ("crash"), or the program enters a state in which it no longer accepts user input ("freeze", "hang"). If memory is not released after use in programming languages without automatic garbage collection (e.g. C or C++), the program will use more and more memory in the long run. This situation is called a memory leak. However, similar problems can also arise in programming languages with automatic garbage collection (e.g. Java or C#) if, for example, objects are accumulated in an uncontrolled manner through low-level programming. Even more critical are memory areas accidentally released by the programmer that are often still referenced by dangling pointers, as this can lead to completely uncontrolled behavior of the software. Some runtime environments therefore do not allow programmer-controlled memory deallocation at all. There are also errors that arise in interaction with other programs.
- Errors in the compiler, the runtime environment, or other libraries. Such errors are usually particularly difficult to track down because the behavior of the program in such cases does not correspond to its semantics. Particular reliability is therefore expected of compilers and runtime environments.
- A regression bug (regression means "step backward") is an error that first appears in a later program version. These are often undetected side effects of bug fixes or program changes elsewhere.
- Errors resulting from physical operating conditions. A wide variety of events, such as electromagnetic fields, radiation, temperature fluctuations, or vibrations, can lead to errors even in systems that are otherwise properly configured and operated within their specifications. Errors of this type are very unlikely, very difficult to detect, and can have fatal consequences in real-time applications. For statistical reasons, however, they cannot be ruled out. The well-known "flipped bit" in memory or on a hard disk caused by such influences is one example. Because the effects of such an error (e.g. a system crash, or a failure to boot because a system file was damaged) are usually very hard to distinguish from those of other program errors, a different cause is often suspected, especially since such an error is often not reproducible.
Some projects do not use the term bug but speak instead of, for example, metabugs, where a bug is one element of a task list. Other projects use the term "issues", since it is not limited to bugs.
Specific examples of errors with a particular media impact can be found in the list of program error examples .
Software errors are far more than an annoyance for software developers; they cause considerable costs from a business and economic point of view. The iX study 1/2006, for example, determined the following figures for Germany:
- Annual losses due to software errors in medium-sized and large companies amount to approx. 84.4 billion euros.
- Approx. 14.4 billion euros annually (35.9% of the IT budget) are spent on eliminating program errors.
- Productivity losses caused by computer failures due to faulty software amount to approx. 70 billion euros.
The same study also examines the development of software quality for the period from 2002 to 2004 - with the result:
- the rate of failed projects rose from 15% to 18%
- the rate of successful projects fell from 34% to 29%
- the rate of projects with cost overruns increased from 43% to 56%
- the rate of projects that missed deadlines rose from 82% to 84%
- the rate of projects with suitable functionality fell from 67% to 64%
A 1985 report by the supreme audit office of the US federal administration shows a particularly high number of failures in new projects:
- 27% of the software paid for was never delivered,
- 52% never worked,
- 18% were only used after extensive renovation.
- Only 3% of the software ordered fulfilled the agreed contractual conditions.
The Standish Group International stated that, on average, projects exceed
- the originally planned project costs by 89%
- the planned schedule by 222%.
Ewusi-Mensah identified the following factors as reasons for project cancellations due to poor software quality:
- Unclear objectives
- Unsuitable project team staffing
- Inadequate quality assurance
- Lack of technical know-how
- Insufficient consideration of the initial situation
- Lack of user involvement
Avoidance and correction of program errors
In general, the earlier an error is introduced in the development process and the later it is discovered, the more costly it is to correct.
During the planning
The most important thing is good and appropriate planning of the development process. A number of process models already exist, from which a suitable one can be selected.
In the analysis phase
One problem is that the correctness of a program can only be proven against an appropriately formalized specification. However, creating such a specification can be as complicated and error-prone as programming the program itself.
The development of increasingly abstract programming paradigms and programming styles, such as functional programming, object-oriented programming, design by contract, and aspect-oriented programming, also serves, among other things, to avoid errors and simplify troubleshooting. A technique suited to the problem should be selected from those available. An important point here is that programmers experienced in the respective paradigm must be available; otherwise the opposite effect often results.
It is also very useful to let the development tools handle as many error-avoidance tasks as possible reliably and automatically, which is made easier by, for example, structured programming. This covers checks such as visibility rules, type safety, and the avoidance of circular references, which the compiler can perform before the program is translated, as well as checks that can only be carried out at runtime, such as index checks for arrays or type checks for objects in object-oriented programming.
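The runtime checks mentioned above (index checks and type checks) can be illustrated with a small sketch. The function name `read_item` is hypothetical; the index check shown is the one the Python runtime performs on every list access, while the type check is written out explicitly.

```python
def read_item(values: list[int], index: object) -> int:
    # Type check at runtime: reject anything that is not an integer index.
    if not isinstance(index, int):
        raise TypeError(f"index must be int, got {type(index).__name__}")
    # Index check at runtime: Python raises IndexError instead of
    # reading memory outside the list.
    return values[index]


data = [10, 20, 30]
print(read_item(data, 1))  # 20

for bad_index in (5, "1"):
    try:
        read_item(data, bad_index)
    except (IndexError, TypeError) as exc:
        print(f"{type(exc).__name__}: {exc}")
```

In a statically typed language, the second kind of mistake would typically be rejected by the compiler before the program runs, which is the division of labor the paragraph above describes.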
In the design phase
Software experts agree that virtually every non-trivial program contains errors. Techniques have therefore been developed to handle errors within programs in a tolerant manner. These include defensive programming, exception handling, redundancy, and the monitoring of programs (e.g. using a watchdog timer), as well as plausibility checks of the program during development and of the data while the program is running.
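Two of the techniques listed above, exception handling and plausibility checks on data at runtime, can be combined in a short defensive routine. This is a sketch only: the function name, the fallback value, and the plausibility limits are assumptions chosen for illustration.

```python
def parse_temperature(raw: str, fallback: float = 20.0) -> float:
    """Parse a sensor reading in degrees Celsius, tolerating bad input."""
    try:
        value = float(raw)
    except ValueError:
        # Exception handling: malformed input must not crash the program.
        return fallback
    # Plausibility check on the data; the limits are assumptions.
    # NaN fails this comparison and is therefore rejected as well.
    if not (-90.0 <= value <= 60.0):
        return fallback
    return value


print(parse_temperature("21.5"))   # 21.5
print(parse_temperature("abc"))    # 20.0
print(parse_temperature("9999"))   # 20.0
```

Whether silently substituting a fallback is acceptable depends on the application; in many systems the rejected value would also be logged or reported.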
In addition, a number of advanced applications are available that analyze either the source code or the binary code and try to find commonly made errors automatically. This category includes, for example, execution monitoring programs that usually detect incorrect memory accesses and memory leaks reliably. Examples are the freely available tool Valgrind and the commercial Purify. Another category of checking programs comprises applications that statically analyze source or binary code, for instance to find and report unclosed resources and other problems. These include FindBugs, Lint, and Splint.
It makes good sense to develop the test before the actual program. This prevents a test from being written to fit a program that already exists. It can be done during the analysis or design phase by deriving test cases from the specification. Deriving test cases at this early stage of software development also allows the requirements for the program to be checked for testability and completeness. The test cases derived from the specification form the basis for the acceptance tests and can be continuously refined over the entire development process.
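Deriving test cases from the specification before writing the program can be sketched as follows. The specification text, the function name `shipping_cost`, and all amounts are invented for illustration; the order of the code mirrors the order of work, with the cases first and the implementation second.

```python
# Test cases written first, directly from an assumed specification:
# "Orders of 50.00 euros or more ship for free; all others cost 4.90 euros."
# Amounts are in cents to avoid floating-point rounding.
SPEC_CASES = [
    (0, 490),
    (4999, 490),
    (5000, 0),     # boundary value named in the specification
    (12000, 0),
]


def shipping_cost(order_total_cents: int) -> int:
    # Implementation written afterwards to satisfy the cases above.
    return 0 if order_total_cents >= 5000 else 490


def run_spec_cases() -> list[tuple[int, int, int]]:
    """Return (input, expected, actual) for every failing case."""
    return [
        (total, expected, shipping_cost(total))
        for total, expected in SPEC_CASES
        if shipping_cost(total) != expected
    ]


print(run_spec_cases())  # [] means every specified case passes
```

Writing the table of cases already forces a decision about the boundary value (exactly 50.00 euros), which is the kind of completeness check on the requirements that the paragraph above refers to.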
Some software vendors conduct part of their test phases publicly and release beta versions, so that the unpredictably diverse conditions of use found among different users can be tested and commented on by the users themselves.
If an error occurs during operation, an attempt must be made to keep its effects as small as possible and to contain its scope by creating "protective walls" or "safeguards". This requires both the ability to detect errors and the ability to react adequately to them.
One example of error detection at runtime is assertions, which check conditions that, according to the program design, are always fulfilled. Other mechanisms are forms of exception handling such as traps and exceptions.
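The difference between an assertion and ordinary error handling can be shown in a few lines. This is a minimal sketch with a hypothetical `withdraw` function: invalid input is a situation the program must expect, whereas a violated assertion means the program itself is defective.

```python
def withdraw(balance: int, amount: int) -> int:
    # Input validation is ordinary error handling, not an assertion:
    # callers can legitimately pass bad values.
    if amount <= 0 or amount > balance:
        raise ValueError("invalid withdrawal amount")
    new_balance = balance - amount
    # Assertion: by design this can never be false; if it is, the
    # program itself contains a defect.
    assert 0 <= new_balance < balance, "balance invariant violated"
    return new_balance


print(withdraw(100, 30))  # 70
```

Note that Python skips `assert` statements when run with the `-O` option, so assertions should never be the only protection against bad input.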
By implementing proof-carrying code, software can to a certain extent guarantee its reliability at runtime.
Complete freedom from errors is practically neither achievable nor demonstrable for software beyond a certain level of complexity. As complexity increases, it becomes harder to keep an overview, especially when several people are involved in the programming. Even expensive or extensively tested software contains programming errors. For usable programs, one therefore speaks of robustness rather than freedom from errors. Software is considered robust when errors occur only very rarely and then cause only minor inconvenience rather than major damage or loss.
In special cases, it is possible to prove that a program is free of errors (with respect to the specified requirements). Especially in areas where the use of software is associated with high financial, economic, or human risks, e.g. software for military or medical purposes or for the aerospace industry, a method called "(formal) verification" is also used, in which the correctness of the software is proven mathematically. However, because of the enormous effort involved, this method has strict limits and is practically impossible to carry out for complex programs (see also computability). There are now tools, however, that according to their vendors can provide this proof quickly and reliably, at least for certain areas (runtime errors).
In addition to mathematical verification, there is also a practical form of verification, which is described by the quality management standard ISO 9000. Under it, an error is formally established only if a requirement is not met. Conversely, a work result (and thus also software) can be described as "error-free" if it demonstrably meets all requirements. Whether a requirement is met is determined by tests. If all tests defined for a requirement produce the expected results, the requirement has been met. If this applies to the tests of all requirements (assuming correct and complete testing), the software is considered free of errors with respect to the requirements. If the requirements on which the tests are based are themselves incorrect or incomplete, the software will still not work "as desired".
Classification of defects
Errors that have occurred are generally processed systematically in error management. According to IEEE standard 1044 (classification of software anomalies), every error goes through a classification process consisting of four steps: recognition, analysis (investigation), processing (action), and conclusion (disposition). In each of these steps, the administrative activities of recording, classifying, and identifying impact are carried out.
Criteria by which errors can be classified include, among others (with examples):
- The type of error: a distinction is made between lexical errors (unknown reference), syntactic errors (forgotten semicolon), semantic errors (incorrect declaration), runtime errors (incorrectly formatted input data), and logical errors (plus instead of minus, loop errors, ...)
- The cause of the error: imprecise specification, transposed digits, incorrect formula, unchecked (incorrect) input data ...
- The point in time at which the error was introduced ("erroneous action"): already in the program specification, in the code design, during coding, ...
- The point in time at which the error manifests itself ("failure effect"): it makes a fundamental difference whether the error occurs during program development, for example during testing (where this is the normal case), or in productive operation (where it often represents a critical fault).
- The point in time of discovery: the longer the "error dwell time", the more costly the corrective action will generally be.
- The effect(s) of the error: display errors, incorrect results, program termination, external effects ...
- Effort and duration of troubleshooting: minimal ... very high; immediate ... very long duration
- Processing status: occurred, examined, correction order in process, retest possible, ..., done
With the help of metrics, "the results [and insights into errors] should also be a reason to search for the causes behind the problems". "Error classifications form the basis for standardized procedures for error handling and also support continuous quality improvement in the sense of quality management." Additional information for each error, such as a detailed error description, the programs affected, and the persons involved, accompanies and documents the measures taken to eliminate it. For more information, see the BITKOM guide.
For the sake of simplicity, program errors in the error handling process are often divided only into categories or classes such as A, B, C ... or 1, 2, 3 ... according to the severity of the error, which also takes into account the effect of the error and the effort required to correct it. For examples, see the BITKOM guidelines, especially the appendix.
Consequences of program errors
The consequences of program errors can vary greatly and manifest themselves in many different ways. If errors are discovered during the development process, the consequences are limited to reworking the software (code corrections, concept revision, documentation ...), with a greater or lesser effect on the project budget and project duration depending on the situation. Errors that are only detected in productive operation, on the other hand, often have more critical effects: they can cause process disruptions or production downtime, damage the company's image, lead to the loss of customers and markets, trigger recourse claims, or even endanger the company's existence. In the worst case, errors in technical applications can lead to disasters.
Specific examples of program errors and their consequences can be found in the list of program error examples .
Reproducibility of program errors
Some program errors are extremely difficult or impossible to reproduce reliably. If a previously failed operation is repeated under apparently unchanged conditions, there is a chance that the error will not show itself again. There are two possible reasons for this behavior. On the one hand, there can be a delay between the activation of the error and the problem that ultimately occurs, for example a program crash, which obscures the actual cause and makes it difficult to identify. On the other hand, other elements of the software system (hardware, operating system, other programs) can influence the behavior of the error in the program under consideration. One example is errors that occur in concurrent environments with insufficient synchronization (more precisely: sequencing). Because of the resulting race conditions, the processes may be executed in an order that leads to a runtime error. If the same action is repeated, the order of the processes may be different, and no problem arises.
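A race condition of the kind described above can be sketched with a shared counter. The example is illustrative only: the thread and iteration counts are arbitrary, and whether the unsynchronized variant actually loses updates on a given run depends on the interpreter and on scheduling, which is precisely what makes such errors hard to reproduce.

```python
import threading

THREADS = 4
INCREMENTS = 10_000


def run(use_lock: bool) -> int:
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(INCREMENTS):
            if use_lock:
                with lock:
                    counter += 1
            else:
                current = counter      # read
                counter = current + 1  # write; another thread may have
                                       # updated counter in between

    threads = [threading.Thread(target=worker) for _ in range(THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run(use_lock=True))   # 40000
print(run(use_lock=False))  # at most 40000; may differ from run to run
```

With the lock, the read and the write form one indivisible step, so the result is the same on every run; without it, the result depends on how the threads happen to interleave.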
- William E. Perry: Software Testing. Mitp-Verlag, Bonn 2002, ISBN 3-8266-0887-9.
- Elfriede Dustin, Jeff Rashka, John Paul: Testing software automatically. Procedure, handling and performance. Springer, Berlin et al. 2001, ISBN 3-540-67639-2.
- Cem Kaner, Jack Falk, Hung Quoc Nguyen: Testing Computer Software. 2nd edition. John Wiley & Sons, New York NY et al. 1999, ISBN 0-471-35846-0.
- The 25 most dangerous programming errors (English)
- SQS: The most spectacular software errors of 2012. In: Computerwoche. January 17, 2013, accessed January 20, 2013.
- M. Pol, T. Koomen, A. Spillner: Management and optimization of the test process. dpunkt.Verlag, Heidelberg 2002, ISBN 3-89864-156-2.
- Spillner et al.: Practical knowledge software test - test management. Reading sample, Chap. 1.1 Basic knowledge / definition of errors (Memento from December 17, 2010 in the Internet Archive) (PDF), dpunkt.de
- Merriam-Webster Unabridged Dictionary (iOS app, 2016): bug: a) an insect or other creeping or crawling invertebrate… b) any of certain insects commonly considered especially obnoxious… c) an insect of the order Hemiptera, especially: a member of the suborder Heteroptera ...
- The Papers of Thomas A. Edison, vol. 4, ed. Paul B. Israel, Baltimore and London, 1993. Online
- Fred R. Shapiro: Etymology of the Computer Bug: History and Folklore. In: American Speech 62:4, 1987, pp. 376-378.
- iX-Magazin, study on software test management, previously available in the iX Kiosk (Memento from January 9, 2013 in the Internet Archive)
- Wallmüller: Software Quality Management in Practice, beck-shop.de (PDF; 612 kB), Hanser, Munich 2001, ISBN 978-3-446-21367-8.
- Junginger: Value-oriented control of risks in information management. 2005, ISBN 3-8244-8225-8.
- Georg Edwin Thaller: Software test, verification and validation. 2002, ISBN 978-3-88229-198-8.
- IEEE Standard Classification for Software Anomalies. (PDF) IEEE Standards Board, 1993, p. 32, accessed on November 22, 2014 (White Paper; document behind paywall).
- Defect Classification for Software Guidelines. (PDF; 3.25 MB) BITKOM