Standardized programming


Standardized programming (German: Normierte Programmierung, NP) describes a standardized flow control for data processing programs. It was standardized in DIN 66220 and further developed toward structured programming with DIN 66260. Both approaches support modular programming.

Standardized programming is a generalized program flow control that divides the subtasks of a batch program, such as data input, group control and processing/output, into a uniform, logically clear, functional scheme. For programs with such tasks, this scheme can be applied independently of the subject-related task and in a 'technology-neutral' way (e.g. in any imperative/procedural programming language).

Historical review

When standardized programming for commercial data processing emerged at the end of the 1960s, there were only batch programs: sequential files in which, depending on the task at hand, different 'record types' appeared mixed, one after the other, because, for example, only one magnetic tape station or one punch card reader was available for input. Merging within the program was not necessary, and with, for example, 64 KB of main memory including the operating system, it would also have been too complex. So the program read 'its' input file and processed one record after the other. The record types (customer, order, ...) were distinguished by a data field 'record type' and assigned to one another using classification terms. Depending on the data constellation, the program 'jumped' (with GOTO) to certain points, for example to arithmetic operations, to printing a list line, to reading further data and processing it, and ultimately to the end of the program. Such programs were often 'spaghetti code', hardly followed uniform structural rules and were therefore opaque, error-prone and hostile to change.

More powerful computers, new programming languages and advances in software development methods gradually led to better solutions: proposals were made for a standardized structure of batch programs. The entire control of the program was broken down into clearly defined 'routines' which call the task-specific processing parts of the program; this is 'standardized programming'.

However, in the practice of individual software development, such standardized procedures are often not used, with the result that writing the program code for program control in many cases remained a special challenge or 'problem' for the developer, one that often causes high development effort and surfaces errors during software tests.

Objectives

According to a small textbook from 1971 by a large BUNCH computer manufacturer, entitled Logic of Programming, Standardized Programming, with which budding programmers and systems analysts were trained at the time, the goals of standardized programming are the "unification and standardization of program creation, reduction of programming time, elimination of possible sources of error and thus reduction of test time, and reduction of costs for the programmer". Accordingly, these goals matter above all when several or many developers in an organization produce software with development tools in which the control functions of standardized programming are not included as an integrated component.

The method of standardized programming

The core of standardized programming is the program flow control. Standardized programming enforces the logical and functional structure of a program through a standardized flow control that is independent of the individual programmer. Since the flow control is explicitly defined by the programmer, software development according to standardized programming belongs to the paradigm of imperative/procedural programming.

Batch programs thus always have the same structure, regardless of their individual task. That means: identical names for the processing procedures and identical control logic (blocks B to E are merely adapted to the specific task, e.g. the number of files and the group terms). The task-specific processing is contained only in the procedures of blocks G to J and, where necessary, A and F.

The scheme of the program flow control

Figure: Block subdivision of standardized programming

The program sequence is divided into blocks, each block representing a functionally related part of the program. The subdivision reflects, so to speak, the "natural" structure of a commercial program: before the actual processing starts, initial values must be set and control information evaluated; then input data must be read, the next record selected for processing and any group changes dealt with; finally the data record is processed and, if necessary, data is output. Each block represents a self-contained unit. For a better overview, a block can consist of several sub-blocks. The program flow chart shows the following blocks (a code sketch of this structure follows the list):

  • A : Preliminary program for all program steps to be carried out once
  • B : input; it consists of as many sub-blocks as there are serial input files. In each of these blocks, not only the actual input is dealt with, but also plausibility checks and sequence checks.
  • C : record release; from the records read per file, the record to be processed next is selected, and it is used (beforehand) to determine group changes.
  • D : Group control and call of the group change subroutines for all group levels.
  • E : single processing; it is in turn divided into sub-blocks according to the number of input files.
  • F : Final program for all program commands to be run through once after the actual program sequence.
  • G : group processing subroutines; there are as many subroutines as there are group levels, each separately for group pre-runs and post-runs.
  • H : subroutines for individual processing; these subroutines should be as small and clear as possible.
  • J : subroutines for sequential output and optional input / output; there is a subroutine for each of the files mentioned.
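The block structure described above can be outlined, for example, as in the following minimal Python sketch; all names (preliminary_program, record_release, ...) are hypothetical stand-ins for blocks A to J, not part of the DIN standards, and the bodies are stubs to be filled with task-specific code.

```python
def preliminary_program(state):        # Block A: one-time set-up (parameters, OPEN, headlines)
    state["status"] = [0] * len(state["files"])       # 0 = fetch a record

def read_input(state, file_no):        # Block B: read exactly one record from one input file
    state["status"][file_no] = 2       # stub: immediately report 'file completed'

def record_release(state):             # Block C: choose the record to be processed next
    state["released"] = None           # stub: nothing left to release

def group_control(state):              # Block D: detect group changes, call block G routines
    pass

def individual_processing(state):      # Blocks E/H: process the released record, call block J
    pass

def final_program(state):              # Block F: CLOSE files, final totals, terminate
    pass

def main():
    state = {"files": ["file1", "file2"], "released": None}
    preliminary_program(state)                        # A
    for file_no in range(len(state["files"])):        # B: initial read, one record per file
        read_input(state, file_no)
    while True:
        record_release(state)                         # C
        if state["released"] is None:                 # all files completed
            break
        group_control(state)                          # D: post-runs, then pre-runs (block G)
        individual_processing(state)                  # E: task-specific work (blocks H, J)
        read_input(state, state["released"])          # B: re-read only the released file
    final_program(state)                              # F

if __name__ == "__main__":
    main()
```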

The data feed control

An essential sub-function of standardized programming is the automatic 'data supply'. Exactly one data record is read from each input file, with the end of file being detected in the process. From the current (most recently read) records of all files, the next record to be processed is determined and passed to processing. Afterwards, exactly this file is read again. This type of data supply is implemented in the input and record release blocks.
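A concrete, runnable sketch of this data supply, assuming two sorted input files and made-up group terms (all names are illustrative): one record is held per input file, the record with the smallest group term is released, and only that file is then read again.

```python
# Self-contained sketch of the automatic data supply described above.
END = None  # sentinel for 'file completed'

def read_next(iterators, current, file_no):
    """Read exactly one record from file `file_no` (or note its end of file)."""
    current[file_no] = next(iterators[file_no], END)

def release_next(current):
    """Return the number of the file whose current record has the smallest group term."""
    candidates = [((rec[0], no), no) for no, rec in enumerate(current) if rec is not END]
    if not candidates:
        return None                        # all files completed
    return min(candidates)[1]              # group term first, file number as tie-break

def data_feed(files):
    iterators = [iter(f) for f in files]
    current = [END] * len(files)
    for no in range(len(files)):           # initial read: one record per file
        read_next(iterators, current, no)
    while (no := release_next(current)) is not None:
        yield no, current[no]              # hand the released record to processing
        read_next(iterators, current, no)  # then re-read exactly this file

# Example: two input files sorted by customer number (the group term).
masters = [("K001", "master"), ("K002", "master")]
orders = [("K001", "order 1"), ("K001", "order 2"), ("K002", "order 3")]
for file_no, record in data_feed([masters, orders]):
    print(file_no, record)
```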

Group term fields

Figure: Group term fields in standardized programming

The central key for controlling the program flow is the group term, which must be contained in every record of the input files. Since the group terms can be located at different positions in the input files, standardized programming extracts them from the data records and stores them in separate, file-related group term fields, from left to right in descending order of the group level hierarchy. For each input file there is one such group term field, identical in size and format for all files, hence 'file-related'. In the example, the group term field of input file 2 consists, from right to left, of the file number L2D ("2"), the group terms L21 (subgroup term), L22 (main group term), L23 (supergroup term) and L24 (an even higher group term), and the field L2S for the file status. The fields Ln1 to Ln4 can be used together as LnM for the pairing check. The field names follow the terminology of RPG (Report Program Generator), which was widely used at the time (L for level, i.e. group level).

In addition to the file-related group term fields, there are two further, file-neutral fields that are used for group control:

  • LA contains the group term of the last processed record
  • LN contains the group term of the next record to be processed

Their size corresponds exactly to that of the file-related fields L0, L1, ...; only the subdivision of the fields has been chosen differently, for reasons of expediency in the group control. In the example, field LN1 is used to control the subgroup, field LN2 to control the main group, and so on.
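As an illustration of these fields, the following sketch uses a small data class; the concrete group levels (country, region, zip code, customer number), the example values and, for simplicity, the identical layout of the file-neutral fields are assumptions, not part of the scheme.

```python
from dataclasses import dataclass

@dataclass(order=True)
class GroupTermField:
    """File-related group term field; the attribute order mirrors the left-to-right
    layout described above (status, group terms from the highest down to the lowest
    level, file number), so whole fields can be compared lexicographically."""
    status: str    # LnS: file status ('0' to '3')
    level4: str    # Ln4: an even higher group term, e.g. country
    level3: str    # Ln3: supergroup term, e.g. region
    level2: str    # Ln2: main group term, e.g. zip code
    level1: str    # Ln1: subgroup term, e.g. customer number
    file_no: str   # LnD: file number (a constant per input file)

# Field of input file 2, filled from the record last read from that file:
L2 = GroupTermField("0", "DE", "NRW", "40210", "K0004711", "2")

# File-neutral fields (here simplified to the same layout):
LN = L2                                                            # next record to process
LA = GroupTermField("0", "DE", "NRW", "40210", "K0000815", "1")    # last processed record
print(LN.level1 != LA.level1)   # True: a group change at the lowest level
```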

The file number

After one record has been read from each file, the record to be processed next must be selected. If the group terms differ, the solution is simple: the record with the smallest group term is processed next. If, however, the group terms are equal ("paired"), it must be decided which file has the higher priority; this is what the file number determines. In the classic case of comparing master data (e.g. a parts master) with movement data (e.g. warehouse movements), the movement data determine whether the master data must be processed at all, i.e. the movement record must be processed first. The file number, a constant, thus determines the order (priority) of processing when the remaining group terms are paired.
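A minimal sketch of this selection rule; the group terms, file numbers and their assignment to master and movement data are made-up example values.

```python
# Record release: the smallest group term wins; if the group terms are paired,
# the lower file number (here: the movement file) has priority.
current = {                   # file number -> group term of the file's current record
    1: "T0004711",            # file 1: movement data (e.g. warehouse movements)
    2: "T0004711",            # file 2: master data (e.g. parts master)
}

released_file = min(current, key=lambda no: (current[no], no))
print(released_file)          # -> 1: paired group terms, so file 1 is processed first
```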

The file status

The file status distinguishes between four states:

  • 0 = fetch a record (read the next record from this file)
  • 1 = do not fetch a record
  • 2 = file completed
  • 3 = file does not exist

At the beginning, the status of every input file is set to 0 ("fetch record"). After a record has been read from a file, its status is set to 1. The status is only set back to 0 when the file's record is released for processing. Status 3 can, for example, control processing variants with differing or missing files on the basis of run parameters.
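These status values and the transitions just described could be represented, for example, as follows; the constant names are illustrative, only the numeric codes are given in the list above.

```python
from enum import IntEnum

class FileStatus(IntEnum):
    """File status values as listed above."""
    FETCH_RECORD = 0      # a new record is to be read from this file
    DO_NOT_FETCH = 1      # the current record has not been released yet
    FILE_COMPLETED = 2    # end of file reached
    FILE_MISSING = 3      # file does not exist (e.g. controlled by run parameters)

status = [FileStatus.FETCH_RECORD, FileStatus.FETCH_RECORD]  # start: all files set to 0
status[0] = FileStatus.DO_NOT_FETCH      # after a record was read from file 0
status[0] = FileStatus.FETCH_RECORD      # after that record was released for processing
status[1] = FileStatus.FILE_COMPLETED    # after the end of file 1 was detected
print(status)
```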

The control mechanism

The program is controlled (roughly) according to the logic shown in the diagram 'Block subdivision of standardized programming'. All processing subroutines (A to G) are called from here. The basis for this are the file-related and file-neutral group term fields, as they are filled during reading and in the record release.

The structure of the blocks

Block A: preliminary program

Actions that are required at the beginning of the program are carried out here. Examples: reading parameters (e.g. as 'advance cards' to define the run date or run variants); initializing/loading lookup tables and other data areas; opening files (OPEN); setting switches; one-time output of headlines; in short, all one-time work before the start of the processing cycle driven by the input data.

Block B: input

Depending on the number of sequential input files, block B is subdivided into sub-blocks B0, B1, B2, ..., which are run through one after another. Depending on the file status, a record is read or not. The end-of-file check is carried out immediately after reading: if the end of the file has been reached, the file status is set to "file completed"; otherwise it is set to "do not fetch a record" and the file-related group term fields are filled.

If it is not guaranteed that an input file is always in the correct sort order, a sequence check or other plausibility checks should take place in this block, possibly with premature program termination.

Filtering can also take place here, i.e. certain data records can be skipped ('read over'), so that they also do not trigger any group changes.
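The following sketch shows one such input sub-block with end-of-file handling, a sequence check and filtering; the record layout, the state structure and the skip criterion are assumptions for illustration only.

```python
def input_block(state, file_no):
    """Sub-block B<n>: read one record from the file if its status requires it."""
    if state["status"][file_no] != 0:               # read only when status is 'fetch record'
        return
    record = next(state["readers"][file_no], None)

    # Filtering: skip (read over) records that are not to be processed at all;
    # consequently they trigger no group changes either.
    while record is not None and record.get("skip"):
        record = next(state["readers"][file_no], None)

    if record is None:                              # end-of-file check directly after reading
        state["status"][file_no] = 2                # 'file completed'
        return

    # Sequence check: terminate prematurely if the file is not in sort order.
    last = state["last_key"][file_no]
    if last is not None and record["key"] < last:
        raise SystemExit(f"sequence error in file {file_no}")
    state["last_key"][file_no] = record["key"]

    state["status"][file_no] = 1                    # 'do not fetch a record'
    state["current"][file_no] = record
    state["group_term"][file_no] = record["key"]    # fill the file-related group term field

# Minimal usage example with one input file:
state = {
    "readers": [iter([{"key": "K001"}, {"key": "K002", "skip": True}, {"key": "K003"}])],
    "status": [0], "last_key": [None], "current": [None], "group_term": [None],
}
input_block(state, 0)
print(state["current"][0], state["status"][0])      # -> {'key': 'K001'} 1
```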

Block C: Record release

Figure: Data feed control and group change processing

In block C, the next record to be processed is selected and released on the basis of the contents of the group term fields of each file. With 'ascending order' of all group terms, the next data record to be processed is the one with the lowest overall classification term. If 'descending order' is specified for one or more order terms (example: process the latest date first), this must be taken into account in a suitable manner during record release. The priority-controlling, predefined file number ensures that the correct data record is selected first even when, in specific tasks, the group terms are equal in several files.
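One common way to handle a 'descending order' term during record release is to invert that component of the comparison key, as in the following sketch; the helper function, the field layout and the example values are illustrative assumptions.

```python
# Record release with a mixed sort order: the date level descends (newest first),
# the customer level ascends; the file number remains the final tie-break.

def descending(term: str, width: int) -> str:
    """Invert a fixed-width character term so that ordinary ascending
    comparison yields descending order (one possible implementation)."""
    return "".join(chr(255 - ord(c)) for c in term.ljust(width))

def release_key(record, file_no):
    customer, date = record
    return (customer, descending(date, 8), file_no)

current = {1: ("K0001", "20240110"), 2: ("K0001", "20240315")}
released = min(current, key=lambda no: release_key(current[no], no))
print(released)    # -> 2: same customer, but the newer date is released first
```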

Block D: group control

Group control is carried out with the help of the file-neutral group term fields LN and LA. To detect a group change at the lowest level, LN1 is compared with LA1 (see the diagram 'File-neutral group term field'); for a change at the second-lowest level, LN2 is compared with LA2, and so on, up to the highest group level.

For detected group changes, the subroutines of block G are called: first the group post-runs (except after the first read; from the lowest level up to the determined change level) and then the group pre-runs (except after all data records have been processed; from the determined change level down to the lowest).
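A minimal sketch of this logic; the number of levels, the routine names and the example terms are assumptions.

```python
# Group control (block D): compare the file-neutral fields LN (next record) and
# LA (last processed record) level by level; call post-runs bottom-up, pre-runs top-down.
LEVELS = 2   # e.g. level 1 = customer, level 2 = zip code

def group_post_run(level): print(f"post-run level {level}")   # block G, e.g. print totals
def group_pre_run(level):  print(f"pre-run level {level}")    # block G, e.g. print headers

def group_control(LN, LA, first_record=False, after_last_record=False):
    change_level = 0
    for level in range(LEVELS, 0, -1):          # find the highest changed level
        if after_last_record or LN[level - 1] != LA[level - 1]:
            change_level = level
            break
    if change_level == 0:
        return                                   # no group change
    if not first_record:                         # post-runs: lowest level up to change level
        for level in range(1, change_level + 1):
            group_post_run(level)
    if not after_last_record:                    # pre-runs: change level down to lowest level
        for level in range(change_level, 0, -1):
            group_pre_run(level)

LA = ("K0001", "40210")      # last processed record: customer K0001, zip code 40210
LN = ("K0007", "40210")      # next record: new customer, same zip code
group_control(LN, LA)        # -> post-run level 1, then pre-run level 1
```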

Block G: group processing

What is to be done at the end or at the beginning of each group is processed here on a task-specific basis. Processing takes place in the sub-blocks G1, G2, etc. (the numeric part is identical to the group level, cf. LN1, LN2). Additional sub-blocks such as G1V, G2N, ... distinguish between group pre-run (example: output of a list header) and group post-run (example: output of totals). The calls are made from block D, and only for the group levels identified there.
Note: for a specific group term (e.g. zip code 12345), the pre-run and the post-run are executed far apart; in between there is at least one individual processing, and possibly group processing for lower group levels.

Block E: individual processing

In the subroutines of block E, the records of the controlling input files are processed. In the sub-blocks E1, E2, etc., exactly that data record is processed which was selected in the record release (block C) and whose file number is held in the LND field. Any required (old) group post-runs and (new) group pre-runs have already been processed at this point.

Depending on the task, record type, etc., data is, for example, buffered, totals are calculated and accumulated, data or detail lines are output (by calling a subroutine of block J), and switches are set (e.g. QL1, QL2, QG1, QG2, ...; the switches of standardized programming are not discussed further here).
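As an illustration, a single-processing subroutine might accumulate totals and delegate the detail output to a block J subroutine, roughly as in this sketch; the record layout and all names are assumed.

```python
# Sub-block E1/H: process one released order record, accumulate group totals and
# delegate the detail line output to a block J subroutine. Names are illustrative.
totals = {"customer": 0.0, "overall": 0.0}
output_lines = []

def output_detail_line(record):             # block J: the formal details of the output
    output_lines.append(f"{record['order_no']:>10} {record['amount']:>10.2f}")

def process_order_record(record):           # block E/H: task-specific single processing
    totals["customer"] += record["amount"]
    totals["overall"] += record["amount"]
    output_detail_line(record)

process_order_record({"order_no": "A-1001", "amount": 250.00})
process_order_record({"order_no": "A-1002", "amount": 99.50})
print(output_lines, totals)
```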

Block H: processing subroutines

“The frequent use of subroutines is highly recommended. Even if a certain processing part occurs only once in the program, it can make sense to move this part into a subroutine in order to work out the flow logic of the processing program in a correspondingly clear and concise manner.” (SPERRY UNIVAC: Logic of Programming, Standardized Programming. Around 1970)

Block J: subroutines for sequential output and optional input/output

It is recommended that everything belonging to output in the broader sense be moved into these subroutines, in order to 'relieve' the processing routines that cause the output of the necessary (often formal) details. In addition to the actual record output, this can include, for example, clearing print areas, calculating the record address for random-access files, or issuing control instructions for certain devices.

Block F: Final program

This includes closing the files, outputting e.g. final totals, and terminating the program.

The special features of standardized programming

Compared with "wild" programming, the programming time is shortened significantly, as is the test time. The method is relatively easy to learn, is independent of machine types and programming languages, and is independent of the individual programmer (accordingly, it was also readily rejected by "artists"). Someone who knows the method of standardized programming can quickly familiarize themselves with an unfamiliar program that follows this methodology.

Further considerations

Controlling / non-controlling files

In cases of doubt, different decisions can be made as to whether a file is treated as controlling in the NP processing. This is important, for example, if the data in certain files contain only shortened group terms. In one task, for instance, the data for customers, orders and reminders might come from three files. If the customer file is treated as controlling, group changes occur regardless of whether or not there are orders or reminders. Alternatively, the customer data could be read specifically, with direct access or sequentially, as part of the individual processing (e.g. in the customer group pre-run triggered by orders).

As an alternative to control via several input files, data can be combined into a single common data source during read access (e.g. when using the database language SQL) or by means of dedicated preprocessing programs.

Group terms as essential elements

Group terms (also called grouping terms or classification terms) are the contents of data fields according to which the data to be processed are combined into groups, possibly at several levels. This means, for example, that heading/header lines are output at the beginning of a group and/or total lines at its end. Such groupings are common in reporting, but also serve other processing purposes. Which field contents are used as group term(s) is always determined by the processing purpose.

The special properties of group terms described below (with examples) may have to be taken into account in program development through special implementation measures:

  • Group terms can be single-level (zip code only) or multi-level (zip code and age group, ...).
  • They appear uniformly in all input files or, in some cases, only in abbreviated form. Example: customer data with only a customer number, order data additionally with an order number.
  • They come from directly stored information or are derived information (such as age or age group (from the date of birth) or an amount size class). Derived terms must be produced as part of preprocessing; for simple derivations this is possible in the read process itself (a sketch of such a derivation follows this list), otherwise upstream preprocessing programs may be required.
  • They can correspond to the full content of a field or to part of a field (such as positions 1 and 2 of the postcode).
  • The sort order can be ascending or descending (newest date first).
  • Group terms can be terms common to the database (country, postcode for address data) and/or terms that are evaluated for special purposes (number of months since the last order; birthday MMDD). The same database can be processed according to many different criteria.
  • The group terms come from one (1) or more databases (postcode and age from customer data, manufacturer country from article data).
  • The group terms may have different data formats; either the individual sub-terms and/or identical terms from different input files are formatted differently.
  • For processing, the data records must be sorted in the defined order or be readable in this order. This order is usually checked during processing.
  • In addition to the group terms, an additional sort of the data records is usual, for example by article number when creating an invoice, although the invoice only summarizes the orders per customer number.
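For the derived group terms mentioned above, a simple derivation can often be computed in the read routine itself. The following sketch derives an age-group term from a date of birth; the class boundaries and field names are purely illustrative assumptions.

```python
from datetime import date

def age_group(date_of_birth: date, run_date: date) -> str:
    """Derive a one-character age-group term from the date of birth
    (the concrete class boundaries are an assumed example)."""
    age = run_date.year - date_of_birth.year - (
        (run_date.month, run_date.day) < (date_of_birth.month, date_of_birth.day))
    if age < 30:
        return "A"
    if age < 60:
        return "B"
    return "C"

# Filling the derived group term while reading a record (illustrative record layout):
record = {"customer": "K0004711", "date_of_birth": date(1970, 5, 17)}
record["age_group"] = age_group(record["date_of_birth"], date(2024, 6, 30))
print(record["age_group"])    # -> 'B'
```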

Tools for standardized programming

Figure: Sample program framework as a template

To create a program according to standardized programming, aids should be used that minimize the effort of program creation and increase or safeguard the quality of the resulting programs (e.g. with regard to correctness, testability and uniformity). These include:

  • Program generators: generators are available on the system software market with which a program framework can be generated on the basis of specifications to be defined. As a rule, this framework contains the complete program control with all the necessary data fields and the subroutines addressed by the main control (input, group control and processing, individual processing, etc.). Often these generators support only certain programming languages. The data fields and subroutines are usually generated according to generator-specific naming conventions other than those used here.
  • Program templates: if no generator is available, a sample program framework is helpful which provides structures similar to those mentioned under 'program generators', also taking other company standards into account. To create a new program, the framework is copied and individually adapted to the task (number of files, group terms, ...).

In both cases, after the aforementioned preparatory activities, the entire flow control is ready; the programmer 'only' has to add the task-specific processing details in the subroutines that are still empty (such as group pre-run_1, individual processing_A, etc.).

Further standardizations

A literal interpretation of 'standardized programming' could include all normalizing/standardizing aspects of programming (i.e. of creating a computer program in the narrower sense). In addition to the standardized flow control described in this article, this can include the following aspects:

  • Naming conventions: described here essentially as a suggestion for naming the function blocks. Rules for naming data definitions would also have to be specified in detail.
  • GOTO-free programming: depending on the programming language used, loop constructs are offered for this. The aim is a clear program logic. At the very least, direct jumps into foreign subroutines should never be allowed, i.e. each subroutine returns to its call point.
  • Standard functions: In many companies there are ready-made routines (subroutines, macros, code sequences, ...) for certain tasks (technical / functional) that can be used in individual programs. Examples: Open / Close, date calculation, printout, ...
  • Standard data definitions: the structure of data records (their field sequence, lengths, formats, ...) should be used in one uniform form in all programs that process these files. For example, it should be possible to use a different name prefix for input than for output.
  • Design of screen contents: colors, position of input fields and error messages, ...
  • Design of lists: arrangement of headers and footers, ...
  • Program comments: in some companies it is mandatory to comment the created code in great detail. There is a risk of redundancy with other written specifications.
  • ...

The failures of the numerous attempts at national or even global standardization should not prevent companies from establishing corresponding specifications as internal rules and from checking compliance with them (as part of quality assurance).

Normalization/standardization is an essential aspect of quality. See also programming style.

Criticism and further developments

In the late 1960s and 1970s, 'top-down design', 'stepwise refinement' and 'modular programming' were topics of discussion in software development. In particular, Edsger W. Dijkstra's proposals for structured programming are still influential today, and they could already be implemented within standardized programs at the time. Standardized programming did not stand in the way of stepwise refinement, especially of blocks E, H and J. The elementary basic structures could be implemented, for example, in COBOL; a "GO TO"-free program was possible with standardized programming; and the block concept could also be realized in languages such as COBOL and PL/I, and even in assembler. However, the readability of the programs is still criticized by some programmers, especially when the source code was generated by NP generators and contained, for example, unfamiliar field and procedure names, sometimes even "GO TO" statements.

The 'standardized programming' approach is also confirmed by the fact that many report generators and database evaluation languages use structurally almost identical constructs: the user knows and defines, for example, a list header and footer (corresponding to the program pre-run and final program), group headers and group footers (corresponding to group pre-runs and post-runs) for several hierarchically defined group levels. The detail line (also called the detail area) shows information about the individual data record, which corresponds to the individual processing.

In its full scope, the standardized programming scheme contains sub-functions which may be superfluous in certain circumstances or can be implemented in simplified form. For example, in the 'record release' block the selection of the next data record to be processed is unnecessary if there is only one input file.

See also

  • Control break (Gruppenwechsel) - a simplified representation of control break processing.