Availability

from Wikipedia, the free encyclopedia

The availability of a technical system is the probability, or the degree, to which the system meets specified requirements at a given point in time or within an agreed period. Alternatively, the availability of a set of objects is defined as the proportion of available objects relative to the total number of objects in the set (cf. CLC/TR 50126-3). It is a quality criterion and a key figure of a system.

Definition as a key figure

Availability can be defined in terms of the time during which a system is available:

Availability = (agreed period − downtime) / agreed period

In this context, a distinction must be made between planned and unplanned unavailability. Only downtime within the agreed period enters the calculation, and planned unavailability, for example due to maintenance tasks, lies outside the agreed period. Only unplanned unavailability therefore counts as downtime. If full 24/7 availability is agreed, there is no planned unavailability, and any interruption of operation counts as downtime. Maintenance work on such systems must be carried out during operation.
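The time-based definition, in which only unplanned downtime inside the agreed period counts, can be sketched as follows. The function name `availability` and the example figures are illustrative assumptions, not taken from any standard.

```python
def availability(agreed_hours: float, unplanned_downtime_hours: float) -> float:
    """Return availability as a fraction of the agreed period.

    Planned outages outside the agreed period are not passed in at all,
    because they do not count as downtime.
    """
    if agreed_hours <= 0:
        raise ValueError("agreed period must be positive")
    if not 0 <= unplanned_downtime_hours <= agreed_hours:
        raise ValueError("downtime must lie within the agreed period")
    return (agreed_hours - unplanned_downtime_hours) / agreed_hours


# Hypothetical example: 3120 agreed hours per year, 15.6 hours of unplanned outage.
print(f"{availability(3120, 15.6):.3%}")
```

Returning a fraction rather than a percentage keeps the value directly usable in further calculations; formatting as a percentage is left to the caller.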

The alternative definition of availability in relation to a set of elements is:

Availability = number of available elements / total number of elements

Application

In the case of larger, complex technical systems (e.g. electricity supply, power plants, telecommunications), availability is the ratio of the operating time, that is, the time within an agreed period in which the system is operationally available for its actual purpose, to the agreed period. The operating time of a technical system can be reduced by regular maintenance and by faults or damage and the repairs needed to eliminate them. Availability is usually given in percent.

In the case of computer systems (e.g. DSL, online brokerage), availability is measured as the duration of uptime per unit of time and given in percent. A system is also considered unavailable when its response time exceeds a specified limit. The time units typically used are minutes, hours, days, months, quarters or years.
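A minimal sketch of such a measurement, assuming the system is probed at regular intervals and each probe yields either a response time in seconds or no answer at all. The function name, the sample data and the 2-second limit are hypothetical.

```python
from typing import Optional, Sequence


def measured_availability(samples: Sequence[Optional[float]],
                          max_response_s: float) -> float:
    """Fraction of probe intervals in which the system answered in time.

    Each sample is the response time of one probe in seconds, or None if
    the probe received no answer. Answers slower than max_response_s are
    counted as unavailable, just like missing answers.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    ok = sum(1 for s in samples if s is not None and s <= max_response_s)
    return ok / len(samples)


# Ten hypothetical probes: one without an answer, one too slow.
probes = [0.2, 0.3, None, 4.8, 0.25, 0.4, 0.2, 0.3, 0.2, 0.3]
print(f"{measured_availability(probes, max_response_s=2.0):.0%}")
```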

For aircraft, the term dispatch reliability is used instead, whereas in flight booking, availability refers to the number of free seats on an aircraft relative to the total number of seats.

CLC/TR 50126-3 uses the term fleet availability. It describes the ratio of serviceable vehicles to the total number of vehicles in a fleet.

Availability as a property of a system is therefore stipulated in the contract (service level agreement, SLA) between the system operator and the customer. The consequences of failing to meet the agreed availability (e.g. contractual penalties) can also be regulated there.

Depending on the agreement, availability has a major impact on the requirements placed on the fail-safety and maintainability of the system.

For a system that is agreed to be available 12 hours a day, 5 days a week, 52 weeks a year (12 × 5 × 52 = 3120 hours), this means in hours:

Availability   Minimum expected operating time [hours]   Maximum allowed downtime [hours]   Remaining time [hours]
99%            3088.8                                    31.2                               5640
99.5%          3104.4                                    15.6                               5640
99.7%          3110.64                                   9.36                               5640
99.9%          3116.88                                   3.12                               5640
99.95%         3118.44                                   1.56                               5640
100%           3120                                      0                                  5640

Based on a year of 365 days (8760 hours), 5640 hours, or 235 days, lie outside the agreed period. This remaining time can be used for system maintenance, for example, without affecting availability.
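The figures in such a table follow directly from the agreed period and the agreed availability. The sketch below shows one way to compute them. The function name `downtime_budget` and the dictionary keys are illustrative assumptions, and a year is taken as 365 days, as in the text.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, leap years ignored


def downtime_budget(availability_pct: float, agreed_hours: float) -> dict:
    """Return operating-time and downtime figures for one year."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    if not 0 < agreed_hours <= HOURS_PER_YEAR:
        raise ValueError("agreed period must fit into one year")
    max_downtime = agreed_hours * (100 - availability_pct) / 100
    return {
        "min_operating_h": agreed_hours - max_downtime,
        "max_downtime_h": max_downtime,
        "max_downtime_min": max_downtime * 60,
        # Time outside the agreed period, usable e.g. for maintenance.
        "remaining_h": HOURS_PER_YEAR - agreed_hours,
    }


# 12 hours a day, 5 days a week, 52 weeks a year.
for pct in (99.0, 99.5, 99.9):
    row = downtime_budget(pct, agreed_hours=12 * 5 * 52)
    print(pct, round(row["min_operating_h"], 2),
          round(row["max_downtime_h"], 2), row["remaining_h"])
```

Calling the same function with `agreed_hours=HOURS_PER_YEAR` gives the figures for a system agreed to run around the clock, where the remaining time is zero.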

For a system that is agreed to be available 24 hours a day, 365 days a year (24 × 365 = 8760 hours), this means:

Availability   Minimum expected operating time [hours]   Maximum allowed downtime [hours]   Maximum allowed downtime [minutes]
99%            8672.4                                    87.6                               5256
99.1%          8681.16                                   78.84                              4730.4
99.2%          8689.92                                   70.08                              4204.8
99.3%          8698.68                                   61.32                              3679.2
99.4%          8707.44                                   52.56                              3153.6
99.5%          8716.2                                    43.8                               2628
99.6%          8724.96                                   35.04                              2102.4
99.7%          8733.72                                   26.28                              1576.8
99.8%          8742.48                                   17.52                              1051.2
99.9%          8751.24                                   8.76                               525.6
99.95%         8755.62                                   4.38                               262.8
99.97%         8757.372                                  2.628                              157.68
99.98%         8758.248                                  1.752                              105.12
99.99%         8759.124                                  0.876                              52.56
99.999%        8759.9124                                 0.0876                             5.256
100%           8760                                      0                                  0

In this case, no time remains outside the agreed period. Maintenance must therefore take place within the permitted downtime.

The availability network can be used to optimize availability.

Systems that have to run with high availability (99.99% or better) are referred to as high-availability systems.

Availability indicators are:

  • Maximum duration of a single failure (availability: annual average downtime, also availability class),
  • Reliability (ability to work correctly over a given period of time under certain conditions),
  • Fail-safe operation (robustness against incorrect operation, sabotage and force majeure),
  • System and data integrity,
  • Maintainability (more generally: usability as such),
  • Response time (how long the system takes to carry out a specific action),
  • Mean Time to Repair (MTTR, mean recovery time after a failure),
  • Mean Time Between Failures (MTBF, mean operating time between two successive failures, excluding repair time),
  • Mean Time to Failure (MTTF, analogous to MTBF, but used for systems or components that are replaced rather than repaired).
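The list does not state how these indicators relate to availability. A commonly used relation, given here as an assumption rather than as part of the definitions above, is availability = MTBF / (MTBF + MTTR), with MTBF understood as mean operating time excluding repair time. The function name and the example values are hypothetical.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run availability from mean operating and mean repair time.

    Assumes MTBF excludes repair time, so one failure cycle lasts
    mtbf_hours + mttr_hours on average.
    """
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    if mttr_hours < 0:
        raise ValueError("MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical component: fails every 999 operating hours on average
# and takes 1 hour to repair.
print(f"{steady_state_availability(mtbf_hours=999.0, mttr_hours=1.0):.2%}")
```

The relation makes explicit that availability can be raised either by failing less often (larger MTBF) or by recovering faster (smaller MTTR).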

If a system consists of several subsystems that build on one another, the overall availability is the product of the availabilities of the individual subsystems, provided their failures are stochastically independent events.
Example:
An application in a data center has an availability of 99.5% and the underlying line has an availability of 98.5%, which results in an overall availability of about 98% (0.995 × 0.985 ≈ 0.980).
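The multiplication for subsystems that build on one another can be sketched as follows, using the figures from the example. The function name `serial_availability` is an illustrative assumption, and the result is only valid under the independence assumption stated above.

```python
import math
from typing import Iterable


def serial_availability(availabilities: Iterable[float]) -> float:
    """Overall availability of subsystems that all have to work.

    Each value is a fraction between 0 and 1. Failures are assumed to be
    stochastically independent, so the availabilities multiply.
    """
    values = list(availabilities)
    if not values:
        raise ValueError("at least one subsystem is required")
    if any(not 0.0 <= a <= 1.0 for a in values):
        raise ValueError("availabilities must be fractions between 0 and 1")
    return math.prod(values)


# Application (99.5%) running on top of a line (98.5%).
total = serial_availability([0.995, 0.985])
print(f"{total:.4%}")
```

Because every factor is at most 1, the overall availability of such a chain can never exceed that of its weakest subsystem.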

Literature

  • Josef Börcsök: Electronic security systems. Hardware concepts, models and calculation. Hüthig, Heidelberg 2004, ISBN 3-7785-2939-0 (practice).
  • Peter S. Weygant: Clusters for High Availability. A Primer of HP-UX Solutions. Prentice Hall PTR, Upper Saddle River NJ 1996, ISBN 0-13-494758-4 (Hewlett-Packard Professional Books).
  • CENELEC: Railway applications - Specification and proof of reliability, availability, maintainability, safety (RAMS) - Part 3: Guide to the application of EN 50126-1 for rail vehicles RAM. CLC/TR 50126-3, Technical Report, July 2008.