Cache

Cache ([ kæʃ ], [ kaʃ ]) refers to a fast buffer memory in information technology , which helps to avoid (repeated) access to a slow background medium or costly recalculations. Data that has already been loaded or generated once remain in the cache so that it can be retrieved from it more quickly if required later. Data that will probably be needed soon can also be called up in advance from the background medium and initially made available in the cache ( read-ahead ).

Caches can be designed as a hardware structure (for example as main memory chips ) or software structure (for example as temporary files or reserved storage space ).

Cache is a loan word from English. It has its origin in the French cache , which actually means hiding place . The name illustrates the fact that the cache and its substitute function for the addressed background medium are usually hidden from the user. Anyone who uses the background medium does not need to know the size or functionality of the cache, because the cache is not addressed directly. The user "addresses the background medium", but instead the cache "responds" - exactly in the same way as the background medium would have responded, ie would have delivered data. Because this intermediate unit is invisible, it is also called transparency . In practice, it is a mirrored resource that can be edited / used very quickly on behalf of the original.

If, in addition to the device using the cache, other devices access the background medium, this can lead to inconsistencies . In order to be able to access an identical data image, it is necessary to transfer the changes to the cache to the background medium before access. Cache strategies like write-through or write-back are practical here. In extreme cases, a complete " cache flush " must be carried out.

In addition, the cache may have to be informed that data on the background medium has changed and its content is no longer valid. If the cache logic does not ensure this, the disadvantage arises that changes made in the background medium or in the computer program are not recognized in the meantime. If changes are suspected or to ensure that the current status is taken into account, the user must explicitly initiate a cache update.

Use

The goals when using a cache are to reduce the access time and / or to reduce the number of accesses to a slow background medium. In particular, this means that the use of caches is only worthwhile where the access time also has a significant influence on the overall performance. While the z. B. is the case with the processor cache of most (scalar) microprocessors, it does not apply to vector computers , where the access time plays a subordinate role. That is why caches are usually not used there because they are of little or no use.

Another important effect when using caches is the reduction of the necessary data transfer rate to the connection of the background medium (see e.g. memory hierarchy ); the background medium can therefore be connected “more slowly”. B. can result in lower costs. Because the majority of queries can often be answered by the cache ("cache hit", see below), the number of accesses and thus the necessary transmission bandwidth decrease . For example, a modern microprocessor without a cache, even with a very short main memory access time, would be slowed down by not having enough memory bandwidth, because the omission of the cache would greatly increase the number of accesses to the main memory and thus the demand on memory bandwidth.

In the case of CPUs, the use of caches can thus help reduce the Von Neumann bottleneck in the Von Neumann architecture . The execution speed of programs can be increased enormously on average.

A disadvantage of caches is their poorly predictable timing, since the execution time of an access is not always constant due to cache misses. If the data is not in the cache, the user must wait until it has been loaded from the slow background medium. With processors, this often happens when accessing data that has not yet been used or when loading the next program command with (wide) jumps .

Cache hierarchy

Since it is technically complex and therefore usually not economically sensible to build a cache that is both large and fast, you can use several caches - e.g. B. a small, fast cache and a significantly larger, but somewhat slower cache (which is, however, still much faster than the background memory to be cached). This means that the competing goals of low access time and large cache size can be achieved together. This is important for the hit rate .

If there are several caches, they form a cache hierarchy that is part of the memory hierarchy . The individual caches are numbered according to their hierarchical level , ie level ‑ 1 to level ‑ n or L1, L2 for short, etc. The lower the number, the closer the cache is to the fast "user"; the lowest number therefore denotes the cache with the fastest access time; this is searched first. If the L1 cache does not contain the required data, the (usually somewhat slower, but larger) L2 cache is searched, etc. This happens until the data is either found in a cache level (a "cache hit", see below) or all caches were searched without success (a "cache miss", see below). In the latter case, the slow background memory must be accessed.

If a cache hit occurs e.g. B. in the L3 cache, the requested data is delivered to the accesser and transferred to the L1 cache at the same time; for this a cache line has to give way, which "sinks" into the L2 cache.

With an inclusive cache , each cache level is transparent in itself; H. a cache line that is in the L1 cache is also present in the L2 and L3 cache. If the cache line is "displaced" from the L1 cache (overwritten with data from a different address), nothing else needs to be done - it is still available in the L2 cache (if no write-back or similar is necessary is).
With an exclusive cache , a cache line of an address is only available once in all cache levels. A cache line to address A in the L1 cache is not also available in the L2 or L3 cache. If it is displaced from the L1 cache, it can either be discarded entirely or it must be explicitly copied into the L2 cache. A (different) cache line is therefore also displaced there to make room for the sinking one. This other one now sinks into the L3 cache, where a third cache line has to give way.

Exclusive cache hierarchies generate significantly more data traffic between the caches. As many cache lines as the sum of the L1, L2 and L3 cache sizes can be kept available for this purpose, while only the L3 cache size is decisive for the inclusive cache.

In the hardware area, modern CPUs in particular have two or three cache levels; other devices usually only have one cache level. In the software sector, mostly only one cache level is used, a prominent exception are web browsers that use two levels ( main memory and hard disk drive ).

Strategies

Cache size

In order to maximize the usefulness of the cache, which is usually several powers of ten, compared to the background memory, the locality properties of the access patterns are used in the functioning and organization of a cache . For example, if you observe the activity of a running program on a processor over a short time interval, you will notice that a few and “always the same” small memory areas (e.g. code within loops, control variables, local variables and procedure parameters) are repeatedly accessed becomes. Therefore, even small caches with a few kibibytes can be very effective.

However, if an algorithm is constantly processing new data (e.g. streaming data), a cache cannot accelerate it through multiple accesses, at best slightly through read-ahead .

Exploitation of the location

Since caches are supposed to be fast, a different (faster) memory technology is used for them than for the memory to be cached (for example SRAM versus DRAM , DRAM versus magnetic disk , etc.). Therefore, caches are usually much more expensive in terms of the price-bit ratio, which is why they are designed much smaller. This means that a cache cannot have all the data available at the same time. In order to solve the problem of which data should be kept in the cache, the locality properties of the accesses are used:

Temporal locality: Since data is accessed repeatedly (e.g. when processing a program loop), it is more likely that data that has already been accessed will also be accessed again. This data should therefore preferably be kept in the cache. This also results in the need to remove old data that has not been used for a long time from the cache in order to make room for newer data. This process is called “repression”.
Spatial locality: Since program code and data are not scattered around randomly in the address space, but are arranged "one behind the other" and sometimes only in certain address areas (code, data, stack segment, heap , etc.), it is after access to a specific address it is likely that a “nearby” address (read: the difference between the two addresses is very small) is accessed. When processing a program z. For example, one command after the other is processed, with these being “one after the other” in the memory (if there is no jump command). Many data structures such as arrays are also located “one behind the other” in memory.

Because of their spatial location, caches do not store individual bytes, but rather blocks of data ("cache block" or sometimes also called "cache line"). In addition, this makes implementation easier and reduces memory overhead , since you don't have to save an address per data byte, but only for each cache block. The choice of block size is an important design parameter for a cache that can have a major impact on performance.

organization

A cache consists of a (mostly) fixed number of entries, each entry consists of:

Cache line: The actual cached data (e.g. 64 bytes for current PC processors)
Address tag: The address on the background media at which the cached data begins
Access / management information: v. a. Information for the "displacement strategy" (see below)

Cache-Line / Cache-Line

A cache line is the smallest administrative unit within the cache of processors. It is a copy of a storage area. The accesses from the cache memory to the CPU or to the main memory thus take place in a single, block-wise transfer. The size of a cache line is 16 bytes ( Intel 80486 ), 32 bytes (Pentium P5 to Pentium III) and 64 bytes (Pentium 4 to current Core-i / AMD ZEN processors in 2018). The minimum length results from the memory bus width multiplied by the prefetch depth of the memory.

Block number tags instead of address tags

In the following, it is assumed that cache lines are only ever read and written from addresses whose address can be divided by the (byte) length of the cache line.

example

A cache line is said to be 64 bytes. It should be specified that data can only be read and written with start addresses e.g. B. 0, 64, 128, 192, 256, ... The background medium is divided into blocks that are just the size of a cache line.

Then the entire (start) address of the data no longer has to be stored in the address tags, but only the number of data blocks cached on the background medium. By choosing suitable numbers (powers of two) in the binary system, the tags can be saved in a space-saving manner; this speeds up checking whether a requested address is contained in the cache.

Block / sentence division of the tags

Fully associative cache

The blocks (cache lines) of a cache can be summarized in so-called sets. Only one of the records is responsible for a specific address. Within a set, all blocks serve only part of all available addresses. In the following, the variable stands for the total number of cache blocks and for the number of blocks per record, the so-called associativity. The cache then consists of records . ${\ displaystyle m}$ ${\ displaystyle n}$ ${\ displaystyle {\ tfrac {m} {n}}}$

organization	Number of sets	Associativity
DM	${\ displaystyle m}$	1
FA	1	${\ displaystyle m}$ ( ) ${\ displaystyle = n}$
SA	${\ displaystyle {\ tfrac {m} {n}}}$	${\ displaystyle n}$

Depending on how strongly you make this division, one speaks of one of the three types of cache organization:

Mapped directly (Engl. Direct mapped , short DM ): ${\ displaystyle n = 1}$ , d. that is, each block represents its own sentence, so there are as many sentences as blocks. Thus exactly one cache block is responsible for a given address. So there is a direct mapping between the background memory address and cache blocks, hence the name. When making a request to such a cache, you only have to read out a single cache block (more precisely, check the associated tag , see below), which minimizes the hardware requirements for the tag comparator. In return, the efficiency of the cache is limited, as there may be free cache blocks that are not used, see Conflict Miss below.
Fully associative (Engl. Fully associative, shortly FA ): ${\ displaystyle n = m}$ , d. i.e., there is only one sentence that contains all blocks. This means that every address in every cache block can be cached. When making a request to the cache, it is therefore necessary to check all cache tags. Since caches have to be as fast as possible, this is carried out in parallel, which increases the amount of hardware required for tag comparators. The advantage, however, is that the cache can always hold data as long as any cache block is still free.
Set associative or quantity associative (Engl. Set associative, short SA )

The first two are a special case of the sentence-associative cache. The directly mapped and the fully associative cache can thus be derived from the set-associative cache: n = 1 leads to a directly mapped cache, n = m to a fully associative cache.

Explanation based on an example

The main sizes of a cache are:

The size of the stored data (i.e. size of the cache): here in the example 64 KiB
The size of the address space to be covered: here in the example 4 GiB
The length of a cache line: here in the example 64 bytes
The granularity of the data: here in the example 1 byte
Presence of dirty and valid tags.

Regardless of the structure, the cache consists of 64 KiB / 64 bytes = 1024 cache lines

Fully associative cache

31	30th	29	28	27	26th	25th	24	23	22nd	21st	20th	19th	18th	17th	16	15th	14th	13	12	11	10	9	8th	7th	6th	5	4th	3	2	1	0
Length of address tag																												Byte position

There is one cache group that includes all 1024 cache lines. Each main memory data word can be stored in any of the 1024 cache lines of the one cache group. 1024 comparators are required, which must compare log2 (4 GiB / 64 bytes) = log2 (4 GiB) -log2 (64 bytes) bits = 32-6 = 26 bits. These 26 bits are attached to each cache line as an address tag. Hardware effort:

1024 comparators
1024 × 64 × 8 bit actual cache
1024 × 26 bit address tag
1024 × 64 bit valid tags
1024 × 64 bit dirty tags
1024 ×? bit for the LRU tags

Direct mapped cache / simple or non-associative cache

31	30th	29	28	27	26th	25th	24	23	22nd	21st	20th	19th	18th	17th	16	15th	14th	13	12	11	10	9	8th	7th	6th	5	4th	3	2	1	0
Length of address tag																cache group used										Byte position

There are 1024 cache groups, each with one cache line. Each main memory data word can only be stored in this cache line belonging to its address. The cache group results from bits 15 to 6 of the address. Only one comparator is required, which has to compare log2 (4 GiB) -log2 (64 KiB) bits = 16 bits. These 16 bits are attached to each cache line as an address tag. Hardware effort:

A comparator
1024 × 64 × 8 bit actual cache
1024 × 16 bit address tag
1024 × 64 bit valid tags
1024 × 64 bit dirty tags
No LRU tags

Double associative cache

31	30th	29	28	27	26th	25th	24	23	22nd	21st	20th	19th	18th	17th	16	15th	14th	13	12	11	10	9	8th	7th	6th	5	4th	3	2	1	0
Length of address tag																	cache group used									Byte position

There are 512 cache groups with two cache lines each. Each main memory data word can be stored in one of the two cache lines belonging to its address. The cache group results from bits 14 to 6 of the address. Two comparators are required which have to compare log2 (4 GiB) -log2 (64 KiB) +1 bits = 17 bits. These 17 bits are attached to each cache line as an address tag. Hardware effort:

Two comparators
1024 × 64 × 8 bit actual cache
1024 × 17 bit address tag
1024 × 64 bit valid tags
1024 × 64 bit dirty tags
1024 × 1 bit LRU tags

2 ^ n-fold associative cache

31	30th	29	28	27	26th	25th	24	23	22nd	21st	20th	19th	18th	17th	16	15th	14th	13	12	11	10	9	8th	7th	6th	5	4th	3	2	1	0
Length of address tag																		cache group used								Byte position

There are 1024/2 ^ n cache groups with 2 ^ n cache lines each. Each main memory data word can be stored in one of the 2 ^ n cache lines belonging to its address. The cache group results from bits 15- (n-1) to 6 of the address. 2 ^ n comparators are required, which have to compare log2 (4 GiB) -log2 (64 KiB) + n bits = 16+ (n-1) bits. These 16+ (n-1) bits are attached to each cache line as an address tag. Hardware effort:

2 ^ n comparators
1024 × 64 × 8 bit actual cache
1024 × (16 + n-1) bit address tag
1024 × 64 bit valid tags
1024 × 64 bit dirty tags
1024 × multiple bit LRU tags

Cache hits and misses

The process in which the data of a request to a cache is available is called a “cache hit”, the reverse is called a “cache miss”.

In order to obtain quantitative measures for the evaluation of the efficiency of a cache, two parameters are defined:

Hit rate: The number of requests for which a cache hit occurred divided by the total number of requests made to this cache. As you can easily see from the definition, this quantity is between zero and one. A hit rate of e.g. B. 0.7 (= 70%) means that in 70% of all requests to the cache, the cache was able to deliver the data immediately and had to match in 30% of all requests.
Miss rate: Similar to the hit rate, this is defined as the number of requests for which the data was not in the cache divided by the number of total requests. The following applies: Miss rate = 1 - hit rate.

There are three types of cache misses:

Capacity: The cache is too small. Data was available in the cache but was removed from it. If this address is then accessed again, this miss is referred to as a "capacity miss". The only remedy is a larger cache.
Conflict: Due to the set-associative organization (also applies to DM caches) it is possible that there is no longer enough space in one set while there are still free cache blocks in other sets. Then a block must be removed from the overcrowded record, although the cache actually still has space. If this removed block is accessed again, this cache miss is called a "conflict miss". This can be remedied by increasing the cache blocks per record - i.e. increasing the associativity. In the case of fully associative caches (which only have one record) there are no conflict misses due to the principle involved.
Compulsory: A “Compulsory Miss” or “Cold Start Miss” is the name given to the first access to an address whose data is not yet in the cache, and at the same time the cache still has free space. The difference to the other two misses is that there is no repression here, but a block is described for the first time / anew. It is difficult or impossible to prevent. Modern processors have “ prefetcher ” units that automatically load data into the caches speculatively if there is still space. This is to reduce the number of compulsory misses.

These three types are also known as "The Three Cs" for short. In multiprocessor systems, when using a cache coherence protocol of the Write-Invalidate type , a fourth "C" can be added, namely a "Coherency Miss": When a processor writes to a cache block, the same block in the cache of a second processor is thrown out If the second processor has to access an address that was covered by this remote cache block, this leads to a coherency miss.

Working method

When managing the cache, it makes sense to only keep those blocks in the cache that are frequently accessed. There are various replacement strategies for this purpose. A frequently used variant is the LRU strategy ( l east r ecently u sed ), in which the block is always exchanged that has not been accessed for the longest time. Modern processors (e.g. the AMD Athlon ) mostly implement a pseudo-LRU replacement strategy that works almost like real LRU, but is easier to implement in hardware.

Displacement strategies

FIFO (First In First Out): The oldest entry is suppressed.
LRU (Least Recently Used): The entry that has not been accessed for the longest time is suppressed.
LFU (Least Frequently Used): The least frequently read entry is suppressed. However, no complete timestamps are saved, which would require a relatively long integer number. Rather, only a few bits are used (two are common, but only one is possible) to mark a cache entry as being used more or less frequently. The bits are updated in parallel to a displacement.
Random: A random entry is suppressed.
CLOCK: Data is stored in the cache in the order in which it is accessed. If a data item is accessed, a bit is set for this cache block. In the event of a miss, the first date without a set bit is searched from front to back; this is replaced. The bit is cleared for all data passed through. It is also marked which date was last loaded into the cache. From there the search for a date begins that can be replaced.
Optimal: The Laszlo Belady process , in which the memory area is displaced which will not have been accessed for the longest, is optimal. However, it is only applicable if the complete program flow is known in advance (i.e. it is a so-called offline method, in contrast to FIFO and LRU, which are online methods). The program sequence is almost never known in advance; therefore, the best practice cannot be used in practice. However, the optimal algorithm can serve as a comparison for other methods.

Writing strategy

In principle, there are two possibilities for write access to a block that is in the cache:

Copy back (write-back): When writing, the block to be written is not immediately stored in the next higher memory level, but first in the cache. This creates an inconsistency between the cache and the memory to be cached. The latter therefore contains outdated information. Only when the word has been displaced from the cache is it also written to the next higher memory level. In addition, each cache block is given a so-called dirty bit, which indicates whether the block has to be copied back when it is replaced. This leads to problems with memory access by other processors or DMA devices because they would read outdated information. The remedy here cache coherency protocols such. B. MESI for UMA systems.
Continuous writing (write-through): The block to be written is immediately stored in the next higher memory level. This ensures consistency. So that the processor does not have to wait every time until the block is stored in the next higher memory level (which is slower than the cache), a write buffer is used. However, when this fills up, the processor has to stop and wait.

Analogous to the above, there are basically two options for write access to a block that is not in the cache:

write-allocate: As with a normal cache miss, the block is fetched from the next higher memory level. The corresponding bytes that were changed by the write access are then overwritten in the block that has just arrived.
non-write-allocate: It is written to the next higher memory level bypassing the cache without the corresponding block being loaded into the cache. This can be advantageous for some applications in which a lot of written data is never read again. Using non-write-allocate prevents other, possibly important blocks from being displaced, thus reducing the miss rate.

Some instruction sets contain instructions that enable the programmer to explicitly specify whether data to be written is to be written past the cache.

Usually either the combination write-back with write-allocate or write-through with non-write-allocate is used. The first combination has the advantage that successive write accesses to the same block (principle of locality) are completely processed in the cache (except for the first miss). This is of no advantage in the second case, since every write access to the main memory must anyway, which is why the combination write-through with write-allocate is rather unusual.

Cache flush

A cache flush ("buffer memory emptying") causes the entire contents of the cache to be written back to the background memory. The contents of the cache are mostly left untouched. Such a procedure is necessary to restore the consistency between the cache and the background memory. This is necessary, for example, whenever data from the main memory is required by external devices, for example for multiprocessor communication or when transferring part of the main memory used as an output buffer to the DMA controller.

Others

Entries in the cache

The following is stored in the cache for each cache block:

The actual data
The day (part of the address)
Several status bits, such as:

modified or dirty: Indicates whether this cache block has been changed (only for write-back cache).

Hot and cold caches

A cache is “hot” when it is working optimally, ie when it is full and has only a few cache misses; if this is not the case, the cache is considered "cold". After being put into operation, a cache is initially cold because it does not yet contain any data and often has to reload data, which is time-consuming, and then warms up increasingly because the temporarily stored data increasingly correspond to the requested data and less reloading is required. In the ideal state, data access is served almost exclusively from the cache and reloading can be neglected.

Examples

Processor cache

With CPUs, the cache can be integrated directly in the processor or placed externally on the motherboard (previously more common, today rather untypical). Often there are several levels that build on one another. Smaller levels are typically faster, but are smaller in size for cost reasons. Depending on the location of the cache, it works with different clock frequencies: The L1 (level 1, closest to the CPU) is almost always integrated directly in the processor (i.e. on the die ) and therefore works with the full processor clock - i.e. U. several gigahertz . An external cache, on the other hand, is often only clocked at a few hundred megahertz .

Current processors (e.g. AMD Ryzen , Intel Core i series , IBM Power 9 ) mainly have three cache levels: L1, L2 and L3. Common sizes for L1 caches are 4 to 256 KiB per processor core, the L2 cache is 64 KiB to 1024 KiB (usually also per core), the L3 cache 2 to 32 MiB (common for all cores). In cheaper versions, the L3 cache is sometimes left out or switched off, but the L2 cache is partially enlarged. Processor cache as an extra chip on the mainboard is no longer built as an extra die in the same chip housing (see multi-chip package ).

In any case, logging is necessary to ensure the coherence of the data (e.g. between caches and main memory). Flags are used for this purpose , which mark a memory area (typically a whole line of 64 bytes) as "dirty", i.e. changed (see above under write strategy ). The problem is exacerbated with multiple cache levels and multiple processors or processor cores.

The cache consistency must be observed both with several active devices on the data bus and with several interconnected processors ( multiprocessor systems ).

In multi-processor systems, a distinction is made between a. between SIMD and MIMD structures (Single / Multiple Instruction - Multiple Data). With MIMD systems, the probability is high that different processors will access different memory areas, with SIMD it is less. The cache configuration can then be set.

Modern processors have separate L1 caches for programs and data (read and write cache); this is also partly the case with the L2 ( Montecito ). One speaks here of a Harvard cache architecture . This has the advantage that you can use different cache designs for the different access patterns for loading program code and data. In addition, if you have separate caches, you can better place them in relation to the respective units on the processor die and thus shorten the critical paths in the processor layout. Furthermore, instructions and data can be read / written at the same time, which further reduces the Von Neumann bottleneck . One disadvantage is that self-modifying code has to be treated separately, which greatly slows its execution. However, for safety reasons and because it is often difficult to understand, difficult to test and therefore difficult to maintain, it is used very rarely today anyway.

Drive cache

In the case of hard disks, the cache is located on the control board (see hard disk cache ) or on a separate board, the hard disk controller .

The size of current hard disks - depending on the intended use of the hard disk by the manufacturer - is between 8 and 64 MiB (as of 2012).

Most optical drives have caches to absorb the access times and fluctuations in the data stream (e.g. due to synchronization problems), which are often in the three-digit millisecond range.

Software caches

Caches can also be used with software; the same principle is meant here as with hardware implementation: data is temporarily stored for faster access to a faster medium.

Examples:

Hard drive cache (managed by the operating system): Hard disk → main memory
Application data ( memoisation ): Calculation → main memory
Browser cache: Network → (hard disk / RAM)
Web server: Database → HTML file ( HTTP caching )

Software caches, which use the hard disk as a faster medium, are usually created in the form of temporary files .

Caching is also used when an operating system has certain resources - such as B. function libraries or fonts - initially left in the working memory, although they are no longer needed after their use. As long as there is no lack of memory, they can remain in the main memory in order to be immediately available without reloading from the hard disk when they are needed again. However, if the memory management of the operating system detects a lack of memory, these resources are deleted first.

Search engine cache

The search engine cache is the reading cache of a search engine. A search engine has three core components:

A web crawler searches the WWW for new or changed websites and loads them (additionally) into
the search engine cache, which is used to regularly create various indexes . Looks through these indexes
a search algorithm that is supposed to find suitable web pages according to a user query.

The content of all websites that the search engine takes into account as basic data for user queries is temporarily stored in the search engine cache. The servers of a search engine cannot search every website for the latest content in real time for every query ; instead, an index over the cache is searched for.

In general, a website operator can report changes to his website to the search engine, then the web crawler queries the page again as soon as possible; otherwise the web crawler checks every website at regular intervals - the cache contents can therefore be out of date. A web page can give the crawler an indication of how often it changes in general. Search engines sometimes deal with this information in different ways.

The most popular search engine in Germany is Google ; their cache, indexing and search strategies are therefore of particular interest. The web crawler frequency with which websites are checked is between one and four weeks for most websites ("[...] content is usually updated every 7 days "). The so-called Googlebot examines reported websites .

Web links

Wiktionary: Cache - explanations of meanings, word origins, synonyms, translations

Individual evidence

↑ Cache on duden.de
^ French dictionary. Translation: cache. (No longer available online.) Éditions Larousse , archived from the original on January 28, 2013 ; accessed on November 20, 2018 .
↑ Definition for cache. University of Oxford , accessed February 18, 2012 .
^ John Hennessy, David Patterson: Computer Architecture. A quantitative approach. 4th edition. Morgan Kaufmann Publishers, ISBN 978-0-12-370490-0 (English), pp. C-11-C-12
↑ " A brief insight into the functionality of the Google cache ( Memento from July 28, 2017 in the Internet Archive )" (with search function)

[1] Cache on duden.de

[2] French dictionary. Translation: cache. (No longer available online.) Éditions Larousse , archived from the original on January 28, 2013 ; accessed on November 20, 2018 .

[3] Definition for cache. University of Oxford , accessed February 18, 2012 .

[4] John Hennessy, David Patterson: Computer Architecture. A quantitative approach. 4th edition. Morgan Kaufmann Publishers, ISBN 978-0-12-370490-0 (English), pp. C-11-C-12

[google_cache-5] " A brief insight into the functionality of the Google cache ( Memento from July 28, 2017 in the Internet Archive )" (with search function)