Cache coherence

Multiple caches of a resource

In multiprocessor systems with several CPU caches, cache coherence ensures that the individual caches do not return different (inconsistent) data for the same memory address.

A temporary inconsistency between main memory and the caches is permissible, provided it is detected and corrected at the latest on the next read access. Such inconsistencies arise, for example, with the write-back method, which, in contrast to the write-through method, does not update main memory immediately when the cache is written. Compare cache consistency.
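The difference between the two write methods can be illustrated with a minimal sketch. The class `Cache`, its method names, and the dictionary standing in for main memory are hypothetical and chosen only for illustration; real cache controllers are implemented in hardware and also handle eviction and block granularity.

```python
class Cache:
    """Toy cache in front of a dict that stands in for main memory (assumption)."""

    def __init__(self, memory, write_back):
        self.memory = memory
        self.write_back = write_back
        self.lines = {}      # address -> cached value
        self.dirty = set()   # addresses changed in the cache but not yet in memory

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_back:
            self.dirty.add(addr)        # write-back: memory is updated later
        else:
            self.memory[addr] = value   # write-through: memory is updated at once

    def flush(self, addr):
        """Write a dirty line back to main memory."""
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)


def demo():
    wt_memory = {0x10: 0}
    wb_memory = {0x10: 0}
    wt = Cache(wt_memory, write_back=False)
    wb = Cache(wb_memory, write_back=True)
    wt.write(0x10, 42)
    wb.write(0x10, 42)
    stale = wb_memory[0x10]   # memory still holds the old value here
    wb.flush(0x10)            # the inconsistency is corrected by the write-back
    return wt_memory[0x10], stale, wb_memory[0x10]
```

In this sketch the write-back cache leaves main memory stale until `flush` is called, which is exactly the temporary inconsistency that a coherence protocol has to detect before another processor reads the address.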

Cache Coherence Protocols

A cache coherence protocol is used to keep track of the status of a cached block of memory. There are essentially two technical bases on which such a protocol can be implemented:

Directory-based: A central list is kept with the status of all cached blocks. It records which processors currently hold a read-only copy of a block (status shared) or which processor has exclusive write access to it (status exclusive). The protocol governs the transitions between the statuses and the behavior on a read miss, a write miss, or a data write-back.
Snooping-based: Access to the central memory usually takes place via a shared medium (e.g. a bus or switch). All connected cache controllers can observe this medium and identify write or read accesses to blocks that they have cached themselves. The protocol specifies exactly how a controller reacts.
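The directory-based approach can be sketched roughly as follows. The class `Directory`, its method names, and the extra status `uncached` are assumptions made for this illustration; the sketch only tracks status and holders and does not model the data transfers themselves.

```python
class Directory:
    """Central list: for each block, its status and the processors holding it."""

    def __init__(self):
        self.entries = {}  # block -> (status, set of processor ids)

    def lookup(self, block):
        # "uncached" is an assumed status for blocks that no cache holds.
        return self.entries.get(block, ("uncached", set()))

    def read_miss(self, block, cpu):
        status, holders = self.lookup(block)
        # A previous exclusive owner is downgraded and stays on as a sharer.
        self.entries[block] = ("shared", holders | {cpu})

    def write_miss(self, block, cpu):
        status, holders = self.lookup(block)
        invalidated = holders - {cpu}   # these copies must be invalidated
        self.entries[block] = ("exclusive", {cpu})
        return invalidated

    def write_back(self, block, cpu):
        status, holders = self.lookup(block)
        if status == "exclusive" and holders == {cpu}:
            self.entries[block] = ("uncached", set())


directory = Directory()
directory.read_miss("A", 0)
directory.read_miss("A", 1)
to_invalidate = directory.write_miss("A", 2)   # processors 0 and 1 lose their copies
```

Because the directory knows the holders of each block, invalidations go only to the processors returned by `write_miss` instead of being broadcast to every controller.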

Most often, in both the directory-based and the snooping-based variant, a write-back invalidation protocol (write-invalidate protocol) is used, for example the Modified Shared Invalid protocol (MSI) or its extensions MESI and MOESI. Alternatively, there are write-back update protocols (see bus snarfing), which, however, lead to increased bus traffic.
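A snooping-based write-invalidate protocol with the three MSI states might look like the following sketch. The classes `Bus` and `CacheController` and the message kinds `"read"` and `"write"` are hypothetical; the sketch tracks only the state of each block, not its data, and omits the write-back of modified blocks to memory.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    SHARED = "S"
    INVALID = "I"


class Bus:
    """Shared medium that every cache controller observes (assumed model)."""

    def __init__(self):
        self.controllers = []

    def broadcast(self, sender, kind, addr):
        for controller in self.controllers:
            if controller is not sender:
                controller.snoop(kind, addr)


class CacheController:
    def __init__(self, bus):
        self.bus = bus
        self.state = {}  # address -> State; a missing entry means INVALID
        bus.controllers.append(self)

    def get(self, addr):
        return self.state.get(addr, State.INVALID)

    def read(self, addr):
        if self.get(addr) is State.INVALID:       # read miss
            self.bus.broadcast(self, "read", addr)
            self.state[addr] = State.SHARED

    def write(self, addr):
        if self.get(addr) is not State.MODIFIED:  # write miss or upgrade
            self.bus.broadcast(self, "write", addr)
            self.state[addr] = State.MODIFIED

    def snoop(self, kind, addr):
        current = self.get(addr)
        if current is State.INVALID:
            return                                # block not cached here
        if kind == "write":
            self.state[addr] = State.INVALID      # another cache wants to write
        elif kind == "read" and current is State.MODIFIED:
            self.state[addr] = State.SHARED       # data would be written back here
```

For example, if two controllers read the same address and one of them then writes to it, the writer ends up in `MODIFIED` and the other in `INVALID`; a later read by the second controller brings both to `SHARED`.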

The choice between the directory-based and the snooping-based approach depends, among other things, on the number of processors (cache controllers) involved. From 64 processors at the latest, directory-based protocols usually have to be used because the bandwidth of the bus does not scale sufficiently. In smaller installations, the snooping-based approach is somewhat more efficient because there is no central instance.

In multiprocessor installations with distributed memory, a separate directory is usually kept for each memory so that directory access does not become a bottleneck.
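One way to picture such a distributed directory is the sketch below. It assumes that each node owns one contiguous range of the address space and keeps the directory for exactly the blocks in its own memory; the constants and function names are hypothetical, and real systems may interleave addresses differently.

```python
NODE_MEMORY_BYTES = 1 << 20   # assumed: 1 MiB of memory per node
BLOCK_BYTES = 64              # assumed cache block size
NUM_NODES = 4                 # assumed number of nodes

# One directory per memory: block number -> set of processors holding a copy.
directories = [dict() for _ in range(NUM_NODES)]


def home_node(addr):
    """Node whose memory, and therefore whose directory, owns this address."""
    return addr // NODE_MEMORY_BYTES


def block_number(addr):
    return addr // BLOCK_BYTES


def record_sharer(addr, cpu):
    """Register cpu as a holder of the block in the block's home directory."""
    directory = directories[home_node(addr)]
    directory.setdefault(block_number(addr), set()).add(cpu)


record_sharer(0x40, 0)                           # handled by node 0
record_sharer(3 * NODE_MEMORY_BYTES + 0x80, 2)   # handled by node 3
```

Requests for different address ranges are served by different directories in this model, so no single directory has to process every miss in the system.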

References

John Hennessy, David Patterson: Computer Architecture. A Quantitative Approach. 4th Edition, Morgan Kaufmann Publishers, 2007, ISBN 978-0-12-370490-0 (English), pp. 208ff.
John Hennessy, David Patterson: Computer Architecture. A Quantitative Approach. 4th Edition, Morgan Kaufmann Publishers, 2007, ISBN 978-0-12-370490-0 (English), p. 230.
John Hennessy, David Patterson: Computer Architecture. A Quantitative Approach. 4th Edition, Morgan Kaufmann Publishers, 2007, ISBN 978-0-12-370490-0 (English), p. 231.

Literature

Thomas Rauber, Gudula Rünger: Parallel Programming. Springer Verlag, 2007, ISBN 978-3-540-46549-2.
David E. Culler: Parallel Computer Architecture. Morgan Kaufmann Publishers Inc., 1999, ISBN 1-55860-343-3.
John Hennessy, David Patterson: Computer Architecture. A Quantitative Approach. 4th Edition, Morgan Kaufmann Publishers, 2007, ISBN 978-0-12-370490-0 (English).
