Slab allocator

The slab allocator is a method of memory management used by many operating systems and also by applications. The aim of the algorithm is to use the available main memory as efficiently as possible, i.e. with little waste, when small memory areas are reserved frequently.

Note: Some details of this technical description of the slab allocator are based on the implementation in the Linux kernel 2.6.10-rc2. They may differ in other systems and may no longer be up to date.

History

The slab allocator was developed in 1994 by Jeff Bonwick for SunOS 5.4. Since he publicly documented his approach, it was adopted by many other operating systems (e.g. Linux) and applications (e.g. Perl). In 2001, Bonwick published a significantly improved version, which was also publicly documented.

In 1999 the slab allocator found its way into the Linux kernel. The current implementation already contains some elements of Bonwick's 2001 version, but does not yet implement it completely.

Idea

The aim of memory management should be to use the available memory as efficiently as possible while allowing the required objects to be reserved and released quickly. The problem of fragmentation should also not be neglected: during operation, as few small, unused memory areas as possible should arise between the used ones. For technical reasons, the memory is organized in large memory pages (on IA-32: 4096 bytes), which is not particularly efficient for dynamically requested memory.

Properties of kernel objects

Most of the objects that an operating system uses are relatively small and have a limited lifespan, so they are returned to the system after a short period of time. In addition, multiple objects of the same type are often required.

Private caches

In his paper, Bonwick distances himself from the idea of private caches, i.e. buffers that every process or kernel module of the operating system manages itself. Although a process has the best overview of its own memory requirements, it has hardly any overview of the memory situation of the system as a whole and none at all of the requirements of other processes.

Clients & Central Allocator

Bonwick therefore divides memory management into two parts. The central allocator provides functions for managing the memory and ensures that it is used as efficiently as possible. Facing it are the clients, which only describe the objects they need and do not have to worry about the details.

Structure of the central allocator

The following structure is now proposed by Bonwick:

  • Objects of the same type are grouped into slabs.
  • These slabs are organized under a cache, so that further objects of the same type are kept in stock.
  • This caching reduces time-consuming requests to the buddy memory management above it, which only supplies whole memory pages.
  • The memory pages needed for slabs and caches must in turn be fetched from the buddy system.

In other words: the slab allocator further subdivides the memory provided by the buddy system. The central allocator offers interfaces (APIs) through which the clients can request memory.
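The following sketch illustrates this division of labor from the client's point of view, using the cache interface of the Linux kernel (kmem_cache_create, kmem_cache_alloc, kmem_cache_free, kmem_cache_destroy). The object type and all names beginning with my_ are made up for the example, and the exact function signatures differ between kernel versions; this is a sketch of the usage pattern, not a verbatim excerpt.

/* Hypothetical client of the slab allocator inside the Linux kernel.
 * The client only describes its object type; the central allocator
 * takes care of slabs, caches and the buddy system. */
#include <linux/slab.h>
#include <linux/errno.h>

struct my_object {                 /* made-up object type of the client */
        int id;
        char name[32];
};

static struct kmem_cache *my_cache;

static int my_client_init(void)
{
        /* Describe the objects once; sizes, alignment and slabs are
         * handled by the allocator (signature as in newer kernels). */
        my_cache = kmem_cache_create("my_object_cache",
                                     sizeof(struct my_object),
                                     0,      /* default alignment */
                                     0,      /* no special flags */
                                     NULL);  /* no constructor */
        return my_cache ? 0 : -ENOMEM;
}

static void my_client_use(void)
{
        struct my_object *obj = kmem_cache_alloc(my_cache, GFP_KERNEL);

        if (obj) {
                obj->id = 42;
                kmem_cache_free(my_cache, obj);  /* back into its slab */
        }
}

static void my_client_exit(void)
{
        kmem_cache_destroy(my_cache);
}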

Caches

Tasks

The caches have three tasks:

  • They provide stocks of frequently used objects in an already initialized state.
  • They provide general stores of objects of fixed sizes (for objects that are needed less frequently or for individual memory areas); under Linux these are graduated in powers of two (cf. the size-N caches below).
  • They should make the best possible use of the hardware caches (TLB, L1 cache).

Structure

Each cache is represented by a data structure (under Linux of the type kmem_cache_s). This contains three doubly linked lists (within the substructure lists of the type kmem_list3) in which the slabs are lined up. One list contains the slabs that hold only unused objects ("all free"), one those whose objects are all in use ("all used"), and the third list contains slabs with both used and currently unused objects ("mixed free/used").

In addition (since the 2001 version of Bonwick's design), there is a cache (of the array_cache type) for every processor in the system that remembers the most recently released objects. These objects are handed out first on the next allocation, since the probability is very high that they can still be found in one of the processor caches, which makes better use of those caches.
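The relationship between these structures can be summarized in a strongly simplified sketch. The field and type names below are only reminiscent of the kernel's kmem_cache_s, kmem_list3 and array_cache; the real definitions contain many additional fields.

/* Strongly simplified sketch of the structures described above;
 * not the actual kernel definitions. */
#include <linux/list.h>

struct slab;                             /* one slab; its layout is described below */

struct kmem_list3_sketch {               /* the three slab lists of a cache */
        struct list_head slabs_full;     /* "all used" */
        struct list_head slabs_partial;  /* "mixed free/used" */
        struct list_head slabs_free;     /* "all free" */
};

struct array_cache_sketch {              /* per-CPU store of recently freed objects */
        unsigned int avail;              /* how many entries are currently cached */
        unsigned int limit;
        void *entries[];                 /* the most recently released objects */
};

struct kmem_cache_sketch {
        const char *name;                /* e.g. "unix_sock" */
        size_t object_size;              /* rounded up to the word size */
        struct kmem_list3_sketch lists;  /* slabs of this cache */
        struct array_cache_sketch *per_cpu[NR_CPUS];
};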

Example: Information about the caches

This concept becomes clearer with an example. Under Linux, runtime information about the slab allocator can be found in the file /proc/slabinfo. The following sample output has been heavily abridged.

name                  <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount>
raid5/md0                       256        264       364          11             1  : tunables         54
reiserfs_inode_cache           2955       5650       376          10             1  : tunables         54
scsi_cmd_cache                    2         11       352          11             1  : tunables         54
sgpool-128                       32         32      2048           2             1  : tunables         24
sgpool-64                        32         32      1024           4             1  : tunables         54
sgpool-32                        32         32       512           8             1  : tunables         54
sgpool-16                        32         45       256          15             1  : tunables        120
sgpool-8                         32         62       128          31             1  : tunables        120
clip_arp_cache                    0          0       160          25             1  : tunables        120
rpc_buffers                       8          8      2048           2             1  : tunables         24
rpc_tasks                         8         25       160          25             1  : tunables        120
rpc_inode_cache                   0          0       416           9             1  : tunables         54
unix_sock                       243        270       384          10             1  : tunables         54
size-128 (DMA)                    0          0       128          31             1  : tunables        120
size-128                       4282       4402       128          31             1  : tunables        120
size-64 (DMA)                     0          0        64          61             1  : tunables        120
size-64                         856       1403        64          61             1  : tunables        120
size-32 (DMA)                     0          0        32         119             1  : tunables        120
size-32                        2176       8211        32         119             1  : tunables        120

Meaning of the columns:

  • name: name of the cache
  • <active_objs>: Number of objects currently in use
  • <num_objs>: Number of all objects
  • <objsize>: size of an object (in bytes)
  • <objperslab>: Number of objects per slab
  • <pagesperslab>: Number of memory pages per slab
  • <batchcount>: Number of objects that are created or released at once (when creating or releasing a slab)

The first part of the table shows the caches for specific objects mentioned above; the second part shows the general-purpose caches of fixed sizes, each in a variant for DMA-capable memory and for ordinary memory.
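As an illustration of these general-purpose caches (the text above does not mention it, but in the 2.6 SLAB implementation requests made with kmalloc are served from them): the request is rounded up to the next size-N cache, and the GFP_DMA flag selects the DMA-capable variant.

#include <linux/slab.h>

/* Illustration only: a 100-byte request is rounded up to the next
 * general cache size and served from "size-128"; with GFP_DMA it is
 * served from "size-128 (DMA)" instead. */
static void kmalloc_example(void)
{
        void *buf     = kmalloc(100, GFP_KERNEL);
        void *dma_buf = kmalloc(100, GFP_KERNEL | GFP_DMA);

        kfree(buf);       /* kfree(NULL) is harmless, so no checks are needed here */
        kfree(dma_buf);
}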

Fragmentation

Fragmentation can be divided into two types: internal and external fragmentation.

Internal fragmentation

Bonwick defines internal fragmentation as "per buffers wasted space", i.e. the space that is wasted within a slab. Unused gaps arise from:

  1. Rounding up the object size to an integer multiple of the word size (sizeof(void *)) of the processor used. Thanks to aligned addresses this speeds up access on most architectures (the processor offers optimized instructions for aligned addresses), but costs memory.
  2. Waste at the end of the slab, because the objects only in the rarest cases add up exactly to its size.

The waste at the end is smaller than one object; relative to the slab size it can therefore be influenced by the size of the slab: the larger the slab, the more objects it contains and the smaller the relative waste. This off-cut is also used for so-called coloring, which is discussed below.
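The following small user-space program (with made-up numbers, in particular an assumed header size; it is not kernel code) illustrates the two sources of internal fragmentation: rounding the object size up to the word size and the off-cut at the end of the slab.

#include <stdio.h>
#include <stddef.h>

/* Round size up to the next multiple of align. */
static size_t align_up(size_t size, size_t align)
{
        return (size + align - 1) / align * align;
}

int main(void)
{
        size_t page_size   = 4096;   /* one IA-32 page per slab */
        size_t object_size = 300;    /* example object */
        size_t header      = 64;     /* assumed size of the management header */

        size_t aligned  = align_up(object_size, sizeof(void *));
        size_t usable   = page_size - header;
        size_t per_slab = usable / aligned;            /* objects per slab */
        size_t offcut   = usable - per_slab * aligned; /* waste at the end */

        printf("aligned object size: %zu bytes (was %zu)\n", aligned, object_size);
        printf("objects per slab:    %zu\n", per_slab);
        printf("off-cut at the end:  %zu bytes (reusable for coloring)\n", offcut);
        return 0;
}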

External fragmentation

Bonwick defines external fragmentation as "unused buffers in the freelist", i.e. unused objects in the slabs. As the table above already shows, external fragmentation does occur with this method, but it is relatively small.

Because objects of the same type have similar lifetimes, there are only few partially used slabs; most slabs are either fully occupied or completely free. The exception are the general-purpose caches of fixed sizes, but these are used less often.

Furthermore, the memory is not subdivided any further, as other methods (e.g. the buddy method) do. The more objects a slab contains, the higher the external fragmentation, since the probability of getting a slab completely empty again decreases.

Coloring

Coloring tries to make better use of the CPU caches by placing objects of the same type at different offsets in different slabs.

To do this, the waste at the end of the slab is used: part of it is moved in front of the first object of the slab, so that the objects are offset compared to those in another slab. By skilfully choosing offsets that are multiples of the cache line size, objects from different slabs displace each other less often from the cache, and the same applies to their addresses in the TLB.
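A minimal sketch of how such an offset could be chosen (the names and the structure are illustrative and do not correspond to the kernel's internals): every new slab of a cache gets the next "color", i.e. the next multiple of the cache line size, until the off-cut is used up, and then the colors start again from zero.

#include <stddef.h>

struct colour_state {
        size_t cache_line_size;  /* e.g. 64 bytes */
        size_t offcut;           /* unused bytes at the end of each slab */
        size_t colour_next;      /* color to use for the next slab */
};

/* Returns the offset (in bytes) at which the objects of the next slab
 * should start, and advances the color for the slab after that. */
static size_t next_colour_offset(struct colour_state *c)
{
        size_t max_colour = c->offcut / c->cache_line_size; /* offsets that fit */
        size_t offset = c->colour_next * c->cache_line_size;

        c->colour_next = (c->colour_next + 1) % (max_colour + 1);
        return offset;
}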

Sequence of a memory request

The typical sequence of a memory request follows from the above:

  1. An attempt is made to take an object from the per-CPU cache.
  2. If this fails, an object is taken from an already existing slab.
  3. If there are no free objects left in any slab, a new slab must be created, taking the detour via the higher-level buddy system.

The negative impact on the hardware caches, and thus on performance, increases from step to step.
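In code, this sequence could be summarized roughly as follows. This is simplified pseudocode: the structures and the helper functions take_object and grow_from_buddy are only declared and stand in for the slab-internal bookkeeping and the call into the buddy system; none of the names are taken from the kernel.

#include <stddef.h>

struct toy_slab { int free_objects; };

struct toy_cache {
        void  *per_cpu[8];        /* recently released objects of this CPU */
        int    per_cpu_avail;
        struct toy_slab *partial; /* a slab with free objects, if any */
};

/* Stand-ins for the slab bookkeeping and the buddy system (not defined here). */
void *take_object(struct toy_cache *c, struct toy_slab *s);
struct toy_slab *grow_from_buddy(struct toy_cache *c);

void *toy_alloc(struct toy_cache *c)
{
        /* 1. Fast path: a recently freed object from the per-CPU cache. */
        if (c->per_cpu_avail > 0)
                return c->per_cpu[--c->per_cpu_avail];

        /* 2. Otherwise an object from an existing slab with free objects. */
        if (c->partial && c->partial->free_objects > 0)
                return take_object(c, c->partial);

        /* 3. No free object anywhere: build a new slab via the buddy system. */
        return take_object(c, grow_from_buddy(c));
}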

Slabs

Structure

A slab consists of

  1. a management header, which contains information such as a pointer to the first object in the slab, the index of the first free object, the number of occupied objects, the coloring offset (color) of this slab, and links to the previous and next slab in the list of free / occupied / partially used slabs,
  2. a management structure for the free objects,
  3. the objects themselves, rounded up to an integer multiple of the processor's word size,
  4. the remaining waste, if any.

The management of free objects

In the Linux kernel, the free objects are managed using a list of integers. It contains as many numbers as the slab has objects. Each number field only has a meaning while the corresponding object is unused; it then contains the index of the next free object. Using the index of the first free object stored in the slab header, the next free object can thus be determined and returned quickly.
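This index technique can be reproduced in a few lines of self-contained C. The names are made up for the example; in the Linux kernel the corresponding index array (of type kmem_bufctl_t) is typically stored directly behind the slab header.

#include <stdio.h>

#define NUM_OBJECTS 4
#define END_MARKER  NUM_OBJECTS        /* "no further free object" */

/* Simplified model of one slab with NUM_OBJECTS objects.
 * free_index[i] only has a meaning while object i is free; it then
 * contains the index of the next free object.  first_free plays the
 * role of the "first free object" field in the slab header. */
static unsigned int free_index[NUM_OBJECTS];
static unsigned int first_free;

static void init_slab(void)
{
        for (unsigned int i = 0; i < NUM_OBJECTS; i++)
                free_index[i] = i + 1;         /* 0 -> 1 -> 2 -> 3 -> end */
        first_free = 0;
}

static int alloc_object(void)                  /* returns an object index, -1 if full */
{
        if (first_free == END_MARKER)
                return -1;
        unsigned int obj = first_free;
        first_free = free_index[obj];          /* follow the chain */
        return (int)obj;
}

static void free_object(unsigned int obj)
{
        free_index[obj] = first_free;          /* put the object back at the head */
        first_free = obj;
}

int main(void)
{
        init_slab();
        int a = alloc_object();
        int b = alloc_object();
        free_object((unsigned int)a);
        printf("allocated %d and %d, next free object is %u\n", a, b, first_free);
        return 0;
}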
