Introduction to Memory Management
The Memory Management Unit (MMU) is a hardware component that translates virtual addresses to physical addresses. The MMU operates on basic units of memory called pages. The page size depends on the architecture.
A page is a unit of virtual memory sized and aligned at the page size; a page frame (or simply frame) is a page-sized, page-aligned block of physical memory.
A page table is the data structure used by a virtual memory system in an operating system to store the mapping between virtual addresses and physical addresses.
When a process accesses a region of memory that is not mapped, or the process has insufficient permissions for the requested address, the MMU generates a page fault exception.
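The translation and fault conditions described above can be sketched in userspace C. This is a toy model only: it assumes 4 KiB pages, a single-level table of 16 entries, and hypothetical permission bits (PTE_PRESENT, PTE_WRITE) and names (struct pte, translate) that are not taken from any real architecture or from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                      /* assumed 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  16u                      /* toy address space: 16 virtual pages */

/* Hypothetical permission bits for this sketch. */
enum { PTE_PRESENT = 1u << 0, PTE_WRITE = 1u << 1 };

struct pte {
    uint32_t frame;   /* physical frame number */
    uint32_t flags;   /* PTE_* bits */
};

/*
 * Translate a virtual address using a single-level page table.
 * Returns true and stores the physical address in *paddr on success;
 * returns false where real hardware would raise a page fault.
 */
bool translate(const struct pte table[], uint32_t vaddr,
               bool is_write, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;         /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);  /* offset inside the page */

    assert(table != NULL && paddr != NULL);

    if (vpn >= NUM_PAGES)
        return false;                           /* outside the toy address space */
    if (!(table[vpn].flags & PTE_PRESENT))
        return false;                           /* page is not mapped */
    if (is_write && !(table[vpn].flags & PTE_WRITE))
        return false;                           /* insufficient permissions */

    *paddr = (table[vpn].frame << PAGE_SHIFT) | offset;
    return true;
}
```

With virtual page 1 mapped read-only to frame 7, a read of address 0x1234 resolves to 0x7234, while a write to the same address, or any access to an unmapped page, is reported as a fault.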
The Linux page allocator is based on a buddy allocator, implemented in mm/page_alloc.c, and tracks the free pages of different orders.
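To make the notion of an order concrete: a block of order n covers 2^n contiguous pages, and each block has a "buddy" of the same size that it can merge with. The following userspace sketch shows the arithmetic only; the function names order_for_pages and buddy_pfn are hypothetical and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order such that a block of 2^order pages holds nr_pages. */
unsigned int order_for_pages(size_t nr_pages)
{
    unsigned int order = 0;

    assert(nr_pages > 0);
    while (((size_t)1 << order) < nr_pages)
        order++;
    return order;
}

/*
 * Page frame number of the buddy of the block that starts at pfn.
 * A block and its buddy differ only in bit number `order` of the pfn.
 */
size_t buddy_pfn(size_t pfn, unsigned int order)
{
    /* a block of a given order must start on a multiple of its size */
    assert((pfn & (((size_t)1 << order) - 1)) == 0);
    return pfn ^ ((size_t)1 << order);
}
```

For example, a request for 3 pages is rounded up to order 2 (4 pages), and the order-2 block starting at page frame 8 has its buddy at page frame 12.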
Translation Lookaside Buffer (TLB)
The translation lookaside buffer (TLB) is a memory cache that reduces the time taken to access a user memory location. It is part of the chip’s memory-management unit (MMU) and stores recent translations of virtual addresses to physical addresses, so it is also called an address-translation cache. Each TLB entry holds the virtual address, the physical address and the permissions of a mapping. If the address is in the TLB, the MMU uses the cached translation directly. If it is not, the page tables are walked to find the mapping and the TLB is refilled; the MMU generates a page fault exception and interrupts the CPU only when no valid mapping is found.
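The hit and miss paths can be illustrated with a small simulation. It is a sketch under stated assumptions: a direct-mapped TLB with 4 entries, 4 KiB pages, and a flat page table, with all names (struct mmu, mmu_translate, UNMAPPED) invented for this example. On a miss the simulated MMU walks the page table first, and a page fault is counted only when the walk finds no mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  12u          /* assumed 4 KiB pages */
#define TLB_ENTRIES 4u           /* tiny direct-mapped TLB for illustration */
#define NUM_PAGES   16u          /* size of the toy page table */
#define UNMAPPED    UINT32_MAX   /* marker for "no mapping" */

struct tlb_entry {
    bool     valid;
    uint32_t vpn;     /* virtual page number (the tag) */
    uint32_t frame;   /* cached physical frame number */
};

struct mmu {
    struct tlb_entry tlb[TLB_ENTRIES];
    uint32_t page_table[NUM_PAGES];   /* vpn -> frame, or UNMAPPED */
    unsigned hits, misses, faults;
};

void mmu_init(struct mmu *m)
{
    assert(m != NULL);
    for (uint32_t i = 0; i < TLB_ENTRIES; i++)
        m->tlb[i].valid = false;
    for (uint32_t i = 0; i < NUM_PAGES; i++)
        m->page_table[i] = UNMAPPED;
    m->hits = m->misses = m->faults = 0;
}

/* Returns true and sets *paddr on success; false models a page fault. */
bool mmu_translate(struct mmu *m, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);
    struct tlb_entry *e;

    assert(m != NULL && paddr != NULL);
    e = &m->tlb[vpn % TLB_ENTRIES];

    if (e->valid && e->vpn == vpn) {
        m->hits++;                    /* TLB hit: no table walk needed */
    } else {
        m->misses++;                  /* TLB miss: walk the page table */
        if (vpn >= NUM_PAGES || m->page_table[vpn] == UNMAPPED) {
            m->faults++;              /* no mapping found: page fault */
            return false;
        }
        e->valid = true;              /* refill the TLB entry */
        e->vpn = vpn;
        e->frame = m->page_table[vpn];
    }
    *paddr = (e->frame << PAGE_SHIFT) | offset;
    return true;
}
```

Two accesses to the same page produce one miss followed by one hit, which is the saving the TLB exists to provide.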
Non-Uniform Memory Access (NUMA)
NUMA is used on multiprocessor systems whose memory is divided into multiple memory nodes. The access time to a memory node depends on the relative locations of the accessing CPU and the accessed node, that is, on the distance between them. Memory is divided into nodes, each node is divided into blocks called zones, and each zone contains pages.
Direct memory access (DMA) is a feature of computer systems that allows certain hardware subsystems to access main system memory (random-access memory) independently of the central processing unit (CPU). Examples of hardware that uses DMA are disk drive controllers, graphics cards, network cards and sound cards. The Linux kernel divides pages into different zones:
ZONE_DMA - This zone contains pages that can undergo DMA.
ZONE_DMA32 - Like ZONE_DMA, this zone contains pages that can undergo DMA. Unlike ZONE_DMA, these pages are accessible only by 32-bit devices. On some architectures, this zone is a larger subset of memory.
ZONE_NORMAL - This zone contains normal, regularly mapped pages.
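As an illustration of how zones partition physical memory, the sketch below classifies a physical address using the boundaries commonly found on x86-64: ZONE_DMA below 16 MiB, ZONE_DMA32 below 4 GiB, and ZONE_NORMAL above that. Other architectures use different boundaries or omit some zones, and the names zone_sketch and zone_of are hypothetical, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum for this sketch; not the kernel's enum zone_type. */
enum zone_sketch { SK_ZONE_DMA, SK_ZONE_DMA32, SK_ZONE_NORMAL };

#define DMA_LIMIT   (UINT64_C(16) << 20)   /* 16 MiB, assumed x86-64 layout */
#define DMA32_LIMIT (UINT64_C(4) << 30)    /* 4 GiB, reachable with 32-bit addresses */

static_assert(DMA_LIMIT < DMA32_LIMIT, "zone limits must be ordered");

/* Return the zone a physical address would fall into under these limits. */
enum zone_sketch zone_of(uint64_t paddr)
{
    if (paddr < DMA_LIMIT)
        return SK_ZONE_DMA;
    if (paddr < DMA32_LIMIT)
        return SK_ZONE_DMA32;
    return SK_ZONE_NORMAL;
}
```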
Linux Kernel Memory Management APIs
The kmalloc() function is similar to userspace malloc(), except that it takes an additional flags parameter. kmalloc() is a simple interface for obtaining kernel memory in byte-sized chunks, and it guarantees that the memory is physically contiguous (and virtually contiguous).
Similarly, kzalloc() plays the role of userspace calloc(): it allocates memory and fills it with zeros.
p = kmalloc(sizeof(struct example), GFP_KERNEL);
The flags are broken up into three categories: action modifiers, zone modifiers, and types. Action modifiers specify how the kernel is supposed to allocate the requested memory. Zone modifiers specify from which memory zone the allocation should originate. Type flags specify the combination of action and zone modifiers required to fulfill a particular type of transaction.
Typical flag usage:
Process context which can sleep - GFP_KERNEL
Process context which cannot sleep - GFP_ATOMIC
Interrupt handler - GFP_ATOMIC
DMA-able memory which can sleep - (GFP_DMA | GFP_KERNEL)
DMA-able memory which cannot sleep - (GFP_DMA | GFP_ATOMIC)
The GFP_DMA flag specifies that the allocator must satisfy the request from ZONE_DMA. This flag is used by device drivers that need DMA-able memory for their devices. Normally, you combine it with either GFP_KERNEL or GFP_ATOMIC.
The kfree() function frees a block of memory previously allocated with kmalloc(), much like free() in userspace.
The vmalloc() function is similar to kmalloc(), except that it allocates memory that is only virtually contiguous and not necessarily physically contiguous. vmalloc() ensures only that the pages are contiguous within the virtual address space. It does this by allocating potentially noncontiguous chunks of physical memory and “fixing up” the page tables to map them into a contiguous chunk of the virtual address space.
p = vmalloc(16 * PAGE_SIZE);
and free it with vfree():
vfree(p);
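The idea of a region that is contiguous only in its virtual view can be modelled in userspace. In the sketch below each page is a separately allocated chunk, and an array of pointers stands in for the page-table entries that stitch them into one range. The names vregion, vregion_alloc, vregion_at and vregion_free, and the 4 KiB page size, are assumptions made for this example; none of this is how vmalloc() is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SK_PAGE_SIZE 4096u   /* assumed page size for this sketch */

struct vregion {
    size_t nr_pages;
    unsigned char **pages;   /* pages[i] backs virtual page i; chunks need not be adjacent */
};

/* Allocate nr_pages separate chunks; returns 0 on success, -1 on failure. */
int vregion_alloc(struct vregion *r, size_t nr_pages)
{
    assert(r != NULL && nr_pages > 0);
    r->nr_pages = 0;
    r->pages = calloc(nr_pages, sizeof(*r->pages));
    if (r->pages == NULL)
        return -1;
    for (size_t i = 0; i < nr_pages; i++) {
        r->pages[i] = calloc(1, SK_PAGE_SIZE);   /* one stand-in "frame" */
        if (r->pages[i] == NULL) {
            while (i > 0)                        /* undo partial allocation */
                free(r->pages[--i]);
            free(r->pages);
            r->pages = NULL;
            return -1;
        }
    }
    r->nr_pages = nr_pages;
    return 0;
}

/* Resolve an offset in the contiguous view to the byte that backs it. */
unsigned char *vregion_at(struct vregion *r, size_t offset)
{
    assert(r != NULL && offset < r->nr_pages * SK_PAGE_SIZE);
    return &r->pages[offset / SK_PAGE_SIZE][offset % SK_PAGE_SIZE];
}

void vregion_free(struct vregion *r)
{
    assert(r != NULL);
    for (size_t i = 0; i < r->nr_pages; i++)
        free(r->pages[i]);
    free(r->pages);
    r->pages = NULL;
    r->nr_pages = 0;
}
```

Offsets 4095 and 4096 are neighbours in the contiguous view but land in two unrelated chunks, which is the property that distinguishes this kind of allocation from a physically contiguous one.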
The slab allocator is an abstraction layer that makes it easier to allocate numerous objects of the same type. The basic idea is to keep caches of commonly used objects in an initialized state, available for use by the kernel. The slab layer acts as a generic data structure-caching layer: freed objects are kept on a free list rather than released, so the free list acts as an object cache for a frequently used type of object. The Linux kernel has provided three slab allocator implementations: SLAB, SLUB, and SLOB.
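The object-cache idea can be sketched in userspace as a small pool that keeps freed objects for reuse and runs a constructor only when a new object is first created. This is a simplified model with a fixed-size free list; the names obj_cache, cache_alloc, cache_free, cache_destroy and example_ctor are hypothetical, and real slab allocators manage whole pages of objects rather than individual malloc() calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_CAP 8u   /* at most this many free objects are kept for reuse */

struct obj_cache {
    size_t obj_size;               /* size of each object */
    void (*ctor)(void *);          /* optional constructor, may be NULL */
    void *free_objs[CACHE_CAP];    /* already constructed objects ready for reuse */
    size_t nfree;
};

/* Example object type and constructor used to exercise the cache. */
struct example { int refcount; };
void example_ctor(void *p) { ((struct example *)p)->refcount = 0; }

void cache_init(struct obj_cache *c, size_t obj_size, void (*ctor)(void *))
{
    assert(c != NULL && obj_size > 0);
    c->obj_size = obj_size;
    c->ctor = ctor;
    c->nfree = 0;
}

void *cache_alloc(struct obj_cache *c)
{
    void *obj;

    assert(c != NULL);
    if (c->nfree > 0)
        return c->free_objs[--c->nfree];   /* reuse: constructor is not run again */
    obj = malloc(c->obj_size);
    if (obj != NULL && c->ctor != NULL)
        c->ctor(obj);                      /* construct only brand-new objects */
    return obj;
}

/* The caller is expected to return the object in its constructed state. */
void cache_free(struct obj_cache *c, void *obj)
{
    assert(c != NULL);
    if (obj == NULL)
        return;
    if (c->nfree < CACHE_CAP)
        c->free_objs[c->nfree++] = obj;    /* keep it for the next allocation */
    else
        free(obj);                         /* free list is full: release it */
}

void cache_destroy(struct obj_cache *c)
{
    assert(c != NULL);
    while (c->nfree > 0)
        free(c->free_objs[--c->nfree]);
}
```

Allocating, freeing, and allocating again hands back the same object without calling malloc() or the constructor a second time, which is where the saving comes from for frequently used object types.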
kmem_cache_create() creates a cache from arguments such as name, size, align, flags and ctor (a constructor function called when new objects are added to the cache).
The following flags can be used with kmem_cache_create():
SLAB_NO_REAP - prevent the slab layer from automatically reaping objects in the cache when memory is low
SLAB_HWCACHE_ALIGN - align each object within a slab to a cache line
SLAB_MUST_HWCACHE_ALIGN - force cache-line alignment even when slab debugging is enabled
SLAB_POISON - fill memory with a known value (0xa5a5a5a5), useful for catching access to uninitialized memory
SLAB_RED_ZONE - insert “red zones” around the allocated memory to detect buffer overruns
SLAB_PANIC - panic() if cache creation fails
SLAB_CACHE_DMA - memory allocated from the slab must come from ZONE_DMA
Example of kmem_cache_create():
struct kmem_cache *kmem_cache_create(const char *name, unsigned int size, unsigned int align, slab_flags_t flags, void (*ctor)(void *));
p_cache = kmem_cache_create("example_cache", sizeof(struct example), 0, SLAB_HWCACHE_ALIGN, NULL);
Once a cache of objects is created, you can allocate objects from it by calling kmem_cache_alloc():
void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags);
p = kmem_cache_alloc(p_cache, GFP_KERNEL);
To deallocate an object, call kmem_cache_free():
void kmem_cache_free(struct kmem_cache *cachep, void *obj);
kmem_cache_free(p_cache, p);
Finally, at module unload time, we have to return the cache to the system:
void kmem_cache_destroy(struct kmem_cache *s);
kmem_cache_destroy(p_cache);
Use ksize() to find the actual amount of memory allocated for an object:
size_t ksize(const void *objp);
You can also view kernel slab cache information in real time with slabtop.