Memory Allocation

Dynamic memory allocation

All term long we have been using the malloc() and free() library functions to do dynamic memory allocation.

#include <stdlib.h>

void* malloc(size_t size); // Returns NULL on error
void free(void* ptr);

When we use the malloc() function the C standard library allocates a block of memory of the desired size on the heap and returns a pointer to that block of memory. In these notes we are going to take a look behind the scenes at what a dynamic memory allocation scheme would need to do to successfully replicate the behavior of malloc().

The illustrations below, along with the sample code we will be looking at in these notes, all come from the book Computer Systems: A Programmer's Perspective, Third Edition by Bryant and O'Hallaron.

Memory architecture for a process on Linux

The diagram here illustrates how the memory in a typical Linux process is organized. The heap appears in the bottom of the diagram, and grows upward as the heap expands. The operating system maintains a global pointer, brk, that points just beyond the top of the heap. A program can expand the heap by calling the sbrk() system call:

#include <unistd.h>

void* sbrk(intptr_t incr);

The sbrk() function attempts to expand the heap upward by the number of bytes requested. If the request is successful, the brk pointer will have been moved upward by the requested number of bytes, and sbrk() will return the previous value of the brk pointer. In effect, this function acts just like malloc(), in that it is a way to allocate a block of memory on the heap and then get back a pointer to that block of memory.

Dynamic memory allocation

Most C programs use malloc() and free() to allocate and deallocate memory on the heap.

#include <stdlib.h>

void* malloc(size_t size); // Returns NULL on error
void* calloc(size_t num,size_t size); // Initializes to 0
void* realloc(void* ptr,size_t size);
void free(void* ptr);

The calloc() function both allocates a block of memory and also initializes its contents to be all 0s. The realloc() function is used to resize a previously allocated block of memory. For example, if you previously had used malloc() to allocate space for an array, you can resize the array by calling realloc(). You pass the pointer to the array along with the new size of the array in bytes to realloc(). The function will then try to allocate space for a new array with the requested size, copy the contents of the old array over to the new array, and then return a pointer to the new array.

Designing a memory allocator

The memory allocator in the C standard library is a very good, general purpose memory allocation system that serves the needs of most programs very well.

In certain special cases we may be able to replace the standard library's memory allocation system with our own custom memory allocation system. The rest of these notes will describe how we could design our own custom memory allocation system, and how we could tweak its behavior to potentially outperform the standard library's system in certain special cases.

The first idea we would use in setting up our own custom memory allocation system would be to start by using the sbrk() system call to carve out an initial large block of memory on the heap. Our memory allocator would then go to work handing out pieces of that large block on demand whenever our program needed to allocate some memory. Our memory allocation system would work by breaking the initial large block into subblocks, where each subblock would have a structure something like the following:

At the beginning and the end of each block we would have a header and a footer. The header would be a sequence of 32 bits, with the first 29 bits storing the size of the block and the last three bits reserved for flags. The last of the three bits would serve as a flag to indicate whether or not the given block is currently allocated or free. The footer would be an exact copy of the header.

The block size information would make it possible for us to set up an implicit list of blocks. By examining the block size value in the header of the first block, we would be able to compute the location of the start of the next block in the sequence. At the end of the sequence of blocks we would place a special block whose size value is 0 to act as an end marker.

The next picture illustrates what a typical allocation request would do. Suppose we asked to allocate a block of 8 bytes. The allocator would walk down the block list until it encountered an unallocated block that was large enough to accomodate the request. In the picture above the third block in the sequence is an unallocated block of size 32 bytes. We could then split that unallocated block into two blocks of size 16: a new allocated block containing the requested 8 bytes along with a 4 byte header and a 4 byte footer, and a new unallocated block of size 16.

When a client of our allocation system sends us a request to free a block of memory we had previously allocated, they will start by giving us a pointer to the payload area in an allocated block. If we go 4 bytes back from that point we will have a pointer to the header at the start of the allocated block. Once we have a pointer to the header, we can figure out both where the block before this block ends and where the block after this block begins. By examining both of the neighboring blocks we could potentially coalesce some free blocks together to make a much large free block.

The pictures below illustrate various scenarios in which it may or may not be possible to do some coalescing after freeing a block.

The code in the archive I have linked to at the top of these notes implements a custom memory allocator based on the ideas I discussed above.

Explicit free lists

Once we have implemented a basic memory allocation system we can go to work on trying to improve its performance. There are a great many ideas that we could implement to increase the performance of the system in general and/or optimize it to handle certain special cases.

Here is one such idea. An obvious problem with the system described above is that when an allocation request comes in we have to jump from block to block in the implicit block list to locate a block that is unallocated and is also big enough to meet our request. A part of this search that is wasteful involves jumping over allocated blocks until we come to an unallocated block. One way to optimize this search would be to implement a scheme that organizes the unallocated blocks in their own separate list, called an explicit free list. We can do this by taking advantage of the fact that unallocated blocks have plenty of free space available that currently is not being used. We can take advantage of that unused space to store useful information, such a pair of pointers that link the unallocated blocks together into a doubly linked list. All we would need to add to our system them would be a global pointer that points to the first unallocated block in the system.

When a request to allocate memory comes in, we could jump directly to the first unallocated block and then walk down the list of unallocated blocks until we found a block large enough to satisfy the request. We could then do a small number of pointer updates to unhook the newly allocated block from the explict free list.

Alternative strategies include segregated free lists, segregated storage, and buddy lists.

Homework assignment

Download the project folder by clicking the button at the top of these notes. In that project folder you will find the code for the memory allocation system described above. This version of the memory allocator does a search through all blocks to find a free block in the find_fit() function in the file mm.c. In this exercise you will be modifying the memory allocator to use the explicit free list strategy described in the section above. You will be modifying the find_fit() function and some other functions in mm.c to do this.

Once you have updated the memory manager you will also need to write a test program that allocates and frees a bunch of blocks of different sizes using the memory manager.

The mmap function

#include <unistd.h>
#include <sys/mman.h>

void* mmap(void* start,size_t length,int prot,int flags,int fd,off_t offset); // Returns pointer on success, -1 on error

int munmap(void* start,size_t length);

prot can be any combination of

Value	Interpretation
PROT_EXEC	Pages contain executable code
PROT_READ	Pages may be read
PROT_WRITE	Pages may be written
PROT_NONE	Pages may not be accessed

flags can be

Value	Interpretation
MAP_ANON	Backing store is anonymous file, pages are demand-zero
MAP_PRIVATE	Private, copy on write object
MAP_SHARED	Shared object