Buddy memory
To avoid the memory waste of the multi-list scheme, an empty free list allocates from the layer directly above it (twice the block size) instead of pre-allocating memory in every free list.
- Free lists
  - Multiple free lists of fixed-size memory blocks, with sizes growing in powers of 2
- Allocate
  - Find the first free list whose block size is at least the requested size
  - Take one element from the target free list
  - If the free list is empty, create a pair from the upper list
- Free
  - Find the correct free list for the block (using records)
  - Return the address to the target free list
  - If the buddy is also in the free list, free the merged block to the upper list
- Performance
  - Constant time for both allocation and free
- Figure: free lists with Size = 256, Size = 512, Size = 1024, Size = 2048, Size = 4096, …
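
The allocate and free steps above can be sketched sequentially as follows. This is a minimal single-threaded illustration, not the implementation presented in these slides: the class name BuddyAllocator, the use of offsets from a pool base as addresses, and the records dictionary are all assumptions made for the example.

```python
class BuddyAllocator:
    """Sequential sketch: one free list per power-of-two block size."""

    def __init__(self, min_size=256, max_size=4096):
        self.min_size = min_size
        self.max_size = max_size
        # One free list (a set of offsets) per block size: 256, 512, ..., 4096.
        self.free_lists = {}
        size = min_size
        while size <= max_size:
            self.free_lists[size] = set()
            size *= 2
        # Only the top layer starts with memory; lower layers fill on demand.
        self.free_lists[max_size].add(0)
        # Hypothetical "records": offset -> block size, consulted by free().
        self.records = {}

    def _take(self, size):
        """Take one block of `size`; create a pair from the upper list if empty."""
        if self.free_lists[size]:
            return self.free_lists[size].pop()
        if size == self.max_size:
            raise MemoryError("out of memory")
        base = self._take(size * 2)             # allocate from the upper layer
        self.free_lists[size].add(base + size)  # second half of the pair stays free
        return base                             # first half goes to the caller

    def alloc(self, request):
        # Find the first free list whose block size covers the request.
        size = self.min_size
        while size < request:
            size *= 2
        if size > self.max_size:
            raise MemoryError("request too large")
        offset = self._take(size)
        self.records[offset] = size
        return offset

    def free(self, offset):
        size = self.records.pop(offset)
        # Merge upward for as long as the buddy is also free.
        while size < self.max_size:
            buddy = offset ^ size
            if buddy not in self.free_lists[size]:
                break
            self.free_lists[size].remove(buddy)
            offset = min(offset, buddy)
            size *= 2
        self.free_lists[size].add(offset)
```

Allocating two 256-byte blocks splits the single 4096-byte block down through every layer; freeing both merges everything back into the top list.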

Buddy memory
- Good internal defragmentation
- The buddy address can be calculated as address XOR size
- Constant-time operation: O(h), where h = log2(max size / min size) is a constant
- Figure labels: "this" block and its "buddy"
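
A small worked example of the XOR rule and of h. The function names are hypothetical, and the sketch assumes addresses are offsets from a pool base aligned to the largest block size, since the XOR trick only holds under that alignment.

```python
def buddy_of(offset, size):
    """Buddy of the block at `offset` (an offset from the pool base) of `size` bytes."""
    return offset ^ size


def num_layers_above(min_size, max_size):
    """h = log2(max_size / min_size), for power-of-two sizes."""
    return (max_size // min_size).bit_length() - 1


# The two halves of the 1024-byte block at offset 1024 are each other's buddy.
left, right = 1024, 1536
pair = (buddy_of(left, 512), buddy_of(right, 512))

# With sizes 256 .. 4096 a request climbs at most h layers.
h = num_layers_above(256, 4096)
```

Because h depends only on the configured minimum and maximum block sizes, the worst-case climb through the layers is bounded by a constant.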

Pair creation
- If the current free list is empty, it allocates memory from the upper allocator.
- Since the upper block size is 2x, each upper allocation puts a pair of available blocks into the current free list.
- If N threads simultaneously allocate from the current layer while its free list is empty, only N/2 threads need to allocate from the upper layer.
- Figure labels: memory from upper layer; memory to current layer
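
The N/2 count can be illustrated with a sequential sketch. The function create_pairs and the upper_alloc callback are hypothetical names, and rounding up for odd N is an assumption; the slide only states N/2.

```python
def create_pairs(num_requests, size, upper_alloc):
    """Serve `num_requests` blocks of `size` from an empty free list.

    `upper_alloc` is a hypothetical callback returning the offset of one
    block of 2 * size from the upper layer.
    """
    blocks = []
    upper_calls = (num_requests + 1) // 2  # assumption: round up for odd N
    for _ in range(upper_calls):
        base = upper_alloc()
        blocks.append(base)         # first half of the pair
        blocks.append(base + size)  # second half of the pair
    served = blocks[:num_requests]
    leftover = blocks[num_requests:]  # stays in the current free list
    return served, leftover, upper_calls


def make_upper(size):
    """Toy upper layer handing out consecutive blocks of 2 * size."""
    state = {"next": 0}

    def upper_alloc():
        base = state["next"]
        state["next"] += 2 * size
        return base

    return upper_alloc
```

With four simultaneous requests for 256-byte blocks, two upper allocations suffice; with five, a third upper allocation leaves one block behind in the current free list.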

Free Queue
- The free list is implemented as a queue whose head can run past its tail.
- Figure labels: Head, Tail; slots marked under-available require pair creation from the upper layer.
- Use these states to determine which threads call pair_creation() on the upper layer.

Parallel strategy (Alloc)
- Each allocation requester creates a socket on which it listens for its address.
- The socket is implemented on the free queue: atomicAdd(&head, 1) creates a socket.
- The returned address can come from the current free list or from pair creation in the upper free list.
- Figure labels: Head, Tail, New Head; available memory in free queue; need pair creation from upper layer; threads with allocation requests to this layer
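
The slot claim can be simulated on the host as below. This is only a sketch: AtomicCounter is a lock-based stand-in for CUDA's atomicAdd (which returns the old value), the tail is treated as a fixed snapshot, and the names request_slots, available and under_available are hypothetical.

```python
import threading


class AtomicCounter:
    """Host-side stand-in for atomicAdd: add and return the previous value."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_add(self, delta):
        with self._lock:
            old = self._value
            self._value += delta
            return old

    def load(self):
        with self._lock:
            return self._value


def request_slots(head, tail_snapshot, num_threads):
    """Each thread claims one queue slot; slots at or past the tail lack memory."""
    tickets = []
    tickets_lock = threading.Lock()

    def worker():
        slot = head.fetch_add(1)  # the analogue of atomicAdd(&head, 1)
        with tickets_lock:
            tickets.append(slot)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    tickets.sort()
    available = [s for s in tickets if s < tail_snapshot]
    under_available = [s for s in tickets if s >= tail_snapshot]
    return available, under_available
```

With head at 3, tail at 5 and four requesters, slots 3 and 4 are served from the queue, while slots 5 and 6 are the under-available ones that would wait for pair creation.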

Odd/Even Pair Creation
- To avoid the overhead of simultaneous pair creation, the under-available threads perform pair creations in an odd/even loop until new tail >= new head.
- Figure labels: Head, Tail, New Head, New Tail; threads with allocation requests to this layer; pair creations

Parallel strategy (Free)
- Store the freed address in the free list.
- Calculate the buddy address: XOR(addr, size).
- Check whether the buddy is already in the free list, using the hand-shake algorithm for fast lookup.
- If yes, mark both elements in the free list as N/A, then free the merged memory block to the upper layer.

Hand shake
- The freed memory block records its index in the free list.
- The free list records the freed memory address.
- Fast check of whether the buddy memory address is in the free list:
  1. Calculate the buddy memory address (XOR).
  2. Read the index stored at this address.
  3. Check whether the address stored at this index in the free list equals the buddy memory address.
- Figure labels: memory block (records its index in the free list); free list (records the address of the memory)
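
The free path and the two-way check can be sketched sequentially as follows. The class HandshakeFreeList is hypothetical: a dictionary stands in for the index written inside each freed block, None stands in for the N/A mark, and no concurrency is modelled.

```python
class HandshakeFreeList:
    """Sequential sketch of one layer's free list with hand-shake lookup."""

    def __init__(self, size):
        self.size = size
        self.slots = []           # free list: slot index -> freed block address
        self.index_in_block = {}  # stand-in for the index stored in each block

    def contains(self, addr):
        """Hand shake: the block names a slot and the slot must name the block."""
        idx = self.index_in_block.get(addr)  # may be stale
        return (
            idx is not None
            and idx < len(self.slots)
            and self.slots[idx] == addr
        )

    def free(self, addr):
        """Return the merged block to free upward, or None if no merge happened."""
        # Store the freed address; the block records its slot index.
        self.slots.append(addr)
        my_idx = len(self.slots) - 1
        self.index_in_block[addr] = my_idx

        buddy = addr ^ self.size
        if not self.contains(buddy):
            return None

        # Mark both elements as N/A, then hand the merged block to the upper layer.
        self.slots[self.index_in_block[buddy]] = None
        self.slots[my_idx] = None
        return min(addr, buddy)
```

After a merge the indices left behind in the blocks are stale, but the check still rejects them because the slots they point to no longer hold the matching address.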