Wednesday, July 02, 2008

Memory management - buddy system trick.

How I wrote before, on low level we're need a simple and fast allocator with minimal external fragmentation. It was buddy system. I've used a binary buddy in case of simple and clean code and perfomance.
But we're need to store a free buddies list, but we haven't any allocators lower and we're cannot allocate something for it, and we're cannot create linked list of pointers to the blocks and chunks - and it's not so clever - the really big overdraft on it.
Theory:
I've decide that I can use a bitmap in this case, for grow up speed I will use a bitmap for each buddy list level in this case for first level we're need just 1 bit per attribute, for second 2 bits per attribute ... and so on. Also we're need to make a simple initialization and keep information about separating. In this case we're using 2 bits per buddy block, I've called them - the first one is negative bitmap, the second one is positive bitmap.
What are they means? The first bit attribute (negative) points that if the bit is set it's a free block, if not it's used (or unmapped) block. Second bit attribute (positive) points that if the bit set it's not separated block, otherwise is. In this model we're simply full negative bitmap with 0x0and positive bitmap with 0xffffffff, and how you can understand we will use unsigned integer 32bit sized ones for bitmap.
Make a simple calculations and you will get a good news - for map buddy with 16-parts you need 64bits or 8 bytes, isn't pretty?
Implementation:
So, the first I will declare the general structure that will describe buddy system:

typedef struct __bbuddy_type {
uint32_t *pbmp;
uint32_t *nbmp;
uint32_t pn;
} bbuddy_t;

How you can see there are two pointers and what the neck ?
So, I'll explain it - on low level you need to take a pointer (determining the size you need) and use it for your first buddy - it's simply than something else. Also, I store buddy_t first and after it I'm storing uint32_t data.
Also we're need an macro that will calculate the overall size to store the buddy system, let's take it:

#define bbuddy_size(n) (((((n/32)==0) ? 1 : ((n/32>1) ? (n/32)+2 : n/32+1))*(2*sizeof(uint32_t)))+\
sizeof(bbuddy_t))


Explaining it, for parts that bigger than 1/32 of buddy mapped area we're need just one uint32_t , for 1/32 yet another one, and for smaller parts working a general rule to calculate size. Like in squares roots calculation ;)
Also you will need for several useful macros that will calculates index of pointer and bit position within bitmap area:

#define bbuddy_indexp(p,n) ((p/32 > 1) ? ((p/32)+(n/32)) : p/32)
#define bbuddy_indexbn(n) ((n/32>0) ? n%32 : n)

The first macro will calculate index to choose correct uint32_t data, and second one to choose a bit position, be careful using this macros as is not a good practice, I will show examples of using its.
In the implementation explanation I don't want to show you all the source code, it's a simple - you can implement it yourself without any kind of problems.
But there are some useful code snippets I'll show. 
The first point on implementation is initialization of buddy system, assign a correct pointers for bitmaps, init first block free. Here be careful with pointers shifting deals, I will show you the simple one, that works correctly, but don't forgot that C operator '+' on pointers shifts pointers with the size of  pointer type.
I've used char* - it's a clean and transparent to understand:

uint8_t bbuddy_init(bbuddy_t *b,uint32_t max_part)
{
int yy=(max_part/32)*2,i;
char *ptr=(char*)b;

if(b && max_part) {
ptr+=sizeof(bbuddy_t);
b->pbmp=(uint32_t*)ptr;
ptr+=(sizeof(uint32_t)*((max_part/32 > 1) ? (max_part/32)+2 : (max_part/32)+1));
b->nbmp=(uint32_t*)ptr;
} else
return 1;

b->pn=max_part;

max_part/=32;

for(i=0;i<yy;i++) {
b->nbmp[i]=nil;
b->pbmp[i]=fil;
}

/* init first */
b->nbmp[0] |= (1 << 0);
b->nbmp[0] |= (1 << 1);

return 0;
}

NOTE: nil it's a macro defined to 0x0 and fil is a macro defined to 0xffffffff.
Here we go ... we're a correctly init buddy system structure.
Now, I want to wrote about allocations/splitting/freeing specifics of this implementation.
I've implement an allocation function with mind that I've giving a part number of buddy system and it's trying to return me a number of avialable and allocated part. Look below:

static uint32_t __bbuddy_block_alloc(bbuddy_t *b,uint32_t align,uint8_t m_flag)
{
uint32_t p_indx=align/32;
uint32_t n_indx=0,i=0,o=0;

if(p_indx) {
for(i=0;i<p_indx;i++) {
if(b->nbmp[p_indx+i]!=0x0)
goto __is_free_long;
}
m_flag++;
return __bbuddy_block_alloc(b,align/2,m_flag);
}

if(!p_indx) { /* checking small layers, big blocks */
o=align*2;
for(i=align;i<o;i++)
if((b->nbmp[p_indx] & (1 << i)) && !m_flag) {
b->nbmp[p_indx] &= ~(1 << i); /* used */
b->pbmp[p_indx] |= (1 << i); /* not separated */
return i-align;
} else if((b->nbmp[p_indx] & (1 << i)) && m_flag) {
b->nbmp[p_indx] &= ~(1 << i); /* used */
b->pbmp[p_indx] &= ~(1 << i); /* separated */
/* mark childs free */
align*=2; i*=2;
p_indx=align/32;
n_indx=bbuddy_indexbn(i);
b->nbmp[p_indx] |= (1 << n_indx);
b->nbmp[p_indx] |= (1 << (n_indx+1));
b->pbmp[p_indx] |= (1 << n_indx);
b->pbmp[p_indx] |= (1 << (n_indx+1));

m_flag--;
return __bbuddy_block_alloc(b,align,m_flag);
}
if(i<2)
return ENOBLOCK;
else {
m_flag++;
return __bbuddy_block_alloc(b,align/2,m_flag);
}
}

__is_free_long:
while(o<32) {
if((b->nbmp[p_indx+i] & (1 << o)) && !m_flag) { /* yep, found */
b->nbmp[p_indx+i] &= ~(1 << o); /* used */
b->pbmp[p_indx+i] |= (1 << o); /* not separated */
return (p_indx > 1) ? (((p_indx+i)-2)*32)+o : o;
}
else if((b->nbmp[p_indx+i] & (1 << o)) && m_flag) { /* make sep */
b->nbmp[p_indx+i] &= ~(1 << o); /* used */
b->pbmp[p_indx+i] &= ~(1 << o); /* separated */
/* mark childs free */
align*=2; p_indx=align/32; o*=2;
p_indx=bbuddy_indexp(align,o);
n_indx=bbuddy_indexbn(o);
b->nbmp[p_indx] |= (1 << n_indx);
b->nbmp[p_indx] |= (1 << (n_indx+1));
b->pbmp[p_indx] |= (1 << n_indx);
b->pbmp[p_indx] |= (1 << (n_indx+1));

m_flag--;
return __bbuddy_block_alloc(b,align,m_flag);
}
o++;
}


return 0;
}

How you can see there are difference between 1/16 and bigger parts - I've showed it, it's separated via source code implementation in case of making additional checking and different calculation in both situations, you can split there parts into one to make code lines counter smaller , but it doesn't give any perfomance effect (tested).
Freeing buddy block must be assigned with splitting buddy into bigger block if possible, I've implement this via recursive separate function to make a lightweight to read release function, also I will point to indexes calculations here. 
And I want you to understand, I'm giving to release function number of smaller blocks like offset to release, after those I'm calculating possible variants of blocks can be and if found I'm releasing it, and only after it I'm calls my splitter function. Also, I don't include additonal checking code, make it yourself ;), look below:

static uint32_t __bbuddy_block_release(bbuddy_t *b,uint32_t num,uint32_t i)
{
uint32_t p_indx,n_indx;
uint32_t layer=b->pn,ls=layer,a=0;

/* first look up on higher possible layer
* if there are now, look deeper
*/

if(num>b->pn)
return EINVALIDINDX; /* error encount */

while(ls) { ls/=2; a++; }
if(i==0) {
for(i=(a-1);i>0;i--) {
if(num%2) break;
else num/=2;
}
}

layer=(1 << i);

n_indx=(layer<32) ? (num+layer) : bbuddy_indexbn(num);
p_indx=bbuddy_indexp(layer,num);
if(!(b->nbmp[p_indx] & (1 << n_indx)) && (b->pbmp[p_indx] & (1 << n_indx))) {
b->nbmp[p_indx] |= (1 << n_indx); /* mark free */
return __bbuddy_split_up(b,layer,num);
} else if(!(b->nbmp[p_indx] & (1 << n_indx)) && !(b->pbmp[p_indx] & (1 << n_indx))) { /* going deep */
i++;
if(i>=a)
return EBUDDYCORRUPED;
else
return __bbuddy_block_release(b,num*2,i);

}

return 0;
}

In case of my initialization and implementation of bitmap, I'm starting to look up for the block from the biggest blocks, because in othercase there are will be errors on cleanly initied bitmap and buddy will corruped.
Also, I don't want to describe my splitter function and I will not show it's source code - it's a trivial too like others.
Be patient, on my code (I wrote it and tested within 1-2 hours) some checks is absent, and some code I've specially removed - take your brains on working way ;)
Like API functions I've made a wrapper to hide all recursion in static functions, and with assign of initial counters values.
On next post I will try to explain how we're can use this abstraction to allocate real pointers, or you can think yourself and assume this method - it's a too trivial ;)

2 comments:

Unknown said...

Have you tested this?
Also, what is the copyright for the source code?

p.s. You should never use "goto"

Supervisor said...

it's public domain. yep, it works.
goto is ok it's better than copy-paste.