Deeper than deep: buddy system

How I wrote before, on low level we're need a simple and fast allocator with minimal external fragmentation. It was buddy system. I've used a binary buddy in case of simple and clean code and perfomance.
But we're need to store a free buddies list, but we haven't any allocators lower and we're cannot allocate something for it, and we're cannot create linked list of pointers to the blocks and chunks - and it's not so clever - the really big overdraft on it.
Theory:
I've decide that I can use a bitmap in this case, for grow up speed I will use a bitmap for each buddy list level in this case for first level we're need just 1 bit per attribute, for second 2 bits per attribute ... and so on. Also we're need to make a simple initialization and keep information about separating. In this case we're using 2 bits per buddy block, I've called them - the first one is negative bitmap, the second one is positive bitmap.
What are they means? The first bit attribute (negative) points that if the bit is set it's a free block, if not it's used (or unmapped) block. Second bit attribute (positive) points that if the bit set it's not separated block, otherwise is. In this model we're simply full negative bitmap with 0x0and positive bitmap with 0xffffffff, and how you can understand we will use unsigned integer 32bit sized ones for bitmap.
Make a simple calculations and you will get a good news - for map buddy with 16-parts you need 64bits or 8 bytes, isn't pretty?
Implementation:
So, the first I will declare the general structure that will describe buddy system:


typedef struct __bbuddy_type {
  uint32_t *pbmp;
  uint32_t *nbmp;
  uint32_t pn;
} bbuddy_t;

How you can see there are two pointers and what the neck ?
So, I'll explain it - on low level you need to take a pointer (determining the size you need) and use it for your first buddy - it's simply than something else. Also, I store buddy_t first and after it I'm storing uint32_t data.
Also we're need an macro that will calculate the overall size to store the buddy system, let's take it:


#define bbuddy_size(n)  (((((n/32)==0) ? 1 : ((n/32>1) ? (n/32)+2 : n/32+1))*(2*sizeof(uint32_t)))+\
                         sizeof(bbuddy_t))

Explaining it, for parts that bigger than 1/32 of buddy mapped area we're need just one uint32_t , for 1/32 yet another one, and for smaller parts working a general rule to calculate size. Like in squares roots calculation ;)
Also you will need for several useful macros that will calculates index of pointer and bit position within bitmap area:


#define bbuddy_indexp(p,n)  ((p/32 > 1) ? ((p/32)+(n/32)) : p/32)
#define bbuddy_indexbn(n)   ((n/32>0) ? n%32 : n)

The first macro will calculate index to choose correct uint32_t data, and second one to choose a bit position, be careful using this macros as is not a good practice, I will show examples of using its.
In the implementation explanation I don't want to show you all the source code, it's a simple - you can implement it yourself without any kind of problems.
But there are some useful code snippets I'll show.
The first point on implementation is initialization of buddy system, assign a correct pointers for bitmaps, init first block free. Here be careful with pointers shifting deals, I will show you the simple one, that works correctly, but don't forgot that C operator '+' on pointers shifts pointers with the size of pointer type.
I've used char* - it's a clean and transparent to understand:


uint8_t bbuddy_init(bbuddy_t *b,uint32_t max_part)
{
  int yy=(max_part/32)*2,i;
  char *ptr=(char*)b;

  if(b && max_part) {
    ptr+=sizeof(bbuddy_t);
    b->pbmp=(uint32_t*)ptr;
    ptr+=(sizeof(uint32_t)*((max_part/32 > 1) ? (max_part/32)+2 : (max_part/32)+1));
    b->nbmp=(uint32_t*)ptr;
  } else
    return 1;

  b->pn=max_part;

  max_part/=32;

  for(i=0;i<yy;i++) {
    b->nbmp[i]=nil;
    b->pbmp[i]=fil;
  }

  /* init first */
  b->nbmp[0] |= (1 << 0);
  b->nbmp[0] |= (1 << 1);

  return 0;
}

NOTE: nil it's a macro defined to 0x0 and fil is a macro defined to 0xffffffff.
Here we go ... we're a correctly init buddy system structure.
Now, I want to wrote about allocations/splitting/freeing specifics of this implementation.
I've implement an allocation function with mind that I've giving a part number of buddy system and it's trying to return me a number of avialable and allocated part. Look below:


static uint32_t __bbuddy_block_alloc(bbuddy_t *b,uint32_t align,uint8_t m_flag)
{
  uint32_t p_indx=align/32;
  uint32_t n_indx=0,i=0,o=0;

  if(p_indx) {
    for(i=0;i<p_indx;i++) {
      if(b->nbmp[p_indx+i]!=0x0)
        goto __is_free_long;
    }
    m_flag++;
    return __bbuddy_block_alloc(b,align/2,m_flag);
  }

  if(!p_indx) { /* checking small layers, big blocks */
    o=align*2;
    for(i=align;i<o;i++) 
      if((b->nbmp[p_indx] & (1 << i)) && !m_flag) {
        b->nbmp[p_indx] &= ~(1 << i); /* used */
        b->pbmp[p_indx] |= (1 << i); /* not separated */
        return i-align;
      } else if((b->nbmp[p_indx] & (1 << i)) && m_flag) {
        b->nbmp[p_indx] &= ~(1 << i); /* used */
        b->pbmp[p_indx] &= ~(1 << i); /* separated */
        /* mark childs free */
        align*=2;         i*=2;
        p_indx=align/32;
        n_indx=bbuddy_indexbn(i);
        b->nbmp[p_indx] |= (1 << n_indx);
        b->nbmp[p_indx] |= (1 << (n_indx+1));
        b->pbmp[p_indx] |= (1 << n_indx);
        b->pbmp[p_indx] |= (1 << (n_indx+1));

        m_flag--;
        return __bbuddy_block_alloc(b,align,m_flag);
      }
    if(i<2)
      return ENOBLOCK;
    else {
      m_flag++;
      return __bbuddy_block_alloc(b,align/2,m_flag);
    }
  }

 __is_free_long: 
  while(o<32) {
    if((b->nbmp[p_indx+i] & (1 << o)) && !m_flag) { /* yep, found */
      b->nbmp[p_indx+i] &= ~(1 << o); /* used */
      b->pbmp[p_indx+i] |= (1 << o); /* not separated */
      return (p_indx > 1) ? (((p_indx+i)-2)*32)+o : o;
    }
    else if((b->nbmp[p_indx+i] & (1 << o)) && m_flag) { /* make sep */
      b->nbmp[p_indx+i] &= ~(1 << o); /* used */
      b->pbmp[p_indx+i] &= ~(1 << o); /* separated */
      /* mark childs free */
      align*=2;        p_indx=align/32; o*=2;
      p_indx=bbuddy_indexp(align,o);
      n_indx=bbuddy_indexbn(o);
      b->nbmp[p_indx] |= (1 << n_indx);
      b->nbmp[p_indx] |= (1 << (n_indx+1));
      b->pbmp[p_indx] |= (1 << n_indx);
      b->pbmp[p_indx] |= (1 << (n_indx+1));

      m_flag--;
      return __bbuddy_block_alloc(b,align,m_flag);
    }
    o++;
  }


  return 0;
}

How you can see there are difference between 1/16 and bigger parts - I've showed it, it's separated via source code implementation in case of making additional checking and different calculation in both situations, you can split there parts into one to make code lines counter smaller , but it doesn't give any perfomance effect (tested).
Freeing buddy block must be assigned with splitting buddy into bigger block if possible, I've implement this via recursive separate function to make a lightweight to read release function, also I will point to indexes calculations here.
And I want you to understand, I'm giving to release function number of smaller blocks like offset to release, after those I'm calculating possible variants of blocks can be and if found I'm releasing it, and only after it I'm calls my splitter function. Also, I don't include additonal checking code, make it yourself ;), look below:


static uint32_t __bbuddy_block_release(bbuddy_t *b,uint32_t num,uint32_t i)
{
  uint32_t p_indx,n_indx;
  uint32_t layer=b->pn,ls=layer,a=0;

  /* first look up on higher possible layer
   * if there are now, look deeper
   */
  if(num>b->pn)
    return EINVALIDINDX; /* error encount */

  while(ls) {    ls/=2; a++;  } 
  if(i==0) {
    for(i=(a-1);i>0;i--) {
      if(num%2) break;
      else num/=2;
    }
  } 

  layer=(1 << i);

  n_indx=(layer<32) ? (num+layer) : bbuddy_indexbn(num);
  p_indx=bbuddy_indexp(layer,num);
  if(!(b->nbmp[p_indx] & (1 << n_indx)) && (b->pbmp[p_indx] & (1 << n_indx))) { 
    b->nbmp[p_indx] |= (1 << n_indx); /* mark free */
    return __bbuddy_split_up(b,layer,num);
  } else if(!(b->nbmp[p_indx] & (1 << n_indx)) && !(b->pbmp[p_indx] & (1 << n_indx))) { /* going deep */
    i++;
    if(i>=a)
      return EBUDDYCORRUPED;
    else
      return __bbuddy_block_release(b,num*2,i);

  }

  return 0;
}

In case of my initialization and implementation of bitmap, I'm starting to look up for the block from the biggest blocks, because in othercase there are will be errors on cleanly initied bitmap and buddy will corruped.
Also, I don't want to describe my splitter function and I will not show it's source code - it's a trivial too like others.
Be patient, on my code (I wrote it and tested within 1-2 hours) some checks is absent, and some code I've specially removed - take your brains on working way ;)
Like API functions I've made a wrapper to hide all recursion in static functions, and with assign of initial counters values.
On next post I will try to explain how we're can use this abstraction to allocate real pointers, or you can think yourself and assume this method - it's a too trivial ;)

As I intended to memory managers implementing, I decide to write here about several techniques about this.
I don't interested on allocators in uspace now. I want to describe some tips and tricks on memory management while you haven't anything - just raw gdt/tss/ so on and raw addresses.
On low level of memory management and allocation system we're need to have a simple and fast memory allocator, it must avoid external fragmentation as strong as it possible.
There are one good allocator that doesn't sick with big external fragmentation - buddy system. You can find many it's types, but I'd like binary type - it's a very simple in calculation (that really needs).
Some words about buddy system... Buddy system is a such system that based on buddy relations for splitting memory areas. This system has several limitation - you must know deep of separation, I mean how much blocks maximally can contents buddy itself, and overally it can maps fixed memory area size (but it can be depends on implementation). The goal of this technique is to find free block aligned to requested size, when you are looking for some block size - you will separate block with bigger size and so on, while you didn't finf more situable block size. Look for illustrations:
buddy system explanation

Fig. 1
The first bar shows clean buddy system without any allocated block (4096 bytes). When we're requesting to allocate block with size 512 byte we're separate zone to two blocks, and one of this zone separate we're separate too and use one of them - like it shown on the second bar. If we're request to allocate three more 512-sized block we will got the picture displayed on third bar.
It's a good and relativetly fast methodics, but has many restrictions - for example assume that your buddy system has 16byte-sized minimal blocks - and you request 10 bytes - so, you will get a wasted space - and so on - it's an inernal fragmentation. Be sure - you can align block sizes to other numbers , like you want and depend on your target, but anyway it will not safe to use buddy for all allocations.
Another point is a free/busy/separated blocks list storing - on low level you cannot use binary trees in case that they need to have some memory allocation technique already, usually for this used simple pointers lists, but it's a worse practice.
On the next post about memory management I will describe my solution of this problem with some useful code snippets.

Deeper than deep

Pages

Wednesday, July 02, 2008

Memory management - buddy system trick.

Monday, June 30, 2008

Memory management - low allocator base - binary buddy.

Who is here ?

Blog Archive

Active projects