01 - Segmentation

Submitted by Webbot on September 17, 2008 - 10:48pm.

Before we go any further it will be useful to understand the make-up of an executable program and how that maps to the hardware. This description is fairly generic, so it won't get into the ins-and-outs of .hex, .obj, .exe, .com files etc. but attempts to explain the raw principles.

 

An executable program file is normally made up of the following items or ‘segments’:-

  1. Your code. This is machine code produced from your compiler for the target chip and is ‘read-only’ – ie your program doesn’t re-write itself when it is running. It is also of a fixed size (ie a constant number of bytes). This is normally known as the ‘Code Segment’ or ‘Text Segment’.
  2. Any global data/variables. These are of a known size and are ‘read-write’. In a C program these are the variables defined outside of any of your methods. They can be split into two separate types:-
    1. Initialised. If your program creates a variable with an initial value, eg ‘int foo=10;’, then the compiler creates this variable in the data section with an initial value of 10 – so no code is required to assign it the value of 10. These sorts of variables are normally stored in the ‘Data Segment’.
    2. Un-initialised. If your program creates a global variable such as ‘char buffer[256];’ then you are reserving 256 bytes of data but you aren’t actually assigning any values to any of the 256 bytes. The compiler will not generally put this into the ‘Data Segment’ as that would make the size of the runnable file 256 bytes larger. Instead it adds 256 to the size of what is often called the ‘BSS Segment’. Once the program loads, the start-up code zaps the entire BSS Segment to a known value – normally zeroes – so all of your numeric variables are automatically initialised to a value of zero. Standard C requires this zeroing, but start-up code on embedded toolchains can be customised, so 'best practice' says that you should not rely on it happening - if you need a variable to have a particular initial value then you should initialise it yourself in your source code.
  3. The stack. This is never stored in a program file but is an area of memory used when running the program. It is used by the microcontroller to store things like return addresses and variables that are defined within a method.  When you call a method the chip will first of all store the return address (ie where in your code it needs to go back to afterwards) onto the stack. It will then create enough space on the stack to store all of the variables defined within the method that is being called. Once the method finishes then all of this space is recovered from the stack and made available again. In a multi-threaded environment (say on 'Windows') then each program and thread will need its own stack. Since a microcontroller is normally used with a single thread then we will assume that this is the case and that there is thus only one stack.
  4. The heap. This is used if your compiler allows you to create new variables at runtime by calling the ‘malloc’ or ‘calloc’ functions or by using the ‘new’ operator. Note that most microcontroller compilers don’t allow you to do this or implement it in a limited way.
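As a rough illustration of the list above, this C sketch marks where each kind of object would typically end up. The names (‘counter’, ‘buffer’, ‘sum_bytes’, ‘heap_demo’) are made up for this example, and the final placement is decided by your compiler and linker, so treat the comments as the usual case rather than a guarantee.

```c
#include <stdlib.h>

/* Initialised global: usually placed in the Data Segment. */
int counter = 10;

/* Un-initialised global: usually counted in the BSS Segment
   and zeroed before main() runs. */
char buffer[256];

/* The machine code for this function lives in the Code (Text) Segment. */
int sum_bytes(const char *bytes, int length) {
    int total = 0;                  /* local variable: lives on the stack */
    for (int i = 0; i < length; i++) {
        total += bytes[i];
    }
    return total;
}

/* Heap use: only possible if your toolchain provides malloc(). */
int heap_demo(void) {
    char *block = malloc(16);       /* 16 bytes taken from the heap */
    if (block == NULL) {
        return -1;                  /* no heap space available */
    }
    for (int i = 0; i < 16; i++) {
        block[i] = 1;
    }
    int total = sum_bytes(block, 16);
    free(block);                    /* hand the bytes back to the heap */
    return total;
}
```

Calling ‘sum_bytes(buffer, 256)’ before anything has written to ‘buffer’ shows the BSS zeroing at work, since every byte should still be zero.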

 

So your file will have: a) a Code Segment of a fixed size, b) a Data Segment of a fixed size. It will also store a number for the total number of bytes required in the BSS Segment.

 

Let's now look at the different sorts of memory found in a typical microcontroller (in this case an ATMega8 as used in the $50 Robot).

 

ATMega8 memory types

 

Flash Memory is where you upload your program using your ISP programmer. It is used to store Read Only information (such as your code) and its contents survive reboots.

 

SRAM is Read Write memory that is used to store changing information - like your variables and the stack. The contents are lost every time you reboot.

 

EEPROM is used to store Read Write information but, unlike SRAM, its contents are retained between reboots. This is useful for storing information such as user preferences (eg Current Volume for my text to speech amplifier - so that next time I turn on the robot it keeps the same value).

 

This is the layout of your HEX file and the whole thing is loaded into Flash memory when you upload your program:-

 

 

The 'Vector Table' is set up to contain the addresses of all of the Interrupt Service Routines (ISRs) specified in your code. Most of the unused vectors will point to a piece of code that behaves as if the power has been switched off then back on. The big exception is that the first entry in the table holds the address of the start of the code to be executed at power on. You may think that this is your 'main' function - but you'd be wrong. It actually points to the start of the 'Stub Code'.
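To make the idea concrete, here is a small PC-side model of a vector table written as an array of function pointers. It is only a sketch: the table size, the handler names and the counters are invented for the example, and a real chip dispatches through its own hardware table rather than through a C array.

```c
#define VECTOR_COUNT 4              /* made-up size; a real chip has more entries */

typedef void (*handler)(void);

int resets = 0;                     /* how many times the 'Stub Code' has run */
int timer_ticks = 0;                /* how many times the example ISR has run */

/* Stands in for the start-up code reached through entry 0. */
static void stub_code(void) { resets++; }

/* Unused vectors behave like a power cycle: they restart the stub. */
static void unused_vector(void) { stub_code(); }

/* An ISR that your own program supplies. */
static void timer_isr(void) { timer_ticks++; }

static const handler vector_table[VECTOR_COUNT] = {
    stub_code,      /* entry 0: executed at power on */
    unused_vector,  /* entry 1: no ISR supplied */
    timer_isr,      /* entry 2: ISR supplied by the program */
    unused_vector   /* entry 3: no ISR supplied */
};

/* Model the hardware jumping through the table for a given vector number. */
void raise_interrupt(int number) {
    if (number >= 0 && number < VECTOR_COUNT) {
        vector_table[number]();
    }
}
```

Raising vector 2 runs ‘timer_isr’, whereas raising vector 3 ends up back in ‘stub_code’, just as if the chip had been restarted.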

The 'Stub Code' is automatically linked in when your program is built. Its purpose is to set up the microcontroller into a known 'ready' state before any of your code is run. We will have a look at what it does in more detail in a minute.

The 'Code Segment' contains all of your code and any of your unchanging data that you have told the compiler to place in program memory.

The 'Data Segment' contains all of your initialised data - ie variables, arrays, etc, that you have assigned an initial value to.

 

Now let’s look at how your program sits in memory when it is being run. As mentioned above, the first piece of code to be executed is the 'Stub Code'. This will:-

  1. Set up the Stack Pointer to the end (ie top) of your SRAM memory.
  2. Copy all of your Data Segment from the Flash memory to the start of the SRAM memory.
  3. Carve out the number of bytes in the SRAM required by your BSS Segment (ie for uninitialised variables, arrays etc) starting immediately after the end of the memory used for the Data Segment. This memory will be zapped to zeroes so that all these uninitialised variables start with a value of zero.
  4. Store the address of the end of the BSS Segment as being the start address of 'available' memory for use by the heap.
  5. Call your 'main' function.

The heap will grow upwards towards the stack and the stack will grow downwards towards the heap. If the two overlap then you get the dreaded ‘Stack Overflow’ or ‘Out of Memory’ or the program just goes berserk. At the point where your 'main' function is called the SRAM is organised as follows:
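The start-up sequence described above can be modelled on a PC with a plain array standing in for the SRAM. This is a sketch under assumptions, not the real start-up code: ‘run_stub’, ‘struct memory_map’ and the array ‘sram’ are hypothetical names, the 1024 byte size matches the 1 KB of SRAM on an ATMega8, and positions are offsets into the array rather than real hardware addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SRAM_SIZE 1024              /* an ATMega8 has 1 KB of SRAM */

static unsigned char sram[SRAM_SIZE];   /* stand-in for the chip's SRAM */

/* Hypothetical record of what the stub code works out at start-up. */
struct memory_map {
    size_t data_start;              /* Data Segment is copied here */
    size_t bss_start;               /* BSS Segment follows the Data Segment */
    size_t heap_start;              /* first 'available' byte after the BSS */
    size_t stack_top;               /* Stack Pointer starts at the end of SRAM */
};

struct memory_map run_stub(const unsigned char *flash_data,
                           size_t data_size, size_t bss_size) {
    struct memory_map map;

    /* Both segments must fit, otherwise there is no room left at all. */
    assert(data_size + bss_size <= SRAM_SIZE);

    /* Step 1: point the stack at the top of SRAM. */
    map.stack_top = SRAM_SIZE - 1;

    /* Step 2: copy the initialised data from Flash to the start of SRAM. */
    map.data_start = 0;
    memcpy(&sram[map.data_start], flash_data, data_size);

    /* Step 3: zero the BSS Segment immediately after the Data Segment. */
    map.bss_start = map.data_start + data_size;
    memset(&sram[map.bss_start], 0, bss_size);

    /* Step 4: everything after the BSS Segment is available to the heap. */
    map.heap_start = map.bss_start + bss_size;

    return map;
}
```

The gap between ‘heap_start’ and ‘stack_top’ is all that the heap and the stack have to share, which is why a large Data or BSS Segment leaves less room for both.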

 

 

Why don’t most microcontrollers allow you to dynamically allocate memory via calls to ‘malloc’, ‘calloc’ (in C) or ‘new’ (in C++)?

The answer comes down to managing the heap. A very simple heap manager will just keep a ‘current end of heap’ marker. If you then ask for 20 bytes then the marker will be incremented by 20. If you free the 20 bytes, and they are at the end of the heap, then the marker will move back by 20 bytes.

But what if you grab three separate lots of 20 bytes and then free the middle 20? Something has to remember that these are now free so that if the last 20 bytes are freed then the marker will move back by 40 (so that only the first 20 bytes are still in use). So the heap is often stored as a linked list of memory blocks with, at least, a variable to indicate if they are currently used or unused. Or maybe as one linked list of ‘used’ heap and another list of ‘available’ space.

But the problem gets worse with what is called ‘fragmentation’. Assume that your middle 20 bytes are ‘unused’ and your program now asks for 5 bytes – should it carve this out of the available 20 or should it increase the heap marker further? This constant allocating and de-allocating of different sized heap memory means that the heap becomes more and more fragmented – ie has lots of small lumps of memory available. This could mean that if you ask for 100 bytes then the system cannot satisfy the request. It may have a total of 100 bytes available but they are scattered all over the heap and there isn’t one contiguous lump of 100 bytes available. Bang and crash!! More powerful computers have hardware (such as a memory management unit) and software that help to present scattered chunks as one contiguous lump so that your program can continue.
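The fragmentation problem can be reproduced with a deliberately tiny heap manager. This sketch tracks each byte of a 200 byte heap in a ‘used’ map instead of the linked list a real heap manager would more likely use; the function names and the heap size are assumptions chosen to keep the example short.

```c
#define HEAP_SIZE 200               /* made-up heap size for the example */

static unsigned char used[HEAP_SIZE];   /* 1 = byte is allocated, 0 = free */

/* First-fit allocation: returns the offset of the block, or -1 if there is
   no contiguous run of 'size' free bytes. */
int heap_alloc(int size) {
    int run = 0;                    /* length of the current free run */
    if (size <= 0) {
        return -1;
    }
    for (int i = 0; i < HEAP_SIZE; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == size) {
            int start = i - size + 1;
            for (int j = start; j <= i; j++) {
                used[j] = 1;
            }
            return start;
        }
    }
    return -1;
}

/* Mark 'size' bytes starting at 'offset' as free again. */
void heap_free(int offset, int size) {
    for (int i = offset; i < offset + size; i++) {
        used[i] = 0;
    }
}

/* Count every free byte, contiguous or not. */
int heap_total_free(void) {
    int total = 0;
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (!used[i]) {
            total++;
        }
    }
    return total;
}
```

If you allocate ten blocks of 20 bytes and then free every other one, ‘heap_total_free()’ reports 100 bytes, yet ‘heap_alloc(100)’ fails because the largest contiguous lump is only 20 bytes.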

 

So that is a very quick, and very dirty, explanation as to why microcontrollers don’t allow you to dynamically allocate and de-allocate heap memory – they just don’t have the necessary hardware and software to do it very well.

 

Some microcontroller compilers do allow you to dynamically allocate memory – but with this one enormous proviso:- The memory is automatically freed when the method that allocated it in the first place exits. How does that work? Well, it’s then very similar to the stack. It will just use the heap pointer to allocate new memory and then move it back to its original position when the method exits. Therefore you can never have any fragmentation in the heap. However, its usage is fairly limited.
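A heap that behaves like this can be sketched in a few lines. The names ‘heap_mark’, ‘heap_grab’ and ‘heap_release’ are invented for this example and do not belong to any particular compiler; here the mark and release are written out by hand, whereas the compilers described above would do the equivalent for you when the method exits.

```c
#include <stddef.h>

#define HEAP_SIZE 512               /* made-up heap size for the example */

static unsigned char heap[HEAP_SIZE];
static size_t heap_top = 0;         /* the 'current end of heap' marker */

/* Remember where the heap currently ends. */
size_t heap_mark(void) {
    return heap_top;
}

/* Allocate by moving the marker up; returns NULL when the heap is full. */
void *heap_grab(size_t size) {
    if (size > HEAP_SIZE - heap_top) {
        return NULL;
    }
    void *block = &heap[heap_top];
    heap_top += size;
    return block;
}

/* Free everything grabbed since the matching heap_mark(). */
void heap_release(size_t mark) {
    heap_top = mark;
}

/* A method that borrows 256 bytes and gives them all back on exit. */
int fill_and_sum(void) {
    size_t mark = heap_mark();
    unsigned char *buffer = heap_grab(256);
    int total = 0;
    if (buffer != NULL) {
        for (int i = 0; i < 256; i++) {
            buffer[i] = (unsigned char)i;
            total += buffer[i];
        }
    }
    heap_release(mark);             /* buffer must not be used after this */
    return total;
}
```

Because memory is always released in the reverse order to which it was grabbed, the free space stays in one lump and no block list is needed.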

 

Consider this code:-


void myMethod(){
    char buffer[256];               /* 256 bytes reserved on the stack */
    for(int i=0; i<256; i++){
        buffer[i]=(char)i;
    }
}


This will allocate space for ‘buffer’ by moving the stack pointer down by 256 bytes.

 

Whereas this code:-

 


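#include <stdlib.h>

void myMethod(){
    /* on the compilers described above this is freed automatically on exit */
    char *buffer = malloc(256);
    if(buffer==NULL) return;        /* no heap space available */
    for(int i=0; i<256; i++){
        buffer[i]=(char)i;
    }
}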


will allocate space for ‘buffer’ by moving the heap pointer up by 256 bytes. So they are almost identical.

 

But you must be careful not to do the following:-


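#include <stdlib.h>

char *buffer;                       /* global: still visible after myMethod() exits */

void myMethod(){
    buffer = malloc(256);
    if(buffer==NULL) return;        /* no heap space available */
    for(int i=0; i<256; i++){
        buffer[i]=(char)i;
    }
}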


since ‘buffer’ is assigned a value inside the method but is still visible after the method exits. Unlike with a ‘proper’ C compiler, the memory is freed automatically at that point, so ‘buffer’ now refers to a lump of memory that may be in use by other things. Changing any of its values will then corrupt something else. Bad!

 

So dynamic memory allocation should generally be avoided and we will not discuss it any further!!

 

Special consideration of Segments in a micro-controller

 

The main difference from the generic description of segments is that a micro-controller has different sorts of memory: ‘Flash’ memory, which you can write to using your ISP programmer and which survives reboots but otherwise doesn’t really change, and ‘RAM’ or ‘SRAM’, which is used as a temporary scratch pad when the program is running.

 

The ‘Code Segment’ is written into the Flash memory and the RAM is used to store the runtime data ie the Stack Segment, BSS Segment and Data Segment.

 

The Stack and BSS segments don’t need to hold any initial values, so that’s fine, but there is a problem with the Data Segment. If we turn off the controller and turn it back on then it must ‘somehow’ know what values to store into the initialised global variables in the Data Segment. It can only do this by holding the Data Segment in Flash memory, and then copying it to RAM before your ‘main’ method is called. So the Flash memory needs to be big enough to store your Code Segment and a read-only copy of your Data Segment. When the program runs then your RAM must be big enough to hold your Data, BSS and Stack Segments.
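The two size rules above are simple sums, shown here as a small C sketch. The structure and function names are invented for the example, the limits are those of an ATMega8 (8 KB of Flash and 1 KB of SRAM), and the stack figure is only ever an estimate because the stack grows and shrinks while the program runs.

```c
#define FLASH_BYTES 8192UL          /* ATMega8 Flash size */
#define SRAM_BYTES  1024UL          /* ATMega8 SRAM size */

/* Hypothetical summary of one program's segment sizes, in bytes. */
struct segment_sizes {
    unsigned long code;             /* Code Segment, including vector table and stub code */
    unsigned long data;             /* Data Segment (initialised globals) */
    unsigned long bss;              /* BSS Segment (un-initialised globals) */
    unsigned long stack;            /* your estimate of the deepest stack use */
};

/* Flash holds the code plus the read-only copy of the Data Segment. */
unsigned long flash_needed(struct segment_sizes s) {
    return s.code + s.data;
}

/* RAM holds the Data, BSS and Stack Segments while the program runs. */
unsigned long ram_needed(struct segment_sizes s) {
    return s.data + s.bss + s.stack;
}

/* Non-zero when both totals fit on the chip. */
int program_fits(struct segment_sizes s) {
    return flash_needed(s) <= FLASH_BYTES && ram_needed(s) <= SRAM_BYTES;
}
```

Notice that the Data Segment is counted twice, once in Flash and once in RAM, which is exactly why large initialised tables are expensive on a small chip.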

 

Generally a micro-controller has a lot more Flash memory than it has RAM and so we sometimes need to change our program to reduce the amount of RAM required. A common technique is to store read-only data (such as text for logging messages or lookup tables) only in the Flash area. It is more complicated, and slower, to access but frees up more RAM for variable data.
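As one example of this technique, the sketch below keeps a small lookup table in Flash only. It assumes the avr-gcc compiler with avr-libc, where ‘PROGMEM’ and ‘pgm_read_byte’ come from ‘avr/pgmspace.h’; other compilers use different keywords. The fallback definitions are only there so that the same file also builds on a PC, and the table and the function ‘square_of’ are invented for the example.

```c
#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Fallback for building on a PC, where there is only one kind of memory. */
#define PROGMEM
#define pgm_read_byte(address) (*(const unsigned char *)(address))
#endif

/* Lookup table kept in Flash only, so it uses no RAM on the AVR. */
static const unsigned char squares[8] PROGMEM = {0, 1, 4, 9, 16, 25, 36, 49};

/* Fetch one entry; the index is masked so it always stays inside the table. */
unsigned char square_of(unsigned char n) {
    return pgm_read_byte(&squares[n & 7]);
}
```

The extra call to ‘pgm_read_byte’ is the "more complicated, and slower" part: the table can no longer be read with a plain ‘squares[n]’ on the AVR because it is not in RAM.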