05 - The perils of optimisation! (volatile and interrupts)

 

As already mentioned - if you just compile your program with optimisation turned on then your program may no longer work.

 

Why is this? Surely the compiler shouldn't corrupt my program?

 

The biggest issue you will come across is pieces of code such as this:-


void delay_cycles(unsigned long int cycles)
    {
    while(cycles > 0)
        cycles--;
    }


The purpose of this method is to introduce a delay. ie to waste time and not do anything else!

When optimising, the compiler looks at the code and says: OK, there is one variable which we decrement but it is never used in any other way. Hence the code is just wasting time. So your code gets optimised as follows:-


void delay_cycles(unsigned long int cycles)
    {
    }


In fact it may get rid of the method altogether and remove any calls to it. Because you may be using these delays to control servos or to read sensors then these will stop working as intended.

 

And here is a big hint with optimisation: when writing new code always compile it without optimisation. If it works correctly then recompile with optimisation and make sure it still works. You are better off doing this frequently, ie don't write an enormous program and only then turn optimisation on, as you will have loads of code to inspect to try and work out what isn't working properly.

 

Q. So how do we fix this problem and make the compiler introduce the delay loop?

A. There are 3 potential solutions:-

  1. Some compilers allow you to use what are called 'pragma directives' so that you can change the level of optimisation at any point. In the above case you would tell the compiler to turn off optimisation for the 'delay_cycles' method. Unfortunately - I haven't found a way to do this with the avr-gcc compiler. This compiler has loads of individual optimisation settings that can be turned on and off individually - and the O1, O2, O3, Os settings are just shorthands to turn on different groups of these settings. Although you can turn on/off individual settings inside your code there is no easy way to turn them all off. If anyone knows otherwise then I would love to know...
  2. You could place all your time critical code into separate C files which you then always compile with optimisation turned off. These could then be placed into a library that the rest of your code could incorporate at link time.
  3. We can fool the compiler by using the keyword volatile (see below)
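
On option 1: for what it is worth, GCC releases from 4.4 onwards document a per-function 'optimize' attribute (and a matching '#pragma GCC optimize'), which avr-gcc inherits. The GCC manual describes the attribute as intended for debugging rather than production code, so treat the following as a minimal sketch and not a recommended fix. It assumes a GCC-based toolchain; other compilers may ignore or reject the attribute.

```c
/* Sketch only: assumes a GCC-based compiler (such as avr-gcc 4.4 or later).
   The attribute asks for this one function to be compiled as if with -O0,
   whatever optimisation level is used for the rest of the file. */
__attribute__((optimize("O0")))
void delay_cycles(unsigned long int cycles)
{
    while (cycles > 0)
        cycles--;
}
```

It is still worth inspecting the generated assembler to confirm that the loop survived, since the actual delay per cycle depends on the code the compiler emits.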

 

The volatile keyword indicates that a variable can be modified by another thread or under interrupts. Let's look at that. Assume your program is like this rough example:-


#include <stdint.h>

uint32_t timer = 0;

void delay_ms(int ms)
{
   uint32_t endtime = timer + ms;
   while(timer < endtime)
   {
      // keep waiting
   }
}

// This method is called once every ms
void timer_interrupt(void)
{
   timer++; // add 1 to timer
}

int main(void)
{
   // ..... set up timer interrupt to happen every 1ms
   // ..... do stuff
   delay_ms(10); // wait for 10ms
   // ..... do stuff
}


The big problem is in 'delay_ms'. It calculates 'endtime' correctly and then goes into the while loop. With optimisation enabled, the value of 'timer' is loaded once into register(s) and then compared against 'endtime'. Since the compiler can see that the while loop never changes 'timer', it never re-reads the variable from memory: it may perform the test once and then effectively enter an infinite loop that never returns. In post-boy speak: the compiler doesn't know that it needs to keep doing a 'wash-your-hands'.
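
To make that concrete, the optimised loop behaves roughly as if it had been written like the sketch below. This is an illustration of the effect and not actual compiler output; the names 'delay_ms_as_optimised' and 'cached' are made up for the example.

```c
#include <stdint.h>

uint32_t timer = 0;   /* deliberately NOT volatile */

/* Hypothetical hand-written equivalent of what the optimiser may produce. */
void delay_ms_as_optimised(int ms)
{
    uint32_t endtime = timer + ms;
    uint32_t cached = timer;      /* 'timer' is loaded into a register once */

    if (cached < endtime)         /* the test is performed a single time */
    {
        for (;;)
        {
            /* spins forever: 'timer' is never read from memory again */
        }
    }
}
```

If 'timer' has already reached 'endtime' the function returns at once; otherwise it never returns, no matter what the interrupt routine does to 'timer'.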

 

YOU know that the value of 'timer' is being changed under interrupts every 1ms, BUT the compiler doesn't know that.

 

So to fix the problem we just change the first line to say:-

 

volatile uint32_t timer = 0;

 

The addition of the volatile keyword tells the compiler that the variable may be being changed by something else and so it should never keep it in a register. ie if it is referenced then it should ALWAYS be reloaded from memory, and if changed then should ALWAYS be written back out to the memory variable.

 

But what about our 'delay_cycles' routine - that doesn't use anything that is changed under interrupts?

Well YOU know that, but the compiler doesn't. So you can fix the optimisation by adding the keyword volatile :-

 

void delay_cycles(volatile unsigned long int cycles)
    {
    while(cycles > 0)
        cycles--;
    }

Since we have told the compiler that 'cycles' is 'volatile' then it has to keep re-reading it, decrementing it, and writing it back. So the compiler cannot get rid of the code.

 

 

However: there is still one potential problem with volatile variables when they are modified within an interrupt, or in another thread (if you have a pre-emptive multi-tasking kernel).

To understand the problem let's return to our example above for 'delay_ms' where the variable 'timer' is being changed under interrupts.

The 'timer' variable is stored as a 32 bit (ie 4 byte) variable. Since we have marked it as 'volatile' then it needs to be re-read whenever it is referenced; however your microprocessor is probably unable to fetch all 4 bytes from memory in one atomic (ie uninterruptible) instruction. An 8 bit processor may only be able to load one byte at a time, a 16 bit processor may only be able to read 2 bytes at a time, etc. So reading the variable will require more than one instruction.

 

Let's assume that the variable currently holds the following hexadecimal four byte value (from high byte to low byte):  00,  01, FF, FF

 

In a non-interrupt situation then your 'delay_ms' may read in the low 2 bytes FF and FF, and then read the upper 2 bytes 00 and 01 - all is well - it has got the correct value.

 

But what happens if an interrupt happens half way through:-

delay_ms reads in the low 2 bytes (FF and FF)

but then there is a timer interrupt and your interrupt routine adds 1 to the timer variable. Then timer is now set to 00, 02, 00, 00

the delay_ms routine now reads the high 2 bytes (which are now 00,02)

 

So your delay_ms has read the variable as: 00, 02, FF, FF  - which is a mixture of its old value and its new value.
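
The sequence above can be reproduced on a PC with a small simulation. This is only a sketch: there is no real interrupt here, so the 'timer++' in the middle of 'torn_read' (a made-up name) stands in for the timer interrupt landing between the two 16 bit reads.

```c
#include <stdint.h>

/* Stand-in for the shared counter, starting at 00, 01, FF, FF. */
volatile uint32_t timer = 0x0001FFFFUL;

/* Reads 'timer' in two 16 bit halves with a simulated interrupt in between. */
uint32_t torn_read(void)
{
    uint16_t low = (uint16_t)(timer & 0xFFFFUL);   /* reads the low 2 bytes: FF, FF */

    timer++;                                       /* simulated interrupt: timer is now 00, 02, 00, 00 */

    uint16_t high = (uint16_t)(timer >> 16);       /* reads the high 2 bytes: 00, 02 */

    return ((uint32_t)high << 16) | low;           /* combined value: 00, 02, FF, FF */
}
```

The value returned is 0x0002FFFF, which 'timer' never actually held: it was 0x0001FFFF before the interrupt and 0x00020000 after it.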

 

In interrupt driven code this is a common mistake, and in a multi-threaded environment it is even more of a problem. These problems are horrendously difficult to detect and fix. The above case can only happen when the low 2 bytes are FF, FF, which is 1 timer value in 65,536, and ONLY if the interrupt happens at the exact moment between the two reads. So it will only happen once in a blue moon. So is it important? Well it depends on the side effects. If the code is reading a sonar to decide which way to turn and one reading in 65,536 is wrong then it may not be too important as the effect will be dampened by all the other valid readings. But what if the outcome is to press the red button and launch the big bomb !!

The easiest way to avoid this mistake is to disable interrupts whilst the variable is being read. Since this could be a common requirement then I use a '.h' file to define a macro to do this for me:-

#include <avr/io.h>         // defines SREG
#include <avr/interrupt.h>  // defines cli()
#define CRITICAL_SECTION_START    unsigned char _sreg = SREG; cli()
#define CRITICAL_SECTION_END      SREG = _sreg

The CRITICAL_SECTION_START macro will remember whether interrupts are currently enabled or not, and will then disable them. The CRITICAL_SECTION_END macro will return the interrupt enable flag back to how it was. So now you can bracket your code with these two macros. eg


volatile uint32_t timer = 0;

uint32_t getTimer(void)
{
    uint32_t rtn;
    CRITICAL_SECTION_START; // make sure 'timer' doesn't get changed
    rtn = timer;
    CRITICAL_SECTION_END;  // allow 'timer' to be changed again
    return rtn;
}

void delay_ms(int ms)
{
   uint32_t endtime = getTimer() + ms;
   while(getTimer() < endtime)
   {
      // keep waiting
   }
}

// This method is called once every ms
void timer_interrupt(void)
{
   // add 1 to timer. We don't need the CRITICAL_SECTION macros since we are
   // in an interrupt routine and interrupts are already disabled
   timer++;

   // We could also write
   // CRITICAL_SECTION_START;
   // timer++;
   // CRITICAL_SECTION_END;
   // but the result will be the same
}

int main(void)
{
   // ..... set up timer interrupt to happen every 1ms
   // ..... do stuff
   delay_ms(10); // wait for 10ms
   // ..... do stuff
}
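
If you are using avr-libc then its '<util/atomic.h>' header already provides a ready-made equivalent of these macros: 'ATOMIC_BLOCK(ATOMIC_RESTORESTATE)' saves SREG, disables interrupts, and restores SREG when the block is left. A minimal sketch of 'getTimer' using it is shown here. The non-AVR fallback macros are hypothetical stand-ins (not part of avr-libc) so that the example also compiles on a PC, where there are no interrupts to mask.

```c
#include <stdint.h>

#if defined(__AVR__)
#include <util/atomic.h>   /* avr-libc: ATOMIC_BLOCK, ATOMIC_RESTORESTATE (needs C99 mode) */
#else
/* Assumed host-only stand-ins: run the block once and mask nothing. */
#define ATOMIC_RESTORESTATE 0
#define ATOMIC_BLOCK(mode) for (int atomic_once = 1; atomic_once; atomic_once = 0)
#endif

volatile uint32_t timer = 0;

uint32_t getTimer(void)
{
    uint32_t rtn;

    ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
    {
        rtn = timer;   /* all 4 bytes are copied with interrupts disabled */
    }

    return rtn;
}
```

The behaviour should match the hand-written macros, with the advantage that the interrupt state is also restored if the block is left early.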


 

In a true multi-tasking environment, whereby you can have loads of threads executing simultaneously, then disabling all interrupts could become too restrictive. Since the majority of users will not be in this situation then I will kind of skip it - other than to say that the use of 'semaphores' could be a solution.

 

The last BIG caveat with compiler optimisation settings is that they sometimes get it wrong and turn your code into nonsense. If your program stops working, or is doing something unexpected, then try compiling with all optimisation disabled. If this fixes the problem then it is normally a good idea to contact the compiler manufacturer to report a bug. Unless we all do this then the compilers will never get fixed and we will just accept that optimisation "doesn't work".