
2016 Esc SV Efficient Embedded Programming MG

The document provides guidance on writing efficient embedded code by structuring code properly, using appropriate data sizes, choosing between signed and unsigned integers, deciding between floating point and fixed point, and being careful with comparisons and calculations. Some key recommendations include isolating device-dependent code, using natural data sizes for the architecture, choosing unsigned types unless negative values are needed, using fixed point instead of floating point when possible, and avoiding assumptions in comparisons and calculations that could cancel optimizations.


Efficient Embedded Programming

Shawn A. Prestridge
Senior Field Applications Engineer, IAR Systems

#ESCsv
Agenda

• How to structure your code
• Using correct data sizes
• Signed vs. Unsigned integers
• Floating point vs. Fixed point
• Structures
• Global variables
• Taking addresses
• Local variables
• Be careful with comparisons/calculations
• Varargs
• Function prototypes
• Static vs. volatile
• Clever code
• Saving stack

Structuring your application
• To be efficient and portable
  • Isolate device-dependent code
  • Leave most of the code undisturbed
  • Use tuned code where needed

[Diagram: layered application. General code (generic program files, size optimization) and tuned program files such as ComInterface (speed optimization) sit on top of the device driver files, reached through calls such as SetPort(Port,Pin,Status); the device driver files sit on top of the hardware.]

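The layering above can be sketched roughly as follows. The slide only names SetPort(Port,Pin,Status); the simulated register array, the argument types, and the ReadPort and StatusLed helpers are hypothetical stand-ins so that the example runs on a host machine.

```c
#include <stdint.h>

/* Hypothetical stand-in for memory-mapped port registers. On real hardware
   this array would be replaced by the device's register addresses. */
#define PORT_COUNT 4u
static volatile uint32_t port_output[PORT_COUNT];

/* Device driver layer: the only code that touches the "hardware". */
void SetPort(unsigned port, unsigned pin, unsigned status)
{
    if (port >= PORT_COUNT || pin >= 32u) {
        return;                                   /* ignore invalid requests */
    }
    if (status) {
        port_output[port] |= (UINT32_C(1) << pin);
    } else {
        port_output[port] &= ~(UINT32_C(1) << pin);
    }
}

/* Hypothetical helper used only to observe the simulated register. */
uint32_t ReadPort(unsigned port)
{
    return (port < PORT_COUNT) ? port_output[port] : 0u;
}

/* Generic application code: portable, knows nothing about registers. */
void StatusLedOn(void)  { SetPort(1u, 5u, 1u); }
void StatusLedOff(void) { SetPort(1u, 5u, 0u); }
```

Porting to another device would then mean rewriting only SetPort, while the generic code stays undisturbed.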
Use “natural” data sizes
• Different architectures have different “natural” data sizes
  • Different available memories, sizes, etc.
• Using an “unnatural” data size might cost
  • A 32-bit MCU might need extra shift, mask and sign-extend operations to use smaller types
  • A 32-bit MCU needs multiple registers to hold 64-bit data, or has to perform the operations in RAM
• Use a natural size unless there is a compelling reason not to
  • Perhaps you are doing I/O and you need a precise number of bits
  • Bigger types might take up too much room: an array of 1024 32-bit ints takes four times the room of an array of 1024 8-bit chars

Cost of using unnatural data sizes
char test_char(char a, char b)
{
    return (a + b);
}
// ARM Cortex-M (32 bit)
// ADDS R0,R0,R1
// UXTB R0,R0        2 Cycles

int test_int(int a, int b)
{
    return (a + b);
}
// ARM Cortex-M (32 bit)
// ADDS R0,R1,R0     1 Cycle

A single cycle executed once may not have a deleterious effect on your application; however, if this function is called repeatedly, you will see performance degradation. Moreover, if you use these types of unnatural data sizes throughout your program, you will also see the size of your application grow unnecessarily.

Using fast types to get appropriate data sizes
• Use appropriate data size
  • 8-bit operations less efficient on 32-bit CPU
  • 32-bit operations hard for 8-bit CPU
• Use another typedef to adjust
  • “int_fastX_t”: the fastest type whose values fit in at least X bits
  • Use for loop counters, state variables, etc.
  • Code is same and efficient across machines
  • Are included as part of <stdint.h>

32-bit machine:
typedef int int_fast8_t;
typedef int int_fast16_t;
typedef int int_fast32_t;

64-bit machine:
typedef long int_fast8_t;
typedef long int_fast16_t;
typedef long int_fast32_t;

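As a small illustration of the fast types from <stdint.h>, the sketch below sums a byte buffer with a uint_fast16_t accumulator. The function name checksum16 and the 16-bit result mask are assumptions made for the example, not part of the slides.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums the bytes of a buffer. The accumulator is a "fast" type, so each
   target picks whatever width is natural for it (at least 16 bits). */
uint_fast16_t checksum16(const uint8_t *data, size_t len)
{
    uint_fast16_t sum = 0;
    for (size_t i = 0; i < len; ++i) {
        sum += data[i];
    }
    /* The fast type may be wider than 16 bits, so mask explicitly to get
       the same result on every machine. */
    return sum & 0xFFFFu;
}
```

The explicit mask matters because a fast type only guarantees a minimum width; code that relies on wrap-around at exactly 16 bits would behave differently across targets.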
Signed or unsigned?
Think about signedness!

Signed (arithmetic operators: + - * / %)
• Negative values possible
• Arithmetic operations always performed
• Operations will never be cheaper, but in many cases more expensive

Unsigned (bit operators: << >> | & ^ ~)
• Negative values impossible
• Bit operations are consistent
• Arithmetic operations may be optimized to bit-operations.

Unless you need to handle negative numbers, use unsigned types

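A brief sketch of why unsigned arithmetic can be cheaper: dividing an unsigned value by a power of two is exactly a shift, while signed division rounds toward zero and so needs extra fix-up code for negative inputs. The function names are hypothetical, and what the compiler actually emits depends on the target and optimization level.

```c
#include <stdint.h>

/* Unsigned: a compiler may reduce these to a shift and a mask. */
uint32_t udiv8(uint32_t x) { return x / 8u; }   /* same value as x >> 3 */
uint32_t umod8(uint32_t x) { return x % 8u; }   /* same value as x & 7  */

/* Signed: C division truncates toward zero, so a plain arithmetic shift
   would give the wrong answer for negative x. */
int32_t sdiv8(int32_t x) { return x / 8; }
```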
Floating point numbers
• IEEE 754: float
• Wide range: Float 10^-38 to 10^38, Double 10^-308 to 10^308
• Good precision: Float 10^-7, Double 10^-16
• Designed for giving small error in complex computations
• Expensive in size and speed...unless there’s hardware support
• “Real-world” data usually have:
• Fixed range
• Limited precision is available/needed
• Fixed-point arithmetic:
• Implemented using integers
• Can give significant savings (size and speed)
• Use “relaxed floating-point semantics”
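A minimal sketch of fixed-point arithmetic implemented using integers, assuming a Q16.16 format (16 integer bits, 16 fractional bits). The type name, the helper names and the choice of format are all assumptions for illustration; real projects pick the format from the range and precision their data needs.

```c
#include <stdint.h>

typedef int32_t q16_16_t;                 /* hypothetical Q16.16 type */
#define Q16_ONE ((q16_16_t)1 << 16)       /* the value 1.0 */

/* Converts an integer to Q16.16; v must fit in 16 signed bits. */
q16_16_t q16_from_int(int32_t v) { return (q16_16_t)(v * Q16_ONE); }

/* Converts back to an integer, truncating the fraction toward zero. */
int32_t q16_to_int(q16_16_t v) { return v / Q16_ONE; }

/* Multiplies two Q16.16 values. The product is formed in 64 bits and
   scaled back down by 2^16. */
q16_16_t q16_mul(q16_16_t a, q16_16_t b)
{
    return (q16_16_t)(((int64_t)a * b) / Q16_ONE);
}
```

For example, a ratio of 1.05 becomes the integer constant 68813 (1.05 × 65536, rounded), and q16_mul(q16_from_int(20), 68813) evaluates to just over 21.0 in Q16.16 without any floating-point library.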
Integer or floating point?
• Floating point is very expensive
  • Arithmetic is more complex
  • Brings in large library (from C runtime)
  • Could require other functionality to be more complex
    • printf( ) is ~3 times larger
    • scanf( ) is also ~3x larger
• Use only when really needed
• Can be done inadvertently
• Example code:

#define Other 20
#define ImportantRatio (1.05 * Other)
#define ImportantRatioBetter ((int)(1.05 * Other))

int i = a + b * ImportantRatio;        // Will include FPLIB

int i = a + b * ImportantRatioBetter;  // Integer

Be careful with comparisons!
• Confusing integral promotion
  • 8-bit char, 16-bit int

void f1(unsigned char c1)
{
    if (c1 == ~0x80)      // Test always false
    {
        …
    }
}

What actually is done:

void f1(unsigned char c1)
{
    if ((int)c1 == 0xFF7F)
    {
        …
    }
}

https://www.iar.com/support/tech-notes/general/integral-types-and-possibly-confusing-behavior/

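One common way to make the comparison above behave as intended is to cast the complemented constant back to the narrow type. The sketch below assumes an 8-bit unsigned char; the function names are hypothetical.

```c
#include <stdbool.h>

/* As on the slide: ~0x80 is computed as an int, which is negative, so it
   can never equal a promoted unsigned char (0..255). */
bool matches_broken(unsigned char c1)
{
    return c1 == ~0x80;
}

/* Casting the constant back to unsigned char yields 0x7F, which is the
   value the author most likely meant to compare against. */
bool matches_fixed(unsigned char c1)
{
    return c1 == (unsigned char)~0x80;
}
```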
Be careful with comparisons!
• They can cancel optimizations
void f0(unsigned int c)
{
unsigned int i;
for (i = 0; i <= c; ++i)
{
a[i] = b[i];
}
}

What if c == UINT_MAX?
• i will reach UINT_MAX and wrap around, thus making an infinite loop
• Optimizer must assume that this can happen, so it cancels several loop optimizations

Avoid using <= in loop tests!


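A sketch of the same kind of copy loop written with < instead of <=. Here the parameter is an element count rather than a last index, which is an assumption that changes the interface slightly (the slide's loop copies c + 1 elements); the function name and element type are also illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies n elements. Because the test is i < n, the loop terminates for
   every possible n, so the optimizer does not have to allow for the
   counter wrapping around. */
void copy_n(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}
```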
Be careful with calculations!
• Confusing implicit casting
  • Bit shift an unsigned 32 bit object 15 times (16-bit int)

uint32_t a = 0;
a = (1 << 15);      // a = 0xFFFF8000

uint32_t b = 0;
b = (1u << 15);     // b = 0x00008000

• (1 << 15) performed as signed integer → 0x8000 = -32768
• Cast to signed long preserving value -32768 → 0xFFFF8000
• Cast to unsigned long → 0xFFFF8000
• The result of a shift has the type of the promoted left operand, so the u suffix has to go on the 1, not on the shift count

The C Standard 6.3.1.1: “If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called integral promotions.”

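A small sketch of building a 32-bit mask so that the result does not depend on the width of int. It relies on the UINT32_C macro from <stdint.h>; the function name bit_mask is hypothetical.

```c
#include <stdint.h>

/* Returns a 32-bit value with only bit n set; n must be in 0..31.
   UINT32_C(1) makes the left operand an unsigned type of at least
   32 bits, so the shift is never performed in a signed 16-bit int. */
uint32_t bit_mask(unsigned n)
{
    return UINT32_C(1) << n;
}
```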
Structures
• Structures are sometimes padded, e.g.:
  • When CPU requires alignment
  • When optimization is set for speed

struct record_a
{
    uint16_t name;    // offset 0
    uint32_t id;      // offset 4
    uint8_t  tag;     // offset 8
};

Padding to align “word”, thus record_a is 12 bytes

Structures
• Changing the alignment to minimize the size

#pragma pack(1)
struct record_b
{
    uint16_t name;
    uint32_t id;
    uint8_t  tag;
};
#pragma pack()

[Diagram: memory layout of record_b; name, id and tag are packed together within bytes 0 to 8]

This can result in significantly larger and slower code when accessing members of the structure.

Structuring your structures
Guideline: Order fields by size
• Largest objects first
• Smallest objects last
• Automatically packs your structure efficiently!

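struct record_c
{
    uint32_t id;
    uint16_t name;
    uint8_t  tag;
};

[Diagram: memory layout of record_c; id occupies the first word (from offset 0), name and tag share the second word (from offset 4), 8 bytes in total]
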
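The effect of field ordering can be checked on any toolchain with sizeof and offsetof. The sketch below repeats the record_a and record_c layouts from the slides; the accessor function names are hypothetical, and the exact sizes depend on the target's alignment rules (12 and 8 bytes are typical for a 32-bit MCU).

```c
#include <stddef.h>
#include <stdint.h>

struct record_a { uint16_t name; uint32_t id;   uint8_t tag; };  /* mixed order   */
struct record_c { uint32_t id;   uint16_t name; uint8_t tag; };  /* largest first */

/* Small accessors so the layout can be inspected at run time. */
size_t record_a_size(void)      { return sizeof(struct record_a); }
size_t record_c_size(void)      { return sizeof(struct record_c); }
size_t record_a_id_offset(void) { return offsetof(struct record_a, id); }
size_t record_c_id_offset(void) { return offsetof(struct record_c, id); }
```

Printing these values early in a project is a cheap way to confirm how much padding the compiler actually inserts.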
Structuring your structures
struct record_a
{
    uint16_t name;
    uint32_t id;
    uint8_t  tag;
};
struct record_a A;

void setA()
{
    A.name = 'A';
    A.id = 1;
    A.tag = 0x01;
}

#pragma pack(1)
struct record_b
{
    uint16_t name;
    uint32_t id;
    uint8_t  tag;
};
#pragma pack()
struct record_b B;

void setB()
{
    B.name = 'B';
    B.id = 2;
    B.tag = 0x02;
}

struct record_c
{
    uint32_t id;
    uint16_t name;
    uint8_t  tag;
};
…
struct record_c C;

void setC()
{
    C.name = 'C';
    C.id = 3;
    C.tag = 0x03;
}

                        record_a    record_b    record_c
Structure size          12 bytes    8 bytes     8 bytes
setX() code size        20 bytes    24 bytes    20 bytes
Total (data + code)     32 bytes    32 bytes    28 bytes
setX() execution time   11 cycles   22 cycles   11 cycles

Using global variables
Advantages:
• Accessible from the whole program
• Easy to understand and use

Drawbacks:
• Will not be placed in a register, operations will be slow
• Any function call may change the value
• Compiler can’t be as aggressive

Solution = Copy the value into a local variable
• Local variable will probably be placed in register, so operations on it are fast
• Function calls will not change the value
• Write back the value if needed.

Example of using temp variable with global
Before:

uint8_t gGlobal;

void foo(int32_t x)
{
    ...
    bar(gGlobal);              // bar may change gGlobal
    ...
    gGlobal++;                 // Memory read and write operation
    ...
}

After:

uint8_t gGlobal;

void foo(int32_t x)
{
    uint8_t cTemp = gGlobal;   // cTemp probably placed in register
    ...
    bar(cTemp);                // bar cannot change cTemp (any write bar makes to gGlobal is overwritten below)
    ...
    cTemp++;                   // Fast update of register
    ...
    gGlobal = cTemp;           // Store the result back
}

Taking addresses
• Address taken → uses memory
  • Cannot be register-allocated

Before:

int a;
...
scanf("%d",&a);

Func1(a);
FuncN(a);

After:

int a, temp;
...
scanf("%d",&temp);
a=temp;
/* After this temp is dead, and the local register-placed 'a' is used */

Func1(a);
FuncN(a);

Registers and locals
• Register usage
  • Variables can share a register
  • Only present in register while “live”

void foo(char a)
{
    char b,i;

    b=a<<3;
    for(i=0;… )
        OUTPUT(b);

    b=a<<5;
    for(i=0;… )
        OUTPUT(b);
}

[Diagram: live ranges of a, b and i, placed in registers R0, R1 and R2]

Registers and locals
void foo(char a)
{
    char b,i,k;

    b=a<<3;
    for(i=0;… )
        OUTPUT(b);

    k=a<<4;
    OUTPUT(k);

    b=a<<5;
    for(i=0;… )
        OUTPUT(b);
}

[Diagram: live ranges of a, b, i and k, placed in registers R0, R1, R2 and R3]

Registers and locals
• Don’t worry about “extra” variables

void foo(char a)
{
    char b,i,k,j;

    b=a<<3;
    for(i=0;… )
        OUTPUT(b);

    k=a<<4;
    OUTPUT(k);

    b=a<<5;
    for(j=0;… )
        OUTPUT(b);
}

[Diagram: live ranges of a, b, i, k and j; a is in R0, b is in R1, and i, k and j all share R2]

Varargs
• Variable number of arguments
• int printf(char *, ...)
• Special macros to access arguments
• Forces arguments to stack
• Step through them using pointers
• No parameters in registers
• Source code gets more complex
• Error-prone
• Object code gets bigger
• Bad idea, simply
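To make the cost concrete, here is a sketch of a variadic function next to a fixed-signature alternative. Both function names are hypothetical; the macros va_start, va_arg and va_end are the standard ones from <stdarg.h>.

```c
#include <stdarg.h>

/* Variadic version: the caller must pass the right count and types,
   and nothing checks that it did. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);   /* step through the arguments */
    }
    va_end(args);
    return total;
}

/* Fixed-signature alternative: fully type-checked, and the two
   parameters can be passed in registers on most ABIs. */
int sum_array(const int *values, int count)
{
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```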
Function prototypes
• C and function calls:
• No declaration: call out of the blue
• K&R declarations: void foo();
• ANSI prototypes: char bar(short);
• Without ANSI prototypes:
• No proper type-checking
• All arguments are promoted: this adds casting code, and the larger parameter data means more stack is used
• Always use ANSI prototypes!
• Turn on checking in compiler!
• ELF has no type checking at link time
• You can make sure you use the same prototype everywhere by using a header file.

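A sketch of the header-file approach, shown in one block for brevity. The prototype char bar(short) is taken from the slide; the file names in the comments, the function body and call_bar are hypothetical.

```c
/* --- would normally live in a shared header, e.g. bar.h --- */
char bar(short value);            /* ANSI prototype: arguments are type-checked */

/* --- would normally live in bar.c, which includes bar.h --- */
char bar(short value)
{
    return (char)(value & 0x7F);  /* hypothetical body for illustration */
}

/* --- any caller that includes bar.h sees the same prototype --- */
char call_bar(void)
{
    return bar(65);               /* the int constant is converted to short */
}
```

Because the defining file includes the same header as the callers, the compiler reports any mismatch between the prototype and the definition.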
Static vs. volatile
• The keyword static means that all references to a variable are known
  • It also means that the compiler/linker will assign a RAM address for the variable
  • The variable can only be used in the source file where it was declared
• The keyword volatile means that all accesses to a variable must be preserved
  • Useful when you’re doing shared access of a variable
  • Also handy for trigger access, where accessing the variable is a stimulus for doing some other action
  • Can be used for modified access, i.e. where the contents of the variable can change in ways not known to the compiler

Static vs. volatile
• Is it atomic?

volatile int32_t vol = 1;
void f5()
{
    vol++;    /* Is this atomic? */
}

LDR.N R0,??DataTable7    ;; vol address
LDR   R1,[R0, #+0]       ;; Atomic
ADDS  R1,R1,#+1          ;; ADDS…
STR   R1,[R0, #+0]       ;; Atomic
BX    LR                 ;; return

Never assume that volatile means atomic.

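Where the toolchain supports C11 atomics, one way to get an indivisible increment is <stdatomic.h>. This is only a sketch under that assumption: not every embedded compiler or core provides it, and a common alternative on small MCUs is to disable interrupts briefly around the read-modify-write. The names counter and increment_counter are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic int32_t counter = 1;

/* Atomically adds 1 and returns the new value. atomic_fetch_add returns
   the value held before the addition, hence the + 1. */
int32_t increment_counter(void)
{
    return atomic_fetch_add(&counter, 1) + 1;
}
```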
Static vs. volatile
• The “empty loop” pitfall
  • Empty delay loop does not work
    • Code with no effect
    • Gets removed
    • Delay is always zero
• To really delay:
  • Access volatile variable
  • Use OS services
  • Use CPU timer

void delay(int time)
{
    int i;
    for (i=0;i<time;i++)
        {}
    return;
}

void SetHW(void)
{
    SET_DIRECTION(0x20);
    delay(100);
    OUT_SIGNAL(0xB4);
    delay(200);
    READ_PORT(0x02);
}

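A sketch of the "access volatile variable" option: making the loop counter volatile forces the compiler to keep every read and write, so the loop is not removed. The function name and the returned count are assumptions for illustration, and the real-time length of the delay still depends on the clock speed and the generated code, so a CPU timer remains the better choice when timing matters.

```c
#include <stdint.h>

/* Busy-waits for the given number of loop iterations. Every access to
   the volatile counter must be preserved, so the loop survives
   optimization. Returns the final counter value. */
uint32_t delay_volatile(uint32_t iterations)
{
    volatile uint32_t i;

    for (i = 0; i < iterations; i++) {
        /* intentionally empty */
    }
    return i;
}
```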
Don’t write “clever” code
• Straightforward code
  • Easy to read → Easy to maintain
  • Better optimization & better tested

Clever solution:

unsigned long int a;
unsigned char b;
b|=!!(a<<11);

a = (b == 3) ? 4 : 5;

Better solution:

unsigned long int a;
unsigned char b;
if((a & 0x1FFFFF) != 0)
    b |= 0x01;

if (b == 3)
    a = 4;
else
    a = 5;

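The two versions of the bit test are equivalent only when the variable is 32 bits wide, which is another argument for the explicit mask. The sketch below pins the width with uint32_t so the comparison holds on any host; the function names are hypothetical.

```c
#include <stdint.h>

/* "Clever": shifting left by 11 discards the top 11 bits, so the result
   is non-zero exactly when one of the low 21 bits is set. */
uint8_t low_bits_set_clever(uint32_t a)
{
    return (uint8_t)!!(uint32_t)(a << 11);
}

/* Straightforward: the mask states which bits are being tested. */
uint8_t low_bits_set_clear(uint32_t a)
{
    return ((a & 0x1FFFFFu) != 0u) ? 1u : 0u;
}
```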
Comments on “clever” code
“If understanding the code requires knowledge of page 543 of the spec, then chances are nobody will understand the code. I prefer simple and understandable.”

“The more you stray from the well worn paths, the more likely you'll write code the compiler misunderstands and either gets wrong or optimizes poorly.”

“What has precedence between || and &? Correct answer is: who cares! Don't write code like that!”

The stack
• The stack is used for:
  • Local variables
  • Return addresses
  • Function arguments
  • Compiler temporaries
  • Interrupt contexts
• Life span is the duration of the function

[Diagram: memory map showing the heap, the stack with the stack pointer (SP), and the global/static variables]

Stack overflow
• Stack overflow
  • there is no protection on SP
  • the stack grows into the global area overwriting application data
    • corrupted variables
    • wild pointers
    • corrupted return addresses
• Errors are really hard to catch!
• Setting the stack size
  • too small - overflow
  • too big - waste of memory

[Diagram: memory map showing the heap, the stack and the global/static variables; when SP moves past the end of the stack it enters a “danger zone!” in the global/static variables]

Avoiding stack overflow
• Test and measurement methods
• Track the stack pointer
• Use stack guard zones
• Fill the stack area with an arbitrary bit pattern

• Calculation methods
• Manual calculation
• Static stack calculation tool

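The fill-pattern method can be sketched as below. The example works on an ordinary byte array standing in for the stack region, and assumes a descending stack (the lowest addresses are used last); the pattern value and function names are hypothetical. On a real target the region bounds would come from the linker configuration.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xCDu    /* arbitrary, recognizable bit pattern */

/* Paints the whole region at start-up, before the stack is in use. */
void stack_paint(uint8_t *region, size_t size)
{
    for (size_t i = 0; i < size; ++i) {
        region[i] = STACK_FILL;
    }
}

/* Counts how many bytes at the low end still hold the pattern, i.e.
   how much of the region has never been touched. */
size_t stack_unused_bytes(const uint8_t *region, size_t size)
{
    size_t unused = 0;
    while (unused < size && region[unused] == STACK_FILL) {
        ++unused;
    }
    return unused;
}
```

Reading the unused count after a long test run gives a high-water mark; a result close to zero suggests the stack size is too small.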
Avoiding stack overflow
• Avoid printf() and its relatives…
• Pass by reference instead of by copy
• At least for objects of size greater than the register size
• Limit the number of arguments to a function
  • The ABI of an MCU tells a compiler how many arguments can be passed in registers; all others must be passed on the stack
  • void doThisOrThat(…,doWhat); → void doThis(…); void doThat(…);

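A sketch of pass by reference versus pass by copy for an object larger than a register. The structure and function names are hypothetical; the actual stack cost of the by-copy call depends on the ABI.

```c
#include <stdint.h>

struct sample_block {            /* hypothetical 64-byte object */
    int32_t samples[16];
};

/* By copy: the caller has to copy all 64 bytes, typically onto the stack. */
int32_t sum_by_copy(struct sample_block block)
{
    int32_t total = 0;
    for (int i = 0; i < 16; ++i) {
        total += block.samples[i];
    }
    return total;
}

/* By reference: only a pointer is passed, usually in a register.
   const documents that the callee does not modify the object. */
int32_t sum_by_reference(const struct sample_block *block)
{
    int32_t total = 0;
    for (int i = 0; i < 16; ++i) {
        total += block->samples[i];
    }
    return total;
}
```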
Summary

Thank You!
Questions?

@ESC_Conf