2016 Esc SV Efficient Embedded Programming MG
Shawn A. Prestridge
Senior Field Applications Engineer, IAR Systems
#ESCsv
Agenda
Structuring your application
To be efficient and portable
• Isolate device-dependent code
• Leave most of the code undisturbed
• Use tuned code where needed
[Diagram: General Code and ComInterface layered above Hardware; generic program files and tuned program files; labels: Size Optimization, Speed Optimization]
Use “natural” data sizes
• Different architectures have different “natural” data sizes
• Different available memories, sizes, etc.
Cost of using unnatural data sizes
char test_char(char a, char b)
{
    return(a+b);
}

int test_int(int a, int b)
{
    return(a+b);
}
A single extra cycle executed once will not noticeably hurt your application; however, if this
function is called repeatedly, performance degrades. Moreover, if you use unnatural data sizes
throughout your program, the application also grows unnecessarily.
Using fast types to get appropriate data sizes
• Use appropriate data size
• 8-bit operations less efficient on 32-bit CPU
• 32-bit operations hard for 8-bit CPU
32-bit machine:
typedef int int_fast8_t;
typedef int int_fast16_t;
typedef int int_fast32_t;
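A minimal sketch of how the fast types from <stdint.h> might be used so that the same source picks a natural width on each target. The function name sum_samples and its parameters are hypothetical and not from the slides.

```c
#include <stdint.h>

/* Hypothetical helper: sums a small buffer of 8-bit samples.
 * The loop counter and the accumulator use "fast" types, so the compiler
 * is free to pick the CPU's natural register width for them. */
uint_fast16_t sum_samples(const uint8_t *samples, uint_fast8_t count)
{
    uint_fast16_t total = 0;
    for (uint_fast8_t i = 0; i < count; ++i) {
        total += samples[i];
    }
    return total;
}
```

The stored data stays 8 bits wide; only the working variables are allowed to grow to whatever the target handles best.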
Signed or unsigned?
Think about signedness!
Signed
• Negative values possible
• Arithmetic operations ( + - * / % ) always performed
• Operations will never be cheaper, but in many cases more expensive
Unsigned
• Negative values impossible
• Bit operations ( << >> | & ^ ~ ) are consistent
• Arithmetic operations may be optimized to bit-operations.
Unless you need to handle negative numbers, use unsigned types
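As an illustration of the point about arithmetic becoming bit operations, here is a small sketch. The function names are hypothetical; whether a given compiler actually emits a shift or a mask depends on the target and the optimization level.

```c
#include <stdint.h>

/* Unsigned: the compiler may replace these with a shift and a mask. */
uint32_t div8_unsigned(uint32_t x) { return x / 8u; }
uint32_t mod8_unsigned(uint32_t x) { return x % 8u; }

/* Signed: division truncates toward zero, so a plain arithmetic shift is
 * not enough for negative values and extra fix-up code is usually needed. */
int32_t div8_signed(int32_t x) { return x / 8; }
```

For example, div8_signed(-9) must return -1, which a simple shift would not produce on most machines.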
Floating point numbers
• IEEE 754: float
• Wide range: Float 10^-38 to 10^38, Double 10^-308 to 10^308
• Good precision: Float 10^-7, Double 10^-16
• Designed for giving small error in complex computations
• Expensive in size and speed...unless there’s hardware support
• “Real-world” data usually have:
• Fixed range
• Limited precision is available/needed
• Fixed-point arithmetic:
• Implemented using integers
• Can give significant savings (size and speed)
• Use “relaxed floating-point semantics”
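The fixed-point bullets above can be sketched with plain integers. The Q16.16 format, the type name q16_16_t and the helper names are assumptions chosen for illustration; a real application would pick the format from the range and precision of its data.

```c
#include <stdint.h>

/* Q16.16 fixed point: 16 integer bits and 16 fractional bits (assumed format). */
typedef int32_t q16_16_t;

#define Q16_ONE 65536 /* the value 1.0 in Q16.16 */

/* Converts a small integer; the caller keeps v within -32768..32767. */
static inline q16_16_t q16_from_int(int32_t v)
{
    return (q16_16_t)(v * Q16_ONE);
}

/* Multiplies in 64 bits, then scales back so the result stays in Q16.16.
 * Division by a power of two is used instead of >> to keep the behavior
 * well defined for negative products. */
static inline q16_16_t q16_mul(q16_16_t a, q16_16_t b)
{
    return (q16_16_t)(((int64_t)a * (int64_t)b) / Q16_ONE);
}
```

With these helpers, 1.5 times 2.0 is computed entirely with integer instructions and no floating-point library is linked in.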
Integer or floating point?
• Floating point is very expensive
• Arithmetic is more complex
• Brings in large library (from C runtime)
• Could require other functionality to be more complex
• printf( ) is ~3 times larger
• scanf( ) is also ~3x larger
• Use only when really needed
• Can be done inadvertently
#define Other 20
#define ImportantRatio (1.05 * Other)
#define ImportantRatioBetter ((int)(1.05 * Other))
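A short sketch of why the first macro pulls in floating point while the second does not. The first three macros come from the slide; ImportantRatioInt and scale_by_ratio are hypothetical additions showing a pure integer alternative.

```c
#define Other 20

/* The expression has type double, so run-time arithmetic that uses this
 * macro is carried out in floating point. */
#define ImportantRatio (1.05 * Other)

/* The cast is folded at compile time; users of the macro see the int 21. */
#define ImportantRatioBetter ((int)(1.05 * Other))

/* Hypothetical alternative: express the ratio as 105/100 in integers. */
#define ImportantRatioInt ((105 * Other) / 100)

/* Hypothetical user of the macro: compiles to an integer multiply only. */
int scale_by_ratio(int x)
{
    return x * ImportantRatioBetter;
}
```

Had scale_by_ratio used ImportantRatio instead, x would be converted to double, multiplied, and converted back.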
Be careful with comparisons!
• Confusing integral promotion
• 8-bit char, 16-bit int
void f1(unsigned char c1)
{
    if (c1 == ~0x80)    /* Test always false */
    {
        …
    }
}
What actually is done
void f1(unsigned char c1)
{
    if ((int)c1 == 0xFF7F)
    {
        …
    }
}
https://www.iar.com/support/tech-notes/general/integral-types-and-possibly-confusing-behavior/
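One possible way to make the comparison behave as intended is to cast the complemented constant back to the narrow type. This is a sketch assuming an 8-bit unsigned char; the function names are hypothetical, and some compilers warn that the first comparison is always false.

```c
#include <stdbool.h>

/* As on the slide: ~0x80 is evaluated as an int, so its high bits are set
 * and it can never equal a value in the range 0..255. */
bool matches_naive(unsigned char c1)
{
    return c1 == ~0x80;
}

/* Casting the complement back to unsigned char keeps only the low 8 bits,
 * which gives 0x7F. */
bool matches_fixed(unsigned char c1)
{
    return c1 == (unsigned char)~0x80;
}
```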
Be careful with comparisons!
• They can cancel optimizations
void f0(unsigned int c)
{
unsigned int i;
for (i = 0; i <= c; ++i)
{
a[i] = b[i];
}
}
What if c == UINT_MAX?
• i will reach UINT_MAX and wrap around, thus making an infinite loop
• Optimizer must assume that this can happen, so it cancels several loop optimizations
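A sketch of one way to sidestep the wrap-around case: pass an element count and compare with < rather than <=. The function name copy_first and its parameters are hypothetical. Note that the slide's loop copies c + 1 elements, so a caller switching to this form passes the count instead of the last index.

```c
#include <stddef.h>

/* Copies the first `count` elements. With `i < count` the condition
 * becomes false before i could wrap, so the loop always terminates and
 * the optimizer knows the trip count. */
void copy_first(int *dst, const int *src, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        dst[i] = src[i];
    }
}
```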
Structures
• Structures are sometimes padded, e.g.:
• When CPU requires alignment
• When optimization is set for speed
struct record_a
{
    uint16_t name;
    uint32_t id;
    uint8_t tag;
};
Layout: name at offset 0, id at offset 4, tag at offset 8, end at offset 12
Padding to align to a “word”; thus, record_a occupies 12 bytes
Structures
• Changing the alignment to minimize the size
#pragma pack(1)
struct record_b
{
    uint16_t name;
    uint32_t id;
    uint8_t tag;
};
#pragma pack()
Layout: name, id and tag are stored back to back with no padding, so record_b occupies 7 bytes
This can result in significantly larger and slower code when accessing members of the structure.
Structuring your structures
Guideline: Order fields by size
• Largest objects first
• Smallest objects last
• Automatically packs your structure efficiently!
struct record_c
{
    uint32_t id;
    uint16_t name;
    uint8_t tag;
};
Layout: id at offset 0, name at offset 4, tag at offset 6, end at offset 8
Structuring your structures
struct record_a
{
    uint16_t name;
    uint32_t id;
    uint8_t tag;
};
struct record_a A;

#pragma pack(1)
struct record_b
{
    uint16_t name;
    uint32_t id;
    uint8_t tag;
};
#pragma pack()

struct record_c
{
    uint32_t id;
    uint16_t name;
    uint8_t tag;
};
…
struct record_c C;
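The effect of field ordering can be checked on any target with sizeof and offsetof. The sketch below compares the padding in record_a and record_c; the helper names and the RECORD_PAYLOAD macro are hypothetical, and the actual sizes depend on the alignment rules of the target.

```c
#include <stddef.h>
#include <stdint.h>

struct record_a { uint16_t name; uint32_t id; uint8_t tag; };
struct record_c { uint32_t id; uint16_t name; uint8_t tag; };

/* Bytes of real data in either record: 2 + 4 + 1 = 7. */
#define RECORD_PAYLOAD (sizeof(uint16_t) + sizeof(uint32_t) + sizeof(uint8_t))

/* Padding is whatever the compiler added on top of the payload. */
size_t padding_in_record_a(void)
{
    return sizeof(struct record_a) - RECORD_PAYLOAD;
}

size_t padding_in_record_c(void)
{
    return sizeof(struct record_c) - RECORD_PAYLOAD;
}
```

On a typical 32-bit target, record_a carries 5 bytes of padding and record_c only 1, without any packing pragma.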
Using global variables
Advantages:
• Accessible from the whole program
• Easy to understand and use
Drawbacks:
• Will not be placed in a register, operations will be slow
• Any function call may change the value
• Compiler can’t be as aggressive
Example of using temp variable with global
Before:
uint8_t gGlobal;
void foo(int32_t x)
{
    ...
    bar(gGlobal);           // bar may change gGlobal
    ...
    gGlobal++;              // Memory read and write operation
    ...
}
After:
uint8_t gGlobal;
void foo(int32_t x)
{
    uint8_t cTemp=gGlobal;  // cTemp probably placed in register
    ...
    bar(cTemp);             // bar cannot change cTemp
    ...
    cTemp++;                // Fast update of register
    ...
    gGlobal=cTemp;          // Store the result back
}
Taking addresses
• Address taken uses memory
• Cannot be register-allocated
Before:
int a;
...
scanf("%d",&a);
...
Func1(a);
…
FuncN(a);
After:
int a, temp;
...
scanf("%d",&temp);
a=temp;
/* After this temp is dead, and the local register-placed 'a' is used */
...
Func1(a);
…
FuncN(a);
Registers and locals
• Register usage
• Variables can share a register
• Only present in register while “live”
void foo(char a)
{
    char b,i;
    b=a<<3;
    for(i=0;… )
        OUTPUT(b);
    b=a<<5;
    for(i=0;… )
        OUTPUT(b);
}
Live ranges: a in R0, b in R1, i in R2
Registers and locals
void foo(char a)
{
    char b,i,k;
    b=a<<3;
    for(i=0;… )
        OUTPUT(b);
    k=a<<4;
    OUTPUT(k);
    b=a<<5;
    for(i=0;… )
        OUTPUT(b);
}
Live ranges: a in R0, b in R1, i in R2, k in R3
Registers and locals
• Don’t worry about “extra” variables
void foo(char a)
{
    char b,i,k,j;
    b=a<<3;
    for(i=0;… )
        OUTPUT(b);
    k=a<<4;
    OUTPUT(k);
    b=a<<5;
    for(j=0;… )
        OUTPUT(b);
}
Live ranges: a in R0, b in R1, i in R2, k in R2, j in R2
Varargs
• Variable number of arguments
• int printf(const char *, ...)
• Special macros to access arguments
• Forces arguments to stack
• Step through them using pointers
• No parameters in registers
• Source code gets more complex
• Error-prone
• Object code gets bigger
• Bad idea, simply
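To make the cost concrete, here is a sketch that contrasts a variadic function with a fixed-signature alternative. Both function names are hypothetical. The variadic version relies on the caller passing exactly count arguments of type int, which the compiler cannot verify.

```c
#include <stdarg.h>

/* Variadic version: arguments are fetched one by one with the va_*
 * macros, and nothing checks their number or type. */
int sum_varargs(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

/* Fixed signature: a pointer and a count, both fully type-checked and
 * eligible for passing in registers. */
int sum_array(const int *values, int count)
{
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```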
Function prototypes
• C and function calls:
• No declaration: call out of the blue
• K&R declarations: void foo();
• ANSI prototypes: char bar(short);
• Without ANSI prototypes:
• No proper type-checking
• All arguments promoted: casting code and parameter data is larger = more stack used
• Always use ANSI prototypes!
• Turn on checking in compiler!
• ELF has no type checking at link time
• You can make sure you use the same prototype everywhere by using a header file.
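A sketch of the header-file approach, shown as one listing for brevity. The file names bar.h and bar.c and the body of bar are hypothetical; only the prototype char bar(short) comes from the slide.

```c
/* ---- bar.h: hypothetical header included by every caller ---- */
#ifndef BAR_H
#define BAR_H

char bar(short value); /* ANSI prototype: argument and return types are checked */

#endif /* BAR_H */

/* ---- bar.c: also includes bar.h, so a mismatch between the prototype
 * and the definition is reported at compile time ---- */
char bar(short value)
{
    return (char)(value & 0x7F); /* illustrative body only */
}
```

Because the definition and all callers see the same declaration, the missing link-time type check matters much less.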
Static vs. volatile
• The keyword static means that all references to a variable are known
• It also means that the compiler/linker will assign a RAM address for the variable
• The variable can only be used in the source file where it was declared
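A minimal sketch of a file-scope static variable. The names s_event_count, record_event and event_count are hypothetical.

```c
#include <stdint.h>

/* File-scope static: visible only in this source file, so the compiler
 * can see every read and write of it. */
static uint32_t s_event_count;

/* Other files reach the counter only through these functions. */
void record_event(void)
{
    ++s_event_count;
}

uint32_t event_count(void)
{
    return s_event_count;
}
```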
Static vs. volatile
• Is it atomic?
volatile int32_t vol = 1;
void f5()
{
    vol++; /* Is this atomic? */
}
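The volatile qualifier by itself does not make the increment atomic. The sketch below spells out the read-modify-write that vol++ stands for and shows one possible alternative using C11 atomics. The function names and atomic_counter are hypothetical, and the availability of <stdatomic.h> on a given embedded toolchain is an assumption; disabling interrupts around the update is another common approach.

```c
#include <stdatomic.h>
#include <stdint.h>

volatile int32_t vol = 1;

/* What vol++ amounts to: three separate steps. An interrupt that also
 * modifies vol between the load and the store would lose its update. */
void increment_volatile(void)
{
    int32_t tmp = vol; /* load */
    tmp = tmp + 1;     /* modify */
    vol = tmp;         /* store */
}

/* C11 alternative, assuming the toolchain supports <stdatomic.h>. */
atomic_int atomic_counter = 1;

void increment_atomic(void)
{
    atomic_fetch_add(&atomic_counter, 1); /* indivisible read-modify-write */
}
```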
Static vs. volatile
• The “empty loop” pitfall
• Empty delay loop does not work
• Code with no effect
• Gets removed
• Delay is always zero
• To really delay:
• Access volatile variable
• Use OS services
• Use CPU timer
void delay(int time)
{
    int i;
    for (i=0;i<time;i++)
    {}
    return;
}
void SetHW(void)
{
    SET_DIRECTION(0x20);
    delay(100);
    OUT_SIGNAL(0xB4);
    delay(200);
    READ_PORT(0x02);
}
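A sketch of the "access volatile variable" option: the loop body touches a volatile object, so the compiler has to keep every iteration. The variable name g_delay_ticks is hypothetical. The actual delay still depends on clock speed and generated code, which is why a timer or OS service is usually the more dependable choice.

```c
/* Volatile counter: each increment is an observable access, so the
 * compiler is not allowed to delete the loop. */
static volatile unsigned int g_delay_ticks;

void delay(unsigned int time)
{
    for (unsigned int i = 0; i < time; ++i) {
        ++g_delay_ticks;
    }
}
```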
Don’t write “clever” code
• Straightforward code
• Easy to read, easy to maintain
• Better optimization & better tested
a = (b == 3) ? 4 : 5;

if (b == 3)
    a = 4;
else
    a = 5;
Comments on “clever” code
“If understanding the code requires knowledge of page 543 of the spec, then chances
are nobody will understand the code. I prefer simple and understandable.”
“The more you stray from the well worn paths, the more likely you'll write code the
compiler misunderstands and either gets wrong or optimizes poorly.”
The stack
• The stack is used for:
• Local variables
• Return addresses
• Function arguments
• Compiler temporaries
• Interrupt contexts
[Diagram: memory map showing the heap, the stack, and the stack pointer (SP)]
Stack overflow
• Stack overflow
• There is no protection on SP
• The stack grows into the global area, overwriting application data
• Corrupted variables
• Wild pointers
• Corrupted return addresses
• Calculation methods
• Manual calculation
• Static stack calculation tool
[Diagram: memory map showing the heap, the stack, and the stack pointer (SP)]
Avoiding stack overflow
• Avoid printf() and its relatives…
• Pass by reference instead of by copy
• At least for objects of size greater than the register size
• Limit the number of arguments to a function
• The ABI of an MCU tells a compiler how many arguments can be passed in registers; all others must be passed on the stack
• Replace void doThisOrThat(…,doWhat); with:
void doThis(…);
void doThat(…);
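The pass-by-reference advice can be sketched as follows. The struct sample_block, BLOCK_LEN and both function names are hypothetical; how a by-value struct is actually passed depends on the ABI of the target.

```c
#include <stdint.h>

#define BLOCK_LEN 64

struct sample_block {
    int16_t samples[BLOCK_LEN]; /* 128 bytes of data */
};

/* By value: the whole 128-byte struct is copied for the call,
 * typically onto the stack. */
int32_t sum_by_value(struct sample_block block)
{
    int32_t total = 0;
    for (int i = 0; i < BLOCK_LEN; ++i) {
        total += block.samples[i];
    }
    return total;
}

/* By reference: only a pointer is passed, usually in a register. */
int32_t sum_by_reference(const struct sample_block *block)
{
    int32_t total = 0;
    for (int i = 0; i < BLOCK_LEN; ++i) {
        total += block->samples[i];
    }
    return total;
}
```

Both functions return the same result; the difference is in how much stack each call consumes.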
Summary
Thank You!
Questions?
@ESC_Conf