Code Generation
J. S. Freudenberg
1 Simulink Models
Suppose that you have developed a Simulink model of a virtual world, such as a wall or spring-mass system.
We have seen how to choose the parameters of the virtual world so that it has desired properties. For
example, we have seen how to choose the spring constant and inertia of the virtual spring-mass system so
that it has a desired frequency of oscillation and satisfies a maximum torque limit. We also learned how to
add damping to such a model to counteract the destabilizing effect of forward Euler integration. Once we
develop a model of the virtual world that behaves correctly in simulation, it remains to implement this world
in C code that can be executed on the MPC5553 microprocessor. Until now we have simply written the C code
by hand, and have debugged any resulting errors as necessary. Such errors may arise from simple mistakes
in implementing the force feedback algorithm, such as using incorrect parameter values or sign errors. They
may arise in converting from physical units to units that the processor understands, such as duty cycle and
encoder counts. Other errors arise from type conversions, such as those from signed to unsigned integers of
different lengths. Furthermore, changes to the virtual world that are relatively easy to model in Simulink by
adding additional blocks may require substantial work to code in C.
The potential difficulties with hand coding control algorithms have not proven too burdensome in our lab
exercises. However, many real world applications are much more complex, and hand coding an algorithm,
with all the necessary debugging, may take months. Hence, if we already have an algorithm
that works well in simulation, it would be advantageous to be able to generate C code directly from the
Simulink model. Even if this code is not used in production, it may be used for testing on hardware, thus
enabling the rapid prototyping paradigm for embedded software design. In this approach, control algorithms
are first tested on a model of the system to be controlled. If the algorithms work correctly on the model,
then autocode generation is used to obtain C code that can be tested on the mechanical hardware, thus
enabling an additional level of testing and debugging to take place. The idea is that the algorithms will be
known to work before they are coded into C, and thus any errors that arise must be in the coding, not in
the original algorithm specification.
Consider the Simulink diagram in Figure 1. As we have seen, with appropriate values of k, Jw, b, and
T, we may successfully implement a virtual spring-mass system that is a harmonic oscillator with specified
period that satisfies the limit imposed on the reaction torque. The C code required to implement this
system on the microprocessor must perform several tasks in addition to computing the reaction torque for a
given wheel position, as shown in Figure 1. Wheel position must be obtained from the QD function of the
eTPU. The duty cycle must be updated and sent to the PWM function of the eMIOS subsystem. Because
wheel position comes from the eTPU in encoder counts, it must be converted into degrees. The reaction
torque generated by the Simulink model is in N-mm, and must be translated into duty cycle. Variable type
conversions must be performed. The eTPU and eMIOS peripherals on the MPC5553 must be initialized,
just as we initialized them when hand coding in C.
The various initialization and unit conversion tasks are tedious and error prone. We shall see that the best
way to deal with these is to write Simulink subsystems that perform these tasks correctly. It will take some
effort to do so, but once we are done, we will have a library of these subsystems that can be reused so that
we never have to do these low level operations again. This will free us to spend time designing virtual worlds
using the Simulink model, and then automatically creating the C code that runs on the microprocessor.
Figure 1: Discrete Simulation of Virtual Wheel and Torsional Spring with Damping
(virtual wheel discrete.mdl).
∗ Revised November 5, 2009.
To generate C code from a Simulink model, we shall need several additional software tools. These include
Real-Time Workshop [5], a Mathworks product that generates C code from a Simulink model, and Embedded
Coder [6], another Mathworks product that, when used in conjunction with Real-Time Workshop, ensures
that the generated code is compact and efficient. In addition, we shall need Simulink blocks that initialize
the MPC5553 microprocessor and that supply device drivers for its peripherals, such as the eTPU and
eMIOS. These latter blocks are available from the RAppID Toolbox [2], a product of Freescale Semiconductor
Corporation.
2 Bit Manipulations
Recall our use of a C “union” to access various bit-fields in a register. One can also perform
bit manipulations using Simulink blocks. This is sometimes necessary when developing a Simulink model
to generate code that must interface with hardware (think of the DIP switches and LEDs in the lab). For
example, Figures 2-3 illustrate a subsystem that converts a single 32-bit unsigned integer into four 8-bit
unsigned integers. The blocks used to build these figures are found in the Simulink Library Browser:
- Simulink/Sources/Constant
- Simulink/Signal Attributes/Data Type Conversion
- Simulink/Ports & Subsystems/Subsystem
- Simulink/Sinks/Display
- Simulink/Logic and Bit Operations/Bitwise Operator
- Simulink/Logic and Bit Operations/Shift Arithmetic
- Simulink/Signal Routing/Mux
The same blocks may be used to build a subsystem that performs the reverse conversion, from four 8-bit
unsigned integers to a single 32-bit unsigned integer. Such a subsystem is illustrated in Figures 4-5. A more
elegant way to perform bit manipulations is through the use of Matlab S-functions to insert C code directly
into a Simulink block. We shall learn about S-functions in a subsequent handout.
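For comparison with the block diagrams in Figures 2-5, the same mask-and-shift operations can be sketched directly in C. This is only an illustrative sketch: the function names split_u32 and join_u32 are hypothetical and are not taken from the lab code or from any generated code.

```c
#include <stdint.h>

/* Split a 32-bit unsigned integer into four bytes, least significant first.
 * Mirrors Figure 3: Bitwise AND with a mask, shift right, convert to uint8. */
void split_u32(uint32_t in, uint8_t out[4])
{
    out[0] = (uint8_t)(in & 0x000000FFu);          /* least significant 8 bits */
    out[1] = (uint8_t)((in & 0x0000FF00u) >> 8);
    out[2] = (uint8_t)((in & 0x00FF0000u) >> 16);
    out[3] = (uint8_t)((in & 0xFF000000u) >> 24);  /* most significant 8 bits */
}

/* Reassemble four bytes into a 32-bit unsigned integer.
 * Mirrors Figure 5: convert to uint32, shift left, sum the results. */
uint32_t join_u32(const uint8_t in[4])
{
    return (uint32_t)in[0]
         + ((uint32_t)in[1] << 8)
         + ((uint32_t)in[2] << 16)
         + ((uint32_t)in[3] << 24);
}
```

With the input 2^15 = 32768 used in Figure 2, split_u32 produces the bytes 0, 128, 0, 0, and join_u32 applied to those bytes recovers 32768.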
Port Data Types
It is often convenient to have the data types of all signals displayed on the Simulink diagram. To do so, enable
the option Format/Port/Signal Displays/Port Data Types. The results are illustrated in Figures 2-4.
Note that there are also options to display signal dimensions (Format/Port/Signal Displays/Signal
Dimensions), and to denote vector valued signals with wider lines (Format/Port/Signal Displays/Wide
Nonscalar Lines).
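[Figure 2 diagram: a Constant block with value 2^15, converted to uint32, feeds the subsystem (In1, Out1); the uint8 output is shown on a Display block.]
Figure 2: A subsystem to convert a 32 bit unsigned integer into four 8 bit unsigned integers
(thirtytwobit foureightbits.mdl).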
[Figure 3 diagram: In1 (uint32) feeds four parallel paths. Each path applies a Bitwise AND with mask 0xFF, 0xFF00, 0xFF0000, or 0xFF000000, then a Shift Arithmetic block shifting right by 0, 8, 16, or 24 bits, then a Data Type Conversion to uint8. The four uint8 results, from least to most significant 8 bits, form Out1.]
Figure 3: Inside the subsystem block that performs the conversion in Figure 2.
Figure 4: A subsystem to convert four 8 bit unsigned integers into a 32 bit unsigned integer
(foureightbits thirtytwobit.mdl).
[Figure 5 diagram: each of the four uint8 inputs from In1 passes through a Data Type Conversion to uint32 and a Shift Arithmetic block shifting left by 0, 8, 16, or 24 bits; a Sum of Elements block adds the four uint32 results to form Out1.]
Figure 5: Inside the subsystem block that performs the conversion in Figure 4.
holds the current value of the 24-bit counter used by the eTPU to keep track of wheel position (see footnote 1).
[Figure 6 block, parameters as displayed:
eTPU: 0
Channel: 0
Secondary Channel: 1
Max Speed (rpm): 60000
Position Counter Increments per Revolution (4 X lines on motor): 4096
Position Counts Scaling: PositionCounts X 4
FUNCTION NUMBER: FS_ETPU_QD_FUNCTION_NUMBER
ENTRY TABLE ENCODING: FS_ETPU_QD_TABLE_SELECT
Port labels: Angular Velocity (rpm); Direction (0−pos: 1−neg)]
Figure 6: A device driver block for the QD function of the eTPU (eTPU QD.mdl).
The driver block in Figure 6 may be used to develop a subsystem to convert encoder counts into wheel
angle in degrees. Such a subsystem is shown in Figures 7-8. Similarly, a subsystem may be created that
converts reaction torque from N-mm to PWM duty cycle (Figures 9-10).
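The two unit conversions just described can be sketched in C as follows. The encoder scaling assumes the 4096 counts per revolution shown in Figure 6 and the 16-bit portion of the position counter. The torque-to-duty-cycle mapping is shown only in its general shape: the constants DUTY_ZERO_TORQUE, DUTY_PER_NMM, DUTY_MIN, and DUTY_MAX are hypothetical placeholders, since the actual values depend on the motor and amplifier and are not given in this section. The function names are likewise illustrative.

```c
#include <stdint.h>

#define COUNTS_PER_REV 4096.0f   /* from the QD driver block in Figure 6 */

/* Hypothetical placeholder values, NOT taken from the lab hardware. */
#define DUTY_ZERO_TORQUE 0.50f   /* duty cycle that commands zero torque */
#define DUTY_PER_NMM     0.0005f /* duty cycle change per N-mm of torque */
#define DUTY_MIN         0.25f   /* lower saturation limit */
#define DUTY_MAX         0.75f   /* upper saturation limit */

/* Convert a 16-bit encoder count into wheel angle in degrees. */
float counts_to_degrees(uint16_t counts)
{
    /* Reinterpret the 16 bits as a signed value so that counts just below
     * zero map to small negative angles (assumes two's complement). */
    int16_t signed_counts = (int16_t)counts;
    return (float)signed_counts * (360.0f / COUNTS_PER_REV);
}

/* Convert reaction torque in N-mm into a saturated PWM duty cycle. */
float torque_to_duty(float torque_nmm)
{
    float duty = DUTY_ZERO_TORQUE + DUTY_PER_NMM * torque_nmm;
    if (duty > DUTY_MAX) {
        duty = DUTY_MAX;
    }
    if (duty < DUTY_MIN) {
        duty = DUTY_MIN;
    }
    return duty;
}
```

Keeping each conversion in one small function mirrors the idea of a reusable Simulink subsystem: the scaling is written and checked once, and the virtual world itself works only in degrees and N-mm.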
Figure 7: A subsystem to read wheel angle in encoder counts from the eTPU and output wheel angle in
degrees (read wheel.mdl).
By replacing the step input and scope output in Figure 1 with subsystems that interface to the MPC5553
(see Figure 11) we begin to build a Simulink model that can be used for autocode generation of a virtual
world.
1 As in earlier labs, we only use 16 bits of this counter.
[Figure 8 diagram residue: eTPU Quadrature Decoder block, Position Count (uint16), conversion to single, output Haptic Wheel Angle (degrees); caption not recovered.]
Figure 9: A subsystem to input reaction torque in N-mm and update the duty cycle of the MIOS PWM
module (write torque.mdl).
[Figure 10 diagram residue: Reaction Torque input (single), Constant, EMIOS Output PWM block; caption not recovered.]
[Figure 11 diagram labels: Read Wheel Angle, Haptic Wheel Angle (degrees), thetaz, spring constant k, 1/virtual inertia 1/Jw, two Discrete-Time Integrators, thetawddot, thetawdot, thetaw, damping, gain −1, Reaction Torque.]
Figure 11: Simulink model from Figure 1 modified to interface with the MPC5553
(virtual wheel drivers.mdl)
4 Processor and Peripheral Initialization
Although the basic functionality of the virtual world and its interfacing are captured in Figure 11, several
items remain before it is possible to use the model for autocode generation. We need to initialize the
microprocessor we are using, as well as the peripherals such as the eTPU and the eMIOS. We also need
to control the timing with which the virtual world is updated; we have done this previously by using the
decrement counter to generate an interrupt at a specified rate. Finally, we may need to structure the
embedded software into several tasks that execute at different rates, and to address the resulting shared data
issues.
To accomplish the first item listed in the previous paragraph, we shall use an additional Simulink block,
depicted in Figure 12. This block identifies the microprocessor target, the system clock speed, the C compiler
used, and whether the generated code is in RAM or flash memory. It allows the user to specify whether a real
time operating system (RTOS) is present, in which case we use OSEKturbo [1], an OSEK/VDX compliant
RTOS available from Freescale. If an RTOS is not available, the “simpletarget” option is selected. Menus
for initializing the peripherals are available by opening the block in Figure 12.
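[Figure 12 block labels: RAppID-EC, RAppID MPC5554 Target Setup; caption not recovered.]
[Figure 13 diagram labels: Trigger(), Triggered Subsystem, Out1, Scope.]
Figure 13: Highest Level of the Virtual Spring Mass Damper System (one virtual wheel autocode.mdl)
[Diagram residue following Figure 13: Trigger f(), Read Wheel Angle, Haptic Wheel Angle (degrees), damping b, gain −1, Reaction Torque, signals of type single; caption not recovered.]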
updated. Similarly, the C code generated by Real-Time Workshop from a Simulink diagram must also take
the flow of execution into account. To see the order in which Simulink will update each block in a diagram,
enable the option Format/Block Displays/Sorted Order. The result of doing so for the simple diagram in
Figure 1 is displayed in Figure 15. Note there are two numbers on each block: the first indicates subsystem
number (in this case there is only one subsystem), the second refers to the order that the block is executed
inside that subsystem.
[Figure 15 diagram: the model of Figure 1 with a sorted-order label of the form 0:n on each block.]
Figure 15: Discrete Simulation of Virtual Wheel with Sorted Blocks. (virtual wheel discrete sort.mdl).
It is instructive to work around the diagram in Figure 15 to determine the reasons for the specified block
sorting. Keep in mind that Simulink sorts blocks according to the following two principles, paraphrased from
pp. 30-32 from Chapter 4 of the Simulink user’s guide [4]:
• During each time step of the simulation, a given block must be evaluated before any other block whose
output at that time step depends upon the output of the given block at the same time step.
• Blocks whose outputs at a given time step depend only upon past inputs and initial conditions can be
evaluated in any order consistent with the previous principle.
For example, in Figure 15, the output of the step block and the rightmost discrete integrator block must be
evaluated before the output of the leftmost summing block can be computed.
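The two sorting principles can be illustrated with a hand-written step function for the single virtual wheel. This is a sketch under stated assumptions, not the code that Real-Time Workshop generates: the names WheelState, WheelParams, and wheel_step are hypothetical, and the damping torque is assumed to be b times the wheel velocity, subtracted from the spring torque before the 1/Jw gain.

```c
typedef struct {
    float thetaw;     /* virtual wheel position (integrator state) */
    float thetawdot;  /* virtual wheel velocity (integrator state) */
} WheelState;

typedef struct {
    float k;   /* spring constant */
    float Jw;  /* virtual wheel inertia */
    float b;   /* damping coefficient */
    float T;   /* sample period */
} WheelParams;

/* Advance the virtual wheel by one sample and return the reaction torque. */
float wheel_step(WheelState *s, const WheelParams *p, float thetaz)
{
    /* 1. Integrator outputs depend only on stored state, so they are
     *    available before any algebraic block is evaluated. */
    float thetaw = s->thetaw;
    float thetawdot = s->thetawdot;

    /* 2. Algebraic blocks, each evaluated after the blocks that feed it:
     *    sum, spring gain, damping, inertia gain, output gain of -1. */
    float spring_torque = p->k * (thetaz - thetaw);
    float thetawddot = (spring_torque - p->b * thetawdot) / p->Jw;
    float reaction_torque = -spring_torque;

    /* 3. Forward Euler state updates, using values from this time step. */
    s->thetaw = thetaw + p->T * thetawdot;
    s->thetawdot = thetawdot + p->T * thetawddot;

    return reaction_torque;
}
```

Swapping steps 2 and 3 would make the summing junction use the next sample of the wheel position rather than the current one, which is exactly the kind of ordering error that block sorting prevents.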
with fast dynamics must be updated at a faster rate than a subsystem with much slower dynamics. As we
shall see in Section 7, it is possible to perform such a multirate simulation in Simulink. There are some
subtleties that arise when performing multirate simulations, and we shall also discuss these in Section 7.
When code is generated for a multirate simulation, Real-Time Workshop does one of two things. If
a real time operating system (RTOS) is available, such as OSEKturbo, then each subsystem is implemented
as a separate task in the RTOS, with faster tasks given higher priority. If an RTOS is not available, then
multitasking is simulated using nested interrupts in a procedure called “pseudo-multitasking”. Although
the generated code is different in each case, the simulations will yield equivalent results. Multitasking and
pseudo-multitasking are discussed in Chapter 8 of the User Guide for Real-Time Workshop [3].
One issue that arises when a simulation is broken into multiple tasks is that of guarding the integrity
of any data that must be shared between the tasks. This is done through the use of rate transition blocks,
which we shall illustrate in Section 7.
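To make the idea of two update rates concrete, the following sketch runs a fast step on every timer tick and a slow step on every tenth tick. It deliberately shows the simpler single-tasking arrangement, in which both steps run to completion inside one tick; it is not the nested-interrupt pseudo-multitasking code that Real-Time Workshop produces. The names base_rate_tick, fast_step, slow_step, and the ratio SLOW_RATIO are hypothetical, and the step functions only count their calls.

```c
#include <stdint.h>

#define SLOW_RATIO 10u  /* hypothetical: one slow period = 10 fast periods */

uint32_t fast_calls = 0u;  /* number of fast updates performed */
uint32_t slow_calls = 0u;  /* number of slow updates performed */

/* Stand-ins for the fast and slow subsystem updates. */
void fast_step(void) { fast_calls++; }
void slow_step(void) { slow_calls++; }

/* Called once per fast sample period, e.g. from a periodic interrupt. */
void base_rate_tick(void)
{
    static uint32_t tick = 0u;

    fast_step();           /* the fast subsystem runs on every tick */
    if (tick == 0u) {
        slow_step();       /* the slow subsystem runs every SLOW_RATIO ticks */
    }
    tick = (tick + 1u) % SLOW_RATIO;
}
```

In this arrangement the slow step must finish within a single fast period. Multitasking and pseudo-multitasking remove that restriction by letting the fast step preempt the slow one, which is why the shared data must then be protected.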
[Figure 16 diagram labels: Step input thetaz (Haptic Wheel Position); fast subsystem with spring constant k2, 1/virtual inertia 1/Jw2, and two integrators 1/s (torque2, angular speed2, Virtual Wheel Position 2); slow subsystem with spring constant k1, 1/virtual inertia 1/Jw1, and two integrators 1/s (torque1, angular speed1, Virtual Wheel Position 1); total reaction torque; Scope.]
Figure 16: Continuous time model of two virtual wheels (two virtual wheels analog.mdl)
Although it should not matter for this relatively simple model, for more complex models there is an
advantage to separating the slower and faster dynamics before implementing the model on the microprocessor.
The faster portion of the model must be implemented with a smaller time step for numerical integration
than that used for the slower portion of the model. There is no need to numerically integrate the slower
dynamics at the fast rate, and doing so has the disadvantages of increasing computation time needlessly and
perhaps causing numerical roundoff errors to accumulate.
Motivated by the preceding discussion, we consider a discrete time simulation of this system using multiple
sample rates: a slow sample rate is used for the slow dynamics and a faster rate for the fast dynamics. The
model also includes damping to counteract the destabilizing effects of the forward Euler integration. The
resulting model is shown in Figure 18.
[Figure 17 plot: reaction torque (analog model) versus time, seconds, from 0 to 5; vertical axis from −1000 to 1000.]
Figure 17: Restoring torque in response to a step change in haptic wheel position
(two virtual wheel plots.m)
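The role of the damping term can be made concrete with a short calculation. It is sketched here under the assumption that each virtual wheel obeys J_w θ'' + b θ' + k θ = k θ_z with underdamped (complex) poles, and that forward Euler integration with step T maps each continuous-time pole s to the discrete-time pole z = 1 + sT.

```latex
\begin{align*}
s &= -\sigma \pm j\omega_d, \qquad
\sigma = \frac{b}{2J_w}, \qquad
|s|^2 = \sigma^2 + \omega_d^2 = \frac{k}{J_w}, \\
|z|^2 &= |1 + sT|^2
       = (1 - \sigma T)^2 + \omega_d^2 T^2
       = 1 - \frac{bT}{J_w} + \frac{kT^2}{J_w}.
\end{align*}
```

With b = 0 this gives |z|^2 = 1 + kT^2/J_w > 1, so the undamped discrete oscillation grows. Under the stated assumptions, |z| < 1 requires b > kT, which also shows why a stiffer spring or a longer time step calls for more damping.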
[Figure 18 diagram labels: Step input thetaz (haptic wheel position); fast subsystem with spring constant k2, 1/virtual inertia 1/Jw2, damping b2, and two Discrete-Time Integrators (theta2ddot, theta2dot, theta2); slow subsystem with spring constant k1, 1/virtual inertia 1/Jw1, damping b1, and two Discrete-Time Integrators (theta1ddot, theta1dot, theta1); Rate Transition blocks (ZOH, 1/z) on the signals between the two rates; reaction torque; Scope.]
Figure 18: Discrete time model of two virtual wheels (two virtual wheels discrete.mdl).
if the data is transferred correctly, the timing of the transfer may vary, with the result that it is not possible
to know exactly when the fast subsystem will begin to use the new data from the slow subsystem. To resolve
these issues, a rate transition block will effectively act like a delay equal to one slow update period. Hence
the fast subsystem will always work with a value of the slow subsystem that is delayed by one slow update
period.
The latency introduced in slow to fast data transfers described in the preceding paragraph is the price
paid to ensure deterministic transfer timing. It is possible to configure a rate transition block so that data
protection and/or deterministic data transfer are turned off. In these cases, the generated C code will
require less memory and execute more quickly, with the downside that unpredictable results may occur. See
Chapter 6 of the Real-Time Workshop documentation [3] for more information.
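The one-period delay of a protected, deterministic slow to fast transfer can be imitated by hand with a pair of buffers. The sketch below is an illustration of the behavior described above, not the code that Real-Time Workshop generates; the type RateTransition and the functions rt_latch, rt_slow_write, and rt_fast_read are hypothetical names.

```c
typedef struct {
    float pending;  /* latest value written by the slow task */
    float held;     /* value the fast task is allowed to see */
} RateTransition;

/* Called at each slow period boundary, before the slow task starts:
 * publish the value computed during the previous slow period. */
void rt_latch(RateTransition *rt)
{
    rt->held = rt->pending;
}

/* Called by the slow task whenever it finishes computing its output. */
void rt_slow_write(RateTransition *rt, float value)
{
    rt->pending = value;
}

/* Called by the fast task on every fast update. */
float rt_fast_read(const RateTransition *rt)
{
    return rt->held;
}
```

Because the fast task reads only the held copy, it sees the same value throughout a slow period regardless of when the slow task finishes, at the cost of that value being one slow period old. The sketch assumes rt_latch runs where the slow task cannot interrupt it, for example at the start of the fast-rate interrupt.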
[Figure 19 diagram labels: Step; two Function-Call Generators, each driving a Trigger() input; Triggered Subsystem: Small Inertia (Fast Subsystem) with output Virtual wheel Position2; Triggered Subsystem: Large Inertia (Slow Subsystem) with outputs Virtual wheel Position1 and angular speed1; Rate Transition blocks (ZOH, 1/z); Gain −1; Scope Reaction torque.]
Figure 19: Two virtual wheels implemented as triggered subsystems (two virtual wheel subsystems.mdl).
[Diagram residue of the two triggered subsystems: each has a Trigger f(), an input theta2 or theta1, a spring constant (k2, k1), 1/virtual inertia (1/Jw2, 1/Jw1), damping (b2, b1), two Discrete-Time Integrators, and outputs Virtual wheel Position, angular speed, and torque; captions not recovered.]
8.1 Without an RTOS
Now that the two virtual wheel system has been split into separate subsystems, we must add initialization
and device driver blocks. The resulting model is shown in Figures 22-24. The processor and peripheral
initialization block has been placed at the highest level of the model. The device driver blocks are placed
inside the fast subsystem, so that the encoder is read and the duty cycle is updated at the fast rate.
Note that the torque computed by the slow subsystem must be passed to the fast subsystem, so that
the latter may compute the total reaction torque used to update the duty cycle. Similarly, the angle of the
haptic wheel, which is obtained from the eTPU driver block in the fast subsystem, must be passed to the
slow task so that the latter may use this information to compute the reaction torque for the virtual wheel
with the larger inertia.
[Figure 22 diagram labels: RAppID-EC; FastTask; SlowTask; Slow Task Trigger (fcn_call); 1/z; Slow Torque; Haptic Wheel Position; slow virtual wheel position; signals of type single.]
Figure 22: Highest Level of the Two Virtual Spring Inertia Damper System for Code generation without an
RTOS (two virtual wheels simpletarget.mdl)
[Figure 23 diagram labels: Trigger f(); Slow Torque input; Slow Reaction Torque and Fast Reaction Torque (gains of −1); Reaction Torque; Write Reaction Torque; Fast Torque (N−mm); damping b2; Haptic Wheel Angle (degrees); Haptic Wheel Position; Environment Controller (Sim, RTW); 15 deg at 1 sec, converted from double to single; Write 16 Bits (16 bit value).]
Figure 23: Fast triggered subsystem for autocode generation from Figure 22.
[Figure 24 diagram labels: Trigger f(); Slow Torque output; damping b1; signals of type single.]
Figure 24: Slow triggered subsystem for autocode generation from Figure 22.
[Figure 25 diagram, block parameters as displayed:
RAppID-EC target setup: System Clock: 128 MHz; Target: MPC 5554; Compiler: metrowerks; Target Type: IntRAM; Operating System: osekturbo
Resource Initialization Signal Object: Resource Name: Torque; Resource Type: STANDARD; Signal Type: Single; Vector Size: 1
Resource Initialization Signal Object 1: Resource Name: Encoder_ticks; Resource Type: STANDARD; Signal Type: Single; Vector Size: 1
Fast System and Slow System triggered subsystems, each driven by a Function-Call Generator, with outputs torque 3, angular speed 3, and virtual wheel position 3 (double) sent to Fast System Scope and Slow System Scope.]
Figure 25: Highest Level of the Two Virtual Spring Inertia Damper System for Code Generation with an
RTOS (two virtual wheels osekturbo.mdl)
[Figure 26 diagram labels: Trigger f(); outputs torque3 and angular speed 3; damping 2; resource blocks with Resource Name: Encoder_ticks and Resource Name: Torque.]
Figure 26: Fast triggered subsystem for autocode generation from Figure 25.
[Figure 27 diagram labels: ReadFromResource block (Resource Name: Encoder_ticks); spring constant k1; 1/virtual inertia 1/Jw1; damping b1; two Discrete-Time Integrators (theta 2ddot, theta 2dot, theta 2); outputs angular speed 3 and virtual wheel position 3.]
Figure 27: Slow triggered subsystem for autocode generation from Figure 25.
References
[1] www.freescale.com/webapp/sps/site/overview.jsp?code=CW_OSEK.
[2] www.freescale.com/webapp/sps/site/prod_summary.jsp?code=RAPPIDTOOLBOX.
[3] www.mathworks.com/access/helpdesk/help/pdf_doc/rtw/rtw_ug.pdf.
[4] www.mathworks.com/access/helpdesk/help/pdf_doc/simulink/sl_using.pdf.
[5] www.mathworks.com/products/rtw/.
[6] www.mathworks.com/products/rtwembedded/.
[7] J. A. Ledin. Hardware-in-the-loop simulation. Embedded Systems Programming, pages 42–60, February 1999.