Motorola PowerPC RISC CPU Reference Manual
Section 2
REGISTERS
2.1 Programming Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
2.2 PowerPC UISA Register Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
2.2.1 General Purpose Registers (GPRs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
2.2.2 Floating-Point Registers (FPRs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
2.2.3 Floating-Point Status and Control Register (FPSCR). . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
2.2.4 Condition Register (CR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
Section 3
OPERAND CONVENTIONS
3.1 Data Alignment and Memory Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
3.2 Byte Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
3.2.1 Structure Mapping Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
3.2.1.1 Big-Endian Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
3.2.1.2 Little-Endian Mapping. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
3.2.2 Data Memory in Little-Endian Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
3.2.2.1 Aligned Scalars. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
3.2.2.2 Misaligned Scalars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
3.2.2.3 String Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
3.2.2.4 Load and Store Multiple Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
3.2.3 Instruction Memory Addressing in Little-Endian Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
3.2.4 Input/Output in Little-Endian Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
3.3 Floating-Point Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
3.3.1 Floating-Point Data Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
3.3.2 Value Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
3.3.3 Normalized Numbers (±NORM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13
3.3.4 Zero Values (±0). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
3.3.5 Denormalized Numbers (±DENORM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
MOTOROLA RCPU
iv REFERENCE MANUAL
Paragraph Page
Number Number
3.3.6 Infinities (±∞) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
3.3.7 Not a Numbers (NaNs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
3.3.8 Sign of Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-16
3.3.9 Normalization and Denormalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-17
3.3.10 Data Handling and Precision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-17
3.3.11 Rounding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-19
3.4 Floating-Point Execution Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-21
3.4.1 Execution Model for IEEE Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-22
3.4.2 Execution Model for Multiply-Add Type Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-24
3.4.3 Non-IEEE Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-25
3.4.4 Working Without the Software Envelope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-26
Section 4
ADDRESSING MODES AND INSTRUCTION SET SUMMARY
4.1 Memory Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
4.1.1 Memory Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
4.1.2 Addressing Modes and Effective Address Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
4.2 Classes of Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
4.2.1 Definition of Boundedly Undefined . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
4.2.2 Defined Instruction Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
4.2.3 Illegal Instruction Class. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
4.2.4 Reserved Instruction Class. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
4.3 Integer Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
4.3.1 Integer Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
4.3.2 Integer Compare Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-11
4.3.3 Integer Logical Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-12
4.3.4 Integer Rotate and Shift Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-14
4.3.4.1 Integer Rotate Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-16
4.3.4.2 Integer Shift Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-17
4.4 Floating-Point Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-19
4.4.1 Floating-Point Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-19
4.4.2 Floating-Point Multiply-Add Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-22
4.4.3 Floating-Point Rounding and Conversion Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . 4-25
4.4.4 Floating-Point Compare Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-27
4.4.5 Floating-Point Status and Control Register Instructions . . . . . . . . . . . . . . . . . . . . . . . . . 4-28
4.5 Load and Store Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-30
4.5.1 Integer Load and Store Address Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-30
4.5.1.1 Register Indirect with Immediate Index Addressing . . . . . . . . . . . . . . . . . . . . . . . . 4-30
4.5.1.2 Register Indirect with Index Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-31
4.5.1.3 Register Indirect Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-32
4.5.2 Integer Load Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-33
4.5.3 Integer Store Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36
4.5.4 Integer Load and Store with Byte Reversal Instructions. . . . . . . . . . . . . . . . . . . . . . . . . 4-37
Section 5
INSTRUCTION CACHE
5.1 Instruction Cache Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
5.2 Programming Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
5.2.1 I-Cache Control and Status Register (ICCST) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
5.2.2 I-Cache Address Register (ICADR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
5.2.3 I-Cache Data Register (ICDAT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
5.3 Instruction Cache Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
5.3.1 Cache Hit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
5.3.2 Cache Miss. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
5.3.3 Instruction Fetch on a Predicted Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
5.4 Cache Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
5.4.1 Instruction Cache Block Invalidate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
5.4.2 Invalidate All . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8
5.4.3 Load and Lock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8
5.4.4 Unlock Line. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5.4.5 Unlock All . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5.4.6 Cache Enable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5.4.7 Cache Disable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5.4.8 Cache Inhibit. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5.4.9 Cache Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
5.5 I-Cache and On-Chip Memories with Zero Wait States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11
5.6 Cache Coherency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11
5.7 Updating Code and Attributes of Memory Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11
5.8 Reset Sequence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11
5.9 Debugging Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
5.9.1 Running a Debug Routine from the I-Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
5.9.2 Instruction Fetch from the Development Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
Section 6
EXCEPTIONS
6.1 Exception Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
6.1.1 Ordered and Unordered Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
6.1.2 Synchronous, Precise Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
6.1.3 Asynchronous Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
6.1.3.1 Asynchronous, Maskable Exceptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5
6.1.3.2 Asynchronous, Non-Maskable Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5
6.2 Exception Vector Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5
6.3 Precise Exception Model Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
6.4 Implementation of Asynchronous Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-8
6.5 Recovery from Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-9
6.5.1 Recovery from Ordered Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-9
6.5.2 Recovery from Unordered Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-9
6.5.3 Commands to Alter MSR[EE] and MSR[RI] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
6.6 Exception Order and Priority . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
6.7 Ordering of Synchronous, Precise Exceptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
6.8 Exception Processing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
6.8.1 Enabling and Disabling Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-16
6.8.2 Steps for Exception Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-16
Section 7
INSTRUCTION TIMING
7.1 Instruction Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
7.1.1 Instruction Sequencer Data Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
7.1.2 Instruction Issue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
7.1.3 Basic Instruction Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
7.2 Execution Unit Timing Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
7.2.1 Integer Unit (IU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
7.2.1.1 Update of the XER During Divide Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.2.2 Floating Point Unit (FPU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.2.3 Load/Store Unit (LSU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.2.3.1 Load/Store Instruction Issue. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
7.2.3.2 Load/Store Synchronizing Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
7.2.3.3 Load/Store Instruction Timing Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
7.2.3.4 Bus Cycles for String Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8
7.2.3.5 Stalls During Floating-Point Store Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8
7.2.4 Branch Processing Unit (BPU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7.3 Serialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7.3.1 Execution Serialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7.3.2 Fetch Serialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7.4 Context Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7.5 Implementation of Special-Purpose Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7.6 Instruction Execution Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
7.7 Instruction Execution Timing Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-16
7.7.1 Load from Internal Memory Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-16
7.7.2 Write-Back Arbitration Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-17
7.7.3 Load with Private Write-Back Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-18
7.7.4 Fastest External Load Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-19
7.7.5 History Buffer Full Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-20
7.7.6 Store and Floating-Point Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
7.7.7 Branch Folding Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-22
7.7.8 Branch Prediction Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
Section 8
DEVELOPMENT SUPPORT
8.1 Program Flow Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
8.1.1 Indirect Change-of-Flow Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
8.1.1.1 Marking the Indirect Change-of-Flow Attribute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
8.1.1.2 Sequential Instructions with the Indirect Change-of-Flow Attribute . . . . . . . . . . . . . 8-3
8.1.2 Instruction Fetch Show Cycle Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4
8.1.3 Program Flow-Tracking Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-5
8.1.3.1 Instruction Queue Status Pins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-5
8.1.3.2 History Buffer Flush Status Pins. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6
8.1.3.3 Flow-Tracking Status Pins in Debug Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6
8.1.3.4 Cycle Type, Write/Read, and Address Type Pins . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
8.1.4 External Hardware During Program Trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
8.1.4.1 Back Trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
8.1.4.2 Window Trace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
8.1.4.3 Synchronizing the Trace Window to Internal CPU Events . . . . . . . . . . . . . . . . . . . . 8-8
8.1.4.4 Detecting the Trace Window Starting Address. . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-10
8.1.4.5 Detecting the Assertion or Negation of VSYNC . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-10
8.5.1 Port Usage in Debug Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-40
8.5.2 Debug Mode Sequence Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-42
8.5.3 Port Usage in Normal (Non-Debug) Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-43
8.6 Examples of Debug Mode Sequences. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-44
8.6.1 Prologue Instruction Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-44
8.6.2 Epilogue Instruction Sequence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-44
8.6.3 Peek Instruction Sequence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-45
8.6.4 Poke Instruction Sequence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-45
8.7 Software Monitor Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-46
8.8 Development Support Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
8.8.1 Register Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-48
8.8.2 Comparator A–D Value Registers (CMPA–CMPD) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-50
8.8.3 Comparator E–F Value Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-50
8.8.4 Comparator G–H Value Registers (CMPG–CMPH) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-51
8.8.5 I-Bus Support Control Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-51
8.8.6 L-Bus Support Control Register 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
8.8.7 L-Bus Support Control Register 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
8.8.8 Breakpoint Counter A Value and Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-57
8.8.9 Breakpoint Counter B Value and Control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-58
8.8.10 Exception Cause Register (ECR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-58
8.8.11 Debug Enable Register (DER) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-60
Section 9
INSTRUCTION SET
9.1 Instruction Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
9.1.1 Split Field Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
9.1.2 Instruction Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
9.1.3 Notation and Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-3
9.2 RCPU Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6
Appendix A
INSTRUCTION SET LISTINGS
Appendix B
MULTIPLE-PRECISION SHIFTS
Appendix C
FLOATING-POINT MODELS AND CONVERSIONS
C.1 Conversion from Floating-Point Number to Signed Fixed-Point Integer Word . . . . . . . . . . . . . C-1
C.2 Conversion from Floating-Point Number to Unsigned Fixed-Point Integer Word . . . . . . . . . . . C-1
C.3 Floating-Point Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
C.3.1 Floating-Point Round to Single-Precision Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
C.3.2 Floating-Point Convert to Integer Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-5
C.4 Floating-Point Convert from Integer Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-8
Appendix E
SIMPLIFIED MNEMONICS
E.1 Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-1
E.2 Simplified Mnemonics for Subtract Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-2
E.2.1 Subtract Immediate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-2
E.2.2 Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-2
E.3 Simplified Mnemonics for Compare Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-2
E.4 Simplified Mnemonics for Rotate and Shift Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-3
E.5 Simplified Mnemonics for Branch Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-4
E.5.1 BO and BI Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-5
E.5.2 Basic Branch Mnemonics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-5
E.5.3 Branch Mnemonics Incorporating Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-9
E.5.4 Branch Prediction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-10
E.6 Simplified Mnemonics for Condition Register Logical Instructions . . . . . . . . . . . . . . . . . . . . . E-11
E.7 Simplified Mnemonics for Trap Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-12
E.8 Simplified Mnemonics for Special-Purpose Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-15
E.9 Recommended Simplified Mnemonics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-16
E.9.1 No-Op. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-16
E.9.2 Load Immediate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-16
E.9.3 Load Address . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-16
E.9.4 Move Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-17
E.9.5 Complement Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-17
E.9.6 Move to Condition Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-17
LIST OF FIGURES
Figure Title Page
8-11 Trap Enable Data Shifted Into Development Port Shift Register ....................... 8-32
8-12 Breakpoint Data Shifted Into Development Port Shift Register ......................... 8-32
8-13 CPU Instructions/Data Shifted into Shift Register.............................................. 8-32
8-14 Status Shifted Out of Shift Register — Non-Debug Mode ................................. 8-33
8-15 Status/Data Shifted Out of Shift Register .......................................................... 8-34
8-16 Sequencing Error Activity .................................................................................. 8-35
8-17 Checkstop State and Debug Mode.................................................................... 8-40
8-18 Debug Mode Development Port Usage ............................................................. 8-41
8-19 Non-Debug Mode Development Port Usage ..................................................... 8-44
8-20 Prologue Events ................................................................................................ 8-44
8-21 Epilogue Events................................................................................................. 8-45
8-22 Peek Instruction Sequence................................................................................ 8-45
8-23 Poke Instruction Sequence................................................................................ 8-46
8-24 Development Support Programming Model....................................................... 8-48
8-25 Development Support Registers Read Access Protection ................................ 8-49
8-26 Development Support Registers Write Access Protection................................. 8-49
8-27 CMPA-CMPD Bit Settings ................................................................................. 8-50
8-28 CMPE-CMPF Bit Settings.................................................................................. 8-50
8-29 CMPG-CMPH Bit Settings ................................................................................. 8-51
8-30 ICTRL Bit Settings ............................................................................................. 8-52
8-31 LCTRL1 Bit Settings .......................................................................................... 8-54
8-32 LCTRL2 Bit Settings .......................................................................................... 8-55
8-33 Breakpoint Counter A Value and Control Register (COUNTA)......................... 8-57
8-34 Breakpoint Counter B Value and Control Register (COUNTB)......................... 8-58
8-35 ECR Bit Settings ................................................................................................ 8-59
8-36 DER Bit Settings ............................................................................................... 8-61
Audience
This manual is intended for system software and hardware developers and
applications programmers who want to develop products for RCPU-based
microcontroller systems. It is assumed that the reader understands operating
systems, microprocessor and microcontroller system design, and the basic
principles of RISC processing.
Additional Reading
This section lists additional reading that provides background to or supplements
the information in this manual.
Conventions
This document uses the following notational conventions:
ACTIVE_HIGH Names for signals that are active high are shown in uppercase
text without an overbar. Signals that are active high are referred
to as asserted when they are high and negated when they are
low.
ACTIVE_LOW A bar over a signal name indicates that the signal is active low.
Active-low signals are referred to as asserted (active) when they
are low and negated when they are high.
mnemonics Instruction mnemonics are shown in lowercase bold.
italics Italics indicate variable command parameters, for example,
Nomenclature
Logic level one is the voltage that corresponds to Boolean true (1) state.
Logic level zero is the voltage that corresponds to Boolean false (0) state.
To set a bit or bits means to establish logic level one on the bit or bits.
To clear a bit or bits means to establish logic level zero on the bit or bits.
A signal that is asserted is in its active logic state. An active low signal changes
from logic level one to logic level zero when asserted, and an active high signal
changes from logic level zero to logic level one.
A signal that is negated is in its inactive logic state. An active low signal changes
from logic level zero to logic level one when negated, and an active high signal
changes from logic level one to logic level zero.
LSB means least significant bit or bits. MSB means most significant bit or bits. Ref-
erences to low and high bytes are spelled out.
The RCPU integrates four execution units: an integer unit (IU), a load/store unit
(LSU), a branch processing unit (BPU), and a floating-point unit (FPU). The RCPU
can issue one sequential (non-branch) instruction per clock cycle. In addition, the
processor attempts to evaluate branch conditions ahead of time and execute
branch instructions simultaneously with sequential instructions, often resulting in
zero-cycle execution time for branch instructions. Instructions can complete out of
order for increased performance; however, the processor makes execution appear
sequential.
[Figure: MCU block diagram. Labeled blocks: RISC MCU processor (RCPU), 4-Kbyte
I-cache, on-chip SRAM, peripheral control unit (PCU), system interface unit (SIU),
development support, and development port. Labeled connections: internal
load/store bus (L-bus), external bus, and IRQs.]
The following subsections describe the features of the RCPU, provide a block dia-
gram showing the major functional units, and give an overview of RCPU operation.
[Figure: RCPU block diagram. Labeled elements include the instruction sequencer,
the control bus, the load/store unit, the FPR file (32 x 64) with FPR history, the
FPU, and the I-ADDR, L-ADDR, and L-DATA buses, with annotations of 2 slots/clock
and 4 slots/clock.]
1.1.3 Instruction Sequencer
The instruction sequencer provides centralized control over data flow between ex-
ecution units and register files. The sequencer implements the basic instruction
pipeline, fetches instructions from the memory system, issues them to available ex-
ecution units, and maintains a state history so it can back the machine up in the
event of an exception.
In addition, the sequencer implements all branch processor instructions, which in-
clude flow control and condition register instructions. Refer to 1.1.4.1 Branch Pro-
cessing Unit (BPU) for more details on the branch processing unit within the
instruction sequencer.
The instruction sequencer fetches the instructions from the instruction cache into
the instruction pre-fetch queue, which can hold up to four instructions. The proces-
sor uses branch folding (a technique of removing the branch instructions from the
pre-fetch queue) in order to execute branches in parallel with execution of sequen-
tial instructions. Sequential (non-branch) instructions reaching the top of the in-
struction pre-fetch queue are issued to the execution units. Instructions may be
flushed from the queue when an external interrupt is detected, a previous instruc-
tion causes an exception, or a branch prediction turns out to be incorrect.
All instructions, including branches, enter the history buffer along with processor
state information that may be affected by the instruction’s execution. This informa-
tion is used to enable out-of-order completion of instructions together with the han-
dling of precise exceptions. Instructions may be flushed from the machine when an
exception is taken. Refer to 6.3 Precise Exception Model Implementation and to
7.1 Instruction Flow for additional information.
An instruction retires from the machine after it finishes execution without exception
and all preceding instructions have already retired from the machine.
[Figure: instruction sequencer data flow. Labeled elements: fetch, instruction
pre-fetch queue, branch unit, issue, and history buffer.]
Load/store unit (LSU): Includes implementation of all load and store instructions,
whether defined as part of the integer processor or the floating-point processor.
Integer unit (IU): Includes implementation of all integer instructions except
load/store instructions. This module includes the GPRs (including GPR history and
scoreboard) and the following subunits:
• The IMUL-IDIV unit includes the implementation of the integer multiply and
divide instructions.
• The ALU-BFU unit includes implementation of all integer logic, add and
subtract instructions, and bit field instructions.
Floating-point unit (FPU): Includes the FPRs (including FPR history and scoreboard)
and the implementation of all floating-point instructions except load and store
floating-point instructions.
The BPU is implemented as part of the instruction sequencer. The BPU performs
condition register look-ahead operations on conditional branches. The BPU looks
through the instruction queue for a conditional branch instruction and attempts to
resolve it early, achieving the effect of a zero-cycle branch in many cases.
The BPU uses a bit in the instruction encoding to predict the direction of the condi-
tional branch. (Refer to the discussion of the BO field in 4.6 Flow Control Instruc-
tions.) Therefore, when an unresolved conditional branch instruction is
encountered, the processor pre-fetches instructions from the predicted target
stream until the conditional branch is resolved.
The BPU contains an adder to compute branch target addresses and three special-
purpose, user-accessible registers: the link register (LR), the count register (CTR),
and the condition register (CR). The BPU calculates the return pointer for subrou-
tine calls and saves it into the LR. The LR also contains the branch target address
for the branch conditional to link register (bclrx) instruction. The CTR contains the
branch target address for the branch conditional to count register (bcctrx) instruc-
tion. The contents of the LR and CTR can be copied to or from any GPR. Because
the BPU uses dedicated registers rather than general-purpose or floating-point
registers, execution of branch instructions is independent from execution of integer
and floating-point instructions.
The RCPU depends on a software envelope to fully implement the IEEE floating-
point specification. Overflows, underflows, NaNs, and denormalized numbers
cause floating-point assist exceptions that invoke a software routine to deliver (with
hardware assistance) the correct IEEE result. Refer to 6.11.10 Floating-Point As-
sist Exception (0x00E00) for additional information.
There is a 32-bit wide data path between the load/store unit and the integer register
file and a 64-bit wide data path between the load/store unit and the floating-point
register file.
The LSU interfaces with the external bus interface for all instructions that access
memory. Addresses are formed by adding the source one register operand speci-
A cache access cycle begins with an instruction request from the instruction unit in
the processor. In case of a cache hit, the instruction is delivered to the instruction
unit. In the case of a cache miss, the cache initiates a burst read cycle on the I-bus
with the address of the requested instruction. The first word received from the bus
is the requested instruction. The cache forwards this instruction to the instruction
unit of the CPU as soon as it is received from the I-bus. A cache line is then select-
ed to receive the data which will be coming from the bus. An LRU replacement al-
gorithm is used to select a line when no empty lines are available.
Each cache line can be used as an SRAM, thus allowing the application to lock crit-
ical code segments that need fast and deterministic execution time.
Figure 1-4 illustrates the basic instruction pipeline timing. Refer to SECTION 7 IN-
STRUCTION TIMING for more detailed timing illustrations.
[Figure 1-4: basic instruction pipeline timing. Rows include fetch, decode,
L address drive, load writeback, branch decode, and branch execute, shown for
instructions I1 through I3.]
• PowerPC user instruction set architecture (UISA) — Defines the base user-
level instruction set, user-level registers, data types, floating-point exception
model, memory models for a uniprocessor environment, and programming
model for a uniprocessor environment.
• PowerPC virtual environment architecture (VEA) — Describes the memory
model for a multiprocessor environment, defines cache control instructions,
and describes other aspects of virtual environments. Implementations that
conform to the VEA also adhere to the UISA, but may not necessarily adhere
to the OEA.
• PowerPC operating environment architecture (OEA) — Defines the memory
management model, supervisor-level registers, synchronization require-
ments, and the exception model. Implementations that conform to the OEA
also adhere to the UISA and the VEA.
The following sections summarize the PowerPC registers that are implemented in
the RCPU. Refer to SECTION 2 REGISTERS for detailed descriptions of PowerPC
registers. In addition, for descriptions of the I-cache control registers, refer to SEC-
TION 5 INSTRUCTION CACHE. For details on development-support registers, re-
fer to SECTION 8 DEVELOPMENT SUPPORT.
• The link register (LR) can be used to provide the branch target address and
to hold the return address after branch and link instructions.
• The count register (CTR) is decremented and tested automatically as a result
of branch-and-count instructions.
• The integer exception register (XER) contains the integer carry and overflow
bits and two fields for the load string and compare byte indexed (lscbx) in-
struction. The XER is 32 bits wide in all implementations.
NOTE
While these registers are defined as SPRs and can be accessed by
using the mtspr and mfspr instructions, they (except for the time
base) are typically accessed implicitly.
• The EIE, EID, and NRI are provided to facilitate exception processing.
• Cache control SPRs allow system software to control the operation of the in-
struction cache.
• Development support SPRs allow development-system software control over
the on-chip development support.
• The floating-point exception cause register (FPECR) is a 32-bit internal status
and control register used to assist the software emulation of floating-point op-
erations.
Refer to 4.1.2 Addressing Modes and Effective Address Calculation for addi-
tional information.
Although multiple exception conditions can map to a single exception vector, the
specific condition can be determined by examining a register associated with the
exception — for example, the DAE/DSISR and the FPSCR. Specific exception
conditions can be explicitly enabled or disabled by software.
While exception conditions may be recognized out of order, they are handled strict-
ly in order. When an instruction-caused exception is recognized, any unexecuted
instructions that appear earlier in the instruction stream are allowed to complete.
Any exceptions caused by those instructions are handled in order.
Figure 2-1 shows the user-level and supervisor-level RCPU programming models
and also illustrates the three levels of the PowerPC architecture. The numbers to
the left of the SPRs indicate the decimal number that is used in the syntax of the
instruction operands to access the register.
NOTE
Registers such as the general-purpose registers (GPRs) and floating-point
registers (FPRs) are accessed through operands that are part of the
instructions. Access to registers can be explicit (that is, through
instructions provided for that purpose, such as the move to special-purpose
register (mtspr) and move from special-purpose register (mfspr)
instructions) or implicit, as part of the execution of an instruction.
Some registers are accessed both explicitly and implicitly.
User-Level Registers
  FPR0 through FPR31 (bits 0:63)
  GPR0 through GPR31 (bits 0:31)
  Condition Register (CR) (bits 0:31)
  Floating-Point Status and Control Register (FPSCR) (bits 0:31)

User-Level SPRs (bits 0:31)
  SPR1     Integer Exception Register (XER)
  SPR8     Link Register (LR)
  SPR9     Count Register (CTR)

USER MODEL VEA: Time Base Facility (for Reading) (bits 0:31)
  SPR268   Time Base Lower – Read (TBL)
  SPR269   Time Base Upper – Read (TBU)

Supervisor-Level SPRs (bits 0:31)
  SPR18    DAE/Source Instruction Service Register (DSISR)
  SPR19    Data Address Register (DAR)
  SPR22    Decrementer Register (DEC)
  SPR26    Save and Restore Register 0 (SRR0)
  SPR27    Save and Restore Register 1 (SRR1)
  SPR80    External Interrupt Enable (EIE) *
  SPR81    External Interrupt Disable (EID) *
  SPR82    Non-Recoverable Interrupt (NRI) *
  SPR272   SPR General 0 (SPRG0)
  SPR273   SPR General 1 (SPRG1)
  SPR274   SPR General 2 (SPRG2)
  SPR275   SPR General 3 (SPRG3)
  SPR284   Time Base Lower – Write (TBL)
  SPR285   Time Base Upper – Write (TBU)
  SPR287   Processor Version Register (PVR)
  SPR560   I-Cache Control and Status Register (ICCST) *
  SPR561   I-Cache Address Register (ICADR) *
  SPR562   I-Cache Data Port (ICDAT) *
  SPR1022  Floating-Point Exception Cause Register (FPECR) *

Development Support SPRs (bits 0:31)
  SPR144   Comparator A Value Register (CMPA) *
  SPR145   Comparator B Value Register (CMPB) *
  SPR146   Comparator C Value Register (CMPC) *
  SPR147   Comparator D Value Register (CMPD) *
  SPR148   Exception Cause Register (ECR) *
  SPR149   Debug Enable Register (DER) *
  SPR150   Breakpoint Counter A Value and Control (COUNTA) *
  SPR151   Breakpoint Counter B Value and Control (COUNTB) *
  SPR152   Comparator E Value Register (CMPE) *
  SPR153   Comparator F Value Register (CMPF) *
  SPR154   Comparator G Value Register (CMPG) *
  SPR155   Comparator H Value Register (CMPH) *
  SPR156   L-Bus Support Comparators Control (LCTRL1) *
  SPR157   L-Bus Support Comparators Control (LCTRL2) *
  SPR158   I-Bus Support Control Register (ICTRL) *
  SPR159   Breakpoint Address Register (BAR) *
  SPR630   Development Port Data Register (DPDR) *
GPR0
GPR1
...
GPR31
RESET:UNCHANGED
All floating-point arithmetic instructions operate on data located in FPRs and, with
the exception of the compare instructions (which update the CR), place the result
into an FPR. Information about the status of floating-point operations is placed into
the floating-point status and control register (FPSCR) and in some cases, into the
CR, after the completion of the operation’s writeback stage. For information on how
the CR is affected by floating-point operations, see 2.2.4 Condition Register
(CR).
Load and store double instructions transfer 64 bits of data between memory and
the FPRs in the floating-point processor with no conversion. Load single instruc-
Single- and double-precision arithmetic instructions accept values from the FPRs
in double-precision format. For single-precision arithmetic instructions, all input val-
ues must be representable in single-precision format; otherwise, the result placed
into the target FPR and the setting of status bits in the FPSCR and in the condition
register are undefined.
FPR0
FPR1
...
FPR31
RESET:UNCHANGED
Table 2-1 summarizes which bits in the FPSCR are sticky bits, which are normal
status bits, and which are control bits.
FEX and VX are the logical ORs of other FPSCR bits. Therefore these two bits are
not listed among the FPSCR bits directly affected by the various instructions.
[FPSCR register diagram: bits 0 through 31.]
RESET:UNCHANGED
1 FEX Floating-point enabled exception summary. This bit signals the occurrence of any of the
enabled exception conditions. It is the logical OR of all the floating-point exception bits masked
with their respective enable bits. The mcrfs instruction implicitly clears FPSCR[FEX] if the
result of the logical OR described above becomes zero. The mtfsf, mtfsfi, mtfsb0, and mtfsb1
instructions cannot set or clear FPSCR[FEX] explicitly. This is not a sticky bit.
2 VX Floating-point invalid operation exception summary. This bit signals the occurrence of any
invalid operation exception. It is the logical OR of all of the invalid operation exceptions. The
mcrfs instruction implicitly clears FPSCR[VX] if the result of the logical OR described above
becomes zero. The mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions cannot set or clear
FPSCR[VX] explicitly. This is not a sticky bit.
3 OX Floating-point overflow exception. This is a sticky bit. See 6.11.10.8 Overflow Exception Con-
dition.
4 UX Floating-point underflow exception. This is a sticky bit. See 6.11.10.9 Underflow Exception
Condition.
5 ZX Floating-point zero divide exception. This is a sticky bit. See 6.11.10.7 Zero Divide Exception
Condition.
6 XX Floating-point inexact exception. This is a sticky bit. See 6.11.10.10 Inexact Exception Con-
dition.
7 VXSNAN Floating-point invalid operation exception for SNaN. This is a sticky bit. See 6.11.10.6 Invalid
Operation Exception Conditions.
8 VXISI Floating-point invalid operation exception for ∞-∞. This is a sticky bit. See 6.11.10.6 Invalid
Operation Exception Conditions.
9 VXIDI Floating-point invalid operation exception for ∞/∞. This is a sticky bit. See 6.11.10.6 Invalid
Operation Exception Conditions.
10 VXZDZ Floating-point invalid operation exception for 0/0. This is a sticky bit. See 6.11.10.6 Invalid Op-
eration Exception Conditions.
11 VXIMZ Floating-point invalid operation exception for ∞*0. This is a sticky bit. See 6.11.10.6 Invalid
Operation Exception Conditions.
12 VXVC Floating-point invalid operation exception for invalid compare. This is a sticky bit. See 6.11.10.6
Invalid Operation Exception Conditions.
13 FR Floating-point fraction rounded. The last floating-point instruction that potentially rounded the
intermediate result incremented the fraction. (See 3.3.11 Rounding.) This bit is not sticky.
14 FI Floating-point fraction inexact. The last floating-point instruction that potentially rounded the in-
termediate result produced an inexact fraction or a disabled exponent overflow. (See 3.3.11
Rounding.) This bit is not sticky.
[15:19] FPRF Floating-point result flags. This field is based on the value placed into the target register even
if that value is undefined. Refer to Table 2-3 for specific bit settings.
15 Floating-point result class descriptor (C). Floating-point instructions other than the
compare instructions may set this bit with the FPCC bits, to indicate the class of
the result.
[16:19] Floating-point condition code (FPCC). Floating-point compare instructions always
set one of the FPCC bits to one and the other three FPCC bits to zero. Other
floating-point instructions may set the FPCC bits with the C bit, to indicate the class
of the result. Note that in this case the high-order three bits of the FPCC retain their
relational significance indicating that the value is less than, greater than, or equal
to zero.
16 Floating-point less than or negative (FL or <)
17 Floating-point greater than or positive (FG or >)
18 Floating-point equal or zero (FE or =)
19 Floating-point unordered or NaN (FU or ?)
20 — Reserved
21 VXSOFT Floating-point invalid operation exception for software request. This bit can be altered only by
the mcrfs, mtfsfi, mtfsf, mtfsb0, or mtfsb1 instructions. The purpose of VXSOFT is to allow
software to cause an invalid operation condition for a condition that is not necessarily associat-
ed with the execution of a floating-point instruction. For example, it might be set by a program
that computes a square root if the source operand is negative. This is a sticky bit. See 6.11.10.6
Invalid Operation Exception Conditions.
22 VXSQRT Floating-point invalid operation exception for invalid square root. This is a sticky bit. This bit
is defined so that software can simulate fsqrt and frsqrte, and so that exceptions caused by
square-root operations can be handled through a consistent interface. See 6.11.10.6 Invalid
Operation Exception Conditions.
23 VXCVI Floating-point invalid operation exception for invalid integer convert. This is a sticky bit. See
6.11.10.6 Invalid Operation Exception Conditions.
24 VE Floating-point invalid operation exception enable. See 6.11.10.6 Invalid Operation Exception
Conditions.
26 UE Floating-point underflow exception enable. This bit should not be used to determine whether
denormalization should be performed on floating-point stores. See 6.11.10.9 Underflow Ex-
ception Condition.
27 ZE Floating-point zero divide exception enable. See 6.11.10.7 Zero Divide Exception Condition.
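The summary bits FEX and VX described in the table above can be modeled as pure functions of the other FPSCR bits. The following is a minimal sketch, not the hardware implementation. The helper names are hypothetical, and the positions of OE (bit 25) and XE (bit 28) are assumed from the PowerPC architecture because those rows do not appear in this excerpt.

```python
def bit(reg, n):
    """Return bit n of a 32-bit register, PowerPC numbering (bit 0 = MSB)."""
    return (reg >> (31 - n)) & 1

def mask(n):
    """Return a 32-bit value with only bit n set (bit 0 = MSB)."""
    return 1 << (31 - n)

# Invalid-operation cause bits listed in the table above (VXSNAN ... VXCVI).
VX_CAUSES = (7, 8, 9, 10, 11, 12, 21, 22, 23)

# (exception bit, enable bit) pairs for OX, UX, ZX, XX.
# UE (26) and ZE (27) appear in the table; OE (25) and XE (28) are assumed.
ENABLE_PAIRS = ((3, 25), (4, 26), (5, 27), (6, 28))

def vx(fpscr):
    """VX is the logical OR of all invalid-operation cause bits."""
    return int(any(bit(fpscr, n) for n in VX_CAUSES))

def fex(fpscr):
    """FEX is the OR of each exception bit ANDed with its enable bit."""
    enabled = any(bit(fpscr, x) and bit(fpscr, e) for x, e in ENABLE_PAIRS)
    return int(enabled or (vx(fpscr) and bit(fpscr, 24)))  # VE is bit 24
```

For example, setting ZX alone leaves FEX clear, while setting both ZX and ZE makes FEX read as one.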
Table 2-3 illustrates the floating-point result flags that correspond to FPSCR bits
[15:19].
01001  –Infinity
10010  –Zero
00010  +Zero
00101  +Infinity
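A short sketch of how software might decode the FPRF field follows. The four encodings for the infinities and zeros come from Table 2-3 as shown above. The remaining five encodings are assumed from the PowerPC architecture definition and do not appear in this excerpt. The function names are hypothetical.

```python
FPRF_CLASSES = {
    0b10001: "Quiet NaN",              # assumed, not shown in this excerpt
    0b01001: "-Infinity",
    0b01000: "-Normalized number",     # assumed
    0b11000: "-Denormalized number",   # assumed
    0b10010: "-Zero",
    0b00010: "+Zero",
    0b10100: "+Denormalized number",   # assumed
    0b00100: "+Normalized number",     # assumed
    0b00101: "+Infinity",
}

def fprf(fpscr):
    """Extract FPSCR[15:19]; bit 19 sits 12 places above the LSB."""
    return (fpscr >> 12) & 0x1F

def classify(fpscr):
    """Return the result class named by FPRF, or 'undefined'."""
    return FPRF_CLASSES.get(fprf(fpscr), "undefined")
```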
CR — Condition Register
Fields CR0 through CR7 occupy bits [0:3], [4:7], [8:11], [12:15], [16:19], [20:23], [24:27], and [28:31].
RESET:UNCHANGED
The CR0 bits are interpreted as shown in Table 2-4. If any portion of the result (the
32-bit value placed into the destination register) is undefined, the value placed in
the first three bits of CR0 is undefined.
1 Positive (GT) — This bit is set when the result is positive (and not zero).
3 Summary overflow (SO) — This is a copy of the final state of XER[SO] at the completion of the instruction.
0 Floating-point exception (FX) — This is a copy of the final state of FPSCR[FX] at the completion of the
instruction.
1 Floating-point enabled exception (FEX) — This is a copy of the final state of FPSCR[FEX] at the comple-
tion of the instruction.
2 Floating-point invalid exception (VX) — This is a copy of the final state of FPSCR[VX] at the completion of
the instruction.
3 Floating-point overflow exception (OX) — This is a copy of the final state of FPSCR[OX] at the completion
of the instruction.
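The CR0 interpretation referenced in Table 2-4 can be sketched as a function of the 32-bit result and XER[SO]. This is an illustrative model only. The GT and SO rows appear above; the LT (bit 0) and EQ (bit 2) rows are assumed from the PowerPC architecture, and the function name is hypothetical.

```python
def cr0_field(result, xer_so):
    """Return the 4-bit CR0 value (LT, GT, EQ, SO) for a 32-bit result."""
    result &= 0xFFFFFFFF
    negative = result >> 31                  # sign bit (bit 0, the MSB)
    lt = int(negative == 1)                  # assumed: set when result < 0
    gt = int(negative == 0 and result != 0)  # positive and not zero
    eq = int(result == 0)                    # assumed: set when result == 0
    return (lt << 3) | (gt << 2) | (eq << 1) | (xer_so & 1)
```

Exactly one of LT, GT, and EQ is set for any defined result, while SO simply mirrors XER[SO].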
Bit 0: SO; bit 1: OV; bit 2: CA; bits [3:24]: 0; bits [25:31]: BYTES
RESET:UNCHANGED
The SPR number for the XER is one. The bit definitions for XER, shown in Table
2-7, are based on the operation of an instruction considered as a whole, not on in-
termediate results. For example, the result of the subtract from carrying (subfcx)
instruction is specified as the sum of three values. This instruction sets bits in the
XER based on the entire operation, not on an intermediate sum.
In most cases, reserved fields in registers are ignored when written and return zero
when read. However, XER[16:23] are set to the value written to them and return
that value when read.
0 SO Summary Overflow (SO) — The summary overflow bit is set whenever an instruction sets the
overflow bit (OV) to indicate overflow and remains set until software clears it. It is not altered by
compare instructions or other instructions that cannot overflow.
1 OV Overflow (OV) — The overflow bit is set to indicate that an overflow has occurred during
execution of an instruction. Add and subtract instructions having OE = 1 set OV if the carry out
of bit 0 is not equal to the carry out of bit 1, and clear it otherwise. The OV bit is not altered by
compare instructions or other instructions that cannot overflow.
2 CA Carry (CA) — In general, the carry bit is set to indicate that a carry out of bit 0 occurred during
execution of an instruction. Add carrying, subtract from carrying, add extended, and subtract
from extended instructions set CA to one if there is a carry out of bit 0, and clear it otherwise.
The CA bit is not altered by compare instructions or other instructions that cannot carry, except
that shift right algebraic instructions set the CA bit to indicate whether any '1' bits have been
shifted out of a negative quantity.
[3:24] — Reserved
[25:31] BYTES This field specifies the number of bytes to be transferred by a load string word indexed (lswx)
or store string word indexed (stswx) instruction.
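The CA and OV rules in the table above can be checked with a small model of a 32-bit add. This is a sketch under the stated definitions, with hypothetical function names. Bit 0 is the most significant bit, so the carry out of bit 1 is the carry into the sign position.

```python
def add_with_flags(a, b, carry_in=0):
    """32-bit add returning (result, CA, OV) per the XER definitions."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    total = a + b + carry_in
    result = total & 0xFFFFFFFF
    ca = (total >> 32) & 1  # carry out of bit 0 (the MSB)
    # Carry out of bit 1: add only the low 31 bits and look at bit 31.
    carry_out_bit1 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF) + carry_in) >> 31
    ov = int(ca != carry_out_bit1)
    return result, ca, ov

def xer_bytes(xer):
    """Extract XER[25:31], the byte count used by lswx and stswx."""
    return xer & 0x7F
```

Adding 1 to 0x7FFFFFFF overflows without a carry, whereas adding 1 to 0xFFFFFFFF carries without overflowing.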
Branch Address
RESET:UNCHANGED
NOTE
Although the two least-significant bits can accept any values written
to them, they are ignored when the LR is used as an address. The
link register can be accessed by the mtspr and mfspr instructions
using the SPR number eight. Prefetching instructions along the tar-
get path (loaded by an mtspr instruction) is possible provided the link
register is loaded sufficiently ahead of the branch instruction. It is
usually possible to prefetch along a target path loaded by a branch
and link instruction.
Both conditional and unconditional branch instructions include the option of placing
the effective address of the instruction following the branch instruction in the LR.
This is done regardless of whether the branch is taken.
Loop Count
RESET:UNCHANGED
Prefetching instructions along the target path is also possible provided the count
register is loaded sufficiently ahead of the branch instruction.
The count register can be accessed by the mtspr and mfspr instructions by spec-
ifying SPR 9. In branch conditional instructions, the BO field specifies the condi-
tions under which the branch is taken. The first four bits of the BO field specify how
the branch is affected by or affects the condition register and the count register.
The encoding for the BO field is shown in Table 4-21 in SECTION 4 ADDRESSING
MODES AND INSTRUCTION SET SUMMARY.
The PowerPC VEA includes the time base facility (TB), a 64-bit structure that con-
tains a 64-bit unsigned integer that is incremented periodically. The frequency at
which the counter is updated is implementation-dependent and need not be con-
stant over long periods of time.
The TB consists of two 32-bit registers: time base upper (TBU) and time base lower
(TBL). In the context of the VEA, user-level applications are permitted read-only ac-
cess to the TB. The OEA defines supervisor-level access to the TB for writing val-
ues to the TB. Different SPR encodings are provided for reading and writing the
time base.
Refer to 2.4 PowerPC OEA Register Set for more information on writing to the TB.
Refer to 4.7.2 Move to/from Special Purpose Register Instructions for simpli-
fied mnemonics for reading and writing to the time base. For information on the
time base clock source, refer to the System Interface Unit Reference Manual (SI-
URM/AD).
TBU TBL
RESET:UNCHANGED
[0:31] TBU Time Base (Upper) — The high-order 32 bits of the time base
[32:63] TBL Time Base (Lower) — The low-order 32 bits of the time base
Because of the possibility of a carry from TBL to TBU occurring between reads of
the TBL and TBU, a sequence such as the following example is necessary to read
the time base on RCPU-based systems.
loop:
    mftbu   rx          # load from TBU
    mftb    ry          # load from TBL
    mftbu   rz          # load from TBU again
    cmpw    rz,rx       # see if 'old' = 'new'
    bne     loop        # loop if carry occurred
The comparison and loop are necessary to ensure that a consistent pair of values
has been obtained.
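The effect of the read sequence above can be demonstrated with a toy model. The TimeBase class below is hypothetical and advances the counter on every register read, which exaggerates the real update rate so that the carry case is easy to reproduce.

```python
class TimeBase:
    """Toy 64-bit counter that advances by `step` on every register read."""

    def __init__(self, value, step=1):
        self.value = value
        self.step = step

    def _tick(self):
        self.value = (self.value + self.step) & 0xFFFFFFFFFFFFFFFF

    def mftbu(self):
        upper = self.value >> 32
        self._tick()
        return upper

    def mftb(self):
        lower = self.value & 0xFFFFFFFF
        self._tick()
        return lower

def read_time_base(tb):
    """Mirror of the assembly loop: retry until TBU is stable around TBL."""
    while True:
        rx = tb.mftbu()
        ry = tb.mftb()
        rz = tb.mftbu()
        if rz == rx:  # no carry from TBL into TBU between the reads
            return (rx << 32) | ry
```

Starting the model at 0x00000000FFFFFFFF, the first pass reads an upper half of 0 and a lower half of 0 (after the carry), which would combine into a value far in the past. The comparison detects the changed upper half and the second pass returns a consistent pair.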
RESERVED ILE
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
RESET:
0 0 0 U 0 0 0 0 0 * 0 0 0 0 0 0
* Reset value of this bit depends on the value of the internal reset configuration word. Refer to the System Interface
Unit Reference Manual (SIURM/AD) for more information.
FE[0:1] Mode
01, 10, 11 Floating-point precise mode — The system floating-point assist error
handler is invoked precisely at the instruction that caused the enabled
exception.
DSISR
RESET:UNCHANGED
For information about bit settings, see 6.11.4 Alignment Exception (0x00600).
Data Address
RESET:UNCHANGED
TBU TBL
RESET:UNCHANGED
[0:31] TBU Time Base (Upper) — The high-order 32 bits of the time base
[32:63] TBL Time Base (Lower) — The low-order 32 bits of the time base
The TB can be written at the supervisor privilege level only. The mttbl and mttbu
simplified mnemonics write the lower and upper halves of the TB, respectively. The
mtspr, mttbl, and mttbu instructions treat TBL and TBU as separate 32-bit regis-
ters; setting one leaves the other unchanged. It is not possible to write the entire
64-bit time base in a single instruction.
For information about reading the time base, refer to 2.3 PowerPC VEA Register
Set — Time Base.
Decrementing Counter
RESET:UNCHANGED
The DEC counts down, causing an exception (unless masked) when it passes
through zero. The DEC satisfies the following requirements:
mtdec rA
If the execution of this instruction causes bit 0 of the DEC to change from zero to
one, an exception request is signaled. The DEC can be read into GPR rA with the
following instruction:
mfdec rA
Copying the DEC to a GPR does not affect the DEC content or the exception mech-
anism.
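The exception condition described above, a change of DEC bit 0 from zero to one, can be sketched as follows. Bit 0 is the most significant bit of the 32-bit register. The function names are hypothetical, and the model covers only the request condition, not masking.

```python
def dec_exception_requested(old, new):
    """True when bit 0 (the MSB) of the 32-bit DEC changes from 0 to 1."""
    old_msb = (old >> 31) & 1
    new_msb = (new >> 31) & 1
    return old_msb == 0 and new_msb == 1

def decrement(dec):
    """Count down by one; return (new value, exception requested)."""
    new = (dec - 1) & 0xFFFFFFFF
    return new, dec_exception_requested(dec, new)
```

The same check applies to a value written with mtdec: writing 0x80000000 over a DEC that holds 0x00000010 changes bit 0 from zero to one and so signals a request.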
SRR0
RESET:UNDEFINED
For information on how specific exceptions affect SRR0, refer to the descriptions
of individual exceptions in SECTION 6 EXCEPTIONS.
SRR1
RESET:UNDEFINED
For information on how specific exceptions affect SRR1, refer to the individual ex-
ceptions in SECTION 6 EXCEPTIONS.
SPRG0
SPRG1
SPRG2
SPRG3
RESET:UNCHANGED
SPRG0 Software may load a unique physical address in this register to identify an area of memory reserved for
use by the exception handler. This area must be unique for each processor in the system.
SPRG1 This register may be used as a scratch register by the exception handler to save the content of a GPR.
That GPR then can be loaded from SPRG0 and used as a base register to save other GPRs to memory.
VERSION REVISION
RESET:UNCHANGED
[0:15] VERSION A 16-bit number that identifies the version of the processor and of the PowerPC architec-
ture
[16:31] REVISION A 16-bit number that distinguishes between various releases of a particular version
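Software that has read the PVR into an integer can separate the two fields as sketched below. The function name is hypothetical, and the sample value in the usage note is made up for illustration; it is not the PVR of any real part.

```python
def split_pvr(pvr):
    """Split a 32-bit PVR value into (VERSION, REVISION)."""
    version = (pvr >> 16) & 0xFFFF   # bits [0:15], the high-order half
    revision = pvr & 0xFFFF          # bits [16:31], the low-order half
    return version, revision
```

For a made-up value of 0x00020020, the function returns version 0x0002 and revision 0x0020.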
80 EIE 1 1
81 EID 0 1
82 NRI 0 0
Memory operands can be bytes, half words, words, or double words, or, for the
load/store multiple and move assist instructions, a sequence of bytes or words. The
address of a memory operand is the address of its first byte (that is, of its lowest-
numbered byte). Operand length is implicit for each instruction.
NOTES:
1. An “x” in an address bit position indicates that the bit can be zero
or one independent of the state of other bits in the address.
The concept of alignment is also applied more generally to data in memory. For ex-
ample, 12 bytes of data are said to be word-aligned if the address of the lowest-
numbered byte is a multiple of four.
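The alignment rule can be expressed as a one-line check. This is a minimal sketch with hypothetical names; the operand sizes in bytes are those of the PowerPC architecture.

```python
OPERAND_SIZES = {"byte": 1, "half word": 2, "word": 4, "double word": 8}

def is_aligned(address, boundary):
    """True when address is a multiple of boundary (both in bytes)."""
    return address % boundary == 0
```

A 12-byte block starting at address 0x100C is therefore word-aligned but not double-word-aligned.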
Big-endian ordering assigns the lowest address to the highest-order eight bits of
the scalar. This is called big-endian because the big end of the scalar, considered
as a binary number, comes first in memory.
Little-endian byte ordering assigns the lowest address to the lowest-order (right-
most) eight bits of the scalar. The little end of the scalar, considered as a binary
number, comes first in memory.
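The two byte orderings can be compared directly by laying out the same 32-bit scalar both ways. The sketch below uses Python's built-in int.to_bytes; the function name store_word is hypothetical.

```python
def store_word(value, byteorder):
    """Return the four bytes of a 32-bit scalar, lowest address first."""
    return value.to_bytes(4, byteorder)

big = store_word(0x11121314, "big")        # 11 12 13 14
little = store_word(0x11121314, "little")  # 14 13 12 11
```

In the big-endian layout the byte at the lowest address is 0x11, the highest-order byte. In the little-endian layout it is 0x14, the lowest-order byte.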
Two bits in the MSR specify byte ordering: LE (little-endian mode) and ILE (excep-
tion little-endian mode). The LE bit specifies the endian mode in which the proces-
sor is currently operating, and ILE specifies the mode to be used when the system
error handler is invoked. That is, when an exception occurs, the ILE bit (as set for
the interrupted process) is copied into MSR[LE] to select the endian mode for the
context established by the exception. For both bits, a value of zero specifies big-
endian mode and a value of one specifies little-endian mode.
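The copy of ILE into LE on an exception can be sketched as a bit operation on the MSR. The bit positions used here (ILE at bit 15, LE at bit 31) are assumed from the PowerPC architecture rather than taken from this excerpt, and the function models only this one effect; an exception changes other MSR bits as well.

```python
MSR_ILE = 1 << (31 - 15)   # assumed position: MSR bit 15
MSR_LE = 1 << (31 - 31)    # assumed position: MSR bit 31

def msr_le_on_exception(msr):
    """Return msr with LE replaced by the current value of ILE."""
    if msr & MSR_ILE:
        return msr | MSR_LE
    return msr & ~MSR_LE
```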
The default byte and bit ordering is big-endian, as shown in Figure 3-1. After a hard
reset, the hard reset handler (using the mtmsr instruction) can select little-endian
mode for normal operation and exception processing by setting the LE and ILE bits,
respectively, in the MSR.
MSB
0 1 2 n
Big-Endian Bit Ordering
For a device in which the smallest addressable unit is the 64-bit double word, there
is no question of the order of bytes within double words. All transfers of individual
scalars between registers and system memory are of double words. A subset of
the 64-bit scalar (for example, a byte) is not addressable in memory. As a result, to
access any subset of the bits of a scalar, the entire 64-bit scalar must be accessed,
and when a memory location is read, the 64-bit value returned is the 64-bit value
last written to that location.
For PowerPC processors, the smallest addressable memory unit is the byte (8
bits), and scalars are composed of one or more sequential bytes. When a 32-bit
scalar is moved from a register to memory, it occupies four consecutive byte ad-
dresses, and a decision must be made regarding the order of these bytes in these
four addresses.
Contents:                  11  12  13  14
Address:   07  06  05  04  03  02  01  00

Contents:  21  22  23  24  25  26  27  28
Address:   0F  0E  0D  0C  0B  0A  09  08

Contents:  ‘D’ ‘C’ ‘B’ ‘A’ 31  32  33  34
Address:   17  16  15  14  13  12  11  10

Contents:          51  52      ‘G’ ‘F’ ‘E’
Address:   1F  1E  1D  1C  1B  1A  19  18

Contents:                  61  62  63  64
Address:                   23  22  21  20
8 No change
The modified EA is passed to the main memory and the specified width of the data
is transferred between a GPR or FPR and the addressed memory locations (as
modified). The effective address modification makes it appear to the processor that
individual aligned scalars are stored as little-endian, when in fact they are stored
as big-endian but in different bytes within double words from the order in which they
are stored in big-endian mode.
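The effective address modification can be sketched in a few lines. The helper name modify_ea and the mask table are illustrative only. The 8-byte (no change) and 1-byte (XOR with 0b111) cases appear in this section; the 2-byte and 4-byte masks are assumed from the general PowerPC little-endian scheme. The sketch covers aligned scalars only.

```python
# Hypothetical helper. Masks for 2- and 4-byte accesses are assumptions
# based on the general PowerPC little-endian address modification scheme.
EA_XOR_MASK = {1: 0b111, 2: 0b110, 4: 0b100, 8: 0b000}

def modify_ea(ea: int, width: int) -> int:
    """Return the address presented to memory for an aligned access."""
    if width not in EA_XOR_MASK:
        raise ValueError("width must be 1, 2, 4, or 8 bytes")
    if ea % width != 0:
        raise ValueError("this sketch covers aligned scalars only")
    # Only the three low-order address bits are affected.
    return ea ^ EA_XOR_MASK[width]
```

Under these assumptions, a word at little-endian address 0 is transferred at address 4 of the same double word, and a double word access is left unchanged.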
[Figure 3-4: structure S as stored in memory in PowerPC little-endian mode,
shown in rows of eight bytes over byte addresses 0x00 through 0x27, with
addresses increasing from left to right.]
Because of the modifications to the EA, the same structure S appears to the
processor to be mapped into memory as shown in Figure 3-5 when MSR[LE] = 1
(little-endian mode enabled).
Note that as seen by the program executing in the processor, the mapping for the
structure S is identical to the little-endian mapping shown in Figure 3-3. From out-
side of the processor, the addresses of the bytes making up the structure S are as
shown in Figure 3-4. These addresses match neither the big-endian mapping of
Figure 3-2 nor the little-endian mapping of Figure 3-3. This must be taken into
account when performing I/O operations in little-endian mode; this is discussed in
3.2.4 Input/Output in Little-Endian Mode.
The PowerPC architecture defines that half words, words, and double words be
placed in memory such that the little-endian address of the lowest-order byte is the
EA computed by the load or store instruction; the little-endian address of the next-
lowest-order byte is one greater, and so on. Figure 3-6 shows a four-byte word
stored at little-endian address 5. The word is presumed to contain the binary rep-
resentation of 0x1112 1314.
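[Figure 3-6: little-endian mapping of the word 0x1112 1314 at little-endian
address 5. Bytes 12, 13, and 14 occupy addresses 07, 06, and 05; byte 11
occupies address 08.]
[Figure: the same word as stored in memory in PowerPC little-endian mode.
Bytes 12, 13, and 14 occupy addresses 00, 01, and 02; byte 11 occupies
address 0F.]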
NOTE
The misaligned word in this example spans two double words. The
two parts of the misaligned word are not contiguous in the big-endian
addressing space.
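The byte placement in this example can be reproduced with a short sketch. It assumes that each byte of the misaligned word is placed as if stored individually, with its little-endian byte address XORed with 0b111. The function name store_word_le and the dictionary used as memory are illustrative only.

```python
def store_word_le(memory: dict, le_addr: int, value: int) -> None:
    """Store a 32-bit word byte by byte at a little-endian address."""
    for i in range(4):
        byte = (value >> (8 * i)) & 0xFF   # i = 0 is the lowest-order byte
        # Assumed per-byte address modification: XOR the low three bits.
        memory[(le_addr + i) ^ 0b111] = byte

mem = {}
store_word_le(mem, 5, 0x11121314)
```

With these assumptions, the three low-order bytes land at addresses 0x00 to 0x02 of the first double word and the high-order byte lands at 0x0F, so the two parts are not contiguous.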
An implementation may choose to support only a subset of misaligned little-endian
memory accesses. For example, misaligned little-endian accesses contained with-
in a single double word may be supported, while those that span double words may
cause alignment exceptions.
String accesses are inherently misaligned; they transfer word-length quantities be-
tween memory and registers, but the quantities are not necessarily aligned on word
boundaries.
NOTE
The system software must determine whether to emulate the except-
ing instruction or treat it as an illegal operation.
Although the words addressed by these instructions are on word boundaries, each
word is in the half of its containing double word opposite from where it would be in
big-endian mode. Note that the system software must determine whether to emu-
late the excepting instruction or treat it as an illegal operation.
If this same program is assembled for and executed in little-endian mode, the map-
ping seen by the processor appears as shown in Figure 3-9.
Each machine instruction appears in memory as a 32-bit integer containing the val-
ue described in the instruction description, regardless of whether the processor is
operating in big- or little-endian mode. This is because scalars are always mapped
in memory in big-endian byte order.
When little-endian mapping is used, all references to the instruction stream must
follow little-endian addressing conventions, including addresses saved in system
registers when the exception is taken, return addresses saved in the link register,
and branch displacements and addresses.
• An instruction address placed in the link register by branch and link, or an in-
struction address saved in an SPR when an exception is taken is the address
that a program executing in little-endian mode would use to access the in-
struction as a word of data using a load instruction.
• An offset in a relative branch instruction reflects the difference between the
addresses of the instructions, where the addresses used are those that a pro-
In order for I/O transfers in little-endian mode to appear to transfer bytes properly,
they must be performed as if the bytes transferred were accessed one at a time,
using the little-endian address modification appropriate for the single-byte transfers
(XOR the bits with 0b111). This does not mean that I/O on little-endian PowerPC
machines must be done using only one-byte-wide transfers. Data transfers can be
as wide as desired, but the order of the bytes within double words must be as if
they were fetched or stored one at a time.
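As a rough illustration of this rule, the sketch below builds the memory image that a byte stream would have if each byte were stored one at a time at consecutive little-endian addresses. The function name to_memory_image is hypothetical, and the sketch assumes the stream length is a multiple of eight bytes.

```python
def to_memory_image(stream: bytes) -> bytes:
    """Place byte i of the stream at address (i XOR 0b111)."""
    if len(stream) % 8 != 0:
        raise ValueError("sketch assumes whole double words")
    image = bytearray(len(stream))
    for le_addr, value in enumerate(stream):
        image[le_addr ^ 0b111] = value
    return bytes(image)
```

The effect is a byte reversal within each double word, which is why a wide transfer is acceptable as long as it preserves this ordering.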
The length of the exponent and the fraction fields differ between these two preci-
sion formats. The structure of the single-precision format is shown in Figure 3-10;
the structure of the double-precision format is shown in Figure 3-11.
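[Figure 3-10: single-precision format. S (bit 0), EXP (bits 1 to 8),
FRACTION (bits 9 to 31).]
[Figure 3-11: double-precision format. S, EXP, FRACTION.]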
• S (sign bit).
• EXP (exponent + bias)
• FRACTION (fraction)
The significand consists of a leading implied bit concatenated on the right with the
FRACTION. This leading implied bit is a one for normalized numbers and a zero
for denormalized numbers in the unit bit position (that is, the first bit to the left of
the binary point). Parameters for the two floating-point formats are listed in Table
3-5.
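The field layout can be illustrated by unpacking a single-precision value. This is a minimal sketch; the function name single_fields is hypothetical, and the field widths (1, 8, and 23 bits) and the bias of 127 mentioned in the comments are the standard IEEE single-precision values.

```python
import struct

def single_fields(x: float):
    """Return (S, EXP, FRACTION) of x encoded in single-precision format."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31              # S: bit 0 in PowerPC bit numbering
    exp = (bits >> 23) & 0xFF      # EXP: exponent plus bias (bias 127 assumed)
    fraction = bits & 0x7FFFFF     # FRACTION: 23 bits, implied bit not stored
    return sign, exp, fraction
```

For 1.0 the stored exponent field is the bias itself and the fraction is zero, because the leading one is implied rather than stored.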
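[Table: biased exponent encodings. Surviving rows: 10...00 corresponds to an
unbiased exponent of 1 and 01...10 to –1, in both formats.]
[Figure: representable ranges, with the negative range and the two tiny
regions on either side of zero marked.]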
The positive and negative NaNs are not related to the numbers or ±∞ by order or
value, but they are encodings that convey diagnostic information such as the
representation of uninitialized variables.
Sign Bit  Biased Exponent  Leading Bit  Fraction  Value
0         0                0            Non-zero  +Denormalized
0         0                0            Zero      +0
1         0                0            Zero      –0
1         0                0            Non-zero  –Denormalized
where (s) is the sign, (E) is the unbiased exponent and (1.fraction) is the significand
composed of a leading unit bit (implied bit) and a fractional part. The format for nor-
malized numbers is shown in Figure 3-14.
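[Figure 3-14: format for normalized numbers, with the sign of the mantissa
(0 or 1). Approximate ranges of the magnitude M:]
Single-precision format: 1.2 x 10^-38 ≤ M ≤ 3.4 x 10^38
Double-precision format: 2.2 x 10^-308 ≤ M ≤ 1.8 x 10^308
[Figure: format for zero values. Biased EXPONENT = 0, MANTISSA = 0, sign of
mantissa 0 or 1.]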
Denormalized numbers are non-zero numbers smaller in magnitude than the rep-
resentable normalized numbers. They are values in which the implied unit bit is ze-
ro. Denormalized numbers are interpreted as follows:
Emin is the minimum representable exponent value (that is, –126 for single-preci-
sion, –1022 for double-precision).
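The two interpretations can be compared in a short sketch for the single-precision format. It assumes the usual IEEE forms: (-1)^s x 2^E x (1.fraction) for normalized numbers and (-1)^s x 2^Emin x (0.fraction) for denormalized numbers. The function name single_value and the constant names are illustrative only.

```python
from fractions import Fraction

E_MIN = -126          # minimum exponent for single precision
BIAS = 127            # assumed IEEE single-precision exponent bias
FRACTION_BITS = 23

def single_value(sign: int, biased_exp: int, fraction: int) -> Fraction:
    """Exact value of a finite single-precision encoding."""
    frac = Fraction(fraction, 1 << FRACTION_BITS)
    if biased_exp == 0:
        significand, e = frac, E_MIN              # implied unit bit is zero
    elif biased_exp == 255:
        raise ValueError("infinity or NaN, not a finite number")
    else:
        significand, e = 1 + frac, biased_exp - BIAS   # implied unit bit is one
    return (-1) ** sign * significand * Fraction(2) ** e
```

The smallest denormalized magnitude under these assumptions is 2^-149, well below the smallest normalized magnitude of 2^-126.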
[Figure: format for infinities. Biased EXPONENT = maximum, MANTISSA = 0, sign
of mantissa 0 or 1.]
The fraction value is zero. Infinities are used to approximate values greater in mag-
nitude than the maximum normalized value. Infinity arithmetic is defined as the lim-
iting case of real arithmetic, with restricted operations defined between numbers
and infinities. Infinities and the reals can be related as follows:
Arithmetic using infinite numbers is always exact and does not signal any excep-
tion, except when an exception occurs due to the invalid operations as described
in 6.11.10.6 Invalid Operation Exception Conditions.
Signaling NaNs signal exceptions when they are specified as arithmetic operands.
Quiet NaNs represent the results of certain invalid operations, such as invalid arith-
If (frA) is a NaN
Then frD ← (frA)
Else if (frB) is a NaN
Then frD ← (frB)
Else if (frC) is a NaN
Then frD ← (frC)
Else if generated QNaN
Then frD ← generated QNaN
If the operand specified by frA is a NaN, that NaN is stored as the result. Otherwise,
if the operand specified by frB is a NaN (if the instruction specifies an frB operand),
that NaN is stored as the result. Otherwise, if the operand specified by frC is a NaN
(if the instruction specifies an frC operand), that NaN is stored as the result. Oth-
erwise, if a QNaN is generated by a disabled invalid operation exception, that
QNaN is stored as the result. If a QNaN is to be generated as a result, the QNaN
generated has a sign bit of zero, an exponent field of all ones, and a high-order
fraction bit of one with all other fraction bits zero. An instruction that generates a
QNaN as the result of a disabled invalid operation generates this QNaN. This is
shown in Figure 3-19.
[Figure 3-19: generated QNaN. Sign = 0, exponent = 111...1, fraction = 1000...0.]
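The operand priority described above can be sketched on 64-bit patterns. The names select_nan_result, is_nan, and DEFAULT_QNAN are hypothetical; the constant follows the description of the generated QNaN (sign zero, exponent all ones, high-order fraction bit one) applied to the double-precision layout.

```python
import math
import struct

DEFAULT_QNAN = 0x7FF8_0000_0000_0000   # sign 0, EXP all ones, fraction 1000...0

def is_nan(bits: int) -> bool:
    """True if a 64-bit pattern has EXP all ones and a non-zero fraction."""
    return (bits >> 52) & 0x7FF == 0x7FF and bits & ((1 << 52) - 1) != 0

def select_nan_result(frA, frB=None, frC=None, invalid_operation=False):
    """Return the NaN to store in frD, or None if the result is not a NaN."""
    for operand in (frA, frB, frC):        # priority order: frA, frB, frC
        if operand is not None and is_nan(operand):
            return operand
    if invalid_operation:                  # disabled invalid operation exception
        return DEFAULT_QNAN
    return None
```

Passing None for frB or frC models an instruction that does not specify that operand.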
The sign of the result of an addition operation is the sign of the source operand hav-
ing the larger absolute value. The sign of the result of the subtraction operation, x
– y, is the same as the sign of the result of the addition operation, x+(–y).
A number is normalized by shifting its significand left while decrementing its expo-
nent by one for each bit shifted, until the leading significand bit becomes one. The
guard bit and the round bit participate in the shift with zeros shifted into the round
bit; see 3.4.1 Execution Model for IEEE Operations.
During normalization, the exponent is regarded as if its range were unlimited. If the
resulting exponent value is less than the minimum value that can be represented
in the format specified for the result, the intermediate result is said to be “tiny” and
the stored result is determined by the rules described in 6.11.10.9 Underflow Ex-
ception Condition. The sign of the number does not change.
When denormalized numbers are operands of multiply and divide operations, op-
erands are prenormalized internally before the operations are performed.
Floating-point single-precision formats are used by the following four types of in-
structions:
[Figure: single-precision value held in a 64-bit FPR. S (bit 0), EXP (bits 1
to 11), FRACTION (bits 12 to 63), with 23 significant fraction bits followed
by 29 zero bits.]
3.3.11 Rounding
All arithmetic instructions defined by the PowerPC architecture produce an inter-
mediate result considered infinitely precise. This result must then be written with a
precision of finite length into an FPR. After normalization or denormalization, if the
infinitely precise intermediate result cannot be represented in the precision re-
quired by the instruction, it is rounded before being placed into the target FPR.
The instructions that potentially round their result are the arithmetic, multiply-add,
and rounding and conversion instructions. As shown in Figure 3-21, whether
rounding occurs depends on the source values.
[Figure 3-21: flowchart of the rounding decision and the resulting FI and FR
settings.]
Each of these instructions sets FPSCR bits FR and FI, according to whether round-
ing occurs (FI) and whether the fraction was incremented (FR). If rounding occurs,
FI is set to one and FR may be either zero or one. If rounding does not occur, both
FR and FI are cleared. Other floating-point instructions do not alter FR and FI. Four
modes of rounding are provided that are user-selectable through the floating-point
rounding control field in the FPSCR. These are encoded as follows in Table 3-7.
00 Round to nearest
Let Z be the infinitely precise intermediate arithmetic result or the operand of a con-
version operation. If Z can be represented exactly in the target format, no rounding
occurs and the result in all rounding modes is equivalent to truncation of Z. If Z can-
not be represented exactly in the target format, let Z1 and Z2 be the next larger and
next smaller numbers representable in the target format that bound Z; then Z1 or
Z2 can be used to approximate the result in the target format.
[Figure: number lines for negative and positive values showing the infinitely
precise value Z between Z2 and Z1, with the bounds obtained by truncating
after the LSB or by incrementing the LSB of Z.]
The IEEE-754 standard includes 32-bit and 64-bit arithmetic. The standard re-
quires that single-precision arithmetic be provided for single-precision operands.
The standard permits double-precision arithmetic instructions to have either (or
both) single-precision or double-precision operands, but states that single-preci-
sion arithmetic instructions should not accept double-precision operands.
The bits and fields for the IEEE 64-bit execution model are defined as follows:
G R X  Interpretation
0 0 0  IR is exact
0 0 1  IR closer to NL
0 1 0  IR closer to NL
0 1 1  IR closer to NL
1 0 0  IR midway between NL and NH
1 0 1  IR closer to NH
1 1 0  IR closer to NH
1 1 1  IR closer to NH
The significand of the intermediate result is made up of the L bit, the FRACTION,
and the G, R, and X bits.
Before results are stored into an FPR, the significand is rounded if necessary, us-
ing the rounding mode specified by FPSCR[RN]. If rounding causes a carry into C,
the significand is shifted right one position and the exponent is incremented by one.
This may yield an inexact result and possibly exponent overflow. Fraction bits to
the left of the bit position used for rounding are stored into the FPR, and low-order
bit positions, if any, are set to zero.
Four rounding modes are provided which are user-selectable through FPSCR[RN]
as described in 3.3.11 Rounding. For rounding, the conceptual guard, round, and
sticky bits are defined in terms of accumulator bits.
Table 3-9 shows the positions of the guard, round, and sticky bits for double-pre-
cision and single-precision floating-point numbers.
Rounding can be treated as though the significand were shifted right, if required,
until the least significant bit to be retained is in the low-order bit position of the
FRACTION. If any of the guard, round, or sticky bits are non-zero, the result is in-
exact.
• Round to nearest
— Guard bit = 0: The result is truncated. (Result exact (GRX = 000) or closest
to next lower value in magnitude (GRX = 001, 010, or 011))
— Guard bit = 1: Depends on round and sticky bits:
• Case a: If the round or sticky bit is one (inclusive), the result is increment-
ed. (result closest to next higher value in magnitude (GRX = 101, 110, or
111))
• Case b: If the round and sticky bits are zero (i.e., the result is midway be-
tween the closest representable values), the result is rounded to an even
value. That is, if the low-order bit of the result is one, the result is incre-
mented. If the low-order bit of the result is zero, the result is truncated.
• If during the round to nearest process, truncation of the unrounded number
produces the maximum magnitude for the specified precision, the following
action is taken:
— Guard bit = 1: Store infinity with the sign of the unrounded result.
— Guard bit = 0: Store the truncated (maximum magnitude) value.
• Round toward zero — Choose the smaller in magnitude of Z1 or Z2. If the
guard, round, or sticky bit is non-zero, the result is inexact.
• Round toward +infinity
Choose Z1.
• Round toward –infinity
Choose Z2.
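The four modes can be summarized in a sketch that operates on a truncated significand and the guard, round, and sticky bits. The function name round_significand and the mode strings are illustrative only. The sketch does not model the carry into C or the overflow-to-infinity case described above.

```python
def round_significand(sig: int, g: int, r: int, x: int,
                      mode: str, negative: bool = False):
    """Return (rounded significand, FI, FR) for a magnitude plus G, R, X bits."""
    inexact = bool(g or r or x)
    if mode == "nearest":
        # Increment above the midpoint, or at the midpoint when the LSB is one.
        increment = bool(g and (r or x or (sig & 1)))
    elif mode == "zero":
        increment = False                      # always truncate the magnitude
    elif mode == "+inf":
        increment = inexact and not negative   # choose Z1, the larger value
    elif mode == "-inf":
        increment = inexact and negative       # choose Z2, the smaller value
    else:
        raise ValueError("unknown rounding mode")
    return sig + int(increment), int(inexact), int(increment)
```

FI follows the inexact condition and FR records whether the magnitude was incremented, so an exact result leaves both clear in every mode.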
Where the result is to have fewer than 53 bits of precision because the instruction
is a floating round to single-precision or single-precision arithmetic instruction, the
intermediate result either is normalized or is placed in correct denormalized form
before the result is potentially rounded.
NOTE
Rounding occurs only after the add; therefore, the computation of the
product and sum together is infinitely precise before the final result
is rounded to a representable format.
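The effect of a single final rounding can be seen with exact rational arithmetic. This is a sketch in double precision with round to nearest; the function names and the operand values are chosen for illustration and are not taken from the manual.

```python
from fractions import Fraction

def fused_multiply_add(a: float, c: float, b: float) -> float:
    """Compute a*c + b exactly, then round once to double precision."""
    return float(Fraction(a) * Fraction(c) + Fraction(b))

def separate_multiply_add(a: float, c: float, b: float) -> float:
    """Round after the multiply and again after the add."""
    return a * c + b

a = 1.0 + 2.0 ** -30
c = 1.0 - 2.0 ** -30
b = -1.0
fused = fused_multiply_add(a, c, b)
separate = separate_multiply_add(a, c, b)
```

Here the exact product is 1 - 2^-60, which rounds to 1.0 if stored on its own, so the separately rounded sequence returns zero while the single rounding keeps the -2^-60 term.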
The first part of the operation is a multiply. The multiply has two 53-bit significands
as inputs, which are assumed to be prenormalized, and produces a result conform-
ing to the above model. If there is a carry out of the significand (into the C bit), the
significand is shifted right one position, placing the L bit into the most significant bit
of the FRACTION and placing the C bit into the L bit. All 106 bits (L bit plus the frac-
The result of the add is then normalized, with all bits of the add result, except the
X' bit, participating in the shift. The normalized result provides an intermediate re-
sult as input to the rounder that conforms to the model described in 3.4.1 Execu-
tion Model for IEEE Operations, where:
Status bits are set to reflect the result of the entire operation: for example, no status
is recorded for the result of the multiplication part of the operation.
Non-IEEE mode is entered by setting the NI (non-IEEE enable) bit in the FPSCR.
The hardware never asserts the FPSCR[XX] (inexact) bit on an underflow condition;
it is done as a part of the floating-point assist interrupt handler. Therefore, in
non-IEEE mode, FPSCR[XX] cannot be depended upon to be a complete accumulation
of all inexact conditions.
Arithmetic and logical instructions do not modify memory. To use a memory oper-
and in a computation and then modify the same or another memory location, the
memory contents must be loaded into a register, modified, and then written back
to the target location.
NOTE
Other PowerPC implementations invoke the program exception han-
dler in this case. Refer to 6.11.11 Software Emulation Exception
(0x01000) for additional information.
• Instructions that are not implemented in the PowerPC architecture. These op-
codes are available for future extensions of the PowerPC architecture; that is,
future versions of the PowerPC architecture may define any of these instruc-
tions to perform new functions.
• Instructions that are implemented in the PowerPC architecture but are not im-
plemented in a specific PowerPC implementation. For example, instructions
that can be executed on 64-bit PowerPC processors are considered illegal for
32-bit processors.
• All unused extended opcodes are illegal.
• An instruction consisting entirely of zeros is guaranteed to be an illegal instruc-
tion.
An attempt to execute an illegal instruction invokes the software emulation error
handler. Notice that in other PowerPC implementations, the program exception
handler may be invoked in this case.
These instructions treat the source operands as signed integers unless the instruc-
tion is explicitly identified as an unsigned operation or an address conversion.
The integer instructions that update the condition register (i.e., those with a mne-
monic ending in a period) set condition register field CR0 (bits [0:3]) to characterize
the result of the operation. These instructions include those with the Rc bit equal to
one and the addic., andi., and andis. integer logical and arithmetic instructions.
The condition register field CR0 is set as if the result were compared algebraically
to zero.
The following integer arithmetic instructions always set XER[CA] to reflect the carry
out of bit 0: addic, addic., subfic, addc, subfc, adde, subfe, addme, subfme,
addze, and subfze. Integer arithmetic instructions with the overflow enable (OE)
bit set cause XER[SO] and XER[OV] to be set to reflect overflow of the 32-bit result.
Unless otherwise noted, when condition register field CR0 and the XER are affect-
ed, they reflect the value placed in the target register.
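The relationship between the 32-bit result, the carry, the overflow condition, and CR0 can be sketched as follows. The function names are hypothetical. The ov value is what a form with OE set would record in XER[OV], and the SO bit is passed in rather than modeled.

```python
MASK32 = 0xFFFF_FFFF

def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a signed integer."""
    return value - (1 << 32) if value & 0x8000_0000 else value

def add_carrying(ra: int, rb: int):
    """Return (result, CA, OV) for a 32-bit add of two register patterns."""
    total = ra + rb
    result = total & MASK32
    ca = total >> 32                       # carry out of bit 0 (the MSB)
    ov = int(to_signed(ra) + to_signed(rb) != to_signed(result))
    return result, ca, ov

def cr0_for(result: int, so: int = 0) -> dict:
    """CR0 as if the result were compared algebraically to zero."""
    s = to_signed(result)
    return {"LT": int(s < 0), "GT": int(s > 0), "EQ": int(s == 0), "SO": so}
```

Adding 1 to 0xFFFF FFFF produces a carry but no signed overflow, while adding 1 to 0x7FFF FFFF overflows without a carry.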
The RCPU performs best for aligned load and store operations. See 6.11.4 Align-
ment Exception (0x00600) for scenarios that cause an alignment exception.
Name:      Add Immediate
Mnemonic:  addi
Operands:  rD,rA,SIMM
Operation: The sum (rA|0) + SIMM is placed into register rD.

Name:      Add Immediate Shifted
Mnemonic:  addis
Operands:  rD,rA,SIMM
Operation: The sum (rA|0) + (SIMM || 0x0000) is placed into register rD.

Name:      Add
Mnemonics: add, add., addo, addo.
Operands:  rD,rA,rB
Operation: The sum (rA) + (rB) is placed into register rD.
           add    Add
           add.   Add with CR Update. The dot suffix enables the update of
                  the condition register.
           addo   Add with Overflow Enabled. The o suffix enables the
                  overflow bit (OV) in the XER.
           addo.  Add with Overflow and CR Update. The o. suffix enables the
                  update of the condition register and enables the overflow
                  bit (OV) in the XER.

Name:      Subtract from
Mnemonics: subf, subf., subfo, subfo.
Operands:  rD,rA,rB
Operation: The sum ¬ (rA) + (rB) + 1 is placed into rD.
           subf    Subtract from
           subf.   Subtract from with CR Update. The dot suffix enables the
                   update of the condition register.
           subfo   Subtract from with Overflow Enabled. The o suffix enables
                   the overflow bit (OV) in the XER.
           subfo.  Subtract from with Overflow and CR Update. The o. suffix
                   enables the update of the condition register and enables
                   the overflow bit (OV) in the XER.

Name:      Add Immediate Carrying
Mnemonic:  addic
Operands:  rD,rA,SIMM
Operation: The sum (rA) + SIMM is placed into register rD.

Name:      Add Immediate Carrying and Record
Mnemonic:  addic.
Operands:  rD,rA,SIMM
Operation: The sum (rA) + SIMM is placed into rD. The condition register is
           updated.

Name:      Subtract from Immediate Carrying
Mnemonic:  subfic
Operands:  rD,rA,SIMM
Operation: The sum ¬ (rA) + SIMM + 1 is placed into register rD.
Name:      Add Carrying
Mnemonics: addc, addc., addco, addco.
Operands:  rD,rA,rB
Operation: The sum (rA) + (rB) is placed into register rD.
           addc    Add Carrying
           addc.   Add Carrying with CR Update. The dot suffix enables the
                   update of the condition register.
           addco   Add Carrying with Overflow Enabled. The o suffix enables
                   the overflow bit (OV) in the XER.
           addco.  Add Carrying with Overflow and CR Update. The o. suffix
                   enables the update of the condition register and enables
                   the overflow bit (OV) in the XER.

Name:      Subtract from Carrying
Mnemonics: subfc, subfc., subfco, subfco.
Operands:  rD,rA,rB
Operation: The sum ¬ (rA) + (rB) + 1 is placed into register rD.
           subfc    Subtract from Carrying
           subfc.   Subtract from Carrying with CR Update. The dot suffix
                    enables the update of the condition register.
           subfco   Subtract from Carrying with Overflow. The o suffix
                    enables the overflow bit (OV) in the XER.
           subfco.  Subtract from Carrying with Overflow and CR Update. The
                    o. suffix enables the update of the condition register
                    and enables the overflow bit (OV) in the XER.

Name:      Add Extended
Mnemonics: adde, adde., addeo, addeo.
Operands:  rD,rA,rB
Operation: The sum (rA) + (rB) + XER[CA] is placed into register rD.
           adde    Add Extended
           adde.   Add Extended with CR Update. The dot suffix enables the
                   update of the condition register.
           addeo   Add Extended with Overflow. The o suffix enables the
                   overflow bit (OV) in the XER.
           addeo.  Add Extended with Overflow and CR Update. The o. suffix
                   enables the update of the condition register and enables
                   the overflow bit (OV) in the XER.

Name:      Subtract from Extended
Mnemonics: subfe, subfe., subfeo, subfeo.
Operands:  rD,rA,rB
Operation: The sum ¬ (rA) + (rB) + XER[CA] is placed into register rD.
           subfe    Subtract from Extended
           subfe.   Subtract from Extended with CR Update. The dot suffix
                    enables the update of the condition register.
           subfeo   Subtract from Extended with Overflow. The o suffix
                    enables the overflow bit (OV) in the XER.
           subfeo.  Subtract from Extended with Overflow and CR Update. The
                    o. suffix enables the update of the condition register
                    and enables the overflow bit (OV) in the XER.
Name:      Add to Minus One Extended
Mnemonics: addme, addme., addmeo, addmeo.
Operands:  rD,rA
Operation: The sum (rA) + XER[CA] + 0xFFFF FFFF is placed into register rD.
           addme    Add to Minus One Extended
           addme.   Add to Minus One Extended with CR Update. The dot suffix
                    enables the update of the condition register.
           addmeo   Add to Minus One Extended with Overflow. The o suffix
                    enables the overflow bit (OV) in the XER.
           addmeo.  Add to Minus One Extended with Overflow and CR Update.
                    The o. suffix enables the update of the condition
                    register and enables the overflow bit (OV) in the XER.

Name:      Subtract from Minus One Extended
Mnemonics: subfme, subfme., subfmeo, subfmeo.
Operands:  rD,rA
Operation: The sum ¬ (rA) + XER[CA] + 0xFFFF FFFF is placed into register rD.
           subfme    Subtract from Minus One Extended
           subfme.   Subtract from Minus One Extended with CR Update. The
                     dot suffix enables the update of the condition register.
           subfmeo   Subtract from Minus One Extended with Overflow. The o
                     suffix enables the overflow bit (OV) in the XER.
           subfmeo.  Subtract from Minus One Extended with Overflow and CR
                     Update. The o. suffix enables the update of the
                     condition register and enables the overflow bit (OV) in
                     the XER.

Name:      Add to Zero Extended
Mnemonics: addze, addze., addzeo, addzeo.
Operands:  rD,rA
Operation: The sum (rA) + XER[CA] is placed into register rD.
           addze    Add to Zero Extended
           addze.   Add to Zero Extended with CR Update. The dot suffix
                    enables the update of the condition register.
           addzeo   Add to Zero Extended with Overflow. The o suffix enables
                    the overflow bit (OV) in the XER.
           addzeo.  Add to Zero Extended with Overflow and CR Update. The o.
                    suffix enables the update of the condition register and
                    enables the overflow bit (OV) in the XER.

Name:      Subtract from Zero Extended
Mnemonics: subfze, subfze., subfzeo, subfzeo.
Operands:  rD,rA
Operation: The sum ¬ (rA) + XER[CA] is placed into register rD.
           subfze    Subtract from Zero Extended
           subfze.   Subtract from Zero Extended with CR Update. The dot
                     suffix enables the update of the condition register.
           subfzeo   Subtract from Zero Extended with Overflow. The o suffix
                     enables the overflow bit (OV) in the XER.
           subfzeo.  Subtract from Zero Extended with Overflow and CR
                     Update. The o. suffix enables the update of the
                     condition register and enables the overflow bit (OV) in
                     the XER.

Name:      Negate
Mnemonics: neg, neg., nego, nego.
Operands:  rD,rA
Operation: The sum ¬ (rA) + 1 is placed into register rD.
           neg    Negate
           neg.   Negate with CR Update. The dot suffix enables the update
                  of the condition register.
           nego   Negate with Overflow. The o suffix enables the overflow
                  bit (OV) in the XER.
           nego.  Negate with Overflow and CR Update. The o. suffix enables
                  the update of the condition register and enables the
                  overflow bit (OV) in the XER.
Name:      Multiply Low Immediate
Mnemonic:  mulli
Operands:  rD,rA,SIMM
Operation: The low-order 32 bits of the 48-bit product (rA) * SIMM are
           placed into register rD. The low-order 32 bits of the product are
           the correct 32-bit product. The low-order bits are independent of
           whether the operands are treated as signed or unsigned integers.
           However, XER[OV] is set based on the result interpreted as a
           signed integer.
           The high-order bits are lost. This instruction can be used with
           mulhwx to calculate a full 64-bit product.

Name:      Multiply Low
Mnemonics: mullw, mullw., mullwo, mullwo.
Operands:  rD,rA,rB
Operation: The low-order 32 bits of the 64-bit product (rA) * (rB) are
           placed into register rD. The low-order 32 bits of the product are
           the correct 32-bit product. The low-order bits are independent of
           whether the operands are treated as signed or unsigned integers.
           However, XER[OV] is set based on the result interpreted as a
           signed integer.
           The high-order bits are lost. This instruction can be used with
           mulhwx to calculate a full 64-bit product. Some implementations
           may execute faster if rB contains the operand having the smaller
           absolute value.
           mullw    Multiply Low
           mullw.   Multiply Low with CR Update. The dot suffix enables the
                    update of the condition register.
           mullwo   Multiply Low with Overflow. The o suffix enables the
                    overflow bit (OV) in the XER.
           mullwo.  Multiply Low with Overflow and CR Update. The o. suffix
                    enables the update of the condition register and enables
                    the overflow bit (OV) in the XER.

Name:      Multiply High Word
Mnemonics: mulhw, mulhw.
Operands:  rD,rA,rB
Operation: The contents of rA and rB are interpreted as 32-bit signed
           integers. The 64-bit product is formed. The high-order 32 bits of
           the 64-bit product are placed into rD.
           Both operands and the product are interpreted as signed integers.
           This instruction may execute faster if rB contains the operand
           having the smaller absolute value.
           mulhw   Multiply High Word
           mulhw.  Multiply High Word with CR Update. The dot suffix enables
                   the update of the condition register.

Name:      Multiply High Word Unsigned
Mnemonics: mulhwu, mulhwu.
Operands:  rD,rA,rB
Operation: The contents of rA and of rB are extracted and interpreted as
           32-bit unsigned integers. The 64-bit product is formed. The
           high-order 32 bits of the 64-bit product are placed into rD.
           Both operands and the product are interpreted as unsigned
           integers.
           This instruction may execute faster if rB contains the operand
           having the smaller absolute value.
           mulhwu   Multiply High Word Unsigned
           mulhwu.  Multiply High Word Unsigned with CR Update. The dot
                    suffix enables the update of the condition register.
Name:      Divide Word
Mnemonics: divw, divw., divwo, divwo.
Operands:  rD,rA,rB
Operation: The dividend is the signed value of (rA). The divisor is the
           signed value of (rB). The 64-bit quotient is formed. The
           low-order 32 bits of the 64-bit quotient are placed into rD. The
           remainder is not supplied as a result.
           Both operands are interpreted as signed integers. The quotient is
           the unique signed integer that satisfies the following:
               dividend = (quotient times divisor) + r
           where 0 ≤ r < |divisor| if the dividend is non-negative, and
           –|divisor| < r ≤ 0 if the dividend is negative.
           If an attempt is made to perform any of the divisions
               0x8000 0000 / –1
           or
               <anything> / 0
           the contents of register rD are undefined, as are the contents of
           the LT, GT, and EQ bits of the condition register field CR0 if
           the instruction has condition register updating enabled. In these
           cases, if instruction overflow is enabled, then XER[OV] is set.
           The 32-bit signed remainder of dividing (rA) by (rB) can be
           computed as follows, except in the case that (rA) = –2^31 and
           (rB) = –1:
               divw  rD,rA,rB    # rD = quotient
               mullw rD,rD,rB    # rD = quotient * divisor
               subf  rD,rD,rA    # rD = remainder

Name:      Divide Word Unsigned
Mnemonics: divwu, divwu., divwuo, divwuo.
Operands:  rD,rA,rB
Operation: The dividend is the value of (rA). The divisor is the value of
           (rB). The 32-bit quotient is placed into rD. The remainder is not
           supplied as a result.
           Both operands are interpreted as unsigned integers. The quotient
           is the unique unsigned integer that satisfies the following:
               dividend = (quotient times divisor) + r
           where 0 ≤ r < divisor.
           If an attempt is made to perform the division
               <anything> / 0
           the contents of register rD are undefined, as are the contents of
           the LT, GT, and EQ bits of the condition register field CR0 if
           the instruction has condition register updating enabled. In these
           cases, if instruction overflow is enabled, then XER[OV] is set.
           The 32-bit unsigned remainder of dividing (rA) by (rB) can be
           computed as follows:
               divwu rD,rA,rB    # rD = quotient
               mullw rD,rD,rB    # rD = quotient * divisor
               subf  rD,rD,rA    # rD = remainder
See E.2 Simplified Mnemonics for Subtract Instructions for information on sim-
plified mnemonics.
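The signed remainder sequence from the divide word entry (divide, multiply low, subtract from) can be modeled as a quick check of the sign rules. This is a sketch on Python integers assumed to lie in the signed 32-bit range; the function names are illustrative, and the undefined cases are rejected rather than modeled.

```python
def divide_word(ra: int, rb: int) -> int:
    """Signed quotient truncated toward zero, as the stated rules for r imply."""
    if rb == 0 or (ra == -2**31 and rb == -1):
        raise ValueError("result is undefined for this operand pair")
    quotient = abs(ra) // abs(rb)
    return -quotient if (ra < 0) != (rb < 0) else quotient

def remainder_word(ra: int, rb: int) -> int:
    """Remainder via quotient, product, and subtraction."""
    quotient = divide_word(ra, rb)     # divide step
    product = quotient * rb            # multiply low step
    return ra - product                # subtract from step
```

The remainder takes the sign of the dividend, which differs from Python's own % operator when the operands have opposite signs.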
Name:      Compare Immediate
Mnemonic:  cmpi
Operands:  crfD,L,rA,SIMM
Operation: The contents of register rA are compared with the sign-extended
           value of the SIMM operand, treating the operands as signed
           integers. The result of the comparison is placed into the CR
           field specified by operand crfD.

Name:      Compare
Mnemonic:  cmp
Operands:  crfD,L,rA,rB
Operation: The contents of register rA are compared with the contents of
           register rB, treating the operands as signed integers. The result
           of the comparison is placed into the CR field specified by
           operand crfD.

Name:      Compare Logical Immediate
Mnemonic:  cmpli
Operands:  crfD,L,rA,UIMM
Operation: The contents of register rA are compared with 0x0000 || UIMM,
           treating the operands as unsigned integers. The result of the
           comparison is placed into the CR field specified by operand crfD.

Name:      Compare Logical
Mnemonic:  cmpl
Operands:  crfD,L,rA,rB
Operation: The contents of register rA are compared with the contents of
           register rB, treating the operands as unsigned integers. The
           result of the comparison is placed into the CR field specified by
           operand crfD.
While the PowerPC architecture specifies that the value in the L field specifies
whether the operands are treated as 32- or 64-bit values, the RCPU ignores the
value in the L field and treats the operands as 32-bit values.
The crfD field can be omitted if the result of the comparison is to be placed in CR0.
Otherwise the target CR field must be specified in the instruction crfD field, using
one of the CR field symbols (CR0 to CR7) or an explicit field number. Refer to Ta-
ble E-2 for the list of CR field symbols and to E.3 Simplified Mnemonics for Com-
pare Instructions for simplified mnemonics.
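The difference between the signed and logical compare forms can be shown on 32-bit patterns. This is a minimal sketch; the function name compare_word is hypothetical, and only the LT, GT, and EQ bits of the target CR field are modeled.

```python
def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a signed integer."""
    return value - (1 << 32) if value & 0x8000_0000 else value

def compare_word(ra: int, rb: int, signed: bool) -> dict:
    """Return the LT, GT, and EQ bits for a 32-bit compare."""
    if signed:                         # signed forms
        ra, rb = to_signed(ra), to_signed(rb)
    return {"LT": int(ra < rb), "GT": int(ra > rb), "EQ": int(ra == rb)}
```

The pattern 0xFFFF FFFF is less than 1 in a signed compare and greater than 1 in a logical compare.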
Name:      AND Immediate
Mnemonic:  andi.
Operands:  rA,rS,UIMM
Operation: The contents of rS are ANDed with 0x0000 || UIMM and the result
           is placed into rA.

Name:      AND Immediate Shifted
Mnemonic:  andis.
Operands:  rA,rS,UIMM
Operation: The contents of rS are ANDed with UIMM || 0x0000 and the result
           is placed into rA.

Name:      OR Immediate
Mnemonic:  ori
Operands:  rA,rS,UIMM
Operation: The contents of rS are ORed with 0x0000 || UIMM and the result is
           placed into rA.
           The preferred no-op is ori 0,0,0

Name:      OR Immediate Shifted
Mnemonic:  oris
Operands:  rA,rS,UIMM
Operation: The contents of rS are ORed with UIMM || 0x0000 and the result is
           placed into rA.

Name:      XOR Immediate
Mnemonic:  xori
Operands:  rA,rS,UIMM
Operation: The contents of rS are XORed with 0x0000 || UIMM and the result
           is placed into rA.

Name:      XOR Immediate Shifted
Mnemonic:  xoris
Operands:  rA,rS,UIMM
Operation: The contents of rS are XORed with UIMM || 0x0000 and the result
           is placed into rA.
Name:      AND
Mnemonics: and, and.
Operands:  rA,rS,rB
Operation: The contents of rS are ANDed with the contents of register rB and
           the result is placed into rA.
           and   AND
           and.  AND with CR Update. The dot suffix enables the update of
                 the condition register.

Name:      OR
Mnemonics: or, or.
Operands:  rA,rS,rB
Operation: The contents of rS are ORed with the contents of rB and the
           result is placed into rA.
           or   OR
           or.  OR with CR Update. The dot suffix enables the update of the
                condition register.

Name:      XOR
Mnemonics: xor, xor.
Operands:  rA,rS,rB
Operation: The contents of rS are XORed with the contents of rB and the
           result is placed into register rA.
           xor   XOR
           xor.  XOR with CR Update. The dot suffix enables the update of
                 the condition register.

Name:      NAND
Mnemonics: nand, nand.
Operands:  rA,rS,rB
Operation: The contents of rS are ANDed with the contents of rB and the
           one's complement of the result is placed into register rA.
           nand   NAND
           nand.  NAND with CR Update. The dot suffix enables the update of
                  the condition register.
           NAND with rS = rB can be used to obtain the one's complement.

Name:      NOR
Mnemonics: nor, nor.
Operands:  rA,rS,rB
Operation: The contents of rS are ORed with the contents of rB and the one's
           complement of the result is placed into register rA.
           nor   NOR
           nor.  NOR with CR Update. The dot suffix enables the update of
                 the condition register.
           NOR with rS = rB can be used to obtain the one's complement.

Name:      Equivalent
Mnemonics: eqv, eqv.
Operands:  rA,rS,rB
Operation: The contents of rS are XORed with the contents of rB and the
           complemented result is placed into register rA.
           eqv   Equivalent
           eqv.  Equivalent with CR Update. The dot suffix enables the
                 update of the condition register.

Name:      AND with Complement
Mnemonics: andc, andc.
Operands:  rA,rS,rB
Operation: The contents of rS are ANDed with the complement of the contents
           of rB and the result is placed into rA.
           andc   AND with Complement
           andc.  AND with Complement with CR Update. The dot suffix enables
                  the update of the condition register.

Name:      OR with Complement
Mnemonics: orc, orc.
Operands:  rA,rS,rB
Operation: The contents of rS are ORed with the complement of the contents
           of rB and the result is placed into rA.
           orc   OR with Complement
           orc.  OR with Complement with CR Update. The dot suffix enables
                 the update of the condition register.
Extend Sign extsb rA,rS The contents of rS[24:31] are placed into rA[24:31]. Bit 24 of rS is
Byte extsb. placed into rA[0:23].
extsb Extend Sign Byte
extsb. Extend Sign Byte with CR Update. The dot suffix enables
the update of the condition register.
Extend Sign extsh rA,rS The contents of rS[16:31] are placed into rA[16:31]. Bit 16 of rS is
Half Word extsh. placed into rA[0:15].
extsh Extend Sign Half Word
extsh. Extend Sign Half Word with CR Update. The dot suffix
enables the update of the condition register.
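The two sign-extension entries above can be illustrated with a short Python sketch. It is not taken from the manual; it assumes registers are modeled as unsigned 32-bit integers with bit 0 as the most significant bit, and the function names simply reuse the mnemonics.

```python
def extsb(rs: int) -> int:
    """Model of extsb: copy rS[24:31] and replicate bit 24 into rA[0:23]."""
    low = rs & 0xFF                      # rS[24:31], the low-order byte
    return (low | 0xFFFFFF00) if low & 0x80 else low


def extsh(rs: int) -> int:
    """Model of extsh: copy rS[16:31] and replicate bit 16 into rA[0:15]."""
    low = rs & 0xFFFF                    # rS[16:31], the low-order half word
    return (low | 0xFFFF0000) if low & 0x8000 else low
```

Only the low-order byte or half word of rS influences the result; the upper bits of rS are ignored.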
Count cntlzw rA,rS A count of the number of consecutive zero bits starting at bit 0 of rS is
Leading cntlzw. placed into rA. This number ranges from 0 to 32, inclusive.
Zeros Word
cntlzw Count Leading Zeros Word
cntlzw. Count Leading Zeros Word with CR Update. The dot
suffix enables the update of the condition register.
When the Count Leading Zeros Word instruction has condition register
updating enabled, the LT field is cleared to zero in CR0.
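As an illustration of the count described above, the following Python sketch models cntlzw on an unsigned 32-bit value. It is an assumption-labeled model rather than the manual's definition; the function name reuses the mnemonic.

```python
def cntlzw(rs: int) -> int:
    """Count zero bits from bit 0 (the most significant bit) down to the first 1-bit."""
    rs &= 0xFFFFFFFF                     # keep only the 32-bit register contents
    return 32 - rs.bit_length()          # bit_length() is 0 for rs == 0, giving 32
```

Because the count is never negative, this is consistent with the LT bit of CR0 being cleared when condition register updating is enabled.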
Extract Select a field of n bits starting at bit position b in the source register, right or left justify this field in the
target register, and clear all other bits of the target register to zero.
Insert Select a left- or right-justified field of n bits in the source register, insert this field starting at bit position
b of the target register, and leave other bits of the target register unchanged. (No simplified mnemonic
is provided for insertion of a left-justified field when operating on double-words; such an insertion
requires more than one instruction.)
Rotate Rotate the contents of a register right or left n bits without masking.
Shift Shift the contents of a register right or left n bits, clearing vacated bits to zero (logical shift).
Clear left Clear the leftmost b bits of a register, then shift the register left by n bits. This operation can be used to
and shift scale a known non-negative array index by the width of an element.
left
The IU performs rotation operations on data from a GPR and returns the result, or
a portion of the result, to a GPR. Rotation operations rotate a 32-bit quantity left by
a specified number of bit positions. Bits that exit from position 0 enter at position
31. A rotate right operation can be accomplished by specifying a rotation of 32-n
bits, where n is the right rotation amount.
Rotate and shift instructions use a mask generator. The mask is 32 bits long and
consists of 1-bits from a start bit, MB, through and including a stop bit, ME, and
0-bits elsewhere. The values of MB and ME range from zero to 31. If MB > ME, the
1-bits wrap around from position 31 to position 0. Thus the mask is formed as
follows:
if MB ≤ ME then
    mask[MB:ME] = ones
    mask[all other bits] = zeros
else
    mask[MB:31] = ones
    mask[0:ME] = ones
    mask[all other bits] = zeros
It is not possible to specify an all-zero mask. The use of the mask is described in
the following sections.
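The mask generator can be sketched in Python as follows. This is an illustrative model, not the hardware implementation; it assumes bit 0 is the most significant bit of the 32-bit mask, and the name mask32 is hypothetical.

```python
def mask32(mb: int, me: int) -> int:
    """Return the 32-bit mask with 1-bits from MB through ME (bit 0 = MSB)."""
    if not (0 <= mb <= 31 and 0 <= me <= 31):
        raise ValueError("MB and ME must be in the range 0..31")
    from_mb = 0xFFFFFFFF >> mb                         # ones in positions MB..31
    through_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF  # ones in positions 0..ME
    if mb <= me:
        return from_mb & through_me                    # contiguous run MB..ME
    return from_mb | through_me                        # wrapped: MB..31 and 0..ME
```

For example, mask32(24, 31) selects the low-order byte, and mask32(28, 3) yields the wrapped mask 0xF000000F. No combination of MB and ME produces an all-zero result.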
If condition register updating is enabled, rotate and shift instructions set condition
register field CR0 according to the contents of rA at the completion of the
instruction. Rotate and shift instructions do not change the values of XER[OV] or XER[SO]
bits. Rotate and shift instructions, except algebraic right shifts, do not change the
XER[CA] bit.
Rotate Left rlwinm rA,rS,SH,MB,ME The contents of register rS are rotated left by the number of bits
Word rlwinm. specified by operand SH. A mask is generated having 1-bits from
Immediate the bit specified by operand MB through the bit specified by
then AND operand ME and 0-bits elsewhere. The rotated data is ANDed with
with Mask the generated mask and the result is placed into register rA.
rlwinm Rotate Left Word Immediate then AND with Mask
rlwinm. Rotate Left Word Immediate then AND with Mask with
CR Update. The dot suffix enables the update of the
condition register.
Simplified mnemonics:
extlwi rA,rS,n,b   rlwinm rA,rS,b,0,n-1
srwi rA,rS,n       rlwinm rA,rS,32-n,n,31
clrrwi rA,rS,n     rlwinm rA,rS,0,0,31-n
Note: The rlwinm instruction can be used for extracting, clearing
and shifting bit fields using the methods shown below:
To extract an n-bit field that starts at bit position b in register rS,
right-justified into rA (clearing the remaining 32 - n bits of rA), set
SH = b + n, MB = 32 - n, and ME = 31.
To extract an n-bit field that starts at bit position b in rS, left-justified
into rA, set SH = b, MB = 0, and ME = n - 1.
To rotate the contents of a register left (right) by n bits, set SH = n
(32 - n), MB = 0, and ME = 31.
To shift the contents of a register right by n bits, set SH = 32 - n, MB
= n, and ME = 31.
To clear the high-order b bits of a register and then shift the result
left by n bits, set SH = n, MB = b - n and ME = 31 - n.
To clear the low-order n bits of a register, set SH = 0, MB = 0, and
ME = 31 - n.
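The rlwinm methods listed in the note can be checked with a small Python model. This is a sketch under stated assumptions: registers are unsigned 32-bit integers with bit 0 as the most significant bit, the helpers rotl32 and mask32 are hypothetical names, and extract_right_justified is an illustrative name for the first method in the note.

```python
def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits (n is taken modulo 32)."""
    x &= 0xFFFFFFFF
    n &= 31
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def mask32(mb: int, me: int) -> int:
    """Mask with 1-bits from MB through ME, wrapping if MB > ME (bit 0 = MSB)."""
    from_mb = 0xFFFFFFFF >> mb
    through_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF
    return (from_mb & through_me) if mb <= me else (from_mb | through_me)


def rlwinm(rs: int, sh: int, mb: int, me: int) -> int:
    """Rotate rS left by SH, then AND with the generated mask."""
    return rotl32(rs, sh) & mask32(mb, me)


def extract_right_justified(rs: int, n: int, b: int) -> int:
    return rlwinm(rs, b + n, 32 - n, 31)   # SH = b + n, MB = 32 - n, ME = 31


def extlwi(rs: int, n: int, b: int) -> int:
    return rlwinm(rs, b, 0, n - 1)         # SH = b, MB = 0, ME = n - 1


def srwi(rs: int, n: int) -> int:
    return rlwinm(rs, 32 - n, n, 31)       # SH = 32 - n, MB = n, ME = 31


def clrrwi(rs: int, n: int) -> int:
    return rlwinm(rs, 0, 0, 31 - n)        # SH = 0, MB = 0, ME = 31 - n
```

With rS = 0x12345678, the 8-bit field starting at bit 8 is 0x34; the right-justified extraction returns 0x00000034 and the left-justified extraction returns 0x34000000.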
Rotate Left rlwnm rA,rS,rB,MB,ME The contents of rS are rotated left by the number of bits specified
Word then rlwnm. by rB[27:31]. A mask is generated having 1-bits from the bit
AND with specified by operand MB through the bit specified by operand ME
Mask and 0-bits elsewhere. The rotated data is ANDed with the
generated mask and the result is placed into rA.
rlwnm Rotate Left Word then AND with Mask
rlwnm. Rotate Left Word then AND with Mask with CR
Update. The dot suffix enables the update of the
condition register.
Simplified mnemonics:
rotlw rA,rS,rB     rlwnm rA,rS,rB,0,31
Note: The rlwnm instruction can be used to extract and rotate bit
fields using the methods shown below:
To extract an n-bit field that starts at the variable bit position b in the
register specified by operand rS, right-justified into rA (clearing the
remaining 32-n bits of rA), set rB[27:31] = b + n, MB = 32 - n, and
ME = 31.
To extract an n-bit field that starts at variable bit position b in the
register specified by operand rS, left-justified into rA (clearing the
remaining 32 - n bits of rA), set rB[27:31] = b, MB = 0, and ME = n
- 1.
To rotate the contents of the low-order 32 bits of a register left
(right) by variable n bits, set rB[27:31] = n (32 - n), MB = 0, and ME
= 31.
Rotate Left rlwimi rA,rS,SH,MB,ME The contents of rS are rotated left by the number of bits specified
Word rlwimi. by operand SH. A mask is generated having 1-bits from the bit
Immediate specified by MB through the bit specified by ME and 0-bits
then Mask elsewhere. The rotated data is inserted into rA under control of the
Insert generated mask.
rlwimi Rotate Left Word Immediate then Mask Insert
rlwimi. Rotate Left Word Immediate then Mask Insert with CR
Update. The dot suffix enables the update of the
condition register.
Simplified mnemonic:
inslwi rA,rS,n,b   rlwimi rA,rS,32-b,b,b+n-1
Note: The opcode rlwimi can be used to insert a bit field into the
contents of register specified by operand rA using the methods
shown below:
To insert an n-bit field that is left-justified in rS into rA starting at bit
position b, set SH = 32 - b, MB = b, and ME = (b + n) - 1.
To insert an n-bit field that is right-justified in rS into rA starting at
bit position b, set SH = 32 - (b + n), MB = b, and ME = (b + n) - 1.
Simplified mnemonics are provided for both of these methods.
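The two insertion methods can be modeled in Python as shown below. This is a sketch, not the manual's definition; it assumes unsigned 32-bit register values with bit 0 as the most significant bit. The helper names rotl32 and mask32 and the function name insert_right_justified are hypothetical.

```python
def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits (n is taken modulo 32)."""
    x &= 0xFFFFFFFF
    n &= 31
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def mask32(mb: int, me: int) -> int:
    """Mask with 1-bits from MB through ME, wrapping if MB > ME (bit 0 = MSB)."""
    from_mb = 0xFFFFFFFF >> mb
    through_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF
    return (from_mb & through_me) if mb <= me else (from_mb | through_me)


def rlwimi(ra: int, rs: int, sh: int, mb: int, me: int) -> int:
    """Rotate rS left by SH and insert it into rA where the mask has 1-bits."""
    m = mask32(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & 0xFFFFFFFF)


def inslwi(ra: int, rs: int, n: int, b: int) -> int:
    # Field is left-justified in rS: SH = 32 - b, MB = b, ME = (b + n) - 1.
    return rlwimi(ra, rs, 32 - b, b, b + n - 1)


def insert_right_justified(ra: int, rs: int, n: int, b: int) -> int:
    # Field is right-justified in rS: SH = 32 - (b + n), MB = b, ME = (b + n) - 1.
    return rlwimi(ra, rs, 32 - (b + n), b, b + n - 1)
```

Both helpers place the 8-bit value 0xAB into bits 8 through 15 of rA when called with n = 8 and b = 8; they differ only in where the field is expected to sit in rS.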
Any shift right algebraic instruction, followed by addze, can be used to divide
quickly by 2^n.
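The divide idiom can be sketched in Python. The model follows the srawi description in the shift instruction table (XER[CA] is set when rS is negative and any 1-bits are shifted out) and assumes addze simply adds XER[CA] to its source register. The function name div_pow2 is hypothetical.

```python
def srawi(rs: int, sh: int) -> tuple[int, int]:
    """Model of srawi: return (rA, XER[CA]) for a 32-bit rS and 0 <= SH <= 31."""
    signed = rs - (1 << 32) if rs & 0x80000000 else rs
    result = (signed >> sh) & 0xFFFFFFFF          # arithmetic shift keeps the sign
    shifted_out = rs & ((1 << sh) - 1)            # bits that left position 31
    ca = 1 if signed < 0 and shifted_out != 0 else 0
    return result, ca


def addze(ra: int, ca: int) -> int:
    """Model of addze (assumption: rD = rA + XER[CA], truncated to 32 bits)."""
    return (ra + ca) & 0xFFFFFFFF


def div_pow2(value: int, n: int) -> int:
    """Divide a signed 32-bit value by 2**n, rounding toward zero."""
    ra, ca = srawi(value & 0xFFFFFFFF, n)
    r = addze(ra, ca)
    return r - (1 << 32) if r & 0x80000000 else r
```

The shift alone rounds toward negative infinity; adding the carry corrects negative, inexact quotients so that the combined result rounds toward zero (for example, -7 divided by 2 gives -3).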
Name   Mnemonic   Operand Syntax   Operation
Shift Left slw rA,rS,rB The contents of rS are shifted left the number of bits specified by
Word slw. rB[26:31]. Bits shifted out of position 0 are lost. Zeros are supplied to
the vacated positions on the right. The 32-bit result is placed into rA.
If rB[26] = 1, then rA is filled with zeros.
slw Shift Left Word
slw. Shift Left Word with CR Update. The dot suffix enables
the update of the condition register.
Shift Right srw rA,rS,rB The contents of rS are shifted right the number of bits specified by
Word srw. rB[26:31]. Zeros are supplied to the vacated positions on the left. The
32-bit result is placed into rA.
If rB[26]=1, then rA is filled with zeros.
srw Shift Right Word
srw. Shift Right Word with CR Update. The dot suffix enables
the update of the condition register.
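The behavior of slw and srw for shift amounts of 32 or more can be illustrated with the Python sketch below. It is an illustrative model under the assumption that only rB[26:31] (the low-order six bits) is examined and that registers are unsigned 32-bit values.

```python
def slw(rs: int, rb: int) -> int:
    """Model of slw: shift left by rB[26:31]; the result is zero if rB[26] = 1."""
    n = rb & 0x3F                 # rB[26:31]
    if n & 0x20:                  # rB[26] = 1 means a shift of 32 or more
        return 0
    return (rs << n) & 0xFFFFFFFF


def srw(rs: int, rb: int) -> int:
    """Model of srw: shift right by rB[26:31]; the result is zero if rB[26] = 1."""
    n = rb & 0x3F
    if n & 0x20:
        return 0
    return (rs & 0xFFFFFFFF) >> n
```

Bits of rB above the six low-order bits do not affect the result, so a value of 64 in rB behaves like a shift amount of zero in this model.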
Shift Right srawi rA,rS,SH The contents of rS are shifted right the number of bits specified by
Algebraic srawi. operand SH. Bits shifted out of position 31 are lost. The 32-bit result is
Word sign extended and placed into rA. XER[CA] is set if rS contains a
Immediate negative number and any 1-bits are shifted out of position 31;
otherwise XER[CA] is cleared. An operand SH of zero causes rA to be
loaded with the contents of rS and XER[CA] to be cleared to zero.
srawi Shift Right Algebraic Word Immediate
srawi. Shift Right Algebraic Word Immediate with CR Update.
The dot suffix enables the update of the condition
register.
Shift Right sraw rA,rS,rB The contents of rS are shifted right the number of bits specified by
Algebraic sraw. rB[26:31]. The 32-bit result is placed into rA. XER[CA] is set to one if
Word rS contains a negative number and any 1-bits are shifted out of
position 31; otherwise XER[CA] is cleared to zero. An operand (rB) of
zero causes rA to be loaded with the contents of rS, and XER[CA] to
be cleared to zero. If rB[26] = 1, then rA is filled with 32 sign bits (bit
0) from rS. If rB[26] = 0, then rA is filled from the left with sign bits.
Condition register field CR0 is set based on the value written into rA.
sraw Shift Right Algebraic Word
sraw. Shift Right Algebraic Word with CR Update. The dot suffix
enables the update of the condition register.
Floating- fadd frD,frA,frB The floating-point operand in register frA is added to the floating-point
Point Add fadd. operand in register frB. If the most significant bit of the resultant
significand is not a one the result is normalized. The result is rounded
to the target precision under control of the floating-point rounding
control field RN of the FPSCR and placed into register frD.
Floating-point addition is based on exponent comparison and addition
of the two significands. The exponents of the two operands are
compared, and the significand accompanying the smaller exponent is
shifted right, with its exponent increased by one for each bit shifted,
until the two exponents are equal. The two significands are then
added algebraically to form an intermediate sum. All 53 bits in the
significand as well as all three guard bits (G, R, and X) enter into the
computation.
If a carry occurs, the sum's significand is shifted right one bit position
and the exponent is increased by one.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fadd Floating-Point Add
fadd. Floating-Point Add with CR Update. The dot suffix
enables the update of the condition register.
Floating- fadds frD,frA,frB The floating-point operand in register frA is added to the floating-point
Point Add fadds. operand in register frB. If the most significant bit of the resultant
Single- significand is not a one, the result is normalized. The result is rounded
Precision to the target precision under control of the floating-point rounding
control field RN of the FPSCR and placed into register frD.
Floating-point addition is based on exponent comparison and addition
of the two significands. The exponents of the two operands are
compared, and the significand accompanying the smaller exponent is
shifted right, with its exponent increased by one for each bit shifted,
until the two exponents are equal. The two significands are then
added algebraically to form an intermediate sum. All 53 bits in the
significand as well as all three guard bits (G, R, and X) enter into the
computation.
If a carry occurs, the sum's significand is shifted right one bit position
and the exponent is increased by one.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fadds Floating-Point Add Single-Precision
fadds. Floating-Point Add Single-Precision with CR Update. The dot
suffix enables the update of the condition register.
Floating- fsub frD,frA,frB The floating-point operand in register frB is subtracted from the
Point fsub. floating-point operand in register frA. If the most significant bit of the
Subtract resultant significand is not a one the result is normalized. The result is
rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into register frD.
The execution of the Floating-Point Subtract instruction is identical to
that of Floating-Point Add, except that the contents of register frB
participates in the operation with its sign bit (bit 0) inverted.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fsub Floating-Point Subtract
fsub. Floating-Point Subtract with CR Update. The dot suffix
enables the update of the condition register.
Floating- fsubs frD,frA,frB The floating-point operand in register frB is subtracted from the
Point fsubs. floating-point operand in register frA. If the most significant bit of the
Subtract resultant significand is not a one the result is normalized. The result is
Single- rounded to the target precision under control of the floating-point
Precision rounding control field RN of the FPSCR and placed into register frD.
The execution of the Floating-Point Subtract instruction is identical to
that of Floating-Point Add, except that the contents of register frB
participates in the operation with its sign bit (bit 0) inverted.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fsubs Floating-Point Subtract Single-Precision
fsubs. Floating-Point Subtract Single-Precision with CR Update.
The dot suffix enables the update of the condition
register.
Floating- fmul frD,frA,frC The floating-point operand in register frA is multiplied by the floating-
Point fmul. point operand in register frC.
Multiply
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision under
control of the floating-point rounding control field RN of the FPSCR
and placed into register frD.
Floating-point multiplication is based on exponent addition and
multiplication of the significands.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmul Floating-Point Multiply
fmul. Floating-Point Multiply with CR Update. The dot suffix
enables the update of the condition register.
Floating- fmuls frD,frA,frC The floating-point operand in register frA is multiplied by the floating-
Point fmuls. point operand in register frC.
Multiply
Single-
Precision
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
Floating-point multiplication is based on exponent addition and
multiplication of the significands.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmuls Floating-Point Multiply Single-Precision
fmuls. Floating-Point Multiply Single-Precision with CR Update.
The dot suffix enables the update of the condition
register.
Floating- fdiv frD,frA,frB The floating-point operand in register frA is divided by the floating-
Point Divide fdiv. point operand in register frB. No remainder is preserved.
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision under
control of the floating-point rounding control field RN of the FPSCR
and placed into register frD.
Floating-point division is based on exponent subtraction and division
of the significands.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1 and zero divide
exceptions when FPSCR[ZE]=1.
fdiv Floating-Point Divide
fdiv. Floating-Point Divide with CR Update. The dot suffix
enables the update of the condition register.
Floating- fdivs frD,frA,frB The floating-point operand in register frA is divided by the floating-
Point Divide fdivs. point operand in register frB. No remainder is preserved.
Single-
Precision
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision under
control of the floating-point rounding control field RN of the FPSCR
and placed into register frD.
Floating-point division is based on exponent subtraction and division
of the significands.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1 and zero divide
exceptions when FPSCR[ZE] = 1.
fdivs Floating-Point Divide Single-Precision
fdivs. Floating-Point Divide Single-Precision with CR Update.
The dot suffix enables the update of the condition
register.
Floating- fmadd frD,frA,frC,frB The floating-point operand in register frA is multiplied by the floating-
Point fmadd. point operand in register frC. The floating-point operand in register frB
Multiply- is added to this intermediate result.
Add
If the most significant bit of the resultant significand is not a one the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmadd Floating-Point Multiply-Add
fmadd. Floating-Point Multiply-Add with CR Update. The dot
suffix enables the update of the condition register.
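The effect of rounding only once, after the add, can be illustrated in Python. This sketch assumes that the intermediate product is kept exact before the addition, as the single rounding step in the description suggests; it covers finite operands only and uses round-to-nearest rather than the FPSCR[RN] setting. The function name fmadd_model is hypothetical.

```python
from fractions import Fraction


def fmadd_model(fra: float, frc: float, frb: float) -> float:
    """Compute (frA * frC) + frB exactly, then round once to double precision."""
    exact = Fraction(fra) * Fraction(frc) + Fraction(frb)
    return float(exact)           # one correctly rounded conversion (nearest-even)


a = 1.0 + 2.0**-30
c = 1.0 - 2.0**-30
fused = fmadd_model(a, c, -1.0)   # exact product is 1 - 2**-60, so the sum is -2**-60
separate = a * c + -1.0           # product rounds to 1.0 first, so the sum is 0.0
```

The two results differ because a separately rounded multiply discards the low-order bits of the product before the add takes place.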
Floating- fmadds frD,frA,frC,frB The floating-point operand in register frA is multiplied by the floating-
Point fmadds. point operand in register frC. The floating-point operand in register frB
Multiply- is added to this intermediate result.
Add
Single-
Precision
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmadds Floating-Point Multiply-Add Single-Precision
fmadds. Floating-Point Multiply-Add Single-Precision with CR
Update. The dot suffix enables the update of the
condition register.
Floating- fmsub frD,frA,frC,frB The floating-point operand in register frA is multiplied by the floating-
Point fmsub. point operand in register frC. The floating-point operand in register frB
Multiply- is subtracted from this intermediate result.
Subtract
If the most significant bit of the resultant significand is not a one the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmsub Floating-Point Multiply-Subtract
fmsub. Floating-Point Multiply-Subtract with CR Update. The
dot suffix enables the update of the condition register.
Floating- fmsubs frD,frA,frC,frB The floating-point operand in register frA is multiplied by the floating-
Point fmsubs. point operand in register frC. The floating-point operand in register frB
Multiply- is subtracted from this intermediate result.
Subtract
Single-
Precision
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fmsubs Floating-Point Multiply-Subtract Single-Precision
fmsubs. Floating-Point Multiply-Subtract Single-Precision with CR
Update. The dot suffix enables the update of the
condition register.
Floating- fnmadd frD,frA,frC,frB The floating-point operand in register frA is multiplied by the floating-
Point fnmadd. point operand in register frC. The floating-point operand in register frB
Negative is added to this intermediate result.
Multiply-
Add
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR, then negated and placed into register frD.
This instruction produces the same result as would be obtained by
using the floating-point multiply-add instruction and then negating the
result, with the following exceptions:
• QNaNs propagate with no effect on their sign bit.
• QNaNs that are generated as the result of a disabled invalid
operation exception have a sign bit of zero.
• SNaNs that are converted to QNaNs as the result of a disabled
invalid operation exception retain the sign bit of the SNaN.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fnmadd Floating-Point Negative Multiply-Add
fnmadd. Floating-Point Negative Multiply-Add with CR Update.
The dot suffix enables the update of the condition
register.
Floating- fnmadds frD,frA,frC,frB The floating-point operand in register frA is multiplied by the floating-
Point fnmadds. point operand in register frC. The floating-point operand in register frB
Negative is added to this intermediate result.
Multiply-
Add
Single-
Precision
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR, then negated and placed into register frD.
This instruction produces the same result as would be obtained by
using the floating-point multiply-add instruction and then negating the
result, with the following exceptions:
• QNaNs propagate with no effect on their sign bit.
• QNaNs that are generated as the result of a disabled invalid
operation exception have a sign bit of zero.
• SNaNs that are converted to QNaNs as the result of a disabled
invalid operation exception retain the sign bit of the SNaN.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fnmadds Floating-Point Negative Multiply-Add Single-Precision
fnmadds. Floating-Point Negative Multiply-Add Single-Precision
with CR Update. The dot suffix enables the update of the
condition register.
Floating- fnmsub frD,frA,frC,frB The floating-point operand in register frA is multiplied by the floating-
Point fnmsub. point operand in register frC. The floating-point operand in register frB
Negative is subtracted from this intermediate result.
Multiply-
Subtract
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR, then negated and placed into register frD.
This instruction produces the same result as would be obtained by
using the floating-point multiply-subtract instruction and then negating
the result, with the following exceptions:
• QNaNs propagate with no effect on their sign bit.
• QNaNs that are generated as the result of a disabled invalid
operation exception have a sign bit of zero.
• SNaNs that are converted to QNaNs as the result of a disabled
invalid operation exception retain the sign bit of the SNaN.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fnmsub Floating-Point Negative Multiply-Subtract
fnmsub. Floating-Point Negative Multiply-Subtract with CR
Update. The dot suffix enables the update of the
condition register.
Floating- fnmsubs frD,frA,frC,frB The floating-point operand in register frA is multiplied by the floating-
Point fnmsubs. point operand in register frC. The floating-point operand in register frB
Negative is subtracted from this intermediate result.
Multiply-
Subtract
Single-
Precision
If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR, then negated and placed into register frD.
This instruction produces the same result as would be obtained by
using the floating-point multiply-subtract instruction and then negating
the result, with the following exceptions:
• QNaNs propagate with no effect on their sign bit.
• QNaNs that are generated as the result of a disabled invalid
operation exception have a sign bit of zero.
• SNaNs that are converted to QNaNs as the result of a disabled
invalid operation exception retain the sign bit of the SNaN.
FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE] = 1.
fnmsubs Floating-Point Negative Multiply-Subtract Single-
Precision
fnmsubs. Floating-Point Negative Multiply-Subtract Single-
Precision with CR Update. The dot suffix enables the
update of the condition register.
Floating- fctiw frD,frB The floating-point operand in register frB is converted to a 32-bit
Point fctiw. signed integer, using the rounding mode specified by FPSCR[RN],
Convert to and placed in frD[32:63]. frD[0:31] are undefined.
Integer
Word
If the operand in register frB is greater than 2^31 - 1, frD[32:63] are set
to 0x7FFF FFFF.
If the operand in register frB is less than -2^31, frD[32:63] are set to
0x8000 0000.
The conversion is described fully in APPENDIX C FLOATING-POINT
MODELS AND CONVERSIONS.
Except for trap-enabled invalid operation exceptions, FPSCR[FPRF]
is undefined. FPSCR[FR] is set if the result is incremented when
rounded. FPSCR[FI] is set if the result is inexact.
fctiw Floating-Point Convert to Integer Word
fctiw. Floating-Point Convert to Integer Word with CR Update.
The dot suffix enables the update of the condition
register.
Floating- fctiwz frD,frB The floating-point operand in register frB is converted to a 32-bit
Point fctiwz. signed integer, using the rounding mode Round toward Zero, and
Convert to placed in frD[32:63]. frD[0:31] are undefined.
Integer
Word with
Round
If the operand in frB is greater than 2^31 - 1, frD[32:63] are set to
0x7FFF FFFF.
If the operand in register frB is less than -2^31, frD[32:63] are set to
0x8000 0000.
The conversion is described fully in APPENDIX C FLOATING-POINT
MODELS AND CONVERSIONS.
Except for trap-enabled invalid operation exceptions, FPSCR[FPRF]
is undefined. FPSCR[FR] is set if the result is incremented when
rounded. FPSCR[FI] is set if the result is inexact.
fctiwz Floating-Point Convert to Integer Word with Round
Toward Zero
fctiwz. Floating-Point Convert to Integer Word with Round
Toward Zero with CR Update. The dot suffix enables the
update of the condition register.
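The saturating behavior of fctiwz can be sketched in Python as follows. This is an illustrative model of the value written to frD[32:63] only; FPSCR updates are not modeled, and NaN operands are rejected because their handling is not covered in the description above. The function name reuses the mnemonic.

```python
import math


def fctiwz(frb: float) -> int:
    """Return the 32-bit pattern placed in frD[32:63], rounding toward zero."""
    if math.isnan(frb):
        raise ValueError("NaN operands are outside the scope of this sketch")
    if frb > 2**31 - 1:
        return 0x7FFFFFFF                     # saturate at the largest positive word
    if frb < -(2**31):
        return 0x80000000                     # saturate at the most negative word
    return math.trunc(frb) & 0xFFFFFFFF       # two's complement bit pattern
```

A model of fctiw would differ only in the final rounding step, which would follow FPSCR[RN] instead of always truncating.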
The CR field and the FPCC are interpreted as shown in Table 4-10.
2 FE (frA) = (frB)
Floating- fcmpu crfD,frA,frB The floating-point operand in register frA is compared to the floating-
Point point operand in register frB. The result of the compare is placed into
Compare CR field crfD and the FPCC.
Unordered
If an operand is a NaN, either quiet or signaling, CR field crfD and the
FPCC are set to reflect unordered. If an operand is a Signaling NaN,
VXSNAN is set.
Floating- fcmpo crfD,frA,frB The floating-point operand in register frA is compared to the floating-
Point point operand in register frB. The result of the compare is placed into
Compare CR field crfD and the FPCC.
Ordered
If an operand is a NaN, either quiet or signaling, CR field crfD and the
FPCC are set to reflect unordered. If an operand is a Signaling NaN,
VXSNAN is set, and if invalid operation is disabled (VE = 0) then VXVC
is set. Otherwise, if an operand is a Quiet NaN, VXVC is set.
The floating-point status and control register instructions are summarized in Table
4-12.
Move from mffs frD The contents of the FPSCR are placed into frD[32:63].
FPSCR mffs.
mffs Move from FPSCR
mffs. Move from FPSCR with CR Update. The dot suffix
enables the update of the condition register.
Move to mcrfs crfD,crfS The contents of FPSCR field specified by operand crfS are copied to
Condition the CR field specified by operand crfD. All exception bits copied are
Register cleared to zero in the FPSCR.
from FPSCR
Move to mtfsfi crfD,IMM The value of the IMM field is placed into FPSCR field crfD. All other
FPSCR Field mtfsfi. FPSCR fields are unchanged.
Immediate
mtfsfi Move to FPSCR Field Immediate
mtfsfi. Move to FPSCR Field Immediate with CR Update. The
dot suffix enables the update of the condition register.
When FPSCR[0:3] is specified, bits 0 (FX) and 3 (OX) are set to the
values of IMM[0] and IMM[3] (i.e., even if this instruction causes OX to
change from zero to one, FX is set from IMM[0] and not by the usual
rule that FX is set to one when an exception bit changes from zero to
one). Bits 1 and 2 (FEX and VX) are set according to the usual rule
described in 2.2.3 Floating-Point Status and Control Register
(FPSCR), and not from IMM[1:2].
Move to mtfsf FM,frB frB[32:63] are placed into the FPSCR under control of the field mask
FPSCR mtfsf. specified by FM. The field mask identifies the 4-bit fields affected. Let
Fields i be an integer in the range 0-7. If FM[i] = 1 then FPSCR field i (FPSCR
bits 4∗i through 4∗i+ 3) is set to the contents of the corresponding field
of the low-order 32 bits of register frB.
mtfsf Move to FPSCR Fields
mtfsf. Move to FPSCR Fields with CR Update. The dot suffix
enables the update of the condition register.
When FPSCR[0:3] is specified, bits 0 (FX) and 3 (OX) are set to the
values of frB[32] and frB[35] (i.e., even if this instruction causes OX to
change from zero to one, FX is set from frB[32] and not by the usual
rule that FX is set to one when an exception bit changes from zero to
one). Bits 1 and 2 (FEX and VX) are set according to the usual rule
described in 2.2.3 Floating-Point Status and Control Register
(FPSCR), and not from frB[33:34].
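The field mask selection performed by mtfsf can be sketched in Python. This model assumes that FM is an 8-bit value whose leftmost bit is FM[0] and corresponds to FPSCR field 0; it deliberately does not model the special handling of FX, FEX, and VX in field 0 described above. The function name mtfsf_fields is hypothetical.

```python
def mtfsf_fields(fpscr: int, fm: int, frb_low: int) -> int:
    """Copy each 4-bit field i of frB[32:63] into the FPSCR where FM[i] = 1."""
    result = fpscr & 0xFFFFFFFF
    for i in range(8):
        if fm & (0x80 >> i):                  # FM[i]; assumption: FM[0] is the MSB
            shift = 28 - 4 * i                # field i covers FPSCR bits 4*i..4*i+3
            field_mask = 0xF << shift
            result = (result & ~field_mask) | (frb_low & field_mask)
    return result & 0xFFFFFFFF
```

Under this assumption, a field mask of 0x01 updates only field 7 (FPSCR bits 28 through 31), and a field mask of zero leaves the FPSCR unchanged.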
Move to mtfsb0 crbD The bit of the FPSCR specified by operand crbD is cleared to zero.
FPSCR Bit 0 mtfsb0.
Bits 1 and 2 (FEX and VX) cannot be explicitly reset.
mtfsb0 Move to FPSCR Bit 0
mtfsb0. Move to FPSCR Bit 0 with CR Update. The dot suffix
enables the update of the condition register.
Move to mtfsb1 crbD The bit of the FPSCR specified by operand crbD is set to one.
FPSCR Bit 1 mtfsb1.
Bits 1 and 2 (FEX and VX) cannot be explicitly set.
mtfsb1 Move to FPSCR Bit 1
mtfsb1. Move to FPSCR Bit 1 with CR Update. The dot suffix
enables the update of the condition register.
Figure 4-1 shows how an effective address is generated when using register
indirect with immediate index addressing.
[Figure 4-1 (REGIND/IMM): the sign-extended displacement d is added to GPR (rA), or to zero if rA = 0, to form the 32-bit effective address used for the load or store memory access of GPR (rD/rS).]
Figure 4-2 shows how an effective address is generated when using register indirect
with index addressing.
[Figure 4-2: the contents of GPR rB are added to (rA|0), that is, to the contents of
GPR rA, or to zero when rA = 0. The sum is the 32-bit effective address used for the
memory access that loads GPR rD or stores GPR rS.]
Figure 4-3 shows how an effective address is generated when using register indirect
addressing.
[Figure 4-3: the 32-bit effective address is (rA|0), that is, the contents of GPR rA,
or zero when rA = 0. The effective address is used for the memory access that loads
GPR rD or stores GPR rS.]
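The three integer addressing modes shown in Figures 4-1 through 4-3 can be summarized in a short Python sketch. The function names and the list-based register file `gprs` are hypothetical and used only for illustration; all arithmetic is assumed to wrap modulo 2^32.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(d):
    """Interpret the low 16 bits of d as a signed two's-complement value."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def base(gprs, ra):
    """(rA|0): the contents of GPR rA, or zero when the rA field is 0."""
    return 0 if ra == 0 else gprs[ra]

def ea_immediate_index(gprs, ra, d):
    """Register indirect with immediate index: (rA|0) + sign-extended d."""
    return (base(gprs, ra) + sign_extend16(d)) & MASK32

def ea_index(gprs, ra, rb):
    """Register indirect with index: (rA|0) + (rB)."""
    return (base(gprs, ra) + gprs[rb]) & MASK32

def ea_register_indirect(gprs, ra):
    """Register indirect: (rA|0)."""
    return base(gprs, ra) & MASK32
```

Note that when the rA field is 0 the contents of r0 are ignored, so `ea_immediate_index(gprs, 0, d)` yields the sign-extended displacement alone.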
Name    Mnemonic    Operand Syntax    Operation
Load Byte lbz rD,d(rA) The effective address is the sum (rA|0) + d. The byte in memory
and Zero addressed by the EA is loaded into register rD[24:31]. The remaining
bits in register rD are cleared to zero.
Load Byte lbzx rD,rA,rB The effective address is the sum (rA|0) + (rB). The byte in memory
and Zero addressed by the EA is loaded into register rD[24:31]. The remaining
Indexed bits in register rD are cleared to zero.
Load Byte lbzu rD,d(rA) The effective address (EA) is the sum (rA|0) + d. The byte in memory
and Zero addressed by the EA is loaded into register rD[24:31]. The remaining
with Update bits in register rD are cleared to zero. The EA is placed into register rA.
The PowerPC architecture defines load with update instructions with
rA = 0 or rA = rD as invalid forms. In the RCPU, however, if rA = 0 then
the EA is written into R0. If rA = rD then rA is loaded from memory
location MEM(rA, N) where N is determined by the instruction operand
size.
Load Byte lbzux rD,rA,rB The effective address (EA) is the sum (rA|0) + (rB). The byte
and Zero addressed by the EA is loaded into register rD[24:31]. The remaining
with Update bits in register rD are cleared to zero. The EA is placed into register rA.
Indexed
The PowerPC architecture defines load with update instructions with
rA = 0 or rA = rD as invalid forms. In the RCPU, however, if rA = 0 then
the EA is written into R0. If rA = rD then rA is loaded from memory
location MEM(rA, N) where N is determined by the instruction operand
size.
Load lhz rD,d(rA) The effective address is the sum (rA|0) + d. The half-word in memory
Half Word addressed by the EA is loaded into register rD[16:31]. The remaining
and Zero bits in rD are cleared to zero.
Load lhzx rD,rA,rB The effective address is the sum (rA|0) + (rB). The half-word in
Half Word memory addressed by the EA is loaded into register rD[16:31]. The
and Zero remaining bits in register rD are cleared.
Indexed
Load lhzu rD,d(rA) The effective address is the sum (rA|0) + d. The half-word in memory
Half Word addressed by the EA is loaded into register rD[16:31]. The remaining
and Zero bits in register rD are cleared.
with Update
The EA is placed into register rA.
The PowerPC architecture defines load with update instructions with
rA = 0 or rA = rD as invalid forms. In the RCPU, however, if rA = 0 then
the EA is written into R0. If rA = rD then rA is loaded from memory
location MEM(rA, N) where N is determined by the instruction operand
size.
Load lhzux rD,rA,rB The effective address is the sum (rA|0) + (rB). The half-word in
Half Word memory addressed by the EA is loaded into register rD[16:31]. The
and Zero remaining bits in register rD are cleared. The EA is placed into register
with Update rA.
Indexed
The PowerPC architecture defines load with update instructions with
rA = 0 or rA = rD as invalid forms. In the RCPU, however, if rA = 0 then
the EA is written into R0. If rA = rD then rA is loaded from memory
location MEM(rA, N) where N is determined by the instruction operand
size.
Load lha rD,d(rA) The effective address is the sum (rA|0) + d. The half-word in memory
Half Word addressed by the EA is loaded into register rD[16:31]. The remaining
Algebraic bits in register rD are filled with a copy of bit 0 of the loaded half-word.
Load lhax rD,rA,rB The effective address is the sum (rA|0) + (rB). The half-word in
Half Word memory addressed by the EA is loaded into register rD[16:31]. The
Algebraic remaining bits in register rD are filled with a copy of bit 0 of the loaded
Indexed half-word.
Load lhau rD,d(rA) The effective address is the sum (rA|0) + d. The half-word in memory
Half Word addressed by the EA is loaded into register rD[16:31]. The remaining
Algebraic bits in register rD are filled with a copy of bit 0 of the loaded half-word.
with Update The EA is placed into register rA.
The PowerPC architecture defines load with update instructions with
rA = 0 or rA = rD as invalid forms. In the RCPU, however, if rA = 0 then
the EA is written into R0. If rA = rD then rA is loaded from memory
location MEM(rA, N) where N is determined by the instruction operand
size.
Load lhaux rD,rA,rB The effective address is the sum (rA|0) + (rB). The half-word in
Half Word memory addressed by the EA is loaded into register rD[16:31]. The
Algebraic remaining bits in register rD are filled with a copy of bit 0 of the loaded
with Update half-word. The EA is placed into register rA.
Indexed
The PowerPC architecture defines load with update instructions with
rA = 0 or rA = rD as invalid forms. In the RCPU, however, if rA = 0 then
the EA is written into R0. If rA = rD then rA is loaded from memory
location MEM(rA, N) where N is determined by the instruction operand
size.
Load Word lwz rD,d(rA) The effective address is the sum (rA|0) + d. The word in memory
and Zero addressed by the EA is loaded into register rD[0:31].
Load Word lwzx rD,rA,rB The effective address is the sum (rA|0) + (rB). The word in memory
and Zero addressed by the EA is loaded into register rD[0:31].
Indexed
Load Word lwzu rD,d(rA) The effective address is the sum (rA|0) + d. The word in memory
and Zero addressed by the EA is loaded into register rD[0:31]. The EA is placed
with Update into register rA.
The PowerPC architecture defines load with update instructions with
rA = 0 or rA = rD as invalid forms. In the RCPU, however, if rA = 0 then
the EA is written into R0. If rA = rD then rA is loaded from memory
location MEM(rA, N) where N is determined by the instruction operand
size.
Load Word lwzux rD,rA,rB The effective address is the sum (rA|0) + (rB). The word in memory
and Zero addressed by the EA is loaded into register rD[0:31]. The EA is placed
with Update into register rA.
Indexed
The PowerPC architecture defines load with update instructions with
rA = 0 or rA = rD as invalid forms. In the RCPU, however, if rA = 0 then
the EA is written into R0. If rA = rD then rA is loaded from memory
location MEM(rA, N) where N is determined by the instruction operand
size.
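The difference between the "and zero" and "algebraic" load forms is how the upper bits of rD are filled. The following Python sketch illustrates this for a memory image held in a `bytes` object. The function names mirror the mnemonics but are otherwise hypothetical, and big-endian byte ordering is assumed.

```python
def lbz(mem, ea):
    """Load byte and zero: byte goes to rD[24:31], upper bits cleared."""
    return mem[ea]

def lhz(mem, ea):
    """Load half-word and zero: half-word goes to rD[16:31], upper bits cleared."""
    return int.from_bytes(mem[ea:ea + 2], "big")

def lha(mem, ea):
    """Load half-word algebraic: rD[0:15] filled with bit 0 of the half-word."""
    value = lhz(mem, ea)
    if value & 0x8000:
        value |= 0xFFFF0000
    return value

def lwz(mem, ea):
    """Load word and zero: the full word goes to rD[0:31]."""
    return int.from_bytes(mem[ea:ea + 4], "big")
```

With the half-word 0x8001 in memory, `lhz` produces 0x00008001 and `lha` produces 0xFFFF8001.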
Store Byte stb rS,d(rA) The effective address is the sum (rA|0) + d. Register rS[24:31] is
stored into the byte in memory addressed by the EA.
Store Byte stbx rS,rA,rB The effective address is the sum (rA|0) + (rB). rS[24:31] is stored into
Indexed the byte in memory addressed by the EA.
Store Byte stbu rS,d(rA) The effective address is the sum (rA|0) + d. rS[24:31] is stored into the
with Update byte in memory addressed by the EA. The EA is placed into register
rA.
Store Byte stbux rS,rA,rB The effective address is the sum (rA|0) + (rB). rS[24:31] is stored into
with Update the byte in memory addressed by the EA. The EA is placed into
Indexed register rA.
Store sth rS,d(rA) The effective address is the sum (rA|0) + d. rS[16:31] is stored into the
Half Word half-word in memory addressed by the EA.
Store sthx rS,rA,rB The effective address (EA) is the sum (rA|0) + (rB). rS[16:31] is stored
Half Word into the half-word in memory addressed by the EA.
Indexed
Store sthu rS,d(rA) The effective address is the sum (rA|0) + d. rS[16:31] is stored into the
Half Word half-word in memory addressed by the EA. The EA is placed into
with Update register rA.
Store sthux rS,rA,rB The effective address is the sum (rA|0) + (rB). rS[16:31] is stored into
Half Word the half-word in memory addressed by the EA. The EA is placed into
with Update register rA.
Indexed
Store Word stw rS,d(rA) The effective address is the sum (rA|0) + d. Register rS is stored into
the word in memory addressed by the EA.
Store Word stwx rS,rA,rB The effective address is the sum (rA|0) + (rB). rS is stored into the
Indexed word in memory addressed by the EA.
Store Word stwu rS,d(rA) The effective address is the sum (rA|0) + d. Register rS is stored into
with Update the word in memory addressed by the EA. The EA is placed into
register rA.
Store Word stwux rS,rA,rB The effective address is the sum (rA|0) + (rB). Register rS is stored
with Update into the word in memory addressed by the EA. The EA is placed into
Indexed register rA.
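A store with update combines the memory write with a write-back of the effective address into rA. The sketch below models stwu in Python. The names `stwu`, `gprs`, and `mem` are hypothetical, big-endian byte ordering is assumed, and the sketch assumes rA is not 0 because the table does not describe that case for stores.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(d):
    """Interpret the low 16 bits of d as a signed two's-complement value."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def stwu(gprs, mem, rs, ra, d):
    """Store word with update: store rS at EA = (rA) + d, then rA <- EA.

    Assumes ra != 0 (the rA = 0 case is not modeled in this sketch).
    """
    ea = (gprs[ra] + sign_extend16(d)) & MASK32
    mem[ea:ea + 4] = (gprs[rs] & MASK32).to_bytes(4, "big")
    gprs[ra] = ea
    return ea
```

A typical use is pushing a word onto a downward-growing stack, for example `stwu(gprs, mem, 5, 1, -16)` with r1 as the stack pointer.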
Name    Mnemonic    Operand Syntax    Operation
Load lhbrx rD,rA,rB The effective address is the sum (rA|0) + (rB). Bits 0 to 7 of the
Half Word half-word in memory addressed by the EA are loaded into
Byte- rD[24:31]. Bits 8 to 15 of the half-word in memory addressed by the
Reverse EA are loaded into rD[16:23]. The rest of the bits in rD are cleared
Indexed to zero.
Load Word lwbrx rD,rA,rB The effective address is the sum (rA|0)+(rB). Bits 0–7 of the word
Byte- in memory addressed by the EA are loaded into rD[24:31]. Bits 8
Reverse to 15 of the word in memory addressed by the EA are loaded into
Indexed rD[16:23]. Bits 16 to 23 of the word in memory addressed by the
EA are loaded into rD[8:15]. Bits 24 to 31 of the word in memory
addressed by the EA are loaded into rD[0:7].
Store sthbrx rS,rA,rB The effective address is the sum (rA|0)+(rB). rS[24:31] are stored
Half Word into bits 0 to 7 of the half-word in memory addressed by the EA.
Byte- rS[16:23] are stored into bits 8 to 15 of the half-word in memory
Reverse addressed by the EA.
Indexed
Store Word stwbrx rS,rA,rB The effective address is the sum (rA|0)+(rB). rS[24:31] are stored
Byte- into bits 0 to 7 of the word in memory addressed by EA. Register
Reverse rS[16:23] are stored into bits 8 to 15 of the word in memory
Indexed addressed by the EA. Register rS[8:15] are stored into bits 16 to
23 of the word in memory addressed by the EA. rS[0:7] are stored
into bits 24 to 31 of the word in memory addressed by the EA.
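The byte-reverse loads and stores amount to accessing memory with the byte order reversed relative to the normal big-endian access. A minimal Python sketch follows, with function names taken from the mnemonics and a `bytes`/`bytearray` memory image assumed for illustration.

```python
def lhbrx(mem, ea):
    """Byte at EA goes to rD[24:31], byte at EA+1 to rD[16:23]; rest cleared."""
    return int.from_bytes(mem[ea:ea + 2], "little")

def lwbrx(mem, ea):
    """Byte at EA goes to rD[24:31], ..., byte at EA+3 goes to rD[0:7]."""
    return int.from_bytes(mem[ea:ea + 4], "little")

def sthbrx(mem, ea, rs):
    """rS[24:31] is stored at EA, rS[16:23] at EA+1."""
    mem[ea:ea + 2] = (rs & 0xFFFF).to_bytes(2, "little")

def stwbrx(mem, ea, rs):
    """rS[24:31] is stored at EA, ..., rS[0:7] at EA+3."""
    mem[ea:ea + 4] = (rs & 0xFFFFFFFF).to_bytes(4, "little")
```

Storing a value with `stwbrx` and reading it back with `lwbrx` returns the original value, while a normal word load of the same location would see the bytes swapped.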
The PowerPC architecture defines the load multiple instruction (lmw) with rA in the
range of registers to be loaded as an invalid form. In the RCPU, however, if rA is
in the range of registers to be loaded, the instruction completes normally, and rA is
loaded from the memory location as follows:
rA ← MEM(EA+(rA–rD)*4, 4)
For integer load and store multiple instructions, the effective address must be a
multiple of four. If not, a system alignment exception is generated.
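The load multiple behavior described above, including the case where rA lies in the range of loaded registers, can be sketched in Python. The names `lmw`, `gprs`, `mem`, and `AlignmentException` are hypothetical. The first register operand is written `rd` here, the effective address is assumed to be computed before any register is overwritten, and big-endian byte ordering is assumed.

```python
class AlignmentException(Exception):
    """Stands in for the alignment exception taken on a misaligned EA."""

def lmw(gprs, mem, rd, ea):
    """Load consecutive words starting at EA into registers rd through r31.

    Register r receives MEM(EA + (r - rd) * 4, 4). If rA is in the range
    rd..31 it is simply loaded like any other register in the range.
    """
    if ea % 4 != 0:
        raise AlignmentException(hex(ea))
    for r in range(rd, 32):
        offset = ea + (r - rd) * 4
        gprs[r] = int.from_bytes(mem[offset:offset + 4], "big")
```

Because the effective address is passed in already computed, overwriting rA inside the loop does not affect the addresses of later words.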
Load/store string indexed instructions of zero length have no effect, except that
load string indexed instructions of zero length may set register rD to an undefined
value.
The PowerPC architecture defines the load string instructions with rA in the range
of registers to be loaded as an invalid form. In the RCPU, however, if rA is in the
range of registers to be loaded, the instruction completes normally, and rA is
loaded from memory.
Name    Mnemonic    Operand Syntax    Operation
Store String Word Indexed   stswx   rS,rA,rB
The effective address is the sum (rA|0)+(rB).
Let n = XER[25:31]; n is the number of bytes to store.
Let nr = CEIL(n/4); nr is the number of registers to supply data.
n consecutive bytes starting at the EA are stored from register rS
through rS+nr-1.
Bytes are stored left to right from each register. The sequence of
registers wraps around through r0 if required.
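The byte sequencing of stswx, including the wrap from r31 to r0, can be illustrated with a short Python sketch. The function name, the list-based register file, and the `bytearray` memory are assumptions made for illustration; the byte count `n` is passed in directly rather than read from XER[25:31].

```python
def stswx(gprs, mem, rs, ea, n):
    """Store n bytes starting at EA from registers rs, rs+1, ... (mod 32).

    Bytes are taken left to right (most significant byte first) from each
    register; CEIL(n/4) registers supply data.
    """
    for i in range(n):
        reg = (rs + i // 4) % 32          # wraps from r31 around to r0
        shift = 24 - 8 * (i % 4)          # leftmost byte of the register first
        mem[ea + i] = (gprs[reg] >> shift) & 0xFF
    return -(-n // 4)                     # nr = CEIL(n/4)
```

With n = 6 and rS = 31, four bytes come from r31 and the remaining two from the high-order bytes of r0.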
Figure 4-4 shows how an effective address is generated when using register indirect
with immediate index addressing.
Instruction Encoding: Opcode (bits 0-5), frD/frS (bits 6-10), rA (bits 11-15), d (bits 16-31)
[Figure 4-4: the d field is sign-extended to 32 bits and added to (rA|0), that is, to
the contents of GPR rA, or to zero when rA = 0. The sum is the 32-bit effective address
used for the memory access that loads the 64-bit FPR frD or stores the 64-bit FPR frS.]
Figure 4-5 shows how an effective address is generated when using register indirect
with index addressing.
[Figure 4-5: the contents of GPR rB are added to (rA|0), that is, to the contents of
GPR rA, or to zero when rA = 0. The sum is the 32-bit effective address used for the
memory access that loads the 64-bit FPR frD or stores the 64-bit FPR frS.]
Floating-point store instructions are listed in Table 4-19. Recall that rA, rB, and rD
denote GPRs, while frA, frB, frC, frS and frD denote FPRs.
If the operand falls in the range between single denormalized and double
denormalized, it is considered a programming error. The hardware handles this case
as if the operand were single denormalized.
The following check is done on the stored operand in order to determine whether
it is a denormalized single-precision operand and invoke the floating-point assist
exception handler:
Floating-Point Move Register   fmr, fmr.   frD,frB
The contents of register frB are placed into frD.
fmr       Floating-Point Move Register
fmr.      Floating-Point Move Register with CR Update. The dot
          suffix enables the update of the condition register.
Floating-Point Negate   fneg, fneg.   frD,frB
The contents of register frB with bit 0 inverted are placed into register
frD.
fneg      Floating-Point Negate
fneg.     Floating-Point Negate with CR Update. The dot suffix
          enables the update of the condition register.
Floating-Point Absolute Value   fabs, fabs.   frD,frB
The contents of frB with bit 0 cleared to zero are placed into frD.
fabs      Floating-Point Absolute Value
fabs.     Floating-Point Absolute Value with CR Update. The dot
          suffix enables the update of the condition register.
Floating-Point Negative Absolute Value   fnabs, fnabs.   frD,frB
The contents of frB with bit 0 set to one are placed into frD.
fnabs     Floating-Point Negative Absolute Value
fnabs.    Floating-Point Negative Absolute Value with CR Update.
          The dot suffix enables the update of the condition
          register.
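All four floating-point move instructions operate only on bit 0 of the register (the sign bit, the most significant bit of the 64-bit image). The Python sketch below works on raw 64-bit patterns. The helper names `to_bits` and `from_bits` are hypothetical, and `fabs_` carries a trailing underscore only to keep it distinct from the standard library function of similar name.

```python
import struct

SIGN_BIT = 1 << 63  # bit 0 of a 64-bit FPR in big-endian bit numbering

def to_bits(x):
    """Return the 64-bit IEEE double-precision image of a Python float."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def from_bits(b):
    """Return the Python float whose 64-bit image is b."""
    return struct.unpack(">d", struct.pack(">Q", b))[0]

def fmr(frb):
    return frb                 # copied unchanged

def fneg(frb):
    return frb ^ SIGN_BIT      # bit 0 inverted

def fabs_(frb):
    return frb & ~SIGN_BIT     # bit 0 cleared to zero

def fnabs(frb):
    return frb | SIGN_BIT      # bit 0 set to one
```

Because only the sign bit is touched, these operations behave the same for every bit pattern, including zeros, infinities, and NaNs.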
When the branch instructions contain immediate addressing operands, the target
addresses can be computed sufficiently ahead of the branch instruction that
instructions can be prefetched along the target path. If the branch instructions use
the link and count registers, instructions along the target path can be prefetched if
the link or count register is loaded sufficiently ahead of the branch instruction.
Branch instructions compute the effective address (EA) of the next instruction
using the following addressing modes:
• Branch relative
• Branch to absolute address
• Branch conditional to relative address
• Branch conditional to absolute address
• Branch conditional to link register
• Branch conditional to count register
Figure 4-6 shows how the branch target address is generated when using the
branch relative addressing mode.
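[Figure 4-6: the LI field is concatenated with 0b00 and sign-extended to 32 bits
(sign extension in bits 0-5, LI in bits 6-29, zeros in bits 30-31). The result is added
to the current instruction address to form the 32-bit branch target address.]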
Figure 4-7 shows how the branch target address is generated when using the
branch conditional relative addressing mode.
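[Figure 4-7: if the condition is not true, execution continues at the next sequential
instruction address. If the condition is true, the BD field is concatenated with 0b00,
sign-extended to 32 bits, and added to the current instruction address to form the
32-bit branch target address.]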
Figure 4-8 shows how the branch target address is generated when using the
branch to absolute address mode.
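Instruction Encoding: opcode 0x12 (bits 0-5), LI (bits 6-29), AA (bit 30), LK (bit 31)
[Figure 4-8: the LI field is concatenated with 0b00 and sign-extended to 32 bits
(sign extension in bits 0-5, LI in bits 6-29, zeros in bits 30-31). The result is the
branch target address.]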
Figure 4-9 shows how the branch target address is generated when using the
branch conditional to absolute address mode.
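Instruction Encoding: opcode 0x10 (bits 0-5), BO (bits 6-10), BI (bits 11-15), BD (bits 16-29), AA (bit 30), LK (bit 31)
[Figure 4-9: if the condition is not true, execution continues at the next sequential
instruction address. If the condition is true, the BD field is concatenated with 0b00
and sign-extended to 32 bits. The result is the branch target address.]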
Figure 4-10 shows how the branch target address is generated when using the
branch conditional to link register address mode.
[Figure 4-10: if the condition is not true, execution continues at the next sequential
instruction address. If the condition is true, the branch target address is
LR[0:29] || 0b00.]
Figure 4-11 shows how the branch target address is generated when using the
branch conditional to count register address mode.
[Figure 4-11: if the condition is not true, execution continues at the next sequential
instruction address. If the condition is true, the branch target address is
CTR[0:29] || 0b00.]
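The target address calculations in Figures 4-6 through 4-11 can be collected into one Python sketch. The function names are hypothetical; `li` and `bd` are the raw 24-bit and 14-bit instruction fields, `aa` is the absolute-address bit, and addresses are assumed to wrap modulo 2^32.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def b_target(cia, li, aa):
    """Unconditional branch: LI || 0b00, sign-extended, relative or absolute."""
    disp = sign_extend(li << 2, 26)
    return (disp if aa else cia + disp) & MASK32

def bc_target(cia, bd, aa):
    """Conditional branch (taken path): BD || 0b00, sign-extended."""
    disp = sign_extend(bd << 2, 16)
    return (disp if aa else cia + disp) & MASK32

def register_target(reg):
    """Branch to LR or CTR (taken path): register bits 0:29 || 0b00."""
    return reg & 0xFFFFFFFC
```

In every conditional mode the not-taken path is simply the next sequential instruction address, which this sketch leaves to the caller.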
0000y Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is FALSE.
0001y Decrement the CTR, then branch if the decremented CTR = 0 and the condition is FALSE.
0100y Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is TRUE.
0101y Decrement the CTR, then branch if the decremented CTR = 0 and the condition is TRUE.
The first four bits of the BO operand specify how the branch is affected by or affects
the condition and count registers. The fifth bit, shown in Table 4-21 as having the
value y, is used for branch prediction. The branch always encoding of the BO
operand does not have a y bit.
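The four "decrement and test" encodings listed above can be evaluated as in the following Python sketch. It handles only those four encodings and rejects any other BO value. The function name `bc_decrement_taken` is hypothetical, and BO is passed as a 5-bit integer whose least significant bit is the y bit.

```python
MASK32 = 0xFFFFFFFF

def bc_decrement_taken(bo, ctr, condition):
    """Return (taken, new_ctr) for BO encodings 0000y, 0001y, 0100y, 0101y.

    condition is the value (True/False) of the CR bit selected by BI.
    """
    first_four = (bo >> 1) & 0xF          # drop the y (prediction) bit
    if first_four not in (0b0000, 0b0001, 0b0100, 0b0101):
        raise ValueError("encoding not covered by this sketch")
    new_ctr = (ctr - 1) & MASK32          # CTR is decremented before the test
    want_true = bool(first_four & 0b0100)
    want_zero = bool(first_four & 0b0001)
    taken = ((new_ctr == 0) == want_zero) and (condition == want_true)
    return taken, new_ctr
```

The y bit does not change whether the branch is taken, so `0b00000` and `0b00001` give the same result.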
Clearing the y bit to zero indicates that the following behavior is likely:
• For bcx with a negative value in the displacement operand, the branch is
taken.
• In all other cases (bcx with a non-negative value in the displacement operand,
bclrx, or bcctrx), the branch is not taken.
Setting the y bit to one reverses the preceding indications.
Note that branch prediction occurs for branches to the LR or CTR only if the target
address is ready.
The sign of the displacement operand is used as described above even if the target
is an absolute address. The default value for the y bit should be zero, and should
only be set to one if software has determined that the prediction corresponding to
y = one is more likely to be correct than the prediction corresponding to y = zero.
Software that does not compute branch predictions should set the y bit to zero.
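The static prediction rule for encodings that have a y bit can be written as a small Python function. The name `predict_taken` and its parameters are hypothetical, and the sketch does not cover the branch always encoding, which has no y bit.

```python
def predict_taken(y, is_bc, displacement=0):
    """Static branch prediction for BO encodings that carry a y bit.

    y            -- the y bit of the BO operand (0 or 1)
    is_bc        -- True for bcx, False for bclrx and bcctrx
    displacement -- signed displacement operand (only meaningful for bcx)
    """
    # With y = 0, only a bcx with a negative displacement is predicted taken.
    default_taken = is_bc and displacement < 0
    # Setting y = 1 reverses the default prediction.
    return default_taken != bool(y)
```

A backward bcx, such as a loop-closing branch, is therefore predicted taken by default, while a forward bcx is predicted to fall through.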
For all three of the branch conditional instructions, the branch should be predicted
to be taken if the value of the following expression is one, and to fall through if the
value is zero.
4.6.2.2 BI Operand
The 5-bit BI operand in branch conditional instructions specifies which of the 32 bits
in the CR represents the condition to test.
Branch Conditional   bc, bca, bcl, bcla   BO,BI,target_addr
The BI operand specifies the bit in the condition register (CR) to be
used as the condition of the branch. The BO operand is used as
described in Table 4-21.
bc        Branch Conditional. Branch conditionally to the address
          computed as the sum of the immediate address and the
          address of the current instruction.
bca       Branch Conditional Absolute. Branch conditionally to the
          absolute address specified.
bcl       Branch Conditional then Link. Branch conditionally to the
          address computed as the sum of the immediate address
          and the address of the current instruction. The instruction
          address following this instruction is placed into the link
          register.
bcla      Branch Conditional Absolute then Link. Branch
          conditionally to the absolute address specified. The
          instruction address following this instruction is placed into
          the link register.
Branch Conditional to Link Register   bclr, bclrl   BO,BI
The BI operand specifies the bit in the condition register to be used as
the condition of the branch. The BO operand is used as described in
Table 4-21.
bclr      Branch Conditional to Link Register. Branch conditionally
          to the address in the link register.
bclrl     Branch Conditional to Link Register then Link. Branch
          conditionally to the address specified in the link register.
          The instruction address following this instruction is then
          placed into the link register.
Branch Conditional to Count Register   bcctr, bcctrl   BO,BI
The BI operand specifies the bit in the condition register to be used as
the condition of the branch. The BO operand is used as described in
Table 4-21.
bcctr     Branch Conditional to Count Register. Branch
          conditionally to the address specified in the count
          register.
bcctrl    Branch Conditional to Count Register then Link. Branch
          conditionally to the address specified in the count
          register. The instruction address following this instruction
          is placed into the link register.
Note: If the “decrement and test CTR” option is specified (BO[2]=0),
the instruction form is invalid.
Note that if the link register update option (LR) is enabled for any of these
instructions, the PowerPC architecture defines these forms of the instructions as invalid.
Condition crand crbD,crbA,crbB The bit in the condition register specified by crbA is ANDed
Register AND with the bit in the condition register specified by crbB. The
result is placed into the condition register bit specified by crbD.
Condition cror crbD,crbA,crbB The bit in the condition register specified by crbA is ORed with
Register OR the bit in the condition register specified by crbB. The result is
placed into the condition register bit specified by crbD.
Condition crxor crbD,crbA,crbB The bit in the condition register specified by crbA is XORed
Register XOR with the bit in the condition register specified by crbB. The
result is placed into the condition register bit specified by crbD.
Condition crnand crbD,crbA,crbB The bit in the condition register specified by crbA is ANDed
Register with the bit in the condition register specified by crbB. The
NAND complemented result is placed into the condition register bit
specified by crbD.
Condition crnor crbD,crbA,crbB The bit in the condition register specified by crbA is ORed with
Register NOR the bit in the condition register specified by crbB. The
complemented result is placed into the condition register bit
specified by crbD.
Condition creqv crbD,crbA, The bit in the condition register specified by crbA is XORed
Register crbB with the bit in the condition register specified by crbB. The
Equivalent complemented result is placed into the condition register bit
specified by crbD.
Condition crandc crbD,crbA, The bit in the condition register specified by crbA is ANDed
Register AND crbB with the complement of the bit in the condition register specified
with by crbB and the result is placed into the condition register bit
Complement specified by crbD.
Condition crorc crbD,crbA, The bit in the condition register specified by crbA is ORed with
Register OR crbB the complement of the bit in the condition register specified by
with crbB and the result is placed into the condition register bit
Complement specified by crbD.
Move mcrf crfD,crfS The contents of crfS are copied into crfD. No other condition
Condition register fields are changed.
Register Field
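The condition register logical instructions operate on single CR bits, and mcrf copies a whole 4-bit field. The Python sketch below models the CR as a 32-bit integer with bit 0 as the most significant bit. The helper names `get_bit`, `set_bit`, `cr_logical`, and the `CR_OPS` table are hypothetical.

```python
def get_bit(cr, n):
    """Return CR bit n (bit 0 is the most significant bit)."""
    return (cr >> (31 - n)) & 1

def set_bit(cr, n, value):
    """Return cr with bit n set to value (0 or 1)."""
    mask = 1 << (31 - n)
    return (cr | mask) if value else (cr & ~mask)

CR_OPS = {
    "crand":  lambda a, b: a & b,
    "cror":   lambda a, b: a | b,
    "crxor":  lambda a, b: a ^ b,
    "crnand": lambda a, b: 1 - (a & b),
    "crnor":  lambda a, b: 1 - (a | b),
    "creqv":  lambda a, b: 1 - (a ^ b),
    "crandc": lambda a, b: a & (1 - b),
    "crorc":  lambda a, b: a | (1 - b),
}

def cr_logical(op, cr, crbd, crba, crbb):
    """Apply a CR logical instruction and return the new CR value."""
    result = CR_OPS[op](get_bit(cr, crba), get_bit(cr, crbb))
    return set_bit(cr, crbd, result)

def mcrf(cr, crfd, crfs):
    """Copy CR field crfS into CR field crfD; other fields are unchanged."""
    field = (cr >> (28 - 4 * crfs)) & 0xF
    shift = 28 - 4 * crfd
    return (cr & ~(0xF << shift)) | (field << shift)
```

For example, `cr_logical("crxor", cr, n, n, n)` always clears bit n, and `cr_logical("creqv", cr, n, n, n)` always sets it.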
System Call sc — When executed, the effective address of the instruction following the
sc instruction is placed into SRR0. MSR[16:31] are placed into
SRR1[16:31], and SRR1[0:15] are set to undefined values. Then a
system call exception is generated.
The exception causes the next instruction to be fetched from offset
0xC00 from the base physical address indicated by the new setting of
MSR[IP]. Refer to 6.11.8 System Call Exception (0x00C00) for more
information.
This instruction is context synchronizing.
Return from rfi — SRR1[16:31] are placed into MSR[16:31], then the next instruction is
Interrupt fetched, under control of the new MSR value, from the address
SRR0[0:29] || 0b00.
This is a supervisor-level, context-synchronizing instruction.
Mnemonics are provided so that branch conditional instructions can be coded with
the condition as part of the instruction mnemonic rather than as a numeric operand.
Some of these are shown as examples with the branch instructions.
Trap Word Immediate   twi   TO,rA,SIMM
The contents of rA are compared with the sign-extended SIMM operand.
If any bit in the TO operand is set to one and its corresponding
condition is met by the result of the comparison, then the system trap
handler is invoked.
Trap Word   tw   TO,rA,rB
The contents of rA are compared with the contents of rB. If any bit in the
TO operand is set to one and its corresponding condition is met by the
result of the comparison, then the system trap handler is invoked.
The contents of register rA are compared with either the sign-extended SIMM field
or with the contents of register rB, depending on the trap instruction. The
comparison results in five conditions, which are ANDed with operand TO. If the result
is not zero, the trap exception handler is invoked. These conditions are provided in
Table 4-26.
0 Less than
1 Greater than
2 Equal
A standard set of codes has been adopted for the most common combinations of
trap conditions. Refer to E.7 Simplified Mnemonics for Trap Instructions for a
description of these codes and of simplified mnemonics employing them.
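The trap test can be sketched in Python as follows. The function name `trap_taken` is hypothetical. The listing above shows TO bits 0 through 2; the sketch assumes, following the PowerPC architecture definition, that bits 3 and 4 select the unsigned ("logically") less-than and greater-than comparisons.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def trap_taken(to, a, b):
    """Return True if the trap handler would be invoked.

    to is the 5-bit TO operand (bit 0 is its most significant bit);
    a is (rA); b is (rB) or the sign-extended SIMM value.
    """
    a &= MASK32
    b &= MASK32
    sa, sb = to_signed32(a), to_signed32(b)
    # Conditions for TO bits 0..4; bits 3 and 4 are an assumption (see above).
    conditions = [sa < sb, sa > sb, a == b, a < b, a > b]
    met = 0
    for bit, condition in enumerate(conditions):
        if condition:
            met |= 1 << (4 - bit)
    return (to & met) != 0
```

With TO = 0b11111 at least one condition is always met, so the trap is unconditional.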
4.7.1 Move to/from Machine State Register and Condition Register Instructions
Table 4-27 summarizes the instructions for reading from or writing to the machine
state register and the condition register.
Move to mtcrf CRM,rS The contents of rS are placed into the condition register under control
Condition of the field mask specified by operand CRM. The field mask identifies
Register the 4-bit fields affected. Let i be an integer in the range 0-7. If CRM(i)
Fields = 1, then CR field i (CR bits 4*i through 4*i+3) is set to the contents of
the corresponding field of rS.
Move to mcrxr crfD The contents of XER[0:3] are copied into the condition register field
Condition designated by crfD. All other fields of the condition register remain
Register unchanged. XER[0:3] is cleared to zero.
from XER
Move from mfcr rD The contents of the condition register are placed into rD.
Condition
Register
Move from mfmsr rD The contents of the MSR are placed into rD. This is a supervisor-level
Machine instruction.
State
Register
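The field-oriented moves into the condition register can be sketched in Python. The function names follow the mnemonics, registers are modeled as 32-bit integers with bit 0 as the most significant bit, and the tuple return value of `mcrxr` is a convention chosen for this illustration only.

```python
MASK32 = 0xFFFFFFFF

def mtcrf(cr, crm, rs):
    """Copy the 4-bit fields of rS selected by CRM into the CR.

    CRM(i), with i = 0 the most significant mask bit, selects CR field i
    (CR bits 4*i through 4*i+3).
    """
    for i in range(8):
        if (crm >> (7 - i)) & 1:
            mask = 0xF << (28 - 4 * i)
            cr = (cr & ~mask) | (rs & mask)
    return cr & MASK32

def mcrxr(cr, xer, crfd):
    """Copy XER[0:3] into CR field crfD and clear XER[0:3].

    Returns the pair (new_cr, new_xer).
    """
    field = (xer >> 28) & 0xF
    shift = 28 - 4 * crfd
    new_cr = (cr & ~(0xF << shift)) | (field << shift)
    return new_cr & MASK32, xer & 0x0FFFFFFF
```

A CRM value of 0xFF replaces the entire condition register with the contents of rS.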
Move to mtspr SPR,rS The SPR field denotes a special purpose register, encoded as shown
Special in Table 4-29 and Table 4-30 below. The contents of rS are placed
Purpose into the designated SPR.
Register
Simplified mnemonic examples:
mtxer rA mtspr 1,rA
mtlr rA mtspr 8,rA
mtctr rA mtspr 9,rA
Move from mfspr rD,SPR The SPR field denotes a special purpose register, encoded as shown
Special in Table 4-29 and Table 4-30 below. The contents of the designated
Purpose SPR are placed into rD.
Register
Simplified mnemonic examples:
mfxer rA mfspr rA,1
mflr rA mfspr rA,8
mfctr rA mfspr rA,9
For mtspr and mfspr instructions, the SPR number coded in assembly language
does not appear directly as a 10-bit binary number in the instruction. The number
coded is split into two 5-bit halves that are reversed in the instruction, with the high-
order five bits appearing in bits [16:20] of the instruction and the low-order five bits
in bits [11:15].
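The swapped placement of the two SPR halves can be made concrete with a short Python sketch. The function names are hypothetical. The primary opcode 31 and extended opcode 339 used for mfspr come from the PowerPC architecture definition rather than from this section.

```python
def spr_field(spr):
    """Return the 10-bit field placed in instruction bits 11:20.

    The low-order five bits of the SPR number occupy bits 11:15 and the
    high-order five bits occupy bits 16:20.
    """
    high = (spr >> 5) & 0x1F
    low = spr & 0x1F
    return (low << 5) | high

def encode_mfspr(rd, spr):
    """Assemble mfspr rD,SPR (primary opcode 31, extended opcode 339)."""
    return (31 << 26) | (rd << 21) | (spr_field(spr) << 11) | (339 << 1)
```

For the link register (SPR 8) the high half is zero, so `encode_mfspr(0, 8)` gives 0x7C0802A6, the encoding of mflr r0.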
Table 4-29 summarizes SPR encodings to which the RCPU permits user-level
access.
Table 4-30 summarizes SPR encodings that the RCPU permits at the supervisor
level.
Table 4-31 summarizes SPR encodings that the RCPU permits in debug mode, or
in supervisor mode when debug mode is not enabled out of reset.
Simplified mnemonics for the mftb instruction allow it to be coded with the TBR
name as part of the mnemonic. Refer to E.8 Simplified Mnemonics for
Special-Purpose Registers for details. Notice that the simplified mnemonics for move
from time base and move from time base upper are variants of the mftb instruction
rather than of mfspr. The mftb instruction serves as both a basic and simplified
mnemonic. Assemblers recognize an mftb mnemonic with two operands as the basic
form and an mftb mnemonic with one operand as the simplified form.
Move from mftb rD,TBR The TBR field denotes either the time base lower (TBL) or time base
Time Base upper (TBU), encoded as shown in Table 4-33. The contents of the
designated register are copied to rD.
Table 4-33 summarizes the time base (TBL/TBU) register encodings to which
user-level read access (using the mftb instruction) is permitted.
Writing to the time base is permitted at the supervisor privilege level only and is
accomplished with the mtspr instruction (see 4.7.2 Move to/from Special Purpose
Register Instructions) or the mttb simplified mnemonic (see E.8 Simplified
Mnemonics for Special-Purpose Registers).
The enforce in-order execution of I/O (eieio) instruction serializes load/store
instructions. No load or store instruction following eieio is issued until all loads and
stores preceding eieio have completed execution.
The instruction synchronize (isync) instruction causes the RCPU to halt instruction
fetch until all instructions currently in the processor have completed execution, i.e.,
all issued instructions as well as the pre-fetched instructions waiting to be issued.
This condition is referred to as fetch serialization.
The concept behind the use of the lwarx and stwcx. instructions is that a
processor may load a semaphore from memory, compute a result based on the
value of the semaphore, and conditionally store it back to the same location. The
conditional store is performed based on the existence of a reservation established
by the preceding lwarx. If the reservation exists when the store is executed, the
store is performed and a bit is set to one in the condition register. If the reservation
does not exist when the store is executed, the target memory location is not
modified and a bit is set to zero in the condition register.
The lwarx and stwcx. primitives allow software to read a semaphore, compute a
result based on the value of the semaphore, store the new value back into the
semaphore location only if that location has not been modified since it was first
read, and determine if the store was successful. If the store was successful, the
sequence of instructions from the read of the semaphore to the store that updated
the semaphore appears to have been executed atomically (that is, no other
processor or mechanism modified the semaphore location between the read and the
update), thus providing the equivalent of a real atomic operation. However, other
processors may have read from the location during this operation.
The lwarx and stwcx. instructions require the EA to be aligned. Exception handling
software should not attempt to emulate a misaligned lwarx or stwcx. instruction,
because there is no correct way to define the address associated with the
reservation.
In general, the lwarx and stwcx. instructions should be used only in system
programs, which can be invoked by application programs as needed.
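The reservation mechanism and the usual retry loop built on it can be modeled in Python. This is a behavioral sketch under stated assumptions, not a description of the RCPU hardware: the `Core` class, its `lose_reservation` method, the dictionary-based memory, and `atomic_add` are all hypothetical names. The model keeps a single reservation flag per processor and does not compare addresses when the conditional store executes.

```python
class AlignmentException(Exception):
    """Stands in for the alignment exception taken on a misaligned EA."""

class Core:
    def __init__(self, memory):
        self.memory = memory          # dict: word-aligned address -> value
        self.reservation = False      # at most one reservation per processor

    def lwarx(self, ea):
        """Load the word at EA and establish a reservation."""
        if ea % 4 != 0:
            raise AlignmentException(hex(ea))
        self.reservation = True
        return self.memory[ea]

    def stwcx(self, ea, value):
        """Store only if a reservation exists; return the resulting EQ bit."""
        if ea % 4 != 0:
            raise AlignmentException(hex(ea))
        if not self.reservation:
            return False              # memory is left unaltered
        self.memory[ea] = value & 0xFFFFFFFF
        self.reservation = False      # a successful store clears the reservation
        return True

    def lose_reservation(self):
        """Model any event that clears the reservation."""
        self.reservation = False

def atomic_add(core, ea, delta):
    """Retry the load-reserve/store-conditional pair until the store succeeds."""
    while True:
        old = core.lwarx(ea)
        if core.stwcx(ea, old + delta):
            return old
```

In real code the loop body corresponds to an lwarx, the computation on the loaded value, an stwcx., and a conditional branch back to the lwarx when the EQ bit of CR0 is zero.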
At most one reservation exists at a time on a given processor. The address
associated with the reservation can be changed by a subsequent lwarx instruction. The
conditional store is performed based on the existence of a reservation established
by the preceding lwarx regardless of whether the address generated by the lwarx
matches that generated by the stwcx. A reservation held by the processor is
cleared by any of the following:
In case of an external memory access, this attribute causes the external bus
interface (EBI) to set a storage reservation on the cycle address. The EBI must either
snoop the external bus or receive some indication from external snoop logic in case
the storage reservation is broken by some other processor accessing the same
location. When an stwcx. instruction to external memory is executed, the EBI checks
if the reservation was lost. If so, the cycle is blocked from going to the external bus,
and the EBI notifies the LSU that the stwcx. instruction did not complete.
Enforce In-Order Execution of I/O    eieio    —
    The eieio instruction provides an ordering function for the effects of
    load and store instructions executed by a given processor. Executing
    an eieio instruction ensures that all memory accesses previously
    initiated by the given processor are complete with respect to main
    memory before allowing any memory accesses subsequently initiated
    by the given processor to access main memory.
Instruction Synchronize    isync    —
    This instruction causes instruction fetch to be halted until all
    instructions currently in the processor have completed execution, i.e.,
    all issued instructions as well as the pre-fetched instructions waiting to
    be issued.
    This instruction has no effect on other processors or on their caches.
Load Word and Reserve Indexed    lwarx    rD,rA,rB
    The effective address is the sum (rA|0) + (rB). The word in memory
    addressed by the EA is loaded into register rD.
    This instruction creates a reservation for use by an stwcx. instruction.
    An address computed from the EA is associated with the reservation,
    and replaces any address previously associated with the reservation.
    The EA must be a multiple of four. If it is not, the alignment exception
    handler is invoked.
Store Word Conditional Indexed    stwcx.    rS,rA,rB
    The effective address is the sum (rA|0) + (rB).
    If a reservation exists, register rS is stored into the word in memory
    addressed by the EA and the reservation is cleared.
    If a reservation does not exist, the instruction completes without
    altering memory.
    The EQ bit in the condition register field CR0 is modified to reflect
    whether the store operation was performed (i.e., whether a reservation
    existed when the stwcx. instruction began execution). If the store was
    completed successfully, the EQ bit is set to one.
    The EA must be a multiple of four; otherwise, the alignment exception
    handler is invoked.
Synchronize    sync    —
    Executing a sync instruction ensures that all instructions previously
    initiated by the given processor appear to have completed before any
    subsequent instructions are initiated by the given processor. When the
    sync instruction completes, all memory accesses initiated by the
    given processor prior to the sync will have been performed with
    respect to all other mechanisms that access memory. The sync
    instruction can be used to ensure that the results of all stores into a
    data structure, performed in a critical section of a program, are seen
    by other processors before the data structure is seen as unlocked.
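The reservation rules described for lwarx and stwcx. can be modelled in a few lines. The following Python sketch is illustrative only: the names ReservationModel, clear_reservation, and atomic_increment are hypothetical, memory is a plain dictionary, and only behavior stated in the text is modelled (one reservation per processor, a conditional store that depends on the existence of a reservation rather than on an address match, and an alignment exception when the EA is not a multiple of four).

```python
class AlignmentException(Exception):
    """Raised when an EA is not a multiple of four."""


class ReservationModel:
    """Toy model of one processor's lwarx/stwcx. reservation (hypothetical names)."""

    def __init__(self):
        self.memory = {}          # word-aligned EA -> 32-bit value
        self.reserved_ea = None   # at most one reservation per processor

    def lwarx(self, ea):
        if ea % 4:
            raise AlignmentException(hex(ea))
        self.reserved_ea = ea     # replaces any earlier reservation address
        return self.memory.get(ea, 0)

    def stwcx(self, ea, value):
        """Return the CR0[EQ] bit: 1 if the store was performed, else 0."""
        if ea % 4:
            raise AlignmentException(hex(ea))
        if self.reserved_ea is None:
            return 0              # no reservation: memory is not altered
        # The store depends only on a reservation existing, not on an address match.
        self.memory[ea] = value & 0xFFFFFFFF
        self.reserved_ea = None   # a performed stwcx. clears the reservation
        return 1

    def clear_reservation(self):
        """Model any event that clears the reservation."""
        self.reserved_ea = None


def atomic_increment(cpu, ea):
    """Retry loop in the style of an lwarx/stwcx. sequence."""
    while True:
        old = cpu.lwarx(ea)
        if cpu.stwcx(ea, old + 1):
            return old + 1
```

In this model a stwcx. issued without a preceding lwarx returns 0 and leaves memory untouched, which is why the retry loop in atomic_increment tests the returned EQ value before accepting the result.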
Name    Mnemonic    Operand Syntax    Operation
Instruction Cache Block Invalidate    icbi    rA,rB
    The effective address is the sum (rA|0) + (rB).
    This instruction causes any subsequent fetch request for an
    instruction in the block to not find the block in the cache and to be sent
    to storage. The instruction causes the target block in the instruction
    cache of the executing processor to be marked invalid. If the target
    block is not accessible to the program for loads, the system data
    storage error handler may be invoked.
    This is a supervisor-level instruction.
A cache access cycle begins with an instruction request from the CPU instruction
unit. In case of a cache hit, the instruction is delivered to the instruction unit. In case
of a cache miss, the cache initiates a burst read cycle (four beats per burst, one
word per beat) on the instruction bus (I-bus) with the address of the requested in-
struction. The first word received from the bus is the requested instruction. The
cache forwards this instruction to the instruction unit as soon as it is received from
the I-bus. A cache line is then selected to receive the data that will be coming from
the bus. A least-recently-used (LRU) replacement algorithm is used to select a line
when no empty lines are available.
Each cache line can be used as an SRAM, allowing the application to lock critical
code segments that need fast and deterministic execution time.
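The way a fetch address selects a cache line can be sketched as a small decoder. This Python example is an illustration, not a description of the hardware: the function name decode_icache_address is hypothetical, and the field split (21-bit tag in bits [0:20], 7-bit set select in bits [21:27], word select in bits [28:29]) is an assumption taken from the field widths shown in the cache organization figure. Bit 0 is the most significant bit, following PowerPC numbering.

```python
SETS = 128            # 7-bit set select
WAYS = 2
WORDS_PER_LINE = 4    # 2-bit word select
BYTES_PER_WORD = 4


def decode_icache_address(addr):
    """Split a 32-bit fetch address into tag, set, and word fields."""
    addr &= 0xFFFFFFFF
    return {
        "tag": addr >> 11,           # bits [0:20], 21 bits
        "set": (addr >> 4) & 0x7F,   # bits [21:27], selects one of 128 sets
        "word": (addr >> 2) & 0x3,   # bits [28:29], word within the line
    }


def cache_size_bytes():
    """Total array size implied by the assumed geometry."""
    return SETS * WAYS * WORDS_PER_LINE * BYTES_PER_WORD
```

With this split, cache_size_bytes() gives 4096, which agrees with the 4-Kbyte array size, and decode_icache_address(0x1234) yields tag 2, set 35, word 1.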
[Figure: Instruction Cache Organization. A 4-Kbyte array of two ways (WAY0, WAY1) and 128 sets (SET0 to SET127); each line holds a 21-bit tag, four words (W0 to W3), a valid bit, and a lock bit; an LRU array; the address supplies a 21-bit tag, a 7-bit set select, and a 2-bit word select; two comparators (HIT0, HIT1) and a 2➝1 mux connect the array to the line buffer and burst buffer.]
[Figure: Instruction Cache Data Path. Set decoder on the cache array, ADDR[28:29] word select, 4-word line buffer, 4-word burst buffer, word select mux (4➝1), stream hit mux (2➝1), and bypass mux (2➝1) between the I-bus data and the instruction path to the CPU.]
These registers are privileged; attempting to access them when the CPU is oper-
ating at the user privilege level results in a program interrupt.
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
RESERVED
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 IEN I-cache enable status bit. This bit is read-only; any attempt to write it is ignored.
0 = I-cache is disabled
1 = I-cache is enabled
[1:3] — Reserved
[7:9] — Reserved
[13:31] — Reserved
ADR
RESET: UNDEFINED
[0:31] ADR The address to be used in the command programmed in the control and status
register
DAT
RESET: UNDEFINED
[0:31] DAT The data received when reading information from the I-cache
The transfer begins with the word requested by the instruction unit (critical word
first), followed by any remaining words of the line, then by any remaining words at
the beginning of the line (wrap around). As the missed instruction is received from
the bus, it is immediately delivered to the instruction unit and also written to the
burst buffer.
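The critical-word-first, wrap-around order described above can be written as a one-line calculation. The sketch below is illustrative; the function name burst_word_order is hypothetical, and the four-word line length is taken from the burst description (four beats per burst, one word per beat).

```python
def burst_word_order(requested_word, words_per_line=4):
    """Return the order in which the words of a line arrive on a miss.

    The requested (critical) word comes first, followed by the remaining
    words up to the end of the line, then the words at the start of the line.
    """
    return [(requested_word + i) % words_per_line for i in range(words_per_line)]
```

For example, a miss on word 2 of a line gives the order 2, 3, 0, 1, and a miss on word 0 gives 0, 1, 2, 3.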
As subsequent instructions are received from the bus they are also written into the
burst buffer and, if needed, delivered to the instruction unit (stream hit) either di-
rectly from the bus or from the burst buffer. When the entire line resides in the burst
buffer, it is written to the cache array if the cache array is not busy with an instruc-
tion unit request.
Together with the missed word, an indication may arrive from the I-bus that the
memory device is non-cacheable. If such an indication is received, the line is writ-
ten only to the burst buffer and not to the cache. Instructions stored in the burst
buffer that originated in a cache-inhibited memory region are used only once before
being refetched. Refer to 5.4.8 Cache Inhibit for more information.
To minimize power consumption, the I-cache does not initiate a miss sequence in
most cases when the instruction is inside a predicted path. The I-cache evaluates
fetch requests to determine whether they are inside a predicted path. If a hit is de-
tected, the requested data is delivered to the processor. However, on a cache
miss, in most cases the cache-miss sequence is not initiated until the processor fin-
ishes the branch evaluation.
Most of the commands are executed immediately after the control register is written
and cannot generate any errors. When these commands are executed, there is no
need to check the error status in the ICCST.
The load & lock command may take longer and may generate errors. When exe-
cuting this command, the user needs to insert an isync instruction immediately af-
ter the I-cache command and check the error status in the ICCST after the isync
instruction. The error type bits in the ICCST are sticky, allowing the user to perform
a series of I-cache commands before checking the termination status. These bits
are set by hardware and cleared by software.
Only commands that are not executed immediately need to be followed by an
isync instruction for the hardware to perform them correctly. However, all com-
mands need to be followed by isync in order to make sure all fetches of instruc-
tions that follow the I-cache command in the program stream are affected by the I-
cache command.
Because the ICCST is a supervisor-level register, cache commands that require
setting bits in this register are accessible only at the supervisor privilege level
(MSR[PR] = 0). Attempting to write this register at the user privilege level results in
a program exception.
The CPU icbi instruction (discussed below) can be performed at the user privilege
level.
When the command is invoked, if MSR[PR] = 0, all valid lines in the cache, except
lines that are locked, are made invalid. As a result of this command, the LRU bit of
each set points to an unlocked way, or to way zero if neither way is locked. This
behavior is useful for initializing the I-cache out of reset.
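The effect of the invalidate all command on the cache state can be sketched with a small model. This is an assumption-laden illustration: the data layout (a list of sets, each with two ways and an "lru" index) and the function name invalidate_all are hypothetical, only the supervisor-level case (MSR[PR] = 0) is modelled, and the case where both ways are locked is left unchanged because the text does not describe it.

```python
def make_set(valid=(False, False), locked=(False, False), lru=0):
    """Build one two-way set for the model (hypothetical layout)."""
    ways = [{"valid": v, "locked": l} for v, l in zip(valid, locked)]
    return {"ways": ways, "lru": lru}


def invalidate_all(sets):
    """Invalidate every unlocked line and repoint each set's LRU."""
    for s in sets:
        unlocked = [i for i, way in enumerate(s["ways"]) if not way["locked"]]
        for i in unlocked:
            s["ways"][i]["valid"] = False
        if unlocked:
            # First unlocked way; this is way 0 when neither way is locked.
            s["lru"] = unlocked[0]
        # Both ways locked: not described in the text, so LRU is left as is.
```

Locked lines keep their valid bit in this model, so code that was loaded and locked earlier survives the command.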
The I-cache performs this instruction in one clock cycle. In order to calculate the
latency of this instruction accurately, bus latency should be taken into account.
The user needs to check the error type bits in the ICCST to determine if the oper-
ation completed properly or not. The load & lock command can generate two er-
rors:
• Type 1 — bus error in one of the cycles that fetches the line
• Type 2 — no place to lock. It is the responsibility of the user to make sure that
there is at least one unlocked way in the appropriate set.
The I-cache performs this instruction in one clock cycle. To calculate the latency of
this instruction accurately, bus latency should be taken into account.
To unlock the whole cache, set the unlock all command in the ICCST.
This command has no error cases that the user needs to check.
The I-cache performs this instruction in one clock cycle. To calculate the latency of
this instruction accurately, bus latency should be taken into account.
1. Unlock all locked lines containing code that originated in this memory region
2. Invalidate all lines containing code that originated in this memory region
3. Execute an isync instruction
If these steps are not followed, code from a cache-inhibited region could be left in-
side the cache, and a reference to a cache-inhibited region could result in a cache
hit. When a reference to a cache-inhibited region results in a cache hit, the data is
delivered to the processor from the cache, not from memory.
When the FREEZE signal is asserted, indicating that the processor is under debug,
all fetches from the cache are treated as if they were from cache-inhibited regions.
1. Write the address of the data to be read to the ICADR. Note that it is also
possible to read this register for debugging purposes.
2. Read the ICDAT
So that all parts of the I-cache can be accessed, the ICADR is divided into the fol-
lowing fields:
Table 5-5 ICADR Bits Function for the Cache Read Command
[0:17] 18 19 20 [21:27] [28:29] [30:31]
When the data array is read from, the 32 bits of the word selected by the ICADR
are placed in the target general-purpose register.
When the tag array is read, the 21 bits of the tag selected by the ICADR, along with
additional information, are placed in the target general-purpose register. Table 5-6
illustrates the bit layout of the I-cache data register when a tag is read.
Tag value    Reserved    0 = not valid    0 = not locked    LRU bit    Reserved
                         1 = valid        1 = locked
Performing a load & lock with such an on-chip memory is not advised. In most cas-
es the instruction will still be fetched from the on-chip memory, even though it is
also present in the I-cache.
In order to ensure proper operation of the I-cache after reset, the following actions
must be performed:
1. unlock all
2. invalidate all
3. cache enable
When FREEZE is asserted, the I-cache treats all misses as if they were from
cache-inhibited regions. That is, the misses are loaded only to the burst buffer and
the cache state therefore remains exactly the same (assuming the debug routine
is not in the I-cache). Notice that when FREEZE is asserted, cache hits are still
read from the array, and therefore the LRU bits are updated.
1. Save both ways of the sets that are needed for the debug routine by reading
the tag value, LRU bit value, valid bit value, and lock bit value.
2. Unlock the locked ways in the selected sets.
3. Use load & lock to load and lock the debug routine into the I-cache (load &
lock operates the same when FREEZE is asserted).
4. Run the debug routine. All accesses to it will result in hits.
After the debug routine is finished, the old state of the I-cache can be restored by
following these steps:
1. Unlock and invalidate all the sets that are used by the debug routine (both
ways).
2. Use load & lock to restore the old sets.
3. Unlock the ways that were not locked before.
4. In order to restore the old state of the LRU, make sure the last access (load
& lock or unlock) is performed on the MRU way (not the LRU way).
NOTE
If debug mode is enabled and the appropriate bit in the debug enable
register (DER) is set, recognition of an exception results in debug-
mode processing rather than normal exception processing. Refer to
SECTION 8 DEVELOPMENT SUPPORT for details.
• Reset
• Machine check
• Non-maskable internal (instruction and data) breakpoints
• Non-maskable external breakpoints
Unordered exceptions may be reported at any time and are not guaranteed to pre-
serve program state information. The processor can never recover from a reset ex-
ception. It can recover from other unordered exceptions in most cases. However,
if an unordered exception occurs during the servicing of a previous exception, the
machine state information in SRR0 and SRR1 (and, in some cases, the DAR and
DSISR) may not be recoverable; the processor may be in the process of saving or
restoring these registers.
To determine whether the machine state is recoverable, the user can read the RI
(recoverable exception) bit in SRR1. Refer to 6.5 Recovery from Exceptions for
details.
When a precise exception occurs, the processor backs the machine up to the in-
NOTE
In the RCPU implementation of the PowerPC architecture, the machine-check
exception is synchronous (i.e., it is assigned to the instruction
that caused it). In other PowerPC implementations, this
exception may be asynchronous.
Table 6-2 shows which precise exceptions are taken before the excepting instruc-
tion is executed, which are taken after, and which are taken after the instruction is
partially executed.
Others Before
• SRR0 addresses the instruction that would have completed if the exception
had not occurred.
• An exception is generated such that all instructions preceding the instruction
addressed by SRR0 appear to have completed with respect to the executing
processor.
Asynchronous, non-maskable exceptions can occur while other exceptions are be-
ing processed. If one of these exceptions occurs immediately after another excep-
tion, the state information saved by the first exception may be overwritten when the
second exception occurs. These exceptions are thus considered unordered. For
additional information, refer to 6.5.2 Recovery from Unordered Exceptions.
NOTE
The exception vectors shown in Table 6-3, up to and including the
floating-point assist exception (vector offset 0x00E00), are defined
by the PowerPC architecture. Exception vectors beginning with offset
0x01000 (software emulation exception in the RCPU) are reserved in
the PowerPC architecture for implementation-specific exceptions.
Exception Type    Vector Offset    Causing Conditions
System reset 0x00100 A reset exception results when the RESET input to the processor is asserted.
Machine check 0x00200 A machine check exception results when the TEA signal is asserted internally or
externally.
— 0x00300 Reserved. (In the PowerPC architecture, this exception vector is reserved for
data access exceptions.)
— 0x00400 Reserved. (In the PowerPC architecture, this exception vector is reserved for
instruction access exceptions.)
External interrupt 0x00500 An external interrupt occurs when the RCPU IRQ input signal is asserted.
Alignment 0x00600 An alignment exception is caused when the processor cannot perform a memory
access for one of the following reasons:
• The operand of a floating-point load or store is not word-aligned.
• The operand of a load- or store-multiple instruction is not word-aligned.
• The operand of lwarx or stwcx. is not word-aligned.
• In little-endian mode, an operand is not properly aligned.
• In little-endian mode, the processor attempts to execute a multiple or string
instruction.
Program 0x00700 A program exception is caused by one of the following exception conditions:
• Floating-point enabled exception — A floating-point enabled program
exception condition is generated when the following condition is met as a
result of a move to FPSCR instruction, move to MSR instruction, or return
from interrupt instruction:
(MSR[FE0] | MSR[FE1]) & FPSCR[FEX] = 1.
• Privileged instruction — A privileged instruction type program exception is
generated when the execution of a privileged instruction is attempted and
the MSR register user privilege bit, MSR[PR], is set. This exception is also
generated for mtspr or mfspr with an invalid SPR field if SPR[0]=1 and
MSR[PR]=1.
• Trap — A trap type program exception is generated when any of the
conditions specified in a trap instruction is met.
Decrementer 0x00900 The decrementer exception occurs when the most significant bit of the
decrementer (DEC) register changes from zero to one.
Reserved 0x00A00 —
Reserved 0x00B00 —
System call 0x00C00 A system call exception occurs when a system call (sc) instruction is executed.
Trace 0x00D00 A trace exception occurs if MSR[SE] = 1 and any instruction other than rfi is
successfully completed, or if MSR[BE] = 1 and a branch is completed.
To enable the processor to recover from an exception, a history buffer is used. This
buffer is a FIFO queue which records relevant machine state at the time of each
instruction issue. Instructions are placed on the tail of the queue when they are is-
sued and percolate to the head of the queue while they are in execution. Instruc-
tions remain in the queue until they complete execution (i.e., have completed the
writeback stage) and all preceding instructions have completed as well. In this way,
when an exception occurs, the machine state necessary to recover the architectur-
al state is available. As instructions complete execution, they are retired from the
queue, and the buffer storage is reclaimed for new instructions entering the queue.
[Figure: History Buffer Queue block diagram, with labels: issued instructions, history buffer queue, completed instructions, write back, retired instructions.]
An exception can be detected at any time during instruction execution and is re-
corded in the history buffer when the instruction finishes execution. The exception
is not recognized until the faulting instruction reaches the head of the history
queue. When the exception is recognized, exception processing begins. The
queue is reversed, and the machine is restored to its state at the time the instruc-
tion was issued. Machine state is restored at a maximum rate of two floating-point
and two integer instructions per clock cycle.
To correctly restore the architectural state, the history buffer must record the value
of the destination before the instruction is executed. The destination of a store in-
struction, however, is in memory. It is not practical for the processor to always read
memory before writing it. Therefore, stores issue immediately to store buffers, but
do not update memory until all previous instructions have completed execution
without exception, i.e., until the store has reached the head of the history buffer.
The history buffer has enough storage to hold a total of six instructions. Of these,
a maximum of four can be integer instructions (including integer load or store in-
structions), and a maximum of three can be floating-point instructions (including
floating-point loads or stores). If the buffer includes an instruction with long latency,
it is possible (if a data dependency does not occur first) for issued instructions to
fill up the history buffer. If so, instruction issue halts until the long-latency operation
retires (along with any instructions following it that are ready to retire). Instructions
that can cause the history buffer to fill up include floating-point arithmetic instruc-
tions, integer divide instructions, and instructions that affect or use resources ex-
ternal to the processor (e.g., load/store instructions).
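The capacity limits of the history buffer can be illustrated with a small queue model. This Python sketch is an assumption-based illustration, not the hardware algorithm: the class name HistoryBuffer and its methods are hypothetical, every instruction is classified as either "int" or "fp", and only the limits stated above (six entries in total, at most four integer, at most three floating-point) and in-order retirement from the head are modelled.

```python
from collections import deque

CAPACITY, MAX_INT, MAX_FP = 6, 4, 3


class HistoryBuffer:
    """FIFO model: issue at the tail, retire in order from the head."""

    def __init__(self):
        self.queue = deque()

    def can_issue(self, kind):
        if len(self.queue) >= CAPACITY:
            return False
        same_kind = sum(1 for entry in self.queue if entry["kind"] == kind)
        return same_kind < (MAX_INT if kind == "int" else MAX_FP)

    def issue(self, name, kind):
        """Place an instruction on the tail; return False if issue must halt."""
        if not self.can_issue(kind):
            return False
        self.queue.append({"name": name, "kind": kind, "done": False})
        return True

    def complete(self, name):
        """Mark an instruction as having finished the writeback stage."""
        for entry in self.queue:
            if entry["name"] == name:
                entry["done"] = True

    def retire(self):
        """Retire finished instructions from the head, preserving order."""
        retired = []
        while self.queue and self.queue[0]["done"]:
            retired.append(self.queue.popleft()["name"])
        return retired
```

An instruction that finishes early stays in the queue until everything ahead of it has finished, which is how a long-latency instruction at the head can cause the buffer to fill and instruction issue to halt.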
If the instruction is not one of those listed above, it and all subsequent instructions
are flushed from the buffer as if they were never executed at all.
The exceptions are listed in Table 6-5 in order of highest to lowest priority.
Asynchronous, non-maskable
1 Non-maskable external breakpoint — This exception has the highest priority and is
taken immediately, regardless of other pending exceptions or whether the machine
state is recoverable.
2 Reset — The reset exception has the second-highest priority and is taken
immediately, regardless of other pending exceptions (except for the non-maskable
external breakpoint exception) or whether the machine state is recoverable.
Asynchronous, maskable
4 Peripheral or external maskable breakpoint request — When this exception type
occurs, the processor retires as many instructions as possible (i.e., all instructions
that have completed the writeback stage without generating an exception, provided
all instructions ahead of them in the history buffer have also completed the writeback
stage without generating an exception). Then, depending on the instruction
currently at the head of the history buffer, the processor either flushes the history
buffer or allows the instruction at the head of the buffer to retire before generating
an exception. Refer to 6.4 Implementation of Asynchronous Exceptions.
5 External interrupt — When this exception type occurs, the processor retires as
many instructions as possible (i.e., all instructions that have completed the
writeback stage without generating an exception, provided all instructions ahead of
them in the history buffer have also completed the writeback stage without generating
an exception). Then, depending on the instruction currently at the head of the
history buffer, the processor either flushes the history buffer or allows the instruction
at the head of the buffer to retire before generating an exception (provided a higher
priority exception does not exist). Refer to 6.4 Implementation of Asynchronous
Exceptions. This exception is delayed if MSR[EE] is cleared.
6 Decrementer — This exception is the lowest priority exception. When this exception
type occurs, the processor retires as many instructions as possible (i.e., all
instructions that have completed the writeback stage without generating an
exception, provided all instructions ahead of them in the history buffer have also
completed the writeback stage without generating an exception). Then, depending
on the instruction currently at the head of the history buffer, the processor either
flushes the history buffer or allows the instruction at the head of the buffer to retire
before generating an exception (provided a higher priority exception does not exist).
Refer to 6.4 Implementation of Asynchronous Exceptions. This exception is
delayed if MSR[EE] is cleared.
Only one synchronous, precise exception can be reported at a time. If single in-
structions generate multiple exception conditions, the processor handles the ex-
ception it encounters first; then the execution of the excepting instruction continues
until the next excepting condition is encountered. Table 6-6 lists the order in which
synchronous exceptions are detected.
1     Trace (note 1)
3     I-bus breakpoint
5     Floating-point unavailable
6     Privileged instruction (note 2)
      Alignment exception
      System call
      Trap
10    L-bus breakpoint
NOTES:
1. The trace mechanism is implemented by letting one instruction complete as if no trace
were enabled and then trapping the second instruction. Trace has the highest priority
of exceptions associated with this second instruction.
2. All of these cases are mutually exclusive for any one instruction.
SRR1 is a 32-bit register used to save machine status (the contents of the MSR)
on exceptions and to restore machine status when rfi is executed.
The data address register (DAR) is a 32-bit register used by alignment exceptions
to identify the address of a memory element.
RESERVED ILE
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
EE PR FP ME FE0 SE BE FE1 0 IP IR DR 0 RI LE
RESET:
0 0 0 U 0 0 0 0 0 * 0 0 0 0 0 0
*The reset value of this bit depends on the value of the internal data bus configuration word at reset. Refer to the
System Interface Unit Reference Manual (SIURM/AD).
[0:14] — Reserved
15 ILE Exception little-endian mode. When an exception occurs, this bit is copied into MSR[LE] to se-
lect the endian mode for the context established by the exception.
0 Processor runs in big-endian mode during exception processing.
1 Processor runs in little-endian mode during exception processing.
17 PR Privilege level
0 The processor can execute both user- and supervisor-level instructions.
1 The processor can only execute user-level instructions.
18 FP Floating-point available
0 The processor prevents dispatch of floating-point instructions, including floating-point
loads, stores and moves. Floating-point enabled program exceptions can still occur and
the FPRs can still be accessed.
1 The processor can execute floating-point instructions, and can take floating-point enabled
exception type program exceptions.
24 — Reserved
25 IP Exception prefix. The setting of this bit determines the location of the exception vector table.
0 Exceptions are vectored to the physical address 0x0000 0000 plus vector offset.
1 Exceptions are vectored to the physical address 0xFFF0 0000 plus vector offset.
[26:29] — Reserved
30 RI Recoverable exception
0 Exception is not recoverable.
1 Exception is recoverable.
31 LE Little-endian mode
0 Processor operates in big-endian mode during normal processing.
1 Processor operates in little-endian mode during normal processing.
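Because PowerPC numbers bit 0 as the most significant bit, extracting an MSR field from a 32-bit value requires shifting by 31 minus the bit number. The following Python sketch shows one way to do this; the names MSR_BITS, msr_bit, and decode_msr are hypothetical, and the bit positions are those given in the MSR layout above.

```python
# PowerPC bit numbers (bit 0 is the most significant bit of the 32-bit MSR).
MSR_BITS = {
    "ILE": 15, "EE": 16, "PR": 17, "FP": 18, "ME": 19, "FE0": 20,
    "SE": 21, "BE": 22, "FE1": 23, "IP": 25, "RI": 30, "LE": 31,
}


def msr_bit(msr, name):
    """Return the value (0 or 1) of the named MSR bit."""
    return (msr >> (31 - MSR_BITS[name])) & 1


def decode_msr(msr):
    """Return a dictionary of all named MSR bits."""
    return {name: msr_bit(msr, name) for name in MSR_BITS}
```

For example, msr_bit(0x00004000, "PR") is 1 (user privilege level), and msr_bit(0x00000040, "IP") is 1 (vector table at 0xFFF0 0000).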
FE[0:1] Mode
MSR[16:31] are guaranteed to be written to SRR1 when the first instruction of the
exception handler is encountered.
Table 6-9 shows the MSR bit settings when the processor changes to supervisor
mode.
Exception Type    EE   PR   FP   ME   FE0  SE   BE   FE1  IP   RI   LE
(MSR bit)         16   17   18   19   20   21   22   23   25   30   31
Reset             0    0    0    —    0    0    0    0    (2)  0    0
All others        0    0    0    —    0    0    0    0    —    0    (3)
NOTES:
1. — indicates that bit is not altered.
2. Depends on value of internal data bus configuration word at reset.
3. Contains value of MSR[ILE] prior to exception.
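The "All others" row of Table 6-9 can be expressed as a short transformation of the MSR. The sketch below is illustrative and rests on stated assumptions: the function name enter_exception is hypothetical, MSR[0:15] (including ILE) is assumed to be preserved, bits of MSR[16:31] not listed in the table are assumed to be cleared, and exception-specific differences (for example, a different ME setting for a particular exception) are not modelled.

```python
def mask(bit):
    """Mask for a PowerPC-numbered bit (bit 0 is the most significant)."""
    return 1 << (31 - bit)


ILE, ME, IP, LE = 15, 19, 25, 31


def enter_exception(msr):
    """Return (SRR1[16:31], new MSR) for the 'All others' row of Table 6-9."""
    srr1_low = msr & 0xFFFF                  # MSR[16:31] is saved in SRR1[16:31]
    new_msr = msr & 0xFFFF0000               # assumption: MSR[0:15] is preserved
    new_msr |= msr & (mask(ME) | mask(IP))   # ME and IP are not altered
    if msr & mask(ILE):
        new_msr |= mask(LE)                  # LE takes the prior value of ILE
    return srr1_low, new_msr
```

Starting from an MSR of 0x0001D042 (ILE, EE, PR, ME, IP, and RI set), the model saves 0xD042 and produces a new MSR of 0x00011041: EE, PR, and RI are cleared, ME and IP are kept, and LE is set from ILE.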
The data storage and interrupt status register (DSISR) identifies the cause of the
error in case of an exception caused by a load or a store. The DSISR is loaded with
the instruction information as described in 6.11.4 Alignment Exception
(0x00600).
The breakpoint address register (BAR) indicates the address on which an L-bus
breakpoint occurs. For multi-cycle instructions, the BAR contains the address of
the first cycle associated with the breakpoint. The BAR has a valid value only when
a data breakpoint exception is taken; at all other times its value is boundedly unde-
fined.
Table 6-10 summarizes the values in DAR, BAR, and DSISR following an excep-
tion.
• The sync instruction, to resolve any data dependencies between the process-
es and to synchronize the use of SPRs.
• The isync instruction, to ensure that undispatched instructions not in the new
process are not used by the new process.
• The stwcx. instruction, to clear any outstanding reservations, which ensures
that an lwarx instruction in the old process is not paired with an stwcx. in the
new process.
Note that if an exception handler is used to emulate an instruction that is not imple-
mented, the exception handler must report in SRR0 the EA computed by the in-
struction being emulated and not one used to emulate the instruction.
[Figure: Exception latency time line. Marked points: A, faulting instruction issue; kill pipeline; C, start fetch of handler; D ≤ B + 3 clocks; E, first instruction of handler issued.]
At time-point A the excepting instruction issues and begins execution. During the
interval A-B previously issued instructions are finishing execution. The interval A-
B is equivalent to the time required for all instructions currently in progress to
complete (i.e., the time to serialize the machine).
At time-point B the excepting instruction has reached the head of the history queue,
implying that all instructions preceding it in the code stream have finished execu-
tion without generating any exception. In addition, the excepting instruction itself
has completed execution. At this time the exception is recognized, and exception
processing begins. If at this point the instruction had not generated an exception,
it would have been retired.
During the interval B-D the machine state is being restored. This can take up to
three clock cycles.
At time-point C the processor starts fetching the first instruction of the exception
handler.
By time-point D the state of the machine prior to the issue of the excepting instruc-
tion has been restored. During interval D-E, the machine is saving context informa-
tion in SRR0 and SRR1, disabling interrupts, placing the machine in privileged
mode, and may continue the process of fetching the first instructions of the interrupt
handler from the vector table.
At time-point E the MSR and instruction pointer of the executing process have been
saved and control has been transferred to the exception handler routine.
The interval D-E requires a minimum of one clock cycle. The interval C-E depends
on the memory system. This interval is the time it takes to fetch the first instruction
of the exception handler. For a full history buffer, it is no less than two clocks.
Table 6-12 shows the state of the machine just before it fetches the first instruction
after reset. Registers not listed are not affected by reset.
SRR0 Undefined
SRR1 Undefined
SRR0 Set to the effective address of the instruction that caused the interrupt.
SRR1 0 Cleared
1 Set for instruction-fetch related errors, cleared for load-store related errors
[2:15] Cleared
[16:31] Loaded from MSR[16:31].
MSR IP No change
ME Cleared to zero
LE Set to value of ILE bit prior to the exception
Other bits Cleared
DSISR (L-bus case [15:16] Set to bits [29:30] of the instruction if X-form
only) Set to 0b00 if D-form
17 Set to bit 25 of the instruction if X-form
Set to bit 5 of the instruction if D-form
[22:31] Set to bits [6:15] of the instruction
DAR (L-bus case Set to the effective address of the data access that caused the exception.
only)
The interrupt may be delayed by other higher priority exceptions or if the MSR[EE]
bit is cleared.
The register settings for the external interrupt exception are shown in Table 6-15.
SRR0 Set to the effective address of the instruction that the processor would have attempted to execute next
if no interrupt conditions were present.
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared
Alignment exceptions use the SRR0 and SRR1 to save the machine state and the
DSISR to determine the source of the exception.
The register settings for alignment exceptions are shown in Table 6-16.
SRR0 Set to the effective address of the instruction that caused the exception.
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared
• Load or store
• Length (half word, word, or double word)
• String, multiple, or normal load/store
Table 6-17 shows how the DSISR bits identify the instruction that caused the ex-
ception.
DSISR[15:21] Instruction
00 0 0000 lwarx, lwz, reserved (see note 1)
00 0 0010 stw
00 0 0100 lhz
00 0 0101 lha
00 0 0110 sth
00 0 0111 lmw
00 0 1000 lfs
00 0 1001 lfd
00 0 1010 stfs
00 0 1011 stfd
00 1 0000 lwzu
00 1 0010 stwu
00 1 0100 lhzu
00 1 0101 lhau
00 1 0110 sthu
00 1 0111 stmw
00 1 1000 lfsu
00 1 1001 lfdu
00 1 1010 stfsu
00 1 1011 stfdu
01 0 1000 lswx
01 0 1001 lswi
01 0 1010 stswx
01 0 1011 stswi
01 1 0101 lwaux
10 0 0010 stwcx.
10 0 1000 lwbrx
10 0 1010 stwbrx
10 0 1100 lhbrx
10 0 1110 sthbrx
11 0 0000 lwzx
11 0 0010 stwx
11 0 0100 lhzx
11 0 0101 lhax
11 0 0110 sthx
11 0 1000 lfsx
11 0 1001 lfdx
11 0 1010 stfsx
11 0 1011 stfdx
11 1 0000 lwzux
11 1 0010 stwux
11 1 0100 lhzux
11 1 0101 lhaux
11 1 0110 sthux
11 1 1000 lfsux
11 1 1001 lfdux
11 1 1010 stfsux
11 1 1011 stfdux
NOTES:
1. The instructions lwz and lwarx give the same DSISR bits (all zero). But if
lwarx causes an alignment exception, it is an invalid form, so it need not
be emulated in any precise way. It is adequate for the alignment exception
handler to simply emulate the instruction as if it were an lwz. It is important
that the emulator use the address in the DAR, rather than computing it
from rA/rB/D, because lwz and lwarx use different addressing modes.
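An alignment exception handler typically starts by decoding the DSISR. The following Python sketch shows how such a decode might look; it is not taken from the manual. The names DSISR_INSTRUCTIONS and decode_alignment_dsisr are hypothetical, the lookup table is only a subset of Table 6-17, and the split of DSISR[22:31] into a 5-bit rS/rD field and a 5-bit rA field assumes the usual placement of those fields in instruction bits [6:10] and [11:15].

```python
# Subset of Table 6-17, keyed by DSISR[15:21] written as in the table.
DSISR_INSTRUCTIONS = {
    "00 0 0000": "lwz (or lwarx)",
    "00 0 0010": "stw",
    "00 0 0100": "lhz",
    "00 0 0110": "sth",
    "00 0 0111": "lmw",
    "00 0 1001": "lfd",
    "00 1 0111": "stmw",
    "10 0 0010": "stwcx.",
    "11 0 0000": "lwzx",
    "11 0 0010": "stwx",
}


def decode_alignment_dsisr(dsisr):
    """Decode instruction type and register fields from a DSISR value."""
    code = (dsisr >> 10) & 0x7F   # DSISR[15:21]; bit 0 is the MSB
    key = f"{code >> 5:02b} {(code >> 4) & 1:b} {code & 0xF:04b}"
    return {
        "instruction": DSISR_INSTRUCTIONS.get(key, "not in this subset"),
        "rS_or_rD": (dsisr >> 5) & 0x1F,   # DSISR[22:26]
        "rA": dsisr & 0x1F,                # DSISR[27:31]
    }
```

A handler built this way would take the operand address from the DAR rather than recomputing it from the register fields, as note 1 recommends.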
Note that only one of bits 11, 13, and 14 can be set.
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared to zero
The register settings for floating-point unavailable exceptions are shown in Table
6-19.
SRR0 Set to the effective address of the instruction that caused the exception.
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared
• Loading a GPR from the decrementer does not affect the decrementer.
• Storing a GPR value to the decrementer replaces the value in the decrementer
with the value in the GPR.
• Whenever bit 0 of the decrementer changes from zero to one, an exception
request is signaled. If multiple decrementer exception requests are received
before the first can be reported, only one exception is reported. The occur-
rence of a decrementer exception cancels the request.
• If the decrementer is altered by software and if bit 0 is changed from zero to
one, an interrupt request is signaled.
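The rules above can be pictured with a small software model. This is only a sketch: the structure and function names are hypothetical, and the model assumes a 32-bit decrementer whose bit 0 is the most significant bit and which counts down by one per tick.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical software model of the decrementer, not a hardware description. */
typedef struct {
    uint32_t value;   /* current decrementer contents */
    bool     request; /* pending decrementer exception request */
} dec_model_t;

/* Common update path: a 0 -> 1 transition of bit 0 (the MSB) raises a request.
 * Repeated requests before the exception is taken collapse into one. */
void dec_update(dec_model_t *d, uint32_t new_value)
{
    bool old_bit0 = ((d->value >> 31) & 1u) != 0;
    bool new_bit0 = ((new_value >> 31) & 1u) != 0;
    if (!old_bit0 && new_bit0) {
        d->request = true;
    }
    d->value = new_value;
}

/* One count: the decrementer counts down by one. */
void dec_tick(dec_model_t *d) { dec_update(d, d->value - 1u); }

/* Storing a GPR value replaces the decrementer contents. */
void dec_write(dec_model_t *d, uint32_t gpr) { dec_update(d, gpr); }

/* Loading a GPR from the decrementer has no side effect. */
uint32_t dec_read(const dec_model_t *d) { return d->value; }

/* Taking the exception cancels the request. */
bool dec_take_exception(dec_model_t *d)
{
    bool pending = d->request;
    d->request = false;
    return pending;
}
```

In this model a tick from 0x00000000 to 0xFFFFFFFF and a software write of a value with bit 0 set over a value with bit 0 clear both raise the same single request.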
The register settings for the decrementer exception are shown in Table 6-20.
SRR0 Set to the effective address of the instruction that the processor would have attempted to execute next
if no exception conditions were present.
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared to zero
SRR0 Set to the effective address of the instruction following the System Call instruction
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared to zero
Monitor or debugger software must change the vectors of other possible
exception addresses in order to single-step such instructions. If this is not
desirable, other debugging features can be used. Refer to SECTION 8 DEVELOPMENT
SUPPORT for more information.
SRR0 Set to the effective address of the instruction following the executed instruction
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared to zero
When a trace exception is taken, execution resumes at offset 0x00D00 from the
base address indicated by MSR[IP].
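A minimal sketch of the vector address calculation follows. It assumes the usual 32-bit PowerPC convention, which is not restated in this passage: MSR[IP] is bit 25 of the MSR, a cleared IP bit selects base 0x00000000, and a set IP bit selects base 0xFFF00000. The function name is hypothetical.

```c
#include <stdint.h>

/* MSR[IP] is assumed to be bit 25 in big-endian bit numbering (mask 0x40). */
#define MSR_IP (UINT32_C(1) << (31 - 25))

/* Return the address at which execution resumes for a given vector offset. */
uint32_t exception_vector(uint32_t msr, uint32_t offset)
{
    uint32_t base = (msr & MSR_IP) ? UINT32_C(0xFFF00000) : UINT32_C(0x00000000);
    return base + offset;
}
```

Under these assumptions, the trace exception (offset 0x00D00) resumes at 0x00000D00 when MSR[IP] is cleared and at 0xFFF00D00 when it is set.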
SRR0 Set to the effective address of the instruction that caused the exception
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared to zero
[Figure: floating-point architecture of the RCPU. Labeled elements: floating-point
unit hardware (regular operation), software envelope, user program, enabled
exception, interrupt, IEEE result, FPSCR fields (FX, FEX, VX, OX, UX, ZX, XX,
VXSNAN, VXISI, VXIDI, VXZDZ, VXIMZ, VXVC, VXSQRT, VXCVI, FR, FI, C, FPCC), and
FPECR fields (TR, DNA, DNB, DNC).]
DOUBLE NORMALIZED
NOTES:
1. The results in all cases of programming errors are boundedly undefined.
2. When used by a single precision instruction, generates correct result only if bits [35:63] of the operand equal
zero, otherwise it is a programming error.
3. Since the result is tiny, a floating-point assist exception is taken at the end of instruction execution.
Except when the result is tiny or when denormalized operands are detected, the
results generated by the hardware in SIE mode are practically all that is needed in
order to complete the operation according to the IEEE standard. Therefore, in most
cases after executing the instruction in SIE mode all that is needed by the software
is to issue rfi. Upon execution of the rfi, the hardware restores the previous value
of the MSR, as it was saved in SRR1. If as a result ((MSR[FE0] | MSR[FE1]) &
FPSCR[FEX]) is set, a program exception is generated.
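The condition checked after rfi can be written out as a small predicate. This is a sketch only: the function name is hypothetical, and the MSR bit positions (FE0 = bit 20, FE1 = bit 23) are taken from the standard 32-bit PowerPC MSR layout rather than from this passage.

```c
#include <stdint.h>
#include <stdbool.h>

/* Big-endian bit numbering: bit 0 is the MSB of a 32-bit register. */
#define BE_BIT(n)  (UINT32_C(1) << (31 - (n)))

#define MSR_FE0    BE_BIT(20)  /* assumed position of MSR[FE0] */
#define MSR_FE1    BE_BIT(23)  /* assumed position of MSR[FE1] */
#define FPSCR_FEX  BE_BIT(1)   /* FPSCR[FEX], bit 1 */

/* True when ((MSR[FE0] | MSR[FE1]) & FPSCR[FEX]) is set, that is, when the
 * restored MSR enables floating-point exceptions and an enabled exception
 * is pending in the FPSCR. */
bool fp_program_exception_pending(uint32_t restored_msr, uint32_t fpscr)
{
    bool fe_enabled = (restored_msr & (MSR_FE0 | MSR_FE1)) != 0;
    bool fex_set    = (fpscr & FPSCR_FEX) != 0;
    return fe_enabled && fex_set;
}
```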
When the result is tiny and the floating-point underflow exception is disabled
(FPSCR[UE] = 0), the hardware in SIE mode delivers the same result as when the
exception is enabled (FPSCR[UE] = 1), that is, the rounded mantissa with the
exponent adjusted by adding 192 for single precision or 1536 for double
precision. This intermediate result simplifies the task of the emulation
routine that finishes the instruction execution and delivers the correct IEEE
result. In this case the software envelope is also responsible for updating the
floating-point underflow exception bit (FPSCR[UX]).
When at least one of the source operands is denormalized and the hardware
cannot complete the operation, the destination register value is unchanged. In
this case, the software emulation routine must execute the instruction in
software, deliver the result to the destination register, and update the FPSCR.
[Register diagram: FPECR. Bit 0 is SIE; the reset value of every bit is 0.]
[1:27] — Reserved
NOTE
Software must insert a sync instruction before reading the FPECR.
[Register diagram: FPSCR. The reset value of every bit is 0.]
0 FX Floating-point exception summary (FX). Every floating-point instruction implicitly sets
FPSCR[FX] if that instruction causes any of the floating-point exception bits in the FPSCR to
change from zero to one. The mcrfs instruction implicitly clears FPSCR[FX] if the FPSCR field
containing FPSCR[FX] is copied. The mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions can set
or clear FPSCR[FX] explicitly. This is a sticky bit.
1 FEX Floating-point enabled exception summary (FEX). This bit signals the occurrence of any of the
enabled exception conditions. It is the logical OR of all the floating-point exception bits masked
with their respective enable bits. The mcrfs instruction implicitly clears FPSCR[FEX] if the
result of the logical OR described above becomes zero. The mtfsf, mtfsfi, mtfsb0, and mtfsb1
instructions cannot set or clear FPSCR[FEX] explicitly. This is not a sticky bit.
2 VX Floating-point invalid operation exception summary (VX). This bit signals the occurrence of any
invalid operation exception. It is the logical OR of all of the invalid operation exceptions. The
mcrfs instruction implicitly clears FPSCR[VX] if the result of the logical OR described above
becomes zero. The mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions cannot set or clear
FPSCR[VX] explicitly. This is not a sticky bit.
3 OX Floating-point overflow exception (OX). This is a sticky bit. See 6.11.10.8 Overflow Exception
Condition.
4 UX Floating-point underflow exception (UX). This is a sticky bit. See 6.11.10.9 Underflow Excep-
tion Condition.
5 ZX Floating-point zero divide exception (ZX). This is a sticky bit. See 6.11.10.7 Zero Divide Ex-
ception Condition.
6 XX Floating-point inexact exception (XX). This is a sticky bit. See 6.11.10.10 Inexact Exception
Condition.
7 VXSNAN Floating-point invalid operation exception for SNaN (VXSNAN). This is a sticky bit. See
6.11.10.6 Invalid Operation Exception Conditions.
8 VXISI Floating-point invalid operation exception for ∞ - ∞ (VXISI). This is a sticky bit. See 6.11.10.6
Invalid Operation Exception Conditions.
9 VXIDI Floating-point invalid operation exception for ∞/∞ (VXIDI). This is a sticky bit. See 6.11.10.6
Invalid Operation Exception Conditions.
10 VXZDZ Floating-point invalid operation exception for 0/0 (VXZDZ). This is a sticky bit. See 6.11.10.6
Invalid Operation Exception Conditions.
11 VXIMZ Floating-point invalid operation exception for ∞*0 (VXIMZ). This is a sticky bit. See 6.11.10.6
Invalid Operation Exception Conditions.
12 VXVC Floating-point invalid operation exception for invalid compare (VXVC). This is a sticky bit. See
6.11.10.6 Invalid Operation Exception Conditions.
13 FR Floating-point fraction rounded (FR). The last floating-point instruction that potentially rounded
the intermediate result incremented the fraction. (See 3.3.11 Rounding.) This bit is not sticky.
14 FI Floating-point fraction inexact (FI). The last floating-point instruction that potentially rounded
the intermediate result produced an inexact fraction or a disabled exponent overflow. (See
3.3.11 Rounding.) This bit is not sticky.
[15:19] FPRF Floating-point result flags (FPRF). This field is based on the value placed into the target register
even if that value is undefined. Refer to Table 6-27 for specific bit settings.
15 Floating-point result class descriptor (C). Floating-point instructions other than the
compare instructions may set this bit with the FPCC bits, to indicate the class of
the result.
[16:19] Floating-point condition code (FPCC). Floating-point compare instructions always
set one of the FPCC bits to one and the other three FPCC bits to zero. Other
floating-point instructions may set the FPCC bits with the C bit, to indicate the class
of the result. Note that in this case the high-order three bits of the FPCC retain their
relational significance indicating that the value is less than, greater than, or equal
to zero.
16 Floating-point less than or negative (FL or <)
17 Floating-point greater than or positive (FG or >)
18 Floating-point equal or zero (FE or =)
19 Floating-point unordered or NaN (FU or ?)
20 — Reserved
21 VXSOFT Floating-point invalid operation exception for software request (VXSOFT). This bit can be
altered only by the mcrfs, mtfsfi, mtfsf, mtfsb0, or mtfsb1 instructions. The purpose of
VXSOFT is to allow software to cause an invalid operation condition for a condition that is not
necessarily associated with the execution of a floating-point instruction. For example, it might
be set by a program that computes a square root if the source operand is negative. This is a
sticky bit. See 6.11.10.6 Invalid Operation Exception Conditions.
22 VXSQRT Floating-point invalid operation exception for invalid square root (VXSQRT). This is a sticky bit.
This bit is provided so that software can simulate fsqrt and frsqrte, and to provide a consistent
interface for handling exceptions caused by square-root operations. See 6.11.10.6 Invalid
Operation Exception Conditions.
23 VXCVI Floating-point invalid operation exception for invalid integer convert (VXCVI). This is a sticky
bit. See 6.11.10.6 Invalid Operation Exception Conditions.
24 VE Floating-point invalid operation exception enable (VE). See 6.11.10.6 Invalid Operation Ex-
ception Conditions.
25 OE Floating-point overflow exception enable (OE). See 6.11.10.8 Overflow Exception Condition.
26 UE Floating-point underflow exception enable (UE). This bit should not be used to determine
whether denormalization should be performed on floating-point stores. See 6.11.10.9 Under-
flow Exception Condition.
27 ZE Floating-point zero divide exception enable (ZE). See 6.11.10.7 Zero Divide Exception Con-
dition.
28 XE Floating-point inexact exception enable (XE). See 6.11.10.10 Inexact Exception Condition.
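The definitions of the FEX and VX summary bits in the table above can be expressed directly in code. The sketch below recomputes both bits from the individual exception and enable bits; the macro and function names are hypothetical, and bit 0 is taken as the most significant bit of the 32-bit FPSCR image.

```c
#include <stdint.h>

/* Big-endian bit numbering: FPSCR bit n corresponds to mask 1 << (31 - n). */
#define FPSCR_BIT(n) (UINT32_C(1) << (31 - (n)))

/* Recompute the non-sticky summary bits FEX (bit 1) and VX (bit 2). */
uint32_t fpscr_update_summaries(uint32_t fpscr)
{
    /* VX is the OR of all invalid operation exception bits:
     * VXSNAN..VXVC (bits 7-12) and VXSOFT, VXSQRT, VXCVI (bits 21-23). */
    const uint32_t vx_sources =
        FPSCR_BIT(7)  | FPSCR_BIT(8)  | FPSCR_BIT(9)  |
        FPSCR_BIT(10) | FPSCR_BIT(11) | FPSCR_BIT(12) |
        FPSCR_BIT(21) | FPSCR_BIT(22) | FPSCR_BIT(23);
    int vx  = (fpscr & vx_sources) != 0;
    int fex = 0;
    int i;

    /* VX is masked by VE (bit 24). */
    if (vx && (fpscr & FPSCR_BIT(24))) {
        fex = 1;
    }
    /* OX, UX, ZX, XX (bits 3-6) are masked by OE, UE, ZE, XE (bits 25-28). */
    for (i = 0; i < 4; i++) {
        if ((fpscr & FPSCR_BIT(3 + i)) && (fpscr & FPSCR_BIT(25 + i))) {
            fex = 1;
        }
    }

    fpscr &= ~(FPSCR_BIT(1) | FPSCR_BIT(2));
    if (vx)  fpscr |= FPSCR_BIT(2);
    if (fex) fpscr |= FPSCR_BIT(1);
    return fpscr;
}
```

For example, an FPSCR image with only ZX set leaves FEX clear, while the same image with ZE also set yields FEX = 1.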
Table 6-27 illustrates the floating-point result flags that correspond to
FPSCR[15:19].
01001 –Infinity
10010 –Zero
00010 + Zero
00101 +Infinity
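As a rough illustration of the FPRF encoding, the sketch below classifies a C double and returns the five-bit value (C followed by the four FPCC bits). The infinity and zero encodings are the rows shown above; the encodings for NaN, normalized, and denormalized values are taken from the PowerPC architecture definition of FPRF and are stated here as an assumption, since those rows do not appear in the listing above. The function name is hypothetical.

```c
#include <math.h>

/* Return the 5-bit FPRF value (C, FL, FG, FE, FU) for a result value. */
unsigned fprf_of(double x)
{
    int neg = signbit(x) != 0;

    switch (fpclassify(x)) {
    case FP_INFINITE:  return neg ? 0x09u : 0x05u; /* 01001 / 00101 */
    case FP_ZERO:      return neg ? 0x12u : 0x02u; /* 10010 / 00010 */
    case FP_NAN:       return 0x11u;               /* 10001, quiet NaN (assumed row) */
    case FP_SUBNORMAL: return neg ? 0x18u : 0x14u; /* 11000 / 10100 (assumed rows) */
    default:           return neg ? 0x08u : 0x04u; /* 01000 / 00100 (assumed rows) */
    }
}
```

Note how the FL, FG, and FE bits keep their relational meaning in every encoding: negative values set FL, positive values set FG, and zeros set FE.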
The following conditions cause floating-point assist exceptions when the corre-
sponding enable bit in the FPSCR is set and the FE field in the MSR has a nonzero
value (enabling floating-point exceptions). These conditions may occur during ex-
ecution of floating-point arithmetic instructions. The corresponding status bits in the
For the remaining kinds of exception conditions, a result is generated and written
to the destination specified by the instruction causing the exception. The result may
be a different value for the enabled and disabled conditions for some of these ex-
ception conditions. The kinds of exception conditions that deliver a result are the
following:
The IEEE standard specifies the handling of exception conditions in terms of traps
and trap handlers. In the PowerPC architecture, setting an FPSCR exception en-
able bit causes generation of the result value specified in the IEEE standard for the
trap enabled case — the expectation is that the exception is detected by software,
which will revise the result. An FPSCR exception enable bit of zero causes gener-
ation of the default result value specified for the trap disabled (or no trap occurs or
trap is not implemented) case — the expectation is that the exception will not be
detected by software, which will simply use the default result. The result to be de-
livered in each case for each exception is described in the following sections.
The IEEE default behavior when an exception occurs, which is to generate a de-
fault value and not to notify software, is obtained by clearing all FPSCR exception
enable bits and using ignore exceptions mode (see Table 6-8). In this case the sys-
tem floating-point assist error handler is not invoked, even if floating-point excep-
If the program exception handler is to notify software that a given exception
condition has occurred, the corresponding FPSCR exception enable bit must be set
and a mode other than ignore exceptions mode must be used. In this case the
system floating-point assist error handler is invoked if an enabled
floating-point exception condition occurs.
FE[0:1] Mode
01, 10, 11 Floating-point precise mode — The system floating-point assist error
handler is invoked precisely at the instruction that caused the enabled
exception.
• If the IEEE default results are acceptable to the application, FE0 and FE1
should be cleared (ignore exceptions mode). All FPSCR exception enable bits
should be cleared.
• For even faster operation, non-IEEE mode can be selected by setting the NI bit
in the FPSCR. To ensure that the software envelope is never invoked, select
non-IEEE mode, disable all floating-point exceptions, and avoid using
denormalized numbers as input to floating-point calculations. Refer to 3.4.3
Non-IEEE Operation and 3.4.4 Working Without the Software Envelope for
more information.
• Ignore exceptions mode should not, in general, be used when any FPSCR ex-
ception enable bits are set.
• Any operation except load, store, move, select, or mtfsf on a signaling NaN
(SNaN)
• For add or subtract operations, magnitude subtraction of infinities (∞ - ∞)
• Division of infinity by infinity (∞/∞)
• Division of zero by zero (0/0)
• Multiplication of infinity by zero (∞*0)
• Ordered comparison involving a NaN (invalid compare)
• Square root or reciprocal square root of a negative, non-zero number (invalid
square root)
• Integer convert involving a number that is too large to be represented in the
format, an infinity, or a NaN (invalid integer convert)
• The same status bits are set in the FPSCR as when the exception is enabled.
• If the operation is an arithmetic operation,
— the target FPR is set to a quiet NaN
— FPSCR[FR] and FPSCR[FI] are cleared
— FPSCR[FPRF] is set to indicate the class of the result (quiet NaN)
• If the operation is a convert to 32-bit integer operation, the target FPR is set
as follows:
— FRT[0:31] = undefined
— FRT[32:63] = most negative 32-bit integer
— FPSCR[FR] and FPSCR[FI] are cleared
— FPSCR[FPRF] is undefined
• If the operation is a convert to 64-bit integer operation, the target FPR is set
as follows:
— FRT[0:63] = most negative 64-bit integer
— FPSCR[FR] and FPSCR[FI] are cleared
— FPSCR[FPRF] is undefined
• If the operation is a compare,
— The FR, FI, and C bits in the FPSCR are unchanged
— FPSCR[FPCC] is set to reflect unordered
• If software explicitly requests the exception, the FR, FI and FPRF fields in the
FPSCR are as set by the mtfsfi, mtfsf, or mtfsb1 instruction.
If the intermediate result is tiny and the underflow exception condition enable bit is
cleared (FPSCR[UE] = 0), the intermediate result is denormalized.
Loss of accuracy is detected when the delivered result value differs from what
would have been computed were both the exponent range and precision
unbounded.
When an underflow exception occurs, the action to be taken depends on the setting
of the underflow exception condition enable bit of the FPSCR.
The FR and FI bits in the FPSCR allow the system floating-point enabled exception
error handler, when invoked because of an underflow exception condition, to sim-
ulate a trap disabled environment. That is, the FR and FI bits allow the system float-
ing-point enabled exception error handler to unround the result, thus allowing the
result to be denormalized.
When the underflow exception condition is disabled (FPSCR[UE] = 0) and an un-
derflow condition occurs, the following actions are taken:
When the inexact exception condition occurs, regardless of the setting of the inex-
act exception condition enable bit of the FPSCR, the following actions are taken:
Register settings after a software emulation exception is taken are shown in Table
6-22.
SRR0 Set to the effective address of the instruction that caused the exception
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared
In order to enable the user to use the breakpoint features without adding restric-
tions on the software, the address of the load/store cycle that generated the data
breakpoint is not stored in the DAR (data address register), as with other excep-
tions that occur during loads or stores. Instead, the address of the load/store cycle
that generated the breakpoint is stored in an implementation dependent register
called the breakpoint address register (BAR).
Register settings after a data breakpoint exception is taken are shown in Table 6-
22.
SRR0 Set to the effective address of the instruction following the instruction that caused the exception
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared
BAR Set to the effective address of the data access as computed by the instruction that caused the
exception.
DSISR, No change
DAR
SRR0 Set to the effective address of the instruction that caused the exception
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared
Maskable external breakpoint exceptions are asynchronous and ordered. The pro-
cessor does not take the exception if the RI (recoverable exception) bit in the MSR
is cleared. Refer to SECTION 8 DEVELOPMENT SUPPORT for more information.
SRR0 Set to the effective address of the instruction that caused the exception
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared to zero
This exception allows the user to stop the processor in cases in which it would oth-
erwise not stop, but with the penalty that the processor may not be restartable. The
value of the MSR[RI] bit, as saved in the SRR1 register, indicates whether the pro-
cessor stopped in a recoverable state or not.
SRR0 Set to the effective address of the instruction that would have been executed next if no exception had
occurred. If the development port request is asserted at reset, the value of SRR0 is undefined.
MSR IP No change
ME No change
LE Set to value of ILE bit prior to the exception
Other bits Cleared to zero
The instruction sequencer fetches the instructions from the instruction cache into
the instruction pre-fetch queue. The processor uses branch folding (a technique of
removing the branch instructions from the pre-fetch queue) in order to execute
branches in parallel with execution of sequential instructions. Sequential (non-
branch) instructions reaching the top of the instruction queue are issued to the ex-
ecution units. Instructions may be flushed from the instruction queue when an ex-
ternal interrupt is detected, a previous instruction causes an exception, or a branch
prediction turns out to be incorrect.
All instructions, including branches, enter the history buffer along with
processor state information that may be affected by the instruction’s execution.
This information is used to enable out-of-order completion of instructions
together with precise exception handling. Instructions may be flushed from the
machine when an exception is taken. The instruction queue is always flushed when
recovery of the history buffer takes place. Refer to 6.3 Precise Exception Model
Implementation for additional information.
An instruction retires from the machine after it finishes execution without exception
and all preceding instructions have already retired from the machine.
[Figure: instruction flow block diagram. Labeled elements: instruction address
generator, instruction buffer, fetch, instruction pre-fetch queue, branch unit,
branch condition evaluation, issue, history buffer, read/write buses, and 32-bit
data paths.]
[Figure: instruction pipeline timing. Rows: decode, L address drive, load
writeback, branch decode, branch execute.]
The RCPU depends on a software envelope to fully implement the IEEE floating-
point specification. Overflows, underflows, NaNs, and denormalized numbers
cause floating-point assist exceptions that invoke a software routine to deliver (with
hardware assistance) the correct IEEE result. Refer to 6.11.10 Floating-Point As-
sist Exception (0x00E00) for additional information.
There is a 32-bit wide data path between the load/store unit and the integer register
file and a 64-bit wide data path between the load/store unit and the floating-point
register file.
Single-word accesses to on-chip data RAM require one clock cycle, resulting in two
clock cycles latency. Double-word accesses require two clock cycles, resulting in
three clock cycles latency. Since the L-bus is 32 bits wide, double-word transfers
require two bus accesses.
The LSU interfaces with the external bus interface for all instructions that
access memory. Addresses are formed by adding the source one register operand
specified by the instruction (or zero) to either a source two register operand
or a 16-bit immediate value embedded in the instruction.
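The address formation described above can be sketched as follows. The function names and the register-file array are hypothetical; the sketch assumes the standard PowerPC conventions that a source one register field of 0 selects the value zero (for forms that allow it) and that the 16-bit immediate is sign-extended before the addition.

```c
#include <stdint.h>

/* Source one operand: register contents, or zero when the rA field is 0. */
uint32_t ea_base(const uint32_t gpr[32], unsigned ra)
{
    return (ra == 0) ? 0u : gpr[ra];
}

/* Immediate form: EA = (rA|0) + sign-extended 16-bit displacement. */
uint32_t ea_d_form(const uint32_t gpr[32], unsigned ra, uint16_t d)
{
    /* Sign-extend without relying on implementation-defined conversions. */
    int32_t disp = (d & 0x8000u) ? (int32_t)d - 0x10000 : (int32_t)d;
    return ea_base(gpr, ra) + (uint32_t)disp;
}

/* Indexed form: EA = (rA|0) + rB. */
uint32_t ea_x_form(const uint32_t gpr[32], unsigned ra, unsigned rb)
{
    return ea_base(gpr, ra) + gpr[rb];
}
```

With r2 = 0x1000, a displacement field of 0xFFFC (that is, -4) gives an effective address of 0x0FFC, and an rA field of 0 with displacement 64 gives address 64 regardless of the contents of r0.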
All load and store instructions are executed and terminated in order. If there
are no prior instructions waiting in the address queue, the load or store
instruction is issued to the L-bus as soon as the instruction is taken.
Otherwise, if there are prior instructions whose addresses have not yet been
issued to the L-bus, the instruction is inserted into the address queue, and
data (for store instructions) is inserted into the respective store data queue.
Note that for load/store with update instructions, the destination address
register is written back on the following clock cycle, regardless of the state
of the address queue.
A new store instruction is not issued to the L-bus until all prior instructions have ter-
minated without an exception. This is done in order to implement the PowerPC pre-
cise exception model. In case of a load instruction followed by a store instruction,
a delay of one clock cycle is inserted between the termination of the load bus cycle
and the issuing of the store cycle.
NOTES:
1. Double-precision load and store instructions are pipelined on the bus.
Figure 7-4 Number of Bus Cycles Needed for String Instruction Execution
[Figure: memory map of bytes 0x00 through 0x1B, marking string accesses that are
performed as word transfers and require two or three bus cycles.]
1. Load instruction
2. First floating-point store instruction
3. Second floating-point store instruction
All instructions are fetched into the instruction prefetch queue, but only sequential
instructions are issued to the execution units upon reaching the head of the queue.
(Branches are placed into the instruction prefetch queue to enable watchpoint
marking — refer to SECTION 8 DEVELOPMENT SUPPORT for more information.)
Since branches do not prevent the issue of sequential instructions unless they
come in pairs, the performance impact of entering branches in the instruction
prefetch queue is negligible.
Refer to 4.6.2 Conditional Branch Control for more information on static branch
prediction.
7.3 Serialization
The RCPU has multiple execution units, each of which may be executing different
instructions at the same time. This concurrency is normally transparent to the
user program. In certain circumstances, however (e.g., debugging, I/O control,
and multi-processor synchronization), it may be necessary to force the machine
to serialize.
Two types of serialization are defined for the RCPU: execution serialization and
fetch serialization.
Fetch of an isync instruction causes fetch serialization. This means that no instruc-
tions following isync in the instruction stream are pre-fetched until isync and all
previous instructions have completed execution. In addition, when the SER (seri-
alize mode) bit in the ICTRL is asserted, or when the processor is in debug mode,
all instructions cause fetch serialization.
The following encodings are reserved in the RCPU for SPRs not located within the
processor:
SPR[5:9] SPR[0:4]
Many of the encodings in Table 7-2 are not used in the RCPU. If the processor
attempts to access an unimplemented external-to-the-processor SPR, or if an
error occurs during an access of an external-to-the-processor SPR, an
implementation-dependent software emulation exception is taken (rather than a
program exception).
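The column order SPR[5:9], SPR[0:4] reflects the way the 10-bit SPR number is placed in the instruction: the two 5-bit halves are swapped. The sketch below shows that swap and uses it to build an mfspr encoding. The function names are hypothetical, and the opcode values (primary opcode 31, extended opcode 339) come from the PowerPC instruction set definition rather than from this passage.

```c
#include <stdint.h>

/* Place a 10-bit SPR number into the instruction's SPR field:
 * the low-order five bits of the number go in the upper half of the field,
 * the high-order five bits in the lower half. */
uint32_t spr_field(unsigned spr)
{
    return ((uint32_t)(spr & 0x1Fu) << 5) | ((uint32_t)(spr >> 5) & 0x1Fu);
}

/* Encode "mfspr rT,SPR" (primary opcode 31, extended opcode 339). */
uint32_t encode_mfspr(unsigned rt, unsigned spr)
{
    return (UINT32_C(31) << 26)
         | ((uint32_t)(rt & 0x1Fu) << 21)
         | (spr_field(spr) << 11)
         | (UINT32_C(339) << 1);
}
```

For instance, reading the link register (SPR 8) into r3 encodes as 0x7C6802A6 with this scheme.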
Not taken 1 1 No
mtspr (to other registers) Serialize + 1 Serialize + 1 BPU Refer to Table 7-4
mffs[.] 1 1 FPU No
NOTES:
1. SPRs that are physically implemented outside of the RCPU are the time base, decrementer, ICCST, ICADR,
ICDAT, and DPDR.
2. DivisionLatency = 3 + (34 – divisorLength) / 4 when no overflow occurs;
DivisionLatency = 2 on overflow.
3. DivisionBlockage = DivisionLatency
4. Blockage of the multiply instruction depends on the subsequent instruction:
for a subsequent multiply instruction the blockage is one clock;
for a subsequent divide instruction the blockage is two clocks.
5. Assuming non-speculative aligned access, on-chip memory, and an available bus.
6. Although stores issued to the LSU buffers free the CPU pipeline, the next load or store is not actually
performed on the bus until the bus is free.
8 LR No
9 CTR No
22 DEC Write
26 SRR0 Write
27 SRR1 Write
80 EIE Write
81 EID Write
82 NRI Write
NOTES:
1. Any write (mtspr) to this address results in an implementation-dependent software emulation ex-
ception.
2. Any read (mftb) of this address results in an implementation-dependent software emulation excep-
tion.
In the following example, the addic is dependent on the subf rather than on the
mulli. Although the write back of the mulli is delayed two clocks, there is no
bubble in the execution stream.
[Timing diagram: internal clock; fetch of load, sub, addic, and, ori; L-bus
address drive and data; E-bus address and data.]
NOTE
The external clock is shifted 90° relative to the internal clock.
lwz r12,64(r0)
subf r3,r12,r3
addic r4,r14,1
[Timing diagram: internal clock; fetch of load, sub, addic, and, ori; L-bus
address drive and data; E-bus address and data.]
NOTE
Following writeback of the fadd instruction, one additional bubble is
required before instruction issue resumes. During this bubble, the
history buffer retires the fadd instruction (as well as the two lfd in-
structions).
fadd fr5,fr6,fr7
lfd fr12,0(r2)
lfd fr13,8(r2)
lfd fr14,16(r2)
subf r5,r3,r5
NOTE
In contrast to full serialization cases, the issue and execution of fol-
lowing instructions continue unaffected.
fadd fr5,fr6,fr7
stw r12,64(SP)
subf r5,r5,r3
addic r4,r14,1
fmul fr3,fr4,fr5
or r6,r12,r3
When the cmpi instruction is written back, the branch unit re-evaluates the deci-
sion. If the branch was correctly predicted, execution continues without interrup-
tion. The fetched instructions on the predicted path are not allowed to execute
before the condition is finally resolved. Instead, they are stacked in the instruction
prefetch queue.
while: mulli r3,r12,4
addi r4,r0,3
...
lwz r12,64(r2)
cmpi r12,3
addic r6,r5,1
blt while
...
[Timing diagram: L-bus address drive and data, load writeback, and the final
branch decision for blt.]
Visibility extends beyond the address and data portions of the buses and includes
attribute and handshake signals. In some cases it may also include bus arbitration
signals and signals which cause processor exceptions such as interrupts and re-
sets. The visibility requirements of emulators and bus analyzers are in opposition
to the trend of modern microcomputers and microprocessors where the CPU bus
may be hidden behind a memory management unit or cache or where bus cycles
to internal resources are not visible externally.
The development tool visibility requirements may be reduced if some of the devel-
opment support functions are included in the silicon. For example, if the bus com-
parator part of a bus analyzer or breakpoint generator is included on the chip, it is
not necessary for the entire bus to be visible at all times. In many cases the visibility
requirements may be reduced to instruction fetch cycles for tracking program exe-
cution. If some additional status information is also available to assist in execution
tracking and the development tool has access to the source code, then the only
need for bus visibility is often the destination address of indirect change-of-flow in-
structions (return from subroutine, return from interrupt, and indexed branches and
jumps).
Since full bus visibility reduces available bus bandwidth and processor perfor-
mance, certain development support functions have been included in the MCU.
These functions include the following:
• Controls to limit which internal bus cycles are reflected on the external bus
(show cycles)
• CPU status signals to allow instruction execution tracking with minimal visibil-
ity of the instructions being fetched
• Watchpoint comparators that can generate breakpoints or signal an external
bus analyzer
• A serial development port for general emulation control
The mechanism described below allows tracking of the program instruction flow
with almost no performance degradation. The information provided externally may
be captured and compressed and then parsed by a post-processing program using
the microarchitecture defined below.
The RCPU implements a prefetch queue combined with parallel, out-of-order,
pipelined execution. Instructions progress inside the processor from fetch to
retire. An instruction retires from the machine only after it, and all preceding
instructions, finish execution with no exception. Therefore only retired
instructions can be considered architecturally executed.
These features, together with the fact that most fetch cycles are performed inter-
nally (e.g., from the I-cache), increase performance but make it very difficult to pro-
vide the user with the real program trace.
In order to reconstruct a program trace, the program code and the following addi-
tional information from the MCU are needed:
• A description of the last fetched instruction (stall, sequential, branch not taken,
branch direct taken, branch indirect taken, exception taken).
• The addresses of the targets of all indirect flow changes. Indirect flow changes
include all branches using the link and count registers as the target address,
all exceptions, and rfi and mtmsr because they may cause a context switch.
• The number of instructions canceled each clock.
Reporting on program trace during retirement would significantly complicate the
visibility support and increase the die size. (Complications arise because more
than one instruction can retire in a clock cycle, and because it is harder to
report on indirect branches during retirement.) Therefore, program trace is
reported during fetch. Since not all fetched instructions eventually retire, an
indication of canceled instructions is also reported.
The following sections define how this information is generated and how it should
be used to reconstruct the program trace. The issue of data compression that could
reduce the amount of memory needed by the debug system is also mentioned.
When the processor is programmed to generate show cycles on the external bus
resulting from indirect change-of-flow, these cycles can generate regular bus cy-
cles (address phase and data phase) when the instructions reside in one of the ex-
ternal devices, or they can generate address-only show cycles for instructions that
reside in an internal device such as I-cache or internal ROM.
Table 8-1 summarizes the encodings that represent the indirect change-of-flow at-
tribute. In all cases the AT1 pin is asserted (high), indicating the cycle is an instruc-
tion fetch cycle.
Refer to 8.1.3 Program Flow-Tracking Pins for more information on the use of
these pins for program flow tracking.
Note that when the value of the ISCTL field is changed (with the mtspr instruction),
the new value does not take effect until two instructions after the mtspr instruction.
The instruction immediately following mtspr is under control of the old ISCTL val-
ue.
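The delay can be pictured with a small Python model. This is only a sketch: the function name `isctl_in_effect` and the instruction-index convention are illustrative assumptions, not part of the RCPU definition.

```python
def isctl_in_effect(old_value, new_value, mtspr_index, instr_index):
    """Return the ISCTL value that governs the instruction at instr_index.

    Assumption for illustration: instructions are numbered in program
    order and the mtspr that rewrites ISCTL sits at mtspr_index.
    The instruction immediately after mtspr still runs under the old
    value; the new value applies from two instructions after mtspr.
    """
    if instr_index >= mtspr_index + 2:
        return new_value
    return old_value
```

With the mtspr at index 10, the instructions at indexes 10 and 11 are still governed by the old value, and index 12 is the first one governed by the new value.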
In order to keep the pin count of the chip as low as possible, VSYNC is not imple-
mented as an external pin; rather, it is asserted and negated using the develop-
ment port serial interface. For more information on this interface refer to 8.3.5 Trap-
Enable Input Transmissions.
Asserting or negating VSYNC forces the machine to synchronize and causes the
first fetch after this synchronization to be marked as an indirect change-of-flow
cycle and made visible on the external bus. This enables the external hardware to
synchronize with the internal activity of the processor.
When either VSYNC is asserted or the ISCTL bits in the I-bus control register are
programmed to a value of 0b10, cycles resulting from an indirect change-of-flow
are shown on the external bus. By programming the ISCTL bits to show all indirect
flow changes, the user can thus ensure that the processor maintains exactly the
same behavior when VSYNC is asserted as when it is negated. The loss of perfor-
mance the user can expect from the additional external bus cycles is minimal.
For additional information on the ISCTL bits and the ICTRL register, refer to 8.8 De-
velopment Support Registers. For more information on the use of VSYNC during
program trace, refer to 8.1.4 External Hardware During Program Trace.
• Instruction queue status pins (VF[0:2]) denote the type of the last fetched in-
struction or how many instructions were flushed from the instruction queue.
• History buffer flush status pins (VFLS[0:1]) denote how many instructions
were flushed from the history buffer during the current clock cycle.
• Address type pin 1 (AT1) indicates whether the cycle is transferring an instruc-
tion or data.
• The write/read pin (WR), when asserted during an instruction fetch show cy-
cle, indicates the current cycle results from an indirect change-of-flow.
• Cycle type pins (CT[0:3]) indicate the type of bus cycle and are used to deter-
mine the address of an internal memory or register that is being accessed.
VF[0:2]  Instruction type                                    Next VF report holds
010      Branch (direct or indirect) not taken               More instruction type information
011      VSYNC was asserted/negated and therefore the        More instruction type information
         next instruction will be marked with the indirect
         change-of-flow attribute
100      Exception taken; the target will be marked with     Queue flush information (note 1)
         the program trace cycle attribute
101      Branch indirect taken, rfi, mtmsr, isync and, in    Queue flush information (note 1)
         some cases, mtspr to CMPA-F, ICTRL, ECR, or
         DER; the target will be marked with the indirect
         change-of-flow attribute (note 2)
NOTES:
1. Unless next clock VF = 111. See below.
2. The sequential instructions listed here affect the machine in a manner similar to indirect branch instructions.
Refer to 8.1.1.2 Sequential Instructions with the Indirect Change-of-Flow Attribute.
Table 8-4 shows VF[0:2] encodings for instruction queue flush information.
110 Reserved
There is one special case in which regular instruction type information is reported
on the VF[0:2] pins even though queue flush information is expected (according to
the immediately preceding value on these pins). The only instruction type information
that can appear in this case is VF[0:2] = 111, indicating branch (direct or indirect)
not taken. Since at most five instructions can be flushed from the queue, this
special case is easy to identify.
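Reconstruction software might apply this rule as in the following Python sketch. The function and constant names are illustrative, and the sketch assumes that a queue flush report encodes the flush count directly (0 through 5); only the encodings that appear in the tables above are used.

```python
# VF values after which the next report carries queue flush information
# (taken from the encoding rows for "exception taken" and "branch indirect taken").
FLUSH_INFO_FOLLOWS = {0b100, 0b101}

BRANCH_NOT_TAKEN_SPECIAL = 0b111  # instruction type reported in place of flush info
MAX_QUEUE_FLUSHES = 5             # maximum number of queue flushes possible


def interpret_vf(previous_vf, vf):
    """Classify one 3-bit VF[0:2] report given the report that preceded it."""
    if previous_vf in FLUSH_INFO_FOLLOWS:
        if vf == BRANCH_NOT_TAKEN_SPECIAL:
            # Special case: instruction type information where flush
            # information was expected.
            return ("instruction_type", "branch not taken")
        if vf <= MAX_QUEUE_FLUSHES:
            # Assumption: the encoding is the flush count itself.
            return ("queue_flush", vf)
        return ("reserved", vf)
    # Otherwise the report is ordinary instruction type information.
    return ("instruction_type", vf)
```

Because a flush count can never exceed five, the value 111 cannot be confused with a flush count, which is why the special case can be decoded without extra state.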
If VSYNC is asserted or negated while the processor is in debug mode, this infor-
mation is reported as the first VF pins report when the processor returns to regular
0001 If address type is data (AT1 = 0), this is a data access to the external bus
and the start of a reservation.
If address type is instruction (AT1=1), this cycle type indicates that an
external address is the destination of an indirect change-of-flow.
0010 External bus cycle to emulation memory replacing internal I-bus or L-bus
memory. An instruction access (AT1 = 1) with an address that is the target
of an indirect change-of-flow is indicated as a logic level zero on the WR
output.
0011 Normal external bus cycle access to a port replacement chip used for
emulation support.
0110 Cache hit on external memory address not controlled by chip selects. An
instruction access (AT1 = 1) with an address that is the target of an indirect
change-of-flow is indicated as a logic level zero on the WR output.
1110 Reserved
1111
Program trace can be used in various ways. Two types of traces that can be imple-
mented are the back trace and the window trace.
When a back trace is needed, the external hardware should start sampling the status
pins and the address of all cycles marked with the indirect change-of-flow attribute
immediately after reset is negated. Since the ISCTL field in the ICTRL register has
a value of 0b00 (show all cycles) out of reset, all cycles marked with the indirect
change-of-flow attribute are visible on the external bus. VSYNC should be asserted
sometime after reset and negated when the programmed event occurs. VSYNC
must be asserted before the ISCTL encoding is changed to 0b11 (no show cycles),
if such an encoding is selected.
Note that in case the timing of the programmed event is unknown, it is possible to
use cyclic buffers.
For a back trace, after VSYNC is negated, the trace buffer contains the program flow
trace of the program executed before the programmed event occurred.
For a window trace, after VSYNC is negated, the trace buffer contains information
describing the program trace of the program executed between the two events.
For a window trace, the value of the status pins and the address of the cycles
marked with the indirect change-of-flow attribute should be latched beginning im-
mediately after the first VSYNC is reported on the VF pins. The starting address of
the trace window should be calculated according to the first two VF pin reports.
Assume VF1 and VF2 are the two first VF pin reports and T1 and T2 are the ad-
dresses of the first two cycles marked with the indirect change-of-flow attribute that
were latched in the trace buffer. Use Table 8-7 to calculate the trace window start-
ing address.
When the user negates VSYNC, the processor delays the report of the assertion
or negation of VSYNC on the VF pins until all addresses marked with the indirect
change-of-flow attribute have been made visible externally. Therefore, the external
hardware should stop sampling the value of the status pins (VF and VFLS) and the
address of the cycles marked with the program trace cycle attribute immediately
after the VSYNC report on the VF pins.
CAUTION
The last two instructions reported on the VF pins are not always valid.
Therefore, at the last stage of the reconstruction software, the last
two instructions should be ignored.
8.1.5 Compress
In order to store all the information generated on the pins during program trace (5
bits per clock + 30 bits per show cycle) a large memory buffer may be needed.
However, since this information includes events that were canceled, compression
can be very effective. External hardware can be added to eliminate all canceled in-
structions and report only on branches (taken and not taken), indirect flow change,
and the number of sequential instructions after the last flow change.
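To get a feel for the uncompressed storage requirement, the figures quoted above (5 bits per clock plus 30 bits per show cycle) can be turned into a small Python estimate. The function names and the workload numbers in the usage note are hypothetical.

```python
def raw_trace_bits(clocks, show_cycles, bits_per_clock=5, bits_per_show_cycle=30):
    """Uncompressed trace size in bits for a capture window."""
    return clocks * bits_per_clock + show_cycles * bits_per_show_cycle


def raw_trace_bytes(clocks, show_cycles):
    """Same estimate rounded up to whole bytes."""
    return (raw_trace_bits(clocks, show_cycles) + 7) // 8
```

For a hypothetical capture of 1,000,000 clocks containing 50,000 show cycles, `raw_trace_bits` gives 6,500,000 bits, or 812,500 bytes before any compression.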
[Figure: Watchpoint and breakpoint support. Blocks shown: development system or
external peripherals, internal peripherals, development port, internal watchpoints
logic and counters. Signals shown: maskable breakpoint, non-maskable breakpoint,
development port trap enable bits, watchpoints to pins, breakpoint to CPU (with
MSR[RI]). Legend: bit-wise AND, bit-wise OR.]
8.2.1 Watchpoints
Watchpoints are based on eight comparators on the I-bus and L-bus, two counters,
and two AND-OR logic structures. There are four comparators on the instruction
address bus (I-address), two comparators on the load/store address bus (L-ad-
dress), and two comparators on the load/store data bus (L-data).
The comparators generate match events. The I-bus match events enter the I-bus
AND-OR logic, where the I-bus watchpoints and breakpoint are generated. When
asserted, the I-bus watchpoints may generate the I-bus breakpoint. Two of them
may decrement one of the counters. When a counter that is counting one of the
I-bus watchpoints expires, the I-bus breakpoint is asserted.
The I-bus watchpoints and the L-bus match events (address and data) enter the
L-bus AND-OR logic where the L-bus watchpoints and breakpoint are generated.
When asserted, the L-bus watchpoints may generate the L-bus breakpoint, or they
may decrement one of the counters. When a counter that is counting one of the
L-bus watchpoints expires, the L-bus breakpoint is asserted.
A watchpoint progresses in the machine along with the instruction that caused it
(fetch or load/store cycle). Watchpoints are reported on the external pins when the
associated instruction is retired.
Since watchpoint events are reported upon the retirement of the instruction that
caused the event, and more than one instruction can retire from the machine in one
clock, separate watchpoint events may be reported in the same clock. Moreover,
the same event, if detected on more than one instruction (e.g., tight loops, range
detection), in some cases is reported only once. However, the internal counters still
count correctly.
To use this feature, the user must program the byte mask for each of the L-data
comparators and write the needed match value to the correct half word of the data
comparator when working in half word mode, or to the correct bytes of the data
comparator when working in byte mode.
Since bytes and half words can be accessed using a larger data width instruction,
the user cannot predict the exact value of the L-address lines when the requested
byte or half word is accessed. For example, if the matched byte is byte two of the
word and it is accessed using a load word instruction, the L-address value will be
that of the word (byte zero). Therefore, the processor masks the two least significant
bits of the L-address comparators whenever a word access is performed and the least
significant bit whenever a half word access is performed. Address range is supported
only when aligned according to the access size.
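The masking rule can be illustrated with a short Python sketch. It models only an "equal" address compare; the names `ADDRESS_MASK_BY_SIZE` and `l_address_matches` and the size strings are illustrative assumptions.

```python
# Bits of the L-address comparator that take part in the compare,
# depending on the size of the access being performed.
ADDRESS_MASK_BY_SIZE = {
    "byte": 0xFFFFFFFF,      # no bits masked
    "halfword": 0xFFFFFFFE,  # least significant bit masked
    "word": 0xFFFFFFFC,      # two least significant bits masked
}


def l_address_matches(comparator_value, l_address, access_size):
    """Equal-type address match after masking for the access size."""
    mask = ADDRESS_MASK_BY_SIZE[access_size]
    return (comparator_value & mask) == (l_address & mask)
```

With the comparator programmed to the address of byte two of a word (for example 0x1002), a load word that presents address 0x1000 still matches, while a byte access to 0x1000 does not.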
The following examples illustrate how to detect matches on bytes and half words.
[Figure: Watchpoint/breakpoint example. Word addresses shown: 0x0000_0000,
0x0000_0004, 0x0000_0008, 0x0000_000c, 0x0000_0010.]
The “greater than or equal” compare type can be generated using the greater than
compare type and programming the comparator to the needed value minus one.
The “less than or equal” compare type can be generated using the less than com-
pare type and programming the comparator to the needed value plus one.
This method does not work for the following boundary cases:
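The adjustment itself can be sketched in Python as follows. The sketch assumes unsigned 32-bit compares, and the function name and compare-type strings are illustrative only. It rejects the two inputs for which the adjusted value cannot be represented in 32 bits; treating exactly these as the boundary cases is an assumption of the sketch, based on the arithmetic alone.

```python
WORD_MAX = 0xFFFFFFFF  # largest unsigned 32-bit value


def comparator_setting(compare_type, value):
    """Map "ge"/"le" onto the native "gt"/"lt" compare types.

    Returns (native_compare_type, comparator_value). Raises ValueError
    when value - 1 or value + 1 would fall outside the unsigned range.
    """
    if compare_type == "ge":
        if value == 0:
            raise ValueError("value - 1 is not representable")
        return ("gt", value - 1)
    if compare_type == "le":
        if value == WORD_MAX:
            raise ValueError("value + 1 is not representable")
        return ("lt", value + 1)
    raise ValueError("unsupported compare type")
```

For example, "greater than or equal to 0x100" becomes a greater-than compare against 0xFF.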
[Figure: I-bus support. Comparators A through D each feed compare type logic
(EQ, LT). The resulting events, including (C&D) and (C | D), enter an events
generator that, under the control bits, produces I-watchpoints 0 through 3 and
the I-breakpoint.]
The I-bus watchpoints and breakpoint are generated using these events and ac-
cording to the user’s programming of the CMPA, CMPB, CMPC, CMPD, and IC-
There are two L-bus data comparators (comparators G and H). Each is 32 bits wide
and can be programmed to treat numbers either as signed values or as unsigned
values. Each data comparator operates as four independent byte comparators.
Each byte comparator has a mask bit and generates two output signals, equal and
less than, if the mask bit is not set. Therefore, each 32-bit comparator has eight
output signals.
These signals are used to generate the “equal and less than” signals according to
the compare size programmed by the user (byte, half word, word). In byte mode all
signals are significant. In half word mode only four signals from each 32-bit com-
parator are significant. In word mode only two signals from each 32-bit comparator
are significant.
From the new “equal and less than” signals, depending on the compare type pro-
grammed by the user, one of the following four match events is generated: equal,
not equal, greater than, less than. Therefore from the two 32-bit comparators, eight
match indications are generated: Gmatch[0:3], Hmatch[0:3].
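The signal flow for one 32-bit data comparator can be modeled roughly as below. This Python sketch makes several assumptions that the text does not state: compares are unsigned, byte 0 is the most significant byte, "less than" means the bus data is less than the programmed value, and the byte mask is left out. All function names are illustrative.

```python
def byte_signals(data, reference):
    """Per-byte (equal, less_than) pairs, byte 0 first."""
    signals = []
    for byte_index in range(4):
        shift = 8 * (3 - byte_index)
        d = (data >> shift) & 0xFF
        r = (reference >> shift) & 0xFF
        signals.append((d == r, d < r))
    return signals


def combine(signals):
    """Merge adjacent byte signals into one (equal, less_than) pair."""
    eq, lt = True, False
    for byte_eq, byte_lt in signals:
        # A lower byte decides "less than" only if all higher bytes are equal.
        lt = lt or (eq and byte_lt)
        eq = eq and byte_eq
    return eq, lt


def sized_signals(signals, size):
    """(equal, less_than) pairs that are significant for the compare size."""
    if size == "byte":
        return list(signals)
    if size == "halfword":
        return [combine(signals[0:2]), combine(signals[2:4])]
    return [combine(signals)]  # word


def match_event(eq, lt, compare_type):
    """Derive one of the four match events from an (equal, less_than) pair."""
    events = {"eq": eq, "ne": not eq, "lt": lt, "gt": not eq and not lt}
    return events[compare_type]
```

In this model, byte mode yields four pairs (eight signals), half word mode two pairs (four signals), and word mode one pair (two signals), in line with the counts given above.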
According to the lower bits of the address and the size of the cycle, only match in-
dications that were detected on bytes that have valid information are validated; the
[Figure: L-bus support. Comparators G and H each contain four byte comparators
(bytes 0 through 3, with EQ and LT outputs), followed by size logic, compare type
logic, and byte qualifier logic with byte mask and valid 0 through 3 inputs. Events
generators form (E&F), (E | F), (G&H), and (G | H); together with the I-bus
watchpoints, AND-OR logic under the control bits produces L-watchpoint 0,
L-watchpoint 1, and the L-breakpoint.]
Using the match indication signals, four L-bus data events are generated as shown
in Table 8-9.
(G&H) ((Gmatch0 & Hmatch0) | (Gmatch1 & Hmatch1) | (Gmatch2 & Hmatch2) | (Gmatch3 &
Hmatch3))
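The (G&H) expression above reads directly as code. In this Python sketch the match indications are passed as four-element sequences of booleans, which is a representation chosen for illustration.

```python
def g_and_h_event(gmatch, hmatch):
    """True when G and H match on the same byte lane (lanes 0 through 3)."""
    return any(g and h for g, h in zip(gmatch, hmatch))
```

The event requires both comparators to match in the same lane; a G match in lane 0 together with an H match in lane 1 does not raise it.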
The four L-bus data events together with the match events of the L-bus address
comparators and the I-bus watchpoints are used to generate the L-bus watchpoints
and breakpoint according to the user’s programming of the CMPE, CMPF, CMPG,
CMPH, LCTRL1, and LCTRL2 registers. Table 8-10 shows how the watchpoints
are determined from the programming options.
An internal breakpoint progresses in the machine along with the instruction that
caused it (fetch or load/store cycle). When a breakpoint reaches the top of the his-
tory buffer, the machine processes the breakpoint exception.
An instruction that causes an I-bus breakpoint is not retired. The processor branch-
es to the breakpoint exception routine before it executes the instruction. An instruc-
tion that causes an L-bus breakpoint is executed. The processor branches to the
breakpoint exception routine after it executes the instruction. The address of the
load/store cycle that generated the L-bus breakpoint is stored in the breakpoint ad-
dress register (BAR).
The value used by the breakpoint generation logic is the bit-wise OR of the software
trap enable bits (the bits written using mtspr) and the development port trap enable
bits (the bits serially shifted in through the development port).
All bits, both the software trap enable bits and the development port trap enable
bits, can be read from ICTRL and LCTRL2 using mfspr. For the exact bit placement,
refer to Table 8-30 and Table 8-32.
When the IFM bit is cleared, every matched instruction can cause an I-bus break-
point (used for “go from x,” where x is an address that would not cause a break-
point).
The IFM bit is set by the software and cleared by the hardware after the first I-bus
breakpoint match is ignored.
Since L-bus breakpoints are treated after the instruction is executed, L-bus break-
points and counter-generated I-bus breakpoints are not affected by this mode.
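The ignore-first-match behavior can be captured in a few lines of Python. This is a behavioral sketch only; the class and method names are hypothetical, and it covers only comparator-generated I-bus breakpoint matches, not counter-generated or L-bus breakpoints.

```python
class IBusBreakpointModel:
    """Tracks the IFM bit across successive I-bus breakpoint matches."""

    def __init__(self, ifm=False):
        self.ifm = ifm  # set by software

    def on_match(self):
        """Return True if this match causes an I-bus breakpoint."""
        if self.ifm:
            # The first match is ignored, then hardware clears IFM.
            self.ifm = False
            return False
        return True
```

With IFM set, the first match is swallowed and the second one breaks; with IFM cleared, every match can break.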
The development port serial interface can be used to assert either a maskable or
non-maskable breakpoint. Refer to 8.3.5 Trap-Enable Input Transmissions for
more information about generating breakpoints from the development port. The de-
velopment port breakpoint bits remain asserted until they are cleared; however,
they cause a breakpoint only when they change from cleared to set. If they remain
set, they do not cause an additional breakpoint until they are cleared and set again.
External breakpoints are not referenced to any particular instruction; they are ref-
erenced to the current or following L-bus cycle. The breakpoint is taken as soon as
the processor completes an instruction that uses the L-bus.
Non-maskable breakpoints cause the processor to stop without regard to the state
of the MSR[RI] bit. If the processor is in a non-recoverable state when the break-
Only the development port and the internal breakpoint logic are capable of gener-
ating a non-maskable breakpoint. This allows the user to stop the processor in cas-
es where it would otherwise not stop, but with the penalty that it may not be
restartable. The value of the MSR[RI] bit as saved in the SRR1 register indicates
whether the processor stopped in a recoverable state or not.
The relationship of the development support logic to the rest of the MCU is shown
in Figure 8-5. Although the development port is implemented as part of the system
interface unit (SIU), it is used in conjunction with RCPU development support fea-
tures and is therefore described in this section.
[Figure: Development port support logic. The development port control logic and
the development port shift register connect to the SIU bus. Signals shown: BKPT,
TE, VSYNC, TECR, DSCK, DSDI, PLLL/DSDO, VFLS (FRZ).]
In clocked mode, detection of the rising edge of the synchronized clock causes the
synchronized data from the DSDI pin to be loaded into the least significant bit of
the shift register. This transfer occurs one quarter clock after the next rising edge
of the system clock. At the same time, the new most significant bit of the shift reg-
ister is presented at the PLLL/DSDO pin. Future references to the DSCK signal im-
ply the internal synchronized value of the clock. The DSCK input must be driven
either high or low at all times and not allowed to float. A typical target environment
would pull this input low with a resistor.
To allow the synchronizers to operate correctly, the development serial clock fre-
quency must not exceed one half of the system clock frequency. The clock may be
implemented as a free-running clock. The shifting of data is controlled by ready and
start signals so the clock does not need to be gated with the serial transmissions.
(Refer to 8.3.5 Trap-Enable Input Transmissions and 8.3.6 CPU Input Trans-
missions.)
The DSCK pin is also used during reset to enable debug mode and, immediately
following reset, to optionally cause entry into debug mode. This is described in
8.4.1 Enabling Debug Mode and 8.4.2 Entering Debug Mode.
When the processor is not in debug mode (freeze not indicated on VFLS[0:1] pins)
the data received on the DSDI pin is transferred to the trap enable control register.
When the processor is in debug mode, the data received on the DSDI pin is pro-
vided to the debug mode interface. Refer to 8.3.5 Trap-Enable Input Transmis-
sions and 8.3.6 CPU Input Transmissions for additional information.
The DSDI pin is also used at reset to control overall chip reset configuration and
immediately following reset to determine the development port clock mode. See
8.3.3 Development Port Clock Mode Selection for more information.
[Figure: Development port shift register. Elements shown: 32-bit data field,
length/status 0, control/status 1, and start/ready bits; shift control and counter;
DSDI; internal DSDI and DSDO signals.]
The selection of clocked/self clocked mode is shown in Figure 8-7. The timing di-
agrams in Figure 8-8, Figure 8-9, and Figure 8-10 show the serial communica-
tions for both trap enable mode and debug mode for all clocking schemes.
[Figure 8-7: clock mode selection. Signals shown: RESET, SRESET, DSDI, CLKEN.]
Examples of serial communications using the three clock modes are shown in Fig-
ure 8-8, Figure 8-9, and Figure 8-10.
[Figure 8-8: serial communications timing. Signals shown: DSCK, synchronized DSCK,
DSDI (START, LENGTH, CNTRL, DI<0> through DI<N>), and the internal clock.
Notes from the figure:
• The development tool drives the “start” bit on DSDI (after detecting the “ready”
bit on PLLL/DSDO when in debug mode). The “start” bit is immediately followed by
a length bit and a control bit and then N (7 or 32) input data bits.
• The debug port drives the “ready” bit onto PLLL/DSDO when ready for a new
transmission.]
In Figure 8-8, the frequency on the DSCK pin is equal to CLKOUT frequency di-
vided by three. This is the maximum frequency allowed for the asynchronous
clocked mode. DSCK and DSDI transitions are not required to be synchronous with
CLKOUT.
[Figure 8-9: serial communications timing. Signals shown: DSCK, synchronized DSCK,
DSDI and synchronized DSDI (START, LENGTH, CNTRL, DI<0> through DI<N>), and
the internal clock.
Notes from the figure:
• The debug port detects the “start” bit on DSDI and follows the “ready” bit with
two status bits and N (7 or 32) output data bits.
• The development tool drives the “start” bit on DSDI (after detecting the “ready”
bit on PLLL/DSDO when in debug mode). The “start” bit is immediately followed by
a length bit and a control bit and then N (7 or 32) input data bits.
• The debug port drives the “ready” bit onto PLLL/DSDO when ready for a new
transmission.]
In Figure 8-9, the frequency on the DSCK pin is equal to CLKOUT frequency di-
vided by two. DSDI and DSCK transitions must meet setup and hold timing require-
ments with respect to CLKOUT.
[Figure 8-10: serial communications timing. Signals shown: DSDI and synchronized
DSDI (START, LENGTH, CNTRL, DI<0> through DI<N>), and the internal clock.
Notes from the figure:
• The development tool drives the “start” bit on DSDI (after detecting the “ready”
bit on PLLL/DSDO when in debug mode). The “start” bit is immediately followed by
a length bit and a control bit and then N (7 or 32) input data bits.
• The debug port drives the “ready” bit onto PLLL/DSDO when the CPU is ready for
a new transmission.]
In Figure 8-10, the DSCK pin is not used, and the transmission is clocked by CLK-
OUT. DSDI transitions must meet setup and hold timing requirements with respect
to CLKOUT.
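A development tool could check its serial clock against these limits before starting a session. The Python sketch below is illustrative only: the mode names are labels chosen here (the mode of Figure 8-9 is taken to be the synchronous clocked mode because its transitions are referenced to CLKOUT), and the system clock is assumed to run at the CLKOUT frequency.

```python
# Smallest allowed ratio of CLKOUT frequency to DSCK frequency per mode.
MIN_CLKOUT_TO_DSCK_RATIO = {
    "asynchronous_clocked": 3,  # DSCK at most CLKOUT / 3
    "synchronous_clocked": 2,   # DSCK at most CLKOUT / 2
}


def dsck_frequency_ok(mode, clkout_hz, dsck_hz):
    """Return True if dsck_hz is acceptable for the given clock mode."""
    if mode == "self_clocked":
        # DSCK is not used; the transmission is clocked by CLKOUT.
        return True
    return dsck_hz * MIN_CLKOUT_TO_DSCK_RATIO[mode] <= clkout_hz
```

Integer frequencies in hertz keep the comparison exact; with a hypothetical 40 MHz CLKOUT, 13 MHz is acceptable in asynchronous clocked mode and 14 MHz is not.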
The start bit also signals the development port that it can begin driving data on the
DSDO pin. While data is shifting into the LSB of the shift register from the DSDI pin,
it is simultaneously shifting out of the MSB of the shift register onto the DSDO pin.
A length bit defines the transmission as being to either the trap-enable register
(length bit = 1, indicating 7 data bits) or the CPU (length bit = 0, indicating 32 data
bits). Transmissions of data and instructions to the CPU are allowed only when the
processor is in debug mode. The two types of transmissions are discussed in 8.3.5
Trap-Enable Input Transmissions and 8.3.6 CPU Input Transmissions.
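The framing described above can be sketched in Python as follows. Two details are assumptions made for the sketch and are not stated in this passage: the start bit is driven as 1, and data bits are sent most significant bit first. The function names are illustrative.

```python
def build_transmission(length_bit, control_bit, data, start_bit=1):
    """Bits in the order they are driven on DSDI: start, length, control, data.

    length_bit = 1 selects a 7-bit trap-enable transmission;
    length_bit = 0 selects a 32-bit transmission to the CPU.
    """
    n_data = 7 if length_bit == 1 else 32
    if not 0 <= data < (1 << n_data):
        raise ValueError("data does not fit in the selected length")
    data_bits = [(data >> (n_data - 1 - i)) & 1 for i in range(n_data)]
    return [start_bit, length_bit, control_bit] + data_bits


def shift_in(bits):
    """Shift bits into the LSB of a register and return its final contents."""
    register = 0
    for bit in bits:
        register = (register << 1) | bit
    return register
```

A CPU transmission built this way is 35 bits long and a trap-enable transmission is 10 bits long. Because each new bit enters at the LSB, the start bit ends up in the most significant position once the whole frame has been shifted in.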
Table 8-11 Trap Enable Data Shifted Into Development Port Shift Register
Columns: Start | Length | Control | I-bus (1st, 2nd, 3rd, 4th) | L-bus (1st, 2nd) | VSYNC | Usage
Table 8-12 Breakpoint Data Shifted Into Development Port Shift Register
Columns: Start | Length | Control | Non-Maskable Breakpoint | Maskable Breakpoint | Reserved bits | Usage
For transmissions to the CPU, the 35 bits of the development port shift register are
interpreted as a start bit, a length bit, a control bit, and 32 bits of instructions or da-
ta. The encoding of data shifted into the development port shift register (through
the DSDI pin) is shown in Table 8-13.
NOTES:
1. Depending on input mode.
When the processor is not in debug mode, the sequencing error encoding indicates
that the transmission from the external development tool was a transmission to the
CPU (length = 0). When a sequencing error occurs, the development port ignores
the data being shifted in while the sequencing error is shifting out.
The null output encoding is used to indicate that the previous transmission did not
have any associated errors.
When the processor is not in debug mode, the ready bit is asserted at the end of
each transmission. If debug mode is not enabled and transmission errors can be
guaranteed not to occur, the status output is not needed, and the DSDO pin can
be used for untimed I/O.
1) the processor was trying to read instructions and data was shifted into the
development port, or
2) the processor was trying to read data and an instruction was shifted into the
development port.
When a sequencing error occurs, the port terminates the CPU read or fetch cycle
with a bus error. This bus error causes the CPU to signal the development port that
an exception occurred. Since a status of sequencing error has a higher priority than
a status of exception, the port reports the sequencing error. The development port
ignores the data being shifted in while the sequencing error is shifting out. The next
transmission to the port should be a new instruction or trap enable data.
Table 8-16 illustrates a typical sequence of events when a sequencing error oc-
curs. This example begins with CPU data being shifted into the shift register (con-
trol bit = 1) when the processor is expecting an instruction. During the next
Transmission | Shifted in | Shifted out | Port action | CPU action
1 | CPU data (control bit = 1) | Depends on previous transmissions | Cause bus error, set sequence error latch | Fetch instruction, take exception because of bus error
2 | X (thrown away) | Sequencing error | Set exception latch, clear sequencing error latch | Signal exception to port, begin new fetch from port
3 | X (thrown away) | CPU exception | Clear exception latch | Continue to wait for instruction from port
4 | CPU instruction | Null | Send instruction to CPU at end of transmission | Fetch instruction from port
First, if the previous transmission results in a sequence error, or the CPU reports
an exception, that status may not be reported until two transmissions after the
transmission that caused the error. (When the ready bit is used, the status is re-
ported in the following transmission.) This is because an error condition which oc-
curs after the start of a transmission cannot be reported until the next transmission.
Second, if a transmitted instruction causes the CPU to write to the DPDR and the
transmission that follows does not wait for the assertion of ready, the CPU data
may not be latched into the development port shift register, and the valid data status
is not output. Despite this, no error is indicated in the status outputs. To ensure
that the CPU has had enough time to write to the DPDR, there must be at least four
CLKOUT cycles between the time the last bit of the instruction (move to SPR) is
clocked into the port and the time the start bit for the next transmission is clocked
into the port.
The DSCK pin is sampled again eight clock cycles following the negation of
RESETOUT. If DSCK is negated following reset, the processor jumps to the reset
vector and begins normal execution. If DSCK is asserted following reset and debug
mode is enabled, the processor enters debug mode before executing any instruc-
tions.
[Figure: debug mode enable at reset timing. Signals shown: CLKOUT, RESET, SRESET,
DSCK, DM_EN, NMBKPT.
Notes from the figure:
• DSCK is negated within 8 clocks following RESETOUT negation to avoid entry into
debug mode.
• The internal breakpoint signal does not assert because DSCK is negated less than
8 clocks after RESETOUT negation (therefore the CPU does not enter debug mode
following reset).]
Debug mode may also be entered immediately following reset. If the DSCK pin
continues to be asserted following reset (after debug mode is enabled), the proces-
sor takes a breakpoint exception and enters debug mode directly after fetching (but
not executing) the reset vector. To avoid entering debug mode following reset, the
DSCK pin must be negated no later than seven clock cycles after RESETOUT is
negated.
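The two DSCK sampling points can be summarized in a small Python decision function. It is a simplification for illustration: DSCK is reduced to two boolean samples (one taken as RESETOUT negates, one taken eight clocks later), and the function name is hypothetical.

```python
def reset_debug_outcome(dsck_at_reset_negation, dsck_eight_clocks_later):
    """Return (debug_mode_enabled, enters_debug_mode) after reset."""
    # DSCK asserted as RESETOUT negates enables debug mode.
    debug_enabled = dsck_at_reset_negation
    # DSCK still asserted eight clocks later causes entry into debug mode,
    # provided debug mode was enabled.
    enters_debug = debug_enabled and dsck_eight_clocks_later
    return debug_enabled, enters_debug
```

Negating DSCK within seven clocks of RESETOUT negation corresponds to the second sample being false, which leaves debug mode enabled but not entered.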
A timing diagram for entering debug mode following reset is shown in Figure 8-12.
[Figure 8-12: timing for entering debug mode following reset. Signals shown: RESET,
SRESET, DSCK, DM_EN, NMBKPT, VFLS[0:1].
Notes from the figure:
• DSCK asserted prior to RESETOUT negation enables debug mode. Debug mode is
enabled if DSCK is asserted immediately before RESETOUT negates.
• DSCK stays asserted for at least 8 clock cycles following RESETOUT negation to
cause entry into debug mode.
• The internal breakpoint signal asserts because DSCK stays asserted for at least
8 clock cycles after RESETOUT negation (therefore the CPU will enter debug mode
following reset).
• Debug mode entry is indicated by VFLS[0:1] both high.]
Freeze is indicated by the value 11 on the VFLS[0:1] pins. This encoding is not
used for pipeline tracking and is left on the VFLS[0:1] pins when the processor is
in debug mode. Figure 8-14 shows how the internal freeze signal is generated.
Software must read the ECR (to clear it) before executing the rfi instruction. Oth-
erwise, if a bit in the ECR is asserted and its corresponding enable bit in the DER
is also asserted, the processor re-enters debug mode and re-asserts the freeze
signal immediately after executing the rfi instruction.
2 Any CPU instruction Null Port ignores data, terminates DPDR read 5
with error;
latches sequence error;
CPU signals exception to port, fetches
next instruction
[Figure: diagram of development port transmissions in debug mode. Labels shown:
CPU data, CPU instruction, trap enable data, non-DPDR instruction, DPDR write
instruction, DPDR read instruction, any instruction with exception.]
6 Any CPU instruction or data Null Port ignores data and latches sequence error 7
Data for trap enable control Port updates trap enable control register 6
register
Development port activity | Instruction | CPU activity | Purpose

Shift in instruction | mtspr DPDR, R0 | Transfer R0 to DPDR | Save R0 so the register can be used
Shift out R0 data, shift in instruction | mfspr R0, ECR | Transfer ECR to R0 | Read the debug mode cause register
Shift in instruction | mtspr DPDR, R0 | Transfer from R0 to DPDR | Output reason for debug mode entry
Shift out stop cause data, shift in instruction | mtspr DPDR, R1 | Transfer R1 to DPDR | Save R1 so the register can be used
Shift out R1 data, shift in instruction | First instruction of next sequence | Execute next instruction | Continue instruction processing

Shift in instruction, shift in saved R0 | mfspr R0, DPDR | Transfer from DPDR to R0 | Restores value of R0 when stopped
Shift in instruction, shift in saved R1 | mfspr R1, DPDR | Transfer from DPDR to R1 | Restores value of R1 when stopped

Shift in instruction | mfspr R1,DPDR | Transfer address from DPDR to R1 | Point to memory address
Shift in instruction | lwzu R0,D(R1) | Load data from memory address (R1) into R0 | Read data from memory
Shift in instruction | mtspr DPDR,R0 | Transfer data from R0 to DPDR | Write memory data to the port
Shift in instruction, shift out memory data | First instruction of next sequence | Execute next instruction | Output memory data

Shift in instruction | mfspr R1,DPDR | Transfer address from DPDR to R1 | Point to memory address
Shift in instruction, shift in memory data | mfspr R0, DPDR | Transfer data from DPDR to R0 | Read memory data from the port
Shift in instruction | stwu R0,D(R1) | Store data from R0 to memory address (R1) | Write data to memory
The internal freeze signal is connected to all relevant internal modules. These mod-
ules can be programmed to stop all operations in response to the assertion of the
freeze signal. In order to enable a software monitor debugger to broadcast the fact
that the debug software is now executing, it is possible to assert and negate the
internal freeze signal when debug mode is disabled. (The freeze signal can be as-
serted externally only when the processor enters debug mode.)
The internal freeze signal is asserted whenever an enabled event occurs, regard-
less of whether debug mode is enabled or disabled. To enable an event to cause
freeze assertion, software needs to set the relevant bit in the DER. To clear the
freeze signal, software needs to read the ECR to clear the register and then per-
form an rfi instruction.
If the ECR is not cleared before the rfi instruction is executed, freeze is not negat-
ed. It is therefore possible to nest inside a software monitor debugger without af-
fecting the value of the freeze signal, even though rfi is performed. Only before the
last rfi does the software need to clear the ECR.
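The interaction between the ECR, the DER, and rfi can be modeled with a small Python class. This is a behavioral sketch under assumptions: event causes are represented as bit masks, an event is recorded in the ECR whether or not it is enabled in the DER, and the class and method names are hypothetical.

```python
class FreezeModel:
    """Behavioral model of the freeze signal as seen by monitor software."""

    def __init__(self, der=0):
        self.der = der      # debug enable register: which events are enabled
        self.ecr = 0        # exception cause register
        self.freeze = False

    def event(self, bit_mask):
        """Record an event; an enabled event asserts freeze."""
        self.ecr |= bit_mask
        if self.ecr & self.der:
            self.freeze = True

    def read_ecr(self):
        """Reading the ECR returns its value and clears it."""
        value, self.ecr = self.ecr, 0
        return value

    def rfi(self):
        """rfi negates freeze only if no enabled cause is still pending."""
        if not (self.ecr & self.der):
            self.freeze = False
```

In this model a nested monitor can execute rfi repeatedly without disturbing freeze, as long as it leaves the ECR unread until just before the last rfi.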
Figure 8-14 shows how the ECR and DER control the assertion and negation of
the freeze signal and the internal debug mode signal.
[Figure 8-14: freeze and debug mode logic. Elements shown: event valid, exception
cause register (ECR), rfi, a set/reset latch whose output is freeze, debug mode
enable, internal debug mode signal.]
0 0 X Read is performed.
ECR is cleared when read.
Reading DPDR yields indeterminate data.
0 1 0 Read is performed.
ECR is not cleared when read.
Reading DPDR yields indeterminate data.
0 1 1 Read is performed.
ECR is cleared when read.
0 0 X Write is performed.
Write to ECR is ignored.
Writing to DPDR is ignored.
0 1 1 Write is performed.
Write to ECR is ignored.
[Register diagrams: CMPA through CMPD (bits 30:31 reserved), CMPE and CMPF,
CMPG and CMPH. Reset value: undefined.]
[Register diagram: ICTRL, bits 16:31. IW2 (16:17), IW3 (18:19), SIW0EN (20),
SIW1EN (21), SIW2EN (22), SIW3EN (23), DIW0EN (24), DIW1EN (25), DIW2EN (26),
DIW3EN (27), IIFM (28), SER (29), ISCTL (30:31). Reset value: all zeros.]
[0:2] CTA Compare type of comparator A
[3:5] CTB Compare type of comparator B
[6:8] CTC Compare type of comparator C
[9:11] CTD Compare type of comparator D
For each of CTA, CTB, CTC, and CTD:
0xx = not active (reset value)
100 = equal
101 = less than
110 = greater than
111 = not equal
[12:13] IW0 I-bus 1st watchpoint programming 0x = not active (reset value)
10 = match from comparator A
11 = match from comparators (A&B)
[16:17] IW2 I-bus 3rd watchpoint programming 0x = not active (reset value)
10 = match from comparator C
11 = match from comparators (C&D)
[18:19] IW3 I-bus 4th watchpoint programming 0x = not active (reset value)
10 = match from comparator D
11 = match from comparators (C | D)
28 IIFM Ignore first match, only for I-bus 0 = Do not ignore first match, used for “go to x”
breakpoints (reset value)
1 = Ignore first match (used for “continue”)
[30:31] ISCTL Instruction fetch show cycle control
00 = Show cycle will be performed for all fetched instructions (reset value).
When in this mode, the machine is fetch serialized.
01 = Show cycle will be performed for all changes in the program flow.
10 = Show cycle will be performed for all indirect changes in the program flow.
11 = No show cycles will be performed for fetched instructions.
When the value of this field is changed (with the mtspr instruction), the new
value does not take effect until two instructions after the mtspr instruction.
The instruction immediately following mtspr is under control of the old ISCTL value.
The ICTRL is cleared following reset. Note that the machine is fetch serialized
whenever SER = 0b0 or ISCTL = 0b00.
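Because the field positions above use PowerPC bit numbering (bit 0 is the most significant bit), extracting a field from a 32-bit register value is easy to get wrong. The following sketch decodes a few ICTRL fields under that numbering. The function names are hypothetical, and the SER position (bit 29) is an assumption inferred from the register layout, where SER sits between IIFM (bit 28) and ISCTL (bits 30:31).

```python
def field(value, first, last, width=32):
    """Extract bits first..last of value, where bit 0 is the most significant bit."""
    nbits = last - first + 1
    return (value >> (width - 1 - last)) & ((1 << nbits) - 1)

# Compare-type encodings from the table above; 0xx means "not active".
COMPARE_TYPES = {0b100: "equal", 0b101: "less than",
                 0b110: "greater than", 0b111: "not equal"}

def decode_ictrl(ictrl):
    """Decode selected ICTRL fields into a dictionary (illustrative only)."""
    out = {}
    for name, first in (("CTA", 0), ("CTB", 3), ("CTC", 6), ("CTD", 9)):
        out[name] = COMPARE_TYPES.get(field(ictrl, first, first + 2), "not active")
    out["IIFM"] = field(ictrl, 28, 28)
    out["SER"] = field(ictrl, 29, 29)      # assumed position, see lead-in
    out["ISCTL"] = field(ictrl, 30, 31)
    # The machine is fetch serialized whenever SER = 0 or ISCTL = 0b00.
    out["fetch_serialized"] = out["SER"] == 0 or out["ISCTL"] == 0b00
    return out
```

Decoding the reset value (all zeros) with this sketch reports every comparator as not active and the machine as fetch serialized.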
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[0:2] CTE Compare type, comparator E
[3:5] CTF Compare type, comparator F
[6:8] CTG Compare type, comparator G
[9:11] CTH Compare type, comparator H
For each of CTE, CTF, CTG, and CTH:
0xx = not active (reset value)
100 = equal
101 = less than
110 = greater than
111 = not equal
[22:25] CGBMSK Byte mask for 1st L-data comparator
[26:29] CHBMSK Byte mask for 2nd L-data comparator
For each byte mask:
0000 = all bytes are not masked
0001 = the last byte of the word is masked
...
1111 = all bytes are masked
[30:31] — Reserved —
LW0E LW0IA LW0 LW0LA LW0 LW0LD LW0 LW1E LW1IA LW1 LW1LA
N IADC LADC LDDC N IADC
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 LW0EN 1st L-bus watchpoint enable bit
0 = watchpoint not enabled (reset value)
1 = watchpoint enabled
10 LW1EN 2nd L-bus watchpoint enable bit
0 = watchpoint not enabled (reset value)
1 = watchpoint enabled
[21:27] — Reserved —
For each watchpoint, three control register fields (LWxIA, LWxLA, LWxLD) must be
programmed. For a watchpoint to be asserted, all three conditions must be detected.
CNTV
RESET: UNDEFINED
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
RESERVED CNTC
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[16:29] — Reserved
CNTV
RESET: UNDEFINED
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
RESERVED CNTC
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[16:29] — Reserved
RESERVED CHST MCE DSE ISE EXTI ALE PRE FPUV DECE RESERVED SYSE TR FPAS
P E E
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[0:1] — Reserved
2 CHSTP Checkstop bit. Set when the processor enters checkstop state.
3 MCE Machine check interrupt bit. Set when a machine check exception (other than one caused by a
data storage or instruction storage error) is asserted.
4 DSE Data storage exception bit. Set when a machine check exception caused by a data storage er-
ror is asserted.
5 ISE Instruction storage exception bit. Set when a machine check exception caused by an instruction
storage error is asserted.
6 EXTI External interrupt bit. Set when the external interrupt is asserted.
7 ALE Alignment exception bit. Set when the alignment exception is asserted.
8 PRE Program exception bit. Set when the program exception is asserted.
9 FPUVE Floating point unavailable exception bit. Set when the floating-point unavailable exception is asserted.
10 DECE Decrementer exception bit. Set when the decrementer exception is asserted.
[11:12] — Reserved
13 SYSE System call exception bit. Set when the system call exception is asserted.
14 TR Trace exception bit. Set when in single-step mode or when in branch trace mode.
15 FPASE Floating point assist exception bit. Set when the floating-point assist exception is asserted.
16 — Reserved
17 SEE Software emulation exception. Set when the software emulation exception is asserted.
[18:27] — Reserved
28 LBRK L-bus breakpoint exception bit. Set when an L-bus breakpoint is asserted.
29 IBRK I-bus breakpoint exception bit. Set when an I-bus breakpoint is asserted.
30 EBRK External breakpoint exception bit. Set when an external breakpoint is asserted (by an on-chip
IMB or L-bus module, or by an external device or development system through the develop-
ment port).
31 DPI Development port interrupt bit. Set by the development port as a result of a debug station non-
maskable request or when debug mode is entered immediately out of reset.
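A debugger typically needs to turn a raw ECR value into the list of causes it represents. The sketch below does this with the bit assignments listed above, again using PowerPC bit numbering (bit 0 is the most significant bit). The names `ECR_BITS` and `decode_ecr` are hypothetical.

```python
# Bit number (0 = most significant) to field name, taken from the ECR table above.
ECR_BITS = {
    2: "CHSTP", 3: "MCE", 4: "DSE", 5: "ISE", 6: "EXTI", 7: "ALE",
    8: "PRE", 9: "FPUVE", 10: "DECE", 13: "SYSE", 14: "TR", 15: "FPASE",
    17: "SEE", 28: "LBRK", 29: "IBRK", 30: "EBRK", 31: "DPI",
}

def decode_ecr(value):
    """Return the names of all cause bits set in a 32-bit ECR value."""
    return [name for bit, name in sorted(ECR_BITS.items())
            if value & (1 << (31 - bit))]
```

For example, `decode_ecr(0x00000004)` returns `["IBRK"]`, because bit 29 corresponds to the value `1 << 2`.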
RESERVED CHST MCEE DSEE ISEE EXTIE ALEE PREE FPU- DE- RESERVED SY- TRE FPA-
PE VEE CEE SEE SEE
RESET:
0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
RESET:
0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
[0:1] — Reserved
4 DSEE Data storage exception (type of machine check exception) enable bit
0 = Debug mode entry disabled (reset value)
1 = Debug mode entry enabled
5 ISEE Instruction storage exception (type of machine check exception) enable bit
0 = Debug mode entry disabled (reset value)
1 = Debug mode entry enabled
[11:12] — Reserved
16 — Reserved
[18:27] — Reserved
Bits 0 to 5 always specify the primary opcode. Many instructions also have a sec-
ondary opcode. The remaining bits of the instruction contain one or more fields for
the different instruction formats.
Some instruction fields are reserved or must contain a predefined value as shown
in the individual instruction layouts. If a reserved field does not have all bits set to
zero, or if a field that must contain a particular value does not contain that value,
the instruction form is invalid.
BD 16:29 Immediate field specifying a 14-bit signed two's complement branch displacement that is
concatenated on the right with 0b00 and sign-extended to 32 bits.
crfD 6:8 Field used to specify one of the CR fields or one of the FPSCR fields as a destination.
crfS 11:13 Field used to specify one of the CR fields or one of the FPSCR fields as a source.
BI 11:15 Field used to specify a bit in the CR to be used as the condition of a branch conditional
instruction.
BO 6:10 Field used to specify options for the branch conditional instructions. The encoding is
described in 4.6 Flow Control Instructions.
crbD 6:10 Field used to specify a bit in the CR or in the FPSCR as the destination of the result of an
instruction.
CRM 12:19 Field mask used to identify the CR fields that are to be updated by the mtcrf instruction.
d 16:31 Immediate field specifying a 16-bit signed two's complement integer that is sign-extended
to 32 bits.
FM 7:14 Field mask used to identify the FPSCR fields that are to be updated by the mtfsf
instruction.
IMM 16:19 Immediate field used as the data to be placed into a field in the FPSCR.
LI 6:29 Immediate field specifying a 24-bit, signed two's complement integer that is concatenated
on the right with 0b00 and sign-extended to 32 bits.
LK 31 Link bit.
0 Does not update the link register.
1 Updates the link register. If the instruction is a branch instruction, the address of the
instruction following the branch instruction is placed into the link register.
MB, ME 21:25, 26:30 Fields used in rotate instructions to specify a 32-bit mask consisting of 1-bits from bit
MB+32 through bit ME+32 inclusive, and 0-bits elsewhere, as described in 4.3.4 Integer
Rotate and Shift Instructions.
NB 16:20 Field used to specify the number of bytes to move in an immediate string load or store.
Rc 31 Record bit
0 Does not update the condition register.
1 Updates the condition register (CR) to reflect the result of the operation.
For integer instructions, CR[0:3] are set to reflect the result as a signed quantity. The
result as an unsigned quantity or a bit string can be deduced from the EQ bit. For
floating-point instructions, CR[4:7] are set to reflect floating-point exception, floating-
point enabled exception, floating-point invalid operation exception, and floating-point
overflow exception.
SPR 11:20 Field used to specify a special purpose register for the mtspr and mfspr instructions. The
encoding is described in 4.7.2 Move to/from Special Purpose Register Instructions.
TO 6:10 Field used to specify the conditions on which to trap. The encoding is described in 4.6.7
Trap Instructions.
← Assignment
∗ Multiplication
+ Two’s-complement addition
|| Used to describe the concatenation of two values (i.e., 010 || 111 is the same as 010111)
(rA|0) The contents of rA if the rA field has the value 1–31, or the value 0 if the rA field is 0
. (period) As the last character of an instruction mnemonic, a period (.) means that the instruction
updates the condition register field.
DOUBLE(x) Result of converting x from floating-point single format to floating-point double format.
MASK(x, y) Mask having ones in positions x through y (wrapping if x > y) and 0’s elsewhere
ROTL[32](x, y) Result of rotating the 64-bit value x||x left y positions, where x is 32 bits long
SINGLE(x) Result of converting x from floating-point double format to floating-point single format.
(n)x The replication of x, n times (i.e., x concatenated to itself n-1 times). (n)0 and (n)1 are
special cases
undefined An undefined value. The value may vary from one implementation to another, and from
one execution to another on the same implementation.
characterization Reference to the setting of status bits, in a standard way that is explained in the text
CIA Current instruction address, which is the 32-bit address of the instruction being described
by a sequence of pseudocode. Used by relative branches to set the next instruction
address (NIA). Does not correspond to any architected register.
NIA Next instruction address, which is the 32-bit address of the next instruction to be
executed (the branch destination) after a successful branch. In pseudocode, a successful
branch is indicated by assigning a value to NIA. For instructions which do not branch, the
next instruction address is CIA +4.
∗, ÷ Left to right
|| Left to right
| Left to right
– (range) None
← None
Note that operators higher in Table 9-3 are applied before those lower in the table.
Operators at the same level in the table associate from left to right, from right to left,
or not at all, as shown.
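The MASK and ROTL[32] notations defined above are easy to misread because bit 0 is the most significant bit. The following sketch gives one possible 32-bit reading of both; the function names are hypothetical and the code is only an illustration of the notation, not part of the architecture definition.

```python
def mask(x, y):
    """MASK(x, y): ones in bit positions x..y (bit 0 = MSB), wrapping if x > y."""
    ones_from_x = 0xFFFFFFFF >> x                        # bits x..31 set
    ones_to_y = (0xFFFFFFFF << (31 - y)) & 0xFFFFFFFF    # bits 0..y set
    if x <= y:
        return ones_from_x & ones_to_y
    return ones_from_x | ones_to_y                       # wrapped mask

def rotl32(x, n):
    """ROTL[32](x, y): rotate the 32-bit value x left by n positions."""
    n &= 31
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF
```

With these definitions, `mask(28, 3)` yields `0xF000000F`, which shows the wrapping case where x > y.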
RTL Description of
Instruction Operation rD ← (rA) + (rB)
Text Description of The sum (rA) + (rB) is placed into rD.
Instruction Operation
Registers Altered by Instruction Other registers altered:
0x1F D A B OE 0x10A Rc
0 5 6 10 11 15 16 20 21 22 30 31
rD ← (rA) + (rB)
The sum (rA) + (rB) is placed into rD.
0x1F D A B OE 0xA Rc
0 5 6 10 11 15 16 20 21 22 30 31
rD ← (rA) + (rB)
The sum (rA) + (rB) is placed into rD.
0x1F D A B OE 0x8A Rc
0 5 6 10 11 15 16 20 21 22 30 31
addi rD,rA,SIMM
0x0E D A SIMM
0 5 6 10 11 15 16 31
if rA=0 then
rD←EXTS(SIMM)
else
rD←(rA)+EXTS(SIMM)
The sum (rA| 0) + SIMM is placed into rD.
• None
addic rD,rA,SIMM
0x0C D A SIMM
0 5 6 10 11 15 16 31
rD ← (rA) + EXTS(SIMM)
The sum (rA) + SIMM is placed into rD.
• XER:
Affected: CA
addic. rD,rA,SIMM
0x0D D A SIMM
0 5 6 10 11 15 16 31
rD ← (rA) + EXTS(SIMM)
The sum (rA) + SIMM is placed into rD.
addis rD,rA,SIMM
0x0F D A SIMM
0 5 6 10 11 15 16 31
if rA=0 then
rD←(SIMM || (16)0)
else
rD←(rA)+(SIMM || (16)0)
The sum (rA| 0) + (SIMM || 0x0000) is placed into rD.
• None
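The (rA|0) convention and the sign extension of SIMM in addi matter when the two instructions are combined to build a 32-bit constant. The sketch below models addi and addis on a list of 32 register values and shows the usual adjustment of the upper half when the lower half has its sign bit set. All function names are hypothetical, and `load_constant` is an illustrative helper, not an instruction.

```python
def exts16(v):
    """Sign-extend a 16-bit value to a Python integer."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def addi(regs, rd, ra, simm):
    base = 0 if ra == 0 else regs[ra]        # (rA|0): an rA field of 0 means the value 0
    regs[rd] = (base + exts16(simm)) & 0xFFFFFFFF

def addis(regs, rd, ra, simm):
    base = 0 if ra == 0 else regs[ra]
    regs[rd] = (base + ((simm & 0xFFFF) << 16)) & 0xFFFFFFFF

def load_constant(regs, rd, value):
    """Build a 32-bit constant with addis followed by addi (rd must not be 0)."""
    low = value & 0xFFFF
    # addi sign-extends the low half, so add 1 to the high half when low is negative.
    high = ((value >> 16) + (1 if low & 0x8000 else 0)) & 0xFFFF
    addis(regs, rd, 0, high)
    addi(regs, rd, rd, low)
```

The destination register must not be r0 in `load_constant`, because the second step would then read the value 0 instead of the register contents.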
Reserved
rD ← (rA) + XER[CA] - 1
The sum (rA)+XER[CA]+0xFFFF FFFF is placed into rD.
Reserved
rD ← (rA) + XER[CA]
The sum (rA)+XER[CA] is placed into rD.
0x1F S A B 0x1C Rc
0 5 6 10 11 15 16 20 21 30 31
0x1F S A B 0x3C Rc
0 5 6 10 11 15 16 20 21 30 31
rA←(rS)& ¬ (rB)
The contents of rS are ANDed with the one’s complement of the contents of rB and the
result is placed into rA.
andi. rA,rS,UIMM
0x1C S A UIMM
0 5 6 10 11 15 16 31
andis. rA,rS,UIMM
0x1D S A UIMM
0 5 6 10 11 15 16 31
rA←(rS) & (UIMM || (16)0)
The contents of rS are ANDed with UIMM || 0x0000 and the result is placed into rA.
0x12 LI AA LK
0 5 6 29 30 31
if AA then
NIA←EXTS(LI || 0b00)
else
NIA←CIA+EXTS(LI || 0b00)
if LK then
LR←CIA+4
target_addr specifies the branch target address.
If AA=0, then the branch target address is the sum of LI || 0b00 sign-extended and the
address of this instruction.
If AA=1, then the branch target address is the value LI || 0b00 sign-extended.
If LK=1, then the effective address of the instruction following the branch instruction is
placed into the link register.
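The target address computation for the unconditional branch can be sketched as follows. The function name and argument order are hypothetical; `li` is the raw 24-bit LI field, and the result is wrapped to 32 bits.

```python
def branch_target(cia, li, aa):
    """Compute NIA for a branch from the LI field, following the pseudocode above."""
    disp = (li & 0xFFFFFF) << 2        # LI || 0b00 gives a 26-bit value
    if disp & 0x2000000:               # sign bit of the 26-bit value
        disp -= 0x4000000              # EXTS: sign-extend
    base = 0 if aa else cia            # AA=1 selects absolute addressing
    return (base + disp) & 0xFFFFFFFF
```

An LI field of 0xFFFFFF, for instance, encodes a displacement of -4, so a relative branch at address 0x1000 targets 0x0FFC.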
0x10 BO BI BD AA LK
0 5 6 10 11 15 16 29 30 31
If AA=0, the branch target address is the sum of BD || 0b00 sign-extended and the ad-
dress of this instruction.
If LK=1, the effective address of the instruction following the branch instruction is placed
into the link register.
Decrement CTR, branch absolute if CTR non-zero bdnza target bca 16,0,target
Decrement CTR, branch and update LR if CTR non- bdnzl target bcl 16,0,target
zero
Decrement CTR, branch absolute and update LR if bdnzla target bcla 16,0,target
CTR non-zero
Decrement CTR, branch if false and CTR non-zero bdnzf BI,target bc 0,BI,target
Decrement CTR, branch absolute if false and CTR bdnzfa BI,target bca 0,BI,target
non-zero
Decrement CTR, branch and update LR if false and bdnzfl BI,target bcl 0,BI,target
CTR non-zero
Decrement CTR, branch absolute and update LR if bdnzfla BI,target bcla 0,BI,target
false and CTR non-zero
Decrement CTR, branch if true and CTR non-zero bdnzt BI,target bc 8,BI,target
Decrement CTR, branch absolute if true and CTR bdnzta BI,target bca 8,BI,target
non-zero
Decrement CTR, branch and update LR if true and bdnztl BI,target bcl 8,BI,target
CTR non-zero
Decrement CTR, branch absolute and update LR if bdnztla BI,target bcla 8,BI,target
true and CTR non-zero
Decrement CTR, branch absolute if CTR zero bdza target bca 18,0,target
Decrement CTR, branch and update LR if CTR zero bdzl target bcl 18,0,target
Decrement CTR, branch absolute and update LR if bdzla target bcla 18,0,target
CTR zero
Decrement CTR, branch if false and CTR zero bdzf BI,target bc 2,BI,target
Decrement CTR, branch absolute if false and CTR bdzfa BI,target bca 2,BI,target
zero
Decrement CTR, branch and update LR if false and bdzfl BI,target bcl 2,BI,target
CTR zero
Decrement CTR, branch absolute and update LR if bdzfla BI,target bcla 2,BI,target
false and CTR zero
Decrement CTR, branch if true and CTR zero bdzt BI,target bc 10,BI,target
Decrement CTR, branch absolute if true and CTR bdzta BI,target bca 10,BI,target
zero
Decrement CTR, branch and update LR if true and bdztl BI,target bcl 10,BI,target
CTR zero
Decrement CTR, branch absolute and update LR if bdztla BI,target bcla 10,BI,target
true and CTR zero
Branch absolute and update LR if equal beqla crX,target bcla 12, 4*crX+2,target
Branch and update LR if greater than or equal to bgel crX,target bcl 4,4*crX,target
Branch absolute and update LR if greater than or bgela crX,target bcla 4,4*crX,target
equal to
Branch absolute and update LR if greater than bgtla crX,target bcla 12,4*crX+1,target
Branch and update LR if less than or equal to blel crX,target bcl 4,4*crX+1,target
Branch absolute and update LR if less than or equal blela crX,target bcla 4,4*crX+1,target
to
Branch absolute and update LR if less than bltla crX,target bcla 12,4*crX,target
Branch absolute and update LR if not equal to bnela crX,target bcla 4,4*crX+2,target
Branch and update LR if not greater than bngl crX,target bcl 4,4*crX+1,target
Branch absolute and update LR if not greater than bngla crX,target bcla 4,4*crX+1,target
Branch and update LR if not less than bnll crX,target bcl 4,4*crX,target
Branch absolute and update LR if not less than bnlla crX,target bcla 4,4*crX,target
Branch and update LR if not summary overflow bnsl crX,target bcl 4,4*crX+3,target
Branch absolute and update LR if not summary bnsla crX,target bcla 4,4*crX+3,target
overflow
Branch absolute and update LR if not unordered bnula crX,target bcla 4,4*crX+3,target
NOTES:
1. If crX is not included in the operand list (for operations that use a cr field), cr0 is assumed.
Reserved
If LK=1, the effective address of the instruction following the branch instruction is placed
into the link register.
If the “decrement and test CTR” option is specified (BO[2]=0), the instruction form is in-
valid.
Reserved
if ¬ BO[2] then
CTR ← CTR-1
ctr_ok ← BO[2] | ((CTR≠0) ⊕ BO[3])
cond_ok ← BO[0] | (CR[BI] ≡ BO[1])
if ctr_ok & cond_ok then
NIA ← LR[0:29] || 0b00
if LK then
LR ← CIA+4
The BI field specifies the bit in the condition register to be used as the condition of the
branch. The BO field is used as described above, and the branch target address is
LR[0:29] || 0b00.
If LK=1 then the effective address of the instruction following the branch instruction is
placed into the link register.
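The branch-to-link-register pseudocode above can be traced with a short model. This is a sketch under stated assumptions: the function name and its return convention are hypothetical, `bo` is the 5-bit BO field as an integer (BO[0] is its most significant bit), and `cr` is the full 32-bit condition register with bit 0 as the most significant bit.

```python
def bclr(bo, bi, cr, ctr, lr, cia, lk=0):
    """Model one bclr/bclrl execution; returns (nia, ctr, lr)."""
    b = [(bo >> (4 - i)) & 1 for i in range(5)]    # b[i] corresponds to BO[i]
    if not b[2]:
        ctr = (ctr - 1) & 0xFFFFFFFF               # decrement CTR when BO[2] = 0
    ctr_ok = bool(b[2]) or ((ctr != 0) != bool(b[3]))
    cr_bit = (cr >> (31 - bi)) & 1
    cond_ok = bool(b[0]) or (cr_bit == b[1])
    if ctr_ok and cond_ok:
        nia = lr & 0xFFFFFFFC                      # LR[0:29] || 0b00, using the old LR
    else:
        nia = (cia + 4) & 0xFFFFFFFF
    if lk:
        lr = (cia + 4) & 0xFFFFFFFF
    return nia, ctr, lr
```

Note that the target is taken from the link register before LK updates it, which matches the order of the assignments in the pseudocode.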
Decrement CTR, branch to LR if false and CTR non- bdnzflr BI bclr 0,BI
zero
Decrement CTR, branch to LR if false and CTR non- bdnzflrl BI bclrl 0,BI
zero, update LR
Decrement CTR, branch to LR if true and CTR non- bdnztlr BI bclr 8,BI
zero
Decrement CTR, branch to LR if true and CTR non- bdnztlrl BI bclrl 8,BI
zero, update LR
Decrement CTR, branch to LR if false and CTR zero bdzflr BI bclr 2,BI
Decrement CTR, branch to LR if true and CTR zero bdztlr BI bclr 10,BI
Decrement CTR, branch to LR if true and CTR zero, bdztlrl BI bclrl 10,BI
update LR
Branch to LR if greater than or equal to, update LR bgelrl crX bclrl 4,4*crX
Branch to LR if less than or equal to, update LR blelrl crX bclrl 4,4*crX+1
NOTES:
1. If crX is not included in the operand list (for operations that use a cr field), cr0 is assumed.
cmp crfD,L,rA,rB
Reserved
a ← (rA)
b ← (rB)
if a < b then
c ← 0b100
else
if a > b then
c ← 0b010
else
c ← 0b001
CR[4∗crfD:4∗crfD+3] ← c || XER[SO]
The contents of rA are compared with the contents of rB, treating the operands as signed
integers. The result of the comparison is placed into CR Field crfD.
The L operand controls whether rA and rB are treated as 32-bit operands (L=0) or 64-bit
operands (L=1). For 32-bit PowerPC implementations such as the RCPU, if L=1, the in-
struction form is invalid.
cmpi crfD,L,rA,SIMM
Reserved
a ← (rA)
if a < EXTS(SIMM) then
c ← 0b100
else
if a > EXTS(SIMM) then
c ← 0b010
else
c ← 0b001
CR[4∗crfD:4∗crfD+3] ← c || XER[SO]
The contents of rA are compared with the sign-extended value of the SIMM field, treating
the operands as signed integers. The result of the comparison is placed into CR Field
crfD.
The L operand controls whether the operands are treated as 32-bit operands (L=0) or 64-bit
operands (L=1). For 32-bit PowerPC implementations such as the RCPU, if L=1, the
instruction form is invalid.
cmpl crfD,L,rA,rB
Reserved
a ← (rA)
b ← (rB)
if a <U b then
c ← 0b100
else
if a >U b then
c ← 0b010
else
c ← 0b001
CR[4∗crfD:4∗crfD+3] ← c || XER[SO]
The contents of rA are compared with the contents of rB, treating the operands as un-
signed integers. The result of the comparison is placed into CR Field crfD.
The L operand controls whether rA and rB are treated as 32-bit operands (L=0) or 64-bit
operands (L=1). For 32-bit PowerPC implementations such as the RCPU, if L=1, the in-
struction form is invalid.
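The only difference between cmp and cmpl is whether the 32-bit operands are read as signed or unsigned values. The sketch below computes the 4-bit result (LT, GT, EQ, SO) for both interpretations and places it into a CR field. The function names are hypothetical.

```python
def to_signed(v):
    """Interpret a 32-bit value as a signed two's complement integer."""
    v &= 0xFFFFFFFF
    return v - 0x100000000 if v & 0x80000000 else v

def compare(a, b, so=0, signed=True):
    """Return c || XER[SO] as a 4-bit value, following the pseudocode above."""
    if signed:
        a, b = to_signed(a), to_signed(b)
    else:
        a, b = a & 0xFFFFFFFF, b & 0xFFFFFFFF
    c = 0b100 if a < b else 0b010 if a > b else 0b001
    return (c << 1) | (so & 1)

def set_cr_field(cr, crfd, field4):
    """Write a 4-bit value into CR[4*crfD : 4*crfD+3] (bit 0 = MSB)."""
    shift = 28 - 4 * crfd
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | (field4 << shift)
```

Comparing 0xFFFFFFFF with 1 shows the difference: the signed comparison sets LT (the first operand is -1), while the unsigned comparison sets GT.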
cmpli crfD,L,rA,UIMM
Reserved
a ← (rA)
if a <U (0x0000 || UIMM) then
c ← 0b100
else
if a >U (0x0000 || UIMM) then
c ← 0b010
else
c ← 0b001
CR[4∗crfD:4∗crfD+3] ← c || XER[SO]
The contents of rA are compared with 0x0000 || UIMM, treating the operands as unsigned
integers. The result of the comparison is placed into CR Field crfD.
The L operand controls whether the operands are treated as 32-bit operands (L=0) or 64-bit
operands (L=1). For 32-bit PowerPC implementations such as the RCPU, if L=1, the
instruction form is invalid.
Reserved
n←0
do while n < 32
if rS[n]=1 then leave
n ← n+1
rA ← n
A count of the number of consecutive zero bits starting at bit 0 of rS is placed into rA. This
number ranges from 0 to 32, inclusive.
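The count-leading-zeros loop above translates directly into the following sketch, which scans from bit 0 (the most significant bit) downward. The function name is chosen for illustration.

```python
def cntlzw(rs):
    """Count consecutive zero bits starting at bit 0 (the MSB) of a 32-bit value."""
    rs &= 0xFFFFFFFF
    n = 0
    while n < 32:
        if (rs >> (31 - n)) & 1:   # rS[n] = 1: stop counting
            break
        n += 1
    return n
```

A value of zero yields 32, which is why the result range includes both end points.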
crand crbD,crbA,crbB
Reserved
• Condition Register:
Affected: Bit specified by operand crbD
This instruction is defined by the PowerPC UISA.
crandc crbD,crbA,crbB
Reserved
• Condition Register:
Affected: Bit specified by operand crbD
This instruction is defined by the PowerPC UISA.
creqv crbD,crbA,crbB
Reserved
• Condition Register:
Affected: Bit specified by operand crbD
This instruction is defined by the PowerPC UISA.
crnand crbD,crbA,crbB
Reserved
• Condition Register:
Affected: Bit specified by operand crbD
This instruction is defined by the PowerPC UISA.
crnor crbD,crbA,crbB
Reserved
• Condition Register:
Affected: Bit specified by operand crbD
This instruction is defined by the PowerPC UISA.
cror crbD,crbA,crbB
Reserved
• Condition Register:
Affected: Bit specified by operand crbD
This instruction is defined by the PowerPC UISA.
crorc crbD,crbA,crbB
Reserved
• Condition Register:
Affected: Bit specified by operand crbD
This instruction is defined by the PowerPC UISA.
crxor crbD,crbA,crbB
Reserved
• Condition Register:
Affected: Bit specified by crbD
This instruction is defined by the PowerPC UISA.
0x1F D A B OE 0x1EB Rc
0 5 6 10 11 15 16 20 21 22 30 31
dividend ←(rA)
divisor ←(rB)
rD ← dividend ÷ divisor
Register rA is the 32-bit dividend. Register rB is the 32-bit divisor. A 32-bit quotient is
formed and placed into rD. The remainder is not supplied as a result.
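Both operands are interpreted as signed integers. The quotient is the unique signed
integer that satisfies the following:
dividend = (quotient ∗ divisor) + r
where
0 ≤ r < |divisor| if the dividend is non-negative, and
-|divisor| < r ≤ 0 if the dividend is negative.
If an attempt is made to perform either of the following divisions, the contents of rD are undefined:
0x8000 0000 / -1
<anything> / 0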
0x1F D A B OE 0x1CB Rc
0 5 6 10 11 15 16 20 21 22 30 31
dividend ← (rA)
divisor ← (rB)
rD ← dividend ÷ divisor
The dividend is the contents of rA. The divisor is the contents of rB. A 32-bit quotient is
formed and placed into rD. The remainder is not supplied as a result.
Both operands are interpreted as unsigned integers, except that if Rc = 1 the first three
bits of the CR0 field are set by signed comparison of the result to zero. The quotient is the
unique unsigned integer that satisfies the following:
dividend=(quotient ∗ divisor)+r
where
0 ≤ r < divisor.
If an attempt is made to divide by zero, then the following conditions result:
• The contents of rD are undefined.
• If Rc = 1, the contents of the LT, GT, and EQ bits of the CR0 field are undefined.
• If OE = 1, then OV is set to 1.
Other registers altered:
• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
• XER:
Affected: SO, OV (if OE=1)
The 32-bit unsigned remainder of dividing rA by rB can be computed as follows:
divwu rD,rA,rB # rD=quotient
mullw rD,rD,rB # rD=quotient∗divisor
subf rD,rD,rA # rD=remainder
This instruction is defined by the PowerPC UISA.
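The three-instruction remainder sequence shown above can be checked with a small model of 32-bit arithmetic. The helper names mirror the mnemonics but are otherwise hypothetical. Division by zero raises an error here because the architectural result is undefined.

```python
MASK32 = 0xFFFFFFFF

def divwu(ra, rb):
    """Unsigned 32-bit quotient; the result is undefined for a zero divisor."""
    if rb & MASK32 == 0:
        raise ValueError("divide by zero: contents of rD are undefined")
    return (ra & MASK32) // (rb & MASK32)

def mullw(ra, rb):
    """Low-order 32 bits of the product."""
    return (ra * rb) & MASK32

def subf(ra, rb):
    """Subtract from: returns (rB) - (rA), wrapped to 32 bits."""
    return (rb - ra) & MASK32

def remainder_unsigned(ra, rb):
    rd = divwu(ra, rb)    # rD = quotient
    rd = mullw(rd, rb)    # rD = quotient * divisor
    rd = subf(rd, ra)     # rD = dividend - quotient * divisor
    return rd
```

In the real instruction sequence rD must be a different register from rA and rB, since the dividend and divisor are read again after rD has been written.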
Reserved
The eieio instruction provides an ordering function for the effects of load and store instruc-
tions executed by a given processor. Executing an eieio instruction ensures that all mem-
ory accesses previously initiated by the given processor are complete with respect to main
memory before any memory accesses subsequently initiated by the given processor ac-
cess main memory.
• None
The eieio instruction is intended for use only in performing memory-mapped I/O opera-
tions and to prevent load/store combining operations in main memory. It can be thought
of as placing a barrier into the stream of memory accesses issued by a processor, such
that any given memory access appears to be on the same side of the barrier to both the
processor and the I/O device.
The eieio instruction may complete before previously initiated memory accesses have
been performed with respect to other processors and mechanisms.
0x1F S A B 0x11C Rc
0 5 6 10 11 15 16 20 21 30 31
rA ← ((rS) ≡ (rB))
The contents of rS are XORed with the contents of rB and the complemented result is
placed into rA.
Reserved
S ← rS[24]
rA[24:31] ← rS[24:31]
rA[0:23] ← (24)S
The contents of rS[24:31] are placed into rA[24:31]. Bit 24 of rS is placed into rA[0:23].
Reserved
S ← rS[16]
rA[16:31]← rS[16:31]
rA[0:15] ← (16)S
The contents of rS[16:31] are placed into rA[16:31]. Bit 16 of rS is placed into rA[0:15].
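The two sign-extension operations above copy the sign bit of the low-order byte or halfword into all higher bit positions. A minimal sketch on 32-bit values, with function names chosen to mirror the byte and halfword cases:

```python
def extsb(rs):
    """Sign-extend the low-order byte rS[24:31] to 32 bits."""
    s = (rs >> 7) & 1                      # bit 24 in PowerPC numbering
    return (0xFFFFFF00 if s else 0) | (rs & 0xFF)

def extsh(rs):
    """Sign-extend the low-order halfword rS[16:31] to 32 bits."""
    s = (rs >> 15) & 1                     # bit 16 in PowerPC numbering
    return (0xFFFF0000 if s else 0) | (rs & 0xFFFF)
```

The upper bits of the source are discarded in both cases, so `extsh(0xABCD1234)` gives `0x00001234`.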
Reserved
The contents of frB with bit 0 cleared to zero are placed into frD.
Reserved
The floating-point operand in frA is added to the floating-point operand in frB. If the most
significant bit of the resultant significand is not a one, the result is normalized. The result
is rounded to the target precision under control of the floating-point rounding control field
RN of the FPSCR and placed into frD.
Floating-point addition is based on exponent comparison and addition of the two signifi-
cands. The exponents of the two operands are compared, and the significand accompa-
nying the smaller exponent is shifted right, with its exponent increased by one for each bit
shifted, until the two exponents are equal. The two significands are then added or sub-
tracted as appropriate, depending on the signs of the operands, to form an intermediate
sum. All 53 bits in the significand as well as all three guard bits (G, R, and X) enter into
the computation.
If a carry occurs, the sum's significand is shifted right one bit position and the exponent is
increased by one. FPSCR[FPRF] is set to the class and sign of the result, except for in-
valid operation exceptions when FPSCR[VE]=1.
Reserved
0 5 6 10 11 15 16 20 21 25 26 30 31
The floating-point operand in frA is added to the floating-point operand in frB. If the most
significant bit of the resultant significand is not a one, the result is normalized. The result
is rounded to the target precision under control of the floating-point rounding control field
RN of the FPSCR and placed into frD.
Floating-point addition is based on exponent comparison and addition of the two signifi-
cands. The exponents of the two operands are compared, and the significand accompa-
nying the smaller exponent is shifted right, with its exponent increased by one for each bit
shifted, until the two exponents are equal. The two significands are then added or sub-
tracted as appropriate, depending on the signs of the operands, to form an intermediate
sum. All 53 bits in the significand as well as all three guard bits (G, R, and X) enter into
the computation.
If a carry occurs, the sum’s significand is shifted right one bit position and the exponent is
increased by one. FPSCR[FPRF] is set to the class and sign of the result, except for in-
valid operation exceptions when FPSCR[VE]=1.
fcmpo crfD,frA,frB
Reserved
If at least one of the operands is a NaN, either quiet or signaling, then CR Field crfD and
FPSCR[FPCC] are set to reflect unordered. If at least one of the operands is a signaling
NaN, then FPSCR[VXSNAN] is set, and if invalid operation is disabled (FPSCR[VE]=0)
then FPSCR[VXVC] is set. If neither operand is a signaling NaN, but at least one is a
QNaN, then FPSCR[VXVC] is set.
fcmpu crfD,frA,frB
Reserved
If at least one of the operands is a NaN, either quiet or signaling, then CR Field crfD and
FPSCR[FPCC] are set to reflect unordered. If at least one of the operands is a signaling
NaN, then FPSCR[VXSNAN] is set.
Reserved
The floating-point operand in register frB is converted to a 32-bit signed integer, using the
rounding mode specified by FPSCR[RN], and placed in frD[32:63]. frD[0:31] are undefined.
If the contents of frB are greater than 2^31 - 1, frD[32:63] are set to 0x7FFF FFFF.
If the contents of frB are less than -2^31, frD[32:63] are set to 0x8000 0000.
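The saturation rules above can be illustrated with a short sketch. It assumes the round-to-nearest mode with ties going to the even integer (which is what Python's built-in `round` does for floats) and does not model NaN operands or FPSCR updates. The function name is hypothetical, and the result is returned as the 32-bit pattern that would be placed in frD[32:63].

```python
def fctiw_round_nearest(x):
    """Convert a float to a saturated 32-bit signed integer bit pattern."""
    if x > 2**31 - 1:
        return 0x7FFFFFFF              # positive overflow saturates
    if x < -2**31:
        return 0x80000000              # negative overflow saturates
    return round(x) & 0xFFFFFFFF       # round to nearest, ties to even
```

Negative results appear in two's complement form, so converting -1.0 produces `0xFFFFFFFF`.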
Reserved
If the operand in frB is greater than 2^31 - 1, frD[32:63] are set to 0x7FFF FFFF.
If the operand in frB is less than -2^31, frD[32:63] are set to 0x8000 0000.
Reserved
The floating-point operand in register frA is divided by the floating-point operand in regis-
ter frB. No remainder is preserved.
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation ex-
ceptions when FPSCR[VE]=1 and zero divide exceptions when FPSCR[ZE]=1.
Reserved
The floating-point operand in register frA is divided by the floating-point operand in regis-
ter frB. No remainder is preserved.
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation ex-
ceptions when FPSCR[VE]=1 and zero divide exceptions when FPSCR[ZE]=1.
0x3F D A B C 0x1D Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
frD ← [(frA)∗(frC)]+(frB)
The floating-point operand in register frA is multiplied by the floating-point operand in reg-
ister frC. The floating-point operand in register frB is added to this intermediate result.
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation ex-
ceptions when FPSCR[VE]=1.
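In the PowerPC architecture the multiply-add is performed on the unrounded product, so the result is rounded only once. The sketch below imitates that behavior for finite double-precision operands by computing the exact value with rational arithmetic and rounding at the end. It is an illustration only: the function name is hypothetical, rounding is fixed to round-to-nearest, and infinities, NaNs, and FPSCR effects are not modeled.

```python
from fractions import Fraction

def fmadd(fra, frc, frb):
    """Return (frA * frC) + frB with a single rounding at the end."""
    exact = Fraction(fra) * Fraction(frc) + Fraction(frb)   # no intermediate rounding
    return float(exact)                                     # one rounding to double

# Example where the single rounding is visible:
a = 1.0 + 2.0**-30
c = 1.0 - 2.0**-30
fused = fmadd(a, c, -1.0)        # exact product is 1 - 2^-60
separate = a * c + -1.0          # the product rounds to 1.0 first
```

With these operands the separately rounded product is exactly 1.0, so the two-step result is 0.0, while the single-rounding result keeps the small difference.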
0x3B D A B C 0x1D Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
The following operation is performed:
frD ← [(frA)∗(frC)]+(frB)
The floating-point operand in register frA is multiplied by the floating-point operand in reg-
ister frC. The floating-point operand in register frB is added to this intermediate result.
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation ex-
ceptions when FPSCR[VE]=1.
Reserved
0x3F D A B C 0x1C Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation ex-
ceptions when FPSCR[VE]=1.
0x3B D A B C 0x1C Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation ex-
ceptions when FPSCR[VE]=1.
Reserved
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation ex-
ceptions when FPSCR[VE]=1.
Reserved
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation ex-
ceptions when FPSCR[VE]=1.
Reserved
Reserved
0x3F D A B C 0x1F Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
The following operation is performed:
frD ← -([(frA)∗(frC)]+(frB))
The floating-point operand in register frA is multiplied by the floating-point operand in reg-
ister frC. The floating-point operand in register frB is added to this intermediate result. If
an operand is a denormalized number then it is prenormalized before the operation is
started. If the most significant bit of the resultant significand is not a one the result is nor-
malized. The result is rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR, then negated and placed into frD.
This instruction produces the same result as would be obtained by using the floating-point
multiply-add instruction and then negating the result, with the following exceptions:
0x3B D A B C 0x1F Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
The following operation is performed:
frD ← -([(frA)∗(frC)]+(frB))
The floating-point operand in register frA is multiplied by the floating-point operand in reg-
ister frC. The floating-point operand in register frB is added to this intermediate result. If
an operand is a denormalized number then it is prenormalized before the operation is
started. If the most significant bit of the resultant significand is not a one the result is nor-
malized. The result is rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR, then negated and placed into frD.
This instruction produces the same result as would be obtained by using the floating-point
multiply-add instruction and then negating the result, with the following exceptions:
0x3F D A B C 0x1E Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
The following operation is performed:
This instruction produces the same result obtained by negating the result of a floating mul-
tiply-subtract instruction with the following exceptions:
0x3B D A B C 0x1E Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
The following operation is performed:
frD ← -([(frA)∗(frC)]-(frB))
This instruction produces the same result obtained by negating the result of a floating mul-
tiply-subtract instruction with the following exceptions:
FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation
exceptions when FPSCR[VE]=1.
icbi rA,rB
If a block containing the byte addressed by EA is in the instruction cache of this processor,
the block is made invalid in the processor. Subsequent references cause the block to be
refetched.
NOTE
According to the PowerPC architecture, if the addressed block is in
coherency-required mode, the block is made invalid in all affected
processors. In the RCPU, however, all instruction memory is consid-
ered to be in coherency-not-required mode.
• None
This instruction is defined by the PowerPC VEA.
isync
• None
This instruction is defined by the PowerPC VEA.
lbz rD,d(rA)
0x22 D A d
0 5 6 10 11 15 16 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+EXTS(d)
rD ← (24)0 || MEM(EA, 1)
The effective address is the sum (rA|0) + d. The byte in memory addressed by EA is load-
ed into rD[24:31]. Bits rD[0:23] are cleared to zero.
• None
This instruction is defined by the PowerPC UISA.
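The (rA|0) convention and the EXTS(d) sign extension used in the lbz address calculation can be modeled as follows. This is a minimal Python sketch, not RCPU code; the names `exts16` and `lbz`, the `gpr` list, and the `mem` bytes object are assumptions introduced only for illustration.

```python
MASK32 = 0xFFFFFFFF

def exts16(d):
    """EXTS(d): sign-extend the 16-bit displacement field."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def lbz(gpr, mem, rd, ra, d):
    """Hypothetical model of lbz rD,d(rA)."""
    base = 0 if ra == 0 else gpr[ra]   # (rA|0): register number 0 means the value 0
    ea = (base + exts16(d)) & MASK32   # 32-bit effective address
    gpr[rd] = mem[ea]                  # byte goes to rD[24:31]; rD[0:23] are zero
    return ea

gpr = [0] * 32
gpr[0] = 0x1000                        # ignored when r0 is named as the base
gpr[4] = 8
mem = bytes(range(16))                 # mem[i] == i
lbz(gpr, mem, 3, 4, 0xFFFE)            # d = -2, so EA = 8 - 2 = 6
lbz(gpr, mem, 5, 0, 3)                 # rA = 0, so EA = 0 + 3 = 3
```

Note that naming r0 as the base register selects the constant zero, not the contents of r0.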
lbzu rD,d(rA)
0x23 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
rD←(24)0 || MEM(EA, 1)
rA←EA
EA is the sum (rA) + d. The byte in memory addressed by EA is loaded into rD[24:31].
Bits rD[0:23] are cleared to zero. EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lbzux rD,rA,rB
0x1F D A B 0x77 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
rD ← (24)0 || MEM(EA, 1)
rA ← EA
EA is the sum (rA) + (rB). The byte in memory addressed by EA is loaded into rD[24:31].
Bits rD[0:23] are cleared to zero. EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lbzx rD,rA,rB
0x1F D A B 0x57 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
rD ← (24)0 || MEM(EA, 1)
EA is the sum (rA|0) + (rB). The byte in memory addressed by EA is loaded into rD[24:31].
Bits rD[0:23] are cleared to zero.
• None
This instruction is defined by the PowerPC UISA.
lfd frD,d(rA)
0x32 D A d
0 5 6 10 11 15 16 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+EXTS(d)
frD ← MEM(EA, 8)
EA is the sum (rA|0) + d.
• None
This instruction is defined by the PowerPC UISA.
lfdu frD,d(rA)
0x33 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
frD ← MEM(EA, 8)
rA ← EA
EA is the sum (rA) + d. EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lfdux frD,rA,rB
0x1F D A B 0x277 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
frD ← MEM(EA, 8)
rA ← EA
EA is the sum (rA) + (rB). EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lfdx frD,rA,rB
0x1F D A B 0x257 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
frD ← MEM(EA, 8)
EA is the sum (rA|0) + (rB).
• None
This instruction is defined by the PowerPC UISA.
lfs frD,d(rA)
0x30 D A d
0 5 6 10 11 15 16 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+EXTS(d)
frD ← DOUBLE(MEM(EA, 4))
EA is the sum (rA|0) + d.
• None
This instruction is defined by the PowerPC UISA.
lfsu frD,d(rA)
0x31 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
frD ← DOUBLE(MEM(EA, 4))
rA ← EA
EA is the sum (rA) + d. EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lfsux frD,rA,rB
0x1F D A B 0x237 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
frD ← DOUBLE(MEM(EA, 4))
rA ← EA
EA is the sum (rA) + (rB). EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lfsx frD,rA,rB
0x1F D A B 0x217 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
frD ← DOUBLE(MEM(EA, 4))
EA is the sum (rA|0) + (rB).
• None
This instruction is defined by the PowerPC UISA.
lha rD,d(rA)
0x2A D A d
0 5 6 10 11 15 16 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+EXTS(d)
rD ← EXTS(MEM(EA, 2))
EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into
rD[16:31]. Bits rD[0:15] are filled with a copy of bit 0 of the loaded half word.
• None
This instruction is defined by the PowerPC UISA.
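The sign extension performed by lha can be illustrated with a short Python sketch. It assumes big-endian byte ordering in memory; `exts_half` and `lha_value` are hypothetical helper names, not part of any RCPU tool.

```python
def exts_half(value):
    """EXTS for a half word, returned as a 32-bit register image."""
    value &= 0xFFFF
    if value & 0x8000:          # bit 0 of the half word (its sign bit) is set
        value |= 0xFFFF0000     # copy it into rD[0:15]
    return value

def lha_value(mem, ea):
    """Value that lha would place into rD for the half word at ea
    (assumes big-endian byte ordering)."""
    half = (mem[ea] << 8) | mem[ea + 1]
    return exts_half(half)

mem = bytes([0xFF, 0xFE, 0x12, 0x34])
negative = lha_value(mem, 0)    # half word 0xFFFE, i.e. -2
positive = lha_value(mem, 2)    # half word 0x1234
```

A load with lhz at the same address would leave rD[0:15] cleared instead of sign-filled.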
lhau rD,d(rA)
0x2B D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
rD ← EXTS(MEM(EA, 2))
rA ← EA
EA is the sum (rA) + d. The half word in memory addressed by EA is loaded into
rD[16:31]. Bits rD[0:15] are filled with a copy of bit 0 of the loaded half word.
EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lhaux rD,rA,rB
0x1F D A B 0x177 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
rD ← EXTS(MEM(EA, 2))
rA ← EA
EA is the sum (rA) + (rB). The half word in memory addressed by EA is loaded into
rD[16:31]. Bits rD[0:15] are filled with a copy of bit 0 of the loaded half word.
EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lhax rD,rA,rB
0x1F D A B 0x157 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
rD ← EXTS(MEM(EA, 2))
EA is the sum (rA|0) + (rB). The half word in memory addressed by EA is loaded into
rD[16:31]. Bits rD[0:15] are filled with a copy of bit 0 of the loaded half word.
• None
This instruction is defined by the PowerPC UISA.
lhbrx rD,rA,rB
0x1F D A B 0x316 0
0 5 6 10 11 15 16 20 21 30 31
if rA=0 then b ← 0
else b ← (rA)
EA ← b+(rB)
rD ← (16)0 || MEM(EA+1, 1) || MEM(EA,1)
EA is the sum (rA|0) + (rB). Bits 0:7 of the half word in memory addressed by EA are load-
ed into rD[24:31]. Bits 8:15 of the half word in memory addressed by EA are loaded into
rD[16:23]. Bits rD[0:15] are cleared to zero.
Some PowerPC implementations may run the lhbrx instructions with greater latency than
other types of load instructions. This is not the case in the RCPU. This instruction operates
with the same latency as other load instructions.
• None
This instruction is defined by the PowerPC UISA.
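The byte reversal performed by lhbrx can be compared with an ordinary half-word load in a few lines of Python. This is an illustrative sketch only; `lhbrx_value` and `lhz_value` are hypothetical names, and `mem` stands in for byte-addressed memory.

```python
def lhbrx_value(mem, ea):
    """rD <- (16)0 || MEM(EA+1, 1) || MEM(EA, 1)"""
    return (mem[ea + 1] << 8) | mem[ea]

def lhz_value(mem, ea):
    """rD <- (16)0 || MEM(EA, 2), for comparison (big-endian byte order)."""
    return (mem[ea] << 8) | mem[ea + 1]

mem = bytes([0x12, 0x34])
swapped = lhbrx_value(mem, 0)   # byte at EA ends up in rD[24:31]
normal = lhz_value(mem, 0)      # byte at EA ends up in rD[16:23]
```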
lhz rD,d(rA)
0x28 D A d
0 5 6 10 11 15 16 31
• None
This instruction is defined by the PowerPC UISA.
lhzu rD,d(rA)
0x29 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
rD ← (16)0 || MEM(EA, 2)
rA ← EA
EA is the sum (rA) + d. The half word in memory addressed by EA is loaded into
rD[16:31]. Bits rD[0:15] are cleared to zero. EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lhzux rD,rA,rB
0x1F D A B 0x137 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
rD←(16)0 || MEM(EA, 2)
rA←EA
EA is the sum (rA) + (rB). The half word in memory addressed by EA is loaded into
rD[16:31]. Bits rD[0:15] are cleared to zero. EA is placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lhzx rD,rA,rB
0x1F D A B 0x117 0
0 5 6 10 11 15 16 20 21 30 31
• None
This instruction is defined by the PowerPC UISA.
lmw rD,d(rA)
0x2E D A d
0 5 6 10 11 15 16 31
n=(32-rD).
n consecutive words starting at EA are loaded into the 32 bits of GPRs rD through r31.
EA must be a multiple of four; otherwise, the system alignment exception handler is in-
voked.
• None
This instruction is defined by the PowerPC UISA.
lswi rD,rA,NB
0x1F D A NB 0x255 0
0 5 6 10 11 15 16 20 21 30 31
Let n=NB if NB≠0, n=32 if NB=0; n is the number of bytes to load. Let nr=CEIL(n/4); nr is
the number of registers to be loaded with data.
n consecutive bytes starting at the EA are loaded into GPRs rD through rD+nr-1. Bytes
are loaded left to right in each register. The sequence of registers wraps around to r0 if
required. If the four bytes of register rD+nr-1 are only partially filled, the unfilled low-order
byte(s) of that register are cleared to zero.
• None
This instruction is defined by the PowerPC UISA.
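The register packing described for lswi (left-to-right byte fill, wrap-around to r0, zero padding of the last register) can be sketched in Python as follows. The function name `lswi`, the `gpr` list, and the already-computed `ea` argument are assumptions made for this illustration.

```python
def lswi(gpr, mem, rd, ea, nb):
    """Hypothetical model of lswi: load n bytes starting at ea into rD onward."""
    n = 32 if nb == 0 else nb              # NB = 0 means 32 bytes
    nr = (n + 3) // 4                      # nr = CEIL(n/4) registers
    for k in range(nr):
        start = 4 * k
        chunk = bytes(mem[ea + start : ea + min(start + 4, n)])
        chunk = chunk.ljust(4, b"\x00")    # unfilled low-order bytes are zero
        gpr[(rd + k) % 32] = int.from_bytes(chunk, "big")   # wraps to r0
    return nr

gpr = [0xDEADBEEF] * 32
mem = bytes(range(1, 17))                  # 0x01, 0x02, ...
count = lswi(gpr, mem, 31, 0, 6)           # six bytes into r31 and then r0
```

With six bytes and rD = r31, the second register loaded is r0, and its two low-order bytes are cleared.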
lswx rD,rA,rB
0x1F D A B 0x215 0
0 5 6 10 11 15 16 20 21 30 31
If n>0, n consecutive bytes starting at EA are loaded into GPRs rD through rD+nr-1.
Bytes are loaded left to right in each register. The sequence of registers wraps around to
r0 if required. If the bytes of rD+nr-1 are only partially filled, the unfilled low-order byte(s)
of that register are cleared to zero.
If n=0, the content of rD is undefined.
• None
This instruction is defined by the PowerPC UISA.
lwarx rD,rA,rB
0x1F D A B 0x14 0
0 5 6 10 11 15 16 20 21 30 31
This instruction creates a reservation for use by a store word conditional instruction. An
address computed from the EA is associated with the reservation, and replaces any ad-
dress previously associated with the reservation: the manner in which the address to be
associated with the reservation is computed from the EA is described in 4.1.2 Addressing
Modes and Effective Address Calculation.
• None
This instruction is defined by the PowerPC UISA.
lwbrx rD,rA,rB
0x1F D A B 0x216 0
0 5 6 10 11 15 16 20 21 30 31
Some PowerPC implementations may run the lwbrx instructions with greater latency than
other types of load instructions. This is not the case in the RCPU. This instruction operates
with the same latency as other load instructions.
• None
This instruction is defined by the PowerPC UISA.
lwz rD,d(rA)
0x20 D A d
0 5 6 10 11 15 16 31
• None
This instruction is defined by the PowerPC UISA.
lwzu rD,d(rA)
0x21 D A d
0 5 6 10 11 15 16 31
EA ← (rA)+EXTS(d)
rD←MEM(EA, 4)
rA←EA
EA is the sum (rA) + d. The word in memory addressed by EA is loaded into rD. EA is
placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lwzux rD,rA,rB
0x1F D A B 0x37 0
0 5 6 10 11 15 16 20 21 30 31
EA ← (rA)+(rB)
rD←MEM(EA, 4)
rA←EA
EA is the sum (rA) + (rB). The word in memory addressed by EA is loaded into rD. EA is
placed into rA.
• None
This instruction is defined by the PowerPC UISA.
lwzx rD,rA,rB
0x1F D A B 0x17 0
0 5 6 10 11 15 16 20 21 30 31
• None
This instruction is defined by the PowerPC UISA.
mcrf crfD,crfS
CR[4∗crfD:4∗crfD+3] ← CR[4∗crfS:4∗crfS+3]
The contents of condition register field crfS are copied into condition register field crfD.
All other condition register fields remain unchanged.
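The field copy performed by mcrf can be modeled on a 32-bit integer image of the CR. This Python sketch uses the big-endian bit numbering of this manual (field 0 is the most significant nibble); `cr_field` and `mcrf` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF

def cr_field(cr, n):
    """Return CR[4*n : 4*n+3], where field 0 is the most significant nibble."""
    return (cr >> (28 - 4 * n)) & 0xF

def mcrf(cr, crfd, crfs):
    """Copy CR field crfS into CR field crfD; other fields are unchanged."""
    shift = 28 - 4 * crfd
    cleared = cr & ~(0xF << shift) & MASK32
    return cleared | (cr_field(cr, crfs) << shift)

new_cr = mcrf(0x12345678, 0, 7)   # copy field 7 (0x8) into field 0
```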
mcrfs crfD,crfS
0 5 6 8 9 10 11 13 14 15 16 20 21 30 31
The contents of FPSCR field crfS are copied to CR field crfD. All other CR fields are
unchanged. All exception bits copied, except FEX and VX, are cleared in the FPSCR.
mcrxr crfD
CR[4∗crfD:4∗crfD+3]←XER[0:3]
XER[0:3]← 0b0000
The contents of XER[0:3] are copied into the condition register field designated by crfD.
All other fields of the condition register remain unchanged. XER[0:3] is cleared to zero.
mfcr rD
0 5 6 10 11 15 16 20 21 30 31
rD← CR
The contents of the condition register are placed into rD.
• None
This instruction is defined by the PowerPC UISA.
0 5 6 10 11 15 16 20 21 30 31
The contents of the FPSCR are placed into frD[32:63]. frD[0:31] are undefined.
mfmsr rD
0 5 6 10 11 15 16 20 21 30 31
rD← MSR
The contents of the MSR are placed into rD.
• None
This instruction is defined by the PowerPC OEA.
mfspr rD,SPR
0 5 6 10 11 20 21 30 31
n←SPR[5:9] ||SPR[0:4]
rD← SPR(n)
The SPR field denotes a special purpose register, encoded as shown in Table 4-29, Ta-
ble 4-30, and Table 4-31. The contents of the designated special purpose register are
placed into rD.
For mtspr and mfspr instructions, the SPR number coded in assembly language does
not appear directly as a 10-bit binary number in the instruction. The number coded is split
into two 5-bit halves that are reversed in the instruction, with the high-order 5 bits appear-
ing in bits 16 to 20 of the instruction and the low-order 5 bits in bits 11 to 15.
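The swap of the two 5-bit halves described above can be checked with a small Python sketch. The helper names `spr_to_field` and `field_to_spr` are hypothetical; the 10-bit field value returned corresponds to instruction bits 11 to 20, with bit 11 as its most significant bit.

```python
def spr_to_field(spr_number):
    """Encode an SPR number as the 10-bit field held in instruction bits 11:20."""
    high = (spr_number >> 5) & 0x1F   # high-order 5 bits go to bits 16:20
    low = spr_number & 0x1F           # low-order 5 bits go to bits 11:15
    return (low << 5) | high

def field_to_spr(field):
    """Decode the instruction field back to the SPR number (the same swap)."""
    return ((field & 0x1F) << 5) | ((field >> 5) & 0x1F)

field_8 = spr_to_field(8)      # low half = 0b01000, high half = 0b00000
field_268 = spr_to_field(268)  # low half = 0b01100, high half = 0b01000
```

Because the operation is a swap of two equal-width halves, encoding and decoding are the same transformation.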
If the SPR field contains any value other than one of the values shown in one of the tables
listed above, one of the following occurs:
If the SPR field contains a value that is not valid for the RCPU, the instruction form is in-
valid. For an invalid instruction form in which SPR[0]=1, if MSR[PR]=1 a supervisor-level
instruction type program exception will occur instead of a no-op.
The execution unit that executes the mfspr instruction depends on the SPR. Moves from
the XER and from SPRs that are physically implemented outside the processor are han-
dled by the LSU. Moves from the FPSCR and FPECR are executed by the FPU. In all oth-
er cases, the BPU executes the mfspr instruction.
• None
mftb rD,TBR
0 5 6 10 11 20 21 30 31
n←TBR[5:9] ||TBR[0:4]
if n = 268 then
rD ← TBL
else if n = 269 then
rD← TBU
The TBR field denotes either the time base lower (TBL) or time base upper (TBU), encod-
ed as shown in Table 9-20. Notice that the order of the two 5-bit halves of the TBR number
is reversed in the instruction. The contents of the designated register are copied into rD.
If the TBR field contains any value other than one of the values shown in Table 9-20, one
of the following occurs:
• None
This instruction is defined by the PowerPC VEA.
mtcrf CRM,rS
0 5 6 10 11 12 19 20 21 30 31
0 5 6 10 11 15 16 20 21 30 31
Bit crbD of the FPSCR is cleared to zero. All other bits of the FPSCR are unchanged.
0 5 6 10 11 15 16 20 21 30 31
Bit crbD of the FPSCR is set to one. All other bits of the FPSCR are unchanged.
0 5 6 7 14 15 16 20 21 30 31
frB[32:63] are placed into the FPSCR under control of the field mask specified by FM. The
field mask identifies the 4-bit fields affected. Let i be an integer in the range 0–7. If
FM(i)=1, FPSCR Field i (FPSCR bits 4∗i through 4∗i+3) is set to the contents of the cor-
responding field of the low-order 32 bits of register frB.
FPSCR[FX] is altered only if FM[0]=1.
In some PowerPC implementations, updating fewer than all eight fields of the FPSCR may
have substantially poorer performance than updating all the fields. This is not the case
with the RCPU.
When FPSCR[0:3] is specified, bits 0 (FX) and 3 (OX) are set to the values of frB[32] and
frB[35] (i.e., even if this instruction causes OX to change from zero to one, FX is set from
frB[32] and not by the usual rule that FX is set to one when an exception bit changes from
zero to one). Bits 1 and 2 (FEX and VX) are set according to the usual rule and not from
frB[33:34].
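The expansion of the 8-bit field mask FM into the FPSCR bits it selects can be sketched in Python. This is a simplified illustration: `fm_to_bitmask` and `mtfsf_simple` are hypothetical names, and the sketch deliberately ignores the special rule for FEX and VX described above, so it is only faithful when field 0 is not selected.

```python
MASK32 = 0xFFFFFFFF

def fm_to_bitmask(fm):
    """Expand FM into a 32-bit mask; FM(0) is the leftmost bit of FM."""
    mask = 0
    for i in range(8):
        if fm & (0x80 >> i):              # FM(i) = 1
            mask |= 0xF << (28 - 4 * i)   # FPSCR bits 4*i through 4*i+3
    return mask

def mtfsf_simple(fpscr, frb_low, fm):
    """Merge frB[32:63] into the FPSCR under FM.
    Simplification: FEX and VX are NOT recomputed by the usual rule."""
    m = fm_to_bitmask(fm)
    return (frb_low & m) | (fpscr & ~m & MASK32)

updated = mtfsf_simple(0x000000F8, 0x00000003, 0x01)   # update field 7 only
```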
0 5 6 8 9 10 11 12 15 16 19 20 21 30 31
The value of the IMM field is placed into FPSCR field crfD.
When FPSCR[0:3] is specified, bits 0 (FX) and 3 (OX) are set to the values of IMM[0] and
IMM[3] (i.e., even if this instruction causes OX to change from zero to one, FX is set from
IMM[0] and not by the usual rule that FX is set to one when an exception bit changes from
0 to 1). Bits 1 and 2 (FEX and VX) are set according to the usual rule, given in 2.2.3 Float-
ing-Point Status and Control Register (FPSCR) and not from IMM[1:2].
mtmsr rS
0 5 6 10 11 15 16 20 21 30 31
MSR←rS
The contents of rS are placed into the MSR.
• MSR
This instruction is defined by the PowerPC OEA.
mtspr SPR,rS
0 5 6 10 11 20 21 30 31
n←SPR[5:9] || SPR[0:4]
SPREG(n)←(rS)
The SPR field denotes a special purpose register, encoded as shown in Table 4-29, Ta-
ble 4-30, and Table 4-31. The contents of rS are placed into the designated special pur-
pose register.
For mtspr and mfspr instructions, the SPR number coded in assembly language does
not appear directly as a 10-bit binary number in the instruction. The number coded is split
into two 5-bit halves that are reversed in the instruction, with the high-order 5 bits appear-
ing in bits 16 to 20 of the instruction and the low-order 5 bits in bits 11 to 15.
If the SPR field contains any value other than one of the values shown in one of the tables
listed above, one of the following occurs:
The value of SPR[0] is one if and only if writing the register is a supervisor-level
operation. Execution of this instruction specifying a supervisor-level register when
MSR[PR]=1 results in a supervisor-level instruction type program exception or software
emulation exception.
If the SPR field contains a value that is not valid for the RCPU, the instruction form is in-
valid. For an invalid instruction form in which SPR[0]=1, if MSR[PR]=1 a supervisor-level
instruction type program exception will occur instead of a no-op.
The execution unit that executes the mtspr instruction depends on the SPR. Moves to the
XER and to SPRs that are physically implemented outside the processor are handled by
the LSU. Moves to the FPSCR and FPECR are executed by the FPU. In all other cases,
the BPU executes the mtspr instruction.
• None
This instruction is defined by the PowerPC UISA.
0x1F D A B 0 0x4B Rc
0 5 6 10 11 15 16 20 21 22 30 31
prod[0:63]←(rA)∗(rB)
rD←prod[0:31]
The contents of rA and of rB are interpreted as 32-bit signed integers. They are multiplied
to form a 64-bit signed integer product. The high-order 32 bits of the 64-bit product are
placed into rD.
0x1F D A B 0 0x0B Rc
0 5 6 10 11 15 16 20 21 22 30 31
prod[0:63]←(rA)∗(rB)
rD←prod[0:31]
The contents of rA and of rB are interpreted as 32-bit unsigned integers. They are multi-
plied to form a 64-bit unsigned integer product. The high-order 32 bits of the 64-bit product
are placed into rD.
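The difference between the signed and unsigned high-order products can be seen in a short Python sketch. Python integers are unbounded, so the 64-bit product is computed exactly and then truncated; `to_signed32`, `mulhw`, and `mulhwu` are hypothetical helper names chosen to mirror the two operations described above.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit register image as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mulhw(ra, rb):
    """High-order 32 bits of the signed 64-bit product."""
    prod = to_signed32(ra) * to_signed32(rb)
    return (prod >> 32) & MASK32      # prod[0:31]; >> is arithmetic in Python

def mulhwu(ra, rb):
    """High-order 32 bits of the unsigned 64-bit product."""
    prod = (ra & MASK32) * (rb & MASK32)
    return prod >> 32                 # prod[0:31]

signed_high = mulhw(0x80000000, 2)     # (-2**31) * 2 = -2**32
unsigned_high = mulhwu(0x80000000, 2)  # (2**31) * 2 = 2**32
```

The same operand bit patterns give different high words, which is why separate signed and unsigned forms exist.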
mulli rD,rA,SIMM
0x07 D A SIMM
0 5 6 10 11 15 16 31
prod[0:47]←(rA)∗SIMM
rD←prod[16:47]
The low-order 32 bits of the 48-bit product (rA)∗SIMM are placed into rD. The low-order
bits are calculated independently of whether the operands are treated as signed or un-
signed 32-bit integers.
This instruction can be used with mulhwx to calculate a full 64-bit product.
• None
This instruction is defined by the PowerPC UISA.
0x1F D A B OE 0xEB Rc
0 5 6 10 11 15 16 20 21 22 30 31
prod[0:63]←(rA)∗(rB)
rD←prod[32:63]
The low-order 32 bits of the 64-bit product (rA)∗(rB) are placed into rD. The low-order bits
are calculated independently of whether the operands are treated as signed or unsigned
integers. However, OV is set based on the result interpreted as a signed integer.
0x1F S A B 0x1DC Rc
0 5 6 10 11 15 16 20 21 30 31
rD← ¬ (rA) + 1
The sum ¬(rA) + 1 is placed into rD.
If rA contains the most negative 32-bit number (0x8000 0000), the result is the most neg-
ative 32-bit number, and if OE=1, OV is set.
0x1F S A B 0x7C Rc
0 5 6 10 11 15 16 20 21 30 31
or rA,rS,rB (Rc=0)
or. rA,rS,rB (Rc=1)
0x1F S A B 0x1BC Rc
0 5 6 10 11 15 16 20 21 30 31
rA←(rS) | (rB)
The contents of rS are ORed with the contents of rB and the result is placed into rA.
0x1F S A B 0x19C Rc
0 5 6 10 11 15 16 20 21 30 31
rA ← (rS) | ¬ (rB)
The contents of rS are ORed with the complement of the contents of rB and the result is
placed into rA.
ori rA,rS,UIMM
0x18 S A UIMM
0 5 6 10 11 15 16 31
The preferred no-op (an instruction that does nothing) is ori 0,0,0.
• None
This instruction is defined by the PowerPC UISA.
oris rA,rS,UIMM
0x19 S A UIMM
0 5 6 10 11 15 16 31
• None
This instruction is defined by the PowerPC UISA.
MSR[16:31]←SRR1[16:31]
NIA←SRR0[0:29] || 0b00
SRR1[16:31] are placed into MSR[16:31]. If the new MSR value does not enable any
pending exceptions, then the next instruction is fetched, under control of the new MSR
value, from the address SRR0[0:29] || 0b00. If the new MSR value enables one or more
pending exceptions, the exception associated with the highest priority pending exception
is generated; in this case the value placed into SRR0 by the exception processing mech-
anism is the address of the instruction that would have been executed next had the ex-
ception not occurred.
• MSR
This instruction is defined by the PowerPC OEA.
0x14 S A SH MB ME Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
n←SH
r←ROTL(rS, n)
m←MASK(MB, ME)
rA←(r & m) | ((rA) & ¬m)
The contents of rS are rotated left SH bits. A mask is generated having 1-bits from bit MB
through bit ME and 0-bits elsewhere. The rotated data is inserted into rA under control of
the generated mask.
Note that rlwimi can be used to insert a bit field into the contents of rA using the methods
shown below:
• To insert an n-bit field that is left-justified in rS into rA starting at bit position b, set
SH = 32 - b, MB = b, and ME = b + n - 1
• To insert an n-bit field that is right-justified in rS into rA starting at bit position b, set
SH = 32 - (b + n), MB = b, and ME = b + n - 1
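The two insertion recipes above can be checked with a small Python model of ROTL, MASK, and rlwimi. This is a sketch under stated assumptions: the helper names are hypothetical, bit 0 is the most significant bit as in this manual, and the wrap-around mask for MB > ME follows the PowerPC architecture definition rather than anything stated in the text above.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """ROTL(x, n): rotate a 32-bit value left by n bits (0 <= n <= 31)."""
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask(mb, me):
    """MASK(MB, ME): 1-bits from bit MB through bit ME (bit 0 is the MSB)."""
    left = MASK32 >> mb                       # ones in bits mb..31
    right = (MASK32 << (31 - me)) & MASK32    # ones in bits 0..me
    # Assumption: when MB > ME the mask wraps around through bit 31.
    return (left & right) if mb <= me else (left | right)

def rlwimi(ra, rs, sh, mb, me):
    """rA <- (ROTL(rS, SH) & m) | (rA & ~m)"""
    m = mask(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK32)

b, n = 8, 8   # insert an 8-bit field at bit position 8 of rA
from_right = rlwimi(0x12345678, 0x000000AB, 32 - (b + n), b, b + n - 1)
from_left = rlwimi(0x12345678, 0xAB000000, 32 - b, b, b + n - 1)
```

Both recipes place the field 0xAB into rA[8:15] and leave the other 24 bits of rA untouched.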
Other registers altered:
0x15 S A SH MB ME Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
n←SH
r←ROTL(rS, n)
m←MASK(MB, ME)
rA←r & m
The contents of rS are rotated left SH bits. A mask is generated having 1-bits from bit MB
through bit ME and 0-bits elsewhere. The rotated data is ANDed with the generated mask
and the result is placed into rA.
Note that rlwinm can be used to extract, rotate, or clear bit fields using the following meth-
ods:
• To extract an n-bit field that starts at bit position b in rS[0:31], right-justified into rA
(clearing the remaining 32-n bits of rA), set SH=b+n, MB=32-n, and ME=31.
• To extract an n-bit field that starts at bit position b in rS[0:31], left-justified into rA
(clearing the remaining 32-n bits of rA), set SH=b, MB=0, and ME=n-1.
• To rotate the contents of a register left by n bits, set SH=n, MB=0, and ME=31. To
rotate right by n bits, set SH=32-n instead.
• To shift the contents of a register right by n bits, set SH=32-n, MB=n, and ME=31.
• To clear the high-order b bits of a register and then shift the result left by n bits, set
SH=n, MB=b-n and ME=31-n.
• To clear the low-order n bits of a register, set SH=0, MB=0, and ME=31-n.
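Several of the rlwinm recipes listed above can be verified with a short Python model. The helper names are hypothetical and the sketch assumes MB ≤ ME, which holds for every recipe exercised here; bit 0 is the most significant bit, as elsewhere in this manual.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """ROTL(x, n): rotate a 32-bit value left by n bits (0 <= n <= 31)."""
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask(mb, me):
    """MASK(MB, ME) for MB <= ME: 1-bits from bit MB through bit ME."""
    return (MASK32 >> mb) & (MASK32 << (31 - me)) & MASK32

def rlwinm(rs, sh, mb, me):
    """rA <- ROTL(rS, SH) & MASK(MB, ME)"""
    return rotl32(rs, sh) & mask(mb, me)

b, n = 8, 8
right_justified = rlwinm(0x12345678, b + n, 32 - n, 31)   # extract rS[8:15]
left_justified = rlwinm(0x12345678, b, 0, n - 1)          # same field, left-justified
shifted_right = rlwinm(0x80000000, 32 - 4, 4, 31)         # shift right by 4
low_cleared = rlwinm(0x12345678, 0, 0, 31 - 8)            # clear low-order 8 bits
```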
Other registers altered:
Extract and left justify immediate     extlwi rA,rS,n,b (n > 0)            rlwinm rA,rS,b,0,n-1
                                       extlwi. rA,rS,n,b (n > 0)           rlwinm. rA,rS,b,0,n-1
Clear left and shift left immediate    clrlslwi rA,rS,b,n (n ≤ b ≤ 31)     rlwinm rA,rS,n,b-n,31-n
                                       clrlslwi. rA,rS,b,n (n ≤ b ≤ 31)    rlwinm. rA,rS,n,b-n,31-n
0x17 S A B MB ME Rc
0 5 6 10 11 15 16 20 21 25 26 30 31
n←rB[27:31]
r←ROTL(rS, n)
m←MASK(MB, ME)
rA←r & m
The contents of rS are rotated left the number of bits specified by rB[27:31]. A mask is
generated having 1-bits from bit MB through bit ME and 0-bits elsewhere. The rotated data
is ANDed with the generated mask and the result is placed into rA.
Note that rlwnm can be used to extract and rotate bit fields using the following methods:
• To extract an n-bit field that starts at variable bit position b in rS[0:31], right-justified
into rA (clearing the remaining 32-n bits of rA), set rB[27:31]=b+n, MB=32-n, and
ME=31.
• To extract an n-bit field, that starts at variable bit position b in rS[0:31], left-justified
into rA (clearing the remaining 32-n bits of rA), set rB[27:31]=b, MB=0, and ME=n-
1.
• To rotate the contents of a register left by variable n bits, set rB[27:31]=n, MB=0,
and ME=31. To rotate right by n bits, set rB[27:31]=32-n instead.
Other registers altered:
The effective address of the instruction following the system call instruction is placed into
SRR0. MSR[16:31] are placed into SRR1[16:31], and SRR1[0:15] are set to undefined
values.
Then a system call exception is generated. The exception causes the MSR to be altered
as described in 6.11.8 System Call Exception (0x00C00).
The exception causes the next instruction to be fetched from offset 0xC00 from the phys-
ical base address indicated by the new setting of MSR[IP]. This instruction is context-syn-
chronizing.
0x1F S A B 0x18 Rc
0 5 6 10 11 15 16 20 21 30 31
n←rB[27:31]
r←ROTL((rS), n)
if rB[26]=0 then
m← MASK(0,31-n)
else
m←(32)0
rA←r&m
If rB[26]=0, the contents of rS are shifted left the number of bits specified by rB[27:31].
Bits shifted out of position 0 are lost. Zeros are supplied to the vacated positions on the
right. The 32-bit result is placed into rA. If rB[26]=1, 32 zeros are placed into rA.
0x1F S A B 0x318 Rc
0 5 6 10 11 15 16 20 21 30 31
n←rB[27:31]
r←ROTL((rS), 32-n)
if rB[26]=0 then
m← MASK(n,31)
else
m←(32)0
s←rS[0]
rA←(r & m) | ((32)s & ¬m)
XER[CA]←s & ((r & ¬m)≠0)
If rB[26]=0, then the contents of rS are shifted right the number of bits specified by
rB[27:31]. Bits shifted out of position 31 are lost. The result is padded on the left with sign
bits before being placed into rA. If rB[26]=1, then rA is filled with 32 sign bits (bit 0) from
rS. CR0 is set based on the value written into rA.
XER[CA] is set to one if rS contains a negative number and any 1-bits are shifted out of
position 31; otherwise XER[CA] is cleared to zero. A shift amount of zero causes XER[CA]
to be cleared.
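The carry rule for the algebraic right shift is easy to get wrong, so a small Python model may help. The function `sraw` below is a hypothetical sketch that returns the 32-bit result together with the XER[CA] value; it treats the low-order six bits of rB (rB[26:31]) as the shift amount, so a set rB[26] means a shift of 32 or more.

```python
MASK32 = 0xFFFFFFFF

def sraw(rs, rb):
    """Return (rA, XER[CA]) for an algebraic right shift of rS by rB[26:31]."""
    rs &= MASK32
    s = rs >> 31                              # sign bit, rS[0]
    n = rb & 0x3F                             # rB[26:31]
    if n >= 32:                               # rB[26] = 1: rA gets 32 sign bits
        return (MASK32 if s else 0), s        # a negative rS always loses a 1-bit
    signed = rs - (1 << 32) if s else rs
    result = (signed >> n) & MASK32           # >> pads with sign bits in Python
    lost = rs & ((1 << n) - 1)                # bits shifted out of position 31
    ca = 1 if (s and lost != 0) else 0
    return result, ca

carry_case = sraw(0x80000001, 1)    # negative, a 1-bit is shifted out
no_carry_case = sraw(0x80000000, 4) # negative, only 0-bits are shifted out
```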
0x1F S A SH 0x338 Rc
0 5 6 10 11 15 16 20 21 30 31
n←SH
r←ROTL((rS), 32-n)
m← MASK(n,31)
s←rS[0]
rA←(r & m) | ((32)s & ¬m)
XER[CA]←s & ((r & ¬m)≠0)
The contents of rS are shifted right SH bits. Bits shifted out of position 31 are lost. The
result is padded on the left with sign bits, and the 32-bit result is placed into rA.
XER[CA] is set to one if rS contains a negative number and any 1-bits are shifted out
of position 31; otherwise XER[CA] is cleared to zero. A shift amount of zero causes
XER[CA] to be cleared to zero.
Other registers altered:
0x1F S A B 0x218 Rc
0 5 6 10 11 15 16 20 21 30 31
n←rB[27:31]
r←ROTL((rS), 32-n)
if rB[26]=0 then
m←MASK(n,31)
else
m←(32)0
rA←r & m
If rB[26]=0, the contents of rS are shifted right the number of bits specified by rB[27:31].
Bits shifted out of position 31 are lost. Zeros are supplied to the vacated positions on the
left. The 32-bit result is placed into rA. If rB[26]=1, 32 zeros are placed into rA.
stb rS,d(rA)
0x26 S A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 1)←rS[24:31]
EA is the sum (rA|0)+d. The contents of rS[24:31] are stored into the byte in memory ad-
dressed by EA. Register rS is unchanged.
• None
This instruction is defined by the PowerPC UISA.
stbu rS,d(rA)
0x27 S A d
0 5 6 10 11 15 16 31
• None
This instruction is defined by the PowerPC UISA.
stbux rS,rA,rB
0x1F S A B 0xF7 0
0 5 6 10 11 15 16 20 21 30 31
• None
This instruction is defined by the PowerPC UISA.
stbx rS,rA,rB
0x1F S A B 0xD7 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 1)←rS[24:31]
EA is the sum (rA|0)+(rB).
The contents of rS[24:31] are stored into the byte in memory addressed by EA. Register rS
is unchanged.
• None
This instruction is defined by the PowerPC UISA.
stfd frS,d(rA)
0x36 frS A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 8)←(frS)
EA is the sum (rA|0)+d.
The contents of frS are stored into the double word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stfdu frS,d(rA)
0x37 frS A d
0 5 6 10 11 15 16 31
The contents of frS are stored into the double word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stfdux frS,rA,rB
The contents of frS are stored into the double word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stfdx frS,rA,rB
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 8)←(frS)
EA is the sum (rA|0)+(rB).
The contents of frS are stored into the double word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stfiwx frS,rA,rB
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 4)←frS[32:63]
EA is the sum (rA|0)+(rB).
The low-order 32 bits of frS are stored, without conversion, into the word in memory ad-
dressed by EA.
If the contents of frS were produced, either directly or indirectly, by an lfs instruction, a
single-precision arithmetic instruction, or frsp, then the value stored is undefined. The
contents of frS are produced directly by such an instruction if frS is the target register for
the instruction. The contents of frS are produced indirectly by such an instruction if frS is
the final target register of a sequence of one or more floating-point move instructions, with
the input to the sequence having been produced directly by such an instruction.
• None
This instruction is defined by the PowerPC UISA.
stfs frS,d(rA)
0x34 frS A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 4)←SINGLE(frS)
EA is the sum (rA|0)+d.
The contents of frS are converted to single-precision and stored into the word in memory
addressed by EA.
stfsu frS,d(rA)
0x35 frS A d
0 5 6 10 11 15 16 31
The contents of frS are converted to single-precision and stored into the word in memory
addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stfsux frS,rA,rB
The contents of frS are converted to single-precision and stored into the word in memory
addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stfsx frS,rA,rB
The contents of frS are converted to single-precision and stored into the word in memory
addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
sth rS,d(rA)
0x2C S A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 2)←rS[16:31]
EA is the sum (rA|0)+d.
The contents of rS[16:31] are stored into the half word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
sthbrx rS,rA,rB
0x1F S A B 0x396 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 2)←rS[24:31] || rS[16:23]
EA is the sum (rA|0)+(rB).
The contents of rS[24:31] are stored into bits 0:7 of the half word in memory addressed
by EA. Bits rS[16:23] are stored into bits 8:15 of the half word in memory addressed by
EA.
• None
This instruction is defined by the PowerPC UISA.
sthu rS,d(rA)
0x2D S A d
0 5 6 10 11 15 16 31
The contents of rS[16:31] are stored into the half word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
sthux rS,rA,rB
0x1F S A B 0x1B7 0
0 5 6 10 11 15 16 20 21 30 31
The contents of rS[16:31] are stored into the half word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
sthx rS,rA,rB
0x1F S A B 0x197 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 2)←rS[16:31]
EA is the sum (rA|0)+(rB).
The contents of rS[16:31] are stored into the half word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stmw rS,d(rA)
0x2F S A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
r←rS
do while r ≤ 31
MEM(EA, 4) ← GPR(r)
r←r + 1
EA← EA + 4
EA is the sum (rA|0)+d.
n = (32 - rS).
n consecutive words starting at EA are stored from GPRs rS through r31. For example,
if rS=30, two words are stored.
EA must be a multiple of four; otherwise, the system alignment error handler is invoked.
• None
This instruction is defined by the PowerPC UISA.
stswi rS,rA,NB
0x1F S A NB 0x205 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then EA←0
else EA←(rA)
if NB = 0 then n←32
else n←NB
r←rS-1
i←0
do while n>0
if i = 0 then r←r+1 (mod 32)
MEM(EA, 1)←GPR(r)[i:i+7]
i←i+8
if i = 32 then i←0
EA←EA+1
n←n-1
Bytes are stored left to right from each register. The sequence of registers wraps around
to GPR0 if required.
• None
This instruction is defined by the PowerPC UISA.
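The stswi operation listed above translates almost line for line into Python. In this sketch `gpr` is a list of 32-bit register images and `mem` is a bytearray standing in for memory; both, and the function name, are assumptions made for illustration.

```python
def stswi(gpr, mem, rs, ra, nb):
    """Hypothetical model of stswi rS,rA,NB following the operation above."""
    ea = 0 if ra == 0 else gpr[ra]
    n = 32 if nb == 0 else nb
    r = rs - 1
    i = 0
    while n > 0:
        if i == 0:
            r = (r + 1) % 32                    # next register, wrapping to GPR0
        mem[ea] = (gpr[r] >> (24 - i)) & 0xFF   # GPR(r)[i:i+7], bit 0 is the MSB
        i += 8
        if i == 32:
            i = 0
        ea += 1
        n -= 1

gpr = [0] * 32
gpr[31] = 0x11223344
gpr[0] = 0x55667788        # used as data after the wrap from r31
gpr[3] = 4                 # base address
mem = bytearray(16)
stswi(gpr, mem, 31, 3, 6)  # store six bytes from r31, then r0, at address 4
```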
stswx rS,rA,rB
0x1F S A B 0x295 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b+(rB)
n←XER[25:31]
r←rS-1
i←0
do while n>0
if i = 0 then r←r+1 (mod 32)
MEM(EA, 1)←GPR(r)[i:i+7]
i←i+8
if i = 32 then i←0
EA←EA+1
n←n-1
• None
This instruction is defined by the PowerPC UISA.
stw rS,d(rA)
0x24 S A d
0 5 6 10 11 15 16 31
if rA = 0 then b←0
else b←(rA)
EA←b + EXTS(d)
MEM(EA, 4)←rS
EA is the sum (rA|0)+d.
The contents of rS are stored into the word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stwbrx rS,rA,rB
0x1F S A B 0x296 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 4)←rS[24:31] || rS[16:23] || rS[8:15] || rS[0:7]
EA is the sum (rA|0)+(rB).
The contents of rS[24:31] are stored into bits 0:7 of the word in memory addressed by EA.
Bits rS[16:23] are stored into bits 8:15 of the word in memory addressed by EA. Bits
rS[8:15] are stored into bits 16:23 of the word in memory addressed by EA. Bits rS[0:7]
are stored into bits 24:31 of the word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stwcx. rS,rA,rB
0x1F S A B 0x96 1
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
if RESERVE then
MEM(EA, 4)←rS
RESERVE←0
CR0←0b00 || 0b1 || XER[SO]
else
CR0←0b00 || 0b0 || XER[SO]
EA is the sum (rA|0)+(rB).
If a reservation exists, the contents of rS are stored into the word in memory addressed
by EA and the reservation is cleared. If no reservation exists, the instruction completes
without altering memory.
The EQ bit in condition register field CR0 is set to reflect whether the store operation
was performed (i.e., whether a reservation existed when the stwcx. instruction began
execution). If the store was completed successfully, the EQ bit is set to one; otherwise
it is cleared to zero. The LT and GT bits of CR0 are cleared, and the SO bit is copied
from XER[SO].
EA must be a multiple of four; otherwise, the system alignment error handler is invoked.
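The stwcx. RTL above can be followed with a short Python model. This is a sketch under stated assumptions: the function name and argument list are illustrative, the reservation is modeled as a single flag as in the RTL, and the alignment error handler is represented by a Python exception.

```python
def stwcx(mem, ea, rs_value, reserve, xer_so):
    """Model of the stwcx. RTL.

    Returns (new_reserve, cr0), where cr0 holds the LT, GT, EQ, SO bits
    as a 4-bit value.
    """
    if ea % 4 != 0:
        raise ValueError("EA must be a multiple of four (alignment error)")
    if reserve:
        mem[ea] = rs_value & 0xFFFFFFFF
        cr0 = (0b001 << 1) | xer_so   # 0b00 || 0b1 || XER[SO]
        return False, cr0             # RESERVE is cleared
    cr0 = (0b000 << 1) | xer_so       # 0b00 || 0b0 || XER[SO]
    return False, cr0                 # memory is not altered

mem = {}
with_reservation = stwcx(mem, 0x100, 7, reserve=True, xer_so=0)
without_reservation = stwcx(mem, 0x100, 9, reserve=False, xer_so=1)
```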
stwu rS,d(rA)
0x25 S A d
0 5 6 10 11 15 16 31
The contents of rS are stored into the word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stwux rS,rA,rB
Reserved
0x1F S A B 0xB7 0
0 5 6 10 11 15 16 20 21 30 31
The contents of rS are stored into the word in memory addressed by EA.
• None
This instruction is defined by the PowerPC UISA.
stwx rS,rA,rB
Reserved
0x1F S A B 0x97 0
0 5 6 10 11 15 16 20 21 30 31
if rA = 0 then b←0
else b←(rA)
EA←b + (rB)
MEM(EA, 4)←rS
EA is the sum (rA|0)+(rB). The contents of rS are stored into the word in memory ad-
dressed by EA.
• None
This instruction is defined by the PowerPC UISA.
0x1F D A B OE 0x28 Rc
0 5 6 10 11 15 16 20 21 22 30 31
0x1F D A B OE 0x08 Rc
0 5 6 10 11 15 16 20 21 22 30 31
0x1F D A B OE 0x88 Rc
0 5 6 10 11 15 16 20 21 22 30 31
subfic rD,rA,SIMM
0x08 D A SIMM
0 5 6 10 11 15 16 31
• XER:
Affected: CA
This instruction is defined by the PowerPC UISA.
The sync instruction can be used to ensure that the results of all stores into a data struc-
ture, performed in a “critical section” of a program, are seen by other processors before
the data structure is seen as unlocked.
• None
This instruction is defined by the PowerPC UISA.
tw TO,rA,rB
Reserved
0x1F TO A B 0x04
0 5 6 10 11 15 16 20 21 30 31
a← (rA)
b← (rB)
if (a < b) & TO[0] then TRAP
if (a > b) & TO[1] then TRAP
if (a = b) & TO[2] then TRAP
if (a <U b) & TO[3] then TRAP
if (a >U b) & TO[4] then TRAP
The contents of rA are compared with the contents of rB. If any bit in the TO field is set to
one and its corresponding condition is met by the result of the comparison, then the sys-
tem trap handler is invoked.
• None
This instruction is defined by the PowerPC UISA.
twi TO,rA,SIMM
0x03 TO A SIMM
0 5 6 10 11 15 16 31
a← (rA)
if (a < EXTS(SIMM)) & TO[0] then TRAP
if (a > EXTS(SIMM)) & TO[1] then TRAP
if (a = EXTS(SIMM)) & TO[2] then TRAP
if (a <U EXTS(SIMM)) & TO[3] then TRAP
if (a >U EXTS(SIMM)) & TO[4] then TRAP
The contents of rA are compared with the sign-extended SIMM field. If any bit in the TO
field is set to one and its corresponding condition is met by the result of the comparison,
then the system trap handler is invoked.
• None
This instruction is defined by the PowerPC UISA.
0x1F S A B 0x13C Rc
0 5 6 10 11 15 16 20 21 30 31
rA←(rS) ⊕ (rB)
The contents of rS are XORed with the contents of rB and the result is placed into rA.
xori rA,rS,UIMM
0x1A S A UIMM
0 5 6 10 11 15 16 31
• None
This instruction is defined by the PowerPC UISA.
xoris rA,rS,UIMM
0x1B S A UIMM
0 5 6 10 11 15 16 31
• None
This instruction is defined by the PowerPC UISA.
Name 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
bx 0x12 LI AA LK
bcx 0x10 BO BI BD AA LK
lbz 0x22 D A d
lbzu 0x23 D A d
lfd 0x32 D A d
lfdu 0x33 D A d
lfs 0x30 D A d
lfsu 0x31 D A d
lha 0x2A D A d
lhau 0x2B D A d
lhz 0x28 D A d
lhzu 0x29 D A d
lmw 0x2E D A d
lswi 31 D A NB 0x255 0
lswx 31 D A B 0x215 0
lwarx 31 D A B 0x14 0
lwbrx 31 D A B 0x216 0
lwz 32 D A d
lwzu 33 D A d
lwzux 31 D A B 0x37 0
lwzx 31 D A B 0x17 0
rlwimix 0x14 S A SH MB ME Rc
rlwinmx 0x15 S A SH MB ME Rc
rlwnmx 0x17 S A B MB ME Rc
stb 0x26 S A d
stbu 0x27 S A d
sth 0x2C S A d
sthu 0x2D S A d
stmw 0x2F S A d
stw 0x24 S A d
stwcx. 31 S A B 0x96 1
stwu 0x25 S A d
tw 0x1F TO A B 0x04 0
The examples shown below distinguish between the cases N = 2 and N > 2. If N =
2, the shift amount may be in the range 0 to 63, which is the maximum range
supported by the shift instructions used. However, if N > 2, the shift amount must be
in the range 0 to 31 for the examples to yield the desired result. The specific
instance shown for N > 2 is N = 3; extending those instruction sequences to larger N
is straightforward, as is reducing them to the case N = 2 when the more stringent
restriction on shift amount is met. For shifts with immediate shift amounts, only the
case N = 3 is shown, because the more stringent restriction on shift amount is
always met.
In the examples it is assumed that GPRs 2 and 3 (and 4) contain the quantity to be
shifted, and that the result is to be placed into the same registers. In all cases, for
both input and result, the lowest-numbered register contains the highest-order part
of the data and the highest-numbered register contains the lowest-order part. For
non-immediate shifts, the shift amount is assumed to be in bits 27 to 31 (32-bit mode)
of GPR6. For immediate shifts, the shift amount is assumed to be greater than zero.
GPRs 0 and 31 are used as scratch registers. For N > 2, the number of instructions
required is 2N-1 (immediate shifts) or 3N-1 (non-immediate shifts).
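The register convention above (highest-order word in the lowest-numbered register, shift amount limited to 0 to 31) can be checked against a Python model of the intended result. This is a sketch of the arithmetic only, not of the instruction sequences; the function name and the list that stands in for GPR2 to GPR4 are illustrative.

```python
MASK32 = 0xFFFFFFFF

def shift_left_multiword(words, sh):
    """Shift an N-word quantity left by sh bits (0 <= sh <= 31).

    words[0] is the highest-order 32-bit word, matching the convention
    that the lowest-numbered register holds the highest-order part.
    """
    if not 0 <= sh <= 31:
        raise ValueError("shift amount must be in the range 0 to 31")
    result = []
    for k, w in enumerate(words):
        hi = (w << sh) & MASK32
        # Bits shifted in from the next lower-order word, if there is one.
        lo = (words[k + 1] >> (32 - sh)) if (sh and k + 1 < len(words)) else 0
        result.append(hi | lo)
    return result

regs = [0x00000001, 0x80000000, 0xF000000F]   # stands in for GPR2, GPR3, GPR4
shifted = shift_left_multiword(regs, 4)
```

Each result word combines the shifted word with the bits carried up from its lower-order neighbor, which is why each register after the first costs extra instructions in the sequences.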
Zero Operand
FRT ← FRB
If FRB[0]=0 then FPSCR[FPRF] ← "+zero"
If FRB[0]=1 then FPSCR[FPRF] ← "-zero"
FPSCR[FR FI] ← 0b00
Done
Infinity Operand
FRT ← FRB
If FRB[0]=0 then FPSCR[FPRF] ← "+infinity"
If FRB[0]=1 then FPSCR[FPRF] ← "-infinity"
Done
QNaN Operand
FRT ← FRB[0:34] || 0b0 0000 0000 0000 0000 0000 0000 0000
FPSCR[FPRF] ← "QNaN"
FPSCR[FR FI] ← 0b00
Done
SNaN Operand
FPSCR[VXSNAN] ← 1
If FPSCR[VE]=0 then
Do
FRT[0:11] ← FRB[0:11]
FRT[12] ← 1
FRT[13:63] ← FRB[13:34] || 0b0 0000 0000 0000 0000 0000 0000 0000
FPSCR[FPRF] ← "QNaN"
End
FPSCR[FR FI] ← 0b00
Done
sign ← FRB[0]
If FRB[1:11]>0 then exp ← FRB[1:11] - 1023 /* exp - bias */
If FRB[1:11]=0 then exp ← -1022
If FRB[1:11]>0 then frac[0:64] ← 0b01 || FRB[12:63] || 0b00000000000 /* normal */
If FRB[1:11]=0 then frac[0:64] ← 0b00 || FRB[12:63] || 0b00000000000 /* denormal */
Infinity Operand
FPSCR[FR FI VXCVI] ← 0b001
If FPSCR[VE]=0 then Do
If tgt_precision="32-bit integer" then
Do
If sign=0 then FRT ← 0xuuuu uuuu 7FFF FFFF
If sign=1 then FRT ← 0xuuuu uuuu 8000 0000
End
Else
Do
If sign=0 then FRT ← 0x7FFF FFFF FFFF FFFF
If sign=1 then FRT ← 0x8000 0000 0000 0000
End
FPSCR[FPRF] ← undefined
End
Done
SNaN Operand
FPSCR[FR FI VXCVI VXSNAN] ← 0b0011
If FPSCR[VE]=0 then
Do
If tgt_precision="32-bit integer"
then FRT ← 0xuuuu uuuu 8000 0000
If tgt_precision="64-bit integer"
then FRT ← 0x8000 0000 0000 0000
FPSCR[FPRF] ← undefined
End
Done
Large Operand
FPSCR[FR FI VXCVI] ← 0b001
If FPSCR[VE]=0 then Do
If tgt_precision="32-bit integer" then
Do
If sign=0 then FRT ← 0xuuuu uuuu 7FFF FFFF
If sign=1 then FRT ← 0xuuuu uuuu 8000 0000
End
Else
Do
If sign=0 then FRT ← 0x7FFF FFFF FFFF FFFF
If sign=1 then FRT ← 0x8000 0000 0000 0000
End
FPSCR[FPRF] ← undefined
End
Done
Do until frac[0]=1
frac ← frac[1:63] || 0b0
exp ← exp - 1
End
Zero Operand
FPSCR[FR FI] ← 0b00
FPSCR[FPRF] ← "+zero"
FRT ← 0x0000 0000 0000 0000
Done
• In general, lwarx and stwcx. instructions should be paired, with the same ef-
fective address used for both. The exception is an isolated stwcx. instruction
that is used to clear any existing reservation on the processor, for which there
is no paired lwarx and for which any (scratch) effective address can be used.
• It is acceptable to execute an lwarx instruction for which no stwcx. instruction
is executed. For example, such a dangling lwarx instruction occurs if the val-
ue loaded in the test and set sequence shown in D.3.2 Test and Set is not
zero.
• To increase the likelihood that forward progress is made, it is important that
looping on lwarx/stwcx. pairs be minimized. For example, in the sequence
shown above for test and set, this is achieved by testing the old value before
attempting the store — were the order reversed, more stwcx. instructions
might be executed, and reservations might more often be lost between the
lwarx and the stwcx. instructions.
• The manner in which lwarx and stwcx. are communicated to other proces-
sors and mechanisms and between levels of the memory subsystem within a
given processor is implementation-dependent. In some implementations per-
formance may be improved by minimizing looping on an lwarx instruction that
fails to return a desired value. For example, in the test and set example shown
above, to stay in the loop until the word loaded is zero, the programmer could
change the bne $+12 to bne loop. However, in some implementations better
performance may be obtained by using an ordinary load instruction to do the
initial checking of the value, as follows:
loop: lwz r5,0(r3) #load the word
cmpwi r5,0 #loop back if word
bne loop #not equal to 0
lwarx r5,0,r3 #try again, reserving
cmpwi r5,0 #(likely to succeed)
bne loop
stwcx. r4,0,r3 #try to store nonzero
bne loop #loop if lost reservation
In this example it is assumed that the address of the word to be loaded and re-
placed is in GPR3, the new value is in GPR4, and the old value is returned in
GPR5.
loop: lwarx r5,0,r3 #load and reserve
stwcx. r4,0,r3 #store new value if still reserved
bne loop #loop if lost reservation
In this example it is assumed that the address of the word to be ANDed is in GPR3,
the value to AND into it is in GPR4, and the old value is returned in GPR5.
loop: lwarx r5,0,r3 #load and reserve
and r0,r4,r5 #AND word
stwcx. r0,0,r3 #store new value if still reserved
bne loop #loop if lost reservation
This sequence can be changed to perform another Boolean operation atomically
on a word in memory, simply by changing the AND instruction to the desired Bool-
ean instruction (OR, XOR, etc.).
In this example it is assumed that the address of the word to be tested is in GPR3,
the new value (non-zero) is in GPR4, and the old value is returned in GPR5.
loop: lwarx r5,0,r3 #load and reserve
cmpwi r5,0 #done if word
bne $+12 #not equal to 0
stwcx. r4,0,r3 #try to store nonzero
bne loop #loop if lost reservation
Test and set is shown primarily for pedagogical reasons. It is useful on machines
that lack the better synchronization facilities provided by lwarx and stwcx.. Test
and set does not scale well. Using test and set before a critical section allows only
one process to execute in the critical section at a time. Using lwarx and stwcx. to
bracket the critical section allows many processes to execute in the critical section
at once, but at most one will succeed in exiting from the section with its results
stored.
Depending on the application, if test and set fails (that is, clears the EQ bit of CR
field 0) it may be appropriate to re-execute the test and set.
In this example it is assumed that the address of the word to be tested is in GPR3,
the comparand is in GPR4, the new value is in GPR5, and the old value is returned
in GPR6.
Depending on the application, if compare and swap fails (that is, clears the EQ bit
of CR0) it may be appropriate to recompute the value potentially to be stored and
then re-execute the compare and swap.
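The retry behavior of a compare and swap built from lwarx and stwcx. can be illustrated with a toy Python model. This is a sketch under loud assumptions: the class WordMemory, its methods, and the interfere helper are hypothetical, a single processor with one reservation is modeled, and a stwcx. to an address other than the reserved one is treated as failing, which real implementations are not required to do.

```python
class WordMemory:
    """Toy model of word memory with a single reservation (assumed design)."""

    def __init__(self):
        self.words = {}
        self.reserved_addr = None

    def lwarx(self, ea):
        self.reserved_addr = ea            # load and establish the reservation
        return self.words.get(ea, 0)

    def stwcx(self, ea, value):
        """Return True (CR0[EQ] = 1) if the store was performed."""
        ok = self.reserved_addr == ea
        if ok:
            self.words[ea] = value & 0xFFFFFFFF
        self.reserved_addr = None          # reservation is cleared either way
        return ok

    def interfere(self, ea, value):
        """Stand-in for a store by another processor; cancels the reservation."""
        self.words[ea] = value & 0xFFFFFFFF
        if self.reserved_addr == ea:
            self.reserved_addr = None


def compare_and_swap(mem, ea, comparand, new_value):
    """Return (swapped, old_value), retrying while the reservation is lost."""
    while True:
        old = mem.lwarx(ea)
        if old != comparand:
            return False, old              # dangling lwarx, no store attempted
        if mem.stwcx(ea, new_value):
            return True, old


mem = WordMemory()
mem.words[0x40] = 5
first = compare_and_swap(mem, 0x40, comparand=5, new_value=9)
second = compare_and_swap(mem, 0x40, comparand=5, new_value=1)
```

The second call fails because the word no longer matches the comparand; as the text notes, a caller would typically recompute the value to be stored before trying again.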
The next element pointer from the list element after which the new element is to be
inserted, here called the parent element, is stored into the new element, so that the
new element points to the next element in the list: this store is performed uncondi-
tionally. Then the address of the new element is conditionally stored into the parent
element, thereby adding the new element to the list.
In this example it is assumed that the address of the parent element is in GPR3,
the address of the new element is in GPR4, and the next element pointer is at offset
0 from the start of the element. It is also assumed that the next element pointer of
each list element is in a reservation granule separate from that of the next element
pointer of all other list elements.
loop: lwarx r2,0,r3 #get next pointer
stw r2,0(r4) #store in new element
sync #let store settle (can omit if not MP)
stwcx. r4,0,r3 #add new element to list
bne loop #loop if stwcx. failed
In the preceding example, if two list elements have next element pointers in the
same reservation granule then, in a multiprocessor, livelock can occur. (Livelock is
a state in which processors interact in a way such that no processor makes
progress.)
If it is not possible to allocate list elements such that each element's next element
pointer is in a different reservation granule, then livelock can be avoided by using
the following, more complicated, code sequence.
E.1 Symbols
The symbols in Table E-1 are defined for use in instructions (basic or simplified
mnemonics) that specify a condition register (CR) field or a bit in the CR.
E.2.2 Subtract
The “subtract-from” instructions subtract the second operand (rA) from the third
(rB). Simplified mnemonics are provided in which the third operand is subtracted
from the second. Both these mnemonics can be coded with a final ‘o’ or ‘.’ (or both)
to cause the OE or Rc bit, respectively, to be set in the underlying instruction. In
these examples, the value in rB is subtracted from the value in rA and the result
placed in rD.
The crfD field can be omitted if the result of the comparison is to be placed into the
CR0 field. Otherwise, the target CR field must be specified as the first operand. The
CR field symbols defined in E.1 Symbols can be used to identify the condition reg-
ister field.
The following examples demonstrate the use of the word compare mnemonics:
1. Compare 32 bits in register rA with immediate value 100 and place result in
condition register field CR0.
cmpwi rA,100 (equivalent to cmpi 0,0,rA,100)
2. Same as (1), but place results in condition register field CR4.
cmpwi cr4,rA,100 (equivalent to cmpi 4,0,rA,100)
3. Compare registers rA and rB as logical 32-bit quantities and place result in
condition register field CR0.
cmplw rA,rB (equivalent to cmpl 0,0,rA,rB)
4. Same as (3), but place result in condition register field CR4.
cmplw cr4,rA,rB (equivalent to cmpl 4,0,rA,rB)
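The expansion of the word compare mnemonics shown in these examples can be sketched in Python. The function name and the string-based operand handling are illustrative assumptions, not assembler behavior; only the crN symbols are accepted for the optional crfD operand in this sketch.

```python
CR_FIELDS = {f"cr{n}": n for n in range(8)}
BASIC = {"cmpwi": "cmpi", "cmpw": "cmp", "cmplwi": "cmpli", "cmplw": "cmpl"}

def expand_word_compare(mnemonic, operands):
    """Expand a word compare mnemonic to its basic form with L = 0."""
    basic = BASIC[mnemonic]
    ops = [op.strip() for op in operands.split(",")]
    # crfD is optional; when omitted, the result goes to CR0.
    if len(ops) == 3:
        crfd = CR_FIELDS[ops[0]]
        ops = ops[1:]
    else:
        crfd = 0
    return f"{basic} {crfd},0,{ops[0]},{ops[1]}"

example_1 = expand_word_compare("cmpwi", "rA,100")
example_4 = expand_word_compare("cmplw", "cr4,rA,rB")
```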
• Extract — Select a field of n bits starting at bit position b in the source register;
left or right justify this field in the target register; clear all other bits of the target
register.
• Insert — Select a left-justified or right-justified field of n bits in the source reg-
0000y Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is FALSE.
0001y Decrement the CTR, then branch if the decremented CTR = 0 and the condition is FALSE.
0100y Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is TRUE.
0101y Decrement the CTR, then branch if the decremented CTR = 0 and the condition is TRUE.
The z indicates a bit that must be zero; otherwise, the instruction form is invalid.
The y bit provides a hint about whether a conditional branch is likely to be taken.
The 5-bit BI field in branch conditional instructions specifies which of the 32 bits in
the CR represents the condition to test.
Table E-6 provides the operands for the simplified mnemonics in Table E-5, as
well as the operands of the corresponding basic branch instruction.
Instructions using a mnemonic from Table E-5 that test a condition specify the con-
dition (bit in the condition register) as the first (BI) operand of the instruction. The
symbols defined in E.1 Symbols can be used in this operand. If one of the CR field
symbols is used, it must be multiplied by four and added to a symbol or value (zero
to three) representing the bit number within the CR field.
The simplified mnemonics found in Table E-5 are illustrated in the following exam-
ples:
1. Decrement CTR and branch if it is still non-zero (closure of a loop controlled
by a count loaded into CTR).
bdnz target (equivalent to bc 16,0,target)
2. Same as (1), but branch only if CTR is non-zero and the condition in CR0 is
“equal.”
bdnzt eq,target (equivalent to bc 8,2,target)
5. Same as (4), but set the link register. This is a form of conditional “call.”
bfl 27,target (equivalent to bcl 4,27,target)
Branch Semantics                   bc    bca    bclr    bcctr    bcl    bcla    bclrl    bcctrl
Branch if less than                blt   blta   bltlr   bltctr   bltl   bltla   bltlrl   bltctrl
Branch if less than or equal       ble   blea   blelr   blectr   blel   blela   blelrl   blectrl
Branch if equal                    beq   beqa   beqlr   beqctr   beql   beqla   beqlrl   beqctrl
Branch if greater than or equal    bge   bgea   bgelr   bgectr   bgel   bgela   bgelrl   bgectrl
Branch if greater than             bgt   bgta   bgtlr   bgtctr   bgtl   bgtla   bgtlrl   bgtctrl
Branch if not less than            bnl   bnla   bnllr   bnlctr   bnll   bnlla   bnllrl   bnlctrl
Branch if not equal                bne   bnea   bnelr   bnectr   bnel   bnela   bnelrl   bnectrl
Branch if not greater than         bng   bnga   bnglr   bngctr   bngl   bngla   bnglrl   bngctrl
Branch if summary overflow         bso   bsoa   bsolr   bsoctr   bsol   bsola   bsolrl   bsoctrl
Branch if not summary overflow     bns   bnsa   bnslr   bnsctr   bnsl   bnsla   bnslrl   bnsctrl
Branch if unordered                bun   buna   bunlr   bunctr   bunl   bunla   bunlrl   bunctrl
Branch if not unordered            bnu   bnua   bnulr   bnuctr   bnul   bnula   bnulrl   bnuctrl
Table E-8 shows the operands used with the simplified branch mnemonics in Ta-
ble E-7. The examples provided are for the first column of Table E-7 (simplified
forms of the bc instruction), but all entries within a row in Table E-7 use the same
operands (except that branches to the LR or CTR do not require a “target” oper-
and). Table E-8 also indicates the operands used with the corresponding basic
branch mnemonic.
Instructions using the mnemonics in Table E-7 specify the condition register field
in an optional first operand. If the CR field being tested is CR0, this operand need
not be specified. Otherwise, one of the CR field symbols defined in E.1 Symbols
can be used for this operand.
If one of the CR field symbols is used, it must not be multiplied by four. The bit
number within the CR field is part of the simplified mnemonic. The CR field is identified,
and the assembler does the multiplication and addition required to produce a CR
bit number for the BI field of the underlying basic mnemonic.
The simplified mnemonics found in Table E-7 are used in the following examples:
3. Branch to an absolute target if CR4 specifies “greater than,” setting the link
register. This is a form of conditional “call”, as the return address is saved in
the link register.
bgtla cr4,target (equivalent to bcla 12,17,target)
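The multiplication and addition that produce the BI operand can be written out in Python. This is a minimal sketch; the function name is illustrative, and the bit numbers for lt, gt, eq, so, and un are assumed to be the values defined for the symbols in E.1 Symbols (0, 1, 2, 3, and 3).

```python
CR_BIT = {"lt": 0, "gt": 1, "eq": 2, "so": 3, "un": 3}

def bi_operand(cr_field, bit_symbol):
    """BI = 4 * CR field number + bit number within the field."""
    if not 0 <= cr_field <= 7:
        raise ValueError("CR field must be in the range 0 to 7")
    return 4 * cr_field + CR_BIT[bit_symbol]

bi_for_bgtla_cr4 = bi_operand(4, "gt")
bi_for_bdnzt_eq = bi_operand(0, "eq")
```

The first value corresponds to the BI operand in bcla 12,17,target, and the second to the BI operand in bc 8,2,target.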
Assemblers should clear this bit unless otherwise directed. This default action in-
dicates the following:
For relative and absolute branches (bc[l][a]), the setting of the y bit depends on
whether the displacement field is negative or non-negative. For negative displace-
ment fields, coding the suffix ‘+’ causes the bit to be cleared, and coding the suffix
‘–’ causes it to be set. For non-negative displacement fields, coding the suffix ‘+’
causes the bit to be set, and coding the suffix ‘–’ causes the bit to be cleared.
For branches to an address in the LR or CTR (bclr[l] or bcctr[l]), coding the suffix
‘+’ causes the y bit to be set, and coding the suffix ‘–’ causes the bit to be cleared.
1. Branch if CR0 reflects condition “less than,” specifying that the branch
should be predicted to be taken.
blt+ target
2. Same as (1), but target address is in the LR and the branch should be pre-
dicted not to be taken.
bltlr–
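The rules for the prediction suffixes can be summarized in a short Python function. This is a sketch only: the function name and the use of None to mean "branch to LR or CTR" are illustrative choices, and the suffix is passed as an ASCII '+' or '-'.

```python
def y_bit(suffix, displacement=None):
    """Return the y bit for a '+' or '-' prediction suffix.

    displacement is the signed displacement field for bc[l][a];
    pass None for branches to the LR or CTR (bclr[l], bcctr[l]).
    """
    if suffix not in ("+", "-"):
        raise ValueError("suffix must be '+' or '-'")
    predict_taken = suffix == "+"
    if displacement is not None and displacement < 0:
        # Negative displacement: '+' clears the bit and '-' sets it.
        return 0 if predict_taken else 1
    # Non-negative displacement, or LR/CTR target: '+' sets, '-' clears.
    return 1 if predict_taken else 0

blt_plus_forward = y_bit("+", displacement=64)
bltlr_minus = y_bit("-")
```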
The symbols defined in E.1 Symbols can be used to identify the condition register
bit. If one of the CR field symbols is used, it must be multiplied by four and added
to a symbol or value (zero to three) representing the bit number within the CR field.
Code    Description     TO Encoding   <   >   =   <U   >U
lt      Less than       16            1   0   0   0    0
eq      Equal           4             0   0   1   0    0
gt      Greater than    8             0   1   0   0    0
ne      Not equal       24            1   1   0   0    0
(none)  Unconditional   31            1   1   1   1    1
NOTES:
1. The symbol ‘<U’ indicates an unsigned “less than” evaluation will be performed.
2. The symbol ‘>U’ indicates an unsigned “greater than” evaluation will be performed.
The mnemonics defined in Table E-11 are variations of the trap instructions, with
the most useful values of the trap instruction TO operand represented as a mne-
monic rather than specified as a numeric operand.
The following examples illustrate the use of simplified mnemonics for trap instruc-
tions:
2. Trap unconditionally.
trap (equivalent to tw 31,0,0)
Trap instructions evaluate a trap condition as follows: the contents of register rA
are compared with either the sign-extended SIMM field or the contents of register
rB, depending on the trap instruction.
The comparison results in five conditions which are ANDed with operand TO. If the
result is not zero, the trap exception handler is invoked. See Table E-12 for these
conditions.
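The evaluation described above, in which five comparison results are ANDed with the TO operand, can be modeled in Python. This is a sketch assuming 32-bit operands; the function names and the TO_CODES dictionary (including its "trap" key for the unconditional case) are illustrative.

```python
TO_CODES = {"lt": 16, "gt": 8, "eq": 4, "ne": 24, "trap": 31}

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def trap_taken(to, a, b):
    """Return True if tw/twi with the 5-bit TO field would trap."""
    sa, sb = to_signed32(a), to_signed32(b)
    ua, ub = a & 0xFFFFFFFF, b & 0xFFFFFFFF
    conditions = (
        (sa < sb) << 4      # TO[0]: signed less than
        | (sa > sb) << 3    # TO[1]: signed greater than
        | (sa == sb) << 2   # TO[2]: equal
        | (ua < ub) << 1    # TO[3]: unsigned less than
        | (ua > ub)         # TO[4]: unsigned greater than
    )
    return (conditions & to) != 0

signed_case = trap_taken(TO_CODES["lt"], 0xFFFFFFFF, 0)    # -1 < 0 as signed
unsigned_case = trap_taken(0b00010, 0xFFFFFFFF, 0)         # not less as unsigned
```

The "ne" code of 24 sets both the signed less-than and signed greater-than bits, so it traps whenever the operands differ.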
General special purpose registers G0 through G3    mtsprg n,rS    mtspr 272+n,rS    mfsprg rD,n    mfspr rD,272+n
E.9.1 No-Op
Many PowerPC instructions can be coded in a way such that, effectively, no oper-
ation is performed. An additional mnemonic is provided for the preferred form of no-
op.
The following instruction loads a 16-bit signed immediate value, shifted left by 16
bits, into rA:
B Beat. A single state on the external bus interface that may extend across
multiple bus cycles. An RCPU transaction can be composed of
multiple address or data beats.
Biased Exponent. The sum of the exponent and a constant (bias) chosen
to make the biased exponent's range non-negative.
Big-Endian. A byte-ordering method in memory where the address n of a
word corresponds to the most significant byte. In an addressed
memory word, the bytes are ordered (left to right) 0, 1, 2, 3, with 0
being the most significant byte.
Blockage. The number of clock cycles between the time an instruction
begins execution and the time its execution unit is available for a
subsequent instruction.
Boundedly Undefined. The results of attempting to execute a given
instruction are said to be boundedly undefined if they could have
been achieved by executing an arbitrary sequence of defined
instructions, in valid form, starting in the state the machine was in
before attempting to execute the given instruction. Boundedly
undefined results for a given instruction may vary between
implementations, and between execution attempts in the same
implementation.
Branch Folding. A technique of removing the branch instruction from the
instruction sequence.
Breakpoint. An event that, when detected, forces the machine to branch to
a breakpoint exception routine.
P Park. The act of allowing a bus master to maintain mastership of the bus
without having to arbitrate.
Pipelining. A technique that breaks instruction execution into distinct steps
so that multiple steps can be performed at the same time.
Precise Exceptions. The pipeline can be stopped so the instructions that
preceded the faulting instruction can complete, and subsequent
instructions can be executed from their beginning.
W Watchpoint. An event that, when detected, is reported but does not change
the timing of the machine.
Section 2 Registers
Page 2-5 Added Table 2-1 FPSCR Control, Status, and Sticky Bits.
Page 2-14 Added to the description of the BE bit in Table 2-8 Machine
State Register Bit Settings.
Section 4
Section 6 Exceptions
Page 6-7 Modified Figure 6-1.
Page 6-14 Added to the description of the BE bit in Table 6-7 Machine
State Register Bit Settings.
Page 8-51 Updated Table 8-30 ICTRL Bit Settings to include the SER bit.
Page 8-57 Corrected the reset value for CHSTPE in Table 8-36 DER Bit
Settings.
Page 9-21 Made RTL consistent with RTL on pages 9-25 and 9-27.
Page 9-22 to 9-24 Corrected Table 9-8 Simplified Mnemonics for bc, bca, bcl,
and bcla Instructions.
Page 9-25 to 9-26 Corrected the RTL. Corrected Table 9-9 Simplified Mnemon-
ics for bcctr and bcctrl Instructions.
Page 9-28 Corrected Table 9-10 Simplified Mnemonics for bclr and bclrl
Instructions.
Page 9-30 Revised the second paragraph of text. Updated Table 9-11
Simplified Mnemonics for cmp Instruction.
Page 9-31 Updated Table 9-12 Simplified Mnemonics for cmpi Instruc-
tion.
Page 9-32 Corrected the RTL. Updated Table 9-13 Simplified Mnemon-
ics for cmpl Instruction.
Page 9-33 Corrected the RTL. Revised the second paragraph of text. Up-
dated Table 9-14 Simplified Mnemonics for cmpli Instruction.
Page 9-45 Updated text regarding setting LT, GT, and EQ bits.
Page 9-53 Revised the second paragraph of text. Corrected Other
Registers Altered.
Page 9-54 Revised the second paragraph of text. Corrected Other
Registers Altered.
Page 9-78 Corrected the RTL. Revised the third paragraph of text.
Page 9-79 Corrected the RTL. Revised the third paragraph of text.
Page 9-82 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-83 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-86 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-87 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-90 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-91 Corrected the RTL. Revised the third paragraph of text.
Page 9-95 Corrected the RTL. Revised the third paragraph of text.
Page 9-96 Corrected the RTL. Revised the third paragraph of text.
Page 9-98 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-99 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-100 Corrected the RTL.
Page 9-104 Corrected the RTL. Revised the third paragraph of text.
Page 9-105 Corrected the RTL. Revised the third paragraph of text.
Page 9-117 Corrected the RTL. Added Table 9-22 Simplified Mnemonics
for mtcrf Instruction.
Page 9-128 Corrected the RTL. Revised the first paragraph of text.
Page 9-142 Corrected the RTL. Revised the first paragraph of text.
Page 9-147 Corrected the RTL. Revised the third paragraph of text.
Page 9-148 Corrected the RTL. Revised the third paragraph of text.
Page 9-151 Corrected the RTL. Revised the third paragraph of text.
Page 9-152 Corrected the RTL. Revised the third paragraph of text.
Page 9-154 Corrected the RTL. Revised the second paragraph of text.
Page 9-156 Corrected the RTL. Revised the third paragraph of text.
Page 9-157 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-161 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-162 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-165 Corrected the RTL. Revised the third paragraph of text.
Page 9-166 Corrected the RTL. Revised the third and fourth paragraphs of
text.
Page 9-170 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-171 Corrected the RTL. Revised the fourth paragraph of text.
Page 9-180 Corrected the RTL. Corrected Table 9-31 Simplified Mnemon-
ics for Instructions.
A-1 to A-6 Corrected Table A-1 Complete Instruction List Sorted by Mne-
monic.
–E– –F–
–U–
UIMM 9-3
UISA register set 2-3
Unlock all 5-9
Unlock line 5-9
Unordered exceptions 6-2
User
privilege level 2-1
User level
registers 2-3
SPRs 1-12, 4-62
UX bit 2-6, 6-36
–V–
–W–
Watchpoints 8-11
Window trace 8-8
Writeback stage 1-9, 6-1, 7-4
–X–
X bit 3-22
XE bit 2-7, 6-38
XER 2-10
XO 9-3
xor 4-13, 9-182
xori 4-13, 9-183
xoris 4-13, 9-184
XX bit 2-6, 6-37
–Z–