Reverse Engineering For Beginners 1st Edition by Dennis Yurichev B0DLZR55FH
Reverse Engineering for Beginners
Dennis Yurichev
<dennis(a)yurichev.com>
©2013-2016, Dennis Yurichev.
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
license. To view a copy of this license, visit https://creativecommons.org/licenses/by-sa/4.0/.
Text version (May 11, 2017).
The latest version (and Russian edition) of this text is accessible at beginners.re.
The cover was made by Andy Nechaevsky: facebook.
Call for translators!
You may want to help me with translating this work into languages other than English and Russian. Just
send me any piece of translated text (no matter how short) and I’ll put it into my LaTeX source code.
Read here.
Speed isn’t important, because this is an open-source project, after all. Your name will be mentioned as
a project contributor. The Korean, Chinese, and Persian languages are reserved by publishers. I do the
English and Russian versions myself, but my English is still horrible, so I’m very grateful for any notes
about grammar, etc. Even my Russian is flawed, so I’m grateful for notes about the Russian text as well!
So do not hesitate to contact me: dennis(a)yurichev.com.
Abridged contents
1 Code patterns
4 Java
6 OS-specific
7 Tools
12 Communities
Afterword
Appendix
Glossary
Index
Contents
1 Code patterns
1.1 The method
1.2 Some basics
1.2.1 A short introduction to the CPU
1.2.2 Numeral systems
1.3 Empty function
1.3.1 x86
1.3.2 ARM
1.3.3 MIPS
1.3.4 Empty functions in practice
1.4 Returning value
1.4.1 x86
1.4.2 ARM
1.4.3 MIPS
1.4.4 In practice
1.5 Hello, world!
1.5.1 x86
1.5.2 x86-64
1.5.3 GCC: one more thing
1.5.4 ARM
1.5.5 MIPS
1.5.6 Conclusion
1.5.7 Exercises
1.6 Function prologue and epilogue
1.6.1 Recursion
1.7 Stack
1.7.1 Why does the stack grow backwards?
1.7.2 What is the stack used for?
1.7.3 A typical stack layout
1.7.4 Noise in stack
1.7.5 Exercises
1.8 printf() with several arguments
1.8.1 x86
1.8.2 ARM
1.8.3 MIPS
1.8.4 Conclusion
1.8.5 By the way
1.9 scanf()
1.9.1 Simple example
1.9.2 Popular mistake
1.9.3 Global variables
1.9.4 scanf()
1.9.5 Exercise
1.10 Accessing passed arguments
1.10.1 x86
1.10.2 x64
1.10.3 ARM
1.10.4 MIPS
1.11 More about results returning
1.11.1 Attempt to use the result of a function returning void
1.11.2 What if we do not use the function result?
1.11.3 Returning a structure
1.12 Pointers
1.12.1 Swap input values
1.12.2 Returning values
1.13 GOTO operator
1.13.1 Dead code
1.13.2 Exercise
1.14 Conditional jumps
1.14.1 Simple example
1.14.2 Calculating absolute value
1.14.3 Ternary conditional operator
1.14.4 Getting minimal and maximal values
1.14.5 Conclusion
1.14.6 Exercise
1.15 switch()/case/default
1.15.1 Small number of cases
1.15.2 A lot of cases
1.15.3 When there are several case statements in one block
1.15.4 Fall-through
1.15.5 Exercises
1.16 Loops
1.16.1 Simple example
1.16.2 Memory blocks copying routine
1.16.3 Conclusion
1.16.4 Exercises
1.17 More about strings
1.17.1 strlen()
1.17.2 Boundaries of strings
1.18 Replacing arithmetic instructions to other ones
1.18.1 Multiplication
1.18.2 Division
1.18.3 Exercise
1.19 Floating-point unit
1.19.1 IEEE 754
1.19.2 x86
1.19.3 ARM, MIPS, x86/x64 SIMD
1.19.4 C/C++
1.19.5 Simple example
1.19.6 Passing floating point numbers via arguments
1.19.7 Comparison example
1.19.8 Some constants
1.19.9 Copying
1.19.10 Stack, calculators and reverse Polish notation
1.19.11 x64
1.19.12 Exercises
1.20 Arrays
1.20.1 Simple example
1.20.2 Buffer overflow
1.20.3 Buffer overflow protection methods
1.20.4 One more word about arrays
1.20.5 Array of pointers to strings
1.20.6 Multidimensional arrays
1.20.7 Pack of strings as a two-dimensional array
1.20.8 Conclusion
1.21 By the way
1.21.1 Exercises
1.22 Manipulating specific bit(s)
1.22.1 Specific bit checking
1.22.2 Setting and clearing specific bits
1.22.3 Shifts
1.22.4 Setting and clearing specific bits: FPU (floating-point unit) example
1.22.5 Counting bits set to 1
1.22.6 Conclusion
1.22.7 Exercises
1.23 Linear congruential generator
1.23.1 x86
1.23.2 x64
1.23.3 32-bit ARM
1.23.4 MIPS
1.23.5 Thread-safe version of the example
1.24 Structures
1.24.1 MSVC: SYSTEMTIME example
1.24.2 Let’s allocate space for a structure using malloc()
1.24.3 UNIX: struct tm
1.24.4 Fields packing in structure
1.24.5 Nested structures
1.24.6 Bit fields in a structure
1.24.7 Exercises
1.25 Unions
1.25.1 Pseudo-random number generator example
1.25.2 Calculating machine epsilon
1.26 FSCALE replacement
1.26.1 Fast square root calculation
1.27 Pointers to functions
1.27.1 MSVC
1.27.2 GCC
1.27.3 Danger of pointers to functions
1.28 64-bit values in 32-bit environment
1.28.1 Returning of 64-bit value
1.28.2 Arguments passing, addition, subtraction
1.28.3 Multiplication, division
1.28.4 Shifting right
1.28.5 Converting 32-bit value into 64-bit one
1.29 SIMD
1.29.1 Vectorization
1.29.2 SIMD strlen() implementation
1.30 64 bits
1.30.1 x86-64
1.30.2 ARM
1.30.3 Float point numbers
1.30.4 64-bit architecture criticism
1.31 Working with floating point numbers using SIMD
1.31.1 Simple example
1.31.2 Passing floating point number via arguments
1.31.3 Comparison example
1.31.4 Calculating machine epsilon: x64 and SIMD
1.31.5 Pseudo-random number generator example revisited
1.31.6 Summary
1.32 ARM-specific details
1.32.1 Number sign (#) before number
1.32.2 Addressing modes
1.32.3 Loading a constant into a register
1.32.4 Relocs in ARM64
1.33 MIPS-specific details
1.33.1 Loading a 32-bit constant into register
1.33.2 Further reading about MIPS
2.2.1 Using IMUL over MUL
2.2.2 Couple of additions about two’s complement form
2.3 AND
2.3.1 Checking if a value is on 2^n boundary
2.3.2 KOI-8R Cyrillic encoding
2.4 AND and OR as subtraction and addition
2.4.1 ZX Spectrum ROM text strings
2.5 XOR (exclusive OR)
2.5.1 Everyday speech
2.5.2 Encryption
2.5.3 RAID4
2.5.4 XOR swap algorithm
2.5.5 XOR linked list
2.5.6 By the way
2.5.7 AND/OR/XOR as MOV
2.6 Population count
2.7 Endianness
2.7.1 Big-endian
2.7.2 Little-endian
2.7.3 Example
2.7.4 Bi-endian
2.7.5 Converting data
2.8 Memory
2.9 CPU
2.9.1 Branch predictors
2.9.2 Data dependencies
2.10 Hash functions
2.10.1 How do one-way functions work?
3.11.1 Strings and memory functions
3.12 C99 restrict
3.13 Branchless abs() function
3.13.1 Optimizing GCC 4.9.1 x64
3.13.2 Optimizing GCC 4.9 ARM64
3.14 Variadic functions
3.14.1 Computing arithmetic mean
3.14.2 vprintf() function case
3.14.3 Pin case
3.14.4 Format string exploit
3.15 Strings trimming
3.15.1 x64: Optimizing MSVC 2013
3.15.2 x64: Non-optimizing GCC 4.9.1
3.15.3 x64: Optimizing GCC 4.9.1
3.15.4 ARM64: Non-optimizing GCC (Linaro) 4.9
3.15.5 ARM64: Optimizing GCC (Linaro) 4.9
3.15.6 ARM: Optimizing Keil 6/2013 (ARM mode)
3.15.7 ARM: Optimizing Keil 6/2013 (Thumb mode)
3.15.8 MIPS
3.16 toupper() function
3.16.1 x64
3.16.2 ARM
3.16.3 Summary
3.17 Obfuscation
3.17.1 Text strings
3.17.2 Executable code
3.17.3 Virtual machine / pseudo-code
3.17.4 Other things to mention
3.17.5 Exercise
3.18 C++
3.18.1 Classes
3.18.2 ostream
3.18.3 References
3.18.4 STL
3.18.5 Memory
3.19 Negative array indices
3.19.1 Addressing string from the end
3.19.2 Addressing some kind of block from the end
3.19.3 Arrays started at 1
3.20 Packing 12-bit values into array
3.20.1 Introduction
3.20.2 Data structure
3.20.3 The algorithm
3.20.4 The C/C++ code
3.20.5 How it works
3.20.6 Optimizing GCC 4.8.2 for x86-64
3.20.7 Optimizing Keil 5.05 (Thumb mode)
3.20.8 Optimizing Keil 5.05 (ARM mode)
3.20.9 (32-bit ARM) Comparison of code density in Thumb and ARM modes
3.20.10 Optimizing GCC 4.9.3 for ARM64
3.20.11 Optimizing GCC 4.4.5 for MIPS
3.20.12 Difference from the real FAT12
3.20.13 Exercise
3.20.14 Summary
3.20.15 Conclusion
3.21 More about pointers
3.21.1 Working with addresses instead of pointers
3.21.2 Passing values as pointers; tagged unions
3.21.3 Pointers abuse in Windows kernel
3.21.4 Null pointers
3.21.5 Array as function argument
3.21.6 Pointer to function
3.21.7 Pointer as object identificator
3.22 Loop optimizations
3.22.1Weird loop optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
3.22.2Another loop optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
3.23More about structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
3.23.1Sometimes a C structure can be used instead of array . . . . . . . . . . . . . . . . . . . . . 628
3.23.2Unsized array in C structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
3.23.3Version of C structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630
3.23.4High-score file in “Block out” game and primitive serialization . . . . . . . . . . . . . . . . 632
3.24memmove() and memcpy() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636
3.24.1Anti-debugging trick . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
3.25setjmp/longjmp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 638
3.26Other weird stack hacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 640
3.26.1Accessing arguments/local variables of caller . . . . . . . . . . . . . . . . . . . . . . . . . . . 640
3.26.2Returning string . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
3.27OpenMP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
3.27.1MSVC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645
3.27.2GCC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
3.28Another heisenbug . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
3.29Windows 16-bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
3.29.1Example #1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 650
3.29.2Example #2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 650
3.29.3Example #3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 651
3.29.4Example #4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652
3.29.5Example #5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
3.29.6Example #6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
4 Java 661
4.1 Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 661
4.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 661
4.1.2 Returning a value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 661
4.1.3 Simple calculating functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
4.1.4 JVM3 memory model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668
4.1.5 Simple function calling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669
4.1.6 Calling beep() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670
4.1.7 Linear congruential PRNG4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
4.1.8 Conditional jumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672
4.1.9 Passing arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
4.1.10Bitfields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
4.1.11Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
4.1.12switch() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 678
4.1.13Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 679
4.1.14Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 687
4.1.15Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689
4.1.16Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 692
4.1.17Simple patching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 694
4.1.18Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699
5.4.2 Finding strings in binary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 710
5.4.3 Error/debug messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 711
5.4.4 Suspicious magic strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 711
5.5 Calls to assert() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 712
5.6 Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 712
5.6.1 Magic numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
5.6.2 Specific constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
5.6.3 Searching for constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
5.7 Finding the right instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
5.8 Suspicious code patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 716
5.8.1 XOR instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 716
5.8.2 Hand-written assembly code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 717
5.9 Using magic numbers while tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718
5.10Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718
5.10.1Some binary file patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 719
5.10.2Memory “snapshots” comparing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
5.11ISA5 detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 727
5.11.1Incorrectly disassembled code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 727
5.11.2Correctly disassembled code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 732
5.12Other things . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 732
5.12.1General idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 732
5.12.2Order of functions in binary code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733
5.12.3Tiny functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733
5.12.4C++ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733
6 OS-specific 734
6.1 Arguments passing methods (calling conventions) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
6.1.1 cdecl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
6.1.2 stdcall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
6.1.3 fastcall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 735
6.1.4 thiscall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
6.1.5 x86-64 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
6.1.6 Return values of float and double type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 740
6.1.7 Modifying arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 740
6.1.8 Taking a pointer to function argument . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 741
6.2 Thread Local Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 742
6.2.1 Linear congruential generator revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 743
6.3 System calls (syscall-s) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 747
6.3.1 Linux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748
6.3.2 Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748
6.4 Linux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748
6.4.1 Position-independent code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748
6.4.2 LD_PRELOAD hack in Linux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 751
6.5 Windows NT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 753
6.5.1 CRT (win32) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 753
6.5.2 Win32 PE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
6.5.3 Windows SEH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 764
6.5.4 Windows NT: Critical section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 787
7 Tools 789
7.1 Binary analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 789
7.1.1 Disassemblers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 789
7.1.2 Decompilers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 790
7.1.3 Patch comparison/diffing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 790
7.2 Live analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 790
7.2.1 Debuggers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 790
7.2.2 Library calls tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 790
7.2.3 System calls tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
7.2.4 Network sniffing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
7.2.5 Sysinternals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
7.2.6 Valgrind . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
7.2.7 Emulators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
7.3 Other tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 792
5 Instruction Set Architecture
7.4 Something missing here? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 792
9.6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 987
9.7 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 987
9.8 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 987
12 Communities 1005
Afterword 1007
12.1Questions? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1007
Appendix 1009
.1 x86 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1009
.1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1009
.1.2 General purpose registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1009
.1.3 FPU registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1013
.1.4 SIMD registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1015
.1.5 Debugging registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1015
.1.6 Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1016
.1.7 npad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1028
.2 ARM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1029
.2.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1029
.2.2 Versions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1030
.2.3 32-bit ARM (AArch32) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1030
.2.4 64-bit ARM (AArch64) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1031
.2.5 Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1031
.3 MIPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1032
.3.1 Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1032
.3.2 Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1033
.4 Some GCC library functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1033
.5 Some MSVC library functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1033
.6 Cheatsheets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1034
.6.1 IDA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1034
.6.2 OllyDbg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1034
.6.3 MSVC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1034
.6.4 GCC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1035
.6.5 GDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1035
Index 1045
Preface
There are several popular meanings of the term “reverse engineering”: 1) The reverse engineering of
software: researching compiled programs; 2) The scanning of 3D structures and the subsequent digital
manipulation required in order to duplicate them; 3) Recreating DBMS6 structure. This book is about the
first meaning.
Oracle RDBMS (8.11 on page 900), Itanium (10.5 on page 991), copy-protection dongles (8.5 on
page 815), LD_PRELOAD (6.4.2 on page 751), stack overflow, ELF7, win32 PE file format (6.5.2 on
page 757), x86-64 (1.30.1 on page 421), critical sections (6.5.4 on page 787), syscalls (6.3 on page 747),
TLS8, position-independent code (PIC9) (6.4.1 on page 748), profile-guided optimization (10.7.1 on
page 994), C++ STL (3.18.4 on page 558), OpenMP (3.27 on page 643), SEH (6.5.3 on page 764).
Prerequisites
• “Now that Dennis Yurichev has made this book free (libre), it is a contribution to the world of free
knowledge and free education.” Richard M. Stallman, GNU founder, software freedom activist.
• “It’s very well done .. and for free .. amazing.”11 Daniel Bilar, Siege Technologies, LLC.
6 Database management systems
7 Executable file format widely used in *NIX systems including Linux
8 Thread Local Storage
9 Position Independent Code: 6.4.1 on page 748
10 Programming language
11 twitter.com/daniel_bilar/status/436578617221742593
• “... excellent and free”12 Pete Finnigan, Oracle RDBMS security guru.
• “... book is interesting, great job!” Michael Sikorski, author of Practical Malware Analysis: The
Hands-On Guide to Dissecting Malicious Software.
• “... my compliments for the very nice tutorial!” Herbert Bos, full professor at the Vrije Universiteit
Amsterdam, co-author of Modern Operating Systems (4th Edition).
• “... It is amazing and unbelievable.” Luis Rocha, CISSP / ISSAP, Technical Manager, Network & Infor-
mation Security at Verizon Business.
• “Thanks for the great work and your book.” Joris van de Vis, SAP Netweaver & Security specialist.
• “... reasonable intro to some of the techniques.”13 Mike Stay, teacher at the Federal Law Enforce-
ment Training Center, Georgia, US.
• “I love this book! I have several students reading it at the moment, plan to use it in graduate
course.”14 Sergey Bratus, Research Assistant Professor at the Computer Science Department at
Dartmouth College.
• “Dennis @Yurichev has published an impressive (and free!) book on reverse engineering”15 Tanel
Poder, Oracle RDBMS performance tuning expert.
• “This book is some kind of Wikipedia to beginners...” Archer, Chinese Translator, IT Security Re-
searcher.
• “First class reference for people wanting to learn reverse engineering. And it’s free for all.” Mikko
Hyppönen, F-Secure.
Thanks
For patiently answering all my questions: Andrey “herm1t” Baranovich, Slava “Avid” Kazakov.
For sending me notes about mistakes and inaccuracies: Stanislav “Beaver” Bobrytskyy, Alexander Ly-
senko, Alexander “Solar Designer” Peslyak, Federico Ramondino, Mark Wilson, Shell Rocket, Zhu Ruijin,
Changmin Heo, Vitor Vidal, Stijn Crevits, Jean-Gregoire Foulon16 , Ben L., Etienne Khan, Norbert Szetei17 ,
Marc Remy, Michael Hansen.
For helping me in other ways: Andrew Zubinski, Arnaud Patard (rtp on #debian-arm IRC), noshadow on
#gcc IRC, Aliaksandr Autayeu, Mohsen Mostafa Jokar.
For translating the book into Simplified Chinese: Antiy Labs (antiy.cn), Archer.
For translating the book into Korean: Byungho Min.
For translating the book into Dutch: Cedric Sambre (AKA Midas).
For translating the book into Spanish: Diego Boy, Luis Alberto Espinosa Calvo, Fernando Guida.
For translating the book into Portuguese: Thales Stevan de A. Gois.
For translating the book into Italian: Federico Ramondino18 , Paolo Stivanin19 , twyK.
For translating the book into French: Florent Besnard20 , Marc Remy21 , Baudouin Landais, Téo Dacquet22 .
For translating the book into German: Dennis Siekmeier23 , Julius Angres24 , Dirk Loser25 .
For proofreading: Alexander “Lstar” Chernenkiy, Vladimir Botov, Andrei Brazhuk, Mark “Logxen” Cooper,
Yuan Jochen Kang, Mal Malakov, Lewis Porter, Jarle Thorsen, Hong Xie.
12 twitter.com/petefinnigan/status/400551705797869568
13 reddit
14 twitter.com/sergeybratus/status/505590326560833536
15 twitter.com/TanelPoder/status/524668104065159169
16 https://github.com/pixjuan
17 https://github.com/73696e65
18 https://github.com/pinkrab
19 https://github.com/paolostivanin
20 https://github.com/besnardf
21 https://github.com/mremy
22 https://github.com/T30rix
23 https://github.com/DSiekmeier
24 https://github.com/JAngres
25 https://github.com/PolymathMonkey
Vasil Kolev26 did a great amount of work in proofreading and correcting many mistakes.
For illustrations and cover art: Andy Nechaevsky.
Thanks also to all the folks on github.com who have contributed notes and corrections27 .
Many LaTeX packages were used: I would like to thank the authors as well.
Donors
Those who supported me during the time when I wrote a significant part of the book:
2 * Oleg Vygovsky (50+100 UAH), Daniel Bilar ($50), James Truscott ($4.5), Luis Rocha ($63), Joris van
de Vis ($127), Richard S Shultz ($20), Jang Minchang ($20), Shade Atlas (5 AUD), Yao Xiao ($10), Pawel
Szczur (40 CHF), Justin Simms ($20), Shawn the R0ck ($27), Ki Chan Ahn ($50), Triop AB (100 SEK), Ange
Albertini (€10+50), Sergey Lukianov (300 RUR), Ludvig Gislason (200 SEK), Gérard Labadie (€40), Sergey
Volchkov (10 AUD), Vankayala Vigneswararao ($50), Philippe Teuwen ($4), Martin Haeberli ($10), Victor
Cazacov (€5), Tobias Sturzenegger (10 CHF), Sonny Thai ($15), Bayna AlZaabi ($75), Redfive B.V. (€25),
Joona Oskari Heikkilä (€5), Marshall Bishop ($50), Nicolas Werner (€12), Jeremy Brown ($100), Alexandre
Borges ($25), Vladimir Dikovski (€50), Jiarui Hong (100.00 SEK), Jim Di (500 RUR), Tan Vincent ($30),
Sri Harsha Kandrakota (10 AUD), Pillay Harish (10 SGD), Timur Valiev (230 RUR), Carlos Garcia Prado
(€10), Salikov Alexander (500 RUR), Oliver Whitehouse (30 GBP), Katy Moe ($14), Maxim Dyakonov ($3),
Sebastian Aguilera (€20), Hans-Martin Münch (€15), Jarle Thorsen (100 NOK), Vitaly Osipov ($100), Yuri
Romanov (1000 RUR), Aliaksandr Autayeu (€10), Tudor Azoitei ($40), Z0vsky (€10), Yu Dai ($10).
Thanks a lot to every donor!
mini-FAQ
A: In my own experience, authors of technical literature do this mostly for self-advertisement purposes.
It’s not possible to get any decent money from such work.
Q: How does one get a job in reverse engineering?
A: There are hiring threads that appear from time to time on reddit, devoted to RE31 (2016). Try looking
there.
A somewhat related hiring thread can be found in the “netsec” subreddit: 2016.
Q: I have a question...
A: Send it to me by email (dennis(a)yurichev.com).
In January 2015, the Acorn publishing company (www.acornpub.co.kr) in South Korea did a huge amount
of work in translating and publishing my book (as it was in August 2014) into Korean.
It’s now available at their website.
The translator is Byungho Min (twitter/tais9). The cover art was done by my artistic friend, Andy Nechaevsky:
facebook/andydinka. They also hold the copyright to the Korean translation.
So, if you want to have a real book on your shelf in Korean and want to support my work, it is now available
for purchase.
In 2016, the book was translated into Persian by Mohsen Mostafa Jokar (who is also known to the Iranian
community for his translation of the Radare manual32). It is available on the publisher’s website33 (Pendare Pars).
A 40-page excerpt: https://beginners.re/farsi.pdf.
Registration of the book in the National Library of Iran: http://opac.nlai.ir/opac-prod/bibliographic/4473995.
In April 2017, the translation into Chinese was completed by the Chinese publisher PTPress. They are also
the holder of the copyright to the Chinese translation.
It’s available for order here: http://www.epubit.com.cn/book/details/4174. A review of sorts and the
history behind the translation: http://www.cptoday.cn/news/detail/3155.
The principal translator is Archer, to whom I owe so much. He was extremely meticulous (in a good sense)
and reported most of the known mistakes and bugs, which is very important for literature such as this book.
I would recommend his services to any other author!
The guys from Antiy Labs have also helped with the translation. Here is the preface written by them.
31 reddit.com/r/ReverseEngineering/
32 http://rada.re/get/radare2book-persian.pdf
33 http://goo.gl/2Tzx0H
Chapter 1
Code patterns
Author unknown
When the author of this book first started learning C and, later, C++, he used to write small pieces of
code, compile them, and then look at the assembly language output. This made it very easy for him
to understand what was going on in the code that he had written.1 He did it so many times that the
relationship between the C/C++ code and what the compiler produced was imprinted deeply in his mind.
As a result, it is now easy for him to instantly imagine a rough outline of the appearance and function of
a piece of C code. Perhaps this technique could be helpful for others.
Sometimes ancient compilers are used here, in order to get the shortest (or simplest) possible code snip-
pet.
By the way, there is a great website where you can do the same with various compilers, without
installing them on your own machine: https://gcc.godbolt.org/.
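A minimal sketch of the kind of experiment described above. The function name `scale_and_add` is hypothetical and was chosen only for illustration. Assuming GCC is used, saving the function to a file and running something like `gcc -S -O2 file.c` (the `-S` option stops after generating assembly) should give a listing short enough to read in full.

```c
/* Hypothetical example: a function small enough that its entire
   assembly listing can be read at a glance. */
int scale_and_add(int a, int b)
{
    /* A compiler is free to implement the multiplication by 2 as an
       addition, a shift, or an address calculation; finding out which
       one it picks is the point of the experiment. */
    return a * 2 + b;
}
```

Changing the expression slightly and recompiling is a quick way to see how each source-level change is reflected in the output.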
Exercises
When the author of this book studied assembly language, he also often compiled small C functions and
then gradually rewrote them in assembly, trying to make their code as short as possible. This is probably
not worth doing in real-world scenarios today, because it’s hard to compete with the latest compilers in
terms of efficiency. It is, however, a very good way to gain a better understanding of assembly. Feel free,
therefore, to take any assembly code from this book and try to make it shorter. However, don’t forget to
test what you have written.
Source code can be compiled by different compilers with various optimization levels. A typical compiler
has about three such levels, where level zero means that optimization is disabled. Optimization can also be
targeted towards code size or code speed. A non-optimizing compiler is faster and produces more
understandable (albeit verbose) code, whereas an optimizing compiler is slower and tries to produce code that
runs faster (but is not necessarily more compact). In addition to optimization levels, a compiler can
include some debug information in the resulting file, thus producing code that is easy to debug. One of the
important features of the “debug” code is that it might contain links between each line of the source code
and the respective machine code addresses. Optimizing compilers, on the other hand, tend to produce
output where entire lines of source code can be optimized away and thus not even be present in the re-
sulting machine code. Reverse engineers can encounter either version, simply because some developers
1 In fact, he still does it when he can’t understand what a particular bit of code does.
1.2. SOME BASICS
turn on the compiler’s optimization flags and others do not. Because of this, we’ll try to work on examples
of both debug and release versions of the code featured in this book, where possible.
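The difference between the two kinds of output can be sketched with a small, hypothetical function (the name `round_trip` is an assumption made for this illustration). With GCC, for instance, `-O0` disables optimization, `-O2` enables it, and `-g` adds debug information. A non-optimizing build would be expected to emit separate instructions for each statement below, whereas an optimizing build will typically notice that the addition and subtraction cancel out, so those source lines leave no trace in the machine code.

```c
/* Hypothetical example: three source lines that an optimizing
   compiler can usually reduce to "return a". */
int round_trip(int a)
{
    int t = a + 10;   /* likely present in a non-optimized build */
    t = t - 10;       /* likely folded away when optimizing */
    return t;
}
```

The function returns the same value in both builds; only the amount of machine code, and how closely it follows the source, differs.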
The CPU is the device that executes the machine code a program consists of.
A short glossary:
Instruction : A primitive CPU command. The simplest examples include: moving data between registers,
working with memory, primitive arithmetic operations. As a rule, each CPU has its own instruction
set architecture (ISA).
Machine code : Code that the CPU directly processes. Each instruction is usually encoded by several
bytes.
Assembly language : Mnemonic code and some extensions like macros that are intended to make a
programmer’s life easier.
CPU register : Each CPU has a fixed set of general purpose registers (GPR2): ≈ 8 in x86, ≈ 16 in x86-64,
and ≈ 16 in ARM. The easiest way to understand a register is to think of it as an untyped temporary
variable. Imagine that you were working with a high-level PL and could only use eight 32-bit (or 64-bit)
variables. Yet a lot can be done using just these!
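To make the "eight temporary variables" picture concrete, here is a small sketch in C. It is only an analogy: the array `r` and the function name `sum_of_squares` are hypothetical, and a real compiler allocates registers on its own. The function computes 1² + 2² + ... + n² while touching nothing but the eight 32-bit slots.

```c
#include <stdint.h>

/* Computes 1*1 + 2*2 + ... + n*n using only an eight-slot
   "register file". Intended for small n; the 32-bit arithmetic
   wraps around on overflow, as it would in real registers. */
uint32_t sum_of_squares(uint32_t n)
{
    uint32_t r[8] = {0};     /* r[0]..r[7] stand in for eight GPRs */

    r[0] = n;                /* loop bound */
    r[1] = 1;                /* counter */
    r[2] = 0;                /* accumulator */
    while (r[1] <= r[0]) {
        r[3] = r[1] * r[1];  /* scratch value, reused every iteration */
        r[2] += r[3];
        r[1]++;
    }
    return r[2];
}
```

Only four of the eight slots are needed here, which hints at why even a small register set goes a long way.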
One might wonder why there needs to be a difference between machine code and a PL. The answer lies
in the fact that humans and CPUs are not alike—it is much easier for humans to use a high-level PL like
C/C++, Java, Python, etc., but it is easier for a CPU to use a much lower level of abstraction. Perhaps
it would be possible to invent a CPU that can execute high-level PL code, but it would be many times
more complex than the CPUs we know of today. In a similar fashion, it is very inconvenient for humans
to write in assembly language, due to it being so low-level and difficult to write in without making a huge
number of annoying mistakes. The program that converts the high-level PL code into assembly is called
a compiler.3
The x86 ISA has always been one with variable-length instructions, so when the 64-bit era came, the x64
extensions did not impact the ISA very significantly. In fact, the x86 ISA still contains a lot of instructions
that first appeared in the 16-bit 8086 CPU, yet are still found in the CPUs of today. ARM is a RISC4 CPU
designed with constant-length instructions in mind, which had some advantages in the past. In the very
beginning, all ARM instructions were encoded in 4 bytes5. This is now referred to as “ARM mode”. Then
its designers realized that this wasn’t as frugal as they had first imagined. In fact, the most frequently used
CPU instructions6 in real-world applications can be encoded using less information. They therefore added another ISA, called Thumb,
where each instruction was encoded in just 2 bytes. This is now referred to as “Thumb mode”. However, not
all ARM instructions can be encoded in just 2 bytes, so the Thumb instruction set is somewhat limited. It is
worth noting that code compiled for ARM mode and Thumb mode may of course coexist within one single
program. The ARM creators thought Thumb could be extended, giving rise to Thumb-2, which first appeared
in ARMv6T2 and became common with ARMv7. Thumb-2 still uses 2-byte instructions, but has some new instructions which have the size of
4 bytes. There is a common misconception that Thumb-2 is a mix of ARM and Thumb. This is incorrect.
Rather, Thumb-2 was extended to fully support all processor features so it could compete with ARM mode—
a goal that was clearly achieved, as the majority of applications for iPod/iPhone/iPad are compiled for the
Thumb-2 instruction set (admittedly, largely due to the fact that Xcode does this by default). Later the
64-bit ARM came out. This ISA has 4-byte instructions and has no need for any additional Thumb mode.
However, the 64-bit requirements affected the ISA, with the result that we now have three ARM instruction sets:
ARM mode, Thumb mode (including Thumb-2) and ARM64. These ISAs intersect partially, but it can be
said that they are different ISAs, rather than variations of the same one. Therefore, we will try to add
2 General Purpose Registers
3 Old-school Russian literature also uses the term “translator”.
4 Reduced instruction set computing
5 By the way, fixed-length instructions are handy because one can calculate the next (or previous) instruction address without
effort. This feature will be discussed in the switch() operator ( 1.15.2 on page 174) section.
6 These are MOV/PUSH/CALL/Jcc
1.2. SOME BASICS
fragments of code in all three ARM ISAs in this book. There are, by the way, many other RISC ISAs with
fixed length 32-bit instructions, such as MIPS, PowerPC and Alpha AXP.
Humans probably became accustomed to the decimal numeral system because almost everyone has 10 fingers.
Nevertheless, the number 10 has no special significance in science and mathematics. The natural numeral system
in digital electronics is binary: 0 stands for the absence of current in a wire and 1 for its presence. 10 in binary is 2 in
decimal; 100 in binary is 4 in decimal, and so on.
If a numeral system has 10 digits, it has a radix (or base) of 10. The binary numeral system has a radix of 2.
Important things to recall: 1) a number is a number, while a digit is a term from a writing system and is usually
one character; 2) a number is not changed when it is converted to another radix: only the notation used to write it changes (and the way
it is represented in RAM7 ).
How to convert a number from one radix to another?
Positional notation is used almost everywhere. This means that a digit has a weight that depends on where
it is placed inside the number. If 2 is placed at the rightmost position, it stands for 2. If it is placed one
position to the left of the rightmost one, it stands for 20.
What does 1234 stand for?
$10^3 \cdot 1 + 10^2 \cdot 2 + 10^1 \cdot 3 + 1 \cdot 4 = 1234$ or $1000 \cdot 1 + 100 \cdot 2 + 10 \cdot 3 + 4 = 1234$
The same goes for binary numbers, but the base is 2 instead of 10. What does 0b101011 stand for?
$2^5 \cdot 1 + 2^4 \cdot 0 + 2^3 \cdot 1 + 2^2 \cdot 0 + 2^1 \cdot 1 + 2^0 \cdot 1 = 43$ or $32 \cdot 1 + 16 \cdot 0 + 8 \cdot 1 + 4 \cdot 0 + 2 \cdot 1 + 1 = 43$
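The weighted sums above can be computed mechanically for any radix. The following is only a minimal sketch: the names parse_radix and digit_value are hypothetical, ASCII character codes are assumed, and overflow is not checked.

```c
#include <stddef.h>

/* Hypothetical helper: value of one digit character, or -1 if it is not a digit. */
static int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

/* Converts a digit string written in the given radix (2..36) to a number.
   Returns 0 on success and -1 on a bad digit or radix. Overflow is not checked. */
int parse_radix(const char *digits, unsigned radix, unsigned long *out)
{
    unsigned long value = 0;
    size_t i;

    if (radix < 2 || radix > 36 || digits[0] == '\0')
        return -1;

    for (i = 0; digits[i] != '\0'; i++) {
        int d = digit_value(digits[i]);
        if (d < 0 || (unsigned)d >= radix)
            return -1;
        /* Moving every digit one position to the left multiplies by the radix. */
        value = value * radix + (unsigned long)d;
    }
    *out = value;
    return 0;
}
```

Calling parse_radix("101011", 2, &v) stores 43 in v, matching the sum written out above.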
Positional notation can be contrasted with non-positional notation such as the Roman numeral system 8 . Perhaps
humankind switched to positional notation because it makes basic operations (addition, multiplication,
etc.) easier to do on paper by hand.
Indeed, binary numbers can be added, subtracted and so on in the very same way as taught in schools, but
only 2 digits are available.
Binary numbers are bulky when represented in source code and dumps, so that is where the hexadecimal
numeral system can be used. The hexadecimal radix uses the digits 0..9 and also 6 Latin characters: A..F. Each
hexadecimal digit takes 4 bits or 4 binary digits, so it’s very easy to convert from a binary number to
hexadecimal and back, even manually, in one’s head.
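Because one hexadecimal digit corresponds to exactly 4 binary digits, the conversion can be done group by group, starting from the right. Here is a small sketch of that idea; the function name binary_to_hex is hypothetical and the input is assumed to be a string of '0' and '1' characters.

```c
#include <string.h>

/* Converts a string of '0'/'1' characters to uppercase hexadecimal.
   out must have room for (strlen(bits) + 3) / 4 + 1 characters.
   Returns 0 on success, -1 if the input is empty or contains other characters. */
int binary_to_hex(const char *bits, char *out)
{
    static const char hex_digits[] = "0123456789ABCDEF";
    size_t len = strlen(bits);
    size_t first = len % 4;  /* size of the leading, possibly shorter group */
    size_t i = 0, o = 0;

    if (len == 0)
        return -1;
    if (first == 0)
        first = 4;

    while (i < len) {
        size_t group = (i == 0) ? first : 4;
        unsigned nibble = 0;
        size_t k;
        for (k = 0; k < group; k++, i++) {
            if (bits[i] != '0' && bits[i] != '1')
                return -1;
            nibble = nibble * 2 + (unsigned)(bits[i] - '0');
        }
        out[o++] = hex_digits[nibble];  /* one group of bits, one hex digit */
    }
    out[o] = '\0';
    return 0;
}
```

For the input "101011" the groups are "10" and "1011", so the result is "2B".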
Binary numbers are sometimes prepended with the “0b” prefix: 0b100110111 (GCC9 has a non-standard language
extension for this10 ). There is also another way: the “b” suffix, for example: 100110111b. I’ll try to stick to
the “0b” prefix throughout the book for binary numbers.
Hexadecimal numbers are prepended with the “0x” prefix in C/C++ and other PLs: 0x1234ABCD. Alternatively, they
are given the “h” suffix: 1234ABCDh. This is a common way of representing them in assemblers and debuggers. If
the number starts with an A..F digit, a 0 is to be added before it: 0ABCDEFh. There was also a convention
that was popular in the era of 8-bit home computers, using the $ prefix, like $ABCD. I’ll try to stick to the “0x” prefix
throughout the book for hexadecimal numbers.
Should one learn to convert numbers in one’s head? A table of 1-digit hexadecimal numbers can easily be
memorized. As for larger numbers, it is probably not worth tormenting yourself.
Perhaps the hexadecimal numbers most visible to everyone are those in URL11 s. This is how non-Latin
characters are encoded. For example: https://en.wiktionary.org/wiki/na%C3%AFvet%C3%A9 is the
URL of the Wiktionary article about the word “naïveté”.
Octal radix
Another numeral system heavily used in the past of computer programming is octal: there are 8 digits (0..7)
and each is mapped to 3 bits, so it’s easy to convert numbers back and forth. It has been superseded by
the hexadecimal system almost everywhere, but, surprisingly, there is a *NIX utility, used often by many people,
which takes an octal number as an argument: chmod .
As many *NIX users know, the chmod argument can be a number of 3 digits. The first digit is the rights for the owner
of the file, the second is the rights for the group (to which the file belongs), and the third is for everyone else. Each digit can be
represented in binary form:
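As an illustration, each octal digit of a chmod argument expands to three bits, conventionally read as r (4), w (2) and x (1). The sketch below turns a mode into the familiar letter form; the function name mode_to_string is hypothetical, and only the lowest 9 permission bits are looked at.

```c
/* Fills out[10] with a string like "rw-r--r--" for a mode such as 0644.
   Only the lowest 9 permission bits are considered. */
void mode_to_string(unsigned mode, char out[10])
{
    static const char letters[] = "rwxrwxrwx";
    int i;

    for (i = 0; i < 9; i++) {
        /* Bit 8 is the owner's "r", bit 0 is "x" for everyone else. */
        unsigned bit = 1u << (8 - i);
        out[i] = (mode & bit) ? letters[i] : '-';
    }
    out[9] = '\0';
}
```

With the C octal literal 0644 (binary 110 100 100) the result is "rw-r--r--".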
1.3. EMPTY FUNCTION
Divisibility
When you see a decimal number like 120, you can quickly deduce that it’s divisible by 10, because the
last digit is zero. In the same way, 123400 is divisible by 100, because the last two digits are zeros.
Likewise, hexadecimal number 0x1230 is divisible by 0x10 (or 16), 0x123000 is divisible by 0x1000 (or
4096), etc.
Binary number 0b1000101000 is divisible by 0b1000 (8), etc.
This property can often be used to quickly realize whether the size of some block in memory is padded to some
boundary. For example, sections in PE12 files almost always start at addresses ending with 3 hexadecimal
zeros: 0x41000, 0x10001000, etc. The reason behind this is that almost all PE
sections are padded to a boundary of 0x1000 (4096) bytes.
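The same check that the eye does on trailing zeros can be done in code with a bit mask. This is a minimal sketch that assumes the boundary is a power of two; the name is_aligned is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if value is a multiple of boundary, where boundary is a power of two. */
int is_aligned(uint64_t value, uint64_t boundary)
{
    /* boundary - 1 has ones exactly where an aligned value must have zeros. */
    return (value & (boundary - 1)) == 0;
}
```

For example, is_aligned(0x41000, 0x1000) is true because the three lowest hexadecimal digits are zeros, while is_aligned(0x41004, 0x1000) is false.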
Multi-precision arithmetic can use huge numbers, and each one may be stored in several bytes. For
example, RSA keys, both public and private, span up to 4096 bits and maybe even more.
In [Donald E. Knuth, The Art of Computer Programming, Volume 2, 3rd ed., (1997), 265] we can find
the following idea: when you store a multi-precision number in several bytes, the whole number can be
represented as having a radix of $2^8 = 256$, and each digit goes to the corresponding byte. Likewise, if you store
a multi-precision number in several 32-bit integer values, each digit goes to its own 32-bit slot, and you may
think about this number as stored in a radix of $2^{32}$.
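Treating each byte as one digit in radix 256 means that schoolbook addition with carry works unchanged. The sketch below assumes a little-endian layout (least significant byte first), which is a choice made here for illustration and not something the cited text prescribes; the name add_base256 is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds two n-byte little-endian numbers: sum = a + b.
   Each byte is one digit in radix 256. Returns the final carry (0 or 1). */
unsigned add_base256(const uint8_t *a, const uint8_t *b, uint8_t *sum, size_t n)
{
    unsigned carry = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned t = (unsigned)a[i] + b[i] + carry;
        sum[i] = (uint8_t)(t & 0xFF);  /* digit that stays in this position */
        carry = t >> 8;                /* overflow goes to the next digit */
    }
    return carry;
}
```

Adding the 3-byte numbers 0x00FFFF and 0x000001 this way gives 0x010000, with the carry rippling through two digits.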
Pronunciation
Numbers in a non-decimal base are usually pronounced digit by digit: “one-zero-zero-one-one-...”. Words
like “ten”, “thousand”, etc., are usually not used, because they would be confused with the decimal base.
To distinguish floating point numbers from integer ones, they are usually written with “.0” at the end, like
0.0, 123.0, etc.
1.3.1 x86
Here’s what both the optimizing GCC and MSVC compilers produce on the x86 platform:
Listing 1.2: Optimizing GCC/MSVC (assembly output)
f:
ret
There is just one instruction: RET , which returns execution to the caller.
12 Portable Executable
1.3.2 ARM
The return address is not saved on the local stack in the ARM ISA, but rather in the link register, so the
BX LR instruction causes execution to jump to that address—effectively returning execution to the caller.
1.3.3 MIPS
There are two naming conventions used in the world of MIPS when naming registers: by number (from $0
to $31) or by pseudo name ($V0, $A0, etc.).
The GCC assembly output below lists registers by number:
The first instruction is the jump instruction (J or JR) which returns the execution flow to the caller, jumping
to the address in the $31 (or $RA) register.
This is the register analogous to LR14 in ARM.
The second instruction is NOP15 , which does nothing. We can ignore it so far.
Register and instruction names in the world of MIPS are traditionally written in lowercase. However, for
the sake of consistency, we’ll stick to using uppercase letters, as it is the convention followed by all other
ISAs featured in this book.
Although empty functions are useless, they are quite frequent in low-level code.
First of all, debugging functions are quite popular, like this one:
void some_function()
{
...
...
};
13 Interactive Disassembler and debugger developed by Hex-Rays
14 Link Register
15 No OPeration
1.4. RETURNING VALUE
In a non-debug build (e.g., “release”), _DEBUG is not defined, so the dbg_print() function, despite still being
called during execution, will be empty.
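A debugging function of this kind might look roughly like the sketch below. This is an assumption about its shape, not the original listing: the body of dbg_print(), the use of stderr and the caller do_work() are all hypothetical. The point is that the call in do_work() stays in the code in both builds, while the body disappears when _DEBUG is not defined.

```c
#include <stdarg.h>
#include <stdio.h>

/* Prints a debug message in debug builds; compiles to an empty body otherwise. */
void dbg_print(const char *fmt, ...)
{
#ifdef _DEBUG
    va_list args;
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
#else
    (void)fmt;  /* release build: nothing is left in the function body */
#endif
}

/* Hypothetical caller: the call stays in place in both kinds of build. */
int do_work(int x)
{
    dbg_print("do_work(%d) called\n", x);
    return x * 2;
}
```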
Another popular method of software protection is to make several builds: one for legal customers, and a demo
build. The demo build can lack some important functions, like this:
Listing 1.7: C/C++ Code
void save_file ()
{
#ifndef DEMO
// actual saving code
#endif
};
The save_file() function can be called when the user clicks the File->Save menu item. The demo version may be delivered
with this menu item disabled, but even if a software cracker enables it, only an empty function with no useful
code will be called.
IDA marks such functions with names like nullsub_00 , nullsub_01 , etc.
Another simple function is the one that simply returns a constant value:
Listing 1.8: C/C++ Code
int f()
{
return 123;
};
1.4.1 x86
Here’s what both the optimizing GCC and MSVC compilers produce on the x86 platform:
Listing 1.9: Optimizing GCC/MSVC (assembly output)
f:
mov eax, 123
ret
There are just two instructions: the first places the value 123 into the EAX register, which is used by
convention for storing the return value and the second one is RET , which returns execution to the caller.
The caller will take the result from the EAX register.
1.4.2 ARM
1.5. HELLO, WORLD!
ARM uses the register R0 for returning the results of functions, so 123 is copied into R0 .
It is worth noting that MOV is a misleading name for the instruction in both x86 and ARM ISAs.
The data is not in fact moved, but copied.
1.4.3 MIPS
The $2 (or $V0) register is used to store the function’s return value. LI stands for “Load Immediate” and
is the MIPS equivalent to MOV .
The other instruction is the jump instruction (J or JR) which returns the execution flow to the caller.
You might be wondering why positions of the load instruction (LI) and the jump instruction (J or JR) are
swapped. This is due to a RISC feature called “branch delay slot”.
The reason this happens is a quirk in the architecture of some RISC ISAs and isn’t important for our
purposes—we just must keep in mind that in MIPS, the instruction following a jump or branch instruction
is executed before the jump/branch instruction itself.
As a consequence, branch instructions always swap places with the instruction which must be executed
beforehand.
1.4.4 In practice
Functions which merely return 1 (true) or 0 (false) are very frequent in practice.
The smallest ever standard UNIX utilities, true and false, return 0 and 1 respectively as an exit code (zero as
an exit code usually means success, non-zero means error).
Let’s use the famous example from the book [Brian W. Kernighan, Dennis M. Ritchie, The C Programming
Language, (1988)]:
#include <stdio.h>
int main()
{
printf("hello, world\n");
return 0;
}
1.5.1 x86
MSVC
cl 1.cpp /Fa1.asm
MSVC produces assembly listings in Intel-syntax. The difference between Intel-syntax and AT&T-syntax
will be discussed in 1.5.1 on page 11.
The compiler generated the file, 1.obj , which is to be linked into 1.exe . In our case, the file contains
two segments: CONST (for data constants) and _TEXT (for code).
The string hello, world in C/C++ has type const char[] [Bjarne Stroustrup, The C++ Programming
Language, 4th Edition, (2013), p. 176, 7.3.2], but it does not have its own name. The compiler needs to deal
with the string somehow, so it defines the internal name $SG3830 for it.
That is why the example may be rewritten as follows:
#include <stdio.h>
int main()
{
printf($SG3830);
return 0;
}
Let’s go back to the assembly listing. As we can see, the string is terminated by a zero byte, which is
standard for C/C++ strings. More about C/C++ strings: 5.4.1 on page 705.
In the code segment, _TEXT , there is only one function so far: main() . The function main() starts with
prologue code and ends with epilogue code (like almost any function)16 .
After the function prologue we see the call to the printf() function:
CALL _printf . Before the call, a string address (or a pointer to it) containing our greeting is placed on
the stack with the help of the PUSH instruction.
When the printf() function returns the control to the main() function, the string address (or a pointer
to it) is still on the stack. Since we do not need it anymore, the stack pointer (the ESP register) needs to
be corrected.
ADD ESP, 4 means add 4 to the ESP register value.
16 You can read more about it in the section about function prologues and epilogues ( 1.6 on page 30).
Why 4? Since this is a 32-bit program, we need exactly 4 bytes to pass an address through the stack.
If it were x64 code, we would need 8 bytes. ADD ESP, 4 is effectively equivalent to POP register but
without using any register17 .
For the same purpose, some compilers (like the Intel C++ Compiler) may emit POP ECX instead of ADD
(e.g., such a pattern can be observed in the Oracle RDBMS code as it is compiled with the Intel C++ com-
piler). This instruction has almost the same effect but the ECX register contents will be overwritten. The
Intel C++ compiler supposedly uses POP ECX since this instruction’s opcode is shorter than ADD ESP, x
(1 byte for POP against 3 for ADD ).
After calling printf() , the original C/C++ code contains the statement return 0 —return 0 as the
result of the main() function.
In the generated code this is implemented by the instruction XOR EAX, EAX .
XOR is in fact just “eXclusive OR”18 but the compilers often use it instead of MOV EAX, 0 —again because
it is a slightly shorter opcode (2 bytes for XOR against 5 for MOV ).
Some compilers emit SUB EAX, EAX , which means SUBtract the value in EAX from the value in EAX .
In any case, that results in zero.
The last instruction RET returns the control to the caller. Usually, this is C/C++ CRT19 code which in turn
returns control to the OS.
GCC
Now let’s try to compile the same C/C++ code with the GCC 4.4.1 compiler on Linux: gcc 1.c -o 1 . Next,
with the assistance of the IDA disassembler, let’s see how the main() function was created. IDA, like
MSVC, uses Intel-syntax20 .
push ebp
mov ebp, esp
and esp, 0FFFFFFF0h
sub esp, 10h
mov eax, offset aHelloWorld ; "hello, world\n"
mov [esp+10h+var_10], eax
call _printf
mov eax, 0
leave
retn
main endp
The result is almost the same. The address of the hello, world string (stored in the data segment) is
loaded in the EAX register first and then it is saved onto the stack.
In addition, the function prologue has AND ESP, 0FFFFFFF0h —this instruction aligns the ESP register
value on a 16-byte boundary. This results in all values in the stack being aligned the same way (The CPU
17 CPU flags, however, are modified
18 wikipedia
19 C runtime library
20 We could also have GCC produce assembly listings in Intel-syntax by applying the options -S -masm=intel .
performs better if the values it is dealing with are located in memory at addresses aligned on a 4-byte or
16-byte boundary)21 .
SUB ESP, 10h allocates 16 bytes on the stack. Although, as we can see hereafter, only 4 are necessary
here.
This is because the size of the allocated stack is also aligned on a 16-byte boundary.
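The effect of AND ESP, 0FFFFFFF0h can be reproduced in C with the same mask: clearing the lowest 4 bits rounds an address down to a multiple of 16. This is only a sketch; the function name align_down_16 and the sample addresses are hypothetical.

```c
#include <stdint.h>

/* Rounds a 32-bit address down to a 16-byte boundary,
   in the same way AND ESP, 0FFFFFFF0h does for the stack pointer. */
uint32_t align_down_16(uint32_t esp)
{
    return esp & 0xFFFFFFF0u;
}
```

A value such as 0x0012FF7C becomes 0x0012FF70, while an already aligned value is left unchanged.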
The string address (or a pointer to the string) is then stored directly onto the stack without using the PUSH
instruction. var_10 —is a local variable and is also an argument for printf() . Read about it below.
Unlike MSVC, when GCC is compiling without optimization turned on, it emits MOV EAX, 0 instead of a
shorter opcode.
The last instruction, LEAVE —is the equivalent of the MOV ESP, EBP and POP EBP instruction pair —in
other words, this instruction sets the stack pointer ( ESP ) back and restores the EBP register to its initial
state. This is necessary since we modified these register values ( ESP and EBP ) at the beginning of the
function (by executing MOV EBP, ESP / AND ESP, … ).
Let’s see how this can be represented in assembly language AT&T syntax. This syntax is much more
popular in the UNIX-world.
We get this:
The listing contains many macros (beginning with dot). These are not interesting for us at the moment.
21 Wikipedia: Data structure alignment
For now, for the sake of simplification, we can ignore them (except the .string macro which encodes a
null-terminated character sequence just like a C-string). Then we’ll see this 22 :
Some of the major differences between Intel and AT&T syntax are:
• Source and destination operands are written in opposite order.
In Intel-syntax: <instruction> <destination operand> <source operand>.
In AT&T syntax: <instruction> <source operand> <destination operand>.
Here is an easy way to memorize the difference: when you deal with Intel-syntax, you can imagine
that there is an equality sign (=) between operands and when you deal with AT&T-syntax imagine
there is a right arrow (→) 23 .
• AT&T: Before register names, a percent sign must be written (%) and before numbers a dollar sign
($). Parentheses are used instead of brackets.
• AT&T: A suffix is added to instructions to define the operand size:
– q — quad (64 bits)
– l — long (32 bits)
– w — word (16 bits)
– b — byte (8 bits)
Let’s go back to the compiled result: it is identical to what we saw in IDA. With one subtle difference:
0FFFFFFF0h is presented as $-16 . It is the same thing: 16 in the decimal system is 0x10 in hexadec-
imal. -0x10 is equal to 0xFFFFFFF0 (for a 32-bit data type).
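That -16 and 0xFFFFFFF0 are the same 32-bit pattern can be checked in C, where converting a signed value to an unsigned type of the same width is defined modulo $2^{32}$. The helper name bit_pattern32 is hypothetical.

```c
#include <stdint.h>

/* Reinterprets a signed 32-bit value as its unsigned bit pattern. */
uint32_t bit_pattern32(int32_t value)
{
    return (uint32_t)value;  /* conversion is defined modulo 2^32 */
}
```

bit_pattern32(-16) yields 0xFFFFFFF0, which is exactly the constant IDA showed in the AND instruction.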
One more thing: the return value is to be set to 0 by using the usual MOV , not XOR . MOV just loads a
value to a register. Its name is a misnomer (data is not moved but rather copied). In other architectures,
this instruction is named “LOAD” or “STORE” or something similar.
We can easily find “hello, world” string in executable file using Hiew:
22 This GCC option can be used to eliminate “unnecessary” macros: -fno-asynchronous-unwind-tables
23 By the way, in some C standard functions (e.g., memcpy(), strcpy()) the arguments are listed in the same way as in Intel-syntax:
first the pointer to the destination memory block, and then the pointer to the source memory block.
Spanish text is one byte shorter than English, so we also add 0x0A byte at the end ( \n ) and zero byte.
It works.
What if we want to insert a longer message? There are some zero bytes after the original English text. It is hard to
say whether they can be overwritten: they may be used somewhere in CRT code, or maybe not. In any case,
overwrite them only if you really know what you are doing.
[0x00400430]> s 0x004005c4
[0x004005c4]> px
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0x004005c4 6865 6c6c 6f2c 2077 6f72 6c64 0000 0000 hello, world....
0x004005d4 011b 033b 3000 0000 0500 0000 1cfe ffff ...;0...........
0x004005e4 7c00 0000 5cfe ffff 4c00 0000 52ff ffff |...\...L...R...
0x004005f4 a400 0000 6cff ffff c400 0000 dcff ffff ....l...........
0x00400604 0c01 0000 1400 0000 0000 0000 017a 5200 .............zR.
0x00400614 0178 1001 1b0c 0708 9001 0710 1400 0000 .x..............
0x00400624 1c00 0000 08fe ffff 2a00 0000 0000 0000 ........*.......
0x00400634 0000 0000 1400 0000 0000 0000 017a 5200 .............zR.
0x00400644 0178 1001 1b0c 0708 9001 0000 2400 0000 .x..........$...
0x00400654 1c00 0000 98fd ffff 3000 0000 000e 1046 ........0......F
0x00400664 0e18 4a0f 0b77 0880 003f 1a3b 2a33 2422 ..J..w...?.;*3$"
0x00400674 0000 0000 1c00 0000 4400 0000 a6fe ffff ........D.......
0x00400684 1500 0000 0041 0e10 8602 430d 0650 0c07 .....A....C..P..
0x00400694 0800 0000 4400 0000 6400 0000 a0fe ffff ....D...d.......
0x004006a4 6500 0000 0042 0e10 8f02 420e 188e 0345 e....B....B....E
0x004006b4 0e20 8d04 420e 288c 0548 0e30 8606 480e . ..B.(..H.0..H.
[0x004005c4]> oo+
File a.out reopened in read-write mode
[0x004005c4]> q
What I do here: I search for the “hello” string using the / command, then I set the cursor (or seek, in rada.re terms) to
that address. Then I want to be sure that this is really the right place: px dumps the bytes there. oo+ switches
rada.re to read-write mode. w writes an ASCII string at the current seek. Note the \00 at the end: this is a zero
byte. q quits.
The method I described was a common way to translate MS-DOS software into Russian back in the 1980s
and 1990s. Russian words and sentences are usually slightly longer than their English counterparts, which
is why localized software has a lot of weird acronyms and hardly readable abbreviations.
Perhaps this also happened with other languages during that era, in other countries.
1.5.2 x86-64
MSVC: x86-64
main PROC
sub rsp, 40
lea rcx, OFFSET FLAT:$SG2989
call printf
xor eax, eax
add rsp, 40
ret 0
main ENDP
In x86-64, all registers were extended to 64-bit and now their names have an R- prefix. In order to use
the stack less often (in other words, to access external memory/cache less often), there exists a popular
way to pass function arguments via registers (fastcall) 6.1.3 on page 735. I.e., a part of the function
arguments is passed in registers, the rest—via the stack. In Win64, 4 function arguments are passed in
the RCX , RDX , R8 , R9 registers. That is what we see here: a pointer to the string for printf() is
now passed not in the stack, but in the RCX register. The pointers are 64-bit now, so they are passed in
the 64-bit registers (which have the R- prefix). However, for backward compatibility, it is still possible
to access the 32-bit parts, using the E- prefix. This is what the RAX / EAX / AX / AL register looks like in
x86-64:
Byte number:
| 7th | 6th | 5th | 4th | 3rd | 2nd | 1st | 0th |
| RAX (x64)                                     |
|                       | EAX                   |
|                                   | AX        |
|                                   | AH  | AL  |
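The layout of the sub-registers can be mimicked in C with masks and shifts on a 64-bit value. This is a sketch for illustration only: the function names are hypothetical, and real code accesses these parts directly by register name rather than by masking.

```c
#include <stdint.h>

/* Each function returns the part of a 64-bit RAX value that the named
   sub-register would hold. */
uint32_t eax_of(uint64_t rax) { return (uint32_t)(rax & 0xFFFFFFFFu); } /* bytes 3..0 */
uint16_t ax_of(uint64_t rax)  { return (uint16_t)(rax & 0xFFFFu); }     /* bytes 1..0 */
uint8_t  ah_of(uint64_t rax)  { return (uint8_t)((rax >> 8) & 0xFFu); } /* byte 1 */
uint8_t  al_of(uint64_t rax)  { return (uint8_t)(rax & 0xFFu); }        /* byte 0 */
```

For RAX = 0x1122334455667788 these give EAX = 0x55667788, AX = 0x7788, AH = 0x77 and AL = 0x88.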
The main() function returns an int-typed value, which is, in C/C++, for better backward compatibility
and portability, still 32-bit, so that is why the EAX register is cleared at the function end (i.e., the 32-bit
part of the register) instead of RAX . There are also 40 bytes allocated in the local stack. This is called
the “shadow space”, about which we are going to talk later: 1.10.2 on page 101.
GCC: x86-64
The method of passing function arguments in registers is also used in Linux, *BSD and Mac OS X [Michael Matz,
Jan Hubicka, Andreas Jaeger, Mark Mitchell, System V Application Binary Interface. AMD64 Architecture
Processor Supplement, (2013)] 24 .
The first 6 arguments are passed in the RDI , RSI , RDX , RCX , R8 , R9 registers, and the rest—via the
stack.
So the pointer to the string is passed in EDI (the 32-bit part of the register). But why not use the 64-bit
part, RDI ?
It is important to keep in mind that all MOV instructions in 64-bit mode that write something into the lower
32-bit register part also clear the higher 32-bits (as stated in Intel manuals: 11.1.4 on page 1003).
I.e., the MOV EAX, 011223344h writes a value into RAX correctly, since the higher bits will be cleared.
If we open the compiled object file (.o), we can also see all the instructions’ opcodes25 :
Listing 1.22: GCC 4.4.6 x64
.text:00000000004004D0 main proc near
.text:00000000004004D0 48 83 EC 08 sub rsp, 8
.text:00000000004004D4 BF E8 05 40 00 mov edi, offset format ; "hello, world\n"
.text:00000000004004D9 31 C0 xor eax, eax
.text:00000000004004DB E8 D8 FE FF FF call _printf
.text:00000000004004E0 31 C0 xor eax, eax
.text:00000000004004E2 48 83 C4 08 add rsp, 8
.text:00000000004004E6 C3 retn
.text:00000000004004E6 main endp
As we can see, the instruction that writes into EDI at 0x4004D4 occupies 5 bytes. The same instruction
writing a 64-bit value into RDI occupies 7 bytes. Apparently, GCC is trying to save some space. Besides,
it can be sure that the data segment containing the string will not be allocated at the addresses higher
than 4GiB.
We also see that the EAX register has been cleared before the printf() function call. This is done
because, according to the ABI26 standard mentioned above, the number of used vector registers is passed
in EAX in *NIX systems on x86-64.
24 Also available as https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
25 This must be enabled in Options → Disassembly → Number of opcode bytes
26 Application Binary Interface
Address patching (Win64)
If our example was compiled in MSVC 2013 using the /MD switch (meaning a smaller executable due to MSVCR*.DLL
file linkage), the main() function comes first and can be easily found:
Hiew shows “ello, world” string. And when we run patched executable, this very string is printed.
The binary file I’ve got when I compile our example using GCC 5.4.0 on Linux x64 box has many other text
strings: they are mostly imported function names and library names.
I run objdump to get contents of all sections of the compiled file:
$ objdump -s a.out
...
It’s not a problem to pass address of the text string “/lib64/ld-linux-x86-64.so.2” to printf() call:
#include <stdio.h>
int main()
{
printf(0x400238);
return 0;
}
The fact that an anonymous C-string has const type ( 1.5.1 on page 9), and that C-strings allocated in
constants segment are guaranteed to be immutable, has an interesting consequence: the compiler may
use a specific part of the string.
Let’s try this example:
#include <stdio.h>
int f1()
{
printf ("world\n");
}
int f2()
{
printf ("hello world\n");
}
int main()
{
f1();
f2();
}
Common C/C++-compilers (including MSVC) allocate two strings, but let’s see what GCC 4.8.1 does:
f2 proc near
Indeed: when we print the “hello world” string these two words are positioned in memory adjacently and
puts() called from f2() function is not aware that this string is divided. In fact, it’s not divided; it’s
divided only “virtually”, in this listing.
When puts() is called from f1() , it uses the “world” string plus a zero byte. puts() is not aware that
there is something before this string!
This clever trick is often used by at least GCC and can save some memory. This is close to string interning.
Another related example is here: 3.2 on page 468.
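The trick can be imitated by hand in C: a pointer into the middle of a longer string is itself a valid C-string, because both share the same terminating zero byte. This is only a sketch of the idea, not what the compiler literally emits; the names greeting and tail_of_greeting are hypothetical.

```c
/* One copy of the text in memory. */
static const char greeting[] = "hello world\n";

/* Returns a pointer to the "world\n" tail inside greeting; no second copy is made. */
const char *tail_of_greeting(void)
{
    return greeting + 6;  /* skip "hello " (6 characters) */
}
```

Passing tail_of_greeting() to a printing function outputs only “world”, while passing greeting outputs the whole line.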
1.5.4 ARM
The armcc compiler produces assembly listings in Intel-syntax, but it has high-level ARM-processor related
macros 28 . It is more important for us to see the instructions “as is”, so let’s see the compiled result in
IDA.
In the example, we can easily see each instruction has a size of 4 bytes. Indeed, we compiled our code
for ARM mode, not for Thumb.
The very first instruction, STMFD SP!, {R4,LR}29 , works as an x86 PUSH instruction, writing the values
of two registers ( R4 and LR) into the stack.
Indeed, the output listing from the armcc compiler, for the sake of simplification, actually shows the
PUSH {r4,lr} instruction. But that is not quite precise. The PUSH instruction is only available in Thumb
mode. So, to make things less confusing, we’re doing this in IDA.
This instruction first decrements the SP31 so it points to the place in the stack that is free for new entries,
then it saves the values of the R4 and LR registers at the address stored in the modified SP.
This instruction (like the PUSH instruction in Thumb mode) is able to save several register values at once
which can be very useful. By the way, this has no equivalent in x86. It can also be noted that the STMFD
instruction is a generalization of the PUSH instruction (extending its features), since it can work with any
register, not just with SP. In other words, STMFD may be used for storing a set of registers at the specified
memory address.
27 It is indeed so: Apple Xcode 4.6.3 uses open-source GCC as front-end compiler and LLVM code generator
28 e.g. ARM mode lacks PUSH / POP instructions
29 STMFD30
31 stack pointer. SP/ESP/RSP in x86/x64. SP in ARM.
The ADR R0, aHelloWorld instruction adds or subtracts the value in the PC32 register to the offset where
the hello, world string is located. How is the PC register used here, one might ask? This is called
“position-independent code”33 .
Such code can be executed at a non-fixed address in memory. In other words, this is PC-relative addressing.
The ADR instruction takes into account the difference between the address of this instruction and the
address where the string is located. This difference (offset) is always to be the same, no matter at what
address our code is loaded by the OS. That’s why all we need is to add the address of the current instruction
(from PC) in order to get the absolute memory address of our C-string.
The BL __2printf34 instruction calls the printf() function. Here’s how this instruction works:
• store the address following the BL instruction ( 0xC ) into the LR;
• then pass the control to printf() by writing its address into the PC register.
When printf() finishes its execution it must have information about where it needs to return the control
to. That’s why each function passes control to the address stored in the LR register.
That is a difference between “pure” RISC-processors like ARM and CISC35 -processors like x86, where the
return address is usually stored on the stack. Read more about this in next section ( 1.7 on page 30).
By the way, an absolute 32-bit address or offset cannot be encoded in the 32-bit BL instruction because
it only has space for 24 bits. As we may recall, all ARM-mode instructions have a size of 4 bytes (32 bits).
Hence, they can only be located on 4-byte boundary addresses. This implies that the last 2 bits of the
instruction address (which are always zero bits) may be omitted. In summary, we have 26 bits for offset
encoding. This is enough to encode $current\_PC \pm \approx 32M$.
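The arithmetic behind this range can be written down as a small C function. It is a sketch under the assumptions stated in the text: a signed offset field plus a number of low address bits that are always zero and therefore not stored. The name branch_reach is hypothetical.

```c
#include <stdint.h>

/* Number of bytes reachable in each direction by a PC-relative branch whose
   signed offset field has field_bits bits and whose target addresses always
   have implied_zero_bits low bits equal to zero. */
uint32_t branch_reach(unsigned field_bits, unsigned implied_zero_bits)
{
    unsigned total_bits = field_bits + implied_zero_bits;
    /* One bit is the sign, so total_bits - 1 bits give the magnitude. */
    return (uint32_t)1 << (total_bits - 1);
}
```

For the ARM-mode BL, branch_reach(24, 2) gives 33554432 bytes, i.e. 32 MiB in each direction.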
Next, the MOV R0, #0 instruction36 just writes 0 into the R0 register. That’s because our C-function
returns 0 and the return value is to be placed in the R0 register.
The last instruction is LDMFD SP!, {R4,PC}37 . It loads values from the stack (or any other memory place) in
order to save them into R4 and PC, and increments the stack pointer SP. It works like POP here.
N.B. The very first instruction STMFD saved the R4 and LR registers pair on the stack, but R4 and PC
are restored during the LDMFD execution.
As we already know, the address of the place where each function must return control to is usually saved
in the LR register. The very first instruction saves its value in the stack because the same register will be
used by our main() function when calling printf() . In the function’s end, this value can be written
directly to the PC register, thus passing control to where our function has been called.
Since main() is usually the primary function in C/C++, the control will be returned to the OS loader or
to a point in a CRT, or something like that.
All that allows omitting the BX LR instruction at the end of the function.
DCB is an assembly language directive defining an array of bytes or ASCII strings, akin to the DB directive
in the x86-assembly language.
.text:00000002 C0 A0 ADR R0, aHelloWorld ; "hello, world"
.text:00000004 06 F0 2E F9 BL __2printf
.text:00000008 00 20 MOVS R0, #0
.text:0000000A 10 BD POP {R4,PC}
We can easily spot the 2-byte (16-bit) opcodes. This is, as was already noted, Thumb. The BL instruction,
however, consists of two 16-bit instructions. This is because it is impossible to load an offset for the
printf() function while using the small space in one 16-bit opcode. Therefore, the first 16-bit instruction
loads the higher 11 bits of the offset and the second instruction loads the lower 11 bits of the offset.
As was noted, all instructions in Thumb mode have a size of 2 bytes (or 16 bits). This implies it is impossible
for a Thumb-instruction to be at an odd address whatsoever. Given the above, the last address bit may
be omitted while encoding instructions.
In summary, the BL Thumb-instruction can encode an address in $current\_PC \pm \approx 4M$.
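To see how the two halves combine, here is a sketch that decodes the offset of a Thumb BL pair. It assumes the classic Thumb encoding, in which each 16-bit half carries an 11-bit field: the first is sign-extended and shifted left by 12, the second is shifted left by 1. The function name thumb_bl_offset is hypothetical, and the halfword values 0xF006 and 0xF92E are the listing bytes 06 F0 2E F9 read as little-endian halfwords.

```c
#include <stdint.h>

/* Decodes the signed byte offset stored in a classic Thumb BL instruction pair. */
int32_t thumb_bl_offset(uint16_t first, uint16_t second)
{
    int32_t high = first & 0x7FF;   /* 11-bit field of the first halfword */
    int32_t low = second & 0x7FF;   /* 11-bit field of the second halfword */

    if (high & 0x400)
        high -= 0x800;              /* sign-extend the upper field */

    /* Upper field counts in units of 4096 bytes, lower field in units of 2. */
    return high * 4096 + low * 2;
}
```

For the pair from the listing, thumb_bl_offset(0xF006, 0xF92E) gives 0x625C, which the CPU adds to the PC value to get the address of the called function.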
As for the other instructions in the function: PUSH and POP work here just like the described STMFD / LDMFD
only the SP register is not mentioned explicitly here. ADR works just like in the previous example. MOVS
writes 0 into the R0 register in order to return zero.
Xcode 4.6.3 without optimization turned on produces a lot of redundant code so we’ll study optimized
output, where the instruction count is as small as possible, setting the compiler switch -O3 .
The MOV instruction just writes the number 0x1686 into the R0 register. This is the offset pointing to
the “Hello world!” string.
The R7 register (as it is standardized in [iOS ABI Function Call Guide, (2010)]39 ) is a frame pointer. More
on that below.
The MOVT R0, #0 (MOVe Top) instruction writes 0 into the higher 16 bits of the register. The issue here is
that the generic MOV instruction in ARM mode can load an immediate only into the lower 16 bits of the register.
Keep in mind that all instruction opcodes in ARM mode are limited in size to 32 bits, so a full 32-bit
immediate cannot fit into one of them. Of course, this limitation does not apply to moving data between
registers. That’s why an additional instruction MOVT exists for
writing into the higher bits (from 16 to 31 inclusive). Its usage here, however, is redundant because
the MOV R0, #0x1686 instruction above cleared the higher part of the register. This is supposedly a
shortcoming of the compiler.
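The effect of the two instructions on a register can be modeled in a few lines of C. This is only a sketch of the register arithmetic; the helper names movw and movt are hypothetical and simply mirror the mnemonics.

```c
#include <stdint.h>

/* Model of a 16-bit immediate move: the lower half is written,
   the upper half is cleared. */
uint32_t movw(uint16_t imm16)
{
    return imm16;
}

/* Model of MOVT: the upper half is replaced, the lower half is kept. */
uint32_t movt(uint32_t reg, uint16_t imm16)
{
    return (reg & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}
```

In this model movt(movw(0x1686), 0) returns 0x1686 again, which is why the MOVT in the listing changes nothing, while movt(movw(0x5678), 0x1234) builds the full 32-bit value 0x12345678.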
The ADD R0, PC, R0 instruction adds the value in the PC to the value in the R0 , to calculate the abso-
lute address of the “Hello world!” string. As we already know, it is “position-independent code” so this
correction is essential here.
The BL instruction calls the puts() function instead of printf() .
39 Also available as http://go.yurichev.com/17276
GCC replaced the first printf() call with puts() . Indeed: printf() with a sole argument is almost
analogous to puts() .
Almost, because the two functions produce the same result only if the string does not contain
printf format identifiers starting with %. If it does, the effect of these two functions would be different40.
Why did the compiler replace the printf() with puts() ? Presumably because puts() is faster41,
since it just passes characters to stdout without comparing every one of them with the % symbol.
Next, we see the familiar MOV R0, #0 instruction intended to set the R0 register to 0.
...
The BL and BLX instructions in Thumb mode, as we recall, are encoded as a pair of 16-bit instructions. In
Thumb-2 these surrogate opcodes are extended in such a way so that new instructions may be encoded
here as 32-bit instructions.
That is obvious considering that the opcodes of the Thumb-2 instructions always begin with 0xFx or
0xEx .
But in the IDA listing the opcode bytes are swapped, because ARM instructions are stored here in
little-endian order: for Thumb and Thumb-2 modes the last byte of each 16-bit unit comes first, followed
by the first one; for instructions in ARM mode the fourth byte comes first, then the third, then the
second and finally the first.
So that is how bytes are located in IDA listings:
• for ARM and ARM64 modes: 4-3-2-1;
• for Thumb mode: 2-1;
• for 16-bit instructions pair in Thumb-2 mode: 2-1-4-3.
So as we can see, the MOVW , MOVT.W and BLX instructions begin with 0xFx .
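To read an opcode value from such a listing, the bytes have to be put back into instruction order. A minimal C sketch of the three cases follows; the function names are hypothetical, and the sample bytes C0 A0 and 06 F0 2E F9 are taken from the Thumb listing earlier in this section.

```c
#include <stdint.h>

/* Thumb: two bytes as listed (2-1 order) -> 16-bit instruction value. */
uint16_t thumb_halfword(const uint8_t b[2])
{
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* ARM: four bytes as listed (4-3-2-1 order) -> 32-bit instruction value. */
uint32_t arm_word(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Pair of 16-bit units: bytes listed as 2-1-4-3,
   the first unit ends up in the upper 16 bits. */
uint32_t thumb2_pair(const uint8_t b[4])
{
    return ((uint32_t)thumb_halfword(b) << 16) | thumb_halfword(b + 2);
}
```

With the bytes 06 F0 2E F9 the pair becomes 0xF006F92E, so the leading 0xF nibble becomes visible only after the swap.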
One of the Thumb-2 instructions is MOVW R0, #0x13D8 : it stores a 16-bit value into the lower part of
the R0 register, clearing the higher bits.
Also, MOVT.W R0, #0 works just like MOVT from the previous example, only it is encoded in Thumb-2.
Among the other differences, the BLX instruction is used in this case instead of the BL .
The difference is that, besides saving the RA42 in the LR register and passing control to the puts()
function, the processor is also switching from Thumb/Thumb-2 mode to ARM mode (or back).
This instruction is placed here since the instruction to which control is passed looks like (it is encoded in
ARM mode):
40 It should also be noted that puts() does not require a ‘\n’ new line symbol at the end of a string, so we do not see it here.
41 ciselant.de/projects/gcc_printf/gcc_printf.html
42 Return Address
This is essentially a jump to the place where the address of puts() is written in the imports’ section.
So, the observant reader may ask: why not call puts() right at the point in the code where it is needed?
Because it is not very space-efficient.
Almost any program uses external dynamic libraries (like DLL in Windows, .so in *NIX or .dylib in Mac
OS X). The dynamic libraries contain frequently used library functions, including the standard C-function
puts() .
In an executable binary file (Windows PE .exe, ELF or Mach-O) an import section is present. This is a list
of symbols (functions or global variables) imported from external modules along with the names of the
modules themselves.
The OS loader loads all modules it needs and, while enumerating import symbols in the primary module,
determines the correct addresses of each symbol.
In our case, __imp__puts is a 32-bit variable used by the OS loader to store the correct address of the
function in an external library. Then the LDR instruction just reads the 32-bit value from this variable and
writes it into the PC register, passing control to it.
So, in order to reduce the time the OS loader needs for completing this procedure, it is a good idea to write
the address of each symbol only once, to a dedicated place.
Besides, as we have already figured out, it is impossible to load a 32-bit value into a register while using
only one instruction without a memory access.
Therefore, the optimal solution is to allocate a separate function working in ARM mode with the sole goal of
passing control to the dynamic library and then to jump to this short one-instruction function (the so-called
thunk function) from the Thumb-code.
By the way, in the previous example (compiled for ARM mode) the control is passed by the BL to the
same thunk function. The processor mode, however, is not being switched (hence the absence of an “X”
in the instruction mnemonic).
Thunk-functions are hard to understand, apparently, because of a misnomer. The simplest way is to
understand them as adaptors or convertors from one type of jack to another. For example, an adaptor allowing the
insertion of a British power plug into an American wall socket, or vice-versa. Thunk functions are also
sometimes called wrappers.
Here are a couple more descriptions of these functions:
versions. So there are short C functions callable from C/C++ environment, which are, in turn, call FORTRAN
functions, and do almost anything else:
double Blas_Dot_Prod(const LaVectorDouble &dx, const LaVectorDouble &dy)
{
assert(dx.size()==dy.size());
integer n = dx.size();
integer incx = dx.inc(), incy = dy.inc();
ARM64
GCC
There are no Thumb and Thumb-2 modes in ARM64, only ARM, so there are 32-bit instructions only. The
register count is doubled: .2.4 on page 1031. 64-bit registers have the X- prefix, while their 32-bit
parts have the W- prefix.
The STP instruction (Store Pair) saves two registers in the stack simultaneously: X29 and X30 .
Of course, this instruction is able to save this pair at an arbitrary place in memory, but the SP register is
specified here, so the pair is saved in the stack.
ARM64 registers are 64-bit ones, each has a size of 8 bytes, so one needs 16 bytes for saving two registers.
The exclamation mark (“!”) after the operand means that 16 is to be subtracted from SP first, and only
then are values from register pair to be written into the stack. This is also called pre-index. About the
difference between post-index and pre-index read here: 1.32.2 on page 441.
Hence, in terms of the more familiar x86, the first instruction is just an analogue to a pair of PUSH X29
and PUSH X30 . X29 is used as FP43 in ARM64, and X30 as LR, so that’s why they are saved in the
function prologue and restored in the function epilogue.
The second instruction copies SP in X29 (or FP). This is done to set up the function stack frame.
ADRP and ADD instructions are used to fill the address of the string “Hello!” into the X0 register, because
the first function argument is passed in this register. There are no instructions, whatsoever, in ARM that
can store a large number into a register (because the instruction length is limited to 4 bytes, read more
about it here: 1.32.3 on page 442). So several instructions must be utilized. The first instruction ( ADRP )
writes the address of the 4KiB page, where the string is located, into X0 , and the second one ( ADD ) just
adds the remainder to the address. More about that in: 1.32.4 on page 444.
43 Frame Pointer
0x400000 + 0x648 = 0x400648 , and we see our “Hello!” C-string in the .rodata data segment at this
address.
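The split of an address into a 4KiB page and a remainder can be written out in C. This is only a sketch of the arithmetic: the real ADRP computes the page relative to the PC, while the hypothetical helpers below work on the final absolute address.

```c
#include <stdint.h>

/* Address of the 4KiB page containing addr (what ends up in X0 after ADRP). */
uint64_t page_of(uint64_t addr)
{
    return addr & ~(uint64_t)0xFFF;
}

/* Remainder inside the page (the immediate added by ADD). */
uint64_t low12_of(uint64_t addr)
{
    return addr & 0xFFF;
}
```

For the address from the text, page_of(0x400648) is 0x400000 and low12_of(0x400648) is 0x648, and adding the two parts gives 0x400648 back.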
puts() is called afterwards using the BL instruction. This was already discussed: 1.5.4 on page 21.
The function result is returned via X0 and main() returns 0, so that’s how the return result is prepared.
But why use the 32-bit part?
Because the int data type in ARM64, just like in x86-64, is still 32-bit, for better compatibility.
So if a function returns a 32-bit int, only the lower 32 bits of X0 register have to be filled.
In order to verify this, let’s change this example slightly and recompile it. Now main() returns a 64-bit
value:
#include <stdio.h>
#include <stdint.h>

uint64_t main()
{
printf ("Hello!\n");
return 0;
}
The result is the same, but this is how the MOV at that line looks now:
LDP (Load Pair) then restores the X29 and X30 registers.
There is no exclamation mark after the instruction: this implies that the values are first loaded from the
stack, and only then is SP increased by 16. This is called post-index.
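The difference between the two addressing forms is only the order of the two steps, which a small C model can make explicit. The struct and function names below are hypothetical; the model stack is a plain byte array and SP is an index into it.

```c
#include <stdint.h>
#include <string.h>

/* Tiny model: 64 bytes of stack memory, sp is a byte index into mem. */
struct machine {
    uint8_t  mem[64];
    uint64_t sp;
};

/* Pre-index store of a pair: SP is adjusted first, then both values are stored. */
void stp_pre(struct machine *m, uint64_t a, uint64_t b, int64_t imm)
{
    m->sp += (uint64_t)imm;
    memcpy(&m->mem[m->sp], &a, 8);
    memcpy(&m->mem[m->sp + 8], &b, 8);
}

/* Post-index load of a pair: both values are loaded first, then SP is adjusted. */
void ldp_post(struct machine *m, uint64_t *a, uint64_t *b, int64_t imm)
{
    memcpy(a, &m->mem[m->sp], 8);
    memcpy(b, &m->mem[m->sp + 8], 8);
    m->sp += (uint64_t)imm;
}
```

Starting with sp at 64, stp_pre with an offset of -16 leaves sp at 48 with both values stored there, and a following ldp_post with +16 reads them back and returns sp to 64, which mirrors the prologue and epilogue described above.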
A new instruction appeared in ARM64: RET . It works just as BX LR , only a special hint bit is added,
informing the CPU that this is a return from a function, not just another jump instruction, so it can execute
it more optimally.
Due to the simplicity of the function, optimizing GCC generates the very same code.
1.5.5 MIPS
One important MIPS concept is the “global pointer”. As we may already know, each MIPS instruction has
a size of 32 bits, so it’s impossible to embed a 32-bit address into one instruction: a pair has to be used
for this (like GCC did in our example for the text string address loading). It’s possible, however, to load
data from the address in the range of register − 32768...register + 32767 using one single instruction (because
16 bits of signed offset could be encoded in a single instruction). So we can allocate some register for
this purpose and also allocate a 64KiB area of most used data. This allocated register is called a “global
pointer” and it points to the middle of the 64KiB area. This area usually contains global variables and
addresses of imported functions like printf() , because the GCC developers decided that getting the
address of some function must be as fast as a single instruction execution instead of two. In an ELF file
this 64KiB area is located partly in sections .sbss (“small BSS44 ”) for uninitialized data and .sdata (“small
data”) for initialized data. This implies that the programmer may choose what data he/she wants to
be accessed fast and place it into .sdata/.sbss. Some old-school programmers may recall the MS-DOS
memory model 10.6 on page 993 or the MS-DOS memory managers like XMS/EMS where all memory was
divided in 64KiB blocks.
44 Block Started by Symbol
This concept is not unique to MIPS. At least PowerPC uses this technique as well.
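The size of the window follows directly from the signed 16-bit offset. A minimal C check, with a hypothetical function name and made-up addresses, could look like this:

```c
#include <stdint.h>
#include <stdbool.h>

/* True if addr can be addressed as gp plus a signed 16-bit offset,
   i.e. with a single load or store instruction. */
bool gp_reachable(uint32_t gp, uint32_t addr)
{
    int64_t off = (int64_t)addr - (int64_t)gp;
    return off >= -32768 && off <= 32767;
}
```

If gp were 0x10008000 (an arbitrary value chosen for this sketch), everything from 0x10000000 to 0x1000FFFF would be reachable and 0x10010000 would not, which is exactly a 64KiB area with the pointer in its middle.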
Optimizing GCC
Let’s consider the following example, which illustrates the “global pointer” concept.
As we see, the $GP register is set in the function prologue to point to the middle of this area. The RA
register is also saved in the local stack. puts() is also used here instead of printf() . The address of
the puts() function is loaded into $25 using the LW instruction (“Load Word”). Then the address of the
text string is loaded to $4 using the LUI (“Load Upper Immediate”) and ADDIU (“Add Immediate Unsigned
Word”) instruction pair. LUI sets the high 16 bits of the register (hence the “upper” word in the instruction name)
and ADDIU adds the lower 16 bits of the address.
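The pair can be modeled in C. One detail matters here: ADDIU sign-extends its 16-bit immediate, so when the lower half of the address is 0x8000 or above, the upper half has to be one larger to compensate. The helper names below are hypothetical and only mimic what the %hi and %lo operators are expected to produce.

```c
#include <stdint.h>

/* Upper half, rounded so that adding the sign-extended lower half
   gives the original address back. */
uint16_t hi16(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Lower half, taken as is. */
uint16_t lo16(uint32_t addr)
{
    return (uint16_t)(addr & 0xFFFFu);
}

/* Model of the LUI + ADDIU pair. */
uint32_t lui_addiu(uint16_t hi, uint16_t lo)
{
    uint32_t reg = (uint32_t)hi << 16;                          /* LUI */
    int32_t  simm = (lo & 0x8000) ? (int32_t)lo - 0x10000 : lo; /* sign-extend */
    return reg + (uint32_t)simm;                                /* ADDIU */
}
```

For an address such as 0x00400820 the halves are simply 0x0040 and 0x0820. For 0x1234ABCD the upper half becomes 0x1235, because the lower half 0xABCD is treated as a negative number by ADDIU.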
ADDIU follows JALR (you haven’t forgotten branch delay slots yet, have you?). The register $4 is also called $A0,
which is used for passing the first function argument 45 .
JALR (“Jump and Link Register”) jumps to the address stored in the $25 register (address of puts() )
while saving a return address in RA. This is very similar to ARM. One important detail is that the
address saved in RA is not the address of the instruction immediately following JALR (because that one is in
a delay slot and is executed before the jump takes effect), but the address of the instruction after the
delay slot. Hence, PC + 8 is written to RA during the execution of JALR ; in our case, this
is the address of the LW instruction next to ADDIU .
LW (“Load Word”) at line 20 restores RA from the local stack (this instruction is actually part of the function
epilogue).
MOVE at line 22 copies the value from the $0 ($ZERO) register to $2 ($V0).
MIPS has a constant register, which always holds zero. Apparently, the MIPS developers came up with
the idea that zero is in fact the busiest constant in computer programming, so let’s just use the $0
register every time zero is needed.
45 The MIPS registers table is available in appendix .3.1 on page 1032
Another interesting fact is that MIPS lacks an instruction that transfers data between registers. In fact,
MOVE DST, SRC is ADD DST, SRC, $ZERO (DST = SRC + 0), which does the same. Apparently, the MIPS
developers wanted to have a compact opcode table. This does not mean an actual addition happens at
each MOVE instruction. Most likely, the CPU optimizes these pseudo instructions and the ALU46 is never
used.
J at line 24 jumps to the address in RA, which is effectively performing a return from the function. ADDIU
after J is in fact executed before J (remember branch delay slots?) and is part of the function epilogue.
Here is also a listing generated by IDA. Each register here has its own pseudo name:
The instruction at line 15 saves the GP value into the local stack, and this instruction is missing mysteri-
ously from the GCC output listing, maybe by a GCC error 47 . The GP value has to be saved indeed, because
each function can use its own 64KiB data window. The register containing the puts() address is called
$T9, because registers prefixed with T- are called “temporaries” and their contents may not be preserved.
Non-optimizing GCC
12 lui $28,%hi(__gnu_local_gp)
13 addiu $28,$28,%lo(__gnu_local_gp)
14 ; load the address of the text string:
15 lui $2,%hi($LC0)
16 addiu $4,$2,%lo($LC0)
17 ; load the address of puts() using the GP:
18 lw $2,%call16(puts)($28)
19 nop
20 ; call puts():
21 move $25,$2
22 jalr $25
23 nop ; branch delay slot
24
25 ; restore the GP from the local stack:
26 lw $28,16($fp)
27 ; set register $2 ($V0) to zero:
28 move $2,$0
29 ; function epilogue.
30 ; restore the SP:
31 move $sp,$fp
32 ; restore the RA:
33 lw $31,28($sp)
34 ; restore the FP:
35 lw $fp,24($sp)
36 addiu $sp,$sp,32
37 ; jump to the RA:
38 j $31
39 nop ; branch delay slot
We see here that register FP is used as a pointer to the stack frame. We also see 3 NOPs, the second
and third of which follow the branch instructions. Perhaps the GCC compiler always adds NOPs (because
of branch delay slots) after branch instructions and then, if optimization is turned on, maybe eliminates
them. So in this case they are left here.
Here is the IDA listing as well:
Listing 1.34: Non-optimizing GCC 4.4.5 (IDA)
1 .text:00000000 main:
2 .text:00000000
3 .text:00000000 var_10 = -0x10
4 .text:00000000 var_8 = -8
5 .text:00000000 var_4 = -4
6 .text:00000000
7 ; function prologue.
8 ; save the RA and FP in the stack:
9 .text:00000000 addiu $sp, -0x20
10 .text:00000004 sw $ra, 0x20+var_4($sp)
11 .text:00000008 sw $fp, 0x20+var_8($sp)
12 ; set the FP (stack frame pointer):
13 .text:0000000C move $fp, $sp
14 ; set the GP:
15 .text:00000010 la $gp, __gnu_local_gp
16 .text:00000018 sw $gp, 0x20+var_10($sp)
17 ; load the address of the text string:
18 .text:0000001C lui $v0, (aHelloWorld >> 16) # "Hello, world!"
19 .text:00000020 addiu $a0, $v0, (aHelloWorld & 0xFFFF) # "Hello, world!"
20 ; load the address of puts() using the GP:
21 .text:00000024 lw $v0, (puts & 0xFFFF)($gp)
22 .text:00000028 or $at, $zero ; NOP
23 ; call puts():
24 .text:0000002C move $t9, $v0
25 .text:00000030 jalr $t9
26 .text:00000034 or $at, $zero ; NOP
27 ; restore the GP from local stack:
28 .text:00000038 lw $gp, 0x20+var_10($fp)
29 ; set register $2 ($V0) to zero:
30 .text:0000003C move $v0, $zero
31 ; function epilogue.
32 ; restore the SP:
33 .text:00000040 move $sp, $fp
34 ; restore the RA:
35 .text:00000044 lw $ra, 0x20+var_4($sp)
36 ; restore the FP:
37 .text:00000048 lw $fp, 0x20+var_8($sp)
38 .text:0000004C addiu $sp, 0x20
39 ; jump to the RA:
40 .text:00000050 jr $ra
41 .text:00000054 or $at, $zero ; NOP
Interestingly, IDA recognized the LUI / ADDIU instructions pair and coalesced them into one LA (“Load
Address”) pseudo instruction at line 15. We may also see that this pseudo instruction has a size of 8 bytes!
This is a pseudo instruction (or macro) because it’s not a real MIPS instruction, but rather a handy name
for an instruction pair.
Another thing is that IDA doesn’t recognize NOP instructions, so here they are at lines 22, 26 and 41,
shown as OR $AT, $ZERO . Essentially, this instruction applies the OR operation to the contents of the $AT register
with zero, which is, of course, an idle instruction. MIPS, like many other ISAs, doesn’t have a separate
NOP instruction.
The address of the text string is passed in the register. Why set up a local stack anyway? The reason
for this lies in the fact that the values of registers RA and GP have to be saved somewhere (because
printf() is called), and the local stack is used for this purpose. If this were a leaf function, it would have
been possible to get rid of the function prologue and epilogue, for example: 1.4.3 on page 8.
1.6. FUNCTION PROLOGUE AND EPILOGUE
0x400820: "hello, world"
(gdb)
1.5.6 Conclusion
The main difference between x86/ARM and x64/ARM64 code is that the pointer to the string is now 64 bits
in length. Indeed, modern CPUs are now 64-bit due to both the reduced cost of memory and the greater
demand for it by modern applications. We can add much more memory to our computers than 32-bit
pointers are able to address. As such, all pointers are now 64-bit.
1.5.7 Exercises
• http://challenges.re/48
• http://challenges.re/49
A function prologue is a sequence of instructions at the start of a function. It often looks something like
the following code fragment:
push ebp
mov ebp, esp
sub esp, X
What these instructions do: save the value in the EBP register, set the value of the EBP register to the
value of the ESP and then allocate space on the stack for local variables.
The value in the EBP stays the same over the period of the function execution and is to be used for local
variables and arguments access. For the same purpose one can use ESP , but since it changes over time
this approach is not too convenient.
The function epilogue frees the allocated space in the stack, returns the value in the EBP register back
to its initial state and returns the control flow to the caller:
mov esp, ebp
pop ebp
ret 0
Function prologues and epilogues are usually detected in disassemblers for function delimitation.
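What the prologue and epilogue do to ESP and EBP can be followed step by step in a small C model. The struct and function names are hypothetical, the stack is a plain array, and the final RET is left out; this is a sketch of the register bookkeeping only.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Tiny model: esp and ebp are byte offsets into the model stack. */
struct cpu {
    uint32_t esp, ebp;
    uint32_t stack[STACK_WORDS];
};

void prologue(struct cpu *c, uint32_t locals_size)
{
    c->esp -= 4;                     /* push ebp: make room ...            */
    c->stack[c->esp / 4] = c->ebp;   /*           ... and save the old EBP */
    c->ebp = c->esp;                 /* mov ebp, esp                       */
    c->esp -= locals_size;           /* sub esp, X                         */
}

void epilogue(struct cpu *c)
{
    c->esp = c->ebp;                 /* mov esp, ebp: drop the locals      */
    c->ebp = c->stack[c->esp / 4];   /* pop ebp: reload the saved EBP ...  */
    c->esp += 4;                     /*          ... and release its slot  */
}
```

Whatever happens to ESP between the two calls, the epilogue brings both registers back to the values they had on entry, because EBP was left untouched in between.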
1.6.1 Recursion
1.7 Stack
The stack is one of the most fundamental data structures in computer science48, AKA49 LIFO50.
Technically, it is just a block of memory in process memory along with the ESP or RSP register in x86 or
x64, or the SP register in ARM, as a pointer within that block.
48 wikipedia.org/wiki/Call_stack
49 Also Known As
50 Last In First Out
The most frequently used stack access instructions are PUSH and POP (in both x86 and ARM Thumb
mode). PUSH subtracts 4 from ESP / RSP /SP in 32-bit mode (or 8 in 64-bit mode) and then writes the
contents of its sole operand to the memory address pointed to by ESP / RSP /SP.
POP is the reverse operation: retrieve the data from the memory location that SP points to, load it into
the instruction operand (often a register) and then add 4 (or 8) to the stack pointer.
After stack allocation, the stack pointer points at the bottom of the stack. PUSH decreases the stack
pointer and POP increases it. The bottom of the stack is actually at the beginning of the memory allocated
for the stack block. It seems strange, but that’s the way it is.
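The two operations on a descending stack can be sketched in C for the 32-bit case. The names stack32, push32 and pop32 are hypothetical, and the stack pointer is modeled as a byte offset that starts at the end of the block.

```c
#include <stdint.h>

/* Tiny model of a 32-bit descending stack: sp is a byte offset into mem
   and starts at the size of the block, i.e. at its high end. */
struct stack32 {
    uint32_t sp;
    uint32_t mem[16];
};

/* PUSH: decrease the stack pointer first, then store the value. */
void push32(struct stack32 *s, uint32_t value)
{
    s->sp -= 4;
    s->mem[s->sp / 4] = value;
}

/* POP: load the value first, then increase the stack pointer. */
uint32_t pop32(struct stack32 *s)
{
    uint32_t value = s->mem[s->sp / 4];
    s->sp += 4;
    return value;
}
```

Pushing 1 and then 2 moves sp from 64 down to 56, and the two following pops return 2 and then 1, which is the LIFO order mentioned above.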
ARM supports both descending and ascending stacks.
For example the STMFD/LDMFD, STMED51 /LDMED52 instructions are intended to deal with a descending
stack (grows downwards, starting with a high address and progressing to a lower one). The STMFA53 /LDMFA54 ,
STMEA55 /LDMEA56 instructions are intended to deal with an ascending stack (grows upwards, starting from
a low address and progressing to a higher one).
Intuitively, we might think that the stack grows upwards, i.e. towards higher addresses, like any other
data structure.
The reason that the stack grows backward is probably historical. When the computers were big and
occupied a whole room, it was easy to divide memory into two parts, one for the heap and one for the
stack. Of course, it was unknown how big the heap and the stack would be during program execution, so
this solution was the simplest possible.
[Figure: a memory block with the heap at one end and the stack at the other]
In [D. M. Ritchie and K. Thompson, The UNIX Time Sharing System, (1974)]57 we can read:
The user-core part of an image is divided into three logical segments. The program text
segment begins at location 0 in the virtual address space. During execution, this segment
is write-protected and a single copy of it is shared among all processes executing the same
program. At the first 8K byte boundary above the program text segment in the virtual ad-
dress space begins a nonshared, writable data segment, the size of which may be extended
by a system call. Starting at the highest address in the virtual address space is a stack
segment, which automatically grows downward as the hardware’s stack pointer fluctuates.
This reminds us how some students write two lecture notes using only one notebook: notes for the first
lecture are written as usual, and notes for the second one are written from the end of notebook, by flipping
it. Notes may meet each other somewhere in between, in case of lack of free space.
x86
When calling another function with a CALL instruction, the address of the point exactly after the CALL
instruction is saved to the stack and then an unconditional jump to the address in the CALL operand is
executed.
The CALL instruction is equivalent to a
PUSH address_after_call / JMP operand instruction pair.
RET fetches a value from the stack and jumps to it —that is equivalent to a POP tmp / JMP tmp instruc-
tion pair.
Overflowing the stack is straightforward. Just run eternal recursion:
void f()
{
f();
};
ss.cpp
c:\tmp6\ss.cpp(4) : warning C4717: 'f' : recursive on all control paths, function will cause runtime stack overflow
…Also if we turn on the compiler optimization ( /Ox option) the optimized code will not overflow the
stack and will work correctly58 instead:
?f@@YAXXZ PROC ; f
; File c:\tmp6\ss.cpp
; Line 2
$LL3@f:
; Line 3
jmp SHORT $LL3@f
?f@@YAXXZ ENDP ; f
GCC 4.4.1 generates similar code in both cases without, however, issuing any warning about the problem.
ARM
ARM programs also use the stack for saving return addresses, but differently. As mentioned in “Hello,
world!” ( 1.5.4 on page 19), the RA is saved to the LR (link register). If one needs, however, to call another
function and use the LR register one more time, its value has to be saved. Usually it is saved in the
function prologue.
Often, we see instructions like PUSH {R4-R7,LR} along with POP {R4-R7,PC} in the epilogue;
thus the register values to be used in the function are saved in the stack, including LR.
58 irony here
obtuse than the other. They are two in number, but, as Mr. Audubon
states, occasionally the nest contains three. Two broods are raised in
a season.
In the vicinity of Charleston these birds were observed to remain all
the year, though the greater proportion retired south or to the sea-
islands.
In the Florida Keys Mr. Audubon met with them among the islands
resorted to by the Zenaida Doves, and also on Sandy Island, near
Cape Sable. In the latter place they were so gentle that he
approached to within two yards of them. Their nest was on the top
of a cactus, not more than two feet from the ground.
Their food, in a wild state, consists of grass-seeds and various small
berries, with which they swallow a large proportion of gravel to
assist digestion. They are extremely fond of dusting themselves in
the sand, lying down in it in the manner of various gallinaceous
birds.
The eggs of this species are of a uniform bright white color, are
slightly more pointed at one end than at the other, and measure .85
of an inch in length by .63 in breadth.
This species was found in abundance at Cape St. Lucas by Mr.
Xantus. They were nesting from April 15 until August 29, and
evidently had two or more broods in a season. Their nests were
usually placed in low cactuses, near the ground, or in small shrubs.
Their nests, eggs, and general habits, so far as we can gather them
from the meagre notes of Mr. Xantus, are in no wise different from
those of the more eastern birds.
PLATE LVIII.
1. Oreopeleia martinica. ♂ Jamaica.
2. Zenaidura carolinensis. ♂ N. C., 55569.
3. Zenaida amabilis. ♂ Jamaica, 24406.
4. Melopeleia leucoptera. ♂ Mazatlan, 34009.
5. Starnoæna cyanocephalus. ♂ Jamaica, ? 12541.
6. Chamæpelia passerina. ♂ 28281.
7. Scardafella inca. ♂ Texas, 45465.
Habits. The Key West Pigeon is found within the fauna of the United
States only in the extreme southern portion of Florida, and, so far as
known, only on the island of Key West, where Mr. Audubon met with
them, and enjoyed a limited opportunity of observing their habits.
He describes the flight as low, swift, and protracted, as he saw them
passing from Cuba to Key West. They moved in loose flocks of from
five or six to a dozen, and so very low as to almost seem to touch
the surface. They were fond of going out early in the morning from
their thickets to cleanse their plumage in the shelly sand, but on the
least approach of danger would fly back to the thickest part of the
woods, throw themselves on the ground, and run off with great
rapidity. Their movements of the tail and neck are similar to those of
the Carolina Dove. Their coo is said to be neither so soft nor so
prolonged as that of the common Dove, and may be represented by
the syllable whoe-whoe-oh-oh-oh. When suddenly approached, they
utter a guttural gasping sound. They are said to alight on the lower
branches of shrubby trees, and to delight in the neighborhood of
shady ponds, always inhabiting by preference the darkest solitudes.
Whatever may have been their abundance on Key West, in Mr.
Audubon’s time, it is certain that they are very rare there now, as I
am not aware of their having been taken of late years by any of the
numerous collectors who have visited South Florida since Mr.
Audubon’s time.
Oreopeleia martinica.
Starnœnas, Bonaparte, Geog. & Comp. List, 1838. (Type, Columba cyanocephala,
L.)
Gen. Char. Bill short; culmen about one third the rest of head, measured from the
frontal feathers. Legs very stout and large; tarsus bare on the entire tibial joint,
and covered with hexagonal scales, largest anteriorly, longer than the middle toe
and claw. Inner lateral claw the larger, reaching the base of the middle claw; all
the claws short, thick, and blunt. Hind toe and claw short; half the middle. Wings
short, broad, and concave; much rounded. Tail short, broad, nearly even, but
slightly vaulted.
Columba cyanocephala, Linn. Syst. Nat. I, 1766, 282.—Gmelin, Syst. I, 1788, 778.—
Wagler, Syst. Avium, 1827, Columba, No. 112.—Aud. Orn. Biog. II, 1834, 441;
V. 1839, 557, pl. clxxii. Starnœnas cyanocephala, Bonap. List, 1838.—Ib. Consp.
II, 1854, 69.—Aud. Syn. 1839, 193.—Ib. Birds Amer. V, 1842, 23, pl. cclxxxiv.—
Gundlach, Cab. Journ. IV, 1856, 108.—Baird, Birds N. Am. 1858, 608.—Cab. J.
IV, 108 (Cuba).—Gundl. Repert. Cub. I, 1866, 299.—Reichenb. Handb. Taub. 30,
tab. 257, f. 1431; 266, f. 2879–81. Starnœnas cyanocephala, Reichenbach,
Systema Av. 1851, p. xxv, pl. xxiii.—Ib. Icones Av. tab. 260 and 266. Geophilus?
cyanocephala, Selby, Pigeons, Jard. Nat. Lib. V, 216, pl. xxvii. Columba
(Lophyrus) cyanocephala, Nuttall, Man. I, (2d. ed.,) 1840, 769. Columba
tetraoides, (Scopoli,) Gmelin, I, 772. Blue-headed Turtle, Latham, Syn. II, II, 651.
Sp. Char. Bill blue, the fleshy part at the base carmine. Iris brown, scales of feet
carmine, the interspaces white. Above and on sides glossy dark chocolate-
olivaceous; beneath brownish-red, lighter centrally. Chin and throat black, with a
narrow border of white below. A white line begins in the chin, and passes under
the eye to the occiput. Sides of head above this and forehead black; crown blue.
Length, 10.70; wing, 5.40; tail, 4.35.
Hab. West India Islands; according to Audubon found occasionally at Key West,
Florida, and other southern keys.
2827 ♂ ½ ½
Starnœnas cyanocephala.
The axillars and under surface of the wings are like the belly. The
crissum is most like the back. The outer tail-feathers have a bluish
tinge above.
The hind toe in this species is not strictly in the same plane with the
others, but placed a little above their point of insertion.
Habits. This handsome Pigeon belongs to the fauna of the West India
Islands, and is only an occasional visitant of Key West and other
southern keys of Florida. They are a common species in Cuba, from
which island a few are stated by Mr. Audubon to migrate each year
to certain of the keys of Florida, where, however, they are rarely
seen on account of their living only in the most tangled thickets. Mr.
Audubon saw a pair on the western side of Key West. They were
near the water picking gravel, but they would not suffer a near
approach. He saw a pair, also, that had been taken, when young, on
“Mule Keys.” These fed well on cracked corn and rice, but he was
unable to obtain any further information in respect to them.
Though abundant in Cuba this species does not appear to have been
found in Jamaica, except as an imported bird from the former island,
contrary to the assertions of various writers, as Temminck, Brisson,
and others. Mr. Gosse was not able to trace its presence, though its
existence among the precipitous woods on the north side of that
island he regards as quite possible.
Starnœnas cyanocephala.
Subfamily PENELOPINÆ.
This is the most extensive section of Cracidæ, embracing, according
to Sclater and Salvin, no less than thirty-nine species. The genera
indicated are as follows:—
A. A central fold of skin on the throat.
Outer quills narrow, but entire.
Throat feathered … 1. Stegnolæma.
Throat naked.
Sexes similar … 2. Penelope.
Sexes different … 3. Penelopina.
Outer quills emarginated.
Gular fold short … 4. Pipile.
Gular fold lengthened; linear … 5. Aburria.
B. No central gular fold.
Throat feathered; outer quills emarginated … 6. Chamæpetes.
Throat naked; with a central line of bristly feathers; outer quills
entire … 7. Ortalida.
Ortalida, Merrem, Av. rar. Icones et Desc. II, 1786, 40 (Gray). (Type, Phasianus
motmot, L.)
37977 ♂ ⅓ ⅓
Ortalida maccalli.
Ortalida vetula, Lawrence, Ann. N. Y. Lyc. V, 1851, 116. (Not Penelope vetula,
Wagler, Isis, 1830, 1112, and 1831, 517.)—Scl. & Salv. P. Z. S. 1870, 538.
(Considers it the same as P. vetula, Wagler). Ortalida poliocephala, Cassin,
Illust. I, ix, 1855, 267, pl. xliv. (Not Penelope poliocephala, Wagler, Isis, 1830,
1112.) Ortalida maccalli, Baird, Birds N. Am. 1858, 611.—Ib. M. Bound. II,
Birds, 22.—Dresser, Ibis, 1866, 24 (S. E. Texas, breeding).—Lawr. Ann. N. Y. IX,
209 (Yucatan).—Scl. & Salv. P. Z. S. 1870, 538 (Honduras, Vera Cruz,
Guatemala).—Reichenb. Handb. der sp. Orn. Lief, viii, 145. (Describes more
adult specimens.)
Sp. Char. Body above dark greenish-olive; beneath brownish-yellow, tinged with
olive. Head and upper part of neck plumbeous. Tail-feathers lustrous green, all
tipped with white, except the middle one. Feathers along the middle of the throat
black; outer edge of primaries tinged with gray. Eyes brown. Bill and feet lead-
colored. Length, 23.50; wing, 8.50; tail, 11.00.
Hab. Valley of the Rio Grande, and southward to Guatemala.
Birds of the family to which the Texan species belongs differ in a
very marked manner, in habits, from most Gallinaceæ, inasmuch as
they not only live almost exclusively in deep forests, but are also
remarkable for habitually frequenting trees, feeding upon their
foliage, and building their nests within their branches, more in the
manner of the smaller birds. They are all said to have loud and
discordant voices, and are generally of a black or dark plumage.
Specimens of this bird were taken at Boquillo, in New Leon, in the
spring of 1853, by Lieutenant Couch, who speaks of them as
gregarious and as seeking their food wholly or in part on trees.
According to Mr. Clark, they do not occur higher up the Rio Grande
than the vicinity of Ringgold Barracks, inhabiting the deepest
chaparrals, which they never quit. They are inactive, and for
most of the time sit about in flocks in these thickets, feeding on
leaves. The Mexican name of Chacalacca is supposed to be derived
from the noise with which at times they make the valleys ring, and
which may be well imitated in kind, but not in strength, by putting
the most stress upon the last two syllables. No sooner does one take
up the song than others chime in from all quarters, till, apparently
exhausted, the noise gradually dies off into an interlude, only to be
again renewed. These concerts take place in the morning and
evening. The birds are quite gentle, are easily tamed, and are said
to cross with the common domestic fowl.
Mr. Dresser states that the Chacalacca is very common near
Matamoras and Brownsville, and that in the autumn great numbers
are exposed for sale in the market of the latter place. The Mexicans
are said to hold it in high esteem for its fighting qualities, and often
keep it in a domesticated state and cross it with the common fowl,
making use of the hybrid for cock-fighting. Mr. Dresser was so
informed by many Mexicans, upon whose word he placed reliance,
and was an eyewitness of a fight in which one of these hybrids was
engaged. Mr. Dresser had a tame one, when at Matamoras, that
became so familiar that he could hardly keep it out of his room. This
bird would occasionally go away for a day or two, and pay a visit to
the poultry belonging to a neighbor; whenever he missed it, he had
only to go to a poultry-yard near the house, where it could generally
be found.
This species was first taken within the United States by Colonel
McCall, who obtained it in Texas, and who enjoyed and improved
unusually good opportunities to observe the habits and manners of
this bird. From his notes, quoted by Mr. Cassin, we give the
following:—
“This very gallant-looking and spirited bird I saw for the first time
within our territory in the extensive forests of chaparral which
envelop the Resaca de la Palma. Here, and for miles along the Lower
Rio Grande, it was abundant; and throughout this region the
remarkable and sonorous cry of the male bird could not fail to attract
and fix the attention of the most obtuse or listless wanderer who
might chance to approach its abode. By the Mexicans it is called
Chiac-chia-lacca, an Indian name, without doubt derived from the
peculiar cry of the bird, which strikingly resembles a repetition of
these syllables. And when I assure you that its voice, in compass, is
equal to that of the Guinea-fowl, and in harshness but little inferior,
you may form some idea of the chorus with which the forest is made
to ring at the hour of sunrise. At that hour, in the month of April, I
have observed a proud and stately fellow descend from the tree on
which he had roosted, and, mounting upon an old log or stump,
commence his clear, shrill cry. This was soon responded to in a lower
tone by the female, the latter always taking up the strain as soon as
the importunate call of her mate had ceased. Thus alternating, one
pair after another would join in the matutinal chorus, and, before
the rising sun had lighted up their close retreat, the woods would
ring with the din of a hundred voices, as the happy couples met
after the period of separation and repose. When at length all this
clatter had terminated, the parties quietly betook themselves to their
morning meal. If surprised while thus employed, they would fly into
the trees above, and, peering down with stretched necks, and heads
turned sideways to the ground, they would challenge the intruder
with a singular and oft-repeated croaking note, of which it would be
difficult to give any adequate idea with words alone.”
Colonel McCall adds that the volubility and singularity of its voice are
its most striking and remarkable traits. While on his march from
Matamoras to Tampico he had encamped, on the 30th of December,
at the spring of Encinal, whence, a short time before sunset, he rode
out in search of game. Passing through a woodland near the stream,
his ears were saluted with a strange sound that resembled
somewhat the cry of the panther (Felis onca). He was at a loss as to
what animal to ascribe it to, and, dismounting, crawled cautiously
through the thicket for some distance, until he came upon an
opening where there were some larger trees, from the lower
branches of one of which he ascertained that the sound proceeded.
There he discovered a large male bird of this species, ascending
towards the top of the tree, and uttering this hitherto unheard
sound, as he sprang from branch to branch in mounting to his roost.
In a few moments his call was answered from a distance, and soon
after he was joined by a bird of the year. Others followed, coming in
from different quarters, and there were in a little while five or six
upon the tree. One of these discovered the intruder and gave the
alarm. The singular cry of the old bird ceased, and they all began to
exhibit uneasiness and a disposition to fly, whereupon Colonel McCall
shot the old bird.
Colonel McCall also states that the eye is a remarkable feature in the
living birds of this species, being full of courage and animation,
equal, in fact, in brilliancy to that of the finest gamecock. He
frequently noticed this bird domesticated by the Mexicans at
Matamoras, Monterey, etc., and going at large about their gardens.
He was assured that in that condition it not unfrequently crossed
with the common fowl.
In the wild state the eggs are said to be from six to eight, never
exceeding the last number. They are white, without spots, and rather
smaller than a pullet’s egg. The nest is usually on the ground, at the
root of a large tree or at the side of an old log, where a hole several
inches deep is scratched in the ground; this is lined with leaves, and
the eggs are always carefully covered with the same when the
female leaves them for the purpose of feeding. If disturbed while on
her nest, she flies at the intruder with great spirit and determination.
Eggs of this species, from Matamoras, are of an oblong-oval shape,
equally pointed at either end, and measure 2.35 inches in length by
1.65 in breadth. They are of a dirty-white color with a light tint of
buff, and have a slightly roughened or granulated surface.
Family MELEAGRIDÆ.—The Turkeys.
Char. Bill moderate; the nasal fossæ bare. Head and neck without feathers, but
with scattered hairs, and more or less carunculated. An extensible fleshy process
on the forehead, but no development of the bone. Tarsus armed with spurs in the
male. Hind toe elevated. Tail nearly as long as the wing, truncate, of more than
twelve feathers.
Gen. Char. Legs with transverse scutellæ before and behind; reticulated laterally.
Tarsi with spurs. Tail rounded, rather long, usually of eighteen feathers. Forehead
with a depending fleshy cone. Head and the upper half of the neck without
feathers. Breast of male in most species with a long tuft of bristles.
Sp. Char. The naked skin of the head and neck is blue; the excrescences purplish-
red. The legs are red. The feathers of the neck and body generally are very broad,
abruptly truncate, and each one well defined and scale-like; the exposed portion
coppery-bronze, with a bright coppery reflection in some lights, in the specimens
before us chiefly on the under parts. Each feather is abruptly margined with
velvet-black, the bronze assuming a greenish or purplish shade near the line of
junction, and the bronze itself sometimes with a greenish reflection in some lights.
The black is opaque, except along the extreme tip, where there is a metallic gloss.
The feathers of the lower back and rump are black, with little or no copper gloss.
The feathers of the sides behind, and the coverts, upper and under, are of a very
dark purplish-chestnut, with purplish-metallic reflections near the end, and a
subterminal bar of black; the tips are of the opaque purplish-chestnut referred to.
The concealed portion of the coverts is dark chestnut barred rather finely with
black; the black wider than the interspaces. The tail-feathers are dark brownish-
chestnut, with numerous transverse bars of black, which, when most distinct, are
about a quarter of an inch wide and about double their interspaces; the extreme
tip for about half an inch is plain chestnut, lighter than the ground-color; and
there is a broad subterminal bar of black about two inches wide on the outer
feathers, and narrowing to about three quarters of an inch to the central ones.
The innermost pair scarcely shows this band, and the others are all much broken
and confused. In addition to the black bars on each feather, the chestnut
interspaces are sprinkled with black. The black bands are all most distinct on the
inner webs; the interspaces are considerably lighter below than above.
There are no whitish tips whatever to the tail or its coverts. The feathers on the
middle of the belly are downy, opaque, and tipped obscurely with rusty whitish.
The wing-coverts are like the back; the quills, however, are blackish-brown, with
numerous transverse bars of white, half the width of the interspaces. The exposed
surfaces of the wing, however, and most of the inner secondaries, are tinged with
brownish-rusty, the uppermost ones with a dull copper or greenish gloss.
The female differs in smaller size, less brilliant colors, absence generally of bristles
on the breast and of spur, and a much smaller fleshy process above the base of
the bill.
Male. Length, 48.00 to 50.00; extent, 60.00; wing, 21.00; tail, 18.50. Weight, 16
to 35 lbs. Female. Weight about 12 lbs.; measurements smaller in proportion.
Hab. Eastern Province of the United States, and Canada. West along the timbered
river-valleys towards the Rocky Mountains; south to the Gulf coast.