
Code Style For Compilers: 18-642 / Fall 2020

This document discusses code style best practices for compilers, including using language features like enums and typedefs to improve safety and avoid bugs, following guidelines like MISRA C and CERT, and using static and dynamic analysis tools. It also discusses strategies for handling legacy code and notes safer languages like Spark Ada which use formal methods.


Code Style for Compilers
Prof. Philip Koopman
18-642 / Fall 2020

“Programming can be fun, so can cryptography; however they should not be combined.”
– Kreitzberg and Shneiderman
© 2020 Philip Koopman 1
Coding Style: Language Use
• Anti-patterns:
  ◦ Code compiles with warnings
  ◦ Warnings are turned off or overridden
  ◦ Insufficient warning level set
  ◦ Language safety features overridden
• Make sure the compiler understands what you meant
  ◦ A warning means the compiler might not do what you think
    – Your particular language use might be “undefined”
  ◦ A warning might mean you’re doing something that’s likely a bug
    – It might be valid C code, but should be avoided
  ◦ Don’t override features designed for safe language use
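As a small illustration of a warning pointing at a likely bug, here is a minimal sketch. The function name `count_below` and the compile command in the comment are assumptions for illustration, not part of the course material.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the elements of values[] that are below limit.
 *
 * Suggested build (gcc or clang), treating warnings as errors:
 *     gcc -std=c11 -Wall -Wextra -Werror -c count_below.c
 *
 * If the loop index were declared as `int i`, the comparison `i < len`
 * would mix signed and unsigned operands, which gcc reports through
 * -Wsign-compare (part of -Wextra for C). Using size_t for the index
 * removes the mismatch instead of silencing the warning. */
size_t count_below(const int *values, size_t len, int limit)
{
    assert(values != NULL || len == 0);

    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (values[i] < limit) {
            count++;
        }
    }
    return count;
}
```

The point of the sketch is the fix strategy: change the code so the compiler has nothing to warn about, rather than lowering the warning level.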
The C Language Doesn’t Always Play Nice
• Defined, but potentially dangerous
  ◦ if (a = b) { … } // a is modified
  ◦ while (x > 0); {x = x-1;} // infinite loop
• Undefined or unspecified → dangerous
  ◦ You might think you know what these do … but it varies from system to system
  ◦ int *p = NULL; x = *p; // null pointer dereference
  ◦ int b; c = b; // uninitialized variable
  ◦ int x[10]; … b = x[10]; // access past end of array
  ◦ x = (i++) + a[i]; // when is i incremented?
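One way to write the same operations without the traps listed above is sketched below. All function names are hypothetical, and the ordering chosen for the `i++` case (read `a[i]` first, then increment) is only one possible reading of the ambiguous original.

```c
#include <assert.h>
#include <stddef.h>

/* Comparison written as ==, so neither operand is modified. */
int same(int a, int b)
{
    return a == b;
}

/* No stray semicolon after the while condition, so the body really runs. */
int count_down(int x)
{
    int steps = 0;
    while (x > 0) {
        x = x - 1;
        steps++;
    }
    return steps;
}

/* NULL and bounds are checked before the array is read. */
int read_at(const int *arr, size_t len, size_t index, int fallback)
{
    if (arr == NULL || index >= len) {
        return fallback;
    }
    return arr[index];
}

/* The read of a[i] and the increment of i are separate statements,
 * so the order of evaluation is explicit. */
int add_index_and_element(const int *a, size_t len, int *i)
{
    assert(a != NULL && i != NULL);
    assert(*i >= 0 && (size_t)*i < len);

    int x = *i + a[*i];
    *i = *i + 1;
    return x;
}
```

Each variable is also initialized at its declaration, which avoids the uninitialized-variable case.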
Language Use Guidelines & Tools
• MISRA C, C++
  ◦ Guidelines for critical systems in C (e.g., no malloc)
  ◦ Portability, avoiding high risk features, best practices
• CERT Secure C, C++, Java
  ◦ Rules to reduce security risks (e.g., buffer overflows)
  ◦ Includes list of which tools check which rules
• Static analysis tools
  ◦ More than compiler warnings (e.g., strong type warnings)
  ◦ Many tools, both commercial and free. Start by going far past “-Wall” on gcc
• Dynamic analysis tools
  ◦ Executes the program with checks (e.g., memory array bounds)
  ◦ Again, many tools. Start by looking at the Valgrind tool suite
MISRA C 2012 Example
[MISRA C-2012 Guidelines; Fair Use]


Let the Language Help!
• Use enum instead of int
  ◦ enum color {black, white, red}; // avoids bad values
• Use const instead of #define
  ◦ const uint64_t x = 1; // helps with type checking
    uint64_t y = x << 40; // avoids 32-bit overflow bug
• Use inline instead of #define
  ◦ If it’s too big to inline, the call overhead doesn’t matter
  ◦ Many compilers inline automatically even without the keyword
• Use typedef with static analysis
  ◦ typedef uint32_t feet; typedef uint32_t meters;
    feet x = 15;
    meters y = x; // feet to meters assignment error
    https://goo.gl/6SqG2i
• Use stdint.h for portable types
  ◦ int32_t is 32-bit integer, uint16_t is 16-bit unsigned, etc.
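The points above can be sketched together in one short file. The names are hypothetical. The struct-wrapped unit types at the end are a swapped-in alternative, not the slide's approach: a plain typedef in C is only an alias, so a feet-to-meters assignment is caught only by a static analysis tool, while distinct struct types let the compiler itself reject it.

```c
#include <assert.h>
#include <stdint.h>

/* enum instead of int: with no default case, gcc/clang -Wall (-Wswitch)
 * warns if an enumerator is added later and not handled here. */
enum color { COLOR_BLACK, COLOR_WHITE, COLOR_RED };

const char *color_name(enum color c)
{
    switch (c) {
        case COLOR_BLACK: return "black";
        case COLOR_WHITE: return "white";
        case COLOR_RED:   return "red";
    }
    return "invalid";
}

/* const instead of #define: ONE is typed as 64 bits, so the shift is done
 * in 64-bit arithmetic rather than in (typically 32-bit) int. */
static const uint64_t ONE = 1;

uint64_t bit_mask(unsigned bit)
{
    assert(bit < 64u);
    return ONE << bit;
}

/* inline instead of a function-like macro: arguments are type checked
 * and evaluated exactly once. */
static inline int32_t max_i32(int32_t a, int32_t b)
{
    return (a > b) ? a : b;
}

/* Assumption: struct wrappers as a compiler-enforced variant of the
 * typedef idea. Assigning a feet_t to a meters_t does not compile. */
typedef struct { uint32_t value; } feet_t;
typedef struct { uint32_t value; } meters_t;

/* Explicit conversion (1 ft = 0.3048 m), truncating to whole meters. */
meters_t feet_to_meters(feet_t f)
{
    meters_t m = { (uint32_t)(((uint64_t)f.value * 3048u) / 10000u) };
    return m;
}
```

With this layout, a unit mix-up has to go through `feet_to_meters`, which makes the conversion visible in code review as well as to tools.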
2012 Open Source Coverity Scan
• Sample size: 68 million lines of open source code
  ◦ Control flow issues: 3,464 errors
  ◦ Null pointer dereferences: 2,724
  ◦ Resource leaks: 2,544
  ◦ Integer handling issues: 2,512
  ◦ Memory – corruptions: 2,264
  ◦ Memory – illegal accesses: 1,693
  ◦ Error handling issues: 1,432
  ◦ Uninitialized variables: 1,374
  ◦ Uninitialized members: 918
• Notes:
  ◦ Warning density 0.69 per 1,000 lines of code
  ◦ Most open source tends to be non-critical code
  ◦ Many of these projects had already fixed bugs found by previous scans
http://www.embedded.com/electronics-blogs/break-points/4415338/Coverity-Scan-2012?cid=Newsletter+-+Whats+New+on+Embedded.com
Deviations & Legacy Code
• Use deviations from rules with care
  ◦ Use “pragma” deviations sparingly; comment what/why
• What about legacy code that generates lots of warnings?
  ◦ Strategy 1: fix one module at a time
    – Useful if you are refactoring/re-engineering the code
    – Sometimes might need to keep warnings off for 3rd party headers
  ◦ Strategy 2: turn on one warning at a time
    – Useful if you have to keep a large codebase more or less in synch
  ◦ Strategy 3: start over from scratch
    – If the code is bad enough this is more efficient … if business conditions permit
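A narrowly scoped, commented deviation might look like the sketch below. It assumes gcc or clang, which both support the `#pragma GCC diagnostic` directives. The callback `on_tick` and its fixed signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* DEVIATION: -Wunused-parameter is disabled for this one function only.
 * What: `context` is not used.
 * Why:  the (hypothetical) timer framework fixes the callback signature,
 *       so the parameter cannot be removed.
 * The push/pop pair restores the previous warning settings afterwards. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"
int on_tick(int elapsed_ms, void *context)
{
    return elapsed_ms * 2;
}
#pragma GCC diagnostic pop

/* From here on, -Wunused-parameter is active again. */
int add(int a, int b)
{
    return a + b;
}
```

For this particular warning, writing `(void)context;` inside the function is a common way to avoid the pragma altogether. The sketch only shows the shape of a deviation that is limited in scope and documented in place.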
Or – You Can Use A Better Language!
• Desirable language capabilities:
  ◦ Type safety and strong typing (e.g., pointers aren’t ints)
  ◦ Memory safety (e.g., bounds on arrays)
  ◦ Robust static analysis (language & tool support)
  ◦ In general, no surprises
• Spark Ada as a safety critical language
  ◦ Formally defined language; verifiable programs
    – The language doesn’t have ambiguities or undefined behaviors
  ◦ You can prove that a program is correct
    – E.g., can prove absence of: array index out of range, division by zero
    – (In practice, this makes you clean up your code until proof succeeds)
  ◦ Key idea: design by contract
    – Preconditions, post-conditions, side effects are defined
Wikipedia: Spark Ada is a subset of the Ada programming language. https://goo.gl/3w6RF6
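To give a feel for design by contract, here is a rough C approximation using `assert`. It is not Spark Ada: Spark proves contracts statically, while these checks only run at execution time and disappear when `NDEBUG` is defined. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Contract for checked_div:
 *   precondition:  divisor != 0, and not INT32_MIN / -1 (which overflows)
 *   postcondition: quotient * divisor + remainder == dividend
 *   side effects:  none */
int32_t checked_div(int32_t dividend, int32_t divisor)
{
    assert(divisor != 0);
    assert(!(dividend == INT32_MIN && divisor == -1));

    int32_t quotient = dividend / divisor;
    int32_t remainder = dividend % divisor;

    assert(quotient * divisor + remainder == dividend);
    return quotient;
}

/* Contract for table_get:
 *   precondition:  table != NULL and index < len
 *   postcondition: returns table[index]
 *   side effects:  none */
int32_t table_get(const int32_t *table, size_t len, size_t index)
{
    assert(table != NULL);
    assert(index < len);
    return table[index];
}
```

The two preconditions correspond to the examples named above: division by zero and array index out of range.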
Language Style Best Practices
• Adopt a safe coding style (or a safe language)
  ◦ MISRA C & CERT C are good starting points
  ◦ Specify a static analysis tool and config settings
    – To the degree practical, let machines find the style problems
  ◦ When static analysis is set up, add dynamic analysis
• The point of good style is to avoid bugs
  ◦ Let the compiler find many bugs automatically
  ◦ Reduce the chance of the compiler mistaking your intention
• Coding style pitfalls:
  ◦ “The code passes tests, so warnings don’t matter”
  ◦ Real bugs lost in a huge mass of warnings
  ◦ Making it too easy to deviate from style rules
https://goo.gl/pvDMHX CC BY-NC 2.0
https://xkcd.com/1695/
