SDL Series 5
In 2006, Michael Howard and Steve Lipner published The Security Development Lifecycle, opening the door to Microsoft's internal methodology for producing more secure software. In this column series, we will walk through the phases of the Microsoft Security Development Lifecycle (SDL) and examine how the SDL is currently put into practice on a daily basis in the development of Microsoft's products. The goal of our effort, through interviews and research, will be to further pull back the covers on Microsoft's practices for creating software upon which millions of users (and billions of dollars) depend.
Content
Threat Modeling Tools
Compiler and Linker Protections
Code Analysis Tools
Manual Code Inspection
Fuzz Testing
Benefits and Limitations
About the Authors
Threat Modeling Tools
Before coding begins, Microsoft developers engage in threat modeling. At this stage, developers are required to think about the assets they're protecting and the different ways an attacker could affect those assets. Typical assets include the user's data, as well as the computer running the code. Developers begin by white-boarding the system they're building, including processes, data stores, data flows, privilege and trust boundaries, and external entities. Developers also consider these common threats: spoofing identity, tampering, repudiation, information disclosure, denial of service, and elevation of privilege, also known as STRIDE threats. As Adam Shostack, Microsoft senior program manager on the SDL strategy team, describes, "Microsoft's developers start by drawing out what they're doing, following some rules of thumb. Once they feel their diagram is complete, they use our threat modeling tool to complete and validate the diagram. They use the drawing implement of the tool to put information into the threat model, and the tool provides live feedback as they're drawing. For example, the tool might detect that they have diagrammed writing to a data store, but not reading from it. The tool recognizes what's probably a mistake and will prompt them to express how the store is to be read from." Because Microsoft teams are required to threat model their entire product and all new features, Microsoft has invested in tools to help teams build effective and consistent threat models.
Because Microsoft teams are required to threat model their entire product and all new features, Microsoft has invested in tools to help teams build effective and consistent threat models. Adam Shostack, Microsoft SDL Program Manager
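To illustrate the kind of diagram validation Shostack describes, the sketch below models a data-flow diagram as simple Python structures and flags a data store that is written but never read. This is not Microsoft's tool; the element and flow names are hypothetical, and the check is a minimal approximation of the rule described above.

# Minimal sketch of a data-flow-diagram consistency check, assuming a
# hypothetical model with processes, data stores, and directed flows.
# It flags the case described above: a store that is written but never read.

from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    source: str       # element sending data
    target: str       # element receiving data
    crosses_trust_boundary: bool = False

# Hypothetical diagram: a web process writes audit records but nothing reads them.
data_stores = {"AuditLog", "UserProfiles"}
flows = [
    Flow("Browser", "WebApp", crosses_trust_boundary=True),
    Flow("WebApp", "AuditLog"),
    Flow("WebApp", "UserProfiles"),
    Flow("UserProfiles", "WebApp"),
]

def validate(stores, flows):
    """Return warnings for stores that are written to but never read from."""
    warnings = []
    for store in stores:
        written = any(f.target == store for f in flows)
        read = any(f.source == store for f in flows)
        if written and not read:
            warnings.append(f"Data store '{store}' is written but never read; "
                            "add a read flow or explain why none exists.")
    return warnings

for warning in validate(data_stores, flows):
    print(warning)   # flags 'AuditLog'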
These compiler options diminish the effectiveness of attacks that attempt to exploit certain classes of vulnerabilities, such as buffer overruns and integer overflows, and all new Microsoft products are required to use these options. Senior Director of Security Engineering Matt Thomlinson explains, "If a [Microsoft product] team is ready to release a new version of a product, they'll point us to their drop site. We use an internal tool called BinScope to walk through a product's binaries to see how it was compiled and which flags are turned on. Our SDL states that product groups must opt into the compiler and linker protections. If a product is shipping 600 binaries, we make sure they're all opted in, and that they are in compliance with the SDL." With Windows Vista and Windows Server 2008, ASLR is enabled by default, and all Microsoft code must be compiled to utilize this operating system protection.
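As a rough illustration of the kind of check a tool like BinScope performs, the sketch below inspects a Windows binary's PE header for the linker opt-in flags behind ASLR and DEP (/DYNAMICBASE and /NXCOMPAT). It relies on the third-party pefile library and is only a minimal approximation of such a tool, not Microsoft's implementation.

# Minimal sketch of a BinScope-style check, assuming the third-party
# 'pefile' library (pip install pefile). It reports whether a binary was
# linked with /DYNAMICBASE (ASLR opt-in) and /NXCOMPAT (DEP opt-in).

import sys
import pefile

# DllCharacteristics bits defined by the PE/COFF specification.
DYNAMIC_BASE = 0x0040   # image can be relocated at load time (ASLR)
NX_COMPAT    = 0x0100   # image is compatible with Data Execution Prevention

def check_binary(path):
    pe = pefile.PE(path, fast_load=True)
    characteristics = pe.OPTIONAL_HEADER.DllCharacteristics
    return {
        "/DYNAMICBASE": bool(characteristics & DYNAMIC_BASE),
        "/NXCOMPAT": bool(characteristics & NX_COMPAT),
    }

if __name__ == "__main__":
    for binary in sys.argv[1:]:
        results = check_binary(binary)
        status = ", ".join(f"{flag}={'yes' if on else 'NO'}"
                           for flag, on in results.items())
        print(f"{binary}: {status}")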
Linux distribution vendors typically compile projects using the GNU Compiler Collection (GCC), which does provide some defenses, such as stack protection. ASLR is also available to Linux through projects like PaX and ExecShield, although most Linux distributions do not enable ASLR by default.
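On the Linux side, the equivalent opt-ins can be checked with standard binutils tools. The sketch below uses readelf to report whether an ELF binary was built as a position-independent executable (which ASLR needs in order to randomize the image base) and with GCC's stack protector. These are simplified heuristics for illustration, not a complete audit.

# Rough check of hardening opt-ins on an ELF binary using binutils' readelf.
# PIE (position-independent executable) lets ASLR randomize the image base;
# the __stack_chk_fail symbol indicates GCC's -fstack-protector was used.
# Both checks are heuristics (shared libraries also report type DYN).

import subprocess
import sys

def readelf(args):
    return subprocess.run(["readelf"] + args, capture_output=True, text=True).stdout

def check_elf(path):
    header = readelf(["-h", path])
    symbols = readelf(["--dyn-syms", path])
    return {
        "PIE": "DYN (" in header,                    # ET_DYN file type
        "stack-protector": "__stack_chk_fail" in symbols,
    }

if __name__ == "__main__":
    for binary in sys.argv[1:]:
        print(binary, check_elf(binary))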
If we tell someone, 'Thou shall not commit a buffer overflow,' there has to be some sort of tool they can run to gauge their compliance. Shawn Hernan, Senior Program Manager, Microsoft
The Linux kernel project also makes use of the information provided by Coverity, but interviews with key Linux kernel contributors show a more mixed assessment, as expressed in the LinuxPlanet article "How Relevant Is the Homeland Security Grant?": "When [Torvalds was] asked if Coverity's results were being used, he confirmed that some bug fixes had indeed found their way into Linux patches. More importantly, the issue is not of quantity, but of quality."
"Well, Coverity doesn't send patches, but yes, some people go through the Coverity results and send patches based on them. Sometimes the warnings are bogus (automated checking has very definite limits), but just as a statistic, Coverity is mentioned 75 times in the commit messages over the last eight months or so," Torvalds stated, adding: "Not all of them were bugs; some of it was to just shut up a bogus warning, but the point being that it does actually end up being useful."

The common Web browser bundled with Linux distros is Mozilla Firefox. In an interview with Mike Shaver, Mozilla Chief Evangelist, on the How Software Is Built blog, the value of Coverity scans was described as follows: "So certainly we are engineers and we like it when computers can do work that lets humans not do that work. So we are all about tools of various kinds. And certainly security standards are no exception to that. Our experience with various source code checkers, and I'm not sure what our most recent results with Coverity were, is that maybe they're not as useful on our code because we are a project that uses a lot of C++ where most of the open source projects are still largely architected in C. Or it could be because we have a very dynamic architecture. You can't always tell how the pieces are wired together without understanding it or running it. Generally, the level of false positives is incredibly high. The number of possible vulnerabilities that is reported as 'This could be a problem, though it may not actually be one' is vastly smaller. And the number of actual vulnerabilities that is reported is quite small. And we do act on those where we see them. But I think if you look at it, the Coverity rungs are really ratings of how responsive you are to Coverity reports. They're not really certifying any absolute level of security in the code."

False positives are the primary challenge of static analysis tools. If the number of real issues is low compared to the number of issues a tool reports, then developers tend not to use such tools, especially if tool use is optional. Microsoft is able to optimize its tools specifically for its code base. As Hernan describes, "Let's say we make a change to the SDL that will require changes across an entire code base. For example, let's say we ban some API. The effort to rid code of that API is often a very expensive thing. We accumulate code, and every time we make changes across the board it becomes progressively more expensive."
Let's say we ban some API. The effort to rid code of that API is often a very expensive thing. Shawn Hernan
Hernan continues, "Initially [Microsoft developers] will write some alpha-level tools that will find the gross instances of the offending code. Then we will refine that tool and work out false positives, while ensuring that it's still finding offending code and suggesting the right correction. Those tools will get developed into scanning tools, and then they will ultimately get incorporated into an infrastructure that most developers here use. Developers will start running those scans on their desktop, rather than running the scans centrally. Once we have reached that plateau, we will deploy these policies down to individual developers' desktops and then they won't be able to check the code in until it passes that scan." Microsoft developers are also able to fine-tune the tools to their particular code base, filtering known false positives without diminishing the usefulness of the tool. Coverity's paying customers can also work with Coverity to optimize its scanner for their code bases. However, when the Coverity scanner is run against core Linux distro projects, it is not optimized for any particular project. As a result, some projects receive more false positives than others do, and some projects choose not to make significant use of the Coverity scan results.
Once we have reached that plateau, we will deploy these policies down to individual developers' desktops and then they won't be able to check the code in until it passes that scan. Shawn Hernan
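As a rough illustration of the banned-API workflow Hernan describes, the sketch below scans C/C++ source files for calls to a few functions that Microsoft's banned-API list is publicly known to include (for example strcpy and sprintf). The file layout and the idea of a check-in gate are hypothetical; this is a minimal approximation of an "alpha-level" scan, not Microsoft's tooling.

# Minimal sketch of a banned-API scan over C/C++ sources, suitable as a
# pre-check-in gate. The banned list below is a small, illustrative subset;
# Microsoft's SDL maintains a much larger one.

import re
import sys
from pathlib import Path

BANNED_APIS = {"strcpy", "strcat", "sprintf", "gets"}
CALL_PATTERN = re.compile(r"\b(" + "|".join(BANNED_APIS) + r")\s*\(")

def scan_file(path):
    """Yield (line_number, api_name) for each banned call found in a file."""
    for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
        for match in CALL_PATTERN.finditer(line):
            yield lineno, match.group(1)

def scan_tree(root):
    findings = []
    for path in Path(root).rglob("*"):
        if path.suffix in {".c", ".cpp", ".h", ".hpp"}:
            for lineno, api in scan_file(path):
                findings.append(f"{path}:{lineno}: banned API '{api}'")
    return findings

if __name__ == "__main__":
    issues = scan_tree(sys.argv[1] if len(sys.argv) > 1 else ".")
    for issue in issues:
        print(issue)
    # A non-zero exit would block the (hypothetical) check-in gate.
    sys.exit(1 if issues else 0)

A naive scan like this also shows why false positives matter: it will flag strcpy inside comments or string literals, which is exactly the kind of noise Hernan describes working out before a rule becomes a check-in gate.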
Bidstrup continues, "If you take, for example, integer overflow classes of vulnerability, stuff like David LeBlanc has written extensively about, this is just really hard stuff for even good engineers to find. This calls attention to the importance of some of the other techniques that we've already spoken about, about why static analysis tools are very important, why fuzz testing is very important."
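To make the class of bug Bidstrup mentions concrete, the sketch below reproduces, in Python, the arithmetic a 32-bit C allocator would see when an attacker-supplied element count is multiplied by an element size: the product wraps around, a tiny buffer is allocated, and the subsequent copy overruns it. The numbers and the checked-multiplication guard are illustrative and not taken from any Microsoft code.

# Illustration of a classic 32-bit integer overflow in an allocation size
# computation. In C this would be: buf = malloc(count * sizeof(element)).
# Python integers do not overflow, so the 32-bit wraparound is simulated
# with an explicit mask.

UINT32_MAX = 0xFFFFFFFF
ELEMENT_SIZE = 16                     # sizeof(element) in the hypothetical C code

def alloc_size_unchecked(count):
    """What a 32-bit multiplication actually produces: it silently wraps."""
    return (count * ELEMENT_SIZE) & UINT32_MAX

def alloc_size_checked(count):
    """Checked-multiplication guard: reject counts whose product cannot fit in 32 bits."""
    if count > UINT32_MAX // ELEMENT_SIZE:
        raise OverflowError("count * ELEMENT_SIZE would overflow 32 bits")
    return count * ELEMENT_SIZE

attacker_count = 0x10000001           # chosen so count * 16 wraps past 2**32

print(hex(alloc_size_unchecked(attacker_count)))   # 0x10: only a 16-byte buffer
# The caller then copies attacker_count elements into that 16-byte buffer,
# a massive heap overrun. The checked version refuses the request instead:
try:
    alloc_size_checked(attacker_count)
except OverflowError as err:
    print("rejected:", err)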
Fuzz Testing
Fuzz testing checks the ability of software to handle malformed input. This may include passing invalid arguments to APIs, sending invalid data into a network port, or changing the contents of a file that must be parsed. A fuzzing tool can either send random data into such inputs, or it can be crafted to intelligently manipulate input to stress likely weaknesses. For example, if it's known that a certain byte specifies a field length, fuzz testing can be crafted to input an invalid field length to see if a buffer overrun occurs. Fuzz testing is particularly good for finding denial of service vulnerabilities (by producing crashes) and buffer overruns.
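A minimal sketch of the random, mutation-based style of fuzzing described above: it flips a few random bytes in a valid seed file and feeds each mutant to a target parser, recording inputs that make the target crash. The target command and seed file are hypothetical placeholders; real fuzzers, including the smarter, format-aware ones discussed below, are far more sophisticated.

# Minimal mutation fuzzer: randomly corrupt bytes in a seed file and run a
# target program on each mutant, keeping any input that causes a crash.
# 'target_parser' and 'seed.bin' are hypothetical placeholders.

import random
import subprocess
import tempfile
from pathlib import Path

SEED_FILE = Path("seed.bin")          # a valid input file for the target
TARGET_CMD = ["./target_parser"]      # program under test; takes a file argument
ITERATIONS = 1000
MAX_FLIPS = 8                         # how many bytes to corrupt per mutant

def mutate(data: bytes) -> bytes:
    buf = bytearray(data)
    for _ in range(random.randint(1, MAX_FLIPS)):
        buf[random.randrange(len(buf))] = random.randrange(256)
    return bytes(buf)

def run_once(mutant: bytes) -> bool:
    """Return True if the target appears to have crashed on this input."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
        tmp.write(mutant)
        tmp_path = tmp.name
    result = subprocess.run(TARGET_CMD + [tmp_path], capture_output=True)
    # On POSIX, a negative return code means the process died from a signal
    # (for example SIGSEGV), which is the crash signature we care about.
    return result.returncode < 0

if __name__ == "__main__":
    seed = SEED_FILE.read_bytes()
    crashes = 0
    for i in range(ITERATIONS):
        mutant = mutate(seed)
        if run_once(mutant):
            crashes += 1
            Path(f"crash_{crashes:03d}.bin").write_bytes(mutant)
    print(f"{crashes} crashing inputs saved")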
Microsoft extensively uses fuzz testing across all products as part of the SDL. As Matt Thomlinson explains, "There are fuzzing tools available in the marketplace that are quite good. We also have our own, which we have tuned for our environment and we've tuned for automation. With the public tools, you might end up running a fuzzer overnight. This may result in a bunch of crashes. Each crash must be walked through to understand what the vulnerability was and if it's exploitable and those sorts of things. With our internal tools, as much as possible, you can just point it at your binary and walk away. You come back after the weekend and you've got bugs filed."

Thomlinson continues: "We have debuggers automatically hooked up to the binaries that are being fuzzed. Our fuzzer communicates back and forth with the debugger. When there's a crash, it'll save off the crash, and it'll save off the file itself. It will try to auto-diagnose what happened. It'll try to tell you automatically if it's exploitable, and it'll bucket the crash. So if you see the same crash or the same call stack, it'll try to tell you that they're really the same crash even though it happened five or six different times. It tries to de-duplicate the crashes so that the developer can say, 'There are four crashes that I need to look at. I have lots of examples of those crashes in action, but really there are only four distinct ones.'"

As Al Comeau, head of Microsoft's Secure SQL Initiative, says, "You have to be at least as good as the bad guy. Fuzz testing is still the predominant attack technique out in the real world. People have excellent fuzz test tools out there, the bad guys have them, and they will throw them at your product. So unless you're doing fuzz testing that is at least as good as, and hopefully better than, what the hackers have, they will get you." Comeau sums it up with, "The challenge is being able to do that in a manner where you get good coverage without having to spend an inordinate amount of resources. And thankfully, we have a lot of really good, intelligent fuzz tools that are available to us. You teach the tool about the file format, you teach it about the registry structures, and you teach it about the protocol structure. Then you can have the tool cause a lot of interesting variations in those inputs."
You have to be at least as good as the bad guy. Fuzz testing is still the predominant attack technique out in the real world. Al Comeau
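The de-duplication Thomlinson describes is commonly done by hashing the top frames of the crashing call stack so that many crash reports collapse into a handful of distinct buckets. The sketch below shows that general idea on made-up stack traces; it is an illustration of the technique, not Microsoft's fuzzing infrastructure.

# Sketch of call-stack-based crash bucketing: crashes with the same top
# frames are grouped into one bucket, so a developer sees "four distinct
# crashes" rather than dozens of duplicate reports. The stack traces here
# are made up for illustration.

import hashlib
from collections import defaultdict

TOP_FRAMES = 3   # how many frames from the top of the stack define a bucket

def bucket_id(stack_frames):
    """Hash the top N frames of a crash's call stack into a bucket key."""
    key = "|".join(stack_frames[:TOP_FRAMES])
    return hashlib.sha1(key.encode()).hexdigest()[:12]

crash_reports = [
    ["parse_header", "read_field", "memcpy"],
    ["parse_header", "read_field", "memcpy"],          # duplicate of the first
    ["parse_body", "decode_chunk", "alloc_buffer"],
    ["parse_header", "read_field", "memcpy"],          # duplicate again
    ["handle_request", "parse_body", "decode_chunk"],
]

buckets = defaultdict(list)
for report in crash_reports:
    buckets[bucket_id(report)].append(report)

print(f"{len(crash_reports)} crashes fell into {len(buckets)} distinct buckets")
for bid, reports in buckets.items():
    print(f"bucket {bid}: {len(reports)} crash(es), top frame {reports[0][0]}")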
Sites such as Insecure.org list many fuzzing and hacking tools. The random fuzzing tools crashme and fsfuzzer gained additional attention in 2006, when the Month of Kernel Bugs (MOKB) was published. In this effort, a security researcher found and disclosed one kernel bug every day for the month of November, disclosing bugs in FreeBSD, Linux, Mac OS X, and Windows. The results are detailed on the Kernel Fun blog. It does not seem that fuzz testing is a quality gate through which core Linux distro projects must pass prior to release. Instead, it seems that the majority of fuzz testing is performed by security researchers who identify vulnerabilities after a project has been released.
Benefits and Limitations
Microsoft uses security tools as one component of the SDL. Tools have proven effective at removing certain classes of vulnerabilities, but tools alone are not sufficient to produce secure software. As Hernan weighs in, "There is no question that automated tools will help. But automated tools can't, for example, catch fundamentally deficient design. We rely on threat modeling, code review, and other tactics for things that automated tools are not good at." There are, however, classes of vulnerabilities for which tools are very effective, and Microsoft's heavy use of tools makes its products substantially more secure. Thomlinson sums up his thoughts: "While Microsoft and other vendors release patches regularly, what you don't see are the things that we don't have to patch anymore. There are certain types of flaws that Microsoft doesn't ship any longer because we have tools to find them every time. We've gotten those tools into the process and we validate that the process is being followed. I would imagine that if we weren't using tools to the extent that we are, we wouldn't be able to eradicate certain classes of vulnerabilities. Those types of vulnerabilities would just reappear any time we produced a significant amount of new code, and we'd be in a reactive mode trying to remove them after a product's release."

In conclusion, automated tools assist product team members and increase the effectiveness of the SDL. Specifically, tools improve threat modeling, detect coding errors, and probe products for potential vulnerabilities. Microsoft continues to invest in tools as part of its ongoing effort to improve the SDL and drive down the number and severity of product vulnerabilities.
There are certain types of flaws that Microsoft doesn't ship any longer because we have tools to find them every time. Matt Thomlinson, Senior Director of Security Engineering