Aspect Security The Unfortunate Reality of Insecure Libraries
Jeff Williams, Chief Executive Officer
Arshan Dabirsiaghi, Director of Research
Aspect Security, Inc.
March 2012
Abstract
Eighty percent of the code in today's applications comes from libraries and frameworks, but the risk of vulnerabilities in these components is widely ignored and underappreciated. A vulnerable library can allow an attacker to exploit the full privilege of the application, including accessing any data, executing transactions, stealing files, and communicating with the Internet. Organizations literally trust their business to the libraries they use.

In partnership with Sonatype, researchers from Aspect Security analyzed 113 million downloads from the Central Repository (Central) of the 31 most popular Java frameworks and security libraries and drew some conclusions about this important aspect of application security. Central is the industry's most widely used repository of open-source components, and currently contains more than 300,000 library versions. We analyzed more than 113 million downloads of these libraries from more than 60,000 commercial, government, and non-profit organizations. Our analysis revealed several interesting findings, including:

- 29.8 million (26%) of library downloads have known vulnerabilities
- The most downloaded vulnerable libraries were GWT, Xerces, Spring MVC, and Struts 1.x
- Security libraries are slightly more likely to have a known vulnerability than frameworks
- Based on typical vulnerability rates, the vast majority of library flaws remain undiscovered
- Neither presence nor absence of historical vulnerabilities is a useful security indicator
- Typical Java applications are likely to include at least one vulnerable library
The data show that most organizations do not appear to have a strong process in place for ensuring that the libraries they rely upon are up-to-date and free from known vulnerabilities. We conclude that there are no shortcuts to a secure application infrastructure and that the only useful indicator of library security is a broad and rigorous review that finds minimal vulnerability. This paper is not a critique of open source libraries, and we caution against interpreting this analysis as such. The authors are strong advocates of free and open software and created two of the security libraries studied. Instead, we recommend that organizations recognize that libraries are a critical part of their software infrastructure and ensure they have the level of awareness and the necessary tooling within their organization to generate appropriate assurance.
Study Design
In partnership with Sonatype, Aspect Security researchers analyzed more than 113 million downloads of the 31 most popular Java frameworks and security libraries from the Central Repository. The 31 libraries were selected by examining over 500 enterprise applications submitted to Aspect Security for code review and security testing in the past 12 months. The 20 frameworks and 11 security libraries selected are a small but frequently downloaded subset of the more than 36,000 libraries (and 303,000 total versions) in the Central Repository (Central). Our dataset analyzed the downloads of these libraries from Central in the past 12 months:
These numbers represent a partial view into the true picture of library use, since many organizations, particularly larger enterprise-class organizations, are likely to download libraries to their own local repositories for internal reuse. Downloads from these repositories and direct downloads of libraries from project web sites are not included in the study. These factors notwithstanding, we believe that the large size of the dataset provides strong support for the conclusions of the study. The study focuses only on open-source Java libraries, but there is no reason to believe that the data for other languages and platforms would be significantly different. Similarly, our experience in evaluating the security of hundreds of custom applications indicates that the findings are likely to apply to closed-source and commercial libraries as well.
In this study, we focus on two types of libraries that are particularly security critical. The first type is the framework: a library that provides the common code necessary to generate a web application. The other type is the security library, which provides security controls such as encryption, input validation, logging, access control, and other critical security functions.
Security researchers periodically discover vulnerabilities in libraries and make them available through a disclosure process of their own choosing. Some of these disclosures are coordinated; in other cases, researchers simply write blog posts or send emails to mailing lists. Our dataset maps the vulnerabilities listed in the MITRE Common Vulnerabilities and Exposures (CVE) list and the Open Source Vulnerability Database (OSVDB) to the appropriate library. We used this mapping to analyze the dataset for patterns indicating the causes of vulnerable libraries.

Not all inadvertent vulnerabilities are created equal. While some vulnerabilities allow the complete takeover of the host using them, others might result in data loss or corruption, and still others might provide a bit of useful information to attackers. In most cases, the impact of a vulnerability depends greatly on how the library is used by the application. A flawed library might result in a devastating exposure in one application and no exposure at all in another.

There are a variety of factors that may prevent an inadvertent vulnerability from being exploitable in a particular application. The most common is that the vulnerability is in a part of the library code that is not used by the application. Another is that the vulnerability is shielded by input validation, access control, transformation, or other security controls in the application that prevent exploitation.

Based on this discussion, one might (wrongly) conclude that the use of vulnerable libraries is an acceptable risk. It might be argued that developers can work around the problems in components. Or, developers might rely on automated tools to catch vulnerabilities in applications. Or, developers might believe that their penetration testing process will uncover any exploitable problems in the final application.

Unfortunately, in order to work around a problem, developers have to know that it is there. Currently, developers have no way to know that the library versions they are using have known vulnerabilities. They would have to monitor dozens of mailing lists, blogs, and forums in order to stay abreast of this information. Further, development teams are unlikely to find vulnerabilities in libraries on their own, as doing so requires extensive security experience, and automated tools are largely ineffective at analyzing libraries.

Establishing a process to manage vulnerable library versions in an organization is a relatively easy way to eliminate a significant amount of risk from an application portfolio. By focusing efforts on a small set of approved libraries, one organization that engaged our services was able to eliminate thousands of old vulnerable components and versions and save millions of dollars maintaining their software portfolio.
Struts2 is a fairly popular remake of the highly successful Struts framework. Unfortunately, Struts2 has suffered a series of remote code execution flaws that have affected all known versions. In the last year, Struts2 was downloaded more than 1 million times by over 18,000 organizations. In 2010, a researcher from Google's Security team discovered a unique class of weakness in the library that allowed attackers to execute arbitrary code on any Struts2 web application.
Essentially, Struts2's use of the Object Graph Navigation Language (OGNL) included data provided by users, allowing attackers to access data objects and invoke arbitrary methods, such as Runtime.exec(). The implication is that a successful exploit could completely compromise an application and the host on which it runs. These flaws resulted in numerous vulnerable downloads in the last 12 months. They don't require authentication or special skills to exploit. Since then, Google and others have regularly unearthed similar critical flaws.
Vulnerability: Spring Expression Language Injection

Spring is the most popular application development framework for Java. It was downloaded over 18 million times by over 43,000 organizations in the last year, during which time many vulnerable versions were created and downloaded.

Aspect Security and Minded Security collaborated on a whitepaper (https://www.aspectsecurity.com/expression-language-injection) in 2011 that discusses a new class of vulnerabilities in Spring's use of Expression Language (EL). This vulnerability allows attackers to submit HTTP parameters that get interpreted as EL and executed. An exploit can leak data out of the server, including sensitive information such as system data, application data, and user cookies.
CVE-2011-2730 exploit to steal data out of a user's session:

    http://example.org/springapp/search?query=${requestScope}

    Your search for: javax.servlet.forward.request_uri=/ELInjection/eval.htm,javax.servlet.forward.servlet_path=/eval.htm,user.roles=[ADMIN,USER,ANONYMOUS] display name [WebApplicationContext for namespace cashflowServlet]; startup by [uid=root, Tue Jul 19 22:35:58 EEST 2011];org.springframework.web.servlet.view.InternalResourceView.DISPATCHED_PATH=/var/opt/test/eval.jsp,... returned zero results.
This attack doesn't require a lot of skill or authentication. When processing this request, Spring takes the user's parameter and evaluates it. Spring sends the result of this evaluation back to the user, accidentally leaking important session data.

Vulnerability: CXF Authentication Bypass

Apache CXF is a framework for developing Web Services, from small JSON utilities for web applications to a full-scale enterprise service bus (ESB). CXF was downloaded 4.2 million times by more than 16,000 organizations in the past 12 months. Since 2010, CXF has had two major vulnerabilities (CVE-2010-2076 and CVE-2012-0803) that allowed attackers to trick any service using CXF into downloading arbitrary system files and to bypass authentication entirely. The authentication bypass was rated Critical. A user could access any protected resource by simply not passing in the expected WS-Security UsernameToken field. This exposed all business Web Services written in CXF. To many businesses, this is the worst possible vulnerability.
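The general shape of this kind of bypass can be illustrated with a minimal sketch. This is not CXF's code: the header name, the credential store, and both functions are hypothetical and exist only to show the difference between a check that "fails open" when a credential is absent and one that "fails closed".

```python
import hmac
from typing import Mapping, Optional, Tuple

# Hypothetical credential store, for illustration only.
VALID_USERS = {"alice": "correct horse"}

Token = Tuple[str, str]  # (username, password)


def _token_is_valid(token: Token) -> bool:
    """Compare the supplied password with the stored one in constant time."""
    username, password = token
    expected = VALID_USERS.get(username)
    if expected is None:
        return False
    return hmac.compare_digest(password.encode(), expected.encode())


def authorize_fail_open(headers: Mapping[str, Token]) -> bool:
    """Flawed pattern: a missing token is treated as nothing to check."""
    token: Optional[Token] = headers.get("UsernameToken")
    if token is None:
        return True  # BUG: the request is allowed because no credential was sent
    return _token_is_valid(token)


def authorize_fail_closed(headers: Mapping[str, Token]) -> bool:
    """Safer pattern: a missing token is rejected like an invalid one."""
    token: Optional[Token] = headers.get("UsernameToken")
    if token is None:
        return False
    return _token_is_valid(token)
```

In the flawed variant, the attacker does not need to guess or forge anything; omitting the field is enough. Denying by default when an expected credential is absent closes that path.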
look and feel of their application, without having to create and maintain all the plumbing to make the application work. The next two most frequently downloaded libraries are Apache CXF and Hibernate. Apache CXF is a services framework that helps developers build services using a variety of protocols and transports. Hibernate is a persistence library that helps programmers map their data structures into a database.
While Spring MVC was the most frequently downloaded library in this study, Log4j reached more organizations. Log4j was downloaded by 45,000 organizations in the last twelve months. One possible explanation is that organizations of all sizes need a strong logging library. Another is that many of the open source frameworks include Log4j as a dependency, and it gets automatically downloaded with the framework.
If people were updating their libraries, we would have expected the popularity of older libraries to drop to zero within the first two years. However, the data clearly show popularity extending back over six years. One possible explanation is that some projects, perhaps new development efforts, tend to use the latest version of a library, accounting for the spike in popularity for the libraries in the first year. The continuing popularity of older versions over an extended period suggests that incremental releases of legacy applications are not being updated to use the latest versions of libraries but are continuing to use older versions.
This number may be significantly underestimated because many developers get their libraries from local repositories (e.g. Sonatype's Nexus Repository Manager). These local repositories store libraries within an organization, often in perpetuity, and can mask an organization's ongoing use of these cached files from the dataset.

Examining the specific libraries, we see that Google Web Toolkit (GWT) has more vulnerable downloads by an order of magnitude than any other library. GWT is a user-interface framework that allows Java web applications to create dynamic user interfaces that leverage JavaScript in the browser. GWT had a huge number of downloads and numerous vulnerable versions released in 2011. When shown along with downloads of versions without known vulnerabilities, the overall picture of the vulnerabilities becomes clearer. The overwhelming number of vulnerable GWT downloads is obvious.
The analysis of vulnerable downloads by the year the version was created shows a disturbing trend. We expected that older libraries were more likely to be vulnerable downloads. But we found the opposite: newer libraries were considerably more likely to be a vulnerable download. In fact, from our dataset of the most popular libraries, 35% of the downloads of versions created in 2011 were vulnerable, compared to an average of 15% for libraries created in 2007.
The primary explanation for the trend is GWT, which represents about one-third of the downloads of libraries created in 2011 and 97% of the vulnerable ones. When we remove GWT from the data, the trend returns to what we would have expected. However, because there were so many downloads of vulnerable versions of GWT in 2011, there are still a large number of vulnerable applications that use those versions.
[Figure: vulnerability by release popularity. Not popular: 38% vulnerable, 62% not vulnerable. Popular: 28% vulnerable, 72% not vulnerable.]
Applications typically include many different libraries, which significantly increases the likelihood that an application will contain a vulnerable library. Developers are playing a strange computer version of Russian roulette when including libraries in their application. Imagine an application that uses 10 libraries from our dataset of the most popular frameworks and security libraries. Even if developers select popular releases of all the libraries they need, there's still a 28% chance that each of them is vulnerable. Because our example application uses ten libraries, the chance that the application includes at least one vulnerable library is more than 95%.
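The arithmetic behind this estimate can be reproduced with a short sketch. It assumes that each library is vulnerable independently of the others with the same probability, which is a simplification rather than something the dataset establishes; the function name is ours.

```python
def chance_of_vulnerable_library(per_library_rate: float, library_count: int) -> float:
    """Probability that at least one of `library_count` libraries is vulnerable.

    Assumes every library is vulnerable independently with the same rate, so
    the chance that none is vulnerable is (1 - rate) ** count.
    """
    if not 0.0 <= per_library_rate <= 1.0:
        raise ValueError("per_library_rate must be between 0 and 1")
    if library_count < 0:
        raise ValueError("library_count must not be negative")
    return 1.0 - (1.0 - per_library_rate) ** library_count


if __name__ == "__main__":
    # 28% is the rate quoted above for popular releases.
    for count in (1, 5, 10, 20):
        print(count, round(chance_of_vulnerable_library(0.28, count), 3))
```

With a 28% rate and ten libraries this gives 1 - 0.72^10, or roughly 0.96, consistent with the "more than 95%" figure above.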
We recommend that organizations make their commitment to security clear. A strong security culture in some organizations, such as OpenBSD, may reduce the likelihood of vulnerabilities in their codebase. It's hard to read OpenBSD's security page and not be encouraged about their approach. Their page starts with the quote below and we encourage you to read it in its entirety:

"OpenBSD believes in strong security. Our aspiration is to be NUMBER ONE in the industry for security (if we are not already there). Our open software development model permits us to take a more uncompromising view towards increased security than Sun, SGI, IBM, HP, or other vendors are able to. We can make changes the vendors would not make."
http://www.openbsd.org/security.html

However, though their culture may encourage security, not all BSD-licensed libraries come from OpenBSD. Therefore, we can't support any generalizations about security based on the license selected by the project. It is possible, though, that some aspect of the license selected by project leaders does affect the likelihood of vulnerabilities being discovered. This will make an interesting research project for the future.
We suspect that security libraries may have more reported vulnerabilities because they naturally attract a greater degree of scrutiny by security researchers and attackers. This may explain the greater incidence of security vulnerability reports for these libraries. As we mentioned, our dataset does not include the level of scrutiny targeting each library. It is also possible that because security libraries are specifically focused on security-critical code, virtually any bug is likely to be a reportable security vulnerability.

Despite the higher rate of reported vulnerabilities in security libraries, we do not recommend writing your own security controls. There are a huge number of subtle mistakes that can introduce vulnerabilities into controls written by even the best developers. The best route to a secure set of controls is to use proven and tested components and have your implementation carefully verified by security experts. You can improve your odds by externalizing and standardizing your controls.
We suspect that these numbers are significantly underreported, as we are not able to correctly associate all of the downloads with an organization (due to repository manager caching). Many downloads were attributed to an Internet provider, and slightly over 40% of the downloads could not be attributed to any specific organization. We conclude that many developers are downloading from home or other networks not associated with an organization. We also know through our work that many organizations not represented in the data are using Central and that many have designed their networks to camouflage their identity when employees access the Internet.
There does seem to be a significant difference in how many of the 31 libraries are used by the organizations studied. On average, the Global 500 downloaded 19.2 of the 31 libraries in this study. Smaller organizations downloaded an average of only 8.5 of the libraries studied. Since the larger organizations have considerably larger application portfolios, it is not surprising that they use more of the libraries. However, this is also a concern because there is considerable overlap in the libraries selected. That means that the larger organizations have not standardized on a small set of framework and security libraries. More libraries means more code. And more code increases the chance of a devastating vulnerability.
There does not seem to be a difference in the libraries used by the Global 500 and the smaller organizations in the study. Almost all of the libraries received between 7 and 10 percent of their downloads from the Global 500. We expected to find disproportionately high adoption of security libraries by the Global 500 as compared to all the other companies, but that hypothesis was not supported by the data. Only two libraries, AntiSamy and HDIV, are disproportionately represented in the Global 500 compared to other libraries. AntiSamy, created by one of the authors of this article, received almost 22% of downloads from the Global 500. AntiSamy removes attacks from third-party HTML content to make it safe to use in webpages. HDIV adds integrity checks to HTTP input fields and received 17% of its downloads from the Global 500. Perhaps this evidence demonstrates that enterprise frameworks do not provide these niche security functions.
Recommendations
Given the presence of vulnerabilities in commonly used versions of popular libraries in Central, we strongly recommend that you take steps to minimize the risks to your organization. You should consider the risk from all libraries, including those with known vulnerabilities, unknown vulnerabilities, and malicious code. You trust your business to the libraries that you use.

INVENTORY: Gather information about your current library situation

Getting some real data about your organization's library use is a good way to get started. While broad studies like this are useful indicators, building momentum in an organization typically requires specific findings about your organization. We recommend metrics around what libraries and frameworks are in use, how far out-of-date and out-of-version they are, the use of viral and unapproved licenses, and whether they have known vulnerabilities.

ANALYZE: Check the project and the source for yourself

Exercise a degree of restraint in the libraries that are used. Before trusting your enterprise to code of unknown provenance, we recommend a vetting process that gathers information about the team building the library and the process they followed to develop it. Minimally, open source projects should have an approved license, a process by which contributions are reviewed for security, and the ability to respond to security vulnerability reports. Consider the use of some recently available software tools to provide this capability.

The only way to deal with the risk of unknown vulnerabilities in libraries is to have someone who understands security analyze the source code. Static analysis of libraries is best thought of as providing hints about where security vulnerabilities might be located in the code, not as a replacement for experts. The lack of context with libraries makes it virtually impossible for tools to conclusively identify vulnerabilities. Manual code review can be used at various levels of rigor, from the common flaws level, like the OWASP Top Ten, all the way up to searching for malicious code and rootkits. Be sure to discuss the level of rigor you require with reviewers.

CONTROL: Restrict the use of unapproved libraries

One way to gain control over ad-hoc library use is to establish a local repository that only contains approved libraries. A strict version of this policy could block direct access to Central by developers, although we have anecdotal evidence that this frequently results in workarounds where developers download jar files directly or access other open repositories. A better approach is to create a governance process around library use and help development groups take advantage of it.

Consider enabling the Java SecurityManager, sometimes known as the "sandbox." Java was originally designed with security features to allow remotely controlled code known as "applets" to run within the browser without compromising security. The Java SecurityManager was designed to keep this potentially malicious code in a sandbox where it could do no harm. When you consider that libraries are also remotely controlled code, the solution seems clear. The SecurityManager can prevent libraries from making dangerous calls and make many rootkits impossible.

Also consider creating a "Secure Use" guideline that details how frameworks and libraries are allowed to be used in your organization. The guideline should specifically detail the patterns for using a library securely and point out any known patterns that could lead to an insecure application.

MONITOR: Keep libraries up-to-date

Development teams should plan for and allocate resources to keep libraries up-to-date. In most cases, updates can be made compatible with an existing application without significant rework. Note that major releases may require significant rework. This is the cost that you incur as a result of using the library in the first place, rather than investing in writing your own code. We recommend establishing systems and processes for monitoring the libraries that you are using. This will help you to identify and respond to security vulnerabilities and updates.
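As a starting point for the inventory and monitoring steps, a minimal sketch of an automated check might look like the following. All library names, versions, and advisory data here are hypothetical placeholders; a real process would draw the inventory from your build or repository manager and the advisory data from sources such as CVE. The version comparison assumes purely numeric dotted versions, which many real libraries do not follow.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Library:
    name: str
    version: str


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.2.10' into (1, 2, 10). Assumes numeric dotted versions only."""
    return tuple(int(part) for part in version.split("."))


def audit(
    inventory: Iterable[Library],
    approved: Set[str],
    known_vulnerable: Dict[str, Set[str]],
    latest: Dict[str, str],
) -> List[Tuple[Library, str]]:
    """Return (library, finding) pairs for every policy problem found."""
    findings: List[Tuple[Library, str]] = []
    for lib in inventory:
        # Control: is this library on the approved list at all?
        if lib.name not in approved:
            findings.append((lib, "unapproved"))
        # Inventory: does this exact version have a known vulnerability?
        if lib.version in known_vulnerable.get(lib.name, set()):
            findings.append((lib, "known-vulnerable"))
        # Monitor: is a newer version available?
        newest = latest.get(lib.name)
        if newest is not None and parse_version(lib.version) < parse_version(newest):
            findings.append((lib, "out-of-date"))
    return findings


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    inventory = [Library("example-framework", "1.1"), Library("mystery-utils", "0.3")]
    report = audit(
        inventory,
        approved={"example-framework"},
        known_vulnerable={"example-framework": {"1.0", "1.1"}},
        latest={"example-framework": "1.2"},
    )
    for lib, finding in report:
        print(f"{lib.name} {lib.version}: {finding}")
```

Running a check like this on every build, rather than once, is what turns an inventory into ongoing monitoring.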
Conclusions
The use of libraries has become a pervasive, almost overwhelming, aspect of modern software development. In the past few years, the use of dependency management tools has caused a significant increase in the number of libraries involved in a typical application. In this paper, we examined some of the security implications of this change and conclude that there are significant risks associated with the use of libraries. We recommend that any organization building critical software applications protect itself against these risks by taking steps to inventory, analyze, control, and monitor the use of libraries across the organization.
www.aspectsecurity.com