Static Analysis of Java Enterprise Applications
FRAMEWORKS AND CACHES,
THE ELEPHANTS IN THE ROOM
CHALLENGES OF JAVA ENTERPRISE APPLICATIONS
COMPLETENESS
• Web Frameworks
– Layers of abstraction ease development
• Dynamic techniques
– e.g., Dependency Injection
– Configurability (annotations, XML)
– Custom implementations of Java EE
• Supporting each framework individually is unsustainable
“Where do I start from?”
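To make the dependency-injection challenge concrete, here is a minimal sketch in plain Java. The annotation `InjectHere`, the two application classes, and `MiniContainer` are hypothetical stand-ins, not any real framework's API. The point is that no application code ever assigns `service` or calls `handle`, so an analysis that only follows explicit allocations and calls sees neither.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;

// Hypothetical annotation standing in for a framework's injection marker.
@Retention(RetentionPolicy.RUNTIME)
@interface InjectHere {}

class GreetingService {
    String greet(String name) { return "Hello, " + name; }
}

class GreetingController {
    // No application code ever assigns this field.
    @InjectHere GreetingService service;

    // No application code calls this method either; a framework would.
    String handle(String user) { return service.greet(user); }
}

// Hypothetical miniature "framework": creates objects and fills
// every @InjectHere field through reflection.
class MiniContainer {
    static <T> T create(Class<T> type) {
        try {
            T instance = type.getDeclaredConstructor().newInstance();
            for (Field f : type.getDeclaredFields()) {
                if (f.isAnnotationPresent(InjectHere.class)) {
                    f.setAccessible(true);
                    f.set(instance, create(f.getType()));
                }
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("injection failed for " + type, e);
        }
    }

    public static void main(String[] args) {
        GreetingController controller = create(GreetingController.class);
        System.out.println(controller.handle("world"));
    }
}
```

The object stored in `service` and the call to `handle` both originate inside the container, which is the kind of behavior a framework model has to supply to the analysis.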
ANOTHER CHALLENGE:
FRAMEWORK CACHES
JackEE TO THE RESCUE
THIS PAPER’S CONTRIBUTIONS
FRAMEWORK-AGNOSTIC MODELING
• JackEE’s modeling of frameworks
– Declarative implementation
• Extends Doop
– Defines a common simplified vocabulary
– Processes program inputs (incl. annotations, XML)
– Produces framework-independent outputs
• Entry points - Discovery and exercise
• Bean objects - Generation and interconnection
JackEE’S GENERALIZED VOCABULARY
• JackEE’s outputs
– Used by the points-to analysis
– Use points-to information to infer further points-to facts (mutual recursion)
SAMPLE USE OF VOCABULARY
ENTRY POINT DISCOVERY RULES
• Subtyping, annotations, XML configuration
– In-app servlet discovery
Servlet(class) :-
    ConcreteApplicationClass(class),
    SubtypeOf(class, "javax.servlet.GenericServlet").
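The servlet-discovery rule above can be read as a filter over class-hierarchy facts. The following is a rough Java rendering of that reading, not JackEE's implementation: the `ServletDiscovery` class, the map of direct supertypes, and the example class names are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ServletDiscovery {
    // SubtypeOf(cls, target): walk the (hypothetical) direct-supertype facts upward.
    static boolean subtypeOf(String cls, String target,
                             Map<String, List<String>> directSupertypes) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(cls);
        while (!work.isEmpty()) {
            String current = work.pop();
            if (current.equals(target)) return true;
            if (seen.add(current)) {
                work.addAll(directSupertypes.getOrDefault(current,
                        Collections.<String>emptyList()));
            }
        }
        return false;
    }

    // Servlet(class) :- ConcreteApplicationClass(class),
    //                   SubtypeOf(class, "javax.servlet.GenericServlet").
    static Set<String> servlets(Set<String> concreteApplicationClasses,
                                Map<String, List<String>> directSupertypes) {
        Set<String> result = new TreeSet<>();
        for (String cls : concreteApplicationClasses) {
            if (subtypeOf(cls, "javax.servlet.GenericServlet", directSupertypes)) {
                result.add(cls);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up facts for two application classes.
        Map<String, List<String>> sup = new HashMap<>();
        sup.put("com.example.LoginServlet",
                Arrays.asList("javax.servlet.http.HttpServlet"));
        sup.put("javax.servlet.http.HttpServlet",
                Arrays.asList("javax.servlet.GenericServlet"));
        sup.put("com.example.UserDao", Arrays.asList("java.lang.Object"));
        Set<String> app = new HashSet<>(
                Arrays.asList("com.example.LoginServlet", "com.example.UserDao"));
        System.out.println(servlets(app, sup));
    }
}
```

In the declarative version the same relation is computed by the Datalog engine; the loop here only spells out what the two body atoms require of a class.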
– Completeness:
SAMPLE USE OF VOCABULARY
WIRING TOGETHER BEANS
Bean_Id(bean, field),
GeneratedObject(beanObject, bean).
– Completeness:
JackEE POINTS-TO
A RECURSIVE RELATIONSHIP
• Completeness:
WHAT ABOUT CACHES?
SCALABILITY CHALLENGES
• Blowup in java.util points-to (2objH)
– The most precise practical analysis
“Wait, but why?”
• Lots of internal complexity in maps
2objH computation time (% of total)
Benchmark   java.util time   non-java.util time
alfresco    72               28
opencms     46               54
pybbs       69               31
WHAT ABOUT CACHES?
PRECISION CHALLENGES
• java.util.* maps feature a double-dispatch-like pattern
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict)
{
    …
    else if (p instanceof TreeNode)
        e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
    …
}
• Degrades the precision of most context-sensitive analyses, e.g., 2objH
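The shape of the double-dispatch-like pattern can be isolated in a few lines. The sketch below is a hypothetical miniature (`TinyMap` and `Bin` are invented names, not JDK code): the map dispatches to its bin and passes itself as an argument, and the bin then dispatches back into the map. It only illustrates the call structure; it does not reproduce how any particular analysis loses precision on it.

```java
// Hypothetical miniature of the pattern; not the JDK's HashMap code.
class Bin {
    Object value;

    // The receiver is the bin; the owning map arrives only as an argument.
    void store(TinyMap owner, Object v) {
        this.value = v;
        owner.recordInsertion(); // second dispatch: back into the map
    }
}

class TinyMap {
    private final Bin bin = new Bin();
    private int size;

    // First dispatch: the map hands itself to the bin.
    void put(Object v) { bin.store(this, v); }

    void recordInsertion() { size++; }

    Object get() { return bin.value; }

    int size() { return size; }

    public static void main(String[] args) {
        TinyMap users = new TinyMap();
        TinyMap orders = new TinyMap();
        users.put("alice");
        orders.put("order-1");
        System.out.println(users.get() + " / " + orders.get());
    }
}
```

In `putVal` the same roles are played by the map (`this`) and the `TreeNode` receiving `putTreeVal`.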
SOUND-MODULO-ANALYSIS MODELING OF MAPS
HASHMAP SNIPPET
[Figure: HashMap code snippet, Original vs. JackEE versions side by side]
SOUND-MODULO-ANALYSIS MODELING
• Captures all the original behaviors of Java maps (e.g., exceptions)
• Removed complexity from the most-reused part of the library
– Greatest complexity-removal factor: treeification elimination
• Treeification converts the map’s bins to red-black trees
– Fewer local aliases
– Simpler code lets context sensitivity retain greater precision
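As a rough illustration of what a map without treeification looks like, the sketch below keeps every bin as a plain linked list and never converts it to a tree. `ListBinMap` is a hypothetical class written for this note; it is not JackEE's replacement library, and it omits resizing, iteration, and most of the `java.util.Map` interface.

```java
import java.util.Objects;

// Hypothetical hash map whose bins are only ever linked lists (no red-black trees).
class ListBinMap<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];
    private int size;

    // Null keys hash to bin 0, as in java.util.HashMap.
    private int indexFor(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return (h ^ (h >>> 16)) & (table.length - 1);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        table[i] = new Node<>(key, value, table[i]); // prepend to the bin's list
        size++;
        return null;
    }

    V get(Object key) {
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) return n.value;
        }
        return null;
    }

    int size() { return size; }

    public static void main(String[] args) {
        ListBinMap<String, Integer> m = new ListBinMap<>();
        m.put("Aa", 1);
        m.put("BB", 2); // same hashCode as "Aa", so both land in one bin
        System.out.println(m.get("Aa") + " " + m.get("BB"));
    }
}
```

Observable `put`/`get` results match those of a treeified map; only the internal bin representation differs, which is the part the slides identify as the main source of removed complexity.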
EVALUATION
A REAL-WORLD BENCHMARKING
SUITE!
Benchmark          Description                       Gitstar Organization/User Rank
alfresco           CMS                               2,088   550    1,947
bitbucket-server   On-premise version of Bitbucket   N/A     N/A    N/A
dotCMS             CMS                               612     400    4,624
opencms            CMS                               522     400    5,092
pybbs              Website building framework        1,109   524    3,895
shopizer           e-commerce framework              1,643   1.6k   4,040
SpringBlog         Blog system                       1,548   716    3,568
WebGoat            The popular OWASP app             3,701   1.5k   1,234
IMPRESSIVE COMPLETENESS
HIGHER THAN PLAIN OLD BENCHMARKS (DACAPO)!
– Doop averages 42.89% in-app reachability for DaCapo
• Without JackEE, Doop averages 14.48% in-app reachability
[Chart: app reachable methods % (in-app reachability) for alfresco, dotCMS, opencms, shopizer, SpringBlog]
SPEED
JackEE: ORIGINAL JDK 8 VS SOUND-MODULO-ANALYSIS JDK 8
Benchmark          Avg. vpt size reduction   Avg. app vpt size reduction   # CallGraphEdge reduction
alfresco           24.2%                     19.4%                         7.4%
bitbucket-server   42.3%                     25.7%                         8.1%
dotCMS             N/A                       N/A                           N/A
opencms            13.3%                     8.2%                          1.8%
pybbs              33.7%                     24.3%                         8.9%
shopizer           30.3%                     27.0%                         6.0%
SpringBlog         44.6%                     28.7%                         8.6%
WebGoat            30.2%                     6.0%                          4.4%
Average            28.7%                     19.9%                         6.5%
CONCLUSION
• JackEE’s contributions
– Automatic, declarative and extendable framework modeling
• Impressive completeness
– Sound-modulo-analysis modeling of maps
• Maintains soundness
• Achieves high scalability
• Significantly improves precision
AAAND CUT!