Programming for Performance
Alosh Bennett
What is performance?
Overall feel of the system, its quickness and responsiveness.
Can be expressed in terms of
Computational Power
Memory footprint
Is perceived as
Responsiveness of the system
Throughput
Scalability
Startup time
Other factors
Reliability of the system
Availability
How to build a well-performing system?
What are you building?
Web Application
Application should load fast
Faster request turn-arounds
Minimal net traffic
Reduce number of server calls
Asynchronous updates
Database Application
Fast updates and queries
Schema design
Indexing
Normalizing
Caching results
Connection pooling
There is no generic formula for building a well-performing application.
Building an Application - stages
Requirements Gathering
Scope of the application
What it is and what it is not.
Real world estimation of usage
Application Architecture
Blueprint of the project
Identify the technologies
Components of the application
Interaction
Pseudocode
Logic of individual components
Algorithms and Data Structures
Coding
Coding standards
Good practices
When should we start thinking about performance?
Requirements Gathering
Scope of the project
Music player with media library
Manage up to 5000 songs in the library.
Play up to 5 songs simultaneously.
Performance Benchmark
Requires 1.8 GHz processor, 50 MB RAM, 20 MB hard disk space
Startup under 5 seconds
What the application is not
Site to show top events of the day.
It will not show real-time data
Data is fetched only once every few hours
Real world usage scenario
Web application – How many concurrent users must it serve?
Text Editor – How big a file can it handle?
Do one thing and do it well.
Application Architecture – Right tool for the Right job
Blog
JSP or PHP
Online Transaction site
Industry grade server – Apache, GlassFish, WebLogic
J2EE or similar framework
Database – MySQL, Oracle, PostgreSQL
Rich content website
JavaFX, Flex, HTML5
Avoid applets
Task Automation
Scripting languages – Python, Perl, shell
Avoid Java, C
Concurrent processing with multiple processors
Scala over Java
Mathematical modeling – Computation intensive
Functional programming over Object Oriented
XML Parsing in java
DOM Parser
SAX Parser
StAX Parser
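The three parsers differ mainly in how much of the document they hold in memory: DOM builds the whole tree, SAX pushes events to callbacks, and StAX lets the application pull events one at a time. A minimal StAX sketch, assuming a hypothetical library file with song elements (the class name, method name and sample XML are illustrative only):

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxElementCounter {
    // Counts matching start tags by pulling events one at a time,
    // so the document is never loaded into memory as a whole tree.
    static int countElements(String xml, String elementName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document
        String xml = "<library><song>A</song><song>B</song><album/></library>";
        System.out.println(countElements(xml, "song"));
    }
}
```

DOM remains the simpler choice when the document is small and needs random access or modification.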
Application Architecture – Harness the processing power
Single-threaded vs Multi-threaded
Picasa tool to upload pictures
Single upload takes 5 seconds
Single thread
10 pictures take 50 seconds
Multithreaded – (5 workers)
10 pictures take 10 seconds
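The five-worker upload can be sketched with a fixed thread pool. This is an illustration only: upload() is a hypothetical stand-in that sleeps instead of talking to a server, and the class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelUpload {
    // Hypothetical upload: sleeps to simulate the network transfer.
    static String upload(String picture, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return picture + " uploaded";
    }

    // Submits every picture to a pool of `workers` threads and
    // collects the results in submission order.
    static List<String> uploadAll(List<String> pictures, int workers, long millisPerUpload) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String picture : pictures) {
                Callable<String> task = () -> upload(picture, millisPerUpload);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // waits for that upload to finish
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> pictures = new ArrayList<>();
        for (int i = 0; i < 10; i++) pictures.add("pic" + i);
        long start = System.nanoTime();
        uploadAll(pictures, 5, 50);
        System.out.println("elapsed ms: " + (System.nanoTime() - start) / 1_000_000);
    }
}
```

With 5 workers the 10 simulated uploads run in roughly two rounds instead of ten, which mirrors the 50 seconds versus 10 seconds on the slide.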
Identifying parallel tasks
Localize - Break the application into independent units
Parallelize – Execute the units in parallel
Picasa tool to resize and upload
Resize takes 5 seconds, upload takes 5 seconds
Single composite task (5 resize-and-upload workers)
10 pictures in 20 seconds
Two independent tasks (5 resize workers, 5 upload workers)
10 pictures in 15 seconds
20 pictures in 25 seconds
Many hands make light work
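The timings above can be checked with a small calculation. The sketch assumes pictures move through each stage in full batches, one batch per round of workers; the class and method names are illustrative.

```java
class PipelineEstimate {
    // Composite workers: each worker resizes and then uploads one picture.
    static int compositeSeconds(int pictures, int workers, int resizeSec, int uploadSec) {
        int batches = (pictures + workers - 1) / workers; // ceiling division
        return batches * (resizeSec + uploadSec);
    }

    // Two independent stages, each with its own pool of workers.
    // A batch can start uploading once it is resized and the
    // upload workers have finished the previous batch.
    static int pipelinedSeconds(int pictures, int workersPerStage, int resizeSec, int uploadSec) {
        int batches = (pictures + workersPerStage - 1) / workersPerStage;
        int resizeDone = 0;
        int uploadDone = 0;
        for (int b = 0; b < batches; b++) {
            resizeDone += resizeSec;
            uploadDone = Math.max(resizeDone, uploadDone) + uploadSec;
        }
        return uploadDone;
    }

    public static void main(String[] args) {
        System.out.println(compositeSeconds(10, 5, 5, 5)); // 20
        System.out.println(pipelinedSeconds(10, 5, 5, 5)); // 15
        System.out.println(pipelinedSeconds(20, 5, 5, 5)); // 25
    }
}
```

The pipelined version wins because the upload workers are busy with one batch while the resize workers prepare the next. Note that it also uses ten workers rather than five.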
Application Architecture – Don’t repeat the effort
Effective use of caching
Cache results that are costly to re-compute
Used effectively, improves the performance
Eg. Currency conversion rates in a Forex calculator
Avoid excessive caching
Monitor cache hit/miss ratio
Avoid caching user information in an online app
Take care of synchronization
Apache JCS, Oracle Coherence
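A minimal in-memory cache with hit/miss counters, as a sketch of the ideas above. It assumes the costly lookup is supplied as a function; RateCache and its methods are hypothetical names, and a real Forex cache would also need expiry because rates go stale.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

class RateCache {
    private final Map<String, Double> cache = new ConcurrentHashMap<>();
    private final Function<String, Double> loader; // the costly computation
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    RateCache(Function<String, Double> loader) {
        this.loader = loader;
    }

    // Returns the cached rate, loading it on the first request only.
    double rate(String currencyPair) {
        Double cached = cache.get(currencyPair);
        if (cached != null) {
            hits.incrementAndGet();
            return cached;
        }
        misses.incrementAndGet();
        // computeIfAbsent runs the loader at most once per key,
        // even if several threads miss at the same time.
        return cache.computeIfAbsent(currencyPair, loader);
    }

    long hits() { return hits.get(); }
    long misses() { return misses.get(); }

    public static void main(String[] args) {
        RateCache rates = new RateCache(pair -> 1.25); // hypothetical fixed rate
        rates.rate("GBP/USD");
        rates.rate("GBP/USD");
        System.out.println("hits=" + rates.hits() + " misses=" + rates.misses());
    }
}
```

A low hit count relative to misses suggests the cache is costing memory without saving work.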
Pool costly resources
Reuse resources that are costly to build
After use, check them into the pool instead of discarding
Connection
Costly to establish
Re-usable across users
Clean the resource before checking into the pool
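The check-out, clean and check-in cycle can be sketched with a blocking queue. SimplePool is a hypothetical class for illustration; production code would normally rely on an established pooling library instead.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;

class SimplePool<T> {
    private final BlockingQueue<T> idle;
    private final Consumer<T> cleaner;

    // Builds all resources up front, paying the creation cost once.
    SimplePool(int size, Supplier<T> factory, Consumer<T> cleaner) {
        this.idle = new ArrayBlockingQueue<>(size);
        this.cleaner = cleaner;
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Blocks until a resource is free.
    T checkOut() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Cleans the resource before making it available to the next user.
    void checkIn(T resource) {
        cleaner.accept(resource);
        idle.add(resource);
    }

    int available() {
        return idle.size();
    }

    public static void main(String[] args) {
        // StringBuilder stands in for a costly resource such as a connection.
        SimplePool<StringBuilder> pool =
                new SimplePool<>(2, StringBuilder::new, sb -> sb.setLength(0));
        StringBuilder sb = pool.checkOut();
        sb.append("user data");
        pool.checkIn(sb);
        System.out.println("available: " + pool.available());
    }
}
```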
Application Architecture – Keep a watch on the traffic
Multi-threaded model
Threads work on the same data
The data is not transferred between workers
Ideal when the job at hand involves huge data
Eg.
Windows registry
All processes work on the same registry
Different processes read/update different parts of the registry.
Data driven model
Data is transferred into worker’s queue
Independent chunks of data
Data size is small
Eg.
Order acceptance system
Order sent to worker to see item availability
Sent to next worker to process payment
Sent to next worker to confirm order
Application Architecture – Buffer data bottlenecks
Input/Output
File systems and other IO are slow
Not good at reading/writing a byte at a time
Read a chunk of data and pass it to application one byte at a time
Network
Sending/fetching data across network is slow and unreliable
Take YouTube for example
The player doesn’t fetch one frame at a time and show it to the user
It keeps reading over the network whether the video is playing or paused
Write the frames into a buffer
Player reads from the buffer and plays the video
Always buffer slow and unreliable peripherals
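The effect of buffering can be made visible by counting how many read calls actually reach the slow source. In this sketch an in-memory byte array stands in for a disk or network stream, and CountingStream is a hypothetical helper.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferDemo {
    // Counts every read call that reaches the underlying source.
    static class CountingStream extends FilterInputStream {
        int calls = 0;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // The application reads one byte at a time in both cases;
    // only the buffered variant fetches chunks from the source.
    static int sourceCalls(int size, boolean buffered) {
        CountingStream source = new CountingStream(new ByteArrayInputStream(new byte[size]));
        try (InputStream in = buffered ? new BufferedInputStream(source) : source) {
            while (in.read() != -1) {
                // consume the byte
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + sourceCalls(100_000, false));
        System.out.println("buffered:   " + sourceCalls(100_000, true));
    }
}
```

The application code is identical in both runs; wrapping the stream is the only change.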
Application Architecture – Bulk Action
A bulk action is usually cheaper than repeating the whole action for each item
Common overhead is spread across the dataset
Uploading photos to Picasa
Authenticate user credentials and login
Establish a connection
Upload a picture
Close the connection
For uploading 10 pictures, you wouldn’t repeat the four steps 10 times
Bulk upload
Authenticate user credentials and login
Establish a connection
Upload first picture
Upload second picture
…
Upload last picture
Close the connection
Always ask for the bulk discount.
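The saving can be counted step by step. Session below is a hypothetical class that only counts the steps it performs; it is not a real Picasa API.

```java
import java.util.Collections;
import java.util.List;

class BulkUpload {
    // Hypothetical session: every operation counts as one step.
    static class Session {
        int steps = 0;
        void authenticate() { steps++; }
        void connect() { steps++; }
        void upload(String picture) { steps++; }
        void close() { steps++; }
    }

    // Repeats all four steps for every picture.
    static int oneAtATime(List<String> pictures) {
        Session session = new Session();
        for (String picture : pictures) {
            session.authenticate();
            session.connect();
            session.upload(picture);
            session.close();
        }
        return session.steps;
    }

    // Pays the login, connect and close overhead once for the whole set.
    static int bulk(List<String> pictures) {
        Session session = new Session();
        session.authenticate();
        session.connect();
        for (String picture : pictures) {
            session.upload(picture);
        }
        session.close();
        return session.steps;
    }

    public static void main(String[] args) {
        List<String> pictures = Collections.nCopies(10, "pic.jpg");
        System.out.println(oneAtATime(pictures)); // 40 steps
        System.out.println(bulk(pictures));       // 13 steps
    }
}
```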
Pseudocode and coding
Select the correct algorithm
The factor that can cause most dramatic change in performance
How to compute the sum of all integers between m and n?
Fast Inverse Square Root
Newton’s Method
x1 = x0 - f(x0)/f'(x0)
Algorithms specific to the problem perform better than generic algorithms
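Two of the examples above in code: summing the integers from m to n (assumed inclusive here) with a loop versus the arithmetic series formula, and Newton's method applied to f(x) = x^2 - a to approximate a square root. The class and method names are illustrative.

```java
class AlgorithmChoice {
    // Straightforward loop: n - m + 1 additions.
    static long sumByLoop(long m, long n) {
        long total = 0;
        for (long i = m; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Arithmetic series: (first + last) * count / 2, constant time.
    // Assumes the product fits in a long.
    static long sumByFormula(long m, long n) {
        if (m > n) {
            return 0;
        }
        return (m + n) * (n - m + 1) / 2;
    }

    // Newton's method for f(x) = x^2 - a, where f'(x) = 2x:
    // x1 = x0 - (x0^2 - a) / (2 * x0). Assumes a >= 0.
    static double newtonSqrt(double a, int iterations) {
        if (a == 0) {
            return 0;
        }
        double x = a > 1 ? a : 1; // starting guess
        for (int i = 0; i < iterations; i++) {
            x = x - (x * x - a) / (2 * x);
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(sumByLoop(1, 100));    // 5050
        System.out.println(sumByFormula(1, 100)); // 5050
        System.out.println(newtonSqrt(2, 20));
    }
}
```

Both sum methods give the same answer, but the formula does the same amount of work for a range of ten numbers or ten billion.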
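Comparison of common sorting algorithms
Avoid using bubble and selection sorts
Use insertion sort when the dataset is small
Merge, heap and quick sorts are the usual general-purpose choices; Arrays.sort() in Java uses a quicksort variant for primitives and a merge sort variant for objects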
Use Arrays.sort() method
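A short sketch of both recommendations: a hand-written insertion sort for small arrays, and Arrays.sort() for everything else. SortDemo is an illustrative name.

```java
import java.util.Arrays;
import java.util.Comparator;

class SortDemo {
    // Insertion sort: in place, simple, and cheap on small or
    // nearly sorted arrays, but O(n^2) in general.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j]; // shift larger elements right
                j--;
            }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] small = {5, 2, 9, 1};
        insertionSort(small);
        System.out.println(Arrays.toString(small));

        int[] numbers = {42, 7, 19, 3, 88};
        Arrays.sort(numbers); // library sort for primitives
        System.out.println(Arrays.toString(numbers));

        String[] titles = {"Yesterday", "Help", "Imagine"};
        Arrays.sort(titles, Comparator.comparing(String::length)); // objects with a comparator
        System.out.println(Arrays.toString(titles));
    }
}
```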
Data structures
Arrays
Easy to loop
Random access of elements
Insert and delete in the middle is difficult
Good at searching when sorted – O(log n) with binary search
Linked Lists
Easy to loop
Random access of elements is not possible
Insert and delete in the middle is easy
Bad at search – O(n)
Binary Search Trees
Easy to loop
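Inserting an element is O(log n) on average, degrading to O(n) when the tree is unbalanced
Searching is O(log n) on average, degrading to O(n) when the tree is unbalanced
Self-balancing structures like red-black trees have search and insert times of O(log n)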
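Why an unbalanced tree degrades to O(n) can be seen by measuring its height. NaiveBst below is a hypothetical, deliberately unbalanced binary search tree written for this illustration.

```java
class NaiveBst {
    private static final class Node {
        final int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    private Node root;

    // Plain BST insert with no rebalancing; duplicates are ignored.
    void insert(int key) {
        if (root == null) {
            root = new Node(key);
            return;
        }
        Node current = root;
        while (true) {
            if (key < current.key) {
                if (current.left == null) {
                    current.left = new Node(key);
                    return;
                }
                current = current.left;
            } else if (key > current.key) {
                if (current.right == null) {
                    current.right = new Node(key);
                    return;
                }
                current = current.right;
            } else {
                return;
            }
        }
    }

    // Number of nodes on the longest path from the root to a leaf.
    int height() {
        return height(root);
    }

    private static int height(Node node) {
        return node == null ? 0 : 1 + Math.max(height(node.left), height(node.right));
    }

    public static void main(String[] args) {
        NaiveBst sortedInput = new NaiveBst();
        for (int i = 1; i <= 100; i++) sortedInput.insert(i);
        System.out.println(sortedInput.height()); // 100: the tree is a linked list

        NaiveBst mixedInput = new NaiveBst();
        for (int key : new int[] {4, 2, 6, 1, 3, 5, 7}) mixedInput.insert(key);
        System.out.println(mixedInput.height()); // 3
    }
}
```

Feeding sorted keys into the naive tree produces one long chain, so every search walks all n nodes. In Java, TreeMap and TreeSet are red-black trees and avoid this case.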
Java Collection framework
Has a collection of useful data structures
Collection
Set – Collection of elements which doesn’t have duplicates
HashSet
O(1) retrieval
TreeSet
Sorted set
O(log n) retrieval
LinkedHashSet
Items maintained in the order of insert
List – Collection of elements with duplicates possible
ArrayList
Array based, random access of elements is easy
LinkedList
Insert and delete are constant time once the position is known (at the ends or through an iterator)
Maps – Key-Value pairs
HashMap
Very good at insert, delete and retrieval
TreeMap
Supports traversal in the sorted order of keys
LinkedHashMap
The traversal is in the order of insert into the map
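The difference between the three Set implementations shows up in iteration order. A small sketch with an illustrative helper:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class OrderDemo {
    // Adds the items to the given set and returns them in the set's iteration order.
    static List<String> inOrder(Set<String> set, String... items) {
        set.addAll(Arrays.asList(items));
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        String[] items = {"pear", "apple", "fig", "apple"};
        System.out.println(inOrder(new HashSet<>(), items));       // no guaranteed order
        System.out.println(inOrder(new TreeSet<>(), items));       // sorted
        System.out.println(inOrder(new LinkedHashSet<>(), items)); // insertion order
    }
}
```

The same distinction applies to HashMap, TreeMap and LinkedHashMap with respect to their keys.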
Java Collection framework – contd.
The structures are not thread-safe by design, which avoids synchronization overhead when it is not needed
Methods to get read-only versions of the collection, such as Collections.unmodifiableList()
Methods to get synchronized versions of the collection, such as Collections.synchronizedList()
Other historical collections
Arrays
Vector
Resizable arrays
Synchronized
Hashtable
Older version of Map
Synchronized
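The read-only and synchronized wrappers from java.util.Collections can be tried out as follows. WrapperDemo and its helpers are illustrative names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WrapperDemo {
    // Returns true if the list refuses modification.
    static boolean rejectsWrites(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Several threads append to the same list; returns the final size.
    static int concurrentAdds(List<Integer> list, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        List<String> readOnly = Collections.unmodifiableList(new ArrayList<String>());
        System.out.println(rejectsWrites(readOnly)); // true

        List<Integer> shared = Collections.synchronizedList(new ArrayList<Integer>());
        System.out.println(concurrentAdds(shared, 4, 1000)); // 4000
    }
}
```

Running concurrentAdds on a plain ArrayList may lose elements or throw, which is the price of skipping synchronization.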
Keeping memory consumption low
As the heap fills up, garbage collections become more frequent
If there is no more memory to recover, the application fails with an OutOfMemoryError
Reduce number of objects created
Any operation that changes the value of an immutable object creates another object
Eg. Strings
Reduce the scope of the objects
Scrutinize class level objects
Don’t store references in long-lived objects
Avoid loading unnecessary classes
Avoid static linking to rarely used heavy libraries
Use “java -verbose:class” to see classes loaded
Collapse smaller classes and anonymous classes into a single class
Multi-threaded application instead of multiple launches of the same application
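The point about immutable objects such as Strings can be illustrated by building the same text in two ways. The class and method names are illustrative.

```java
class StringBuilding {
    // String is immutable, so each += in the loop produces a new String
    // and copies everything accumulated so far.
    static String concat(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += i;
        }
        return result;
    }

    // StringBuilder is mutable: it appends into one growing buffer
    // and creates a single String at the end.
    static String build(int n) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < n; i++) {
            builder.append(i);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat(10)); // 0123456789
        System.out.println(build(10));  // 0123456789
    }
}
```

Both methods return the same text; the second one leaves far fewer temporary objects for the garbage collector.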
Reduce data traffic
Serialization with caution
Serialization is a great way to persist and recover states
Loading serialized state could be faster than recreating the state
Control the fields you want to persist by marking the others with the transient keyword
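In Java, a field marked transient is skipped by default serialization. A minimal round-trip sketch, with a hypothetical Session class:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationDemo {
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;

        String user;
        transient String password; // not written to the stream

        Session(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    // Serializes the session to bytes and reads it back.
    static Session roundTrip(Session session) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(session);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session copy = roundTrip(new Session("alice", "secret"));
        System.out.println(copy.user);     // alice
        System.out.println(copy.password); // null
    }
}
```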
Use of XML
XML is a great tool to exchange information in a platform neutral manner
XML takes considerable bandwidth on the wire
Avoid unnecessary conversion of object to XML and back
Use the right parser
DOM vs SAX vs StAX
Evaluate other formats like JSON
Logging
Excessive logging is trouble
Never log to System.err or System.out
Use logging frameworks
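One benefit of a logging framework is that disabled messages cost almost nothing. The sketch below uses java.util.logging from the JDK; LoggingDemo, expensiveSummary() and the counter are hypothetical and exist only to show when the message is built.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class LoggingDemo {
    private static final Logger LOG = Logger.getLogger(LoggingDemo.class.getName());

    static int expensiveCalls = 0;

    // Stand-in for a costly message, such as dumping a large object.
    static String expensiveSummary() {
        expensiveCalls++;
        return "large state dump";
    }

    static void setLevel(Level level) {
        LOG.setLevel(level);
    }

    static void process() {
        // Supplier form: the message is built only if FINE is enabled.
        LOG.fine(() -> "state: " + expensiveSummary());
        LOG.info("processed one request");
    }

    public static void main(String[] args) {
        setLevel(Level.INFO);
        process();
        System.out.println("expensive calls: " + expensiveCalls); // 0
    }
}
```

Switching the level at run time is enough to turn detailed logging on for troubleshooting, without touching the code.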
Responsive Application
Start-up quick with only the required resources
In a media player application,
Start the application by fetching only the track lists
Lazy loading of costly resources
Fetch album art in the background
Always let the user know
Be interactive
While the album art is loading, display a message
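The media player idea can be sketched with a CompletableFuture: the track list is available immediately, while the album art loads in the background and a message is shown until it arrives. MediaLibrary, loadArt() and the file name are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class MediaLibrary {
    // Cheap data, loaded at start-up.
    final List<String> tracks = Arrays.asList("Track 1", "Track 2");

    // Costly data, loaded lazily on first request.
    private CompletableFuture<String> albumArt;

    // Starts the background load on the first call only.
    synchronized CompletableFuture<String> albumArt() {
        if (albumArt == null) {
            albumArt = CompletableFuture.supplyAsync(MediaLibrary::loadArt);
        }
        return albumArt;
    }

    // Never blocks: returns the art if ready, otherwise a message for the user.
    String artOrPlaceholder() {
        return albumArt().getNow("Loading album art...");
    }

    // Hypothetical slow load, simulated with a short sleep.
    private static String loadArt() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "cover.png";
    }

    public static void main(String[] args) {
        MediaLibrary library = new MediaLibrary();
        System.out.println(library.tracks);
        System.out.println(library.artOrPlaceholder()); // most likely the loading message
        System.out.println(library.albumArt().join());  // waits for the art
    }
}
```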
Debugging tools
Benchmarking
Measurement of memory, time and CPU usage of the application
Compare benchmarks of different approaches
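A very small timing and memory helper, as a sketch of what a benchmark measures. It ignores JIT warm-up and garbage collection pauses, so treat the numbers as rough; a dedicated harness such as JMH is a better fit for serious comparisons. The class and method names are illustrative.

```java
class MicroBenchmark {
    // Runs the task several times and returns the best wall-clock time in nanoseconds.
    static long bestOfNanos(Runnable task, int runs) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    // Approximate heap currently in use by the JVM.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        long nanos = bestOfNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) sum += i;
            if (sum < 0) System.out.println(sum); // keeps the loop from being optimized away
        }, 10);
        System.out.println("best run: " + nanos + " ns");
        System.out.println("used heap: " + usedHeapBytes() + " bytes");
    }
}
```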
Profiling
Profiling tells more about your code execution paths
What methods are called often?
What methods are using the largest percentage of time?
What methods are calling the most-used methods?
What methods are allocating a lot of memory?
Profiling tools
VisualVM
NetBeans Profiler
Graceful Degradation
How should your application behave when the load is too much to handle?
System should never become completely useless
System should never crash
The application could refuse to take new requests and display a message
In certain cases, it's possible to degrade the quality of the results and still keep up the response time
Eg. Search engines
Voice transmission over the network
In places where accuracy is crucial, this is not possible
Scientific modeling
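Refusing new requests under overload can be sketched with a semaphore that limits how many requests run at once. AdmissionControl and the message text are hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

class AdmissionControl {
    static final String BUSY = "Server busy, please try again later";

    private final Semaphore slots;

    AdmissionControl(int maxConcurrent) {
        this.slots = new Semaphore(maxConcurrent);
    }

    // Runs the request if a slot is free; otherwise answers at once
    // with a message instead of queueing work the system cannot handle.
    String handle(Supplier<String> request) {
        if (!slots.tryAcquire()) {
            return BUSY;
        }
        try {
            return request.get();
        } finally {
            slots.release();
        }
    }

    public static void main(String[] args) {
        AdmissionControl control = new AdmissionControl(1);
        // The inner request arrives while the outer one holds the only slot.
        System.out.println(control.handle(() -> control.handle(() -> "inner result")));
        System.out.println(control.handle(() -> "ok"));
    }
}
```

The system stays responsive for the requests it accepted, and the rejected user still gets an answer instead of a hang or a crash.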
References
http://java.sun.com/docs/books/performance/1st_edition/html/JPTOC.fm.html
http://www.javaperformancetuning.com/tips/index.shtml
http://en.wikipedia.org/wiki/Sorting_algorithm
http://java.sun.com/developer/onlineTraining/collections/Collection.html
Thank You