.NET-WebSphere/J2EE Comparison Report
http://www.MiddlewareRESEARCH.com
David Herst with William Edwards and Steve Wilkes
September 2004
[email protected]
1 Disclosures
1.1 Research Code of Conduct
Our research is credible. We publish only what we believe and can stand behind.
Our research is honest. To the greatest extent allowable by law we publish the
parameters, methodology and artifacts of a research endeavor. Where the research
adheres to a specification, we publish that specification. Where the research produces
source code, we publish the code for inspection. Where it produces quantitative results,
we fully explain how they were produced and calculated.
Our research is community-based. Where possible, we engage the community and
relevant experts for participation, feedback, and validation.
If the research is sponsored, we give the sponsor the opportunity to prevent publication if they
deem that publishing the results would harm them. This policy allows us to preserve our
research integrity, and simultaneously creates incentives for organizations to sponsor creative
experiments as opposed to scenarios they can win.
This Code of Conduct applies to all research conducted and authored by The Middleware
Company, and is reproduced in all our research reports. It does not apply to research products
conducted by other organizations that we may publish or mention because we consider them of
interest to the community.
1.2 Disclosure
This study was commissioned by Microsoft.
The Middleware Company has in the past done other business with both Microsoft and IBM.
Microsoft commissioned The Middleware Company to perform this study on the expectation
that we would remain vendor-neutral and therefore unbiased in the outcome. The Middleware
Company stands behind the results of this study and pledges its impartiality in conducting this
study.
1.3 Why are we doing this study? What is our agenda?
First, what our agenda is not: It is not to demonstrate that a particular company, product,
technology, or approach is better than others.
Simple words such as "better" or "faster" are gross and ultimately useless generalizations. Life,
especially when it involves critical enterprise applications, is more complicated. We do our best
to openly discuss the meaning (or lack of meaning) of our results and go to great lengths to
point out the several cases in which the result cannot and should not be generalized.
Our agenda is to provide useful, reliable, and profitable research and consulting services to our
clients and to the community at large.
To help our clients in the future, we believe we need to be experienced in and be proficient in a
number of platforms, tools, and technologies. We conduct serious experiments such as this one
because they are great learning experiences, and because we feel that every technology
consulting firm should conduct some learning experiments to provide their clients with the best
value.
If we go one step further and ask technology vendors to sponsor the studies (with both
expertise and expenses), if we involve the community and known experts, and if we document
and disclose what we're doing, then we can:
1.4 Does a sponsored study always produce results favorable to the sponsor?
No.
Our arrangement with sponsors is that we will write only what we believe, and only what we can
stand behind, but we allow them the option to prevent us from publishing the study if they feel it
would be harmful publicity. We refuse to be influenced by the sponsor in the writing of this
report. Sponsorship fees are not contingent upon the results. We make these constraints clear
to sponsors up front and urge them to consider the constraints carefully before they
commission us to perform a study.
2 TABLE OF CONTENTS

1 DISCLOSURES
  1.1 Research Code of Conduct
  1.2 Disclosure
  1.3 Why are we doing this study? What is our agenda?
  1.4 Does a sponsored study always produce results favorable to the sponsor?
2 TABLE OF CONTENTS
3 EXECUTIVE SUMMARY
  3.1 The Teams
  3.2 The System
  3.3 The Implementations
  3.4 Developer Productivity Results
  3.5 Configuration and Tuning Results
  3.6 Performance Results
  3.7 Reliability and Manageability Results
4 INTRODUCTION
  4.1 How this Report is Organized
  4.2 Goals of the Study
  4.3 The Approach
  4.4 The ITS System
  4.5 Development Environments Tested
  4.6 Application Platform Technologies Tested
  4.7 Application Code Availability
5 THE EVALUATION METHODOLOGY
  5.1 The Teams
    5.1.1 The IBM WebSphere Team
    5.1.2 The Microsoft .NET Team
  5.2 Controlling the Laboratory and Conducting the Analysis
  5.3 The Project Timeline
    5.3.1 Target Schedule
    5.3.2 Division of Lab Time Between the Teams
    5.3.3 Detailed Schedule
  5.4 Laboratory Rules and Conditions
    5.4.1 Overall Rules
    5.4.2 Development Phase
    5.4.3 Deployment and Tuning Phase
    5.4.4 Testing Phase
  5.5 The Evaluation Tests
6 THE ITS PHYSICAL ARCHITECTURE
  6.1 Details of the WebSphere Architecture
    6.1.1 IBM WebSphere
    6.1.2 IBM HTTP Server (Apache)
    6.1.3 IBM Edge Server
    6.1.4 IBM WebSphere MQ
  6.2 Details of the .NET Architecture
    6.2.1 Microsoft Internet Information Services (IIS)
    6.2.2 Microsoft Network Load Balancing (NLB)
    6.2.3 Microsoft Message Queue (MSMQ)
7 TOOLS CHOSEN
  7.1 Tools Used by the J2EE Team
    7.1.1 Development Tools
      7.1.1.1 Rational Rapid Developer Implementation
      7.1.1.2 WebSphere Studio Application Developer Implementation
    7.1.2 Analysis, Profiling and Tuning Tools
  7.2 Tools Used by the .NET Team
    7.2.1 Development Tools
    7.2.2 Analysis, Profiling and Tuning Tools
8 DEVELOPER PRODUCTIVITY RESULTS
  8.1 Quantitative Results
    8.1.1 The Basic Data
    8.1.2 .NET vs. RRD
    8.1.3 .NET vs. WSAD
  8.2 RRD Development Process
    8.2.1 Architecture Summary
      8.2.1.1 RRD Applications
      8.2.1.2 Database Access
      8.2.1.3 Overall Shape of the Code
      8.2.1.4 Distributed Transactions
    8.2.2 What Went Well
      8.2.2.1 Web Interfaces
      8.2.2.2 Web Service Integration
    8.2.3 Significant Technical Roadblocks
      8.2.3.1 Holding Data in Sessions
      8.2.3.2 Web Service Integration
      8.2.3.3 Configuring and Using WebSphere MQ
      8.2.3.4 Handling Null Strings in Oracle
      8.2.3.5 Building the Handheld Module
      8.2.3.6 Miscellaneous RRD Headaches
  8.3 WSAD Development Process
    8.3.1 Architecture Summary
      8.3.1.1 Overall Shape of the Code
      8.3.1.2 Distributed Transactions
      8.3.1.3 Organization of Applications in WSAD
    8.3.2 What Went Well
      8.3.2.1 Navigating the IDE
      8.3.2.2 Building for Deployment
      8.3.2.3 Testing in WebSphere
      8.3.2.4 Common Logic in JSPs
    8.3.3 Significant Technical Roadblocks
      8.3.3.1 XA Recovery Errors from Server
      8.3.3.2 Miscellaneous WSAD Headaches
  8.4 Microsoft .NET Development Process
    8.4.1 .NET Architecture Summary
      8.4.1.1 Organization of .NET Applications
      8.4.1.2 Database Access
      8.4.1.3 Distributed Transactions
      8.4.1.4 ASP.NET Session State
    8.4.2 What Went Well
    8.4.3 Significant Technical Roadblocks
      8.4.3.1 Transactional MSMQ Remote Read
    8.4.4 Miscellaneous .NET Headaches
      8.4.4.1 DataGrid Paging
      8.4.4.2 Web Services Returning DataSets
      8.4.4.3 The Mobile Application
      8.4.4.4 Model Object Class Creation
9 CONFIGURATION AND TUNING RESULTS
10 WEBSPHERE CONFIGURATION AND TUNING PROCESS SUMMARY
  10.1 RRD Round: Installing Software
    10.1.1 Starting Point
    10.1.2 Installing WebSphere Network Deployment
    10.1.3 Installing IBM HTTP Server
    10.1.4 Installing IBM Edge Server
  10.2 RRD Round: Configuring the System
    10.2.1 Configuring JNDI
    10.2.2 Configuring the WebSphere Web Server Plugin
  10.3 RRD Round: Resolving Code Bottlenecks
    10.3.1 Rogue Threads
    10.3.2 Optimizing Database Calls
    10.3.3 Optimizing the Web Service
    10.3.4 Paging Query Results
    10.3.5 Caching JNDI Objects
    10.3.6 Using DTOs for Work Tickets
    10.3.7 Handling Queues in Customer Service Application
  10.4 RRD Round: Tuning the System for Performance
    10.4.1 Tuning Strategy
    10.4.2 Performance Indicators
    10.4.3 Tuning the JVM
      10.4.3.1 Garbage Collection
      10.4.3.2 Heap Size
    10.4.4 Vertical Scaling
    10.4.5 Database Tuning
    10.4.6 Tuning JDBC Settings
    10.4.7 Web Container Tuning
      10.4.7.1 Web Thread Pool
      10.4.7.2 Maximum HTTP Sessions
    10.4.8 Web Server Tuning
    10.4.9 Session Persistence
  10.5 WSAD Round: Issues
    10.5.1 Use of External Libraries and Classloading in WebSphere
    10.5.2 Pooling Objects
    10.5.3 Streamlining the Web Service I/O
    10.5.4 Optimizing Queries
  10.6 Significant Technical Roadblocks
    10.6.1 Switching JVMs with WebSphere
    10.6.2 Configuring Linux for Edge Server, Act 1
    10.6.3 Configuring Linux for Edge Server, Act 2
    10.6.4 Configuring Linux for Edge Server, Act 3
    10.6.5 Configuring JNDI for WebSphere ND
    10.6.6 Edge Server's Erratic Behavior
    10.6.7 Session Persistence
      10.6.7.1 Persisting to a Database
      10.6.7.2 In-Memory Replication
      10.6.7.3 Tuning Session Persistence
    10.6.8 Hot Deploying Changes to an Application
    10.6.9 Configuring for Graceful Failover
      10.6.9.1 Failover Requirements
      10.6.9.2 Standard Topology
      10.6.9.3 Non-Standard Topology
      10.6.9.4 Modified Standard Topology
    10.6.10 Deploying the WSAD Web Service
    10.6.11 The Sudden, Bizarre Failure of the Work Order Application
    10.6.12 Using Mercury LoadRunner
11 .NET CONFIGURATION AND TUNING PROCESS SUMMARY
  11.1 Installing and Configuring Software
    11.1.1 Network Load Balancing (NLB)
    11.1.2 ASP.NET Session State Server
  11.2 Resolving Code Bottlenecks
  11.3 Base Tuning Process
    11.3.1 Tuning the Database
    11.3.2 Tuning the Web Applications
    11.3.3 Tuning the Servers
    11.3.4 Tuning the Session State Server
    11.3.5 Code Modifications
    11.3.6 Tuning Data Access Logic
    11.3.7 Tuning Message Processing
    11.3.8 Other Changes
    11.3.9 Changes to Machine.config
    11.3.10 Changes Not Pursued
  11.4 Significant Technical Roadblocks
    11.4.1 Performance Dips in Web Service
    11.4.2 Lost Session Server Connections
12 PERFORMANCE TESTING
  12.1 Performance Testing Overview
  12.2 Performance Test Results
    12.2.1 ITS Customer Service Application
    12.2.2 ITS Work Order Web Application
    12.2.3 Integrated Scenario
    12.2.4 Message Processing
  12.3 Conclusions from Performance Tests
13 MANAGEABILITY TESTING
  13.1 Manageability Testing Overview
  13.2 Manageability Test Results
    13.2.1 Change Request 1: Changing a Database Query
    13.2.2 Change Request 2: Adding a Web Page
    13.2.3 Change Request 3: Binding a Web Page Field to a Database
  13.3 Conclusions from Manageability Tests
14 RELIABILITY TESTING
  14.1 Reliability Testing Overview
  14.2 Reliability Test Results
    14.2.1 Controlled Shutdown Test
    14.2.2 Catastrophic Hardware Failure Test
    14.2.3 Loosely Coupled Test
    14.2.4 Long Duration Test
  14.3 Conclusions from Reliability Tests
15 OVERALL CONCLUSIONS
16 APPENDIX: RELATED DOCUMENTS
17 APPENDIX: SOURCES USED
  17.1 Sources Used by the IBM WebSphere Team
  17.2 Sources Used by the Microsoft .NET Team
18 APPENDIX: SOFTWARE PRICING DATA
  18.1 IBM Software
  18.2 Microsoft Software
3 EXECUTIVE SUMMARY
This study compares the productivity, performance, manageability and reliability of an IBM
WebSphere/J2EE system running on Linux to that of a Microsoft .NET system running on
Windows Server 2003.
[Summary rating tables: Development Productivity, Tuning Productivity, Performance and Manageability, each compared across .NET vs. RRD, .NET vs. WSAD, and RRD vs. WSAD; the individual ratings are not recoverable here.]
4 INTRODUCTION
Previous studies by The Middleware Company have compared tools or platforms on the basis
of one criterion or another, such as developer productivity, ease of maintenance or application
performance.
This study compares two enterprise application platforms, Microsoft .NET and IBM
WebSphere/ J2EE, across a full range of technical criteria: developer productivity, application
performance, application reliability, and application manageability.
Towards that end, TMC has published the methodology used and the source code for both the
.NET and J2EE application implementations for public download and scrutiny. Customers can
review and comment on the methodology, examine the code, and even repeat the tests in their
own testing environment.
Section 1 discloses the conditions under which The Middleware Company conducted this
study, including our research code of conduct and our policy regarding sponsored studies such
as this one.
Section 3 gives a brief, high-level summary of the study and its results.
Section 7 describes the tools that each team used during the different phases of the study. In
particular, since the J2EE team built two implementations of the system using two different
development tools, this section compares the two IDEs.
This study is the first of its kind to measure all of these criteria, using a novel evaluation
approach. While we expect the study to spark controversy, we also hope it will fulfill two
important goals:
- Provide valuable insight into the Microsoft .NET and IBM WebSphere/J2EE development platforms.
- Suggest a controlled, hands-on evaluation approach that organizations can use to structure their own comparisons and technical evaluations of competing vendor offerings.
4.3 The Approach
This study took the approach of simulating a corporate evaluation scenario. In it, a
development team is tasked with building, deploying and testing a pilot B2B integrated
application in a fixed amount of time, after which we evaluate the results the team was able to
achieve in this time period.
In the study we executed the scenario three times, once using Microsoft .NET 1.1 running on
the Windows 2003 platform, and twice using IBM WebSphere 5.1 running on the Red Hat
Enterprise Linux AS 2.1 platform. (The latter two cases differed in the development tool used;
more on this in Section 4.5.)
We assembled two teams, one for each platform, with comparable skill levels on their
respective platforms. Each team consisted of senior developers experienced in enterprise
application architecture, application development, and/or performance tuning. The rules limited
each team to no more than two members in the lab at any time, but did not require the same
two members for all phases of the exercise.
The IBM WebSphere/J2EE team consisted of three senior developers from The Middleware
Company with 16 years' combined experience in J2EE. Two of these developers built
both J2EE implementations, and all three participated at different times in the deployment,
tuning and testing phases. For the installation, deployment and initial tuning of the WebSphere
platform, the J2EE team also used two independent, WebSphere-certified consultants with a
total of 7 years' experience with the WebSphere platform.
The Microsoft .NET team consisted of three senior developers from Vertigo Software, a
California-based Microsoft Solution Provider, with a combined 10 years' experience building
software on Microsoft .NET.
The Middleware Company took pains to keep the study free of vendor influence.
It is important to note that this study represents what the development teams could achieve
using only publicly available technical materials and vendor support channels for their platform.
It does not represent what the vendors themselves might have achieved, nor what each team
might have achieved if given a longer development and tuning schedule or allowed direct
interaction with vendor consultants. Therefore, the resulting applications developed by the two
teams may not fully represent vendor best practices or vendor-approved architectures. Rather,
they reflect what customers themselves might achieve if tasked with independently building
their own custom application using publicly available development patterns, technical guidance
and vendor support channels.
4.4 The ITS System
The ITS system serves ITS-FMC, a facilities management company whose corporate clients
use a Web-based application to create and track work order requests for facilities management
on their corporate premises.
The ITS system comprises three core subsystems that operate together in both a loosely
coupled fashion (via messaging) and a tightly coupled fashion (via synchronous Web Service
requests):
The ITS Customer Service Application. ITS-FMC's corporate clients use this Web-based
application to create and track work order requests for facilities management at their
premises. The application automatically dispatches work order requests via messaging to
the central ITS system, which operates across the Internet on a separate ITS-FMC internal
network. The ITS Customer Service Application also allows customers to track the status
of their work orders via Web service calls to the ITS central system, as well as view/modify
customer and user information.
The ITS Central Work Order Processing Application. This application is operated by
ITS-FMC itself on a separate corporate network. The application receives incoming work
order requests (as messages) from the ITS Customer Service Application. It places the
requests into a database for further business processing, including assignment to a specific
on-site technician. The application hosts the Web service that returns work order status
and historical information to the ITS Customer Service Application. Additionally, this
application has a Web user interface that ITS-FMC's central dispatching clerks can use to
search, track and update work order requests, as well as query customer information and
query/modify technician data.
The Technician Work Order Mobile Device Application. This application operates on a
handheld device, allowing technicians to retrieve their newly assigned work items and
update work order status as they complete their work orders at the customer premises.
Technicians use this application for dispatching purposes, and to log the time spent
working on an issue so that customer billing can occur.
The following diagram illustrates these three subsystems and their interactions:
[Diagram: the ITS Customer Service Application, the ITS Work Order Message Queue Server, the ITS Work Order Processing Application, and the Technician Mobile Device Application, linked by B2B Internet connectivity.]
4.5 Development Environments Tested
While there are .NET development tools from third-party vendors such as Borland, the vast
majority of .NET development is done using Visual Studio .NET from Microsoft. This is the
development environment that the .NET team used to produce its implementation.
The J2EE world, on the other hand, offers many competing development tools with different
approaches and advantages. Even within IBM's domain, choices exist. To reflect this range of
offerings and enhance the study's usefulness, we had the J2EE team develop two different
implementations of the ITS system using two different IBM tools: Rational Rapid Developer
(RRD) and WebSphere Studio Application Developer (WSAD). Since both IDEs belong to IBM
and are designed to work well with WebSphere, they are both consistent with the study's focus
on the IBM WebSphere platform. But the two IDEs have important differences that ultimately
led to different results.
Details on these tools, how they compare, how they were used, and other development
software used with them can be found in Section 7.
Finally, customers and vendors can discuss the report, propose further testing, or offer
comments by emailing The Middleware Company at: [email protected].
5 THE EVALUATION METHODOLOGY
This study was designed to simulate two enterprise development teams given a fixed amount of
time to build and tune a working pilot application according to a set of business and technical
requirements. One team developed the application using IBM WebSphere running on Linux,
while the other team developed the application using Microsoft .NET running on Windows
2003.
Development took place in a controlled laboratory environment where the time taken to
complete the system was carefully measured. The two teams worked from a common
application specification derived from a set of business and technical requirements. Neither
team had access to the specification until development started in the controlled lab setting.
After developing an implementation, the team then tuned and configured it as part of a
measured deployment phase. Each implementation was then put through a set of basic
performance, manageability and reliability tests while running under load on the production
equipment. Hence this study not only compares the relative productivity achieved by each
development team, but also captures the base performance, manageability and reliability of
each application in a deployed production environment.
It is extremely important to note that the study allocated a fixed amount of time to each phase
of the project, and hence objectively documents what each team was able to achieve in this
fixed amount of time,[1] inclusive of detailed notes documenting technical roadblocks
encountered by each team and how these were resolved. As such, the study tells an
interesting story that will undoubtedly spark much debate, but also sheds valuable light on each
platform based on actual hands-on development and testing of a pilot business application.
Neither team included any representative from either IBM or Microsoft, and neither team was
allowed any direct interaction with vendor technicians from IBM or Microsoft other than the
standard online customer support channels available to any customer. In cases where a team
used a vendor support channel, support technicians were not told they were assisting a
research project conducted by The Middleware Company; so the team received only the
standard treatment afforded any developer on these channels.
To mirror the development process of a typical corporate development team, we allowed the
teams to consult with other members of their organizations outside the lab, to answer technical
questions and provide guidance as required. Such access to external resources was
monitored and logged, and we extended the rule prohibiting direct vendor interactions (other
than with standard customer support channels) to all resources contacted during the
development and testing phases of the project.
Here are details on the makeup and experience of the two teams.
[1] Note that under certain circumstances we allowed a team to go beyond that fixed time period. See Section 5.3.1 for details.
5.1.1 The IBM WebSphere Team
The WebSphere team consisted of three developers from The Middleware Company, described
in the following table. Members A and B developed both the RRD and WSAD implementations,
while all three members participated at different times in the tuning and testing phases.
Member | Development (years) | Java (years) | J2EE (years) | Other Relevant Experience
B | 15 | 8 | 6* | Experienced in RRD, modeling and design.
C | 23 | 8 | 6* | Extensive experience in tuning enterprise applications for performance.
[Row for Member A not recoverable.]
* Includes experience with the Java servlet API predating the introduction of J2EE in 1999.
Additionally, the J2EE team used two independent, IBM-certified WebSphere consultants at
different times during the deployment and tuning phase.
- One had three years' experience as a WebSphere administrator on various Unix platforms, including Linux.
- The other had over four years' experience installing, configuring and supporting IBM WebSphere on multiple platforms, including Linux.
5.1.2 The Microsoft .NET Team
Member | Development (years) | Microsoft Platform (years) | .NET (years) | Other Relevant Experience
A | 7 | 7 | 3 | Experienced in Web application development and design.
C | 7 | 5 | 3 | Experienced in development and performance tuning.
[Row for Member B not recoverable.]
5.2 Controlling the Laboratory and Conducting the Analysis
The Middleware Company subcontracted a third-party testing organization, CN2 Technology, to
write a specification for the ITS system, set up the lab environment, design the tests, monitor
and control the testing environment, and conduct the actual tests of the J2EE and .NET
implementations. CN2 strictly monitored the time spent by each development team on the
various phases of the project, and controlled the lab environment. CN2 also strictly monitored
Internet access and email access, including logging all such access from within the lab, to
ensure that neither team violated the rules of the lab.
5.3 The Project Timeline
5.3.1 Target Schedule
Phase 1 (Development): 10 days
Phase 2 (Deployment and Tuning): 10 days
Phase 3 (Testing): final week of the exercise
While we felt confident that the teams could complete Phases 1 and 3 in the allotted time, we
were less certain about Phase 2. If, after ten days of deployment and tuning, the
implementation did not perform up to even minimal standards, the results of formal testing in
Phase 3 would have little meaning.
So we added a requirement that each team continue their configuration and performance
tuning until satisfied that their implementation would perform well enough to actually undergo
the tests in the final week. This meant that each team was allowed to go beyond their allotted
ten days if they desired, with the understanding that all time spent would be monitored and
reported.
5.3.2 Division of Lab Time Between the Teams
To keep the two teams from communicating with each other, while at the same time preserving
the continuity of their work, we interleaved their time in the lab in the following sequence:
.NET 1: Development
RRD 1: Development
WSAD* 1: Development
5.3.3 Detailed Schedule

Desired Development and Testing Schedule (established prior to start of exercise)

Schedule Timeline | Task/Event | Description
Phase 1: Development
Day 1 (1 hour) | Overview of lab rules and hardware environment. | Team was introduced to the lab environment for the first time, lab rules were explained, and a walkthrough of the hardware was conducted.
[Intervening rows not recoverable.]
Day 11 | Review of base performance, manageability and reliability tests and requirements, including review of Mercury LoadRunner test scripts and test tool. | CN2 reviewed with the team the tests to be performed and technical requirements/goals for these tests. CN2 provided a walkthrough of the Mercury LoadRunner testing environment and base test scripts so the team could begin configuring and tuning.
Days 11-20+ | Application performance and configuration tuning. | Ten 8-hour days were initially allotted for tuning in preparation for evaluation tests. However, the team was allowed more time if required to ensure they felt ready to conduct the actual tests.
5.4 Laboratory Rules and Conditions
5.4.1 Overall Rules
- Team members could only use the provided machines for development work and Internet access. Personal laptops were barred from the lab.
- Each day was limited to 8 hours working time in the lab, with an additional hour for lunch.
- The team could seek technical support and guidance from other members of their organization outside the lab as required. They could communicate via telephone or email.
- Neither team members nor their offsite colleagues could have any interaction with vendor technicians from IBM or Microsoft, other than through standard online customer support channels.
- If they did use vendor support channels, team members could not reveal that they were participating in a study involving IBM and Microsoft software; they received only the standard treatment afforded any developer on these channels.
5.4.2 Development Phase
Note, however, that the WSAD implementation was developed after the RRD implementation,
and was developed offsite, not in the controlled lab environment.
Each team's lab environment initially included:
- A development machine for each developer, pre-configured with Windows XP and Internet access.
- Two machines with the two ITS databases pre-installed and pre-populated with data. The database server was Microsoft SQL Server for the .NET team, Oracle for the WebSphere team.
- Four application server machines pre-configured with the base OS installation only (Windows Server 2003 for the .NET team, Red Hat Enterprise Linux 2.1 for the WebSphere team).
As for augmenting or modifying this initial environment, both teams were under the same
restrictions:
- They had to install/configure their development environment (tools, source control, etc.) as part of the measured time to complete the application development phase.
- They had to install the application server software separately on each server as part of the measured development time.
- They could not make changes to the database schemas, other than adding functions, stored procedures, or indexes.
This rule applied specifically to coding of the RRD and .NET implementations:
- Team members were not allowed to work on code outside the lab. This meant they could not remove code from or bring code into the lab.
For all implementations (RRD, WSAD and .NET) this rule applied:
- We allowed use of publicly available sample code and publicly available pre-packaged libraries, since a typical corporate development team would also have access to such code.
5.4.4 Testing Phase
The rules for Phase 3 were the most restrictive, since this phase consisted of the formal
evaluation tests conducted by the CN2 auditor:
- Team members could not modify application code or system configurations except as needed during a test.
- After a load test was launched, the team would have to leave the lab until the test reached completion (typically 1-4 hours later).
Most of the tests were performed under load. As mentioned above, in this study Mercury
LoadRunner running on 40 client machines was used to simulate load. CN2 provided the
teams with a set of LoadRunner scripts for each implementation.
The three sets of scripts were carefully constructed to perform the same set of actions,
ensuring that they tested the exact same functionality for each implementation in a consistent
manner.[2]
Here is a summary of the tests performed; for more details and for test results see Sections 12 to 14.
- Performance capacity (stress test). How many users can the system handle before response times become unacceptable or errors occur at a significant rate?
- Performance reliability. Given a reasonable load (based on the results of the stress test), how reliably does the system perform over a sustained period (say, 12 hours)?
- Efficiency of message processing. How quickly can the Work Order module process a backlog of messages in the queue?
- Ease of implementing change requests. How quickly and easily can a developer implement a requested change to the specification?
- Ease and reliability of planned maintenance. How easily and seamlessly can system updates be deployed to the system while under load?
- Graceful failover. How well does the clustered Customer Service module respond when an instance goes down?
- Session sharing under load. If one of the clustered Customer Service instances fails under load, are the sessions that were handled by the failed instance seamlessly resumed by the other Customer Service instance?
[2] CN2 could not provide a single set of scripts for all three implementations because the three differed in certain low-level details, such as the URL of a given page, the names of fields in that page, and whether that page was to be invoked with GET or POST.
6 THE ITS PHYSICAL ARCHITECTURE
This section describes the hardware and software infrastructure each team used to run its
implementation of the ITS system.
The specification required that the teams deploy to identical hardware; in fact, they used the
same hardware. On the machines hosting the applications and the message server, each team
had its own removable hard drive that was swapped in. On the machines hosting databases,
the two teams' DBMSs shared the same drive, but were never run simultaneously. In this way
all three implementations used the very same processors, memory and network hardware.
On the software side, the teams started with the operating systems and database engines
already installed. They were responsible for installing the application server, message server,
load balancing software and handheld device software.
This table lists the hardware and software used by each team:
ITS Subsystem | Servers | Hardware | .NET Software | J2EE Software
[Table rows not recoverable.]
The following diagram shows the physical deployment of the ITS system to the network,
including all the machines listed above. It also shows the machine hosting the Mercury
LoadRunner controller and the 40 machines providing client load.
6.1 Details of the WebSphere Architecture
6.1.1 IBM WebSphere
Initially the team included three nodes in the WebSphere network: the two Customer Service
machines and the single Work Order machine. Later they included the Message Queue Server
machine as well, so that they could run a WebSphere instance there for sharing session state
in the Customer Service application.
In terms of WebSphere instances, the team started with one per node. Along the way they
experimented with multiple instances per node (for example, to run each Work Order module in
a dedicated instance), but found no improvement and returned to the original configuration.
6.1.2 IBM HTTP Server (Apache)
Using an external Web server necessitates the use of IBM's Web Server Plugin, an interface
between the Web server and the WebSphere HTTP transport. The plugin consists of a native
runtime library and an XML configuration file, plugin-cfg.xml. Applying the plugin consists of
these steps:
Note that along the way the team found reason to customize the plugin configuration in ways
not possible through the Deployment Manager. That meant they departed from the normal
plugin configuration update process described in Step 3. For details, see Section 10.6.9.4.
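To make the plugin's role concrete, below is a skeletal sketch of the general shape of a plugin-cfg.xml; the cluster, server, host and URI names are hypothetical, and a real file generated by WebSphere's Deployment Manager contains considerably more detail:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <Config>
      <!-- Virtual hosts the plugin routes for -->
      <VirtualHostGroup Name="default_host">
        <VirtualHost Name="*:80"/>
      </VirtualHostGroup>
      <!-- Cluster members the plugin can forward requests to -->
      <ServerCluster Name="CustomerServiceCluster" LoadBalance="Round Robin">
        <Server Name="csNode1_server1">
          <Transport Hostname="cs-node1" Port="9080" Protocol="http"/>
        </Server>
        <Server Name="csNode2_server1">
          <Transport Hostname="cs-node2" Port="9080" Protocol="http"/>
        </Server>
      </ServerCluster>
      <!-- URIs that belong to WebSphere rather than to the Web server itself -->
      <UriGroup Name="customer_service_URIs">
        <Uri Name="/customerservice/*"/>
      </UriGroup>
      <!-- Ties virtual hosts, URIs and cluster together -->
      <Route VirtualHostGroup="default_host"
             UriGroup="customer_service_URIs"
             ServerCluster="CustomerServiceCluster"/>
    </Config>

The Route element is what the native library consults on each request to decide whether to serve a URI locally or forward it to a cluster member.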
6.1.3 IBM Edge Server
To handle both load balancing and failover, the WebSphere team decided to use IBM's preferred
solution, Edge Server.
This component sits in front of the Web servers and balances load among them. But it also
monitors the health of the Web servers and channels traffic away from one that fails.
The team installed Edge Server on the MQ server host, because that machine was guaranteed
not to go down. Then they had to configure that host and the Customer Service hosts at the
operating system level for Edge Server to work properly. These configuration requirements led
to some of the most vexing problems faced by the WebSphere team, as discussed in Section
10.6.
6.1.4 IBM WebSphere MQ
The WebSphere team used IBM's WebSphere MQ Series for its message server. MQ was
installed on the host designated for that purpose. Using it also required that host to have an
instance of WebSphere, whose JMS server acts as a front end for MQ.
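Application code reaches MQ through the standard JMS API rather than MQ-specific calls. As a rough illustration only (the class and JNDI names are hypothetical, not taken from the study's source code), sending a work order message looks something like this:

    // Hypothetical sketch of a JMS send through WebSphere's JMS/MQ setup.
    // JNDI names are invented for illustration; error handling is minimal.
    import javax.jms.Queue;
    import javax.jms.QueueConnection;
    import javax.jms.QueueConnectionFactory;
    import javax.jms.QueueSender;
    import javax.jms.QueueSession;
    import javax.jms.Session;
    import javax.naming.InitialContext;

    public class WorkOrderSender {
        public static void send(String workOrderXml) throws Exception {
            InitialContext ctx = new InitialContext();
            // Administered objects configured in WebSphere and bound into JNDI
            QueueConnectionFactory factory =
                (QueueConnectionFactory) ctx.lookup("jms/WorkOrderQCF");
            Queue queue = (Queue) ctx.lookup("jms/WorkOrderQueue");

            QueueConnection connection = factory.createQueueConnection();
            try {
                QueueSession session =
                    connection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
                QueueSender sender = session.createSender(queue);
                sender.send(session.createTextMessage(workOrderXml));
            } finally {
                connection.close(); // also closes the session and sender
            }
        }
    }

Because the code depends only on JMS interfaces, the same pattern would work against any JMS provider; here the administered objects resolve to MQ-backed resources.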
6.2 Details of the .NET Architecture
6.2.1 Microsoft Internet Information Services (IIS)
ASP.NET, the Web application engine for .NET applications, is integrated directly with IIS 6.0.
In addition, Visual Studio enables developers to deploy applications to production servers or
staging servers directly from their development machines, a feature that the .NET development
team utilized during development.
6.2.2 Microsoft Network Load Balancing (NLB)
The details on how the .NET team configured NLB for the ITS system are found in Section
11.1.1.
6.2.3 Microsoft Message Queue (MSMQ)
Like IIS and NLB, MSMQ also comes built into Microsoft Windows 2003. The .NET team had
to enable MSMQ and create and configure the queues for the application. .NET provides
classes for accessing and manipulating the queues. As per the specification, a separate,
dedicated queue server was used for message queuing, with the Customer Service application
writing to the remote queue on this server, and the Work Order application reading messages
from this remote queue for processing.
7 TOOLS CHOSEN
Each team had the freedom to choose any development, analysis, profiling and support tools
they wished to complete their work for their platform. This section describes the various tools
they chose.
7.1 Tools Used by the J2EE Team
7.1.1 Development Tools
RRD is a model-driven, visual tool that provides O/R mapping and data binding technology
and generates J2EE code from visual constructs. WSAD is a more mainstream J2EE
development tool dedicated to WebSphere.
These two IDEs have important differences that pertain to this study:
- RRD's approach emphasizes developer productivity. But the code it generates is not optimized for performance and does not lend itself to manual tuning.
- WSAD's approach requires the developer to write much more code manually, but gives the developer complete freedom to optimize that code.
- While both tools work well with WebSphere, WSAD integrates more tightly and provides a lightweight version of WebSphere for development testing.
Comparing Rational Rapid Developer (RRD) and WebSphere Studio Application Developer (WSAD) as Development Tools

Aspect of Development | RRD | WSAD
[Earlier rows of this table not recoverable.]

Approach to page development:
- RRD: Has you place controls in a page design space, then bind them to data objects from your class model. Each page is served by its own subset of classes and attributes from the model.
- WSAD: Again, more conventional: you write business logic code to be used in standard JSPs, then write the JSPs themselves. If desired you can use Struts.

Configuring WebSphere:
- RRD: Has platform settings for WebSphere that let you specify JDBC datasources, JMS message queues and other critical resources. But these settings affect the application only, not the target platform. You must still configure WebSphere directly.
- WSAD: Lets you configure your target platform, whether the WebSphere test environment or a real WebSphere instance, through the IDE. Conversely, you can also configure WSAD's test environment through a standard WebSphere admin console just as you would the real WebSphere.
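Whichever tool configures them, the resources themselves are reached the same way at runtime: the application retrieves a container-managed datasource from JNDI. A minimal, generic sketch of that lookup (the JNDI name is hypothetical, not taken from the study's code):

    // Generic J2EE datasource lookup; the JNDI name is illustrative only.
    import java.sql.Connection;
    import javax.naming.InitialContext;
    import javax.sql.DataSource;

    public class ConnectionHelper {
        public static Connection getConnection() throws Exception {
            InitialContext ctx = new InitialContext();
            // Resolves to the connection pool WebSphere manages for this datasource
            DataSource ds = (DataSource) ctx.lookup("jdbc/CustomerServiceDS");
            return ds.getConnection();
        }
    }

This is why the RRD settings "affect the application only": they determine which JNDI names the generated code looks up, while the pools behind those names must still be defined in WebSphere itself.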
- They used RRD to build the two Web applications (Customer Service and Work Order), the Work Order message consumption module and the Work Order Web service, which answers work ticket queries from the Customer Service application.
- RRD was not suited for developing the handheld module, however. For that piece the team used Sun One Studio, Mobile Edition.
- During the tuning phase they developed a small library of custom classes to solve some performance bottlenecks. They used TextPad to write the classes and the Java Development Kit (JDK) to compile and package the library.
- For source control of the RRD code, the team used Microsoft Visual SourceSafe, which integrates nicely with RRD.
Although WSAD works with certain source control software, including CVS and Rational
ClearCase, the team did not use either for this implementation. Instead they simply divided the
work carefully and copied changed source files between their two development machines.
7.1.2 Analysis, Profiling and Tuning Tools
- WebSphere's tracing service. A crude runtime monitor built into WebSphere. From the admin console you can select all the different activities you want to monitor in WebSphere; the list covers everything the server does. You choose the categories, restart the server, and see the output in a log file.
- IBM Tivoli Performance Viewer (TPV). A profiler that integrates easily with WebSphere. It displays a wide range of performance information. TPV also has a performance advisor that recommends changes for better performance.
- VERITAS Indepth for J2EE. A sophisticated profiler that lets you measure the performance of code to almost any desired granularity.
- Borland Optimizeit. Another profiling tool; it gave the team important information about thread usage which Indepth could not provide.
- Oracle Enterprise Manager. The team used this tool to manage and tune the database, for example to adjust the size of Oracle's buffer cache. But Enterprise Manager also has a suite of analysis tools that the team used from time to time. By far the most useful was Top SQL, which gives valuable statistics on the SQL statements executed against the database.
- top and Windows Performance Monitor. The team used these simple tools to monitor CPU usage on the Linux and Windows machines respectively.
7.2 Tools Used by the .NET Team
7.2.1 Development Tools
For Web development, Visual Studio includes a feature that makes deployment fairly easy.
The Copy Project mechanism allows a developer to deploy a Web application to any machine
with IIS installed.
The .NET team also used Visual Studio to develop the handheld application, since they chose
to target a Microsoft Windows Mobile 2003-based Pocket PC, which includes the .NET
Compact Framework. To develop the application, the team used Visual Studio's Pocket PC
emulator; for testing and deployment, they used the real device. With Visual Studio, deploying
to a real device was straightforward.
To help them analyze database activity, the team used Microsoft SQL Server 2000 tools.
8 DEVELOPER PRODUCTIVITY RESULTS
The focus during the development phase of the project was on developer productivity: how
quickly and easily can a team of two developers build the ITS system to specification?
Section 8.1 presents the quantitative productivity results of the development phase. The rest of
Section 8 details the experiences of the two development teams: the architecture they chose
for their implementations, what went well for them during the development phase, and the
major roadblocks they encountered.
The WSAD implementation, on the other hand, was built later and under special circumstances (described below). For these reasons the auditor's report provides productivity results only for the .NET and RRD implementations, while issuing a disclaimer regarding the WSAD results.
Installing Products. This included time to install software on both development and
server machines. All equipment used by the two teams initially had only a core OS
installation, except for the two databases (Customer Service and Work Order) which were
already installed and pre-loaded with an initial data set.
Building the Customer Service Web Application. This included constructing the Web UI
and backend processing for the Customer Service application according to the provided
specification, as well as the functionality to send messages. It also included creating the
Web service request that provides the ticket search functionality in the Customer Service
application, and ensuring the application could be deployed to a cluster of two load-
balanced servers with centralized server-side state management for failover.
Building the Work Order Processing Application. This included building the Web UI
and backend processing for the Work Order application according to the provided
specification, as well as the message handling functionality. It also included creating the
Web service for handling ticket search requests from the Customer Service application.
Building the Technician Mobile Device (Handheld) Application. This development task
included building the complete mobile device application according to the provided
specification.
System-Wide Development Tasks. This category included working out general design
issues, writing shared code and general deployment and testing.
The following table shows the actual time spent building the .NET and RRD implementations, in developer hours. The data come from the auditor's report:

Development Task                   .NET    RRD
Customer Service Application         40     69
System-Wide Development Tasks         2     29
Subtotal                             83    157
Product Installs                      4     22
The WSAD implementation was created later by the same team that had previously created the RRD implementation. It was also created outside of the controlled lab setting. Hence, productivity data for this implementation cannot be directly compared to the other two, since the team benefited from having already built the same application once. In addition, the team did not reinstall the WebSphere software or redevelop the handheld application for the WSAD implementation.
Nevertheless, the following table shows the relative time spent developing the WSAD implementation of the ITS system. The data come from the developers' logs.

Development Task                   WSAD
Customer Service Application         13
System-Wide Development Tasks        33
Subtotal                             92
Given how easily two developers working closely together can move quickly among several
tasks, one should not read too much precision into the breakdown of these numbers by
development task. Nevertheless, some interesting conclusions emerge:
One of the greatest differences was for product installation. This is not surprising, since several key server-side .NET components were already present as part of the base installation of Windows Server 2003.
Another significant difference was in developing the Mobile Device piece, where the J2EE team
ran into some roadblocks. (See Section 8.2.3.5 for details.)
3. As noted elsewhere, the base Linux installation included an installation of the Apache Web server, but the team chose to use IBM's version instead.
Even within the core development (the Customer Service and Work Order applications), the .NET team was more productive. Much of the explanation may lie in the simple fact that Visual Studio .NET is the dominant .NET tool, and a developer who has worked in .NET for 3 years has probably worked in VS.NET most or all of that time. In the J2EE world, by contrast, RRD is one of many tools, and a comparatively new one at that. The .NET team was undoubtedly more experienced with their tool than the J2EE team was with theirs.
Another factor may be the differing approaches taken by the two tools. VS.NET is more
comparable to WSAD than to RRD: a development environment that connects you directly and
explicitly to the platform on which you are developing. RRD, on the other hand, is marketed as
a rapid development tool that accelerates the development process via its model-driven
approach. RRD distances the developer from J2EE, and the team found that it simplified some
tasks but complicated others where low-level code access would have been desirable. In a
wide-ranging development project like ITS, RRD's weaknesses may have outweighed its particular strengths.
Although the teams did not track the time spent performing different types of tasks (such as
designing a Web page vs. coding database access logic), some inferences are possible. Both
RRD and Visual Studio provide excellent GUI design tools and the ability to bind data objects to
fields in a page. It is likely that the two tools offered much more similar productivity in this area,
and that the greatest differences lay in other aspects of application development, such as
coding the Customer Service logic to create and manage new work tickets in memory.
The main reason for the higher total under common tasks is that the J2EE team developed frameworks for the Web, business logic and persistence tiers. For example, their custom-built base servlet class provided much of the functionality needed by all the servlets in the two Web applications. This design reduced the time spent developing individual use cases, while increasing the proportion of time spent on common tasks.
The Customer Service application, including the Customer Service Web interface and
message production.
The Work Order console application, which includes message consumption and the Web
service used by the Customer Service application. When RRD builds this application, the
Web service is packaged as a separate EAR file and must be installed separately.
The Work Order Web application, which stands by itself.
8.2.1.2 Database Access
On the back end, the team chose to forgo stored procedures and stick with explicit SQL through JDBC. The fact that RRD would generate JDBC logic automatically weighed against writing stored procedures. The team knew from prior experience that, for single database actions, a prepared statement invoked via JDBC performs at least as well as a stored procedure. So they expected that RRD's generated logic would suffice for basic CRUD operations (which covered most cases).
There were cases, however, where they needed to customize that logic. For example:
Work ticket searches. Both the Work Order Web application and the Web service used
by the Customer Service Web application allow ticket queries based on various
combinations of criteria. For example, the Work Order Web application allows queries by
any combination of customer ID, ticket creation date, work type, ticket status and technician
assignment. The developers had to write code that determined which search criteria were
used and constructed a custom SQL statement that used only those criteria.
Customer search. The Work Order Web application has a customer query function based
on partial match of customer name. For every customer found it returns the number of
tickets in each of three ticket status categories (created, in progress, completed). By
default, RRD's generated code would have separately counted tickets in each category for each customer, in other words three additional SQL actions per row of customer data returned. Developer B reduced that number to one action per customer by using a custom SQL statement with a GROUP BY clause to get all three counts in one action.
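To make the technique concrete, here is a minimal sketch, not the team's actual code, of issuing the query from footnote 4 as a single JDBC prepared statement. The class and method names are invented for illustration:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper: one SQL action returns a ticket count per
// status value for the given customer.
public class TicketCounts {
    public static int[] countsForCustomer(Connection con, String customerId)
            throws SQLException {
        String sql = "SELECT count(ticketid) FROM worktickets "
                   + "WHERE customerid = ? GROUP BY ticketstatus "
                   + "ORDER BY ticketstatus";
        PreparedStatement ps = con.prepareStatement(sql);
        try {
            ps.setString(1, customerId);
            ResultSet rs = ps.executeQuery();
            int[] counts = new int[3]; // created, in progress, completed
            for (int i = 0; i < counts.length && rs.next(); i++) {
                counts[i] = rs.getInt(1);
            }
            // Note: a status with no tickets yields no row at all, so real
            // code would also need to key each count by its status value.
            return counts;
        } finally {
            ps.close();
        }
    }
}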
Later, during the tuning phase, the team discovered that some of the RRD-generated code
performed poorly. In response they added other JDBC optimizations. See Section 10.4.9 for
specifics.
Persistence tier. RRD lets you choose EJB entity beans or plain old Java objects (POJOs) for database operations. The team decided to avoid the overhead of entity beans
and went with POJOs.
Business tier. RRD offers session beans and POJOs. Again the team chose the latter because of their lower overhead. One exception was the code for message production in the
Customer Service application; RRD wraps that code in stateless session beans.
Web tier. For Web pages, RRD offers JSPs or ordinary servlets. Since neither would be edited directly, the choice was to a large degree arbitrary. The team chose straight servlets. Note also that RRD has its own Web application framework, so the use of an external MVC framework such as Struts was not considered.
Message consumption. EJB message-driven beans (MDBs) have long been the
accepted technique for consuming JMS messages within an application server. They are
simple and lightweight, and RRD generates them by default. The team did not deviate
from that choice.
4. Here is the exact SQL statement:
SELECT count(ticketid) FROM worktickets WHERE customerid = ? GROUP BY ticketstatus ORDER BY ticketstatus
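For readers unfamiliar with the message consumption pattern mentioned above, a generic EJB 2.x message-driven bean looks roughly like the following sketch; it is an illustration, not RRD's generated code, and the class name is invented:

import javax.ejb.MessageDrivenBean;
import javax.ejb.MessageDrivenContext;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// The container calls onMessage() for each JMS message arriving
// on the queue this bean is bound to.
public class WorkOrderMDB implements MessageDrivenBean, MessageListener {
    private MessageDrivenContext ctx;

    public void setMessageDrivenContext(MessageDrivenContext ctx) { this.ctx = ctx; }
    public void ejbCreate() { }
    public void ejbRemove() { }

    public void onMessage(Message msg) {
        try {
            if (msg instanceof TextMessage) {
                String body = ((TextMessage) msg).getText();
                // ... process the work order or customer update here ...
            }
        } catch (Exception e) {
            // Rolling back the transaction leaves the message on the queue.
            ctx.setRollbackOnly();
        }
    }
}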
8.2.1.4 Distributed Transactions
Those database actions stemming from JMS message processing required a distributed transaction to span the database update and the JMS action. Both sides of the system, Customer Service and Work Order, had such requirements.
Because distributed transactions require a two-phase commit, they are slower than one-phase
transactions within a single database. So the team did not want to use distributed transactions
for all database actions. Instead they set up two JDBC data sources to each database: one using an ordinary driver for simple transactions, the other using an XA-capable driver for distributed transactions.
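In code, the split might look like the following sketch; the JNDI names are invented, since the report does not give the actual ones:

import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class DataSources {
    // Two data sources to the same database: a plain driver for one-phase
    // work, an XA-capable driver for distributed (two-phase) transactions.
    public static DataSource plain, xa;

    public static void init() throws NamingException {
        InitialContext ctx = new InitialContext();
        plain = (DataSource) ctx.lookup("jdbc/WorkOrderDS");   // hypothetical name
        xa    = (DataSource) ctx.lookup("jdbc/WorkOrderXADS"); // hypothetical name
    }
}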
This part went very smoothly. RRD's facilities for building Web pages and linking them to data structures are two of its strong points. By the end of Day 3 (where Day 1 was devoted mostly
to installing software), the team had completed much of the simple logic linking the Web
interfaces to database actions.
Other aspects of the Web service piece caused confusion and loss of time. See Section
8.2.3.2 below.
Here, however, Developer A discovered one of RRD's limitations. Because of the way it organizes its generated code, RRD does not lend itself to the standard solution. RRD organizes all the generated code for a given page in a page-specific package. This includes page-specific classes representing the data structures used by that page. In other words, if two
pages both use the WorkTicket class from the class model, RRD generates a different
WorkTicket Java class for each page, each in a different package. This means that if Page 1
creates an instance of its WorkTicket class and places it in a session, Page 2 cannot use it as
an instance of its WorkTicket class.
Developer A used RRD's preferred solution to this problem: store the data in session in XML form. He used RRD's features to define an XML data structure and map it to classes in the
model. The generated code uses a DOM API to parse the XML. This solution is tedious, and a bit rankling to the hard-core J2EE developer. Nevertheless, it did work.
(Fast forward to Phase 2, when the team found all this to-ing and fro-ing between XML and
objects a major performance bottleneck. They ripped out the XML code and replaced it with
logic that used a custom DTO class.)
In the past he had used another tool (WSAD) to generate the missing descriptor, so he did so
again. He created and built a simple Web service project simply to generate a descriptor.
When he included this descriptor in the RRD application, however, it did not deploy correctly.
It turned out that the initial failure was related to a bad state in RRD, possibly a source control
issue. One of the files was not open for writing, but RRD didn't tell him. So the Web service
implementation hadn't been saved properly and consequently didn't work.
Once he tracked down and fixed the file problem, RRD did successfully build and deploy the
Web service without the custom descriptor. At that point Developer A took yes for an answer
and moved on without investigating the anomaly. But the detour cost him several hours.
Later on, when the team set up its production environment, MQ again gave Developer A a headache, this time while he was setting it up in a remote fashion (one MQ installation that served three applications residing on different servers). The process was complicated by the undocumented fact that IBM WebSphere's MQ installation did not use MQ's default port of 1414. Rather, it used 5558, which was not at all obvious.
Many string attributes required non-empty values. This requirement was enforced in the
input forms.
For non-required attributes, the developer set the initial value to a space character.
5. Document Object Model, an API that treats an XML document as a tree of objects.
6. The developer discovered this by getting a complete process list using the ps command (ps -efl). He noticed an entry, strmqmlsr, that had a port setting, tried this number, and it worked.
For the handheld application to communicate with the Work Order application, they chose to use a Web service. Using RRD, Developer B added a Web service interface to the Work Order Web application, defining the five remote operations needed by the handheld application.
Meanwhile Developer A ran a stub generator in the J2ME wireless toolkit in Sun One Studio to
create the Web service client.
This should have been easy, but it wasn't. It turned out that the stub generator supported only document/literal Web services, whereas RRD supported only rpc/encoded. No amount of tweaking the WSDL would make the two talk to each other.
But they had all the logic ready to use in EJB methods. They needed a way to allow the PDA to
execute them. Luckily the wireless toolkit also had a wizard for converting a service (basically a
simple Java class with some methods in it) into a servlet and client piece using HTTP POST and simple data (not Web services). So they butchered the EJBs generated by RRD into POJOs, delegated to their methods from the service class, and ran the wizard. With that, they had the
two halves talking to each other.
The final step was to build the front-end MIDlet in J2ME. Luckily this step was very easy and
took only a couple of hours.
Inability to centralize common page logic. Because RRD does not let you work directly with
JSPs and because the ITS specification prohibited use of frames in the pages, the team could
not easily centralize page logic that was common to most or all pages. This logic included the
navigation bar (links to other pages) and page authentication logic to verify that the user is
logged in before displaying the page. In RRD this logic had to be copied to every page. A
tedious but finite process, it would have been much worse had the number of pages been
significantly greater.
False error when building an EAR file for WebSphere. Developer A discovered a glitch in RRD's build process for WebSphere. The build script that RRD creates makes an EAR file under <websphere home>/RationalRDApps/<application name>. If there are no JARs in the application (EJB JARs or custom libraries), the script returns an error code and RRD aborts, even when the application legitimately has no JARs. Developer A worked around it by dropping an arbitrary JAR (such as Oracle's classes12.jar) into the folder to avoid the false error code.
Placement of the GlobalObject class. For each application, RRD generates a GlobalObject class that contains any global functionality you define. Although the class is application-specific, RRD does not package it in the resulting application EAR file. Rather it is treated like an external library: It must be placed in the server's class path. This means the team had to bounce WebSphere whenever the class changed. Also, since they used WebSphere's admin
console rather than RRD to deploy to the production servers, they had to manually copy the
class to its proper destination. It cost them some time figuring out where the class belonged
and ensuring it was properly updated.
Date handling in a Web page. Developer B discovered an apparent bug in RRD's handling of date fields in a page: After constructing a page with date fields tied to date attributes of an object in the model, he entered a date using the correct format, then printed its value to standard
error. The printed value was one day behind the entered value. He wrote a simple global method to increment a date value and compensate for the discrepancy.
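The report does not show the team's method, but such a compensating helper could be as simple as this sketch (illustrative only):

import java.util.Calendar;
import java.util.Date;

public class DateFix {
    // Adds one day to compensate for the off-by-one behavior described above.
    public static Date addOneDay(Date d) {
        Calendar c = Calendar.getInstance();
        c.setTime(d);
        c.add(Calendar.DAY_OF_MONTH, 1);
        return c.getTime();
    }
}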
Inconsistencies regarding editing and source control. RRD works seamlessly with Visual SourceSafe; you can easily check files out of and into VSS from within RRD. And for the most part, RRD is smart about preventing you from editing files that you have not checked out.
But Developer B found gaps in this intelligence, leading to wasted time. For example, when he started to add a session attribute to a project, he was able to define the attribute, enter its name and set its initial value before realizing he had not checked out the source file where the attribute would be stored. But without checking out that file he couldn't save his work. And checking out the file caused him to lose his work.
Adding a static HTML page to a Web application. At one point Developer B wanted to add a static HTML page to the Customer Service Web application. This turns out not to be easy at all in RRD. RRD has no facility for directly adding an actual HTML file to the Web app. Instead it has a way to let you take "snapshots" of pages that change little or not at all. You designate the page as static in the page properties, then you have to go through a two-step construction process. Developer B created a dummy page and tried this, but quickly got bogged down in the details. So he gave up and instead simply dropped an HTML file in a folder that was included in the WAR build. That did the trick.
Database access. As with the RRD implementation, the team avoided putting database logic in the database itself via stored procedures, favoring explicit SQL statements in the Java
code. In this more standard J2EE environment, where they would have to write the database
logic either way, several factors led to this conclusion:
For a single database action, a prepared statement invoked via JDBC performs at least as
well as a stored procedure.
Prepared statements are easier to write than PL/SQL stored procedures.
Having the SQL in the application code rather than in the database simplifies code
maintenance.
Along the way the team discovered that this choice gave them even greater flexibility than first
thought: They could invoke complete PL/SQL statements as JDBC prepared statements, letting
them combine multiple database actions into one. More on this below under Section 10.5.4,
Optimizing Queries.
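For illustration, a hedged sketch of that technique follows; the table and column names are invented, not taken from the ITS schema:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PlsqlBatchExample {
    // Hypothetical: combine two database actions in one round trip by
    // sending an anonymous PL/SQL block as an ordinary prepared statement.
    public static void closeTicket(Connection con, long ticketId, int newStatus)
            throws SQLException {
        String block =
            "BEGIN "
          + "  UPDATE worktickets SET ticketstatus = ? WHERE ticketid = ?; "
          + "  DELETE FROM pendingtickets WHERE ticketid = ?; " // invented table
          + "END;";
        PreparedStatement ps = con.prepareStatement(block);
        try {
            ps.setInt(1, newStatus);
            ps.setLong(2, ticketId);
            ps.setLong(3, ticketId);
            ps.execute();
        } finally {
            ps.close();
        }
    }
}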
The persistence tier. Having chosen to keep persistence logic within the application itself, the team
then had to choose how to organize that logic, whether in EJB entity beans, another O/R
mapping layer such as Hibernate or JDO, or POJOs with straight JDBC. Again, from past
experience they decided that entity beans were too expensive. They considered Hibernate, but rejected it; it might provide a significant productivity gain only if the entity model were extensive, and the team wasn't certain it would scale. So instead they settled on POJOs.
To centralize all common aspects of JDBC operations, the team created a simple framework
based on the Command pattern. A central JdbcHelper class was responsible for getting a
connection, executing a statement, getting and returning the results of that statement (if the
SQL was a query), and handling errors. It obtained the SQL statement and the actual query
results from a case-specific callback object (the command object).
The callback classes themselves were organized as inner classes of entity-specific JDBC helper classes, such as CustomerJdbcHelper for customer activity. Each class had methods representing individual database actions, such as updateCustomer(). Each method would create or obtain the proper callback object and invoke the central JdbcHelper logic to execute the database action.
This framework proved very effective in minimizing repetitive code, reducing bugs and easing
maintenance.
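The report does not include the framework's source, so the following is a minimal sketch under the names given in the text (JdbcHelper, CustomerJdbcHelper, updateCustomer()); the callback interface, all signatures, and the SQL are assumptions:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

// The case-specific callback (the "command" object).
interface JdbcCommand {
    String getSql();
    void setParameters(PreparedStatement ps) throws SQLException;
    Object handleResults(ResultSet rs) throws SQLException; // queries only
}

// Central helper: connection handling, execution, and cleanup.
class JdbcHelper {
    private DataSource ds;
    JdbcHelper(DataSource ds) { this.ds = ds; }

    Object execute(JdbcCommand cmd, boolean isQuery) throws SQLException {
        Connection con = ds.getConnection();
        try {
            PreparedStatement ps = con.prepareStatement(cmd.getSql());
            try {
                cmd.setParameters(ps);
                if (isQuery) {
                    return cmd.handleResults(ps.executeQuery());
                }
                ps.executeUpdate();
                return null;
            } finally {
                ps.close();
            }
        } finally {
            con.close();
        }
    }
}

// Entity-specific helper with callbacks as inner classes, per the text.
class CustomerJdbcHelper {
    private JdbcHelper helper;
    CustomerJdbcHelper(JdbcHelper helper) { this.helper = helper; }

    void updateCustomer(final String id, final String name) throws SQLException {
        helper.execute(new JdbcCommand() {
            public String getSql() {
                return "UPDATE customers SET companyname = ? WHERE customerid = ?";
            }
            public void setParameters(PreparedStatement ps) throws SQLException {
                ps.setString(1, name);
                ps.setString(2, id);
            }
            public Object handleResults(ResultSet rs) { return null; }
        }, false);
    }
}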
The business tier. For the business façade the team considered EJB session beans but rejected them as too expensive. Instead they created a simple framework based on the Façade pattern. Each application had a single façade class with stateless methods representing every action required by that application's front end. Most of these methods simply called a method of the appropriate JDBC helper class. Each façade class also created a single instance of itself and of each JDBC helper class it used, to eliminate unnecessary object creation.
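Building on the JdbcHelper sketch above, such a façade might look roughly like this; the class name and method are invented:

import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical facade: one per application, stateless methods only.
public class CustomerServiceFacade {
    private static CustomerServiceFacade instance;
    private final CustomerJdbcHelper customers;

    private CustomerServiceFacade(DataSource ds) {
        // Create the JDBC helpers once, to avoid per-request object creation.
        customers = new CustomerJdbcHelper(new JdbcHelper(ds));
    }

    public static synchronized CustomerServiceFacade init(DataSource ds) {
        if (instance == null) instance = new CustomerServiceFacade(ds);
        return instance;
    }

    // Most facade methods simply delegate to a JDBC helper method.
    public void updateCustomer(String id, String name) throws SQLException {
        customers.updateCustomer(id, name);
    }
}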
The Web tier. The team used JSPs for Web pages and servlets to tie those pages to business logic. They decided against using an MVC framework (Struts was the prime candidate) because it would add runtime overhead. Nevertheless, they did create a simple controller servlet that reproduced some of Struts' conveniences. This servlet became the ancestor class of all servlets in the two ITS Web applications.
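The report does not show this ancestor servlet, but its shape was probably something like the following sketch (all names assumed):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical ancestor servlet: centralizes dispatch and error handling,
// two of the conveniences an MVC framework would otherwise provide.
public abstract class BaseControllerServlet extends HttpServlet {
    protected void service(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException {
        try {
            String view = handle(req, res);  // subclass runs the use case
            req.getRequestDispatcher(view).forward(req, res);
        } catch (Exception e) {
            req.setAttribute("error", e);
            req.getRequestDispatcher("/error.jsp").forward(req, res);
        }
    }

    // Each concrete servlet implements its use case and names the next view.
    protected abstract String handle(HttpServletRequest req, HttpServletResponse res)
            throws Exception;
}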
The team also did not use any custom tag libraries. This choice meant that their JSPs
contained significant amounts of Java code. For a comparatively short project such as this, the
additional code was acceptable; but in a longer, more enduring project the team would probably
have refactored their JSPs to use Struts or JSTL libraries.
Message consumption. As with the RRD implementation, the team saw no reason to deviate
from using EJB message-driven beans (MDBs).
7. Callback classes representing actions with no parameters (such as "get all customers") could be treated as singletons.
Project        Produced           Description
itsCustServ    itsCustServ.ear    Umbrella project for Customer Service Web app
It was a straightforward tool that facilitated access to J2EE rather than hiding it. This
appealed to the J2EE developers on the team.
It was dedicated to WebSphere.
every page that needed it. The team used this technique for two common aspects of their
pages:
The page header, including the navigation bar (links to other pages)
The page authentication logic to verify that the user is logged in before displaying the page
The only tricky part to using @include was the declaration and use of variables in scriptlets. A
variable used in any JSP had to be declared in the same JSP. This restriction complicated the
design a bit but not significantly.
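For instance, a page might pull in the shared fragments with the standard JSP @include directive; the fragment file names here are assumptions:

<%-- Translation-time includes; each page repeats these two lines. --%>
<%@ include file="header.jspf" %>  <%-- navigation bar --%>
<%@ include file="auth.jspf" %>    <%-- checks that the user is logged in --%>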
The error appeared shortly after a test of a distributed transaction had failed and the server was
bounced. It seemed the server was trying unsuccessfully to recover from the failure. The
problem was that the server tried again every minute or so, pumping output into the stdout log.
While not a show-stopper, this problem was annoying, especially because it later also
appeared in the live WebSphere instances. Bouncing WebSphere and Oracle had no effect,
and the team could not find any WebSphere configuration settings that helped. Moreover, the
problem had not appeared during the RRD round.
Eventually a Google search found the answer in an IBM developerWorks forum: The datasource user must have SELECT rights to the Oracle table PUBLIC.DBA_PENDING_TRANSACTIONS. In the RRD round the team had set up the datasources to log in as SYSTEM, which had that right. In the WSAD round, they configured the datasource to log in as ITSUSER (the ITS schema user), which didn't. When they granted that right to ITSUSER and restarted the servers, the problem disappeared.
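The fix, in other words, was a one-line grant run as a privileged user; the exact statement below is our reconstruction, not quoted from the report:

GRANT SELECT ON DBA_PENDING_TRANSACTIONS TO ITSUSER;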
TODOs in JSPs. WSAD has a nice facility for marking to-do tasks in your Java code. It displays any comment beginning with // TODO in a special To Do list and lets you easily jump to that comment from the list. However, this facility does not work for comments in JSPs.
Code completion inside JSPs. WSAD has a nice code completion tool, but it is very slow
for Java code inside JSP scriptlets.
Copying a servlet. One developer discovered that when he created a new servlet by copying an existing one, WSAD did not automatically add the new servlet to the Web deployment descriptor; because of that he got page-not-found errors when invoking that servlet. The Web descriptor editor has a Servlets page that should have allowed him to add existing servlets to the list, but it didn't work right, and the developer didn't see how to make it do so. So he had to manually edit the descriptor source.
Sharing source code. Because the team chose not to use source control software, they
shared code by zipping up their workspace. After a couple of false starts they learned
which files not to share.
8. Why the change? Having the application log in as SYSTEM meant that at least some SQL table references had to be qualified with the schema name. Logging in as ITSUSER eliminated that problem.
8.4 Microsoft .NET Development Process
The team spent approximately 50% of their development time writing new code, 25% modifying
code to correct misinterpretations of the specification, 10% creating the overall design, and
15% performing unit and system tests. Since the machines were speedy, build times were
insignificant.
The team used model classes to map objects to relational data, and also used the publicly
available Data Access Application Block (DAAB) to simplify their data access code. DAAB
provides pre-built libraries for ADO.NET and is available on MSDN for both SQL Server and
Oracle backends. More on DAAB below in Section 8.4.1.2.
This diagram shows the software architecture of the Work Order application. The architectures of the Customer Service and Technician Mobile Device applications were very similar, except that the former lacked message processing and the latter had neither message processing nor queue access.
The Customer Service project included the Customer Service Web application with its
associated business and data logic. This application exposed the Web UI and produced
customer update and work order messages.
The Work Order project included the Work Order Web application and the Message
Forwarder and Processor console applications, each with its associated business and data
logic. The Web application exposed the Web UI as well as the Web service consumed by
the Customer Service Web application. The Message Forwarder and Processor console
applications worked together to process the customer update and work order messages.
The Technician Mobile Device project included the Pocket PC Windows Forms
application with its associated business and data logic. This application provided the UI for
technicians to process work orders. It connected directly to the Work Order database via
wireless networking.
The .NET team chose to use stored procedures even though all of the database actions were
little more than CRUD operations. Doing so afforded a level of encapsulation and an interface
that allowed the underlying database operations to change if necessary without adversely
affecting the data access logic in the middle-tier code. From a manageability standpoint, this arrangement has certain advantages, since at least some query changes are possible without having to modify or redeploy any middle-tier logic.
At the same time, the .NET team did not limit themselves to stored procedures. Along the way
they found that, for some operations, putting the SQL statements in the data access logic
markedly improved performance. The most notable example was work ticket queries. Since
the specification required the queries to employ several independent but optional search
criteria, the team coded their middle tier to construct a SQL query for such criteria and send it
as a batch to the database.
Although each .NET application had a data access layer, the custom classes in those layers did
not directly invoke the .NET frameworks data access classes (those found in the
System.Data.SqlClient namespace). Instead, the .NET team chose to use a freely available
application block, the Data Access Application Block (DAAB), created by Microsoft and
publicly available for use with both SQL Server and Oracle. DAAB comes in the form of a
single code file containing a few utility classes that encapsulate the most common data access
operations.
The main DAAB utility class, SqlHelper, contains methods to return scalar values or result sets
in the form of a SqlDataReader or DataSet. It also contains methods to execute SQL batches
that have no return value, such as INSERT or UPDATE statements. All SqlHelper methods
have several overloads that take a variety of parameters, such as either a SqlConnection object
or a connection string. In most cases, the .NET team used the overloads that take a
connection string as well as the stored procedure name and a variable list of arguments in
varargs style. This approach reduced much of the data access code down to a single
statement. For instance, the Update method of the CustomerDAO class was very brief:
public static void Update(Customer customer) {
// Invoke the "UpdateCustomer" stored procedure through DAAB's SqlHelper.
SqlHelper.ExecuteNonQuery(connectionString, "UpdateCustomer",
customer.CustomerId, customer.CompanyName, customer.Address1,
customer.Address2, customer.City, customer.State,
customer.Zip, customer.ContactFirstName,
customer.ContactLastName, customer.ContactEmail,
customer.ContactPhone, customer.ContactFax,
customer.MasterAccountCode);
}
However, since the ServiceDomain class does not work on Windows XP (the OS of the
development machines), the team could not test transactional behavior locally. This limitation
did not turn out to be a problem since the code still compiled correctly; the .NET team protected
the transaction-specific code with preprocessor directives (e.g. #if DEBUG). Thus, they were
able to perform functional tests on their development machines and specification conformance
tests (including transactional behavior) on the target machines, which were running Windows Server 2003.
The .NET team experienced some configuration problems. One was with the Distributed
Transaction Coordinator that .NET uses to manage transactions. They suspected the problem
was due to re-naming the server. They solved this problem without affecting their development
time.
.NET offered the team two solutions: storing session state in a database or using ASP.NET
Session State Services. Since the specification did not require that state persist longer than
the session timeout for users (15 minutes), and believing it would perform better, the team
decided to use ASP.NET Session State Services.
ASP.NET Session State service is an out-of-process service that does not depend on any other processes, so that developers can begin using session state in server farm environments without worrying about each request's dependency on a particular server. Once installed and enabled on the reliable
server, this service satisfied the ITS specification for preserving session state in a clustered
environment.
Specifically, the application would start a transaction before reading and processing a message.
If the message processing succeeded, the transaction would be committed and the message
would be removed from the queue; if it failed, the transaction would be rolled back and the
message would remain on the queue.
This requirement led to the single most costly problem the .NET team experienced. The issue
stemmed from the way MSMQ handles distributed transactions. Currently, MSMQ supports
transactionally sending messages to, but not reading messages from, a remote queue. In other
words, the Work Order application residing on the Work Order host could not read a message
from the queue residing on the MQ host within a transaction.
The team discovered this issue about two-thirds of the way through the development phase
when they tried a system-level test involving work ticket processing. Why did they not see it
sooner? As described above in Section 8.4.1.3, the team had chosen to manage distributed
(DTC or Distributed Transaction Coordinator) transactions via the ServiceDomain class, which
is not available in Windows XP. So they could not test this functionality on the development
machines.
It took about half a day to diagnose the problem and a full day to implement a workaround. The
solution, a read request queue, is described in the MSDN article "Transactional Read-response Applications," found at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msmq/msmq_about_transactions_05wz.asp. It lays out the following architecture:
Each receiving (message processing) application has a local queue to hold messages it will
process.
When a receiving application wants to process a request, it sends a read-request message
to a read-request queue colocated with the input queue (the queue to which the sending
applications send their original messages).
A separate application (the read-response application) monitors the input and read-request
queues, which are local to it. In a single transaction it reads a read-request message,
obtains from it the target receiving application, reads a message from the input queue and
forwards it to the target. When the forwarded message appears in the target's local queue, the target receiving application handles the request.
This solution gets around the MSMQ barrier the .NET team encountered because all messages
are read from queues local to the reader; there are no remote reads at all, let alone any that
must be transactional.
To implement this pattern, the team set up:
two read-request queues on the MQ host (where the original input queue resided), one each for customer updates and new tickets
two corresponding local queues on the Work Order host
a read-response application
The team also noted what it considered a non-standard requirement of the ITS specification. In
a production system, if a message consumer could not process a message properly, it would
probably remove the message from the main queue and place it into a secondary failed queue
for management reasons. However, the ITS specification did not allow this option; it required the failed message to remain in the main queue. One adverse consequence of this requirement was that a corrupt message would be re-processed over and over again.
However, the specification also called for the Previous and Next buttons to be implemented
as INPUT tags (to work with the Mercury LoadRunner scripts) and named cmdPrevious and
cmdNext, respectively, in the HTML. While these requirements prevented the .NET team from
using all of the DataGrid's paging features, they were still able to utilize the auto-paging feature
and write additional code to handle the buttons' events.
Since the specification also forbade client-side caching of query results, the .NET team
disabled view state on Web pages that contained a DataGrid. This change had the added
benefit of reducing the HTML page size, since the view state for a DataGrid can be quite large.
Without view state, however, the team needed another mechanism to hold other, necessary
page state information. They kept track of some of the DataGrid properties (such as the current
page index) in a cookie to get the correct paging behavior.
Had the team been allowed to use the full range of DataGrid features, this particular part of the development phase would have taken only a few minutes. Instead, with all of the custom code necessary to implement the ITS paging requirements, it consumed a few hours.
9. The reason was simplicity. The ITS specification was designed to include integration technologies such as messaging in the project yet still be simple enough that a team could develop the ITS system in four developer-weeks.
However, during the tuning phase the .NET team found they could increase performance by
switching to a custom object collection. This change required a more significant coding effort,
since they had to create custom classes not only on the Web service side but also on the client
side. Since the classes auto-generated by Visual Studio's Web Reference mechanism expose
only public member variables instead of the public properties required for DataGrid data
binding, the .NET team had to create wrapper classes to expose those fields as properties.
They also had to implement the IComparer interface to get the proper sorting in the DataGrid.
It turned out they were using version 1.0 of the DAAB in their initial mobile test project, but
version 2.0 in the Technician Mobile Device project. They did not want to use version 1.0 since
2.0 contains several improvements, but for some reason the Compact Framework build was
unable to find the IDisposable interface of the SqlDataAdapter class. They were able to
enhance version 2.0 of the DAAB to work successfully with the Compact Framework by
creating a custom SqlDataAdapter class to wrap the System.Data.SqlClient.SqlDataAdapter
class. Even with this issue, the total development time for the mobile application was only one
day.
About half of the classes were rather small, but others had as many as a dozen attributes.
Although the time taken to create each class was not very significant, it was still minutes
instead of seconds. Copying and pasting the parts of the code that were common across most of the classes mitigated some of the effort, but that process introduced the risk of errors.
Fortunately, Visual Studio parses the code while editing, effectively performing a syntax check
without compiling. Compilation was quick, in any case.
9 CONFIGURATION AND TUNING RESULTS
The development teams were initially allotted up to two weeks to tune and configure the system
in preparation for performance, manageability and reliability testing in the final week of the
project. However, each team was allowed additional time if they required it in order to
successfully prepare for the tests and ensure the application was performing properly. The
time spent configuring and tuning the production environment was tracked for comparative
purposes.
The teams used Mercury LoadRunner to simulate load for performance tuning. Each
development team was responsible for their own database indexing, database tuning, and
application server tuning. Changes to code were allowed during this phase if required to make
the application perform more efficiently under load.
The following table shows the amount of time taken by each team to tune each application to
perform as efficiently as possible prior to testing. The data are from the auditor's report:

Originally scheduled: 20
The J2EE team clearly went well beyond the two-week time frame when tuning the RRD
implementation. There are several reasons for this, detailed in Section 10.
In the WSAD round, the J2EE team built upon the solutions from the previous round, focusing
their efforts on tuning the WSAD code and improving failover. Again see Section 10 for details.
The .NET team, on the other hand, completed their tuning early. As the auditor's report states, they were "able to tune and configure their implementation in 8 days, less than the 10 days allotted. Vertigo Software could have used the entire 10 days allotted for this phase; they chose to consider this phase completed after 8 days."
Again, as noted above, .NET has many fewer knobs to turn than WebSphere. The team did not, for example, have an equivalent of tuning the JVM. See Section 11 for details on the .NET team's experience.
10 WEBSPHERE CONFIGURATION AND TUNING PROCESS SUMMARY
This section describes the process the J2EE team went through to configure and tune the basic
WebSphere infrastructure. It also describes the major bottlenecks encountered and resolved in
the two implementations.
Here is a high-level summary of the stages the team went through. Details follow in the
sections below.
1. Install the basic software: WebSphere Network Deployment, Edge Server, IBM HTTP
Server (IHS).
For the WSAD implementation, the team did not repeat Stage 1 and did very little Stage 4
tuning. Most of their work in the WSAD round focused on three issues:
session sharing
failover
optimizing database queries
Additionally, the team had set up session sharing between the Customer Service instances
using a long-standing, standard IBM technique: writing the session data to a database.
WebSphere makes this relatively easy to configure. The team used the Customer Service
database as the persistent store for sessions. (This technique worked for functional testing, but
later would prove unacceptably slow under load. The team would then replace it with in-
memory replication; more on this in Section 10.6.7.)
so, losing their resource configurations in the process. Realizing their mistake, they removed
the nodes and added them again in a way that preserved the configurations.
Another false start had to do with adding a node for the MQ server's WebSphere instance, running on the same host as the Deployment Manager. When the team did so, they discovered
two changes in the MQ situation:
How you start MQ changed. As mentioned earlier, you control MQ Series through WebSphere's own JMS server. If the WebSphere instance stands alone, its JMS server is embedded in the application server, and starting the latter starts the former (as well as MQ). But when you add the WebSphere instance to the Network Deployment cell, the JMS server is split out as a separate server and must be started separately.
The MQ configuration names (queue names, JNDI names, etc.) changed, and the
applications could no longer reach the server.
While the first change did not cause a problem, the second did. And since the team had no compelling reason to include the MQ host's WebSphere instance in the federation, they removed it and restored the status quo. (Ironically, much later the team decided they needed the session sharing server to run on the MQ host, forcing them to add that node to the federation and confront the MQ configuration change.)
Once the nodes were added, the team created a cluster for the two Customer Service servers.
To start IHS, the team ran the usual Apache command:

apachectl start

Apache did start, but it wasn't IHS. It took a bit of head-scratching to figure out that Linux had its own Apache server already installed and placed in the system path. The two versions of Apache were launched with the same command. Executing the command with a path qualifier pointing to the IHS bin folder solved the problem.
Getting it to run properly was a different matter. In fact, Edge Server was directly or indirectly
responsible for the most significant challenges the team faced during this phase. See Section
10.6 for the gory details.
10.2.2 Configuring the WebSphere Web Server Plugin
The Web Server Plugin is WebSphere's interface between Apache and the HTTP transport
embedded in the application server. The plugin consists of a native runtime library (already
installed with WebSphere) and an XML configuration file, plugin-cfg.xml. Generating that file
and putting it in place on the nodes can be done entirely through the Deployment Manager
console.
The WebSphere literature talks of installing the plugin, but it's really a matter of configuring Apache to use it. This configuration consists of modifying Apache's httpd.conf file to load the plugin library and point to the plugin's configuration file, plugin-cfg.xml. While it took the team a few tries to get everything right, the procedure is straightforward and well documented.
It took a long time to diagnose this problem, but finally the team did so using Borland Optimizeit's Thread Debugger. This tool shows you all thread activity in the JVM. It told the team that the server was spawning many new threads from instances of a log4j class called FileWatchDog. This class extends Thread and is used to check every now and then that a certain file has not changed.
What was causing this? The culprit turned out to be RRD's debug settings; the Work Order Web application had been deployed with debug settings turned on. The team redeployed it with all debug output suppressed.
The generated code often used plain statements instead of prepared statements.
For queries, the code performed a count(*) before doing the actual query, to see how many
rows would be returned. This, of course, doubled the number of actual database calls.
The team replaced RRD's code for all major queries with custom code that used prepared statements. It also eliminated the count(*) calls, as these were completely unnecessary. This coding work took considerable time but proved crucial to improving the application's performance.
Even after optimizing the query logic, the Web service still responded slowly. So the team
focused on the fact that RRD generated code to wrap the service in a stateless session bean.
Every service call had to acquire a bean instance.
The team tried replacing the session bean with POJOs. They did this by porting all the
generated RRD Java code into another tool (WSAD), then refactoring the bean class and the
logic that used it. This change improved performance on the entire test script by more than
10%.
Next they looked at the code that creates the XML response string from the query. The original
code created DTOs from the JDBC result set, then used the DOM API to construct XML from
the DTOs. The team refactored this code to create XML directly from the result set using a
simple StringBuffer. Eliminating DTOs and DOM (which is notoriously expensive) improved
overall performance by another 16%.
The team's initial RRD implementation did limit the overall query size, using Oracle's maxrows variable.
But when they switched to custom queries, the team had to add custom paging logic. There
were two important parts to this logic.
First, limiting the query size: In other words, for Page n the query should return the lesser
of 500 and n * 10 + 1.
Second, creating data transfer objects (DTOs) for only the ten rows actually displayed.
So if the application asked for Page 3 of a query that could return 100 rows, the query should return only 31 rows, and the application should create objects only for rows 21-30.
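A hedged sketch of that rule follows; the names are invented, and JDBC's setMaxRows is used to cap the result set:

import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PagingExample {
    // Page numbers start at 1; each page shows 10 rows; overall cap is 500.
    static void limitForPage(PreparedStatement ps, int page) throws SQLException {
        int maxRows = Math.min(500, page * 10 + 1); // the +1 row signals a next page
        ps.setMaxRows(maxRows);
        // After executing, skip (page - 1) * 10 rows, then build DTOs for
        // the next 10; a 31st row on page 3 means "Next" should be enabled.
    }
}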
Because RRD's query processing code is so tightly interwoven with its page-producing code, replacing the query code wasn't enough; the team also had to work around the code for displaying the page. This took the team deep into the realm of working against RRD's capabilities instead of with them.
The team created a simple ServiceLocator class that cached the InitialContext, JDBC data
sources and EJB homes. (Although the implementation used EJB minimally, it did use
stateless session beans in the Customer Service application to wrap the JMS message
producing code.) This class was packaged in a custom library that was dropped into WebSphere's AppServer/lib/ext folder.
The library worked well. The only inconvenience was that it meant bouncing WebSphere more often: when the library changed, and when the Customer Service application was rebuilt and redeployed (because the EJB home stubs became stale).
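A sketch of such a locator; the class name comes from the text, the rest is assumed:

import java.util.HashMap;
import java.util.Map;
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Caches the InitialContext and everything looked up through it
// (JDBC DataSources, EJB home stubs, and so on).
public class ServiceLocator {
    private static ServiceLocator instance;
    private InitialContext ctx;
    private Map cache = new HashMap();

    private ServiceLocator() throws NamingException {
        ctx = new InitialContext();
    }

    public static synchronized ServiceLocator getInstance() throws NamingException {
        if (instance == null) instance = new ServiceLocator();
        return instance;
    }

    public synchronized Object lookup(String name) throws NamingException {
        Object o = cache.get(name);
        if (o == null) {
            o = ctx.lookup(name);
            cache.put(name, o);
        }
        return o;
    }
}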
10.3.6 Using DTOs for Work Tickets
As described above in the section on developing the RRD applications, the way RRD
generates code prohibits data objects from the class model from being directly shared across
pages. So for the Customer Service Web application to hold pending work tickets in session
state, it had to do so by converting ticket objects to XML and back again.
This process proved highly inefficient, especially because RRD used a DOM API to construct
the XML. (DOM is notoriously expensive.) After working through other code bottlenecks, the
team focused on this one. The solution was a custom WorkTicket DTO class to replace the
XML format, along with efficient code to go between the DTO class and the page-specific ticket
objects.
1. Work up and out. In other words, take a stab at the application server parameters,
including JVM parameters. Find the maximum load without errors. Get into the
ballpark, don't try for final precision.
2. Then go to one end of the system (the database or the Web tier) and work toward the
other.
3. At each point (for each tuning variable), try making a big change, such as doubling the
size of a pool. Run a quick test, see if it had any effect. Use binary chopping to zero in
on the optimum value.
4. Never change two things at once. Stick with the plan, resist taking shortcuts as time
runs out. One exception to this rule is where two variables are related, such as the
Web container thread pool and the database connection pool.
5. Realize that precise tuning requires several passes through the system.
Regarding the last point: because the team spent so much time optimizing code, they really
had time for only one pass through the system.
Page hits per second. LoadRunner provides average response times for each individual action in a test script. With some scripts comprising as many as two dozen actions, and when you have to run many tests in a long tuning process, it takes too long to record all the response times.
The total page hits/second statistic in LoadRunner provides a handy summary performance indicator. It tells you the peak load that the application can handle. As a load test ramps up the number of users, at some point hits/second reaches a plateau before response times climb significantly and errors accumulate. This level represents the peak user load.
users = ( hits/sec ) * ( total user think secs in script / total hits in script )
All the scripts used in this study had 5 seconds of think time per Web request. If a script had
15 requests with a total of 20 hits, then
users = ( hits/sec ) *
((5 user think secs / request ) * ( 15 actions / script ) / ( 20 hits / script ))
In other words, it would take approximately 2,250 users to generate 600 hits/second.
CPU usage. To see how hard a machine was working (CPU usage), the team used top on the
Linux servers and Performance Monitor on the Windows machines. The latter also provided
indicators of disk activity to tell them how hard the database was working.
Response times. For more specific issues the team looked at response times of individual
actions in a script.
The two settings are related: while a larger heap allows the JVM to work with more objects and possibly provide greater throughput, it also means that GC must work harder when the heap begins to fill.
Non-concurrent. The garbage collection thread sleeps most of the time, then periodically
wakes up to collect garbage. When it does, it pauses the JVM, leading to a backlog of
requests that under load can be overwhelming.
Concurrent. Concurrent mode spreads the performance cost of garbage collection out
over time. The garbage collector runs at a low level in the background most of the time.
This reduces throughput during the steady state. But when full GC kicks in, it takes less
time because the collector has been working continuously, so the backlog of requests
doesn't build to a critical level.
The team's first attempt at setting a concurrent GC parameter produced an "Unrecognized Parameter" error. The reason was that WebSphere 5.1 is installed with IBM's JVM 1.4.1, which does not recognize this parameter.
The team briefly tried having WebSphere use Sun's JVM instead, but quickly ran into other
errors. So they abandoned this effort and stayed with the original JVM installation. See
Section 10.6.1 for more details.
IBM's JVM has a parameter similar to Sun's: -Xgcpolicy toggles concurrent marking of objects:
-Xgcpolicy:optthruput turns concurrent mark off (to optimize throughput). This is the default setting.
-Xgcpolicy:optavgpause turns concurrent mark on (to optimize the pause due to full GC).
During tuning, the team experimented with the latter policy. When they reached a stable
configuration, however, they found that the system performed as well with concurrent mark
turned off, and left that setting in place.
The IBM literature on performance tuning suggests that the JVM should optimally be spending an average of about 15% of its time collecting garbage. If GC is less than this, the JVM may be wasting memory; if more, the JVM is working too hard, indicating that the heap may be too small
and/or objects are not being used efficiently.
The team used this guideline as it examined application server performance. It relied primarily
on Tivoli Performance Viewer to give statistics on garbage collection.
WebSpheres heap settings default to a maximum size of 256 MB and an initial size of 64 MB
(25% of maximum).
To find the optimum heap size, the IBM literature suggests the following procedure:
3. Use the -verbosegc JVM parameter, which prints output of GC and heap expansion activity.
4. Run the server under load. See where heap size stabilizes and where GC falls to an
acceptable level (around 15% of total CPU time).
The team followed this test procedure with different heap sizes, ranging from 128 to 768 MB.
They chose 768 MB as the upper end of the range because each machine had 1 GB total
RAM, and a rule of thumb derived from previous experience suggested devoting no more than
75% of total RAM to the application servers.
What they discovered was that WebSphere does not run better with significantly more memory.
Ultimately the team left the Customer Service servers at 256 MB but increased the Work Order
servers heap size to 384 MB.
10. The IBM Redbook, IBM WebSphere V5.1 Performance, Scalability, and High Availability, says: "The average time between garbage collection calls should be 5 to 6 times the average duration of a single garbage collection." This translates to a range of 14-17%.
Once the optimum heap size was determined, the team set the initial size equal to the maximum size, so that the server doesn't waste time adjusting heap size incrementally.
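In JVM-flag terms, the final Work Order setting would look something like the line below; this is our illustration of the effect, since the team actually set these values through the WebSphere admin console:

-Xms384m -Xmx384m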
On the Customer Service side, the team tested 2 and 3 instances per host. (At 256 MB heap
per instance, 3 instances was the most they could run.) On the Work Order side, they tested
two alternatives:
2 identical instances at 384 MB each, with all three applications deployed to both
3 instances at 256 MB apiece, each dedicated to one of the three applications
These alternatives did not improve performance. WebSphere on Linux apparently spawns
additional threads that act like processes. So, after much testing, the team found that the basic
configuration of one instance per machine was best. On the Work Order side, that one
instance hosted all three Work Order applications (the Web application, the message
consumption application and the Web service for queries from Customer Service).
Create indexes. The team found that creating an index on each individual field used in a
query was most efficient. So, for the WorkTickets table for example, they created separate
indexes on CreationDate, WorkStatus and other fields used in queries.
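As a sketch, the indexes on the WorkTickets table would look something like this (the index names are illustrative; the table and column names are those cited above):

CREATE INDEX idx_worktickets_creationdate ON WorkTickets (CreationDate);
CREATE INDEX idx_worktickets_workstatus ON WorkTickets (WorkStatus);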
Adjust buffer cache size. Oracle caches recent query results; up to a point, the more it can
cache, the faster it performs. The Oracle Enterprise Manager console shows the optimum
cache size. The team returned to this setting periodically to make sure it was adjusted
properly.
Connection pool size. The number of connections in the pool affects how long a thread in
the Web container must wait to carry out a JDBC action. The team set this pool size equal
to that of the Web container thread pool (since the applications did not use EJBs for
persistence). They also set the initial size = the maximum size (as they did with all pools)
to get the system initialized more quickly.
Prepared statement cache size. WebSphere caches prepared statements. The more
different SQL statements used in the application, the greater this number should be.
current session) immediately creates a new session when it redisplays the home page! The
ITS specification required sessions to last at least 15 minutes, and there was no guarantee the
load scripts would perform explicit logouts (not that they would help anyway). So sessions
would linger until they timed out.
After some experimentation the team settled on a maximum session count equal to twice the
peak number of users.
The team tuned three Apache directives:
MaxClients
ThreadsPerChild
ListenBacklog
MaxClients sets the limit on the number of simultaneous HTTP requests that will be
served.
Any connection attempts over the MaxClients limit go into the queue, whose length is
determined by Apache's ListenBacklog setting (as well as Linux's TCP backlog setting).
Apache creates multiple child processes, each of which has ThreadsPerChild threads. So
MaxClients is the maximum number of threads operating simultaneously.
MaxClients / ThreadsPerChild must be an integer and cannot exceed 16 (Apache creates
no more than 16 child processes).
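A sketch of the corresponding httpd.conf directives follows. The team's final values are not given in the report; the numbers below are illustrative and respect the 16-process constraint (512 / 32 = 16):

MaxClients        512
ThreadsPerChild   32
ListenBacklog     511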
In the case of Apache threads, the team found that more is not always better. There came a
point where reducing MaxClients improved performance. The reason: if the resources behind
Apache are choked, making users wait at the Web tier relieves that pressure and improves
overall performance.
The team also experimented with using multiple Apache instances, but found they didn't help.
This requires persisting sessions to some kind of store. Over the course of the RRD round, the
WebSphere team wrestled with several techniques for persisting session state. Additionally,
the team used other WebSphere settings to control the frequency of session persistence and
thereby tune for performance.
In the RRD round the use of these libraries had no effect on how the applications were
deployed. But in the WSAD round it did. The team suddenly got NoClassDefFoundErrors on
code that used classes from one of the libraries.
It took a few hours to find the remedy. WebSphere has an application deployment setting that
controls the order in which class loaders are invoked. The default is parent first (WebSphere's
class loader before the application's). When they changed the setting to parent last, the error
went away.
While this remedy made the error go away, it did not explain why the error had appeared with
the WSAD code but not the RRD code. Eventually the team identified the key difference: in
the RRD implementation, the application classes used classes in the library but did not extend
them. In the WSAD implementation, application classes extended library classes. It was the
loading of these dependent classes that caused the error.
Even then a small headache remained. If a team member had to uninstall and reinstall an
application, the classloader setting always reverted to its default. It took a few extra
steps and a few extra minutes to set it correctly.
Realizing how expensive the creation of DTOs can be, Developer B created a simple class to
manage a pool of objects.
The ObjectPool class was designed as a wrapper to a java.util.Stack. If the stack is empty
when a client requests an object, the pool creates a new one.
Each DTO type had its own subclass of ObjectPool.
To avoid duplication of pools, each subclass was designed to create a static singleton
instance of itself.
The pooling logic was very simple, using synchronized methods to keep it thread safe (see
the sketch below). But as long as the expense of calling those synchronized methods was less
than the expense of creating and garbage-collecting DTO instances, it would be worthwhile.
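A minimal sketch of the approach just described, in the Java 1.4 idiom of the period; the class and member names are illustrative, not taken from the project source, and WorkTicketDTO is a hypothetical DTO class:

import java.util.Stack;

public abstract class ObjectPool {
    private final Stack pool = new Stack();

    // Subclasses create instances of the pooled DTO type.
    protected abstract Object create();

    // Synchronized methods keep the pool thread safe.
    public synchronized Object acquire() {
        return pool.isEmpty() ? create() : pool.pop();
    }

    public synchronized void release(Object dto) {
        pool.push(dto);
    }
}

// Each DTO type gets its own singleton subclass (in its own source file):
class WorkTicketDTOPool extends ObjectPool {
    private static final WorkTicketDTOPool INSTANCE = new WorkTicketDTOPool();
    public static WorkTicketDTOPool getInstance() { return INSTANCE; }
    protected Object create() { return new WorkTicketDTO(); } // hypothetical DTO class
}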
Refactoring the code to use this class proved fairly easy, and subsequent testing showed that
object pooling improved performance by 5-10%.
After refactoring the code to use the new class, Developer B realized he could have built a
single pool class to cover all cases. It would have maintained a hash map of pool instances,
keyed to DTO class type. This change would have improved code simplicity but not
performance, so he didn't pursue it.
XML element names were very long. They were human readable names matching the
column names in the source table.
The Web service returned all columns in a WorkTicket row, even those that the client did
not use.
Developer B refactored the code to shorten the element names to one or two characters each,
and to eliminate the unneeded data elements. For a query returning a full page of data (11
rows), these changes shrank the size of the XML from about 7500 characters to about 2900.
That shrinkage improved performance noticeably.
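For illustration only (the actual element names and values are not given in the report), the refactoring amounted to something like:

Before: <CreationDate>2004-06-01</CreationDate><WorkStatus>OPEN</WorkStatus>
After:  <c>2004-06-01</c><w>OPEN</w>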
10.5.4 Optimizing Queries
While most database activity in the system was simple (meaning it involved only one action on
one or perhaps two tables), some actions were more complex. For example, the Work Order
application processed messages from the Customer Service Web application calling for
creation of a new ticket. To do so, the Work Order application had to execute three steps:
1. Use an Oracle sequence to get a ticket ID (primary key) for the new ticket.
2. Execute a SELECT to retrieve additional data needed for the new ticket.
3. Insert a new row into the WorkTickets table using the information gathered in Steps 1
and 2 plus the other data in the message.
As noted earlier, the team did not use stored procedures to handle complex database
operations like this. So as a first cut, this operation would require three database calls.
Nesting one call inside another can help. The SELECT statement invoked in Step 2 can be
nested inside the INSERT in Step 3, cutting the number of invocations to two. But the SELECT
used in Step 1 cannot be nested.
The team learned, however, that Oracle PL/SQL[11] logic can be passed explicitly as a JDBC
statement. So they constructed the following prepared statement to handle this operation in a
single database call:
DECLARE
tickID INT;
BEGIN
SELECT SEQ_TICKETID.NEXTVAL INTO tickID FROM DUAL;
-- the rest of the block is not reproduced in the source; it nested the
-- Step 2 SELECT inside an INSERT, along these lines:
INSERT INTO WorkTickets (TicketID, ...) VALUES (tickID, ...);
END;
Using this SQL in place of two separate database calls significantly improved performance on
processing messages for new work tickets.
11. PL/SQL is Oracle's native language for stored procedures.
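A sketch of issuing such a block as a single JDBC call. The bind parameter, the Description column, and the conn and description variables are illustrative assumptions, not the project's actual code:

String plsql =
    "DECLARE tickID INT; " +
    "BEGIN " +
    "  SELECT SEQ_TICKETID.NEXTVAL INTO tickID FROM DUAL; " +
    "  INSERT INTO WorkTickets (TicketID, Description) VALUES (tickID, ?); " +
    "END;";
CallableStatement cs = conn.prepareCall(plsql);
cs.setString(1, description); // data from the incoming message
cs.execute();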
10.6.1 Switching JVMs with WebSphere
When the team was ready to tune the WebSphere JVM for garbage collection (see Section
10.4.3.1), they tried using Sun's JVM parameter -XX:+UseConcMarkSweepGC to turn on
concurrent GC. However, this produced an error indicating that the JVM could not start. The
error occurred because the JVM installed with WebSphere 5.1, IBM's Java 1.4.1, does not
recognize this parameter.
Concerned that using IBM's JVM might deny them important options provided by Sun's JVM,
the team decided to try having WebSphere use Sun's JVM instead. They downloaded Java
1.4.1_04 for Linux from Sun and installed it on one of the Customer Service hosts. To point
WebSphere to this JVM, they used the WebSphere admin console to change the value of the
JAVA_HOME environment variable.
While the server started and appeared to be running, it did log an error indicating that it could
not find a particular IBM library. This was due to the changed Java home.
At this point, suspicious that the server had not started cleanly, and fearful of opening a
Pandora's box that would waste valuable time, the team abandoned this effort and reverted
to the original JVM installation.
Technical Background
IBM Edge Server performs load balancing using a technique called MAC forwarding. It
redirects a request at a low level by altering the MAC address (machine address) in the
packet.
The machines and IP addresses involved in this project are described below.
192.168.4.200 is the address of the Customer Service cluster. Requests using that address
must go to the Edge Server host, so that host is assigned the cluster address as an alias.
When a packet comes in addressed to the cluster, it contains both the IP address and the MAC
address of the destination machine, namely the Edge Server machine. Based on the load
balancing algorithm, Edge Server chooses a cluster member to handle the request. It changes
the MAC address to that of the cluster member, but leaves the IP address alone.
For MAC forwarding to work, every machine in the cluster must have the cluster IP address as
an alias. However, the clustered machines should not respond to ARP requests (broadcast
requests asking for the MAC address associated with an IP) on the cluster IP address. The
solution is to alias the machine's loopback device. Doing so requires a simple ifconfig
command.
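The exact command is not preserved in the report, but a representative loopback alias for the cluster address cited above looks like this:

/sbin/ifconfig lo:0 192.168.4.200 netmask 255.255.255.255 up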
Unfortunately, Linux has a defect such that aliases on the loopback device do respond to ARP
requests. This means when an ARP request goes out for the cluster IP, all the machines in the
cluster respond. That should not happen.
There's a Linux patch (the so-called hidden patch) that lets you hide those IPs from ARP
requests. This patch is documented in the Edge Server Network Dispatcher Admin guide. The
patch came from http://oss.software.ibm.com/developerworks/opensource/cvs/naslib
Applying the patch required rebuilding the kernel (since that's how things are done in Linux).
While the team had experience using Linux, they had never patched a kernel before.
Developer A slogged through this on ITSWEB01. These were the basic steps he
followed:
1. Locate the kernel source code on the installation CD and copy it to the machine.
2. Install certain additional Linux packages that hadn't been installed, in order to get the
kernel to recompile.
4. Follow directions from Redhat tech support to compile the new kernel.
When he rebooted ITSWEB01, he got errors regarding the ethernet devices. The machines
each had two network cards, a Linksys Gigabit card and a Compaq card; the first was assigned
to eth0, the second to eth1. On reboot, eth0 was not recognized at all, and eth1 gave an error:
"Dev eth1 has different MAC address than expected."
Next the team tried fiddling with the Network Configuration, but with no luck. By week's end
they were no closer to a solution.
The following Monday the team tried reinstalling the network card drivers. This took them on a
tour of a different circle of Linux hell. Some of the highlights:
The Linksys CD had two sets of drivers for the Gigabit card. But the install script for what
looked like the correct drivers said they required kernel 2.4.13; for earlier kernel versions it
suggested upgrading to a newer version.
The team tried using the Compaq ProLiant Support Pack for Red Hat Enterprise Linux 2.1,
from the Compaq/HP web site, in the hope that it might update the drivers properly. Getting it
to run meant overcoming a series of small obstacles, and ultimately it failed to help anyway.
Finally the team rebooted with the original kernel, but the boot failed because the root
partition had run out of disk space.
Patching the Linux Kernel, Take 2
At this point the team called a colleague offsite with greater Linux knowledge who helped them
free up disk space and remake the kernel. After rebooting, eth0 was active, eth1 (the Compaq
card) was inactive.
The team tried installing the bcm5700 driver for the card. With another series of acrobatic
maneuvers, including manually editing the file /etc/modules.conf, the team got eth0 and eth1 to
activate properly.
With the patch installed, the team applied the alias to the loopback device using the ifconfig command described earlier.
All that remained was to transfer the configuration to the second Customer Service machine.
This was also not a trivial process (is anything trivial in Linux?), but with help from the offsite
colleague the team got through it.
The team worked with Edge Server for some time, but found that it was not working properly.
After wrestling with its configuration settings, they at last delved into the network layer to see
whether MAC forwarding was working as expected. Using the arp utility, they learned that the
cluster IP address was being associated with the MAC addresses of the two Customer Service
hosts, but not the ES host. They suspected that the hidden patch had not taken hold. They
performed a definitive test as follows:
Result: the arp table had the cluster IP associated with the ES box, as it should.
Result: ITSWEB01 responded (it should not have). So the hidden patch had not worked.
Why the Failure?
Why had the previous attempt to apply the patch failed? In retracing their steps, the team
discovered an apparent discrepancy in how the patch (the diff file) was applied. The -p option
governs the number of leading path components that patch strips from file names. Was that
the explanation? The team tested both settings using patch's dry-run option: -p1 showed all
successes, whereas -p0 showed failures.
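A sketch of that verification (the patch file name here is hypothetical):

patch -p1 --dry-run < hidden-patch.diff
patch -p0 --dry-run < hidden-patch.diff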
Discovering this apparent discrepancy, the team retraced their original procedure for applying
the patch. But halfway through it they ran into more disk space problems on the root partition,
as well as intimidating errors like this one:
Mount: wrong fs type, bad option, bad superblock on /dev/loop2, or too many
mounted file systems (could this be the IDE device where you infact use ide-scsi so
that sr0 or sda or so is needed?)
Cant get a loopback device.
At this point the team called a local Linux expert to come onsite and clean up the mess. In
about 6 hours he freed up disk space, cleaned up the kernels and applied the hidden patch
successfully to both Customer Service machines.
After that, the hidden IP addresses were no longer an issue. The team could move on to other
problems with Edge Server.
javax.naming.NameNotFoundException:
Context: WASCell/nodes/ITSWEB01/servers/nodeagent, name:
ITSCustServ.CustomerUpdateMessageHome:
First component in name ITSCustServ.CustomerUpdateMessageHome not found.
The error occurred when the application tried to look up the JDBC datasource or the EJB home
for the session beans that produced messages. The frustrating thing was that this error didn't
occur when they ran the application on a standalone server.
Developer B first thought that the offending part of the name, ITSCustServ, was already
representing a target object, and therefore couldn't also be a context for this compound name.
So in the WebSphere console he changed the name to ITS.CustServ.[12] But this had no effect;
in fact the error message was unchanged, suggesting that his change to the JNDI name had
not been recognized at all.
Next he tried inspecting the JNDI tree. WebSphere has a utility, dumpNameSpace, that lets
you get the tree in text form. He verified that the entries from the console were there. But a
federated environment uses a federated JNDI setup. So the tree is full of links to other places.
In fact, if you use dumpNameSpace with the default JNDI port 2809, you don't see the entries;
you need to use port 9811 to reach the right location.
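For example, a sketch of the two invocations (dumpNameSpace is the standard WebSphere script; the ports are the values cited above):

dumpNameSpace.sh -port 2809   # node agent's view; the entries are not visible here
dumpNameSpace.sh -port 9811   # the right location in this federated cell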
Knowing the names were in the namespace, he then looked for the reason why they weren't
found. He examined the code RRD generated to do the session bean lookup.
So if the name was valid, then the code had to be looking for it in the wrong place. The
provider URL is specified in an RRD-generated file, ITSCustServ_RtConfig.properties. When
you set up a WebSphere deployment model in RRD, that URL defaults to
iiop://localhost:2809.
What you really want to use is the WebSphere bootstrap port. On a standalone server, it's
2809. But in a federated environment, 2809 is used by the node agent on the local machine.
The bootstrap port is some other value, determined at runtime. It points you to a location
service daemon, which is what you want instead. If you were coding straight J2EE, you'd
instantiate InitialContext with a default constructor and use the default value automatically.
But you can't do that with RRD, so you have to change the configuration.
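In straight J2EE, that lookup would look like this sketch (exception handling omitted; the JNDI name is the one from the error message above):

import javax.naming.InitialContext;

// With no provider URL supplied, the container's defaults apply.
InitialContext ctx = new InitialContext();
Object home = ctx.lookup("ITSCustServ.CustomerUpdateMessageHome");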
At first the developer thought of changing the provider URL setting in RRD and reconstructing
the application. After a couple of tests with no change in outcome, he realized that the
ITSCustServ_RtConfig.properties file on the target server (ITSWEB01) was not being
updated, because he was not using RRD to deploy the application from his development
machine. So he manually copied the file to ITSWEB01. That produced a new error earlier in
the process, on the lookup of the JDBC datasource.
The relevant part of the properties file is an XML element (the file is an XML file, despite its
misleading extension) with an attribute java.naming.provider.URL. Its value initially was
iiop://localhost:2809, from the default RRD setting. The developer tried manually changing it
to iiop://localhost, then iiop://localhost/, both without success. He also tried an empty string,
which gave a different error (because the empty string was treated as the explicit provider URL
value).
Finally, he decided to add some code to instantiate InitialContext with a default constructor,
verify that JNDI lookups on it would work, and see what setting it contained. He did, and the
lookups worked. The setting it used was corbaloc:rir://NameServiceServerRoot. When he put
that URL into the properties file, it worked.
After all that, it occurred to the developer to delete the java.naming.provider.URL attribute
from the properties file entirely. This worked too, and proved better than hard coding the URL.
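For illustration, the two working forms of the properties-file entry would amount to something like the following; the element name is hypothetical, since only the attribute itself is documented above:

<jndiConfig java.naming.provider.URL="corbaloc:rir://NameServiceServerRoot" />
<jndiConfig />   (attribute removed entirely; the better choice)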
12. In fact, RRD doesn't let you choose the JNDI name for the session bean; it creates a name of the form appName.messageNameHome.
You can assign fixed weights to the different targets that govern how load is balanced
among them. This mode is similar to DNS round robin; it balances load but does not
offer graceful failover.
You can configure a load manager that continually pings the targets to confirm their health,
and diverts load away from a target that cannot be reached. The manager pings the target
by invoking the URL of a page on port 80 of the target machine. The page does not have
to exist; as long as some reply (even an error) comes back from the target, the manager
is satisfied.
The team configured a manager to handle failover, but over the course of working with Edge
Server they found it behaving erratically. The manager would mark a target down for no
apparent reason. The team wrestled with the Edge Server configuration, looking for
configuration mistakes that might explain this behavior, but found none. After that they simply
switched to fixed-weight mode to avoid the inconvenience during the tuning phase. But during
the RRD Round Testing Phase this problem became critical.
Only during testing for the WSAD round did a possible explanation emerge. The team
discovered a bad network card on the Work Order host (see Section 10.6.11 for the full story).
Even though Edge Server had no communication with that host, the team wondered whether
the failing card had created noise on the network that interfered with Edge Servers pinging the
Customer Service targets and made it think one had failed. Although they could not prove or
disprove that hypothesis conclusively, after the bad card was replaced Edge Server behaved
flawlessly.
Additionally, the team used other WebSphere settings to tune for session persistence
performance.
This section describes in detail the session persistence techniques and the teams experience
with them.
The team chose the Customer Service database as the logical target and set up a separate,
dedicated datasource to that database for that purpose. To configure session persistence, all
you need to specify is the datasource and database login. WebSphere automatically creates
the necessary table.
While this technique satisfied the functional requirements of the RRD development phase, the
team discovered during tuning that it created a significant performance bottleneck. The team
tried reducing it by adjusting the settings for tuning session persistence (see Section 10.6.7.3
below), but could not get any improvement. At that point they looked for an alternative.
Peer to peer. In this topology, each clustered server in the replication domain holds not only
its own sessions but those of all the other clustered servers. This topology saves you from
configuring additional servers, but it has certain implications for performance and failover:
Lots of duplicate messages. If the cluster has ten servers, every session is replicated to
nine destinations. That requires nine messages in the domain.
Greater memory requirements. Every server must have sufficient memory to hold the
sessions for all ten servers, not just its own.
Client-server. In this topology, additional WebSphere servers act as repositories for replicated
sessions. The clustered servers send their sessions to these repositories, but do not
themselves replicate sessions from other servers in the cluster.
Figure 6. Client-server topology for in-memory session replication
The team briefly experimented with the peer-to-peer topology. But given the memory overhead
it imposed, and given the availability of a host guaranteed not to go down, they shifted to the
client-server topology. They created a WebSphere server on the MQ host for this purpose and
configured it as the session server.
How frequently to write session data. You can choose to write the data at fixed time
intervals or after each servlet service.
What session data to write. You can choose to write the entire session or only the
updated attributes.
These settings let you optimize for performance (infrequent writes, update only), optimize for
failover (write after every service, write all session data), or something in between.
The team experimented with different combinations. With database persistence, their focus
was on improving the poor performance. They chose to write updated attributes only rather
than the entire session, and found this choice improved performance marginally. As for
controlling the write frequency, they could not find an acceptable tradeoff between failover and
performance. Writing after every servlet service was just too slow. Writing at fixed intervals
improved performance only when the interval was too long to provide reliable failover. This
poor tradeoff led them to abandon database persistence.
With in-memory replication, the team found they could write the session after every servlet
service. They also tried writing updates only to improve performance. But at that point the
team encountered strange out-of-memory errors related to session replication. Despite
spending a great deal of time diagnosing this problem, including using WebSphere's diagnostic
trace service to examine the session replication behavior in detail, they could not explain it. But
when they switched from writing updates only to writing all session data, the problem
disappeared.
On the Customer Service side, the team could deploy to one server at a time, downing it
first if necessary. But deployment would have to be handled in a special way, because the
normal process would deploy to both clustered servers at once.
Since the Work Order application was running on only one machine, any changes to it
would have to be deployed while it was running. This would require hot deployment.
The IBM WebSphere literature[13] describes how to hot deploy components in a Network
Deployment environment. But the description is contradictory in a couple of places.
Then, later on, it says: "For changes to take effect, you might need to start, stop, or restart an
application." (This would, of course, defeat the purpose of hot deployment.)
Nevertheless, the document describes the various facets of WebSphere that go into the
process:
From this information, and with much experimentation, the WebSphere team put together
procedures for updating the applications. For the Customer Service application, they used this
procedure in the RRD Round:
2. Stop the first Customer Service server (see Section 10.6.9 for a discussion of how to
do this gracefully).
3. Copy the contents of the staging area to their respective locations in the WebSphere
installation.
13. Go to the WebSphere Information Center at http://publib.boulder.ibm.com/infocenter/ws51help/index.jsp and search on "hot deploy".
4. Restart the server.
2. Deploy the updated Customer Service application through the Deployment Managers
Admin console in the normal way, but do not synchronize when saving the changes.
This installs the updated application on the DM machine only.
4. In the Deployment Manager, synchronize with the first Customer Service node. This
copies the installed application into place.
1. Install the application with reload enabled and the reload interval set to a reasonable
value, such as 60 seconds.
3. Wait until the reload interval has passed to see the changes.
Note that while these procedures cover most situations, there are some circumstances they
cannot handle:
If you update a custom external library deployed to WebSphere's lib folder, you must
restart the server.
If you update an EJB whose stub is cached (as, say, in a service locator class), the stub
becomes stale.
Load balancing. It must balance client traffic evenly among available servers.
Seamless failover. If a server dies in the middle of a conversation with a client, another
server should pick up the conversation without the client knowing it.
To meet these requirements, the architecture must provide two additional capabilities:
While the WebSphere team was able to address load balancing effectively, seamless failover
proved a huge challenge. The team tried several topologies:
A major departure from the standard topology.
This section describes these topologies in turn and the teams experiences with them.
For the first situation, the team had to be able to redirect load away from a server before
downing it. If they stopped a WebSphere server or even simply stopped the application in the
server, errors would occur.
The only tool available for gracefully stopping load was Edge Servers quiesce function.
Quiescing a target server means reducing to zero the traffic to that server. If you use Edge
Server to provide server affinity, you must choose whether or not to quiesce immediately. If
you choose not to, Edge Server allows ongoing conversations to finish on the same target
server, which means that it may take much longer to shift load away from that server.
What's interesting is that Edge Server is used simply to balance load between the Apache
instances and provide failover if one of them dies. The Web server plugin does most of the
work: it provides both load balancing and server affinity (stickiness)[14] with respect to the two
WebSphere instances. It also provides failover capability if one of the WebSphere instances
goes down.
How the plugin distributes load among the target servers is governed by the plugin
configuration file, generated by WebSphere. By default the distribution is equal among the
targets.
The WebSphere team found that this topology worked well for load balancing, but performed
very poorly during failover tests. They could not gracefully bring down a WebSphere server. If
they used Edge Server to quiesce Apache #1 (the instance on Host #1), Apache #2 would
continue to direct load to WebSphere #1.
Figure 8. Standard system topology for load balancing and failover after quiescing Apache #1 from Edge Server
This configuration required manually editing the plugin configuration file to undo the clustering
and have each Apache instance serve one WebSphere instance only. Each host had its own
customized plugin configuration.
14. WebSphere provides stickiness by appending a unique server ID to the session ID that is passed back to the client via a cookie or URL
rewriting. When a new request in the same session comes in, the plugin detects the server ID and routes the request to that server, if possible.
Figure 9. Non-standard system topology for load balancing and failover
This topology also proved troublesome. It did solve the problem of stopping traffic to a
WebSphere instance: when the team used Edge Server to quiesce one of the Apache
instances, traffic to the corresponding WebSphere stopped as well. But this topology required
Edge Server to provide stickiness, and that was difficult to control. Quiescing immediately led
to many errors, while quiescing slowly took much too long. The team found it hard to shut off
Edge Server's traffic to a target server in a timely, seamless fashion.
The team found this alternative worse than the first and reverted to the standard topology.
Figure 10. Modified standard system topology for load balancing and failover
This topology allowed the team to take advantage of the plugins stickiness and failover
capabilities while at the same time channelling traffic to a specific WebSphere instance. When
the team wanted to bring down WebSphere #1, they would do the following:
1. Use Edge Server to quiesce Apache #1 immediately. With all traffic going to Apache #2,
which favored WebSphere #2 (apart from stickiness), eventually traffic would shift over
to WebSphere #2.
2. Monitor CPU activity on Host #1 and wait until it subsided. That would indicate that the
traffic had shifted.
Although failover still was not seamless, this was the most successful topology.
Developer B double-checked that both these references were using the explicit IP address of
the Web service host. Then he gathered other clues:
When he ran the Customer Service application on the WSAD test environment and tried to
use the Web service on the production machine, it worked.
When he manually invoked the direct URL of the Web service[15] from a browser on the
Customer Service host, it gave back the proper response page.
So the Web service was responding; the problem seemed to be with the client application.
Working together, the team noticed that the webservices.jar library in WebSphere's
AppServer/lib folder was older than the one in the WSAD test environment. They wondered
whether this discrepancy was causing the problem. When they substituted the library from
WSAD's runtimes/base_v51_stub folder, that solved the problem.
15. The URL was http://192.168.4.215:9080/itsWorkOrderConsoleWeb/services/SearchTicketsWS; it responded with a page that simply said "And
now some services."
The very next day, however, the application started behaving badly. As load ramped up,
hits/sec would climb erratically; response times were horrible. The team had no idea what had
thrown this particular monkey wrench, so they pursued all the usual suspects:
None of these actions solved the problem. Moreover, a load test on the Customer Service
application gave respectable response times on Web service queries. This clue narrowed the
focus to the Work Order application itself. So the team looked at it more closely.
After this round of effort, another test of the Work Order Web application cranked up to over
3500 users. That was nice to see, but it didn't explain what the original problem was, or whether
the team had solved it.
In fact, they hadnt. Despite successful load tests, the horrible results returned, proving again
the maxim, Things that go away by themselves can come back by themselves.
By the beginning of the testing phase the problem had worsened; the response time of the
Work Order application was orders of magnitude greater than that of the Customer Service
application. The team rechecked everything again in a vain search for the cause.
Then a new, tantalizing clue suddenly appeared: Ethernet 0 on the Work Order machine
started to fail, going off and back on intermittently. Could the network card be responsible for
the poor response times? There were two subnets connecting the servers, 192.168.4 and
192.168.5. The failing network card provided a .4 address, the address used by the load
testing scripts. But the Web service (which was performing well) was accessed through a .5
address provided by a second card. So the failing card became the prime suspect in this
mystery.
CN2 immediately replaced the failing card, after which the team reran the Work Order load test.
Lo and behold, the results were now on par with those of Customer Service. (More importantly,
those results would remain stable through the rest of the project!) Everyone involved
concluded that the failing card had been the culprit.
But discovering the failing card raised two new questions. First, did it also explain the tendency
of Edge Servers load manager to occasionally mark a target server down for no apparent
reason (described in Section 10.6.6)? The team wondered whether the failing card had created
noise on the network that would fool Edge Server into believing one of its targets had failed.
They would keep their eye on Edge Server. (In fact, it behaved flawlessly from that point on.)
Second, did the bad card compromise the results of the previous performance tests? Since
there was no way to answer that question with certainty, CN2 insisted on rerunning the tests for
both the .NET and RRD implementations, to ensure the validity of the results.
But like any complex, sophisticated piece of software[16], LoadRunner can behave strangely if
not configured and used exactly correctly. Over the course of tuning and testing their system,
the team learned some important lessons about LoadRunner that would apply to comparable
products:
Adjust the scripts in response to page changes. Any change to the URL of a request
(including changes to query parameters) affects a script that invokes that URL. The same is
true of the fields in a form; if you add, remove or rename a field, the script must be corrected.
Double-check the runtime settings. Twice the team got bizarre test results because a simple
LoadRunner runtime setting was wrong.
In one test, the Work Order Web application suddenly began overloading at a fraction of the
load it had handled the day before. The cause turned out to be the LoadRunner runtime setting
governing how think time was handled. The scripts had 5-second think times hard coded
before each Web request, but when running a script you can control whether and how that think
time is used. The correct setting was to use a random value between 50% and 150% of the
coded time.[17] But in this case, think time was accidentally turned off, so naturally the system
started hyperventilating very quickly.
On another occasion the opposite occurred. The team suddenly found it could ramp up to
loads much higher than those achieved earlier. These results were too good to be trusted.
Again, the culprit was an incorrect LoadRunner runtime setting, the one that controls how many
seconds a user waits between iterations of the script. The correct setting was zero, but in this
case it had been accidentally changed to 60 seconds.
Periodically refresh the clients. The team found it prudent to reboot the client machines
occasionally, to ensure they performed properly. They also had LoadRunner periodically
refresh the scripts on the client machines to guard against potential script corruption.
16. For some reason, application servers come to mind.
17. Randomization is important because it helps stagger the requests and more evenly distribute the load.
11 .NET CONFIGURATION AND TUNING PROCESS SUMMARY
This section describes the process the .NET team went through to configure and tune the .NET
and Windows infrastructure. It also describes the major bottlenecks encountered and resolved
in the implementation.
Here is a high-level summary of the stages the team went through. Details follow in the
sections below.
1. Install and configure the software: Network Load Balancing (NLB) and ASP.NET State
Server.
The diagram below shows the network topology relevant to the ITS Customer Services
application and how the .NET team configured it for NLB.
Figure 11. Network topology and Windows NLB configuration for load balancing and failover in the .NET implementation
Since each Web server has multiple network interface cards (NICs), the team decided to
configure the NLB in unicast mode for better performance. They also followed a best practice
for this mode: connecting the clustered network interfaces to a single hub that is up-linked to
the public switch. This practice prevents the NLB from flooding the switch (a condition known
as port flooding) and degrading the entire network's performance. The hub is used by the
servers in the cluster to communicate with each other via a heartbeat process.
There was an additional requirement for using this setup. For NLB to function properly with the
hub configuration, MaskSourceMac had to be disabled. To turn it off, the team set this registry
key to 0:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WLBS
\Parameters\MaskSourceMac
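As a sketch, the same change can be made from the command line with reg.exe (available on Windows Server 2003):

reg add HKLM\SYSTEM\CurrentControlSet\Services\WLBS\Parameters /v MaskSourceMac /t REG_DWORD /d 0 /f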
The hub and the clustered network interfaces would take care of the incoming traffic. With
performance in mind, the .NET team configured the second network interface card (gigabit) for
dedicated outgoing traffic back to the clients.
To simplify the configuration, the team used the interface metric setting for the outbound gigabit
NIC to ensure it was the default NIC used for outbound traffic. To do this, they simply used
TCP/IP properties to set the interface metric of the dedicated outbound NIC to a lower setting
than that of the cluster NIC, which NLB used to communicate over the hub via its heartbeat
process. In the diagram, these interface metrics are 1 and 2 respectively. This configuration is
documented in the help files for setting up network load balancing.
With this design, routing incoming and outgoing traffic through different dedicated channels, the
team could achieve better overall network performance.
Once the team had figured out the load balancing plan, setting up the clustered environment
was straightforward. Using Network Load Balancing Manager, they could configure the
clustered environment or an individual host from any server, making the maintenance task very
simple. Network Load Balancing has both graphical and command line interfaces in Windows.
Commands enable admins to stop, add and monitor servers in the cluster. One particularly
useful command is drainstop, which enables a server in the cluster to finish handling its
current requests, while ceasing to take new requests. Once all requests have been drained
from that server, it can be taken offline for possible maintenance.
It is important to note that the .NET team chose to hold session state in an ASP.NET session
state server (rather than in the Customer Service servers themselves), and that the session
server was placed on a machine outside of the cluster (the same machine handling the durable
MSMQ message queue). This topology made the Customer Service application cluster safe,
though it also carried some tradeoffs:
A possible performance cost because the Customer Service application obtained all
session data remotely rather than locally.
The session state server as a single point of failure. Because the ITS specification
guaranteed that the MQ host would always be available, the team did not concern itself
with this issue; had it been necessary, they could have used a clustered SQL Server for the
session state data store which would have removed the single point of failure.
1. Open the Services console on the state server machine.
2. In the details pane, right-click ASP.NET State Service, and then click Properties.
3. On the General tab, in the Startup type list box, click Automatic.
4. Under Service status, click Start, and then click OK. The state service starts
automatically when the Web server restarts.
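With the service running, each Web server's configuration points session state at the state server. A sketch of the Web.config entry (the host address is illustrative; 42424 is the state service's default port):

<sessionState mode="StateServer"
    stateConnectionString="tcpip=192.168.4.220:42424"
    cookieless="false" timeout="20" />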
To configure the clustered application to use a single session state server, there are a few
documented considerations to keep in mind.
Make sure all applications use the same machine key. The team added the same
<machineKey> entries to the Web.config files (see the sketch after this list).
Make sure all objects that are stored in session state can be serialized. It is easy to
implement this in .NET by adding the Serializable attribute to each class that needs
serializing (also shown below).
Make sure the Customer Service application has the identical Application Path on
both Web servers. This means ensuring all installations of the application have the same
URL.
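Sketches of the first two configuration points above; the key values are elided and the class name is illustrative:

<machineKey validationKey="..." decryptionKey="..." validation="SHA1" />

// C# sketch: mark session-stored classes serializable
[Serializable]
public class CustomerProfile
{
    // fields held in session state
}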
Initially, they had all ticket searches return the top 500 rows. This turned out to be a significant
amount of data with a significant impact on performance. After the team implemented paging
(which meant each query returned only enough data to satisfy the current page), the database
returned an average of 20 rows per search (based on the mix of requests in the test scripts),
with a commensurate improvement in performance.
Note: Many of the actions the team took are discussed in the MSDN article "Developing High-
Performance ASP.NET Applications" in the .NET Framework Developer's Guide.
See Section 17.2 for the URL.
Added CustomerID cookie in CustomerService App to reduce Session State during Web
service call
Disabled unused HttpModules (e.g. output caching)
11.3.9 Changes to Machine.config
The team made these changes to Machine.config:
<connectionManagement>
<add address="*" maxconnection="200" />
</connectionManagement>
<httpRuntime executionTimeout="90" maxRequestLength="4096"
useFullyQualifiedRedirectUrl="false"
minFreeThreads="8" minLocalRequestFreeThreads="4"
appRequestQueueLimit="3000" enableVersionHeader="true" />
<processModel enable="true" timeout="Infinite" idleTimeout="Infinite"
shutdownTimeout="0:00:05"
requestLimit="Infinite" requestQueueLimit="5000" restartQueueLimit="10"
memoryLimit="60" webGarden="false"
cpuMask="0xffffffff" userName="machine" password="AutoGenerate"
logLevel="Errors" clientConnectedCheck="0:00:05"
comAuthenticationLevel="Connect" comImpersonationLevel="Impersonate"
responseDeadlockInterval="00:03:00"
maxWorkerThreads="16" maxIoThreads="16" />
The team attempted to correct this condition, following documented Microsoft Knowledge Base
articles, by setting the session state network timeout values higher than the default. This did not
affect the condition.
The team suspected two possible factors contributing to this problem:
They configured too many network connections (200) for the .NET HTTP module.
Microsoft documents settings much lower than the 200 network connections used (the
default setting is 2, but does need to be adjusted upwards on the Web service client
machine).
They did not increase the number of I/O threads to handle the number of network
connections. Analysis of MSDN material indicated that such a high setting might also
require increasing the I/O thread pool.
Since this problem only occurred after server saturation and above the one-second
performance testing cutoff, the team chose not to spend more time diagnosing it.
12 PERFORMANCE TESTING
The first three were similar: Mercury LoadRunner was used to subject each application to load
in a variety of tests. Each test consisted of ramping up user load gradually over time to plot
system throughput curves, measure transaction response times, determine maximum user
loads supported, and track error rates under load. Identical test scripts were created to test
each of the three implementations in a consistent manner.
The following section details the tests and presents the summary findings from the auditor's
report.
The test scripts simulated simultaneous users accessing the ITS Customer Service Application
to:
Since the Customer Service application was integrated with the Work Order application via
messaging and a Web service, putting load on the Customer Service application also exercised
the Work Order application and message queue server.
Customer updates would cause the Customer Service application to not only update its
local database, but also send a message to the Work Order application to update its
database (which replicates customer data) as well.
The Customer Service application would submit new work orders by sending messages to
the Work Order application.
To perform ticket queries, the Customer Service application would invoke a Web service
provided by the Work Order application.
The following table summarizes the auditor's results for the Customer Service application
performance test. Please note that a transaction represents a complete business operation
invoked by a user, such as executing a ticket search and receiving the first results page. So
272 transactions per second, for example, is equivalent to 979,000 business operations per
hour or 23.5 million business operations per day.
Failed transactions as percentage of total: 0.00% / 0.02% / 0.00%
This graph shows the performance of the three implementations at each user load:
[Graph: average TPS vs. number of virtual users (500-4000) for the WSAD, RRD and .NET 1.1
implementations. Note: the last data point is the average TPS past the 1-second cut-off.]
The .NET implementation performed slightly better than the WSAD, reaching about 10% higher
peak throughput at the same peak user load. And both performed far better than the RRD
implementation.
The test scripts simulated simultaneous users performing these operations:
log in
search for customers
modify customer information
search for technicians
modify technician information
search for work tickets
The following table summarizes the auditor's results for the Work Order application
performance test. Please note that a transaction represents a complete business operation
invoked by a user, such as executing a ticket search and receiving the first results page. So
260 transactions per second, for example, is equivalent to 936,000 business operations per
hour or 22.5 million business operations per day.
Failed transactions as percentage of total: 0.00% / 0.11% / 0.00%
This graph shows the performance of the three implementations at each user load:
[Graph: average TPS vs. number of virtual users (500-5000) for the WSAD, RRD and .NET 1.1
implementations. Note: the last data point is the average TPS past the 1-second cut-off.]
In this test the WSAD implementation far surpassed the other two. It achieved 75% more
throughput at 50% higher peak user load. Again the RRD implementation fell far behind the
other two.
12.2.3 Integrated Scenario
This test stressed both systems at once, running the ITS Work Order test (25% of the load) and
the ITS Customer Service application test (75% of the load). In this case, LoadRunner clients
were accessing both Web applications, running the same scripts as in the two individual tests.
This test ramped up user load at a rate of 661 new users every 15 minutes: 500 on the
Customer Service application, 161 on the Work Order Web application.
The following table summarizes the auditors results for the integrated scenario performance
test, which included both the Customer Service and Work Order applications. Please note that
a transaction represents a complete business operation invoked by a user, such as executing
a ticket search and receiving the first results page. So 365 transactions per second, for
example, is equivalent to 1.3 million business operations per hour or 31.5 million business
operations per day.
Failed transactions as percentage of total: 0.00% / 0.00% / 0.00%
And the graph of performance at each user load:
[Graph: average TPS vs. number of virtual users (661-5449) for the WSAD, RRD and .NET 1.1
implementations. Note: the last data point is the average TPS past the 1-second cut-off.]
In this test of the complete system, again the .NET implementation was the clear winner. It
achieved 1/3 higher peak throughput at 50% higher peak user load. Interestingly, though, the
difference between the WSAD and RRD performance was much smaller for this test than for
the others.
The teams were asked to halt their Work Order message processing module and make sure
the work ticket message queue was empty. A script was run to load the queue with exactly
20,000 messages. Once the queue was loaded, the Work Order application performing
message processing was restarted, and the time it took to process the 20,000 messages was
measured.
The WSAD implementation was the clear winner over the .NET implementation in this test, by
more than a factor of two. The difference may be attributable to differences in code,
infrastructure, or both. On the .NET side, the team had to create an additional message
forwarding layer to compensate for the fact that one cannot do transactional reads from a
remote MSMQ queue (see Section 8.4.3.1). This layer, not necessary for the J2EE
implementations, undoubtedly added overhead. On the other hand, the .NET team's initial
throughput was much worse (2 messages per second); they improved it to this level by
reconfiguring the message processing application from single- to multi-threaded.
Equally interesting is the fact that the RRD implementation lagged so far behind the WSAD,
despite their using the same message server (WebSphere MQ) and the same basic
architecture (message-driven EJBs). The main reason is the generated RRD code that
acquires message-related resources. In particular, RRD code creates a new JNDI
InitialContext object whenever it wants to do a JNDI lookup. For the WSAD version, the team
used a ServiceLocator class that cached the InitialContext and the queue connection factory.
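A minimal sketch of that caching approach; the class and method names are illustrative, not the team's actual code, and the JNDI name is supplied by the caller:

import javax.jms.QueueConnectionFactory;
import javax.naming.InitialContext;
import javax.naming.NamingException;

public final class ServiceLocator {
    private static InitialContext ctx;
    private static QueueConnectionFactory qcf;

    // Create the InitialContext once, then reuse it for every lookup.
    public static synchronized InitialContext getContext() throws NamingException {
        if (ctx == null) {
            ctx = new InitialContext();
        }
        return ctx;
    }

    // Look up and cache the queue connection factory on first use.
    public static synchronized QueueConnectionFactory getQueueConnectionFactory(String jndiName)
            throws NamingException {
        if (qcf == null) {
            qcf = (QueueConnectionFactory) getContext().lookup(jndiName);
        }
        return qcf;
    }
}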
The most obvious conclusion is that the RRD implementation fell short of the others by a wide
margin. The .NET and WSAD implementations outperformed it by a factor of two. The
explanation lies largely in the nature of RRD as a development tool (discussed in Section
7.1.1). Designed to speed development and distance the developer from J2EE coding, RRD
generates Java code that in many ways is not optimized for performance. The J2EE team
spent a great deal of time trying to compensate for those code limitations.
Comparing the WSAD and .NET implementations, we find a much closer match. .NET
performed significantly worse on the Work Order and messaging tests, slightly better on the
Customer Service test, and significantly better on the integrated scenario test (which is the one
most like the real world). Taking the total of the three load tests, the .NET and WSAD
implementations come out nearly even. On the message throughput test, the WSAD
implementation surpassed the .NET by more than double.
The ability of the WSAD implementation to roughly match its .NET counterpart in performance
suggests that the WebSphere/Linux platform performed on a par with the .NET/Windows
platform. Of course the study revealed other indicators where the two platforms differed
greatly. But strictly in terms of performance (as measured by these tests), the two platforms
are comparable.
13 MANAGEABILITY TESTING
1. In the Work Order application, change the ordering of results from a database query
2. In the Customer Service application, add a new Web page and modify an existing page
3. In one of the Customer Service applications Web pages, change a drop-down list
whose contents are hard-coded so that it instead binds to a database table.
Each test proceeded in two parts, development and deployment. For development, the team
made, tested and verified the change in a test area, while the auditor timed the process.
For deployment, the user load was ramped to 1,750 concurrent users (1,400 against the load
balanced Customer Service application, 350 against the Work Order application). Note that
even though Request #1 did not affect the Customer Service application, that application was
still running under load during the test. The team was then asked to deploy the change to the
appropriate server(s). The auditor measured the time taken to deploy, the number of errors
occurring during deployment, and whether the Customer Service application preserved session
state.
The following section details the tests and presents the summary findings from the auditor's
report.
Results generated from the Ticket Search Page need to be ordered in descending order by
date with most recent tickets displayed first.
Here are the summary results from the auditor's report:
Development
Deployment
Explosion of errors. The WSAD experience was somewhat bizarre. After the team had
successfully deployed the changes to the Work Order Web application, the Customer Service
side of the system (which was also running under load), began to fail. This occurred even
though the changes did not affect it directly; the only modified code was in the freestanding
Work Order Web application, not the Work Order module that processed messages or hosted a
Web service.
The team had no hypothesis to explain this failure. And under the time constraints they could
not properly address it. They are confident, however, that a solution exists and, given enough
time, they could have made this work.
Auditor's observations. Given the high number of errors that occurred for the two J2EE
implementations, the auditor included these observations in the report:
RRD: For this change request, the development team chose to modify the middle-tier
application logic, since this contained the database query to which the order-by clause
needed to be applied. To ensure continued query performance, the team also chose, in
tandem, to change an index in the database schema so that the order-by clause
could be completed efficiently. The 874 errors can be attributed to both the recompiling
of the application and the changing of the database index while under load.
WSAD: A portion of the errors could be attributed to both the recompiling of the application
and the changing of the database index while under load. However, in this exercise it was
also observed that while Apache had systematically crashed during the live deployment,
Edge Server was not part of that particular issue. Technically this test was stopped and
allowed to pass; however, the CS portion of the site should not have been affected by the
change to the WO server. Since synchronization was not a step performed by Middleware,
there is no real cause for the CS site to suddenly go offline. At the time noted for "System
Back Online," the user load was continued on the system for approximately 5 more
minutes after the test ended to verify that the bouncing of CS#1 worked and that
transactions were being passed successfully.
.NET: The five errors were time-out errors (120 seconds), most likely due to the application
recompiling and being reloaded into memory after being redeployed.
13.2.2 Change Request 2: Adding a Web Page
The second change request applied to the Customer Service application. It required changes
to the display of news items on the home page and the creation of a new page for adding new,
company-specific news items. The request stated:
Add a new page to the Customer Service application that will allow administrators for each
company to generate new news bulletins that are displayed for their company. This part
requires 2 changes: adding the Web form to allow news items to be submitted and stored in
the database, and adding a column to the database table allowing news items to be
tracked by unique customer ID. A CompanyID of zero specifies that a news item will
display for every company. Unauthenticated users will see only default news items.
However, once logged in, the news items for that company will also display, in addition to
the default news items.
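As a rough illustration of the filtering rule described above, here is a minimal JDBC sketch;
the NEWS_ITEM table and its columns are assumptions for the example, not the ITS schema:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class NewsItems {

        // Change Request 2: CompanyID 0 marks a default item shown to everyone;
        // an authenticated user also sees the items for his or her own company.
        public List loadNews(Connection conn, int companyId) throws SQLException {
            String sql = "SELECT HEADLINE FROM NEWS_ITEM "
                       + "WHERE COMPANY_ID = 0 OR COMPANY_ID = ? "
                       + "ORDER BY POSTED_DATE DESC";
            PreparedStatement stmt = conn.prepareStatement(sql);
            try {
                stmt.setInt(1, companyId);  // pass 0 for unauthenticated users
                ResultSet rs = stmt.executeQuery();
                List headlines = new ArrayList();
                while (rs.next()) {
                    headlines.add(rs.getString("HEADLINE"));
                }
                return headlines;
            } finally {
                stmt.close();
            }
        }
    }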
[Summary results table: Development and Deployment results for Change Request 2]
13.2.3 Change Request 3: Data-Binding a Drop-Down List
The third change request also applied to the Customer Service application. The request stated:
Databind the Work Status drop-down list to a table in the database instead of hard-coding
the values in the HTML.
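A minimal sketch of the data-access side of such a change, again with assumed table and
column names (WORK_STATUS, STATUS_NAME) rather than the actual ITS schema, might be:

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.ArrayList;
    import java.util.List;

    public class WorkStatusOptions {

        // Change Request 3: read the Work Status values from a table rather
        // than hard-coding <option> elements in the HTML.
        public List loadStatuses(Connection conn) throws SQLException {
            Statement stmt = conn.createStatement();
            try {
                ResultSet rs = stmt.executeQuery(
                    "SELECT STATUS_NAME FROM WORK_STATUS ORDER BY STATUS_ID");
                List statuses = new ArrayList();
                while (rs.next()) {
                    // each row becomes one entry in the rendered drop-down list
                    statuses.add(rs.getString("STATUS_NAME"));
                }
                return statuses;
            } finally {
                stmt.close();
            }
        }
    }

The page layer would then iterate over the returned list to emit the <option> elements that
were previously hard-coded.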
Here are the summary results from the auditor's report:
[Summary results table: Development and Deployment results for Change Request 3]
In summary, the .NET implementation held the advantage on two counts.
First, it had fewer errors during deployment under load in all three tests. For the changes to the
Customer Service application, the RRD and WSAD implementations both produced relatively
small numbers of errors, whereas the .NET implementation produced zero. The greatest
difference lay in deploying to the Work Order application, where (as noted above in Section
13.2.1) the WSAD implementation suffered an inexplicable catastrophic failure.
Second, the .NET implementation properly maintained session state in the two tests involving
the Customer Service application. The RRD implementation failed once on that count, the
WSAD implementation twice.
14 RELIABILITY TESTING
The reliability testing consisted of four tests:
1. Controlled failover: Gracefully shut down a Customer Service load-balanced server, then
bring it back online within the cluster
2. Catastrophic failover: Abruptly down a Customer Service load-balanced server (pull the
plug), then bring the failed server back online within the cluster
3. Loose coupling: Power off the Work Order application while running the Customer
Service application (to test the loosely coupled nature of the applications)
4. Long duration: Run the entire system at normal load for 12 hours
The following section details the tests and presents the summary findings from the auditor's
report.
Failover: could the system provide additional capacity during operations?
The WebSphere team did poorly with the RRD implementation, but much better with the WSAD
implementation. The difference is explained more by how WebSphere was configured than by
the implementations themselves. By the second round (WSAD), the team had worked out a
more reliable means of handling failover, which centered on skewing the load balancing that
each Apache instance performed to favor its colocated WebSphere server. See Section 10.6.9
for a complete discussion.
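The report does not reproduce the team's actual configuration. As a rough illustration only,
the WebSphere HTTP plugin that runs inside Apache reads its routing rules from
plugin-cfg.xml, where per-server weights can skew the distribution; the cluster and server
names below are invented for the example:

    <!-- Illustrative fragment only, not the team's actual plugin-cfg.xml.
         Each Apache instance weights its colocated WebSphere server far more
         heavily, so most requests stay on the local machine. -->
    <ServerCluster Name="CustomerServiceCluster" LoadBalance="Round Robin">
        <Server CloneID="local1" Name="LocalWebSphereServer" LoadBalanceWeight="20">
            <Transport Hostname="localhost" Port="9080" Protocol="http"/>
        </Server>
        <Server CloneID="remote1" Name="RemoteWebSphereServer" LoadBalanceWeight="2">
            <Transport Hostname="cs2.example.com" Port="9080" Protocol="http"/>
        </Server>
    </ServerCluster>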
Despite the progress they made in handling a controlled shutdown, the WebSphere team still
could not get the system to handle a catastrophic failure gracefully. The downed server did not
come back properly, leading to a huge number of errors. This was another case where the
team had no explanation for the problem and insufficient time to solve it.
The .NET implementation handled the failover well, with minimal errors.
On the remaining two tests (the loosely coupled test and the 12-hour sustained operation), all
three implementations performed with equal reliability. All three were able to sustain an
average response time of less than 1 second, and no implementation threw any errors at this
user load for the full 12 hours.
Again, we note that the manageability and reliability testing results show what the teams were
able to achieve given the time each took for configuration and tuning in preparation for the
tests. With additional time and/or involvement by vendors themselves, improved results might
have been achieved.
Two elements of Microsoft's approach shaped these results: a focus on the Windows platform
to provide tight integration between the OS and the development framework and tools, and
standardization on Visual Studio .NET as the primary development tool for .NET.
Developer productivity. Microsoft's tight-integration approach paid off in the development
phase, where VS.NET and the .NET platform proved more productive than either RRD or
WSAD with the WebSphere platform. Among the reasons:
The position of VS.NET as the premier .NET development tool all but guaranteed an
equivalence between VS.NET experience and .NET platform experience. In other words, a
developer with three years of .NET experience has most likely used VS.NET for three
years, whereas a developer with three years of J2EE experience may not have used RRD or
WSAD at all.
VS.NET shared some of the best features of both RRD (visual page design; data binding)
and WSAD (direct coding of business logic; tight integration with the target platform).
Installation and configuration of software. Tight integration paid off here, too, for the .NET
team. Most key elements of the .NET runtime infrastructure (basic application platform, Web
server, load balancer, session server, message server) were already in place with the basic
Windows Server 2003 installation. This fact saved the .NET team a great deal of time and
trouble.
The WebSphere team, by comparison, spent a great deal of time during the development
phase installing the software and configuring it for basic functional tests. They also spent
considerable time overcoming fundamental configuration obstacles, such as patching the Linux
kernel for Edge Server and configuring WebSphere for session replication. The .NET team did
not face such obstacles.
System tuning. The .NET team completed their tuning process much more quickly. One
obvious reason is that they had fewer knobs to turn. A J2EE system has many more moving
parts that interact in many combinations, making the tuning process all the more complex. The
WebSphere team took a methodical approach to tuning that certainly proved more
time-consuming.
Performance. In terms of sheer processing throughput, the .NET and WSAD implementations
performed comparably. [18] In one particular area, message processing, the .NET version fell
far short, but this is most likely explained by the more complex architecture the .NET
messaging implementation required.
[18] It would be interesting to know to what degree, if any, the different operating systems
contributed to the performance results. Unfortunately, the data from this study sheds no light
on that question.
Manageability & reliability. The .NET implementation consistently and reliably handled
service interruptions, both controlled and unexpected. It also allowed the team to deploy
application updates much more smoothly.
The WebSphere team, on the other hand, encountered catastrophic failures that they could not
diagnose or explain sufficiently to overcome. They also found session persistence less than
reliable. The team feels they could have solved these problems given more time;
unfortunately, time was a measured resource in this study.
Overall, by most indicators in this study, the .NET implementation running on Windows Server
2003 was better, in some cases significantly so, than either WebSphere/J2EE implementation
running on Linux. Are these results surprising?
It makes sense that using an integrated, out-of-the-box "operating system and application
server" framework such as Windows and .NET would have a much lower setup cost than
attempting to integrate multiple products (albeit from the same company) with a third-party OS.
Although IBM products have come a long way since 1998, they still have some way to go in
providing the seamless integration Microsoft can offer.
Nor should it surprise anyone that the development productivity results favor the .NET side;
productivity has always been one of Microsoft's strong points. Perhaps a more noteworthy
result is that WSAD came much closer to Visual Studio than it would have a couple of years
ago.
Regarding performance, the RRD results should not come as a shock. Any code-generation
scheme will always have a difficult time holding its own against tightly written, hand-crafted
code. What is worth noting, however, is how close the WSAD and .NET performance results
came. This outcome basically means that both IBM and Microsoft have done a good job getting
the most out of the hardware resources on which their platforms run. The only truly unexpected
result was that the WSAD message processing (using JMS and message-driven EJBs)
performed so much faster than the .NET implementation.
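For context, the J2EE messaging model the WSAD implementation used pairs a JMS queue
with a message-driven bean. The following minimal EJB 2.x sketch shows the shape of such a
bean; the class name and message handling are assumptions, not the team's actual code:

    import javax.ejb.MessageDrivenBean;
    import javax.ejb.MessageDrivenContext;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    public class WorkOrderMessageBean implements MessageDrivenBean, MessageListener {

        private MessageDrivenContext context;

        public void setMessageDrivenContext(MessageDrivenContext context) {
            this.context = context;
        }

        public void ejbCreate() { }

        public void ejbRemove() { }

        // The container pulls each message off the queue and delivers it
        // here, inside a container-managed transaction.
        public void onMessage(Message message) {
            try {
                if (message instanceof TextMessage) {
                    String body = ((TextMessage) message).getText();
                    // process the work order request (omitted)
                }
            } catch (Exception e) {
                // roll back so the container can redeliver the message
                context.setRollbackOnly();
            }
        }
    }

The deployment descriptor declares the bean and its destination; delivery, pooling, and
transactions are managed by the container rather than by application code.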
Given that the J2EE approach to enterprise software is very much about competition and
choices, we might well ask whether the most significant problems encountered by the
WebSphere team could have been helped or eliminated through different choices.
RRD vs. WSAD? RRD, chosen initially for its development productivity, did not deliver that
productivity in this study. Given that the specification included J2EE technologies beyond the
core, such as a handheld application and a Web service, and given that the WebSphere team
consisted of skilled J2EE developers comfortable with those technologies' APIs, WSAD was a
much better choice and, the team feels, would have compared favorably with VS.NET
strictly in terms of code production.
In terms of producing a high-performance implementation, WSAD was clearly the better choice
over RRD.
Linux? Given the choice of Edge Server for load balancing, the WebSphere team had to patch
and upgrade the Linux kernel to make it work. This process requires skills common to Linux
experts but not necessarily to the average J2EE developer or IT team. There is no question
that Linux added a layer of complexity to the configuration process.
Configuring session sharing and failover? Two of the most vexing problems for the
WebSphere team were their inability to get session replication working reliably and to configure
the system for robust failover. During the manageability and reliability tests they found the
system crashing for reasons they could not explain in sufficient depth to correct the problems.
But knowing that robust, successful WebSphere installations exist in the world, the team feels
certain they could have solved these problems, given enough time.
WebSphere? And finally, of course, one has a choice of J2EE platforms. This study examined
the use of one particular platform in a carefully constructed experiment. These results do not
speak to the qualities of others.
The ITS system specification was the basis for the two teams' development.
The Independent Auditor's report (the CN2 report of the study results) provided most of the
result data cited in this report.
WebSphere Edge Server for Multiplatforms, Network Dispatcher Administration Guide, Version
2.0
[Pricing table for the WebSphere/Linux configuration; columns: Item, Product, Price/unit,
Units, Extended Price]
Total: $253,996
The only version of Red Hat Linux supported on 4-CPU servers is Red Hat Enterprise Linux AS.
Support for this product is available from Red Hat in a Standard or Premium subscription, on a
per-system, per-year basis. The Standard subscription includes 9am-9pm telephone support
(US Eastern time), with a 4-hour response time. The Premium subscription offers 24/7 telephone
support and a 1-hour response time. This pricing configuration uses the Standard subscription.
WebSphere Application Server is available in multiple versions and editions. This configuration
required load balancing and failover, which are available in WebSphere Application Server ND.
WAS ND is licensed on a per-CPU basis, and the initial license includes 1 year of product
telephone support and maintenance. This pricing configuration uses prices from IBM's
Passport Advantage Express discount purchasing program, IBM's transaction-based licensing
program.
[Pricing table for the .NET/Windows configuration]
Total: $19,294.46
The version of Windows Server required for a 4-CPU server is the Enterprise Edition. In addition
to the base Windows Server license, customers enabling authenticated external connections to
Windows Server need to purchase the External Connector license. Typically the External
Connector is required for e-commerce applications. No External Connector is required in
this case, since the application did not authenticate incoming requests against Active Directory.
Also, no Client Access Licenses (CALs) are required in this case. The Windows
Server license was priced with Software Assurance (SA), through the Open Value Licensing
program, the transaction-based licensing program available through qualified Microsoft
resellers. The license and software assurance plan priced here provides maintenance and
updates, 24x7 Web support, and telephone support during business hours for these
products, for a period of 2 years.
Visual Studio:
http://msdn.microsoft.com/vstudio/howtobuy/pricing.aspx