RealityBehind RAC
Abstract:
This document studies the key features of Oracle Real Application Clusters (RAC),
versions 9i and 10g, and the true benefits and issues around its
implementation. This paper delves into the various abilities claimed by Oracle and
provides results from independent research on which are real and which are just
marketing messages.
The information contained in this document represents the current view of Microsoft
Corporation on the issues discussed as of the date of publication. Because Microsoft
must respond to changing market conditions, it should not be interpreted to be a
commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of
any information presented after the date of publication.
This White Paper is for informational purposes only. MICROSOFT MAKES NO
WARRANTIES, EXPRESS OR IMPLIED, AS TO THE INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without
limiting the rights under copyright, no part of this document may be reproduced, stored
in, introduced into a retrieval system, or transmitted in any form or by any means
(electronic, mechanical, photocopying, recording, or otherwise), or for any purpose,
without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other
intellectual property rights covering subject matter in this document. Except as
expressly provided in any written license agreement from Microsoft, the furnishing of
this document does not give license to these patents, trademarks, copyrights, or other
intellectual property.
© 2004 Microsoft Corporation. All rights reserved.
Windows® and Windows NT® are either registered trademarks or trademarks of Microsoft
Corporation in the United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks
of their respective owners.
Table of Contents
The Reality behind Oracle Real Application Clusters (RAC) Marketing Messages
Table of Contents
Executive Summary
Introduction
Target Audience
What is Oracle Real Application Clusters?
How RAC works
Summary
Oracle Claim: “RAC is Scalable”
The Truth: Unscalable Architecture
Summary
Oracle Claim: “RAC Provides High Availability”
The Truth: Downtime Still Unavoidable
The Truth: Transactions Do Not Failover Automatically
Summary
Oracle Claim: “RAC Lowers Total Cost of Ownership”
Oracle’s TCO Claims
The Truth: RAC will probably increase cost
The Truth: RAC Increases Licensing Costs
Summary
Conclusion
Appendix I: Resource links
Appendix II: References
Executive Summary
Oracle has touted its Real Application Clusters (RAC) technology as the all-encompassing
solution for today’s enterprise computing requirements in scalability and
high availability. With the release of its latest database, Oracle further proposes that RAC
is the only solution for Enterprise Grids as part of its Grid computing marketing
campaign. However, although it is now about 4 years since the initial public release of
RAC and over 14 years since the release of RAC’s predecessor, Oracle Parallel Server,
several questions still remain around RAC’s business value as a technology:
• RAC continues Oracle’s tradition of high cost of ownership for customers
already on the Intel platform and offers only some hardware cost savings
for customers on legacy UNIX®/mainframe systems.
Regardless of the marketing claims, the cost of Oracle licensing increases when
deploying RAC. In measuring real deployments and long term TCO, SQL Server
still provides the best total cost of ownership for users who demand high
performance and reliability for their systems.
The only savings users may expect with Oracle RAC is primarily on the hardware
if moving from legacy UNIX/mainframe systems to Intel based systems.
If the user is currently already on Intel based servers, deploying RAC will very
likely increase cost because the user will now need to purchase
o additional certified server(s) and networking hardware
o certified storage and connectivity solutions
o additional Oracle database licenses (each node needs to be licensed for
database and all options)
o Oracle RAC licenses (typically separate from database licenses)
o Specialized RAC training of operators and database administrators
o Consulting and other services for implementation, certification and tuning
(RAC consulting is typically at a premium over basic DBA consulting)
On the other hand, not only has SQL Server 2000 repeatedly demonstrated lower
total cost of ownership for deployments of various sizes, it also provides tightly
integrated business intelligence (Analysis, reporting, Data Mining and ETL)
capabilities as a standard feature. Furthermore, its intelligent, dynamic
automated resource management features provide peace of mind and reduced
administration overhead that has no peers in the industry.
• RAC has few customer references running business applications like
SAP®, PeopleSoft®, Siebel®, etc… with no verified large deployments
Oracle has made many claims on the scalability of RAC since its launch almost 6
years ago. It has presented respectable results on the TPC-C benchmark but has
yet to demonstrate scalability with actual large scale commercial deployments.
While there are some interesting deployments by research institutions, these are
not the traditional business applications that commercial users can relate to.
On the other hand, SQL Server has proven its scalability by performing among
the top positions for various industry benchmarks like TPC-C, TPC-W and
applications like SAP, Siebel, PeopleSoft, Onyx, etc… for some time and will
continue to raise the bar on scalability for business applications while Oracle RAC
remains a relative non-player.
• RAC does not provide automatic high availability out of the box. Most
applications will not have automated transaction failover.
The level of availability that RAC can provide is no different from what is
commonly available today and has been available for several years. There are no
secrets or advanced technology here as similar methods have been employed by
other vendors for years. RAC does, in some cases, simplify the process (for
example: the adding and removing of nodes to a cluster is much simpler in the
10g version) but it does not introduce any groundbreaking new technology nor
does it raise the bar for high availability.
Additionally, while there is no black out for the transaction since other servers are
available where the transaction can be re-submitted, there is a brown out and
transaction state in most ISV or internally developed corporate applications (not
written with TAF) is lost.
This document will delve into each of the key claims made by Oracle on RAC’s abilities in
Scalability, Availability and Total Cost of Ownership, explain how it really works and
uncover the reality behind Oracle’s marketing messages.
Introduction
This paper provides a high-level view of Oracle’s Real Application Clusters (RAC)
technology, its key features, strengths and weaknesses. It also sheds light on various
claims made by Oracle regarding scalability, availability and total cost of ownership. The
primary objective is to provide the reader with the facts about RAC, based on the
authors’ research, testing and observations as opposed to accepting Oracle’s marketing
messages.
Note that this paper is not intended to be an exhaustive technical whitepaper on RAC,
nor will it cover every single detail about RAC and related technologies. Other
technologies may be addressed where appropriate.
Target Audience
This paper is useful to anyone who manages a database system, develops database
applications or is involved in the decision making process on acquiring database
systems. Business and technical decision makers will find this paper particularly helpful
in making the right decisions based on real data, rather than marketing
messages.
What is Oracle Real Application Clusters?
Oracle launched Real Application Clusters (RAC) with the release of its 9i database
product several years ago and has since made several updates. While there have been
some changes in marketing taglines, RAC is still generally positioned as the enterprise
solution for both scalability and reliability. Industry veterans will notice that RAC is an
updated version of Oracle’s older scale-out clustering technology known as Oracle
Parallel Server (OPS) in versions of Oracle database prior to 9i.
Figure 1 above provides a view of what the RAC architecture looks like from a very high
level. Basically, RAC is a limited scale-out system built on a shared disk, shared cache
architecture, in which there is only one copy of the actual data in a single database, but
that database can be serviced by one or more database server instances. All instances work
against the same copy of the data, with various mechanisms in place to manage
resources, locks, access, etc…
In contrast, figure 2 shows how SQL Server 2000 implements scale-out, by federating a
group of independent databases to provide users with the view of a single database, even
though there can be two or more physical servers and databases beneath that view. This
concept obviously includes a lot more detail than what has just been described, but it is
not the objective of this paper to delve into SQL Server 2000’s scale-out
implementation. More information can be obtained from
http://www.microsoft.com/sql/evaluation/features/distpart.asp
How RAC works
In the preceding section, we took a brief look at RAC’s architecture. Now, let’s drill down
into how RAC works.
Though it uses the same Oracle database engine, there are several key components that
are unique to RAC. These components are listed below and their functions will be
discussed later in this section.
• Cluster Manager
• Global Cache Service
• Global Enqueue Service
• Cluster Interconnect & Inter-Process Communication (node-to-node)
• Quiesce Database Feature
• Shared Disk Subsystem
Figure 3 below shows a typical 2-node RAC setup. If users were to expand this to
support more nodes, each new node would need to be connected to all existing nodes
and to the shared disk subsystem. The Interconnect between all nodes uses Network
Interface Cards (NICs) on each node, connected via a hub/switch. Connection to the
shared disk typically requires Host Bus Adapters (HBA) for Storage Area Network (SAN)
systems. In some cases, SCSI cards may be used for 2-node clusters.
[Figure 3. Typical 2-node RAC setup. Node 1 runs Instance A and Node 2 runs Instance B; each node has its own DLM, Cluster Manager and communication layer, and the two nodes are joined by the Cluster Interconnect.]
Transactions that come into the system may be directed at any active node either
manually or via a cluster alias (a feature of the cluster services), which automatically re-
directs the transaction to an available node. The Global Cache Service manages the
status and transfer of data blocks across the buffer caches of the instances and is
integrated with the buffer cache manager to lookup resource information in the Global
Resource Directory. This directory is distributed across all instances and maintains the
status information about resources, including any data blocks that require global
coordination.
When a transaction requests a specific row(s), the node that services the request will
first check the local cache to see if the requested row(s) is cached locally. If it is, the
requested row(s) is returned to the caller and the transaction ends. If not, the calling
node will check to see if the row(s) is cached in any one of the other nodes in the
cluster. If found, a series of processes are initiated to ship that cache block to the calling
node, after which, the results are returned to the user. This has fixed some of the
problems suffered by OPS which required several I/O operations before the requested
block was acquired by the calling node. In the event that the requested row(s) is not
cached anywhere, a regular disk read operation is performed.
Disk I/Os are only performed when none of the collective caches contain the necessary
data and when an update transaction performs a COMMIT operation requiring disk write
guarantees.
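The lookup order described above (local cache, then a remote cache, then disk) can be sketched roughly as follows. This is an illustrative toy model only, not Oracle code: the `Node` class, the `read_block` function and the dictionary standing in for the shared disk are all hypothetical names introduced for this example, and real RAC tracks block ownership through the Global Resource Directory rather than by polling peers.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one RAC instance and its buffer cache."""
    name: str
    cache: dict = field(default_factory=dict)  # block id -> block contents


def read_block(block_id, local, peers, disk):
    """Return (data, source) using the order: local cache, remote cache, disk."""
    if block_id in local.cache:
        return local.cache[block_id], "local cache"
    for peer in peers:
        if block_id in peer.cache:
            # Ship the cached block across the interconnect to the caller.
            local.cache[block_id] = peer.cache[block_id]
            return local.cache[block_id], f"remote cache ({peer.name})"
    # No cache in the cluster holds the block: perform a regular disk read.
    local.cache[block_id] = disk[block_id]
    return local.cache[block_id], "disk"


disk = {"b1": "row data 1", "b2": "row data 2"}  # toy shared disk
node1, node2 = Node("node-1"), Node("node-2")
node2.cache["b1"] = disk["b1"]  # node-2 has recently read block b1

first = read_block("b1", node1, [node2], disk)   # served from node-2's cache
second = read_block("b1", node1, [node2], disk)  # now cached locally
third = read_block("b2", node1, [node2], disk)   # cached nowhere, read from disk
print(first[1], second[1], third[1], sep=" | ")
```

The sketch deliberately leaves out locking, read consistency and recovery, which is where most of the coordination cost discussed in this paper arises.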
There are four situations that warrant consideration when multiple Oracle instances
access the same row in the database. For reasons of simplicity, the following example
refers to a 2-node cluster named node-1 and node-2.
Read/read — User on node-1 wants to read a block that user on node 2 has recently
read. The read request may be served by any of the caches in the cluster database
where the order of access preference is local cache, remote cache, and finally disk I/O.
Read/write — User on node 1 wants to read a block that user on node 2 has recently
updated. In both the read/write and write/write cases in which a user on node 1
updates the block, coordination between the instances becomes necessary so that the
block being read is a read consistent image (for read/write) and the block being updated
preserves data integrity (for write/write). In both cases, the node that holds the initially-
updated data block ships the block to the requesting node across the high speed cluster
interconnect and avoids expensive disk I/O for read. It does this with full recovery
capabilities.
Write/read — User on node 1 wants to update a block that user on node 2 has recently
read. An update operation typically involves reading the relevant block into memory and
then writing the updated block back to disk. In this scenario, node 1 wants to update a
block that has already been read and cached by a remote instance or node 2. A disk I/O
for read is avoided and performance is increased, as the block is shipped from the cache
of node 2 into the cache of node 1.
Write/write — User on node 1 wants to update a block that user on node 2 has recently
updated. See read/write above.
The efficiency of inter-node messaging depends on three primary factors:
• The number of messages required for each synchronization sequence
• The frequency of synchronization (the less frequent, the better)
• The latency, or speed, of inter-node communications
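To see how the three factors above interact, a simple multiplicative model can help. This is an assumption made purely for illustration (the paper does not give a formula), and the function name and sample numbers below are hypothetical rather than measured values.

```python
def sync_overhead_ms(messages_per_sync, syncs_per_second, latency_ms):
    """Messaging delay (ms) accumulated per second of workload.

    Assumes, for illustration only, that the three factors simply multiply.
    """
    return messages_per_sync * syncs_per_second * latency_ms


# Hypothetical figures: 3 messages per sequence, 100 sequences/s, 0.5 ms latency.
baseline = sync_overhead_ms(messages_per_sync=3, syncs_per_second=100, latency_ms=0.5)

# Halving the synchronization frequency halves the overhead in this model.
fewer_syncs = sync_overhead_ms(messages_per_sync=3, syncs_per_second=50, latency_ms=0.5)

print(baseline, fewer_syncs)
```

Under this model, improving any one factor reduces the overhead proportionally, which is why the interconnect latency matters as much as the application's access pattern.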
Summary
RAC functions by means of spreading processing load across multiple servers, but all
these servers are still working against the same single database residing on the same
storage system. Additionally, it is dependent on the cluster Interconnect to provide
cache coherency and effectively, data integrity. Any weakness or failure in either of
these parts may result in problems with the overall system.
The following sections will delve into the claims of scalability, availability and total cost
of ownership, providing insight into the reality behind Oracle’s marketing messages.
Oracle Claim: “RAC is Scalable”
Oracle claims to provide record breaking scalability with RAC. It claims that users can
easily add nodes whenever they need to, with virtually boundless scalability.
Interestingly, however, after having been in the market for around 6 years, RAC has
demonstrated scalability only in non-business application scenarios like TPC-C
benchmarks and research organizations. While respectable in their own right, they are
not representative of typical commercial/business applications like SAP, Siebel or
PeopleSoft.
Oracle did publish a SAP-SD Parallel standard benchmark but the number of benchmark
users achieved was not even half of what SQL Server accomplished in its SAP-SD 3-tier
results. In fact, Oracle’s own highest benchmark results are based on Oracle running on
single server, non-clustered systems, not RAC. (See Appendix II, #1 for details).
Summary
Oracle has made many claims on the scalability of RAC since its launch 6 years ago, but
to date, it has provided little to no real customer evidence on large scale RAC systems or
top benchmarks with common business applications like SAP and PeopleSoft. The few
benchmarks published, though respectable in their own capacity, are a far cry from what
Oracle has claimed in various marketing presentations and literature. At this time, RAC
has still not been able to deliver on the promises made by Oracle.
On the other hand, SQL Server has proven its scalability not just by securing the top
positions in various industry benchmarks; it has also been recognized by independent
analysts like WinterCorp based on SQL Server’s large customer deployments. In fact,
there are many SQL Server customers running mission critical applications that process
up to 100 million transactions per day using only 8-CPU SMP servers. More information
on this is available at the following sites:
http://www.microsoft.com/sql/evaluation/compare/wintercorp.asp
http://www.microsoft.com/sql/evaluation/casestudies/default.asp
http://www.microsoft.com/sql/worldrecord/
Oracle Claim: “RAC Provides High Availability”
Oracle makes interesting claims about RAC’s ability to provide users with a system that
suffers no downtime, and other attractive features related to high availability. However,
after filtering through the marketing messages, the reality is not nearly as interesting as
one might have initially believed. In fact, there are instances of customers moving from
RAC to non-RAC systems to resume operations from a failure attributed to RAC.
In short, applications that were not developed with TAF in place will not enjoy fully
automated failover or re-try for transactions or queries that were still running when a
node fails. Developing applications to leverage TAF or TAF-like features is not a new
concept and is commonly available among enterprise databases, including SQL Server so
again, there really is nothing groundbreaking here.
Summary
The level of availability that RAC can provide is no different from what is commonly
available today and has been available for several years. There are no secrets or advanced
technology introduced here, as similar methods have been employed by other vendors
for some time. RAC does, in some cases, simplify the process (for example: the adding
and removing of nodes to a cluster is much simpler compared to OPS) but it does not
introduce any groundbreaking new technology, nor does it raise the bar for high
availability despite the significant additional cost incurred with RAC.
Additionally, while there is no black out for the transaction since other servers are
available where the transaction can be re-submitted, there is a brown out and
transaction state in most ISV or internally developed corporate systems (not written
with TAF) is lost.
There is no magic formula to high availability despite what Oracle marketing would have
you believe. RAC does have a role in the overall HA scenario but it certainly isn’t the
cure-all solution. See http://www.eweek.com/article2/0,1759,1196874,00.asp for one
example of where RAC was identified as the cause of failure of a major internet
commerce site and the solution to get back online was to move off RAC onto a single
server SMP system.
Microsoft takes a more pragmatic approach and provides customers with prescriptive
guidance on how to build highly available SQL Server systems. This approach of providing
customers with actual facts and advance warning of potential pitfalls has proven highly
effective, with customers successfully building systems with up to 99.999% availability
without resorting to exotic hardware or buying expensive add-on software. See case
studies on customers like NASDAQ, Western Digital, Borgata Hotel Casino & Spa, Wildcard
Systems and CountryWide Home Loans who have achieved 99.999% availability or more with
SQL Server 2000. Details are available at
http://www.microsoft.com/sql/evaluation/casestudies/alphalisting.asp
Oracle Claim: “RAC Lowers Total Cost of
Ownership”
One of the most heavily touted benefits claimed by Oracle about RAC is that it will
significantly lower total cost of ownership, compared to other databases in the industry.
This is based on the lower cost of hardware and easier management. However, though
there are some instances where the savings can be significant, only a small segment of
customers will actually realize these savings. For the most part, customers will realize
minimal or no change in TCO and in many cases, the TCO actually increases.
Another claim made by Oracle is on its new management tools which are supposed to
significantly reduce management complexity, a well known issue with Oracle, thus
reducing cost of management. However, one only needs to attempt installing RAC to
immediately realize how far from the truth this really is. Furthermore, none of the claims
of simplified management have yet been proven in the demanding environments of a
production system particularly in constantly changing environments.
Finally, Oracle makes claims regarding what other databases cost to manage, based on its
own assertions about their downtime and its own definition of what that downtime costs.
For the claimed cost savings to be realized, customers would have to believe that all
Oracle databases have zero downtime and that all other databases inherently have
significant downtime. Fortunately, most customers are
able to see through Oracle marketing’s concoctions and understand that any system that
is well designed, deployed and managed can provide a high level of availability, and that
downtime affects all databases, including Oracle (see news article linked in the summary
of the availability section). Even very large companies with large DBA teams, such as
major internet based auction and ecommerce sites, have experienced significant
downtime involving their databases; often not because of faulty technology. See
http://roc.cs.berkeley.edu/papers/Cost_Downtime_LISA.pdf and
http://news.com.com/2009-1001-251651.html?legacy=cnet&tag=owv for some
examples. Fact is, the human factor is the single largest cause of downtime in any
system and this is something that RAC does not address. If anything RAC increases
The Truth: RAC will probably increase cost
In most cases, the few customers who will realize the cost savings claimed by Oracle
are those who are currently running on legacy UNIX and/or mainframe platforms like
SUN or IBM. The savings really just come from the hardware itself and hardware related
services. However, note that the software and services costs of the Oracle database do
not go down but rather increase. Users still need to purchase all relevant Oracle
licenses and pay for the same maintenance, support, consulting and training fees (see
next section for license costs). In fact, users will probably end up paying Oracle more
than before because of the additional charges per CPU, per node for the RAC option and
the increased number of nodes for which the customer must now license the database.
RAC is also probably not covered in the typical Oracle site license but is rather available as
a special option that incurs additional licensing charges. To top that off, there will almost
certainly be a significant increase in system administration complexity with the
deployment of any cluster system, which Oracle does not indicate in its messages. Even
a 10 year veteran of Oracle points out this increased complexity and the added skills
required (see
http://www.miracleas.dk/WritingsFromMogens/YouProbablyDontNeedRACUSVersion.pdf).
There are some parties who believe RAC is a cost effective solution for high availability,
because:
• RAC spreads cost across multiple active servers, all of which are servicing users
at all times, instead of waiting idle until a node fails
• All servers are servicing users for the same database, hence load is balanced
across all servers and you don’t have idle resources
• Customers need half the resources on each server as the user load can be
managed to an almost even distribution between the servers. Hence, you get
failover support without needing to invest in high capacity servers
This is a dangerous misconception, because it may lead users into a false sense of
security while deploying systems that really do not provide adequate levels of availability
and scalability. Users need to understand that in any high availability system where
multiple nodes are built into a cluster to provide failover support, each node in the
cluster must be able to support the entire load of the cluster in the event of a disaster
where only a single node is left functioning. This means each node must be configured
with adequate resources to handle the load of the entire cluster on its own. For
clusters with few nodes, this is true of any real high availability scenario, regardless
of vendor. As the number of nodes increases to more than four, the risk of multiple node
failure is significantly reduced, hence users are able to reduce the level of redundancy
supported per node. However, the sizing still has to take into account workload spikes and
multiple node failures.
For example, if you need 8-CPUs and 16GB RAM in a single server environment to
support your transactions system, and you choose to deploy a 2-node cluster system for
the same application, each node in the cluster must have the same 8-CPU and 16GB
RAM configuration. Otherwise, in the event of a node failure, the surviving node will not
be able to cope with the increased workload resulting in transactions being rejected.
Additionally, because some resources are being used to deal with the added transaction
request (though the system cannot service them), even the existing workload that is
being serviced may experience degraded performance. So while the system is
technically “up-and-running”, your users/customers are experiencing downtime. In
summary, moving from a single node SMP system to a RAC based system generally does
not save a customer hardware costs and will very likely increase software and
administrative costs.
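The sizing rule in the example above can be expressed as a small calculation. The following is a minimal sketch under a simplifying assumption that capacity scales linearly with CPU count; the function name `cpus_per_node` and the 5-node case are hypothetical illustrations, not figures from this paper.

```python
import math


def cpus_per_node(total_cpus_needed, nodes, tolerated_failures):
    """CPUs each node needs so the surviving nodes can carry the full load."""
    survivors = nodes - tolerated_failures
    if survivors < 1:
        raise ValueError("at least one node must survive")
    return math.ceil(total_cpus_needed / survivors)


# The 2-node example from the text: an 8-CPU workload where one node may fail.
per_node = cpus_per_node(8, nodes=2, tolerated_failures=1)
total_purchased = per_node * 2  # CPUs bought across the whole cluster

# Hypothetical 5-node cluster for the same workload, still tolerating one failure.
per_node_five = cpus_per_node(8, nodes=5, tolerated_failures=1)

print(per_node, total_purchased, per_node_five)
```

For the 2-node case the result is 8 CPUs per node, or 16 CPUs in total to run an 8-CPU workload, which is the doubling of hardware described above. The same calculation applies to memory.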
To further illustrate this, please see the following diagrams and descriptions:
Single server, minimal high availability features
Figure 4
impacted performance as some system resources are being utilized to manage the other
5,000 users which it is unable to service.
[Diagram: two nodes, each with 4 CPUs and 8GB RAM. One node is marked -FAILURE-; the surviving node is marked -OVERLOAD-, overwhelmed and not able to support the load for 10,000 users, hence about 50% of users suffer downtime.]
Figure 6. When 1 node fails, the system now only has 4-CPUs, 8GB RAM which is not
capable of supporting the 10,000 users load.
If 1 node fails, the system still has 8-CPUs, 16GB RAM which means it can still support
10,000 users. Though the example does take a simplified view of workloads, the
reasoning employed applies to any environment.
Once users understand why this is critical for any system that requires true high
availability (a point omitted from Oracle’s communications), they will quickly realize
that costs do not fall but will likely rise as the number of nodes (as proposed by Oracle
using RAC) and the workload increase. Additionally, in a real deployment, if a single
machine with 4-CPUs and 8GB RAM can support 5,000 users, a 2-node RAC of
identical hardware configuration will not be able to support 10,000 users. Oracle clearly
indicates that, at best, you will get an 85% gain by adding a second node
and that as the number of nodes in a cluster increases, the gain per node decreases
noticeably. This is partly due to physical inefficiencies of hardware and partly due to the
overhead imposed by Oracle when running RAC.
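The arithmetic behind this point is straightforward. The sketch below uses the 5,000 user and 85% figures quoted above; the function name `cluster_capacity` is hypothetical, and because the text gives no figures for the gain from a third or later node, those gains are left as a parameter rather than guessed.

```python
def cluster_capacity(single_node_users, gains_percent):
    """Best-case users supported by a cluster.

    gains_percent[i] is the percentage gain contributed by each node added
    after the first (85 for the second node, per the figure quoted above).
    Integer arithmetic keeps the result exact.
    """
    return single_node_users * (100 + sum(gains_percent)) // 100


two_node = cluster_capacity(5000, [85])  # best case for a 2-node cluster
shortfall = 10_000 - two_node            # users that cannot be served

print(two_node, shortfall)
```

Even in the best case, the 2-node cluster supports 9,250 users rather than 10,000, and that is before any node failure is considered.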
Thus, the claims made by Oracle on how users can save money with RAC are, at best,
amusing myths or marketing gimmicks. In reality, for a system with true high
availability, users will not save money by deploying RAC and all users should be wary of
any marketing claims otherwise.
Conclusion
Despite Oracle marketing’s claims, the commonly deployed scenario shown above clearly
illustrates how deploying RAC for high availability will not save the user any cost
compared to existing, proven failover solutions available today like Microsoft Failover
Clustering which can be implemented by both SQL Server and Oracle. Additionally, RAC
may increase hardware, software and management cost as there are additional
components required compared to a failover cluster (e.g. Interconnect). Though the
solution will work by providing automated failover, it certainly is not a money saver as
claimed. If anything, it will likely increase the TCO.
That does not mean that customers should avoid using clusters altogether as they do
serve a useful purpose in providing high availability. As mentioned before, both Oracle
and SQL Server support failover cluster which also provides high availability. What
customers should know is that they will not be saving on hardware costs for high
availability clustering with RAC in the given scenario and will likely increase software and
administration costs.
The Truth: RAC Increases Licensing Costs
Sample Cost Comparison between Oracle and Microsoft Solutions
The following table provides estimated retail prices, based on published prices on Oracle,
SUN™, DELL™ and Microsoft websites, of various configurations with equivalent basic
functionality. This comparison does not include storage costs, as that is expected to be
reasonably similar across different configurations (though RAC systems require more
Host Bus Adapters or NICs than other systems). Note that software maintenance, taxes,
shipping and other charges may apply.
Item | Oracle10g on SUN Solaris 9 (4-CPU x 1-node) | Oracle10g with RAC on Dell/Redhat Enterprise Linux AS 3 (2-CPU x 2-nodes) | SQL Server 2000 on Windows Server 2003 Enterprise Edition (4-CPU x 1-node)
connectivity) | SUN Dual Gigabit Ethernet + Dual SCSI PCI Host Adapter | QLOGIC - QLA2340-CK SANblade 2340 PCI-X to 2 Gb Fiber Channel Host Bus Adapter x2 (1 HBA required per node) | QLOGIC - QLA2340-CK SANblade 2340 PCI-X to 2 Gb Fiber Channel Host Bus Adapter
HARDWARE COSTS | $202,047 | $52,425.80 | $40,600.95
TOTAL COST | $586,047 | $516,425.80 | $120,596.95
Table 1. Cost estimates are based on published prices on vendor websites and are subject to
change without notice.
1. Requires OLAP option at $20,000 per CPU and Data Mining option at $20,000 per
CPU for 4 CPUs
2. Requires Advanced Security option at $10,000 per CPU for 4 CPUs
3. Requires Diagnostics Pack at $3,000 per CPU for 4 CPUs
4. Requires Tuning Pack at $3,000 per CPU for 4 CPUs
5. Prices from www.dell.com, oraclestore.oracle.com, www.sun.com,
www.microsoft.com/sql
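As a rough illustration of how the per-CPU option prices in the notes above add up, the following sketch multiplies each listed price by the 4 CPUs stated in the notes. It assumes all five options are licensed together on the same 4-CPU system, which the notes imply but do not state outright; the variable names are hypothetical.

```python
CPUS = 4  # CPU count stated in the notes above

# Per-CPU list prices (USD) quoted in notes 1 to 4.
OPTION_PRICE_PER_CPU = {
    "OLAP": 20_000,
    "Data Mining": 20_000,
    "Advanced Security": 10_000,
    "Diagnostics Pack": 3_000,
    "Tuning Pack": 3_000,
}

option_costs = {name: price * CPUS for name, price in OPTION_PRICE_PER_CPU.items()}
total_options = sum(option_costs.values())

for name, cost in option_costs.items():
    print(f"{name}: ${cost:,}")
print(f"All options: ${total_options:,}")
```

On these assumptions the options alone come to $224,000, before the base database license or any RAC option is counted.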
Note: While Oracle typically mentions “commodity hardware”, RAC deployments require
hardware specially certified for RAC and not just any off the shelf server or components.
That means most if not all users will not be able to re-use existing hardware to build a
RAC system. A shared storage subsystem with a high speed interconnect is also a
prerequisite for RAC and is certainly not a commodity item.
It is clear from the figures in Table 1 that moderate levels of hardware savings can be
realized if users move away from legacy UNIX system onto Intel based commodity
systems. However, note that the savings are purely from the reduced hardware and
hardware related services costs. Database costs either remain constant if running on a
single server or increase with the RAC option.
With Oracle’s licensing costs making up almost 90% of the total system cost, hardware
savings become almost negligible. Software discounts have not been taken into account
in this comparison, and neither have maintenance and software upgrade costs. Because
maintenance and upgrade costs are a percentage of the software base price, it isn’t
hard to see that Oracle once again leads in high TCO.
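The "almost 90%" figure can be checked against the HARDWARE COSTS and TOTAL COST rows of Table 1. The sketch below assumes that everything in the total other than hardware is software licensing, since the intermediate rows of the table are not reproduced here; the dictionary and function names are hypothetical.

```python
from decimal import Decimal

# (hardware cost, total cost) in USD, taken from Table 1.
TABLE_1 = {
    "Oracle10g on SUN Solaris 9": (Decimal("202047"), Decimal("586047")),
    "Oracle10g with RAC on Dell/Redhat": (Decimal("52425.80"), Decimal("516425.80")),
    "SQL Server 2000 on Windows Server 2003": (Decimal("40600.95"), Decimal("120596.95")),
}


def non_hardware_share(hardware, total):
    """Return (non-hardware cost, its percentage of the total cost)."""
    non_hardware = total - hardware
    return non_hardware, float(non_hardware / total * 100)


shares = {name: non_hardware_share(hw, total) for name, (hw, total) in TABLE_1.items()}
for name, (cost, pct) in shares.items():
    print(f"{name}: ${cost:,.2f} non-hardware, {pct:.1f}% of total")
```

On this assumption the RAC configuration carries $464,000 of non-hardware cost, just under 90% of its total, compared with $79,996 for the SQL Server configuration.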
Note that high availability requirements (as described in the preceding section) were not
factored into this sample configuration. Deploying the same systems in a highly available
setting involves, at a minimum, doubling the count (and cost) of most of the components
listed, including the RAC system. This includes hardware components (true even for RAC
systems that already have two nodes as explained in the earlier section) and software
licenses. Please see your Microsoft or hardware representative for further details on
configuring systems that provide high availability.
Summary
Regardless of the marketing claims, when measuring real deployments SQL Server still
provides the best total cost of ownership for users who demand high performance and
reliability for their systems. The only savings users may expect with Oracle10g
RAC come purely from hardware, which typically makes up less than 15% of the
overall system cost (see Budgeting for IT: Average Spending Ratios by J. Giera, Giga
Analyst), and only if the user is moving from an expensive, legacy UNIX system to an
Intel-based commodity system.
If the user is already using Intel-based servers, moving to RAC would probably
not reduce costs. In fact, there is a high likelihood that total cost of ownership would
increase with Oracle RAC. Deploying Oracle RAC does not offer significant
scalability or high availability benefits, but it does require additional purchases such as:
• additional certified server and networking hardware
• certified storage & connectivity solutions
• additional Oracle database licenses (each node needs to be licensed)
• Oracle RAC licenses
• training of operators and administrators
• services for implementation, certification and tuning
This does not even count infrastructure requirements and additional administrative tasks
for the DBA, System Administrator and Network Administrator.
On the other hand, not only does SQL Server 2000 deliver the lowest total cost of
ownership, but it also provides the best price/performance for the user. Its intelligent,
dynamic automated resource management features provide peace of mind and reduced
administration overhead. This is not a claim but rather a report on customers’
experience with SQL Server. See
http://www.microsoft.com/sql/evaluation/compare/tco.asp for more details.
Conclusion
Though it has navigated around some significant problems inherent in OPS (RAC’s
predecessor) in the area of performance, and has raised the scalability ceiling
marginally, RAC has still not proven able to deliver enterprise-class scalability for
real business applications. In high availability, RAC delivers only one small part of any
typical high availability solution and offers, at best, a marginal advantage over existing
solutions, but at much higher cost and complexity.
The actual cost to deploy and manage RAC is nowhere near as minimal as Oracle might
have users believe; one only needs to attempt installing RAC to realize the added
complexity (even for experienced Oracle DBAs). RAC is a significant step forward by
Oracle in making OPS usable, but there is only so much that can be done with an
architecture that limits scalability by design.
Conclusion: RAC is a significant release for Oracle, but does not live up to Oracle’s
marketing claims. RAC typically costs users more to implement and maintain than SMP
systems offering equal or greater performance. SQL Server offers comparable or better
levels of scalability and availability but at a fraction of the cost and is significantly easier
to manage/use.
Appendix I: Resource links
Transaction Processing Performance Council (TPC)
www.tpc.org
SAP Benchmarks
www.sap.com/benchmark
SQL Server Scalability Benchmarks Leadership Proofpoints
www.microsoft.com/sql/worldrecord
SQL Server Total Cost Of Ownership Leadership Proofpoints
www.microsoft.com/sql/evaluation/compare/tco.asp
SQL Server Business Intelligence and Data Warehousing
www.microsoft.com/sql/evaluation/bi/default.asp
You Probably Don’t Need RAC
http://www.miracleas.dk/WritingsFromMogens/YouProbablyDontNeedRACUSVersion.pdf
Appendix II: References
1. SAP benchmark reference
Oracle (SAP-SD Parallel)
** This is Oracle’s highest Oracle RAC-based SAP SD Parallel benchmark to date
The SAP SD Standard 4.6 C Application Benchmark performed on November 16, 2001 by HP in Nashua, NH,
USA was certified on June 3, 2002 with the following data:
Number of benchmark users & comp.: 12,000 SD (Sales & Distribution) Parallel
Average dialog response time: 1.92 seconds
Throughput:
Fully Processed Order Line items / hour: 1,208,330
Dialog steps / hour: 3,625,000
SAPS: 60,420
Average DB request time (dia/upd): 0.058 sec / 0.185 sec
Operating System all server: HP Tru64 Unix V5.1A
RDBMS: Oracle 9i Real Application Clusters (RAC)
R/3 Release: 4.6 C
Configuration:
4 Database servers (4 active nodes):
HP AlphaServer ES45 Model 2, 4-processors, Alpha EV6.8CB (21264C) 1000 MHz, 8 MB L2 cache, 32 GB main
memory each
Certification Number: 2002031