Solution Guide Choosing Storage for Dell Database Solutions with Dell PowerVault and Dell/EMC Storage
Abstract
This white paper provides a guide for choosing an appropriate storage architecture for Dell Database Solutions deployed on Dell PowerEdge servers with Dell PowerVault and Dell/EMC storage. Using the knowledge gained through joint development, testing and support with Microsoft and Oracle, this white paper documents best practices that can help you select a storage solution to meet your capacity, performance and availability needs.
October 2007
THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.
© 2007 Dell Inc. All rights reserved. Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.
Trademarks used in this text: Dell, the DELL logo, PowerEdge and PowerVault are trademarks of Dell Inc.; Intel and Xeon are registered trademarks of Intel Corporation; EMC, Navisphere, and PowerPath are registered trademarks of EMC Corporation; Microsoft, Windows, and Windows Server are registered trademarks of Microsoft Corporation. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
Table of Contents
Introduction
  Dell Database Solutions for Oracle 10g and Microsoft SQL Server 2005
  Overview of This White Paper
Overview of Dell Storage Systems
  Direct-Attached Serial-Attached SCSI (SAS)
    PowerVault MD3000 Direct-Attached Storage Array
    MD3000 Capacity
    MD3000 Availability
  Internet SCSI (iSCSI) Storage Networking
    PowerVault MD3000i iSCSI Storage Array
    MD3000i Capacity and Availability
  Fibre Channel Storage Networking
    Dell/EMC CX3 Series FC Storage Arrays
    Dell/EMC CX3 Series Capacity
    Dell/EMC CX3 Series Availability
Common Criteria for Choosing a Storage Array
  Architectural Complexity
  Cost
  Scalability
Overview of Database Performance with Dell Storage Arrays
Selecting a Storage Array for a Dell Database Solution
Conclusions
Tables and Figures Index
References
Introduction
Dell PowerEdge servers, Dell PowerVault storage systems and Dell/EMC storage systems are ideal choices to deploy highly reliable and sustainable Oracle Database 10g and Microsoft SQL Server 2005 solutions. This white paper is intended to help IT professionals design and configure Oracle 10g and SQL Server 2005 database solutions using Dell servers and storage that apply best practices derived from laboratory and real-world experiences. This white paper provides criteria and guidance to assist in the selection of the appropriate storage system for your Dell Database Solution.
Dell Database Solutions for Oracle 10g and Microsoft SQL Server 2005
Dell Database Solutions for Oracle 10g and Microsoft SQL Server 2005 are designed to simplify operations, improve utilization and cost-effectively scale as your needs grow over time. In addition to providing price/performance leading server and storage hardware, Dell Solutions for Oracle 10g and SQL Server 2005 include:
Dell Tested and Validated Configurations - in-depth testing of Oracle 10g and SQL Server 2005 configurations with Dell servers and storage; documentation and tools that help simplify deployment
Integrated Solution Management - standards-based management of Dell Database Solutions that lowers operational costs through integrated hardware and software deployment, monitoring and update
Database Software Licensing - multiple licensing options that can simplify customer purchase
Dell Enterprise Support and Professional Services for Dell Database Solutions - offerings for the planning, deployment and maintenance of Dell Solutions for Oracle 10g and SQL Server 2005
Dell PowerEdge servers and Dell Storage help to minimize operating costs with price/performance leadership: Dell currently holds price/performance leadership for TPC-E and seven of the top ten TPC-C price/performance leadership positions with SQL Server [1]. For more information about Dell Solutions for SQL Server 2005, see www.dell.com/sql. For more information about Dell Solutions for Oracle 10g Database, see www.dell.com/oracle.
[1] Source: TPC-E by Price/Performance Version 1 and TPC-C by Price/Performance Version 5 results as of October 2007. See www.tpc.org for current results.
MD3000 Capacity
Each MD3000 unit supports up to 15 SAS drives and can be expanded by daisy-chaining up to two MD1000 storage arrays for a total of 45 SAS drives. The drives provide capacities of up to 300 gigabytes (GB) at either 10,000 RPM or 15,000 RPM. Each of the four SAS ports available on the MD3000 operates at a peak of 3.0 gigabits per second (Gbps), for an aggregate peak throughput of 12 Gbps.
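The capacity figures above follow from simple arithmetic, which the short Python sketch below reproduces. The function and constant names are illustrative only. The 15-drive size of each MD1000 enclosure is inferred from the stated total of 45 drives, and the capacity shown is raw capacity before any RAID overhead, which this paper does not quantify.

```python
# Figures come from the paragraph above; all names are illustrative.
DRIVES_PER_ENCLOSURE = 15    # one MD3000, or one MD1000 (inferred from the 45-drive total)
MAX_MD1000_EXPANSIONS = 2    # MD1000 enclosures daisy-chained behind one MD3000
MAX_DRIVE_GB = 300           # largest drive capacity cited, in gigabytes
SAS_PORTS = 4                # SAS host ports on the MD3000
PORT_PEAK_GBPS = 3.0         # peak rate of each SAS port, in gigabits per second


def max_drives(expansions: int = MAX_MD1000_EXPANSIONS) -> int:
    """Total drive slots for one MD3000 plus the given number of MD1000s."""
    if not 0 <= expansions <= MAX_MD1000_EXPANSIONS:
        raise ValueError("an MD3000 supports between 0 and 2 MD1000 enclosures")
    return DRIVES_PER_ENCLOSURE * (1 + expansions)


def raw_capacity_gb(drives: int, drive_gb: int = MAX_DRIVE_GB) -> int:
    """Raw capacity in GB, before RAID overhead or hot spares."""
    return drives * drive_gb


def aggregate_peak_gbps() -> float:
    """Sum of the peak rates of all SAS host ports."""
    return SAS_PORTS * PORT_PEAK_GBPS


if __name__ == "__main__":
    drives = max_drives()
    print(drives, raw_capacity_gb(drives), aggregate_peak_gbps())
```

Usable capacity will be lower than the raw figure once a RAID level and hot spares are chosen.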
MD3000 Availability
Although the MD3000 can be operated with a single RAID controller, Dell best practices recommend redundant controllers to increase the resiliency of the solution and to maximize the availability of the data stored by a database solution. Each RAID controller on the MD3000 features two host ports. It is therefore possible to connect up to four hosts, each with a single data path; however, this introduces single points of failure in the storage subsystem and is not optimal for database solutions.
The Dell recommended configuration for database solutions features an MD3000 with redundant RAID controllers and provides for the connection of two hosts, each with redundant data paths (e.g. one connection from each server to each RAID controller). This configuration, shown in Figure 2, helps greatly reduce the possibility that a single point of failure might cause any connected server to lose access to the storage system. I/O paths into the RAID controllers are managed by a multi-path driver, which re-routes I/O requests from the hosts to the other controller when one of the I/O paths fails.
To further enhance availability, each RAID controller includes a backup battery that can hold cached data in the controller's cache memory for up to 72 hours without external power. Cached writes are also mirrored between controllers for redundancy. Additional availability and integrity features of the MD3000 include hot-pluggable, redundant power supplies, cooling modules and disk drives; active disk scrubbing; and non-disruptive firmware upgrades.
In addition, the MD3000 offers two optional Premium Features: Snapshot Virtual Disk and Virtual Disk Copy. Snapshot Virtual Disk creates point-in-time snapshots of a data volume. Using a copy-on-first-write technique, data on source virtual disks can be updated without changing the contents of the point-in-time snapshot. Snapshots are often used as part of a backup system, reducing the time during which a database must be in a quiescent state. Virtual Disk Copy produces a full block-by-block copy of a source virtual disk, which can be used in the same manner as clones or business continuity volumes, such as for disk-based backups, recovery to a state preserved via a snapshot, or migration of data to a larger volume. For more information about these features, refer to the Dell PowerVault Modular Disk Storage Manager User's Guide, which is available from http://support.dell.com/.
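The copy-on-first-write idea mentioned above can be illustrated with a toy model. The Python sketch below is not the MD3000 firmware's implementation; the class and method names are hypothetical, and a real array works on disk blocks with a dedicated repository rather than in-memory lists.

```python
class SnapshotVolume:
    """Toy block volume supporting one copy-on-first-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live data on the source virtual disk
        self.preserved = None       # block index -> original data; None means no snapshot

    def take_snapshot(self):
        # Taking the snapshot copies no data; it only starts tracking changes.
        self.preserved = {}

    def write(self, index, data):
        # On the FIRST write to a block after the snapshot, save the old
        # contents so that the point-in-time image stays intact.
        if self.preserved is not None and index not in self.preserved:
            self.preserved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, index):
        if self.preserved is None:
            raise RuntimeError("no snapshot has been taken")
        # Changed blocks come from the preserved copies; unchanged blocks
        # are read straight from the source volume.
        return self.preserved.get(index, self.blocks[index])


if __name__ == "__main__":
    vol = SnapshotVolume(["a", "b", "c"])
    vol.take_snapshot()
    vol.write(1, "B")
    vol.write(1, "B2")  # second write: the original "b" is already preserved
    print(vol.blocks, [vol.read_snapshot(i) for i in range(3)])
```

Because only blocks that change are copied, taking the snapshot is nearly instantaneous, which is why the database needs to be quiescent only briefly.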
The maximum distance between components depends on the speed and type of optical ports used. Refer to the documentation for the storage array, FC HBA, and FC switch in your solution for more details.
Architectural Complexity
The MD3000 and MD3000i share a common architecture, except for the ports used to connect host servers. Therefore, it is appropriate to compare these systems collectively to the Dell/EMC storage arrays.
The RAID controllers in the PowerVault MD-series storage systems serve as the brains of the storage array. These controllers perform the RAID calculations, control the I/O movement, communicate with the management client, store the firmware, and protect data until it can be written safely to the hard disk drives. MD-series storage arrays offer redundant RAID controllers with internal cache backup batteries, power supplies, cooling fans, and hard disk drives in a single three rack unit (3U) enclosure. Dell PowerVault MD1000 storage systems, which provide more disk capacity, each require an additional 3U.
Conversely, the Dell/EMC CX3-series storage systems feature separate modules for storage processors, disk drives and standby power supplies. A CX3-series storage system requires at least 5U for the first 15 hard-disk drives. Each additional disk array enclosure (DAE) requires 3U. The integrated design of the Dell PowerVault MD-series is easier to understand and simpler to deploy.
In addition, different storage protocols and topologies impose different physical distance limitations among these Dell storage systems. The MD3000, by nature of employing a direct-attached topology, requires that the servers be physically close to the storage system; the maximum distance is determined by the maximum allowable length of a SAS x4 cable, which is approximately four meters. Dell/EMC Fibre Channel SAN and MD3000i IP SAN configurations allow greater distance between the storage and the servers, allowing more flexible deployment within a datacenter. Distance-extension solutions for Fibre Channel and iSCSI may be used to enable replication of data over considerable distances between sites as part of a business continuity or disaster recovery plan.
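The rack-space comparison above can be turned into a rough estimate. The Python sketch below is only an approximation under stated assumptions: it assumes every enclosure or DAE holds 15 drives, uses the 5U minimum quoted for the CX3-series, and ignores switches and other infrastructure. The function names are illustrative.

```python
import math

DRIVES_PER_ENCLOSURE = 15  # assumption: every MD enclosure and CX3 DAE holds 15 drives


def _enclosures(drives: int) -> int:
    """Number of 15-drive enclosures needed (at least one)."""
    return max(1, math.ceil(drives / DRIVES_PER_ENCLOSURE))


def md_series_rack_units(drives: int) -> int:
    """3U for the MD3000 or MD3000i plus 3U per MD1000 expansion enclosure."""
    enclosures = _enclosures(drives)
    if enclosures > 3:
        raise ValueError("one MD-series array supports at most 45 drives")
    return 3 * enclosures


def cx3_series_min_rack_units(drives: int) -> int:
    """At least 5U for the first 15 drives plus 3U per additional DAE."""
    return 5 + 3 * (_enclosures(drives) - 1)


if __name__ == "__main__":
    for n in (15, 30, 45):
        print(n, md_series_rack_units(n), cx3_series_min_rack_units(n))
```

Under these assumptions, a 45-drive configuration needs 9U with the MD-series and at least 11U with the CX3-series.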
Cost
The components required by a Fibre Channel storage array such as the Dell/EMC CX3-series, including the server HBA, optical cabling, switches and other infrastructure components, tend to be more expensive than the equivalent components required for other storage solutions. For example, iSCSI can significantly lower the cost of implementing networked storage by employing lower-cost GbE technology and allowing re-use of existing infrastructure investments. Direct-attached storage has even lower costs because it does not require a switching infrastructure. SAS storage systems, such as the PowerVault MD3000, offer many of the benefits of networked storage without the additional expense of the storage network infrastructure. However, if future growth requires the ability to scale beyond what a single storage system supports, deploying a SAN architecture may be beneficial. Both Dell/EMC CX3-series and MD3000i iSCSI storage arrays offer this flexibility.
Along with the cost of hardware, deploying and managing an FC SAN requires specialized expertise, which is obtained by investing in training personnel or hiring a service company. IP SANs may provide a cost-saving alternative in terms of personnel and training. Because Ethernet networking is a well-established technology and is well understood by most IT staff, little to no additional network training may be needed to enable IT staff to set up an IP storage network. Direct-attached storage requires no networking expertise.
Scalability
The Dell storage systems discussed in this paper vary in total storage capacity and in the number of hosts that can be connected, both key considerations when evaluating current and future requirements for a database solution.
The total storage capacity of the storage system is a function of the size and number of hard-disk drives offered by the solution, which is in turn a function of the number of arrays that can be connected to a host server and the number of expansion enclosures that can be attached to each array. A database solution can employ only a single MD3000 array, which supports up to two MD1000 expansion enclosures for a total of 45 hard-disk drives. Although currently available drive capacities as large as 400 GB can support large databases, database performance is more typically determined by the number of drives over which the data is spread, not by their capacity.
When a database deployment is deemed to require more than 45 disks, a solution with multiple MD3000i or CX3-series arrays can be deployed. Both of these solutions offer the ability to attach up to four storage arrays to the same host server(s) in a SAN environment. With four MD3000i arrays, each with two MD1000 expansion enclosures, a total of 180 disks can be made available for a database solution. With four maximally-configured CX3-80 arrays, a total of 1920 disks can be deployed; refer to Table 1 to determine the maximum number of disks for each CX3-series array.
When building Oracle RAC clusters or deploying SQL Server 2005 into failover clusters, the number of hosts that can be connected to a given array while maintaining redundant connections between each server and the storage array is an important consideration. The MD3000 allows the connection of up to two host servers with redundant paths. The MD3000i allows the connection of up to two host servers with redundant paths when configured in a direct-attached topology, or up to 16 host servers when connected via Ethernet switches in an IP SAN. The CX3-series provides for between two and six directly-attached servers, depending on the number of front-end, or host, ports on each SP, and up to 256 initiator ports (or 128 redundantly-attached servers) when deployed in an FC SAN.
Clearly, networked storage solutions offer greater scalability, both in terms of total data capacity and of host connectivity. If a database solution has high initial requirements, the limitations of a direct-attached solution may make it unsuitable. Similarly, if the solution has modest initial requirements but is expected to grow substantially in the future, an iSCSI or FC solution can be directly attached initially, deferring the cost of interconnect, and later be migrated into a SAN when additional storage capacity or host connections become a necessity.
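The limits described in this section can be collected into a small lookup to check which options fit a given requirement. The Python sketch below is illustrative only: the names are hypothetical, the 480-disk figure for a single CX3-80 is derived from the stated total of 1920 disks across four arrays, and the host counts assume SAN attachment for the MD3000i and CX3-series.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayLimits:
    name: str
    disks_per_array: int      # including expansion enclosures
    max_arrays: int           # arrays attachable to the same host server(s)
    max_redundant_hosts: int  # hosts attached with redundant paths

    @property
    def max_disks(self) -> int:
        return self.disks_per_array * self.max_arrays


# Values taken from the text above; 480 = 1920 / 4 is derived, not quoted.
LIMITS = [
    ArrayLimits("MD3000 (direct-attached SAS)", 45, 1, 2),
    ArrayLimits("MD3000i (IP SAN)", 45, 4, 16),
    ArrayLimits("CX3-80 (FC SAN)", 480, 4, 128),
]


def candidates(hosts: int, disks: int) -> list[str]:
    """Names of the options whose stated limits cover the requirement."""
    return [
        a.name
        for a in LIMITS
        if hosts <= a.max_redundant_hosts and disks <= a.max_disks
    ]


if __name__ == "__main__":
    print([(a.name, a.max_disks) for a in LIMITS])
    print(candidates(hosts=4, disks=100))
```

A check like this covers scalability only; cost, complexity and workload type still need to be weighed separately.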
Iometer is an open source measurement and characterization project; for details see http://www.iometer.org/
Transaction Processing
The MD3000 array is capable of handling the throughput (IOPS) required for OLTP workloads. This array is a good choice for implementations that fall within its scalability limits. However, for database clusters of more than two nodes, or for database solution implementations that require more than 45 disk drives, a networked storage array should be chosen.
The MD3000i is capable of handling the throughput (IOPS) required for OLTP workloads. In fact, for small OLTP databases (20 or fewer disk drives), this array can produce up to 150% higher throughput than the other arrays reviewed in this paper. For larger databases (up to 45 disk drives), the MD3000i provides 10-20% lower throughput than the other arrays. When additional scalability is required, up to four MD3000i arrays can be connected to the same database host or database cluster.
The CX3-series arrays are capable of handling the throughput (IOPS) required for OLTP workloads. Their increased data cache size and support for single volumes larger than 2 TB combine to make these arrays a strong choice for the largest OLTP implementations.
Data Warehousing
The MD3000 array is capable of handling the bandwidth (MB/s) required for DW workloads. This array is a good choice for smaller DW implementations; however, the primary limitation of the MD3000 array is its scalability, both in terms of the number of supported hosts and of the maximum possible database size.
The MD3000i array is less suitable for DW workloads. The primary limitation of this array is its GbE iSCSI interfaces, which result in 25-35% lower bandwidth (MB/s) than is observed with the other arrays reviewed in this paper.
The CX3-series arrays are capable of handling the bandwidth (MB/s) required for DW workloads. Their increased data cache size, support for a greater number of connected hosts, and higher capacity limits combine to make these arrays a good choice for larger DW implementations.
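One way to see why the GbE interface can limit DW bandwidth is to compare the theoretical per-link ceilings of the three interconnects. The Python sketch below is a back-of-the-envelope calculation, not a reproduction of the measurements in this paper. It assumes the 8b/10b line encoding used by 3 Gbps SAS and 4 Gbps Fibre Channel (general protocol knowledge, not stated in this paper), treats the GbE figure as the data rate, and ignores TCP/IP and iSCSI protocol overhead as well as the number of ports or lanes in a given configuration.

```python
def payload_mb_per_s(nominal_gbps: float, data_bits: int = 1, line_bits: int = 1) -> float:
    """Theoretical payload ceiling of one link in megabytes per second.

    data_bits / line_bits is the line-encoding efficiency, for example
    8 / 10 for 8b/10b encoding.
    """
    bits_per_s = nominal_gbps * 1_000_000_000 * data_bits / line_bits
    return bits_per_s / 8 / 1_000_000


# (nominal Gbps, data bits, line bits); the encoding values are assumptions.
LINKS = {
    "GbE iSCSI port": (1, 1, 1),
    "3 Gbps SAS link": (3, 8, 10),
    "4 Gbps Fibre Channel port": (4, 8, 10),
}

if __name__ == "__main__":
    for name, (gbps, data_bits, line_bits) in LINKS.items():
        print(f"{name}: {payload_mb_per_s(gbps, data_bits, line_bits):.0f} MB/s")
```

Measured bandwidth depends on many other factors, including the number of paths, the number of disks and controller performance, so these ceilings only indicate the relative headroom of each link type.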
[Figure 5 flowchart; only part of its text survives]
Start
Branch label: 1 or 2
Considerations: Both availability and scalability of the solution play a role in this decision. For example, multiple servers may be used for Microsoft SQL deployed in a failover cluster, or for Oracle RAC deployments.
Considerations: Include space for transaction logs, database data, and other required database features (e.g. TempDB space for SQL Server, Flashback space for Oracle).
Total Size of Database
> 4 TB: The scale of your database solution indicates that networked storage should be chosen.
Up to 4 TB: Is there an existing SAN infrastructure?
Branch label: Yes
Figure 5 - Flowchart for Selecting a Storage Array for a Dell Database Solution
The Dell PowerVault MD3000 is an ideal selection. Up to two database servers with Dell SAS 5/E cards can be connected in a DAS topology, and up to 45 disk drives can be used to meet the performance and/or scalability goals of your solution. Because storage consolidation is not desired, and the scalability needs of the solution can be met, there is no need to invest in a networked storage infrastructure. With SAS connections end-to-end (from the host to the disk drive), the MD3000 is well suited for all database workloads, including both transaction-processing and analytical / decision-support systems.
The Dell/EMC CX3-series is an ideal selection. Depending on the model, between two and six database servers can be connected in a DAS topology, and up to 128 servers can be connected in a Fibre Channel SAN. Array models offer a maximum disk capacity of between 60 and 480 disk drives, and up to four arrays can be connected to a database server, allowing the greatest scalability of all of the Dell storage options for your database solution. With 4 Gbps Fibre Channel connections end-to-end (from the host to the disk drive), the Dell/EMC CX3-series is well suited for all database workloads, including both transaction-processing and decision-support.
Conclusions
Dell Database Solutions are designed to simplify operations, improve utilization and cost-effectively scale as your needs grow over time. This white paper provides guidelines that can help you select the appropriate Dell Storage array for your database needs. An evaluation of multiple factors, including scalability, cost, and performance under various database workloads, shows some notable differences between the storage architectures discussed in this paper. Each array has specific advantages with respect to these criteria that must be evaluated before selecting the optimal array for a particular database solution. By examining the results presented in this paper, and using a logical process to evaluate each criterion, a good fit can be found for a variety of sizes and types of database deployments.
The Dell PowerVault MD3000, with its SAS x4 wide host interfaces, provides a range of availability features and offers throughput and bandwidth suitable for many database workloads. This storage array supports up to two redundantly-connected hosts and a total of 45 disk drives, scalability that can meet most entry and many mid-range database needs.
The Dell PowerVault MD3000i, with its GbE iSCSI host interfaces, provides a range of availability and scalability features and offers throughput suitable for OLTP database workloads. Up to 16 redundantly-connected hosts may be attached via an IP SAN, and up to four arrays, each supporting 45 disk drives, may be connected to a single host. The bandwidth available from the MD3000i host interfaces makes it more suitable for OLTP and less suitable for mid-range to large business intelligence solutions.
The Dell/EMC CX3-series arrays, with their 4 Gbps Fibre Channel host interfaces, provide a range of availability features and the best scalability features of the arrays discussed in this paper. The CX3-series offers throughput and bandwidth suitable for demanding OLTP and business intelligence workloads.
The best practices described here are intended to help achieve optimal performance of Oracle Database 10g and SQL Server 2005 on PowerEdge servers and Dell storage. To learn more about Dell Database Solutions, please visit www.dell.com/oracle and www.dell.com/sql or contact your Dell representative for up-to-date information about Dell servers, storage and services for Dell Database Solutions.