Planning Data Warehouse Infrastructure

This document discusses planning the infrastructure for a data warehouse. It covers considerations for hardware sizing based on data volume, users and workload. Typical server topologies are presented, including single server and distributed options. Methods for scaling out the solution are described. The document also discusses planning for high availability and provides an overview of SQL Server Fast Track reference architectures and appliances.

Uploaded by

Richie Poo

Module 2

Planning Data Warehouse Infrastructure
Module Overview

• Considerations for Data Warehouse Infrastructure
• Planning Data Warehouse Hardware
Lesson 1: Considerations for Data Warehouse Infrastructure

• System Sizing Considerations
• Data Warehouse Workloads
• Typical Server Topologies for a BI Solution
• Scaling Out a BI Solution
• Planning for High Availability
System Sizing Considerations

• Data Volume
• Analysis/Report Complexity
• Number of Users
• Availability Requirements


Data Warehouse Workloads

ETL
• Control flow tasks
• Data query and insert
• Network data transfer
• In-memory data pipeline
• SSIS Catalog or msdb I/O

Data Models
• Processing
• Aggregation storage
• Multidimensional on disk
• Tabular in memory
• Query execution

Reporting
• Client requests
• Data source queries
• Report rendering
• Caching
• Snapshot execution
• Subscription processing
• Report Server Catalog I/O

Operations and DW Maintenance
• OS activity
• Logging
• SQL Server Agent Jobs
• SSIS packages
• Indexes
• Backups
Typical Server Topologies for a BI Solution

Topologies range from single-server (few servers) to distributed (many servers). Considerations that vary across this range:

• Hardware costs
• Software license costs
• Configuration complexity
• Scalability and performance
• Flexibility
Scaling Out a BI Solution

• Data Warehouse
• Analysis Services
• Integration Services
• Reporting Services


Planning for High Availability

Data Warehouse
• AlwaysOn Failover Cluster
• RAID Storage

Analysis Services
• AlwaysOn Failover Cluster

Integration Services
• AlwaysOn Availability Group

Reporting Services
• NLB Report Servers
• AlwaysOn Availability Group or AlwaysOn Failover Cluster
Lesson 2: Planning Data Warehouse Hardware

• SQL Server Fast Track Data Warehouse Reference Architectures
• Core-Balanced System Architecture
• Demonstration: Calculating Maximum Consumption Rate (MCR)
• Determining Processor and Memory Requirements
• Determining Storage Requirements
• Considerations for Storage Hardware
• SQL Server Data Warehouse Appliances
• SQL Server Parallel Data Warehouse
SQL Server Fast Track Data Warehouse Reference Architectures

• Pre-tested and approved hardware specifications and guidance
• Available from multiple hardware vendors in partnership with Microsoft
• Support for a range of data warehouse sizes
• Tools provided to calculate required specification
Core-Balanced System Architecture

[Diagram: a server running Windows Server and SQL Server, with two quad-core CPUs and three dual-port FC HBAs, connected through a fibre switch to three storage enclosures. Each enclosure contains storage processors and 4-spindle RAID 10 disk groups.]

Diagram labels:
• Per-Core MCR = 200 MBps
• Total MCR = 1,600 MBps
• 2 x FC Port per processor
• Max I/O Rate = 2,000 MBps (two labels)
• Max I/O Rate = 1,800 MBps
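
The figures on this slide are consistent with total MCR being the per-core MCR multiplied by the number of cores (200 MBps × 8 cores = 1,600 MBps). The following is a minimal sketch of that arithmetic and of a balance check that finds the slowest stage. The stage names `io_stage_a`, `io_stage_b`, and `io_stage_c` are hypothetical, because the slide text does not say which component each Max I/O Rate label belongs to.

```python
# Sketch only: figures come from the slide; stage names are assumptions.
PER_CORE_MCR_MBPS = 200
CORES = 8  # assumption: two quad-core CPUs, as drawn in the diagram


def total_mcr(per_core_mbps, cores):
    """Total consumption rate the CPUs can sustain, in MBps."""
    return per_core_mbps * cores


def bottleneck(rates_mbps):
    """Return (name, rate) of the slowest stage in the data path."""
    name = min(rates_mbps, key=rates_mbps.get)
    return name, rates_mbps[name]


rates = {
    "cpu": total_mcr(PER_CORE_MCR_MBPS, CORES),
    "io_stage_a": 2000,  # hypothetical mapping of a "Max I/O Rate" label
    "io_stage_b": 2000,  # hypothetical mapping
    "io_stage_c": 1800,  # hypothetical mapping
}

print(bottleneck(rates))  # ('cpu', 1600)
```

In this reading, every I/O stage can deliver at least as much data as the processors can consume, so the CPUs set the limit and no storage component starves them.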
Demonstration: Calculating Maximum Consumption Rate (MCR)

In this demonstration, you will see how to:

• Create tables for benchmark queries
• Execute a query to retrieve I/O statistics
• Calculate MCR from the I/O statistics
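
The last step can be sketched as follows, assuming the I/O statistics give a logical read count and a CPU time in milliseconds (as the output of `SET STATISTICS IO` and `SET STATISTICS TIME` does). The calculation shown is the commonly cited Fast Track one: logical reads per CPU second, multiplied by the 8 KB page size and converted to MB. The input values are hypothetical, not results from the demonstration.

```python
PAGE_SIZE_KB = 8  # SQL Server data pages are 8 KB


def mcr_mbps(logical_reads, cpu_time_ms):
    """Estimate per-core MCR in MB/s from benchmark query statistics."""
    cpu_seconds = cpu_time_ms / 1000
    pages_per_second = logical_reads / cpu_seconds
    return pages_per_second * PAGE_SIZE_KB / 1024


# Hypothetical benchmark result: 256,000 logical reads in 10,000 ms of CPU time.
print(mcr_mbps(256_000, 10_000))  # 200.0
```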
Determining Processor and Memory Requirements

Estimating CPU Requirements:

• Determine core MCR
• Apply formula to estimate required number of cores:
  ((Average query size in MB ÷ MCR) × Concurrent users) ÷ Target response time
• Spread cores across CPUs based on the number of storage arrays
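
The core-count formula above can be sketched as a small function. Rounding up to a whole core and treating the target response time as seconds (to match an MCR in MB/s) are assumptions, and the workload figures are hypothetical.

```python
import math


def estimate_cores(avg_query_mb, mcr_mbps, concurrent_users, target_response_s):
    """((average query size in MB / MCR) x concurrent users) / target response time."""
    raw = (avg_query_mb / mcr_mbps) * concurrent_users / target_response_s
    return math.ceil(raw)  # assumption: round up to a whole core


# Hypothetical workload: 18,000 MB scanned per query, 200 MB/s per-core MCR,
# 10 concurrent users, 60-second target response time.
print(estimate_cores(18_000, 200, 10, 60))  # 15
```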

Estimating RAM Requirements:

• Use a minimum of 4 GB per core (or 64–128 GB per socket)
• Target 20% of data volume
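
A minimal sketch of the two RAM guidelines above. Combining them by taking the larger of the two values is an assumption; the slide lists them separately and does not say how to reconcile them. The example inputs are hypothetical.

```python
def estimate_ram_gb(cores, data_volume_gb, gb_per_core=4, data_fraction=0.20):
    """Return a RAM estimate in GB from the per-core floor and the data-volume target."""
    per_core_floor = cores * gb_per_core           # minimum of 4 GB per core
    data_target = data_volume_gb * data_fraction   # 20% of data volume
    return max(per_core_floor, data_target)        # assumption: use the larger value


# Hypothetical server: 16 cores and a 1,000 GB data warehouse.
ram = estimate_ram_gb(16, 1_000)
print(f"{ram:.0f} GB")
```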
Determining Storage Requirements

Estimating Data Volumes for the Data Warehouse

1. Estimate Initial Fact Data
• Number of fact table rows × row size
• Use 100 bytes per row as an estimate if unknown
2. Allow for Indexes and Dimensions
• Add 30–40% for dimensions and indexes
3. Project Fact Data Growth
• Number of new fact rows per month
4. Factor in compression
• Typically 3:1
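
The four steps above can be sketched as one calculation. How the steps combine is an assumption here: the index and dimension overhead and the compression ratio are applied to both the initial and the projected fact data, and 35% is used as a midpoint of the 30–40% range. The row counts and planning horizon are hypothetical.

```python
def estimate_dw_size_gb(initial_rows, new_rows_per_month, months,
                        row_bytes=100, overhead=0.35, compression_ratio=3.0):
    """Estimate compressed data warehouse size in GB."""
    # Steps 1 and 3: initial fact rows plus projected growth, times row size.
    fact_bytes = (initial_rows + new_rows_per_month * months) * row_bytes
    # Step 2: allow for dimensions and indexes (assumed 35%).
    with_overhead = fact_bytes * (1 + overhead)
    # Step 4: factor in compression (typically 3:1).
    compressed = with_overhead / compression_ratio
    return compressed / 1024 ** 3


# Hypothetical: 1 billion initial rows, 10 million new rows per month, 3 years.
size = estimate_dw_size_gb(1_000_000_000, 10_000_000, 36)
print(f"{size:.1f} GB")
```

The result covers only the data warehouse itself; the other storage requirements listed on this slide need to be added separately.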

Other storage requirements

• Configuration databases
• Log files
• tempdb
• Staging tables
• Backups
• Analysis Services models
Considerations for Storage Hardware

• Use more, smaller disks instead of fewer, larger disks
• Use the fastest disks you can afford
• Consider solid state disks, especially for random I/O
• Use RAID 10, or at minimum RAID 5
• Consider a dedicated storage area network for manageability and extensibility
• Balance I/O across enclosures, storage processors, and disk groups
SQL Server Data Warehouse Appliances

• Pre-built hardware and software solutions, based on tested configurations
• Part of a range of appliances that are based on SQL Server
• Available from multiple hardware vendors
SQL Server Parallel Data Warehouse

• A special SQL Server edition only available in hardware appliances
• Shared-nothing architecture
• Massively parallel processing
• Dedicated control nodes, compute nodes, and storage nodes
Lab: Planning Data Warehouse Infrastructure

• Exercise 1: Planning Data Warehouse Hardware

Logon Information
Virtual machine: 20767C-MIA-SQL
User name: ADVENTUREWORKS\Student
Password: Pa55w.rd

Estimated Time: 30 minutes


Lab Scenario

You are planning a data warehouse solution for Adventure Works Cycles, and have been asked to specify the hardware that is required. You must design a solution that is based on SQL Server that provides the right balance of functionality, performance, and cost.
Lab Review

• Review DWHardwareSpec.xlsx in the D:\Labfiles\Lab02\Solution folder. How does the hardware specification in this workbook compare to the one that you created in the lab?
Module Review and Takeaways

• Review Question(s)
