
Data50 2020 02 - Feb 09


aka.ms/DATA50 #MSIgniteTheTour
Optimize data warehousing query performance
Speaker name
Title

Resources

Session Resources Hub

aka.ms/DATA50

Session Code on GitHub

aka.ms/DATA50Repo

All Event Session Resources

aka.ms/mymsignitethetour

Agenda

What is Azure Synapse Analytics

Maximizing Performance

Query Performance Tuning

Agenda

What is Azure Synapse Analytics

Using Polybase to Load Data in a data warehouse

Data Loading best practices

What is Azure Synapse Analytics?

Azure Synapse Analytics

A limitless analytics service with unmatched time to insight that delivers insights from all your data, across data warehouses and big data analytics systems, with blazing speed

Data Warehouse Processes

Provision, Load, Query

Automate workflow via Azure Data Factory

Data Warehouse Architecture

[Diagram: a control node in front of a grid of compute nodes, each holding its own slice of the data.]
Data warehouse performance in Azure Synapse Analytics
Query performance

[Diagram: end-to-end data flow]
• Sources: logs, files, and media (unstructured); business and custom apps (structured)
• Transactional storage: applications manage their transactional data directly in SQL DB
• Data ingestion: Azure Data Factory loads flat files into the data lake on a schedule
• Data prep.: Azure Data Factory extracts and transforms relational data
• Data storage: Azure Storage / Data Lake Store
• Data preparation: Azure Databricks reads data from files using DBFS
• Load: Azure Data Factory loads processed data into SQL DW tables optimized for analytics (Azure Synapse Analytics)
• Serving and visualization: applications and Power BI dashboards

Maximizing Performance

Maximizing Query Performance
Table distribution

• Round-robin tables
• Hash-distributed tables
• Replicated tables

Maximizing Query Performance
Round-robin distribution

Round-robin tables
• Is the default option for newly created tables
• Distributes rows evenly, and without regard to their values, across all distributions, so every compute node holds a similar share of the data
• Loading into round-robin tables is fast
• Queries on round-robin tables may require more data movement, as data is “reshuffled” to organize it for the query
• Great to use for loading staging tables
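
To make the trade-off concrete, here is a minimal Python sketch of round-robin placement. It assumes 60 distributions and a strictly cyclic assignment; the service itself places rows without regard to their values rather than in a guaranteed cycle, and the sample rows are hypothetical.

```python
from collections import Counter
from itertools import cycle

DISTRIBUTIONS = 60  # assumption: one slot per distribution database


def round_robin_assign(rows, distributions=DISTRIBUTIONS):
    """Pair each row with the next distribution in a repeating cycle."""
    return list(zip(cycle(range(distributions)), rows))


# Hypothetical staging rows: (customer_id, amount)
rows = [(i % 7, 10.0 * i) for i in range(600)]
placed = round_robin_assign(rows)

# Every distribution receives the same number of rows, so loading is balanced.
rows_per_distribution = Counter(dist for dist, _ in placed)
print(min(rows_per_distribution.values()), max(rows_per_distribution.values()))

# Rows for a single customer are scattered over many distributions, so a join
# or GROUP BY on customer_id needs data movement before it can be evaluated.
homes_of_customer_3 = {dist for dist, row in placed if row[0] == 3}
print(len(homes_of_customer_3))
```

The even row counts are why loads are fast; the scattered customer rows are why queries may need to reshuffle data.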

Maximizing Query Performance
Hash distribution

Hash-distributed tables
• Distributes rows based on the value in the distribution column, using a deterministic hash function to assign each row to one distribution
• Is designed to achieve high performance for queries that run against large fact tables in a star schema
• Choosing a good distribution column is important to ensure the hash distribution performs well
• As a starting point, use on tables that are greater than 2GB in size and have frequent inserts, updates, and deletes
• But don’t choose a volatile column for the hash distribution column
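
The effect of the distribution column can be sketched in a few lines of Python. This is an illustration only: SHA-256 stands in for the service's hash function, which these slides do not show, and the 60 distributions and the sample key values are assumptions.

```python
import hashlib
from collections import Counter

DISTRIBUTIONS = 60  # assumption: one slot per distribution database


def distribution_for(key, distributions=DISTRIBUTIONS):
    """Map a distribution-column value to one distribution, deterministically.

    SHA-256 is a stand-in for the real hash function.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % distributions


def rows_per_distribution(keys):
    """Count how many rows land on each distribution."""
    return Counter(distribution_for(k) for k in keys)


# The same value always maps to the same distribution.
same_home = distribution_for("customer-42") == distribution_for("customer-42")

# High-cardinality column (hypothetical order ids): rows spread widely.
many_keys = rows_per_distribution(f"order-{i}" for i in range(6000))

# Low-cardinality column (hypothetical status flag): at most three
# distributions hold any rows, and the rest sit idle.
few_keys = rows_per_distribution(["open", "closed", "void"][i % 3] for i in range(6000))

print(same_home, len(many_keys), len(few_keys))
```

A column with few distinct values concentrates the table on a handful of distributions, which is one way a poor distribution column hurts performance.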
Maximizing Query Performance
A full copy of a table is placed on every
Replicated Table
single compute node to minimize data
movement

Replicated Works well for dimension tables in a star

Tables schema that are less than 2GB in size
and are used regularly in queries with
simple predicates

Should not be used on dimension tables

that are updated on a regular basis

You can convert existing round-robin

tables to replicated tables to take
advantage of the feature using a CTAS
statement
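
As a sketch of the conversion, the Python helper below builds the text of a CTAS statement that copies a table into a new replicated table. The schema, table name, and `_replicated` suffix are hypothetical; review the generated T-SQL before running it against a real pool.

```python
def quote_name(name):
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def ctas_to_replicated(schema, table, suffix="_replicated"):
    """Return T-SQL that copies a table into a new replicated table via CTAS."""
    source = f"{quote_name(schema)}.{quote_name(table)}"
    target = f"{quote_name(schema)}.{quote_name(table + suffix)}"
    return (
        f"CREATE TABLE {target}\n"
        "WITH (DISTRIBUTION = REPLICATE)\n"
        f"AS SELECT * FROM {source};"
    )


# Hypothetical dimension table currently stored as round-robin.
print(ctas_to_replicated("dbo", "DimProduct"))
```

After checking the new table, the old one would typically be renamed or dropped so that queries pick up the replicated copy.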

Create statistics after loading
Improve the query performance for users

[Diagram: production tables in Azure Synapse Analytics.]
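
One way to script this step is to generate a `CREATE STATISTICS` statement per column once a load finishes. The Python sketch below only builds the statement text; the table, the column names, and the `stats_<table>_<column>` naming pattern are assumptions for illustration.

```python
def create_statistics_statements(schema, table, columns):
    """Return one single-column CREATE STATISTICS statement per column."""
    statements = []
    for column in columns:
        stats_name = f"stats_{table}_{column}"  # hypothetical naming pattern
        statements.append(
            f"CREATE STATISTICS [{stats_name}] ON [{schema}].[{table}] ([{column}]);"
        )
    return statements


# Hypothetical fact table and the columns used in joins and filters.
for statement in create_statistics_statements("dbo", "FactSales", ["CustomerKey", "OrderDateKey"]):
    print(statement)
```

Columns used in joins, filters, and GROUP BY clauses are the usual first candidates.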

Demo: Query Performance Tuning

Query Performance Tuning

Query Data Store

Overcomes the 10,000-row limit of DMV output

Pinpoint and fix queries with plan regression
• View queries which produce multiple plans
• 7-day retention period
• Full query text

A/B testing with your Azure Synapse Analytics (SQL DW)

Identify, improve and tune ad hoc queries
• Top hitting queries for performance tuning

© Microsoft Corporation
Query Data Store
Dynamic Management Views

• Requires the VIEW DATABASE STATE permission
• DMV timestamps are in the UTC time zone

Query: sys.query_store_query, query_id (PK)
Query text: sys.query_store_query_text, query_text_id (PK)
Plan: sys.query_store_plan, plan_id (PK)
Runtime stats: sys.query_store_runtime_stats, runtime_stats_id (PK)
Runtime stats interval: sys.query_store_runtime_stats_interval, runtime_stats_interval_id (PK)
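
How these views relate can be sketched with an in-memory SQLite database standing in for the warehouse. The tables below mirror only a few columns of the Query Store views (without the `sys.` prefix), the sample rows are invented, and the query finds statements that produced more than one plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE query_store_query_text (query_text_id INTEGER PRIMARY KEY, query_sql_text TEXT);
CREATE TABLE query_store_query (query_id INTEGER PRIMARY KEY, query_text_id INTEGER);
CREATE TABLE query_store_plan (plan_id INTEGER PRIMARY KEY, query_id INTEGER);
CREATE TABLE query_store_runtime_stats (
    runtime_stats_id INTEGER PRIMARY KEY, plan_id INTEGER, avg_duration REAL);

-- Invented sample data: query 10 has two plans, query 20 has one.
INSERT INTO query_store_query_text VALUES
    (1, 'SELECT SUM(amount) FROM FactSales'), (2, 'SELECT COUNT(*) FROM DimDate');
INSERT INTO query_store_query VALUES (10, 1), (20, 2);
INSERT INTO query_store_plan VALUES (100, 10), (101, 10), (200, 20);
INSERT INTO query_store_runtime_stats VALUES (1, 100, 120.0), (2, 101, 950.0), (3, 200, 15.0);
""")

# Join query -> text -> plan -> runtime stats and keep multi-plan queries.
multi_plan = conn.execute("""
SELECT q.query_id, t.query_sql_text,
       COUNT(DISTINCT p.plan_id) AS plan_count,
       MAX(rs.avg_duration) AS slowest_avg_duration
FROM query_store_query AS q
JOIN query_store_query_text AS t ON t.query_text_id = q.query_text_id
JOIN query_store_plan AS p ON p.query_id = q.query_id
JOIN query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
GROUP BY q.query_id, t.query_sql_text
HAVING COUNT(DISTINCT p.plan_id) > 1
""").fetchall()
print(multi_plan)
```

A large gap between the plans' average durations is a hint that one of them is a regression worth investigating.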

Query execution with Query Data Store

[Diagram: queries arrive at the control node (engine, QDS, shell DB, DMS) and fan out in numbered steps to four compute nodes. Each compute node has its own DMS and a SQL DB hosting 15 of the 60 distribution databases, Dist_DB_1 through Dist_DB_60.]

• Flush to disk every 15 minutes
• 10GB is the max storage size
• Retention period is 7 days
• Maximum plans per query is 200
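
The layout in the diagram, 60 distribution databases shared among the compute nodes, can be sketched as below. Contiguous blocks and an even split are assumptions made for illustration; the numbering the service actually uses may differ.

```python
TOTAL_DISTRIBUTIONS = 60  # Dist_DB_1 through Dist_DB_60


def distributions_by_node(compute_nodes, total=TOTAL_DISTRIBUTIONS):
    """Split the distributions into equal, contiguous blocks, one per node."""
    if compute_nodes < 1 or total % compute_nodes != 0:
        raise ValueError("compute_nodes must divide the distribution count evenly")
    per_node = total // compute_nodes
    return {
        node: range(node * per_node + 1, (node + 1) * per_node + 1)
        for node in range(compute_nodes)
    }


# Four compute nodes, as drawn on the slide: 15 distributions each.
layout = distributions_by_node(4)
for node, dists in layout.items():
    print(f"compute node {node + 1}: Dist_DB_{dists[0]} to Dist_DB_{dists[-1]}")
```

Adding compute nodes leaves the distribution count unchanged; each node simply hosts fewer distributions.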
Azure Synapse Analytics recommendations

[Diagram: Azure Synapse Analytics telemetry feeds recommendation generation (every 24 hours); results surface through the recommendation API in the Azure Advisor recommendation blade.]

Recommendation areas:
• Data skew
• Replicate tables
• Stats
• Tempdb
• Adaptive cache

In Summary: Query Performance

Select the proper table distribution

Detect data skew
• Use Query Data Store
• Consider changing key columns
• Only as fast as your slowest distribution

Provision additional adaptive cache capacity

Reduce tempdb contention

Create and update statistics
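
The "only as fast as your slowest distribution" point can be illustrated with a small skew check. This Python sketch assumes you already have a row count per distribution from some monitoring query (not shown); the ratio of the largest distribution to the average is one simple, unofficial skew measure.

```python
from statistics import mean


def skew_report(rows_per_distribution):
    """Compare the largest distribution with the average across all of them."""
    if not rows_per_distribution or sum(rows_per_distribution) == 0:
        raise ValueError("need at least one non-empty distribution")
    largest = max(rows_per_distribution)
    average = mean(rows_per_distribution)
    return {
        "largest": largest,
        "average": average,
        "skew_ratio": largest / average,  # 1.0 means perfectly even
    }


# Hypothetical row counts for 60 distributions.
balanced = [1000] * 60
skewed = [500] * 59 + [30500]  # same total rows, one hot distribution

print(skew_report(balanced)["skew_ratio"])
print(skew_report(skewed)["skew_ratio"])
```

Both tables hold the same number of rows, but in the skewed case one distribution does far more work than the rest, and a scan finishes only when that one does. Changing the distribution key column is the usual remedy.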

MS Learn alert
Complete interactive learning exercises, watch videos, and practice and apply your new skills.
aka.ms/DATA50MSLearnCollection

Resources

Session Resources
aka.ms/DATA50

Session Code on GitHub

aka.ms/DATA50repo

All Event Resources

aka.ms/mymsignitethetour
Get Certified
aka.ms/azuredataengineer

