Microsoft SQL Server to Databricks Migration Guide
Contents
Introduction
About this guide
Migration strategy
Overview of the migration process
Introduction
ABOUT THIS GUIDE
OVERVIEW OF THE MIGRATION PROCESS
Despite the substantial differences between SQL Server and Databricks,
there are surprising similarities that can facilitate the migration process:
Phase 1: Migration Discovery and Assessment
Typically, these profiling tools examine SQL Server system usage via its system views and catalog tables, providing consumption and complexity insights and an inventory of objects and code migration complexity. They capture the types of workloads, long-running ETL queries and user access patterns. This level of analysis aids in pinpointing databases and pipelines that contribute to high operational costs and complexity, thereby supporting the prioritization process.

Our BladeBridge Code Analyzer not only classifies queries based on their complexity in “T-shirt sizes” (small, medium, large, extra-large, etc.) but also assesses function compatibility of BTEQ scripts and stored procedures, which is vital in ensuring seamless migration.

Running Databricks Migration Analyzer
1 Export metadata from legacy systems
2 Install and point Analyzer to the metadata location
3 Review all code patterns and job complexity
4 Databricks PS + SI Partner to give you full migration proposal
Figure 2: Running Databricks migration analyzer
Phase 2: Architecture Design and Planning
SQL Server and Databricks operate in markedly different ways. SQL Server requires careful selection of a suitable Primary Index with high cardinality to ensure proper data distribution.

By contrast, the Databricks Data Intelligence Platform is a distributed system by design; the data distribution depends on the configuration of a cluster and the nature of the data. For example, if data is loaded from a sample CSV file into a Databricks Notebook using the spark.read.csv() function, the data will be automatically distributed across the nodes in a cluster. By default, Spark will split the data into partitions, and each partition is processed by a separate task on one of the nodes. This allows for efficient parallel processing of the data.

The Databricks distributed design facilitates horizontal scaling, enabling data distribution and computations across multiple nodes in a cluster. This capability allows Databricks to process large datasets and handle high query volumes efficiently, surpassing the capabilities of a traditional database system like on-premises SQL Server.

It is essential to consider these distinctions when migrating from SQL Server to Databricks. By consciously mapping the similarities and differences between the two platforms, organizations can better understand how Databricks’ capabilities compare with those of SQL Server.
EDW ARCHITECTURE
Figure 3: SQL Server reference architecture of an enterprise data warehouse
It is imperative to analyze the current (as-is) architecture comprehensively. This involves understanding upstream and downstream integrations and the respective tools and technologies.
Below is an example of a data warehousing architecture on Databricks
with various ISV partner integration options.
Figure 4: Modern data warehousing on Databricks

Following the architectural alignment, we will dive deeply into SQL Server’s current features.

SQL SERVER VS. DATABRICKS FEATURE MAPPING EXAMPLE
OBJECTS/WORKLOAD | SQL SERVER | DATABRICKS
Compute | SQL Server on-premises compute | Databricks Managed Clusters optimized for workload types with a runtime:
Storage | Physical HDD or SSD for on-premises deployment. | Cloud storage (Amazon S3, Azure Blob Storage, Azure Data Lake Storage Gen2, Google Cloud Storage)
Format | SQL Server proprietary | Delta and Iceberg Format (open source)
Database Objects | Tables, Views, Materialized Views (Join Index), Stored Procedures, UDFs | Tables, Views, Materialized Views, DLT, UDFs
Metadata Catalog | Built-in system tables under the DBC schema | Unity Catalog
Data Sharing | No native support for on-premises mode | Delta Sharing, Delta Sharing Marketplace
Storage Format | SQL Server Proprietary | Delta (Parquet files with metadata) and Iceberg format
Orchestration | SSIS - SQL Server Integration Services | Databricks Workflows
The table above compares key features between SQL Server and Databricks. Undertaking a thorough comparison is essential during this stage of the migration process. This systematic process ensures a comprehensive understanding of the required transformation, facilitating a smoother transition by identifying equivalent services, functionalities and potential gaps or challenges.
Typically, by the end of this phase, we have a good handle on the scope
and complexity of the migration and can come up with a more accurate
migration plan and cost estimate.
Phase 3: Data Warehouse Migration
• What is the target design for the tables being migrated?
Maintaining the existing schema during migration ensures consistency, easing the data verification process and fostering a more reliable and efficient migration.
RECOMMENDED APPROACH
It is important to note that not every data migration will follow the same pattern, as each migration is influenced by unique factors such as data volume, system complexity and organizational requirements. However, Databricks recommends adhering to the following general flow for an effective and efficient data migration process:
1 Migrate enterprise data warehouse (EDW) tables into Delta Lake medallion data architecture:
• The raw layer into Bronze
• The stage/central or historical layer to Silver
• The final layer or semantic layer to Gold
PHASE 3.1: SCHEMA MIGRATION
1 Namespace mapping, i.e., schema/object in SQL Server to catalog/schema/object in Databricks Unity Catalog.
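As a rough illustration of the namespace mapping step, the Python sketch below turns a two-level SQL Server name into a three-level Unity Catalog name. The function name, the target catalog name and the normalization policy (lowercase, non-alphanumeric characters replaced with underscores) are assumptions made for this example, not conventions prescribed by this guide.

```python
import re


def to_unity_catalog_name(sqlserver_name: str, catalog: str) -> str:
    """Map a SQL Server 'schema.object' name to 'catalog.schema.object'.

    Hypothetical helper: it does not handle dots inside bracket-quoted names.
    """
    # Strip T-SQL bracket quoting such as [dbo].[Order Details].
    parts = [part.strip("[]") for part in sqlserver_name.split(".")]
    if len(parts) != 2:
        raise ValueError(f"expected 'schema.object', got {sqlserver_name!r}")

    def normalize(identifier: str) -> str:
        # Assumed naming policy: lowercase, anything outside [a-z0-9_] becomes '_'.
        return re.sub(r"[^a-z0-9_]", "_", identifier.lower())

    schema, obj = parts
    return ".".join(normalize(p) for p in (catalog, schema, obj))


mapped = to_unity_catalog_name("[dbo].[Order Details]", "sales_prod")
print(mapped)
```

In practice the catalog would usually be derived from the source database or environment; keeping the rule in a single function makes the mapping easy to review before any DDL is generated.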
4 Caution must be exercised when converting PRIMARY and SECONDARY indices to partitions in Delta tables. Over-partitioning can lead to unnecessary overhead and small file problems, ultimately compromising performance in the Lakehouse architecture. Databricks recommends partitioning only tables larger than 1TB, and Z-order indexes and predictive optimization simplify the design in Databricks.
5 Additional Delta table properties can be specified via the TBLPROPERTIES clause, e.g., delta.targetFileSize, delta.tuneFileSizesForRewrites, delta.columnMapping.mode, and others.
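To make the TBLPROPERTIES step concrete, here is a small Python sketch that renders a CREATE TABLE statement with Delta table properties. The helper name, the table name, the columns and the property values are all hypothetical; only the property keys come from the text above.

```python
def render_create_table(table: str, columns: dict, properties: dict) -> str:
    """Render a Delta CREATE TABLE statement with a TBLPROPERTIES clause."""
    column_sql = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns.items())
    property_sql = ",\n  ".join(f"'{key}' = '{value}'" for key, value in properties.items())
    return (
        f"CREATE TABLE {table} (\n  {column_sql}\n)\n"
        "USING DELTA\n"
        f"TBLPROPERTIES (\n  {property_sql}\n)"
    )


ddl = render_create_table(
    "main.sales.orders",  # hypothetical catalog.schema.table
    {"order_id": "BIGINT", "order_ts": "TIMESTAMP"},
    {
        "delta.targetFileSize": "100mb",  # illustrative value, not a recommendation
        "delta.columnMapping.mode": "name",
    },
)
print(ddl)
```

Generating the DDL from a dictionary keeps the properties for each migrated table in one reviewable place; the resulting string can then be run in a notebook or SQL editor.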
PHASE 3.2: DATA MIGRATION
Transferring legacy on-premises data to a cloud storage location for seamless consumption in Databricks can be a demanding task, but there are a few viable options:
5 Using Databricks’ JDBC Connector: Databricks provides a JDBC (Java Database Connectivity) connector that facilitates direct reading from SQL Server databases.
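The JDBC option might look roughly like the following Python sketch, which assumes Spark's generic JDBC data source together with the Microsoft SQL Server JDBC driver class. The host, database, table and user names are placeholders, and `read_table` expects an existing SparkSession (for example the `spark` object available in a Databricks notebook).

```python
def sqlserver_jdbc_options(host, database, table, user, password, port=1433):
    """Build the option map for Spark's JDBC data source (SQL Server URL format)."""
    return {
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_table(spark, options):
    """Read one SQL Server table into a DataFrame using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Placeholder connection details; in practice the password would come from a secret store.
opts = sqlserver_jdbc_options(
    "sqlhost.example.com", "SalesDW", "dbo.Orders", "svc_user", "<secret>"
)
print(opts["url"])
```

For large tables, Spark's JDBC source also accepts the `partitionColumn`, `lowerBound`, `upperBound` and `numPartitions` options so that the read is split across several parallel connections.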
PHASE 3.3: OTHER DATABASE OBJECTS MIGRATION
Other database objects, such as views, stored procedures, macros and functions, can also be easily migrated to Databricks via our automated code conversion processes. Please review this helpful cheat sheet packed with essential tips and tricks to help users start on Databricks using SQL programming in no time! Some key pointers while converting T-SQL-specific objects:
• Stored procedures are typically used in data warehouse environments to leverage the ELT pattern. This methodology signifies that most data processing transactions are performed in the warehouse.
Implement Slowly Changing Dimensions
PHASE 3.4: DATA SECURITY MIGRATION
When discussing security migration, we need to consider both
authentication and authorization.
Authentication
Though Databricks supports Hive metastore, this document will focus on Unity Catalog only, as it is a future-proof approach for data governance in Databricks Data Intelligence Platform.

From a security perspective, Unity Catalog shares many similarities with SQL Server. Both offer authorization models where permissions are assigned to objects and use ANSI-compliant SQL statements such as GRANT and REVOKE. However, it is crucial to understand the difference between the two to execute a successful migration.
Similarly, Databricks doesn’t have fixed database roles, which are widely used in SQL Server (e.g., db_owner or db_securityadmin). We recommend revising the usage of such roles and moving to a more granular permissions model in the Unity Catalog.

Both SQL Server and Databricks offer similar GRANT and REVOKE statement syntax. However, Databricks doesn’t possess DENY statements. Therefore, there is no explicit support for the denial of permission. If you use this security technique in your design, we suggest revising and redesigning to an approach based on explicit grants and, optionally, permission inheritance.
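One way to start the redesign away from DENY is to compute which permissions remain once denials are subtracted and emit only explicit grants. The Python sketch below is a deliberately simplified illustration: it assumes permissions are exported as (principal, privilege, table) tuples and only cancels exact matches, so it does not model SQL Server behavior such as a DENY on a role overriding a GRANT to one of its members. All names are hypothetical.

```python
def effective_grants(grants, denies):
    """Return GRANT statements for permissions not cancelled by a matching DENY.

    Each entry is a (principal, privilege, table) tuple.
    """
    allowed = sorted(set(grants) - set(denies))
    return [
        f"GRANT {privilege} ON TABLE {table} TO `{principal}`"
        for principal, privilege, table in allowed
    ]


grants = [
    ("analysts", "SELECT", "main.sales.orders"),
    ("analysts", "SELECT", "main.hr.salaries"),
]
denies = [("analysts", "SELECT", "main.hr.salaries")]

statements = effective_grants(grants, denies)
for statement in statements:
    print(statement)
```

Reviewing the generated list with the data owners is a useful checkpoint, since any access that previously depended on a DENY has to be expressed as the absence of a grant.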
• Databricks
• SQL Server
We recommend following these standard practices in Databricks Unity Catalog to enable better permissions manageability and operational excellence.
Phase 4: Code and ETL Pipelines Migration
Figure 6: DLT pipelines
QUERY MIGRATION AND REFACTORING
• Cost and time-effective: Our Converter reduces the cost and time
required for a migration project by automating the process.
Code Optimization
Many queries will likely need to be refactored and optimized during the migration process. Features such as automatic liquid clustering and predictive optimization make performance tuning an almost fully automated process in Databricks. Predictive Optimization uses techniques like:
1 Compaction - that optimizes file sizes.
2 Liquid clustering - that incrementally clusters incoming data, enabling optimal data layout and efficient data skipping.
3 Running Vacuum - which reduces costs by deleting unneeded files from storage.
4 Automatic updating of Statistics - running the ANALYZE TABLE command to compute statistics on the required columns for best performance.
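For orientation, the operations listed above correspond to SQL commands that can also be run manually when predictive optimization is not enabled. The Python sketch below only builds those statements as strings; the helper name, table name and column names are hypothetical.

```python
def maintenance_statements(table, stats_columns):
    """Return the manual SQL commands for compaction, vacuum and statistics."""
    columns = ", ".join(stats_columns)
    return [
        f"OPTIMIZE {table}",  # compacts small files (and clusters liquid-clustered tables)
        f"VACUUM {table}",    # deletes files no longer referenced by the table
        f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR COLUMNS {columns}",
    ]


statements = maintenance_statements("main.sales.orders", ["order_id", "order_ts"])
for statement in statements:
    print(statement)
```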
SQL Server to Databricks Cutover Phase
During this phase, data workloads run concurrently in SQL Server and Databricks, which presents an opportunity for a comparative analysis to understand the behavior of workloads in Databricks versus SQL Server. This can help identify potential bottlenecks or shortcomings resulting from the code migration and refactoring phase.

To minimize expenses and disruption to business during this transition, consider the following recommendations:
With this phase complete, ETL workloads are fully migrated and
operational in Databricks, and the final layer data in SQL Server is
synchronized with the Gold layer data in Databricks.
Phase 5: BI and Analytics Tools Integration
Many customers take this opportunity to optimize their BI models and
semantic layers to align with business needs.
Figure 9: Future-state architecture

During report migration, you may encounter a scenario where expanding the permissions of BI tool access to cloud storage buckets becomes necessary to leverage the Databricks Cloud Fetch feature. This feature enables high-bandwidth data exchange and enhances the efficiency of data retrieval. For more details, refer to the blog How We Achieved High-bandwidth Connectivity With BI Tools.
MICROSOFT POWER BI INTEGRATION
When migrating Power BI datasets to Azure Databricks, standard migration techniques may apply, including piloting or MVP, using separate Power BI workspaces for testing and data validation after switching datasets to Azure Databricks SQL.

For more information on implementing Semantic Lakehouse with Azure Databricks and Power BI, please refer to the following blog posts:
• The Semantic Lakehouse With Azure Databricks and Power BI
• Power Up Your BI With Microsoft Power BI and Lakehouse in Azure
Databricks: Part 1 — Essentials
1. A proprietary functional language primarily used for data transformation, allowing for the import, filtering, merging and shaping of data
Phase 6: Migration Validation
The primary validation method for a data pipeline is the resulting dataset itself. We recommend establishing an automated testing framework that can be applied to any pipeline. Typically, this involves using a testing framework with a script capable of automatically comparing values in both platforms.

Databricks recommends you perform the following checks at a minimum:
• Check to see if a table exists
• Check the counts of rows and columns across the tables

Run the pipelines in parallel for a specific period (we find one week to be an acceptable baseline, but you may wish to extend this to ensure stability) and review the comparison results to ensure the data is ingested and transformed into the proper context.
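The minimum checks described above (table existence, row counts and column counts) can be sketched as a small comparison script. The example below is an assumption-laden illustration: it uses two in-memory SQLite databases as stand-ins for SQL Server and Databricks so that it runs anywhere, and the function names are hypothetical. A real framework would use the respective database connectors and the metadata queries of each platform.

```python
import sqlite3


def table_profile(conn, table):
    """Return (row_count, column_count) for a table, or None if it does not exist."""
    cursor = conn.cursor()
    try:
        cursor.execute(f"SELECT * FROM {table} LIMIT 0")
    except sqlite3.OperationalError:
        return None  # table is missing on this platform
    column_count = len(cursor.description)
    row_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return row_count, column_count


def compare_table(source_conn, target_conn, table):
    """Compare existence and row/column counts of one table on both platforms."""
    source = table_profile(source_conn, table)
    target = table_profile(target_conn, table)
    return {
        "exists_in_both": source is not None and target is not None,
        "counts_match": source is not None and source == target,
    }


# Stand-in databases: the "target" is deliberately missing one row.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn in (source, target):
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
target.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5)])

result = compare_table(source, target, "orders")
print(result)
```

Running such a comparison for every migrated table during the parallel-run period gives a simple pass/fail report that can be extended later with checksums or column-level aggregates.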
A robust data validation requires the following components:
Need Help Migrating?
Migration Strategy and Design: Our architects will work with your
team to finalize the target Databricks architecture, detailed migration
plan and technical approaches for the migration phases outlined in
this guide. We will help select appropriate migration patterns, tools
and delivery partners and collaborate with our certified SI partners
to develop a comprehensive Statement of Work (SOW).
Execute and Scale: We and our certified partners deliver on our comprehensive migration plan and then work with your team to facilitate knowledge sharing and collaboration and scale successful practices across the organization. Our experts can help you set up a Databricks Center of Excellence (CoE) to capture and disseminate lessons learned and drive standardization and best practices as you expand to new use cases.

Contact your Databricks representative or use this form for more information. Our specialists can help you every step of the way!
About Databricks
Databricks is the data and AI company. More than 10,000
organizations worldwide — including Block, Comcast, Condé
Nast, Rivian, Shell and over 60% of the Fortune 500 — rely on the
Databricks Data Intelligence Platform to take control of their data and
put it to work with AI. Databricks is headquartered in San Francisco,
with offices around the globe, and was founded by the original
creators of Lakehouse, Apache Spark™, Delta Lake and MLflow.
© Databricks 2025. All rights reserved. Apache, Apache Spark, Spark and
the Spark logo are trademarks of the Apache Software Foundation.