Data Analysis with Databricks SQL
Become Certified
• Five certification pathways
• Data Analyst
• Business Leader
• Data Engineer
• Machine Learning Practitioner
• Platform Administrator
• Recommended Self-Paced Courses
• How to Ingest Data for Databricks SQL
• How to Integrate BI Tools with Databricks SQL
4 Warehouse Configuration
[Figure: Lakehouse Platform partner ecosystem. Title fragment: "Unify your data ecosystem with open". Labels: Amazon Redshift, AWS Glue, Data Providers, Centralized Governance, Top Consulting & SI Partners. 450+ partners across the data landscape.]
• Maximize existing investments by connecting your preferred BI tools to your data lake with Databricks SQL Warehouses. Re-engineered and optimized connectors ensure fast performance, low latency, and high user concurrency to your data lake. Now analysts can use the best tool for the job on one single source of truth for your data while minimizing additional ETL and data silos.
• Respond to business needs faster with a self-served experience designed for every analyst in your organization. Databricks SQL Analytics provides simple and secure access to data, the ability to create or reuse SQL queries to analyze the data that sits directly on your data lake, and a way to quickly mock up and iterate on visualizations and dashboards that best fit the business.
• Build rich and custom data-enhanced applications for your own organization or your customers. Benefit from the ease of connectivity, management, and better price / performance of Databricks SQL Analytics to simplify development of data-enhanced applications at scale, all served from your data lake.
• Schema (also known as a database)
• Second level of organization, beneath the catalog
• Users can see all schemas where USAGE is granted on both the schema and the catalog
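The visibility rule above can be sketched with two grants. This is a minimal sketch, not the course's own example: the catalog name main, the schema name sales, and the group name analysts are hypothetical, and it assumes the privilege is spelled USAGE as on the slide (in Unity Catalog the corresponding privileges are named USE CATALOG and USE SCHEMA).

```sql
-- Hypothetical names: catalog "main", schema "sales", group "analysts".
-- USAGE is needed at both levels; granting it on the schema alone
-- does not make the schema visible.
GRANT USAGE ON CATALOG main TO `analysts`;
GRANT USAGE ON SCHEMA main.sales TO `analysts`;

-- Run as a member of "analysts" with "main" as the current catalog:
-- the "sales" schema should now appear in the listing.
SHOW SCHEMAS;
```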
• Databricks SQL can ingest Parquet, JSON, CSV, Delta, and more
• Individual file
• Full directory of files of a single type
• Example (Azure Databricks):
CREATE TABLE table1 LOCATION
'wasbs://[container]@[account].blob.core.windows.net/[path/]';
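The two input shapes listed above (an individual file, or a full directory of files of a single type) might look like this in practice. This is a sketch under stated assumptions: the paths and the table name sales_csv are hypothetical, and the CSV options shown assume the files have a header row.

```sql
-- Hypothetical paths and table name, for illustration only.

-- Individual file: query a single Parquet file in place.
SELECT * FROM parquet.`/mnt/demo/sales/part-0000.parquet` LIMIT 10;

-- Full directory: register a folder of CSV files (all one type) as a table.
-- header and inferSchema are standard CSV reader options.
CREATE TABLE sales_csv
USING CSV
OPTIONS (header 'true', inferSchema 'true')
LOCATION '/mnt/demo/sales_csv/';
```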
• Example:
SELECT id, name, deptname
FROM employee
INNER JOIN department ON employee.deptno = department.deptno;
Join types (one slide each, with example output):
• INNER JOIN
• LEFT JOIN
• RIGHT JOIN
• FULL JOIN
• SEMI JOIN
• ANTI JOIN
• CROSS JOIN
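The join types listed above can be compared side by side on a small data set. The sample rows below are invented for illustration and are not the data from the slides: employee Cara has a deptno with no matching department, and the Finance department has no employees. Semi and anti joins are written here as LEFT SEMI JOIN and LEFT ANTI JOIN.

```sql
-- Hypothetical sample data (not from the course).
CREATE OR REPLACE TEMPORARY VIEW employee AS
SELECT * FROM VALUES
  (1, 'Ana', 10), (2, 'Ben', 20), (3, 'Cara', 40) AS t(id, name, deptno);

CREATE OR REPLACE TEMPORARY VIEW department AS
SELECT * FROM VALUES
  (10, 'Sales'), (20, 'Engineering'), (30, 'Finance') AS t(deptno, deptname);

-- INNER JOIN: only rows with a match on both sides (Ana, Ben).
SELECT id, name, deptname
FROM employee
INNER JOIN department ON employee.deptno = department.deptno;

-- LEFT JOIN: every employee; Cara gets a NULL deptname.
SELECT id, name, deptname
FROM employee
LEFT JOIN department ON employee.deptno = department.deptno;

-- RIGHT JOIN: every department; Finance gets NULL id and name.
SELECT id, name, deptname
FROM employee
RIGHT JOIN department ON employee.deptno = department.deptno;

-- FULL JOIN: every row from both sides (4 rows with this data).
SELECT id, name, deptname
FROM employee
FULL JOIN department ON employee.deptno = department.deptno;

-- LEFT SEMI JOIN: employees that have a matching department.
-- Only columns from the left table can be selected.
SELECT id, name
FROM employee
LEFT SEMI JOIN department ON employee.deptno = department.deptno;

-- LEFT ANTI JOIN: employees with no matching department (Cara).
SELECT id, name
FROM employee
LEFT ANTI JOIN department ON employee.deptno = department.deptno;

-- CROSS JOIN: every employee paired with every department (3 x 3 = 9 rows).
SELECT name, deptname
FROM employee
CROSS JOIN department;
```

Temporary views keep the sketch self-contained and leave nothing behind in the catalog once the session ends.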
• Default visualization
• Customizable columns
• Change heading
• Add description
• Change font color
• Conditional font color
• Based on each data value
• Display a count
• Control of number of buckets
Sankey Sunburst