Course Review and Exam Tips: DP-900 Outline Objectives (from Microsoft) Covered in ACG Course Section Lesson(s)/Lab(s)


DP-900 Azure Data Fundamentals Study Guide

Course Review and Exam Tips


• Most sections of the course include a Bingo game, which is a flashcard-like
application for reinforcing vocabulary terms.
• Most sections of the course have quizzes, which you can take as often as you like.
• Many sections of the course include an examination of the various tools and
technologies in the context of an overall data platform architecture. This
whiteboard presentation is a good way to understand practical application and
considerations for real-world scenarios.
• For the most thorough coverage of the DP-900 exam objectives laid out by
Microsoft, consider the hands-on labs as required material, unless otherwise
noted.
• The last 60-90 seconds of each lesson include takeaways from the content
presented. Reviewing these summaries could help you identify areas where you
may need to refresh your understanding of key concepts and topics. Section 6,
focused on relational data, includes a section summary as a standalone lesson,
which is another good knowledge checkpoint.
• When you review the certification exam objectives on the Microsoft site, you will
find a link to free training material from Microsoft Learn. The material is presented
with mostly text and images, with a few hands-on exercises. It tends to cover
more than what is required for the exam, which is certainly useful for reinforcing
and expanding on the exam-specific concepts. However, the additional detail can
be overwhelming. This is a fundamentals exam, so focus your attention on the
exam objectives laid out by Microsoft.

Vendor Exam Objectives Mapped to ACG Course Sections

DP-900 Outline Objectives (from Microsoft) | Covered in ACG Course Section Lesson(s)/Lab(s)
Describe core data concepts (15-20%)
Describe types of core data workloads
Describe batch data | Lessons 2.5, 8.2
Describe streaming data | Lessons 2.5, 8.2
Describe the difference between batch and streaming data | Lessons 2.5, 8.2
Describe the characteristics of relational data | Lessons 2.2, 2.3, 2.7, 6.1
Describe data analytics core concepts
Describe data visualization (e.g., visualization, reporting, business intelligence (BI)) | Lessons 2.9, 10.1, 10.2, 10.3, Lab: Create a Compelling Power BI Dashboard via a Canned App
Describe basic chart types such as bar charts and pie charts | Lessons 2.9, 10.1, 10.2, 10.3, Lab: Create a Compelling Power BI Dashboard via a Canned App
Describe analytics techniques (e.g., descriptive, diagnostic, predictive, prescriptive, cognitive) | Lessons 2.9, 10.3, Lab: Create a Compelling Power BI Dashboard via a Canned App
Describe ELT and ETL processing | Lessons 2.5, 2.8, 8.1, 8.2, 8.3, 8.4
Describe the concepts of data processing | Lessons 2.5, 2.8, 2.9, 8.1, 8.2, 8.3, 8.4
Describe how to work with relational data on Azure (25-30%)
Describe relational data workloads
Identify the right data offering for a relational workload | Lessons 2.2, 2.3, 2.7, 3.1, 3.4, 6.2, 6.3, 6.4
Describe relational data structures (e.g., tables, index, views) | Lessons 2.2, 2.3, 6.1
Describe relational Azure data services
Describe and compare PaaS, IaaS, and SaaS solutions | Lessons 1.4, 6.2
Describe Azure SQL family of products including Azure SQL Database, Azure SQL Managed Instance, and SQL Server on Azure Virtual Machine | Lessons 6.2, 6.3, 6.4
Describe Azure Synapse Analytics | Lessons 2.2, 6.2, 6.3, 6.4, 9.1, 9.2, 9.3, 9.4
Describe Azure Database for PostgreSQL, Azure Database for MariaDB, and Azure Database for MySQL | Lessons 6.2, 6.3, 6.4
Identify basic management tasks for relational data
Describe provisioning and deployment of relational data services | Lessons 2.5, 2.6, 5.1, 5.2, 5.3, 5.4, 6.8, Lab: Create and Save a Reusable ARM Template from an Existing Azure SQL Database, Lab: Using SQL to Manage Database Objects
Describe method for deployment including the Azure portal, Azure Resource Manager templates, Azure PowerShell, and the Azure command-line interface (CLI) | Lessons 2.6, 5.1, 5.2, 5.3, 5.4, 5.5, 6.8, Lab: Create and Save a Reusable ARM Template from an Existing Azure SQL Database, Lab: Using SQL to Manage Database Objects
Identify data security components (e.g., firewall, authentication, encryption) | Lessons 2.5, 5.1, 5.2, 6.8
Identify basic connectivity issues (e.g., accessing from on-premises, access with Azure VNets, access from Internet, authentication, firewalls) | Lessons 2.5, 5.1, 5.2, 6.8, Lab: Use Cloud Shell to Survey Azure Data Resources, Lab: Using SQL to Manage Database Objects
Identify query tools (e.g., Azure Data Studio, SQL Server Management Studio, sqlcmd utility, etc.) | Lessons 2.6, 6.7, 6.8, Lab: Use Azure Data Studio to Perform 10 Fundamental SQL Queries in Azure, Lab: Using SQL to Manage Database Objects
Describe query techniques for data using SQL language
Compare Data Definition Language (DDL) versus Data Manipulation Language (DML) | Lessons 2.6, 6.7, 6.8, Lab: Use Azure Data Studio to Perform 10 Fundamental SQL Queries in Azure, Lab: Using SQL to Manage Database Objects
Query relational data in Azure SQL Database, Azure Database for PostgreSQL, and Azure Database for MySQL | Lessons 2.4, 2.6, 6.7, 6.8, Lab: Use Azure Data Studio to Perform 10 Fundamental SQL Queries in Azure, Lab: Using SQL to Manage Database Objects
Describe how to work with non-relational data on Azure (25-30%)
Describe non-relational data workloads
Describe the characteristics of non-relational data | Lessons 2.2, 2.3, 2.7, 2.10, 4.1, 4.2, 4.3, 4.4, 7.1, 7.2, 7.3, 7.4, Lab: Working with the Core (SQL) API in Azure Cosmos DB, Lab: Creating an Azure Storage Table, Lab: Expire Data Based on Age in Azure Blob Storage
Describe the types of non-relational and NoSQL data | Lessons 2.2, 2.3, 2.7, 2.10, 4.1, 4.2, 4.3, 4.4, 7.1, 7.2, 7.3, 7.4, Lab: Creating an Azure Storage Table, Lab: Expire Data Based on Age in Azure Blob Storage
Recommend the correct data store | Lessons 4.1, 4.2, 4.3, 4.4, 7.3, 7.4
Determine when to use non-relational data | Lessons 4.1, 7.1, 7.4
Describe non-relational data offerings on Azure
Identify Azure data services for non-relational workloads | Lessons 4.1, 4.2, 4.3, 4.4, 7.1, 7.4
Describe Azure Cosmos DB APIs | Lessons 7.1, 7.3, 7.5, Lab: Working with the Core (SQL) API in Azure Cosmos DB
Describe Azure Table storage | Lessons 4.1, 4.3, 5.3, 7.3, 7.5, Lab: Creating an Azure Storage Table
Describe Azure Blob storage | Lessons 4.1, 4.2, 5.1, 7.5, Lab: Expire Data Based on Age in Azure Blob Storage
Describe Azure File storage | Lessons 4.1, 4.4, 5.3, 7.5
Identify basic management tasks for non-relational data
Describe provisioning and deployment of non-relational data services | Lessons 2.5, 5.1, 5.3, 5.4, 7.1, 7.2, 7.3
Describe method for deployment including the Azure portal, Azure Resource Manager templates, Azure PowerShell, and the Azure command-line interface (CLI) | Lessons 2.5, 5.1, 5.2, 5.3, 5.4, 7.1, Lab: Working with the Core (SQL) API in Azure Cosmos DB, Lab: Creating an Azure Storage Table, Lab: Expire Data Based on Age in Azure Blob Storage
Identify data security components (e.g., firewall, authentication, encryption) | Lessons 2.5, 5.1, 5.2, 5.3, 5.4, 7.1, Lab: Working with the Core (SQL) API in Azure Cosmos DB, Lab: Creating an Azure Storage Table, Lab: Expire Data Based on Age in Azure Blob Storage
Identify basic connectivity issues (e.g., accessing from on-premises, access with Azure VNets, access from Internet, authentication, firewalls) | Lessons 2.5, 5.1, 5.2, 5.3, 5.4, 7.1, Lab: Working with the Core (SQL) API in Azure Cosmos DB, Lab: Creating an Azure Storage Table, Lab: Expire Data Based on Age in Azure Blob Storage
Identify management tools for non-relational data | Lessons 4.2, 4.3, 4.4, 5.3, 5.4, Lab: Use Cloud Shell to Survey Azure Data Resources, Lab: Creating an Azure Storage Table, Lab: Expire Data Based on Age in Azure Blob Storage
Describe an analytics workload on Azure (25-30%)
Describe analytics workloads
Describe transactional workloads | Lessons 2.7, 3.2, 3.4
Describe the difference between a transactional and an analytics workload | Lessons 2.7, 3.2, 3.3, 3.4
Describe the difference between batch and real time | Lessons 2.5, 2.7, 8.1, 8.2, Lab: Trigger and Monitor an Azure Data Factory Pipeline
Describe data warehousing workloads | Lessons 2.8, 8.1, 8.2, 8.3
Determine when a data warehouse solution is needed | Lessons 2.8, 6.3, 9.1, 9.4
Describe the components of a modern data warehouse
Describe Azure data services for modern data warehousing such as Azure Data Lake Storage Gen2, Azure Synapse Analytics, Azure Databricks, and Azure HDInsight | Lessons 2.8, 9.1, 9.4
Describe modern data warehousing architecture and workload | Lessons 2.8, 8.1, 8.2, 8.3, 9.1, 9.2, 9.3, 9.4
Describe data ingestion and processing on Azure
Describe common practices for data loading | Lesson 8.1
Describe the components of Azure Data Factory (e.g., pipeline, activities, etc.) | Lessons 8.3, 8.4, Lab: Trigger and Monitor an Azure Data Factory Pipeline
Describe data processing options (e.g., Azure HDInsight, Azure Databricks, Azure Synapse Analytics, Azure Data Factory) | Lessons 8.3, 8.4, 9.1, 9.2, 9.3, Lab: Trigger and Monitor an Azure Data Factory Pipeline
Describe data visualization in Microsoft Power BI
Describe the role of paginated reporting | Lessons 10.2, 10.3
Describe the role of interactive reports | Lessons 10.2, 10.3, Lab: Create a Compelling BI Dashboard via a Canned App
Describe the role of dashboards | Lessons 10.2, 10.3, Lab: Create a Compelling BI Dashboard via a Canned App
Describe the workflow in Power BI | Lesson 10.1
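
As a rough illustration of the "Compare Data Definition Language (DDL) versus Data Manipulation Language (DML)" objective listed above, the sketch below uses Python's built-in sqlite3 module as a local stand-in for an Azure relational service. The customers table, its columns, and the sample rows are hypothetical, and SQLite's SQL dialect differs in places from the T-SQL used by Azure SQL Database. SELECT is grouped with DML here, as many references do, although some classify queries separately.

```python
import sqlite3

# In-memory SQLite database: a local stand-in, not an Azure service.
conn = sqlite3.connect(":memory:")

# DDL: statements that define or change the structure of database objects.
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.execute("CREATE INDEX idx_customers_city ON customers (city)")

# DML: statements that add, change, remove, or read the rows in those objects.
conn.execute("INSERT INTO customers (id, name, city) VALUES (1, 'Ada', 'London')")
conn.execute("INSERT INTO customers (id, name, city) VALUES (2, 'Grace', 'New York')")
conn.execute("UPDATE customers SET city = 'Arlington' WHERE id = 2")
conn.execute("DELETE FROM customers WHERE id = 1")

# Read back what is left after the DML statements above.
rows = conn.execute("SELECT id, name, city FROM customers ORDER BY id").fetchall()
print(rows)
```

The CREATE statements shape the table and index; the INSERT, UPDATE, DELETE, and SELECT statements only touch the data held inside them.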

Vocabulary List
These terms and their definitions populate the Bingo games found near the end of most
course sections.

Term Definition
3NF Data Model The most commonly applied data model form employed in
relational databases optimized for transactional workloads
Access A security-related concept that describes what entities can
interact with which systems and what they can do during those
interactions
ACID Acronym that describes a set of database properties that intend
to guarantee data validity despite errors, power failures,
concurrent use, and other risks to data integrity
Activities In the context of Azure Data Factory or Synapse Pipelines, these
describe work performed at each step in a pipeline
Analytics Workload Larger, richer datasets optimized for decision support and
complex questions posed of the data
Apache Spark Open-source analytics engine used for big data processing
API Describes a means for other systems or humans to interact with
a resource, such as a cloud database, using an "interface" that
hides the complexity of the source system
ARM Templates "Infrastructure as code," written in JSON to define and configure
Azure resources
Attribute-Based Access Control A more granular form of access control that provides access
rights based on user, environment, or resource attributes
Authentication A security-related process where a human user or a system is
confirmed to be who they say they are
AzCopy A tool designed to move large volumes of data in and out of
Azure Storage
Azure Analysis Services Based on SQL Server Analysis Services, an analytics service that
operates over tabular and dimensional data models
Azure Blob Storage An Azure data resource well suited for storing and serving up
unstructured data in the form of images, videos, and PDFs
Azure CLI A command-line tool that can be used in a Bash shell to
interact with Azure data resources
Azure Data Lake Storage Gen2 A form of Blob storage optimized for analytics support with a
hierarchical folder structure and able to store both relational
and non-relational data
Azure Data Studio Cross-platform database tool for data professionals using on-
premises and cloud data platforms on Windows, macOS, and
Linux
Azure Database for MariaDB Relational database service in the Microsoft cloud that supports
multiple tiers of the community edition, which was originally
forked off of MySQL
Azure Database for MySQL Azure PaaS offering that is based on the community edition of a
popular open-source database, with some of the advanced
features of the Oracle enterprise edition
Azure Database for PostgreSQL An Azure PaaS offering that is based on the community edition
of a relational database that offers many of the advanced
features only otherwise found in Azure SQL Database
Azure Databricks Analytics service built on Apache Spark that supports data
processing, machine learning, and analytics via code in
notebooks using SQL, R, Python, and Scala
Azure Files Azure data resource that mimics on-premises file shares and
allows applications and humans to interact with it the same way
they do on servers and desktop computers
Azure HDInsight Open-source big data processing service, similar in purpose to
Azure Synapse Analytics
Azure SQL Database Azure's cornerstone PaaS relational data resource, available in
multiple configurations from a single database to databases that
approach full Microsoft SQL Server capabilities
Azure SQL Elastic Pools Solution for managing and scaling multiple Azure SQL databases
that have varying and unpredictable usage demands by flexibly
sharing set resources on a single server
Azure SQL Managed Instance A cloud-based PaaS offering that is as compatible as possible
with the full Microsoft SQL Server platform while still providing
the benefits of a fully managed service
Azure Storage Explorer A tool designed to manage data stored in a wide variety of
formats in Azure using a familiar interface
Azure Synapse Analytics Modern data warehouse and analytics service based on SQL
Data Warehouse that also includes Spark technologies and deep
integration with other big data resources
Azure Table Storage An Azure storage resource that employs the key-value type of
semi-structured data
Bar Chart Chart that allows for analysis of differences across a set of
categories
Bash A popular shell language that can run Azure CLI commands
Batch Workload Data is gathered from one or more sources at regular intervals,
usually measured in hours or days, and processed all at the
same time
Big Data A term that has come to refer to massive amounts of data
produced in large volumes, at high speed, and with a lot of variety in
form or content
Bubble Chart Similar to a scatter chart but with a third dimension represented
by the size of the dots
Business Intelligence Describes the use of computing and data resources to aid the
identification, discovery, and analysis of critical business data
Cassandra A wide-column database that is supported on Azure through a
Cosmos DB API
Cloud Shell A web-based command line terminal resource, accessible from
the Azure portal
Cognitive Analytics technique whose hallmark is the ability to make
additional inferences based on analyzing one or more metrics
Column-Family A type of database that stores tables with rows, where a
different set of columns can be defined for each row; also
referred to as a wide-column database
Concurrency Describes the operation and impact on data when more than
one human or system is operating over the same data at the
same time
Consistency A data concept that describes how well or how quickly data is
kept "in sync" with itself and with replicas of itself
Cosmos DB Multi-model database designed for global distribution and high
performance
Dashboard A collection of visualizations in Power BI that are organized
around a theme and allow for drill-down for further detail
Data Analyst Exploring and analyzing data and building business-friendly
data models falls under the responsibility of this data
professional
Data Engineer Data wrangling and implementing data-driven artificial
intelligence solutions both fall within the many possible
specializations undertaken by this data professional
Data Factory Data processing service in Azure that provides a UI to
orchestrate data processing activities; Synapse Pipelines are
built on the same platform as this resource
Data Lake This resource captures real-time and external data — often both
relational and non-relational — into an organized data store that
can be queried directly
Data Processing The act of gathering, cleansing, transforming, and storing data
from multiple sources for use by other systems and processes
or for further analysis and reporting
Database Administrator This data professional will often be responsible for setting up
database security and encryption configurations as well as
managing and monitoring IaaS infrastructure
Dataset A collection of data from one or more sources used to generate
visualizations that can be aggregated into reports
Deploy After provisioning the storage account or underlying database
infrastructure, a database administrator or a data engineer will
______ specific resources into that infrastructure.
Descriptive Analytics technique that presents historical and current data to
help understand the state of a business or system
Diagnostic Analytics technique that focuses on surfacing the reasons
behind a particular event or scenario
Dimension One of 2 primary table types in star and snowflake data models,
this table type often represents people, places, objects, and
time
Document In a data sense, this describes a data container that stores
human-readable, semi-structured data in the form of JSON,
XML, YAML, and others
ELT One of 2 major approaches to data processing that performs the
transformation step within the final storage destination
Encryption The process of encoding data such that it can be interpreted by
a system or human consumer only if they have a key that is
passed between the source and the requester
ETL One of 2 major approaches to data processing that performs the
transformation in memory or in a staging environment before
the final load into the destination data store
Eventual Consistency A database configuration that favors fast performance over
ensuring data sources are in perfect sync at all times
Extract Step in both of the 2 major approaches to data processing,
where ingested data is selectively pulled into the process for
transformation and final storage
Fact One of 2 primary table types in star and snowflake data models,
this table type often represents events or activities of the data,
such as invoices or medical claims
Firewall A security device, in the form of software or hardware, that
filters traffic to resources, such as databases
Graph A type of database made up of nodes and edges, which is
particularly well suited to social networking applications and
certain scientific applications
Gremlin A language used for querying/traversing across a graph data
source
IaaS Acronym that describes the form of Azure service provided by
SQL Server on a virtual machine
Index A data structure that copies and re-orders data or contains
pointers to data to improve the speed of data retrieval
operations, often at the cost of speed for write operations
Ingestion Process of pulling data into a data platform environment from
multiple sources, via either batch processing or stream
processing
JSON A type of semi-structured data in human-readable form,
stored in "documents"; similar in nature to XML and YAML
Key Influencer Chart that surfaces the factors that affect a key metric
Key-Value Semi-structured format, where entities are organized into tables
with rows uniquely identified by a key and a flexible number of
columns; similar to wide-column data format
Line Chart Chart that allows for the measure of numeric data over 2 axes,
with the X-axis often representing time
Load Step in both of the 2 major approaches to data processing,
whereby the data comes to rest in a particular data resource for
consumption by other processes or humans
Microsoft Power BI Desktop A downloadable tool used to create interactive reports and
visualizations
Microsoft Power BI Mobile Service designed for viewing shared Power BI dashboards and
reports
Microsoft Power BI Service Online platform focused on building, publishing, and sharing
dashboards
Modern Data Warehouse This resource combines internal and external data in multiple
forms into a unified solution for analytics
MongoDB BSON document database that is supported on Azure through a
Cosmos DB API
Multi-Model A database that can support more than one data format
Nodes and Edges In graph databases, entities and relationships are also referred
to as these 2 terms
Non-Relational A data storage resource that stores semi-structured data or
non-structured data
Normalization A process of organizing data into tables and establishing
relationships between those tables in a way designed both to
protect the data and to make the database more flexible
NoSQL Often used interchangeably with the term "non-relational"
OLAP An acronym describing a type of data workload that is most
appropriate for decision support and business intelligence
OLTP An acronym describing a type of data workload that is most
appropriate for operational data interactions
PaaS Most, but not all, Azure data services fall under the _____ service
category, where Azure handles most of the administrative tasks
of patching, backups, OS updates, and so on.
Paginated Report Specialized Power BI report used to generate and share longer
tabular data that can be formatted to fit well on a standard
printed page
Partition Key A key in certain types of data resources that buckets multiple
items, which could be rows or documents, under a particular
category or common filter value
Pie Chart A simple, relative-measure chart that is useful only for a small
number of categories
Pipeline Primary orchestration structure for Data Factory, which
organizes activities over datasets into a logical workflow
PolyBase Feature in Synapse Analytics that allows direct loading of data
in other forms and formats, such as blobs in parquet format,
using T-SQL
Power BI Report Builder A service evolved out of similar tooling for SQL Server Reporting
Services and now used to build paginated reports in Power BI
PowerShell A scripting language that has many uses — among them,
interacting with Azure data resources
Predictive Analytics technique that projects future events based on past
trends or similar scenarios
Prescriptive Analytics technique that offers recommendations by asserting
required actions to achieve a target metric or goal
Primary Key A data construct made up of one or more bits of data that
uniquely identify an entity in a data store; examples include a
key on a row in a table or a key on a JSON document
Private Link Use this service to bring certain PaaS services hosted on Azure
into your private virtual network
Provision The act of setting up the underlying infrastructure of a cloud
data resource
Querying Writing code or interacting with a UI to "read" data from one or
more data sources and return the requested data in a particular
form or format
RBAC An acronym describing a form of access control that depends
on putting users in groups, or roles, and providing permissions
to those groups, rather than individually to users
Redundancy This term has 2 meanings in the data realm: 1) The existence of
the same piece of data in more than one place, which is
acceptable — even optimal in some systems — but not in
others; and 2) A form of replication of cloud assets that
improves accessibility and disaster recovery
Relational Type of database where structured data is organized into tables
with rows and columns, where each row is an entity, with pre-
defined columns that are the same for every row
Row Key Key-value data resources can combine this with the partition
key to form a primary key, which enables fast retrieval of
individual entities in a table
Scatter Chart Chart that presents the relationship between 2 metrics using
dots on X and Y axes
Schema-on-Read A hallmark of semi-structured data, where the desired form of
the returned data is determined by the query
Schema-on-Write A hallmark of structured, relational data, where the desired form
of the data is determined ahead of time and enforced when the
data is written to the data tables
Semi-Structured Data that is flexibly organized around entities that can have
varying types and amounts of associated descriptive data and
hierarchical data per defined entity
SMB A Windows system protocol for file sharing that allows
applications to read and write to files over network-connected
computers
Snowflake Data Model Semi-normalized analytics data model that centers on fact
tables, with dimensions related to facts but also related to other
dimensions
SQL Many data sources can be queried by some variant of this
declarative language
SQL Server on Virtual Machine An IaaS relational database service offered in Azure
SQL/Core API The primary interface for interacting with Cosmos DB
SSMS Tool used to manage and query databases using both UI
features and T-SQL code
Star Data Model Denormalized analytics data model that centers on fact tables,
with dimension tables directly related to the facts
Stored Procedures Prepared database code you can save so the code can be
reused, often with improved performance over ad hoc code
Streaming Workload Data is gathered and processed very near to the time it is
generated, from microseconds to a few minutes, such that the
data is delivered in a near-continuous fashion
Structured Data that is highly organized with strong rules around the type,
form, and format of the stored data. This type of data is very
tightly associated with relational databases.
Tile A resizable component on a Power BI dashboard; each one
displays a visualization that can be as complex as a bubble
chart or as simple as a single value
TLS A protocol that protects data in motion, for example, traveling
between the cloud services and customers
Transactional Workload Operational data that is processed in short, precise, isolated
actions. Optimized for speed and accuracy within the
interaction.
Transformation Step in both of the 2 major approaches to data processing,
where data is converted into a form most appropriate for
analytical processing and analysis
Transparent Encryption A security feature that protects an entire database file (data at
rest) so it cannot be read unless the host application is secured
using the same key
Tree Map Visualization that uses interlocking rectangles, with color
coding, size, and position to express relative value and some
simple hierarchies
U-SQL A language used for batch jobs over large amounts of data in
Azure Data Lake Analytics
Unstructured Data that has no defined structure, often referred to as "blobs,"
but may nonetheless include metadata associated with files,
such as PDFs, images, and videos
Visualization A data technique that translates information into a visual
context to make it easier to understand and gain insights
VNet A network in Azure that mimics the form of on-premises,
hardware-based networks
Wide-Column Also known as column-family, this type of database stores
tables with rows, where a different set of columns can be
defined for each row
Workspace A place to collaborate in building Power BI assets in the Power
BI service
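
The Key-Value, Partition Key, and Row Key entries above can be tied together with a small sketch. It models a key-value table as nested Python dictionaries, where the partition key buckets related entities and the row key identifies one entity inside that bucket. This is an in-memory toy and not the Azure Table storage SDK; the function names (upsert, point_lookup, partition_scan) and the sample order data are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical in-memory table: partition key -> row key -> entity properties.
Table = Dict[str, Dict[str, dict]]


def upsert(table: Table, partition_key: str, row_key: str, properties: dict) -> None:
    """Insert or replace the entity identified by (partition_key, row_key)."""
    table.setdefault(partition_key, {})[row_key] = properties


def point_lookup(table: Table, partition_key: str, row_key: str) -> Optional[dict]:
    """Fetch one entity by its full primary key, or None if it is absent."""
    return table.get(partition_key, {}).get(row_key)


def partition_scan(table: Table, partition_key: str) -> Dict[str, dict]:
    """Return every entity that shares the given partition key."""
    return dict(table.get(partition_key, {}))


orders: Table = {}
upsert(orders, "customer-42", "order-001", {"total": 19.99})
# Entities in the same table may carry different sets of columns.
upsert(orders, "customer-42", "order-002", {"total": 5.00, "gift": True})
upsert(orders, "customer-07", "order-001", {"total": 120.00})

print(point_lookup(orders, "customer-42", "order-002"))
print(sorted(partition_scan(orders, "customer-42")))
```

Note that "order-001" appears under two partitions without conflict: only the combination of partition key and row key has to be unique.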

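The ETL and ELT entries in the vocabulary list differ only in where the transformation step runs. The sketch below contrasts the two, assuming SQLite (via Python's built-in sqlite3 module) as a stand-in destination store; the raw_rows sample data, the clean helper, and the table names are hypothetical and not taken from the course.

```python
import sqlite3

# Hypothetical extracted data: untidy names and amounts stored as text.
raw_rows = [("  alice ", "10.50"), ("BOB", "3"), ("carol", "7.25")]


def clean(name: str, amount: str):
    """Transformation step used by the ETL path (runs in application memory)."""
    return name.strip().title(), float(amount)


# ETL: transform in memory first, then load only the cleaned rows.
etl_db = sqlite3.connect(":memory:")
etl_db.execute("CREATE TABLE sales (name TEXT, amount REAL)")
etl_db.executemany(
    "INSERT INTO sales VALUES (?, ?)", [clean(n, a) for n, a in raw_rows]
)

# ELT: load the raw rows as-is, then transform inside the destination with SQL.
elt_db = sqlite3.connect(":memory:")
elt_db.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
elt_db.executemany("INSERT INTO raw_sales VALUES (?, ?)", raw_rows)
elt_db.execute(
    """
    CREATE TABLE sales AS
    SELECT upper(substr(trim(name), 1, 1)) || lower(substr(trim(name), 2)) AS name,
           CAST(amount AS REAL) AS amount
    FROM raw_sales
    """
)

etl_result = etl_db.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
elt_result = elt_db.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
print(etl_result == elt_result)
```

Both paths end with the same cleaned sales table; the ELT path additionally keeps the untransformed raw_sales table in the destination, which is one reason it is often paired with data lakes and large-scale warehouses.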