Azure Databricks Mastery: Hands-on project with Unity Catalog, Delta Lake, Medallion Architecture
Azure Databricks free notes
Azure Databricks end-to-end project with Unity Catalog
Day 1: Sign up for Databricks dashboard and why Databricks
Day 2: Understanding notebook and Markdown basics: Hands-on
Day 3: Databricks Notebook - Magic Commands: Hands-on
Day 4: DBUtils - Widget Utilities: Hands-on
Day 5: DBUtils - Notebook Utils: Hands-on
Day 6: What is Delta Lake, Accessing Data Lake storage using a service principal
Day 7: Creating delta tables using SQL Command
Day 8: Understanding Optimize Command – Demo
Day 9: What is Unity Catalog: Managed and External Tables in Unity Catalog
Day 10: Spark Structured Streaming – basics
Day 11: Autoloader – Intro, Autoloader - Schema inference: Hands-on
Day 12: Project overview: Creating all schemas dynamically
Day 13: Ingestion to Bronze: raw_roads data to bronze Table
Day 14: Silver Layer Transformations: Transforming Silver Traffic data
Day 15: Gold Layer: Getting data to Gold Layer
Day 16: Orchestrating with WorkFlows: Adding run for common notebook in all notebooks
Day 17: Reporting with PowerBI
Day 18: Delta Live Tables: End to end DLT Pipeline
Day 19: Capstone Project I
Day 20: Capstone Project II
Day 1: Create a Databricks resource using the Azure Portal
Environment Setup: Log in to your Azure Portal.
Step 1: Create a budget for the project: search for "budget", click "Add" on Cost Management, then "Add Filter" in "Create budget" and select Service Name: Azure Databricks in the drop-down menu.
Step 2: Set alerts as well in the next step. Finally click on "Create".
Step 3: Create a Databricks resource. For "Pricing Tier", see https://azure.microsoft.com/en-us/pricing/details/databricks/ for more details; hence select Premium (+ Role-based access controls). Skip "Managed Resource Group Name"; no changes are required in "Networking", "Encryption", "Security" or "Tags" either.
Step 4: Create a "Storage Account" from "Microsoft Vendor", select the same "Resource Group" as before, "Primary Service" as "ADLS Gen 2", "Performance" as "Standard", and "Redundancy" as "LRS"; no changes are required in "Networking", "Encryption", "Security" or "Tags" either.
Step 5: Walkthrough of the Databricks workspace UI: click on "Launch Workspace" or go through the URL, which looks like https://______.azuredatabricks.net. Databricks keeps updating the UI. Click on "New" for "Repo" (used for CI/CD) or "Add data"; "Workflows" are just like pipelines at a high level; there is also a "Search" bar for searching.
Theory 1: What is the Big Data approach?: The monolithic approach uses a single computer, while the distributed approach uses a cluster, which is a group of computers.
Theory 2: Drawbacks of MapReduce: In HDFS, each iteration performs read and write operations from disk, which incurs high disk I/O cost; developers also have to write complex programs; and Hadoop is effectively just a single super computer.
Theory 3: Emergence of Spark: Spark first uses HDFS or any cloud storage, then further processing takes place in RAM; it uses in-memory processing, which is 10-100 times faster than disk-based applications. Here storage is detached from memory, and processing is kept separate.
Theory 4: Apache Spark: it is an in-memory application framework.
Theory 5: Apache Spark Ecosystem: Spark Core has a special data structure, the RDD, a collection of items distributed across the compute nodes in the cluster so that they can be processed in parallel. However, RDDs are difficult to use for complex operations and difficult to optimize, so we now make use of higher-level APIs and libraries like the DataFrame and Dataset APIs, as well as other high-level APIs like Spark SQL, Spark Streaming, Spark ML etc.
In real projects we do not use RDDs but the higher-level APIs for our programming or coding: the DataFrame API is used to interact with Spark, and these DataFrames can be invoked from any of the languages like Java, Python, SQL or R. Internally Spark has two parts: the set of core APIs, and the Spark Engine, the distributed computing engine responsible for all functionality. There is an "OS" that manages this group of computers (the cluster), called the Cluster Manager; in Spark there are many cluster managers you can use, like the YARN Resource Manager, Spark standalone, Mesos or Kubernetes.
So, Spark is a distributed data processing solution, not a storage system; Spark does not come with a storage system, and storage like Amazon S3, Azure Storage or GCP can be used.
We have the SparkContext, the entry point to the Spark Engine, which breaks down the tasks and schedules them for parallel execution.
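As a small illustration of the DataFrame API described above, here is a minimal PySpark sketch (the data and column names are made up; in a Databricks notebook the `spark` session is already provided):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# In a Databricks notebook `spark` already exists; building it here keeps the sketch self-contained.
spark = SparkSession.builder.appName("dataframe-demo").getOrCreate()

# A tiny in-memory DataFrame; Spark distributes the rows across the cluster for us.
df = spark.createDataFrame(
    [(1, "Sachin", 95), (2, "Rahul", 80), (3, "Anita", 88)],
    ["id", "name", "score"],
)

# Declarative transformations; the Spark engine plans and schedules the parallel execution.
df.filter(F.col("score") > 85).orderBy(F.col("score").desc()).show()
```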
So, what is Databricks? The founders of Spark developed a commercial product, called Databricks, to work with Apache Spark in a more efficient way; Databricks is available on Azure, GCP and AWS.
Theory 6: What is Databricks?: Databricks is a way to interact with Spark: to set up our own clusters, manage the security, and write the code. It provides a single interface where you can manage data engineering, data science and data analyst workloads.
Theory 7: How does Databricks work with Azure? Databricks can integrate with data services like Blob Storage, Data Lake Storage and SQL Database, with Entra ID for security, and with Data Factory, Power BI and Azure DevOps.
Theory 8: Azure Databricks Architecture: the Control Plane is taken care of by Databricks and the Compute Plane is taken care of by Azure.
Theory 9: Cluster Types: All-purpose clusters and Job clusters. A multi-node cluster is not available in an Azure free subscription because it allows a maximum of only four CPU cores.
In the Databricks workspace (launched from the Azure Portal), click "Create cluster" and select "Multi node": the driver node and worker nodes are on different machines. In "Access mode", if you select "No isolation shared" then Unity Catalog is not available. Always uncheck "Use Photon Acceleration", which reduces your DBU/h; this can be seen in the "Summary" pane at the top right.
Theory 10: Behind the scenes when creating a cluster: click on the Databricks instance in the Azure portal, before clicking on "Launch Workspace"; there is a "Managed Resource Group": open this link and you will find a virtual network, a network security group and a storage account.
This storage account stores the workspace metadata. We will see a virtual machine once we create a compute resource: go to the Databricks workspace, create a compute resource and then come back here; you will find some disks, a public IP address and a VM. For all of these we are charged as DBU/h.
If we stop our compute resource, nothing is deleted in the Azure portal, but when we click on the Virtual Machine it will show as not started. However, if you delete the compute resource from the Databricks workspace and check your Azure portal again, you will find all those resources, i.e. disks, public IP address, VM etc., are deleted.
Day 2: Understanding notebook and Markdown basics: Hands-on
Note: this part can be executed in Databricks Community edition, not necessarily to be run in Azure Databricks resource
%md
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6
####### Heading 7
-----------------------------------------------------------------
%md
# This is a comment
-----------------------------------------------------------------
%md
1. HTML style <b> Bold </b>
2. Asterisk style **Bold**
-----------------------------------------------------------------
%md
*Italics* style
-----------------------------------------------------------------
%md
`print(df)` is the statement to print something
```
This
is multiline
code
```
-----------------------------------------------------------------
%md
- one
- two
- three
-----------------------------------------------------------------
%md
To highlight something
<span style="background-color: #FFFF00"> Highlight this </span>
-----------------------------------------------------------------
%md

-----------------------------------------------------------------
%md
Click on [Profile Pic](https://media.licdn.com/dms/image/C4E03AQGx8W5WMxE5pw/profile-displayphoto-shrink_400_400/0/1594735450010?e=1705536000&v=beta&t=_he0R75U4AKYCbcLgDRDakzKvYZybksWRoqYvDL-alA)
Day 3: Databricks Notebook - Magic Commands: Hands-on
Magic commands in Databricks: if any SQL command is to be executed then select 'SQL'.
Note: this part can be executed in Databricks Community edition, not necessarily to be run in Azure Databricks resource
1. Select 'Python' from top and type
print('hello')
#Comments
Default language is Python
-----------------------------------------------------------------
2. %scala
print("hello") will work, but #Comments will not work.
For comments in Scala use //Comments
-----------------------------------------------------------------
3. Comments in SQL -- Comments
now in %sql
select 2+5 as sum
4. in %r
x <-"Hello"
print(x)
-----------------------------------------------------------------
5. There are many more magic commands in DB.
%fs ls
Lists everything in the directories inside DBFS, i.e. the Databricks File System.
-----------------------------------------------------------------
6. Know all the Magic commands available:
type:
%lsmagic
-----------------------------------------------------------------
7. Summary of Magic commands: You can use multiple languages in one notebook, and you need to specify the language magic command at the beginning of a cell. By default, the entire notebook works in the language that you choose at the top.
-----------------------------------------------------------------
DBUtils:
# DBUtils: Azure Databricks provides a set of utilities to efficiently interact with your notebook.
Most commonly used DBUtils are:
1. File System Utilities
2. Widget Utilities
3. Notebook Utilities
-----------------------------------------------------------------
1. What are the available utilities?
# just type:
dbutils.help()
-----------------------------------------------------------------
# 2. Let's see the File System Utilities
%md
# File System Utilities
# click new cell:
# type:
dbutils.fs.help()
-----------------------------------------------------------------
#### ls utility
# What is listed in a particular directory? Enable DBFS: click on "Admin Settings" at the top right, click on "Workspace Settings",
# scroll down and enable 'DBFS File Browser'; now you can see the 'DBFS' tab, and after clicking on the 'DBFS' tab some folders are listed.
# You will find "FileStore" in the left pane under the "Catalog" button; copy the path from "Spark API format".
path = 'dbfs:/FileStore'
dbutils.fs.ls(path)
# Why ls? See the dbutils.fs.help() details just above.
-----------------------------------------------------------------
# Remove a directory:
# just copy the following address from the listing above, such as FileInfo(path='dbfs:/FileStore/temp/', name='temp/', size=0, modificationTime=0)
dbutils.fs.rm('dbfs:/FileStore/CopiedFolder/', True)
# True is the recurse flag, needed to delete a directory and its contents.
# Just check the directory list again; that folder has been removed.
dbutils.fs.ls(path)
-----------------------------------------------------------------
#### mkdirs
# Why are headings important? Because the "Table of Contents" on the left side shows all the headings.
dbutils.fs.mkdirs(path + '/SachinFileTest/')
-----------------------------------------------------------------
# List all files so that we can see whether the newly created directory is there or not
dbutils.fs.ls(path)
### put: let's put something inside the folder
dbutils.fs.put(path + '/SachinFileTest/test.csv', '1, Test')
-----------------------------------------------------------------
# Also check manually using the "DBFS" tab
### head: read the file content which we just wrote
filepath = path + '/SachinFileTest/test.csv'
dbutils.fs.head(filepath)
-----------------------------------------------------------------
### cp: copy this newly created file from one location to another
source_path = path + '/SachinFileTest/test.csv'
destination_path = path + '/CopiedFolder/test.csv'
dbutils.fs.cp(source_path, destination_path, True)
-----------------------------------------------------------------
# Display the content of the recently copied file
dbutils.fs.head(destination_path)
-----------------------------------------------------------------
# The same activity can be done by right-clicking that *.csv file,
# with "Copy path", "Move", "Rename", "Delete"
-----------------------------------------------------------------
# mv is cut and paste (move); cp is just copy and paste
source_path = path + '/SachinFileTest/test.csv'
destination_path = path + '/MovedFolder/test.csv'
dbutils.fs.mv(source_path, destination_path, True)
-----------------------------------------------------------------
# remove folder
dbutils.fs.rm(path+ '/MovedFolder/',True)
dbutils.fs.help()
-----------------------------------------------------------------
Day 4: DBUtils - Widget Utilities: Hands-on
Note: this part can be executed in Databricks Community edition, not necessarily to be run in Azure Databricks resource
Why Widgets: Widgets are helpful to parameterize a notebook. Imagine that in the real world you are working in a heterogeneous environment, either a DEV, Test or Production environment; instead of hard coding the values everywhere and then changing them everywhere, just parameterize the notebook (see the sketch below).
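A minimal sketch of that idea (the widget name `env`, the storage account and the container paths below are made up for illustration):

```python
# Hypothetical example: one notebook serving DEV/TEST/PROD through a widget
# instead of hard-coded paths. Names and paths here are illustrative only.
dbutils.widgets.dropdown(name='env', defaultValue='dev',
                         choices=['dev', 'test', 'prod'], label='Environment')

env = dbutils.widgets.get('env')

# Build the environment-specific location from the single parameter.
base_path = f"abfss://{env}-container@mystorageaccount.dfs.core.windows.net/data"
print(f"Reading from: {base_path}")
```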
Details: Coding:
# What are the available tools? Just type:
dbutils.widgets.help()
------------------------------
%md
## Widget Utilities
------------------------------
%md
## Let's start with combo Box
### Combo Box
dbutils.widgets.combobox(name='combobox_name', defaultValue='Employee', choices=['Employee','Developer','Tester','Manager'], label="Combobox Label")
------------------------------
# Extract the value from "Combobox Label"
emp=dbutils.widgets.get('combobox_name')
# dbutils.widgets.get retrieves the current value of a widget, allowing you to use the value in your Spark jobs or SQL Queries.
print(emp)
type(emp)
------------------------------
# DropDown Menu
dbutils.widgets.dropdown(name='dropdown_name', defaultValue='Employee', choices=['Employee','Developer','Tester','Manager'], label="Dropdown Label")
------------------------------
# Multiselect
dbutils.widgets.multiselect(name='Multiselect_name', defaultValue='Employee', choices=['Employee','Developer','Tester','Manager'], label="MultiSelect Label")
------------------------------
# Text
dbutils.widgets.text(name='text_name',defaultValue='',label="Text Label")
------------------------------
dbutils.widgets.get('text_name')
# dbutils.widgets.get retrieves the current value of a widget, allowing you to use the value in your Spark jobs or SQL Queries.
------------------------------
result = dbutils.widgets.get('text_name')
print(f"SELECT * FROM Schema.Table WHERE Year = {result}")
------------------------------
# Go to the Widget settings on the right and change the setting to "On Widget Change" --> "Run notebook"; now the entire notebook gets executed.
print('execute theseeeSachin ')
Day 5: DBUtils - Notebook Utils: Hands-on
Note: this part must be run in an Azure Databricks resource, not in Databricks Community Edition; otherwise it will give a message like: "To enable notebook workflows, please upgrade your Databricks subscription."
Create a compute resource with Policy: "Unrestricted", "Single node", uncheck "Use Photon Acceleration", and select the smallest node type.
Now go to Workspace -> Users -> your email id will be displayed; add a notebook from the right, click on "Notebook" and rename it as follows.
Notebook 1: "Day 5: Part 1: DBUtils Notebook Utils: Child"
dbutils.notebook.help()
-------------------------
a = 10
b = 20
-------------------------
c = a + b
-------------------------
print(c)
-------------------------
# We use exit here. Basically, exit executes all the commands before it, and when it reaches an exit
# command it stops executing the notebook at that particular point and returns whatever value you
# enter here.
dbutils.notebook.exit(f'Notebook Executed Successfully and returned {c}')
# We are going to access this notebook in another Notebook
-------------------------
print('Test')
Notebook 2: “Day 5: Part 2: DBUtils Notebook Utils: Parent”
print('hello')
-------------------------
dbutils.notebook.run('Day 5 Part 1 DBUtils Notebook Utils Child', 60)
60 is the timeout parameter (in seconds)
Clicking on "Notebook Job" will land you in "Workflows", where it is executed as a job; there are two kinds of clusters, one interactive and the other "Job", and here it is executed as a "Job". Under "Workflows", check all the "Runs".
Now “clone” Notebook 1: “Day 5: Part 1: DBUtils Notebook Utils: Child” and Notebook 2: “Day 5: Part
2: DBUtils Notebook Utils: Parent” and rename as “Day 5: Part 3: DBUtils Notebook Utils: Child
Parameter” and “Day 5: Part 4: DBUtils Notebook Utils: Parent Parameter”
Notebook 3: "Day 5: Part 3: DBUtils Notebook Utils: Child Parameter"
dbutils.notebook.help()
---------------------------
dbutils.widgets.text(name='a', defaultValue='', label='Enter value of a ')
dbutils.widgets.text(name='b', defaultValue='', label='Enter value of b ')
---------------------------
a = int(dbutils.widgets.get('a'))
b = int(dbutils.widgets.get('b'))
# The dbutils.widgets.get function in Azure Databricks is used to retrieve the current value of a widget.
# This allows you to dynamically incorporate the widget value into your Spark jobs or SQL queries within the notebook.
---------------------------
c = a + b
---------------------------
print(c)
---------------------------
dbutils.notebook.exit(f'Notebook Executed Successfully and returned {c}')
Notebook 4: "Day 5: Part 4: DBUtils Notebook Utils: Parent Parameter"
print('hello')
-------------------
dbutils.notebook.run('Day 5: Part 3: DBUtils Notebook Utils: Child Parameter', 60, {'a': '50', 'b': '40'})
# 60 is the timeout parameter (in seconds)
# Go to the Widget settings on the right and change the setting to "On Widget Change" --> "Run notebook"; now the entire notebook gets executed.
On the right-hand side in "Workflow" "Runs", there are parameters called a and b.
Day 6: What is Delta Lake, Accessing Data Lake storage using a service principal
Introduction to the Delta Lake section: In this section, we will dive into Delta Lake, where the reliability of structured data meets the flexibility of data lakes.
We'll explore how Delta Lake revolutionizes data storage and management, ensuring ACID
transactions and seamless schema evolution within a unified framework.
Discover how Delta Lake enhances your data lake experience with exceptional robustness and
simplicity.
We'll cover the key features of Delta Lake, accompanied by practical implementations in notebooks.
By the end of this section, you'll have a solid understanding of Delta Lake, its features, and how to
implement them effectively.
ADLS != Database; in an RDBMS there are what are called ACID properties, which are not available in ADLS.
Delta Lake came forward to solve the following drawbacks of ADLS:
Drawbacks of ADLS:
1. No ACID properties
2. Job failures lead to inconsistent data
3. Simultaneous writes on same folder brings incorrect results
4. No schema enforcement
5. No support for updates
6. No support for versioning
7. Data quality issues
What is Delta Lake?
It is an open-source framework that brings reliability to data lakes.
Brings transaction capabilities to data lakes.
Runs on top of your existing data lake and supports Parquet.
Delta Lake is not a data warehouse or a database.
Enables Lakehouse architecture.
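A minimal sketch of the point that Delta Lake runs on top of the existing data lake and stores data as Parquet plus a transaction log (the DBFS path and the sample data below are made up for illustration):

```python
# Write a small DataFrame in Delta format: under the hood this is Parquet files
# plus a _delta_log folder holding the transaction log.
df = spark.createDataFrame([(1, "road_a"), (2, "road_b")], ["id", "road_name"])

delta_path = "dbfs:/FileStore/delta_demo/roads"   # illustrative path
df.write.format("delta").mode("overwrite").save(delta_path)

# Read it back; the transaction log guarantees a consistent snapshot.
spark.read.format("delta").load(delta_path).show()
```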
A. A data warehouse can work only on structured data; this is the first-generation evolution. However, it supports ACID properties: one can delete, update and perform data governance on it. A data warehouse cannot handle data other than structured data and cannot serve ML use cases.
B. Modern data warehouse architecture: there is a modern data warehouse architecture which includes the usage of data lakes for object storage, a cheaper option for storage; this is also called the two-tier architecture.
So the best features are, first, that it supports any kind of data, structured or unstructured; the ingestion of data is much faster; and the data lake is able to scale to any extent. Now let us see what the drawbacks are. As we have seen, a data lake cannot offer the ACID guarantees or schema enforcement, and while a data lake can be used for ML use cases, it cannot serve BI use cases; a BI use case is better served by the data warehouse. That is the reason we still use the data warehouse in this architecture.
C. Lakehouse Architecture: Databricks published a paper on the Lakehouse, which proposed the solution of having a single system that manages both.
Databricks solved this with Delta Lake: they introduced metadata, i.e. transaction logs, on top of the data lake, which gives us data warehouse-like features.
So Delta Lake is one implementation that uses the Lakehouse architecture. In the diagram there is a metadata, caching and indexing layer: under the hood there is a data lake, and on top of the data lake we implement a transaction log feature; that is Delta Lake, and we will use Delta Lake to implement the Lakehouse architecture.
So let's understand the Lakehouse architecture. The combination of the best of data warehouses and data lakes gives the Lakehouse, where the Lakehouse architecture offers the best capabilities of both.
As the diagram shows, the data lake itself has an additional metadata layer for data management, containing transaction logs, which gives it the capabilities of a data warehouse.
We have seen the data lake and data warehouse architectures, each with its own capabilities; the Data Lakehouse is built from the best features of both, taking the best elements of the data lake and the best elements of the data warehouse. The Lakehouse also provides traditional analytical DBMS management and performance features such as ACID transactions, versioning, auditing, indexing, caching, and query optimization.
Create a Databricks instance (with a standard workspace, otherwise Delta Live Tables and SQL warehousing will be disabled) and an ADLS Gen 2 instance in the Azure Portal.
Hands-on: Accessing Data Lake storage using a service principal:
"Day 6 Part 1 Test+access.ipynb"
Source Link: Tutorial: Connect to Azure Data Lake Storage Gen2 - Azure Databricks | Microsoft Learn
Inside ADLS Gen 2, create a storage account with the name "deltadbstg", create a container with the name "test", inside this container add a directory with the name "sample", and upload a csv file named "countires1.csv".
Inside the Databricks instance: create a compute resource with Policy: "Unrestricted", "Single node", uncheck "Use Photon Acceleration", and select the smallest node type.
To give permission, we have Unity Catalog.
Go to Azure Entra ID (previously Azure Active Directory); inside it we are going to create a service principal. Click on "App registrations" on the left-hand side, where you can create an app. Click on "New registration", give the name "db-access", and leave the other settings as they are. Copy the "Application (client) ID" and "Directory (tenant) ID" from the "db-access" overview.
Also copy the secret key: open "Certificates & secrets" on the left, click on "+ New client secret", give the "Description" as "dbsecret" and click on "Add". Copy the "Value" from "dbsecret" now.
Note the three keys, "Application (client) ID", "Directory (tenant) ID" and the "Value" from "dbsecret", i.e. the client secret, in a text notebook.
Inside the notebook, the secret value is the "service credential".
To give access to the data storage, go to the ADLS Gen 2 instance in the Azure Portal, go to "Access Control (IAM)", click on "+ Add", click on "Add role assignment", search for "Storage Blob Data Contributor", click on "Storage Blob Data Contributor", then "+ Select members", and type the service principal, which is "db-access". Select it, and finally Review and Assign.
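The notebook "Day 6 Part 1 Test+access.ipynb" follows the linked Microsoft tutorial; below is a hedged sketch of the Spark configuration it relies on (the placeholders in angle brackets are the three values noted earlier, and in practice the secret should come from a secret scope rather than plain text):

```python
# Sketch of OAuth access to ADLS Gen2 with the service principal created above.
# <application-id>, <directory-id> and the client secret are the three values
# copied from the app registration; prefer dbutils.secrets.get() for the secret.
storage_account = "deltadbstg"                 # storage account name from these notes
service_credential = "<client-secret-value>"   # e.g. dbutils.secrets.get("<scope>", "<key>")

spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net",
               "<application-id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net",
               service_credential)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
               "https://login.microsoftonline.com/<directory-id>/oauth2/token")

# Verify access by listing the directory created earlier in the "test" container.
display(dbutils.fs.ls(f"abfss://test@{storage_account}.dfs.core.windows.net/sample/"))
```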
Hands-on 2: Drawbacks of ADLS - practical:
Day 6 Part 2 .+Drawbacks+of+ADLS.ipynb
Create a new directory in the "test" container with the name "files" and upload the csv file "SchemaManagementDelta.csv".
This hands-on shows that using a data lake we are unable to perform an UPDATE operation; only in Delta Lake is this operation supported.
Even using spark.sql, we are unable to perform the UPDATE operation. This is one of the drawbacks of ADLS.
Versioning is also not available in ADLS, which is another drawback of ADLS.
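A short sketch of the contrast those notebooks demonstrate (the table names and paths are made up): the same UPDATE that fails on a plain Parquet-backed table succeeds once the data is stored as Delta.

```python
# Illustrative contrast between a plain Parquet table and a Delta table.
df = spark.createDataFrame([(1, "old"), (2, "old")], ["id", "status"])

df.write.format("parquet").mode("overwrite").save("dbfs:/FileStore/demo/parquet_tbl")
df.write.format("delta").mode("overwrite").save("dbfs:/FileStore/demo/delta_tbl")

spark.sql("CREATE TABLE IF NOT EXISTS parquet_tbl USING PARQUET LOCATION 'dbfs:/FileStore/demo/parquet_tbl'")
spark.sql("CREATE TABLE IF NOT EXISTS delta_tbl USING DELTA LOCATION 'dbfs:/FileStore/demo/delta_tbl'")

# Raises an error: UPDATE is not supported on a Parquet table.
# spark.sql("UPDATE parquet_tbl SET status = 'new' WHERE id = 1")

# The same statement works on the Delta table.
spark.sql("UPDATE delta_tbl SET status = 'new' WHERE id = 1")
```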
Hands-on 3: Creating a Delta lake:
Day 6 Part 3 +Drawbacks+of+ADLS+-+delta.ipynb
Hands-on 4: Understanding the Transaction Log:
Day 6 Part 4 Understanding+the+transaction+log.ipynb
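A small sketch of what that notebook inspects (reusing the illustrative Delta path from the sketch above): every commit adds a numbered JSON file under `_delta_log`, and the table history lists the versions used for time travel.

```python
# Each write to a Delta table adds a numbered JSON commit file to _delta_log.
display(dbutils.fs.ls("dbfs:/FileStore/demo/delta_tbl/_delta_log/"))

# DESCRIBE HISTORY shows one row per table version (the basis for time travel).
display(spark.sql("DESCRIBE HISTORY delta.`dbfs:/FileStore/demo/delta_tbl`"))
```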
Day 7: Creating delta tables using SQL Command LECTURE 33
Reference: To be published
Details: to be added
Day 8: Understanding Optimize Command – Demo
Reference: To be published
Details: to be added
Day 9: What is Unity Catalog: Managed and External Tables in Unity Catalog
Reference: To be published
Details: to be added
Day 10: Spark Structured Streaming – basics
Reference: To be published
Details: to be added
Day 11: Autoloader - Intro, Autoloader - Schema inference: Hands-on
Reference: To be published
Details: to be added
Day 12: Project overview: Creating all schemas dynamically
Reference: To be published
Details: to be added
Day 13: Ingestion to Bronze: raw_roads data to bronze Table
Reference: To be published
Details: to be added
Day 14: Silver Layer Transformations: Transforming Silver Traffic data
Reference: To be published
Details: to be added
Day 15: Gold Layer: Getting data to Gold Layer
Reference: To be published
Details: to be added
Day 16: Orchestrating with WorkFlows: Adding run for common notebook in all notebooks
Reference: To be published
Details: to be added
Day 17: Reporting with PowerBI
Reference: To be published
Details: to be added
Day 18: Delta Live Tables: End to end DLT Pipeline
Reference: To be published
Details: to be added
Day 19: Capstone Project I
Reference: To be published
Details: to be added
Day 20: Capstone Project II
Reference: To be published
Details: to be added