Python and PySpark with Databricks, with Azure Project
Syllabus:
Python Syllabus
1. Introduction to Python
Python introduction, History of Python, Introduction to the Python
interpreter and program execution, Python installation process on
Windows and Linux, Python IDEs, Introduction to Anaconda, Python
variable declaration, Keywords, Indentation in Python,
Python input/output operations
2. Python’s Operators
Arithmetic Operators, Comparison Operators, Assignment Operators,
Logical Operators, Bitwise Operators, Membership Operators, Identity
Operators, Ternary Operator, Operator precedence.
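A few of the operator families listed above, shown as a minimal sketch. The variable names and values are illustrative only and are not part of the course material.

```python
# Illustrative values only; all names here are hypothetical.
a, b = 7, 2

# Arithmetic operators
total = a + b        # addition
quotient = a / b     # true division, always returns a float
floor_q = a // b     # floor division
remainder = a % b    # modulo
power = a ** b       # exponentiation

# Comparison and logical operators (comparisons can be chained)
in_range = 0 < a < 10 and b != 0

# Bitwise operators work on the binary representation
bit_and = a & b      # 0b111 & 0b010
shifted = a << 1     # shift left by one bit

# Membership (in) and identity (is) operators
langs = ["python", "sql"]
has_python = "python" in langs
copy_is_same = langs is list(langs)   # equal contents, different object

# Ternary operator (conditional expression)
parity = "even" if a % 2 == 0 else "odd"

# Operator precedence: * binds tighter than +, ** tighter than unary minus
precedence_demo = 2 + 3 * 4
neg_square = -2 ** 2
```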
5. Functions in Python
Introduction to functions, Function definition and calling, Function
parameters, Default-argument functions, Variable-argument
functions, Built-in functions in Python, Scope of variables in Python
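The function topics above (default arguments, variable arguments, built-in functions, variable scope) can be sketched as follows. The function names are hypothetical examples chosen for illustration.

```python
def greet(name, greeting="Hello"):
    """Default argument: greeting falls back to "Hello" when omitted."""
    return f"{greeting}, {name}!"


def average(*numbers):
    """Variable arguments: *numbers collects positional arguments in a tuple."""
    if not numbers:
        return 0.0
    # sum() and len() are built-in functions
    return sum(numbers) / len(numbers)


def describe(**fields):
    """Keyword variable arguments: **fields collects them in a dict."""
    return ", ".join(f"{key}={value}" for key, value in sorted(fields.items()))


level = "global"


def show_scope():
    """Scope: this assignment creates a local name and leaves the outer one alone."""
    level = "local"
    return level
```

Calling `greet("Asha")` uses the default greeting, while `greet("Asha", "Hi")` overrides it. After `show_scope()` returns, the outer `level` is still `"global"`.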
6. File Processing
Concept of Files, File opening in various modes and closing of a file,
Reading from a file, Writing to a file, some important file
handling functions, e.g. open(), close(), read(), readline(), etc.
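The file modes and functions named above can be tried with a short sketch like the one below. The file name `notes.txt` and its contents are made up for the example, and a temporary directory is used so nothing else on disk is touched.

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# "w" mode creates the file (or truncates an existing one)
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")
    f.write("second line\n")

# "a" mode appends to the end of the file
with open(path, "a", encoding="utf-8") as f:
    f.write("third line\n")

# "r" mode reads; readline() returns one line, read() returns the remainder
with open(path, "r", encoding="utf-8") as f:
    first = f.readline()
    rest = f.read()

# readlines() returns every line as a list of strings
with open(path, encoding="utf-8") as f:
    lines = f.readlines()

# Without a with-block, close() must be called explicitly
handle = open(path, encoding="utf-8")
try:
    content = handle.read()
finally:
    handle.close()
```

The `with` form is usually preferred because the file is closed even if an exception is raised inside the block.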
7. Modules
Concept of modularization, Importance of modules in Python,
Importing modules, Built-in and third-party modules (e.g. NumPy)
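The common import forms can be sketched with standard-library modules, as below. Standard-library modules are used here only so the example runs without installing anything; a third-party module such as NumPy is imported the same way once it is installed.

```python
# Import a whole module and use it through its name
import math

# Import a single name from a module
from statistics import mean

# Import a module under an alias
import random as rnd

radius = 2.0
area = math.pi * radius ** 2    # math.pi is a constant defined by the module
root = math.sqrt(16)

avg = mean([10, 20, 30])

rnd.seed(42)                    # fixed seed so repeated runs pick the same item
pick = rnd.choice(["spark", "kafka", "nifi"])
```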
Databricks Concepts.
1) Databricks Introduction
A. Databricks Architecture
2) Databricks concepts
different notebooks.
3) Data Management
4) Computation Management
A. Databricks Workflows
B. Workflow task
notebook.
F. Parameterization in notebooks
Databricks
J. Volumes in Databricks
1 PySpark Introduction
2 PySpark Features and Advantages
3 PySpark RDD Computation
4 PySpark Transformations and Actions
5 PySpark Fault-Tolerance mechanism
6 PySpark RDD persistence
7 Different persistence options
8 Test
9 ON Lambda filter and map functions
10 PySpark RDD built-in Transformations
11 PySpark key-value Transformations
12 PySpark built-in Actions
13 PySpark built-in actions and increasing part
14 PySpark Filtering operations and word count
15 PySpark Groupings and Aggregations
16 PySpark installation within Jupyter Notebook
17 PySpark SQL and Creating DataFrames
18 PySpark SQL DataFrame functions
19 PySpark various DataFrame Functions
20 PySpark SQL DataFrame Functions
21 PySpark different types of joins
22 PySpark working with SQL statements
23 PySpark Working with CSV and JSON data
24 Multiline JSON and PySpark integration with
25 PySpark Column Transformations
26 NoSQL Introduction
27 NoSQL HBase Introduction
28 NoSQL HBase CRUD operations
29 Importing data from RDBMS to HBase table
30 MySQL and HBase
31 Various PySpark functions
32 Filtering and Replacing column values
33 PySpark Jupyter and PySpark pandas and cal
34 PySpark Date and Timestamp functions
35 Stages and Tasks, Narrow and Wide Transformations
36 Test
37 NiFi Lecture 1
38 NiFi Lecture 2
39 Kafka Lecture 1
40 Kafka Lecture 2
41 Streaming Lecture 1
43 Streaming Lecture 2
44 Streaming Lecture 3
45 PySpark Project(s)
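The lambda, filter and map topic and the word count topic in the list above can be previewed in plain Python before moving to RDDs. The sketch below is not PySpark code and uses made-up sample lines; it only mirrors the shape of the RDD pipeline, in which the corresponding steps are typically written with `flatMap`, `filter`, `map` and `reduceByKey`.

```python
from functools import reduce

# Made-up sample input
lines = ["spark makes big data simple", "big data needs spark"]

# Split every line into words and flatten (the role flatMap plays on an RDD)
words = [word for line in lines for word in line.split()]

# filter keeps only the elements for which the lambda returns True
long_words = list(filter(lambda w: len(w) > 4, words))

# map applies the lambda to every element, here building (word, 1) pairs
pairs = list(map(lambda w: (w, 1), words))


def add_pair(acc, pair):
    """Add one (word, count) pair into the running dictionary of totals."""
    word, n = pair
    acc[word] = acc.get(word, 0) + n
    return acc


# Sum the counts per word (the role reduceByKey plays on an RDD)
counts = reduce(add_pair, pairs, {})
```

The main difference in PySpark is that transformations are evaluated lazily and run across partitions, and nothing executes until an action is called.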
Operations usage.
E. Delta Lake partitions
Threat Protection.
SPARK SQL:
3) Drop databases
10) Spark SQL MERGE With SCD Type 1 and SCD Type 2
11) Spark SQL WHERE Clause, Group By Clause and Having Clauses
13) Spark SQL join types, Window, Pivot, Limit and Like
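The difference between SCD Type 1 and SCD Type 2 (item 10 above) can be illustrated with a plain-Python sketch. This is not Spark SQL MERGE syntax, and the column names (`id`, `city`, `start_date`, `end_date`, `is_current`) are assumptions made for the example.

```python
def scd_type1(target, updates):
    """Type 1: overwrite matching rows in place; no history is kept."""
    merged = {row["id"]: dict(row) for row in target}
    for row in updates:
        merged[row["id"]] = dict(row)   # update if matched, insert if not
    return sorted(merged.values(), key=lambda r: r["id"])


def scd_type2(target, updates, load_date):
    """Type 2: close the current row and append a new version, keeping history."""
    result = [dict(row) for row in target]   # copy so the input is not modified
    for new in updates:
        current = next(
            (r for r in result if r["id"] == new["id"] and r["is_current"]),
            None,
        )
        if current is not None and current["city"] == new["city"]:
            continue                          # nothing changed, keep the row
        if current is not None:
            current["is_current"] = False     # expire the old version
            current["end_date"] = load_date
        result.append({
            "id": new["id"],
            "city": new["city"],
            "start_date": load_date,
            "end_date": None,
            "is_current": True,
        })
    return result
```

With Type 1 a customer who moves from one city to another ends up with a single row holding the new city. With Type 2 the same change leaves two rows: the expired old version and the current new one.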
AZURE
3) Blob Storage
A. Azure Blob Resources
B. Azure storage account data objects
C. Azure storage account types and Options
D. Replications in distribution
E. Secure access to an application's data
F. Azure Import/Export service
G. Storage Explorer
H. Practical section on Blob Storage
PROJECT AZURE
Streaming Project Using
1. NiFi
2. Kafka
3. PySpark
4. Azure