2023713662-PythonSQLPyspark
Python (4 Days)
Day 1
• Installing and setting up Python
• Writing your first program in Python
o Printing Hello World
• Operators and Expressions
• Slicing
o Negative slicing
o Using step in slicing
o Slicing backwards
• Strings
o String operators
o String formatting
• Program Flow Control in Python
o if statement
o elif
o for loop
o continue and break
o while loop
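A minimal sketch tying the Day 1 topics together (the file name and sample values are invented for illustration):

    # day1_basics.py -- printing, slicing, string formatting and flow control
    text = "Hello World"
    print(text)                      # the classic first program
    print(text[-5:])                 # negative slicing -> "World"
    print(text[::2])                 # slicing with a step -> "HloWrd"
    print(text[::-1])                # slicing backwards -> "dlroW olleH"

    name = "Python"
    print(f"Welcome to {name}!")     # f-string formatting

    for n in range(10):              # for loop with continue and break
        if n % 2 == 0:
            continue                 # skip even numbers
        elif n > 7:
            break                    # stop once n exceeds 7
        print(n)                     # prints 1, 3, 5, 7

    count = 3
    while count > 0:                 # while loop counting down
        print(count)
        count -= 1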
Day 2
• Lists and Tuples
o Mutable vs. immutable objects
o Lists
o Sorting a list
o Removing items from a list
o Replacing items in a list
o What are tuples?
o Performing basic operations on a tuple
• Dictionaries and Sets
• Functions
o Defining a function
o Parameters and arguments
o Returning values
o Docstring
o *args
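A short sketch of the Day 2 topics (names and values are invented for illustration):

    # day2_collections.py -- lists, tuples, dicts, sets and functions
    scores = [42, 7, 19]             # lists are mutable
    scores.sort()                    # sorts in place -> [7, 19, 42]
    scores.remove(19)                # removing an item by value
    scores[0] = 10                   # replacing an item by index

    point = (3, 4)                   # tuples are immutable
    print(len(point), max(point))    # basic functions on a tuple

    ages = {"alice": 30, "bob": 25}  # a dictionary
    unique = set([1, 2, 2, 3])       # a set drops duplicates -> {1, 2, 3}

    def total(*args):
        """Return the sum of any number of arguments."""  # docstring
        return sum(args)

    print(total(1, 2, 3))            # -> 6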
Day 3
• Input and Output in Python
o Reading and writing to a text file
o Appending to a file
o Object persistence using shelve
• Exception handling in Python
• Generators, decorators and lambda expressions
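A compact sketch of the Day 3 topics; the file and shelf names are placeholders:

    # day3_io.py -- file I/O, shelve, exceptions, generators, decorators, lambdas
    import shelve

    with open("notes.txt", "w") as f:      # writing to a text file
        f.write("first line\n")
    with open("notes.txt", "a") as f:      # appending to a file
        f.write("second line\n")
    with open("notes.txt") as f:           # reading it back
        print(f.read())

    with shelve.open("store") as db:       # object persistence with shelve
        db["settings"] = {"debug": True}

    try:                                   # exception handling
        1 / 0
    except ZeroDivisionError as e:
        print("caught:", e)

    def squares(n):                        # a generator
        for i in range(n):
            yield i * i

    def shout(fn):                         # a simple decorator
        def wrapper(*args):
            return fn(*args).upper()
        return wrapper

    @shout
    def greet(name):
        return f"hello {name}"

    double = lambda x: x * 2               # a lambda expression
    print(list(squares(4)), greet("world"), double(5))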
Day 4
• Introduction to external libraries in Python
• Deep dive into libraries:
o NumPy, Pandas and Matplotlib
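One way the three libraries fit together (the column names and output file are arbitrary):

    # day4_libraries.py -- NumPy, Pandas and Matplotlib in one pipeline
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    arr = np.arange(1, 6)                         # NumPy array [1, 2, 3, 4, 5]
    df = pd.DataFrame({"x": arr, "y": arr ** 2})  # Pandas DataFrame from the array
    print(df.describe())                          # summary statistics

    df.plot(x="x", y="y", kind="line")            # Matplotlib chart via Pandas
    plt.title("y = x squared")
    plt.savefig("plot.png")                       # save the figure to disk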
Assessment
SQL (4 Days)
Fundamentals of SQL
Day 1
• Introduction to SQL
o Introduction
o Work with Schemas
o Explore the structure of SQL statements: DDL, DML and DCL
o Examine the SELECT statement
o Work with data types
o Handle NULLs
Hands-on: Work with SELECT statements
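A self-contained sketch of the SELECT material, using Python's built-in sqlite3 module so it runs anywhere (the course may target a different engine; the emp table and its columns are invented):

    # sql_day1.py -- SELECT, data types and NULL handling
    import sqlite3

    con = sqlite3.connect(":memory:")
    # DDL: create a table with mixed data types
    con.execute("CREATE TABLE emp (id INTEGER, name TEXT, salary REAL, bonus REAL)")
    # DML: insert rows, one with a NULL bonus
    con.execute("INSERT INTO emp VALUES (1, 'Ada', 90000, NULL)")
    con.execute("INSERT INTO emp VALUES (2, 'Lin', 75000, 5000)")

    # SELECT with a column alias; COALESCE handles the NULL bonus
    query = "SELECT name, salary + COALESCE(bonus, 0) AS total_pay FROM emp"
    for row in con.execute(query):
        print(row)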
Day 2
• Sort and filter results in SQL
o Sort your results
o Limit the sorted results
o Page results
o Remove duplicates
o Filter data with predicates
• Combine multiple tables with JOINs in SQL
o Understand join concepts and syntax
o Use Inner joins
o Use Outer joins
o Use Cross joins
o Use Self joins
• Write Subqueries in SQL
o Understand Subqueries
o Use scalar or multi-valued subqueries
o Use self-contained or correlated subqueries
Hands-on: Sort and filter query results
Hands-on: Query multiple tables with joins
Hands-on: Use Subqueries
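A sketch covering sorting, filtering, joins and a correlated subquery in one place (again via sqlite3; tables and data are made up):

    # sql_day2.py -- sorting, filtering, joins and subqueries
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE dept (id INTEGER, name TEXT);
        CREATE TABLE emp  (id INTEGER, name TEXT, dept_id INTEGER, salary REAL);
        INSERT INTO dept VALUES (1, 'Eng'), (2, 'Ops');
        INSERT INTO emp  VALUES (1, 'Ada', 1, 90000), (2, 'Lin', 1, 75000),
                                (3, 'Sam', 2, 60000);
    """)

    # Sort, remove duplicates, filter with a predicate, then limit and page
    print(con.execute(
        "SELECT DISTINCT name FROM emp WHERE salary > 50000 "
        "ORDER BY salary DESC LIMIT 2 OFFSET 0"
    ).fetchall())

    # Inner join plus a correlated subquery: who earns above their
    # department's average salary?
    print(con.execute("""
        SELECT e.name, d.name
        FROM emp e JOIN dept d ON e.dept_id = d.id
        WHERE e.salary > (SELECT AVG(salary) FROM emp
                          WHERE dept_id = e.dept_id)
    """).fetchall())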
Day 3
• Use built-in functions and GROUP BY in SQL
o Categorize built-in functions
▪ Use aggregate functions - AVG, SUM, MIN, MAX, COUNT
▪ Use mathematical functions - ABS, COS, SIN, ROUND, RAND
▪ Use ranking functions - RANK, DENSE_RANK
▪ Use analytical functions - LAG, LEAD, LAST_VALUE, PERCENTILE_CONT, PERCENTILE_DISC, PERCENT_RANK
▪ Use logical functions - CHOOSE, GREATEST, LEAST
o Summarize data with GROUP BY
o Filter groups with HAVING
• Modify data with SQL
o Insert data
o Generate automatic values
o Update data
o Delete data
o Merge data based on multiple tables
Hands-on: Use built-in functions
Hands-on: Modify data
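A sketch of aggregation and data modification (sqlite3 again; SQLite has no MERGE statement, so that topic is omitted here, and the sales table is invented):

    # sql_day3.py -- built-in functions, GROUP BY, HAVING and data modification
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- INTEGER PRIMARY KEY generates id values automatically
        CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
        INSERT INTO sales (region, amount) VALUES
            ('north', 100), ('north', 250), ('south', 80);
    """)

    # Aggregate functions with GROUP BY; HAVING filters the groups
    print(con.execute("""
        SELECT region, COUNT(*), SUM(amount), AVG(amount),
               MIN(amount), MAX(amount)
        FROM sales
        GROUP BY region
        HAVING SUM(amount) > 100
    """).fetchall())

    # Modify data: UPDATE then DELETE
    con.execute("UPDATE sales SET amount = amount * 1.1 WHERE region = 'north'")
    con.execute("DELETE FROM sales WHERE amount < 90")
    print(con.execute("SELECT * FROM sales").fetchall())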
Day 4
• Triggers
• Stored Procedures
o Create
o Modify
o Delete
o Execute
o Specify parameters
• Indexes
o Heaps (Tables without Clustered Indexes)
o Clustered & Non-Clustered Indexes
Hands-on: Stored procedure
Hands-on: Indexes
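A sketch of a trigger and an index (sqlite3 for portability; SQLite does not support stored procedures, whose syntax is engine-specific, so only the trigger and index topics are shown):

    # sql_day4.py -- a trigger and a non-clustered-style secondary index
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE emp   (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
        CREATE TABLE audit (emp_id INTEGER, old_salary REAL, new_salary REAL);

        -- Trigger: record every salary change in the audit table
        CREATE TRIGGER trg_salary AFTER UPDATE OF salary ON emp
        BEGIN
            INSERT INTO audit VALUES (OLD.id, OLD.salary, NEW.salary);
        END;

        -- Secondary index to speed lookups by name
        CREATE INDEX idx_emp_name ON emp (name);
    """)
    con.execute("INSERT INTO emp (name, salary) VALUES ('Ada', 90000)")
    con.execute("UPDATE emp SET salary = 95000 WHERE name = 'Ada'")
    print(con.execute("SELECT * FROM audit").fetchall())   # -> [(1, 90000.0, 95000.0)]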
Assessment
PySpark (4 Days)
PySpark
Day 1
• Fundamentals of PySpark
o A Brief Primer on PySpark
o Brief Introduction to Spark
o Apache Spark Stack
o Spark Execution Process
o Newest Capabilities of PySpark
o Cloning GitHub Repository
• Resilient Distributed Datasets
o Creating RDDs
o Schema of an RDD
o Understanding Lazy Execution
o Introducing Transformations – .map(…)
o Introducing Transformations – .filter(…)
o Introducing Transformations – .flatMap(…)
o Introducing Transformations – .distinct(…)
o Introducing Transformations – .sample(…)
o Introducing Transformations – .join(…)
o Introducing Transformations – .repartition(…)
o Project 1: Count Data Project (ingest the dataset, perform preprocessing and exploratory analysis of the data, then apply map, filter, flatMap, distinct, join and repartition)
o Project 2: Weather Temperature Crunch (ingest the dataset, perform preprocessing and exploratory analysis of the data, then apply map, filter, flatMap, distinct, join and repartition on in-stream data)
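A minimal sketch of the transformations listed above, on a local SparkContext (the input lines are invented):

    # rdd_transformations.py -- core RDD transformations
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "day1-rdd")
    lines = sc.parallelize(["spark makes rdds", "rdds are lazy", "spark is fast"])

    words = lines.flatMap(lambda s: s.split())   # flatMap: line -> words
    clean = words.filter(lambda w: len(w) > 3)   # filter: keep longer words
    pairs = clean.map(lambda w: (w, 1))          # map: word -> (word, 1)
    uniq  = clean.distinct()                     # distinct: drop duplicates

    # Nothing has run yet (lazy execution); collect() triggers the job
    print(uniq.collect())
    print(pairs.repartition(2).getNumPartitions())   # repartition to 2 partitions

    # join: combine two pair RDDs on their keys
    other = sc.parallelize([("spark", "engine"), ("rdds", "datasets")])
    print(uniq.map(lambda w: (w, 1)).join(other).collect())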
Day 2
• Resilient Distributed Datasets and Actions
o Introducing Actions – .collect(…)
o Introducing Actions – .reduce(…) and .reduceByKey(…)
o Introducing Actions – .count()
o Introducing Actions – .foreach(…)
o Introducing Actions – .aggregate(…) and .aggregateByKey(…)
o Introducing Actions – .coalesce(…)
o Introducing Actions – .combineByKey(…)
o Introducing Actions – .histogram(…)
o Introducing Actions – .sortBy(…)
o Introducing Actions – Saving Data
o Introducing Actions – Descriptive Statistics
o Project 3: 10 Tasks on the Students/Professors University Dataset (ingest the dataset, perform preprocessing and exploratory analysis of the data, then apply RDD actions)
o Project 4: 8 Tasks on the Customer Data Dataset (ingest the dataset, perform preprocessing and exploratory analysis of the data, then apply the specified RDD actions)
o Project 5: Movie ratings
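A sketch of the actions above (sample data and the output path are placeholders):

    # rdd_actions.py -- common RDD actions
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "day2-actions")
    nums  = sc.parallelize([1, 2, 3, 4, 5])
    pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])

    print(nums.count())                               # 5
    print(nums.reduce(lambda x, y: x + y))            # 15
    print(pairs.reduceByKey(lambda x, y: x + y).collect())  # [('a', 4), ('b', 2)]
    print(nums.histogram([0, 3, 6]))                  # (buckets, counts)
    print(nums.sortBy(lambda x: -x).collect())        # descending sort
    print(nums.stats())                               # descriptive statistics

    nums.foreach(lambda x: None)                      # runs on the executors
    nums.coalesce(1).saveAsTextFile("out_nums")       # saving data (dir must not exist)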
• DataFrames and Transformations
o Creating DataFrames
o Specifying Schema of a DataFrame
o Interacting with DataFrames
o The .agg(…) Transformation
o The .sql(…) Transformation
o Creating Temporary Tables
o Joining Two DataFrames
o Performing Statistical Transformations
o The .distinct(…) Transformation
o Project 6: CompanyMegaData (apply transformation logic, column-level logic, aggregation and exploratory data analysis)
o Project 7: University Data (end-to-end PySpark execution of insight delivery on University Data)
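A sketch of the DataFrame topics (the employee data and schema are invented):

    # df_transformations.py -- DataFrames, SQL and aggregation
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("day2-df").getOrCreate()

    # Create a DataFrame with an explicit schema
    df = spark.createDataFrame(
        [("Ada", "Eng", 90000), ("Lin", "Eng", 75000), ("Sam", "Ops", 60000)],
        "name STRING, dept STRING, salary INT",
    )

    # .agg(...) after a groupBy, plus a statistical transformation
    df.groupBy("dept").agg(F.avg("salary").alias("avg_salary")).show()
    df.describe("salary").show()

    # A temporary table queried with spark.sql(...)
    df.createOrReplaceTempView("emp")
    spark.sql("SELECT DISTINCT dept FROM emp").show()

    # Joining two DataFrames
    depts = spark.createDataFrame([("Eng", "Building A")], "dept STRING, loc STRING")
    df.join(depts, "dept").show()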
Day 3
• Collaborative Filtering and Techniques
o Collaborative filtering
o Utility Matrix
o Explicit and implicit ratings
o Expected Results
o Dataset
o Joining DataFrames
o Train and Test Data
o ALS model
o Optimization: hyperparameter tuning and cross-validation
o Selecting the best model and evaluating predictions
o Project 8: IMDB Rating project (optimization logic focused on the project, with extensive PySpark logic and careful data-manipulation techniques)
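A sketch of the ALS workflow described above, on a toy explicit-rating utility matrix (all values invented; a real run would tune rank and regParam via cross-validation):

    # als_recommender.py -- collaborative filtering with ALS
    from pyspark.sql import SparkSession
    from pyspark.ml.recommendation import ALS
    from pyspark.ml.evaluation import RegressionEvaluator

    spark = SparkSession.builder.appName("day3-als").getOrCreate()

    ratings = spark.createDataFrame(
        [(0, 0, 4.0), (0, 1, 2.0), (1, 0, 5.0),
         (1, 2, 1.0), (2, 1, 3.0), (2, 2, 4.0)],
        "user INT, item INT, rating DOUBLE",
    )
    train, test = ratings.randomSplit([0.8, 0.2], seed=42)   # train/test split

    als = ALS(userCol="user", itemCol="item", ratingCol="rating",
              rank=5, maxIter=10, regParam=0.1, coldStartStrategy="drop")
    model = als.fit(train)                                   # fit the ALS model

    predictions = model.transform(test)                      # evaluate predictions
    rmse = RegressionEvaluator(metricName="rmse", labelCol="rating",
                               predictionCol="prediction").evaluate(predictions)
    print("RMSE:", rmse)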
• Spark Streaming
o Introduction to Spark Streaming
o Spark Streaming with RDDs
o Spark Streaming context
o Spark Streaming: reading data
o Spark Streaming: cluster restart
o Spark Streaming: RDD transformations
o Spark Streaming: DataFrames and display
o Spark Streaming: DataFrame aggregation
o Project 9: Streaming Crunch Dataset (orchestration of an end-to-end stream pipeline ingesting live data)
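A sketch of the DataFrame-based streaming path using Structured Streaming (assumes a text server on localhost:9999, e.g. started with `nc -lk 9999`; the RDD/DStream topics above follow the same read-transform-output shape):

    # stream_counts.py -- streaming word count with aggregation and display
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("day3-stream").getOrCreate()

    lines = (spark.readStream.format("socket")        # reading stream data
             .option("host", "localhost")
             .option("port", 9999)
             .load())

    words  = lines.select(F.explode(F.split(lines.value, " ")).alias("word"))
    counts = words.groupBy("word").count()            # streaming DF aggregation

    query = (counts.writeStream
             .outputMode("complete")                  # display the full counts table
             .format("console")
             .start())
    query.awaitTermination()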
Day 4
• Spark ETL and Capstone Project
o Introduction to ETL
o ETL Pipeline
o Dataset
o Preprocessing, extraction, transformation
o Loading Data and cleaning
o RDS Networking
o Downloading PostgreSQL
o Configuration and execution
Project 10: Completion of the Capstone Project (a full end-to-end project on the Streaming Crunch Dataset spanning the entire set of PySpark concepts, from data exploration to applying techniques that meet the dataset's requirements, trying multiple solution approaches and selecting the most correct and efficient one)
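One possible shape for the ETL pipeline (the CSV path, amount column, JDBC URL and credentials are all placeholders to be replaced with the project's own values):

    # etl_pipeline.py -- extract, transform and load into PostgreSQL
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder.appName("capstone-etl")
             .config("spark.jars.packages", "org.postgresql:postgresql:42.7.3")
             .getOrCreate())

    # Extract: read the raw dataset
    raw = spark.read.csv("data/crunch.csv", header=True, inferSchema=True)

    # Transform: clean, cast and filter
    clean = (raw.dropna()
                .withColumn("amount", F.col("amount").cast("double"))
                .filter(F.col("amount") > 0))

    # Load: write to a PostgreSQL table over JDBC (e.g. an RDS instance)
    (clean.write.format("jdbc")
          .option("url", "jdbc:postgresql://<rds-endpoint>:5432/warehouse")
          .option("dbtable", "public.crunch_clean")
          .option("user", "etl_user")
          .option("password", "***")
          .option("driver", "org.postgresql.Driver")
          .mode("overwrite")
          .save())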