Data Engineering & Interview Grooming Course

The document outlines a comprehensive Data Engineering/Data Science course covering topics such as Spark, Hive, SQL, Python, AWS Cloud, and Airflow, aimed at freshers and those looking to upskill. The course lasts 3 to 4 months with live Zoom lectures on weekends, and includes additional resources like recorded sessions and resume guidance. The total cost is Rs 5000, with a focus on practical problem-solving and job preparation support.

Uploaded by

SUBHADIP DAS

THE DATA BUZZ WRAP

COMMUNITY
WEEKEND DATA ENGINEERING / DATA SCIENCE COURSE

BATCH 6
Class Already Started, 2023

Course Overview
This is a complete end-to-end Data Engineering / Data Science course covering
Spark, Hive, SQL, Python, AWS Cloud, Airflow, and Git, along with Guesstimates and
Problem Solving. It is particularly helpful for freshers, college students, or
anyone who wants to transition into the engineering, science, or analytics field. If you
want to upskill or brush up your knowledge, this course is a good fit because it is
comprehensive yet short in duration.

Course Duration: 3 to 4 months

Class Timing: 10:00 AM - 11:30 AM (Sat - Sun)

Doubt-Clearing Lecture: 1 hour on Sunday

Live lectures will be conducted on Zoom.

A recording of each live session, with lifetime access, will also be provided.
However, we urge you to attend the live lectures for better understanding.

SPARK

● Spark Overview
● Why Spark is used everywhere instead of MapReduce
● Advantages & Disadvantages of Spark
● Spark Components
● Spark Architecture
● Spark RDDs and DataFrames in detail
● Different File Formats used in Spark
● Spark Operations (Transformations & Actions)
● Shuffling in Spark
● Parallelism in Spark
● Spark Built-in Functions
● Spark SQL in detail
● Spark Joins
● Spark Optimization techniques
● Shared Variables in Spark
● Spark Computations
● Real-time problem and solution
● Spark Assignment

HIVE

● Hive Overview & Architecture
● Hive vs RDBMS
● Hive Metastore
● OLAP vs OLTP
● Hive Execution engines
● HQL vs SQL
● Hive Built-in Functions
● ORC file format
● Different tables in Hive
● Table-level optimizations
● Query-level optimizations
● Partitioning vs Bucketing
● Different types of Hive partitions
● SerDe in Hive
● SCD implementations in Hive
● Hive Optimization techniques
● Hive Assignment

AWS CLOUD

A 1-year free AWS Cloud account will be provided.

● Amazon S3 Overview
● Different S3 buckets overview
● S3 life cycle
● Real-time use case of S3
● EMR
● Autoscaling & Cooldown
● Real-time use of EMR
● Amazon Athena Overview
● Tables & View Creation
● MSCK REPAIR
● Glue
● Redshift
● Practice Problems

AIRFLOW

● Airflow Overview
● Why Airflow
● What is a DAG
● DAG Creation
● Operators & Sensors in Airflow
● Integration of Spark jobs with Airflow
● Real-time problem statement

SQL
● Introduction to SQL
● What databases and SQL are, and how they are used together
● How to store and modify the data in a database:
● DDL Commands: CREATE, ALTER, DROP, TRUNCATE, etc.
● Data types: VARCHAR, INT, DECIMAL, DATE, BOOLEAN, etc.
● Constraints: PRIMARY KEY, UNIQUE KEY, NOT NULL, etc.
● DML Operations: INSERT, UPDATE, DELETE, etc.
● How to retrieve data: SELECT statement
● Basic SELECT clause operations: DISTINCT, LIMIT, ORDER BY
● The filter (WHERE) clause: logical operations, comparison
operators, advanced filters
● Aggregation and Advanced Aggregation: GROUP BY, PARTITION BY,
ROWS BETWEEN clause, rolling calculations, filtering with the HAVING
clause
● SQL JOINS: INNER, LEFT, RIGHT, FULL OUTER, SELF, CROSS
● Set Operations: UNION, UNION ALL, MINUS, INTERSECT
● Calculated Columns and SQL Functions: CASE WHEN, date
functions, string functions, data type conversion functions, etc.
● Queries within queries: Subqueries and CTEs (WITH clause)
● Window Analytical Functions: RANK, ROW_NUMBER,
DENSE_RANK, LEAD/LAG, NTILE
● Performance tuning: clustered and non-clustered indexes, best
practices for SQL optimization
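
To give a flavor of the window functions listed above, here is a minimal sketch using Python's built-in sqlite3 module. The `employees` table, its columns, and the sample rows are hypothetical and invented purely for illustration; they are not course material. It assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical sample data, invented for illustration only.
ROWS = [
    ("Asha", "Data", 90),
    ("Ravi", "Data", 90),
    ("Meena", "Data", 70),
    ("John", "Ops", 60),
    ("Sara", "Ops", 80),
]

# RANK leaves gaps after ties, DENSE_RANK does not,
# and ROW_NUMBER always assigns a unique sequence number.
QUERY = """
SELECT name, dept, salary,
       RANK()       OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk,
       DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS dense_rnk,
       ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC, name) AS row_num
FROM employees
ORDER BY dept, row_num
"""


def rank_employees(rows=ROWS):
    """Load the rows into an in-memory table and return the ranked result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
        conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in rank_employees():
        print(row)
```

In the "Data" department two people tie on salary, so the third person gets RANK 3 but DENSE_RANK 2, which is the usual way to see the difference between the two functions.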

PYTHON
● Introduction to Python
● Variables, keywords, indentation, quotes
● Comparison, arithmetic, and logical operators
● Loops
● PASS, BREAK, and CONTINUE
● String (type casting, string formatting, slicing, string methods)
● List (type casting, string formatting, slicing, string methods)
● Set (type casting, different operations)
● MAP (use case)
● LAMBDA (use of lambda functions)
● NumPy, Pandas (Python libraries in detail)
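
As a small taste of the map, lambda, and set topics above, here is a minimal sketch. The function names and the sample inputs are hypothetical, chosen only for illustration.

```python
def to_upper_names(names):
    """Use map with a lambda to trim and upper-case each name."""
    return list(map(lambda n: n.strip().upper(), names))


def unique_sorted(values):
    """Use a set to drop duplicates, then return the values in sorted order."""
    return sorted(set(values))


if __name__ == "__main__":
    print(to_upper_names([" spark", "hive "]))
    print(unique_sorted([3, 1, 3, 2]))
```

A list comprehension would work just as well for the first function; map with a lambda is used here only because those are the topics named in the list.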

GUESSTIMATES
● Key points about answering guesstimate questions
● Steps for solving a guesstimate question
● Guesstimate interview question and answer examples
● Conclusion
● What is a guesstimate?
● What skills are revealed while answering guesstimate questions?

PROBLEM SOLVING

● To be prepared to actively listen in order to accurately understand the problem
● To help you know how to take the first step in solving a problem
● To clarify and define the problem
● To understand the usefulness of collaborative problem solving and decision making

WHAT ELSE?

RESUME MAKING GUIDANCE

• My main focus will be to present you as a person who has done some work as a Data Engineer and doesn't just have knowledge.

• For a fresher, this will be through your projects; for an experienced person, it will be through resume molding.

• I will also guide you on how to make an attractive and creative resume.

GUIDANCE TO USE VARIOUS JOB BOARDS

● We will guide you on how to leverage job portals like Naukri.com and LinkedIn to get a job.
● A proper template for reaching out to people on LinkedIn or via email
will be shared.
● Referrals will be provided for 1 year.

GUIDANCE FOR HR ROUND

• Guidance for the HR round

• Answering the questions most often asked by HR

• One-on-one mock HR rounds, if you have failed any

REASONS TO JOIN THIS COURSE INSTEAD OF ANY YOUTUBE VIDEO:

● One-to-one interaction during the live class
● Assignments will be given, and you can ask for help if you are unable to solve them
● Questions that have already been asked in interviews will be solved
while teaching the concepts
● Certification after completing the course

THE COST OF THIS COURSE IS RS 5000 ONLY

Mode of Payment:
● All payments are to be made via GPay or PhonePe.
● After paying, send me a screenshot of the payment along with your
email ID.
● You will be added to the WhatsApp group within 30 minutes.

In case of any query, please connect with me.

8095821145 - WhatsApp or call

Regards,

Your Data Guy

7+ years of experience in the industry with top product companies

SUBHADIP DAS

THANK YOU
