DATA SCIENCE Syllabus
CBIT–B.Tech(R23)–CSE-DS
II B.Tech – II Semester
(23E32401T) DATA ENGINEERING
(CSE-DATA SCIENCE)
Course Outcomes: By the end of the course, students will be able to:
CO1: Understand the Data Engineering Life Cycle.
CO2: Apply appropriate data modeling techniques for different types of data.
CO3: Evaluate and select appropriate technologies and frameworks for specific data
engineering tasks.
CO4: Analyze the use of OLTP Applications in Data Science.
CO5: Implement data quality checks and governance processes to ensure data
reliability and compliance.
UNIT-I: Introduction to Data Engineering: Definition, Data Engineering Life Cycle,
Evolution of the Data Engineer, Data Engineering Versus Data Science, Data Engineering
Skills and Activities, Data Maturity, Data Maturity Model, Skills of a Data Engineer,
Business Responsibilities, Technical Responsibilities, Data Engineers and Other
Technical Roles.
UNIT-II: Data Engineering Life Cycle: Data Life Cycle Versus Data Engineering Life
Cycle, Generation: Source System, Storage, Ingestion, Transformation, Serving Data.
Major Undercurrents across the Data Engineering Life Cycle: Security, Data
Management, DataOps, Data Architecture, Orchestration, Software Engineering.
UNIT-IV: Storage: Raw Ingredients of Data Storage, Data Storage Systems, Data
Engineering Storage Abstractions, Data Warehouse, Data Lake, Data Lakehouse.
Ingestion: Data Ingestion, Key Engineering Considerations for the Ingestion Phase, Batch
Ingestion Considerations, Message and Stream Ingestion Considerations, Ways to Ingest Data.
Textbooks:
1. Joe Reis, Matt Housley, Fundamentals of Data Engineering, O'Reilly Media, Inc., June
2022, ISBN: 9781098108304.
Reference Books:
1. Paul Crickard, Data Engineering with Python, Packt Publishing, October 2020.
2. Ralph Kimball, Margy Ross, The Data Warehouse Toolkit: The Definitive Guide to
Dimensional Modeling, Wiley, 3rd Edition, 2013
3. James Densmore, Data Pipelines Pocket Reference: Moving and Processing Data for
Analytics, O'Reilly Media, 1st Edition,