BDS Course Handout - Intuit PDF
COURSE HANDOUT
Course Description
The course introduces the students to the concepts of Systems for Analytics with particular emphasis on
processing Big Data. It introduces distributed computing models for storage and processing of Big Data
with specific coverage of block storage, file systems, and databases on the one hand and batch processing,
in-memory distributed processing, and stream processing on the other. Hadoop (along with associated
technologies such as Hive and Pig), Spark, and Amazon’s storage and database services are used as
exemplar platforms.
Course Objectives
CO1 Enable students to understand requirements for and constraints in storing and processing Big Data
Text Book(s)
T1 Seema Acharya and Subhashini Chellappan. Big Data Analytics. Wiley India Pvt. Ltd. 2015
Learning Outcomes
LO1 A comprehensive understanding of the Big Data ecosystem along with the typical
technologies involved.
LO2 Apply concepts from distributed computing and use the Hadoop/Map-reduce framework
for solving typical Big Data problems.
LO3 Identify and use appropriate storage / database platforms for Big Data storage along with
appropriate querying mechanisms / interfaces for retrieval.
LO4 Use in-memory processing and stream processing techniques for building Big Data
systems.
Session Plan
2   Storage Models and Cost: Memory Hierarchy, Access costs, I/O Costs (i.e. number of
    disk blocks accessed); Locality of Reference: Principle, examples
    Reference: Any text book for Computer Architecture / Operating Systems
2 3 Impact of Latency: Algorithms and data structures that leverage locality, data
    organization on disk for better locality
    Reference: N.A.
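As an illustration of the I/O cost measure named in the session plan (number of disk blocks accessed) and of why locality matters, here is a minimal sketch. The record size, block size, and record positions are hypothetical values chosen for the example, and the model assumes fixed-size records packed whole into blocks with no caching.

```python
import math

def blocks_for_scan(num_records, record_bytes, block_bytes):
    """Blocks read by a full sequential scan of the file."""
    records_per_block = block_bytes // record_bytes
    return math.ceil(num_records / records_per_block)

def blocks_for_lookups(record_ids, record_bytes, block_bytes):
    """Distinct blocks touched when fetching the given record positions."""
    records_per_block = block_bytes // record_bytes
    return len({rid // records_per_block for rid in record_ids})

# Hypothetical layout: 100-byte records in 4096-byte blocks (40 records per block).
RECORD_BYTES, BLOCK_BYTES = 100, 4096

clustered = range(1000, 1040)        # 40 neighbouring records
scattered = range(0, 40 * 500, 500)  # 40 records spread far apart

scan_cost = blocks_for_scan(10_000, RECORD_BYTES, BLOCK_BYTES)
clustered_cost = blocks_for_lookups(clustered, RECORD_BYTES, BLOCK_BYTES)
scattered_cost = blocks_for_lookups(scattered, RECORD_BYTES, BLOCK_BYTES)
```

Under these assumptions the same number of records costs far fewer block reads when they sit next to each other on disk, which is the motivation for organizing data for locality.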
2 ● Exercises on NoSQL;
● Exercises on NoSQL database – Simple CRUD operations and Failure / Consistency
tests;
● Exercises to implement a Web based application that uses NoSQL databases
3 ● Exercises with Pig queries to perform a Map-reduce job and understand how to build
queries and the underlying principles;
● Exercises on creating Hive databases and operations on Hive, exploring built-in
functions, partitioning, data analysis
4 ● Exercises on Spark to demonstrate RDD, and operations such as Map, FlatMap, Filter,
PairRDD;
● Typical Spark programming idioms such as: Selecting Top N, Sorting, and Joins;
● Exercises on Spark SQL and DataFrames
6 Exercises on Analytics on the Cloud – using AWS, AWS Map-Reduce, AWS data stores /
databases.
[Note: A few of these topics for experiential learning will be covered by video demonstrations and/or
participatory lab sessions operated remotely. The rest will be assigned as homework and may be
included for evaluation – see below. End of Note.]
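To give a feel for the operations named in the Spark exercises above (Map, FlatMap, Filter, pair aggregation, Selecting Top N), the sketch below mimics their semantics in plain Python on a small in-memory list. It is an analogue only, not Spark's API: nothing here is distributed, and the sample lines and variable names are made up for illustration.

```python
import heapq
from collections import Counter
from itertools import chain

# Hypothetical input: a tiny "dataset" of text lines.
lines = ["big data systems", "big data storage", "stream processing systems"]

# Map: exactly one output element per input element (line -> list of words).
split_lines = [line.split() for line in lines]

# FlatMap: map, then flatten the per-line lists into one sequence of words.
words = list(chain.from_iterable(split_lines))

# Filter: keep only the elements that satisfy a predicate.
long_words = [w for w in words if len(w) > 3]

# Pair data: build (key, value) pairs, then sum the values for each key.
pairs = [(w, 1) for w in long_words]
counts = Counter()
for word, one in pairs:
    counts[word] += one

# Top N: the 2 most frequent words, ties broken alphabetically.
top2 = heapq.nsmallest(2, counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

In the actual exercises the same steps would be expressed with Spark's own RDD or DataFrame operations and run across partitions of a cluster.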
Evaluation Scheme
Legend: EC = Evaluation Component
No   | Name                                       | Type                                        | Duration | Weight             | Day, Date, Session, Time
EC-1 | Assignment I, Assignment II, Assignment III | Take-home, Programming and use of platforms | -        | (10+10+20 =) 40%   | To be announced
EC-2 | Mid-Semester Test                          | Closed Book                                 | 2 hours  | 24%                | To be announced
EC-3 | Comprehensive Exam                         | Open Book                                   | 3 hours  | 36%                | To be announced
Important Information
Syllabus for Mid-Semester Test (Closed Book): Topics in Weeks 1-7
Syllabus for Comprehensive Exam (Open Book): All topics given in plan of study
Evaluation Guidelines:
1. EC-1 consists of three Assignments. Announcements regarding the same will be made in a timely
manner.
2. For Closed Book tests: No books or reference material of any kind will be permitted.
Laptops/Mobiles of any kind are not allowed. Exchange of any material is not allowed.
3. For Open Book exams: Use of prescribed and reference text books, in original (not photocopies), is
permitted. Class notes/slides as reference material in filed or bound form are permitted. All other
additional reading materials in filed / bound form are also permitted. However, loose sheets of paper
will not be allowed. Use of calculators is permitted in all exams. Laptops/Mobiles of any kind are not
allowed. Exchange of any material is not allowed.
4. If a student is unable to appear for the Regular Test/Exam due to genuine exigencies, the student
should follow the procedure to apply for the Make-Up Test/Exam. The genuineness of the reason for
absence in the Regular Exam shall be assessed prior to giving permission to appear for the Make-up
Exam. Make-Up Test/Exam will be conducted only at selected exam centres on the dates to be
announced later.
It shall be the responsibility of the individual student to be regular in maintaining the self-study schedule as
given in the course handout, attend the lectures, and take all the prescribed evaluation components such as
Assignment/Quiz, Mid-Semester Test and Comprehensive Exam according to the evaluation scheme
provided in the handout.