
Big Data Analytics Syllabus

The document outlines the course structure for Big Data Analytics (BTAIOE604A), detailing the teaching and examination schemes, prerequisites, course objectives, and outcomes. It covers key topics such as Big Data, Apache Hadoop, HDFS, Map Reduce, and the Hadoop Ecosystem, along with hands-on practice in deploying Big Data systems. Recommended and reference books for further reading are also provided.

Uploaded by

hago sohani

Semester –VI

Big Data Analytics


BTAIOE604A    Big Data Analytics    OEC2    3L-1T-0P    4 Credits

Teaching Scheme:
Lecture: 3 hrs./week
Tutorial: 1 hr./week

Examination Scheme:
Continuous Assessment: 20 Marks
Mid Semester Exam: 20 Marks
End Semester Exam: 60 Marks (Duration 03 hrs.)

Pre-Requisites: Knowledge of one programming language (preferably Java),
practice with SQL (queries and subqueries), and exposure to the Linux environment.

Course Objectives:
Upon completion of this course, the student should be able to
1. Understand the Big Data Platform and its Use cases
2. Provide an overview of Apache Hadoop
3. Explain HDFS concepts and interface with HDFS
4. Understand Map Reduce Jobs
5. Gain hands-on experience with the Hadoop Ecosystem
6. Apply analytics to structured and unstructured data

Course Outcomes:
On completion of the course, students will be able to:

CO1 Identify Big Data and its business implications.
CO2 List the components of Hadoop and the Hadoop Ecosystem.
CO3 Access and process data on a distributed file system.
CO4 Develop Big Data solutions using the Hadoop Ecosystem.
CO5 Use Big Data frameworks, security, and governance.

Course Contents:
Unit No 1: Introduction to Big Data and Hadoop [7 Hours]
Types of Digital Data, Introduction to Big Data, Big Data Analytics, History of Hadoop,
Apache Hadoop, Analyzing Data with UNIX tools, Analyzing Data with Hadoop, Hadoop
Streaming, Hadoop Ecosystem, IBM Big Data Strategy, Introduction to InfoSphere
BigInsights and BigSheets.

Unit No 2: HDFS (Hadoop Distributed File System): [7 Hours]


The Design of HDFS, HDFS Concepts, Command Line Interface, Hadoop file system
interfaces, Data flow, Data Ingest with Flume and Sqoop, Hadoop Archives, Hadoop I/O:
Compression, Serialization, Avro and File-Based Data structures.

Unit No 3: Map Reduce: [7 Hours]


Anatomy of a Map Reduce Job Run, Failures, Job Scheduling, Shuffle and Sort, Task Execution,
Map Reduce Types and Formats, Map Reduce Features, Hadoop cluster.

Unit No 4: Hadoop Eco System: [8 Hours]
Pig: Introduction to Pig, Execution Modes of Pig, Comparison of Pig with Databases, Grunt, Pig
Latin, User Defined Functions, Data Processing operators.
Hive: Hive Shell, Hive Services, Hive Metastore, Comparison with Traditional Databases,
HiveQL, Tables, Querying Data and User Defined Functions.
HBase: HBasics, Concepts, Clients, Example, HBase Versus RDBMS.
Big SQL: Introduction

Unit No 5: Big Data Framework and security: [7 Hours]


Apache Kafka: Features, concepts, architecture, components.
Apache Spark: Features, concepts, architecture, components.
Kerberos authentication: Features, concepts, architecture, components.

Note: Hands-on practice in deploying Big Data systems should be covered in the Tutorial slots.

Text Books
1. Tom White, “Hadoop: The Definitive Guide”, Third Edition, O’Reilly Media, 2012.
2. Seema Acharya, Subhashini Chellappan, “Big Data Analytics”, Wiley, 2015.

Reference Books
1. Michael Berthold, David J. Hand, “Intelligent Data Analysis”, Springer, 2007.
2. Jay Liebowitz, “Big Data and Business Analytics”, Auerbach Publications, CRC Press, 2013.
3. Tom Plunkett, Mark Hornick, “Using R to Unlock the Value of Big Data: Big Data Analytics
with Oracle R Enterprise and Oracle R Connector for Hadoop”, McGraw-Hill/Osborne Media,
Oracle Press, 2013.
4. Anand Rajaraman and Jeffrey David Ullman, “Mining of Massive Datasets”, Cambridge
University Press, 2012.
5. Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams
with Advanced Analytics”, John Wiley & Sons, 2012.
6. Glenn J. Myatt, “Making Sense of Data”, John Wiley & Sons, 2007.
7. Pete Warden, “Big Data Glossary”, O’Reilly, 2011.
8. Michael Minelli, Michele Chambers, Ambiga Dhiraj, “Big Data, Big Analytics: Emerging
Business Intelligence and Analytic Trends for Today’s Businesses”, Wiley Publications, 2013.
