MLR Institute of Technology

BIG DATA ANALYTICS

III B. TECH- II SEMESTER


Course Code   Category   Hours / Week   Credits   Maximum Marks
                         L   T   P      C         CIE   SEE   Total
A4CS19        PCC        3   -   -      3         30    70    100
COURSE OBJECTIVES:
1. To introduce Big Data terminology, technology and applications.
2. To introduce the concepts of analytics and visualization.
3. To demonstrate the usage of various Big Data tools and data visualization tools.

COURSE OUTCOMES:
Upon successful completion of the course, the student will be able to
1. Compare various file systems and use an appropriate file system for storing different types of
data.
2. Demonstrate the concepts of the Hadoop ecosystem for storing and processing unstructured
data.
3. Apply programming knowledge to process the stored data using Hadoop tools and
generate reports.
4. Connect to web data sources for data gathering, and integrate data sources with Hadoop
components to process streaming data.
5. Tabulate and examine the results generated using Hadoop components.

UNIT-I

INTRODUCTION TO BIG DATA: Data and its importance, Big Data - definition, implications of Big
Data, addressing Big Data implications using Hadoop, Hadoop Ecosystem.
HADOOP ARCHITECTURE:
Hadoop Storage: HDFS
Hadoop Processing: MapReduce Framework
Hadoop Server Roles: Name Node, Secondary Name Node and Data Node, Job Tracker,
TaskTracker
HDFS-HADOOP DISTRIBUTED FILE SYSTEM: Design of HDFS, HDFS Concepts, HDFS Daemons,
HDFS High Availability, Block Abstraction, FUSE: File System in User Space. HDFS Command Line
Interface (CLI), Concept of File Reading and Writing in HDFS.

UNIT-II

MAPREDUCE PROGRAMMING MODEL: Introduction to the MapReduce programming model for processing
Big Data, key features of MapReduce, MapReduce job skeleton, introduction to the MapReduce API,
Hadoop data types, developing a MapReduce job using Eclipse, building a MapReduce job and
exporting it as a Java archive (.jar file).
MAPREDUCE JOB LIFE CYCLE: Understanding the Mapper, Combiner, Partitioner, Shuffle & Sort and
Reduce phases of a MapReduce application, developing MapReduce jobs to meet given requirements
using datasets such as a weather dataset.
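
The phases named in the job life cycle above can be illustrated without a Hadoop cluster. The following is a minimal, in-memory sketch in plain Python, not Hadoop API code. It assumes a hypothetical toy record format of "year,temperature", and its partitioning rule (year modulo the number of reducers) is a simple stand-in for Hadoop's default hash-based partitioner. All function names are illustrative.

```python
from collections import defaultdict

# Hypothetical toy records in "year,temperature" form (an assumed format,
# not the layout of any real weather dataset).
RECORDS = ["1949,111", "1949,78", "1950,0", "1950,22", "1950,-11", "1951,35"]


def mapper(line):
    """Map phase: emit one (year, temperature) pair per input line."""
    year, temp = line.split(",")
    yield year, int(temp)


def combiner(pairs):
    """Combiner: pre-aggregate map output, keeping the local maximum per key."""
    local_max = {}
    for key, value in pairs:
        local_max[key] = max(value, local_max.get(key, value))
    return list(local_max.items())


def partitioner(key, num_reducers):
    """Partitioner: pick a reducer for a key (illustrative modulo rule)."""
    return int(key) % num_reducers


def reducer(key, values):
    """Reduce phase: emit the maximum temperature seen for the key."""
    return key, max(values)


def run_job(splits, num_reducers=2):
    # Shuffle: one bucket per reducer, grouping values by key.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        mapped = [pair for line in split for pair in mapper(line)]
        for key, value in combiner(mapped):
            partitions[partitioner(key, num_reducers)][key].append(value)

    # Sort: each reducer processes its keys in sorted order.
    results = {}
    for bucket in partitions:
        for key in sorted(bucket):
            out_key, out_value = reducer(key, bucket[key])
            results[out_key] = out_value
    return results


if __name__ == "__main__":
    # Two input splits, standing in for two map tasks.
    print(run_job([RECORDS[:3], RECORDS[3:]]))
```

Taking a maximum is safe to run in a combiner because it is associative and commutative; an aggregate such as a mean would need a different intermediate value (for example, a sum and a count).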

UNIT-III

INTRODUCTION TO PIG: Understanding Pig and the Pig platform, introduction to the Pig Latin language
and execution engine, running Pig in different modes, the Pig Grunt shell and its usage.
PIG LATIN LANGUAGE – SEMANTICS – DATA TYPES IN PIG: Pig Latin basics, keywords, Pig data
types, understanding Pig relations, bags and tuples, writing Pig relations or statements using the
Grunt shell, expressions, data processing operators, using built-in functions.
WRITING PIG SCRIPTS USING PIG LATIN: Writing Pig scripts and saving them in a text editor, running
Pig scripts from the command line.

B.Tech- CSE – Academic Regulations & Syllabus – MLR18 Page 138



UNIT-IV

INTRODUCTION TO HIVE: Understanding the Hive shell, running Hive, understanding schema on read
and schema on write.
HIVE QL DATA TYPES, SEMANTICS: Introduction to Hive QL (Query Language), Language semantics,
Hive Data Types.
HIVE DDL, DML AND HIVE SCRIPTS: Hive statements, understanding and working with Hive Data
Definition Language (DDL) and Data Manipulation Language (DML) statements, creating Hive scripts
and running them from the Hive terminal and the command line.

UNIT-V

SQOOP: Introduction to the Sqoop tool, commands to connect to databases and list databases and
tables, command to import data from an RDBMS into HDFS, command to export data from HDFS into the
required tables of an RDBMS.
FLUME: Introduction to the Flume agent, understanding the Flume components Source, Channel and Sink.
OOZIE: Introduction to Oozie, understanding workflow management.

TEXT BOOKS:

1. Tom White, "Hadoop: The Definitive Guide", 4th Edition, O'Reilly Media.
2. Chris Eaton, Dirk deRoos et al., "Understanding Big Data", McGraw Hill, 2012.
3. Seema Acharya, Subhashini Chellappan, "Big Data Analytics", Wiley, 2015.

REFERENCE BOOKS:
1. Michael Berthold, David J. Hand, "Intelligent Data Analysis", Springer, 2007.
2. Paul Zikopoulos, Dirk deRoos, Krishnan Parasuraman, Thomas Deutsch, James Giles, David
Corrigan, "Harness the Power of Big Data: The IBM Big Data Platform", Tata McGraw Hill
Publications, 2012.
