IT4651 - T Syllabus

The document outlines a course on Big Data Analytics, covering topics such as the definition and applications of Big Data, data analysis techniques including clustering and classification, and the architecture of big data file systems like Hadoop. It also discusses mining data streams and introduces NoSQL databases and related data models. The course includes recommended textbooks and reference materials for further study.

Uploaded by

pm96mithun

IT4651 BIG DATA ANALYTICS          L T P C
(Common to IT & CSE)               3 0 0 3


UNIT – I INTRODUCTION TO BIG DATA
Defining Big Data – 5 V’s of Big Data – Traditional vs. Big Data Systems – Big Data Applications –
Risks of Big Data – Structure of Big Data – Big Data Use Cases – Understanding Big Data Storage –
Evolution of Big Data – Big Data Technologies – Data Analytics Lifecycle – Data Analytics Lifecycle
Overview – Discovery – Data Preparation.
UNIT – II DATA ANALYSIS
Overview of Clustering – K-means – Use Cases – Overview of the Method – Determining the Number
of Clusters – Classification: Decision Trees – Overview of a Decision Tree – The General Algorithm –
Decision Tree Algorithms – Evaluating a Decision Tree – Decision Trees in R – Naïve Bayes – Bayes’
Theorem – Naïve Bayes Classifier.
UNIT – III BIG DATA FILE SYSTEM
Google File System (GFS) – Distributed File Systems – Large-Scale File System Organization –
Hadoop Ecosystem – Hadoop Distributed File System (HDFS) Concepts – HDFS Architecture – HDFS
Commands – Hadoop MapReduce – MapReduce Programming Model – Hadoop YARN – Case Studies –
Word Count Program.
UNIT – IV MINING DATA STREAMS
Streams Concepts – Stream Data Model and Architecture – Sampling Data in a Stream – Filtering
Streams – Counting Distinct Elements in a Stream – Estimating Moments – Counting Ones in a
Window – Decaying Window – Real-Time Analytics Platform (RTAP) Applications – Case Studies –
Real-Time Sentiment Analysis, Stock Market Predictions.
UNIT – V BIG DATA MODELS
Introduction to NoSQL – Aggregate Data Models – HBase: Data Model and Implementations – HBase
Clients – Examples – Pig Data Model – Hive – Data Types and File Formats – HiveQL Data
Definition – HiveQL Data Manipulation – HiveQL Queries.
Total Periods: 45
TEXT BOOKS:
1. Bill Franks, "Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced
Analytics", Wiley and SAS Business Series, 2012.
2. David Loshin, "Big Data Analytics: From Strategic Planning to Enterprise Integration with Tools, Techniques,
NoSQL, and Graph", Morgan Kaufmann/Elsevier Publishers, 2013.
REFERENCE BOOKS:
1. Michael Berthold, David J. Hand, "Intelligent Data Analysis", Springer, Second Edition, 2007.
2. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging Business
Intelligence and Analytic Trends for Today's Businesses", Wiley, 2013.
