The document outlines a course on Big Data Analytics, covering topics such as the definition and applications of Big Data, data analysis techniques including clustering and classification, and the architecture of big data file systems like Hadoop. It also discusses mining data streams and introduces NoSQL databases and related data models. The course includes recommended textbooks and reference materials for further study.
IT4651 - T Syllabus
IT4651 BIG DATA ANALYTICS (Common to IT & CSE)
L T P C: 3 0 0 3
UNIT – I INTRODUCTION TO BIG DATA
Defining Big Data – 5 V's of Big Data – Traditional vs. Big Data Systems – Big Data Applications – Risks of Big Data – Structure of Big Data – Big Data Use Cases – Understanding Big Data Storage – Evolution of Big Data – Big Data Technologies – Data Analytics Lifecycle – Data Analytics Lifecycle Overview – Discovery – Data Preparation.

UNIT – II DATA ANALYSIS
Overview of Clustering – K-means – Use Cases – Overview of the Method – Determining the Number of Clusters – Classification: Decision Trees – Overview of a Decision Tree – The General Algorithm – Decision Tree Algorithms – Evaluating a Decision Tree – Decision Trees in R – Naïve Bayes – Bayes' Theorem – Naïve Bayes Classifier.

UNIT – III BIG DATA FILE SYSTEM
Google File System (GFS) – Distributed File Systems – Large-Scale File System Organization – Hadoop Ecosystem – Hadoop Distributed File System (HDFS) Concepts – HDFS Architecture – HDFS Commands – Hadoop MapReduce – MapReduce Programming Model – Hadoop YARN – Case Studies – Word Count Program.

UNIT – IV MINING DATA STREAMS
Streams Concepts – Stream Data Model and Architecture – Sampling Data in a Stream – Filtering Streams – Counting Distinct Elements in a Stream – Estimating Moments – Counting Ones in a Window – Decaying Window – Real-Time Analytics Platform (RTAP) Applications – Case Studies – Real-Time Sentiment Analysis, Stock Market Predictions.

UNIT – V BIG DATA MODELS
Introduction to NoSQL – Aggregate Data Models – HBase: Data Model and Implementations – HBase Clients – Examples – Pig Data Model – Hive – Data Types and File Formats – HiveQL Data Definition – HiveQL Data Manipulation – HiveQL Queries.

Total Periods: 45

TEXT BOOKS:
1. Bill Franks, "Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics", Wiley and SAS Business Series, 2012.
2. David Loshin, "Big Data Analytics: From Strategic Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph", Morgan Kaufmann/Elsevier Publishers, 2013.

REFERENCE BOOKS:
1. Michael Berthold, David J. Hand, "Intelligent Data Analysis", Springer, Second Edition, 2007.
2. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses", Wiley, 2013.