0% found this document useful (0 votes)

63 views

Big Data For Machine Learning - Syllabus

This document provides information about a course titled "BIGDATA FOR MACHINE LEARNING". It is a professional elective course offered by the Information Technology department. The purpose of the course is to utilize the Hadoop architecture and its use cases, create mapper and reducer functions to build Hadoop applications, understand key design considerations for data ingress and egress tools in Hadoop, and review MongoDB Aggregation framework. By the end of the course, learners will be able to understand Hadoop architecture and its business implications and build reliable, scalable distributed applications on Hadoop.

Uploaded by

VASUDEVAN N (RC2113003011006)

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

63 views

Big Data For Machine Learning - Syllabus

Uploaded by

VASUDEVAN N (RC2113003011006)

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 2

Course Course Course L T P C

20PITE54J BIGDATA FOR MACHINE LEARNING E Professional Elective

Code Name Category 3 0 2 4

Pre-requisite Co-requisite Progressive

Nil Nil
Courses Courses Courses
Course Offering Department Information Technology Data Book / Codes/Standards Nil

Course Learning Rationale

The purpose of learning this course is to: Learning Program Learning Outcomes (PLO)
(CLR):
CLR-1 : Utilize the Hadoop architecture and its use cases 1 2 3 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
CLR-2 : Create mapper and reducer functions to build Hadoop applications

Scientific Reasoning

Reflective Thinking

Life Long Learning

Ethical Reasoning
CLR-3 : Understand key design considerations for data ingress and egress tools in Hadoop

Level of Thinking

Critical Thinking

Leadership Skills
Problem Solving
Attainment (%)
Proficiency (%)
CLR-4 : Review about MongoDB Aggregation framework

Research Skills

Self-Directed

Multicultural
Competence

Engagement
Team Work
Disciplinary

Community
CLR-5 : Infer about different kind of ecosystem tools in Hadoop

Knowledge

Reasoning

ICT Skills
Analytical
Expected

Expected

Learning
(Bloom)
Course Learning Outcomes
At the end of this course, learners will be able to:
(CLO):
CLO-1 : Understand Hadoop architecture and its Business Implications 1 80 70 L H - H L - - - L L - H - - -
CLO-2 : Build reliable, scalable distributed system with Apache Hadoop 1 85 75 M H M M H - - - M L - H - - -
CLO-3 : Import and export data into Hadoop Distributed File system 2 75 70 M H H H M - - - M L - H - - -
CLO-4 : Interpret MongoDB design goals and setup MongoDB environment 2 85 80 M H M H M - - - M L - H - - -
CLO-5 : Develop Big Data Solutions using Hadoop Eco System tools 3 85 75 H H M H H - - - M L - H - - -

Duration 15 15 15 15
15
(hour)
Basics of Data and what is Big data. Applications Blocks and replication management, HDFS Architecture Data Ingesting into Big data, What is Data Intro to PyMongo PySpark Ml-Preprocess data
SLO-1
of Big Data ingestion ? Install PyMongo, the Python Driver
S-1
Big Data requirement for traditional Data and Distributed Storage (HDFS) Sources of data which can be ingested into Steps to Connect to MongoDB Model training
SLO-2
the environment
Data warehousing and BI space, Big Data HDFS Federation SQOOP Introduction, Need for Sqoop PyMongo Basic Operations Hyper parameter training and AutoML
SLO-1
solutions
S-2
What is Distributed File System What is Name node and Data node, Name node High Where can we use sqoop, import and export Perform basic Create, Retrieve, Update and Delete Inference of Model
SLO-2
availability, syntaxes in sqoop, (CRUD) operations using PyMongo
Characteristics of Big Data and Dimensions of Component failures and recoveries, Incremental imports in SQOOP One end to end tutorial showing installation, data Deploy the model
SLO-1
Scalability loading , processing
S-3
Applications of Big data Basic Hadoop Shell commands implementation Importing data into hive using Introduction to Spark Serve the model
SLO-2
SQOOP,Case Study on SQOOP
S SLO-1 Tutorial 1:Programs in Map Reduce Tutorial4: Hadoop command hands-on Tutorial 7: Case Study Tutorial10: PyMongo Hands-on Tutorial 13: Hands-on PySpark and Various
4-5 SLO-2 examples on Spark
Historical concepts of Hadoop-Where is Hadoop Features of Hadoop 2.0 Flume ,Introduction to Ingesting data into Spark Architecture Model inference
SLO-1
S-6 used. Big Data Platforms using Flume
SLO-2 Apache Hadoop :Introduction to Hadoop The HDFS Sink Application of Data Ingestion PySpark and Data Bricks Deployment of the model
Distributed Computing Environment, What Partitioning and Interceptors Introduction to Flume, Need for Flume Case Study Export the model
SLO-1
Hadoop is & why it is important
S-7
Hadoop comparison with traditional systems, Different File Formats used Flume Architecture, Event, source, channel Introduction to Spark SQL Kafka, Data Streaming
SLO-2
and sink
SLO-1 Data and Types of Data Anatomy of File Write Demo: Data ingestion using flume Basics of Spark SQL as an ETL tool What is Kafka and its architecture ?
S-8 ,Structured, unstructured, semi-structured and Anatomy of File read, Case Study Case Study on Spark SQL Performance Tuning Connect to KSQL or SQL or Python for
SLO-2
quasi structured data analytics
S SLO-1 Tutorial 2: HDFS Commands Tutorial 5: HDFS Commands(Reading and Loading Tutorial 8: Using Sqoop and Flume Tutorial 11: Spark SQL Tutorial 14: Implementing Spark MLib
9-10 SLO-2 Files) examples
HDFS Design System Intro to Hive ,Hive Architecture Introduction to MongoDb, Understanding Case Study Twitter -> Kafka -> Spark streaming -
SLO-1
S-11 Ecosystem of MongoDB >Analytics
SLO-2 Different HDFS Shell Commands Query submission in Hive Limitations of RDBMS PySpark & Azure Data Bricks (Free) Case study,

SRM Institute of Science and Technology - Academic Curricula – (M.Tech Regulations 2020) 45
File Formats supported Hive basic operations Why NoSQL ? Business use cases of PySpark MLBasics Example using Twitter Data - MongoDB -
SLO-1
NoSQL Kafka - PySpark/ADB
S-12 Hadoop main components with a Diagram Creating table and loading data from HDFS Why choose MongoDB and advantages ? PySpark Ml :Walk through and pricing details Twitter API (access, token)
SLO-2 Explore MongoDB collections and
documents
Internal and External Table, Create a free hosted MongoDB database I PySpark Ml :nstance setup and stopping Using MongoDB and examples of MongoDB
SLO-1 HDFS overview and design, using MongoDB Atlas Working with
S-13 MongoDb,
Mapreduce - Python based Program HQL bucketing and partitioning in hive, Case study on MongoDB - Hands On PySpark Ml :Load the data ImplementingPyMongo, Analytics,Case Study
SLO-2
Case Study on HIVE
S SLO-1 Tutorial 3:Implementing HDFS Shell commands Tutorial 6: Hive Commands Tutorial 9: Mongo Db Tutorial 12:Spark Mlib examples Tutorial 15:Streaming using Kafka
14-15 SLO-2 and Python based Mapreduce programs

1. Big Data Analytics,WILEY & SAS BUSINESS SERIES

2. Simon, P., & Dexter, S. (2018). Too big to ignore: The business case for big data.
Learning
3. Baesens, B. (2014). Analytics in a big data world: The essential guide to data science and its applications.
Resources
4. Manoochehri, M. (2014). Data just right: Introduction to large-scale data & analytics.

Continuous Learning Assessment (CLA) (60% weightage)

Final Examination
Bloom’s CLA-1 CLA-2
(40% weightage)
Level of Thinking (20%) (25%) #CLA-3 (15%)
Theory Practice Theory Practice Theory Practice
Remember
Level 1 20% 20% 15% 15% 20% 15% 10%
Understand
Apply
Level 2 20% 20% 15% 15% 40% 20% 20%
Analyze
Evaluate
Level 3 10% 10% 20% 20% 40% 15% 20%
Create
Total 100 % 100 % 100 % 100 %
#CLA-3 will be a Self-Learning Component and is generally a combination from among one or more of these options:

Assignments Surprise Tests Seminars Multiple Choice Quizzes

Tech. Talks Field Visits Self-Study NPTEL/MOOC/Swayam
Mini-Projects Case-Study Group Activities Online Certifications
Presentations Debates Conference Papers Group Discussions

Course Designers
Experts from Industry Experts from Higher Technical Institutions Internal Experts
Ms Leena Shibu, Data Scientist, Great Learning Dr.N.Arunachalam, SRMIST

SRM Institute of Science and Technology - Academic Curricula – (M.Tech Regulations 2020) 46

20IT503 - Big Data Analytics - Unit4
No ratings yet
20IT503 - Big Data Analytics - Unit4
73 pages
BDF 2022 Combined 2
No ratings yet
BDF 2022 Combined 2
266 pages
Hemanth Devops
No ratings yet
Hemanth Devops
4 pages
B.tech. 3rd Yr CSE (AI) 2022 23 Revised
No ratings yet
B.tech. 3rd Yr CSE (AI) 2022 23 Revised
35 pages
Himanshu Sharma Resume V1.0
No ratings yet
Himanshu Sharma Resume V1.0
1 page
Important Questions From All Units
No ratings yet
Important Questions From All Units
3 pages
Unit-1 Introduction To Big Data
No ratings yet
Unit-1 Introduction To Big Data
33 pages
5.data Warehouse
No ratings yet
5.data Warehouse
19 pages
Can Someone Share AWS Certified Solutions Architect - Associate Dumps - Quora
No ratings yet
Can Someone Share AWS Certified Solutions Architect - Associate Dumps - Quora
2 pages
Use Case Diagrams
No ratings yet
Use Case Diagrams
8 pages
Jawad's Resume
No ratings yet
Jawad's Resume
4 pages
BDA - Unit-1
No ratings yet
BDA - Unit-1
24 pages
CS8091 Big Data Analytics Unit5
No ratings yet
CS8091 Big Data Analytics Unit5
71 pages
JNTUA R20 B.tech - CSE III IV Year Course Structure Syllabus
No ratings yet
JNTUA R20 B.tech - CSE III IV Year Course Structure Syllabus
117 pages
Data Science Kelly 2018
No ratings yet
Data Science Kelly 2018
6 pages
Demux Academy
No ratings yet
Demux Academy
205 pages
Resume Rohit Jain
No ratings yet
Resume Rohit Jain
2 pages
Brochure for ATAL Workshop
No ratings yet
Brochure for ATAL Workshop
3 pages
AD8552-Machnie Learning QB
No ratings yet
AD8552-Machnie Learning QB
25 pages
Big Data DevOps Engineer
No ratings yet
Big Data DevOps Engineer
2 pages
Hadoop Interview Questions
No ratings yet
Hadoop Interview Questions
28 pages
Big Data Nit067
No ratings yet
Big Data Nit067
1 page
Introduction To Databricks SQL Answer Guide
No ratings yet
Introduction To Databricks SQL Answer Guide
6 pages
Lab Manual: Jawaharlal Nehru Engineering College Aurangabad
No ratings yet
Lab Manual: Jawaharlal Nehru Engineering College Aurangabad
39 pages
Dice Resume CV PAVAN SRI HARSHA LAGHUVARAPU
No ratings yet
Dice Resume CV PAVAN SRI HARSHA LAGHUVARAPU
4 pages
CS8091 BDA Unit1
No ratings yet
CS8091 BDA Unit1
63 pages
Data Quality Analyst: Professiona L Profile
No ratings yet
Data Quality Analyst: Professiona L Profile
2 pages
2022 Dec. ITT401-A
No ratings yet
2022 Dec. ITT401-A
2 pages
Learneverythingai 1661068200
No ratings yet
Learneverythingai 1661068200
66 pages
In 102 DataDiscoveryGuide en
No ratings yet
In 102 DataDiscoveryGuide en
208 pages
Resume 1669363479
No ratings yet
Resume 1669363479
1 page
Artificial Intelligence Machine Learning Big Data
No ratings yet
Artificial Intelligence Machine Learning Big Data
22 pages
Harshit Resume Big Data
No ratings yet
Harshit Resume Big Data
1 page
Midterm Solution
0% (1)
Midterm Solution
7 pages
DATA ANALYTICS Lab
No ratings yet
DATA ANALYTICS Lab
3 pages
2 Rupesha Patel Oracle DBA 1+year
No ratings yet
2 Rupesha Patel Oracle DBA 1+year
2 pages
B Tech 2nd Year AIML, AIDS, Computer SC and Design 2022 23 Revised
No ratings yet
B Tech 2nd Year AIML, AIDS, Computer SC and Design 2022 23 Revised
14 pages
Resume - Venkatasriram S - 230130 - 131519
No ratings yet
Resume - Venkatasriram S - 230130 - 131519
1 page
Fahad Data Analyst
No ratings yet
Fahad Data Analyst
1 page
Unit 3 Big Data MCQ AKTU: Royal Brinkman Gartenbaubedarf
No ratings yet
Unit 3 Big Data MCQ AKTU: Royal Brinkman Gartenbaubedarf
17 pages
Manideep Lenkalapally
No ratings yet
Manideep Lenkalapally
7 pages
FS Clos
No ratings yet
FS Clos
8 pages
DWH Int Questions
100% (1)
DWH Int Questions
9 pages
Scaler DSML
No ratings yet
Scaler DSML
11 pages
File Organization in DBMS
No ratings yet
File Organization in DBMS
13 pages
Big Data Computing - Assignment 8
No ratings yet
Big Data Computing - Assignment 8
3 pages
6th Sem Big Data Assignment 1
No ratings yet
6th Sem Big Data Assignment 1
1 page
Ba Data-Science 180 en
No ratings yet
Ba Data-Science 180 en
481 pages
M.E. Cse (Ai&ml)
No ratings yet
M.E. Cse (Ai&ml)
63 pages
Siri Kademani: Website
No ratings yet
Siri Kademani: Website
4 pages
Databricks Developer Resume
No ratings yet
Databricks Developer Resume
3 pages
Dsbda Unit 2
No ratings yet
Dsbda Unit 2
155 pages
ADA - Question - Bank 2020
No ratings yet
ADA - Question - Bank 2020
15 pages
Big Data Testing
100% (1)
Big Data Testing
34 pages
Theory of Computation - Question Bank
No ratings yet
Theory of Computation - Question Bank
19 pages
391 - CS8091 Big Data Analytics - Anna University 2017 Regulation Syllabus
0% (2)
391 - CS8091 Big Data Analytics - Anna University 2017 Regulation Syllabus
2 pages
Read Online 9789351199311 Big Data Black Book Covers Hadoop 2 Mapreduce Hi PDF
50% (2)
Read Online 9789351199311 Big Data Black Book Covers Hadoop 2 Mapreduce Hi PDF
2 pages
AI Data Science
100% (1)
AI Data Science
17 pages
Brochure Curriculum PDF
No ratings yet
Brochure Curriculum PDF
23 pages
The Big Data Technology Landscape
No ratings yet
The Big Data Technology Landscape
36 pages
19 20 23 25 New
No ratings yet
19 20 23 25 New
21 pages
5-14 New
No ratings yet
5-14 New
9 pages
1 9 16 17 21 New
No ratings yet
1 9 16 17 21 New
9 pages
Sorting
No ratings yet
Sorting
4 pages
Java Package
No ratings yet
Java Package
4 pages
A Feature-Wise Attention Module Based On The Difference With Surrounding Features For Convolutional Neural Networks
No ratings yet
A Feature-Wise Attention Module Based On The Difference With Surrounding Features For Convolutional Neural Networks
10 pages
Java VIVA Questions
100% (1)
Java VIVA Questions
2 pages
ETL Testing Resume 11
No ratings yet
ETL Testing Resume 11
4 pages
Unit - Iii RDBMS Notes
No ratings yet
Unit - Iii RDBMS Notes
26 pages
Sample Paper 12 With Answer Key
No ratings yet
Sample Paper 12 With Answer Key
43 pages
Splunk Fundamentals 1 Lab Exercises: Lab Module 12 - Creating Lookups
No ratings yet
Splunk Fundamentals 1 Lab Exercises: Lab Module 12 - Creating Lookups
7 pages
Lecture 1 - Introduction To DB AND DB Environment
No ratings yet
Lecture 1 - Introduction To DB AND DB Environment
40 pages
Managing State
No ratings yet
Managing State
2 pages
DATA FILE HANDLING chapter clearance
No ratings yet
DATA FILE HANDLING chapter clearance
21 pages
How To Retain Deltas When You Change LO in Production System
No ratings yet
How To Retain Deltas When You Change LO in Production System
2 pages
TestOut LabSim
No ratings yet
TestOut LabSim
5 pages
Data Migration Strategy and Design: Project
No ratings yet
Data Migration Strategy and Design: Project
17 pages
SQL em Ingles
No ratings yet
SQL em Ingles
87 pages
Anil Ponna: SQL Server DBA
No ratings yet
Anil Ponna: SQL Server DBA
4 pages
Database Security1
No ratings yet
Database Security1
5 pages
Kulibaba Roman Serhiyovych: Senior Python Developer, $2000 - $2500 Net
No ratings yet
Kulibaba Roman Serhiyovych: Senior Python Developer, $2000 - $2500 Net
4 pages
Sap Sas Document
No ratings yet
Sap Sas Document
4 pages
Business Intelligence and Analytics
No ratings yet
Business Intelligence and Analytics
1 page
Hbase
100% (1)
Hbase
30 pages
BC 180407008
No ratings yet
BC 180407008
6 pages
Data Structure Questions and Answers For Freshers - Sanfoundry
No ratings yet
Data Structure Questions and Answers For Freshers - Sanfoundry
3 pages
Useful Netezza Queries and Tips
No ratings yet
Useful Netezza Queries and Tips
6 pages
Unit-II Unix Notes
No ratings yet
Unit-II Unix Notes
13 pages
Tryag
No ratings yet
Tryag
28 pages
Data Structures Algorithms U2
No ratings yet
Data Structures Algorithms U2
92 pages
SQL Interview Questions
No ratings yet
SQL Interview Questions
24 pages
Improved Data Collection and Monitoring Through Dynamic Data Analysis
No ratings yet
Improved Data Collection and Monitoring Through Dynamic Data Analysis
1 page
Canon 814 - 1014 - 1218 Repair 2
No ratings yet
Canon 814 - 1014 - 1218 Repair 2
7 pages
MS Access Project - MIS Course
No ratings yet
MS Access Project - MIS Course
2 pages
002.SLA Custom - Source - Print
No ratings yet
002.SLA Custom - Source - Print
8 pages
Lecture 4 PDF
No ratings yet
Lecture 4 PDF
38 pages
Event Log Lexmark Cx522ade
No ratings yet
Event Log Lexmark Cx522ade
186 pages

Big Data For Machine Learning - Syllabus

Uploaded by

Big Data For Machine Learning - Syllabus

Uploaded by

Course Course Course L T P C

20PITE54J BIGDATA FOR MACHINE LEARNING E Professional Elective

Pre-requisite Co-requisite Progressive

Course Learning Rationale

Life Long Learning

1. Big Data Analytics,WILEY & SAS BUSINESS SERIES

Continuous Learning Assessment (CLA) (60% weightage)

Assignments Surprise Tests Seminars Multiple Choice Quizzes

You might also like