Bigdata Syllabus


Module I: Getting an Overview of Big Data Number of hours (LTP) 6 0 6
Big Data definition, History of Data Management, Structuring Big Data, Elements of Big Data, Big Data Analytics.

Exploring the Use of Big Data in a Business Context: Use of Big Data in Social Networking; Use of Big Data in Preventing Fraudulent Activities in the Insurance Sector and in the Retail Industry.
Learning Outcomes:
After completion of this unit, the student will be able to:

1. Learn various sources of data and forms of data generation. (L2)
2. Understand the evolution and elements of Big Data. (L2)
3. Explore different opportunities available in the career path. (L3)
4. Understand the role and importance of Big Data in various domains. (L2)

Module II: Handling Big Data Number of hours (LTP) 6 0 6


Distributed and parallel computing for Big Data, Introducing Hadoop, Cloud computing and
Big Data, In-memory Computing Technology for Big Data.
Understanding Hadoop Ecosystem: Hadoop Ecosystem, Hadoop Distributed File System,
MapReduce, Hadoop YARN, Introducing HBase, Combining HBase and HDFS, Hive, Pig and
Pig Latin, Sqoop, ZooKeeper, Flume, Oozie.

Learning Outcomes:
After completion of this unit, the student will be able to:

1. Identify the difference between distributed and parallel computing. (L3)
2. Learn the importance of virtualization in Big Data. (L2)
3. Learn the details of Hadoop and Cloud Computing. (L2)
4. Learn the architecture and features of HDFS. (L2)
Module III: Understanding Big Data Technology Foundations Number of hours (LTP) 6 0 6
The MapReduce Framework, Techniques to Optimize MapReduce Jobs, Uses of MapReduce,
Role of HBase in Big Data Processing.
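The map, shuffle/sort, and reduce phases of the MapReduce framework covered in this module can be sketched in plain Python, with no cluster required. This is an illustrative simulation of the three phases, not the Hadoop API; all function names here are made up for the sketch.

```python
from collections import defaultdict

def map_phase(line):
    # Like a Hadoop mapper: emit (word, 1) pairs for each input line
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Like the shuffle/sort step: group all values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Like a Hadoop reducer: aggregate all values for one key
    return key, sum(values)

lines = ["big data big ideas", "data beats opinions"]
pairs = [p for line in lines for p in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'big': 2, 'data': 2, 'ideas': 1, 'beats': 1, 'opinions': 1}
```

In real Hadoop, the map calls run in parallel across cluster nodes and the framework performs the shuffle over the network; the per-phase logic is the same.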
Exploring the Big Data Stack, Virtualization and Big Data, Virtualization approaches.
Learning Outcomes:
After completion of this unit, the student will be able to:
1. Understand Hadoop Ecosystem, MapReduce and HBase. (L2)
2. Apply the technique in optimizing MapReduce jobs. (L3)
3. Explore the layers of Big Data Stack. (L2)
4. Learn virtualization approaches in handling Big Data operations. (L2)

Module IV: HIVE and PIG Number of hours (LTP) 6 0 6


Exploring Hive: Introducing Hive, Getting Started with Hive, Hive Services, Data Types,
Built-in Functions, Hive DDL, Data Manipulation, Data Retrieval Queries, Using Joins.
Analysing Data with Pig: Introducing Pig, Running Pig, Getting Started with Pig Latin, Working
with Operators in Pig, Debugging Pig, Working with Functions in Pig, Error Handling in Pig.

Learning Outcomes:
After completion of this unit, the student will be able to:
1. Learn the working of Hive and query execution. (L2)
2. Learn the importance of Pig. (L2)
3. Choose the operators in Pig. (L2)

Module V: SPARK Number of hours (LTP) 6 0 6


Introduction, Spark Jobs and API, Spark 2.0 Architecture, Resilient Distributed Datasets:
Internal Working, Creating RDDs, Transformations, Actions. DataFrames: Python-to-RDD
Communication, Speeding up PySpark with DataFrames, Creating DataFrames and Simple
DataFrame Queries, Interoperating with RDDs, Querying with DataFrames.
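The distinction between transformations (lazy, merely recorded) and actions (which trigger execution) is central to the RDD topics above. The toy class below is a stand-in written for this syllabus, not the pyspark API, and illustrates only the lazy-chaining idea:

```python
class ToyRDD:
    """A tiny stand-in for a Spark RDD: transformations are lazy,
    an action runs the whole recorded chain. Not the pyspark API."""
    def __init__(self, data, ops=()):
        self.data = data        # source data (a plain Python iterable here)
        self.ops = list(ops)    # deferred transformations, in order

    def map(self, f):           # transformation: just record it
        return ToyRDD(self.data, self.ops + [("map", f)])

    def filter(self, f):        # transformation: just record it
        return ToyRDD(self.data, self.ops + [("filter", f)])

    def collect(self):          # action: only now does the pipeline run
        out = list(self.data)
        for kind, f in self.ops:
            out = [f(x) for x in out] if kind == "map" else [x for x in out if f(x)]
        return out

rdd = ToyRDD(range(10))
result = rdd.map(lambda x: x * x).filter(lambda x: x % 2 == 0).collect()
print(result)  # [0, 4, 16, 36, 64]
```

In real Spark, laziness lets the engine fuse the map and filter into one pass over each partition before any data moves; the chaining style above carries over directly to pyspark's `rdd.map(...).filter(...).collect()`.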
Learning Outcomes:
After completion of this unit, the student will be able to:

1. Get an overview of Spark technology and its job organization. (L2)
2. Understand the schema-less data structures available in PySpark. (L3)
3. Get an overview of DataFrames, which bridge the gap between Scala and Python in
terms of efficiency. (L2)
4. Handle a real-time Big Data application. (L4)

Textbook(s)
1. DT Editorial Services, Big Data (Black Book), Dreamtech Press, 2016.
2. Tomasz Drabas and Denny Lee, Learning PySpark, Packt Publishing, 2017.
3. Tom White, Hadoop: The Definitive Guide, 4/e, O'Reilly, 2015.
Reference Book(s)
1. Bill Franks, Taming the Big Data Tidal Wave, 1/e, Wiley, 2012.
2. Frank J. Ohlhorst, Big Data Analytics, 1/e, Wiley, 2012.


Course Outcomes:
1. Demonstrate Big Data concepts for real-world data analysis. (L1)
2. Develop MapReduce concepts. (L2)
3. Learn how Pig Latin is used for programming in Hadoop. (L3)
4. Illustrate the Hadoop API for the MapReduce framework. (L4)
5. Develop basic programs for the MapReduce framework, particularly driver code,
mapper code, and reducer code. (L5)
6. Learn Apache Spark fundamentals: RDDs and DataFrames.
Lab Experiments for Big Data

1 Installation of a Hadoop cluster:
a. Standalone mode
b. Pseudo-distributed mode
c. Fully distributed mode
2 Perform file management tasks in Hadoop:
a. Creating directory
b. List the contents of a directory
c. Upload and download a file
d. See contents of a file
e. Copy a file from source to destination
f. Move file from source to destination.
3 MapReduce programming:
a. Wordcount program using Java
b. Wordcount program using Python
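For the Python variant, wordcount is typically run through Hadoop Streaming: the mapper and reducer are separate scripts that read stdin and write tab-separated lines to stdout, with Hadoop sorting between them. A minimal sketch, testable without a cluster (splitting the two functions into mapper.py and reducer.py files is the usual convention, not a requirement of the sketch):

```python
from itertools import groupby

def mapper(lines):
    # mapper.py: emit "word<TAB>1" for every word seen
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(lines):
    # reducer.py: input arrives sorted by key (Hadoop does this sort),
    # so consecutive lines with the same word can be summed with groupby
    parsed = (line.split("\t") for line in lines)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

# Local stand-in for: cat input.txt | ./mapper.py | sort | ./reducer.py
sample = ["big data big ideas", "data beats opinions"]
for out in reducer(sorted(mapper(sample))):
    print(out)
```

The `sorted()` call plays the role of Hadoop's shuffle/sort; in a real streaming job the same scripts are passed via `-mapper` and `-reducer` and the framework handles the sort and distribution.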
4 Databases, Tables, Views, Functions and Indexes
5 Write a program to perform matrix multiplication in Hadoop with a matrix size of n×n,
where n > 1000.
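The standard MapReduce formulation of matrix multiplication keys every partial product A[i][k]·B[k][j] by its output cell (i, j) and sums in the reduce step. A pure-Python sketch of that pattern on a small matrix (the experiment itself requires n > 1000 on Hadoop; the function name is ours):

```python
from collections import defaultdict

def mapreduce_matmul(A, B):
    """Multiply two n x n matrices using the map/reduce pattern."""
    n = len(A)
    # Map: for every (i, k, j), emit ((i, j), A[i][k] * B[k][j])
    pairs = [((i, j), A[i][k] * B[k][j])
             for i in range(n) for k in range(n) for j in range(n)]
    # Reduce: sum the partial products belonging to each output cell
    cell = defaultdict(int)
    for key, value in pairs:
        cell[key] += value
    return [[cell[(i, j)] for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mapreduce_matmul(A, B))  # [[19, 22], [43, 50]]
```

In the Hadoop version, the mapper emits the same ((i, j), product) pairs and the reducer sums them; the key choice is what lets each output cell be computed independently on a different node.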
7 Given the following table schema:
Employee_table {ID: INT, Name: VARCHAR(10), Age: INT, Salary: INT}
Loan_table {LoanID: INT, ID: INT, Loan_applied: BOOLEAN, Loan_amt: INT}
a. Create a database and the above tables in Hive.
b. Insert records into the tables.
c. Write an SQL query to retrieve the details of employees who have applied for a loan.
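The CREATE TABLE / INSERT / JOIN statements for this experiment can be prototyped with Python's built-in sqlite3 before moving to Hive, since the core SELECT/JOIN syntax overlaps with HiveQL (Hive differs in details, e.g. it usually bulk-loads data rather than inserting row by row). The sample records below are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
# Tables match the schema given in experiment 7
cur.execute("CREATE TABLE Employee_table (ID INT, Name VARCHAR(10), Age INT, Salary INT)")
cur.execute("CREATE TABLE Loan_table (LoanID INT, ID INT, Loan_applied BOOLEAN, Loan_amt INT)")
cur.executemany("INSERT INTO Employee_table VALUES (?, ?, ?, ?)",
                [(1, "Asha", 30, 50000), (2, "Ravi", 41, 65000)])
cur.executemany("INSERT INTO Loan_table VALUES (?, ?, ?, ?)",
                [(101, 1, True, 200000)])
# Employees who have applied for a loan: join on the shared ID column
rows = cur.execute("""
    SELECT e.ID, e.Name, e.Age, e.Salary
    FROM Employee_table e JOIN Loan_table l ON e.ID = l.ID
    WHERE l.Loan_applied
""").fetchall()
print(rows)  # [(1, 'Asha', 30, 50000)]
```

The same SELECT ... JOIN ... ON query runs unchanged in the Hive shell once the tables exist there.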
8 Write a query to create a table which stores the records of employees working in the same
department together in the same sub-directory in HDFS. The schema for the table is given
below: Emp_table {id, name, dept, yoj}
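In Hive this is solved by declaring the table PARTITIONED BY (dept STRING), which makes Hive store each department's rows in its own HDFS sub-directory (e.g. .../emp_table/dept=HR/). That on-disk layout can be mimicked on a local filesystem to see what partitioning produces; the directory names and sample records below are illustrative:

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

records = [  # Emp_table: {id, name, dept, yoj}
    (1, "Asha", "HR", 2019),
    (2, "Ravi", "IT", 2020),
    (3, "Meena", "HR", 2021),
]

warehouse = Path(tempfile.mkdtemp()) / "emp_table"
by_dept = defaultdict(list)
for rec in records:
    by_dept[rec[2]].append(rec)   # group rows by the partition column (dept)

# One sub-directory per partition value, mirroring Hive's HDFS layout
for dept, rows in by_dept.items():
    part_dir = warehouse / f"dept={dept}"
    part_dir.mkdir(parents=True)
    with open(part_dir / "part-00000", "w", newline="") as f:
        csv.writer(f).writerows(rows)

print(sorted(p.name for p in warehouse.iterdir()))  # ['dept=HR', 'dept=IT']
```

Because each department lives in its own directory, a query filtering on dept can skip every other partition entirely; this partition pruning is the point of the exercise.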
9 Given the following table schemas:
Customer table: {ID, NAME, AGE, ADDRESS, SALARY}
Order table: {OID, DATE, CUSTOMER_ID, AMOUNT}
Create the above tables in Hive and insert transaction records into them, then
write an SQL query to find the details of customers who have made an order.
10 Understanding Spark
