Question Bank
The wait is over! We are delighted to announce that we have compiled a comprehensive file containing
all past year questions, important model papers and IA question papers for all 5 modules of Big Data
Analytics. This carefully curated resource is designed to provide everything you need for your exam
preparation in one place. We wish you great success in your studies and for your upcoming exam!
Warm regards,
VTU Padhai Team.
Module 2:
Introduction to Hadoop, Hadoop Distributed File System Basics, Essential Hadoop Tools
1. What is Hadoop? Explain the core components of Hadoop.
2. Explain the Hadoop ecosystem with a neat diagram
3. What are the features of Hadoop?
4. Explain the physical organisation of Hadoop
5. Explain the Hadoop MapReduce framework and programming model
6. Briefly describe the YARN-based execution model
Module 3: NoSQL Big Data Management, MongoDB and Cassandra
Module 4
1. Explain MapReduce Map tasks with the MapReduce programming model
2. Discuss how to compose MapReduce for calculations
3. Illustrate different relational algebraic operations in MapReduce
4. Discuss Hive
i) Features
ii) Architecture
iii) Installation Process
5. Compare Hive and RDBMS
6. Explain Hive data types and file formats
7. Discuss the Hive data model with data flow sequences
8. Explain Hive built-in functions
9. Define HiveQL. Write a program to perform create, show, drop and query operations on a
database for a toy company
10. Explain table partitioning, bucketing, views, joins and aggregation in HiveQL
11. Explain Pig architecture with applications and features.
12. Give the differences between
i) Pig and MapReduce
ii) Pig and SQL
13. Explain the Pig Latin data model with Pig installation steps
14. Explain Pig relational operations
15. Illustrate user-defined functions in Pig with a programming example.
Module 5
Machine Learning Algorithms for Big Data Analytics, Text, Web Content, Link and Social Network Analytics
1. Explain the following
i) Text mining with text analytics process pipeline
ii) Text mining process and phases
iii) Text mining challenges
2. Discuss the following
i) Naïve Bayes analysis
ii) Support vector machines
iii) Binary classification
3. Discuss
i) Web mining
ii) Web content
iii) Web usage analytics
4. Explain
i) PageRank
ii) Structure of the Web and analysing a Web graph for authorities
5. What are Hubs and Authorities?
6. Explain social networks as graphs and social network analytics
7. Discuss
i) Clustering in social networks
ii) SimRank
iii) Counting triangles and graph matches
iv) Direct discovery of communities
8. Discuss Analysis of Variance (ANOVA) and correlation indicators of linear relationships
9. Describe how regression analysis predicts the value of the dependent variable in the case of
linear regression
10. In machine learning, explain linear and non-linear relationships with graphs
11. Explain multiple regression, with examples of its use in forecasting and optimisation
12. Explain K-means clustering with a neat diagram.
13. Explain the Naïve Bayes theorem with an example.
14. Explain the Apriori algorithm for evaluating candidate itemsets