Question Bank

Uploaded by

kpriya1122334455
Copyright
© All Rights Reserved

Dear VTU Padhai Family,

The wait is over! We are delighted to announce that we have compiled a comprehensive file containing
all previous years' questions, important model papers and IA question papers for all 5 modules of Big Data
Analytics. This carefully curated resource is designed to provide everything you need for your exam
preparation in one place. We wish you great success in your studies and in your upcoming exam!
Warm regards,
VTU Padhai Team.

Subject: Big Data Analytics Subject Code: 21CS71


Question Bank
Module 1: Introduction to Big Data Analytics
1. What is Big Data? Give the different definitions of Big Data
2. Explain the characteristics of Big Data. Explain the characteristics of
Big Data with respect to images taken by a satellite
3. Explain CAP theorem in detail.
4. Explain the classification of Data with examples.
5. Discuss the Big Data classification methods and types with examples.
6. Explain scalability and Data Processing in Big Data.
7. Explain the architecture of Big Data.
8. Explain the following:
1. Data Sources
2. Data Quality
3. Data Preprocessing
9. Discuss Data store
10. List the characteristics of Big Data platform
11. How can a toy company optimize its benefits using Big Data Analytics?
12. Explain the usage of Big Data analytics:
i) to detect marketing frauds
ii) in medicine
iii) in advertising
13. How is Big Data used in
i) a chocolate company
ii) the automobile industry
14. What is grid computing? List and explain the features and drawbacks of grid computing

Module 2: Introduction to Hadoop, Hadoop Distributed File System Basics, Essential Hadoop Tools
1. What is Hadoop? Explain the core components of Hadoop.
2. Explain the Hadoop ecosystem with a neat diagram
3. What are the features of Hadoop?
4. Explain Hadoop Physical Organisation
5. Explain Hadoop MapReduce Framework and Programming Model
6. Briefly explain the YARN-based execution model

Module 3: NoSQL Big Data Management, MongoDB and Cassandra
1. Give the comparison between
i) NoSQL and SQL
ii) MongoDB and RDBMS
2. List and compare the features of Big Table, RC, ORC and Parquet data stores
3. With an example, explain the key-value store
4. List the pros and cons of distribution using sharding
5. Discuss the characteristics of
1) NoSQL
2) MongoDB
3) Cassandra
6. Describe the features of MongoDB, and its industrial applications
7. What are the different ways of handling Big Data problems?
8. Define NoSQL. Explain Big Data NoSQL with its features, transactions and solutions
9. Describe graph database characteristics, typical uses and examples
10. With a neat diagram, explain the shared-nothing architecture for Big Data tasks.
11. Explain NoSQL Data Architecture Patterns
12. Give the characteristics of schemaless models
13. What are BASE properties?
14. Write MongoDB code to
1) create a collection
2) add an array to a collection
15. Give examples of CQL commands.
16. What are the drawbacks of Big Data, and how can Big Data problems be overcome?

Module 4

1. Explain MapReduce map tasks with the MapReduce programming model
2. Discuss how to compose MapReduce for calculations
3. Illustrate different relational algebraic operations in MapReduce
4. Discuss Hive
i) Features
ii) Architecture
iii) Installation Process
5. Compare Hive and RDBMS
6. Explain Hive data types and file formats
7. Discuss Hive Data Model with data flow sequences
8. Explain Hive built-in functions
9. Define HiveQL. Write a program to perform create, show, drop and query operations on a
database for a toy company
10. Explain table partitioning, bucketing, views, join and aggregation in HiveQL
11. Explain Pig architecture with applications and features.
12. Give the differences between
i) Pig and MapReduce
ii) Pig and SQL
13. Explain the Pig Latin data model with Pig installation steps
14. Explain Pig relational operations
15. Illustrate user-defined functions in Pig with a programming example.

Module 5: Machine Learning Algorithms for Big Data Analytics, Text, Web Content, Link and Social Network Analytics
1. Explain the following
i) Text mining with text analytics process pipeline
ii) Text mining process and phases
iii) Text mining challenges
2. Discuss the following
i) Naïve Bayes analysis
ii) Support vector machines
iii) Binary classification
3. Discuss
i) Web Mining
ii) Web content
iii) Web usage analytics
4. Explain
i) PageRank
ii) Structure of the Web and analysing Web graph authorities
5. What are Hubs and Authorities?
6. Explain Social Network as Graph and Social network analytics
7. Discuss
i) Clustering in social networks
ii) SimRank
iii) Counting triangles and graph matches
iv) Direct discovery of communities
8. Discuss Analysis of Variance (ANOVA) and correlation as indicators of a linear relationship
9. Describe how regression analysis predicts the value of the dependent variable in the case of
linear regression
10. In machine learning, explain linear and non-linear relationships with graphs
11. Explain multiple regression. Give examples of its use in forecasting and optimisation
12. Explain with neat diagram K-means clustering.
13. Explain the Naïve Bayes theorem with an example.
14. Explain the Apriori algorithm to evaluate candidate itemsets
