21cs71BDA Question Bank

Uploaded by

someshgowda7975

Big Data Analytics Question Bank 21CS71 [based on previous years' papers and model question papers]
Module 1: Introduction to Big Data Analytics
1. Define Big Data. Explain the evolution of Big Data and its characteristics.
2. What is grid computing? List and explain the features and drawbacks of grid computing.
3. Discuss the functions of each of the five layers in Big Data architecture design.
4. Illustrate the various phases involved in Big Data Analytics with a neat diagram.
5. Discuss the evolution of Big Data.
6. Explain the characteristics of Big Data.
7. With a neat block diagram, explain data architecture design.
8. Write a note on analytical scalability to Big Data and Massively Parallel Processing platforms.
9. Highlight Big Data Analytics with one case study.
10. Define Big Data. Explain the classification of Big Data.
11. Define scalability and its types, with examples.
12. Explain the functions of each layer in Big Data architecture design with a diagram.
13. Define data preprocessing. Explain in brief the need for preprocessing.
14. Explain the following terms: (i) Scalability and parallel processing (ii) Grid and cluster computing.
15. What is cloud computing? Explain the different services of the cloud.
16. Explain any two applications of Big Data.
17. How does the Berkeley Data Analytics Stack help in analytics tasks?

Module 2: Introduction to Hadoop (T1), Hadoop Distributed File System Basics (T2), Essential Hadoop Tools (T2)
1. Illustrate the Hadoop core components with a neat diagram.
2. Discuss the Hadoop system and ecosystem components in four layers.
3. Illustrate the YARN-based execution model and its functions with a neat diagram.
4. Discuss the Apache Sqoop import and export methods with a neat diagram.
5. What are the core components of Hadoop? Explain each of its components in brief.
6. Explain the Hadoop Distributed File System.
7. Define the MapReduce framework and its functions.
8. Write down the steps involved in a request to MapReduce and the types of processes in MapReduce.
9. Write a short note on the Flume Hadoop tool.
10. What is HDFS? Highlight the important design features of HDFS.
11. Bring out the concepts of HDFS block replication with an example.
12. Explain the Apache Sqoop import and export methods with a neat diagram.
13. Demonstrate any six HBase commands with output.
14. Write a short note on Apache Hive.
15. Explain Apache Oozie with a neat diagram.
16. Explain the YARN application framework.

Module 3: NoSQL Big Data Management, MongoDB and Cassandra


1. Discuss the NoSQL data stores and their characteristic features.
2. Illustrate the key-value pairs in data architectural patterns with an example.
3. Discuss the functions of the MongoDB query language and database commands.
4. Illustrate the CQL commands and their functionality.
5. Define key-value store with an example. What are the advantages of a key-value store?
6. Write down the steps for a client to read and write values using a key-value store. What are the typical uses of a key-value store?
7. Discuss the characteristics of NoSQL data stores along with the features of NoSQL transactions.
8. With neat diagrams, explain the shared-nothing architecture for Big Data tasks. Explain the following distribution models: (i) Single server model (ii) Sharding very large databases (iii) Master-slave distribution model (iv) Peer-to-peer distribution model.
9. Explain NoSQL data stores and their characteristics.
10. Describe the working principle of the CAP theorem.
11. Demonstrate the working of a key-value store with an example.
12. Describe the working principle of the CAP theorem.
13. Demonstrate the working of a key-value store with an example.
14. Describe the features of MongoDB and its industrial applications.
15. Explain NoSQL data architecture patterns.
16. Explain the MongoDB database. [10 marks]

Module 4: MapReduce, Hive and Pig


1. Describe the MapReduce execution steps with a neat diagram.
2. Explain key-value pairing in MapReduce.
3. Discuss the functions of Group By, partitioning and combining, using one example for each.
4. Illustrate the main features and architecture of Hive with a neat diagram.
5. Discuss the Pig Latin data types with examples.
6. With a neat diagram, explain the process in MapReduce when a client submits a job.
7. Explain Hive integration and the workflow steps involved, with a diagram.
8. Use HiveQL for the following:
a. Create a table with a partition
b. Add, rename and drop a partition of a table
9. What is Pig in Big Data? Explain the features of Pig.
10. Describe the Map tasks, Reduce tasks and the MapReduce execution process.
11. Describe the Hive architecture and its characteristics.
12. Demonstrate the Pig architecture for script dataflow and processing.
13. Differentiate between Pig and MapReduce. Give an industrial application for each.

Module 5: Machine Learning Algorithms for Big Data Analytics & Text, Web Content, Link, and Social Network Analytics
1. Discuss analysis of variance (ANOVA) and the correlation indicators of a linear relationship.
2. Describe how regression analysis predicts the value of the dependent variable in the case of linear regression.
3. Illustrate the various phases in the text mining process pipeline.
4. Describe web content mining and the three phases of web usage mining.
5. In machine learning, explain linear and non-linear relationships with essential graphs.
6. Draw the block diagram of the text mining process and explain its phases.
7. Define multiple regression. Write down examples of forecasting and optimization in regression.
8. Explain the parameters in social graph network topological analysis using centralities and PageRank.
9. Explain simple linear regression analysis.
10. Demonstrate frequent itemset mining and association rule mining.
11. Explain the purpose of web usage analytics and the significance of web graphs.
12. What is machine learning? Explain the different types of regression analysis.
13. Explain K-means clustering with a neat diagram.
14. Explain the Naïve Bayes theorem with an example.
Reference books:
1. Raj Kamal and Preeti Saxena, "Big Data Analytics: Introduction to Hadoop, Spark, and Machine Learning", McGraw Hill Education, 2018. ISBN: 9789353164966, 9353164966 [refer for modules 1, 2 (half), 3, 4, 5]
2. Douglas Eadline, "Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem", 1st Edition, Pearson Education, 2016. ISBN-13: 978-9332570351 [module 2 only]
