Question Bank - Big Data

This question bank for a big data module covers five units in three sections: Section A (5-mark questions), Section B (9-mark questions), and Section C (12-mark questions). The questions cover big data tools, techniques, and applications such as Hadoop, Hive, Spark, and analytics.

Uploaded by

smullai404

Question Bank

Module   No. of   Part A (5 Marks)     Part B (9 Marks)     Part C (12 Marks)    Total
No.      Hours    Questions  Marks     Questions  Marks     Questions  Marks     Marks
I        09       2          10        -          -         -          -         10
II       09       1          5         1          9         -          -         14
III      09       1          5         1          9         -          -         14
IV       09       -          -         -          -         1          12        12
V        09       1          5         1          9         -          -         14
Total             5          25        3          18        1          12

SECTION – A

Question 1 (Unit 1)
Five marks questions

1. What are the advantages of big data? What is big data?

2. What are the three main characteristics of big data? What are the sources of big
data?
3. What are the challenges associated with big data?
4. How is big data analysed?
5. What are the benefits of utilizing big data?
6. What is the term "big data" referring to? How is big data used in healthcare?
7. What role does big data play in finance? How does big data benefit the retail
industry?
8. What are some applications of big data in transportation? How does big data
contribute to targeted advertising?
9. What are the applications of big data in government? How is big data utilized in
sports?
10. How does big data support marketing analytics? What are some applications of
big data in energy management?

Question 2 (Unit 2)
Five marks questions
1. What is Hadoop? How do data visualization tools help in working with big data?
2. Explain Apache Spark and Apache Flink.
3. What do you understand by Apache Cassandra? Write a short note on Apache
Kafka.
4. What is Apache Pig? What do you understand by Apache Zeppelin?
5. Write a short note on Apache Storm. Which Apache technique can be used for
large-scale datasets?
6. Explain Elasticsearch. What is Apache Mahout?
7. What is Apache Drill? How is TensorFlow useful in big data?
8. How can you create a probability distribution plot in Python?
9. What is Apache Drill? Why do we use Splunk?
10. What is Databricks? What are the uses of KNIME in big data?

Question 3 (Unit 3)
Five marks questions
1. What is Hive?
2. What is the usage of Hive? What are some of the features of Hive?
3. What is a Hive variable? What do we use it for?
4. What are the limitations of Hive? How to load data into a Hive table?
5. How to query data in Hive? How to insert data into a Hive table?
6. How can you perform linear regression analysis in Python?
7. How to join tables in Hive? How to create partitions in Hive?
8. How to load data into a partition in Hive? How to create an external table in
Hive?
9. How to perform aggregations in Hive? What is the present version of Hive?
Explain ACID transactions in Hive.
10. When should we use SORT BY instead of ORDER BY?

Question 4 (Unit 4)
Five marks questions
1. What is the role of big data in understanding the genetic diversity and evolution
of species? How does big data help in tracking and studying the spread of
infectious diseases and their evolution?
2. What are some examples of how big data has contributed to our understanding
of human evolution? How does the analysis of big data contribute to the study of
evolutionary relationships among different species?
3. How has big data improved our understanding of the impact of environmental
factors on evolution?
4. What role does big data play in studying the evolution of drug resistance in
pathogens? How does big data facilitate the study of evolutionary dynamics in
complex ecosystems?
5. How does the analysis of big data contribute to understanding the role of genetic
mutations in evolutionary processes?
6. What are some ethical considerations associated with the use of big data in
evolutionary research? What is HDFS (Hadoop Distributed File System)?
7. What are the key features of HDFS? How does HDFS achieve fault tolerance?
8. How does HDFS support high throughput for data-intensive workloads? How
does HDFS handle large files?
9. What is data locality in the context of HDFS?
10. How does HDFS ensure scalability? What are the main components of HDFS
architecture?

Question 5 (Unit 5)
Five marks questions
1. What is Apache Ambari and what are its key features?
2. How does Ambari simplify the management of Hadoop clusters?
3. How does Ambari enable real-time monitoring and alerting?
4. What are some key features of Apache Ambari?
5. What is the role of the Ambari web UI?
6. What is the architecture of Apache Ambari?
7. What are Ambari stacks and services?
8. What are the benefits of using Ambari blueprints?
9. How can Apache Ambari be installed and configured?
10. What is the significance of Ambari views?

SECTION – B

Question 1 (Unit 1)
Each question carries NINE marks
1. How does big data contribute to cybersecurity? What are some use cases of big
data in e-commerce?
2. How is big data utilized in manufacturing? Write a short note on supply chain
management.
3. What are some use cases of big data in transportation and logistics? Write a short
note on fleet management.
4. How does big data contribute to personalized healthcare? How is big data useful in
remote patient monitoring?
5. What are some use cases of big data in the entertainment industry? Write about
audience analysis.
6. How does big data support urban planning and smart cities? What are some use
cases of big data in e-commerce? How is big data utilized in manufacturing?
7. What are some use cases of big data in transportation and logistics? How does
big data contribute to personalized healthcare? What are some use cases of big
data in the entertainment industry?

Question 2 (Units 2 & 3)
Each question carries NINE marks
1. What is the role of Hive in a distributed system? How is a query processed in Hive?
2. What are the common uses of Hive?
3. What is ZooKeeper? What are the benefits of using ZooKeeper?
4. What is partitioning in Hive? What are the components of Apache HBase?
5. When is it appropriate to use a NoSQL database?
6. What are the advantages of Apache Spark?
7. How do we use Spark?

Question 3 (Unit 5)
Each question carries NINE marks

1. What is dynamic partitioning and when is it used? What is indexing and why do
we need it? Explain the different types of joins in Hive.
2. How does data transfer happen from HDFS to Hive? How can you create a
temporary table in Hive?
3. How can you perform a subquery in Hive? How can you use a user-defined
function (UDF) in Hive? How can you export data from Hive to external
systems?
4. How can you monitor Hive jobs? How can you optimize Hive queries for
performance? How can you perform data transformations in Hive?
5. How can you comment in Hive scripts? How can you run a Hive script? How can
you filter data in Hive?
6. Write queries for the following:
   • To drop a Hive table
   • To display the schema of a Hive table
   • To perform sorting in Hive
   • To rename a Hive table
   Write queries for the following:
   • To calculate the percentile of a column in Hive
   • To handle missing values in Hive
   • To calculate the difference between two dates in Hive
7. How can you round a decimal value in Hive? How can you concatenate strings in
Hive? How can you perform a case-insensitive search in Hive? How can you
handle duplicates in Hive? How can you perform a self-join in Hive?

SECTION - C
Question 1
Each question carries TWELVE marks
1. How does big data impact industries and sectors? How did the Hadoop
framework influence the history of big data? What factors contributed to the
growth of big data? (Unit 1)
2. What are the different types of tools in big data? When do we use Apache Drill
over Apache Hive? (Unit 2)
3. How is Apache Spark different from MapReduce? Suppose that I want to
monitor all the open and aborted transactions in the system along with the
transaction id and the transaction state. Can this be achieved using Apache
Hive? (Unit 3)
4. What are the different components of a Hive architecture? (Unit 4)
Write queries to perform aggregate functions:
   • To calculate the minimum value of a column in Hive
   • To calculate the maximum value of a column in Hive
   • To calculate the sum of a column in Hive
   • To calculate the average of a column in Hive
   • To calculate the count of a column in Hive
5. What are the different types of tables available in Hive? What is the difference
between external and managed tables in Hive? What do you understand by a
Hive Metastore? What is the difference between local and remote metastores in
Hive? (Unit 4)
6. Is it possible to run a Unix shell command from Hive? Give an example to
demonstrate. What do you understand by bucketing in Hive? Why do we need
a bucket? Can you list a few commonly used Hive services? (Unit 4)
7. Examine the impact of big data analytics in various industries such as
healthcare, finance, retail, and manufacturing. Provide specific examples of how
big data analytics has led to innovations, improved decision-making, and
competitive advantages in these sectors. Discuss potential future trends in big
data analytics and its continued significance in shaping business strategies. (Unit 1)
