Question Bank - Big Data

The document provides a question bank for a big data module divided into 5 units and 3 sections. It contains 50 questions in Section A worth 5 marks each, 3 questions in Section B worth 9 marks each, and 1 question in Section C worth 12 marks. The questions cover topics related to big data tools, techniques, and applications like Hadoop, Hive, Spark, analytics etc.

Uploaded by smullai404

Question Bank

Module No.  No. of Hours  Part A (5 Marks)    Part B (9 Marks)    Part C (12 Marks)   Total Marks
                          Questions  Marks    Questions  Marks    Questions  Marks
I           09            2          10       -          -        -          -        10
II          09            1          5        1          9        -          -        14
III         09            1          5        1          9        -          -        14
IV          09            -          -        -          -        1          12       12
V           09            1          5        1          9        -          -        14
Total                     5          25       3          18       1          12

SECTION – A

Question 1 (Unit 1)
Five marks questions

1. What is big data? What are the advantages of big data?
2. What are the three main characteristics of big data? What are the sources of big
data?
3. What are the challenges associated with big data?
4. How is big data analysed?
5. What are the benefits of utilizing big data?
6. What does the term "big data" refer to? How is big data used in healthcare?
7. What role does big data play in finance? How does big data benefit the retail
industry?
8. What are some applications of big data in transportation? How does big data
contribute to targeted advertising?
9. What are the applications of big data in government? How is big data utilized in
sports?
10. How does big data support marketing analytics? What are some applications of
big data in energy management?

Question 2 (Unit 2)
Five marks questions
1. What is Hadoop? How do data visualization tools help in working with big data?
2. Explain Apache Spark and Apache Flink.
3. What do you understand by Apache Cassandra? Write a short note on Apache
Kafka.
4. What is Apache Pig? What do you understand by Apache Zeppelin?
5. Write a short note on Apache Storm. Which Apache technology can be used for
large-scale datasets?
6. Explain Elasticsearch. What is Apache Mahout?
7. What is Apache Drill? How is TensorFlow useful in big data?
8. How can you create a probability distribution plot in Python?
9. What is Apache Drill? Why do we use Splunk?
10. What is Databricks? What are the uses of KNIME in big data?
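
For question 8 above, one possible starting point is sketched below. It is a minimal illustration under stated assumptions, not a model answer: it uses only the Python standard library and draws the distribution as text bars. The function names `empirical_distribution` and `text_plot`, and the dice data, are hypothetical and chosen only for this example. In practice a plotting library such as Matplotlib would usually draw the chart, for example with `matplotlib.pyplot.hist` and `density=True`.

```python
import random
from collections import Counter


def empirical_distribution(samples):
    """Return {value: probability} estimated from the samples, sorted by value."""
    counts = Counter(samples)
    total = len(samples)
    return {value: count / total for value, count in sorted(counts.items())}


def text_plot(distribution, width=40):
    """Render the distribution as horizontal bars, one line per value."""
    peak = max(distribution.values())
    lines = []
    for value, prob in distribution.items():
        # Scale each bar relative to the most likely value.
        bar = "#" * round(prob / peak * width)
        lines.append(f"{value:>3} | {bar} {prob:.3f}")
    return "\n".join(lines)


# Hypothetical data: the sum of two fair dice, rolled 10,000 times.
rng = random.Random(42)  # fixed seed so the run is repeatable
rolls = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(10_000)]

dist = empirical_distribution(rolls)
print(text_plot(dist))
```

The probabilities are relative frequencies, so they sum to 1, and the bar lengths show the shape of the distribution at a glance.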

Question 3 (Unit 3)
Five marks questions
1. What is Hive?
2. What is Hive used for? What are some of the features of Hive?
3. What is a Hive variable? What do we use it for?
4. What are the limitations of Hive? How to load data into a Hive table?
5. How to query data in Hive? How to insert data into a Hive table?
6. How can you perform linear regression analysis in Python?
7. How to join tables in Hive? How to create partitions in Hive?
8. How to load data into a partition in Hive? How to create an external table in
Hive?
9. How to perform aggregations in Hive? What is the present version of Hive?
Explain ACID transactions in Hive.
10. When should we use SORT BY instead of ORDER BY?
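
For question 6 above, a minimal sketch of simple linear regression in Python follows. It assumes one predictor and ordinary least squares, and it uses only the standard library. The names `fit_line` and `r_squared` and the sample data are hypothetical. Libraries such as scikit-learn or statsmodels are the usual choice for real analyses, and Python 3.10+ also provides `statistics.linear_regression`.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line.

    Assumes the y values are not all identical.
    """
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical sample data, roughly following y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.1, 6.2, 7.9, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"R^2={r_squared(xs, ys, slope, intercept):.4f}")
```

The slope is the ratio of the x-y co-variation to the variation in x, and the intercept is chosen so the line passes through the point of means.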

Question 4 (Unit 4)
Five marks questions
1. What is the role of big data in understanding the genetic diversity and evolution
of species? How does big data help in tracking and studying the spread of
infectious diseases and their evolution?
2. What are some examples of how big data has contributed to our understanding
of human evolution? How does the analysis of big data contribute to the study of
evolutionary relationships among different species?
3. How has big data improved our understanding of the impact of environmental
factors on evolution?
4. What role does big data play in studying the evolution of drug resistance in
pathogens? How does big data facilitate the study of evolutionary dynamics in
complex ecosystems?
5. How does the analysis of big data contribute to understanding the role of genetic
mutations in evolutionary processes?
6. What are some ethical considerations associated with the use of big data in
evolutionary research? What is HDFS (Hadoop Distributed File System)?
7. What are the key features of HDFS? How does HDFS achieve fault tolerance?
8. How does HDFS support high throughput for data-intensive workloads? How
does HDFS handle large files?
9. What is data locality in the context of HDFS?
10. How does HDFS ensure scalability? What are the main components of HDFS
architecture?

Question 5 (Unit 5)
Five marks questions

1. What is Apache Ambari and what are its key features?
2. How does Ambari simplify the management of Hadoop clusters?
3. How does Ambari enable real-time monitoring and alerting?
4. What are some key features of Apache Ambari?
5. What is the role of the Ambari web UI?
6. What is the architecture of Apache Ambari?
7. What are Ambari stacks and services?
8. What are the benefits of using Ambari blueprints?
9. How can Apache Ambari be installed and configured?
10. What is the significance of Ambari views?

SECTION – B

Question 1 (Unit 1)
Each question carries NINE marks
1. How does big data contribute to cybersecurity? What are some use cases of big
data in e-commerce?
2. How is big data utilized in manufacturing? Write a short note on supply chain
management.
3. What are some use cases of big data in transportation and logistics? Write a short
note on fleet management.
4. How does big data contribute to personalized healthcare? How is big data useful in
remote patient monitoring?
5. What are some use cases of big data in the entertainment industry? Write about
audience analysis.
6. How does big data support urban planning and smart cities? What are some use
cases of big data in e-commerce? How is big data utilized in manufacturing?
7. What are some use cases of big data in transportation and logistics? How does
big data contribute to personalized healthcare? What are some use cases of big
data in the entertainment industry?

Question 2 (Units 2 & 3)

Each question carries NINE marks

1. What is the role of Hive in a distributed system? How is a query processed in Hive?
2. What are the common uses of Hive?
3. What is ZooKeeper? What are the benefits of using ZooKeeper?
4. What is partitioning in Hive? What are the components of Apache HBase?
5. When is it appropriate to use a NoSQL database?
6. What are the advantages of Apache Spark?
7. How do we use Spark?

Question 3 (Unit 5)

Each question carries NINE marks

1. What is dynamic partitioning and when is it used? What is indexing and why do
we need it? Explain the different types of joins in Hive.
2. How does data transfer happen from HDFS to Hive? How can you create a
temporary table in Hive?
3. How can you perform a subquery in Hive? How can you use a user-defined
function (UDF) in Hive? How can you export data from Hive to external
systems?
4. How can you monitor Hive jobs? How can you optimize Hive queries for
performance? How can you perform data transformations in Hive?
5. How can you comment in Hive scripts? How can you run a Hive script? How can
you filter data in Hive?
6. Write queries for the following:
• To drop a Hive table
• To display the schema of a Hive table
• To perform sorting in Hive
• To rename a Hive table

Write queries for the following:
• To calculate the percentile of a column in Hive
• To handle missing values in Hive
• To calculate the difference between two dates in Hive
7. How can you round a decimal value in Hive? How can you concatenate strings in
Hive? How can you perform a case-insensitive search in Hive? How can you
handle duplicates in Hive? How can you perform a self-join in Hive?

SECTION – C
Question 1
Each question carries TWELVE marks
1. How does big data impact industries and sectors? How did the Hadoop
framework influence the history of big data? What factors contributed to the
growth of big data? (Unit 1)
2. What are the different types of tools in big data? When do we use Apache Drill
over Apache Hive? (Unit 2)
3. How is Apache Spark different from MapReduce? Suppose that I want to
monitor all the open and aborted transactions in the system along with the
transaction id and the transaction state. Can this be achieved using Apache
Hive? (Unit 3)
4. What are the different components of a Hive architecture? (Unit 4)
Write queries to perform aggregate functions:
• To calculate the minimum value of a column in Hive
• To calculate the maximum value of a column in Hive
• To calculate the sum of a column in Hive
• To calculate the average of a column in Hive
• To calculate the count of a column in Hive

5. What are the different types of tables available in Hive? What is the difference
between external and managed tables in Hive? What do you understand by a
Hive Metastore? What is the difference between local and remote metastores in
Hive? (Unit 4)
6. Is it possible to run a Unix shell command from Hive? Give an example to
demonstrate. What do you understand by bucketing in Hive? Why do we need
a bucket? Can you list a few commonly used Hive services? (Unit 4)

7. Examine the impact of big data analytics in various industries such as
healthcare, finance, retail, and manufacturing. Provide specific examples of how
big data analytics has led to innovations, improved decision-making, and
competitive advantages in these sectors. Discuss potential future trends in big
data analytics and its continued significance in shaping business strategies.
(Unit 1)
