BD-Practice Questions-Aut

The document contains practice questions for a Big Data midterm exam divided into multiple units. Unit 1 contains direct questions about big data concepts. Unit 2 includes more complex questions involving mathematical problems and explaining concepts like data models and big data architecture. Unit 3 focuses on programming questions in R involving data analysis, matrices, vectors, and other statistical concepts. The prediction is that Unit 1 will have direct questions, Unit 2 will have moderate to complex questions, and Unit 3 will involve more complex programming problems.

Uploaded by

Suman Pandit

Big Data Midterm Practice Questions

Unit-1:

1. What are the characteristics of data?
2. Classify digital data with suitable examples.
3. Outline the challenges associated with unstructured data.
4. Define big data with an example.
5. Write down the names of the 5 V's of big data.
6. Write down four applications of big data analytics.
7. Explain different types of analytics used in big data.
8. Identify the roles of the stakeholders involved in a data analytics project.
9. Differentiate between Business Intelligence and Big Data analytics.
10. Discuss the different phases of the data analytics life cycle.
11. What is distributed computing? Explain the working of a distributed computing environment.
12. What are the top challenges faced in Big Data, and what kind of technology would you recommend to
mitigate them?
13. What is shared-nothing architecture, and how is it related to shared-disk and shared-memory architectures?
14. What are the advantages of shared-nothing architecture?
15. Explain the CAP theorem and prove it.
16. What is the Data Analytics Life Cycle, and what are the different phases/stages associated with it?
17. Discuss similarities and differences between ELT and ETL.
18. What is virtualization and what are the different types of virtualization? What are the benefits of
virtualization?
19. Discuss the differences between a parallel system and a distributed system.
20. Explain the following:
a. Traditional Analytics Architecture
b. Modern In-Database Analytics Architecture
c. MPP Database Analytics Architecture
d. In-Memory Computing
Unit-2:

1. Explain the conceptual, logical, and physical data models with suitable examples.
2. List the major functions of the Big Data architecture model.
3. List the components of the Big Data architecture.
4. Explain the functioning of the Ingestion layer in the Big Data architecture.
5. Discuss the key building blocks of the Hadoop platform management layer.
6. What is the role of the analytical engine in the Big Data environment? Describe the different types of
engines used to analyze Big Data.
7. Explain data streams with suitable examples.
8. Discuss similarities and differences between SQL and NoSQL.
9. Explain the rule-based and learning-based approaches with suitable examples.
10. What are the characteristics of a Big Data streaming system?
11. Explain the difference between data-at-rest and data-in-motion with suitable examples.
12. What is stream computing, and how is it different from traditional computing?
13. Explain the Bloom filter algorithm with a suitable example.
14. Discuss Bloom filter performance.
15. An empty Bloom filter is of size 11 with 4 hash functions, namely:
a. h1(x) = (3x + 3) mod 6
b. h2(x) = (2x+ 9) mod 2
c. h3(x) = (3x+ 7) mod 8
d. h4(x) = (2x+ 3) mod 5
Illustrate Bloom filter insertion with 7 and then 8. Then perform a Bloom filter lookup/membership
test with 10 and 48.
16. Calculate the optimal number of hash functions for a Bloom filter of length 10 bits with 3 input
elements.
17. Plot the graph and discuss what happens as the number of hash functions increases.
18. Calculate the probability of false positives for a table of size 10 when the number of items to be
inserted is 3.
19. Calculate the probability that a slot is set to 1 after insertion of 5 elements for a Bloom filter of
length 15 bits.
20. Calculate the probability that a slot is not set to 1 after insertion of 5 elements for a Bloom filter
of length 15 bits.
21. Calculate the probability that a slot is hashed with 5 hash functions for a Bloom filter of length 15 bits.
22. Explain the algorithm for counting distinct elements in a stream with a suitable example.
23. List the use cases of the Bloom filter.
24. Explain the algorithm for detecting false positives in a Bloom filter.
25. Explain the difference between False Positive and False Negative with suitable examples.
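
As a rough check on questions 15 to 20, the Bloom filter steps can be sketched in Python. This is only an illustrative sketch, not a model answer. The helper names (positions, insert, might_contain, optimal_k, prob_slot_zero, false_positive_rate) are made up here. The formulas assume the usual textbook model, in which each of the k hash functions picks one of the m slots uniformly and independently; the question sheet does not state this, and it does not give k for questions 18 to 20, so k is left as a parameter.

```python
import math

M = 11  # filter size given in question 15

# The four hash functions given in question 15.
HASHES = [
    lambda x: (3 * x + 3) % 6,
    lambda x: (2 * x + 9) % 2,
    lambda x: (3 * x + 7) % 8,
    lambda x: (2 * x + 3) % 5,
]

def positions(x):
    """Slots that element x maps to, one per hash function."""
    return [h(x) for h in HASHES]

def insert(bits, x):
    """Set every slot that x hashes to."""
    for p in positions(x):
        bits[p] = 1

def might_contain(bits, x):
    """False means definitely absent; True means possibly present."""
    return all(bits[p] == 1 for p in positions(x))

def optimal_k(m, n):
    """Textbook optimum k = (m / n) * ln 2 (assumed model)."""
    return (m / n) * math.log(2)

def prob_slot_zero(m, n, k):
    """Probability a given slot is still 0 after n insertions."""
    return (1 - 1 / m) ** (k * n)

def false_positive_rate(m, n, k):
    """Approximate false-positive probability (assumed model)."""
    return (1 - prob_slot_zero(m, n, k)) ** k

bits = [0] * M
insert(bits, 7)   # sets slots 0, 1, 4, 2
insert(bits, 8)   # sets slots 3, 1, 7, 4
print(bits)                       # [1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0]
print(might_contain(bits, 10))    # False: slot 5 is still 0
print(might_contain(bits, 48))    # True: a false positive, 48 was never inserted
print(round(optimal_k(10, 3), 2)) # 2.31, so about 2 hash functions
```

The lookup for 48 returns True because it hashes to exactly the slots that 8 set (3, 1, 7, 4), which is the false-positive case that questions 24 and 25 ask about. The probability that a slot is set to 1 is 1 - prob_slot_zero(m, n, k).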

Unit 4:

1. A cashier has currency notes of denominations 10, 50 and 100. If the amount to be withdrawn is
input through the keyboard in hundreds, write an R-script to find the total number of currency
notes of each denomination the cashier will have to give to the withdrawer.
2. Ramesh’s basic salary is input through the keyboard. His dearness allowance is 40% of basic
salary, and house rent allowance is 20% of basic salary. Write an R-script to calculate his gross
salary.
3. Write an R-script to check whether an integer is an Armstrong number or not. If the sum of the
cubes of the digits of a number is equal to the number itself, then the number is called an
Armstrong number. For example, 153 = ( 1 * 1 * 1 ) + ( 5 * 5 * 5 ) + ( 3 * 3 * 3 )
4. Write an R-script to reverse a number.
5. Write an R-script to sum the series S=1+(1+2)+(1+2+3)+...+(1+2+3+...+n)
6. Write an R-script to evaluate the sum of the following series using a recursive function:
1 + 2 + 3 + ... + N
7. Write an R-script to convert a decimal number into binary using a recursive function.
8. Write an R-script to find the factorial of a number using a recursive function.
9. Write an R-script to develop a function that receives 5 numbers and displays the sum, average, and
standard deviation of these numbers.
10. Write an R-script to input data for a matrix and check whether the given matrix is symmetric.
11. The nth triangular number is given by n * (n + 1) / 2. Create a sequence of the first 20 triangular
numbers. R has a built-in constant, letters, that contains the lowercase letters of the Roman
alphabet. Name the elements of the vector that you just created with the first 20 letters of the
alphabet. Select the triangular numbers where the name is a vowel.
12. A cricket team has the following table of batting figures from a series of test matches:
Player's Name   Runs   Innings   Times not out
Sachin          8430   150       18
Rahul           4235   158       9
Saurabh         6789   168       11
Virat           9898   200       13
And so on…
Write an R-script to read the figures set out in the above form and then calculate the batting
average and print out the complete table including the averages.
13. Write an R-script to print a table of values of the function y = e^(-x) for x varying from 0 to 10 in
steps of 0.1.
14. An electricity board charges the following rates to domestic users to discourage large
consumption of energy:
For the 1st 100 units – Rs 30 per unit
For the next 200 units – Rs 80 per unit
Beyond 300 units – Rs 90 per unit
All users are charged a minimum of Rs 500.00. If the total amount is more than Rs 3000.00 then
an additional charge of 15% is added. Write an R-script to read the names of users and the
number of units consumed and then print out the charges with names.
15. Write an R-script to represent a vector (a series of floating point values) with the functions:
a. Create the vector
b. Modify the value of a given element
c. Multiply by a scalar value
d. Display the vector

Prediction: (Don't give me dirty looks if the prediction turns out otherwise)

1. Unit 1 – Direct questions
2. Unit 2 – Moderate/Complex questions with Mathematical related problems
3. Unit 3 – Moderate/Complex programming questions

The End

Have a holly and jolly mid semester exam
