UEC718

The document contains 7 questions related to Big Data technologies like Hadoop, Hive, MapReduce, Spark etc. The questions test knowledge of components of Hadoop ecosystem, MapReduce concepts, Hive queries, collaborative filtering, K-means clustering, RDD operations in Spark.

Uploaded by

Abhi Mittal

Thapar Institute of Engineering and Technology

UEC 718 — Big Data Analytics "U" grade Exam — 07.03.2022 Duration: 2 hrs
Instructors: Debayani Ghosh, Arnab Pattanayak
Note: Attempt any 5 questions out of the following (Max-45 Marks)

Question 1. (9 marks)
(a) What are the components of the Hadoop ecosystem? (3 marks)
(b) Define the functionalities of a NameNode and a DataNode. (3 marks)
(c) Describe Map-Reduce with an example. (3 marks)

Question 2. (9 marks)

Employee ID   Name      Age          Employee ID   Salary

1             Alice     23           3             26000
3             Rose      25           5             30000
5             Michael   24           7             25000

Consider the two tables above, stored in .csv format. The left table is 'employee.csv' and the right
table is 'salary.csv'.

(a) Write a program in Apache Hive to create a database 'office'. Create two tables, 'employee'
and 'salary', in the database 'office' from the .csv files 'employee.csv' and 'salary.csv',
respectively. (3 marks)
(b) Perform a relational join on the two tables based on their common column. (4 marks)
(c) Write a query to display the details of employees with a salary of more than 25000. (2 marks)

Question 3. (9 marks) Consider the following string

"this will add the string to string constant pool"

(a) Write a map-reduce program to print the answer below: (7 marks)
2: [to]
3: [add, the]
4: [this, will, pool]
5: [string, string]
8: [constant]
(2, 3, 4, 5, 8 are the lengths of the respective words)

(b) Explain the functions of a JobTracker and a TaskTracker. (2 marks)
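
One possible approach to part (a) can be sketched in plain Python, with no Hadoop dependency. The function names `mapper`, `shuffle`, `reducer` and `run_job` are hypothetical and only imitate the map, shuffle and reduce phases that a real framework would run across a cluster. Note that `len("string")` is 6, so this sketch prints `6: [string, string]` where the paper lists 5.

```python
from collections import defaultdict


def mapper(text):
    """Map phase: emit a (word_length, word) pair for every word."""
    for word in text.split():
        yield len(word), word


def shuffle(pairs):
    """Shuffle phase: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: collect all words that share one length."""
    return key, list(values)


def run_job(text):
    """Run the three phases in sequence and return results sorted by key."""
    grouped = shuffle(mapper(text))
    return [reducer(key, grouped[key]) for key in sorted(grouped)]


if __name__ == "__main__":
    text = "this will add the string to string constant pool"
    for length, words in run_job(text):
        print(f"{length}: [{', '.join(words)}]")
```

The key design choice is using the word length as the map output key, so that the shuffle step brings all words of the same length to a single reducer call.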

Question 4. (9 marks)

Consider the following matrix of ratings given by 12 users to 6 movies

U1 U2 U3 U4 U5 U6 U7 U8 U9 U10
M1 1 3 1 4 4 1
M2 4 5 4 1
M3 1 5 2 11- 3 4
M4 1 5 5 5
M5 4 3 5
M6 1 3 3 2

Using item-item collaborative filtering, predict the rating of M1 by U5. For the similarity
measure, use centred cosine similarity, and use the 2 nearest neighbours of M1. (9 marks)

Question 5. (9 marks)
(a) Define RDD. (3 marks)
(b) What is a DStream? (2 marks)
(c) Given the statement: (4 marks)
rdd = parallelize([('a', 1), ('b', 1), ('a', 1)])
find a way to count the occurrences of the keys and print the following output: [('a', 2),
('b', 1)].
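
The counting asked for in part (c) can be illustrated with a plain-Python stand-in, assuming the keys in the statement are 'a' and 'b'. The helper `count_by_key` is a hypothetical name, not part of Spark. In PySpark itself, the same result would typically come from `rdd.countByKey()` or from `rdd.reduceByKey(...)` followed by `collect()`.

```python
from collections import Counter


def count_by_key(pairs):
    """Count how often each key occurs, ignoring the values."""
    counts = Counter(key for key, _ in pairs)
    # Sort by key so the output order is deterministic.
    return sorted(counts.items())


if __name__ == "__main__":
    pairs = [("a", 1), ("b", 1), ("a", 1)]
    print(count_by_key(pairs))  # prints [('a', 2), ('b', 1)]
```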

Question 6. (9 marks)
Consider the following two-dimensional dataset:

(1, 8), (1, 5), (4, 4), (5, 8), (8, 5), (5, 5), (4, 2), (4, 9)
Apply two iterations of the K-means clustering algorithm to the above data points to group them
into three clusters. Show the three cluster centroids and the data points belonging to each of
these three clusters. Choose the initial cluster centroids as (1, 8), (5, 8) and (4, 2), and
calculate the distance between two points using the Euclidean distance formula.
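
The procedure described in this question can be sketched in plain Python as a way to check a hand calculation. The function names `assign`, `update` and `kmeans` are hypothetical, and the sketch assumes that no cluster ever becomes empty and that at least one iteration is run, both of which hold for this dataset.

```python
import math


def assign(points, centroids):
    """Assign each point to the cluster of its nearest centroid (Euclidean)."""
    clusters = [[] for _ in centroids]
    for point in points:
        distances = [math.dist(point, c) for c in centroids]
        clusters[distances.index(min(distances))].append(point)
    return clusters


def update(clusters):
    """Recompute each centroid as the mean of the points in its cluster."""
    # Assumption: every cluster holds at least one point.
    return [
        (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for pts in clusters
    ]


def kmeans(points, centroids, iterations):
    """Run a fixed number of assign/update rounds (iterations >= 1)."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        centroids = update(clusters)
    return centroids, clusters


if __name__ == "__main__":
    points = [(1, 8), (1, 5), (4, 4), (5, 8), (8, 5), (5, 5), (4, 2), (4, 9)]
    initial = [(1, 8), (5, 8), (4, 2)]
    centroids, clusters = kmeans(points, initial, iterations=2)
    for centroid, members in zip(centroids, clusters):
        print(centroid, members)
```

With these inputs the cluster memberships do not change between the first and second iteration, so the centroids after two iterations equal those after one.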

Question 7. (9 marks) In a Hadoop environment, write the following programs in Python-Spark
programming:

(a) Read and display the contents of a text file. (2 marks)

(b) Given the RDD which contains the following elements: [100, 100, 210, 300, 300,
400, 4050, 400], display only the first occurrence of each number. (4 marks)
(c) Given a dataset ([10, 12, 13, 14, 15]), print only -> [10, 12, 14]. (3 marks)
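
The logic behind parts (b) and (c) can be sketched in plain Python, without Spark. The helper names `first_occurrences` and `keep_even` are hypothetical, and reading part (c) as "keep the even numbers" is an assumption based on the expected output. In PySpark the corresponding RDD operations would be `distinct()` and `filter(...)`, although `distinct()` does not guarantee the original order.

```python
def first_occurrences(values):
    """Keep only the first occurrence of each value, preserving order."""
    # dict keys are unique and remember insertion order.
    return list(dict.fromkeys(values))


def keep_even(values):
    """Keep only the even numbers (assumed reading of part (c))."""
    return [v for v in values if v % 2 == 0]


if __name__ == "__main__":
    print(first_occurrences([100, 100, 210, 300, 300, 400, 4050, 400]))
    print(keep_even([10, 12, 13, 14, 15]))
```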
