BDA Lab
Experiment-1
Aim: Write a program to implement:
a) Map function
b) Reduce function
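Code:
No code is recorded for this experiment; a minimal sketch using Python's built-in map and functools.reduce (the sample list and lambdas are illustrative):

```python
from functools import reduce

# a) Map: apply a function to every element of a sequence
numbers = [1, 2, 3, 4, 5]
squares = list(map(lambda x: x * x, numbers))  # [1, 4, 9, 16, 25]

# b) Reduce: fold the sequence into a single value
total = reduce(lambda acc, x: acc + x, squares)  # 55

print(squares)
print(total)
```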
Output:
Experiment-2
Aim: MapReduce program to implement word count.
Map: Split the text into words and map each word to a tuple (word, 1).
Reduce: Aggregate the counts for each word.
Code:
Map function –
def map_function(text_chunk):
    word_list = text_chunk.split()  # Split text into words
    map_output = []
    for word in word_list:
        map_output.append((word, 1))  # Emit tuple (word, 1)
    return map_output
Reduce function –
from collections import defaultdict

def reduce_function(map_outputs):
    word_count = defaultdict(int)  # Dictionary to count occurrences of each word
    for word, count in map_outputs:
        word_count[word] += count  # Aggregate the count for each word
    return word_count
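A short driver chaining the two phases on sample text (the sample string is illustrative; the functions are condensed versions of the ones above so the snippet runs on its own):

```python
from collections import defaultdict

def map_function(text_chunk):
    # Emit a (word, 1) pair for every word in the chunk
    return [(word, 1) for word in text_chunk.split()]

def reduce_function(map_outputs):
    # Sum the counts emitted for each word
    word_count = defaultdict(int)
    for word, count in map_outputs:
        word_count[word] += count
    return word_count

text = "big data big analytics data big"
counts = reduce_function(map_function(text))
print(dict(counts))  # {'big': 3, 'data': 2, 'analytics': 1}
```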
Experiment-3
Aim: MapReduce program to build an inverted index.
Code:
from mrjob.job import MRJob
from mrjob.step import MRStep

class InvertedIndexReducer(MRJob):
    def steps(self):
        return [
            MRStep(mapper=self.mapper, reducer=self.reducer)
        ]

    def mapper(self, _, line):
        # Split the line into document_id and content
        doc_id, content = line.split("\t", 1)
        # Tokenize content and emit each word with its document id
        words = content.split()
        for word in words:
            yield word.lower(), doc_id

    def reducer(self, word, doc_ids):
        # Aggregate document IDs for each word
        doc_id_list = set(doc_ids)  # Use a set to avoid duplicates
        yield word, list(doc_id_list)

if __name__ == '__main__':
    InvertedIndexReducer.run()
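mrjob runs this job from the command line (for example, python inverted_index.py input.txt, where the script and input file names are illustrative). The same map and reduce logic can be checked without Hadoop; a standalone simulation over made-up tab-separated lines:

```python
from collections import defaultdict

# Sample input: document_id <TAB> content (lines are illustrative)
lines = [
    "doc1\tbig data tools",
    "doc2\tdata pipelines",
]

# Map phase: emit (word, doc_id) pairs
pairs = []
for line in lines:
    doc_id, content = line.split("\t", 1)
    for word in content.split():
        pairs.append((word.lower(), doc_id))

# Shuffle + reduce phase: collect the set of document ids per word
index = defaultdict(set)
for word, doc_id in pairs:
    index[word].add(doc_id)

for word in sorted(index):
    print(word, sorted(index[word]))
```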
Experiment-4
Aim: HDFS commands.
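Code:
No commands are recorded for this experiment; a few commonly used HDFS shell commands (the file and directory paths are illustrative):

```shell
hdfs dfs -ls /                                  # List files in the HDFS root directory
hdfs dfs -mkdir -p /user/data                   # Create a directory (with parents)
hdfs dfs -put local.txt /user/data/             # Copy a local file into HDFS
hdfs dfs -cat /user/data/local.txt              # Print a file's contents
hdfs dfs -get /user/data/local.txt ./copy.txt   # Copy a file from HDFS to the local disk
hdfs dfs -rm /user/data/local.txt               # Delete a file from HDFS
```

These commands require a running Hadoop installation; outputs depend on the cluster's contents.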
Experiment-5
Aim: MapReduce program to compute the average value per category.
Code:
Reduce function –
from collections import defaultdict

def reducer(mapped_data):
    # Store the running sum and count of values per category
    sums = defaultdict(float)
    counts = defaultdict(int)
    # Process each (key, value) pair from the mapper
    for category, value in mapped_data:
        sums[category] += value
        counts[category] += 1
    # Calculate the average for each category
    averages = {}
    for category in sums:
        averages[category] = sums[category] / counts[category]
    return averages
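The mapper for this experiment is not shown; a minimal sketch assuming input lines of the form "category,value" (the format and sample data are assumptions), with a condensed copy of the reducer above so the snippet runs on its own:

```python
from collections import defaultdict

def mapper(lines):
    # Emit a (category, value) pair from each "category,value" line
    for line in lines:
        category, value = line.split(",")
        yield category, float(value)

def reducer(mapped_data):
    # Condensed version of the reducer above: average per category
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, value in mapped_data:
        sums[category] += value
        counts[category] += 1
    return {category: sums[category] / counts[category] for category in sums}

lines = ["fruit,10", "fruit,20", "veg,5"]
print(reducer(mapper(lines)))  # {'fruit': 15.0, 'veg': 5.0}
```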