Bda Lab
PROGRAM:
# Return the sum of two numbers (vectorized, like `+` itself).
add <- function(x, y) {
  x + y
}
# Return the difference x - y.
subtract <- function(x, y) {
  x - y
}
# Return the product x * y.
multiply <- function(x, y) {
  x * y
}
# Divide x by y, guarding against division by zero.
# Returns the numeric quotient, or an explanatory string when y == 0
# (note: callers therefore receive mixed return types).
divide <- function(x, y) {
  if (y == 0) {
    return("Cannot divide by zero!")
  }
  x / y
}
# Interactive calculator: read two numbers and an operator, then
# dispatch to the matching arithmetic helper via switch().
num1 <- as.numeric(readline("Enter the first number: "))
num2 <- as.numeric(readline("Enter the second number: "))
operator <- readline("Enter an operator (+, -, *, /): ")
# switch() default branch mirrors the original: warn and keep result 0.
result <- switch(operator,
  "+" = add(num1, num2),
  "-" = subtract(num1, num2),
  "*" = multiply(num1, num2),
  "/" = divide(num1, num2),
  {
    cat("Invalid operator")
    0
  }
)
cat("Result: ", result, "\n")
OUTPUT:
Enter the first number: 3
Enter the second number: 3
Enter an operator (+, -, *, /): +
Result: 6
1
2. Write an R program which takes an integer as input and checks whether the given number is
positive, negative, or zero.
PROGRAM:
# Read an integer and classify it as positive, negative, or zero.
number <- as.integer(readline("Enter a Number:- "))
if (number > 0) {
  print(paste(number, "is the positive number"))
} else if (number < 0) {
  print(paste(number, "is the negative number"))
} else {
  print(paste(number, "is the zero"))
}
OUTPUT:
Enter a Number:- 2
"2 is the positive number"
2
3. Write an R program which takes a year as input and checks whether the given year is a leap
year or not.
PROGRAM:
# Read a year and report whether it is a leap year.
year <- as.integer(readline("Enter a year : "))
# A year is leap when divisible by 4, unless it is a century year,
# in which case it must also be divisible by 400. This single
# expression is equivalent to the usual nested-if formulation.
is_leap <- (year %% 4 == 0 && year %% 100 != 0) || (year %% 400 == 0)
if (is_leap) {
  print(paste(year," is leap year"))
} else {
  print(paste(year," is not leap year"))
}
OUTPUT:
Enter a year : 2024
"2024 is leap year"
3
4. Write a function to print the odd integers in a given vector and count them.
PROGRAM:
# Print the odd elements of `x` (via cat) and return how many there are.
#
# Args:
#   x: an integer/numeric vector.
# Returns:
#   The number of odd elements.
odd_count <- function(x) {
  k <- 0
  for (n in x) {
    if (n %% 2 == 1) {
      cat(n, " ")
      k <- k + 1  # bug fix: the counter was never incremented, so 0 was always returned
    }
  }
  return(k)
}
# Prompt for a vector from stdin and show its odd elements.
print("Enter a Vector : ")
vector <- scan()
cat("The Odd Elements are : ")
odd_count(vector)
OUTPUT:
Enter a Vector :
1: 1
2: 2
3: 3
4: 4
5: 5
6:
Read 5 items
The Odd Elements are :
135
4
5. Write a function to print the even integers in a given vector and count them.
PROGRAM:
# Print the even elements of `x` (via cat) and return how many there are.
#
# Args:
#   x: an integer/numeric vector.
# Returns:
#   The number of even elements.
even_count <- function(x) {
  k <- 0
  for (n in x) {
    if (n %% 2 == 0) {
      cat(n, " ")
      k <- k + 1  # bug fix: the counter was never incremented, so 0 was always returned
    }
  }
  return(k)
}
# Prompt for a vector from stdin and show its even elements.
print("Enter a Vector : ")
vector <- scan()
cat("The Even Elements are : ")
even_count(vector)
OUTPUT:
Enter a Vector :
1: 1
2: 2
3: 3
4: 4
5: 5
6:
Read 5 items
The Even Elements are :
2 4
5
6. Write a function to find the sum of all even numbers in a given vector.
PROGRAM:
# Sum the even elements of a numeric vector.
# (Name kept for compatibility with the existing caller; the local
# accumulator no longer shadows base::sum.)
even_count <- function(x) {
  total <- 0
  for (value in x) {
    if (value %% 2 == 0) {
      total <- total + value
    }
  }
  return(total)
}
# Read a vector from stdin and report the sum of its even elements.
print("Enter a Vector : ")
myVector <- scan()
print(paste("The Sum of Even Numbers in List:- ", even_count(myVector)))
OUTPUT:
Enter a Vector :
1: 1
2: 2
3: 3
4: 4
5: 5
6:
Read 5 items
The Sum of Even Numbers in List:- 6
6
7. Write a function to find the sum of all odd numbers in a given vector.
PROGRAM:
# Sum the odd elements of a numeric vector.
# (Name kept for compatibility with the existing caller; the local
# accumulator no longer shadows base::sum.)
odd_count <- function(x) {
  total <- 0
  for (value in x) {
    if (value %% 2 == 1) {
      total <- total + value
    }
  }
  return(total)
}
# Read a vector from stdin and report the sum of its odd elements.
print("Enter a Vector : ")
myVector <- scan()
print(paste("The Sum of Odd Numbers in List:- ", odd_count(myVector)))
OUTPUT:
Enter a Vector :
1: 1
2: 2
3: 3
4: 4
5: 5
6:
Read 5 items
The Sum of Odd Numbers in List:- 9
7
8. Write a recursive function that calculates the factorial of a given positive integer.
Use an if-else statement to handle the base case and recursive case.
PROGRAM:
# Recursively compute n! for a non-negative integer n.
# Base case: 0! = 1! = 1; otherwise n * (n - 1)!.
factorial <- function(n) {
  if (n == 0 || n == 1) {
    return(1)
  }
  n * factorial(n - 1)
}
# Read a non-negative integer and report its factorial,
# rejecting negative input.
num <- as.integer(readline("Enter a positive integer: "))
if (num >= 0) {
  result <- factorial(num)
  cat("Factorial of", num, "is", result, "\n")
} else {
  cat("Please enter a positive integer.")
}
OUTPUT:
Enter a positive integer: 5
Factorial of 5 is 120
8
9. Write a function that calculates the factorial of a given positive integer using a loop.
Handle the case where the input is 0 or 1 separately.
PROGRAM:
# Iteratively compute n! for a non-negative integer n.
# 0 and 1 are handled up front; otherwise multiply 2..n together.
factorial <- function(n) {
  if (n == 0 || n == 1) {
    return(1)
  }
  prod(seq(2, n))
}
# Read a non-negative integer and report its factorial.
# Bug fix: `num` was used but never read from the user — the
# readline() call was missing from the original listing.
num <- as.integer(readline("Enter a positive integer: "))
if (is.na(num) || num < 0) {
  cat("Please enter a positive integer.")
} else {
  result <- factorial(num)
  cat("Factorial of", num, "is", result, "\n")
}
OUTPUT:
Enter a positive integer: 5
Factorial of 5 is 120
9
10. Write an R Program to Convert List to Vector.
PROGRAM:
# Demonstrate converting a list to an atomic vector.
myList <- list("Apple", "Banana", "Orange", "Grapes")
# unlist() flattens the list into a character vector; the elements
# are unnamed, so dropping names changes nothing here.
myVector <- unlist(myList, use.names = FALSE)
print(myVector)
OUTPUT:
[1] "Apple" "Banana" "Orange" "Grapes"
10
11. Write an R Program to create Recursive List with Student Name, Roll No and
Marks and perform adding and deleting operations.
PROGRAM:
# Build one student record as a named list.
createStudent <- function(name, roll_no, marks) {
  list(Name = name, Roll_No = roll_no, Marks = marks)
}
# Append a student record to the roster and return the new roster.
addStudent <- function(student_list, student) {
  c(student_list, list(student))
}
# Remove the first student whose Roll_No matches `roll_no`.
# Prints a message and returns the list unchanged when no match exists.
#
# Bug fix: the original iterated 1:length(student_list), which yields
# c(1, 0) for an empty list and then errors on student_list[[1]];
# seq_along() iterates zero times in that case.
deleteStudent <- function(student_list, roll_no) {
  for (i in seq_along(student_list)) {
    if (student_list[[i]]$Roll_No == roll_no) {
      return(student_list[-i])
    }
  }
  cat("Student with Roll No", roll_no, "not found.\n")
  return(student_list)
}
# Seed the roster with three students, then exercise add and delete.
studentList <- list()
for (record in list(createStudent("John", 101, 85),
                    createStudent("Alice", 102, 92),
                    createStudent("Bob", 103, 78))) {
  studentList <- addStudent(studentList, record)
}
print("Initial Student List:")
print(studentList)
# Add a fourth student.
newStudent <- createStudent("Eva", 104, 95)
studentList <- addStudent(studentList, newStudent)
print("Student List after Adding Eva:")
print(studentList)
# Remove Alice by roll number.
studentList <- deleteStudent(studentList, 102)
print("Student List after Deleting Alice:")
print(studentList)
OUTPUT:
Initial Student List:
[[1]]
[[1]]$Name
[1] "John"
[[1]]$Roll_No
[1] 101
[[1]]$Marks
[1] 85
11
[[2]]
[[2]]$Name
[1] "Alice"
[[2]]$Roll_No
[1] 102
[[2]]$Marks
[1] 92
[[3]]
[[3]]$Name
[1] "Bob"
[[3]]$Roll_No
[1] 103
[[3]]$Marks
[1] 78
[[1]]$Roll_No
[1] 101
[[1]]$Marks
[1] 85
[[2]]
[[2]]$Name
[1] "Alice"
[[2]]$Roll_No
[1] 102
[[2]]$Marks
[1] 92
[[3]]
[[3]]$Name
[1] "Bob"
12
[[3]]$Roll_No
[1] 103
[[3]]$Marks
[1] 78
[[4]]
[[4]]$Name
[1] "Eva"
[[4]]$Roll_No
[1] 104
[[4]]$Marks
[1] 95
[[1]]$Roll_No
[1] 101
[[1]]$Marks
[1] 85
[[2]]
[[2]]$Name
[1] "Bob"
[[2]]$Roll_No
[1] 103
[[2]]$Marks
[1] 78
[[3]]
[[3]]$Name
[1] "Eva"
[[3]]$Roll_No
[1] 104
[[3]]$Marks
[1] 95
13
12. Create a function that takes a list of numeric vectors as input. The function
should return a new list where each vector has been normalized (scaled to have a
mean of 0 and standard deviation of 1).
PROGRAM:
# Scale a numeric vector to mean 0 and standard deviation 1.
# A constant vector yields NaN everywhere, since sd(vec) == 0.
normalize_vector <- function(vec) {
  (vec - mean(vec)) / sd(vec)
}
# Bug fix: `input_list` and `normalized_result` were never defined in
# the listing. The sample data below matches the recorded OUTPUT, and
# lapply() applies normalize_vector to each vector in the list.
input_list <- list(c(1, 2, 3, 4, 5),
                   c(10, 20, 30, 40, 50),
                   c(0, 0, 0, 0, 0))
normalized_result <- lapply(input_list, normalize_vector)
print("Original list:")
print(input_list)
print("Normalized list:")
print(normalized_result)
OUTPUT:
Original list:
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] 10 20 30 40 50
[[3]]
[1] 0 0 0 0 0
Normalized list:
[[1]]
[1] -1.2649111 -0.6324555 0.0000000 0.6324555 1.2649111
[[2]]
[1] -1.2649111 -0.6324555 0.0000000 0.6324555 1.2649111
[[3]]
[1] NaN NaN NaN NaN NaN
14
13. Create following data with column and row names.
Ravi 78 67 92
Mahesh 56 89 78
Sita 51 81 76
Neha 89 70 50
PROGRAM:
# Bug fix: the task asks for row and column names (and the recorded
# OUTPUT shows them), but the original never set dimnames.
# Data is filled column-wise: Math, Eng, Science for the four students.
student_marks <- matrix(
  c(78, 56, 51, 89, 67, 89, 81, 70, 92, 78, 76, 50),
  nrow = 4, ncol = 3,
  dimnames = list(
    c("Ravi", "Mahesh", "Sita", "Neha"),
    c("Math", "Eng", "Science")
  )
)
print(student_marks)
OUTPUT:
Math Eng Science
Ravi 78 67 92
Mahesh 56 89 78
Sita 51 81 76
Neha 89 70 50
15
14. Write an R script which processes above data and displays the division/grade of
each student.
PROGRAM:
# Marks matrix (filled column-wise: Math, Eng, Science) with
# students as rows and subjects as columns.
student_marks <- matrix(
  c(78, 56, 51, 89, 67, 89, 81, 70, 92, 78, 76, 50),
  nrow = 4, ncol = 3
)
colnames(student_marks) <- c("Math", "Eng", "Science")
rownames(student_marks) <- c("Ravi", "Mahesh", "Sita", "Neha")
print(student_marks)
# Map an average mark to a letter grade using exclusive lower bounds
# (e.g. exactly 90 is a "B", exactly 80 a "C", and so on).
calculate_grade <- function(avg_marks) {
  if (avg_marks > 90) return("A")
  if (avg_marks > 80) return("B")
  if (avg_marks > 70) return("C")
  if (avg_marks > 60) return("D")
  "F"
}
# For each student row, compute the truncated average (%/% keeps the
# integer division of the original, matching the recorded output)
# and print the corresponding grade.
for (student in rownames(student_marks)) {
  marks_row <- student_marks[student, ]
  avg_marks <- sum(marks_row) %/% length(marks_row)
  grade <- calculate_grade(avg_marks)
  print(paste("Student Name:", student, "Average Marks", avg_marks, "Grade", grade))
}
OUTPUT:
Math Eng Science
Ravi 78 67 92
Mahesh 56 89 78
Sita 51 81 76
Neha 89 70 50
[1] "Student Name: Ravi Average Marks 79 Grade C"
[1] "Student Name: Mahesh Average Marks 74 Grade C"
[1] "Student Name: Sita Average Marks 69 Grade D"
[1] "Student Name: Neha Average Marks 69 Grade D"
16
15. Write an R function that takes two matrices as input and checks whether they are equal
(have the same dimensions and corresponding elements).
PROGRAM:
# TRUE when both arguments are matrices of identical dimensions with
# all corresponding elements equal; FALSE otherwise.
are_matrices_equal <- function(mat1, mat2) {
  if (!is.matrix(mat1) || !is.matrix(mat2)) {
    return(FALSE)
  }
  if (!identical(dim(mat1), dim(mat2))) {
    return(FALSE)
  }
  all(mat1 == mat2)
}
# Bug fix: the listing was missing every input-reading line (rowA, colA,
# vecA, rowB, colB, vecB were all undefined) and the final equality
# check shown in the recorded OUTPUT. Reconstructed to mirror the
# matrix-product driver of experiment 16.
print("Enter rows count for Matrix A: ")
rowA <- as.integer(readline())
print("Enter cols count for Matrix A: ")
colA <- as.integer(readline())
print("Enter Elements for Matrix A (row - wise) ")
vecA <- scan()
if ((rowA * colA) != length(vecA)) {
  print("Vector Length and (Rows and Columns) Count not Match")
} else {
  A <- matrix(vecA, nrow = rowA, ncol = colA, byrow = TRUE)
}
print("Enter rows count for Matrix B: ")
rowB <- as.integer(readline())
print("Enter cols count for Matrix B: ")
colB <- as.integer(readline())
print("Enter Elements for Matrix B (row - wise) ")
vecB <- scan()
if ((rowB * colB) != length(vecB)) {
  print("Vector Length and (Rows and Columns) Count not Match")
} else {
  B <- matrix(vecB, nrow = rowB, ncol = colB, byrow = TRUE)
}
print("Matrix A:")
print(A)
print("Matrix B:")
print(B)
cat("Are matrix A and matrix B equal?", are_matrices_equal(A, B), "\n")
OUTPUT:
[1] "Enter rows count for Matrix A: "
2
[1] "Enter cols count for Matrix A: "
2
[1] "Enter Elements for Matrix A (row - wise) "
1: 1
2: 1
3: 1
4: 1
5:
Read 4 items
[1] "Enter rows count for Matrix B: "
2
[1] "Enter cols count for Matrix B: "
2
[1] "Enter Elements for Matrix B (row - wise) "
> vecB = scan()
1: 1
2: 1
3: 1
4: 1
5:
Read 4 items
[1] "Matrix A:"
[,1] [,2]
[1,] 1 1
[2,] 1 1
[1] "Matrix B:"
[,1] [,2]
[1,] 1 1
[2,] 1 1
Are matrix A and matrix B equal? TRUE
18
16. You have two matrices, A and B, both of size 3x3. Write a program that
computes the matrix product of A and B without using the %*% operator.
PROGRAM:
# Multiply two matrices without using %*%.
# Each result cell (i, j) is the dot product of row i of A and
# column j of B; stops when the inner dimensions do not agree.
matrixProduct <- function(A, B) {
  if (ncol(A) != nrow(B)) {
    stop("Number of columns in matrix A must be equal to the number of rows in matrix B.")
  }
  rows <- nrow(A)
  cols <- ncol(B)
  result <- matrix(0, rows, cols)
  for (i in seq_len(rows)) {
    for (j in seq_len(cols)) {
      result[i, j] <- sum(A[i, ] * B[, j])
    }
  }
  result
}
# Read the dimensions and (row-wise) elements of matrix A.
print("Enter rows count for Matrix A: ")
rowA <- as.integer(readline())
print("Enter cols count for Matrix A: ")
colA <- as.integer(readline())
print("Enter Elements for Matrix A (row - wise) ")
vecA <- scan()
# Only build the matrix when the element count matches the shape.
if ((rowA * colA) != length(vecA)) {
  print("Vector Length and (Rows and Columns) Count not Match")
} else {
  A <- matrix(vecA, nrow = rowA, ncol = colA, byrow = TRUE)
}
# Read the dimensions and (row-wise) elements of matrix B.
print("Enter rows count for Matrix B: ")
rowB <- as.integer(readline())
print("Enter cols count for Matrix B: ")
colB <- as.integer(readline())
print("Enter Elements for Matrix B (row - wise) ")
vecB <- scan()
if ((rowB * colB) != length(vecB)) {
  print("Vector Length and (Rows and Columns) Count not Match")
} else {
  B <- matrix(vecB, nrow = rowB, ncol = colB, byrow = TRUE)
}
19
# Display both input matrices and their product computed without %*%.
print("Matrix A:")
print(A)
print("Matrix B:")
print(B)
print("MatrixProduct A*B : ")
print(matrixProduct(A, B))
OUTPUT:
[1] "Enter rows count for Matrix A: "
2
[1] "Enter cols count for Matrix A: "
2
[1] "Enter Elements for Matrix A (row - wise) "
> vecA = scan()
1: 1
2: 2
3: 3
4: 4
5:
Read 4 items
20
17. Write a Java Hadoop Code to run a MapReduce job for a word count
application.
PROGRAM:
WC_Mapper.java
package org.bhavani;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

/**
 * Word-count mapper (old mapred API): emits (word, 1) for every
 * whitespace-separated token of each input line.
 */
public class WC_Mapper extends MapReduceBase implements
        Mapper<LongWritable, Text, Text, IntWritable> {

    // Constant count of 1 and a reusable Text to avoid per-token allocation.
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
        }
    }
} // bug fix: the class's closing brace was missing in the original listing
WC_Reducer.java
package org.bhavani;
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

/**
 * Word-count reducer (old mapred API): sums the 1s emitted per word.
 *
 * Bug fix: the class declaration was missing from the original listing,
 * leaving reduce() floating outside any class.
 */
public class WC_Reducer extends MapReduceBase implements
        Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output,
                       Reporter reporter) throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}
WC_Runner.java
package org.bhavani;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

/**
 * Driver for the word-count job (old mapred API).
 * args[0] = input path, args[1] = output path.
 */
public class WC_Runner {
    public static void main(String[] args) throws IOException {
        JobConf jobConf = new JobConf(WC_Runner.class);
        jobConf.setJobName("WordCount");
        // Key/value types emitted by the mapper and reducer.
        jobConf.setOutputKeyClass(Text.class);
        jobConf.setOutputValueClass(IntWritable.class);
        jobConf.setMapperClass(WC_Mapper.class);
        // Summing is associative and commutative, so the reducer
        // doubles as a combiner.
        jobConf.setCombinerClass(WC_Reducer.class);
        jobConf.setReducerClass(WC_Reducer.class);
        jobConf.setInputFormat(TextInputFormat.class);
        jobConf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
        FileOutputFormat.setOutputPath(jobConf, new Path(args[1]));
        JobClient.runJob(jobConf);
    }
}
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Maven build file for the WordCount MapReduce job (experiment 17).
     Bug fix: a stray page-number line ("22") inside the <project>
     element made the original XML malformed; it has been removed. -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.bhavani</groupId>
<artifactId>WordCount</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<!-- Compile for Java 8, the baseline for Hadoop 3.2.x -->
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<!-- Hadoop client-side APIs required to compile the job classes -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>3.2.3</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>3.2.3</version>
</dependency>
</dependencies>
</project>
INPUT:
input.txt
Shiva sai
Bhavani
Sekhar
Supriya
sai
sai
Shiva
Sekhar
Bhavani
Shakeel
Supriya
23
OUTPUT:
24
18. Write a Java Hadoop Code to run a MapReduce job for a Maximum
Temperature application.
PROGRAM:
Max_temp.java
package org.bhavani2;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/**
 * Maximum-temperature job: each input line is "<year> <temperature>";
 * the job emits the highest temperature observed per year.
 *
 * Fixes vs. the original listing: a stray extra closing brace was
 * removed, and the missing driver (main) was added.
 */
public class Max_temp {

    /** Mapper: parses "<year> <temp>" lines and emits (year, temp). */
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private Text year = new Text();
        private IntWritable temperature = new IntWritable();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            if (tokenizer.hasMoreTokens()) {
                // Assuming the first token is the year
                String yearStr = tokenizer.nextToken();
                year.set(yearStr);
                // Check if there is a temperature value
                if (tokenizer.hasMoreTokens()) {
                    String tempStr = tokenizer.nextToken().trim();
                    try {
                        int temp = Integer.parseInt(tempStr);
                        temperature.set(temp);
                        context.write(year, temperature);
                    } catch (NumberFormatException e) {
                        // Malformed rows are logged and skipped.
                        System.err.println("Error parsing temperature: " + tempStr);
                    }
                } else {
                    System.err.println("Missing temperature value for year: " + yearStr);
                }
            } else {
                System.err.println("Empty or invalid input line: " + line);
            }
        }
    }

    /** Reducer: keeps the maximum temperature seen for each year. */
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Bug fix: starting at 0 would report 0 for a year whose
            // readings are all negative; MIN_VALUE is a safe identity.
            int maxtemp = Integer.MIN_VALUE;
            for (IntWritable it : values) {
                int temperature = it.get();
                if (maxtemp < temperature) {
                    maxtemp = temperature;
                }
            }
            context.write(key, new IntWritable(maxtemp));
        }
    }

    /**
     * Driver (missing from the original listing): wires the mapper and
     * reducer into a job. args[0] = input path, args[1] = output path.
     */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "max temperature");
        job.setJarByClass(Max_temp.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
26
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Maven build file for the Maximum Temperature MapReduce job
     (experiment 18). -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.bhavani2</groupId>
<!-- NOTE(review): artifactId still says "WordCount" — apparently
     copy-pasted from experiment 17; consider renaming to MaxTemperature. -->
<artifactId>WordCount</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<!-- Compile for Java 8, the baseline for Hadoop 3.2.x -->
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<!-- Hadoop client-side APIs required to compile the job classes -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>3.2.3</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>3.2.3</version>
</dependency>
</dependencies>
</project>
INPUT:
input.txt
1900 39
1900 14
1900 5
1900 11
1901 48
1901 21
1901 13
1902 49
27
1902 1
1902 24
1903 35
1903 35
1903 18
1904 29
1904 23
1904 28
1904 46
OUTPUT:
28
19. Write a Java Hadoop Code to run a MapReduce job for Sales by Country
application.
PROGRAM:
Sales.java
package org.bhavani3;
import java.io.IOException;
import java.io.DataInput;
import java.io.DataOutput;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/**
 * Sales-by-country job: counts products sold and totals sale prices
 * per country.
 *
 * NOTE(review): the original listing was garbled — the mapper and
 * reducer bodies were missing and the driver code was spliced into the
 * Map class. Reconstructed below. The CSV column indexes assume the
 * classic SalesJan2009.csv layout (price = column 2, country = column 7)
 * — confirm against the actual input.csv.
 */
public class Sales {

    /** Writable pairing a product count with a price total for a country. */
    public static class CountrySalesStatsWritable implements Writable {
        private IntWritable productCount;
        private LongWritable priceSum;

        // Default constructor (required by Hadoop serialization).
        public CountrySalesStatsWritable() {
            this.productCount = new IntWritable();
            this.priceSum = new LongWritable();
        }

        // Custom constructor.
        public CountrySalesStatsWritable(IntWritable productCount, LongWritable priceSum) {
            this.productCount = productCount;
            this.priceSum = priceSum;
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            productCount.readFields(in);
            priceSum.readFields(in);
        }

        @Override
        public void write(DataOutput out) throws IOException {
            productCount.write(out);
            priceSum.write(out);
        }

        @Override
        public String toString() {
            // Tab-separated so the two numbers remain distinguishable in
            // the job output (the original joined them with "").
            return productCount.toString() + "\t" + priceSum.toString();
        }
    }

    /** Mapper: emits (country, price) for each well-formed sales record. */
    public static class Map extends Mapper<LongWritable, Text, Text, LongWritable> {
        private Text country = new Text();
        private LongWritable price = new LongWritable();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length > 7) {
                try {
                    price.set(Long.parseLong(fields[2].trim()));
                    country.set(fields[7].trim());
                    context.write(country, price);
                } catch (NumberFormatException e) {
                    // Skip the header row and malformed records.
                }
            }
        }
    }

    /** Reducer: counts records and totals prices per country. */
    public static class Reduce extends
            Reducer<Text, LongWritable, Text, CountrySalesStatsWritable> {
        @Override
        public void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            int count = 0;
            long total = 0L;
            for (LongWritable salePrice : values) {
                count++;
                total += salePrice.get();
            }
            context.write(key, new CountrySalesStatsWritable(
                    new IntWritable(count), new LongWritable(total)));
        }
    }

    /** Driver. args[0] = input path, args[1] = output path. */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The job name is only a label shown in the cluster UI.
        Job job = Job.getInstance(conf, "sales");
        job.setJarByClass(Sales.class);
        // Mapper output types differ from the reducer's output types,
        // so both pairs must be declared explicitly.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(CountrySalesStatsWritable.class);
        job.setMapperClass(Map.class);
        // Don't reuse the Reducer as a Combiner: its input and output
        // value types do not match.
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Maven build file for the Sales-by-Country MapReduce job
     (experiment 19). -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.bhavani3</groupId>
<!-- NOTE(review): artifactId still says "WordCount" — apparently
     copy-pasted from experiment 17; consider renaming to Sales. -->
<artifactId>WordCount</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<!-- Compile for Java 8, the baseline for Hadoop 3.2.x -->
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<!-- Hadoop client-side APIs required to compile the job classes -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>3.2.3</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>3.2.3</version>
</dependency>
</dependencies>
</project>
31
INPUT:
input.csv
OUTPUT:
32
20. Implement a simple map-reduce job that builds an inverted index on the set of
input documents (Hadoop)
PROGRAM:
InvertedIndexMapper.java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/**
 * Inverted-index mapper: each input line is "<docId> <word> <word> ...";
 * emits (word, docId) for every word on the line.
 */
public class InvertedIndexMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Reusable output key/value holders to avoid per-record allocation.
    private Text term = new Text();
    private Text docIdText = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] tokens = value.toString().split("\\s+");
        // A valid line has a document id followed by at least one word.
        if (tokens.length < 2) {
            return; // Skip lines with no content
        }
        docIdText.set(tokens[0]);
        for (int i = 1; i < tokens.length; i++) {
            term.set(tokens[i]);
            context.write(term, docIdText);
        }
    }
}
InvertedIndexReducer.java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

/**
 * Inverted-index reducer: collects the distinct document ids for each
 * term and emits them as one comma-separated postings string.
 */
public class InvertedIndexReducer extends Reducer<Text, Text, Text, Text> {

    private Text postings = new Text();

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // HashSet de-duplicates ids when a term occurs several times
        // in the same document.
        Set<String> docIds = new HashSet<>();
        for (Text docId : values) {
            docIds.add(docId.toString());
        }
        postings.set(String.join(",", docIds));
        context.write(key, postings);
    }
}
33
InvertedIndexDriver.java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
INPUT:
1.txt
doc1 Hello world
2.txt
doc2 Hadoop is great
3.txt
doc3 Hello Hadoop
34
OUTPUT:
35
21. Use R-Project to carry out statistical analysis of big data
PROGRAM:
# Statistical analysis of the built-in iris dataset.
# NOTE(review): installing at run time is unusual; typically done once
# outside the script.
install.packages("ggplot2")
# Load the dataset
data(iris)
# Display the first few rows of the dataset
head(iris)
# Summary statistics of the dataset
summary(iris)
# Mean of each numeric variable, grouped by species
aggregate(. ~ Species, data = iris, mean)
# Plotting the data: scatter plot of sepal dimensions, coloured by species
library(ggplot2)
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
geom_point() +
labs(title = "Iris Dataset",
x = "Sepal Length",
y = "Sepal Width")
# Boxplot of the four measurement columns.
# NOTE(review): the boxes are one per measurement, not per species, so
# the "by Species" title and "Species" xlab look misleading — confirm intent.
boxplot(iris[, 1:4],
main = "Boxplot of Iris Dataset by Species",
xlab = " Species",
ylab = " Measurement",
col = c("skyblue", "lightgreen", "salmon"),
names = c("Sepal Length", "Sepal width", "Petal Length", "Petal width"))
OUTPUT:
> head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
> summary(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width
Min. :4.300 Min. :2.000 Min. :1.000 Min. :0.100
1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.300
Median :5.800 Median :3.000 Median :4.350 Median :1.300
Mean :5.843 Mean :3.057 Mean :3.758 Mean :1.199
3rd Qu.:6.400 3rd Qu.:3.300 3rd Qu.:5.100 3rd Qu.:1.800
Max. :7.900 Max. :4.400 Max. :6.900 Max. :2.500
Species
setosa :50
36
versicolor:50
virginica :50
37
22. Use R-Project for data visualization of social media data
PROGRAM:
# Visualize social-media engagement from a CSV export.
# NOTE(review): run-time installs are unusual; typically done once
# outside the script.
install.packages("ggplot2")
install.packages("dplyr")
install.packages("readr")
library(readr)
# NOTE(review): hard-coded absolute Windows path — adjust per machine.
social_media_data <-
read_csv("E:\\social media data.csv")
library(dplyr)
# Numeric summaries and a column-type overview of the data
summary(social_media_data)
glimpse(social_media_data)
library(ggplot2)
# Line chart of likes over time, one line per platform
ggplot(social_media_data, aes(x = date, y = likes, color = platform, group = platform)) +
geom_line() + labs(title= "Likes Over Time", x = "Date", y = "Number of Likes") +
theme_minimal()
# Save the most recent plot to the working directory
ggsave("likes_over_time.png")
INPUT:
social media data.csv
38
OUTPUT:
> summary(social_media_data)
platform likes comments shares
Length:20 Min. :1.200 Min. : 7.00 Min. : 1.00
Class :character 1st Qu.:1.775 1st Qu.:14.75 1st Qu.: 4.75
Mode :character Median :2.500 Median :25.50 Median : 9.50
Mean :2.820 Mean :31.60 Mean :13.10
3rd Qu.:3.650 3rd Qu.:39.25 3rd Qu.:17.00
Max. :5.400 Max. :90.00 Max. :46.00
date
Length:20
Class :character
Mode :character
> glimpse(social_media_data)
Rows: 20
Columns: 5
$ platform <chr> "youtube", "instagram", "twitter", "openid", "facebook", "t…
$ likes <dbl> 2.3, 2.3, 2.3, 5.4, 1.5, 2.0, 1.5, 2.7, 1.2, 1.7, 4.6, 4.1,…
$ comments <dbl> 50, 15, 53, 90, 14, 20, 14, 40, 19, 17, 33, 33, 22, 11, 10,…
$ shares <dbl> 34, 9, 6, 26, 15, 4, 12, 7, 15, 2, 5, 46, 3, 23, 11, 2, 6, …
$ date <chr> "3/6/2023", "3/7/2023", "3/8/2023", "3/9/2023", "3/10/2024"…
39