BDA April-May 2024 Answers

The document discusses various optimization techniques including Genetic Algorithms and Simulated Annealing, highlighting their applications and differences. It also covers data structures in Hadoop, the reliability of RDBMS vs. NoSQL, and provides R programming examples for matrix operations and string manipulations. Additionally, it introduces the DGIM algorithm for estimating binary data stream counts and explains statistical moments for data analysis.

Uploaded by

Boopathy S


BDA April-May 2024

Part – A

3. Genetic Algorithm vs. Genetic Programming

o Genetic Algorithm: An optimization technique inspired by natural selection. It evolves a population of candidate solutions, usually encoded as fixed-length strings (for example, bit strings), through selection, crossover, and mutation.
o Genetic Programming: A specialization of genetic algorithms in which the candidate solutions are computer programs, typically represented as tree structures, that evolve over time.

4. Purpose of Simulated Annealing:

Simulated annealing is a probabilistic optimization technique used to find an approximate global optimum for functions with
multiple local optima. It mimics the annealing process in metallurgy, where gradual cooling leads to a stable crystal structure.
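
The idea can be sketched in a few lines of R. This is a minimal illustration, not a tuned implementation: the function name simulated_annealing, the starting temperature, the cooling factor, the step size, and the test function f are all assumptions chosen for the example. A worse candidate is accepted with probability exp(-delta / temp), which shrinks as the temperature cools.

```r
# Minimal simulated-annealing sketch for minimizing a one-dimensional function.
# All names and parameter values here are illustrative assumptions.
simulated_annealing <- function(f, x0, temp = 10, cooling = 0.95,
                                steps = 2000, step_sd = 0.5) {
  x <- x0
  fx <- f(x)
  best_x <- x
  best_fx <- fx
  for (k in seq_len(steps)) {
    # Propose a random neighbour of the current solution
    candidate <- x + rnorm(1, sd = step_sd)
    f_cand <- f(candidate)
    delta <- f_cand - fx
    # Always accept improvements; accept worse moves with probability exp(-delta / temp)
    if (delta < 0 || runif(1) < exp(-delta / temp)) {
      x <- candidate
      fx <- f_cand
      if (fx < best_fx) {
        best_x <- x
        best_fx <- fx
      }
    }
    # Gradual cooling: worse moves become less likely over time
    temp <- temp * cooling
  }
  list(x = best_x, value = best_fx)
}

set.seed(42)
f <- function(x) x^2 + 10 * sin(3 * x)  # a function with several local minima
result <- simulated_annealing(f, x0 = 4)
print(result)
```

Early on, the high temperature lets the search escape local minima; as the temperature falls, the search settles into the best region it has found.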

5. Examples of Stream Sources:

1. Keyboard inputs (stdin)
2. Network sockets
3. Sensor data streams
4. Real-time stock price feeds

7. Namenode vs. Datanode:

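o Namenode: Manages the file system metadata (namespace, permissions, and the mapping of files to blocks) and tracks which Datanodes hold each block in the Hadoop Distributed File System (HDFS).
o Datanode: Stores the actual data blocks, serves read/write requests from clients, and reports to the Namenode through periodic heartbeats and block reports.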

8. Which is more reliable, RDBMS or NoSQL? Why?

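• RDBMS: Generally the more reliable choice for transactional workloads. It enforces ACID (Atomicity, Consistency, Isolation, Durability) guarantees on structured data, making it suitable for critical transactional systems.
• NoSQL: Better suited to unstructured or semi-structured data and to horizontal scaling, but many NoSQL stores relax strict ACID guarantees (for example, by offering eventual consistency) in exchange for availability and scalability.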

9. R Program to Create a 2x2 Matrix and Display Its First Row:

# Create a 2x2 matrix (values are filled column by column)
matrix_data <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)

# Display the first row
first_row <- matrix_data[1, ]
print(first_row)  # [1] 1 3

10. Illustration of regexpr() in R Programming:

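text <- "This is a test string"
pattern <- "test"

# regexpr() returns the start position of the first match (-1 if there is none),
# with the length of the match attached as the "match.length" attribute
match_pos <- regexpr(pattern, text)
print(match_pos)  # start position 11, match.length 4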
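
As a small extension, the sketch below pairs regexpr() with regmatches() to extract the matched text, and uses gregexpr() to find every match rather than only the first. The sample sentence and variable names are made up for illustration.

```r
# Illustrative sample text (an assumption, not from the question paper)
sample_text <- "This is a test string with another test"

# First match only: regexpr() + regmatches() returns the matched substring
first_match <- regexpr("test", sample_text)
first_text <- regmatches(sample_text, first_match)

# All matches: gregexpr() returns a list with one element per input string
all_matches <- gregexpr("test", sample_text)[[1]]
all_positions <- as.integer(all_matches)

print(first_text)     # "test"
print(all_positions)  # 11 36
```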

Part – B

12 a). Genetic Algorithm:

• Genetic algorithms are often used for optimization tasks such as tuning machine learning hyperparameters, query optimization, and clustering large datasets.
• Crossover recombines parts of two parent solutions and mutation introduces small random changes; together they let the population explore the search space and evolve better solutions over successive generations.

Chromosome: 11100110

One-Point Crossover:

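• A crossover point is selected randomly, say after the 3rd bit. The offspring swap the parents' tails beyond that point.
• Parent 1: 111|00110
• Parent 2: 000|11101 (example second parent)
• Resulting offspring:
o Offspring 1: 11111101 (111 from Parent 1 + 11101 from Parent 2)
o Offspring 2: 00000110 (000 from Parent 2 + 00110 from Parent 1)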

Two-Point Crossover:

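• Two crossover points are selected, say after the 2nd and 6th bits. The offspring swap the middle segment between the two points.
• Parent 1: 11|1001|10
• Parent 2: 00|1110|01 (example second parent)
• Resulting offspring:
o Offspring 1: 11111010 (11 and 10 from Parent 1, 1110 from Parent 2)
o Offspring 2: 00100101 (00 and 01 from Parent 2, 1001 from Parent 1)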

Mutation:

• Mutation flips a bit at a random position.
• Assume the mutation point is the 5th bit.
• Original chromosome: 11100110
• After mutation: 11101110 (5th bit flipped from 0 to 1)
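
The three operators above can be reproduced with a short R sketch. The helper names one_point_crossover, two_point_crossover, and flip_bit are hypothetical, and the crossover and mutation points are fixed to the values used in the worked example rather than chosen at random.

```r
# Illustrative helpers (names are assumptions); chromosomes are bit strings.
one_point_crossover <- function(p1, p2, point) {
  n <- nchar(p1)
  # Each child takes the head of one parent and the tail of the other
  c(paste0(substr(p1, 1, point), substr(p2, point + 1, n)),
    paste0(substr(p2, 1, point), substr(p1, point + 1, n)))
}

two_point_crossover <- function(p1, p2, a, b) {
  n <- nchar(p1)
  # Each child keeps its own outer segments and takes the other parent's middle
  c(paste0(substr(p1, 1, a), substr(p2, a + 1, b), substr(p1, b + 1, n)),
    paste0(substr(p2, 1, a), substr(p1, a + 1, b), substr(p2, b + 1, n)))
}

flip_bit <- function(chromosome, pos) {
  bit <- substr(chromosome, pos, pos)
  substr(chromosome, pos, pos) <- if (bit == "0") "1" else "0"
  chromosome
}

one_point <- one_point_crossover("11100110", "00011101", point = 3)
two_point <- two_point_crossover("11100110", "00111001", a = 2, b = 6)
mutated <- flip_bit("11100110", pos = 5)

print(one_point)  # "11111101" "00000110"
print(two_point)  # "11111010" "00100101"
print(mutated)    # "11101110"
```

In a full genetic algorithm the points would be drawn at random (for example with sample()), and mutation would be applied with a small probability per bit.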

13 a) Moments:

• Moments summarize how element frequencies are distributed in a large data stream.
• Higher moments help detect anomalies and measure how uneven (skewed) the frequency distribution is, which is useful when building machine learning models over streaming data.

1. Surprise Number (Second Moment)

The second frequency moment (also known as the surprise number) is defined as:

F_2 = \sum_{i} f_i^2

where f_i is the frequency (number of occurrences) of the i-th distinct element in the stream.

Given Stream:

3, 1, 4, 1, 3, 4, 2, 1, 2

Frequency Count:

• 1 appears 3 times
• 2 appears 2 times
• 3 appears 2 times
• 4 appears 2 times

Calculation of Second Moment:

F_2 = 3^2 + 2^2 + 2^2 + 2^2

F_2 = 9 + 4 + 4 + 4 = 21

2. Third Moment Calculation

The third moment is defined as:

F_3 = \sum_{i} f_i^3

Calculation:

F_3 = 3^3 + 2^3 + 2^3 + 2^3

F_3 = 27 + 8 + 8 + 8 = 51
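
Both results can be checked in R. This is a small sketch that computes the exact moments by counting every element, which works for this short stream; the helper name frequency_moment is an assumption. For very large streams the exact counts would not fit in memory, and an approximation technique would be used instead.

```r
# The stream from the worked example
stream <- c(3, 1, 4, 1, 3, 4, 2, 1, 2)

# table() counts how many times each distinct element occurs
freq <- table(stream)

# k-th frequency moment: sum of the k-th powers of the frequencies
# (frequency_moment is an illustrative helper name)
frequency_moment <- function(freq, k) sum(as.integer(freq)^k)

F2 <- frequency_moment(freq, 2)
F3 <- frequency_moment(freq, 3)

print(F2)  # 21
print(F3)  # 51
```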

13. b) Purpose of the Datar-Gionis-Indyk-Motwani (DGIM) Algorithm

The DGIM algorithm is designed to efficiently estimate the number of 1s in the most recent N elements of a binary data stream
using limited memory. It is commonly used in applications like network traffic monitoring, clickstream analysis, and sensor data
processing.

Key Idea

Instead of storing every bit in the stream, DGIM groups bits into buckets and maintains only a logarithmic number of buckets,
thereby reducing memory usage while providing approximate results.

Example Explanation

Problem Statement:

Suppose we have a binary data stream:

1, 0, 1, 1, 0, 1, 0, 1, 1, 1 (N = 10)

We want to estimate the number of 1s in the last 10 elements.

Steps:

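1. Bucket Representation:
o Each bucket covers a stretch of the stream and records only two values: its size (the number of 1s it contains, always a power of 2) and the timestamp of its most recent 1.
o Every 1 in the window belongs to exactly one bucket; 0s are not stored.
2. Bucket Management:
o The algorithm keeps buckets of sizes 1, 2, 4, etc. (powers of 2), with at most two buckets of any one size.
o Each new 1 creates a bucket of size 1. If that leaves three buckets of the same size, merge the two oldest into one bucket of twice the size, and repeat for the next size up if needed.
o Drop a bucket once its timestamp falls outside the N most recent elements.
3. Counting the 1s:
o Add the sizes of all buckets except the oldest one that still overlaps the window.
o Add half the size of that oldest bucket, because only its most recent 1 is known to lie within the window.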

Example Calculation:

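If the three oldest buckets overlapping the window have sizes 2, 2, and 4 (the size-4 bucket being the oldest), their contribution to the estimated count of 1s is:

2 + 2 + 4/2 = 6

This is an approximation because only half of the oldest bucket is counted: its most recent 1 is known to be inside the window, but the positions of its other 1s are not stored. The estimate is within 50% of the true count.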
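
The sketch below runs the bucket bookkeeping on the 10-bit stream from the problem statement. It follows the standard DGIM estimate, which counts only half of the oldest bucket. The function names dgim_update and dgim_estimate and the list-based bucket layout are assumptions made for this illustration, not part of the original answer.

```r
# Each bucket is stored as c(timestamp_of_most_recent_1, size); newest bucket first.
# dgim_update and dgim_estimate are illustrative names.
dgim_update <- function(buckets, bit, t, N) {
  # Drop buckets whose most recent 1 has left the window of the last N bits
  buckets <- Filter(function(b) b[1] > t - N, buckets)
  if (bit == 1) {
    # A new 1 always starts a bucket of size 1
    buckets <- c(list(c(t, 1)), buckets)
    size <- 1
    repeat {
      idx <- which(vapply(buckets, function(b) b[2] == size, logical(1)))
      if (length(idx) < 3) break
      # Three buckets of this size: merge the two oldest (last two in the list)
      old <- tail(idx, 2)
      buckets[[old[1]]] <- c(buckets[[old[1]]][1], 2 * size)  # keep newer timestamp
      buckets[[old[2]]] <- NULL
      size <- 2 * size
    }
  }
  buckets
}

dgim_estimate <- function(buckets) {
  if (length(buckets) == 0) return(0)
  sizes <- vapply(buckets, function(b) b[2], numeric(1))
  # All buckets in full, except the oldest, which contributes half its size
  sum(sizes) - sizes[length(sizes)] / 2
}

stream <- c(1, 0, 1, 1, 0, 1, 0, 1, 1, 1)
N <- 10
buckets <- list()
for (t in seq_along(stream)) {
  buckets <- dgim_update(buckets, stream[t], t, N)
}

sizes <- vapply(buckets, function(b) b[2], numeric(1))
print(sizes)                   # 1 2 4 (newest to oldest)
print(dgim_estimate(buckets))  # 5
print(sum(stream))             # 7 (exact count, for comparison)
```

For this stream the estimate is 5 against a true count of 7. The gap comes entirely from halving the oldest bucket, which is the price of storing only a handful of (timestamp, size) pairs instead of the whole window.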

Applications in Big Data:

• Network Traffic Analysis: Monitoring active connections over time.
• Clickstream Analysis: Counting recent user clicks without storing entire histories.
• IoT Systems: Tracking sensor activations in resource-constrained environments.

16 a) Create a Class for a Matrix and Perform Matrix Multiplication in R

# Define a custom S4 class that wraps a matrix
MomMatrix <- setClass("MomMatrix", slots = list(data = "matrix"))

# Method to print the matrix
setMethod("show", "MomMatrix", function(object) {
  cat("Mom Matrix:\n")
  print(object@data)
})

# Matrix multiplication function
setGeneric("multiplyMatrices", function(x, y) standardGeneric("multiplyMatrices"))
setMethod("multiplyMatrices", c("MomMatrix", "MomMatrix"), function(x, y) {
  # The column count of x must equal the row count of y
  stopifnot(ncol(x@data) == nrow(y@data))
  result_data <- x@data %*% y@data
  return(MomMatrix(data = result_data))
})

# Creating two MomMatrix objects (matrix() fills values column by column)
matrix1 <- MomMatrix(data = matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2))
matrix2 <- MomMatrix(data = matrix(c(2, 0, 1, 2), nrow = 2, ncol = 2))

# Displaying the matrices
show(matrix1)
show(matrix2)

# Performing multiplication
result_matrix <- multiplyMatrices(matrix1, matrix2)
show(result_matrix)  # rows of the product: (2, 7) and (4, 10)

16 b) i) R Program to Check if String has a Given Suffix

# Function to check whether a string ends with any of the given suffixes
check_suffix <- function(string, suffixes) {
  # endsWith() compares text literally, so characters such as "." in a suffix
  # are not treated as regular-expression metacharacters
  any(endsWith(string, suffixes))
}

# Example usage
string <- "index.html"
suffixes <- c("html", "php", "txt")
result <- check_suffix(string, suffixes)
cat("Does the string have one of the suffixes?", result, "\n")

16 b) ii) R Program to Find Substrings of Length 'n' Starting and Ending with the Same Character

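find_substrings <- function(string, n) {
  substrings <- c()
  length_str <- nchar(string)
  # No substring of length n exists if n is out of range
  if (n < 1 || n > length_str) return(substrings)

  for (i in 1:(length_str - n + 1)) {
    candidate <- substr(string, i, i + n - 1)
    # Compare the first and last characters of the candidate substring
    if (substr(candidate, 1, 1) == substr(candidate, n, n)) {
      substrings <- c(substrings, candidate)
    }
  }
  return(substrings)
}

# Example usage
# (n = 3 would return nothing for "abcabc": abc, bca, cab, and abc all differ at the ends)
string <- "abcabc"
n <- 4
result <- find_substrings(string, n)
cat("Substrings of length", n, "that start and end with the same character:", result, "\n")
# Prints the substrings abca, bcab, and cabc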
