Maths Lab

The document outlines a series of mathematical and statistical exercises conducted over several weeks, focusing on data analysis techniques including mean, median, mode, variance, standard deviation, correlation coefficients, regression analysis, and hypothesis testing. Each week presents a different dataset and corresponding calculations, using R programming code to perform statistical analyses. Key topics include ungrouped and grouped data, binomial distribution probabilities, correlation between variables, and linear and quadratic regression models.


Week 1 (7-01-2025):
1. A researcher collects data on the heights (in cm) of students in a class. The data is
recorded in ungrouped form.
Ungrouped Data
The recorded heights of 9 students are:
12, 15, 14, 14, 16, 15, 12, 18, 14

1. Calculate the Mean, Median, and Mode of the dataset.

2. Determine the Variance, Standard Deviation, and Coefficient of Variation of the dataset.
Code:
# Ungrouped Data
ungrouped_data <- c(12, 15, 14, 14, 16, 15, 12, 18, 14)

# Mean
mean_value <- mean(ungrouped_data)

# Median
median_value <- median(ungrouped_data)

# Mode (first most frequent value; here 14, which occurs three times)
mode_value <- names(which.max(table(ungrouped_data)))

# Variance
variance_value <- var(ungrouped_data)

# Standard Deviation
sd_value <- sd(ungrouped_data)

# Coefficient of Variation
cv_value <- (sd_value / mean_value) * 100

# Display Results
cat("Ungrouped Data Results:\n")
cat("Mean:", mean_value, "\n")
cat("Median:", median_value, "\n")
cat("Mode:", mode_value, "\n")
cat("Variance:", variance_value, "\n")
cat("Standard Deviation:", sd_value, "\n")
cat("Coefficient of Variation:", cv_value, "%\n")

Week 2 (21-01-2025):
Grouped Data
The heights of another set of students are grouped into the following classes along with
their respective frequencies:

Height Range (cm)   Frequency
10 - 19             15
20 - 29             18
30 - 39             14
40 - 49             12
50 - 59             20
Code:
# Grouped data
classes <- c("10-19", "20-29", "30-39", "40-49", "50-59")
frequencies <- c(15, 18, 14, 12, 20)
midpoints <- c(14.5, 24.5, 34.5, 44.5, 54.5)

# Mean
mean_value <- sum(midpoints * frequencies) / sum(frequencies)

# Median (uses the nominal lower class limit as L; with continuous class
# boundaries 9.5, 19.5, ... L would shift down by 0.5)
cumulative_freq <- cumsum(frequencies)
n <- sum(frequencies)
median_class_index <- which(cumulative_freq >= n/2)[1]
L <- as.numeric(sub("-.*", "", classes[median_class_index]))  # lower limit of median class
h <- 10                                                       # class width
f <- frequencies[median_class_index]
cf_previous <- ifelse(median_class_index == 1, 0, cumulative_freq[median_class_index - 1])
median_value <- L + ((n/2 - cf_previous) / f) * h

# Mode
modal_class_index <- which.max(frequencies)
L_mode <- as.numeric(sub("-.*", "", classes[modal_class_index]))
f1 <- frequencies[modal_class_index]
f0 <- ifelse(modal_class_index == 1, 0, frequencies[modal_class_index - 1])
f2 <- ifelse(modal_class_index == length(frequencies), 0, frequencies[modal_class_index + 1])
mode_value <- L_mode + ((f1 - f0) / ((f1 - f0) + (f1 - f2))) * h

# Variance, Standard Deviation, Coefficient of Variation
variance <- sum((midpoints - mean_value)^2 * frequencies) / sum(frequencies)
sd_value <- sqrt(variance)
cv <- (sd_value / mean_value) * 100

cat("Mean:", mean_value, "\n")
cat("Median:", median_value, "\n")
cat("Mode:", mode_value, "\n")
cat("Standard Deviation:", sd_value, "\n")
cat("Coefficient of Variation:", cv, "%", "\n")
Week 3 (4-02-2025):
Two golfers, Golfer A and Golfer B, recorded their scores over 20 rounds of golf. Their scores
are as follows:
Golfer A's Scores:
74, 75, 78, 72, 83, 85, 70, 73, 74, 71, 65, 68, 70, 55, 80, 89, 90, 95, 78, 80
Golfer B's Scores:
82, 84, 86, 88, 87, 85, 83, 81, 80, 82, 89, 90, 92, 91, 78, 79, 82, 87, 85, 88

1. Compute the Coefficient of Variation (CV) for both golfers' scores.

2. Compare the CVs and determine which golfer is more consistent in performance.

3. Interpret the results: a lower CV indicates greater consistency. Based on your calculations, which golfer demonstrates less variation in scores?

Code:
# Golfers' scores over 20 rounds
golfer_a <- c(74, 75, 78, 72, 83, 85, 70, 73, 74, 71, 65, 68, 70, 55, 80, 89, 90, 95, 78, 80)
golfer_b <- c(82, 84, 86, 88, 87, 85, 83, 81, 80, 82, 89, 90, 92, 91, 78, 79, 82, 87, 85, 88)

# Function to calculate Coefficient of Variation
cv <- function(x) {
  return((sd(x) / mean(x)) * 100)
}

# Compute CVs
cv1 <- cv(golfer_a)
cv2 <- cv(golfer_b)

# Print CV values
cat("Coefficient of Variation for Golfer A:", cv1, "%\n")
cat("Coefficient of Variation for Golfer B:", cv2, "%\n")

# Compare CVs for consistency
if (cv1 < cv2) {
  cat("Golfer A is more consistent.\n")
} else if (cv1 > cv2) {
  cat("Golfer B is more consistent.\n")
} else {
  cat("Both golfers have the same level of consistency.\n")
}
Output:

Week 4 (11-02-2025):
A fair coin is tossed 10 times. Let X be the number of heads obtained.

1. Find the probability of getting exactly 5 heads.

2. Find the probability of getting at most 5 heads.

3. Compute the mean and variance of the binomial distribution.


Code:
n <- 10   # number of tosses
p <- 0.5  # probability of heads (fair coin)

# P(X = 5): exactly 5 heads
prob_5_heads <- dbinom(5, size = n, prob = p)
cat("Probability of exactly 5 heads:", prob_5_heads, "\n")

# P(X <= 5): at most 5 heads
prob_at_most_5_heads <- pbinom(5, size = n, prob = p)
cat("Probability of at most 5 heads:", prob_at_most_5_heads, "\n")

# Mean and variance of the binomial distribution: np and np(1 - p)
mean_value <- n * p
variance_value <- n * p * (1 - p)
cat("Mean of the binomial distribution:", mean_value, "\n")
cat("Variance of the binomial distribution:", variance_value, "\n")

Output:

Week 5 (18-02-2025):
A researcher is studying the relationship between the heights of fathers and their sons. The
recorded heights (in inches) of 8 father-son pairs are given below:

Father's Heights (in inches): 65, 66, 67, 67, 68, 69, 70, 72
Son's Heights (in inches): 67, 68, 65, 68, 72, 72, 69, 71

1. Compute the Karl Pearson correlation coefficient to determine the strength and
direction of the relationship between the heights of fathers and their sons.

2. Interpret the correlation coefficient:


o Is the relationship positive or negative?
o Is it weak, moderate, or strong?
Code:
father_height <- c(65, 66, 67, 67, 68, 69, 70, 72)
son_height <- c(67, 68, 65, 68, 72, 72, 69, 71)

# Karl Pearson correlation coefficient
correlation <- cor(father_height, son_height, method = "pearson")
cat("Karl Pearson correlation coefficient:", correlation, "\n")

# Scatter plot with fitted regression line
plot(father_height, son_height,
     main = "Scatter Plot: Father's Height vs Son's Height",
     xlab = "Father's Height (inches)", ylab = "Son's Height (inches)",
     pch = 16, col = "blue")
abline(lm(son_height ~ father_height), col = "red", lwd = 2)

Output:

Week 6 (25-02-2025):
The English and Mathematics scores of 5 students are given below:
English Marks: 75, 40, 52, 65, 60
Maths Marks: 25, 42, 35, 29, 33

1. Compute Spearman’s Rank Correlation Coefficient manually using the formula.

2. Verify the result using a built-in function.

3. Interpret the correlation – does a higher English score relate to a higher or lower
Maths score?
Code:
english_marks <- c(75, 40, 52, 65, 60)
maths_marks <- c(25, 42, 35, 29, 33)

# Rank the data
rank_english <- rank(english_marks)
rank_maths <- rank(maths_marks)

# Differences between ranks and their squares
d <- rank_english - rank_maths
d_squared <- d^2

# Number of observations
n <- length(english_marks)

# Apply Spearman's formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
rho_manual <- 1 - (6 * sum(d_squared)) / (n * (n^2 - 1))

# Verify using the built-in function
rho_builtin <- cor(english_marks, maths_marks, method = "spearman")

# Print results
print(paste("Spearman's Rank Correlation (manual):", rho_manual))
print(paste("Spearman's Rank Correlation (built-in):", rho_builtin))
Output:

2. Find the linear regression for the following data:

X   1.0  1.1  1.2  1.3  1.4  1.5  1.6  1.7  1.8  1.9  2.0
Y   8.5  7.5  7.9  6.5  7.5  4.2  3.8  7.9  9.2  8.1  8.3

Also find the error (residual) for each x and the sum of the squares of the errors.
Code:
# Given data
x <- c(1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0)
y <- c(8.5, 7.5, 7.9, 6.5, 7.5, 4.2, 3.8, 7.9, 9.2, 8.1, 8.3)
# Perform linear regression
model <- lm(y ~ x)
# Get predicted values
y_pred <- predict(model)
# Compute residuals (errors)
errors <- y - y_pred
# Compute Sum of Squared Errors (SSE)
SSE <- sum(errors^2)
print(summary(model))
# Print each error (residual)
print("Errors (Residuals) for each case:")
error_table <- data.frame(x, y, y_pred, errors)
print(error_table)
# Print Sum of Squared Errors
print(paste("Sum of Squared Errors (SSE):", round(SSE, 2)))
# Plot data and regression line
plot(x, y, main = "Linear Regression", xlab = "X Values", ylab = "Y Values", pch = 16, col =
"blue")
abline(model, col = "red", lwd = 2)
Week 7 (4-03-2025):
1. Test the following hypothesis using R:
H0: the coin is fair, P(H) = 0.5
H1: the coin is biased, P(H) = 0.6

n = 100, x = 40, alpha = 0.05

Test the hypothesis at the alpha = 0.05 significance level.
Code:
# Given values
n <- 100       # sample size
x <- 40        # number of heads
p0 <- 0.5      # null hypothesis proportion (fair coin)
alpha <- 0.05  # significance level

# Sample proportion
phat <- x / n

# Standard error under H0
se <- sqrt(p0 * (1 - p0) / n)

# Z-test statistic
z <- (phat - p0) / se

# Two-tailed p-value
p_value <- 2 * pnorm(-abs(z))

# Critical Z-value for alpha = 0.05 (two-tailed)
z_critical <- qnorm(1 - alpha / 2)

# Results
cat("Sample proportion:", phat, "\n")
cat("Z statistic:", z, "\n")
cat("Critical Z value:", z_critical, "\n")
cat("P-value:", p_value, "\n")

# Decision
if (abs(z) > z_critical) {
  cat("Reject the null hypothesis: The coin is biased.\n")
} else {
  cat("Fail to reject the null hypothesis: The coin is fair.\n")
}
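As a cross-check on the normal approximation above, R's exact binomial test can be applied to the same counts (an addition, not part of the original code):
# Exact binomial test as a cross-check on the Z-approximation
binom.test(x = 40, n = 100, p = 0.5, alternative = "two.sided")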
Output:

2. Using R, find a quadratic regression model for the following data:

X   3    4    5    6    7
Y   2.5  3.2  3.8  6.5  11.5

Also find the regression of X on Y (see the sketch after the code below).
Code:
# Given data
X <- c(3, 4, 5, 6, 7)
Y <- c(2.5, 3.2, 3.8, 6.5, 11.5)

# Create a data frame
data <- data.frame(X, Y)

# Fit a quadratic regression model
model <- lm(Y ~ X + I(X^2), data = data)

# Display model summary
summary(model)

# Print the quadratic equation coefficients
coefficients <- coef(model)
cat("Quadratic Regression Model: Y =", round(coefficients[3], 4), "X² +",
    round(coefficients[2], 4), "X +", round(coefficients[1], 4), "\n")

# Plot the data and regression curve
plot(X, Y, pch = 16, col = "blue", main = "Quadratic Regression", xlab = "X", ylab = "Y")
curve(coefficients[1] + coefficients[2] * x + coefficients[3] * x^2,
      add = TRUE, col = "red", lwd = 2)
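The code above fits Y on X only; the question also asks for X on Y. A minimal sketch, assuming a simple linear regression of X on Y is what is intended:
# Regression of X on Y (assumed to be a simple linear fit)
model_x_on_y <- lm(X ~ Y, data = data)
coef_xy <- coef(model_x_on_y)
cat("Regression of X on Y: X =", round(coef_xy[2], 4), "Y +", round(coef_xy[1], 4), "\n")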
Output:

Week 8 (11-03-2025):
1. Fit a straight line using the method of least squares:

X   1      2      3      4      5      6      7      8      9      10
Y   52.5   58.7   65     70.2   75.4   81.1   87.2   95.5   102.2  108.4
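No code is shown in the source for this exercise; a minimal sketch using lm(), consistent with the other weeks:
Code:
# Given data
x <- 1:10
y <- c(52.5, 58.7, 65, 70.2, 75.4, 81.1, 87.2, 95.5, 102.2, 108.4)

# Fit the straight line y = a + b*x by least squares
model <- lm(y ~ x)
coefs <- coef(model)
cat("Fitted line: Y =", round(coefs[2], 4), "X +", round(coefs[1], 4), "\n")

# Plot the data and the fitted line
plot(x, y, pch = 16, col = "blue", main = "Least Squares Line", xlab = "X", ylab = "Y")
abline(model, col = "red", lwd = 2)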
Output:

2. It is claimed that a random sample of 49 tyres had a mean life of 15200 km. This sample
was drawn from a population whose mean is 15150 km and whose standard deviation is 1200 km.
Test the significance at the 5% level.
Code:
sample_mean <- 15200
pop_mean <- 15150
sd <- 1200
n <- 49
alpha <- 0.05

# H0: pop_mean = 15150
# H1: pop_mean != 15150  --> two-tailed test
z_score <- (sample_mean - pop_mean) / (sd / sqrt(n))
critical_value <- qnorm(1 - alpha / 2)
cat("Z-score:", round(abs(z_score), 3), "\n")
cat("Critical value (Z_alpha/2):", round(critical_value, 4), "\n")
if (abs(z_score) > critical_value) {
  cat("Reject the null hypothesis H0: There is significant evidence that the mean differs from 15150 km.\n")
} else {
  cat("Fail to reject the null hypothesis H0: There is not enough evidence to conclude that the mean differs from 15150 km.\n")
}
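The block above reports only the Z-score and critical value; a two-tailed p-value can be added with one line (an addition, not part of the original):
# Two-tailed p-value for the Z-score above
p_value <- 2 * (1 - pnorm(abs(z_score)))
cat("P-value:", round(p_value, 4), "\n")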
Output:

Week 9 (18-03-2025):
1. An experiment was performed to compare the abrasive wear of two different laminated
materials. Twelve pieces of material 1 were tested by exposing each piece to a machine
measuring wear. Ten pieces of material 2 were similarly tested. In each case, the depth of
wear was observed. The samples of material 1 gave an average (coded) wear of 85 units with
a sample standard deviation of 4, while the samples of material 2 gave an average of 81 with
a sample standard deviation of 5. Can we conclude at the 0.05 level of significance that the
abrasive wear of material 1 exceeds that of material 2 by more than 2 units? Assume the
populations to be approximately normal with equal variances.
Code:
n1 <- 12      # sample size, material 1
n2 <- 10      # sample size, material 2
x1 <- 85      # sample mean, material 1
x2 <- 81      # sample mean, material 2
sd1 <- 4      # sample SD, material 1
sd2 <- 5      # sample SD, material 2
alpha <- 0.05
diff_0 <- 2   # hypothesized difference under the null hypothesis

# Pooled standard deviation (equal variances assumed)
sp <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))

# One-tailed t-test statistic
t_stat <- ((x1 - x2) - diff_0) / (sp * sqrt(1/n1 + 1/n2))
df <- n1 + n2 - 2  # degrees of freedom
t_critical <- qt(1 - alpha, df)
p_value <- 1 - pt(t_stat, df)

cat("Test Statistic (t):", t_stat, "\n")
cat("Critical Value (t_critical):", t_critical, "\n")
cat("p-value:", p_value, "\n")
if (t_stat > t_critical) {
  cat("Reject the null hypothesis: Material 1's wear exceeds Material 2's by more than 2 units.\n")
} else {
  cat("Fail to reject the null hypothesis: No significant evidence that Material 1's wear exceeds Material 2's by more than 2 units.\n")
}
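With the raw wear measurements (not given here), the same test could be run directly with R's built-in t.test; a hypothetical call, assuming wear1 and wear2 held the raw data for the two materials:
# Hypothetical: wear1 and wear2 would hold the raw measurements for each material
# t.test(wear1, wear2, mu = 2, alternative = "greater", var.equal = TRUE)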
Output:

Week 10 (1-04-2025):
1. Two materials are tested for wear.
Material 1: n1 = 12, mean = 85, SD = 4
Material 2: n2 = 10, mean = 81, SD = 5
Test if Material 1's wear exceeds Material 2's by more than 2 units, using α = 0.05.
Assume equal variances.
State the test statistic, critical value, p-value, and conclusion.
Code:
# Given data
s1 <- 0.035 # Standard deviation of Company 1
s2 <- 0.062 # Standard deviation of Company 2
n1 <- 12 # Sample size for Company 1
n2 <- 12 # Sample size for Company 2
alpha <- 0.05 # Level of significance
# Compute the F-test statistic
F_stat <- (s1^2) / (s2^2)
# Compute critical value for left-tailed test
F_critical <- qf(alpha, df1 = n1 - 1, df2 = n2 - 1)

# Display results
cat("F-test statistic:", F_stat, "\n")
cat("Critical value (left-tailed):", F_critical, "\n")

# Decision rule
if (F_stat < F_critical) {
cat("Reject the null hypothesis: There is evidence that Company 1 has less variability.\n")
} else {
cat("Fail to reject the null hypothesis: No significant evidence that Company 1 has less
variability.\n")
}
Output:

2. A machine produces metal rods. A sample of 15 rods has a standard deviation of 0.61 mm.
Test if the population standard deviation is greater than 0.50 mm at the 5% significance level.
Use the chi-square test. State the test statistic, critical value, and conclusion.
Code:
# Given data
s <- 0.61        # Sample standard deviation
sigma_0 <- 0.50  # Hypothesized standard deviation
n <- 15          # Sample size
alpha <- 0.05    # Level of significance

# Compute the chi-square test statistic
chi_sq_stat <- (n - 1) * (s^2) / (sigma_0^2)

# Compute critical value for right-tailed test
df <- n - 1  # Degrees of freedom
chi_critical <- qchisq(1 - alpha, df)

# Display results
cat("Chi-square test statistic:", chi_sq_stat, "\n")
cat("Critical value:", chi_critical, "\n")

# Decision rule
if (chi_sq_stat > chi_critical) {
cat("Reject the null hypothesis: The standard deviation is significantly greater than 0.50
mm.\n")
} else {
cat("Fail to reject the null hypothesis: No significant evidence that the standard deviation is
greater than 0.50 mm.\n")
}
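A right-tailed p-value can also be reported (an addition to the original block):
# Right-tailed p-value for the chi-square statistic
p_value <- 1 - pchisq(chi_sq_stat, df)
cat("P-value:", round(p_value, 4), "\n")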
Output:
