Lab-4
Objectives
• Learn how to model the normal distribution in R
• Learn how to calculate probabilities using the normal distribution
• Perform hypothesis testing using the normal distribution
• Find confidence intervals using the normal distribution
Plotting
The normal distribution is a bell-shaped distribution defined by a mean (µ) and a standard
deviation (σ). You can visualize its density curve using the dnorm() function in R.
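# Define the normal distribution parameters
mu <- 0
sigma <- 1
# Evaluate the density on a grid of x values
x <- seq(-5, 5, length.out = 1000)
y <- dnorm(x, mean = mu, sd = sigma)
plot(x, y, type = "l", xlab = "x", ylab = "Density", main = "Normal Distribution")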
[Figure: Normal Distribution. Density plotted against x.]
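As a quick check on what dnorm() returns, the sketch below compares it with the normal density formula f(x) = 1 / (σ√(2π)) · exp(−(x − µ)² / (2σ²)). The helper name normal_density is hypothetical and is used only for illustration.

```r
# Hypothetical helper: the normal density written out by hand
normal_density <- function(x, mu, sigma) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
}

x <- seq(-4, 4, by = 0.5)
by_hand <- normal_density(x, mu = 0, sigma = 1)
built_in <- dnorm(x, mean = 0, sd = 1)

# TRUE when the two agree up to floating-point tolerance
print(isTRUE(all.equal(by_hand, built_in)))
```

The two vectors should agree, which shows that dnorm() gives the height of the curve at each x rather than a probability.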
You can also plot multiple normal distributions on the same graph.
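x <- seq(-5, 5, length.out = 1000)
y1 <- dnorm(x, mean = 0, sd = 1) # Standard normal
y2 <- dnorm(x, mean = 3, sd = 1) # Mean = 3, SD = 1
y3 <- dnorm(x, mean = 0, sd = 2) # Mean = 0, SD = 2
# Draw the first curve, then overlay the other two
plot(x, y1, type = "l", col = "blue", lwd = 2, ylim = c(0, 0.5),
     xlab = "x", ylab = "Density", main = "Multiple Normal Distributions")
lines(x, y2, col = "red", lwd = 2)
lines(x, y3, col = "green", lwd = 2)
# Add a legend
legend("topright", legend = c("Mean=0, SD=1", "Mean=3, SD=1", "Mean=0, SD=2"),
       col = c("blue", "red", "green"), lwd = 2)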
[Figure: Multiple Normal Distributions. Density plotted against x, with legend entries Mean=0, SD=1; Mean=3, SD=1; Mean=0, SD=2.]
You can plot the cumulative distribution function (CDF) of the normal distribution using the pnorm()
function.
x <- seq(-5, 5, length.out = 1000)
y <- pnorm(x, mean = 0, sd = 1)
plot(x, y, type = "l", xlab = "x", ylab = "Cumulative Probability", main = "Normal Distribution CDF")
[Figure: Normal Distribution CDF. Cumulative Probability plotted against x.]
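Because pnorm(q) returns P(X ≤ q), probabilities for other regions follow by subtraction. The sketch below is a minimal example for the standard normal; the cutoffs −1, 1, and 1.96 are illustrative choices, not values taken from the lab.

```r
# P(X <= 1.96) for the standard normal
p_below <- pnorm(1.96, mean = 0, sd = 1)

# P(X > 1.96), via the complement or the lower.tail argument
p_above <- 1 - p_below
p_above_alt <- pnorm(1.96, mean = 0, sd = 1, lower.tail = FALSE)

# P(-1 < X < 1) as a difference of two CDF values
p_between <- pnorm(1) - pnorm(-1)

print(round(c(p_below, p_above, p_between), 4))
```

The three values come out at roughly 0.975, 0.025, and 0.683. For a distribution with a different mean or standard deviation, pass those through the mean and sd arguments.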
Hypothesis Testing
Hypothesis testing is a statistical method used to make inferences about a population based on sample data.
To run a hypothesis test, you need to define a null hypothesis (H0) and an alternative hypothesis (Ha). For
normally distributed data, you can compute the p-value of the test using the pnorm() function.
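# Define the normal distribution parameters
# H0: mu = 0.5
# Ha: mu != 0.5
# Two-sided p-value from the z statistic; sample_mean, sigma, and n
# are assumed to be defined from the sample, which is not shown here
z <- (sample_mean - 0.5) / (sigma / sqrt(n))
p_value <- 2 * pnorm(-abs(z))
print(p_value)
## [1] 1.03129e-08
# Since the p-value is less than 0.05, we reject the null hypothesis
cat('Since the p-value of', p_value, 'is less than 0.05, we reject the null hypothesis')
## Since the p-value of 1.03129e-08 is less than 0.05, we reject the null hypothesis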
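The full calculation behind a p-value like the one above can be sketched as a two-sided z-test. Everything in this sketch is an assumption made for illustration: the simulated sample, the seed, the known standard deviation sigma, and the sample size are hypothetical, and they do not reproduce the output shown above.

```r
set.seed(42)                                 # assumed seed, for reproducibility only
sigma <- 1                                   # assumed known population SD
mu0 <- 0.5                                   # null value from H0: mu = 0.5
data <- rnorm(50, mean = 1, sd = sigma)      # hypothetical sample

n <- length(data)
z <- (mean(data) - mu0) / (sigma / sqrt(n))  # test statistic
p_value <- 2 * pnorm(-abs(z))                # two-sided p-value

if (p_value < 0.05) {
  cat("p-value =", p_value, "-> reject H0\n")
} else {
  cat("p-value =", p_value, "-> fail to reject H0\n")
}
```

Using pnorm(-abs(z)) and doubling it handles both tails at once, so the same line works whether the sample mean falls above or below the null value.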
Confidence Intervals
A confidence interval is a range of values, computed from sample data, that is expected to contain the true
value of an unknown population parameter at a stated confidence level (for example, 95%). You can calculate
confidence intervals for a mean using the normal distribution in R.
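# Define the normal distribution parameters
# (data is the sample; sigma is the known population standard deviation)
n <- length(data)
alpha <- 0.05 # assumed here: a 95% confidence level
z_alpha <- qnorm(1 - alpha / 2) # two-sided critical value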
# Sample mean
sample_mean <- mean(data)
# Margin of error
margin_error <- z_alpha * (sigma / sqrt(n))
# Confidence interval
ci_lower <- sample_mean - margin_error
ci_upper <- sample_mean + margin_error
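A self-contained version of this calculation might look as follows. The sample, the seed, the known sigma, and the 95% level are assumptions chosen for illustration; qnorm() supplies the critical value z_alpha.

```r
set.seed(1)                                 # assumed seed, for reproducibility only
sigma <- 2                                  # assumed known population SD
data <- rnorm(40, mean = 10, sd = sigma)    # hypothetical sample
n <- length(data)

conf_level <- 0.95                          # assumed confidence level
z_alpha <- qnorm(1 - (1 - conf_level) / 2)  # about 1.96 for a 95% interval

sample_mean <- mean(data)
margin_error <- z_alpha * (sigma / sqrt(n))
ci <- c(lower = sample_mean - margin_error, upper = sample_mean + margin_error)
print(ci)
```

Changing conf_level to 0.99 widens the interval, because qnorm() then returns a larger critical value (about 2.576).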
Exercises
Part 1
Question 1
Plot the normal distribution with mean 0 and standard deviation 1. Add a vertical line at x = 1.
Question 2
Plot the cumulative distribution function of the normal distribution with mean 5 and standard deviation 2.
Question 3
Plot the normal distributions with the following parameters on one plot:
• Mean = 0, SD = 1
• Mean = 2, SD = 1
• Mean = 0, SD = 2
Add a legend to the plot.
Part 2
Question 4
Generate a random sample of 100 values from a normal distribution with mean 10 and standard deviation 2.
Perform a hypothesis test to determine if the mean of the sample is significantly different from 10.
Question 5
Generate a random sample of 100 values from a normal distribution with mean 30 and standard deviation 5.
Calculate a 95% confidence interval for the population mean.
Question 6
Generate a random sample of 1000 values from a normal distribution with mean 50 and standard deviation
10. Calculate a 99% confidence interval for the population mean.
Question 7
Using the sample from Question 6, test the hypothesis that the population mean is equal to 45. Find the
p-value for the test.