COSC 416

The document provides a comprehensive overview of probability, including definitions, types of random variables, key concepts, and probability distributions. It discusses the differences between discrete and continuous random variables, fundamental concepts of probability theory, and the Law of Large Numbers and Central Limit Theorem. Additionally, it includes practical examples of various probability distributions and statistical calculations using datasets.

NAME: GAPKWI S. REUEL
REG. NO: U21DLCS10193
COURSE: COSC416

QUESTION 1) DEFINITION OF PROBABILITY


Probability is a measure of the likelihood that an event will occur. It is expressed as a number
between 0 and 1, where:
0 represents an impossible event.
1 represents a certain event.
Two common definitions of probability:
a. Experimental Probability: Defined from experiments or observed data:
P(A) = (number of times event A occurs) / (total number of trials)
Example: If a coin is flipped 100 times and lands on heads 55 times, the experimental
probability of heads is 55/100 = 0.55.
b. Axiomatic Probability (Kolmogorov's Axioms): Based on three fundamental
axioms:
0 ≤ P(A) ≤ 1 (a probability is always between 0 and 1).
P(S) = 1 (the probability of the entire sample space is 1).
If A and B are mutually exclusive, then P(A∪B) = P(A) + P(B).
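As a quick sketch of the experimental definition (using NumPy, which the code later in this document already relies on; the seed and trial count are arbitrary choices), repeated coin flips give an estimate that hovers near the true probability:

```python
import numpy as np

# Simulate 100 fair-coin flips; 1 = heads, 0 = tails.
rng = np.random.default_rng(seed=0)
flips = rng.integers(0, 2, size=100)

# Experimental probability: occurrences of heads / total trials.
p_heads = flips.sum() / len(flips)
print(p_heads)  # near 0.5 for a fair coin
```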

1B) DIFFERENCE BETWEEN DISCRETE AND CONTINUOUS RANDOM VARIABLES
Discrete Random Variable: Takes a countable number of values.
Example: The number of cars passing a toll booth in an hour (0, 1, 2, ...).
Continuous Random Variable: Takes uncountably many values within a given range.
Example: The time it takes for a customer to be served at a restaurant (e.g., 5.2 minutes, 6.8
minutes).
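To make the distinction concrete, a minimal sketch of both examples above (the rate and range values are made-up for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Discrete: cars per hour take only whole-number values.
cars_per_hour = rng.poisson(lam=20, size=3)

# Continuous: service times can take any value in the range (minutes).
service_times = rng.uniform(2.0, 10.0, size=3)

print(cars_per_hour)   # whole numbers, e.g. counts near 20
print(service_times)   # fractional values like 5.2, 6.8, ...
```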

1C) DISCUSSION ON KEY PROBABILITY CONCEPTS


a) FUNDAMENTAL CONCEPTS OF PROBABILITY THEORY
Sample Space (S): The set of all possible outcomes of an experiment.
Example: Rolling a die, S= {1, 2, 3, 4, 5, 6}
Events: A subset of the sample space.
Example: Getting an even number when rolling a die, A= {2, 4, 6}
Conditional Probability: The probability of an event occurring given that another event has
already occurred.
Example: The probability of drawing an ace from a deck, given that the first card drawn
(and not replaced) was a king.
Independence: Two events are independent if one does not affect the probability of the
other.
Example: Tossing two fair coins; the outcome of one does not affect the other.
b) THE LIMIT THEOREM
i. Law of Large Numbers (LLN): As the number of trials increases, the sample
mean converges to the expected value.
Example: If a fair coin is tossed 10,000 times, the proportion of heads will be close to
0.5.
ii. Central Limit Theorem (CLT): The sum (or average) of a large number of
independent, identically distributed random variables with finite variance approximately
follows a normal distribution, regardless of the original distribution.
Example: The heights of randomly selected students approximate a normal
distribution even if the population distribution is unknown.
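Both limit theorems are easy to observe in simulation; a minimal NumPy sketch (the seed and sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# LLN: the running proportion of heads in fair-coin flips approaches 0.5.
flips = rng.integers(0, 2, size=10_000)
running_mean = np.cumsum(flips) / np.arange(1, flips.size + 1)
print(running_mean[9], running_mean[-1])  # the final value is much closer to 0.5

# CLT: means of samples of size 50 from a skewed exponential distribution
# (population mean 1.0) cluster approximately normally around 1.0.
sample_means = rng.exponential(scale=1.0, size=(10_000, 50)).mean(axis=1)
print(sample_means.mean())  # close to 1.0
```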
c) RANDOM VARIABLES AND RANDOM PROCESSES
i. Random Variable: A numerical value assigned to outcomes of a random
experiment.
Example: The number of calls received by a call centre in an hour.
ii. Random Process: A collection of random variables indexed by time or space.
Example: Daily temperature variations in a city.
d) PROBABILITY DISTRIBUTIONS
A probability distribution describes the likelihood of different outcomes of a random
variable.
It can be:
Discrete (e.g., Binomial, Poisson).
Continuous (e.g., Normal, Exponential).
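Each of these families can be sampled directly with NumPy; a short sketch (the parameter values are arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Discrete distributions
binomial = rng.binomial(n=10, p=0.5, size=5)   # successes in 10 coin flips
poisson = rng.poisson(lam=3.0, size=5)         # event counts per interval

# Continuous distributions
normal = rng.normal(loc=0.0, scale=1.0, size=5)
exponential = rng.exponential(scale=2.0, size=5)  # e.g. waiting times

print(binomial, poisson, normal, exponential)
```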

1D) LISTING OF COMMON PROBABILITY DISTRIBUTIONS


a) Definition of Probability Distribution
A probability distribution is a function that assigns probabilities to different possible values
of a random variable.
For discrete variables, it is called a probability mass function (PMF).
For continuous variables, it is called a probability density function (PDF).
b) Relationship between a Random Variable and Probability Distribution
A random variable represents outcomes numerically, while a probability distribution
provides the probabilities of these outcomes.
Example: If X is the number of heads in 3 coin flips, then:
P(X=0) = 1/8,
P(X=1) = 3/8,
P(X=2) = 3/8,
P(X=3) = 1/8.
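These probabilities match the Binomial(n=3, p=0.5) PMF, which can be checked with SciPy:

```python
from scipy import stats

# P(X = k) for X = number of heads in 3 fair coin flips.
pmf = [stats.binom.pmf(k, n=3, p=0.5) for k in range(4)]
print(pmf)  # [0.125, 0.375, 0.375, 0.125] = [1/8, 3/8, 3/8, 1/8]
```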
c) List of Distributions with Practical Examples
I. Bernoulli Trials (Single trial with two outcomes)
Distribution: Bernoulli, Binomial.
Example:
Tossing a coin (Heads or Tails).
Checking if a machine part is defective (Yes/No).
II. Categorical Outcomes (Multiple distinct categories)
Distribution: Multinomial, Categorical.
Example:
Rolling a die (six outcomes: 1, 2, 3, 4, 5, and 6).
Customer survey responses (Satisfied, Neutral, Dissatisfied).
III. Hypothesis Testing (Making inferences about populations)
Distribution: Normal, t-distribution, Chi-square, F-distribution.
Example:
T-distribution: Used in small-sample hypothesis tests (e.g., comparing mean test scores of
two student groups).
Chi-square distribution: Used in categorical data tests (e.g., testing whether gender and job
preference are independent).
Normal distribution: Used in large-sample hypothesis testing (e.g., comparing population
means).
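As an illustration of the t-distribution case, a two-sample t-test on two small groups of hypothetical (made-up) test scores:

```python
import numpy as np
from scipy import stats

# Hypothetical test scores for two small student groups (illustrative data).
group_a = np.array([72, 75, 78, 80, 84])
group_b = np.array([68, 70, 73, 76, 79])

# Two-sample t-test: is the difference in means statistically significant?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(t_stat, p_value)  # here p > 0.05: no significant difference at the 5% level
```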
QUESTION 3
import numpy as np

# Number of random points for part (a)
num_points_1000 = 1000

# Generate random (x, y) points within the square [-1, 1] x [-1, 1]
x_1000 = np.random.uniform(-1, 1, num_points_1000)
y_1000 = np.random.uniform(-1, 1, num_points_1000)

# Count points inside the unit circle (x^2 + y^2 <= 1)
inside_circle_1000 = np.sum(x_1000**2 + y_1000**2 <= 1)

# Estimate of pi using 1000 points: (points inside / total points) * 4
pi_estimate_1000 = (inside_circle_1000 / num_points_1000) * 4

# Now test with an increasing number of points to see the accuracy improve
num_points_large = [10000, 100000, 1000000]
pi_estimates_large = []

for num in num_points_large:
    x_large = np.random.uniform(-1, 1, num)
    y_large = np.random.uniform(-1, 1, num)
    inside_circle_large = np.sum(x_large**2 + y_large**2 <= 1)
    pi_estimate_large = (inside_circle_large / num) * 4
    pi_estimates_large.append((num, pi_estimate_large))

pi_estimate_1000, pi_estimates_large
Results:
(a) Estimate of π using 1000 random points
With 1000 points, the estimated value of π is 3.14.
(b) Accuracy improvement with more points
As we increase the number of points, the estimate of π generally becomes more accurate,
although individual runs still fluctuate:
10,000 points → π ≈ 3.15
100,000 points → π ≈ 3.14472
1,000,000 points → π ≈ 3.13975

QUESTION 5.
import numpy as np
from scipy import stats

# (i) Mean and Median calculation
dataset_1 = np.array([10, 20, 30, 40, 100])
mean_1 = np.mean(dataset_1)
median_1 = np.median(dataset_1)

# (ii) Variance and Standard Deviation calculation
dataset_2 = np.array([6, 8, 10, 14, 18])
variance_2 = np.var(dataset_2, ddof=0)  # Population variance
std_dev_2 = np.sqrt(variance_2)  # Population standard deviation

# (iv) Mean, Median, Mode, Range, Variance, and Standard Deviation calculation
dataset_3 = np.array([70, 75, 80, 85, 90, 95, 100])
mean_3 = np.mean(dataset_3)
median_3 = np.median(dataset_3)
mode_3 = stats.mode(dataset_3, keepdims=False).mode  # Mode (scalar result, SciPy >= 1.9)
range_3 = np.max(dataset_3) - np.min(dataset_3)
variance_3 = np.var(dataset_3, ddof=0)  # Population variance
std_dev_3 = np.sqrt(variance_3)  # Population standard deviation

(mean_1, median_1), (variance_2, std_dev_2), (mean_3, median_3, mode_3, range_3, variance_3, std_dev_3)
Results & Explanations:
(i) Mean and Median for dataset [10, 20, 30, 40, 100]
Mean = 40.0
Median = 30.0
Effect of the Outlier (100) on the Mean and Median:
The mean is significantly affected by the outlier (100), pulling it higher than most values in
the dataset.
The median, being the middle value, remains 30.0 and is less affected by the outlier.

(ii) Variance and Standard Deviation for dataset [6, 8, 10, 14, 18]
Variance = 18.56
Standard Deviation = 4.31

(iii) Explanation of Standard Deviation:


A small standard deviation means that the data points are closer to the mean (less spread
out).
A large standard deviation means that the data points are more spread out from the mean,
indicating more variability.

(iv) Statistical measures for dataset [70, 75, 80, 85, 90, 95, 100]
Mean = 85.0
Median = 85.0
Mode = 70 (every value occurs exactly once, so there is no true mode; stats.mode simply
returns the smallest value, 70)
Range = 30 (100 - 70)
Variance = 100.0
Standard Deviation = 10.0
These results show a moderate spread in the data, with values fairly evenly distributed
around the mean.
