ML Unit-3

This document provides an overview of probability and statistics, emphasizing their importance in machine learning. It covers key concepts such as descriptive and inferential statistics, types of probability (joint, marginal, conditional), and Bayes' theorem, along with examples and applications. Additionally, it explains random variables and their classifications into discrete and continuous types, as well as the concept of probability distributions.

Uploaded by

Nimu Shah

UNIT-3

PROBABILITY AND STATISTICS

3.1 Overview of Probability
• Probability and statistics are among the most important foundations of
Machine Learning. Probability is about predicting the likelihood of future
events, while statistics involves analyzing the frequency of past events.
Machine Learning has become a popular career choice for freshers and IT
professionals, but entering the field requires certain prerequisite skills,
and one of them is mathematics. Mathematics is essential for learning ML
and for developing efficient applications for business.
• When all outcomes are equally likely, the probability of an event is the
number of outcomes in which the event occurs divided by the total number of
possible outcomes. For example, when a fair coin is tossed, the probability
of getting a head is:
P(H) = Number of ways a head can occur / Total number of possible outcomes
P(H) = 1/2
P(H) = 0.5
where P(H) is the probability of getting a head when tossing a coin.
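
The counting rule above can be sketched in a few lines of Python. This is only an illustration under the assumption that all outcomes are equally likely; the function name `classical_probability` is made up for this example.

```python
from fractions import Fraction

def classical_probability(favorable, sample_space):
    """Favorable outcomes divided by total outcomes (equally likely outcomes assumed)."""
    space = set(sample_space)
    event = set(favorable) & space  # ignore anything outside the sample space
    return Fraction(len(event), len(space))

coin = ["H", "T"]
p_head = classical_probability(["H"], coin)
print(p_head, float(p_head))  # 1/2 0.5
```

Using `Fraction` keeps the result exact, which makes it easy to compare with hand calculations such as 1/2.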

3.2 Statistical Tools in Machine Learning

Statistics is a branch of mathematics that deals with collecting, analyzing,
interpreting, and visualizing empirical data. Descriptive statistics and
inferential statistics are the two major areas of statistics. Descriptive
statistics are for describing the properties of sample and population data
(what has happened). Inferential statistics use those properties to test
hypotheses, reach conclusions, and make predictions (what can you
expect).

3.3 Descriptive Statistics

Descriptive statistics help in understanding the basic features of the data
by summarizing them numerically or graphically. Descriptive analysis
presents facts about the data at hand, but it does not support
generalizations or conclusions beyond that data.
Descriptive statistics provide a summary of the data, such as the mean,
median, standard deviation, and variance. Univariate descriptive statistics
are used to describe data containing only one variable. On the other hand,
bivariate and multivariate descriptive statistics are used to describe data
with multiple variables.
For example, suppose the marks of students in two classes are
{70, 85, 90, 65} and {60, 40, 89, 96}. The average marks for the two
classes are 77.5 and 71.25, respectively.
Descriptive statistics can be broadly classified into two categories -
measures of central tendency and measures of dispersion.
Types of Descriptive Statistics:
Descriptive statistics are methods used to summarize and describe the
main features of a dataset. They provide a way to organize and simplify
data, making it easier to understand and interpret. The major types of
descriptive statistics are given below, along with examples.

Measures of Central Tendency

Mean: The average value of a dataset, calculated by adding all values and
dividing by the number of values.
Median: The middle value in a dataset when the values are arranged in
order.
Mode: The most frequent value in a dataset.
Example:
Consider the following dataset of scores: 85, 92, 78, 95, 82.
Mean = (85 + 92 + 78 + 95 + 82) / 5 = 86.4
Median = 85 (arranged in order: 78, 82, 85, 92, 95)
Mode = no mode (no value occurs more than once)
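
As a quick check of the example, the three measures can be computed with Python's standard `statistics` module. This is a small sketch; the variable names are chosen here for illustration.

```python
import statistics
from collections import Counter

scores = [85, 92, 78, 95, 82]

mean = statistics.mean(scores)      # sum of values / number of values
median = statistics.median(scores)  # middle value after sorting

# A mode exists only if some value occurs more than once.
counts = Counter(scores)
has_mode = max(counts.values()) > 1

print(mean, median, has_mode)  # 86.4 85 False
```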

Measures of Variability
Range: The difference between the highest and lowest values in a dataset.
Variance: The average of the squared differences from the mean.
Standard Deviation: The square root of the variance, measuring how
spread out the values are from the mean.
Example:
Using the same dataset of scores:
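Range = 95 - 78 = 17
Variance = [(85 - 86.4)² + (92 - 86.4)² + (78 - 86.4)² + (95 - 86.4)² + (82 - 86.4)²] / 5
= 197.2 / 5 = 39.44
Standard Deviation = √39.44 ≈ 6.28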
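
The measures of variability can be recomputed from the definitions above with the standard `statistics` module. In this sketch, `pvariance` and `pstdev` divide by the number of values n, matching the definition of variance given here. The sample versions (`variance`, `stdev`) divide by n - 1 and are shown only for comparison.

```python
import statistics

scores = [85, 92, 78, 95, 82]

value_range = max(scores) - min(scores)

# Population versions: average of squared differences from the mean (divide by n).
pop_variance = statistics.pvariance(scores)
pop_std = statistics.pstdev(scores)

# Sample versions divide by n - 1 instead, so they come out slightly larger.
sample_variance = statistics.variance(scores)

print(value_range, pop_variance, round(pop_std, 2), sample_variance)
```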

3.4 Inferential Statistics

Inferential statistics uses sample data to analyze and interpret results
and to draw conclusions about the population from which the sample was
taken. Inferential statistics can be classified into hypothesis testing and
regression analysis. Hypothesis testing also includes the use of
confidence intervals to test the parameters of a population. Given below
are the different types of inferential statistics.

Types of Inferential Statistics:

Hypothesis testing is a part of statistics in which we make an assumption
(a hypothesis) about a population parameter. It provides a formal
procedure for analyzing a random sample of the population in order to
reject, or fail to reject, that assumption.
Z-test
The Z-test is mainly used when the data are normally distributed and the
population mean and standard deviation are known. The z-score of a single
observation x is given by the formula

Z-score = (x - µ) / σ

When testing a sample mean x̄ from a sample of size n, σ is replaced by the
standard error σ/√n, so the test statistic is Z = (x̄ - µ) / (σ/√n).
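
A minimal sketch of the calculation in Python follows. The numbers (population mean 70, standard deviation 10, a sample of 25 with mean 72) are made up for illustration. The sample-mean statistic Z = (x̄ - µ) / (σ/√n) is the standard one-sample form, included here as an assumption about how the test would be applied.

```python
from math import sqrt
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Z-score of a single value x."""
    return (x - mu) / sigma

def z_statistic(sample_mean, mu, sigma, n):
    """Z statistic of a sample mean, using the standard error sigma / sqrt(n)."""
    return (sample_mean - mu) / (sigma / sqrt(n))

# Hypothetical numbers, chosen only for illustration.
z_single = z_score(75, mu=70, sigma=10)
z_mean = z_statistic(72, mu=70, sigma=10, n=25)

# Two-sided p-value under the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z_mean)))

print(z_single, z_mean, round(p_value, 4))
```

A small p-value would count as evidence against the assumed population mean. Here the statistic is only one standard error from µ, so the evidence is weak.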

Confidence interval:
A confidence interval is a range of values, calculated from the sample
data, that is expected to contain the true population parameter with a
stated level of confidence (for example, 95%). It is used to estimate the
range in which the population parameter lies and is often used in
hypothesis testing.
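
One common case can be sketched as follows: a confidence interval for the population mean when the population standard deviation σ is known. The value σ = 6 and the helper name `z_confidence_interval` are assumptions made for this illustration. When σ is unknown, a t-based interval is normally used instead.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_confidence_interval(sample, sigma, confidence=0.95):
    """Interval for the population mean, assuming sigma is known."""
    # Critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / sqrt(len(sample))
    center = mean(sample)
    return center - half_width, center + half_width

scores = [85, 92, 78, 95, 82]
lower, upper = z_confidence_interval(scores, sigma=6.0)  # sigma = 6 is assumed
print(round(lower, 2), round(upper, 2))
```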
Population
It refers to the collection that includes all the data from a defined group
being studied. The size of the population may be either finite or infinite.
Sample
Studying the entire population is not always feasible. Instead, a
portion of the data is selected from the population and the statistical
methods are applied to it. This portion is called a sample. The size of
the sample is always finite.

Aspect | Descriptive Statistics | Inferential Statistics
Purpose | Describe and summarize the data | Make inferences and draw conclusions about a population based on sample data
Data Analysis | Analyzes and interprets the characteristics of a dataset | Uses sample data to make generalizations or predictions about a larger population
Population vs Sample | Focuses on the entire population or dataset | Focuses on a subset of the population (sample) to draw conclusions about the entire population
Measurements | Provides measures of central tendency and dispersion | Estimates parameters, tests hypotheses, and determines the level of confidence or significance in the results
Examples | Mean, median, mode, standard deviation, range, frequency tables | Hypothesis testing, confidence intervals, regression analysis, ANOVA (analysis of variance), chi-square tests, t-tests, etc.
Goal | Summarize, organize, and present data | Generalize findings to a larger population, make predictions, test hypotheses, evaluate relationships, and support decision-making
Population Parameters | Not typically estimated | Estimated using sample statistics (e.g., sample mean as an estimate of population mean)
Sample Representativeness | Not required | Crucial: the sample should be representative of the population to ensure accurate inferences

3.5 Concept of Probability

Probability means possibility. It is a branch of mathematics that deals with
the occurrence of random events. A probability is expressed as a value from
zero to one.

An experiment is a process that generates well-defined outcomes. On any
single repetition of an experiment, one and only one of the possible
experimental outcomes will occur.
Examples: Hitting a target, checking the boiling point of a liquid, a student
taking an examination, conducting job interviews, tossing a coin, rolling a
die, a batsman hitting a ball, selling products, and chemical reactions of
elements are a few examples of experiments.

The sample space for an experiment is the set of all experimental outcomes.
Example: In the experiment of hitting a target, the sample space is
{hitting the target, missing the target}.

An event is one or more of the possible outcomes of an experiment.

Example: If we toss a coin, getting a head will be one event, and getting a
tail will be another event.

For example, when we toss a coin we get either a head or a tail, so there are
only two possible outcomes, {H, T}. When two coins are tossed, there are
four possible outcomes: {(H, H), (H, T), (T, H), (T, T)}.
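
Sample spaces like these can be enumerated mechanically. The sketch below uses `itertools.product` from the standard library; the variable names are illustrative.

```python
from itertools import product

one_coin = ["H", "T"]

# Every ordered combination of outcomes for two tosses.
two_coins = list(product(one_coin, repeat=2))
print(two_coins)  # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]

# An event is a subset of the sample space, e.g. "at least one head".
at_least_one_head = [outcome for outcome in two_coins if "H" in outcome]
print(len(at_least_one_head), "of", len(two_coins))  # 3 of 4
```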
Joint Probability
When the probability of two or more events occurring together at the same
time is measured, it is called joint probability. For two events A and B, it
is denoted P(A∩B), the probability of the intersection of the events.
Formula (for independent events): P(A∩B) = P(A) * P(B)
In general, P(A∩B) = P(A) * P(B|A).

Example: Find the probability that the number three will occur twice when
two dice are rolled at the same time.
Solution: Number of possible outcomes when a die is rolled = 6
i.e. {1, 2, 3, 4, 5, 6}
Let A be the event that a 3 occurs on the first die and B be the event that
a 3 occurs on the second die.
Each die has six possible outcomes, so the probability of a three
occurring on each die is 1/6. The two dice are independent, so:
P(A) = 1/6
P(B) = 1/6
P(A∩B) = 1/6 x 1/6 = 1/36
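
The same result can be obtained by listing all 36 outcomes of two dice and counting. This is a brute-force sketch that also checks that the product rule holds for these two independent events.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs
total = len(outcomes)

p_a = Fraction(sum(1 for first, _ in outcomes if first == 3), total)     # 3 on first die
p_b = Fraction(sum(1 for _, second in outcomes if second == 3), total)   # 3 on second die
p_joint = Fraction(sum(1 for o in outcomes if o == (3, 3)), total)       # 3 on both dice

print(p_a, p_b, p_joint)  # 1/6 1/6 1/36
```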

Marginal Probability
The probability of a single event occurring, irrespective of the outcomes of
other events. It is found by summing the joint probabilities of the event
across all possible outcomes of the other variable(s).
These probabilities can be calculated from a two-way table.
Suppose we are given a joint pmf pXY(x, y) and want to calculate the marginal
probability pY(y).
To calculate the marginal probability we use the formula
pY(y) = ∑i pXY(xi, y).
The joint probabilities are arranged in a two-way table.

Suppose we wish to calculate the marginal pY(3).

Using the marginal formula pY(y) = ∑i pXY(xi, y) at Y = 3:
⇒ pY(3) = P(Y=3)
⇒ pY(3) = P(Y=3, X=3) + P(Y=3, X=4)
Reading the values for the terms in the above expression from the table,
we get
⇒ pY(3) = 0.1 + 0.2
⇒ pY(3) = 0.3
This is the marginal probability for the example.
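
The summation can be sketched in Python with a joint pmf stored as a dictionary. Only the two entries for Y = 3 (0.1 and 0.2) come from the example. The X values 3 and 4 and all the other entries are assumptions, filled in so that the table sums to 1.

```python
from fractions import Fraction

tenth = Fraction(1, 10)

# Joint pmf keyed by (x, y). The Y = 3 entries follow the example;
# the remaining entries are hypothetical.
joint_pmf = {
    (3, 1): 2 * tenth, (3, 2): 1 * tenth, (3, 3): 1 * tenth,
    (4, 1): 1 * tenth, (4, 2): 3 * tenth, (4, 3): 2 * tenth,
}

def marginal_y(pmf, y):
    """Sum the joint probabilities over every x for a fixed y."""
    return sum(p for (x, y_value), p in pmf.items() if y_value == y)

p_y3 = marginal_y(joint_pmf, 3)
print(p_y3, float(p_y3))  # 3/10 0.3
```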

Conditional Probability
The probability of an event A given the occurrence of another event B is
termed conditional probability. It is denoted P(A|B) and represents the
probability of A when event B has already happened.

P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0
P(B | A) = P(A ∩ B) / P(A), provided P(A) > 0

Here:
P(A | B) = The probability of A given B, that is, the probability of A when
B is known to have occurred
P(B | A) = The probability of B given A, that is, the probability of B when
A is known to have occurred
P(A ∩ B) = The probability of both A and B happening
P(A) = The probability of A
P(B) = The probability of B

Example: A bag contains 3 red and 7 black balls. Two balls are drawn at
random without replacement. If the second ball is red, what is the
probability that the first ball is also red?
Solution:
Let A: event of selecting a red ball in first draw
B: event of selecting a red ball in the second draw
P(A ∩ B) = P(selecting both red balls) = 3/10 × 2/9 = 1/15
P(B) = P(selecting a red ball in the second draw) = P(red ball and red ball or
black ball and red ball)
= P(red ball and red ball) + P(black ball and red ball)
= 3/10 × 2/9 + 7/10 × 3/9 = 3/10
∴ P(A|B) = P(A ∩ B)/P(B) = 1/15 ÷ 3/10 = 2/9.
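
The answer can be verified by brute force: list every ordered pair of distinct balls and count. This is a sketch; labelling the balls by index is just a device to keep identical colours distinguishable.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R"] * 3 + ["B"] * 7

# All ordered draws of two different balls (without replacement): 10 * 9 = 90.
draws = list(permutations(range(len(balls)), 2))

second_red = [(i, j) for i, j in draws if balls[j] == "R"]       # event B
both_red = [(i, j) for i, j in second_red if balls[i] == "R"]    # event A and B

p_b = Fraction(len(second_red), len(draws))
p_a_and_b = Fraction(len(both_red), len(draws))
p_a_given_b = p_a_and_b / p_b

print(p_a_and_b, p_b, p_a_given_b)  # 1/15 3/10 2/9
```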
Example: Two dice are rolled. If it is known that at least one of the dice
shows 4, find the probability that the numbers on the dice have a sum
of 8.
Solution:
Let,
A: at least one of the outcomes is 4
B: the sum of the outcomes is 8
Then, A = {(1, 4), (2, 4), (3, 4), (4, 4), (5, 4), (6, 4), (4, 1), (4, 2), (4, 3), (4,
5), (4, 6)}
B = {(4, 4), (5, 3), (3, 5), (6, 2), (2, 6)}
A ∩ B = {(4, 4)}
n(A) = 11, n(B) = 5, n(A ∩ B) = 1
P(B|A) = n(A ∩ B)/n(A) = 1/11.
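
The counts n(A), n(B), and n(A ∩ B) can be confirmed by enumerating the 36 outcomes. This is a short sketch using sets.

```python
from fractions import Fraction
from itertools import product

outcomes = set(product(range(1, 7), repeat=2))

event_a = {o for o in outcomes if 4 in o}        # at least one die shows 4
event_b = {o for o in outcomes if sum(o) == 8}   # the sum is 8
both = event_a & event_b

p_b_given_a = Fraction(len(both), len(event_a))
print(len(event_a), len(event_b), len(both), p_b_given_a)  # 11 5 1 1/11
```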

In summary, joint probability is the probability of two events occurring
together, marginal probability is the probability of an event irrespective
of the outcome of another variable, and conditional probability is the
probability of one event given that a second event has occurred.

Bayes’ Theorem
Bayes' theorem, also known as Bayes' rule or Bayes' law, is the basis of
Bayesian reasoning. It determines the probability of an event under
uncertain knowledge.
In probability theory, it relates the conditional and marginal
probabilities of two random events:

P(A|B) = P(B|A) * P(A) / P(B)

P(A|B) is known as the posterior, which we need to calculate. It is read as
the probability of hypothesis A given that evidence B has occurred.

P(B|A) is called the likelihood: assuming the hypothesis is true, it is the
probability of the evidence.

P(A) is called the prior probability, the probability of the hypothesis
before considering the evidence.

P(B) is called the marginal probability, the probability of the evidence on
its own.

Example:
There are two urns containing colored balls. The first urn contains 50 red
balls and 50 blue balls. The second urn contains 30 red balls and 70 blue
balls. One of the two urns is randomly chosen (both urns have a probability
of 50% of being chosen) and then a ball is drawn at random from the chosen
urn. If a red ball is drawn, what is the probability that it comes from
the first urn?

Solution
In probabilistic terms, what we know about this problem can be formalized
as follows:
P(red | urn 1) = 50/100 = 1/2
P(red | urn 2) = 30/100 = 3/10
P(urn 1) = P(urn 2) = 1/2

The unconditional probability of drawing a red ball can be derived using the
law of total probability:
P(red) = P(red | urn 1) * P(urn 1) + P(red | urn 2) * P(urn 2)
= 1/2 * 1/2 + 3/10 * 1/2 = 2/5

By using Bayes' rule, we obtain
P(urn 1 | red) = P(red | urn 1) * P(urn 1) / P(red) = (1/2 * 1/2) / (2/5) = 5/8 = 0.625
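
The urn calculation can be sketched in Python using exact fractions. The dictionary names `prior` and `likelihood_red` are illustrative choices that mirror the terms defined for Bayes' theorem.

```python
from fractions import Fraction

# Prior: each urn is equally likely to be chosen.
prior = {"urn1": Fraction(1, 2), "urn2": Fraction(1, 2)}

# Likelihood: probability of drawing red from each urn.
likelihood_red = {"urn1": Fraction(50, 100), "urn2": Fraction(30, 100)}

# Law of total probability gives the marginal probability of red.
p_red = sum(prior[urn] * likelihood_red[urn] for urn in prior)

# Bayes' rule gives the posterior probability of urn 1 given red.
posterior_urn1 = likelihood_red["urn1"] * prior["urn1"] / p_red

print(p_red, posterior_urn1, float(posterior_urn1))  # 2/5 5/8 0.625
```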

3.6 Random Variables

A random variable is a variable which represents the outcome of a trial, an
experiment, or an event. Its value is a number that can differ each time
the trial, experiment, or event is repeated.
• A random variable is a variable whose value is unknown or a function
that assigns values to each of an experiment's outcomes.
• A random variable can be either discrete (having specific values) or
continuous (any value in a continuous range).
• The use of random variables is most common in probability and
statistics, where they are used to quantify outcomes of random
occurrences.
• Risk analysts use random variables to estimate the probability of an
adverse event occurring.
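
To make the "function that assigns values to outcomes" view concrete, the sketch below defines a random variable X as the number of heads in two coin tosses and tabulates its probabilities. The example and the name `number_of_heads` are illustrative choices.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

sample_space = list(product("HT", repeat=2))  # four equally likely outcomes

def number_of_heads(outcome):
    """The random variable X: maps an outcome to a number."""
    return outcome.count("H")

# Count how many outcomes map to each value of X, then convert to probabilities.
counts = Counter(number_of_heads(outcome) for outcome in sample_space)
pmf = {value: Fraction(count, len(sample_space)) for value, count in sorted(counts.items())}

for value, probability in pmf.items():
    print(value, probability)
```

X takes only the values 0, 1, and 2, so it is a discrete random variable.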

Types of Random Variables

Continuous random variable
Continuous random variables can take an infinite number of possible
values, usually within a given range. Typically, these are
measurements like weight, height, the time needed to finish a task, etc.
For example, the lifespan of an individual in a community is a
continuous random variable. Let's say that the maximum lifespan of an
individual in a community is 110 years.

Therefore, a person can die immediately at birth (where life = 0 years) or
at any age up to 110 years. The variable 'Age' can therefore take any value
between 0 and 110.
Because the number of possible values is uncountably infinite, the
probability that a continuous random variable equals any one specific
value is zero. Probabilities are instead assigned to ranges of values.

Discrete random variable

Discrete random variables take on only a countable number of distinct
values. Usually, these variables are counts (not necessarily though). If a
random variable can take only a finite number of distinct values, then it is
discrete.
Number of members in a family, number of defective light bulbs in a box
of 10 bulbs, etc. are some examples of discrete random variables.

3.7 Probability Distribution

Sampling Distribution
A sampling distribution is a probability distribution of a statistic that is
based on random samples from a population. It describes the range of
possible outcomes for a statistic, such as the mean or mode of a variable.

Discrete Distribution
A discrete probability distribution is a type of probability distribution that
shows all possible values of a discrete random variable along with the
associated probabilities. In other words, a discrete probability distribution
gives the likelihood of occurrence of each possible value of a discrete
random variable.
• It represents data that has a finite or countably infinite number of
outcomes.
• In finance, discrete distributions are used in options pricing and
forecasting market shocks or recessions.
• It is represented by bars or points, such as in a histogram or probability
mass function plot.
• Examples: binomial distribution, Poisson distribution, geometric
distribution.
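
As an illustration of a discrete distribution, the sketch below tabulates a binomial pmf for the number of defective bulbs in a box of 10. The defect probability of 0.1 per bulb and the assumption that bulbs fail independently are hypothetical values chosen for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_bulbs = 10
p_defective = 0.1  # assumed defect rate, for illustration only

pmf = [binomial_pmf(k, n_bulbs, p_defective) for k in range(n_bulbs + 1)]

for k, probability in enumerate(pmf[:4]):
    print(k, round(probability, 4))

print("total:", round(sum(pmf), 10))  # the probabilities sum to 1
```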

Continuous Distribution
A continuous distribution describes the probabilities of a continuous
random variable's possible values. A continuous random variable has an
infinite and uncountable set of possible values (known as the range).
• It involves continuous random variables that can take any value within a
range. Examples include height, weight, temperature, and time.
• It is represented by smooth curves, such as the bell curve of the normal
distribution.
• Examples: normal distribution, exponential distribution, beta distribution.

Normal Distribution
Normal distribution, also known as the Gaussian distribution, is
a probability distribution that is symmetric about the mean, showing that
data near the mean are more frequent in occurrence than data far from the
mean.
The normal distribution appears as a "bell curve" in graphical form.
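
The concentration of values near the mean can be quantified with `statistics.NormalDist` from the standard library. The height distribution used below (mean 170, standard deviation 10) is a made-up example.

```python
from statistics import NormalDist

def prob_within(dist, k):
    """Probability that a value falls within k standard deviations of the mean."""
    return dist.cdf(dist.mean + k * dist.stdev) - dist.cdf(dist.mean - k * dist.stdev)

heights = NormalDist(mu=170, sigma=10)  # hypothetical parameters

within_1 = prob_within(heights, 1)
within_2 = prob_within(heights, 2)

print(round(within_1, 4), round(within_2, 4))  # 0.6827 0.9545
print(heights.cdf(170))  # symmetry: half the probability lies below the mean
```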

3.8 Central Limit Theorem

The central limit theorem states that when independent random samples of a
sufficiently large size are drawn from a population with finite variance,
the distribution of the sample means will be approximately normal, and the
mean of the sample means will be approximately equal to the mean of the
whole population. This holds even when the population itself is not
normally distributed.
In other words, the central limit theorem states that for any population with
mean μ and standard deviation σ, the distribution of the sample mean for
sample size n has mean μ and standard deviation σ/√n, and approaches a
normal distribution as n grows.

Central limit theorem formula

Fortunately, you don't need to actually repeatedly sample a population to
know the shape of the sampling distribution. The parameters of the
sampling distribution of the mean are determined by the parameters of the
population:
The mean of the sampling distribution is the mean of the population.
The standard deviation of the sampling distribution is the standard deviation
of the population divided by the square root of the sample size.

X̄ ~ N(µ, σ/√n)

Where,
X̄ is the sampling distribution of the sample mean
N is the normal distribution
µ is the mean of the population
σ is the standard deviation of the population
n is the sample size
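
A small simulation can illustrate these relationships. The sketch below assumes an exponential population with mean 1 and standard deviation 1, chosen because it is clearly not normal. The sample size and the number of repetitions are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 50               # size of each sample
num_samples = 2000   # number of repeated samples

# Population: exponential with rate 1, so mu = 1 and sigma = 1 (skewed, not normal).
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
std_of_means = statistics.stdev(sample_means)
predicted_std = 1.0 / n ** 0.5  # sigma / sqrt(n)

print(round(mean_of_means, 3), round(std_of_means, 3), round(predicted_std, 3))
```

The mean of the sample means should land close to µ = 1, and their spread close to σ/√n. A histogram of `sample_means` would look roughly bell-shaped even though the population is skewed.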

3.9 Monte Carlo Approximation

A Monte Carlo simulation is used to model the probability of different
outcomes in a process that cannot easily be predicted due to the
intervention of random variables. It is a technique used to understand the
impact of risk and uncertainty.
A Monte Carlo simulation is used to tackle a range of problems in many
fields including investing, business, physics, and engineering. It is also
referred to as a multiple probability simulation.

A Monte Carlo simulation requires assigning multiple values to an
uncertain variable to achieve multiple results and then averaging the results
to obtain an estimate.
In financial applications, Monte Carlo simulations typically assume
perfectly efficient markets.
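
The idea of repeating a random experiment many times and averaging the results can be sketched with a simple case: estimating the probability that two dice sum to 8, whose exact value is 5/36. The function name, trial count, and seed are illustrative choices.

```python
import random

def estimate_sum_probability(target, trials, seed=0):
    """Monte Carlo estimate of P(two dice sum to target)."""
    rng = random.Random(seed)  # local generator so the run is repeatable
    hits = 0
    for _ in range(trials):
        if rng.randint(1, 6) + rng.randint(1, 6) == target:
            hits += 1
    return hits / trials  # fraction of trials in which the event occurred

estimate = estimate_sum_probability(target=8, trials=100_000)
exact = 5 / 36

print(round(estimate, 4), round(exact, 4))
```

The estimate gets closer to the exact value as the number of trials grows. The error shrinks roughly in proportion to 1/√trials.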
History of the Monte Carlo Simulation
The Monte Carlo simulation was named after the gambling destination in
Monaco because chance and random outcomes are central to this modeling
technique, as they are to games like roulette, dice, and slot machines.
The technique was initially developed by Stanislaw Ulam, a mathematician
who worked on the Manhattan Project, the secret effort to create the first
atomic weapon. He shared his idea with John von Neumann, a colleague at
the Manhattan Project, and the two collaborated to refine the Monte Carlo
simulation.
