Supervised Learning, Unsupervised Learning, Reinforcement Learning:
1. Start with Total Balls and Probabilities
• You have 2 blue balls and 3 red balls in a bag.
•Total number of balls = 2 (blue) + 3 (red) = 5 balls.
First Probability: Getting a Blue Ball (0.4)
• "What is the probability of drawing a blue ball on the first try?"
• Number of blue balls = 2.
• Total number of balls = 5.
• So, the probability of getting a blue ball is: Number of blue balls/Total number of balls
• 2/5=0.4
• "There are 2 blue balls out of 5, so you have a 40% chance (0.4 probability) of drawing a blue ball."
Second Probability: Getting a Blue Ball After Getting a Red Ball (0.5)
• "Now, imagine you’ve already drawn a red ball. What’s the probability of getting a blue ball next?"
• After drawing a red ball, you still have 2 blue balls, and only 4 balls remain in the bag (since 1 red is gone).
• So, the probability of getting a blue ball now is: Number of blue balls/Remaining total balls = 2/4 = 0.5
• Emphasize: "Since there are still 2 blue balls left, but only 4 balls in total, your chances increase to 50% (0.5 probability)."
Third Probability: Getting a Blue Ball After Already Getting a Blue Ball (0.25)
• "Finally, what happens if you’ve already drawn a blue ball? What’s the probability of getting another blue ball?"
• After drawing a blue ball, only 1 blue ball remains, and there are now 4 balls left in total.
• The probability of drawing another blue ball is: Number of remaining blue balls/Remaining total balls = 1/4 = 0.25
• "With just 1 blue ball and 4 balls left in the bag, your chances are now 25% (0.25 probability)."
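The three probabilities above can be checked with a short sketch. The helper name `p_blue` is hypothetical and introduced only for illustration; it assumes every ball in the bag is equally likely to be drawn.

```python
from fractions import Fraction

def p_blue(blue: int, red: int) -> Fraction:
    """Probability that the next ball drawn is blue, given the bag's current contents."""
    return Fraction(blue, blue + red)

first_draw = p_blue(blue=2, red=3)   # nothing has been removed yet
after_red = p_blue(blue=2, red=2)    # one red ball has already been removed
after_blue = p_blue(blue=1, red=3)   # one blue ball has already been removed

print(float(first_draw), float(after_red), float(after_blue))  # 0.4 0.5 0.25
```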
Event Relationship - Complement
Example:
The probability of getting the white ball from the bag = 0.25
What is the probability of not getting the white ball from the bag?
W = white ball
P(ball not white) = P(W^c) = 1 - 0.25 = 0.75
Example: A bag contains Red and Blue cards.
R = Red card, B = Blue card
The probability of having a Red card = 0.4
What is the probability of having a Blue card?
Every card is either Red or Blue, so the probability of a Blue card is the probability of not having a Red card: P(B) = P(R^c) = 1 - 0.4 = 0.6
Introduction to Artificial Intelligence
Machine Learning Algorithms: Introduction to Probability Theory and Probabilistic Logic
Probability Theory
Relationships Between Events And Probabilities
Conditional Probability
Joint Probability Distribution
Representation of Possibilities
Random Variable
Probability Distribution
Probability Theory - Introduction
• Probability theory deals with events, categories, and
hypotheses about which there is no 100% certainty.
• For example: how will the weather be tomorrow? We can answer this
very simple hypothesis based on a general observation such as “it is
sunny 10% and rainy 70% of the time.”
• Probabilistic hypotheses are usually expressed using propositional
logic.
• Propositional logic is a branch of logic concerned with propositions
and their interrelationships.
Probability Theory - Propositional Logic
• A proposition is a statement of a possible state or fact that is either True or False.
• A proposition is usually encoded with a letter. For example, "it is sunny"
can be encoded with the letter S.
• A propositional sentence can be a simple sentence or a compound
sentence:
1. Simple sentence: expresses one fact about the world.
2. Compound sentence: contains several simple sentences connected by
different logical relationships, such as:
1. Negation, symbolized by ¬: it negates a logical statement or one of the logical
symbols. Example: (¬S) means "it is not sunny".
2. Conjunction, symbolized by ∧: it refers to two or more propositions occurring
together. Example: (S ∧ R) means "it is sunny and raining at the same time".
3. Disjunction, symbolized by ∨: it indicates the occurrence of at least one of the
propositions. Example: (S ∨ R) means "the weather is sunny or rainy".
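The three connectives map directly onto Boolean operators. The sketch below assigns truth values to S and R purely for illustration; the values are assumptions, not observations from the slides.

```python
# S = "it is sunny", R = "it is raining" (truth values assumed for illustration)
S = True
R = False

not_S = not S       # negation: ¬S
S_and_R = S and R   # conjunction: S ∧ R
S_or_R = S or R     # disjunction: S ∨ R

print(not_S, S_and_R, S_or_R)  # False False True
```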
Probability Theory - Probability
• Probability provides a quantitative description of an assumption,
event, condition, or proposition.
• Probability can be defined as a numerical measure of how likely
an event is to occur, expressed as a number that takes a value
between 0 and 1.
• An impossible event: an event that cannot happen at all (Probability = 0).
Example: P(R) = 0 means that it will never rain.
• A sure event: an event whose occurrence is 100% sure (Probability = 1).
Example: P(S) = 1 means that it is always sunny.
Probability Theory - Probability
• The probability of an event "A" is a numerical measure of how
likely it is to occur. When all outcomes are equally likely, it is
calculated as follows:
P(A) = (the number of ways "A" can occur) ÷ (the number of outcomes in the sample space)
• Examples:
1. When flipping a coin, what are the possible outcomes, and what is the
probability of Head?
To calculate the probability, first determine the sample space (Ω),
which is the set of all possible outcomes.
Ω = {Head, Tail}, P(Head) = 1/2 = 0.5
2. What is the probability of an even number appearing when
throwing a single die once?
Ω = {1,2,3,4,5,6}, P(even) = 3/6 = 0.5
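The counting rule can be sketched as follows. The function name `probability` is hypothetical, and the sketch assumes every outcome in the sample space is equally likely.

```python
from fractions import Fraction

def probability(event: set, sample_space: set) -> Fraction:
    """Favourable outcomes divided by all outcomes (equally likely outcomes assumed)."""
    return Fraction(len(event & sample_space), len(sample_space))

coin = {"Head", "Tail"}
die = {1, 2, 3, 4, 5, 6}

p_head = probability({"Head"}, coin)
p_even = probability({x for x in die if x % 2 == 0}, die)

print(float(p_head), float(p_even))  # 0.5 0.5
```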
Conditional Probability
• Independence: events A and B are independent
if the probability of A occurring is not related
to whether B occurs or not.
• Independent events are those events whose
occurrence does not depend on any other
event.
• For example, if we flip a coin and get Head
the first time, the probability of getting Head
on the next flip is still 0.5.
Conditional Probability
• Dependent events are events whose probabilities are
affected by previous events.
• Example (crystal balls in a bag): let's assume
there is a bag with 2 blue and 3 red balls.
• What is the probability of getting a blue ball?
0.4 (2 of the 5 balls are blue)
• What is the probability of getting a blue ball after
getting a red ball? 0.5 (2 of the remaining 4 balls are blue)
• What is the probability of getting a blue ball after
getting a blue ball? 0.25 (1 of the remaining 4 balls is blue)
Conditional Probability
• The symbol P(B|A) can be read as “the probability of B, given A.”
This is known as a conditional probability—it is conditional on A. In
other words, it states the probability that B is true, given that we
already know that A is true. P(B|A) is defined by the following rule:
P(B|A) = P(B ∩ A) / P(A)
where P(B ∩ A) is the joint probability of A and B, and P(A) is the marginal probability of A.
• For example, the probability of the weather being both sunny (S)
and rainy (R) at the same time is 1%. The probability of it being sunny is
10%. Then we can calculate the probability that it is rainy, given that it
is sunny, as follows:
P(R|S) = P(R ∩ S) / P(S) = 0.01 / 0.1 = 0.1
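A minimal sketch of the conditional probability rule, using the weather figures from the example. The function name `conditional` is hypothetical; exact fractions are used so the result is not affected by floating-point rounding.

```python
from fractions import Fraction

def conditional(p_joint: Fraction, p_given: Fraction) -> Fraction:
    """P(B|A) = P(B ∩ A) / P(A); undefined when P(A) is 0."""
    if p_given == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return p_joint / p_given

p_rain_and_sun = Fraction("0.01")  # P(R ∩ S)
p_sun = Fraction("0.1")            # P(S)

p_rain_given_sun = conditional(p_rain_and_sun, p_sun)
print(float(p_rain_given_sun))  # 0.1
```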
Event Relationship- Intersection
• The probability of the intersection of two events,
A and B, is the probability that both A and B will
occur together. It is written P(A ∩ B).
• If two events A and B are mutually exclusive,
then P(A ∩ B) = 0.
• The multiplication rule is used to calculate the
intersection, or joint probability, of two or more
events:
P(A ∩ B)= P(A) ∙P(B|A)
• If A and B are independent, the probability of
occurrence of both A and B can be calculated:
P(A ∩ B)= P(A) ∙P(B)
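Both forms of the multiplication rule can be sketched with numbers already used in these slides: the bag with 2 blue and 3 red balls for the dependent case, and two coin flips for the independent case. The variable names are chosen only for illustration.

```python
from fractions import Fraction

# Dependent events: draw a red ball first, then a blue ball, without replacement.
p_red_first = Fraction(3, 5)        # P(A): 3 of 5 balls are red
p_blue_given_red = Fraction(2, 4)   # P(B|A): 2 of the remaining 4 balls are blue
p_red_then_blue = p_red_first * p_blue_given_red   # P(A ∩ B) = P(A) ∙ P(B|A)

# Independent events: Head on the first flip and Head on the second flip.
p_two_heads = Fraction(1, 2) * Fraction(1, 2)      # P(A ∩ B) = P(A) ∙ P(B)

print(float(p_red_then_blue), float(p_two_heads))  # 0.3 0.25
```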
Event Relationship - Union
• The probability of a union between two events A and B is the
probability that A or B or both events occur. It is written P(A ∪ B).
• The probability of union for two events is calculated using
the Addition Rule:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
where P(A ∩ B) is the probability of both A and B occurring together.
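The addition rule can be checked against a direct count. The events below (A = "even number", B = "number greater than 3" on a single die) are assumed for illustration and do not come from the slides.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even number (illustrative event)
B = {4, 5, 6}   # number greater than 3 (illustrative event)

def p(event: set) -> Fraction:
    """Probability of an event on a fair die."""
    return Fraction(len(event), len(die))

by_rule = p(A) + p(B) - p(A & B)   # P(A) + P(B) - P(A ∩ B)
by_count = p(A | B)                # counting the outcomes in A ∪ B directly

print(by_rule, by_count)  # 2/3 2/3
```

Subtracting P(A ∩ B) removes the outcomes 4 and 6, which would otherwise be counted twice.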
Event Relationship - Complement
• The complement of event A consists of all outcomes that do not
result in event A. It is written as A^c.
• To calculate the probability of the complement of an event, the rule is used:
P(A^c) = 1 - P(A)
where A^c includes all possible outcomes in which A does not occur.
This formula reflects the idea that the probability of either A happening or not happening (its complement) must add up to 1, covering all possible outcomes.
Event Relationship - Complement
P(A^c) = 1 - P(A)
Example:
The probability of getting the white ball from the bag = 0.25
What is the probability of not getting the white ball from the bag?
P(ball not white) = 1 - 0.25 = 0.75
Example:
A bag contains Red and Blue cards, the probability of having a Red card = 0.4
What is the probability of having a Blue card?
R = Red card
B = Blue card
The probability of not having a Red card = P(B)
P(B) = 1 - 0.4 = 0.6
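Both complement examples follow the same one-line rule, sketched here with a hypothetical helper named `complement`. The card example assumes, as the slide does, that every card is either Red or Blue.

```python
from fractions import Fraction

def complement(p_event: Fraction) -> Fraction:
    """P(A^c) = 1 - P(A)."""
    return 1 - p_event

p_not_white = complement(Fraction("0.25"))  # white ball example
p_blue_card = complement(Fraction("0.4"))   # Blue is the complement of Red

print(float(p_not_white), float(p_blue_card))  # 0.75 0.6
```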
Joint Probability Distribution
• A joint probability distribution is used to represent the probabilities of the
collected data.
• For example, the following table shows a joint probability distribution
for two variables A, B:
          B       ¬B
A         0.11    0.63
¬A        0.09    0.17
• From this probability distribution we can read off, for example, the joint
probabilities P(A ∩ ¬B) = 0.63 and P(A ∩ B) = 0.11.
• The marginal probability can also be calculated: P(A) = 0.11 + 0.63 = 0.74.
So the probability of event A happening is 0.74.
• The probability of event ¬A (A not happening) can be found by
adding the probabilities of the situations where A does not
happen: P(¬A) = 0.09 + 0.17 = 0.26
Joint Probability Distribution
• A joint probability distribution is used to represent the probabilities of
aggregated data.
• Example: the joint probability table for two variables, A and B, lists the
probability of each combination of A and B, such as A ∩ B.
• This table can be used to determine the
probability of any logical combination of A and B.
• It can also be used to find conditional probabilities such as P(A|B) by using the rule
P(A|B) = P(A ∩ B) / P(B).
In this case P(B) = 0.11 + 0.09 = 0.2 and P(A ∩ B) = 0.11, therefore P(A|B) = 0.11 / 0.2 = 0.55.
P(B) is calculated by summing all probabilities where
B happens, regardless of whether A happens or not.
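The marginal and conditional calculations can be sketched with a small dictionary. The four joint probabilities are the ones quoted in the text (0.11, 0.63, 0.09, 0.17); storing them under `(a, b)` truth-value pairs is a layout assumed here for illustration.

```python
from fractions import Fraction

# Keys are (A happens, B happens); values are the joint probabilities from the text.
joint = {
    (True, True): Fraction("0.11"),    # P(A ∩ B)
    (True, False): Fraction("0.63"),   # P(A ∩ ¬B)
    (False, True): Fraction("0.09"),   # P(¬A ∩ B)
    (False, False): Fraction("0.17"),  # P(¬A ∩ ¬B)
}

# Marginals: sum over every row of the table in which the event holds.
p_a = sum(p for (a, b), p in joint.items() if a)
p_not_a = sum(p for (a, b), p in joint.items() if not a)
p_b = sum(p for (a, b), p in joint.items() if b)

# Conditional: P(A|B) = P(A ∩ B) / P(B)
p_a_given_b = joint[(True, True)] / p_b

print(float(p_a), float(p_not_a), float(p_b), float(p_a_given_b))  # 0.74 0.26 0.2 0.55
```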
Contingency Table
• A contingency table is used to represent data and to facilitate the calculation of
probabilities.
• The table displays frequency values related to two different variables that may depend
on each other or coincide with each other. One variable is used to classify rows and the
other is used to classify columns.
• Let's assume a study of speeding violations and cell phone use violations
produced the following data:

                          Speed Violation   No Speed Violation   Total
Cell Phone Violation            25                 280            305
No Cell Phone Violation         45                 405            450
Total                           70                 685            755

• The probability of a cell phone violation or a speed violation is calculated
with the addition rule, P(P ∪ S) = P(P) + P(S) - P(P ∩ S):
P(Phone) = 305/755
P(Speed) = 70/755
P(Phone ∩ Speed) = 25/755
P(Phone ∪ Speed) = (305/755 + 70/755) - 25/755 = 350/755 ≈ 0.46
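The same calculation can be sketched directly from the four cell counts of the contingency table. The dictionary keys are labels chosen for illustration.

```python
from fractions import Fraction

# Cell counts from the contingency table: (cell phone status, speed status) -> frequency
counts = {
    ("phone", "speed"): 25,
    ("phone", "no_speed"): 280,
    ("no_phone", "speed"): 45,
    ("no_phone", "no_speed"): 405,
}
total = sum(counts.values())  # 755

p_phone = Fraction(sum(n for (ph, sp), n in counts.items() if ph == "phone"), total)
p_speed = Fraction(sum(n for (ph, sp), n in counts.items() if sp == "speed"), total)
p_both = Fraction(counts[("phone", "speed")], total)

# Addition rule: P(Phone ∪ Speed) = P(Phone) + P(Speed) - P(Phone ∩ Speed)
p_either = p_phone + p_speed - p_both

print(total, round(float(p_either), 2))  # 755 0.46
```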
Tree Diagrams
• A tree diagram is a special type of
graph used to determine the results
of an experiment.
• It consists of "branches" that are
categorized either by frequencies or
probabilities. Tree diagrams are used to
make probability problems easier to
visualize and solve.
• The example here shows the sample
space for the experiment of tossing a coin
three times:

1st Toss   2nd Toss   3rd Toss   Outcome
H          H          H          HHH
                      T          HHT
           T          H          HTH
                      T          HTT
T          H          H          THH
                      T          THT
           T          H          TTH
                      T          TTT
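The leaves of the tree can also be generated programmatically. This sketch uses `itertools.product` from the Python standard library, which walks the branches in the same order as the diagram.

```python
from itertools import product

# Each toss has two branches, H or T; three tosses give 2 * 2 * 2 = 8 leaves.
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

print(outcomes)
# ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
```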
Random Variable
• Random variables are used to describe the outcome of a random
event numerically, and thus can take many values.
• Random variables must be measurable and are usually real numbers.
For example, in the experiment of tossing a coin three times, the sample
space is:
Ω = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}, N(Ω) = 8
Suppose X is the number of heads (H) that show up, so the random
variable X takes any of the values 0, 1, 2, 3, meaning no heads, one head,
two heads, or three heads:
X: Ω → {0,1,2,3}
Random Variable Types
There are two random variable types:
1. A discrete random variable takes a countable number of distinct
values.
For example, in the experiment of tossing a coin three times, if x
represents the number of times the coin lands on heads, then x is a
discrete random variable.
2. A continuous random variable takes an infinite number of possible
values within a specified range or interval.
For example, the average height for a group of 25 people. Here, the
value of x is a real number such as 165.6.
Probability Distribution
• The values of random variables and their probabilities can be
presented in a table known as the probability distribution table or in
the form of a probability distribution function.
• The probabilities of the random variable X = the number of times
heads appears in the previous coin experiment are described in the
following table:

X       0     1     2     3     Total
P(X)    1/8   3/8   3/8   1/8   1

P(X=0) = P({TTT}) = 1/8
P(X=1) = P({HTT, THT, TTH}) = 3/8
P(X=2) = P({HHT, HTH, THH}) = 3/8
P(X=3) = P({HHH}) = 1/8
• The probability distribution table can be used for discrete
variables only. It cannot be used for continuous data, because the
possible values of a continuous random variable cannot be counted and
listed one by one.
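The distribution table for X can be derived by counting heads in each of the eight outcomes. The sketch assumes a fair coin, so that all eight outcomes are equally likely; the variable names are chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# X maps each outcome to its number of heads; count how many outcomes give each value.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = (number of outcomes with x heads) / (total number of outcomes)
distribution = {x: Fraction(n, len(outcomes)) for x, n in sorted(head_counts.items())}

for x, p in distribution.items():
    print(x, p)
```

The probabilities sum to 1, matching the Total column of the table.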
Probability Distribution
• A probability distribution function is a statistical function.
• The probability distribution function describes all the possible
values and probabilities that a random variable can take within a
given range. This range is usually bounded by a lower and an upper
limit.
The probability that a random variable X takes a specific value x is written P(X = x).
• There are two types of probability distribution functions:
1. Discrete Probability Distributions: The discrete distribution describes the
probability of occurrence of each value of a discrete random variable.
• A discrete random variable has countable values.
2. Continuous Probability Distributions: A continuous distribution describes
the probabilities of values for a continuous random variable.
• A continuous random variable has an infinite, uncountable set of possible values
(known as the range).
Probability Distribution
• In continuous probability distributions, the
probabilities of a continuous random variable X
are defined as the area under the curve of the
probability density function.
• The normal distribution, or bell curve, is one of
the probability distributions and can be
represented by the mean and the standard
deviation.
• Example: the graph here is the continuous
normal distribution of the height of male adults
in a city. We can calculate the probability that a
man's height is between 160 and 170 centimeters
by calculating the area under the curve between
160 and 170.
[Figure: normal distribution curve of male adult height, with 160 and 170 marked on the horizontal axis]
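The area between 160 and 170 can be computed from the cumulative distribution function. The slides do not state the mean or standard deviation of the height distribution, so the values below (mean 170 cm, standard deviation 10 cm) are assumptions made only to have something to compute with. `NormalDist` is part of the Python standard library (3.8 and later).

```python
from statistics import NormalDist

# ASSUMED parameters, not given in the slides: mean 170 cm, standard deviation 10 cm.
heights = NormalDist(mu=170, sigma=10)

# Area under the density curve between 160 and 170 = CDF(170) - CDF(160)
p_between = heights.cdf(170) - heights.cdf(160)

print(round(p_between, 4))  # 0.3413
```

With different assumed parameters the area changes, but the method of subtracting two CDF values stays the same.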
Thank You