Module 2 Math Foundation II
CEN524: MATHEMATICAL
FOUNDATIONS FOR AI
BY
Omoruyi O., Olatimehin O. and Ajilore A.
Module 2
Outline II
• Introduction to Probability and Statistics
- Define probability and statistics, and their importance in AI
- Introduce Bayes' theorem and its significance
• Bayes' Theorem
- Explain the concept of Bayes' theorem and its application
- Provide examples and practice problems
• Distributions
- Introduce different types of distributions, such as Gaussian, Bernoulli, and Poisson
- Explain their importance in AI and provide examples
• Statistical Inference
- Introduce the concept of statistical inference and its application in AI
- Explain the importance of hypothesis testing and confidence intervals
Outline III
• Introduction to Calculus
- Define calculus and its importance in AI
- Introduce the concept of gradients and optimization
• Gradients
- Explain the concept of gradients and their application in optimization
- Provide examples and practice problems
• Optimization Basics
- Introduce the concept of optimization and its application in AI
- Explain the importance of convex optimization and local minima
• Applications of Calculus in AI
- Explain the application of calculus in AI, such as in deep learning and neural networks
- Provide examples and case studies
Defining Probability and Statistics, and Their Importance in AI
• Probability: The study of uncertainty, quantifying the likelihood of
events (e.g., a coin landing heads has a 50% chance). It provides a
mathematical framework to model randomness.
• Statistics: The science of collecting, analyzing, and interpreting data
to uncover patterns or make predictions.
• Importance in AI:
§ AI systems rely on probability to handle uncertainty (e.g., predicting
outcomes in self-driving cars).
§ Statistics enables data-driven decisions, model evaluation, and learning
from patterns (e.g., training machine learning models on datasets).
§ Together, they form the backbone of algorithms like neural networks,
decision trees, and reinforcement learning.
Bayes' Theorem
A fundamental rule in probability that updates beliefs based on
new evidence. Enables reasoning under uncertainty (e.g., spam
email detection).
• P(A|B) = [P(B|A) · P(A)] / P(B)
• ( P(A|B) ): Posterior probability (probability of A given B).
• ( P(B|A) ): Likelihood (probability of B given A).
• ( P(A) ): Prior probability (initial belief about A).
• ( P(B) ): Evidence (normalizing constant).
• Foundation for probabilistic models like Naive Bayes classifiers
and Bayesian networks.
Bayes' Theorem and Its Application
Bayes' theorem reverses conditional probabilities, allowing us to update probabilities as
new data arrives.
Example 1: If 1% of people have a disease (P(A) = 0.01), a test is 95% accurate (P(B|A) =
0.95), and has a 10% false positive rate (P(B|¬A) = 0.10), what's the probability of having the
disease given a positive test (P(A|B))?
Use Bayes:
P(A|B) = [P(B|A) · P(A)] / P(B) where
P(B) = P(B|A) · P(A) + P(B|¬A) · P(¬A)
Example 2: A disease affects 2% of people (P(D) = 0.02). A test is 95% accurate for
positives (P(T|D) = 0.95) and has a 10% false positive rate (P(T|¬D) = 0.10). If the test is
positive, what's P(D|T)?
P(T) = P(T|D) · P(D) + P(T|¬D) · P(¬D) = (0.95 · 0.02) + (0.10 · 0.98) = 0.019 + 0.098 =
0.117
P(D|T) = [P(T|D) · P(D)] / P(T) = 0.019 / 0.117 ≈ 0.162 (16.2%).
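A minimal Python sketch of Example 2; the function name bayes_posterior is illustrative, not from the slides.

# Bayes' theorem with the evidence expanded over D and ¬D
def bayes_posterior(prior, likelihood, false_positive_rate):
    evidence = likelihood * prior + false_positive_rate * (1 - prior)   # P(T)
    return likelihood * prior / evidence                                 # P(D|T)

# Example 2: P(D) = 0.02, P(T|D) = 0.95, P(T|¬D) = 0.10
print(bayes_posterior(0.02, 0.95, 0.10))   # ≈ 0.162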
Bayes Practice Problems
Problem 1: A robot predicts rain with 80% accuracy.
If it rains 30% of the time, what’s the probability it’s
raining given the robot predicts rain?
• P(Rain|Predict) = (P(Predict|Rain) · P(Rain)) /
P(Predict)
Problem 2: A model detects fraud with 90% accuracy.
Fraud occurs in 2% of transactions. If the model flags
a transaction, what’s the chance it’s fraud?
Different Types of Distributions
Gaussian (Normal) Distribution: Continuous, bell-shaped distribution
defined by mean (μ) and variance (σ²). Example: Test scores or sensor noise.
Bernoulli Distribution: Discrete, models a single trial with two outcomes
(success = 1 with probability p, failure = 0). Example: Whether a user clicks a
link.
Poisson Distribution: Discrete, models the number of events in a fixed interval,
parameterized by λ (average rate). Example: Number of emails received per
hour.
Importance in AI:
Gaussian: Assumed in many models (e.g., linear regression) and used for data
normalization.
Bernoulli: Core to binary classification tasks (e.g., fraud detection).
Poisson: Useful for modeling event frequencies (e.g., traffic prediction).
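A small NumPy sketch of sampling from the three distributions; the parameter values are illustrative.

import numpy as np

rng = np.random.default_rng(0)
scores = rng.normal(loc=70, scale=10, size=1000)    # Gaussian: mean μ = 70, std σ = 10 (e.g., test scores)
clicks = rng.binomial(n=1, p=0.3, size=1000)        # Bernoulli: one trial, success probability p = 0.3 (e.g., link clicks)
emails = rng.poisson(lam=4, size=1000)              # Poisson: rate λ = 4 events per interval (e.g., emails per hour)
print(scores.mean(), clicks.mean(), emails.mean())  # sample means ≈ 70, 0.3, 4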
Probability Distribution
Probability Distribution in AI
• Gaussian in AI: Neural networks often assume input
features are normally distributed after standardization.
Example: Predicting house pr ices with normally
distributed errors.
• Bernoulli in AI: Used in logistic regression to predict
binary outcomes. Example: Classifying an image as “cat”
or “not cat.”
• Poisson in AI: Models rare events in time-series data.
Example: Predicting server failures based on historical
crash rates.
Statistical Inference and Its Application in AI
Statistical Inference is the process of using sample data to make
generalizations about a population. Includes estimating
parameters (e.g., mean accuracy of a model) and testing
hypotheses.
Application in AI:
• Parameter Estimation: Inferring weights in a machine
learning model from training data.
• Model Evaluation: Determining if an AI system's
performance is due to skill or chance. Example: Inferring
customer preferences from a sample of purchase data.
Hypothesis Testing
A method to test claims about data using statistical
evidence.
Process:
1. State the null hypothesis (H₀, e.g., "no difference in model
performance"),
2. compute a test statistic,
3. and compare to a threshold (p-value < α, typically 0.05).
• Example: Test if a new AI algorithm outperforms an old
one.
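A minimal sketch of steps 1–3 using a two-sample t-test from SciPy; the per-run accuracy numbers are made up for illustration.

from scipy import stats

old_runs = [0.81, 0.79, 0.80, 0.82, 0.78]   # old algorithm's accuracy over 5 evaluation runs
new_runs = [0.85, 0.84, 0.86, 0.83, 0.85]   # new algorithm's accuracy over 5 evaluation runs

# H₀: no difference in mean accuracy between the two algorithms
t_stat, p_value = stats.ttest_ind(new_runs, old_runs)
print(p_value < 0.05)   # True here -> reject H₀ at α = 0.05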
Confidence Interval
A range estimating a parameter with a confidence level
(e.g., 95% CI for accuracy: 88%–92%). Indicates
uncertainty in estimates.
• Example: Reporting an AI’s error rate with a range to
show reliability.
Importance in AI:
• Hypothesis testing validates improvements (e.g., “Is this
model significantly better?”).
• Confidence intervals quantify uncertainty, ensuring
trustworthy AI deployment.
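A short sketch of a 95% confidence interval for accuracy using the normal approximation; the evaluation counts are illustrative.

import math

correct, total = 900, 1000                               # illustrative test-set results
p_hat = correct / total                                  # observed accuracy
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / total)   # 95% normal-approximation margin
print(f"{p_hat - margin:.3f} to {p_hat + margin:.3f}")   # ≈ 0.881 to 0.919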
Conclusion on Probability and Statistics
• Probability and statistics are essential for AI to model
uncertainty, learn from data, and evaluate performance.
• Bayes' theorem enables adaptive reasoning, critical for
real-time AI applications.
• Distributions like Gaussian, Bernoulli, and Poisson
underpin data modeling in AI tasks.
• Statistical inference, hypothesis testing, and confidence
intervals ensure AI systems are robust and reliable.
Calculus
The mathematical study of change and accumulation, divided into:
• Differential Calculus: Focuses on rates of change (e.g., slopes, derivatives).
• Integral Calculus: Deals with accumulation (e.g., areas, sums over intervals).
Importance in AI:
• Enables optimization of models by finding minima or maxima (e.g.,
minimizing error in machine learning).
• Powers gradient-based methods, the backbone of training algorithms like
neural networks.
• Helps model continuous relationships in data, critical for tasks like
regression and deep learning.
Gradient
A vector of partial derivatives representing the
direction and rate of steepest increase of a function.
For a function f(x, y), the gradient is ∇f = (∂f/∂x,
∂f/∂y).
• Optimization: The process of finding the best
solution (e.g., minimum or maximum) of a function,
often called the objective or loss function in AI.
• Connection: Gradients guide optimization by
indicating how to adjust variables to reduce error
or improve performance.
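A brief numerical sketch of the gradient as a vector of partial derivatives, approximated by central finite differences; the function f(x, y) = x² + 3y is illustrative.

def f(x, y):
    return x**2 + 3*y

def gradient(f, x, y, h=1e-6):
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)   # ∂f/∂x
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)   # ∂f/∂y
    return (dfdx, dfdy)

print(gradient(f, 2.0, 1.0))   # ≈ (4.0, 3.0), matching ∇f = (2x, 3)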
Optimization
The gradient points uphill; its negative points downhill. In optimization,
we follow the negative gradient to minimize a function.
• Example: For f(x) = x², the derivative is f'(x) = 2x. At x = 2, the
gradient is 4, so moving opposite (downhill) reduces f(x).
• Application in Optimization:
Gradient Descent: Iteratively update parameters: x_new = x_old - η·∇f,
where η is the learning rate.
• Used to minimize loss functions in AI (e.g., mean squared error in
regression).
• Intuition: Think of gradient descent as a hiker descending a foggy
mountain by feeling the steepest slope underfoot.
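A minimal gradient descent sketch for f(x) = x² from this slide; the starting point and number of steps are illustrative.

def grad(x):
    return 2 * x               # f(x) = x², so f'(x) = 2x

x, eta = 2.0, 0.1              # start at x = 2 with learning rate η = 0.1
for step in range(20):
    x = x - eta * grad(x)      # x_new = x_old - η·∇f
print(x)                       # ≈ 0.023, approaching the minimum at x = 0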
Examples and Practice Problems
Example 1: Minimize f(x) = x² + 2x + 1.
• Derivative: f'(x) = 2x + 2.
• Set f'(x) = 0: 2x + 2 = 0, so x = -1 (minimum).
• Gradient descent: Start at x = 1, f'(1) = 4, step with η
= 0.1: x = 1 - 0.1•4 = 0.6.
Practice Problem: Use gradient descent to minimize
f(x) = 3x² - 6x + 5. Start at x = 0, η = 0.1, 2 steps.
Optimization and Its Application in AI
Finding the parameter values that minimize (or maximize) an
objective function. In AI, the objective is often a loss function
(e.g., difference between predicted and actual values).
Application in AI:
• Linear Regression: Minimize squared error to fit a line to data.
• Neural Networks: Adjust weights to minimize prediction error.
• Reinforcement Learning: Maximize cumulative reward.
• Example: Training a model to predict house prices by
minimizing the error between predicted and actual prices.
Convex Optimization
A function is convex if any line segment between two
points on its graph lies above or on the graph (e.g., f(x) =
x²).
Importance:
• Guarantees a single global minimum, making
optimization reliable and efficient.
• In AI: Convex loss functions (e.g., in logistic regression)
ensure gradient descent finds the best solution.
• Local Minima: Points where the function value is lower than at
nearby points but not necessarily the global minimum; non-convex
losses (e.g., in deep networks) can trap gradient descent there.
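A small numerical sketch of the chord definition of convexity; the test function and random points are illustrative.

import random

def f(x):
    return x**2                          # convex example from this slide

random.seed(0)
is_convex = True
for _ in range(1000):
    a, b, t = random.uniform(-10, 10), random.uniform(-10, 10), random.random()
    chord = t * f(a) + (1 - t) * f(b)    # point on the line segment between (a, f(a)) and (b, f(b))
    curve = f(t * a + (1 - t) * b)       # corresponding point on the graph
    is_convex = is_convex and curve <= chord + 1e-9
print(is_convex)   # True: every chord lies on or above the graph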
Calculus in AI
• Forward Pass: Compute predictions using a function of inputs and weights.
• Backward Pass (Backpropagation): Use gradients to update weights by
minimizing loss.
• Chain rule: ∂L/∂w = ∂L/∂y·∂y/∂w propagates errors backward.
Deep Learning:
Neural networks are compositions of functions (layers), and calculus optimizes
millions of parameters.
• Example: In a 3-layer network, gradients adjust weights to reduce
classification error.
Neural Networks:
• Loss function (e.g., cross-entropy) is differentiated w.r.t. each weight,
enabling learning.
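A tiny backpropagation sketch for a single weight with squared-error loss; the numbers are illustrative, and this is one neuron rather than a full network.

x, target = 2.0, 10.0          # one training example
w, eta = 1.0, 0.05             # initial weight and learning rate

for step in range(100):
    y = w * x                  # forward pass: prediction
    dL_dy = 2 * (y - target)   # ∂L/∂y for L = (y - target)²
    dy_dw = x                  # ∂y/∂w
    dL_dw = dL_dy * dy_dw      # chain rule: ∂L/∂w = ∂L/∂y · ∂y/∂w
    w = w - eta * dL_dw        # backward pass: gradient update

print(w * x)                   # ≈ 10.0: the prediction converges to the target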
Example:
Linear Regression:
• Loss: L = (1/n)∑(yᵢ - (wxᵢ + b))²
• Gradients: ∂L/∂w = -(2/n)∑xᵢ(yᵢ - (wxᵢ + b)),
∂L/∂b = -(2/n)∑(yᵢ - (wxᵢ + b))
• Optimization: Gradient descent fits the line.
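A short sketch of gradient descent with exactly these gradients; the four data points and learning rate are illustrative.

xs = [1.0, 2.0, 3.0, 4.0]          # data lying exactly on y = 2x + 1
ys = [3.0, 5.0, 7.0, 9.0]
n = len(xs)
w, b, eta = 0.0, 0.0, 0.01

for step in range(5000):
    dw = -(2 / n) * sum(x * (y - (w * x + b)) for x, y in zip(xs, ys))   # ∂L/∂w
    db = -(2 / n) * sum(y - (w * x + b) for x, y in zip(xs, ys))         # ∂L/∂b
    w, b = w - eta * dw, b - eta * db

print(round(w, 2), round(b, 2))    # ≈ 2.0 and 1.0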
Case Study
Image Classification with CNNs:
• Convolutional Neural Networks (CNNs) use calculus to
optimize filters and weights.
• Loss: Cross-entropy between predicted and true labels.
• Backpropagation adjusts millions of parameters to recognize
patterns (e.g., edges, shapes).
GPT Models:
• Transformer-based models (like ChatGPT) rely on gradient
descent to optimize attention weights, enabling language
understanding.