Fundamentals of Data Science

Uploaded by

1002poonam
Copyright
© All Rights Reserved

Statistics and Probability in Data Science

Presentation Summary
Introduction

▶ Statistics and probability are foundational to data science
▶ Essential for AI, machine learning, and deep learning
▶ Mathematics is embedded in every aspect of our lives
Data Types

▶ Qualitative Data
  ▶ Nominal: Categories with no inherent order (e.g., gender, race)
  ▶ Ordinal: Categories with a meaningful order (e.g., ratings)
▶ Quantitative Data
  ▶ Discrete: Countable values (e.g., number of students)
  ▶ Continuous: Any value within a range (e.g., weight)
Variable Types

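▶ Discrete variables (countable or categorical values)
▶ Continuous variables (any value within a range)
▶ Independent variables (inputs that are varied or observed)
▶ Dependent variables (outcomes that depend on the independent variables)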
Statistics Overview

▶ Definition: A branch of applied mathematics concerned with data collection, analysis, interpretation, and presentation
▶ Types:
  ▶ Descriptive Statistics: Summarizes the main features of a data set
  ▶ Inferential Statistics: Draws conclusions about a population from a sample
▶ Key concepts: Population and Sample
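
The distinction between a population, a sample, and descriptive statistics can be illustrated with a minimal sketch. The scores below are hypothetical values made up for illustration, and `describe` is a helper name introduced here, not something from the slides.

```python
import statistics

# Hypothetical population: exam scores for every student in a small class.
population = [52, 61, 67, 70, 73, 75, 78, 81, 84, 90, 93, 97]

# A sample is a subset of the population (here, every third score).
sample = population[::3]


def describe(values):
    """Return a few common descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


summary = describe(sample)
print("sample:", sample)
print("sample summary:", summary)
print("population mean:", statistics.mean(population))
```

The sample mean is a descriptive statistic of the sample. Treating it as an estimate of the population mean is the step that inferential statistics formalizes.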
Sampling Techniques

▶ Probability Sampling
  ▶ Random sampling
  ▶ Systematic sampling
  ▶ Stratified sampling
  ▶ Cluster sampling
▶ Non-probability Sampling
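
The four probability sampling techniques listed above can be sketched with the standard library. This is one possible reading of each technique; the function names, the `people` records, and the `clusters` mapping are all hypothetical.

```python
import random


def simple_random_sample(population, k, seed=0):
    """Random sampling: every item has the same chance of being chosen."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def systematic_sample(population, step, start=0):
    """Systematic sampling: take every `step`-th item from a starting index."""
    return population[start::step]


def stratified_sample(population, key, fraction, seed=0):
    """Stratified sampling: sample the same fraction from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    chosen = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        chosen.extend(rng.sample(members, k))
    return chosen


def cluster_sample(clusters, n_clusters, seed=0):
    """Cluster sampling: pick whole clusters at random and keep all members."""
    rng = random.Random(seed)
    picked = rng.sample(list(clusters), n_clusters)
    return [item for name in picked for item in clusters[name]]


# Hypothetical data: 20 people in two groups, and three regional clusters.
people = [{"id": i, "group": "A" if i < 12 else "B"} for i in range(20)]
clusters = {"north": [1, 2], "south": [3, 4], "east": [5, 6]}

print(simple_random_sample(list(range(100)), 10))
print(systematic_sample(list(range(10)), 3))
print(stratified_sample(people, key=lambda p: p["group"], fraction=0.25))
print(cluster_sample(clusters, 2))
```

The fixed `seed` only makes the sketch repeatable; in practice the seed would usually be left unset or chosen deliberately.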
Information Gain and Entropy

▶ Entropy: Measure of uncertainty (impurity) in data
▶ Information Gain: How much a feature reduces uncertainty about the final outcome
▶ Used in decision trees and random forests
▶ Example: Predicting whether a match can be played based on weather conditions
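
As a minimal sketch of the match-and-weather example, entropy can be computed as $H = -\sum_i p_i \log_2 p_i$ over the class proportions, and information gain as the parent entropy minus the weighted entropy of the groups produced by a split. The six rows below are hypothetical data invented for illustration; the slides do not give a data set.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy of the target minus the weighted entropy after splitting on feature."""
    parent = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - weighted


# Hypothetical observations: was the match played under these conditions?
rows = [
    {"outlook": "sunny", "windy": False, "play": "no"},
    {"outlook": "sunny", "windy": True, "play": "no"},
    {"outlook": "overcast", "windy": False, "play": "yes"},
    {"outlook": "overcast", "windy": True, "play": "yes"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
    {"outlook": "rainy", "windy": True, "play": "no"},
]

gain_outlook = information_gain(rows, "outlook", "play")
gain_windy = information_gain(rows, "windy", "play")
print("gain(outlook):", gain_outlook)
print("gain(windy):", gain_windy)
```

In this made-up data, splitting on `outlook` yields a larger gain than splitting on `windy`, so a decision tree that selects splits by information gain would test `outlook` first.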
Probability Theory

▶ Probability: Measure of how likely an event is to occur
▶ Key concepts:
  ▶ Random experiment
  ▶ Sample space
  ▶ Event
▶ Types of events: Disjoint (cannot occur together) and Non-disjoint (can occur together)
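
These terms can be made concrete with a single roll of a fair six-sided die, an example chosen here for illustration. The sketch assumes equally likely outcomes, so the probability of an event is the number of outcomes in the event divided by the size of the sample space.

```python
from fractions import Fraction

# Random experiment: roll one fair die. Sample space: all possible outcomes.
sample_space = frozenset(range(1, 7))


def probability(event, space=sample_space):
    """P(event) for equally likely outcomes: favourable / total."""
    return Fraction(len(event & space), len(space))


def are_disjoint(a, b):
    """Two events are disjoint when they share no outcome."""
    return a.isdisjoint(b)


# Events are subsets of the sample space.
even = {2, 4, 6}
odd = {1, 3, 5}
greater_than_three = {4, 5, 6}

print("P(even) =", probability(even))
print("even and odd disjoint:", are_disjoint(even, odd))
print("even and >3 disjoint:", are_disjoint(even, greater_than_three))

# For disjoint events the probabilities simply add.
print("P(even or odd) =", probability(even | odd))
# For non-disjoint events the overlap must be subtracted once.
print("P(even or >3) =", probability(even | greater_than_three))
```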
Types of Probability

▶ Marginal Probability: Probability of an event, unconditional on any other event
▶ Joint Probability: Probability of two events occurring together
▶ Conditional Probability: Probability of an event given that another event has occurred
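
The three kinds of probability can be compared on a standard 52-card deck, an example assumed here rather than taken from the slides. The sketch counts cards directly and uses $P(A \mid B) = P(A \cap B) / P(B)$ for the conditional case.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def prob(predicate, cards=deck):
    """Fraction of cards for which predicate(card) is true."""
    hits = sum(1 for card in cards if predicate(card))
    return Fraction(hits, len(cards))


def is_heart(card):
    return card[1] == "hearts"


def is_red(card):
    return card[1] in ("hearts", "diamonds")


marginal = prob(is_heart)                              # P(heart)
joint = prob(lambda c: is_heart(c) and is_red(c))      # P(heart and red)
conditional = joint / prob(is_red)                     # P(heart | red)

print("P(heart) =", marginal)
print("P(heart and red) =", joint)
print("P(heart | red) =", conditional)
```

Knowing that the card is red doubles the chance that it is a heart, which is the sense in which a conditional probability depends on another event.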
Probability Distribution Functions

▶ Probability Density Function (PDF)
▶ Normal Distribution
▶ Central Limit Theorem
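
A small simulation can tie these three items together. The sketch below writes out the normal PDF, $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, and then draws repeated samples from a uniform distribution to illustrate the Central Limit Theorem: the sample means cluster around the population mean with a spread close to $\sigma/\sqrt{n}$. The sample sizes, the seed, and the choice of a uniform source distribution are assumptions made for illustration.

```python
import math
import random
import statistics


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def sample_means(n_samples, sample_size, seed=0):
    """Means of repeated samples drawn from Uniform(0, 1), which is not normal."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


means = sample_means(n_samples=2000, sample_size=30)

# Uniform(0, 1) has mean 0.5 and variance 1/12, so the CLT predicts that the
# sample means have mean 0.5 and standard deviation sqrt(1/12) / sqrt(30).
predicted_sd = math.sqrt(1 / 12) / math.sqrt(30)

print("peak of standard normal pdf:", normal_pdf(0.0))
print("mean of sample means:", statistics.fmean(means))
print("sd of sample means:", statistics.stdev(means))
print("sd predicted by CLT:", predicted_sd)
```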
Bayes’ Theorem

▶ Shows relation between conditional probability and its inverse
▶ Formula: $P(A \mid B) = \dfrac{P(B \mid A)\, P(A)}{P(B)}$
▶ Used in naive Bayes algorithm (e.g., spam filtering)
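
A worked spam-filtering calculation shows the formula in use. All the probabilities below are hypothetical numbers chosen for illustration; $P(B)$ is obtained from the law of total probability.

```python
def bayes_posterior(prior, likelihood, evidence):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Hypothetical figures: A = "message is spam", B = "message contains the word 'free'".
p_spam = 0.2               # P(A): prior probability that a message is spam
p_word_given_spam = 0.6    # P(B|A): the word appears in a spam message
p_word_given_ham = 0.05    # P(B|not A): the word appears in a legitimate message

# Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = bayes_posterior(p_spam, p_word_given_spam, p_word)
print("P(spam | word) =", p_spam_given_word)
```

With these assumed numbers the posterior works out to 0.12 / 0.16 = 0.75: seeing the word raises the probability of spam from 20% to 75%. A naive Bayes classifier repeats this update across many words under the assumption that they are conditionally independent.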
Inferential Statistics

▶ Forms inferences and predictions about a population based on a sample
▶ Point Estimation vs. Interval Estimation
▶ Confidence Interval and Margin of Error
▶ Methods of Estimation:
  ▶ Method of Moments
  ▶ Maximum Likelihood
  ▶ Bayes Estimator
  ▶ Best Unbiased Estimator
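
Point and interval estimation can be sketched as follows. The sample values are hypothetical, and the interval uses the normal approximation with z = 1.96 for roughly 95% confidence; for a sample this small a t-distribution critical value would give a wider interval. The last lines contrast two variance estimates for normally distributed data: the maximum-likelihood estimate (divide by n) and the unbiased estimate (divide by n - 1).

```python
import math
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Return (point estimate, margin of error, (low, high)) for the mean.

    Assumes the normal approximation; z = 1.96 corresponds to about 95%.
    """
    n = len(sample)
    point = statistics.fmean(sample)
    margin = z * statistics.stdev(sample) / math.sqrt(n)
    return point, margin, (point - margin, point + margin)


# Hypothetical sample of five measurements.
sample = [10, 12, 14, 16, 18]

point, margin, (low, high) = mean_confidence_interval(sample)
print("point estimate:", point)
print("margin of error:", margin)
print("interval estimate:", (low, high))

# Variance estimates under a normal model:
mle_variance = statistics.pvariance(sample)      # divides by n (maximum likelihood)
unbiased_variance = statistics.variance(sample)  # divides by n - 1 (unbiased)
print("MLE variance:", mle_variance)
print("unbiased variance:", unbiased_variance)
```

The point estimate is a single number, while the interval estimate reports a range whose half-width is the margin of error.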
Conclusion

▶ Statistics and probability are crucial for data science
▶ Understanding these concepts helps in:
  ▶ Data analysis
  ▶ Machine learning model development
  ▶ Interpreting results
  ▶ Making informed decisions
