Unit II Descriptive-Statistics-And-Correlation

Notes for data science

Uploaded by

Pratik Bante

UNRAVELING DATA RELATIONSHIPS: A DEEP DIVE INTO DESCRIPTIVE STATISTICS AND CORRELATION
Unit 2: Statistics
• Descriptive statistics
• Correlation
• Distributions and probability
• Statistical inference: populations and samples
• Statistical modelling
• Probability distributions
• Fitting a model
• Hypothesis testing
INTRODUCTION TO DATA
RELATIONSHIPS

In this presentation, we will explore
data relationships, focusing on
descriptive statistics and
correlation.
Understanding these concepts is
crucial for analyzing data
effectively and making informed
decisions. Let's begin this journey
to uncover the insights hidden
within our data.
WHAT ARE DESCRIPTIVE
STATISTICS?
Descriptive statistics summarize
and describe the main features of
a dataset.
They provide insights into the
central tendency, variability, and
overall distribution of the data.
Common measures include mean,
median, mode,
and standard deviation.
Understanding these statistics is
foundational for any data
analysis.
CENTRAL TENDENCY
Central tendency measures, such
as mean, median, and mode,
help us understand the typical
value in a dataset. The mean is
the average, the median is the
middle value, and the mode is
the most frequently occurring
value. Each measure provides
unique insights into the data's
distribution.
A measure of central tendency describes where most of the values in the
dataset occur. It’s the center of the distribution of values. Excel presents three
measures of central tendency. Which one is best for your data?
Mean: This measure is the one with which you’re most familiar. It’s the sum of all
observations divided by the number of observations. It’s best for data that follow
symmetric distributions.
Median: This value splits your data in half. Half the values fall above the median
while half are below it. It’s best for skewed distributions.
Mode: This measure represents the value that occurs most frequently in your data.
It’s best for categorical and ordinal data.
The example data are continuous variables. Excel frequently displays “#N/A”
for the mode when you have continuous data. That happens because continuous
data are unlikely to contain exactly duplicated values, which the mode requires.
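The guidance above can be tried out with a short sketch using Python's standard `statistics` module. The four small datasets are hypothetical values chosen for illustration and do not come from these notes.

```python
import statistics

# Hypothetical values chosen for illustration; they are not from the notes.
symmetric = [4, 5, 6, 7, 8]
skewed = [4, 5, 6, 7, 48]                  # same data with one extreme value
colors = ["red", "blue", "red", "green"]   # categorical data
continuous = [1.23, 4.56, 7.89]            # no exactly duplicated values

# Symmetric data: mean and median agree.
print(statistics.mean(symmetric), statistics.median(symmetric))   # 6 6

# Skewed data: the extreme value pulls the mean up, the median does not move.
print(statistics.mean(skewed), statistics.median(skewed))         # 14 6

# Categorical data: the mode is the only one of the three that applies.
print(statistics.mode(colors))                                    # red

# Continuous data without duplicates: every value ties for "most frequent",
# which is why a spreadsheet reports no mode here.
print(statistics.multimode(continuous))                           # [1.23, 4.56, 7.89]
```

`statistics.multimode` needs Python 3.8 or later. It returns every value that ties for the highest count, which makes the "no useful mode" situation visible instead of raising an error.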
UNDERSTANDING VARIABILITY
Variability refers to how much the
data points differ from each
other. Key measures include
range, variance, and standard
deviation. High variability
indicates that data points are
spread out, while low variability
suggests they are clustered
closely. Understanding variability
is essential for interpreting data
accurately.
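To make the contrast between low and high variability concrete, here is a minimal sketch that compares two hypothetical datasets with the same mean. The helper name `spread` and both datasets are assumptions made for this illustration. The variance and standard deviation use the population form (divide by the number of values).

```python
import statistics

def spread(values):
    """Return the range, population variance, and population standard deviation."""
    return {
        "range": max(values) - min(values),
        "variance": statistics.pvariance(values),
        "std_dev": statistics.pstdev(values),
    }

# Hypothetical data: both lists have a mean of 50.
clustered = [48, 49, 50, 51, 52]     # low variability
spread_out = [10, 30, 50, 70, 90]    # high variability

print(spread(clustered))
print(spread(spread_out))
```

The mean alone cannot tell these two datasets apart. The range (4 versus 80) and the variance (2 versus 800) show how differently the values are spread around it.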
1. Range

Definition: The range is the difference between the largest and smallest values in
a dataset.
Formula: Range = Maximum Value − Minimum Value
Example: If a data set contains values 2, 5, 8, 10, and 12,
the range is: 12−2=10
Explanation: The range gives a quick sense of the spread of the data, but it is
affected by extreme values (outliers).
2. Variance
3. Standard Deviation
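The usual definitions of these two measures are written out here as a sketch. The symbols are assumptions, since the notes do not fix a notation. The first line is the population form, which divides by the number of values, as the exam-score example in these notes does. The second line is the sample form, which is common when the data are a sample from a larger population.

```latex
% Population form: divide by N
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\sigma = \sqrt{\sigma^2}

% Sample form: divide by n - 1
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
s = \sqrt{s^2}

% Population form applied to the values from the range example (2, 5, 8, 10, 12)
\mu = \frac{37}{5} = 7.4, \qquad
\sigma^2 = \frac{29.16 + 5.76 + 0.36 + 6.76 + 21.16}{5} = \frac{63.2}{5} = 12.64, \qquad
\sigma = \sqrt{12.64} \approx 3.56
```

Because the standard deviation is the square root of the variance, it is expressed in the same units as the data, which makes it easier to interpret than the variance.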
Example: Exam Scores

Suppose you have the following scores of 20 students on an exam:
85, 90, 75, 92, 88, 79, 83, 95, 87, 91, 78, 86, 89, 94, 82, 80, 84, 93, 88, 81
To calculate descriptive statistics:
• Mean: Add up all the scores and divide by the number of scores. Mean = (85 + 90 + 75 + 92 + 88 + 79 + 83
+ 95 + 87 + 91 + 78 + 86 + 89 + 94 + 82 + 80 + 84 + 93 + 88 + 81) / 20 = 1720 / 20 = 86
• Median: Arrange the scores in ascending order and find the middle value. With 20 scores there is no single
middle value, so the median is the average of the 10th and 11th values. Median = (86 + 87) / 2 = 86.5
• Mode: Identify the score(s) that appear(s) most frequently. Mode = 88 (it appears twice)
• Range: Calculate the difference between the highest and lowest scores. Range = 95 - 75 = 20
• Variance: Calculate the average of the squared differences from the mean. Variance = [(85-86)^2 + (90-
86)^2 + ... + (81-86)^2] / 20 = 614 / 20 = 30.7
• Standard Deviation: Take the square root of the variance. Standard Deviation = √30.7 ≈ 5.54
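The hand calculation for the 20 exam scores can be checked with Python's standard `statistics` module. This is a minimal sketch. The variable names are assumptions. `pvariance` and `pstdev` are used because they divide by the number of scores, matching "the average of the squared differences from the mean".

```python
import statistics

scores = [85, 90, 75, 92, 88, 79, 83, 95, 87, 91,
          78, 86, 89, 94, 82, 80, 84, 93, 88, 81]

total = sum(scores)                        # 1720
mean = statistics.mean(scores)             # 86
median = statistics.median(scores)         # 86.5 (average of the two middle values)
mode = statistics.mode(scores)             # 88
value_range = max(scores) - min(scores)    # 20
variance = statistics.pvariance(scores)    # 30.7 (population variance)
std_dev = statistics.pstdev(scores)        # about 5.54

print(f"sum={total} mean={mean} median={median} mode={mode}")
print(f"range={value_range} variance={variance:.2f} std_dev={std_dev:.2f}")
```

If the 20 students were treated as a sample from a larger group, `statistics.variance` and `statistics.stdev` (which divide by n - 1) would be the usual choice and would give slightly larger values.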
VISUALIZING DESCRIPTIVE
STATISTICS

Data visualization tools like
histograms, box plots, and
scatter plots help illustrate
descriptive statistics
effectively. These visuals
provide a clearer
understanding of data
distribution, central tendency,
and variability, making it easier to
communicate findings to
stakeholders and decision-makers.
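As a rough illustration of what a histogram shows, the sketch below bins the 20 exam scores from these notes into 5-point intervals and prints one `#` per score. It uses only the standard library. The bin width is an assumption, and in practice a plotting library would draw the same information graphically.

```python
from collections import Counter

scores = [85, 90, 75, 92, 88, 79, 83, 95, 87, 91,
          78, 86, 89, 94, 82, 80, 84, 93, 88, 81]

bin_width = 5  # assumed bin width for this illustration

# Map each score to the lower edge of its bin, then count scores per bin.
counts = Counter((score // bin_width) * bin_width for score in scores)

for start in sorted(counts):
    end = start + bin_width - 1
    print(f"{start}-{end}: {'#' * counts[start]}")

# Output:
# 75-79: ###
# 80-84: #####
# 85-89: ######
# 90-94: #####
# 95-99: #
```

The tallest bar sits in the 85-89 bin, around the center of the data, and the bars fall away on both sides. That shape is the distribution that the mean, median, and standard deviation summarize in numbers.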
INTRODUCTION TO
CORRELATION

Correlation measures the strength
and direction of the relationship
between two variables. The correlation
coefficient ranges from -1 to 1, where
-1 indicates a perfect negative
correlation, 1 indicates a perfect
positive correlation, and 0 indicates
no linear correlation. Understanding
correlation is vital for identifying
relationships in data.
TYPES OF
CORRELATION
There are three main types of
correlation: positive, negative, and
no correlation. Positive correlation
means that as one variable
increases, the other also
increases. Negative correlation
indicates that as one variable
increases, the other decreases. No
correlation means there is no
discernible relationship between
the variables.
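The three types can be seen on small numbers. This sketch assumes the standard Pearson formula for the coefficient. The helper `pearson_r` and all four lists are hypothetical, made up for the illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator) and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
rises_with_x = [1, 3, 2, 5, 4]      # tends to increase as x increases
falls_with_x = [10, 8, 6, 4, 2]     # decreases steadily as x increases
unrelated = [2, 5, 1, 5, 2]         # no linear trend in either direction

r_positive = pearson_r(x, rises_with_x)
r_negative = pearson_r(x, falls_with_x)
r_none = pearson_r(x, unrelated)

print(f"positive: {r_positive:.2f}")   # 0.80
print(f"negative: {r_negative:.2f}")   # -1.00
print(f"none:     {r_none:.2f}")       # 0.00
```

The sign of the result gives the direction of the relationship and the magnitude gives its strength. A value near 0 only rules out a linear trend, so a plot is still worth checking.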
CALCULATING CORRELATION COEFFICIENT

The correlation coefficient, often
represented as r, quantifies the
degree of correlation between
two variables. It is calculated
using statistical methods, such as
Pearson's or Spearman's
correlation. Understanding how to
calculate and interpret this
coefficient is essential for data analysis
and research.
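For reference, the two coefficients named above are usually defined as follows. The notation is an assumption, since the notes give no formulas: x̄ and ȳ are the means of the two variables, n is the number of paired observations, and d_i is the difference between the ranks of x_i and y_i.

```latex
% Pearson's r: covariance of x and y scaled by both spreads
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}

% Spearman's rho: shortcut formula, valid when there are no tied ranks
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n\,(n^2 - 1)},
\qquad d_i = \operatorname{rank}(x_i) - \operatorname{rank}(y_i)
```

Pearson's r measures how closely the points follow a straight line. Spearman's ρ is Pearson's r computed on the ranks of the data, so it measures how consistently one variable increases or decreases with the other, even when the relationship is not linear.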
LIMITATIONS OF
CORRELATION

While correlation can indicate a
relationship between variables, it
does not imply causation. Other
factors may influence the
relationship, leading to
misleading interpretations.
Therefore, it is crucial to
complement correlation analysis
with further investigation to
understand the underlying causes.
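A small sketch of how a third factor can produce a strong correlation between two variables that do not cause each other. All numbers are invented for illustration: daily temperature is assumed to drive both ice cream sales and sunburn cases. The helper `pearson_r` is a hypothetical name that applies the standard Pearson formula.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Invented daily observations; temperature is the hidden common factor.
temperature = [20, 22, 25, 27, 30, 32, 35]       # degrees Celsius
ice_cream_sales = [41, 45, 49, 55, 61, 63, 71]   # units sold
sunburn_cases = [3, 4, 4, 5, 7, 7, 9]            # reported cases

r_sales_sunburn = pearson_r(ice_cream_sales, sunburn_cases)
r_temp_sales = pearson_r(temperature, ice_cream_sales)
r_temp_sunburn = pearson_r(temperature, sunburn_cases)

print(f"sales vs sunburn:       {r_sales_sunburn:.2f}")
print(f"temperature vs sales:   {r_temp_sales:.2f}")
print(f"temperature vs sunburn: {r_temp_sunburn:.2f}")
```

Sales and sunburn cases come out strongly correlated, yet selling ice cream does not cause sunburn. Both simply rise with temperature, which is the kind of underlying factor that further investigation is meant to uncover.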
REAL-WORLD
APPLICATIONS
Descriptive statistics and
correlation are widely used in
various fields, including business,
healthcare, and social sciences.
They help professionals make
data-driven decisions, identify
trends, and improve outcomes.
Understanding these concepts is
essential for anyone working with
data.
KEY TAKEAWAYS
In summary, understanding
descriptive statistics and
correlation is crucial for analyzing
data relationships. These concepts
provide insights into data
distribution, variability, and
relationships between variables.
Mastering these tools enhances
data analysis skills and informs
better decision-making.
CONCLUSION

In conclusion, unraveling data relationships through
descriptive statistics and correlation is essential for
effective data analysis. By understanding these concepts,
we can uncover valuable insights that drive informed
decisions and strategies. Thank you for your attention!
