Descriptive - Statistics Data Discret chp2
qualitative data
Binary
Data Types
Quantitative and Categorical.
Quantitative data takes on numeric values that allow us to perform mathematical operations (like the number of dogs).
Categorical data are used to label a group or set of items (like dog breeds - Collies, Labs, Poodles, etc.).
Descriptive Statistics 1
Categorical Ordinal data take on a ranked ordering (like rating an interaction with the dogs on a scale from Very Poor to
Very Good).
Categorical Nominal data do not have an order or ranking (like the breeds of the dogs).
Quantitative: Examples
Continuous : Height, Age, Income
Discrete : Pages in a Book, Trees in Yard, Dogs at a Coffee Shop
Categorical: Examples
Ordinal : Letter Grade, Survey Rating
Nominal : Gender, Marital Status, Breakfast Items
1. Measures of Center
2. Measures of Spread
3. Shape
4. Outliers
1-Measures of Center
There are three measures of center:
1. Mean
2. Median
3. Mode
1- The Mean
The mean is often called the average or the expected value in mathematics.
We calculate the mean by adding all of our values together, and dividing by the number of values in our dataset.
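As a minimal sketch of this calculation, assuming a small made-up dataset (the values and the name dogs_per_day are illustrative, not from these notes):

```python
# Hypothetical data: number of dogs seen at a coffee shop on five days.
dogs_per_day = [1, 3, 2, 5, 4]

# Mean = sum of all values divided by the number of values.
mean = sum(dogs_per_day) / len(dogs_per_day)

print(mean)  # 3.0
```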
2- The Median
The median splits our data so that 50% of our values are lower and 50% are higher.
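One way to sketch this in code is shown below. It assumes the usual convention that, for an even number of values, the median is the average of the two middle values; the function name and the data are illustrative.

```python
def median(values):
    """Return the middle value of the sorted data.

    With an even number of values, average the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = median([5, 1, 3, 2, 4])      # sorted: 1 2 3 4 5 -> 3
even_median = median([5, 1, 3, 2, 4, 8])  # sorted: 1 2 3 4 5 8 -> (3 + 4) / 2 = 3.5
```

Note that the data must be sorted first; taking the middle of the unsorted list would give the wrong answer.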
3-The Mode
The mode is the most frequently observed value in our dataset.
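A short sketch of finding the mode, assuming we want to report every value that ties for the highest count (a dataset can have more than one mode). The function name modes and the data are illustrative.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often, in sorted order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


single_mode = modes([1, 2, 2, 3, 2, 5])  # 2 appears three times -> [2]
two_modes = modes([1, 1, 2, 2, 3])       # 1 and 2 both appear twice -> [1, 2]
```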
Notation
Notation: Think of notation as a universal language used by academic and industry professionals to convey mathematical
ideas. For example, the plus sign in 5 + 3 is notation for addition.
Random Variables
A random variable is a placeholder for the possible values of some process.
Aggregations
An aggregation is a way to turn multiple numbers into fewer numbers (commonly one number).
Summation is a common aggregation. The notation used to sum our values is the Greek letter sigma (Σ).
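As an illustration of this notation, assume the n observed values are written x_1, x_2, ..., x_n (symbols chosen here for the example). The sum of the values, and the mean described earlier, can then be written as:

```latex
\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n,
\qquad
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
```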
2- Measures of Spread
Measures of Spread are used to provide us an idea of how spread out our data are from one another. Common measures of
spread include:
1. Range
2. Interquartile Range (IQR)
3. Standard Deviation
4. Variance
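A minimal sketch of three of these measures using Python's standard statistics module. The dataset is made up, and the population forms (pvariance, pstdev) are an assumption; use variance and stdev instead if the data are a sample from a larger population.

```python
import statistics

# Hypothetical dataset with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)     # largest minus smallest: 9 - 2 = 7
variance = statistics.pvariance(data)  # average squared distance from the mean: 4
std_dev = statistics.pstdev(data)      # square root of the variance: 2.0
```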
Histograms
Histograms are very useful for understanding the different aspects of quantitative data. In the upcoming concepts, you will see
histograms used all the time to help you understand the four aspects we outlined earlier regarding a quantitative variable:
center, spread, shape, and outliers.
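The five number summary consists of:
1. Minimum: The smallest value in the dataset.
2. Q1: The value such that 25% of the data fall below.
3. Q2: (Median) The value such that 50% of the data fall below.
4. Q3: The value such that 75% of the data fall below.
5. Maximum: The largest value in the dataset.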
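The quartiles, together with the minimum and maximum, can be sketched with the standard library as follows. This assumes Python 3.8 or later and the "inclusive" quartile method; other quartile conventions exist and can give slightly different values on small datasets. The data are made up.

```python
import statistics

# Hypothetical dataset, already in sorted order for readability.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into four equal parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, q2, q3, max(data))
print(five_number_summary)  # (1, 3.0, 5.0, 7.0, 9)
```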
The standard deviation is associated with risk in finance, assists in determining the significance of drugs in medical
studies, and measures the error of our results for predicting anything from the amount of rainfall we can expect
tomorrow to your predicted commute time tomorrow.
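For reference, one common definition is given below. It assumes the population form, which divides by n (the sample form divides by n - 1 instead), and uses the illustrative symbols x_i for the values and \bar{x} for their mean:

```latex
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2,
\qquad
\sigma = \sqrt{\sigma^2}
```

Here \sigma^2 is the variance and \sigma is the standard deviation, which is expressed in the same units as the data.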
The variance is used to compare the spread of two different groups. A set of data with higher variance is more spread
out than a dataset with lower variance. Be careful though, there might just be an outlier (or outliers) that is increasing the
variance, when most of the data are actually very close.
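The caution about outliers can be sketched with two made-up groups that differ in a single value. The names and numbers are illustrative, and the population variance is an assumption.

```python
import statistics

tight_group = [10, 10, 11, 11, 12]
with_outlier = [10, 10, 11, 11, 50]  # same as above except for one extreme value

var_tight = statistics.pvariance(tight_group)
var_outlier = statistics.pvariance(with_outlier)

# The second group has a far larger variance, even though four of its five
# values are just as close together as in the first group.
print(var_tight, var_outlier)
```

Plotting or listing the data alongside the variance helps reveal whether a large value reflects general spread or just one extreme point.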
2 - Left skewed: Median > Mean (the long left tail typically pulls the mean below the median)
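A small sketch of this relationship, using made-up exam grades in which most scores are high and a few low scores form the left tail:

```python
import statistics

# Hypothetical grades (percent): clustered near the top, with a long left tail.
grades = [35, 60, 85, 88, 90, 92, 95]

mean_grade = statistics.mean(grades)      # pulled down by the low scores
median_grade = statistics.median(grades)  # middle value: 88

print(median_grade > mean_grade)  # True
```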
Real World Applications
Grades as a percentage in many universities,
Age of death,
Weight, Errors,
Precipitation
4- Outliers
Outliers are points that fall very far from the rest of our data points. Outliers influence measures like the mean and standard
deviation much more than measures associated with the five number summary.
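To illustrate, the sketch below adds one extreme value to a made-up dataset and compares how far the mean and the median move. The helper describe and the data are illustrative, and the population standard deviation is an assumption.

```python
import statistics


def describe(data):
    """Return the mean, median, and population standard deviation."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.pstdev(data),
    }


values = [10, 11, 12, 13, 14]
values_with_outlier = values + [100]  # one point far from the rest

before = describe(values)
after = describe(values_with_outlier)

mean_shift = after["mean"] - before["mean"]        # the mean moves a lot
median_shift = after["median"] - before["median"]  # the median barely moves: 0.5
```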
Advice on Outliers
1. Plot your data to identify if you have outliers.
2. Handle outliers accordingly via the methods above.
3. If no outliers and your data follow a normal distribution - use the mean and standard deviation to describe your dataset, and
report that the data are normally distributed.
4. If you have skewed data or outliers, use the five number summary to summarize your data and report the outliers.
Descriptive Statistics
Descriptive statistics is about describing our collected data.
Inferential Statistics
Inferential statistics is about using our collected data to draw conclusions about a larger population.
We looked at specific examples that allowed us to identify the