The document discusses skewness and kurtosis, which are statistical measures that describe the asymmetry and shape of probability distributions, respectively. Skewness indicates whether data is skewed to the left or right, affecting model assumptions and feature importance, while kurtosis measures the tailedness and peakedness of a distribution, with implications for financial risk. The document also categorizes types of skewness and kurtosis, explaining their characteristics and significance in data analysis.
Data Science, 16-02-2025
Data Science
Dr. Khaled Almghari
16-2-2025
SKEWNESS and KURTOSIS
• What is Skewness?
• Skewness is a statistical measure that assesses the asymmetry of a probability distribution. It quantifies the extent to which the data is skewed or shifted to one side.
• Positive skewness indicates a longer tail on the right side of the distribution, while negative skewness indicates a longer tail on the left side. Skewness helps in understanding the shape of a dataset and the presence of outliers.
• Depending on the model, skewness in the values of a specific independent variable (feature) may violate model assumptions or weaken the interpretation of feature importance.
• A probability distribution that deviates from the symmetrical normal distribution (bell curve) exhibits skewness, which is a measure of asymmetry in statistics.
• In a skewed data set, typical values still fall between the first quartile (Q1) and the third quartile (Q3).
• The normal distribution serves as the reference for skewness: its data are symmetrically distributed. A symmetrical distribution has zero skewness, and all of its measures of central tendency lie in the middle.
• In a symmetrically distributed dataset, the left-hand side and the right-hand side hold an equal number of observations. (If the dataset has 90 values, the left-hand side has 45 observations and the right-hand side has 45.) When the data are not symmetrically distributed, they are called asymmetrical, and that is where skewness comes into the picture.
• Types of Skewness
• Positively Skewed or Right-Skewed (Positive Skewness)
• In statistics, a positively skewed or right-skewed distribution has a long right tail. In such a distribution the measures of central tendency spread apart, unlike symmetrically distributed data, where the mean, median, and mode all equal each other.
• The label "positively skewed" refers to the sign of the skewness coefficient, which is positive; the mean, median, and mode themselves need not be positive.
• In a positively skewed distribution, the mean is typically greater than the median, because the few large values in the right tail pull the mean upward. In other words, most results are concentrated toward the lower side. The median is the middle value and the mode is the most frequent value, so neither is pulled by the tail the way the mean is.
• Negatively Skewed or Left-Skewed (Negative Skewness)
• A distribution with a long left tail, known as negatively skewed or left-skewed, is the mirror image of a positively skewed distribution: more values are plotted on the right side of the graph, and the tail of the distribution spreads out on the left side.
• In a negatively skewed distribution, the mean is typically less than the median, because the few small values in the left tail pull the mean downward. Here too, "negative" describes the sign of the skewness coefficient, not the sign of the mean, median, or mode.
• What is Kurtosis?
• Kurtosis is a statistical measure that quantifies the shape of a probability distribution. It provides information about the tails and peakedness of the distribution compared to a normal distribution.
• Positive excess kurtosis (kurtosis measured relative to the normal distribution) indicates heavier tails and a more peaked distribution, while negative excess kurtosis suggests lighter tails and a flatter distribution. Kurtosis helps in analyzing the characteristics and outliers of a dataset.
• Kurtosis measures the tailedness of a distribution, that is, how often outliers occur.
• Peakedness in a data distribution is the degree to which data values are concentrated around the mean.
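The skewness bullets above can be checked numerically. The following is a minimal sketch, assuming the population (biased) moment formula g1 = m3 / m2^1.5; the helper names `central_moment` and `skewness` and the three small samples are illustrative and do not come from the slides.

```python
from statistics import fmean, median


def central_moment(data, k):
    """k-th central moment: the average of (x - mean) ** k."""
    m = fmean(data)
    return fmean((x - m) ** k for x in data)


def skewness(data):
    """Population (biased) skewness g1 = m3 / m2 ** 1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


# Illustrative samples (assumptions, not data from the slides).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]   # one large value forms a right tail
left_skewed = [21 - x for x in right_skewed]     # mirror image; every value is still positive
symmetric = [1, 2, 3, 4, 5]

for name, sample in [("right", right_skewed),
                     ("left", left_skewed),
                     ("symmetric", symmetric)]:
    print(name,
          "skewness:", round(skewness(sample), 3),
          "mean:", round(fmean(sample), 3),
          "median:", median(sample))
```

In the left-skewed sample every value is positive while the skewness is negative, which illustrates that the sign belongs to the coefficient and not to the mean, median, or mode.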
• Datasets with high kurtosis tend to have a distinct peak near the mean, decline rapidly, and have heavy tails. Datasets with low kurtosis tend to have a flat top near the mean rather than a sharp peak.
• In finance, kurtosis is used as a measure of financial risk. A large kurtosis is associated with a high level of risk for an investment because it indicates high probabilities of extremely large and extremely small returns. A small kurtosis, on the other hand, signals a moderate level of risk because the probabilities of extreme returns are relatively low.
• Mesokurtic: A mesokurtic distribution has a peak and tail shape similar to the normal distribution. Its excess kurtosis is around 0, indicating that its tails are neither too heavy nor too light compared to a normal distribution.
• Leptokurtic: A leptokurtic distribution has heavier tails and a sharper peak than the normal distribution. Its excess kurtosis is positive, indicating more extreme outliers than a normal distribution. This type of distribution is often associated with higher peakedness and a greater probability of extreme values.
• Platykurtic: A platykurtic distribution has lighter tails and a flatter peak than the normal distribution. Its excess kurtosis is negative, indicating fewer extreme outliers than a normal distribution. This type of distribution is often associated with less peakedness and a lower probability of extreme values.
• Distribution on the basis of skewness value:
• Skewness = 0: the distribution is symmetric, as the normal distribution is.
• Skewness > 0: the right tail is longer, and most of the weight lies on the left side of the distribution.
• Skewness < 0: the left tail is longer, and most of the weight lies on the right side of the distribution.
• Kurtosis:
• It is also a statistical term and an important characteristic of a frequency distribution.
• Kurtosis determines whether a distribution is heavy-tailed relative to the normal distribution, and so provides information about the shape of a frequency distribution.
• The kurtosis of the normal distribution is equal to 3 (equivalently, an excess kurtosis of 0).
• A distribution with kurtosis < 3 is called platykurtic.
• A distribution with kurtosis > 3 is called leptokurtic; it tends to produce more outliers than the normal distribution.
• This article focuses on how to calculate skewness and kurtosis in Python.
Thanks for your attention
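As a rough sketch of the Python calculation mentioned above, the code below computes kurtosis on both scales used in these slides: the plain ratio m4 / m2^2, which is 3 for a normal distribution, and excess kurtosis, which subtracts 3. The function names, the sample data, and the cut-off `tol=0.5` in `classify` are assumptions made for illustration; the slides do not specify any threshold.

```python
import random
from statistics import fmean


def central_moment(data, k):
    """k-th central moment: the average of (x - mean) ** k."""
    m = fmean(data)
    return fmean((x - m) ** k for x in data)


def pearson_kurtosis(data):
    """Population kurtosis m4 / m2 ** 2 (about 3 for normal data)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


def excess_kurtosis(data):
    """Kurtosis relative to the normal distribution (about 0 for normal data)."""
    return pearson_kurtosis(data) - 3.0


def classify(data, tol=0.5):
    """Label a sample by excess kurtosis; tol is an arbitrary illustrative cut-off."""
    g2 = excess_kurtosis(data)
    if g2 > tol:
        return "leptokurtic"
    if g2 < -tol:
        return "platykurtic"
    return "mesokurtic"


# Illustrative samples (assumptions, not data from the slides).
rng = random.Random(42)                                # fixed seed for repeatability
normal_like = [rng.gauss(0, 1) for _ in range(20000)]  # simulated normal data
flat = list(range(1, 11))                              # evenly spread values, light tails
heavy = [0] * 18 + [-10, 10]                           # mostly central, two extreme outliers

for name, sample in [("normal_like", normal_like), ("flat", flat), ("heavy", heavy)]:
    print(name, round(pearson_kurtosis(sample), 3), classify(sample))
# pearson_kurtosis(heavy) is 10.0 here: m2 = 10 and m4 = 1000, so m4 / m2**2 = 10.
```

In practice a library routine is usually preferable to hand-written moments. SciPy, for example, provides `scipy.stats.skew` and `scipy.stats.kurtosis`; the latter returns excess kurtosis by default (`fisher=True`) and the 3-based value with `fisher=False`.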