SQL Notes
Descriptive statistics are methods of summarizing and organizing data to provide meaningful
insights. They help in understanding the main features of a dataset.
https://www.analyticsvidhya.com/blog/2021/06/descriptive-statistics-a-beginners-guide/
https://byjus.com/maths/central-tendency/
Median:
Definition: The middle value when the data is sorted. It is largely unaffected by extreme values.
Example: Median of {10, 15, 20, 25, 30} = 20
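The median example above can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are only illustrative. For an even number of values, `statistics.median` averages the two middle values.

```python
import statistics

data = [10, 15, 20, 25, 30]

# Odd number of values: the median is the single middle value after sorting.
median_odd = statistics.median(data)

# Even number of values: the median is the average of the two middle values.
median_even = statistics.median([10, 15, 20, 25])

print(median_odd)   # 20
print(median_even)  # 17.5
```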
Mode:
Range:
Definition: The difference between the maximum and minimum values in a dataset.
Example: Range of {10, 15, 20, 25, 30} = 30 - 10 = 20
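The same range calculation as a short Python sketch, using only the built-in `max` and `min` functions (the variable names are illustrative):

```python
data = [10, 15, 20, 25, 30]

# Range = maximum value - minimum value
data_range = max(data) - min(data)

print(data_range)  # 20
```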
Variance:
Skewness:
Definition: A measure of the asymmetry of the probability distribution.
Positive Skewness: Right-skewed (tail on the right side is longer).
Negative Skewness: Left-skewed (tail on the left side is longer).
Example: A positively skewed distribution might represent income data where a few
individuals have very high incomes.
https://www.analyticsvidhya.com/blog/2021/05/shape-of-data-skewness-and-kurtosis/#:~:text=The%20skewness%20is%20a%20measure,pushed%20towards%20the%20left%20side).
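To see the sign of skewness on actual numbers, here is a rough sketch that computes the moment-based (Fisher-Pearson) skewness coefficient, g1 = m3 / m2^1.5, by hand. The function name `sample_skewness` and the income figures are assumptions made up for illustration; the notes do not specify which skewness formula is intended.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical incomes (in thousands): one very high earner stretches the right tail.
incomes = [30, 32, 35, 38, 40, 42, 45, 250]

right_skew = sample_skewness(incomes)               # positive: right-skewed
left_skew = sample_skewness([-x for x in incomes])  # mirrored data: negative
symmetric = sample_skewness([10, 15, 20, 25, 30])   # symmetric data: zero

print(right_skew > 0, left_skew < 0, symmetric)
```

Mirroring the data flips the sign of the coefficient, which matches the positive/negative skewness descriptions above.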
Visual Plots:
Histogram:
Scatter Plot:
A1: The median is less sensitive to extreme values (outliers) and provides a better
representation of central tendency when the data is skewed.
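A small sketch of this point: adding one extreme value (1000, chosen arbitrarily for illustration) to the earlier example data moves the mean a long way but barely moves the median.

```python
import statistics

values = [10, 15, 20, 25, 30]
with_outlier = values + [1000]  # hypothetical extreme value

mean_before = statistics.mean(values)
mean_after = statistics.mean(with_outlier)

median_before = statistics.median(values)
median_after = statistics.median(with_outlier)

# The mean jumps from 20 to roughly 183; the median only moves from 20 to 22.5.
print(mean_before, mean_after)
print(median_before, median_after)
```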
Q2: How is skewness interpreted in a dataset?
A2: Positive skewness indicates a rightward tail, while negative skewness indicates a
leftward tail. The magnitude of skewness quantifies the degree of asymmetry.
Q3: What information does a box plot convey about a dataset?
A3: A box plot visually represents the distribution's central tendency and spread, and
identifies potential outliers using the quartiles and the interquartile range.
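The numbers behind a box plot can be sketched without any plotting library. The helper name `box_plot_summary` is hypothetical, and two choices here are assumptions not stated in the notes: the common 1.5 x IQR rule for flagging outliers, and the "inclusive" quartile method of `statistics.quantiles` (other methods give slightly different quartiles).

```python
import statistics


def box_plot_summary(values, whisker=1.5):
    """Return quartiles, IQR, and values outside the whisker fences."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr, "outliers": outliers}


summary = box_plot_summary([10, 15, 20, 25, 30, 100])
print(summary)
```

With this data the value 100 falls above the upper fence, so it is the point a box plot would typically draw as an outlier.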
Q4: When would you use a scatter plot in data analysis?
A4: A scatter plot is useful for identifying relationships between two variables, visualizing
patterns, and assessing correlations in data.
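The relationship seen in a scatter plot is often summarized with the Pearson correlation coefficient. Below is a minimal sketch that computes it by hand; the function name `pearson_r` and the hours/scores data are invented for illustration, and Pearson's r is only one of several possible correlation measures.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r_positive = pearson_r(hours, scores)                 # close to +1
r_negative = pearson_r(hours, [-s for s in scores])   # close to -1

print(round(r_positive, 3), round(r_negative, 3))
```

Values near +1 or -1 correspond to points lying close to a rising or falling straight line in the scatter plot; values near 0 suggest no linear relationship.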
CORRELATION
https://www.analyticsvidhya.com/blog/2021/05/shape-of-data-skewness-and-kurtosis/#:~:text=The%20skewness%20is%20a%20measure,pushed%20towards%20the%20left%20side).
Hypothesis Testing
https://www.simplilearn.com/tutorials/statistics-tutorial/hypothesis-testing-in-statistics#:~:text=Hypothesis%20testing%20is%20a%20statistical,data%20to%20assess%20the%20evidence.
Evaluation Model:
https://www.analyticsvidhya.com/blog/2021/12/evaluation-of-classification-model/
https://www.datacamp.com/tutorial/association-rule-mining-python