
Your Ultimate Data Science Statistics & Mathematics Cheat Sheet
Machine Learning Metrics, Statistical Indicators, & More

Andre Ye · May 18 · 8 min read

All images created by author unless stated otherwise.


In data science, having a solid understanding of the statistics and mathematics of
your data is essential to applying and interpreting machine learning methods
appropriately and effectively.

Classifier Metrics. Confusion matrix, sensitivity, recall, specificity, precision, F1 score. What they are, when to use them, how to implement them.

Regressor Metrics. MAE, MSE, RMSE, MSLE, R². What they are, when to use them, how to implement them.

Statistical Indicators. Correlation coefficient, covariance, variance, standard deviation. How to use and interpret them.

Types of Distributions. The three most common distributions and how to identify them.

Classifier Metrics
Classifier metrics are metrics used to evaluate the performance of machine learning
classifiers — models that put each training example into one of several discrete
categories.

Confusion Matrix is a matrix used to summarize a classifier's predictions against the actual labels. It contains four cells, each corresponding to one combination of a predicted positive or negative and an actual positive or negative. Many classifier metrics are based on the confusion matrix, so it's helpful to keep an image of it stored in your mind.
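For reference, here is a minimal sketch (with assumed toy labels) of how scikit-learn lays out the four cells: rows are actual labels and columns are predicted labels, ordered [0, 1].

# A minimal sketch (assumed toy labels) of sklearn's confusion matrix layout.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 0, 1, 0]   # actual labels
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]   # model predictions

cm = confusion_matrix(y_true, y_pred)   # [[TN, FP], [FN, TP]]
tn, fp, fn, tp = cm.ravel()             # unpack the four cells
print(cm)                               # rows: actual 0/1, columns: predicted 0/1
print(tn, fp, fn, tp)                   # 3 1 1 3 for these labels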
Sensitivity/Recall is the proportion of actual positives that were correctly predicted. This is calculated as TP/(TP+FN) (note that false negatives are actually positives). Sensitivity is a good metric to use in contexts where correctly predicting positives is important, like medical diagnoses. In some cases false positives can be dangerous, but it is generally agreed that false negatives (e.g. a diagnosis of 'no cancer' in someone who does have cancer) are more deadly. By having the model maximize sensitivity, its ability to correctly classify positives is prioritized.

Specificity is the proportion of actual negatives that were correctly predicted, calculated as TN/(TN+FP) (note that false positives are actually negatives). Like sensitivity, specificity is a helpful metric when accurately classifying negatives is more important than classifying positives.

Precision can be thought of as the counterpart of sensitivity or recall: while sensitivity measures the proportion of actually positive observations that were predicted as positive, precision measures the proportion of predicted-positive observations that actually were positive. This is calculated as TP/(TP+FP). Precision and recall together provide a rounded view of a model's performance.

F1 Score combines precision and recall through the harmonic mean. The exact formula is (2 × precision × recall) / (precision + recall). The harmonic mean is used because it penalizes extreme values more heavily than the arithmetic mean, which naively weights both components equally.
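As a quick check of the formula, here is a tiny sketch (with assumed toy labels) comparing the harmonic-mean formula against sklearn's f1_score:

# A tiny sketch (assumed toy labels): the harmonic-mean formula matches f1_score.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

p = precision_score(y_true, y_pred)   # TP / (TP + FP)
r = recall_score(y_true, y_pred)      # TP / (TP + FN)

print((2 * p * r) / (p + r))          # harmonic mean of precision and recall
print(f1_score(y_true, y_pred))       # same value, computed by sklearn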

Detection/Accuracy Rate is the proportion of items correctly classified, calculated as the sum of True Positives and True Negatives divided by the sum of all four confusion matrix quadrants. The accuracy rate weights positives and negatives equally, instead of prioritizing one over the other.

Using F1 Score vs Accuracy: The F1 score should be used when avoiding costly mistakes matters most (False Positives and False Negatives are penalized more heavily), whereas accuracy should be used when overall correctness is the goal. Which metric to use depends on context, and each behaves differently depending on the data. Generally, however, the F1 score is better for imbalanced classes (for example, cancer diagnoses, where there are vastly more negatives than positives), whereas accuracy is better for more balanced classes.

Implementing these metrics in scikit-learn follows the format sketched below.
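A minimal sketch of that format, using assumed toy labels; every metric listed below takes the true labels and the predicted labels as its two arguments.

# A minimal sketch (assumed toy labels) of the common metric(y_true, y_pred) pattern.
from sklearn.metrics import (confusion_matrix, recall_score, precision_score,
                             f1_score, accuracy_score, balanced_accuracy_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # model predictions

print(confusion_matrix(y_true, y_pred))
print(recall_score(y_true, y_pred))             # sensitivity / recall
print(precision_score(y_true, y_pred))
print(f1_score(y_true, y_pred))
print(accuracy_score(y_true, y_pred))
print(balanced_accuracy_score(y_true, y_pred))  # for unevenly distributed classes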

Discussed metric names in sklearn are:

Confusion Matrix: confusion_matrix

Sensitivity/Recall: recall_score

Precision: precision_score

F1 Score: f1_score

Accuracy: accuracy_score
Balanced Accuracy (for unevenly distributed classes): balanced_accuracy_score

Regressor Metrics
Regression metrics measure how well a model places each example on a continuous scale, such as predicting the price of a house.

Mean Absolute Error (MAE) is perhaps the most common and interpretable regression metric. MAE calculates the difference between each data point's predicted y-value and the real y-value (taking the absolute value of each difference), then averages every difference.

Median Absolute Error is another way to evaluate the typical error. By focusing on the middle error value it is robust to outliers, but it also ignores the extremely high or low errors that are factored into the mean absolute error.

Mean Square Error (MSE) is another commonly used regression metric that 'punishes' higher errors more. For example, an error (difference) of 2 would be weighted as 4, whereas an error of 5 would be weighted as 25, meaning that MSE sees the gap between the two errors as 21, whereas MAE weights the gap at its face value of 3. MSE calculates the square of the difference between each data point's predicted y-value and real y-value, then averages the squares.

Root Mean Square Error (RMSE) is used to give a level of interpretability that
mean square error lacks. By square-rooting the MSE, we achieve a metric similar to
MAE in that it is on a similar scale, while still weighting higher errors at higher
levels.
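To make the weighting concrete, here is a short sketch using the assumed errors of 2 and 5 from the MSE paragraph above:

# A short sketch (assumed errors of 2 and 5) of how MAE, MSE, and RMSE weight errors.
import numpy as np

errors = np.array([2.0, 5.0])

mae  = np.mean(np.abs(errors))   # (2 + 5) / 2  = 3.5
mse  = np.mean(errors ** 2)      # (4 + 25) / 2 = 14.5, the larger error dominates
rmse = np.sqrt(mse)              # about 3.81, back on the original scale

print(mae, mse, rmse)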

Mean Squared Logarithmic Error (MSLE) is another common variation of the mean squared error. Because of the logarithmic nature of the error, MSLE only cares about relative (percentage) differences. This means that MSLE treats small differences between small values (for example, 4 and 3) roughly the same as large differences on a large scale (for example, 1200 and 900).

R² is a commonly used metric (where r is known as the correlation coefficient) which measures the proportion of the variance in the dependent variable that can be explained by the independent variables. In short, it is a good metric of how well the data fits the regression model.

Implementing these metrics in scikit-learn follows the format sketched below.
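A minimal sketch of that format, using assumed toy values; each metric again takes the true and predicted values as its two arguments.

# A minimal sketch (assumed toy values) of the sklearn regression metrics discussed above.
import numpy as np
from sklearn.metrics import (mean_absolute_error, median_absolute_error,
                             mean_squared_error, mean_squared_log_error, r2_score)

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

print(mean_absolute_error(y_true, y_pred))
print(median_absolute_error(y_true, y_pred))
print(mean_squared_error(y_true, y_pred))
print(np.sqrt(mean_squared_error(y_true, y_pred)))  # RMSE; newer sklearn also offers root_mean_squared_error
print(mean_squared_log_error(y_true, y_pred))       # requires non-negative values
print(r2_score(y_true, y_pred))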

Discussed metric names in sklearn are:

Mean Absolute Error: mean_absolute_error

Median Absolute Error: median_absolute_error

Mean Squared Error: mean_squared_error

Root Mean Squared Error: root_mean_squared_error (scikit-learn 1.4+; in older versions, take the square root of mean_squared_error)

Mean Squared Logarithmic Error: mean_squared_log_error

R²: r2_score

Statistical Indicators
Four main data science statistical measures.

Correlation is a statistical measure of how well two variables fluctuate together. A positive correlation means that the two variables move together (a positive change in one accompanies a positive change in the other), whereas a negative correlation means that they move opposite one another (a positive change in one accompanies a negative change in the other). The correlation coefficient, ranging from +1 to -1, is also known as r.

The correlation coefficient can be accessed using the .corr() function through
Pandas DataFrames. Consider the following two sequences:

seq1 = [0,0.5,0.74,1.5,2.9]
seq2 = [4,4.9,8.2,8.3,12.9]

With the constructor table = pd.DataFrame({'a': seq1, 'b': seq2}), a DataFrame with the two sequences is created. Calling table.corr() yields a correlation table; here, sequences a and b have a correlation of about 0.96. The correlation table is symmetric, and every sequence has a correlation of 1 with itself.
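A runnable version of this example (the exact correlation value is approximate):

# A runnable sketch of the correlation example above.
import pandas as pd

seq1 = [0, 0.5, 0.74, 1.5, 2.9]
seq2 = [4, 4.9, 8.2, 8.3, 12.9]

table = pd.DataFrame({'a': seq1, 'b': seq2})
print(table.corr())   # Pearson correlation by default; a vs b is roughly 0.96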

Covariance, similarly to correlation, measures how two variables vary together (it is the expected product of their deviations from their respective means). Unlike correlation, however, covariance can take on any value, while correlation is limited to a fixed range. Because of this, correlation is more useful for determining the strength of the relationship between two variables. Because covariance has units (unlike correlation) and is affected by changes in scale, it is less widely used as a stand-alone statistic. However, covariance is used in many statistics formulas, and is a useful figure to know.

This can be done in Python with numpy.cov(a,b)[0][1] , where a and b are the
sequences to be compared.

Variance is the expectation of the squared deviation of a random variable from its mean. Informally, it measures how far a set of numbers is spread out from its mean. Variance can be computed with the statistics library (import statistics) using statistics.variance(list).

Standard Deviation is the square root of the variance, and is on the same scale as the data, which makes it a more interpretable measure of how spread out a distribution is. Standard deviation can be computed in the statistics library with statistics.stdev(list).
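Putting the three snippets above together, a short sketch with assumed sequences:

# A short sketch (assumed sequences) of covariance, variance, and standard deviation.
import numpy as np
import statistics

a = [0, 0.5, 0.74, 1.5, 2.9]
b = [4, 4.9, 8.2, 8.3, 12.9]

print(np.cov(a, b)[0][1])       # sample covariance of a and b
print(statistics.variance(a))   # sample variance of a
print(statistics.stdev(a))      # sample standard deviation of a
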
Types of Distributions
Knowing your distributions is very important when doing data analysis and deciding which statistical and machine learning methods to use.

While there are several types of mathematically specialized distributions, most of them can fit into these three distributions.

Uniform Distribution is the simplest distribution: it is completely flat, because every outcome is equally likely. For example, the number of dots a die lands on (from 1 to 6), recorded for each of 6,000 throws, would yield a flat distribution with approximately 1,000 throws per number of dots. Uniform distributions have useful properties; for example, read how the Allied forces saved countless lives in World War II using statistical attributes of uniform distributions.

How Data Science Gave the Allied Forces an Edge in World War II
The German Tank Problem with computer simulations
medium.com
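As a quick illustration of the die example above, here is a small simulation (assumed setup):

# A small simulation (assumed setup): 6,000 throws of a fair die give roughly 1,000 per face.
import numpy as np

rng = np.random.default_rng(0)
throws = rng.integers(1, 7, size=6000)   # faces 1..6, each equally likely

faces, counts = np.unique(throws, return_counts=True)
print(dict(zip(faces.tolist(), counts.tolist())))   # each count is close to 1,000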
Normal Distribution is a very common distribution that resembles a bell-shaped curve (one name for it is the 'Bell Curve'). Besides its common use in data science, many real-world quantities, like human height or IQ scores, are approximately normally distributed. It is characterized by the following features:

68% of the data is within one standard deviation of the mean.

95% of the data is within two standard deviations of the mean.

99.7% of the data is within three standard deviations of the mean.
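A quick sketch (with assumed synthetic samples) verifying those three percentages:

# A quick check (assumed synthetic samples) of the 68-95-99.7 rule.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=100_000)

for k in (1, 2, 3):
    share = np.mean(np.abs(x) <= k)       # fraction within k standard deviations
    print(f"within {k} sd: {share:.3f}")  # roughly 0.683, 0.954, 0.997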

Many machine learning and statistical methods assume approximately normally distributed data. For example, linear regression assumes the residuals are normally distributed. This can be visualized and checked with seaborn's residplot(). Information and examples of the usage of this and other statistical models can be found here.

Poisson Distribution can be thought of as a discrete relative of the normal distribution: it is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space, provided these events occur with a known constant mean rate and independently of the time since the last event. As λ increases, the Poisson distribution's mean shifts to the right and its shape becomes more symmetric (approaching the normal distribution); for small λ the data it generates is strongly right-skewed.
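A brief sketch (with assumed λ values) of how the shape changes with λ:

# A brief sketch (assumed lambda values): small lambda gives strongly right-skewed
# counts, large lambda looks close to a normal curve.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
for lam in (1, 4, 30):
    samples = rng.poisson(lam=lam, size=100_000)
    print(lam, samples.mean(), stats.skew(samples))   # skewness is roughly 1 / sqrt(lam)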

Thanks for reading!


If you found this cheat sheet helpful, feel free to upvote and bookmark the page for
easy reference. If you enjoyed this cheat sheet, you may be interested in applying
your statistics knowledge in other cheat-sheets. If you would like to see additional
topics discussed in this cheat-sheet, feel free to let me know in the responses!

Your Ultimate Python Visualization Cheat-Sheet
This cheat-sheet contains the elements of a plot you will most commonly need in a clear and organized fashion, with…
medium.com

Your Ultimate Data Manipulation & Cleaning Cheat Sheet
Parsing Dates, Imputing, Anomaly Detection, & More
medium.com

Your Ultimate Data Mining & Machine Learning Cheat Sheet
Feature Importance, Decomposition, Transformation, & More
medium.com
