Intro To ML
Dexter Fichuk
https://goo.gl/VaWHrb (content here)
https://www.continuum.io/downloads
http://tiny.cc/conda
What is ML?
Types of ML
Classification
Predicting which category an input belongs to. The answer can be true or false (1 or 0), such as classifying a picture as a cat or a dog, or one of several classes, such as classifying a shape as a square, triangle, or circle.

Regression
Predicting a value based on the input, such as a credit score, the temperature, a stock price, or anything else where the output is continuous (e.g. 2.4893, 1.00049, 59.23).
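To make the difference concrete, here is a minimal sketch using scikit-learn. The data is made up for illustration, and LogisticRegression and LinearRegression are just two convenient choices, not the only ones.

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Toy, made-up data: one input feature per sample.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]

# Classification: the target is a discrete label (0 or 1).
y_class = [0, 0, 0, 1, 1, 1]
clf = LogisticRegression().fit(X, y_class)
label = clf.predict([[5.5]])[0]

# Regression: the target is a continuous value.
y_reg = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]
reg = LinearRegression().fit(X, y_reg)
value = reg.predict([[5.5]])[0]

print("predicted class:", label)
print("predicted value:", round(float(value), 2))
```

The classifier returns one of the labels it saw in training, while the regressor returns a number that need not match any training target.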
The Flow
Data Mining → Pre-Processing → Training/Evaluating
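One way the pre-processing and training/evaluating stages might look in scikit-learn is sketched below. The wine dataset, StandardScaler, and LogisticRegression are assumptions picked for illustration; the slides do not name specific steps.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Data: a small dataset bundled with scikit-learn (stand-in for mined data).
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Pre-processing and training chained together: scale features, then fit.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Evaluating: accuracy on rows the model did not see during training.
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Wrapping the scaler and the model in one pipeline means the scaler is fitted on the training rows only, so no information from the test rows leaks into training.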
(Figure: a feature matrix X alongside its target vector y.)
Training A Model
Splitting the Data
Simple Splitting
The gold standard for evaluating a model is testing it on data it has
not seen in training. This means holding out a percentage of the data
(typically 10-20%) as a test set, and running it through the trained
model to measure its accuracy.
It’s important to set a random state for the split, so you evaluate
your model on the same train/test split every time, making your results
reproducible.
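A minimal sketch of such a split with scikit-learn's train_test_split. The arrays are made up, and the 20% test size and random_state=42 are arbitrary illustrative values.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 10 samples with 2 features each, plus one target per sample.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 20% of the rows; random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Repeating the call with the same random_state gives the same split.
_, X_test_again, _, y_test_again = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
print(np.array_equal(y_test, y_test_again))
```

Changing or omitting random_state reshuffles the rows, so scores from different runs would no longer be directly comparable.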
Training and Testing Data
(Figure: X and y divided into a training set and a test set; one split is labeled BAD SPLIT.)
Picking an Algorithm
There are many algorithms to choose from, but lucky for us, scikit-learn
has a ton built in, and they share a common interface (fit, predict,
score). That makes them mostly interchangeable, so different classifiers
can be trained in a loop and their results plotted to compare performance.
Each algorithm suits some use cases better than others and may outperform
the rest on a specific task. There is no master algorithm.
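A sketch of that loop, assuming the iris dataset bundled with scikit-learn and four classifiers picked for illustration. Plotting is left out; the scores dictionary could be passed to any charting library.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Every estimator exposes the same fit/score methods, so one loop covers all.
models = {
    "SVM": SVC(),
    "k-nearest neighbors": KNeighborsClassifier(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on held-out data

for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

The ranking on one small dataset says little about other tasks, which is the point of comparing several models rather than assuming a winner.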
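Overfitting and Underfitting
(Figure: training and testing (generalization) accuracy plotted against model complexity. Underfitting is on the low-complexity side, overfitting on the high-complexity side, and the sweet spot lies between them.)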
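One way to see this trade-off is to vary a model's complexity and compare training and testing accuracy. The sketch below assumes a decision tree whose max_depth stands in for complexity and uses the breast cancer dataset bundled with scikit-learn; both are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# A deeper tree is a more complex model; None lets the tree grow fully.
results = []
for depth in (1, 2, 4, 8, None):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    train_acc = tree.score(X_train, y_train)
    test_acc = tree.score(X_test, y_test)
    results.append((depth, train_acc, test_acc))

for depth, train_acc, test_acc in results:
    print(f"max_depth={depth}: train={train_acc:.3f} test={test_acc:.3f}")
```

A widening gap between training and testing accuracy as depth grows is the usual sign of overfitting, while low accuracy on both suggests underfitting.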
● Gradient Boosting
Training
Evaluating
Jupyter Notebook Use
Recommended Resources
● Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron
● Deep Learning with Python by François Chollet
● Kaggle
github.com/dexterfichuk/ML-Bootcamp
https://goo.gl/VaWHrb
http://scikit-learn.org/