The (Almost) Complete Machine Learning Roadmap


Milestone 0: Python 3 and other basic stuff

1. Skim through this playlist. Python should be really easy to pick up, and there is no need to
learn everything in one go: you can pick up the advanced topics as and when you need them.
2. Work through the really crisp courses on Kaggle here. They are more like a crash course
(explore `Kaggle Learn` for more such short courses).
3. Bookmark this link as a quick reference for using Python in common ML problems.
4. Register on these platforms:
a. https://towardsdatascience.com/
b. https://www.kaggle.com/
c. https://github.com/
The first one is a blog; if you subscribe, you will get good articles to read every day.

5. Learn the effective use of Google (It will be your saviour more often than not).

Milestone 1: Machine Learning by Andrew Ng on ​Coursera

1. Complete all the weeks (you can skip weeks 4 and 5, as they are better taught in the next course).
2. Complete the assignments of each week. Though they are to be done in MATLAB or Octave,
doing them will make the background computations easier to understand.
3. Look for hints (mostly not needed) here.
4. By now you will be familiar with terms like
a. Regression
b. Classification
c. Regularization
d. Bias & Variance, cross-validation, overfitting and underfitting
e. Precision, Recall, F1-score
f. SVM, Kernels, Elbow method
g. PCA, Anomaly Detection
h. Stochastic Gradient Descent
i. Computer Vision
5. The entire course consists of 11 weeks, so if you skip weeks 4 and 5 you will have only
9 weeks of material. The course is very slow-paced, though, and the entire thing can be
completed in 5-6 weeks (try to finish within this time span).
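
Several of the evaluation terms listed above (item 4e) reduce to short formulas. As a minimal sketch, not taken from the course material, precision, recall and F1-score for a binary classifier can be computed from true and predicted labels as follows. The function name `precision_recall_f1` and the sample labels are made up for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here there are 3 true positives, 1 false positive and 1 false negative, so all three scores come out at 0.75.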

Milestone 2: Machine Learning by Josh Starmer on ​YouTube

1. Complete these (You can cover them from here or any other source you can find)
a. Video 34 to 39 (Decision Trees, Random Forest)
b. Video 41 to 47 (AdaBoost, Gradient Boosting)
c. Video 51 to 53 (XGBoost)

2. After this, you will be familiar with terms like


a. Gini Index, Entropy
b. Information Gain
c. Pruning (pre-pruning and post-pruning)
d. Bagging
e. Bootstrap Dataset
f. Stump, Amount of Say
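
The terms above are easier to remember once you have computed them by hand. The sketch below uses the standard textbook definitions of Gini impurity, entropy and information gain, plus the usual AdaBoost formula for a stump's amount of say. The function names and the toy "yes"/"no" node are assumptions made for illustration, not code from the videos.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes present."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted


def amount_of_say(total_error, eps=1e-10):
    """AdaBoost weight of a stump: 0.5 * ln((1 - error) / error)."""
    # eps keeps the logarithm finite when the error is exactly 0 or 1.
    err = min(max(total_error, eps), 1.0 - eps)
    return 0.5 * math.log((1.0 - err) / err)


# Hypothetical node with 4 "yes" and 4 "no" samples, split into two children.
parent = ["yes"] * 4 + ["no"] * 4
left = ["yes"] * 3 + ["no"]
right = ["yes"] + ["no"] * 3

print(gini(parent))        # 0.5
print(entropy(parent))     # 1.0
print(information_gain(parent, [left, right]))
print(amount_of_say(0.5))  # 0.0: a stump no better than chance gets no say
```

A stump with a total error below 0.5 gets a positive amount of say, and one with an error above 0.5 gets a negative one.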

Milestone 3: Deep Learning Specialization by deeplearning.ai on Coursera

1. This specialization consists of 5 courses, of which three (courses 1, 4 and 5) are very important.


2. This is one of the most basic courses on deep learning and requires sufficient
attention. The course is not free, but it can be audited on Coursera. Audit the course; the
programming assignments (which are unavailable in the audit) can be found if you search
online.
3. The programming assignments in this course are in Keras and TensorFlow, which will
make writing code easier, but take this with a grain of salt: they use older versions of
TensorFlow and Keras that have since become obsolete. The concepts remain the same,
but the implementation has changed.
4. The first course is on normal feed-forward neural networks. This is the basic course on
which all other courses depend. It is a 4-week course but can easily be completed
within 3 weeks.
5. The next two courses will teach you how to handle data and other overheads while
working on an ML project. They are not super important, but still cover them as quickly as
possible.
6. The fourth course is on Convolutional Neural Networks. This is one of the most important
courses and will teach you how to apply deep learning to images. Convolutional networks
are mostly used for images, but they can be used for other tasks too. This is again a
4-week course, but you can take your time doing it. It will also form the base for
Recurrent Neural Networks.
7. The final course in this series is on Recurrent Neural Networks. These algorithms are
used for sequential data, such as the temperature at given times. Such data is handled
very differently from the way images are handled with convolutions. At this point, you may
choose to skip this course if you plan to work only on images; you can come back to it if
you later work on a project with sequential data.
8. After this, you will have a good intuition of the terms like
a. NumPy
b. Activation function
c. Feed Forward
d. Backpropagation
9. For more info on NN, check out
a. 3Blue1Brown playlist
b. And this too (really good for convolution).
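
To make the terms in item 8 concrete, here is a minimal pure-Python sketch of an activation function, a feed-forward pass and backpropagation for a tiny network with one hidden layer. It is not code from the specialization. The network size, initial weights, learning rate and the single training example are arbitrary assumptions, and in practice you would use NumPy arrays instead of lists.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, w2, b2):
    """Feed forward: input -> sigmoid hidden layer -> sigmoid output."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y_hat = sigmoid(sum(w * hi for w, hi in zip(w2, h)) + b2)
    return h, y_hat


def backward(x, y, h, y_hat, w2):
    """Backpropagation for the loss L = 0.5 * (y_hat - y)**2."""
    # Chain rule through the output sigmoid: dL/dz2.
    d_out = (y_hat - y) * y_hat * (1.0 - y_hat)
    grad_w2 = [d_out * hi for hi in h]
    grad_b2 = d_out
    # Propagate the error back through each hidden unit: dL/dz1.
    d_hidden = [d_out * w * hi * (1.0 - hi) for w, hi in zip(w2, h)]
    grad_W1 = [[d * xi for xi in x] for d in d_hidden]
    grad_b1 = d_hidden
    return grad_W1, grad_b1, grad_w2, grad_b2


# Hypothetical toy problem: one training example, 2 inputs, 2 hidden units.
x, y = [1.0, 0.0], 1.0
W1, b1 = [[0.1, -0.2], [0.4, 0.3]], [0.0, 0.0]
w2, b2 = [0.2, -0.1], 0.0
lr = 0.5  # learning rate, chosen arbitrarily

losses = []
for step in range(200):
    h, y_hat = forward(x, W1, b1, w2, b2)
    losses.append(0.5 * (y_hat - y) ** 2)
    gW1, gb1, gw2, gb2 = backward(x, y, h, y_hat, w2)
    # Plain gradient descent update of every parameter.
    W1 = [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    w2 = [w - lr * g for w, g in zip(w2, gw2)]
    b2 -= lr * gb2

print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

A useful habit when writing backpropagation by hand is to compare each analytic gradient against a finite-difference estimate of the same derivative.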

Milestone 4: Some practice!

First, focus on doing small projects so that you understand the workflow and learn to use a
framework like TensorFlow, Keras or PyTorch. Each one has its own pros and cons. You can have
a look here:
● https://towardsdatascience.com/keras-vs-pytorch-for-deep-learning-a013cb63870d
● Keras vs TensorFlow vs PyTorch | Deep Learning Frameworks
● https://deepsense.ai/keras-or-pytorch/

From our personal experience, Keras is very easy to pick up, and when used along with TensorFlow
(Keras is a high-level API of TensorFlow) it lets you do almost everything that you can do with
other frameworks. On the other hand, PyTorch is very popular in academia due to its
flexibility.

1. Now, start using frameworks


a. TensorFlow 2 quickstart for beginners
b. Keras - Python Deep Learning Neural Network API
c. Pytorch Tutorial
The projects you can begin with:
● MNIST handwritten digits
● Facial Keypoint detection
● CIFAR-10 / CIFAR-100
● Titanic
● Stock Data

# Necessary Tools:
1. Scikit-Learn
2. NumPy
3. Pandas
4. Matplotlib
5. Jupyter Notebook (recommended)
6. Google Colab (can be used to eliminate the headache of installation issues)

2. Start with this tutorial on implementing simple ML and DL models


a. Learn Intro to Machine Learning Tutorials

3. Once you complete that tutorial, explore this treasure


a. UCI Machine Learning Repository
i. Pick a dataset
ii. Test various models
iii. Analyze the results
iv. Repeat
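
The pick, test, analyze loop above can be sketched in a few lines. This is only an illustration under loud assumptions: a synthetic two-class dataset stands in for a UCI dataset, and the two "models" (a majority-class baseline and a 1-nearest-neighbour classifier) are hand-written so the example runs without any installation. For real experiments you would load an actual dataset and use the Scikit-Learn tools listed earlier.

```python
import random


def make_dataset(n_per_class=100, seed=0):
    """Two well-separated 2-D Gaussian blobs, labelled 0 and 1 (synthetic)."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def majority_baseline(train):
    """Model 1: always predict the most common training label."""
    labels = [label for _, label in train]
    most_common = max(set(labels), key=labels.count)
    return lambda point: most_common


def nearest_neighbour(train):
    """Model 2: predict the label of the closest training point (1-NN)."""
    def predict(point):
        def sq_dist(item):
            (px, py), _ = item
            return (px - point[0]) ** 2 + (py - point[1]) ** 2
        return min(train, key=sq_dist)[1]
    return predict


def accuracy(model, test):
    """Fraction of held-out points the model labels correctly."""
    return sum(model(point) == label for point, label in test) / len(test)


# i. Pick a dataset (here synthetic), holding out 25% of it for testing.
data = make_dataset()
split = int(0.75 * len(data))
train, test = data[:split], data[split:]

# ii. Test various models; iii. analyze the results.
results = {}
for name, fit in (("baseline", majority_baseline), ("1-NN", nearest_neighbour)):
    results[name] = accuracy(fit(train), test)
    print(f"{name}: accuracy = {results[name]:.2f}")
```

Always including a trivial baseline makes the analysis step meaningful: a model is only interesting if it clearly beats the baseline on data it has not seen.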

Milestone 5: What’s Next

By doing all the above, you will have ample background knowledge and implementation skills to
tackle any problem. Keep up your curiosity, because there are areas you have not yet explored,
such as Reinforcement learning, One-shot and Zero-shot learning, vast areas of NLP, Multi-task
learning, Generative models etc. Each of these areas requires special algorithms and training
methods, but all of them build on the basic knowledge you have gathered so far.
Try to explore them! You should try to get involved with a research problem or
approach a faculty member for some ideas. Or you can find a problem on Kaggle, form a team and
try to solve it. Research the existing solutions (reading papers will really open your mind as to
how quickly current research is evolving). Try to follow some papers from NeurIPS (formerly
NIPS), ICLR, ICML and CVPR. Take this with a grain of salt: don't read papers aimlessly; first
choose a problem and then explore papers or research in that area. Doing some good projects will
definitely help you bag good research internships, along with good internships at companies that
have machine learning openings.

Extras
1. Lectures by leaders in Machine Learning: Institute for Pure & Applied Mathematics (IPAM)
2. Neural Networks for Machine Learning by Professor Geoffrey Hinton [Complete]
3. Stanford University CS231n Spring 2017:
https://www.youtube.com/watch?v=vT1JzLTH4G4&list=PLC1qU-LWwrF64f4QKQT-Vg5Wr4qEE1Zxk
4. WildML – Artificial Intelligence, Deep Learning, and NLP
5. Home - colah's blog
6. Two Minute Papers
7. Machine Learning From Scratch
