How I'd Learn Machine Learning (If I Could Start Over) - by Egor Howell - Jan, 2024 - Towards Data Science
Image by author.
I have been working as a Data Scientist for over two years. In that time, I have
mainly studied machine learning (ML). To me, it’s probably the
most fascinating part of the job.
ML is a BIG space; there is so much to learn and understand. However,
taking it one step at a time makes the whole process less daunting and much
easier to handle.
In this article, I want to go over the steps I would take if I had to learn ML
from scratch again. Let’s get into it!
Supplemental video.
Maths
Machine learning revolves around algorithms, which are essentially a series
of mathematical operations. These algorithms can be implemented through
various methods and in numerous programming languages, yet their
underlying mathematical principles are the same.
A frequent argument is that you don’t need to know maths for machine
learning because most modern-day libraries and packages abstract the
theory behind the algorithms.
Multivariable calculus
Probability distributions
There is of course more maths to learn, but it is best to start with the basics; you can
always enrich your knowledge later on.
You don’t need to understand all these concepts to a master’s degree level, but you
should be able to answer questions like what a derivative is, how to multiply
matrices together, and what maximum likelihood estimation is.
The list above is the bedrock of nearly every machine learning
algorithm, so having this solid foundation will set you up for success in the
long run.
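To make those three questions concrete (what a derivative is, how to multiply matrices, and what maximum likelihood estimation is), here is a minimal sketch in Python with NumPy. The function name, the matrices, and the simulated data are all illustrative assumptions and not taken from any of the courses mentioned.

```python
import numpy as np

# 1) Derivative: approximate f'(x) with a central finite difference.
def numerical_derivative(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2 * h)

# For f(x) = x^2 the analytic derivative at x = 3 is 2 * 3 = 6.
slope = numerical_derivative(lambda x: x ** 2, 3.0)

# 2) Matrix multiplication: a (2x3) matrix times a (3x2) matrix gives a (2x2) matrix.
A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])
product = A @ B

# 3) Maximum likelihood estimation for a Gaussian: the estimates that maximise
#    the likelihood are the sample mean and the (biased, ddof=0) sample
#    standard deviation.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=5.0, scale=2.0, size=10_000)  # simulated data
mu_hat = samples.mean()
sigma_hat = samples.std()  # NumPy's default ddof=0 matches the MLE
```

If you can explain why each of the three results comes out the way it does, you probably have enough maths to move on.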
Now, there are numerous courses out there that you can take to learn all the
required maths. For a thorough introduction, I recommend the videos on
freeCodeCamp on Linear Algebra, Calculus, and Statistics.
You can also use websites such as Khan Academy and Brilliant which have
great resources on these topics. They also have a wide range of other
domains, so feel free to explore!
My main advice is to find one course, complete it, and move on. You can always
come back later if there are gaps in your knowledge, or even use Google!
Python
Python is the gold standard and the go-to programming language for
machine learning.
Beginners often get caught up in the so-called “best way” to learn Python. In
reality, any introductory course will suffice, as they all teach the same
things.
Pandas — This is the go-to library for loading, manipulating, and working
with data in Python. It is great for almost any data analysis task and is easy to
use. freeCodeCamp pandas crash course.
Great overall crash course. However, you would only need the first half hour of the video for the Anaconda
installation.
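As a rough illustration of what loading and manipulating data with pandas looks like, here is a minimal sketch. The sales table is hypothetical and inlined as text so the example runs without an external file; in practice you would pass a file path to `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical data, inlined so the example is self-contained.
csv_text = """city,year,sales
London,2022,100
London,2023,150
Paris,2022,80
Paris,2023,120
"""

# Load: read CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Manipulate: keep only the 2023 rows and add a derived column.
recent = df[df["year"] == 2023].copy()
recent["sales_thousands"] = recent["sales"] / 1000

# Aggregate: total sales per city across all years.
totals = df.groupby("city")["sales"].sum()
```

Filtering, adding columns, and grouping cover a large share of everyday data work, which is why a short crash course is usually enough to get going.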
As with the previous sections, don’t spend too much time on this and get
stuck in tutorial hell. Learn the basics and move on to the next step, which is
probably the most exciting!
The previous three steps were all about getting your foundation ready to
tackle machine learning. These foundational tasks shouldn’t take too long,
maybe a month.
However, the machine learning theory part can take some time due to the
length of the courses. It’s important not to rush, as each subsequent step and
model usually builds on the previous.
The course I took at the beginning of my journey and the one I recommend
you start with is Andrew Ng’s Machine Learning Specialization on Coursera.
I took it back in 2020 when it was still in Octave! However, it has since been
revamped. There are cutting-edge topics in there such as recommendation
systems and reinforcement learning, not to mention the coding tutorials are
now in Python!
Machine Learning
Offered by Stanford University and DeepLearning.AI. #BreakIntoAI
with Machine Learning Specialization. Master…
www.coursera.org
This course will teach you the A-Z of machine learning and give you hands-on
experience implementing the algorithms in Python using specialized ML packages
like scikit-learn, XGBoost, and TensorFlow.
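To give a feel for what working with one of these packages involves, here is a minimal sketch using scikit-learn. It is not taken from the course; the choice of the built-in iris dataset and of logistic regression is an assumption made purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to check how well the model generalises.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a simple classifier and measure accuracy on the held-out data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

Most scikit-learn models follow this same split, `fit`, `score` pattern, so the theory from a course maps onto code quite directly.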
Even though this course is beginner-level, it will cover any question you are
likely to get in an ML interview, particularly if you are applying for entry-
level roles.
Deep Learning
Learn Deep Learning from deeplearning.ai. If you want to break into
Artificial intelligence (AI), this Specialization…
www.coursera.org
Although these two courses will cover pretty much all the theory you need
for ML, feel free to research and supplement your learning. There are so
many niches and specialisms that it would simply be exhausting to list them
all out here along with their courses.
For example, one course I took recently was Andrej Karpathy’s Neural
Networks: Zero to Hero. It started at quite a low level by building a neural
network from scratch. However, in the last video, we built our own
Generative Pre-trained Transformer (GPT), the model that powers ChatGPT
and most of the recent AI boom!
Practise
The best way to learn anything is to practise and get hands-on experience.
This is by far the most important step in learning ML as it is what really
solidifies your understanding.
Kaggle
I would begin by entering a few competitions on Kaggle. The goal is not
to win and earn money, but to learn how to apply a machine learning
algorithm to a real-world problem. In essence, this is how machine learning
is used in industry: to solve business problems.
ML From Scratch
Another method I used was to implement ML algorithms from scratch using
basic Python and packages like NumPy. Being able to write an algorithm
from first principles is one of the best ways to learn it.
You can start simply with linear regression and gradient descent. Then
move over to the hard stuff, eventually working your way up to a shallow
neural network!
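As a starting point, linear regression trained with gradient descent fits in a few lines of NumPy. The sketch below is one possible version, not the code from my repo; the function name, learning rate, iteration count, and synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def fit_linear_regression(X, y, learning_rate=0.1, n_iterations=1000):
    """Fit y ~ X @ w + b by minimising mean squared error with gradient descent."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iterations):
        residuals = X @ w + b - y                     # prediction errors
        grad_w = (2 / n_samples) * (X.T @ residuals)  # d(MSE)/dw
        grad_b = 2 * residuals.mean()                 # d(MSE)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Synthetic data: y = 3x + 2 plus a little Gaussian noise.
rng = np.random.default_rng(seed=42)
X = rng.uniform(-1, 1, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=200)

w, b = fit_linear_regression(X, y)
```

The fitted `w` and `b` should land close to the true slope of 3 and intercept of 2. Writing the gradients by hand like this is exactly the part that libraries hide from you.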
You can check out my git repo where I have written some of these algorithms
from scratch.
Blog
The easiest way to get started is by having a blog. Writing about ML concepts
and algorithms will improve your understanding and display your work to
potential employers. Very few people will be doing this, so you will be in the
top echelon of practitioners.
You can start writing about anything, for example how a neural network
works or what Markov chains are. I found it useful to write a series of blogs
about one topic. For example, this is my Convolutional Neural Network
series.
Over time, you can write about more complex topics and start developing a
specialism that can help you target your job search if you want to, although
early on in your career this is probably unlikely.
Research Papers
To go even further, you can re-implement a research paper. It depends on
what paper you choose, but this is very hard. I have tried this before and
found it very difficult to match the results given in the paper. Nevertheless,
this is the pinnacle of learning ML and you will gain invaluable knowledge in
the process.
- AlphaCodium
- AlphaGeometry
- RAG vs. Finetuning
- Self-Rewarding Models
- Overview of LLMs for Evaluation
- Tuning Language Models by Proxy
...
Example tweet.
Read & Digest: Take your time over this to ensure you understand the
authors’ goal, model, and results.
Data: If possible, try to get the same data used in the paper. Read and
analyze the data at your own speed.
Study Model Architecture: Review the model and its structure, and try
to learn why the authors chose this specific architecture for their problem.
Implement: Start building the model and generating results. Take this
one step at a time, slowly iterating on simple steps.
It’s important to document this work as well. You can do this anywhere, such as
on Twitter/X, LinkedIn, your GitHub profile, or even in a blog post.
Re-implementing a paper is one of the best ways to stand out, particularly if you
want to work in ML research.
Summary
These are the steps I would take if I had to learn machine learning
completely from scratch again. It is important to note that no one size fits all,
so tailor your learning to your background and experience. Some of the
courses and tutorials I listed here may not be your cup of tea and that’s fine.
The main takeaway is to learn the basics, just enough to start getting
stuck into real machine learning problems and projects.
Happy learning!
Another Thing!
I have a free newsletter, Dishing the Data, where I share weekly tips for
becoming a better Data Scientist, and the latest AI news to keep you in the
loop. There is no “fluff” or “clickbait”, just pure actionable insights from a
practicing Data Scientist.