ML Step by Step

The document outlines a 10-month process undertaken by the author to transform from having basic Excel skills to becoming a machine learning expert. It describes 9 key steps in the process, including understanding basic concepts, learning statistics, programming in Python/R, completing exploratory data analysis projects, creating unsupervised and supervised learning models, and understanding big data technologies. The overall process required hundreds of hours of study, practice, and help along the way while also working full-time and being a parent.

Uploaded by

OUAFI Kheireddine

MyStory: Step by Step process of How I Became a Machine Learning Expert in 10 Months

Introduction
Not so long ago, using the pivot tables option in Excel was the upper limit of my skills with
numbers and the word python was more likely to make me think about a dense jungle or a nature
program on TV than a tool to generate business insights and create complex solutions.

It took me ten months to leave that life behind and start feeling like I belonged to the exclusive
world of people who can tell their medians from their means, their x-bars from the neighborhood
pub, and who know how to teach machines what they need to learn.

The transformation was not easy. It demanded hard work, time and dedication, and it required
plenty of help along the way. It also involved hundreds of hours of “studying” in different forms
and an equal amount of time practicing and applying everything I was learning. In short, going
from data dumb to data nerd was hard, but I managed it while handling a terribly busy work
schedule and being a dad to a one-year-old.

The point of this article is to help you if you are looking to make a similar transformation but do
not know where to start and how to proceed from one step to the next. If you are interested in
finding out, read on to get an idea about the topics you need to cover and also develop an
understanding of the level of expertise you need to build at each stage of the learning process.

Schlumberger-Private
There are plenty of great online and offline resources to help you master each of these steps, but
very often, the trouble for the uninitiated can be in figuring out where to start and where to
finish. I hope spending the next ten to fifteen minutes going through this article will help solve
that problem for you.

And finally, before proceeding any further, I would like to point out that I had a lot of help in
making this transformation. Right at the end of the article, I will reveal how I managed to
squeeze in so much learning and work in a matter of ten months. But that’s for later.

For now, I want to give you more details about the nine steps that I had to go through in my
transformation process.

Step 1: Understand the basics

Spend a couple of weeks enhancing your “general knowledge” about the field of data science
and machine learning. You may already have ideas and some sort of understanding about what
the field is, but if you want to become an expert, you need to understand the finer details to a
point where you can explain it in simple terms to just about anyone.

Suggested exercise to mark completion of this step:

• Create a list of references with the easiest-to-understand explanation that you found for
each topic and publish them in a blog. Add a list of statistics-related questions that one
may be expected to answer in a data science interview.

Step 3: Learn Python or R (or both) for data analysis
Programming turned out to be easier to learn, more fun and more rewarding in terms of the
things it made possible, than I had ever imagined. While mastering a programming language
could be an eternal quest, at this stage, you need to get familiar with the process of learning a
language and that is not too difficult.

Both Python and R are very popular and mastering one can make it quite easy to learn the other.
I started with R and have slowly started using Python for doing similar tasks as well.

Suggested topics:

• Supported data structures
• Read, import or export data
• Data quality analysis
• Data cleaning and preparation
• Data manipulation – e.g. sorting, filtering, aggregating and other functions
• Data visualization

Know that you are set for the next step:

• Extract a table from a website, modify it to compute new variables, and create graphs
summarizing the data
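
To give a feel for this exercise, here is a minimal sketch in Python using only the standard library. The HTML string, the city names and the numbers are all made up for illustration and stand in for a page you would normally download. In practice you would more likely reach for a data analysis library, so treat this as one possible way to do it, not the recommended one.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a page fetched from a website (made-up data).
HTML = """
<table>
  <tr><th>city</th><th>population</th><th>area_km2</th></tr>
  <tr><td>Avalon</td><td>120000</td><td>60</td></tr>
  <tr><td>Brightport</td><td>450000</td><td>150</td></tr>
  <tr><td>Cedarville</td><td>80000</td><td>100</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, one list per <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


parser = TableParser()
parser.feed(HTML)
header, *body = parser.rows

# Turn each row into a dict and compute a new variable: population density.
records = []
for row in body:
    rec = dict(zip(header, row))
    rec["population"] = int(rec["population"])
    rec["area_km2"] = float(rec["area_km2"])
    rec["density"] = rec["population"] / rec["area_km2"]
    records.append(rec)

# A crude text "graph": one '#' per 500 people per square kilometre.
for rec in sorted(records, key=lambda r: r["density"], reverse=True):
    bar = "#" * round(rec["density"] / 500)
    print(f"{rec['city']:<12}{rec['density']:>8.0f} {bar}")
```

The same three moves (extract, derive a new column, summarize visually) carry over unchanged when you switch to R or to a plotting library.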

Step 4: Complete an Exploratory Data Analysis Project

In the first cricket test match ever played (see scorecard), Australian Charles Bannerman scored
67.35% (165 out of 245) of his team’s total score, in the very first innings of cricket’s history.
This remains a record in cricket at the time of writing, for the highest share of the total score by a
batsman in an innings of a test match.

What makes the innings even more remarkable is that the other 43 innings in that test match had
an average of only 10.8 runs an innings, with only about 40% of all batsmen registering a score
of ten or more runs. In fact, the second highest score by an Australian in the match was 20 runs.
Given that Australia won the match by 45 runs, we can say with conviction that Bannerman’s
innings was the most important contributor to Australia’s win.
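
The headline figure in this story is simple arithmetic, and it is worth checking such numbers yourself. A tiny Python sketch, using only the two numbers quoted above:

```python
# Figures quoted in the text: Bannerman's score and his team's innings total.
bannerman_runs = 165
team_total = 245

share = bannerman_runs / team_total
print(f"Share of team total: {share:.2%}")
```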

Just like we were able to build this story from the scorecard of the test match, exploratory data
analysis is about studying data to understand the story that is hidden beneath it, and then sharing
the story with everyone.

Personally, I find this phase of a data project the most interesting, which is a good thing, as
exploratory data analysis can be expected to take up a large share of the time in a typical
project.

Topics to cover:

• Single variable explorations
• Pair-wise and multi-variable explorations
• Visualization, dashboard and storytelling in Tableau
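
As a rough illustration of the first two topics, the sketch below summarizes each variable on its own and then measures how two variables move together. The data (hours studied and exam scores) is invented for the example, and the `pearson` helper is my own small function, not part of any library.

```python
import statistics

# Hypothetical data: hours studied and exam score for six students.
hours = [2, 4, 5, 7, 9, 10]
scores = [52, 58, 61, 70, 79, 85]


def describe(values):
    """Single-variable exploration: centre, spread and range."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pair-wise exploration: Pearson correlation coefficient."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


print(describe(hours))
print(describe(scores))
print(f"correlation: {pearson(hours, scores):.3f}")
```

A correlation close to 1 here only says the two columns rise together in this made-up sample; telling the story behind such a number is the real work of the project.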

Project output:

• Create a blog post summarizing the exercise and sharing the dashboard or story. Use a
dataset with at least ten columns and a few thousand records.

Step 5: Create unsupervised learning models

Let’s say we had data for all the countries in the world across many parameters ranging from
population, to income, to health, to major industries and more. Now suppose we wanted to find
out which countries are similar to each other across all these parameters. How do we go about
doing this, when we have to compare each country with all the others, across over 50 different
parameters?

That is where unsupervised machine learning algorithms come in. This is not the time to bore
you with details about what these are all about, but the good news is that once you reach this
stage, you have moved on into the world of machine learning and are already in elite company.

Topics to cover:

• K-means clustering
• Association rules
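
Association rules are usually judged by three numbers: support, confidence and lift. The sketch below computes them for one rule on a handful of invented shopping baskets. The baskets and function names are assumptions made for illustration, and exact fractions are used so that the results are easy to check by hand.

```python
from fractions import Fraction

# Hypothetical shopping baskets (made-up data).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset):
    """Fraction of baskets that contain every item in the itemset."""
    return Fraction(sum(itemset <= basket for basket in baskets), len(baskets))


def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to how common the consequent is overall."""
    return confidence(antecedent, consequent) / support(consequent)


rule = ({"bread"}, {"butter"})
print("support:", support(rule[0] | rule[1]))
print("confidence:", confidence(*rule))
print("lift:", lift(*rule))
```

A lift below 1, as in this toy example, suggests that buying bread does not make butter any more likely than it already is.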

Milestone exercise:

• Practice K-means clustering on 3 different datasets from different industries or interest
areas
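
Before practicing on real datasets, it can help to see how little is going on inside K-means. What follows is a bare-bones sketch in plain Python, assuming two-dimensional points and squared Euclidean distance. The points, the seed and the function names are mine, chosen for illustration. A real project would normally use a library implementation.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=50, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:  # nothing moved, so the algorithm has converged
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

Which group ends up labelled 0 and which 1 depends on the random start, so only the grouping itself is meaningful.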

Step 6: Create supervised learning models
If you had data about millions of loan applicants and their repayment history from the past, could
you identify an applicant who is likely to default on payments, even before the loan is approved?

Given enough prior data, could you predict which users are more likely to respond to a digital
advertising campaign? Could you identify if someone is more likely to develop a certain disease
later in their life based on their current lifestyle and habits?

Supervised learning algorithms help solve all these problems and a lot more. While there is a
plethora of algorithms to understand and master, just getting started with some of the most
popular ones will open up a world of new possibilities for you and the ways in which you can
make data useful for an organization.

Topics to cover:

• Logistic regression
• Classification trees
• Ensemble models like Bagging and Random Forest
• Support Vector Machines

You have not really started with creating models till you have done this:

• Take a dataset, create models using all the algorithms you have learnt. Train, test and
tune each model to improve performance. Compare them to identify which is the best
model and document why you think it is so.
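
The train, test and compare loop can be sketched in a few lines. The example below trains a logistic regression with plain stochastic gradient descent on synthetic "loan applicant" data and compares it with a majority-class baseline on held-out records. Everything here is an assumption made for illustration: the two features are made up, the data is randomly generated, and the learning rate and epoch count are arbitrary choices, not tuned values.

```python
import math
import random

rng = random.Random(42)


def make_applicants(n, centre, label):
    """Hypothetical applicants: two made-up numeric features scattered around a centre."""
    return [([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)], label) for _ in range(n)]


# Label 0 = repaid, label 1 = defaulted (synthetic data, not real loans).
data = make_applicants(100, 1.0, 0) + make_applicants(100, 3.0, 1)
rng.shuffle(data)
train, test = data[:160], data[160:]  # hold out 20% for testing


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, lr=0.1, epochs=200):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in rows:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            w = [w[0] - lr * err * x[0], w[1] - lr * err * x[1]]
            b -= lr * err
    return w, b


def accuracy(predict, rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)


w, b = train_logistic(train)
model = lambda x: int(sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5)

# Baseline: always predict the most common label seen in training.
majority = round(sum(y for _, y in train) / len(train))
baseline = lambda x: majority

model_acc = accuracy(model, test)
baseline_acc = accuracy(baseline, test)
print(f"logistic regression: {model_acc:.2f}, baseline: {baseline_acc:.2f}")
```

Comparing against a trivial baseline on data the model has never seen is the habit to build. The same pattern extends to trees, ensembles and any other model you add to the comparison.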

Step 7: Understand Big Data Technologies
Many of the machine learning models in use today have been around for decades. The reason
these algorithms are finding widespread application only now is that we finally have access to
sufficiently large amounts of data to supply to them, so that they can come up with useful
outputs.

Data engineering and architecture is a field of specialization in itself, but every machine learning
expert must know how to deal with big data systems, irrespective of their specialization within
the industry.

Understanding how large amounts of data can be stored, accessed and processed efficiently is
important to being able to create solutions that can be implemented in practice and are not just
theoretical exercises.

I had approached this step with a real lack of conviction, but as I soon found out, it was driven
more by the fear of the unknown in the form of Linux interfaces than any real complexity in
finding my way around a Hadoop system.

Topics to cover:

• Big data overview and ecosystem
• Hadoop – HDFS, MapReduce, Pig and Hive
• Spark

Do this to know that you have understood the basics:

• Upload data, run processes and extract results
after installing a local version of Hadoop or Spark on your system
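
If MapReduce sounds intimidating, the idea itself fits in a few lines. The sketch below imitates the map, shuffle and reduce phases of a word count in plain Python on three invented sentences. It runs in a single process and uses no Hadoop or Spark API, so it only illustrates the programming model, not how a cluster distributes the work.

```python
from collections import defaultdict

# Hypothetical input: each string stands for one document or file split.
documents = [
    "big data needs big storage",
    "spark processes data in memory",
    "hadoop stores data on disk",
]


def map_phase(document):
    """Map: emit a (word, 1) pair for every word."""
    for word in document.split():
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


mapped = [pair for doc in documents for pair in map_phase(doc)]
grouped = shuffle_phase(mapped)
counts = dict(reduce_phase(key, values) for key, values in grouped.items())
print(counts)
```

On a real cluster the map and reduce functions run on many machines at once, and the framework takes care of the shuffle between them.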

Step 8: Explore Deep Learning Models
Deep learning models are helping companies like Apple and Google create solutions like Siri or
the Google Assistant. They are helping global giants test driverless cars and suggesting best
courses of treatment to doctors.

Machines are able to see, listen, read, write and speak thanks to deep learning models that are
going to transform the world in many ways, including significantly changing the skills required
for people to be useful to organizations.

Getting started with creating a model that can tell the image of a flower from a fruit may not
immediately help you start building your own driverless car, but it will certainly help you start
seeing the path to getting there.

Topics to cover:

• Artificial Neural Networks
• Natural Language Processing
• Convolutional Neural Networks
• TensorFlow
• OpenCV
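
The building block behind convolutional neural networks is a small filter sliding over an image. Here is a minimal sketch in plain Python, assuming a tiny hand-made grayscale image and a hand-picked vertical-edge filter. In a real network the filter values are learnt, and a library such as TensorFlow does this far more efficiently. As in most deep learning libraries, the filter is applied without flipping it (strictly speaking a cross-correlation).

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj] for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4x6 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Hand-picked filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The output is large only where the filter straddles the boundary between the dark and bright halves, which is exactly the kind of local pattern a convolutional layer picks up.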

Milestone exercise:

• Create a model that can correctly identify pictures of two of your friends or family
members

Step 9: Undertake and Complete a Data Project

By now you are almost ready to unleash yourself to the world as a machine learning pro, but you
need to showcase all that you have learnt before anyone else will be willing to agree with you.

The internet presents glorious opportunities to find such projects. If you have been diligent about
the previous eight steps, chances are that you would already know how to find a project that will
excite you, be useful to someone, as well as help demonstrate your knowledge and skills.

Topics to cover:

• Data collection, quality check, cleaning and preparation
• Exploratory data analysis
• Model creation and selection
• Project report

Milestone exercise:

• Get in touch with a stakeholder who will be interested in your report, share your
findings with them and get feedback

End Notes
Machine learning and artificial intelligence are skills for the present and the future. This is also
a field where learning never ceases, and very often you may have to keep running just to stay in
the same place as far as the most in-demand skills are concerned.

However, if you start the journey well, you will be able to understand how to go about taking the
next step in your learning path. As you must have gathered by now, starting the journey well is a
pretty challenging exercise in itself. If you choose to start upon it, I hope this article will have
been of some help to you and I wish you the very best.

Finally, I will confess that I got a lot of help with my ten-month transition. The reason I was able
to cover so much ground in this amount of time, along with a busy schedule at work and home,
was that I enrolled for the Post Graduate Program in Data Science and Machine Learning offered
by Jigsaw Academy and Graham School, University of Chicago.

Investing in the course helped in keeping my learning hours focused, created external pressure
that ensured that I was finding time for it irrespective of whatever else was going on in life, and
gave me access to experts in the form of faculty and a great peer group through other students.

Transforming from being non-technical to someone who is comfortable with the machine
learning world has already opened up many new doors for me. Whatever path you choose to
make this transformation, you can do so with the assurance that going through the rigor will reap
rewards for a long time and will banish any fears of becoming irrelevant in tomorrow’s
economy.
