7 More Steps To Mastering Machine Learning With Python
KDnuggets Home » News » 2017 » Mar » Tutorials, Overviews » 7 More Steps to Mastering
Machine Learning With Python ( 17:n09 )
Tags: 7 Steps, Classification, Clustering, Deep Learning, Ensemble Methods, Gradient Boosting,
Machine Learning, Python, scikit-learn, Sebastian Raschka
This post is a follow-up to last year's introductory Python machine learning post, which includes a
series of tutorials for extending your knowledge beyond the original.
So, you have been thinking about picking up machine learning, but given the confusing state of the
web, you don't know where to begin? Or maybe you have finished the first 7 steps and are looking for
some follow-up material beyond the introductory?
Machine learning algorithms.
This post is the second installment of the 7 Steps to Mastering Machine Learning in Python series
(since there are 2 parts, I guess it now qualifies as a series). If you have started with the original post,
you should already be satisfactorily up to speed, skill-wise. If not, you may want to review that post
first, which may take some time, depending on your current level of understanding; however, I assure
you that doing so will be worth your effort.
After a quick review -- and a few options for a fresh perspective -- this post will focus more
categorically on several sets of related machine learning tasks. Since we can safely skip the
foundational modules this time around -- Python basics, machine learning basics, etc. -- we will jump
right into the various machine learning algorithms. We can also categorize our tutorials better along
functional lines this time.
I will, once again, state that the material contained herein is all freely available on the web, and all
rights and recognition for the works belong to their original authors. If something has not been
properly attributed, please feel free to let me know.
Just to review, these are the steps covered in the original post:
If, however, you are really green, I would start with the following, covering the absolute basics:
If you are looking for some alternative or complementary approaches to learning the basics of
machine learning, I have recently been enjoying Shai Ben-David's video lectures and freely available
textbook written with Shai Shalev-Shwartz. Find them both here:
Remember, the introductory material does not all need to be digested before moving forward with the
rest of the steps (in either this post or the original). Video lectures, texts, and other resources can be
consulted when implementing models using the relevant machine learning algorithms, or when the
applicable concepts come up in practice in subsequent steps. Use your judgment.
We begin with the new material by first strengthening our classification know-how and introducing a
few additional algorithms. While part 1 of our post covered decision trees, support vector machines,
and logistic regression -- as well as the ensemble classifier Random Forests -- we will add k-nearest
neighbors, the Naive Bayes classifier, and a multilayer perceptron into the mix.
Scikit-learn classifiers.
k-nearest neighbors (kNN) is a simple classifier and an example of a lazy learner, in which all
computation occurs at classification time (as opposed to occurring during a training step ahead of
time). kNN is non-parametric, and functions by comparing a data instance with the k closest instances
when making decisions about how it should be classified.
K-Nearest Neighbor Classification Using Python
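As an illustration (my own minimal sketch, not taken from the linked tutorial), here is kNN with scikit-learn's `KNeighborsClassifier` on the iris dataset; the dataset, split, and `k=5` are arbitrary choices for demonstration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small benchmark dataset and hold out a test split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# A "lazy" learner: fit() essentially just stores the training data;
# the real work (finding the k closest instances) happens at predict time
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```

Try varying `n_neighbors` to see the bias/variance trade-off: a very small k overfits to local noise, while a very large k blurs class boundaries.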
Naive Bayes is a classifier based on Bayes' Theorem. It assumes that there is independence among
features, and that the presence of any particular feature in one class is not related to any other feature's
presence in the same class.
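To make the independence assumption concrete, here is a hedged sketch using scikit-learn's `GaussianNB` (which models each feature within a class as an independent one-dimensional Gaussian); the iris dataset and split are again just illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# The "naive" part: each feature is treated as conditionally independent
# of every other feature, given the class label
nb = GaussianNB()
nb.fit(X_train, y_train)
print("test accuracy:", nb.score(X_test, y_test))
```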
The multilayer perceptron (MLP) is a simple feedforward neural network, consisting of multiple
layers of nodes, where each layer is fully connected to the layer that comes after it. The MLP
classifier was introduced in Scikit-learn version 0.18.
First read an overview of the MLP classifier from the Scikit-learn documentation, and then practice
implementation with a tutorial.
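A minimal sketch of the `MLPClassifier` is below (my own example, not the documentation's); note that feature scaling matters for neural networks, and the single 32-unit hidden layer is an arbitrary choice:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier  # requires scikit-learn >= 0.18
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

# Standardize features first -- gradient-based training is sensitive to scale
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=1))
mlp.fit(X_train, y_train)
print("test accuracy:", mlp.score(X_test, y_test))
```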
We now move on to clustering, a form of unsupervised learning. In the first post we covered the k-
means algorithm; we will introduce DBSCAN and Expectation-maximization (EM) herein.
First off, read these introductory posts; the first is a quick comparison of k-means and EM clustering
techniques, a nice segue into new forms of clustering, and the second is an overview of clustering
techniques available in Scikit-learn:
First read a tutorial on the EM algorithm. Next, have a look at the relevant Scikit-learn
documentation. Finally, follow a tutorial and implement EM clustering yourself with Python.
If "Gaussian mixture models" is confusing at first glance, this relevant section from the Scikit-learn
documentation should alleviate any unnecessary worries:
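Since Scikit-learn's `GaussianMixture` fits its parameters via the EM algorithm, a short sketch can tie the two ideas together; the synthetic blobs and three components below are my own illustrative choices:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Three well-separated Gaussian blobs as toy data
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=7)

# GaussianMixture estimates component means, covariances, and weights
# by iterating EM's expectation and maximization steps until convergence
gmm = GaussianMixture(n_components=3, random_state=7)
labels = gmm.fit_predict(X)

print("converged:", gmm.converged_)
print("cluster sizes:", np.bincount(labels))
```

Unlike k-means, the fitted model gives soft assignments: `gmm.predict_proba(X)` returns per-component membership probabilities rather than a single hard label.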
First read and follow an example implementation of DBSCAN from Scikit-learn's documentation,
and then follow a concise tutorial:
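For a quick hands-on feel (a sketch of my own, separate from the linked tutorial), the classic two-moons dataset shows where density-based clustering shines; the `eps` and `min_samples` values are illustrative and would need tuning on real data:

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaving half-circles -- a shape k-means handles poorly
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

# eps: neighborhood radius; min_samples: points needed to form a dense core
db = DBSCAN(eps=0.2, min_samples=5)
labels = db.fit_predict(X)

# DBSCAN labels noise points as -1, so exclude that "cluster" when counting
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("clusters found:", n_clusters)
```

Note that DBSCAN discovers the number of clusters itself, unlike k-means or a Gaussian mixture, where the count must be specified up front.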