6 - Machine Learning 2

The document discusses machine learning concepts including scikit-learn library, splitting datasets, features and targets, feature extraction, feature scaling, encoding categorical features, choosing models, and improving models through validation, hyperparameter tuning, and regularization.

Uploaded by

sdog444514


Machine Learning

(Continued)
Welcome to Machine Learning!

Big Data, Machine Learning, and their Real World Applications

Pre-College Program
Columbia University, SPS
Let’s review: scikit-learn library

https://machinelearningmastery.com/a-gentle-introduction-to-scikit-learn-a-python-machine-learning-library/
Splitting a Dataset with scikit-learn

https://towardsdatascience.com/splitting-a-dataset-e328dab2760a
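Scikit-learn
• Train/test split
• Features and target
• fit(): trains the model on the training features and target
• model: the fitted estimator
• new_features: unseen samples you want predictions for
• model.predict(): predicts targets for the test (or new) features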
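
The workflow in the bullets above can be sketched as follows. This is a minimal illustration, not the course's own code: the iris dataset and the k-nearest-neighbors classifier are stand-ins chosen here, and the variable names simply mirror the slide.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset (assumption): features are measurements, target is the species.
features, target = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing.
train_features, test_features, train_target, test_target = train_test_split(
    features, target, test_size=0.25, random_state=0
)

# fit() trains the model on the training split only.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(train_features, train_target)

# predict() is then called on features the model has not seen.
new_features = test_features[:3]
predictions = model.predict(new_features)
print(predictions)
```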
Concepts on Features
• Feature extraction
• Numerical features: feature scaling
• Categorical features:
  • One-Hot Encoding
  • Ordinal Encoding
A Note on Feature Extraction
• Not all your data will be ready to input into an algorithm. Preprocessing!

• Some more complex data (audio, images, sentences of text, biosignals, etc.) might require extracting features so that you can input these features into the algorithm.

• “Traditional” (non-neural network) algorithms usually rely on features being represented as numerical or categorical values rather than raw complex signals.
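
As a rough illustration of feature extraction, the sketch below turns raw signals of different lengths into fixed-length rows of numbers. The `extract_features` helper and the four summary statistics are hypothetical choices made for this example; real projects pick features that suit their data.

```python
import numpy as np

def extract_features(signal):
    """Summarize one raw signal as four numbers (hypothetical feature set)."""
    signal = np.asarray(signal, dtype=float)
    return np.array([signal.mean(), signal.std(), signal.min(), signal.max()])

# Synthetic stand-ins for raw recordings of different lengths (assumption).
rng = np.random.default_rng(0)
raw_signals = [rng.normal(size=n) for n in (500, 800, 650)]

# One row per signal, one column per extracted feature.
feature_matrix = np.vstack([extract_features(s) for s in raw_signals])
print(feature_matrix.shape)  # (3, 4)
```

Every signal ends up as a row of the same width, which is the shape a traditional algorithm expects.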
Numerical Features: Feature Scaling
Feature scaling in machine learning is
one of the most critical steps during the
pre-processing of data before creating a
machine learning model. Scaling can
make a difference between a weak
machine learning model and a better
one.
StandardScaler()
• fit() finds the mean and variance of each feature

• transform() uses that mean and variance to standardize the data (subtract the mean, then divide by the standard deviation)

• fit_transform() does both!
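
A small sketch of these three methods, using made-up numbers. The arrays are assumptions for illustration; the point is that the scaler is fitted on the training data and then reused unchanged on the test data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales (made-up values).
train_features = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
test_features = np.array([[2.0, 400.0]])

scaler = StandardScaler()

# fit_transform() learns the per-feature mean and variance, then scales.
scaled_train = scaler.fit_transform(train_features)

# transform() reuses the statistics learned from the training data.
scaled_test = scaler.transform(test_features)

print(scaler.mean_)                # per-feature means learned by fit()
print(scaled_train.mean(axis=0))   # approximately 0 for each feature
print(scaled_train.std(axis=0))    # approximately 1 for each feature
```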
Dealing with Categorical Variables as Features
• Feature Encoding:
  • One-Hot Encoding
  • Ordinal Encoding
https://machinelearningmastery.com/one-hot-encoding-for-categorical-data/
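
The two encodings can be compared on toy data. The color and size columns below are invented for this sketch, and the explicit category order passed to `OrdinalEncoder` is an assumption about what "small < medium < large" should mean.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# One-hot encoding: categories with no natural order get one column each.
colors = np.array([["red"], ["green"], ["blue"], ["green"]])
one_hot = OneHotEncoder()
color_columns = one_hot.fit_transform(colors).toarray()
print(one_hot.categories_[0])  # learned categories, sorted alphabetically
print(color_columns)

# Ordinal encoding: ordered categories map to increasing integers.
sizes = np.array([["small"], ["large"], ["medium"], ["small"]])
ordinal = OrdinalEncoder(categories=[["small", "medium", "large"]])
size_codes = ordinal.fit_transform(sizes)
print(size_codes.ravel())
```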
A shortcut from Pandas
• get_dummies()
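
A minimal sketch of the pandas shortcut, using an invented two-column table. The `columns` argument limits the encoding to the categorical column and leaves the numerical one untouched.

```python
import pandas as pd

# Made-up table: one categorical column, one numerical column.
df = pd.DataFrame({"color": ["red", "green", "blue"], "price": [10, 12, 9]})

# get_dummies() replaces "color" with one indicator column per category.
encoded = pd.get_dummies(df, columns=["color"])
print(encoded.columns.tolist())
```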
Encoding the Labels
• LabelEncoder can be used to normalize labels

• Or to transform categorical (non-numerical) labels into numerical labels
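
For the target column, a sketch with scikit-learn's `LabelEncoder` might look like this. The animal labels are invented for the example.

```python
from sklearn.preprocessing import LabelEncoder

# Made-up string labels for the target.
labels = ["dog", "cat", "fish", "dog"]

encoder = LabelEncoder()
encoded_labels = encoder.fit_transform(labels)

print(encoder.classes_)   # the distinct labels, sorted
print(encoded_labels)     # integer code for each original label

# inverse_transform() maps the integers back to the original labels.
decoded = encoder.inverse_transform(encoded_labels)
```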
Which Model to Choose? Underfitting vs Overfitting
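
One way to see underfitting and overfitting is to vary the depth of a decision tree and compare training and test accuracy. The synthetic dataset and the depth values below are assumptions for this sketch, and the exact scores will depend on the data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, slightly noisy classification data (assumption).
features, target = make_classification(
    n_samples=400, n_features=10, n_informative=4, flip_y=0.1, random_state=0
)
train_features, test_features, train_target, test_target = train_test_split(
    features, target, test_size=0.25, random_state=0
)

scores = {}
for depth in (1, 4, None):  # None lets the tree grow until it fits the training data
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(train_features, train_target)
    scores[depth] = (
        model.score(train_features, train_target),
        model.score(test_features, test_target),
    )
    print(depth, scores[depth])
```

A very shallow tree tends to score poorly on both splits (underfitting), while a large gap between a high training score and a lower test score is the usual sign of overfitting.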
Model Improvement (if you care to know…)
• Validation dataset
• Hyperparameter tuning
• Cross-validation
• Cost functions
• Regularization
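
Several of these ideas can be combined in one short sketch: `GridSearchCV` tunes hyperparameters using cross-validation on the training data, so the test set stays untouched until the end. The iris dataset and the parameter grid are assumptions for illustration; `ccp_alpha` (cost-complexity pruning) is used here as one regularization-style control for a decision tree.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

features, target = load_iris(return_X_y=True)
train_features, test_features, train_target, test_target = train_test_split(
    features, target, test_size=0.25, random_state=0, stratify=target
)

# Candidate hyperparameter values (illustrative choices).
param_grid = {
    "max_depth": [2, 3, 5, None],
    "ccp_alpha": [0.0, 0.01, 0.05],  # larger values prune the tree more
}

# 5-fold cross-validation plays the role of a validation dataset.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(train_features, train_target)

print(search.best_params_)
print(search.score(test_features, test_target))  # accuracy on the held-out test set
```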
Activity for Group Project
Run a decision tree algorithm with scikit-learn for the dataset you chose for your project.
• Remember to separate features from targets. You might also have to convert your data to a NumPy array.
• Remember to train/test split adequately.
• Fit the decision tree (either classifier or regressor) to your data.
• Predict using the testing features.
• Compare the expected values (test_target) to your predictions.
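
The activity steps could be sketched as below. The wine dataset is only a stand-in for your project data (loading it as a DataFrame requires pandas to be installed), and a classifier is assumed; for a numerical target you would swap in `DecisionTreeRegressor` and a regression metric.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in for your project dataset, loaded as a pandas DataFrame (assumption).
data = load_wine(as_frame=True).frame

# Separate features from the target and convert both to NumPy arrays.
features = data.drop(columns=["target"]).to_numpy()
target = data["target"].to_numpy()

# Train/test split.
train_features, test_features, train_target, test_target = train_test_split(
    features, target, test_size=0.25, random_state=0
)

# Fit the decision tree on the training data.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(train_features, train_target)

# Predict using the testing features.
predictions = tree.predict(test_features)

# Compare the expected values (test_target) to the predictions.
accuracy = accuracy_score(test_target, predictions)
print(accuracy)
```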
