ORANGE DATA MINING Steps

The document outlines the steps for building an AI model to predict penguin species using Orange Data Mining. It covers data acquisition, cleaning, defining output labels, splitting the data, and testing various classification algorithms to determine the best model. The process includes connecting different widgets and inspecting data splits to ensure accuracy in predictions.

Uploaded by

sureshpetasvis

ORANGE DATA MINING

STEPS for building an AI model to predict the penguin species

After Data Acquisition, what should we do next?
What is the mean value of the culmen_length feature?
Now that the data is clean and without any missing values, what next?
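
As a rough illustration of the cleaning and mean-value questions above, the same checks can be sketched in plain Python outside Orange. The rows and values below are made up for the example and are not taken from the real dataset, and dropping incomplete rows is only one possible cleaning strategy (an assumption, since the steps above do not say how missing values were handled).

```python
from statistics import mean

# Hypothetical rows for illustration only; not the real penguin data.
rows = [
    {"species": "Adelie", "culmen_length": 39.1},
    {"species": "Adelie", "culmen_length": None},  # a missing value
    {"species": "Gentoo", "culmen_length": 46.5},
    {"species": "Chinstrap", "culmen_length": 49.0},
]

# Cleaning (assumed strategy): keep only rows in which every field has a value.
clean_rows = [r for r in rows if all(v is not None for v in r.values())]

# Mean of one numeric feature over the cleaned rows.
culmen_mean = mean(r["culmen_length"] for r in clean_rows)

print(len(rows) - len(clean_rows), "row(s) dropped")
print("mean culmen_length:", round(culmen_mean, 2))
```

In Orange itself these numbers are read from widgets rather than computed by hand; the sketch only shows what the reported mean represents.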

In TrainData, you will have noticed that the Feature Type of most columns is Numeric Feature.
Supervised learning models need both features and labels, where the labels are the outputs the
model learns to predict. We therefore need to define an output for our Palmer Penguin model.
We will assign species as our label, since that is what we want to identify.
To do that, we change the Feature Type of species from Categorical Feature to Categorical
Label, using the Select Columns widget.
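
Outside Orange, choosing a label amounts to separating one column from the rest. The following is a minimal sketch in plain Python, assuming hypothetical column names and values (body_mass and the numbers are illustrative only); it is not how Select Columns is implemented.

```python
# Hypothetical column names and values, for illustration only.
rows = [
    {"species": "Adelie", "culmen_length": 39.1, "body_mass": 3750.0},
    {"species": "Gentoo", "culmen_length": 46.5, "body_mass": 5000.0},
]

TARGET = "species"  # the column the model should learn to predict

# Features: every column except the target. Labels: the target column only.
features = [{k: v for k, v in r.items() if k != TARGET} for r in rows]
labels = [r[TARGET] for r in rows]

print(features[0])
print(labels)
```

Keeping the label out of the feature set matters: a model that sees species among its inputs would have nothing left to learn.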
After choosing a target label, we need to split the data.
Connect the Select Columns widget to the Data Sampler widget by dragging the output of
Select Columns to the input of Data Sampler.
After the connection is made, double-click on Data Sampler to open the properties tab.
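
What a random split does can be sketched in a few lines of plain Python. The 80/20 proportion and the seed below are assumptions chosen for the example, since the steps above do not state the sampler settings, and split_rows is a hypothetical helper, not an Orange function.

```python
import random

def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows and cut it into (train, test) lists.

    train_fraction and seed are illustrative defaults, not Orange settings.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(10))  # stand-ins for 10 data rows
train, test = split_rows(rows)
print(len(train), "training rows,", len(test), "test rows")
```

Shuffling before cutting keeps the two parts from inheriting any ordering in the file, for example all rows of one species appearing first.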
How do we know if the data is actually split or not?

Let’s inspect how the data is being split by Data Sampler.
We will be using Data Info.
Connect the Data Sampler widget to the second Data Info widget by dragging the
output of Data Sampler to the input of the second Data Info.
Take note of the connection name. We will change this: double-click on the connection.
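
The kind of check performed here, counting the rows in each part of the split, can be sketched in plain Python as follows. The label lists are hypothetical and describe is an illustrative helper; the sketch assumes the sampler yields two parts (the sampled rows and the rest).

```python
from collections import Counter

def describe(name, labels):
    """Print the row count and per-class counts, and return the counts."""
    counts = Counter(labels)
    print(f"{name}: {len(labels)} rows, {dict(counts)}")
    return counts

# Hypothetical label columns of the two parts of a split.
sample_labels = ["Adelie", "Adelie", "Gentoo", "Chinstrap",
                 "Gentoo", "Adelie", "Gentoo"]
remaining_labels = ["Adelie", "Gentoo", "Chinstrap"]

sample_counts = describe("sample", sample_labels)
remaining_counts = describe("remaining", remaining_labels)

# The two parts together should account for every row of the input.
total = len(sample_labels) + len(remaining_labels)
sample_share = len(sample_labels) / total
print(f"sample share: {sample_share:.0%}")
```

Looking at per-class counts as well as totals helps confirm that every species appears in both parts.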
What do we do after having split the data?
After creating a model, we need to test the model and check its accuracy.
Let’s try a couple of other classification algorithms.
Now that we have found which model gives us the best results, we can use that one!
Since the Random Forest algorithm is not working well with one of the species,
let’s use another algorithm.
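
Why a model with reasonable overall accuracy can still fail on one species is easier to see with a per-class score. Below is a minimal sketch in plain Python; model_a and model_b and all predictions are invented for illustration and are not results from this workflow.

```python
from collections import Counter

def accuracy(actual, predicted):
    """Fraction of rows whose predicted label matches the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def recall_per_class(actual, predicted):
    """For each class, the fraction of its rows that were predicted correctly."""
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical test labels and predictions from two hypothetical models.
actual = ["Adelie", "Adelie", "Adelie", "Chinstrap", "Chinstrap",
          "Gentoo", "Gentoo", "Gentoo"]
predictions = {
    "model_a": ["Adelie", "Adelie", "Adelie", "Adelie", "Adelie",
                "Gentoo", "Gentoo", "Gentoo"],
    "model_b": ["Adelie", "Adelie", "Chinstrap", "Chinstrap", "Chinstrap",
                "Gentoo", "Gentoo", "Gentoo"],
}

results = {}
for name, predicted in predictions.items():
    results[name] = (accuracy(actual, predicted),
                     recall_per_class(actual, predicted))
    print(name, results[name])
```

In this made-up case model_a is right on 6 of 8 rows yet never identifies a Chinstrap, which is the kind of weakness that overall accuracy alone hides.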
