
GOVERNMENT POLYTECHNIC, NAGPUR.

(An Autonomous Institute of Govt. of Maharashtra)


Second Progressive Test
Diploma in Computer Engineering
Course Code : CM303G Course Name : Machine Learning
Time : 1 Hour Max. Marks : 25

Instructions:
1. All questions are compulsory
2. Illustrate your answers with neat sketches wherever necessary
3. Figures to the right indicate full marks
4. Use of non-programmable calculator is permissible
5. Assume suitable data if necessary
6. Preferably, write the answers in sequential order.

Q1 Attempt any FIVE. 10 M COs


6R2 a) SVMs are particularly effective when the number of features CO6
(dimensions) is very large compared to the number of samples.
They work well in datasets with high dimensionality by
maximizing the margin between classes, ensuring good
generalization performance. By focusing on the points (support
vectors) that are closest to the decision boundary, SVMs minimize
the risk of overfitting, especially in scenarios where there is a clear
margin of separation between classes.
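As a brief illustrative sketch (not part of the model answer; the synthetic dataset and parameters below are assumed), a linear-kernel SVM on data with far more features than samples:

# Minimal sketch (assumed synthetic data): linear SVM where features (500)
# greatly outnumber samples (60), the setting described above.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=60, n_features=500, n_informative=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = SVC(kernel='linear', C=1.0)  # linear kernel suits high-dimensional data
clf.fit(X_train, y_train)
print('Test accuracy:', clf.score(X_test, y_test))
print('Support vectors per class:', clf.n_support_)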
4R2 b) The two important steps of Feature Engineering are Feature CO4
Selection and Feature Transformation.
5R2 c) Supervised learning is a type of machine learning where a model is CO5
trained using a labeled dataset. The goal is to learn a mapping from
inputs (features) to outputs (labels or targets) so that the model can
predict the output for new, unseen data.
4R2 d) Data refers to raw, unprocessed facts, figures, or information that CO4
can be collected, analyzed, and used for decision-making. It is the
foundation of information and knowledge in any field, often
represented in numeric, textual, visual, or auditory forms.
Types of Data
Data can be broadly classified by form into Structured Data and Unstructured Data; by its nature, it is further classified as Qualitative (Categorical) or Quantitative (Numerical) data.
Qualitative Data (Categorical Data)
Describes attributes or characteristics.
Types:
Nominal Data: Categories without an inherent order (e.g., colors:
red, blue, green).
Ordinal Data: Categories with a specific order (e.g., ratings: poor,
average, excellent).
Quantitative Data (Numerical Data)
Represents measurable quantities.
Types:
Discrete Data: Countable values (e.g., number of students in a
class).
Continuous Data: Any value within a range (e.g., height, weight,
temperature).
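A small illustrative sketch (the DataFrame below is assumed, not from the question) showing the four subtypes as pandas columns:

# Illustrative sketch (assumed data): the four data subtypes as pandas columns.
import pandas as pd

df = pd.DataFrame({
    'color': ['red', 'blue', 'green'],           # nominal: categories, no inherent order
    'rating': ['poor', 'average', 'excellent'],  # ordinal: ordered categories
    'students': [30, 42, 28],                    # discrete: countable values
    'height_cm': [162.5, 175.0, 168.3],          # continuous: any value within a range
})
# Mark the ordinal column's ordering explicitly
df['rating'] = pd.Categorical(df['rating'], categories=['poor', 'average', 'excellent'], ordered=True)
print(df.dtypes)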
5R2 e) Bayes' Theorem is a mathematical formula used to determine the CO5
conditional probability of an event based on prior knowledge of
conditions that might be related to the event. In machine learning,
it plays a critical role in probabilistic models, particularly in
classification problems.

The theorem is expressed as:

P(A|B) = [P(B|A) × P(A)] / P(B)

where P(A|B) is the posterior probability of event A given B, P(B|A) is the likelihood of B given A, P(A) is the prior probability of A, and P(B) is the marginal probability (evidence) of B.
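As a short worked illustration (all numbers below are assumed): a diagnostic test with 99% sensitivity and a 5% false-positive rate, for a disease with 1% prevalence:

# Worked example with assumed numbers: P(disease | positive test) via Bayes' theorem.
p_disease = 0.01             # prior P(A): 1% prevalence
p_pos_given_disease = 0.99   # likelihood P(B|A): test sensitivity
p_pos_given_healthy = 0.05   # false-positive rate P(B|not A)

# Evidence P(B) by the law of total probability
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Posterior P(A|B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f'P(disease | positive test) = {p_disease_given_pos:.3f}')  # about 0.167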
6R2 f) Regression analysis is a statistical and machine learning technique CO6


used to model the relationship between a dependent variable
(target) and one or more independent variables (features). Its
primary purpose is to predict continuous numerical values based on
the input data.
Nature of Target Variable:
Regression deals with continuous target variables.
Classification deals with categorical target variables.

Interpretation of Results:
Regression provides a quantitative prediction (e.g., temperature is
28.5°C).
Classification provides a qualitative label (e.g., "hot" or "cold").
Understanding whether the problem requires predicting a
continuous value or assigning categories helps determine whether
to use regression or classification techniques.
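A minimal sketch (synthetic data assumed) contrasting the two: a regressor predicts a continuous temperature, while a classifier predicts a "hot"/"cold" label:

# Minimal sketch (assumed toy data): regression vs. classification on the same feature.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[1], [2], [3], [4], [5]])                     # feature, e.g., hour of day
temps = np.array([22.0, 24.5, 27.0, 29.5, 32.0])            # continuous target
labels = np.array(['cold', 'cold', 'cold', 'hot', 'hot'])   # categorical target

reg = LinearRegression().fit(X, temps)
clf = LogisticRegression().fit(X, labels)

print(reg.predict([[6]]))  # quantitative prediction, e.g., ~34.5
print(clf.predict([[6]]))  # qualitative label, e.g., 'hot'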
4R2 g) Features CO4
Input variables or attributes used to make predictions.
Serve as the independent variables for training.
Labels
Output variable or target value to be predicted.
Represent the dependent variable (ground truth).
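A tiny sketch (the column names and values are assumed) of how features and labels are typically separated:

# Tiny sketch (assumed data): separating features (X) from the label (y).
import pandas as pd

df = pd.DataFrame({'age': [25, 40], 'income': [30000, 80000], 'purchased': ['No', 'Yes']})
X = df[['age', 'income']]  # features: independent variables used for prediction
y = df['purchased']        # label: dependent variable (ground truth)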

Q2 Attempt any THREE. 09 M


4U3 a) CO4
Normalization vs. Standardization:

1. Normalization rescales the data to a fixed range, typically [0, 1] or [-1, 1]; standardization centers the data around the mean with unit variance.
2. Normalization is useful when features have different scales but no specific assumptions about the distribution; standardization is useful when data follows a normal (Gaussian) distribution or is expected to have similar spread.
3. Normalization is sensitive to outliers, as they can distort the min-max range; standardization is less sensitive to outliers because it uses the mean and standard deviation.
4. Normalization has a fixed output range, typically [0, 1] or [-1, 1]; standardization has no fixed range and is typically centered around 0 with unit variance.
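As a minimal sketch of both techniques (the toy array is assumed), using scikit-learn's scalers:

# Minimal sketch (assumed toy data): min-max normalization vs. standardization.
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0], [2.0], [3.0], [100.0]])  # note the outlier at 100

print(MinMaxScaler().fit_transform(X).ravel())    # squeezed into [0, 1]; the outlier distorts the range
print(StandardScaler().fit_transform(X).ravel())  # mean 0, unit variance; less distorted by the outlier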

5A3 b) Data preprocessing is a crucial step in machine learning because it CO5


ensures that the raw data is transformed into a suitable format for
model training and evaluation. Raw data is often incomplete, noisy,
or inconsistent, and preprocessing helps address these issues,
improving the model's performance and reliability.
1. Improves Data Quality
Handles missing values, duplicate entries, and noisy data to ensure
a cleaner dataset.
Better quality data leads to more accurate and reliable predictions.
2. Ensures Consistency in Data
Aligns data formats, units, or scales (e.g., normalization,
standardization) to prevent bias toward certain features during
model training.
3. Facilitates Faster Convergence of Models
By transforming data into a more manageable form, preprocessing
helps machine learning algorithms converge faster during training.
4. Reduces Overfitting and Underfitting
Techniques like feature selection or dimensionality reduction (e.g.,
PCA) can remove irrelevant or redundant features, reducing the
risk of overfitting or underfitting.
5. Handles Class Imbalance
Balancing datasets (e.g., using oversampling or undersampling)
ensures that the model does not favor the majority class over the
minority class.
6. Enables Compatibility with Algorithms
Some algorithms require specific input formats (e.g., numeric data
for many models). Preprocessing ensures data compatibility by
encoding categorical variables or scaling numerical data.

Steps in Data Preprocessing:


Data Cleaning: Handling missing values, noise, and
inconsistencies.
Data Transformation: Scaling, normalizing, or encoding data.
Feature Selection: Selecting the most relevant features for the
model.
Feature Extraction: Creating new, meaningful features from existing ones.
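A condensed sketch of these steps on an assumed toy dataset (column names and values are illustrative only):

# Condensed sketch (assumed toy data) of the preprocessing steps above.
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    'age': [25, None, 35, 45],
    'city': ['Nagpur', 'Pune', 'Nagpur', 'Mumbai'],
    'income': [30000, 42000, 55000, None],
})

# Data cleaning: fill missing numeric values with the column median
df['age'] = df['age'].fillna(df['age'].median())
df['income'] = df['income'].fillna(df['income'].median())

# Data transformation: encode the categorical column, scale the numeric ones
df = pd.get_dummies(df, columns=['city'])
df[['age', 'income']] = StandardScaler().fit_transform(df[['age', 'income']])
print(df.head())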
6U3 c) CO6
Simple Linear Regression vs. Multiple Linear Regression:

1. Simple linear regression involves only one independent variable (feature); multiple linear regression involves two or more independent variables (features).
2. Simple linear regression is less complex due to a single predictor variable; multiple linear regression is more complex as it considers multiple predictors.
3. Simple linear regression analyzes the relationship between one feature and the target variable; multiple linear regression analyzes the combined influence of multiple features on the target variable.
4. Simple linear regression predicts a dependent variable using a single predictor (e.g., predicting sales based on advertising budget); multiple linear regression predicts a dependent variable using multiple predictors (e.g., predicting house prices based on size, location, and age).
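A minimal sketch (toy numbers assumed) fitting both with scikit-learn:

# Minimal sketch (assumed toy data): simple vs. multiple linear regression.
import numpy as np
from sklearn.linear_model import LinearRegression

# Simple: one predictor (advertising budget -> sales)
budget = np.array([[10], [20], [30], [40]])
sales = np.array([110, 190, 310, 390])
simple = LinearRegression().fit(budget, sales)

# Multiple: several predictors (size, age -> house price)
X = np.array([[1000, 10], [1500, 5], [2000, 20], [2500, 2]])
price = np.array([200, 320, 350, 520])
multiple = LinearRegression().fit(X, price)

print(simple.coef_, multiple.coef_)  # one coefficient vs. one coefficient per feature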

5U3 d) A Decision Tree is a popular and interpretable classification CO5


technique used in machine learning. It works by recursively
partitioning the data into subsets based on the feature values,
creating a tree-like structure. Each internal node represents a
feature, each branch represents a decision rule based on that
feature, and each leaf node represents the predicted class label.

How Decision Trees Work:


Feature Selection:
At each node, the decision tree algorithm selects the feature that
best splits the dataset into distinct classes. It chooses the feature
that maximizes a specific criterion, such as Gini impurity or
information gain (for classification tasks).

Recursive Partitioning:
The dataset is split based on the chosen feature, and this process
continues recursively for each subset until a stopping criterion is
met (e.g., maximum depth, minimum samples in a node, or no
further improvement in the split).

Class Assignment:
Once the tree reaches the leaf nodes, a class label is assigned to
each leaf based on the majority class in the data points that fall into
that leaf.
Overfitting: Decision trees can easily overfit the training data if
they grow too deep. Techniques like pruning (cutting back
branches of the tree) can help prevent overfitting.

Advantages of Decision Trees:


Interpretability: Decision trees are easy to understand and
visualize, making them user-friendly for both data scientists and
stakeholders.
No Need for Feature Scaling: Decision trees do not require
normalization or standardization of data since they are not sensitive
to the scale of the features.
Works with Both Numerical and Categorical Data: Decision trees
can handle both continuous and categorical variables.

Disadvantages of Decision Trees:


Overfitting: If the tree is too deep, it can memorize the training
data, leading to poor generalization to unseen data.
Instability: Small changes in the data can lead to a completely
different tree.
Bias Toward Features with More Levels: Decision trees may favor
features with more levels or categories, leading to biased splits.

Applications of Decision Trees:


Customer Segmentation: Identifying customer segments based on
purchasing behavior.
Medical Diagnosis: Classifying diseases based on patient
symptoms and test results.
Credit Scoring: Classifying loan applicants as high or low risk.
Example:
Imagine you're trying to classify whether a customer will buy a
product based on their age and income:

Node 1: If age ≤ 30, go to Node 2; if age > 30, go to Node 3.


Node 2: If income ≤ $50,000, predict "No"; if income > $50,000,
predict "Yes."
Node 3: If income ≤ $75,000, predict "Yes"; if income > $75,000,
predict "No."
The tree structure visually represents the decision rules used to
classify new instances based on the input features.
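A brief sketch (the data is assumed for illustration) fitting a depth-2 tree on the age/income example and printing its learned rules:

# Brief sketch (assumed toy data): a depth-2 decision tree on age and income.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

X = np.array([[25, 40000], [28, 60000], [35, 70000], [45, 90000], [22, 55000], [50, 80000]])
y = np.array(['No', 'Yes', 'Yes', 'No', 'Yes', 'No'])  # whether the customer buys

tree = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)
print(export_text(tree, feature_names=['age', 'income']))  # human-readable decision rules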

Q3 Attempt any ONE. 06 M


5A6 a) To build a Random Forest model to predict whether tennis will be CO5
played based on weather conditions, follow these steps. I will guide
you through the typical procedure of building and evaluating a
Random Forest classifier:

1. Dataset Understanding
Assume that you are provided with a dataset where each row
contains weather features and a target variable indicating whether
tennis was played. Here's an example of the dataset structure:

Day | Temperature | Humidity | Wind | Outlook | PlayTennis (Target)
1 | 30°C | 80% | No | Sunny | Yes
2 | 25°C | 70% | Yes | Overcast | Yes
3 | 20°C | 90% | No | Rainy | No
4 | 35°C | 60% | Yes | Sunny | No
... | ... | ... | ... | ... | ...
2. Data Preprocessing
You need to prepare the data before building the model:

Handle missing data: Check if there are any missing values and
handle them by removing or filling them.
Encode categorical variables: The Outlook and Wind columns are
categorical. You need to convert these into numerical values using
encoding techniques like One-Hot Encoding.
Feature scaling: Random Forest is not sensitive to feature scaling,
but if you are using other models, scaling might be necessary.

3. Splitting the Data


You’ll split the data into training and test sets. A typical split is
70-80% for training and 20-30% for testing.

4. Model Building
We will use the Random Forest Classifier for this task, which is an
ensemble of decision trees. Here's how you would build the model:

# Step 1: Import Libraries


import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Step 2: Load the dataset


# Example dataset as a DataFrame
data = pd.DataFrame({
'Temperature': [30, 25, 20, 35, 28],
'Humidity': [80, 70, 90, 60, 75],
'Wind': ['No', 'Yes', 'No', 'Yes', 'No'],
'Outlook': ['Sunny', 'Overcast', 'Rainy', 'Sunny', 'Overcast'],
'PlayTennis': ['Yes', 'Yes', 'No', 'No', 'Yes']
})

# Step 3: Encode categorical features (Outlook, Wind)


le = LabelEncoder()
data['Wind'] = le.fit_transform(data['Wind'])
data['Outlook'] = le.fit_transform(data['Outlook'])

# Step 4: Split the dataset into features and target


X = data.drop('PlayTennis', axis=1) # Features
y = data['PlayTennis'] # Target

# Step 5: Split into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Step 6: Build the Random Forest Model


model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Step 7: Make predictions


y_pred = model.predict(X_test)

# Step 8: Evaluate the model


accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy:.2f}')
print('Confusion Matrix:\n', conf_matrix)
print('Classification Report:\n', class_report)

5. Explanation of the Code:


Label Encoding: Categorical variables (Wind and Outlook) are
encoded into numerical values using LabelEncoder so they can be
used by the model.
Train-Test Split: The dataset is divided into training (70%) and test
(30%) sets using train_test_split.
Random Forest Model: A RandomForestClassifier with 100 trees
(n_estimators=100) is trained on the training data.
Model Evaluation: The model's accuracy is calculated, and other
metrics such as confusion matrix and classification report
(precision, recall, F1-score) are printed.

6. Model Evaluation
Accuracy gives you the overall performance of the model.
Confusion Matrix shows the number of true positives, true
negatives, false positives, and false negatives, which is helpful in
understanding model errors.
Classification Report provides precision, recall, and F1-score for
both classes (Yes/No in this case).

7. Advantages of Random Forest for This Task:


Ensemble Method: Random Forest is an ensemble method that
combines the output of multiple decision trees to improve
predictive performance and reduce overfitting.
Handles Non-linear Data: It can handle complex relationships in
the data without needing explicit feature engineering.
Feature Importance: Random Forest can give insights into the
importance of each feature, which can help in model
interpretability.
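Continuing the example above, the importances can be read directly off the fitted model (a brief illustrative snippet):

# Brief snippet: inspect feature importances of the fitted Random Forest above.
for name, importance in zip(X.columns, model.feature_importances_):
    print(f'{name}: {importance:.3f}')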
Conclusion:
Using a Random Forest model for predicting whether tennis will be
played based on weather features can provide accurate predictions.
By preprocessing the data, encoding categorical variables, splitting
the data into training and testing sets, and evaluating the model
using appropriate metrics, you can effectively build and assess the
model's performance.
6A6 b) To implement Logistic Regression on the Iris dataset and classify CO6
whether a flower is of type "Setosa" or "Not Setosa," we can follow
these steps:

1. Import the Necessary Libraries


We'll use libraries like scikit-learn for dataset handling,
preprocessing, and model building.

2. Load the Iris Dataset


The Iris dataset is readily available in scikit-learn. We will classify
the flower as "Setosa" or "Not Setosa" based on the species
column.

3. Preprocess the Data


We'll create a binary classification problem by converting the target
variable into two classes: "Setosa" (label 1) and "Not Setosa" (label
0).

4. Build and Train the Logistic Regression Model


5. Evaluate the Model
Here’s how you can implement this:

# Step 1: Import necessary libraries


import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Step 2: Load the Iris dataset


iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['Species'] = iris.target
# Step 3: Convert target variable to binary classification (Setosa vs Not Setosa)
df['IsSetosa'] = df['Species'].apply(lambda x: 1 if x == 0 else 0)  # 1: Setosa, 0: Not Setosa

# Step 4: Define the feature matrix (X) and target vector (y)
X = df[iris.feature_names]  # Features: sepal length, sepal width, petal length, petal width
y = df['IsSetosa'] # Target: 1 if Setosa, 0 otherwise

# Step 5: Split the dataset into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Step 6: Initialize and train the Logistic Regression model


model = LogisticRegression()
model.fit(X_train, y_train)

# Step 7: Make predictions on the test set


y_pred = model.predict(X_test)

# Step 8: Evaluate the model


accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)

# Print evaluation results


print(f'Accuracy: {accuracy:.2f}')
print('Confusion Matrix:\n', conf_matrix)
print('Classification Report:\n', class_report)

Explanation of the Code:


Data Loading:
We load the Iris dataset using load_iris() from sklearn.datasets.
We create a DataFrame (df) containing the Iris dataset's features (iris.data) and target labels (iris.target).

Target Variable Transformation:
We convert the Species column into a binary classification column (IsSetosa), where:
"Setosa" (species label 0) is represented as 1.
All other species (versicolor and virginica) are represented as 0.

Feature Matrix (X) and Target Vector (y):
X is the feature matrix containing the four measurements (sepal length, sepal width, petal length, and petal width).
y is the binary target variable (IsSetosa).

Train-Test Split:
We split the data into training and test sets using train_test_split() with 30% of the data reserved for testing.

Logistic Regression Model:
We initialize a Logistic Regression model and fit it on the training data (X_train, y_train).

Predictions and Evaluation:
We use the trained model to make predictions on the test set (X_test).
We evaluate the model's performance using accuracy_score, confusion_matrix, and classification_report.
Output:
The model will output the following:

Accuracy: The overall accuracy of the model.
Confusion Matrix: A matrix that shows the counts of true positives, false positives, true negatives, and false negatives.
Classification Report: Includes metrics such as precision, recall, and F1-score.
Key Takeaways:
The logistic regression model will predict whether a flower is
"Setosa" (1) or "Not Setosa" (0).
The output includes performance metrics such as accuracy,
precision, recall, and F1-score, providing insight into how well the
model is performing.

Course Outcomes
CO4 Apply feature engineering on dataset.
CO5 Apply classification algorithm on dataset.
CO6 Apply regression algorithm on dataset.
