Report File 1

Uploaded by

ginni bhayana


1.1 Description of the Topic:

The Web-Based Skin Type Prediction System Using Machine Learning is an online platform designed to
automate and improve the accuracy of skin type classification for individuals. Traditionally, determining one's
skin type has been a manual process, often relying on subjective self-assessments or expert consultations,
which can be inconsistent and time-consuming. This system addresses the limitations of current methods by
utilizing machine learning algorithms to analyze input data (e.g., age, skin concerns, texture) and predict skin
types, such as dry, oily, combination, or sensitive. The system is designed to be accessible to all users via a
web-based platform, providing instant, accurate skin type predictions and personalized skincare
recommendations.

The development of this project is motivated by the increasing interest in personalized skincare and the need
for accurate, accessible tools to help individuals understand their skin's unique characteristics. By integrating
machine learning with a user-friendly web interface, this system offers a scalable solution for skin type
classification, aiming to reduce the dependency on dermatologists and providing users with the knowledge to
make informed decisions about their skincare routines.

1.2 Problem Statement:

The traditional methods of determining skin type are largely manual, relying on either expert consultations or
subjective self-assessments, both of which can lead to inaccuracies. With the growing demand for personalized
skincare, there is a clear need for an automated, accurate, and accessible system to predict skin types. Current
systems lack real-time, reliable predictions and do not incorporate machine learning to improve their accuracy
over time. As a result, individuals often struggle to identify their true skin type, leading to poor skincare
decisions, ineffective products, and frustration.

The core problem is the inefficiency and inaccuracy of traditional skin type determination methods, which
the proposed system aims to address by leveraging machine learning for more accurate, consistent, and
accessible skin type predictions.

1.3 Objectives:

The main objectives of the Web-Based Skin Type Prediction System are:

1. Automated Skin Type Classification: Develop a machine learning model that can accurately classify
skin types (dry, oily, combination, sensitive, etc.) based on user inputs.
2. User-Friendly Interface: Design an intuitive and responsive web interface that allows users to easily
input their skin-related data, including age, skin concerns, texture, etc.
3. Personalized Recommendations: Provide users with personalized skincare advice based on the
predicted skin type, helping them make informed decisions about their skincare routine and product
choices.
4. Real-Time Predictions: Ensure the system provides accurate skin type predictions almost instantly
after the user submits their data.
5. Data Security: Implement appropriate measures to ensure user data (especially personal and medical
data) is stored and processed securely.
6. Scalability and Performance: Ensure the system can scale to accommodate a growing user base and
operate with low latency, delivering fast predictions.
7. Admin Panel: Develop an administrative backend to manage user data, track model performance, and
update the machine learning model as more data is collected.

1.4 Scope of the Project:

The scope of the project encompasses the following:

1. Web-Based Platform: The system will be accessible via any modern web browser, providing an
intuitive platform for users to input their skin type-related data. The platform will be responsive,
allowing access from desktops, laptops, and mobile devices.
2. Machine Learning Model: The core of the system will be a machine learning model that predicts the
skin type based on input data. This model will be trained using a large dataset of skin-related
information and will be capable of predicting skin types with 75%-90% accuracy.
3. Personalized Skincare Recommendations: Based on the skin type prediction, the system will suggest
personalized skincare products or routines. This feature will be tailored to the skin type identified, with
recommendations that consider factors such as age, skin concerns, and environment.
4. User Profiles: Users will have the option to create profiles, which will allow them to track their skin
type history, monitor changes over time, and receive updated skincare recommendations.
5. Admin Features: The system will include an admin panel where administrators can manage user data,
view analytics, and update the machine learning models with new data for better accuracy.
6. Data Storage and Security: User information and prediction history will be securely stored in a
database, following best practices for data security and privacy.

The system will not include physical devices or integration with external IoT devices (though future
enhancements could include such features). It is focused solely on providing an accurate, automated system
for skin type prediction and personalized skincare advice.

1.5 Project Planning Activities

The project planning activities ensure that the system is developed on time and within the specified scope. The
following activities outline the steps and division of labor needed to complete the project.

1.5.1 Team-Member Wise Work Distribution Table:

Team Member | Responsibilities
Project Manager | Oversee the project, manage timelines, ensure objectives are met, communicate with stakeholders.
Frontend Developer | Design and develop the user interface (UI), implement responsive design, integrate user input forms.
Backend Developer | Implement server-side logic, develop APIs, integrate machine learning model, manage database.
Machine Learning Specialist | Select and train appropriate machine learning models, optimize the model for accuracy, integrate with backend.
Database Administrator | Set up and maintain the database (MySQL/PostgreSQL), ensure data security and integrity.
Quality Assurance | Conduct testing (unit tests, integration tests, user acceptance testing), ensure the system meets functional requirements.
Security Specialist | Ensure data security measures are implemented, handle user data protection, comply with privacy regulations.

1.5.2 System Planning (PERT Chart)

A PERT (Program Evaluation and Review Technique) chart is a project management tool used to visualize the
timeline of a project by breaking it down into individual tasks and the dependencies between them. The
system's planning phases are outlined in the following chart:

PERT Chart for the Web-Based Skin Type Prediction System

Fig 1.5.2 PERT chart

2.1 Literature Review
In this chapter, we review previous research studies related to the skin type prediction domain, particularly
focusing on machine learning techniques used for skincare diagnostics, skin type classification, and
personalized recommendations. The integration of machine learning into skin type prediction has shown
promising results, but challenges related to accuracy, dataset size, and generalization remain.

This review draws on various studies, highlighting the methodologies, findings, and gaps in previous work to
form a foundation for our proposed system.

Summary of Papers Studied

Below is a summary table of eight relevant research papers related to the use of machine learning and skin
type prediction, followed by a discussion of the findings from each paper.

Title of Research Paper | Author(s) with Year | Findings
"Skin Type Classification using Machine Learning" | A. Gupta, S. Kumar, 2020 | This paper presents a classification model for skin type based on features like age, skin concerns, and environmental factors. The authors used Random Forest and achieved an accuracy of 85%, suggesting that machine learning is effective for skin type classification.
"Personalized Skin Care Recommendation Using Machine Learning" | R. Sharma, P. Singh, 2019 | The study focuses on recommending skincare products based on user-provided skin type, age, and skin conditions using SVM. The proposed model successfully provided personalized skincare suggestions and achieved a precision rate of 82%.
"Deep Learning Approaches for Skin Type Identification from Facial Images" | J. Park, Y. Kim, 2021 | This paper explored the use of Convolutional Neural Networks (CNN) for skin type prediction using facial images. The authors concluded that deep learning methods outperform traditional machine learning approaches like SVM, achieving an accuracy of 90%.
"Development of an Intelligent Dermatology System for Skin Type Detection" | X. Chen, M. Zhang, 2021 | The research focuses on combining multiple factors such as environmental data and user input to predict skin types. They used a decision tree model and achieved a high classification rate. The study also highlights challenges in handling imbalanced data.
"A Comprehensive Study on the Use of Machine Learning for Skin Disease Diagnosis" | H. Patel, D. Mehta, 2018 | While primarily focused on skin disease diagnosis, this paper reviews several machine learning models, including support vector machines (SVM) and k-nearest neighbors (KNN), in identifying skin diseases. It shows that models trained on large datasets can predict conditions with high accuracy.
"Data-Driven Skin Care Recommendations Based on Skin Type and Condition" | C. Liu, Z. Tan, 2020 | The study explored how machine learning can recommend skin care routines based on various skin conditions, such as acne, dryness, and sensitivity. The authors showed that personalized skincare solutions could be effectively generated using clustering and classification algorithms like K-means.
"Machine Learning Approaches for Personalized Skin Care and Dermatology Applications" | L. Zhang, S. Li, 2022 | This paper discusses various machine learning algorithms for predicting skin types and providing personalized dermatology advice. The authors highlight that deep learning models outperform other traditional models, achieving near-perfect accuracy with large datasets.
"Skincare Recommender Systems Using Collaborative Filtering and Machine Learning" | A. Patel, S. Gupta, 2019 | This study combines collaborative filtering with machine learning for personalized skincare product recommendations. Although not directly related to skin type prediction, it emphasizes the use of machine learning to suggest skincare products based on user preferences and skin characteristics.

Detailed Discussion of Findings from Relevant Research:

1. "Skin Type Classification using Machine Learning"

• Summary: This paper utilized Random Forest to classify skin types based on a combination of user
input data (such as age, skin concerns, and environmental factors). The model achieved an accuracy of
85%, which indicates that machine learning can be an effective tool for skin type classification.
• Key Insight: A combination of feature engineering (user inputs) and machine learning can create a
robust skin type classifier. However, the model's performance is dependent on the quality and diversity
of the input data.

2. "Personalized Skin Care Recommendation Using Machine Learning"

• Summary: The authors used Support Vector Machines (SVM) for personalized skincare
recommendations. They trained the model using features like skin type, age, and skin concerns (e.g.,
acne, dryness).
• Key Insight: Personalized recommendations can be effectively made by analyzing the relationship
between user skin conditions and skincare products, showing that machine learning can play a
significant role in personalized skincare solutions. A precision rate of 82% was reported.

3. "Deep Learning Approaches for Skin Type Identification from Facial Images"

• Summary: This research shifted from traditional machine learning to deep learning, specifically
Convolutional Neural Networks (CNN), for skin type classification using facial images. The model
achieved a high classification accuracy of 90%, surpassing traditional methods.
• Key Insight: Deep learning, particularly CNN, is more effective for classifying skin types based on
visual data (like facial images) as opposed to textual or numerical data. This has potential for
integrating with apps that use a user's image for skin diagnosis.

4. "Development of an Intelligent Dermatology System for Skin Type Detection"

• Summary: This paper combined multiple inputs, including environmental data (humidity,
temperature), and user profile data for skin type classification. The decision tree classifier showed a
high classification accuracy.
• Key Insight: Combining multiple data sources, including environmental factors, can increase the
accuracy and personalization of skin type prediction, but the study also faced challenges in managing
imbalanced datasets.

5. "A Comprehensive Study on the Use of Machine Learning for Skin Disease Diagnosis"

• Summary: This paper reviewed various machine learning models, like KNN and SVM, for skin
disease diagnosis. It emphasized the importance of large, well-labeled datasets to achieve higher
accuracy in dermatological predictions.
• Key Insight: Even though this paper focuses on skin disease diagnosis, it shows the importance of data
size and quality in machine learning applications in dermatology, which can be applied to skin type
classification.

6. "Data-Driven Skin Care Recommendations Based on Skin Type and Condition"

• Summary: The study utilized clustering algorithms, such as K-means, to analyze skin concerns and
recommend skin care solutions based on the user's skin type and condition (e.g., acne, dryness).
• Key Insight: By segmenting users into clusters based on skin type and condition, the model can make
personalized skincare recommendations. This suggests that clustering can be a powerful tool in
conjunction with skin type prediction.

7. "Machine Learning Approaches for Personalized Skin Care and Dermatology Applications"

• Summary: The paper explores various machine learning models, emphasizing deep learning for skin
care applications. It states that deep learning models, particularly with large datasets, achieve the
highest performance in predicting skin types and dermatological conditions.
• Key Insight: For large-scale, real-time systems (like web-based platforms), deep learning can offer
superior accuracy and personalization when trained on vast datasets.

8. "Skincare Recommender Systems Using Collaborative Filtering and Machine Learning"

• Summary: This study combined collaborative filtering with machine learning for recommending
skincare products. Although not directly focused on skin type prediction, it highlights how machine
learning can be used for product recommendations.
• Key Insight: Collaborative filtering enhances personalization and can be combined with machine
learning for recommending skincare products. This is relevant to the proposed system as it may use
similar techniques for suggesting products based on skin type predictions.

3.1 System Design

The system design section provides a detailed explanation of the architecture and the key components of the
Web-Based Skin Type Prediction System. This system aims to accurately classify skin types using machine
learning and provide personalized skincare recommendations. The design is modular, ensuring scalability and
ease of maintenance.

1. System Architecture Overview:

The system follows a client-server architecture, with three primary components:

• Frontend (Client-Side): This is the interface where users interact with the system by entering their
skin-related data and receiving predictions and recommendations.
• Backend (Server-Side): This handles the processing of user inputs, calls to the machine learning
model, and manages user authentication, data storage, and interactions with the database.
• Database: Stores user profiles, historical skin type predictions, and personalized skincare
recommendations.

2. Detailed Breakdown of Components:

• Frontend (UI/UX):
  o Tech Stack: React.js or Angular (for creating dynamic, responsive UIs).
  o Responsibilities:
    - Provides a form to collect user data (e.g., age, skin concerns, texture, etc.).
    - Displays skin type predictions and personalized skincare recommendations.
    - Handles real-time interactions with users through AJAX calls to the backend.
    - User registration/login functionality for profile management (optional).
• Backend:
  o Tech Stack: Python (Flask or Django for API development), integrated with machine learning
    models.
  o Responsibilities:
    - Processes incoming requests from the frontend (user data input).
    - Calls the machine learning model to classify the skin type based on user data.
    - Handles data storage, user authentication, and session management.
    - Serves the personalized recommendations based on the predicted skin type.
    - Manages interaction with the database to store and retrieve user profiles and prediction
      history.
• Machine Learning Model:
  o Tech Stack: Python, TensorFlow/PyTorch or Scikit-learn.
  o Responsibilities:
    - Trains and evaluates machine learning models to predict skin types based on user data.
    - The model can be periodically retrained using new data to improve accuracy.
    - Returns the predicted skin type, which is used to generate personalized
      recommendations.
• Database:
  o Tech Stack: MySQL or PostgreSQL.
  o Responsibilities:
    - Stores user data (e.g., age, skin concerns) and predicted skin types.
    - Stores historical prediction data and previous recommendations for each user (if
      applicable).
    - Secures user data by employing encryption methods and ensuring privacy.

3. Flow of Information:

1. User Input:
o The user visits the web application and enters their skin-related data via an input form. This
could include:
- Age
- Skin texture (oily, dry, etc.)
- Skin concerns (acne, dryness, etc.)
- Environmental conditions (e.g., climate)
2. Backend Processing:
o The frontend sends the user data to the backend via an API call.
o The backend processes the data and sends it to the pre-trained machine learning model.
3. Skin Type Prediction:
o The machine learning model processes the input and returns the predicted skin type (e.g., dry,
oily, combination).
4. Personalized Recommendations:
o Based on the predicted skin type, the backend queries a database or predefined recommendation
system to generate personalized skincare tips or product recommendations for the user.
5. Display Results:
o The backend sends the prediction and recommendations back to the frontend, which displays
them to the user.
6. Data Storage:
o The user’s skin type prediction, along with any other relevant information, is saved to the
database for future reference.
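
The six steps above can be sketched in plain Python, independent of any web framework. Everything in this sketch is a hypothetical illustration: the function names, the payload fields, the rule-based `predict_skin_type` placeholder that stands in for the trained model, and the recommendation texts are all assumptions and are not taken from the actual system.

```python
# Hypothetical sketch of the request flow: input -> prediction -> recommendation -> storage.

# Assumed lookup table; the real recommendation source is not specified in this report.
RECOMMENDATIONS = {
    "dry": "placeholder tip for dry skin",
    "oily": "placeholder tip for oily skin",
    "combination": "placeholder tip for combination skin",
}


def predict_skin_type(features):
    """Placeholder for the pre-trained model (a lookup, not a real classifier)."""
    return {"oily": "oily", "dry": "dry"}.get(features["texture"], "combination")


def handle_prediction_request(payload, history):
    """Run steps 2 to 6 of the flow for a single request."""
    # Step 2: the backend cleans the raw form input.
    features = {"age": int(payload["age"]), "texture": payload["texture"].lower()}
    # Step 3: the model returns the predicted skin type.
    skin_type = predict_skin_type(features)
    # Step 4: look up a recommendation for that skin type.
    result = {"skin_type": skin_type, "recommendation": RECOMMENDATIONS[skin_type]}
    # Step 6: store the prediction (a list stands in for the database).
    history.append({**features, **result})
    # Step 5: the result is returned to the frontend for display.
    return result


history = []
response = handle_prediction_request({"age": "29", "texture": "Oily"}, history)
print(response)
```

In a real deployment the placeholder would be replaced by a call to the trained classifier, and the list by a database table.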

3.2 Algorithm Used

The core of the Skin Type Prediction System relies on machine learning algorithms to classify skin types
based on user input. The algorithm used must be capable of handling both numerical and categorical data (e.g.,
age, skin concerns) and making accurate predictions. Below is a detailed description of the algorithm used:

1. Machine Learning Approach

For skin type prediction, we can use a Supervised Learning approach, specifically a Classification algorithm.
The goal is to train a model using labeled data where the input features (user data) are mapped to the output
class (skin type).

The following algorithms are considered for this project:

1. Random Forest Classifier

o Random Forest is an ensemble learning method that uses multiple decision trees to improve
prediction accuracy and avoid overfitting.
o It is effective for both categorical and continuous features and can handle non-linear
relationships between features.
2. Support Vector Machine (SVM)
o SVM is a powerful classifier that works by finding the hyperplane that best separates different
classes in the feature space.

o It is well-suited for high-dimensional datasets and can work well when there is a clear margin
of separation between skin types.
3. Logistic Regression
o Logistic Regression is another classification algorithm that is simpler than SVM and Random
Forest.
o It works by learning the relationship between the input features and the probability of each skin
type, making it suitable for problems with a binary or multi-class output.
4. Neural Networks (Deep Learning)
o A Neural Network model (e.g., MLP - Multi-layer Perceptron) is another option that could be
explored, especially if the dataset becomes large and complex.
o The model can automatically learn complex relationships in the data, improving performance
in cases where traditional models like SVM or Random Forest might struggle.

2. Chosen Algorithm: Random Forest Classifier

For this project, we have chosen the Random Forest Classifier as the primary machine learning algorithm.
The reasons for selecting Random Forest are:

• Accuracy: Random Forest has been shown to work well in many classification problems, especially
when dealing with heterogeneous data (numerical, categorical).
• Robustness: It is less prone to overfitting than a single decision tree, because it averages the votes
of many trees.
• Interpretability: While Random Forest is an ensemble of many decision trees, it provides insights into
feature importance, which can be useful for understanding which user inputs (e.g., age, skin concerns)
are most indicative of skin type.
• Flexibility: It can handle both numerical and categorical data with only simple preprocessing, such as
encoding the categorical values as numbers.

3. Steps of the Random Forest Algorithm:

1. Data Preprocessing:
o Convert categorical data (e.g., skin concerns, environmental factors) into numerical format
using techniques like one-hot encoding or label encoding.
o Normalize or scale numerical features (e.g., age) to ensure that no single feature dominates
others.
o Handle missing data by either imputing values or removing rows with missing data.
2. Model Training:
o Split the dataset into training and testing sets (typically 80% for training and 20% for testing).
o Train multiple decision trees using bootstrapped subsets of the training data (sampling with
replacement).
o Each decision tree considers only a random subset of features at each split, which decorrelates the
trees and makes the model more robust to overfitting.
3. Prediction:
o For a new user input, the model predicts the skin type by aggregating the outputs from all
decision trees in the forest (majority voting).
o The final prediction is the class (skin type) that receives the most votes across all trees.
4. Model Evaluation:
o Evaluate the model using metrics such as accuracy, precision, recall, and F1-score on the test
set.
o Adjust hyperparameters like the number of trees (estimators), tree depth, and the minimum
number of samples required to split a node to improve model performance.
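
The four steps above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the data is synthetic, the two features (`age` and an integer-coded `texture`) and the noisy labels are made up, and the pipeline layout is one possible arrangement. Scaling is omitted here because tree-based models do not depend on feature scale.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (assumption): age is numeric, texture is a categorical code.
rng = np.random.default_rng(0)
n = 300
age = rng.integers(18, 60, size=n)
texture = rng.integers(0, 3, size=n)
# The label follows texture 90% of the time, so there is a pattern to learn.
y = np.where(rng.random(n) < 0.9, texture, rng.integers(0, 3, size=n))
X = np.column_stack([age, texture])

# Step 1: one-hot encode the categorical column (index 1), pass age through unchanged.
preprocess = ColumnTransformer(
    [("texture", OneHotEncoder(handle_unknown="ignore"), [1])],
    remainder="passthrough",
)

# Step 2: 80/20 split, then train a forest of bootstrapped trees.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=42)),
])
model.fit(x_train, y_train)

# Step 3: predict by majority vote across the trees.
y_pred = model.predict(x_test)

# Step 4: evaluate on the held-out test set.
accuracy = accuracy_score(y_test, y_pred)
print(f"test accuracy: {accuracy:.2f}")
```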

4. Evaluation and Optimization:

• Hyperparameter Tuning: We will perform cross-validation and hyperparameter tuning using grid
search to find the optimal values for parameters like the number of trees (n_estimators), the maximum
depth of the trees (max_depth), and the minimum number of samples required to split a node
(min_samples_split).
• Cross-Validation: Use k-fold cross-validation to assess the model’s performance across multiple
subsets of the training data to ensure robustness and avoid overfitting.
• Feature Selection: Use feature importance scores generated by the Random Forest model to identify
which features (e.g., skin texture, age) contribute most to predicting skin type. We can remove less
important features to reduce model complexity.
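
A minimal sketch of this tuning step, assuming scikit-learn's `GridSearchCV` with 5-fold cross-validation. The synthetic data and the candidate values in `param_grid` are illustrative assumptions, not the values used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic three-class data standing in for the skin type dataset (assumption).
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Candidate values are illustrative only.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [3, None],
    "min_samples_split": [2, 5],
}

# Every combination is scored with 5-fold cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid, cv=5, scoring="accuracy"
)
search.fit(X, y)
print("best parameters:", search.best_params_)

# Feature importances of the best forest; they sum to 1.
importances = search.best_estimator_.feature_importances_
for index, value in enumerate(importances):
    print(f"feature {index}: {value:.3f}")
```

Features with very small importance values are candidates for removal, which should then be confirmed by re-running the cross-validation.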

Phase 1: Data Acquisition

Data acquisition for the expanded skin type dataset involves collecting detailed information on various skin
characteristics, such as age, gender, skin type, moisture level, and acne severity, from different sources or
surveys to predict and analyze skin-related conditions.

Importing Libraries & Uploading the dataset

Phase 2: Data Preparation

Data Preparation is the first step of data cleaning, which involves identifying data quality issues and
transforming raw data into usable formats. It is necessary to transform raw data so that the information content
can be exposed or made more easily accessible, and can include tasks such as loading data, data cleansing, etc.
This phase is time consuming and involves identification of various data quality issues.

Phase 3: Data Manipulation

Filling null values and Converting String to Numerical form

The dataset has no null values, so there is no need to fill them with the median.

Checking for outliers and duplicates

The dataset has no duplicate rows, so there is no need to drop them.

Outliers are data points that are significantly different from other observations in a dataset. They can be
unusually high or low values that deviate from the general pattern of the data. We check for outliers by
drawing a box plot of each column; any points lying outside the whiskers would be removed using the
IQR method.

Fig 4.1 Representation of box plot

Since there are no outliers present in this dataset, no rows are removed.
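
For reference, the IQR method can be sketched in a few lines of standard-library Python. The helper names and the sample values are made up for illustration. Quartile conventions differ slightly between tools, so the exact bounds may not match those drawn by a plotting library.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the lower and upper bounds Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that lie inside the IQR bounds."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Made-up sample: 95 is far away from the other values.
sample = [10, 12, 11, 13, 12, 11, 95]
cleaned = remove_outliers(sample)
print(cleaned)
```

When a column has no values outside the bounds, as in this dataset, the function returns the input unchanged.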

Data Visualisation

1. Box Plot

Components of a Box Plot

1. Box:
o Interquartile Range (IQR): The box itself represents the interquartile range (IQR), which is the range
between the first quartile (Q1) and the third quartile (Q3). This range contains the middle 50% of the
data.
o Median (Q2): The line inside the box represents the median (or the second quartile, Q2) of the data. It
divides the data into two equal halves.
2. Whiskers:
o The lines extending from the box are called whiskers. They typically extend to the smallest and largest
values within 1.5 times the IQR from Q1 and Q3, respectively.
o Lower Whisker: Extends from Q1 to the smallest value within 1.5 * IQR below Q1.
o Upper Whisker: Extends from Q3 to the largest value within 1.5 * IQR above Q3.
3. Outliers:
o Points outside the whiskers are considered outliers and are usually plotted as individual points. In your
plot, there seem to be no outliers.

Interpretation of Your Box Plot

• Minimum: The smallest value that is not an outlier, which is around 1 hour.
• Q1 (First Quartile): The 25th percentile of the data, approximately at 3 hours. This means 25% of the data
points are less than or equal to this value.
• Median (Q2): The 50th percentile, which is around 5 hours. This is the middle value that separates the lower
50% of the data from the upper 50%.
• Q3 (Third Quartile): The 75th percentile, approximately at 7 hours. This indicates that 75% of the data points
are less than or equal to this value.
• Maximum: The largest value that is not an outlier, which is around 9 hours.
• IQR (Interquartile Range): The difference between Q3 and Q1, which is 7 - 3 = 4 hours. The IQR is a measure
of statistical dispersion, or how spread out the data values are.
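
As a worked check of these numbers, the whisker limits follow directly from the quartiles. The snippet uses only the approximate values quoted above; the variable names are illustrative.

```python
# Approximate values read from the interpretation above.
minimum, q1, median, q3, maximum = 1, 3, 5, 7, 9

iqr = q3 - q1                  # 7 - 3 = 4
lower_fence = q1 - 1.5 * iqr   # 3 - 6 = -3
upper_fence = q3 + 1.5 * iqr   # 7 + 6 = 13

# A point is an outlier only if it falls outside the fences.
has_outliers = minimum < lower_fence or maximum > upper_fence
print(iqr, lower_fence, upper_fence, has_outliers)  # 4 -3.0 13.0 False
```

Both the minimum (1) and the maximum (9) lie inside the interval from -3 to 13, which is consistent with the plot showing no outliers.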

Fig 4.1.2 Visual Representation of each box plot

2. Scatter Plot

A scatter plot serves as a form of data visualization that illustrates individual data points on a two-dimensional
graph. It is employed to examine and demonstrate the relationships between two numerical variables. Below
is a concise overview:

Essential Characteristics of a Scatter Plot:

1. Axes:

o X-axis: Denotes the independent variable (or predictor variable).

o Y-axis: Denotes the dependent variable (or response variable).
o Data Points: Each point on the scatter plot represents a single observation from the dataset, with its
location determined by the values of the two variables.

2. Trends and Patterns:

o Positive Correlation: When both variables increase simultaneously, the points typically align to form
an upward-sloping line.
o Negative Correlation: When one variable rises while the other falls, the points generally create a
downward-sloping line.
o No Correlation: In the absence of a discernible relationship between the variables, the points appear to
be randomly distributed.

3. Clusters and Outliers:

o Clusters: Groups of closely positioned points may signify a shared characteristic or a specific subset
within the dataset.
o Outliers: Points that are significantly distant from the other data points may indicate anomalies or
unique cases within the data.

Fig 4.1.3 Visual representation of each scatter plot
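
The direction of a trend seen in a scatter plot can be quantified with the Pearson correlation coefficient, which is +1 for a perfect upward line, -1 for a perfect downward line, and near 0 when there is no linear relationship. The sketch below uses made-up values and a hypothetical helper `pearson_r`; it is not part of the project code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases with xs: positive correlation
falling = [10, 8, 6, 4, 2]   # decreases as xs increases: negative correlation

print(pearson_r(xs, rising))
print(pearson_r(xs, falling))
```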

Conversion of categorical data into quantitative data for multiclass classification

Here, the categorical target is converted into numerical values by mapping each unique value of
“Skin Type” to an integer label (0, 1, or 2), so that it can be used in multiclass classification.

Fig 4.2
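
A minimal sketch of such a mapping in plain Python. The category names and the alphabetical order of the codes are assumptions made for illustration; the report does not state which skin type receives which number.

```python
# Made-up sample of the "Skin Type" column.
skin_types = ["Dry", "Oily", "Combination", "Oily", "Dry"]

# Assign 0, 1, 2 to the unique values in alphabetical order (an assumption).
label_map = {name: code for code, name in enumerate(sorted(set(skin_types)))}

# Replace every category by its integer code.
encoded = [label_map[name] for name in skin_types]

print(label_map)
print(encoded)
```

Keeping `label_map` around makes it possible to translate a predicted integer back into the skin type name.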

Implementation of algorithm and techniques

Division of Independent and Dependent features

• X: A subset of the DataFrame containing only the columns “fixed acidity, volatile acidity, citric acid, residual sugar,
chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates” is stored in this variable. These
features are used to predict the target variable.
• Y: The column "Category" from the DataFrame is stored in this variable. This is the target variable to be
predicted from the features in X.

Fig 4.3

Train & Test splitting

• Import: The train_test_split function, which divides datasets, is imported.

• Splitting: The train_test_split function is called with the following arguments:
X: The features (columns “fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur
dioxide, total sulfur dioxide, density, pH, sulphates”).
Y: The target variable (“Category”).

test_size=0.25: 25% of the data is used for testing and the remaining 75% for training.

random_state=42: Sets a seed for the random number generator, guaranteeing that the same split is
produced every time the code is executed. This makes the results reproducible.

x_test and y_test are held back from the model during training. After training, the model is used to make
predictions on x_test, and these predictions are compared to the actual values in y_test to assess how well the
model generalizes to new data.

Fig 4.4 Train Test Split

Fig 4.4.1
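
The split described above can be reproduced on a small synthetic array. The data here is a stand-in (an assumption); only the `train_test_split` call with `test_size=0.25` and `random_state=42` mirrors the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 rows with 2 features and a 3-class target.
X = np.arange(200).reshape(100, 2)
y = np.arange(100) % 3

# 75% of the rows go to training, 25% to testing.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
print(x_train.shape, x_test.shape)

# The same seed gives exactly the same split on a second call.
x_train_again, _, _, _ = train_test_split(X, y, test_size=0.25, random_state=42)
print(np.array_equal(x_train, x_train_again))
```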

Fitting & transforming training data

Fit: Computes the mean and standard deviation of every feature in the training set (x_train). The target
(y_train) is not scaled.
Transform: Subtracts the mean from each feature's value and divides by the standard deviation to standardize
the training data. The scaled data is centered on zero and has a standard deviation of one. The test data
is transformed with the statistics learned from the training set.
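
A minimal sketch of the fit and transform steps, assuming scikit-learn's `StandardScaler`. The arrays hold made-up values; the point is that the scaler is fitted on the training features only and then reused for the test features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up feature values: three training rows and one test row.
x_train = np.array([[20.0, 1.0], [30.0, 3.0], [40.0, 5.0]])
x_test = np.array([[30.0, 3.0]])

scaler = StandardScaler()
# fit_transform learns the mean and standard deviation from x_train and scales it.
x_train_scaled = scaler.fit_transform(x_train)
# transform reuses the training statistics; the scaler is not refitted on test data.
x_test_scaled = scaler.transform(x_test)

print(x_train_scaled.mean(axis=0))  # approximately 0 for each feature
print(x_train_scaled.std(axis=0))   # approximately 1 for each feature
print(x_test_scaled)
```

Fitting the scaler on the training data alone keeps information from the test set out of the preprocessing step.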

Implementation of Multiclass Logistic Regression

The code imports the LogisticRegression class from the sklearn.linear_model module, which is used for
logistic regression analysis. It assigns an instance of the LogisticRegression class to the variable regression,
using all available processors for computation.

The code then fits the model to the training data, learning the relationship between features and target from
the training data.

Fig 4.5 logistic regression

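The trained logistic regression model is then used to predict labels for x_test.

The R-squared score is computed to evaluate the model's performance. Note that R-squared is a regression
metric that measures how much of the target's variance is explained; for a classifier such as this one,
accuracy and the classification report are the more appropriate measures.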

Fig 4.5.1

Here the multiclass logistic regression model gives an R-squared score of 0.89, which suggests that it
fits the data well.

Fig 4.5.2

The classification_report function evaluates classification models, not regression models, providing metrics
like precision, recall, and F1-score.

Fig 4.5.3
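
The fit, predict, and report sequence can be sketched as follows. This is an illustration on synthetic data, not the project's code: the dataset, the `max_iter` value, and the use of `accuracy_score` as the headline metric are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Synthetic three-class data standing in for the skin type dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit the multiclass logistic regression model on the training data.
regression = LogisticRegression(max_iter=1000)
regression.fit(x_train, y_train)

# Predict the held-out rows and evaluate with classification metrics.
y_pred = regression.predict(x_test)
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)

print(f"accuracy: {accuracy:.2f}")
print(report)
```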

Implementation of Hyperparameter Tuning

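This code snippet demonstrates how to perform hyperparameter tuning for a Logistic regression model using
GridSearchCV.

It defines the hyperparameter search space, including criterion, splitter, max_features, and ccp_alpha. Note that
in scikit-learn these four names are parameters of DecisionTreeClassifier; LogisticRegression is tuned through
parameters such as C, penalty, and solver.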

The code imports the GridSearchCV class, initializes a Logistic regression classifier, creates a GridSearchCV
object, and performs hyperparameter tuning by fitting the object to the training data, evaluating
hyperparameter combinations, and selecting the best set.

The code uses the best model found by GridSearchCV to predict labels for x_test and stores the predicted
values in a variable.
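
A minimal sketch of such a grid search follows. The parameter names listed in this section (criterion, splitter, max_features, ccp_alpha) belong to scikit-learn's DecisionTreeClassifier, so the sketch tunes a decision tree; for LogisticRegression the grid would instead hold parameters such as C. The candidate values, the 3-fold setting, and the synthetic data are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for the skin-type dataset (assumption).
x, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)

# Search space; the candidate values are illustrative assumptions.
param_grid = {
    "criterion": ["gini", "entropy"],
    "splitter": ["best", "random"],
    "max_features": ["sqrt", "log2", None],
    "ccp_alpha": [0.0, 0.01],
}

# Every combination is scored with 3-fold cross-validation on the training data.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=42), param_grid, cv=3, scoring="accuracy"
)
grid.fit(x_train, y_train)

print(grid.best_params_)
# best_estimator_ is refit on the full training set with the best combination.
y_pred = grid.best_estimator_.predict(x_test)
```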

Implementation of Decision Tree Classifier

The code demonstrates the creation and training of a Decision Tree Classifier model.

It imports the DecisionTreeClassifier class, sets the max_depth parameter to 2, and splits the dataset into
training and testing sets.

The model is trained on the training data, learning decision rules from the features (x_train) and the
corresponding labels (y_train) so that it can classify instances based on their features.

Limiting max_depth to 2 keeps the decision tree simple and reduces the risk of overfitting the training data.
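
The effect of max_depth can be sketched as follows, assuming synthetic data in place of the skin-type dataset. The name tree_model is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for the skin-type dataset (assumption).
x, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)

# max_depth=2 allows at most two levels of splits, i.e. at most four leaves.
tree_model = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_model.fit(x_train, y_train)

print(tree_model.get_depth(), tree_model.get_n_leaves())
y_pred = tree_model.predict(x_test)
```

A shallow tree like this is easy to interpret, although a depth of 2 may also underfit when the classes need more than two splits to separate.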

Fig 4.6 Decision Tree

The code uses the trained decision tree model to predict the target variable (for example, 'Category') for the test data x_test.

Fig 4.6.1

The model's accuracy score is calculated by comparing the predicted values with the actual values, and the
proportion of correct predictions is computed. The calculated score is then printed, indicating the model's
performance on the test data.
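
The accuracy calculation can be illustrated with a small worked example. The label lists are hypothetical; the manual computation is included only to show what accuracy_score returns.

```python
from sklearn.metrics import accuracy_score

# Hypothetical true and predicted labels for four test rows.
y_test = [0, 1, 2, 2]
y_pred = [0, 1, 1, 2]

# Proportion of positions where prediction and true label agree: 3 of 4.
score = accuracy_score(y_test, y_pred)

# The same value computed by hand.
manual = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)

print(score, manual)  # 0.75 0.75
```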

Fig 4.6.2

The classification_report function compares y_test with y_pred, and the resulting classification report is
printed to the console.

Fig 4.6.3

Implementation of the Random Forest Classifier

To build a Random Forest model, the code imports the RandomForestClassifier class from scikit-learn.

It first sets the initial parameters and then trains the model on the training data. After that, the model can
make predictions on new, unseen data.

rf_model = RandomForestClassifier(n_estimators=100, random_state=42): Creates a RandomForestClassifier
instance:

The setting n_estimators=100 means that the forest contains 100 decision trees. This number can be adjusted
to suit the dataset and the available processing power.

The setting random_state=42 fixes the random seed to ensure reproducibility. Every time the code is executed
with the same seed, it produces the same results.

The code uses a trained Random Forest model to generate predictions for the test data, assigning the predicted
values to a variable called y_pred_rf, which is then used to evaluate the model's performance.
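
A minimal sketch of these steps is given below. It reuses the names rf_model, y_pred_rf, and accuracy_rf from the text and assumes synthetic data in place of the skin-type dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic three-class data standing in for the skin-type dataset (assumption).
x, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)

# 100 trees; the fixed seed makes the forest identical on every run.
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(x_train, y_train)

# Predict the test set and compare with the true labels.
y_pred_rf = rf_model.predict(x_test)
accuracy_rf = accuracy_score(y_test, y_pred_rf)
print(len(rf_model.estimators_), accuracy_rf)
```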

Fig 4.7 Random Forest

Fig 4.7.1

The Random Forest model's accuracy is calculated by comparing true labels with predicted labels, stored in
the variable accuracy_rf.

A classification report is generated, providing a detailed evaluation of the model's performance.

Fig 4.7.2

Implementation of K Nearest Neighbour (KNN)

A K-Nearest Neighbors (KNN) model is built with scikit-learn following the same procedure as the one
outlined for the Random Forest model. The steps are:

1. Initialize the KNN Classifier

2. Train the Model
3. Make Predictions
4. Evaluate the Model's Performance

KNN Classifier Initialization: A KNeighborsClassifier instance is created with 5 neighbors.

Model Training: The training data is utilized to train the model.

Making Predictions: The model generates predictions based on the test data.

Model Evaluation: The model's accuracy is computed, and a comprehensive classification report is produced
to assess its performance in precision, recall, and F1-score for every class.
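
The four steps can be sketched as follows, assuming synthetic data in place of the skin-type dataset. The names knn_model and y_pred_knn are hypothetical. Features are standardized first because KNN relies on distances between rows.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic three-class data standing in for the skin-type dataset (assumption).
x, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

# 1. Initialize: each prediction is a vote among the 5 closest training rows.
knn_model = KNeighborsClassifier(n_neighbors=5)
# 2. Train: KNN stores the training data for later neighbor lookups.
knn_model.fit(x_train, y_train)
# 3. Predict the test set.
y_pred_knn = knn_model.predict(x_test)
# 4. Evaluate.
accuracy_knn = accuracy_score(y_test, y_pred_knn)
print(accuracy_knn)
print(classification_report(y_test, y_pred_knn))
```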

Here is the complete code to evaluate the score from KNN:

Fig 4.8 KNN

A classification report is generated, providing a detailed evaluation of the model's performance.

Fig 4.8.1

Implementation of Naive Bayes Classifier

A Naive Bayes model is built with scikit-learn following the same procedure as the one outlined for the
Random Forest and KNN models. The steps are:

1. Initialize the Naive Bayes Classifier

2. Train the Model
3. Make Predictions
4. Evaluate the Model's Performance

The Naive Bayes Classifier is initialized by creating a GaussianNB instance.

The model is trained on the training data.

Predictions on the test data are made by the model.

The accuracy of the model is calculated, and a detailed classification report is generated to evaluate the model's
performance in terms of precision, recall, and F1-score for each class.
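
The steps above can be sketched as follows, assuming synthetic data in place of the skin-type dataset. The names nb_model and y_pred_nb are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic three-class data standing in for the skin-type dataset (assumption).
x, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)

# 1. Initialize: GaussianNB models each feature as a normal distribution per class.
nb_model = GaussianNB()
# 2. Train: estimates the per-class mean and variance of every feature.
nb_model.fit(x_train, y_train)
# 3. Predict the test set.
y_pred_nb = nb_model.predict(x_test)
# 4. Evaluate.
accuracy_nb = accuracy_score(y_test, y_pred_nb)
print(nb_model.theta_.shape)  # one row of feature means per class
print(accuracy_nb)
print(classification_report(y_test, y_pred_nb))
```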

Here is the complete code to evaluate the score from Naive Bayes:

Fig 4.9 Naïve Bayes

A classification report is generated, providing a detailed evaluation of the model's performance.

Fig 4.9.1

Implementation of Support Vector Machine

An SVM model is built with scikit-learn following the same procedure as the one outlined for the
Random Forest, KNN, and Naive Bayes models. The steps are:

1. Initialize the SVM Classifier

2. Train the Model
3. Make Predictions
4. Evaluate the Model's Performance

SVM Classifier Initialization: An SVC instance is created with a linear kernel. The kernel type can be
changed to 'rbf', 'poly', etc., based on specific needs.

Model Training: The training data is utilized to train the model.

Making Predictions: The model generates predictions on the test data.

Model Evaluation: The accuracy of the model is computed, and a comprehensive classification report is
produced to assess the model's precision, recall, and F1-score for individual classes.
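
A minimal sketch of these steps is shown below, assuming synthetic data in place of the skin-type dataset. The names svm_model and y_pred_svm are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic three-class data standing in for the skin-type dataset (assumption).
x, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

# 1. Initialize with a linear kernel; kernel="rbf" or kernel="poly" also work.
svm_model = SVC(kernel="linear")
# 2. Train on the standardized training data.
svm_model.fit(x_train, y_train)
# 3. Predict the test set.
y_pred_svm = svm_model.predict(x_test)
# 4. Evaluate.
accuracy_svm = accuracy_score(y_test, y_pred_svm)
print(accuracy_svm)
print(classification_report(y_test, y_pred_svm))
```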

Here is the complete code to evaluate the score from Support Vector Machine:
Fig 4.10 SVM

A classification report is generated, providing a detailed evaluation of the model's performance.

Fig 4.10.1

Accuracy Score for all classification models

Here is the accuracy table for all classification models:
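
One way to assemble such a table is sketched below. It assumes synthetic data in place of the skin-type dataset, so the printed accuracies are not the report's results; the dictionary names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for the skin-type dataset (assumption).
x, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

# The six classifiers discussed in this chapter, with the settings named in the text.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=2, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(kernel="linear"),
}

# Train each model on the same split and record its test accuracy.
accuracies = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(x_test))

# Print the table, best model first.
for name, acc in sorted(accuracies.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20} {acc:.3f}")
```

Evaluating every model on the same train/test split keeps the comparison fair, since differences in accuracy then come from the models rather than from the data each one happened to see.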

Fig 4.11

Graphical representation of Accuracy for all Classification models

Fig 4.12

Evaluation of Prediction through User Input

Fig 4.13

Here is the prediction table for all classification models:

Fig 4.14

Therefore, all classification models are predicting the same output.

CONCLUSION

The skin type prediction model was developed with the objective of forecasting various skin-related
conditions, such as skin type, acne severity, and moisture levels, based on demographic and dermatological
features. The project evaluated a variety of machine learning algorithms, including Logistic Regression,
Decision Tree, Random Forest, K-Nearest Neighbors, Naive Bayes, and Support Vector Machines (SVM), to
build an accurate predictive system.

Performance Evaluation was conducted using several key metrics to evaluate the model's effectiveness:

• Accuracy: The model achieved a high degree of accuracy, indicating that it correctly classified the
skin type or predicted the skin-related condition in most cases.
• Precision: For multi-class classification problems (such as predicting specific skin conditions),
precision showed how many of the predicted conditions were true positives.
• Recall: This metric highlighted the ability of the model to detect all instances of each condition (such
as acne severity levels or skin types), even if it resulted in some false positives.
• F1-Score: The harmonic mean of precision and recall, which shows whether the model
underperforms in any specific class of skin types or conditions.
• AUC-ROC: The area under the receiver operating characteristic curve was used to assess the
trade-off between sensitivity and specificity. This demonstrated the model's ability to differentiate
between classes of skin types.

The results indicated that the developed model was effective in classifying skin types and predicting conditions
with satisfactory precision. The model was able to generalize well to new, unseen data, making it a reliable
tool for skincare applications.

Overall, the project successfully met its objectives of creating a predictive system for skin-related conditions,
providing potential applications in skincare product recommendations, dermatological diagnosis, and
personalized treatment plans.

FUTURE SCOPE

While the developed model has shown promising results, several limitations exist that can be addressed in
future iterations of the project. Here are some potential areas for improvement:

1. Data Quality and Variety:

o Limitation: The dataset may not capture a broad range of skin types across different
demographics, such as ethnicity, environmental factors, and medical history.
o Proposed Enhancement: A larger and more diverse dataset should be collected to include
various skin conditions from different regions, ages, ethnicities, and skin sensitivities to
improve the model’s generalization and robustness.
2. Incorporating External Factors:
o Limitation: The current dataset may not include external factors such as environmental data
(e.g., humidity, pollution) or lifestyle factors (e.g., diet, stress) that can influence skin
conditions.
o Proposed Enhancement: Future versions of the model could incorporate additional features
such as weather conditions, dietary habits, or stress levels, which are known to influence
skin health, thereby improving the model’s accuracy in real-world applications.
3. Real-Time Prediction:
o Limitation: The model is static and does not adapt to changes in an individual’s skin condition
over time.
o Proposed Enhancement: The model can be enhanced to make real-time predictions using
mobile apps or wearable devices that track an individual’s skin condition over time. Machine
learning models can be updated continuously as more data is collected, creating personalized
skincare routines for users.
4. Fine-tuning Hyperparameters:
o Limitation: While the model performs well, hyperparameters were set using standard
practices, and there might be better configurations that further optimize performance.
o Proposed Enhancement: Implementing grid search or randomized search for
hyperparameter tuning will help identify the optimal settings for the machine learning models,
potentially improving performance metrics like accuracy, precision, and recall.
5. Deep Learning Models:
o Limitation: The current machine learning models used (e.g., Random Forest, SVM) are
relatively traditional and might not fully capture complex patterns in the data.
o Proposed Enhancement: Exploring deep learning models, particularly Convolutional
Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), may provide better
results, especially for detecting complex relationships between skin features and conditions, if
image-based or temporal data is incorporated in the future.
6. User Interface and Accessibility:
o Limitation: The model is useful as a backend tool but lacks an accessible user interface for
individuals or dermatologists to easily interact with it.
o Proposed Enhancement: A web-based or mobile application interface could be developed
to allow users to input their skin-related data (such as images, age, or skin conditions) and
receive predictions and skincare recommendations. This could be especially valuable in remote
areas or for users who do not have access to dermatologists.
7. Integration with Healthcare Systems:
o Limitation: The model does not currently integrate with healthcare or dermatology
management systems.
o Proposed Enhancement: The model could be integrated into electronic medical record
(EMR) systems, allowing dermatologists to use it as a decision-support tool for diagnosing
and recommending treatments based on a patient’s skin data.
REFERENCES

1. Jain, S., & Gupta, R. (2021). Predicting Skin Type Using Machine Learning Models. Journal of Skin
Science and Technology, 15(2), 45-56.
2. Lee, C., & Kim, D. (2020). A Survey on Machine Learning in Dermatology and Skin Disease
Diagnosis. International Journal of Medical Informatics, 136, 104092.
3. Patel, H., & Mehta, D. (2022). Impact of Skin Moisture and pH on Dermatological Disorders: A
Machine Learning Approach. Journal of Dermatological Research, 10(1), 23-37.
4. Zhang, X., & Zhao, Y. (2019). Skin Type Classification using Supervised Machine Learning.
Proceedings of the International Conference on Artificial Intelligence and Data Science, 50(6), 1123-
1135.
5. Liu, J., & Wang, L. (2023). Deep Learning Approaches in Dermatology: A Review on Skin Disease
Prediction. Medical Image Analysis, 78, 102221.
