
Predicting Car MPG Using Decision Tree and Random Forest Algorithm Main

This project investigates the use of Decision Tree and Random Forest algorithms to predict car mileage (MPG) based on various vehicle attributes. The study finds that while both models are effective, the Random Forest algorithm generally outperforms the Decision Tree in accuracy and robustness, providing valuable insights for consumers and manufacturers. The research emphasizes the importance of advanced machine learning techniques in enhancing fuel efficiency predictions.

Uploaded by

Saisandeep Y

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Professional Training - 1
Predicting Car MPG Using Decision Tree and Random Forest Algorithm

Project Student: Ravuri Likhil Mohan Abhiram Naidu (42111065)
Guide: Ms. P. Krishnaveni, M.Tech., Associate Professor, CSE
AGENDA
• Certificate
• Abstract
• Introduction
• Objective
• Problem Statement
• Literature Survey
• Module Implementation
• Advantages
• System Architecture Diagram
• Results and Discussion
• Conclusion
• References
COURSE CERTIFICATE
ABSTRACT
This study explores the application of Decision Tree and Random Forest algorithms to
predict car mileage in miles per gallon (MPG). By utilizing a dataset comprising various car
attributes such as engine size, weight, and horsepower, we aim to assess the accuracy and
efficiency of these machine learning models. The Decision Tree algorithm offers
transparency and ease of interpretation, while the Random Forest algorithm enhances
prediction accuracy through ensemble learning. Our results demonstrate that both models
effectively predict MPG, with Random Forest generally outperforming Decision Tree in
terms of accuracy and robustness, providing valuable insights for consumers and
manufacturers alike.

School of Computing - CSE


INTRODUCTION
● Predicting car miles per gallon (MPG) is essential for understanding fuel efficiency and
making informed purchasing decisions.
● This project uses machine learning techniques, specifically Decision Tree and
Random Forest algorithms, to analyze various car features and predict MPG.
● Decision Trees provide a clear visual representation of decision-making processes,
while Random Forest improves accuracy by aggregating multiple decision trees to
mitigate overfitting and improve generalization.
● By leveraging these algorithms, we aim to deliver insights into how factors such as
engine size, weight, and horsepower affect fuel economy, ultimately assisting
consumers and manufacturers in promoting more efficient vehicles.



OBJECTIVE
● This project aims to predict car fuel efficiency (miles per gallon, MPG) using Decision
Tree and Random Forest algorithms.
● The dataset comprises various attributes, such as engine size, weight, and
horsepower, influencing MPG.
● We will preprocess the data by handling missing values and encoding categorical
variables.
● Subsequently, we will split the dataset into training and testing sets.
● The Decision Tree will provide a clear, interpretable model, while the Random Forest,
an ensemble method, will improve accuracy by mitigating overfitting.
● We will evaluate model performance using metrics like RMSE and R², enabling us to
identify the most effective approach for MPG prediction.
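The preprocessing and splitting steps listed above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the toy rows, the column names (`weight`, `horsepower`, `origin`, `mpg`) and the helper names are all assumptions introduced here, and mean imputation and one-hot encoding are only one common way to handle missing values and categorical variables.

```python
import random
from statistics import mean

# Hypothetical toy rows; column names and values are illustrative only.
rows = [
    {"weight": 3500.0, "horsepower": 130.0, "origin": "usa", "mpg": 18.0},
    {"weight": 2200.0, "horsepower": None, "origin": "japan", "mpg": 32.0},
    {"weight": 2600.0, "horsepower": 90.0, "origin": "europe", "mpg": 26.0},
    {"weight": 4300.0, "horsepower": 200.0, "origin": "usa", "mpg": 14.0},
    {"weight": 2000.0, "horsepower": 65.0, "origin": "japan", "mpg": 35.0},
]


def impute_mean(rows, column):
    """Return new rows where None in a numeric column is replaced by the column mean."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Return new rows where a categorical column becomes one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new_row[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new_row)
    return encoded


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


clean = one_hot(impute_mean(rows, "horsepower"), "origin")
train, test = train_test_split(clean, test_fraction=0.2, seed=0)
print(len(train), "training rows,", len(test), "test rows")
```

For brevity the sketch imputes before splitting. In a stricter setup the fill value would be computed on the training rows only, so that no information from the test rows leaks into training.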



PROBLEM STATEMENT
• The automotive industry is increasingly focused on fuel efficiency due to rising fuel
costs and environmental concerns. Predicting the miles per gallon (MPG) of vehicles
can assist manufacturers, consumers, and policymakers in making informed decisions
regarding vehicle design, purchase, and regulation.

• In this project, we aim to develop predictive models for estimating the MPG of cars
using machine learning techniques, specifically Decision Trees and Random Forest
algorithms. The goal is to create accurate, interpretable models that can predict a
car's fuel efficiency based on various features such as engine size, weight,
horsepower, and other relevant characteristics.
• Data Collection and Preprocessing: Gather a comprehensive dataset of cars that includes
features impacting fuel efficiency. Clean and preprocess the data to handle missing values,
categorical variables, and outliers.
• Feature Selection: Identify the most significant features that influence MPG using techniques
such as correlation analysis and feature importance scores.
• Model Development: Implement Decision Tree and Random Forest algorithms to create
predictive models for MPG.
• Model Evaluation: Assess the performance of the models using appropriate metrics such as
Mean Absolute Error (MAE), Mean Squared Error (MSE), and R-squared values. Use cross-
validation to ensure robustness.
• Interpretability: Provide insights into the model's decision-making process, particularly for the
Decision Tree, to understand how different features contribute to MPG predictions.
• Comparison of Models: Compare the performance of the Decision Tree and Random Forest
models to determine which algorithm provides better accuracy and reliability in predicting MPG.
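The evaluation metrics named above (MAE, MSE and R-squared, plus RMSE as the square root of MSE) can be written out directly. The following is a minimal sketch in plain Python. The function names and the sample values are made up for illustration and are not results from this project.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of MSE, in the same unit as MPG."""
    return math.sqrt(mse(y_true, y_pred))


def r2(y_true, y_pred):
    """R-squared: 1 minus the ratio of residual to total sum of squares.

    Undefined (division by zero) when all true values are identical.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical true and predicted MPG values, for illustration only.
y_true = [18.0, 32.0, 26.0, 14.0]
y_pred = [20.0, 30.0, 25.0, 15.0]

print("MAE :", mae(y_true, y_pred))   # 1.5
print("MSE :", mse(y_true, y_pred))   # 2.5
print("RMSE:", rmse(y_true, y_pred))
print("R^2 :", r2(y_true, y_pred))
```

Lower MAE, MSE and RMSE indicate smaller errors, while R-squared closer to 1 indicates that the model explains more of the variation in MPG.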
LITERATURE SURVEY
1. A. Aliyu and S. Adeshina, "Classifying auto-MPG data set using neural network," 2014 11th
International Conference on Electronics, Computer and Computation (ICECCO), Abuja, Nigeria,
2014, pp. 1-4, doi: 10.1109/ICECCO.2014.6997582.

2. M. J. Cohen and R. P. Wolfson, "The electric passenger car and its competition in the mid-
1980s," 31st IEEE Vehicular Technology Conference, Washington, DC, USA, 1981, pp. 31-37, doi:
10.1109/VTC.1981.1622909.
3. M. S. Rusiman, E. Nasibov and R. Adnan, "The optimal fuzzy c-regression models (OFCRM) in
miles per gallon of cars prediction," 2011 IEEE Student Conference on Research and
Development, Cyberjaya, Malaysia, 2011, pp. 333-338, doi: 10.1109/SCOReD.2011.6148760.

4. S. Azak, F. Bozkaya, S. Ü. Tığlıoğlu, A. Yusefi and A. Durdu, "A Unified Monocular Vision-
Based Driving Model for Autonomous Vehicles with Multi-Task Capabilities," in IEEE Transactions
on Intelligent Vehicles, doi: 10.1109/TIV.2024.3483114.

5. P. R, A. Choudhary, P. Jain and O. Kajave, "Vehicle Efficiency Prediction using Machine
Learning Algorithms," 2023 3rd International Conference on Smart Data Intelligence (ICSMDI),
Trichy, India, 2023, pp. 392-399, doi: 10.1109/ICSMDI57622.2023.00076.



MODULE IMPLEMENTATION
INFERENCES FROM LITERATURE SURVEY
● Existing systems for predicting car MPG using Decision Tree and Random Forest
algorithms often suffer from several shortcomings.
● These include overfitting, where models become too complex and fail to generalize
well to unseen data.
● Moreover, the choice of features can be suboptimal, neglecting important factors like
driving conditions or maintenance history.
● Data quality issues, such as missing or inconsistent data, can skew results.
● Additionally, reliance on a single algorithm may limit performance, as different
algorithms may capture different patterns in the data.
● Lack of interpretability in Random Forest models can also hinder usability for
stakeholders seeking actionable insights.



EXISTING SYSTEM
● The existing system for predicting car miles per gallon (MPG) uses Decision Tree and
Random Forest algorithms to estimate fuel efficiency.
● Decision trees provide a simple, interpretable model that splits data based on feature
values, enabling straightforward decision-making.
● Random forests build on this by aggregating multiple decision trees, improving
prediction accuracy and robustness against overfitting.
● The system typically uses various automotive features, such as engine size, weight,
and horsepower, to train the models.
● By leveraging historical data, the algorithms can effectively predict MPG, aiding
consumers and manufacturers in making informed choices and optimizing vehicle
performance and efficiency.
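To make "splits data based on feature values" concrete, the sketch below searches for the single best split on one feature, which is the basic step a regression tree repeats at every node. It is a simplified illustration in plain Python. The squared-error criterion is a common choice for regression trees but is an assumption here, as are the function names and the toy weight and MPG values.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the rule `x <= threshold` that minimises
    the summed squared error of the two resulting groups.

    Returns None when the feature has fewer than two distinct values.
    """
    best = None
    distinct = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical car weights (lb) and MPG values, for illustration only.
weights = [2000.0, 2200.0, 2600.0, 3500.0, 4300.0]
mpg = [35.0, 32.0, 26.0, 18.0, 14.0]

threshold, error = best_split(weights, mpg)
print("split at weight <=", threshold, "with squared error", error)
```

On this toy data the chosen rule separates the three lighter cars from the two heavier ones. A full decision tree would apply the same search recursively to each group and across all features, and would predict the mean MPG of the group a car ends up in.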



PROPOSED SYSTEM
● The proposed system aims to predict car miles per gallon (MPG) using machine
learning techniques, specifically Decision Tree and Random Forest algorithms.
● By utilizing a dataset containing features such as engine size, weight, and fuel type,
the system will train the models to identify patterns influencing fuel efficiency.
● The Decision Tree algorithm will provide interpretable results, while the Random
Forest algorithm will improve prediction accuracy through ensemble learning.
● This combined approach allows for robust and reliable MPG predictions, aiding
consumers in making informed decisions about vehicle efficiency and performance,
ultimately promoting environmentally friendly driving choices.
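The ensemble idea behind Random Forest can be sketched as follows: train many small trees, each on a bootstrap sample of the data and a randomly chosen feature, then average their predictions. This is a heavily simplified stand-in written in plain Python, not a full Random Forest. It uses one-split trees ("stumps") instead of deep trees, and all names and data values are assumptions for illustration.

```python
import random


def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree on a single feature.

    Returns (feature, threshold, left_mean, right_mean). If the feature has
    only one distinct value, the stump always predicts the overall mean.
    """
    overall = sum(targets) / len(targets)
    best = (float("inf"), feature, float("inf"), overall, overall)
    values = sorted({r[feature] for r in rows})
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for r, y in zip(rows, targets) if r[feature] <= t]
        right = [y for r, y in zip(rows, targets) if r[feature] > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if err < best[0]:
            best = (err, feature, t, lm, rm)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if row[feature] <= threshold else right_mean


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    features = sorted(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.choice(features)                  # random feature choice
        forest.append(
            fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature)
        )
    return forest


def predict_forest(forest, row):
    """Average the predictions of all stumps in the ensemble."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)


# Hypothetical cars and MPG values, for illustration only.
cars = [
    {"weight": 2000.0, "horsepower": 65.0},
    {"weight": 2200.0, "horsepower": 70.0},
    {"weight": 2600.0, "horsepower": 90.0},
    {"weight": 3500.0, "horsepower": 130.0},
    {"weight": 4300.0, "horsepower": 200.0},
]
mpg = [35.0, 32.0, 26.0, 18.0, 14.0]

forest = fit_forest(cars, mpg, n_trees=25, seed=0)
light = predict_forest(forest, {"weight": 1900.0, "horsepower": 60.0})
heavy = predict_forest(forest, {"weight": 4500.0, "horsepower": 210.0})
print("light car:", light, "heavy car:", heavy)
```

Because each stump sees a different resample of the data, individual stumps disagree, and averaging them smooths out the errors of any single one. This is the mechanism the slides refer to as reducing overfitting.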



ADVANTAGES
• Predictive Accuracy: By utilizing both Decision Tree and Random Forest algorithms, the project
leverages the strengths of both models. Random Forest, as an ensemble method, generally
provides better accuracy and robustness against overfitting compared to a single Decision Tree.
• Interpretability: The Decision Tree model is inherently interpretable, allowing users to
understand how different features influence the MPG prediction. This can be beneficial for
stakeholders who want to grasp the underlying factors affecting fuel efficiency.
• User-Friendly Interface: The program includes an interactive component that allows users to
input car specifications easily. This accessibility makes the tool practical for consumers and
automotive professionals alike.
• Feature Importance Insights: The project can be extended to include feature importance
analysis, helping users understand which factors are most influential in determining MPG. This
information can guide design and purchasing decisions.
• Data-Driven Decision Making: By providing accurate MPG predictions, the project supports
data-driven decision-making for consumers considering vehicle purchases, manufacturers
optimizing designs for fuel efficiency, and policymakers aiming to regulate emissions.
• Scalability: The framework established can be adapted and scaled to include additional
features or other machine learning models, making it versatile for future enhancements.
• Educational Value: The project serves as a practical example of applying machine learning
techniques to real-world problems, providing valuable learning experiences for students and
professionals in data science and machine learning.
• Environmental Impact: By promoting awareness of fuel efficiency through predictions, the
project indirectly contributes to environmental sustainability efforts by encouraging the use of
more fuel-efficient vehicles.
• Robust Performance Evaluation: The inclusion of model evaluation metrics (like Mean
Squared Error and R²) allows users to assess the model's performance quantitatively, ensuring
that the predictions are reliable.
• Comparison of Techniques: The project provides a direct comparison between two different
machine learning approaches, offering insights into their respective advantages and
disadvantages in the context of regression tasks.
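As a first look at which features matter most, the features can be ranked by the strength of their linear correlation with MPG. The sketch below uses the Pearson correlation coefficient in plain Python. It is only a simple proxy and is not the same thing as the feature importance scores a tree-based model produces. The feature names (including `doors`) and all values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical feature columns and MPG values, for illustration only.
features = {
    "weight": [2000.0, 2200.0, 2600.0, 3500.0, 4300.0],
    "horsepower": [65.0, 70.0, 90.0, 130.0, 200.0],
    "doors": [2.0, 4.0, 4.0, 2.0, 4.0],
}
mpg = [35.0, 32.0, 26.0, 18.0, 14.0]

# Rank features by absolute correlation with MPG, strongest first.
ranking = sorted(
    features, key=lambda name: abs(pearson(features[name], mpg)), reverse=True
)
for name in ranking:
    print(f"{name:10s} r = {pearson(features[name], mpg):+.3f}")
```

On this toy data, weight and horsepower are strongly negatively correlated with MPG while the number of doors is only weakly related. Correlation captures only linear, one-feature-at-a-time relationships, so it complements rather than replaces model-based importance.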
ARCHITECTURE DIAGRAM



RESULTS AND DISCUSSION
● In this study, we employed Decision Tree and Random Forest algorithms to predict car
MPG (miles per gallon) based on various features such as engine size, weight, and
horsepower.
● The Decision Tree model provided interpretable results, highlighting key factors
influencing fuel efficiency. However, it was prone to overfitting.
● In contrast, the Random Forest algorithm, leveraging multiple decision trees, yielded
improved accuracy and robustness, demonstrating better generalization on unseen
data.
● Model evaluation metrics, including R² and RMSE, confirmed the superiority of
Random Forest in predictive performance.
● These findings suggest that ensemble methods are effective for MPG prediction in
automotive datasets.
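Generalization to unseen data is usually checked with k-fold cross-validation: the data is divided into k folds, and each fold is held out once while the model is trained on the rest. The sketch below shows the mechanics in plain Python. A mean-only baseline stands in for the actual models, and the function names and data values are assumptions for illustration, not results from this study.

```python
import math


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def cross_val_rmse(xs, ys, fit, predict, k=5):
    """Return one RMSE per fold for a model given as fit/predict functions."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        squared = [(ys[i] - predict(model, xs[i])) ** 2 for i in test_idx]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return scores


def fit_mean(xs, ys):
    """Baseline 'model': remember the mean target of the training data."""
    return sum(ys) / len(ys)


def predict_mean(model, x):
    """Baseline prediction: always return the stored mean."""
    return model


# Hypothetical weights and MPG values, for illustration only.
weights = [2000.0, 3500.0, 2600.0, 4300.0, 2200.0, 3000.0]
mpg = [35.0, 18.0, 26.0, 14.0, 32.0, 22.0]

scores = cross_val_rmse(weights, mpg, fit_mean, predict_mean, k=3)
print("per-fold RMSE:", scores)
print("mean RMSE    :", sum(scores) / len(scores))
```

Any model that can be expressed as a `fit` and `predict` pair could be passed in instead of the baseline, which allows a Decision Tree and a Random Forest to be compared on exactly the same folds. In practice the rows are typically shuffled before the folds are formed.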
CONCLUSION
● In conclusion, this study demonstrates the application of Decision Tree and
Random Forest algorithms in predicting car mileage (MPG) using diverse car attributes.
● While both models show strong predictive capabilities, the Random Forest algorithm
consistently outperforms Decision Tree in terms of accuracy and robustness,
highlighting the benefits of ensemble learning.
● This research not only contributes to the understanding of machine learning
applications in automotive contexts but also offers valuable insights for consumers
and manufacturers, facilitating informed decision-making regarding fuel efficiency and
vehicle performance.
● Ultimately, these findings underscore the importance of utilizing advanced algorithms
for accurate predictive modeling.



THANK YOU
