project_documentation
Project Details
Project Name: Affordable cars
Date: Started in June 2021
Project Manager: Tanmay Mandwekar
Helping Hand: Google, Stack Overflow, scikit-learn documentation, Udemy course “Complete
Machine Learning and Data Science Bootcamp 2021” by Andrei Neagoie and Daniel Bourke
Project Idea: Identifying affordable cars by predicting a car's price and manufacturer from car
features such as engine type and horsepower.
Software and languages: Miniconda, scikit-learn, NumPy, pandas, Matplotlib, Jupyter Notebook,
Python, C++
Initiation
Collecting data: Visiting various car company websites and collecting data on price, horsepower,
engine type, mileage and other car parameters.
Setting Data: Analysing the data, filling in missing values, and deleting unnecessary rows that
have missing entries.
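The cleaning step can be sketched with pandas as follows. This is a minimal sketch, not the project's actual code: the column names, the sample rows, and the `clean_car_data` helper are all hypothetical, and the choice of median and placeholder fills is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the column names and values are illustrative only.
cars = pd.DataFrame({
    "manufacturer": ["toyota", "bmw", None, "honda", "audi"],
    "horsepower": [92.0, np.nan, 110.0, 86.0, 140.0],
    "engine_type": ["ohc", "ohc", "dohc", "ohc", None],
    "price": [8500.0, 21000.0, 12000.0, np.nan, 23875.0],
})


def clean_car_data(df):
    """Drop rows missing a target value, then fill the remaining feature gaps."""
    # Rows without a price or manufacturer cannot serve as training targets.
    cleaned = df.dropna(subset=["price", "manufacturer"]).copy()
    # Fill numeric gaps with the column median (an assumed strategy).
    cleaned["horsepower"] = cleaned["horsepower"].fillna(cleaned["horsepower"].median())
    # Fill categorical gaps with an explicit placeholder category.
    cleaned["engine_type"] = cleaned["engine_type"].fillna("unknown")
    return cleaned.reset_index(drop=True)


cleaned_cars = clean_car_data(cars)
print(cleaned_cars)
```

Dropping rows is reserved here for entries that lack a prediction target, while gaps in the features are filled so that fewer collected rows are lost.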
Planning
Data: Converting non-numeric data (such as engine type) into numbers using the encoders in
scikit-learn, so that the models can use it.
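One way to do this conversion is one-hot encoding with scikit-learn's `OneHotEncoder` inside a `ColumnTransformer`. The sketch below assumes a hypothetical feature table with one categorical column and one numeric column; the project may have used a different encoder.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical feature table; column names and values are illustrative only.
features = pd.DataFrame({
    "engine_type": ["ohc", "dohc", "ohc", "rotor"],
    "horsepower": [92, 110, 86, 101],
})

# One-hot encode the categorical column and pass numeric columns through unchanged.
# sparse_threshold=0 forces a dense NumPy array as output.
encoder = ColumnTransformer(
    transformers=[("one_hot", OneHotEncoder(handle_unknown="ignore"), ["engine_type"])],
    remainder="passthrough",
    sparse_threshold=0,
)

encoded = encoder.fit_transform(features)
# Three engine-type indicator columns followed by the horsepower column.
print(encoded.shape)
```

Setting `handle_unknown="ignore"` means an engine type that was not seen during fitting is encoded as all zeros instead of raising an error at prediction time.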
Selecting models: After converting the data into numbers, a Random Forest Regression model is used
to predict the price of the car and a Random Forest Classifier model is used to predict the
manufacturer of the car.
Train and Test: Splitting randomly chosen entries into train and test data, training the model on
the train data, and testing it on the test data to measure the accuracy achieved.
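The split, train and test steps for both models can be sketched as below. The data here is synthetic and stands in for the collected car data; the 80/20 split, the number of trees and the random seeds are assumptions, not values taken from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, assumed for illustration: 200 cars, 3 numeric features.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(200, 3))
price = 5000 + 20000 * X[:, 0] + rng.normal(0, 500, size=200)  # regression target
maker = (X[:, 1] > 0.5).astype(int)  # classification target (two manufacturers)

# Hold out 20% of the rows for testing; the same random split is applied to both targets.
X_train, X_test, price_train, price_test, maker_train, maker_test = train_test_split(
    X, price, maker, test_size=0.2, random_state=42
)

price_model = RandomForestRegressor(n_estimators=100, random_state=42)
price_model.fit(X_train, price_train)

maker_model = RandomForestClassifier(n_estimators=100, random_state=42)
maker_model.fit(X_train, maker_train)

# score() returns R^2 for the regressor and mean accuracy for the classifier.
price_r2 = price_model.score(X_test, price_test)
maker_accuracy = maker_model.score(X_test, maker_test)
print(f"Price R^2: {price_r2:.2f}, manufacturer accuracy: {maker_accuracy:.2f}")
```

Note that accuracy in the strict sense applies only to the manufacturer classifier; for the price regressor, `score()` reports the coefficient of determination (R^2) instead.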
Improving Data: Evaluating the model using techniques such as cross-validation, the ROC curve and
the classification report, then cleaning up the data and deleting bad entries to improve the
accuracy of the model.
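These evaluation techniques can be sketched for the classifier as follows. The data is synthetic and the target is binary, which is an assumption made so that a single ROC AUC value applies directly; a manufacturer target with many classes would need a one-vs-rest treatment.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic binary problem, assumed for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 4))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)

# 5-fold cross-validation: five accuracy scores, one per held-out fold.
cv_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"Mean cross-validated accuracy: {cv_scores.mean():.2f}")

# Fit once on a train split to produce a report and a ROC AUC on the test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
model.fit(X_train, y_train)

# Precision, recall and F1 score per class.
report = classification_report(y_test, model.predict(X_test))
print(report)

# Area under the ROC curve, computed from the predicted probability of class 1.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"ROC AUC: {auc:.2f}")
```

Cross-validation gives a more stable estimate than a single train/test split, which helps when judging whether a round of data cleaning actually improved the model.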
Execution
Creating program: Creating a Python program for this model with the following:
Input: Takes various car parameters such as mileage, horsepower, peak rpm, number of cylinders
and engine type.
Output: Gives the price and manufacturer of the car that matches the given input parameters.
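The input and output described above could be wired together roughly as follows. This is a sketch under stated assumptions: the training rows, the column names, and the `build_model` and `predict_car` helpers are hypothetical and are not taken from the project's program.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training rows; values are illustrative only.
training = pd.DataFrame({
    "mileage": [30, 21, 27, 19, 31, 22, 28, 18],
    "horsepower": [70, 121, 88, 160, 68, 121, 90, 182],
    "peak_rpm": [4800, 4250, 5000, 5500, 5000, 4250, 5200, 5400],
    "cylinders": [4, 6, 4, 6, 4, 6, 4, 6],
    "engine_type": ["ohc", "ohc", "ohc", "dohc", "ohc", "ohc", "ohc", "dohc"],
    "price": [6500, 21000, 9000, 35000, 6300, 24500, 9500, 36800],
    "manufacturer": ["toyota", "bmw", "honda", "jaguar", "toyota", "bmw", "honda", "jaguar"],
})
FEATURES = ["mileage", "horsepower", "peak_rpm", "cylinders", "engine_type"]


def build_model(estimator):
    """Chain one-hot encoding of engine_type with the given random forest."""
    preprocess = ColumnTransformer(
        [("one_hot", OneHotEncoder(handle_unknown="ignore"), ["engine_type"])],
        remainder="passthrough",
    )
    return make_pipeline(preprocess, estimator)


price_model = build_model(RandomForestRegressor(random_state=0))
price_model.fit(training[FEATURES], training["price"])

maker_model = build_model(RandomForestClassifier(random_state=0))
maker_model.fit(training[FEATURES], training["manufacturer"])


def predict_car(**car):
    """Return (predicted price, predicted manufacturer) for one car's parameters."""
    row = pd.DataFrame([car], columns=FEATURES)
    return float(price_model.predict(row)[0]), str(maker_model.predict(row)[0])


price, maker = predict_car(
    mileage=29, horsepower=72, peak_rpm=5000, cylinders=4, engine_type="ohc"
)
print(f"Predicted price: {price:.0f}, predicted manufacturer: {maker}")
```

Wrapping the encoder and the model in one pipeline means the program can accept raw parameters such as an engine type string, with the same encoding applied at prediction time as during training.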