Weekly Diary Report-244
S. No. Field Answer
1 Project Title Analysis of Car Dekho Dataset: Trends, Insights, and Anomalies
2 Project Description This project analyses the Car Dekho dataset to uncover trends and insights,
including vehicle manufacturing years, price ranges, record counts, missing data,
vehicle diversity, sales trends, the impact of CNG and cost depreciation factors,
and the relationship between selling price and mileage.
3 Outline of the Solution The solution starts with exploring and cleaning the Car Dekho dataset to address
missing values and prepare it for analysis. It then centers on answering targeted
questions about manufacturing years, price ranges, vehicle diversity, sales trends,
and depreciation factors. The project wraps up with visualizing key findings and
summarizing insights and recommendations.
4 Design of the Solution The solution involves loading and cleaning the Car Dekho dataset in Python to
ensure data quality, followed by descriptive analysis to extract insights on vehicle
characteristics, pricing trends, and depreciation factors. Visualizations will
summarize the findings in a concise report with actionable recommendations.
5 Hardware and Software Requirements to execute the project
Hardware requirements: minimum Intel Core i5 or equivalent, minimum 4 GB RAM,
Operating System: Windows 10 or higher.
Software requirements: Google Colab, Visual Studio Code, or any other Python IDE
3. Data Types
• Numbers: Used for values like car prices, mileage, etc.
• Categories or string characters: Used for car brands or fuel types.
• True/False or Boolean values: Used to check conditions, such as whether a car
has an automatic transmission.
4. Conditional Statements
• If-Else: Used to check conditions, such as whether CNG vehicles are present in
the data (see the sketch below).
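A minimal sketch of these constructs, using the project's car_Dekho_DA.csv file and hypothetical column names 'fuel' and 'transmission' (adjust to the actual dataset):

import pandas as pd

df = pd.read_csv('car_Dekho_DA.csv')

# Data types of each column (numeric, string/categorical, boolean)
print(df.dtypes)

# If-Else: check whether any CNG vehicles are present ('fuel' is an assumed column name)
if (df['fuel'] == 'CNG').any():
    print("CNG vehicles found:", (df['fuel'] == 'CNG').sum())
else:
    print("No CNG vehicles in the dataset")

# True/False flag: whether each car has an automatic transmission (assumed 'transmission' column)
df['is_automatic'] = df['transmission'] == 'Automatic'
print(df['is_automatic'].value_counts())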
5. Validation
• We can apply different condition checks to verify that the analysis results are
correct. Checking the data is essential to ensure that the analysis results are
accurate and that the data used is complete and reliable (see the sketch below).
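A minimal validation sketch, using the project's car_Dekho_DA.csv file and hypothetical column names 'selling_price' and 'year' (adjust to the actual dataset):

import pandas as pd

df = pd.read_csv('car_Dekho_DA.csv')

# Completeness: count missing values per column
print(df.isnull().sum())

# Reliability: count exact duplicate records
print("Duplicate rows:", df.duplicated().sum())

# Sanity checks on assumed numeric columns
print("Non-positive selling prices:", (df['selling_price'] <= 0).sum())
print("Manufacturing year range:", df['year'].min(), "to", df['year'].max())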
9 Testing Material
(Screenshots of working outputs; images in case of a hardware project)
10 User Manual 1. Introduction
• This project aims to analyze the Car Dekho dataset to uncover key trends,
insights, and anomalies in the used car market.
2. Requirements
• Software: A Python IDE or Google Colab, plus the Pandas, Matplotlib, and
Seaborn libraries.
• Hardware: A computer with at least 4GB RAM.
3. Installation
1. Install Python from python.org, or use Google Colab instead.
2. Install the pandas, matplotlib, and seaborn libraries.
3. Download the project files.
4. How to Use
1. Load Data:
import pandas as pd
df = pd.read_csv('car_Dekho_DA.csv')
2. View Data: Use df.head() to see the first 5 rows.
3. Analyze Data: Run the analysis code on the loaded DataFrame (a minimal example
follows these steps).
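A minimal usage sketch for these steps, using the project's car_Dekho_DA.csv file and hypothetical column names 'selling_price' and 'fuel' (adjust to the actual dataset):

import pandas as pd

# Step 1: load the data
df = pd.read_csv('car_Dekho_DA.csv')

# Step 2: view the first five rows
print(df.head())

# Step 3: run a simple analysis, e.g. a price summary and fuel type counts
print(df['selling_price'].describe())
print(df['fuel'].value_counts())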
5. Testing & Validation
• Check Edge Cases: Test with the lowest and highest car prices.
• Validate Input: Make sure the program handles all types of analyses
correctly (a minimal check is sketched below).
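A minimal testing sketch for these checks, using the project's car_Dekho_DA.csv file and the hypothetical column names assumed elsewhere in this manual:

import pandas as pd

df = pd.read_csv('car_Dekho_DA.csv')

# Edge cases: the lowest- and highest-priced cars ('selling_price' is an assumed column name)
print("Lowest-priced car:")
print(df.loc[df['selling_price'].idxmin()])
print("Highest-priced car:")
print(df.loc[df['selling_price'].idxmax()])

# Input validation: confirm the columns the analysis relies on actually exist
required = {'selling_price', 'year', 'fuel', 'km_driven'}
missing = required - set(df.columns)
print("Missing columns:", missing if missing else "none")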
7. Conclusion
• This project helps you understand the car dataset through analysis and
visualization.
Dataset Description
• Data used: Car Dekho dataset.
• Key elements:
o The manufacturer of the car.
o The specific model of the car.
o The manufacturing year.
o The price of the car.
o The fuel efficiency of the car.
o The type of fuel used like Petrol, Diesel, CNG.
o The type of transmission like Manual, Automatic.
Code Structure
• Data Loading:
Python Code to load data:
o import pandas as pd
o df = pd.read_csv('car_Dekho_DA.csv')
• Data Summary:
df.describe()
• Data Info:
df.info()
• Data Cleaning:
o Check whether any null values are present in the data and handle them
(a combined sketch follows this section).
• Data Visualization:
o Bar Charts: Transmission type analysis, fuel type distribution,
seller type distribution.
o Pie Charts: Fuel type and seller type distributions.
o Scatter Plots: Cost depreciation vs. kilometers driven, selling
price vs. kilometers driven.
o Box Plot: Selling price distribution of two-wheelers.
o Horizontal Bar Graphs: Most sold vehicle models, overall vehicle
count.
o Purpose: To provide a clear and concise visual summary of key
findings.
o Tools Used: Matplotlib and Seaborn for creating comprehensive
and insightful visualizations.
• Data Analysis:
o Investigate manufacturing years to identify trends over time.
o Analyze price ranges to understand market segments.
o Examine vehicle diversity to assess the variety of cars available.
o Evaluate sales trends to determine peak sales periods.
o Study the impact of CNG on sales and cost depreciation.
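A combined sketch of the cleaning, visualization, and analysis steps listed above, using the project's car_Dekho_DA.csv file and hypothetical column names 'fuel', 'selling_price', 'km_driven', and 'year' (adjust to the actual dataset):

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('car_Dekho_DA.csv')

# Cleaning: inspect and drop rows with missing values
print(df.isnull().sum())
df = df.dropna()

# Bar chart: fuel type distribution
sns.countplot(data=df, x='fuel')
plt.title('Fuel Type Distribution')
plt.show()

# Scatter plot: selling price vs. kilometres driven
sns.scatterplot(data=df, x='km_driven', y='selling_price')
plt.title('Selling Price vs. Kilometres Driven')
plt.show()

# Analysis: vehicle count by manufacturing year and average selling price by fuel type
print(df['year'].value_counts().sort_index())
print(df.groupby('fuel')['selling_price'].mean())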
Day 1 I participated in an EDUNET live session where I set up an IBM website account
and enrolled in the Edunet-TN Data Analytics with Python course. During the
session, I learned some foundational concepts in data analytics and practiced
solving questions with a dataset using Python in Google Colab.
Day 2 I practiced some functions using Python in Google Colab on the dataset:
• Head
• Tail
• Describe
Day 3 Completed Modules 1 & 2 of Introduction to Data Concepts in the IBM course and
completed the two quizzes based on Modules 1 & 2.
Day 4 Completed Modules 3 & 4 of Introduction to Data Concepts in the IBM course and
completed the two quizzes based on Modules 3 & 4.
Day 1 Attended the EDUNET live session, in which I learned about some Python functions
that can be used on a dataset and practiced some questions based on the given
dataset:
• How to upload a dataset or file in Google Colab
• How to read a dataset in our code
• How to import the pandas library
• Some questions based on head, tail, describe, info, and isnull
Day 3 Completed Modules 1 & 2 of Data Science in Our World in the IBM course and
completed the two quizzes based on Modules 1 & 2.
Day 4 Completed Module 3 of Data Science in Our World in the IBM course and completed
the quiz based on Module 3.
Day 5 Completed Modules 4 & 5 of Data Science in Our World in the IBM course, completed
the quizzes based on Modules 4 & 5, and completed the final assignment for Data
Science in Our World.
Day 1 Attended the EDUNET live session, in which I learned about graphs that can be used
on a dataset and practiced some questions based on the given dataset:
• Depreciation
• Data visualization using graphs
Day 2 Completed Module 1 of Overview of Data Tools and Languages in the IBM course and
completed the quiz based on Module 1.
Day 3 Completed Module 2 of Overview of Data Tools and Languages in the IBM course and
completed the quiz based on Module 2.
Day 4 Completed Module 3 of Overview of Data Tools and Languages in the IBM course,
completed the quiz based on Module 3, and completed the final assignment for
Overview of Data Tools and Languages.
Day 5 Completed Modules 1 & 2 of Clean, Refine, and Visualize Data with IBM Watson
Studio in the IBM course and completed the quizzes based on Modules 1 & 2.
Day 1 Attended the EDUNET live session, in which I practiced some questions based on the
given dataset.
Day 2 Completed Modules 3 & 4 of Clean, Refine, and Visualize Data with IBM Watson
Studio in the IBM course and completed the quizzes based on Modules 3 & 4.
Day 3 Completed Module 5 of Clean, Refine, and Visualize Data with IBM Watson Studio
in the IBM course, completed the quiz based on Module 5, and completed the final
assignment for Clean, Refine, and Visualize Data with IBM Watson Studio.
Day 4 Completed Modules 1, 2 & 3 of Your Future in Data: The Job Landscape in the IBM
course.
Day 5 Completed Module 4 of Your Future in Data: The Job Landscape in the IBM course
and practiced some questions given in the Telegram group.
Day 1 Attended the EDUNET live session, in which I practiced some questions based on
previous sessions.
Day 2 Started the Python for Programmers course from the Data Analytics with Python
learning plan on the IBM website and completed the modules Getting Started with
Python and Control Flow in Python.
Day 3 I completed the modules on basic syntax, control flow, and functions in the Python
for Programmers course on the IBM website. Additionally, I practiced some data
analytics questions provided during the live session.
Day 5 Completed the built-in data structures module in the Python for Programmers course
on the IBM website and practiced some questions given in the live session.
Day 1 Attended the EDUNET live session, in which we practiced some questions and the
instructors discussed the project.
Day 3 I analyzed several project-related questions and began the "Python for Data
Science" course outlined in the course plan on the IBM website, as recommended
by the internship instructors.
Day 4 Completed part of Module 1 of Python for Data Science on the IBM website.
Day 5 Completed Module 1 of Python for Data Science on the IBM website and reviewed
some previously completed project-related questions.
Day 1 Completed some data analytics questions on the given dataset for the project.
Day 2 Completed Modules 2 & 3 of Python for Data Science on the IBM website.
Day 3 Completed Modules 4 & 5 of Python for Data Science on the IBM website and also
completed some data analytics questions on the given dataset related to the project.
Day 4 I finished all the modules and courses required for the internship, received my
completion certificate, and tackled some data analytics questions related to the
project using the provided dataset.
Day 5 Completed the project and submitted the project PPT to Edunet.