Car Dataset Project
Impact of Car Features on Price and Profitability
By: Trainity
Name: Shaikh Farha
"The full analysis can be found in the
https://docs.google.com/spreadsheets/d/1baBRmgAjNo88Il8uQby-SKtcrUpAWGf1/edit?usp=drive_link&ouid=101371966282785241371&rtpof=true&sd=true
OR https://d.docs.live.net/4bb7ae8600c5b2b7/Documents/TRAINITY%20ASSIGNMENT/CARDATASET.xlsx
Project Description:
Overview:
This project analyzes the relationship between car features,
pricing, and profitability to help car manufacturers optimize their
strategies. The study examines trends, relationships, and key
factors influencing the automotive market using a comprehensive
dataset.
Business Problem:
The automotive industry faces increasing competition and
evolving consumer preferences. Understanding how car features
influence pricing and profitability is critical for manufacturers to
remain competitive. This project answers: How can a car
manufacturer optimize pricing and product development to
maximize profitability while meeting consumer demand?
Dataset Description:
The dataset, "Car Features and MSRP," includes:
Total Observations: 11,159
Variables: 16 (e.g., Make, Model, Year, MSRP, Engine
Specifications, Fuel Efficiency, etc.)
File Format: CSV
Source: Kaggle, Cooper Union
Update Year: 2017
Objectives:
1. Explore trends in car features and pricing over time.
2. Identify relationships between features and consumer
demand.
3. Predict pricing based on feature analysis.
4. Develop actionable insights for strategic decisions.
The crossover market category has the highest popularity among customers.
[Figure: combo chart of Count of Model and Average of Popularity by brand]
Insight 2: Relationship Between
Engine Power and Price
Task 2: Generate a scatter chart with engine power on the x-axis
and price on the y-axis. Add a trendline to highlight correlations.
[Figure: scatter chart of engine power (x-axis) vs. price (y-axis) with trendline]
Regression Statistics
Multiple R           0.658663
R Square             0.433836
Adjusted R Square    0.433786
Standard Error       46303.4
Observations         11199

ANOVA
              df       SS            MS          F          Significance F
Regression    1        1.83955E+13   1.84E+13    8579.967   0
Residual      11197    2.40064E+13   2.14E+09
Total         11198    4.24019E+13
Multiple R (0.6587):
This indicates the strength of the relationship between Engine HP and MSRP. A
value of 0.6587 suggests a moderate positive correlation.
R Square (0.4338):
This means that 43.38% of the variation in MSRP can be explained by Engine HP.
While it’s significant, other factors not included in this model may also influence
MSRP.
Observations (11199):
The dataset contains 11,199 rows used in the analysis.
2. ANOVA Table
Residual SS (2.4006E+13):
Represents the variation in MSRP that is not explained by Engine HP.
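As a quick check on how the ANOVA figures relate to R Square, the short sketch below recomputes the explained and unexplained shares from the sums of squares reported in the table. The variable names are illustrative only; the numbers are copied from the output above.

```python
# Sums of squares copied from the ANOVA table for the Engine HP regression.
ss_regression = 1.83955e13   # variation in MSRP explained by Engine HP
ss_residual = 2.40064e13     # variation in MSRP left unexplained
ss_total = ss_regression + ss_residual  # the table reports 4.24019E+13

# R Square is the explained share of the total variation.
r_squared = ss_regression / ss_total
unexplained_share = ss_residual / ss_total

print(f"R Square          = {r_squared:.4f}")
print(f"Unexplained share = {unexplained_share:.4f}")
```

The explained share agrees with the reported R Square to within the rounding of the table values.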
3. Coefficients Table
Equation: MSRP = 369.005 × Engine HP − 51,556.81 (taken from the trendline on the Engine HP vs MSRP chart)
Key Takeaways:
Other variables (e.g., Engine Cylinders, Highway MPG, Market Category) may
further improve the model's predictive power.
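For readers who want to reproduce this kind of single-variable regression outside Excel, here is a minimal sketch of an ordinary least squares fit in plain Python. The function name `fit_line` and the five sample rows are hypothetical and are not taken from the dataset; the sketch only illustrates how slope, intercept, and R Square are obtained.

```python
def fit_line(x, y):
    """Least squares fit of y = slope * x + intercept; also returns R Square."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Spread of x and co-movement of x and y around their means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R Square = 1 - (residual variation / total variation).
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, r_squared


# Hypothetical sample rows (Engine HP, MSRP), for illustration only.
engine_hp = [150, 200, 250, 300, 400]
msrp = [22000, 27000, 35000, 41000, 70000]

slope, intercept, r_squared = fit_line(engine_hp, msrp)
print(f"MSRP = {slope:.2f} * Engine HP + {intercept:.2f}, R Square = {r_squared:.3f}")
```

Run on the full dataset columns instead of the sample rows, the same function should give values comparable to the Excel regression output.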
Variable 2:
R Square (0.289): Approximately 29% of the variation in car prices can be explained by the number of seats.
Significance F (0): The regression model is statistically significant (p-value < 0.05).
Intercept (-62,255.73): If the number of seats were 0, the predicted price would be -62,255.73. This value has no practical meaning; it is simply where the fitted line crosses the price axis.
Coefficient for "Number of Seats" (18,405.30): For each additional seat in a car, the predicted price increases by approximately $18,405.
P-value for Seats (0): Statistically significant, indicating that the association between seats and price is very unlikely to be due to chance.
Interpretation: Larger vehicles with more seats are priced significantly higher, likely reflecting increased material and manufacturing costs.
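To make the intercept and coefficient concrete, the sketch below plugs a few seat counts into the fitted line. The function name `predicted_price` is illustrative; the two constants are the values reported above.

```python
# Coefficients reported in the regression output for Variable 2.
INTERCEPT = -62255.73
COEF_SEATS = 18405.30


def predicted_price(seats):
    """Price predicted by the fitted line for a given number of seats."""
    return INTERCEPT + COEF_SEATS * seats


for seats in (2, 4, 6):
    print(seats, round(predicted_price(seats), 2))
```

Note that the line predicts a negative price for two seats, which shows why the intercept should not be read literally and why a single-variable model is only a rough guide.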
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.538043783
R Square             0.289491112
Adjusted R Square    0.289427657
Standard Error       51871.24744
Observations         11199

ANOVA
              df       SS            MS            F             Significance F
Regression    1        1.2275E+13    1.2275E+13    4562.127281   0
Residual      11197    3.01269E+13   2690626311
Total         11198    4.24019E+13
Coefficient for Highway MPG (-1,142.13): For each one-unit increase in highway MPG, the predicted price decreases by approximately $1,142.
Interpretation: Cars with higher highway MPG tend to be priced lower, possibly reflecting demand for fuel efficiency in lower-priced vehicles or cost-saving designs. The relationship is weak, however: R Square is 0.0278, so highway MPG alone explains under 3% of the variation in price.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.166631495
R Square             0.027766055
Adjusted R Square    0.027679225
Standard Error       60677.45054
Observations         11199

ANOVA
              df       SS            MS            F          Significance F
Regression    1        1.17733E+12   1.17733E+12   319.7754   1.54521E-70
Residual      11197    4.12246E+13   3681753004
Total         11198    4.24019E+13
Engine Cylinders    Average of highway MPG
0                   98.88135593
3                   38.66666667
4                   31.57666895
5                   26.06508876
6                   24.00116523
8                   20.17709924
10                  20
12                  17.73684211
16                  14
Grand Total         26.61059023
[Figure: Average Highway MPG by Engine Cylinders, with linear trendline]
Correlation Analysis: -0.3257
Interpretation:
There is an inverse relationship between these two variables: as the number of cylinders increases, the average highway miles per gallon decreases. Equivalently, cars with higher fuel efficiency tend to have fewer cylinders.
A negative correlation coefficient (e.g., -0.75) indicates that as the number of cylinders increases, MPG tends to decrease. A value close to 0 indicates a weak or no linear relationship; a value close to -1 or 1 indicates a strong one.
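The correlation coefficient discussed above is what Excel's CORREL function returns. A minimal sketch of the same Pearson calculation in plain Python follows; the function name `pearson_r` and the six sample rows are hypothetical and only illustrate the sign of the result.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return covariance / sqrt(var_x * var_y)


# Hypothetical sample rows (Engine Cylinders, highway MPG), for illustration only.
cylinders = [4, 4, 6, 6, 8, 12]
highway_mpg = [34, 31, 27, 25, 21, 17]

r = pearson_r(cylinders, highway_mpg)
print(f"correlation = {r:.4f}")  # negative: more cylinders, lower MPG
```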
--------------------------------------------------------------
Dashboard Creation:
Dashboard Requirements: Create an interactive Excel dashboard addressing the following questions:
Question 1: Distribution of Car Prices by Brand and Body Style
Visualization: Stacked column chart with slicers for brand and
body style.
Data Preparation: Use SUMIF or Pivot Tables to calculate MSRP
distribution.
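For comparison with the SUMIF or Pivot Table step, the sketch below shows the same grouping logic in plain Python: total MSRP per brand and per body style. The four sample rows and the helper name `sum_msrp_by` are hypothetical and stand in for the dataset columns.

```python
from collections import defaultdict

# Hypothetical sample rows: (brand, body style, MSRP). Not taken from the dataset.
rows = [
    ("BMW", "Sedan", 40000),
    ("BMW", "Coupe", 55000),
    ("Kia", "Sedan", 22000),
    ("Kia", "Sedan", 24000),
]


def sum_msrp_by(rows, key_index):
    """Total MSRP per distinct value of the chosen column (like SUMIF per label)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


by_brand = sum_msrp_by(rows, 0)   # groups on the brand column
by_style = sum_msrp_by(rows, 1)   # groups on the body style column
print(by_brand)
print(by_style)
```

Each dictionary corresponds to one series of the stacked column chart; a slicer is equivalent to filtering `rows` before summing.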
[Figure: DISTRIBUTION OF PRICE, total by brand]
[Figure: DISTRIBUTION OF PRICE, total by body style]
[Figure: price by brand, y-axis 0 to 1,800,000]
[Figure: AVERAGE PRICE by body style]
---------------------------------------------------------------
Question 4: Trends in Fuel Efficiency by Body Style and Year
Data Preparation: Use AVERAGEIFS for body style and model year.
Average of highway MPG by body style (rows) and model year (columns)

Row Labels            1990          1995          2000          2005          2010          2015          2016          2017          Grand Total
2dr Hatchback         30.4          28.6          29.72727273   30.33333333   27.125        36.10294118   36.26530612   37.4375       34.75438596
Convertible           23.5          24.5          25.28571429   20.72727273   24.26315789   27.22377622   27.51666667   27.80263158   26.99479167
Coupe                 24.27272727   25.51724138   24.16666667   26            23.52173913   26.16931217   27.10989011   27.71724138   26.6272578
Passenger Minivan     19.5          20.1          23.16666667   21.88888889   23.85714286   25.65384615   25.5          26.05555556   24.455
Regular Cab Pickup    22            20.375        20.83333333   18            21            22.74285714   22.52941176   22.52941176   22.20610687
Sedan                 23.65384615   23.88461538   26.86363636   25.75409836   26.12307692   32.64007092   33            32.61589404   31.90494297
Wagon                 24.3          23.88888889   31            24.27777778   28.47826087   32.83333333   33.08333333   30.86486486   30.69196429
Grand Total           23.58441558   23.09401709   23.84347826   23.58685446   24.12455516   28.87554479   29.09937289   28.584        28.24596406

Rows with blank year cells (yearly values listed in column order; which years are blank cannot be recovered from the extracted layout):
2dr SUV: 20, 16, 18.75, 18.66666667, 30, 30, 29; Grand Total 21.61904762
4dr Hatchback: 31, 27.66666667, 30.6, 29.5, 41.57638889, 42.28, 40.29411765; Grand Total 40.66908213
4dr SUV: 17.73333333, 19.33333333, 23.25454545, 25.76951673, 26.1965812, 25.73739496; Grand Total 25.59731151
Cargo Minivan: 20, 21.5, 20.85714286, 27.5, 27.11111111, 26.5; Grand Total 25.28571429
Cargo Van: 18.33333333, 16.4, 17, 16; Grand Total 16.8125
Convertible SUV: 26, 28; Grand Total 27.33333333
Crew Cab Pickup: 23, 18.94594595, 22.12121212, 22.37593985, 21.96153846; Grand Total 21.89244851
Extended Cab Pickup: 22, 20, 20.5, 21, 21.65934066, 21.78409091, 20.98684211; Grand Total 21.39249147
Passenger Van: 15, 14.5, 18.14285714, 17.71428571, 19; Grand Total 17.98684211
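The AVERAGEIFS step behind this table averages highway MPG over all rows that match both a body style and a model year. A minimal sketch of that two-condition average in plain Python is shown below; the four sample rows and the function name `average_mpg` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows: (body style, model year, highway MPG). Not from the dataset.
rows = [
    ("Sedan", 2015, 32),
    ("Sedan", 2015, 34),
    ("Sedan", 2016, 33),
    ("Coupe", 2015, 26),
]


def average_mpg(rows):
    """Average highway MPG for every (body style, year) pair that has data."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for style, year, mpg in rows:
        sums[(style, year)] += mpg
        counts[(style, year)] += 1
    return {key: sums[key] / counts[key] for key in sums}


averages = average_mpg(rows)
print(averages)
```

Pairs with no matching rows are simply absent from the result, which corresponds to the blank cells in the pivot table.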
[Figure: average highway MPG by body style, one series per year (1990, 1995, 2000, 2005, 2010, 2015, 2016, 2017)]
Question 5: Relationship Between
Horsepower, MPG, and Price by
Brand
Visualization: Bubble chart with horsepower, MPG, and price, colored by
brand.
Data Preparation: Label bubbles with model names and calculate
averages using AVERAGEIFS.
Brand           Average Horsepower   Average MPG   Average Price
Aston Martin    483.76               18.93         198123.46
Audi            280                  28.93         54574.12
Bentley         533.85               18.91         247169.32
BMW             329.62               29.13         62162.56
Bugatti         1001                 14            1757223.67
Buick           220.01               27.01         29034.19
Cadillac        332.8                25.24         56368.27
Chevrolet       249.52               25.93         29074.73
Chrysler        229.14               26.37         26722.96
Dodge           254.35               22.99         24857.05
Ferrari         511.96               15.72         238218.84
FIAT            148.96               37.34         22670.24
Ford            249.73               23.89         28511.31
Genesis         347.33               25.33         46616.67
GMC             267.65               21.46         32444.09
Honda           197.05               32.4          26655.15
HUMMER          261.24               17.29         36464.41
Hyundai         205.2                29.77         24926.26
Infiniti        310.68               24.8          42640.27
Kia             207.39               30.69         25513.76
[Figure: bubble chart of Horsepower (x-axis) vs. MPG (y-axis)]
RESULT
Insight 1: Popularity of Car Models Across Market Categories
Results:
o The pivot table revealed SUVs and Sedans as the most
popular categories, with significantly higher counts and
popularity scores.
o Visualization: A combo chart showed a strong alignment
between model counts and popularity scores, especially in the
SUV and Sedan segments.
Dashboard Results:
Question 1: Distribution of Car Prices by Brand and Body Style
Result:
o Stacked column charts showed that SUVs dominate price
distribution, with premium brands like Porsche and Mercedes
topping the price range.
o Slicers: Enabled filtering by body style and brand.
Question 2: Brands with the Highest and Lowest Average MSRPs
Result:
o Highest MSRP: Porsche and Mercedes-Benz.
o Lowest MSRP: Kia and Hyundai.
o Visualization: A clustered column chart provided a clear
comparison.
Question 3: Impact of Features on MSRP by Body Style
Result:
o Transmission type (Automatic vs. Manual) showed a significant
impact on pricing within each body style.
o Visualization: Scatter plots depicted MSRP distribution by
transmission type and body style.
Question 4: Trends in Fuel Efficiency by Body Style and Year
Result:
o Compact cars showed consistent improvement in MPG over
time, while SUVs and trucks lagged.
o Visualization: Line charts highlighted MPG trends over the
years.
Question 5: Relationship Between Horsepower, MPG, and Price by
Brand
Result:
o Bubble charts showed:
High horsepower and low MPG models clustered at
premium price ranges.
Eco-friendly cars with low horsepower and high MPG
positioned at lower price ranges.
o Visualization: Bubble sizes represented average prices,
colored by brand.
Summary of Results:
Key Finding 1: Engine HP and cylinders significantly influence
price positively, while fuel efficiency affects price negatively.
Key Finding 2: Premium brands dominate high-MSRP categories,
while budget brands excel in affordability.
Key Finding 3: Market trends favor fuel efficiency improvements,
especially in compact cars.
--------------------------------------------------------
DASHBOARD
[Figure: ENGINE HP VS MSRP scatter chart with linear trendline f(x) = 369.005339623758 x − 51556.8147768116, R² = 0.433836354106972]
[Figure: Engine Cylinder Vs Average Highway MPG]
[Figure: Count of Model and Average of Popularity by brand]
[Figure: DISTRIBUTION OF PRICE by brand and by body style]
[Figure: AVERAGE PRICE by brand]
[Figure: average highway MPG by body style and year (1990 to 2017)]
[Figure: Relationship between Horsepower and MPG]