
Activity 2

This document presents a dataset focused on automobiles, detailing various car models and their features, including make, model, year, engine size, and performance metrics. It categorizes variables into categorical and numerical types, discusses scales of measurement, and outlines methods for data visualization such as bar charts, histograms, and scatter plots to analyze trends and relationships in the automobile industry. The analysis aims to provide insights into market dynamics, consumer preferences, and factors affecting vehicle performance.

Uploaded by

Rhytam
Copyright
© All Rights Reserved

Statistics

ACTIVITY
DATA EXTRACTION AND ANALYSIS

Introduction

This dataset focuses on automobiles, providing detailed information about various car
models and their features. It includes variables such as make, model, year of
manufacture, engine size, fuel type, and performance metrics. By analyzing this data, we
can gain insights into trends in the automobile industry, compare different car models,
and evaluate factors influencing vehicle performance and consumer preferences. The
analysis aims to clarify market dynamics, support informed decisions about car
purchases, and identify key features that affect vehicle value and performance.

Type of Variables

Categorical Variables:

● Name: This is a categorical variable because it represents the make and
model of the car, which are distinct categories.
● Origin: This is a categorical variable as it denotes the country of origin for
the car, which can be categorized into different groups (e.g., USA, Europe,
Japan).

Numerical Variables:

● MPG (Miles Per Gallon): This is a numerical variable representing the fuel
efficiency of the car, measured on a continuous scale.
● Cylinders: This is a numerical variable as it represents the number of
cylinders in the engine, which is a countable integer.
● Displacement: This is a numerical variable measuring the engine
displacement in cubic inches or liters, and it is continuous.
● Horsepower: This is a numerical variable representing the engine power,
measured in horsepower, and it is continuous.
● Weight: This is a numerical variable representing the weight of the car,
typically measured in pounds or kilograms, and it is continuous.
● Acceleration: This is a numerical variable that measures the car's
acceleration (e.g., time to reach a certain speed), and it is continuous.
● Model_year: This is a numerical variable representing the year the car
model was manufactured. It takes discrete integer values but is often
treated as continuous for analysis purposes.
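
The split between categorical and numerical variables can be checked with a short sketch like the one below. The two sample rows and their values are hypothetical and only illustrate the column types; they are not taken from the dataset.

```python
# Hypothetical sample rows: the values are made up for illustration only.
cars = [
    {"name": "car a", "origin": "usa", "mpg": 18.0, "cylinders": 8,
     "displacement": 307.0, "horsepower": 130.0, "weight": 3504.0,
     "acceleration": 12.0, "model_year": 1970},
    {"name": "car b", "origin": "japan", "mpg": 31.5, "cylinders": 4,
     "displacement": 98.0, "horsepower": 68.0, "weight": 2045.0,
     "acceleration": 18.5, "model_year": 1977},
]


def variable_type(values):
    """Return 'numerical' if every value is a number, else 'categorical'."""
    if all(isinstance(v, (int, float)) for v in values):
        return "numerical"
    return "categorical"


# Classify each column by looking at all of its values.
types = {col: variable_type([row[col] for row in cars]) for col in cars[0]}

for col, kind in types.items():
    print(f"{col}: {kind}")
```

This simple type check works here because the categorical columns hold text. A category stored as a number code would need to be labeled by hand.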

Scales of Measurement
Nominal Scale:

● Name: Represents the make and model of the car. There is no inherent
order among different car names or models.
● Origin: Represents the country of origin of the car. Categories are
distinct without any inherent order.

Ordinal Scale:

● None in this dataset. Ordinal scales involve categories with a
meaningful order, but the given variables do not fit this scale.

Interval Scale:

● Model_year: Represents the year of manufacture. The differences
between years are meaningful, but there is no true zero point.

Ratio Scale:

● MPG (Miles Per Gallon): Measures fuel efficiency. It has a meaningful
zero point (zero miles per gallon) and the differences between values
are meaningful.
● Cylinders: Represents the number of engine cylinders. It has a true
zero (zero cylinders) and differences between values are meaningful.
● Displacement: Measures engine size in cubic inches or liters. It has a
true zero and the differences between values are meaningful.
● Horsepower: Measures engine power. It has a true zero (zero
horsepower) and differences between values are meaningful.
● Weight: Measures the car’s weight. It has a true zero (zero weight) and
the differences between values are meaningful.
● Acceleration: Measures the time taken to accelerate. It has a true zero
(zero seconds) and the differences between values are meaningful.

Summary:

● Nominal: Name, Origin
● Ordinal: None
● Interval: Model_year
● Ratio: MPG, Cylinders, Displacement, Horsepower, Weight,
Acceleration
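
The summary above can be written down as a small lookup table. The sketch below groups the variables by scale and shows the practical difference between a ratio and an interval variable. The MPG and year values are hypothetical examples, not dataset values.

```python
# Scale of measurement for each variable, as listed in the summary.
scales = {
    "name": "nominal",
    "origin": "nominal",
    "model_year": "interval",
    "mpg": "ratio",
    "cylinders": "ratio",
    "displacement": "ratio",
    "horsepower": "ratio",
    "weight": "ratio",
    "acceleration": "ratio",
}

# Group the variable names by their scale.
by_scale = {}
for variable, scale in scales.items():
    by_scale.setdefault(scale, []).append(variable)

# Ratio scale: a true zero makes ratios meaningful.
# A car at 30 MPG is twice as fuel efficient as one at 15 MPG.
mpg_ratio = 30.0 / 15.0

# Interval scale: differences are meaningful, ratios are not.
# 1980 is 10 years after 1970, but it is not "1.005 times later".
year_gap = 1980 - 1970

print(by_scale)
print(mpg_ratio, year_gap)
```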

Discrete and Continuous

Discrete Variables:

● Cylinders: This is a discrete numerical variable because it represents a
countable number of engine cylinders (e.g., 4, 6, 8) and can only take
on specific integer values.

Continuous Variables:

● MPG (Miles Per Gallon): Continuous, as it can take any value within a
range and can be measured with varying precision.
● Displacement: Continuous, as it represents engine size and can take
any value within a range.
● Horsepower: Continuous, as it can take any value within a range and
can be measured with varying precision.
● Weight: Continuous, as it represents the car’s weight and can be
measured with varying precision.
● Acceleration: Continuous, as it measures the time required to
accelerate and can have any value within a range.
● Model_year: Represents discrete years, but it is often treated as
continuous for analysis purposes.
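
A rough way to spot discrete variables is to check whether every recorded value is a whole number. The sketch below is only a heuristic, and the sample values are hypothetical. A continuous quantity that happens to be recorded in whole units would be wrongly flagged as discrete, so the result still needs a judgment call.

```python
def looks_discrete(values):
    """Heuristic: True if every value is a whole number."""
    return all(float(v).is_integer() for v in values)


# Hypothetical sample values for two variables.
cylinders = [4, 6, 8, 4]
mpg = [18.0, 24.5, 31.9, 27.2]

print("cylinders discrete?", looks_discrete(cylinders))
print("mpg discrete?", looks_discrete(mpg))
```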

Categorical Variables:

● Name: Categorical, as it represents different car models and makes.
● Origin: Categorical, as it represents the country of origin.

Charts and Graphs

Bar Chart or Pie Chart: Show the distribution of cars
from different countries. Helps in visualizing the share of
each origin in the dataset.

In the bar chart:

● The X-axis will represent the different origins.
● The Y-axis will represent the count of cars.
● The chart shows the “Number of Cars by Origin”.

In the pie chart, each label represents the frequency of cars
from a different origin.
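
Both charts are built from the same frequency table. The sketch below counts cars per origin (the bar heights) and converts the counts to percentage shares (the pie slices). The list of origins is a hypothetical sample, and the text bars stand in for a real chart.

```python
from collections import Counter

# Hypothetical sample of the Origin column.
origins = ["usa", "usa", "japan", "europe", "usa", "japan", "usa", "europe"]

# Bar chart data: number of cars for each origin.
counts = Counter(origins)

# Pie chart data: share of each origin as a percentage of all cars.
total = sum(counts.values())
shares = {origin: 100 * n / total for origin, n in counts.items()}

# Simple text version of the bar chart, largest group first.
for origin, n in counts.most_common():
    print(f"{origin:<8}{'#' * n} {n} ({shares[origin]:.1f}%)")
```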

Numerical Variables:

Histogram: Display the distribution of a numerical
variable such as MPG or engine displacement. Useful for
seeing the range and common values.

In the MPG histogram:

● The X-axis will have the MPG ranges (e.g., 10-15, 15-20, 20-25,
etc.).
● The Y-axis will show the frequency (count of cars) in each MPG
range.

This setup helps visualize how the cars are distributed across different
MPG values, showing the concentration of cars within specific fuel
efficiency ranges.
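
The binning behind a histogram can be sketched as follows. The MPG values are hypothetical, and the bin width of 5 matches the example ranges above. A value that falls exactly on a boundary, such as 15.0, is counted in the higher bin here, which is one common convention and an assumption of this sketch.

```python
from collections import Counter

# Hypothetical MPG values for ten cars.
mpg_values = [14.0, 18.5, 22.0, 24.9, 15.0, 31.2, 19.0, 27.5, 12.0, 20.0]
bin_width = 5


def bin_label(value, width):
    """Return the label of the bin [lower, lower + width) containing value."""
    lower = int(value // width) * width
    return f"{lower}-{lower + width}"


# Frequency of cars in each MPG range (the Y-axis of the histogram).
frequency = Counter(bin_label(v, bin_width) for v in mpg_values)

# Print the bins in increasing order of their lower bound.
for label in sorted(frequency, key=lambda s: int(s.split("-")[0])):
    print(f"{label:<6} {'#' * frequency[label]} {frequency[label]}")
```

The same function can be reused for displacement, horsepower, weight, or acceleration by changing the values and the bin width.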

The X-axis will represent the bins or intervals of
engine displacement (e.g., 0-50, 50-100, 100-150
cubic inches, etc.), and the Y-axis will represent the
frequency, or the number of cars that fall into each
displacement range.

The X-axis will represent the bins or intervals of
engine horsepower (e.g., 0-50, 50-100, 100-150 hp,
etc.), and the Y-axis will represent the frequency, or
the number of cars that fall into each horsepower
range.

The X-axis will represent the bins or intervals of
weight of automobiles (e.g., 1500-1700, ..., 2500-2700
kilos, etc.), and the Y-axis will represent the frequency,
or the number of cars that fall into each weight range.

The X-axis will represent the bins or intervals of
acceleration of automobiles (e.g., 5-7, 7-9, 9-11 seconds,
etc.), and the Y-axis will represent the frequency, or
the number of cars that fall into each acceleration
range.

Line Chart: Show trends over years if the dataset
covers multiple years. Useful for observing
changes in car characteristics over time.

The X-axis will represent the different years in which
cars are manufactured (e.g., 1970, 1972, 1980, etc.), and
the Y-axis will represent the average MPG (miles per
gallon) of cars for each model year.
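
The points of the line chart are the average MPG for each model year. A minimal sketch of that calculation is shown below, using hypothetical (model year, MPG) pairs rather than real dataset rows.

```python
from collections import defaultdict

# Hypothetical (model_year, mpg) pairs.
records = [
    (1970, 18.0), (1970, 15.0),
    (1972, 21.0), (1972, 23.0),
    (1980, 32.0), (1980, 30.0),
]

# Collect the MPG values that belong to each model year.
mpg_by_year = defaultdict(list)
for year, mpg in records:
    mpg_by_year[year].append(mpg)

# Average MPG per year, ordered by year (X-axis: year, Y-axis: average MPG).
average_mpg = {
    year: sum(values) / len(values)
    for year, values in sorted(mpg_by_year.items())
}

for year, avg in average_mpg.items():
    print(year, round(avg, 1))
```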

Combined analysis

● Scatter Plot (MPG vs. Horsepower): Examine
the relationship between fuel efficiency and
engine power.

The scatter plot visualizes the relationship between
fuel efficiency (measured in miles per gallon, MPG)
and engine power (measured in horsepower) for a
dataset of automobiles. Each point on the plot
represents an individual car, with the X-axis
displaying MPG and the Y-axis displaying
horsepower.

From the scatter plot, we observe a moderate
negative correlation between MPG and horsepower.
This indicates that, generally, as the horsepower of a
car increases, its fuel efficiency decreases. In other
words, cars with more powerful engines tend to
have lower fuel efficiency. This trend is typical in the
automobile industry, as higher engine power often
requires more fuel consumption.

By examining this scatter plot, we gain insights into
the trade-off between engine performance and fuel
economy, which can be crucial for consumers and
manufacturers when evaluating and designing
vehicles.
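
The direction and strength of the relationship seen in a scatter plot can be summarized with the Pearson correlation coefficient. The sketch below computes it from scratch for a few hypothetical (MPG, horsepower) pairs. The numbers are illustrative only, so the printed value does not describe the actual dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values: MPG on the X-axis, horsepower on the Y-axis.
mpg = [15.0, 18.0, 22.0, 27.0, 32.0]
horsepower = [165.0, 150.0, 110.0, 90.0, 70.0]

r = pearson_r(mpg, horsepower)
print(f"correlation between MPG and horsepower: {r:.2f}")
```

A value of r below zero means that horsepower tends to fall as MPG rises, and the closer r is to -1, the stronger that negative relationship.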

Thank you ! 😀
