Activity 2
ACTIVITY
DATA EXTRACTION AND ANALYSIS
Introduction
This dataset focuses on automobiles, providing detailed information about various car
models and their features. It includes variables such as car name, origin, model year,
number of cylinders, engine displacement, horsepower, weight, acceleration, and fuel
efficiency (MPG). By analyzing this data, we can gain insights into trends in the
automobile industry, compare different car models, and evaluate factors influencing
vehicle performance and consumer preferences. The analysis aims to support an
understanding of market dynamics, informed decisions about car purchases, and the
identification of key features that affect vehicle value and performance.
Type of Variables
Categorical Variables:
● Name: This is a categorical variable giving the make and model of the car.
● Origin: This is a categorical variable giving the country of origin of the
car.
Numerical Variables:
● MPG (Miles Per Gallon): This is a numerical variable representing the fuel
efficiency of the car, measured on a continuous scale.
● Cylinders: This is a numerical variable as it represents the number of
cylinders in the engine, which is a countable integer.
● Displacement: This is a numerical variable measuring the engine
displacement in cubic inches or liters, and it is continuous.
● Horsepower: This is a numerical variable representing the engine power,
measured in horsepower, and it is continuous.
● Weight: This is a numerical variable representing the weight of the car,
typically measured in pounds or kilograms, and it is continuous.
● Acceleration: This is a numerical variable that measures the car's
acceleration (e.g., time to reach a certain speed), and it is continuous.
● Model_year: This is a numerical variable representing the year the car
model was manufactured, and it is treated as continuous for analysis
purposes.
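The split between numerical and categorical variables can be checked mechanically. The sketch below is one possible way to do it with plain Python; the rows, their values, and the helper name `classify_columns` are illustrative assumptions, not part of the original dataset or analysis.

```python
# Illustrative rows only: these values are made up, not taken from the dataset.
rows = [
    {"mpg": 18.0, "cylinders": 8, "horsepower": 130.0, "origin": "usa", "name": "car a"},
    {"mpg": 27.5, "cylinders": 4, "horsepower": 88.0, "origin": "japan", "name": "car b"},
    {"mpg": 31.0, "cylinders": 4, "horsepower": 76.0, "origin": "europe", "name": "car c"},
]


def classify_columns(rows):
    """Label a column 'numerical' if every value is an int or float, else 'categorical'."""
    kinds = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        kinds[column] = "numerical" if is_numeric else "categorical"
    return kinds


column_kinds = classify_columns(rows)
for column, kind in column_kinds.items():
    print(f"{column}: {kind}")
```

A type check like this is only a first pass: a variable stored as a number (for example a numeric origin code) can still be categorical in meaning.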
Scales of Measurement
Nominal Scale:
● Name: Represents the make and model of the car. There is no inherent
order among different car names or models.
● Origin: Represents the country of origin of the car. Categories are
distinct without any inherent order.
Ordinal Scale:
Interval Scale:
Ratio Scale:
● Weight: Measures the car's weight. It has a true zero (zero weight), so
both differences and ratios between values are meaningful.
● Acceleration: Measures time taken to accelerate. It has a true zero (zero
seconds), so both differences and ratios between values are meaningful.
Summary:
Discrete and Continuous
Discrete Variables:
● Cylinders: Discrete, as it counts the cylinders in the engine and takes
only whole-number values.
Continuous Variables:
● MPG (Miles Per Gallon): Continuous, as it can take any value within a
range and can be measured with varying precision.
● Displacement: Continuous, as it represents engine size and can take
any value within a range.
● Horsepower: Continuous, as it can take any value within a range and
can be measured with varying precision.
● Weight: Continuous, as it represents the car’s weight and can be
measured with varying precision.
● Acceleration: Continuous, as it measures the time required to
accelerate and can have any value within a range.
● Model_year: Strictly discrete, as it takes whole-year values, but it is
often treated as continuous for analysis purposes.
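As a rough illustration of the discrete versus continuous distinction, the sketch below flags a variable as discrete when all of its recorded values are whole numbers. The helper name `is_discrete` and the sample values are assumptions made for this example.

```python
def is_discrete(values):
    """Return True when every value is a whole number (e.g. counts or years)."""
    return all(float(v).is_integer() for v in values)


# Hypothetical sample values, not taken from the dataset.
cylinders = [4, 6, 8, 4]
model_year = [70, 76, 82]
mpg = [18.0, 27.5, 31.2]

print("cylinders discrete:", is_discrete(cylinders))
print("model_year discrete:", is_discrete(model_year))
print("mpg discrete:", is_discrete(mpg))
```

This is a heuristic only. A continuous quantity that happens to be recorded in whole units, such as weight rounded to the nearest pound, would also pass the check, so the result still needs to be read against what the variable measures.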
Categorical Variables:
Charts and Graphs
Each label in the pie chart represents the frequency of cars
from a different origin.
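The frequencies behind such a pie chart can be computed before any plotting. The following is a minimal sketch using only the standard library; the `origins` list is made-up sample data.

```python
from collections import Counter

# Hypothetical origin values, one per car; not taken from the dataset.
origins = ["usa", "usa", "japan", "europe", "usa", "japan"]

counts = Counter(origins)  # frequency of cars per origin
total = sum(counts.values())
shares = {origin: 100 * n / total for origin, n in counts.items()}  # slice sizes in percent

for origin, n in counts.most_common():
    print(f"{origin}: {n} cars ({shares[origin]:.1f}%)")
```

The `shares` values are what a plotting library would use as slice sizes, with the origin names as labels.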
Numerical Variable:
common values.
In the histogram:
● The X-axis will have the MPG ranges (e.g., 10-15, 15-20, 20-25,
etc.).
● The Y-axis will show the frequency (count of cars) in each MPG
range.
This setup helps visualize how the cars are distributed across different
MPG values, showing the concentration of cars within specific fuel
efficiency ranges.
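The binning described above can be sketched as follows. The function name `histogram`, the bin start of 10, the bin width of 5, and the MPG values are all assumptions chosen to match the example ranges, not the figures used for the actual chart.

```python
from collections import Counter


def histogram(values, start, width):
    """Count values per half-open bin [lo, lo + width), keyed by the bin's lower edge."""
    counts = Counter()
    for v in values:
        index = int((v - start) // width)  # which bin the value falls into
        counts[start + index * width] += 1
    return dict(sorted(counts.items()))


# Hypothetical MPG values, not taken from the dataset.
mpg = [12.0, 14.5, 15.0, 18.2, 19.9, 21.0, 24.9, 26.3]
bins = histogram(mpg, start=10, width=5)

for lower, count in bins.items():
    print(f"{lower}-{lower + 5}: {'#' * count} ({count})")
```

Each bin includes its lower edge and excludes its upper edge, so a value of exactly 15 is counted in the 15-20 range rather than in 10-15. The same function could be reused for horsepower, weight, or acceleration with a different start and width.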
The X-axis will represent the bins or intervals of
engine Horsepower (e.g., 0-50, 50-100, 100-150 hp,
etc.), and the Y-axis will represent the frequency, or
the number of cars that fall into each Horsepower
range.
The X-axis will represent the bins or intervals of
weight of automobiles (e.g., 1500-1700, ..., 2500-2700
kilos, etc.), and the Y-axis will represent the frequency,
or the number of cars that fall into each Weight range.
The X-axis will represent the bins or intervals of
acceleration of automobiles (e.g., 5-7, 7-9, 9-11 seconds,
etc.), and the Y-axis will represent the frequency, or
the number of cars that fall into each Acceleration
range.
Line Chart: Show trends over years if the dataset
covers multiple years. Useful for observing
changes in car characteristics over time.
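A line chart of this kind needs one value per year. The sketch below shows one way to compute it, assuming the mean MPG per model year is the characteristic of interest; the `records` pairs and the helper name `mean_by_year` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (model_year, mpg) pairs, not taken from the dataset.
records = [(70, 18.0), (70, 16.0), (76, 22.0), (76, 24.0), (82, 31.0)]


def mean_by_year(records):
    """Group values by year and return the mean per year, in year order."""
    grouped = defaultdict(list)
    for year, value in records:
        grouped[year].append(value)
    return {year: mean(values) for year, values in sorted(grouped.items())}


trend = mean_by_year(records)
for year, avg in trend.items():
    print(f"{year}: {avg:.1f}")
```

The years become the X-axis values and the averages the Y-axis values of the line chart. The mean is one choice among several; a median would be less sensitive to unusual cars in a given year.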
Combined analysis
Scatter plot of MPG against Horsepower for the
dataset of automobiles. Each point on the plot
represents an individual car, with the X-axis
displaying MPG and the Y-axis displaying
Horsepower.
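The relationship seen in a plot of two numerical variables can also be summarized with a single number. The sketch below computes the Pearson correlation coefficient by hand; the helper name `pearson` and the paired values are assumptions for illustration and do not come from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical paired values, one pair per car.
mpg = [15.0, 20.0, 25.0, 30.0]
horsepower = [160.0, 120.0, 95.0, 70.0]

r = pearson(mpg, horsepower)
print(f"correlation between MPG and Horsepower: {r:.2f}")
```

A value near -1 means the points fall close to a downward-sloping line, a value near +1 an upward-sloping one, and a value near 0 indicates no linear pattern.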
Thank you! 😀