Task 1 Statistics

Uploaded by

Rainbow Warrior

TASK 1

For

Statistics

Submitted by

VICTOR GARCÍA, MARINA GIL, VICTORIA HERRERO, ITZIAR FERNANDEZ

in

UNIVERSIDAD EUROPEA
INTRODUCTION

To get used to MATLAB for statistics, we were asked to compute and plot several quantities
already covered in class. This paper presents the results of that work, including summary
statistics, histograms, box plots and scatter diagrams. The whole task was done in MATLAB.
1. Determine the kind of variables: discrete or continuous, quantitative or qualitative.

Variable      Kind
Cylinders     Discrete, numerical (quantitative)
Type          Discrete, categorical (qualitative)
Country       Discrete, categorical (qualitative)
EngSize       Continuous, numerical (quantitative)
Weight        Continuous, numerical (quantitative)
MSRP          Continuous, numerical (quantitative)
City Mpg      Continuous, numerical (quantitative)
HighwayMpg    Continuous, numerical (quantitative)

2. Select the important variables: engine size, consumption, weight, type of car and country of
origin. Determine the mean, median, mode, standard deviation and range of these variables.
Show the results in a table.

Variable               Mean       Median    Mode     Std. dev.    Range
Engine size            3.3143     3.2000    2.400    1.2321       6.8
Highway consumption    26.1429    26        25       6.3000       40
Weight                 3809.3     3571      2606     869.8680     3931
Type                   ----       ----      sedan    ----         ----
Country                ----       ----      us       ----         ----
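
The five statistics in the table can be illustrated with a short sketch. It is written in Python rather than MATLAB, which the report used, and the sample values are hypothetical, not the data set behind the table. The function name `summarize` is also an assumption made for illustration.

```python
import statistics


def summarize(values):
    """Return mean, median, mode, sample standard deviation and range."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Sample standard deviation (divides by n - 1).
        "std": statistics.stdev(values),
        "range": max(values) - min(values),
    }


# Hypothetical engine sizes in litres, NOT the data used in the report.
engine_size = [2.4, 2.4, 3.0, 3.5, 4.6]

summary = summarize(engine_size)
for name, value in summary.items():
    print(f"{name:>6}: {value:.4f}")
```

For categorical variables such as Type and Country only the mode is meaningful, which is why the other cells of the table are left empty.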

3. Plot the histogram or bar diagram for the 5 variables, and indicate where the mean, median and
mode are in the histogram.

A histogram is the most commonly used graph for showing frequency distributions. Note first
that the data are not integers, so the values are grouped into intervals.

For the engine size histogram, if all the values had the same frequency, the median would lie
between 4 and 5. However, most observations fall in the 2-3 and 3-4 intervals, while the
extreme values have very low frequency, so the median is shifted to the left, below the 4-5
interval.
The same holds for the mean: with equal frequencies it would be around 4-5, but since the
2-3 and 3-4 intervals have higher frequency, it is pulled towards those values.

Histogram: Engine size

For the highway consumption histogram we reach the same conclusion. As the graph shows, the
most common highway consumption is between 25 and 30; however, there is also a high frequency
around 17.5-20 (approx.), which moves both the median and the mean to the left (lower values).
Histogram: Consumption

For the weight histogram we again reach a similar conclusion. As the graph shows, the most
common weight is between 3000 and 3500, although there is also a high frequency for 3500-4000,
which moves both the median and the mean to the right (higher values).

Histogram: Weight
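
The reasoning used for the three numeric histograms (which interval is most frequent, and in which interval the mean and median fall) can be checked numerically. The following is a minimal Python sketch, not the MATLAB code of the report. The weights, the bin width of 500 and the helper names `bin_counts` and `bin_of` are assumptions chosen for illustration.

```python
import statistics
from collections import Counter


def bin_of(x, width):
    """Left edge of the interval [k*width, (k+1)*width) that contains x."""
    return int(x // width) * width


def bin_counts(values, width):
    """Count how many values fall in each interval, keyed by its left edge."""
    return Counter(bin_of(v, width) for v in values)


# Hypothetical weights, NOT the data used in the report.
weights = [2600, 3100, 3200, 3400, 3600, 3900, 5200]

counts = bin_counts(weights, 500)
modal_bin = max(counts, key=counts.get)                 # most frequent interval
mean_bin = bin_of(statistics.mean(weights), 500)        # interval holding the mean
median_bin = bin_of(statistics.median(weights), 500)    # interval holding the median

print("modal interval starts at", modal_bin)
print("mean falls in interval starting at", mean_bin)
print("median falls in interval starting at", median_bin)
```

In this made-up sample a single heavy car is enough to pull the mean into a higher interval than the median, which is the kind of shift described in the paragraphs above.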
Histogram: Type

For categorical variables, only the mode is calculated. The bar diagram shows the mode
directly, since it is the category with the highest frequency. Sedan is the mode for Type and
US is the mode for Country.

Histogram: Countries
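
Finding the mode of a categorical variable amounts to counting categories and taking the most frequent one. A minimal Python sketch of that idea follows (the report itself used MATLAB); the list of car types is hypothetical and is not the report's data.

```python
from collections import Counter

# Hypothetical car types, NOT the data used in the report.
car_types = ["sedan", "wagon", "sedan", "minivan", "sedan", "wagon"]

# Frequency of each category; these are the bar heights of a bar diagram.
frequencies = Counter(car_types)

# The mode is the category with the highest frequency.
mode_value, mode_count = frequencies.most_common(1)[0]
print(mode_value, mode_count)
```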
4. Plot a boxplot for the fuel consumption variable split by type of car.

This boxplot shows fuel consumption by type of car. Looking at the graph we can see that
Minivan and Wagon have thinner boxes, which means they have less dispersion, than Sedan or
Wagon, which has the greatest dispersion of the 4. The middle line of each box is the median;
it is not centered in any of the cases, which means the distribution between the quartiles
(Q1 and Q3) is not symmetric about the median.
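
The box width and the position of the median can be made concrete by computing the quartiles of each group. The sketch below is in Python, not the MATLAB used in the report, and the consumption values and the helper name `box_stats` are hypothetical. Quartile conventions differ between tools, so the numbers a plotting program draws may differ slightly from these.

```python
import statistics


def box_stats(values):
    """Quartiles and interquartile range (the height of the box in a boxplot)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": q2, "q3": q3, "iqr": q3 - q1}


# Hypothetical consumption values per car type, NOT the report's data.
consumption_by_type = {
    "minivan": [24, 25, 26, 26, 27],
    "sedan": [20, 24, 28, 33, 40],
}

stats_by_type = {t: box_stats(v) for t, v in consumption_by_type.items()}
for car_type, s in stats_by_type.items():
    print(car_type, s)
```

A small interquartile range corresponds to a thin box (little dispersion), and a median that is closer to Q1 than to Q3, or the reverse, corresponds to a middle line that is not centered in the box.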

5. Plot a boxplot for the fuel consumption variable split by country.


This boxplot shows fuel consumption by country. Looking at the graph, England, Germany, US and
Sweden have thinner boxes (less dispersion) than Japan or Korea, which show the most
dispersion of the 6. The median is not centered in any of them, which means the distribution
between Q1 and Q3 is not symmetrical; of the 6, Germany is the one whose median is closest to
the center.

6. Plot a scatter diagram for the engine size versus the weight and versus the consumption. You can
do it separately in 2 plots or in the same plot.

Figure 6: Engine Size vs. Weight
Figure 7: Engine Size vs. Consumption

A scatter diagram shows the relationship between an independent variable (X axis) and a
dependent variable (Y axis). As the graphs show, weight increases as engine size increases;
in contrast, consumption decreases as engine size increases.

7. Compute the correlation coefficient for the variables EngSize and Weight, and for EngSize and
consumption.

-> Correlation coefficient for Engine Size and Weight.

-> Correlation coefficient for Engine Size and Consumption.
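
As an illustration of what the correlation coefficient measures, the following Python sketch computes the Pearson coefficient directly from its definition. It is not the MATLAB code of the report; the function name `pearson_r` and all data values are hypothetical and chosen only so that one relationship is increasing and the other decreasing.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, NOT the data used in the report.
engine_size = [2.0, 2.5, 3.0, 4.0, 5.0]
weight = [2700, 3000, 3300, 3900, 4500]   # grows linearly with engine size
consumption = [34, 31, 28, 24, 20]        # falls as engine size grows

r_weight = pearson_r(engine_size, weight)
r_consumption = pearson_r(engine_size, consumption)
print(f"engine size vs weight:      {r_weight:.4f}")
print(f"engine size vs consumption: {r_consumption:.4f}")
```

A coefficient close to +1 matches an increasing scatter diagram, and a coefficient close to -1 matches a decreasing one.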
CONCLUSION

With this task we learned how to compute the mean, median, mode, standard deviation and range,
and how to produce histograms, boxplots and scatter diagrams in MATLAB.
