Chapter 1

Chapter 1 introduces univariate descriptive statistics, emphasizing its role in summarizing data through various representations like graphs and tables. It distinguishes between qualitative and quantitative variables, detailing their characteristics and modalities, as well as methods for calculating frequency and relative frequency. The chapter also covers graphical representations and characteristic parameters, such as mean, median, mode, variance, and standard deviation, to effectively describe and analyze data distributions.
Chapter 1

Univariate Descriptive Statistics

ABDOUN S.
Introduction
• Statistics is the science of collecting, processing and analysing data from the observation of random phenomena.

• Data analysis is used to describe phenomena, make predictions and take decisions about them. Statistics is an essential tool for understanding and managing complex phenomena.

• Descriptive statistics aims to summarize the information contained in the data in a concise and efficient way.

• It uses data representations in the form of graphs, tables and numerical indicators (e.g. averages). It helps to identify the essential characteristics of the phenomenon studied.

• Descriptive statistics is one-dimensional (univariate) if the data relate to only one variable. When two variables are studied simultaneously, we speak of bivariate statistics.
Key words

Population

Subset
Examples
Characteristics and modalities
• Character (statistical variable)
Types of characters

Let F denote the set of values that the character can take.

Qualitative: F is not a subset of $\mathbb{R}$ (i.e. its values are not measurable).
Examples:
1. Honours obtained in a student's baccalaureate, baccalaureate series;
2. Colour of a vehicle.

Quantitative: F is a subset of $\mathbb{R}$ (i.e. its values are measurable). A quantitative character is either discrete or continuous.

Discrete: a quantitative character is said to be discrete if its values are countable.
Examples:
1. Number of siblings;
2. Year of production of a vehicle.

Continuous: a quantitative character is said to be continuous if it can take any value in a given interval of $\mathbb{R}$.
Examples:
1. Height of a student, average mark in the baccalaureate;
2. Number of kilometres travelled by a vehicle.
Modalities

The modalities are the different values that the character X can take.

Discrete character: if the character is discrete quantitative, we note $F = \{x_1, x_2, \dots, x_k\}$, where $x_1, \dots, x_k$ denote the different possible values that X can take. The $x_i$ are assumed to be ordered: $x_1 < x_2 < \dots < x_k$.

Continuous character: if the character is continuous quantitative, we divide F into classes; the modalities in this case are the different classes: $[a_0, a_1), [a_1, a_2), \dots, [a_{k-1}, a_k)$.

Example:
• The modalities of the character "number of brothers and sisters of a student" are:
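Frequency, relative frequency

In what follows, $n$ is the total number of observations and $i = 1, \dots, k$.

Frequency: the number of times the modality $x_i$ has been observed (respectively, the number of times a value in the class $[a_{i-1}, a_i)$ has been observed) is called the frequency of the modality $x_i$ (respectively of the class $[a_{i-1}, a_i)$). It is denoted $n_i$, and $n = \sum_{i=1}^{k} n_i$.

Cumulative frequency: the cumulative frequency of the modality $x_i$ (respectively of the class $[a_{i-1}, a_i)$), denoted $N_i$, is the number such that:
$N_i = \sum_{j=1}^{i} n_j$

Relative frequency: the relative frequency of the modality $x_i$ (respectively of the class $[a_{i-1}, a_i)$), denoted $f_i$, is defined by:
$f_i = \frac{n_i}{n}$

Cumulative relative frequency: the cumulative relative frequency of the modality $x_i$ (respectively of the class $[a_{i-1}, a_i)$) is the number denoted $F_i$ such that:
$F_i = \sum_{j=1}^{i} f_j = \frac{N_i}{n}$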
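The four quantities defined in this section (frequency, cumulative frequency, relative frequency, cumulative relative frequency) can be computed together for a discrete character. The following is a minimal sketch; the function name `frequency_table` and the sample data are illustrative assumptions, not taken from the course.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(observations):
    """Return one row (x_i, n_i, N_i, f_i, F_i) per modality, in increasing order of x_i."""
    n = len(observations)                            # total number of observations
    counts = Counter(observations)
    modalities = sorted(counts)                      # x_1 < x_2 < ... < x_k
    freqs = [counts[x] for x in modalities]          # n_i: frequency
    cum_freqs = list(accumulate(freqs))              # N_i: cumulative frequency
    rel_freqs = [n_i / n for n_i in freqs]           # f_i: relative frequency
    cum_rel_freqs = [N_i / n for N_i in cum_freqs]   # F_i: cumulative relative frequency
    return list(zip(modalities, freqs, cum_freqs, rel_freqs, cum_rel_freqs))


# Hypothetical sample: number of siblings of 10 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]
for row in frequency_table(siblings):
    print(row)
```

The last row always has a cumulative frequency equal to the number of observations and a cumulative relative frequency equal to 1, which is a quick sanity check on any such table.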
Representation of data

Data can be represented in three ways: statistical series, statistical tables and graphical representations.

Statistical series

A statistical series is the sequence of values taken by a variable X on the observation units. The values of the variable are noted $x_1, x_2, \dots, x_n$.

Statistical tables

The frequency (or relative frequency) table is a synthetic way of presenting data. It is straightforward to create for a discrete character, but requires a data transformation (grouping the values into classes) for a continuous character.
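For a continuous character, the data transformation consists of grouping the observations into classes before counting. The sketch below assumes half-open classes (lower bound included, upper bound excluded) given by a sorted list of bounds; the function name `group_into_classes`, the bounds and the sample heights are hypothetical.

```python
from bisect import bisect_right


def group_into_classes(values, bounds):
    """Count the observations falling in each class [bounds[i], bounds[i+1])."""
    counts = [0] * (len(bounds) - 1)
    for v in values:
        if not bounds[0] <= v < bounds[-1]:
            raise ValueError(f"value {v} lies outside the classes")
        # bisect_right gives the position of the first bound strictly greater
        # than v, so the class index is one less.
        counts[bisect_right(bounds, v) - 1] += 1
    return [((bounds[i], bounds[i + 1]), counts[i]) for i in range(len(counts))]


# Hypothetical sample: heights (in cm) of 10 students.
heights = [152.0, 158.5, 160.0, 163.2, 164.9, 165.0, 168.3, 171.0, 175.5, 189.9]
for cls, n_i in group_into_classes(heights, [150, 160, 165, 170, 190]):
    print(cls, n_i)
```

Note that a value equal to a class bound, such as 160.0 here, is counted in the class that starts at that bound.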
Example 1: (Case of a qualitative character)
Example 2: (Case of a discrete quantitative character)
Example 3: (Case of a continuous quantitative character)
Graphical representations

Graphical representations have the advantage of immediately providing information on the general appearance of the distribution of the data. They thus facilitate the interpretation of the data collected.
Graphical representations by type of character:
• Qualitative character: bar chart, pie chart
• Discrete quantitative character: stick diagram
• Continuous quantitative character: histogram
Graphical representations: qualitative character

Bar chart

A bar chart is a graph that associates each modality of the character with a rectangle of constant base and of height proportional to the frequency (or relative frequency).

Pie chart

A pie chart is a graph that divides a disk into angular sectors whose central angles are proportional to the frequency (or relative frequency) of each modality. The central angle of a modality of frequency $n_i$ is given (in degrees) by:
$\alpha_i = 360 \times \frac{n_i}{n} = 360 \times f_i$
where $n$ is the total number of observations and $f_i = n_i / n$ is the relative frequency.
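Since each central angle is proportional to the frequency of its modality, the angles of a pie chart follow from the frequencies alone: the angle in degrees is 360 times the relative frequency. A minimal sketch, in which the function name `pie_angles` and the vehicle colour counts are hypothetical:

```python
def pie_angles(frequencies):
    """Map each modality to its central angle in degrees (360 * n_i / n)."""
    n = sum(frequencies.values())  # total number of observations
    return {modality: 360 * n_i / n for modality, n_i in frequencies.items()}


# Hypothetical frequencies: colour of 20 vehicles.
colours = {"white": 10, "black": 6, "red": 4}
angles = pie_angles(colours)
print(angles)
print(sum(angles.values()))  # the sectors always add up to a full disk
```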
Graphical representations: quantitative character

Discrete character: stick diagram

A stick diagram associates a segment (stick) with each value of the variable, where the height is proportional to the frequency (or relative frequency).

Continuous character: histogram

A histogram is composed of a set of adjacent rectangles, one per class, each having an area proportional to the frequency (or relative frequency) of that class.
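Because it is the area of each rectangle, not its height, that is proportional to the frequency, the height must be divided by the class width when classes have unequal widths. The sketch below takes the proportionality constant to be 1, so that the total area equals 1; this normalisation, the function name `histogram_heights` and the sample classes are assumptions made for illustration.

```python
def histogram_heights(classes):
    """Return the rectangle height f_i / width_i for each ((lower, upper), n_i) class."""
    n = sum(n_i for _, n_i in classes)  # total number of observations
    return [(n_i / n) / (upper - lower) for (lower, upper), n_i in classes]


# Hypothetical classes of student heights (in cm), with unequal widths.
classes = [((150, 160), 4), ((160, 165), 6), ((165, 170), 6), ((170, 190), 4)]
heights = histogram_heights(classes)
print(heights)

# Check: each area (height * width) is the relative frequency, so areas sum to 1.
total_area = sum(h * (upper - lower) for h, ((lower, upper), _) in zip(heights, classes))
print(total_area)
```

In this sample the first and last classes have the same frequency, but the last rectangle is half as tall because its class is twice as wide.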
Characteristic parameters (statistical indicators)

A table or graph can sometimes be time-consuming to consult, without providing a sufficiently clear idea of the observed statistical distribution. In such cases, we try to summarize the distribution by a characteristic parameter, i.e. a single number intended to objectively characterize the entire dataset.
Characteristic parameters:
• Position parameters: mean, median, mode
• Dispersion parameters: variance, standard deviation
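Position parameters:

Arithmetic mean

From raw data $x_1, \dots, x_n$:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

From the statistical table:

For a discrete variable with modalities $x_1, \dots, x_k$, frequencies $n_i$ and relative frequencies $f_i = n_i / n$:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{k} n_i x_i = \sum_{i=1}^{k} f_i x_i$

For a continuous variable with classes $[a_{i-1}, a_i)$ of centre $c_i = \frac{a_{i-1} + a_i}{2}$:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{k} n_i c_i = \sum_{i=1}^{k} f_i c_i$

Example: Discrete case (see Example 2)
Example: Continuous case (see Example 3)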
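The three ways of computing the arithmetic mean (from raw data, from a table of a discrete variable, and from a table of a continuous variable using class centres) can be sketched as follows. The function names and the sample data are hypothetical; for grouped data the result is only an approximation of the raw-data mean, since each observation is replaced by the centre of its class.

```python
def mean_raw(values):
    """Mean of a raw series: sum of the observations divided by their number."""
    return sum(values) / len(values)


def mean_discrete(table):
    """Mean from a table of (x_i, n_i) pairs: weighted average of the modalities."""
    n = sum(n_i for _, n_i in table)
    return sum(n_i * x_i for x_i, n_i in table) / n


def mean_grouped(classes):
    """Mean from ((lower, upper), n_i) classes, using the class centres."""
    n = sum(n_i for _, n_i in classes)
    return sum(n_i * (lower + upper) / 2 for (lower, upper), n_i in classes) / n


# Hypothetical data: number of siblings, as a raw series and as a frequency table.
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]
siblings_table = [(0, 2), (1, 3), (2, 4), (3, 1)]
print(mean_raw(siblings), mean_discrete(siblings_table))  # both give the same mean

# Hypothetical classes of student heights (in cm).
height_classes = [((150, 160), 4), ((160, 165), 6), ((165, 170), 6), ((170, 190), 4)]
print(mean_grouped(height_classes))
```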
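Position parameters:
Mode

The mode, denoted $M_o$, is the most frequent value taken by the character, i.e., the one that repeats most often in the data.

For a discrete variable: the mode is the value of X having the highest frequency (or relative frequency).

For a continuous variable: the modal class is the class whose rectangle is the tallest in the histogram, i.e. the class with the highest frequency per unit of class width.

Examples: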
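The mode of a discrete series can be read directly from the frequencies. The sketch below returns every value that reaches the highest frequency, since a series may have more than one mode. For grouped data it returns the modal class, taken here as the class with the highest frequency per unit of width; that convention, the function names and the sample data are assumptions for illustration.

```python
from collections import Counter


def modes(observations):
    """Return all values whose frequency is the highest, in increasing order."""
    counts = Counter(observations)
    highest = max(counts.values())
    return sorted(x for x, n_i in counts.items() if n_i == highest)


def modal_class(classes):
    """Return the (lower, upper) class with the highest frequency per unit of width."""
    best = max(classes, key=lambda c: c[1] / (c[0][1] - c[0][0]))
    return best[0]


# Hypothetical sample: number of siblings of 10 students.
print(modes([0, 1, 1, 2, 2, 2, 3, 1, 0, 2]))

# Hypothetical classes with unequal widths: the widest class has the largest
# frequency (9) but not the largest frequency per unit of width.
classes = [((150, 160), 4), ((160, 165), 6), ((165, 170), 8), ((170, 190), 9)]
print(modal_class(classes))
```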
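Position parameters:
Median

The median, denoted $M_e$, is the value that divides the studied population into two groups of equal size (50% of the observations are less than or equal to it, and 50% are greater than or equal to it).

Discrete case

Consider a statistical series arranged in ascending order of observations: $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$. The median is given by:
$M_e = x_{\left(\frac{n+1}{2}\right)}$ if $n$ is odd, and $M_e = \frac{1}{2}\left(x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}\right)$ if $n$ is even.

Continuous case

First, we define the median class $[a_{i-1}, a_i)$ as the first class whose cumulative relative frequency $F_i$ reaches at least 50%. The median is then given by the following linear interpolation formula:
$M_e = a_{i-1} + (a_i - a_{i-1}) \times \frac{0.5 - F_{i-1}}{F_i - F_{i-1}}$, with $F_0 = 0$.

Examples: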
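Both cases of the median can be sketched in a few lines. For a raw series the code sorts the observations and takes the middle value (or the average of the two middle values when the number of observations is even). For grouped data it finds the first class whose cumulative frequency reaches half of the total and interpolates linearly inside that class, which assumes observations are spread uniformly within the class. The function names and the sample data are hypothetical.

```python
def median_raw(values):
    """Median of a raw series."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the middle observation
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: mean of the two middle ones


def median_grouped(classes):
    """Median from ((lower, upper), n_i) classes by linear interpolation."""
    n = sum(n_i for _, n_i in classes)
    cumulative = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), n_i in classes:
        if cumulative + n_i >= n / 2:
            # Median class found: interpolate between its bounds.
            return lower + (upper - lower) * (n / 2 - cumulative) / n_i
        cumulative += n_i
    raise ValueError("no observations")


# Hypothetical data.
print(median_raw([0, 1, 1, 2, 2, 2, 3, 1, 0, 2]))
print(median_grouped([((150, 160), 4), ((160, 165), 6), ((165, 170), 6), ((170, 190), 4)]))
```

Working with frequencies rather than relative frequencies gives the same result, because dividing both the numerator and the denominator of the interpolation ratio by the total number of observations leaves it unchanged.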
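Dispersion parameters
Variance

Raw series $x_1, \dots, x_n$:
$V(X) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$

Statistical table (modalities $x_i$ with frequencies $n_i$; for a continuous variable, $x_i$ is replaced by the class centre):
$V(X) = \frac{1}{n} \sum_{i=1}^{k} n_i (x_i - \bar{x})^2$

Developed formulas
$V(X) = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar{x}^2$ (raw series)
$V(X) = \frac{1}{n} \sum_{i=1}^{k} n_i x_i^2 - \bar{x}^2$ (statistical table)

Standard deviation

The standard deviation, denoted $\sigma_X$, is defined as the square root of the variance. It is written:
$\sigma_X = \sqrt{V(X)}$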
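The variance and standard deviation can be checked numerically. The sketch below assumes the descriptive convention of dividing by the number of observations n (not n - 1), and compares the definition with the developed formula "mean of the squares minus the square of the mean". The function names and the sample data are hypothetical.

```python
import math


def variance(values):
    """Variance of a raw series: mean squared deviation from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def variance_developed(values):
    """Same variance via the developed formula: mean of squares minus squared mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2


def variance_table(table):
    """Variance from a table of (x_i, n_i) pairs (divisor n)."""
    n = sum(n_i for _, n_i in table)
    mean = sum(n_i * x_i for x_i, n_i in table) / n
    return sum(n_i * (x_i - mean) ** 2 for x_i, n_i in table) / n


# Hypothetical sample: number of siblings of 10 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]
v = variance(siblings)
print(v, variance_developed(siblings), variance_table([(0, 2), (1, 3), (2, 4), (3, 1)]))
print(math.sqrt(v))  # standard deviation
```

The three functions agree up to floating-point rounding. The developed formula needs only one pass over the data once the mean is known, but it can lose precision when the values are large compared with their spread.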
