NOTES STATISTICS Final
Week 1 – Lecture 1
MEASUREMENT
Measurement is the task of gathering information that characterises or represents a social phenomenon; for example:
- People’s opinion on gun control
- An individual’s income
- The severity of unemployment in a country
- The typical wages offered in a given industry
Before we can measure, we must determine the unit of analysis: the type of thing which we are collecting information about;
common units of analysis are:
- People (the most common)
- Organisations
- Countries
- Schools
- Industries
- Families
➔ NOMINAL SCALES
Nominal comes from the Latin word for “name” or “label”, and in fact the strategy behind nominal scales is to assign labels
to different cases; for example:
- Measuring gender: male, female, …
- Country of residence: US, Japan, Italy, …
- Religion: Protestant, Catholic, Atheist, …
One problem arises if a person, when asked to report her religion, replies that she is both
Protestant and Jewish; what do you do? There are two options:
1) You design a better survey that can cope with this, like adding a category for “Jewish/Protestant” or “multiple
religions”
2) Or, you destroy information by forcing the respondent to choose
➔ ORDINAL SCALES
Ordinal scales are similar to nominal, but, in addition to putting people in groups, those groups are ordered; for example:
- Lower class, middle class, upper class
- Elementary school, middle school, high school
Ordinal scales do not specify the distance between categories: in the case of department rankings, we do not know how big
the difference is between ranks 1 and 2, or 20 and 21 (these differences may be small or large); that is why ordinal scales are
distinct from nominal: nominal values cannot be meaningfully ordered (you can test = on nominal values, but > and < are meaningless)
NB nominal and ordinal variables are called qualitative variables because we are measuring attributes
➔ INTERVAL SCALES
Interval measures (also called ratio scales) are:
A) Homogeneous
B) Ordered
C) Measured in comparable units
For example, we can mention the budget of a university, the number of children in a household, a worker’s annual income.
Data are always an approximation, and statistics is just a way to summarise the results of scientific experiments within the
scientific method. Many now think that the data are the experiment, but in the Galilean approach the data are just an
approximation, and depend on the way you chose to collect them and on the tools used for the measurement.
Overall, poor measurement results in incorrect conclusions, regardless of how perfect the statistical analysis you carry out
may be
Week 1 – Lecture 2
VALIDITY
Validity is the degree to which a measurement captures what it is intended to capture; if validity is very poor, measurements become
meaningless, but we have to remember that validity is not an “all or nothing” thing.
For example, does measuring a person’s wealth from their hourly wage have validity problems? Yes, because retired
people might be wealthy but have no income or wages; a more valid measure would be the total value of all their assets.
RELIABILITY
Reliability is the extent to which a measure produces consistent results; if reliability is poor, measures are meaningless.
For example, measuring overall happiness in life with the question “how happy are you right now?” has potential reliability
issues, since mood varies a lot from moment to moment and answers may not reflect true overall happiness in life (we are
constantly influenced by the weather, the day of the week, the time, and many other circumstances). The way to solve this is to
find less time-sensitive measures.
There are two main methods of collecting data:
1) Sample survey: since you cannot ask the whole population the survey question (especially at the same time), one
approach is to sample people from a given population and interview them. Here, the issue is how to find a representative
sample of the population
2) Experiment: it consists in comparing the responses of subjects under different conditions, with subjects assigned to the
conditions; the great advantage of this method is that you can control the conditions of your subjects, and also
purposely alter them
Randomisation is the mechanism for achieving reliable data by reducing potential bias; in a simple random sample,
each possible sample of size n (the sample size) has the same chance of being selected. Basically, you cannot control which units are
eventually selected, even if it is important to mention that not all samples are truly random. The simple random sample is an
example of a probability sampling method, because we can specify the probability that any particular sample will be selected.
To implement random sampling, we can use random number tables or statistical software that generates random numbers;
nevertheless, the sampling frame, i.e. a listing of all subjects in a population, must exist to implement simple random sampling (a minimal sketch follows the list below)
- Other probability sampling methods include systematic, stratified, and cluster random sampling.
- For nonprobability sampling, we cannot specify probabilities for the possible samples; inferences based on them may be
highly unreliable. Example: volunteer samples, such as polls on the Internet, are often severely biased (but sometimes
volunteer samples are all we can get, as in most medical studies)
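A minimal sketch in Python (the names in the frame are invented; random.sample draws without replacement, so every subset of size n is equally likely):

    import random

    # Hypothetical sampling frame: the listing of all subjects in the population
    frame = ["Anna", "Ben", "Carla", "Dario", "Elena", "Fatima", "Gino", "Hana"]

    random.seed(42)                   # fixed only to make the example reproducible
    sample = random.sample(frame, 3)  # a simple random sample of size n = 3
    print(sample)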
The sampling error of a statistic equals the error that occurs when we use a sample statistic to predict the value of a population
parameter. As a matter of fact, being unable to interview an entire population, we need a scaled-down version of it with the same
features: a representative sample, so that the conclusion we reach will always be an estimate; its precision includes the
error, which depends on how you select the sample. Randomisation protects against bias, with sampling error tending to fluctuate
around 0 with predictable size; there are methods that let us predict its magnitude (the margin of error), e.g., in estimating a
percentage, no more than about ±3% when n is about 1000.
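As a rough check of the ±3% figure, under the usual 95% approximation for a proportion (margin ≈ 1.96 · sqrt(p(1 − p)/n), largest at p = 0.5), a quick Python calculation gives about 0.031:

    import math

    n = 1000
    p = 0.5                                    # worst case for the margin
    margin = 1.96 * math.sqrt(p * (1 - p) / n)
    print(round(margin, 3))                    # 0.031, i.e. about +/- 3%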
Other factors besides sampling error can cause results to vary from sample to sample:
- Sampling bias (e.g., not using probability sampling)
- Response bias (e.g., poorly worded questions and misunderstandings; as the NY Times poll on the gasoline tax shows, the results
of surveys can depend greatly on question wording)
- Non-response bias (e.g., undercoverage and missing data: the people who do not answer the questions are typically not random
but form a specific group; this is what happens with social media and elderly people, who generally avoid them, so
that a given demographic group is ultimately missing)
We end up with sets of measurements on groups of cases; data are often organised in spreadsheets:
- Rows = all measurements on each case
- Columns = sets of measurements, or “variables”
Another thing we can do is list variables, i.e. list the values of a variable for all cases (by looking at the raw data); this
makes sense only for very small samples and for certain kinds of variables. To perform this, we have the list
command in STATA (a sketch follows the points below).
o Advantages: it is easy and gives a rich description of the dataset (you can see every case)
o Disadvantages: it is not workable for large datasets and if data involves complex coding you might not be able
to interpret it visually
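The same idea as STATA’s list, sketched in Python (the tiny dataset below is invented): each row is a case, each key a variable, and listing simply prints every case.

    # Invented dataset: rows are cases, keys are variables
    data = [
        {"id": 1, "age": 34, "income": 41000},
        {"id": 2, "age": 27, "income": 29500},
        {"id": 3, "age": 58, "income": 61200},
    ]

    for case in data:   # print all measurements on each case
        print(case)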
FREQUENCY LISTS
Frequency lists are tables that show how many cases take on a particular value; they are the simplest descriptive tool, also
called “frequencies” or “frequency distributions”. For example, in the case of a congressional vote you might count the number
of “Yes” or “No” votes with the STATA command tabulate (tab); see the sketch after the points below.
o Advantages: frequency lists are useful for large datasets and provide for a rich description of data
o Disadvantages: unlike a list, you can’t see which case is which or compare with other variables. They are best for
nominal and some ordinal variables, while they are not so useful if all values are unique (rank orderings, many
continuous variables, …), especially if you do not envision bins.
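A frequency list in the spirit of tabulate can be sketched with Python’s collections.Counter (the votes below are invented):

    from collections import Counter

    votes = ["Yes", "No", "Yes", "Yes", "No", "Yes", "No"]
    freq = Counter(votes)             # how many cases take on each value
    for value, count in freq.items():
        print(value, count)           # Yes 4 / No 3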
VISUAL REPRESENTATIONS
Bar charts are essentially a visual representation of a frequency list: the height of each bar represents the number of cases. They are used
for nominal and some ordinal variables only because, again, rank orderings and continuous measures do not work.
As regards graphing continuous measures, the issue is that continuous variables have an infinite number
of possible unique values; in a bar chart, you would have many bars of height 1 (and what would you do with zeros?). One
possible solution is to use grouped data (bins), so that sets of similar values are lumped together into constant
intervals; nevertheless, information is destroyed in the process. The result is a histogram, in which the height of each bar represents
the number of cases within a given range of values. If you have, for example, people grouped by age, you might have 5-year
intervals with the corresponding bars, but you might also group people within a 1-year interval or a 50-year interval:
- A small interval means more bars in the histogram —> greater detail, but you might have difficulties in interpreting the
data
- A wide interval means fewer bars in the histogram —> greater simplification of the data
NB Histograms look very different depending on how wide you set the intervals: you should try different intervals and not over-
interpret a crude histogram.
If bins are not equally wide, you cannot compare raw frequencies; instead you use densities, computed as the ratio
between the frequency of a bin and its width (as sketched below).
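A sketch of the frequency-density computation with numpy (ages and bin edges invented; note how equal frequencies give different densities once bin widths differ):

    import numpy as np

    ages = np.array([3, 7, 12, 15, 18, 22, 24, 25, 31, 40, 55, 62])
    edges = np.array([0, 18, 30, 65])   # unequal bins: widths 18, 12, 35

    counts, _ = np.histogram(ages, bins=edges)
    density = counts / np.diff(edges)   # frequency density = frequency / width
    print(counts)                       # [4 4 4]  same frequency in each bin
    print(density)                      # [0.22 0.33 0.11] (approximately)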
MEASURES OF CENTRAL TENDENCY
Often, it is important to assess the typical values of a variable; for example, how much money the typical family earns, or the age
of the typical person in the dataset. The solution is to conduct calculations to determine which values are typical,
but this is not as easy as it sounds; moreover, there is no unique way of measuring what the centre is, so different
results are possible.
Lastly, the typical value that results does nothing to represent the variability in the sample.
MODE
The mode is the value representing the largest number of cases (modal value).
This measure is useful for nominal and ordinal values, while it is only useful for
continuous variables if you have grouped data into a histogram (otherwise, all
values may very likely be unique); nevertheless, the mode is not very helpful (or
even misleading) in certain circumstances (if there are many peaks, or a single
unusual one, or if the variable is distributed quite evenly). A sketch follows.
When a variable is grouped into classes, the mode is the central (average) value of the class associated with the highest frequency
density d_i = n_i / l_i (the class frequency n_i divided by the class width l_i)
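A minimal sketch of the mode with Python’s statistics module (the answers are invented); multimode is useful when there are several peaks:

    from statistics import mode, multimode

    religions = ["Catholic", "Protestant", "Catholic", "Atheist", "Catholic"]
    print(mode(religions))       # Catholic, the value with the most cases
    print(multimode(religions))  # every modal value, in case of many peaks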
Week 2 – Lecture 1
MEDIAN
The median of a variable is the modality that occupies the central position in the ordered distribution of the variable (= the value of
the middle case, since an equal number of cases fall higher and lower). It can be computed for ordinal and
continuous variables, but cannot be calculated for nominal variables, because they do not naturally possess an order; it is more
informative than the mode.
o Advantages: it is not influenced by unusual peaks (outliers) + it is useful even in very even distributions
o Disadvantages: it is not useful for data spread in two distinct “clumps”
- If the number of statistical units n is odd, there is only one central position, P = (n + 1) / 2.
- If the number of statistical units n is even, there are two middle positions: n / 2 and n / 2 + 1. If the units in
these two positions have the same modality, that modality is the median; if they have different modalities, the median is
indeterminate (if the variable is ordinal) or the average of the two modalities (if the variable is quantitative).
The calculation of the median for grouped data follows this procedure:
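The procedure itself is not reproduced in these notes; a standard interpolation rule for grouped data, given here as an assumption about what was intended, locates the class containing position n/2 and interpolates linearly within it:

    def grouped_median(classes):
        """classes: ordered list of (lower bound, upper bound, frequency).
        Median = L + ((n/2 - F) / f) * width, where L, f and width belong to
        the median class and F is the cumulative frequency before it."""
        n = sum(f for _, _, f in classes)
        cumulative = 0
        for lower, upper, f in classes:
            if cumulative + f >= n / 2:
                return lower + ((n / 2 - cumulative) / f) * (upper - lower)
            cumulative += f

    # Invented example: 0-20 (5 cases), 20-40 (10 cases), 40-60 (5 cases)
    print(grouped_median([(0, 20, 5), (20, 40, 10), (40, 60, 5)]))  # 30.0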
QUANTILES
Quantile is the general term, but there exist percentiles, quartiles, deciles, …; using quantiles means dividing cases up into a fixed
number of equal “chunks”:
- 100 chunks = percentiles
- 10 chunks = deciles
- 5 chunks = quintiles
- 4 chunks = quartiles
Identifying the quartile of a case is a powerful way of describing where the case falls relative to others; in this example, a person with 200 CDs
is in the top quartile, meaning that 75% of people have fewer. We must not forget that quantiles are relative, so a person of average
height in the US would be in the bottom quartile of a dataset of basketball players, for example.
Moreover, the upper and lower bounds of quantiles are useful reference points that describe your data:
– The border between the 2nd and 3rd quartiles is the median, the middle of your data
– The border of the top quartile (178 CDs) gives you a sense of how many are owned by people toward the upper end of the
distribution
Sometimes people report the “interquartile range”: the range of values that contains the middle 50% of cases.
The closest value to the one you have calculated is the corresponding mark in the ordered sequence.
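A sketch with Python’s statistics.quantiles (the CD counts are invented; note that this function interpolates between marks rather than taking the nearest one):

    from statistics import quantiles

    cds = [0, 5, 12, 20, 33, 47, 60, 85, 120, 178, 200, 250]
    q1, q2, q3 = quantiles(cds, n=4)  # the three cut points between quartiles
    print(q1, q2, q3)                 # q2 is the median; above q3 = top quartile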
Quartiles are portrayed graphically by box plots; the example shown here was weekly TV
watching for n = 60 from the student survey data file, with 3 outliers.
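A box plot of this kind can be sketched with matplotlib (the hours below are invented, not the actual survey data):

    import matplotlib.pyplot as plt

    hours = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 20, 25, 30]  # 3 high outliers
    plt.boxplot(hours)   # box spans the quartiles; extreme points are flagged
    plt.ylabel("Weekly TV watching (hours)")
    plt.show()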
MEAN – AVERAGE
It is the most well-known way of assessing the middle of a distribution; it is calculated by adding the values of all cases, then dividing
by the total number of cases. Its advantages are that it is applicable to continuous measures and it is not overly influenced
by any single peak; as a disadvantage, it can be influenced by extreme values (outliers)
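The calculation in Python, with an invented income list that also illustrates the outlier problem (the median barely moves while the mean jumps):

    from statistics import mean, median

    incomes = [28000, 31000, 35000, 40000, 45000]
    print(mean(incomes))       # 35800

    incomes.append(2_000_000)  # one extreme value (an outlier)
    print(mean(incomes))       # about 363167: pulled up by the outlier
    print(median(incomes))     # 37500: barely affected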