Collection of Data Part 2 Edited MLIS

This document defines key terms used in statistics and data analysis, including variables, data, experiments, parameters, and statistics. It provides examples to illustrate these concepts, distinguishing between population and sample, and qualitative and quantitative variables that can be nominal, ordinal, discrete, or continuous. It also discusses methods for collecting and organizing data through frequency distributions shown in tables, histograms, polygons, bar graphs, and smooth curves. The shape of distributions is addressed.

Uploaded by

Whieslyn Cole
Copyright
© © All Rights Reserved

Variable: A characteristic about each individual element of a population or sample.
Data (plural): The set of values collected for the variable from each of the elements belonging to the sample.
Experiment: A planned activity whose results yield a set of data.
Parameter: A numerical value summarizing all the data of an entire population.
Statistic: A numerical value summarizing the sample data.
EXAMPLE: A COLLEGE DEAN IS INTERESTED IN
LEARNING ABOUT THE AVERAGE AGE OF
FACULTY. IDENTIFY THE BASIC TERMS IN THIS
SITUATION.

Population
The ages of all faculty members at the college.
Sample
Any subset of that population. For example, we might select 10 faculty members and determine their ages.
Variable
The “age” of each faculty member.
Data (singular)
The age of one specific faculty member.
Data (plural)
The set of values (ages) in the sample.
Experiment
The method used to select the faculty members forming the sample and to determine the actual age of each faculty member in the sample.
Parameter
The “average” age of all faculty at the college.
Statistic
The “average” age for all faculty in the sample.
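
The difference between a parameter and a statistic can be sketched in a few lines of Python. The ages below are hypothetical, and the population size (12) and sample size (5) were chosen only to keep the example short.

```python
import random
import statistics

# Hypothetical ages for a small faculty of 12 (the whole population).
population_ages = [34, 41, 29, 55, 62, 47, 38, 50, 44, 36, 58, 46]

# Parameter: a numerical value summarizing the entire population.
population_mean = statistics.mean(population_ages)

# Sample: a subset of the population (fixed seed only for repeatability).
rng = random.Random(0)
sample_ages = rng.sample(population_ages, 5)

# Statistic: a numerical value summarizing only the sample.
sample_mean = statistics.mean(sample_ages)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The parameter is fixed for a given population, while the statistic usually changes from one sample to the next.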
Two kinds of variables:
Qualitative, or Attribute, or Categorical, Variable: A variable that categorizes or describes an element of a population.
Note: Arithmetic operations, such as addition and averaging, are not meaningful for data resulting from a qualitative variable.
Quantitative, or Numerical, Variable: A variable that quantifies an element of a population.
Note: Arithmetic operations, such as addition and averaging, are meaningful for data resulting from a quantitative variable.
Example: Identify each of the following examples as attribute (qualitative) or numerical (quantitative) variables.
- The residence hall for each student in a statistics class. (Attribute)
- The amount of gasoline pumped by the next 10 customers at the local Savemore. (Numerical)
- The amount of radon in the basement of each of 25 homes in a new development. (Numerical)
- The color of the baseball cap worn by each of 20 students. (Attribute)
- The length of time to complete a mathematics homework assignment. (Numerical)
- The state in which each truck is registered when stopped and inspected at a weigh station. (Attribute)
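
The notes on arithmetic operations can be illustrated with two of the variables above. This is a minimal sketch with made-up values: gasoline amounts can be added and averaged, while cap colors can only be counted, so the mode is used instead.

```python
import statistics

# Hypothetical quantitative data: gallons pumped by 5 customers.
gallons = [10.5, 8.0, 12.25, 9.0, 10.25]

# Hypothetical qualitative data: cap colors of 5 students.
cap_colors = ["red", "blue", "red", "green", "red"]

# Addition and averaging are meaningful for a quantitative variable.
total_gallons = sum(gallons)
mean_gallons = statistics.mean(gallons)

# For a qualitative variable we can only count categories,
# for example by reporting the most frequent one (the mode).
most_common_color = statistics.mode(cap_colors)

print(total_gallons, mean_gallons, most_common_color)
```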
Qualitative and quantitative variables may be further subdivided:

Variable
  Qualitative: Nominal, Ordinal
  Quantitative: Discrete, Continuous
Nominal Variable: A qualitative variable that categorizes (or describes, or names) an element of a population.
Nominal scales are used for labeling variables, without any quantitative value. “Nominal” scales could simply be called “labels.”
Ordinal Variable: A qualitative variable that incorporates an ordered position, or ranking.
- With ordinal scales, the order of the values is what is important and significant; the size of the differences between values is not really known.
- Ordinal scales are typically measures of non-numeric concepts such as satisfaction, happiness, and discomfort.
- Advanced note: The best way to determine central tendency for a set of ordinal data is to use the mode or the median; the mean cannot be defined for an ordinal set.
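
As a rough illustration of the advanced note, the sketch below finds the median and mode of ordinal responses using only their rank order. The five-point satisfaction scale and the responses are hypothetical.

```python
from collections import Counter

# Hypothetical satisfaction scale, listed from lowest to highest rank.
SCALE = ["very unhappy", "unhappy", "neutral", "happy", "very happy"]
rank = {label: position for position, label in enumerate(SCALE)}

responses = ["happy", "neutral", "very happy", "happy",
             "unhappy", "happy", "neutral"]

# Sort by rank; only the order matters, not the distance between labels.
ordered = sorted(responses, key=lambda label: rank[label])

# Median: the middle response (the list has an odd number of entries).
median_response = ordered[len(ordered) // 2]

# Mode: the most frequent response.
mode_response = Counter(responses).most_common(1)[0][0]

print(median_response, mode_response)
```

No label is ever added or averaged here, which is why the median and mode remain usable when the mean is not.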
Discrete Variable: A quantitative variable that can assume a countable number of values. Intuitively, a discrete variable can assume values corresponding to isolated points along a line interval. That is, there is a gap between any two values.
Discrete data can only take certain values.
Examples:
1. The number of students in a class
2. The results of rolling 2 dice
Continuous Variable: A quantitative variable that can assume an uncountable number of values. Intuitively, a continuous variable can assume any value along a line interval, including every possible value between any two values.
Continuous data can take any value (within a range).
Examples:
- A person's height: could be any value (within the range of human heights), not just certain fixed heights
- Time in a race: you could even measure it to fractions of a second
- A dog's weight
- The length of a leaf
Collecting Data
1. Data from a designed experiment (primary data)
2. Data from a survey (primary data)
3. Data from an observational study (primary data)
4. Data from a published source (secondary data)
Definition: Representative Sample
- A representative sample exhibits characteristics typical of those possessed by the target population.
- The most common way to satisfy the representative sample requirement is to select a random sample.
- A random sample ensures that every subset of fixed size in the population has the same chance of being included in the sample.
Definition: Random Sample
A random sample of n experimental units is a sample selected from the population in such a way that every different sample of size n has an equal chance of selection.
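
A random sample in this sense can be drawn with Python's standard library. In the sketch below, the six labeled units and n = 3 are assumptions made for illustration; `random.sample` draws without replacement, so every subset of size n is equally likely.

```python
import math
import random

# Hypothetical population of 6 experimental units, labeled A to F.
population = ["A", "B", "C", "D", "E", "F"]
n = 3

# Number of different samples of size n that could be selected.
num_possible_samples = math.comb(len(population), n)

# Draw n distinct units; the seed is fixed only for repeatability.
rng = random.Random(42)
sample = rng.sample(population, n)

print("possible samples:", num_possible_samples)
print("chance of any one sample:", 1 / num_possible_samples)
print("selected sample:", sample)
```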
Collection of Data

Statistics very often involves the collection of data. There are many ways to obtain data, and the World Wide Web is one of them. The advantages and disadvantages of common data collection methods are discussed below.
Chapter 2: Frequency Distributions
Frequency Distributions

- After collecting data, the first task for a researcher is to organize and simplify the data so that it is possible to get a general overview of the results. This is the goal of descriptive statistical techniques.
- One method for simplifying and organizing data is to construct a frequency distribution.

Frequency Distributions (cont.)

- A frequency distribution is an organized tabulation showing exactly how many individuals are located in each category on the scale of measurement.
- A frequency distribution presents an organized picture of the entire set of scores, and it shows where each individual is located relative to others in the distribution.

Frequency Distributions (cont.)

A frequency distribution is a table that organizes data values into classes or intervals, along with the number of values that fall in each class (frequency, f).
1. Ungrouped Frequency Distribution: for data sets with few different values. Each value is in its own class.
2. Grouped Frequency Distribution: for data sets with many different values, which are grouped together in classes.
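
Both kinds of frequency distribution can be tabulated with a short script. This is a minimal sketch: the function names, the raw values, and the choice to treat class limits as inclusive are assumptions made for illustration.

```python
from collections import Counter

def ungrouped_frequencies(values):
    """Each distinct value is its own class."""
    return dict(sorted(Counter(values).items()))

def grouped_frequencies(values, classes):
    """Count the values in each (lower, upper) class, limits inclusive."""
    counts = {c: 0 for c in classes}
    for v in values:
        for lower, upper in classes:
            if lower <= v <= upper:
                counts[(lower, upper)] += 1
                break  # classes do not overlap, so stop at the first match
    return counts

# Hypothetical raw data.
courses_taken = [4, 5, 4, 3, 4, 5, 2, 4]
ages = [19, 25, 33, 47, 52, 61, 70, 45, 38, 29]
age_classes = [(18, 30), (31, 42), (43, 54), (55, 66), (67, 78)]

print(ungrouped_frequencies(courses_taken))
print(grouped_frequencies(ages, age_classes))
```

The ungrouped form suits the course counts because only a few distinct values occur; the ages take many different values, so they are grouped into classes.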
Grouped and Ungrouped Frequency Distributions

Ungrouped
Courses Taken    Frequency, f
1                25
2                38
3                217
4                1462
5                932
6                15

Grouped
Age of Voters    Frequency, f
18-30            202
31-42            508
43-54            620
55-66            413
67-78            158
79-90            32
Frequency Distribution Graphs

- In a frequency distribution graph, the score categories (X values) are listed on the X axis and the frequencies are listed on the Y axis.
- When the score categories consist of numerical scores from an interval or ratio scale, the graph should be either a histogram or a polygon.
Histograms

- In a histogram, a bar is centered above each score (or class interval) so that the height of the bar corresponds to the frequency and the width extends to the real limits, so that adjacent bars touch.
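
The role of the real limits can be sketched as follows. The helper `real_limits` is hypothetical and assumes scores are recorded to the nearest whole unit, so each real limit lies half a unit beyond the stated class limit.

```python
def real_limits(lower, upper, unit=1):
    """Return the real limits of a class whose scores are recorded
    to the nearest `unit` (an assumption of this sketch)."""
    half = unit / 2
    return (lower - half, upper + half)

# Three age classes with stated (apparent) limits.
classes = [(18, 30), (31, 42), (43, 54)]
bars = [real_limits(lo, hi) for lo, hi in classes]

for (lo, hi), (left, right) in zip(classes, bars):
    print(f"class {lo}-{hi}: bar spans {left} to {right}")
```

Because the upper real limit of one class equals the lower real limit of the next, the bars share an edge and touch.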
Polygons

- In a polygon, a dot is centered above each score so that the height of the dot corresponds to the frequency. The dots are then connected by straight lines. An additional line is drawn at each end to bring the graph back to a zero frequency.
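
The points of a frequency polygon, including the two zero-frequency end points, can be computed as in the sketch below. The function name and the `step` parameter (the spacing between adjacent scores) are assumptions; the frequencies are the courses-taken counts from the ungrouped table.

```python
def polygon_points(scores, frequencies, step=1):
    """Return (x, y) points for a frequency polygon, with one extra
    point at each end that brings the line back to zero frequency."""
    dots = list(zip(scores, frequencies))
    start = (scores[0] - step, 0)
    end = (scores[-1] + step, 0)
    return [start] + dots + [end]

courses = [1, 2, 3, 4, 5, 6]
freq = [25, 38, 217, 1462, 932, 15]

points = polygon_points(courses, freq)
print(points)
```

Connecting consecutive points with straight lines gives the polygon.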

Bar graphs

- When the score categories (X values) are measurements from a nominal or an ordinal scale, the graph should be a bar graph.
- A bar graph is just like a histogram except that gaps or spaces are left between adjacent bars.

Smooth curve

- If the scores in the population are measured on an interval or ratio scale, it is customary to present the distribution as a smooth curve rather than a jagged histogram or polygon.
- The smooth curve emphasizes the fact that the distribution is not showing the exact frequency for each category.

Frequency distribution graphs

- Frequency distribution graphs are useful because they show the entire set of scores.
- At a glance, you can determine the highest score, the lowest score, and where the scores are centered.
- The graph also shows whether the scores are clustered together or scattered over a wide range.

Shape

- A graph shows the shape of the distribution.
- A distribution is symmetrical if the left side of the graph is (roughly) a mirror image of the right side.
- One example of a symmetrical distribution is the bell-shaped normal distribution.
- On the other hand, distributions are skewed when scores pile up on one side of the distribution, leaving a "tail" of a few extreme values on the other side.

Positively and Negatively Skewed Distributions

- In a positively skewed distribution, the scores tend to pile up on the left side of the distribution with the tail tapering off to the right.
- In a negatively skewed distribution, the scores tend to pile up on the right side and the tail points to the left.
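
The direction of skew can also be checked numerically. The slides do not define a skewness measure, so the sketch below brings in the moment-based skewness coefficient as an outside assumption: it is positive when the tail points right, negative when the tail points left, and zero for a symmetrical set. The three data sets are hypothetical.

```python
def skewness(scores):
    """Moment-based skewness coefficient: m3 / m2**1.5."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5

right_tail = [1, 1, 1, 2, 2, 3, 9]  # scores pile up on the left
left_tail = [1, 7, 8, 8, 9, 9, 9]   # scores pile up on the right
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right tail", right_tail),
                   ("left tail", left_tail),
                   ("symmetric", symmetric)]:
    print(name, round(skewness(data), 3))
```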

Time Series (Paired Data)

- The data set is composed of quantitative entries taken at regular intervals over a period of time.
- e.g., the amount of precipitation measured each day for one month.
- Use a time series chart to graph.

[Figure: time series chart; x-axis: time, y-axis: quantitative data]
Time-Series Graph
[Figure 2-8: Number of Screens at Drive-In Movie Theaters]
Graphing Qualitative Data Sets

Pie Chart
- A circle is divided into sectors that represent categories.

Pareto Chart
- A vertical bar graph in which the height of each bar represents frequency or relative frequency.

[Figure: Pareto chart; x-axis: categories, y-axis: frequency]
Constructing Pareto Charts
- Create a bar for each category, where the height of the bar can represent frequency or relative frequency.
- The bars are often positioned in order of decreasing height, with the tallest bar positioned at the left.
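
The two construction steps can be sketched as a small helper that orders the categories and computes relative frequencies. The function name and the cap-color counts are hypothetical; drawing the bars themselves is left to whatever plotting tool is at hand.

```python
def pareto_order(frequencies):
    """Return (category, frequency, relative frequency) tuples,
    ordered from the tallest bar to the shortest."""
    total = sum(frequencies.values())
    ordered = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    return [(category, f, f / total) for category, f in ordered]

# Hypothetical counts of cap colors among 20 students.
cap_colors = {"red": 8, "blue": 5, "green": 2, "black": 5}

bars = pareto_order(cap_colors)
for category, f, rel in bars:
    print(f"{category:<6} f={f:<3} relative={rel:.2f}")
```

Relative frequency is the class frequency divided by the total, so the relative frequencies sum to 1.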
