MS6107D Business Statistics: Session 1
Data
• Continuous variable: measurement (quantitative) data
• Discrete variable: categorical (qualitative) data
Statistics
• Descriptive statistics are used to organize or
summarize a particular set of measurements.
• A descriptive statistic will describe that set of
measurements.
– The mean age of the class is a statistic.
– The rate of absenteeism is a statistic.
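As a minimal sketch of the two examples above, the snippet below summarizes a small set of measurements. The ages, the absence counts, and the 40 scheduled days are hypothetical values invented for illustration; they are not data from the course.

```python
from statistics import mean

# Hypothetical ages (in years) of ten students; not real class data.
ages = [22, 24, 23, 27, 25, 22, 30, 26, 24, 27]

# Hypothetical days absent per student, out of 40 scheduled class days.
days_absent = [0, 2, 1, 0, 4, 1, 0, 3, 0, 1]
scheduled_days = 40

# Each statistic summarizes this particular set of measurements only.
mean_age = mean(ages)
absenteeism_rate = sum(days_absent) / (len(days_absent) * scheduled_days)

print(f"Mean age: {mean_age:.1f} years")
print(f"Absenteeism rate: {absenteeism_rate:.1%}")
```

Both numbers describe only the ten students measured; nothing is claimed about any larger group.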
Statistics
• Inferential statistics use data gathered from a
sample to make inferences about the larger
population from which the sample was drawn.
• For example,
– Opinion polls and television ratings systems are
common uses of inferential statistics. A limited
number of people are polled during an election,
and this information is then used to describe
voters as a whole.
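The polling example can be sketched as follows. The sample size and the count of supporters are hypothetical. The normal-approximation 95% interval used here is one common textbook choice, not a method stated in these slides.

```python
import math

# Hypothetical poll: 540 of 1,000 sampled voters favor candidate A.
n = 1000
favor = 540

p_hat = favor / n                              # sample proportion
std_err = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
z = 1.96                                       # approximate 97.5th percentile of the standard normal

# Approximate 95% interval for the share of ALL voters who favor A.
lower = p_hat - z * std_err
upper = p_hat + z * std_err

print(f"Sample share: {p_hat:.3f}")
print(f"Approximate 95% interval for the population share: ({lower:.3f}, {upper:.3f})")
```

The sample share describes the 1,000 people polled; the interval is the inferential step that extends the result to the population.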
Data Measurement scale
• Scales of measurement refer to the ways in which
variables/numbers are defined and categorized. Each
scale of measurement has certain properties, which in
turn determine whether certain statistical analyses are
appropriate. The four scales of measurement are
nominal, ordinal, interval, and ratio.
• The four scales are ordered as Nominal, Ordinal,
Interval, and Ratio: the Nominal scale has the fewest
mathematical properties, followed by Ordinal and
Interval, and the Ratio scale has the most.
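One way to picture this ordering is as a lookup from each scale to the properties it has. The sketch below uses the property names from this session (identity, magnitude, equal distance, true zero). The nominal entry (identity only) follows the usual textbook convention and is an assumption here, and the function name `is_meaningful` is purely illustrative.

```python
# Properties held by each scale. The nominal entry (identity only) is the
# usual textbook convention and is an assumption in this sketch.
SCALE_PROPERTIES = {
    "nominal": {"identity"},
    "ordinal": {"identity", "magnitude"},
    "interval": {"identity", "magnitude", "equal distance"},
    "ratio": {"identity", "magnitude", "equal distance", "true zero"},
}

# Property that each kind of comparison relies on (illustrative mapping).
REQUIRED_PROPERTY = {
    "equality": "identity",
    "ordering": "magnitude",
    "difference": "equal distance",
    "ratio": "true zero",
}


def is_meaningful(scale, comparison):
    """Return True if the comparison is meaningful on the given scale."""
    return REQUIRED_PROPERTY[comparison] in SCALE_PROPERTIES[scale]


for scale in SCALE_PROPERTIES:
    allowed = [c for c in REQUIRED_PROPERTY if is_meaningful(scale, c)]
    print(f"{scale:<8} -> {', '.join(allowed)}")
```

Each scale allows everything the previous one does plus one more kind of comparison, which is why the order runs from nominal to ratio.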
Ratio Scale
• A ratio scale is at the top level of measurement. The factor which clearly
defines a ratio scale is that it has an absolute (true) zero.
• Ratio scales of measurement have all of the properties of the abstract
number system.
• Properties of Ratio Scale
• Identity
• Magnitude
• Equal distance
• Absolute/true zero
• These properties allow us to apply all possible mathematical operations,
including addition, subtraction, multiplication, and division. The
absolute/true zero allows us to know how many times greater one case is
than another. Variables that have all of the above properties are
measured on a ratio scale.
Ratio Scale
• For a variable X, taking two values, X1 and X2, the
ratio X1/X2 and the distance (X2 − X1) are
meaningful quantities. Also, there is a natural
ordering (ascending or descending) of the values
along the scale.
• Therefore, comparisons such as X2 ≤ X1 or X2 ≥
X1 are meaningful. Most economic variables
belong to this category. Thus, it is meaningful to
ask how big this year’s GDP is compared with the
previous year’s GDP.
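The GDP comparison can be sketched numerically. The two GDP figures below are hypothetical. The unit change at the end illustrates why a true zero matters: the ratio stays the same whatever unit is used.

```python
# Hypothetical GDP figures in billions of currency units (not real data).
gdp_prev = 500.0   # X1: previous year
gdp_this = 525.0   # X2: this year

ratio = gdp_this / gdp_prev        # how many times as large (X2 / X1)
distance = gdp_this - gdp_prev     # how much larger (X2 - X1)
is_larger = gdp_this >= gdp_prev   # natural ordering

# Re-express both values in millions: the ratio is unchanged
# because zero GDP means zero in every unit.
ratio_in_millions = (gdp_this * 1000) / (gdp_prev * 1000)

print(f"Ratio X2/X1: {ratio:.2f}")
print(f"Distance X2 - X1: {distance:.1f} billion")
print(f"X2 >= X1: {is_larger}")
print(f"Ratio after changing units: {ratio_in_millions:.2f}")
```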
Interval Scale
• A typical survey rating scale is usually treated as an interval scale.
• When respondents are asked to rate their satisfaction with a training
session on a 5-point scale (Strongly Agree, Agree, Neutral, Disagree,
Strongly Disagree), an interval scale is being used.
• It is an interval scale because the distance between adjacent scale
elements is assumed to be equal, i.e. the distance between Strongly
Agree and Agree is assumed to be the same as that between Agree
and Neutral.
• This means that we can interpret differences in the
distance along the scale. We contrast this with an ordinal
scale, where we can only talk about differences in order, not
differences in the degree of order, i.e. the distance between
responses.
Interval scale
• Properties of Interval Scales
• Interval scales have the properties of:
• Identity
• Magnitude
• Equal distance
• Variables which fulfill the above-mentioned properties are placed on
this scale. The equal distance between scale points tells us how
many units greater than, or less than, another case a given case is.
The distance between 25 and 35 means the same as the distance
between 65 and 75.
• An interval scale variable satisfies the last two properties described
for the ratio scale variable (meaningful distances and a natural
ordering) but not the first (meaningful ratios). Thus, the distance
between two time periods, say (2000 − 1995), is meaningful, but not
the ratio of two time periods (2000/1995).
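A short sketch of why distances survive on an interval scale while ratios do not. It uses temperature in Celsius and Fahrenheit, a standard interval-scale example that does not appear in the slides; the three temperature values are arbitrary.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Arbitrary illustrative temperatures in degrees Celsius.
temps_c = [10.0, 20.0, 30.0]
temps_f = [c_to_f(t) for t in temps_c]

# Ratios of values depend on where the arbitrary zero sits.
ratio_c = temps_c[1] / temps_c[0]
ratio_f = temps_f[1] / temps_f[0]

# Comparisons of distances do not: equal gaps stay equal after conversion.
gap_ratio_c = (temps_c[2] - temps_c[1]) / (temps_c[1] - temps_c[0])
gap_ratio_f = (temps_f[2] - temps_f[1]) / (temps_f[1] - temps_f[0])

print(f"20/10 in Celsius: {ratio_c:.2f}; same temperatures in Fahrenheit: {ratio_f:.2f}")
print(f"Gap comparison in Celsius: {gap_ratio_c:.2f}; in Fahrenheit: {gap_ratio_f:.2f}")
```

The same reasoning applies to calendar years: 2000 − 1995 is a span of five years under any choice of starting year, whereas 2000/1995 depends entirely on where year zero was placed.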
Ordinal Scale
• An ordinal scale is a ranking of responses, for instance ranking cyclists at
the end of a race in positions 1, 2 and 3.
• Note that these are ranks: the time gap between 1 and 2 may well not
be the same as the gap between 2 and 3. The distance between points is
not the same, but an order is present. When responses have an order but
the distance between responses is not necessarily the same, the items are
placed on the ordinal scale.
• Therefore an ordinal scale lets the researcher interpret gross order but
not the relative positional distances.
• Ordinal Scale variables have the property of Identity and Magnitude. The
numbers represent a quality being measured (identity) and can tell us
whether a case has more of the quality measured or less of the quality
measured than another case (magnitude). The distance between scale
points is not necessarily equal. Ranked preferences are an example of
ordinal scales encountered in everyday life.
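The cyclist example can be sketched as below. The rider names and finishing times are hypothetical. The ranks keep the order of the times but discard the size of the gaps.

```python
# Hypothetical finishing times in seconds (not real race data).
finish_times = {"Asha": 3605.2, "Ben": 3606.0, "Chen": 3671.5}

# Rank riders from fastest to slowest: this is the ordinal measurement.
ranked = sorted(finish_times, key=finish_times.get)
ranks = {rider: position for position, rider in enumerate(ranked, start=1)}

# Time gaps between consecutive finishers: information the ranks do not carry.
gaps = [finish_times[b] - finish_times[a] for a, b in zip(ranked, ranked[1:])]

print("Ranks:", ranks)
print("Gaps between consecutive ranks (seconds):", [round(g, 1) for g in gaps])
```

Positions 1 and 2 are under a second apart while positions 2 and 3 are more than a minute apart, yet the ranks alone show only 1, 2, 3.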
Nominal Scale
Table 1.1: Egg production and egg prices by U.S. state, 1990 and 1991

State      Y1      Y2     X1     X2    State      Y1      Y2     X1     X2
AL      2,206   2,186   92.7   91.4    MT        172     164   68.0   66.0
AK        0.7     0.7  151.0  149.0    NE      1,202   1,400   50.3   48.9
AZ         73      74   61.0   56.0    NV        2.2     1.8   53.9   52.7
AR      3,620   3,737   86.3   91.8    NH         43      49  109.0  104.0
CA      7,472   7,444   63.4   58.4    NJ        442     491   85.0   83.0
CO        788     873   77.8   73.0    NM        283     302   74.0   70.0
CT      1,029     948  106.0  104.0    NY        975     987   68.1   64.0
DE        168     164  117.0  113.0    NC      3,033   3,045   82.8   78.7
FL      2,586   2,537   62.0   57.2    ND         51      45   55.2   48.0
GA      4,302   4,301   80.6   80.8    OH      4,667   4,637   59.1   54.7
HI      227.5   224.5   85.0   85.5    OK        869     830  101.0  100.0
ID        187     203   79.1   72.9    OR        652     686   77.0   74.6
IL        793     809   65.0   70.5    PA      4,976   5,130   61.0   52.0
IN      5,445   5,290   62.7   60.1    RI         53      50  102.0   99.0
IA      2,151   2,247   56.5   53.0    SC      1,422   1,420   70.1   65.9
KS        404     389   54.5   47.8    SD        435     602   48.0   45.8
KY        412     483   67.7   73.5    TN        277     279   71.0   80.7
LA        273     254  115.0  115.0    TX      3,317   3,356   76.7   72.6
ME      1,069   1,070  101.0   97.0    UT        456     486   64.0   59.0
MD        885     898   76.6   75.4    VT         31      30  106.0  102.0
MA        235     237  105.0  102.0    VA        943     988   86.3   81.2
MI      1,406   1,396   58.0   53.8    WA      1,287   1,313   74.1   71.5
MN      2,499   2,697   57.7   54.0    WV        136     174  104.0  109.0
MS      1,434   1,468   87.8   86.7    WI        910     873   60.1   54.0
MO      1,580   1,622   55.4   51.5    WY        1.7     1.7   83.0   83.0

Note: Y1 = eggs produced in 1990 (millions); Y2 = eggs produced in 1991 (millions);
X1 = price per dozen (cents) in 1990; X2 = price per dozen (cents) in 1991.
Source: World Almanac, 1993, p. 119. The data are from the Economic Research Service,
U.S. Department of Agriculture.
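As a small illustration of working with the table, the sketch below takes three rows copied from it (CA, IA, TX) and computes the percentage change in output and in price between 1990 and 1991. The choice of states and the helper name `pct_change` are arbitrary and only for illustration.

```python
# Three rows copied from the egg table: state -> (Y1, Y2, X1, X2).
rows = {
    "CA": (7472, 7444, 63.4, 58.4),
    "IA": (2151, 2247, 56.5, 53.0),
    "TX": (3317, 3356, 76.7, 72.6),
}


def pct_change(old, new):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# For each state: (change in eggs produced, change in price per dozen).
changes = {
    state: (pct_change(y1, y2), pct_change(x1, x2))
    for state, (y1, y2, x1, x2) in rows.items()
}

for state, (d_output, d_price) in changes.items():
    print(f"{state}: output {d_output:+.1f}%, price {d_price:+.1f}%")
```

Percentage changes are meaningful here because output and price both have a true zero, whereas the state abbreviations serve only as labels.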
Pooled Data
• Pooled, or combined, data contain elements of
both time series and cross-section data. The
data in Table 1.1 are an example of pooled
data.
• For each year we have 50 cross-sectional
observations and for each state we have two
time series observations on prices and output
of eggs, a total of 100 pooled (or combined)
observations.
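The pooled structure can be sketched by stacking the two years into one record per state and year. Only three of the 50 states are included to keep the example short, and the field names are hypothetical.

```python
# (state, Y1, Y2, X1, X2) for three of the 50 states in Table 1.1.
wide_rows = [
    ("CA", 7472, 7444, 63.4, 58.4),
    ("IA", 2151, 2247, 56.5, 53.0),
    ("TX", 3317, 3356, 76.7, 72.6),
]

# Stack into one record per (state, year): the pooled layout.
pooled = []
for state, y1, y2, x1, x2 in wide_rows:
    pooled.append({"state": state, "year": 1990, "eggs_millions": y1, "price_cents": x1})
    pooled.append({"state": state, "year": 1991, "eggs_millions": y2, "price_cents": x2})

# Fixing the year gives a cross-section; fixing the state gives a time series.
cross_section_1990 = [r for r in pooled if r["year"] == 1990]
time_series_ca = [r for r in pooled if r["state"] == "CA"]

# With the full table: 50 states x 2 years = 100 pooled observations.
n_states, n_years = 50, 2
total_pooled_obs = n_states * n_years

print(len(pooled), "records in this three-state subset")
print(total_pooled_obs, "pooled observations in the full table")
```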
Panel, Longitudinal, or Micropanel Data
• This is a special type of pooled data in which the same
cross-sectional unit (say, a family or a firm) is surveyed
over time.
• For example, the U.S. Department of Commerce carries
out a census of housing at periodic intervals. At each
periodic survey the same household (or the people
living at the same address) is interviewed to find out if
there has been any change in the housing and financial
conditions of that household since the last survey.
• By interviewing the same household periodically, the
panel data provides very useful information on the
dynamics of household behavior,