Hand-Out1 - Introduction To Statistics
Objectives:
(1) Know the principles of statistics and why we study and use them in the behavioral sciences.
Definition. Statistics
In its plural sense, it refers to the data itself or some numerical computations derived from a
set of data that are systematically collected and analyzed.
In singular sense, it refers to the scientific discipline consisting of theory and methods for
processing numerical information that one can use when making decisions in the face of
uncertainty.
In the broadest sense, “statistics” refers to a range of techniques and procedures for
analyzing, interpreting, displaying, and making decisions based on data.
Statistics is the language of science and data. The ability to understand and communicate using
statistics enables researchers from different labs, different languages, and different fields to
articulate to one another exactly what they have found in their work. It is an objective, precise,
and powerful tool in science and in everyday life.
• Determining patients' level of satisfaction with the nursing care administered by student
nurses at Central Mindanao University.
• Determining the distribution of the number of text messages sent per day by CMU students
enrolled in Math 15.
• Comparing the Statistics exam results of the different CMU colleges.
• Determining the relationship between faculty status and work commitment.
• Predicting the number of CMU students for the next school year, 2009-2010.
(1) Descriptive Statistics – methods concerned with collecting, describing, and analyzing a set
of data without drawing conclusions (or inferences) beyond the data.
(2) Inferential Statistics – methods concerned with the analysis of a subset of data leading to
predictions or inferences about the entire set of data, that is, methods that generalize results
beyond the data collected, provided that the data collected are a part (sample) of a larger set
of items (population).
Occupation          Salary
Pediatrician        35,000
Dentist             30,000
Physicist           25,000
Architect           20,000
Flight Attendant    15,000
A new milk formulation designed to improve the psychomotor development of infants was tested
on randomly selected infants. Based on the results, it was concluded that the new milk
formulation is effective in improving the psychomotor development of infants.
Definitions.
Universe – the set of all entities under study, that is, the collection of things or observational
units under study.
Variable – a characteristic observed or measured on every unit of the universe.
Population – the set of all possible values of the variable.
Sample – a subset of the population.
Parameters – numerical measures that describe the population or universe of interest.
Statistics – numerical measures that describe a sample.
Frame – a listing of all the elements in a population.
Census – the process in which information is gathered from all units in the population.
Sample survey or sampling – the process in which information is gathered from only a part of
the population.
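The distinction between a parameter and a statistic can be sketched in Python. This is only an illustration: the population values are randomly generated and hypothetical (they are not CMU data), and the names `population`, `sample`, `parameter_mean`, and `statistic_mean` are assumptions made for this sketch.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: the number of text messages sent per day
# by each of 1,000 students (made-up values, for illustration only).
population = [random.randint(0, 60) for _ in range(1000)]

# Census: every value in the population is used, so the mean is a parameter.
parameter_mean = statistics.mean(population)

# Sample survey: only a subset of the population is used, so the mean is a statistic.
sample = random.sample(population, 50)
statistic_mean = statistics.mean(sample)

print(f"parameter (population mean): {parameter_mean:.2f}")
print(f"statistic (sample mean):     {statistic_mean:.2f}")
```

The two means will usually be close but not equal, which is why a statistic is treated as an estimate of the corresponding parameter.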
The building blocks of statistical science are data. Specific characteristics (e.g., age, height, and
weight) that we want to assess for a certain population are referred to as variables. Variables may
be categorized further as qualitative and quantitative variables.
(1) Qualitative variables – variables that yield observations by which individuals can be
categorized according to some characteristic or quality, e.g., gender, marital status, and blood
type. They are expressed in categories.
(2) Quantitative variables – variables that yield observations that can be measured, e.g.,
weight, height, systolic blood pressure, and body mass index. A numerical measure exists.
(3) Constant – a variable that assumes only one value.
Note: Data collected on particular variables are classified as either qualitative or quantitative.
Data are qualitative if no numerical measure exists (e.g., gender, marital status, and blood type);
such data are usually expressed in categories. Quantitative data are expressed in numbers (e.g.,
weight, height, systolic blood pressure, and body mass index); such data are measured or counted.
Discrete data – data that can be counted, e.g., the number of patients in a hospital or the number
of students who obtained a grade of 1.0 in Math 15 and Math 34. These data assume only a
countable number of values.
Continuous data – data that can be measured, e.g., systolic blood pressure, weight and
height. These data result from infinitely many possible values that can be associated with
points on a continuous scale in such a way that there are no gaps or interruptions.
Note: Arithmetical operations on quantitative data have some physical interpretation. A variable
may take numerical values without being quantitative; the issue is whether performing arithmetical
operations on the data would make any sense. For example, the sum of two zip codes, or the
difference between your cellular telephone number and your seatmate's, has no meaningful
interpretation, so these variables are not quantitative. The figure on the next page illustrates
the classification of data collected on particular variables.
The levels (scales) of measurement differ in the properties of numbers (identity, order,
additivity) that they possess.
Identity – the property that enables a person to distinguish one number from another.
Numbers are recognized by the shapes in which they are written.
Order – is the property that numbers are arranged in a sequence. For any integer number
(1) Nominal scale – the lowest level of measurement; it is most often used with variables that
are qualitative in nature rather than quantitative. It possesses only the property of identity;
thus, numbers are used only to classify.
Examples: gender, eye color, smoking status, nationality, handedness, favorite color, and
religion. For instance, for the variable gender, if 1 is assigned to male and 2 to
female, it does not mean that female is better than male.
(2) Ordinal scale – possesses the properties of identity and order. We can rank-order the objects
according to whether they possess more, less, or the same amount of the variable being measured.
Thus we can determine whether A > B, A = B, or A < B, but we cannot determine how much
greater or less A is than B in the attribute being measured.
(3) Interval scale – possesses the properties of identity, order, and additivity but does not have
the absolute zero property. Interval scales are not perfect, however. In particular, they do not
have a true zero point even if one of the scaled values happens to carry the name “zero.”
The Fahrenheit scale illustrates the issue: zero degrees Fahrenheit does not represent the
complete absence of temperature (the absence of any molecular kinetic energy).
Examples: Consider the Fahrenheit scale of temperature. The difference between 30 degrees
and 40 degrees represents the same temperature difference as the difference
between 80 degrees and 90 degrees. Intelligence scores are another example.
(4) Ratio scale – possesses the properties of identity, order, equality of scale and absolute zero.
It is an interval scale with the additional property that its zero position indicates the absence
of the quantity being measured.
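The difference between an interval scale and a ratio scale can be checked with a short Python sketch. The 30/40 and 80/90 degree values come from the example above; the comparison of 40 and 80 degrees and the conversion to kelvin (a temperature scale with a true zero) are assumptions added here for illustration, and the function name `fahrenheit_to_kelvin` is hypothetical.

```python
def fahrenheit_to_kelvin(deg_f: float) -> float:
    """Convert degrees Fahrenheit (interval scale) to kelvin (ratio scale)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15

# Differences are meaningful on an interval scale:
# 30 -> 40 degrees is the same change as 80 -> 90 degrees.
diff_a = 40.0 - 30.0
diff_b = 90.0 - 80.0

# Ratios are not meaningful on an interval scale.
low_f, high_f = 40.0, 80.0
ratio_f = high_f / low_f  # 2.0, but 80 degrees F is not "twice as hot" as 40 degrees F

# On a scale with an absolute zero, the same two temperatures have a ratio near 1.08.
ratio_k = fahrenheit_to_kelvin(high_f) / fahrenheit_to_kelvin(low_f)

print(diff_a == diff_b)   # True
print(ratio_f)            # 2.0
print(round(ratio_k, 2))  # 1.08
```

Because the zero point of the Fahrenheit scale is arbitrary, the ratio of two Fahrenheit readings changes when the same temperatures are expressed on another scale, while equal differences remain equal.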
Rating scales are used frequently in psychological research. For example, experimental
subjects may be asked to rate their level of pain, how much they like a consumer product, their
attitudes about capital punishment, or their confidence in an answer to a test question.
In statistics, we usually deal with groups of data that result from measuring one or more
variables. The data are often derived from samples and occasionally from populations, but in either
case it is useful to let symbols stand for the variables measured in the study. Most statistics
books use the Roman letter X, and sometimes Y, to stand for the variable(s) measured.
The number of observations is represented by N for a population and n for a sample. Let the
symbol $x_i$ (read "x sub i") denote any of the N or n values; for n values, we have
$x_1, x_2, x_3, \ldots, x_n$ assumed by a variable X. The letter i in $x_i$, which stands for any
of the numbers 1, 2, 3, ..., n, is called a subscript, or index. Any letter other than i, such as
j, k, v, w, q, or r, could have been used as well.
The summation symbol (∑) is a more compact way of writing the sum of a set of data values:
\[
\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \cdots + x_n
\]
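Example. Consider the ages of a sample of six children as shown in the table below.

Child (i)      1    2    3    4    5    6
Age (x_i)      8   10    7    6   10   12

Find the following:
a) $\sum_{i=1}^{6} x_i$ (the sum of their ages, written in compact form)
b) $\left( \sum_{i=1}^{4} x_i \right)^2$
c) $\sum_{i=1}^{4} x_i^2$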
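Solutions:
a)
\[
\begin{aligned}
\sum_{i=1}^{6} x_i &= x_1 + x_2 + x_3 + x_4 + x_5 + x_6 \\
&= 8 + 10 + 7 + 6 + 10 + 12 \\
&= 53
\end{aligned}
\]
b)
\[
\begin{aligned}
\left( \sum_{i=1}^{4} x_i \right)^2 &= (x_1 + x_2 + x_3 + x_4)^2 \\
&= (8 + 10 + 7 + 6)^2 \\
&= 961
\end{aligned}
\]
c)
\[
\begin{aligned}
\sum_{i=1}^{4} x_i^2 &= x_1^2 + x_2^2 + x_3^2 + x_4^2 \\
&= 8^2 + 10^2 + 7^2 + 6^2 \\
&= 249
\end{aligned}
\]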
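The three results can be checked with a few lines of Python. This is a minimal sketch: the list `ages` holds the six values used in the solution, and the variable names are chosen only for illustration.

```python
# Ages x_1, ..., x_6 taken from the worked solution.
ages = [8, 10, 7, 6, 10, 12]

# a) Sum of all six ages.
sum_all = sum(ages)

# b) Square of the sum of the first four ages: add first, then square.
square_of_sum = sum(ages[:4]) ** 2

# c) Sum of the squares of the first four ages: square first, then add.
sum_of_squares = sum(x ** 2 for x in ages[:4])

print(sum_all, square_of_sum, sum_of_squares)  # 53 961 249
```

Parts b) and c) use the same four values but give different answers, because squaring a sum is not the same as summing the squares.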
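Rules of Summation
1.
\[
\sum_{i=1}^{n} (x_i \pm y_i \pm z_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i \pm \sum_{i=1}^{n} z_i
\]
Example:
\[
\begin{aligned}
\sum_{i=1}^{3} (x_i - y_i) &= \sum_{i=1}^{3} x_i - \sum_{i=1}^{3} y_i \\
&= (x_1 + x_2 + x_3) - (y_1 + y_2 + y_3) \\
&= x_1 + x_2 + x_3 - y_1 - y_2 - y_3
\end{aligned}
\]
2.
\[
\sum_{i=1}^{n} c x_i = c \sum_{i=1}^{n} x_i
\]
where c is a constant.
Example:
\[
\begin{aligned}
\sum_{i=1}^{3} 5 x_i &= 5 \sum_{i=1}^{3} x_i \\
&= 5 (x_1 + x_2 + x_3) \\
&= 5x_1 + 5x_2 + 5x_3
\end{aligned}
\]
3.
\[
\sum_{i=1}^{n} c = nc
\]
where c is a constant.
Example:
\[
\sum_{i=1}^{3} 5 = 3(5) = 15
\]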
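The three rules of summation can be verified numerically. The sketch below uses hypothetical values for x, y, and c, chosen only for illustration; any other numbers would serve equally well.

```python
# Hypothetical values, chosen only to illustrate the rules.
x = [2, 1, -1]
y = [3, 1, 2]
c = 5
n = len(x)

# Rule 1: the sum of the differences equals the difference of the sums.
rule1_lhs = sum(xi - yi for xi, yi in zip(x, y))
rule1_rhs = sum(x) - sum(y)

# Rule 2: a constant factor can be moved outside the summation.
rule2_lhs = sum(c * xi for xi in x)
rule2_rhs = c * sum(x)

# Rule 3: summing a constant n times gives n times the constant.
rule3_lhs = sum(c for _ in range(n))
rule3_rhs = n * c

print(rule1_lhs == rule1_rhs, rule2_lhs == rule2_rhs, rule3_lhs == rule3_rhs)
# True True True
```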
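Example. Let $x_1 = 2$, $x_2 = 1$, $x_3 = -1$ and $y_1 = 3$, $y_2 = 1$, $y_3 = 2$. Find the following:
a) $\sum_{i=1}^{3} x_i y_i$
b) $\sum_{i=1}^{2} x_i^2 y_i^2$
c) $2 \left( \sum_{i=1}^{2} x_i y_i \right)^2$
Solutions:
a)
\[
\begin{aligned}
\sum_{i=1}^{3} x_i y_i &= x_1 y_1 + x_2 y_2 + x_3 y_3 \\
&= (2 \cdot 3) + (1 \cdot 1) + (-1 \cdot 2) \\
&= 6 + 1 - 2 = 5
\end{aligned}
\]
b)
\[
\begin{aligned}
\sum_{i=1}^{2} x_i^2 y_i^2 &= x_1^2 y_1^2 + x_2^2 y_2^2 \\
&= 2^2 \cdot 3^2 + 1^2 \cdot 1^2 \\
&= 36 + 1 = 37
\end{aligned}
\]
c)
\[
\begin{aligned}
2 \left( \sum_{i=1}^{2} x_i y_i \right)^2 &= 2 (x_1 y_1 + x_2 y_2)^2 \\
&= 2 \left( (2 \cdot 3) + (1 \cdot 1) \right)^2 \\
&= 2 (6 + 1)^2 = 2 (7)^2 = 2(49) = 98
\end{aligned}
\]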
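These answers can also be checked in Python. This is a sketch only: the values of x and y are read off from the numbers substituted in the solutions, and the variable names are assumptions made for illustration. The last quantity is not part of the example; it is added to show how the result changes when each product is squared before summing.

```python
# Values read off from the substitutions in the worked solutions.
x = [2, 1, -1]
y = [3, 1, 2]

# a) Sum of the products x_i * y_i for i = 1, 2, 3.
sum_products = sum(xi * yi for xi, yi in zip(x, y))

# b) Sum of x_i^2 * y_i^2 for i = 1, 2.
sum_squared_products = sum((xi ** 2) * (yi ** 2) for xi, yi in zip(x[:2], y[:2]))

# c) Twice the square of the sum of products for i = 1, 2 (sum first, then square).
twice_square_of_sum = 2 * sum(xi * yi for xi, yi in zip(x[:2], y[:2])) ** 2

# For comparison only: squaring each product before summing gives a different value.
twice_sum_of_squares = 2 * sum((xi * yi) ** 2 for xi, yi in zip(x[:2], y[:2]))

print(sum_products, sum_squared_products, twice_square_of_sum, twice_sum_of_squares)
# 5 37 98 74
```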
Factorial notation is a compact way of writing the product of a sequence of positive integers.
The symbol n! (read "n factorial") is defined as
\[
n! = 1 \cdot 2 \cdot 3 \cdots n
\]
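Example. Evaluate the following:
a) $3!$
b) $\dfrac{5!}{2!}$
c) $\dfrac{10!}{5!}$
Solution:
a) $3! = 1 \cdot 2 \cdot 3 = 6$
b) $\dfrac{5!}{2!} = \dfrac{5 \cdot 4 \cdot 3 \cdot 2!}{2!} = 5 \cdot 4 \cdot 3 = 60$
c) $\dfrac{10!}{5!} = \dfrac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5!}{5!} = 10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 = 30{,}240$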
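The factorial computations can be reproduced with Python's standard `math` module. The helper `factorial_ratio` below is a hypothetical function written for this sketch; it mirrors the cancellation used in the solution by multiplying only the factors that remain after the smaller factorial cancels.

```python
import math

def factorial_ratio(n: int, k: int) -> int:
    """Return n!/k! for 0 <= k <= n by multiplying (k + 1) * (k + 2) * ... * n."""
    if not 0 <= k <= n:
        raise ValueError("requires 0 <= k <= n")
    return math.prod(range(k + 1, n + 1))

print(math.factorial(3))       # 6
print(factorial_ratio(5, 2))   # 60
print(factorial_ratio(10, 5))  # 30240

# The shortcut agrees with dividing the two full factorials.
print(factorial_ratio(10, 5) == math.factorial(10) // math.factorial(5))  # True
```

Cancelling the common factors first avoids computing large factorials that would only be divided out again.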