Statistics
Services Office
Ariel Nimo B. Pumecha - UC R&I Vice President
Nathaniel Vincent A. Lubrica - RSO Director
Ruben I. Rubia - RSO Secretary
Judale W. Quianio - RSO Research Assistant
Carlo Jay S. Valdez - RSO Research Assistant
Services Offered
Research Consultation
Statistical Consultation
Geographic Information System (GIS) Consultation
Research Ethics Review
Tel. No. (074) 442-6268 local 195
Email address: [email protected]
Definition
Originally the word “statistics” was used for the collection of data concerning states, both historical and descriptive. Now it has acquired a much wider meaning and is used for all types of data and methods for the analysis of the data.
In general, Statistics is a systematic collection of numerical facts. In a simple and comprehensive sense, in the singular, Statistics is employed for the purpose of data collection, classification, presentation, comparison and interpretation of data.
The purpose is to make the data simple, lucid and easy to understand.
Importance
• It simplifies a mass of data (condensation).
• It helps to get concrete information about any problem.
• It helps in reliable and objective decision making.
• It presents facts in a precise and definite form.
• It establishes empirical evidence.
• It helps to understand the behavior of various variables such as enrollment, patients, widows, household consumption expenditure, etc.
History
Early Beginnings
Early Modern Period (1500-1800)
Gambling and probabilities: 1500's
Invention of the word “Statistik” (in German): 1749 by Gottfried Achenwall
WWI: 1916
Automobile & Electronics
1st Programmable Electronic Computer: 1940-45
Variables
When we measure height or weight, these can be seen as variables, because their measurements can vary from time to time.
Constants
A quantity or value that does not change is referred to as a constant, for example the speed of light.
Some basic concepts
Variables
Important terms regarding variables:
Independent Variable
A variable thought to be the cause of some effect.
Predictor Variable
A variable thought to predict an outcome – another term for independent variable.
Dependent Variable
A variable thought to be affected by changes in the independent variable.
Outcome Variable
A variable thought to change as a function of changes in a predictor variable – synonymous with dependent variable.
Continuous VS Discrete Variables
Continuous Variable
Can take any value in a defined range. Also referred to as Measurable Variables.
Examples are weight or height.
Discrete Variable
These variables can only take certain values. Discrete Variables are also known as Categorical Variables.
Example: in a race, 1st, 2nd and 3rd place can be awarded, not 2.5th or 3.25th; or, when coding two victims as 1 for Victim A and 2 for Victim B, there isn't a 1.5 category.
Level of Measurement
• Nominal
Indicate that there is a difference between categories of objects, persons or characteristics – numbers are used here as labels.
– Example:
• Gender (1 = Male, 2 = Female)
• Psychopathology (1 = Schizophrenic, 2 = Manic Depressive, 3 = Neurotic)
• Ordinal
Variables indicate categories that are both different from each other and ranked in terms of the attribute.
– Example
• Race winners 1st, 2nd and 3rd
• Ratio
These variables have all the properties of Interval scales and, in addition, a true zero value, so ratios between values are meaningful.
– Example
• Exam marks – 0%-100%
• Age – a 40 year old person is twice the age of a 20 year old
• Time, length and weight are other examples
Samples and Populations
Population
An entire collection of elements or individuals.
Sample
A portion of the population.
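The distinction between a population and a sample can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 200 exam scores is made up, and the sample size of 20 and the use of simple random sampling are choices made for this sketch, not something the slides specify.

```python
import random
from statistics import mean

# Hypothetical population: exam scores of 200 students (made-up values).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(50, 100) for _ in range(200)]

# A sample is a portion of the population; here, a simple random
# sample of 20 elements drawn without replacement.
sample = rng.sample(population, k=20)

population_mean = mean(population)
sample_mean = mean(sample)

print(f"population: size = {len(population)}, mean = {population_mean:.2f}")
print(f"sample:     size = {len(sample)}, mean = {sample_mean:.2f}")
```

The sample mean will usually be close to, but not equal to, the population mean; that gap is what inferential statistics tries to quantify.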
STATISTICS
The two major branches of statistics are descriptive statistics and inferential statistics. (Colman, 2001)
Descriptive Statistics
• To get descriptive statistics for categorical variables (males–females), frequencies should be used; this will tell you how many gave a response in these categories.
• To get descriptive statistics for continuous variables (age), it is better to use descriptive analysis, which will then provide a summary of the variables (mean, median and the mode).
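As a minimal sketch of the two cases above, the following Python snippet counts frequencies for a categorical variable and summarizes a continuous one. The `sex` and `age` values are hypothetical data invented for illustration; only the standard library is used.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical survey responses (made-up values for illustration only).
sex = ["male", "female", "female", "male", "female", "female"]
age = [21, 19, 22, 19, 20, 19]

# Categorical variable: frequencies tell how many responses fall
# in each category.
sex_counts = Counter(sex)

# Continuous variable: summarize with mean, median and mode.
age_summary = {
    "mean": mean(age),
    "median": median(age),
    "mode": mode(age),
}

print("Frequencies:", dict(sex_counts))
print("Age summary:", age_summary)
```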
Univariate analysis:
– Involves the analysis of one variable at a time across cases
– 3 major characteristics
• Distribution
• Central Tendency
• Dispersion
Descriptive Statistics – Univariate Analysis
• Distribution
– The first step in turning data into information is to create a distribution.
– The most primitive way to present a distribution is to simply list, in one column, each value that occurs in the population and, in the next column, the number of times it occurs.
– This simple listing is called a frequency distribution. A more elegant way to turn data into information is to draw a graph of the distribution.
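The two-column listing described above can be sketched as follows. The quiz scores are hypothetical, and the row of asterisks is only a rough text stand-in for a graph of the distribution.

```python
from collections import Counter

# Hypothetical quiz scores (made-up values for illustration only).
scores = [7, 8, 8, 9, 7, 8, 10, 9, 8, 7]

# Frequency distribution: each value that occurs, paired with the
# number of times it occurs, sorted by value.
freq = Counter(scores)
rows = sorted(freq.items())

print("Value  Frequency")
for value, count in rows:
    # The asterisks give a crude text graph of the distribution.
    print(f"{value:>5}  {count:>9}  {'*' * count}")
```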
• Hypothesis tests (Testing an assumption regarding a population.)
• Confidence intervals
• Regression analysis
Assumptions (ANOVA)
• The data are continuous (not discrete)
• The data follow the normal probability distribution
• Equal variances between groups
• Independence of Samples
Inferential Statistics
(Parametric) Hypothesis Testing
Correlation (e.g. Pearson)
• Exploring Relationships: Looking at the strength of the relationship between variables
– To explore the strength of the relationship between 2 continuous variables. This gives an indication of the direction (positive / negative) and the strength of the relationship.
– A positive correlation indicates that as one variable increases, the other increases as well. A negative correlation indicates that as one increases, the other decreases.
Assumptions (Correlation)
• Related pairs
• Absence of outliers
• Linearity
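Sample Computations
• Paired T-test
At 5% level of significance, compare the scores of the students during pre and post exams after taking a 100 pts. exam.

Student  Pre-Score  Post-Score  Difference (d)
1        80         75           5
2        59         51           8
3        86         84           2
4        69         57          12
5        90         85           5
6        76         74           2
7        89         81           8
8        62         63          -1
9        84         78           6
10       82         87          -5

$\sum d_i = 42$, $n = 10$, $df = n - 1 = 9$

$\bar{d} = \frac{\sum d_i}{n} = \frac{42}{10} = 4.2$

$s_D = \sqrt{\frac{\sum d_i^2 - (\sum d_i)^2 / n}{n - 1}} = 4.89$

$t_{computed} = \frac{\bar{d} - 0}{s_D / \sqrt{n}} = 2.72$

$t_{tabulated} = t_{(0.05/2),\, df = 9} = 2.262$

Since $t_{computed} = 2.72$ is greater than $t_{tabulated} = 2.262$, there is a significant difference between the pre and post scores.
Software: if the p-value (Sig.) < 0.05, then there is a significant difference between the two exams.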
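The paired t-test computation above can be reproduced with a short Python sketch. It uses the ten pre and post scores from the table and the tabulated critical value 2.262 quoted in the text; the variable names are illustrative. Without intermediate rounding the statistic comes out at about 2.71, which matches the 2.72 shown above once $s_D$ is rounded to 4.89.

```python
from math import sqrt

# Pre and post scores of the 10 students, taken from the table.
pre = [80, 59, 86, 69, 90, 76, 89, 62, 84, 82]
post = [75, 51, 84, 57, 85, 74, 81, 63, 78, 87]

# Paired differences d_i = pre_i - post_i.
d = [a - b for a, b in zip(pre, post)]
n = len(d)

d_bar = sum(d) / n  # mean difference
# Sample standard deviation of the differences (computational formula).
s_d = sqrt((sum(x * x for x in d) - sum(d) ** 2 / n) / (n - 1))
# Test statistic for H0: mean difference = 0.
t_computed = (d_bar - 0) / (s_d / sqrt(n))

t_tabulated = 2.262  # two-tailed, alpha = 0.05, df = 9 (from the text)
significant = abs(t_computed) > t_tabulated

print(f"sum(d) = {sum(d)}, d_bar = {d_bar:.2f}, s_D = {s_d:.2f}")
print(f"t_computed = {t_computed:.2f}, significant = {significant}")
```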
• Independent Samples
At 5% level of significance, compare the scores of the two groups.

$\bar{x}_{j} = \frac{\sum_i x_{ij}}{n_{j}}$, for groups $j = 1, 2$

$t_{tabulated} = t_{(0.05/2),\, df = 8} = 2.306$
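A minimal sketch of an independent samples t-test follows. Because the original scores are not available here, the two groups of five scores are hypothetical (chosen so that df = 8, matching the tabulated value 2.306 in the text). The pooled-variance form of the statistic is also an assumption of this sketch; it presumes equal variances between the groups.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical scores for two independent groups (made-up values).
group_1 = [85, 90, 78, 92, 88]
group_2 = [80, 75, 83, 79, 86]

n1, n2 = len(group_1), len(group_2)
mean_1, mean_2 = mean(group_1), mean(group_2)
df = n1 + n2 - 2

# Pooled variance (assumes equal variances between groups).
pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / df
t_computed = (mean_1 - mean_2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

t_tabulated = 2.306  # two-tailed, alpha = 0.05, df = 8 (from the text)
significant = abs(t_computed) > t_tabulated

print(f"mean_1 = {mean_1}, mean_2 = {mean_2}, df = {df}")
print(f"t_computed = {t_computed:.3f}, significant = {significant}")
```

With these made-up scores the computed statistic falls below the tabulated value, so the sketch would report no significant difference between the groups.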
• Correlations
• If r = 1, then all the points lie on a straight line and the relationship is said to be a perfect positive relationship.
• If r = -1, then all the points lie on a straight line but in an inverse relation, and this relationship is said to be a perfect negative relationship.
• If r = 0, then there is no linear relationship between the two variables, and the variables are said to be uncorrelated.
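The three cases of r can be illustrated with a small Python sketch of the Pearson correlation coefficient, written with the usual computational formula based on sums. The function name `pearson_r` and the data sets are hypothetical, chosen only so that r comes out at the three reference values.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation via the computational (sums) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_xx - sum_x ** 2 / n) * (sum_yy - sum_y ** 2 / n))
    return numerator / denominator

# Hypothetical data (made-up values for illustration only).
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # points on a rising line
r_negative = pearson_r(x, [10, 8, 6, 4, 2])  # points on a falling line
r_zero = pearson_r(x, [2, 1, 0, 1, 2])       # no linear trend

print(r_positive, r_negative, r_zero)
```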
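• Correlations

$r = \frac{\sum xy - (\sum x)(\sum y)/n}{\sqrt{\left(\sum x^2 - (\sum x)^2/n\right)\left(\sum y^2 - (\sum y)^2/n\right)}}$

With $\sum x = 130$, $\sum y = 100$, $\sum x^2 = 1878$, $\sum y^2 = 1138$ and $n = 10$:

$r = \frac{\sum xy - (130)(100)/10}{\sqrt{\left(1878 - (130)^2/10\right)\left(1138 - (100)^2/10\right)}} = 0.87$

Since r = 0.87, it suggests a strong, positive correlation between math and stat scores.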