
Descriptive 01 PDF

This document provides an introduction to statistics. It defines statistics as the science of collecting, organizing, summarizing, analyzing, and interpreting data. The key concepts covered include data types (quantitative, qualitative, discrete, continuous), scales of measurement (nominal, ordinal, interval, ratio), descriptive statistics (frequency distributions, measures of central tendency and dispersion), inferential statistics, and uses of statistics. Biostatistics is introduced as the application of statistics to biological and medical fields. Common statistical terms like data, population, sample, and tables are also defined.
Copyright
© All Rights Reserved

Introduction to

Statistics

By:
Mr. Johny Kutty Joseph
Asst. Professor
Concepts & Definition
• Statistics is used to organize, interpret, and
communicate numeric information.
• Logical thinking is required more than
mathematical ability.
• The word statistics comes from the
Italian word statista, meaning statesman,
and the German word Statistik, meaning
political state.
• It is the science of learning from
numbers/data.
• It is the science of collecting, classifying,
analyzing and interpreting data.
Concepts & Definition
• A branch of mathematics dealing with
the collection, analysis,
interpretation, and presentation of
masses of numerical data. (Merriam-Webster)
• “Statistics is defined as collection,
presentation, analysis and
interpretation of numerical data”.
(Croxton & Cowden)
• It is the science and art of dealing
with figures and facts.
Uses of Statistics
• To make raw data meaningful.
• To test the null hypothesis.
• To test the statistical significance of
data.
• To draw inferences and make
generalizations.
• To estimate parameters.
• To make decisions based on data, and
make predictions.
• To help in comparison.
Biostatistics
• Biostatistics is the branch of statistics
applied to the biological or medical
sciences.
• It comprises the methods used in
dealing with statistics in the
health sciences, such as biology,
medicine, nursing and public health.
• Biostatistics is also called “biometry”.
Data
• Data are defined as things known or assumed
as facts, forming the basis of reasoning or
calculation.
• Broadly there are quantitative and qualitative
data.
• Quantitative data deal with numbers and
things you can measure objectively. Eg. height,
weight, length, temperature, volume, area etc.
They are numeric values.
• Qualitative data deal with characteristics
and descriptors that can't be easily measured,
but can be observed subjectively. Eg. smells,
tastes, textures, attractiveness, and color.
Data
• Quantitative data may be
continuous or discrete.
• Discrete data is a count that can't be
made more precise. For instance, the
number of children in your family is
discrete data, because you are counting
whole, indivisible entities: you can't have
2.5 kids.
• Continuous data could be divided and
reduced to finer and finer levels. Eg;
Height of children made more precise by
Meters-centimeters-millimeters and
Data
• Qualitative data: It is also referred to as
attribute data. It includes binary, nominal (unordered)
and ordinal (ordered) data.
• Binary data place things in one of two
mutually exclusive categories: right/wrong,
true/false, or accept/reject.
• Nominal Data: We assign individual items a
number or category that does not have an
implicit or natural value or rank. (Gender:
1 = male and 2 = female)
• Ordinal Data: The items are assigned to
categories that have some kind of implicit or
natural order. Eg. "Short, Medium, or
Tall." Rating from 1 to 5 on scale where 5 is
Scales of Measurement
• Measurement is the process of assigning
numbers or labels to objects, persons,
states, or events in accordance with
specific rules to represent quantities or
qualities of attributes.
• We do not measure specific objects,
persons, etc., we measure attributes or
features that define them.
• It is a system of classifying
measurements according to the nature of
the measurement and the type of
mathematical operations to which they
Scales of Measurement
Data Classification in Science
Nominal Measurement
• The lowest level of measurement, also referred
to as categorical data.
• It represents characteristics. Eg. Gender,
Language, locality etc. Numerical values may
be assigned but do not have any mathematical
meaning.
• They act as labels and hence changing their order
doesn’t have any significance.
Ordinal Measurement
• It is the second level, in which scores are
assigned so that as the number
increases, the status/condition also increases
or upgrades.
• Eg. How often do you feel back pain?
No Pain: 1, Mild Pain: 2, Moderate: 3, Severe: 4
• The limitation of this type of data is that
the differences between the 4 options are not
equally measurable or not known.
• It is mainly used to measure non-numerical
features such as patient satisfaction, etc.
Interval Measurement
• An interval scale has the characteristics of an
ordinal scale.
• An interval scale permits measurement
that places data at equally
spaced intervals in relation to the spread of
the variable.
• This measurement has a starting and a
terminating point, and the range between them is divided into equal
intervals.
• The problem with interval data is that
they don’t have a true zero.
• Eg. What is the room temperature?
a) -20 to -10; b) -10 to 0; c) 0 to 10; d) 10 to 20
Ratio Measurement
• It is the highest level of measurement.
• A ratio scale measures in
terms of equal intervals and an absolute zero
point of origin. It has all the properties of
the nominal, ordinal and interval scales.
• Bio-physiological characteristics such as
age, weight and height are examples.
• Variables measured on either an
interval or a ratio scale are considered continuous.
• Eg. It can be stated that someone who
weighs 80 kg is twice as heavy as someone who
weighs 40 kg.
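The "twice as heavy" statement only works because weight has a true zero. A minimal sketch of the contrast with an interval scale, assuming hypothetical temperatures of 10 and 20 degrees Celsius and an approximate kg-to-lb factor (neither appears in the slides):

```python
KG_TO_LB = 2.20462  # approximate conversion factor (assumption)

def celsius_to_kelvin(celsius):
    """Shift a Celsius reading onto the Kelvin scale, which has a true zero."""
    return celsius + 273.15

# Ratio scale: the ratio of two weights survives a change of unit.
weight_ratio_kg = 80 / 40
weight_ratio_lb = (80 * KG_TO_LB) / (40 * KG_TO_LB)

# Interval scale: 20 C looks like "twice" 10 C, but the ratio changes
# as soon as the zero point moves, so it carries no real meaning.
temp_ratio_celsius = 20 / 10
temp_ratio_kelvin = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(weight_ratio_kg, round(weight_ratio_lb, 3))
print(temp_ratio_celsius, round(temp_ratio_kelvin, 3))
```

The weight ratio stays at 2 in both units, while the temperature ratio does not survive the shift of zero point.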
Comparison of levels
• The levels of measurement form a hierarchy,
with ratio at the top and nominal at the base.
• The higher the level of measurement, the more precise
the data.
• It is possible to convert data to a lower level but
not the reverse.
• Ratio data may be converted to ordinal, but
ordinal data cannot be converted to ratio.
• Eg. Assess the weight of people:
Ordinal          Ratio
a. Below 50      a. 40 to 50
b. 50 to 70      b. 50 to 60
• Some psychological scales (Likert’s scale)
are considered ordinal as well as
interval.
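Converting ratio data down to the ordinal level can be sketched as a simple binning step. This is an illustration only: the weights are hypothetical, and the "Above 70" category is an assumption added to close the scale.

```python
def weight_to_ordinal(weight_kg):
    """Map a weight in kg (ratio level) to an ordered category (ordinal level)."""
    if weight_kg < 50:
        return "Below 50"
    if weight_kg <= 70:
        return "50 to 70"
    return "Above 70"  # hypothetical third category, not in the slide

weights_kg = [45.2, 58.0, 63.5, 71.8]  # hypothetical measurements
categories = [weight_to_ordinal(w) for w in weights_kg]
print(categories)
```

Once only the category is stored, the exact weight cannot be recovered, which is why the conversion does not work in reverse.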
Classification of Statistics
• Descriptive Statistics: It is the
enumeration, organization and graphical
representation of data. It helps to
summarize the meaning of data. Eg.
Demographic variables.
• Inferential Statistics: It is also called
sampling statistics. It is the inference of
conditions that exist in a larger set of
observations. Eg. Testing the efficacy of a
new antihypertensive drug in a particular
population.
Descriptive Statistics
• It is classified as follows:
• Frequency distribution and graphical
presentation (measures of condensation).
• Measures of central tendency (Mean,
Median, Mode).
• Measures of dispersion (difference). Eg.
Range, Mean deviation, Standard
deviation, Quartile deviation.
• Measures of relationship (correlation
coefficient, regression etc.)
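The measures of central tendency and dispersion listed above can be sketched with Python's standard `statistics` module. The five scores are a hypothetical sample chosen for easy arithmetic, not data from the slides.

```python
import statistics

scores = [18, 20, 20, 22, 25]  # hypothetical sample

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion
data_range = max(scores) - min(scores)
mean_deviation = sum(abs(x - mean) for x in scores) / len(scores)
std_dev = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)

print(mean, median, mode)
print(data_range, mean_deviation, round(std_dev, 2))
```

`statistics.stdev` treats the scores as a sample; `statistics.pstdev` would be the choice if they were the whole population.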
Frequency Distribution
• A set of data can be described in terms of
three characteristics: distribution of
values, central tendency, and variability
(dispersion and relationship).
• The distribution of values, or frequency
distribution, is used to organize
numeric data.
• It is a systematic arrangement of values
from lowest to highest, together with a count
of the number of times (frequency) each
value was obtained.
Frequency Distribution
• Observe the table below of the anxiety
scores of 60 patients.
22 24 25 19 24 25 23 23 24 20

25 16 20 25 17 22 24 18 22 23

15 24 23 22 21 24 20 25 18 25

24 23 16 25 30 20 19 21 23 24

19 18 20 21 17 25 22 24 20 17

20 25 21 24 23 19 21 21 25 21

• Inspection of these numbers alone does not
help us to understand the patients’ anxiety.
Frequency Distribution
• A frequency distribution consists of two
parts: observed values (X) and frequencies
(f). N is the sample size.
• Scores are listed in order in one column and
the corresponding frequencies in another.
• The sum of the frequencies
must be equal to N (Σf = N).
• See the following frequency distribution
table of the patients’ anxiety
scores, which gives a clearer understanding of
the data.
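The tallying described above can be sketched with `collections.Counter`. The scores are the 60 raw values exactly as typed in the listing, so the counts reflect that listing and may not agree with a table prepared separately.

```python
from collections import Counter

# The 60 anxiety scores, row by row, as they appear in the raw listing.
scores = [
    22, 24, 25, 19, 24, 25, 23, 23, 24, 20,
    25, 16, 20, 25, 17, 22, 24, 18, 22, 23,
    15, 24, 23, 22, 21, 24, 20, 25, 18, 25,
    24, 23, 16, 25, 30, 20, 19, 21, 23, 24,
    19, 18, 20, 21, 17, 25, 22, 24, 20, 17,
    20, 25, 21, 24, 23, 19, 21, 21, 25, 21,
]

n = len(scores)              # N, the sample size
frequency = Counter(scores)  # maps each observed value X to its frequency f

# Rows of (X, f, %) ordered from the lowest to the highest score.
table = [(x, f, round(100 * f / n, 1)) for x, f in sorted(frequency.items())]

for x, f, pct in table:
    print(f"{x:>5} {f:>4} {pct:>6.1f}")
print(f"N = {n}, sum of f = {sum(frequency.values())}")
```

The final line checks the rule Σf = N.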
Frequency Distribution
SCORE (X) Frequency (f) Percentage (%)
15 1 1.7
16 2 3.3
17 3 5.0
18 4 6.7
19 4 6.7
20 7 11.7
21 7 11.7
22 4 6.7
23 8 13.3
24 10 16.7
25 10 16.7
N = 60 = Σf Σ% = 100.0%
Tables
• A table presents data in a concise, systematic
manner from the masses of statistical
data.
• Tabulation is the first step in data
analysis.
• A table consists of a table number, title,
contents, footnotes etc.
• Tables are broadly classified into:
• A. Frequency distribution table
• B. Contingency Table
• C. Multiple response table
Tables
• Frequency distribution tables: it represents the frequency
and percentage distribution of the collected
information. Usually the number of classes
varies between 3 and 8. Too many or too few
classes may fail to reveal the salient
features of the data.
Socio demographic Profile of patients (N = 60)
Variables          F (%)
Age (years)
  20 - 40          18 (30.0)
  41 - 60          42 (70.0)
Gender
  Male             39 (65.0)
  Female           21 (35.0)
  Transgender      0 (0.0)
Marital Status
  Married          52 (86.7)
  Unmarried        08 (13.3)
  Divorced         0 (0.0)
Locality
  Urban            31 (51.7)
  Rural            29 (48.3)
Tables
• Contingency tables: it represents the frequency
distribution of two mutually exclusive nominal
variables simultaneously. It is also called a
cross table. These tables could be 2x2, 2x3
or 3x3 depending on the number of categories of each variable.
The number of subjects in a cell is called the
cell frequency. These tables are usually used
for the Chi-square test.
Type of Ventilation and Bowel movements in patients
Bowel Movements   Spontaneous ventilation   Mechanical ventilation   Total frequency   χ2 value
Present           391 (64.0)                32 (29.4)                423               45.87
Absent            220 (36.0)                77 (70.6)                297               df = 1, (c-1)(r-1)
Total             611                       109                      720 (N)
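The Chi-square statistic for such a table can be sketched from the cell frequencies alone. This is the plain Pearson formula with no continuity correction, which is an assumption since the slides do not say which variant was used. The result should land close to the value reported in the table, with small differences possible from rounding or a different variant.

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, cell in enumerate(row):
            # Expected cell frequency if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (cell - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(col_totals) - 1)  # (r-1)(c-1)
    return chi2, df

# Rows: bowel movements present / absent.
# Columns: spontaneous / mechanical ventilation.
observed = [[391, 32], [220, 77]]
chi2, df = chi_square(observed)
print(round(chi2, 2), df)
```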


Tables
• Multiple response table: It is used to represent data that are
neither exclusive nor exhaustive. It is used when “f” exceeds “N”.
It is made to represent the percentage distribution.
Factors Contributing to sleep deprivation among patients (N = 60)
Factors                  F (%)
Blood sampling           35 (58.3)
Diagnostic Tests         33 (55.0)
Medication               33 (55.0)
Vital Signs monitoring   32 (53.3)
Noise                    32 (53.3)
Bright Lights            30 (50.0)
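A short sketch of why a multiple response table behaves differently. Each patient can name several factors, so the frequencies add up to more than N and each percentage is taken out of N rather than out of Σf. The figures are those of the sleep deprivation table; the variable names are illustrative.

```python
n_patients = 60

responses = {
    "Blood sampling": 35,
    "Diagnostic Tests": 33,
    "Medication": 33,
    "Vital Signs monitoring": 32,
    "Noise": 32,
    "Bright Lights": 30,
}

total_responses = sum(responses.values())  # exceeds N: categories overlap

# Percentage of patients reporting each factor (base is N, not total responses).
percent_of_patients = {
    factor: round(100 * f / n_patients, 1) for factor, f in responses.items()
}

print(total_responses, n_patients)
for factor, pct in percent_of_patients.items():
    print(f"{factor:<24} {pct:>5.1f}")
```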
Tables
• Miscellaneous Table: Tables that
represent data other than frequency or
percentage distributions, such as mean,
median, mode, SD etc.
Graphical Representation of Data

• It is the most convenient and appealing way
in which statistical results may be
presented.
• It gives an overall view of the entire data
and is visually attractive.
• It facilitates comparison.
Types of Graphs and Diagram

• Bar Diagram: Useful in displaying
nominal or ordinal data. It shows a
visual comparison of the magnitude of a
variable and its frequency. It may
be drawn either vertically or horizontally.
• There are mainly three types of bar
diagram: simple, multiple and
proportion bar charts. See the following
examples.
Types of Graphs and Diagram

[Figure: Simple bar diagram showing dietary pattern of people. Categories: Vegetarian, Non vegetarian. Data labels: 72, 28.]
Types of Graphs and Diagram

[Figure: Multiple bar diagram showing the percentage of population and land. Categories: Asia, Africa, Europe. Series: Population, Land. Data labels: 60, 40, 30, 30, 26, 14.]
Types of Graphs and Diagram
[Figure: Proportionate bar graph showing world's population and land area. Bars: Population, Land. Segments: Asia, Africa, Europe. Data labels: Population 60, 14, 26; Land 40, 30, 30.]
Types of Graphs and Diagram

• Pie Diagram/Sector diagram: Useful to present discrete
data such as age groups, gender, etc. in a population. The
input must be in percentage. The size of each angle is
calculated by the formula: (class frequency / total
observations) x 360 degrees.
[Figure: pie diagram, "Health Problems of the old age in Jammu". Sectors: Hypertension, Diabetes, Arthritis, Sensory. Data labels: 20, 32, 8, 40.]
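The angle formula can be sketched as a one-line function. The four values are the data labels visible on the pie chart. Their pairing with the named health problems cannot be recovered from the text, so the sectors are left unnamed here.

```python
def sector_angles(frequencies):
    """Angle in degrees for each sector: class frequency / total x 360."""
    total = sum(frequencies)
    return [f / total * 360 for f in frequencies]

slice_values = [20, 32, 8, 40]  # data labels from the chart, sectors unnamed
angles = sector_angles(slice_values)

for value, angle in zip(slice_values, angles):
    print(f"{value:>3} -> {angle:.1f} degrees")
```

The angles always add up to 360 degrees, whatever the total of the frequencies.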
Types of Graphs and Diagram

• Histogram: The most commonly used
graphical representation of a grouped
frequency distribution.
• The classes (groups) of the variable are
on the x axis and their respective frequencies
on the y axis.
• The frequency of each group forms a column or
rectangle.
• The area of each rectangle is proportional to the
frequency of the class interval.
• Eg:
Age group (years)   15-20   20-25   25-30   30-35   35-40
No. of males        15      20      40      60      50
Types of Graphs and Diagram

• Frequency Polygon: It is the (two
dimensional) curve obtained by joining the midpoints of the
tops of the rectangles in a histogram with
straight lines.
• The two end points of the line are
joined to the x axis at the midpoints of the
empty class intervals on either side.
• It is simpler and sketches the outline of the
data more clearly than a histogram.
• Eg:
Age group (years)   15-20   20-25   25-30   30-35   35-40
No. of males        15      20      40      60      50
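The points of a frequency polygon can be sketched from the class intervals and frequencies of the example. The function name is illustrative, and equal class widths are assumed, as in this table.

```python
class_intervals = [(15, 20), (20, 25), (25, 30), (30, 35), (35, 40)]
frequencies = [15, 20, 40, 60, 50]

def polygon_points(intervals, freqs):
    """(x, y) points of a frequency polygon, closed on the x axis at both ends."""
    width = intervals[0][1] - intervals[0][0]  # assumes equal class widths
    midpoints = [(low + high) / 2 for low, high in intervals]
    points = list(zip(midpoints, freqs))
    # Midpoints of the empty classes on either side, with frequency 0.
    first = (midpoints[0] - width, 0)
    last = (midpoints[-1] + width, 0)
    return [first] + points + [last]

points = polygon_points(class_intervals, frequencies)
print(points)
```

Joining these points in order with straight lines gives the polygon.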
Types of Graphs and Diagram

[Figure: frequency polygon of Number of Males by age group (15 - 20 to 35 - 40), with points at 15, 20, 40, 60 and 50 and end points at 0.]
Types of Graphs and Diagram

• Line graph: Like the frequency polygon, it
depicts the data by a line.
• Commonly used to represent data that
are collected over a long period of time.
• Independent variables are presented on the x axis
and dependent variables on the y axis.
• The plotted points can be joined by straight
lines.
Year                                2001  2002  2003  2004  2005  2006  2007
Cars sold in Delhi (in thousand)    123   203   328   298   337   417   486
Cars sold in Mumbai (in thousand)   456   402   387   347   342   307   298
Types of Graphs and Diagram

[Figure: Line graph presenting the number of cars sold in Delhi and Mumbai during 2001 - 2007. Series: In Delhi, In Mumbai. X axis: years 2001 to 2007.]
Types of Graphs and Diagram

• Cumulative Frequency curve/“ogive”: It is the
representation of cumulative frequency for
statistical purposes.
• First convert the frequency table to
cumulative frequencies and then plot them as a
line.
Age group (years)      15-20   20-25   25-30   30-35   35-40
No. of males           15      20      40      60      50
Cumulative Frequency   15      35      75      135     185
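The conversion from frequencies to cumulative frequencies can be sketched with `itertools.accumulate`. Pairing each cumulative value with the upper limit of its class is one common plotting convention and is an assumption here, since the slides do not specify it.

```python
from itertools import accumulate

class_intervals = [(15, 20), (20, 25), (25, 30), (30, 35), (35, 40)]
frequencies = [15, 20, 40, 60, 50]

# Running total of the frequencies, class by class.
cumulative = list(accumulate(frequencies))

# Points for the ogive: (upper class limit, cumulative frequency).
ogive_points = [(high, cf) for (_, high), cf in zip(class_intervals, cumulative)]

print(cumulative)
print(ogive_points)
```

The last cumulative frequency equals the total number of observations.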
Types of Graphs and Diagram

[Figure: cumulative frequency curve of Number of Males by age group (15 - 20 to 35 - 40), rising through 15, 35, 75, 135 and 185.]
Types of Graphs and Diagram

• Scattered or dotted diagrams: It is a graphic
representation that shows the nature of
correlation between two variables. Eg.
Student marks in an examination.
• It is also called a correlation diagram.
• The correlation may be positive (upward) or negative (downward).
Number of students   Marks obtained out of 100
12                   40-50
10                   50-60
8                    60-70
7                    70-80
5                    80-90
2                    90-100
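The direction of the relationship in the marks example can be checked with a correlation coefficient. This is a minimal sketch: each class of marks is represented by its midpoint (an assumption made only so that the classes become numbers), and `pearson_r` is an illustrative helper, not a library function.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

mark_midpoints = [45, 55, 65, 75, 85, 95]  # midpoints of 40-50 ... 90-100
students = [12, 10, 8, 7, 5, 2]

r = pearson_r(mark_midpoints, students)
print("negative (downward)" if r < 0 else "positive (upward)")
```

A coefficient below zero corresponds to the downward pattern: as marks rise, the number of students falls.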
Types of Graphs and Diagram

[Figure: Scattered diagram showing the negative correlation. X axis: Marks obtained out of 100. Y axis: Number of students.]
Types of Graphs and Diagram

• Pictograms or picture diagram: Use of
pictures to plot the frequency of a
characteristic.
• Map diagram or spot map: Maps are
prepared to show the geographical distribution
of the frequencies of a characteristic.
Limitations of Graphs

• It can be confusing (depending on the type).
• It presents only quantitative data.
• It shows only one aspect or a limited number of
characteristics.
• It can present only approximate values.
