Faculty of Information Science & Technology (FIST) : PSM 0325 Introduction To Probability and Statistics

This document provides an introduction to descriptive statistics. It defines key statistical concepts like population, sample, variables, and data organization. Descriptive statistics involves organizing and summarizing data through tables, graphs, and measures of central tendency and dispersion. Common graphs used include bar graphs, pie charts, histograms, frequency polygons, and ogives. Frequency distribution tables organize qualitative and quantitative data into classes or categories along with their frequencies. Measures of central tendency evaluated include the mean, median and mode, while measures of dispersion include the range, variance and standard deviation.

Faculty of Information Science & Technology

(FIST)

PSM 0325
Introduction to Probability and Statistics

Foundation in Life Science


Foundation in Information Technology

ONLINE NOTES

Topic 1
Descriptive Statistics

FIST, MULTIMEDIA UNIVERSITY (436821-T)


MELAKA CAMPUS, JALAN AYER KEROH LAMA, 75450 MELAKA, MALAYSIA.
URL: http://fist2.mmu.edu.my
PSM0325 Introduction to Probability and Statistics Topic 1

TOPIC 1
DESCRIPTIVE STATISTICS

References :
Introduction to Probability and Statistics, Assliza Salim et al., Pearson, 2011.

Objectives:
1. Understand what is meant by statistics, population, sample, quantitative and qualitative
data, and discrete and continuous variables.
2. Be able to present a set of data by using a frequency distribution table, bar chart, pie
chart, histogram, frequency polygon and ogive.
3. Be able to find the mean, median and mode, as well as the range, variance and standard deviation.

Contents:
1. Introduction
2. Organizing data
3. Measurement of central tendency and dispersion

INTRODUCTION

1.1 WHAT IS STATISTICS


Definition 1.1:
Statistics is a field of study which involves collecting, presenting, analyzing and
interpreting data as a basis for explanation, description and comparison.

Process of statistics:
1. Identify the research objective
- Identify the purpose of the study, determine the questions to be asked, set a
target group.
2. Collect the information needed
- Collect data from a population or sample.
3. Organize, summarize and analyze the information
- Descriptive statistics – Organize the data collected either in a numerical
method or graphical method.
4. Make decision or draw conclusion
- Inferential statistics – The data collected from the sample is generalized to the
population.

1.2 TYPES OF STATISTICS


Statistics can be divided into two branches:
(i) Descriptive Statistics
(ii) Inferential Statistics


Definition 1.2:
Descriptive Statistics is a field of study which involves organizing, displaying and
describing data by using tables, graphs and summary measures.

Definition 1.3:
Inferential Statistics is a field of study that uses sample results to make decisions about
a population.

1.3 POPULATION VERSUS SAMPLE


Definition 1.4:
Population refers to all the elements that are of interest for data collection.
For example, if data on final exam results at UNITELE is needed, every student in
UNITELE forms the population.

Definition 1.5:
Sample refers to a certain number of elements that have been chosen from a population
for observation. A sample is a subset of the population.
For example, choose any 100 students in UNITELE for interviews. The sample size is
100.

1.4 BASIC TERMS


Definition 1.6:
An element or member of a sample or population is a specific subject or object about
which the information is collected.

Definition 1.7:
Variable is a characteristic under study.

Definition 1.8:
The value of the variable for an element is called an observation or measurement.

Definition 1.9:
A data set is a collection of observations on one or more variables.

1.5 TYPES OF VARIABLE


There are two types of variable:
(a) Quantitative variables - measured numerically
- eg: heights, weights, etc.

Quantitative variables can be further divided into:

(i) Discrete variables - countable values
- eg: number of students, etc.


(ii) Continuous variables - can take any value in a certain interval
- eg: weights, time taken, etc.

(b) Qualitative variables - classified into nonnumeric categories
- eg: gender, color of eyes, etc.

ORGANIZING DATA

2.1 RAW DATA


Once data has been collected, it is crucial that the data be well presented for analysis and
interpretation.

Definition 2.1:
Data that have been collected but not yet processed or ranked are called raw data.
Raw data are also called individual data.

2.2 ORGANIZING AND DISPLAYING QUALITATIVE DATA

2.2.1 FREQUENCY DISTRIBUTION


Data can be presented in the form of a table. The data must first be classified into categories.

Definition 2.2:
A frequency distribution is a list of all categories or classes and the number of
elements or values that belong to each of the categories or classes.

2.2.2 RELATIVE FREQUENCY AND PERCENTAGE DISTRIBUTION


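The frequency of each category or class is expressed as a proportion (relative frequency)
or as a percentage of the total frequency.
Relative frequency of a class = frequency of that class / sum of all frequencies
Percentage = relative frequency x 100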
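
The definitions above can be sketched in a few lines of Python. This is only an illustration: the list `majors` is hypothetical data made up for the example (it is not taken from these notes), and the variable names are assumptions.

```python
from collections import Counter

# Hypothetical raw data: the major of each of 10 students (made up for illustration).
majors = ["Bus", "Eco", "MIS", "Bus", "BS", "Bus", "Eco", "MIS", "Ot", "Bus"]

# Frequency distribution: each category with the number of elements in it.
frequency = Counter(majors)
total = sum(frequency.values())

# Relative frequency = frequency of the category / total frequency.
relative_frequency = {category: f / total for category, f in frequency.items()}

# Percentage = relative frequency x 100.
percentage = {category: 100 * r for category, r in relative_frequency.items()}

for category in frequency:
    print(category, frequency[category], relative_frequency[category], percentage[category])
```

The relative frequencies always add up to 1 and the percentages to 100, which is a quick check on the table.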

2.2.3 GRAPHING OF QUALITATIVE DATA

BAR GRAPHS
A bar graph displays the frequencies of the categories contained in a frequency
distribution.
The frequency or the relative frequency is shown by the height or the length of the bar (y - axis).
The different categories are marked on the horizontal axis (x - axis).

The bars are separated, the gap between each pair of bars is uniform, and all bars should be of the same
width.


[Figure: BAR GRAPHS. Frequency (vertical axis, 0 to 10) against Major (horizontal axis: MIS, Eco, Bus, BS, Ot).]

PIE CHARTS
A circle divided into portions that represent the relative frequencies or percentages of a
population or a sample belonging to different categories is called a pie chart.

[Figure: PIE CHART. Slices for the categories Bus, Eco, MIS, BS and Ot.]

2.3 ORGANIZING AND DISPLAYING QUANTITATIVE DATA

2.3.1 FREQUENCY DISTRIBUTION

Definition 2.3
A class interval is a range of values defined by the lower class limit and upper class limit.

Definition 2.4:
Class boundary is the midpoint of the upper limit of one class and the lower limit of the
next class.

Definition 2.5:
Class midpoint or class mark is the average of the lower class limit and the upper class limit.
Formula:
\[ \text{Class midpoint} = \frac{\text{lower class limit} + \text{upper class limit}}{2} \]
Definition 2.6:
Range is equal to highest value minus lowest value.
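
As a small illustration of Definitions 2.3 to 2.5, the sketch below computes class midpoints and class boundaries for the height classes 72-74, 75-77, ..., 84-86 that appear in the figures of these notes. The variable names are assumptions, and extending the first and last classes by the same half-gap is also an assumption of this sketch.

```python
# Class limits (lower, upper) of the height classes, in inches.
class_limits = [(72, 74), (75, 77), (78, 80), (81, 83), (84, 86)]

# Class midpoint = (lower class limit + upper class limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Gap between the upper limit of one class and the lower limit of the next.
gap = class_limits[1][0] - class_limits[0][1]

# A class boundary lies halfway across that gap. Using the same half-gap at the
# outer ends of the first and last classes is an assumption.
class_boundaries = [(lower - gap / 2, upper + gap / 2) for lower, upper in class_limits]

print(midpoints)
print(class_boundaries)
```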


Definition 2.7:
The number of classes can be obtained by using Sturges' formula:
\[ c = 1 + 3.3 \log n \]
where c = number of classes
n = number of observations
and log denotes the logarithm to base 10.
Definition 2.8:
\[ \text{Class size} = \frac{\text{Range}}{\text{number of classes}} \]
Definition 2.9:
Tally marks are used to count class frequencies by marking a stroke against a class for each
data value that falls in that class.
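
Definitions 2.6 to 2.8 can be tried out with a short Python sketch. The values of `n`, `lowest` and `highest` are hypothetical, and the rounding rules (nearest whole number for the number of classes, rounding up for the class size) are assumptions, since the notes do not state how to round.

```python
import math

def sturges_classes(n):
    """Number of classes from Sturges' formula, c = 1 + 3.3 log10(n), before rounding."""
    return 1 + 3.3 * math.log10(n)

n = 30                     # hypothetical number of observations
lowest, highest = 72, 86   # hypothetical lowest and highest values

# Rounding to the nearest whole number is an assumption of this sketch.
c = round(sturges_classes(n))

# Range = highest value - lowest value (Definition 2.6).
data_range = highest - lowest

# Class size = range / number of classes, rounded up so the classes cover the range.
class_size = math.ceil(data_range / c)

print(c, data_range, class_size)
```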
2.3.3 CUMULATIVE FREQUENCY DISTRIBUTION
The cumulative frequency of a class is the total number of values (total frequency)
that fall below the upper class boundary of that class.
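
A minimal sketch of a cumulative frequency distribution follows. The classes are the height classes used in the figures of these notes, but the frequencies are hypothetical, chosen only for illustration.

```python
from itertools import accumulate

# Height classes (in inches); the frequencies are hypothetical.
class_limits = [(72, 74), (75, 77), (78, 80), (81, 83), (84, 86)]
frequencies = [3, 7, 11, 6, 3]

# Upper class boundary: halfway between a class's upper limit and the next
# class's lower limit, which is the upper limit + 0.5 for these classes.
upper_boundaries = [upper + 0.5 for _, upper in class_limits]

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

for boundary, cf in zip(upper_boundaries, cumulative):
    print(f"less than {boundary}: {cf}")
```

The last cumulative frequency equals the total number of observations.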

2.3.4 GRAPHING OF (GROUPED) QUANTITATIVE DATA

HISTOGRAMS
- A histogram is a graphical representation of a grouped frequency distribution with class intervals or
class boundaries on the horizontal axis and frequency on the vertical axis.

- It is obtained by adjoining rectangles. The width of each rectangle is the size of its
class and the height of each rectangle is the frequency of the class. When all classes have
the same size, the area of each rectangle is proportional to the frequency of its class.
[Figure: HISTOGRAM. Frequency (vertical axis, 0 to 12) against Height (in inches) with classes 72-74, 75-77, 78-80, 81-83, 84-86.]

FREQUENCY POLYGONS AND CURVE


A frequency polygon is obtained by joining, with straight lines, the midpoints of the tops of
adjacent rectangles of the histogram.
A frequency curve is obtained by smoothing the corners of a frequency polygon.


[Figure: POLYGON. Frequency (vertical axis, 0 to 12) against Height (in inches) with classes 72-74, 75-77, 78-80, 81-83, 84-86.]

OGIVES
An ogive is the graphical representation of a cumulative frequency distribution. An ogive can be
drawn by joining with straight lines the dots marked above the upper boundaries of the
classes at heights equal to the cumulative frequencies of the respective classes.

[Figure: OGIVE. Cumulative Frequency (vertical axis, 0 to 35) against Height (in inches) at the class boundaries 71.5, 74.5, 77.5, 80.5, 83.5, 86.5.]

DESCRIPTIVE MEASURES

3.1 MEASURES OF CENTRAL TENDENCY


The three common measures of central tendency are mean, median and mode.

3.1.1 MEAN
(i) Ungrouped Data


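Population mean:
\[ \mu = \frac{\sum x}{N} \]
Sample mean:
\[ \bar{x} = \frac{\sum x}{n} \]
where \sum x = sum of all values
N = population size
n = sample size

(ii) Grouped Data
Population mean:
\[ \mu = \frac{\sum mf}{\sum f} = \frac{\sum fm}{N} \]
Sample mean:
\[ \bar{x} = \frac{\sum mf}{\sum f} = \frac{\sum fm}{n} \]
where f = frequency
m = midpoint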
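
The two mean formulas can be checked numerically with a short sketch. All numbers below are hypothetical: `data` is a made-up ungrouped sample, and `frequencies` are made-up frequencies for classes with midpoints 73, 76, 79, 82 and 85.

```python
# Ungrouped data (hypothetical sample): mean = (sum of all values) / n.
data = [4, 8, 6, 5, 7]
mean_ungrouped = sum(data) / len(data)

# Grouped data (hypothetical frequencies): mean = (sum of m * f) / (sum of f).
midpoints = [73, 76, 79, 82, 85]
frequencies = [3, 7, 11, 6, 3]
sum_mf = sum(m * f for m, f in zip(midpoints, frequencies))
n = sum(frequencies)
mean_grouped = sum_mf / n

print(mean_ungrouped)   # 30 / 5
print(mean_grouped)     # 2367 / 30
```

The same arithmetic applies to a population; only the symbols change (\mu and N instead of \bar{x} and n).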

3.1.2 MEDIAN
The median is the value of the item located at the center of the ranked
data.

(i) Ungrouped Data:


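1. Rank the data in ascending order.
2. Find the location of the median:
\[ \text{Location of median} = \frac{n+1}{2}\text{th term} \]
3. Get the median from the ranked data at the \frac{n+1}{2}th term. If n is even, this
location falls between two terms and the median is the average of those two middle values.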
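
The three steps above can be sketched as a small Python function. The function name `median_ungrouped` and the sample values are hypothetical. For an even number of observations the sketch averages the two middle values, which is the usual convention.

```python
def median_ungrouped(values):
    """Median of ungrouped data, located at the (n + 1) / 2 th term of the ranked data."""
    ranked = sorted(values)       # step 1: rank in ascending order
    n = len(ranked)
    if n == 0:
        raise ValueError("no data")
    if n % 2 == 1:
        position = (n + 1) // 2   # step 2: a whole-number location (1-based)
        return ranked[position - 1]
    # n even: (n + 1) / 2 falls between the n/2 th and (n/2 + 1) th terms,
    # so take the average of those two middle values.
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2

print(median_ungrouped([7, 3, 9, 1, 5]))   # ranked: 1 3 5 7 9
print(median_ungrouped([7, 3, 9, 1]))      # ranked: 1 3 7 9
```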
(ii) Grouped Data:
1. Determine the median class by calculating the position of the median, \frac{\sum f}{2}.
The median class is the first class whose cumulative frequency reaches this position.
2. Obtain the median using the formula:
\[ \text{Median} = L + \left( \frac{\frac{\sum f}{2} - F_L}{f_m} \right) \times c \]
where L = Lower boundary of the median class
\sum f = Total frequency
F_L = Total frequency of all classes before the median class
f_m = Frequency of the median class
c = Class width of the median class
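
As an illustration, the grouped-data median formula can be written as a small function. The function name, the argument names and the frequencies are hypothetical. The sketch assumes all classes have the same width and picks the median class as the first class whose cumulative frequency reaches \sum f / 2.

```python
def median_grouped(lower_boundaries, frequencies, class_width):
    """Median = L + ((sum f / 2 - F_L) / f_m) * c for equal-width classes."""
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    position = total / 2
    cumulative_before = 0   # F_L: total frequency of the classes before the current one
    for L, f_m in zip(lower_boundaries, frequencies):
        if cumulative_before + f_m >= position:
            # This is the median class.
            return L + (position - cumulative_before) / f_m * class_width
        cumulative_before += f_m

# Height classes 72-74, ..., 84-86 with hypothetical frequencies.
lower_boundaries = [71.5, 74.5, 77.5, 80.5, 83.5]
frequencies = [3, 7, 11, 6, 3]
median = median_grouped(lower_boundaries, frequencies, class_width=3)
print(median)   # 77.5 + (15 - 10) / 11 * 3
```

Here \sum f / 2 = 15, so the median class is 77.5 to 80.5, with F_L = 10 and f_m = 11.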


3.1.3 MODE
The mode is the value, which occurs most frequently in a distribution.
Note: A set of data may have no mode, one mode, or more than one mode.

(i) Ungrouped Data:


The data value that occurs most often is the mode.
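
A short sketch of finding the mode of ungrouped data follows. The function name `modes` and the sample lists are hypothetical. Treating a data set in which every value occurs exactly once as having no mode is a convention assumed here.

```python
from collections import Counter

def modes(values):
    """Return the list of modes: empty, one value, or several values."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []   # every value occurs once: treated as no mode (assumed convention)
    return [value for value, count in counts.items() if count == highest]

print(modes([1, 2, 3]))         # no mode
print(modes([1, 2, 2, 3]))      # one mode
print(modes([1, 1, 2, 2, 3]))   # more than one mode
```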

(ii) Grouped Data:

\[ \text{Mode} = L + \left( \frac{f_m - f_B}{(f_m - f_B) + (f_m - f_A)} \right) \times w = L + \left( \frac{\Delta_B}{\Delta_B + \Delta_A} \right) \times w \]
where
L = Lower boundary of the mode class
f_m = Frequency of the mode class
f_B = Frequency of the class immediately before the mode class
f_A = Frequency of the class immediately after the mode class
\Delta_B = f_m - f_B
\Delta_A = f_m - f_A
w = Class width of the mode class
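
The grouped-data mode formula can be sketched as below. The function name and the frequencies are hypothetical. Two choices are assumptions of this sketch: if several classes share the highest frequency the first one is used, and a missing neighbouring class (when the mode class is first or last) is given frequency 0.

```python
def mode_grouped(lower_boundaries, frequencies, class_width):
    """Mode = L + (f_m - f_B) / ((f_m - f_B) + (f_m - f_A)) * w."""
    i = frequencies.index(max(frequencies))   # first class with the highest frequency
    f_m = frequencies[i]
    f_B = frequencies[i - 1] if i > 0 else 0                      # class before (assumed 0 if none)
    f_A = frequencies[i + 1] if i < len(frequencies) - 1 else 0   # class after (assumed 0 if none)
    delta_B = f_m - f_B
    delta_A = f_m - f_A
    if delta_B + delta_A == 0:
        raise ValueError("formula undefined: neighbouring frequencies equal the highest frequency")
    return lower_boundaries[i] + delta_B / (delta_B + delta_A) * class_width

# Height classes 72-74, ..., 84-86 with hypothetical frequencies.
lower_boundaries = [71.5, 74.5, 77.5, 80.5, 83.5]
frequencies = [3, 7, 11, 6, 3]
mode = mode_grouped(lower_boundaries, frequencies, class_width=3)
print(mode)   # 77.5 + 4 / (4 + 5) * 3
```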
3.1.4 THE SHAPE OF THE DISTRIBUTION AND THE RELATIONSHIPS
AMONG THE MEAN, MEDIAN AND MODE

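1. Symmetric
For a symmetric distribution with a single peak, the mean, median and mode all have the same value.

2. Skewed to the right
The frequency distribution has a tail stretching out to the right. For a distribution
with a single peak, typically mode < median < mean.

3. Skewed to the left
The frequency distribution has a tail stretching out to the left. For a distribution
with a single peak, typically mean < median < mode.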

3.2 MEASURES OF DISPERSION


3.2.1 RANGE
The range is the difference between the highest and the lowest values in the distribution.
Formula: Range = highest value - lowest value

3.2.2 VARIANCE
The variance measures the spread of the data about the mean.

(i) Ungrouped data

Population variance:
\[ \sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N} \]

Sample variance:
\[ s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} \]
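
The two ungrouped variance formulas can be checked with a short sketch. The list `data` is hypothetical, and the result is compared against Python's standard `statistics` module, which computes the variance from squared deviations about the mean.

```python
import statistics

data = [4, 8, 6, 5, 7]    # hypothetical observations
n = len(data)
sum_x = sum(data)                      # sum of x
sum_x2 = sum(x * x for x in data)      # sum of x squared

# Shortcut formulas: (sum x^2 - (sum x)^2 / n) divided by n (population) or n - 1 (sample).
numerator = sum_x2 - sum_x ** 2 / n
population_variance = numerator / n
sample_variance = numerator / (n - 1)

print(population_variance, statistics.pvariance(data))
print(sample_variance, statistics.variance(data))
```

For this data \sum x = 30 and \sum x^2 = 190, so the numerator is 190 - 900/5 = 10.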


(ii) Grouped Data

Population variance:
\[ \sigma^2 = \frac{\sum m^2 f - \frac{(\sum mf)^2}{N}}{N}, \quad N = \sum f \]

Sample variance:
\[ s^2 = \frac{\sum m^2 f - \frac{(\sum mf)^2}{n}}{n - 1}, \quad n = \sum f \]
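
A minimal sketch of the grouped variance formulas follows, using class midpoints 73, 76, 79, 82, 85 with hypothetical frequencies. The variable names are assumptions.

```python
midpoints = [73, 76, 79, 82, 85]
frequencies = [3, 7, 11, 6, 3]     # hypothetical class frequencies

n = sum(frequencies)                                               # n = sum of f
sum_mf = sum(m * f for m, f in zip(midpoints, frequencies))        # sum of m * f
sum_m2f = sum(m * m * f for m, f in zip(midpoints, frequencies))   # sum of m^2 * f

# (sum m^2 f - (sum m f)^2 / n) divided by n (population) or n - 1 (sample).
numerator = sum_m2f - sum_mf ** 2 / n
population_variance = numerator / n
sample_variance = numerator / (n - 1)

print(population_variance)
print(sample_variance)
```

Here \sum mf = 2367 and \sum m^2 f = 187089, so the numerator is 187089 - 2367^2 / 30 = 332.7.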

3.2.3 STANDARD DEVIATION


The standard deviation is the positive square root of the variance.
Population: \sigma = \sqrt{\sigma^2}
Sample: s = \sqrt{s^2}
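
As a brief illustration, the standard deviation can be obtained by taking the square root of the variance. The list `data` is hypothetical, and the variances are computed with Python's standard `statistics` module.

```python
import math
import statistics

data = [4, 8, 6, 5, 7]    # hypothetical observations

# Standard deviation = positive square root of the variance.
population_sd = math.sqrt(statistics.pvariance(data))   # sqrt(2)
sample_sd = math.sqrt(statistics.variance(data))        # sqrt(2.5)

print(population_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))
```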

____________________________End of Topic 1_______________________________
