Lect. One
Miss. Rasha
2018-2019
Types of Data:
1. Qualitative Variables (Data): variables (data) which assume non-numerical values.
2. Quantitative Variables (Data): variables (data) which assume numerical values.
There are two types of quantitative data:
a) Discrete Variables (Data): usually obtained by counting. Discrete data have a
finite or countable number of possible values. For example, you can't have
2.73 people in the room.
b) Continuous Variables (Data): can assume an infinite number of possible values
and are usually obtained by measurement. Length, weight, and time are all examples of
continuous variables. Since continuous variables are real numbers, we usually round
them. This implies a boundary that depends on the number of decimal places. For
example, 4 is really any value 3.5 <= x < 4.5. Likewise, with two decimal
places, 3.03 is really any value 3.025 <= x < 3.035.
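The implied boundaries can be computed mechanically. The following is a minimal sketch in Python; the function name rounding_bounds is hypothetical, and the decimal module is used only so that the half-unit arithmetic stays exact.

```python
from decimal import Decimal


def rounding_bounds(rounded: str) -> tuple[Decimal, Decimal]:
    """Return (lower, upper) with lower <= x < upper for a recorded value."""
    value = Decimal(rounded)
    # The exponent is -d for a value written with d decimal places.
    unit = Decimal(1).scaleb(value.as_tuple().exponent)
    half = unit / 2
    return value - half, value + half


print(rounding_bounds("4"))     # boundaries for a value recorded as 4
print(rounding_bounds("3.03"))  # boundaries for a value recorded as 3.03
```

The value is passed as a string so that the number of decimal places written by the observer is preserved.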
Levels of Data:
There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio. These
go from lowest level to highest level. Data is classified according to the highest
level which it fits. Each additional level adds something the previous level did not
have.
1. Nominal Level (Categorical Data):
Level of measurement which classifies data into mutually exclusive, all-inclusive
categories in which no order or ranking can be imposed on the data. Nominal is the
lowest level; only names are meaningful here. For example, color, manufacturer.
2. Ordinal Level:
Level of measurement which classifies data into categories that can be ranked.
Differences between the ranks are not meaningful. Ordinal adds an order to the data.
For example, sizes.
3. Interval Level:
Level of measurement which classifies data that can be ranked and for which
differences are meaningful. However, there is no true starting point (zero), so
ratios are meaningless. Common examples are calendar dates and temperatures
(in Celsius or Fahrenheit).
4. Ratio Level:
Level of measurement which classifies data that can be ranked, for which differences
are meaningful, and for which there is a true zero. True ratios exist between
measurements. For example, counts, weight, height, etc.
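The way each level adds one property to the level before it can be sketched in code. This is only an illustration: the class name Level and the three helper functions are hypothetical, and the example variables are the ones named in the text above.

```python
from enum import IntEnum


class Level(IntEnum):
    NOMINAL = 1   # only names are meaningful
    ORDINAL = 2   # adds an order (ranking)
    INTERVAL = 3  # adds meaningful differences
    RATIO = 4     # adds a true zero, so ratios are meaningful


def allows_ranking(level: Level) -> bool:
    return level >= Level.ORDINAL


def allows_differences(level: Level) -> bool:
    return level >= Level.INTERVAL


def allows_ratios(level: Level) -> bool:
    return level >= Level.RATIO


# Example variables mentioned in the notes, with their levels.
examples = {
    "color": Level.NOMINAL,
    "sizes": Level.ORDINAL,
    "temperature": Level.INTERVAL,
    "weight": Level.RATIO,
}

for name, level in examples.items():
    print(name, level.name, allows_ranking(level),
          allows_differences(level), allows_ratios(level))
```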
Definitions
Population
All subjects possessing a common characteristic that is being studied. For example, all
engineering students.
Census
The collection of data from every element in a population. For example, record the
height for each student in the engineering college.
Sample
A subgroup or subset of the population that is measured. For example, the set of
students in a class in the college.
Parameter
Characteristic or measure obtained from a population.
Statistic (not to be confused with Statistics)
Characteristic or measure obtained from a sample. For example,
the mean (average) height of the students in a class in the college.
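The difference between a parameter and a statistic can be seen numerically. The sketch below is an illustration under stated assumptions: it treats the 30 grades from the raw data example later in these notes as if they were a whole population, and it takes the first ten of them as a sample. Both choices are made here only for demonstration.

```python
from statistics import mean

# Assumption: the 30 grades from the raw data example stand in for a population.
population = [
    68, 84, 75, 82, 68, 90, 62, 88, 76, 93,
    73, 79, 88, 73, 60, 93, 71, 59, 85, 57,
    61, 65, 75, 87, 74, 62, 95, 78, 63, 72,
]

# Assumption: the first ten values form the sample (an arbitrary subgroup).
sample = population[:10]

parameter = mean(population)  # measure obtained from the population
statistic = mean(sample)      # measure obtained from the sample

print("Population mean (parameter):", parameter)
print("Sample mean (statistic):", statistic)
```

The two means generally differ, which is why a statistic is only an estimate of the corresponding parameter.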
Types of Sampling
There are five types of sampling: Random, Systematic, Convenience, Cluster, and
Stratified.
1) Random sampling is analogous to putting everyone's name into a hat and drawing
out several names. Each element in the population has an equal chance of being
selected. While this is the preferred way of sampling, it is often difficult to do,
because it requires a complete list of every element in the population.
Computer-generated lists are often used with random sampling.
3) Convenience sampling is very easy to do, but it's probably the worst technique to
use. In convenience sampling, readily available data are used, that is, the first
people the surveyor runs into.
5) Stratified sampling also divides the population into groups called strata. However,
this time it is by some characteristic, not geographically. For instance, the population
might be separated into males and females. A sample is taken from each of these
strata using either random, systematic, or convenience sampling.
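Random and stratified sampling as described above can be sketched with Python's standard random module. The function names, the made-up population of (student number, sex) pairs, and the fixed seed are all assumptions introduced for illustration. Within each stratum this sketch uses random sampling, which is one of the three options the text allows.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, rng):
    """Draw k distinct elements; every element is equally likely to be chosen."""
    return rng.sample(population, k)


def stratified_sample(population, stratum_of, k_per_stratum, rng):
    """Split the population by a characteristic, then sample within each stratum."""
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)
    return {
        name: rng.sample(members, k_per_stratum)
        for name, members in strata.items()
    }


# Hypothetical population: 20 students labeled by number and sex.
population = [(i, "male" if i % 2 else "female") for i in range(1, 21)]

rng = random.Random(0)  # fixed seed so the run is repeatable
print(simple_random_sample(population, 5, rng))
print(stratified_sample(population, lambda student: student[1], 3, rng))
```

Note that the complete list of the population is needed before any name can be drawn, which is the practical difficulty mentioned for random sampling.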
Statistical Processes:
1. Collection of data
2. Presentation of data
2- Presentation of data
Once collected, the data should be presented in an intelligible form. For large
data sets, a frequency table is usually created, with the first column showing
the distinct values and the second column showing the frequency. The frequency is
the number of times each value is repeated.
Example: Data set: 3, 7, 4, 0, 2, 9, 7, 5, 6, 5, 8, 7, 4, 3, 4, 5, 0, 1, 1, 3, 4, 7, 6, 8,
and 7.
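A frequency table for this data set can be tallied with a few lines of Python. This is a minimal sketch using collections.Counter from the standard library; the column headings are chosen here for illustration.

```python
from collections import Counter

data = [3, 7, 4, 0, 2, 9, 7, 5, 6, 5, 8, 7, 4,
        3, 4, 5, 0, 1, 1, 3, 4, 7, 6, 8, 7]

# Count how many times each distinct value occurs.
frequency = Counter(data)

print("Value  Frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")

# The frequencies must add up to the number of observations.
print("Total:", sum(frequency.values()))
```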
Solution:
The smallest number is: 12
or m = √90 = 9.49 ≈ 9
Class Frequency
Raw Data: data collected in original form, that is, data that have not been
organized numerically.
68 84 75 82 68 90 62 88 76 93
73 79 88 73 60 93 71 59 85 57
61 65 75 87 74 62 95 78 63 72
Based on the data in the table above, find:
(h) The percentage of students who received grades higher than 65 but not higher
than 85
Solution:
* The first step is to subdivide the data into appropriate classes, as in Table 2.1,
to make the questions easier to answer.
* The second step is to construct an array by arranging the numbers in each class
in order of magnitude, as in Table 2.2.
* From Table 2.2 it is relatively easy to answer the above questions:
(a) The highest grade is (95).
(b) The lowest grade is (57).
(c) The range is 95 – 57 = 38 [Range = highest value - lowest value]
(d) The three lowest-ranking students have grades 57, 59, and 60.
(e) The grade of the student ranking tenth highest is 82.
(f) The number of students receiving grades of 85 or higher is 8.
(g) The percentage of students who received grades higher than 65 but not higher
than 85 is 15/30 = 50%.
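The answers above can be checked directly from the raw grades. The following is a minimal sketch in Python; the variable names are chosen here for illustration, and sorting the whole list takes the place of the class-by-class array.

```python
grades = [
    68, 84, 75, 82, 68, 90, 62, 88, 76, 93,
    73, 79, 88, 73, 60, 93, 71, 59, 85, 57,
    61, 65, 75, 87, 74, 62, 95, 78, 63, 72,
]

array = sorted(grades)  # the array: grades in ascending order

highest = array[-1]                                   # (a)
lowest = array[0]                                     # (b)
grade_range = highest - lowest                        # (c) highest - lowest
three_lowest = array[:3]                              # (d)
tenth_highest = sorted(grades, reverse=True)[9]       # (e) index 9 is the 10th
count_85_or_higher = sum(g >= 85 for g in grades)     # (f)
count_65_to_85 = sum(65 < g <= 85 for g in grades)    # higher than 65, not above 85
percentage = 100 * count_65_to_85 / len(grades)       # (g)

print(highest, lowest, grade_range)
print(three_lowest, tenth_highest)
print(count_85_or_higher, count_65_to_85, percentage)
```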