Stat I Chapter 1 & 2 Ppt-1
CHAPTER ONE
INTRODUCTION TO STATISTICS
• In a business environment managers can make sound
decisions when they use all relevant information in an
effective and meaningful manner.
1. Descriptive Statistics
2. Inferential Statistics
• Is the process of reaching generalizations about the whole (called the population) by examining a part of it (called the sample).
• In order for this to be valid, the sample must be representative of the population and the
Caution: Inferential Statistics assumes that the sampling methodology is random (i.e.
Census or complete enumeration: a study that includes every member of the target
population; it is often too costly and time-consuming.
A. Based on quantifiable
Qualitative Data - are non-numeric in nature and cannot be measured numerically. Examples are
gender/sex, color, religion, nationality, marital status and place of birth.
Quantitative Data - are numerical in nature and can be measured. Examples are height,
weight, amount of rainfall, age, balance in your savings bank account and number of
computers in a given class. Quantitative data can be classified into discrete and
continuous types.
Discrete type - values are obtained by counting, and the possible values are whole numbers
(0, 1, 2, 3, 4, 5, 6, 7, 8, …), which cannot be fractions.
Continuous type - values are determined by measurement and can include decimal values,
such as the distance between two towns, the weight of a person, height, etc.
B. Based on data source
Primary data: data that do not already exist in any form and thus have to be collected
for the first time from the primary source(s). By their very nature, these data require fresh,
first-time collection covering the whole population or a sample drawn from it.
The benefits of primary data are that they fit the needs exactly, are up to date, and
Secondary data: data that have already been collected by someone else and that
have already passed through the statistical process. They already exist in some form:
advantages of being much cheaper and faster to collect. And its disadvantage is it is not
C. Based on the time the data were collected
Cross-sectional data: data collected at the same time, or at one particular point in time,
on different elements.
For example, sales made at the same point in time but at different marketplaces.
• Nominal data
The weakest level of data measurement. With nominal data, numbers are used
only for coding and labeling (categorizing) items.
For example, nominal data include gender (while collecting data
we may code 0 = male and 1 = female).
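The coding idea above can be sketched in a few lines of Python. The responses below are hypothetical and serve only as an illustration; the point is that the codes act as labels, so counting them is meaningful while arithmetic on them is not.

```python
from collections import Counter

# Hypothetical survey responses (illustrative only, not from the slides).
responses = ["male", "female", "female", "male", "female"]

# Nominal coding: the numbers are labels only (0 = male, 1 = female).
codes = {"male": 0, "female": 1}
coded = [codes[r] for r in responses]

# Counting how often each code occurs is a valid summary of nominal data.
counts = Counter(coded)
print("Coded data:", coded)
print("Code 0 (male):", counts[0], "| Code 1 (female):", counts[1])

# Averaging the codes would not be meaningful: swapping the labels
# (1 = male, 0 = female) changes the "mean" without changing the data.
```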
Types of data
• Ordinal data
In ordinal measurement, numbers are used to order and rank data. Ordinal data can also be
verbalized on a continuum such as excellent, very good, good, fair and
poor.
• Ratio data
It is the highest level of measurement and allows you to
perform all basic arithmetic operations, including
division, multiplication, logarithm, and power.
1. Marketing
future years.
4. Personnel
census.
In census approach data is gathered from each and
Limitation of census
Advantages of sample survey
• It reduces cost
• It saves labor.
• It enables advanced tabulation of selected topics.
• Sometimes conducting a sample survey is the only option for study.
• A sample survey may be used to test census procedures and to update
census results.
Limitation of sample survey
TYPES OF CLASSIFICATION
Qualitative Classification: Data are arranged
according to attributes like color, religion, marital
status, sex, educational background, etc.
Frequency Distribution (FD)
The data must then be organized into a frequency distribution (FD), which simply lists the
values or classes with their corresponding frequencies in tabular form.
Here, frequency refers to the number of times a certain
value occurs in the data.
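As a minimal sketch of this idea, the Python snippet below tallies a small data set into a frequency distribution. The data values are hypothetical, chosen only for illustration, and `Counter` from the standard library does the counting.

```python
from collections import Counter

# Hypothetical raw data: number of children in 10 households (illustrative).
data = [2, 0, 1, 2, 3, 1, 2, 0, 1, 2]

# Count how many times each value occurs, then list the values in order.
freq = Counter(data)
table = sorted(freq.items())

print("Value  Frequency")
for value, f in table:
    print(f"{value:>5}  {f:>9}")

# The frequencies always add up to the number of observations.
print("Total:", sum(f for _, f in table))
```

Each row pairs a distinct value with its frequency, which is the tabular form described above.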
Frequency (fi) 3 10 18 6
Common Terminologies in a GFD
Class width
The difference between the upper and lower class boundaries of any class, or equivalently the
difference between the lower class limits of two successive classes.
Remarks:
If both the LCL & UCL are included in a class, it is called an inclusive class. For inclusive classes,
Note: the difference between any two successive class marks is equal to the
class width.
Range (R)
is the difference between the largest (L) and the smallest (S) values in a data set:
R = L - S
Rules for forming a Grouped Frequency Distribution (GFD)
1. There should be between 5 and 20 classes.
2. The classes must be mutually exclusive. This means that
no data value can fall into two different classes.
3. The classes must be all-inclusive or exhaustive. This
means that all data values must be included.
4. The classes must be continuous. There are no gaps in a
frequency distribution.
5. The classes must be equal in width. The exception here
is the first or last class. It is possible to have a
"below ..." or "... and above" class. This is often used
with ages.
Steps for constructing Grouped frequency Distribution
Find the largest and smallest values
Compute the Range (R) = Maximum - Minimum
Select the number of classes desired, usually between 5 and 20, or use
Sturges' rule, k = 1 + 3.322 log10(n), where k is the number of classes desired and n is the total
number of observations.
Find the class width by dividing the range by the number of classes and
rounding up, not off.
Pick a suitable starting point less than or equal to the minimum value. The starting
point is called the lower limit of the first class. Continue to add the class width to this
lower limit to get the rest of the lower limits.
To find the upper limit of the first class, subtract one unit of measurement (U) from the
lower limit of the second class. Then continue to add the class width to this upper limit to
find the rest of the upper limits.
Find the boundaries by subtracting U/2 units from the lower limits and adding U/2
units to the upper limits. The boundaries are also half-way between the upper limit
of one class and the lower limit of the next class. It may not be necessary to find the
boundaries.
Find the frequencies.
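The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive procedure: the function name `grouped_frequency` and the 20 data values are hypothetical, the data are assumed to be whole numbers with unit of measurement U = 1, Sturges' rule is rounded to the nearest whole number of classes, and classes are added until the largest value is covered (which can give one class more than k when the range divides evenly).

```python
import math

def grouped_frequency(data, unit=1):
    """Build inclusive classes; return rows of (lower limit, upper limit, frequency)."""
    smallest, largest = min(data), max(data)          # largest and smallest values
    data_range = largest - smallest                   # range R = L - S
    k = round(1 + 3.322 * math.log10(len(data)))      # Sturges' rule (rounded: an assumption)
    width = math.ceil(data_range / k / unit) * unit   # class width, rounded up, not off
    width = max(width, unit)                          # guard against a zero width

    rows = []
    lower = smallest                                  # starting point = minimum value
    while lower <= largest:                           # keep adding classes until max is covered
        upper = lower + width - unit                  # next lower limit minus one unit U
        frequency = sum(1 for x in data if lower <= x <= upper)
        rows.append((lower, upper, frequency))
        lower += width
    return rows

# Hypothetical data: 20 whole-number observations (illustrative only).
data = [11, 29, 23, 14, 39, 17, 21, 35, 26, 18,
        12, 31, 24, 20, 16, 28, 33, 22, 19, 25]

unit = 1
rows = grouped_frequency(data, unit)
print("Limits    Boundaries     Frequency")
for lower, upper, f in rows:
    # Boundaries: subtract U/2 from lower limits, add U/2 to upper limits.
    print(f"{lower}-{upper}     {lower - unit / 2}-{upper + unit / 2}     {f}")
```

With these 20 values the sketch produces five classes of width 6 starting at 11, and the frequencies sum to 20.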
The number of customers for consecutive 30 days in a