
Week 01 Statistics

The population is the shipment of 2,440 gallon-sized paint cans from the supplier. The variable of interest is the average weight of the cans in the shipment. A sample of 50 cans was weighed to make an inference about the average weight of the entire shipment. However, there is uncertainty in the inference since it is based on a sample rather than the whole population. The measure of uncertainty quantifies how close the sample average is likely to be to the true population average.

Uploaded by

lovesh kumar

Probability and Statistics
BS-1402
Lectures for Week # 01
Instructor: Engr. Syed Muhammad Khubaib

Sweden
Topics to be Covered in week 1

Introduction to Statistical Concepts: Basic Definitions, Concepts and Importance of Statistics in Practical Life
BOOK
Introduction to Statistics

By
R.E. Walpole
Grading Policy

Assignments 15%
Quizzes 10%
Midterm 25%
Final Exam 50%
Statistics
• The science of collecting, organizing,
analyzing and interpreting data.
• Making data meaningful

• Statistical Question:
• A question for which we expect a variety of answers, and where we are interested in the distribution and central tendency of those answers
Statistics
• Numerical data/facts that have been arranged systematically (in Urdu: اعداد و شمار, i.e. facts and figures)
• The discipline that helps us draw conclusions about various phenomena based on collected data
• Numerical quantities calculated from collected data, such as the mean, median and mode
Example
• What is the weight of a mouse?

• Is this a statistical question? Why?

• All mice have different weights, so we get a variety of answers.
• Weigh 20 mice and list the weights in a table
• Identify clusters, peaks and gaps
Example
20 19 21 20
18 20 27 21
28 23 20 19
20 21 18 27
19 22 21 20

Weights of 20 different mice in grams

Plot the weights on a number line to identify the peaks, clusters and gaps
Explanation
• Use the distribution to answer the statistical question
• What is the weight of a mouse?
• The most common weight is 20 grams (6 of the 20 mice), and most mice fall in a cluster between 18 and 23 grams
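The plotting step can be sketched in a few lines of Python. This is only an illustrative sketch: the weights are copied from the table above, while the helper name `dot_plot` and the text-based layout are assumptions made for the example, not part of the lecture.

```python
from collections import Counter

# Weights of the 20 mice in grams, copied from the table in the example.
weights = [20, 19, 21, 20,
           18, 20, 27, 21,
           28, 23, 20, 19,
           20, 21, 18, 27,
           19, 22, 21, 20]

counts = Counter(weights)  # how many mice have each weight


def dot_plot(counts):
    """Return one text line per whole-gram value from min to max (gaps included)."""
    lines = []
    for value in range(min(counts), max(counts) + 1):
        lines.append(f"{value:>3} g | " + "*" * counts.get(value, 0))
    return lines


for line in dot_plot(counts):
    print(line)

# Peak: the value with the most marks. Gaps: values in range that no mouse has.
peak = counts.most_common(1)[0][0]
gaps = [v for v in range(min(counts), max(counts) + 1) if v not in counts]
print("peak:", peak, "gaps:", gaps)
```

Reading the printed plot, the peak sits at 20 g, the main cluster runs from 18 g to 23 g, and there is a gap at 24 to 26 g before the heavier mice at 27 g and 28 g.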
Example 2
• The table shows the ages of people who retired early. You are asked: “How old are people who retired early?”
• A) Is this a statistical question?
• B) Display the data in a dot plot and identify clusters, peaks or gaps
Mean
• Mean: The average of the given data set, i.e. the sum of all values divided by the number of values
• It represents a typical value and hence it can be used as a yardstick for all measurements.

Mean

• Mean = Average
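As a small illustration, the mean of the mouse weights from the earlier example can be computed directly from the definition (sum of the values divided by their count). The variable names below are chosen for this sketch only.

```python
# Weights of the 20 mice in grams, from the earlier example.
weights = [20, 19, 21, 20, 18, 20, 27, 21, 28, 23,
           20, 19, 20, 21, 18, 27, 19, 22, 21, 20]

# Mean = sum of all values / number of values.
total = sum(weights)
mean_weight = total / len(weights)

print(f"sum = {total}, n = {len(weights)}, mean = {mean_weight} g")
```

Note that the mean here is pulled slightly above the most common weight by the few heavier mice at 27 g and 28 g.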
Median
• Median: The middle number; found by
ordering all data points and picking out the
one in the middle (or if there are two
middle numbers, taking the mean of those
two numbers)
• It is the value separating the upper and
lower half of the data set (after putting the
data set into order)
• What is the median of a data set with an even number of values?
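The procedure described above (order the data, then take the middle value or the mean of the two middle values) might be sketched as follows. The function name `median` is an illustrative choice; Python's standard library offers `statistics.median` for the same purpose.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: a single middle element exists.
        return ordered[middle]
    # Even number of values: average the two middle elements.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Weights of the 20 mice in grams, from the earlier example (even-sized data set).
weights = [20, 19, 21, 20, 18, 20, 27, 21, 28, 23,
           20, 19, 20, 21, 18, 27, 19, 22, 21, 20]

print(median(weights))        # two middle values are averaged
print(median([3, 1, 2]))      # odd-sized data set
print(median([1, 3, 5, 7]))   # even-sized data set
```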
Mode
• Mode is the most frequently appearing
number in the data set.
• This may not necessarily be equal to the
mean of the data set

• Can a data set have 2 modes?
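A short sketch can help with the question above. It uses `statistics.multimode` from the Python standard library (available from Python 3.8), which returns every value that shares the highest frequency. The second data set is made up purely for illustration.

```python
from statistics import multimode

# Weights of the 20 mice in grams, from the earlier example.
weights = [20, 19, 21, 20, 18, 20, 27, 21, 28, 23,
           20, 19, 20, 21, 18, 27, 19, 22, 21, 20]

# A hypothetical data set in which two values are tied for most frequent.
two_peaks = [18, 18, 20, 20, 21]

print(multimode(weights))    # a single mode
print(multimode(two_peaks))  # two modes (a bimodal data set)
```

The mouse data has one mode, whereas the made-up data set has two, so a data set can indeed have more than one mode.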


Types of Statistics

Statistics
• Descriptive Statistics
• Inferential Statistics
Descriptive Statistics
• Involves the description of collected data
• Uses charts, graphs and other visuals
• Uses summary measures such as the average (mean), median and mode


Inferential Statistics
• A set of tools with which you use compiled
and summarized data to draw conclusions
about a phenomenon
• Collected data is only a small sample from
a large population of data
• The conclusions are probabilistic in nature
rather than deterministic
• So, it is important to study probability
Deterministic Models
• In deterministic models, the output of the
model is fully determined by the parameter
values and the initial values, whereas
probabilistic (or stochastic) models
incorporate randomness in their approach.
• Consequently, in a probabilistic model the same set of parameter values and initial conditions can lead to a group of different outputs.
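The contrast can be made concrete with a toy growth model. Everything in this sketch (the function names, the growth rate and the size of the random disturbance) is a hypothetical choice for illustration and does not come from the lecture.

```python
import random


def deterministic_growth(initial, rate, steps):
    """Output is fully fixed by the initial value and the parameters."""
    value = initial
    for _ in range(steps):
        value *= 1 + rate
    return value


def stochastic_growth(initial, rate, steps, rng):
    """Same model, but each step receives a random disturbance (assumed Gaussian)."""
    value = initial
    for _ in range(steps):
        value *= 1 + rate + rng.gauss(0, 0.05)
    return value


rng = random.Random(0)  # seeded only so that the sketch is reproducible

# Same inputs, run five times each.
deterministic_runs = [deterministic_growth(100, 0.10, 2) for _ in range(5)]
stochastic_runs = [stochastic_growth(100, 0.10, 2, rng) for _ in range(5)]

print(deterministic_runs)  # five identical values
print(stochastic_runs)     # five different values
```

The deterministic runs all agree, while the stochastic runs scatter around a similar value, which is why conclusions drawn from such models are stated in terms of probability.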
Probabilistic Vs Deterministic
• The time of today’s class is deterministic
• The number of students who will attend today’s class is probabilistic
Aggregates
• Statistics deals with aggregates (large
number of values together)
• The aggregates are subject to a number of
random causes, e.g., height of a student
• What factors are behind the randomness
in height of a set of students?
Data
• Latin for “those that are given”
• Singular form is datum
• Data may be thought of as the results of observations
• Answers given during a police interrogation.
• Interview answers given to a psychologist.
• Time taken to complete a lap in an F1 race.
Data
• Is obtained during scientific enquiry
• Number of people testing positive for
COVID
• Observation of temperature and pressure
during an experiment
• Number of times a student will talk during
the class
• Distance between the Moon and the Earth at different times of the day
Data
• Quantitative Data
• Numbers: height, weight, age, income, heart rate

• Qualitative Data
• Non-numeric: eye color, ethnicity, marital
status
Variable
• A quantity that varies from one individual to another
Variable

• Qualitative
• Quantitative
  • Discrete
  • Continuous
Continuous Variable
• Height of a person
• 185 cm or 185.4378967 cm, depending upon the accuracy of the measuring instrument
• A variable that is measured (rather than counted) is continuous


Discrete Variable
• Data that can be counted

• Number of family members

• Number of subjects in this semester


Measurement Scale
• Nominal Scale
• Ordinal Scale
• Interval Scale
• Ratio Scale
Nominal Scale
• The classification or grouping of
observations into mutually exclusive
qualitative categories is said to constitute
a nominal scale
• Students can be classified as male/female
• Classification based on nationality
Ordinal/ Ranking Scale
• Includes the properties of nominal scale
and has additional quality of
ordering/ranking.
• Students can be ranked by class
performance: excellent, good, fair,
average, poor
• Rainfall can be heavy, moderate or light (the difference between two adjacent ranks may not be the same)
Interval Scale
• A measurement scale with a constant interval size but no true zero point, e.g. a thermometer
• 104 °F is equal to 40 °C, but neither scale has a true zero: at -15 °C the molecules still have some average kinetic energy.
Ratio Scale
• A special type of interval scale where there is a true zero point. Height, weight, volume etc.
• The key difference is that the zero point is meaningful in the ratio scale
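One practical consequence of a true zero is that ratios ("twice as much") only make sense on a ratio scale. The sketch below illustrates this with a temperature conversion and a weight conversion. The function names are illustrative, and the kilogram-to-pound factor is an approximate value assumed for the example.

```python
def celsius_to_fahrenheit(celsius):
    """Standard conversion between two interval scales."""
    return celsius * 9 / 5 + 32


def kg_to_lb(kilograms):
    """Conversion between two ratio scales (approximate factor)."""
    return kilograms * 2.20462


# Interval scale: is 40 degrees "twice as hot" as 20 degrees?
celsius_ratio = 40 / 20
fahrenheit_ratio = celsius_to_fahrenheit(40) / celsius_to_fahrenheit(20)

# Ratio scale: is 40 kg twice as heavy as 20 kg?
kg_ratio = 40 / 20
lb_ratio = kg_to_lb(40) / kg_to_lb(20)

print(celsius_ratio, round(fahrenheit_ratio, 3))  # the ratio changes with the unit
print(kg_ratio, round(lb_ratio, 3))               # the ratio is the same in both units
```

The temperature ratio depends on the unit used because the zero of each temperature scale is arbitrary, while the weight ratio survives the change of unit because zero weight means the same thing in both.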
Example
• The rivers are being polluted by the
industry.
• A study is conducted to check the
contamination levels in different kinds of
fish.
• Data recorded: 1. location of catch, 2. species of fish, 3. size (length) of fish, 4. weight of fish, 5. contamination level of DDT in ppm.
Classification
• Classify the 5 variables into
quantitative/qualitative
• Identify the types of measurement scales
for each variable
Solution
• Quantitative: length, weight, DDT
concentration (all on numerical scale)
• Measurement scale: all 3 of the above are on the ratio scale
• Why ratio scale? Because all of these scales have a true zero
• If the measuring instrument reads zero for an object, the object truly has zero length, weight etc.
Solution
• The remaining 2 variables: location of
catch and species of fish are qualitative
(nominal scale)
• There is no natural ordering in the above 2 variables, so they are not on the ordinal scale
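The solution can be summarised as a small lookup table. The dictionary layout and variable names below are only one possible way of recording the classification and are assumptions of this sketch.

```python
# Each variable from the fish study mapped to (type of data, measurement scale),
# following the solution given on the slides.
variables = {
    "location of catch": ("qualitative", "nominal"),
    "species of fish": ("qualitative", "nominal"),
    "length of fish": ("quantitative", "ratio"),
    "weight of fish": ("quantitative", "ratio"),
    "DDT concentration (ppm)": ("quantitative", "ratio"),
}

quantitative = [name for name, (kind, _) in variables.items() if kind == "quantitative"]
qualitative = [name for name, (kind, _) in variables.items() if kind == "qualitative"]

print("Quantitative:", quantitative)
print("Qualitative:", qualitative)
```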
Explanation
• Statistical methods of describing, reporting
and analyzing data depend on the type of
data measured (quantitative/qualitative)
Types of Errors

Error
• Biased Error
• Random Error
Errors
• Biased errors: arise when the measuring instrument itself has an error.
• Example: a measuring scale that is too short, used for measuring cloth
• The error is cumulative: it grows as the measured quantity increases
• Random errors: the measurement is sometimes less and sometimes more than the actual value. These are also called compensating errors or chance errors.
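A small simulation can show why biased errors accumulate while random errors tend to compensate. All numbers here (1 m pieces of cloth, a 2 cm over-reading, the spread of the random error) are hypothetical values chosen for illustration.

```python
import random

TRUE_LENGTH = 1.0  # each piece of cloth is really 1 m long (hypothetical)
BIAS = 0.02        # a faulty scale over-reads every piece by 2 cm (hypothetical)
NOISE_SD = 0.02    # typical size of a random reading error (hypothetical)


def biased_readings(n):
    """Every reading is off by the same amount in the same direction."""
    return [TRUE_LENGTH + BIAS for _ in range(n)]


def random_readings(n, rng):
    """Each reading is sometimes too high and sometimes too low."""
    return [TRUE_LENGTH + rng.gauss(0, NOISE_SD) for _ in range(n)]


def total_error(readings):
    """Difference between the measured total and the true total."""
    return sum(readings) - TRUE_LENGTH * len(readings)


rng = random.Random(42)  # seeded only so that the sketch is reproducible
for n in (10, 100, 1000):
    print(n,
          round(total_error(biased_readings(n)), 3),
          round(total_error(random_readings(n, rng)), 3))
```

In the printed table, the biased total error grows in proportion to the number of pieces measured, whereas the random total error stays comparatively small because over-readings and under-readings partly cancel.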
Statistical Inference
• A statistical inference is a prediction, an estimate or a generalization about a population based on information contained in a sample.
• We use information from a small group to
generalize conclusions about the whole
population
Elements of Statistical Inference
• There are 5 elements of a statistical inference:
• Population (frame)
• Variable of interest
• Sample
• The inference itself
• An estimate of reliability
Measure of Reliability
• A statement (usually quantified) about the
degree of uncertainty associated with an
inference
• Note: The only way to be certain that the inference is correct is to include the whole population in the sample (which is usually not possible because of practical limitations)
• Hence our inference is based on a sample smaller than the population
Reliability
• It is important to determine and report the
reliability of each inference
Example
• A retailer receives a large number of complaints about under-filled paint cans
• So the retailer starts inspecting the paint cans coming in from the supplier
• Under-filled cans will be returned to the supplier
• A shipment contained 2,440 gallon-size cans, and the retailer weighed a sample of 50 cans on a scale capable of measuring weight to 4 decimal places.
• Properly filled cans weigh 10 pounds
Example
a. Describe the population
b. Describe the variable of interest
c. Describe the sample
d. Describe the inference
e. Describe the measure of uncertainty of
our inference
Bound on the Estimation Error
• It is simply a number that our estimation error (the difference between the average weight of the sample and the average weight of the population of cans) is not likely to exceed.
• So the bound is the measure of uncertainty, or reliability, of our inference.
• Essentially, the inference is incomplete without the measure of its reliability
Example
• When the weights of 50 paint cans are used to estimate the average weight of all the cans, the estimate will not exactly mirror the entire population.
• For example: if the average weight in the sample of 50 cans is 9 pounds, it does not mean that the average of the whole population is also 9 pounds.
Example
• Nevertheless, we can use sound statistical reasoning to ensure that our sampling procedure will generate an estimate that is almost certainly within a specified limit of the true mean weight of the population.
Example
• For example, such reasoning might ensure that the estimate is accurate to within one pound of the actual population mean.
• Consequently, we can write the mean weight of the population as (9 ± 1) pounds
• The interval of ±1 pound represents the measure of reliability of the inference.
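The idea of an estimate being "almost certainly within a specified limit" can be explored with a simulation. This is a sketch under loud assumptions: the can weights are simulated (not real data), the spread of 0.1 pounds and the bound of 0.05 pounds are arbitrary illustrative choices, and only the shipment size (2,440) and sample size (50) come from the example.

```python
import random
from statistics import mean

rng = random.Random(1)  # seeded only so that the sketch is reproducible

# Hypothetical shipment: 2,440 simulated can weights around 10 pounds.
population = [rng.gauss(10.0, 0.1) for _ in range(2440)]
population_mean = mean(population)


def fraction_within(bound, sample_size=50, repeats=200):
    """Share of repeated samples whose mean lies within `bound` of the population mean."""
    hits = 0
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)  # sampling without replacement
        estimation_error = abs(mean(sample) - population_mean)
        if estimation_error <= bound:
            hits += 1
    return hits / repeats


for bound in (0.001, 0.01, 0.05):
    print(f"bound = {bound} lb: {fraction_within(bound):.0%} of samples within the bound")
```

A wider bound is satisfied by a larger share of samples, which is the trade-off behind reporting an estimate together with its bound on the estimation error.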
Population Vs Sample
• Population: The collection of all individuals, items or data under consideration in a statistical study
• Sample: The part of the population from which information is collected
• What about the accuracy of an inference?
• Cost vs accuracy?
