Engineering Data Analysis
DESCRIPTIVE STATISTICS
DESCRIPTIVE AND INDUCTIVE
STATISTICS
◦Statistics
is a science that deals with the methods of collecting,
organizing, summarizing and interpreting data in
order to draw valid conclusions (or inferences)
from them. These methods are categorized as
belonging to two major areas called descriptive
statistics and inductive or inferential statistics.
◦Descriptive statistics
is concerned with the collection and
presentation of data and the description of
some of their features to yield meaningful
information without attempting to draw
any inferences from them.
◦Inductive or inferential statistics
is concerned with the development and use
of mathematical tools to go beyond data
presentation and make forecasts and
inferences. The concept of probability is basic
to the development and understanding of
inductive statistics.
POPULATION, SAMPLE AND
VARIABLES
One of the goals of statistical investigation is to acquire
information or draw some conclusions about a large group of items
on the basis of a few. When it is impossible or impractical to
observe the entire set of observations, we must, therefore, depend
on a subset of the observations.
A population consists of all the individuals or objects in a group
under study. A subcollection of items drawn from a population
under study is called a sample. The characteristic that is being
studied is a variable. A variable may be qualitative or quantitative.
STATISTICS
DESCRIPTIVE
1. Organizing and summarizing data using numbers and graphs
2. Data summary: bar graphs, histograms, pie charts, etc.; shape of graph and skewness
3. Measures of central tendency: mean, median, and mode
4. Measures of variability: range, variance, and standard deviation
INFERENTIAL
1. Using sample data to make an inference or draw a conclusion about the population
2. Uses probability to determine how confident we can be that the conclusions we make are correct (confidence intervals and margins of error)
QUALITATIVE VARIABLES
• Categorical or nonnumeric
-gender
-eye color
-religious affiliation
-political affiliation
-major
QUANTITATIVE VARIABLES
• How much or how many
-company revenues
-age
-salary
-IQ
Discrete variables –
can only assume certain values and gaps exist
between those values
Continuous variables –
can assume any value within a certain range
-profits
-square footage
COLLECTION OF DATA
Statistics deals also with the development of techniques for collecting
data. Data should be properly collected so that an investigator may be able
to answer the question under consideration with a reasonable degree of
confidence.
The simplest method for ensuring a representative selection of samples
is to take a simple random sample. In this method of sampling, any
particular subset of the specified size has the same chance of being
selected. For example, suppose a sample of size 100 is to be taken from 100,000
items produced. Each item's serial number could be noted on identical
small sheets of paper, placed in a box and jumbled thoroughly, and 100 sheets then drawn.
◦Another method is stratified sampling. This involves
dividing the population into non-overlapping groups and
taking a sample from each group. For instance, the manufacturer of a
light bulb wishes to investigate the lifetime of their
bulbs. If 25-watt, 60-watt, and 100-watt bulbs were
produced, a separate sample could be selected from
each of the three bulb sizes. This would result in
information on all three bulb sizes.
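The two sampling methods above can be sketched in Python. This is only an illustration: the function names, the seed, and the bulb identifiers are hypothetical, and the standard `random` module stands in for the "slips of paper in a box" procedure.

```python
import random

def simple_random_sample(items, size, seed=None):
    """Draw `size` distinct items; every subset of that size is equally likely."""
    rng = random.Random(seed)
    return rng.sample(items, size)

def stratified_sample(strata, size_per_stratum, seed=None):
    """Draw a separate simple random sample from each non-overlapping group."""
    rng = random.Random(seed)
    return {name: rng.sample(units, size_per_stratum)
            for name, units in strata.items()}

# 100 serial numbers out of 100,000 items produced
serials = list(range(1, 100_001))
srs = simple_random_sample(serials, 100, seed=1)

# Hypothetical bulb populations, one stratum per wattage
bulbs = {
    "25-watt": [f"25W-{i}" for i in range(500)],
    "60-watt": [f"60W-{i}" for i in range(500)],
    "100-watt": [f"100W-{i}" for i in range(500)],
}
strat = stratified_sample(bulbs, 10, seed=1)
```

Fixing the seed only makes the illustration repeatable; an actual study would not fix it.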
TABULAR AND GRAPHICAL
METHODS IN DESCRIPTIVE
STATISTICS
◦ Descriptive statistics can be divided into two subject areas.
The first area consists of data presentation using visual
techniques in the form of tables and graphs. The other area
consists of numerical summary measures for data sets.
◦ There may be many visual techniques which are familiar to
you, however, we focus our discussion on a selected few
techniques of data presentation that are most useful and
relevant to probability and inferential statistics.
FREQUENCY DISTRIBUTIONS
The organization of data in tabular form yields frequency
distributions. Data in frequency distributions may be grouped or
ungrouped.
Raw data are collected data that have not been organized
numerically. An arrangement of raw data in ascending or descending
order of magnitude is an array. In an array, any value may appear
several times. The number of times a value appears in the listing is its
frequency. The relative frequency of any observation is obtained by
dividing the actual frequency of the observation by the total
frequency.
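As a small illustration of these definitions, the sketch below builds an array, the frequencies, and the relative frequencies for a made-up set of raw data (the values are hypothetical and not taken from any example in these notes).

```python
from collections import Counter

raw_data = [3, 1, 2, 3, 3, 2, 1, 3]   # hypothetical raw data

array = sorted(raw_data)               # raw data arranged in ascending order
frequency = Counter(array)             # number of times each value appears
total = sum(frequency.values())        # total frequency
relative_frequency = {value: count / total
                      for value, count in frequency.items()}
```

The relative frequencies always add up to 1, which is a quick check on a hand-built table.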
UNGROUPED DATA
When the data is small (n ≤ 30) or when there are few distinct values, the data may be organized without
grouping.
EXAMPLE 1.1
A certain machine is to dispense 1.5 kilos of sodium nitrate. To determine whether it is properly adjusted
to dispense 1.5 kilos, the quality control engineer weighed 30 bags of sodium nitrate, nominally 1.5 kilos each, after the
machine was adjusted. The data given below refer to the net weight (in kilos) of each bag.
146 126 119 119 105 132 126 118 100 113
80-89 II 2 0.04
∑ 50 1.00
24.5 23.6 24.1 25.0 22.9 24.7 23.8 25.2 23.7 24.4
24.7 23.9 25.1 24.6 23.3 24.3 24.6 23.9 24.1 24.4
24.5 25.7 23.6 24.0 23.9 24.2 24.7 24.9 25.0 24.8
24.5 23.4 24.9 24.8 24.7 24.1 22.8 23.1 25.3 24.6
The lowest value is 22.8; therefore, 22.5 may be the
lower limit of the 1st class. 22.5 + 0.5 = 23.0 is the
lower limit of the 2nd class.
Gasoline Consumption   Tally                 No. of cars       Relative
(miles/gallon)                               (frequency, f)    frequency
22.5 – 22.9            II                     2                0.050
23.0 – 23.4            III                    3                0.075
23.5 – 23.9            IIII – II              7                0.175
24.0 – 24.4            IIII – III             8                0.200
24.5 – 24.9            IIII – IIII – IIII    14                0.350
25.0 – 25.4            IIII                   5                0.125
25.5 – 25.9            I                      1                0.025
∑                                            40                1.000
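The tally above can be reproduced with a short Python sketch. It assumes the 40 gasoline-consumption readings listed before the table, a first lower class limit of 22.5, and a class size of 0.5. Working in tenths of a mile per gallon is a choice made here to avoid floating-point trouble at the class limits.

```python
mileage = [
    24.5, 23.6, 24.1, 25.0, 22.9, 24.7, 23.8, 25.2, 23.7, 24.4,
    24.7, 23.9, 25.1, 24.6, 23.3, 24.3, 24.6, 23.9, 24.1, 24.4,
    24.5, 25.7, 23.6, 24.0, 23.9, 24.2, 24.7, 24.9, 25.0, 24.8,
    24.5, 23.4, 24.9, 24.8, 24.7, 24.1, 22.8, 23.1, 25.3, 24.6,
]

FIRST_LOWER_LIMIT = 225   # 22.5 mi/gal expressed in tenths
CLASS_SIZE = 5            # 0.5 mi/gal expressed in tenths
NUM_CLASSES = 7

counts = [0] * NUM_CLASSES
for x in mileage:
    tenths = round(x * 10)                             # e.g. 24.7 -> 247
    counts[(tenths - FIRST_LOWER_LIMIT) // CLASS_SIZE] += 1

relative = [c / len(mileage) for c in counts]

for k, (f, rf) in enumerate(zip(counts, relative)):
    lower = (FIRST_LOWER_LIMIT + k * CLASS_SIZE) / 10
    upper = lower + 0.4
    print(f"{lower:.1f} - {upper:.1f}  {f:2d}  {rf:.3f}")
```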
TABLE 1.4 CLASS LIMITS, CLASS BOUNDARIES AND CLASS MARKS FOR FREQUENCY DISTRIBUTION
PRESENTED IN TABLE 1.2
Classes Class Boundaries Class Marks
Table 1.5 class limits, class boundaries and class marks of frequency distribution presented in Table 1.3
NUMERICAL SUMMARY
MEASURES
\[ \sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n \]
MEASURES OF CENTRAL TENDENCY:
MEAN, MEDIAN AND MODE
MEAN
The arithmetic mean, or simply the mean, is the overall average.
If the data represent the entire population, the mean of the
values is referred to as the population mean, μ. This mean is a
quantitative measure describing a characteristic of a population
and therefore it is a parameter. If the data constitute a sample
drawn from a population, the mean is referred to as the sample
mean, $\bar{x}$, which is a statistic.
If there are n observations with numerical values x1, x2, …, xn, then the
sample mean is given by
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]
Fundamentals of Probability and Statistics of Engineering
The population mean, μ, is given by
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]
Example 1.5
The following data represent the time, in seconds, for 9 glued
samples to dry and attain their bond strength: 3.6, 2.5, 3.1, 4.3,
2.3, 2.9, 2.6, 4.1 and 3.4. Calculate the mean.
SOLUTION:
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{28.8}{9} = 3.2 \text{ seconds} \]
Example 1.6
Find the mean weight of sodium nitrate in Example 1.1.
SOLUTION:
Weight (kilos), x_i    No. of bags (frequency, f_i)    f_i x_i
1.46                    6                               8.76
1.48                    4                               5.92
1.49                    5                               7.45
1.50                    6                               9.00
1.52                    9                              13.68
∑                      30                              44.81
\[ \bar{x} = \frac{\sum f_i x_i}{n} = \frac{44.81}{30} = 1.49 \text{ kilos} \]
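The frequency-weighted mean used in this solution can be checked with a short Python sketch. The function name `weighted_mean` is illustrative only; the weights and frequencies are the ones tabulated above.

```python
def weighted_mean(values, freqs):
    """Mean of ungrouped data given as distinct values with frequencies."""
    n = sum(freqs)                                     # total number of observations
    return sum(f * x for x, f in zip(values, freqs)) / n

weights = [1.46, 1.48, 1.49, 1.50, 1.52]   # kilos
bags = [6, 4, 5, 6, 9]                     # frequency of each weight

mean_weight = weighted_mean(weights, bags)
print(round(mean_weight, 2))               # rounded to two decimals, as in the solution
```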
Example 1.7
◦ Find the mean of gasoline consumption in Example 1.3.
Solution:
Classes        f     Class mark, x    f x
22.5 – 22.9     2    22.7              45.4
23.0 – 23.4     3    23.2              69.6
23.5 – 23.9     7    23.7             165.9
24.0 – 24.4     8    24.2             193.6
24.5 – 24.9    14    24.7             345.8
25.0 – 25.4     5    25.2             126.0
25.5 – 25.9     1    25.7              25.7
∑              40                     972.0
\[ \bar{x} = \frac{\sum f_i x_i}{n} = \frac{972}{40} = 24.3 \text{ mi/gal} \]
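For grouped data, each class is represented by its class mark (the midpoint of the class limits). The sketch below recomputes the class marks and the totals of this example; the variable names are illustrative only.

```python
classes = [(22.5, 22.9), (23.0, 23.4), (23.5, 23.9), (24.0, 24.4),
           (24.5, 24.9), (25.0, 25.4), (25.5, 25.9)]
freqs = [2, 3, 7, 8, 14, 5, 1]

# Class mark = midpoint of the lower and upper class limits
marks = [(lower + upper) / 2 for lower, upper in classes]

total_fx = sum(f * x for f, x in zip(freqs, marks))   # sum of f * x
n = sum(freqs)                                        # total frequency
grouped_mean = total_fx / n
```

Because every observation in a class is replaced by the class mark, the grouped mean is an approximation of the mean of the raw readings.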
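MEDIAN
With the n observations arranged in order of magnitude, the sample median is
\[ \tilde{x} = x_{(n+1)/2} \quad \text{if } n \text{ is odd} \]
\[ \tilde{x} = \frac{x_{n/2} + x_{n/2+1}}{2} \quad \text{if } n \text{ is even} \]
The sample median $\tilde{x}$ is used to estimate the population median $\tilde{\mu}$.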
Example 1.8
For the set of numbers 1, 3, 3, 5, 6, 8, 9, 9, 10
\[ \tilde{x} = x_5 = 6 \]
Example 1.9
For the set of numbers 4, 4, 7, 9, 11, 12, 15, 18
\[ \tilde{x} = \frac{x_4 + x_5}{2} = \frac{9 + 11}{2} = 10 \]
Example 1.10
Find the median of the data in Example 1.5.
Solution:
Arrange the data in ascending magnitude: 2.3, 2.5, 2.6, 2.9, 3.1, 3.4, 3.6, 4.1, 4.3
\[ \tilde{x} = x_5 = 3.1 \text{ seconds} \]
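The odd/even rule for the median can be sketched as a small Python function. The function name is illustrative. Note that Python lists are indexed from 0, so the middle value x_{(n+1)/2} sits at index n // 2.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # x_{(n+1)/2} in 1-based notation
    return (ordered[mid - 1] + ordered[mid]) / 2   # (x_{n/2} + x_{n/2+1}) / 2

example_1_8 = median([1, 3, 3, 5, 6, 8, 9, 9, 10])
example_1_9 = median([4, 4, 7, 9, 11, 12, 15, 18])
example_1_10 = median([2.3, 2.5, 2.6, 2.9, 3.1, 3.4, 3.6, 4.1, 4.3])
```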
MEDIAN OF GROUPED DATA
\[ \tilde{x} = L_m + \left( \frac{\frac{n}{2} - (\sum f)_L}{f_m} \right) C \]
Where:
L_m = lower class boundary of the median class
n = total frequency
(∑f)_L = sum of frequencies of all classes lower than the median class
f_m = frequency of the median class
C = size of the median class
Example 1.11
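Determine the median of the gasoline consumption data in Example 1.3. Refer to Table 1.8.
Solution:
Here n = 40, so n/2 = 20. The median class is 24.0 – 24.4, with lower class boundary 23.95, frequency 8, and 2 + 3 + 7 = 12 observations in the classes below it.
\[ \tilde{x} = 23.95 + \frac{(20 - 12)(0.5)}{8} = 24.45 \]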
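The grouped-median formula can be sketched in Python as follows. The function name and argument layout are assumptions made for this illustration. The lower class boundaries are taken as the lower class limits of the gasoline table minus 0.05, which matches the 23.95 used in the solution.

```python
def grouped_median(lower_boundaries, freqs, class_size):
    """Median of a frequency distribution by linear interpolation in the median class."""
    half = sum(freqs) / 2          # n / 2
    below = 0                      # sum of frequencies of classes below the current one
    for boundary, f in zip(lower_boundaries, freqs):
        if below + f >= half:      # first class whose cumulative frequency reaches n / 2
            return boundary + (half - below) / f * class_size
        below += f
    raise ValueError("empty frequency distribution")

boundaries = [22.45, 22.95, 23.45, 23.95, 24.45, 24.95, 25.45]
freqs = [2, 3, 7, 8, 14, 5, 1]

median_mpg = grouped_median(boundaries, freqs, 0.5)
```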
MODE
Mode is the value which occurs with the greatest frequency. The sample mode
is designated by $\hat{x}$ and the population mode by $\hat{\mu}$.
Example 1.12
For the set of numbers 3, 3, 5, 7, 9, 10, 11, 10, 11, 12, 9, 18, 9
\[ \hat{x} = 9 \text{ (unimodal)} \]
Example 1.13
The set of numbers 6, 7, 9, 10, 12 has no mode.
Example 1.14
For the set of values 2.2, 3.1, 4.1, 4.1, 5.4, 5.4, 5.4, 6.2, 7.7, 7.7, 8.5, 8.5, 8.5, 9.3
\[ \hat{x} = 5.4 \text{ and } 8.5 \text{ (bimodal)} \]
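Finding the mode amounts to counting how often each value occurs and keeping the value or values with the highest count. The sketch below follows the convention of Example 1.13 that a set in which no value repeats has no mode. How to treat a set in which every value repeats equally often varies between textbooks, so that case is an assumption here. The last data set is hypothetical.

```python
from collections import Counter

def modes(values):
    """Return the list of modes (empty if no value occurs more than once)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                          # no value repeats: no mode
    return sorted(v for v, c in counts.items() if c == highest)

unimodal = modes([3, 3, 5, 7, 9, 10, 11, 10, 11, 12, 9, 18, 9])   # Example 1.12
no_mode = modes([6, 7, 9, 10, 12])                                # Example 1.13
bimodal = modes([1, 1, 2, 4, 4])                                  # hypothetical set
```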
MODE OF GROUPED DATA
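\[ \hat{x} = L_{mo} + \left( \frac{d_1}{d_1 + d_2} \right) c \]
where:
L_mo = lower class boundary of the modal class
d1 = difference between the frequency of the modal class and the frequency of the next lower class
d2 = difference between the frequency of the modal class and the frequency of the next higher class
c = size of the modal class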
The population variance σ² is given by
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]
The sample variance s² is a statistic. A statistic that estimates the true parameter on the
average is said to be unbiased. Dividing by n will underestimate the population variance on the
average. To compensate for the bias in estimating σ², we use n – 1 in the divisor. The number n
– 1 is called the degrees of freedom.
Thus, if there are n numerical observations x1, x2, …, xn in a sample, the deviation
of each individual observation from the mean is $x_i - \bar{x}$, and the sample variance is
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \]
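Shortcut Formula in Finding the Sample Variance
Since $\sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{1}{n} \left( \sum x_i \right)^2$, therefore:
\[ s^2 = \frac{\sum x_i^2 - \frac{1}{n} \left( \sum x_i \right)^2}{n-1} \]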
STANDARD DEVIATION
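Standard deviation is the positive square root of the variance.
The population standard deviation is σ = √σ² and the sample standard
deviation is s = √s², where
\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} \quad \text{or} \quad s^2 = \frac{\sum x_i^2 - \frac{1}{n} \left( \sum x_i \right)^2}{n-1} \]
For grouped data with class marks x_i and class frequencies f_i,
\[ s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n-1} \]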
EXAMPLE 1.17
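The following readings are the tensile strengths, in kg/cm², obtained from six
specimens of carbon steel.
2.46 2.65 2.40 2.44 2.41 2.58
a) What is the mean tensile strength and the standard deviation of the tensile
strengths?
b) Suppose each reading is expressed in kg/m², what is the standard deviation?
SOLUTION:
a) The mean is $\bar{x} = 14.94 / 6 = 2.49$ kg/cm².
x_i      x_i²       (x_i − x̄)²
2.46     6.0516     0.0009
2.65     7.0225     0.0256
2.40     5.7600     0.0081
2.44     5.9536     0.0025
2.41     5.8081     0.0064
2.58     6.6564     0.0081
∑ 14.94  37.2522    0.0516
\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{0.0516}{5} = 0.01 \]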
Using the shortcut formula:
\[ s^2 = \frac{1}{5} \left[ 37.2522 - \frac{(14.94)^2}{6} \right] = \frac{1}{5}(0.0516) = 0.01 \]
\[ s = 0.1 \text{ kg/cm}^2 \]
b.) If the readings are expressed in kg/m², each reading will be multiplied by 10⁴ since 1 kg/cm² = 10⁴ kg/m².
Therefore, the standard deviation of the new set of data is
\[ s = (10^4)(0.1) = 1000 \text{ kg/m}^2 \]
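Both variance formulas, and the effect of the change of units, can be checked with the sketch below (variable names are illustrative). Without intermediate rounding, s² is about 0.01032 and s about 0.1016 kg/cm², so the 0.1 and 1000 in the worked solution are rounded values.

```python
import math

readings = [2.46, 2.65, 2.40, 2.44, 2.41, 2.58]   # kg/cm^2
n = len(readings)
mean = sum(readings) / n

# Definition: sum of squared deviations from the mean over n - 1
var_definition = sum((x - mean) ** 2 for x in readings) / (n - 1)

# Shortcut: (sum of squares - (sum)^2 / n) over n - 1
var_shortcut = (sum(x * x for x in readings) - sum(readings) ** 2 / n) / (n - 1)

std_cm2 = math.sqrt(var_definition)

# Part (b): multiplying every reading by 10^4 multiplies s by 10^4
scaled = [x * 1e4 for x in readings]               # kg/m^2
scaled_mean = sum(scaled) / n
std_m2 = math.sqrt(sum((x - scaled_mean) ** 2 for x in scaled) / (n - 1))
```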
Example 1.18
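◦ Determine the standard deviation of gasoline consumption in Example 1.3.
Solution:
With $\bar{x} = 972 / 40 = 24.3$ mi/gal from the totals of Example 1.7:
Classes        f     x       (x − x̄)²    f (x − x̄)²
22.5 – 22.9     2    22.7    2.56         5.12
23.0 – 23.4     3    23.2    1.21         3.63
23.5 – 23.9     7    23.7    0.36         2.52
24.0 – 24.4     8    24.2    0.01         0.08
24.5 – 24.9    14    24.7    0.16         2.24
25.0 – 25.4     5    25.2    0.81         4.05
25.5 – 25.9     1    25.7    1.96         1.96
∑              40                        19.60
\[ s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n-1} = \frac{19.60}{39} = 0.503 \]
\[ s = 0.71 \text{ mi/gal} \]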
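The grouped-data computation can be verified with a short sketch that uses the class marks and frequencies of the gasoline table (variable names are illustrative).

```python
import math

marks = [22.7, 23.2, 23.7, 24.2, 24.7, 25.2, 25.7]   # class marks, mi/gal
freqs = [2, 3, 7, 8, 14, 5, 1]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, marks)) / n

# Sum of f * (x - mean)^2 over all classes
sum_sq_dev = sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks))

variance = sum_sq_dev / (n - 1)
std_dev = math.sqrt(variance)
```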
Chapter 2
PROBABILITY refers to the study of randomness and uncertainty of
an outcome. The theory of probability provides methods that will
permit us to quantify the chances, or likelihood, associated with
various outcomes of an event.
2. The number of permutations of n distinct objects taken r at a time is
P = nPr = n!/(n-r)!
3. The number of permutations of n objects of which n1 are identical,
n2 are identical, …, nm are identical is
P = n!/(n1! n2! … nm!)
4. The number of permutations of n distinct objects arranged in a circle is
P = (n-1)!
COMBINATION
A combination is a selection of r objects from n without regard to order.
The number of combinations of n objects taken r at a time is
nCr = n!/[r!(n-r)!]
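The counting formulas above map onto Python's standard `math` module (`math.perm` and `math.comb` require Python 3.8 or later). The helper names `multiset_permutations` and `circular_permutations` are illustrative only, not library functions.

```python
import math

def multiset_permutations(group_sizes):
    """n! / (n1! n2! ... nm!) where n is the sum of the group sizes."""
    result = math.factorial(sum(group_sizes))
    for size in group_sizes:
        result //= math.factorial(size)
    return result

def circular_permutations(n):
    """(n - 1)! arrangements of n distinct objects in a circle."""
    return math.factorial(n - 1)

all_four_digits = math.perm(4, 4)                          # 4! arrangements of 4 digits
millennium = multiset_permutations([2, 2, 2, 2, 1, 1])     # M, I, L, N twice; E, U once
four_in_circle = circular_permutations(4)
samples_of_three = math.comb(9, 3)                         # 3 items chosen from 9
```

The same calls cover the worked examples that follow: arranging four digits, the letters of MILLENNIUM, four letters in a circle, and samples of size 3 from 9 items.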
EXAMPLE 2.1
How many numbers can be formed using all the digits 6, 7, 8, and 9?
Solution: To form different numbers, arrange all 4 digits; the
number of arrangements is the number of numbers formed.
P = 4! = 24 numbers
Example 2.2
How many distinct permutations are there of the letters in the word MILLENNIUM?
Solution: Among the 10 letters there are 2 M's, 2 I's, 2 L's and 2 N's.
P = 10!/(2!2!2!2!) = 226,800
Example 2.3
In how many ways can the 4 letters a, b, c and d be arranged in a circle?
Solution: P = (4-1)! = 3! = 6 ways
Example 2.4
From a box containing 4 defective and 5 non-defective items, how many
samples of size 3 are possible
A. with no restrictions?
B. with 1 defective and 2 non-defective items?
C. with 2 defective and 1 non-defective item, if a certain defective item must
be in the sample chosen?
Solution:
a) n = number of ways of selecting 3 from 9
\[ n = {}_9C_3 = \frac{9!}{3!\,(9-3)!} = \frac{9!}{3!\,6!} = 84 \text{ samples} \]
b) n = number of samples with 1 defective and 2 non-defective items
\[ n = {}_4C_1 \cdot {}_5C_2 = \frac{4!}{1!\,3!} \cdot \frac{5!}{2!\,3!} = 4(10) = 40 \text{ samples} \]
c) n = number of samples with 2 defective and 1 non-defective item, with a certain defective item in the sample
chosen
n1 = number of ways of selecting 1 defective from among the 3 remaining defective items
n2 = number of ways of selecting 1 non-defective from among the 5 non-defective items
\[ n = {}_3C_1 \cdot {}_5C_1 = \frac{3!}{1!\,2!} \cdot \frac{5!}{1!\,4!} = 3(5) = 15 \text{ samples} \]
Example 2.5
A jar contains 3 red marbles, 7 green marbles and 10 white marbles. If a
marble is drawn from the jar at random, what is the probability that this
marble is white?
A box of 10 fuses has two defective fuses. In how many ways can
one select three of these fuses and get
a.) neither of the defective fuses
b.) one defective fuse
c.) both of the defective fuses
PROBABILITY OF AN EVENT
The objective of probability is to assign to each event A a number P(A), called the
probability of the event A, which will give a precise measure of the chance that A will
happen.
When all outcomes are equally likely, the probability of an event A is the ratio of the number of outcomes favorable to A to the
total number of outcomes. If NA is the number of outcomes favorable to event A and N is the total
number of outcomes in the sample space, then
P(A) = NA/N
Properties of Probability
1. Positiveness 0 ≤ P(A) ≤ 1
2. Certainty P(S) = 1, the probability of a sure event
Example 1.1
In the experiment of examining 3 bulbs, find
the probability of the following events:
a.) exactly 2 bulbs are defective
b.) at least 2 bulbs are defective
SOLUTION:
The sample space for this experiment is S ={DDD, DDN, DND,
DNN, NDD, NDN, NND, NNN}
N=8
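The rest of this solution can be sketched by listing the sample space and counting favorable outcomes. The sketch assumes, as the definition P(A) = NA/N does, that the 8 outcomes are equally likely. The helper name `probability` is illustrative.

```python
from fractions import Fraction
from itertools import product

# D = defective, N = non-defective; one letter per bulb examined
sample_space = ["".join(outcome) for outcome in product("DN", repeat=3)]

def probability(event):
    """P(A) = N_A / N for equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

p_exactly_two = probability(lambda outcome: outcome.count("D") == 2)
p_at_least_two = probability(lambda outcome: outcome.count("D") >= 2)
```

Under that assumption, exactly 2 defective bulbs corresponds to DDN, DND and NDD, and at least 2 adds DDD.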
EXAMPLE 1.2
In a card game, if a hand is holding 5 cards,
find the probability that there will be:
a. 3 aces
b. 4 hearts and 1 diamond
SOLUTION:
a.) A = the event of having 3 aces and 2 cards of any kind
other than aces.
4C3 = number of ways of having 3 aces
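One way to carry this counting through is sketched below. It assumes a standard 52-card deck, reads "3 aces" as exactly 3 aces (with the other 2 cards drawn from the 48 non-aces), and treats all 5-card hands as equally likely. Variable names are illustrative.

```python
import math

total_hands = math.comb(52, 5)                     # all possible 5-card hands

# a) 3 of the 4 aces, and 2 of the 48 cards that are not aces
three_aces_hands = math.comb(4, 3) * math.comb(48, 2)
p_three_aces = three_aces_hands / total_hands

# b) 4 of the 13 hearts and 1 of the 13 diamonds
hearts_diamond_hands = math.comb(13, 4) * math.comb(13, 1)
p_hearts_diamond = hearts_diamond_hands / total_hands
```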
2. A box contains three 5-μF capacitors, four 10-μF capacitors and one 30-μF
capacitor. If three capacitors are picked at random, find the
probability that there are
a. three 5-μF capacitors
b. one 5-μF capacitor, one 10-μF capacitor and one 30-μF capacitor.
ADDITIVE RULES
1. If A and B are any 2 events, then
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
2. If A and B are mutually exclusive events, then
P(A ∪ B) = P(A) + P(B)
3. If A and A’ are complementary events, then
P(A) + P(A’) = 1
Also, P(A ∪ B) – P(A) = P(B) – P(A ∩ B) = P(A’ ∩ B)
MUTUALLY EXCLUSIVE EVENTS – Events that cannot happen at
the same time, e.g., the faces of a die on a single roll, or the sets of an exam
●INDEPENDENT EVENTS
Two events A and B are independent if and only if
P(A/B) = P(A) and P(B/A) = P(B)
Since P(A ∩ B) = P(A) x P(B/A) in general, it follows for independent events that
P(A ∩ B) = P(A) x P(B)
EXAMPLE 1.4
A petroleum company exploring for oil has decided to drill 2 wells, one after
the other. The probability of striking oil in the first well is 0.25. Given that they
strike oil in the first attempt, the probability of striking oil in the second
attempt is 0.85. What is the probability of striking oil in both wells?
SOLUTION:
Let W1 = the event of striking oil in the 1st well
W2 = the event of striking oil in the 2nd well
P(striking oil in both wells) = P(W1 and W2)
P(W1 and W2) = P(W1) x P(W2/W1) = 0.25(0.85)
P(striking oil in both wells) = 0.2125
EXAMPLE 1.5
Sarah is deciding which courses she wants to take in her next college
semester. The probability that she enrolls in an Algebra course is 0.30 and the
probability that she enrolls in a Biology course is 0.70. The probability that
she will enroll in an Algebra course GIVEN that she enrolls in a Biology course
is 0.40. a) What is the probability that she will enroll in both an Algebra
course AND a Biology course? b) What is the probability that she will enroll
in either an Algebra course OR a Biology course?
SOLUTION:
Let A = the event that she enrolls in Algebra
B = the event that she enrolls in Biology
(A/B) = the event that she enrolls in an Algebra course given that she enrolls
in a Biology course
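With these events defined, the two questions can be worked out from the multiplication rule P(A ∩ B) = P(B) x P(A/B) and the additive rule P(A ∪ B) = P(A) + P(B) – P(A ∩ B). The sketch below is one way to carry out the arithmetic; variable names are illustrative.

```python
p_algebra = 0.30                 # P(A)
p_biology = 0.70                 # P(B)
p_algebra_given_biology = 0.40   # P(A/B)

# a) Both courses: multiplication rule, conditioning on Biology
p_both = p_biology * p_algebra_given_biology

# b) Either course: additive rule for any two events
p_either = p_algebra + p_biology - p_both

print(f"P(A and B) = {p_both:.2f}")
print(f"P(A or B)  = {p_either:.2f}")
```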
CHAPTER 3
PROBABILITY DISTRIBUTION
RANDOM VARIABLES
A random variable is a function that assigns numerical values to the outcomes of a
sample space.
We shall use a capital letter to denote a random variable and the corresponding small
letter for one of its values.
CLASSIFICATION OF RANDOM VARIABLES
Random variables are classified as discrete or continuous, depending upon their range
of values.
1. Discrete random variable is one whose set of possible values is finite or countably
infinite.
2. Continuous random variable is one that can assume values on a continuous scale.
PROBABILITY DISTRIBUTION OF DISCRETE
RANDOM VARIABLES
A probability distribution is a formula or a table listing all possible
values that a random variable can take on, together with their probabilities. This is the theoretical counterpart of
a frequency distribution.
The probability distribution or probability mass function (pmf) of a
discrete random variable X is defined for every number x by
f(x) = P(X=x)
Properties of a discrete probability function:
1. f(x) ≥ 0
2. ∑ f(x) = 1
EXAMPLE 3.1
Find the probability distribution for the number of heads that appear
when a fair coin is tossed 3 times.
SOLUTION:
Let X = number of heads that appear
x = 0, 1, 2, 3
The sample space has 2³ = 8 equally likely outcomes, and 3Cx of them contain exactly x heads, so
f(x) = P(X=x) = 3Cx/8
f(0) = P(X=0) = 1/8
f(1) = P(X=1) = 3/8
f(2) = P(X=2) = 3/8
f(3) = P(X=3) = 1/8
x    f(x)
0    1/8
1    3/8
2    3/8
3    1/8
∑    1
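The distribution of the number of heads can also be obtained by enumerating the tosses directly. This sketch assumes a fair coin, so that the 8 outcomes are equally likely; variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))                     # HHH, HHT, ..., TTT
heads_count = Counter(outcome.count("H") for outcome in outcomes)

# f(x) = (number of outcomes with x heads) / (total number of outcomes)
pmf = {x: Fraction(count, len(outcomes))
       for x, count in sorted(heads_count.items())}

for x, fx in pmf.items():
    print(x, fx)
```

Both properties of a discrete probability function can be read off the result: every f(x) is non-negative and the values add up to 1.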
CHAPTER 4
SOME DISCRETE PROBABILITY DISTRIBUTION
BINOMIAL DISTRIBUTION
If an experiment consists of n repeated trials, each trial has two
possible outcomes which may be labeled as success or failure, the
repeated trials are independent, and the probability of a success
remains constant from trial to trial, then the experiment is called a
binomial experiment.
The number of successes in n independent trials is called a binomial
random variable. The probability distribution of this discrete random
variable is called the binomial distribution.
If a binomial trial can result in a success with probability p and
a failure with probability q = 1-p, then the probability distribution of the
binomial random variable X, the number of successes in n independent
trials, is
\[ f(x) = {}_nC_x \, p^x q^{n-x} \]
“Bi” means there are two possible outcomes (success, failure)
n = no. of trials ; p = prob. of success
x = any no. of successes ; q = prob. of failure. Note: ( q = 1-p )
Example no.1
1. A coin is tossed 5 times. Find the probability of
a. getting exactly 3 heads, b. getting at most 3 heads
Note: There's a long way (see the similar example problem under Probability of an Event) and a
short way of solving this problem. Let's start with the short way!
a.) Notice that this is a coin, so there are two possible outcomes (a tail or a head).
\[ f(x) = {}_nC_x \, p^x q^{n-x} \]
GIVEN:
n = 5 , p = 1/2. Why? Because there is one success (H) out of two outcomes (head or tail).
x = 3 , q = 1/2. Note: ( q = 1-p )
q = 1-1/2 = 1/2
Solution!
\[ f(3) = {}_5C_3 (0.5)^3 (0.5)^{5-3} = 0.3125 \]
0.3125 x 100 = 31.25% is the probability of getting exactly 3 heads!
b.) At most 3 heads (means 3 heads or fewer)
\[ P(X \le 3) = f(0) + f(1) + f(2) + f(3) \]
\[ = {}_5C_0 (0.5)^0 (0.5)^{5-0} + {}_5C_1 (0.5)^1 (0.5)^{5-1} + {}_5C_2 (0.5)^2 (0.5)^{5-2} + {}_5C_3 (0.5)^3 (0.5)^{5-3} \]
= 0.03125 + 0.15625 + 0.3125 + 0.3125
= 0.8125 x 100 = 81.25% is the probability of getting at most 3 heads!
Example no.2
2. A six-sided die is rolled 12 times. What is the probability of getting a 4 five times?
\[ f(x) = {}_nC_x \, p^x q^{n-x} \]
GIVEN:
n = 12 , p = 1/6. Why? Because out of the six numbers only one is a successful
outcome, which is 4.
x = 5 , q = 5/6. Note: ( q = 1-p )
q = 1-1/6 = 5/6
Solution!
\[ f(5) = {}_{12}C_5 (1/6)^5 (5/6)^{12-5} = 0.0284 \]
2.84% is the probability of getting a 4 five times if a six-sided die is rolled 12 times!
Example no.3
3. A multiple choice test contains 20 questions with answer choices A, B, C and D. Only one
answer choice to each question represents a correct answer. Find the probability that a student
will answer exactly 6 questions correctly if he makes random guesses on all 20 questions.
\[ f(x) = {}_nC_x \, p^x q^{n-x} \]
GIVEN: n = 20 questions , p = ¼ or 0.25. Why? Because among choices A, B, C and D
there is only one correct answer.
x = 6 (successes in 20 questions) , q = 0.75. Note: ( q = 1-p )
q = 1-0.25 = 0.75
Solution!
\[ f(6) = {}_{20}C_6 (0.25)^6 (0.75)^{20-6} = 0.1686 \]
16.86% is the probability of getting 6 questions correct out of the 20 questions of the multiple
choice test!
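The three examples above can be checked numerically with a small sketch of the binomial formula. The function names `binomial_pmf` and `binomial_cdf` are illustrative; `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_pmf(x, n, p):
    """f(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x): sum of f(0), f(1), ..., f(x)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

exactly_3_heads = binomial_pmf(3, 5, 0.5)       # Example no.1 a
at_most_3_heads = binomial_cdf(3, 5, 0.5)       # Example no.1 b (0, 1, 2 or 3 heads)
five_fours = binomial_pmf(5, 12, 1 / 6)         # Example no.2
six_correct = binomial_pmf(6, 20, 0.25)         # Example no.3

for label, prob in [("exactly 3 heads", exactly_3_heads),
                    ("at most 3 heads", at_most_3_heads),
                    ("a 4 five times", five_fours),
                    ("6 correct guesses", six_correct)]:
    print(f"{label}: {prob * 100:.2f}%")
```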
Note!:
Dear Engineers, please study the topics and the sample problems I gave you. On May
6 you will have your first activity with me, which is considered your midterm exam. (Please also study our past
lesson about probability because it is included in the activity.) Work hard and persevere!
God bless!
P.S. In the sample problems I explained why each value was obtained, why, why, why
did he leave you! Hahaha. Anyway, it is easy to study. It's you we're talking about, you can definitely do it!
I MISS YOU GUYS!!!!