Engineering Data Analysis Notes
If some procedure can be performed in n1 different ways, and if, following this procedure, a second procedure can be
performed in n2 different ways, and if, following this second procedure, a third procedure can be performed in n3 different ways, and so
forth, then the number of ways the procedures can be performed in the order indicated is the product (n1)(n2)(n3)…
Example:
Suppose a license plate contains two distinct letters followed by three digits with the first digit not zero. How many different
license plates can be printed?
The first letter can be printed in 26 different ways, the second letter in 25 different ways (since the letter printed first cannot be
chosen for the second letter), the first digit in 9 ways and each of the other two digits in 10 ways. Hence 26 · 25 · 9 · 10 · 10 = 58,500
different license plates can be printed.
Suppose a car number plate contains three distinct English letters followed by three non-repeated digits. How many different
car number plates can be printed?
Note that there are 26 letters in the English alphabet and 10 digits in our number system, so the first box can be
filled in 26 different ways and, since the 3 letters used are distinct, the succeeding 2 boxes can be filled in 25 and 24
different ways, respectively. The 4th box can then be filled in 10 different ways and, since the digits should not be
repeated, the succeeding boxes can be filled in 9 and 8 different ways, respectively. Hence 26 · 25 · 24 · 10 · 9 · 8 = 11,232,000
different car number plates can be printed.
FACTORIAL NOTATION
The product of the positive integers from 1 to n inclusive occurs very often in mathematics and hence is denoted by the
special symbol n! (read “n factorial”):
n! = 1 · 2 · 3 ··· (n − 2)(n − 1)n
By convention, 0! = 1.
Examples
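A few factorial values can be computed with a minimal Python sketch. The function name factorial_product is an assumption made for illustration; it follows the definition as a running product and is compared against the standard library's math.factorial.

```python
import math


def factorial_product(n):
    """Product of the positive integers from 1 to n (returns 1 when n = 0)."""
    result = 1
    for k in range(1, n + 1):
        result *= k
    return result


for n in (0, 1, 4, 6):
    print(n, factorial_product(n))  # 0 1, 1 1, 4 24, 6 720

# The hand-written product agrees with the library function.
agrees = all(factorial_product(n) == math.factorial(n) for n in range(10))
```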
PERMUTATION
An arrangement of a set of n objects in a given order is called a permutation of the objects (taken all at a time). An
arrangement of any r ≤ n of these objects in a given order is called an r-permutation or a permutation of the n objects taken r at a time.
Examples: Consider the set of letters a, b, c, and d. Then:
a. bcda, dcba, and acdb are permutations of the 4 letters (taken all at a time);
b. bad, adb, cbd, and bca are permutations of the 4 letters taken 3 at a time;
c. ad, cb, da, and bd are permutations of the 4 letters taken 2 at a time.
Example:
Find the number of permutations of 6 objects, say a, b, c, d, e, f, taken three at a time.
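One way to work this example is sketched below in Python. It assumes the standard formula for the number of permutations of n objects taken r at a time, n!/(n − r)!, and confirms the count by listing every arrangement. The variable names are illustrative.

```python
from itertools import permutations
from math import factorial, perm

objects = "abcdef"
n, r = len(objects), 3

# Formula: n! / (n - r)!
by_formula = factorial(n) // factorial(n - r)

# Brute force: list every ordered arrangement of 3 of the 6 letters.
by_listing = len(list(permutations(objects, r)))

print(by_formula, by_listing, perm(n, r))  # 120 120 120
```

The same count follows from the counting rule: 6 choices for the first position, 5 for the second, and 4 for the third.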
COMBINATION
A selection of r of a set of n objects where order does not count. As long as the elements in the selection are the
same, it counts as only one combination, no matter how those elements are arranged.
Example:
Find the number of combinations of 6 objects, say a, b, c, d, e, f, taken three at a time.
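This example can be worked with a short Python sketch, assuming the standard formula for the number of combinations of n objects taken r at a time, n!/(r!(n − r)!). The count is confirmed by listing every unordered selection. The variable names are illustrative.

```python
from itertools import combinations
from math import comb, factorial

objects = "abcdef"
n, r = len(objects), 3

# Formula: n! / (r! (n - r)!)
by_formula = factorial(n) // (factorial(r) * factorial(n - r))

# Brute force: list every unordered selection of 3 of the 6 letters.
by_listing = len(list(combinations(objects, r)))

print(by_formula, by_listing, comb(n, r))  # 20 20 20
```

Each combination of 3 letters corresponds to 3! = 6 permutations, which is why 120 permutations reduce to 20 combinations.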
More Examples:
1. If 15 people won prizes in a lottery (assuming that there are no ties), how many ways can these 15 people win first, second,
third, fourth, and fifth place?
2. How many ways are there to select 3 candidates from 8 equally qualified recent graduates for openings in an accounting firm?
3. A teacher forms a committee whose members come from her class consisting of 18 boys and 15 girls. How many committees
can be formed consisting of 5 members of which 3 members are girls and 2 members are boys?
4. A developer of a new subdivision offers a prospective home buyer a choice of 5 designs, 3 different air conditioning systems, a
garage or carport, and a patio or screened porch. How many plans are available to this buyer?
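The four exercises above can be sketched in Python as follows. The interpretation of each problem (permutation, combination, or plain product) is an assumption based on the wording: order matters for ranked prizes, does not matter for candidates or committee members, and the subdivision choices are independent steps.

```python
from math import comb, perm

# 1. Ranked places: order matters, so 15 people taken 5 at a time.
prize_orders = perm(15, 5)

# 2. Unranked selection of 3 candidates from 8.
candidate_groups = comb(8, 3)

# 3. Choose 3 of 15 girls and 2 of 18 boys, then multiply.
committees = comb(15, 3) * comb(18, 2)

# 4. Independent choices: 5 designs, 3 AC systems, 2 parking options, 2 outdoor options.
house_plans = 5 * 3 * 2 * 2

print(prize_orders)      # 360360
print(candidate_groups)  # 56
print(committees)        # 69615
print(house_plans)       # 60
```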
PROBABILITY
A measure of how certain an outcome is, that is, the likelihood that an event will happen.
SAMPLE SPACE AND EVENT
The set S of all possible outcomes of a statistical experiment is called a sample space. Each outcome in a sample space is
called an element, or simply a sample point. An event is a subset of a sample space S.
PROBABILITY
If an experiment can result in any one of N different equally likely outcomes, and if exactly n of these outcomes correspond to
event A, then the probability of event A is
P(A) = n / N
Example:
Calculate the probability of getting a Jack in one draw from a well-shuffled deck of cards.
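A minimal Python sketch of this example follows, assuming a standard 52-card deck and the equally likely outcomes rule (favorable outcomes divided by total outcomes). The helper name probability and the deck representation are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(RANKS, SUITS))  # sample space S with N = 52 outcomes


def probability(event, sample_space):
    """Favorable outcomes over total outcomes, for equally likely outcomes."""
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))


p_jack = probability(lambda card: card[0] == "J", deck)
print(p_jack)  # 1/13
```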
PROPERTIES OF PROBABILITY
P(A) = the probability of the event A
P(S) = the probability of the sample space
1. Positiveness. For every event A, 0 ≤ P(A) ≤ 1
This means that the probability of an event is never negative and never greater than 1.
2. Probability of a sure event, P(S) = 1
3. If ∅ is the empty set, then P(∅) = 0
CONDITIONAL PROBABILITY
Calculated when we need to know the likelihood of event A happening given that event B has already happened:
P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0
Example:
Find the probability of drawing a 4 from a shuffled deck of cards given that you have already drawn a 7 from the deck.
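This example can be sketched in Python under two assumptions: the deck is a standard 52-card deck, and the 7 that was drawn is not returned to it. The sketch computes the answer directly from the reduced deck and again from the definition P(A | B) = P(A ∩ B) / P(B).

```python
from fractions import Fraction

DECK_SIZE = 52
FOURS = 4
SEVENS = 4

# Direct count: after a 7 is removed, 51 cards remain and all four 4s are still there.
p_direct = Fraction(FOURS, DECK_SIZE - 1)

# From the definition, with B = "first card is a 7" and A = "second card is a 4".
p_b = Fraction(SEVENS, DECK_SIZE)
p_a_and_b = Fraction(SEVENS, DECK_SIZE) * Fraction(FOURS, DECK_SIZE - 1)
p_definition = p_a_and_b / p_b

print(p_direct, p_definition)  # 4/51 4/51
```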
INDEPENDENT EVENTS
Examples:
Two cards are drawn at random from an ordinary pack of 52 cards. Find the probability that:
a) Both are spades
b) One is a spade and one is a heart.
Three light bulbs are chosen at random from a box containing 15 bulbs of which 5 are defective. Find the probability that:
a) None is defective.
b) Exactly one is defective.
c) At least one is defective.
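Both exercises above can be worked by counting combinations, as in the Python sketch below. It assumes the cards and bulbs are drawn without replacement and that every selection is equally likely. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Two cards from 52 (13 spades, 13 hearts).
two_card_hands = comb(52, 2)
p_both_spades = Fraction(comb(13, 2), two_card_hands)
p_spade_and_heart = Fraction(comb(13, 1) * comb(13, 1), two_card_hands)

# Three bulbs from 15 (5 defective, 10 good).
three_bulb_draws = comb(15, 3)
p_none_defective = Fraction(comb(10, 3), three_bulb_draws)
p_one_defective = Fraction(comb(5, 1) * comb(10, 2), three_bulb_draws)
p_at_least_one = 1 - p_none_defective  # complement of "none defective"

print(p_both_spades, p_spade_and_heart)                   # 1/17 13/102
print(p_none_defective, p_one_defective, p_at_least_one)  # 24/91 45/91 67/91
```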
SAMPLING METHODS AND DESCRIPTIVE STATISTICS FOR SAMPLES
Slovin’s Formula
n = N / (1 + N e²)
where: n = sample size
N = total population
e = tolerance (margin of error), e.g. 0.01 or 0.05
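A minimal Python sketch of Slovin's formula, n = N / (1 + N e²), is given below. The population size of 1000 and the margin of error of 0.05 are hypothetical values chosen only for illustration, and rounding the result up to a whole respondent is a common convention rather than something stated in the notes.

```python
from math import ceil


def slovin_sample_size(population_size, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e^2), before rounding."""
    return population_size / (1 + population_size * margin_of_error**2)


# Hypothetical survey: N = 1000 people, e = 0.05.
raw_size = slovin_sample_size(1000, 0.05)
sample_size = ceil(raw_size)  # round up so the tolerance is not exceeded

print(round(raw_size, 2), sample_size)  # 285.71 286
```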
SAMPLING METHODS
1. Probability Sampling
a) Simple Random Sampling
Each element has an equal and independent chance of being included in the sample.
b) Systematic Sampling
Assign a number to every member of the population, sort them according to their assigned number, then select members at a regular interval (every k-th member).
c) Stratified Sampling
Grouping according to similarities.
d) Cluster Sampling
Grouping according to geographical location.
e) Multistage Sampling
Complex form of cluster sampling. Subgrouping until the final stage of the grouping process.
2. Non-Probability Sampling
a) Convenience Sampling
Samples are not selected at random. Based on convenience or what is favorable for the researcher.
b) Quota Sampling
Like stratified sampling but not randomly selected.
c) Judgement Sampling
Depends entirely on the researcher’s judgement.
d) Snowball Sampling
Data collected is based on referrals.
VARIANCE
STANDARD DEVIATION
Example: A tire manufacturer tested the life, in months, of six randomly chosen tires. The test results are recorded below:
48 53 45 61 57 61
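The tire data can be summarized with the Python sketch below. Because the six tires are a sample, it assumes the sample formulas, with variance s² = Σ(x − x̄)² / (n − 1) and standard deviation s = √s². The notes' own formulas are missing here, so this choice is an assumption.

```python
import statistics

tire_life = [48, 53, 45, 61, 57, 61]  # months

mean_life = statistics.mean(tire_life)
sample_variance = statistics.variance(tire_life)  # divides by n - 1
sample_std = statistics.stdev(tire_life)          # square root of the sample variance

print(round(mean_life, 2))        # 54.17
print(round(sample_variance, 2))  # 44.97
print(round(sample_std, 2))       # 6.71
```

If the six tires were treated as the whole population instead, statistics.pvariance and statistics.pstdev (which divide by n) would apply.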
DATA COLLECTION
STATISTICS
The science that deals with data. It deals with the systematic collection, presentation, analysis and interpretation of numerical data.
Two Categories:
a) Descriptive Statistics - refers to the collection and presentation of data.
b) Inferential Statistics - refers to the analysis and interpretation of data.
DATA
Information, usually facts or numbers, collected to answer research problems or investigations.
Two Types:
DATA COLLECTION
PRESENTATION OF DATA
Example: Currently, the management of a department store gets comments that consumers must wait a long time to be served by the
salespeople. After making some observations, the manager recorded the wait times for 20 customers.
RANDOM VARIABLES
RANDOM EXPERIMENT - an experiment whose results/outcomes cannot be predicted beforehand with certainty.
SAMPLE SPACE - also called possibility space, is the set of all possible outcomes or results.
RANDOM VARIABLE – a function that assigns a real number to each outcome in the sample space of a random experiment.
RANDOM VARIABLES
1. Discrete Random Variable - has a countable number of possible values.
Ex: The tossing of a coin, as an experiment, has a countable number of outcomes.
2. Continuous Random Variable - takes all the values in an interval of numbers.
PROBABILITY DISTRIBUTION OF RANDOM VARIABLES
1. Discrete Random Variable
From Example 1:
Experiment Outcome (Sample Point)    HH    HT    TH    TT
x                                     2     1     1     0

x                   0      1      2
f(x) = P(X = x)    1/4    1/2    1/4
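The table above can be reproduced with a short Python sketch. It assumes that the experiment is two tosses of a fair coin and that X counts the number of heads, which matches the sample points HH, HT, TH, TT. The mean and variance lines use the usual definitions for a discrete random variable, μ = Σ x f(x) and σ² = Σ (x − μ)² f(x).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses: HH, HT, TH, TT (equally likely).
outcomes = ["".join(tosses) for tosses in product("HT", repeat=2)]

# X = number of heads in each outcome.
head_counts = Counter(outcome.count("H") for outcome in outcomes)
pmf = {x: Fraction(count, len(outcomes)) for x, count in sorted(head_counts.items())}

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

for x, p in pmf.items():
    print(x, p)  # 0 1/4, then 1 1/2, then 2 1/4
print(mean, variance)  # 1 1/2
```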
THE MEAN, VARIANCE, AND STANDARD DEVIATION OF RANDOM VARIABLES AND SAMPLES
Example: (Discrete)
X 13 17 21 25
f(x) 0.31 0.23 0.29 0.18
Example: (Continuous)
PLANNING AND CONDUCTING SURVEYS
Test Area: Treatment Introduced, then Level of Phenomenon After Treatment (Y)
Control Area: Level of Phenomenon Without Treatment (Z)
Treatment Effect = Y - Z
Two groups or areas are selected, and the treatment is introduced into the test area only. The dependent variable is then
measured in both areas at the same time.
BINOMIAL PROBABILITY DISTRIBUTION - describes the probability of a particular outcome in a series of experiments where each
outcome has two distinct possibilities, success or failure. A binomial experiment is a series of independent and identically distributed
Bernoulli trials. In a Bernoulli trial, the experiment is random and can have only two possible outcomes: success or failure.
X ~ B(n, p) where: n = total number of trials, p = probability of success
P(x; n, p) = nCx p^x (1 − p)^(n − x) or P(x; n, p) = nCx p^x q^(n − x) where: x = number of successes, q = probability of failure (q = 1 − p)
DESCRIPTIVE STATISTICS FOR A BINOMIAL DISTRIBUTION
Mean = np
Variance = npq
Standard deviation = √(npq)
Example:
A coin is tossed five times.
a) What is the probability distribution of this binomial experiment?
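A sketch of part a) in Python follows. It assumes the coin is fair (p = 1/2) and that X counts the number of heads in the five tosses. Neither assumption is stated in the example. The function name binomial_pmf is illustrative.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x, n, p):
    """P(x; n, p) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, Fraction(1, 2)  # five tosses, assumed fair coin
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, prob in distribution.items():
    print(x, prob)  # 0 1/32, 1 5/32, 2 5/16, 3 5/16, 4 5/32, 5 1/32

mean = sum(x * prob for x, prob in distribution.items())
print(mean)  # 5/2, which equals n * p
```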
Example:
A particular river overflows once every 25 years on average. Find the probability that there are x = 2 overflows in a 25-year interval.
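The notes give no formula for this example. A problem of this form is commonly modeled with the Poisson distribution, P(x; λ) = e^(−λ) λ^x / x!, and the Python sketch below makes that assumption, taking λ = 1 overflow per 25-year interval. The function name poisson_pmf is illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Poisson probability of exactly x events when lam events are expected."""
    return exp(-lam) * lam**x / factorial(x)


# One overflow expected per 25-year interval, so lam = 1 for that interval.
p_two_overflows = poisson_pmf(2, 1.0)
print(round(p_two_overflows, 4))  # 0.1839
```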
Z-SCORE - the number of standard deviations a raw score lies from the mean. A positive z-score indicates a raw score higher than the mean.
Formula: z = (x − μ) / σ
Example:
1. The heights of male adults are normally distributed with a mean of 1.9 meters and a standard deviation of 0.22 meter. What are the
standard scores if the heights of these adults are x1 = 1.6 meters and x2 = 1.8 meters?
2. A machine produces electrical components. 99.7% (z = ±2.97) of the components have lengths between 1.176 cm and
1.224 cm. Assuming the data is normally distributed, what are the mean and standard deviation?
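Both z-score examples can be worked with the Python sketch below, assuming the usual definition z = (x − μ) / σ. For the second example it further assumes that the two given lengths sit symmetrically at z = −2.97 and z = +2.97, so the mean is their midpoint. The function and variable names are illustrative.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


# Example 1: heights with mean 1.9 m and standard deviation 0.22 m.
z1 = z_score(1.6, 1.9, 0.22)
z2 = z_score(1.8, 1.9, 0.22)
print(round(z1, 2), round(z2, 2))  # -1.36 -0.45

# Example 2: the bounds 1.176 cm and 1.224 cm correspond to z = -2.97 and z = +2.97.
lower, upper, z_bound = 1.176, 1.224, 2.97
mean_length = (lower + upper) / 2              # midpoint of the symmetric interval
std_length = (upper - mean_length) / z_bound   # solve z = (x - mean) / sigma for sigma
print(round(mean_length, 3), round(std_length, 5))  # 1.2 0.00808
```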
Example:
An electronic company manufactures resistors that have a mean resistance of 120 ohms and a standard deviation of 12 ohms. If the
distribution of the resistance is normal, find the mean, the variance and the standard deviation of the sampling distribution of the
sample mean for n = 25 resistors.
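A sketch of this example in Python follows. It assumes the standard results for the sampling distribution of the sample mean: its mean equals the population mean μ, its variance is σ²/n, and its standard deviation is σ/√n. The variable names are illustrative.

```python
from math import sqrt

mu, sigma, n = 120, 12, 25  # population mean (ohms), population std dev (ohms), sample size

mean_of_sample_mean = mu                 # same as the population mean
variance_of_sample_mean = sigma**2 / n   # sigma^2 / n
std_of_sample_mean = sigma / sqrt(n)     # sigma / sqrt(n)

print(mean_of_sample_mean)                # 120
print(round(variance_of_sample_mean, 2))  # 5.76
print(round(std_of_sample_mean, 2))       # 2.4
```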
THE CENTRAL LIMIT THEOREM states that the sampling distribution of the sample mean approaches a normal distribution as the
sample size gets larger, even when the population distribution is unknown or not normal. As a rule of thumb, the approximation is
usually considered adequate for sample sizes over 30.
Properties of Point Estimators
1. Bias
Bias is the difference between the expected value (the average or mean value) of a point estimator and the value of the parameter,
θ, being estimated.
A good estimator has a small bias. When the bias is zero then you may say that the point estimator is unbiased.
2. Consistency
Consistency describes how close the point estimator stays to the value of the parameter as the sample size increases.
3. Relative Efficiency
• The absolute efficiency of an estimator is the ratio between the minimum variance and the actual variance.
• An unbiased estimator is called efficient if its variance coincides with the minimum variance for all values of the population parameter.
• If two competing estimators are both unbiased, the one with the smaller variance (for a given sample size) is said to be relatively more
efficient. An estimator θ1 is said to be more efficient than another estimator θ2 for θ if the variance of the first is less than the variance
of the second.
4. Standard Error
• Standard error is a measure of accuracy of a statistic. This is equal to the standard deviation of the sampling distribution of this
statistic.
• The standard error tells you how accurate the mean of any given sample from that population is likely to be compared to the true
population mean. When the standard error increases, i.e. the means are more spread out, it becomes more likely that any given mean
is an inaccurate representation of the true population mean.
SE = σ / √n
where: SE = standard error of the sample