Unit 3 - Review of Fundamental Statistical Concepts
Introduction
Topics:
Unit 3. Review of Fundamental Statistical Concepts
3.1 Data Presentation
3.2 Measures of Central Tendency
3.3 Measures of Dispersion

By the end of the unit, the students must be able to:
1. apply the fundamental statistical concepts; and
2. perform statistical analysis of experimental results.

Engineers and Engineering Method
An engineer is someone who solves problems of interest to society by the efficient application of scientific principles.
Engineers accomplish this by either refining an existing product or process or by designing a new product or process that meets customers’ needs.
The engineering, or scientific, method is the approach to formulating and solving these problems.
Thus, engineers must know how to efficiently plan experiments, collect data, analyze and interpret the data, and understand how the observed data relate to the model they have proposed for the problem under study.
12/03/2024
Design of Experiments (DOE)
Experimental design, also called design of experiments (DOE) or experiment design, is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation.
An experimental design is a detailed plan for collecting and using data to identify causal relationships. It is also a concept used to organize, conduct, and interpret the results of experiments efficiently, so that as much useful information as possible is obtained from a small number of trials.
Through careful planning, the design of experiments gives your data collection efforts a reasonable chance of detecting effects and testing hypotheses that answer your research questions.

4 Methods of Data Collection
a. Census
- A study that obtains data from every member of a population. In most studies, a census is not practical because of the cost and/or time required.
b. Sample Survey
- A study that obtains data from a subset of a population in order to estimate population attributes.
c. Experiment
- A controlled study in which the researcher attempts to understand cause-and-effect relationships.
- The study is “controlled” in the sense that the researcher controls:
(1) how objects are assigned to groups and
(2) which treatments each group receives.
- In the analysis phase, the researcher compares group scores on some dependent variable. Based on the analysis, the researcher draws a conclusion about whether the treatment (independent variable) had a causal effect on the dependent variable.
d. Observational Study
- Like an experiment, an observational study attempts to understand cause-and-effect relationships. Unlike in an experiment, however, the researcher is not able to control:
(1) how objects are assigned to groups and
(2) which treatments each group receives.
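The two things a researcher controls in an experiment, how objects are assigned to groups and which treatment each group receives, can be illustrated with a short Python sketch. This is only one possible approach (simple random assignment into groups of nearly equal size); the function name `assign_to_groups`, the unit labels, and the treatment names are hypothetical and do not come from the source.

```python
import random


def assign_to_groups(subjects, treatments, seed=None):
    """Randomly assign subjects to treatment groups of (nearly) equal size."""
    rng = random.Random(seed)      # a seeded generator makes the assignment repeatable
    shuffled = list(subjects)      # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    # Deal the shuffled subjects out to the treatments in turn.
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups


# Hypothetical experimental units and treatments
subjects = [f"unit_{i}" for i in range(1, 13)]
groups = assign_to_groups(subjects, ["control", "treatment"], seed=42)

for name, members in groups.items():
    print(name, members)
```

In an observational study this step is not available: group membership is whatever is observed, which is why causal conclusions are harder to justify there.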
Choice of Data Collection Method
This is also termed the plan or the sampling design. The data collection method or sampling design will depend upon the following:
- Objectives of the study
- Scope and coverage of the study
- Nature and size of the population
- Hypothesis to be tested

Methods of Data Collection
Data collection is an important aspect of any type of research study. Inaccurate data collection can distort the results of the study and ultimately lead to invalid conclusions.
Data for a statistical investigation can be collected from records, from surveys (face-to-face, telephone, or postal), by direct observation, or by measuring or counting.
Data Presentation
The Seven (7) Common Quality Control Tools
1. Cause-and-effect diagram (also called Ishikawa or fishbone chart).
- It is a diagram that identifies many possible causes (i.e. measurement, environment, materials, manpower, methods, machines) for an effect or problem and sorts ideas into useful categories.
Fig. 1. A generic representation of the Ishikawa diagram, aka fishbone diagram (Hristoski, et al., 2017).

2. Check sheet (also called tally sheet).
- It is a structured, prepared form for collecting and analyzing data. The purpose of a check sheet is to list the important checkpoints or events in a tabular/metrics format and to keep updating or marking their status as they occur. This helps in understanding the progress, defect patterns, and even the causes of defects.
Fig. 2. Defect types with their occurrence on day of the week (https://www.whatissixsigma.net/7-qc-tools/).

3. Control charts.
- A graphical display of a quality characteristic that has been measured or computed from samples, plotted against historically determined control lines (upper, lower, and average).
- It also helps in predicting process performance, understanding various production patterns, and studying how a process changes or shifts from its normally specified control limits (under control, that is, stable or within the limits, or out of control).

4. Histogram.
- The most commonly used graph for showing frequency distributions, or how often each different value in a set of data occurs. A histogram helps in prioritizing factors and identifying which areas need the most immediate attention. The purpose of a histogram is to study the density of data in a given distribution and to understand which factors or data values repeat most often.

5. Pareto chart.
- The Pareto chart is named after Vilfredo Pareto.
- It shows on a bar graph which factors are more significant through the 80:20 rule: 80% of the problems or failures are attributed to the 20% of causes that are the vital few, while 20% of the problems or failures are attributed to the 80% of causes that are the trivial many. The purpose of a Pareto chart is to highlight the most important factors behind a problem or failure.
- It combines a bar graph and a line graph: individual factors are represented by bars in descending order of their impact, and the cumulative total is shown by a line.
Fig. 5. Causes for defects in supplied items (https://www.whatissixsigma.net/7-qc-tools/).

6. Scatter diagram.
- A scatter diagram or scatter plot is a statistical tool that depicts the dependent variable on the Y-axis and the independent variable on the X-axis, plotted as dots at their common intersection points. The purpose of a scatter diagram is to establish a relationship (linear, curvilinear, exponential, logarithmic, quadratic, polynomial, etc.) between a problem (overall effect) and the causes affecting it.
- The stronger the correlation, the more strongly the relationship will hold true.
- The variables can be positively or negatively related, as indicated by the slope of the equation derived from the scatter diagram.
Fig. 6. Types of correlation in scatter plot (https://www.whatissixsigma.net/7-qc-tools/).
7. Flowchart.
- A flowchart is a diagram that shows the step-by-step process or operation used to reach a solution to a problem or to identify where a problem occurs in the process.
- The flowchart shows the steps as boxes of different kinds (according to the type of step) and their order by connecting the boxes with arrows, as shown in Figure 7.
- Stratification is sometimes used; it is a technique that separates data gathered from a variety of sources so that patterns can be seen.
- Basic flowcharting symbols are shown in Figure 8.
Fig. 7. Flow chart on glaze proportional unit’s work flow.
Figure 8. Basic flowchart symbols.
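Of the seven tools, the Pareto chart has the most explicit arithmetic: factors are sorted in descending order of impact and a cumulative total is tracked. The following is a minimal Python sketch of that calculation, without any plotting. The function name `pareto_table` and the defect counts are hypothetical and are used only for illustration.

```python
def pareto_table(counts):
    """Return (cause, count, cumulative_percent) rows, largest count first."""
    total = sum(counts.values())
    rows = []
    running = 0
    # Sort causes in descending order of impact, as on the bar graph.
    for cause, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        # Cumulative share of all problems, as on the line graph.
        rows.append((cause, count, round(100 * running / total, 1)))
    return rows


# Hypothetical defect counts (not taken from the source)
defects = {
    "misalignment": 12,
    "scratches": 45,
    "other": 5,
    "dents": 30,
    "discoloration": 8,
}

table = pareto_table(defects)
for cause, count, cumulative in table:
    print(f"{cause:15s} {count:3d} {cumulative:6.1f}%")
```

Reading down the cumulative column shows how few causes are needed to cover most of the problems, which is the comparison the 80:20 rule is meant to prompt.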
Measures of Dispersion/Variation
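As a minimal sketch of the measures covered in this unit (mean, median, mode, variance, and standard deviation), the following uses Python's standard `statistics` module. The data values are hypothetical and are not the exercise data. Both the population forms (`pvariance`, `pstdev`) and the sample forms (`variance`, `stdev`) are shown, since which one applies depends on whether the data are the whole population or a sample from it.

```python
import statistics

# Hypothetical data values (not taken from the source)
data = [12, 15, 15, 18, 20, 22, 24]

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion
pop_variance = statistics.pvariance(data)    # divides by n
sample_variance = statistics.variance(data)  # divides by n - 1
pop_sd = statistics.pstdev(data)
sample_sd = statistics.stdev(data)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("population variance:", pop_variance)
print("sample variance:", sample_variance)
print("population standard deviation:", pop_sd)
print("sample standard deviation:", sample_sd)
```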
Exercise (20 pts.):
Calculate the mean, median, mode, variance, and standard deviation of the following data values:

How to Find Coefficient of Variation?
The coefficient of variation is particularly useful in cases where we need to compare results from two different surveys having different values. In statistics, the coefficient of variation (CV), also known as relative standard deviation (RSD), is a standardized measure of the dispersion of a probability distribution or frequency distribution. A lower coefficient of variation indicates that the data has less variability and higher stability.
Coefficient of Variation (CV)
- is a type of relative measure of dispersion. It is expressed as the ratio of the standard deviation to the mean.
- is a dimensionless quantity and is usually given as a percentage. It helps to compare two data sets on the basis of the degree of variation.
- can be determined for both a sample as well as a population.

Coefficient of Variation Formula
Population: CV = (σ/μ) × 100, μ≠0
Sample: CV = (s/x̄) × 100, x̄≠0
A population is the entire group that is under consideration (the term is also used to denote the complete data set).
A sample is a specific part chosen from this population. The sample is used to represent the entire population of the study.

Example: Two plants, C and D, of a factory show the following results about the number of workers and the wages paid to them. Using the coefficient of variation formula, find which plant, C or D, has greater variability in individual wages.

                       Plant C    Plant D
No. of workers         5000       6000
Average daily wages    PhP2500    PhP2500
Standard deviation     9          10

Solution:
To find which plant has greater variability, we need to find the coefficient of variation. The plant that has a higher coefficient of variation will have greater variability.
Coefficient of variation for plant C, using the coefficient of variation formula:
CV = (σ/μ) × 100, μ≠0
CV = (9/2500) × 100 = 0.36%
Now, CV for plant D:
CV = (σ/μ) × 100
CV = (10/2500) × 100 = 0.4%
Plant C has CV = 0.36% and plant D has CV = 0.4%.
Answer: Hence plant D has greater variability in individual wages.
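The plant comparison above can be checked with a few lines of Python. This is a minimal sketch of CV = (σ/μ) × 100 applied to the figures from the example; the function name `coefficient_of_variation` is a hypothetical helper, not part of any library.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage."""
    if mean == 0:
        # The formula requires a nonzero mean.
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


# Figures from the example: standard deviation and average daily wage per plant
cv_c = coefficient_of_variation(9, 2500)
cv_d = coefficient_of_variation(10, 2500)

print(f"Plant C: CV = {cv_c:.2f}%")
print(f"Plant D: CV = {cv_d:.2f}%")

# The plant with the higher CV has the greater relative variability.
more_variable = "C" if cv_c > cv_d else "D"
print("Greater variability in wages: plant", more_variable)
```

Because both plants have the same average wage, the comparison here reduces to comparing the standard deviations; the CV becomes more informative when the means differ.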
References:
Montgomery, D. C. (2013). Design and Analysis of Experiments (8th ed.). John Wiley & Sons, Inc. (QA279.M66 2013)
Refereed journals, bulletins, and e-books from the Internet
https://www.moresteam.com/toolbox/design-of-experiments