
A Manager's Guide to Interpreting Analytical Data

Kenneth E. Osborn
East Bay Municipal Utility District, Oakland, CA 94623

"Important arguments require evidence and quantitative evidence requires statistical evaluation." (Stephen Stigler, Statistics on the Table)

Managers use analytical laboratories to provide them with measurement-based information. All measurements have measurement error, and the sources and types of measurement error influence the generation, interpretation, and regulatory uses of laboratory data. The successful manager intuitively understands analytical data and measurement error. This paper offers examples using visual models to reinforce that intuitive understanding and provides a concept-based tool not found in traditional texts on either statistics or data interpretation.

Measurements repeatedly made of the same sample[1] or of a series of samples from a single process will generate a distribution of analytical results centered around a value that we accept as more representative than any of the other measurements. This scattering of values occurs because all measurements contain error. There is no way that errors in the individual measurements can be eliminated. Thus, to properly interpret a collection of measurements, or any analytical result, it is necessary to understand the nature of analytical error.

Analytical error includes both random and systematic components. Systematic error is constant from one observation to the next. Sources of systematic error include improper instrument calibration, contamination introduced during sample processing, and matrix interference. Because systematic errors are constant, they are predictable and thus potentially controllable. An example of a source of systematic error would be a yardstick that was short by one half-inch.

Random errors vary from one observation to the next, have an equal likelihood of increasing or decreasing the measurement value, and decrease in occurrence with increasing magnitude of the error. Because random errors are not individually predictable, they are less controllable than systematic errors. An example of random error is depicted in figure 1. A sample of twenty bodyweight readings, rounded to the nearest whole weight in pounds, was taken by repeated weighings on a bathroom scale.[2] The weights range from 158 pounds to 162 pounds. Considering the variability in readings of the scale, it would be appropriate to ask: how much does this person really weigh? (A simulation sketch of such readings appears after the footnotes below.)
[1] The use of "sample" in this context refers to a collection of data. Thus a sample of size = 7 refers to 7 distinct observations or measurements.
[2] Data used in all plots for this article were generated by a computer-based simulation program written by the author. It is available as a QBASIC program by request to the author at [email protected].
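As a rough illustration of the two error components, here is a minimal Python sketch in the spirit of the author's simulation (the original program was written in QBASIC; the error magnitudes and the true weight used here are illustrative assumptions, not the author's actual values). It generates twenty scale readings by adding a constant systematic bias and a normally distributed random error to a true weight, then rounds to the nearest pound as in figure 1.

```python
import random

random.seed(1)  # fixed seed so the example is reproducible

TRUE_WEIGHT = 160.0      # hypothetical true weight, in pounds
SYSTEMATIC_ERROR = 0.0   # constant bias, e.g. from a miscalibrated scale
RANDOM_SD = 1.0          # assumed spread of the random error component

def scale_reading():
    """One simulated weighing: true value + constant bias + random error,
    rounded to the nearest whole pound as in the figure 1 example."""
    return round(TRUE_WEIGHT + SYSTEMATIC_ERROR + random.gauss(0.0, RANDOM_SD))

readings = [scale_reading() for _ in range(20)]
print(sorted(readings))  # values scatter over roughly 158 to 162 pounds
```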

Any distribution of data has a value around which the values tend to center. Values are scattered around that value so that some are greater than and some are less than the central value. Figure 2 displays the same readings as figure 1, with the values ranked from low to high. This presentation by order of magnitude rather than order of occurrence emphasizes the statistical properties of the distribution, such as central tendency and type of symmetry.

One measure of central tendency is the mode. It is the most frequent reading and equals 161 pounds in the example. Another measure of the center would be the value in the middle, or the measurement that is exceeded by as many values as it exceeds. This is known as the median and equals 160 pounds in the example. The measure we are most familiar with is the arithmetic average, also known as the mean. The mean is defined as the sum of all values divided by the number of values and equals 160 pounds in the example. The mean is symbolized as μ (mu) for the population and as x̄ (x-bar) for the sample. (All three measures are computed in the short sketch below.)

Which measure of central tendency is best? It depends! For a series of symmetrically distributed measurements, such as in this example, the mean and the median will be the same. For a collection of asymmetrically distributed data, the median may provide a more useful center of the distribution, especially if the data contain extreme values. An example of an asymmetrical distribution would be the collection of all family incomes in the United States. An arithmetic mean would tend to indicate a higher income than would be typical for the average family, because a small percentage of very wealthy families would have a disproportionate effect on the mean. The median, being the middle value, tends to be insensitive to extreme values.

Given that the mean is 160 pounds and the scale has been accurately calibrated, can it be stated that the true weight, based on this sample of twenty weights, is 160 pounds? Suppose another sample of twenty weights were taken; would the mean of that set also be 160 pounds? If repeated weighings gave the same answer over and over, we could conclude that the true weight was the same as the mean. However, another series of weighings will not produce the exact same sequence. Because of measurement error, a sample of measurements can provide only an estimate of the population mean of all possible measurements. The true population mean is beyond our reach. So the question is not whether the true weight equals 160 pounds, but rather, how close is the 160 pounds to the true answer?

Figure 3 is a plot of 100 scale readings (rounded to the nearest ounce) ranked in order by magnitude. The range runs from a low of 158 pounds to a high of 162 pounds, and the mean is 160 pounds. This plot can be converted into a probability curve by exchanging the axes (see figure 4). The readings are symmetrically scattered around the mean, the mean and median are equal, and there is a greater density of readings near the mean than at the extremes.
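The three measures of central tendency are straightforward to compute. A minimal Python sketch, using twenty hypothetical whole-pound readings chosen to match the statistics reported for figure 2 (mode 161, median 160, mean 160); these are not the article's actual data:

```python
from statistics import mean, median, mode

# Hypothetical readings consistent with the reported mode, median, and mean;
# the actual figure 2 values are not reproduced here.
readings = [158, 158, 158, 159, 159, 159, 159, 159, 160, 160,
            160, 161, 161, 161, 161, 161, 161, 161, 162, 162]

print("mode:  ", mode(readings))    # 161, the most frequent value
print("median:", median(readings))  # 160, the middle of the ranked list
print("mean:  ", mean(readings))    # 160, the arithmetic average
```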

The conditions of symmetry, equal mean and median, and decreasing probability of occurrence at extreme values are the properties of a normal distribution. There is nothing abnormal about other types of distributions, but there is an almost universal tendency for physical measurements to be normally distributed. A generalized plot of the normal distribution is depicted in figure 5; it is also referred to as the Gaussian distribution and the bell curve. When the mean is zero and the standard deviation is one, the term standard normal distribution is used.

So how does this help answer the question, how close is the mean of a sample to the population mean? Distributions have three important properties: a shape, a center, and a dispersion around the center. If the errors consist of many small and unrelated effects, the errors, and hence the measurements, will be normally distributed. A normal distribution is uniquely represented by two, and only two, statistics: the mean and the standard deviation. The standard deviation is a measure of dispersion in the same fashion that the mean is a measure of central tendency. Knowledge of one statistic in the absence of the other provides incomplete information regarding a collection of measurements. The standard deviation of the distribution for figures 3 and 4 is 1 pound and is calculated as the square root of the average squared distance of the data from the mean (Gonick and Smith, The Cartoon Guide to Statistics). The standard deviation is symbolized by σ (sigma) for the population and s for the sample.

Now we have something! A normal distribution of measurements, with a mean of 160 and a standard deviation of 1. Referring to figure 4, it is apparent that roughly 60 to 70 percent of the values lie within 1 standard deviation of the mean and more than 90 percent of the values lie within 2 standard deviations of the mean. Thus we would expect to obtain readings within one pound of the true weight 60 to 70 percent of the time and readings more than one pound from the true weight 30 to 40 percent of the time. We also would expect to obtain a small percentage of readings that differ from the true weight by more than two pounds. (A short sketch illustrating these proportions follows below.)

But is the mean of a series of weighings any more reliable than an individual weighing? Intuitively, it seems that repeated weighings should provide a more representative answer, since the sum of the random errors will approach zero as the number of measurements increases. This brings us to the Central Limit Theorem, which states: if repeated random samples are taken from a population, the collection of sample means will approximate a normal distribution with a mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of the number of measurements in each sample. As the sample size increases, the approximation improves.
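Both the spread of individual readings and the Central Limit Theorem's claim about sample means can be checked numerically. A minimal Python sketch, assuming simulated readings in place of the article's actual figure 3 data:

```python
import math
import random

random.seed(2)

def sd(values):
    """Square root of the average squared distance of the data from the mean."""
    m = sum(values) / len(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))

# 1. Individual readings: simulate 100 weighings with mean 160 lb, sd 1 lb.
readings = [random.gauss(160.0, 1.0) for _ in range(100)]
m, s = sum(readings) / len(readings), sd(readings)
within_1 = sum(abs(x - m) <= 1 * s for x in readings) / len(readings)
within_2 = sum(abs(x - m) <= 2 * s for x in readings) / len(readings)
print(f"mean {m:.1f} lb, sd {s:.2f} lb")
print(f"within 1 sd: {within_1:.0%}, within 2 sd: {within_2:.0%}")

# 2. Central Limit Theorem: the means of many samples of size 20 should
#    scatter with a standard deviation near 1 / sqrt(20), about 0.22 lb.
sample_means = [sum(random.gauss(160.0, 1.0) for _ in range(20)) / 20
                for _ in range(1000)]
print(f"sd of sample means: {sd(sample_means):.2f} lb "
      f"(theory: {1 / math.sqrt(20):.2f})")
```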

The Central Limit Theorem is consistent with our intuition: collect more measurements, and the mean of the data converges to a constant. Furthermore, the standard deviation of the sample mean is smaller than the population standard deviation: as the size of the sample increases, the standard deviation of the mean decreases with the square root of the sample size. For the example of twenty weighings on the bathroom scale, the standard deviation of the mean is about 0.2 pounds (1 pound divided by the square root of 20).

Is it necessary to make replicate measurements to evaluate an analytical result? Of course not! It is reasonable to apply the statistics of past efforts to future events. For example, if I weigh myself after the Christmas holidays and see a scale reading of 165, then, based on the previous readings ranging from 158 to 162 pounds, I can conclude that I have gained weight. Of course, it is always possible that the scale malfunctioned or that my daughter is playing a trick on me and adjusted the dial. But if the scale calibration and the zero reading are acceptable, then I should eat fewer hamburgers and French fries for a while. (A numeric version of this check appears in the sketch after the closing paragraph.)

I hope this brief introduction to visualizing analytical data through probability plots has provided the reader with a new tool for interpreting quantitative information. While this presentation has been limited to a discussion of measurement error and the most basic of statistical concepts, probability plots are simple to construct and have universal applications for data interpretation. Future articles and presentations will explore additional uses.
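As a closing illustration, here is a minimal numeric version of the holiday-scale check above, using the Z-number defined under "Definition of terms". The 2-standard-deviation cutoff is an illustrative assumption, not a rule given by the author:

```python
# Hypothetical holiday reading checked against the earlier statistics.
baseline_mean = 160.0  # mean of the earlier twenty readings, in pounds
baseline_sd = 1.0      # standard deviation of individual readings

new_reading = 165.0

# Z-number: distance from the mean as a fraction of the standard deviation.
z = (new_reading - baseline_mean) / baseline_sd
print(f"z = {z:.1f}")  # 5.0: far outside the 158-162 lb range seen before

# Readings more than ~2 standard deviations from the mean are rare for a
# normal distribution, so a z of 5 points to real weight gain (or a scale
# problem such as miscalibration).
if abs(z) > 2:  # illustrative threshold, an assumption of this sketch
    print("Reading is inconsistent with previous weights; investigate.")
```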

Definition of terms

Error: The difference between the observed measurement and the true value.
Systematic error: The component of error that is constant.
Random error: The component of error that is not constant.
Sample: A collection of measurements from a population.
Population: The collection of all possible measurements.
Mean: The arithmetic average of a sample or population.
Median: The middle value in a distribution of measurements.
Mode: The most frequently occurring value in a distribution.
Standard deviation: A measure of variability or spread in a sample or population.
Central Limit Theorem: A fundamental theorem of probability stating that the mean of a sample from a population is approximated by the normal distribution as the number in the sample becomes large (Webster's Ninth New Collegiate Dictionary).
Normal distribution: The mathematically defined scatter of a set of measurements around a central value for which the errors are random, less probable at extreme values, and the consequence of multiple small factors.
Z-number: The relative distance of a measurement from the average, expressed as a fraction of the standard deviation.
μ (mu): Population arithmetic mean; the true value.
x̄ (x-bar): Sample mean calculated from data; an estimate of μ.
σ (sigma): Population standard deviation.
s: Estimate of σ calculated from data.
