Intro - To Statistics Data Analysis in Geology - Dr. Franz J Meyer
1. Introduction
Since Then:
2000-2003: Scientific Employee at the Chair for Photogrammetry and
Remote Sensing of the TU Munich.
Main Fields of Expertise
Methods & Applications
In the past, geology was largely qualitative, but it is becoming increasingly
quantitative. Statistics can be used to quantify data, but statistics are often
ignored or misrepresented.
In the old days, geologists relied more on observational skills: "This looks like
granite."
Methods of statistical data analysis are required to retrieve information from the
set of numbers in the computer.
Retrieving information from the data and choosing the right observation
source for information retrieval are the main challenges of the information age.
To get some perspective, think of the largest library in the world: the
Library of Congress in Washington, D.C. This massive library contains
29 million books and other printed materials, 2.7 million recordings, 12
million photographs, 4.8 million maps and 57 million manuscripts.
As massive as that sounds, the scientific data from EOSDIS could fill the
Library of Congress 300 times.
Using these data and extracting the useful information is one of the
disciplines of statistics and data analysis
Ternary Diagrams:
Spider Diagrams:
Graphs are a helpful tool for getting a first grasp of the information
content of the data.
We need a better answer than "They look like it, so they must be different."
Inferential Statistics: used to model patterns in the data, accounting for randomness
and drawing inferences about the larger population.
These inferences may take the form of answers to yes/no questions (hypothesis testing),
estimates of numerical characteristics (estimation), descriptions of association
(correlation), or modeling of relationships (regression).
Other modeling techniques include ANOVA, time series, and data mining.
Examples:
The average age of citizens who voted for the winning candidate in the last presidential
election
The average length of all books about statistics
The variation in the weight of 100 boxes of cereal selected from a factory's production
line
Or, more technically: The adjustments of 14 GPS control points for this orthorectification
ranged from 3.63 m to 8.36 m, with an arithmetic mean of 5.14 m
Interpretation:
You are most likely to be familiar with this branch of statistics, because many examples
arise in everyday life.
Descriptive statistics form the basis for analysis and discussion in many fields.
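A descriptive summary like the GPS example above (range and arithmetic mean) can be sketched in a few lines of Python. This is only an illustration: the 14 adjustment values and the helper name describe are invented for this sketch and are not the data behind the numbers quoted on the slide.

```python
import statistics

# Hypothetical adjustments in metres for 14 GPS control points.
# These values are invented for illustration; they are NOT the lecture's data.
adjustments_m = [3.6, 4.1, 4.4, 4.7, 4.8, 5.0, 5.1,
                 5.2, 5.3, 5.5, 5.8, 6.2, 7.0, 8.4]


def describe(values):
    """Return simple descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


summary = describe(adjustments_m)
print(f"{summary['n']} points, range {summary['min']} to {summary['max']} m, "
      f"mean {summary['mean']:.2f} m, median {summary['median']:.2f} m")
```

Note that the summary describes only these 14 points; it makes no claim about any other set of control points.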
Examples:
A survey that sampled 2001 full- or part-time workers ages 50 to 70, conducted by the
American Association of Retired Persons (AARP), discovered that 70% of those polled
planned to work past the traditional mid-60s retirement age.
This statistic could be used to draw conclusions about the population of all workers ages 50
to 70.
Or again, more technically: The mean adjustment of any set of GPS points used for
orthorectification is no less than 4.3 m and no more than 6.1 m; this statement has a 5%
probability of being wrong
Interpretation:
If you use inferential statistics, you start with a hypothesis and look to see whether the
data are consistent with this hypothesis.
Inferential statistical methods can easily be misapplied or misconstrued, and many
methods require the use of a calculator or computer.
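An interval statement like the GPS example above ("no less than ... and no more than ..., with a 5% probability of being wrong") corresponds to a 95% confidence interval for the mean. The following is a minimal sketch, assuming the adjustments are independent and roughly normally distributed. The sample values and the helper name mean_confidence_interval are hypothetical; the critical value 2.160 is the tabulated two-sided 95% Student's t value for 13 degrees of freedom.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 13 degrees of freedom
# (14 points minus 1), taken from a standard t table.
T_CRIT_95_DF13 = 2.160

# Hypothetical adjustments in metres; invented for illustration only.
sample_m = [3.6, 4.1, 4.4, 4.7, 4.8, 5.0, 5.1,
            5.2, 5.3, 5.5, 5.8, 6.2, 7.0, 8.4]


def mean_confidence_interval(values, t_crit):
    """Return (lower, upper) bounds of a confidence interval for the mean."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    std_err = statistics.stdev(values) / math.sqrt(n)
    half_width = t_crit * std_err
    return mean - half_width, mean + half_width


lower, upper = mean_confidence_interval(sample_m, T_CRIT_95_DF13)
print(f"95% confidence interval for the mean: {lower:.2f} m to {upper:.2f} m")
```

The critical value must match the sample size: a different number of points needs a different t value from the table.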
Variance measures how a set of data values for a variable fluctuates around the
mean of that variable.
Variance is an inherent property of the measurement device one is using, or of the object
that is observed.
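To make "fluctuation around the mean" concrete, the variance can be computed by hand as the average squared deviation from the mean. The sketch below assumes the sample form of the variance (dividing by n - 1), which the slide does not specify; the readings and the function name sample_variance are made up for illustration.

```python
import math


def sample_variance(values):
    """Sample variance: sum of squared deviations from the mean over (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


# Hypothetical repeated readings of one quantity (arbitrary units).
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

variance = sample_variance(readings)   # mean is 5.0, squared deviations sum to 32.0
std_dev = math.sqrt(variance)          # standard deviation is the square root of variance
print(f"variance = {variance:.3f}, standard deviation = {std_dev:.3f}")
```

The standard deviation has the same units as the readings, which is why it is usually the quantity drawn on graphs.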
Preparation errors
contamination, final split does not represent field sample
Analytical errors
calibration errors (setting up the machine)
measurement errors (fluctuations in counting)
machine errors (properties of the machine, mass fraction).
Drawing conclusions from a set of erroneous data is difficult, and using the wrong
analysis methods or the wrong models may lead to incorrect results.
"There are three kinds of lies: lies, damned lies and statistics." - Twain attributed this to
B. Disraeli
"It has long been recognized by public men of all kinds ... that statistics come under the head
of lying, and that no lie is so false or inconclusive as that which is based on statistics." -
H. Belloc
"If your experiment needs statistics, you ought to have done a better experiment." -
Ernest Rutherford
"Never trust results you haven't forged yourself." - Famous saying among engineers
Let's change the graph by adding the standard deviation (square root of
variance) to the graph
We still see a strong temperature increase since 1900, but
the temperature in the past suddenly seems very noisy and not as threatening anymore
The anthropogenic nature still seems present, but is not as obvious anymore
Let's change the graph again by adding other studies to the graph
Temperature rise still sticks out, but the differences between the studies render the
amplitude questionable
the temperature trend in the past doesn't look linear anymore
anthropogenic climate change?
Since the eye is a "fat pipe" to the mind, that is, since a great deal of
(mis)information can be quickly communicated visually, the (im)proper display of
statistics offers a fast track to selling ideas, and potentially to lying with statistics
The homework will help you to reach this goal and will help you and me to
understand topics that need to be reiterated
Also, my foremost goal is to help you understand the material and to help you pass, so
please don't hesitate to contact me when you have problems:
Franz Meyer, Room 106d, Westridge Research Building
Phone: 907-474-7767
email: [email protected]
http://avo-ftp.images.alaska.edu/TEMP/geos430_geostats/
For downloading the material, just type the URL into your web browser
The PowerPoint material you find on the CD may change a bit in the course of the
semester
I will update the material on the webpage constantly and you can download the
latest versions from there