EES401-Quantitative Data Analysis in Earth and Environmental Sciences

Here is a potential 1/2 page response to the 1 mark assignment: The purpose of my experiment was to test the hypothesis that soil moisture content affects the concentration of nitrates in groundwater. It was a hypothesis test where I proposed the null hypothesis that there is no relationship between soil moisture and nitrate concentration. To test this, I selected 5 monitoring wells located on agricultural land with varying soil types (clay, loam, sand). I chose these wells randomly to get a representative sample of different soil conditions. At each well, I measured the soil moisture content at depths of 1 meter, 2 meters, and 3 meters below ground surface. I also took a water sample from each well to analyze for nitrate concentration.

EES401 - Quantitative data analysis in earth and environmental sciences

Grades
• 1st Midsem: 20 Marks
• Endsemester Exam: 50 Marks
• Assignments/Quizzes: 30 Marks
Course structure
• You will be learning how to perform data analysis
• You will learn about statistical tools and how to use them

• More importantly, IDC401 is a course attempting to teach you:

1. How to ask the right questions
2. How to select the right mathematical tool to answer them
3. How to interpret the answer
4. How to recognise when you are trying to use an inappropriate tool
5. How to recognise when someone is “fudging” data or accidentally setting up a flawed experiment
What is data?
Data consists of variables that can be classified

Variables
  quantitative
    Ratio (true zero), e.g. Kelvin scale for temperature
      Discrete (finite number or countably infinite): People in a room (integer)
      Continuous (infinite number of values on real number line): Money in an account with no overdraft (float)
    Interval (interpretable distance), e.g. Celsius scale for temperature
      Discrete (finite number or countably infinite): Year in MSc of student (integer)
      Continuous (infinite number of values on real number line): Temperature in Celsius (float)
  qualitative (Groups/Categories)
    Ordinal (can be ordered): A, B, C, D grade
    Nominal (no order): Male/female/3rd gender
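The difference between the interval and the ratio scale can be checked numerically. The following is a minimal Python sketch, assuming two illustrative temperatures of 10 and 20 degrees Celsius (values chosen for this example, not taken from the course). Differences agree on both scales, but a ratio is only meaningful in Kelvin, where zero is a true zero.

```python
# Illustrative temperatures; the values are assumptions made for this example.
t1_celsius = 10.0
t2_celsius = 20.0


def celsius_to_kelvin(t_celsius):
    """Convert a temperature from the Celsius (interval) to the Kelvin (ratio) scale."""
    return t_celsius + 273.15


t1_kelvin = celsius_to_kelvin(t1_celsius)
t2_kelvin = celsius_to_kelvin(t2_celsius)

# Differences are interpretable on both scales and agree with each other.
diff_celsius = t2_celsius - t1_celsius
diff_kelvin = t2_kelvin - t1_kelvin

# The Celsius "ratio" depends on an arbitrary zero point and is not meaningful;
# the Kelvin ratio is.
ratio_celsius = t2_celsius / t1_celsius
ratio_kelvin = t2_kelvin / t1_kelvin

print(f"difference: {diff_celsius:.2f} degC = {diff_kelvin:.2f} K")
print(f"ratio in Celsius: {ratio_celsius:.3f} (misleading)")
print(f"ratio in Kelvin:  {ratio_kelvin:.3f}")
```

In Celsius the second temperature looks "twice as hot", while on the Kelvin scale it is only a few percent higher.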
Note that variables can be dependent and independent
• When your data variable has no meaning on its own, it is a dependent variable. An A grade has a meaning only in relation to the student to whom it was awarded
• Similarly, a measured concentration of a pollutant in groundwater will never have any meaning or practical implications without knowledge of the place and time of sample collection. Lab journals, sample labels and storage are all an important part of maintaining data integrity. Flaws in any of them can jeopardize the meaning of the data
• When your data variable has meaning on its own, it is an independent variable. E.g. space and time are independent: they have meaning whether we measure the pollutant or not.
Datasets can be classified depending on the number of variables they contain
• Datasets consisting of a single variable are called univariate
• Datasets consisting of two variables are called bivariate
• When there are more variables in the dataset it is called multivariate data
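The naming rule above can be written down in a few lines. This is a minimal Python sketch; the function name and the variable names passed to it are hypothetical and serve only as illustration.

```python
def classify_dataset(variable_names):
    """Return 'univariate', 'bivariate' or 'multivariate' from the number of variables."""
    n = len(variable_names)
    if n == 0:
        raise ValueError("a dataset needs at least one variable")
    if n == 1:
        return "univariate"
    if n == 2:
        return "bivariate"
    return "multivariate"


# Hypothetical variable names, used only for illustration.
print(classify_dataset(["nitrate_mg_per_l"]))
print(classify_dataset(["nitrate_mg_per_l", "soil_moisture_percent"]))
print(classify_dataset(["nitrate_mg_per_l", "soil_moisture_percent", "depth_m"]))
```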
Does it matter what type of data I have? Why should I care?

Here is a simple list of do’s and don’ts.

If you know what kind of data you have, you know what you can or cannot do with it.
OK to compute....                                     Nominal  Ordinal  Interval  Ratio
Frequency distribution (histogram)                    Yes      Yes      Yes       Yes
Median and percentiles                                No       Yes      Yes       Yes
Add or subtract                                       No       No       Yes       Yes
Mean, standard deviation, standard error of the mean  No       No       Yes       Yes
Ratio, or coefficient of variation                    No       No       No        Yes
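The do's and don'ts can also be read as a rule: each computation needs at least a certain measurement scale. The sketch below is one possible Python encoding of that rule, not part of the course material. The names `SCALES`, `MINIMUM_SCALE` and `is_ok_to_compute` are made up for this example, and the second row is treated as "median and percentiles", since the mean has its own row.

```python
# Measurement scales, ordered from least to most informative.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Lowest scale for which each computation is acceptable, following the table.
MINIMUM_SCALE = {
    "frequency distribution": "nominal",
    "median and percentiles": "ordinal",
    "add or subtract": "interval",
    "mean, standard deviation, standard error of the mean": "interval",
    "ratio, or coefficient of variation": "ratio",
}


def is_ok_to_compute(operation, scale):
    """Return True if `operation` is acceptable for data measured on `scale`."""
    required = MINIMUM_SCALE[operation]
    return SCALES.index(scale) >= SCALES.index(required)


# Grades (ordinal) allow a median, but temperatures in Celsius (interval)
# do not allow a coefficient of variation.
print(is_ok_to_compute("median and percentiles", "ordinal"))
print(is_ok_to_compute("ratio, or coefficient of variation", "interval"))
```

Ordering the scales in a list works because every "Yes" in a row of the table is followed only by further "Yes" entries to its right.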
1. A sample of 400 households is selected and several variables are recorded. Classify each of the following variables on two counts: (1) data class, (2) type of variable (dependent/independent)

a) Total household income (in $)
b) Socioeconomic status (recorded as “low income”, “middle income”, or “high income”)
c) The number of people living in a household
d) The primary language spoken in the household
e) Household ID# in result database
2. Which types of operations are permissible/non-permissible for certain data types (tick and cross cells in the table as appropriate)

OK to compute....                                     Nominal  Ordinal  Interval  Ratio
Frequency distribution (histogram)
Median and percentiles
Add or subtract
Mean, standard deviation, standard error of the mean
Ratio, or coefficient of variation
The way from data to knowledge… is called statistics ☺
Statistics “refers to the orderly collection, analysis, and interpretation of data with a view to objective evaluation of conclusions based on the data”
Zar J. Biostatistical Analysis, Fifth edition, Prentice Hall, New Jersey, 1999
Population is the (possibly infinite) group of measurements about which one wishes to generalise. It consists (hypothetically) of all measurements of the desired case
Sample is the subset of measurements that the person measuring is actually able to acquire.
For a sample to be representative of the population it comes from, its members should be chosen ‘randomly’
Orderly collection
• Starts with a proper lab journal
• Ask yourself what you are doing:
  • A) An observational study (e.g. create a map of arsenic (As) levels in the unconfined aquifer in Punjab)
  • B) A hypothesis test (in which case you have to be clear about the hypothesis that you are testing)
• Ask yourself what is the correct sample selection strategy:
  • A) Simple random sampling
  • B) Stratified random sampling
  • C) Paired group
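The first two sample selection strategies can be sketched in Python with the standard `random` module. This is a minimal illustration under stated assumptions: the sampling frame of 30 wells on three soil types, the well names, the sample sizes and both function names are hypothetical. The paired group design is not shown.

```python
import random


def simple_random_sample(units, n, rng):
    """Draw n units without replacement, each unit being equally likely."""
    return rng.sample(units, n)


def stratified_random_sample(units, stratum_of, n_per_stratum, rng):
    """Draw n_per_stratum units without replacement from every stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for stratum in sorted(strata):
        sample.extend(rng.sample(strata[stratum], n_per_stratum))
    return sample


# Hypothetical sampling frame: 30 wells, 10 on each of three soil types.
wells = [(f"well_{i:02d}", soil) for i, soil in enumerate(["clay", "loam", "sand"] * 10)]

rng = random.Random(42)  # fixed seed so the draw can be reproduced
simple = simple_random_sample(wells, 6, rng)
stratified = stratified_random_sample(wells, lambda well: well[1], 2, rng)

print("simple random:    ", simple)
print("stratified random:", stratified)
```

A simple random sample may over- or under-represent a soil type by chance, whereas the stratified draw returns exactly two wells per soil type.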
1st Assignment – 1 Mark
• Write some words (ca. half a page) about an experiment you are currently doing (e.g. for your thesis) or have done in the past, answering the following questions:
1. What was the purpose of the experiment?
2. Was it an observational study or a hypothesis test?
3. How did you decide which samples to work with and how to analyse them?
4. Which variables did you observe?
5. Now please classify all your variables
