
Unit 0 - Statistics Unit Notes Dictated (CLOSED)

Uploaded by

Orion Gjoni

Class 8/26 statistics unit starts

Importance: Why stats and data?

Science is a discipline based on a collection of quantifiable data.

Science can only ask questions answered by quantifiable data

Fields that rest on purely qualitative data, e.g. theology or philosophy, fall outside science.

Even data that looks qualitative at face value becomes quantitative: a chemical changes to yellow,
but how yellow?

A scientific hypothesis can only be supported or not supported by data, never proven!!!

Structure defines function of everything.

Science is not old; understanding limited by technology.

There is no law in biology, unlike chemistry or physics.

Validity & significance of data

Data is not always true.

Trust, but verify.

Verification will happen via statistical analysis.

End class 8/26.


Class 8/28 Validity and Significance of Data

Validation & statistical analysis

Not all data is significant or valid, as data can be skewed / flawed.

Conclusions are valid only when supported by valid / significant data.

How to validate data? Statistics

Data can be skewed through response to stimuli, a requirement of life. Therefore, no study can
remove all noise variables.

Statistics separates significant data from noise.

Collection of valid and significant (v&s) data requires multiple trials.

When writing procedures, always include a minimum of three trials.

The greater the sample size, the better and more v&s the data.

Lots of graphs and models. Science conveys the most information in the least amount of text
possible.

One model / graph > multi- page writeup

Only the mean is ever plotted.

Example: Enzymatic activity in the face of temperature swings

Three trials at room temperature: plot 1 point, the average of the trials

Three trials at high temperature: plot 1 point, the average of the trials

Three trials at low temperature: plot 1 point, the average of the trials

Total number of points: 3
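
A minimal sketch of the example above in Python. The activity readings are made up for illustration; only the structure (three trials per temperature, one plotted mean each) comes from the notes.

```python
from statistics import mean

# Hypothetical enzyme activity readings (arbitrary units), three trials per temperature.
trials = {
    "low": [2.1, 1.9, 2.0],
    "room": [5.2, 4.8, 5.0],
    "high": [3.1, 2.7, 2.9],
}

# One plotted point per temperature: the mean of its three trials.
points = {temp: mean(values) for temp, values in trials.items()}

for temp, avg in points.items():
    print(f"{temp}: {avg:.2f}")
```

Nine measurements go in, but only three points would appear on the graph.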

Ideally, the amount of data is large enough to represent a good sample size, which should result
in a bell-shaped curve, AKA a normal distribution.
Ideally, the mean should be a perfect representation of the data.

The mean is the average across multiple trials.

One point per change in the variable.

Dotted line is the mean.

Consider organisms like ecosystems: Limited resources.

Biggest limited resource is energy.

Realities of data:

Rarely perfect; unlikely to fit a normal curve exactly

The problem with plotting the mean: the mean may not accurately represent the spread, and the
data may have a positive skew or negative skew

Skew is mitigated via error bars

Problem with the mean: Mean does not indicate spread of data

Data spread impacts validity of the mean as a representative of the data.

Rule of thumb: The smaller the spread, the more valid the mean.

Quantify via deviation/error

Used to determine the spread of data points from the mean

Equation of standard deviation


Equation of standard error
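
For reference, the standard textbook forms, assuming the sample standard deviation (dividing by n - 1), where x_i is each data point, x-bar is the mean, and n is the sample size:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
\qquad
SE_{\bar{x}} = \frac{s}{\sqrt{n}}
```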

Standard deviation represents confidence in the data. A smaller standard deviation means higher
confidence, and vice versa.

Confidence does not equal proof

Representation of confidence in models happens via error bars, based off standard error, based
off standard deviation. Standard error of mean represents the uncertainty of the mean due to
sample size as well as spread.

Standard error is a better representation of error, taking into account sample size as well as
spread.

As with standard deviation, lower standard error is preferable. A small standard error means the
sample mean is more likely to be close to the true mean of the population.
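
A short Python sketch of how standard deviation and standard error relate. The data values are hypothetical, and the sample (n - 1) form of standard deviation is assumed.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements from repeated trials of one condition.
data = [4.8, 5.0, 5.2, 4.9, 5.1]

n = len(data)
avg = mean(data)
sd = stdev(data)      # sample standard deviation (divides by n - 1)
sem = sd / sqrt(n)    # standard error of the mean: spread scaled by sample size

print(f"mean = {avg:.3f}, SD = {sd:.3f}, SEM = {sem:.3f}")
```

With the same spread, a larger n gives a smaller SEM, which is why standard error reflects sample size as well as spread.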

The mean is meant to be a representative of a population.

End class 8/28

Class 9/3 Error Bars


Error bars in bar graphs
Error bars in line graphs

Error bars and significance

Data may be valid, but when a difference in the data is small enough, the question becomes whether it is significant.

We know that a mutation may happen; we do not know when, in what cell, or in what gene.

Mutations happen about once every billion nucleotides. Cells have 6 billion nucleotides. There
should be about six mutations per cell cycle.

Organisms are ordered, and stand against the entropy of the universe

In cancer research, cells exposed to carcinogens show a faster rate of mutation.

Standard error bars communicate:

1. How accurately the mean represents the data, accounting for low sample size

The smaller the error bar, the more reliable the data, and vice versa

2. How likely it is that there is a significant difference between data sets.

What is a significant difference?

Data is significant if results are not due to chance or sample size error

How does radiation cause cancer? Radiation introduces energy to cell molecules, making them
more likely to do things they would not normally do.

Carcinogens also destabilize cell molecules, though not necessarily via introducing energy into
the system.

If error bars overlap, the difference between the data sets is not significant.

Insignificant v Significant

Benefit to using 2SE

When the SEM is very small, it can be difficult to distinguish if there is overlap.
2SE makes it easier to ascertain overlap.

Always mention if graph uses SE or 2SE
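
A minimal Python sketch of the overlap rule from these notes, using mean +/- 2SE bars. The function names and all measurements are hypothetical, chosen only to illustrate the check.

```python
from math import sqrt
from statistics import mean, stdev

def error_bar(data, multiplier=2):
    """Return the (low, high) bounds of mean +/- multiplier * SEM."""
    avg = mean(data)
    sem = stdev(data) / sqrt(len(data))
    return avg - multiplier * sem, avg + multiplier * sem

def bars_overlap(a, b, multiplier=2):
    """True if the error bars of the two data sets overlap."""
    low_a, high_a = error_bar(a, multiplier)
    low_b, high_b = error_bar(b, multiplier)
    return low_a <= high_b and low_b <= high_a

# Hypothetical measurements for a control and two other groups.
control = [4.8, 5.0, 5.2, 4.9, 5.1]
treated = [6.0, 6.3, 5.9, 6.1, 6.2]
similar = [5.0, 5.2, 4.9, 5.3, 5.1]

print(bars_overlap(control, treated))  # bars are far apart
print(bars_overlap(control, similar))  # bars share a range
```

Passing multiplier=1 gives SE bars instead of 2SE bars, which is why the graph should always state which one it uses.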

Experimental design

Key components of experimental design:

1. Independent variable: Variable that is manipulated. Plot on x-axis


2. Dependent variable: Variable that is measured. Plot on y-axis
3. Control group: Group used for comparison with the experimental group. Independent variable is
unchanged; dependent variable is measured.
4. Experimental group: Group in which independent variable is manipulated; dependent
variable is measured.
5. Controlled variables: All factors that stay constant between control and experimental
6. Additional components: Large sample size, repeatable procedure, multiple trials.

Positive v negative controls

Negative control group: Is not exposed to the experimental treatments. Provides no response to
treatment. Tests influence of external factors. Placebo.

Positive control group: Exposed to a treatment with a known effect. Provides an expected / known result.
Where negative gets placebo, positive gets ibuprofen. Negative is far more common.

End class 9/3


Class 9/9 Null/Alternate and Chi-Square

Hetero-hetero cross, ends w/ 75% dominant, 25% recessive

In a population of 200, 150 would be dominant, 50 would be recessive.

Calling a result an outlier is no longer the first claim we can make; such a claim must be backed
by statistical analysis.

The two kinds of hypotheses


Null hypothesis: assumes the variable has no effect, and that all observations are a product of
chance. Insignificant data. Designated H0.

Alternate hypothesis: assumes the variable & relationships are true, and not a product of
chance. Significant data. Designated Ha.

In order to move forward with the alternate, we must first reject the null, which implies there is a
scientific explanation of the phenomenon.

Chi-Square
Chi-square is a statistical test that determines whether the difference between the observed and
expected data is significant.

If the discrepancy is not significant, we have failed to reject the null hypothesis, and the
discrepancy is attributed to random chance.

If the discrepancy is significant, we have rejected the null hypothesis and may argue that the
discrepancy is due to scientific phenomena.

A null hypothesis may only ever be used when you have expected results.

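Chi-Square equation: χ² = Σ (o − e)² / e, where o is an observed count and e is the matching
expected count, summed over all categories.

The chi-square value (χ²) is compared to the critical value.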

Critical value is given by a chart, readable by degrees of freedom and p-value.

If chi-square is less than the critical value, the discrepancy is not significant. Therefore, we have
failed to reject the null.
If chi-square is greater than the critical value, the discrepancy is significant. Therefore, we have
rejected the null.

Generally, the accepted p-value is 0.05, unless stated otherwise.

Degrees of freedom equals the number of possible states (categories) minus 1.

Steps in a chi-square test


1. Determine null hypothesis
2. Count observed values
3. Determine expected values
4. Calculate the chi-square value
5. Calculate degrees of freedom
6. Select p-value
7. Identify critical values
8. Compare chi-square to critical value
9. Reject, or fail to reject, the null hypothesis.
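
The steps above can be sketched in Python for the hetero-hetero cross from the start of this class. The observed counts (140 and 60) are made up for illustration. The expected counts come from the 75% / 25% ratio in a population of 200, and the critical values are the standard table entries for p = 0.05.

```python
# Critical values at p = 0.05, keyed by degrees of freedom (standard chi-square table).
CRITICAL_P05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Step 1: null hypothesis: any deviation from the 3:1 ratio is due to chance.
# Steps 2-3: observed counts (hypothetical) and expected counts from the ratio.
total = 200
observed = [140, 60]                     # dominant, recessive
expected = [0.75 * total, 0.25 * total]  # 150 dominant, 50 recessive

# Steps 4-7: chi-square value, degrees of freedom, critical value at p = 0.05.
chi2 = chi_square(observed, expected)
df = len(observed) - 1
critical = CRITICAL_P05[df]

# Steps 8-9: compare and decide.
reject_null = chi2 > critical
print(f"chi-square = {chi2:.3f}, df = {df}, critical = {critical}")
print("reject the null" if reject_null else "fail to reject the null")
```

Here chi-square is about 2.667, below the critical value of 3.841 for 1 degree of freedom, so with these made-up counts we fail to reject the null.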
End class 9/9

End Statistics Unit
