Lab Note
**Hypothesis**:
We want to test whether studying with music improves test scores. Our
hypotheses would be:
- Null Hypothesis (H0): Studying with music does not improve test scores.
- Alternative Hypothesis (H1): Studying with music improves test scores.
**Experiment**:
We gather a group of students and randomly assign them to two groups:
one group studies with music, and the other studies in silence. After
studying, we give them a math test.
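The random assignment step could be sketched in base R as follows. The number of students (20) and their IDs are made up for illustration, since the notes do not record them.

```r
set.seed(42)  # fix the random seed so the assignment is reproducible

# Hypothetical student IDs; the real group size is not given in the notes.
students <- paste0("student_", 1:20)

# Build a balanced set of labels, then shuffle them at random.
group <- sample(rep(c("music", "silence"), each = length(students) / 2))

assignment <- data.frame(student = students, group = group)
table(assignment$group)  # shows how many students landed in each group
```

Shuffling a balanced label vector keeps the two groups the same size while leaving membership to chance.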
**Results**:
We calculate the average test scores for each group and find that the group
studying with music has a slightly higher average score.
**P-value**:
The p-value measures the strength of evidence against the null hypothesis
in a statistical hypothesis test. It is the probability of observing
results at least as extreme as ours if the null hypothesis were true. In
simpler terms, it tells us how likely results like ours would be if only
random chance were at work. It is not the probability that the null
hypothesis itself is true.
A small p-value (typically less than 0.05) indicates that the observed
results would be unlikely if the null hypothesis were true, which counts
as strong evidence against the null hypothesis. On the other hand, a large
p-value means the observed results are consistent with the null
hypothesis, which counts as weak evidence against it.
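One way to make this definition concrete is a permutation test, sketched below. This is only an illustration under assumptions: the scores are invented, and the notes do not say which statistical test was actually used.

```r
# Hypothetical test scores; these numbers are made up for illustration.
music   <- c(78, 85, 90, 72, 88, 81, 79, 93)
silence <- c(75, 80, 70, 68, 84, 77, 73, 82)

observed_diff <- mean(music) - mean(silence)

set.seed(1)  # make the shuffling reproducible
scores  <- c(music, silence)
n_music <- length(music)

# If H0 is true, group labels are arbitrary, so we shuffle the scores many
# times and record the difference in means that chance alone produces.
perm_diffs <- replicate(10000, {
  shuffled <- sample(scores)
  mean(shuffled[1:n_music]) - mean(shuffled[-(1:n_music)])
})

# One-sided p-value: share of shuffles whose difference is at least as
# large as the one we observed.
p_value <- mean(perm_diffs >= observed_diff)
p_value
```

The p-value here is literally the fraction of "chance only" shuffles that look at least as extreme as the real data.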
We conduct a statistical test, and let's say we get a p-value of 0.03. This
means that if the null hypothesis were true (studying with music does not
improve test scores), there is only a 3% chance of observing the difference
in test scores that we did, or even more extreme differences, just by
random chance.
Since the p-value (0.03) is less than the common significance level of 0.05,
we reject the null hypothesis and conclude that there is evidence to support
the claim that studying with music improves test scores.
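The decision rule above might look like this in R. The scores are hypothetical, so the resulting p-value will not match the 0.03 used in the example, and the choice of a one-sided Welch t-test is an assumption, since the notes only say "a statistical test".

```r
# Hypothetical test scores; these numbers are made up for illustration.
music   <- c(78, 85, 90, 72, 88, 81, 79, 93)
silence <- c(75, 80, 70, 68, 84, 77, 73, 82)

# One-sided test because H1 says music *improves* scores.
# t.test() uses the Welch (unequal variance) version by default.
result <- t.test(music, silence, alternative = "greater")

alpha <- 0.05  # common significance level
result$p.value

if (result$p.value < alpha) {
  message("Reject H0: evidence that studying with music improves scores.")
} else {
  message("Fail to reject H0: not enough evidence of an improvement.")
}
```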
WHAT IS ~
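Here the tilde ~ is used inside the summarise_all function from the dplyr
package in R, which is part of the tidyverse suite of data manipulation
tools. In this context, ~ introduces a short anonymous function (a
formula-style lambda) in which . stands for the column being summarised.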
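A minimal sketch of the tilde inside summarise_all, using a small made-up data frame (the column names a and b are hypothetical). Newer dplyr versions recommend across() for the same job, so that form is shown as well.

```r
library(dplyr)

# Small made-up data frame with some missing values.
df <- tibble(a = c(1, 2, NA), b = c(4, NA, 6))

# ~ mean(., na.rm = TRUE) is shorthand for function(x) mean(x, na.rm = TRUE);
# summarise_all applies it to every column.
means <- df %>% summarise_all(~ mean(., na.rm = TRUE))

# Equivalent form with across(), where .x plays the role of the column.
means_across <- df %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE)))

means  # a = 1.5, b = 5
```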
WHAT IS %>%
The %>% operator in R is known as the pipe operator. This operator allows
you to pass the result of one expression as the first argument to the next
expression, facilitating a more readable and streamlined syntax for
chaining together multiple functions. The use of %>% greatly enhances
the readability of code by presenting a sequence of operations in a logical
and linear fashion, akin to how one might describe the process verbally:
"Take the numbers 1 to 10, then calculate the square roots, and then sum
them."
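The verbal description above translates almost word for word into a pipeline. The sketch loads magrittr, the package that provides %>% (dplyr makes it available too).

```r
library(magrittr)

# "Take the numbers 1 to 10, then calculate the square roots, then sum them."
piped <- 1:10 %>% sqrt() %>% sum()

# The same computation written as nested calls, read inside-out.
nested <- sum(sqrt(1:10))

piped  # about 22.47
```

Both forms give the same result; the piped version simply reads left to right in the order the steps happen.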
REPLACE NA NUMERIC
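The notes leave this step without a description, so the following is only one possible approach, assuming missing numeric values are filled with the column mean. The data frame and its column names (length_mm, weight_g) are made up for illustration.

```r
library(dplyr)

# Hypothetical data with missing numeric values.
measurements <- tibble(
  species   = c("A", "A", "B", "B"),
  length_mm = c(10, NA, 20, 30),
  weight_g  = c(NA, 100, 200, 300)
)

# Assumption: replace each NA with the mean of its column.
# across(where(is.numeric), ...) touches numeric columns only.
filled_numeric <- measurements %>%
  mutate(across(where(is.numeric),
                ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x)))

filled_numeric
```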
REPLACE NA CATEGORICAL
Finding the Most Common "Sex" per Species: This is done by grouping
by species, filtering out NA values in the sex column, and then summarising
each group to find the most common value of sex. The result is stored in
most_common_sex.
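The steps just described could be sketched as follows. The data frame is a made-up stand-in for the real dataset, and the column name common_sex is hypothetical. The final fill step is an assumption about how most_common_sex is used afterwards, since the notes only describe how it is computed.

```r
library(dplyr)

# Hypothetical data standing in for the dataset used in the notes.
animals <- tibble(
  species = c("A", "A", "A", "B", "B", "B"),
  sex     = c("male", "male", NA, "female", "female", NA)
)

# Group by species, drop rows with missing sex, then pick the most
# frequent value of sex within each group.
most_common_sex <- animals %>%
  group_by(species) %>%
  filter(!is.na(sex)) %>%
  summarise(common_sex = names(which.max(table(sex))), .groups = "drop")

# Assumed follow-up: fill each missing sex with its species' most common value.
filled <- animals %>%
  left_join(most_common_sex, by = "species") %>%
  mutate(sex = coalesce(sex, common_sex)) %>%
  select(-common_sex)

most_common_sex
```

If two values are tied within a species, which.max returns the first one in table order, so ties are broken alphabetically for character data.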