Bayesian Models - Team - C
2. Data Description:
This report presents a Bayesian analysis of the Air Quality dataset, examining how Ozone
levels relate to three environmental factors: Solar Radiation (Solar.R), Wind Speed (Wind),
and Temperature (Temp). We fitted a Bayesian linear regression model using the rstanarm
package in R.
The dataset contains daily air quality measurements from New York, taken from May to
September 1973. We excluded every observation with a missing value, which left a dataset
of 111 complete observations.
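The preparation step can be sketched as follows. This is a minimal sketch that assumes the data are R's built-in `airquality` data frame (the variable names match, but the report does not state how the data were loaded); the object names `vars` and `aq` are our own.

```r
# Assumption: the report's data are R's built-in `airquality` data frame.
data(airquality)

# Keep the outcome and the three predictors used in the report.
vars <- c("Ozone", "Solar.R", "Wind", "Temp")

# Drop every row that has a missing value in any of these columns.
aq <- na.omit(airquality[, vars])

# Number of complete observations that remain.
nrow(aq)
```

Restricting the data frame to the four modelled columns before calling `na.omit()` ensures that rows are dropped only for missing values in variables the model actually uses.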
3. Methodology:
We fitted a Bayesian linear regression model to the air quality dataset, with Ozone as the
outcome variable and Solar.R, Wind, and Temp as predictors. Observations with missing
values were removed. The model was specified with normal priors for the coefficients and
the intercept, and a Cauchy prior for the auxiliary parameter (the residual standard
deviation). It was fitted using 4 chains of 2000 iterations each, and the chains converged
satisfactorily. Model diagnostics included posterior predictive checks and diagnostic plots,
and the Shapiro-Wilk test was used to assess the normality of the predictor variables.
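A model of this form might be specified in rstanarm roughly as shown here. This is a sketch under stated assumptions, not the exact call used for the report: the prior locations and scales, the seed, and the object names `aq` and `fit` are illustrative, since the report does not give them, and the data are assumed to be R's built-in `airquality`.

```r
library(rstanarm)

# Assumption: the data are R's built-in `airquality`, complete cases only.
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# The prior locations and scales below are illustrative placeholders,
# not the values used in the report.
fit <- stan_glm(
  Ozone ~ Solar.R + Wind + Temp,
  data = aq,
  family = gaussian(),
  prior = normal(location = 0, scale = 2.5, autoscale = TRUE),            # coefficients
  prior_intercept = normal(location = 0, scale = 2.5, autoscale = TRUE),  # intercept
  prior_aux = cauchy(location = 0, scale = 5),                            # residual SD (sigma)
  chains = 4,
  iter = 2000,
  seed = 1234,   # arbitrary seed, for reproducibility only
  refresh = 0    # suppress sampler progress output
)

# Posterior summaries; the Rhat column should be close to 1 for every parameter.
print(summary(fit), digits = 2)
```

With `autoscale = TRUE`, rstanarm rescales the normal priors to the scale of the data, which keeps them weakly informative even though the predictors are measured in very different units.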
4. Experiments done
The dataset contains missing values. We removed the incomplete observations before fitting,
since complete data is crucial for obtaining reliable model estimates.
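We fitted the Bayesian linear regression model using the specification described in the
Methodology section: normal priors for the coefficients and intercept, a Cauchy prior for the
auxiliary parameter, and 4 chains of 2000 iterations.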
The posterior distributions of the model parameters were examined to ensure that the model
adequately captured the uncertainty and variability in the data.
Posterior predictive checks were conducted to assess how well the model's predictions
matched the observed data.
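A posterior predictive check of this kind could look like the sketch below. It assumes R's built-in `airquality` data and, for brevity, rstanarm's default priors rather than the priors used in the report; the names `aq`, `fit`, `yrep`, and `rep_means` are our own.

```r
library(rstanarm)

# Assumption: the data are R's built-in `airquality`, complete cases only.
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# Default rstanarm priors are used here for brevity (an assumption).
fit <- stan_glm(Ozone ~ Solar.R + Wind + Temp, data = aq,
                chains = 4, iter = 2000, seed = 1234, refresh = 0)

# Graphical check: density of the observed Ozone values overlaid with
# densities of 50 datasets simulated from the posterior predictive distribution.
pp_check(fit, plotfun = "dens_overlay", nreps = 50)

# Numerical check: compare the observed mean with the means of simulated datasets.
yrep <- posterior_predict(fit, draws = 500)  # 500 x n matrix of simulated outcomes
rep_means <- rowMeans(yrep)                  # one mean per simulated dataset
c(observed = mean(aq$Ozone),
  replicated = mean(rep_means),
  replicated_sd = sd(rep_means))
```

If the simulated densities track the observed density and the observed mean falls well inside the spread of replicated means, the model reproduces the main features of the data.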
4.1 Assumption Checks:
4.1.1 Linearity
The relationship between the predictors and the dependent variable was examined by
plotting observed vs. fitted values.
4.1.2 Independence
An autocorrelation plot was used to verify that residuals were independent of each other.
4.1.3 Homoscedasticity
A plot of residuals vs. fitted values was created to check for homoscedasticity, confirming
that the residuals had constant variance.
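The three checks above can be sketched with base R graphics. Note the substitution: to keep the example short, it uses an ordinary least squares fit from `lm()` as a stand-in for the Bayesian model, so the residuals are not the ones examined in the report. The data are assumed to be R's built-in `airquality`, and all object names are our own.

```r
# Assumption: the data are R's built-in `airquality`, complete cases only.
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

# Stand-in for the Bayesian fit: ordinary least squares on the same formula.
ols <- lm(Ozone ~ Solar.R + Wind + Temp, data = aq)
fitted_vals <- fitted(ols)
res <- residuals(ols)

op <- par(mfrow = c(1, 3))  # three panels side by side

# Linearity: points should scatter around the dashed 45-degree line.
plot(fitted_vals, aq$Ozone, xlab = "Fitted", ylab = "Observed",
     main = "Observed vs. fitted")
abline(0, 1, lty = 2)

# Independence: bars beyond lag 0 should mostly stay inside the dashed bounds.
acf(res, main = "Residual autocorrelation")

# Homoscedasticity: the vertical spread should be roughly constant across fitted values.
plot(fitted_vals, res, xlab = "Fitted", ylab = "Residual",
     main = "Residuals vs. fitted")
abline(h = 0, lty = 2)

par(op)  # restore the previous plotting layout
```

Because removing incomplete rows leaves gaps in the daily sequence, the lags in the autocorrelation plot correspond to consecutive retained observations rather than to consecutive calendar days.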
6. Results:
Model Summary:
• The model identified significant relationships between Ozone and the predictors.
• The posterior predictive mean was 40.4 with a standard deviation of 2.8.
Model Diagnostics:
• The Rhat values were approximately 1.0 for all parameters, indicating good
convergence.
The Shapiro-Wilk tests indicate that the variables tested, with the exception of Wind and
Temp, deviate from normality.
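Shapiro-Wilk tests on the predictors can be run with base R as sketched here. The sketch assumes R's built-in `airquality` data, complete cases only, and a conventional 0.05 threshold; the report does not state which rows or which threshold were used.

```r
# Assumption: the data are R's built-in `airquality`, complete cases only.
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

predictors <- c("Solar.R", "Wind", "Temp")

# Run the Shapiro-Wilk test on each predictor and collect the p-values.
p_values <- sapply(aq[predictors], function(x) shapiro.test(x)$p.value)

round(p_values, 4)

# TRUE marks a predictor whose distribution departs from normality
# at the (assumed) 0.05 level.
p_values < 0.05
```

Linear regression does not require normally distributed predictors, so these tests describe the data rather than validate the model.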
Predictions:
7. Discussion:
By analysing the Air Quality data using Bayesian methods, we found several patterns in how
different environmental factors affect Ozone levels:
• Solar Radiation (Solar.R): Ozone levels increase slightly with more Solar Radiation,
but the effect is not very strong. In other words, a little more sunshine corresponds
to a small rise in Ozone.
• Wind Speed (Wind): Higher wind speeds are associated with lower Ozone levels.
This makes sense because stronger winds help disperse pollutants, leading to cleaner
air.
• Temperature (Temp): Warmer temperatures are linked to higher Ozone levels. This
fits with the idea that heat contributes to Ozone formation in the atmosphere.
The model we used worked well overall. It fit the data well and the predictions appear
reliable. While some predictors showed minor deviations from normality, these did not
significantly affect the results. Our predictions for new data help us understand how changes
in environmental conditions could influence Ozone levels.