Bayesian Models - Team - C

Team: Biostatistics – C

Team members' names and roll numbers:

1. Title of the Report: Bayesian Models for Air Quality Analysis

2. Data Description:
This report presents a Bayesian analysis of the Air Quality dataset, examining how Ozone
levels relate to three environmental factors: Solar Radiation (Solar.R), Wind Speed (Wind),
and Temperature (Temp). We fitted a Bayesian linear regression model using the rstanarm
package in R.

The dataset contains daily air quality measurements from New York taken from May
to September 1973. We removed rows with missing values, which left a dataset of 111
observations.

Variable    Datatype    Median (Q1, Q3)
Ozone       Numeric     31 (18, 62)
Solar.R     Numeric     207 (113.5, 255.5)
Wind        Numeric     9.70 (7.4, 11.5)
Temp        Numeric     79 (71, 84.5)
Month       Numeric     7 (6, 9)
Day         Numeric     16 (9, 22.5)

3. Methodology:

We fitted a Bayesian linear regression model to the air quality dataset, with Ozone as the
outcome variable and Solar.R, Wind, and Temp as predictors. Rows with missing values were
removed. The model used normal priors for the intercept and coefficients and a Cauchy prior
for the auxiliary parameter (the residual standard deviation, sigma). It was fitted using
4 chains of 2000 iterations each, and convergence was satisfactory. Model diagnostics included
posterior predictive checks and diagnostic plots, and the Shapiro-Wilk test was used to assess
the normality of the individual variables.

4. Experiments done

In this study, we performed a Bayesian linear regression analysis to investigate the
relationship between Ozone levels and the environmental variables Solar Radiation
(Solar.R), Wind Speed (Wind), and Temperature (Temp), using the rstanarm package in R.

The dataset contains missing values. We removed the affected rows before fitting, because
the model can only be estimated from complete observations.

We fitted the Bayesian linear regression model using the following specifications:

➢ Dependent Variable: Ozone
➢ Independent Variables: Solar.R, Wind, and Temp
➢ Model Configuration:
• Gaussian family with an identity link
• Normal priors for the intercept and coefficients
• Cauchy prior for the auxiliary parameter
• Four chains, each with 2000 iterations (including warm-up)
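
Written out, the configuration above corresponds roughly to the following model. The report does not state the prior locations and scales, so the symbols m and s below are placeholder notation, not values used by the authors. The Cauchy prior is shown as a half-Cauchy because sigma, the auxiliary parameter, is a standard deviation and must be positive.

```latex
\begin{align*}
\text{Ozone}_i &\sim \mathcal{N}(\mu_i, \sigma^2), \\
\mu_i &= \alpha + \beta_1\,\text{Solar.R}_i + \beta_2\,\text{Wind}_i + \beta_3\,\text{Temp}_i, \\
\alpha &\sim \mathcal{N}(m_\alpha, s_\alpha^2), \qquad
\beta_k \sim \mathcal{N}(m_k, s_k^2), \quad k = 1, 2, 3, \\
\sigma &\sim \text{Half-Cauchy}(0, s_\sigma).
\end{align*}
```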
4.1 Model Diagnostics

1 Posterior Distribution of Model Parameters

The posterior distributions of the model parameters were examined to ensure that the model
adequately captured the uncertainty and variability in the data.

2 Posterior Predictive Checks

Posterior predictive checks were conducted to assess how well the model's predictions
matched the observed data.

Posterior Predictive Distribution Plot

• The dark line represents the observed data (`y`).
• The lighter lines represent the simulated data from the model (`y_rep`).

3 Convergence Diagnostics
Convergence was evaluated through trace plots, Rhat values, and effective sample size
calculations. These diagnostics confirmed that the model converged appropriately.
MCMC diagnostics

Parameter        mcse    Rhat    n_eff
(Intercept)      0.4     1.0     2805
Solar.R          0.0     1.0     4949
Wind             0.0     1.0     3258
Temp             0.0     1.0     2962
sigma            0.0     1.0     3054
mean_PPD         0.0     1.0     3909
log-posterior    0.0     1.0     1942
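
To make the Rhat column more concrete, the following Python sketch computes a classic split-R-hat on simulated chains. It is an illustration only: the analysis in this report was run in R with rstanarm, whose own Rhat calculation may differ in detail. The function name `split_rhat`, the simulated draws, and the choice of 1000 draws per chain are all assumptions (the report states 2000 iterations including warm-up, but not the warm-up length).

```python
import math
import random
import statistics


def split_rhat(chains):
    """Classic split-R-hat for a list of equal-length chains of draws."""
    # Split every chain in half so within-chain drift also inflates R-hat.
    half = len(chains[0]) // 2
    halves = []
    for chain in chains:
        halves.append(chain[:half])
        halves.append(chain[half:2 * half])

    n = half
    means = [statistics.fmean(h) for h in halves]
    # W: average within-chain variance; B: n times the variance of chain means.
    w = statistics.fmean([statistics.variance(h) for h in halves])
    b = n * statistics.variance(means)
    var_plus = (n - 1) / n * w + b / n
    return math.sqrt(var_plus / w)


rng = random.Random(0)
# Four simulated chains that all target the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four simulated chains where two are stuck around a different value.
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0, 0, 3, 3)]

rhat_mixed = split_rhat(mixed)
rhat_stuck = split_rhat(stuck)
print(f"well-mixed chains: R-hat = {rhat_mixed:.2f}")
print(f"stuck chains:      R-hat = {rhat_stuck:.2f}")
```

Chains that agree with each other give a value close to 1, as in the table above, while chains that disagree push the value well above 1.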

4 Assumption Checks
4.1.1 Linearity

The relationship between the predictors and the dependent variable was examined by
plotting observed vs. fitted values.

Figure 2: Observed vs. Fitted Values Plot

4.1.2 Independence of Residuals

An autocorrelation plot was used to verify that residuals were independent of each other.
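
As an illustration of what such a plot displays, the Python sketch below computes the sample autocorrelation at a given lag, together with the approximate 95% bound of ±1.96/√n that is commonly drawn on autocorrelation plots. The function names and the example residuals are hypothetical; the report's own plot was produced in R from the fitted model's residuals.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given positive lag."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    numer = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return numer / denom


def white_noise_bound(n):
    """Approximate 95% bound for autocorrelations of independent data."""
    return 1.96 / math.sqrt(n)


# Hypothetical residuals that alternate in sign, i.e. clearly NOT independent.
alternating = [1.0, -1.0] * 50
lag1 = autocorrelation(alternating, 1)
bound = white_noise_bound(len(alternating))
print(f"lag-1 autocorrelation = {lag1:.2f}, bound = +/-{bound:.3f}")
```

For the 111 observations used here, `white_noise_bound(111)` is about 0.186, so residual autocorrelations inside roughly ±0.19 would be consistent with independence.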
4.1.3 Homoscedasticity

A plot of residuals vs. fitted values was created to check for homoscedasticity, confirming
that the residuals had constant variance.

4.1.4 Normality of Residuals:

The normality of residuals was assessed using a Q-Q plot.

Normal Q-Q Plot


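Shapiro-Wilk tests were also performed on each variable. Note that the Gaussian linear
model assumes normally distributed residuals rather than normally distributed predictors,
so the Q-Q plot of the residuals is the relevant check for that assumption.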

6. Results:

Model Summary:

• The 95% credible intervals for Wind and Temp exclude zero, indicating a clear negative
and a clear positive association with Ozone respectively. The Solar.R coefficient is small
and positive, with a lower interval bound that rounds to 0.0.

• The coefficients obtained were:

o Intercept: Mean = -66.5 (95% credible interval: -94.2 to -37.8)
o Solar.R: Mean = 0.1 (95% credible interval: 0.0 to 0.1)
o Wind: Mean = -3.3 (95% credible interval: -4.1 to -2.5)
o Temp: Mean = 1.7 (95% credible interval: 1.3 to 2.0)

• The posterior predictive mean was 40.4 with a standard deviation of 2.8.

Model Diagnostics:

• The Rhat values were approximately 1.0 for all parameters, indicating good
convergence.

• The autocorrelation plot of residuals showed no significant autocorrelation,
suggesting independence of residuals.

• The residuals vs. fitted values plot suggested homoscedasticity.

• The Q-Q plot for residuals indicated an approximately normal distribution.

Shapiro-Wilk Test Results:


• Solar.R: W = 0.93285, p-value = 2.957e-05

• Wind: W = 0.98076, p-value = 0.1099

• Ozone: W = 0.87355, p-value = 2.846e-08

• Temp: W = 0.98007, p-value = 0.09569

• Month: W = 0.86221, p-value = 9.556e-09

• Day: W = 0.96174, p-value = 0.002879

These results indicate that, at the 5% significance level, all variables except Wind and
Temp deviate from normality. This includes the outcome variable, Ozone.
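
The reading of the p-values can be reproduced with a few lines of Python. This sketch only classifies the p-values reported above and does not rerun the tests. The 0.05 threshold is an assumption, since the report does not state a significance level.

```python
# W statistics and p-values copied from the Shapiro-Wilk results listed above.
shapiro_results = {
    "Solar.R": (0.93285, 2.957e-05),
    "Wind": (0.98076, 0.1099),
    "Ozone": (0.87355, 2.846e-08),
    "Temp": (0.98007, 0.09569),
    "Month": (0.86221, 9.556e-09),
    "Day": (0.96174, 0.002879),
}

ALPHA = 0.05  # assumed significance level; the report does not state one

rejected = [name for name, (_, p) in shapiro_results.items() if p < ALPHA]
not_rejected = [name for name, (_, p) in shapiro_results.items() if p >= ALPHA]

print("normality rejected:    ", rejected)
print("normality not rejected:", not_rejected)
```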

Predictions:

• Two predictions were made with new data:


o For Solar.R = 190, Wind = 7, Temp = 80: Median predicted Ozone = 54.44
(Range: -18.07 to 131.55)
o For Solar.R = 250, Wind = 5, Temp = 90: Median predicted Ozone = 80.96
(Range: 7.97 to 157.77)
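
As a rough cross-check, the two scenarios can be plugged into the linear equation using the rounded posterior means from the model summary. The Python sketch below is not how the predictions above were generated, and the helper name `linear_predictor` is hypothetical. The results come out higher than the reported medians. This is most likely because the coefficients are rounded to one decimal place, and the Solar.R coefficient in particular multiplies values in the hundreds. The reported medians also come from simulated posterior predictive draws rather than from a single equation.

```python
# Posterior means as rounded in the report's model summary.
posterior_means = {"(Intercept)": -66.5, "Solar.R": 0.1, "Wind": -3.3, "Temp": 1.7}


def linear_predictor(solar_r, wind, temp, coefs=posterior_means):
    """Plug predictor values into the fitted linear equation."""
    return (
        coefs["(Intercept)"]
        + coefs["Solar.R"] * solar_r
        + coefs["Wind"] * wind
        + coefs["Temp"] * temp
    )


# The two new observations from the report.
first = linear_predictor(solar_r=190, wind=7, temp=80)
second = linear_predictor(solar_r=250, wind=5, temp=90)
print(round(first, 1), round(second, 1))  # 65.4 95.0
```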

7. Discussion:

By analysing the Air Quality data using Bayesian methods, we found some interesting
patterns about how different environmental factors affect Ozone levels:

• Solar Radiation (Solar.R): There's a slight increase in Ozone levels with more Solar
Radiation, but the effect isn’t very strong. It’s like saying a little more sunshine leads
to a small rise in Ozone.

• Wind Speed (Wind): Higher wind speeds are associated with lower Ozone levels.
This makes sense because stronger winds help disperse pollutants, leading to cleaner
air.

• Temperature (Temp): Warmer temperatures are linked to higher Ozone levels. This
fits with the idea that heat contributes to Ozone formation in the atmosphere.

The model we used worked well overall. It converged, the residual diagnostics raised no
concerns, and the predictions move in the expected direction, although the prediction
ranges are wide and the first one extends below zero. Several variables deviated from
normality in the Shapiro-Wilk tests, but the linear model does not require normally
distributed predictors, and the Q-Q plot indicated approximately normal residuals. Our
predictions for new data help us understand how changes in environmental conditions could
influence Ozone levels.

