Data Set:
The best strategy is to start with the questions of interest and then find the data.
You must use R for this project.
● The data set should be large enough that multiple main effects and
interactions can be explored for your model.
● It should have at least 100 observations and at least 10 variables (both
quantitative and categorical variables).
● Do not reuse datasets used in lecture notes/homework/labs in class.
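The size and variable-type requirements above can be checked in a few lines of R. The following is only a minimal sketch: `project_data` and `check_requirements` are hypothetical names, and the simulated data frame is a stand-in for your own data set.

```r
# Hypothetical stand-in data; replace `project_data` with your own data frame.
set.seed(1)
n <- 120
project_data <- data.frame(matrix(rnorm(n * 8), nrow = n))  # 8 quantitative columns
project_data$group  <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
project_data$region <- factor(sample(c("north", "south"), n, replace = TRUE))

# Hypothetical helper: returns one TRUE/FALSE flag per requirement.
check_requirements <- function(df) {
  is_quant <- vapply(df, is.numeric, logical(1))
  is_categ <- vapply(
    df,
    function(col) is.factor(col) || is.character(col) || is.logical(col),
    logical(1)
  )
  c(
    enough_rows      = nrow(df) >= 100,  # at least 100 observations
    enough_cols      = ncol(df) >= 10,   # at least 10 variables
    has_quantitative = any(is_quant),
    has_categorical  = any(is_categ)
  )
}

result <- check_requirements(project_data)
print(result)
```

Note that a categorical variable stored as integer codes would be counted as quantitative here, so convert such columns with `factor()` before running the check.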
Project Proposal:
1. Introduction
Introduce your objectives and the research questions you want to explore.
State your main motivation for exploring these questions.
State your hypotheses regarding these questions.
2. Analysis plan
Describe your data set and cite its source.
Explain all variables (observed (response) variable, predictor variables,
quantitative variables, qualitative variables, etc.).
Explain the potential models that you want to use (you can, of course,
update them later). We will cover chapters 1-12 of the textbook, but you are
free to use non-linear regression models in your analysis. If you plan to
do so, you need to add a section to your final project and explain the
model thoroughly. You can use our textbook as a reference for your
statistical analysis.
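As one possible illustration of a "potential model", the sketch below compares a main-effects model with a model that adds an interaction. It is an example under stated assumptions, not a required approach: the data are simulated, and the names `sim`, `fit_main`, and `fit_int` are hypothetical.

```r
# Simulated data with one quantitative and one categorical predictor (assumption).
set.seed(2)
n <- 150
sim <- data.frame(
  x1    = rnorm(n),
  group = factor(sample(c("a", "b"), n, replace = TRUE))
)
is_b <- as.numeric(sim$group == "b")
sim$y <- 1 + 2 * sim$x1 + 0.5 * is_b + 1.5 * sim$x1 * is_b + rnorm(n)

# Main effects only.
fit_main <- lm(y ~ x1 + group, data = sim)

# Main effects plus the x1-by-group interaction (x1 * group expands to
# x1 + group + x1:group).
fit_int <- lm(y ~ x1 * group, data = sim)

# Partial F-test comparing the two nested models.
anova_tab <- anova(fit_main, fit_int)
print(anova_tab)
```

In the proposal it is enough to write down candidate formulas like these; the fitted results belong in the final report.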
3. Data
Add the dimensions (number of observations and variables).
Include a data dictionary (a description of every variable in the data set).
You can use the glimpse function in R to print a summary of your data set
and add it to this section.
You can also add some initial plots of your data set (e.g., a scatter plot).
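The items in this section could be produced along the following lines. This is a hedged sketch: `example_data` and its columns are simulated placeholders, and because `glimpse` comes from the dplyr package, the code falls back to base R's `str()` when dplyr is not installed.

```r
# Simulated placeholder data; replace `example_data` with your own data frame.
set.seed(3)
n <- 100
example_data <- data.frame(
  height = rnorm(n, mean = 170, sd = 10),
  group  = factor(sample(c("a", "b"), n, replace = TRUE))
)
example_data$weight <- 0.5 * example_data$height + rnorm(n, sd = 5)

# Dimensions: number of observations (rows) and variables (columns).
dims <- dim(example_data)
print(dims)

# Compact summary of every variable: dplyr::glimpse if available, else str().
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::glimpse(example_data)
} else {
  str(example_data)
}

# Initial scatter plot of the response against one quantitative predictor.
plot(weight ~ height, data = example_data,
     xlab = "height", ylab = "weight",
     main = "Initial scatter plot")
```

The printed summary shows only names and types, so the data dictionary still needs a written description of each variable, including its units.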
4. References
Include the appropriate references for any outside literature.
1. Apply the statistical regression analysis that you have learned in this class.
2. Prepare a strong report for your future career.
Project Report
Introduction
Regression Analysis
Conclusion
Additional Work
Appendix
● Data Set
● Your Code
● Code Output