
Untitled Document

The document outlines a project proposal requiring the use of R to analyze a dataset with at least 100 observations and 10 variables, emphasizing the importance of not reusing datasets from class. It details the structure of the proposal and final report, including sections on introduction, analysis plan, regression analysis, discussion, and conclusion. Additionally, it provides resources for obtaining suitable datasets and highlights the goals of applying statistical regression analysis learned in the course.

Uploaded by

M .Tariq IJaz
Copyright
© All Rights Reserved

Project Proposal

Data Set:

The best strategy is to start with the questions of interest and then find the data.
You must use R for this project.

Minimum Requirements of the Data set:

●​ The data set should be large enough that multiple main effects and
interactions can be explored for your model.
●​ It should have at least 100 observations and at least 10 variables (both
quantitative and categorical variables).
●​ Do not reuse datasets used in lecture notes/homework/labs in class.
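
As a rough sketch, the size and variable-type requirements above can be checked in R before you commit to a data set. The helper name `check_requirements` and the simulated data frame `demo` are assumptions made for illustration only; substitute your own data. Note that a categorical variable stored as numeric codes would be counted as quantitative here, so inspect the variables yourself as well.

```r
# Hypothetical helper (not part of the assignment): tests the minimum
# requirements listed above on any data frame.
check_requirements <- function(df, min_obs = 100, min_vars = 10) {
  is_quant <- vapply(df, is.numeric, logical(1))
  is_categ <- vapply(
    df,
    function(x) is.factor(x) || is.character(x) || is.logical(x),
    logical(1)
  )
  c(
    enough_obs       = nrow(df) >= min_obs,
    enough_vars      = ncol(df) >= min_vars,
    has_quantitative = any(is_quant),
    has_categorical  = any(is_categ)
  )
}

# Simulated stand-in data: 120 rows, 9 numeric columns, 1 factor column.
set.seed(1)
demo <- data.frame(matrix(rnorm(120 * 9), nrow = 120))
demo$group <- factor(sample(c("a", "b", "c"), 120, replace = TRUE))

print(check_requirements(demo))
```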

Some Resources for Data

1. Kaggle https://www.kaggle.com/
2. IMDb Datasets https://www.imdb.com/interfaces/
3. ICPSR https://www.icpsr.umich.edu/web/pages/ICPSR/index.html

Project Proposal:

In the proposal, include the following:

1. Introduction
State your objectives and the research questions you want to explore.
Explain your main motivation for exploring these questions.
State your hypothesis for each question.
2. Analysis plan
Describe your data set and cite its source.
Explain all variables (observed variable, predictor variables, quantitative
variables, qualitative variables, etc.).
Describe the potential models that you want to use; you can update them
later. We will cover chapters 1-12 of the textbook, but you are free to
use non-linear regression models in your analysis. If you plan to do so,
add a section to your final project that explains the model thoroughly.
You can use our textbook as a reference for your statistical analysis.
3. Data
Give the dimensions (number of observations and variables).
Include a data dictionary (a description of every variable in the data
set). You can use the glimpse function in R to print a summary of your
data set and add it to this section.
You can also add some initial plots of your data set (e.g., a scatter plot).
4.​ References​
Include the appropriate references for any outside literature.
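
The Data section items above (dimensions, a glimpse summary, and an initial scatter plot) could be produced along the following lines. This is a minimal sketch: `project_data` and its columns are simulated placeholders, not a recommended data set, and `glimpse` is assumed to come from the dplyr package, with base R's `str` used as a fallback when dplyr is not installed.

```r
# Simulated stand-in data; replace with your own data set (e.g. via read.csv()).
set.seed(42)
n <- 150
project_data <- data.frame(
  score   = rnorm(n, mean = 70, sd = 10),
  hours   = runif(n, min = 0, max = 20),
  section = factor(sample(c("A", "B"), n, replace = TRUE))
)

# Dimensions: number of observations, then number of variables.
dims <- dim(project_data)
print(dims)

# Variable summary: glimpse() if dplyr is available, otherwise base R str().
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::glimpse(project_data)
} else {
  str(project_data)
}

# Initial scatter plot, with points coloured by the categorical variable.
plot(
  score ~ hours,
  data = project_data,
  col  = as.integer(project_data$section),
  main = "Score vs. hours (simulated data)"
)
```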

Goals of Final Project:

1.​ Apply the statistical regression analysis that you have learned in this class.
2.​ Prepare a strong report for your future career.

Project Report

The report should have at least the following sections:

Introduction

● This is based on your project proposal.
● Include your research question, hypotheses, and a description of the data.
● Include the exploratory data analysis.

Regression Analysis

● Your final regression model.
● A description of why you chose that type of model, with interpretations
of the coefficients and any interesting findings.
● A discussion of the model assumptions and an analysis of model fit.
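
As one possible illustration of the items above, the sketch below fits a linear model with two main effects and their interaction, prints the coefficient table, and draws the standard diagnostic plots used to examine model assumptions. The data frame `sim` and its variables `y`, `x`, and `g` are simulated assumptions; your own model and variables will differ.

```r
# Simulated data with a quantitative predictor x and a categorical predictor g.
set.seed(7)
n <- 200
x <- runif(n, min = 0, max = 10)
g <- factor(sample(c("control", "treated"), n, replace = TRUE))
treated <- as.numeric(g == "treated")
y <- 2 + 1.5 * x + 3 * treated + 0.8 * x * treated + rnorm(n)
sim <- data.frame(y, x, g)

# x * g expands to the main effects x and g plus the interaction x:g.
fit <- lm(y ~ x * g, data = sim)
print(summary(fit))

# Diagnostic plots: residuals vs. fitted, normal Q-Q, scale-location,
# and residuals vs. leverage.
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)
```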

Discussion & Limitations

● Add predictions and/or conclusions drawn from the model.
● Critique your own methods.
● Suggest improvements to your analysis.
● Assess the reliability and validity of your data.
● Assess the appropriateness of the regression analysis.
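
For the predictions mentioned above, a minimal sketch using `predict` with prediction intervals is shown here. The data frame `train`, the new observations in `new_obs`, and the simple one-predictor model are all assumptions chosen for illustration.

```r
# Simulated training data and a simple fitted model.
set.seed(11)
train <- data.frame(x = runif(120, min = 0, max = 10))
train$y <- 5 + 2 * train$x + rnorm(120)
fit <- lm(y ~ x, data = train)

# Hypothetical new observations at which to predict the response.
new_obs <- data.frame(x = c(2, 5, 8))

# Point predictions (fit) with 95% prediction intervals (lwr, upr).
pred <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)
print(pred)
```

Prediction intervals describe the uncertainty for an individual new observation; use `interval = "confidence"` instead if you want the uncertainty of the mean response.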

Conclusion

● Summarize your project
● Highlight your final points

Additional Work

● Include any other models you tried.
● Include any assumptions that you used.
● Explain why you did not select each of these models.
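
One way to document why a candidate model was not selected is to compare it with the final model directly. The sketch below compares two nested linear models with a partial F-test (`anova`) and with AIC; the data frame `d` and the models `small` and `large` are simulated assumptions, and other comparison criteria may suit your project better.

```r
# Simulated data in which x2 has no true effect on y.
set.seed(3)
n <- 150
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 2 * d$x1 + rnorm(n)

small <- lm(y ~ x1, data = d)        # candidate without x2
large <- lm(y ~ x1 + x2, data = d)   # candidate with x2

# Partial F-test: does adding x2 significantly reduce the residual sum of squares?
comparison <- anova(small, large)
print(comparison)

# AIC for each model; lower values indicate a better trade-off of fit and size.
aic_table <- AIC(small, large)
print(aic_table)
```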

Appendix

●​ Data Set
●​ Your Code
●​ Code Output
