Data-Analysis-using-R

This document provides a comprehensive guide on conducting data analysis using R Markdown, including data visualization and regression analysis with the 'wage1' dataset. It covers essential steps such as loading packages, performing descriptive statistics, creating visualizations, and interpreting regression results. The document also emphasizes reproducibility and adaptability for different datasets and output formats.

Uploaded by

khushalmalhotrabchc23

Data Analysis using R

2025-04-14

R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF,
and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the
output of any embedded R code chunks within the document. Adding echo = FALSE to a code chunk
prevents the R code that generated its output from being printed.
Data Source
library(wooldridge)
library(tidyverse)
library(stargazer)
data("wage1")
Data Overview
head(wage1)
summary(wage1)
str(wage1)
Descriptive Statistics
summary(wage1 %>% select(wage, educ, exper, tenure))
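summary() reports quartiles and the mean but not the standard deviation or the number of non-missing observations. The sketch below is one possible way to add them in base R. The helper name describe_vars and the toy data frame are assumptions made for illustration; the toy values are made up and are not taken from wage1.

```r
# Hypothetical helper: one row per variable with n, mean, sd, min, and max.
describe_vars <- function(df, vars) {
  t(sapply(df[vars], function(x) {
    c(n    = sum(!is.na(x)),
      mean = mean(x, na.rm = TRUE),
      sd   = sd(x, na.rm = TRUE),
      min  = min(x, na.rm = TRUE),
      max  = max(x, na.rm = TRUE))
  }))
}

# Toy stand-in data (made-up values, NOT from wage1).
toy <- data.frame(
  wage = c(4, 5, 7, 8),
  educ = c(10, 12, 14, 16)
)

stats <- describe_vars(toy, c("wage", "educ"))
print(round(stats, 2))
```

With the packages and data from this document loaded, the same helper could be called as describe_vars(wage1, c("wage", "educ", "exper", "tenure")).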
Data Visualization
Histogram of Wage
ggplot(wage1, aes(x = wage)) +
  geom_histogram(aes(y = ..count..), binwidth = 1, color = "black", fill = "blue") +
  labs(title = "Histogram of Wage", x = "Wage", y = "Frequency")
Scatterplot of Wage vs. Education
ggplot(wage1, aes(x = educ, y = wage)) +
  geom_point() +
  labs(title = "Scatterplot of Wage vs. Education", x = "Education (Years)", y = "Wage") +
  geom_smooth(method = "lm", se = FALSE, color = "red")
Regression Analysis
Bivariate Regression: Wage on Education
reg1 <- lm(wage ~ educ, data = wage1)
summary(reg1)
Bivariate Regression: Wage on Experience
reg2 <- lm(wage ~ exper, data = wage1)
summary(reg2)
Multivariate Regression: Wage on Education, Experience, and Tenure
reg3 <- lm(wage ~ educ + exper + tenure, data = wage1)
summary(reg3)
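The printed summary() of a model is convenient to read, but the numbers in it can also be pulled out as ordinary R objects for further use. The following is a minimal sketch of that idea on a toy data frame. The values are made up for illustration and are not from wage1, so the estimates say nothing about the real data.

```r
# Toy stand-in data (made-up values, NOT from wage1).
toy <- data.frame(
  wage = c(4, 5, 7, 8),
  educ = c(10, 12, 14, 16)
)

fit <- lm(wage ~ educ, data = toy)

# Coefficient table as a numeric matrix:
# columns are Estimate, Std. Error, t value, Pr(>|t|).
coef_table <- coef(summary(fit))
print(coef_table)

# Individual quantities, extracted by name.
slope     <- coef(fit)[["educ"]]      # change in wage per extra year of educ
r_squared <- summary(fit)$r.squared   # share of wage variation explained
```

The same calls should work on reg1, reg2, and reg3 once wage1 is loaded.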
Regression Table
stargazer(reg1, reg2, reg3, title = "Regression Results", type = "text",
          dep.var.labels = "Wage",
          covariate.labels = c("Education", "Experience", "Tenure"))
Conclusion
R Code Explanation and Important Considerations
• R Markdown/Quarto Setup:
– The YAML header (the part between the ---) sets the title, author, date, and output format of
the document.
– knitr::opts_chunk$set(echo = TRUE) ensures that R code is shown in the output document.
• Loading Packages and Data:
– library(wooldridge) loads the wooldridge package.
– library(tidyverse) loads the tidyverse package, which includes ggplot2 for plotting.
– library(stargazer) loads the stargazer package for creating regression tables.
– data("wage1") loads the wage1 dataset.
• Data Overview:
– head(), summary(), and str() provide initial information about the data.
• Descriptive Statistics:
– summary(wage1 %>% select(wage, educ, exper, tenure)) calculates descriptive statistics for
the selected variables. %>% is the pipe operator (from magrittr, loaded with the tidyverse); it
passes the data on its left to the function on its right, making the code more readable.
• Data Visualization:
– ggplot2 is used to create the histogram and scatterplot. It’s part of the tidyverse.
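– In the histogram, aes(y = ..count..) puts the number of observations per bin on the y-axis. In
ggplot2 3.4.0 and later this notation is deprecated in favor of aes(y = after_stat(count)).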
– In the scatterplot, geom_smooth(method = "lm", se = FALSE, color = "red") adds a linear
regression line.
• Regression Analysis:
– lm(wage ~ educ, data = wage1) performs a linear regression of wage on educ.
– summary(reg1) displays the regression results (coefficients, standard errors, t-statistics,
p-values, R-squared).
– I’ve included interpretations of the regression output within the text. This is crucial!
• Regression Table:
– stargazer() creates a formatted regression table. The type = "text" argument used here gives
plain text output. You can change it to "html" for display in a web browser or an HTML
document, or to "latex" for LaTeX output (if you’re using LaTeX). covariate.labels
relabels the variables in the table.
• Inline R Code:
– I’ve used inline R code (e.g., `r mean(wage1$wage)`) to insert calculated values directly into
the text. This makes the report dynamic and ensures that the numbers are consistent with the
analysis.
• Interpretation: I’ve provided detailed interpretations of the statistics and regression results. This is
essential for the assignment.
• Reproducibility: The R Markdown/Quarto document is reproducible because it contains both the
code and the narrative. If you run the document with the same data and package versions, you’ll
get the same results.
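To make the inline-code point above concrete, here is a small sketch of how values might be computed once in a chunk and then reused in the narrative. The object names (mean_wage, educ_slope) and the toy data are assumptions for illustration; the numbers are made up and are not wage1 results.

```r
# Toy stand-in data (made-up values, NOT from wage1).
toy <- data.frame(
  wage = c(4, 5, 7, 8),
  educ = c(10, 12, 14, 16)
)
fit <- lm(wage ~ educ, data = toy)

# Compute and round the values once, in a regular code chunk.
mean_wage  <- round(mean(toy$wage), 2)
educ_slope <- round(coef(fit)[["educ"]], 2)

# Outside a knitted document, sprintf() shows what the finished sentence
# would look like once the values are filled in.
sentence <- sprintf(
  "The mean wage is %.2f; one more year of education goes with a %.2f higher wage.",
  mean_wage, educ_slope
)
cat(sentence, "\n")
```

In the .Rmd text itself, the same objects would be referenced with inline expressions such as `r mean_wage` and `r educ_slope`, so the sentence updates whenever the data or the model changes.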

To Use This Code:

1. Save: Save the code as an R Markdown file (e.g., assignment.Rmd) or a Quarto file (assignment.qmd).
2. Install Packages: Make sure you have the necessary packages installed:
install.packages(c("wooldridge", "tidyverse", "stargazer", "knitr"))
3. Run the Document: In RStudio, open the R Markdown/Quarto file and click the “Knit” button
(“Render” for a Quarto file), or use the rmarkdown::render() or quarto::quarto_render() function
in the console, to generate the output document (PDF, Word, or HTML).
4. Adapt:
• Dataset: If you want to use a different dataset, change the data("wage1") line and adjust the
variable names in the code accordingly. Use data(package = "wooldridge") to see a list of the
datasets.
• Variables: Select different variables for your descriptive statistics, visualizations, and regressions.
• Interpretations: Modify the interpretations to match your chosen dataset and variables.
• Output Format: Change the format in the YAML header if you want a different output format
(e.g., output: word_document in R Markdown, or format: docx in Quarto).
• Title/Author/Date: Update the title, author, and date.
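For the package-installation step, running install.packages() on everything each time is slow and needs a network connection. One possible alternative is to check first which packages are missing. The helper name missing_packages below is hypothetical; it only wraps the base functions installed.packages() and setdiff().

```r
# Hypothetical helper: which of the requested packages are not yet installed?
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# The packages this document relies on.
needed <- c("wooldridge", "tidyverse", "stargazer", "knitr")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
  # Uncomment to install them (needs an internet connection):
  # install.packages(to_install)
} else {
  message("All required packages are installed.")
}
```

The install call is left commented out so that running the snippet never changes your library by accident.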

This comprehensive example should give you a very strong starting point for your assignment! Remember
to adapt it carefully to your chosen dataset and provide thorough interpretations.
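When adapting the analysis to another dataset, it can help to keep the variable names in one place instead of editing every lm() call. The sketch below is one way to do that with base R's reformulate(). The function name run_regression is hypothetical, and the built-in mtcars data is used only as a stand-in for "a different dataset".

```r
# Hypothetical helper: build the formula from variable names, then fit by OLS.
run_regression <- function(data, outcome, predictors) {
  model_formula <- reformulate(predictors, response = outcome)
  lm(model_formula, data = data)
}

# Stand-in example with a dataset that ships with R.
fit <- run_regression(mtcars, outcome = "mpg", predictors = c("wt", "hp"))
print(coef(fit))
```

With wage1 loaded, run_regression(wage1, "wage", c("educ", "exper", "tenure")) should fit the same model as reg3 above. The interpretations still have to be rewritten by hand for each new dataset.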
