RProgrammingLanguage-Workshop (1) 240521 145940
net/publication/366040774
1 author:
Ahmed Elshahhat
Zagazig University
All content following this page was uploaded by Ahmed Elshahhat on 06 December 2022.
Workshop
By
December 2022
DOI: 10.13140/RG.2.2.15044.09607
Overview
Data Structures
R Statistics
R Graphics
Inference
1943-2022
Outline
1 Overview
2 Data Structures
3 R Statistics
4 R Graphics
5 Inference
Learning Objectives
3 R Statistics
4 R Graphics:
Bar & Box
Histogram & Density
Heatmap
Pairs
QQ
3D
5 Inference:
Parameter Estimation
Monte Carlo of Parameter Estimation
Linear Regression Models
Monte Carlo of Linear Regression Models
Schedule:
Activity Time (in Minutes)
Overview 30
Data Structures 35
R Statistics 20
R Graphics 35
Inference 60
Dataset: All R scripts used in this workshop are available within
these slides.
Installation Requirements: Download and install the latest version of R.
1 Overview
What is R?
R Advantages
1 R is open source.
2 R has a large user community.
3 R produces outstanding graphical output.
4 R is easy to learn and understand.
5 More than 18,000 packages are available free of charge.
6 R runs on macOS, Linux, and Microsoft Windows.
7 R is cross-platform, so the same code runs on many operating systems.
8 R is excellent for simulation, programming, computer-intensive analyses, etc.
9 Anyone is welcome to contribute bug fixes, code enhancements, and new packages.
10 Built-in help for base functions is available without an internet connection.
R Restrictions
Installing R System
R Installation
Installing add-on Packages
R Session
R Console
R Session
R Editor
R Session
Interactive R sessions
R Programming Tools
Arithmetic Operators
Commonly Used Functions
table              counts of each value
c                  concatenate values into a vector
print              show a value
which              indices of TRUE values
length             number of values
summary            generic summary statistics
dim                dimensions of a matrix or array
min, max           minimum, maximum
help(), ?          show documentation
rbind, cbind       bind vectors as rows or as columns
class              class of an object
apply              apply a function over rows or columns
sort, order, rank  sorted values, sorting permutation, ranks
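A few of the functions listed above can be tried on a small vector. This is only a minimal sketch; the vector `v` and the matrix `m` are hypothetical illustration data, not taken from the workshop.

```r
# Hypothetical illustration data (not from the workshop slides)
v <- c(3, 1, 2, 3, 5, 3)

table(v)          # counts of each distinct value
length(v)         # number of values
which(v == 3)     # indices where the condition is TRUE
sort(v)           # values in increasing order
order(v)          # permutation of indices that sorts v
rank(v)           # ranks (ties are averaged by default)

m <- rbind(1:3, 4:6)   # bind two vectors as the rows of a matrix
dim(m)                 # number of rows and columns
apply(m, 1, sum)       # apply sum() over the rows
class(m)               # class of the object
```

Here `apply(m, 1, sum)` works over rows; using `2` as the second argument would work over columns instead.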
29 / 172 Dr. Ahmed Elshahhat R Programming Language for Data Analytics
R Programming Tools
Commonly Used Functions, Cont’d
mean(x)                 average
var(x)                  variance
cor(x, y)               correlation
cov(x, y)               covariance
sqrt(x)                 square root
log10(x)                logarithm to base 10
sin(x), cos(x), tan(x)  trigonometric functions
log(x)                  natural logarithm
seq(x)                  sequence generation
median(x)               middle value of the sorted data
mad(x)                  median absolute deviation
d, p, q, r              prefixes for the density, cumulative probability, quantile, and random number generation functions of a distribution
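The `d`, `p`, `q`, `r` prefixes combine with a distribution name. The sketch below uses the normal distribution (`norm`) as an example; the seed and the evaluation points are arbitrary choices made for illustration.

```r
set.seed(1)              # arbitrary seed, for reproducibility only

dnorm(0)                 # density of N(0, 1) at 0, equal to 1 / sqrt(2 * pi)
pnorm(1.96)              # cumulative probability P(Z <= 1.96)
qnorm(0.975)             # quantile: the value z with P(Z <= z) = 0.975
r <- rnorm(5)            # five random draws from N(0, 1)

# p and q are inverse to each other
pnorm(qnorm(0.975))
```

The same pattern applies to other distributions available in base R, for example `dexp`, `pexp`, `qexp`, `rexp` for the exponential distribution.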
R Programming Tools
Probability Distribution Functions
R Programming Tools
Probability Distribution Functions, Cont’d
#1 R in Action
#8 R for Everyone
#9 The Book of R
https://www.coursera.org/learn/data-analysis-r
https://www.facebook.com/groups/2101100100212657/
Introduction to R (DataCamp)
Swirl: Learn R
https://swirlstats.com/
https://www.coursera.org/learn/business-analytics-r
https://www.coursera.org/learn/probability-intro
https://www.udemy.com/course/r-programming
#1 R-bloggers
Note: The R-bloggers website comprises the efforts of more than 750 R bloggers.
https://www.r-bloggers.com/
Note: In 2015, Microsoft acquired Inside-R’s parent company Revolution Analytics. One result of this
acquisition is the Microsoft R Application Network (MRAN).
https://mran.microsoft.com/
#3 Quick-R
Note: Professor Rob Kabacoff at Wesleyan University created this website to introduce you to R and its
applications.
https://www.statmethods.net/
#4 RStudio
Note: RStudio is an online learning page that links to tutorials and examples to help you master R and
related tools.
https://www.rstudio.com/
#5 Statistics Globe
Note: Statistics Globe is an education platform that provides free programming tutorials in R and Python
as well as theoretical explanations for the field of statistics and data science.
https://statisticsglobe.com/
In R, you might run into a situation or two that requires some expert help.
The websites listed can provide the assistance you need.
#6 Stack Overflow
#7 R Tutorial
Note: This R tutorial is designed for software programmers, statisticians, and data miners who want to
develop statistical software using R.
https://www.tutorialspoint.com/r/index.htm
#8 R Programming Tutorial
Note: R Programming Tutorial is designed for both beginners and professionals. The tutorial covers
basic and advanced concepts of data analysis and visualization.
https://www.javatpoint.com/r-tutorial
#9 RDocumentation
Note: RDocumentation enables you to search for R packages and functions that suit your needs.
https://www.rdocumentation.org/
#10 R Manuals
Note: If you want to go directly to the source, visit the R manuals page.
https://cran.r-project.org/manuals.html
2 Data Structures
R Data Types
R has six basic data types: logical, integer, numeric (double), complex, character, and raw.
print("abc") # Character
[1] "abc"
print(TRUE) # Logical
[1] TRUE
Note: The charToRaw() command converts each character of a string to its raw byte value. For ASCII
(American Standard Code for Information Interchange) characters this is the ASCII code, printed in hexadecimal.
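The six basic types can be inspected with `class()`. The following is a minimal sketch; the example values in `vals` are arbitrary and chosen only for illustration.

```r
# One example value per basic type (values are arbitrary)
vals <- list(TRUE, 2L, 3.14, 1 + 2i, "abc", charToRaw("abc"))

sapply(vals, class)   # class of each value, in the same order

# charToRaw() returns the byte values in hexadecimal: 61 62 63
charToRaw("abc")

# The same bytes as decimal ASCII codes: 97 98 99
as.integer(charToRaw("abc"))
```

Note that a literal such as `2` is stored as a double; the `L` suffix (`2L`) is what makes it an integer.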
R Data Structures
R has a wide variety of data structures, including vectors, matrices, arrays, factors, data frames, and lists.
Vectors
A vector is the basic data structure in R. All of its elements share one of six types: logical, integer,
double, complex, character, or raw.
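As a minimal sketch of working with vectors, the code below creates two integer vectors and applies element-wise arithmetic. The definitions `x <- 1:5` and `y <- 6:10` are assumptions for illustration, as are the operations shown.

```r
x <- 1:5       # assumed definition of x
y <- 6:10      # assumed definition of y

x + y          # element-wise sum
x * y          # element-wise product
sum(x * y)     # inner product of x and y
x[2:3]         # second and third elements of x
c(x, y)        # concatenate the two vectors
typeof(x)      # storage type of the elements
```

Arithmetic on vectors is applied element by element, so no explicit loop is needed for these operations.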
x; y
[1] 1 2 3 4 5
[1] 6 7 8 9 10
data <- rep(c(2, 4, 6), times = 3)
print(data) # Repeat the vector 3 times
[1] 2 4 6 2 4 6 2 4 6
Vectors, Cont’d
for (i in seq(1, 3, 0.5)) {
  print(i) # Print each value of the sequence from 1 to 3 in steps of 0.5
}
[1] 1
[1] 1.5
[1] 2
[1] 2.5
[1] 3
Vectors, Cont’d
data <- c(1, 2, 3, 4, 5, 6)
for (i in data) {
  if (i %% 2 == 0)
    print(i) # Print even integers
}
[1] 2
[1] 4
[1] 6
Vectors, Cont’d
data <- c(1, 2, 3, 4, 5, 6)
for (i in data) {
  if (i %% 2 == 1)
    print(i) # Print odd integers
}
[1] 1
[1] 3
[1] 5
Matrices
x = c(9, 1, 2, 3, 4, 5, 6, 7, 8)
A = matrix(x, 3, 3); print(A) # Create a 3x3 matrix (filled column by column)
     [,1] [,2] [,3]
[1,]    9    3    6
[2,]    1    4    7
[3,]    2    5    8
Matrices, Cont’d
A = matrix(c(9, 1, 2, 3, 4, 5, 6, 7, 8), nrow = 3, ncol = 3); print(A)
     [,1] [,2] [,3]
[1,]    9    3    6
[2,]    1    4    7
[3,]    2    5    8
dim(A) # Dimensions of A
[1] 3 3
det(A) # Determinant of A
[1] -27
Matrices, Cont’d
sum(diag(A)) # Trace of A
[1] 21
A[, 1]; A[1, ] # First column; first row of A
[1] 9 1 2
[1] 9 3 6
cbind(A[1, ]) # First row of A as a column
     [,1]
[1,]    9
[2,]    3
[3,]    6
colSums(A); rowSums(A) # Column sums; row sums of A
[1] 12 12 21
[1] 18 12 15
t(A) # Transpose of A
     [,1] [,2] [,3]
[1,]    9    1    2
[2,]    3    4    5
[3,]    6    7    8
Matrices, Cont’d
x = c(9, 1, 2, 3, 4, 5, 6, 7, 8) # Data x
y = c(0, 2, 4, 6, 8, 10, 12, 14, 16) # Data y
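The two data vectors can be turned into matrices and combined. The sketch below is only an illustration: the matrix names `A` and `B` and the particular operations are assumptions, since the extracted slides do not show what was done with `x` and `y`.

```r
x <- c(9, 1, 2, 3, 4, 5, 6, 7, 8)        # Data x
y <- c(0, 2, 4, 6, 8, 10, 12, 14, 16)    # Data y

A <- matrix(x, 3, 3)     # 3x3 matrix from x (filled column by column)
B <- matrix(y, 3, 3)     # 3x3 matrix from y (hypothetical name)

A + B                    # element-wise sum
A * B                    # element-wise product
A %*% B                  # matrix product
A_inv <- solve(A)        # inverse of A (exists because det(A) is non-zero)
A %*% A_inv              # identity matrix, up to rounding error
```

Note the difference between `*`, which multiplies element by element, and `%*%`, which is the usual matrix product.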
Arrays
An array is a data structure that stores data of a single type in more than two dimensions. In R, arrays
are created with the array() function:
array(x, dim = c(nrow, ncol, nmat)) # Create an array
x - data items of the same type
nrow - number of rows
ncol - number of columns
nmat - number of matrices
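The syntax above can be sketched with a small example. The name `arr` and the dimensions 2 x 3 x 4 are arbitrary choices for illustration.

```r
# A 2 x 3 x 4 array: four matrices, each with 2 rows and 3 columns
arr <- array(1:24, dim = c(2, 3, 4))

dim(arr)        # the three dimensions
arr[, , 1]      # the first 2 x 3 matrix
arr[1, 2, 3]    # row 1, column 2 of the third matrix
```

Like matrices, arrays are filled column by column, then matrix by matrix, which is why `arr[, , 1]` holds the values 1 to 6.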
A1 <- matrix(c(1:6), 2, 3, byrow = TRUE) # Matrix A1
A2 <- matrix(c(-1:-6), 2, 3, byrow = TRUE) # Matrix A2
col.names <- c("COL1", "COL2", "COL3") # Column names
row.names <- c("ROW1", "ROW2") # Row names
mat.names <- c("Matrix1", "Matrix2") # Matrix names
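The step that combines these pieces into a named array is not shown in the extracted text. One way it could be done, assuming the object is called `Array` and is built with `array()` and its `dimnames` argument, is sketched below.

```r
A1 <- matrix(c(1:6), 2, 3, byrow = TRUE)      # Matrix A1
A2 <- matrix(c(-1:-6), 2, 3, byrow = TRUE)    # Matrix A2
col.names <- c("COL1", "COL2", "COL3")
row.names <- c("ROW1", "ROW2")
mat.names <- c("Matrix1", "Matrix2")

# Assumed construction: stack A1 and A2 as the two layers of a 2 x 3 x 2 array
Array <- array(c(A1, A2), dim = c(2, 3, 2),
               dimnames = list(row.names, col.names, mat.names))

print(Array)                 # prints ", , Matrix1" and ", , Matrix2" layers
Array[, , "Matrix1"]         # layers can be selected by name
apply(Array, 3, sum)         # sum of the items in each layer
```

Because `A2` is the negative of `A1`, adding the two layers gives a matrix of zeros, and the layer sums are 21 and -21.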
Mat <- Array[, , 1] + Array[, , 2] # Add the two matrices stored in the array
print(Mat)
     COL1 COL2 COL3
ROW1    0    0    0
ROW2    0    0    0
Res.3 <- apply(Array, c(3), sum) # Sum the items of each matrix in the array
print(Res.3)
Matrix1 Matrix2
     21     -21
3 R Statistics
Statistics
set.seed(1234) # Set seed for reproducibility
print(mean(x)) # Mean
[1] -0.0265972
Note: The set.seed() function fixes the state of the random number generator, so the same random
numbers, and therefore the same results, can be reproduced in a later run.
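The effect of `set.seed()` can be checked directly. In this minimal sketch the names `a`, `b`, `d` and the sample size of 5 are arbitrary.

```r
set.seed(1234)
a <- rnorm(5)          # first sample after setting the seed

set.seed(1234)
b <- rnorm(5)          # same seed again, so the same sample

d <- rnorm(5)          # no reseeding: the generator continues, so new values

identical(a, b)        # TRUE: the two seeded samples agree exactly
identical(a, d)        # FALSE: d comes from a later state of the generator
```

Setting the seed once at the top of a script is usually enough to make the whole script reproducible.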
Statistics
Statistics, Cont’d
print(var(x)) # Variance
[1] 0.9946825
4 R Graphics
R Plots
Simple Plot #1
set.seed(123)
x <- rnorm(500) # Generate sample x from N(0, 1)
y <- x + rnorm(500) # Generate sample y
plot(x, y) # Plot samples x and y
x = rnorm(500)
y = x + rnorm(100)
$z_1 = x - 2y + 100$
$z_2 = \frac{(z_1 + 5)\, e^{z_1/100}}{\log(z_1)}$
Simple Plot #2
set.seed(123)
x <- rnorm(500)
y <- x + rnorm(100)
z1 <- x - 2*y + 100
z2 <- (z1 + 5) * (exp(z1/100) / log(z1))
# Plot samples z1 and z2
plot(z1, z2, lwd = 3, col = "coral",
     xlab = expression(z[1]), ylab = expression(z[2]),
     main = expression(
       frac((z[1] + 5) * e^frac(z[1], 100), log(z[1]))
     )
)
Suppose we have two different sequences x1 and x2 as
x1 = 1 : 10
x2 = 1 : 10
Simple Plot #3
Barplot
x = rnorm(50)
y = x + rnorm(50)
Barplot Plot
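The code for this slide did not survive extraction. A minimal sketch of a bar plot of the two samples, assuming the base `barplot()` function, an arbitrary seed, and one bar per observation, could look like this:

```r
set.seed(123)                 # assumed seed, for reproducibility only
x <- rnorm(50)
y <- x + rnorm(50)

par(mfrow = c(1, 2))          # two panels side by side
mids <- barplot(x, main = "x", col = "steelblue")   # one bar per observation
barplot(y, main = "y", col = "coral")
par(mfrow = c(1, 1))          # restore the single-panel layout
```

`barplot()` returns the midpoints of the bars (stored here in `mids`), which is useful when adding labels or error bars afterwards.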
Boxplot
x = rnorm(50)
y = x + rnorm(50)
Boxplot Plot
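The code for this slide is also missing from the extracted text. As a hedged sketch, side-by-side box plots of the two samples can be drawn with the base `boxplot()` function; the seed, labels, and colors below are assumptions.

```r
set.seed(123)                 # assumed seed, for reproducibility only
x <- rnorm(50)
y <- x + rnorm(50)

# Draw both samples in one panel; the returned list holds the plotted statistics
bp <- boxplot(x, y, names = c("x", "y"), col = c("steelblue", "coral"))
bp$stats                      # lower whisker, lower hinge, median, upper hinge, upper whisker
```

Each column of `bp$stats` corresponds to one box, so the five-number summaries of `x` and `y` can be read off without redrawing the plot.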
Histogram
set.seed(123)
x <- rnorm(50)
y <- x + rnorm(50)
par(mfrow = c(1, 2), oma = c(0, 0, 0, 0))
hist(x)
hist(y) # Draw histograms of x & y in one row
For more details see: https://www.tutorialspoint.com/r/r_histograms.htm
Density Plot
x = rnorm(50)
y = x + rnorm(50)
set.seed(123)
x <- rnorm(50)
y <- x + rnorm(50)
plot(density(x)) # Draw the kernel density estimate of x
polygon(density(x), col = 1) # Fill the area under the density curve
For more details see: https://statisticsglobe.com/kernel-density-plot-in-base-r
Histogram & Density Plot
x = rnorm(50)
y = x + rnorm(50)
set.seed(123)
x <- rnorm(50)
y <- x + rnorm(50)
hist(x, prob = TRUE) # Draw histogram and density
lines(density(x), lwd = 3, col = "red")
For more details see: https://statisticsglobe.com/kernel-density-plot-in-base-r
The general plot combines six panels (scatter, bar, box, time series, date-based series, and a
user-specified function) in a 2×3 plotting window.
General Plot
set.seed(123)
x <- rnorm(500)
y <- x + rnorm(500)
Data_1 <- ts(matrix(x, nrow = 500, ncol = 1), start = c(0, 1),
             frequency = 12)
Data_2 <- seq(as.Date("2005/1/1"), by = "month", length.out = 50)
Data_3 <- factor(mtcars$cyl)
Data_4 <- function(x) x^2
Data_5 <- rnorm(32)
Data_6 <- rnorm(50)
The ts() function converts a numeric vector into a time series object.
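The plotting calls for the six panels are not present in the extracted text. The sketch below is one possible arrangement; which object goes into which panel is an assumption based on the object lengths (for example, `Data_5` has 32 values, matching the 32 rows of `mtcars`).

```r
set.seed(123)
x <- rnorm(500)
y <- x + rnorm(500)
Data_1 <- ts(matrix(x, nrow = 500, ncol = 1), start = c(0, 1), frequency = 12)
Data_2 <- seq(as.Date("2005/1/1"), by = "month", length.out = 50)
Data_3 <- factor(mtcars$cyl)
Data_4 <- function(x) x^2
Data_5 <- rnorm(32)
Data_6 <- rnorm(50)

par(mfrow = c(2, 3))                                 # 2 x 3 plotting window
plot(x, y, main = "Scatter")                         # scatter plot
barplot(table(Data_3), main = "Bar")                 # counts per cylinder class
boxplot(Data_5 ~ Data_3, main = "Box")               # assumed pairing of Data_5 and Data_3
plot(Data_1, main = "Time series")                   # ts object
plot(Data_2, Data_6, type = "l", main = "Dates")     # assumed pairing of dates and Data_6
curve(Data_4, from = -3, to = 3, main = "Function")  # user-specified function
par(mfrow = c(1, 1))                                 # restore the layout
```

`par(mfrow = c(2, 3))` fills the window row by row, so the first three plots form the top row.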
Heatmap
R Plots
Heatmap, Cont’d
The data set in excel file is
Heatmap, Cont’d
Pairs Plot
set.seed(123)
x1 <- rnorm(1000) # Create variable x1
x2 <- x1 + rnorm(1000, 0, 2) # Create variable x2
x3 <- 3 * x1 - x2 + rnorm(1000, 0, 4) # Create variable x3
PR <- data.frame(x1, x2, x3) # Combine all variables
pairs(PR) # Draw pairs plot
set.seed(123)
library("ggplot2")
library("GGally")
x1 <- rnorm(1000) # Create variable x1
x2 <- x1 + rnorm(1000, 0, 2) # Create variable x2
x3 <- 3*x1 - x2 + rnorm(1000, 0, 4) # Create variable x3
data <- data.frame(x1, x2, x3) # Combine all variables
ggpairs(data) # Apply ggpairs function
cor(x1, x2) # Correlation between x1 and x2
cor(x1, x3) # Correlation between x1 and x3
cor(x2, x3) # Correlation between x2 and x3
For more details see: https://statisticsglobe.com/r-pairs-plot-example/
QQ Plot
3D Plot
For more details see: https://www.geeksforgeeks.org/creating-3d-plots-in-r-programming-persp-function/
Colors in R Plots
5 Inference
Maximum Likelihood
MLE-Two dimensional
Least-Squares
LSE-Two dimensional
Here, the least-squares estimation method is used to find the best-fitting parameter(s) of a target
population (e.g., Weibull) from a data set by minimizing the sum of squared differences between the
theoretical and empirical CDFs.
LSE
set.seed(1234) # Set seed for reproducibility
alpha <- 2 # Shape parameter value
lambda <- 1 # Scale parameter value
start <- c(alpha, lambda) # Start values
x <- rweibull(20, alpha, lambda) # Simulate random sample
n <- length(x) # No. of observations
lower <- c(0, 0); upper <- c(+Inf, +Inf) # Parameter bounds
Dweibull <- function(y, param) { # Weibull CDF
  alpha <- param[1]
  lambda <- param[2]
  1 - exp(-lambda * y^alpha) # Return the CDF value
}
LSE, Cont’d
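LSE <- function(param, y, CDF) { # Objective function: sum of squared CDF differences
  y <- sort(y) # Order the sample so that y[i] pairs with the plotting position i/(n+1)
  n <- length(y)
  D <- rep(0, length.out = n)
  for (i in 1:n) {
    D[i] <- (CDF(y[i], param) - (i / (n + 1)))^2
  }
  sum(D)
}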
LSE, Cont’d
OLS_weibull = OLS(Dweibull, start, y, lower, upper)
print(OLS_weibull)
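The `OLS()` helper called above is not defined in the extracted text. As a stand-in, the sketch below minimizes the same kind of objective with base R's `optim()` and the `"L-BFGS-B"` method; this is a swapped-in technique, and the names `weibull_cdf`, `lse_objective`, and `fit` are hypothetical.

```r
set.seed(1234)
alpha <- 2                                   # true shape
lambda <- 1                                  # true scale
x <- rweibull(20, shape = alpha, scale = lambda)

# Weibull CDF in the parameterization used on the slides
weibull_cdf <- function(y, param) 1 - exp(-param[2] * y^param[1])

# Sum of squared differences between the theoretical CDF at the ordered
# observations and the plotting positions i / (n + 1)
lse_objective <- function(param, y, CDF) {
  y <- sort(y)
  n <- length(y)
  sum((CDF(y, param) - (1:n) / (n + 1))^2)
}

# Box-constrained minimization; the small positive lower bound is an assumption
fit <- optim(par = c(alpha, lambda), fn = lse_objective,
             y = x, CDF = weibull_cdf,
             method = "L-BFGS-B", lower = c(1e-6, 1e-6), upper = c(Inf, Inf))

fit$par      # least-squares estimates of (alpha, lambda)
fit$value    # minimized sum of squares
```

With `lambda = 1` the slide's CDF, `1 - exp(-lambda * y^alpha)`, coincides with the shape/scale form used by `rweibull()`; for other values of `lambda` the two parameterizations differ.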
Linear Regression
Simple
y - response variable.
x - predictor variable.
a & b - regression coefficients.
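With these definitions, a simple linear regression can be fitted with `lm()`. The sketch below uses simulated data; the coefficient values, sample size, noise level, and the data frame name `mydata` (borrowed from the output shown on the slides) are assumptions for illustration.

```r
set.seed(123)
a <- 1                                       # assumed true intercept
b <- 2                                       # assumed true slope
x <- runif(10, min = 0, max = 5)             # predictor
y <- a + b * x + rnorm(10, sd = 0.3)         # response with random noise
mydata <- data.frame(x = x, y = y)

fit <- lm(y ~ x, data = mydata)              # fit y = a + b * x
print(fit)                                   # call and estimated coefficients
summary(fit)                                 # residuals, standard errors, t-tests
coef(fit)                                    # estimates of a and b

# Closed-form check: the slope equals cov(x, y) / var(x)
cov(x, y) / var(x)
```

`summary(fit)` produces the residual and coefficient tables, while printing the fitted object shows only the call and the estimates.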
Residuals:
      Min        1Q    Median        3Q       Max
-0.063002 -0.016629  0.000412  0.018944  0.039775
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.38455    0.08049  -4.778  0.00139 **
x            0.67461    0.05191  12.997 1.16e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Linear Regression
Simple
Call:
lm(formula = y ~ x, data = mydata)
Coefficients:
(Intercept)            x
    -0.3846       0.6746
Linear Regression
Multiple
y - response variable.
x1 , x2 , ..., xn - predictor variables.
a, b1 , b2 , ..., bn - regression coefficients.
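A multiple regression with two predictors can be fitted in the same way by adding terms to the formula. The data below are simulated, and the coefficient values, sample size, and names are assumptions for illustration.

```r
set.seed(123)
n <- 50
x1 <- rnorm(n)                               # first predictor
x2 <- rnorm(n)                               # second predictor
y <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.5)   # assumed true model
mydata <- data.frame(y = y, x1 = x1, x2 = x2)

fit <- lm(y ~ x1 + x2, data = mydata)        # fit y = a + b1 * x1 + b2 * x2
summary(fit)                                 # coefficient table with t-tests
coef(fit)                                    # estimates of a, b1, b2

# The same estimates from the normal equations (X'X) beta = X'y
X <- cbind(1, x1, x2)
beta <- solve(t(X) %*% X, t(X) %*% y)
beta
```

The normal-equations check is included only to show what `lm()` computes; `lm()` itself uses a numerically more stable QR decomposition.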
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.277006   0.101562  -2.727   0.0294 *
x1           0.670162   0.047952  13.976 2.27e-06 ***
x2          -0.004044   0.002607  -1.551   0.1647
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Linear Regression
Multiple
Call:
lm(formula = y ~ x1 + x2, data = mydata)
Coefficients:
(Intercept)           x1           x2
  -0.277006     0.670162    -0.004044
for (i in 1:N) {
  Est[i, ] = c(Res[[i]]$coef) # Calculate Av.Ests
  MSE[i, ] = (theta - c(Res[[i]]$coef))^2 # Calculate MSEs
  MAB[i, ] = abs(theta - c(Res[[i]]$coef)) / theta # Calculate MABs
}
Reg_1 = mean(Est[, 1]); Reg_2 = mean(Est[, 2])
MSE_1 = mean(MSE[, 1]); MSE_2 = mean(MSE[, 2])
MAB_1 = mean(MAB[, 1]); MAB_2 = mean(MAB[, 2])
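The loop above relies on objects (`N`, `theta`, `Res`, `Est`, `MSE`, `MAB`) whose definitions are not in the extracted text. A self-contained sketch of the same Monte Carlo idea for a simple linear regression is given below; the number of replications, sample size, true coefficients, and noise level are all assumptions.

```r
set.seed(123)
N <- 1000                     # assumed number of Monte Carlo replications
n <- 30                       # assumed sample size per replication
theta <- c(1, 2)              # assumed true intercept and slope

Est <- MSE <- MAB <- matrix(NA_real_, nrow = N, ncol = 2)

for (i in 1:N) {
  x <- runif(n)                                        # simulate predictor
  y <- theta[1] + theta[2] * x + rnorm(n, sd = 0.5)    # simulate response
  est <- coef(lm(y ~ x))                               # fit and extract estimates
  Est[i, ] <- est                                      # store the estimates
  MSE[i, ] <- (theta - est)^2                          # squared errors
  MAB[i, ] <- abs(theta - est) / theta                 # absolute errors relative to theta
}

colMeans(Est)    # average estimates of intercept and slope
colMeans(MSE)    # mean squared errors
colMeans(MAB)    # mean relative absolute biases
```

`colMeans()` replaces the separate `mean(Est[, 1])`, `mean(Est[, 2])` calls and extends without change to models with more coefficients.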
for (i in 1:N) {
  Est[i, ] = c(Res[[i]]$coef)
  MSE[i, ] = (theta - c(Res[[i]]$coef))^2
  MAB[i, ] = abs(theta - c(Res[[i]]$coef)) / theta
}
# Calculate Av. Ests
Reg_1 = mean(Est[, 1]); Reg_2 = mean(Est[, 2]); Reg_3 = mean(Est[, 3])
# Calculate MSEs
MSE_1 = mean(MSE[, 1]); MSE_2 = mean(MSE[, 2]); MSE_3 = mean(MSE[, 3])
# Calculate MABs
MAB_1 = mean(MAB[, 1]); MAB_2 = mean(MAB[, 2]); MAB_3 = mean(MAB[, 3])