Predictive Analytics: Group Assignment 2

This document provides a summary of the packages, functions, and processes used for predictive modeling. It includes 3 sections - pre-modeling, modeling, and post-modeling processes. The pre-modeling section lists 39 functions for data cleaning, exploration, and preprocessing. The modeling section lists 21 functions for building, evaluating, and selecting models. The post-modeling section lists 7 functions for model performance evaluation, visualization, and prediction.

Uploaded by

Namit Baser

Predictive Analytics

Group Assignment 2

Group ID: 191233


Submitted by:
Names Roll No
Divyesh Jain 191222
Hemil Joshi 191228
Namit Maheshwari 191233
Omkar Khandekar 191234
Shashank Saxena 191247

Submitted To:

Prof. Chetan Jhaveri

Batch: MBA – FT (2019-2021)


Institute of Management, Nirma University
Date of Submission: 4th October, 2020
List of Packages Used

S No. Package Name

1 caret

2 caTools

3 cowplot

4 datarium

5 DMwR

6 ggplot2

7 lift

8 lmtest

9 MASS

10 nortest

11 olsrr

12 ROCR
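
The list above can be checked against the local R library before any analysis starts. The snippet below is a minimal sketch, not the group's own script: the variable names `pkgs` and `not_installed` are illustrative, and the install call is left commented out because it needs network access. Some of these packages (DMwR in particular) may no longer be available from CRAN and might have to be installed from an archive.

```r
# Package names as listed above (R package names are case-sensitive).
pkgs <- c("caret", "caTools", "cowplot", "datarium", "DMwR", "ggplot2",
          "lift", "lmtest", "MASS", "nortest", "olsrr", "ROCR")

# Names of all packages currently installed in the local library.
installed <- rownames(installed.packages())

# Packages from the list that still need to be installed.
not_installed <- pkgs[!(pkgs %in% installed)]
print(not_installed)

# Uncomment to install whatever is missing (needs a CRAN mirror):
# install.packages(not_installed)

# Attach a package for the current session once it is installed, e.g.:
# library(ggplot2)
```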

Pre-modelling Process

S.No. Function Description

1 any(is.na()) To check whether the dataset has any missing values

2 as.factor() Converts data into a factor

3 as.integer() To convert a value to an integer


4 as.numeric() To convert any value to numeric

5 attach() To attach the data to R search path

6 boxplot() To display the boxplot of dataset

7 cbind() Combines vector, matrix or data frame column wise

8 class() To find out which class an object belongs to

9 cor() To check the correlation between different variables

10 cov() To check the covariance of the data

11 data() To load a built-in or package dataset

12 dim() Gives the dimension of the dataset

13 getwd() Gives the working directory of R

14 head() Displays the first 6 rows of the dataset (by default)

15 hist() To create a histogram of a variable

16 ifelse() Vectorised conditional: returns one value where the test is TRUE and another where it is FALSE

17 install.packages() To install a package into the R library

18 IQR() Provides the interquartile range

19 is.na() To check for the presence of missing values

20 library() To load an installed package into the session

21 ls() Lists the objects in the current environment

22 matrix() To create a matrix with m*n dimension

23 mean() To find out mean of columns of dataset

24 names() To check the names of column in dataset

25 nrow() Provides the number of rows

26 pairs() Gives a matrix of pairwise scatter plots

27 plot() Gives a scatter plot between two variables

28 rbind() Combines vector, matrix or data frame row wise

29 read.csv() Command to read the csv file

30 sd() Finds the standard deviation of the data

31 setwd() To set a new working directory for R

32 str() Display the overall structure of the dataset

33 subset() Creates a new subset from the superset dataset

34 sum(is.na()) To check total no. of missing values

35 summary() Provides summary statistics (minimum, quartiles, median, mean, maximum) for each column

36 tail() Displays the last 6 rows of the dataset (by default)

37 var() Finds out the variance of data

38 View() To view the dataset

39 which(is.na()) Gives the position of missing values
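
Several of the functions above are typically chained together in a first pass over a dataset. The following is a minimal sketch under stated assumptions: it uses R's built-in `airquality` data as a stand-in for the assignment's data (which the document does not name), and the variable names `df`, `clean`, and `HotDay` are illustrative only.

```r
# Built-in dataset with missing values; a stand-in for the assignment's data.
df <- airquality

print(dim(df))                   # number of rows and columns
print(names(df))                 # column names
str(df)                          # overall structure
print(head(df))                  # first 6 rows

# Missing-value checks
print(any(is.na(df)))            # TRUE if any value is missing
print(sum(is.na(df)))            # total number of missing values
print(which(is.na(df$Ozone)))    # row positions of missing Ozone values

# Keep only rows where both Ozone and Solar.R are observed
clean <- subset(df, !is.na(Ozone) & !is.na(Solar.R))

# Type conversion and a derived column (the 80-degree cut-off is arbitrary)
clean$Month  <- as.factor(clean$Month)
clean$HotDay <- ifelse(clean$Temp > 80, "hot", "mild")

# Descriptive statistics on the cleaned data
print(summary(clean$Ozone))
print(c(mean = mean(clean$Ozone), sd = sd(clean$Ozone), IQR = IQR(clean$Ozone)))
print(cor(clean$Ozone, clean$Temp))
```

Dropping incomplete rows with `subset()` is only one way to handle missing values; it is used here because it relies solely on functions from the table above.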

Modelling Process

S.No. Function Description

1 abline() Used to add a vertical, horizontal or regression line to a graph

2 ad.test() [nortest] Anderson-Darling test to check whether data follow a normal distribution

3 anova() To get the anova table of the model

4 bptest() [lmtest] Test for constant residual variance

5 confint() It computes confidence intervals (by default 95%)

6 [datarium] Data package for visualisation and statistical analysis

7 durbinWatsonTest() [car package] Check for autocorrelation

8 exp() To find the exponential value

9 ggplot() [ggplot2] To visualize the data graphically

10 lm() To fit a least squares linear regression model

11 lrtest() [lmtest] Likelihood ratio test to compare two or more nested models for goodness of fit

12 ncvTest() [car package] Test for non-constant error variance

13 qqline() Adds a straight line to a Q-Q plot

14 qqnorm() To draw a normal Q-Q plot

15 qqPlot() [car package] Quantile comparison plot

16 regr.eval() [DMwR] Calculate series of regression evaluation statistics

17 sample.split() [caTools] Used to split the dataset into training and test sets

18 stepAIC() [MASS] Stepwise model selection by AIC to find the best-fitting model

19 var() Variance-covariance matrix of the columns of a matrix or data frame

20 varImp() [caret] To list variable importance

21 vif() [car package] To check multicollinearity (variance inflation factors)
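
The modelling functions above fit together roughly as follows. This is a sketch under stated assumptions rather than the group's actual workflow: it uses the built-in `mtcars` data, an arbitrary choice of predictors, and a base-R `sample()` split in place of `sample.split()` so that the core of the example runs without extra packages. Calls into nortest, lmtest, car and MASS run only if those packages are installed.

```r
set.seed(123)

# 75/25 train-test split (sample.split() from caTools plays this role in the table)
idx   <- sample(nrow(mtcars), size = round(0.75 * nrow(mtcars)))
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]

# Least squares regression of fuel economy on weight, horsepower and displacement
fit <- lm(mpg ~ wt + hp + disp, data = train)
print(summary(fit))
print(anova(fit))      # anova table
print(confint(fit))    # 95% confidence intervals by default

# Diagnostics from the optional packages, skipped when a package is missing
if (requireNamespace("nortest", quietly = TRUE)) {
  print(nortest::ad.test(residuals(fit)))   # normality of residuals
}
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::bptest(fit))                # constant residual variance
}
if (requireNamespace("car", quietly = TRUE)) {
  print(car::vif(fit))                      # multicollinearity
  print(car::durbinWatsonTest(fit))         # autocorrelation
  print(car::ncvTest(fit))                  # non-constant error variance
}

# Stepwise selection by AIC
if (requireNamespace("MASS", quietly = TRUE)) {
  best <- MASS::stepAIC(fit, direction = "both", trace = FALSE)
  print(formula(best))
}
```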

Post Modelling Process

S.No. Function Description

1 confusionMatrix() [caret] Table that describes performance of the model

2 geom_point() [cowplot, ggplot2] To add a layer of points (a scatter plot), optionally coloured by a variable

3 grep() Pattern matching

4 performance() [ROCR] To compute performance measures for a 2D parameterized curve such as the ROC curve

5 plotLift() [lift] Draws a lift chart of predicted scores against actual outcomes in logistic regression

6 predict() Used to predict values from a fitted regression model
7 table() To create a contingency table of counts
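
A typical use of the post-modelling functions is to score a fitted classifier and tabulate its predictions. The sketch below makes several assumptions not stated in the document: it fits a logistic regression with `glm()` (which is not in the tables above) on the built-in `mtcars` data, uses a 0.5 probability cut-off, and evaluates on the training data for brevity. The caret and ROCR calls run only if those packages are installed.

```r
# Logistic regression: transmission type (am: 0 or 1) explained by weight
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probabilities, then class labels at an assumed 0.5 cut-off
prob      <- predict(fit, type = "response")
predicted <- factor(ifelse(prob > 0.5, 1, 0), levels = c(0, 1))
actual    <- factor(mtcars$am, levels = c(0, 1))

# Confusion table and accuracy in base R
cm <- table(Predicted = predicted, Actual = actual)
print(cm)
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)

# Fuller report from caret, if available
if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(predicted, actual))
}

# ROC curve coordinates from ROCR, if available
if (requireNamespace("ROCR", quietly = TRUE)) {
  pred_obj <- ROCR::prediction(prob, mtcars$am)
  roc      <- ROCR::performance(pred_obj, "tpr", "fpr")
  print(class(roc))
}
```

In practice the predictions would be made on a held-out test set by passing it to `predict()` through the `newdata` argument.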
