R Code For Discriminant and Cluster Analysis


DISCRIMINANT ANALYSIS

- Multivariate normality: since we are dealing with multiple features, the technique assumes that the features are normally distributed within each class. This also makes the technique sensitive to outliers and to group sizes: if the groups are badly imbalanced and one group is much smaller or larger than the others, the technique struggles to classify data points into that ‘outlier’ class.
- Homoscedasticity: the variance of the features is the same across all classes of the target variable.
- Random sampling: the features are assumed to be sampled randomly.
- Absence of multicollinearity: if the predictor variables are highly correlated with each other, predictive ability decreases. (A rough way to check some of these assumptions in R is sketched below.)
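
As a rough, illustrative sketch (not part of the original text), the checks below use base R on the iris data introduced in Step 2: pairwise correlations flag possible multicollinearity, a per-group Shapiro-Wilk test gives only a crude univariate proxy for multivariate normality, and the class counts reveal any group-size imbalance.

#check pairwise correlations between the predictors (high values suggest multicollinearity)
cor(iris[1:4])

#rough per-class univariate normality check for one feature (a crude proxy only)
tapply(iris$Sepal.Length, iris$Species, function(v) shapiro.test(v)$p.value)

#check whether the class sizes are badly imbalanced
table(iris$Species)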

library(MASS)
library(ggplot2)

Step 2: Load the Data


For this example, we’ll use the built-in iris dataset
in R. The following code shows how to load and
view this dataset:
#attach iris dataset to make it easy to work with
attach(iris)

#view structure of dataset


str(iris)

Step 3: Scale the Data


One of the key assumptions of linear discriminant analysis is that each of the predictor variables has the same variance. An easy way to ensure that this assumption is met is to scale each variable so that it has a mean of 0 and a standard deviation of 1.
We can quickly do so in R by using
the scale() function:
#scale each predictor variable (i.e. first 4 columns)
iris[1:4] <- scale(iris[1:4])

Step 4: Create Training and Test Samples


Next, we’ll split the dataset into a training set to
train the model on and a testing set to test the
model on:
#make this example reproducible
set.seed(1)

#Use 70% of dataset as training set and remaining 30% as testing set
sample <- sample(c(TRUE, FALSE), nrow(iris), replace=TRUE, prob=c(0.7,0.3))
train <- iris[sample, ]
test <- iris[!sample, ]

Step 5: Fit the LDA Model


Next, we’ll use the lda() function from
the MASS package to fit the LDA model to our
data:
#fit LDA model
model <- lda(Species~., data=train)

#view model output


model

Call:
lda(Species ~ ., data = train)

Prior probabilities of groups:


setosa versicolor virginica
0.3207547 0.3207547 0.3584906

Group means:
Sepal.Length Sepal.Width Petal.Length Petal.Width
setosa -1.0397484 0.8131654 -1.2891006 -1.2570316
versicolor 0.1820921 -0.6038909 0.3403524 0.2208153
virginica 0.9582674 -0.1919146 1.0389776 1.1229172

Coefficients of linear discriminants:


LD1 LD2
Sepal.Length 0.7922820 0.5294210
Sepal.Width 0.5710586 0.7130743
Petal.Length -4.0762061 -2.7305131
Petal.Width -2.0602181 2.6326229

Proportion of trace:
LD1 LD2
0.9921 0.0079

Step 6: Use the Model to Make Predictions


Once we’ve fit the model using our training data,
we can use it to make predictions on our test data:
#use LDA model to make predictions on test data
predicted <- predict(model, test)

names(predicted)

[1] "class" "posterior" "x"

This returns a list with three variables:

- class: The predicted class
- posterior: The posterior probability that an observation belongs to each class
- x: The linear discriminants
We can quickly view each of these results for the first six observations in our test dataset:
#view predicted class for first six observations in test set
head(predicted$class)

[1] setosa setosa setosa setosa setosa setosa


Levels: setosa versicolor virginica

#view posterior probabilities for first six observations in test set


head(predicted$posterior)

setosa versicolor virginica


4 1 2.425563e-17 1.341984e-35
6 1 1.400976e-21 4.482684e-40
7 1 3.345770e-19 1.511748e-37
15 1 6.389105e-31 7.361660e-53
17 1 1.193282e-25 2.238696e-45
18 1 6.445594e-22 4.894053e-41

#view linear discriminants for first six observations in test set


head(predicted$x)

LD1 LD2
4 7.150360 -0.7177382
6 7.961538 1.4839408
7 7.504033 0.2731178
15 10.170378 1.9859027
17 8.885168 2.1026494
18 8.113443 0.7563902
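
A natural follow-up, sketched here as an illustration rather than as part of the original output, is to measure accuracy on the test set and plot the test observations on the two discriminants (ggplot2 was loaded at the top but not otherwise used):

#proportion of test observations classified correctly
mean(predicted$class == test$Species)

#confusion matrix of predicted vs. actual species
table(Predicted = predicted$class, Actual = test$Species)

#plot the test observations on the two linear discriminants, colored by species
lda_plot <- cbind(test, predicted$x)
ggplot(lda_plot, aes(LD1, LD2, color = Species)) +
  geom_point()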

https://www.statology.org/linear-discriminant-analysis-in-r/
K-Means Clustering in R: Step-by-Step Example

Clustering is a technique in machine learning that attempts to find clusters of observations within a dataset.

The goal is to find clusters such that the observations within each cluster are quite similar to each other, while observations in different clusters are quite different from each other.

Clustering is a form of unsupervised learning because we’re simply attempting to find structure within a dataset rather than predicting the value of some response variable.

Clustering is often used in marketing when companies have access to information like:

- Household income
- Household size
- Head of household occupation
- Distance from nearest urban area

When this information is available, clustering can be used to identify households that are similar and may be more likely to purchase certain products or respond better to a certain type of advertising.
One of the most common forms of clustering is
known as k-means clustering.
What is K-Means Clustering?
K-means clustering is a technique in which we
place each observation in a dataset into one
of K clusters.
The end goal is to have K clusters in which the
observations within each cluster are quite similar
to each other while the observations in different
clusters are quite different from each other.
In practice, we use the following steps to perform
K-means clustering:
1. Choose a value for K. First, we must decide how many clusters we’d like to identify in the data. Often we simply have to test several different values for K and analyze the results to see which number of clusters seems to make the most sense for a given problem.
2. Randomly assign each observation to an initial cluster, from 1 to K.
3. Perform the following procedure until the cluster assignments stop changing:
   - For each of the K clusters, compute the cluster centroid. This is simply the vector of the p feature means for the observations in the kth cluster.
   - Assign each observation to the cluster whose centroid is closest. Here, closest is defined using Euclidean distance.

A minimal manual sketch of steps 2 and 3 appears after this list.
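
To make the procedure concrete, here is a minimal, illustrative base R sketch of one assignment pass on a small random matrix. It is not part of the original tutorial; the object names (X, cluster_id, centroids) are arbitrary, and the code assumes both clusters receive at least one observation.

#toy data: 10 observations with 2 features
set.seed(1)
X <- matrix(rnorm(20), ncol = 2)

#step 2: randomly assign each observation to cluster 1 or 2
cluster_id <- sample(1:2, nrow(X), replace = TRUE)

#step 3a: compute each cluster centroid (the vector of feature means per cluster)
centroids <- apply(X, 2, function(col) tapply(col, cluster_id, mean))

#step 3b: reassign each observation to the closest centroid (Euclidean distance)
d <- as.matrix(dist(rbind(centroids, X)))[-(1:2), 1:2]
new_cluster_id <- apply(d, 1, which.min)
new_cluster_id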
K-Means Clustering in R
The following tutorial provides a step-by-step
example of how to perform k-means clustering in R.
Step 1: Load the Necessary Packages
First, we’ll load two packages that contain several
useful functions for k-means clustering in R.
library(factoextra)
library(cluster)
Step 2: Load and Prep the Data
For this example we’ll use the USArrests dataset
built into R, which contains the number of arrests
per 100,000 residents in each U.S. state in 1973
for Murder, Assault, and Rape along with the
percentage of the population in each state living in
urban areas, UrbanPop.
The following code shows how to do the following:
- Load the USArrests dataset
- Remove any rows with missing values
- Scale each variable in the dataset to have a mean of 0 and a standard deviation of 1

#load data
df <- USArrests

#remove rows with missing values


df <- na.omit(df)

#scale each variable to have a mean of 0 and sd of 1


df <- scale(df)

#view first six rows of dataset


head(df)

Murder Assault UrbanPop Rape


Alabama 1.24256408 0.7828393 -0.5209066 -0.003416473
Alaska 0.50786248 1.1068225 -1.2117642 2.484202941
Arizona 0.07163341 1.4788032 0.9989801 1.042878388
Arkansas 0.23234938 0.2308680 -1.0735927 -0.184916602
California 0.27826823 1.2628144 1.7589234 2.067820292
Colorado 0.02571456 0.3988593 0.8608085 1.864967207
Step 3: Find the Optimal Number of Clusters
To perform k-means clustering in R we can use the
built-in kmeans() function, which uses the following
syntax:
kmeans(data, centers, nstart)

where:

- data: Name of the dataset.
- centers: The number of clusters, denoted k.
- nstart: The number of initial configurations. Because different initial cluster assignments can lead to different results, it’s recommended to try several initial configurations; kmeans() keeps the run that leads to the smallest total within-cluster variation, as the sketch below illustrates.
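
As an illustrative sketch (assuming the scaled df created in Step 2, and using 4 clusters purely for demonstration), a single random start can land in a worse local optimum than 25 starts:

#compare one random start with 25 random starts
set.seed(2)
km_one   <- kmeans(df, centers = 4, nstart = 1)
km_multi <- kmeans(df, centers = 4, nstart = 25)

#total within-cluster sum of squares; the 25-start fit should be equal or lower
km_one$tot.withinss
km_multi$tot.withinss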
Since we don’t know the optimal number of clusters beforehand, we’ll create two different plots that can help us decide:
1. Number of Clusters vs. the Total Within Sum of Squares
First, we’ll use the fviz_nbclust() function to create a
plot of the number of clusters vs. the total within
sum of squares:
fviz_nbclust(df, kmeans, method = "wss")
Typically when we create this type of plot we look
for an “elbow” where the sum of squares begins to
“bend” or level off. This is typically the optimal
number of clusters.
For this plot it appears that there is a bit of an
elbow or “bend” at k = 4 clusters.
2. Number of Clusters vs. Gap Statistic
Another way to determine the optimal number of
clusters is to use a metric known as the gap statistic,
which compares the total intra-cluster variation for
different values of k with their expected values for
a distribution with no clustering.
We can calculate the gap statistic for each number
of clusters using the clusGap() function from
the cluster package along with a plot of clusters vs.
gap statistic using the fviz_gap_stat() function:
#calculate gap statistic based on number of clusters
gap_stat <- clusGap(df,
FUN = kmeans,
nstart = 25,
K.max = 10,
B = 50)

#plot number of clusters vs. gap statistic


fviz_gap_stat(gap_stat)

From the plot we can see that the gap statistic is highest at k = 4 clusters, which matches the elbow method we used earlier.
Step 4: Perform K-Means Clustering with Optimal K
Lastly, we can perform k-means clustering on the
dataset using the optimal value for k of 4:
#make this example reproducible
set.seed(1)

#perform k-means clustering with k = 4 clusters


km <- kmeans(df, centers = 4, nstart = 25)

#view results
km

K-means clustering with 4 clusters of sizes 16, 13, 13, 8

Cluster means:
Murder Assault UrbanPop Rape
1 -0.4894375 -0.3826001 0.5758298 -0.26165379
2 -0.9615407 -1.1066010 -0.9301069 -0.96676331
3 0.6950701 1.0394414 0.7226370 1.27693964
4 1.4118898 0.8743346 -0.8145211 0.01927104

Clustering vector:
Alabama Alaska Arizona Arkansas California Colorado
4 3 3 4 3 3
Connecticut Delaware Florida Georgia Hawaii Idaho
1 1 3 4 1 2
Illinois Indiana Iowa Kansas Kentucky Louisiana
3 1 2 1 2 4
Maine Maryland Massachusetts Michigan Minnesota Mississippi
2 3 1 3 2 4
Missouri Montana Nebraska Nevada New Hampshire New Jersey
3 2 2 3 2 1
New Mexico New York North Carolina North Dakota Ohio Oklahoma
3 3 4 2 1 1
Oregon Pennsylvania Rhode Island South Carolina South Dakota Tennessee
1 1 1 4 2 4
Texas Utah Vermont Virginia Washington West Virginia
3 1 2 1 1 2
Wisconsin Wyoming
2 1

Within cluster sum of squares by cluster:


[1] 16.212213 11.952463 19.922437 8.316061
(between_SS / total_SS = 71.2 %)

Available components:

[1] "cluster" "centers" "totss" "withinss" "tot.withinss" "betweenss"


[7] "size" "iter" "ifault"
From the results we can see that:

- 16 states were assigned to the first cluster
- 13 states were assigned to the second cluster
- 13 states were assigned to the third cluster
- 8 states were assigned to the fourth cluster

We can visualize the clusters on a scatterplot that displays the first two principal components on the axes using the fviz_cluster() function:
#plot results of final k-means model
fviz_cluster(km, data = df)

We can also use the aggregate() function to find the mean of the variables in each cluster:
#find means of each cluster
aggregate(USArrests, by=list(cluster=km$cluster), mean)

cluster Murder Assault UrbanPop Rape
1 3.60000 78.53846 52.07692 12.17692
2 10.81538 257.38462 76.00000 33.19231
3 5.65625 138.87500 73.87500 18.78125
4 13.93750 243.62500 53.75000 21.41250
We interpret this output as follows:

- The mean number of murders per 100,000 citizens among the states in cluster 1 is 3.6.
- The mean number of assaults per 100,000 citizens among the states in cluster 1 is 78.5.
- The mean percentage of residents living in an urban area among the states in cluster 1 is 52.1%.
- The mean number of rapes per 100,000 citizens among the states in cluster 1 is 12.2.

And so on.
We can also append the cluster assignments of
each state back to the original dataset:
#add cluster assignment to original data
final_data <- cbind(USArrests, cluster = km$cluster)

#view final data


head(final_data)

Murder Assault UrbanPop Rape cluster
Alabama 13.2 236 58 21.2 4
Alaska 10.0 263 48 44.5 2
Arizona 8.1 294 80 31.0 2
Arkansas 8.8 190 50 19.5 4
California 9.0 276 91 40.6 2
Colorado 7.9 204 78 38.7 2
Pros & Cons of K-Means Clustering
K-means clustering offers the following benefits:

- It is a fast algorithm.
- It can handle large datasets well.

However, it comes with the following potential drawbacks:

- It requires us to specify the number of clusters before running the algorithm.
- It’s sensitive to outliers.

Two alternatives to k-means clustering are k-medoids clustering and hierarchical clustering.
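
As a brief illustrative aside (not part of the original tutorial), k-medoids clustering is available through the pam() function in the cluster package loaded earlier; assuming the scaled df from Step 2, a minimal call might look like this:

#fit k-medoids (PAM) with 4 clusters on the scaled data
kmed <- pam(df, k = 4)

#number of states assigned to each medoid-based cluster
table(kmed$clustering)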
Hierarchical Clustering in R: Step-by-Step Example

Clustering is a technique in machine learning that attempts to find groups or clusters of observations within a dataset such that the observations within each cluster are quite similar to each other, while observations in different clusters are quite different from each other.

Clustering is a form of unsupervised learning because we’re simply attempting to find structure within a dataset rather than predicting the value of some response variable.

Clustering is often used in marketing when companies have access to information like:

- Household income
- Household size
- Head of household occupation
- Distance from nearest urban area

When this information is available, clustering can be used to identify households that are similar and may be more likely to purchase certain products or respond better to a certain type of advertising.
One of the most common forms of clustering is
known as k-means clustering. Unfortunately this
method requires us to pre-specify the number of
clusters K.
An alternative to this method is known
as hierarchical clustering, which does not require us to
pre-specify the number of clusters to be used and
is also able to produce a tree-based representation
of the observations known as a dendrogram.
What is Hierarchical Clustering?
Similar to k-means clustering, the goal of
hierarchical clustering is to produce clusters of
observations that are quite similar to each other
while the observations in different clusters are
quite different from each other.
In practice, we use the following steps to perform
hierarchical clustering:
1. Calculate the pairwise dissimilarity between each pair of observations in the dataset.
   - First, we must choose some distance metric – like the Euclidean distance – and use this metric to compute the dissimilarity between each pair of observations in the dataset.
   - For a dataset with n observations, there will be a total of n(n-1)/2 pairwise dissimilarities (see the short check after this list).
2. Fuse observations into clusters.
   - At each step in the algorithm, fuse together the two observations (or clusters) that are most similar into a single cluster.
   - Repeat this procedure until all observations are members of one large cluster. The end result is a tree, which can be plotted as a dendrogram.
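
As a quick illustrative check (not part of the original tutorial), dist() in base R returns exactly n(n-1)/2 pairwise dissimilarities; for the 50 states in the scaled USArrests data used below, that is 50*49/2 = 1225:

#pairwise Euclidean dissimilarities for the scaled USArrests data
d <- dist(scale(USArrests), method = "euclidean")

#number of pairwise dissimilarities: n(n-1)/2 = 50*49/2 = 1225
length(d)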
To determine how close together two clusters are, we can use a few different methods, including:

- Complete linkage clustering: Find the maximum distance between points belonging to two different clusters.
- Single linkage clustering: Find the minimum distance between points belonging to two different clusters.
- Mean linkage clustering: Find all pairwise distances between points belonging to two different clusters and then calculate the average.
- Centroid linkage clustering: Find the centroid of each cluster and calculate the distance between the centroids of two different clusters.
- Ward’s minimum variance method: Minimize the total within-cluster variance; at each step, fuse the pair of clusters whose merger increases it the least.

Depending on the structure of the dataset, one of these methods may tend to produce better (i.e. more compact) clusters than the others.
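
As a small illustrative sketch (not from the original tutorial), we can see how the linkage choice changes the result by cutting base R hclust() trees built with complete versus single linkage into four clusters on the scaled USArrests data:

#distance matrix for the scaled data (as in the previous sketch)
d <- dist(scale(USArrests), method = "euclidean")

#cluster sizes when the tree is cut into 4 clusters under each linkage
table(cutree(hclust(d, method = "complete"), k = 4))
table(cutree(hclust(d, method = "single"), k = 4))   #single linkage often chains observations into one large cluster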
Hierarchical Clustering in R
The following tutorial provides a step-by-step
example of how to perform hierarchical clustering
in R.
Step 1: Load the Necessary Packages
First, we’ll load two packages that contain several
useful functions for hierarchical clustering in R.
library(factoextra)
library(cluster)
Step 2: Load and Prep the Data
For this example we’ll use the USArrests dataset
built into R, which contains the number of arrests
per 100,000 residents in each U.S. state in 1973
for Murder, Assault, and Rape along with the
percentage of the population in each state living in
urban areas, UrbanPop.
The following code shows how to do the following:
- Load the USArrests dataset
- Remove any rows with missing values
- Scale each variable in the dataset to have a mean of 0 and a standard deviation of 1

#load data
df <- USArrests

#remove rows with missing values


df <- na.omit(df)

#scale each variable to have a mean of 0 and sd of 1


df <- scale(df)

#view first six rows of dataset


head(df)

Murder Assault UrbanPop Rape


Alabama 1.24256408 0.7828393 -0.5209066 -0.003416473
Alaska 0.50786248 1.1068225 -1.2117642 2.484202941
Arizona 0.07163341 1.4788032 0.9989801 1.042878388
Arkansas 0.23234938 0.2308680 -1.0735927 -0.184916602
California 0.27826823 1.2628144 1.7589234 2.067820292
Colorado 0.02571456 0.3988593 0.8608085 1.864967207
Step 3: Find the Linkage Method to Use
To perform hierarchical clustering in R we can use
the agnes() function from the cluster package, which
uses the following syntax:
agnes(data, method)
where:
- data: Name of the dataset.
- method: The method to use to calculate dissimilarity between clusters.
Since we don’t know beforehand which method will
produce the best clusters, we can write a short
function to perform hierarchical clustering using
several different methods.
Note that this function calculates the agglomerative coefficient of each method, which is a metric that measures the strength of the clusters. The closer this value is to 1, the stronger the clusters.
#define linkage methods
m <- c( "average", "single", "complete", "ward")
names(m) <- c( "average", "single", "complete", "ward")

#function to compute agglomerative coefficient


ac <- function(x) {
agnes(df, method = x)$ac
}

#calculate agglomerative coefficient for each clustering linkage method


sapply(m, ac)

average single complete ward


0.7379371 0.6276128 0.8531583 0.9346210
We can see that Ward’s minimum variance method
produces the highest agglomerative coefficient,
thus we’ll use that as the method for our final
hierarchical clustering:
#perform hierarchical clustering using Ward's minimum variance
clust <- agnes(df, method = "ward")

#produce dendrogram
pltree(clust, cex = 0.6, hang = -1, main = "Dendrogram")
Each leaf at the bottom of the dendrogram
represents an observation in the original dataset.
As we move up the dendrogram from the bottom,
observations that are similar to each other are
fused together into a branch.
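
Optionally, and assuming the clust object fitted above, the clusters chosen later (k = 4 in Step 4) can be outlined directly on the dendrogram by converting the agnes fit to an hclust object; this is an illustrative addition, not part of the original tutorial:

#redraw the dendrogram and outline the 4 clusters chosen in Step 4
pltree(clust, cex = 0.6, hang = -1, main = "Dendrogram")
rect.hclust(as.hclust(clust), k = 4, border = 2:5)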
Step 4: Determine the Optimal Number of Clusters
To determine how many clusters the observations
should be grouped in, we can use a metric known
as the gap statistic, which compares the total intra-
cluster variation for different values of k with their
expected values for a distribution with no
clustering.
We can calculate the gap statistic for each number
of clusters using the clusGap() function from
the cluster package along with a plot of clusters vs.
gap statistic using the fviz_gap_stat() function:
#calculate gap statistic for each number of clusters (up to 10 clusters)
gap_stat <- clusGap(df, FUN = hcut, nstart = 25, K.max = 10, B = 50)

#produce plot of clusters vs. gap statistic


fviz_gap_stat(gap_stat)

From the plot we can see that the gap statistic is highest at k = 4 clusters. Thus, we’ll choose to group our observations into 4 distinct clusters.
Step 5: Apply Cluster Labels to Original Dataset
To actually add cluster labels to each observation
in our dataset, we can use the cutree() method to
cut the dendrogram into 4 clusters:
#compute distance matrix
d <- dist(df, method = "euclidean")
#perform hierarchical clustering using Ward's method
final_clust <- hclust(d, method = "ward.D2" )

#cut the dendrogram into 4 clusters


groups <- cutree(final_clust, k=4)

#find number of observations in each cluster


table(groups)

1 2 3 4
7 12 19 12
We can then append the cluster labels of each
state back to the original dataset:
#append cluster labels to original data
final_data <- cbind(USArrests, cluster = groups)

#display first six rows of final data


head(final_data)

Murder Assault UrbanPop Rape cluster
Alabama 13.2 236 58 21.2 1
Alaska 10.0 263 48 44.5 2
Arizona 8.1 294 80 31.0 2
Arkansas 8.8 190 50 19.5 3
California 9.0 276 91 40.6 2
Colorado 7.9 204 78 38.7 2
Lastly, we can use the aggregate() function to find
the mean of the variables in each cluster:
#find mean values for each cluster
aggregate(final_data, by=list(cluster=final_data$cluster), mean)

cluster Murder Assault UrbanPop Rape cluster


1 1 14.671429 251.2857 54.28571 21.68571 1
2 2 10.966667 264.0000 76.50000 33.60833 2
3 3 6.210526 142.0526 71.26316 19.18421 3
4 4 3.091667 76.0000 52.08333 11.83333 4
We interpret this output as follows:

- The mean number of murders per 100,000 citizens among the states in cluster 1 is 14.67.
- The mean number of assaults per 100,000 citizens among the states in cluster 1 is 251.28.
- The mean percentage of residents living in an urban area among the states in cluster 1 is 54.28%.
- The mean number of rapes per 100,000 citizens among the states in cluster 1 is 21.68.
