R Programming Notes

This document provides an overview of statistical hypothesis testing concepts. It defines key terms like the null hypothesis (H0), alternative hypothesis (Ha), type 1 and type 2 errors, test statistics, significance levels, p-values, and normal and t-distributions. It also outlines the steps for performing hypothesis tests, including establishing hypotheses, determining the appropriate statistical test, calculating test statistics, and making conclusions based on rejection regions. An example t-test is provided to illustrate the process.


DATA ANALYTICS-1

SUB CODE: IT412

DEPARTMENT OF INFORMATION TECHNOLOGY


BAPATLA ENGINEERING COLLEGE

August 8, 2016


Definition
HYPOTHESIS
A hypothesis is a statement made about the value of a
POPULATION PARAMETER that we wish to test by collecting
evidence in the form of a sample.
In a statistical hypothesis test the evidence comes from a sample, which
is summarized in the form of a statistic called the TEST STATISTIC.

NULL HYPOTHESIS
"NULL" means nothing new or different; the assumption or status quo is
maintained.
The null hypothesis is denoted by H0.

ALTERNATIVE HYPOTHESIS
The "alternative" is simply the other option when the null is rejected;
nothing more.
The alternative hypothesis is denoted by Ha.
Differences between null hypothesis and alternative hypothesis

(The comparison table is shown as a figure on the slides.)


Definition
TYPE-1 ERROR
Rejecting the assumption (null hypothesis) when it should not have
been rejected: incorrectly rejecting the null hypothesis.

TYPE-1 ERROR EXAMPLE


Let us consider a scenario:
1. You smell smoke.
2. You think, "This is not normal" (you reject the assumption that
everything is OK). You reject your null hypothesis.
3. Therefore you pull the fire alarm. The building is evacuated and the
fire department arrives to investigate.
4. After the investigation it is determined there was no fire. You
"falsely" pulled the fire alarm.
5. When you rejected your assumption that everything was OK, when it
really was OK, you committed a Type-1 error: a "false alarm."
Definition
TYPE-2 ERROR
Failing to reject the (null hypothesis) when it should have been
rejected: incorrectly not rejecting the null hypothesis.

TYPE-2 ERROR EXAMPLE


Let us consider a scenario:
1. You smell smoke.
2. You think, "It is probably someone who burned their lunch in the
microwave. No big deal."
3. Therefore you do not reject your assumption (null hypothesis) that
everything is OK. You uphold the null.
4. But there is indeed a fire. No one is injured, but the building burns
to the ground.
5. When you failed to reject your assumption that everything was
OK, when it really was NOT OK, you committed a Type-2 error.
Statistics
Introduction

POPULATION PARAMETER
Any characteristic of a population which is measurable is called a
POPULATION PARAMETER. (We usually use Greek letters for
population parameters.) A parameter is a numerical property of a
population, whereas a statistic is a numerical property of a sample.

Example
For example, the population mean µ and population variance σ² are
population parameters.



Population and sample

(Illustrated as a figure on the slides.)


Definition of terms for hypothesis test

TEST STATISTIC
The TEST STATISTIC is the value computed from the sample data that
is compared against the critical value in a hypothesis test.

Critical value
The CRITICAL VALUE separates the critical region from the
noncritical region.

Rejection Region
The REJECTION REGION is the range of values of the test value that
indicates that there is a significant difference and that the null
hypothesis should be rejected.
Significance Level (α)
It is a cut-off probability, chosen in the experimental design, used to
determine whether an observed test statistic is extreme or not. It is
usually set to 0.05, 0.025 or 0.01.
We reject the null hypothesis if the probability of the observed test
statistic appearing is smaller than α.


p-value
It is the probability of obtaining a test statistic equal to or more
extreme than the one actually observed, assuming the null hypothesis
is true.
A small p-value indicates that it is unlikely to get the value of the
observed test statistic under the null hypothesis.
We reject the null hypothesis if the p-value is smaller than α.


Normal distribution
The normal distribution is defined by the following probability density
function, where µ is the population mean and σ² is the variance:

f(x) = (1 / (σ√(2π))) · e^(−(x−µ)² / (2σ²))        (1)

If a random variable X follows the normal distribution, then we write:

X ∼ N(µ, σ²)        (2)

In particular, the normal distribution with µ = 0 and σ = 1 is called
the standard normal distribution, and is denoted N(0,1).
The normal distribution is important because of the CENTRAL LIMIT
THEOREM, which states that the distribution of the mean of samples of
size n, drawn from a population with mean µ and variance σ², approaches
a normal distribution with mean µ and variance σ²/n as n approaches
infinity.
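A quick illustration in R (not from the slides): means of samples of
size n = 30 from a skewed exponential population already look roughly
normal, centred on the population mean 1.

> means <- replicate(1000, mean(rexp(30, rate = 1)))
> hist(means)   # approximately bell-shaped around 1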


Hypothesis Testing Procedure

Test
1. Start with a well-developed, clear research problem or question.
2. Establish hypotheses, both null and alternative.
3. Determine the appropriate statistical test and sampling distribution.
4. Choose the Type-1 error rate.
5. State the decision rule.
6. Gather sample data.
7. Calculate the test statistic.
8. State the statistical conclusion.
9. Make a decision or inference based on the conclusion.


example

problem
Assume that the test scores of a college entrance exam fit a normal
distribution. Furthermore, the mean test score is 72, and the standard
deviation is 15.2. What is the percentage of students scoring 84 or
more in the exam?

solution
We apply the function pnorm of the normal distribution with mean 72
and standard deviation 15.2. Since we are looking for the percentage of
students scoring higher than 84, we are interested in the upper tail of
the normal distribution.
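In R, that upper-tail probability is a single call (the slide itself does
not show the code):

> pnorm(84, mean = 72, sd = 15.2, lower.tail = FALSE)   # ≈ 0.215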

The percentage of students scoring 84 or more in the college entrance
exam is 21.5%.


Student-t distribution

Assume that a random variable Z has the standard normal
distribution, and another random variable V has the Chi-Squared
distribution with m degrees of freedom. Assume further that Z and
V are independent; then the quantity

t = Z / √(V/m)

follows a Student t distribution with m degrees of freedom.
(A graph of the Student t distribution with 5 degrees of freedom is
shown on the slides.)


Example

problem
Find the 2.5th and 97.5th percentiles of the Student t distribution with
5 degrees of freedom.

solution
We apply the quantile function qt of the Student t distribution against
the decimal values 0.025 and 0.975.
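A sketch of that call:

> qt(c(0.025, 0.975), df = 5)
[1] -2.570582  2.570582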

The 2.5th and 97.5th percentiles of the Student t distribution with 5
degrees of freedom are −2.5706 and 2.5706 respectively.


T-Test for single mean
Significance of the t test
1. In probability and statistics, Student's t-distribution (or simply the
t-distribution) is any member of a family of continuous probability
distributions that arises when estimating the mean of a normally
distributed population in situations where the sample size is small and
the population standard deviation is unknown.
2. In real life it is usually impossible to know the standard deviation of
the population from which our sample is drawn.
3. When the population standard deviation is NOT KNOWN, we have to
use an estimate, s.
4. When σ is NOT KNOWN we use the t-distribution.
5. Every sample size has its own t-distribution, with n−1 degrees of
freedom.
6. The degrees of freedom change how the probability distribution looks.
7. The t-distribution has more probability at the tails and less
probability in the middle.
Comparison of t-distribution with z-distribution
8. Using the z-distribution is acceptable any time n ≥ 30.


T-Test statistic for single mean

t = (x̄ − µ0) / (s / √n)

x̄ = sample mean, µ0 = hypothesized population mean,
s = sample standard deviation, n = sample size.
The non-rejection and rejection regions for the t-test value are based
on df = n − 1.


T-distribution example
BUSINESS ANALYST SALARIES
A report from 6 years ago indicated that the average gross salary for a
business analyst was 69,873. Since this survey is now outdated, the
Bureau of Labor Statistics wishes to test this figure against current
salaries to see if the current salaries are statistically different from
the old ones.
Based on this sample, we found s = 14,985. We do not know σ and we
will therefore have to estimate it using s.
For this study, the BLS will take a sample of 12 current salaries.

solution
Step 1: Establish hypotheses.
H0: µ = 69,873
Ha: µ ≠ 69,873
Step 2: Determine the appropriate statistical test and sampling
distribution.
solution
This will be a two-sided test: salaries could be higher OR lower.
Since σ is unknown and n is small, we will use the t-distribution.
Step 3: Specify the Type-1 error rate (significance level):
α = .05.
Step 4: State the decision rule.
For df = 11:
if t > 2.201, reject H0;
if t < −2.201, reject H0.

Step 5: Gather data: n = 12, x̄ = 79,180.


Solution

(The worked computation appears as figures on the slides; a sketch
follows.)
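A minimal R sketch of that computation, using the figures given above
(x̄ = 79,180, µ0 = 69,873, s = 14,985, n = 12):

> t <- (79180 - 69873) / (14985 / sqrt(12))
> t   # ≈ 2.15

Since 2.15 lies inside (−2.201, 2.201), t does not fall in the rejection
region, so H0 is not rejected at α = .05.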


BUSINESS ANALYST SALARIES, when n = 15

(The worked figures for n = 15 appear on the slides; a sketch follows.)
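Assuming the same x̄ and s, a sketch of the effect of the larger sample:

> t15 <- (79180 - 69873) / (14985 / sqrt(15))
> t15                  # ≈ 2.41
> qt(0.975, df = 14)   # critical value ≈ 2.145

Now the statistic exceeds the critical value, so H0 would be rejected:
the larger sample changes the conclusion.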



one sample t test

one sample t test in R programming

∗ The R function t.test() can be used to perform both one and two
sample t-tests on vectors of data. The function contains a variety of
options and can be called as follows:
> t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95)
∗ Here x is a numeric vector of data values and y is an optional numeric
vector of data values.
∗ If y is excluded, the function performs a one-sample t-test on the
data contained in x; if it is included, it performs a two-sample t-test
using both x and y.
∗ The option mu provides a number indicating the true value of the
mean (or difference in means if you are performing a two sample test)
under the null hypothesis.


one sample t test

one sample t test in R programming

∗ The option alternative is a character string specifying the alternative
hypothesis, and must be one of the following: "two.sided" (which is the
default), "greater" or "less", depending on whether the alternative
hypothesis is that the mean is different than, greater than or less than
mu, respectively. For example the following call:
∗ > t.test(x, alternative = "less", mu = 10)
performs a one sample t-test on the data contained in x where the null
hypothesis is that µ = 10 and the alternative is that µ < 10.
∗ The option paired indicates whether or not you want a paired t-test
(TRUE = yes and FALSE = no). If you leave this option out it
defaults to FALSE.


one sample t test

one sample t test in R programming

∗ The option var.equal is a logical variable indicating whether or not to
assume the two variances as being equal when performing a two-sample
t-test. If TRUE then the pooled variance is used to estimate the
variance, otherwise the Welch (or Satterthwaite) approximation to the
degrees of freedom is used. If you leave this option out it defaults to
FALSE.
∗ Finally, the option conf.level determines the confidence level of the
reported confidence interval: for µ in the one-sample case, and for
µ1 − µ2 in the two-sample case.


one sample t test
one sample t test problem
Ex. An outbreak of Salmonella-related illness was attributed to ice
cream produced at a certain factory. Scientists measured the level of
Salmonella in 9 randomly sampled batches of ice cream. The levels (in
MPN/g) were:
0.593 0.142 0.329 0.691 0.231 0.793 0.519 0.392 0.418
Is there evidence that the mean level of Salmonella in the ice cream is
greater than 0.3 MPN/g?
Let µ be the mean level of Salmonella in all batches of ice cream. Here
the hypothesis of interest can be expressed as:
H0: µ = 0.3
Ha: µ > 0.3
Hence, we will need to include the options alternative="greater",
mu=0.3. Below is the relevant R code:

solution
> x = c(0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418)
> t.test(x, alternative = "greater", mu = 0.3)

(Running this gives t ≈ 2.2, df = 8 and p ≈ 0.029, so at the 5% level
there is evidence that the mean level exceeds 0.3 MPN/g.)
Two sample t-Test with Unequal Variance

The default form of t.test() does not assume that the samples have
equal variance, so the Welch two-sample test is carried out unless you
specify otherwise.
Ex:
> data2 <- c(3, 5, 7, 5, 3, 2, 6, 8, 5, 6, 9, 4, 5, 7, 3, 4)
> data3 <- c(6, 7, 8, 7, 6, 3, 8, 9, 10, 7, 6, 9)

Example
> t.test(data2, data3)


Two sample t-Test with Equal Variance

We can override the default and use the classic t-test by adding
var.equal = T. The p-value is slightly different from the Welch version.

Example
> t.test(data2, data3, var.equal = T)


CHI-SQUARED DISTRIBUTION

If X1, X2, ..., Xm are m independent random variables having the
standard normal distribution, then the following quantity follows a
CHI-SQUARED DISTRIBUTION with m degrees of freedom. Its mean
is m, and its variance is 2m.

V = X1² + X2² + ... + Xm² ∼ χm²        (3)

(A graph of the Chi-Squared distribution with 7 degrees of freedom is
shown on the slides.)


Example
Find the 95th percentile of the Chi-Squared distribution with 7 degrees
of freedom.

Solution
We apply the quantile function qchisq of the Chi-Squared distribution
against the decimal value 0.95.
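A sketch of that call:

> qchisq(0.95, df = 7)
[1] 14.06714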

The 95th percentile of the Chi-Squared distribution with 7 degrees of
freedom is 14.067.


Z-Distribution

Z-test
When σ is known, we use the standard normal distribution, or
z-distribution, to establish the non-rejection region and critical values.
When σ is not known, we use the t-distribution.
Some instructors or books will indicate that using the z-distribution is
acceptable any time n > 30.

Z-Test for single mean

z = (x̄ − µ) / (σ / √n)

n: sample size
x̄: sample mean
µ: hypothesized population mean
σ: population standard deviation


Example

Problem
Suppose the food label on a bag states that there is at most 2 grams of
saturated fat in a single cookie. In a sample of 35 cookies, it is found
that the mean amount of saturated fat per cookie is 2.1 grams. Assume
the population standard deviation is 0.25 grams. At the 0.05
significance level, can we reject the claim on the food label?

Solution
H0: µ ≤ 2
Ha: µ > 2
µ0 = 2, n = 35, x̄ = 2.1, σ = 0.25
z = (x̄ − µ) / (σ / √n)


> z <- (2.1 - 2) / (0.25 / sqrt(35))
> z
[1] 2.36643

> a1 <- qnorm(p = 0.05, lower.tail = F)
> a1
[1] 1.64485

Since z > a1, reject the null hypothesis.

The z value is greater than the critical value, so z is in the rejection
region. We reject the null hypothesis and conclude that a cookie
contains more than 2 grams of saturated fat, contrary to the food
label's claim.


Mann Whitney U test / Mann Whitney Wilcoxon test
• The Mann-Whitney U test is used to compare two independent samples
by a rank test.
• It is the non-parametric equivalent of the independent-samples t-test.
• The 'U' statistic measures the degree of overlap in ranks between the
two groups.
• The samples come from distinct populations and do not affect each
other.
• Using the Mann Whitney Wilcoxon test, we can decide whether the
population distributions are identical without assuming them to follow
the normal distribution.
Note
The difference between the Mann Whitney Wilcoxon test and the
Wilcoxon signed rank test: Mann Whitney is for independent groups and
uses only ordinal information; Wilcoxon is for matched groups and uses
interval information. There is no reason to expect the two analyses to
give similar results.
Example of Mann Whitney
Note
It is a way of examining the relationship between a numeric outcome
variable y and a categorical explanatory variable (x, two levels) when
the two groups are independent.

Here we have a data file consisting of the lung capacity of persons, both
smokers and non-smokers (725 observations of 6 variables).
We examine the relation between smoking and lung capacity.

Here, we plot the relationship between lung capacity and smoking.

In this we have the null hypothesis:
H0: Median lung capacity of smokers is equal to that of non-smokers
Ha: Median lung capacity of smokers is not equal to that of non-smokers
And it is a two-sided test.
So, the test is done in R as follows (see the sketch below).
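The slide's code is shown only as a figure; a minimal sketch, assuming
the file has columns named LungCap and Smoke (hypothetical names):

> lung <- read.table(file.choose(), header = TRUE)
> boxplot(LungCap ~ Smoke, data = lung)   # visualise the two groups
> wilcox.test(LungCap ~ Smoke, data = lung,
              mu = 0, alternative = "two.sided", conf.int = TRUE)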


This Wilcoxon test shows rejection of the null hypothesis in favour of
the alternative hypothesis.


One sample U test

If we specify a single vector, a one-sample U test is carried out, as in
the sketch below.
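For example, reusing data2 from the earlier slide (the value 5 is just an
illustrative hypothesized median):

> wilcox.test(data2, mu = 5)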


Wilcoxon Signed rank test / Two sample U test

This is a non-parametric method appropriate for examining the median
difference in observations for 2 populations that are paired or
dependent on one another.

Example
In this example we have blood test results before and after receiving
some treatment (25 paired observations of 3 variables).


> BP <- read.table(file.choose(), header = T)
> attach(BP)
> names(BP)
[1] "Subject" "Before" "After"
> BP[c(1,3,5),]

We will score changes from before to after treatment.

Before examining the data, we plot the before and after values.


> boxplot(Before, After)

In the plot, we can see the BP decrement after taking the treatment.

H0: Median change in systolic blood pressure is 0.

This is a two-sided test:
> wilcox.test(Before, After, mu = 0, alt = "two.sided", paired = TRUE,
              conf.int = TRUE, conf.level = 0.95)
Correlation and Covariance

Correlation is a measure of the strength of the relationship or
association between two variables.
• For a generally upward shape we say that the correlation is positive:
as the independent variable increases, the dependent variable generally
increases.
• For a generally downward shape we say that the correlation is
negative: as the independent variable increases, the dependent variable
generally decreases.


Continues..

For randomly scattered points with no upward or downward trend, we
say there is no correlation.

Look at the spread of the points to make a judgement about the
strength of the correlation. (The scatter diagrams classifying positive
relationships appear as figures on the slides.)


continues..

We classify the strengths of negative relationships in the same way.

Covariance
The covariance of two variables x and y in a data sample measures how
the two are linearly related. A positive covariance indicates a positive
linear relationship between the variables, and vice versa.

Covariance = Σ(x − x̄)(y − ȳ) / n


Pearson's correlation coefficient (r)
If a horizontal line is drawn through the mean y value ȳ, and a vertical
line through the mean x value x̄, you can see the relationship between
the two variables in another way.


continues..

coefficient (r)

r = Sxy / √(Sxx Syy)

• Sxx = Σx² − (Σx)²/n, or Sxx = Σ(x − x̄)²
• Syy = Σy² − (Σy)²/n, or Syy = Σ(y − ȳ)²
• Sxy = Σxy − (Σx)(Σy)/n, or Sxy = Σ(x − x̄)(y − ȳ)


Spearman's rank correlation coefficient (rs)

In practice a simpler procedure is normally used to calculate rs. The
raw scores are converted to ranks, and the difference d between the
ranks of each observation on the two variables is calculated.
If there are no tied ranks, then rs is given by:

rs = 1 − (6 Σdᵢ²) / (n(n² − 1))

• dᵢ = the difference between the ranks of corresponding values x and y
• n = the number of pairs of values.
If there are tied ranks, the classic Pearson's correlation coefficient
between ranks can be used instead of this formula.
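R computes these measures directly; a sketch with illustrative vectors
x and y:

> x <- c(1, 2, 3, 4, 5)
> y <- c(2, 1, 4, 3, 5)
> cov(x, y)                        # sample covariance
> cor(x, y)                        # Pearson's r
> cor(x, y, method = "spearman")   # Spearman's rank correlation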


Least squares regression line
Linear regression
Linear regression is a formal method of finding a line which best fits
a set of data.
We can use technology to perform linear regression and hence find the
equation of the line. Most graphics calculators and computer packages
use the method of 'least squares' to determine the gradient and
y-intercept.
The least squares regression line: we find the vertical distances
d1, d2, ... from the data points to the line of best fit.
We add the squares of these distances, giving d1² + d2² + ...
The least squares regression line is the one which makes this sum as
small as possible.
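In R, lm() fits the least squares line. A minimal sketch with made-up
illustrative data:

> x <- c(1, 2, 3, 4, 5)
> y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
> fit <- lm(y ~ x)    # least squares fit of y = a + b*x
> coef(fit)           # y-intercept a and gradient b
> plot(x, y)
> abline(fit)         # draw the fitted line on the scatterplot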


UNIT-2 (Machine Learning)

Cluster Analysis


Common steps in cluster analysis

Good clustering
A good clustering method will produce high-quality clusters with high
intra-class similarity and low inter-class similarity.

Data object: It represents an entity.
Ex: In a sales database the objects may be customers, store items, sales.
Attribute: It is a data field, representing a characteristic or feature
of data objects.
Types of attributes
The type of an attribute is determined by the set of possible values.
1. Nominal
Nominal means "relating to names". The values of a nominal attribute
are symbols or names of things.
Ex: hair color can take black, brown, gray, ...


2. Binary
A binary attribute is a nominal attribute with only two categories: 0, 1.
• A binary attribute is symmetric if both of its states are equally
valuable and carry the same weight; that is, there is no preference on
which outcome should be coded as 0 or 1. Ex: gender (male, female).
• A binary attribute is asymmetric if the outcomes of the states are not
equally important, such as the positive and negative outcomes of a
medical test for HIV. By convention, we code the most important
outcome, which is usually the rarest one, by 1 (e.g. HIV positive) and
the other by 0 (e.g. HIV negative).

3. Ordinal
An ordinal attribute is an attribute with possible values that have a
meaningful order or ranking among them, but the magnitude between
successive values is not known.
Ex: customer satisfaction (0: very dissatisfied, 1: dissatisfied,
2: neutral, 3: satisfied, 4: very satisfied).


4. Numeric
A numeric attribute is quantitative; that is, it is a measurable
quantity, represented in integer or real values. Numeric attributes can
be interval-scaled or ratio-scaled.
• Interval-scaled attributes: measured on a scale of equal-size units.
Ex: temperature scales.
• Ratio-scaled attributes: numeric attributes with an inherent
zero-point.
Ex: you are 100 times richer with 100 crores than with 1 crore.

• CLUSTER: A collection of data objects, similar to one another within
the same cluster and dissimilar to the objects in other clusters.
• CLUSTER ANALYSIS:
Grouping a set of data objects into clusters.
Clustering is unsupervised classification: no predefined classes.
• Examples of clustering applications:
Marketing, land use, insurance, city planning, earthquake studies.


Calculating distances

Dissimilarity/Similarity metric: Similarity is expressed in terms of a
distance function, which is typically a metric: d(i,j).


Continues..

• The definitions of distance functions are usually very different for
interval-scaled, boolean, categorical, ordinal and ratio variables.

• Clustering may not be the best way to discover interesting groups in a
data set. Often visualisation methods work well, allowing the human
expert to identify useful groups. However, as data set sizes increase
to millions of observations, this becomes impractical, and clusters help
to partition the data so that we can deal with smaller groups.

• Different algorithms, and even multiple runs of the one algorithm,
will deliver different clusterings.


Similarity and Dissimilarity Between Observations

Distance measures the dissimilarity between two data observations
x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn):
s(x,y) = 1 − d(x,y)

A distance measure should satisfy the following requirements:

• d(x, y) ≥ 0: distance is non-negative.
• d(x, x) = 0: distance to itself is 0.
• d(x, y) = d(y, x): distance is symmetric.
• d(x, y) ≤ d(x, z) + d(z, y): triangular inequality.


Euclidean Distance

* The straight-line distance between two points;
the default measure, based on numeric values;
* the distance calculation we learned in school:

dist(x, y) = √( Σ_{i=1..n} (xi − yi)² )

* x and y are the observations;
* n variables;
xi is the value of variable i for observation x;
similarly for yi.


Manhattan Distance
The distance walking the streets of Manhattan:

dist(x, y) = Σ_{i=1..n} |xi − yi|

Minkowski Distance
A general measure of distance:

d(x, y) = (|x1 − y1|^q + |x2 − y2|^q + ... + |xn − yn|^q)^(1/q)

If q = 2, d is the Euclidean distance; if q = 1, d is the Manhattan
distance. A variation is the weighted distance, where variables have
different importance:

d(x, y) = (w1|x1 − y1|^q + w2|x2 − y2|^q + ... + wn|xn − yn|^q)^(1/q)
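A sketch of these measures using R's dist() (x and y are illustrative):

> x <- c(1, 2, 3); y <- c(4, 0, 3)
> dist(rbind(x, y), method = "euclidean")
> dist(rbind(x, y), method = "manhattan")
> dist(rbind(x, y), method = "minkowski", p = 3)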


Issue of scale

(An example of scaling is shown as a figure on the slides.)


Interval-Scaled Variables

Interval-scaled variables are continuous variables on a roughly linear
scale.
• The Euclidean distance or some other instance of the Minkowski
distance can be used.
• Before applying the distance measure, the variables need to be
normalized.
Ex: variables with larger ranges (e.g., income) will overwhelm variables
with smaller ranges (e.g., age): 50,000 − 40,000 = 10,000 versus
50 years − 40 years = 10.
A variation of z-score normalisation:

v′ = (v − m) / s

where m is the mean and s is the mean absolute deviation (cf. the
standard deviation: this is more robust to outliers, and retains the
outliers).
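A sketch of this normalisation (v is an illustrative vector):

> v <- c(40000, 50000, 52000, 61000)
> m <- mean(v)
> s <- mean(abs(v - m))   # mean absolute deviation
> (v - m) / s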
Binary Variables
Binary variables have just two possible values: 0 and 1. We consider as
a group all of the binary variables and count, for observations xi and
xj, the number of variables where both are 1 (call this a), xi is 1 and
xj is 0 (b), xi is 0 and xj is 1 (c), or both are 0 (d), to build a
contingency table.
• Simple matching coefficient (symmetric variables):

d(xi, xj) = (b + c) / (a + b + c + d)

• Jaccard coefficient
(asymmetric: 1 is more important, e.g. diseases):

d(xi, xj) = (b + c) / (a + b + c)
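A sketch of the two coefficients for a pair of illustrative binary
vectors:

> xi <- c(1, 0, 1, 1, 0); xj <- c(1, 1, 0, 1, 0)
> a <- sum(xi == 1 & xj == 1); b <- sum(xi == 1 & xj == 0)
> c <- sum(xi == 0 & xj == 1); d <- sum(xi == 0 & xj == 0)
> (b + c) / (a + b + c + d)   # simple matching: 0.4
> (b + c) / (a + b + c)       # Jaccard: 0.5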
Categorical Variables

• A generalisation of the binary variable in that it can take more than
2 levels, e.g., red, yellow, blue, green.
• Method 1: Simple matching:

d(x, y) = (n − p) / n

where p is the number of matched categorical variables and n is the
total number of variables.
• Method 2: Convert each level into a binary variable, creating many
new binary variables.


Variables of Mixed Types

• A dataset may contain all types of variables: interval, binary,
categorical.
• Use a weighted formula to combine the different normalised (to [0,1])
distances, where the weights express the relative importance of the
variables:

d(x, y) = Σ_k wk · dij(Ak)

where wk is the weight of variable Ak, and dij(Ak) is the dissimilarity
between the ith observation and the jth observation on variable Ak,
normalised to [0, 1].
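In practice, the cluster package's daisy() function computes a
Gower-type mixed-variable dissimilarity along these lines:

> library(cluster)
> d <- daisy(iris, metric = "gower")   # iris mixes numeric and factor columns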


Major Clustering Approaches

• Partitioning algorithms:
construct various partitions and then evaluate them by some criterion.
• Hierarchy algorithms:
create a hierarchical decomposition of the set of data (or objects)
using some criterion.
• Density-based:
based on connectivity and density functions.
• Grid-based:
based on a multiple-level granularity structure.
• Model-based:
a model is hypothesized for each of the clusters, and the idea is to
find the best fit of the data to each model.


Partitioning Methods: The Principle

• Given:
→ a data set of n objects
→ K, the number of clusters to form.
• Organize the objects into k partitions (k ≤ n) where each partition
represents a cluster.
• The clusters are formed to optimize an objective partitioning
criterion:
→ objects within a cluster are similar;
→ objects of different clusters are dissimilar.


K-Means Method

Given k, the k-means algorithm is implemented in four steps (shown as
figures on the slides; the algorithm below spells them out).
Algorithm

• Input:
→ K: the number of clusters
→ D: a data set containing n objects
• Output: a set of k clusters
• Method:
1. Arbitrarily choose k objects from D as the initial cluster centers
2. Repeat
3. Reassign each object to the most similar cluster, based on the mean
value of the objects in the cluster
4. Update the cluster means
5. Until no change


K-Means Properties

• The algorithm attempts to determine k partitions that minimize the
square-error function

E = Σ_{i=1..k} Σ_{p∈Ci} (p − mi)²

→ E: the sum of the squared error for all objects in the data set
→ mi: the mean of cluster Ci
• It works well when the clusters are compact clouds that are rather
well separated from one another.


K-Means Properties

Advantages
• K-means is relatively scalable and efficient in processing large data
sets.
• The computational complexity of the algorithm is O(nkt)
→ n: the total number of objects
→ k: the number of clusters
→ t: the number of iterations
→ Normally: k << n and t << n
Disadvantages
• Can be applied only when the mean of a cluster is defined.
• Users need to specify k.
• K-means is not suitable for discovering clusters with nonconvex
shapes or clusters of very different size.
• It is sensitive to noise and outlier data points (they can influence
the mean value).


Variations of the K-Means Method

• A few variants of the k-means which differ in


· Selection of the initial k means
· Dissimilarity calculations
· Strategies to calculate cluster means
• Handling categorical data: k-modes (Huang98)
· Replacing means of clusters with modes
· Using new dissimilarity measures to deal with categorical objects
· Using a frequency-based method to update modes of clusters
· A mixture of categorical and numerical data



Example of k-means

i <- read.csv(choose.files())   # load the iris data from a CSV file
str(i)
names(i)

i2 <- i[,-5]    # drop the 5th column (Species) before clustering
names(i2)


continues..

i3 <- kmeans(i2, 3)   # k-means with k = 3 clusters
i3


continues..

table(iris$Species, i3$cluster)   # compare clusters against true species

plot(iris$Petal.Length, iris$Petal.Width, col=iris$Species)


continues..

plot(iris$Petal.Length,iris$Petal.Width,col=i3$cluster)



continues..

plot(iris$Sepal.Length,iris$Sepal.Width,col=i3$cluster)



K-Medoids Method

• Minimizes the sensitivity of k-means to outliers.
• Picks actual objects to represent clusters instead of mean values.
• Each remaining object is clustered with the representative object
(medoid) to which it is the most similar.
• The algorithm minimizes the sum of the dissimilarities between each
object and its corresponding reference point:

E = Σ_{i=1..k} Σ_{p∈Ci} |p − Oi|

· E: the sum of absolute error for all objects in the data set
· p: the data point in the space representing an object
· Oi: the representative object of cluster Ci


K-Medoids Method: The Idea

• Initial representatives are chosen randomly.
• The iterative process of replacing representative objects by
non-representative objects continues as long as the quality of the
clustering is improved.
• For each representative object O:
→ for each non-representative object R, swap O and R.
• Choose the configuration with the lowest cost.
• The cost function is the difference in absolute error-value if a
current representative object is replaced by a non-representative
object.


K-Medoids Method: Example

(The step-by-step worked example appears as a sequence of figures on
the slides.)


K-Medoids Algorithm (PAM)
PAM: Partitioning Around Medoids
• Input:
→ K: the number of clusters
→ D: a data set containing n objects
• Output: a set of k clusters
• Method:
1. Arbitrarily choose k objects from D as representative objects (seeds)
2. Repeat
3. Assign each remaining object to the cluster with the nearest
representative object
4. For each representative object Oj
5. Randomly select a non-representative object Orandom
6. Compute the total cost S of swapping representative object Oj with
Orandom
7. If S < 0 then replace Oj with Orandom
8. Until no change
Example program of k-medoids

x <- read.csv(choose.files())   # load the iris data from a CSV file
str(x)
names(x)

x1 <- x[,-5]    # drop the Species column
names(x1)


continues..

library(cluster)   # pam() lives in the cluster package
km <- pam(x1, 3)   # partitioning around medoids with k = 3
km


continues..

table(iris$Species,km$clustering)

plot(iris$Petal.Length,iris$Petal.Width,col=iris$Species)



continues..

plot(iris$Petal.Length,iris$Petal.Width,col=km$clustering)



continues..

plot(iris$Sepal.Length,iris$Sepal.Width,col=iris$Species)



continues..

plot(iris$Sepal.Length,iris$Sepal.Width,col=km$clustering)



K-Medoids Properties (k-medoids vs. k-means)

• The complexity of each iteration is O(k(n − k)²).
• For large values of n and k, such computation becomes very costly.

• Advantages
→ The k-medoids method is more robust than k-means in the presence of
noise and outliers.
• Disadvantages
→ K-medoids is more costly than the k-means method.
→ Like k-means, k-medoids requires the user to specify k.
→ It does not scale well for large data sets.


Hierarchical Clustering

• Hierarchical clustering approach
- A typical clustering analysis approach via partitioning the data set
sequentially.
- Construct nested partitions layer by layer via grouping objects into a
tree of clusters (without the need to know the number of clusters in
advance).
- Use a (generalised) distance matrix as the clustering criterion.
• Agglomerative vs. Divisive - two sequential clustering strategies for
constructing a tree of clusters:
- Agglomerative: a bottom-up strategy.
Initially each data object is in its own (atomic) cluster;
then merge these atomic clusters into larger and larger clusters.
- Divisive: a top-down strategy.
Initially all objects are in one single cluster;
then the cluster is subdivided into smaller and smaller clusters.


Example

Agglomerative and divisive clustering on the data set {a, b, c, d, e}
(shown as a figure on the slides).


Example

data(nutrient, package = "flexclust")
View(nutrient)
row.names(nutrient) <- tolower(row.names(nutrient))
row.names(nutrient)

nutrient.scaled <- scale(nutrient)   # standardise before computing distances
nutrient.scaled


continues..

d <- dist(nutrient.scaled)                     # Euclidean distance matrix
fit.average <- hclust(d, method = "average")   # average-linkage clustering
fit.average


continues..

plot(fit.average, hang = -1, cex = 1.0, main = "Average linkage clustering")
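To obtain a flat clustering from the tree, cutree() can be used (k = 5
here is just an illustrative choice):

> clusters <- cutree(fit.average, k = 5)
> table(clusters)   # cluster sizes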


Classification

Definition
The data analysis task is classification, where a model or classifier is
constructed to predict categorical labels, such as safe or risky for the
loan application data; yes or no for the marketing data; or treatment A,
treatment B, or treatment C for the medical data.

Regression analysis
Regression analysis is a statistical methodology that is most often used
for numeric prediction, hence the two terms are often used
synonymously.


Preparing the Data for Classification and Prediction
The following preprocessing steps may be applied to the data to help
improve the accuracy, efficiency, and scalability of the classification
or prediction process.
• Data cleaning: the preprocessing of data in order to remove or
reduce noise, and the treatment of missing values.
• Relevance analysis: many of the attributes in the data may be
redundant. Correlation analysis can be used to identify whether any
two given attributes are statistically related.
Relevance analysis, in the form of correlation analysis and attribute
subset selection, can be used to detect attributes that do not
contribute to the classification or prediction task.
• Data transformation and reduction: the data may be transformed
by normalization, particularly when neural networks or methods
involving distance measurements are used. Data can also be reduced by
applying many other methods, ranging from wavelet transformation and
principal components analysis to discretization techniques, such as
binning, histogram analysis, and clustering.
Classification by Decision Tree Induction

Decision tree induction is the learning of decision trees from
class-labeled training tuples. A decision tree is a flowchart-like tree
structure, where each internal node (non-leaf node) denotes a test on an
attribute, each branch represents an outcome of the test, and each leaf
node (or terminal node) holds a class label. The topmost node in a tree
is the root node.


Introduction

• Some decision tree algorithms produce only binary trees (where each
internal node branches to exactly two other nodes), whereas others can
produce nonbinary trees.

How are decision trees used for classification? Given a tuple, X, for
which the associated class label is unknown, the attribute values of the
tuple are tested against the decision tree. A path is traced from the
root to a leaf node, which holds the class prediction for that tuple.
Decision trees can easily be converted to classification rules.

Why are decision tree classifiers so popular? The construction of
decision tree classifiers does not require any domain knowledge or
parameter setting, and therefore is appropriate for exploratory
knowledge discovery. Decision trees can handle high-dimensional data.


Decision Tree Induction

A researcher in machine learning, J. Ross Quinlan, developed a decision
tree algorithm known as ID3 (Iterative Dichotomiser). This work
expanded on earlier work on concept learning systems. Quinlan later
presented C4.5 (a successor of ID3), which became a benchmark to which
newer supervised learning algorithms are often compared.


Algorithm

Input:
Data partition, D, which is a set of training tuples and their
associated class labels;
attribute_list, the set of candidate attributes;
Attribute_selection_method, a procedure to determine the splitting
criterion that best partitions the data tuples into individual classes.
This criterion consists of a splitting attribute and, possibly, either a
split point or a splitting subset.
Output: a decision tree
Method:
1. Create a node N;
2. if tuples in D are all of the same class, C, then
3. return N as a leaf node labeled with the class C;
4. if attribute_list is empty then
5. return N as a leaf node labeled with the majority class in D;


continues..

6. Apply Attribute_selection_method(D, attribute_list) to find the best
splitting criterion;
7. label node N with the splitting criterion;
8. if the splitting attribute is discrete-valued and multiway splits are
allowed then
9. attribute_list ← attribute_list − splitting_attribute;
10. for each outcome j of the splitting criterion
11. let Dj be the set of data tuples in D satisfying outcome j;
12. if Dj is empty then
13. attach a leaf labeled with the majority class in D to node N;
14. else attach the node returned by Generate_decision_tree(Dj,
attribute_list) to node N; endfor
15. return N;


Information gain

ID3 uses information gain as its attribute selection measure, a measure
based on Shannon's information theory, which studied the value or
information content of messages. Let node N represent or hold the
tuples of partition D. The attribute with the highest information gain
is chosen as the splitting attribute for node N. This attribute
minimizes the information needed to classify the tuples in the resulting
partitions and reflects the least randomness or impurity in these
partitions. Such an approach minimizes the expected number of tests
needed to classify a given tuple and guarantees that a simple (but not
necessarily the simplest) tree is found. The expected information needed
to classify a tuple in D is given by:

Info(D) = − Σ_{i=1..m} pi log2(pi)

• pi is the probability that an arbitrary tuple in D belongs to class
Ci:

pi = |Ci,D| / |D|


Continues..

• These partitions would correspond to the branches grown from node N.
Ideally, we would like this partitioning to produce an exact
classification of the tuples.
• That is, we would like each partition to be pure. However, it is quite
likely that the partitions will be impure. How much more information
would we still need (after the partitioning) in order to arrive at an
exact classification? This amount is measured by:

InfoA(D) = Σ_{j=1..v} (|Dj| / |D|) × Info(Dj)

• The term |Dj| / |D| acts as the weight of the jth partition.
• Gain(A) = Info(D) − InfoA(D)
Gain(A) tells us how much would be gained by branching on A. It is the
expected reduction in the information requirement caused by knowing the
value of A.
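A minimal R sketch (not from the slides) of these formulas, for a vector
of class labels y and a categorical attribute a:

entropy <- function(y) {
  p <- table(y) / length(y)   # class proportions pi
  p <- p[p > 0]               # drop empty classes so log2 is defined
  -sum(p * log2(p))           # Info(D)
}
info_gain <- function(y, a) {
  w <- table(a) / length(a)              # weights |Dj| / |D|
  cond <- sapply(split(y, a), entropy)   # Info(Dj) for each value of a
  entropy(y) - sum(w * cond)             # Gain(A) = Info(D) - InfoA(D)
}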
