Dimensionality Reduction

Dimensionality reduction is essential for effective data analysis as it addresses the challenges posed by high-dimensional data, such as decreased accuracy and increased computational complexity. It can be achieved through feature selection and feature extraction methods, which help in reducing the number of features while retaining the most informative ones. Genetic algorithms are also highlighted as a powerful optimization technique for navigating large search spaces in feature selection.

DIMENSIONALITY REDUCTION
WHY DIMENSIONALITY REDUCTION?
Generally, it is easy and convenient to collect data
An experiment

Data accumulates at an unprecedented speed


Data preprocessing is an important part of effective pattern recognition
Dimensionality reduction is an effective approach to downsizing data

2
WHY DIMENSIONALITY REDUCTION?
Most machine learning/pattern recognition techniques may not be
effective for high-dimensional data
Curse of Dimensionality
Accuracy and efficiency may degrade rapidly as the dimension increases.

The intrinsic dimension may be small.


For example, the number of genes responsible for a certain type of disease
may be small

3
WHY DIMENSIONALITY REDUCTION?
Visualization: projection of high-dimensional data onto 2D or
3D.

Data compression: efficient storage and retrieval.

Noise removal: positive effect on query accuracy.

4
DOCUMENT CLASSIFICATION
Documents (e.g. web pages, emails) are represented by the terms they contain:

            T1    T2   ……   TN    C
D1          12     0   ……    6    Sport
D2           3    10   ……   28    Travel
…
DM           0    11   ……   16    Jobs

Document sources: the Internet, digital libraries (ACM Portal, IEEE Xplore, PubMed)

■ Task: to classify unlabeled documents into categories
■ Challenge: thousands of terms
■ Solution: to apply dimensionality reduction
5
OTHER EXAMPLES

Face images Handwritten digits


6
DIMENSIONALITY REDUCTION
Reduces time complexity: Less computation

Reduces space complexity: Fewer parameters

Saves the cost of observing/computing the feature

7
DIMENSIONALITY REDUCTION
Key methods of dimensionality reduction

Feature Selection Feature Extraction

8
FEATURE SELECTION VS EXTRACTION
Feature selection:

Choosing k < d important features, ignoring the remaining d − k
These are subset selection algorithms

Feature extraction:

Project the original x_i, i = 1, …, d dimensions onto new k < d dimensions z_j, j = 1, …, k

9
FEATURE SELECTION

Choosing an optimal subset of features


(Subset of m out of n)

10
FEATURE EXTRACTION

Mapping of the original high-dimensional


data onto a lower-dimensional space
11
FEATURE EXTRACTION
Given a set of data points of p variables

Compute their low-dimensional representation:

12
Feature Selection

13
CONTENTS: FEATURE SELECTION
Introduction

Feature subset search

Models for Feature Selection


Filters

Wrappers
Genetic Algorithm

14
INTRODUCTION
You have some data, and you want to use it to build
a classifier, so that you can predict something (e.g.
likelihood of cancer)

The data has 10,000 fields (features)


You need to cut it down to 1,000 fields before
you try machine learning. Which 1,000?

This process of choosing the 1,000 fields to use is an example of Feature Selection
17
DATA SETS WITH MANY FEATURES
• Gene expression datasets (~10,000 features)
• http://www.ncbi.nlm.nih.gov/sites/entrez?db=gds

• Proteomics data (~20,000 features)


• http://www.ebi.ac.uk/pride/

18
FEATURE SELECTION: WHY?

Source: http://elpub.scix.net/data/works/att/02-28.content.pdf

19
FEATURE SELECTION: WHY?
It is quite easy to find many more cases in the literature where experiments show that accuracy drops when you use more features

Questions?
Why does accuracy reduce with more features?
How does it depend on the specific choice of features?
What else changes if we use more features?
So, how do we choose the right features?

20
WHY ACCURACY REDUCES ?
Note: Suppose the best feature set has 20
features. If you add another 5 features,
typically the accuracy of machine learning
may reduce. But you still have the original
20 features!! Why does this happen???

21
NOISE/EXPLOSION
The additional features typically add noise

Machine learning will pick up on spurious correlations that might hold in the training set but not in the test set

For some ML methods, more features means more parameters to


learn (more NN weights, more decision tree nodes, etc…) – the
increased space of possibilities is more difficult to search

22
FEATURE SUBSET SEARCH
(Scatter-plot illustrations)

X2 is important, X1 is not
X1 is important, X2 is not
X1 and X2 are important, X3 is not

23
SUBSET SEARCH PROBLEM
An example of search space (Kohavi & John 1997)

Forward Backward

24
DIFFERENT ASPECTS OF SEARCH
Search starting points
⚫ Empty set
⚫ Full set
⚫ Random point

Search directions
⚫ Sequential forward selection
⚫ Sequential backward elimination
⚫ Bidirectional generation
⚫ Random generation

Search Strategies
⚫ Exhaustive/Complete
⚫ Heuristics

25
EXHAUSTIVE SEARCH
Original dataset has N features
You want to use a subset of k features
A complete FS method means: try every subset of k features, and
choose the best!
The number of subsets is N! / k!(N−k)!
What is this when N is 100 and k is 5?
75,287,520

What is this when N is 10,000 and k is 100?

Around 6.5 × 10^241

For comparison, there are only about 10^80 atoms in the observable universe
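
A quick way to check these counts (a minimal sketch; math.comb is Python's exact binomial coefficient):

from math import comb

print(comb(100, 5))                 # 75287520 subsets of 5 out of 100
print(len(str(comb(10_000, 100))))  # 242 digits, i.e. about 6.5 x 10^241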

26
FORWARD SEARCH
• These methods 'grow' a set S of features:
1. S starts empty
2. Find the best feature to add (by checking which one gives the best performance on a validation set when combined with S)
3. If overall performance has improved, return to step 2; else stop
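
A minimal sketch of this greedy procedure, assuming a hypothetical helper evaluate(features) that trains the chosen classifier on those features and returns validation-set accuracy:

def forward_selection(all_features, evaluate):
    selected = []                        # S starts empty
    best_score = float("-inf")
    while True:
        remaining = [f for f in all_features if f not in selected]
        if not remaining:
            return selected
        # score every candidate feature when added to the current set S
        scores = {f: evaluate(selected + [f]) for f in remaining}
        best_f = max(scores, key=scores.get)
        if scores[best_f] > best_score:  # overall performance improved
            best_score = scores[best_f]
            selected.append(best_f)
        else:                            # no improvement: stop
            return selected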

27
BACKWARD SEARCH
• These methods 'remove' features one by one:
1. S starts with the full feature set
2. Find the best feature to remove (by checking which removal from S gives the best performance on a validation set)
3. If overall performance has improved, return to step 2; else stop

28
MODELS FOR FEATURE SELECTION
Two models for Feature Selection
Filter methods
Carry out feature selection independently of any learning algorithm; the features are selected as a pre-processing step
Wrapper methods
Use the performance of a learning machine as a black box to score feature subsets

29
FILTER METHODS
A filter method does not make use of the classifier,
but rather attempts to find predictive subsets of the
features by making use of simple statistics
computed from the empirical distribution.

30
FILTER METHODS

31
FILTER METHODS
Ranking/Scoring of features
Select best individual features. A feature evaluation function is used to rank
individual features, then the highest ranked m features are selected.
Although these methods can exclude irrelevant features, they often include
redundant features.
Pearson correlation coefficient
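
A minimal sketch of correlation-based ranking with numpy (assumes a numeric target y; np.corrcoef returns the Pearson coefficient in the off-diagonal entry):

import numpy as np

def top_m_by_correlation(X, y, m):
    # X: (n_samples, n_features), y: numeric class labels or targets
    scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                       for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:m]   # indices of the m highest-ranked features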

32
FILTER METHODS
Minimum Redundancy Maximum Relevance (mRMR)

A good predictor set has:
maximum relevance – maximal power in discriminating between different classes
minimum redundancy – minimal correlation among features (members of the predictor set)
33
WRAPPER METHODS
Given a classifier C and a set of features F, a wrapper method searches the space of subsets of F, using cross-validation to compare the performance of the trained classifier C on each tested subset.

34
WRAPPER METHODS

35
WRAPPER METHODS
Say we have predictors A, B, C and classifier M. We want to find the smallest possible subset of {A, B, C} while achieving maximal performance.

Feature set    Classifier    Performance
{A,B,C}        M             98%
{A,B}          M             98%
{A,C}          M             77%
{B,C}          M             56%
{A}            M             89%
{B}            M             90%
{C}            M             91%
{ }            M             85%
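
A minimal sketch of the wrapper idea, assuming a hypothetical helper cv_score(subset) that returns the cross-validated accuracy of classifier M trained on that feature subset (exhaustive search, so only feasible for a handful of features):

from itertools import combinations

def best_subset(features, cv_score):
    scored = []
    for k in range(len(features) + 1):
        for subset in combinations(features, k):
            # higher accuracy wins; ties are broken in favour of smaller subsets
            scored.append((cv_score(subset), -k, subset))
    return max(scored)[2]

For the table above, best_subset(['A', 'B', 'C'], cv_score) would return ('A', 'B'): the same 98% as {A,B,C}, but with fewer features.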
36
“Genetic Algorithms are good at taking large, potentially huge search
spaces and navigating them, looking for optimal combinations of
things, solutions you might not otherwise find in a lifetime.”

- Salvatore Mangano
Computer Design, May 1995

Genetic Algorithms

37
FIRST – A BIOLOGY LESSON
A gene is a unit of heredity in a living organism
Genes are connected together into long strings called chromosomes
A gene represents a specific trait of the organism, like eye colour or
hair colour, and has several different settings.
For example, the settings for a hair colour gene may be blonde, black or brown
etc.

These genes and their settings are usually referred to as an


organism's genotype.
The physical expression of the genotype – the organism itself - is
called the phenotype.

38
FIRST – A BIOLOGY LESSON
Offspring inherit traits from their parents
An offspring may end up having half its genes from one parent and half from the other – recombination
Very occasionally a gene may be mutated – expressed in the organism as a completely new trait
For example: a child may have green eyes while neither parent does

39
GENETIC ALGORITHM IS
… a computer algorithm
that is based on the principles of genetics and evolution

40
GENETIC ALGORITHMS
Search algorithms based on the mechanics of biological evolution
Developed by John Holland, University of Michigan (1970’s)
Provide efficient, effective techniques for optimization and machine learning
applications
Widely-used today in business, scientific and engineering circles

41
GENETIC ALGORITHM
Genetic algorithm (GA) introduces the principles of evolution and genetics into the search among possible solutions to a given problem

The idea is to simulate the process in natural systems

This is done by creating, within a machine, a population of individuals

42
GENETIC ALGORITHM
Survival of the fittest
⚫ The main principle of evolution used in GA
is “survival of the fittest”.
⚫ Good solutions survive, while bad ones die.

43
APPLICATIONS: OPTIMIZATION
Assume an individual is going to give you z dollars,
after you tell them the value of x and y

x and y are in the range 0 to 10

44
(Surface plot of z as a function of x and y)

45
EXAMPLE PROBLEM I
(CONTINUOUS)

y = f(x)

Finding the maximum (minimum) of some function (within a defined range).
EXAMPLE PROBLEM II
(DISCRETE)

The Traveling Salesman Problem (TSP)


A salesman spends his time visiting n
cities. In one tour he visits each city just
once, and finishes up where he started. In
what order should he visit them to
minimize the distance traveled?
There are (n-1)!/2 possible tours.
GENETIC ALGORITHM
Inspired by natural evolution
Population of individuals
⚫ Individual is feasible solution to problem

Each individual is characterized by a Fitness function


⚫ Higher fitness is better solution

Based on their fitness, parents are selected to produce offspring


for a new generation
⚫ Fitter individuals have more chance to reproduce
⚫ New generation has same size as old generation; old generation dies

Offspring combine the properties of their two parents


If well designed, population will converge to optimal solution

48
GENETIC ALGORITHM

49
GENETIC ALGORITHM
Coding or Representation
Possible solutions to problem

Fitness function
Parent selection

Reproduction
Crossover
Mutation

Convergence
When to stop

50
CODING – EXAMPLE: FEATURE SELECTION
Assume we have 15 features, f1 to f15
Generate binary strings of 15 bits as the initial population
Population size = user-defined parameter

Initial population:
101110001110001
011101111100000
…………………………
111100011101101

Each bit is one gene; one row = one chromosome = one individual of the population
1 means the feature is used – 0 means the feature is not used
51
CODING: EXAMPLE
The Traveling Salesman Problem:

Find a tour of a given set of cities so that


Each city is visited only once
The total distance traveled is minimized
Representation is an ordered list of city numbers
1) London 3) Dunedin 5) Beijing 7) Tokyo
2) Venice 4) Singapore 6) Phoenix 8) Victoria

CityList1 (3 5 7 2 1 6 4 8)
CityList2 (2 5 7 6 8 1 3 4)
52
FITNESS FUNCTION/PARENT
SELECTION
Fitness function evaluates how good an individual is in solving the
problem

Fitness is computed for each individual

The fitness function is application dependent

For classification – we may use the classification rate as the fitness


function

Find the fitness value of each individual in the population


53
FITNESS FUNCTION/PARENT
SELECTION
Parent/Survivor Selection
Roulette Wheel Selection

Tournament Selection

Rank Selection

Elitist Selection

54
ROULETTE WHEEL SELECTION
Main idea: better individuals get higher chance
Individuals are assigned a probability of being selected based
on their fitness.
p_i = f_i / Σ_j f_j
where p_i is the probability that individual i will be selected,
f_i is the fitness of individual i, and
Σ_j f_j is the sum of the fitnesses of all individuals in the population.
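
A minimal sketch of one roulette-wheel spin (returns one selected individual):

import random

def roulette_select(population, fitnesses):
    total = sum(fitnesses)
    r = random.uniform(0, total)        # spin the wheel
    cumulative = 0.0
    for individual, f in zip(population, fitnesses):
        cumulative += f
        if r <= cumulative:
            return individual
    return population[-1]               # guard against floating-point round-off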

55
ROULETTE WHEEL SELECTION
Assign to each individual a part of the roulette wheel
Spin the wheel n times to select n individuals

Example: fitness(A) = 3, fitness(B) = 1, fitness(C) = 2
A gets 3/6 = 50% of the wheel, B gets 1/6 ≈ 17%, C gets 2/6 ≈ 33%

56
TOURNAMENT SELECTION
Binary tournament
Two individuals are randomly chosen; the fitter of the two is selected as a
parent

Larger tournaments
n individuals are randomly chosen; the fittest one is selected as a parent

57
OTHER METHODS
Rank Selection
Each individual in the population is assigned a numerical rank based on
fitness, and selection is based on this ranking.

Elitism
Reserve k slots in the next generation for the highest-scoring/fittest chromosomes of the current generation

58
REPRODUCTION
Reproduction operators
Crossover
Mutation

Crossover is usually the primary operator with mutation serving


only as a mechanism to introduce diversity in the population

59
REPRODUCTION
Crossover
Two parents produce two offspring
There is a chance that the chromosomes of the two parents are copied
unmodified as offspring
There is a chance that the chromosomes of the two parents are randomly
recombined (crossover) to form offspring
Generally the chance of crossover is between 0.6 and 1.0

Mutation
There is a chance that a gene of a child is changed randomly
Generally the chance of mutation is low (e.g. 0.001)

60
CROSSOVER
Generating offspring from two selected parents
Single point crossover
Two point crossover (Multi point crossover)
Uniform crossover

61
ONE POINT CROSSOVER
Choose a random point on the two parents
Split parents at this crossover point
Create children by exchanging tails

Parent 1: XX|XXXXX
Parent 2: YY|YYYYY
Offspring 1: XXYYYYY
Offspring 2: YYXXXXX
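
A minimal sketch of one-point crossover on two equal-length strings or bit lists:

import random

def one_point_crossover(parent1, parent2):
    point = random.randint(1, len(parent1) - 1)   # random cut point
    child1 = parent1[:point] + parent2[point:]    # exchange tails
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

For example, one_point_crossover("XXXXXXX", "YYYYYYY") yields ("XXYYYYY", "YYXXXXX") when the cut point falls after position 2.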

63
CROSSOVER
Parents:
00101111000110
00110010001100
00101111001000
11101100000001

Crossover point chosen at random for each pair
64
CROSSOVER
Parents                 Offspring
00101111000110    →    00101100000001
00110010001100
00101111001000
11101100000001    →    11101111000110

65
CROSSOVER
Parents                 Offspring
00101111000110    →    00101100000001
00110010001100    →    00110111001000
00101111001000    →    00101010001100
11101100000001    →    11101111000110

66
TWO POINT CROSSOVER
Two-point crossover is very similar to single-point crossover except that two cut-points are generated instead of one.

Parent 1: XX|XXX|XX
Parent 2: YY|YYY|YY
Offspring 1: XXYYYXX
Offspring 2: YYXXXYY

67
N POINT CROSSOVER
Choose n random crossover points
Split along those points
Glue parts, alternating between parents

68
UNIFORM CROSSOVER
A random mask is generated
The mask determines which bits are copied from one parent
and which from the other parent
Bit density in mask determines how much material is taken from
the other parent

Mask: 0110011000 (Randomly generated)


Parents: 1010001110 0011010010

Offspring: 0011001010 1010010110

69
MUTATION
Alter each gene independently with a probability pm
pm is called the mutation rate
Typically between 1/pop_size and 1/chromosome_length

70
SUMMARY – REPRODUCTION CYCLE
Select parents for producing the next generation
For each consecutive pair, apply crossover with probability pc; otherwise copy the parents
For each offspring, apply mutation (bit-flip with probability pm)
Replace the population with the resulting population of offspring
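
A minimal sketch of one full generation, reusing the roulette_select and one_point_crossover sketches above; pc and pm here are typical (not prescribed) values, and individuals are lists of 0/1 genes:

import random

def next_generation(population, fitnesses, pc=0.9, pm=0.01):
    new_pop = []
    while len(new_pop) < len(population):
        p1 = roulette_select(population, fitnesses)
        p2 = roulette_select(population, fitnesses)
        if random.random() < pc:                  # crossover with probability pc
            c1, c2 = one_point_crossover(p1, p2)
        else:                                     # otherwise copy parents
            c1, c2 = p1[:], p2[:]
        for child in (c1, c2):
            # bit-flip mutation with probability pm per gene
            new_pop.append([1 - g if random.random() < pm else g for g in child])
    return new_pop[:len(population)]              # old generation is replaced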

71
CONVERGENCE
Stop Criterion
Number of generations
Fitness value
How fit is the fittest individual

72
GA FOR FEATURE SELECTION
The initial population is randomly generated
Each chromosome is evaluated using the fitness function
The fitness values of the current population are used to produce the offspring of the next generation
The generational process ends when the termination criterion is
satisfied
The selected features correspond to the best individual in the last
generation

73
GA FOR FEATURE SELECTION
GA can be executed multiple times
Example: 15 features, GA executed 10 times

74
GA FOR FEATURE SELECTION
Feature categories based on frequency of selection
Indispensable:
Feature selected in each selected feature subset.

Irrelevant:
Feature not selected in any of the selected subsets.

Partially Relevant:
Feature selected in some of the subsets.

75
GA WORKED EXAMPLE
Suppose that we have a rotary system (which could be
mechanical - like an internal combustion engine or gas turbine,
or electrical - like an induction motor).

The system has five parameters associated with it - a, b, c, d


and e. These parameters can take any integer value between 0
and 10.

When we adjust these parameters, the system responds by


speeding up or slowing down.

Our aim is to obtain the highest speed possible in revolutions


per minute from the system.
76
Step 1

GA WORKED EXAMPLE
Generate a population of random strings (we’ll use ten as an
example):

77
Step 2

GA WORKED EXAMPLE
Feed each of these strings into the machine, in turn, and
measure the speed in revolutions per minute of the machine. This
value is the fitness because the higher the speed, the better the
machine:

78
Step 3

GA WORKED EXAMPLE
To select the breeding population, we’ll go the easy route and
sort the strings then delete the worst ones. First sorting:

79
Step 3

GA WORKED EXAMPLE
Having sorted the strings into order we delete the worst half

80
Step 4

GA WORKED EXAMPLE
We can now crossover the strings by pairing them up randomly.
Since there’s an odd number, we’ll use the best string twice. The
pairs are shown below:

81
Step 4

GA WORKED EXAMPLE
The crossover points are selected randomly and are shown by the
vertical lines. After crossover the strings look like this:

These can now join their parents in the next generation
82
Step 4

GA WORKED EXAMPLE
New Generation

83
Step 4

GA WORKED EXAMPLE
We have one extra string (which we
picked up by using an odd number of
strings in the mating pool) that we can
delete after fitness testing.

84
Step 5

GA WORKED EXAMPLE
Finally, we have mutation, in which a small number of values are changed at random

85
GA WORKED EXAMPLE
After this, we repeat the algorithm from step 2, with this new population as the starting point.

Keep repeating until convergence

86
GA WORKED EXAMPLE
Roulette Wheel Selection
The alternative (roulette) method of selection would make up a breeding population by giving each of the old strings a chance of ending up in the breeding population that is proportional to its fitness

The cumulative fitness of each string is its own fitness added to the fitnesses of all the strings before it

87
GA WORKED EXAMPLE
Roulette Wheel Selection

1. If we now generate a random number between 0 and 10280


we can use this to select strings.
2. If the random number turns out to be between 0 and 100 then
we choose the last string.
3. If it’s between 8080 and 10280 we choose the first string.
4. If it’s between 2480 and 3480 we choose the string 3 6 8 6 9,
etc. You don’t have to sort the strings into order to use this
method.

88
Feature Extraction

89
FEATURE EXTRACTION
Unsupervised
Principal Component Analysis (PCA)
Independent Component Analysis (ICA)

Supervised
Linear Discriminant Analysis (LDA)

90
PRINCIPAL COMPONENT ANALYSIS
(PCA)
PCA is one of the most common feature extraction techniques
Reduce the dimensionality of a data set by finding a new set of
variables, smaller than the original set of variables
Allows us to combine much of the information contained in n
features into m features where m < n

91
PCA – INTRODUCTION

• The 1st PC is a minimum distance fit to a line in X space


• The 2nd PC is a minimum distance fit to a line in the plane
perpendicular to the 1st PC
PCs are a series of linear least squares fits to a sample,
each orthogonal to all the previous.
93
PRINCIPAL COMPONENT ANALYSIS
(PCA)

94
PCA – INTRODUCTION
Transform n-dimensional data to a new n-dimensions
The new dimension with the most variance is the first principal
component
The next is the second principal component, etc.

95
Recap

VARIANCE AND COVARIANCE


Variance is a measure of data spread in one dimension (feature)
Covariance measures how two dimensions (features) vary with
respect to each other

96
Recap

COVARIANCE
Focus on the sign (rather than exact value) of covariance
Positive value means that as one feature increases or decreases the other does
also (positively correlated)
Negative value means that as one feature increases the other decreases and
vice versa (negatively correlated)
A value close to zero means the features are uncorrelated (no linear relationship)

97
Recap

COVARIANCE MATRIX
Covariance matrix is an n × n matrix containing the covariance
values for all pairs of features in a data set with n features
(dimensions)

The diagonal contains the covariance of each feature with itself, which is its variance (the square of the standard deviation)

The matrix is symmetric
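
A minimal numpy sketch (rows are samples, columns are features; the data values here are just an assumed toy example):

import numpy as np

X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9]])

S = np.cov(X, rowvar=False)   # 2 x 2 covariance matrix
# S[0, 0], S[1, 1] are the variances; S[0, 1] == S[1, 0], so S is symmetric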

98
PCA – MAIN STEPS
Center data around 0

Form the covariance matrix S.

Compute its eigenvectors:

The first p eigenvectors form the p PCs.

The transformation G consists of the p PCs.
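
A minimal numpy sketch of these steps (X has samples in rows; p is the number of components to keep):

import numpy as np

def pca(X, p):
    mean = X.mean(axis=0)
    Xc = X - mean                              # 1. center data around 0
    S = np.cov(Xc, rowvar=False)               # 2. covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)       # 3. eigenvectors (S is symmetric)
    order = np.argsort(eigvals)[::-1]          #    sort by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    G = eigvecs[:, :p]                         # 4. first p PCs form G
    Z = Xc @ G                                 # 5. project onto the p PCs
    return Z, G, mean, eigvals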

99
Data

PCA – WORKED EXAMPLE

100
Step 1

PCA – WORKED EXAMPLE


First step is to center the original data around 0
Subtract mean from each value

101
Step 2

PCA – WORKED EXAMPLE


Calculate the covariance matrix of the centered data – Only 2 ×
2 for this case

102
Step 3

PCA – WORKED EXAMPLE


Calculate the eigenvectors and eigenvalues of the covariance
matrix (remember linear algebra)
Covariance matrix – square n × n ; n eigenvalues will exist
All eigenvectors (principal components/dimensions) are orthogonal to each
other and will make a new set of dimensions for the data
The magnitude of each eigenvalue corresponds to the variance along that
new dimension – Just what we wanted!
We can sort the principal components according to their eigenvalues
Just keep those dimensions with largest eigenvalues

103
Step 3

PCA – WORKED EXAMPLE

Two eigenvectors
overlaying the centered
data

104
Step 4

PCA – WORKED EXAMPLE


Just keep the p eigenvectors with the largest eigenvalues
We do lose some information, but if we only drop dimensions with small eigenvalues, we lose very little
We can then use p input features rather than n
How many dimensions p should we keep?

(Scree plot: eigenvalue / proportion of variance versus component index 1, 2, …, n)
105
Step 4

PCA – WORKED EXAMPLE


Proportion of Variance (PoV):

PoV = (λ1 + λ2 + … + λp) / (λ1 + λ2 + … + λn)

when the λi are sorted in descending order

Typically, stop at PoV > 0.9

106
Step 4

PCA – WORKED EXAMPLE

107
Step 4

PCA – WORKED EXAMPLE


Transform the features to the p chosen eigenvectors
Take the p eigenvectors that you want to keep from the list of eigenvectors, and form a matrix with these eigenvectors in the columns.

For our example: either keep both vectors, or choose to leave out the smaller, less significant one

OR
108
Step 5

PCA – WORKED EXAMPLE


FinalData = RowFeatureVector x RowDataAdjust

RowFeatureVector is the matrix with the eigenvectors in the columns, transposed so that the eigenvectors are now in the rows, with the most significant eigenvector at the top

RowDataAdjust is the mean-adjusted data, transposed

FinalData is the final data set, with data items in columns and dimensions along the rows

109
Step 5

PCA – WORKED EXAMPLE

110
Step 5

PCA – WORKED EXAMPLE

111
Step 5

PCA – WORKED EXAMPLE

112
PCA – WORKED EXAMPLE
Getting back the original data
We used the transformation: FinalData = RowFeatureVector x RowDataAdjust

This gives:
RowDataAdjust = RowFeatureVector^(-1) x FinalData

In our case, the inverse of the feature-vector matrix is equal to its transpose:
RowDataAdjust = RowFeatureVector^T x FinalData

113
PCA – WORKED EXAMPLE
Getting back the original data
Add the mean to get back the raw data:

OriginalData = (RowFeatureVector^T x FinalData) + Mean

If we use all (two in our case) eigenvectors, we get back exactly the original data
With one eigenvector, some information is lost (see the sketch below)
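
A minimal sketch of the reconstruction, reusing the names from the pca sketch earlier (G holds the kept eigenvectors in its columns, mean is the feature-wise mean):

def reconstruct(Z, G, mean):
    # G has orthonormal columns, so G.T acts as its inverse on the projected data
    return Z @ G.T + mean

With p = n (all eigenvectors) the reconstruction is exact; with p < n some information is lost.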

114
PCA – WORKED EXAMPLE

(Left: original data. Right: data restored using one eigenvector)
115
PCA APPLICATIONS
Eigenfaces for recognition. Turk and Pentland. 1991.

Principal Component Analysis for clustering gene expression data. Yeung and Ruzzo.
2001.

Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum. Lilien. 2003.

116
ACKNOWLEDGEMENTS
Chapter 6, Introduction to Machine Learning, E. Alpaydin, MIT Press.

Most slides in this presentation are adapted from the textbook slides and various other sources. The copyright belongs to the original authors.

117
