MACHINE LEARNING ALGORITHM Unit-II Part-II-1

The document discusses different techniques for feature reduction in machine learning. It describes feature selection and feature extraction as two common types of feature reduction. Feature selection aims to find a subset of relevant features, while feature extraction transforms features into a smaller number of dimensions. Several filter, wrapper and embedded methods for feature selection are outlined, including information gain, chi-square test, correlation coefficients, recursive feature elimination and genetic algorithms. The document also covers feature reduction techniques like forward selection, backward selection and evaluating feature subsets.

Uploaded by akash chandankar

Feature Reduction

● For a given test data instance, we have to find the nearby instances; for this we need a distance function.
● This distance function is computed in terms of the features.
● If the number of features is large, there is a problem: the computed distance may not represent the actual distance.
● Features contain information about the target.
● More features mean more information and better classification.
● But we have to handle two cases properly:
- Irrelevant features: they create noise.
- Redundant features: they degrade the performance of the algorithm.
Feature Reduction: Type 1

● Feature Selection:
Let F = {X1, X2, X3, …, Xn} be the original feature set and
F′ ⊂ F = {X1′, X2′, X3′, …, Xm′}, where F′ is a subset of F with m ≤ n.

● The goal is to find the subset of features that optimizes a certain criterion.
Feature Reduction: Type 2

● Feature Extraction:
● It is a transformation of the original set of features into a new set with a smaller number of dimensions.
Feature Reduction
● Forward Selection:
● We start with an empty feature set and add one feature at a time.
● At each step, try each of the remaining features.
● Estimate the classification/regression error of adding each feature.
● Select the feature that gives the maximum improvement.
● Stop when there is no significant improvement.
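The greedy loop above can be sketched in plain Python. Here `toy_error` is a hypothetical stand-in for the cross-validated error a real learner would report for a feature subset:

```python
def forward_selection(features, error_fn, tol=1e-6):
    """Greedy forward selection: start empty, repeatedly add the feature
    that most reduces the error; stop when no addition helps significantly."""
    selected = []
    best_err = error_fn(selected)
    while True:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        # estimate the error of adding each remaining feature
        trials = {f: error_fn(selected + [f]) for f in candidates}
        f_best = min(trials, key=trials.get)
        if best_err - trials[f_best] < tol:  # no significant improvement
            break
        selected.append(f_best)
        best_err = trials[f_best]
    return selected

# hypothetical error function: error drops only when X1 and X3 are included
def toy_error(subset):
    return 1.0 - 0.4 * ('X1' in subset) - 0.3 * ('X3' in subset)

print(forward_selection(['X1', 'X2', 'X3', 'X4'], toy_error))  # → ['X1', 'X3']
```

In practice `error_fn` would wrap model training and validation, which is why each step is expensive.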
Feature Reduction
● Backward Selection:
● We start with the full feature set.
● At each step, try dropping each remaining feature.
● Drop the feature whose removal has the smallest impact on the error.
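The mirror-image procedure can be sketched the same way, again with a hypothetical `toy_error` standing in for a real validation score:

```python
def backward_selection(features, error_fn, tol=1e-6):
    """Start with the full set; repeatedly drop the feature whose removal
    increases the error the least, while that increase stays insignificant."""
    selected = list(features)
    best_err = error_fn(selected)
    while len(selected) > 1:
        # estimate the error of dropping each remaining feature
        trials = {f: error_fn([g for g in selected if g != f]) for f in selected}
        f_drop = min(trials, key=trials.get)
        if trials[f_drop] - best_err > tol:  # dropping anything hurts too much
            break
        selected.remove(f_drop)
        best_err = trials[f_drop]
    return selected

# hypothetical error function: only X1 and X3 actually carry signal
def toy_error(subset):
    return 1.0 - 0.4 * ('X1' in subset) - 0.3 * ('X3' in subset)

print(backward_selection(['X1', 'X2', 'X3', 'X4'], toy_error))  # → ['X1', 'X3']
```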
Feature Reduction
Filter Methods
1. Information Gain (IG)
2. Chi-square test
3. Correlation coefficient
Wrapper Methods
1. Recursive feature elimination
2. Genetic algorithms

Embedded Methods
1. Decision tree
Selection of Optimal Features / Correlation Coefficient Method

Attributes: A, B, C, D, E; target attribute: T

Step 1: Find the correlation of each attribute (e.g. A) with the target attribute T.
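Step 1 can be sketched with plain Python; the attribute values and target below are made-up numbers for illustration:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# hypothetical values for attributes A..E and the target T
data = {
    'A': [1, 2, 3, 4, 5],
    'B': [2, 1, 4, 3, 5],
    'C': [5, 4, 3, 2, 2],
    'D': [1, 1, 2, 2, 2],
    'E': [3, 1, 4, 1, 5],
}
T = [1.0, 2.1, 2.9, 4.2, 5.3]

# rank attributes by absolute correlation with the target
scores = {col: abs(pearson(vals, T)) for col, vals in data.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # 'A' comes first: it tracks T almost perfectly
```

A filter method would then keep the top-k attributes by this score, without ever training a model.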


Wrapper Methods

Attributes: A, B, C, D, E; target attribute: T

Here we make combinations ("wrapping") of attributes with the target attribute:
- Attribute A together with the target T is provided to the machine learning algorithm, which builds model M1.
- Attributes A and B combined with T are provided to the algorithm, which builds model M2.
- Attributes A, B and C combined with T build model M3, and so on.
Then find the model that best matches the objective.
With N features, there are 2^N possible feature subsets!
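The combinatorial blow-up can be seen directly by enumerating the subsets; the feature names are placeholders:

```python
from itertools import combinations

features = ['A', 'B', 'C', 'D', 'E']  # placeholder attribute names

# every non-empty subset that an exhaustive wrapper would train a model on
subsets = [list(c) for r in range(1, len(features) + 1)
           for c in combinations(features, r)]

print(len(subsets))  # 2^5 - 1 = 31 candidate models for just 5 features
```

This exponential growth is why wrappers in practice use greedy searches such as forward selection, backward selection, or recursive feature elimination rather than exhaustive enumeration.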
Feature Reduction Steps
Evaluating feature subsets
Feature selection
Pearson correlation coefficient
Signal-to-noise ratio
Multivariate feature selection
Collaborative Filtering Based Recommendation System
● It is a form of instance-based learning.
Recommendation Systems
Why is there a need?
Types of Recommendation System
Techniques : Data Acquisition

1. Explicit Data: customer ratings, feedback, demographics, psychographics
2. Implicit Data: purchase history, click or browse history
3. Product Information: product taxonomy, product attributes, product descriptions
Techniques : Recommendation Generation

1. Collaborative Filtering: this method finds a subset of users who have tastes and preferences similar to the target user's and uses this subset for offering recommendations.

Basic assumptions:
- Users with similar interests have common preferences.
- A sufficiently large number of user preferences is available.

Main approaches:
- User based
- Item based
Types of Recommendation System
User Based Collaborative Filtering
● Advantage:
- No knowledge about item features needed
● Problems:
- New-user cold-start problem
- New-item cold-start problem: items with few ratings cannot easily be recommended
- Sparsity problem: if there are many items to be recommended, the user/rating matrix is sparse and it is hard to find users who have rated the same items. The sparsity problem occurs when transactional or feedback data is sparse and insufficient for identifying neighbors; it is a major issue limiting the quality of recommendations and the applicability of collaborative filtering in general.
- Popularity bias: tends to recommend only popular items
e.g. RINGO, GroupLens
Types of Recommendation System

Item Based Collaborative Filtering
● Advantages:
- No knowledge about item features needed
- Better scalability, because correlations are computed between a limited number of items instead of a very large number of users
- Reduced sparsity problem
● Problems:
- New-user cold-start problem
- New-item cold-start problem
e.g. Amazon, eBay
Types of Recommendation System

2. Content Based Filtering: recommendations are based on the content of items rather than on other users' opinions.

User profiles: create user profiles to describe the types of items the user prefers.
e.g. User1 likes sci-fi, action and comedy.
Recommendations on the basis of keywords are also classified as content based.
e.g. IMDB, Last.fm
Types of Recommendation System

Content Based Systems Cont'd...

Advantages:
- No need for data on other users; no cold-start or sparsity problems.
- Able to serve users with unique tastes.
- Able to recommend new and unpopular items.
- Can provide an explanation for each recommendation.

Limitations:
- Data should be in a structured format.
- Unable to use quality judgments from other users.
Collaborative filtering based recommendation system

Let U be the set of users and S the set of items.

P is a utility function that gives the rating of an item by a user:
P: U × S → R
That is, for a user u and an item s, P(u, s) is that user's rating of that item.
Now, learn P from data, where the training data set consists of past user ratings (for the rating-prediction problem) or past purchase history.

Based on P, predict the utility value of each item to each user.


Collaborative filtering based recommendation system
In the previous phase, we found the similar users using the KNN algorithm.
Collaborative filtering based recommendation system

● One of the drawbacks of a user-based recommendation system is that, as the number of users increases, it becomes very difficult to handle.
● Amazon has millions of users.
Collaborative filtering based recommendation system
Collaborative Filtering
Finding Similar Users
Jaccard Similarity
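As a sketch, with hypothetical rated-item sets: Jaccard similarity compares which items two users rated, ignoring the rating values themselves.

```python
def jaccard(a, b):
    """Jaccard similarity of two users' rated-item sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# hypothetical rated-item sets (item IDs) for two users
user1 = {'HP1', 'HP2', 'TW'}
user2 = {'HP1', 'SW1'}
print(jaccard(user1, user2))  # 1 common item out of 4 distinct items → 0.25
```

Because it discards the rating values, Jaccard treats a 1-star and a 5-star rating identically, which motivates cosine-based measures next.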
Cosine Similarity
● In the above example, there is only one common rating between users A and B.

Cosine Similarity
● In this example, consider users A and C; they have two common ratings, i.e. WT1 and TM1.
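The computation can be sketched as follows; the rating vectors below are hypothetical, with 0 standing in for an unrated item:

```python
import math

def cosine(u, v):
    """Cosine similarity of two rating vectors (unrated items held as 0)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# hypothetical rating vectors for two users over the same five items
A = [4, 0, 0, 5, 1]
B = [5, 5, 4, 0, 0]
print(cosine(A, B))  # ≈ 0.38
```

Treating a missing rating as 0 implicitly reads it as a very negative opinion; this is the weakness the centered-cosine variant addresses.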
Conclusion From Results
Centered Cosine
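The centering step can be sketched as follows; the two rating vectors are hypothetical, with 0 marking unrated items:

```python
import math

def centered_cosine(u, v):
    """Subtract each user's mean rating over their rated items (0 = unrated),
    then take the cosine of the centered vectors (Pearson-style similarity)."""
    def center(x):
        rated = [a for a in x if a != 0]
        mean = sum(rated) / len(rated)
        return [a - mean if a != 0 else 0.0 for a in x]
    cu, cv = center(u), center(v)
    dot = sum(a * b for a, b in zip(cu, cv))
    norm_u = math.sqrt(sum(a * a for a in cu))
    norm_v = math.sqrt(sum(a * a for a in cv))
    return dot / (norm_u * norm_v)

# same hypothetical vectors as before; centering shrinks the apparent
# similarity because the shared item is rated close to both users' means
print(centered_cosine([4, 0, 0, 5, 1], [5, 5, 4, 0, 0]))  # ≈ 0.09
```

Centering makes a 0 for "unrated" coincide with an average opinion rather than a terrible one, and it also normalizes away "tough" versus "generous" raters.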
Rating Prediction

In the first condition, we find the average rating from the neighbors.
Rating Prediction
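One common form of this prediction is a similarity-weighted average of the neighbors' ratings, sketched here with hypothetical neighbors:

```python
def predict_rating(neighbor_ratings, similarities):
    """Similarity-weighted average of the neighbors' ratings for one item."""
    num = sum(s * r for s, r in zip(similarities, neighbor_ratings))
    den = sum(abs(s) for s in similarities)
    return num / den

# hypothetical: three similar users rated the item 4, 5 and 3,
# with similarities 0.9, 0.7 and 0.4 to the target user
print(predict_rating([4, 5, 3], [0.9, 0.7, 0.4]))  # ≈ 4.15
```

The plain (unweighted) average is the special case where all similarities are equal.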
Item-Item CF
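Item-item CF applies the same similarity machinery to the columns of the rating matrix instead of the rows. A minimal sketch over a hypothetical toy matrix:

```python
import math

# hypothetical user-item rating matrix: rows = users, columns = items, 0 = unrated
R = [
    [5, 0, 4, 0],
    [4, 2, 5, 1],
    [1, 5, 0, 4],
]

def item_cosine(R, i, j):
    """Cosine similarity between two item columns of the rating matrix."""
    col_i = [row[i] for row in R]
    col_j = [row[j] for row in R]
    dot = sum(a * b for a, b in zip(col_i, col_j))
    norm_i = math.sqrt(sum(a * a for a in col_i))
    norm_j = math.sqrt(sum(a * a for a in col_j))
    return dot / (norm_i * norm_j)

print(round(item_cosine(R, 0, 2), 2))  # items 0 and 2 are rated alike → 0.96
```

To recommend for a user, one would rate each candidate item by a similarity-weighted average over the items that user has already rated, which scales better when items are far fewer than users.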
Thank You
