UNIT - II - Data Mining Essentials
Community Analysis
• Ratio - Ratio features, as the name suggests, add the properties of
multiplication and division. An individual's income is an example of a ratio feature.
Data
• Each document i is represented as a vector d_i = (w1,i, w2,i, ..., wN,i),
where wj,i represents the weight for word j that occurs in document i
and N is the number of words used for vectorization.
Vector space model
• To compute wj,i, one simple approach is to use binary weights:
• Set wj,i to 1 when word j exists in document i
• Set wj,i to 0 when word j does not exist in document i
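The binary weighting rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `binary_vectorize`, the whitespace tokenization, and the three sample documents are all assumptions made for the example.

```python
def binary_vectorize(documents):
    """Return (vocabulary, vectors) using binary weights.

    Hypothetical helper for illustration only.
    """
    # Assumption: naive tokenization (lowercase, split on whitespace).
    tokenized = [doc.lower().split() for doc in documents]
    # The vocabulary holds the N distinct words, sorted for a stable order.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        present = set(tokens)
        # The weight is 1 when word j exists in document i, otherwise 0.
        vectors.append([1 if word in present else 0 for word in vocabulary])
    return vocabulary, vectors


# Made-up sample documents.
docs = ["social media mining", "media data", "data mining"]
vocabulary, vectors = binary_vectorize(docs)
print(vocabulary)
for vector in vectors:
    print(vector)
```

Each printed row is one document vector of length N, with one position per vocabulary word. Binary weights record only whether a word is present, not how often it occurs or how rare it is across documents.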
• Another approach is the term frequency-inverse document frequency
(TF-IDF) weighting scheme.
• In this scheme, wj,i is calculated as,