UFE Lecture-1 Overview Data Intelligence
UFE – AIF321
Spring semester
Lecture 1 – Overview
● Types of Data
● Data Quality
● Data Preprocessing
Attributes and Objects
● Attribute is also known as
variable, field, characteristic,
dimension, or feature
● A collection of attributes
describes an object
– Object is also known as
record, point, case, sample,
entity, or instance
Attribute Values
● Discrete Attribute
– Has only a finite or countably infinite set of values
– Examples: zip codes, counts, or the set of words in a
collection of documents
– Often represented as integer variables.
– Note: binary attributes are a special case of discrete
attributes
● Continuous Attribute
– Has real numbers as attribute values
– Examples: temperature, height, or weight.
– Practically, real values can only be measured and
represented using a finite number of digits.
– Continuous attributes are typically represented as
floating-point variables.
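A minimal sketch of the discrete/continuous distinction above, assuming the attribute's kind can be read off the Python type of its values (int for discrete, float for continuous). The records and field names are hypothetical, and the type-based rule is only a heuristic: a float column could still hold discrete values.

```python
# Hypothetical records: the field names are illustrative, not from the lecture.
records = [
    {"zip_code": 14200, "num_visits": 3, "is_member": 1, "height_cm": 172.4},
    {"zip_code": 15160, "num_visits": 0, "is_member": 0, "height_cm": 165.0},
]


def attribute_kind(values):
    """Classify an attribute by the Python type of its values (a heuristic)."""
    if all(isinstance(v, int) for v in values):
        # Binary attributes are a special case of discrete attributes.
        return "binary" if set(values) <= {0, 1} else "discrete"
    # Anything stored as floating point is treated as continuous here.
    return "continuous"


kinds = {name: attribute_kind([r[name] for r in records]) for name in records[0]}
print(kinds)
```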
01/27/2021 Introduction to Data Mining, 2nd Edition 23
Tan, Steinbach, Karpatne, Kumar
Critiques of the attribute categorization
– Sparsity
◆ Only presence counts
– Resolution
◆ Patterns depend on the scale
– Size
◆ Type of analysis may depend on size of data
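The sparsity point above (only presence counts) can be sketched as follows: when most entries of a record are zero, it is enough to store the non-zero ones. The vocabulary and counts are made up for illustration.

```python
# Hypothetical document-term counts; most entries are zero.
vocabulary = ["data", "mining", "graph", "sequence", "noise"]
dense_row = [2, 0, 0, 1, 0]

# Sparse form: keep only the terms that are present (non-zero counts).
sparse_row = {vocabulary[i]: c for i, c in enumerate(dense_row) if c != 0}
print(sparse_row)  # {'data': 2, 'sequence': 1}
```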
● Sequences of transactions
– Each element of the sequence is a set of items/events
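One possible way to hold a sequence of transactions in code is an ordered list whose elements are sets of items/events. This is a sketch under that assumption; the item names are hypothetical.

```python
# Hypothetical sequence for one customer; item names are illustrative.
sequence = [
    {"bread", "milk"},  # one element of the sequence: a set of items/events
    {"diapers"},
    {"milk", "beer"},
]

# Order matters between elements, but not within an element.
# Record the position of the element in which each item first appears.
first_seen = {}
for position, element in enumerate(sequence):
    for item in element:
        first_seen.setdefault(item, position)

print(sorted(first_seen.items()))
```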
Ordered Data
● Spatio-Temporal Data
– Average monthly temperature of land and ocean
● Causes?
Missing Values
Duplicate Data
● Examples:
– Same person with multiple email addresses
● Data cleaning
– Process of dealing with duplicate data issues
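A minimal sketch of cleaning the duplicate case above (same person with multiple email addresses). It assumes, purely for illustration, that a normalized name plus birth date identifies one person; real data would need a more careful matching rule. All records are hypothetical.

```python
from collections import defaultdict

# Hypothetical records; names and fields are illustrative.
people = [
    {"name": "Bat Erdene", "birth": "1990-05-01", "email": "bat@example.com"},
    {"name": "bat erdene ", "birth": "1990-05-01", "email": "b.erdene@example.org"},
    {"name": "Saraa Dorj", "birth": "1988-11-12", "email": "saraa@example.com"},
]

# Merge records that refer to the same person, keeping every email address.
merged = defaultdict(set)
for p in people:
    # Assumption: normalized name + birth date identifies one person.
    key = (p["name"].strip().lower(), p["birth"])
    merged[key].add(p["email"])

for key, emails in merged.items():
    print(key, sorted(emails))
```

Three input records collapse to two people; the first person keeps both email addresses.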