Module 7: Mining Object, Spatial, Multimedia, Text, and Web Data
Data Mining
Mining Complex Types of Data
Mining spatial data
Mining image data
Mining text data
Mining the Web
Mining Spatial Databases
Spatial database
Space related data: maps, VLSI layouts, …
Topological and distance information, organized by spatial indexing structures
Spatial data warehousing
Issue: different representations & structures
Dimensions
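Nonspatial: e.g., a temperature of 25-30 degrees generalizes to “hot”
Spatial-to-nonspatial: spatial data generalized to a name, e.g., “New York”, “western provinces”
Spatial-to-spatial: e.g., equi-temperature regions generalize to the 0-5 degree region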
Measures
numerical
Spatial: collection of spatial pointers (0-5 degree region)
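The difference between a numerical measure and a spatial measure can be sketched with a small roll-up. This is an illustrative sketch only: the probe records, region ids, the `temperature_band` helper, and the 5-degree band width are assumptions, not part of the original material.

```python
from collections import defaultdict

# Hypothetical probe records: (region_id, place, temperature in degrees).
PROBES = [
    ("r1", "Vancouver", 3.0),
    ("r2", "Victoria", 4.5),
    ("r3", "Kelowna", 27.0),
    ("r4", "Kamloops", 29.0),
    ("r5", "Prince George", 12.0),
]

def temperature_band(temp, width=5):
    """Generalize a temperature to a band label such as '0-5'."""
    low = int(temp // width) * width
    return f"{low}-{low + width}"

def roll_up(probes):
    """Group probes by temperature band.

    Each cell keeps a numerical measure (count) and a spatial measure
    (a collection of pointers to the primitive regions in the cell).
    """
    cells = defaultdict(lambda: {"count": 0, "regions": []})
    for region_id, _place, temp in probes:
        cell = cells[temperature_band(temp)]
        cell["count"] += 1
        cell["regions"].append(region_id)
    return dict(cells)

cube = roll_up(PROBES)
```

Here the cell for the "0-5" band stores only pointers to regions `r1` and `r2`; the geometric merge of those regions is a separate step that this sketch does not perform.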
Example: BC Weather Pattern Analysis
Input
A map with about 3,000 weather probes scattered in B.C.
Daily data for temperature, wind velocity, etc.
Concept hierarchies for all attributes
Output
A map that reveals patterns: merged (similar) regions
Goals
Interactive analysis (drill-down, slice, dice, pivot, roll-up)
Fast response time; minimal storage space
Challenge
A merged region may contain hundreds of “primitive” regions (polygons)
Spatial Merge
Precomputing all merged regions: too much storage space
On-line merge: very expensive
Spatial Association Analysis
Spatial association rule: A ⇒ B [s%, c%], where s% is the support and c% is the confidence
A and B are sets of spatial or nonspatial predicates
Topological relations: intersects, overlaps, disjoint, etc.
Spatial orientations: left_of, west_of, under, etc.
Distance information: close_to, within_distance, etc.
Example
is_a(x, “school”) ^ close_to(x, “sports_center”) ⇒ close_to(x, “park”) [7%, 85%]
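The support and confidence of such a rule can be computed by counting objects that satisfy the predicates. The sketch below assumes each object is already described by a set of predicate strings; the data and the `rule_stats` helper are hypothetical and do not reproduce the 7% and 85% figures above.

```python
# Hypothetical objects, each described by the set of predicates that hold for it.
OBJECTS = [
    {"is_a:school", "close_to:sports_center", "close_to:park"},
    {"is_a:school", "close_to:sports_center", "close_to:park"},
    {"is_a:school", "close_to:sports_center"},
    {"is_a:school", "close_to:park"},
    {"is_a:mall", "close_to:park"},
]

def rule_stats(objects, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent.

    support    = fraction of all objects satisfying both sides
    confidence = fraction of antecedent-satisfying objects that also
                 satisfy the consequent
    """
    n_a = sum(1 for o in objects if antecedent <= o)
    n_ab = sum(1 for o in objects if (antecedent | consequent) <= o)
    support = n_ab / len(objects)
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence

support, confidence = rule_stats(
    OBJECTS,
    antecedent={"is_a:school", "close_to:sports_center"},
    consequent={"close_to:park"},
)
```

With this toy data, two of five objects satisfy both sides and three satisfy the antecedent, so the rule has 40% support and roughly 67% confidence.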
Progressive Refinement
First search for rough relationships (e.g., g_close_to as a generalization of close_to, touch, and intersect) using a rough evaluation (e.g., MBR, the minimum bounding rectangle)
Then apply the detailed algorithm only to those objects that have passed the rough test
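The two-step filter can be sketched as follows. This is a minimal illustration under stated assumptions: objects are represented by their vertex lists, the "exact" test is simplified to the smallest vertex-to-vertex distance, and all names and coordinates are hypothetical. Because the MBR distance never exceeds the vertex distance, the rough step cannot discard a pair that the detailed step would accept.

```python
import math

def mbr(points):
    """Minimum bounding rectangle as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

def mbr_distance(a, b):
    """Smallest distance between two axis-aligned rectangles (0 if they overlap)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    dx = max(bx1 - ax2, ax1 - bx2, 0.0)
    dy = max(by1 - ay2, ay1 - by2, 0.0)
    return math.hypot(dx, dy)

def exact_distance(p, q):
    """Simplified detailed test: smallest vertex-to-vertex distance."""
    return min(math.dist(u, v) for u in p for v in q)

def close_pairs(objects, threshold):
    """Step 1: cheap MBR filter. Step 2: detailed test on the survivors only."""
    boxes = {name: mbr(pts) for name, pts in objects.items()}
    names = sorted(objects)
    candidates = [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if mbr_distance(boxes[a], boxes[b]) <= threshold
    ]
    refined = [
        (a, b)
        for a, b in candidates
        if exact_distance(objects[a], objects[b]) <= threshold
    ]
    return candidates, refined

# Hypothetical objects given by their vertices.
OBJECTS = {
    "school": [(0, 0), (2, 0), (0, 2)],
    "park": [(3, 3), (5, 3), (5, 5)],
    "sports_center": [(2.5, 0), (4, 0), (4, 1)],
}
candidates, refined = close_pairs(OBJECTS, threshold=1.5)
```

In this example the park and the school pass the rough MBR test but fail the detailed one, while the park and the sports center are discarded by the rough test alone and never reach the expensive step.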
Spatial Classification
Spatial classification
Analyze spatial objects to derive classification schemes, such as decision trees, in relevance to certain spatial properties
Example
Classify regions into rich vs. poor
Properties: containing university, containing highway, near ocean, etc.
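One common way to choose the root test of such a decision tree is information gain, as in ID3-style learners; the material above does not say which criterion is intended, so treat this as one possible sketch. The training regions and their labels are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical training regions: (spatial properties, class label).
REGIONS = [
    ({"contains_university": True, "contains_highway": True, "near_ocean": True}, "rich"),
    ({"contains_university": True, "contains_highway": False, "near_ocean": False}, "rich"),
    ({"contains_university": True, "contains_highway": True, "near_ocean": False}, "rich"),
    ({"contains_university": False, "contains_highway": True, "near_ocean": True}, "poor"),
    ({"contains_university": False, "contains_highway": False, "near_ocean": False}, "poor"),
    ({"contains_university": False, "contains_highway": True, "near_ocean": False}, "poor"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute):
    """Entropy reduction obtained by splitting rows on a boolean attribute."""
    labels = [label for _, label in rows]
    gain = entropy(labels)
    for value in (True, False):
        subset = [label for props, label in rows if props[attribute] is value]
        if subset:
            gain -= len(subset) / len(rows) * entropy(subset)
    return gain

def best_split(rows):
    """Pick the spatial property with the highest information gain."""
    attributes = rows[0][0].keys()
    return max(attributes, key=lambda a: information_gain(rows, a))

root = best_split(REGIONS)
```

In this toy data set `contains_university` separates the two classes perfectly, so it is selected as the root test; the other two properties yield no gain.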
Spatial Cluster Analysis
Constraint-based clustering
Selection of relevant objects before clustering
Parameters as constraints
K-means: number of clusters; density-based: radius, min points
Clustering with obstructed distance
[Figure: clusters C1, C2, C3, and C4 separated by obstacles: a river crossed by a bridge, and a mountain]
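The effect of an obstacle on cluster assignment can be sketched with a simplified distance function. The assumptions here are loud ones: the river is modeled as the vertical line x = 0, it can be crossed only at a single bridge point, and the cluster centers are hypothetical. Real obstructed-distance computations work on arbitrary obstacle polygons.

```python
import math

# Assumed geometry: the river runs along x = 0 and has one bridge.
RIVER_X = 0.0
BRIDGE = (0.0, 0.0)

def west_of_river(p):
    """True if the point lies on the west bank (x < RIVER_X)."""
    return p[0] < RIVER_X

def obstructed_distance(p, q):
    """Euclidean distance on the same bank; otherwise the detour over the bridge."""
    if west_of_river(p) == west_of_river(q):
        return math.dist(p, q)
    return math.dist(p, BRIDGE) + math.dist(BRIDGE, q)

def nearest_center(point, centers, distance):
    """Name of the cluster center closest to point under the given distance."""
    return min(centers, key=lambda name: distance(point, centers[name]))

# Hypothetical cluster centers: C1 on the west bank, C2 on the east bank.
CENTERS = {"C1": (-1.0, 10.0), "C2": (4.0, 10.0)}
point = (1.0, 10.0)

by_euclidean = nearest_center(point, CENTERS, math.dist)
by_obstructed = nearest_center(point, CENTERS, obstructed_distance)
```

With straight-line distance the point is assigned to C1 across the river, but once the detour over the bridge is taken into account it is assigned to C2 on its own bank.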