DV - Unit 1 - L2
Data Abstraction
Dataset Types
• A dataset is any collection of information that is the target of analysis.
• The four basic dataset types are:
1. Tables
2. Networks
3. Fields
4. Geometry
• Other ways to group items together include:
– Clusters
– Sets
– Lists
Dataset types - TABLES
• flat table
– one item per row
– each column is an attribute
– each cell holds the value for an item-attribute pair
– unique key (could be implicit)
• example: each item is a person; attributes: name, age, shirt size, fave fruit
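As a rough illustration of the flat table described above, the sketch below stores one item (a person) per row and one attribute per column. The sample values and the helper name `cell` are hypothetical, and the row index is assumed to act as the implicit unique key.

```python
# Hypothetical sample data; only the attribute names come from the slide.
people = [
    {"name": "Amy", "age": 8, "shirt_size": "S", "fave_fruit": "Apple"},
    {"name": "Basil", "age": 7, "shirt_size": "S", "fave_fruit": "Pear"},
    {"name": "Clara", "age": 9, "shirt_size": "M", "fave_fruit": "Durian"},
]

# Each column of the table is one attribute.
attributes = list(people[0].keys())


def cell(table, key, attribute):
    """Return the value for one item-attribute pair.

    The key here is the implicit row index (an assumption for this sketch).
    """
    return table[key][attribute]


print(cell(people, 1, "fave_fruit"))
```

A list of dictionaries keeps the sketch dependency-free; a dataframe library would serve the same purpose for larger tables.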
[Figure: Table, with labels for item, attribute, and cell]
Dataset types - TABLES
• multidimensional tables
– indexing based on multiple keys
– e.g. genes, patients
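One possible way to picture indexing on multiple keys is a mapping whose key is a (gene, patient) pair. The values, identifiers, and function names below are hypothetical and only meant as a sketch.

```python
# Hypothetical measurements indexed by two keys: (gene, patient).
expression = {
    ("geneA", "patient1"): 0.8,
    ("geneA", "patient2"): 1.3,
    ("geneB", "patient1"): 2.1,
    ("geneB", "patient2"): 0.4,
}


def lookup(table, gene, patient):
    """Return the cell value identified by both keys together."""
    return table[(gene, patient)]


def slice_by_gene(table, gene):
    """Fix one key and return the remaining patient -> value mapping."""
    return {p: value for (g, p), value in table.items() if g == gene}


print(lookup(expression, "geneB", "patient1"))
print(slice_by_gene(expression, "geneA"))
```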
Dataset types - Networks and Trees
• A network is well suited for specifying that there is some kind of relationship between two or more items.
• Node: an item in a network
• Link: a relation between two items
• network/graph
– nodes (vertices) connected by links (edges)
– tree is a special case: no cycles
– trees often have roots and are directed
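The node, link, and tree definitions above can be sketched with an adjacency mapping. This is a minimal example that assumes undirected links; the function names are illustrative, not taken from the slides.

```python
def build_adjacency(nodes, links):
    """Map each node to the set of nodes it is linked to (undirected)."""
    adjacency = {node: set() for node in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def is_tree(nodes, links):
    """Return True if the network is connected and has no cycles."""
    if not nodes:
        return False
    # A connected network without cycles has exactly len(nodes) - 1 links.
    if len(links) != len(nodes) - 1:
        return False
    adjacency = build_adjacency(nodes, links)
    # Depth-first traversal to check that every node is reachable.
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(nodes)


nodes = ["a", "b", "c", "d"]
tree_links = [("a", "b"), ("a", "c"), ("c", "d")]
print(is_tree(nodes, tree_links))
print(is_tree(nodes, tree_links + [("b", "c")]))
```

Adding the extra link between "b" and "c" closes a cycle, so the second network is a general graph rather than a tree.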
Dataset types - Fields
• The field dataset type also contains attribute values associated with cells.
• Each cell in a field contains measurements or calculations from a continuous domain.
Dataset types - Fields
For example, consider a field dataset representing a
medical scan of a human body containing
measurements indicating the density of tissue at many
sample points, spread regularly throughout a volume
of 3D space. A low-resolution scan would have
262,144 cells, providing information about a cubical
volume of space with 64 bins in each direction. Each
cell is associated with a specific region in 3D space.
The density measurements could be taken closer
together with a higher resolution grid of cells, or
further apart for a coarser grid.
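The cell count in the example follows from 64 × 64 × 64 = 262,144. The sketch below checks that arithmetic and shows one way a cell index could map to a region of 3D space; the `extent` parameter (side length of the cubical volume) and the function names are assumptions made for illustration.

```python
BINS = 64  # bins in each direction, as in the example above


def cell_count(bins):
    """Number of cells in a cubical grid with `bins` bins per direction."""
    return bins ** 3


def cell_region(i, j, k, bins=BINS, extent=1.0):
    """Return ((xmin, xmax), (ymin, ymax), (zmin, zmax)) for cell (i, j, k).

    `extent` is a hypothetical side length of the scanned volume.
    """
    size = extent / bins  # edge length of one cell
    return tuple((index * size, (index + 1) * size) for index in (i, j, k))


print(cell_count(BINS))
print(cell_count(2 * BINS))  # halving the sample spacing gives 8x the cells
print(cell_region(1, 2, 3, bins=4))
```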
Spatial fields
• attribute values associated with cells
• cell contains value from continuous domain
– e.g. temperature, pressure, wind velocity
• measured or simulated
• major concerns
– sampling: where attributes are measured
– interpolation: how to model attributes elsewhere
– grid types
• major divisions
– attributes per cell: scalar (1), vector (2), tensor (many)
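To make the sampling and interpolation concerns concrete, here is a minimal one-dimensional sketch. It assumes piecewise-linear interpolation, which is only one of many possible models, and the sample positions and temperature values are hypothetical.

```python
def interpolate_linear(positions, values, x):
    """Estimate the attribute at x from samples taken at sorted positions.

    Uses piecewise-linear interpolation between the two nearest samples.
    """
    if len(positions) < 2 or len(positions) != len(values):
        raise ValueError("need at least two samples, one value per position")
    if not positions[0] <= x <= positions[-1]:
        raise ValueError("x lies outside the sampled range")
    for left in range(len(positions) - 1):
        x0, x1 = positions[left], positions[left + 1]
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)  # fraction of the way from x0 to x1
            return values[left] + t * (values[left + 1] - values[left])


# Sampling: where the attribute was measured (hypothetical values).
sample_positions = [0.0, 10.0, 20.0]
temperatures = [15.0, 25.0, 20.0]

# Interpolation: modelling the attribute between the samples.
print(interpolate_linear(sample_positions, temperatures, 5.0))
print(interpolate_linear(sample_positions, temperatures, 15.0))
```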
Dataset types - Grid Types
• When a field contains data created by sampling at completely regular intervals, as in the previous example, the cells form a uniform grid.
• For a uniform grid, there is no need to explicitly store the grid geometry in terms of its location in space, or the grid topology in terms of how each cell connects with its neighboring cells.
• More complicated examples require storing different amounts of geometric and topological information about the underlying grid.
• A rectilinear grid supports nonuniform sampling, allowing efficient storage of information that has high complexity in some areas and low complexity in others, at the cost of storing some information about the geometric location of each row.
• A structured grid allows curvilinear shapes, where the geometric location of each cell needs to be specified.
• Unstructured grids provide complete flexibility, but the topological information about how the cells connect must be stored explicitly.
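The storage trade-off between uniform and rectilinear grids can be sketched along a single axis. The variable and function names are hypothetical: the uniform grid is described by an origin, a spacing, and a count, while the rectilinear grid has to store its sample coordinates explicitly.

```python
def uniform_positions(origin, spacing, count):
    """Derive sample positions from three numbers; nothing else is stored."""
    return [origin + i * spacing for i in range(count)]


def spacings(positions):
    """Distances between consecutive sample positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Uniform grid: geometry is implicit in (origin, spacing, count).
uniform_x = uniform_positions(0.0, 0.5, 5)

# Rectilinear grid: coordinates are stored explicitly, so samples can be
# dense where the data is complex and sparse elsewhere.
rectilinear_x = [0.0, 0.25, 0.5, 1.0, 2.0]

print(spacings(uniform_x))
print(spacings(rectilinear_x))
```

Structured and unstructured grids would need still more stored information (per-cell positions and, for unstructured grids, explicit connectivity), which this sketch does not attempt to model.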
Dataset types - Geometry
The geometry dataset type specifies information about the shape of items with explicit spatial positions.
Geometry
• shape of items
• explicit spatial positions / regions
– points, lines, curves, surfaces, volumes
• boundary between computer graphics and visualization
– graphics: geometry taken as given
– vis: geometry is result of a design decision
Dataset types - Other Combinations
• There are many ways to group multiple items together: sets, lists, and clusters.
– Set: an unordered group of items
– List: a group of items with a specified ordering
– Cluster: a grouping based on attribute similarity
• More complex structures
– Path: a path through a network is an ordered set of segments formed by links connecting nodes
– Compound network: a network with an associated tree, where all of the nodes in the network are the leaves of the tree, and interior nodes in the tree provide a hierarchical structure for the nodes that is different from the network links between them
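The groupings above map loosely onto familiar data structures. In this sketch, clustering is simplified to grouping items that share the same attribute value (real clustering usually relies on a similarity measure), and all names and sample data are hypothetical.

```python
# Set: unordered group of items.
fruit_set = {"apple", "pear", "durian"}

# List: group of items with a specified ordering.
fruit_ranking = ["durian", "apple", "pear"]


def cluster_by(items, attribute):
    """Group item names by equal attribute value (a simplified similarity)."""
    clusters = {}
    for item in items:
        clusters.setdefault(item[attribute], []).append(item["name"])
    return clusters


def is_path(links, node_sequence):
    """Check that consecutive nodes in the sequence are joined by a link."""
    link_set = {frozenset(link) for link in links}
    steps = zip(node_sequence, node_sequence[1:])
    return all(frozenset(step) in link_set for step in steps)


people = [
    {"name": "Amy", "shirt_size": "S"},
    {"name": "Basil", "shirt_size": "S"},
    {"name": "Clara", "shirt_size": "M"},
]
print(cluster_by(people, "shirt_size"))
print(is_path([("a", "b"), ("b", "c")], ["a", "b", "c"]))
```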
Dataset Availability
SKASC 27