GIS Data Types and Data Models
Although the two terms, data and information, are often used indiscriminately, each has a specific meaning. Data can be described as different observations, which are collected and stored. Information is that data which is useful in answering queries or solving a problem. Digitizing a large number of maps provides a large amount of data after hours of painstaking work, but the data can only render useful information if it is used in analysis.

GIS DATA TYPES

Attribute Data

The attributes refer to the properties of spatial entities. They are often referred to as non-spatial data since they do not in themselves represent location information. This type of data describes characteristics of the spatial features. These characteristics can be quantitative and/or qualitative in nature. Attribute data is often referred to as tabular data.

Spatial Data

Geographic position refers to the fact that each feature has a location that must be specified in a unique way. To specify the position in an absolute way, a coordinate system is used. For small areas, the simplest coordinate system is the regular square grid. For larger areas, certain approved cartographic projections are commonly used. Internationally there are many different coordinate systems in use. This locational information is provided in maps by using geometric descriptions: points, lines and polygons. These are the basic data elements of a map. Thus spatial data describes the absolute and relative location of geographic features.

The coordinate location of a forest would be spatial data, while the characteristics of that forest, e.g. cover group, dominant species, crown closure, height, etc., would be attribute data. Other data types, in particular image and multimedia data, have become more prevalent with changing technology. Depending on the specific content of the data, image data may be considered either spatial, e.g. photographs, animation, movies, etc., or attribute, e.g. sound, descriptions, narrations, etc.
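The forest example above can be sketched as a small record that keeps the two kinds of data side by side. This is only an illustration: the class name `ForestStand`, its field names, and all sample values are invented here and do not come from any particular GIS package.

```python
from dataclasses import dataclass, field


@dataclass
class ForestStand:
    """Hypothetical record pairing spatial data with attribute data."""
    # Spatial data: boundary vertices as (x, y) pairs in some coordinate system.
    boundary: list
    # Attribute data: non-spatial characteristics of the stand.
    attributes: dict = field(default_factory=dict)


# Invented sample values, for illustration only.
stand = ForestStand(
    boundary=[(0.0, 0.0), (100.0, 0.0), (100.0, 80.0), (0.0, 80.0)],
    attributes={"dominant_species": "pine", "height_m": 18.5, "crown_closure_pct": 70},
)


def is_quantitative(value):
    """Treat plain numbers as quantitative, everything else as qualitative."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


quantitative = {k: v for k, v in stand.attributes.items() if is_quantitative(v)}
qualitative = {k: v for k, v in stand.attributes.items() if not is_quantitative(v)}
```

Here `quantitative` holds the height and crown closure, while `qualitative` holds the species name. Neither dictionary says anything about where the stand is, which is why such data is called non-spatial.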
GIS DATA MODELS
A GIS is based on data, hence there must be a data model that is followed to standardize procedures. There are two kinds:
1. Spatial Data Models
2. Attribute Data Models

Spatial Data Models

Traditionally spatial data has been stored and presented in the form of a map. Three basic types of spatial data models have evolved for storing geographic data digitally. These are referred to as raster, vector and image.

The selection of a particular data model, vector or raster, is dependent on the source and type of data, as well as the intended use of the data. Certain analytical procedures require raster data while others are better suited to vector data.

Raster Data Formats

A simple raster data set is a regular grid of cells divided into rows and columns. In a raster data set, data values for a given parameter are stored in each cell. These values may represent an elevation in meters above sea level, a land use class, a plant biomass in grams per square meter, and so forth. The spatial resolution of the raster data set is determined by the size of the cell. For example, raster data derived from Landsat TM satellite imagery have a cell size of approximately 30 meters on a side. However, spatial resolution can be much finer or much coarser than 30 meters. In general, spatial resolution is a function of the data collection techniques used and the desired outcomes.

[Remote Sensing and Geographic Information System]

The size of cells in a tessellated data structure is selected on the basis of the data accuracy and the resolution needed by the user. There is no explicit coding of geographic coordinates required since that is implicit in the layout of the cells. A raster data structure is in fact a matrix where any coordinate can be quickly calculated if the origin point and the size of the grid cells are known. Since grid cells can be handled as two-dimensional arrays in computer encoding, many analytical operations are easy to program.
This makes tessellated data structures a popular choice for many GIS software packages. Topology is not a relevant concept with tessellated structures since adjacency and connectivity are implicit in the location of a particular cell in the data matrix.

Since geographic data is rarely distinguished by regularly spaced shapes, cells must be classified as to the most common attribute for the cell. The problem of determining the proper resolution for a particular data layer can be a concern. If one selects too coarse a cell size then data may be overly generalized. If one selects too fine a cell size then too many cells may be created, resulting in a large data volume, slower processing times, and a more cumbersome data set. As well, one can imply an accuracy greater than that of the original data capture process, and this may lead to erroneous results during analysis.

As well, since most data is captured in a vector format, e.g. by digitizing, data must be converted to the raster data structure. This is called vector-raster conversion. Most GIS software allows the user to define the raster grid (cell) size for vector-raster conversion. It is imperative that the original scale, i.e. accuracy, of the data be known prior to conversion. The accuracy of the data, often referred to as the resolution, should determine the cell size of the output raster map during conversion.

Most raster-based GIS software requires that the raster cell contain only a single discrete value. Accordingly, a data layer, e.g. forest inventory stands, may be broken down into a series of raster maps, each representing an attribute type, e.g. a species map, a height map, a density map, etc. These are often referred to as one-attribute maps. This is in contrast to most conventional vector data models, which maintain data as multiple-attribute maps.
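Two pieces of arithmetic from the raster discussion above can be made concrete: cell coordinates follow from the origin and the cell size, and the number of cells grows quickly as the cell size shrinks. The sketch below is a minimal illustration under stated assumptions: the class `RasterGrid` and its field names are hypothetical, the origin is taken to be the upper-left corner, rows and columns are counted from 0, and the extent and coordinates are invented sample values.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RasterGrid:
    """Hypothetical grid definition: upper-left origin plus a square cell size."""
    origin_x: float   # x of the upper-left corner (assumed origin)
    origin_y: float   # y of the upper-left corner
    width: float      # east-west extent, in map units
    height: float     # north-south extent, in map units
    cell_size: float  # side length of one square cell, in map units

    @property
    def shape(self):
        """(rows, cols) needed to cover the extent."""
        rows = math.ceil(self.height / self.cell_size)
        cols = math.ceil(self.width / self.cell_size)
        return rows, cols

    @property
    def cell_count(self):
        rows, cols = self.shape
        return rows * cols

    def cell_center(self, row, col):
        """Center (x, y) of cell (row, col); y decreases as the row index grows."""
        x = self.origin_x + (col + 0.5) * self.cell_size
        y = self.origin_y - (row + 0.5) * self.cell_size
        return x, y


# The same 30 km x 30 km extent at two cell sizes (sample values).
coarse = RasterGrid(500000.0, 4200000.0, 30000.0, 30000.0, 30.0)
fine = RasterGrid(500000.0, 4200000.0, 30000.0, 30000.0, 10.0)

print(coarse.shape, coarse.cell_count)  # 1000 x 1000 grid, 1,000,000 cells
print(fine.shape, fine.cell_count)      # 3000 x 3000 grid, 9,000,000 cells
print(coarse.cell_center(2, 3))         # (500105.0, 4199925.0)
```

Cutting the cell size to one third multiplies the cell count by nine, which is the data volume concern raised above. No per-cell coordinates are stored anywhere: `cell_center` derives them on demand.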
0 1 1
1 0 1 1
1 1 1
1 1 0 1
A Simple Raster Data Set: Each cell in the raster is assigned a single data value. In the above example simple binary data values have been used, meaning that the possibilities are limited to two values, either 0 or 1. This is an example of a 1-bit raster data file. Mathematically, there are only two possibilities for each pixel, 0 or 1. By contrast, in an 8-bit data file there are 256 possible data values for each pixel. In the above example, the computer "sees" the cells that contain 0 as "turned off" and the cells that contain 1 as "turned on".

The horizontal dimension of raster data is often oriented parallel to the east-west direction. Following image processing convention, raster cells are numbered beginning on the left margin of the raster. Further, the positions of cells in the vertical dimension are numbered starting from the top or northern boundary. Thus, the origin of the raster is in the upper left corner. This location is most often referenced as (1,1). It is important to note that this referencing system is different from more traditional geo-referencing systems that are based on Cartesian geometry, where the origin is in the lower left corner and is typically referenced as (0,0).
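The two referencing systems just described differ only by a shift and a flip of the vertical axis, so converting between them is a one-line calculation each way. The sketch below assumes the raster reference is written as (row, column) starting at (1,1) in the upper left and the Cartesian reference as (x, y) starting at (0,0) in the lower left. The function names are hypothetical.

```python
def raster_to_cartesian(row, col, n_rows):
    """Convert a 1-based (row, col) counted from the upper left
    into a 0-based (x, y) counted from the lower left."""
    x = col - 1
    y = n_rows - row
    return x, y


def cartesian_to_raster(x, y, n_rows):
    """Inverse of raster_to_cartesian."""
    row = n_rows - y
    col = x + 1
    return row, col


# In a grid with 4 rows, the raster origin (1,1) is the top-left cell,
# which sits at x = 0, y = 3 in the Cartesian numbering.
print(raster_to_cartesian(1, 1, n_rows=4))  # (0, 3)
print(raster_to_cartesian(4, 4, n_rows=4))  # (3, 0)
print(cartesian_to_raster(0, 0, n_rows=4))  # (4, 1)
```

Only the number of rows is needed, because the column direction is the same in both systems.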
Raster numbering (row, column), origin at upper left:

1,1  1,2  1,3  1,4
2,1  2,2  2,3  2,4
3,1  3,2  3,3  3,4
4,1  4,2  4,3  4,4

Cartesian numbering (x, y), origin at lower left:

0,3  1,3  2,3  3,3
0,2  1,2  2,2  3,2
0,1  1,1  2,1  3,1
0,0  1,0  2,0  3,0

The selection of a particular data structure can provide advantages during the analysis stage. For example, the vector data model does not handle continuous data, e.g. elevation, very well, while the raster data model is better suited to this type of analysis. Conversely, the raster structure does not handle linear data analysis, e.g. shortest path, very well, while vector systems do. It is important for the user to understand that there are certain advantages and disadvantages to each data model.

Advantages of Raster Data

1. The geographic location of each cell is implied by its position in the cell matrix. Accordingly, other than an origin point, e.g. bottom left corner, no geographic coordinates are stored.
2. Due to the nature of the data storage technique, data analysis is usually easy to program and quick to perform.
3. The inherent nature of raster maps, e.g. one-attribute maps, is ideally suited for mathematical modelling and quantitative analysis.
4. Discrete data, e.g. forestry stands, is accommodated equally as well as continuous data, e.g. elevation data, which facilitates integrating the two data types.

Disadvantages of Raster Data

1. The cell size determines the resolution at which the data is represented.
2. It is especially difficult to adequately represent linear features, depending on the cell resolution. Accordingly, network linkages are difficult to establish.
3. Processing of associated attribute data may be cumbersome if large amounts of data exist. Raster maps inherently reflect only one attribute or characteristic for an area.
4. Since most input data is in vector form, data must undergo vector-to-raster conversion. Besides increased processing requirements, this may introduce data integrity concerns due to generalization and the choice of an inappropriate cell size.
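The claim that one-attribute raster maps suit quantitative analysis, and that discrete and continuous data integrate easily, can be illustrated with two small grids that share the same layout. The sketch below computes the mean elevation per forest stand by walking both grids cell by cell. The grids and their values are invented for illustration, and `zonal_mean` is a hypothetical helper, not a function from any GIS library.

```python
# Discrete one-attribute map: stand identifier per cell (sample values).
stands = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]

# Continuous one-attribute map: elevation in meters per cell (sample values).
elevation = [
    [100.0, 110.0, 150.0],
    [105.0, 140.0, 160.0],
    [90.0, 95.0, 170.0],
]


def zonal_mean(zones, values):
    """Mean of `values` for each distinct zone id in `zones`.

    Both grids must have the same number of rows and columns, so that
    cell (r, c) in one grid covers the same ground as (r, c) in the other.
    """
    totals, counts = {}, {}
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            totals[zone] = totals.get(zone, 0.0) + value
            counts[zone] = counts.get(zone, 0) + 1
    return {zone: totals[zone] / counts[zone] for zone in totals}


print(zonal_mean(stands, elevation))  # {1: 105.0, 2: 155.0, 3: 92.5}
```

The overlay needs no geometry at all: matching row and column positions is enough, which is why such operations are easy to program on raster data.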
Vector Data Models

The vector data model is based upon vectors, as opposed to the space occupancy of raster data structures. The fundamental primitive of the vector model is a point. The various objects are created by connecting the points with straight lines, but some systems allow the points to be connected using arcs of circles. Areas are defined in this model by sets of lines. The term polygon is synonymous with area in vector databases because of the use of straight-line connections between points. Very large vector databases have been built for different purposes, as vectors dominate in fields such as transportation, utility and marketing applications.
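Since every vector object is built from points joined by straight lines, basic measurements follow directly from the vertex lists. The sketch below is a minimal illustration with invented coordinates. It measures a line as the sum of its segment lengths and a polygon with the shoelace formula, which is a standard technique chosen here for illustration and is not named in the text above.

```python
import math


def line_length(vertices):
    """Total length of a line given as an ordered list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(vertices):
    """Area of a simple polygon given as an ordered ring of (x, y) points.

    Uses the shoelace formula; the ring is closed implicitly by joining
    the last vertex back to the first.
    """
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


point = (3.0, 4.0)                                       # a single location
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]             # a line: connected points
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]  # a polygon: closed ring

print(line_length(road))     # 11.0
print(polygon_area(parcel))  # 12.0
```

Every vertex is stored explicitly, in contrast to the raster model, where cell positions are implied by the grid layout.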
[Figure: Vector and Raster]
Several different vector data models exist, however only two are commonly used in GIS data storage. The topologic data structure is often referred to as an intelligent data structure because spatial relationships between geographic features are easily derived when using it. Primarily for this reason the topologic model is the dominant vector data structure currently used in GIS technology. Many of the complex data analysis functions cannot effectively be undertaken without a topologic vector data structure.

The secondary vector data structure that is common among GIS software is the computer-aided drafting (CAD) data structure. This structure consists of listing elements, not features, defined by strings of vertices, to define geographic features, e.g. points, lines, or areas. There is considerable redundancy with this data model since the boundary segment between two polygons can be stored twice, once for each feature. The CAD structure emerged from the development of computer graphics systems without specific consideration of processing geographic features. Accordingly, since features, e.g. polygons, are self-contained and independent, questions about the adjacency of features can be difficult to answer. The CAD vector model lacks the definition of spatial relationships between features that is provided by the topologic data model.

Advantages of Vector Data

1. Data can be represented at its original resolution without generalization.
2. Graphic output is usually more aesthetically pleasing.
3. Since most data, e.g. hard copy maps, is in vector form, no conversion is required.
4. Accurate geographic location of data is maintained.
5. Allows for efficient encoding of topology, and as a result more efficient operations that require topological information, e.g. proximity, network analysis.

Disadvantages of Vector Data

1. The location of each vertex needs to be stored explicitly.
2. Algorithms for manipulative and analysis functions are complex and may be processing intensive.
3. Continuous data, such as elevation data, is not effectively represented in vector form.
4. Spatial analysis and filtering within polygons is impossible.
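The difference between the CAD and topologic structures described above can be sketched with two adjacent square polygons. In the CAD-style listing each polygon carries its own vertices, so the shared boundary appears twice. In the topologic listing each boundary arc is stored once together with the polygon on its left and right, and adjacency falls out of a simple lookup. The dictionary layout, the key names, and the `neighbors` helper are all assumptions made for illustration and do not reproduce any specific GIS file format.

```python
# CAD-style: every polygon is self-contained, listing all of its own vertices.
cad_polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}

# Vertices of the shared boundary are stored once per polygon, i.e. twice.
duplicated = set(cad_polygons["A"]) & set(cad_polygons["B"])

# Topologic style: each arc is stored once, with the polygon on its left and
# right when travelling from the first vertex to the last (None = outside).
arcs = [
    {"id": 1, "vertices": [(2, 0), (2, 2)], "left": "A", "right": "B"},
    {"id": 2, "vertices": [(2, 2), (0, 2), (0, 0), (2, 0)], "left": "A", "right": None},
    {"id": 3, "vertices": [(2, 0), (4, 0), (4, 2), (2, 2)], "left": "B", "right": None},
]


def neighbors(arc_list, polygon_id):
    """Polygons that share at least one arc with `polygon_id`."""
    found = set()
    for arc in arc_list:
        sides = (arc["left"], arc["right"])
        if polygon_id in sides:
            for other in sides:
                if other is not None and other != polygon_id:
                    found.add(other)
    return found


print(sorted(duplicated))    # [(2, 0), (2, 2)]
print(neighbors(arcs, "A"))  # {'B'}
```

Answering the same adjacency question from `cad_polygons` alone would require comparing the geometry of every pair of polygons, which is the difficulty the text attributes to the CAD structure.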