Data Input Techniques
GIS & Remote Sensing (Dr. A.P.J. Abdul Kalam Technical University)
Digitizing can be done in point mode, where single points are recorded one at a time, or in stream mode,
where a point is collected at regular intervals of time or distance, measured by X and Y movement, e.g.
every 3 metres. Digitizing can also be done blindly or with a graphics terminal. Blind digitizing means that the
graphic result is not immediately viewable to the person digitizing. Most systems display the digitized linework
on an accompanying graphics terminal as it is being digitized.
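As an illustration of distance-based stream mode, the following minimal Python sketch keeps a point each time the
cursor has moved a set distance from the last recorded point. The function name, the cursor trace, and the 3-unit
interval are assumptions made for the example; they are not taken from any particular digitizing package.

```python
import math

def stream_digitize(cursor_trace, interval):
    """Distance-based stream mode: keep a point whenever the cursor has
    moved at least `interval` map units from the last recorded point."""
    if not cursor_trace:
        return []
    recorded = [cursor_trace[0]]          # the first cursor position is always kept
    for x, y in cursor_trace[1:]:
        last_x, last_y = recorded[-1]
        if math.hypot(x - last_x, y - last_y) >= interval:
            recorded.append((x, y))
    return recorded

# Hypothetical cursor trace: one position per metre along a straight line.
trace = [(float(i), 0.0) for i in range(11)]
points = stream_digitize(trace, 3.0)      # record a point roughly every 3 metres
print(points)
```

Point mode corresponds to the operator choosing each recorded position by hand, so it needs no filtering step of
this kind.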
Most GISs use a spaghetti mode of digitizing. This allows the user to simply digitize lines by indicating a start
point and an end point. Data can be captured in point or stream mode. However, some systems do allow the
user to capture the data in an arc/node topological data structure. The arc/node data structure requires that
the person digitizing identify the nodes.
Data capture in an arc/node approach helps to build a topological data structure immediately. This lessens the
amount of post-processing required to clean and build the topological definitions. In most cases, however,
digitizing with an arc/node approach does not remove the need to edit and clean the digitized
linework before a complete topological structure can be obtained.
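The difference between spaghetti linework and an arc/node structure can be sketched in a few lines of Python. This
is a simplified illustration, not the data model of any specific GIS: the function name, the dictionary keys, and
the snapping tolerance are all assumptions. Each digitized line becomes an arc whose end points are matched to
shared nodes.

```python
def build_arc_node(lines, tolerance=0.0):
    """Turn spaghetti lines (lists of (x, y) tuples) into a node list and
    an arc list. End points within `tolerance` of an existing node on both
    axes are snapped to that node."""
    nodes = []   # node id is the index into this list
    arcs = []

    def node_id(pt):
        for i, (nx, ny) in enumerate(nodes):
            if abs(nx - pt[0]) <= tolerance and abs(ny - pt[1]) <= tolerance:
                return i
        nodes.append(pt)
        return len(nodes) - 1

    for line in lines:
        arcs.append({
            "from_node": node_id(line[0]),
            "to_node": node_id(line[-1]),
            "vertices": list(line[1:-1]),   # intermediate shape points
        })
    return nodes, arcs

# Hypothetical spaghetti input: three lines that should close a triangle,
# with the last end point digitized slightly off the first start point.
spaghetti = [
    [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)],
    [(10.0, 0.0), (10.0, 10.0)],
    [(10.0, 10.0), (0.02, 0.01)],
]
nodes, arcs = build_arc_node(spaghetti, tolerance=0.05)
print(nodes)
print(arcs)
```

The sketch only matches end points. It does not split lines where they cross or repair gaps larger than the
tolerance, which is the kind of editing and cleaning that is still needed before the topology is complete.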
The building of topology is primarily a post-digitizing process that is commonly executed in batch mode after
data has been cleaned. To date, only a few commercial vector GIS software offerings have successfully
exhibited the capability to build topology interactively while the user digitizes.
For raster-based GIS software, data is still commonly digitized in a vector format and converted to a raster
structure after a clean topological structure has been built. The procedure usually differs minimally from
digitizing with vector-based software, except that some raster systems allow the user to define the resolution
of the grid cell. Conversion to the raster structure may occur on-the-fly or afterwards as a separate
conversion process.
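The effect of a user-defined grid-cell resolution can be shown with a small vector-to-raster sketch in Python. It
marks every cell that a digitized line passes through by sampling along each segment at half-cell steps. The grid
origin at (0, 0), the row 0 at the bottom convention, and all names are assumptions for the example; production
rasterizers use more exact cell-traversal methods.

```python
import math

def rasterize_line(vertices, cell_size, n_rows, n_cols, value=1):
    """Mark the grid cells touched by a polyline. Assumes the grid origin
    is at (0, 0) and that row 0 is the row with the smallest y values."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    step = cell_size / 2.0                       # sample twice per cell width
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        n_steps = max(1, math.ceil(length / step))
        for i in range(n_steps + 1):
            t = i / n_steps
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            row = int(y // cell_size)
            col = int(x // cell_size)
            if 0 <= row < n_rows and 0 <= col < n_cols:
                grid[row][col] = value
    return grid

# The same hypothetical line rasterized at two resolutions.
line = [(0.5, 0.5), (3.5, 0.5)]
fine = rasterize_line(line, cell_size=1.0, n_rows=2, n_cols=4)
coarse = rasterize_line(line, cell_size=2.0, n_rows=1, n_cols=2)
print(fine)
print(coarse)
```

A smaller cell size follows the line more closely at the cost of a larger grid, which is the trade-off the user
makes when choosing the resolution.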
Automatic Scanning
A variety of scanning devices exist for the automatic capture of spatial data. While several different technical
approaches exist in scanning technology, all have the advantage of being able to capture spatial features from
a map rapidly. However, scanning has not yet proven to be a viable alternative for
most GIS implementations. Scanners are generally expensive to acquire and operate. In addition, most scanning
devices have limitations with respect to the capture of selected features, e.g. text and symbol recognition.
Experience has shown that most scanned data requires a substantial amount of manual editing to create a
clean data layer. Beyond these basic constraints, some other practical limitations of scanners should be
identified. These include:
- hard copy maps often cannot be moved to where a scanning device is available, e.g. most companies or
  agencies cannot afford their own scanning device and therefore must send their maps to a private firm
  for scanning;
- hard copy data may not be in a form that is viable for effective scanning, e.g. maps are of poor quality,
  or are in poor condition;
- geographic features may be too few on a single map to make scanning practical or cost-justifiable;
- often on busy maps a scanner may be unable to distinguish the features to be captured from the
  surrounding graphic information, e.g. dense contours with labels;
- with raster scanning it is difficult to read unique labels (text) for a geographic feature effectively; and
- scanning is much more expensive than manual digitizing, considering all the cost/performance issues.
Consensus within the GIS community indicates that scanners work best when the information on a map is
kept very clean, very simple, and uncluttered with graphic symbology.
The sheer cost of scanning usually eliminates the possibility of using scanning methods for data capture in
most GIS implementations. Large data capture shops and government agencies are those most likely to be
using scanning technology.
Currently, the general consensus is that the quality of data captured from scanning devices is not high
enough to justify the cost of using scanning technology. However, major breakthroughs are being made in the
field, both in scanning techniques and in the capability to automatically clean and prepare scanned data for
topological encoding. These include a variety of line following and text recognition techniques. Users should be
aware that this technology has great potential in the years to come, particularly for larger GIS installations.
Coordinate Geometry
A third technique for the input of spatial data involves the calculation and entry of coordinates using
coordinate geometry (COGO) procedures. This involves entering, from survey data, the explicit measurements
of features from some known monument. This input technique is very costly and labour-intensive, and
it is rarely used for natural resource applications in GIS. The method is useful for creating very
precise cartographic definitions of property, and accordingly is more appropriate for land records management
at the cadastral or municipal scale.
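The core COGO calculation, deriving coordinates from surveyed directions and distances starting at a known
monument, can be sketched as follows. The sketch assumes plane coordinates and directions given as azimuths in
degrees clockwise from north; the monument coordinates and the four courses are made-up values for a rectangular
parcel.

```python
import math

def cogo_traverse(monument, courses):
    """Compute point coordinates from a starting monument (x, y) and a list
    of (azimuth_degrees, distance) courses. Azimuths are measured clockwise
    from north, so easting uses sin and northing uses cos."""
    x, y = monument
    points = [(x, y)]
    for azimuth_deg, distance in courses:
        az = math.radians(azimuth_deg)
        x += distance * math.sin(az)
        y += distance * math.cos(az)
        points.append((x, y))
    return points

# Hypothetical 30 m x 20 m parcel traversed east, south, west, north.
monument = (1000.0, 2000.0)
courses = [(90.0, 30.0), (180.0, 20.0), (270.0, 30.0), (0.0, 20.0)]
corners = cogo_traverse(monument, courses)

# Misclosure: how far the last point lands from the starting monument.
misclosure = math.hypot(corners[-1][0] - monument[0],
                        corners[-1][1] - monument[1])
print(corners)
print(misclosure)
```

With real survey data the misclosure is not zero, and checking it is one reason the method suits precise
property definitions.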
Most GIS software vendors also provide an ASCII data exchange format specific to their product, and a
programming subroutine library that allows users to write their own data conversion routines to fulfil their
own specific needs. As digital data becomes more readily available, this capability becomes a necessity for any
GIS. Data conversion from existing digital data is not a problem for most technical persons in the GIS field.
However, for smaller GIS installations that have limited access to a GIS analyst, this can be a major stumbling
block in getting a GIS operational. Government agencies are usually a good source of technical information
on data conversion requirements.
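A user-written conversion routine of the kind described above might look like the following Python sketch. The
input layout is hypothetical and is not the specification of any vendor's exchange format: each line feature is
an ID line, followed by "x,y" coordinate lines, followed by END, and a final END closes the file. The routine
parses that layout and rewrites it as comma-separated rows.

```python
def parse_ascii_lines(text):
    """Parse the hypothetical ASCII line layout into a list of features."""
    features = []
    current_id = None
    coords = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.upper() == "END":
            if current_id is None:
                break                      # END with no open feature closes the file
            features.append({"id": current_id, "coords": coords})
            current_id, coords = None, []
        elif current_id is None:
            current_id = int(line)         # first line of a feature is its ID
        else:
            x_str, y_str = line.split(",")
            coords.append((float(x_str), float(y_str)))
    return features

def to_csv_rows(features):
    """Flatten features into 'id,seq,x,y' rows for another system to load."""
    rows = ["id,seq,x,y"]
    for feature in features:
        for seq, (x, y) in enumerate(feature["coords"]):
            rows.append(f"{feature['id']},{seq},{x},{y}")
    return rows

sample = """101
0.0,0.0
10.0,5.0
END
102
3.0,4.0
6.0,8.0
9.0,12.0
END
END
"""
features = parse_ascii_lines(sample)
rows = to_csv_rows(features)
print("\n".join(rows))
```

A real conversion would also need to carry attribute data and handle malformed input, which is where access to
a GIS analyst or vendor documentation becomes important.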
Some of the data formats common to the GIS marketplace are listed below. Please note that most formats
are only utilized for graphic data. Attribute data is usually handled as ASCII text files. Vendor names are
supplied where appropriate.
IGDS - Interactive Graphics Design Software (Intergraph / MicroStation)
    This binary format is a standard in the turnkey CAD market and has become a de facto standard in
    Canada's mapping industry. It is a proprietary format; however, most GIS software vendors provide
    DGN translators.

DLG - Digital Line Graph (US Geological Survey)
    This ASCII format is used by the USGS as a distribution standard and consequently is well utilized
    in the United States. It is not used very much in Canada even though most software vendors provide
    two-way conversion to DLG.

DXF - Drawing Exchange Format (AutoCAD)
    This ASCII format is used primarily to convert

GENERATE - ARC/INFO Graphic Exchange Format
    A generic ASCII format for spatial data used by the ARC/INFO software to accommodate generic spatial
    data.

EXPORT - ARC/INFO Export Format
    An exchange format that includes both graphic and attribute data. This format is intended for
    transferring ARC/INFO data from one hardware platform, or site, to another. It is also often used for
    archiving.
A wide variety of other vendor-specific data formats exist within the mapping and GIS industry. In particular,
most GIS software vendors have their own proprietary formats. However, almost all provide data conversion
to/from the above formats. In addition, most GIS software vendors will develop data conversion programs
in response to specific requests by customers. Potential purchasers of commercial GIS packages should
determine their data conversion needs and communicate them clearly to the software vendor prior to purchase.