
Health Geomatics: An Enabling Suite of Technologies in Health and Healthcare

2001, Journal of Biomedical Informatics

Journal of Biomedical Informatics 34, 195–219 (2001) doi:10.1006/jbin.2001.1015, available online at http://www.idealibrary.com

METHODOLOGICAL REVIEW

Health Geomatics: An Enabling Suite of Technologies in Health and Healthcare

M. N. Kamel Boulos,1 A. V. Roudsari, and E. R. Carson

Centre for Measurement and Information in Medicine, School of Informatics, City University, London EC1V 0HB, United Kingdom

Received March 7, 2001; published online September 20, 2001

This Methodological Review describes how health geomatics can improve our understanding of the important relationship between location and health, and thus assist us in Public Health tasks like disease prevention, and also in better healthcare service planning. The reader is first introduced to health geography and its two main divisions, disease ecology and healthcare delivery, followed by an overview of the basic concepts and principles of health geomatics. Topics covered include geographical information systems (GIS), GIS modeling, and GIS-related technologies (remote sensing and the global positioning system). We also present a number of real-life health geomatics applications and projects, with pointers to further studies and resources. Finally, we discuss the barriers facing the adoption of GIS technology in the health sector, including data availability/quality issues. The authors believe that we still need to combat many cultural and organizational barriers, including “spatial illiteracy” among healthcare workers, while making the tools cheaper and easier to learn and use, before health geomatics can become a mainstream technology in the health sector like today’s spreadsheets and databases. © 2001 Academic Press

Key Words: geomatics; geographical information systems (GIS); remote sensing; global positioning system (GPS); spatial analysis; decision support systems; epidemiology; disease ecology; public health; healthcare delivery.
1 To whom correspondence should be addressed at Centre for Measurement and Information in Medicine, City University, Northampton Square, London EC1V 0HB, United Kingdom. E-mail: [email protected].
1532-0464/01 $35.00. Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved.

Space is an essential framework of all modes of thought. From physics to aesthetics, from myth and magic to common everyday life, space, in conjunction with time, provides a fundamental ordering system for interlacing every facet of thought. . . . In short, things occur or exist in relation to space and time. [1]
R. Sack (1980)

INTRODUCTION

The concept that location can influence health is a very old one in medicine. As far back as the time of Hippocrates (ca. 460–370 BC), physicians observed that certain diseases tend to occur in some places and not others. In fact, different locations on Earth are usually associated with different profiles: physical, biological, environmental, economic, social, cultural, and sometimes even spiritual profiles, that do affect and are affected by health, disease, and healthcare. These profiles and associated health and disease conditions may also change with time (the longitudinal or temporal dimension) [2, 3].

In 1854, a major cholera outbreak in London had already taken nearly 600 lives when Dr. John Snow, using a hand-drawn map, showed that the source of the disease was a contaminated water pump. By plotting each known cholera case on a street map of Soho district (where the outbreak took place), Snow could see that the cases occurred almost entirely among those who lived near the Broad Street water pump (Fig. 1). This pump belonged to the Southwark and Vauxhall Water Company, which drew water polluted with London sewage from the lower Thames River. The Lambeth Water Company, which had relocated its water source to the upper Thames, escaped the contamination.
Snow recommended that the handle of this pump be removed, and this simple action stopped the outbreak and proved his theory that cholera is transmitted through contaminated drinking water. People could also see on this map that cholera deaths were not confined to the area around a cemetery of plague victims and were thus convinced that the infection was not due to vapors coming from it as they first thought [4]. By using a map to examine the geographical (spatial) locations of cholera cases in relation to other features on the map (water pumps and cemetery of plague victims), Snow was actually performing what is now known as spatial analysis.

FIG. 1. This map is a digital recreation of Dr. Snow’s hand-drawn map. The 1854 cholera deaths are displayed as small black circles. The gray polygon represents the former burial plot of plague victims. The Broad Street pump (shown in the center of the map) proved to be the source of contaminated water, just as Snow had hypothesized. (Generated using CDC Epi Map 2000 for Windows, a public domain package that can be downloaded from: http://www.cdc.gov/epiinfo/EI2000.htm).

HEALTH GEOGRAPHY

Health (individual and community health issues) and healthcare (clinical issues, and service planning and management issues) are two intertwined concepts, and so are their interactions with location. It is therefore very useful and customary to divide the geography of health and healthcare into the following two interrelated areas:

1. The geography of disease, which covers the exploration, description, and modeling of the spatiotemporal incidence of disease and related environmental phenomena, the detection and analysis of disease clusters and patterns, causality analysis, and the generation of new disease hypotheses;
2. The geography of healthcare systems, which deals with the planning, management, and delivery of suitable health services, ensuring among other things adequate patient access, after determining healthcare needs of the target community and service catchment zones [2, 5]. Preventive and health promotion activities form part of these services.

As disease and health can vary from place to place and from time to time, so too should the healthcare planners’ response to the health needs of their communities. Health geography plays a vital role in public health surveillance, including the design and monitoring of the implementation of health interventions and disease prevention strategies. Geographical research into healthcare services can also help identify inequities in health service delivery between classes and regions, and support the efficient allocation and monitoring of scarce healthcare resources. Examples include allocating healthcare staff by region based on actual needs, and assisting in determining the best location and specifications for new healthcare facilities and in planning extensions to existing ones [2, 6].

HEALTH GEOMATICS ESSENTIALS

Definitions and Scope of Geomatics and GIS

Geomatics, also known as geoinformatics, is the science and technology of gathering, storing, analyzing, interpreting, modeling, distributing, and using georeferenced information. Geomatics is multidisciplinary by nature. It comprises a broad range of disciplines, including surveying and mapping, remote sensing, geographical information systems (GIS), and the global positioning system (GPS) [7]. These, in turn, draw from a wide variety of other fields and technologies, including computational geometry, computer graphics, digital image processing, multimedia and virtual reality, computer-aided design (CAD), database management systems (DBMS), spatiotemporal statistics, artificial intelligence, communications, and Internet technologies among others [8, 9].
Geographical information systems also favor an interdisciplinary approach to the solution of problems. They go beyond conventional spreadsheet and database tables, helping us discover and visualize new data patterns and relationships that would have otherwise remained invisible. They achieve this through their unique way of classifying multifaceted, real-world data coming from disparate sources into layers (coverages or themes), each covering a single aspect of reality. They then link these layers by spatially matching them (like a set of transparent overlays), and query and analyze them together to produce new information and hypotheses. This can be considered one form of data mining, and is especially useful in the context of aggregated patient records (Fig. 2). It is possible, for example, to overlay and integrate the following data layers to perform different types of health-related analyses:

- population data, e.g., census and socioeconomic data;
- environmental and ecological data, e.g., monitored data on pollution and vegetation (satellite pictures);
- topography, hydrology, and climate data;
- land-use and public infrastructure data, e.g., schools and main drinking water supply;
- transportation networks (access routes) data, e.g., roads and railways;
- health infrastructure and epidemiological data, e.g., data on mortality, morbidity, disease distribution, and healthcare facilities; and
- other data as needed to perform different types of health-related analyses [6].

N.B.: Throughout the rest of this Methodological Review the terms “coverage” and “theme” will be used interchangeably as synonyms.

FIG. 2. Each location on Earth is usually associated with its own physical, biological, environmental, economic, social, occupational, lifestyle, health, and disease profiles that can change with time. Geographical Information Systems with their powerful spatiotemporal analytical capabilities can add a new “location-time” dimension to our reasoning with health and healthcare data, and aggregated electronic patient records. This has the potential of improving our understanding of disease dynamics and factors associated with it, which in turn can lead to better resource planning and management, and ultimately to improved health for the individual (through medical care) and the community (through Public Health programs).

As modeling and decision support tools (Fig. 2), GIS can help determine the geographical distribution and variation of diseases (e.g., prevalence, incidence) and associated factors, analyze spatial and longitudinal trends, map populations at risk, and stratify risk factors. GIS can also assist in assessing resource allocation and accessibility (health services, schools, water points), planning and targeting interventions, including simulating (predicting) many “what-if” scenarios before implementing them, forecasting epidemics, and monitoring diseases and interventions over time. They provide a range of extrapolation techniques, for example, to extrapolate sentinel site surveillance to unsampled regions. Other important GIS applications include routing functions and emergency dispatch systems [5, 7, 10].

Overview of GIS Concepts and Principles

On Spatial Information and Its Dimensions

Spatial information is information where location has some importance or benefit; it is not necessarily about geographical locations on the surface of the Earth [13]. For example, Dodge and Kitchin have recently published a book on mapping cyberspace (the Internet) [14].
It is also a well-known fact that many diseases and organisms have a predilection for or exclusively affect specific anatomicophysiological “locations” within the human body (body organs or systems) with varying duration and costs of care. BodyViewer, an ArcView GIS extension from GeoHealth, Inc., maps patient records containing a geographical reference (e.g., postcodes) and ICD-9 or ICD-10 codes (International Classification of Diseases coding diagnoses, complications, or causes of death) to a human body systems/organs theme. The human body theme can be in turn linked to a geographical theme to show where the aggregated ICD codes occur geographically. Using BodyViewer, new disease and healthcare patterns can be detected, analyzed, and acted upon effectively (Fig. 3).

FIG. 3. Screenshot of ESRI ArcView GIS showing how a patient database can be linked to a geographical theme. Patients’ diagnoses (ICD-9 or ICD-10 codes, arrow) have been also mapped to corresponding locations on a human body theme using BodyViewer extension. The different colors of organ icons on the human body theme represent the relative prevalence of each disease category (e.g., cardiovascular disorders) in the patient database; the darker the icon, the higher the prevalence. In this screenshot, we have selected the lung icon which represents all respiratory illnesses, and all patients who had an ICD code of a respiratory disease have been automatically identified both on the geographical theme (darker points) and in the patient table. (Generated using BodyViewer Extension for ArcView GIS and fictional data from GeoHealth, Inc. Web address: http://www.geohealth.com/bodyviewer.html.)

Spatial information can exist in two, three, or four dimensions. Two-dimensional GIS is concerned with two-dimensional surfaces, e.g., the surface of the Earth [13]. Three-dimensional GIS can handle two dimensions of space and one of time (spatiotemporal representations dealing with historical data) [15], or three dimensions of space, e.g., the three-dimensional atmosphere, oceans, and subsurface of the Earth. Three-dimensional georepresentations handle cubic or volumetric data (the third dimension is depth or elevation). When elevation has been determined, a three-dimensional perspective view of a region can be rendered (the user can control the rotation, height, and angle of this view) [11]. Four-dimensional GIS are designed for three dimensions of space and one of time [8].

Geographical References and Geocoding

Geographically referenced (georeferenced) data refers to data referenced by location on Earth. Geographical referencing allows us to locate features, such as a hospital or patient’s residence, and events, such as an earthquake or disease outbreak, on the Earth’s surface for analysis. Georeferenced data can have either an explicit geographical reference, such as latitude and longitude or national grid coordinate (this form is ready for mapping), or an implicit reference such as an address, postal code, or census tract name. In the latter case, an automated GIS process called geocoding is used to create explicit geographical references (coordinates) from implicit references, e.g., to map patients to corresponding address points (coordinates) on a digital map of their city based on addresses in electronic patient records [11, 12].

In GIS, locations and features on the Earth’s surface are represented by points, lines, and polygons that are defined by a series of X,Y coordinates. Latitude–longitude is a world coordinate system, but smaller systems exist for regional purposes and more accurate positions. Whereas the latitude–longitude system never changes, some of the smaller (national, regional, or local) grid systems can shift position. It is therefore necessary to check the dates of GIS data layers before using them, if they have been associated with one of the smaller coordinate systems that are known to have shifted.
Otherwise, positions can be misplaced and uncoordinated. Tools exist that can change a theme from one system to another based on precise knowledge of old and new coordinate systems [11].

GIS software stores coordinates in decimal degrees (e.g., 30.50°), where fractions of degrees are expressed as decimals; thus, the longitude 30° 30′ would be expressed as 30.5° (a degree contains 60′). Data in decimal degrees are in a neutral spherical coordinate system representing the Earth (i.e., unprojected, see section on Map Projections) [12].

Spatial and Attribute Data

Geographical information systems store two main types of data in their databases. The first type, spatial data, describes the location and shape of geographical features, and their spatial relationships to other features, in the form of digital coordinates. The described spatial features can be points (e.g., hospitals, although a hospital can also be described as a polygon if scale permits), lines (e.g., roads, rivers, railways), or polygons (e.g., administrative districts or residential parcels). The different feature datasets are usually held as separate layers (e.g., a theme/table for healthcare facilities and another theme/table for roads) that can be combined in a number of different ways for analysis and visualization. The second type, attribute data, describes the characteristics, properties, or qualities of the spatial features, e.g., number of hospital beds, or population characteristics of administrative districts. Thus, we could have health districts (polygons) and healthcare centers (points) as spatial data, and descriptive information about these features as attribute data, for instance, persons having access to clean water, number of births, number of 1-year-old children fully immunized, number of health personnel, and so on [16].
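As an illustration only, the decimal-degree convention and the spatial/attribute split described in this section might be sketched as follows. The function `dms_to_decimal`, the class `PointFeature`, and the sample clinic are hypothetical names and data invented for this sketch; they are not taken from any GIS package discussed in this review.

```python
from dataclasses import dataclass, field


def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds to decimal degrees.

    A degree contains 60 minutes and a minute contains 60 seconds.
    Southern and western hemispheres are returned as negative values.
    """
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


@dataclass
class PointFeature:
    """Spatial data (coordinates) plus attribute data (descriptive fields)."""
    lon: float
    lat: float
    attributes: dict = field(default_factory=dict)


# Hypothetical healthcare center: the coordinates are the spatial data,
# the name and bed count are the attribute data.
clinic = PointFeature(
    lon=dms_to_decimal(30, 30, hemisphere="E"),
    lat=dms_to_decimal(1, 15, hemisphere="S"),
    attributes={"name": "Hypothetical clinic", "beds": 40},
)

print(clinic.lon, clinic.lat)  # 30.5 -1.25
```

Keeping coordinates and descriptive fields in one record is a simplification; as the text notes, a GIS usually holds them in linked spatial and attribute tables.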
On Points, Centroids, and Thiessen Polygons

A point represents a spot or location that either has no physical or actual spatial dimensions (an “intangible” feature, e.g., address of a cholera case or location of a traffic accident) or is too small to show properly at a given scale. A point can be displayed using a convenient visual symbol, e.g., a red cross to denote the location of a hospital. Polygons can be also represented by their center points (centroids), if their exact spatial characteristics are not important for the study at hand. Thiessen polygons are the opposite of centroids. They are areas drawn around points to represent the territories of these points (areas of influence or service/catchment zones), so that every location inside a polygon is closer to its own point than to any other point. GIS expands each point’s surrounding area until it meets the next one coming from a neighbor point or until it runs into a theme’s edge [9, 11].

GIS Database Functions

A major strength of GIS is that it can accept and merge different data into a single database. The database is the operation center of GIS, serving as a powerful system for data management and analysis. Data manipulation options include listing selected records or selected fields, sorting fields alphabetically or numerically in any order, listing by specific range/value(s) of field data, as well as more complex queries involving the selection of records/fields matching one or multiple conditions combined using Boolean operators (and, or, not). All standard database operations are also supported, e.g., performing modifications, updates, additions, and deletions, and linking and joining tables. Merging (summarizing) more than one record into one based on some common attribute (field) in all merged records is also possible. Many types of summary statistics can be calculated for the other fields in this case.
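The Boolean selection and record-merging operations just described can be sketched in a few lines. This is a minimal illustration on made-up records, not the query interface of any particular GIS; the field names, the sample values, and the `summarize` helper are all assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical attribute table for a healthcare-facilities theme.
centers = [
    {"name": "A", "district": "North", "beds": 250, "has_burn_unit": True},
    {"name": "B", "district": "North", "beds": 120, "has_burn_unit": False},
    {"name": "C", "district": "South", "beds": 310, "has_burn_unit": False},
    {"name": "D", "district": "South", "beds": 90, "has_burn_unit": True},
]

# Boolean query: more than 200 beds AND NOT equipped with a burn unit.
selected = [
    r["name"] for r in centers
    if r["beds"] > 200 and not r["has_burn_unit"]
]


def summarize(records, key, value):
    """Merge records sharing the same `key` field into one summary record,
    computing the count and the sum of the `value` field."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for r in records:
        totals[r[key]]["count"] += 1
        totals[r[key]]["sum"] += r[value]
    return dict(totals)


beds_by_district = summarize(centers, "district", "beds")

print(selected)           # ['C']
print(beds_by_district)   # {'North': {'count': 2, 'sum': 370}, 'South': {'count': 2, 'sum': 400}}
```

Other summary statistics (mean, minimum, maximum) could be accumulated in the same loop.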
For example, if the merged records are about polygon features (e.g., boroughs that are part of Greater London), each with an area field, the resultant summary record can have the sum of all merged polygons’ areas in its area field.

However, the real strength of GIS lies in the bidirectional links it provides between its relational databases and graphical elements (map themes and charts). GIS permits relational queries at the database (e.g., selecting all hospitals with >200-bed capacity or just manually selecting some records), with results shown on the map (corresponding features highlighted on the map), or the reverse: selection of feature(s) on a map displays corresponding records in the underlying database (Fig. 4). A field can also be used as a mapping theme, with the map presenting the result. The user chooses a type of classification, which reduces the field measurements into generalized categories (classes) that are then mapped according to some coloring or gradient-shading scheme [11].

Sources of GIS Data

GIS data come from a variety of sources, including digitization of hard copy maps, remote sensing imagery (see section on “Remote Sensing”), existing digital map products, tabular data such as census lists and patient records, field data (e.g., field measurements, information and photographs collected by a field team), in addition to expert knowledge, classifications, judgments, and decisions. These diverse datasets are not naturally combined and processes like geocoding are usually required to link them together [11].

FIG. 4. Screenshot of ESRI ArcView GIS. Many themes are listed in the Table of Contents (left-hand side) of the map View window. Individual themes can be checked/unchecked (“turned on and off”) as necessary. The right-hand side of the map View window shows the final map resulting from combining the checked themes one over the other, in the same order that appears in the Table of Contents. This order can be changed by dragging themes up and down the Table of Contents. Pointing the “Identify” cursor to a theme feature and clicking on it (after selecting the corresponding theme which will appear embossed in the Table of Contents; “Healthcare building” in this screenshot) will display that feature’s attributes from the theme table. It happens that in the real world and on this map, the identified “Bethlem Royal Hospital” exists as separate polygons (buildings). However, all these polygons are actually represented by (aggregated into) one record in the theme table, i.e., considered as a single feature, and clicking on any of them will display the same “Identify Results.” ArcView does not attempt to merge these polygons geometrically because they are not adjacent to each other. (Generated using London digital map data from Bartholomew Ltd., http://www.bartholomewmaps.com.)

GIS Coverages

Coverages (themes) are the digital version of paper maps. GIS coverages usually comprise a single major theme, such as roads, healthcare facilities, land-use, or vegetation (Fig. 4). Hard copy maps are turned into computer coverages through digitizing, a process that involves tracing the map electronically, changing it into digital form, and doing any necessary corrections. Spatial data must be associated with a real-world coordinate system, such as latitude–longitude or a national grid system (georeferencing), and a desired world projection must be related to it. Digitized spatial data must be also stored in a specific GIS data structure, either raster or vector format. Attributes must be attached to it to construct the underlying database of descriptive information, e.g., by importing a database table. Finally, coverages may be split into subcoverages or tiles, each covering only part of the original theme area.
Limiting a GIS project to only those tiles that are related to the area under consideration can significantly reduce the effort, time, computer processing and storage demands, and cost associated with that project [11].

Map Projections

Even when a theme has been georeferenced to a specific coordinate system, the GIS still needs to know which map projection to use in order to lay out (or project) and give proper shape to features on a two-dimensional computer display. Projections are special configurations used to fit a portion of the globe’s three-dimensional surface onto a flat view; that is, spherical data are converted into a two-dimensional presentation. There are many projections, e.g., Mercator projection, Peters projection, and Robinson projection, each with certain advantages and limitations, preserving some spatial properties and distorting others (spatial properties include shape, area, distance, and direction). Selecting a projection should therefore be guided by the application at hand; if, for instance, we need to accurately measure distances, we should choose a projection that preserves distance [9, 11, 12].

Map Scale and Resolution

Map scale describes the relation between a single map unit and the number of the same units in the real world, e.g., 1:1000 (1 cm on the map = 1000 cm in the real world). The accuracy with which a given map scale represents the location and shape (details) of map features is known as resolution. The larger the original map scale, the higher the possible resolution. GIS can enlarge the scale of a map, e.g., from 1:100,000 to 1:50,000, but no additional (true) detail will be gained in this case and map accuracy will not change. The accuracy and details of the new set of data with a scale of 1:50,000 will remain those of the original 1:100,000 set; in other words, a map with an original scale of 1:50,000 will probably have more detail than the enlarged 1:100,000 map. Scale reduction is also possible, e.g., from 1:100,000 to 1:400,000, but too many details can make a small-scale map cluttered and unreadable, unless some details are omitted or simplified [9, 11].

Raster Data Structure

In raster format, a theme is divided into cells in a grid and each cell is given a single numerical value (Fig. 5). A point is represented by a single cell and lines as sequences of cells. Raster polygons have the area within their borders filled with cells. The cell is the minimum mapping unit in the raster data structure. The final resolution of a raster theme (amount of detail) depends on cell size/number of cells in the grid. Increasing the number of cells in a grid theme (and decreasing cell size) will increase the theme’s spatial resolution and accuracy (i.e., accuracy of measurements performed on such themes and accuracy of the shape and size of features they represent) [9, 11].

There are two kinds of grids: discrete and continuous. Discrete grids store integers; this makes them more suitable for representing data that is descriptive or categorical rather than quantitative. The integer cell values in the grid act as substitutes (codes) for descriptions or attributes (e.g., we can have two different integer land-use codes to represent “urban” and “rural”). Continuous grids store numbers with decimal places, and can therefore represent precise measurements (e.g., annual rainfall of 21.5 cm) [12].

The simple coded grid structure makes analysis easier (e.g., comparing values of cells occupying the same position in different grid themes). Moreover, remote sensing imagery (see section on Remote Sensing) is obtained in raster format. The raster data structure allows easy comparison between imagery and GIS themes and facilitates their integration. Raster format is also more suitable for some types of modeling using map algebra (see section on Spatial Analysis) [11].

FIG. 5. Raster and vector GIS data structures. The middle part of the diagram compares the underlying attribute tables of two sample raster and vector themes representing the same geographical topic and region. Note how a single “Agriculture” record in the raster table represents the three “Agriculture” polygon records in the vector table. Also note that some raster grid themes have no underlying table (just relying on cell values which codify the different land attributes). In raster grids, the cell is the minimum measuring unit; for example, if a square feature spans four cells in a grid where a cell side is equivalent to 30 m, then we can infer that each side of this square feature is 60 m long, its perimeter is 240 m, and its area is 3600 m².

Vector Data Structure

Instead of being built by raster cells, vector features are actually defined by coordinate points (nodes and vertices); then chains (special lines) connect the points to draw the feature (Fig. 5). Lines, especially when diagonal or curved, are continuous and are not broken into a grid structure. At the beginning and end of every line or polygon feature is a node (a special point that defines the beginning or end of a feature). Two nodes are needed for a line feature, but only one node for a polygon (the start node also acts as the end node in the case of a polygon). At each “bend” (change of direction) in a line, there is a vertex. There must be at least two vertices to make an area. In a coverage display, normally only the chains are seen, defining the line or polygon features, but under special editing views, nodes and vertices can be inspected. The vector format is much better than the raster format in retaining the original shape of features, especially at small map scales. It offers higher levels of detail and accuracy (e.g., when performing measurements of distance and area) compared to raster format and supports topology (see section on Topology).
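To make the accuracy contrast between the two data structures concrete, the sketch below measures the same features both ways: from vector coordinates (using the shoelace formula) and from a raster grid (counting cells whose centers fall inside the feature). Both function names and the cell-center rule are assumptions chosen for this illustration; real GIS packages may rasterize differently.

```python
def polygon_area(vertices):
    """Vector measurement: shoelace formula for a simple polygon
    given as (x, y) vertices in order."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def point_in_polygon(x, y, vertices):
    """Ray-casting test: count edge crossings to the right of (x, y)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def raster_area(vertices, cell_size, n_cols, n_rows):
    """Raster measurement: area of all cells whose center lies inside
    the polygon (grid origin assumed at (0, 0))."""
    count = 0
    for row in range(n_rows):
        for col in range(n_cols):
            cx = (col + 0.5) * cell_size
            cy = (row + 0.5) * cell_size
            if point_in_polygon(cx, cy, vertices):
                count += 1
    return count * cell_size ** 2


# The square from the Fig. 5 caption: four 30 m cells, sides of 60 m.
square = [(0, 0), (60, 0), (60, 60), (0, 60)]
# A triangle with a diagonal edge, which a coarse grid represents poorly.
triangle = [(0, 0), (60, 0), (0, 60)]

print(polygon_area(square), raster_area(square, 30, 2, 2))      # 3600.0 3600
print(polygon_area(triangle), raster_area(triangle, 30, 2, 2))  # 1800.0 900
print(raster_area(triangle, 10, 6, 6))                          # 1500
```

The grid-aligned square is measured identically in both structures, whereas the triangle's raster area only approaches the vector value of 1800 m² as the cell size shrinks, in line with the resolution discussion above.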
It is possible to convert features from raster to vector format, but although the vector version looks more accurate, it is not. The accuracy lost in rasterizing the original digitized map cannot be regained by simply vectorizing the raster version. However, conversion from raster to vector is sometimes needed in preparation for plotting (plotters prefer vector format), for comparison with other vector data (comparisons usually need identical formats), and to establish topology, which uses vector formats. Conversion from vector to raster format is also possible, e.g., for modeling using map algebra [9, 11].

Topology

Topology is one of the main benefits associated with the vector format. Topology simply means that each feature (point, line, or polygon) “knows” where it is and what is around it (the attached and surrounding features). This causes features to “understand” their environment and “know” how to get around, e.g., from point X to point Y. Topological functions are possible because the vector spatial database contains several properties that can be linked and used to relate features to their surroundings, such as identification of the polygons to the right and left of each chain or node connection.

The three most important features that come with topology are adjacency, connectivity, and containment. Adjacency means that, for any given feature, topology links adjacent features, usually in terms of what is to the left and what is to the right. Connectivity is achieved by keeping track of all connected features, e.g., all chains connected to a node and adjacent/shared chains between polygons. If polygon A has a shared border with polygon B and polygon B has a shared border with polygon C, topology can then infer that polygon A is linked to polygon C. Containment refers to what is within a polygon (features within features).

Combining the functions of a relational database and topology is also possible, and can be used, for example, to select adjacent polygons based on some other nonspatial (descriptive) properties of these polygons (from the attributes table) and not just adjacency.

Moreover, topological connections can work on a set of lines, called a network (e.g., a network of streets). Networks have nodes at each intersection, with chains in between. Topology can help find a particular path, e.g., the shortest or least-cost path from a given site X to another site Y within a network of roads. To do this, the GIS searches the network, defines all possible routes, and then compares their respective total lengths to choose the shortest. Other attributes can also be taken into consideration in the search, such as one-way streets, rate of travel (speed limit/traffic condition), road type and condition, or other barriers that prevent continued movement. There are numerous applications for this type of topological operation, such as emergency response routing [11]. A related GIS feature, the isochrone function, allows us, departing from some starting location, to identify all geographically reachable areas and routes in all directions based on some user-defined criteria, e.g., after traveling less than 5 km, less than 5 min or by spending less than 5.

Spatial Analysis (Buffering and Overlay)

GIS enables better decision making by answering a wide range of questions (provided that all necessary data are available), from simple questions, like “How far is it between two places?” (measuring distances), to more complex analytical questions, such as “Where are all the sites suitable for building a new healthcare facility (based on a set of criteria)?”

In questions like “What is the nearest hospital Accident and Emergency department to an accident spot?” distance (nearness) might be the major factor, but other attributes (from the database) could be equally important in specific situations, for example, when looking for a hospital with specific capabilities (e.g., having a Burn Unit) to meet the demands imposed by the nature of a particular accident. Calculating distance and attaching database attributes in such a search is a major GIS strength [11].

GIS have many powerful analytical tools, but three are especially important: topological and network analysis (see section on Topology), proximity analysis, and overlay analysis [12].

Proximity analysis answers questions like “How many houses lie within 100 m of this water main?” “What is the total number of patients within 10 km of this healthcare facility?” “What proportion of identified cases lies within 500 m of a suspected well (as source of infection)?” “How many people live within 2 km of a hazardous waste site?” and so on. To answer such questions, a GIS process called buffering is used to determine the proximity relationship between features. Building zones (buffers or corridors) around features is a very useful and standard GIS function. The user enters the desired buffering distance, and then the GIS builds the buffer outward from the selected feature or features (Fig. 6). The buffer can be stored as a separate theme.
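As a rough illustration of a proximity query, the sketch below computes the proportion of cases lying within 500 m of a suspected well. It is a simplification under stated assumptions: features are points in planar (projected) coordinates measured in metres, the function name `within_buffer` and all coordinates are hypothetical, and a real GIS would also buffer lines and polygons.

```python
import math


def within_buffer(feature, points, distance):
    """Return the points lying within `distance` of a point `feature`.

    Assumes planar (projected) coordinates in the same units as `distance`.
    """
    fx, fy = feature
    return [p for p in points if math.hypot(p[0] - fx, p[1] - fy) <= distance]


# Hypothetical data: a suspected well and four case locations (metres).
well = (1000.0, 1000.0)
cases = [
    (1100.0, 1000.0),  # 100 m from the well
    (1300.0, 1390.0),  # about 492 m from the well
    (1000.0, 1499.0),  # 499 m from the well
    (2000.0, 2000.0),  # about 1414 m from the well
]

near = within_buffer(well, cases, 500.0)
proportion = len(near) / len(cases)
print(f"{len(near)} of {len(cases)} cases lie within 500 m ({proportion:.0%})")
```

Calling `within_buffer` again with a different distance corresponds to trying a different buffer width around the same feature.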
The buffer theme can then be overlaid on another theme and “Clip” used to determine the features that fall within the buffer area. This procedure can be used to answer questions about the proximity of some feature(s) to the buffered feature(s) [11, 12, 16]. Selecting the features of one theme with the features of another theme acting as selector (e.g., a buffer theme, as in Fig. 6) can help answer questions like whether one feature lies within another, whether it completely contains another (containment), whether it is within a specified distance of another (proximity), and so on [12].

Buffers can also be created around rivers to represent flood zones. A flood zone buffer theme can then be used to clip a village or road theme, so that flood-prone villages or roads can be easily identified. Various flood levels (buffer distances) can be tried to see their effects, thus assisting in disaster prevention and ensuring a rapid response (in case the disaster happens), e.g., by avoiding affected roads when routing emergency services. Moreover, buffers with concentric zones can be used to build “distance decay” models, where the center has the highest intensity (or magnitude) and the effects decrease outward, e.g., pollution levels around a pollutant source, or noise levels around an airport [9, 11].

Overlay is a process involving the integration of different data layers to answer complex questions. Vector (topological) overlay involves laying one theme over another to produce a new coverage (and associated “output” database) that combines and shows the relationship between the overlaid (input) coverages. Thus the “spatial coincidence” of the overlaid coverages can be explored. Several operations can be used in the process. “Intersect” is used to integrate two spatial datasets while preserving only those features falling within the spatial extent common to both themes.
“Union” is used to produce a new theme containing the features and attributes of two polygon themes. “Clip” is used to cut out a piece of one theme using another theme as a “cookie cutter”. “Assigning data by location” uses a spatial relationship to join data from the attribute table of one theme to the attribute table of another theme (spatial join); this can be used, for example, to link land-use and environmental data to population and disease data and discover new relationships. The ESRI GeoProcessing wizard extension for ArcView GIS offers all these functions (Fig. 7) [11, 12].

In raster overlay, cell values from different grid themes are combined using a variety of mathematical operations to generate the values of cells at corresponding positions in a new grid theme (Fig. 5). The output grid theme can serve as a model of some process or phenomenon. The use of mathematical operations to combine the values of cells occupying the same position in different grids is termed map algebra. These operations include addition, subtraction, multiplication, division, exponentiation, comparing cells, and finding the maximum (e.g., maximum rainfall value), as well as more complex formulae, e.g., adding corresponding cell values from two coverages and then multiplying the sum by a factor to obtain the value of the output cell. User-defined input cell combination “rules” can also be used to assign unique values, e.g., suitability or sensitivity scores, to output cells. This involves building a matrix of all possible combinations of input cells and associated (planned) output values and then applying the “rules” to the input grids [11].

When grid cells do not align, either because their cell resolution is different or because their spatial coordinates do not match, resampling is used to associate cell values in one grid with those in another.
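The map algebra operations described above can be sketched with small nested lists standing in for aligned grid themes. This is only an illustration: the helper `map_algebra`, the grid values, and the rule matrix are hypothetical, and the grids are assumed to be already aligned, i.e., to have the same shape and cell positions.

```python
def map_algebra(grid_a, grid_b, op):
    """Combine two aligned grids cell by cell using the binary function `op`."""
    same_shape = len(grid_a) == len(grid_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(grid_a, grid_b)
    )
    if not same_shape:
        raise ValueError("grids must have the same shape (resample first)")
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(grid_a, grid_b)
    ]


# Hypothetical rainfall grids for two months (2 x 2 cells).
rain_jan = [[10, 20], [30, 40]]
rain_feb = [[15, 5], [30, 50]]

total = map_algebra(rain_jan, rain_feb, lambda a, b: a + b)           # addition
peak = map_algebra(rain_jan, rain_feb, max)                           # maximum
scaled = map_algebra(rain_jan, rain_feb, lambda a, b: (a + b) * 0.5)  # sum times a factor

# User-defined "rules": every combination of input classes maps to a planned
# output score (here a hypothetical suitability score from 1 to 3).
rules = {(0, 0): 1, (0, 1): 2, (1, 0): 2, (1, 1): 3}
slope_class = [[0, 1], [1, 1]]
soil_class = [[0, 0], [1, 0]]
suitability = map_algebra(slope_class, soil_class, lambda a, b: rules[(a, b)])

print(total, peak, scaled, suitability)
```

Passing the cell operation as a function keeps one helper for all the arithmetic and rule-based cases; a production GIS would typically use optimized array operations instead.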
Various resampling methods exist; one of them, called Nearest Neighbor resampling, associates the cell values in one grid with those in another according to cell center proximity [12]. Giving extra importance to some coverages is also often necessary, such as when one theme is more important than others in a particular raster overlay application (e.g., when doing site suitability or sensitivity analyses, see section on GIS Models). For example, a 2× weighting of a coverage can be achieved by simply multiplying each cell value in this coverage by two [11].

FIG. 6. Screenshot of ESRI ArcView GIS showing 300-m buffers around “Education Buildings.” Each buffer in this example has three concentric 100-m zones. The buffers were saved as a separate vector polygon theme named “Buffer of Education.” Using ArcView’s “Select By Theme” function, we selected “Healthcare Points” and “Healthcare Buildings” that “Intersect” the “Buffer of Education” theme, i.e., we have selected healthcare features that lie within 300 m of education buildings. We could have also done this in a single step (without first creating the visible buffer theme) by choosing the “Are Within Distance Of” option and entering “300 m” in the “Select By Theme” dialogue box. (Generated using London digital map data from Bartholomew Ltd., http://www.bartholomewmaps.com.)

GIS Visualization and Output

Visualization is the presentation of data in graphical form (maps and charts). It is a quick and effective way to convey complex information compared to conventional spreadsheets, database tables, and lists of numbers, which are usually difficult to interpret without careful study. GIS output can take the form of maps, charts, tabular data, and statistical reports (Fig. 8); all of them can be displayed on-screen, saved in digital format, and, if needed, printed on paper. Moreover, modern GIS can associate relevant pictures and videos with spatial features.
Coverages (themes) can also be output in digital format and shared with others [8, 11].

FIG. 7. Screenshot of the GeoProcessing wizard extension of ESRI ArcView GIS, showing its different options: Dissolve, Merge, Clip, Intersect, Union, and Spatial Join.

Digital data coverages usually ship as multiple files. For example, a vector theme in ESRI’s famous shapefile format usually consists of a .dbf database file containing the attributes of the shapefile, a .shp file (the spatial data component of the coverage), a .shx index file, and a metadata text report. ESRI raster grids are also stored as multiple files in special folders, and have a .adf extension, which stands for “arc data file”. Metadata are data about data. They help users understand the intended purpose of the dataset and consist of information about it such as agency of origin, method of data collection, dates updated, coordinate system, classification, geographical coverage, accuracy, and scale [9, 12].

GIS functionality and output are also sometimes embedded in other applications as an aid to analysis and decision making; for example, Microsoft Excel, a well-known spreadsheet program, offers some GIS mapping functionality (Insert menu, Map command) that can be integrated into users’ worksheets.

However, it must be stressed that GIS is not just a computer mapping or digital cartography system. As explained before, GIS can also manipulate the original data to produce insight and new information (e.g., by performing proximity and overlay analyses). In addition, GIS models can simplify the original data or the world and its processes to help us understand how things work, predict future states, and simulate different “what-if” scenarios (see section on “GIS Models”) [11, 12].
Statistics in GIS are not limited to standard descriptive statistics like calculating the mean and standard deviation of a set of values, but can also take the form of more sophisticated spatial statistics. Several advanced extensions and companion tools exist for current GIS software, e.g., EpiAnalyst Extension for ArcView GIS [17], SaTScan [18], DMAP [19], and CrimeStat [20]. These tools can be used to evaluate reported spatial or space-time disease clusters to see if they are statistically significant, and to help us discover and confirm many spatial, environmental, and ecological relationships that might be hidden within our data.

FIG. 8. Screenshot of ESRI ArcView GIS showing different forms of GIS visualization and output. The upper theme displays the number of deaths due to ovarian cancer in white females (between 1970 and 1994) in the different U.S. counties. Darker shadings signify higher counts (this is known as a choropleth map). The county records in the original theme table have been summarized into a new table that shows the number of deaths due to ovarian cancer in white females by U.S. state (sorted beginning with the highest counts; see lower left part of the screenshot). Simple descriptive statistics are also shown for this summary table (on its right side), followed by a chart comparing the first five states with the highest counts (lower right part of the screenshot). This screenshot was generated using the “Cancer Mortality in the United States: 1970–1994” public domain dataset from the US Geological Survey, which can be downloaded from http://www-atlas.usgs.gov/cancerm.html.

Members of different disciplines (geography, epidemiology, and statistics) tend to choose different methodological approaches to analyze spatially referenced public health data. Dunn et al. (2001) used three such methodologies, namely conventional epidemiological methods, GIS, and point pattern analysis (spatial statistics), to analyze the same public health dataset. They compared the three approaches in terms of their relative value and results. There was some variation in the results between the approaches, since they adopt different models to address the same research question. Taken overall, however, the results were seen as complementing, rather than contradicting or duplicating, each other [21].

GIS Models

Several types of GIS models exist, but in practice a GIS model might combine the properties of more than one type.

Generalization models or reduced-detail models are only a simplified representation of reality and do not include every detail of it. Only those items that illustrate or are important to the task or goal at hand are considered. For example, the marketing director of a pharmaceutical company might want to generalize a large number of medical representative territories by grouping them into a smaller number of regional manager territories in order to facilitate a particular study. Generalization models help emphasize essential points and ease their interpretation by reducing any complexity or confusion arising from superfluous details that might mask these points [11].

A GIS model can also assume that values, e.g., elevations, sampled at specific sites of a given area determine or represent what happens in the entire area. Contouring can be done to extrapolate the sampled points to the surrounding sites and demarcate elevation zones. Contour lines are lines along which a given elevation is constant. A contour map is only a generalization of the real world; it is a model.
Contouring can also be used to model abstract surfaces that represent the distribution of less tangible, spatially variable statistics, e.g., noise or pollution levels as we go away from a central source (“distance decay”), or concentric risk/sensitivity zones to particular factors [9, 11].

Environmental modeling is another valuable application of GIS, e.g., combining a communities coverage with a pollution sources coverage, buffering around the pollution sources to produce concentric pollution zones with different pollution levels (risk classes), and then determining the degree/risk of exposure for each community that intersects the pollution zones. It is easy, for example, to change one of the parameters (such as pollution zone sizes) or to add new sources and factors, e.g., wind speed and direction, streams of water, or any barrier that might affect the shape of pollution zones and pollution diffusion from the source. In this way, alternative situations may be tested. Proper management procedures for each community can then be started once each community’s risk (exposure to pollutants) is determined [4, 11]. Environmental modeling can be of great help in monitoring toxic spills and in improving the outcomes of diseases that are particularly sensitive to environmental factors, like asthma. GIS analysis of environmental health issues is clearly feasible and beneficial, but there could be difficulties when executing such projects (see section on Barriers Facing the Adoption of GIS Technology in the Health Sector) [4, 22].

Prediction GIS models can test different “what-if” scenarios. For example, they can be used to predict the impacts of various spatial features, such as a proposed dam, on their surroundings, under different configurations of these spatial features (e.g., trying different dam locations and other attributes).
It is sometimes even possible to define “goal” objectives (e.g., what is wanted regarding the impact of a proposed spatial feature on its surroundings) and then have the GIS determine the optimum configuration for the proposed spatial feature that fulfills the goal [11].

Optimization or suitability models find optimal solutions to problems of location. These models try to identify the best site, the best path, or the best distribution of features. They are not mathematically predictive. Instead, they use evaluation scales to rate areas, e.g., as bad, good, or best according to a set of criteria. Suitability models often involve reaching a balance of opinion among experts as to what factors define suitability, using techniques like the Delphi process [12]. Site suitability analysis typically involves more than one coverage, perhaps overlaying them graphically or combining databases to locate where the best possible spatial or attribute conditions exist. Moreover, using map algebra, different weights could be assigned to the themes representing the different selection criteria to emphasize their relative levels of importance; the output coverage will have its cells coded to reflect degrees of suitability. Buffering might also be needed to limit site selection to particular zones, e.g., to select only sites that are within some specified distance from some feature(s) [11].

A related type of GIS model is the location-allocation or spatial interaction model. The simplest versions of spatial interaction models assume that the flow between demand (e.g., patients) and service centers (e.g., hospitals) is directly proportional to the associated demand and attraction and inversely proportional to the square of the distance between them, giving rise to the concept of a gravity model (by analogy with Newton’s law of gravity).
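The simplest gravity model described above can be written as flow = k × demand × attraction / distance². The sketch below is a minimal illustration of that relationship only: the function names, the constant `k`, the hospital figures, and the step that shares out demand in proportion to the computed flows are all assumptions made for the example, not a method taken from the cited sources.

```python
def gravity_flow(demand, attraction, distance, k=1.0):
    """Flow proportional to demand and attraction, inverse to distance squared."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * demand * attraction / distance ** 2


def allocate(demand, centers):
    """Share `demand` among service centers in proportion to their gravity flows.

    `centers` maps a center name to an (attraction, distance) pair.
    The proportional split is an illustrative assumption.
    """
    flows = {name: gravity_flow(demand, a, d) for name, (a, d) in centers.items()}
    total = sum(flows.values())
    return {name: demand * flow / total for name, flow in flows.items()}


# Hypothetical example: 1000 patients and two hospitals, with attraction
# expressed as number of beds and distance in kilometres.
hospitals = {"A": (200, 5.0), "B": (400, 10.0)}
shares = allocate(1000, hospitals)
print(shares)
```

In this example hospital B has twice the attraction of hospital A but lies twice as far away, so the squared distance term leaves it with the smaller share.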
In practice, this distance-related aspect of such models will vary in effect according to the availability, mode, ease and speed of transport, traffic state, and the ability and expectation of patients to travel, and is better expressed as a cost function. Because of the importance of travel in such models, they are sometimes combined with network models, for example, to compute optimal paths along various types of routes and allocate patients to the nearest (most accessible) healthcare facilities [3, 9].

Statistical techniques model data by giving measures that are representative. The average of a set of numbers may not actually exist as a recorded datum, but it is one way of describing the entire list of recorded data. For a numerically quantifiable feature that changes with time, a trend line (the line of best fit or average straight line that runs through the data points of the feature) can be a good generalization of that feature’s quantities over time. The trend line can also be extended to the future (or to the past), giving an indication (prediction) of what can happen if current trends continue.

Models can help us understand what is happening over a period of years. Time-series models can estimate what probably happened or will happen based on data provided for other times. When conditions for two times are given, a model may project backward, forward, or in between to extrapolate (predict) conditions. As with the simpler trend lines, these conditions may not have actually occurred, and time-series models should not be thought of as providing specific, conclusive data, but rather a credible set of scenarios from which an idea about the possible changes can be formed.

A GIS operational process model comprises specific GIS steps to be followed in order to achieve a specified objective. It models the GIS procedures to be followed (the process flow chart).
The modeled process should work in similar situations with different datasets [11].

GIS-RELATED TECHNOLOGIES

Remote Sensing

Cline (1970), in an article titled “New eyes for epidemiologists: aerial photography and other remote sensing techniques,” predicted that remote sensing (RS) would be used in detecting and monitoring disease outbreaks; this proved correct in the following years [23]. RS involves gathering geographical data from above, usually by aircraft or satellite sensors. It is a major source of GIS data and can rapidly cover large areas of the Earth at relatively low cost per ground unit. A ground crew takes much longer to cover the same areas, a disadvantage for emergency applications like disaster evaluation. Moreover, remote-sensing imagery offers a more consistent view of the land than what can be provided by various field team members working over a long period of time [11].

The ranges of electromagnetic wavelengths that are commonly detected in remote-sensing surveys include the visible spectrum (0.4–0.7 µm), the reflected or near- and mid-infrared range (0.7–3 µm), the thermal infrared range (3–14 µm), and radar and microwave radiation (5–500 mm, long wave) [9]. Data from parts of the electromagnetic energy spectrum that are not visible to the human eye can provide very useful information that would have otherwise remained unknown if we were only using the visible light range (standard aerial photography). For example, thermal infrared sensors pick up subtle temperature differences and display them on film or electronic devices. This is useful in thermal pollution monitoring, allowing industrial effluent to be analyzed in terms of heat characteristics.

Remote sensing data are in raster format, consisting of cells with values determined by the capturing sensors. Raw remote sensing data need some preparation to be usable in GIS. Preprocessing cleans the data by filtering out electronic noise and correcting mistakes. Data can also be enhanced by improving the visual contrast, e.g., changing subtle differences in gray levels into more distinctive shades. Thematic analysis then turns enhanced data into selected themes. For example, a landscape image may contain several types of data that can be more useful if made available as separate coverages. Further classification of the generated themes into distinct categories can also be performed, e.g., classifying a vegetation theme into different categories representing various vegetation types [11].

The Global Positioning System

The Global Positioning System (GPS) consists of 24 Earth-orbiting satellites that transmit signals to special receivers on the ground, either hand-held units or more sophisticated vehicle-mounted and stationary equipment, for accurate determination of X,Y positional coordinates. Some receivers can also display digital maps and plot the positional coordinates on them. GPS can also provide data on elevation (the Z coordinate), velocity (while moving), and time of measurement [11, 24].

GPS can assist ground crew workers in collecting accurately positioned (georeferenced) field data to create and update GIS coverages. In all cases, GPS supplies accurate positional data for the features under consideration. This is essential for accurately plotting and updating these features on a map [11]. GPS has been used in monitoring oil spills and cleanup efforts, and in tracking and mapping airborne pollutants [7].

GPS technology is also used to dispatch police cars, ambulances, and fire fighters in emergency situations. Ground emergency units receive signals from GPS receivers mounted in moving emergency vehicles to determine, track, and guide the vehicle nearest to an emergency.
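Determining the vehicle nearest to an emergency from GPS positions can be sketched as follows. This is a simplified illustration under stated assumptions: “nearest” is taken to mean the shortest great-circle (straight-line) distance computed with the haversine formula rather than road travel time, and the vehicle identifiers and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_vehicle(incident, vehicles):
    """Return the id of the vehicle with the smallest great-circle distance."""
    return min(vehicles, key=lambda vid: haversine_km(*incident, *vehicles[vid]))


# Hypothetical GPS fixes (latitude, longitude in decimal degrees).
incident = (51.5270, -0.1025)
vehicles = {
    "A1": (51.5300, -0.1000),
    "A2": (51.5000, -0.1200),
    "A3": (51.6000, -0.2000),
}

chosen = nearest_vehicle(incident, vehicles)
print(chosen, round(haversine_km(*incident, *vehicles[chosen]), 2), "km")
```

Straight-line distance is only a first approximation; a dispatch system would normally rank vehicles by travel time over the road network.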
GPS can also be combined with real-time GIS to ensure efficient routing of ambulance trips by finding the shortest and quickest routes, and avoiding routes with traffic congestion (based on live traffic maps). New FCC rules (Federal Communications Commission, Web address: http://www.fcc.gov/e911/) mean that GPS receivers will very soon be incorporated into mobile phones. This will help ambulance or rescue teams to precisely and quickly locate and track people who are in a medical emergency, injured, or lost but cannot give their precise location. Geomatics technologies can dramatically reduce the response time in emergency situations and help save more lives [7].

Health Geomatics Applications

Applications Using RS and GPS for Data Acquisition

CHAART. Since 1985, CHAART (Centre for Health Applications of Aerospace Related Technologies, part of the Ecosystem Science and Technology Branch of the Earth Science Division at the NASA Ames Research Center, U.S.A.) has been involved in a number of projects on the application of RS and GIS technology to human health problems [25]. Among these projects was a study of the spatial patterns of filariasis in the Nile Delta, Egypt, and the prediction of villages at risk for filariasis transmission. Landsat Thematic Mapper data coinciding with epidemiological field data were converted into vegetation and moisture indices and classified into land-cover types. Statistical analyses were used to correlate these land-cover variables with the spatial distribution of microfilaria in 201 villages [26, 27]. Another study investigated Lyme disease in Westchester County, New York, to develop a satellite remote sensing/GIS model for prediction of Lyme disease risk, which can help public health workers in their efforts to reduce disease incidence.
Similarly, a third study of schistosomiasis in China aimed at developing a hydrological model that could be used to identify risk factors for disease transmission. CHAART has also been involved in two malaria surveillance projects carried out in California and Chiapas, Mexico, as part of NASA’s Global Monitoring and Human Health programme. The field research focused on the relationship of Anopheles mosquitoes to environmental variables associated with regional landscape elements, including larval habitats (flooded pastures and transitional wetlands), blood-meal sources (cattle in pastures), and resting sites (trees). The remote sensing research involved identifying and mapping these and other landscape elements using multitemporal Landsat Thematic Mapper data [25].

MALSAT. The MALSAT (Environmental Information Systems for Malaria) team is another group of researchers, based at the Liverpool School of Tropical Medicine, UK, who are investigating the ecoepidemiology of vector-borne diseases, including malaria in sub-Saharan Africa, using GIS and RS. Studies in The Gambia have demonstrated how satellite-derived data can be used to explain variation in malaria transmission, while the value of such data in predicting malaria epidemics is being examined in other parts of Africa. The group is now involved in another project titled “Forecasting meningitis epidemics in Africa” to develop a climate-driven model for predicting outbreaks of meningococcal meningitis in Africa [28].

CDC malaria studies in Kenya. In Kenya, researchers from the Division of Parasitic Diseases of the Centers for Disease Control and Prevention (CDC, Atlanta, GA) work with the Kenya Medical Research Institute to study malaria and means of preventing it. These researchers use GPS to collect positions and data in the field, and then edit and analyze these data in GIS. One study region had its last map made in the late 1960s, and researchers needed an updated map for their study.
GPS helped them update the old map features to reflect the current status of the land. The GPS mapping team hired local fishermen to row them in small fishing boats to map the shore of the lake. Roads were mapped by driving cars along them while a team member captured location data with GPS. Once they had an updated map of the region, they could begin using their GIS to create maps to help them in their malaria studies [4].

Health/Public Health Applications

WHO GIS Programmes

HealthMap is a joint WHO/UNICEF GIS Programme that was initially created in 1993 to provide GIS support for the management and monitoring of the Guinea Worm Eradication Programme. Since 1995, the scope of the work has been expanded to cover other disease-control and public health programs. The HealthMap project has successfully contributed to the surveillance, control, prevention, and eradication of many communicable diseases, including Guinea worm, onchocerciasis, lymphatic filariasis, malaria, schistosomiasis, intestinal parasites, blinding trachoma, and HIV. The programme has developed its own HealthMapper application and is providing it at no cost to developing countries. This is a database management and mapping system that simplifies the collection, storage, retrieval, management, spatial and statistical analyses, and visualization of public health data through its user-friendly interface [29].

The WHO is also using GIS technology in its Leprosy Elimination Programme (LEP) [30]. The WHO Regional Office for the Americas (PAHO, Pan American Health Organization) has developed its own GIS in Health project for the Americas (SIG-EPI) [31].
GIS in Malaria: The MARA/ARMA Initiative

The MARA/ARMA collaboration (Mapping Malaria Risk in Africa/Atlas du Risque de la Malaria en Afrique) is funded by the International Development Research Centre of Canada (IDRC), the South African Medical Research Council (SAMRC), the UK Wellcome Trust, the Swiss Tropical Institute, and the UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases (TDR). MARA/ARMA aims at providing a GIS atlas of malaria risk for Africa, by integrating spatial environmental and malaria datasets to produce maps of the type and severity of malaria transmission in different regions of the continent [32].

The project attempts to define malaria risk categories (environmental strata) in terms of non-malaria data, e.g., environmental and climatic data, and to develop a mask layer of factors that exclude malaria (a no-risk category), e.g., absence of population, high altitude, deserts. Areas of no data are highlighted during the course of the project with the possibility of using geographical modeling to extrapolate to such no-data areas, based on the above-mentioned environmental stratification rules.

By spatially defining the African continent into regions of similar type and severity of malaria transmission, appropriate control measures can be tailored to each region according to its needs, thus maximizing the potential and outcomes of available control resources (human, financial, and technical). The MARA/ARMA maps should be of great value to all future country-specific, regional, and continental research on malaria transmission dynamics. MARA/ARMA can also serve as a model for the study and control of other diseases, and all non-malaria-specific information gathered during the course of the project can be reused in a similar manner [32].
When Location Makes a Difference: The Global Infectious Disease and Epidemiology Network (GIDEON) and the Field Medical Surveillance System (FMSS)

The Field Medical Surveillance System (FMSS) has been developed by the Medical Information Systems and Operations Research Department of the U.S. Naval Health Research Center. Using FMSS, Environmental Health and Preventive Medicine Officers can record, analyze, and disseminate data on diseases and illnesses that may occur during foreign deployments or conflicts [33]. FMSS incorporates the GIDEON (Global Infectious Disease and Epidemiology Network) knowledge base from C. Y. Informatics Ltd. (Israel), as well as a comprehensive list of injuries, noninfectious diseases, and mental illnesses. GIDEON covers over 330 infectious and parasitic diseases from over 205 countries (Fig. 9) [34].

FMSS can help determine incidence rates, project short-term trends, profile the characteristics of the affected population by person, time, and place, track the mode of disease transmission, and generate a variety of graphs and reports. The system also provides some online medical references [33].

The GIS for the Long Island Breast Cancer Study Project (LIBCSP)

The aim of the LIBCSP is to investigate whether environmental factors are responsible for breast cancer in five U.S. counties. The investigation began in 1993 and is funded and coordinated by the National Cancer Institute (NCI), in collaboration with the National Institute of Environmental Health Sciences (NIEHS) [35]. The LIBCSP involves over 10 studies, including epidemiological studies, the establishment of a family breast and ovarian cancer registry, and laboratory research on mechanisms of action and susceptibility in the development of breast cancer. NCI is also developing a health-related GIS for Long Island as part of the LIBCSP to help researchers investigate relationships between breast cancer and the environment on Long Island, and to estimate exposures to environmental contamination [35].
Major UK Health GIS Programmes

The growing interest in GIS for health applications in the UK has resulted in the development of specialist health GIS units such as the West Midlands Health GIS Service in Birmingham [36], the Small Area Health Statistics Unit (SAHSU) in London [37], and more recently the Public Health Observatories of England [38].

FIG. 9. GIDEON is also available as a stand-alone expert system. This computer-driven Bayesian matrix can help diagnose most of the world’s infectious diseases based on the signs, symptoms, and laboratory findings that users enter for their patients. In this screenshot, the authors used GIDEON to diagnose a simple case of diarrhea in an infant (<2 years old). The “Diagnosis Results” shown (with probabilities) are for the United Kingdom as the country of disease acquisition. By just changing the geographical location of disease acquisition, we get a different differential diagnosis list and probabilities for the same patient, e.g., “Shigellosis (40.3%), Salmonellosis (37.4%), Campylobacteriosis (18.7%), and Rotavirus infection (<1%)” with Egypt as the country of disease acquisition (compare with UK figures in the screenshot).

The West Midlands Health GIS Service. Following a water contamination incident in Worcester in 1994, the West Midlands Regional Health Authority established a regional GIS service to assist in the surveillance of such environmental incidents and to improve the integration of datasets related to public health surveillance across the region. The GIS service is incorporated into the West Midlands Cancer Intelligence Unit and takes advantage of geographical datasets available to the academic community. Its work involves analyzing census data to calculate or estimate the demographics of study areas before integrating social, environmental, and health indicators. Part of the service’s work also concerns issues of access to and from health services (see below) [36].

SAHSU.
Following the identification of a cluster of childhood leukemia near the Sellafield nuclear plant in 1983, the Government established SAHSU within the Department of Epidemiology and Public Health at the Imperial College School of Medicine in London to analyze and interpret statistics relating to small areas. SAHSU uses GIS and incorporates national cause-specific data on deaths, cancers from the national cancer registry, hospital admissions, and congenital malformations, using the postal code of residence to locate cases to within 100 m and to identify those patients living near sources of environmental pollution. SAHSU is concerned with unusual clusters of disease, particularly in small areas and in the neighborhood of industrial installations. It carries out detailed epidemiological investigations of routine health statistics and any relevant environmental data in order to put any identified cluster in its proper context. SAHSU relates the number of cases observed in a specified area to the number expected for a typical population of the same size, age structure, and socioeconomic profile as the neighborhood in question. Small area deprivation measures, e.g., the Carstairs index, which are strongly predictive of mortality and cancer incidence, are also used to adjust disease rates for possible confounding by socioeconomic variables [37].

Other Applications

The Public Health Early Warning (PHEW) system of New Zealand’s Ministry of Health and the French Sentinel Network are two good examples of Web-based Health GIS applications. PHEW attempts to highlight anomalies in both time series and geographical patterns of communicable disease, in addition to its dynamic mapping and charting capabilities. The Sentinel Network of the French General Practitioners allows its users to run interactive queries on influenza, diarrheas, measles, mumps, chicken-pox, urethritis, and HIV, and obtain maps, time series graphs, or data arrays.
Animations (dynamic noninteractive maps) representing epidemiological evolutions, e.g., of a flu epidemic, are also available [39, 40].

Recent changes to the structure of the NHS (National Health Service in the UK) also had a significant bearing on the West Midlands Health GIS Service, which acted as a decision-making tool in the formation of Primary Care Groups (PCGs) in the West Midlands. Using GIS techniques to analyze existing census geography, the Service helped define the geography of PCGs and also contributed to their information needs [36].

UIC-SPH Research and Community Service Map Server

The School of Public Health at the University of Illinois in Chicago, Illinois, runs a Web-based map server that provides maps of its various research and community service projects within the Chicago metropolitan area. These range from community outreach for AIDS education to evaluating the impact of cigarette taxes on youth cigarette smoking, and cover disability, environmental and occupational issues, maternal and child health, geriatrics, mental health, minority health, public health training, substance abuse, and prevention of violence [41].

GISCA Projects in Australia

The National Key Centre for Social Applications of Geographical Information Systems (GISCA) at Adelaide University in Australia has recently published several health-related GIS projects sponsored by the Australian Department of Health and Aged Care. Among these projects is ARIA, an Accessibility/Remoteness Index of Australia derived from measures of road distance between populated localities and service centers. Remoteness scores are generated for any location in Australia. Other projects include a GP Rural Retention project (GPARIA, Fig. 10), a Pharmacy Access/Remoteness Index of Australia (PhARIA), a Rural Health project, and an Influenza Pandemic Model [42].
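To make the idea of a distance-based remoteness index more concrete, the following Python sketch scores localities by their distance to the nearest service center in each of several center categories. It is an illustration only and should not be read as the published ARIA methodology: great-circle distance stands in for road distance, and the function names, the cap of 3.0, and the scaling of each distance by the mean distance are all assumptions introduced here.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_km(locality, centres):
    """Distance from one locality to its closest service centre."""
    return min(great_circle_km(locality, c) for c in centres)


def remoteness_scores(localities, centres_by_category, cap=3.0):
    """Toy remoteness score (hypothetical, not the ARIA formula).

    localities: dict mapping name -> (lat, lon)
    centres_by_category: dict mapping category -> list of (lat, lon)
    For each category, a locality's nearest-centre distance is divided by the
    mean of that distance over all localities, capped, and summed.
    """
    scores = {name: 0.0 for name in localities}
    for centres in centres_by_category.values():
        dist = {name: nearest_km(pt, centres) for name, pt in localities.items()}
        mean = sum(dist.values()) / len(dist)
        for name, d in dist.items():
            scores[name] += min(cap, d / mean) if mean > 0 else 0.0
    return scores


# Invented coordinates, for illustration only.
localities = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (0.0, 5.0)}
centres = {"large": [(0.0, 0.0)], "small": [(0.0, 0.0), (0.0, 1.0)]}
scores = remoteness_scores(localities, centres)
for name in sorted(scores):
    print(name, round(scores[name], 2))
```

Higher scores indicate localities that are farther than average from every category of service center. In a real application, the distances would come from a road network analysis rather than from straight-line calculations.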
Healthcare Services/Access Applications

Access Studies at the UK West Midlands Health GIS Service

The West Midlands GIS Service (see above) looked at service catchment areas of breast cancer screening sites and at the relative distances patients must travel for gynecology and obstetrics services. Using GIS, health service catchment zones have been treated as complex rather than homogeneous surfaces by including variations in the ease of travel. This has supported the reconfiguration of existing services and the planning of new ones.

Avenir: A Healthcare Planning and Marketing System

Avenir is a fully integrated reporting and mapping system produced by Caredata.com (U.S.A.) and designed to help healthcare professionals analyze their market data, perform competitive market analysis, determine the best sites for new facilities, and identify future healthcare demand. It can import medical databases such as hospital discharges and Caredata.com custom datasets called HealthPacs, and create a wide variety of reports, maps, and graphs [43].

FIG. 10. Screenshot of a GISCA map server page showing GP ARIA values by point locations. Users can interact with the maps; for example, they can click the Information tool and then click on a feature and see its attributes.

HealthQuery

HealthQuery (U.S.A.) is a collection of Web-based public domain tools designed to assist California residents and health organizations in making more informed health decisions. It is a collaborative project of many organizations and end-users, including the Good Hope Medical Foundation; the California Department of Health Services–Center for Health Statistics; the National Health Foundation (NHF), a Los Angeles-based public benefit organization; and three companies: ESRI, Oracle, and Sun Microsystems. The included Health Facility Finder tool (Figs.
11 and 12) allows users to locate the hospitals, clinics, and emergency rooms that are nearest to them (within a user-defined radius) and provides them with detailed driving directions from their current locations to matching facilities. HealthQuery also plans to develop other tools to model and simulate the future supply and demand for healthcare services and to allow users to compare the current supply and demand for these services [44].

The Dartmouth Atlas of Health Care

The Center for the Evaluative Clinical Sciences at Dartmouth Medical School (CECS, U.S.A.) includes researchers in the fields of epidemiology, statistics, economics, medical sociology, medical geography, and clinical practice, among other disciplines. The principal research goal at CECS is to accurately describe the healthcare system in the United States and provide answers to a variety of resource planning and management questions. CECS produces a variety of healthcare atlases for this purpose, collectively known as the Dartmouth Atlas of Health Care [45].

FIG. 11. In this screenshot, we searched for the nearest hospitals within a 5-mile radius around 92373 (zip code, California). HealthQuery found four locations.

More Applications from the Literature/GIS Events

Albert et al. (2000) present a literature review of 10 GIS applications addressing research questions and problems related to healthcare delivery and disease ecology, and published between 1990 and 1992.
Topics covered by the reviewed applications include automated emergency response using network routing, AIDS surveillance using database queries, measuring accessibility and defining hospital catchment areas using Thiessen polygons, prediction of high blood lead levels among children and identification of high-risk census tracts using overlay operations and GIS modeling, exploring regional variations in radon potential using overlay operations, testing for the significance of cancer clusters in childhood leukemia, and measles surveillance [46].

Other more recent examples of health-related GIS applications are described in Gatrell and Löytönen (1998) and in Meade and Earickson (2000). These applications, developed between 1993 and 1997, include modeling lead poisoning risk potential in North Carolina, environmental and health applications in Italy, a multipurpose, interactive mortality atlas of Italy, and the development of an epidemiological spatial information system in Germany [3, 6].

FIG. 12. In this screenshot, we asked HealthQuery to give us detailed driving directions from near 92373 (zip code, California) to one of the facilities located in the figure (Redlands Community Hospital).

Lang (2000) presents 12 case studies in which GIS has been used to track the spread of infectious and environmentally caused diseases, site new hospitals and clinics based on demand and demographic factors, monitor toxic spills to protect the health of nearby residents, map demand for future nursing home facilities, and market pharmaceuticals [4].

At every major GIS conference, there is usually a session on health applications [3]. A good example of such sessions is the Healthcare Tracks of the Annual ESRI International User Conferences. ESRI provides free online access to the full text of most of the papers presented in these Healthcare Tracks [47]. Conferences fully dedicated to GIS applications in public health also exist.
The Third National Conference on GIS in Public Health, organized in 1998 by the CDC Agency for Toxic Substances and Disease Registry (U.S.A.), is one example. The postconference Web site offers abstracts and full papers covering environmental health protection, disease surveillance, and social and demographic analyses, in addition to some demonstration projects [48].

A recent MEDLINE query using the term “GIS” yielded 400 citations (February 2001) [49]. One of these is a paper by Schellenberg and colleagues (1998) of the Tropical Health Epidemiology Unit, London School of Hygiene and Tropical Medicine, UK, who used GPS and GIS to study the geographical pattern of admissions to hospital with severe malaria and the stability of this pattern over time in Kilifi District in Kenya. They were able to demonstrate that hospital admission rates for severe malaria were higher in households with better access to hospital than in those further away. They also found evidence of space–time clusters of severe malaria, suggesting that it would be beneficial to conduct case–control studies of environmental, genetic, and human behavioral factors involved in the etiology of malaria [50].

Generic GIS journals like Geoforum and Transactions in GIS also publish interesting health-related papers from time to time. One example is a paper published in Geoforum in 1995 by McLafferty and Tempalski, who studied the spatial distribution of low birth-weight infants in New York City and identified areas in which low birth-weight increased sharply during the 1980s. They used GIS buffering among other techniques in their study. Their results indicated that the rise in low birth-weight was closely linked to women’s declining economic status, inadequate insurance coverage and prenatal care, as well as the spread of cocaine [51]. Another more recent research article by Crabbe et al. (2000) appeared in Transactions in GIS.
This article is about a project called MEDICATE (Medical Diagnosis, Communication and Analysis Throughout Europe) that aims at determining the effect of air pollution on measured lung function and respiratory symptom parameters, and remedial actions to limit a patient’s exposure to elevated pollution levels. GIS and dispersion modeling are used as air quality assessment tools to integrate environmental information and estimate human personal exposure [52].

Barriers Facing the Adoption of GIS Technology in the Health Sector

There are many barriers preventing the rapid and wide adoption of GIS technology in health applications. Some of these obstacles are user-related, like “spatial illiteracy” among healthcare workers, and the time needed to acquire specific training in GIS. Novice users might find it difficult to perform GIS-based manipulations [6, 22]. The remaining barriers can be attributed to a variety of data availability/quality and output visualization issues.

User and Tools-Related Issues

Despite the vast amount of research and publications in geomatics, the discipline remains a “dark mystery” for most healthcare planners and policymakers throughout the world. This severely limits the practical usefulness of any research that has been published so far, and limits the technology to a small elite of researchers and enthusiastic users. Modern desktop GIS packages like ESRI ArcView try to overcome this by providing their advanced functionality through intuitive graphical user interfaces. However, choosing the right set of procedures to solve a GIS problem is still a complicated task that frustrates many novice users [22]. Fortunately, new tools and wizards have started to appear that let (experienced) GIS users save their work methodologies as process flow models (projects emptied of their data or data processing “templates”).
These models offer many possibilities, from rerunning them later with different input data or function parameters to integrating them into other larger models as needed. Most importantly, they act as know-how sharing and transfer tools for those users who do not have enough time or expertise to develop their own methodology. Task abstraction is what makes these models really powerful, reusable, and timesaving [53].

Professional education and hands-on training courses in geomatics are also extremely important in combating “spatial illiteracy.” A good example is the course held in 1995 at the Centers for Disease Control field station in Guatemala City. The course covered the use of GIS and associated peripherals like digitizing tablets and GPS to build a GIS public health infrastructure for research and control of tropical diseases in Latin American countries [54].

Data Issues

GIS data issues are related to the availability, quality, suitability, format (for compatibility with a given GIS package), and cost of the required input data. An inappropriate digital map data scale or level of detail for the task at hand can be a major source of trouble [6, 11, 22]. Moreover, due to patient privacy and confidentiality issues, GIS users sometimes do not have access to point patient data, and census tract data (polygon/aggregated data) are all that they can get, which might not be suitable for some types of studies.

A theme with an inappropriate classification scheme applied to it can also be problematic. This can be the result of an incorrect classification applied to the features of the theme, a classification having too many or too few categories (classes), or the use of mixed classifications for the same topic [11]. Worse still, some important themes might be totally missing.
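The effect of the classification scheme on a theme can be illustrated with a short Python sketch that classifies the same set of area rates in two common ways, using equal-interval and quantile classes. The rates, the function names, and the choice of four classes are hypothetical and serve only to show how a single extreme value can change the resulting picture.

```python
from bisect import bisect_left


def equal_interval_breaks(values, k):
    """Upper class limits that split the data range into k equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]


def quantile_breaks(values, k):
    """Upper class limits chosen so that each class holds about the same number of values."""
    if k < 2 or len(values) < k:
        raise ValueError("need at least k values and k >= 2")
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // k - 1] for i in range(1, k)]


def class_counts(values, breaks):
    """Number of values per class; a value equal to a break falls in the lower class."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[bisect_left(breaks, v)] += 1
    return counts


# Invented disease rates per 1000 people for eight areas; one area is an outlier.
rates = [1, 2, 3, 4, 5, 6, 7, 40]

equal_counts = class_counts(rates, equal_interval_breaks(rates, 4))
quantile_counts = class_counts(rates, quantile_breaks(rates, 4))
print("equal interval:", equal_counts)
print("quantile:      ", quantile_counts)
```

With these illustrative values, the equal-interval scheme places seven of the eight areas in the lowest class and leaves two classes empty, whereas the quantile scheme places two areas in each class. Neither scheme is right in general; the point is that the choice of classes should be made deliberately and documented with the theme.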
Because of the interdisciplinary nature of GIS, a single study might rely on and integrate data coming from different sources, e.g., health, environment, and socioeconomic data, all in one project. GIS users trying to link their own data to other specific sectors might find that the required external datasets are incomplete, unavailable, inaccessible, or difficult to integrate when they have been collected for other purposes [6].

The age/date of data is equally important and should always be mentioned in the accompanying metadata documentation. The date of the original data used in a coverage can be critical in population statistics, for example, and also for features that can change through time, like a city’s water supply network or roads. Old data may compromise the overall quality, accuracy, and relevance of results. Mixed dates, for example, of population figures of different districts in a city, make the figures incompatible and therefore unsuitable for comparative analysis.

Data completeness is another related issue. Sometimes feature data of a given area are not complete, and the user does not even know this. For example, some of the emergency health units might be missing in a theme covering a city; the user might think there are no services other than those presented in the theme, while actually there are other services that for some reason have not been presented. Inconsistent data quality or granularity (scale/level of detail) between different regions under consideration in a study might lead to uneven and sometimes misleading regional analysis results [11].

Visualization Issues

Visualization is a key feature of GIS and as such is not really a barrier. On the contrary, it can help us overcome communication barriers when properly used. Visualization can be very helpful in conveying the findings and results of a study to the decision-maker in a clear and easy-to-understand way.
However, visualization can also be very deceptive (the misuse potential) if the wrong scales, class intervals, methods, symbols, or colors are used. Cartography (the art and science of mapping) is a visual language with its own “grammar” rules that must be observed to achieve good communication of ideas and results [55]. There are established ways in which humans will interpret a map or chart, some of which lead to misinterpretation (under- or overestimation of data) and must be avoided [6]. For example, when the observed number of cases of a certain disease is higher in poor neighborhoods than in other neighborhoods, this can be due either to some relationship between disease and poverty or simply to there being more people in poor neighborhoods. In this case, mapping the disease rate per 1000 people in different neighborhoods instead of mapping the actual number of cases in each neighborhood can help us reach a correct conclusion [12].

CONCLUSION

Understanding the relationship between location and health/healthcare services can greatly assist us in disease prevention and in better service planning, with more efficient and effective resource utilization. Health geomatics has a key role to play in this regard. Although there are still many technical, cultural, and organizational barriers hindering the wide-scale adoption of GIS and related technologies in the health sector, the situation is rapidly changing. Tools are becoming easier to use, more powerful, and cheaper, and major health organizations are currently promoting and using the technology, as well as actively helping other key players to use it. This should ultimately lead to better health for the individual (through improved medical care) and the community (through carefully targeted Public Health programs).

REFERENCES

Footnote 2: All Web links have been checked by the authors on 21 February 2001.

1. Sack R. Conceptions of space in social thought: a geographic perspective. London: MacMillan, 1980.
2. Hall W. Just another medical geography page. Available at http://www.geocities.com/Tokyo/Flats/7335/medical geography.htm.
3. Meade MS, Earickson RJ. Medical geography. 2nd ed. New York: Guilford Press, 2000 [ISBN 1-57230-558-4].
4. Lang L. GIS for health organisations. California: ESRI Press, 2000 [ISBN 1-879102-65-X].
5. Gatrell A, Senior M. Health and health care applications. In: Longley PA, Goodchild MF, Maguire DJ, Rhind DW, editors. Geographical information systems. vol 2. Management issues and applications. New York: Wiley, 1999; 925–38 [ISBN 047133133-3].
6. Gatrell A, Löytönen M, editors. GIS and health (GISDATA 6). London: Taylor & Francis, 1998 [ISBN 0-7484-07790].
7. Geomatics Canada Web Site. Available at http://www.geocan.nrcan.gc.ca/.
8. Raper J. Multidimensional geographic information science. London: Taylor and Francis, 2000 [ISBN 0-7484-0507-0].
9. Jones C. Geographical information systems and computer cartography. Harlow: Prentice Hall, 1997 [ISBN 0-582-04439-1].
10. WHO. Geographical information systems (GIS): mapping for epidemiological surveillance. WHO Weekly Epidemiol Rec 1999; 74:281–5.
11. Davis BE. GIS: a visual approach. Santa Fe: Onword Press, 1996 [ISBN 1-56690-098-0].
12. Environmental Systems Research Institute, Inc. Educational material. Available at http://campus.esri.com.
13. Bryan B. GIS III lectures, GISCA, University of Adelaide, Australia. Available at http://www.gisca.adelaide.edu.au/~bbryan/lectures/lecture timetable.htm.
14. Dodge M, Kitchin R. Mapping cyberspace. London: Routledge, 2001 [ISBN 0-415-19884-4].
15. Renolen A. Modelling the real world: conceptual modelling in spatiotemporal information system design. Trans GIS 2000; 4(1):23–42.
16. Loslier L. GIS from a health perspective. In: de Savigny D, Wijeyaratne P, editors. GIS for health and the environment. Ottawa: International Development Research Centre (IDRC), 1995 [ISBN 0-88936-766-3].
17. Research Epidemiology Geographic Software (Public Health Research Laboratories). EpiAnalyst Extension for ArcView GIS. Available at http://www.phrl.org/REGS/Software.htm.
18. Kulldorff M, Rand K, Gherman G, Williams G, DeFrancesco D. Software for calculating the spatial, temporal and space–time scan statistics (SaTScan 2.1 for Windows). Division of Cancer Prevention, National Cancer Institute, US. Available at http://dcp.nci.nih.gov/bb/SaTScan.html.
19. Department of Geography, University of Iowa. Using distance mapping and analysis program (DMAP) for Windows to compute disease rates with variable spatial filters and test for statistical significance. Available at http://www.uiowa.edu/~geog/health/demo/dmap.html.
20. Levine N. CrimeStat: a spatial statistics program for the analysis of crime incident locations (version 1.1). Ned Levine & Associates: Annandale, VA/National Institute of Justice: Washington, DC, 2000. Available at http://www.nedlevine.com/nedlevine17.htm.
21. Dunn CE, Kingham SP, Rowlingson B, Bhopal RS, Cockings S, Foy CJ, Acquilla SD, Halpin J, Diggle P, Walker D. Analysing spatially referenced public health data: a comparison of three methodological approaches. Health Place 2001; Mar; 7(1):1–12. [Abstract]
22. Zeelenberg C. Health Information Resources (Chapter 22). In: van Bemmel JH, Musen MA, editors. Handbook of medical informatics. Heidelberg: Springer-Verlag, 1997, 360–1 [ISBN 3540-63351-0].
23. Cline BL. New eyes for epidemiologists: aerial photography and other remote sensing techniques. Am J Epidemiol 1970 Aug; 92(2):85–9.
24. Brain M. How a GPS receiver works. Available at http://www.howstuffworks.com/gps1.htm.
25. Centre for Health Applications of Aerospace Related Technologies. Available at http://geo.arc.nasa.gov/esdstaff/health/chaart.html.
26. Hassan AN, Beck LR, Dister S. Prediction of villages at risk for filariasis transmission in the Nile Delta using remote sensing and geographic information system technologies. J Egypt Soc Parasitol 1998; 28(1):75–87.
27. Hassan AN, Beck LR, Dister S. Spatial analysis of lymphatic filariasis distribution in the Nile Delta in relation to some environmental variables using geographic information system technology. J Egypt Soc Parasitol 1998; 28(1):119–31.
28. MALSAT Environmental Information Systems for Malaria. Available at http://www.liv.ac.uk/lstm/malsat.html.
29. WHO HealthMap. Available at http://www.who.int/emc/healthmap/healthmap.html.
30. WHO Leprosy Elimination Programme. Available at http://www.who.int/lep/Monitoring and Evaluation/gis.htm.
31. Pan American Health Organisation SIG-EPI Project for the Americas. Available at http://www.paho.org/english/sha/SHASIG.htm.
32. The MARA/ARMA Collaboration. Available at http://www.mara.org.za/.
33. Field Medical Surveillance System (FMSS), US Naval Health Research Centre, San Diego, CA. Available at http://www.nhrc.navy.mil/programs/fmss/.
34. GIDEON: Global Infectious Disease and EpidemiOlogy Network. Available at http://www.cyinfo.com/.
35. The GIS for the Long Island Breast Cancer Study Project. Available at http://www.healthgis-li.com/.
36. West Midlands Health GIS Service. Available at http://www.hsrc.org.uk/gis/.
37. Small Area Health Statistics Unit. Available at http://www.med.ic.ac.uk/divisions/60/RIF/.
38. Public Health Observatories, England. Available at http://www.pho.org.uk/.
39. The Public Health Early Warning (PHEW!) system of New Zealand’s Ministry of Health. Available at http://www.phew.govt.nz/.
40. The Sentinel Network of the French General Practitioners. Available at http://www.b3e.jussieu.fr/sentiweb/.
41. Research and Community Service Map server at the School of Public Health, University of Illinois, Chicago. Available at http://128.248.232.60/sph mapping 2/map.html.
42. Accessibility-Remoteness Index of Australia. Available at http://www.gisca.adelaide.edu.au/web aria/.
43. Avenir System Description. Available at http://www.tetrad.com/custm/healthcare.html.
44. HealthQuery: Find Facility Locations. Available at http://www.healthquery.org/chs.html.
45. The Dartmouth atlas of health care. Available at http://www.dartmouthatlas.org/.
46. Albert DP, Gesler WM, Levergood B. Spatial analysis, GIS, and remote sensing applications in the health sciences. Michigan: Ann Arbor Press, 2000 [ISBN 1-57504-101-4].
47. Healthcare Tracks of the Annual ESRI International User Conferences (Web sites). URIs: http://www.esri.com/library/userconf/proc00/professional/indices/track d2.htm and http://www.esri.com/library/userconf/proc99/proceed/indices/trackd3.htm and http://www.esri.com/library/userconf/proc98/PROCEED.HTM.
48. CDC ATSDR GIS in Public Health Post-Conference Web Site. Available at http://www.atsdr.cdc.gov/GIS/conference98/index.html.
49. MEDLINE Search. Available at http://www.ncbi.nlm.nih.gov/entrez/query.
50. Schellenberg JA, Newell JN, Snow RW, Mung’ala V, Marsh K, Smith PG, Hayes RJ. An analysis of the geographical distribution of severe malaria in children in Kilifi District, Kenya. Int J Epidemiol 1998 Apr; 27(2):323–9.
51. McLafferty S, Tempalski B. Restructuring and women’s reproductive health: implications for low birthweight in New York City. Geoforum 1995; 25(2):309–23.
52. Crabbe H, Hamilton R, Machin N. Using GIS and dispersion modelling tools to assess the effect of the environment on health. Trans GIS 2000; 4(3):235–44.
53. ESRI ArcView Spatial Analyst 2 Extension–ModelBuilder. Available at http://www.esri.com/software/arcview/extensions/modelbuilder-new.html.
54. Hightower AW, Klein RE. Building a geographic information system (GIS) public health infrastructure for research and control of tropical diseases. Emerg Infect Dis 1995; Oct–Dec; 1(4):156–7. Available at http://www.cdc.gov/ncidod/eid/vol1no4/hightow.htm.
55. Kraak MJ, Brown A. Web cartography: developments and prospects. London: Taylor & Francis, 2001 [ISBN 0-7484-0869-X].