Techniques used in geographic information systems
Data creation
Modern GIS technologies work with digital information, which can be created by several methods. The most common is digitization, in which a printed map or information collected in the field is transferred to a digital medium using a computer-aided design (CAD) program with georeferencing capabilities.
Given the wide availability of orthorectified imagery (both satellite and aerial), on-screen digitization is becoming the main way of extracting geographic data. It involves tracing geographic features directly on aerial images rather than locating them on a digitizing tablet, as in the traditional method.
The representation of the data
GIS data represents real-world objects (roads, land use, altitudes). Real-world objects can be divided into two abstractions: discrete objects (a house) and continuous objects (amount of rain fallen, an elevation). There are two ways to store data in a GIS: raster and vector.
GIS that focus on handling data in vector format are more popular in the market. However, raster GIS are widely used in studies that require continuous layers to represent non-discrete phenomena, and in environmental studies where high spatial precision is not required (atmospheric pollution, temperature distribution, location of marine species, geological analysis, etc.).
Raster data is essentially any type of digital image represented as a grid. The raster or grid GIS model focuses on the properties of space rather than on the accuracy of location. It divides space into regular cells, each of which holds a single value. This data model is well suited to representing continuous variables in space.
Anyone familiar with digital photography will recognize the pixel as the smallest unit of information in an image. A combination of pixels forms an image, in contrast to the scalable vector graphics that underlie the vector model. Whereas a digital image, such as a photograph or a piece of art transferred to the computer, is concerned with the output as a representation of reality, raster data reflects an abstraction of reality. Aerial photographs are a form of raster data commonly used for one purpose: to display a detailed base-map image on which digitization is performed. Other raster data sets may contain terrain elevations (a digital terrain model) or the reflectance of light at a particular wavelength (for example, data obtained by the Landsat satellites), among others.
Raster data is made up of rows and columns of cells, and each cell stores a single value. Raster data can be images (raster images), with a color value in each cell (or pixel). Other values recorded for each cell can be discrete, such as land use; continuous, such as temperature; or null if no data is available. Although a raster cell stores a single value, this can be extended by using raster bands to represent RGB (red, green, blue) colors, or by an extended attribute table with a row for each unique cell value. The resolution of a raster data set is the width of its cells in ground units.
Raster data is stored in various formats, from standard file-based structures such as TIFF or JPEG to binary large objects (BLOBs) stored directly in a database management system. Database storage, when properly indexed, generally allows rapid retrieval of raster data, but at the cost of storing millions of records of significant size.
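As a minimal sketch of the structure just described, a single-band raster can be modeled as rows and columns of cell values plus a cell size in ground units. The names `Raster`, `NODATA`, `origin_x` and `origin_y`, and the upper-left origin convention, are illustrative assumptions rather than the interface of any particular GIS package.

```python
from dataclasses import dataclass
from typing import List, Optional

NODATA = None  # assumed marker for cells where no data is available


@dataclass
class Raster:
    """A single-band raster: rows and columns of cells, one value per cell."""
    cells: List[List[Optional[float]]]  # indexed as cells[row][col]
    cell_size: float                    # cell width in ground units (the resolution)
    origin_x: float                     # x of the upper-left corner (assumed convention)
    origin_y: float                     # y of the upper-left corner

    def value_at(self, x: float, y: float) -> Optional[float]:
        """Return the value of the cell that contains ground coordinate (x, y)."""
        col = int((x - self.origin_x) // self.cell_size)
        row = int((self.origin_y - y) // self.cell_size)
        if 0 <= row < len(self.cells) and 0 <= col < len(self.cells[0]):
            return self.cells[row][col]
        return NODATA  # the coordinate falls outside the raster extent


# A 2 x 3 temperature raster with 100-unit cells and one null cell.
temperature = Raster(
    cells=[[12.5, 13.0, NODATA],
           [12.0, 12.8, 13.4]],
    cell_size=100.0, origin_x=0.0, origin_y=200.0,
)
```

With this layout, `temperature.value_at(150, 150)` looks up row 0, column 1. Doubling `cell_size` over the same extent would halve the number of rows and columns, which is the loss of detail associated with larger cells.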
In a raster model, the larger the dimensions of the cells, the lower the precision or detail (resolution) of the representation of the geographical space.
In a GIS, geographic features are often expressed as vectors, maintaining the geometric characteristics of the figures.
In vector data, the interest of the representations focuses on the precision of the location of the geographical elements in space and where the phenomena to be represented are discrete, that is, with defined limits. Each of these geometries is linked to a row in a database that describes its attributes. For example, a database describing lakes may contain data on lake bathymetry, water quality, or pollution level. This information can be used to create a map that describes a particular attribute contained in the database. Lakes can have a range of colors depending on the level of pollution. Furthermore, the different geometries of the elements can also be compared. Thus, for example, GIS can be used to identify those wells (point geometry) that are around 2 kilometers from a lake (polygon geometry) and that have a high level of contamination.
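The wells-and-lake query above can be sketched with plain planar geometry. This is only an illustration under stated assumptions: coordinates are in meters in a projected system, the lake is a simple polygon without holes, and the field names, the `HIGH_CONTAMINATION` threshold and the sample data are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_in_polygon(p, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_polygon(p, polygon):
    """Zero inside the polygon, otherwise the distance to its nearest edge."""
    if point_in_polygon(p, polygon):
        return 0.0
    n = len(polygon)
    return min(point_segment_distance(p, polygon[i], polygon[(i + 1) % n])
               for i in range(n))


lake = [(0, 0), (4000, 0), (4000, 3000), (0, 3000)]  # polygon geometry
wells = [                                            # point geometries with attributes
    {"id": "W1", "xy": (5000, 1500), "contamination": 0.9},
    {"id": "W2", "xy": (5500, 1500), "contamination": 0.2},
    {"id": "W3", "xy": (9000, 1500), "contamination": 0.95},
]
MAX_DISTANCE_M = 2000      # search radius around the lake
HIGH_CONTAMINATION = 0.8   # hypothetical threshold for a "high" level

selected = [w["id"] for w in wells
            if distance_to_polygon(w["xy"], lake) <= MAX_DISTANCE_M
            and w["contamination"] >= HIGH_CONTAMINATION]
```

The query combines a geometric condition (distance to the polygon) with an attribute condition (the contamination value stored in the row linked to each point).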
Vector elements can be created respecting territorial integrity through the application of topological rules such as "polygons must not overlap." Vector data can also be used to represent continuous variations of phenomena. Contour lines and triangulated irregular networks (TINs) are used to represent elevation or other continuously varying values. A TIN records values at point locations, which are connected by lines to form an irregular mesh of triangles. The faces of the triangles represent, for example, the surface of the land.
To digitally model real-world entities, three geometric elements are used: the point, the line and the polygon.[17]
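To illustrate how a triangle face of a TIN stands in for the terrain surface, the sketch below estimates the elevation at a location inside one triangle by linear (barycentric) interpolation between its three vertices. The function name and the sample coordinates are hypothetical.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linearly interpolate z at p = (x, y) within the triangle a, b, c.

    Each vertex is an (x, y, z) tuple. Returns None if p lies outside
    the triangle.
    """
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    # Barycentric weights of p with respect to the three vertices.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3


# One TIN face with elevations of 100, 200 and 300 at its vertices.
face = ((0, 0, 100.0), (10, 0, 200.0), (0, 10, 300.0))
elevation = interpolate_in_triangle((5, 5), *face)
```

A full TIN would first locate the triangle that contains the query point; that search is omitted here.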
Advantages and disadvantages of raster and vector models
There are advantages and disadvantages when using a raster or vector data model to represent reality.
Non-spatial data
Non-spatial data can also be stored along with the spatial data, that is, the data represented by the coordinates of a vector geometry or by the position of a raster cell. In vector data, the additional data contains attributes of the geographic entity. For example, a forest inventory polygon can also have a value that serves as an identifier, together with information about tree species. In raster data, the cell value can store attribute information directly, but it can also be used as an identifier that refers to records in a table.
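The second option for raster data, using the cell value as an identifier into a table, might look like the following sketch. The grid, the table contents and the function names are hypothetical examples.

```python
from collections import Counter

# Each cell stores an identifier rather than the attribute itself.
stand_ids = [
    [1, 1, 2],
    [1, 3, 2],
    [3, 3, 2],
]

# Attribute table with one record per unique cell value.
attribute_table = {
    1: {"cover": "forest", "species": "pine"},
    2: {"cover": "water", "species": None},
    3: {"cover": "forest", "species": "oak"},
}


def attributes_at(row, col):
    """Look up the non-spatial record linked to the cell at (row, col)."""
    return attribute_table[stand_ids[row][col]]


def cells_per_species():
    """Count cells for each species by joining the grid to the table."""
    return Counter(attribute_table[value]["species"]
                   for grid_row in stand_ids for value in grid_row)
```

Storing an identifier keeps the grid compact and lets several attributes be attached to the same cell value without adding bands.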
Capturing the data
Capturing data and entering it into the system takes up most of the time of GIS professionals. A wide variety of methods are used to enter data into a GIS, where it is stored in digital format.
Data printed on paper or maps on PET film can be digitized or scanned to produce digital data.
When analog cartography is digitized, vector data is produced by tracing points, lines, and polygon boundaries. This work can be done manually by a person or by vectorization programs that automate the work on a scanned map. In the latter case, however, some manual review and editing is generally still necessary, to an extent that depends on the level of quality required.
Data obtained from survey measurements can be entered directly into a GIS from digital data capture instruments using a technique called coordinate geometry (COGO). Additionally, position coordinates taken with a Global Positioning System (GPS) receiver can also be entered directly into a GIS.
Remote sensors also play an important role in data collection. These are sensors, such as cameras, scanners or lidar, attached to mobile platforms such as aircraft or satellites.
Currently, most digital data comes from the interpretation of aerial photographs. To do this, workstations are used that directly digitize geographical elements through stereoscopic pairs of digital photographs.
These systems allow data to be captured in two and three dimensions, with elevations measured directly from a stereoscopic pair according to the principles of photogrammetry.
Satellite remote sensing provides another important source of spatial data. In this case, satellites use different sensors to measure the reflectance of parts of the electromagnetic spectrum, or radio waves that are sent from an active sensor such as radar. Remote sensing collects raster data that can be processed using different bands to determine classes and objects of interest, such as different land covers.
When data is captured, the user must consider whether it should be captured with relative or absolute accuracy. This decision is important since it influences not only the interpretation of the information but also the cost of its capture.
Likewise, mobile mapping is a technique for collecting point clouds, geolocated 360° images and geographic data using technologies mounted on a vehicle or other mobile platform, which travels through urban environments and inventories their different elements (public lighting, sanitation and drinking water, urban signage, etc.).
In addition to capturing and entering spatial data, attribute data is also entered into a GIS.
During cartography digitization, involuntary topological errors (such as dangles) commonly occur in the vector data and must be corrected. After data is entered into a GIS, it normally requires further editing or processing to eliminate these errors. A "topological correction" must be made before the data can be used in some advanced analyses; in a road network, for example, lines must be connected to nodes at intersections.
Raster-vector data conversion
GIS can carry out data restructuring to transform it into different formats. For example, it is possible to convert a satellite image to a map of vector elements by generating lines around cells with the same classification, determining their spatial relationship, such as proximity or inclusion.
Unassisted vectorization of raster images using advanced algorithms is a technique that has been under development since the late 1960s. It relies on contrast enhancement, false-color imaging, and filters designed using two-dimensional Fourier transforms.
The reverse process of converting vector data to a data structure based on a raster matrix is called rasterization.
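A very simple form of rasterization can be sketched by testing whether the center of each cell falls inside the polygon. This cell-center rule is only one possible convention, chosen here for brevity; the function names, the upper-left origin and the sample polygon are assumptions.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, n_rows, n_cols, cell_size, origin_x, origin_y):
    """Return a grid with 1 where the cell center is inside the polygon, else 0.

    (origin_x, origin_y) is the upper-left corner; rows run from top to bottom.
    """
    grid = []
    for row in range(n_rows):
        center_y = origin_y - (row + 0.5) * cell_size
        grid.append([
            1 if point_in_polygon(origin_x + (col + 0.5) * cell_size,
                                  center_y, polygon) else 0
            for col in range(n_cols)
        ])
    return grid


parcel = [(1, 1), (3, 1), (3, 3), (1, 3)]  # a square vector polygon
grid = rasterize(parcel, n_rows=4, n_cols=4, cell_size=1.0,
                 origin_x=0.0, origin_y=4.0)
```

A smaller `cell_size` follows the polygon boundary more closely at the cost of more cells.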
Since digital data is collected and stored in both vector and raster forms, a GIS must be able to convert geographic data from one storage structure to another.
Projections, coordinate systems and reprojection
Before the data is analyzed in the GIS, all cartography must be in the same projection and coordinate system. To achieve this, it is often necessary to reproject the information layers before integrating them into the geographic information system.
The Earth can be represented cartographically by several mathematical models, each of which can provide a different set of coordinates (e.g. latitude, longitude, altitude) for any given point on its surface. The simplest model assumes that the Earth is a perfect sphere. As more measurements of the planet have accumulated, models of the Earth's shape have become more sophisticated and more precise. In fact, some of them are applied to particular regions of the Earth to provide greater precision (for example, the European Terrestrial Reference System 1989, ETRS89, works well in Europe but not in North America).
Projection is a fundamental component when creating a map. A mathematical projection is the way to transfer information from a model of the Earth, which represents a curved surface in three dimensions, to another two-dimensional model such as paper or a computer screen. To do this, different cartographic projections are used depending on the type of map you want to create, since there are certain projections that are better adapted to some specific uses than others. For example, a projection that accurately represents the shape of the continents distorts their relative sizes.
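The trade-off between shape and size can be illustrated with the spherical Mercator projection, which preserves local shape but enlarges areas away from the equator. The sketch assumes a perfectly spherical Earth with a mean radius of 6,371 km; the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean radius, assuming a spherical Earth


def mercator_forward(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude in degrees to planar x, y in meters."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


def mercator_area_scale(lat_deg):
    """Factor by which the projection enlarges small areas at a latitude."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


x, y = mercator_forward(-3.7, 40.4)   # a point at 40.4 N, 3.7 W
scale_60 = mercator_area_scale(60.0)  # area enlargement at 60 degrees latitude
```

On the sphere the area factor is 1 at the equator and about 4 at 60° latitude, which is why high-latitude landmasses look oversized on Mercator maps.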
Since much of the information in a GIS comes from existing cartography, a geographic information system uses computer processing power to transform digital information, obtained from sources with different projections or different coordinate systems, to a common projection and coordinate system. In the case of images (orthophotos, satellite images, etc.) this process is called rectification.
Spatial analysis using GIS
Given the wide range of spatial analysis techniques developed over the last half century, any summary or review can cover the topic only in limited depth. This is a rapidly changing field, and GIS software packages increasingly include analysis tools, either in their standard versions or as optional extensions. In many cases such tools are provided by the original software vendors, while in others new functionality is developed and supplied by third parties. Additionally, many products offer software development kits (SDKs), programming languages, scripting languages, etc. that allow users to develop their own analysis tools or other functions.
A GIS can recognize and analyze the spatial relationships that exist in stored geographic information. These topological relationships allow for complex spatial modeling and analysis. Thus, for example, the GIS can discern the cadastral parcel or parcels that are crossed by a high voltage line, or know which group of lines form a certain road.
In short, we can say that in the field of geographic information systems, topology is understood as the spatial relationships between the different graphic elements (node/point topology, network/arc/line topology, polygon topology) and their position on the map (proximity, inclusion, connectivity and neighborhood). These relationships, which for humans may be obvious to the naked eye, must be established by the software using a language and rules of mathematical geometry.
Analyses that depend on the topological consistency of the elements in the database usually require prior validation and topological correction of the graphic information. GIS tools exist that correct common errors automatically or semi-automatically.
Networks
A GIS intended for calculating optimal routes for emergency services can determine the shortest path between two points, taking into account street directions, the permitted direction of travel and prohibited directions, while avoiding impassable areas. A GIS for managing a water supply network would be able to determine, for example, how many subscribers would be affected by an interruption of service at a given point in the network.
A geographic information system can simulate flows along a linear network. Values such as slope, speed limit, service levels, etc. can be incorporated into the model to obtain greater accuracy. GIS-based network modeling is commonly used in transportation planning, hydrological planning and linear infrastructure management.
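The routing case can be sketched with Dijkstra's algorithm on a directed graph, where a one-way street is modeled by simply omitting the edge in the prohibited direction. Dijkstra's algorithm is used here as a common choice for illustration; the text does not say which algorithm a given GIS uses, and the street network below is hypothetical.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm on a directed graph.

    graph maps each node to a list of (neighbor, cost) pairs.
    Returns (path, total_cost), or (None, inf) if goal is unreachable.
    """
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1], best[goal]


# Edge costs are lengths in meters. A-C and C-D are one-way streets:
# the reverse edges C->A and D->C are deliberately absent.
streets = {
    "A": [("B", 300.0), ("C", 500.0)],
    "B": [("A", 300.0), ("D", 400.0)],
    "C": [("D", 100.0)],
    "D": [("B", 400.0)],
}

route, length = shortest_path(streets, "A", "D")
```

Impassable areas can be handled the same way, by removing the affected edges before the search. Replacing lengths with travel times derived from speed limits changes the result from the shortest to the fastest route.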
Map Overlay
Combining multiple sets of spatial data (points, lines, or polygons) can create a new set of vector data. Visually, it is similar to stacking several maps of the same region. These overlays are analogous to the set operations shown in a Venn diagram. A union overlay combines the geographic features and attribute tables of all the layers into a new layer. An intersection defines the area in which the layers overlap, and the result keeps the attributes of each layer for every resulting region. A symmetric difference overlay defines a resulting area that includes the total surface of both layers except for the zone where they intersect.
In the analysis of raster data, data sets are overlaid through a process known as map algebra, which applies simple mathematical operations to combine the values of each raster matrix. In map algebra it is possible to weight the layers, assigning each a degree of importance that reflects the influence of the corresponding factor on a geographic phenomenon.
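The set logic behind these three overlays can be shown with a deliberately simplified model in which each layer is the set of unit cells it covers, with one attribute per cell. Real vector overlay computes geometric intersections between polygons; the cell model, the layer names and the attribute values here are assumptions made only to keep the example short.

```python
# Each layer maps a covered unit cell (col, row) to that layer's attribute.
soils = {(0, 0): "clay", (1, 0): "clay", (0, 1): "sand", (1, 1): "sand"}
zoning = {(1, 0): "urban", (2, 0): "urban", (1, 1): "rural", (2, 1): "rural"}


def union(a, b):
    """All cells covered by either layer, carrying attributes from both."""
    return {cell: (a.get(cell), b.get(cell)) for cell in a.keys() | b.keys()}


def intersection(a, b):
    """Only the cells covered by both layers, with both attributes."""
    return {cell: (a[cell], b[cell]) for cell in a.keys() & b.keys()}


def symmetric_difference(a, b):
    """Cells covered by exactly one of the two layers."""
    return {cell: (a.get(cell), b.get(cell)) for cell in a.keys() ^ b.keys()}


both = intersection(soils, zoning)
either = union(soils, zoning)
only_one = symmetric_difference(soils, zoning)
```

In the union and symmetric difference results, the attribute contributed by a layer that does not cover a cell is `None`.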
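A weighted overlay in map algebra amounts to a cell-by-cell weighted sum of aligned rasters. The sketch below assumes rasters of identical dimensions stored as nested lists; the layer names, scores and weights are hypothetical.

```python
def weighted_overlay(layers, weights, nodata=None):
    """Combine equally sized rasters cell by cell as a weighted sum.

    A cell is nodata in the result if it is nodata in any input layer.
    """
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    result = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            values = [layer[r][c] for layer in layers]
            if any(v is nodata for v in values):
                out_row.append(nodata)
            else:
                out_row.append(sum(w * v for w, v in zip(weights, values)))
        result.append(out_row)
    return result


slope_score = [[1, 2],
               [3, None]]  # None marks a cell with no data
soil_score = [[3, 2],
              [1, 2]]

# Slope is treated as three times as important as soil in this example.
suitability = weighted_overlay([slope_score, soil_score], weights=[0.75, 0.25])
```

Choosing weights that sum to 1 keeps the result on the same scale as the input scores.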
Automated cartography
Both digital cartography and geographic information systems encode spatial relationships in structured formal representations. GIS are used in the creation of digital cartography as tools that allow an automated or semi-automated map-making process called automated cartography to be carried out.
In practice, automated cartography is a subset of GIS that corresponds to the final map composition phase, and not all GIS software offers this functionality.
The resulting final cartographic product can be in both digital and printed formats. The combined use in certain GIS of powerful spatial analysis techniques together with professional cartographic representation of the data means that high-quality maps can be created in a short period of time. The main difficulty in automated cartography is using a single data set to produce several products at different scales, a technique known as generalization.
Geostatistics
Geostatistics analyzes spatial patterns in order to make predictions from specific spatial data. It is a way of examining the statistical properties of spatial data. Unlike common statistical applications, geostatistics uses graph theory and matrix algebra to reduce the number of parameters in the data. Only then is the data associated with each geographic entity analyzed.
When phenomena are measured, the observation methods dictate the accuracy of any subsequent analysis. Because of the nature of the data (e.g., traffic patterns in an urban environment, weather patterns over the ocean, etc.), a constant or dynamic degree of precision is always lost in the measurement. This loss of precision is determined by the scale and distribution of the data collected. GIS provides tools that help carry out these analyses, most notably the generation of spatial interpolation models.
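As one simple example of spatial interpolation, the sketch below uses inverse distance weighting (IDW), in which nearer samples count more than distant ones. IDW is a deterministic method chosen here for brevity; the text does not name a specific method, and geostatistical techniques such as kriging are more involved. The function name, the `power` parameter and the sample values are assumptions.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples.

    Each sample is weighted by 1 / distance**power.
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # the location coincides with a sample
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Two temperature readings 10 units apart.
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
halfway = idw(stations, 5.0, 0.0)
near_first = idw(stations, 2.0, 0.0)
```

Evaluating such a function at every cell center of a grid produces a continuous raster layer from scattered point measurements.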
Geocoding
Geocoding is the process of assigning geographic coordinates (latitude and longitude) to map features (addresses, points of interest, etc.). One of its most common uses is georeferencing postal addresses. This requires base cartography against which to reference the geographic codes. The base layer can be, for example, a layer of street centerlines with street names and house numbers. The addresses to be georeferenced, which usually come from tables, are positioned by interpolation or estimation. The GIS then locates the point on the street centerline layer that is closest to reality according to the geocoding algorithms it uses.
Geocoding can also be done with more precise real data (for example, cadastral mapping). In this case, the result of the geocoding matches reality more closely, and this data takes precedence over the interpolation method.
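Positioning an address by interpolation can be sketched as follows, assuming that house numbers are spread evenly along a straight street segment. The segment record and its field names are hypothetical, and real geocoders also handle odd and even sides, curved segments and address parsing.

```python
def geocode_on_segment(house_number, segment):
    """Estimate the coordinates of a house number by linear interpolation.

    Returns None if the number is outside the range covered by the segment.
    """
    first, last = segment["first_number"], segment["last_number"]
    if not min(first, last) <= house_number <= max(first, last):
        return None
    # Fraction of the way along the segment, from 0.0 to 1.0.
    t = 0.0 if first == last else (house_number - first) / (last - first)
    (x1, y1), (x2, y2) = segment["start"], segment["end"]
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))


main_street = {
    "name": "Main Street",
    "start": (0.0, 0.0), "end": (200.0, 0.0),  # planar coordinates
    "first_number": 1, "last_number": 100,
}

location = geocode_on_segment(50, main_street)
```

The returned position is an estimate along the centerline, not the surveyed location of the building.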
In reverse geocoding the process runs the other way: an estimated street address, with its house number, is assigned to given x,y coordinates. For example, a user could click on a layer representing the street centerlines of a city and obtain the postal address, including the house number, of a building. The GIS estimates this house number by interpolating from numbers that are already known. If the user clicks on the midpoint of a segment that starts at number 1 and ends at number 100, the value returned for the selected location will be close to 50. Note that reverse geocoding does not return actual addresses, only estimates of what should exist based on known data.
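The midpoint example can be reproduced with a short sketch that projects the clicked point onto a street segment and interpolates between the known numbers at its ends. As before, the segment record and function name are hypothetical, and house numbers are assumed to be evenly spaced along a straight segment.

```python
def reverse_geocode(point, segment):
    """Estimate the house number nearest to a clicked point.

    The point is projected onto the segment, and the number is interpolated
    between the known numbers at the two ends.
    """
    px, py = point
    (x1, y1), (x2, y2) = segment["start"], segment["end"]
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    # Fraction along the segment of the closest point, clamped to [0, 1].
    t = 0.0 if length_sq == 0 else ((px - x1) * dx + (py - y1) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    first, last = segment["first_number"], segment["last_number"]
    return segment["name"], round(first + t * (last - first))


main_street = {
    "name": "Main Street",
    "start": (0.0, 0.0), "end": (200.0, 0.0),
    "first_number": 1, "last_number": 100,
}

# A click slightly off the centerline, halfway along the segment.
street, number = reverse_geocode((100.0, 5.0), main_street)
```

The result is an estimate: nothing guarantees that a building with the returned number actually exists at that spot.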