2. Data Entry

2.1. Direct Spatial Data Acquisition

Spatial data can be acquired from several sources. The amount of data acquired (or produced) from remotely sensed sources, such as satellite imagery, has been increasing.

Other sources of spatial data include aerial surveys, terrestrial surveys and crowdsourcing. Each source has its own strengths and weaknesses, so how suitable a source is for a particular analysis depends on its acquisition method. In this section we focus on methods for direct, or primary, data acquisition.

Attention

Question. What are possible advantages and disadvantages of using the data sources listed in the table below? Fill in at least one advantage and one disadvantage for each source.

Data Source | Advantage | Disadvantage
Remote sensing | |
Aerial survey [1] | |
Terrestrial survey | |
Crowdsourcing | |

[1] It should be noted that aerial surveys are a form of remote sensing, but not from space.


2.2. Indirect Spatial Data Acquisition

Although spatial data can be acquired from third-party sources such as government agencies or specialised companies, there will always be cases where you need to acquire your own data. This usually means ‘digitising’, also known as ‘vectorisation’: the process of capturing objects from a raster base layer, such as a map or an aerial photograph, as points, lines and polygons. In this section, we focus on methods for indirect, or secondary, data acquisition; specifically, on the main techniques used for vectorisation.

Attention

Question. Read about Digitising and Scanning and observe how the two relate. Is digitising the only way to turn a scan into spatial data?

Important

Resources. You will require the latest LTR version of QGIS, plus the dataset data_entry.zip. When you unzip the dataset, you will find the following files inside:

  • data_entry.qgs – a QGIS project file;

  • checking_errors.qgs – a QGIS project file;

  • Pearl_Harbour_topographic_map_(1999).tif – a raster map;

  • Educational_facilities.csv – tabular data;

  • Polygons.gpk – a polygon vector layer.

2.2.1. Digitising

Extracting the data you need from a raster base map into a vector layer starts with creating a new dataset (i.e. a layer) in which the features you are about to create will be stored. Technically speaking, it is a simple task; however, you should always take a moment to assess the requirements before proceeding with the actual software operation.

Capturing elements from a base map is an abstraction exercise; this abstraction depends on the scale and the purpose for which the data will be used. For example, think of airports: will you represent them (abstract them) as points or as polygons? The answer depends on how you are going to use the data. If you want to publish a world map of the major airports, you could probably depict them as points. But if you are going to map the accessibility of a given airport, a larger scale will be needed; therefore, polygons might be better.
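To make the airport example concrete, the following Python sketch keeps a polygon for large-scale maps and collapses it to its centroid for small-scale maps. The function names `polygon_centroid` and `abstract_feature`, the sample outline, and the threshold of 1:250,000 are illustrative assumptions; they are not part of the course data or of QGIS.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Ring = List[Point]


def polygon_centroid(ring: Ring) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon: zero area")
    return (cx / (3 * area2), cy / (3 * area2))


def abstract_feature(ring: Ring, scale_denominator: float,
                     threshold: float = 250_000):
    """Pick a representation based on map scale.

    A large denominator means a small-scale map (e.g. a world map),
    where a point is usually enough. The threshold is an assumption.
    """
    if scale_denominator > threshold:
        return ("point", polygon_centroid(ring))
    return ("polygon", ring)


# Made-up airport outline in arbitrary map units.
airport = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
world_map = abstract_feature(airport, 50_000_000)  # 1:50,000,000
local_map = abstract_feature(airport, 10_000)      # 1:10,000
```

In practice the choice is rarely a single threshold; the intended use of the data matters as much as the scale.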

The attributes associated with the geometries are another important aspect to consider. The choice of attributes depends not only on the scale and intended use, but also on the availability of the data. For an airport, for example: what is its capacity? How does it rank on security? How many international connections does it serve? Is this information you need? And if so, do you have access to it?

Task 1

Start QGIS and open the data_entry.qgs project. Among others, you will see a layer named Pearl_Harbour_topographic_map_(1999).tif. Observe the map and complete the table below, considering the following requirements:

  • Think of at least three vector layers that can be acquired from the raster base map;

  • Make sure all geometric types (Polygon, Line and Point) are represented;

  • For each layer think of at least two attributes.

LayerName | Geometric Type | Attribute 1 | Attribute 2
water_lines | line | Id | length

Task 2

Now that you know what you want to extract and how you are going to abstract it, proceed with the creation of the new layers. Digitise at least three features per layer.

For this task, you may want to watch the video tutorial on basic digitising.

Note

QGIS. Refer to Editing for a detailed description of vector editing with QGIS.

2.2.2. Topological Consistency

Topology refers to the spatial relationships that should exist among the geometries of a vector dataset, and it is based on the topological data model. Topology can be a complex subject, but we will take a very pragmatic approach and show you how to maintain the most common topological relationships: adjacency of polygons and connectivity of lines.

topological relations

Fig. 2.1 Common topological relations on polygons, lines, and points

In the previous task, for the layer of geometry type ‘Line’, you probably digitised something that is supposed to be a network, like roads or water lines. The key characteristic of a network is connectivity. However, if you digitised lines that are supposed to be connected and you zoom in to the point where they should intersect, you will often see that the lines are not actually connected. Instead, you will see connectivity issues either by excess or by insufficiency (also known as overshoots and undershoots, respectively).

undershoot

Fig. 2.2 Connectivity issues between lines. The case of undershooting
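One way to look for such connectivity issues is to flag line endpoints that lie very close to another line without touching it. The sketch below is a minimal, pure-Python illustration; `find_near_misses`, the sample coordinates and the tolerance value are assumptions made for this example and do not correspond to a QGIS tool.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:  # a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its extent.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def find_near_misses(lines, tolerance, eps=1e-9):
    """Return (line_index, endpoint, other_index, distance) tuples for
    endpoints that are within `tolerance` of another line but not on it."""
    issues = []
    for i, line in enumerate(lines):
        for endpoint in (line[0], line[-1]):
            for j, other in enumerate(lines):
                if i == j:
                    continue
                d = min(point_segment_distance(endpoint, a, b)
                        for a, b in zip(other, other[1:]))
                if eps < d <= tolerance:
                    issues.append((i, endpoint, j, d))
    return issues


# A main road and a side road that stops 0.3 units short of it.
roads = [[(0.0, 0.0), (10.0, 0.0)], [(5.0, 5.0), (5.0, 0.3)]]
issues = find_near_misses(roads, tolerance=0.5)

# The same side road, digitised so that it ends exactly on the main road.
connected = [[(0.0, 0.0), (10.0, 0.0)], [(5.0, 5.0), (5.0, 0.0)]]
no_issues = find_near_misses(connected, tolerance=0.5)
```

The check cannot tell an undershoot from a short overshoot, and a genuine dead end that happens to be near another line would also be flagged, so the results still need visual inspection.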

To ensure topological consistency between geometries, e.g. that line segments get properly connected while digitising, we have to set a snapping tolerance, which tells the GIS software to automatically connect lines that are within a certain distance of each other. Otherwise, it will be challenging to ensure that our lines are connected.
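The idea behind a snapping tolerance can be sketched in a few lines of Python. This is only an illustration of the concept, not how QGIS implements it: the function name `snap_to_vertex` is hypothetical, the tolerance is expressed in map units (QGIS also lets you use pixels), and only snapping to vertices is shown, not snapping to segments.

```python
import math


def snap_to_vertex(point, existing_vertices, tolerance):
    """Return the nearest existing vertex if it lies within `tolerance`
    of `point`; otherwise return `point` unchanged."""
    best = None
    best_distance = tolerance
    for vertex in existing_vertices:
        d = math.hypot(point[0] - vertex[0], point[1] - vertex[1])
        if d <= best_distance:
            best, best_distance = vertex, d
    return best if best is not None else point


vertices = [(0.0, 0.0), (10.0, 0.0)]  # vertices of an existing line
snapped = snap_to_vertex((9.8, 0.1), vertices, tolerance=0.5)
unchanged = snap_to_vertex((5.0, 5.0), vertices, tolerance=0.5)
```

A click near the end of the existing line is moved onto its last vertex, while a click far from any vertex is left where it is.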

Task 3

In QGIS, go to Project > Snapping Options and enable snapping. Enter a tolerance of 20 px for every line layer that you have.

You may want to watch the video tutorial on advanced editing.

Task 4

Digitise some new lines, making sure they are topologically connected. While digitising, you will notice that if you move closer than a certain distance to an existing feature, the line is automatically ‘pulled’ towards the nearest vertex or segment of that feature. This is how snapping ensures connectivity.

In the case of polygons, it is also possible to ensure that adjacent polygons do not overlap.

Attention

Question.

  • How do you define a snapping tolerance?

  • What do the options ‘Enable topological editing’ and ‘Enable snapping on intersection’ allow you to do? Try to think of situations where these options might be useful.

Note

Reflection. Ensuring the topological consistency of your vector data is usually not that difficult if you are in control of the data acquisition technique (vectorisation) from the moment the dataset is created. Problems often arise when you receive datasets from third parties. When that happens, you should always check that the dataset maintains the basic topological relations.

Task 5

Start QGIS and open the checking_errors.qgs project. You will see one layer (polygons). Find a way to check if there are overlapping or adjacency errors automatically. Tip: you may want to install and activate the Geometry Checker plugin. Once it is activated, it should be reachable from the Vector menu.

_images/geometry-checker.png

In some cases, detecting and fixing topological errors is not that simple. Just keep in mind that you should always check the integrity of the data you receive, especially if you do not know the source and lineage of the data.
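To give an idea of what an automatic overlap check does, the sketch below computes the area shared by two polygons using Sutherland-Hodgman clipping. It is a simplified illustration under strong assumptions: both polygons must be convex with vertices in counter-clockwise order, and the function names and sample squares are made up for this example. It is not the algorithm used by the Geometry Checker plugin.

```python
def clip(subject, clipper):
    """Clip `subject` against a convex, counter-clockwise `clipper`."""
    def inside(p, a, b):
        # True if p is on the left of (or on) the directed edge a->b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersection(p, q, a, b):
        # Intersection of the infinite lines through p-q and a-b.
        x1, y1 = p
        x2, y2 = q
        x3, y3 = a
        x4, y4 = b
        denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
        c1 = x1 * y2 - y1 * x2
        c2 = x3 * y4 - y3 * x4
        return ((c1 * (x3 - x4) - (x1 - x2) * c2) / denom,
                (c1 * (y3 - y4) - (y1 - y2) * c2) / denom)

    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        ring, output = output, []
        if not ring:
            break
        prev = ring[-1]
        for cur in ring:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersection(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersection(prev, cur, a, b))
            prev = cur
    return output


def ring_area(ring):
    """Unsigned area of a ring (shoelace formula)."""
    total = 0.0
    for i in range(len(ring)):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % len(ring)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


def overlap_area(poly_a, poly_b):
    """Area shared by two convex, counter-clockwise polygons."""
    shared = clip(poly_a, poly_b)
    return ring_area(shared) if len(shared) >= 3 else 0.0


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
overlapping = [(1, 1), (3, 1), (3, 3), (1, 3)]  # overlaps `square`
adjacent = [(2, 0), (4, 0), (4, 2), (2, 2)]     # shares an edge only
overlap_result = overlap_area(square, overlapping)
adjacent_result = overlap_area(square, adjacent)
```

A shared area greater than zero signals an overlap error, whereas correctly adjacent polygons share only a boundary and therefore no area. Real datasets contain concave polygons and holes, which is why a dedicated tool is preferable in practice.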

2.2.3. Spatialising Data

Another way to acquire spatial data is to spatialise existing data; in other words, to associate a geographic location with objects. This is a very common procedure when you receive, for example, a spreadsheet or some other kind of tabular data.

You can spatialise your data in two ways: by means of a join (a concept that will be explored later in the course) or, if the tabular data contains X and Y coordinates, by building point geometries from them.
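The second approach, building point geometries from coordinate columns, can be sketched with the Python standard library alone. The column names `X` and `Y`, the sample rows, and the function `csv_to_point_features` are assumptions made for illustration; the actual columns of the course CSV file may differ. The sketch outputs GeoJSON-style features and ignores coordinate reference systems entirely.

```python
import csv
import io


def csv_to_point_features(text, x_field="X", y_field="Y"):
    """Turn CSV rows into GeoJSON-style point features.

    Rows without usable coordinates are skipped.
    """
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            x = float(row[x_field])
            y = float(row[y_field])
        except (KeyError, TypeError, ValueError):
            continue  # missing column, missing value, or not a number
        # Everything that is not a coordinate becomes an attribute.
        attributes = {k: v for k, v in row.items()
                      if k not in (x_field, y_field)}
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [x, y]},
            "properties": attributes,
        })
    return features


# Made-up sample data; the second row has no coordinates.
sample = "name,X,Y\nSchool A,10.5,20.25\nSchool B,,\n"
features = csv_to_point_features(sample)
```

Only the first row yields a feature; the row with empty coordinates cannot be placed on a map and is dropped.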

Task 6

Spatialising data. Open the data_entry.qgs project and create a point layer using the educational_facilities.csv file. Follow the steps depicted in the screenshot below.

Create new point layer

Fig. 2.3 Steps to create a point layer from the ‘educational_facilities.csv’ file

Attention

Question. If all went well, you should have ended up with a layer of points in your project. Does that mean that the educational_facilities.csv is spatial data?

In the Appendices section, you will find a list of Common GIS File Formats.

Section author: André da Silva Mano & Manuel Garcia Alvarez