
Data Quality of Citizen Science Observations of Organisms: Plausibility Estimation Based on Volunteered Geographic Information Context

Jacobs, Clemens



Abstract

In a growing number of Citizen Science projects, volunteers from the general public collect large amounts of observation data of organisms. Such data are an important contribution to biodiversity research, providing information on the distribution of species over large areas and long periods of time. In the current global biodiversity crisis, such information is urgently needed to support research and conservation efforts. One of the most important issues that must be addressed before these data can be used effectively is data quality. This is a concern especially with data that are collected in a casual way, without strict, formal protocols ensuring certain standards of data quality before or during the data collection process. There is a great need for approaches that allow the data quality of casual citizen science observations of organisms to be assessed automatically, in order to cope with the large amounts of observation data produced by casual biodiversity citizen science projects. Casual citizen science observations of organisms are not only biological but also geographical data, because they always possess location information. Collected mostly online and by untrained volunteers, they belong to the emerging domain of geographic information called Volunteered Geographic Information (VGI, Goodchild 2007). Approaches based on geographical criteria are therefore a promising avenue towards providing suitable methods for quality assessment. At the same time, casual citizen science observations of organisms are a special kind of VGI, because they mostly do not represent permanent objects, but rather have the nature of events that cannot be proven to be correct or incorrect. Quality assessment must therefore resort to proxy approaches such as estimating the plausibility of an observation in light of certain reference information.
This thesis developed and evaluated novel approaches to quality assessment of casual citizen science observations of organisms based on estimating the plausibility of observations in light of VGI context. It employed two use cases of casual citizen science projects with two different areas of interest: ArtenFinder Rheinland-Pfalz (Germany), and the global project iNaturalist, of which data from California (USA) were used. In an intrinsic approach, geographic context is provided by neighboring observations from the same dataset, which are transformed into species-specific observed communities describing a species’ typical context of other species usually observed close by. An extrinsic approach uses OpenStreetMap (OSM), a well-established global VGI project providing detailed geographic information on physical objects, to describe a species’ geographic context in the form of an OSM environment, consisting of the OSM features typically found in close proximity to a species’ observations. The plausibility of a new observation added to the dataset is estimated by comparing its context of neighboring observations or of OSM features to the species’ observed community or OSM environment. This comparison is made using similarity indices. The intrinsic observed communities approach and the extrinsic OSM environments approach were both evaluated by estimating the plausibility of observations known to be plausible or implausible. This was done with real approved or rejected observations from the respective projects, but also with synthetic plausible and implausible observations created for this purpose. The evaluation showed that both approaches are able to distinguish between plausible and implausible observations based on VGI context, using similarity index values. The approaches estimate the plausibility of the location of an observation in light of surrounding observations or OSM context, and in light of the species identification given for the observation by the volunteer.
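
The intrinsic comparison described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not name the similarity indices used, so the Jaccard index is an assumption here, as are the function names, the planar coordinates, the search radius, and the toy data.

```python
from math import hypot


def context_species(candidate_xy, observations, radius):
    """Collect the species observed within `radius` of a candidate location.

    Assumes planar coordinates (e.g. metres in a projected CRS); this is a
    simplification for illustration only.
    """
    cx, cy = candidate_xy
    return {
        species
        for species, (x, y) in observations
        if hypot(x - cx, y - cy) <= radius
    }


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b|, one possible similarity index."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical observed community of the species named in the new observation.
observed_community = {"Quercus robur", "Fagus sylvatica", "Parus major"}

# Hypothetical neighbouring observations: (species, (x, y)).
neighbours = [
    ("Quercus robur", (1.0, 1.0)),
    ("Parus major", (2.0, 0.5)),
    ("Larus argentatus", (50.0, 50.0)),  # outside the search radius
    ("Sciurus vulgaris", (0.5, 2.0)),
]

context = context_species((0.0, 0.0), neighbours, radius=5.0)
score = jaccard(context, observed_community)
```

In this toy case two of the three nearby species also belong to the observed community, and the union of both sets has four species, so the similarity is 0.5. How such a value is turned into a plausibility judgement (for example via a threshold) is left open here.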
Careful examination of the evaluation results revealed differences in the behavior of both approaches depending on the similarity index used. Results also partly differed between the data use cases. Variable spatial density of observations and OSM data influences similarity index values. Observed communities were found to reflect biological and ecological properties of species, while OSM environments rarely do so. Both methods were also tested with a number of parameter changes, and results were found to be largely stable across different parameter settings. Some modifications to the basic methodology of the approaches, such as applying auxiliary land cover data to focus on relevant geographic context or using observation frequency in similarity calculation, showed potential for improving results. Future work must seek to overcome the most important drawbacks and weaknesses of the approaches to plausibility estimation developed in this work. They can be used only for species with an adequate base of previous observations, and for candidate observations in locations providing an adequate geographic context of observations or OSM data. The influence of variable spatial density of context information on plausibility estimation is a problem especially in the extrinsic OSM environments approach. Both methods should be combined with other approaches using other information about an observation, such as the observation date or the observer's experience.
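
One way to picture "using observation frequency in similarity calculation" is a frequency-weighted set similarity. The sketch below uses the weighted Jaccard (Ruzicka) index as a stand-in; the abstract does not say which frequency-aware index the thesis applied, and the function name and counts are hypothetical.

```python
from collections import Counter


def weighted_jaccard(a, b):
    """Weighted Jaccard (Ruzicka) similarity of two frequency tables.

    Sum of element-wise minima divided by sum of element-wise maxima.
    This index is an assumption chosen for illustration.
    """
    keys = set(a) | set(b)
    numerator = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    denominator = sum(max(a.get(k, 0), b.get(k, 0)) for k in keys)
    return numerator / denominator if denominator else 0.0


# Hypothetical counts of how often each species was seen near the target species.
community_counts = Counter({"Quercus robur": 6, "Parus major": 3, "Fagus sylvatica": 1})

# Hypothetical counts of species seen near the candidate observation.
context_counts = Counter({"Quercus robur": 2, "Parus major": 1, "Sciurus vulgaris": 1})

score = weighted_jaccard(context_counts, community_counts)
```

Here the minima sum to 3 and the maxima to 11, giving a similarity of 3/11. Raw counts are used for brevity; normalizing both tables to relative frequencies first would be an equally reasonable design choice.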

Document type: Dissertation
Supervisor: Zipf, Prof. Dr. Alexander
Date of thesis defense: 21 May 2019
Date Deposited: 03 Jun 2019 09:06
Date: 2019
Faculties / Institutes: Fakultät für Chemie und Geowissenschaften > Institute of Geography
DDC-classification: 004 Data processing, Computer science; 550 Earth sciences; 570 Life sciences