Data use
Targeting the known unknowns of the botanical world
Published 2/8/2021
This study by an Australian and U.S. team compares the coverage of, and overlap between, the most comprehensive sources of geographic, genetic and trait-based botanical data, giving an actionable view of what is and is not currently known about the world's plants.
Their analysis draws on nearly 215 million plant records from GBIF.org and compares their coverage with data aggregated in GenBank and TRY, placing each of the 350,699 species names in The Plant List into one of three groups: 1) broadly covered species (17.7 per cent) where at least some knowledge of their locations, genes and traits exists; 2) patchily covered species (55.6 per cent) missing from at least one data source; or 3) species with no information other than their names (26.7 per cent).
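The three-way grouping described above can be illustrated with a minimal sketch. It assumes each data source has been reduced to a set of accepted species names; the function names, group labels and toy species names are hypothetical placeholders, not the authors' code or data, and real use would first require reconciling names across sources.

```python
from collections import Counter


def classify(name, gbif, genbank, try_db):
    """Assign one species name to a coverage group.

    Each source is assumed to be a set of species names (an
    illustrative simplification, not the study's actual pipeline).
    """
    hits = sum(name in source for source in (gbif, genbank, try_db))
    if hits == 3:
        return "broad"      # locations, genes and traits all present
    if hits == 0:
        return "name only"  # nothing known beyond the name
    return "patchy"         # missing from at least one source


def coverage_shares(names, gbif, genbank, try_db):
    """Return the percentage of names in each group, rounded to 0.1."""
    counts = Counter(classify(n, gbif, genbank, try_db) for n in names)
    return {group: round(100 * count / len(names), 1)
            for group, count in counts.items()}


# Made-up placeholder names, for illustration only.
names = ["Species a", "Species b", "Species c", "Species d"]
gbif = {"Species a", "Species b"}
genbank = {"Species a"}
try_db = {"Species a", "Species c"}

shares = coverage_shares(names, gbif, genbank, try_db)
print(shares)
```

Counting how many sources contain a name keeps the rule easy to extend if a fourth data source were added, at the cost of not recording which particular source is missing.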
While the GBIF network provides the most complete species coverage (73.3 per cent), the results outline the shapes of some well-known but previously unquantified shortfalls in botanical knowledge. By providing clear targets for data collection and specimen digitization, the authors present reasons to be optimistic about their stated hope of "turn[ing] what appears to be an insurmountable task into a manageable checklist of gaps to be filled."