Abstract
Compared to events that trigger massive information streams, such as earthquakes and
human disease epidemics, the data input for agricultural and environmental biosecurity
events (i.e., the introduction of unwanted exotic pests and pathogens) is expected to be
sparse and less frequent. To investigate whether Twitter data can be useful for the detection and
monitoring of biosecurity events, we adopted a three-step process. First, we confirmed that
sightings of two migratory species, the Bogong moth (Agrotis infusa) and the Common Koel
(Eudynamys scolopaceus), are reported on Twitter. Second, we developed search queries
to extract the relevant tweets for these species. The queries were based on the taxonomic
name, the common name, or keywords that are frequently used to describe the species
(symptomatic or syndromic queries). Third, we validated the results using ground truth data. Our
results indicate that the common name queries provided a reasonable number of tweets
that were related to the ground truth data. The taxonomic queries produced datasets that were too small,
while the symptomatic queries produced large datasets with highly variable signal-to-noise
ratios. No clear relationship was observed between the tweets from the
symptomatic queries and the ground truth data. Comparing the results for the two species
showed that the level of familiarity with the species plays a major role. The more familiar the
species, the more stable and reliable the Twitter data. This clearly presents a problem for
using social media to detect the arrival of an exotic organism of biosecurity concern with
which the public is unfamiliar.
Original language | English |
---|---|
Article number | e0172457 |
Pages (from-to) | 1-17 |
Number of pages | 17 |
Journal | PLoS One |
Volume | 12 |
Issue number | 2 |
DOIs | |
Publication status | Published - 23 Feb 2017 |
Externally published | Yes |