Social media may inform early detection of disease outbreaks
19 February, 2016 | Claire Turner
Conference season is fast approaching so we will be highlighting some of the interesting posters and slides that are submitted to us throughout the year.
With the recent global outbreaks of Ebola and Zika, there is a need to strengthen surveillance of emerging and re-emerging communicable diseases. Earlier detection of fast-spreading disease-related threats will give public health authorities more time to work on reducing the impact of an outbreak. F1000Research has two channels dedicated to the sharing of rapid health information: the ‘Ebola’ and ‘Zika & Arbovirus Outbreaks’ channels.
We recently received a slide set from Karin Verspoor and her team at the University of Melbourne, who have been investigating social media and its use in early disease outbreak detection. They chose to upload their slide presentation, “Towards early discovery of salient health threats: a social media emotion classification technique”, which they presented at the Pacific Symposium on Biocomputing, to F1000Research to share the results of their work so far. We spoke to Karin and Bahadorreza Ofoghi to find out more about the benefits of social media surveillance and the challenges they face in implementing this as routine practice.
Interview
F1000Research: what led you to explore social media as a tool for early detection of disease outbreaks?
We had been working with the Australian Defence Science and Technology Group, Bioterrorism Preparedness Task team on syndromic surveillance, and knew that there was substantial concern about the arrival of Ebola in Australia, and in particular a fear that it would slip through the checks at land and air ports. We started by reasoning that people may turn to Twitter to talk about Ebola, and that monitoring tweets, in particular geo-located tweets, might give a clue as to where there were potential “hot spots” of activity or discussion about Ebola that might reveal an underlying outbreak or potential outbreak.
F1000Research: how did you identify hotspots?
We first built a heat map of tweets mentioning “Ebola”. However, we quickly realised that simply counting tweets about Ebola in a given city or region wasn’t going to provide sufficient insight. There was a whole range of reasons why people mentioned Ebola, the most common being simply to share Ebola-related news, which wasn’t necessarily localized. Given the pathogenicity and high fatality rate of Ebola, we thought that the most basic human response to a local (suspected) Ebola event would be anxiety or fear. Based on this, we felt that the “chatter” around the event would be more emotionally charged if it were proximal to a tweeter, and that the content of the tweets themselves would have a different focus (and certainly a different focus than tweets preceding the event). This first study is really about validating that assumption.
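To illustrate the difference between counting tweets and looking at their emotional content, here is a minimal Python sketch. It is not the team's classifier: the `FEAR_WORDS` lexicon, the `fear_share_by_region` function, and the sample tweets and region labels are all made up for this example.

```python
from collections import Counter

# Hypothetical fear/anxiety lexicon (an assumption for illustration only).
FEAR_WORDS = {"scared", "afraid", "fear", "panic", "terrified", "worried"}

def fear_share_by_region(tweets):
    """tweets: iterable of (region, text) pairs.

    Returns {region: (tweet_count, share_of_tweets_with_a_fear_word)}.
    """
    totals = Counter()
    fearful = Counter()
    for region, text in tweets:
        totals[region] += 1
        # Crude tokenisation: split on whitespace, trim common punctuation.
        words = {w.strip(".,!?#\"'").lower() for w in text.split()}
        if words & FEAR_WORDS:
            fearful[region] += 1
    return {r: (n, fearful[r] / n) for r, n in totals.items()}

# Invented sample: both regions have the same raw count of Ebola tweets.
sample = [
    ("A", "Ebola news: WHO update on West Africa"),
    ("A", "Reading about Ebola vaccine trials"),
    ("B", "So scared, suspected Ebola case at the hospital here"),
    ("B", "Ebola?! I'm terrified"),
]
result = fear_share_by_region(sample)
```

In this toy sample a heat map of raw counts would show regions A and B as identical, while the share of emotionally charged tweets separates them.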
F1000Research: as we can see from your slides, you’re using a “sliding window” model to analyse tweets. Did you find any limitations with this?
Our initial data was limited to a narrow window around the relevant events, due to the restrictions on the use of the Twitter API. We also did a fairly coarse analysis of the emotional and lexical content, looking for differences between the 7 days before and the 7 days after the putative event. This means that so far we’ve only established that it should be clear that an event has happened after 7 days. Earlier and more sensitive detection is needed to test whether the “sliding window” idea could be used for actual real-time detection, i.e. noticing that a change in the nature of tweets on a given topic is significant enough to raise an alert that there is an event in progress. Further research will also be necessary to increase the specificity of the model, focusing on tweets that may indicate specific syndrome-related outbreaks from other types of relevant short texts, namely emergency department chief complaints.
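The before-and-after comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the model from the slides: `daily_rates` is assumed to be a list holding, for each day, the share of on-topic tweets classed as fearful, and the function names and the `threshold` value are hypothetical.

```python
from statistics import mean

def window_shift(daily_rates, window=7):
    """For each candidate day i, return the mean rate over the `window`
    days starting at i minus the mean over the `window` days before i."""
    shifts = {}
    for i in range(window, len(daily_rates) - window + 1):
        before = mean(daily_rates[i - window:i])
        after = mean(daily_rates[i:i + window])
        shifts[i] = after - before
    return shifts

def first_alert(daily_rates, window=7, threshold=0.2):
    """Return the first day index whose shift reaches the (assumed)
    threshold, or None if no day does."""
    for i, shift in sorted(window_shift(daily_rates, window).items()):
        if shift >= threshold:
            return i
    return None

# Invented series: a low fear rate for 7 days, then a jump for 7 days.
rates = [0.1] * 7 + [0.5] * 7
alert_day = first_alert(rates)
```

Note that each comparison needs a full window of data after the candidate day, so this sketch can only flag an event in retrospect. That mirrors the limitation raised in the answer; a shorter window or a different statistic would be needed for anything closer to real time.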
F1000Research: what’s next for you in taking this research further?
We plan to work more closely with colleagues in the Centre for Epidemiology and Biostatistics of the Melbourne School of Population and Global Health to explore the relationships between their understanding of sociological factors in the epidemiology of infection and our analyses of the language of Twitter. Our model, once adequately adjusted, could have wider applications than solely health-related incidents. This process could be utilized in monitoring and identifying any type of known and localized traumatic event that is associated with strong emotional responses, such as natural disasters.