The Potential of Twitter Data for Surveillance of Epidemics in Brazil

A while ago in Brazil I had the pleasure of meeting the folks from the Federal University of Minas Gerais who are using Twitter data to monitor disease epidemics, specifically dengue. Their work is awesome and worth taking note of.
The website of the project is here

And here’s a paper that provides a more detailed account of the experience:

Dengue surveillance based on a computational model of spatio-temporal locality of Twitter

Gomide et al. (2011)

Twitter is a unique social media channel, in the sense that users discuss and talk about the most diverse topics, including their health conditions. In this paper we analyze how Dengue epidemic is reflected on Twitter and to what extent that information can be used for the sake of surveillance. Dengue is a mosquito-borne infectious disease that is a leading cause of illness and death in tropical and subtropical regions, including Brazil. We propose an active surveillance methodology that is based on four dimensions: volume, location, time and public perception. First we explore the public perception dimension by performing sentiment analysis. This analysis enables us to filter out content that is not relevant for the sake of Dengue surveillance. Then, we verify the high correlation between the number of cases reported by official statistics and the number of tweets posted during the same time period (i.e., R² = 0.9578). A clustering approach was used in order to exploit the spatio-temporal dimension, and the quality of the clusters obtained becomes evident when they are compared to official data (i.e., RandIndex = 0.8914). As an application, we propose a Dengue surveillance system that shows the evolution of the dengue situation reported in tweets.
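
For readers wondering what the two figures in the abstract measure, here is a rough Python sketch. It is not the authors' code, and the numbers in it are made up purely for illustration. It assumes R² is the squared Pearson correlation between tweet counts and reported cases (which is what R² reduces to for a simple linear fit), and that the Rand index is the share of item pairs on which two groupings agree. The function names `r_squared` and `rand_index` are my own.

```python
from itertools import combinations


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)


def rand_index(labels_a, labels_b):
    """Fraction of item pairs that both groupings treat the same way
    (together in both, or apart in both)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Invented weekly counts, NOT data from the paper.
weekly_tweets = [120, 340, 560, 800, 610, 300]
weekly_cases = [15, 40, 70, 95, 80, 35]
print("R^2:", r_squared(weekly_tweets, weekly_cases))

# Invented cluster labels for six cities: one grouping from tweets,
# one from official statistics.
tweet_clusters = [0, 0, 1, 1, 2, 2]
official_clusters = [0, 0, 1, 2, 2, 2]
print("Rand index:", rand_index(tweet_clusters, official_clusters))
```

Both measures run from 0 to 1, so values like the ones reported in the abstract mean tweet volume tracks official case counts closely, and the tweet-based clusters mostly match the official ones.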

Download paper (pdf) at

