The Hidden Risks of AI: How Linguistic Diversity Can Make or Break Collective Intelligence

Diversity is a key ingredient in the recipe for collective intelligence because it brings together a range of perspectives, tools, and abilities; allowing for a more comprehensive approach to problem-solving and decision-making. Gender diversity on corporate boards improves firms’ performance, ethnic diversity produces more impactful scientific research, diverse groups are better at solving crimes, popular juries are less biased than professional judges, and politically diverse editorial teams produce higher-quality Wikipedia articles.

Large language models, like those powering AI systems, rely heavily on datasets or corpora, with a significant part of it based on English content. This dominance is consequential. Just as diverse groups of people yield richer outcomes, an AI trained on diverse linguistic data offers a broader perspective. Each language encapsulates unique thoughts, metaphors, and wisdom. Without diverse linguistic representation, we risk fostering AI systems with limited collective intelligence. The quality, diversity, and quantity of the data they are trained on directly influence their epistemic outputs. Unsurprisingly, large language models struggle to capture long-tail knowledge.

This comes with two major — at least hypothetically — risks: 1) systems that do not fully leverage the knowledge dispersed in the population, 2) the benefits of AI may be more accessible to some groups over others; for instance, speakers of less-dominant languages might not equally benefit from AI’s advancements. It’s not merely about translation; it’s the nuances and knowledge embedded in languages that might be overlooked.

There are also two additional dimensions that could reinforce biases in AI systems: 1) as future models are trained on content that might have been generated by AI, there may be a reinforcing effect where biases present in the initial training data are amplified over time; and 2) techniques such as guided transfer learning may also increase biases if the source model used in transfer learning is trained on biased data.

This introduces a nuanced dimension to the digital divide. Historically, the digital divide was characterized by access to technology, internet connectivity, digital skills, and the socio-economic variables shaping these factors. Yet, with AI, our understanding of what constitutes digital divide should expand. It’s a subtler yet crucial divide that policymakers and development practitioners might not yet fully recognize.

Voices in the Code: Citizen Participation for Better Algorithms

Voices in the Code, by David G. Robinson, is finally out. I had the opportunity to read the book prior to its publication, and I could not recommend it enough. David shows how, between 2004 and 2014 in the US, experts and citizens came together to build a new kidney transplant matching algorithm. David’s work is a breath of fresh air for the debate surrounding the impact of algorithms on individuals and societies – a debate typically focused on the negative and sometimes disastrous effects of algorithms. While David conveys these risks at the outset of the book, focusing solely on these threats would add little to a public discourse already saturated with concerns.

One of the major missing pieces in the “algorithmic literature” is precisely how citizens, experts and decision-makers can make their interactions more successful, working towards algorithmic solutions that better serve societal goals. The book offers a detailed and compelling case where a long and participatory process leads to the crafting of an algorithm that delivers a public good. This, despite the technical complexities, moral dilemmas, and difficult trade-offs involved in decisions related to the allocation of kidneys to transplant patients. Such a feat would not be achieved without another contribution of the book, which is to offer a didactical demystification of what algorithms are, normally treated as a reserved domain of few experts.

As David conducts his analysis, one also finds an interesting reversal of the assumed relationship between technology and participatory democracy. This relationship has mostly been examined from a civic tech angle, focusing on how technologies can support democratic participation through practices such as e-petitions, online citizens’ assemblies, and digital participatory budgeting. Thus, another original contribution of this book is to look at this relationship from the opposite angle: how can participatory processes better support technological deployments. While technology for participation (civic tech) remains an important topic, we should probably start paying more attention to how participation can support technological solutions (civic for tech).

Continuing on through the book, other interesting insights emerge. For instance, technology and participatory democracy pundits normally subscribe to the virtues of decentralized systems, both from a technological and institutional perspective. Yet David depicts precisely the virtues of a decision-making system centralized at the national level. Should organ transplant issues be decided at the local level in the US, the results would probably not be as successful. Against intuition, David presents a clear case where centralized (although participatory) systems might offer better collective outcomes. Surfacing this counterintuitive finding is a welcome contribution to debates on the trade-offs between centralization and decentralization, both from a technological and institutional standpoint.

But a few paragraphs here cannot do the book justice. Voices in the Code is certainly a must-read for anybody working on issues ranging from institutional design and participatory democracy, all the way to algorithmic accountability and decision support systems.

***

P.s. As an intro to the book, here’s a nice 10 min. conversation with David on the Marketplace podcast.

New Papers Published: FixMyStreet and the World’s Largest Participatory Budgeting

2016_7_5_anderson-lopes_consulta-popular_virtual

Voting in Rio Grande do Sul’s Participatory Budgeting (picture by Anderson Lopes)

Here are two new published papers that my colleagues Jon Mellon, Fredrik Sjoberg and myself have been working on.

The first, The Effect of Bureaucratic Responsiveness on Citizen Participation, published in Public Administration Review, is – to our knowledge – the first study to quantitatively assess at the individual level the often-assumed effect of government responsiveness on citizen engagement. It also describes an example of how the data provided through digital platforms may be leveraged to better understand participatory behavior. This is the fruit of a research collaboration with MySociety, to whom we are extremely thankful.

Below is the abstract:

What effect does bureaucratic responsiveness have on citizen participation? Since the 1940s, attitudinal measures of perceived efficacy have been used to explain participation. The authors develop a “calculus of participation” that incorporates objective efficacy—the extent to which an individual’s participation actually has an impact—and test the model against behavioral data from the online application Fix My Street (n = 399,364). A successful first experience using Fix My Street is associated with a 57 percent increase in the probability of an individual submitting a second report, and the experience of bureaucratic responsiveness to the first report submitted has predictive power over all future report submissions. The findings highlight the importance of responsiveness for fostering an active citizenry while demonstrating the value of incidentally collected data to examine participatory behavior at the individual level.

An earlier, ungated version of the paper can be found here.

The second paper, Does Online Voting Change the Outcome? Evidence from a Multi-mode Public Policy Referendum, has just been published in Electoral Studies. In an earlier JITP paper (ungated here) looking at Rio Grande do Sul State’s Participatory Budgeting – the world’s largest – we show that, when compared to offline voting, online voting tends to attract participants who are younger, male, of higher income and educational attainment, and more frequent social media users. Yet, one question remained: does the inclusion of new participants in the process with a different profile change the outcomes of the process (i.e. which projects are selected)? Below is the abstract of the paper.

Do online and offline voters differ in terms of policy preferences? The growth of Internet voting in recent years has opened up new channels of participation. Whether or not political outcomes change as a consequence of new modes of voting is an open question. Here we analyze all the votes cast both offline (n = 5.7 million) and online (n = 1.3 million) and compare the actual vote choices in a public policy referendum, the world’s largest participatory budgeting process, in Rio Grande do Sul in June 2014. In addition to examining aggregate outcomes, we also conducted two surveys to better understand the demographic profiles of who chooses to vote online and offline. We find that policy preferences of online and offline voters are no different, even though our data suggest important demographic differences between offline and online voters.

The extent to which these findings are transferable to other PB processes that combine online and offline voting remains an empirical question. In the meantime, nonetheless, these findings suggest a more nuanced view of the potential effects of digital channels as a supplementary means of engagement in participatory processes. I hope to share an ungated version of the paper in the coming days.

Bit by Bit: Social Research in the Digital Age

Pic by Jim Kaskade (flickr creative commons)

Matthew Salganik, Professor of Sociology at Princeton University, has recently put his forthcoming book on social research and big data online for an open review. Matthew is the author of many of my favorite academic works, including this experiment in which he and Duncan Watts test social influence by artificially inverting the popularity of songs in an online music market. He is also the brains behind All Our Ideas, an amazing tool that I have used in much of the work that I have been doing, including “The Governor Asks” in Brazil.

As in the words of Matthew, this is a book “for social scientists that want to do more data science, and it is for data scientists that want to do more social science.” Even though I have not read the entire book, one of the things that has already impressed me is the simplicity with which Matthew explains complex topics, such as human computation, distributed data collection and digital experiments. For each topic, he highlights opportunities and provides experienced advice for those working with big data and social sciences. His stance on social research in the digital age is brilliant and refreshing, and is a wake-up call for lots of people working in that domain. Below is an excerpt from his preface:

From data scientists, I’ve seen two common misunderstandings. The first is thinking that more data automatically solves problems. But, for social research that has not been my experience. In fact, for social research new types of data, as opposed to more of the same data, seems to be most helpful. The second misunderstanding that I’ve seen from data scientists is thinking that social science is just a bunch of fancy-talk wrapped around common sense. Of course, as a social scientist—more specifically as a sociologist—I don’t agree with that; I think that social science has a lot of to offer. Smart people have been working hard to understand human behavior for a long time, and it seems unwise to ignore the wisdom that has accumulated from this effort. My hope is that this book will offer you some of that wisdom in a way that is easy to understand.

From social scientists, I’ve also seen two common misunderstandings. First, I’ve seen some people write-off the entire idea of social research using the tools of the digital age based on a few bad papers. If you are reading this book, you have probably already read a bunch of papers that uses social media data in ways that are banal or wrong (or both). I have too. However, it would be a serious mistake to conclude from these examples that all digital age social research is bad. In fact, you’ve probably also read a bunch of papers that use survey data in ways that are banal or wrong, but you don’t write-off all research using surveys. That’s because you know that there is great research done with survey data, and in this book, I’m going to show you that there is also great research done with the tools of the digital age.

The second common misunderstanding that I’ve seen from social scientists is to confuse the present with the future. When assessing social research in the digital age—the research that I’m going to describe in this book—it is important to ask two distinction questions:

How well does this style of research work now?

How well will this style of research work in the future as the data landscape changes and as researchers devote more attention to these problems?

I have only gone through parts of the book (and yes, I did go beyond the preface). But from what I can see, it is a must read for those who are interested in digital technologies and the new frontiers of social research. And while reading it, why not respond to Matthew’s generous act by providing some comments? You can access the book here.

New IDS Journal – 9 Papers in Open Government

2016-01-14 16.51.09_resized

The new IDS Bulletin is out. Edited by Rosemary McGee and Duncan Edwards, this is the first open access version of the well-known journal by the Institute of Development Studies. It brings eight new studies looking at a variety of open government issues, ranging from uptake in digital platforms to government responsiveness in civic tech initiatives. Below is a brief presentation of this issue:

Open government and open data are new areas of research, advocacy and activism that have entered the governance field alongside the more established areas of transparency and accountability. In this IDS Bulletin, articles review recent scholarship to pinpoint contributions to more open, transparent, accountable and responsive governance via improved practice, projects and programmes in the context of the ideas, relationships, processes, behaviours, policy frameworks and aid funding practices of the last five years. They also discuss questions and weaknesses that limit the effectiveness and impact of this work, offer a series of definitions to help overcome conceptual ambiguities, and identify hype and euphemism. The contributions – by researchers and practitioners – approach contemporary challenges of achieving transparency, accountability and openness from a wide range of subject positions and professional and disciplinary angles. Together these articles give a sense of what has changed in this fast-moving field, and what has not – this IDS Bulletin is an invitation to all stakeholders to take stock and reflect.

The ambiguity around the ‘open’ in governance today might be helpful in that its very breadth brings in actors who would otherwise be unlikely adherents. But if the fuzzier idea of ‘open government’ or the allure of ‘open data’ displace the task of clear transparency, hard accountability and fairer distribution of power as what this is all about, then what started as an inspired movement of governance visionaries may end up merely putting a more open face on an unjust and unaccountable status quo.

Among others, the journal presents an abridged version of a paper by Jonathan Fox and myself on digital technologies and government responsiveness (for full version download here).

Below is a list of all the papers:

Introduction: Opening Governance – Change, Continuity and Conceptual Ambiguity

Rosie McGee, Duncan Edwards

When Does ICT-Enabled Citizen Voice Lead to Government Responsiveness?

Tiago Peixoto, Jonathan Fox

ICTs Help Citizens Voice Concerns over Water – Or Do They?

Katharina Welle, Jennifer Williams, Joseph Pearce

When Does the State Listen?

Miguel Loureiro, Aalia Cassim, Terence Darko, Lucas Katera, Nyambura Salome

‘You Have to Raise a Fist!’: Seeing and Speaking to the State in South Africa

Elizabeth Mills

The Right of Access to Information: Exploring Gender Inequities

Laura Neuman

Men and Women of Words: How Words Divide and Connect the Bunge La Mwananchi Movement in Kenya

David Calleb Otieno, Nathaniel Kabala, Patta Scott-Villiers, Gacheke Gachihi, Diana Muthoni Ndung’u

Test It and They Might Come: Improving the Uptake of Digital Tools in Transparency and Accountability Initiatives

Christopher Wilson, Indra de Lanerolle

The Dark Side of Digital Politics: Understanding the Algorithmic Manufacturing of Consent and the Hindering of Online Dissidence

Emiliano Treré

World Development Report 2016: Digital Dividends

The World Development Report 2016, the main annual publication of the World Bank, is out. This year’s theme is Digital Dividends, examining the role of digital technologies in the promotion of development outcomes. The findings of the WDR are simultaneously encouraging and sobering. Those skeptical of the role of digital technologies in development might be surprised by some of the results presented in the report. Technology advocates from across the spectrum (civic tech, open data, ICT4D) will inevitably come across some facts that should temper their enthusiasm.

While some may disagree with the findings, this Report is an impressive piece of work, spread across six chapters covering different aspects of digital technologies in development: 1) accelerating growth, 2) expanding opportunities, 3) delivering services, 4) sectoral policies, 5) national priorities, 6) global cooperation. My opinion may be biased, as somebody who made some modest contributions to the Report, but I believe that, to date, this is the most thorough effort to examine the effects of digital technologies on development outcomes. The full report can be downloaded here.

The report draws, among other things, from 14 background papers that were prepared by international experts and World Bank staff. These background papers serve as additional reading for those who would like to examine certain issues more closely, such as social media, net neutrality, and the cybersecurity agenda.

For those interested in citizen participation and civic tech, one of the papers written by Prof. Jonathan Fox and myself – When Does ICT-Enabled Citizen Voice Lead to Government Responsiveness? – might be of particular interest. Below is the abstract:

This paper reviews evidence on the use of 23 information and communication technology (ICT) platforms to project citizen voice to improve public service delivery. This meta-analysis focuses on empirical studies of initiatives in the global South, highlighting both citizen uptake (‘yelp’) and the degree to which public service providers respond to expressions of citizen voice (‘teeth’). The conceptual framework further distinguishes between two trajectories for ICT-enabled citizen voice: Upwards accountability occurs when users provide feedback directly to decision-makers in real time, allowing policy-makers and program managers to identify and address service delivery problems – but at their discretion. Downwards accountability, in contrast, occurs either through real time user feedback or less immediate forms of collective civic action that publicly call on service providers to become more accountable and depends less exclusively on decision-makers’ discretion about whether or not to act on the information provided. This distinction between the ways in which ICT platforms mediate the relationship between citizens and service providers allows for a precise analytical focus on how different dimensions of such platforms contribute to public sector responsiveness. These cases suggest that while ICT platforms have been relevant in increasing policymakers’ and senior managers’ capacity to respond, most of them have yet to influence their willingness to do so.

You can download the paper here.

Any feedback on our paper or models proposed (see below, for instance) would be extremely welcome.

unpacking user feedback and civic action: difference and overlap

I also list below the links to all the background papers and their titles

When Does ICT-Enabled Citizen Voice Lead to Government Responsiveness? by Tiago Peixoto and Jonathan Fox
Development Impact of Social Media by Robert Ackland and Kyosuke Tanaka
The New Cybersecurity Agenda: Economic and Social Challenges to a Secure Internet by Johannes M. Bauer and William H. Dutton
Multistakeholder Internet Governance? by William H. Dutton
The Economics and Policy Implications of Infrastructure Sharing and Mutualisation in Africa by Jose Marino Garcia and Tim Kelly
Guatemala, an Early Spectrum Management Reformer by Jose Marino Garcia
Network Neutrality and Private Sector Investment by Jose Marino Garcia
Best Practices and Lessons Learned in ICT Sector Innovation: A Case Study of Israel by Dr. Daphne Getz and Dr. Itzhak Goldberg (Researchers: Eliezer Shein, Bahina Eidelman, Ella Barzani)
How Tech Hubs are helping to Drive Economic Growth in Africa by Tim Kelly and Rachel Firestone
One Network Africa in East Africa by Tim Kelly and Christopher Kemei
One Step Forward, Two Steps Backward? Does EGovernment make Governments in Developing Countries more Transparent and Accountable? by Victoria L. Lemieux
Exploring the Relationship between Broadband and Economic Growth by Michael Minges
Enabling Digital Entrepreneurs by Desirée van Welsum
Sharing is caring? Not quite. Some observations about ‘the shaing economy’ by Desirée van Welsum

Enjoy the reading.

Civic Tech and Government Responsiveness

For those interested in tech-based citizen reporting tools (such as FixMyStreet, SeeClickFix), here’s a recent interview of mine with Jeffrey Peel (Citizen 2015) in which I discuss some of our recent research in the area.

Praising and Shaming in Civic Tech (or Reversed Nudging for Government Responsiveness)

The other day during a talk with researcher Tanya Lokot I heard an interesting story from Russia. Disgusted with the state of their streets, activists started painting caricatures of government officials over potholes.

In the case of a central street in Saratov, the immediate response to one of these graffiti was this:

Later on, following increased media attention – and some unexpected turnarounds – the pothole got fixed.

That reminded me of a recurrent theme in some conversations I have, which refers to whether praising and shaming matters to civic tech and, if so, to which extent. To stay with two classic examples, think of solutions such as FixMyStreet and SeeClickFix, through which citizens publically report problems to the authorities.

Considering government takes action, what prompts them to do so? At a very basic level, three hypothesis are possible:

1) Governments take action based on their access to distributed information about problems (which they supposedly are not aware of)

2) Governments take action due to the “naming and shaming” effect, avoiding to be publically perceived as unresponsive (and seeking praise for its actions)

3) Governments take action for both of the reasons above

Some could argue that hypothesis 3 is the most likely to be true, with some governments leaning more towards one reason to respond than others. Yet, the problem is that we know very little about these hypotheses, if anything. In other words – to my knowledge – we do not know whether making reports through these platforms public makes any difference whatsoever when it comes to governments’ responsiveness. Some might consider this as a useless academic exercise: as long as these tools work, who cares? But I would argue that the answer that questions matters a lot when it comes to the design of similar civic tech initiatives that aim to prompt government to action.

Let’s suppose that we find that all else equal governments are significantly more responsive to citizen reports when these are publically displayed. This would have importance both in terms of process and technological design. In terms of process, for instance, civic tech initiatives would probably be more successful if devoting part of their resources to amplify the visibility of government action and inaction (e.g. through local media). Conversely, from a technological standpoint, designers should devote substantive more effort on interfaces that maximizes praising and shaming of governments based on their performance (e.g. rankings, highlighting pending reports). Conversely, we might find that publicizing reports have very little effect in terms of responsiveness. In that case, more work would be needed to figure out which other factors – beyond will and capacity – play a role in government responsiveness (e.g. quality of reports).

Most likely, praising and shaming would depend on a number of factors such as political competition, bureaucratic autonomy, and internal performance routines. But a finer understanding of that would not only bear an impact on the civic tech field, but across the whole accountability landscape. To date, we know very little about it. Yet, one of the untapped potential of civic technology is precisely that of conducting experiments at lowered costs. For instance, conducting randomized controlled trials on the effects on the publicization of government responsiveness should not be so complicated (e.g effects of rankings, amplifying visibility of unfixed problems). Add to that analysis of existing systems’ data from civic tech platforms, and some good qualitative work, and we might get a lot closer at figuring out what makes politicians and civil servants’ “tick”.

Until now, behavioral economics in public policy has been mainly about nudging citizens toward preferred choices. Yet it may be time to start also working in the opposite direction, nudging governments to be more responsive to citizens. Understanding whether praising and shaming works (and if so, how and to what extent) would be an important step in that direction.

***

Also re-posted on Civicist.

Open Data and Citizen Engagement – Disentangling the Relationship

[This is a cross-post from Sunlight Foundation’s series OpenGov Conversations, an ongoing discourse featuring contributions from transparency and accountability researchers and practitioners around the world.]

As asserted by Jeremy Bentham nearly two centuries ago, “[I]n the same proportion as it is desirable for the governed to know the conduct of their governors, is it also important for the governors to know the real wishes of the governed.” Although Bentham’s historical call may come across as obvious to some, it highlights one of the major shortcomings of the current open government movement: while a strong focus is given to mechanisms to let the governed know the conduct of their governors (i.e. transparency), less attention is given to the means by which the governed can express their wishes (i.e. citizen engagement).

But striking a balance between transparency and participation is particularly important if transparency is conceived as a means for accountability. To clarify, let us consider the role transparency (and data) plays in a simplified accountability cycle. As any accountability mechanism built on disclosure principles, it should require a minimal chain of events that can be summarized in the following manner: (1) Data is published; (2) The data published reaches its intended public; (3) Members of the public are able to process the data and react to it; and (4) Public officials respond to the public’s reaction or are sanctioned by the public through institutional means. This simplified path toward accountability highlights the limits of the disclosure of information. Even in the most simplified model of accountability, while essential, the disclosure of data accounts for no more than one-fourth of the accountability process. [Note 1 – see below]

But what are the conditions required to close the accountability cycle? First, once the data is disclosed (1), in order for it to reach its intended public (2), a minimal condition is the presence of info-mediators that can process open data in a minimally enabling environment (e.g. free and pluralistic media). Considering these factors are present, we are still only half way towards accountability. Nevertheless, the remaining steps (3 and 4) cannot be achieved in the absence of citizen engagement, notably electoral and participatory processes.

Beyond Elections

With regard to elections as a means for accountability, citizens may periodically choose to reward or sanction elected officials based on the information that they have received and processed. While this may seem a minor requisite for developed democracies like the US, the problem gains importance for a number of countries where open data platforms have launched but where elections are still a work in progress (in such cases, some research suggests that transparency may even backfire).

But, even if elections are in place, alone they might not suffice. The Brazilian case is illustrative and highlights the limits of representative systems as a means to create sustained interface between governments and citizens. Despite two decades of electoral democracy and unprecedented economic prosperity in the country, citizens suddenly went to the streets to demand an end to corruption, improvement in public services and… increased participation. Politicians, themselves, came to the quick realization that elections are not enough, as recently underlined by former Brazilian President Lula in an op ed at the New York Times “(….) people do not simply wish to vote every four years. They want daily interaction with governments both local and national, and to take part in defining public policies, offering opinions on the decisions that affect them each day.” If transparency and electoral democracy are not enough, citizen engagement remains as the missing link for open and inclusive governments.

Open Data And Citizen Engagement

Within an ecosystem that combines transparency and participation, examining the relationship between the two becomes essential. More specifically, a clearer understanding of the interaction between open data and participatory institutions remains a frontier to be explored. In the following paragraphs I put forward two issues, of many, that I believe should be considered when examining this interaction.

I) Behavior and causal chains

Evan Lieberman and his colleagues conducted an experiment in Kenya that provided parents with information about their children’s schools and how to improve their children’s learning. Nevertheless, to the disillusionment of many, despite efforts to provide parents with access to information, the intervention had no impact on parents’ behavior. Following this rather disappointing finding, the authors proceeded to articulating a causal chain that explores the link between access to information and behavioral change.

The Information-Citizen Action Causal Chain (Lieberman et al. 2013)

While the model put forward by the authors is not perfect, it is a great starting point and it does call attention to the dire need for a clear understanding of the ensemble of mechanisms and factors acting between access to data and citizen action.

II) Embeddedness in participatory arrangements

Another issue that might be worth examination relates to the extent to which open data is purposefully connected to participatory institutions or not. In this respect, much like the notion of targeted transparency, a possible hypothesis would be that open data is fully effective for accountability purposes only when the information produced becomes “embedded” in participatory processes.

This notion of “embeddedness” would call for hard thinking on how different participatory processes can most benefit from open data and its applications (e.g. visualizations, analysis). For example, the use of open data to inform a referendum process is potentially a very different type of use than within participatory budgeting process. Stemming from this reasoning, open data efforts should be increasingly customized to different existing participatory processes, hence increasing their embeddedness in these processes. This would be the case, for instance, when budget data visualization solutions are tailored to inform participatory budgeting meetings, thus creating a clear link between the consumption of that data and the decision-making process that follows.

Granted, information is per se an essential component of good participatory processes, and one can take a more or less intuitive view on which types of information are more suitable for one process or another. However, a more refined knowledge of how to maximize the impact of data in participatory processes is far from achieved and much more work is needed.

R&D For Data-Driven Participation

Coming up with clear hypotheses and testing them is essential if we are to move forward with the ecosystem that brings together open data, participation and accountability. Surely, many organizations working in the open government space are operating with limited resources, squeezing their budgets to keep their operational work going. In this sense, conducting experiments to test hypotheses may appear as a luxury that very few can afford.

Nevertheless, one of the opportunities provided by the use of technologies for civic behavior is that of potentially driving down the costs for experimentation. For instance, online and mobile experiments could play the role of tech-enabled (and affordable) randomized controlled trials, improving our understanding of how open data can be best used to spur collective action. Thinking of the ways in which technology can be used to conduct lowered costs experiments to shed light on behavioral and causal chains is still limited to a small number of people and organizations, and much work is needed on that front.

Yet, it is also important to acknowledge that experiments are not the only source of relevant knowledge. To stick with a simple example, in some cases even an online survey trying to figure out who is accessing data, what data they use, and how they use it may provide us with valuable knowledge about the interaction between open data and citizen action. In any case, however, it may be important that the actors working in that space agree upon a minimal framework that facilitates comparison and incremental learning: the field of technology for accountability desperately needs a more coordinated research agenda.

Citizen Data Platforms?

As more and more players engage in participatory initiatives, there is a significant amount of citizen-generated data being collected, which is important on its own. However, in a similar vein to government data, the potential of citizen data may be further unlocked if openly available to third parties who can learn from it and build upon it. In this respect, it might not be long before we realize the need to have adequate structures and platforms to host this wealth of data that – hopefully – will be increasingly generated around the world. This would entail that not only governments open up their data related to citizen engagement initiatives, but also that other actors working in that field – such as donors and NGOs – do the same. Such structures would also be the means by which lessons generated by experiments and other approaches are widely shared, bringing cumulative knowledge to the field.

However, as we think of future scenarios, we should not lose sight of current challenges and knowledge gaps when it comes to the relationship between citizen engagement and open data. Better disentangling the relationship between the two is the most immediate priority, and a long overdue topic in the open government conversation.

Notes

Note 1: This section of this post is based on arguments previously developed in the article, “The Uncertain Relationship between Open Data and Accountability”.

Note 2: And some evidence seems to confirm that hypothesis. For instance, in a field experiment in Kenya, villagers only responded to information about local spending in development projects when that information was coupled with specific guidance on how to participate in local decision-making processes).

The Potential of Twitter Data for Surveillance of Epidemics in Brazil

A while ago in Brazil I had the pleasure of meeting the folks from Federal University of Minas Gerais who are using twitter data to monitor disease (more specifically dengue) epidemics. Their work is awesome, and worth taking note of.

The website of the project is herehttp://www.observatorio.inweb.org.br/dengue/destaques/

And here’s a paper that provides a more detailed account of the experience:

*Dengue surveillance based on a computational model of spatio-temporal locality of Twitter∗

Gomide et al. (2011)

Twitter is a unique social media channel, in the sense that users discuss and talk about the most diverse topics, in- cluding their health conditions. In this paper we analyze how Dengue epidemic is reflected on Twitter and to what extent that information can be used for the sake of surveillance. Dengue is a mosquito-borne infectious disease that is a leading cause of illness and death in tropical and sub- tropical regions, including Brazil. We propose an active surveillance methodology that is based on four dimensions: volume, location, time and public perception. First we ex- plore the public perception dimension by performing sentiment analysis. This analysis enables us to filter out con- tent that is not relevant for the sake of Dengue surveillance. Then, we verify the high correlation between the number of cases reported by official statistics and the number of tweets posted during the same time period (i.e., R2 = 0.9578). A clustering approach was used in order to exploit the spatio- temporal dimension, and the quality of the clusters obtained becomes evident when they are compared to official data (i.e., RandIndex = 0.8914). As an application, we propose a Dengue surveillance system that shows the evolution of the dengue situation reported in tweets.

Download paper (pdf) at http://journal.webscience.org/429/1/92_paper.pdf

DemocracySpot

Musings on technology, democracy, development, and their interplay. By Tiago C. Peixoto

Category Archives for data

The Hidden Risks of AI: How Linguistic Diversity Can Make or Break Collective Intelligence

Voices in the Code: Citizen Participation for Better Algorithms

New IDS Journal – 9 Papers in Open Government

Civic Tech and Government Responsiveness

Praising and Shaming in Civic Tech (or Reversed Nudging for Government Responsiveness)