How representative is it really? A correspondence on sortition

A few months ago, Paolo Spada and I published a blog post about sortition and the representativeness of citizens’ assemblies. We were pleasantly surprised by the response to our post and the ensuing discussions.

In this new exchange at the Deliberative Democracy Digest, Kyle Redman, Paolo Spada, and I try to delve deeper, exploring further the challenges of achieving representativeness in deliberative mini-publics. We extend our gratitude to Nicole Curato and Lucy J. Parry from the Centre for Deliberative Democracy and Global Governance for suggesting and facilitating this discussion.

Underestimated effects of AI on democracy, and a gloomy scenario

A few years ago, Tom Steinberg and I discussed the potential risks posed by AI bots in influencing citizen engagement processes and manipulating public consultations. With the rapid advancement of AI technology, these risks have only intensified. This escalating concern has even elicited an official response from the White House.

A recent executive order has tasked the Office of Information and Regulatory Affairs (OIRA) at the White House with considering the implementation of guidance or tools to address mass comments, computer-generated remarks, and falsely attributed comments. This directive comes in response to growing concerns about the impact of AI on the regulatory process, including the potential for generative chatbots to lead mass campaigns or flood the federal agency rule-making process with spam comments.

The threat of manipulation becomes even more pronounced when content generated by bots is treated by policymakers as on par with human-created content. There is evidence that this may already be occurring. For example, a recent experiment was designed to measure whether language models could divert legislative attention by generating a constant stream of unique emails to legislators. Both human writers and GPT-3 drafted messages, which were randomly sent to over 7,000 state legislators across the country, after which response rates were compared. The results showed a mere 2% difference in response rates between the two, and for some of the policy topics studied, the response rates were indistinguishable.

Now, the real trouble begins when governments jump on the bot bandwagon and start using their own bots to respond, and we, the humans, are left out of the conversation entirely. It’s like being the third wheel on a digital date that we didn’t even know was happening. That’s a gloomy scenario.

The hidden risks of AI: how linguistic diversity can make or break collective intelligence

Diversity is a key ingredient in the recipe for collective intelligence because it brings together a range of perspectives, tools, and abilities, allowing for a more comprehensive approach to problem-solving and decision-making. Gender diversity on corporate boards improves firms' performance, ethnic diversity produces more impactful scientific research, diverse groups are better at solving crimes, lay juries are less biased than professional judges, and politically diverse editorial teams produce higher-quality Wikipedia articles.

Large language models, like those powering AI systems, rely heavily on training datasets, or corpora, a significant share of which consists of English content. This dominance is consequential. Just as diverse groups of people yield richer outcomes, an AI trained on diverse linguistic data offers a broader perspective. Each language encapsulates unique thoughts, metaphors, and wisdom. Without diverse linguistic representation, we risk fostering AI systems with limited collective intelligence. The quality, diversity, and quantity of the data they are trained on directly influence their epistemic outputs. Unsurprisingly, large language models struggle to capture long-tail knowledge.

This comes with at least two hypothetical risks: 1) systems that do not fully leverage the knowledge dispersed across the population; and 2) the benefits of AI being more accessible to some groups than others. Speakers of less-dominant languages, for instance, might not benefit equally from AI's advancements: it is not merely a matter of translation, but of the nuances and knowledge embedded in languages that might be overlooked.

There are also two additional dimensions that could reinforce biases in AI systems: 1) as future models are trained on content that might have been generated by AI, there may be a reinforcing effect where biases present in the initial training data are amplified over time; and 2) techniques such as guided transfer learning may also increase biases if the source model used in transfer learning is trained on biased data.
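The first of these reinforcing effects can be illustrated with a toy simulation. This is a deliberately simplified sketch, not a claim about any real model: the `amplification` factor, the initial share, and the sampling setup are all assumptions made for illustration. The idea is that if each generation of a model slightly overrepresents the dominant viewpoint in its training data, and the next generation is trained on that output, the dominant viewpoint's share drifts upward over successive generations.

```python
import random

def next_generation_share(p, n_samples=10_000, amplification=1.1, seed=0):
    """Toy model of bias amplification across training generations.

    `p` is the share of the dominant viewpoint in the current corpus.
    We assume (purely for illustration) that a model trained on that
    corpus overproduces the majority viewpoint by `amplification`, and
    that the next corpus is sampled from the model's output. Returns
    the dominant viewpoint's share in the next generation's corpus.
    """
    rng = random.Random(seed)
    # The model skews toward whichever viewpoint already holds the majority.
    if p >= 0.5:
        q = min(1.0, p * amplification)
    else:
        q = max(0.0, 1.0 - (1.0 - p) * amplification)
    # Sample the next training corpus from the model's (skewed) output.
    hits = sum(rng.random() < q for _ in range(n_samples))
    return hits / n_samples

share = 0.55  # assumed initial share of the dominant viewpoint
for generation in range(5):
    share = next_generation_share(share, seed=generation)
    print(f"generation {generation + 1}: dominant share ~ {share:.2f}")
```

Under these assumptions, a 55% majority grows toward near-total dominance within a handful of generations; the same dynamic run in reverse is what makes minority languages and viewpoints progressively harder to recover from model-generated corpora.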

This introduces a nuanced dimension to the digital divide. Historically, the digital divide was characterized by access to technology, internet connectivity, digital skills, and the socio-economic variables shaping these factors. With AI, however, our understanding of what constitutes the digital divide should expand. It is a subtler yet crucial divide that policymakers and development practitioners might not yet fully recognize.