Reflections on the representativeness of citizens’ assemblies and similar innovations

(Co-authored with Paolo Spada)

Introduction

For proponents of deliberative democracy, the last couple of years could not have been better. Propelled by the recent diffusion of citizens’ assemblies, deliberative democracy has definitely gained popularity beyond small circles of scholars and advocates. From CNN to the New York Times, the Hindustan Times (India), Folha de São Paulo (Brazil), and Expresso (Portugal), it is now almost difficult to keep up with all the interest in democratic models that promote the random selection of participants who engage in informed deliberation. A new “deliberative wave” is definitely here.

But with popularity comes scrutiny. Whether the deliberative wave will generate new energy or crash onto the beach is an open question. As is the case with any democratic innovation (institutions designed to improve or deepen our existing democratic systems), critically examining assumptions is what allows for management of expectations and, most importantly, gradual improvements.

Proponents of citizens’ assemblies put representativeness at the core of their definition. In fact, it is one of their main selling points. For example, a comprehensive report highlights that an advantage of citizens’ assemblies, compared to other mechanisms of participatory democracy, is their typical combination of random selection and stratification to form a public body that is “representative of the public.” This general argument resonates with the media and the wider public. A recent illustration is an article in The Guardian, which depicts citizens’ assemblies as “a group of people who are randomly selected and reflect the demographics of the population as a whole.”

It should be noted that claims of representativeness vary in their assertiveness. For instance, some may refer to citizens’ assemblies as “representative deliberative democracy,” while others may use more cautious language, referring to assemblies’ participants as being “broadly representative” of the population (e.g. by gender, age, education, attitudes). This variation in terms used to describe representativeness should prompt an attentive observer to ask basic questions such as: “Are existing practices of deliberative democracy representative?” “If they are ‘broadly’ representative, how representative are they?” “What criteria, if any, are used to assess whether a deliberative democracy practice is more or less representative of the population?” “Can their representativeness be improved, and if so, how?” These are basic questions that, surprisingly, have been given little attention in recent debates surrounding deliberative democracy. The purpose of this article is to bring attention to these basic questions and to provide initial answers and potential avenues for future research and practice.

Citizens’ assemblies and three challenges of random sampling

Before discussing the subject of representativeness, it is important to provide some conceptual clarity. From an academic perspective, citizens’ assemblies are a variant of what political scientists normally refer to as “mini-publics.” These are processes in which participants: 1) are randomly selected (often combined with some form of stratification), 2) participate in informed deliberation on a specific topic, and 3) reach a public judgment and provide recommendations on that topic. Thus, in this text, mini-publics serves as a general term for a variety of practices such as consensus conferences, citizens’ juries, planning cells, and citizens’ assemblies themselves.

In this discussion, we will focus on what we consider to be the three main challenges of random sampling. First, we will examine the issue of sample size and the limitations of stratification in addressing this challenge. Second, we will focus on sampling error, which is the error that occurs when observing a sample rather than the entire population. Third, we will examine the issue of non-response, and how the typically small sample size of citizens’ assemblies exacerbates this problem. We conclude by offering alternatives to approach the trade-offs associated with mini-publics’ representativeness dilemma.

  1. Minimal sample size, and why stratification does not help reduce sample size requirements in complex populations

Most mini-publics that we know of have a sample size of around 70 participants or fewer, with a few cases having more than 200 participants. However, even with a sample size of 200 people, representing a population accurately is quite difficult. This may be the reason why political scientist Robert Dahl, who first proposed the use of mini-publics over three decades ago, suggested a sample size of 1000 participants. This is also the reason why most surveys that attempt to represent a complex national population have a sample size of over 1000 people.

To understand why representing a population accurately is difficult, consider that a sample size of approximately 370 individuals is enough to estimate a parameter of a population of 20,000 with a 5% error margin and 95% confidence level (for example, estimating the proportion of the population that answers “yes” to a question). However, if the desired error margin is reduced to 2%, the sample size increases to over 2,000, and for a more realistic population of over 1 million, a sample size of over 16,000 is required to achieve a 1% error margin with 99% confidence. Although the size of the sample required to estimate simple parameters in surveys does not increase significantly with the size of the population, it still increases beyond the sample sizes currently used in most mini-publics. Sample size calculators are available online to demonstrate these examples without requiring any statistical knowledge. 
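The figures in this paragraph can be reproduced with a few lines of code. The sketch below is only an illustration: it assumes the standard formula for estimating a proportion with a finite population correction, and it assumes the most conservative case of a 50/50 split (p = 0.5), which the text does not state explicitly. The function name and parameters are ours.

```python
import math
from statistics import NormalDist


def required_sample_size(population, margin, confidence, p=0.5):
    """Sample size needed to estimate a proportion in a finite population.

    Assumptions (not from the article): simple random sampling and the
    worst-case proportion p = 0.5, which maximizes the required sample.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# The three scenarios discussed in the text.
scenarios = [
    (20_000, 0.05, 0.95),
    (20_000, 0.02, 0.95),
    (1_000_000, 0.01, 0.99),
]
for population, margin, confidence in scenarios:
    n = required_sample_size(population, margin, confidence)
    print(f"population={population:,} margin={margin:.0%} "
          f"confidence={confidence:.0%} -> n={n:,}")
```

Under these assumptions the results land close to the figures quoted above: a bit under 400 for the first scenario, a little over 2,000 for the second, and over 16,000 for the third.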

Stratification is a strategy that can help reduce the error margin and achieve better precision with a fixed sample size. However, stratification alone cannot justify the very small sample sizes that are currently used in most mini-publics (70 or fewer).

To understand why, suppose we want to create a sample that represents five important strata of the population, such as ethnicity, age, income, geographical location, and gender, and that includes all their intersections. For simplicity, let’s assume that each of the first four strata has five equal groups in society, and gender is composed of two equal groups. The minimal sample required to include the intersections of all the strata and represent this population is equal to 5^4×2=1250. Note that we have maintained the somewhat unlikely assumption that all categories have equal size. If one stratum, such as ethnicity, includes a minority that is 1/10 of the population, then our multiplier would be 10 instead of 5, requiring a sample size of 5^3×10×2=2500.

The latter is independent of the number of categories within the strata, so even if the strata have only two categories, one comprising 90% (9/10) of the population and one comprising 10% (1/10) of the population, the multiplier would still be 10. When we want to represent a minority of 1% (1/100) of the population, the multiplier becomes 100. Note that this minimal sample size would include the intersection of all the strata in such a population, but such a small sample will not be representative of each stratum. To achieve stratum-level representation, we need to increase the number of people for each stratum following the same mathematical rules we used for simple sampling, as described at the beginning of this section, generating a required sample size in the order of hundreds of thousands of people (in our example above, 370×2500=925,000).
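The multiplication above can be sketched in code. This is a simplified illustration of the argument, not a sampling design: it assumes, as the text does, that each stratum contributes a multiplier equal to one over the share of its smallest group, and it reuses the figure of roughly 370 people per cell from the beginning of this section. The function name is ours.

```python
from fractions import Fraction
from math import ceil, prod


def minimal_intersection_sample(smallest_shares):
    """Smallest sample that can include every intersection of the strata.

    `smallest_shares` holds, for each stratum, the population share of
    its smallest group; each stratum contributes a multiplier of 1/share.
    """
    return prod(ceil(1 / share) for share in smallest_shares)


# Four strata with five equal groups each, plus gender with two equal groups.
equal_groups = [Fraction(1, 5)] * 4 + [Fraction(1, 2)]
print(minimal_intersection_sample(equal_groups))  # 5^4 x 2 = 1250

# Same setup, but one stratum now contains a minority of 1/10.
with_minority = [Fraction(1, 10)] + [Fraction(1, 5)] * 3 + [Fraction(1, 2)]
cells = minimal_intersection_sample(with_minority)
print(cells)  # 5^3 x 10 x 2 = 2500

# Stratum-level representation: about 370 people per cell (figure from the text).
people_per_cell = 370
print(people_per_cell * cells)  # 370 x 2500 = 925000
```

Exact fractions are used instead of floating-point shares so that the multipliers come out as whole numbers.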

This is without even entering into the discussion of what should be the ideal set of strata to be used in order to achieve legitimacy. Should we also include attitudes such as liberal vs conservative? Opinions on the topic of the assembly? Metrics of type of personality? Education? Income? Previous level of engagement in politics? In sum, the more complex the population is, the larger the sample required to represent it.

  2. Sampling error due to the lack of a clear population list

When evaluating sampling methods, it is important to consider that creating a random sample of a population requires a starting population to draw from. In some fields, the total population is well-defined and data is readily available (e.g. students in a school, members of parliament), but in other cases such as a city or country, it becomes more complicated.

The literature on surveys contains multiple publications on sampling issues, but for our purposes, it is sufficient to note that without a police state or similar means of collecting an unprecedented amount of information on citizens, creating a complete list of people in a country to draw our sample from is impossible. All existing lists (e.g. electoral lists, telephone lists, addresses, social security numbers) are incomplete and biased.

This is why survey companies charge significant amounts of money to allow customers to use their model of the population, which is a combination of multiple subsamples that have been optimized over time to answer specific questions. For example, a survey company that specializes in election forecasting will have a sampling model optimized to minimize errors in estimating parameters of the population that might be relevant for electoral studies, while a company that specializes in retail marketing will have a model optimized to minimize forecasting errors in predicting sales of different types of goods. Each model will draw from different samples, applying different weights according to complex algorithms that are optimized against past performance. However, each model will still be an imperfect representation of the population.

Therefore, even the best possible sampling method will have an inherent error. It is difficult, if not impossible, to perfectly capture the entire population, so our samples will be drawn from a subpopulation that carries biases. This problem is further accentuated for low-cost mini-publics that cannot afford expensive survey companies or do not have access to large public lists like electoral or census lists. These mini-publics may have a very narrow and biased initial subpopulation, such as only targeting members of an online community, which brings its own set of biases.

  3. Non-response

A third factor, well-known among practitioners and community organizers, is the fact that receiving an invitation to participate does not mean a person will take part in the process. Thus, any invitation procedure has issues of non-participation. This is probably the most obvious factor that prevents one from creating representative samples of the population. In mini-publics with large samples of participants, such as Citizens’ Assemblies, the conversion rate is often quite low, sometimes less than 10%. By conversion rate, we mean the percentage of the people contacted who say that they are willing to participate and enter the recruitment pool. Simpler mini-publics of shorter duration (e.g. one weekend) often achieve higher engagement. A dataset on conversion rates of mini-publics does not exist, but our own experience in organizing Citizens’ Assemblies, Deliberative Polls, and clones tells us that it is possible to achieve more than 20% conversion when the topic is very controversial. For example, in the UK’s Citizens’ Assembly on Brexit in 2017, 1,155 people agreed to enter the recruitment pool out of the 5,000 contacted, generating a conversion rate of 23.1%, as illustrated below.[1]

Figure 1: Contact and recruitment numbers for the UK’s Citizens’ Assembly on Brexit (Renwick et al. 2017)

We do not pretend to know all the existing cases, and so this data should be taken with caution. Maybe there have been cases with 80% conversion, given it is possible to achieve such rates in surveys. But even in such hypothetical best practices, we would have failed to engage 20% of the population. More realistically, with 10 to 30% engagement, we are just engaging a very narrow subset of the population.
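The conversion-rate arithmetic used in this section is simple enough to sketch. The Brexit assembly figures come from the text; the target pool of 1,000 people in the second example is a purely hypothetical number chosen for illustration, and the function names are ours.

```python
from fractions import Fraction
from math import ceil


def conversion_rate(agreed, contacted):
    """Share of contacted people who agree to enter the recruitment pool."""
    return Fraction(agreed, contacted)


def contacts_needed(pool_target, rate):
    """Invitations required to reach a recruitment pool of a given size."""
    return ceil(pool_target / rate)


# UK Citizens' Assembly on Brexit: 1,155 of 5,000 contacted agreed.
brexit_rate = conversion_rate(1155, 5000)
print(f"{float(brexit_rate):.1%}")  # 23.1%

# Hypothetical target: a recruitment pool of 1,000 at a 10% conversion rate.
print(contacts_needed(1000, Fraction(1, 10)))  # 10000

# Share of contacted people who never enter the pool at each rate.
for rate in (Fraction(1, 10), brexit_rate, Fraction(4, 5)):
    print(f"conversion {float(rate):.1%} -> not engaged {float(1 - rate):.1%}")
```

Even the hypothetical 80% case leaves a fifth of those contacted outside the pool, and nothing in this arithmetic says whether the people who decline resemble those who accept.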

Frequently asked questions, and why we should not abandon sortition

It is clear from the points above that the assertion that the current generation of relatively small mini-publics is representative of the population from which it is drawn is questionable. Not surprisingly, the fact that participants of mini-publics differ from the population they are supposed to represent was documented over a decade ago.[2] However, in our experience, when confronted with these facts, practitioners and advocates of mini-publics often raise various questions. Below, we address five frequently asked questions and provide answers for them.

  1. “But people use random sampling for surveys and then claim that the results are representative, what is the difference for mini-publics?”

The first difference we already discussed between surveys and mini-publics is that surveys that aim to represent a large population use larger samples. 

The second difference, less obvious, is that a mini-public is not a system that aggregates fixed opinions. Rather, one of the core principles of mini-publics is that participants deliberate and their opinions may change as a result of the group process and composition. Our sampling procedures, however, are based on the task of estimating population parameters, not generating input for legitimate decision making. While a 5% error margin with 95% confidence level may be acceptable in a survey investigating the proportion of people who prefer one policy over another, this same measure cannot be applied to a mini-public because participants may change their opinions through the deliberation process. A mini-public is not an estimate derived from a simple mathematical formula, but rather a complex process of group deliberation that may transform input preferences into output preferences and potentially lead to important decisions. Cristina Lafont has used a similar argument to criticize even an ideal sample that achieves perfect input representativeness.[3]

  2. “But we use random assignment for experiments and then claim that the results are representative, what is the difference for mini-publics?”

Mini-publics can be thought of as experiments, similar to clinical trials testing the impact of a vaccine. This approach allows us to evaluate the impact of a mini-public on a subset of the population, providing insight into what would happen if a similar subset of the population were to deliberate. Continuing this metaphor, if the mini-public participants co-design a new policy solution and support its implementation, any similar subsets of the population going through an identical mini-public process should generate a similar output.

However, clinical trials require that the vaccine and a placebo be randomly assigned to treatment and control groups. This approach is only valid if the participants are drawn from a representative sample and cannot self-select into each experimental arm.

Unfortunately, few mini-publics compare the decisions made by members to those who were not selected, and this is not considered a key element for claiming representativeness or legitimacy. Furthermore, while random assignment of treatment and control is crucial for internal validity, it does not guarantee external validity. That is, the results may not be representative of the larger population, and the estimate of the treatment effect only applies to the specific sample used in the experiment. 

While the metaphor of the experiment as a model to interpret mini-publics is preferable to the metaphor of the survey, it does not solve the issue of working with non-representative samples in practice. Therefore, we must continue to explore ways to improve the representativeness of mini-publics and take into account the limitations of the experimental metaphor when designing and interpreting their results.

  3. “Ok, mini-publics may not be perfect, but are they not clearly better than other mechanisms?”

Thus far, we have provided evidence that the claim of mini-publics as representative of the population is problematic. But what about more cautious claims, such as mini-publics being more inclusive than other participatory processes (e.g., participatory budgeting, e-petitions) that do not employ randomization? Many would agree that traditional forms of consultation tend to attract “usual suspects” – citizens who have a higher interest in politics, more spare time, higher education, enjoy talking in public, and sometimes enjoy any opportunity to criticize. In the US, for instance, these citizens are often older white males, or as a practitioner once put it, “the male, pale and stale.” A typical mini-public instead manages to engage a more diverse set of participants than traditional consultations. While this is an obvious reality, the engagement strategies of mini-publics and of traditional consultations based on self-selection have very different levels of sophistication and costs. Mini-publics tend to invest more resources in engagement, sometimes tens of thousands of dollars, and thus we cannot exclude that existing results in terms of inclusion are purely due to better outreach techniques, such as mass recruitment campaigns and stipends for the participants.

Therefore, it is not fair to compare traditional consultations to mini-publics, just as it is not fair to compare mini-publics that are not specifically designed to include marginalized populations to open-to-all processes that are specifically designed for this purpose. The classic critique of feminist, intersectional and social movement scholars, that the design of mini-publics does not consider existing inequalities and is thus inferior to dedicated processes of minority engagement, is valid in that case. This is because the amount of resources dedicated to engagement is positively correlated with inclusion. For instance, processes specifically designed for immigrants and native populations will have more inclusive results than a general random selection strategy that has neither specific quotas nor dedicated engagement strategies for these groups.

We talk past one another when we try to rank processes with respect to their supposed inclusion performance without considering the impact of the resources dedicated to engagement or their intended effects (e.g. redistribution, collective action).

It is also difficult to determine which approach is more inclusive without a significant amount of research comparing different participatory methods with similar outreach and resources. As far as we know, the only study that compares two similar processes – one using random engagement and the other using an open-to-all invitation – found little difference in inclusiveness.[4] It also highlighted the importance of other factors such as the design of the process, potential political impact, and the topic of discussion. Many practitioners do not take these factors into account, and instead focus solely on recruitment strategies. While one study is not enough to make a conclusive judgment, it does suggest that the assumption that mini-publics using randomly selected participants are automatically more inclusive than open-to-all processes is problematic.

  4. “But what about the ergonomics of the process and deliberative quality? Small mini-publics are undeniably superior to large open-to-all meetings.”

One of the frequently advertised advantages of small mini-publics is their capacity to support high-quality deliberation and include all members of the sample in the discussion. This is a very clear advantage; however, it has nothing to do with random sampling. It is not difficult to imagine a system in which an open-to-all meeting is called and then such a meeting selects a smaller number of representatives that will proceed to discuss using high-quality deliberative procedures. The selection rule could include quotas so that the selected members respect criteria of diversity of interest (even though, as we argued before, that would not be representative of the entire group). The ergonomics and inclusion advantages are purely linked with the size of the assembly and the process used to support deliberation.

  5. “So, are you saying we should abandon sortition?”

We hope that it is now clearer why we contend that it is conceptually erroneous to defend the application of sortition in mini-publics based on their statistical representation of the population. So, should sortition be abandoned? Our position is that it should not, for one less obvious and counterintuitive reason in favor of random sampling: it offers a fair way to exclude certain groups from the mini-public. This is particularly so because, in certain cases, participatory mechanisms based on self-selection may be captured by organized minorities to the detriment of disengaged majorities.

Consider, for instance, one of President Obama’s first attempts to engage citizens at a large scale, the White House’s online town-hall. Through a platform named “open for questions,” citizens were able to submit questions to Obama and vote for which questions they would like to be answered by him. Over 92,000 people posted questions, and about 3.6 million votes were cast for and against those questions. Under the section “budget” of the questions, seven of the ten most popular queries were about legalizing marijuana, many of which were about taxing it. The popularity of this issue was attributed to a campaign led by NORML, an organization advocating for pot legalization. While the cause and ideas may be laudable, it is fair to assume that this was hardly the biggest budgetary concern of Americans in the aftermath of an economic downturn.

(Picture by Pete Souza, Wikimedia Commons)

In a case like the White House’s town-hall, the randomization of people to participate would be a fair and effective way to avoid the capture of the dialogue by organized groups. Randomization does not completely exclude the possibility of capture of a deliberative space, but it does increase the costs of doing so. The probability that any given member of an organized minority is randomly sampled to participate in a mini-public is small, so the odds that such a group ends up dominating the mini-public are small as well. Thus, even if we had a technological solution capable of organizing large-scale deliberation in the millions, a randomization strategy could still be an effective means to protect deliberation from the capture by organized minorities. A legitimate method of exclusion will remain an asset – at least until we have another legitimate way to mitigate the ability of small, organized minorities to bias deliberation.
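A rough calculation illustrates why random selection makes capture costly. The sketch below uses a binomial model, which assumes that every selected person takes part and that the population is large relative to the sample. The organized group’s 1% share of the population and the sample of 100 are hypothetical numbers chosen for illustration, not data from any real process.

```python
from math import comb


def prob_at_least(k, sample_size, share):
    """Probability that at least k of the sampled people belong to a group
    that makes up `share` of the population (binomial model)."""
    return sum(
        comb(sample_size, i) * share ** i * (1 - share) ** (sample_size - i)
        for i in range(k, sample_size + 1)
    )


# Hypothetical scenario: an organized group is 1% of the population
# and 100 participants are drawn at random.
sample_size = 100
share = 0.01

print(f"expected members in the sample: {sample_size * share:.1f}")
print(f"P(at least 1 member):  {prob_at_least(1, sample_size, share):.3f}")
print(f"P(at least 10 members): {prob_at_least(10, sample_size, share):.2e}")
print(f"P(majority, 51 or more): {prob_at_least(51, sample_size, share):.2e}")
```

In this scenario the group is likely to have someone in the room, but the chance of it holding a majority is vanishingly small. The model ignores non-response: if members of an organized group accept invitations at a much higher rate than everyone else, their share among actual participants will be higher than this calculation suggests.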

The way forward for mini-publics: go big or go home?

There is clearly a case for increasing the size of mini-publics to improve their ability to represent the population. But there is also a trade-off between the size of the assembly and the cost required to sustain high-quality deliberation. With sizes approaching 1000 people, hundreds of moderators will be required and much of the exchange of information will occur not through synchronous exchanges in small groups, but through asynchronous transmission mechanisms across the groups. This is not necessarily a bad thing, but it will have the typical limitations of any type of aggregation mechanism that requires participant attention and effort. For example, in an ideation process with 100 groups of 10 people each, where each group proposes one idea and then discusses all other ideas, each group would have to discuss 100 ideas. This is a very intense task. However, there could be filtering mechanisms that require subgroups to eliminate non-interesting ideas, and other solutions designed to reduce the amount of effort required by participants.
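The workload in the ideation example can be made concrete. The sketch below first computes the unfiltered load described above, then one possible filtering scheme. That two-round scheme (groups clustered in tens, each cluster forwarding its two best ideas) is a hypothetical design invented here for illustration, not one proposed in the text, and the function names are ours.

```python
from math import ceil


def ideas_without_filter(num_groups, ideas_per_group=1):
    """Ideas each group must discuss if every idea goes to every group."""
    return num_groups * ideas_per_group


def ideas_with_filter(num_groups, cluster_size, kept_per_cluster,
                      ideas_per_group=1):
    """Ideas each group discusses under a hypothetical two-round filter.

    Round 1: groups are clustered and discuss only their cluster's ideas.
    Round 2: every group discusses the ideas each cluster kept.
    """
    clusters = ceil(num_groups / cluster_size)
    round_one = min(cluster_size, num_groups) * ideas_per_group
    round_two = clusters * kept_per_cluster
    return round_one + round_two


# The example from the text: 100 groups of 10 people, one idea per group.
print(ideas_without_filter(100))  # 100 ideas per group

# Hypothetical filter: clusters of 10 groups, each forwarding 2 ideas.
print(ideas_with_filter(100, cluster_size=10, kept_per_cluster=2))  # 30
```

The reduction comes at a price: most ideas are discarded after being seen by only a tenth of the groups, which is the kind of aggregation limitation the paragraph above refers to.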

All else being equal, as the size of the assembly grows, the logistical complexity and associated costs increase. At the same time, the ability to analyze and integrate all the information generated by participants diminishes. The question of whether established technologies like argument mapping, or even emerging artificial intelligence, could help overcome the challenges associated with mass deliberation is an empirical one – but it’s certainly an avenue worth exploring through experiments and research. Recent designs of permanent mini-publics, such as those adopted in Belgium (Ostbelgien, Brussels) and Italy (Milan), that resample a small new group of participants every year could, over time, include a sufficiently large sample of the population to achieve a good level of representation, at least for some strata of the population, as long as systematic sampling errors are corrected and obvious caveats in terms of representativeness are clearly communicated.

Another approach is to abandon the idea of achieving representativeness and instead target specific problems of inclusion. This is a small change in the current approach to mini-publics, but in our opinion, it will generate significant returns in terms of long-term legitimacy. Instead of justifying a mini-public through a blanket claim of representation, the justification in this model would emerge from a specific failure in inclusion. For example, imagine that neighborhood-level urban planning meetings in a city consistently fail to involve renters and disproportionately engage developers and business owners. In such a scenario, a stratified random sample approach that reserves quotas for renters and includes specific incentives to attract them, and not the other types of participants, would be a fair strategy to prevent domination. However, note that this approach is only feasible after a clear inclusion failure has been detected.

In conclusion, from a democratic innovations perspective, there seem to be two productive directions for mini-publics: increasing their size or focusing on addressing failures of inclusiveness. Expanding the size of assemblies involves technical challenges and increased costs, but in certain cases it might be worth the effort. Addressing specific cases of exclusion, such as domination by organized minorities, may be a more practical and scalable approach. This second approach might not seem very appealing at first. But one should not be discouraged by our unglamorous example of fixing urban planning meetings. In fact, this approach is particularly attractive given that inclusion failures can be found across multiple spaces meant to be democratic – from neighborhood meetings to parliaments around the globe.

For mini-public practitioners and advocates like ourselves, this should come as a comfort: there’s no shortage of work to be done. But we might be more successful if, in the meantime, we shift the focus away from the representativeness claim.

****************

We would like to express our gratitude to Amy Chamberlain, Andrea Felicetti, Luke Jordan, Jon Mellon, Martina Patone, Thamy Pogrebinschi, Hollie Russon Gilman, Tom Steinberg, and Anthony Zacharewski for their valuable feedback on previous versions of this post.


[1] Renwick, A., Allan, S., Jennings, W., McKee, R., Russell, M. and Smith, G., 2017. A Considered Public Voice on Brexit: The Report of the Citizens’ Assembly on Brexit.

[2] Goidel, R., Freeman, C., Procopio, S. and Zewe, C., 2008. Who participates in the ‘public square’ and does it matter? Public Opinion Quarterly, 72, pp.792-803. doi: 10.1093/poq/nfn043

[3] Lafont, C., 2015. Deliberation, participation, and democratic legitimacy: Should deliberative mini‐publics shape public policy? Journal of Political Philosophy, 23(1), pp.40-63.

[4] Griffin, J., Abdel-Monem, T., Tomkins, A., Richardson, A. and Jorgensen, S., 2015. Understanding Participant Representativeness in Deliberative Events: A Case Study Comparing Probability and Non-Probability Recruitment Strategies. Journal of Public Deliberation, 11(1). doi: https://doi.org/10.16997/jdd.221