Bit by Bit: Social Research in the Digital Age


Pic by Jim Kaskade (flickr creative commons)

Matthew Salganik, Professor of Sociology at Princeton University, has recently put his forthcoming book on social research and big data online for an open review. Matthew is the author of many of my favorite academic works, including this experiment in which he and Duncan Watts test social influence by artificially inverting the popularity of songs in an online music market. He is also the brains behind All Our Ideas, an amazing tool that I have used in much of the work that I have been doing, including “The Governor Asks” in Brazil.

In Matthew’s words, this is a book “for social scientists that want to do more data science, and it is for data scientists that want to do more social science.” Even though I have not read the entire book, one of the things that has already impressed me is the simplicity with which Matthew explains complex topics, such as human computation, distributed data collection and digital experiments. For each topic, he highlights opportunities and provides experienced advice for those working with big data and social sciences. His stance on social research in the digital age is brilliant and refreshing, and is a wake-up call for lots of people working in that domain. Below is an excerpt from his preface:

From data scientists, I’ve seen two common misunderstandings. The first is thinking that more data automatically solves problems. But, for social research that has not been my experience. In fact, for social research new types of data, as opposed to more of the same data, seem to be most helpful. The second misunderstanding that I’ve seen from data scientists is thinking that social science is just a bunch of fancy-talk wrapped around common sense. Of course, as a social scientist, and more specifically as a sociologist, I don’t agree with that; I think that social science has a lot to offer. Smart people have been working hard to understand human behavior for a long time, and it seems unwise to ignore the wisdom that has accumulated from this effort. My hope is that this book will offer you some of that wisdom in a way that is easy to understand.

From social scientists, I’ve also seen two common misunderstandings. First, I’ve seen some people write off the entire idea of social research using the tools of the digital age based on a few bad papers. If you are reading this book, you have probably already read a bunch of papers that use social media data in ways that are banal or wrong (or both). I have too. However, it would be a serious mistake to conclude from these examples that all digital age social research is bad. In fact, you’ve probably also read a bunch of papers that use survey data in ways that are banal or wrong, but you don’t write off all research using surveys. That’s because you know that there is great research done with survey data, and in this book, I’m going to show you that there is also great research done with the tools of the digital age.

The second common misunderstanding that I’ve seen from social scientists is to confuse the present with the future. When assessing social research in the digital age (the research that I’m going to describe in this book), it is important to ask two distinct questions:

How well does this style of research work now?

How well will this style of research work in the future as the data landscape changes and as researchers devote more attention to these problems?

I have only gone through parts of the book (and yes, I did go beyond the preface). But from what I can see, it is a must-read for those who are interested in digital technologies and the new frontiers of social research. And while reading it, why not respond to Matthew’s generous act by providing some comments? You can access the book here.


Tiny post on Big Data

Behind much of the excitement with the data revolution is an assumption that decision-making follows a rational actor model. And that’s a huge problem.


Three New Papers (and a presentation) on Civic Tech


This blog has been slow lately, but as I mentioned before, it is for a good cause. With some great colleagues I’ve been working on a series of papers (and a book) on civic technology. The first three of these papers are out. There is much more to come, but in the meantime, you can find below the abstracts and links to each of the papers. I have also added a link to a presentation that highlights some other issues that we are looking at.

  • Effects of the Internet on Participation: Study of a Public Policy Referendum in Brazil.

Does online voting mobilize citizens who otherwise would not participate? During the annual participatory budgeting vote in the southern state of Rio Grande do Sul in Brazil – the world’s largest – Internet voters were asked whether they would have participated had there not been an online voting option (i-voting). The study documents an 8.2 percent increase in total turnout with the introduction of i-voting. In support of the mobilization hypothesis, unique survey data show that i-voting is mainly used by new participants rather than just for convenience by those who were already mobilized. The study also finds that age, gender, income, education, and social media usage are significant predictors of being online-only voters. Technology appears more likely to engage people who are younger, male, of higher income and educational attainment, and more frequent social media users.

Read more here.

  • The Effect of Government Responsiveness on Future Political Participation.

What effect does government responsiveness have on political participation? Since the 1940s political scientists have used attitudinal measures of perceived efficacy to explain participation. More recent work has focused on underlying genetic factors that condition citizen engagement. We develop a ‘Calculus of Participation’ that incorporates objective efficacy – the extent to which an individual’s participation actually has an impact – and test the model against behavioral data from (n=399,364). We find that a successful first experience using (e.g. reporting a pothole and having it fixed) is associated with a 54 percent increase in the probability of an individual submitting a second report. We also show that the experience of government responsiveness to the first report submitted has predictive power over all future report submissions. The findings highlight the importance of government responsiveness for fostering an active citizenry, while demonstrating the value of incidentally collected data to examine participatory behavior at the individual level.

Read more here.

  • Do Mobile Phone Surveys Work in Poor Countries? 

In this project, we analyzed whether mobile phone-based surveys are a feasible and cost-effective approach for gathering statistically representative information in four low-income countries (Afghanistan, Ethiopia, Mozambique, and Zimbabwe). Specifically, we focused on three primary research questions. First, can the mobile phone survey platform reach a nationally representative sample? Second, to what extent does linguistic fractionalization affect the ability to produce a representative sample? Third, how effectively does monetary compensation impact survey completion patterns? We find that samples from countries with higher mobile penetration rates more closely resembled the actual population. After weighting on demographic variables, sample imprecision was a challenge in the two lower feasibility countries (Ethiopia and Mozambique) with a sampling error of +/- 5 to 7 percent, while Zimbabwe’s estimates were more precise (sampling error of +/- 2.8 percent). Surveys performed reasonably well in reaching poor demographics, especially in Afghanistan and Zimbabwe. Rural women were consistently under-represented in the country samples, especially in Afghanistan and Ethiopia. Countries’ linguistic fractionalization may influence the ability to obtain nationally representative samples, although a material effect was difficult to discern through penetration rates and market composition. Although the experimentation design of the incentive compensation plan was compromised in Ethiopia and Zimbabwe, it seems that offering compensation for survey completion mitigated attrition rates in several of the pilot countries while not reducing overall costs. These effects varied across countries and cultural settings.

Read more here.

  • The haves and the have nots: is civic tech impacting the people who need it most? (presentation) 

Read more here.
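
A side note on the sampling errors quoted in the mobile phone survey abstract above: for readers who want a feel for what a figure like +/- 2.8 percent implies, here is a minimal sketch of the textbook margin-of-error formula for a proportion. It is not the paper's method. The function names, the 95 percent z-value of 1.96, the worst-case proportion of 0.5 and the `design_effect` parameter (a rough stand-in for the precision lost through weighting) are all assumptions made for illustration.

```python
import math


def margin_of_error(n, p=0.5, z=1.96, design_effect=1.0):
    """Approximate margin of error for a proportion p estimated from n responses.

    Assumes simple random sampling, inflated by a hypothetical design effect
    to mimic the loss of precision that weighting usually brings.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(design_effect * p * (1 - p) / n)


def sample_size_for(moe, p=0.5, z=1.96, design_effect=1.0):
    """Smallest number of responses whose margin of error is at most moe."""
    if moe <= 0:
        raise ValueError("moe must be positive")
    return math.ceil(design_effect * p * (1 - p) * (z / moe) ** 2)


if __name__ == "__main__":
    # Illustrative inputs only; these are not figures from the paper.
    for responses in (200, 500, 1000, 2000):
        print(responses, round(margin_of_error(responses) * 100, 1), "percent")
```

The takeaway is the square root: halving the margin of error takes roughly four times as many completed surveys, which matters when each response has a cost.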

Petition Growth and Success Rates on the UK No. 10 Downing Street Website



This is the kind of research that should be informing the design of ICT mediated initiatives. It is also a good example of why policymakers and practitioners should reach out more to scholars (and vice-versa).

Now that so much of collective action takes place online, web-generated data can further understanding of the mechanics of Internet-based mobilisation. This trace data offers social science researchers the potential for new forms of analysis, using real-time transactional data based on entire populations, rather than sample-based surveys of what people think they did or might do. This paper uses a ‘big data’ approach to track the growth of over 8,000 petitions to the UK Government on the No. 10 Downing Street website for two years, analysing the rate of growth per day and testing the hypothesis that the distribution of daily change will be leptokurtic (rather than normal) as previous research on agenda setting would suggest. This hypothesis is confirmed, suggesting that Internet-based mobilisation is characterized by tipping points (or punctuated equilibria) and explaining some of the volatility in online collective action. We find also that most successful petitions grow quickly and that the number of signatures a petition receives on its first day is a significant factor in explaining the overall number of signatures a petition receives during its lifetime. These findings have implications for the strategies of those initiating petitions and the design of web sites with the aim of maximising citizen engagement with policy issues.

Read more here [PDF].
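
For those wondering what "leptokurtic" means in practice, here is a minimal sketch of how one might check it for a single petition. It is not the authors' code or procedure. The function names and the signature counts are made up for illustration, and I use raw day-over-day change as a simple stand-in for the paper's rate of growth per day.

```python
def daily_changes(cumulative):
    """Day-over-day change in a cumulative signature count."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def excess_kurtosis(values):
    """Population excess kurtosis: about 0 for a normal distribution,
    above 0 for a leptokurtic one (sharp peak, heavy tails)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("values have zero variance")
    return m4 / (m2 ** 2) - 3.0


if __name__ == "__main__":
    # Hypothetical petition: a trickle of signatures, then one sudden burst.
    signatures = [0, 2, 3, 5, 6, 8, 9, 210, 212, 214, 215]
    changes = daily_changes(signatures)
    print("daily changes:", changes)
    print("leptokurtic:", excess_kurtosis(changes) > 0)
```

A series that mostly barely moves and occasionally jumps is exactly the tipping-point pattern the abstract describes. A real analysis would pool many petitions and use a proper statistical test rather than a single summary number.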