Étienne Ollion on AI
What sparked your interest in artificial intelligence? You wrote a thesis on cults, you have published work on political professionalization in Europe, and now you work on AI. This is quite a stretch, isn’t it?
Well, if you ask me, it’s not that different! From anti-cult movements in France to the occupational backgrounds of politicians, and now political journalism, my research always deals with politics. I have adapted my methods to my questions, using ethnography and archives when needed, and digital data or masses of text in other contexts. I do, however, spend more time using quantitative methods now than in the past, and I suppose I have also adapted my questions to my methods a bit. At least, I come at my research questions from a more quantitative angle now.
As a researcher you use AI, or machine learning, both as an object of study and as a research tool. In a recent article (“The Great Regression: Machine Learning, Econometrics, and the Future of Quantitative Social Sciences”) you compared two approaches, one classic, one more recent, to highlight the merits and limits of AI. Could you tell us more about this?
This paper, co-authored with Julien Boelaert who did his dissertation on this topic, is a modest contribution to an ongoing discussion about “what to do with AI in science.” More precisely, our goal was to assess the claim that the unrivaled capabilities of AI were such that it would soon take over all quantitative research.
You mention the promise of “universal approximation.” What is this?
To fully grasp this, we need to step back for a second. In what some would now call the “classic statistical approach,” the one that dominated throughout the 20th century, a researcher builds a model, and then she tests it. The construction of the model depends on hypotheses and on previous empirical results.
Machine learning works differently: you feed the algorithm the data, and it tries to find an optimal combination of the variables. The promise of universal approximation is the idea that an algorithm devoid of any preconceived knowledge about a given case can automatically uncover relations between variables in the data at hand.
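To make the contrast concrete, here is a minimal sketch in Python, using synthetic data and scikit-learn estimators chosen purely for illustration (they are not the tools analyzed in the paper). The first model encodes a functional form chosen by the researcher in advance; the second is handed the data without any such specification and searches for the relation itself.

```python
# Minimal illustrative sketch (synthetic data): a hand-specified parametric
# model versus a flexible learner that approximates the relation from the data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(2000)  # true relation is nonlinear
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Classic" route: the researcher posits a functional form (here, linear).
parametric = LinearRegression().fit(X_train, y_train)

# "Universal approximation" route: no functional form is specified in advance;
# the algorithm searches for the relation itself.
flexible = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

print("linear model, held-out R^2: ", round(parametric.score(X_test, y_test), 2))
print("random forest, held-out R^2:", round(flexible.score(X_test, y_test), 2))
```

On this toy example the flexible learner recovers the nonlinear relation without being told its shape, while the linear specification cannot; that is the intuition behind the promise.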
This is of course a major promise, as it would not only upend the classic approach (hypothesis followed by test); it would also unlink analysis from the knowledge of a researcher building a model. The idea is that an allegedly neutral algorithm would compensate for her preconceived ideas or potential blind spots. If you push this a bit further, it means that anyone could uncover in seconds what it took months or years for a scientist to discover.
You express reservations about this, though…
Well, let’s say that it makes for a good sales pitch, in theory. In practice, it’s another story. First, researchers do more than choose models: they need to have a grasp of the data and its potential issues, something the algorithm won’t do for them. They also know when to dig further if the results are surprising. And the researcher, ultimately, is the one who interprets the data. So expertise is still key. But more importantly, universal approximation comes at a price, which includes mathematical uncertainty, the need for vast data sets that may not be available to social scientists, stochastic procedures that yield somewhat different results every time… We detail these issues in the paper.
One issue you delve into in particular is that of prediction, which is the main output of a machine learning algorithm.
Indeed, the first results yielded by the algorithms we used (called supervised learning models) are predictions. Given a set of properties, and looking at many other cases, these algorithms will tell you what you are more likely to have or do (a given salary, a disease, a lasting marriage).
As a scientist, though, once you have built a model that predicts well but does not do much more, what do you make of it? Even excellent predictive power is of limited use on its own. It’s understandable that companies such as Google or Amazon (who use these technologies on a large scale) would like to be able to predict how likely you are to buy a given item, or which items you would like to buy. But it’s quite a stretch from that to the kinds of questions researchers usually ask. Imagine scientific research that relies on predictions but offers no information about what caused a given outcome. That would be a problem for medical and social scientific research: scientists don’t just want to know whether you’ll respond to a cure or an event, they want to know why.
Can’t we make a case for prediction in science, though?
You are right that such a case can be made—several cases, in fact… But only on the condition that we clarify our expectations. We discuss this in the paper, but one aspect I think is worth mentioning is that social scientists might use prediction (and thus machine learning) not only to analyze data sets, but to produce data.
Could you explain?
If machine learning is good at predicting what choice someone will make but bad at explaining why, then why not use it to do exactly that: to predict? As an example, one project I have at the moment studies the transformations of political journalism over time, examining how journalists narrate politics. For instance, it is now common practice to quote politicians “off the record.” But if you want to capture this in a text, there are dozens of ways to quote “off the record”: you can cite “a source close to power,” “people familiar with the case,” “four unnamed administration officials,” and so on. There are so many ways to word this that it is hard to list all of them in any comprehensive manner.
So Salomé Do and I trained an algorithm to automatically detect this extremely important facet of contemporary political journalism. Our goal was to date the appearance and spread of this style of “off the record” quoting, in order then to analyze the forces behind this change in political journalism (new standards in journalism, generational changes, changes in ownership, etc.). In this particular case, the reason why the algorithm classified a given syntagm as it did didn’t matter much to us; all we needed was for it to identify this style of citation correctly. This saved us months of painstaking labeling: once trained on a limited set of examples, the algorithm was able to generalize effectively across hundreds of thousands of articles.
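To give a sense of the workflow, here is a minimal sketch of such a supervised classifier in Python. It is only a schematic illustration, not the model we actually built, and the labeled sentences are invented examples; the point is the logic of training on a small hand-labeled set and then letting the model generalize to the full corpus.

```python
# Schematic illustration of the workflow (not the actual pipeline): train a text
# classifier on a small hand-labeled sample, then apply it to a large corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented examples: 1 = "off the record" quoting style, 0 = on-the-record quote.
sentences = [
    "A source close to power said the decision was imminent.",
    "People familiar with the case described the meeting as tense.",
    "Four unnamed administration officials confirmed the plan.",
    "The minister declared in parliament that the law would pass.",
    '"We will not back down," the senator told reporters on camera.',
    "The president announced the reform at a press conference.",
]
labels = [1, 1, 1, 0, 0, 0]

# A simple bag-of-words classifier; the point is the workflow, not the model.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(sentences, labels)

# Once trained, the classifier can be run over hundreds of thousands of
# sentences drawn from the article corpus to date and count the practice.
new_sentences = ["A source familiar with the discussions hinted at a reshuffle."]
print(clf.predict(new_sentences))  # e.g. [1] -> flagged as off-the-record style
```

In practice one would label a few hundred or a few thousand examples, hold some out to measure the classifier’s accuracy, and only then trust its predictions as data for the historical analysis.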
On the other hand, no algorithm was able to warn us about major events. “Why did no one see this coming?” was the question asked by Queen Elizabeth in the aftermath of the global financial crisis of 2008. Could the global pandemic have been anticipated?
Probably not the global pandemic, as it is in itself a concatenation of multiple events. But AI was certainly used, in addition to simulation, to predict the development of a disease in a population, to help determine whether you are infected before you could know it yourself, or to understand its dissemination across states. In fact, we have seen models being used to this effect in recent weeks.
More generally, your question is about the predictive power of the social sciences. An argument often made about (or rather against) these disciplines is that if they were really scientific, they should be able to predict events. To me, this forgets that societies, and even individual life courses, are complex systems whose parts constantly interact. Move one aspect and you have to adjust another, and then another…
A paper recently published by Matt Salganik and numerous other researchers reflects on this. Based on high-quality data carefully selected by experts in the field, dozens of scholars tried to predict several life outcomes (divorce, imprisonment, poverty, etc.). They succeeded in approximately 30% of the cases. Depending on how you look at it, this is either a good or a damning result.
Which way do you lean?
I think we should not judge our results solely on prediction, as we would be holding ourselves to a higher bar than many other disciplines. I often liken the social sciences to climate science: both are very good at explaining why something happened, but their predictive capacity is more limited, precisely because of the complexity they have to deal with. So if you think climate science is a serious endeavor, then you should not discard the social sciences on those grounds. In fact, if we did predict, we would probably do better than the weather forecast, which often does not predict well beyond seven days.
Is a good knowledge of mathematics necessary to use AI approaches?
Just as for other quantitative approaches, knowing math is always a plus. Otherwise, you are dependent on the models and their assumptions. In this respect, there is little change. But the answer to this question also depends on what you use AI for. If you use it to produce data (and thus care mainly about the quality of the result), then not knowing what happens under the hood may not matter so much.
What will you be discussing in the SASE workshop you will be leading during the virtual conference?
All of the above, and more. More specifically, I will present a general approach to the question of AI in the social sciences. I will briefly touch on the history of the technique, as well as the various subfields of AI. I will try to highlight both the merits and the limits of AI with respect to classical tools, such as parametric regression or dimensionality reduction. I will also talk about producing data.
In line with the philosophy of these workshops, the goal is to offer a non-technical introduction to the main questions raised by the growing use of machine learning in science, so that participants can explore it further if they feel it could help them in their research.