Open Data and Democracy: Henri Verdier, France's "Mr. Data", Responds

The director of the French government's open data mission reacts to the interview with data sociologist Evelyn Ruppert

A few months ago, we asked the British sociologist Evelyn Ruppert a few questions about open data policy in the UK. Starting from the arguments laid out in her article "Doing the transparent state, open government data as performance indicators", we discussed the very possibility of a transparent state and the question of the digital divide. Our article, republished by French news sites and Rue89, triggered a response from the public on a specific theme: we can embrace open data while, at the same time, questioning certain aspects of its implementation. Among the comments that followed the interview, one particularly deserves your attention: that of Henri Verdier, Director of the Etalab mission leading the French government’s policy of openness and sharing of public data.


The following post is the translation of Henri Verdier's blog post "L'open data est-il soluble dans la big society ?" by André Confiado of Five by Five, slightly revised by Abby Tabor.
An article entitled "Is open data a political illusion?" appeared at the beginning of July on the news pages of MyScienceWork, and was then reprinted in La Gazette des Communes, then by Rue89.

This interview with Evelyn Ruppert, a British sociologist and notably the writer of the blog Big data and society, is inspired by her work on transparency in Britain, which she appears to know well, but is applied to the French approach, of which she appears to know a little less.

Evelyn Ruppert formulates an analysis which can be summarized as: 
- absolute transparency is an illusion, since governments always choose what they want to communicate, and never share the most important information;
- transparency does not build confidence, but rather mistrust, since it can never be complete;
- the process of transparency limits citizens to the data that we choose to transmit to them;
- Open Data promises a more direct relationship with power, but in fact creates a new technocracy of those who can understand the data;
- Thus, close attention must be paid to documenting the data itself (who created it, when, why, etc.) in order to allow citizens to criticize the data that is given to them.

Double mistrust

A number of friends have asked me what I think of this paper. It’s embarrassing: I more or less agree with everything that it says, but I don't really feel concerned. 

Fundamentally, I think Evelyn Ruppert reasons from an implicit idea that I would qualify as a "model of double mistrust."

Her implicit reading of the Open Data movement is the following: as a response to the increasing mistrust of citizens, governments decided to release certain information allowing citizens to better verify them, hoping to restore this confidence. 

I do not know if this reasoning exists anywhere. One feels that this is related to the British context where open data is hard to separate from the Big Society project. However, what I know is that this is not the context of the French government, and that it is not the spirit in which Etalab works.

In France, the opening and sharing of public data are not seen as ends in themselves, but rather as levers that can serve three objectives:
- a more complete democracy;
- innovation and growth;
- and more efficient public action.

Transparency, "accountability," participation…
Let’s start with the democratic dimension. In France, opening up data does not rely on a value granted to "transparency."

Incidentally, I am personally not a fan of this concept. I find in it an apolitical apathy, traces of the theory of the invisible hand, and ignorance of the resources of human activity. I prefer the concept of responsibility, or accountability, which recognizes the dignity of the subject who exercises his or her responsibilities.

The opening of public data in France is founded on the Declaration of the Rights of Man and of the Citizen, and its article 15: "Society has the right to require of every public agent an account of his administration".

The Cada law, which is a legislative translation of this article 15, deals with "administrative documents" (memos, opinions, notes, letters, files, databases, etc.).
We feel that it is not a question of placing the administration in glass display cases, of spying on all its exchanges, of dragging everything into the spotlight, under all circumstances. It is a question of making public its acts of responsibility, the "documents" that an author has worked on and is responsible for, especially with regard to his or her superiors.
The law provides for the bearing of this responsibility before the citizens. It does not ban hesitation, the secrecy of deliberation, the preparation of the decision. Incidentally, it also provides for several exceptions to this principle of publication: protection of the right to privacy, national security, secrets protected by law.

We do not ask the State to describe reality.

Consequently, the law also settles the question of which data the State has to produce and share, which seems to worry Evelyn Ruppert. We do not ask the State to describe reality. We ask it simply to share the data that it uses within the framework of its public service missions, as they are used.
There is no need to open the epistemological question of the meaning of the data on the grounds that it may conceal observation biases.
Open Data is not the public statistics office, and neither is it a great story through which the State tells us what to think… It is the sharing of instruments with which the State works, and on which it bases its decisions. It's the search for a second life and for a new use for the knowledge that the State creates through its daily activities.

Of course, there are other activities that the State keeps secret, for better or worse reasons. There are democratic battles to reduce the perimeter of secrecy. But there are also many other fields that are taking more and more steps to share, and that are learning to find new efficiencies thanks to the relationships opened through sharing.

Evidently, opening data is nothing without dialogue and consultation around the data: the aim is to create the conditions for an active, informed, and responsible citizenry. Exchanging with the users of this data, responding to their questions, compiling criticisms and suggestions is one of the essential changes that these steps produce. It is one of the dimensions of the portal: a world first in this genre, authorizing its users, administrations, citizens, researchers, to enter into a dialogue with the producer of the data, to share points of view that differ from his/her own, to improve the data, to cross-reference it with others, even depositing data that is not the State's.
With more than 1,300 reuses six months after the platform opened, it is clear that a living community has been created, and that it has taken hold of this resource.

We believe in debate

We believe a lot in this debate, which is developing on our platform, on Twitter, at the many "hackathons" that we organize, as well as within organizations. Even more so than we believe in metadata. 
This is because I want to highlight the paradox of Evelyn Ruppert's position: after making the observation about the impossibility of transparency, and the impossibility of restoring confidence, she suggests "transparency around transparency", a "meta-transparency."
Without active communities, confident and caring, you can increase transparency to the second or third degree, and nothing will happen. As always on the internet, what counts are the human communities that organize themselves thanks to these resources.

France is highly conscious of the fact that sharing data should allow for the construction of genuine gestures of democratic exchange, to kindle the informed contribution of the citizen to public decisions. Opening data should lead to an opening of the public decision in itself.
This is the meaning behind France's entry into the Open Government Partnership, announced in May by the President. By joining this community of public innovators, which elected us to its steering committee on August 4, France has chosen to build a partnership with people who know that open data takes on its full meaning in open government, and who work concretely in this direction.

A democracy of actors


There are, however, other dimensions to open data not limited to the control of representative democracy. There is a dimension of empowerment which is curiously absent from this interview. To really understand this, you have to understand how much this movement is linked to the current digital revolution, and how much of an impact it can have on the economy of innovation. 

We have entered a world where, every ten years, the power of computers is multiplied by thirty, and the cost is divided by the same factor. Digital technology has only very recently given society an unimaginable power to act.
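To give a sense of scale, the ten-year figure quoted above can be converted into yearly terms. The short Python sketch below is only an illustration: it assumes growth is smooth and exponential, which the text does not claim, and the variable names are ours.

```python
import math

# Figure taken from the text: a thirtyfold increase every ten years.
FACTOR_PER_DECADE = 30
YEARS = 10

# Assumption (ours): growth is smooth and exponential over the decade.
annual_factor = FACTOR_PER_DECADE ** (1 / YEARS)

# Time needed for computing power to double at that yearly rate.
doubling_years = math.log(2) / math.log(annual_factor)

print(f"Yearly growth factor: {annual_factor:.2f}")
print(f"Doubling time: {doubling_years:.1f} years")
```

Under that assumption, a thirtyfold gain per decade corresponds to growth of roughly 40% per year, or a doubling about every two years.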

Added to the existence of the internet, which allows citizens to be synchronized, to organize, and to cooperate, this revolution has brought us to a new world where the power of the multitude has become an essential political parameter. 

Admittedly, not everyone is equal before technology, as rightly emphasized by Evelyn Ruppert. Admittedly, the risk of a digital divide is chronic. Admittedly, we shall see the appearance of new technocrats.

But the same applies to all democratic revolutions. [The events of] 1789 or 1830 did not equally distribute power to all French people. But they did associate more people with power.

This prioritization of skills is not antidemocratic. The multitude is not the masses. It is a living body, moving, with its own dynamics and organizations. 

There are a multitude of skills. The social organization becomes multipolar. New elites appear, basing their legitimacy on immeasurable dimensions. Bloggers can hold their own against journalists. Collectives develop Linux, Wikipedia, or OpenStreetMap. Guardianship on the social web will allow for a new prioritization of information. 

To think of society without taking this reality into account is to confine oneself to a pre-digital vision. It's to give up on using this dynamic as a lever. 

The "contributions of the multitude"

If the site was opened to contributions from the multitude, it was not to make it "interactive." It is because it strives to become a platform embodying a community of producers and users of data.

This is because the State no longer has the monopoly on the capacity to create information of general interest. This is because OpenStreetMap, OpenMeteoForecast, OpenFoodFacts, Wikidata, but also Celtipharm, Que Choisir ?, the Red Cross, and tomorrow perhaps even unions, associations, and think tanks, have something to say, something to share.

Why split the community of producers and the reusers that work in a network?

It is also this sense of new alliances with civil society that we seek when we support the winners of Dataconnexions on KissKissBankBank, when we help out the Bano project, when we develop research models such as OpenFisca…

Admittedly, the State has a big role. The platform cleanly separates data from public authorities, which is created by the State and for whose sincerity the State holds itself liable, from data created by other actors, whose identity the State merely authenticates if asked. But this difference does not stop cooperation.

The revolution of usages

Above all, what counts with Open Data is what we can do with it. It is this dimension that underlies the two other ambitions of Open Data: fuel for innovation and motivation for the efficiency of public action. This data is not only there to "keep an eye on the State." Far from it. It is material for the exercise of an active citizenry. It is material for creation.

Admittedly, there are unfounded secrets, and there is a need for more players like Mediapart (a web site containing information and opinion articles) to bring them to light. This is not, however, the primary mission of Open Data. The mission of Open Data stems from the fact that there is a wealth of unused knowledge in our systems.

Knowledge that comes to life and takes value when put into circulation. There aren't just little secrets in the State. There are also thousands of useful and fascinating pieces of information resulting from the work of thousands of public servants, involving their professional knowledge. 

The geolocated and time-stamped map of road accidents used during the recent hackathon organized by OKFN and the Ministry of the Interior, and that enabled Rue89 to do this fantastic work, is worth its weight in kilobits. The files of street names that fed so many developments. The statistics on pollution, cartography templates, weather data…
It is the story of "Moneyball," a cult movie for "data scientists." There was something in the statistics of Major League Baseball (in the USA) that could reconstruct the sport. There is something in data from the State that can reconstruct large areas of public action. The important thing is to make this data actionable, so that new actors can take it and ask new questions.

It is a dimension that is frequently lost on non-coders: Open Data is not limited to revealing data. Its aim is for the data to be accessible, manipulable, actionable. This is why it opens up a universe of unforeseen reuses.

We do not ask for raw data in the name of some sort of naturalism of data. We ask for raw data in order to have the most manipulable data possible, to get the greatest number of interpretations possible, and to facilitate more possible manipulations. 

This is what Tim Berners-Lee meant by "raw data." This is not a sociological premise: this is the claim of an engineer. Deep down, we do not understand Open Data as long as we do not understand the difference between information concealed in a 1,459-page report from the Court of Auditors, and information placed in a complete series in an easy-to-use spreadsheet.
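The difference between a report and a spreadsheet can be made concrete with a few lines of code. The Python sketch below is purely illustrative: the table, its column names, and its figures are invented for the example and do not come from any real dataset, and `total_by` is a hypothetical helper of ours.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: invented rows, NOT real government figures.
RAW_CSV = """year,department,accidents
2011,Ain,120
2011,Aisne,95
2012,Ain,110
2012,Aisne,101
"""


def total_by(rows, key, value):
    """Sum the `value` column for each distinct entry of the `key` column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += int(row[value])
    return dict(totals)


# Once the data is a table, each new question is one more line of code.
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
by_year = total_by(rows, "year", "accidents")
by_department = total_by(rows, "department", "accidents")

print(by_year)
print(by_department)
```

The same rows answer two different questions without being retyped or reinterpreted. The same figures scattered through the paragraphs of a report would have to be extracted by hand before any such question could be asked.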

The revolution of data

We have in fact entered a new world where whole segments of reality are codified and thus become analyzable; where an increasing number of individuals and entities know how to derive new uses from them; and where big data opens up epistemological perspectives whose limits we have yet to see.

By sharing data under a license that prevents it from being used for individual profit, one introduces a political dimension to this data revolution. One rebalances power, prevents monopolies, creates common goods. It is one of the forms of essential political action at this period of digital revolution. 

The question remains about the quality of the data, its pertinence, and its capacity to describe reality. You have to admit that human sciences have a long history with data. From phrenology to contemporary marketing, from Gobineau to Alexis Carrel, from IQ to EQ, they have often translated their prejudices into pseudo-knowledge.

They are well placed to know the risk of using contestable ontologies, of selecting correlations presented as causalities, of hiding politics under the appearance of objectivity (see Stat-Activism, how to fight against the numbers), or of working on data that reflects only the prejudices of its authors.

We mentioned it above. It is not entirely the problem of Open Data, which demands the simple sharing of tools that the State uses for its missions. 

But if we really want to talk about it, let's remember that not all data is of the same nature: the exact number of the prison population, the geolocation of hospitals, train schedules, the exact number of professors or tax officers, the price of fuel, the zip codes of villages in France, weather forecasts, street routes...
Certainly the data are intellectual constructions, but that does not mean that all of them pose huge epistemological questions.


We need not be told every time about the difference between noumena and phenomena and the myth of the cave. There is factual information that citizens demand, and that we can share without erring on the side of caution. 

Other data need more interpretation and deserve more analysis. For example, since the work of the Stiglitz commission, France has been familiar with the problems posed by GDP. Why is the value creation of government employees only addressed through their payroll?

It is true that a lot of data does not say exactly what it appears to say. In such cases, more transparency about the choices made in constructing the data, and more precision about its implicit hypotheses, are indeed in order.

But conversely, it must be highlighted that each piece of this data responds to at least one question. Take the statistics on justice, for example: does the social distribution of sentences reflect the distribution of delinquency? The punishment strategies adopted by judges? The capacity to call on a good lawyer? Or does it tell us about the distribution of the forms of delinquency and the severity of the law depending on these forms?

I do not know. But sure enough, the precise analysis of the conditions of the construction of this data allows one to understand it. One just has to ask the right questions about this data. 

By any standards, the world of Open Data appears to me to be more capable of asking the right questions than other social worlds, and, notably, the media.