Everyone encounters big data: via social media, financial transactions and public transport. Although all of these things are useful and fascinating, they simultaneously arouse feelings of discomfort: how far does the – largely invisible – influence of all of these data collections reach? Marleen Stikker has been following developments in this sphere since the beginning of the digital era.
Erica Meijers and Socrates Schouten: What is big data?
Marleen Stikker: Big data is a collective term for everything that is datafied. All kinds of information can be converted into data, and the resulting datasets can also be linked. If you collect a lot of data, you can obtain insights that you would not get from a more limited number of datasets. For example, you can discover patterns and make predictions.
Who collects this data?
Everyone, that’s the exciting part: science, companies, consultancies, governments and counter-movements. There are datasets being collected that we don’t know about or rarely hear of, like those of intelligence services. Thanks in part to Edward Snowden, we have a bit more insight into that world. The biggest collectors are of course the Googles of this world, but companies such as Uber and Airbnb should not be underestimated. The value of this group of technological ‘Silicon Valley’ companies lies in the data and the accompanying algorithms.
‘Algorithm’ is another crucial term. What does it mean?
An algorithm is an instruction, a number of codes that you programme in advance according to the rule ‘if this is true, then that’. That is the basis of all software. This rule also states how you should interpret the data. Algorithms are written and designed. They do not come from God; they are made by humans. Many people regard technology and ICT as ‘hard science’: something that already existed, a law of nature. But we as human beings create everything related to ICT. I also call it designing science.
So algorithms are steering mechanisms for the collection and organisation of data?
Data is already an interpretation in itself, of course. For example, if you organise a group of people demographically, you have applied the categories yourself and you can disregard all kinds of non-measurable factors. If you ask someone if they are a man or a woman, you then have to stick to these categories. The rest does not count. The choice of what we do or do not measure is based on a worldview. These kinds of assumptions regarding data are never actually discussed. Big data and everything connected with it is usually presented as an uncomplicated solution: ‘we will make things safer, more transparent, easier, more sustainable, faster and more fun.’ But what is that optimism actually based on?
Is there not a wider underlying question about technology, which is often perceived as a neutral instrument that can be used in different ways?
The first step is to accept that technology is not neutral. That is difficult, and immediately opens up a huge number of questions. Many people just want to have fun with technology: create apps, make things a little bit faster, and save the world with big data. And you can indeed do everything with it: you can gain new insights that might help us to act better, but you can also manipulate people.
Big data has suddenly become a topic over the past few years. We have measured things since the early days of science – so what makes big data so important right now?
With the increasing growth of data collections, a tipping point has been reached. Instead of sticking a measuring instrument into the ground and processing the results on the computer, we now have underground sensors that provide a permanent flow of data. This also applies to social media, which is a flow of data. And all of these streams can also be combined with each other. That is the second element. A bank no longer simply looks at your transactions, but also combines them with data about your social behaviour. This adds a new dimension. And thirdly, business models have developed around the interpretation of data. All of this together makes us talk about big data. People are particularly worried about this last development: they feel as if they no longer have a grip on their data.
Is big data approaching George Orwell’s Big Brother?
There is no longer just one Big Brother; there are many big brothers and also several little sisters and we do not know if they all work together. But what we do know, and I think this is the fourth dimension, is that all devices intercept data. Since Snowden we know that the hackers’ warnings were not conspiracy thinking: every phone, camera, whatever device or application, also sends standard data to the companies that made them. The problem is that the companies are not transparent about these codes and algorithms. If the companies don’t want to tell us what’s happening behind the scenes, we have no choice but to distrust them.
Is there perhaps another dimension to the issue – namely that everyone loses control of big data?
Then it’s a matter of deep learning by artificial intelligence. This is not new, but it is experiencing a revival. This is the idea that algorithms themselves can form an autonomous system that creates new algorithms and starts acting on its own. This can of course lead to very strange things. We can decide for ourselves whether or not we trust these machines, but do we need this kind of cleverness? And who is responsible for such systems? None of that has been arranged. Also, we usually don’t know what the original codes were. It’s as if you are creating a legal system without knowing the laws that it is based upon. This raises the question of how we create value, and values.
How can we begin to understand and get a grip on all this data we are creating?
As humans, we are extremely curious and love to create. Although this generates a lot, it can also be dangerous. We build safety systems – think of nuclear energy, nuclear weapons and the international agreements around them – in order to contain these tendencies. Look at our history: we have never curbed our curiosity because it became dangerous. And that is not going to happen now. Certainly not if there is the promise of a lot of money to be earned.
There are many areas where all of these datasets can be very meaningful, for example in medical science. But we have hardly begun to define limits for what we want, as we have done with nuclear power plants. And an open discussion about this topic is hampered by competition between companies: they do not want prying eyes. Must we simply take their assurances at face value? We currently have no means of controlling them. Transparency and openness about developments are preconditions if a society wants to get a grip on what is going on, and determine boundaries and control mechanisms.
How do we achieve openness when companies are not interested?
Fortunately, scientists are concerned. They still work largely with open access, open data, open algorithms, open models. In society there is also an important movement for openness, connected with social themes. Just think of the hackers. They say: ‘If you can’t open it, you don’t own it.’ That is also my motto. The big question is whether governments are really open to this.
You have worked a lot with the municipality of Amsterdam. How does politics deal with big data?
You can see this clearly in the idea of smart cities. Google-affiliated companies offer cities infrastructure, sensors and all of the data for free. Naturally, the companies also receive this data. This is tempting for many municipalities, because they do not have the means themselves. But if you begin this way, you skip a crucial step: what is the strategy of the data policy? These are entirely political questions. Yet you see that the ICT dossier is passed on like a hot potato. It is uncontrollable, and for that reason it is unpopular. Just look at the problems the tax authorities have. The portfolio lands with the weakest alderman, who focuses on internal automation. But data policy affects all political issues. How is care organised? What is identity? What is privacy? What is citizens’ sovereignty? The big question is what kind of society should be made possible by technology.
Can you give an example of how ICT architecture alters our social dynamics?
The public transport chip card in the Netherlands is designed in such a way that you cannot travel together with someone else on your card, unlike the strip card that preceded it. This is more efficient for the transport companies and the government. When coming up with this design, nobody asked how people actually want to travel. The system of individual cards and gates prevents us from being hospitable.
It becomes more complicated with digital identity. The Dutch authorities are now working on the successor to the DigiD: the EiD. The condition for acting digitally is that you are not always the same person for every service. Our current system has the consequence that if I show an ID because I want to buy beer, people know not only that I am over 18, but also my religious preference, whether or not I have a partner, and what my status is on Facebook. With many services, all they have to know is that they are dealing with a real person, and that’s all. Does my health care provider, who already knows how many steps I have taken today, have to know that I bought a book about eating lots of meat? The technology should make it possible for you to prevent profiles from being created about yourself without your knowledge. The issue is our sovereignty. That is a tricky concept, but it means that you decide what you are giving in which relationship. We do this all day long, of course. You give something different in a work relationship than you do in a private one. This must also be possible in digital interactions. If this is not anchored in technology, you will soon need hundreds of different passes. That is the hacker method, but it should be arranged by the government because this technology is imposed on us. A DigiD and a Citizen Service Number (BSN) are mandatory.
Is the government able to cope with the interests of large international companies?
There are two movements around technology: one is to create openness, democratise, and enable people to organise better and share knowledge. The other is to exploit, own, and accumulate power. If money is the all-determining factor, then the second movement wins. But I think that the power of people also plays a role. There are values other than money in the game. For example, there is a movement that deals with the commons: common ownership and open access to the means of subsistence. And so the outcome is not yet fixed. But of course we see that we are at the end of an era.
On what basis do you say that we’ve reached the end of an era?
Technology optimists tell us that they are going to solve everything with technology, but the group that chose Trump, and that opted for Brexit, does not believe that this is the path for them. They see it as elitist and associate it with people who are already doing well, who can buy a Tesla and do eco-things while drinking designer coffee in a designer café. You have the left-wing translation of these experiences of exclusion from Bernie Sanders and Jeremy Corbyn, and you have the right-wing populist account. To counter this division, you must create more cooperation and community. Technology can help with this, but our current competitive model leads to the exclusion of large groups of people. And I fear that this is not over. I don’t see a single party in the Netherlands that really has an answer to this. I also do not see GroenLinks (the Dutch green-left party) as a challenger of the order. It is missing the entire technology agenda. They could learn something from the sharp agenda being driven by the D66 party from within the liberal group in the European Parliament.
Can’t you use the new digital technologies to make society greener and more social? GroenLinks often shares that vision.
That is too instrumental. The arrival of the internet has been a game changer. It is not a centrally-managed platform, but a distributed platform, a peer-to-peer platform. So this means a shift in institutionalised decision-making from vertical to horizontal. But politics still clings to the institutional image, and this is why they want to use big data to introduce their own policies. I miss a real political agenda in this area. The first question should be: What does the diffused nature of technology mean for the organisation of society?
For the energy transition, for example, in order to be self-sufficient we need a system that matches the possibilities offered by technology. I am not arguing for autarky, but rather that we stay connected in networks. Then you also have organisational and administrative responsibilities, but you do change the perspective. As the government, you no longer say: ‘We are going to build windmills here’. Instead, you say: ‘How can we make it possible for citizens to build windmills together?’ In short, the state must behave differently.
Could it be that politicians, governments and even the great majority of citizens do not get it? Maybe hackers and thinkers need to explain again the possibilities opened up by big data, and that this is where we need to go.
Absolutely! There is a growing European community that is connecting technology and social change. There are cities like Barcelona that are explicitly developing municipal data commons strategies, and like Eindhoven and Amsterdam, which are realising that they can and should take technology policy into their own hands. The city is an interesting scale to act on: it’s possible to tender for IT for example, and intervene in tech companies in the city. Negotiations with these companies can go very differently with this approach: as a city, what are we actually giving away as data? Cities can make this concrete.
Europe can do more in terms of regulations, even if we are just at the beginning. A whole discussion about ‘who is responsible’ still needs to take place, which will be very exciting. If you think that Facebook should be accused of fake news, then you consider them as a publisher and not as a neutral channel. Or: can you hold companies accountable if there is a data breach? That has far-reaching consequences. If the EU starts to impose fines in connection with the General Data Protection Regulation (GDPR), you can’t just release a new app with the snap of your fingers.
Unfortunately, on the other hand, there is the PSD2 legislation (the second Payment Services Directive). This obliges banks to share customer payment data when the customer agrees with the terms and conditions. The argument for the PSD2 was the creation of a level playing field, in this case the termination of the data monopoly by banks. But do you really dissolve that monopoly by releasing user data to commercial companies? We have to go precisely in the other direction, towards smaller banks and more cooperative business models.
The European Commission is currently investing in ‘Next Generation Internet’. As far as I am concerned, you can view this as the answer to the Silicon Valley model. But the question is whether Europe will get far enough with this approach. An earlier funding programme, ‘Collective Awareness Platforms for Sustainability and Social Innovation’, which is now winding down, was larger and more ambitious, but also just the beginning. If you were to spend the entire European budget for the digital market on a ‘commons’ strategy, you would go a long way. Then you would have sufficient volume.