Why privacy is less nebulous than it’s sometimes made out to be

- August 13, 2014 in Uncategorized

This guest post is by Walter van Holst.

The complexities of the notion of privacy

Two recurring themes in conversations about privacy and personal data are that privacy is such an abstract concept, and that public data can’t be personal data. The former is a myth, the latter a misunderstanding, sometimes an understandable one. For recommended reading on the false dichotomy between public and private or personal, see danah boyd. She’s recommended reading anyway, although I disagree that privacy as a whole is all that complex. It ultimately boils down to the notion of agency: how many degrees of freedom do I have left? And not in the hard, non-coercive, sense of the word freedom. Do I feel like I can freely research Jihadist literature on the internet? Look up a medical condition via a search engine? Communicate with someone who is a well-known investigative journalist? Information empowers, which is both a good and a bad thing. Good because it can mitigate existing power differentials and prevent new ones from happening. Bad because it can amplify existing ones or even create new ones.

Where is open or personal data in this mix?

Open data has always been as much about the mitigation and prevention of power differentials as about innovation. In a sense, privacy is about the same core value. That this core value is expressed and enshrined in law differently over time and in different cultural contexts is what makes it complex in practice. In the USA, the starting point is the right to be left alone, born from the injustices of British colonial rule. In Europe the core concept is more that of informational privacy, born from the injustices brought about by Nazi and Stalinist rule. Quite unsurprisingly, given the way law develops over time, a lot of privacy law has a philosophical underpinning that is dodgy at best. Property, a core concept in any society more complicated than hunter-gatherers, lacked a sound underpinning till the advent of game theory and its application to economics. From that perspective, privacy already is a remarkably mature concept. And speaking about property: for the love of all that is right, let’s stop framing personal data in terms of ownership!

Image credit: AFP

Your personal data is very much like your shadow in that it both reflects you as a person and can give a distorted reflection of yourself. Your shadow can take on a life of its own, like in the Indonesian Wayang puppet theatre, including all the drama that ensues in that art form. Personal data is data about a person, not owned by that person. Privacy is more than personal data, but in the context of an information society in which everything becomes data, personal data will become more synonymous with privacy than it already is. And we will become very boring people if we are not wary about this and do not regain the territory that has been lost already!

Image credit: WSJ/Tim Robinson


What do they know about me? Open data on how organisations use personal data

- March 18, 2014 in Featured

This post is by Reuben Binns, a postgraduate researcher at the University of Southampton, Web Science Institute. His research interests include ethical and legal aspects of personal data and open data. Find him on Twitter and GitHub.

When open data and personal data collide, attention is quite rightly drawn to the negative implications for privacy; namely, the possibility that open data contains – or can be used to infer – personal data. But there’s also a flip-side; open data could help protect privacy by revealing the activity of those who collect and share our personal data. This is something I’ve been exploring in my research using the UK Register of Data Controllers.

This dataset, covering the data protection notifications of 350,000 UK organisations, is released by the Information Commissioner’s Office under an Open Government Licence (it’s available by DVD on request from the ICO, and can be searched using their website portal). It discloses why organisations collect personal data, what kinds of data they collect, from whom and who has access to it. My research uses snapshots of this data over a three-year period to paint a picture of the UK personal data landscape – who knows what about whom, and why. Of course, some of this data may be inaccurate or incomplete, but it’s compiled from what organisations themselves are legally obliged to disclose to the ICO. I parsed the raw XML and loaded it into a database which can be queried. The full results will be released in a forthcoming paper, but alongside this, I’ve also been experimenting to see how the data could provide context to some of the privacy stories that have been in the media spotlight in recent years.
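To give a feel for what parsing the register XML and loading it into a queryable database can involve, here is a minimal sketch in Python. It is not the code used in the research: the element names, attribute names, registration numbers and table layout are all invented for illustration, and the real register's schema will differ.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical register extract: the element and attribute names below are
# invented for illustration and are not the ICO's actual schema.
SAMPLE_XML = """
<register>
  <registration number="X0000001">
    <name>Example Opticians Ltd</name>
    <purpose name="Health administration and services">
      <data_class>Physical or mental health details</data_class>
      <recipient>Traders in personal data</recipient>
    </purpose>
  </registration>
  <registration number="X0000002">
    <name>Example Vetting Services</name>
    <purpose name="Trading in personal data">
      <data_class>Trade union membership</data_class>
      <recipient>Employers</recipient>
    </purpose>
  </registration>
</register>
"""


def load_register(xml_text, conn):
    """Parse register XML and store one row per item listed under a purpose."""
    conn.executescript(
        """
        CREATE TABLE organisation (reg_no TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE disclosure (reg_no TEXT, purpose TEXT, kind TEXT, value TEXT);
        """
    )
    root = ET.fromstring(xml_text)
    for reg in root.iter("registration"):
        reg_no = reg.get("number")
        conn.execute(
            "INSERT INTO organisation VALUES (?, ?)",
            (reg_no, reg.findtext("name")),
        )
        for purpose in reg.iter("purpose"):
            for item in purpose:  # the data_class and recipient children
                conn.execute(
                    "INSERT INTO disclosure VALUES (?, ?, ?, ?)",
                    (reg_no, purpose.get("name"), item.tag, item.text),
                )
    conn.commit()
    return conn


conn = load_register(SAMPLE_XML, sqlite3.connect(":memory:"))

# Who shares personal data with whom, according to the sample entries?
rows = conn.execute(
    "SELECT o.name, d.value FROM organisation o "
    "JOIN disclosure d ON d.reg_no = o.reg_no "
    "WHERE d.kind = 'recipient' ORDER BY o.name"
).fetchall()
print(rows)
```

Keeping each purpose, data class and recipient as its own row makes it straightforward to ask questions across the whole register with ordinary SQL, rather than one organisation at a time through the search portal.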

One example is the ongoing ‘construction worker blacklist’ fiasco. The Consulting Association, a rather blandly named outfit, were fined by the ICO for compiling a blacklist of over 3000 construction workers. Employers paid for access to the list in order to screen out potential workers who had previously caused ‘trouble’ – by, for instance, raising safety concerns on site or engaging in trade union activity. Some of the blacklisted workers were unable to find work for years and are now seeking compensation.

What’s ironic – and alarming – about this case and others like it is that the potentially harmful activity often isn’t itself prohibited by law. In the end, the £5,000 fine was issued due to the Consulting Association’s failure to register their activity with the ICO. The truth is, even legal activity that regulators are aware of may still endanger privacy. So I dug into the register to find companies openly claiming to engage in similar practices.

I found 422 organisations who claim to be collecting information about the trade union membership status of employees of other organisations, for the purposes of selling it to third parties. This was essentially the business model of the now defunct Consulting Association. I’ve visualised a sample of 42 of these organisations below – the yellow nodes are the categories of third parties with whom they share this data.

See full image here.

A more recent controversy concerns the use of patient health data. In the debate over the proposed scheme – under which medical records currently held by GPs would be aggregated into a central database and made available to researchers and companies outside the NHS – it emerged that identifiable patient data from hospitals has apparently already been sold (indirectly) to insurance companies, to the shock and dismay of privacy campaigners and health professionals alike. The body responsible, the HSCIC, have an entry in the register stating who they share personal data with – a copy of which can be seen by searching their registration number (Z8959110) in the ICO’s public portal. (NB: no mention of insurance companies).

A query for organisations who are collecting health data for ‘health administration and services’ purposes returns over 57,000 results. We can refine this to show only those organisations who give this data to ‘traders in personal data’, which yields 840 matches. Many of these appear to be opticians – branches of ‘Specsavers’ make up about a third – so if you’ve had an eye test lately, the results have possibly been aggregated up and sold through third parties. But there also appear to be some other health providers in there with potentially more sensitive data; one of them is an NHS Trust specialising in mental health. There may be a perfectly legitimate and ethical reason why they’re giving away patient data to private data brokers – but I’m struggling to guess what that could be.
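The two-step query described above, a broad filter followed by a refinement, can be sketched as follows. This is an illustration only: the `Entry` structure, the sample organisations and the category labels are made up, and the real register uses its own fields and wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One hypothetical register entry: a purpose plus what is held and shared."""
    organisation: str
    purpose: str
    data_classes: frozenset
    recipients: frozenset


# Invented sample entries; real register entries will differ.
ENTRIES = [
    Entry("Example Opticians Ltd", "health administration and services",
          frozenset({"health"}), frozenset({"traders in personal data"})),
    Entry("Example GP Surgery", "health administration and services",
          frozenset({"health"}), frozenset({"healthcare professionals"})),
    Entry("Example Marketing Ltd", "advertising and marketing",
          frozenset({"contact details"}), frozenset({"traders in personal data"})),
]


def health_for_health_admin(entry):
    """Health data collected for health administration purposes."""
    return (entry.purpose == "health administration and services"
            and "health" in entry.data_classes)


def shares_with_data_traders(entry):
    """Data passed on to traders in personal data."""
    return "traders in personal data" in entry.recipients


def query(entries, *predicates):
    """Return the organisations whose entry satisfies every predicate."""
    return [e.organisation for e in entries if all(p(e) for p in predicates)]


broad = query(ENTRIES, health_for_health_admin)
narrow = query(ENTRIES, health_for_health_admin, shares_with_data_traders)
print(len(broad), len(narrow))
```

Each extra predicate narrows the result set, which mirrors how the count drops from the broad health-data query to the much smaller set of organisations that also pass the data to traders.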

Real privacy harms could result from these kinds of data sharing arrangements, even when they don’t contravene data protection law. If I were a member of a trade union, and my employers had any relationship with those 422 companies, I’d want to know about it. If I were a user of an NHS mental health service, I’d want to know if they’re sharing my medical data with data brokers and why. Whether it’s employment history, political affiliations, or health records, authoritative and accurate open data on who knows what about whom is a prerequisite for preventing privacy harms before they arise.

Publishing this information in obscure, unreadable and hidden privacy policies and impact assessments is not enough to achieve meaningful transparency. There’s simply too much of it out there to capture in a piecemeal fashion, in hidden web pages and PDFs. To identify the good and bad things companies do with our personal information, we need more data, in a more detailed, accurate, machine-readable and open format. In the long run, we need to apply the tools of ‘big data’ to drive new services for better privacy management in the public and private sector, as well as for individuals themselves.

So while there are genuine tensions between openness and privacy, there are also harmonies. When it comes to the organisations, businesses and institutions that shape our lives and livelihoods, transparency about how they use our personal data is essential. It’s the first step towards a new privacy infrastructure fit for the digital age – and open data has a crucial part to play.

Further links:
See the GitHub project report for more on the data source itself – contributions / forks are very welcome. See my previous thoughts on how openness can help rather than hinder privacy here and here, and my musings on the scheme shortly before it was postponed.