The Telegraph’s revelation that Public Health England had been offering data to a firm known to work with big tobacco is rightly seen as a scandal. At the very least, it shows a level of incompetence in verifying to whom around 100,000 lung cancer cases were handed. At worst, it suggests a recognition by special interest groups and lobbies that the best way to get their hands on data is to hijack public bodies.
All evidence points to the former, but that is scant relief. The data may be anonymised, but, as The Telegraph and others have worried, how much damage could it do to the fight against a largely avoidable disease? Equally pressing is the fact that the word ‘anonymised’ is doing a lot of work here. With enough overlapping datasets, most attempts at anonymisation can be cracked, by cross-referencing the quasi-identifiers – postcode, date of birth, sex – that survive in each release. And even assuming that William E Wecker, the big-tobacco-affiliated firm, chooses not to sell the data on to all and sundry (a big assumption, given the US’s lax data protection laws), it is questionable just how secure its systems are.
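The re-identification point can be made concrete with a toy sketch (every name, record and field below is invented for illustration): two datasets that each look harmless on their own can be linked on the quasi-identifiers they share.

```python
# Toy linkage attack: a "de-identified" health release joined against a
# named public list (say, an electoral roll) on shared quasi-identifiers.
# All data here is fabricated purely to illustrate the mechanism.

# Health release: names stripped, but postcode/birth year/sex retained.
health_records = [
    {"postcode": "SW1A", "birth_year": 1962, "sex": "F", "diagnosis": "lung cancer"},
    {"postcode": "M1",   "birth_year": 1985, "sex": "M", "diagnosis": "asthma"},
]

# Public list: no medical data, but names attached to the same identifiers.
public_list = [
    {"name": "A. Example", "postcode": "SW1A", "birth_year": 1962, "sex": "F"},
    {"name": "B. Example", "postcode": "M1",   "birth_year": 1985, "sex": "M"},
]

def reidentify(health, public):
    """Link records whose quasi-identifiers match in both datasets."""
    keys = ("postcode", "birth_year", "sex")
    index = {tuple(p[k] for k in keys): p["name"] for p in public}
    return [
        {"name": index[tuple(h[k] for k in keys)], "diagnosis": h["diagnosis"]}
        for h in health
        if tuple(h[k] for k in keys) in index
    ]

matches = reidentify(health_records, public_list)
print(matches)  # each "anonymous" diagnosis is now attached to a name
```

Neither dataset needs to be breached for this to work: both were released legitimately, and the join alone does the damage. That is why stripping names is not, on its own, anonymisation.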
Big data scandals are not merely the old story of single stolen identities being bandied about: each breach or handover puts more information out there. Your health status, marital status and criminal record (even for minor infractions decades in the past) can be put on offer, threatening your personal autonomy. Who wants their boss, advertisers or criminals knowing everything about their life?
There are several ways we can deal with this. One is to simply ignore the problem, and argue that to put limits on big data collection and transfer is to threaten creativity and innovation. This is the American way, and in a sense, it has its merits – particularly when it comes to research. As academics argued as early as 2009, kneejerk reactions to data breaches threaten key scientific research. Work on massive data sets allows researchers to find patterns that traditional scientific work never could. To clamp down too heavily is to harm this in the longer term.
And yet it is clear that the current position is far too lax, and far too keen to ignore the damage done in the here and now to consumers. Here the EU has led the way, legislating to punish those who – like Public Health England – hand over data without bothering to check who is at the receiving end. The fines the GDPR imposes are hefty, and for many small businesses the upheaval of gaining compliance is unwelcome. Yet the regulation also recognises the importance of making user privacy an integral part of dealing with big data.
We should all have a right to anonymity when handing over sensitive data: the idea that breaches are a fair trade-off is the rhetoric of big companies unwilling to invest properly in security. At the same time, we must balance this right with a recognition that real research – the sort that can effect broad social change for the better – requires big data. Striking a balance between trust and innovation will be key going forward.