The General Data Protection Regulation (GDPR) offers the biggest shake-up for consumers and companies based or working within the European Union. The new laws will have a major impact on how businesses handle personal data, in an attempt to break the almost endemic cycle of data breaches.
This month’s report sets out to examine patterns of influence on Twitter, surrounding the discussions on GDPR, as explained in this blog. We collected data on GDPR from November 15th to December 4th, allowing us to examine key features such as page rank, centrality, and reach. Our overall findings are that the discussion about GDPR is driven by fear of failing to become compliant. Privacy advocates, officials from the European Union and business organisations lead the debate, with national Members of Parliament not so well represented. Opportunities for ethical innovation, within the new regulations, remain largely untapped.
Below are the key takeaways:
1. British and French Information Commissioners lead the discussion on GDPR; EU institutions are less obvious
In terms of location, the tableau shows that the vast majority of Twitter users are based in London for both retweets and mentions, with Brussels and Paris a distant second and third, as you can see below:
Locations for mentions (i.e. all Twitter activity). Brussels is significantly closer when considering retweets alone.
The dominance of UK users in our data may well reflect a certain ‘semantic bias’ in our methodology. French is the only other language which appears in our trending terms – perhaps unsurprising, given its use in both Brussels and Paris.
As can be seen here, for the top twenty trending one-word terms, English remains the more common language, with only two trending term boxes in French (for two-word terms, that drops to just one). However, bearing in mind the natural bias of our analysis towards English, it is quite possible that this underestimates the relative amount of French discussion on GDPR.
Looking at the trending terms, you’ll find a fairly consistent theme to them. Unsurprisingly, the majority of trending terms include words such as “data protection”, “compliance”, and “protection bill”. However, we also see a lot of discussion of “published guide”, “architecture advisor”, “infographic”, “experts”, and “webinar”. It is evident that step-by-step guidance on becoming data compliant remains hugely important for companies, with only a matter of months before the GDPR comes into effect.
Digging into the topics, we can see similar themes, with cybersecurity, information security, and computer security all in the top ten for the highest counts of users. We also find mentions of the areas which will be affected by stricter regulations, including big data and the internet of things. Topics such as entrepreneurship, startups and artificial intelligence also appear, but significantly lower down.
To investigate this in more detail, we decided to compare flocks – temporary groupings which form around events, like GDPR, in which relatively small accounts can have a large influence. Two flocks worth comparing are those of the information commissioners of Britain and France: the Information Commissioner’s Office (ICO) and the Commission Nationale de l’Informatique et des Libertés (CNIL). They both appear on our Gephi graphs of the data – the CNIL is the centre of the pale blue cluster, whilst the ICO sits at the bottom of the purple cluster.
By comparison, the flock around its French counterpart includes La French Tech (an accreditation programme for French startups), Mounir Mahjoubi (Secretary of State in charge of Digital Affairs), and Esante (the French portal for e-health).
Of the two, CNIL’s tweets gained more traction. In the overall list, the French information commissioner took the first and second spots for the number of retweets. The first of those was a phishing alert, warning businesses and individuals who receive alarming messages about GDPR to ignore them. The 913 retweets it got in that period – more than the next three tweets combined – are a reminder that the importance of GDPR (and fears about potential fines) makes it a prime target for scammers. A glance at some of the other top tweets, with emotive language and references to catastrophic data breaches, shows just how easily fake messages can take in business owners.
By contrast, the ICO’s highest ranked tweet comes in at number nine, a link to their new GDPR guide: another reminder that finding good guidance is at the forefront of many businesses’ thinking.
Although GDPR is the most common hashtag for both flocks, the ICO and CNIL flocks have some differences in the sort of topics which they discuss. For the ICO, data protection leads far and away in our normalised topic list, followed by cybercrime, information security, and similar concerns surrounding protecting data (the broader topic of big data also makes an appearance lower down the list). Its leading users include cybersecurity expert David Clarke, Cambridge, Massachusetts-based tech firm Forrester, and business technology news site ZDNet.
By comparison, CNIL’s flock has a wider variety of top topics, with Web 2.0 leading in the normalised topic list. Forensic science and France both come before data security; open data and open source software also make an appearance. The top users in its flock include start-up accreditor La French Tech (whose two accounts are in second and third place), the laboratory for digital innovation at CNIL (LINCnil), and the official account of the Ministry of the Interior (@Place_Beauvau). However, the above note on semantic bias is worth bearing in mind: it’s possible that French terms for data protection just aren’t being picked up in our analysis.
However, whilst the topics vary between the flocks, the top tweets remain focused on either guides to becoming GDPR compliant, or cases of companies who have been penalised for bad data protection practices.
A third, looser flock forms around various EU agencies, visible near the bottom of the Gephi graphs for mentions. This includes the European Data Protection Supervisor (EDPS), the European Union Agency for Network and Information Security (ENISA) and the European Commission.
The EDPS flock is looser than that of the ICO or CNIL, with tweets a mixture of updates on GDPR negotiations and guides on how to stay compliant. The topics discussed are also more similar to those of the British information commissioner than the French: cybercrime heads up the normalised list, followed by data protection and compliance. Its top tweets focus on EU speeches on the forthcoming legislation, consumer concerns over data breaches, and once again, ways in which businesses can become compliant.
Whilst the topics on how GDPR will be implemented vary from country to country, the underlying messages, which often position compliance as a burden to be carried out, are much the same. Hopefully, over time, we might see more positive tweets considering how companies can flourish under the new regulations.
2. Uber versus Sage UK: Corporate Failure Grabs More Headlines than Corporate Success
Looking at the Gephi graph of mentions again, one company’s appearance is not a mark of great renown: Uber. Its appearance here is the result of an immense data breach which hit the ride-hailing company, potentially affecting around 2.7 million British customers alone. Uber itself was not responsible for large numbers of tweets, and as a result it does not appear on our retweet graph.
Uber’s appearance here is a reminder of the ephemerality of the news cycle. A month earlier, it is unlikely to have appeared there; in a month’s time, another data breach is likely to take its place. At any rate, its prominence here is a reminder that examples of blatant corporate failure are more likely to make the news than those of corporate success. Uber actually turns up in our list of trending one-word terms as well, in three groupings.
In all three cases, the groupings feature ‘breach’ and ‘compliance’, suggesting that the tweets are using Uber’s data protection failure as a lesson in better regulation. The second grouping, unsurprisingly, has Sage UK leading the top flock (more on that below). The second and third groupings have top tweets coming from Forrester, one of the main accounts in the ICO flock, whilst the first cluster of trending terms comes from a lesser-known account, Thomas Daubigny, a digital strategist based in Paris.
Looking at the Uber flock itself, data protection is unsurprisingly top of the normalised topic list, followed by web 2.0 and data security. As we can see, both the ICO and CNIL are some of the larger users in the flock.
In contrast to this example of blatant corporate failure, Sage UK appears in a more positive light in some of the same trending terms, as seen above. The British software company has its own small flock, with most of the tweets in it linking to its guidance for businesses.
Sage UK’s flock includes its senior manager Julia Wedgwood, French ad-block analyser AdBack, and Twitter platform Social Fave (also based in France): a reminder that guides and training for GDPR can be applied across the EU. At the same time, the relative dearth of large accounts in this flock is a reminder that examples of bad data protection are more likely to get attention than success stories.
3. Other Flock Leaders: Tech Journalists, Privacy Experts, Governmental Bodies, and Trade Groups
Flocking reflects the construction of temporary communities, clustered around a single topic or event (in this case GDPR). It doesn’t reflect the long-term influence or follower count of the individuals leading the flocks, and doesn’t take into account how influential they are when the topic or event dissipates. Nevertheless, flocks are extremely useful for showing the actual thought leaders within a given area, rather than assuming that a larger follower count means greater impact. Right Relevance uses community detection graph algorithms to create ‘Flocks’.
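The report does not say which community detection algorithm sits behind the flocks, so the following is only an illustrative sketch of one common idea in that family: scoring a candidate grouping by its modularity, where a higher score means more links fall inside groups than chance would predict. The account names and edges are hypothetical toy data, not taken from our dataset.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (account, account) pairs, one per link.
    communities: list of sets of accounts; every account is in exactly one set.
    """
    m = len(edges)
    label = {node: i for i, group in enumerate(communities) for node in group}
    internal = [0] * len(communities)  # links with both ends inside group i
    degree = [0] * len(communities)    # summed degree of the accounts in group i
    for a, b in edges:
        degree[label[a]] += 1
        degree[label[b]] += 1
        if label[a] == label[b]:
            internal[label[a]] += 1
    return sum(internal[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))


# Hypothetical toy network: two tight triangles joined by a single link.
edges = [
    ("regulator_uk", "blogger_1"), ("regulator_uk", "blogger_2"),
    ("blogger_1", "blogger_2"),
    ("regulator_fr", "startup_1"), ("regulator_fr", "startup_2"),
    ("startup_1", "startup_2"),
    ("blogger_1", "startup_1"),  # the only link between the two groups
]

uk = {"regulator_uk", "blogger_1", "blogger_2"}
fr = {"regulator_fr", "startup_1", "startup_2"}

q_flocks = modularity(edges, [uk, fr])      # grouping that follows the triangles
q_single = modularity(edges, [uk | fr])     # everyone in one group
q_mixed = modularity(edges, [
    {"regulator_uk", "startup_1", "startup_2"},
    {"regulator_fr", "blogger_1", "blogger_2"},
])                                          # grouping that cuts across the triangles

print(q_flocks, q_single, q_mixed)
```

In this toy case the grouping that follows the two triangles scores highest, which is the intuition behind a flock: a set of accounts that interact more with each other on a topic than with the rest of the network. A real pipeline would search over many candidate groupings rather than compare three by hand.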
Below are the top 10 flocks for the report:
The largest belongs to Laura Kayali (@LauKaya), a Brussels-based tech reporter at Contexte. Despite having the largest flock, Kayali only has 1,524 followers (compared to 37.2 K for the ICO): a reminder that the metric considers short-term importance rather than long-term size.
The above tweet by Kayali (her highest, in terms of retweets and reach) is outranked by accounts with significantly more followers. Nevertheless, her centrality and the members of her flock, including EU data protection services and several of those mentioned below, give her importance within the context of GDPR.
The majority of other accounts in this category are involved with privacy and security. Probably the best known of these is Max Schrems, the lawyer and activist who became famous for bringing Facebook to court for questionable levels of privacy. The launch of his new privacy enforcement NGO None of Your Business (@NOYBeu) is directly aimed at enforcing GDPR. Smaller accounts also have large flocks, such as Miss Info Geek (@MissIG_Geek), an anonymous infosec writer.
Looking at the Gephi graph of retweets, which allows smaller accounts to have a larger influence, we see these first two sections blurring together into the loosely connected purple cluster at the bottom of the screen (circled in red). We can also see the appearance of other noteworthy privacy advocates, such as Ann Cavoukian, the pioneer behind the idea of ‘privacy by design’.
Governmental accounts also make an appearance. In addition to the European Data Protection Supervisor, we also find Digit (@digitfyi), the Scottish technology and media hub. We also see the only politician in the top flocks: Jan Philipp Albrecht, a Green German MEP. The lack of Members of Parliament from member states of the EU in this list is unsurprising if unfortunate: the GDPR will affect their constituents as much as other local issues.
Trade organisations are also represented in the top flocks: a reminder that small businesses will be as affected by GDPR as the multinationals that make the headlines (if not more so). We see both the North East of England Chamber of Commerce (@NEEChamber) and Small Business Saturday (@SmallBizSatUK) make the list as examples of this category. Looking at the retweet Gephi graph again, we can see that @SmallBizSatUK sits at the centre of a grey cluster at the top of the image, alongside the Federation of Small Businesses (@FSB) and Sage UK.
A final group are entrepreneurs and others who are using GDPR as an opportunity. These include Patrick Coomans (ranked second), a Belgian entrepreneur who works in cybersecurity and regtech, and GDPRSummit, a summit series on compliance to be held in London next year. This group forms the orange curve visible at the far left of the graph and reflects the market for new players to get involved with teaching compliance.
This graph shows just how far GDPR has brought together immense numbers of technical professionals with interests from all across the spectrum. The question now is how to leverage these different viewpoints, and encourage a discussion which effectively brings out the best in all of them.
4. Reach versus Rank – Leaders
An alternative way to look at the data is to consider the reach and rank of various users on the topic of the GDPR. Looking across all Twitter activity (mentions, retweets, likes, and comments), we can again see that the ICO and CNIL rank highly, with almost equivalent reach (although the ICO has a higher page rank). Uber also stands out – again, we would not expect to see this if we were simply looking at retweets of Twitter users.
Discounting media accounts such as Forbes, the Financial Times, and the BBC News, and major tech companies like Forrester, Office 365, Sage UK, and IBM, we see a number of the same smaller accounts with large flocks, including Privacy Trust, GDPR Summit Series, and Privacy Matters.
Others, however, are new to our lists. These include Marc R Gagne, an Ontario-based privacy lawyer, who falls close to the BBC News in terms of both rank and reach (he also appeared in our March report on Industrial IoT as one of the most interesting and engaging accounts). Museums & Heritage Show, which holds an annual awards ceremony for the cultural sector, comes out with a higher reach and rank than Ticketmaster UK, and a higher rank than YouTube, with only 11.2 K followers. And wso2, an enterprise platform based in Mountain View, California, with 7,625 followers, has the third highest rank of all accounts here, with a reach greater than IBM, IBM Security, and Sage UK.
Yet again, we can see that when it comes to GDPR, it is not simply established names with large, long-term Twitter presences which can have an impact on how the discussion is shaped.
5. Rank versus Connectors – The importance of centrality
Reach and rank are important metrics when considering the relative long-term importance of users. However, it is just as important to focus on the betweenness centrality of accounts: how important are they to the overall network? An account can have high reach, but be relatively isolated on a topic; at the same time, a less well-known account can be highly active as a node through which other connections are made.
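To make the distinction concrete, here is a minimal sketch that computes both measures on a small made-up network. It is not the pipeline used for this report: the graph, the account names, and the damping factor of 0.85 are all assumptions chosen for illustration. Two tightly knit groups of five accounts are linked only through a single account called "connector".

```python
from collections import deque
from itertools import combinations


def undirected(edges):
    """Build an adjacency dict {node: [neighbours]} from undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank; assumes every node has at least one link."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in graph}
        for v, neighbours in graph.items():
            share = damping * rank[v] / len(neighbours)
            for w in neighbours:
                new[w] += share
        rank = new
    return rank


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                        # nodes in order of distance from s
        preds = {v: [] for v in graph}    # predecessors on shortest paths
        sigma = {v: 0 for v in graph}     # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both ends, so halve the totals.
    return {v: total / 2 for v, total in score.items()}


# Hypothetical network: two cliques of five accounts, bridged by one account.
group_a = ["a0", "a1", "a2", "a3", "a4"]
group_b = ["b0", "b1", "b2", "b3", "b4"]
edges = list(combinations(group_a, 2)) + list(combinations(group_b, 2))
edges += [("a0", "connector"), ("connector", "b0")]

graph = undirected(edges)
pr = pagerank(graph)
bc = betweenness(graph)

print("Top connector:", max(bc, key=bc.get))
print("Lowest PageRank:", min(pr, key=pr.get))
```

In this toy network the "connector" account has the highest betweenness, because every shortest path between the two groups runs through it, yet it has the lowest PageRank, because it has only two links. That is the pattern described above: an account can matter a great deal to how a conversation holds together without ranking highly on its own.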
Consider these tables: the International Association of Privacy Professionals (IAPP), @PrivacyPros, which leads on page rank, is only 17th when it comes to top connectors (circled in red). Conversely, Professor Paul Nemitz (Principal Advisor in the Directorate-General for Justice and Consumers of the European Commission) is the highest connector, but does not appear in the top 20 for page rank. Across the tables, we find six pairs in total (accounts that appear in both lists): a notable overlap, though still under half.
What connects these pairs is difficult to tell. Most, like the IAPP, are related to privacy: these include the IAPP’s Daily Dashboard (in light blue), Miss Info Geek (in orange), who appeared on the top flocks list, Privacy Matters (in dark blue), and Lukasz Olejnik, an independent security and privacy consultant and researcher based in London. The final pair belongs to Jan Philipp Albrecht, the German MEP who was the only elected official in our top flocks.
Other top connectors (who do not have very high page rank) have also previously been discussed, including David Clarke and Ann Cavoukian. The same also applies to the other table, where we see the CNIL’s English account, privacy advocate Max Schrems, and the European Data Protection Supervisor.
Different measures present different views of the world: by considering both page rank and betweenness centrality, we hope to gain more holistic insight.
Looking at the most popular tweets, topics, and trends, we can see GDPR compliance is often presented as a chore, with severe consequences for those who fail to complete it. Wariness of the regulation is understandable, not least because of the repeated data breaches suffered by large companies including Uber – and because enforcement of the regulation is likely to be thorough.
And yet GDPR also offers a series of opportunities. As seen in the numerous tweets about guides, there is a demand for experts who can tell companies how to adhere to the new guidelines, and a space for privacy advocates who work on behalf of consumers. Further down the line, we can hope that GDPR will be a boon for responsible data usage by companies who are driven towards enacting privacy by design and by default. Shielding customers’ data does not have to be mutually exclusive with innovation or profitability.