GDPR defines personal data as an ‘asset’, yet despite this modern valuation, most of us have unwittingly – or unthinkingly – given it away in exchange for online services. As such, the average digital footprint is spread far and wide.
If you can remember all the companies that hold pieces of your digital information, you can approach each one individually and demand they delete it – but you may be surprised at just how many organisations that is. While the mind may immediately leap to Facebook, Google and the other tech giants, there are also plenty of obscure entities. That time you bought a hat at Disneyland, for example, the shop collected more than just a payment.
How do you start tracing companies you don’t remember engaging with? The answer is in your inbox and involves a little AI knowhow. This is the basic premise of Mine, an Israeli startup that uses machine learning to make the GDPR’s ‘right to be forgotten’ serviceable.
Gold Mine
Mine was founded by CEO Gal Ringel and CTO Gal Golan, who met in the cyber security unit of the Israeli army, and CPO Kobi Nissan, who previously worked for Candy Crush developer King. While many businesses saw GDPR as a hindrance when it came into force in 2018, for Mine it was an integral part of its inception.
“When we started to research the right to be forgotten, we quickly realised that we couldn’t find one tool that makes the GDPR accessible for the average person,” Ringel tells IT Pro. “Regulations are complex and difficult for the average user to understand. With that goal in mind, we quickly realised that we needed to come up with a really simple app that uncovers what companies have your personal data, to make your digital footprint tangible, for the first time, so you can almost touch it.”
Ringel estimates that around 350 companies are waiting to be found in the average person’s email. For work accounts the figure is much lower, falling somewhere between 80 and 100. A staggering 90% of the companies that have your data can be found in your inbox, spam filter or even your trash folder. What’s more, the key to finding out who has your personal details isn’t in the body of the email, but in the subject line.
With Google Cloud’s AI platform, Mine has built machine learning models trained on two datasets. The first consists of emails that have been tagged as different types of interactions, with the model learning specifically from subject lines. This process has been repeated in 12 different languages so that the service works for users around the world, not just in Israel, and can spot traces of companies from Germany, France, Italy, Spain and many more.
“We search for these traces and then reflect it back to you,” Ringel explains. “So basically the AI understands what the interaction you had with a company was just based on the email’s subject. So for example, ‘Welcome to Airbnb’, that interaction is a sign up, and ‘Your purchase from Amazon’, means you’ve bought something.
“We worked really hard for almost a year to develop machine learning that is non-intrusive, but basically scans your inbox by only looking at the email subject. So it never actually reads the context of the email, because we don’t want to see the process of how they collect the personal data and we also don’t want to be collecting any email data.”
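To make the subject-only idea concrete, here is a minimal sketch in Python. It is not Mine's model: hand-written keyword rules stand in for the trained classifier, and the names `RULES`, `classify_subject` and `classify_message` are hypothetical. What it does show is the design constraint Ringel describes, namely that only the Subject header is ever read.

```python
import re
from email import message_from_string

# Hypothetical keyword rules standing in for a trained classifier.
# Each rule maps a subject-line pattern to an interaction type.
RULES = [
    ("sign_up", re.compile(r"\b(welcome to|confirm your (account|email)|verify your)\b", re.I)),
    ("purchase", re.compile(r"\b(your (purchase|order)|receipt|invoice)\b", re.I)),
]


def classify_subject(subject: str) -> str:
    """Return the first interaction type whose pattern matches the subject."""
    for label, pattern in RULES:
        if pattern.search(subject):
            return label
    return "unknown"


def classify_message(raw_message: str) -> str:
    """Classify a raw email using only its Subject header; the body is never inspected."""
    message = message_from_string(raw_message)
    return classify_subject(message.get("Subject", ""))


print(classify_subject("Welcome to Airbnb"))
print(classify_subject("Your purchase from Amazon"))
```

A real system would replace the rules with a model trained on tagged subject lines in each supported language, but the interface (subject in, interaction type out) could stay the same.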
How Mine understands what data you’ve given to that company is down to the second dataset, which has been trained on thousands of privacy policies. Under the rules of the GDPR, companies have to be transparent about the ‘what’ and ‘why’ of data collection. So for example, Airbnb collects your data for two reasons: signing up to its service and taking payments. So it will have your name, address, email, mobile phone number, a copy of your passport, plus payment details if you’ve ever used its service.
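The step from "interaction type" to "data held" can be pictured as a lookup. The Python sketch below is illustrative only: the table is written by hand from the Airbnb example above rather than extracted from a privacy policy by a model, the split of items between sign-up and payment is an assumption, and `POLICY_DATA` and `data_held` are hypothetical names.

```python
from typing import Dict, List, Set

# Hypothetical, hand-written stand-in for what a model might extract
# from a company's privacy policy: interaction type -> data collected.
POLICY_DATA: Dict[str, Dict[str, List[str]]] = {
    "airbnb": {
        "sign_up": ["name", "address", "email", "mobile phone number", "copy of passport"],
        "purchase": ["payment details"],
    },
}


def data_held(company: str, interactions: Set[str]) -> List[str]:
    """List the data a company probably holds, given the interactions seen in the inbox."""
    policy = POLICY_DATA.get(company.lower(), {})
    held: List[str] = []
    for interaction, items in policy.items():
        if interaction in interactions:
            # Avoid listing the same data item twice.
            held.extend(item for item in items if item not in held)
    return held


print(data_held("Airbnb", {"sign_up"}))
print(data_held("Airbnb", {"sign_up", "purchase"}))
```

A company with no entry in the table simply yields an empty list, which is one way the "we haven't analysed this policy yet" case could surface.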
Sign up
Naturally, to find all this out, you need to sign up to one more service: Mine’s. It requires your email address to perform its basic function and your first and last name so that it can contact each company on your behalf. Upon registering your email, the company says, the machine learning models get to work and within 30 seconds you’ll be presented with 40 or so companies that have your data. This expands to hundreds after roughly 48 hours, and the service continues to notify you as and when you sign up to more.
All the usual suspects will be there – Netflix, LinkedIn, Amazon – along with an assortment of unknown and forgettable one-time services. Underneath each will be the data you handed over when you signed up and a button to take action. Click on this and Mine sends a request on your behalf. Some companies, however, will be listed as “action unavailable”.
“The reason you see action unavailable can be for two reasons,” Ringel explains. “First, these are companies that we still haven’t got round to analysing their privacy policies and learned their data structure. And the second could be that we didn’t find any contact information within their privacy policy. So we don’t know who to approach. When you want to reclaim from a company, we automatically shoot an email to its data protection officer or its privacy officer.”
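The two conditions Ringel describes amount to a simple decision rule. The Python sketch below is an assumption about how that could look, not Mine's implementation: `CompanyRecord`, `action_status` and `draft_erasure_request` are hypothetical names, and the request wording is illustrative (Article 17 is the GDPR's right to erasure).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CompanyRecord:
    name: str
    policy_analysed: bool                  # has the privacy policy been processed yet?
    privacy_contact: Optional[str] = None  # DPO or privacy officer address, if one was found


def action_status(record: CompanyRecord) -> str:
    """Apply the two 'action unavailable' conditions in order."""
    if not record.policy_analysed:
        return "action unavailable: privacy policy not yet analysed"
    if not record.privacy_contact:
        return "action unavailable: no privacy contact found"
    return "action available"


def draft_erasure_request(record: CompanyRecord, full_name: str, user_email: str) -> Optional[dict]:
    """Build an erasure request addressed to the company's privacy contact, or None if unavailable."""
    if action_status(record) != "action available":
        return None
    return {
        "to": record.privacy_contact,
        "subject": f"Erasure request under Article 17 GDPR: {full_name}",
        "body": (
            f"I, {full_name} ({user_email}), request the erasure of the personal data "
            f"that {record.name} holds about me."
        ),
    }
```

The user's name and email address are the only personal inputs here, which matches what the sign-up step asks for.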
As a company turning the GDPR into a service, Mine will come under more scrutiny than most when it comes to compliance. The company’s own privacy policy has no margin for error.
“The only data we do store is your email address, which you signed up with, and a list of the companies’ names we identified in your footprint,” Ringel confirms. “You can easily request a copy of the data we hold about you to see exactly what we keep. We are fully transparent on everything we are doing and, of course, in line with GDPR regulations.”
Main image copyright: Mine.