Published on January 26, 2022 08:32
PARIS (AFP) – Privacy safeguards meant to preserve the anonymity of smartphone users are no longer fit for the digital age, a study published Tuesday suggests.
Large amounts of data are harvested from smartphone apps by companies seeking to develop products, conduct research, or target consumers with advertisements.
In Europe and many other jurisdictions, companies are legally required to make this data anonymous, often by removing telltale details such as names or phone numbers.
But the study published in the journal Nature Communications indicates that this is no longer enough to keep identities private.
Researchers say people can now be identified with just a few details about how they communicate with an app like WhatsApp.
One of the paper’s authors, Yves-Alexandre de Montjoye of Imperial College London, told AFP it was time to “reinvent what anonymization means”.
– ‘Rich’ data –
His team collected anonymized data from more than 40,000 mobile phone users, most of which was information from messaging apps and other “interaction” data.
They then “attacked” the data by looking for patterns in those interactions – a technique that could be used by malicious actors.
With only the person’s direct contacts included in the dataset, they found they could identify the person 15% of the time.
When other interactions between these primary contacts were included, they could identify 52% of people.
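The idea behind such an attack can be illustrated with a minimal Python sketch. This is not the authors' actual algorithm, and the data below is invented toy data: it simply shows how a pseudonymized user can be re-identified when the structure of their direct contacts — here, the sorted message counts with each contact — is distinctive enough to act as a fingerprint.

```python
# Hypothetical sketch: re-identifying pseudonymized users by matching
# the structure of their direct contacts across two graph snapshots.

def contact_signature(graph, user):
    """Fingerprint a user by the sorted interaction counts with their contacts."""
    return tuple(sorted(graph[user].values()))

def match_users(known_graph, anonymized_graph):
    """Link each pseudonym to the known user whose contact signature matches,
    but only when that signature is unique among the known users."""
    known_sigs = {}
    for user in known_graph:
        known_sigs.setdefault(contact_signature(known_graph, user), []).append(user)
    matches = {}
    for pseudonym in anonymized_graph:
        candidates = known_sigs.get(contact_signature(anonymized_graph, pseudonym), [])
        if len(candidates) == 1:  # unambiguous fingerprint
            matches[pseudonym] = candidates[0]
    return matches

# Known interactions: user -> {contact: message count} (toy data)
known = {
    "alice": {"bob": 12, "carol": 3},
    "bob":   {"alice": 12, "carol": 7},
    "carol": {"alice": 3, "bob": 7},
}
# The same interactions after re-pseudonymisation
anonymized = {
    "u1": {"u2": 12, "u3": 3},
    "u2": {"u1": 12, "u3": 7},
    "u3": {"u1": 3, "u2": 7},
}

print(match_users(known, anonymized))
# -> {'u1': 'alice', 'u2': 'bob', 'u3': 'carol'}
```

Removing names changes nothing here: the pattern of who talks to whom, and how often, survives pseudonymisation and is enough to link the records.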
“Our results provide evidence that disconnected and even re-pseudonymised interaction data remains identifiable even over long periods of time,” the researchers from the UK, Switzerland and Italy wrote.
“These findings strongly suggest that current practices may not meet the standard of anonymization set by (European regulators), particularly with respect to linkability criteria.”
De Montjoye stressed that the intention was not to criticize an individual company or a legal regime.
Rather, he said the algorithm they used simply provided a more robust way to test what we consider to be anonymized data.
“This dataset is so rich that the traditional way we used to think about anonymization… doesn’t really work anymore,” he said.
“That doesn’t mean we have to give up anonymization.”
He said one promising new approach is to tightly restrict access to large datasets, allowing only simple question-and-answer interactions with the data.
This would eliminate the need to classify a dataset as “anonymized” or not.
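A hypothetical sketch of what such question-and-answer access might look like, assuming a noisy aggregate-query interface in the spirit of differential privacy (the class and its parameters below are invented for illustration, not a system described in the study): analysts never see raw records, only approximate answers to counting questions.

```python
# Hypothetical sketch: analysts query the data but never see raw records.
import random

class QueryInterface:
    def __init__(self, records, noise_scale=0.5, seed=None):
        self._records = records      # raw data stays behind the interface
        self._noise = noise_scale
        self._rng = random.Random(seed)

    def count(self, predicate):
        """Answer 'how many records satisfy this condition?' with added noise."""
        true_count = sum(1 for r in self._records if predicate(r))
        # Difference of two exponentials = Laplace noise, masking any
        # single individual's contribution to the answer.
        lam = 1.0 / self._noise
        return true_count + self._rng.expovariate(lam) - self._rng.expovariate(lam)

records = [{"age": a} for a in (21, 34, 45, 52, 67)]
api = QueryInterface(records, seed=42)
print(round(api.count(lambda r: r["age"] >= 40)))  # noisy answer near the true count of 3
```

Because only aggregate, noise-protected answers leave the system, the question of whether the underlying dataset counts as "anonymized" no longer has to be settled up front.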