Can we protect privacy in the era of AI?
When using data, we often have to trade utility for privacy. We must weigh the benefit of using the data for our companies and customers against preserving the privacy of the data.
It may be patient data from a clinical trial or top-secret data on our products that we don’t want to risk exposing. But not using the data doesn’t help us or our customers make things better. Nor do we want to ignore the privacy implications of the data and face the legal, reputational, and financial repercussions.
We cannot have our cake (privacy) and eat it (utility).
But this is changing. New privacy-enhancing technologies allow us to use the data without compromising privacy. They will allow us to unleash large amounts of data we already have but were too afraid to use.
Federated learning is one of these technologies. It enables AI models to be trained across a distributed network by sharing model updates instead of data. The MELLODDY consortium uses federated learning to let ten life science competitors collaborate without exposing their data; together they have created more accurate predictive models to improve their search for new drugs. Capgemini has used federated learning in collaboration with multiple hospitals in Spain to improve the diagnosis of COVID-19 from lung X-rays while preserving patients’ privacy.
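To make this concrete, here is a minimal sketch of federated averaging in Python with NumPy. The three data-holding sites, their synthetic data, and the simple linear model are all hypothetical; the point is only that each round shares model weights with a coordinator, never the underlying records.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])     # hypothetical "ground truth" the sites share

def make_site_data(n=100):
    """Generate one site's private dataset (e.g. one hospital's records)."""
    X = rng.normal(size=(n, 3))
    y = X @ true_w + 0.1 * rng.normal(size=n)
    return X, y

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Train locally with gradient descent; the raw data (X, y) never leaves the site."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # gradient of the mean squared error
        w -= lr * grad
    return w

sites = [make_site_data() for _ in range(3)]   # three separate data holders

# Federated averaging: the coordinator only ever sees model weights, never records.
global_w = np.zeros(3)
for _ in range(10):
    local_ws = [local_update(global_w, X, y) for X, y in sites]
    global_w = np.mean(local_ws, axis=0)       # aggregate the updates, not the data

print("Global model weights after 10 rounds:", global_w)
```

Real deployments add secure aggregation and handle uneven data sizes, but the shape of the protocol is the same: train locally, share only the model.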
Differential privacy prevents individuals from being identified in a data set by carefully adding statistical noise. Our phones use differential privacy to collect statistics on emoji use, health data, and search queries without revealing any individual’s activity. The US Census Bureau used differential privacy to protect individuals’ identities when it published the 2020 US Census results, because simple anonymization breaks down once multiple data sets are combined.
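A rough sketch of the core idea, the Laplace mechanism, is below; the count, the attribute, and the epsilon values are made up, and real deployments such as Apple’s or the Census Bureau’s use more elaborate mechanisms built on the same principle.

```python
import numpy as np

rng = np.random.default_rng(42)

def laplace_count(true_count, epsilon):
    """Release a count under epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one person changes it
    by at most 1), so Laplace noise with scale 1/epsilon is enough.
    """
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

true_count = 1234   # hypothetical: how many people in a dataset share some attribute

# A smaller epsilon (privacy budget) means stronger privacy and a noisier answer.
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps:>4}: released count = {laplace_count(true_count, eps):.1f}")
```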
More technologies are coming down the line. Homomorphic encryption enables computation directly on encrypted data, without ever decrypting it. Although it still has performance and maturity limitations for AI workloads, it is already used to preserve the privacy of database queries and their results. Microsoft updated the password monitor feature in Edge to use homomorphic encryption, allowing it to check whether any of your passwords have been compromised without ever seeing the passwords themselves.
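As a hedged illustration of the principle, the sketch below uses the open-source python-paillier library (`phe`), which implements the additively homomorphic Paillier scheme. It is far simpler than the fully homomorphic encryption behind products like Edge’s password monitor, and the salary figures are invented, but the idea is the same: a server computes on ciphertexts it cannot read.

```python
# pip install phe   (python-paillier, an additively homomorphic encryption scheme)
from phe import paillier

# The data owner generates a key pair and encrypts the values locally.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)
salaries = [52_000, 61_500, 48_250]              # hypothetical confidential values
encrypted = [public_key.encrypt(s) for s in salaries]

# An untrusted server can add the ciphertexts and scale the sum
# without ever being able to decrypt them.
encrypted_total = sum(encrypted[1:], encrypted[0])
encrypted_average = encrypted_total * (1 / len(salaries))

# Only the holder of the private key can recover the result.
print("Average salary:", private_key.decrypt(encrypted_average))
```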
By 2030, these technologies will enable us to use sensitive patient data to create better drugs or use confidential data in our AI models to create better products.
Federated learning will enable more collaboration between companies, locations, and individual users. Differential privacy gives us a tunable dial, the privacy budget, for how much privacy we keep and how much accuracy we give up. Homomorphic encryption will become performant and versatile enough to let us train and run production AI models on encrypted data.
Together, these technologies will enable us to build more useful AI tools, allowing us to have our cake and eat it.