Are Privacy Laws Driving Privacy-Protecting Analytics?


Clubhouse tapped into many people's yearning for an outlet for personal expression. Launched in April 2020, the audio-only app, which lets users set up or attend discussion rooms on topics of their choosing, hit 10 million users worldwide in February 2021. But Clubhouse also raises serious privacy concerns: the app collects reams of data on its users. According to its privacy policy, this goes beyond name and phone number to include device information, location, IP address, what users listen to, when they log in, and for how long. In practice, the company can collect data about virtually anything users do on Clubhouse.

It also records all the audio. It is unclear exactly what the app does with the data, or might do in future, but the privacy policy says it may share data with a long list of third-party companies. 

Clubhouse is only the latest in a long line of companies raising privacy concerns. Headlines are dominated by high-profile privacy scandals, and there is no indication that this will slow down anytime soon.

Advances in data science and the creation of incredibly powerful new platforms offer large benefits to their users, but as the demand for data grows, so do the calls for privacy. Privacy advocates agree that new strategies, and support for advancing privacy-preserving technologies, are urgently needed.

These are the capabilities that enable information to be shared and analysed in a far more protected manner: techniques such as federated learning, homomorphic encryption, trusted execution environments, and zero-knowledge proofs.

In recent years, multiple demonstration and pilot projects have shown the value of these approaches, as well as the importance of clearly communicating to public administrators the role, need, and potential of privacy-preserving technologies in practice.

Federated learning

Google introduced the idea of federated learning (FL) in 2017. Its key ingredient is that it lets data scientists train a shared statistical model across decentralised devices or servers, each holding its own local data set. It is a new breed of Artificial Intelligence (AI) that brings learning to the edge, directly on-device. FL is preferred in use cases where security and privacy are the key concerns, and a clear view of the risk factors enables an implementer to build a secure FL environment.

It enables computers to collaboratively learn while keeping their data on their own devices. Instead of sending data to one centralised machine, local computers train a shared algorithm and send back only information about what they have learned. Nvidia has introduced FL on its autonomous driving platform: the company's DGX edge platform retrains the shared models at each OEM with local data, and the local training results are sent back to the FL server over a secure link to update the shared model.
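To make the mechanics concrete, below is a minimal federated-averaging sketch in Python. It is an illustration only, with made-up clients and a simple linear model (not Google's or Nvidia's implementation): each client trains on its own private shard of data, and only the resulting model weights are sent back and averaged into the shared model.

```python
# Minimal federated-averaging sketch: raw data never leaves the clients,
# only model weights are shared and averaged on the server.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=20):
    """One client's round of training on data that stays on-device."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server-side step: weight each client's update by its data volume."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []                                 # three clients, each with a private shard
for n in (40, 60, 25):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    clients.append((X, y))

global_w = np.zeros(2)
for _ in range(10):                          # communication rounds
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])

print(global_w)                              # converges close to [2.0, -1.0]
```

Only the weight vectors cross the network; the server never sees any client's raw examples, which is the property that makes FL attractive when privacy is the key concern.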


Homomorphic encryption

Conventional encryption systems protect data in transit and at rest, but not while it is being processed, so sensitive data can still leak to an intermediary service such as a cloud server. Homomorphic encryption (HE) is a special kind of encryption mechanism that addresses this security and privacy gap. Unlike public-key encryption, which has three procedures (key generation, encryption, and decryption), an HE scheme has four, adding an evaluation algorithm. HE allows third-party service providers to perform certain types of operations on a user's encrypted data without ever decrypting it, preserving the privacy of the underlying data.
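As a rough illustration of that evaluation step, the sketch below implements a toy Paillier-style additively homomorphic scheme in Python. The key sizes are deliberately tiny and this is not production cryptography; it only shows that multiplying two ciphertexts produces a ciphertext of the sum, so a server can aggregate values it cannot read.

```python
# Toy Paillier-style additively homomorphic encryption (tiny demo primes,
# illustration only). Enc(a) * Enc(b) mod n^2 decrypts to a + b, so a third
# party can add encrypted values without seeing the plaintexts.
import random
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def keygen(p=293, q=433):                    # tiny primes purely for demonstration
    n = p * q
    g = n + 1                                # standard simplification g = n + 1
    lam = lcm(p - 1, q - 1)
    x = pow(g, lam, n * n)
    mu = pow((x - 1) // n, -1, n)            # inverse of L(g^lam mod n^2) mod n
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n

pub, priv = keygen()
c1, c2 = encrypt(pub, 42), encrypt(pub, 17)
c_sum = (c1 * c2) % (pub[0] ** 2)            # homomorphic addition on ciphertexts
print(decrypt(pub, priv, c_sum))             # 59, recovered without decrypting the inputs
```

The cloud server in this picture only ever handles c1, c2 and their product; the private key, and therefore the plaintexts, stay with the data owner.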

Trusted execution environment

Trusted execution environments, or TEEs, are hardware-isolated areas of a device that keep prying eyes from seeing inside them, providing guarantees of integrity and confidentiality. Code running inside a TEE can be trusted because the environment can ignore threats from the "unknown" rest of the device. That trust requires that all TEE-related assets (the code, the underlying trusted OS, and its support code) have been installed and started through a process that guarantees the initial state expected by the designers: everything is signature-checked, immutable, or held in isolation. Popular use cases are private machine learning, secure key generation, and solving cryptographic constructs such as Yao's Millionaire Problem.
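The snippet below is a purely conceptual Python simulation of the Millionaire Problem use case: the Enclave class merely stands in for attested, isolated code that both parties trust but cannot inspect or tamper with, and only the comparison result ever crosses the boundary. Real TEEs enforce this isolation in hardware rather than in a class.

```python
# Conceptual simulation of Yao's Millionaire Problem "solved" by trusted code.
# No real enclave here -- Enclave stands in for measured, signature-checked
# code running in hardware isolation.
class Enclave:
    def __init__(self):
        self._alice = None
        self._bob = None

    def submit_alice(self, wealth: int) -> None:
        self._alice = wealth              # value never leaves the enclave

    def submit_bob(self, wealth: int) -> None:
        self._bob = wealth

    def who_is_richer(self) -> str:
        # Only the comparison result crosses the enclave boundary.
        if self._alice == self._bob:
            return "equal"
        return "alice" if self._alice > self._bob else "bob"

enclave = Enclave()
enclave.submit_alice(1_200_000)
enclave.submit_bob(950_000)
print(enclave.who_is_richer())            # "alice"; neither party learns the other's wealth
```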

Zero-knowledge proofs

Researchers at MIT first developed the concept of a zero-knowledge proof (ZKP) in the 1980s to allow data to be verified without revealing that data. Zero-knowledge techniques are mathematical methods for verifying claims without sharing or exposing the underlying data. Each transaction has a "prover" and a "verifier". In a transaction using ZKPs, the prover attempts to convince the verifier that some statement is true without telling the verifier anything else about it. By providing only the final output, the prover shows that they can compute something without revealing the input or the computation itself; the verifier learns only the output. Think of a payment app checking whether you have enough money in your bank account to complete a transaction, without learning anything else about your balance. In this way, zero-knowledge proofs can help broker all sorts of sensitive agreements, transactions, and interactions in a more private and secure way, and they have the potential to revolutionise how data is collected, used, and transacted.
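A classic, compact example of the prover/verifier pattern is a Schnorr-style proof of knowledge of a discrete logarithm, made non-interactive with the Fiat-Shamir heuristic. The Python sketch below uses deliberately small toy parameters for illustration; real deployments use large, standardised groups.

```python
# Schnorr-style zero-knowledge proof of knowledge of x with y = g^x mod p,
# made non-interactive via the Fiat-Shamir heuristic. Toy parameters only.
import hashlib
import random

p = 467                  # safe prime: p = 2q + 1
q = 233                  # prime order of the subgroup we work in
g = 4                    # generator of the order-q subgroup of Z_p*

def fiat_shamir_challenge(*values: int) -> int:
    """Derive the verifier's challenge by hashing the public transcript."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(secret_x: int, public_y: int):
    """Prover: show knowledge of x without revealing anything about it."""
    k = random.randrange(1, q)           # ephemeral nonce
    t = pow(g, k, p)                     # commitment
    c = fiat_shamir_challenge(g, public_y, t)
    s = (k + c * secret_x) % q           # response
    return t, s

def verify(public_y: int, t: int, s: int) -> bool:
    """Verifier: check the proof using only public values."""
    c = fiat_shamir_challenge(g, public_y, t)
    return pow(g, s, p) == (t * pow(public_y, c, p)) % p

x = random.randrange(1, q)               # prover's secret
y = pow(g, x, p)                         # public value
t, s = prove(x, y)
print(verify(y, t, s))                   # True; the verifier learns nothing about x
```

The verifier only ever sees y, t, and s, yet the check g^s = t * y^c mod p convinces them the prover really knows x, which is exactly the "prove without revealing" property described above.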


Why it matters

These technologies provide a privacy "building block" that will unlock new applications, businesses, and ways of collaborating. The need for, and availability of, privacy-preserving technologies is growing, especially among companies in the financial services, healthcare, and retail sectors, and the companies that collect data can be expected to keep investigating ways to analyse that data responsibly. It is imperative that companies take a closer look at the solutions that interact with their data and ask solution providers how they are enhancing their offerings to align with current and proposed regulations. In cloud environments especially, companies should ask their vendors how customer data can be processed without compromising privacy. The time to start having these conversations is now.