Replica Analytics Ltd. announced the release of Replica Synthesis 3.0, its privacy- and utility-preserving synthetic data generation software, updated with an enhanced user experience that makes it easier for analysts to train generative models and evaluate their utility and privacy. The company unveiled the latest version of its software during a Privacy Enhancing Technologies (PETs) demonstration at the Privacy Symposium, which attracts data protection experts from around the globe to discuss developments in data protection regulations, compliance, and innovative technologies.
“There is a serious data challenge impacting important research and innovation in healthcare, in part due to increasingly strict privacy requirements,” says Dr Khaled El Emam, Replica Analytics’ Senior Vice-President and General Manager. Dr El Emam, who co-founded the company and is also the Canada Research Chair in Medical Artificial Intelligence (AI) at the University of Ottawa, has been developing and deploying PETs for two decades, primarily focusing on healthcare. “Synthetic data is rapidly emerging as a practical PET for addressing data access and sharing problems responsibly, enabling greater data utilisation while meeting the requirements of contemporary regulations such as the General Data Protection Regulation (GDPR) and the US Health Insurance Portability and Accountability Act (HIPAA).”
Synthetic data is generated from real data. A machine learning model first learns the patterns in an original dataset; new data is then generated from that model, and this new data closely reproduces the properties and patterns of the original. Because synthetic data is generated from a model rather than taken directly from the original records, it has low disclosure risk. A growing body of research offers evidence that synthetic data can reduce privacy risks while maintaining data utility.
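As a rough illustration of the fit-then-sample idea described above, the sketch below fits a very simple model (a multivariate Gaussian) to a stand-in "real" dataset and then draws new rows from it. This is not how Replica Synthesis works internally; the model choice, the function names `fit_gaussian` and `sample_synthetic`, and the toy data are all assumptions made for illustration only.

```python
import numpy as np

def fit_gaussian(real):
    """Learn a simple model of the data: column means and covariance."""
    mean = real.mean(axis=0)
    cov = np.cov(real, rowvar=False)
    return mean, cov

def sample_synthetic(mean, cov, n_rows, seed=0):
    """Generate new rows from the fitted model, not from the real records."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_rows)

# Toy stand-in for an original dataset with two correlated numeric columns
# (hypothetical values, not taken from any real source).
rng = np.random.default_rng(42)
real = rng.multivariate_normal(
    [50.0, 120.0], [[100.0, 60.0], [60.0, 225.0]], size=2000
)

mean, cov = fit_gaussian(real)
synthetic = sample_synthetic(mean, cov, n_rows=2000)

# A basic utility check: compare the correlation in each dataset.
real_corr = np.corrcoef(real, rowvar=False)[0, 1]
synth_corr = np.corrcoef(synthetic, rowvar=False)[0, 1]
print(f"real corr={real_corr:.2f}, synthetic corr={synth_corr:.2f}")
```

Production generators typically use far richer models and pair them with formal privacy and utility evaluations; the example only shows the general shape of the workflow.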
In addition to showcasing Replica Synthesis 3.0 during the Privacy Symposium’s PETs demonstration today, the company’s Senior Director of Data Science, Lucy Mosquera, was invited to speak on an International Cooperation and Medical Data Sharing panel a couple of days earlier. There, presenters discussed regulatory issues and challenges related to international transfers of data for clinical research, and Mosquera shared insights about technical enablers, including synthetic data generation and re-identification risk assessments.