Betterdata uses synthetic data to keep real data safe

Betterdata, a Singapore-based startup that uses programmable synthetic data to keep real data safe, announced today that it has raised $1.55 million. The initial round, which it says was oversubscribed, was led by Investible with participation from Franklin Templeton, Xcel Next, Singapore University of Technology and Design, Bon Auxilium, Tenity, Plug and Play and Entrepreneur First.

The startup was founded in 2021 by Dr. Uzair Javaid, its CEO, and chief technologist Kevin Yee, with the goal of making data sharing faster and more secure as data protection regulations increase around the world. The company currently has research and development partnerships with two major universities in Singapore and the United States (it cannot publicly disclose who they are) and its clients include the Shanghai Pudong Development Bank.

Betterdata says it is different from traditional data sharing methods that use data anonymization to destroy data because it uses generative artificial intelligence and privacy engineering instead.

Yee explained to TechCrunch that programmable synthetic data uses deep generative models, including the generative adversarial networks (GANs) used in deepfakes, the transformers used in ChatGPT, and the diffusion models used in Stable Diffusion, to create and augment data sets.

These synthetic data sets have characteristics and structures similar to real-world data without revealing confidential or private information about individuals.

“The idea is to create a dummy version of a real data set that can be safely used for a variety of purposes, including protecting sensitive data, reducing bias, and also improving machine learning models,” Yee said.

Programmable synthetic data helps developers in many ways. Examples include helping them protect sensitive data, comply with data protection regulations like GDPR and HIPAA, increase data availability across teams, create more data to train, test, and validate machine learning models, and address data imbalance issues by creating more records for underrepresented groups or classes.
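To make the data imbalance point concrete, here is a minimal Python sketch of topping up an underrepresented class with synthetic rows. It is not Betterdata's method: the article describes deep generative models, while this toy stand-in samples each numeric feature independently from a Gaussian fitted to the real rows of that class. The function name `balance_with_synthetic` and the sample columns (`amount`, `age`, `label`) are hypothetical.

```python
import random
import statistics
from collections import Counter


def balance_with_synthetic(rows, label_key, seed=0):
    """Add synthetic rows until every class is as large as the biggest one.

    Toy stand-in for a generative model (an assumption for illustration):
    each numeric feature is drawn independently from a Gaussian fitted to
    the real rows of the same class.
    """
    rng = random.Random(seed)
    counts = Counter(row[label_key] for row in rows)
    target = max(counts.values())
    out = list(rows)  # real rows are kept unchanged, synthetic rows are appended
    for label, n in counts.items():
        real = [r for r in rows if r[label_key] == label]
        features = [k for k in real[0] if k != label_key]
        # Fit a mean and (population) standard deviation per feature.
        stats = {
            k: (statistics.mean(r[k] for r in real),
                statistics.pstdev(r[k] for r in real))
            for k in features
        }
        for _ in range(target - n):
            synthetic = {k: rng.gauss(mu, sigma) for k, (mu, sigma) in stats.items()}
            synthetic[label_key] = label
            out.append(synthetic)
    return out


# Hypothetical, hand-made rows: four "ok" records and only two "fraud" records.
real_rows = [
    {"amount": 10.0, "age": 30.0, "label": "ok"},
    {"amount": 12.0, "age": 41.0, "label": "ok"},
    {"amount": 9.5, "age": 25.0, "label": "ok"},
    {"amount": 11.0, "age": 37.0, "label": "ok"},
    {"amount": 950.0, "age": 52.0, "label": "fraud"},
    {"amount": 870.0, "age": 48.0, "label": "fraud"},
]

balanced = balance_with_synthetic(real_rows, "label")
print(Counter(row["label"] for row in balanced))
```

Sampling features independently discards the correlations between columns, which is one reason production systems reach for richer generative models instead.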

Betterdata’s funding will be used to launch its product and to enhance its programmable synthetic data technology stack, including support for single-table, multi-table, and time-series data sets. These are different variations of tabular data sets, and Yee explains that the main differences are their structures and the problems they were created to address.

For example, single-table datasets focus on stand-alone tables, while multi-table datasets are meant to consider relationships between multiple tables, and time-series datasets deal with data collected over time.
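The three shapes can be illustrated with a short Python sketch. The tables, column names, and the two check functions are hypothetical, and the checks reflect an assumption about what a synthetic copy of each shape would need to preserve: valid links between tables for multi-table data, and time ordering for time series.

```python
from datetime import date

# Single table: every row stands alone.
customers = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
]

# Multi-table: rows in a child table refer to rows in a parent table.
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 2, "total": 40.0},
    {"order_id": 12, "customer_id": 1, "total": 15.5},
]

# Time series: observations of the same quantity collected over time.
balances = [
    {"day": date(2023, 1, 1), "balance": 100.0},
    {"day": date(2023, 1, 2), "balance": 75.0},
    {"day": date(2023, 1, 3), "balance": 90.0},
]


def foreign_keys_valid(child, parent, key):
    """Return True if every child row points at an existing parent row."""
    parent_keys = {row[key] for row in parent}
    return all(row[key] in parent_keys for row in child)


def is_time_ordered(series, time_key):
    """Return True if timestamps never go backwards."""
    times = [row[time_key] for row in series]
    return all(a <= b for a, b in zip(times, times[1:]))


print(foreign_keys_valid(orders, customers, "customer_id"))
print(is_time_ordered(balances, "day"))
```

A generator that treated `orders` as a stand-alone table could emit a `customer_id` that matches no customer, which is the kind of problem multi-table support is meant to address.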

Betterdata also plans to hire more people, including sales and marketing employees, and expand beyond Singapore to more of the Asia-Pacific region over the next year or two.

In a statement on Investible’s investment, director Khairu Rejal said: “Betterdata solves one of the biggest problems facing the AI industry today: the lack of high-quality data that also meets privacy requirements. Through its powerful platform, Betterdata generates synthetic data that mimics real-world data without compromising quality and privacy, helping companies meet global privacy and compliance laws at scale.”
