The synthetic data market is bigger than you think – TechCrunch


“By 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated.” This is a Gartner prediction that you will find in almost any article, presentation or press release related to synthetic data.

We repeat this quote here despite its ubiquity because it speaks volumes about the total addressable synthetic data market.

Let’s unpack it. First, describing synthetic data as “synthetically generated” might sound tautological, but it’s also pretty clear: we’re talking about data that is artificial, or fake, and created rather than collected in the real world.

Then there is the core of the prediction: that synthetic data will be used in the development of most AI and analytics projects. Since such projects are on the rise, the corollary is that the synthetic data market is also set to grow.

Last but not least is the time horizon. In our world of startups, 2024 is almost here, and the folks at Gartner already have a longer-term prediction: some of its team members published a study called “Forget Your Real Data – Synthetic Data Is The Future of AI”.

“The future of AI” is the kind of promise investors love to hear, so it’s no surprise that checks are pouring into synthetic data startups.

In 2022 alone, MOSTLY AI raised a $25 million Series B funding round led by Molten Ventures; Datagen landed a $50 million Series B led by Scale Venture Partners, and Synthesis AI pocketed a $17 million Series A.

Synthetic data startups that have raised significant funds already serve a wide range of industries, from banking and healthcare to transportation and retail. But they expect the use cases to continue to grow, both in new industries and in those where synthetic data is already common.

To understand what’s happening, but also what’s to come if synthetic data is more widely adopted, we’ve spoken to various CEOs and VCs over the past few months. We learned about the two main categories of synthetic data companies, the industries they cater to, how to size the market, and more.

The tip of the iceberg

Quiet Capital founding partner Astasia Myers is among the investors who are bullish on synthetic data and its applications. She declined to reveal whether she had invested in this space, but said that “there is much to be excited about in the world of synthetic data.”

Why this enthusiasm? “Because it gives teams faster access to data in a secure way at lower cost,” she told TechCrunch.

“We can simply say that the TAM of synthetic data and the TAM of data will converge.” Ofir Zuk (Chakon)

Access to large amounts of data has become essential for machine learning teams, and real data is often not up to the task, for various reasons. This is the gap that synthetic data startups hope to fill.

These startups focus on two main contexts: structured data and unstructured data. The former refers to the type of data sets found in tables and spreadsheets, while the latter covers what we might call media files, such as audio, text, and visual data.

“It makes sense to distinguish between structured and unstructured synthetic data companies,” Myers said, “because the type of synthetic data is applied to different use cases and therefore different buyers.”

