Austrian synthetic data startup MOSTLY AI today announced that it has raised a $25 million Series B round. British VC firm Molten Ventures led the round, with participation from new investor Citi Ventures. Two existing investors also returned: Munich-based 42CAP, and Berlin-based Earlybird, which had led MOSTLY AI's $5 million Series A round in 2020.
Synthetic data is fake data, but not random: MOSTLY AI uses artificial intelligence to achieve a high degree of fidelity to its clients' databases. Its data sets "look just as real as a company’s original customer data with just as many details, but without the original personal data points," the company says.
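To make "fake, but not random" concrete, here is a minimal Python sketch. It is not MOSTLY AI's method, which the company describes only as AI-based. It swaps in a much simpler technique: fitting a two-column Gaussian to a toy table and sampling new rows from it. The table, the column meanings and the function names (`fit_gaussian`, `sample_synthetic`) are all hypothetical, made up for illustration.

```python
import math
import random
import statistics


def fit_gaussian(rows):
    """Estimate means, standard deviations and correlation of two numeric columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    # Sample covariance (n - 1 denominator, matching statistics.stdev).
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return mx, my, sx, sy, cov / (sx * sy)


def sample_synthetic(params, n, seed=0):
    """Draw n new rows from the fitted two-column Gaussian."""
    mx, my, sx, sy, rho = params
    rng = random.Random(seed)
    # Guard against tiny negative values caused by floating-point rounding.
    residual = math.sqrt(max(0.0, 1.0 - rho ** 2))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        # Mixing z1 into the second column reproduces the fitted correlation.
        y = my + sy * (rho * z1 + residual * z2)
        rows.append((x, y))
    return rows


# Hypothetical "original" customer table: (age, yearly spend in dollars).
original = [
    (23, 1200.0), (31, 1850.0), (38, 2400.0), (45, 2600.0),
    (52, 3900.0), (60, 4100.0), (29, 1500.0), (41, 3000.0),
]

params = fit_gaussian(original)
synthetic = sample_synthetic(params, n=1000)
synthetic_params = fit_gaussian(synthetic)
```

In this toy setup, the synthetic rows keep roughly the same link between age and spend as the original table, yet none of them is a copy of an original row. Matching statistics is not by itself a privacy guarantee, and a production system would have to handle many more columns, data types and privacy checks than this sketch does.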
Talking to TechCrunch, MOSTLY AI CEO Tobias Hann said that the company plans to use the proceeds to push the boundaries of what its product can do, grow its team and gain more customers both in Europe and in the U.S., where it already has offices in New York City.
MOSTLY AI was founded in Vienna in 2017, and the General Data Protection Regulation (GDPR) was implemented across the EU one year later. The resulting demand for privacy-preserving solutions and the concomitant rise of machine learning have created significant momentum for synthetic data. Gartner predicts that by 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated.
MOSTLY AI's typical clients are Fortune 100 banks and insurers, as well as telcos. These three highly regulated sectors drive most of the demand for synthetic tabular data, alongside healthcare.
Unlike some of its competitors, MOSTLY AI hasn't focused on healthcare in the past, but that could change. "It's certainly something that we are watching closely and we are actually starting some pilot projects this year," the CEO said.
The democratization of AI means that synthetic data will eventually be used well beyond Fortune 100 companies, Hann told TechCrunch. His company therefore plans to serve smaller organizations and a wider range of sectors in the future. But until now, it has made sense for MOSTLY AI to focus on enterprise-level clients.
At the moment, enterprise companies are the ones that have the budgets, need and sophistication to work with synthetic data, Hann said. To match their expectations, MOSTLY AI obtained ISO certifications.
Talking to Hann, one thing becomes clear: While the startup has a solid technical footing, it is equally invested in the commercialization of its technology and in the business value it can add for its clients. "MOSTLY AI is leading this emerging and rapidly-growing space in terms of both customer deployments and expertise," Molten Ventures' investment director Christoph Hornung said.
The need to comply with privacy laws such as the GDPR and CCPA clearly drives demand for synthetic data, but it's not the only factor at play. For instance, demand in Europe is also driven by a wider cultural context, while in the U.S., it also results from a desire to innovate. Use cases can include advanced analytics, predictive algorithms, fraud detection and pricing models, all without data that can be traced back to specific users.
"Many companies are proactively approaching the space because they understand that customers value privacy," Hann said. "These companies understand that they can also gain a competitive advantage when dealing and working with data in a privacy-preserving way."
Seeing more U.S. companies wanting to adopt synthetic data in innovative ways is the key reason MOSTLY AI wants to grow its team in the U.S. But it is also recruiting more generally, both in Vienna and remotely. Its plan is to increase its headcount from 35 to 65 people by the end of the year.
Hann expects 2022 to be "the year where synthetic data will take off," and beyond this year, "a really strong decade for synthetic data." This will be supported by growing demand for responsible AI, articulated around key concepts such as AI fairness and explainability. Synthetic data helps answer these challenges. "It enables enterprises to augment and de-bias their data sets," Hann said.
Machine learning aside, MOSTLY AI sees lots of potential for synthetic data to be leveraged in software testing. Supporting these use cases requires making synthetic data accessible not only to data scientists, but also to software engineers and quality testers. It's with them in mind that MOSTLY AI released version 2.0 of its platform a few months ago. "MOSTLY AI 2.0 can be implemented on premise or in a private cloud, and adapts to different data structures of the company using it," the company wrote at the time.
"We are clearly a B2B software infrastructure company," Hann said. Both in its Series A and B rounds, the company looked for investors who understood that approach.
Molten Ventures being a publicly listed VC and consequently not subject to typical funding cycles also carried some weight, Hann confirmed when I asked. "Having this long-term commitment from a partner is something that was very appealing to us, because it's a little more flexible."
It doesn't hurt either that Citi Ventures is the venture arm of Citigroup, and that it is headquartered in the U.S. "We're significantly increasing the team in the U.S., and it's always great to also have a U.S.-based investor that can help with network and relationships there," Hann said.
With $25 million in new funding and an increased U.S. presence, MOSTLY AI will now have more resources to compete against other companies in its segment of the synthetic data space. These include Tonic.ai, which raised a $35 million Series B last September; Gretel AI, which disclosed a $50 million Series B round last October; and seed-funded British startup Hazy, as well as players that focus on specific verticals.
"We do see more and more players emerging in the space and in the market in general, so it certainly shows that there's a lot of interest there," Hann said.