New Study Finds Over 96% of Computer Vision (CV) Teams Already Using Synthetic Data for Training and Testing of Visual Machine Learning Models

A survey by Datagen reveals widespread adoption of synthetic data throughout the CV field to advance AI/ML applications

Datagen, the leader in synthetic data generation on a mission to bring data simulation to every computer vision engineer, today announced the release of a new research study, “Synthetic Data: Key to Production-Ready AI in 2022,” exploring training data in the field of Computer Vision (CV). The study reveals a once-fragmented field beginning to coalesce around the promise of synthetic data to help mitigate frequent project delays and cancellations.

The study emphasizes that training data has become a significant stumbling block for computer vision professionals, who cited a number of data-related complications hindering their organizations’ progress in CV.

Among the data-related issues experienced, the most prevalent were:

  • Wasted time and/or resources caused by a need to retrain the system often (52%)
  • Poor annotation resulting in quality issues (48%)
  • Poor data coverage of the intended application’s domain (47%)
  • Lack of sufficient amount of data (44%)

All four of these problems can seriously jeopardize a project’s progress, making their widespread presence of significant concern to CV teams. As a result of these issues, the overwhelming majority of computer vision teams struggle with frequent, lengthy project delays, and even outright cancellations. Inadequate training data has led to an environment in which:

  • 99% of respondents have experienced project cancellations
  • 80% have experienced project delays lasting at least 3 months
  • 33% have experienced project delays lasting 7 months or more

The frequency, length, and ubiquity of data-related project disruptions in the field of computer vision are immense. However, the study also identified several trends that indicate a growing appetite for synthetic data. Most notably, a staggering 96% of computer vision teams reported already using synthetic data in the training and testing of their computer vision models.

Based on the survey findings, this surge in synthetic data adoption can be attributed to the fact that its many benefits are both broadly understood and broadly experienced by the computer vision community. For example, when asked about the primary motivation behind their organization’s use of synthetic data, CV teams cited testing, training, and addressing edge cases in near-equal measure. Similarly, when asked about their first-hand experience, respondents reported the following benefits of synthetic data:

  • Reduced time-to-production (40%)
  • Elimination of privacy concerns (46%)
  • Reduced bias (46%)
  • Fewer annotation and labeling errors (53%)
  • Improvements in predictive modeling (56%)

“Synthetic data is the future of data. This is the new way to control and consume the data our AI systems need,” said Ofir Chakon, founder and CEO of Datagen. “As simulation gets better over time, with all its benefits, it will take over the place of labor-intensive manual data collection that is no longer scalable at the speed the world is evolving.”

Globe Newswire

GlobeNewswire is one of the world's largest newswire distribution networks, specializing in the delivery of corporate press releases, financial disclosures, and multimedia content to the media, investment community, individual investors, and the general public.
