Generative AI Startup Twelve Labs Works with AWS to Make Videos as Searchable as Text

Leading startup makes ‘needle-in-a-haystack’ video searches possible using natural language, turning the world’s largest unsearchable data source—video—into a trove of accessible information

Developers can now find specific movie scenes from decades of video archives, or assess video footage of athletes’ performances, with conversational queries

Twelve Labs uses AWS to train its multimodal foundation models up to 10% faster, while reducing training costs by more than 15%

At AWS re:Invent, Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company, announced that Twelve Labs, a startup that uses multimodal artificial intelligence (AI) to bring human-like understanding to video content, is building and scaling its proprietary foundation models on AWS. Twelve Labs will use AWS technologies to accelerate the development of its foundation models that map natural language to what’s happening inside a video. This includes actions, objects, and background sounds, allowing developers to create applications that can search through videos, classify scenes, summarize, and split video clips into chapters.

Creating applications that can pinpoint any video moment or frame

Available on AWS Marketplace, these foundation models enable developers to create applications for semantic video search and text generation, serving media, entertainment, gaming, sports, and additional industries reliant on large volumes of video. For example, sports leagues can use the technology to streamline the process of cataloging vast libraries of game footage, making it easier to retrieve specific frames for live broadcasts. Additionally, coaches can use these foundation models to analyze a swimmer’s stroke technique or a sprinter’s starting block position, making adjustments that lead to better performance. Finally, media and entertainment companies can use Twelve Labs technology to create highlight reels from TV programs tailored to each viewer’s interests, such as compiling all action sequences in a thriller series featuring a favorite actor.

“Twelve Labs was founded on a vision to help developers build multimodal intelligence into their applications,” said Jae Lee, co-founder and CEO of Twelve Labs. “Nearly 80% of the world’s data is in video, yet most of it is unsearchable. We are now able to address this challenge, surfacing highly contextual videos to bring experiences to life, similar to how humans see, hear, and understand the world around us.”

“AWS has given us the compute power and support to solve the challenges of multimodal AI and make video more accessible, and we look forward to a fruitful collaboration over the coming years as we continue our innovation and expand globally,” added Lee. “We can accelerate our model training, deliver our solution safely to thousands of developers globally, and control compute costs—all while pushing the boundaries of video understanding and creation using generative AI.”

Generating accurate and insightful video summaries and highlights

Twelve Labs’ Marengo and Pegasus foundation models deliver groundbreaking video analysis that not only provides text summaries and audio translations in more than 100 languages, but also analyzes how words, images, and sounds all relate to one another, such as matching what’s said in speech to what’s shown in video. Content creators can also access exact moments, angles, or events within a show or game using natural language searches. For example, major sports leagues use Twelve Labs technology on AWS to automatically and rapidly create highlight reels from their extensive media libraries to improve the viewing experience and drive fan engagement.

“Twelve Labs is using cloud technology to turn vast volumes of multimedia data into accessible and useful content, driving improvements in a wide range of industries,” said Jon Jones, vice president and global head of Startups at AWS. “Video is a treasure trove of valuable information that has, until now, remained unavailable to most viewers. AWS has helped Twelve Labs build the tools needed to better understand and rapidly produce more relevant content.”

Accelerating and lowering the cost of model training

Twelve Labs uses Amazon SageMaker HyperPod to train its foundation models, which are capable of comprehending different data formats like videos, images, speech, and text all at once. This allows its models to unlock deeper insights compared to other AI models focused on just one data type. The training workload is split across multiple AWS compute instances working in parallel, which means Twelve Labs can train its foundation models for weeks or even months without interruption. Amazon SageMaker HyperPod provides everything needed to get AI models up to speed quickly, fine-tune their performance, and scale up operations seamlessly.

Leveraging the scale of AWS to expand globally

As part of a three-year Strategic Collaboration Agreement (SCA), Twelve Labs will work with AWS to deploy its advanced video understanding foundation models across new industries and enhance its model training capabilities using Amazon SageMaker HyperPod. AWS Activate, a program that helps startups grow their business, has empowered Twelve Labs to scale its generative AI technology globally and unlock deeper insights from hundreds of petabytes of videos, down to split-second accuracy. This support includes hands-on expertise for optimizing machine learning performance and implementing go-to-market strategies. Additionally, AWS Marketplace enables Twelve Labs to seamlessly deliver its innovative video intelligence services to a global customer base.

Business Wire
