TwelveLabs Unveils Pegasus-1.2 to Efficiently Process and Understand Videos of Varying Lengths and Complexities, Expanding Possibilities for Video AI

Model’s outstanding performance provides a nuanced approach to video understanding

TwelveLabs, the video understanding company, announced the release of its Pegasus-1.2 multimodal foundation model, which represents a significant leap forward in industry-grade video language models. Pegasus-1.2 achieves state-of-the-art performance in long video understanding. The model supports videos up to one hour long with best-in-class accuracy while maintaining low latency and competitive pricing. TwelveLabs’ embeddings storage intelligently caches videos, making repeated queries to the same video even faster and cheaper. With its latest advances, Pegasus-1.2 serves as a precision tool that delivers business value through its focused, intelligent system design, excelling exactly where production-grade video processing pipelines need it most.

“Video understanding represents one of the most complex challenges in artificial intelligence, requiring sophisticated models that can simultaneously interpret spatial details, temporal dynamics, and contextual nuances,” said Aiden Lee, CTO of TwelveLabs. “We are thrilled to debut Pegasus-1.2, which is designed to address the fundamental limitations of existing video language models by introducing a novel approach to spatio-temporal comprehension.”

TwelveLabs’ Pegasus foundation model was built to generate text descriptions of a video, “understanding” its content by analyzing both visual and audio elements. Based on prompts, it can produce summaries, highlights, titles, detailed reports and more. This allows users to extract meaningful information from video content through text generation more efficiently than before.

Pegasus works in conjunction with TwelveLabs’ Marengo model, a state-of-the-art multimodal embedding model, to bring human-like understanding to videos.

Pegasus-1.2 Takes Video Understanding to the Next Level

The core innovation of the new Pegasus-1.2 lies in its ability to dynamically balance computational efficiency with comprehensive understanding of videos across varying lengths and complexities. By implementing an advanced vision-encoding strategy and a sophisticated token reduction method, Pegasus-1.2 can capture fine-grained spatial and temporal features while maintaining computational efficiency. This approach enables Pegasus-1.2 to seamlessly transition between understanding short video clips and analyzing extended sequences up to one hour in length, a capability that significantly expands the practical applications of video AI.

Through rigorous testing, the model not only excels in low-level perceptual tasks but also demonstrates advanced reasoning skills across different video understanding domains. Importantly, Pegasus-1.2 achieves these capabilities with a compact architecture, challenging the prevailing assumption that superior performance necessitates exponentially larger model sizes. This positions Pegasus-1.2 as a significant advancement in the field of multimodal AI, offering a more efficient and nuanced approach to video language understanding.

prweb
