Twelve Labs Launches Marengo 2.7, Introducing New Multi-Vector Approach to Video Understanding

Latest innovation yields greater than 15% improvement over previous foundation model

SAN FRANCISCO, Dec. 4, 2024 /PRNewswire-PRWeb/ — Twelve Labs, the video understanding company, today announced Marengo 2.7, a new state-of-the-art multimodal embedding model that achieves a greater than 15% improvement over its predecessor, Marengo 2.6. Building upon the success of the previous video foundation model, Marengo 2.7 represents a significant advancement in multimodal video understanding, as it adopts a multi-vector approach that enables more precise and comprehensive video content analysis. This is the first model of its kind to do so, and early results are stunning, including 90.6% average recall in object search (a 32.6% improvement over the previous version) and 93.2% recall in speech search (2.8% higher than specialized speech-to-text systems).

Video understanding has been a notoriously difficult problem to solve. A single video clip simultaneously contains visual elements (objects, scenes, actions), temporal dynamics (motion, transitions), audio components (speech, ambient sounds, music), and often textual information (overlays, subtitles). Traditional single-vector approaches struggle to compress all of these diverse aspects into one representation without losing critical information. Marengo 2.7 upends this thinking to do something entirely new.

A Novel Approach

With Marengo 2.7, Twelve Labs deploys multi-vector representation for the first time to address the complexities inherent in video. Unlike Marengo 2.6, which compresses all information into a single embedding, Marengo 2.7 decomposes the raw inputs into multiple specialized vectors. Each vector independently captures a distinct aspect of the video content – from visual appearance and motion dynamics to OCR text and speech patterns.

For example, one vector might capture what things look like (e.g., “a man in a black shirt”), another tracks movement (e.g., “waving his hand”), and another remembers what was said (e.g., “video foundation model is fun”). This approach helps the model better understand videos that contain many different types of information, leading to more accurate video analysis across all aspects – visual, motion, and audio.
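The retrieval idea behind this can be illustrated with a minimal sketch. This is not Marengo's actual architecture or API – the vector count, dimensions, and max-over-aspects scoring rule below are illustrative assumptions – but it shows why multiple specialized vectors per clip can outperform one fused vector: a query that matches strongly on a single aspect (say, speech) is not diluted by unrelated visual content.

```python
import numpy as np

# Hypothetical multi-vector retrieval sketch; names and sizes are
# illustrative, not Marengo internals.
rng = np.random.default_rng(0)

def normalize(v):
    # Scale vectors to unit length so dot products are cosine similarities.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Each clip gets several specialized vectors (e.g., appearance, motion,
# speech) instead of one fused embedding: 100 clips x 3 aspects x 64 dims.
clips = normalize(rng.normal(size=(100, 3, 64)))

def score(query_vec, clip_vecs):
    # Score a clip by its best-matching aspect vector, so a strong match
    # on one aspect is not averaged away by the others.
    return float(np.max(clip_vecs @ query_vec))

query = normalize(rng.normal(size=64))
best = max(range(len(clips)), key=lambda i: score(query, clips[i]))
```

A single-vector system would instead fuse the three aspect vectors before indexing, which is exactly where fine-grained signals like a brief spoken phrase or a small on-screen object can be lost.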

Marengo 2.7 demonstrates particular strength in detecting small objects while maintaining exceptional performance in general text-based search tasks. This level of granular representation enables more nuanced multimodal search capabilities. Now, with Marengo 2.7, users can search complex visual scenes, find specific brand appearances, locate exact audio moments, match images to video segments, and more.

“Twelve Labs continues to push video understanding forward in unprecedented ways, turning the concept of a multi-vector approach into reality for the very first time,” said Jae Lee, CEO of Twelve Labs. “Our R&D team is laser focused on solving what was previously considered unsolvable. Their groundbreaking work has been rigorously tested, and the model’s performance is vastly superior to anything on the market today. We look forward to seeing how our customers will use this powerful technology.”

To learn about how the model was trained and to review Marengo 2.7’s performance benchmarks, please see our blog or try Marengo 2.7 yourself at https://playground.twelvelabs.io/.

About Twelve Labs

Twelve Labs makes video instantly, intelligently searchable and understandable. Twelve Labs’ state-of-the-art video understanding technology enables the accurate and timely discovery of valuable moments within an organization’s vast sea of videos so that users can do and learn more. The company is backed by leading venture capitalists, technology companies, AI luminaries, and successful founders. It is headquartered in San Francisco, with an APAC office in Seoul. Learn more at twelvelabs.io.

Media Contact

Amber Moore, Moore Communications, 1 5039439381, amber@moorecom2.com

View original content to download multimedia: https://www.prweb.com/releases/twelve-labs-launches-marengo-2-7–introducing-new-multi-vector-approach-to-video-understanding-302322215.html

SOURCE Twelve Labs
