Jae Lee, Co-founder and CEO of TwelveLabs
Press Profile

Jae Lee.

The man teaching machines to watch - and actually understand.

Co-founder & CEO — TwelveLabs — San Francisco

$107M
Total Raised
20K+
Developer Orgs
60x
Real-Time Speed
47
Languages

The Guy Who Built the Long Way

Before he was pitching at AWS re:Invent and speaking at World Economic Forum events, Jae Lee was a data scientist decoding threats for South Korea's Ministry of National Defense. He was good at it - award-winning, in fact. But somewhere between military cyber ops and Silicon Valley, he saw something nobody else wanted to touch: the fact that 80% of the world's data is video, and AI could barely watch a clip without losing the thread.

He co-founded TwelveLabs in 2021 to fix that. Not by wrapping existing models with a prettier API - but by building proprietary video foundation models from the ground up. It was, by his own admission, a painful journey. While competitors moved fast with shortcuts, TwelveLabs spent two and a half to three years doing the hard thing.

That patience looks like strategy now. TwelveLabs' flagship models - Marengo 2.7 and Pegasus 1.5 - can search, classify, and reason across video with what Lee describes as "human-level understanding." Marengo handles multimodal embeddings across 47 languages at 78.5% composite accuracy. Pegasus reasons continuously over up to two hours of video in a single pass. Together, they process over 10,000 hours of content per day at 60 times real-time speed.

The customers range from sports broadcasters to federal intelligence agencies. In-Q-Tel - the CIA's venture arm - is among TwelveLabs' investors, which tells you something about what serious video intelligence looks like in 2025.

AI systems need to learn from video to understand the world the way humans do.

- Jae Lee, Co-founder & CEO, TwelveLabs

Lee is a UC Berkeley computer science graduate who went through Techstars before landing in the video AI space. His path through Amazon and Samsung internships, then military data science, gave him something most founders lack: a pragmatic sense of what "deployment at scale" actually means before you've written a single fundraising deck.

He describes his founding conviction plainly: machines need video to understand the world the same way humans do. Text-only AI is, in his view, fundamentally incomplete. The camera is not just a data source - it is a language.

On the question of building versus wrapping: "It was a painful journey in the first like two and a half, three years because folks are flying by." But he held the line, and the models TwelveLabs built became the differentiation that $107 million in funding bought into.

What Jae Lee Actually Says

"If you're kind of entering because, oh, federal market is big and you go in, you're going to get your butt kicked."

On federal market strategy - Frontlines.io Podcast

"It was painful journey in the first like two and a half, three years because folks are flying by."

On building proprietary video AI models from scratch

"We expect to be recognized as the company that processes the majority of the world's video data within three to five years."

On TwelveLabs' long-term vision

"We look forward to the day when Korean AI technology sets a new global standard under the leadership of SKT."

On the SK Telecom K-AI Alliance, January 2025

$107M and Counting

Three funding rounds. Each one a vote for proprietary video AI over the wrapper approach.

Series A (Initial)
$12M
Series A (NEA + NVIDIA)
$50M
Round (Dec 2024)
$30M

Additional seed/earlier funding rounds bring total to $107.1M per Crunchbase.

Notable Investors

NEA
NVIDIA NVentures
Databricks
Snowflake
SK Telecom
Index Ventures
Radical Ventures
HubSpot Ventures
In-Q-Tel
Korea Investment Partners
Wndr Co

Two Models. One Goal.

TwelveLabs ships two flagship video AI models. Marengo finds things. Pegasus understands them.

Marengo
VERSION 2.7 — MULTIMODAL EMBEDDING

Search and retrieve across video, audio, and text simultaneously. Marengo 2.7 achieves 78.5% composite accuracy across 47 languages - finding the exact moment in a 3-hour archive that matches a natural language query.

78.5% composite accuracy • 47 languages • 60x real-time indexing
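
For developers, the Marengo side of the product surfaces as a search call against an index of already-ingested video. The sketch below shows roughly what a natural-language moment search might look like over HTTP; the endpoint path, payload fields, and response shape are illustrative assumptions, not verified TwelveLabs API documentation.

```python
# Illustrative sketch only: the /search path, payload fields, and response
# shape below are assumptions about the TwelveLabs API, not confirmed docs.
import requests

API_KEY = "tlk_..."                           # placeholder API key
BASE_URL = "https://api.twelvelabs.io/v1.3"   # assumed base URL / version


def find_moments(index_id: str, query: str):
    """Ask a Marengo-backed index for moments matching a natural-language query."""
    resp = requests.post(
        f"{BASE_URL}/search",
        headers={"x-api-key": API_KEY},
        json={
            "index_id": index_id,                    # index the videos were ingested into
            "query_text": query,                     # e.g. "goal celebration in the rain"
            "search_options": ["visual", "audio"],   # modalities to search across
        },
        timeout=30,
    )
    resp.raise_for_status()
    # Assumed response shape: a list of clips with video id, start/end, confidence.
    return [
        (clip["video_id"], clip["start"], clip["end"], clip["confidence"])
        for clip in resp.json().get("data", [])
    ]


if __name__ == "__main__":
    for video_id, start, end, conf in find_moments("idx_demo", "goal celebration in the rain"):
        print(f"{video_id}: {start:.1f}s-{end:.1f}s ({conf})")
```
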
Pegasus
VERSION 1.5 — VIDEO LANGUAGE MODEL

Reasoning, summarization, and insight extraction from video at feature-film scale. Pegasus 1.5 processes up to 2 continuous hours of video in a single inference pass - generating structured text, answers, and analysis.

2-hour continuous reasoning • 10,000+ hours/day • Enterprise-grade
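
Pegasus is the generate-and-reason half: point it at an indexed video and get structured text back. The minimal sketch below illustrates that split under the same caveats; the /summarize path, the payload fields, and the response key are assumptions for illustration rather than confirmed API details.

```python
# Illustrative sketch only: the /summarize path, payload fields, and response
# key below are assumptions about the TwelveLabs API, not confirmed docs.
import requests

API_KEY = "tlk_..."                           # placeholder API key
BASE_URL = "https://api.twelvelabs.io/v1.3"   # assumed base URL / version


def summarize_video(video_id: str) -> str:
    """Ask a Pegasus-backed endpoint for a text summary of one indexed video."""
    resp = requests.post(
        f"{BASE_URL}/summarize",
        headers={"x-api-key": API_KEY},
        json={
            "video_id": video_id,   # a video already indexed for generation
            "type": "summary",      # assumed type; chapters/highlights may also exist
        },
        timeout=120,                # reasoning over hours of footage is not instant
    )
    resp.raise_for_status()
    return resp.json().get("summary", "")


if __name__ == "__main__":
    print(summarize_video("vid_demo"))
```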

From Seoul to San Francisco

Jae Lee's path to building one of the world's most-watched video AI companies runs through military intelligence, Amazon internships, and a willingness to spend three years on a bet most investors wouldn't touch.

His Korean roots are not background color. They are load-bearing. In January 2025, he joined SK Telecom's K-AI Alliance, serving on the board of the South Korea Foundation Model Association - working alongside Samsung, SK, and LG to ensure Korean AI competes at the global frontier.

Early Career
Software engineering intern at Amazon and Samsung - built the operational instincts that later shaped TwelveLabs' infrastructure-first thinking
Pre-2021
Lead data scientist, South Korean Ministry of National Defense - award-winning cyber security work for the army gave him a deep sense of what "mission-critical" actually means
2021
Co-founded TwelveLabs in San Francisco - bet on building video foundation models from scratch when everyone else was wrapping GPT
2022 - 2023
Raised initial Series A ($12M) - spent the painful years building proprietary multimodal models while competitors took shortcuts
2024
Closed $50M Series A co-led by NEA and NVIDIA NVentures - the validation that the hard approach was the right approach
Nov 2024
Keynote customer at AWS re:Invent 2024 - TwelveLabs on the main stage in front of 60,000+ cloud developers
Dec 2024
Added $30M from Databricks, Snowflake, SK Telecom, HubSpot Ventures, and In-Q-Tel - bringing total to $107M+
Jan 2025
Joined SK Telecom's K-AI Alliance; board member, South Korea Foundation Model Association
Jun 2025
Speaking at TechCrunch Sessions: AI at UC Berkeley's Zellerbach Hall

We expect to be recognized as the company that processes the majority of the world's video data within three to five years.

- Jae Lee

What the Record Shows

🏆
CB Insights AI 100
3 consecutive years — one of the most consistent runs in the AI startup recognition landscape
🎥
AWS re:Invent Keynote
2024 keynote customer speaker — main stage in front of 60,000+ developers
🌐
World Economic Forum
Named to WEF profile page — global recognition as a technology leader
📊
$107M Total Funding
From NVIDIA, NEA, Databricks, Snowflake, SK Telecom, In-Q-Tel and more
🔧
20,000+ Developer Orgs
TwelveLabs API in production across sports, media, advertising, automotive, and government
🇰🇷
South Korean Army
Award-winning cyber security leader — the credential almost nobody in Silicon Valley has

TwelveLabs at Scale

10K+
Hours of video processed by TwelveLabs every single day across 20,000+ developer organizations
60x
Real-time indexing speed - one hour of video indexed in under one minute
78.5%
Composite accuracy for Marengo 2.7 across 47 languages - the highest in the video embedding benchmark

Details Worth Knowing

When TechCrunch announced that Jae Lee would speak at Sessions: AI in June 2025 at UC Berkeley's Zellerbach Hall, the venue carried a particular resonance. UC Berkeley is where Lee got his CS degree. The event where he'll help define the future of AI will take place blocks from where he learned to build software.

The federal play is less obvious than it looks. Lee has been precise about what it takes: mission alignment, not just revenue opportunity. His words on the topic - "if you go in because the market is big, you're going to get your butt kicked" - reflect something rarer than strategy: actual experience with what government customers demand versus what they say they want.

In-Q-Tel's participation in the December 2024 round is the detail most people skim past. In-Q-Tel is the venture arm created by the CIA in 1999 to fund technologies with intelligence community applications. The fact that they backed TwelveLabs says something about what a US government intelligence application of video AI actually looks like when someone builds it properly.

Lee traveled to South Korea twice in a single month while managing TwelveLabs' US operations - and his public response was that the partnership results made it "all worth it." That framing - dual identity, not divided identity - runs through everything he does. He wants Korean AI to set the global standard. He also wants to process the majority of the world's video data. Both at once.

The camera is not just a data source. For AI to understand the world, it needs to understand what it sees.

- TwelveLabs founding thesis

What NEA called "100% obsession with the problem" is visible in the product roadmap. TwelveLabs didn't ship a search feature. They shipped a model - Marengo - and then they shipped a reasoning engine - Pegasus. The vertical-specialized go-to-market (separate account teams for sports, media, federal, automotive) is how you sell when the problem looks different in every industry, even if the underlying technology is the same.

The company remains headquartered at 55 Green Street in San Francisco, with 170 employees and annual revenues of approximately $15.7M. For a company processing 10,000+ hours of video daily for customers including broadcast networks, automotive manufacturers, sports leagues, and federal agencies, that headcount tells you something about how much compute they run per engineer.

Jae Lee on Video

AWS re:Invent 2024 TwelveLabs Keynote
AWS re:Invent 2024 - Keynote Customer
Watch on YouTube →
MindMakers Episode 7 - TwelveLabs
MindMakers Ep.7 - How TwelveLabs Transforms Enterprise Video
Watch on YouTube →
Jae Lee TwelveLabs Co-founder Interview
Jae Lee - Co-founder Building the Future of Video AI
Watch on YouTube →