Before he was pitching at AWS re:Invent and speaking at World Economic Forum events, Jae Lee was a data scientist decoding threats for South Korea's Ministry of National Defense. He was good at it - award-winning, in fact. But somewhere between military cyber ops and Silicon Valley, he saw a problem nobody else wanted to touch: 80% of the world's data is video, and AI could barely watch a clip without losing the thread.
He co-founded TwelveLabs in 2021 to fix that. Not by wrapping existing models in a prettier API, but by building proprietary video foundation models from the ground up. It was, by his own admission, a painful journey. While competitors moved fast with shortcuts, TwelveLabs spent two and a half to three years doing the hard thing.
That patience looks like strategy now. TwelveLabs' flagship models - Marengo 2.7 and Pegasus 1.5 - can search, classify, and reason across video with what Lee describes as "human-level understanding." Marengo generates multimodal embeddings across 47 languages at 78.5% composite accuracy; Pegasus reasons over up to two hours of video in a single pass. Together, they process more than 10,000 hours of content per day at 60 times real-time speed.
Its customers range from sports broadcasters to federal intelligence agencies. In-Q-Tel - the CIA's venture arm - is among TwelveLabs' investors, which tells you something about what serious video intelligence looks like in 2025.