Breaking
Roe AI closes $3.5M seed led by Gradient Ventures | Y Combinator W24 batch | SQL for the messy 80% of enterprise data | ex-Snowflake gen-AI tech lead | Screens thousands of resumes in under a minute
Profile / Founder & CEO, Roe AI

Richard Meng

He spent a decade watching the most valuable data in the building go untouched. Now he is teaching databases to read it.

UC Berkeley | ex-Snowflake | ex-LinkedIn | YC W24 | Unstructured Data
Richard Meng, co-founder and CEO of Roe AI

// Richard Meng. The smile of a man who finally figured out how to SELECT * from a PDF.

The Story

The other 80%

Walk into almost any company and you will find a strange kind of hoarding. Spreadsheets get cleaned, joined and dashboarded. Meanwhile the contracts, the scanned invoices, the call recordings, the screenshots, the web pages - roughly 80% of everything - sit in folders nobody queries. Richard Meng noticed this years before it was fashionable to care, and it bothered him enough to quit a very good job over it.

Meng is the co-founder and CEO of Roe AI, a Y Combinator-backed startup he started in 2023 with Jason Wang. The pitch is deceptively small: an AI-powered data warehouse where you point plain SQL at documents, images, web pages and video and get back clean, structured answers. The ambition underneath it is not small at all. He wants the messy majority of enterprise data to feel as ordinary to work with as a table of numbers.
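To make the "plain SQL over documents" idea concrete, here is a minimal sketch of the general pattern, not Roe AI's actual product or syntax. It assumes a hypothetical `extract_field` function registered as a SQL function in SQLite; in this sketch a simple regular expression stands in for the model call, and the table name, file names and invoice text are invented for illustration.

```python
import re
import sqlite3


def extract_field(text, field):
    """Hypothetical stand-in for a model call: pull 'field: value' out of free text."""
    match = re.search(rf"{re.escape(field)}\s*:\s*([^\n;]+)", text, re.IGNORECASE)
    return match.group(1).strip() if match else None


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the same (assumed) name.
conn.create_function("extract_field", 2, extract_field)

conn.execute("CREATE TABLE documents (name TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [
        ("invoice_001.txt", "Vendor: Acme Corp; Total: 1200 USD; Due: 2024-09-01"),
        ("invoice_002.txt", "Vendor: Globex; Total: 450 USD"),
    ],
)

# One query turns free text into structured columns; missing fields come back as NULL.
rows = conn.execute(
    "SELECT name, "
    "extract_field(body, 'vendor') AS vendor, "
    "extract_field(body, 'total') AS total, "
    "extract_field(body, 'due') AS due "
    "FROM documents ORDER BY name"
).fetchall()

for row in rows:
    print(row)
```

The point of the sketch is the shape of the query: the caller writes ordinary SQL, and the extraction step hides behind a function call. Swapping the regular expression for a real model is where the hard engineering would live.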

Before Roe, he was the tech lead for Generative AI at Snowflake, where he tuned open-source language models and shipped an AI data copilot to hundreds of Fortune 2000 customers. Before that, at LinkedIn, he led Skills and Knowledge Graph products feeding the largest economic graph on earth. The throughline is unmistakable: wherever data is hardest to tame, that is where you find him.

"I've been driven to challenge the status quo and tackle problems that haven't been solved before - especially those at the intersection of data and AI."

- Richard Meng

By The Numbers
80% of enterprise data is unstructured
$3.5M seed round, August 2024
2 degrees: CS + Statistics, Berkeley
W24 Y Combinator batch

The Thesis

A 20% problem the whole industry settled for

Meng likes to put it bluntly. "20% of data is structured," he says. "What about the remaining 80%? 85%+ of data teams are just leveraging that 20%." For years that was simply the deal. You analyzed what fit neatly in rows and columns, and you shrugged at the rest because reading a thousand documents by hand was nobody's idea of a Tuesday.

Roe AI's wager is that large language models finally make the other 80% queryable - not with fragile keyword search, but with reasoning. And in a contrarian touch, Meng largely skips embedding vectors, which he considers too imprecise, in favor of a lot of direct LLM calls. More compute, less guessing.
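The "direct LLM calls" approach can be pictured as one model call per document, with the full text in the prompt, rather than a similarity lookup over embeddings. The sketch below is an illustration under stated assumptions, not Roe AI's implementation: `answer_per_document` and `fake_model` are hypothetical names, the resumes are invented, and `fake_model` is a keyword stub so the example runs offline where a real system would call an LLM.

```python
from typing import Callable, Dict


def answer_per_document(
    docs: Dict[str, str],
    question: str,
    ask_model: Callable[[str], str],
) -> Dict[str, str]:
    """Issue one model call per document, passing the whole text in the prompt."""
    answers = {}
    for name, text in docs.items():
        prompt = f"Document:\n{text}\n\nQuestion: {question}\nAnswer:"
        answers[name] = ask_model(prompt)
    return answers


def fake_model(prompt: str) -> str:
    """Hypothetical offline stub: answers 'yes' if the document part mentions Python."""
    document = prompt.split("\n\nQuestion:")[0].lower()
    return "yes" if "python" in document else "no"


# Invented sample data for illustration only.
resumes = {
    "alice.txt": "Five years building data pipelines in Python and Spark.",
    "bob.txt": "Paralegal with contract review experience.",
}

screened = answer_per_document(
    resumes, "Does the candidate list Python experience?", fake_model
)
print(screened)
```

The cost scales with the number of documents, since every document gets its own call. That is the trade the text describes: more compute in exchange for an answer grounded in the full document.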

[Chart: 80% unstructured, 20% structured]

// Where the value hides vs. where the tools point. Roe AI goes left.

The Arc

Three eras of data, one obsession

Meng tells his own career as a tour through how data tooling evolved - and every stop sharpened the same question.

UC Berkeley: Double majors in Computer Science and Statistics. Cuts his teeth researching OpenStreetMap data on Berkeley's supercomputer clusters - data work era one.

LinkedIn: Leads Skills and Knowledge Graph products and generative skill assessment, wrangling Hadoop and Spark jobs that fed the world's largest economic graph - era two, all about tuning.

Snowflake: Becomes tech lead for Generative AI. Tunes open-source LLMs and serves an AI data copilot to hundreds of Fortune 2000 customers. Has the revelation: "I can just use SQL to do everything within this data warehouse."

2023: Asks the heretical question - "Why can't we make something like Snowflake for unstructured data?" - and co-founds Roe AI with Jason Wang.

2024: Roe AI joins Y Combinator's Winter batch, launches on Hacker News, and closes a $3.5M seed led by Gradient Ventures, with Ardent Ventures and ex-Snowflake execs also investing.

In His Words

The founder's playbook

"Tackle the hard problems first. Chasing easier wins might seem promising initially, but true product-market fit often emerges when we solve the toughest challenges."

"Why can't we make something like Snowflake for unstructured data?"

"My goal is to enable these clients to build agentic, rigorously evaluated document workflows."

"I can just use SQL to do everything within this data warehouse."

Quirks & Curios

Things that make Richard Meng, Richard Meng

Dogfooding

His own toughest customer

Roe AI runs Roe AI internally. The team screens thousands of job-applicant resumes in under a minute using their own platform - the product proves itself before it ever reaches a buyer.

Contrarian

No embeddings, thanks

While the industry reaches for vector embeddings, Meng finds them too imprecise. Roe leans on a high volume of direct LLM calls instead. It is a more expensive bet on accuracy - and a deliberate one.

Origin

Born on a supercomputer

His first serious data project was researching OpenStreetMap data on Berkeley's supercomputer clusters - long before "AI infrastructure" was a job title anyone listed on LinkedIn.

Philosophy

Hard problems first

The easy wins are a trap, he argues. Real product-market fit shows up only after you survive the gnarly stuff nobody else wanted to touch.

The Name

Small eggs, big pile

"Roe" - as in fish roe - reads like a nod to tiny units of data that add up to something worth harvesting. Apt for a company built on the overlooked 80%.

Sector

Aimed at the regulated

Roe sharpened its focus on financial services, where compliance and document review are brutal, manual and unforgiving - exactly the kind of hard problem Meng goes looking for.

The Aspiration

Make the messy 80% as easy as a table

The goal Meng keeps circling back to: let any team - especially in regulated, paperwork-heavy industries - build agentic, rigorously evaluated document workflows with a few lines of SQL. No Python notebook. No prompt-engineering marathon. Just a query, and an answer you can trust.