He spent a decade watching the most valuable data in the building go untouched. Now he is teaching databases to read it.
// Richard Meng. The smile of a man who finally figured out how to SELECT * from a PDF.
Walk into almost any company and you will find a strange kind of hoarding. Spreadsheets get cleaned, joined and dashboarded. Meanwhile the contracts, the scanned invoices, the call recordings, the screenshots, the web pages - roughly 80% of everything - sit in folders nobody queries. Richard Meng noticed this years before it was fashionable to care, and it bothered him enough to quit a very good job over it.
Meng is the co-founder and CEO of Roe AI, a Y Combinator-backed startup he founded in 2023 with Jason Wang. The pitch is deceptively small: an AI-powered data warehouse where you point plain SQL at documents, images, web pages and video and get back clean, structured answers. The ambition underneath it is not small at all. He wants the messy majority of enterprise data to feel as ordinary to work with as a table of numbers.
Before Roe, he was the tech lead for Generative AI at Snowflake, where he tuned open-source language models and shipped an AI data copilot to hundreds of Fortune 2000 customers. Before that, at LinkedIn, he led Skills and Knowledge Graph products feeding the largest economic graph on earth. The throughline is unmistakable: wherever data is hardest to tame, that is where you find him.
"I've been driven to challenge the status quo and tackle problems that haven't been solved before - especially those at the intersection of data and AI."
- Richard Meng
Meng likes to put it bluntly. "20% of data is structured," he says. "What about the remaining 80%? 85%+ of data teams are just leveraging that 20%." For years that was simply the deal. You analyzed what fit neatly in rows and columns, and you shrugged at the rest because reading a thousand documents by hand was nobody's idea of a Tuesday.
Roe AI's wager is that large language models finally make the other 80% queryable - not with fragile keyword search, but with reasoning. And in a contrarian touch, Meng largely skips embedding vectors, which he considers too imprecise, in favor of a lot of direct LLM calls. More compute, less guessing.
Meng describes his own career as a tour through the evolution of data tooling - and every stop sharpened the same question.
"Tackle the hard problems first. Chasing easier wins might seem promising initially, but true product-market fit often emerges when we solve the toughest challenges."
"Why can't we make something like Snowflake for unstructured data?"
"My goal is to enable these clients to build agentic, rigorously evaluated document workflows."
"I can just use SQL to do everything within this data warehouse."
Roe AI runs Roe AI internally. The team screens thousands of job-applicant resumes in under a minute on its own platform - the product proves itself before it ever reaches a buyer.
While the industry reaches for vector embeddings, Meng finds them too imprecise. Roe leans on a high volume of direct LLM calls instead. It is a more expensive bet on accuracy - and a deliberate one.
His first serious data project was researching OpenStreetMap data on Berkeley's supercomputer clusters - long before "AI infrastructure" was a job title anyone listed on LinkedIn.
The easy wins are a trap, he argues. Real product-market fit shows up only after you survive the gnarly stuff nobody else wanted to touch.
"Roe" - as in fish roe - reads like a nod to tiny units of data that add up to something worth harvesting. Apt for a company built on the overlooked 80%.
Roe sharpened its focus on financial services, where compliance and document review are brutal, manual and unforgiving - exactly the kind of hard problem Meng goes looking for.
The goal Meng keeps circling back to: let any team - especially in regulated, paperwork-heavy industries - build agentic, rigorously evaluated document workflows with a few lines of SQL. No Python notebook. No prompt-engineering marathon. Just a query, and an answer you can trust.