Ben Lerner

The Dispatch

The bill nobody reads, rewritten by a machine

Every large company running on Snowflake has a number it would rather not discuss out loud: the monthly compute bill. It grows quietly, line by line, query by query, until someone in finance circles it in red. Ben Lerner built a company around that circled number. Espresso AI, which he co-founded and runs as CEO, points large language models at the ugliest SQL in the enterprise, rewrites it, and reallocates the compute underneath it. Customers, the company says, keep up to 70% of what they were spending.

The pitch is almost rude in its simplicity - save six figures on Snowflake in roughly ten minutes - and that bluntness is the point. Lerner is not selling a dashboard or a best-practices PDF. He is selling the thing that actually moves money: a system that edits the queries and turns the warehouse up and down on its own. Espresso calls it a neural compute optimizer, and describes it as the first one built from the ground up around the code-reading and code-writing abilities of modern LLMs.

"LLMs are really good at understanding code - leaps and bounds better than prior ML models."

That sentence is the whole thesis. For decades, database query optimizers were rule engines: clever, hand-tuned and, occasionally, stubbornly wrong. Lerner's wager is that a model which has read more code than any human ever could can find the savings those rule engines miss. "There are some changes the compiler can't make," he has said, "and the optimizer can just be wrong in edge cases." Espresso lives in those edge cases.

Two careers, finally merged

To understand the product you have to understand the resume, because Espresso is the seam where Lerner's two professional lives meet. He went to MIT. His first job out of school was at Data Nitro, a small startup with an endearingly specific mission: put Python inside Excel. From there he worked on distributed systems at Uber, then landed at Google, where his career quietly split in two directions.

For about three years he worked on Google Search, doing machine learning and natural language processing - the soft, statistical, language-shaped half of computing. Then he switched sides entirely, moving into Google's storage infrastructure organization to write high-performance code, the kind that makes the disks underneath Google Cloud fast and cheap. One half of him spoke ML. The other half spoke systems performance. Espresso AI is what happens when those two halves stop working in separate buildings.

"I'm a fan of accepting job offers that you feel unqualified for. That's a good way to learn stuff."

Near the end of his Google years, he did research with DeepMind - using reinforcement learning to train neural networks more efficiently across many GPUs, and contributing to an early effort to train models on code itself. That last project reads, in hindsight, like a man assembling the exact toolkit he would later need. Reinforcement learning to make compute cheaper. Models that understand code. Put them together and you have the engine room of Espresso AI.

Why he left

The decision to leave came from a familiar founder ailment: the slow suffocation of scope. Early at Google he got to build, in his words, "super high performance stuff." Then the work calcified into multi-year maintenance journeys. The building stopped; the babysitting began. Around 2023 he walked out the door with a single idea - fuse the ML half of his brain with the systems half, and aim the result at compute, the most expensive and least glamorous problem in enterprise software.

The crisis he picked

Timing matters, and Lerner picked his moment well. The cloud cost crisis is the rare enterprise problem that every CFO already understands without a slide. Companies migrated to platforms like Snowflake for elasticity and ended up with bills that scale just as elastically in the wrong direction. Data teams write the queries; finance gets the invoice; nobody owns the gap in between. Espresso plants itself in that gap. It does not ask engineers to rewrite their work or change their habits. It watches the workload, rewrites what it can, and resizes the warehouse underneath - the savings show up without a migration project attached.

That is also why the ten-minute setup is more than a marketing line. Enterprise software usually arrives with a quarter-long implementation and a services invoice to match. A tool that connects, reads your queries, and starts returning money the same afternoon is selling on a fundamentally different axis. Lerner's bet is that proof beats promise - let the first week's savings make the argument the sales deck cannot.

The team behind the seam

He did not build it alone. Espresso's founding bench is stacked with the kind of engineers who worked on the exact systems the product now optimizes - alumni of Google Search, Google Cloud, Apple, and Google DeepMind, with research lineage in natural language processing, systems performance, and deep learning. His co-founders include Alex Kouzemtchenko and Juri Ganitkevitch, and the roster reads like a small reunion of people who spent years making big infrastructure fast from the inside. When the seed round came together in May 2024, the names attached were equally pointed: Nat Friedman and Daniel Gross led it, with Matt Turck of FirstMark leading the earlier pre-seed and angels including Tasso Argyros, Spencer Kimball, Calvin French-Owen, and Tristan Handy - data and developer-tools veterans who recognize the problem because they have lived it.

By The Numbers

What "up to 70%" looks like

A neural optimizer rewrites the query, then resizes the compute beneath it. Espresso reports cost reductions of up to 70-80% on Snowflake workloads - here is the shape of the claim across the stack.

SQL rewrite: 78%

Compute allocation: 70%

Auto-scaling: 65%

Setup effort: ~10 min

Figures reflect Espresso AI's public claims. The fourth bar is inverted on purpose - the effort to turn it on is the small number.
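
To put a dollar figure on those percentages, here is a minimal sketch of the arithmetic. The $50,000 monthly bill and the function name `monthly_savings` are assumptions made up for illustration; they are not Espresso AI figures, and actual savings will depend on the workload.

```python
def monthly_savings(bill_usd: float, reduction_pct: float) -> float:
    """Return the dollars saved per month for a given percentage reduction."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    return bill_usd * reduction_pct / 100


# Hypothetical monthly Snowflake spend in USD (an assumption, not a reported number).
hypothetical_bill = 50_000

# The three percentages shown in the chart above.
for pct in (65, 70, 78):
    saved = monthly_savings(hypothetical_bill, pct)
    print(f"{pct}% reduction: ${saved:,.0f}/month, ${saved * 12:,.0f}/year")
```

On that assumed bill, a 70% reduction works out to $35,000 a month, or $420,000 a year, which is the "six figures" of the pitch.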

The Long Way Round

A career that reads like a parts list

Every stop taught him one component. Espresso is the assembly.

FRESH OUT OF MIT

Data Nitro - a startup trying to put Python inside Excel. The first taste of building something people quietly loved.

~3 YEARS

Google Search - machine learning and natural language processing. The ML half of the brain.

THEN

Google Storage Infrastructure - high-performance code under Google Cloud. The systems half of the brain.

LATE GOOGLE

DeepMind research - RL for efficient multi-GPU training, and an early project teaching models to read code.

2023

Founds Espresso AI - leaves Google to merge ML and systems performance into one product.

MAY 2024

Out of stealth - $11M+ seed, Snowflake optimization, the cloud cost crisis in the crosshairs.

2025

Agentic Lakehouse for Databricks - the "Kubernetes for Snowflake" idea expands to a second warehouse.

In His Own Words

The operating philosophy

The most important thing for us is not necessarily using the best model, but getting something in front of customers that uses any model at all.

Over a slightly longer timeframe, the best approach is almost invariably throwing more data and more compute at it.

Go talk to the people you think are gonna use your product and see what they have to say.

There are some changes the compiler can't make, and the optimizer can just be wrong in edge cases.

The Tell

The emails for a product that no longer existed

Long after Data Nitro went quiet, Lerner kept getting messages. Users wanted updates. They wanted fixes. They wanted the Python-in-Excel tool to keep living. He couldn't give them any of it - the product was defunct. But the inbox kept filling, year after year.

Most founders would file that under nuisance. Lerner reads it as the lesson he almost missed: those unanswered emails were product-market fit, arriving late and unrecognized. People don't chase a dead tool for years unless it genuinely solved something. The takeaway shaped how he runs Espresso - find the people who actually want the thing, and listen before you theorize.

"If you're wondering whether you have real user interest, you probably don't have it."

The padel CEO

He is not, by report, a man who performs the founder grind. He plays padel - the tennis-squash hybrid - and ping pong to clear his head. And in his personal time, the CEO of a company built to optimize code does the thing that started it all: he writes code, for fun. The job and the hobby are the same activity. That is either a warning sign or the whole explanation.

The Big Number

1000x, and the ghost of Moore's Law

Cutting a Snowflake bill is the wedge, not the dream. Espresso's stated ambition is to build neural optimizers that leave legacy tools behind and eventually deliver up to 1000x faster compute - the kind of software-performance leap the industry took for granted before Moore's Law flattened out.

It is a deliberately enormous goal stated by a deliberately practical person. Lerner's instinct is always to ship the deployable version first - any model, in front of real customers, doing real work - and let the grand number pull the roadmap forward rather than gate it. Start with the bill people can see. Aim at the physics no one can ignore.

The Thesis, Compressed

Old way: Hand-tuned rule engines

New way: LLMs that read code

Wedge: Snowflake bills

Horizon: 1000x compute

Marginalia

Things that don't fit the slide deck

01. His handle on X is @ben_lern - the missing "er" doing a lot of quiet work.

02. First real job: making Excel speak Python. A weirdly perfect prologue to optimizing data tools.

03. He worked both the ML and the storage-infrastructure sides of Google before fusing them.

04. Espresso AI runs out of 25 Kent Ave in Brooklyn, not Silicon Valley.

05. His rule for growth: take the job you feel unqualified for.

06. The company's nickname for its own product: "Kubernetes for Snowflake."

