Expanso: The Startup Betting Your Data Should Stay Home

The Story

There is a boring, expensive tax on modern data, and Expanso wants to stop paying it.

Here is a thing about data that is true and slightly absurd: the most expensive part of a data pipeline is often not the analysis, the storage, or the fancy AI model at the end. It is the moving. Enterprises generate data all over the place - factory sensors, retail stores, hospital systems, log streams from thousands of servers - and then, at considerable cost, they ship all of it to a central cloud so a computer can look at it, decide most of it was junk, and throw 90% of it away.

You are, in other words, paying a toll to haul cargo across the country so that a warehouse can tell you it was trash. Expanso, a Seattle company founded in 2022, looked at this arrangement and asked the reasonable question: what if we did the sorting before we paid for the truck?

That is the whole idea, and Expanso has given it a name - “Compute Over Data.” Instead of moving data to where the compute is, you move the compute to where the data already sits. A lightweight agent runs at the source, filters and transforms and governs the data in place, and then forwards only the part that matters. The data that stays put is cheaper (no egress), safer (no travel means fewer places to leak), and faster to act on.

None of this would matter if it were just a slide. What makes Expanso worth a dossier is that the thing works, it is open source at its core, and the people building it have done this kind of plumbing before - at the scale where getting it wrong is very visible.

How It Works

Two ways to process data. Expanso picks the second.

The Usual Way

Data → Cloud → Compute. Copy everything from every source into one central place, pay for the transfer and the storage, then run the job. Simple to reason about, painful on the invoice, and awkward when the data was never supposed to leave the building.


The Expanso Way

Compute → Data → Result. Send the job to where the data lives. Filter, transform and govern at the source. Only the useful, compliant slice travels onward. Less movement, lower cost, and data sovereignty by default.
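To make the filter, transform and govern steps concrete, here is a minimal sketch of what a job running at the source might do. It is not Bacalhau's or Expanso's actual API: the `Reading` type, the `process_at_source` function, the `operator_email` field and the temperature threshold are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    """Hypothetical sensor record as it exists at the source."""
    sensor_id: str
    operator_email: str   # sensitive field that should stay on site
    temperature_c: float


def process_at_source(readings, alert_threshold_c=80.0):
    """Filter, transform and govern locally; return only what travels onward."""
    # Filter: keep only the readings worth forwarding individually.
    alerts = [r for r in readings if r.temperature_c >= alert_threshold_c]

    # Govern: forward alert records without the sensitive field.
    forwarded = [
        {"sensor_id": r.sensor_id, "temperature_c": r.temperature_c}
        for r in alerts
    ]

    # Transform: collapse the full batch into one small aggregate.
    if readings:
        avg = round(mean(r.temperature_c for r in readings), 2)
    else:
        avg = None
    summary = {"count": len(readings), "mean_temperature_c": avg}

    return {"summary": summary, "alerts": forwarded}


if __name__ == "__main__":
    batch = [
        Reading("s1", "ops@example.com", 20.0),
        Reading("s2", "ops@example.com", 90.0),
        Reading("s3", "ops@example.com", 40.0),
    ]
    print(process_at_source(batch))
```

The point of the sketch is the shape of the output rather than the specific rules: one summary plus a handful of scrubbed alert records leave the site, while the raw batch does not.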

Compute Over Data: bring the processing to where the data lives, rather than moving the data to the cloud first.

- The Expanso / Bacalhau thesis

Why It Matters

The pitch is savings, but the real product is control

There are two audiences who almost never agree - the finance team that hates the cloud bill, and the compliance team that hates data leaving the building. Expanso's approach happens to satisfy both at once. When two opposing parties both win, you have usually found real leverage rather than a marketing line.

The company reports figures in the neighborhood of 10x faster pipeline deployment and, through its Red Hat OpenShift integration, cost reductions of 50-70%. Treat those as vendor numbers - directional, not gospel. The underlying logic, though, is hard to argue with: the cheapest byte to process is the one you never had to move.
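The "never had to move" logic is simple arithmetic, sketched below. Every input here is an assumption for illustration: the monthly volume and the per-gigabyte transfer price are made-up numbers, not Expanso or cloud-provider figures, and `monthly_transfer_cost` is a hypothetical helper. The 10% kept fraction simply mirrors the "throw 90% of it away" scenario described earlier.

```python
def monthly_transfer_cost(total_gb, kept_fraction, price_per_gb):
    """Return (ship_everything_cost, filter_first_cost) for one month.

    total_gb      -- data generated at the sources (assumed figure)
    kept_fraction -- share of data still useful after filtering, 0.0 to 1.0
    price_per_gb  -- transfer price per GB (assumed figure)
    """
    if not 0.0 <= kept_fraction <= 1.0:
        raise ValueError("kept_fraction must be between 0.0 and 1.0")
    ship_everything = total_gb * price_per_gb
    filter_first = total_gb * kept_fraction * price_per_gb
    return ship_everything, filter_first


if __name__ == "__main__":
    # Hypothetical inputs: 100 TB a month, 10% worth keeping, $0.05 per GB.
    ship_all, filtered = monthly_transfer_cost(100_000, 0.10, 0.05)
    print(f"ship everything: ${ship_all:,.2f}")
    print(f"filter first:    ${filtered:,.2f}")
```

With these inputs the transfer bill scales directly with the kept fraction, so filtering first costs about a tenth as much to move. Real savings depend on what the edge compute itself costs, which this sketch deliberately ignores.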

What You Can Build

Two products, one idea

Open Source Core

Bacalhau

The open-source distributed compute engine that runs jobs where data lives. Its public demo network has processed more than 1.5 million jobs for partners including the University of Maryland, BOINC and the New Atlantis Foundation. (The name is Portuguese for salted cod - a thing valuable enough to preserve and distribute.)

Enterprise Platform

Expanso Platform

Lightweight edge agents, policy-based governance across thousands of sources, 100+ connectors to Snowflake, Databricks, Splunk, Datadog and Elastic, self-healing pipelines, and full data-lineage tracking for compliance - PII and GDPR handling included.

Who Uses It

Edge, Hybrid, Multi-cloud

Enterprises and institutions with scattered data: universities, research networks, telcos, and - per the company - some of the world's largest defense organizations. Anywhere data is too big, too sensitive, or too regulated to move comfortably.

The Founder

From Kubernetes to cod

David Aronchick

Co-founder & CEO

If distributed systems have a resume, Aronchick's is a strong one. He was the first non-founding product manager on Kubernetes, co-founded Kubeflow at Google, and later ran open-source machine learning at Microsoft. Expanso is what happens when someone who spent years watching enterprises struggle with scale decides the problem worth solving is not another orchestrator - it is the data itself. The company was co-founded by alumni of Google, AWS and Microsoft.

The Money

A $7.5M seed that closed in a frozen market

In November 2023 - not an easy moment to raise anything - Expanso announced a $7.5 million seed round led by General Catalyst and Hetz Ventures, with Array Ventures joining. One account described the fundraising conditions of the period as “crazy.” Raising into a bad market is a mild signal that investors believed the problem was real rather than fashionable. In 2024, Samsung Next added a strategic investment.

Control Your Data. Everywhere.

- Expanso's operating slogan

The Record

Selected milestones

Sep 2023

Selected for the 5G Open Innovation Lab alongside AT&T and Comcast.

Nov 2023

$7.5M seed round led by General Catalyst and Hetz Ventures.

May 2024

Strategic investment from Samsung Next; Data Breakthrough Award 2024.

Mar 2025

SXSW Pitch 2025 Finalist; second Data Breakthrough Award.

Feb 2026

Joins Red Hat Partner Network for OpenShift; named to Edge AI Foundation's Defense Working Group.

Jun 2026

Named Edge AI Startup of the Year 2026 at EDGE AI London.

Marginalia

Four things that stuck with us

The name is a fish

Bacalhau is Portuguese for salted cod - preserved and distributed, which is roughly what the software does to your data.

The feature is subtraction

Expanso's whole thesis is doing less. Not moving the data is the product.

Kubernetes pedigree

The CEO helped ship Kubernetes and co-founded Kubeflow before starting here.

Small team, large customers

Roughly 14 people, reportedly serving some of the world's largest defense organizations.

Go Deeper

Links, socials & sources

Website: expanso.io
LinkedIn: Expanso
Twitter / X: @ExpansoIO
GitHub: bacalhau-project
Newsroom: Latest updates
Founder: David Aronchick
Press: VentureBeat
Press: GeekWire
Interview: Carta: the seed round
Video: Talks & demos

Video: search “Bacalhau Compute Over Data” on YouTube for conference talks and product demos featuring David Aronchick. No single official channel is confirmed here, so the link points to a scoped search rather than an unverified URL.