YesPress Profile · Data Infrastructure · New York

Spiral.

The database rebuilt for the moment machines - not humans - became the readers.

The whole pitch fits on a logo card: "the data warehouse for pre-training." Everything else is just very fast plumbing.
Funding: $22M, Seed + Series A
Random reads: 200x faster*
Team: ~18 people
Founded: 2024 · NYC
Founders: 3, ex-Palantir

Somewhere in a data center right now, a $30,000 GPU is doing nothing. It can swallow terabits of data per second. Instead it waits - idle most of the time - while a tired CPU somewhere upstream unpacks files one at a time and feeds it through a straw. Spiral, an 18-person company in New York, exists because that picture is absurd. Its entire reason for being is to take the straw away.

"When you stop pretending machines are just very fast humans, the entire architecture inverts." - Will Manning, co-founder & CEO

That sentence is the whole company in a breath. For fifty years, databases were built for people. People read dashboards. People run a query, sip coffee, read twelve rows. The systems underneath - Postgres, then the big-data lakehouses - were tuned, sensibly, for that rhythm. The trouble is that the main reader stopped being a person. It became a model in training, demanding millions of images a second, and nobody had bothered to redesign the warehouse for a customer who never blinks.

The problem they saw

The Third Age of Data

Spiral likes to tell history in three acts. In the First Age, humans put data in and humans took data out - the Postgres era, human-scale on both ends. In the Second Age, machines started writing at enormous volume but people still did the reading - the lakehouse era of Snowflake and Databricks. We are now in the Third Age, where machines do both the writing and the reading. The inputs are machine-scale; so, finally, are the outputs.

It is a tidy story, and like all tidy stories it is mostly a setup for the punchline: the tools we use were built for Act Two. The incumbents, Spiral argues, are "bolting new marketing onto old architectures" - relational tables, schema-bound warehouses, batch ETL pipelines optimized for human dashboards, now asked to feed GPUs they were never designed to serve. AI data is messy, multimodal, and arrives in awkward sizes. Legacy formats handle it the way a tuxedo handles a swim.

"Modern GPUs can consume terabits per second, but legacy formats force CPUs to sit in the middle, decompressing everything first. That's broken." - via General Catalyst, lead investor

There is a particular spot where the old systems fall apart, and Spiral has a name for it: the uncanny valley between 1 kilobyte and 25 megabytes. Too big to be a tidy database row, too small to be a happy file on disk. An embedding, an image, a short clip of video. Most systems are good at the very small or the very large and miserable in the middle - which is, inconveniently, exactly where modern AI data lives.
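
To make the valley concrete, here is a minimal Python sketch that sorts object sizes into three bands. The 1 KB and 25 MB edges come from Spiral's framing; the decimal units, the inclusive boundaries, the band names, and the sample sizes (including the 1,536-value float32 embedding) are illustrative assumptions, not anything Spiral publishes.

```python
# Band edges from the article; decimal units and inclusive edges are assumptions.
KB = 1_000
MB = 1_000_000
VALLEY_MIN = 1 * KB    # below this: comfortably a database row
VALLEY_MAX = 25 * MB   # above this: comfortably a file on disk


def size_band(size_bytes: int) -> str:
    """Return a hypothetical band label for an object of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < VALLEY_MIN:
        return "row-sized"
    if size_bytes <= VALLEY_MAX:
        return "uncanny valley"
    return "file-sized"


# Hypothetical sample objects, chosen only to illustrate the bands.
samples = {
    "database row (200 B)": 200,
    "embedding (1,536 float32 values)": 1536 * 4,
    "image (100 KiB)": 100 * 1024,
    "video (2 GB)": 2_000 * MB,
}

for name, size in samples.items():
    print(f"{name}: {size_band(size)}")
```

Under these assumptions the embedding and the image both land in the middle band, which is the article's point: the typical AI payload is neither a row nor a big file.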

The founders' bet

Three engineers and a hunch

The bet was placed by three people who had spent years inside one of the most demanding data platforms on earth. Will Manning, Rob Kruszewski, and Nick Gates met building infrastructure for Palantir Foundry, with stints at Citadel - places where data at impossible scale is a Tuesday, not a moonshot. They had felt the straw personally. So in 2024 they did the unglamorous, slightly reckless thing: they decided not to optimize the existing stack but to start at the file format, the lowest layer, and rebuild upward.

Will Manning - Co-founder & CEO
Rob Kruszewski - Co-founder
Nick Gates - Co-founder

Starting at the file format is a bit like deciding to fix traffic by reinventing the wheel. Ambitious. Possibly mad. But the logic holds: if every byte your GPU eventually sees has to pass through the format, then the format is where the bottleneck either lives or dies. Fix it there and everything above gets faster for free. So they wrote Vortex.

The product

Vortex, and the database on top of it

The open foundation

Vortex

A next-generation columnar file format and compression toolkit, written largely in Rust. Spiral claims Parquet-class compression with 10-20x faster scans, roughly 5x faster writes, and 100-200x faster random access. Its party trick: GPU-native decompression that streams data from object storage straight into GPU memory, no CPU middleman. Spiral handed it to the Linux Foundation.

The commercial layer

Spiral Database

Built on Vortex and object-store native from day one. It promises GPU-saturating throughput, unified governance across every data type, and "fearless" permissioning - granular, time-bounded, audited. One API handles everything from a tiny embedding to a massive video file, including the awkward middle where other systems quietly give up.

"Vortex is the first storage format designed for direct GPU decompression - loading training data straight from object storage into GPU memory." - The headline feature, in one line

Giving away your core technology sounds like a strange way to build a business, until you remember that Parquet became the default precisely because it was free, neutral, and everywhere. Spiral is playing the same card. Vortex lives at the LF AI & Data Foundation as an incubation project, racking up around 3,000 GitHub stars and 90-odd releases, with integrations for Arrow, DataFusion, DuckDB, Spark, Pandas and Polars. The format wins hearts; the managed platform pays rent.

Milestones

How a file format became a company

2024

Quietly begins

Will Manning, Rob Kruszewski and Nick Gates leave the Palantir/Citadel orbit and start building - from the file format up.

AUG 2025

Vortex goes to the Linux Foundation

The LF AI & Data Foundation announces it will host Vortex to power high-performance data access for AI and analytics.

SEP 2025

Out of stealth, $22M in hand

Spiral emerges publicly with combined Seed and Series A financing led by General Catalyst and Amplify Partners.

JUN 2026

Still shipping, fast

Vortex passes 90 releases (v0.74.0 on June 2, 2026) as adoption interest grows from the likes of Microsoft, Snowflake and Palantir.

The argument, in numbers

Why the format matters

Vortex vs. Apache Parquet
Relative speed-up claimed by Spiral. Parquet baseline = 1x. Longer is faster.
Random reads: up to 200x
Scans: 10-20x
Writes: ~5x
Parquet: 1x (baseline)
Bars are scaled for legibility, not to a single linear axis - random reads dwarf everything else. The point isn't the exact multiple; it's that a GPU rated for ~4 million 100KiB images per second can finally get close to that number instead of waiting on a CPU. Source: Spiral / Vortex benchmarks.
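
The caption's figure squares with the "terabits per second" claim, as a quick back-of-envelope check in Python suggests. The 4 million images per second and the 100 KiB image size are the caption's numbers; the function name is ours, and the calculation deliberately ignores protocol and decompression overhead.

```python
KIB = 1024  # bytes in one kibibyte


def required_bits_per_second(items_per_second: int, item_bytes: int) -> int:
    """Raw bandwidth needed to deliver a stream of equally sized items."""
    return items_per_second * item_bytes * 8


images_per_second = 4_000_000  # figure quoted in the chart caption
image_bytes = 100 * KIB        # 100 KiB per image, also from the caption

bits_per_second = required_bits_per_second(images_per_second, image_bytes)

print(f"{bits_per_second / 1e12:.2f} Tbit/s")    # prints 3.28 Tbit/s
print(f"{bits_per_second / 8 / 1e9:.1f} GB/s")   # prints 409.6 GB/s
```

Roughly 3.3 terabits per second, in other words: a number that only makes sense if nothing slow sits between storage and the GPU.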

The proof

Who's watching

Money is one kind of proof. Spiral's $22 million came from General Catalyst and Amplify Partners, two firms not known for funding science projects. Adoption is another. The customers Spiral points to are the teams who feel the straw most acutely: computer vision, robotics, and multimodal AI - shops whose data is video and sensor streams and embeddings, not neat rows of numbers. The most telling signal may be that Microsoft, Snowflake and Palantir are reportedly paying attention to the open format. When the incumbents you're routing around start watching your repo, you're either a threat or a feature.

"A ground-up reimagining of the database for the genAI era." - Amplify Partners, on why they invested

The mission

Keep the machines fed

Strip away the format wars and the three-ages framing and Spiral's mission is almost humble: stop wasting the most expensive hardware in the building. An H200 runs around $2 an hour, every hour, whether or not it's actually computing anything. Every minute it spends starved for data is money set on fire for warmth. Spiral wants governance and security to come along for free rather than as the usual tax you pay in speed - "fearless permissioning," they call it, which is a cheerful way of saying you shouldn't have to choose between fast and safe.

That last part matters more than it sounds. The reason teams tolerate slow, bolted-together pipelines is usually that the secure option is the slow option. Spiral's claim is that this was never a law of physics, just a consequence of using formats designed for a different reader. Remove that assumption and the trade-off dissolves.
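
As for the money set on fire by a starved GPU, a rough Python sketch puts a number on it. It assumes the article's figure of about $2 an hour and, purely hypothetically, a 1,000-GPU fleet that is busy 40% of the time over a 30-day month. The fleet size and utilization are made-up inputs for illustration, not Spiral's numbers.

```python
def idle_cost(hourly_rate: float, hours: float, utilization: float, gpus: int = 1) -> float:
    """Dollars spent on GPU hours in which no useful work was done."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return hourly_rate * hours * (1.0 - utilization) * gpus


HOURLY_RATE = 2.00         # approximate H200 rate quoted in the article
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month, billed around the clock

# Hypothetical fleet: 1,000 GPUs doing useful work 40% of the time.
wasted = idle_cost(HOURLY_RATE, HOURS_PER_MONTH, utilization=0.40, gpus=1_000)
print(f"${wasted:,.0f} per month spent on idle GPUs")
```

Swap in your own rate and utilization; the shape of the result is what matters, since idle cost scales linearly with every point of utilization lost.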

Why it matters tomorrow

Back to the idle GPU

Return to that data center. The GPU that was waiting is still there, but the picture has changed. With Vortex underneath, the data it needs streams in from cheap object storage directly into its memory, decompressed on the way, no CPU bottleneck doing the unpacking. The straw is gone. The machine reads at the speed it was always capable of, and the bill for all that idle silicon stops climbing.

Whether Spiral becomes the default for the AI era or simply forces the incumbents to finally rebuild their own foundations, the company has already made an uncomfortable point out loud: we spent a decade buying machines that read like gods and feeding them like it was still 1999. Spiral's wager is that someone was going to fix that, and it might as well be three engineers who got tired of watching the straw.

"Proud of what we've built, but even more excited for what we're going to do next." - Will Manning, on emerging from stealth
