Benjamin Wagner

Profile

The Query Engine Architect

There is a specific moment when a database engineer realizes their query engine is slow - not "needs optimization" slow, but "we're leaving entire orders of magnitude on the table" slow. Benjamin Wagner has spent his career since 2019 methodically closing that gap, first in the academic halls of the Technical University of Munich, then in the production trenches at Firebolt, the cloud data warehouse that has staked its entire identity on speed.

Wagner's entry point into databases was the Umbra research project at TU Munich - one of the more serious academic database systems in existence, notable for pairing a disk-based architecture with in-memory performance and for adaptive query processing. While still completing dual bachelor's degrees in Computer Science and Mathematics, he led the query processing team there, working on adaptive workload management. It was, to use a phrase he'd probably reject as too abstract, formative. The hands-on collision of theory and implementation left a mark.

An internship at Snowflake followed, also in query processing. Then, in May 2021, Firebolt. The company was barely two years old, building a cloud data warehouse from scratch and betting that the next generation of analytics needed something fundamentally different from what Redshift or BigQuery offered. Wagner walked in as one of the earliest engineering leaders and started building the thing that makes Firebolt different: its query engine.

"Saying 'our database is Postgres compliant' seems like it makes language choices very easy when building the system, but there's a lot of subtle details that really impact whether it's fun to use the system."

- Benjamin Wagner, Firebolt Engineering Blog

The technical specifics matter here more than the usual startup biography. Wagner's work at Firebolt concentrates on query planning, distributed query execution, and the kind of micro-optimizations that compound into macro performance advantages. One representative example from his writing: a single check of UTF-8 character composition before computing string length makes length() eleven times faster. Not two times. Eleven. That's the altitude at which he operates.
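To make the idea concrete, here is a minimal C++ sketch of that kind of check. It is not Firebolt's implementation, and the function names (`is_single_byte_only`, `count_code_points`, `utf8_length`) are hypothetical. It assumes valid UTF-8 input and only illustrates the shape of the trick: if every byte is a single-byte character, the character count is simply the byte count.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: true if every byte is below 0x80, meaning the string
// consists only of single-byte (ASCII) UTF-8 characters.
inline bool is_single_byte_only(std::string_view s) {
    unsigned char acc = 0;
    for (unsigned char c : s) acc |= c;  // branch-free scan over the bytes
    return (acc & 0x80) == 0;
}

// General path: count code points by counting every byte that is not a
// UTF-8 continuation byte (bit pattern 10xxxxxx). Assumes valid UTF-8.
inline std::size_t count_code_points(std::string_view s) {
    std::size_t n = 0;
    for (unsigned char c : s) n += (c & 0xC0) != 0x80;
    return n;
}

// length(): take the cheap answer when the single-byte check passes,
// otherwise fall back to decoding-aware counting.
inline std::size_t utf8_length(std::string_view s) {
    if (is_single_byte_only(s)) return s.size();
    return count_code_points(s);
}
```

This sketch makes no attempt to reproduce the 11x figure. In a real engine the check would presumably run over whole batches of strings, which is where a gain of that size becomes plausible.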

He published a technical blog series on what it actually takes to make a query engine Postgres-compliant - not just SQL-compatible, but behaviorally faithful to the quirks and edge cases that developers rely on in practice. The conclusion is not comforting: "there's a lot of subtle details that really impact whether it's fun to use the system." Each one requires a decision, and each decision compounds into the character of the database.

Firebolt Performance vs. Traditional Warehouses

Firebolt: 36x faster

Traditional cloud warehouse: baseline speed

Cost (Firebolt vs. traditional): 5% of the cost

Based on Firebolt's published benchmarks. Actual results vary by workload.

InkFuse is the other project that reveals how Wagner thinks. A side experiment he built and eventually turned into a published academic paper, it proposes a database runtime that doesn't choose between vectorized and compiled query execution - it does both, incrementally. The two approaches have historically been in tension: vectorized engines process data in batches through generic, SIMD-friendly primitives, while compiled engines generate specialized code paths for each query. InkFuse argues the distinction is artificial. The paper was co-authored during the same period Wagner was building production query engines at Firebolt, which is the kind of parallel track that only makes sense if you genuinely cannot stop thinking about the problem.
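The tension between the two styles is easier to see in code. The C++ sketch below is not taken from InkFuse. It is a hand-written illustration, with hypothetical function names and an arbitrary batch size, of how one query (sum of a + b over rows where a exceeds a threshold) might look in each style. The vectorized version chains generic primitives over batches. The "compiled" version is the single fused loop a code generator might emit for this specific query.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBatchSize = 1024;  // illustrative batch size (assumption)

// Vectorized primitive 1: write the indexes of rows with a[i] > threshold
// into sel and return how many rows qualified.
std::size_t select_greater(const std::int64_t* a, std::size_t n,
                           std::int64_t threshold, std::uint32_t* sel) {
    std::size_t k = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sel[k] = static_cast<std::uint32_t>(i);
        k += a[i] > threshold;  // advance only when the row qualifies
    }
    return k;
}

// Vectorized primitive 2: out[j] = a[sel[j]] + b[sel[j]] for the selected rows.
void add_selected(const std::int64_t* a, const std::int64_t* b,
                  const std::uint32_t* sel, std::size_t k, std::int64_t* out) {
    for (std::size_t j = 0; j < k; ++j) out[j] = a[sel[j]] + b[sel[j]];
}

// Vectorized primitive 3: sum the first k values.
std::int64_t sum_values(const std::int64_t* v, std::size_t k) {
    std::int64_t s = 0;
    for (std::size_t j = 0; j < k; ++j) s += v[j];
    return s;
}

// Vectorized style: pre-built primitives run batch by batch and hand
// intermediate results to each other through small buffers.
std::int64_t run_vectorized(const std::vector<std::int64_t>& a,
                            const std::vector<std::int64_t>& b,
                            std::int64_t threshold) {
    assert(a.size() == b.size());
    std::uint32_t sel[kBatchSize];
    std::int64_t tmp[kBatchSize];
    std::int64_t total = 0;
    for (std::size_t off = 0; off < a.size(); off += kBatchSize) {
        const std::size_t n = std::min(kBatchSize, a.size() - off);
        const std::size_t k = select_greater(a.data() + off, n, threshold, sel);
        add_selected(a.data() + off, b.data() + off, sel, k, tmp);
        total += sum_values(tmp, k);
    }
    return total;
}

// Compiled style: one loop specialized for this query, with no
// intermediate buffers between filter, addition, and aggregation.
std::int64_t run_fused(const std::vector<std::int64_t>& a,
                       const std::vector<std::int64_t>& b,
                       std::int64_t threshold) {
    assert(a.size() == b.size());
    std::int64_t total = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] > threshold) total += a[i] + b[i];
    }
    return total;
}
```

Both functions compute the same answer. The difference is where the work of specialization happens: ahead of time in reusable primitives, or per query in generated code.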

Apache Iceberg has become a defining focus. In November 2025, Wagner gave a CMU database seminar titled "Firebolt: Why Powering User Facing Applications on Iceberg is Hard." The title is not rhetorical. The gap between running analytical queries over a data lake and serving those results to end users in milliseconds involves a category of engineering problems - consistency, caching, partition management, low-latency reads - that most Iceberg discussions quietly skip. Wagner does not skip them.

"CPUs love operating on batches of data."

- Benjamin Wagner, on vectorized query execution

The CMU database group talks are worth noting in themselves. This is not a speaker circuit of polished executives. CMU's database seminar is where people like Andy Pavlo invite engineers to explain, in technical detail, how their systems actually work. Wagner has done this twice - once in December 2021 with "How We Build Firebolt," and again in 2025 with the Iceberg talk. The invitation itself signals something about how the academic database community regards his work.

His GitHub profile summarizes the situation with the kind of compression a database engineer would appreciate: "Building Database Systems at Firebolt. Interested in High-Performance Data Processing and Distributed Systems." That's the whole story, in one line. The InkFuse repo (55 stars, growing) is a C++ prototype that any serious database engineer will find interesting. AnyBlob, another project he contributed to, solves the specific problem of fetching data from cloud object storage as fast as possible for analytics - an underappreciated bottleneck that most teams work around rather than fix.

Firebolt Core, the free self-hosted edition launched in 2025, is the company's bet that developers will build on a fast foundation if the cost of getting started is zero. Wagner's engineering is what makes that foundation credible. The $269 million in total funding, the $1.4 billion Series C valuation at the January 2022 raise - these are downstream consequences of a query engine that does what it claims.

Watch

Benjamin Wagner Talks

How We Build Firebolt

CMU Vaccination Database Tech Talks, 2021 · Technical deep dive into Firebolt's architecture

Why Powering User Facing Applications on Iceberg is Hard

CMU Database Group Seminar · Apache Iceberg & analytics at scale

In His Own Words

Technical Observations

"Throwing errors early was extremely important for us."

"By checking whether there are only single-byte UTF-8 characters... length() becomes 11x faster."

"CPUs love operating on batches of data."

"Both using __builtin_add_overflow() and the manual PG-style check is about 4x slower."

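For readers who have not met the two techniques named in the last quote, the sketch below shows what each might look like for 64-bit signed addition. The function names are hypothetical. Reading "PG-style" as a manual range comparison in the spirit of PostgreSQL's fallback helpers is an assumption, and the quote as reproduced here does not state the baseline behind the 4x figure. `__builtin_add_overflow` is a GCC and Clang builtin, so this assumes one of those compilers.

```cpp
#include <cstdint>
#include <limits>

// Variant 1: compiler builtin (GCC/Clang). Returns true on overflow.
// Note: the builtin stores the wrapped result in *out even when it overflows.
inline bool add_overflow_builtin(std::int64_t a, std::int64_t b, std::int64_t* out) {
    return __builtin_add_overflow(a, b, out);
}

// Variant 2: manual range check before adding (assumed "PG-style").
// Returns true on overflow and leaves *out untouched in that case.
inline bool add_overflow_manual(std::int64_t a, std::int64_t b, std::int64_t* out) {
    constexpr std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
    constexpr std::int64_t kMin = std::numeric_limits<std::int64_t>::min();
    // For b > 0, kMax - b cannot overflow; for b < 0, kMin - b cannot overflow.
    if ((b > 0 && a > kMax - b) || (b < 0 && a < kMin - b)) return true;
    *out = a + b;
    return false;
}
```

Either check adds a branch or a flag test to every addition in a hot loop, which is the kind of per-row cost the quote is measuring.
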
Track Record

What He Has Built

⚡

Query Engine at Scale

Led the engineering behind Firebolt's query execution engine - the architecture behind the company's 36x speed claims at production scale.

🔬

InkFuse Research

Created InkFuse - an open-source experimental database runtime that unifies vectorized and compiled query execution. Published as an academic paper.

🏛

CMU Invited Speaker

Invited twice to Carnegie Mellon's database group seminar - one of the most selective technical forums in academic database systems.

🏔

Iceberg at the Edge

Pioneering the hard engineering work of serving user-facing applications from Apache Iceberg data lakes with millisecond latency.

🐘

Postgres Compliance

Authored a detailed technical series on what true Postgres compatibility requires - covering subtle behavioral edge cases most databases gloss over.

☁

AnyBlob

Contributed to AnyBlob - a cloud object storage download manager purpose-built to minimize latency for analytics workloads.

Details

Specifics Worth Knowing

2: Dual bachelor's degrees from TU Munich - one in Computer Science, one in Mathematics. Then a Master's in CS on top, completed while working at Firebolt.

11x: The speed gain Wagner extracted from a single string length optimization - by first checking UTF-8 character width before computing length.

55: GitHub stars on InkFuse - a prototype that turned into a real academic paper and a proof of concept for a new class of query engine architecture.

wagjamin: His GitHub handle - a reverse-compressed mashup of Wagner + Benjamin. The kind of naming choice that reveals a certain kind of mind.
