The open-source columnar database that turned the words "real-time analytics on a petabyte" from a sales-deck fantasy into a Tuesday morning query.
An SRE in Lisbon types a SELECT statement into a terminal. She is looking at 14 billion rows of fraud telemetry. The cursor blinks once. The result comes back in 480 milliseconds. She does not seem impressed, which is the highest compliment a database can receive.
That database is ClickHouse. It is open source, written in C++, deeply unfashionable in its love of columnar storage, and presently powering analytics inside Anthropic, Tesla, Sony, Lyft, Instacart, Meta, Mercado Libre, and roughly 2,994 other companies that pay for it. Until recently, most people outside data engineering had never heard of it. Then it raised $400 million at a $15 billion valuation, and the rest of the industry started taking notes.
Back in 2009, Yandex - then Russia's largest search engine - had a problem the rest of the internet would not catch up to for another decade. Yandex.Metrica, its web analytics product, was trying to serve interactive reports on billions of clicks per day. The available tools at the time were row-store databases that politely declined the workload, or proprietary OLAP cubes that demanded a tribute of overnight ETL.
The Metrica team needed something else. Specifically: a database that could scan a column of a few billion rows faster than a human noticed the page had loaded. So a young engineer named Alexey Milovidov did the unfashionable thing. He started writing one.
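Why a column scan can be that quick is easiest to see in miniature. The sketch below is a toy illustration in Python, not ClickHouse's actual storage engine: the event fields (`user_id`, `url_id`, `duration_ms`) and the function names are hypothetical, invented for the example. It only shows the core layout idea, that an aggregate over one field needs to read one column rather than every whole record.

```python
from array import array

# Hypothetical click events: (user_id, url_id, duration_ms).
# Field names and values are illustrative, not taken from Yandex.Metrica.
rows = [(i % 1000, i % 50, (i * 7) % 3000) for i in range(100_000)]


def total_duration_row_store(records):
    """Row layout: fields of a record sit together, so summing one
    field still walks every whole record."""
    return sum(record[2] for record in records)


# Column layout: each field lives in its own contiguous typed array.
columns = {
    "user_id": array("I", (r[0] for r in rows)),
    "url_id": array("I", (r[1] for r in rows)),
    "duration_ms": array("I", (r[2] for r in rows)),
}


def total_duration_column_store(cols):
    """Column layout: the same sum touches only the one column it needs."""
    return sum(cols["duration_ms"])


# Bytes a column scan has to read, versus the size of the whole table.
bytes_scanned_column = columns["duration_ms"].itemsize * len(columns["duration_ms"])
bytes_total = sum(c.itemsize * len(c) for c in columns.values())

print(total_duration_row_store(rows) == total_duration_column_store(columns))
print(f"column scan reads {bytes_scanned_column} of {bytes_total} bytes")
```

With three equally sized columns the scan reads a third of the table. On a real analytics table with dozens or hundreds of columns, the same query would touch a far smaller fraction, which is the gap the Metrica team was after.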
By 2012, his prototype - eventually named ClickHouse, for Clickstream Data Warehouse - was running Yandex.Metrica in production. Four years later, in 2016, Yandex open-sourced it under Apache 2.0. The decision was treated, internally, as roughly equivalent to giving away the family silver. It turned out to be one of the great open-source bets of the decade.
In 2021, three things converged. The open-source project had quietly accumulated tens of thousands of GitHub stars and a small army of contributors. Geopolitics made it untenable to keep a globally important piece of infrastructure inside a Russian conglomerate. And Aaron Katz, an enterprise software executive most recently from Elastic, decided he wanted to commercialize it.
Katz called Yury Izrailevsky, formerly of Google and Netflix. They called Milovidov. They incorporated ClickHouse, Inc. in Delaware, set up shop in Portola Valley, and convinced Index Ventures and Benchmark to write the first cheque: $50 million in September 2021. A month later, Coatue and Altimeter led a $250 million Series B at a $2 billion valuation. The company had a product to sell before it had a logo on a business card.
Milovidov starts an experimental columnar engine inside Yandex.
ClickHouse runs Yandex.Metrica. Nobody outside Moscow notices.
Released under Apache 2.0. The community arrives.
ClickHouse, Inc. is founded. $50M Series A in September; $250M Series B in October.
ClickHouse Cloud goes GA on AWS, then GCP, then Azure.
Paid customer count crosses four digits.
Khosla leads at $6.35B. HyperDX acquired for observability.
$400M Series D from Dragoneer. 3,000+ customers. ~$160M ARR.
The open-source database is free, fast, and somewhat famous for the fact that you can install it on a laptop and have it out-perform whatever your company is currently paying six figures for. The commercial company makes its money the way most open-core companies do: by running the thing for you, very well, and charging by the byte and the query.
ClickHouse (open source): The Apache 2.0 columnar database. Sub-second OLAP on petabytes. Self-hosted. Free forever.
ClickHouse Cloud: Fully managed, serverless ClickHouse on AWS, GCP, Azure. Separation of compute and storage. Autoscaling. BYOC option.
chDB: ClickHouse, in-process, inside Python. Analytical SQL inside your notebook with zero infrastructure.
HyperDX: Acquired in 2025. Logs, traces, metrics, and session replay - all sitting on top of ClickHouse.
Numbers, in the order investors care about them. Annualised revenue went from approximately $5 million at the end of 2022 to roughly $160 million at the end of 2025. The growth rate is the headline. The fact that it is real revenue from real enterprises, not credits being burned, is the small print that matters.
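For readers who want the headline as a single figure, the compound annual growth rate falls out of the two endpoints. This is back-of-envelope arithmetic on the approximate numbers quoted above, assuming exactly three years between the end of 2022 and the end of 2025; the variable names are just for illustration.

```python
# Approximate figures from the text; both are rounded, so the result is too.
start_arr = 5_000_000      # annualised revenue, end of 2022
end_arr = 160_000_000      # annualised revenue, end of 2025
years = 3

multiple = end_arr / start_arr          # how many times larger revenue became
cagr = multiple ** (1 / years) - 1      # compound annual growth rate

print(f"{multiple:.0f}x over {years} years, about {cagr:.0%} per year compounded")
# prints: 32x over 3 years, about 217% per year compounded
```

Put differently, revenue a little more than tripled every year for three years running.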
The customer roster matches the growth curve. Anthropic uses ClickHouse to keep an eye on model-serving infrastructure. Tesla runs analytics on what is probably the largest fleet telemetry dataset on Earth. Sony, Lyft, Instacart, and Memorial Sloan Kettering all sit on the same database. The unifying theme is volume: when you have too much data to fit comfortably in a row store, ClickHouse is the place most engineering teams end up.
Ask Aaron Katz why ClickHouse exists and he gives the official answer: to make real-time data analytics accessible to every organisation in the world. Ask Milovidov and you get something closer to the engineering truth: queries should return in milliseconds, because everything else is a failure of imagination.
Both answers are correct. The company's bet is that the future of analytics is interactive, not batch. That AI workloads - model evaluation, retrieval pipelines, observability for agentic systems - will generate datasets large enough to break every existing tool. And that the company holding the fastest, cheapest, most boring engine when the dust settles wins a remarkably large prize.
There is a quiet pattern in this AI cycle worth noticing. The companies training and serving frontier models do not just need GPUs. They need somewhere to put the telemetry, the evals, the traces, the cost data, the prompt logs, the safety signals, the user behaviour. That somewhere needs to be cheap enough to keep everything, fast enough to query in real time, and stable enough that nobody has to think about it.
That somewhere is increasingly ClickHouse. Anthropic uses it. So does a long list of AI infrastructure companies that prefer not to be quoted. The analytics database, it turns out, may be one of the boring picks-and-shovels businesses of the AI boom - the kind of company that gets quietly essential and then surprises everyone with a $15 billion valuation.
Whether that valuation holds is a question for time and Snowflake's lawyers. What is harder to argue with is the engine itself. It is fast. It is open. It is in production at a meaningful slice of the internet. And the next time someone tells you analytics has to be slow, you can quietly point at a SELECT statement in Lisbon, returning in 480 milliseconds.