The company that wrote Apache Druid, built a unicorn on top of it, and is now taking on observability. Real-time analytics, but make it serious.
The Imply team, celebrating 200 employees - more of a "we built the internet's analytics layer" vibe than a typical startup photo.
Somewhere in the architecture of Reddit's traffic dashboards, Atlassian's observability stack, and Cisco's network analytics platform, there is a database doing something quietly extraordinary. It is ingesting millions of events per second, storing them across distributed nodes, and returning query results in milliseconds. Most engineers using it have no idea who built it. Those who do know the answer: Imply.
Imply is a San Francisco-based data platform company built on Apache Druid - the open-source real-time analytics database that its founders co-authored before there was a company at all. Since incorporating in 2015, Imply has turned that technical foundation into $215M in venture funding, unicorn status at a $1.1B valuation, and a cloud product - Imply Polaris - that aims to give any engineering team access to what once required an entire data infrastructure team and significant budget.
"Analytics in Motion" - the phrase Imply uses to describe its core thesis. They are not wrong. The question is whether the market moves fast enough to catch up.
Here is the problem with traditional data warehouses: they were designed around the assumption that you already know which questions you will ask. You define your schema, load your historical data in batches, run your scheduled reports. That model works fine if you are analyzing last quarter's sales figures.
It does not work if you are trying to understand what is happening on your platform right now. A fraud detection system that runs hourly reports is useful the way a smoke detector with a weekly battery check is useful. An observability platform that can only query the last 30 days of logs is not really an observability platform. The world moved to real-time. Most databases did not follow.
The specific failure mode Imply was built to address: the gap between when data is generated and when it can be queried. In legacy architectures, that gap could be hours. For streaming data - user events, IoT sensors, network telemetry, financial transactions - hours might as well be geological epochs.
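To make that gap concrete, here is a minimal sketch of the worst-case delay between an event being generated and that event being queryable. The `Pipeline` type, the function name, and every number in it are illustrative assumptions, not measurements of Druid or of any particular warehouse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical description of how events reach a queryable store."""
    name: str
    load_interval_s: float     # time between loads; 0 for continuous streaming
    processing_delay_s: float  # time from load start until data is queryable


def worst_case_staleness_s(p: Pipeline) -> float:
    """Longest an event can wait before it becomes queryable.

    An event that arrives just after a load starts waits one full interval
    for the next load, then waits for that load to finish processing.
    """
    return p.load_interval_s + p.processing_delay_s


# Illustrative numbers only: an hourly batch job that takes 10 minutes to run,
# and a streaming path assumed to make events queryable within about a second.
hourly_batch = Pipeline("hourly batch", load_interval_s=3600, processing_delay_s=600)
streaming = Pipeline("streaming", load_interval_s=0, processing_delay_s=1)

for p in (hourly_batch, streaming):
    print(f"{p.name}: up to {worst_case_staleness_s(p):.0f} s before an event is queryable")
```

Under these assumed numbers the batch path can leave an event invisible for over an hour, while the streaming path keeps the gap to about a second. That difference is the one the paragraph above describes.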
"The question isn't whether you need real-time analytics. The question is whether your database can keep up with the data generating it."
- Imply product positioning

Apache Druid, the technology Imply's founders built, was designed specifically to close that gap. Streaming and batch ingestion in the same engine. Sub-second queries across years of historical data. High-concurrency reads without the performance cliff that kills most analytical databases when load increases.
The insight was not subtle: if you are going to build analytics infrastructure for the modern internet, you have to assume the data is always arriving, always changing, and the queries will never stop.
The founding story of Imply is unusual because the main technical asset - Apache Druid - existed before the company did. Fangjin Yang, Gian Merlino, Vadim Ogievetsky, and Eric Tschetter co-created Apache Druid while working at Metamarkets, a digital advertising analytics company. When Metamarkets was acquired, the team had a choice: take their database expertise to a large company, or build something new around the technology they had spent years developing.
The bet was specific: open source would win in the database market, but open source alone doesn't pay salaries or build enterprise sales teams. The play was to open-source the core technology, build a community around it, and then offer a managed cloud version for enterprises that wanted the capabilities without the infrastructure burden. It's a model that Elastic, HashiCorp, and Confluent have all used with varying degrees of success. Imply did it in one of the most technically demanding segments of the market.
Andreessen Horowitz backed Imply in four consecutive funding rounds. That kind of follow-on conviction is either very smart or very stubborn. Given the $1.1B outcome, probably the former.
2015: Fangjin Yang, Gian Merlino, and Vadim Ogievetsky incorporate Imply Data, Inc. in San Francisco. First product: professional Apache Druid support and distribution.
2016: Andreessen Horowitz and Khosla Ventures write the first check. Apache Druid gains traction in enterprise as a production analytics database.
2019: Druid becomes a top-level Apache Software Foundation project. Imply raises $30M from a16z, Geodesic Capital, and Khosla at a ~$350M valuation.
2021: Bessemer Venture Partners leads $70M Series C. Imply Polaris - the fully managed cloud DBaaS - launches publicly. Eric Tschetter rejoins as Field CTO.
2022: Thoma Bravo leads $100M Series D in May. Valuation hits $1.1B. Imply joins the unicorn club. Total funding reaches $215.3M.
Imply Polaris expands to Microsoft Azure Marketplace. Customer count crosses 100. ARR estimated at approximately $63M. Druid Summit returns to San Francisco.
September 2025: Imply launches Lumi, described as the industry's first Observability Warehouse. Integrations with Splunk, Grafana Labs, Tableau, and AI assistants including Claude and LangChain.
Imply Polaris: Fully managed database-as-a-service on AWS and Azure. Sub-second queries at petabyte scale. Native streaming integrations with Confluent Cloud, Kafka, and Kinesis. Zero infrastructure management.
Imply Lumi: The Observability Warehouse. A high-performance, cost-efficient data layer that connects to Splunk, Grafana Labs, Tableau, and AI tools. SPL-compatible for Splunk users who want Druid's economics.
Self-managed deployment: For organizations that need on-premise or hybrid deployment of Apache Druid. Enterprise-grade support, security compliance, and SLAs for teams that cannot move to cloud.
Apache Druid: The engine underneath everything. 10,000+ GitHub stars. Streaming and batch ingestion, sub-second queries, high concurrency by design. Imply's team remains the primary contributor.
The cleanest proof that Imply's technology works is the customer list. Reddit uses it for traffic analytics. Atlassian uses it for observability. Cisco ThousandEyes - a network intelligence platform that processes enormous volumes of telemetry data - uses it in production. These are not "pilot customer" logos on a slide deck. They are engineering teams that chose Druid-based analytics in a market with several well-funded alternatives.
The funding arc is worth studying. a16z came in at Series A in 2016, returned for B and C, and stayed in at D. Bessemer led the Series C in 2021. Then Thoma Bravo - primarily a private equity firm known for taking software companies private - led the Series D in 2022 at a $1.1B valuation. That is not a typical venture trajectory. It suggests investors were seeing something in the unit economics or the category scale that moved the conversation from "interesting startup" to "platform company."
Imply Polaris was named "Best Open-Source Cloud Solution" in the Cloud Awards. A detail that sounds like marketing until you realize the competition in that category includes some very serious companies.
Imply competes against companies that are considerably larger, better funded, or both. Snowflake, Databricks, and ClickHouse all offer overlapping capabilities in the analytical database market. In observability specifically, Splunk (now Cisco), Datadog, New Relic, and Elastic Stack represent incumbents with deep enterprise relationships.
| Company | Strength | Imply's Edge |
|---|---|---|
| Imply / Apache Druid | Real-time streaming + historical, sub-second latency | This is the product |
| Snowflake | Broad ecosystem, SQL-first, BI integrations | Not built for streaming; batch-first architecture |
| Databricks | ML/AI workflows, open Lakehouse format | Latency trade-offs; heavier operationally |
| ClickHouse | Extremely fast OLAP, growing open-source community | Less mature managed cloud; different architecture for streaming |
| Splunk / Cisco | Massive installed base in observability | High cost; Imply Lumi positioned as the cheaper Splunk data layer |
| Datadog | Full-stack observability, strong brand | Expensive at scale; Imply offers cost-optimized alternative |
The honest version: Imply is not the only fast analytical database, and it is not the only company building on open-source infrastructure. ClickHouse recently hit a $15B valuation. The category is competitive. Where Imply has an argument is the combination of streaming-native ingestion, open-source credibility with Apache Druid, and the Observability Warehouse angle with Lumi - which targets a specific, expensive pain point (Splunk costs) that many enterprise teams want solved.
The original scene: Reddit's engineers watching traffic spikes in real time during a major news event. Cisco's team monitoring network anomalies across global infrastructure. A fintech company watching transaction patterns for fraud signals. In each case, someone built an analytics layer on top of Apache Druid, and it worked because it was designed for exactly this - data that never stops arriving and questions that cannot wait for a batch window.
Imply did not invent real-time analytics. They did build the most widely adopted open-source engine for it, and then built a company around making that engine accessible. The $215M in funding is a bet that this problem - turning streams of events into interactive queries at scale - is both large enough and hard enough to justify a dedicated platform company.
The Imply Lumi announcement in September 2025 suggests the company is extending that thesis. If Druid is the database that powers your real-time analytics, and Polaris is the managed cloud version, then Lumi is the move into observability infrastructure - a market where the incumbents charge a lot and enterprises are actively looking for alternatives. Whether Imply can compete in that space against Splunk and Datadog while maintaining its core database business is the open question.
Back to where we started: a distributed database, quietly doing the work inside infrastructure you use every day. The engineers at Imply built it. Most people will never notice. That's the point.
The founding team wrote Apache Druid before they had a company. They built Imply as the commercial layer around it. They expanded from database to cloud to observability warehouse. The arc is coherent. Whether it becomes the defining analytics platform of the next decade or a very good database company that exits to a large infrastructure player, both outcomes remain plausible. The technology is real. The customers are real. The funding certainly is.