Imply hits $63M ARR in 2024 - up 77% year-over-year / Fangjin Yang: 2023 Datanami Person to Watch / Apache Druid: from ad-tech hack to Apache Software Foundation / Imply raises $100M Series D - unicorn at $1.1B / Twitter bio: "Aspiring rapper." Day job: running a $215M-funded analytics company / Netflix, Atlassian, Salesforce, Confluent all run on Imply / Total funding: $215.3M across 5 rounds since 2015 / Fangjin Yang - a16z scout and angel investor since 2021
Co-Founder & CEO, Imply  /  Original Author, Apache Druid

Fangjin Yang

Built the database. Founded the company. Raised $215M. Still has "aspiring rapper" in his Twitter bio.

Apache Druid  /  $1.1B Unicorn  /  San Francisco  /  Series D  /  a16z Scout
ARR (2024): $63M
Total Raised: $215M
Valuation: $1.1B
Years on Druid: 13+
Customers: 100+
YoY Growth: 77%

The engineer who wouldn't wait for a query

In 2011, programmatic advertising was eating the internet - and it was eating it fast. Bidding decisions needed to happen in milliseconds. Analytics needed to keep pace. At a startup called Metamarkets, Fangjin Yang and a small team built what would eventually become Apache Druid because nothing else was fast enough.

That was not a product launch. It was an act of engineering necessity. They were not building a database company. They were building a tool for clients who needed real-time analytics on ad campaigns and could not afford to wait three seconds for a dashboard to refresh. What they built turned out to be useful for everyone else too.

Yang has spent over a decade on the same core problem - how do you give thousands of users simultaneous, interactive, sub-second access to enormous datasets? Most analytics tools give you a batched answer. Yang wanted to give you a live one. He still does.

"When we started Druid, there just weren't that many databases that were really specialized at powering these different forms of data applications, where you could have thousands or tens of thousands of users."
- Fangjin Yang

From Waterloo to the Apache Foundation

Yang trained as an electrical and computer engineer at the University of Waterloo - two degrees, both applied science. He came out of one of Canada's most technically rigorous universities and moved into software engineering, first at Cisco, then at Metamarkets.

Metamarkets is where the story bends. The company was building a user-facing analytics engine for programmatic advertising firms. Clients needed millisecond response times. The existing database options - Hadoop, traditional RDBMS, early columnar stores - were too slow or too rigid. So Yang and his colleagues built their own: a distributed, columnar, in-memory data store with automatic time-partitioning and aggressive indexing.
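
To make the time-partitioning idea concrete, here is a minimal Python sketch. It is not Druid's implementation; the class name, the hourly segment size, and the sample campaign data are all assumptions chosen for illustration. It shows only the general technique: events are grouped into per-hour columnar segments, and a query over a time range skips every segment that cannot overlap that range.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour (the segment key)."""
    return ts.replace(minute=0, second=0, microsecond=0)


class TimePartitionedStore:
    """Toy store (hypothetical): one columnar segment per hour of event time."""

    def __init__(self) -> None:
        # segment start -> {column name -> list of values}
        self.segments = defaultdict(lambda: defaultdict(list))

    def ingest(self, ts: datetime, campaign: str, clicks: int) -> None:
        seg = self.segments[hour_bucket(ts)]
        seg["ts"].append(ts)
        seg["campaign"].append(campaign)
        seg["clicks"].append(clicks)

    def total_clicks(self, start: datetime, end: datetime):
        """Sum clicks in [start, end); return (total, segments_scanned)."""
        total, scanned = 0, 0
        for seg_start, cols in self.segments.items():
            # Prune: skip segments that cannot overlap the query interval.
            if seg_start >= end or seg_start + timedelta(hours=1) <= start:
                continue
            scanned += 1
            # Only the two columns the query needs are read.
            for ts, clicks in zip(cols["ts"], cols["clicks"]):
                if start <= ts < end:
                    total += clicks
        return total, scanned


# Illustrative data: two events per hour for one day, 10 clicks each.
store = TimePartitionedStore()
base = datetime(2011, 6, 1, tzinfo=timezone.utc)
for hour in range(24):
    for minute in (5, 35):
        store.ingest(base + timedelta(hours=hour, minutes=minute), "spring_sale", 10)

total, scanned = store.total_clicks(base + timedelta(hours=3), base + timedelta(hours=5))
print(total, scanned)  # 40 2
```

A two-hour query touches 2 of the 24 segments here. That pruning, multiplied across a cluster, is the kind of saving that time-based partitioning is meant to deliver.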

They named it Druid. Then they open-sourced it - not as a growth hack, but because they had been shaped by open-source software and felt the obligation to give something back. That decision changed the arc of everything. Druid spread to dozens of industries. Users found problems the Metamarkets team had never imagined. A community formed. Eventually, the Apache Software Foundation came calling.

"What you need is a very interactive, almost a 'Google-esque' experience... People want to do that with data as well."

- Fangjin Yang, on what Druid was designed to deliver

Building the company around the database

In 2015, Yang, Gian Merlino, and Vadim Ogievetsky founded Imply. The thesis was simple: Druid had proven itself in production. Real companies, real scale, real workloads. Now it needed a commercial wrapper - managed infrastructure, visualization, enterprise support, and cloud deployment.

They were backed initially by Khosla Ventures. The product combined a Druid backend with Pivot, a visualization engine that let non-engineers explore data directly. Over time, Imply Enterprise became the on-prem offering, Imply Hybrid bridged the gap, and Imply Polaris became their fully managed database-as-a-service - Druid in the cloud, without the ops burden.

The customer list reads like a shortlist of companies that have already figured out that real-time analytics is not optional. Netflix, Atlassian, Salesforce, Confluent - companies that handle enormous, fast-moving data and cannot afford dashboards that are twelve hours stale.

Imply by the Numbers
2011: Druid first built at Metamarkets
2015: Imply founded
Funding Rounds: 5
Employees: 180+
ARR Growth 2023→2024: 77%

The counterintuitive bet that paid off

Yang's read on what enterprises actually need from a database is different from the popular narrative. Streaming ingestion - the ability to ingest data in real-time from Kafka or Kinesis - gets most of the press coverage. But Yang noticed early that this was not the core value for most of his customers.

"Streaming ingestion is a very small part of the value that our customers actually get from the database. Half of our customers don't even use streaming ingestion."
- Fangjin Yang

What they actually wanted was a database that could serve a large number of concurrent users with sub-second query performance on billions of rows. A product that felt like Google Search applied to your own data. That specific problem - high concurrency, low latency, at scale - is what drove every architectural decision in Druid's design. Column-oriented storage. Automatic bitmap indexing. Aggressive pre-aggregation. Horizontal partitioning by time.

The technical choices were not accidents. They were deliberate answers to a question most database vendors were not asking.
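
Two of the design choices named above, pre-aggregation and bitmap indexing, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Druid's code: the event tuples, the 60-minute granularity, and the use of plain Python integers as bitmaps are all hypothetical stand-ins for what a real engine does with compressed bitmaps and on-disk segments.

```python
from collections import defaultdict

# Hypothetical raw events: (minute of day, country, device, impressions).
RAW_EVENTS = [
    (0, "US", "mobile", 1),
    (0, "US", "mobile", 1),
    (1, "US", "desktop", 1),
    (1, "CA", "mobile", 1),
    (61, "US", "mobile", 1),
    (62, "CA", "desktop", 1),
    (62, "CA", "desktop", 1),
]


def rollup(events, granularity=60):
    """Pre-aggregate: one row per (time bucket, country, device)."""
    sums = defaultdict(int)
    for minute, country, device, impressions in events:
        sums[(minute // granularity, country, device)] += impressions
    return [(*key, value) for key, value in sorted(sums.items())]


def build_bitmap_index(rows, column):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for i, row in enumerate(rows):
        index[row[column]] |= 1 << i
    return index


def matching_rows(bitmap):
    """Yield the row positions of the set bits in a bitmap."""
    i = 0
    while bitmap:
        if bitmap & 1:
            yield i
        bitmap >>= 1
        i += 1


rows = rollup(RAW_EVENTS)
country_index = build_bitmap_index(rows, column=1)
device_index = build_bitmap_index(rows, column=2)

# Filter "country = US AND device = mobile" with a single bitwise AND.
hits = country_index["US"] & device_index["mobile"]
us_mobile_impressions = sum(rows[i][3] for i in matching_rows(hits))
print(len(RAW_EVENTS), len(rows), us_mobile_impressions)  # 7 5 3
```

Rollup shrinks seven raw events to five stored rows without changing the answer, and the filter never scans a row that does not match. Both effects grow with data volume, which is why they matter for high-concurrency workloads.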

"Those problems at scale are incredibly difficult technical problems that take a group of data engineers a decade in order to do it well."

- Fangjin Yang, on why distributed analytics is still hard

Timeline

~2009: Software Engineer at Cisco - early career foundation in enterprise systems
2011: Lead Engineer at Metamarkets; co-creates Druid (later Apache Druid) to power real-time programmatic advertising analytics
2011-2014: Druid open-sourced; community grows across ad tech, finance, and enterprise software verticals
2015: Co-founds Imply with Gian Merlino and Vadim Ogievetsky; initial backing from Khosla Ventures
2021: Becomes an angel investor and a scout for Andreessen Horowitz (a16z) while leading Imply as CEO
2022: Imply raises $100M Series D; valuation crosses $1B; total funding reaches $215.3M
2023: Named Datanami Person to Watch; Imply Polaris cloud service scales enterprise customer base
2024: Imply reports $63M ARR - 77% year-over-year growth; 100+ enterprise customers

What he's built

  • Co-created Apache Druid, now governed by the Apache Software Foundation
  • Founded Imply in 2015; scaled to $1.1B valuation by 2022
  • Raised $215.3M in total funding across five rounds
  • Grew Imply to $63M ARR with 77% year-over-year growth in 2024
  • Built enterprise relationships with Netflix, Atlassian, Salesforce, and Confluent
  • Named Datanami Person to Watch in 2023
  • Active angel investor and a16z scout since 2021
  • Published technical writing with O'Reilly on analytics stack design and Druid architecture
  • Speaker at Data Council, O'Reilly, and enterprise data conferences worldwide

The details that don't fit the press release

01

His Twitter/X bio says "Aspiring rapper." He joined Twitter in June 2009 and has apparently kept the same bio through the entire journey from engineer to unicorn CEO.

02

Apache Druid was not planned as a product. It was built under deadline pressure at Metamarkets because no existing tool could handle millisecond analytics for programmatic ad bidding.

03

Yang open-sourced Druid before building a commercial product around it - a rare move driven by philosophical commitment to open source, not a go-to-market strategy.

04

His GitHub handle is simply "fjy" - just initials, no drama. The commit history goes back over a decade and follows the entire evolution of the Druid codebase.

05

He holds two engineering degrees from the University of Waterloo - a BASc in Electrical Engineering and an MASc in Computer Engineering - and now oversees sales, product, and strategy at a billion-dollar SaaS company.

06

Yang became an a16z scout in 2021 while still actively running Imply - investing in other people's ideas while scaling his own. Classic operator move.

Still solving the same hard problem

Yang's driving question has not changed since 2011: why can you search all of human knowledge in 0.3 seconds, but asking your own database a simple analytical question takes minutes? The gap between search and analytics felt like a product failure. Druid and Imply are his answers.

The bet is that every company will eventually run applications where users need to explore data interactively - not just engineers, not just analysts, but thousands of users at once. Customer-facing analytics. Operational dashboards. Embedded insights. That use case requires a fundamentally different database architecture than what most companies run today.

Whether it's the next decade of Imply Polaris scaling cloud deployments or the open source Druid community building new connectors and features, Yang is positioned at the center of the real-time analytics stack. He was there when the problem was invented. He is still building the solution.

"In its early days, Druid was adopted for a set of use cases in a handful of industries. Today, developers have shown its applicability across all industries - and the use cases have expanded exponentially."
- Fangjin Yang on the evolution of Apache Druid

The technical foundation

Apache Druid: Column-oriented, distributed, open-source OLAP database. Millisecond queries. Millions of events per second ingestion. Now in the Apache Software Foundation.
Imply Polaris: Fully managed Druid-as-a-service. No infrastructure ops. Automatic scaling. Built for teams that want Druid's performance without the cluster management.
Imply Enterprise: Commercial Druid with advanced visualization, enterprise security, and dedicated support. Used by Netflix, Atlassian, and Confluent.
Pivot / Clarity: Visual analytics interface that sits atop Druid. Makes high-cardinality data exploration possible without writing SQL for every question.
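
For a sense of what "writing SQL for every question" means in practice, the sketch below builds a request body for Druid's SQL HTTP API, which accepts a JSON object with a `query` field at `/druid/v2/sql`. The helper name, the `ad_events` datasource, and the `campaign` column are hypothetical, chosen only for illustration. The code builds the payload and stops there; it does not contact a cluster.

```python
import json

# Path of Druid's SQL HTTP API; the host and port depend on the deployment.
SQL_ENDPOINT_PATH = "/druid/v2/sql"


def top_values_query(datasource: str, dimension: str, limit: int = 10) -> dict:
    """Build a request body: top dimension values by event count over the last day.

    Hypothetical helper. Identifiers are interpolated directly, so pass only
    trusted names, never raw user input.
    """
    sql = (
        f'SELECT "{dimension}", COUNT(*) AS events\n'
        f'FROM "{datasource}"\n'
        "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY\n"
        f'GROUP BY "{dimension}"\n'
        "ORDER BY events DESC\n"
        f"LIMIT {limit}"
    )
    return {"query": sql}


# "ad_events" and "campaign" are made-up names for this example.
body = top_values_query("ad_events", "campaign", limit=5)
print(SQL_ENDPOINT_PATH)
print(json.dumps(body, indent=2))
```

Each new question (a different dimension, filter, or time window) means another query like this one, which is the repetition a point-and-click layer is meant to remove.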

Links & Resources