Who they are right now
It is a Tuesday morning at a Pinterest data center, and a TiDB cluster is doing the work of three databases without telling anyone. The transactions are landing. The analytics are running. Somewhere in the background, a vector search is helping an AI agent answer a question that did not exist five years ago. The team that built the database calls themselves PingCAP. The thing itself is called TiDB. The pitch has not changed since 2015: stop pretending a relational database has to be small, fragile, and sharded by hand.
PingCAP is a Sunnyvale-headquartered, Beijing-born open-source company with roughly 600 engineers, $341 million in venture capital, and a Gold Stevie Award on the shelf as of April 2026. The product is TiDB - a distributed SQL database that speaks MySQL on one end and scales out across cloud regions on the other. The company sells a managed version (TiDB Cloud) and lets you run the open-source version free, which is the rare combination of "give it away" and "still raise a Series D."
"We didn't build TiDB to compete with MySQL. We built it because we got tired of being on call for MySQL."
— paraphrased, every PingCAP engineer at every meetup, 2016–present
The problem they saw
Before PingCAP, the founders worked at Wandoujia, a Chinese app store running one of the planet's largest Redis clusters. Operationally, this is what is known in the industry as "bad news." The team kept hitting the same wall every fast-growing internet company hits: MySQL is wonderful right up until the moment your single instance becomes ten shards, and your ten shards become a war crime against future engineers.
So they read papers. Specifically: Google's Spanner and F1. Both describe a database that scales horizontally and still speaks SQL, with strong consistency. Both were unavailable to anyone not employed by Google. The founders did the obvious thing, which was also a slightly insane thing: build it themselves, in the open, and put it on GitHub.
The bet was that the world wanted what Google had but did not want to be Google to get it. In 2026, that bet looks obvious. In 2015, it sounded like a hobby project with a long horizon.
The founders' bet
Max Liu (CEO), Edward Huang (CTO), and Dylan Cui co-founded PingCAP in April 2015. They put the first commit of TiDB on GitHub the same year and shipped TiDB 1.0 in October 2017. The founders' theory was straightforward and slightly heretical: open source was not a marketing tactic. It was the product distribution channel. If TiDB was good, infrastructure engineers would adopt it, file issues, send pull requests, and eventually bring it into their employer's stack. The enterprise sales motion would arrive later, and politely.
"Open source isn't generosity. It's how you earn the right to be considered for the workload that pays the bills."
— the unwritten Bay Area / Beijing rulebook, circa 2018
It mostly worked. Sequoia and Matrix Partners China came in early. Fosun and Morningside Venture Capital led the $50 million Series C in 2018. GGV, 5Y, Access, and Anatole led the $270 million Series D in November 2020, with Coatue among the participants, valuing the company above $3 billion the following year. The cap table is unusual: a database company that is genuinely loved by developers and also taken seriously by growth-stage investors. Both groups, historically, are hard to please.
The product, plainly
TiDB is three things wearing one name. There is TiDB itself, the SQL layer. There is TiKV, the distributed key-value store underneath, now a graduated CNCF project. And there is TiFlash, a columnar storage engine that lets you run analytical queries against operational data without the polite fiction of an overnight ETL job. Together they form what the industry has spent the last decade calling HTAP - hybrid transactional and analytical processing - because acronyms make boring things sound load-bearing.
TiDB: Distributed SQL engine, MySQL-compatible. Drop-in for most apps. Scales out, not up.
TiDB Cloud: Managed DBaaS on AWS, GCP, Azure. Serverless, Dedicated and BYOC tiers.
TiDB X: The 2025 architecture overhaul. Object storage as the backbone. AI-ready by default.
TiKV: CNCF graduated. The transactional key-value layer powering TiDB - and other projects.
TiFlash: Columnar engine for analytics on live data. No ETL detour required.
AI Developer Toolkit: TiDB AI SDK, MCP Server, Reasoning Engine. Vectors, JSON and SQL in one query.
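For readers who think in code, here is a rough sketch of what "one database, two storage engines" can look like from the application side. It assumes any MySQL-compatible Python DB-API cursor that uses `%s` placeholders (PyMySQL is one such driver). The `orders` table, its columns and the `run_htap_demo` helper are hypothetical and exist only for illustration; `SET TIFLASH REPLICA` and the `READ_FROM_STORAGE` hint are TiDB SQL, and their exact behavior should be checked against the TiDB version in use.

```python
# Hypothetical schema and helper names; only the TiDB-specific SQL keywords
# (SET TIFLASH REPLICA, READ_FROM_STORAGE) come from TiDB itself.

CREATE_ORDERS = """
CREATE TABLE IF NOT EXISTS orders (
    id BIGINT PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    amount DECIMAL(12, 2) NOT NULL,
    created_at DATETIME NOT NULL
)
"""

# Ask TiDB to maintain one columnar copy of the table in TiFlash.
ADD_COLUMNAR_REPLICA = "ALTER TABLE orders SET TIFLASH REPLICA 1"

# Transactional write: an ordinary MySQL-style INSERT, stored in TiKV.
INSERT_ORDER = (
    "INSERT INTO orders (id, customer_id, amount, created_at) "
    "VALUES (%s, %s, %s, NOW())"
)

# Analytical read: the hint asks the optimizer to read the TiFlash copy.
DAILY_REVENUE = """
SELECT /*+ READ_FROM_STORAGE(TIFLASH[orders]) */
       DATE(created_at) AS day, SUM(amount) AS revenue
FROM orders
GROUP BY DATE(created_at)
ORDER BY day
"""


def run_htap_demo(cursor, order_id, customer_id, amount):
    """Run the DDL, one write and one analytical read on a DB-API cursor."""
    cursor.execute(CREATE_ORDERS)
    cursor.execute(ADD_COLUMNAR_REPLICA)
    cursor.execute(INSERT_ORDER, (order_id, customer_id, amount))
    cursor.execute(DAILY_REVENUE)
    return cursor.fetchall()
```

The point of the sketch is that the write and the analytical query target the same table through the same connection. In a real cluster the columnar replica takes some time to become available after the `ALTER TABLE`, so the analytical query may be served from the row store until it does.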
In October 2025, at SCaiLE Summit, PingCAP introduced TiDB X. The headline change: object storage becomes the backbone. The practical change: the database elastically reshapes itself based on the actual workload pattern, which is how every database has wanted to behave since the invention of databases. The other headline: vector search, knowledge graphs and JSON now live inside the same query engine, which is also how every AI engineer has wanted databases to behave for about eighteen months.
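To make "vectors, JSON and SQL in one query" concrete, the sketch below does the same work by hand in plain Python: filter rows on a JSON field, then rank the survivors by cosine distance to a query vector. The rows, the `docs` table name and the `nearest` helper are invented for illustration. The SQL string at the end is only an approximation of how such a query might be written, using the `VEC_COSINE_DISTANCE` function from TiDB's vector search feature; treat the exact syntax as an assumption to verify.

```python
import json
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical rows: a JSON metadata column plus a 3-dimensional embedding.
ROWS = [
    {"id": 1, "meta": '{"lang": "en"}', "embedding": [1.0, 0.0, 0.0]},
    {"id": 2, "meta": '{"lang": "en"}', "embedding": [0.0, 1.0, 0.0]},
    {"id": 3, "meta": '{"lang": "de"}', "embedding": [1.0, 0.0, 0.0]},
    {"id": 4, "meta": '{"lang": "en"}', "embedding": [0.9, 0.1, 0.0]},
]


def nearest(rows, query_vec, lang, k):
    """Filter on a JSON field, then return the ids of the k closest rows."""
    candidates = [r for r in rows if json.loads(r["meta"]).get("lang") == lang]
    candidates.sort(key=lambda r: cosine_distance(r["embedding"], query_vec))
    return [r["id"] for r in candidates[:k]]


# Approximate single-statement equivalent (syntax is an assumption to check):
EQUIVALENT_SQL = """
SELECT id
FROM docs
WHERE meta->>'$.lang' = 'en'
ORDER BY VEC_COSINE_DISTANCE(embedding, '[1,0,0]')
LIMIT 2
"""
```

Calling `nearest(ROWS, [1.0, 0.0, 0.0], "en", 2)` keeps the three English rows and returns the two whose embeddings point most nearly in the query's direction. When the filter and the ranking live in one engine, the application does not have to fetch candidates from one system and re-rank them in another.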
The proof, in numbers
The skeptic's question with open-source companies is always the same: nice repo, who pays you? PingCAP's answer is a list that has gotten less defensive over time. Pinterest. Plaid. Bolt. Atlassian. Square. Shopee. Flipkart. Dify. Manus. The fintechs use it because regulators ask hard questions about consistency. The marketplaces use it because Black Friday is not gentle. The AI companies use it because suddenly the database needs to do vector math, and they prefer not to bolt on a second system to do it.
"We picked TiDB because we wanted one database, not three. That is a boring answer. It is also the correct one."
— the kind of thing engineering leads say in conference talks
The mission
The official mission statement is a sentence about simpler, more reliable, infinitely scalable data infrastructure. The unofficial version, scribbled into commit messages and Slack channels, is closer to: nobody should have to write their own sharding logic in 2026. The two are the same idea, dressed differently. PingCAP is not trying to dethrone Postgres or Oracle. It is trying to make the boring middle of data infrastructure - the part where you scale, fail over, snapshot, replicate, query - someone else's problem. Specifically, theirs.
This is also why the AI pivot in 2025 was less of a pivot than a continuation. Agents need persistent memory. Persistent memory needs a database. The database needs to do vector search, structured queries, and not panic when ten thousand agents hit it at once. TiDB was already good at the last two. The first was an extension.
Why it matters tomorrow
If the next decade of software is going to be agentic, multi-tenant, multi-cloud and multi-modal, then the database underneath it has to be all four. Most aren't. TiDB has been quietly preparing for that future since before "agentic" became a LinkedIn adjective. The AI Developer Toolkit released in late 2025 - with the MCP Server, AI SDK and Reasoning Engine - is a serious attempt to make the database a first-class citizen of the agent stack, not a thing the agent talks to through a wrapper.
Will it work? The honest answer is: it might. Distributed databases are hard to displace once they are entrenched, and PingCAP is now entrenched. The risk is the usual one - open-source business models are tricky, hyperscalers have their own databases, and the AI infrastructure space is loud. The opportunity is also the usual one - the world has more data, more agents, and less patience for ETL than at any point in history. PingCAP is well placed for both.
Back to Tuesday morning
Return to that Pinterest data center. The TiDB cluster is still doing the work of three databases. But now you know why. You know who built it, and what they were running away from when they did. You know the bet they made on open source, the bet they made on Spanner-style architecture, and the bet they are making, right now, on databases that talk to AI agents.
The cluster does not care that you know. It just keeps serving the transactions, running the analytics, and answering the vector queries. That is the most flattering thing you can say about a piece of infrastructure: it is boring to use, interesting to read about, and almost impossible to live without once it is in place. PingCAP, for what it's worth, would consider that a compliment.
"The best databases are the ones you forget you have. The second best are the ones you remember to thank."
— old DBA proverb, possibly invented for this article