Somewhere in San Francisco, a phone rings. A voice answers in 380 milliseconds. It is not a person. The caller does not notice. That uneventful exchange is Vapi's entire business model.
Pick up the phone today and there is a non-trivial chance the voice you hear is glued together by a four-year-old company you have probably never read about. Vapi has crossed a billion calls. Its agents handle scheduling for hospitals, support for fintechs, drive-thru orders, and the kind of debt-collection calls that used to be done by people with weary scripts. The product is largely invisible. That is the point.
For a company whose technology is loud by definition, Vapi has been impressively quiet about itself. So we are going to do the talking.
Voice was supposed to be solved.
It was not. Speech-to-text got cheap. Large language models got fluent. Text-to-speech got eerie. And yet, every developer who tried to bolt these pieces together discovered the same thing: in voice, the silence is the product. A 900ms pause feels like the bot is broken. A 200ms pause feels like a person who is thinking. Between those two numbers lies an enormous amount of plumbing nobody wanted to write.
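To put rough numbers on that gap, here is a minimal sketch of a per-turn latency budget. Only the 200 ms and 900 ms thresholds come from the text above. The stage names and the per-stage timings are illustrative assumptions, not measurements from Vapi or any provider.

```python
# Illustrative only: stage names and timings below are assumptions, not measured values.
NATURAL_PAUSE_MS = 200   # feels like a person thinking (threshold from the article)
BROKEN_PAUSE_MS = 900    # feels like the bot is broken (threshold from the article)


def turn_latency_ms(stages: dict[str, int]) -> int:
    """Total silence the caller hears when the stages run strictly back to back."""
    return sum(stages.values())


def classify_pause(total_ms: int) -> str:
    """Bucket a pause against the two thresholds quoted in the article."""
    if total_ms <= NATURAL_PAUSE_MS:
        return "natural"
    if total_ms >= BROKEN_PAUSE_MS:
        return "broken"
    return "in between"


# A hypothetical naive pipeline: each hop waits for the previous one to finish.
naive = {
    "endpointing": 300,       # deciding the caller has stopped talking
    "speech_to_text": 200,
    "llm_first_token": 350,
    "text_to_speech": 150,
    "network": 100,
}

total = turn_latency_ms(naive)
print(total, classify_pause(total))
```

With these made-up figures the stages add up to 1,100 ms, which lands on the wrong side of the 900 ms line. Getting the total down is less about any single stage and more about overlapping them, which is the plumbing the paragraph above is pointing at.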
Telephony added a second layer of misery. SIP trunks. Carrier rules. Echo. Barge-in. Interruption handling. The polite fiction that any of this would Just Work if you wired up a few APIs. It did not.
A failed AI therapist that became a billion-call API.
Jordan Dearsley and Nikhil Gupta met at the University of Waterloo, where they shipped a YC-backed calendar app together that, charmingly enough, made money. In mid-2023, Dearsley wanted a therapist he could talk to on his daily walks. So he built one. He chained models, fought latency, and ended up with something that worked over a regular phone call.
The therapy product did not take off. Therapy is a hard business. The infrastructure under the therapy product, however, was something else entirely: a working real-time voice stack that other developers immediately wanted to borrow. Vapi pivoted into the picks-and-shovels business and never looked back. It launched publicly on Product Hunt in March 2024.
There is something almost suspicious about how clean the origin story is. A side project. A failed product. A discovered platform. It would feel apocryphal if the call counter were not at ten figures.
An API for the dial tone.
Strip Vapi to its core and it is a single, slightly outrageous proposition: give a developer one API and a credit card, and they will have a phone-answering agent on the internet in roughly the time it takes to make a cup of coffee. Underneath that promise sits a careful orchestration layer that swaps between speech-to-text engines, large language models, and text-to-speech voices on the fly, all while keeping median response time under half a second.
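What "swapping on the fly" might look like can be sketched in a few lines. This is not Vapi's API or its implementation. Every name here (`call_with_fallback`, `handle_turn`, the stub providers) is hypothetical, and the sketch assumes the simplest possible policy: try each provider in order and move on when one fails. A production stack would also stream between stages rather than wait for each to finish.

```python
import time
from typing import Callable, Sequence

# Hypothetical abstraction: a provider is any function from input text to output text.
Provider = Callable[[str], str]


def call_with_fallback(providers: Sequence[Provider], payload: str) -> str:
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider(payload)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(exc)
    raise RuntimeError(f"all providers failed: {errors}")


def handle_turn(audio: str, stt: Sequence[Provider], llm: Sequence[Provider],
                tts: Sequence[Provider]) -> tuple[str, float]:
    """Run one conversational turn and report how long it took in milliseconds."""
    start = time.perf_counter()
    transcript = call_with_fallback(stt, audio)
    reply = call_with_fallback(llm, transcript)
    speech = call_with_fallback(tts, reply)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return speech, elapsed_ms


# Stub providers standing in for real speech and language services.
def flaky_stt(audio: str) -> str:
    raise TimeoutError("primary speech-to-text timed out")


def backup_stt(audio: str) -> str:
    return f"transcript({audio})"


def stub_llm(text: str) -> str:
    return f"reply({text})"


def stub_tts(text: str) -> str:
    return f"audio({text})"


speech, elapsed_ms = handle_turn("hello", [flaky_stt, backup_stt], [stub_llm], [stub_tts])
print(speech, f"{elapsed_ms:.2f} ms")
```

The primary speech-to-text stub fails, the backup answers, and the caller never learns there were two. The elapsed time is what an orchestration layer would compare against its half-second target.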
Around it: telephony integrations, SDKs in the usual flavours, a CLI that engineers actually use, automated testing suites for voice agents (a category that barely existed two years ago), and the compliance accoutrements - SOC 2, HIPAA, PCI - that enterprises need before they will even take the meeting. Self-hosted speech models for the regulated paranoid. A/B testing for prompts and voices. Real-time transcripts and analytics.
What you can build with Vapi
- Inbound support - voice agents that answer your support line at 3am without sounding furious about it.
- Outbound campaigns - sales follow-ups, appointment reminders, surveys, payment collection.
- Verticals - healthcare scheduling, drive-thru orders, debt collection, recruiting screens, real-estate intake.
- Embedded voice - voice agents inside web apps, mobile apps, smart-home devices, and (yes) doorbells.
Receipts, kindly arranged.
Numbers are doing more of the talking here than press releases. As of May 2026, Vapi reports more than one million developers on the platform, north of 2.7 million voice agents created, and over a billion calls handled. Enterprise revenue, by their own telling, grew tenfold in the year leading into the Series B. Amazon's Ring picked Vapi over forty competing platforms to power voice features in its devices, which is the kind of bake-off result that tends to settle arguments. Other names on the customer list: ServiceTitan, New York Life, Intuit, Kavak.
The cap table tracks the same trajectory. Peak XV led the $50M Series B in May 2026, with Microsoft's M12, Kleiner Perkins, Bessemer Venture Partners, and Y Combinator all writing checks. Total raised: about $72M. Reported post-money: around $500M. Not a unicorn yet. Halfway there, with the runway and the customer logos to argue it is plausible.
Vapi, from dorm room to dial tone.
Where the noise is coming from.
Make voice as boring as the dial tone.
Ask the founders what they are doing and the answer collapses to a sentence: make it trivial for any developer to build human-level voice experiences. The technical translation: turn a stack that takes a team six months to build into an SDK that takes a single afternoon. The cultural translation: voice should not be a category, the same way "HTTP requests" is not a category. It should be a primitive.
Quiet ambition, that. The fashionable thing in AI right now is to call yourself an agent company and aim at the human. Vapi is aiming at the interface. There is a difference. One replaces workers; the other replaces forms.
The boring future is the bullish one.
If Vapi is right, in a few years a noticeable share of phone calls in the developed world will be at least partly handled by software, and most of those will be running on a handful of orchestration platforms. The interesting question is which ones. Vapi is betting that developers, not procurement teams, will pick the winner. That is the bet behind everything: the API-first packaging, the public docs, the GitHub-forward branding, the SDKs, the CLI.
It is also the bet behind hiring a hundred-and-seventy-odd people and pricing the company at half a billion dollars on $8M of run-rate revenue. The market is not paying for what Vapi is. It is paying for what voice becomes if voice becomes ambient.
Skeptics have plenty of ammunition. Voice AI is a crowded category. Twilio is large. OpenAI ships voice. Platform shifts in conversational AI happen monthly. Margins are thin until you own the model layer, which Vapi does not, at least not yet. The self-hosted speech models advertised on the marketing site hint that the company is thinking about it. There is a long road between a billion calls and a defensible business.
Still, here we are. Somewhere in San Francisco, a phone rings. A voice answers in 380 milliseconds. The caller, slightly later than usual, notices nothing. The infrastructure underneath the silence was once a side project for a man on a walk. Now it is the dial tone of a small but rapidly enlarging slice of the internet.
That is Vapi.