The pipe under the voice-AI internet.
It is 2026. A nurse in Sydney opens a telehealth app and starts a video visit. A logistics dispatcher in Ohio dials an AI agent that books a truck. A founder in Berlin pitches investors over a custom video room embedded in her own product. Three different industries, three different stacks, three different time zones - all of them, behind the scenes, are running on Daily.
Daily is the developer platform for real-time voice, video and AI. It sells two products that look unrelated and are not. The first is a global WebRTC mesh - the kind of plumbing most people only notice when it breaks. The second is Pipecat, an open-source framework for building voice agents, which Daily wrote, gave away, and watched the internet adopt as a default.
The company is 130 people. Headquarters: 548 Market Street, San Francisco. Quiet on social media. Loud in the GitHub issue tracker. Profitable-minded, in the way that pre-AI infrastructure companies tend to be when they have already done the hard work of staying alive for a decade.
Real-time was always going to be expensive. Until it wasn't.
For most of the last decade, embedding a live video call in a product meant one of two things: pay Twilio or Vonage, or hire a WebRTC engineer and prepare for a very humbling year. The first was expensive. The second was difficult. Neither was scalable for the people who actually wanted to build something new.
Daily noticed something else, too. The model providers were getting faster. Whisper could transcribe in something close to real time. GPT-4o could respond in under a second. Cartesia and ElevenLabs could synthesize speech with barely any perceptible delay. The bottleneck for conversational AI was no longer the model - it was the pipe. And the pipe was, charitably, a mess.
So Daily made a bet. If voice and video were going to be the default interface for AI agents, then somebody had to make the underlying transport boring. Boring as in: predictable, sub-second, globally available, and accessible through an HTTP call instead of a six-month integration project.
Kwindla Hultman Kramer has been waiting for this since the Media Lab.
Daily was founded in 2016 by Kwindla Hultman Kramer and Nina Kuruvilla. Kwin - everyone calls him Kwin - did graduate work at the MIT Media Lab on large-scale networked systems and real-time video. He had also, before Daily, co-founded Oblong Industries, the company that productized the gestural interface from Minority Report. He has been thinking about how humans and machines share screens, in real time, since most of today's voice-AI founders were in middle school.
The first incarnation of Daily was, of all things, a hardware company. Pluot, Inc. - the original name, still preserved in the Facebook URL - sold a small conferencing box for a few hundred dollars and a SaaS subscription. The boxes paid the bills. The bills paid for the patient, unsexy work of building a global WebRTC mesh.
In 2019, Daily launched its video API. In 2020, it raised $4.6M. In late 2021, Renegade Partners led a $40M Series B alongside Heritage Group, Cendana, Tiger Global, Slack Fund and Lachy Groom - bringing total funding to a publicly reported $66.8M. The company has not raised since. It has, instead, shipped.
Two products, one continuous bet.
Daily WebRTC
Global mesh; SDKs for web, iOS, Android and React Native; server-side recording; live RTMP/HLS; real-time transcription. The plumbing.
Pipecat (open source)
Python framework for voice and multimodal agents. 40+ model providers as plugins. SDKs in JS, React, iOS, Android, C++.
Pipecat Cloud
Managed hosting for production Pipecat pipelines. Autoscaling, observability, enterprise-grade infrastructure, the works.
Daily Prebuilt
A drop-in embeddable video room when you want to ship in an afternoon instead of next quarter.
The clever part is the interlock. Daily's WebRTC layer was already obsessed with sub-second latency, because video calls do not tolerate anything more. That obsession turns out to be exactly what voice agents need. So when Pipecat orchestrates a turn between a user, a speech-to-text engine, a language model and a text-to-speech engine, it can do so on a transport that was built to survive bad Wi-Fi at a coffee shop in Lagos.
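For readers who think in code, the turn described above can be sketched in a few lines of Python. This is a toy illustration, not Pipecat's actual API: the stage functions, the `run_turn` helper and the `audio:` string convention are all hypothetical stand-ins, and a real pipeline streams audio frames continuously rather than passing one finished string from stage to stage.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical stage type: takes one payload and returns the next one.
Stage = Callable[[str], Awaitable[str]]


async def speech_to_text(audio: str) -> str:
    # Stand-in for a real speech-to-text service.
    # "Audio" here is just a string tagged with an "audio:" prefix.
    return audio.removeprefix("audio:")


async def language_model(text: str) -> str:
    # Stand-in for a language model call; it simply echoes the transcript.
    return f"You said: {text}"


async def text_to_speech(text: str) -> str:
    # Stand-in for a speech synthesizer; it re-tags the reply as "audio".
    return f"audio:{text}"


async def run_turn(user_audio: str, stages: List[Stage]) -> str:
    # Pass the payload through each stage in order, awaiting each result.
    payload = user_audio
    for stage in stages:
        payload = await stage(payload)
    return payload


def demo() -> str:
    # One full turn: user audio in, agent audio out.
    stages = [speech_to_text, language_model, text_to_speech]
    return asyncio.run(run_turn("audio:book a truck", stages))
```

Every `await` in that loop is time the user spends listening to silence, which is why the transport carrying the audio in and out matters as much as the stages themselves.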
Pipecat itself is given away. Daily makes money when developers want it run for them - which, in production, most of them do.
A nine-year apprenticeship in patience
[Chart: Growth, when nobody was looking. Source: Daily's Series B announcement. Multiples reported by the company; we have not audited the spreadsheet.]
The customers you already use.
The case studies on Daily's site read like a list of products you might already be paying for. AppFolio uses it for property management video tours. Pitch uses it for live presentation calls. Kumospace built its entire spatial-audio office on it. HotDoc routes telehealth visits through it. Teamflow built its remote workplace on it. Cresta and Epic, in different ways, depend on the AI side of the stack.
The Pipecat side is harder to count. Open source adoption rarely fits into a press release. But when NVIDIA chose a framework to anchor its Voice Agent AI Blueprint - the reference implementation it gives to enterprises building production voice agents - it chose Pipecat. That is not a small endorsement.
Make real-time boring, so the interesting stuff can happen on top.
Ask Kwin what Daily is building and he will not say "the voice AI company". He will say something closer to: a layer of infrastructure that should already exist, and once it does, will free a generation of developers to think about the actual conversation rather than the network conditions in Lagos.
That worldview - infrastructure as a public good - is why Pipecat is open source instead of a closed product. It is also why Daily is an unusually active participant in WebRTC standards work. The bet is that the more people can build on top of this stack, the more valuable the part Daily charges for becomes. So far, the bet is working.
Every AI product ends up needing this.
In 2024, voice was a feature. In 2025, it became a product category. In 2026, it is becoming the default mode of interaction for entire verticals - healthcare intake, customer support, robotics, in-car assistants, language learning, accessibility, the long tail. Every one of those products needs three things: a model, a turn-detection layer, and a real-time transport that does not fall apart.
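Of those three, turn detection is the least familiar, so here is a minimal sketch of the idea. It assumes the crudest possible approach: treat each audio frame as a single loudness number and declare the turn over after a run of quiet frames. The function name `find_turn_end`, the threshold and the frame counts are all hypothetical; production systems typically rely on trained voice-activity or turn-taking models rather than a fixed energy cutoff.

```python
from typing import Optional, Sequence


def find_turn_end(
    energies: Sequence[float],
    threshold: float = 0.1,
    min_silent_frames: int = 3,
) -> Optional[int]:
    """Return the index of the frame where the turn is judged finished.

    Returns None if the speaker never started, or has not yet been
    quiet for long enough.
    """
    heard_speech = False
    silent_run = 0
    for index, energy in enumerate(energies):
        if energy >= threshold:
            # Speech frame: the turn is in progress, so reset the silence count.
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            # Quiet frame after speech: count toward the end-of-turn pause.
            silent_run += 1
            if silent_run >= min_silent_frames:
                return index
    return None


# A brief mid-sentence pause (frame 3) does not end the turn;
# the longer pause that starts at frame 5 does.
frames = [0.0, 0.5, 0.6, 0.05, 0.7, 0.0, 0.0, 0.0, 0.0]
turn_end = find_turn_end(frames)
```

The trade-off lives in `min_silent_frames`: set it too low and the agent interrupts, set it too high and it feels slow.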
Daily owns one of the three outright and is the most opinionated voice in the room on a second. That is a useful position. It is also the position most competitors will have to spend the next two years trying to reach.
So picture the nurse again. Sydney, 8 a.m. The telehealth app opens. A patient appears. The connection is crisp. The audio is in sync. The notes write themselves in the background, transcribed by a voice agent that politely waits for pauses before speaking. Nobody mentions WebRTC. Nobody mentions Pipecat. Nobody mentions Daily.
That is the point. The companies that build the pipes never get the credit. They just get the next decade.