The roommate who decided fluency comes from talking
Most language apps reward you for tapping the right box. Michael Xing noticed that the people who actually become fluent do something messier and harder - they open their mouths and risk being wrong, out loud, with another human.
That observation started in a college dorm. Xing and Morrie Schonfeld were freshman-year roommates at Northwestern, and once a week they sat down to practice Chinese. Xing was already fluent. Schonfeld had been studying the language for years and would eventually become a published author in Chinese. The ritual worked because of something obvious in hindsight: one of them always had someone to talk to.
The problem is that most learners don't. You can buy every app, stream every podcast, and stack every flashcard, and still freeze the moment a real conversation starts. Immersion is the thing that works, and a willing partner is the thing nobody has on demand. So the two of them built the partner - an AI tutor that sounds like a native speaker, adapts to your level, and never sighs when you mangle the tones for the fortieth time.
Shipped in January. Y Combinator by March.
Xing wasn't new to making things people use. Before Pingo he spent roughly three years inside AI startups, and somewhere along the way shipped a Chrome extension that quietly passed 150,000 downloads - the kind of unglamorous proof that he could put a product in front of strangers and have them stick around.
Pingo launched in January 2025. By March it had 50,000 users and $25,000 in monthly recurring revenue, and Y Combinator handed it a seat in the Summer 2025 batch. The line on the chart kept bending upward: 300,000+ users and $200,000 MRR by July, then past 750,000 learners and roughly $6M in annual recurring revenue before the first year was out. Along the way Google named Pingo a Best of 2025 pick. Y Combinator started describing it as one of its fastest-growing consumer companies.
Numbers like that usually come with a long backstory of pivots. Pingo's is short. The team stayed small - four people - and pointed everything at a single idea: get people talking, then get out of the way.
There's a reason the curve looked the way it did. Consumer software is brutal precisely because nobody is obligated to keep using it; the moment an app stops earning its place on your phone, it's gone. A learning app has it worse - it's competing with the very human urge to quit something hard. Pingo's answer was to make the hard thing feel like a conversation instead of a chore. People came back because talking to it was, oddly, kind of fun. That retention is what turns a launch spike into a business, and it's the difference between an app people download and an app people pay for month after month.
The co-founder split tells you something about how the company thinks. Xing serves as CEO and leads product; Schonfeld, the one who spent years grinding through the language the hard way, leads growth. One of them knows what fluency feels like from the inside. The other remembers exactly how it feels to be stuck - which corrections sting, which wins keep you going, which lessons quietly waste your time. A product about learning to speak is, in a sense, an argument between those two perspectives, settled in code.
"AI edtech isn't hype"
Plenty of founders talk about AI in education the way people talk about flying cars - always almost here. Xing takes the opposite posture. His pitch isn't a promise about some distant future; it's a receipt. Hundreds of thousands of people are paying to talk to a machine that helps them speak a new language, today.
His longer view is bolder. He imagines a near future where you don't "use a language app" at all - you just have conversations, and fluency arrives as a side effect of being heard. Faster, more natural, and, in his words, deeply human. It's a strange thing to say about software built on a large language model, but it's also the entire point of Pingo: the technology disappears, and what's left is the feeling of finally being understood in a language that wasn't yours.
He shared the company's story on the Unvested podcast in an episode titled, fittingly, "Let him cook" - a half-hour on what it actually takes to manufacture a single viral moment. The short version: less luck than it looks, more reps than anyone wants to admit.
What makes the whole thing memorable isn't the growth chart - plenty of YC companies have steep ones. It's the specificity of where it came from. Not a market-sizing spreadsheet, not a trend deck about generative AI, but two roommates and a standing weekly appointment to be bad at Chinese together until they were good. The best products tend to start as a fix for the founder's own annoyance, and Pingo is almost embarrassingly literal about it: Xing and Schonfeld had the one resource that makes language learning work, a willing partner, and they built the version of it that scales to everyone who doesn't have one.
That's also why Xing's framing of the future doesn't read as bravado. He isn't promising a robot tutor that replaces classrooms or claiming AI will make effort obsolete. He's describing something narrower and more believable - a world where the awkward, irreplaceable act of practicing out loud is finally available to anyone, at any hour, without needing to find a willing human first. If he's right, the strange thing won't be that an app taught millions of people to speak. It'll be that it took this long for someone to build the obvious thing.
Why "talk to it" beats "tap the right box"
Pingo's whole design rests on a claim about how humans actually pick up a language: you learn to speak by speaking. Reading prompts and matching pictures builds a kind of knowledge, but it's the wrong kind - recognition, not recall, and certainly not the reflex you need when a stranger asks you a question and waits. The gap between "I understand this sentence" and "I can produce this sentence under pressure" is where most learners stall, and it's precisely the gap a conversation partner closes.
So the product leans into the uncomfortable part. It talks back. It corrects you in real time, in context, the way a patient friend would, not the way a quiz would. It meets you at your level and pushes from there. None of this is a new pedagogical idea - immersion has always worked - but the bottleneck was never the theory. It was supply. There simply aren't enough patient, fluent, available humans to give every learner the hours of low-stakes conversation they need. Xing and Schonfeld's insight was that this specific shortage is exactly the kind of thing a large language model is good for: infinite patience, instant availability, no judgment.
It's a useful lens for the broader AI moment, too. The loudest pitches promise to replace experts. Pingo's quieter one is about replacing scarcity - taking something that already works when you can get it, and making it something you can always get. That's a less cinematic promise, and a far more defensible one. It also happens to be the version people are willing to pay for.