How a Compiler Project Became a $900M Startup
Most startups are born from frustration. OctoAI was born from a PhD thesis. When Tianqi Chen - one of the most cited names in applied machine learning - developed Apache TVM as a graduate student at UW, he wasn't thinking about venture capital. He was thinking about how absurd it was that the same model would run dramatically slower on one chip than another, for no good reason a programmer could easily fix.
TVM was the fix. A compiler that could take a trained model and optimize it for whatever hardware you happened to be running - NVIDIA GPUs, AMD chips, AWS Inferentia, embedded processors. The open-source community adopted it fast. And then, in 2019, Chen and four colleagues - including UW professor Luis Ceze, who would become CEO - decided the technology deserved a company around it. They called it OctoML.
OctoML's first years were focused on the nuts and bolts of ML deployment: helping enterprises get their computer vision and NLP models running faster and cheaper, without the months of hand-optimization that used to require a dedicated infrastructure team. The pitch was "we do for ML what compilers do for software" - and it landed well enough to attract Amplify Partners, Addition, Madrona, and eventually Tiger Global.
Then the generative AI wave hit, and everything changed. In 2023, OctoML reoriented almost entirely around large language models and image generation. The company rebranded as OctoAI in January 2024 and launched a cloud inference platform that gave developers API access to Llama, Mistral, Stable Diffusion XL (SDXL), and others - at prices and speeds the large cloud providers weren't yet matching.
The timing was sharp. Within roughly 10 months of launch, the platform had attracted 25,000+ developers and over 100 paying businesses. In April 2024, OctoAI launched OctoStack - its enterprise private deployment product - claiming 4x better GPU utilization and roughly half the operating cost of a self-managed setup. It was the company's most ambitious swing: a complete AI inference stack that enterprises could deploy in their own cloud or on-premises, with support for NVIDIA, AMD, and AWS Inferentia hardware.
Then NVIDIA came calling. In September 2024, the chip giant acquired OctoAI - its fifth acquisition that year. The reported price of $165M-$250M was a steep step down: roughly 18% to 28% of the 2021 peak valuation of ~$900M. CEO Luis Ceze moved to NVIDIA as VP of AI Systems Software. Services wound down for customers on October 31, 2024, and the octoml.ai domain began redirecting to nvidia.com.
Whether the exit counts as a win depends on your frame of reference. By venture math - late-stage valuation versus exit price - it was a sale well below the last private valuation, the exit equivalent of a down round. By the standard of "five academics built a real product, served tens of thousands of developers, and got absorbed by one of the most consequential companies in tech" - it looks rather different.