OctoAI

THE STORY

Five professors and PhD students walked out of the University of Washington with an open-source compiler, a pile of venture capital, and a plan to make AI run faster on anything, anywhere.

OctoAI - originally named OctoML - launched in 2019 as a spin-out from UW's Paul G. Allen School of Computer Science and Engineering. The founding premise was simple enough: the gap between training a model and deploying it cheaply at scale was enormous. Apache TVM, the open-source machine learning compiler that co-founder Tianqi Chen had built as a PhD student, was already closing that gap for research labs. OctoML would close it for everyone else.

For a few years, it worked. The company raised seed, Series A, B, and C rounds in quick succession - pulling in a total of ~$132 million and hitting a paper valuation of roughly $900 million in November 2021. By then, the AI infrastructure space was being valued on pure optimism.

THE FULL PICTURE

How a Compiler Project Became a $900M Startup

Most startups are born from frustration. OctoAI was born from a PhD thesis. When Tianqi Chen - one of the most cited names in applied machine learning - developed Apache TVM as a graduate student at UW, he wasn't thinking about venture capital. He was thinking about how absurd it was that the same model could run dramatically slower on one chip than on another, for reasons no programmer could easily fix.

TVM was the fix. A compiler that could take a trained model and optimize it for whatever hardware you happened to be running - NVIDIA GPUs, AMD chips, AWS Inferentia, embedded processors. The open-source community adopted it fast. And then, in 2019, Chen and four colleagues - including UW professor Luis Ceze, who would become CEO - decided the technology deserved a company around it.

OctoML's first years were focused on the nuts-and-bolts of ML deployment: helping enterprises get their computer vision and NLP models running faster and cheaper, without the months of hand-optimization that used to require a dedicated infrastructure team. The pitch was "we do for ML what compilers do for software" - and it landed well enough to attract Amplify Partners, Addition, Madrona, and eventually Tiger Global.

Then the generative AI wave hit, and everything changed. In 2023, OctoML reoriented almost entirely around large language models and image generation. In June of that year it launched a cloud inference platform that gave developers API access to Llama, Mistral, Stable Diffusion XL (SDXL), and other models - at prices and speeds the large cloud providers weren't yet matching. The company formally rebranded as OctoAI in January 2024.

The timing was sharp. Within roughly 10 months of launch, the platform had attracted 25,000+ developers and over 100 paying businesses. In April 2024, OctoAI launched OctoStack - its enterprise private deployment product - claiming 4x better GPU utilization and roughly half the operating cost of a self-managed setup. It was the company's most ambitious swing: a complete AI inference stack that enterprises could deploy in their own cloud or on-premises, with support for NVIDIA, AMD, and AWS Inferentia hardware.

Then NVIDIA came calling. In September 2024, the chip giant acquired OctoAI - its fifth acquisition of that year. The reported price of $165M-$250M was a notable step down from the 2021 peak valuation of ~$900M. CEO Luis Ceze moved to NVIDIA as VP of AI Systems Software. Services wound down for customers on October 31, 2024, and the octoml.ai domain began redirecting to nvidia.com.

Whether the exit counts as a win depends on your frame of reference. By venture math - late-stage valuation versus exit price - it was a steep markdown: even the top of the reported price range was less than a third of the peak paper valuation. By the standard of "five academics built a real product, served tens of thousands of developers, and got absorbed by one of the most consequential companies in tech" - it looks rather different.

THE TEAM

Five Academics Who Went for It

All five co-founders came from the University of Washington's Allen School. That's either a remarkable coincidence or proof that good research environments produce good companies.

01

Luis Ceze

CEO & Co-founder

Born in São Paulo, Brazil. Tenured UW professor who took a leave of absence to run the company. Joined NVIDIA as VP of AI Systems Software post-acquisition.

02

Tianqi Chen

CTO & Co-founder

Creator of both Apache TVM and XGBoost - two of the most widely used open-source ML tools in existence. Later joined CMU faculty alongside his OctoAI role.

03

Jared Roesch

Chief Architect & Co-founder

PhD student at UW Allen School with prior experience at Mozilla Research. Led architectural decisions on the TVM-based platform.

04

Jason Knight

CPO & Co-founder

Former principal engineer and AI leader at Intel. Brought industry product experience into a team that was otherwise deeply research-oriented.

05

Thierry Moreau

VP Technology Partnerships & Co-founder

Received his PhD from UW Allen School in 2018 and co-taught graduate ML courses with Ceze before co-founding the company.

THE PRODUCTS

What OctoAI Actually Built

OctoAI SaaS Platform

A cloud-based generative AI inference platform with API access to open-source LLMs including Meta Llama and Mistral AI's Mixtral 8x7B, as well as image generation models. Designed so developers could swap in and fine-tune models without managing their own GPU infrastructure. Within 10 months of launch it served 25,000+ developers and 100+ paying businesses.

Tags: API, LLM, Pay-per-use

OctoStack

Launched April 2024. A turnkey enterprise AI inference stack deployable in your VPC, on your own on-premises hardware, or in public clouds like AWS, Azure, and GCP. Claimed 4x better GPU utilization and ~50% lower operational costs compared to self-managed setups. Supported NVIDIA, AMD, and AWS Inferentia accelerators - notable for not locking enterprises into a single hardware vendor.

Tags: Enterprise, Private Cloud, On-Prem

OctoAI Media Gen Solution

An image and video generation product built around Stable Diffusion models (SDXL, SD 1.5, Stable Video Diffusion) and FLUX.1. Offered text-to-image, image-to-image, image-to-video, inpainting, outpainting, upscaling, background removal, and ControlNet features - all accessible via API, web UI, and Python/TypeScript SDKs. OctoAI claimed to provide the fastest SDXL inference endpoint commercially available in 2024.

Tags: Image Gen, Video, Stable Diffusion

OctoML Platform (legacy)

The original product, 2019-2023. Built on Apache TVM, it focused on optimizing traditional ML workloads - computer vision, NLP - so models would run faster on whatever hardware a company happened to own. The pivot to generative AI in 2023 effectively retired this product line, though the compiler technology underpinned everything that followed.

Tags: Apache TVM, ML Optimization, Legacy

THE MONEY

$132M In, Marked-Down Exit Out

The Valuation Whiplash

OctoAI hit a peak paper valuation of ~$900M at its November 2021 Series C. NVIDIA acquired the company three years later for an estimated $165M-$250M. The gap - somewhere between $650M and $735M of "lost" paper value - reflects a market that went from pricing AI infrastructure on optimism to pricing it on revenue multiples. The founders were not alone in this journey; many 2021-era AI infrastructure companies found themselves in similar positions.
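The arithmetic behind that gap can be checked with a minimal sketch. It uses only the approximate figures quoted above; the function name `markdown` and the variable names are illustrative, and the acquisition price is an estimate, so the results are rough bounds rather than precise numbers.

```python
# Figures quoted in the text above, in millions of US dollars.
PEAK_VALUATION = 900             # approximate Series C paper valuation (Nov 2021)
EXIT_LOW, EXIT_HIGH = 165, 250   # estimated acquisition price range (Sept 2024)


def markdown(peak, exit_price):
    """Return (absolute gap, fractional decline) between peak and exit."""
    gap = peak - exit_price
    return gap, gap / peak


# The smallest gap corresponds to the highest estimated price, and vice versa.
smallest_gap, smallest_pct = markdown(PEAK_VALUATION, EXIT_HIGH)
largest_gap, largest_pct = markdown(PEAK_VALUATION, EXIT_LOW)

print(f"Gap: ${smallest_gap}M to ${largest_gap}M")
print(f"Decline: {smallest_pct:.0%} to {largest_pct:.0%}")
```

On these estimates the gap works out to $650M-$735M, a decline of roughly 72% to 82% from the peak paper valuation.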

2019

Seed Round - $3.9M

The company launches as OctoML, spinning out of UW's Allen School.

April 2020

Series A - $15M

Led by Amplify Partners

March 2021

Series B - $28M

Led by Addition, with Madrona Venture Group & Amplify Partners

November 2021

Series C - $85M

Peak valuation of ~$900M. The generative AI boom is still a year away.

Led by Tiger Global, with Addition, Madrona & Amplify

September 2024

Acquired by NVIDIA

Estimated acquisition price: $165M-$250M. NVIDIA's 5th acquisition of 2024. CEO Luis Ceze becomes VP of AI Systems Software at NVIDIA.
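As a quick check on the "$132M in" headline, the round sizes listed above can be summed in a few lines. This is a sketch using only the amounts in this timeline; the dictionary keys are labels chosen for illustration.

```python
# Round sizes as listed in the funding timeline above, in millions of USD.
rounds = {
    "Seed (2019)": 3.9,
    "Series A (April 2020)": 15.0,
    "Series B (March 2021)": 28.0,
    "Series C (November 2021)": 85.0,
}

total = sum(rounds.values())
print(f"Total raised: ${total:.1f}M")

# Share of total capital contributed by each round.
for name, amount in rounds.items():
    print(f"{name}: {amount / total:.0%}")
```

The four rounds add up to about $131.9M, which rounds to the ~$132M figure, with the Series C alone accounting for roughly 64% of the capital raised.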

DID YOU KNOW

Five Things Worth Knowing

01

OctoAI was essentially born from a PhD dissertation. Co-founder Tianqi Chen wrote Apache TVM as a graduate student at UW. The open-source project already had global traction before the company existed.

02

Tianqi Chen also created XGBoost - the gradient boosting library that dominated data science competitions for years. Two massively adopted open-source projects, one founder.

03

CEO Luis Ceze is a tenured UW professor who took a leave of absence to run the company. Post-acquisition, he joined NVIDIA as VP of AI Systems Software. Academia-to-startup-to-Big Tech, in five years.

04

The company went from a ~$900M paper valuation (2021) to a $165M-$250M acquisition (2024). In absolute terms, that's a significant markdown. In relative terms, it's also how many AI infrastructure companies that peaked in 2021 ended up.

05

OctoStack's claim of "4x better GPU utilization" arrived at exactly the right time: 2024, when GPU costs had become one of the largest expenses for companies running LLMs in production.

PARTNERSHIPS

Who They Worked With

AWS

Strategic Collaboration Agreement - OctoAI models and tooling on AWS Marketplace

NVIDIA

Pre-acquisition integration of NVIDIA NIM microservices into the OctoAI platform

Google

Noted as a customer and partner on generative AI infrastructure

Docker

Collaboration on multimodal GenAI application development tooling

TIMELINE

Key Moments, In Order

Oct 2024: OctoAI winds down all commercial services for customers on October 31, following the NVIDIA acquisition. Domain redirects to nvidia.com.

Sept 2024: NVIDIA officially acquires OctoAI - its 5th acquisition of 2024. CEO Luis Ceze joins NVIDIA as VP of AI Systems Software.

Jul 2024: FLUX.1 [Schnell] text-to-image model added to the OctoAI Media Gen platform.

Apr 2024: OctoStack launched - enterprise-grade private AI deployment stack. Claims 4x GPU utilization improvement and ~50% cost reduction versus DIY.

Jan 2024: OctoML formally rebrands to OctoAI, marking its full pivot from ML optimization to generative AI infrastructure.

Nov 2023: OctoAI Image Gen solution unveiled - Stable Diffusion customization and inference at scale, including SDXL and ControlNet.

Jun 2023: OctoAI generative AI SaaS platform publicly launched. Developers can now access LLMs and image models via API without managing GPU infrastructure.

Nov 2021: Series C closes at $85M led by Tiger Global. Paper valuation reaches ~$900M. Peak optimism, peak timing.

2019: OctoML founded as a UW Allen School spin-out. Apache TVM is the foundation; the mission is ML model optimization for any hardware.
