YesPress / AI Infrastructure
The fastest way to run AI - without the headache
Seven engineers from Meta's AI organization - most of them core builders of PyTorch - walked out the door and decided to give every company in the world access to the same AI infrastructure that powers the biggest tech platforms on the planet. Three years later: a $4B valuation, 10,000+ customers, and 10 trillion tokens processed every day.
In 2022, Lin Qiao was Head of PyTorch at Meta - which is roughly equivalent to being the caretaker of the most important piece of AI infrastructure in the world. PyTorch powers the research of nearly every major AI lab, runs inside Meta's own products, and underpins the bulk of published AI research. A comfortable perch, to say the least.
She left anyway. And she brought six colleagues with her.
The problem she kept running into wasn't a shortage of clever AI models. It was everything else. Companies across every industry wanted to deploy generative AI but had nowhere to start. No GPU infrastructure. No inference stack. No team with the depth to build it. The models existed. The capability to run them efficiently - at scale, at low cost, with production-grade reliability - was gated behind Meta, Google, and a handful of hyperscalers.
Fireworks AI's founding thesis was simple: democratize the infrastructure. Build the inference layer that every company needs but almost no company can build themselves. The name is a direct nod to PyTorch's flame logo - spreading the same fire that lit up Meta's AI stack across the rest of the industry.
The Founder on the Mission
"Companies wanted to prioritize AI but lacked the infrastructure, resources, and talent. We wanted to spread that PyTorch flame across industries." - Lin Qiao, Co-Founder & CEO, Fireworks AI
The founding team is a collector's item. Five of the seven co-founders worked directly on PyTorch at Meta - the framework that runs inside nearly every serious AI research project and production system. The other two come from Meta Ads infrastructure and Google Vertex AI. This isn't a team that read about AI infrastructure. They built it.
Lin Qiao - Co-Founder & CEO
Benny Chen - Co-Founder
Chenyu Zhao - Co-Founder
Dmytro Dzhulgakov - Co-Founder
Dmytro Ivchenko - Co-Founder
James Reed - Co-Founder
Pawel Garbacki - Co-Founder
Lin Qiao holds a PhD in Computer Science from UC Santa Barbara focused on distributed systems and machine learning. Before PyTorch, she worked at Facebook, LinkedIn, and IBM. Her co-founders bring comparably deep credentials: PyTorch core maintenance, Newsfeed ML systems, Google Vertex AI, and Meta Ads infrastructure at scale.
Fun fact: five of the seven founders built the framework that almost every AI model in production runs on. Then they left to build the layer underneath it.
Fireworks AI is not a model company. It doesn't train foundation models or try to compete with OpenAI or Anthropic on raw capability. Instead, it solves a different problem: how do you take an open-source model and run it in production, fast, cheaply, and reliably, without hiring a team of GPU engineers?
The platform covers the full lifecycle - from model selection to fine-tuning to deployment to real-time inference at scale. The OpenAI-compatible API means developers can swap in Fireworks without rewriting their code. The catalog includes hundreds of models across text, image, audio, and multimodal workloads.
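To make "OpenAI-compatible" concrete, here is a minimal sketch of assembling a chat-completions request against a Fireworks-style endpoint. The base URL, model identifier, and the `build_chat_request` helper are illustrative assumptions for this article, not copied from official documentation; the point is only that the request shape matches what an OpenAI client already sends.

```python
import json

# Assumed OpenAI-compatible endpoint; swapping a client's base URL to a
# value like this is the only change the article describes.
BASE_URL = "https://api.fireworks.ai/inference/v1"

def build_chat_request(model, messages, api_key):
    """Build the URL, headers, and JSON body for a chat-completions call.

    The payload fields (`model`, `messages`) are the same ones an
    OpenAI-style client would send, which is why no code rewrite is needed.
    """
    return {
        "url": f"{BASE_URL}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"model": model, "messages": messages}),
    }

# Example model id below is hypothetical.
req = build_chat_request(
    "accounts/fireworks/models/llama-v3p1-8b-instruct",
    [{"role": "user", "content": "Hello"}],
    api_key="FIREWORKS_API_KEY",
)
```

In practice a developer would keep their existing OpenAI SDK and point its base URL at the compatible endpoint, rather than hand-building requests like this.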
On the AI Landscape
"The next wave of quality is not going to be one of 'single model solves all problems.' The future will involve hundreds of small expert models solving narrower sets of problems." - Lin Qiao, Co-Founder & CEO
Who Runs on Fireworks
Samsung · Uber · Shopify · DoorDash · Notion · Cursor · Perplexity · Upwork · Sourcegraph
Fireworks AI raised $25M in a Series A in early 2024. Seven months later, Sequoia led a $52M Series B at roughly a $552M valuation. Fifteen months after that, the company closed a $250M Series C at a $4B valuation - a 7x jump in a year and a half. The investor roster reads like a who's who of Silicon Valley's most plugged-in technologists.
| Round | Amount | Date | Key Investors |
|---|---|---|---|
| Seed | Undisclosed | 2022-2023 | - |
| Series A | $25M | Jan 2024 | - |
| Series B | $52M | Jul 2024 | Sequoia, NVIDIA, AMD, Databricks, Benchmark, Sheryl Sandberg, Frank Slootman |
| Series C | $250M | Oct 2025 | Lightspeed, Index Ventures, Sequoia, NVIDIA, AMD, MongoDB, Databricks |
Strategic investors include NVIDIA and AMD - both chipmakers betting that Fireworks will be the layer that puts their hardware to work for the most demanding AI workloads. Sheryl Sandberg (former Meta COO) and Frank Slootman (former Snowflake CEO) are among the personal angels backing the company.
Fireworks AI operates across NVIDIA, AMD, AWS, Google Cloud, and Oracle Cloud GPU infrastructure. In 2025, the company signed a Strategic Collaboration Agreement with AWS and achieved AWS Generative AI Competency Partner status. A multi-year deal with Microsoft Azure brought Fireworks into Azure Foundry as a native option.
On the model side, partnerships with Meta (Llama), Mistral AI, and Stability AI ensure the catalog stays current with the latest open-source releases. Databricks and MongoDB are both strategic investors and integration partners - Fireworks handles inference for ML workflows built on Databricks and RAG pipelines built on MongoDB.
The popular narrative in AI has been about chasing the biggest, smartest foundation model. GPT-4, then GPT-5. Claude Sonnet, then Claude Opus. Each release touted as the one that finally cracks general intelligence.
Fireworks AI is betting on a different future. Lin Qiao has argued publicly that the next phase of AI quality won't come from making models bigger - it'll come from making them more specialized. Hundreds of small, expert models trained on curated domain-specific data, each solving a narrow problem better than any general-purpose model can.
That future is convenient for a platform that already hosts hundreds of models and has built fine-tuning infrastructure capable of creating new ones cheaply. Fireworks isn't just serving the current AI market - it's building toward one where every company has its own model, its own inference stack, and its own AI product, rather than a dependency on OpenAI's pricing decisions.
On AI Agents
"I'm very bullish on agents. I think it's going to blossom." - Lin Qiao, Co-Founder & CEO
In March 2026, Fireworks AI acquired Hathora Inc., a real-time container orchestration platform spanning 14 regions across two bare-metal providers and four cloud environments. The move was framed as a talent and infrastructure acquisition - Hathora's engineering culture, described by Lin Qiao as focused on "every millisecond and every routing decision," maps directly onto the demands of cutting-edge AI inference.
The practical effect: sub-20ms routing decisions across a global network, applied to inference workloads that demand low latency wherever the user happens to be. For developers running AI products with global user bases, that's the difference between an AI assistant that feels responsive and one that doesn't.
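The core routing idea can be sketched in a few lines: given recent per-region latency measurements, send each request to the closest healthy region. Every region name and latency figure below is made up for illustration; this is not Hathora's or Fireworks' actual routing logic.

```python
# Toy sketch of latency-aware routing. Region names and latencies are
# hypothetical; real systems also weigh capacity, cost, and model placement.

def pick_region(latencies_ms, healthy):
    """Return the healthy region with the lowest measured round-trip latency.

    latencies_ms: dict mapping region name -> recent RTT in milliseconds.
    healthy: set of region names currently passing health checks.
    """
    candidates = {r: ms for r, ms in latencies_ms.items() if r in healthy}
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=candidates.get)

latencies = {"us-east": 18.0, "eu-west": 42.5, "ap-south": 95.1}
best = pick_region(latencies, healthy={"us-east", "eu-west"})
```

The interesting engineering is not this selection step but keeping the latency and health data fresh enough that the decision itself stays under the 20ms budget.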
- Five of the seven co-founders were core engineers on the PyTorch team at Meta. They didn't just use the most popular AI framework - they wrote it.
- The company name is a direct nod to PyTorch's flame logo. The vision: spreading that same flame to every industry that needs it.
- Valuation grew roughly 7x in 15 months - from ~$552M at the July 2024 Series B to $4B at the October 2025 Series C.
- 10,000+ customers reached without a large traditional sales force; growth was largely developer-led and product-led.
- Sheryl Sandberg (ex-Meta COO) and Frank Slootman (ex-Snowflake CEO) are personal angel investors, alongside Alexandr Wang of Scale AI and Howie Liu of Airtable.
- The team is roughly 166 people processing 10 trillion tokens a day - about 60 billion tokens per employee per day.