BREAKING Anyscale + Microsoft launch AI-native compute on Azure (Nov 2025) | Ray joins the PyTorch Foundation | 1,000,000+ Ray clusters spun up every month | $259M raised • $1B valuation • 430 employees | OpenAI used Ray to coordinate ChatGPT training | Anyscale Runtime: up to 10x faster than vanilla Ray
Profile • AI Infrastructure

Anyscale

The quiet company underneath ChatGPT training, Uber's ETA predictions and Spotify's recommendations. It sells one thing: the ability to scale Python without thinking about it.

Founded: 2019 • HQ: San Francisco • Raised: $259M • Stage: Series C
The Anyscale wordmark. Beneath it: a framework called Ray, beneath which sits roughly half of generative AI's compute.
Who they are now

Right now, somewhere, a model is training on Anyscale - and the person who launched it didn't have to know.

That is the trick. Open a notebook in San Francisco or Bangalore, decorate a Python function with @ray.remote, and a cluster appears - GPUs, schedulers, fault tolerance, the whole quiet machinery of distributed computing. Close the notebook and it disappears.
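
For readers who want to see the trick rather than take it on faith, here is a minimal sketch. The function `tokens_in`, the helper `count_all` and the sample strings are hypothetical, invented for illustration. It assumes open-source Ray is installed (`pip install ray`), in which case `ray.init()` starts a small local instance unless a cluster address is configured; if Ray is missing, the sketch falls back to local threads so it still runs.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    import ray  # third-party: pip install ray
except ImportError:
    ray = None  # no Ray available: fall back to local threads below


def tokens_in(text: str) -> int:
    """Ordinary Python: count whitespace-separated tokens (illustrative workload)."""
    return len(text.split())


def count_all(docs: list[str]) -> list[int]:
    """Run tokens_in over every document, on Ray if installed, else on threads."""
    if ray is not None:
        # Starts a local Ray instance unless a cluster address is configured.
        ray.init(ignore_reinit_error=True)
        # Equivalent to decorating tokens_in with @ray.remote.
        remote_fn = ray.remote(tokens_in)
        try:
            # Each .remote() call returns a future; ray.get waits for the results.
            return ray.get([remote_fn.remote(d) for d in docs])
        finally:
            ray.shutdown()
    with ThreadPoolExecutor() as pool:
        return list(pool.map(tokens_in, docs))


docs = ["scale python", "without thinking about it", "ray"]
counts = count_all(docs)
print(counts)  # [2, 4, 1]
```

The point of the sketch is that `tokens_in` itself never changes. Only the way it is called does, which is the bargain the rest of this profile keeps returning to.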

This is what Anyscale sells. Not a database, not a model, not yet another agent framework. It sells the layer underneath all of those things: a way to make a thousand machines behave like one.

In any given week, Ray - the open-source framework Anyscale was built around - is coordinating LLM training at OpenAI, recommendation pipelines at Spotify, ETA models at Uber, screening simulations at Recursion, and the small army of clusters that turn into ChatGPT-style products at companies you have not heard of yet. By Anyscale's own count, more than a million Ray clusters spin up every month.

"Ray is the underlying compute fabric for some of the most important AI applications ever built." - industry coverage of Anyscale, 2025

The problem they saw

Distributed computing was a research problem. AI made it everyone's problem.

For most of computing's history, scaling a program across many machines was a specialist sport. You wrote MPI. You wrote Hadoop. You learned Kubernetes, then learned the seven YAML files Kubernetes also requires you to learn. If you got it wrong - and you did - the job died at 3 a.m. and you read logs until the sun came up.

Then the deep learning era arrived, and the workloads stopped fitting on one machine. Training runs got too big. Inference traffic got too spiky. Data preprocessing pipelines got too tangled. The number of engineers who needed to scale Python code went from "a few hundred at Google" to "anyone trying to ship an AI product." The tools did not keep up.

"Our goal is to make distributed programming as easy as single-machine programming." - Robert Nishihara, co-founder, paraphrased from public talks

Plenty of vendors offered to help. Their pitches usually amounted to: rewrite your code in our DSL, push it to our cloud, trust us. Engineers had been burned by that bargain before. What they wanted was something more mundane and harder to build - a way to take the Python they already had and run it on as many GPUs as the job required, without learning a new abstraction every quarter.

The founders' bet

Berkeley's RISELab has a habit of producing infrastructure companies. Anyscale is the latest one.

Robert Nishihara, Philipp Moritz and Ion Stoica started working on Ray inside RISELab in 2016, alongside their advisor Michael I. Jordan, one of the most cited researchers in machine learning. The lab's predecessor, AMPLab, gave the world Apache Spark and, eventually, Databricks. Stoica co-founded that one too. The pattern is suspicious, in a good way.

Their bet was a particular kind of contrarianism. While the AI tooling industry was busy inventing proprietary "platforms" - a euphemism for "a moat we hope you don't notice" - Anyscale put its core technology on GitHub and dared the market to copy it. Open source first, product second. Developers first, procurement second.

"We didn't want to be the company that made distributed computing slightly less painful. We wanted to make it invisible." - Anyscale leadership, summarized from press interviews

By the time they took their first check - $20.6M led by Andreessen Horowitz in December 2019 - Ray was already running inside companies the founders weren't talking to yet. That is the polite way to describe what happens when an open-source project escapes containment.

Milestones

From a research project to the default scheduler for generative AI.

2016

Ray begins at UC Berkeley's RISELab.

2019

Anyscale incorporates. $20.6M Series A led by a16z.

2021

$40M Series B led by NEA. First Ray Summit goes mainstream.

2022

$99M Series C at $1B valuation. ChatGPT ships, trained with Ray.

2024

Anyscale Platform v2 + RayTurbo launch. Generative-AI customer surge.

2025

Ray joins PyTorch Foundation. CoreWeave BYOC. Azure first-party launch.

Caption. Five years, three rounds, one open-source project, and a stubbornly consistent belief that the right interface for distributed AI is still def.
The product

One framework. One runtime. One platform that runs it across whichever cloud is currently cheapest.

Anyscale's surface area, if you squint, is small. There are really three things to remember.

Ray is the open-source piece. It is a Python-native framework for distributed compute, with sub-libraries for training (Ray Train), tuning (Ray Tune), serving (Ray Serve) and data processing (Ray Data). It is free. It is, deliberately, the front door.

Anyscale Platform is the managed version - clusters, observability, governance, role-based access, and the parts of "running infrastructure in production" that no one wants to build twice. It runs in your cloud account or theirs. It speaks AWS, GCP and now, formally, Azure.

Anyscale Runtime, sometimes branded RayTurbo, is the part the platform sells on. It is API-compatible with open-source Ray, but, according to Anyscale's published numbers, up to ten times faster on data, training and serving workloads. Same code, less compute. It is the kind of pitch a CFO can sit through without flinching.
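
To see what "same code, less compute" would mean on an invoice, here is some back-of-envelope arithmetic. The GPU-hour count and hourly price are hypothetical placeholders, not Anyscale figures, and the sketch assumes a speedup shrinks GPU-hours proportionally, which real workloads may not honor. The 10x row is the upper bound of the vendor's "up to" claim.

```python
def gpu_bill(baseline_gpu_hours: float, speedup: float, price_per_gpu_hour: float) -> float:
    """Cost of a job, assuming the speedup cuts GPU-hours proportionally."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_gpu_hours / speedup * price_per_gpu_hour


BASELINE_GPU_HOURS = 1_000  # hypothetical job size
PRICE_PER_GPU_HOUR = 2.00   # hypothetical price in USD

# Compare the bill at no speedup, a modest speedup, and the claimed ceiling.
for speedup in (1, 2, 10):
    bill = gpu_bill(BASELINE_GPU_HOURS, speedup, PRICE_PER_GPU_HOUR)
    print(f"{speedup:>2}x speedup -> ${bill:,.2f}")
```

Under those assumptions the bill falls from $2,000 to $200 at the claimed ceiling. Anything short of 10x lands somewhere in between.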

"The same Python. Ten times the speed. Zero rewrites." - the elevator pitch, distilled

Ray clusters / month: 1M+
Runtime speedup claimed: 10x
Employees: 430
Valuation, Series C: $1B

Funding rounds • 2019 - 2022 (USD millions)

Series A: $20.6M
Series B: $40M
Series C: $99M
Total: $259M
Caption. The growth chart of a company that decided open source first, product later. Most founders flip those.
The proof

You already use AI products that run on Ray. You just didn't know.

OpenAI used Ray to coordinate the training of ChatGPT - the single most discussed software product of the decade. Cohere trains large language models on it. Uber runs forecasting and ETA models on it. Spotify uses it for parallel model training that ends up in recommendations you blame on yourself. Pinterest, Netflix, Canva, Coinbase, Instacart, Ant Group - the list of public references is unusually long for an infrastructure company barely six years old.

The case studies are also unusually specific. Recursion uses Ray to screen drug candidates. Physical Intelligence uses it for robotics foundation models. A surprising number of "we trained our own LLM" announcements include, somewhere in the methodology section, a quiet thank-you to Ray.

"If your AI workload is bigger than one GPU, the odds are pretty good Ray is in there somewhere." - a not-particularly-controversial industry observation

Microsoft made this official in November 2025, announcing a first-party AI-native compute service on Azure built on Anyscale, with general availability targeted for 2026. CoreWeave followed shortly after, offering Anyscale via BYOC inside its Kubernetes service. When the hyperscalers start integrating you instead of competing with you, that is a tell.

The mission

Make distributed computing as ordinary as writing Python on a laptop.

Anyscale's mission, in public terms, is to "scale AI for everyone." Read it twice and the interesting word is everyone. Not just the hyperscalers. Not just the labs with a billion in funding and a private cluster the size of a small country. Everyone who can write a function.

This is not a small claim. The companies that historically owned distributed systems - Google, Amazon, Meta - had to invent them in private because no commodity tools existed. The companies trying to build serious AI today still mostly have to do the same thing. Anyscale is wagering that, this generation, the tools can be shared. That the operating system for AI compute can be open, and the layer above it can be a business.

"The hyperscalers built their own Ray. Anyscale's bet is that no one else should have to." - summary of Anyscale's pitch deck, in our words

That is a longer-shot bet than it sounds, because the alternative - rolling your own Kubernetes-plus-glue-code AI stack - is the path most enterprises are currently lumbering down. Anyscale's job is to make that path look as expensive as it actually is.

Why it matters tomorrow

The AI era is going to be decided, partly, on whose compute layer ships fastest.

The story we tell about AI is usually about the models. The story that gets told quietly, between people who write checks, is about the compute. Whoever owns the substrate that other people's models run on collects rent for a decade. That is the position Databricks took with analytics. It is the position Anyscale is trying to take with AI.

There is competition - Databricks via Mosaic, Modal, Coiled, Together, the hyperscalers' own offerings. Anyscale's defense is twofold. One: Ray is genuinely good and genuinely open, with the kind of community gravity that takes a decade to build and a few weeks to lose. Two: Anyscale Runtime offers the boring, quantifiable thing CFOs sign off on - the same code, ten times the speed, fewer GPUs on the bill.

If both of those keep holding, the next few years of AI infrastructure look a lot like the last few - except more of it runs through one San Francisco company than most observers realize.

Closing scene

The notebook is still open in San Francisco. The cluster came up. The model trained. Nobody filed a ticket.

That is what Anyscale wanted, six years ago, when four researchers at Berkeley decided distributed computing was too important to leave to specialists. It is what they want now, with a billion-dollar valuation, a Microsoft partnership, and a million clusters spinning up every month.

The infrastructure does its work. The developer ships. The product gets better. The cluster goes away.

Most of the time, that is the whole story. Which is roughly the point.
