Anyscale is the quiet company underneath ChatGPT's training, Uber's ETA predictions and Spotify's recommendations. It sells one thing: the ability to scale Python without thinking about it.
That is the trick. Open a notebook in San Francisco or Bangalore, decorate a Python function with @ray.remote, and a cluster appears - GPUs, schedulers, fault tolerance, the whole quiet machinery of distributed computing. Close the notebook and it disappears.
This is what Anyscale sells. Not a database, not a model, not yet another agent framework. It sells the layer underneath all of those things: a way to make a thousand machines behave like one.
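For readers who want to see the trick rather than take it on faith, here is a minimal sketch. It assumes Ray is installed (`pip install ray`) and runs against a local Ray instance rather than a real cluster. The names `square` and `square_all_on_ray` are ours, invented for illustration. The sketch wraps the function with `ray.remote(...)`, which is the call form of the `@ray.remote` decorator, so the original function stays ordinary Python.

```python
from typing import Iterable, List


def square(x: int) -> int:
    """Ordinary single-machine Python; nothing here knows about Ray."""
    return x * x


def square_all_on_ray(values: Iterable[int]) -> List[int]:
    """Run `square` over `values` as parallel Ray tasks (hypothetical helper)."""
    # Imported lazily so `square` stays usable on a machine without Ray.
    import ray  # assumption: installed via `pip install ray`

    # With no address argument, Ray starts a local instance
    # unless one is already configured.
    ray.init(ignore_reinit_error=True)

    # ray.remote(f) is the call form of decorating f with @ray.remote.
    remote_square = ray.remote(square)

    # .remote() schedules a task and returns an object reference immediately.
    refs = [remote_square.remote(v) for v in values]

    # ray.get blocks until the tasks finish; results match the order of refs.
    results = ray.get(refs)

    ray.shutdown()
    return results
```

Calling `square_all_on_ray(range(8))` on a laptop spreads the eight calls across local worker processes. Pointing `ray.init` at a multi-node cluster is what would spread them across machines, and the function body would not change.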
On any given week, Ray - the open-source framework Anyscale was built around - is coordinating LLM training at OpenAI, recommendation pipelines at Spotify, ETA models at Uber, screening simulations at Recursion, and the small army of clusters that turn into ChatGPT-style products at companies you have not heard of yet. By Anyscale's own count, more than a million Ray clusters spin up every month.
"Ray is the underlying compute fabric for some of the most important AI applications ever built." - industry coverage of Anyscale, 2025
For most of computing's history, scaling a program across many machines was a specialist sport. You wrote MPI. You wrote Hadoop. You learned Kubernetes, then learned the seven YAML files Kubernetes requires you to also learn. If you got it wrong - and you did - the job died at 3 a.m. and you read logs until the sun came up.
Then the deep learning era arrived, and the workloads stopped fitting on one machine. Training runs got too big. Inference traffic got too spiky. Data preprocessing pipelines got too tangled. The number of engineers who needed to scale Python code went from "a few hundred at Google" to "anyone trying to ship an AI product." The tools did not keep up.
"Our goal is to make distributed programming as easy as single-machine programming." - Robert Nishihara, co-founder, paraphrased from public talks
Plenty of vendors offered to help. Their pitches usually amounted to: rewrite your code in our DSL, push it to our cloud, trust us. Engineers had been burned by that bargain before. What they wanted was something more mundane and harder to build - a way to take the Python they already had and run it on as many GPUs as the job required, without learning a new abstraction every quarter.
Robert Nishihara, Philipp Moritz and Ion Stoica started working on Ray inside RISELab in 2016, with their advisor Michael I. Jordan, one of the most cited names in machine learning, as an intellectual partner. The lab's predecessor, AMPLab, gave the world Apache Spark and, eventually, Databricks. Stoica co-founded that one too. The pattern is suspicious, in a good way.
Their bet was a particular kind of contrarianism. While the AI tooling industry was busy inventing proprietary "platforms" - a euphemism for "a moat we hope you don't notice" - Anyscale put its core technology on GitHub and dared the market to copy it. Open source first, product second. Developers first, procurement second.
"We didn't want to be the company that made distributed computing slightly less painful. We wanted to make it invisible." - Anyscale leadership, summarized from press interviews
By the time they took their first check - $20.6M led by Andreessen Horowitz in December 2019 - Ray was already running inside companies the founders weren't talking to yet. That is the polite way to describe what happens when an open-source project escapes containment.
Ray begins at UC Berkeley's RISELab.
Anyscale incorporates. $20.6M Series A led by a16z.
$40M Series B led by NEA. First Ray Summit goes mainstream.
$99M Series C at $1B valuation. ChatGPT ships, trained with Ray.
Anyscale Platform v2 + RayTurbo launch. Generative-AI customer surge.
Ray joins PyTorch Foundation. CoreWeave BYOC. Azure first-party launch.
Anyscale's surface area, if you squint, is small. There are really three things to remember.
Ray is the open-source piece. It is a Python-native framework for distributed compute, with sub-libraries for training (Ray Train), tuning (Ray Tune), serving (Ray Serve) and data processing (Ray Data). It is free. It is, deliberately, the front door.
Anyscale Platform is the managed version - clusters, observability, governance, role-based access, and the parts of "running infrastructure in production" that no one wants to build twice. It runs in your cloud account or theirs. It speaks AWS, GCP and now, formally, Azure.
Anyscale Runtime, sometimes branded RayTurbo, is the part the platform sells on. It is API-compatible with open-source Ray, but, according to Anyscale's published numbers, up to ten times faster on data, training and serving workloads. Same code, less compute. It is the kind of pitch a CFO can sit through without flinching.
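To make the open-source piece concrete, here is a small sketch using Ray Data, one of the sub-libraries named above. It assumes Ray is installed with its data extras (`pip install "ray[data]"`). The helper names `add_length` and `annotate_with_ray_data` are hypothetical, chosen only for this example. Whether this exact code runs faster on Anyscale Runtime is Anyscale's claim to prove; the sketch only shows the open-source API.

```python
from typing import Any, Dict, List


def add_length(row: Dict[str, Any]) -> Dict[str, Any]:
    """Plain per-row transform: copy the row and add the text length."""
    return {**row, "length": len(row["text"])}


def annotate_with_ray_data(texts: List[str]) -> List[Dict[str, Any]]:
    """Apply `add_length` to every text via Ray Data (hypothetical helper)."""
    # Imported lazily so `add_length` stays testable without Ray installed.
    import ray.data  # assumption: installed via `pip install "ray[data]"`

    # Build a distributed dataset from in-memory rows.
    ds = ray.data.from_items([{"text": t} for t in texts])

    # map() applies the function row by row, in parallel across blocks;
    # take_all() pulls every resulting row back to the caller.
    return ds.map(add_length).take_all()
```

The transform itself is a plain function over a dictionary. Ray Data decides how to split the rows into blocks and where to run them, which is the division of labor the platform pitch rests on.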
"The same Python. Ten times the speed. Zero rewrites." - the elevator pitch, distilled
OpenAI used Ray to coordinate the training of ChatGPT - the single most discussed software product of the decade. Cohere trains large language models on it. Uber runs forecasting and ETA models on it. Spotify uses it for parallel model training that ends up in recommendations you blame on yourself. Pinterest, Netflix, Canva, Coinbase, Instacart, Ant Group - the list of public references is unusually long for an infrastructure company barely six years old.
The case studies are also unusually specific. Recursion uses Ray to screen drug candidates. Physical Intelligence uses it for robotics foundation models. A surprising number of "we trained our own LLM" announcements include, somewhere in the methodology section, a quiet thank-you to Ray.
"If your AI workload is bigger than one GPU, the odds are pretty good Ray is in there somewhere." - a not-particularly-controversial industry observation
Microsoft made this official in November 2025, announcing a first-party AI-native compute service on Azure built on Anyscale, with general availability targeted for 2026. CoreWeave followed shortly after, offering Anyscale via BYOC inside its Kubernetes service. When the hyperscalers start integrating you instead of competing with you, that is a tell.
Anyscale's mission, in public terms, is to "scale AI for everyone." Read it twice and the interesting word is everyone. Not just the hyperscalers. Not just the labs with a billion in funding and a private cluster the size of a small country. Everyone who can write a function.
This is not a small claim. The companies that historically owned distributed systems - Google, Amazon, Meta - had to invent them in private because no commodity tools existed. The companies trying to build serious AI today still mostly have to do the same thing. Anyscale is wagering that, this generation, the tools can be shared. That the operating system for AI compute can be open, and the layer above it can be a business.
"The hyperscalers built their own Ray. Anyscale's bet is that no one else should have to." - summary of Anyscale's pitch deck, in our words
That is a longer-shot bet than it sounds, because the alternative - rolling your own Kubernetes-plus-glue-code AI stack - is the path most enterprises are currently lumbering down. Anyscale's job is to make that path look as expensive as it actually is.
The story we tell about AI is usually about the models. The story that gets told quietly, between people who write checks, is about the compute. Whoever owns the substrate that other people's models run on collects rent for a decade. That is the position Databricks took with analytics. It is the position Anyscale is trying to take with AI.
There is competition - Databricks via Mosaic, Modal, Coiled, Together, the hyperscalers' own offerings. Anyscale's defense is twofold. One: Ray is genuinely good and genuinely open, with the kind of community gravity that takes a decade to build and a few weeks to lose. Two: Anyscale Runtime offers the boring, quantifiable thing CFOs sign off on - the same code, up to ten times the speed by Anyscale's own numbers, fewer GPUs on the bill.
If both of those keep holding, the next few years of AI infrastructure look a lot like the last few - except more of it runs through one San Francisco company than most observers realize.
That is what Anyscale wanted, six years ago, when four researchers at Berkeley decided distributed computing was too important to leave to specialists. It is what they want now, with a billion-dollar valuation, a Microsoft partnership, and a million clusters spinning up every month.
The infrastructure does its work. The developer ships. The product gets better. The cluster goes away.
Most of the time, that is the whole story. Which is roughly the point.