BREAKING: $70M SERIES C CLOSED FEB 2025 | PHOENIX HITS 2M MONTHLY DOWNLOADS | 1 TRILLION SPANS PROCESSED / MONTH | DATADOG, M12 + PAGERDUTY ALL WROTE CHECKS | UBER · DOORDASH · INSTACART · REDDIT · BOOKING.COM | OPENINFERENCE STANDARD GAINS ADOPTION | BERKELEY, CA · ~120 EMPLOYEES
Company Profile · AI Observability

Arize AI watches the models that run modern internet companies.

When an LLM hallucinates at 3am, when an agent silently picks the wrong tool, when a model drifts a half-percent every Tuesday - somewhere on the screen of a tired engineer, Arize is the dashboard already open.

Founded 2020 | HQ: Berkeley, CA | Series C · $131M total | ~120 people
Arize AI - the company that turned model debugging into a job title.
Who they are now

The plumbing under the AI boom.

It is the spring of 2026 and a product manager at a global travel company is staring at a Slack thread. A customer-facing agent has been confidently recommending a hotel that no longer exists. Twelve times this week. Somewhere in a tab, a chart in Arize is glowing orange: a particular prompt template's hallucination rate has crept from 0.4% to 3.1%. Within an hour, an evaluator catches it, a guardrail ships, the agent stops lying.

This is the unglamorous, indispensable middle of the AI economy. Not the model. Not the chatbot. The thing watching both. Arize AI sells that thing - and is, by most reasonable measures, the company that most defined what "AI observability" even means.

"AI doesn't fail loudly. It fails quietly, with a confident answer that is just wrong." — Aparna Dhinakaran, Co-founder & CPO
The problem they saw

Models don't crash. They just gently start being wrong.

Traditional software, when it breaks, has the courtesy of a stack trace. AI does not. A model can drift, a prompt can regress, a retrieval pipeline can pull yesterday's data into tomorrow's answer - and the only signal is a slow, mild rise in customer complaints. By the time anyone catches it, the model has been confidently wrong to thousands of users.

Arize's founders had spent careers watching this. Aparna at Uber had co-led the company's first model lifecycle management system - a now-famous internal tool called Michelangelo. Jason at TubeMogul had shipped AI strategy into production ad systems. Both had seen, up close, what happens when a model that worked last Tuesday quietly stops working this Tuesday. The lesson, in both cases, was unpleasant.

So in 2020 they did the deeply unsexy thing: they started a company about debugging.

"We want to be the gold standard for AI evaluation and observability." Jason Lopatecki, Co-founder & CEO
The founders' bet

"What if 'monitoring' for AI were a category?"

In 2020, the wager looked questionable. "ML in production" was still, for most companies, a single engineer with a Jupyter notebook and a recurring nightmare. Calling that engineer's pain a "category" took some optimism. Seed money was small. The list of competitors was even smaller, which is the kind of detail that should worry a founder and almost never does.

Then ChatGPT happened. The single Jupyter notebook turned into a fleet of LLM-powered agents shipping to production every Friday afternoon. Every company in the Fortune 500 suddenly had the problem Aparna and Jason had watched at Uber and TubeMogul - except they had it everywhere, all at once, and nobody had a Michelangelo to help.

Arize had been quietly building exactly the thing that pain needed. The bet had paid off. The category was real.

Six years, told in five dots

A short history of being early.

2020: Founded by Aparna Dhinakaran & Jason Lopatecki. $4M seed.
2021: Series A. ML observability platform ships to enterprise.
2022: Phoenix open-sourced. TCV leads $38M Series B.
2023-24: LLM era arrives. OpenInference standard launched on OpenTelemetry.
2025: $70M Series C - largest round the AI observability category has seen.
The product

Two products. One opinion.

The opinion is this: every interaction a model has with a user is a span, every span is data, and that data is the only honest record of how AI actually behaves. Everything Arize sells follows from that one belief.

Phoenix

The open-source library that started it all. LLM tracing, evaluation, experimentation. 2M+ monthly downloads. Free, self-hostable, no feature gates.

Arize AX

The enterprise platform. Online evaluations, drift monitoring, prompt management, retrieval debugging, agent tracing - at production scale.

OpenInference

An open standard for AI tracing built on OpenTelemetry. Arize co-authored it because someone had to. Now adopted across the LLM ecosystem.
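
The "every interaction is a span" opinion is easier to picture with a small sketch. What follows is plain Python, not the Phoenix or OpenInference API: the `Span` class, its field names, the `kind` labels and the hotel example are all hypothetical, chosen only to show how one agent request might decompose into linked, queryable records.

```python
from dataclasses import dataclass, field
from typing import Optional
import time
import uuid


@dataclass
class Span:
    """One recorded step of a model interaction (hypothetical schema)."""
    name: str
    kind: str                           # illustrative labels: "agent", "retriever", "llm"
    trace_id: str                       # shared by every span in one request
    parent_id: Optional[str] = None     # None marks the root span
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    start: float = field(default_factory=time.time)
    end: Optional[float] = None
    attributes: dict = field(default_factory=dict)

    def finish(self, **attributes) -> "Span":
        """Attach whatever was observed and stamp the end time."""
        self.attributes.update(attributes)
        self.end = time.time()
        return self


def children_of(spans, parent):
    """Return the spans recorded directly under `parent`, in recorded order."""
    return [s for s in spans if s.parent_id == parent.span_id]


def build_example_trace():
    """Record one made-up agent request as a root span with two child steps."""
    trace_id = uuid.uuid4().hex
    root = Span("recommend_hotel", "agent", trace_id)

    retrieve = Span("search_listings", "retriever", trace_id, parent_id=root.span_id)
    retrieve.finish(documents_returned=3)

    answer = Span("draft_reply", "llm", trace_id, parent_id=root.span_id)
    answer.finish(prompt_template="hotel_v2", output="Try the Grand Example Hotel.")

    root.finish()
    return [root, retrieve, answer]
```

Once every step is a record with a parent, questions such as "which prompt template produced this answer?" become lookups rather than log archaeology.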

"Phoenix is the most widely adopted open-source library for LLM evaluation in development. That is not a marketing line - it is a download counter." — Arize AI engineering team blog
The proof

Big numbers, slightly absurd.

A trillion spans a month is the kind of number that, in any other industry, you would politely assume is a typo. In Arize's case it is the actual rate at which their platform ingests AI behavior from customer systems. The category is new. The volume is not.

Arize, by the numbers

Source: Arize AI Series C announcement, Feb 2025 · TechCrunch · company materials
Series C raise: $70M
Total raised: $131M
Phoenix downloads / mo: 2M+
Spans processed / mo: 1 trillion
Evaluations / mo: 50M+
Team: ~120
Bar widths are scaled for legibility, not literal proportion. The trillion is real either way.

Who uses it: The customer roster reads like a list of companies you have probably interacted with this week without realizing.

Uber · DoorDash · Instacart · Reddit · Booking.com · Duolingo · Roblox · PagerDuty · Air Canada · Cohere · Condé Nast · Flipkart · TripAdvisor · Siemens · Microsoft · Priceline · Hyatt · PepsiCo · Wayfair
The mission

Make AI work in the real world.

That is the official phrasing, and it sounds modest in the way that genuinely large goals sometimes do. "Working" in the real world means a model that does not silently get worse over time. It means an agent that, when it does the wrong thing, can be traced, evaluated, and fixed. It means an engineering team that can ship AI on a Friday afternoon and still go home.

Arize's bet on open source is part of the same idea. Phoenix is free because Arize would rather every AI engineer had decent tools, not only the ones at companies that signed a contract. The enterprise platform is for the teams whose trillion-span problems would crush a laptop.

"If Datadog watches your servers, Arize watches your models. Same idea, different decade." — Industry observer, paraphrased

The cap table is, conveniently, a list of believers. The Series C in February 2025 - the largest single round in AI observability history - was led by Adams Street Partners, with Microsoft's M12, Datadog, PagerDuty, Sinewave, OMERS and Industry Ventures joining. (Yes, Datadog. The observability incumbent putting money into the AI observability upstart is the kind of detail that tells you which way the wind is blowing.)

Why it matters tomorrow

The agent era needs adult supervision.

The next wave of AI deployments will not be chatbots. It will be agents - systems that take many steps, call many tools, spend real money, and make real decisions on behalf of real customers. That is, on a good day, a thrilling product. On a bad day, it is a long, expensive, undebugged loop.

Arize is building the supervision layer for that future: trace every step, evaluate every output, alert when behavior changes, and give the humans a way to intervene before the bill arrives. It is unglamorous work, in the way that brake systems on cars are unglamorous. Nobody buys a car for the brakes. Nobody drives one without them.
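
"Alert when behavior changes" can be sketched in a few lines. This is a minimal illustration, not Arize's method: it assumes each output has already been labeled hallucinated or not by some evaluator, and the function names, the `max_ratio` threshold and the `min_samples` floor are all hypothetical choices. It compares a recent window against a baseline window and raises a flag when the rate has more than doubled.

```python
from dataclasses import dataclass


@dataclass
class RateAlert:
    baseline_rate: float
    recent_rate: float
    triggered: bool


def hallucination_rate(labels):
    """Fraction of evaluated outputs flagged as hallucinated (True)."""
    if not labels:
        return 0.0
    return sum(1 for flagged in labels if flagged) / len(labels)


def check_rate_shift(baseline_labels, recent_labels, max_ratio=2.0, min_samples=100):
    """Flag a shift when the recent rate exceeds max_ratio times the baseline.

    min_samples is an assumed guard: a handful of outputs should not page anyone.
    """
    baseline = hallucination_rate(baseline_labels)
    recent = hallucination_rate(recent_labels)
    enough_data = len(recent_labels) >= min_samples
    if baseline == 0.0:
        # With a spotless baseline, any flagged output counts as a change.
        triggered = enough_data and recent > 0.0
    else:
        triggered = enough_data and recent / baseline > max_ratio
    return RateAlert(baseline, recent, triggered)
```

Fed 4 flagged outputs in 1,000 as the baseline and 31 in 1,000 as the recent window, the 0.4% and 3.1% from the travel-agent story, the check fires. A production system would presumably want statistical tests and per-template breakdowns; the point here is only that the alert is a comparison between two windows of evaluated spans.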

Back to that travel-company PM, still in her Slack thread. Six years ago she would have spent the next week reading raw logs and writing a postmortem. This afternoon, the chart is already green again. The agent is back to recommending hotels that exist. The hallucination rate is 0.3%. She closes the laptop. Somewhere in Berkeley, Arize ingests another billion spans, and nothing about that fact is interesting to anyone except the people for whom it is everything.