Breaking
$12.5M seed led by General Catalyst  ·  $100M post-money valuation OpenAI & Anthropic among customers  ·  frontier labs hire Haize to break their models Cascade automates multi-turn jailbreaks  ·  out-attacks manual red-teamers Founded 2024 in New York  ·  three Harvard researchers, one dove logo 🕊️ $12.5M seed led by General Catalyst  ·  $100M post-money valuation OpenAI & Anthropic among customers  ·  frontier labs hire Haize to break their models Cascade automates multi-turn jailbreaks  ·  out-attacks manual red-teamers Founded 2024 in New York  ·  three Harvard researchers, one dove logo 🕊️
New York, N.Y. AI Safety & Reliability Est. 2024
Haize Labs logo
The dove that stress-tests the machine.
Company Profile

Haize Labs 🕊️

The startup that breaks the world's best AI models on purpose - so the loopholes get patched before you ever find them.

Automated Red-Teaming LLM Evaluation Backed by General Catalyst $100M Valuation
Above: Haize Labs' mark - a plain dove against clean white. It looks like peace. The job underneath is closer to demolition: find every way an AI can go wrong, then hand the receipts to the people who built it.
Share this profile

There is a genre of company that sells you a feeling. "AI safety," in most decks, is a feeling - a paragraph about values, a commitment to being responsible, a photo of a diverse team looking thoughtfully at a whiteboard. Haize Labs is not in that business. Haize Labs is in the business of taking your very expensive language model, poking it with algorithms until it says something it absolutely should not have said, and then emailing you a list of exactly how it happened. This is, when you think about it, a strange thing to be able to charge money for. It is also, it turns out, a thing that OpenAI and Anthropic will both pay for.

The pitch is almost aggressively simple. Software engineers have spent decades building "fuzzers" - programs that hurl millions of weird, malformed, unexpected inputs at other programs to see what crashes. AI models are software. So why, the founders of Haize Labs asked, are we testing the most consequential software of the decade by having a few clever humans type mean prompts into a chat box by hand? That is not a testing strategy. That is a hobby.

2024
Founded
$12.5M
Seed Round
$100M
Valuation
~19
Employees

What Haize Labs Actually Does

The core product is the haizing suite: a bundle of red-teaming, fuzzing, and optimization algorithms that systematically search a model's input space for anything that triggers bad behavior. Not one clever jailbreak - all of them, or as close to all as an optimizer can get. The techniques read like a graduate seminar: Monte Carlo tree search over conversation trees, bijection learning, transferred gradient-based attacks, evolutionary algorithms that mutate prompts across generations.

The word "haize" is the whole thesis compressed into a verb. To haize a model is to haze it - to put it through an ordeal designed to reveal what it is really made of. The company even publishes a public GitHub repo, cheerfully named get-haized, that shows off a subset of jailbreaks its algorithms found automatically. A safety company that open-sources its attacks is doing something slightly counterintuitive, and doing it on purpose.

Then there is Cascade, which is the part that should worry model builders and delight everyone who has to defend one. Most automated jailbreaks are single-shot: one prompt, one bad answer. Real attackers are patient. They start with an innocent question, then another, then slowly walk the model somewhere it never agreed to go. Cascade automates exactly that - it runs multi-turn conversations that begin benign and escalate, using tree search and prompt tuning to find the trajectory that ends in a jailbroken response. Haize's own reporting says it matches, and often beats, expert humans doing the same thing by hand.

More recently the company has pushed past pure attack into reliability. It takes a customer's fuzzy goals for an AI system - "don't give medical advice," "never leak this," "stay on brand" - and turns them into automated, model-based evaluators through synthetic data, adversarial attacks, and active learning that sharpens the rules over time. The endgame, per its own site, is a "Reliability Harness" for building agents you can actually deploy for mission-critical work.

"Build AI systems you can trust." — Haize Labs' stated mission

The Toolkit

Core Platform

Haizing Suite

Red-teaming, fuzzing, and optimization algorithms that sweep a model's input space to surface every input that produces undesired output.

Multi-Turn Attack

Cascade

Automated multi-turn red-teaming. Finds conversation paths where benign questions escalate into jailbroken answers - matching expert human red-teamers.

Reliability

Reliability Harness

Infrastructure for building and running expert-level AI agents for mission-critical work, with measurable reliability rather than hope.

Evaluation

Model-based Evaluators

Turns a company's goals into automated eval rules via synthetic data, adversarial attacks, and active learning that tightens over time.

Open Source

Verdict

A framework for inference-time scaling of LLMs-as-a-judge - squeezing more reliable verdicts out of evaluator models.

Open Source

j1 Reward Models

j1-micro (1.7B) and j1-nano (600M): absurdly tiny reward models built to punch far above their parameter count.

The Founders

Co-Founder & CEO

Left the first year of a Stanford PhD to build Haize. Former researcher at Berkeley's AI Research lab; the public face and voice of the company.

Steve Li
Co-Founder

Met the team as a Harvard undergraduate. Part of the founding technical core behind the haizing algorithms.

Richard Liu
Co-Founder

Rounded out the founding trio - all three met as undergrads at Harvard and shared time in academic AI research.

Three researchers, one dove, and the belief that testing AI by hand doesn't scale.

The Money

A round led by General Catalyst valued a company less than a year old at $100 million post-money. The cap table reads like a group chat of people who have built things that broke and then got fixed.

RoundAmountYearLead & Notable Investors
Seed$12.5M2024General Catalyst (lead), Soma Capital, Amjad Masad (Replit), Scott Wu (Cognition), Demi Guo (Pika), Neil Shen (Sequoia China), founders of Okta & Hugging Face
A hotly competitive seed round, closed by a startup barely out of the gate, at a nine-figure valuation. In AI safety, the scarce resource isn't ideas. It's people who can prove their ideas work. — On the round led by General Catalyst

Who's Buying

Customers & Partners

  • OpenAI - frontier model provider client
  • Anthropic - customer; helped red-team its jailbreak-defense prototypes
  • Deloitte - enterprise application-layer client
  • MongoDB - enterprise application-layer client

Frontier labs use Haize as a service. Enterprises use it as SaaS - CI/CD-style haizing plus run-time defense.

Where the Value Lands

Frontier labs
92
Enterprise apps
78
Regulated / risk
64
Open-source devs
55

Illustrative view of demand by segment - directional, not audited figures.

Why It Matters

Here is the uncomfortable structure of the AI industry. The people building the models are racing to ship. The same people are, mostly, the ones grading their own homework on whether those models are safe. This is not because anyone is a villain; it is because the incentives point one way and the clock points faster. Independent, adversarial, automated testing is the thing that is structurally hard to prioritize and structurally valuable to have.

Haize Labs sells into exactly that gap. If you ship an LLM feature, its argument is that red-teaming should look like continuous integration - the attacks run on every commit, not the night before launch, and not never. That reframing is the actual product. The algorithms are how you deliver it.

The open question, as with every young safety company, is whether "trust" survives as an independent product or gets absorbed into the platforms themselves. Frontier labs build internal red teams. Cloud providers bolt on guardrails. Haize's bet is that the adversary needs to be as fast and as tireless as the thing it's attacking, and that a dedicated shop will always run harder at that than a side team ever can.

For now, the evidence is on the wall: two of the most important AI labs on earth pay a startup founded in 2024 to find their models' worst-case behavior. That is either a very good early sign or a very good story. Often, at this stage, the two are indistinguishable - and that is the fun part.

Timeline

Watch & Read

Note: direct interview links point to search - specific video URLs weren't independently verified.

Find Haize Labs