There is a genre of company that sells you a feeling. "AI safety," in most decks, is a feeling - a paragraph about values, a commitment to being responsible, a photo of a diverse team looking thoughtfully at a whiteboard. Haize Labs is not in that business. Haize Labs is in the business of taking your very expensive language model, poking it with algorithms until it says something it absolutely should not have said, and then emailing you a list of exactly how it happened. This is, when you think about it, a strange thing to be able to charge money for. It is also, it turns out, a thing that OpenAI and Anthropic will both pay for.
The pitch is almost aggressively simple. Software engineers have spent decades building "fuzzers" - programs that hurl millions of weird, malformed, unexpected inputs at other programs to see what crashes. AI models are software. So why, the founders of Haize Labs asked, are we testing the most consequential software of the decade by having a few clever humans type mean prompts into a chat box by hand? That is not a testing strategy. That is a hobby.
What Haize Labs Actually Does
The core product is the haizing suite: a bundle of red-teaming, fuzzing, and optimization algorithms that systematically search a model's input space for anything that triggers bad behavior. Not one clever jailbreak - all of them, or as close to all as an optimizer can get. The techniques read like a graduate seminar: Monte Carlo tree search over conversation trees, bijection learning, transferred gradient-based attacks, evolutionary algorithms that mutate prompts across generations.
The word "haize" is the whole thesis compressed into a verb. To haize a model is to haze it - to put it through an ordeal designed to reveal what it is really made of. The company even publishes a public GitHub repo, cheerfully named get-haized, that shows off a subset of jailbreaks its algorithms found automatically. A safety company that open-sources its attacks is doing something slightly counterintuitive, and doing it on purpose.
Then there is Cascade, which is the part that should worry model builders and delight everyone who has to defend one. Most automated jailbreaks are single-shot: one prompt, one bad answer. Real attackers are patient. They start with an innocent question, then another, then slowly walk the model somewhere it never agreed to go. Cascade automates exactly that - it runs multi-turn conversations that begin benign and escalate, using tree search and prompt tuning to find the trajectory that ends in a jailbroken response. Haize's own reporting says it matches, and often beats, expert humans doing the same thing by hand.
More recently the company has pushed past pure attack into reliability. It takes a customer's fuzzy goals for an AI system - "don't give medical advice," "never leak this," "stay on brand" - and turns them into automated, model-based evaluators through synthetic data, adversarial attacks, and active learning that sharpens the rules over time. The endgame, per its own site, is a "Reliability Harness" for building agents you can actually deploy for mission-critical work.
The Toolkit
Haizing Suite
Red-teaming, fuzzing, and optimization algorithms that sweep a model's input space to surface every input that produces undesired output.
Cascade
Automated multi-turn red-teaming. Finds conversation paths where benign questions escalate into jailbroken answers - matching expert human red-teamers.
Reliability Harness
Infrastructure for building and running expert-level AI agents for mission-critical work, with measurable reliability rather than hope.
Model-based Evaluators
Turns a company's goals into automated eval rules via synthetic data, adversarial attacks, and active learning that tightens over time.
Verdict
A framework for inference-time scaling of LLMs-as-a-judge - squeezing more reliable verdicts out of evaluator models.
j1 Reward Models
j1-micro (1.7B) and j1-nano (600M): absurdly tiny reward models built to punch far above their parameter count.
The Founders
Left the first year of a Stanford PhD to build Haize. Former researcher at Berkeley's AI Research lab; the public face and voice of the company.
Met the team as a Harvard undergraduate. Part of the founding technical core behind the haizing algorithms.
Rounded out the founding trio - all three met as undergrads at Harvard and shared time in academic AI research.
Three researchers, one dove, and the belief that testing AI by hand doesn't scale.
The Money
A round led by General Catalyst valued a company less than a year old at $100 million post-money. The cap table reads like a group chat of people who have built things that broke and then got fixed.
| Round | Amount | Year | Lead & Notable Investors |
|---|---|---|---|
| Seed | $12.5M | 2024 | General Catalyst (lead), Soma Capital, Amjad Masad (Replit), Scott Wu (Cognition), Demi Guo (Pika), Neil Shen (Sequoia China), founders of Okta & Hugging Face |
Who's Buying
Customers & Partners
- OpenAI - frontier model provider client
- Anthropic - customer; helped red-team its jailbreak-defense prototypes
- Deloitte - enterprise application-layer client
- MongoDB - enterprise application-layer client
Frontier labs use Haize as a service. Enterprises use it as SaaS - CI/CD-style haizing plus run-time defense.
Where the Value Lands
Illustrative view of demand by segment - directional, not audited figures.
Why It Matters
Here is the uncomfortable structure of the AI industry. The people building the models are racing to ship. The same people are, mostly, the ones grading their own homework on whether those models are safe. This is not because anyone is a villain; it is because the incentives point one way and the clock points faster. Independent, adversarial, automated testing is the thing that is structurally hard to prioritize and structurally valuable to have.
Haize Labs sells into exactly that gap. If you ship an LLM feature, its argument is that red-teaming should look like continuous integration - the attacks run on every commit, not the night before launch, and not never. That reframing is the actual product. The algorithms are how you deliver it.
The open question, as with every young safety company, is whether "trust" survives as an independent product or gets absorbed into the platforms themselves. Frontier labs build internal red teams. Cloud providers bolt on guardrails. Haize's bet is that the adversary needs to be as fast and as tireless as the thing it's attacking, and that a dedicated shop will always run harder at that than a side team ever can.
For now, the evidence is on the wall: two of the most important AI labs on earth pay a startup founded in 2024 to find their models' worst-case behavior. That is either a very good early sign or a very good story. Often, at this stage, the two are indistinguishable - and that is the fun part.
Timeline
- 2024 · FoundingLeonard Tang, Steve Li, and Richard Liu launch Haize Labs in New York after meeting at Harvard and researching at Berkeley's AI lab.
- 2024 · Seed$12.5M seed round led by General Catalyst closes at a $100M post-money valuation, with a marquee list of operator-investors.
- Nov 2024 · CascadeCompany releases Cascade, an automated multi-turn red-teaming engine, and shares results publicly on its blog and X.
- 2025 · EcosystemContinues open-sourcing tooling (Verdict, j1 reward models, get-haized) and works alongside frontier labs stress-testing jailbreak defenses.
Watch & Read
Leonard Tang Interviews
Talks and podcasts featuring the Haize Labs CEO on automated AI red-teaming.
▶ Blog · DemoCascade Deep-Dive
The technical write-up on automating multi-turn jailbreaks, with examples.
▶ GitHub · Liveget-haized
A public gallery of jailbreaks the haizing suite discovered automatically.
Note: direct interview links point to search - specific video URLs weren't independently verified.