BREAKING: Raindrop raises $15M seed led by Lightspeed | "Sentry for AI agents" catches failures your dashboards never show | Customers: Replit · Speak · Clay · Framer · AngelList | Deep Search across millions of production AI events | Backed by founders of Replit, Notion, Figma & Vercel
Applied AI Research · San Francisco
The company that watches the machines that run without watching.

Raindrop

The monitoring platform for AI agents. It traces every production run and catches the failures that never throw an error - hallucinations, loops, refusals, broken tools.

Founded 2023
Seed $15M · Lightspeed
Team ~13
YC W24
The Frontpage Story

It is 2 a.m. and your AI agent is quietly lying to a customer.

Nothing is on fire. The server is up, the latency graph is flat, the error rate reads a comfortable zero. By every dashboard you own, the night is uneventful. And yet somewhere in production an AI agent just invented a refund policy that does not exist, looped on a tool call forty times, and sent a user away confused. No exception was thrown. No pager went off. The customer simply left, and took the story with them.

This is the failure mode nobody built dashboards for. Traditional monitoring was designed for software that breaks loudly - a 500, a stack trace, a crash you can grep. AI agents break politely. They hallucinate, they refuse, they drift off-script, and they do all of it while returning a perfectly valid HTTP 200. Raindrop exists to make those silent failures loud.

Based in San Francisco and often described in one tidy phrase - "Sentry for AI agents" - Raindrop is an applied AI research company building the observability layer for a generation of software that is non-deterministic by design. It records what your agents actually did, identifies what went wrong, and hands your engineers the specific run that broke.

“Raindrop is doing for AI what Sentry did for web apps - except the stakes now include hallucinations, refusals, and misaligned intent.”

- How the industry describes the company
By The Numbers

A small team, a large problem.

$15M · Seed round, Dec 2025
50+ · AI teams monitored
~13 · People on the team
1M+ · Events per signal query
What You Can Do With It

Trace it. Detect it. Prove the fix.

Raindrop turns the fog of production AI into a working loop: capture every run, surface the issues automatically, then test a fix against live traffic before you trust it.

Trace

Inspect Agent Runs

Capture every message, tool call, retry, and error from production, then replay the exact run that misbehaved instead of guessing from logs.

Detect

Issue Detection

Automatically flags silent failures - hallucinations, infinite loops, broken tools, refusals - so you learn about them before your users do.
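
As a rough illustration of what a check for one of these silent failures could look like, the sketch below flags a run in which the same tool is called with identical arguments several times in a row. The `ToolCall` type, the `find_loop` helper, and the threshold of 3 are hypothetical names and values chosen for this example; they are not Raindrop's API or its detection logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolCall:
    """One tool invocation in an agent run (hypothetical shape)."""
    name: str
    arguments: str  # serialized arguments, e.g. a JSON string


def find_loop(calls: list[ToolCall], threshold: int = 3) -> Optional[ToolCall]:
    """Return the first call repeated `threshold` or more times in a row, else None."""
    streak = 0
    previous = None
    for call in calls:
        # Extend the streak on an identical consecutive call, otherwise restart it.
        streak = streak + 1 if call == previous else 1
        previous = call
        if streak >= threshold:
            return call
    return None


# A made-up run: no exception anywhere, but the agent is stuck on one tool.
run = [
    ToolCall("lookup_order", '{"id": 17}'),
    ToolCall("get_refund_policy", "{}"),
    ToolCall("get_refund_policy", "{}"),
    ToolCall("get_refund_policy", "{}"),
]
looping = find_loop(run)
print(looping.name if looping else "no loop")
```

A real detector would likely need to tolerate near-duplicate arguments and interleaved calls; exact consecutive matching is only the simplest case.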

Measure

Signals

Small custom models tuned to the shape of your product. Watch "User Frustration" or define your own, like "Agent Stuck in a Loop," across millions of events.

Search

Deep Search

Semantic search over massive production datasets - find the one pattern that matters, not just the log line that happened to match.

Prove

Experiments & Feature Flags

Agent-native A/B testing. Run a candidate fix against real traffic and show, with data, that it actually worked.
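
One common way to "show, with data" that a fix worked is to compare the incident rate of the current agent against the candidate. The sketch below uses a standard two-proportion z-test from the Python standard library; the function name and all counts are made up for illustration, and nothing here is claimed to be how Raindrop evaluates experiments.

```python
import math


def two_proportion_z(incidents_a: int, total_a: int,
                     incidents_b: int, total_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference between two incident rates."""
    rate_a = incidents_a / total_a
    rate_b = incidents_b / total_b
    pooled = (incidents_a + incidents_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        # Both arms have identical all-or-nothing rates: no detectable difference.
        return 0.0, 1.0
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: runs flagged with an incident out of all runs in each arm.
z, p = two_proportion_z(incidents_a=120, total_a=2000,   # control (current agent)
                        incidents_b=80, total_b=2000)    # candidate fix
print(f"z = {z:.2f}, p = {p:.4f}")
```

A positive z with a small p-value suggests the candidate's lower incident rate is unlikely to be noise, under the test's usual assumptions of independent runs and reasonably large samples.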

Protect

PII Guard & Slack Triage

Server-side redaction keeps sensitive data out of view, while a Triage Agent lets your team investigate incidents straight from Slack.
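
To make "server-side redaction" concrete, here is a minimal sketch that masks email addresses and US-style phone numbers before a message is stored. The patterns and the `redact` helper are assumptions for this example only; production PII detection covers many more categories and formats, and this is not Raindrop's implementation.

```python
import re

# Deliberately simple patterns for illustration; real PII detection needs far more coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact(text: str) -> str:
    """Replace email addresses and US-style phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return US_PHONE.sub("[PHONE]", text)


print(redact("Reach me at jane.doe@example.com or 415-555-0134."))
# prints "Reach me at [EMAIL] or [PHONE]."
```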

How Signals Work

Naming the failures generic monitoring can't.

The point of a custom signal is simple: the metrics that matter for your agent are ones only you can name. Below is an illustrative view of incident rates a team might track. Figures are indicative, for explanation only.

Illustrative incident rate by signal · sample of monitored events
User Frustration · 72
Agent Stuck in a Loop · 54
Hallucinated Fact · 41
Broken Tool Call · 33
Unwarranted Refusal · 19
* Illustrative only - not Raindrop's published figures. Real dashboards reflect each customer's own product and traffic.
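
For readers who want the arithmetic behind a per-signal incident rate, the sketch below computes the share of events on which each signal fired. It assumes each event has already been labeled with a set of signal names (by whatever classifier a team uses); the event shape, the `incident_rates` helper, and the sample data are hypothetical.

```python
from collections import Counter


def incident_rates(events: list[set[str]]) -> dict[str, float]:
    """Map each signal name to the fraction of events it fired on."""
    if not events:
        return {}
    counts = Counter(signal for fired in events for signal in fired)
    return {signal: n / len(events) for signal, n in counts.items()}


# Each set lists the (hypothetical) signals that fired on one event.
sample = [
    {"User Frustration"},
    {"User Frustration", "Agent Stuck in a Loop"},
    set(),
    {"Broken Tool Call"},
]
for signal, rate in sorted(incident_rates(sample).items(), key=lambda kv: -kv[1]):
    print(f"{signal}: {rate:.0%}")
```

Because one event can trigger several signals, the rates are independent of each other and need not sum to 100%.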
The People

Second-time founders, zero-tolerance backgrounds.

Zubin Koticha
CEO & Co-founder

Previously co-founder and CEO of Opyn, an early DeFi options platform later acquired by Coinbase.

Alexis Gauba
Co-founder

Also a co-founder of Opyn. A second-time founder building monitoring for software that behaves probabilistically.

Ben Hylak
Co-founder

Worked on visionOS at Apple and avionics software at SpaceX - places where silent failure was never an option.

The Argument

Why old evals fail new agents.

The founders' argument is blunt: the tools most teams use to measure AI were built for chatbots. A chatbot answers a question and stops. A modern agent picks up thousands of tools, runs for minutes or hours, and makes a long chain of decisions where any single link can quietly bend the outcome. You cannot score that with a one-shot benchmark.
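
A back-of-the-envelope calculation shows why long chains are hard to score with a single check. The sketch assumes every step succeeds independently with the same probability, which is a simplification introduced here and not a figure from Raindrop; `chain_success` is a hypothetical helper.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return per_step ** steps


for steps in (1, 10, 100):
    print(f"{steps:>3} steps at 99% each: {chain_success(0.99, steps):.1%} end-to-end")
```

Under these assumptions, a step that is right 99% of the time still leaves a 100-step run fully correct only about 37% of the time, which is the gap a one-shot benchmark does not see.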

Raindrop's answer is to train small, custom models to the exact shape of each customer's product rather than lean on one generic classifier. That specialization is the whole idea - it is why the platform can see a behavior as specific as "UI Aesthetic Complaints" and track how often it happens across a river of events. Specialization beats scale when the terrain keeps changing.

Investors noticed. The $15M seed was led by Lightspeed, with checks from Figma Ventures, Vercel Ventures, Y Combinator, and a roster of operators who run AI products themselves: the founders of Replit, Notion, Framer, Cognition, and Speak. When the people shipping agents put money into the company that watches agents, it reads less like a bet and more like buying insurance.

“The intelligence behind intelligence.”

- Raindrop's stated mission
Who Uses It

Trusted by teams shipping real AI.

Raindrop watches production agents for a growing list of AI-first companies - more than 50 teams in all.

Replit · Speak · Clay · Framer · AngelList · Tolan · Avoca · Browserbase · Spellbook · Lex · Vercel · + 40 more
The Story So Far

Milestones.

2023
Raindrop is founded in San Francisco by Zubin Koticha, Alexis Gauba, and Ben Hylak.
WINTER 2024
Goes through Y Combinator's W24 batch as "Sentry for AI agents."
2025
Adopted by 50+ AI product teams; launches Deep Search for patterns across millions of production events.
DEC 2025
Announces a $15M seed round led by Lightspeed Venture Partners to fund its work on detecting critical AI agent failures.
Back To 2 a.m.

The night, replayed - this time with the lights on.

Return to that quiet 2 a.m. The graphs are still flat and the error rate still reads zero. But now, in a Slack channel, a message arrives before the customer ever hits send on their complaint: an agent invented a refund policy, looped on a tool, and left a user frustrated - here is the exact run, here is the signal that caught it, here is the incident rate over the last thousand conversations.

The failure did not get quieter. The room got a way to hear it. That is the whole shift Raindrop is chasing: not fewer bugs in a demo, but a feedback loop where production behavior surfaces the problem, an engineer ships a fix, and an experiment proves it worked against real traffic. The machines still run without watching. Raindrop just makes sure someone is.

Find Raindrop

Links & sources.

Profile compiled from public sources including raindrop.ai, Y Combinator, Crunchbase, Lightspeed, and press coverage. Funding, team size, and customer details reflect publicly reported figures as of mid-2026 and may have changed. Signal chart is illustrative, not Raindrop's published data.