  • Jesse Robbins: Amazon's "Master of Disaster" turned DevOps founder turned VC
  • Invented chaos engineering before anyone called it that
  • Co-founded Chef - used by Apple, Facebook, Google, IBM and Microsoft
  • Five portfolio IPOs including PagerDuty, Fastly and Instacart
  • General Partner at Heavybit - the developer tools VC
  • Volunteer firefighter who treats server outages like fire scenes
  • Don't fight stupid, make more awesome - Jesse Robbins
  • MIT Technology Review "Innovators Under 35" class of 2011
Profile / Venture Capital / Developer Tools

Jesse Robbins

General Partner - Heavybit Industries - San Francisco

Invented chaos engineering at Amazon. Co-founded Chef. Seeded the global DevOps movement. Now backs the developer-first companies that run the internet - from inside a Victorian firehouse on Octavia Street.

Venture Capital Developer Tools DevOps Chaos Engineering Cloud Infrastructure Open Source

Jesse Robbins - Heavybit Industries, San Francisco

"Failure happens and anyone who tells you otherwise is lying."
- Jesse Robbins
60+ Portfolio Companies
5 Portfolio IPOs
$200M+ Chef Acquisition
2003 GameDay Invented

When the servers go down, he's been here before


In 2003, Jesse Robbins walked into an all-hands meeting at Amazon and proposed something unusual: let's intentionally break the website. Not because something went wrong - because carefully chosen, well-orchestrated failure teaches you things that success never can. He called it GameDay. His managers called it authorized. The rest of the industry, years later, called it chaos engineering.

Robbins grew up in Manchester-by-the-Sea, Massachusetts, and trained as a firefighter and EMT before any of the tech chapter happened. That's not a footnote to his career - it's the whole operating system. When you arrive on a fire scene, you don't debug the fire. You command the response. You stop the spread. You get people out. The instinct to act under uncertainty, to preserve core function while chaos burns around the edges, came from years of pulling hose and running calls before he ever wrote a line of runbook.

Amazon recognized the unusual package they'd hired. As Website Availability Manager, Robbins held formal responsibility for keeping all Amazon brand properties online - a role that eventually earned him the title "Master of Disaster," approved by management and printed without irony. By the time he left, he had contributed to a culture of deliberate resilience that would go on to influence Google, Netflix (hello, Chaos Monkey), Yahoo, and Facebook.

"The best thing your incident commander can do during an outage is STOP debugging."
Jesse Robbins

In 2007, with Tim O'Reilly, Robbins co-founded the Velocity Conference on Web Performance and Operations - at the time, a niche gathering for people who cared about whether websites stayed up. It grew into the place where a loose community of practitioners named and formalized what they were doing. The DevOps movement, as a recognized field with a philosophy and a vocabulary, traces a clear line back to Velocity.

Then came Chef. Co-founded in 2008 as Opscode with Barry Steinglass, Nathen Haneysmith, and Joshua Timberman, Chef was an open-source infrastructure automation framework that let engineers describe their server configurations as code. Apple, Facebook, Google, IBM, and Microsoft all ran it. Hundreds of thousands of developers used it. Progress Software acquired it in 2020 for over $200 million. The company spent a decade proving that infrastructure as code wasn't a clever idea - it was the only sensible way to operate at scale.
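
To make "server configurations as code" concrete, here is a minimal Python sketch of the underlying idea: declare the desired state as data, then run an idempotent step that changes only what differs. This is not Chef's DSL or API (Chef recipes are written in Ruby). The names `PackageSpec` and `converge`, and the package versions, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageSpec:
    """One line of desired state: this package at this version (hypothetical type)."""
    name: str
    version: str


def converge(desired, installed):
    """Bring `installed` (name -> version) in line with `desired`.

    Returns the list of actions taken. Running it again on an already
    converged server returns an empty list, which is the idempotence
    that makes a configuration safe to re-apply.
    """
    actions = []
    for spec in desired:
        if installed.get(spec.name) != spec.version:
            installed[spec.name] = spec.version  # stand-in for a real install step
            actions.append(f"install {spec.name}={spec.version}")
    return actions


# Illustrative data only: the versions are made up for the example.
desired_state = [PackageSpec("nginx", "1.24"), PackageSpec("openssl", "3.0")]
server = {"nginx": "1.18"}

first_run = converge(desired_state, server)   # upgrades nginx, installs openssl
second_run = converge(desired_state, server)  # nothing left to do
```

The design point is that the description, not a sequence of manual commands, is the source of truth: the same file can be reviewed, versioned, and applied to one server or a thousand.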

Robbins stepped back from the CEO role and into Chief Community Officer, then turned his attention to Orion Labs - a voice communication platform he co-founded with a fellow firefighter. The product, called Onyx, was described as a real-life Star Trek communicator: instant voice for teams, IoT devices, and field workers who needed to communicate without looking down at a phone. It was built by people who knew what it meant to need reliable, fast communication in a situation where fumbling with a touchscreen could cost you something.

Heavybit, the San Francisco venture firm focused exclusively on developer-first companies, had been in Robbins's orbit since 2014, when he joined part-time. When he went full-time in 2022, he brought a portfolio mind shaped by having lived the entire arc of modern developer infrastructure - from the early chaos of web operations to the emergence of DevOps to the cloud-native era to today's AI infrastructure moment.

Breaking things on purpose

GameDay: The Chaos Engineering Origin Story

Before Netflix had Chaos Monkey. Before Google had DiRT. Before "chaos engineering" existed as a term. Amazon had GameDay - and Jesse Robbins invented it.

The idea came from his firefighter training: you don't learn how to fight a fire by reading about it. You learn by drilling, simulating, practicing under controlled conditions until the response becomes muscle memory. He applied the same logic to web infrastructure: schedule a failure, observe the response, learn everything you can, and be better prepared for the unscheduled version.

  • Deliberately caused major Amazon outages on schedule
  • Observed team response and infrastructure behavior under stress
  • Documented failure modes before they became incidents
  • Directly influenced Netflix Chaos Monkey and Google DiRT
  • Established the intellectual foundation for the SRE field

The insight underneath GameDay isn't just technical - it's psychological. Complex distributed systems will fail. That's not a pessimistic statement; it's an engineering reality that anyone who has operated internet infrastructure knows in their bones. The question isn't whether failure happens, it's whether you've encountered this specific failure before - in a controlled environment where you had time to think, or at 2am on a Tuesday when the on-call engineer is half-asleep and the CEO is calling.

Robbins drew from research on normal accidents theory and human cognitive performance under stress. When people are surprised by a system failure, cognitive load spikes and decision quality drops. When people have seen a failure before - even in simulation - they have a mental model to match against, a pattern to recognize, a playbook to reach for. GameDay wasn't just about finding bugs. It was about building the organizational muscle memory to respond to them.
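
The drill loop described above (schedule a failure, observe the response, restore, write down what you learned) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Amazon's GameDay tooling: `Replica`, `Service`, and `run_gameday` are hypothetical names, and a "request" here succeeds whenever at least one replica is healthy.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    healthy: bool = True


@dataclass
class Service:
    replicas: list

    def handle_request(self) -> bool:
        # Toy rule: a request succeeds if any healthy replica can serve it.
        return any(r.healthy for r in self.replicas)


def run_gameday(service, victims, requests=100):
    """Take the named replicas down, measure availability, then restore them."""
    by_name = {r.name: r for r in service.replicas}
    for name in victims:
        by_name[name].healthy = False  # the scheduled, intentional failure
    try:
        served = sum(service.handle_request() for _ in range(requests))
    finally:
        for name in victims:
            by_name[name].healthy = True  # always restore, even if the drill errors
    return {"victims": list(victims), "availability": served / requests}


svc = Service([Replica("a"), Replica("b")])
partial = run_gameday(svc, ["a"])        # one replica down: redundancy holds
total = run_gameday(svc, ["a", "b"])     # everything down: the drill exposes the gap
```

The `finally` block is the part that separates a drill from an outage: the failure is bounded, reversible, and leaves behind a record the team can study.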

The idea propagated quietly through the industry. Google built DiRT (Disaster Recovery Testing). Netflix built Chaos Monkey, then an entire Simian Army. Yahoo, Facebook, and others followed with their own variants. The broader field, eventually named "chaos engineering," became a recognized discipline with dedicated conferences, books, and tools. Robbins's original insight - that failure is a feature you can design into your learning process - held up across every implementation.

"For every dollar spent in failure, learn a dollar's worth of lesson."
Jesse Robbins

He took the same instinct from the data center to the boardroom. At Heavybit, he works with founders whose companies will face their own version of GameDay - the first enterprise customer who finds an edge case, the first outage that reaches a paying customer, the first scaling failure when growth arrives faster than the architecture can absorb. His pitch to those founders: plan for this now, while you have time to learn.

What Jesse Robbins says

"Failure happens and anyone who tells you otherwise is lying."
On resilience
"The best thing your incident commander can do during an outage is STOP debugging."
On incident management
"Don't fight stupid, make more awesome."
Personal motto
"You only make money when your website is up."
On availability
"I'm a dreamer. I believe in the incredible potential of people."
On optimism
"Relentless, unstoppable, driven founders solving a problem they know firsthand."
Investment thesis

From fire scenes to fireside chats

Pre-2001
Volunteer Firefighter & EMT
Trained in incident command, emergency management, and operational response under uncertainty. The mental models that shaped everything that followed.
2001
Amazon - Website Availability Manager
Joined Amazon to manage availability across all brand properties. Official title: "Master of Disaster" - manager approved, printed without embarrassment.
2003
Invented GameDay
Created the deliberate chaos engineering practice at Amazon - scheduled, intentional system failures to build resilience and organizational muscle memory. Template for Netflix Chaos Monkey.
2007
Co-founded O'Reilly Velocity Conference
With Tim O'Reilly, created the conference on Web Performance and Operations that became the birthplace of the global DevOps movement.
2008
Co-founded Opscode (Chef)
Left Amazon to build an infrastructure automation company. Co-founders: Barry Steinglass, Nathen Haneysmith, Joshua Timberman.
2009-2010
Chef Launches & Scales
Chef shipped as open source. Raised $2.5M Series A (Draper Fisher Jurvetson), then $11M Series B (Battery Ventures). Became market leader for infrastructure as code.
2011
MIT TR35 Award
MIT Technology Review named Robbins one of the world's top innovators under 35 for transforming how web companies design and manage infrastructure.
2014
Co-founded Orion Labs + Joined Heavybit
Started Orion Labs with a fellow firefighter to build enterprise voice communication. Joined Heavybit as part-time General Partner.
2020
Chef Acquired for $200M+
Progress Software acquired Chef. Twelve years after founding, the company that made infrastructure as code mainstream found its exit.
2022
Full-Time at Heavybit
Transitioned to full-time General Partner as Heavybit closed $80M fund. Now leads investments in developer tools, AI infrastructure, and cloud-native security.

Why the firefighter still matters

🚒
Emergency to Engineering

Jesse Robbins's firefighter training isn't biographical color. It's the source code for how he thinks about systems, failures, and teams under pressure.

  1. Incident command - the firefighter structure of a clear commander stopping analysis paralysis - became his model for web ops response
  2. Controlled burns - practicing fire response before it's needed - became GameDay at Amazon
  3. Orion Labs built exactly the communications tool he wished he'd had on fire scenes
  4. Hurricane Katrina deployment applied tech community skills to real disaster response, collaborating with the United Nations on crisis technology

Robbins deployed during Hurricane Katrina as part of a task force, applying the same emergency management skills he'd developed as a volunteer. That experience reinforced something he already suspected: the gap between how the tech industry organized disaster response and how trained emergency responders organized it was vast, unnecessary, and fixable.

He bridged the gap. He worked with traditional emergency management organizations, collaborated on crisis technology innovations later adopted by the United Nations, and evangelized the idea that internet infrastructure and emergency response had more in common than either community acknowledged. Both dealt with complex systems. Both required clear decision-making under high uncertainty. Both needed the discipline to act when you didn't have complete information, because waiting for perfect information wasn't an option.

"You only make money when your website is up."
Jesse Robbins

The same instinct that led Robbins to GameDay led him to Orion Labs. His co-founder was also a firefighter. The product they built - instant voice communication for teams and field workers, eventually dubbed a "real-life Star Trek communicator" - addressed a friction point both founders had felt in their EMT and firefighting work. The smartphone was the wrong device for the job when the job required immediate, reliable, hands-free communication. Onyx was the right device. The insight was earned, not inferred.

When Robbins evaluates companies at Heavybit now, he looks for the same quality in the founders he backs. Not technical skill alone - the Valley is full of technical skill. He's looking for founders who know the problem firsthand. Who have been the person on call when the system failed. Who have pulled the hose. Who understand the gap between the tool that exists and the tool that should exist, because they've been in the situation where the difference mattered.

Heavybit: where developer tools grow up

Heavybit operates out of a Victorian firehouse at 523 Octavia Street in San Francisco - which says something about institutional self-awareness. Founded in 2013, the firm specializes exclusively in developer-first companies: the tools, APIs, platforms, and infrastructure that engineers choose, use, and sometimes love. The portfolio includes Snyk, PagerDuty, Tailscale, LaunchDarkly, Sanity, CircleCI, Blockdaemon, and more than sixty others.

Robbins joined full-time in 2022, when Heavybit closed an $80 million fund. He sources and leads investments and advises portfolio founders on go-to-market strategy, product development, and company-building. His perspective on developer tools investing reflects the complete arc of his career: he's been the engineer running infrastructure, the founder building the product, and now the investor writing the check.

On AI developer tools, Robbins has been publicly skeptical of what he calls "AI washing" - slapping an AI feature onto an existing product without building genuine integration or generating real value. His investment thesis in the current moment emphasizes founders who can demonstrate clear ROI, products where AI is structurally embedded rather than cosmetically added, and companies with a clear path from developer adoption to revenue. His lesson from watching 80+ developer tool companies: they live and die by bottom-up adoption, and the ones that survive long enough to matter are the ones with disciplined revenue thinking from the start.

The Seed 100 named Robbins to its list of best early-stage investors in 2021. Five of his portfolio companies have reached public markets: PagerDuty, Instacart, Fastly, Caribou Biosciences, and Zymergen. Fourteen private companies in the portfolio have exceeded $500 million in valuation. The wins include bets on Snyk and Tailscale, companies that became category-defining in cloud-native security and secure networking respectively.

Investment Focus
Developer-first companies. Pre-seed to Series A. Cloud infrastructure, security, APIs, and AI developer tools.
What He Backs
Founders who know the problem firsthand. Products that create an "instant worldview shift" from first experience. Clear bottom-up adoption paths.
What He Avoids
"AI washing" - superficial AI features without genuine integration or measurable ROI. Hype without a clear revenue path or adoption strategy.
Support Model
Hands-on. Heavybit provides go-to-market support, a network of advisors, and peer community for technical founders scaling from developer adoption to enterprise revenue.

The bets that built careers

Heavybit's portfolio under Robbins's tenure reads like a who's-who of the developer tools landscape: Snyk (cloud-native security), PagerDuty (incident management), Tailscale (secure networking), LaunchDarkly (feature flags), Sanity (content infrastructure), CircleCI (CI/CD), Blockdaemon (blockchain infrastructure). The through-line is developer-first products that expand from engineering champions to enterprise buyers - a motion Robbins understands from having lived it at Chef.

What's worth noting is the timing. Robbins backed several of these companies before the categories they pioneered had names. PagerDuty was DevOps incident management before DevOps was a mainstream job title. Snyk was developer security before "shift left" became a conference phrase. LaunchDarkly was feature flagging before every major platform built native support for it. Getting in early on category-defining companies requires either luck or the ability to recognize an emerging category from the inside - and Robbins had been inside most of these categories before the venture money arrived.

Portfolio IPOs
  • PagerDuty
  • Fastly
  • Instacart
  • Caribou Biosciences
  • Zymergen
Snyk - Cloud-native security
Tailscale - Secure networking
LaunchDarkly - Feature flags
PagerDuty - Incident management - IPO
Fastly - Edge cloud platform - IPO
CircleCI - CI/CD automation
Sanity - Content infrastructure
Blockdaemon - Blockchain infrastructure

The details that define him

Fact 01
His Amazon title "Master of Disaster" was officially approved by management and printed on the organizational chart. It was not a joke. It was not self-assigned. It was accurate.
Fact 02
GameDay, the chaos engineering practice Robbins invented at Amazon in 2003, directly inspired Netflix's Chaos Monkey - arguably the most famous engineering experiment in Silicon Valley history.
Fact 03
He and his Orion Labs co-founder Greg Albrecht both had firefighting backgrounds. The product they built - instant voice for teams - was literally the communications tool they wished they'd had on the job.
Fact 04
Heavybit operates out of a Victorian firehouse. The location is not ironic. Robbins is an actual firefighter. The building was not chosen accidentally.
Fact 05
He joined Twitter in March 2007 - early enough that most of the platform's eventual user base didn't know it existed yet. The account now has over 7,600 followers.
Fact 06
Chef, the company Robbins co-founded in 2008, ended up running infrastructure at Apple, Facebook, Google, IBM, and Microsoft simultaneously - a portfolio of customers that only happens when the product is genuinely good.