BREAKING · Gal Vered bets verification is the next hard job in software · NAVY OFFICER → GOOGLE PM → AI TESTING FOUNDER · CHECKSUM.AI SHIPS 100-150 TESTS IN WEEK ONE · TESTS THAT HEAL THEMSELVES WHEN THE CODE CHANGES · NOW ON THE GOOGLE CLOUD MARKETPLACE
The {Closed} File · Profile No. 001

Gal Vered

He spent a career learning how software is built. Then he decided the harder, lonelier question was whether it actually works - and built a company around the answer.

On The Record
Gal Vered, co-founder and CEO of Checksum.ai, caught mid-grin - the look of a man who enjoys finding bugs before customers do.

100-150 Tests, week one
5 Lives lived: Navy, MBA, Google, CTO, founder
2 Frameworks: Playwright & Cypress
0 Tests you maintain by hand

He decided proving it works was the real job

Gal Vered runs Checksum.ai, a San Francisco company with an unglamorous job and an outsized idea. The job: write the end-to-end tests that nobody wants to write, then keep them alive while the product underneath them keeps changing. The idea: as artificial intelligence writes more and more of the world's code, the scarce skill is no longer generating software. It is proving the software works.

That distinction sounds academic until you have shipped something. Anyone can produce a thousand lines of plausible code in an afternoon now. The question that keeps a CEO awake is whether those lines do what they promise when a real person clicks the button at 2am. Checksum's whole pitch lives in that gap. It watches real user sessions, learns the flows that matter, and turns them into Playwright and Cypress tests - the kind of coverage most teams swear they will get to and never do.

Vered's title is co-founder and CEO. His path there is the more interesting story, because almost none of it looks like a straight line.

A navy, a business school, and a search engine walk into a startup

Before the term sheets and the test frameworks, Vered was an officer in the Israeli Navy. He talks about it not as a war story but as a tempo. On the super{set} {Closed} Session podcast he traced his bias for speed straight back to the deck: you operate very fast, you do not have time, everything needs to happen now. It is the kind of formation that does not wash off. Founders who move slowly rarely survive; Vered learned urgency before he learned a balance sheet.

The balance sheet came next. He earned an MBA at Northwestern's Kellogg School of Management between 2016 and 2018, then landed as a product manager at Google - the place where you learn how very large software is shipped, and how much of it is held together by tests that someone, somewhere, dreads updating. After Google he became chief technology officer at SEER, a Y Combinator-backed company, which is where the founder reflex usually gets either cured or confirmed. For Vered it was confirmed.

Stack those up - Navy officer, Kellogg MBA, Google PM, YC-backed CTO - and you get a resume that reads like a Silicon Valley sampler platter. The unifying thread is not industry. It is the same person repeatedly volunteering for the part of the work that is hard, dull from the outside, and quietly load-bearing.

QA automation is more than testing. It is the foundation for AI engineering excellence to truly thrive.

- Gal Vered

Why testing, of all things

Testing is the broccoli of software. Everyone agrees it is good for you. Almost nobody finishes their plate. Engineers write tests when they have time, which is never, and the tests they do write rot the moment the feature changes. The result is a familiar dread: a green checkmark you do not quite trust, sitting on top of a product nobody has the hours to verify.

Checksum's answer is to stop selling a tool and start selling a result. The company prices by the number of workflows it keeps healthy rather than by seat or by run - a results-as-a-service model where the customer buys passing tests, not test software. In a customer's first week Checksum generates somewhere between 100 and 150 tests. When the underlying app changes and a test breaks, the system heals it, rewriting the broken script instead of paging a human at midnight. The tagline is blunt: ship faster without trading off quality.
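
For readers who want to see what "healing" a test could mean in practice, here is a minimal sketch. It is not Checksum's method, which the company describes only as AI rewriting the broken script. The sketch assumes a much simpler stand-in: when a test's old selector disappears, look for the element in the new page that still shares the most stable traits (visible text, role, name) with the old one. The function name, the dictionary layout and the sample selectors are all hypothetical.

```python
# Illustrative only: a toy "self-healing" step, not Checksum's implementation.
# Elements are modeled as plain dicts; a real tool would inspect a live DOM.

STABLE_TRAITS = ("text", "role", "name")  # assumed traits that survive a redesign


def heal_selector(old_element, new_dom):
    """Return the selector of the element in new_dom most similar to old_element.

    Returns None when nothing in the new page shares any stable trait,
    which is the case where a human (or a smarter agent) has to step in.
    """
    def score(candidate):
        # Count the stable traits on which the candidate matches the old element.
        return sum(
            1
            for trait in STABLE_TRAITS
            if old_element.get(trait) and candidate.get(trait) == old_element.get(trait)
        )

    best = max(new_dom, key=score, default=None)
    if best is None or score(best) == 0:
        return None
    return best["selector"]


# Hypothetical example: the "Buy now" button lost its id in a redesign.
old_button = {"selector": "#buy-btn", "text": "Buy now", "role": "button"}
redesigned_page = [
    {"selector": "#nav-home", "text": "Home", "role": "link"},
    {"selector": "[data-test=checkout]", "text": "Buy now", "role": "button"},
]
healed = heal_selector(old_button, redesigned_page)
```

In this toy case the broken step would be re-pointed at the renamed button rather than failing outright. A real system has to handle far murkier cases, which is the part that needs the AI.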

There is a sharper version of the thesis, and it is the one Vered keeps returning to. The current wave of AI coding agents can produce code at a frightening clip. But an engineering agent that cannot verify its own output is a liability, not a colleague. You cannot trust an autonomous coder until something equally capable can check the work. So Checksum is not really in the testing business. It is building the QA agent that the next generation of AI engineers will need before anyone dares hand them the keys.

To build an engineering AI agent capable of tackling the impossible, you need a QA AI agent that is just as capable.

- Gal Vered, on the next chapter of AI

Watching what people actually do

One small detail captures the whole philosophy. Most test suites cover what engineers imagine users will do. Checksum learns from what users actually did - real sessions, real clicks, real detours - and writes its tests from that evidence. It is the difference between a map drawn from memory and a map drawn from footprints. The footprints are messier, and they are also true.

That instinct - trust the real signal over the tidy assumption - runs through everything Vered builds. A Navy officer does not get to assume the sea is calm. A product manager at Google does not get to assume the feature works because the demo did. And a testing company does not get to assume coverage because someone wrote a test once, long ago, for a screen that has since been redesigned three times.

Where it stands now

Checksum was founded in 2022 inside the super{set} startup studio, raised a seed round, and in 2025 graduated from Google Cloud's Emerging Partner Springboard Program before landing on the Google Cloud Marketplace - the kind of distribution milestone that turns a clever product into a buyable one. Its customers span fintech, insurance, travel, and SaaS, the unglamorous industries where a broken checkout flow is not a bug report but a lost quarter.

Vered, for his part, keeps writing and talking about the same conviction from slightly different angles: on podcasts, on The Test Tribe, on stages. Generation got easy. Verification got hard. The companies that win the next decade of software will be the ones that can prove, continuously and cheaply, that the thing they shipped this morning still works this afternoon. He is building the proof.

It is, when you step back, a deeply unfashionable bet. Quality assurance is the part of the demo nobody claps for. But the people who have actually shipped software know where the bodies are buried, and they tend to nod. Vered is one of them - which is probably the point.

Generation got cheap. So the bottleneck moved.

Vered's argument in a nutshell: when AI can write the code, the hard part is no longer writing it. It is trusting it. Checksum sells the second half.

Writing the code (increasingly automated) → then what? → Proving it works (the new hard problem)

The founder, quoted

01

You operate very fast - like you don't have time. Everything needs to happen now.

02

QA automation is more than testing. It is the foundation for AI engineering excellence to truly thrive.

03

Without a potent QA AI agent, the dream of independent, real-world-ready AI engineering agents remains out of reach.

04

The next chapter of AI is here, and it starts with a simple idea.

Four things worth knowing

Navy first. Before software, Vered was an officer in the Israeli Navy - the source, he says, of his everything-needs-to-happen-now tempo.
The sampler platter. Navy officer, Kellogg MBA, Google PM, CTO at a Y Combinator-backed company, founder. Five careers, one stubborn thread: volunteer for the load-bearing part.
Tests from footprints. Checksum learns what to test by watching real user sessions, so coverage maps what people do - not what engineers guess they do.
Self-healing. When the app changes and a test breaks, the AI rewrites it. No one gets paged at midnight.


Filed from San Francisco. Sources: super{set}, Checksum.ai, The Test Tribe, Google Cloud, public interviews.