She Read the Data Before Anyone Else Did
The year was 2005. A Stanford graduate named Sarah Catanzaro took a job that most people couldn't explain at dinner parties: she was applying computational linguistics and network analysis to track Somali pirate organizations and model insurgent behavior. The tools were rough. The data was incomplete. The stakes were non-negotiable.
Two decades later, she's a General Partner at Amplify Partners, running one of Silicon Valley's sharpest early-stage funds focused on data infrastructure and machine learning tooling. The tools are more sophisticated. The problems are less likely to involve actual pirates. The core work - making sense of complex systems from incomplete data - hasn't changed.
We don't invest in people who are just enamored with new technologies. We invest in people who want to solve problems and want to build products.
- Sarah Catanzaro
Catanzaro grew up in a home where experimentation was ambient - her father is a molecular biologist, her mother a psychiatrist and clinical researcher. Playing with liquid nitrogen in her father's lab wasn't unusual. Asking hard questions about human behavior wasn't either. When 9/11 happened during her college years at Stanford, a question crystallized that would shape the next decade: what motivates people toward violence?
That question sent her to the Center for Advanced Defense Studies, where she worked with MIT AI researchers and Carnegie Mellon computer scientists to answer it at scale. Then to Qinetiq, where she was deployed to the US Secret Service to build threat intelligence systems that used NLP to protect presidential candidates - including Barack Obama - during the 2012 election cycle.
From the Secret Service, she moved through Palantir (where she watched government agencies struggle with data integration in real time) and Cyveillance, eventually landing at Mattermark as Head of Data. Mattermark was trying to do what she'd always done: extract signal from messy, incomplete information about private companies. She ran the data team. She applied machine learning to startup intelligence. And she started noticing something.
The tools available to data teams were broken. Not in obvious ways - they existed, they mostly worked, they got the job done. But they weren't designed for how real teams actually worked. They didn't integrate with each other. They required enormous manual effort. They weren't built for practitioners; they were built for engineers with days to spare.
Sarah Catanzaro's investment thesis didn't come from whitepapers. It came from spending years as the person trying to do the work - building data dictionaries, debugging integration points, and realizing that the tools she wished she had didn't exist yet.
That practitioner's eye is what got her in the door at Amplify Partners in 2017. She was introduced by Shivon Zilis, and the firm was looking for something specific: someone who had actually done the work. "Impossibly smart with no discernible ego" is how she was described internally. She joined as a Principal. By 2020, she was Partner. By 2022, General Partner.
Amplify's model is thesis-driven investing at the earliest stages - often pre-product, sometimes pre-revenue, occasionally pre-complete-team. Catanzaro's thesis covers every point where data is collected, stored, managed, analyzed, and modeled: essentially the entire data and ML stack. And she has the practitioner credibility to evaluate it at a technical level that most VCs can't match.
She maintains hands-on data work specifically to preserve that conviction. Not as a performance of humility, but as a practical necessity. If she can't evaluate a technical claim herself, she doesn't trust the investment.
Without good data, there are no good models. But good data is really, really hard to get.
- Sarah Catanzaro, on leading DatologyAI's seed round
Her portfolio is a map of problems she identified before they became consensus. RunwayML - creative AI before generative AI was mainstream. Hex - collaborative analytics notebooks for modern data teams. Modal Labs - serverless compute built for ML workflows. DatologyAI - data curation for model training, which she seeded in February 2024 and described as addressing the root problem that most AI projects fail to acknowledge. LangChain - the orchestration layer for agent engineering, invested at a $1.25 billion valuation.
The exits tell the story just as clearly: Bayes acquired by Airtable, Einblick by Databricks, Eppo by Datadog, WarpStream by Confluent. These aren't flukes. They're companies that Catanzaro identified as solving real problems in the data stack, before the acquirers recognized they needed to own those solutions.
On the subject of MLOps - the operational layer that gets ML models from experiment to production - she has been consistently blunt. "It's still just as hard to get a model into production," she said in 2022, despite hundreds of new vendors claiming otherwise. "Tools are not well integrated with each other. Teams must cobble together all of these pieces." The market has since validated this assessment with aggressive M&A activity across the sector.
What Catanzaro looks for in founders is specific and unsentimental. She wants people who are obsessed with a problem, not with a technology. She wants founders who understand why they're building what they're building - not just the what, but the urgent specific problem that makes this startup necessary right now. She'll take pre-product. She won't take pre-conviction.
What it really takes to succeed in venture is stamina. Being always on, always aware, and comfortable with uncertainty.
- Sarah Catanzaro
She has been candid about what venture requires from women specifically. Female VCs must hustle harder without traditional networks, she's said - and she uses that harder path to emphasize the importance of role models. She's spoken at Strata Data Conference, apply(conf) by Tecton, and PyTorch Conference 2025. She launched a weekly newsletter in 2019 specifically because existing data newsletters were either too shallow or too academic - none built for practitioners who needed content they could use on a Tuesday morning before a sprint.
The year she had her first child, she made three new investments, closed four follow-on rounds, helped orchestrate a nine-figure exit, grew her team, and attended more conferences and dinners than she could count. "This isn't a humble brag," she wrote on X. "I'm straight up bragging."
The precision of that statement is very Sarah Catanzaro. She tracks what matters, reports it accurately, and doesn't dress it up. The data is the story.