Where 99% Accuracy Isn't Good Enough
There was a weapons-detection AI model that passed every benchmark thrown at it. Its aggregate accuracy score: 99%. In a live demo, it failed. A different model - one that scored only 97.2% overall - performed better on the exact scenarios that mattered. Mohamed Elgendy was in the room. He understood immediately that the industry wasn't just measuring poorly; it was measuring the wrong things entirely. Kolena was the answer he built.
Elgendy grew up in Egypt, studied systems and biomedical engineering at Cairo University, then spent a few years working in that field before a different pull took hold. He relocated to the United States, worked through a series of engineering roles - Independence Blue Cross, Aspen Dental Management, Yale University - and eventually landed in the architecture of large-scale software systems. He joined Twilio, then Amazon, where something specific happened that would shape the rest of his career.
At Amazon, Elgendy didn't just build AI products. He designed and taught a deep learning for computer vision course at Amazon's Machine Learning University. He ran Amazon's computer vision think tank. He wasn't only building - he was translating the mechanics of machine learning into language that teams could actually use. That instinct for translation would later produce a book that sold 20,000 copies.
"AI lacks trust from both builders and the public. The genie isn't going back in the bottle, but we can make sure we make the right wishes."
Mohamed Elgendy - Co-Founder & CEO, Kolena

After Amazon, he went to Synapse Technology Corporation as Head of Engineering, leading the development of a proprietary threat detection platform. Synapse was acquired by Palantir. Then came Rakuten, where he served as VP of Engineering for the AI Platform, building and managing the ML infrastructure for all of Rakuten Mobile's operations. By 2020, his book - "Deep Learning for Vision Systems" (Manning Publications) - was out in the world, reaching engineers who needed the concepts explained in terms they could implement.
In 2021, Elgendy co-founded Kolena with Andrew Shi (CTO) and Gordon Hart (CPO). The founding insight was sharp: software engineering had spent decades developing rigorous testing methodologies - unit tests, regression tests, scenario coverage. Machine learning had essentially borrowed the concept of accuracy and stopped there. Kolena's pitch was to port the entire discipline of software testing into the AI development lifecycle.
"This is testing on steroids," Elgendy has said. "Not just ticking boxes, but diving deep into the nuances of model accuracy." The platform lets teams test AI models at the scenario level - not just asking whether a model is accurate overall, but how it performs on the specific slices of data that matter for the actual use case. A healthcare AI should be tested on the edge cases a clinician encounters. An autonomous vehicle model should be tested on the specific road conditions where it will operate. Aggregate metrics lie by omission.
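The arithmetic behind "aggregate metrics lie by omission" is easy to demonstrate. The sketch below is a hypothetical illustration (not Kolena's API, and the scenario names and counts are invented): a toy test set where one small but critical slice fails badly while the headline number still looks excellent.

```python
# Hypothetical toy example: aggregate accuracy hides a slice failure.
# Each scenario maps to a list of per-case outcomes (True = correct).
predictions = {
    "clear_view":    [True] * 95,                          # 95 easy cases, all correct
    "occluded_item": [True, False, False, False, False],   # 5 hard cases, only 1 correct
}

def accuracy(results):
    """Fraction of correct outcomes in a list of booleans."""
    return sum(results) / len(results)

# Headline metric pools everything together.
all_results = [r for slice_results in predictions.values() for r in slice_results]
print(f"aggregate:     {accuracy(all_results):.0%}")   # 96% - looks fine

# Scenario-level view exposes the failure.
for name, results in predictions.items():
    print(f"{name}: {accuracy(results):.0%}")          # occluded_item: 20%
```

A model scoring 96% overall can still be wrong four times out of five on the slice that matters most - exactly the gap between the 99% model and the 97.2% model in the demo Elgendy witnessed.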
Kolena's clients include Fortune 500 companies, government organizations, European AI standardization institutes, and startups in robotics, healthcare, autonomous vehicles, and banking. By September 2023, the company had raised $15M in a Series A led by Lobby Capital, with participation from SignalFire and Bloomberg Beta - bringing total funding to $21M. At the time of the raise, Kolena had 28 full-time employees.
Elgendy has articulated a framework for thinking about AI trust that divides the problem into three distinct communities: builders (who lack the testing tools and visibility to trust their own systems), buyers (who are misled by aggregate accuracy metrics that hide critical failures), and regulators (who have no pre-deployment validation frameworks to work from - what he compares to "the FDA doing away with clinical trials"). Kolena's ambition is to close all three gaps simultaneously.
Beyond the platform, Elgendy co-founded AIQCON - the AI Quality Conference - with MLOps Community. The inaugural event drew more than 10,000 attendees: AI builders, investors, regulators, and journalists, all gathered to wrestle with what responsible AI deployment actually requires. Creating a conference wasn't a marketing exercise. It was Elgendy doing in public what he has always done: turning a hard problem into a shared discipline.
He works 6 AM to 5 PM daily, a discipline he credits as foundational. He identifies trustworthiness and consistency as instrumental character traits. He credits his wife Amanda El-Dakhakhni as his primary support system. These are the kinds of details that rarely appear in TechCrunch coverage but tend to explain a lot about how someone builds something real over time.
Mohamed Elgendy's bet is that the AI industry is about to face exactly what the weapons-detection model faced: the moment when headline metrics meet the real world, and the gap becomes impossible to ignore. He is building the infrastructure to close that gap before it causes serious harm. Whether you call that product-market fit or moral clarity may depend on your vantage point. From where Elgendy stands, it's just the right problem to solve.