At Meta's Menlo Park campus, Nikunj Bajaj had a front-row seat to a kind of structured chaos that most AI teams never admit out loud. The infrastructure behind some of the world's most sophisticated conversational AI - the models powering Facebook Messenger's Proactive Assistant, the on-device models he helped ship directly onto hundreds of millions of phones - ran on parallel stacks. One stack for software. One for machine learning. Another for GenAI. Each team rebuilt the same wheel, accumulated the same debt, hit the same walls.
He didn't leave Meta in 2020 to build something modest. He left because he'd spent years seeing exactly what breaks when AI moves from research to production - and he had a clear theory about why most companies hit the same wall. The fragmentation wasn't a technical accident. It was an organizational one. Nobody had built the unified platform that treated ML deployment the same way software engineering had treated CI/CD.
Most enterprises were deploying parallel stacks - separate infrastructure for software, machine learning, and GenAI. That fragmentation doesn't scale. It compounds.
- Nikunj Bajaj

Before TrueFoundry, there was a detour that mattered. Bajaj co-founded EntHire, an AI-powered tech recruitment platform. It got acquired by Info Edge. A full cycle - build, scale, exit - in under a year. That compressed experience taught him things about shipping product and selling to enterprises that no ML research role ever could have.
In June 2021, he called two people he'd known since they were all first-year students in IIT Kharagpur's Class of 2013. Abhishek Choudhary had become a Senior Staff Engineer on Meta's infrastructure team. Anuraag Gutgutia had run large-scale quantitative funds as VP of Portfolio Management at WorldQuant. Three engineers. Three different views of the problem. One shared conviction that the enterprise AI market was about to demand infrastructure that didn't yet exist.
TrueFoundry's founding premise was almost contrarian in its simplicity: AI deployment should not require a team of DevOps specialists running alongside every data science team. It should deploy to Kubernetes the same way any software service deploys. The platform they built handles model serving, fine-tuning, monitoring, autoscaling, cost management, and governance - from a single interface, on cloud, on-prem, or hybrid.
Agents need flexibility to act. Enterprises need a headquarters to control them.
- Nikunj Bajaj

The seed round came fast. Sequoia India's Surge led a $2.3M raise in September 2022, backed by Naval Ravikant, Anthony Goldbloom (the founder of Kaggle), and a constellation of engineering leaders from Deutsche Bank, GitHub, and Greenhouse Software. The timing was sharp - Bajaj had predicted an inflection point in enterprise ML adoption, and ChatGPT arrived in November 2022 to confirm it loudly.
By 2025, TrueFoundry had landed contracts with some of the world's most demanding enterprise environments - Siemens Healthineers, ResMed, Automation Anywhere, Games 24x7, NVIDIA. The $19M Series A, led by Intel Capital in February 2025, brought Avi Bharadwaj onto the board. Eniac Ventures, Peak XV (formerly Sequoia Capital India & SEA), and Jump Capital participated, alongside angels including Gokul Rajaram and Mohit Aron.
But numbers only tell part of it. What actually matters is the healthcare customer story Bajaj tells in interviews - a company processing real-time prescription data that started losing revenue the moment a model went down. Their recovery process was manual. That outage became TrueFoundry's TrueFailover product: an automated system that reroutes enterprise AI traffic around model outages without requiring human intervention, while simultaneously validating that prompt quality isn't degrading in the process.
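The mechanics of that pattern are easier to see in code. Below is a minimal, hypothetical sketch of automated failover with a quality gate - ordered providers, where a degraded answer triggers the same reroute as a hard outage. Every name here (`route_with_failover`, `quality_check`, the provider list) is an illustrative assumption, not TrueFoundry's actual API or implementation.

```python
def quality_check(response: str) -> bool:
    """Crude stand-in for output validation: reject empty or visibly
    truncated answers. A real gate would also score semantics, latency,
    and whether the prompt still behaves the same on the new model."""
    return bool(response) and not response.endswith("...")

def route_with_failover(prompt, providers):
    """Try providers in priority order; fall through on exceptions
    (outages) AND on answers that fail the quality gate, so silent
    degradation reroutes just like a hard outage."""
    last_error = None
    for name, call in providers:
        try:
            answer = call(prompt)
        except Exception as exc:      # timeout, 5xx, hard outage
            last_error = exc
            continue
        if quality_check(answer):
            return name, answer
        last_error = RuntimeError(f"{name} failed quality gate")
    raise RuntimeError("all providers exhausted") from last_error

# Simulated providers: the primary is down, the secondary degrades
# silently, the tertiary answers properly.
def primary(prompt):
    raise TimeoutError("provider down")

providers = [
    ("primary", primary),
    ("secondary", lambda p: "partial answ..."),
    ("tertiary", lambda p: f"answer to: {p}"),
]

name, answer = route_with_failover("refill status for order 104?", providers)
```

The design choice worth noticing is that a quality-gate failure and an outage exhaust a provider the same way - which is exactly the claim in the healthcare story: "down" and "quietly wrong" both cost revenue.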
When you move from one model to another, you also have to consider things like output quality, latency, and whether the prompt even works the same way. In many cases, the prompt needs to be adjusted in real-time to prevent results from degrading. That is not something most teams are set up to manage manually.
- Nikunj Bajaj, Unite.AI Interview Series, 2026

In 2025, TrueFoundry's net new revenue doubled quarter-over-quarter, every quarter. The team tripled. Fortune 500 POCs moved from kickoff to production in days rather than the months that had become an industry embarrassment. By Bajaj's count, TrueFoundry compresses the average enterprise AI deployment timeline from 14 months to under four - with companies reporting positive ROI within four months of launch.
He writes about the inflection in his 2025 year-end review with the metaphor of a gravitational slingshot: "If 2024 was ignition into orbit, 2025 was the year we caught a gravitational slingshot. In every great space mission, a slingshot depends on two things: a powerful external gravity source, and enough internal thrust to actually reach it." Both conditions, he argues, now exist for enterprise AI - and TrueFoundry spent 2025 providing the thrust.
His philosophy on AI reliability sounds deceptively simple: production-ready AI must be observable, controllable, and recoverable. All three. Not two out of three. The point is more profound than it sounds - because "failures are no longer binary" in LLM systems. A model doesn't just go down. It degrades silently. It returns plausible-sounding wrong answers. It hallucinates in ways that pass automated checks but fail real users. The monitoring problem is orders of magnitude harder than it was for traditional software, and most enterprise teams are still using traditional software tooling.
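A toy illustration of the non-binary part: a liveness probe would report this service "up" the whole time, while a rolling quality score catches the drift. The class name, window size, and threshold below are invented for illustration, not any vendor's tooling.

```python
from collections import deque

class DegradationMonitor:
    """Track a rolling window of per-response quality scores and flag
    degradation when the window mean dips below a floor - the kind of
    signal a binary up/down health check can never produce."""

    def __init__(self, window: int = 5, floor: float = 0.7):
        self.scores = deque(maxlen=window)  # oldest scores fall off
        self.floor = floor

    def record(self, score: float) -> str:
        """score in [0, 1] from an eval (groundedness, format checks, etc.)."""
        self.scores.append(score)
        mean = sum(self.scores) / len(self.scores)
        return "degraded" if mean < self.floor else "healthy"

# The model never "goes down" - it just starts answering worse.
monitor = DegradationMonitor(window=3, floor=0.7)
statuses = [monitor.record(s) for s in [0.9, 0.9, 0.8, 0.5, 0.4]]
```

In this run the status flips to "degraded" only on the final score, once the rolling mean of the last three responses (0.8, 0.5, 0.4) falls below the floor - the model was "up" throughout.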
The long-term vision is stated plainly on TrueFoundry's about page: AI managing AI. Not AI as a tool, but AI as infrastructure operator - self-sustaining systems where the platform itself uses intelligent agents to optimize, scale, and recover without waiting for a human to notice something is wrong. Bajaj frames it as the natural endpoint of what enterprise IT has always wanted: infrastructure that takes care of itself.
In May 2026, he weighed in on the Portkey acquisition - a move that reshuffled the AI gateway competitive landscape. He's been writing about industry signals consistently, positioning TrueFoundry not just as a vendor but as an analyst of where the enterprise AI market is going. That combination - deep technical architecture, enterprise sales instincts from EntHire, and market commentary from a platform vantage point - is the version of the job that makes TrueFoundry harder to replicate than any individual product feature.
Nikunj Bajaj is not the loudest voice in the AI infrastructure conversation. He doesn't chase trend cycles. He built his thesis before ChatGPT made it obvious, shipped infrastructure that handles the part nobody wants to talk about, and found customers willing to pay for reliability over novelty. The plumbing is still unglamorous. That's precisely why it's worth building.