The Long Setup
Ivan Lee almost finished his Stanford Master's in Computer Science. Almost. A few credits from the finish line, he co-founded Loki Studios with two friends and built Geomon - a mobile game where players captured location-based creatures using their phones' GPS. The year was 2011. Pokémon Go wouldn't launch until 2016. Geomon reached a million users and turned profitable before Yahoo came knocking in May 2013.
Left M.S. Short
Dropped out a few credits short of completing his CS Master's to co-found Loki Studios. Its game, Geomon, hit a million users.
Inside the Machine
PM at Yahoo (acqui-hire), then PM at Apple working on AI products - setting content policy for hundreds of millions of daily users.
Private AI Platform
Founded Datasaur after noticing enterprises couldn't safely own their own AI training data. YC W20 followed months later.
The Yahoo chapter was genuinely formative. Ivan participated in Yahoo's inaugural Associate Product Manager program, rebuilt mobile search using AI, won the Yahoo Excellence Award, and co-authored two patents for mobile UI innovations. Then Apple. Then a realization.
One thing that surprised me is that there are many product decisions that AI product managers make in a day's work that would usually require months or even years to legislate. I had to determine what types of content were 'family appropriate' for a product with hundreds of millions of daily users. This is, to be honest, too much power in the hands of an individual.
- Ivan Lee, on his time at Apple
That discomfort planted the seed. By February 2019, Ivan had founded Datasaur. By winter 2020, he was inside Y Combinator. By September that year, Initialized Capital had led a $2.8M seed round with Greg Brockman - then President of OpenAI - investing personally. Few vote-of-confidence moments land harder than that one.
What Datasaur Actually Does
The elevator pitch is clean: Datasaur builds the infrastructure for companies to create and train their own AI models without handing their data to a third party. In practice, that means two core products.
In July 2024, Datasaur integrated LLM Labs with Amazon Bedrock, enabling clients to run side-by-side comparisons of proprietary and open-source models - and then migrate to the cheaper option. The result: AI project cost reductions of up to 70%. The logic is straightforward. Once you control your own training pipeline, you can switch models. When you're locked into someone else's platform, you can't.
Clients span an unusual range: Google, Netflix, Spotify, Zoom, and Qualtrics on the commercial side; the FBI and US Government on the security-clearance end. The common thread is complexity. Datasaur's annotation system handles hierarchical labeling, relational entity annotation, OCR annotation, multi-language support, and financial data labeling - the kinds of workflows that off-the-shelf tools can't touch.
Three Seed Rounds, No Series A
There's a quiet statement in Datasaur's capital structure. Initialized Capital has led three consecutive rounds without Ivan taking a Series A. That's either unusual discipline or unusual confidence in the growth trajectory - most likely both.
Funding Rounds
The August 2023 round - headlined as "Datasaur Raises $4M to Help Every Company Train Its Own ChatGPT" - landed just as the LLM conversation was reaching peak volume. Gold House Ventures, HNVR, and TenOneTen joined Initialized. The round's framing was deliberate: positioning Datasaur not as a labeling tool but as the private AI infrastructure layer for the enterprise.
The Thesis Behind All of It
Datasaur's stated mission is direct: "The most powerful AI is the one you fully control." It's not marketing language. It's the organizing principle that came from Ivan's Apple years, watching a single product team make content decisions affecting hundreds of millions of people - decisions that would take democratic institutions years to legislate.
The insight that shaped Datasaur isn't that NLP data labeling was underserved (though it was). It's that the moment a company's training data leaves its servers, it has surrendered control over what its AI learns. Private LLMs - models trained entirely on proprietary data within secure environments - are the logical endpoint. Ivan built the infrastructure to make that possible at enterprise scale.
Sales was something that was this scary concept to me. I had to find my own brand of sales, my own tone in how I would approach that. It's important to really focus on the user story - what pain point are they trying to solve, and then go and solve that. They're not buying your technology for how cool your technology is, they're buying it because it's going to save them time, or increase their revenue x-fold.
- Ivan Lee, on learning to sell
The learning curve was real. Ivan is candid about the gap between building a product and running a company. "I wish I had done more," he's said about startup preparation. "There's not much more reading or tutorial-watching you can do to prepare yourself for startup life. I made many, many mistakes along the way." That self-awareness - earned, not performed - runs through how Datasaur operates.
The Gold House Founder Network recognized Ivan as part of a cohort of leading Asian American entrepreneurs. The community lens matters: Datasaur's approach to enterprise AI - privacy-first, security-certified, suitable for government clearance and healthcare regulation - reflects a seriousness about building for trust, not just traction.