Data Labeling | YesPress

Founder · Engineer · Executive

Henry Ehrenberg

Henry Ehrenberg is Co-Founder and Head of Engineering at Snorkel AI, the data-centric AI company he helped build out of Stanford's AI Lab in 2019. With a background in applied mathematics (Yale) and computational engineering (Stanford), Ehrenberg co-developed the Snorkel system, a framework for training machine learning models using programmatic weak supervision rather than hand-labeled data. Snorkel AI has raised $338M total, including a $100M Series D in May 2025 at a $1.3B valuation, and counts five of the top ten US banks, Fortune 500 companies, and leading research labs like Google, OpenAI, and Anthropic among its clients.

Founder · Executive · Operator

Manu Sharma

Manu Sharma is the CEO and Co-founder of Labelbox, the leading AI data infrastructure platform that powers training data pipelines for frontier AI models. Born in Roorkee, India, he studied aerospace engineering at Embry-Riddle and Stanford before building CoPilot at DroneDeploy and leading data analytics at Planet Labs. In 2018, he co-founded Labelbox with former colleagues Brian Rieger and Daniel Rasmuson, raising $188.9M through a Series D led by SoftBank Vision Fund 2. Labelbox now serves as the command center for AI teams building and scaling production machine learning systems, from annotation and RLHF to evals and synthetic data generation.

nlp · data-labeling

Company

AI · SaaS · Enterprise

Datasaur

Datasaur builds secure, private AI infrastructure for regulated enterprises - starting with one of the most loved NLP data labeling platforms and evolving into LLM Labs, a workbench for building custom, on-prem ChatGPT-style assistants on a company's own data.

Founder · Executive · Operator

Radha Basu

Radha Ramaswami Basu is the founder and CEO of iMerit Technology, a leading AI data annotation and services company that employs over 7,400 people worldwide - 52% of whom are women, and 80% from underserved communities. With 40+ years in tech spanning Hewlett-Packard (where she built the India software center into a $1.2B operation), Support.com (which she took public on NASDAQ), and now iMerit, she sits at the intersection of AI infrastructure and social impact. She co-founded the Anudip Foundation in 2005, which trained 120,000+ youth from low-income households - and it was those same graduates who became iMerit's first annotators. Today iMerit powers AI pipelines for autonomous vehicles, healthcare imaging, and generative AI at scale.

ai · data-labeling

Alexander Ratner

Alexander Ratner is the co-founder and CEO of Snorkel AI, the company he spun out of Stanford's AI Lab in 2019 after building the Snorkel open-source project during his PhD. A Harvard physics graduate turned Stanford computer scientist, Ratner pioneered the field of data-centric AI and weak supervision - the idea that better data, not just better models, is the key to enterprise AI. Under his leadership, Snorkel AI reached a $1.3 billion valuation in 2025 following a $100 million Series D, with $148M in annual revenue and customers including some of the world's largest enterprises and LLM developers.

Founder · Engineer · Executive

Edwin Chen

Edwin Chen is the Founder & CEO of Surge AI, the AI data infrastructure company that became Anthropic and Google's secret weapon for model training and evaluation. A former ML scientist at Google, Twitter, Dropbox, and Facebook, Chen bootstrapped Surge AI from his San Francisco apartment in 2020 to over $1.2 billion in annual revenue with fewer than 110 employees - no venture capital, no sales team. TIME named him one of the 100 Most Influential People in AI in 2025, and Forbes put him on the Forbes 400 list as one of the youngest billionaires. Surge's platform powers RLHF, supervised fine-tuning, and custom evaluations for the world's leading AI labs.

Ivan Lee

Ivan Lee is the Founder and CEO of Datasaur, an NLP data labeling and private LLM development platform based in Silicon Valley. A Stanford-trained engineer and serial entrepreneur, Ivan previously co-founded Loki Studios (acquired by Yahoo in 2013) and worked as a product manager at both Yahoo and Apple before launching Datasaur in 2019. Under his leadership, Datasaur has raised $9.2M across multiple seed rounds backed by Initialized Capital, Y Combinator, and Greg Brockman of OpenAI, serving enterprise clients including Google, Netflix, Spotify, the FBI, and Zoom with tools that improve labeling efficiency by up to 9.6x and enable companies to build their own private AI models.

ai · nlp

ai · computer-vision

Kevin Guo

Kevin Guo is the Co-Founder and CEO of Hive, a San Francisco-based enterprise AI company that built what may be the world's largest human-labeled training dataset - over 1 billion labeled items - and deployed those models as cloud APIs used by Reddit, BeReal, and more than 15 of the top social platforms for content moderation. A Stanford triple-degree graduate (biology BA, mathematical/computational sciences BS, computer science MS), Guo co-founded Kiwi - a social Q&A app that grew to 100 million users - before pivoting that team into Hive in 2017. Hive reached unicorn status with a $2 billion valuation after its $85 million Series D in April 2021, and Guo has become one of the most prominent voices on AI-generated content detection and the fight against deepfakes.