Tagged Content
Everything on the platform tagged with rlhf.
Wow AI is a US-registered, end-to-end AI training-data company that supplies high-quality, multilingual datasets and human-in-the-loop services - data collection, transcription, annotation, validation and RLHF - to teams building large language models, voice assistants and computer-vision systems. Its crowdsourcing network spans 170,000+ contributors across 120+ languages, with 100,000+ hours of audio and domain datasets for finance, healthcare and retail. The founding team has since extended the vision into AIxBlock, a decentralized platform for building, training and deploying AI, and WowDAO, a community-owned AI ecosystem.
Centific is a Redmond, Washington-based AI and data company that calls itself the hidden infrastructure behind world-class AI models. Its AI Data Foundry combines a global network of 1.8 million-plus domain experts with platforms for data collection, annotation, RLHF, model evaluation, governance and multimodal orchestration, helping enterprises and frontier model labs move AI from experimentation to production. Founded in 2020 out of the former Pactera EDGE business, Centific is a recognized NVIDIA innovation partner and raised a $60M Series A led by Granite Asia in June 2025.
AfterQuery is a San Francisco applied research lab that builds expert-generated datasets, benchmarks, and reinforcement-learning environments for the world's leading AI labs. The company recruits nearly 100,000 vetted domain experts - in finance, law, medicine, software, and beyond - to teach frontier models how specialists actually think.
Deccan AI is a Mountain View-based GenAI data company that runs the post-training and production layer for frontier labs and enterprises. It builds high-precision SFT and RLHF datasets, reinforcement learning environments and agentic evaluation pipelines using an elite expert network and a purpose-built quality platform.
Labelbox is a San Francisco-based AI data factory that helps frontier AI labs and enterprises generate, label, and evaluate the high-quality training data their models need. Its platform combines annotation tools, model-assisted automation, and a global expert network (Alignerr) to power post-training, RLHF, and multimodal reasoning workloads.
Mercor is an AI-powered talent marketplace that recruits domain experts - doctors, lawyers, engineers, scientists - to train and evaluate frontier AI models for labs like OpenAI, Anthropic, Google, Meta, and Amazon. Founded in 2023 by three Thiel Fellows, it has grown from a hiring tool into the human data backbone of the AI economy.
Rukesh Reddy is the Founder and CEO of Deccan AI, a Mountain View-based AI data and post-training company that raised a $25M Series A in March 2026 led by A91 Partners with participation from Susquehanna International Group and Prosus Ventures. Built as a 'born GenAI' company in October 2024, Deccan AI serves frontier AI labs and major tech companies - including Google DeepMind and Snowflake - with high-precision training datasets, reinforcement learning environments, and enterprise evaluation suites. Reddy brings over 15 years of experience spanning finance, strategy consulting, and digital transformation at firms including J.P. Morgan, Monitor Group (now Monitor Deloitte), and Citi, where he led CX and digital transformation for the global retail bank.
Tim Shi is the co-founder and former CTO of Cresta, the AI platform for contact centers that grew to over $100M ARR and raised $401M from Sequoia, a16z, and Greylock. A Tsinghua CS graduate who did early AI research at OpenAI alongside Andrej Karpathy — including the 'World of Bits' paper on web-based reinforcement learning agents — he co-founded Cresta in 2017 with Zayd Enam after both dropped out of Stanford's AI PhD program. In 2025, he co-founded Recursive Superintelligence, which emerged from stealth with $650M at a $4.65B valuation to build self-improving AI systems.

Surge AI is a San Francisco data annotation company that produces high-quality human-labeled datasets and RLHF feedback for the world's leading AI labs - OpenAI, Google, Anthropic, Meta and Microsoft among them. Founded in 2020 by ex-Twitter and ex-Facebook ML engineer Edwin Chen, it bootstrapped to roughly $1.2B in annual revenue with around 110 employees and a labeler network reported near one million.
Turing is an AGI infrastructure company, valued at $2.2 billion, that bridges the global talent gap in AI development. Founded in 2018 by Stanford AI alumni Jonathan Siddharth and Vijay Krishnan, Turing operates across two core missions: supplying frontier AI labs like OpenAI and Anthropic with expert human talent for coding, STEM, and reasoning data; and building production-grade AI systems for Fortune 500 enterprises. With a network spanning 4 million developers across 140+ countries, a fine-tuning platform called ALAN, and $334 million raised in total funding, Turing has grown into one of the fastest-scaling companies at the intersection of talent and artificial intelligence.

Radha Ramaswami Basu is the founder and CEO of iMerit Technology, a leading AI data annotation and services company that employs over 7,400 people worldwide - 52% of whom are women, and 80% from underserved communities. With 40+ years in tech spanning Hewlett-Packard (where she built the India software center into a $1.2B operation), Support.com (which she took public on NASDAQ), and now iMerit, she sits at the intersection of AI infrastructure and social impact. She co-founded the Anudip Foundation in 2005, which trained 120,000+ youth from low-income households - and it was those same graduates who became iMerit's first annotators. Today iMerit powers AI pipelines for autonomous vehicles, healthcare imaging, and generative AI at scale.

Eric Zhang is the Chief Executive Officer of Thoth AI, a Singapore-headquartered global AI data solutions company with R&D operations in Silicon Valley. Under his leadership, Thoth AI powers frontier AI models for some of the world's leading AI labs by providing high-quality training data, RLHF workflows, model evaluation, and multilingual customer experience services across 170+ countries in 200+ languages. Zhang operates at the intersection of AI safety, responsible deployment, and global scale - building the human infrastructure that makes AI smarter, safer, and culturally aware.
Edwin Chen is the Founder & CEO of Surge AI, the AI data infrastructure company that became Anthropic and Google's secret weapon for model training and evaluation. A former ML scientist at Google, Twitter, Dropbox, and Facebook, Chen bootstrapped Surge AI from his San Francisco apartment in 2020 to over $1.2 billion in annual revenue with fewer than 110 employees - no venture capital, no sales team. TIME named him one of the 100 Most Influential People in AI in 2025, and Forbes named him to its Forbes 400 list as one of the youngest billionaires. Surge's platform powers RLHF, supervised fine-tuning, and custom evaluations for the world's leading AI labs.

Pankaj Gupta is a serial entrepreneur and seasoned tech executive who built Twitter's early recommendation engine (Who-to-Follow, MagicRecs), led Google Pay's engineering across India and globally, scaled Coinbase's India operations from zero, and co-founded Yupp — a crypto-incentivized AI model evaluation platform that raised $33M from a16z before shutting down in March 2026. A Stanford PhD and IIT Delhi alumnus, he has founded four startups, three of which were acquired.

Alexandr Wang is the co-founder of Scale AI and current Chief AI Officer at Meta Platforms. Born in 1997 to Chinese immigrant physicists in Los Alamos, New Mexico, he dropped out of MIT at 19 to build Scale AI, which became the backbone of AI training data infrastructure for companies like OpenAI, Google, Meta, and the U.S. Department of Defense. By 24, he was the world's youngest self-made billionaire. Scale AI grew to a ~$29B valuation after Meta's $14.8B strategic investment in 2025, and Wang now leads Meta's AI superintelligence efforts.

Dario Amodei is the co-founder and CEO of Anthropic, the AI safety company behind Claude. A former VP of Research at OpenAI and co-inventor of Reinforcement Learning from Human Feedback (RLHF), he holds a PhD in biophysics from Princeton and has co-authored some of the most cited papers in AI safety and scaling laws. He leads Anthropic — valued at $380 billion as of early 2026 — with the conviction that building safe, interpretable AI is not only compatible with building powerful AI, but inseparable from it. His landmark essay 'Machines of Loving Grace' envisions AI compressing decades of scientific progress into years, potentially eliminating most disease and radically expanding human prosperity.

Wojciech Zaremba is a Polish-American AI researcher and co-founder of OpenAI, the company behind ChatGPT and GPT-4. An International Mathematical Olympiad silver medalist from Kluczbork, Poland, he holds dual master's degrees from the University of Warsaw and École Polytechnique, and a PhD from NYU under Yann LeCun. At OpenAI he has led robotics (including the Dactyl hand that solved a Rubik's Cube one-handed), the Codex project powering GitHub Copilot, and the RLHF infrastructure that shaped ChatGPT's alignment. One of the few original co-founders still at the company after a decade, he is regarded as one of the most technically consequential and quietly influential figures in modern AI.

Cameron R. Wolfe, Ph.D., is a Senior Research Scientist on Netflix's Globalization team and the author of Deep (Learning) Focus, a twice-weekly Substack newsletter with 60,000+ subscribers that translates cutting-edge ML research into approachable long-form essays. A Rice University computer science PhD, he has built a reputation as one of the clearest explainers of large language models, RLHF, and AI agents in the field, bridging academia and industry with methodical depth and intellectual generosity.

Nathan Lambert is a Senior Research Scientist and Post-Training Lead at the Allen Institute for AI (Ai2), where he leads open-source language model development on the OLMo and Tulu series. A UC Berkeley PhD, he previously led the RLHF team at Hugging Face, co-building the TRL library and the Zephyr model. He runs Interconnects AI, a Substack newsletter read by tens of thousands covering post-training, open models, and AI policy, and is the author of The RLHF Book (Manning Publications). With roughly 8,000 academic citations and a reputation for demystifying the hardest parts of modern AI, Lambert is one of the most trusted voices at the intersection of open-source AI research and public education.

Predibase was a San Francisco-based AI infrastructure company (founded 2020, acquired by Rubrik in June 2025) that pioneered efficient LLM fine-tuning and serving at scale. Built by the creators of Uber AI's Ludwig and Horovod frameworks, Predibase made it easy for enterprises to fine-tune and deploy open-source LLMs using LoRA adapters — often outperforming GPT-4 on domain-specific tasks for under $8 of compute. Its open-source LoRAX inference server enabled serving thousands of fine-tuned models from a single GPU, dramatically cutting costs. After raising $28M from Greylock and Felicis, Predibase was acquired by cybersecurity firm Rubrik for over $100M to accelerate agentic AI adoption.
Scale AI is a San Francisco-based AI infrastructure company founded in 2016 by Alexandr Wang and Lucy Guo. It provides the data engine, evaluation tools, and AI deployment platforms that power the world's leading AI labs, Fortune 500 enterprises, and US government agencies. By combining a massive distributed workforce with proprietary tooling, Scale accelerates AI development through high-quality data labeling, RLHF, model evaluation, and agentic platforms — making it one of the most consequential picks-and-shovels companies in the modern AI boom, with a $29B valuation as of mid-2025.