Tagged Content
Everything on the platform tagged with reinforcement-learning.

Nathan Lambert is a Senior Research Scientist and Post-Training Lead at the Allen Institute for AI (Ai2), where he leads open-source language model development on the OLMo and Tulu series. A UC Berkeley PhD, he previously led the RLHF team at Hugging Face, co-building the TRL library and the Zephyr model. He runs Interconnects AI, a Substack newsletter read by tens of thousands covering post-training, open models, and AI policy, and is the author of The RLHF Book (Manning Publications). With roughly 8,000 academic citations and a reputation for demystifying the hardest parts of modern AI, Lambert is one of the most trusted voices at the intersection of open-source AI research and public education.

Azalia Mirhoseini is an Iranian-born AI researcher, Stanford professor, and co-founder of Ricursive Intelligence, a frontier AI lab valued at $4 billion that uses AI to design better chips, which in turn train stronger AI. Best known for AlphaChip, the deep reinforcement learning system that now designs Google's TPUs and has compressed chip floorplanning from months to hours, she also co-invented the Mixture-of-Experts architecture underpinning GPT, Claude, and Gemini. With 20,000+ citations and a $335M-funded startup launched in under four months, she is closing the recursive loop between artificial intelligence and the hardware it runs on.