The man who taught robots to solve Rubik's cubes - and AI teams to actually ship.
Researcher. Founder. Educator. Josh Tobin spent three years inside OpenAI, earned a PhD from UC Berkeley, raised $28M to solve the hardest problem in production ML, and built the course that taught a generation of engineers how to deploy models that don't fall apart the moment they meet real users.
Josh Tobin - Researcher, Founder, Educator
Josh Tobin does not fit one box. He was a management consultant at McKinsey before he was an AI researcher. He was an AI researcher at OpenAI before he was a founder. He was a founder before he was an educator. He is, at every stage, someone who showed up to the hardest version of a problem and did something useful with it.
Today, Tobin's name is most closely associated with the gap between AI research and production reality - the messy, expensive, often-ignored middle ground where models trained in comfortable labs meet users, data drift, and edge cases that no lab ever prepared them for. His newsletter, his teaching, and his startup all orbit the same problem: most ML teams can't see what's happening to their models once they ship. That invisibility is not an inconvenience. It's a product crisis.
The trajectory began at UC Berkeley, where Tobin did his PhD under Pieter Abbeel - one of the most respected names in robot learning. His dissertation, "Real-World Robotic Perception and Control Using Synthetic Data," was not an abstract exercise. It was a direct attack on a fundamental challenge: how do you train a neural network in simulation and trust it to work in the real world, where lighting is different, objects wobble, and the physics engine wasn't tuned by a committee?
His answer was domain randomization. Train the model in dozens, then hundreds, of randomized simulated environments - vary the textures, the lighting, the masses, the friction. Force the network to learn features that survive variation. Then deploy it into reality. The 2017 paper introducing this technique has now been cited over 600 times. It changed how robotics teams think about perception.
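The idea can be sketched in a few lines. This is a minimal illustration of the randomization loop, not the paper's implementation; every parameter name and range below is invented for illustration.

```python
import random

def sample_randomized_env():
    """Return one randomized simulator configuration.

    Each training episode draws fresh visual and physical parameters,
    so the policy cannot overfit to any single rendering of the world.
    (All names and ranges here are illustrative, not from the paper.)
    """
    return {
        # Visual randomization: textures, lighting, camera pose
        "texture_id": random.randrange(1000),
        "light_intensity": random.uniform(0.2, 2.0),
        "camera_jitter_deg": random.uniform(-5.0, 5.0),
        # Physics randomization: masses, friction
        "object_mass_kg": random.uniform(0.05, 0.5),
        "friction_coeff": random.uniform(0.5, 1.5),
    }

def train(run_episode, episodes=100):
    """Train across many randomized environments: one fresh
    configuration per episode, passed to the training step."""
    for _ in range(episodes):
        run_episode(sample_randomized_env())
```

The point is structural: because no two episodes look alike, the only features worth learning are the ones that survive variation, and the real world becomes just one more sample from the distribution.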
Probably the number one lesson of machine learning is that you get what you optimize for. If you can set up the system to optimize directly for the outcome you're looking for, the results are going to be much, much better.
- Josh Tobin

While finishing his PhD, Tobin was simultaneously a research scientist at OpenAI - a three-year overlap that put him at the center of some of the most technically ambitious robotics work anywhere in the world. The most visible output was the dexterous manipulation project: a robot hand trained in simulation that learned to solve a Rubik's cube. The video circulated across the ML world in 2019. Behind it was a pile of reinforcement learning, generative modeling, and the domain randomization techniques Tobin had been developing.
By 2020, Tobin had a new problem to solve. He and Vicki Cheung - who had headed infrastructure at OpenAI and was a founding engineer at Duolingo - co-founded Gantry. The premise was straightforward and underaddressed: most teams had no visibility into what their deployed models were doing. They couldn't tell if a model was drifting, returning garbage on edge cases, or silently degrading as the world changed around it. Gantry was built to close that gap - instrumenting ML applications, visualizing performance, and giving teams the tools to decide when and how to retrain.
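The kind of check such a tool performs can be sketched simply: compare a model input's production distribution against its training distribution and flag when they diverge. The snippet below uses a generic z-score heuristic for illustration; it is not Gantry's actual method or API.

```python
import statistics

def drift_score(train_values, prod_values):
    """How many training standard deviations the production mean
    has moved away from the training mean. A crude but readable
    stand-in for real drift statistics (PSI, KS tests, etc.)."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    if sigma == 0:
        return 0.0
    return abs(statistics.mean(prod_values) - mu) / sigma

def is_drifting(train_values, prod_values, threshold=3.0):
    """Flag a feature whose production distribution has shifted
    more than `threshold` training standard deviations."""
    return drift_score(train_values, prod_values) > threshold
```

Run per feature on a schedule, a check like this turns "the model is silently degrading" into an alert a team can act on - the visibility gap Gantry set out to close.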
The company raised a $4.4M seed round and a $23.9M Series A led by Amplify Partners and Coatue. Greg Brockman, co-founder of OpenAI, was among the notable angels. Continual learning was the central concept: the idea that a model trained offline is not a finished product, it's a starting point. The world moves. Users change. Models need to keep up.
The quality of the data that you put into the model is probably the biggest determining factor in the quality of the model you get on the other side.
- Josh Tobin

Parallel to his research and company work, Tobin built something rarer: a genuinely useful educational program. Full Stack Deep Learning started as a course at UC Berkeley and became the first focused treatment of production ML engineering - the unglamorous discipline of getting models from notebooks into systems that run, monitor themselves, and don't embarrass you at 2am. The 2022 iteration refocused on building ML-powered products and included a bootcamp for large language models before LLM bootcamps were a genre.
There's a pattern here. Tobin consistently shows up a step ahead of the problem. Domain randomization before sim-to-real was mainstream. ML monitoring before MLOps was a job title. LLM engineering before every conference had a track for it. This is not a coincidence. It is the product of a particular kind of mind: someone who consults before they research, who researches before they build, who builds before they teach, and who teaches because they believe the field moves faster when more people understand the fundamentals.
His pre-PhD stint at McKinsey - unusual for a deep learning researcher - shows up in how he frames problems. Not "what is technically possible" but "what does the team actually need to make a decision." Not "can we train a better model" but "what does the organization need to trust it." That business-aware framing runs through his writing, his courses, and the product choices at Gantry.
The Instagram handle @joshingtobin - a pun on the slang for joking - and his association with the Upright Citizens Brigade comedy community suggest that behind the papers and the pitch decks, there's someone who takes the work seriously without taking himself too seriously. His newsletter, focused on ML infrastructure and ops, has found an audience among engineers who are done with theory and want to know what actually breaks in production.
What Tobin has built, across a career that defies neat categorization, is a set of tools - technical, educational, organizational - for the space between cutting-edge research and functioning systems. That space is hard. Most people who know how to live in it don't explain it well. Most people who explain it well haven't lived in it. Tobin has done both.
Models trained offline won't perform well for long in production because user behavior and world context changes continuously.
- Josh Tobin on Continual Learning

Train in chaos, deploy in reality. Tobin's 2017 technique randomizes simulated environments - textures, lighting, physics - to force neural networks to learn features that survive the messiness of the real world. 600+ citations later, it's standard in robotics research.
A model trained offline is not a finished product - it's a starting point. The world moves. Users change. Data drifts. Gantry was built on the premise that production AI needs infrastructure for ongoing improvement, not just one-time deployment.
Full Stack Deep Learning emerged from a simple observation: there was no serious curriculum for the work that happens after the model trains. Deploying, monitoring, debugging, retraining - the discipline that makes AI products function at scale.
Before architecture choices, before hyperparameter tuning, before any of the glamorous decisions - data quality is the biggest determinant of model quality. Tobin returns to this principle across his research, teaching, and products.
You get what you optimize for. The cardinal sin of production ML is optimizing for a proxy metric while the actual outcome drifts. Tobin's work keeps returning to the challenge of building systems that optimize for what you actually care about.
The dream of robotics: train cheaply in simulation, deploy in reality. Tobin made it work for deep neural networks. The insight was to make simulation deliberately imperfect - variable enough that the real world doesn't come as a surprise.
His Instagram handle is @joshingtobin - a pun on "just joshing" (joking). One of the more self-aware social handles in ML research.
Before becoming one of the most-cited names in sim-to-real robotics, he was a management consultant at McKinsey. The pivot is rare - and it shows in how practically he frames technical problems.
He helped engineer a robot hand that solved a Rubik's cube - a project so technically demanding it became a landmark demo for what reinforcement learning trained in simulation could achieve in the physical world.
Gantry's co-founder Vicki Cheung was both a founding engineer at Duolingo and the head of infrastructure at OpenAI. The founding team was arguably the strongest ML infrastructure pair anywhere at the time.
Full Stack Deep Learning launched a large language model bootcamp before "LLM bootcamp" was a recognizable category. Tobin's timing - in research, products, and education - consistently runs slightly ahead of the field.
Associated with the Upright Citizens Brigade comedy community - which may explain why his educational content is unusually clear and human compared to most technical teaching in the ML space.