BREAKING Reflection AI raises ~$2B at $8B valuation / 15x valuation jump in 7 months / NVIDIA leads the round / Former Gemini reward-modeling lead goes solo / Open weights for the West / ~60 researchers from DeepMind & OpenAI / Physics PhD turned superintelligence CEO

Reflection AI / New York

Misha Laskin

He spent years asking how many-body quantum systems behave. Now he is asking a harder question: who gets to set the global standard of intelligence - and whether the answer stays open.

Co-Founder & CEO, Reflection AI / ex-Google DeepMind / Gemini reward modeling / Reinforcement learning

Misha Laskin. The physics never left - he just pointed it at a bigger problem.

The dispatch

In October 2025, a roughly 60-person lab that had not existed two years earlier was worth $8 billion.

The wake-up call had a name. Two, actually: DeepSeek and Qwen. When Chinese open models started topping leaderboards, Misha Laskin did not write a think-piece. He raised roughly two billion dollars and pointed Reflection AI straight at the gap - the idea that frontier intelligence might be built, and owned, somewhere far from American hands.

Reflection AI began in March 2024 as a company building autonomous coding agents. By late 2025 the public framing had widened into something closer to a national-stakes argument: an open-weight frontier lab meant to be the Western counterweight to closed labs on one side and fast-moving Chinese open models on the other. The check writers agreed. Nvidia led. Eric Schmidt, Eric Yuan, Sequoia, Lightspeed, DST and others followed.

The valuation math is almost rude. Seven months earlier the company was worth $545 million. The new round put it at $8 billion - roughly a 15x jump - before a single frontier model had shipped publicly. Investors were not buying a product. They were buying a thesis, and the two people carrying it.

One of them co-created AlphaGo. The other - the one doing most of the talking - used to calculate how quantum particles interact.

"If we don't do anything about it, the global standard of intelligence will be built by someone else. It won't be built by America."

The unlikely arc

Yale physics. And literature. Then quantum theory. Then inventory forecasting. Then AlphaGo broke his plan.

At Yale he refused to choose, studying physics and literature side by side - the equations and the stories. The PhD at the University of Chicago narrowed things considerably: theoretical many-body quantum physics, the study of how enormous numbers of particles behave when you can no longer track them one at a time. It is, in hindsight, an oddly perfect apprenticeship for someone who would later wrangle models trained on tens of trillions of tokens.

The first detour was entrepreneurial. Before AI claimed him, Laskin founded a Y Combinator-backed startup built around inventory prediction - the unglamorous art of guessing what a business will need before it knows itself. Useful. Profitable-adjacent. Not the thing that would keep him up at night.

AlphaGo was. When DeepMind's program beat Lee Sedol, one of the world's top Go players, in 2016, plenty of people called it impressive. Laskin called it a career change. He went to UC Berkeley as a postdoc to work on reinforcement learning - the branch of AI where systems learn by doing rather than by being told - and produced research on making agents learn efficiently from raw pixels.

From Berkeley the path ran straight into Google DeepMind, and into the center of the most important model the company would ship. Laskin led reward modeling for Gemini: the machinery that teaches a model what "good" looks like. Reward modeling is the quiet lever behind every well-behaved answer a large model gives. It is also, not coincidentally, exactly the expertise you would want if you planned to build frontier models from scratch.

A physicist learns to find the one principle that explains the mess. Laskin keeps betting that reinforcement learning at scale is that principle for intelligence.

What Reflection is actually doing

Open weights, closed kitchen.

The model

Open by design

Reflection plans to release model weights publicly for researchers to use freely, while keeping its datasets and full training pipelines proprietary. Open enough to matter, guarded enough to fund.

The business

Enterprise & sovereign

Revenue is meant to come from large enterprises building on the models and from governments developing "sovereign AI" - their own national systems rather than rented foreign ones.

The method

RL at scale

The founding conviction: reinforcement learning at scale unlocks the next frontier of capability. The company's first frontier language model is slated to train on tens of trillions of tokens.

Career, in moves

The line from particles to parameters.

Yale

Physics and literature

Undergraduate study at Yale - equations and narrative, refusing to pick one.

U. Chicago

PhD, quantum theory

Doctorate in theoretical many-body quantum physics.

Pre-AI

YC-backed founder

Built a startup around inventory prediction before pivoting to research.

Berkeley

RL postdoc

Joined UC Berkeley after AlphaGo, working on reinforcement learning.

DeepMind

Gemini reward modeling

Led reward modeling for Google DeepMind's flagship Gemini model.

Mar 2024

Reflection AI founded

Co-founded with Ioannis Antonoglou, co-creator of AlphaGo, AlphaZero and MuZero.

2025

~$2B at $8B

Raised roughly $2 billion, led by Nvidia, to build open frontier models.

The other founder

You don't build superintelligence alone. You build it with the person who built AlphaGo.

Laskin's co-founder is Ioannis Antonoglou, a name that runs through the most celebrated milestones in modern AI: AlphaGo, AlphaZero, MuZero - the systems that learned to master games no human had ever fully solved. If Laskin brings the reward-modeling and physics instinct, Antonoglou brings a track record of teaching machines to plan and to win.

Around them sits a team of roughly 60 researchers and engineers, many recruited out of DeepMind and OpenAI, split across infrastructure, data, training and algorithms. It is small for the ambition - deliberately so. The pitch is not headcount. It is a bet that a tight group of people who have already shipped frontier systems can do it again, in the open.

There is a second twist worth noting: Laskin is also a partner at Sequoia Capital, which he joined in 2024. He sits, unusually, on both sides of the table - the founder raising and the investor backing. It is a rare vantage point in an industry where most people only ever see one.

Things that stick

The footnotes that explain him.

Two majors: Studied physics and literature at Yale before committing to quantum theory.

Trigger: AlphaGo's 2016 win was the moment he left forecasting for AI research.

Co-founder pedigree: Ioannis Antonoglou helped build AlphaGo, AlphaZero and MuZero.

Velocity: Reflection's valuation jumped about 15x in roughly seven months.

Both sides: He is a startup CEO and a Sequoia Capital partner at the same time.

Base camp: Reflection AI is headquartered in New York, at 124 E 14th St.