He's Already Inside Your AI
Every time you use ChatGPT, Gemini, or Claude, you run Dan Hendrycks' math. Not metaphorically. Literally. GELU - the Gaussian Error Linear Unit, the activation function he published as a 21-year-old undergraduate at the University of Chicago in 2016 - is built into BERT, GPT, Vision Transformers, and virtually every major language model that followed. Billions of computations per second route through an equation a college kid wrote before anyone outside Google Brain had heard of transformers.
He also built the test that decides whether those models are actually intelligent. The MMLU benchmark - Massive Multitask Language Understanding - 15,908 questions across 57 subjects, from abstract algebra to US foreign policy - became the world's standard for measuring AI reasoning. When labs want to show their new model is smarter than last quarter's, they run MMLU. When politicians debate AI capabilities in Senate hearings, they cite MMLU scores. It's been downloaded over 100 million times. He wrote it during his PhD.
Now he runs the Center for AI Safety from San Francisco, advises Elon Musk's xAI and Alexandr Wang's Scale AI (at $1/year each, deliberately), and has organized more than 600 AI scientists into a unified statement that AI poses risks comparable to pandemics and nuclear weapons. He is 30 years old. He does not drink coffee.
What's at stake isn't merely economic competitiveness but perhaps the most geopolitically precarious technology since the atomic bomb.
- Dan Hendrycks, Time Magazine, March 2025

Marshfield, Missouri. Population: 7,000.
Not Silicon Valley. Not Boston. Not even a suburb. Marshfield, Missouri, is a small town on Route 66, better known as the birthplace of Edwin Hubble (the astronomer who proved the universe expands). Hendrycks grew up there in an evangelical household - church-going, literal-scripture, low-income, and deeply moral. His father was, in his own words, "a disagreeable sort," and Hendrycks inherited the trait.
As a teenager, he started reading about evolution. He brought it up with his church elders. That did not go well. "I was able to voluntarily rewrite my belief system that I inherited from my low socioeconomic status, anti-gay, and highly religious upbringing," he said later on Twitter, without apparent resentment. He lost the theology but kept something essential: the evangelical framework where civilizational-scale catastrophe is not only possible but demands a moral response.
He shed evangelicalism. He did not shed the existential seriousness. That's the key to understanding everything he does next.
A Boston Globe journalist, interviewing Hendrycks in 2023, described him as "shifting in his chair and looking up at the ceiling and speaking very quickly." The reporter tried to pin it down: "You could call it nervous energy, but that's not quite it. He doesn't seem nervous, really. He just has a lot to say."
That journalist was trying to explain someone who reviews arXiv nightly, takes long walks while thinking through research directions, once used hypnagogia - the half-asleep creative state - for insight, and blocks every distracting website with an app called Cold Turkey while he works. A lot to say, and rigorous systems for creating the space to say it well.
From Activation Functions to Existential Risk
At the University of Chicago, he channeled his moral seriousness toward a new kind of harm: AI systems that could hurt people at scale. He published GELU in 2016. He arrived at UC Berkeley for his PhD in 2018, advised by Dawn Song and Jacob Steinhardt, supported by an NSF Graduate Research Fellowship and an Open Philanthropy AI Fellowship. He waited three years to get access to ImageNet-scale compute. He published anyway.
The PhD years produced an unusual sequence. Out-of-distribution detection baselines (2017). Corruption-robustness benchmarks at ICLR 2019. AugMix during a DeepMind internship (2020). And MMLU (2020) - which started as a way to show that language models, for all their fluency, were genuinely bad at structured knowledge. GPT-3 scored 43% on MMLU when it launched. Human expert performance sits around 89%. The gap was the story, and the story drove years of AI development.
He finished his PhD in 2022 and immediately founded the Center for AI Safety. Not a lab. Not a startup. A nonprofit aimed at exactly the thing he'd been worried about since Missouri: the possibility that advanced AI could cause catastrophic, irreversible harm, and nobody was treating it with commensurate seriousness.
The Arc
The Math That Ate the World
GELU stands for Gaussian Error Linear Unit. It's a way to decide how strongly each neuron in a neural network should "activate" given an input - a gating function that multiplies each value by the probability, under a Gaussian distribution, that a random draw falls below it. It sounds esoteric. It turned out to outperform the standard alternatives, ReLU and ELU, across a remarkable range of tasks. When Google published BERT in 2018, they used GELU. When OpenAI trained GPT, they used GELU. When every lab that came after them trained their transformer, they defaulted to GELU. Hendrycks published it at 21, and it became infrastructure.
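The whole idea fits in a few lines of Python. Here's a minimal sketch of the exact form and the tanh approximation given in the original paper (the function names are illustrative, not from any library):

```python
import math

def gelu(x: float) -> float:
    """Exact GELU: the input scaled by the standard normal CDF, Phi(x).

    Phi(x) is the probability that a Gaussian draw falls below x, so
    strongly negative inputs are suppressed and large ones pass through.
    """
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * phi

def gelu_tanh(x: float) -> float:
    """The fast tanh approximation from the 2016 paper."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)))
```

Every major framework now ships this as a built-in - torch.nn.GELU in PyTorch, for instance - which is what "becoming infrastructure" looks like in practice.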
MMLU is the other anchor. The pitch: "We need to know if language models actually know things, not just sound like they do." So Hendrycks and colleagues assembled 15,908 multiple-choice questions spanning abstract algebra, anatomy, astronomy, business ethics, clinical knowledge, college chemistry, high school mathematics, jurisprudence, moral scenarios, virology, world religions, and 46 other subjects. The result was a benchmark that exposed the gap between fluency and knowledge. GPT-3 barely cleared 40%. Later frontier models reached 90%+. The benchmark became the scorecard of the entire field's progress.
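Scoring is deliberately simple: ask the model each question, compare its chosen option against the answer key, report accuracy. A minimal sketch, assuming the public Hugging Face release under the cais/mmlu identifier; ask_model is a hypothetical stand-in for a real model call:

```python
from datasets import load_dataset  # pip install datasets

# One of MMLU's 57 subjects; each row has a question, four choices,
# and the index of the correct answer.
mmlu = load_dataset("cais/mmlu", "abstract_algebra", split="test")

def ask_model(question: str, choices: list[str]) -> int:
    """Hypothetical stand-in for an LLM call. Always guessing one
    option scores about 25%, the four-way random baseline."""
    return 0

correct = sum(
    ask_model(row["question"], row["choices"]) == row["answer"]
    for row in mmlu
)
print(f"accuracy: {correct / len(mmlu):.1%}")
```

That ~25% random floor is why GPT-3's 43% read as damning rather than respectable.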
When MMLU became too easy - frontier models started scoring 90%+ - Hendrycks and Scale AI built a replacement: Humanity's Last Exam (HLE). Released in January 2025, it contains 2,500 expert-level questions across 100+ subjects, contributed by academics and specialists worldwide. The inspiration came partly from Elon Musk's public complaint that MMLU had become too simple. As of release, the best available frontier models scored below 20% on HLE. The puck had moved again, and Hendrycks had already skated to where it was going.
Why He Thinks This Is the Hardest Problem
The Center for AI Safety, which Hendrycks founded after finishing his PhD, is a nonprofit with a specific thesis: AI poses risks comparable to pandemics and nuclear war, those risks are under-addressed, and fixing that requires research, field-building, and policy advocacy in roughly equal measure.
In 2023, CAIS released a one-sentence statement: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war." More than 600 AI scientists, researchers, and executives signed it - including the CEOs and chief scientists of major labs. It influenced the agenda of the UK's first AI Safety Summit. It was cited in Congressional hearings.
Hendrycks is not the loudest AI doomer in the room. He doesn't predict specific timelines or guarantee catastrophe. His position is more measured and, in some ways, more unsettling: "I think it's more likely than not that this doesn't go that well for people, but there's a lot of tractability." That's someone who thinks we're probably in trouble and still shows up every day to work on it.
"If they can automate AI research, then you could just run 100,000 of these artificial AGI researchers... you're moving from human-speed research to machine-speed research."
- Dan Hendrycks

Natural Selection Favors AIs over Humans
In March 2023, Hendrycks published a paper with that title. The argument: competitive pressures among corporations and militaries will naturally select for AI systems that are self-interested, deceptive, and power-seeking - not because anyone designed them that way, but because those traits win competitions. The paper explicitly compared AI evolution to biological evolution: just as natural selection doesn't optimize for human happiness, AI selection pressures don't optimize for human benefit.
Critics called it speculative. Optimists called it catastrophizing. Hendrycks called it a framework. It became one of the most-cited AI safety papers of 2023, partly because it was written in plain English and partly because it was genuinely hard to dismiss.
"Competitive pressures among corporations and militaries will give rise to AI agents that automate human roles, deceive others, and gain power. If such agents have intelligence that exceeds that of humans, this could lead to humanity losing control of its future."
In early 2025, he co-authored Superintelligence Strategy with Eric Schmidt (former Google CEO) and Alexandr Wang (Scale AI CEO). The paper introduced MAIM - Mutual Assured AI Malfunction - a deterrence framework analogous to Cold War nuclear MAD. The idea: major powers can credibly threaten to disable each other's AI systems, which creates a stable deterrence equilibrium. Whether you find that reassuring or terrifying probably depends on how you feel about Cold War analogies in general.
Deliberately Uncapturable
In 2023, Elon Musk launched xAI and brought Hendrycks on as AI Safety Adviser. Salary: $1/year. No equity. The math was deliberate: enough to be official, not enough to create a financial interest. In 2024, Scale AI made him an adviser. Same deal: $1/year, no equity.
When Hendrycks co-founded Gray Swan AI - an AI safety startup focused on red-teaming tools - he took a real equity stake. Then California's SB 1047 AI safety bill heated up, CAIS publicly supported it, and critics noticed that Hendrycks had a financial stake in a company that would benefit from AI safety regulation. He divested the entire equity stake, unprompted, before any formal complaint was filed, and announced it publicly on X. He kept his role as an unpaid adviser.
This pattern - the $1/year salaries, the preemptive divestment, the symbolic gestures toward independence - is not accidental. Hendrycks grew up in a world where moral authority required clean hands. He's operating the same way in a field where financial conflicts are the norm and credibility is the only currency that matters long-term.
We've found as AIs get smarter, they develop their own coherent value systems. For example they value lives in Pakistan more than India more than China more than US. These are not just random biases, but internally consistent values that shape their behavior, with many implications for AI alignment.
- Dan Hendrycks on X, 2025

He Called It in 2021
In 2021, as a PhD candidate, Hendrycks posted on an online forum that Elon Musk would "re-enter the fight to build safe advanced AI" by 2023. At the time, Musk had publicly left OpenAI's board and seemed to have moved on from the AI field. The prediction seemed like a stretch.
In July 2023, Musk launched xAI. Hendrycks became its AI Safety Adviser a few months later. He had not met Musk until late in the advisory interview process. The prediction wasn't based on inside information - it was pattern recognition about where the puck was going. Wayne Gretzky's advice, applied to superintelligence policy.
That's the Hendrycks move: anticipate the trajectory, position the work ahead of it, and show up when the moment arrives already having done the thing. MMLU before anyone needed a language model benchmark. CAIS before anyone was organizing AI safety at scale. Superintelligence Strategy before the deterrence conversation had a vocabulary.
The Ledger
GELU: Written as an undergrad in 2016. Now inside BERT, GPT, Vision Transformers, and virtually every major AI model running today.
MMLU: 100M+ downloads. The world's standard for measuring AI knowledge across 57 subjects. Cited in Senate hearings.
Center for AI Safety: Founded 2022. Organized 600+ AI scientists. Influenced the UK's first AI Safety Summit. 100+ researchers with compute access.
TIME 100/AI: Named to TIME's inaugural 100 Most Influential People in AI list in 2023, categorized as a "Shaper."
The textbook: Introduction to AI Safety, Ethics, and Society - published by Routledge (Taylor & Francis), 2024. Available at aisafetybook.com.
Humanity's Last Exam: 2,500 expert-level questions co-developed with Scale AI (2025). Frontier models score under 20%. The hardest benchmark in existence.
What's Happening Now
xAI signs the safety portion of the EU AI Act Code of Practice - announced by Hendrycks on X.
Superintelligence Strategy paper (with Eric Schmidt and Alexandr Wang) introduces MAIM - Mutual Assured AI Malfunction - as a deterrence framework.
MASK Benchmark published: shows frontier LLMs will lie under pressure even when their truthfulness scores look good.
Humanity's Last Exam released with Scale AI. 2,500 expert-level questions. Best frontier models score below 20%.
Speaks at Singapore Conference on AI (SCAI 2025).
Joins Scale AI as adviser. $1/year salary. No equity.
Remote Labor Index (RLI) published - first systematic benchmark of AI automating real paid freelance work. Current best: ~2.5% automation.
What He's Actually Working Toward
When pressed on what success looks like, Hendrycks is specific: "People have enough resources and ability to live their lives in ways as they self-determine." Not utopia. Not a guarantee. Just the floor: that AI development doesn't strip human agency away from the humans it's supposedly serving.
He's building toward that goal through a mix of tools that most researchers don't combine. Technical benchmarks that define what intelligence means. Policy advocacy that shapes how governments respond to AI. Organizational work that builds the field of AI safety into something durable enough to outlast any single researcher or lab. And symbolic gestures - the $1/year, the divested equity - that keep him credible in a space where everyone else is trying to get rich off the thing they're also telling you to worry about.
Whether it works is genuinely uncertain. Hendrycks would be the first to say so. "I think it's more likely than not that this doesn't go that well for people," he said. The operative word is "tractability." He thinks you can still work on it. So he does.