Maarten Grootendorst
Developer Relations @ Google DeepMind
NLP Engineer · Author · Open-Source Creator

Psychologist who taught machines to read. Then taught the world how they do it.
15M+ Downloads
2M+ Newsletter Views
50K+ Students Taught
The Man Who Made Language Models Make Sense
Google DeepMind · O'Reilly Bestselling Author · Netherlands

The Interpreter of Machines

Here is a person who decided that the fastest route between clinical psychology and Google DeepMind was a straight line - through three master's degrees earned cum laude, a suite of open-source tools that became the go-to machinery for NLP practitioners globally, and a writing practice that strips the mathematics out of machine learning and replaces it with pictures so good they should hang in galleries.

Maarten Grootendorst currently serves as Developer Relations Engineer at Google DeepMind, a role he stepped into in early 2025. Before that, he spent years as a clinical data scientist at IKNL - the Netherlands Comprehensive Cancer Organisation - where colleagues routinely spent entire weeks understanding a problem before touching a keyboard. It shaped him. He absorbed the patience. He kept the curiosity. He left the slow pace behind.

But the thing that most defines Maarten is not his job title. It is what he builds on the side: tools that people actually use, explanations that people actually read, books that people actually finish.

"The simplest solutions have the greatest impact."

- Maarten Grootendorst

The open-source portfolio tells the story most efficiently. BERTopic is a topic modelling library that uses BERT embeddings and a class-based TF-IDF procedure to extract interpretable topics from any corpus - earning over 7,600 GitHub stars and becoming a standard reference in the NLP toolkit. KeyBERT does minimal keyword extraction using the same BERT backbone, with 4,200+ stars and a simplicity that makes it feel almost unfair. PolyFuzz handles fuzzy string matching. Together they have been downloaded over 15 million times. That is not a side project. That is an infrastructure layer for an entire research community.
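The class-based TF-IDF idea behind BERTopic can be sketched in a few lines of plain Python. This is a simplified illustration of the c-TF-IDF weighting from the BERTopic paper, not the library's actual implementation (which applies the weighting to clusters found in embedding space): each topic's documents are merged into one pseudo-document, and a term is scored by its in-class frequency times an inverse-frequency factor computed across classes.

```python
import math
from collections import Counter

def c_tf_idf(topic_docs):
    """Toy class-based TF-IDF. topic_docs maps topic id -> list of documents.
    Each topic's documents are merged into one pseudo-document, and a term's
    weight is tf(term, class) * log(1 + A / tf(term)), where A is the average
    number of words per class (the weighting described in the BERTopic paper)."""
    class_counts = {
        t: Counter(" ".join(docs).lower().split())
        for t, docs in topic_docs.items()
    }
    total = Counter()
    for counts in class_counts.values():
        total.update(counts)
    avg_words_per_class = sum(total.values()) / len(class_counts)  # A
    return {
        t: {term: tf * math.log(1 + avg_words_per_class / total[term])
            for term, tf in counts.items()}
        for t, counts in class_counts.items()
    }

topics = {
    0: ["the cat sat on the mat", "the dog chased the cat"],
    1: ["the market fell on the news", "the stocks rallied on the report"],
}
scores = c_tf_idf(topics)
# "cat" outranks the shared stopword "the", which appears in both classes.
print(max(scores[0], key=scores[0].get))  # cat
```

The point of the inverse-frequency term is visible even in this toy: words shared across all topics get damped, while words concentrated in one topic rise to the top as that topic's descriptors.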

In October 2024, Maarten co-authored Hands-On Large Language Models with Jay Alammar, published by O'Reilly. It became an Amazon number-one bestseller in both Data Modeling and NLP. The GitHub repository for the book now carries more than 25,000 stars - making it one of the most-starred educational repositories in machine learning. Not bad for a book written by a psychologist and a visualiser who decided the world needed fewer equations and more colour-coded diagrams.

The newsletter, Exploring Language Models on Substack, is where Maarten publishes the visual guides that have made him something of a household name in AI circles. Topics like Mixture of Experts, Quantization, Mamba and State Space Models, and LLM Agents - each one arrives as a deep dive illustrated with fifty-plus custom visuals, the mathematics translated into something you can actually picture in your head. Two million total views. No math required to understand the ideas. That is not an accident. It is a philosophy.

"You don't have to know everything - eventually you will catch up."

- Maarten Grootendorst

The psychology background is not incidental context. It is load-bearing structure. Maarten has spoken candidly about facing doors that would not open because his resume said "psychologist" instead of "computer scientist." His response was not to hide the background - it was to write so clearly, so technically, so prolifically that the credentials became irrelevant. By the time anyone noticed he had not taken a traditional route, he had already built things they were downloading.

He has taught over 50,000 students through courses on DeepLearning.AI and Analytics Vidhya. His academic papers have accumulated more than 3,000 citations on Google Scholar. He has given talks at major conferences and appeared on podcasts explaining BERTopic, the trajectory of LLMs, and the curious relationship between how psychologists think and how neural networks learn.

Now the next book is in progress: An Illustrated Guide to AI Agents, again with Jay Alammar, again from O'Reilly. The topic is agents - memory, tools, planning, reasoning, autonomy. More than 300 custom illustrations are already done across twelve chapters. If the first book was about understanding language models, this one is about understanding what happens when those models start taking actions in the world. The stakes are higher. The pictures will be better.

"Much of what we do in data science involves some sort of human aspect."

- Maarten Grootendorst

For all the output, Maarten is not someone who mistakes velocity for progress. He is notably candid about the FOMO trap in AI - the compulsive need to engage with every new model, every new paper, every new technique the moment it appears. His advice is more aligned with reinforcement learning than with the culture of the AI Twitter feed: know when to explore and when to exploit. Go deep on foundations. The fundamentals do not expire. The hype does.

He manages all of this while dealing with chronic pain - a fact he has mentioned in interviews not for sympathy but as part of the texture of a real working life. The output is not the product of some frictionless genius flow state. It is the product of consistency and choice and the kind of discipline you build when you spend years studying how humans actually behave under pressure.

When he is not writing visual guides to the internals of transformer architectures, Maarten plays board games - solo games like 20 Strong, which is perhaps the most fitting hobby for someone who has spent a career turning complicated systems into something one person can navigate alone. He draws inspiration from multiple sources rather than a single hero, extracting techniques and approaches from everywhere and combining them into something that reads as distinctly his own.

At Google DeepMind, he now sits at the intersection of the world's most ambitious AI research organisation and the broader community of developers trying to make sense of it all. Developer relations at that level is less about writing API documentation and more about being the translator between what frontier research produces and what practitioners can actually use. Which, when you think about it, is exactly what Maarten has been doing for the past decade - just with a different employer on the letterhead.

The career arc looks almost inevitable in retrospect. Psychologist who noticed a gap between how AI works and how humans understand it. Built tools to close that gap. Wrote books to close it further. Joined the organisation at the frontier. But that hindsight narrative is a fiction - what actually happened was a series of choices made by someone who kept asking what is genuinely useful, kept giving it away for free, and kept explaining it until the explanation was good enough to stand on its own.

That is the Maarten Grootendorst approach: make the thing, make it open, make it understandable, and repeat until the field catches up with what you already built.

The Scoreboard

15M+ Open-Source Downloads
25,000+ GitHub Stars (LLM Book Repo)
3,000+ Academic Citations
2M+ Newsletter Views

Books That Actually Explain Things

Amazon #1 Bestseller
Hands-On Large Language Models: Language Understanding and Generation
O'Reilly Media · October 2024 · Co-author: Jay Alammar
The visual field guide to LLMs - covering generative, representational, and retrieval applications with working code and hundreds of custom illustrations. The companion GitHub repo has 25,000+ stars.
Forthcoming
An Illustrated Guide to AI Agents: Building Autonomous Systems with LLMs, Tools, and Memory
O'Reilly Media · 2025/2026 · Co-author: Jay Alammar
300+ custom illustrations across 12 chapters. Memory, tools, MCP, context engineering, RL and reasoning LLMs - everything that sits between an LLM and a fully autonomous system.

Tools People Actually Use

BERTopic
★ 7,600+ Stars on GitHub
Neural topic modelling using BERT embeddings and c-TF-IDF. Creates easily interpretable topics at scale. The go-to topic modelling library for the NLP community.
KeyBERT
★ 4,200+ Stars on GitHub
Minimal keyword extraction using BERT embeddings. Finds the keywords that matter. Simple enough to use in five minutes, powerful enough for production.
PolyFuzz
★ 795+ Stars on GitHub
Fuzzy string matching, grouping, and evaluation. Built for data scientists who need to match messy real-world strings to clean reference data without losing their minds.
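KeyBERT's core recipe fits in a few lines: embed the document and its candidate terms with the same model, then rank candidates by cosine similarity to the document embedding. A minimal sketch of that scoring step, using hand-made stand-in vectors (a real pipeline would get these from BERT or a sentence-transformer; the values below are purely hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_keywords(doc_vec, candidates, top_n=2):
    """Rank candidate terms by cosine similarity to the document embedding -
    the basic scoring KeyBERT applies on top of BERT embeddings."""
    scored = sorted(candidates.items(),
                    key=lambda kv: cosine(doc_vec, kv[1]),
                    reverse=True)
    return [term for term, _ in scored[:top_n]]

# Toy 3-d embeddings (hypothetical values for illustration only).
doc_vec = [0.9, 0.1, 0.2]                # a document about topic modelling
candidates = {
    "topic modelling": [0.8, 0.2, 0.1],  # points the same way as the document
    "keyword": [0.5, 0.5, 0.3],
    "weather": [0.0, 0.9, 0.4],          # unrelated direction
}
print(rank_keywords(doc_vec, candidates))  # ['topic modelling', 'keyword']
```

The simplicity is the whole trick: once everything lives in the same embedding space, "which phrases best describe this document" reduces to a similarity sort.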

All projects are free. All code is open. All downloads are real.

From Couch to Code

2017 Enrolled at Jheronimus Academy of Data Science (JADS) - combining his psychology background with formal data science training.
2019 Graduated from JADS cum laude. Joined IKNL (Netherlands Comprehensive Cancer Organisation) as Clinical Data Scientist. Started building in public.
2020 Released KeyBERT - minimal keyword extraction with BERT. Downloads begin. The pattern is set: simple tool, open source, extraordinary uptake.
2022 Published the BERTopic academic paper on arXiv; the library cemented its place as the standard neural topic modelling tool. Writing on Medium and Substack takes off.
2023 Open-source toolkit crosses 10M+ combined downloads. Launched "Exploring Language Models" newsletter with visual deep dives on LLM internals.
2024 Co-authored "Hands-On Large Language Models" with Jay Alammar (O'Reilly). Amazon #1 bestseller. Newsletter hits 2M+ total views.
2025 Joined Google DeepMind as Developer Relations Engineer. Announced "An Illustrated Guide to AI Agents" with Jay Alammar. The next chapter begins.

Straight Talk

"The best way to learn is really understanding and knowing the basics."

On navigating the fast-moving AI field

"Psychologists can also be great data scientists."

On his unconventional career path

"Understanding the problem truly is a solution."

On lessons from cancer research colleagues

"Much of what we do in data science involves some sort of human aspect."

On why psychology and AI connect

What Makes Maarten Maarten

🎨
Visual Thinker
Every guide Maarten publishes contains 50+ custom-made visuals. He was inspired by 3Blue1Brown's approach: that complex mathematics can be presented "as if you're watching a movie." He took that philosophy and applied it to language models.
🧠
The Psych Angle
Three MSc degrees. Two in psychology. One in data science. He did not abandon the first two when he pivoted - he built the third on top of them. The result: an AI educator who actually understands how humans learn.
🔓
Everything Open
BERTopic: open. KeyBERT: open. PolyFuzz: open. Visual guides: free. Newsletter: free. Courses: available to anyone. The 15 million downloads are not a coincidence. They are the direct consequence of a decision to give things away.
🎲
Off the Clock
Board game enthusiast. Plays solo games like 20 Strong to decompress from the pace of the AI field. Possibly the only AI researcher who unwinds by navigating complex rule systems - which, on reflection, is perfectly on-brand.
⚖️
Explore vs. Exploit
He compares learning in fast-moving AI to reinforcement learning's explore-exploit tradeoff. You do not have to engage with every new paper the moment it appears. Deep understanding of fundamentals beats FOMO every time.
💪
Built Resilient
Manages chronic pain while maintaining one of the most prolific output schedules in AI education. The productivity is not effortless. It is earned, consistently, through discipline and a very clear understanding of what actually matters.

Things You Should Know

Fact 01

He holds three separate Master's degrees - all completed cum laude. Most people stop at one. Maarten apparently treats MSc programs as collectibles.

Fact 02

His career change from psychologist to AI engineer was partly motivated by proving critics wrong. When doors closed because his CV said "psychology," he wrote his way through the walls instead.

Fact 03

The Hands-On LLM GitHub repo has 25,000+ stars - making the companion code repository for a book more starred than most production software projects.

Fact 04

His newsletter explains LLM internals using virtually no mathematics - just visual intuition built from custom illustrations. It has been read over two million times.

Fact 05

He created BERTopic animations using manim, 3Blue1Brown's open-source visualisation software. He borrowed someone else's visual language, applied it to NLP, and the work behind those visuals has accumulated 3,000+ citations in academic papers.

Fact 06

He is one of very few AI educators with both a deep clinical psychology background AND production-grade ML engineering credentials. The combination is genuinely rare and is directly reflected in how he teaches.