The Chip Architect Who Caught the Wave He Didn't See Coming
In November 2022, Reiner Pope packed up his desk at Google and walked out the door. His plan: build a chip company. What he didn't know - what nobody outside OpenAI knew - was that one week later, a chat interface called ChatGPT would set off demand for AI compute unlike anything the world had seen. The timing wasn't prescient. It was something more interesting: it was right.
Pope is the co-founder and CEO of MatX, a Mountain View semiconductor startup designing chips specifically for large language models. Not chips that can run LLMs. Chips engineered from the substrate up with one question in mind: what does physics actually allow here? MatX has now raised approximately $604 million, including a $500M Series B closed in February 2026 and led by Jane Street and Leopold Aschenbrenner's Situational Awareness LP - two of the more technically rigorous backers in finance.
The company employs over 110 people. It has a chip in development. It is not shipping yet - Pope has set a target of late 2027 for production systems. But in an industry where "challenger to Nvidia" is a phrase dozens of companies use, MatX is one of the few where the founders have spent years doing exactly the work they now claim they can do better.
An Uncomfortable Trade-off, Solved in Silicon
The central problem in LLM hardware is something Pope describes as an "uncomfortable trade-off between latency and throughput." You can have chips with fast SRAM memory - low latency, great for serving one request at a time, but expensive and limited in size. Or you can have HBM memory - high capacity, good for throughput, but slower for individual requests. Everyone picks a side. MatX is trying to have both.
The MatX One chip uses a hybrid architecture: model weights live in SRAM for fast, low-latency access - achieving the snappy response times of designs from companies like Groq or Cerebras. The key-value caches, which grow with context length, live in HBM - supporting extended context windows without the capacity constraints of pure SRAM systems. The result is a chip that Pope claims offers higher throughput than any announced competing product while simultaneously matching the latency of SRAM-first designs.
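A back-of-the-envelope sketch in Python shows why splitting the two kinds of memory use makes sense. The model dimensions below (70 billion parameters, 80 layers, 8 key-value heads, 128-dimensional heads, 16-bit values) are hypothetical numbers chosen for illustration, not MatX or customer figures, and `ModelShape`, `weight_bytes`, and `kv_cache_bytes` are names invented here. The only point is that weight storage is fixed, while key-value cache storage grows linearly with context length and batch size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical transformer dimensions, for illustration only."""
    n_params: int             # total number of weights
    n_layers: int
    n_kv_heads: int
    head_dim: int
    bytes_per_value: int = 2  # assumes 16-bit weights and cache entries


def weight_bytes(m: ModelShape) -> int:
    """Storage for the weights: independent of context length."""
    return m.n_params * m.bytes_per_value


def kv_cache_bytes(m: ModelShape, context_tokens: int, batch: int = 1) -> int:
    """Storage for the key-value cache: linear in context length and batch.

    The factor of 2 counts one key vector and one value vector per token,
    per layer, per key-value head.
    """
    per_token = 2 * m.n_layers * m.n_kv_heads * m.head_dim * m.bytes_per_value
    return per_token * context_tokens * batch


# Assumed example model; not a real product specification.
shape = ModelShape(n_params=70_000_000_000, n_layers=80, n_kv_heads=8, head_dim=128)

print(f"weights: {weight_bytes(shape) / 1e9:.1f} GB")
for context in (8_000, 32_000, 200_000):
    gb = kv_cache_bytes(shape, context) / 1e9
    print(f"KV cache at {context:>7} tokens: {gb:.1f} GB per sequence")
```

Under these assumptions each token of context adds 327,680 bytes of cache per sequence, so a single 200,000-token sequence needs roughly 65.5 GB of cache against a fixed 140 GB of weights, and every additional concurrent sequence adds its own cache on top.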
The chip is also built around a "splittable systolic array" - an architecture that retains the energy and area efficiency of large systolic arrays while achieving high utilization on smaller matrices with flexible shapes. Pope has also built custom scale-up and scale-out interconnects, and a programming model that gives developers direct control of the hardware rather than abstracting it away.
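The utilization problem that a splittable array addresses can be illustrated with a toy model in Python. This is a sketch under simplifying assumptions, not a description of MatX's design: it assumes an operand must be padded up to whole array-sized tiles, and it treats "splitting" as nothing more than using a smaller sub-array for the same operand. The function name `tile_utilization` and the array sizes are made up for this example.

```python
import math


def tile_utilization(rows: int, cols: int, array_rows: int, array_cols: int) -> float:
    """Fraction of processing elements doing useful work when a
    rows x cols operand is padded up to whole array-sized tiles."""
    padded_rows = math.ceil(rows / array_rows) * array_rows
    padded_cols = math.ceil(cols / array_cols) * array_cols
    return (rows * cols) / (padded_rows * padded_cols)


# Hypothetical array sizes: one large array versus the same silicon
# treated as smaller sub-arrays working on independent operands.
for side in (256, 128):
    u = tile_utilization(96, 96, side, side)
    print(f"96x96 operand on a {side}x{side} array: {u:.1%} utilized")
```

In this toy model a 96 x 96 operand keeps about 14% of a 256 x 256 array busy but about 56% of a 128 x 128 sub-array, while a 256 x 256 operand fills the large array completely. That gap on small or oddly shaped matrices is the kind of loss a splittable design is meant to avoid.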
What MatX deliberately doesn't target: small models, convolutional operations, recommendation systems. The focus is surgical. Frontier LLMs - training, reinforcement learning, prefill inference, decode inference. The labs spending the most on compute, trying to build the biggest systems. That's the customer.
"All of the frontier labs are spending tens of billions of dollars on compute. The rational choice is to do anything you can to get hardware costs down."
- Reiner Pope, CEO of MatX
From Haskell to TPUs to $600M
Pope grew up in Australia and studied mathematics at The Australian National University, graduating in 2011 with a Bachelor of Philosophy (Honours). He came into software through a route that most hardware engineers don't take: functional programming. He was, by his own description, a math enthusiast and Haskell programmer before he ever touched chip design.
That background - mathematical rigor, type-theoretic thinking, functional decomposition - runs through everything he's done since. Before Google, he spent years in software focusing on compilers, algorithms, and low-level systems. The kind of work where you're thinking carefully about what computations actually cost and why.
At Google, he joined Research as a Senior Staff Software Engineer in 2019 and spent roughly three years at the intersection of model training, hardware architecture, and compiler development. His role as Efficiency Lead for PaLM meant owning the software stack that made one of Google's most important language models actually run at production scale. He helped design Google's TPU v5e. He built what was at the time described as the world's fastest LLM inference software. And he co-authored the PaLM paper - which has gone on to accumulate over 9,400 Google Scholar citations.
First Principles as a Practice, Not a Slogan
Pope is one of a small number of people who have worked across the complete AI stack - from writing the software that runs on Google's TPUs to understanding the architectural decisions that shaped those chips in the first place. His co-founder Mike Gunter was the lead designer of the TPU hardware itself. Together, they cover every layer from silicon to model.
That depth shapes how he approaches the problem. He's written publicly about why frontier models are likely trained roughly 100x beyond Chinchilla-optimal compute - not because researchers are inefficient, but because when you account for the full lifecycle of a model (reinforcement learning, inference, continued fine-tuning), the economics look completely different from a naive "minimize training compute" framing. He deduces things about competitor systems by reading API pricing: long context getting expensive around 200K tokens means you're running into memory bandwidth walls. Prefill costing 5x less than decode means the systems are "tremendously memory bandwidth bottlenecked."
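The reasoning behind the prefill-versus-decode observation can be sketched with a standard roofline-style calculation in Python. The hardware numbers below are hypothetical round figures, not the specifications of any real chip, and the model deliberately ignores attention and key-value cache traffic. It assumes each weight read from memory is used for one multiply and one add per token processed, so the work done per byte fetched depends on how many tokens share a single pass over the weights.

```python
def arithmetic_intensity(tokens_per_weight_read: int, bytes_per_param: float = 2.0) -> float:
    """FLOPs performed per byte of weights read from memory.

    Each parameter contributes 2 FLOPs (multiply + add) per token, and
    is assumed to be read once for all tokens processed together.
    """
    return 2.0 * tokens_per_weight_read / bytes_per_param


def ridge_point(peak_flops_per_s: float, mem_bytes_per_s: float) -> float:
    """Intensity at which a chip moves from memory-bound to compute-bound."""
    return peak_flops_per_s / mem_bytes_per_s


def limiting_resource(intensity: float, ridge: float) -> str:
    return "memory bandwidth" if intensity < ridge else "compute"


# Hypothetical accelerator: 1 PFLOP/s peak, 3 TB/s memory bandwidth.
ridge = ridge_point(1.0e15, 3.0e12)

decode = arithmetic_intensity(tokens_per_weight_read=8)      # 8 sequences, 1 new token each
prefill = arithmetic_intensity(tokens_per_weight_read=4096)  # one 4096-token prompt

print(f"ridge point: {ridge:.0f} FLOPs/byte")
print(f"decode:  {decode:.0f} FLOPs/byte -> limited by {limiting_resource(decode, ridge)}")
print(f"prefill: {prefill:.0f} FLOPs/byte -> limited by {limiting_resource(prefill, ridge)}")
```

With these assumed numbers, decode does far less arithmetic per byte fetched than the chip can sustain, so its cost is set by memory bandwidth, while prefill amortizes each weight read over thousands of tokens and is limited by compute. A large price gap between the two is consistent with that picture.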
He's also, characteristically, still writing technical blog posts. From the MatX offices in Mountain View, he's publishing essays on cuckoo hashing improvements for SIMD hash tables, the conditions under which hashed sorting beats hash tables, whether Strassen matrix multiplications would be useful in AI if data movement were free, and why neural networks and cryptographic ciphers share deep structural similarities. The math compulsion hasn't been swapped out for CEO-speak.
"The best iteration is doing it in your head before writing code."
- Reiner Pope
"We left Google one week before ChatGPT was released. We did not know it was coming."
"Most of chip design is actually software development in practice."
"AIs are most effective when there is a well-defined objective function."
"You need to be on par on five things and far ahead on at least one."
Betting Against CUDA Lock-in
MatX's argument isn't just technical. It's economic. The frontier AI labs - the handful of companies genuinely pushing the limits of model scale - are spending tens of billions on compute annually. They're also deeply locked into Nvidia's CUDA ecosystem because switching costs are enormous and supply is constrained. Pope thinks that equation is changing.
The labs with the largest compute budgets, he argues, have the longest planning horizons - three to five years out - and the most to gain from hardware that's genuinely better for their workloads. They're not buying GPUs out of affection. They're buying what works. If MatX can deliver a chip that's definitively superior for LLM training and inference, the economics become compelling regardless of ecosystem switching costs.
The $500M Series B, announced in February 2026, is the funding needed to get through tapeout and into production. Jane Street - not a typical venture backer - led the round. Situational Awareness LP, founded by Leopold Aschenbrenner, who wrote one of the most-circulated memos on AGI timelines, also participated. Stripe co-founders Patrick and John Collison backed the round. Marvell Technology - an actual semiconductor company - also came in. This is not a round driven by people who don't understand what they're funding.
What's Already on the Board
- Co-authored PaLM paper - one of the foundational large language model research papers, with 9,400+ citations in Google Scholar
- Built what was described at the time as the world's fastest LLM inference software, as Efficiency Lead for PaLM
- Helped conceive the Google TPU v5e chip architecture - the hardware powering much of Google's current AI infrastructure
- Co-founded MatX and raised $604M total including the largest single chip startup round of early 2026
- Co-authored the JAX scaling book (jax-ml.github.io/scaling-book/) - a reference for ML practitioners
- Built MatX from two founders to 110+ employees in under three years, targeting TSMC manufacturing at datacenter scale
Still a Mathematician at Heart
What's unusual about Pope among startup CEOs is his intellectual range. He codes in Rust - attracted by the type system and manual memory control. He prefers iteration in his head over iteration in code. He publishes technical essays that have no obvious marketing value, because he's thinking through the problems anyway and writing them down.
Recent posts from his blog at reiner.org include an exploration of why cuckoo hashing improves SIMD hash tables, whether hashed sorting typically beats hash tables in practice, whether Strassen matrix multiplications could be useful in AI if data movement costs were removed from the equation, and a structural analysis of the similarity between neural networks and cryptographic ciphers. These aren't PR exercises. They're what happens when someone who thinks in mathematics runs a chip company.
He's also genuinely direct about what MatX doesn't know yet. Manufacturing at datacenter scale "remains the biggest hurdle" for the company. He gives production target dates that acknowledge the difficulty involved. The chip is designed. The question now is execution - the unglamorous, grinding work of taking something from a great design to something shipping at scale in datacenters.
Pope has described his approach to competition simply: be on par with the best on all the important dimensions, and be far ahead on at least one. In the chip business, that's a harder challenge than it sounds. The history of Nvidia challengers is a graveyard of companies that solved part of the problem and missed the rest. Pope and Gunter's advantage is that they have, between them, done the actual work at every layer of the stack. They know where the bodies are buried because they were there when the graves were dug.
Further Reading & Links
- MatX official website - Company, technology, and the MatX One chip
- reiner.org - Pope's personal technical blog
- Chipstrat interview - Deep dive on MatX strategy and chip architecture
- Dwarkesh Patel podcast - The math behind how LLMs are trained and served
- Cheeky Pint - Accelerating AI with transformer-optimized chips
- Google Scholar - Research publications
- JAX Scaling Book - Co-authored reference on scaling LLMs
- TechCrunch - MatX $500M Series B coverage