Partner at Andreessen Horowitz. Coined "LLMflation." Built two startups before breakfast. Flies his own plane. Runs home Kubernetes clusters for fun.
Before Guido Appenzeller started writing checks at Andreessen Horowitz, he built things that got acquired - twice. Voltage Security, where he was CTO, brought identity-based encryption to enterprise email. HP bought it. Big Switch Networks, where he was CEO, bet the house on software-defined networking before SDN was a mainstream term. Arista bought that one too. The thread connecting those exits isn't luck. It's a particular flavor of foresight: seeing infrastructure shifts early, building the pick-and-shovel companies, and being right.
That same pattern - track the underlying cost curves, build where the economics break - explains why his work at a16z centers on AI infrastructure. In 2024 he published an analysis called "LLMflation" that became required reading across the industry. The central finding: for models of equivalent performance, inference cost was dropping 10x per year. The 2021 cost of hitting a specific MMLU benchmark score was $60 per million tokens. By 2024, $0.06. A 1,000x decline in three years. Most people felt that change. Guido graphed it.
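The arithmetic behind that claim is easy to check: a steady 10x annual decline compounds to exactly 1,000x over three years. A quick sketch, using only the figures cited above:

```python
# Check the LLMflation cost curve described above. The figures are the
# ones cited in the article: ~$60 per million tokens in 2021 falling to
# ~$0.06 by 2024 for equivalent MMLU performance.

cost_2021 = 60.00   # USD per million tokens, 2021
cost_2024 = 0.06    # USD per million tokens, 2024
years = 3

total_decline = cost_2021 / cost_2024          # 1000x over the period
annual_decline = total_decline ** (1 / years)  # ~10x per year

print(f"total decline: {total_decline:.0f}x")    # → total decline: 1000x
print(f"annual decline: {annual_decline:.1f}x")  # → annual decline: 10.0x
```

The cube root of 1,000 is 10, which is why "10x per year" and "1,000x in three years" are the same claim stated two ways.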
At VMware, he joined after his Big Switch Networks exit and landed in a company with a very different scale. "VMware was the first time for me to work in a large company - or in fact, any company I didn't start myself," he said. He helped architect the multi-cloud strategy and was part of the team that grew NSX - VMware's software-defined networking product - from under $200M to over $1B in annual revenue. It was a proving ground for operating at enterprise speed: longer sales cycles, more politics, but also a completely different data set on what infrastructure actually looks like at Fortune 500 scale.
After VMware came Yubico as Chief Product Officer, then Intel as CTO of the Data Platforms Group. Intel is not a place where engineers tend to end up after running startups. It's slower. The silicon roadmap moves in years, not sprints. But Guido treated it the way he treats everything: as an infrastructure problem with a cost structure worth studying. His analysis of GPU efficiency and compute trends at Intel directly informs the investment thesis he now applies at a16z.
The career arc reads like someone stress-testing every layer of the stack. Academic research at Stanford's Clean Slate Lab, where he led the team that developed the OpenFlow v1.0 protocol - the foundation of software-defined networking as an industry. Then founder mode. Then operator mode at VMware. Then silicon and data platforms at Intel. And now investor mode, writing memos about AI cost curves with the same analytical rigor he once applied to network packet routing.
At a16z he sits on the Infrastructure Investing team, the group betting on the plumbing beneath AI: GPUs, networking, storage, developer tooling, open source models. He co-authored the AI Canon with colleague Matt Bornstein - a curated reading list covering transformers, diffusion models, and the core papers behind modern generative AI. It became the go-to reference for engineers entering the AI field who didn't want to wade through 1,000 arXiv papers blind.
On LinkedIn, where 31,000 people follow him, he posts data-heavy breakdowns of LLM cost per token, GPU utilization economics, and model quality benchmarks. His engagement rate of 1.64% puts him in the top 1% of AI professionals worldwide by Favikon's measure. These aren't marketing posts. They're primary analysis, written by someone who still installs ESXi on Intel NUC hardware at home and runs Kubernetes clusters for personal projects. The intellectual habits of a PhD computer scientist haven't left him; he just happens to apply them as a check-writer now.
"For an LLM of equivalent performance, the cost is decreasing by 10x every year." - Guido Appenzeller, "LLMflation" (a16z, 2024)
Source: Appenzeller, "LLMflation - LLM Inference Cost Is Going Down Fast" (a16z, 2024). Six independent drivers: GPU efficiency, quantization, software optimizations, smaller models, improved training techniques, open-source competition.
Before venture capital, before two acquisitions, before Intel and VMware - there was a room at Stanford's Clean Slate Lab in 2008. Guido was running it. The question on the table: could networking be reimagined from scratch, the way computing was reimagined by virtualization?
The answer was OpenFlow. The protocol decoupled the control plane from the data plane in network switches, letting software dictate how packets moved across hardware the way an operating system dictates how processes use a CPU. It was a radical idea. The networking industry had built decades of proprietary value on exactly the problem OpenFlow was trying to dissolve.
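The core idea can be caricatured in a few lines: the switch keeps a dumb match-action table, and a separate controller process decides what goes into it. This is a toy conceptual sketch only - the real OpenFlow 1.0 protocol defines binary wire messages, multi-field matches, and far richer actions:

```python
# Toy illustration of control/data plane separation -- NOT the actual
# OpenFlow wire protocol. The "switch" only matches and forwards;
# all forwarding decisions live in the controller that programs it.

class Switch:
    def __init__(self):
        self.flow_table = {}                 # match field -> output port

    def install_rule(self, dst, port):       # invoked by the controller
        self.flow_table[dst] = port

    def forward(self, packet):               # data plane: a pure lookup
        return self.flow_table.get(packet["dst"], "send-to-controller")

class Controller:
    """Centralized policy: here, a trivially simple static routing decision."""
    def program(self, switch):
        switch.install_rule("10.0.0.1", "port1")
        switch.install_rule("10.0.0.2", "port2")

sw = Switch()
Controller().program(sw)
print(sw.forward({"dst": "10.0.0.1"}))  # port1
print(sw.forward({"dst": "10.9.9.9"}))  # send-to-controller (table miss)
```

The punchline is in the last line: a table miss escalates to the controller, which is exactly the inversion that let software, not switch firmware, own routing policy.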
The SIGCOMM Test of Time Award would later recognize this work. The award goes to papers whose impact becomes fully visible only years after publication. OpenFlow v1.0, released in 2009, spawned an entire industry vertical - Software-Defined Networking - and became the intellectual foundation for the SDN products later built by VMware, Cisco, Google, and others. Guido's next startup, Big Switch Networks, was a direct commercialization of that research.
He holds a PhD from Stanford and an M.S. (Diplom) from Karlsruhe Institute of Technology. The German academic background shows: meticulous, data-driven, structurally sound. His public writing on LLM economics has the same quality as his academic work - careful about methodology, transparent about assumptions, willing to be proven wrong by updated data.
"Every time we decrease the cost of something by an order of magnitude, it opens up new use cases." - Guido Appenzeller, a16z
"VMware was the first time for me to work in a large company - or in fact, any company I didn't start myself." - Personal blog, guido.appenzeller.net
Most AI investors in 2024 learned the term "inference cost" from someone else's deck. Guido Appenzeller learned it as CTO of the Data Platforms Group at the company manufacturing the chips bearing the cost. That's not a small distinction. When he writes about GPU efficiency as a driver of LLM cost reduction, he's drawing on years of tracking Intel's silicon roadmaps, negotiating with fab partners, and watching the gap between theoretical and actual compute performance manifest in quarterly P&L statements.
The other thing that separates his public writing from most VC commentary: he names the six independent mechanisms driving LLM cost decline. Not "AI is getting cheaper and better" - that's a bumper sticker. His LLMflation analysis identifies GPU efficiency gains from Moore's Law, model quantization (from 16-bit to 4-bit), software-level optimizations, architecturally smaller models that match larger predecessors, improved instruction-tuning techniques like RLHF and DPO, and open-source competition compressing margins. Six independent levers. Any one of them slowing down doesn't stop the trend. That kind of multi-variable thinking is what happens when a physicist gets a CS PhD and then spends twenty years building things.
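One way to see why six independent levers matter: if total cost reduction is roughly a product of per-driver factors, the compound decline survives any single driver stalling. A hedged sketch with invented per-driver gains - the article does not quantify each lever separately, so these numbers are illustrative only:

```python
# Hypothetical per-year cost-reduction factors for the six drivers
# named in LLMflation. The individual values are invented for
# illustration; only the multiplicative structure matters here.
drivers = {
    "gpu_efficiency":     1.5,
    "quantization":       1.6,
    "software_optim":     1.4,
    "smaller_models":     1.5,
    "training_techniques": 1.3,
    "open_source_margin": 1.5,
}

def compound(factors):
    """Multiply independent per-driver gains into one annual factor."""
    total = 1.0
    for f in factors.values():
        total *= f
    return total

full = compound(drivers)                          # all six levers active
stalled = compound({k: v for k, v in drivers.items()
                    if k != "gpu_efficiency"})    # suppose Moore's Law stops
print(f"all drivers: {full:.1f}x per year")       # → all drivers: 9.8x per year
print(f"without GPU gains: {stalled:.1f}x per year")
```

In this toy parameterization the six levers compound to roughly 10x per year, and knocking out the hardware lever entirely still leaves a ~6.5x annual decline, which is the structural point behind "any one of them slowing down doesn't stop the trend."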
His investment thesis at a16z extends the same logic: intelligence is becoming a commodity cost, and the winners will be the companies that build on that commodity before the market fully prices it in. Every order-of-magnitude cost drop historically opens application categories that were previously economically impossible. Voice assistants couldn't exist at 1990s compute prices. Neither could protein folding. Neither can most of the AI applications being prototyped right now - until the cost curve moves another 10x. Guido is betting on the infrastructure that enables that move.