The Engineer Who Actually Explains Things
Right now, Vicki Boykis is doing what she does best: building recommendation systems at an early-stage startup while writing about the messy, nondeterministic, occasionally infuriating reality of modern ML. She is a founding engineer at a Philadelphia-area startup working on personalization and information retrieval - the kind of work that keeps the internet pointing you at things you might actually want, rather than things an algorithm decided you should want.
She came to this through a route that would raise eyebrows in certain circles and earn knowing nods in others. Economics degree from Penn State, MBA from Temple, a side trip through community college CS courses in 2016 when she decided that data work without real programming chops was just guesswork in a spreadsheet. No CS degree. No machine learning master's. Just the economist's disposition toward skepticism and the programmer's compulsion to actually test the hypothesis.
What distinguishes Boykis in a field crowded with people writing about AI is that she writes about what the work actually is. Not what it aspires to be. Not what the press release says. The Normcore Tech newsletter - running on Substack, read by engineers, managers, and skeptical observers across the industry - applies what she calls "humanism, nuance, context, rationality, and a little fun" to the practice of data and ML. The BuzzFeed method, she calls it: hit them with the memes, then sneak in the serious content.
My strategy is a little bit like BuzzFeed - hit them with the memes and then sneak in serious content in between.
- Vicki Boykis

The serious content landed. In 2023, she published "What Are Embeddings?" - a 70-plus page technical deep dive that became a reference document across the ML community. Not a blog post. Not a tweet thread. A proper, footnoted, thoroughly explained account of what embeddings are, how they work, and why understanding them matters for anyone building systems with language models. It spread on Hacker News. It spread through Slack channels. Engineers sent it to colleagues with the message: "this is the one."
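The core idea her piece unpacks fits in a few lines: an embedding maps text to a vector, and texts with similar meanings land near each other in that space. A toy sketch, with hand-made three-dimensional vectors standing in for what real models learn (real embeddings have hundreds of dimensions):

```python
from math import sqrt

# Toy, hand-made "embeddings": a real model learns these vectors from
# data and uses hundreds of dimensions, not three.
embeddings = {
    "cat":   [0.9, 0.8, 0.1],
    "dog":   [0.8, 0.9, 0.2],
    "stock": [0.1, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity: near 1.0 means same direction (similar meaning)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return dot / norm

# In this space, "cat" sits much closer to "dog" than to "stock".
print(cosine(embeddings["cat"], embeddings["dog"]))    # high, ~0.99
print(cosine(embeddings["cat"], embeddings["stock"]))  # low, ~0.30
```

The whole trick of embedding-based systems is doing exactly this comparison, at scale, over vectors a model produced rather than ones written by hand.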
At Automattic and Tumblr between 2020 and 2022, she worked on recommendation systems for WordPress.com and Tumblr - the kind of work that shapes what content millions of people see without ever knowing a system made that choice. At one point, an experimental feature surfaced a ten-minute video of a potato in a microwave as a recommendation. The job of a recommendation engineer is partly to prevent the potato video from being the first thing a new user sees. Also partly to acknowledge that some users want exactly that.
After Automattic, she went to Mozilla.ai, where she led the design and open-sourcing of Lumigator - a self-hosted Python application for evaluating large language models using offline metrics. The premise: if you're going to deploy an LLM, you should be able to compare it against alternatives using real metrics on real tasks, not vibes. Lumigator starts with summarization and grows from there. It is the kind of tool that reflects Boykis's orientation toward production reality over benchmark theater.
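The premise behind offline evaluation can be sketched without any of Lumigator's actual code: score each candidate model's output against a reference using a concrete metric, then compare the numbers. Here is a hypothetical version using a simple ROUGE-1-style unigram recall on summarization, the task Lumigator starts with (the model names and outputs below are invented for illustration):

```python
# Not Lumigator's code - a minimal sketch of offline evaluation:
# score candidate summaries against a reference with a ROUGE-1-style
# unigram recall, then pick the model with the better number.

def rouge1_recall(reference: str, candidate: str) -> float:
    """Fraction of reference unigrams that also appear in the candidate."""
    ref_tokens = reference.lower().split()
    cand_tokens = set(candidate.lower().split())
    if not ref_tokens:
        return 0.0
    return sum(1 for tok in ref_tokens if tok in cand_tokens) / len(ref_tokens)

reference = "the model summarizes the report in two sentences"
candidates = {  # hypothetical outputs from two models under comparison
    "model_a": "the model summarizes the report briefly",
    "model_b": "a long rambling answer about something else",
}

scores = {name: rouge1_recall(reference, out) for name, out in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)  # model_a wins: it recalls more of the reference
```

Real offline evaluation uses proper metric implementations and real task datasets, but the shape is the same: metrics on real tasks, not vibes.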
I'm not looking to trick you. I'm looking to have a conversation with you and to see if I can work with you, and that's it.
- Vicki Boykis, on engineering interviews

The conference she organized in 2022 - Normconf, held December 15, fifteen straight hours of virtual talks - captured something real about the state of the field. The explicit agenda was "all the stuff that matters in data and machine learning but doesn't get the spotlight." No keynote about the next frontier. No breathless announcements. Talks about the things practitioners actually wrestle with: cleaning data, writing documentation, communicating with stakeholders, not breaking production on a Friday. Fifteen hours of it. People watched all of it.
In 2023, she built Viberary - a semantic search engine for book recommendations that works unlike most recommendation systems. You don't search by genre or author or title. You search by vibe. "Rainy afternoon melancholy with a surprising ending." "Dense prose that rewards rereading." Sentence Transformers, an MS MARCO-trained model, and a genuine conviction that the aesthetic texture of a book is a legitimate search query. The project also became a public teaching object: Boykis documented the architecture, the decisions, the tradeoffs, and the things that didn't work.
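The retrieval step behind a system like this can be sketched in a few lines. Everything below is illustrative: the titles and vectors are made up, and in a real system the query vector would come from encoding the "vibe" text with a Sentence Transformers model rather than being written by hand.

```python
import numpy as np

# Hypothetical corpus: in practice these vectors would be produced by
# encoding book descriptions with a sentence-embedding model.
books = ["Book A", "Book B", "Book C", "Book D"]
book_vecs = np.array([
    [0.1, 0.9, 0.2],
    [0.8, 0.1, 0.3],
    [0.2, 0.8, 0.1],
    [0.9, 0.2, 0.8],
])

def top_k(query_vec, vecs, titles, k=2):
    """Rank items by cosine similarity to the query and return the top k."""
    q = query_vec / np.linalg.norm(query_vec)
    v = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sims = v @ q                      # cosine similarity per book
    order = np.argsort(-sims)[:k]     # indices of the k most similar
    return [(titles[i], float(sims[i])) for i in order]

# Stand-in for the encoded query "rainy afternoon melancholy ..."
query = np.array([0.15, 0.85, 0.15])
print(top_k(query, book_vecs, books))
```

Production systems swap the brute-force scan for an approximate nearest-neighbor index once the corpus gets large, but the ranking idea is the same.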
Her blog at vickiboykis.com operates as what she calls a "machine learning garden" - a growing, annotated map of her own learning. Annual retrospectives document what she built, read, learned, and changed her mind about. In February 2026 she published a piece on querying three billion vectors. These are not think pieces about the future of AI. They are detailed, personal accounts of working in a technical field and trying to understand it a little better each year.
The GitHub username is veekaybee - a phonetic rendering of her initials, V.B. The move from pyenv to uv made it into a blog post. The experience of learning nondeterministic LLM behavior in 2024, getting acquainted with Ray and vLLM and Llamafile, understanding GGUF format and FastAPI and OpenAPI-compatible APIs - all documented, publicly, because the point is not to appear to have always known. The point is to actually learn, and to make the learning legible to whoever comes next.
She has keynoted PyCon Italia in Florence, presented at PyData Amsterdam, and co-authored a widely-read piece on what ML engineering actually is with Gergely Orosz in The Pragmatic Engineer newsletter. She has published in Increment Magazine. She wrangles a kindergartner and a toddler in her off hours. She once taught herself Hebrew over a summer because she was embarrassed about not understanding the language on a visit to Israel. She planned to go into international relations. She ended up querying three billion vectors.
The field needs more people who can do the work and explain what they did. Boykis is one of the few who genuinely does both - with the economist's instinct to question the model, the engineer's discipline to test it, and the writer's insistence on saying something true.