Scene: A Conference Room in March
The boring half of AI
It is a Tuesday in 2026 and somewhere in downtown San Francisco, a 32-person company is quietly arguing about reranker latency. Not about agents. Not about AGI. About reranker latency. The vector index is hot, the embedding service is warm, and someone is pasting Grafana panels into Slack with the calm energy of a person who has done this before. The company is called Superlinked, and it has decided - against the prevailing fashion - that the most interesting problem in AI is not the model. It is everything that happens around the model.
Superlinked is the kind of startup that gets misread on first glance. Founders pitch RAG. Funds chase agents. Superlinked, instead, ships SDKs that turn rows in a database into vector embeddings - and now also ships a Kubernetes cluster that runs every model an agent might call. It's plumbing, in other words. The kind of plumbing that, when it works, becomes invisible. The founders seem fine with that.
Founders
Two ex-operators with a contrarian thesis
Daniel Svonava, the CEO, spent his pre-Superlinked years inside Google, building ML prediction infrastructure for YouTube Ads. The job, stripped of its gloss, was this: rank an enormous catalog against a fast-moving signal under a brutal latency budget. It is the same shape as enterprise search, only with more zeros.
Ben Gutkovich, COO, came from McKinsey's London office, where he advised Fortune 500s on digital transformation. He has the consultant's gift for naming what an enterprise actually needs - which, in 2026, turns out to be a way to run AI on data it cannot legally email to OpenAI.
They founded the company in 2021, somewhere between the GPT-3 demo and the vector-database land grab. By the time the rest of the market noticed retrieval, Svonava and Gutkovich had been quietly thinking about it for two years.
The thesis was unfashionable: vector databases got all the attention and most of the venture money, but the harder problem - turning messy enterprise data into vectors in the first place - was unaddressed.
The Product
The "vector computer"
The phrase is deliberate. Vector databases store and retrieve embeddings. A vector computer - their term - is the thing that produces and queries them. It is the compiler to the database's filesystem.
Superlinked Framework
An open-source Python SDK. You declare schemas, encoders and multi-objective queries; it produces embeddings that combine relevance, recency, popularity and whatever else you care about.
Superlinked Server
The framework as a service: FastAPI endpoints, Kafka ingestion, Redis-backed indexes, OpenTelemetry traces. The boring parts, professionally done.
Inference Engine (SIE)
Kubernetes-based, open source, self-hosted. LLMs, embeddings, OCR, vision, rerankers - 112 open models in a single cluster running in your cloud.
Managed Service
The same stack, run for you - for teams whose patience for Helm charts is finite.
RAG & Search
Production retrieval that doesn't collapse on the long tail of enterprise queries.
Recommendations
Multi-objective vector queries built for ranking - not just nearest-neighbor.
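One way to picture a multi-objective vector query is the sketch below. It is not the Superlinked API; it is a generic, standard-library illustration of the idea, and every name in it (`encode_item`, `encode_query`, `rank`, the quarter-circle scalar encoding, the one-year horizon, the view cap) is an assumption made for the example. Each item becomes one vector built from a relevance segment, a recency segment and a popularity segment, and the query carries a weight per segment, so a single dot product ranks on all three at once.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def encode_scalar(score):
    """Map a score in [0, 1] onto a unit 2-D vector; a score of 1.0 maps to [1, 0]."""
    s = min(max(score, 0.0), 1.0)
    angle = (1.0 - s) * math.pi / 2
    return [math.cos(angle), math.sin(angle)]

def freshness(age_days, horizon_days=365.0):
    """Hypothetical recency score: 1.0 for brand new, 0.0 at or past the horizon."""
    return 1.0 - min(age_days / horizon_days, 1.0)

def popularity(views, max_views=1_000_000):
    """Hypothetical popularity score: log-scaled view count, capped at 1.0."""
    return min(math.log1p(views) / math.log1p(max_views), 1.0)

def encode_item(text_emb, age_days, views):
    """Concatenate the three segments into one item vector."""
    return (l2_normalize(text_emb)
            + encode_scalar(freshness(age_days))
            + encode_scalar(popularity(views)))

def encode_query(text_emb, w_relevance, w_recency, w_popularity):
    """Weight each segment; the recency and popularity targets are the ideal score 1.0."""
    ideal = encode_scalar(1.0)
    return ([w_relevance * x for x in l2_normalize(text_emb)]
            + [w_recency * x for x in ideal]
            + [w_popularity * x for x in ideal])

def rank(items, query_vec):
    """Order item names by dot product with the query vector, best first."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    scored = [(dot(vec, query_vec), name) for name, vec in items.items()]
    return [name for _, name in sorted(scored, reverse=True)]

# Toy 2-D "text embeddings"; a real encoder would produce hundreds of dimensions.
ITEMS = {
    "old_exact": encode_item([1.0, 0.0], age_days=300, views=10),
    "fresh_close": encode_item([0.8, 0.6], age_days=5, views=50_000),
}

relevance_only = rank(ITEMS, encode_query([1.0, 0.0], 1.0, 0.0, 0.0))
balanced = rank(ITEMS, encode_query([1.0, 0.0], 1.0, 1.0, 1.0))
print(relevance_only[0], balanced[0])
```

With relevance as the only weight, the exact but stale match ranks first; with all three weights switched on, the fresher, more popular near-match overtakes it. Changing the ranking means changing query weights, not re-embedding the catalog.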
By The Numbers
What got Index Ventures to lead
In March 2024, Index Ventures led a $9.5M seed. Theory Ventures co-led. MMC, Episode 1, Firestreak and 20Sales filled out the round. The pitch, paraphrased: the vector-database category had absorbed roughly $250M while the compute side that feeds it remained underbuilt. Superlinked proposed to be the compute side.
The Ecosystem
A surprisingly long partner roster
For a 32-person company, the integration list is dense. It is the kind of list a sales team builds when it has decided that being the standard matters more than being the platform.
The Open-Source Turn
Why the front door is now inference
By 2025 the homepage was no longer primarily about vector embeddings. It was about agents - and specifically, about not paying frontier labs by the token. The Superlinked Inference Engine is the new front door: a self-hosted cluster that runs every model an agent might reach for, from LLMs and OCR to embeddings and rerankers.
The case is plain. If a workflow calls three open models a hundred million times, hosted per-token pricing becomes the company's largest single line item. SIE claims to cut that line item roughly fiftyfold, and to do it on the customer's own cloud.
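The arithmetic behind that argument can be sketched in a few lines. Only the three models, the hundred million calls and the roughly fiftyfold claim come from the text above; the tokens per call and the per-token price are placeholder assumptions chosen to make the sums easy, not real vendor pricing, and the sketch reads "a hundred million times" as a hundred million calls to each of the three models.

```python
def hosted_cost(calls_per_model, models, tokens_per_call, usd_per_million_tokens):
    """Total hosted spend in USD for a workload billed per token."""
    total_tokens = calls_per_model * models * tokens_per_call
    return total_tokens / 1_000_000 * usd_per_million_tokens

CALLS_PER_MODEL = 100_000_000    # from the text: "a hundred million times"
MODELS = 3                       # from the text: "three open models"
TOKENS_PER_CALL = 1_000          # hypothetical placeholder
USD_PER_MILLION_TOKENS = 1.00    # hypothetical placeholder, not a real price
CLAIMED_REDUCTION = 50           # SIE's claimed factor, taken at face value

hosted = hosted_cost(CALLS_PER_MODEL, MODELS, TOKENS_PER_CALL, USD_PER_MILLION_TOKENS)
self_hosted = hosted / CLAIMED_REDUCTION

print(f"hosted:      ${hosted:,.0f}")
print(f"self-hosted: ${self_hosted:,.0f} (if the claimed factor holds)")
```

Under these placeholder inputs the hosted bill comes to $300,000 and the claimed reduction would bring it to $6,000. The real figures depend entirely on actual token counts, actual prices, and the cost of the cluster itself.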
It also does something less easy to put on a billboard: it makes regulated industries - banks, hospitals, retailers with strict data residency - able to use AI without negotiating data-sharing with a third party. Air-gapped deployment is, in some sectors, not a feature. It is the precondition for getting in the door.
It is a graceful pivot. The framework didn't go away; it became a layer in a larger story.
Timeline
From SDK to stack
Tape & Demo
If you'd rather see the thing
That, in the end, is what Superlinked is selling - the unglamorous gift of a retrieval stack that does what it says.