The unglamorous business of feeding the machines.
It is a weekday afternoon in the Mission District, and somewhere inside a converted building on Shotwell Street, a Kubernetes cluster of H200 GPUs is quietly digesting the public internet. The team that owns these machines calls them the exa-cluster, with the kind of straight-faced literalism that suggests they have other things to worry about. They have raised $85 million this year. They are 120 people. They do not run ads. They sell, of all things, search results to robots.
This is Exa. Until late 2023 it was called Metaphor, which sounded like a literary magazine. The new name is an SI prefix - 10 to the eighteenth - and a quieter promise: that the index they are building is large enough to matter, and small enough to ship. The pitch, stripped of jargon, is that Google was made for humans typing into a box, and that almost no one types into a box anymore. The new users are language models. They read fast, they ask weird questions, and they expect their sources to be clean.
Search was built for clicks. Then the clicks stopped.
For twenty-five years, web search optimized for one thing: a human staring at a page of ten links and choosing one. Everything under the hood - the ranking, the snippets, the ads, the SEO arms race - was tuned to that human moment of indecision. It was, by any measure, a triumph. It also produced an internet that is now almost unreadable by software.
Ask a large language model to "find me the three best papers on retrieval-augmented generation from the last year and summarize their methods," and the model needs more than ten blue links. It needs hundreds. It needs the actual contents of the pages, cleaned of cookie banners and chrome. It needs the results in milliseconds, because an agent that waits eight seconds is an agent that times out. And it needs them ranked by meaning, not by which SaaS company spent the most on backlinks.
EXHIBIT B - A polite way of saying the modern web is now mostly read by software.
This is the gap Exa walked into. The founders - William Bryk and Jeffrey Wang, two Harvard freshmen who became roommates and then co-founders - watched the early GPT demos in 2022 and reached a tidy conclusion. If models were going to become the primary readers of the internet, then somebody needed to build them a library card. The incumbents would not do it; their incentives were too tangled with the ad economy. So Bryk and Wang would.
Two co-founders, one GPU cluster, and a stubborn opinion about meaning.
Bryk had done ML research at Harvard, led the robotics club, written part of a book on the history of civilization, and worked briefly at the ML startup Cresta. Wang had spent three years at Plaid building data and web infrastructure, and had run a GPU cluster at Harvard. They were, by the standards of San Francisco founder bios, unusually well-prepared for what they were about to do, which was: re-rank the open web by meaning.
The wager looked like this. Take the entire crawlable internet. Pass it through embedding models trained specifically on next-link prediction - the simple task of guessing which page a thoughtful human would link to next. Compress what you learn into dense vectors. Stand up an API. Charge by the call. If the bet was right, the resulting index would understand questions like "papers similar to this one but with cleaner empirical sections" without anyone telling it what those words meant.
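To make the vector step concrete, here is a minimal sketch in Python of ranking pages by cosine similarity between dense vectors. It assumes the embeddings already exist; the three-dimensional vectors, the URLs, and the function names are invented for illustration, and nothing here is Exa's actual model or code. Cosine similarity is a common way to compare embeddings, not one the company is known to use.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)

def rank_by_meaning(query_vec, index, top_k=3):
    """Return the top_k (url, score) pairs, most similar first."""
    scored = [(url, cosine(query_vec, vec)) for url, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical toy embeddings; real ones have hundreds or thousands of dimensions.
TOY_INDEX = {
    "example.com/rag-survey": [0.9, 0.1, 0.0],
    "example.com/sourdough": [0.0, 0.2, 0.9],
    "example.com/dense-retrieval": [0.8, 0.3, 0.1],
}

query_vec = [1.0, 0.2, 0.0]  # stand-in for an embedded question about retrieval
results = rank_by_meaning(query_vec, TOY_INDEX, top_k=2)
top_urls = [url for url, _ in results]
print(top_urls)  # ['example.com/rag-survey', 'example.com/dense-retrieval']
```

The sourdough page shares no keywords-versus-meaning ambiguity here; it simply points in a different direction in vector space, so it falls out of the top two.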
Y Combinator wrote the first check. Lightspeed and Nvidia followed in a $17 million Series A in July 2024. Then, in September 2025, Benchmark led an $85 million Series B at a $700 million valuation, with Lightspeed and Nvidia's NVentures returning. The money is not for marketing - Exa famously does not market - but for GPUs and for the unglamorous work of crawling, cleaning, and indexing more of the web, faster.
A short timeline, mostly told in funding rounds.
Because in this industry, that is how time is measured.
What you actually buy, if you buy it.
From a developer's seat, Exa is four endpoints and a dashboard. The Search API returns between ten and ten thousand links per query, ranked by neural relevance, keyword, or a hybrid of the two. The Contents API hands back the cleaned text of any page, no scraping required. The Answer API does the obvious thing - it stitches search and an LLM together into a single grounded reply. And Websets, the most recent addition, behaves less like a search engine and more like a research analyst: ask it for "Series A biotech companies founded by women in Boston since 2022," wait a beat, and receive a structured list.
Search API: Neural, keyword, and hybrid retrieval. 10 to 10,000 results per call.
Contents API: Clean page text, highlights, summaries. No HTML soup.
Answer API: LLM-grounded answers tied to live web sources. Hallucination, reduced.
Websets: Curated entity collections. A research analyst that does not bill hourly.
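How a "hybrid" of neural and keyword ranking might be blended is not something this piece can confirm for Exa. One widely used technique is reciprocal rank fusion, in which each document scores 1 / (k + rank) in every list it appears in and the scores are summed. The sketch below assumes that technique purely for illustration; the URLs and the two input rankings are made up, and k = 60 is the conventional default rather than a known Exa setting.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of URLs into one, best first.

    Each URL earns 1 / (k + rank) per list it appears in (rank starts at 1).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, url in enumerate(ranking, start=1):
            scores[url] += 1.0 / (k + rank)
    return sorted(scores, key=lambda url: scores[url], reverse=True)

# Hypothetical outputs of a neural ranker and a keyword ranker for one query.
neural = ["a.example/paper", "b.example/blog", "c.example/docs"]
keyword = ["c.example/docs", "a.example/paper", "d.example/forum"]

fused = reciprocal_rank_fusion([neural, keyword])
print(fused)
# ['a.example/paper', 'c.example/docs', 'b.example/blog', 'd.example/forum']
```

Pages that both rankers like rise to the top; pages only one ranker found still survive, lower down.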
The point is not the endpoints, which any company can write. The point is the index sitting behind them, and the embeddings the index has been put through. That is the part you cannot copy with a weekend and a docs page.
The funding curve, drawn in money.
Cumulative capital raised by round, USD millions
Sources: TechCrunch, Crunchbase, FinSMEs. Bars scale to cumulative total.
Who actually uses this thing.
Exa has the polite problem of having to brag without bragging, because most of its customers prefer to be discreet. The publicly named ones include Cursor, the AI code editor whose in-product web search runs on Exa. Beyond that, the customer page lists private equity firms, top-tier consultancies, and AI startups that would rather be described as "leading" than named. The internal joke is that Exa is the most-used search engine you have never used.
The use cases cluster into three. There is RAG, where Exa supplies fresh, factual context to LLM apps that would otherwise hallucinate confidently. There is research automation, where agents fan out across hundreds of queries to compile competitive landscapes and market maps. And there is enterprise knowledge work, where analysts use Websets to do in fifteen minutes what used to take an associate three days.
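The first two patterns share a shape: fire off several queries, merge what comes back, and hand the model a numbered list of sources. A rough sketch of that shape follows. The `fake_search` function is a hypothetical stand-in that returns canned results so the example runs offline; in practice it would be replaced by a real search call, and every name and URL here is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_search(query):
    """Hypothetical stand-in for a search call; returns canned (url, snippet) pairs."""
    canned = {
        "vector database vendors": [
            ("a.example", "Vendor A overview"),
            ("b.example", "Vendor B pricing"),
        ],
        "vector database benchmarks": [
            ("b.example", "Vendor B pricing"),
            ("c.example", "Benchmark results"),
        ],
    }
    return canned.get(query, [])

def fan_out(queries, search=fake_search, max_workers=4):
    """Run queries concurrently and merge results, keeping the first hit per URL."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(search, queries))  # map preserves query order
    merged = {}
    for batch in batches:
        for url, snippet in batch:
            merged.setdefault(url, snippet)  # drop duplicate URLs
    return merged

def build_context(merged):
    """Format merged results as a numbered source list for an LLM prompt."""
    return "\n".join(
        f"[{i}] {url}: {snippet}"
        for i, (url, snippet) in enumerate(merged.items(), start=1)
    )

merged = fan_out(["vector database vendors", "vector database benchmarks"])
context = build_context(merged)
print(context)
# [1] a.example: Vendor A overview
# [2] b.example: Vendor B pricing
# [3] c.example: Benchmark results
```

The numbered list is what lets a grounded answer cite its sources instead of improvising them.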
EXHIBIT C - Three days of associate billable hours, compressed.
None of this is glamorous. It is plumbing. But it is the kind of plumbing that a meaningful fraction of the AI industry now depends on, which is why Benchmark led a round at a $700 million valuation for a company that, by SaaS standards, is still small.
"Perfect search," and other ambitious phrases.
Ask Bryk what Exa is for, and he will say, more or less, "perfect search." He means a comprehensive, unbiased, controllable index of the web - one without the gravitational pull of an ad business or the cumulative scar tissue of twenty-five years of SEO. He also means something larger and more uncomfortable, which is that he believes the quality of information humans and AIs consume now determines the quality of civilization, and that this is no longer a metaphor.
Whether you find this stirring or slightly ridiculous probably says more about you than about Exa. The company itself behaves consistently with the larger claim. They have refused to add advertising to results. They sell to developers and enterprises, not to consumers. They publish a brand book whose logo, an hourglass, is meant to evoke the flow of information being filtered and delivered. The whole operation has the slightly Quaker quality of a research lab that has stumbled into a business.
Why this matters, even if you never type a query.
The agentic web is real and arriving faster than the consumer web did. Within the next few years, a sizeable share of internet traffic will not be a human at a keyboard but software acting on someone's behalf - booking flights, comparing prices, summarizing earnings calls, writing code by reading other code. Every one of those agents needs a retrieval layer. Most will not build their own. The companies that supply that layer will be the AWS of a much weirder internet.
Exa's bet is that the retrieval layer for AI looks fundamentally different from the retrieval layer for humans, that owning it requires owning the embeddings and the index together, and that being a small, unbothered lab in San Francisco with a private GPU cluster is the right shape of company to do it. There are credible competitors. Google has its own ambitions. Perplexity is loud about consumer search. But Exa is doing the boring, profitable thing - selling shovels in the back office, while the rest of the industry argues about who is the AI Google.
Back on Shotwell Street, the cluster is still chewing through pages. The team is still small enough to fit in a single Slack channel. Somewhere a customer's research agent has just fired off its eight hundredth query of the morning and received, in less than half a second, a ranked list of links it can actually use. The robot is satisfied. It moves on. Nobody, in the human sense, noticed. That is, more or less, exactly the point.