Humata - The AI That Cites Its Sources

The Business of Reading

A Company That Sells Answers With Receipts

Here is a small, slightly uncomfortable fact about the modern organization: it owns an enormous quantity of documents that no human being will ever read. Contracts, filings, research papers, medical records, 400-page diligence rooms - all of it dutifully stored, almost none of it actually read. The information is technically "accessible," in the sense that you could open the file, and technically useless, in the sense that you won't. Humata, a roughly twelve-person company in Austin, Texas, has built a business on this gap. Its product reads the files for you, answers questions about them in plain English, and - this is the part that matters - tells you exactly where it found the answer.

The pitch is often compressed to four words: "ChatGPT for your files." That is directionally correct and also slightly unfair, because the interesting thing about Humata is not that it chats. Lots of things chat. The interesting thing is that when Humata gives you an answer, it hands you a clickable citation that scrolls you to the exact passage in the source PDF. You ask, it answers, and then it shows you the sentence it was looking at when it decided that was the answer. In a category where the standard failure mode is a confident, fluent, entirely fabricated response, "here is where I got it" is not a garnish. It is the product.

Think about who actually needs this. A lawyer cannot cite an answer she can't trace. A compliance officer cannot file something because a language model felt strongly about it. A researcher reading forty papers needs to know which paper said the thing. For these people, an AI answer without a source isn't a faster version of their job - it's a liability with better grammar. Humata's citation-first design is, in effect, a bet that the buyers who matter most are precisely the ones who cannot use the un-cited stuff.

“Humata is like ChatGPT for your files - it reads every file you upload and generates answers based on the content of your documents.” - How the company describes the product

The Founders

A Chess Champion, a Stanford Researcher, and a Reading Problem

Humata was founded in 2022 by Cyrus Khajvandi and Dan Rasmuson, two people whose resumes are more colorful than the average seed-stage cap table. Khajvandi, the CEO, is a Stanford alum and a repeat founder - he previously co-founded the Y Combinator-backed Dnovo and served as COO of Passfolio, a brokerage that was acquired. Rasmuson, the CTO, previously co-founded and ran engineering at Labelbox, an AI-infrastructure company that reached a billion-dollar valuation labeling the data that models train on. He is also a Forbes 30 Under 30 honoree and, for good measure, a national chess champion.

There is a tidy logic to Rasmuson's trajectory. His first company helped machines learn from structured, labeled data. His second is about the opposite problem: the vast pile of unstructured data - the documents - that never got labeled and never will. If you want a one-line thesis for where AI goes next, "the people who built the plumbing are now building the reading comprehension" is not a bad one.

The company's stated mission is refreshingly short. Not a paragraph, not a manifesto - three words: make you wiser. It is the rare mission statement that fits on a sticky note and that a customer might actually repeat, which is roughly the test for whether a mission is real or decorative. The product and the mission turn out to be the same sentence: read faster, report faster, and always know where the knowledge came from.

The Money

When Google and Cathie Wood Agree

In October 2023, Humata announced a $3.5 million seed round led by Gradient Ventures, Google's AI-focused fund, with participation from Cathie Wood's ARK Invest, the firm M13, and a handful of angels. This is a slightly unusual pairing. Gradient is a deep-tech, AI-native investor. ARK is a growth-and-narrative shop famous for very public conviction bets. They do not land on the same small seed round every day, and when they do, it usually means the thing they're buying looks less like a feature and more like a layer - infrastructure that a lot of other software will eventually sit on top of.

ARK went as far as publishing an investment thesis on the company, which is a polite way of saying they wanted their reasoning on the record. The reasoning, roughly: unstructured documents are a giant, mostly untapped substrate of enterprise knowledge, and the tool that lets people query that substrate in natural language - reliably, with sources - is standing on something durable rather than renting attention from someone else's model.

“Humata can query larger databases of documents more efficiently than ChatGPT, all the while documenting where it derived its answers.” - On what separates Humata from a general chatbot

The Product

Upload. Ask. Cite. Repeat.

Mechanically, the loop is simple enough that the whole company can be explained at a dinner party. You upload a document - or many. Humata reads and embeds it. You ask a question in the same language you'd use to ask a colleague. It returns an answer, a summary if you want one, and a citation you can click to verify. Underneath that simplicity is the usual stack of unglamorous engineering: semantic search across many files at once, OCR for scanned images, summarization for the documents too long to skim, and collaboration so a team can work the same set of files.
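The loop above can be sketched in a few lines. This is an illustration only, not Humata's implementation: the `Passage` record and the `ask` and `cite` helpers are hypothetical names, and a plain word-count vector stands in for the embedding model a real system would use. What it shows is the design principle, which is that the retrieved passage carries its document name and page number with it, so the answer always arrives with a pointer back to the source.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Passage:
    doc: str   # file name of the source document (hypothetical field)
    page: int  # 1-based page number within that document
    text: str  # the passage text itself


def vectorize(text: str) -> Counter:
    # Assumption: a sparse word-count vector stands in for a learned embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ask(question: str, passages: list[Passage]) -> tuple[Passage, float]:
    # Return the best-matching passage together with its similarity score.
    if not passages:
        raise ValueError("no passages to search")
    q = vectorize(question)
    scored = [(cosine(q, vectorize(p.text)), p) for p in passages]
    score, best = max(scored, key=lambda pair: pair[0])
    return best, score


def cite(p: Passage) -> str:
    # The "receipt": where the supporting text lives.
    return f"{p.doc}, p. {p.page}"


if __name__ == "__main__":
    library = [
        Passage("lease.pdf", 3, "The tenant shall pay rent on the first day of each month."),
        Passage("lease.pdf", 7, "Either party may terminate this lease with sixty days written notice."),
        Passage("handbook.pdf", 12, "Employees accrue fifteen days of paid vacation per year."),
    ]
    best, score = ask("How much notice is needed to terminate the lease?", library)
    print(best.text)
    print("Source:", cite(best))
```

A production system would swap the word counts for dense embeddings and pass the retrieved passage to a language model to phrase the answer, but the citation would still come from the same place: metadata stored alongside each passage at upload time.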

The pricing tells you who it's for, which is basically everyone who reads for a living. There's a free tier for the curious, a $1.99-a-month student plan, a $9.99-a-month Expert plan with the full GPT-4 experience, a Team plan at $49 per user per month with OCR and management controls, and custom-priced Enterprise plans with unlimited users, encryption, SAML SSO, role-based access, and dedicated support. The broke graduate student and the publicly traded enterprise get the same core loop - upload, ask, cite - and the company charges where the budget is, which is a sensible way to build a wedge.

The Enterprise tier is where the interesting business lives. Humata's "data rooms" are encrypted, access-controlled workspaces where a team's sensitive files stay private while remaining conversational. This is the version of the product that lawyers and hospitals and finance teams can actually adopt, because it treats security and provenance as prerequisites rather than afterthoughts - which, again, is the whole citation-shaped philosophy of the company expressed in a different form.

Document Q&A

Ask questions across your files in natural language; get answers with clickable citations to the exact source passage.

AI Summarization

Instant summaries of long, dense documents so you grasp the key points without reading every page.

Semantic Search

Search by meaning, not keywords, across an entire library of uploaded documents at once.

Enterprise Data Rooms

Encrypted, access-controlled team workspaces with OCR, SSO, and role-based permissions.

The Competition

A Crowded Room Where Provenance Is the Differentiator

Humata does not have this space to itself. ChatPDF, AskYourPDF, Adobe's Acrobat AI Assistant, Perplexity, Glean, and plain old ChatGPT-with-file-uploads all gesture at some version of "talk to your documents." In a field like that, the tempting move is to compete on model horsepower or feature count, both of which reset every few months when a new foundation model ships. Humata's chosen ground is stickier and less flashy: trust. If your buyers are in regulated work, the tool that can always answer "where did this come from?" is not competing on the same axis as the tool that just answers fast. That is a quieter bet than raw benchmark bragging, and probably a more defensible one.

The Bigger Idea

Documents as a Database Nobody Has Queried

There is a useful reframe hiding inside Humata's product, and it's worth stating plainly because it explains why sober investors got excited about a tool that, described flatly, "answers questions about PDFs." An organization's archive is not dead weight. It's a database - just one no one has ever been able to query, because the interface for it was "open the file and read it yourself." Structured data got dashboards and SQL decades ago. Unstructured data - the contracts, the filings, the research, the memos - got a folder and a prayer. The bet is that natural-language querying, with citations attached, is finally the interface that makes that pile addressable. That is a much larger claim than "chat with a PDF," and it's the version ARK put in writing.

If the reframe holds, the value compounds with scale. A single document is easy enough to skim yourself. A thousand of them, spread across a decade and forty folders, is exactly where a human gives up and where a citation-backed reader earns its keep. That is why Humata's enterprise motion - unlimited files, secure rooms, team access - matters more than the free tier that gets people in the door. The free plan sells the idea. The data room sells the outcome.

None of this guarantees the outcome. Twelve people and $3.5 million is a slingshot, not a fortress, in a category where much larger companies are pointed at the same problem. But the shape of Humata's bet is coherent in a way that's rare at seed stage: a short mission, a specific buyer, a design principle - show your work - that runs through the product, the security model, and the pitch alike. The company is not trying to be the smartest reader in the room. It's trying to be the one you can check. For anyone who has ever been handed a confident answer and had no way to verify it, that is a surprisingly large promise.

How Humata Works

Upload

Read

Ask

Cite

Pricing, By Monthly Page Ceiling

Who Built It

Cyrus Khajvandi

Dan Rasmuson

Demos & Interviews

Ask AI Anything with Humata - Product Demo

Quick Demo: Summarize & Chat with Any PDF

Founder Fireside Chat with ARK Invest

Find Humata