The quiet API turning the world's paperwork into structured data - one billion pages at a time.
Somewhere on the 4th floor of an office tower at 465 California Street, a server quietly inhales a 312-page insurance claim with handwritten margin notes, a fax cover sheet from 1998, and a spreadsheet whose author retired in 2014. Eleven seconds later, it spits out JSON. Clean keys. Typed values. Reading order intact. The tables - the impossible, three-merged-cell, footnoted tables - parsed. This is what Pulse does. It has done it about a billion times.
Every large language model demo is glittering and easy. The unglamorous truth is that the demo works because someone, somewhere, parsed the PDF first. RAG pipelines are only as smart as the parser feeding them. Hallucinations are often not hallucinations at all - they are punishment for upstream sloppiness.
Pulse is the upstream. The company was founded in 2024 by Sid Manchkanti and Ritvik Pandey - two engineers who, between them, left Tesla, NVIDIA, D.E. Shaw, and Goldman Sachs to build OCR, of all things - and it trained its own vision-language model from scratch. The bet: that the data-ingestion layer of the AI stack is not a commodity, and that the company that owns it owns a quiet kingdom.
So far the bet is holding. The API now sits inside Fortune 10 enterprises and AI-native startups in finance, healthcare, insurance, legal, real estate, and supply chain. Samsung uses it. Cloudera uses it. Howard Hughes uses it. UC Berkeley uses it. Most of them never tweet about it.
That is the Pulse personality in a sentence: shipped, deployed, indispensable, unbothered.
Chart: indexed vertical mix. YesPress estimate from public materials; not audited.
Send a PDF, Word, Excel, image, or scan. Receive structured JSON ready for an LLM, a database, or a human who finally has their afternoon back.
An in-house vision-language model purpose-built for documents and spreadsheets. Layout detection, OCR, reading order, table parsing, chart conversion.
Invoices, tax forms, clinical notes, financial statements, contracts. Define the shape you want; Pulse extracts to it.
Cloud API, VPC-isolated, on-prem, Docker, Kubernetes. SOC 2 Type II, ISO 27001, GDPR, HIPAA BAA. Built for the buyer who reads every clause.
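The "define the shape you want" idea above can be sketched in a few lines. This is only an illustration of what schema-driven extraction looks like from the consumer's side, not Pulse's API: the `Invoice` fields, the `parse_invoice` helper, and the sample payload are all hypothetical and do not come from Pulse's documentation. It assumes the extraction step has already returned a JSON string.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


# Hypothetical target shape for an invoice. The field names are
# illustrative only and are not taken from Pulse's documentation.
@dataclass(frozen=True)
class Invoice:
    invoice_number: str
    issue_date: date
    total: Decimal
    currency: str


def parse_invoice(payload: str) -> Invoice:
    """Check a JSON string against the Invoice shape and type its values."""
    data = json.loads(payload)
    expected = {"invoice_number", "issue_date", "total", "currency"}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return Invoice(
        invoice_number=str(data["invoice_number"]),
        # ISO 8601 dates ("YYYY-MM-DD") are assumed here.
        issue_date=date.fromisoformat(data["issue_date"]),
        # Decimal avoids float rounding on money amounts.
        total=Decimal(str(data["total"])),
        currency=str(data["currency"]).upper(),
    )


# Made-up example of what an extraction service might return.
sample = (
    '{"invoice_number": "INV-0042", "issue_date": "2024-03-15", '
    '"total": "1284.50", "currency": "usd"}'
)
invoice = parse_invoice(sample)
print(invoice)
```

The point of typing the values on arrival is that a database or downstream model receives a date and a decimal amount rather than strings it has to guess at, and a missing field fails loudly instead of silently.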
Sid Manchkanti: UC Berkeley CS. Previously at NVIDIA and D.E. Shaw. Runs the company from San Francisco and answers email at sid@runpulse.com - which is itself a small data point about Pulse.
Ritvik Pandey: Georgia Tech CS and Math. ML work at Tesla, plus a stint at Goldman Sachs. Leads the vision model and inference stack.
"Pulse" - the heartbeat of structured data inside the enterprise. A small joke that the company takes seriously.
Tesla, NVIDIA, D.E. Shaw, Goldman, AWS, Berkeley, Georgia Tech - in a team of 33. Density over scale.
A sub-35-person company building a foundation model for documents is rare. Pulse did it anyway.
Back at 465 California Street, the server has moved on. The claim is JSON now. A model downstream summarizes it, an adjuster reviews it, a payment clears. The fax cover sheet from 1998 is, somewhere in a row of a database, finally machine-readable. Nobody at Pulse will tweet about this particular page. There are 999,999,999 others behind it, and a few more arriving each second. The unsexiest problem in AI, getting quietly handled - on Cloudflare DNS, NVIDIA TensorRT, Kubernetes, and a vision model that did not exist two years ago. The work, as Pulse seems to prefer, speaks for itself.