Sammy Sidhu

Dispatch / The Profile

The man's spreadsheet doesn't care if it's a photo, a podcast, or a LiDAR scan.

Sammy Sidhu builds the unglamorous thing every glamorous AI app secretly depends on. His company, Eventual, makes Daft - an open-source engine that treats a video clip, a 3D point cloud, and a paragraph of text as if they were all just rows in a table. Type a few lines of Python. The petabytes sort themselves out.

Daft is the boring miracle. Open the docs and it looks like the DataFrame any data scientist has used a thousand times. Underneath, it is doing something the old tools never could: handling images, audio, video and sensor data at the same scale and with the same shrug as a column of numbers. Sidhu's ambition for it is stated plainly and without hedging - he wants Daft to be "as transformational to unstructured data infrastructure as SQL was to tabular datasets." That is a large claim. He has the resume to make it without laughing.

Today he is co-founder and CEO of Eventual, the company he started in early 2022 with Jay Chia. They put it through Y Combinator's Winter 2022 batch and shipped the first open-source version of Daft that same year - months before ChatGPT arrived and made "multimodal" a word that venture capitalists could pronounce. By mid-2025 the company had raised roughly $30 million, with a seed led by CRV and a Series A led by Felicis, joined by Microsoft's M12 and Citi. Daft, meanwhile, had quietly become load-bearing: petabytes a day flowing through it at Amazon, CloudKitchens, Together AI and Essential AI.

"We had all these brilliant PhDs working on autonomous vehicles, but they're spending like 80% of their time working on infrastructure rather than building their core application." - Sammy Sidhu, to TechCrunch

That sentence is the whole company in one breath. Before Eventual, Sidhu and Chia were engineers inside Lyft's self-driving program, Level 5. Self-driving cars are data factories that never sleep - photos, 3D scans, audio, text, all of it messy, all of it enormous. The two of them built an internal tool to wrangle it. It worked. Then Sidhu went looking for his next job, and a funny thing kept happening in interviews: the people across the table stopped interviewing him and started asking whether he'd come build that same data tool for them. After enough of those conversations, the job hunt ended. The company began.

The timing reads like luck. It wasn't, quite. Eventual was founded before the generative-AI explosion, on a conviction that the world was about to drown in unstructured data whether or not it had a chatbot moment. When the moment came, Sidhu watched it confirm the thesis in real time. "The explosion of ChatGPT - what we saw is just a lot of other folks who are then building AI applications with different types of modalities," he said. "Then everyone started using things like images and documents and videos in their applications." The flood arrived. Daft was already standing there with a bucket.

A career assembled from hard problems

Sidhu grew up in the San Francisco Bay Area and stayed close to home for the part that mattered: UC Berkeley, EECS, with a focus on machine learning and distributed systems. He didn't just take the famous courses - he taught them. He was a teaching assistant for CS61A and CS61B, the introductory gauntlets that every Berkeley computer scientist remembers with a mix of love and trauma, plus the databases course CS186/286 and the microelectronics course EE40. The habit of explaining a hard idea until it becomes obvious is not a marketing trick he learned later. It's where he started.

As a graduate researcher he worked under Kurt Keutzer in the ASPIRE and BAIR labs, living at the seam where deep learning meets high-performance computing - with a detour into medical AI. The throughline of his whole career is right there: make the models, and make the machines underneath them go fast. He has more than a dozen patents and publications to show for it, including work presented at NeurIPS.

Then came the industry tour that reads like a map of where machine learning actually got built. There was a stint in high-frequency trading on Wall Street, where milliseconds are money and sloppy infrastructure is bankruptcy. There was DeepScale, where he was a founding-team member and Chief Architect, building computer-vision models and deep-learning compilers and helping grow the company from four people to thirty before Tesla acquired it and folded the work into Autopilot. At Lyft Level 5 he was a perception lead, designing an ML platform for more than a hundred engineers and squeezing pipelines hard enough to save the company a reported $10M-plus a year. At Woven Planet, Toyota's autonomous-driving arm, he was a senior staff engineer leading sensor-fusion research for vision-based self-driving.

4→30: DeepScale team, before Tesla

~$30M: raised for Eventual

12+: patents & publications

PB / day: processed through Daft

Notice the pattern. Every job on that list was a place where the data was too big, too weird, and too important to handle with off-the-shelf tools. Trading floors, self-driving stacks, perception pipelines - all of them forced the same lesson: the bottleneck is almost never the model. It's everything you have to do to feed it. Eventual is what happens when someone who learned that lesson four times in a row decides to fix it once, in the open, for everybody.

"[The goal is to make Daft] as transformational to unstructured data infrastructure as SQL was to tabular datasets." - Sammy Sidhu

There is a quiet stubbornness in choosing open source for something this foundational. Eventual could have built a closed product and sold access to it. Instead Daft lives on GitHub under the handle a generation of his students would recognize - samster25 - free for anyone to read, fork, and run. The bet is that infrastructure this important should be a public road, not a toll booth, and that a company can still do very well building the on-ramps. It is the kind of bet you make when you've spent your career on the engineer's side of the table and remember exactly what it felt like to be blocked.

Why this one is different

The data world has no shortage of engines. Spark exists. Warehouses exist. What Daft is chasing is the part everyone else treated as someone else's problem: the modalities that don't fit in a cell. A column of integers is easy. A column where every entry is a ten-second video is not - and yet that is exactly the shape of the data the AI era runs on. By making that column behave like any other, Sidhu is trying to delete the 80% tax he watched his colleagues pay, the hours lost to plumbing instead of building. If he's right, the engineers who use Daft will never think about it much. That's the goal. The best infrastructure is the kind you forget is there.
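To make the idea concrete, here is a minimal sketch in plain Python. It is not Daft's API and makes no claim about how Daft works internally; the `Clip` type, the `rows` table, and the `with_column` and `where` helpers are all hypothetical names invented for illustration. It shows only the shape of the idea: a column whose entries are media objects gets transformed and filtered with the same operations as a column of integers.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a video clip; this is not a Daft type.
@dataclass(frozen=True)
class Clip:
    uri: str
    seconds: float


# A tiny "table": each row pairs an ordinary integer with a media object.
rows = [
    {"id": 1, "clip": Clip("clips/a.mp4", 10.0)},
    {"id": 2, "clip": Clip("clips/b.mp4", 4.5)},
    {"id": 3, "clip": Clip("clips/c.mp4", 10.0)},
]


def with_column(table, name, fn):
    """Return a new table with column `name` computed per row by `fn`."""
    return [{**row, name: fn(row)} for row in table]


def where(table, predicate):
    """Keep only the rows for which `predicate` returns True."""
    return [row for row in table if predicate(row)]


# The clip column is handled with the same two operations as any other column.
table = with_column(rows, "duration", lambda r: r["clip"].seconds)
long_clips = where(table, lambda r: r["duration"] >= 10.0)
long_ids = [r["id"] for r in long_clips]
```

A real engine would also have to decode the media, distribute the work, and manage memory, which is exactly the plumbing the sketch leaves out.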

He still does the teaching part, just to bigger rooms now - speaking at the PyData Global conference, at Databricks' Data + AI Summit 2025, at developer summits across Europe, all of it variations on the same talk he'd give a freshman: here is a hard thing, here is why it's hard, here is how it gets simple. The audience changed. The instinct didn't.

Where it goes from here is the open question. The company has the funding, the customers, and a thesis the market keeps validating on its behalf. The harder test is the one Sidhu set himself - the SQL comparison. SQL didn't win because it was clever. It won because it became invisible, the assumed default, the thing nobody argues about. That's a fifty-year shadow to chase. But it's a useful kind of impossible goal: even a fraction of it would reshape how a lot of the world handles its messiest, most valuable data. And Sammy Sidhu has already spent his whole career proving he's comfortable working on the parts other people would rather skip.

In His Own Words

"Everyone started using things like images and documents and videos in their applications."

ON THE MULTIMODAL WAVE

"Brilliant PhDs... spending like 80% of their time working on infrastructure rather than building their core application."

ON THE PROBLEM HE'S SOLVING
