Umut Isik

Who he is now

Voice AI for the loudest place on earth

Most AI demos happen in a quiet room. Umut Isik's product lives at a drive-thru speaker box - the acoustic worst-case, where an engine idles, a kid yells in the back seat, and three lanes of orders bleed together.

That is the problem Incept AI exists to solve. Isik co-founded the New York company in 2023 with Justin Foster, and the pitch is deceptively plain: build voice AI that actually works outside the lab. The phrase he uses for it is the "last mile" of voice AI - the gap between a transcript that looks fine on a slide and a transcript that holds up when someone mumbles "no pickles" through a tinny microphone in a thunderstorm.

Restaurants are the proving ground. Incept's system takes drive-thru and phone orders, handles the combinatorial nightmare of menu modifiers, suggests the upsell, checks what is out of stock, and drops a finished order into the point-of-sale. Underneath sits the Incept Neural Engine, audio networks built to strip out noise, acoustic echo, and overlapping speakers before a foundation model ever sees the words. The company integrates with the plumbing of the industry - Toast, Square, PAR, HME - and reports 95%+ AI-only completion at roughly 812 milliseconds of conversational latency.

"AI voice agents have handled restaurant orders for years, but no provider has achieved 97%+ accuracy without staff involvement."

In February 2025 the bet got funded: a $3 million pre-seed led by Rally Ventures, with 10VC along for the ride. Ben Fried, Google's former CIO, took a board seat. By then Incept had been live since May 2024 and was running pilots across chains with more than a thousand locations between them. One coffee-chain CTO put it bluntly in a customer note: every new store starts with Incept from day one.

The strange specific

A geometer who learned to listen

Here is the detail that explains the rest of him: before the citation that reads "Better speech enhancement with frequency-positional embeddings," there is one that reads "Equivalence of the derived category of a variety with a singularity category." Same author. The second has been cited 120 times. It is pure algebraic geometry, the kind of mathematics with no obvious use and a deep internal beauty.

Isik earned his Ph.D. in mathematics at the University of Pennsylvania, then spent years inside the field - including a turn as a Visiting Assistant Professor at UC Irvine - working on algebraic geometry, category theory, and categorical complexity. His Google Scholar page carries the receipts of that double life: roughly 1,446 citations and an h-index of 16, split between abstract algebra and applied deep learning.

"I am a founder and the CEO of Incept AI, where we build voice AI for the real world."

The bridge between the two careers was audio. As a Principal Applied Scientist at Amazon Web Services, Isik turned the rigor of a theorem-prover loose on a much messier object: sound. His most-cited papers are foundational speech-enhancement work - Attention Wave-U-Net, PoCoNet, channel-attention dense U-Net - the literature of teaching a network to pull a clean voice out of a dirty signal. It is the exact problem a drive-thru hands you, dressed in a lab coat. When he left to start Incept, he was not switching fields. He was taking the same problem out of the building.

The signal in the noise

Voice AI gets graded on a curve in quiet rooms. The harder a room gets, the wider the gap between providers. Here is the spread Incept is chasing - the move from "good enough on a slide" to "good enough in the rain."

AI-only orders: 95%+

Peak accuracy: 97%+

Faster service: 21%

Latency budget: 812 ms

Figures self-reported by Incept AI.

The quirk

He makes art out of math

Somewhere between the theorems and the neural nets, Isik built mathvas.com - a tool that turns simple mathematical functions into images. He calls it a new medium for artistic and mathematical expression, and he has used it in math circles and undergraduate workshops. His pieces showed at the mathematical art galleries tied to the Joint Mathematics Meetings.

Two Distributions

PRINT · 2016 · 29×39 CM

Uses probability distributions as "colors" - a uniform distribution inside a circle set against a bimodal one outside, the two blurring where the sampling rate climbs.

The Fish

PRINT · 2016 · 28×43 CM

Algebraic curves slice the plane into regions, each filled with distributions that blend strong and faint color across the whole composition.

Selected papers

The other body of work

Attention Wave-U-Net for speech enhancement (2019) - 187 citations
Channel-attention dense U-Net for multichannel speech enhancement (2020) - 139 citations
PoCoNet: better speech enhancement with frequency-positional embeddings (2020) - 133 citations
A perceptually-motivated approach for low-complexity, real-time enhancement of fullband speech (2020) - 132 citations
Equivalence of the derived category of a variety with a singularity category (2013) - 120 citations

