Incept AI raises $3M pre-seed led by Rally Ventures · Order completion without a human: 95%+ · Live conversation latency: 812ms · Total service time 21% faster · Founded 2024 in New York · Integrates with Toast, Square, PAR, HME · Former Google CIO Ben Fried joins board
Company Profile · Voice AI

Incept AI
hears you over the engine.

A New York startup teaching machines to take your order in the loudest room in retail - the drive-thru.

[Image: Incept AI drive-thru voice AI product]

THE SPEAKER BOX. An idling engine, a kid in the back seat, rain on the roof - the one interview room where nobody sits still and everybody talks at once. Incept built its whole company for this exact five feet of asphalt.

95%+ · Orders, No Human
812ms · Live Latency
$3M · Pre-Seed Raised
21% · Faster Service

Here is a fact that sounds made up but isn't: the hard part of drive-thru AI was never the artificial intelligence. It was the microphone. For years, everyone kept announcing that voice AI was basically solved, and then you would pull up to a speaker box with a diesel engine two feet away and discover that it was not, in fact, solved.

The Premise

The last 12 points are worth more than the first 83

Incept AI's entire business is a spread. Most AI order-takers, by the company's telling, get you to about 83% accuracy and then tap out - handing the remaining 17%, roughly 1 of every 6 cars, to a human. That handoff is where the economics go sideways. A person has to stop what they are doing, put on a headset, and untangle an order the machine gave up on. Multiply that by every lane in every store, all day.

Incept says its system completes 95%+ of orders (the company has cited 97%+ in its own materials) without anyone stepping in. If that number holds at scale, the difference between 83 and 95 is not a rounding error. It is the difference between a novelty at the speaker box and a machine you can actually staff around.
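
To put that spread in concrete terms, here is a back-of-the-envelope sketch. It uses only the two company-reported rates quoted above; the 1,000-order base and the function name `handoffs_per_1000` are our own illustrative choices, not anything Incept publishes.

```python
def handoffs_per_1000(completion_rate: float) -> float:
    """Orders out of every 1,000 that still need a human to step in."""
    return round(1000 * (1 - completion_rate), 1)


# Company-reported figures from the profile; not independently audited.
typical = handoffs_per_1000(0.83)  # typical voice AI before human handoff
incept = handoffs_per_1000(0.95)   # the floor of Incept's 95%+ claim

# Share of human handoffs that disappear when moving from 83% to 95%.
reduction = 1 - incept / typical

print(f"typical: {typical:.0f}, incept: {incept:.0f}, reduction: {reduction:.0%}")
```

On those assumptions, 170 handoffs per 1,000 orders drop to 50, a cut of about 71%. The 12-point gain in completion reads as modest; the drop in headset interruptions does not.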

The reason the company can chase that gap is a boring, unglamorous insight: the bottleneck is audio. Engine hum, acoustic echo, two people talking at once, wind across the mic. So instead of fine-tuning yet another language model, Incept built a neural audio engine to clean the sound before a foundation model ever tries to understand the words.

It is a very specific bet - that owning the messy physical layer is more durable than owning the model on top of it. The models, after all, are increasingly commodities. The noise is forever.

Order completion without a human

Company-reported · higher is better
Incept AI: 95%+
Typical voice AI (before human handoff): ~83%

Figures per Incept AI and industry commentary; independent, audited benchmarks not published.

"There are times I'm amazed the AI can hear what the guest is saying."

Store General Manager · nationwide sandwich chain (customer testimonial)

The People

A founder who kept circling the same problem

CEO and co-founder Umut Isik is an audio scientist by training. Before Incept he was an applied scientist at Amazon Web Services building software to fight background noise and echo - and, in an earlier chapter, he worked on drive-thru audio for McDonald's during an RFP that fed into Apprente (later acquired by McDonald's, then sold to IBM). When someone keeps orbiting the same hard problem across three companies, that is usually a signal. He went and built the company to finish it.

He is joined by co-founder and Chief Revenue Officer Justin Foster, a veteran of quick-service restaurant technology and voice - including time at Presto Automation - who knows the buyers, the headsets, and the operational reality of a lunch rush. The pairing is deliberate: one founder who understands the sound, one who understands the store.

Co-Founder / CEO

Umut Isik

Audio scientist, ex-AWS applied scientist. Spent years on drive-thru noise before founding Incept.

Co-Founder / CRO

Justin Foster

QSR and voice-tech veteran (incl. Presto Automation). Runs go-to-market and restaurant relationships.

Board

Ben Fried

Former Google CIO and Rally Ventures partner; joined Incept's board with the pre-seed round.

What It Does

Clean the sound, then take the order

Incept's stack starts with the Incept Neural Engine - proprietary audio neural networks that strip out background noise, acoustic echo and crosstalk. Only then does cleaned speech get routed to foundation models (the company works with GPT, Gemini and DeepSeek rather than betting on a single one), which handle the conversation, the complex modifiers, and the upsell. The result plugs into the POS and headset systems restaurants already run.
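
The shape of that stack, clean first and interpret second with the model behind an interchangeable interface, can be sketched in a few lines. This is a toy illustration, not Incept's code: every name here (`OrderModel`, `KeywordModel`, `clean_audio`, `handle_car`) is hypothetical, "audio" is faked as text with bracketed noise tokens, and a keyword matcher stands in for a real foundation model.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class OrderModel(Protocol):
    """Any backend that turns a cleaned transcript into order items."""

    def take_order(self, transcript: str) -> list[str]: ...


@dataclass
class KeywordModel:
    """Toy stand-in for an LLM backend: matches menu items by keyword."""

    name: str
    menu: tuple[str, ...]

    def take_order(self, transcript: str) -> list[str]:
        text = transcript.lower()
        return [item for item in self.menu if item in text]


def clean_audio(raw: str) -> str:
    """Placeholder audio stage: drops bracketed noise tokens such as [engine]."""
    words = raw.split()
    return " ".join(w for w in words if not (w.startswith("[") and w.endswith("]")))


def handle_car(
    raw: str, model: OrderModel, clean: Callable[[str], str] = clean_audio
) -> list[str]:
    # Order of operations is the point: clean the signal, then interpret it.
    return model.take_order(clean(raw))


menu = ("burger", "fries", "cola")
backend_a = KeywordModel("backend-a", menu)  # hypothetical backends; swapping
backend_b = KeywordModel("backend-b", menu)  # them leaves the audio stage alone

raw = "[engine] one burger [wind] and fries please"
order_a = handle_car(raw, backend_a)
order_b = handle_car(raw, backend_b)
print(order_a, order_b)
```

The design choice the sketch highlights is the seam: `handle_car` depends only on the `OrderModel` interface, so the backend can change without touching the cleaning stage, which is where the company says its differentiation sits.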

Product

Drive-Thru Voice AI

Takes the order at the speaker box - natural conversation, complex modifiers, heavy noise - without transferring to staff.

Product

Phone Ordering AI

Answers the restaurant phone and handles orders and FAQs over low-quality audio lines.

Platform

Incept Neural Engine

Audio networks that suppress noise, echo and crosstalk before any language model sees the words.

Software

Store Analytics

Dashboards on guest sentiment, order patterns and per-location performance from every interaction.

Revenue

Suggestive Selling

Runs limited-time offers and automated upsell prompts through the voice AI.

Model-agnostic

Foundation Models

Routes to GPT, Gemini or DeepSeek - the LLM is swappable; the audio layer is the moat.

"AI voice agents have handled restaurant orders for years, but no provider has achieved 97%+ accuracy without human intervention."

Umut Isik · Co-Founder & CEO, Incept AI

The Business

Meet the stack where it already lives

Incept sells B2B to quick-service and fast-casual chains. The adoption hurdle is not the AI - it is the switching cost, so Incept integrates with Toast, Square, PAR and HME rather than asking operators to rip anything out. The approach appears to be working: a team of roughly four people landed a late-stage pilot with a restaurant chain of 1,000+ locations (reported elsewhere as 500+) within about a year of founding. Investors noticed. In February 2025 the company closed a $3M pre-seed led by Rally Ventures with participation from 10VC.

  • Model: subscription / usage-based per location, deployed at drive-thru speaker boxes and phone lines.
  • Customers: QSR and fast-casual chains; references from a large coffee chain CTO and a nationwide sandwich chain GM.
  • Competition: Presto Automation, SoundHound AI, Hi Auto, Valyant AI, and in-house programs at large chains.
  • Team: ~12 people, spread across locations; audio-science-led and shipping-focused.

The Trajectory

Fast for a company this small

2024

Incept AI founded in New York by Umut Isik and Justin Foster.

May 2024

System launches and begins reaching 95%+ order completion in live restaurant environments.

Feb 2025

Announces $3M pre-seed led by Rally Ventures with 10VC; former Google CIO Ben Fried joins the board.

Apr 2025

Featured by Food On Demand and Hospitality Technology for its neural-audio approach to drive-thru noise.

Notes In The Margin

Things that stuck with us

  • The CEO first tackled drive-thru audio for McDonald's years before founding Incept - the problem followed him across three companies.
  • The founding thesis is almost contrarian in 2025: the remaining problem in voice AI is sound, not language.
  • Incept treats the LLM as swappable and the audio engine as the moat - the opposite of most AI startups.
  • A team of about four people won a pilot with a 1,000-unit chain. Distribution followed a product that worked.