Breaking
RIME RAISES $5.5M SEED LED BY UNUSUAL VENTURES (MAY 2025) · 100M+ PHONE CONVERSATIONS A MONTH RUN ON RIME · ARCANA V2 SHIPS - 40+ VOICES, FOUR LANGUAGES · RIMECASTER OPEN-SOURCED · SUB-200MS TIME-TO-FIRST-AUDIO IN PRODUCTION · DOMINO'S, WINGSTOP RUN ON RIME VOICES
YesPress / Voice AI / Profile No. 0042

RIME

The voice on the other end of the line is fake. The breathing, the small laugh, the way it pronounces your last name on the second try - that part is Rime.

San Francisco, CA · Founded 2022 · Team: ~28 · Seed: $5.5M · Stage: B2B Voice AI
FIG. 1 - Rime, photographed at the edge of a noisy customer-service line. The voice you didn't notice.
// The Scene

A pizza order at 8:47 p.m. on a Tuesday.

It is raining in Cleveland. A Domino's franchise is taking calls faster than its three teenagers behind the counter can answer. The phone rings and the line picks up. The caller asks for extra cheese, gets the order wrong once, corrects themselves, laughs. The voice on the other end laughs back. The customer never asks if they are talking to a person. That ambiguity, that small unbothered transaction, is the product Rime sells.

Rime is a San Francisco voice AI lab that builds text-to-speech models for enterprise call centers. Its software handles more than 100 million phone conversations a month for brands you have already done business with. The company is roughly 28 people. It raised a $5.5 million seed in May 2025. And it spent the previous three years arguing, quietly and stubbornly, that the way the rest of the industry was building synthetic voices was wrong.

NSFW for robots: Rime trains on real customer-service calls
Not audiobooks. Not actors. Hold music.
Linguistics-first. ML-second.
100M+ Calls / Month
<200ms Time to First Audio
$5.5M Seed Round
40+ Voices Shipped
// The Founders

Three people, one stubborn thesis.

Lily Clifford was supposed to finish a PhD in computational linguistics at Stanford. She did not. She dropped out because, in her telling, she wanted to hack on speech synthesis - specifically for customer support, the unglamorous part of voice where everybody had decided "good enough" was good enough. She co-founded Rime in 2022 with Brooke Larson, a PhD linguist who had worked on Amazon's Alexa, and Ares Geovanos, a Stanford engineer who had been around enough product launches to know which corners not to cut.

CEO

Lily Clifford

Co-founder · Computational Linguistics

Sociolinguistics-trained Stanford dropout. Argues that real speech includes hesitation, and synthetic speech that pretends otherwise sounds, frankly, hostile.

CO-FOUNDER

Brooke Larson

Linguistics · ex-Amazon Alexa

PhD linguist who has already shipped voice at planetary scale. Brought the rigor that turns "vibes" into a model card.

CO-FOUNDER

Ares Geovanos

Engineering & Product

Stanford engineer, product veteran. The person who makes sure the API responds before the customer notices it has not.

"Voices that breathe, laugh, code-switch, and carry the subtle rhythms of real speech." - Rime company page
// The Stack

Two models, one philosophy.

Rime ships two enterprise models. Mist v2 is the workhorse - deterministic, fast, predictable. You teach it how to pronounce "Worcestershire" through the API once, and it does not forget on call number eight million. Arcana v2 is the showpiece. Forty-plus voices across English, Spanish, French and German, trained on actual customer service interactions instead of polished audiobooks. It breathes. It laughs. It can code-switch mid-sentence without sounding like it tripped on the carpet.

Mist v2

Deterministic at scale

Built for high-volume production where pronunciation accuracy must be guaranteed across millions of calls. ~225ms time-to-first-audio on Together AI dedicated endpoints. The model your CFO loves.

Arcana v2

Voices with vibes

Expressive, conversational, multilingual. 40+ voices. On-prem available. Captures breathing, laughter, disfluencies - the small human noise that turns a transcript into a conversation.

Rimecaster

Open source

Released in 2025. A practical speech model built on real-world conversational data. Rime gave it away. The strategic logic is interesting, and we'll let you draw your own conclusions.

Coda

Flagship balanced

Tuned for enterprise speed and concurrency. The model deployed when the answer to "expressive or fast?" is "yes."
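
The latency figures quoted on these cards are time-to-first-audio: the gap between sending text and receiving the first chunk of sound. A minimal sketch of how a team might measure that number for any streaming TTS client follows. The `fake_tts_stream` generator is a hypothetical stand-in that simulates startup delay; it is not Rime's API, and the 50 ms delay is an arbitrary assumption.

```python
import time
from typing import Iterable, Iterator, Tuple


def fake_tts_stream(text: str, startup_delay_s: float = 0.05) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS client (not Rime's API)."""
    time.sleep(startup_delay_s)  # simulated model and network startup
    for word in text.split():
        yield word.encode("utf-8")  # placeholder bytes, not real audio


def time_to_first_audio(stream: Iterable[bytes]) -> Tuple[float, bytes]:
    """Return (milliseconds until the first non-empty chunk, that chunk)."""
    start = time.perf_counter()
    for chunk in stream:
        if chunk:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            return elapsed_ms, chunk
    raise ValueError("stream ended before any audio arrived")


ttfa_ms, first_chunk = time_to_first_audio(fake_tts_stream("extra cheese please"))
print(f"time to first audio: {ttfa_ms:.0f} ms")
```

Because the generator is lazy, the clock starts before any work happens, so the measurement includes the simulated startup cost rather than just the time to read a chunk that already exists.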

// By the Numbers

Where Rime spends its cleverness.

A simplified picture of what makes Rime different from the dozen other TTS labs raising seed rounds this year. The pattern, in one chart:

Linguistic depth: 96%
Production focus: 92%
Latency obsession: 88%
Expressivity: 84%
Consumer flash: 32%

FIG. 2 - YesPress estimate, based on public materials. Rime's brand is not built for TikTok demos. It is built for procurement.

// What You Can Build

If you have a phone number, you have a use case.

IVR That Doesn't Hate You

Replace the menu maze with a voice agent that listens, confirms, and routes - in under 200 milliseconds.

Drive-Thru and Phone Ordering

The exact deployment Domino's and Wingstop are running. Take orders. Upsell. Handle the awkward pause.

Healthcare Back-Office

Appointment confirmations, refill calls, post-discharge check-ins. HIPAA-compliant, on-prem option available.

Conversational Agents

Wrap Rime around any LLM. The model talks like a person; the agent thinks like a colleague.

Multilingual Support

English, Spanish, French, German - with code-switching. The voice that doesn't make customers translate themselves.

Custom Voice Development

Build voices tuned to your brand. The pronunciation you define once is the pronunciation you ship forever.
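
The "define once, ship forever" idea behind custom pronunciation can be pictured as a lookup applied to text before it reaches the voice model. The sketch below is only an illustration of that idea under stated assumptions: the `LEXICON` entries, the respelling format, and the `apply_lexicon` helper are all hypothetical, and nothing here reflects how Rime's API actually stores or applies pronunciations.

```python
import re
from typing import Dict

# Hypothetical lexicon; the respelling notation is invented for illustration.
LEXICON: Dict[str, str] = {
    "worcestershire": "WUUS-ter-sher",
}


def apply_lexicon(text: str, lexicon: Dict[str, str]) -> str:
    """Swap whole words found in the lexicon, matching case-insensitively."""

    def swap(match: "re.Match[str]") -> str:
        word = match.group(0)
        # Words missing from the lexicon pass through unchanged.
        return lexicon.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", swap, text)


print(apply_lexicon("Add Worcestershire sauce.", LEXICON))
```

The same input always produces the same output, which is the property a deterministic model is selling: the rule applied on the first call is the rule applied on every later one.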

// The Arc

A small company, moving quickly.

2022

Rime is founded

Clifford, Larson and Geovanos start the company in San Francisco. The thesis: voice AI built for the call center, not the podcast.

2024

Mist ships

The deterministic, low-latency TTS model that becomes Rime's enterprise workhorse.

2025 - May

Arcana and Rimecaster announced

Expressive voices for the consumer-grade demos; an open-source model for the developer community.

2025 - May

$5.5M Seed closes

Unusual Ventures leads, with Founders You Should Know, Cadenza and a long list of operator angels.

2025

Arcana v2 + Together AI partnership

Models go multilingual, ship on-prem, and land on Together AI dedicated endpoints.

// Footnotes

Small things worth knowing.

Mist and Rime both come from fog

The naming is not an accident. Rime is the frost that forms when supercooled mist or fog droplets freeze onto a surface. The company likes its metaphors load-bearing.

The training data is the joke

Most TTS models learn from audiobooks. Rime learned from real customer-service audio - the place where actual humans actually hesitate.

15% sales lift

VentureBeat reported that Rime's TTS boosted sales 15% for major brands. The cheese-pull moment for voice AI.

A linguist on the cap table

Brooke Larson's PhD is in linguistics. Most voice AI startups would consider that a Wednesday lunch hire. Rime made her a co-founder.

// Return to the Pizza

8:47 p.m., still raining.

Back to Cleveland. The franchise is still taking calls faster than three teenagers can answer. But now most of them are answered. The voice on the other end remembers how to say the customer's street name. It laughs at the right moment - a small, slightly nervous laugh, because it was trained on someone who was, once, slightly nervous. The order goes in. The pizza shows up. Nobody mentions the AI, because nobody noticed.

That is the product. Not the demo on Twitter. Not the celebrity voice clone. The thing you don't think about, working in the background, for 100 million conversations a month. Rime is not trying to make you marvel at voice AI. It is trying to make you forget you were listening to one.
