Speech To Text | YesPress

aiera · financial ai

Ai · Fintech · Saas

Aiera

Aiera is a New York-based generative-AI platform for financial research that sources, verifies, transcribes and summarizes corporate and investor events: earnings calls, conferences, shareholder meetings and macro announcements. Covering 50,000+ events across 15,000+ global equities each year, it pairs AI automation with human review to deliver transcripts that reach roughly 99.9% accuracy, plus live audio, sentiment analysis, search and enterprise APIs. Founded in 2018 by former Wall Street internet analyst Ken Sena and ex-Amazon Alexa engineer Bryan Healey, Aiera is used by asset managers, banks and research firms, and closed a $25M Series B in 2025 backed by ten of Wall Street's largest research houses with Microsoft as a strategic technology partner.

Ai · Developer Tools · Saas

AssemblyAI

AssemblyAI is a San Francisco speech-AI company that builds and serves models turning audio and video into accurate text, plus higher-level 'audio intelligence' like summaries, sentiment, speaker labels, and PII redaction. Founded in 2017 by Dylan Fox, it sells a developer-first API used to add transcription, real-time streaming, and voice-agent capabilities to software. The company has raised more than $113M across seed to Series C and reports processing over a million hours of audio a day for customers ranging from startups to large enterprises.

speech-to-text · speech recognition

Founder · Executive · Operator

Ken Sena

Ken Sena is the CEO and co-founder of Aiera, a New York-based generative AI platform that turns earnings calls, investor presentations, and corporate events into real-time transcriptions, sentiment analysis, and decision-grade insights for institutional investors. A former Wall Street internet analyst who led global internet research coverage at Evercore ISI and Wells Fargo Securities, Sena launched Aiera in 2018 to apply machine learning to the research workflow he had spent two decades inside. The company now covers more than 13,000 equities, processes tens of thousands of public events a year, and closed a $25 million Series B in June 2025 backed by ten of Wall Street's largest research firms with Microsoft as its strategic technology partner.

ken sena · aiera

ai voice agents · healthcare ai

Ai · Health · Enterprise

Syllable

Syllable is a Mountain View, California company that builds, deploys, and manages AI voice and text agents, originally to automate patient communication for health system call centers and medical practices. Founded in 2016, it has expanded from healthcare-specific natural language understanding into a broader, model-neutral agentic platform that lets organizations build agents once and run them across any cloud with governance, analytics, and compliance built in. The company has raised roughly $81.7M across Seed through Series C, counts investors like TCV, Oak HC/FT, Verily, and Northwell, and acquired Actium Health in 2024.

captioning glasses · ar glasses

Hardware · Health · Ai

Xander

Xander builds XanderGlasses, standalone augmented-reality smart glasses that turn spoken conversation into real-time captions displayed in the wearer's field of view. Aimed at people with hearing loss who don't sign, the device runs speech-to-text offline (no phone, Wi-Fi, or cloud required), so private conversations stay private. An MIT Media Lab spinout founded by Alex and Marilyn Morgan Westner, Xander has been adopted by the U.S. Department of Veterans Affairs and is built on customized Vuzix Shield hardware.

ai translation · machine translation

Ai · Media · Saas

XL8.ai

XL8.ai is a Silicon Valley AI company building machine translation purpose-built for media and live events. Founded in 2019 by ex-Google and ex-Apple engineers, it turns video, audio and live speech into accurate subtitles, synthesized dubbing and real-time interpretation across 45+ languages. Its two flagship products, MediaCAT for localization workflows and EventCAT for live multilingual events, are used by language service providers, broadcasters, FAST channels and enterprises in 70+ countries, with the company reporting more than 800,000 hours of content and 2.2+ billion words translated.

Founder · Executive · Engineer

Alex Westner

Alex Westner is the co-founder and CEO of Xander, a Raleigh-based startup building AR smart glasses that print live captions of in-person conversations directly into the wearer's field of view. An MIT Media Lab-trained audio engineer who spent nearly two decades shipping sound software (including the Emmy-winning iZotope RX, nicknamed the Photoshop for Sound), Westner now turns his obsession with the cocktail party problem into a self-contained, no-cloud, no-phone device that helps people follow conversations they can no longer fully hear.

xander · xanderglasses

ai translation · real-time translation

Wordly

Wordly is a San Francisco Bay Area company that delivers cloud-based, AI-powered live translation, captioning, transcription and summaries for meetings and events. Without apps, headsets or human interpreters, attendees scan a QR code and follow along in their own language across 60+ languages and 3,000+ language pairs. Founded in 2017 by Lakshman Rathnam, Wordly serves 4,000+ customers and has supported over 5 million users worldwide, and was ranked among the fastest-growing software companies on the 2025 Inc. 5000 list.

transcription · captions

Rev

Rev is an American speech-to-text company that pairs the world's most accurate AI speech recognition with a global network of human transcriptionists to deliver transcription, captions, and subtitles at up to 99% accuracy. Founded in 2010 by six MIT-connected entrepreneurs, Rev serves over 100,000 customers and more than a million users across legal, media, education, and enterprise, and has increasingly focused its AI on the legal market with tools for depositions, evidence, and case prep.

accessibility · deaf

Ai · Health · Saas

Nagish

Nagish is a New York-based assistive-technology company that uses proprietary AI to caption phone calls in real time, converting speech to text and text to speech so people who are deaf or hard of hearing can make and receive calls independently and privately, without a human relay operator. Its name means 'accessible' in Hebrew. The company is one of the few certified by the FCC to provide telecommunication relay services and offers its consumer app for free.

Founder · Executive · Engineer

Tomer Aharoni

Tomer Aharoni is the co-founder and CEO of Nagish, a New York startup using AI to caption phone calls in real time so Deaf and hard-of-hearing people can place and receive calls by typing and reading, with no human operator in the loop. The idea began with a phone ringing during a class at Columbia and a question he couldn't shake: how do you take a call if you can't hear or speak? Nagish (Hebrew for 'accessible') is now FCC-certified, offered free to users through federal subsidies, and has raised $16 million. Aharoni builds the product hand-in-hand with the Deaf community and is now pushing into AI sign-language translation.

nagish · accessibility

voice-ai · speech-to-text

Deepgram

Deepgram builds foundational voice AI (speech-to-text, text-to-speech, and full voice-agent APIs) used by more than 1,300 enterprises including NASA, Spotify, Twilio and Citibank to give machines the ability to listen, understand, and respond in real time.

noise-cancellation · voice-ai

Krisp

Krisp is a voice AI company that strips background noise, voices, and echo from live calls using deep learning, then layers transcription, meeting notes, accent conversion, and real-time translation on top. Founded in 2017 by ex-Twilio engineers, it now processes 75+ billion minutes of audio a month for contact centers, BPOs, Discord, and millions of remote workers.

Ai · Developer Tools · Saas

Vapi

Vapi is a San Francisco developer platform for building, testing, and deploying conversational voice AI agents over phone and web. It abstracts the messy plumbing of speech-to-text, LLMs, text-to-speech, and telephony so developers can ship human-sounding voice agents in minutes, with sub-500ms latency and enterprise-grade compliance.

voice-ai · conversational-ai

Founder · Executive · Scientist

Scott Stephenson

Scott Stephenson is the co-founder and CEO of Deepgram, the voice AI company building foundational speech-to-text, text-to-speech, and voice agent models from scratch. A particle physicist who once helped build a dark-matter detector two miles underground, he now runs an AI company used by NASA, Spotify, and Twilio.

deepgram · voice ai

ai meeting assistant · transcription

Otter.ai

Otter.ai builds AI meeting assistants that join your Zoom, Teams, and Google Meet calls, transcribe them in real time, summarize the noise, and pull out action items, so people stop scribbling and start paying attention.