BREAKING · TrustLab picked by European Commission to measure disinformation across 6 platforms★$15M Series A led by USVP & Foundation Capital★Founders built trust & safety at Google, YouTube, TikTok, Reddit★SuperviseAI grades whether your AI agents behave★Clients include In-Q-Tel and leading social platforms
Trust & Safety · San Mateo, CA
TrustLab
The company that decided the internet's worst content deserved a referee - and built one out of software, policy, and people who already fought the fight at Google and YouTube.
A YesPress profile · Filed June 2026 · ~80 employees, 5 cities
TrustLab, photographed where it lives: a flat brand mark for a job that is anything but flat.
The Beat · Content Safety
A referee for the feed
Somewhere right now a platform is deciding whether a piece of content is fraud, a parody, a coordinated lie, or just a bad day for someone's reputation. The clock is running. The volume is absurd. And increasingly, the thing making the first call is a model - one that someone has to trust. TrustLab exists for that exact moment.
It is a trust-and-safety software company, headquartered in San Mateo, with roughly 80 people spread across that office and outposts in San Francisco, Berlin, Ankara, and Buenos Aires. It does not run a social network. It does not sell ads. Its entire product is the unglamorous machinery underneath everyone else's platform - the detection, the labeling, the enforcement, and the measurement that decides what stays up and what comes down.
That is a strange place to plant a flag. Content moderation is the part of the internet nobody wants to talk about at parties. TrustLab built a business on it anyway.
"Organizations need more flexible, cutting-edge tools for enforcement and compliance of content that harms users."
Tom Siegel, Co-Founder & CEO
The Problem · Why this exists
The internet outran its own janitors
For two decades, content moderation meant one of two things: enormous rooms of human reviewers, or blunt keyword filters that caught the wrong things and missed the right ones. Neither scaled. Both were exhausting. And both belonged to the platforms themselves, which meant the people grading the homework were also the ones who wrote it.
Then the problem changed shape. Misinformation got organized. Identity fraud got cheap. Hate speech learned to dodge the filters. And generative AI arrived, producing synthetic content faster than any review queue could drain - and, more recently, producing AI agents that act on a user's behalf and can go wrong in ways a keyword filter was never built to catch.
Regulators noticed too. Europe's Digital Services Act turned "we tried our best" into a compliance question with teeth. Suddenly platforms needed not just to moderate, but to prove they moderated - to someone neutral, on the record.
"The only way we can address content safety in an automated, scalable way is with technology like Trust Lab's."
Steve Vassallo, Foundation Capital
The Bet · Founders
Three people who'd already done it
In 2020, three veterans of the field made a wager: that the hardest trust-and-safety problems should be solved by an outside specialist, not reinvented inside every company that hit them. They had the resume to make the bet credible.
Tom Siegel built Google's first Trust & Safety team in the early 2000s, back when Google was mostly a search box, and ran it for more than a decade across some four billion users. Shankar Ponnekanti led brand safety and suitability at YouTube and helped steer it through the 2017 monetization crisis. Benji Loney built and led trust-and-safety teams at YouTube and TikTok and shaped technical strategy at Reddit - the kind of work that includes crisis response for content nobody wants to describe at dinner.
Between them they had seen every failure mode from the inside. The idea behind TrustLab was almost insolent in its simplicity: take that hard-won institutional knowledge, point it outward, and sell the defense to everyone who could not build it themselves.
2020 · Founded
3 · Co-founders, ex-Google/YouTube/TikTok/Reddit
$15M · Series A, June 2023
5 · Cities across 4 countries
"They are undisputed experts in applying ML and AI technology to prevent harmful misinformation and bad actors."
Dafina Toncheva, U.S. Venture Partners
The Record · Milestones
Six years, in order
2020
Trust & Safety Laboratory, Inc. is founded
Tom Siegel, Shankar Ponnekanti, and Benji Loney leave the platforms and start the outside specialist they wished existed.
2022
The EU comes calling
The European Commission selects TrustLab to independently measure disinformation across six major platforms. In-Q-Tel, the intelligence community's venture arm, shows up on the client list.
2023
A 71-page benchmark, and a $15M Series A
TrustLab publishes a cross-platform disinformation study used as a monitoring baseline, then closes a Series A led by USVP and Foundation Capital in June.
2024-25
From content to AI behavior
SuperviseAI extends the same human-in-the-loop discipline to AI agents - using LLM-as-a-judge evaluation, drift detection, and closed-loop learning.
2025
The Code becomes law-adjacent
The EU's 2022 Code of Practice on Disinformation - the one TrustLab measured - is endorsed for integration as a Code of Conduct under the Digital Services Act.
The Product · What you actually get
Three letters, one philosophy
TrustLab's catalog reads like an acronym factory, but each product is a piece of the same idea: software does the heavy lifting, humans keep it honest. The company calls it AI-with-human-in-the-loop, which is a polite way of saying it does not trust the machine to be the last word - and does not ask people to read everything either.
Detect
DetectAI
AI-driven threat discovery and investigation that surfaces high-risk and emerging harmful content across platforms - the early-warning system, before things escalate.
Moderate
ModAI
Multi-modal moderation and data labeling that blends automation with human review, tuned for quality and cost at the kind of volume that breaks manual queues.
Supervise
SuperviseAI
Oversight for AI agents using LLM-as-a-judge and human reconciliation, drift detection, and closed-loop reinforcement learning - quality control for software that talks back.
FIELD NOTE - SuperviseAI is the plot twist. A company built to watch human content pivoted to watching AI behavior. The harm moved; so did the floodlight.
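For readers who want to see what "drift detection" can mean in practice, here is a minimal sketch of one common reading of the term. It is not TrustLab's code: the function names and the 0.10 tolerance are hypothetical, chosen only for illustration. The idea is that an agent's pass rate over a recent window gets compared with a baseline, and a large enough gap raises a flag for a human to look at.

```python
from statistics import mean


def pass_rate(verdicts):
    """Fraction of verdicts that passed; `verdicts` is a list of booleans."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    return mean(1.0 if v else 0.0 for v in verdicts)


def drift_alert(baseline, recent, tolerance=0.10):
    """Return True when the recent pass rate has moved more than `tolerance`
    (an assumed threshold) away from the baseline pass rate."""
    return abs(pass_rate(recent) - pass_rate(baseline)) > tolerance


# Hypothetical data: the agent passed 18 of 20 checks last month,
# but only 14 of 20 this week.
baseline = [True] * 18 + [False] * 2
recent = [True] * 14 + [False] * 6

print(drift_alert(baseline, recent))    # True: 0.90 vs 0.70 exceeds the tolerance
print(drift_alert(baseline, baseline))  # False: no change at all
```

A production system would likely use a statistical test rather than a fixed gap, but the shape is the same: measure, compare, escalate.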
The Proof · Customers & data
Receipts, not adjectives
Anyone can claim to keep the internet clean. TrustLab has the rarer thing: a job nobody else could neutrally do. When the European Commission needed an independent measurement of disinformation - not a platform grading itself - it commissioned TrustLab to build the methodology and run the numbers across Facebook, Instagram, LinkedIn, TikTok, Twitter, and YouTube, in Poland, Slovakia, and Spain.
The resulting 71-page study became a benchmark for monitoring disinformation over time. One finding traveled well: discoverability of disinformation varied sharply by platform, with Twitter sitting at the high end and YouTube at the low end.
Disinformation discoverability, by platform
Relative ranking from TrustLab's EU Code of Practice study · index, illustrative
Twitter · highest
Facebook · high
TikTok · mid
Instagram · mid
LinkedIn · low
YouTube · lowest
Ranks show the relative ordering reported in the 2023 study - Twitter highest, YouTube lowest. Exact percentages live in the full report; the shape of the finding is the point.
The client roster reinforces the point. TrustLab counts leading social media companies, messaging platforms, and marketplaces among its customers - and In-Q-Tel, the strategic investment firm tied to the U.S. intelligence community, among its backers and partners. It is a short list of people who do not hand their content-safety questions to just anyone.
"When a regulator wants a neutral referee, it does not pick a player. It picked TrustLab."
YesPress · on the EU measurement mandate
The Mission · Why bother
A safer internet, measured
TrustLab's stated mission is to make the internet a safer place - a sentence so common it usually means nothing. What makes it land here is the word the company keeps adding underneath it: measurable. The bet is not that harmful content can be wished away, but that it can be detected, counted, benchmarked, and enforced against in a way that holds up to outside scrutiny.
That is a quietly radical position in an industry that has long preferred opacity. It treats trust and safety not as a cost center to be hidden, but as a function that can be audited - the way finance gets audited, the way safety inspections happen for things we take seriously.
"The harm keeps moving - misinformation, fraud, now the behavior of AI itself. The job is to keep the floodlight pointed at wherever it goes next."
The TrustLab thesis, in one line
Tomorrow · The stakes
The next thing to moderate isn't content
For years the question was: is this post safe? The next question is harder: is this AI agent - the one answering your bank's chat, screening your job application, acting on a stranger's behalf - behaving the way it should? That is not a keyword problem. It needs judgment at machine scale, which is exactly what LLM-as-a-judge plus human reconciliation is built to provide.
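The pairing of LLM-as-a-judge with human reconciliation can be sketched in a few lines. This is an illustration under stated assumptions, not SuperviseAI's implementation: `reconcile`, `toy_judge`, `toy_human`, and the 0.8 confidence cutoff are all hypothetical, and the "judge" here is a keyword stub standing in for a real model call. The point is the routing: a confident machine verdict stands, and an unsure one goes to a person.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Verdict:
    acceptable: bool   # final call on the agent's behavior
    confidence: float  # 0.0 to 1.0, as reported by the judge
    decided_by: str    # "judge" or "human"


def reconcile(transcript: str,
              judge: Callable[[str], Tuple[bool, float]],
              human: Callable[[str], bool],
              min_confidence: float = 0.8) -> Verdict:
    """Keep the judge's call when it is confident; otherwise ask a human."""
    acceptable, confidence = judge(transcript)
    if confidence >= min_confidence:
        return Verdict(acceptable, confidence, "judge")
    return Verdict(human(transcript), confidence, "human")


# Stand-ins for illustration only. A real judge would be a model call,
# and a real human reviewer would not approve everything.
def toy_judge(transcript: str) -> Tuple[bool, float]:
    if "guaranteed returns" in transcript:
        return (False, 0.95)  # confident that this is unacceptable
    return (True, 0.55)       # unsure, so it should be escalated


def toy_human(transcript: str) -> bool:
    return True


clear_case = reconcile("We offer guaranteed returns.", toy_judge, toy_human)
gray_case = reconcile("Your balance is 40 dollars.", toy_judge, toy_human)
print(clear_case.decided_by, clear_case.acceptable)  # judge False
print(gray_case.decided_by, gray_case.acceptable)    # human True
```

The design choice worth noticing is that the human sees only the cases the machine is unsure about, which is how review stays affordable at volume.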
TrustLab spent five years learning how to grade content nobody wanted to look at. Now it is applying that same instinct to the systems that are starting to make decisions for us. Same discipline, new surface. The floodlight just got pointed somewhere new.
Return to that opening moment - a platform deciding, the clock running, a model making the first call. The difference TrustLab is trying to make is small and enormous at once: that the call is not a guess. That it was detected, measured, checked by a human when it mattered, and could be defended to a regulator afterward. Content moderation will never be a glamorous job. TrustLab's wager is that it can at least be an accountable one.