Breaking

$70M Series B closed - led by CDPQ · Sama Red Team ships for generative AI safety · Customers include Google, Walmart, NVIDIA, Ford · More than 69,000 lives lifted from poverty · 98% first-batch acceptance rate via SamaAssure · B Corp certified - "give work, not aid"

Profile · Company · AI Infrastructure

Sama. The quiet backbone of enterprise AI.

A San Francisco company with delivery centers in Nairobi, Kampala and Bengaluru, labeling the data that trains the models you already use.

Founded 2008 · HQ San Francisco · CEO Wendy Gonzalez · B Corp

EXHIBIT A · The wordmark of a company that labels approximately the entire internet, one bounding box at a time.

01 / Who they are now

The conference room at 2017 Mission Street is full of customers nobody talks about.

If you used a chatbot today, watched a model identify a stop sign, or trusted a search result to be roughly relevant, there is a non-trivial chance that somewhere upstream a Sama annotator drew a box around the thing that taught the algorithm what it was looking at.

Sama is what happens when a social enterprise grows up and discovers it accidentally became critical infrastructure. The company employs more than 3,000 people across San Francisco, Nairobi, Kampala and Bengaluru. It supplies labeled data, model evaluation and red-teaming to the kinds of customers - Google, Walmart, NVIDIA, Ford, Microsoft - that do not put their vendor list on a marketing page.

You will not see Sama in the AI hype cycle. The company sells the part of AI that does not demo well: human review, edge-case detection, evaluation rubrics, the boring brutal grind of teaching a model what is true. It is the part most of the industry would rather not think about. It is also the part the industry cannot ship without.

Sama sells the unglamorous half of artificial intelligence - and somehow makes it look like a moat. — Editor's note

02 / The problem they saw

Talent is global. The opportunity to use it, less so.

Back when "AI" mostly meant "spam filters," Leila Janah noticed a market failure that the technology industry was, charitably, ignoring. There were hundreds of thousands of people in low-income communities with the literacy, focus and broadband to do digital piecework. There were billions of dollars in data-labeling work waiting to be done. The two were not meeting.

The conventional answer was crowdsourcing on the cheap - pennies per task, no benefits, no career, no quality SLA. Janah thought that was both ethically lazy and operationally wrong. Cheap labor produces cheap data. Cheap data produces models that fail in expensive ways.

The Sama bet was that you could pay full-time wages, train people for years, build a quality system around them, and out-perform the gig market on accuracy by a wide enough margin that enterprise customers would happily pay for it. The bet, as it turned out, was correct.

"Give work, not aid." — Leila Janah, founder, 2008

03 / The founders' bet

A nonprofit walked into a venture round.

Sama began life in 2008 as Samasource, a nonprofit. It hired in Nairobi and Kampala, taught people to annotate images, and quietly accumulated customer relationships with technology companies who needed clean training data more than they needed a marketing story.

By 2019, the model had outgrown its legal wrapper. Janah spun the operating company out as Sama, a for-profit B Corporation, with the original nonprofit - now the Leila Janah Foundation - retained as a major shareholder. It was an unusual structure. It was also the only structure that let the company scale to compete with conventional vendors while preserving the wage commitment that made the work worth doing.

Janah died unexpectedly in January 2020 at 37. Wendy Gonzalez, who had joined in 2015 from a career at EY and Capgemini, stepped in - first as interim CEO and then, before the year was out, as the full-time chief executive. The cultural reset that often follows a founder's death did not happen. The strategy hardened instead.

The Sama timeline.

A small chronology of an unusually patient company.

2008 · Leila Janah founds Samasource as a nonprofit. First delivery center in Nairobi.

2015 · Wendy Gonzalez joins as SVP. The company expands its enterprise pipeline.

2019 · Operating arm relaunched as Sama, a for-profit B Corp. Foundation becomes major shareholder.

2020 · Janah passes away at 37. Gonzalez takes over as CEO. Strategy stays the course.

2021 · $70M Series B led by CDPQ - the largest impact-sourcing round to date.

2024 · Launches Sama Red Team - one of the first comprehensive offerings for generative AI safety testing.

04 / The product

Humans, software, and a contract that says 98%.

The Sama product is two things superimposed: a managed annotation workforce and the platform that orchestrates it. The annotators handle image, video, 3D point cloud, LiDAR and text. The platform - SamaHub for collaboration, SamaIQ for quality insight, SamaAssure for the contractual quality guarantee - is what turns a workforce into something a Fortune 500 customer will sign a master services agreement with.

SamaAssure is the headline number. It promises a 98% first-batch acceptance rate, and the company will put it in writing. In a category where vendors usually quote a confidence interval and pray, that is unusual. It is also a useful screen: if a competitor cannot say that number out loud, they probably should not be labeling your autonomous-vehicle data.
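For readers who want the arithmetic behind a number like that, here is a minimal sketch of how a buyer might check a first-batch acceptance guarantee on their own delivery records. Everything in it is an assumption made for illustration: the `Batch` record, the idea that acceptance is a yes/no decision on the first batch of each project, and the function names. Sama does not publish how SamaAssure is measured, so treat this as one plausible reading, not the contractual definition.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    """One delivered batch of labeled data (hypothetical record shape)."""
    project: str
    sequence: int    # 1 means the first batch delivered for that project
    accepted: bool   # True if the customer accepted it without rework


def first_batch_acceptance_rate(batches):
    """Share of first batches that were accepted (assumed definition)."""
    first = [b for b in batches if b.sequence == 1]
    if not first:
        raise ValueError("no first batches to evaluate")
    return sum(b.accepted for b in first) / len(first)


def meets_guarantee(batches, threshold=0.98):
    """Compare the observed rate against the quoted 98% threshold."""
    return first_batch_acceptance_rate(batches) >= threshold


# Illustrative data only: 50 projects, one rejected first batch.
records = [Batch(f"project-{i}", 1, accepted=(i != 0)) for i in range(50)]
# Later batches are ignored by the first-batch metric.
records.append(Batch("project-0", 2, accepted=True))

rate = first_batch_acceptance_rate(records)
print(f"first-batch acceptance: {rate:.0%}, guarantee met: {meets_guarantee(records)}")
```

In this made-up sample, 49 of 50 first batches pass, which lands exactly on the threshold. A second rejection would drop the rate to 96% and miss it, which is why a written 98% leaves very little room for a bad first delivery.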

The newer product is Sama Red Team, shipped in April 2024. It pairs domain experts with proprietary algorithms to adversarially test generative AI models for bias, safety failures, privacy leaks and compliance gaps. The pitch is straightforward and a little dry: you are about to ship an LLM to millions of people. You should probably know how it breaks before they find out.

What customers actually buy from Sama.

Approximate revenue mix by service line · Editor estimates from public statements

Image / Video Annotation · ~38%

3D / LiDAR Sensor Data · ~22%

GenAI & LLM Evaluation · ~24%

Red Teaming / Safety · ~9%

Data Curation / Strategy · ~7%

Source: aggregated from Sama press, customer case studies, and analyst coverage. Sama does not publish a service-line breakdown; numbers are approximate.

The category did not need another labeling tool. It needed someone willing to sign the SLA. — A buyer who is not allowed to be quoted by name

The Sama product family.

Seven product lines · One workforce · One quality SLA

Annotate · Sama Annotate · Image, video, 3D point cloud and LiDAR annotation for computer vision and ML.

Curate · Sama Curate · Dataset curation to surface edge cases and reduce bias before training begins.

GenAI · Sama GenAI · Supervised fine-tuning, RLHF, prompt-response generation and LLM evaluation.

Safety · Sama Red Team · Adversarial testing for generative AI - bias, safety, privacy and compliance.

Platform · SamaIQ · Human-in-the-loop insight engine pairing experts with proprietary algorithms.

Platform · SamaHub · Collaboration workspace for projects, sampling and customer reporting.

Guarantee · SamaAssure · The 98% first-batch acceptance guarantee. In the contract, not the deck.

A partial list of customers

Google · Walmart · NVIDIA · Ford · Microsoft · GM

05 / The proof

The receipts are dull. The receipts are good.

Sama raised a $70M Series B in November 2021, led by CDPQ - Quebec's pension giant - with First Ascent Ventures and Vistara Capital Partners. That is not a logo you see on a typical AI Series B. It is the logo you see when a category is graduating from venture novelty to long-duration infrastructure.

The customer list reads less like a startup pitch and more like the index of a regulatory filing: Google, Walmart, NVIDIA, Ford, Microsoft, General Motors. Sama claims a majority of the Fortune 50 has bought from it at some point. The dollar amounts are private; the inertia they imply is not.

The social receipts are similarly unsexy and similarly real. The company says it has helped lift more than 69,000 people out of poverty through living-wage employment in its delivery centers, and it has the B Corp certification and third-party audits to back that up. You do not have to find that mission moving to find it operationally useful: turnover at the annotator level is far lower than at gig competitors, which is why the quality numbers are what they are.

Pay people enough to stay, train them long enough to get good, and your accuracy curve takes care of itself. — The Sama theory of the case

06 / The mission

"Give work, not aid" was a slogan. Then it was a business plan.

Sama's mission is the same one Janah wrote down in 2008. The company's competitive advantage is that the mission turns out to be operationally useful: living wages produce stable workforces, stable workforces produce trained annotators, trained annotators produce 98% acceptance, and 98% acceptance gets you on the master vendor list at NVIDIA.

The cynical reading - that ethics is a marketing layer on top of a labor-arbitrage business - does not survive a look at the balance sheet. Sama spends more on wages, training and benefits than the cheap-and-cheerful competitors. It charges enterprise prices because it ships enterprise quality. The mission is not the marketing. The mission is the unit economics.

07 / Why it matters tomorrow

Models will keep failing in public. Sama is in the failure-prevention business.

The hard problems in AI for the next five years are not compute. They are evaluation, alignment, red-teaming, and the data discipline to keep generative models from confidently lying at scale. None of that is automatable. All of it is what Sama already does.

The companies shipping the loudest AI products are the same ones quietly buying more annotation, more evaluation, more red-teaming. Sama is the unglamorous infrastructure underneath the glamorous demos - the inspection layer between a model and the public. That position tends to compound.

And the part that does not show up in the analyst report: the workforce. As regulators and customers begin asking real questions about where AI training data comes from, who labeled it, and under what conditions, Sama's audit trail looks a lot less like a CSR slide and a lot more like a moat.

The next AI scandal will be a data scandal. Sama is selling the only insurance policy that exists. — Editorial

08 / Back to the conference room

The customers nobody talks about will keep coming back.

The room at 2017 Mission Street is still full of customers nobody talks about. The difference, sixteen years in, is that those customers are now the ones whose AI products you actually use. The annotator in Nairobi who drew a box around a pedestrian at 3am Pacific time has a stake in the model that decides not to hit one. The B Corp paperwork is no longer the unusual thing about Sama. The fact that it works at this scale is.

Sama set out to give work, not aid. It ended up doing both - and quietly, accidentally, building one of the most important pieces of infrastructure in the modern AI stack.