Right now, a model is looking
Somewhere on Reddit, a user uploads an image. Before it shows up in anybody's feed, an API call goes out - a tidy little HTTP request bound for a server farm somewhere in the western United States. A model wakes up, looks at the pixels for about 80 milliseconds, decides whether the image contains a weapon, a nipple, a swastika, or none of the above, and sends back a JSON object. Nobody at Reddit ever sees the photo. Nobody at Hive does either. The user keeps scrolling.
That little choreography - boring, invisible, repeated billions of times a month - is the business. Hive is the company on the other end of the API. It is also, depending on the day, the company moderating BeReal, classifying ad inventory for NBCUniversal, hunting deepfakes for the Department of Defense, and doing the unglamorous plumbing of the modern internet. If you have ever wondered who keeps your social feed at a vague PG-13, the answer is partly Hive.
The unglamorous job
Content moderation is, by reputation, the worst job on the internet. Human reviewers in outsourced contact centers stare at the bottom 0.1% of human behavior for eight hours a day so the rest of us can post brunch pictures. The work is repetitive, traumatizing, and impossible at the volumes a global platform actually produces. Roughly speaking: the internet ships content faster than humans can read it. This has been true since about 2009. It is more true now.
Most platforms tried to solve this with off-the-shelf cloud vision tools, in-house models, or stalling. The off-the-shelf tools were too generic, the in-house models were expensive, and the stalling - well, the stalling created congressional hearings. Somewhere in that gap was a company-shaped opportunity. Kevin Guo and Dmitriy Karpman walked into it.
The founders' bet
Kevin Guo and Dmitriy Karpman met as Stanford undergrads. They started Hive in 2013, originally building a consumer app - the company's pre-history involves a video-sharing product that did not become Instagram. The pivot, when it came, was a sober one: stop chasing consumer attention, start selling the thing they had become unexpectedly good at, which was training computer vision models with a small army of distributed labelers.
The bet was simple, in the way that all good bets are simple in retrospect. Every digital platform on Earth was going to need AI models to understand the content flowing through it. Most of those platforms would not want to build the models themselves. So: build the models once, sell them as APIs, eat. In hindsight this looks obvious. In 2017, it required some patience.
What Hive actually sells
Hive's catalog reads less like a startup and more like a library card. There is a visual moderation API that grades imagery across sexual, violent, drug, hate, and attribute categories. A text moderation API that flags bullying, promotion, hate speech, and stray links. A logo and brand detection model that finds Coca-Cola cans in NBA highlight reels. A deepfake detection model that the U.S. military has decided is worth $2.4 million. A no-code dashboard so trust & safety analysts can wire up their own rules without filing a Jira ticket.
The shape of the company is: roughly 500 engineers, salespeople, and operators in San Francisco, plus an extended workforce of about 700,000 distributed gig labelers running through an app called Hive Work. The labelers tag the training data. The engineers train the models. The salespeople sell the API. The customers' users never know.
- Visual Moderation: Images and video, scored across nudity, violence, drugs, hate, attributes.
- Text Moderation: Sexual, hate, violence, bullying, promotions, external links.
- Deepfake Detection: AI-generated and manipulated content across image, video, audio.
- Logo & Brand: Find and localize logos in media for ad measurement.
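The products above share one pattern: content goes in, per-category scores come back as JSON, and the customer's own rules turn scores into actions. Here is a minimal sketch of that last step. The response shape, category names, and thresholds are all hypothetical, chosen for illustration and not taken from Hive's documentation:

```python
import json
from typing import Dict, List, Tuple

# Hypothetical response body: an illustration of "per-category scores as
# JSON", not Hive's actual schema.
SAMPLE_RESPONSE = json.dumps(
    {"scores": {"nudity": 0.02, "violence": 0.91, "drugs": 0.05, "hate": 0.01}}
)

# Hypothetical thresholds a trust & safety team might configure per category.
RULES: Dict[str, Dict[str, float]] = {
    "nudity": {"block": 0.90, "review": 0.60},
    "violence": {"block": 0.90, "review": 0.60},
    "drugs": {"block": 0.90, "review": 0.60},
    "hate": {"block": 0.90, "review": 0.60},
}


def decide(response_body: str, rules: Dict[str, Dict[str, float]]) -> Tuple[str, List[str]]:
    """Return ("block" | "review" | "allow", categories that triggered it)."""
    scores = json.loads(response_body)["scores"]

    # Any category at or above its block threshold blocks the post outright.
    blocked = [c for c, s in scores.items() if c in rules and s >= rules[c]["block"]]
    if blocked:
        return "block", sorted(blocked)

    # Otherwise, borderline scores are routed to a human reviewer.
    flagged = [c for c, s in scores.items() if c in rules and s >= rules[c]["review"]]
    if flagged:
        return "review", sorted(flagged)

    return "allow", []


if __name__ == "__main__":
    print(decide(SAMPLE_RESPONSE, RULES))
```

In this sample the image is blocked because its violence score clears the block threshold. A score between the review and block thresholds would send it to a human instead, which is roughly the kind of rule a no-code dashboard lets an analyst adjust without an engineer.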
- 2013: Kevin Guo and Dmitriy Karpman found Hive in San Francisco.
- 2017: Pivot from consumer apps to enterprise AI APIs.
- 2020: Demand spikes as platforms move human moderators home during COVID.
- 2021: Series D announced, led by Glynn Capital. ~$2B reported valuation.
- 2023: Launches AI-generated content and deepfake detection APIs.
- 2024: Independent study names Hive the top performer in deepfake image detection. DoD contract awarded.
The proof: who pays Hive
Skepticism is healthy in B2B AI. Anyone can ship a demo. The harder question - whether the product survives contact with paying customers - has a real answer here. Hive's customer list reads like an audit of the internet's attention economy: Reddit, BeReal, Chatroulette, Yubo, Omegle, Tango, Giphy, Truth Social. Add brand advertisers like NBCUniversal, Walmart, Visa, Anheuser-Busch InBev, and Interpublic Group. Then add the parts of the U.S. government that worry about deepfakes for a living, including the Department of Homeland Security's Cyber Crimes Center and, since late 2024, the DoD's Defense Innovation Unit.
The mission, plainly
Hive will tell you its mission is to be the AI infrastructure layer for understanding digital content. That is true, and a little tidy. The slightly messier truth is that Hive has decided to be the company that handles the things humans don't want to handle. The job description is "look at the worst the internet produces, fast, accurately, at margin." It is not a sexy mission. It is, however, the kind of mission that ends up with you on a DoD shortlist.
What's unusual about the company is its allergy to the showroom side of AI. There is no Hive chatbot. There is no Hive avatar with a quirky name. There is a documentation site, an API, and a sales team who will quote you per-call pricing. In a market drowning in AI demos, that's almost a personality.
Why it matters tomorrow
Generative AI didn't just give us new content. It gave us new content moderation problems. A model can now mint a convincing political deepfake in seconds; a teenager with a laptop can synthesize a voice. The cost of fakery has collapsed. The cost of verification has not - which means the cost of verification is now the business.
Hive's bet was that moderation and deepfake detection would always be the same kind of problem from the inside: pixels in, classification out, just with different training data. So far the bet looks right. The customers who used to buy moderation are now buying deepfake detection from the same vendor, on the same API surface, with the same per-call pricing. The Defense Innovation Unit is buying the same product, with a different SLA, paid in different-colored money.
Back to the API call
Return to the image being uploaded to Reddit. The model wakes up, decides, sleeps. The user keeps scrolling. Twelve years ago that decision would have been made by a human contractor in Manila or Hyderabad, hours later, after the post had already done its damage. Today it is made by a Hive endpoint in 80 milliseconds, before the post is ever rendered. The trauma is a vector, not a person. The latency is milliseconds. The cost is fractions of a cent.
It is not utopia. Models are wrong. Edge cases pile up. Adversaries iterate. But the floor of internet moderation has quietly moved - up, by orders of magnitude - and a lot of that movement is one San Francisco company shipping pre-trained models behind an API. Boring infrastructure. Loud impact. The exact kind of company that ends up mattering, slowly and then suddenly.
