
Brian Yoo
Chief Business Officer | FriendliAI
The Operator Who Runs on Tokens and Margins

He turned 10 engineers and an idea into a $4 billion company at Moloco. Now he's betting the next chapter on the four-cent difference between an efficient and an inefficient GPU call.

500x Revenue Growth at Moloco
$4B Moloco Peak Valuation
90% GPU Cost Reduction (FriendliAI)
Brian Yoo, Chief Business Officer at FriendliAI
San Francisco, CA
600+
Employees Built at Moloco
From 10 to global
$250M
Peak Revenue at Moloco
From near-zero as COO
$180M
Capital Secured at Moloco
Over his COO tenure
$20M
FriendliAI Seed Extension
Aug 2025, led by Capstone
2x
Faster LLM Inference
FriendliAI's platform promise

"Inference is where AI economics are won or lost. Every percentage point of GPU efficiency translates directly to margin, and every millisecond of latency translates to user experience."

Brian Yoo — Chief Business Officer, FriendliAI — 2026

The Operator Behind the Engine

When FriendliAI announced Brian Yoo as its new Chief Business Officer in April 2026, the company's CEO didn't describe a salesman or a networker. Byung-Gon Chun described an engineer of organizations - someone who built "the operational engine behind an AI-driven startup and scaled it into a multi-billion dollar global powerhouse." That company was Moloco. That tenure lasted nearly a decade.

Yoo arrived at Moloco in August 2016 when the company had ten employees and an AI-driven advertising technology that needed a business around it. He left in April 2026 with 600+ people across multiple continents, $250M+ in annual revenue, $180M+ in capital raised, and a valuation hovering around $4 billion. The 500x revenue figure gets cited in press releases, but the more interesting detail is what he actually built to get there: finance, marketing, HR, BizOps, legal, IT, and workplace operations - the entire unglamorous infrastructure of a company, constructed from zero while the product team built the glamorous parts.

That kind of operator doesn't pick a next act casually. Yoo picked FriendliAI, an AI inference platform founded by Seoul National University professor Byung-Gon Chun in 2021, and the choice is pointed. FriendliAI's core proposition is that most enterprises are dramatically overpaying for GPU compute when running large language models in production - and that fixing the inference layer is worth more to AI-driven businesses than almost any product feature they could build.

What FriendliAI Actually Does

The promise sounds like marketing - 2x faster LLM inference, up to 90% GPU cost reduction - but the mechanism is specific. FriendliAI's platform uses custom GPU kernels, continuous batching, speculative decoding, smart caching (the proprietary tcache system), and parallel inference to extract far more throughput from the same hardware. The company can also deploy any of 552,876+ models from Hugging Face Hub in a single click, and it supports multi-LoRA adapters, native quantization, and structured outputs out of the box.
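Continuous batching is the most intuitive item on that list, and the intuition fits in a toy simulation. The sketch below is illustrative only (the function names and the simplified one-token-per-step model are mine, not FriendliAI's implementation): it compares static batching, where a batch occupies the GPU until its longest sequence finishes, with continuous batching, where finished sequences are evicted and waiting ones admitted after every decode step.

```python
from collections import deque

def static_batch_steps(lengths, batch_size):
    # Static batching: each batch holds the GPU until its longest sequence finishes.
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps

def continuous_batch_steps(lengths, batch_size):
    # Continuous batching: after every decode step, finished sequences are
    # evicted and waiting sequences admitted, so no slot idles on a straggler.
    waiting = deque(lengths)
    running = []
    steps = 0
    while waiting or running:
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        steps += 1  # one decode step: every running sequence emits one token
        running = [r - 1 for r in running if r > 1]
    return steps
```

With one long sequence and many short ones, static batching wastes slots waiting on the straggler; continuous batching keeps every slot busy, which is where the throughput gain comes from.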

FriendliAI's March 2026 product - InferenceSense - is described as "AdSense for GPUs": a platform that lets GPU operators monetize idle hardware capacity with paid AI inference workloads, splitting token revenue with FriendliAI. No upfront fees. Revenue share only.
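The revenue-share mechanics are simple enough to sketch. Assuming a per-million-token price and a fixed split (both numbers below are hypothetical, not FriendliAI's actual terms), an operator's payout for a billing period is just:

```python
def operator_payout(tokens_served: int, usd_per_million_tokens: float,
                    operator_share: float) -> float:
    # Gross inference revenue earned on the operator's idle GPUs, times their share.
    gross = tokens_served / 1_000_000 * usd_per_million_tokens
    return gross * operator_share

# Hypothetical: 50M tokens at $0.40 per million, 70/30 split in the operator's favor.
payout = operator_payout(50_000_000, 0.40, 0.70)
```

Zero upfront cost means the operator's downside is bounded at the idle capacity they were already not monetizing.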

For Yoo, the business logic is clean: inference costs are the largest and least optimized line item in most enterprise AI budgets. Every percentage point of GPU efficiency is real margin recaptured. Every millisecond of latency removed is user experience delivered. The role of CBO is to help enterprises understand that equation and act on it.
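That equation is easy to make concrete. Given a GPU's hourly price and a sustained decode throughput (the numbers below are illustrative, not measured FriendliAI figures), serving cost per million tokens falls out directly, and a 2x throughput gain halves it:

```python
def cost_per_million_tokens(gpu_usd_per_hour: float, tokens_per_second: float) -> float:
    # Dollar cost to generate one million tokens on a single GPU.
    tokens_per_hour = tokens_per_second * 3_600
    return gpu_usd_per_hour / tokens_per_hour * 1_000_000

baseline  = cost_per_million_tokens(2.00, 1_000)  # hypothetical $2/hr GPU, 1k tok/s
optimized = cost_per_million_tokens(2.00, 2_000)  # same GPU, 2x throughput
```

At a fixed per-token price to the customer, every dollar shaved off this number is margin recaptured; at a fixed margin, it is room to cut price.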

"As AI moves into production, performance at the inference layer directly determines how many tokens you can generate - and ultimately the margins you can capture. FriendliAI is positioned to maximize both, delivering industry-leading throughput and efficiency so our customers get the most out of every GPU."

Brian Yoo, on joining FriendliAI, April 2026

The Long Road Through Operations

Yoo's route to this role is not a straight line. Cornell gave him two degrees in Operations Research and Industrial Engineering - a discipline that treats complex systems as optimization problems. He started at Capital One as a business analyst in quantitative risk modeling, then moved to Google's capital markets team doing quantitative analytics. Neither role is on the product-and-fundraising circuit that most startup executives trace. Both are about building rigorous quantitative models for decisions that matter.

Kabam came next, as a Senior Product Manager in mobile gaming from 2013 to 2015, then a short stint as Chief Strategy Officer at ROOY Inc. before Moloco pulled him in. The arc is that of someone who spent years learning how businesses actually work before building one - financial modeling, risk analysis, product management, strategy - all disciplines that come into play when you have to build a finance team, a legal function, and a global HR operation from scratch, without templates.

The Moloco decade also gave Yoo something rarer: credibility inside a technical founder-led AI company. He knows what it takes to build the organizational machinery around a highly technical product. He knows what breaks at 50 people that was fine at 10, and what breaks at 300 that was fine at 100. FriendliAI, with 50 employees and a stated goal of 10x revenue growth in 2026 followed by another 10x in 2027, is about to stress-test those lessons again.

The Inference Bet

The $20M seed extension FriendliAI closed in August 2025 - led by Capstone Partners and joined by Sierra Ventures, Alumni Ventures, KDB, and KB Securities - was meant to fund North American and Asian go-to-market expansion. Yoo is the most visible hire of that expansion. In May 2026, the company opened a 7,000 square foot office in San Francisco's historic Crown Point Press building, signaling that the hiring phase has begun in earnest.

The inference infrastructure market is crowded with technical credibility. The differentiated bet FriendliAI is making - and that Yoo is now the face of commercially - is that efficiency at the serving layer is the AI cost problem most enterprises haven't solved, and that the right platform can cut GPU bills dramatically without touching the models themselves. Strategic partnerships with Hugging Face (January 2025) and LG AI Research (July 2025, supporting LG's EXAONE 4.0 model) are early evidence of how that commercial strategy plays out in practice.

Yoo's prior playbook at Moloco scaled an AI-native company by building systems that could handle the growth that followed great technology. At FriendliAI, the play is similar but inverted: the technology is already working in production, the market is demonstrably large, and the job is to build the commercial machine fast enough to claim it before competitors do. For an operations researcher who spent a decade doing exactly that at scale, the problem is familiar. The domain is new. The math is the same.

Career Arc

2005 - 2006
Cornell University - M.Eng. in Operations Research & Industrial Engineering (following B.S. in the same discipline)
2006 - 2009
Capital One - Senior Business Analyst, Quantitative Risk Modeling
2010 - 2013
Google - Capital Markets Analyst, Quantitative Analytics
2013 - 2015
Kabam - Senior Product Manager (mobile gaming)
2015 - 2016
ROOY Inc. - Chief Strategy Officer
2016 - 2026
Moloco - COO. Scaled from 10 to 600+ employees. Revenue 500x. $4B valuation. $180M+ capital raised.
2026 - Present
FriendliAI - Chief Business Officer. Leading go-to-market, partnerships, and commercial growth for an AI inference platform targeting 10x revenue growth.

Things Worth Knowing

Operations Research is the science of making optimal decisions in complex systems - routing supply chains, scheduling aircraft, pricing financial instruments. It's also, it turns out, good training for building a company from 10 to 600 people.

Before joining Moloco as COO, Yoo spent time at Kabam - a mobile gaming company. Consumer product instincts from gaming inform how he thinks about user experience and latency: players notice 100ms. So do LLM users.

FriendliAI can deploy any of 552,876+ Hugging Face models with a single click. That's the catalog they've indexed. Yoo's job is to get enterprises to use it instead of DIY-ing their own inference infrastructure.

InferenceSense - FriendliAI's March 2026 product - is explicitly described as "AdSense for GPUs." The analogy is exact: monetize idle capacity, share revenue with the operator, no upfront cost.

FriendliAI's new San Francisco office is in the Crown Point Press building - a historic location that has housed printmakers since 1962. A company that optimizes tokens choosing a space built around ink on paper is either poetic or a coincidence.

AI Inference · LLM Serving · GPU Optimization · Enterprise AI · Operations · Go-to-Market · Scaling Startups · Cornell · San Francisco · FriendliAI · Moloco · AI Infrastructure

What He Built

📈

Scaled Moloco revenue 500x over nearly a decade as COO, from near-zero to $250M+ annually

🏢

Built Moloco from 10 to 600+ employees across global offices, constructing every operational function from scratch

💰

Helped secure $180M+ in capital funding at Moloco, contributing to a ~$4 billion company valuation

🔧

Constructed Finance, Marketing, HR, BizOps, Legal, IT, and Workplace Operations functions from zero at Moloco

🤖

Joined FriendliAI as CBO to drive next phase of hypergrowth following $20M seed extension and InferenceSense launch

🎓

Dual Cornell degrees in Operations Research & Industrial Engineering, combining mathematical rigor with systems thinking

"Brian's track record of building the operational engine behind an AI-driven startup and scaling it into a multi-billion dollar global powerhouse is simply remarkable."

Byung-Gon Chun — Founder & CEO, FriendliAI — April 2026

What Yoo Is Selling

FriendliAI's inference platform isn't a wrapper around an existing model provider. It's infrastructure-level optimization applied to your own models, deployed on your own cloud or on theirs.

2x
Faster Inference
Custom GPU kernels, speculative decoding, and continuous batching combine to deliver significantly higher throughput per GPU than standard serving frameworks.
90%
GPU Cost Reduction
Native quantization, smart caching (tcache), and auto-scaling allow enterprises to run the same workloads on a fraction of the hardware.
552K
Models, One Click
Deploy any model from Hugging Face Hub instantly. Multi-LoRA support, model registry, and monitoring included without extra configuration.

InferenceSense (March 2026): The industry's first inference monetization platform. GPU operators share idle capacity with paid inference workloads. Revenue splits between operator and FriendliAI. Zero upfront cost. Think AdSense - but for compute.