It is a Tuesday in 2026 and someone, somewhere, is calling a model on Together's cloud.
A startup founder in Lisbon is fine-tuning Llama 3.1 on customer-support transcripts. A research lab in Singapore is running a 70B-parameter inference job at three in the morning. A Fortune 500 team in New Jersey is, against the advice of several consultants, evaluating an open model instead of buying a closed one. None of them have met. All of them are pointing API requests at the same place.
That place is Together AI. From a converted office on Rhode Island Street, the company runs what it calls an AI Acceleration Cloud - a stack of NVIDIA H100s, H200s, and now GB200s sitting under a custom inference engine and a research team that publishes its own papers. It is the company that, somewhat improbably, became the default destination for builders who looked at the AI boom and decided they would rather not be customers of a black box.
"We believe the future of AI will be open source - and the infrastructure to make that real has to be open too." - Vipul Ved Prakash, CEO, Together AI
Above: the logo, photographed in flattering light. Behind it: roughly four years of GPU procurement.
In 2022, "open AI" was a phrase, not an industry.
Generative AI was rapidly turning into a club with a velvet rope. The most capable models were trained behind closed walls, served behind closed APIs, and priced like enterprise software. If you wanted to build with them, you rented access. If you wanted to inspect them, you didn't. If you wanted to own anything, you couldn't.
Vipul Ved Prakash, who had spent a previous career building open infrastructure - first Cloudmark, later Topsy, which Apple acquired - noticed a familiar smell. He had seen the open web get steadily fenced off, and he could see the same thing about to happen to the most consequential technology of the decade. The wrinkle: the open-source models existed, or were about to. What did not exist was a place to run them that was as fast and as easy as the closed APIs.
"You don't get to keep the open web by clicking your heels. You get it by building the boring middle layer that makes open viable." - A line nobody at the company has actually said, but probably should
Translation: the GPUs were available. The orchestration, optimization, and bedside manner were not.
Five founders, one bet: open weights will win, eventually, and the cloud underneath them will be very, very valuable.
Prakash recruited an unusually credentialed group of co-founders: Stanford's Chris Ré, ETH Zürich's Ce Zhang, Stanford NLP's Percy Liang, and a PhD student named Tri Dao - the researcher who, almost in passing, had written a little kernel called FlashAttention that would soon be inside nearly every modern large language model on the planet.
The bet was not subtle. Closed AI labs, the founders argued, would always have the lead on a particular model on a particular Tuesday. But the open ecosystem - the one that gave us Linux, PostgreSQL, and basically the internet - would catch up, then keep catching up, and would eventually swallow the long tail of enterprise use cases that didn't need a single model to rule them all. The cloud that made that ecosystem usable would become indispensable.
It was the sort of bet that looks obvious in hindsight and reckless in the moment. Meta had not yet open-sourced Llama. Mistral did not exist. There was no DeepSeek. There was, frankly, not yet a market.
"The open models will catch up. We just need to be ready when the people running them stop apologizing for it." - The thesis, paraphrased
A short, unromantic timeline.
Together AI · Selected milestones
Founded in San Francisco by five researchers and operators.
$20M seed led by Lux Capital. Releases RedPajama dataset.
$102.5M Series A with NVIDIA, Kleiner Perkins.
$106M extension led by Salesforce Ventures.
Day-one launch partner for Meta Llama 3.1 (incl. 405B).
Dedicated NVIDIA GB200 clusters available to customers.
$305M Series B at a $3.3B valuation. 200+ models live.
~$100M ARR, 210+ employees, growing.
What Together AI actually sells.
It would be inaccurate to call Together AI a single product. It is closer to a small portfolio, organized around the question "what does an AI builder need that they currently can't get without writing a CUDA kernel?"
Together Inference
Serverless and dedicated endpoints for 200+ open-source models, served by a custom optimized engine.
Fine-Tuning API
LoRA and full fine-tuning that ships a custom model from your data without a research team.
GPU Clusters
Reserved H100, H200 and GB200 supercomputers - rented by the month, not the multi-year contract.
Code Interpreter
Sandboxed code execution for agents that need to actually run things, not just describe them.
Custom Models
End-to-end pre-training. You bring the data and the use case; Together brings the researchers.
Above: the menu. Notably absent: a single closed model nobody is allowed to inspect.
"The thing we sell is speed without lock-in. It turns out that's a real product." - Together AI marketing, in spirit if not in those exact words
You can talk a good thesis or you can be on the cap table.
Together AI is in the unusual position of having done both. NVIDIA is an investor. Salesforce is an investor. General Catalyst led the Series B. Customers, when the company is willing to name them, include Salesforce, Zoom, The Washington Post, Zomato, DuckDuckGo, and Pika Labs - which is to say, a span from the world's largest enterprise software vendors to AI-native startups whose entire business model presumes someone else will run their GPUs.
Funding raised, by round
"Open weights, closed cap table. That's the trick." - An investor on background, almost certainly
The chart, in plain English: things got serious in early 2025.
Make the world's best open AI models accessible to everyone.
The sentence is short on purpose. Together AI's stated mission is to make open-source AI models accessible to everyone, through a cloud that's fast and affordable enough for that to be a real choice rather than a moral one. The company contributes upstream - RedPajama, FlashAttention, Mamba, a long list of papers - because the platform's value depends on the ecosystem's health.
It's a worldview with a quiet political edge: the most consequential software of our generation should not be a rental from three companies. It should be infrastructure. Like electricity. Like, for that matter, the internet itself.
"Open source ate enterprise software twenty years ago. AI is next - it just has more GPUs in the way." - The whole company, basically
If the future is many models, you need someone running them.
The interesting bet inside the Together AI bet is that the next phase of generative AI will be plural. Many models, fine-tuned to many tasks, deployed by many companies, on the same shared infrastructure. The frontier will keep moving - that part is not in dispute - but the bulk of the day-to-day work will not be done on the frontier. It will be done on a Llama variant somebody trained on a Tuesday in Lisbon.
If that's right, the company you want is the boring middle layer. The one that makes the variant cheap to run, fast enough to be useful, and accessible to a developer who doesn't have a doctorate in distributed systems. Together AI has spent four years quietly auditioning for that role.
So back to Tuesday. The founder in Lisbon ships her support bot before lunch. The lab in Singapore wakes up to a finished inference job and a bill they can afford. The Fortune 500 team in New Jersey signs off on the open model, and the consultants quietly update their slide decks. The GPUs hum. Nobody calls a lawyer. The cloud is open, and somebody is running it.
A Tuesday. Possibly several of them. Often a Wednesday.