Modal is a serverless cloud built specifically for AI and machine learning workloads. The premise is almost aggressively simple: you write Python, Modal handles everything else. Spin up thousands of GPUs. Pay for what you use. Go home on time.
Before Modal existed, running serious AI workloads meant choosing between AWS's labyrinthine configuration menus, Kubernetes clusters that required a dedicated team to babysit, or reserved GPU instances that charged you whether you used them or not. The developer experience was, charitably, hostile.
Modal's answer was to build something new rather than wrap something old. Instead of slapping a nicer interface on top of Kubernetes, the team wrote their own container runtime in Rust, their own distributed file system, and their own multi-cloud scheduler from scratch. That took longer. It also produced cold starts measured in fractions of a second - compared to the several minutes Docker-based systems typically need.
```python
# This is roughly all you need to run on a GPU
import modal

app = modal.App()

@app.function(gpu="H100")
def train_model(data):
    # Your ML code here. Modal handles the rest.
    ...
```
That's the whole thing. One decorator. Modal provisions the GPU, loads your container, runs your function, and bills you for exactly the time it ran. No cluster management, no quotas, no pre-reservation. The company's informal internal benchmark for failure: if a developer ever needs to edit YAML or write a Dockerfile, something has gone wrong.
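As a sketch of the workflow (based on Modal's documented API; the app name and arguments here are illustrative), you trigger that function from your own machine and Modal executes it in the cloud:

```python
import modal

app = modal.App("train-demo")  # app name is illustrative

@app.function(gpu="H100")
def train_model(data):
    # Runs inside a GPU container in Modal's cloud.
    ...

# `modal run this_file.py` executes the entrypoint locally;
# .remote() ships the call to a provisioned GPU container
# and blocks until the result comes back.
@app.local_entrypoint()
def main():
    train_model.remote({"dataset": "example-bucket/training-data"})
```

Billing starts when the container starts and stops when the function returns; there is no cluster to tear down afterward.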
AI-native companies need AI-native infrastructure.
- Modal's founding thesis

The company is headquartered in New York, with offices in Stockholm and San Francisco. As of early 2026, it has around 118 employees, roughly $50M in annualized revenue, and was reportedly in discussions to raise at a $2.5 billion valuation - about five months after becoming a unicorn at $1.1B.
Modal was founded in January 2021 by Erik Bernhardsson, who previously spent six years building machine learning systems at Spotify. That included the recommendation engine behind Discover Weekly, the workflow orchestration tool Luigi (now with over ten thousand GitHub stars), and Annoy, an approximate nearest-neighbor library that became standard tooling in the ML community.
He left Spotify to become CTO at Better.com, where he grew the engineering team from one person to around three hundred before leaving to start Modal. He is Swedish, which partly explains the Stockholm office, the backing from Swedish VC Creandum, and the fact that Modal ended up classified as a European unicorn despite being based in New York.
Built Discover Weekly at Spotify. Created Luigi and Annoy. CTO at Better.com. Swedish. Based in New York. Clearly cannot stop writing infrastructure tooling.
A second key engineer joined in August 2021, with a deep background in developer tooling and ML infrastructure - the engineering mind behind how Modal actually works at the systems level.
The broader team is deliberately unusual. It includes international olympiad medalists, the author of the Seaborn Python visualization library, academic researchers, and experienced open-source maintainers. The company seems to have a preference for people who have built widely-used tools before, rather than people who have managed teams that managed teams that managed tools.
Modal's product set covers the full lifecycle of running AI in production. Each piece is self-contained, but they are built to work together.
Deploy LLMs and generative models on H100, A100, B200, or T4 GPUs. Sub-second cold starts mean you don't pay for idle capacity between requests.
Multi-node GPU clusters with RDMA networking for distributed model training. Scale out without setting up your own cluster.
Launch massively parallel jobs for data processing, transcription pipelines, or media generation. Substack uses this to transcribe podcasts across hundreds of GPUs.
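A hedged sketch of how such a fan-out might look using Modal's documented `Function.map` (the function body and episode URLs are hypothetical placeholders):

```python
import modal

app = modal.App("transcribe-demo")  # app name is illustrative

@app.function(gpu="T4")
def transcribe(url: str) -> str:
    # Download the audio and run a speech-recognition model.
    # (Model-loading details omitted; this is a sketch.)
    ...

@app.local_entrypoint()
def main():
    episodes = [f"https://example.com/episode-{i}.mp3" for i in range(500)]
    # .map() fans the 500 calls out across containers, autoscaling
    # up for the burst and back to zero when the batch finishes.
    for transcript in transcribe.map(episodes):
        print(transcript[:80])
```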
Secure containers for executing untrusted code - the building block for AI agents and code interpreters. Powers Mistral's Le Chat code runner in production.
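A minimal sketch of running untrusted code in a Sandbox, assuming Modal's documented `Sandbox` API (method names follow the public docs; the app name is illustrative):

```python
import modal

app = modal.App.lookup("sandbox-demo", create_if_missing=True)

# Each Sandbox is an isolated container: code executes there,
# not in your process, so untrusted output can't touch your host.
sb = modal.Sandbox.create(app=app)
proc = sb.exec("python", "-c", "print(21 * 2)")
print(proc.stdout.read())
sb.terminate()
```

This isolation is what makes Sandboxes usable as the execution layer for agents and code interpreters: the model's generated code runs to completion (or fails) inside a disposable container.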
GPU-accelerated collaborative notebooks for interactive ML experimentation. Launched September 2025, so relatively new but already integrated into the platform.
The layer everything runs through. Decorate a Python function, point it at a GPU type, deploy. The infrastructure lives in the code, not in a separate config file.
Pricing is usage-based - you pay per second of compute, the platform scales to zero when nothing is running, and there are no reserved instances or minimum commitments at the standard tier. Enterprise plans exist for teams that need dedicated capacity.
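To make per-second billing concrete, here is a toy cost calculation - the rates below are placeholders for illustration, not Modal's actual prices:

```python
# Hypothetical per-second GPU rates in USD (placeholders, not real prices).
RATES_PER_SECOND = {
    "T4": 0.000164,
    "A100": 0.000694,
    "H100": 0.001097,
}

def job_cost(gpu: str, seconds: float) -> float:
    """Usage-based billing: pay only for the seconds the container ran."""
    return RATES_PER_SECOND[gpu] * seconds

# A 90-second inference burst on an H100 costs about ten cents;
# when nothing is running, the bill is zero.
print(round(job_cost("H100", 90), 4))
```

The scale-to-zero property is the point: there is no idle reserved instance accruing charges between bursts.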
Modal has thousands of customers. The named ones span a fairly wide range of AI use cases, which is part of the point - it's not a specialist tool for one type of workload.
Suno uses Modal to scale AI music generation. Mistral runs its Le Chat code interpreter on Modal Sandboxes. Ramp, the fintech company, reported saving thousands of dollars a month after moving its open-source LLM deployments to Modal from its previous setup. Quora estimated that adopting Modal saved the equivalent of two full-time engineers in ongoing operational overhead.
Substack transcribes podcasts at scale. Future House runs AI agent environments and model hosting for scientific research. The Allen Institute for AI launched its Olmo and Tülu models through Modal. The range of workloads is genuinely broad: from a startup running its first fine-tuning job to research institutions running large-scale scientific compute.
Modal has raised $135M+ across five rounds. The pace of fundraising has accelerated sharply as the AI infrastructure market has heated up.
Notable angels and advisors include Elad Gil and Tristan Handy, the founder of dbt. The Series B in September 2025 valued Modal at $1.1B. Five months later, TechCrunch reported the company was in talks for a Series C at roughly $2.5B. That is not a typo.
Most cloud platforms are built on top of Kubernetes. Modal is not. The team decided early on that getting the performance characteristics they wanted - particularly the sub-second cold starts - required writing their own stack rather than customizing existing infrastructure.
That stack includes a container runtime written in Rust (roughly 100 times faster than Docker for spinning up containers), a custom distributed file system, and a multi-cloud scheduler that dynamically routes workloads across AWS, Google Cloud Platform, and Oracle Cloud Infrastructure based on availability and cost.
The multi-cloud approach means Modal isn't dependent on any single cloud provider's GPU inventory. When AWS us-east-1 H100s are constrained, workloads move. When OCI has capacity, it fills. For customers, this mostly means quotas become someone else's problem.
If you have to touch YAML or Dockerfiles, we've failed.
- Modal's internal engineering principle

Partnerships with Oracle Cloud Infrastructure (September 2024) and AWS (November 2024, including PrivateLink for enterprise security compliance) have expanded the GPU inventory Modal can access. A joint benchmark with NVIDIA in September 2025 demonstrated sub-second latency for 127 simultaneous clients running speech recognition on a single H100.
In July 2025, Modal acquired Jamsocket, a New York infrastructure company that built stateful Python REPLs and sync engines. The acquisition expanded Modal's capabilities around persistent, stateful agent execution - the kind of long-running interactive compute that AI agents increasingly need.
Erik Bernhardsson built Spotify's Discover Weekly before founding Modal. If you've ever had a playlist that felt eerily well-curated, there's a reasonable chance his code was involved.
Modal is considered a European unicorn despite being headquartered in New York. The Swedish CEO, Stockholm office, and Swedish-linked investors apparently tip the scales in Brussels' accounting.
The team includes international olympiad medalists and the creator of Seaborn, the Python data visualization library. Not your average cloud infrastructure startup hiring profile.
Erik's open-source project Luigi - a workflow orchestration tool he wrote at Spotify - has over 10,000 GitHub stars. He built the infra tooling, watched the industry complain about infra, and then decided to fix it properly.
Modal built everything from scratch: container runtime, file system, scheduler. The container runtime is in Rust. This is the kind of engineering decision that takes much longer and works much better.
Mistral AI uses Modal Sandboxes to power the code interpreter in Le Chat. When Mistral's users ask the chatbot to write and execute code, Modal is running it in isolation in the background.
TechCrunch reports Modal is in talks to raise a Series C at approximately $2.5B, with General Catalyst as the reported lead investor. That would more than double the $1.1B valuation from five months earlier.
- September 2025: Closed $87M Series B led by Lux Capital, achieving unicorn status at a $1.1B valuation. Launched GPU-accelerated Notebooks for interactive ML experimentation.
- July 2025: Acquired Jamsocket, a New York infrastructure startup specializing in stateful Python execution environments and sync engines. Co-founders Paul Butler and Taylor Baldwin joined Modal.
- Launched Sandboxes - secure isolated containers for untrusted or agentic code execution. Mistral AI adopted them immediately for Le Chat's code interpreter.
- November 2024: Signed a Strategic Collaboration Agreement with AWS, including PrivateLink integration for enterprise security and compliance requirements.
- September 2024: Announced Oracle Cloud Infrastructure partnership at Oracle CloudWorld, enabling bare-metal GPU instances and scaling to hundreds of nodes in seconds.