Tagged Content
Everything on the platform tagged with gpu.

Tim Dettmers is an Assistant Professor at Carnegie Mellon University and Research Scientist at the Allen Institute for AI (AI2), best known for making large language models accessible on consumer hardware. He created the bitsandbytes library (2.2M monthly installs), co-authored QLoRA - a technique enabling fine-tuning of 65B-parameter models on a single 48 GB GPU - and pioneered LLM.int8() quantization. With over 18,000 citations across his work, Dettmers has become one of the most influential voices in efficient deep learning, consistently arguing that computational democratization - not AGI hype - is where the real progress lives.

Jensen Huang is the co-founder, president, and CEO of Nvidia Corporation, the world's most valuable semiconductor company, which he built from a Denny's booth in 1993 with $600 in combined cash. Born in Taipei, raised between Thailand and rural Kentucky, Huang is the longest-serving CEO of any S&P 500 technology company. His two-decade gamble on CUDA software created the unassailable moat that made Nvidia the backbone of the global AI revolution. He personally delivered the first AI supercomputer to OpenAI in 2016. As of 2026, Nvidia surpasses $5 trillion in market cap, generates $216 billion in annual revenue, and Huang's net worth stands at approximately $170 billion. The leather jacket is optional. The legacy is not.

DigitalOcean is a cloud computing platform designed for developers, startups, and SMBs, offering simple, predictable pricing and powerful infrastructure from virtual machines to high-performance GPU instances. Known for its community-first approach and Hacktoberfest, it has grown into a $9.4B public company competing with hyperscalers by focusing on usability and AI democratization.

OctoAI (formerly OctoML) was a Seattle-based AI infrastructure company founded in 2019 by University of Washington researchers, including Apache TVM creator Tianqi Chen and CEO Luis Ceze. The company built a generative AI inference platform that gave developers fast, affordable API access to leading open-source LLMs and image generation models, along with OctoStack, an enterprise-grade private AI deployment stack. After raising ~$132M and pivoting from ML optimization to GenAI infrastructure, OctoAI was acquired by NVIDIA in September 2024 and wound down its commercial services by October 31, 2024.

Fireworks AI is a generative AI inference platform founded in 2022 by seven engineers (five of whom built PyTorch at Meta) that gives enterprises fast, cost-efficient, and customizable access to hundreds of open-source models. The company's proprietary FireAttention kernels and speculative-execution engine deliver up to 40× faster inference and 8× cost reduction versus alternatives, while its fine-tuning and model-deployment tooling lets companies own their AI stack end-to-end. With $327M+ raised, a $4B valuation, 10,000+ customers including Samsung, Uber, Shopify, and Cursor, and a $315M annualized run-rate as of early 2026, Fireworks AI has become the go-to inference layer for production generative AI applications.

Baseten is a San Francisco-based AI inference infrastructure company that provides dedicated and serverless GPU compute for running AI models at scale. Founded in 2019 by four ex-Gumroad engineers, the company has grown into a unicorn with a $5B valuation and $585M in total funding, backed by NVIDIA and other top-tier investors. Baseten powers inference workloads for 100+ enterprises including Cursor, Notion, HeyGen, and Clay, offering an inference stack with near-zero cold starts, proprietary networking, and open-source tooling like Truss for model packaging.

Modal (Modal Labs) is an AI-native serverless cloud computing platform that gives developers instant, elastic access to GPUs and CPUs through a clean Python SDK — no YAML, no Dockerfiles, no infrastructure management required. Founded in 2021 by Spotify ML veteran Erik Bernhardsson, Modal enables AI and ML teams to scale from zero to thousands of GPUs in seconds, paying only for what they use. With customers like Suno, Mistral AI, Harvey, Ramp, and Substack, Modal reached unicorn status at a $1.1B valuation in September 2025 and was reportedly in talks to raise at $2.5B just five months later.

RunPod is an AI cloud infrastructure company that provides on-demand GPU compute for training, fine-tuning, and deploying AI/ML models. Founded in 2022 by two former Comcast engineers who pivoted their Ethereum mining rigs into AI servers, RunPod grew to $120M ARR with just $22M raised by early 2026, serving 500,000+ developers across 183 countries. Its marketplace model, per-second billing, and support for 30+ GPU SKUs — from consumer RTX 4090s to enterprise H100s and B200s — make it a capital-efficient disruptor to hyperscaler GPU clouds like AWS, GCP, and Azure.