Tagged Content
Everything on the platform tagged with ml-infrastructure.

Josh Tobin is a machine learning infrastructure pioneer who spent three years as a research scientist at OpenAI, where he contributed to the famous Rubik's Cube robot hand, before earning his PhD from UC Berkeley under Pieter Abbeel. He co-founded Gantry, an ML monitoring and continual learning startup that raised $28.3M, and created Full Stack Deep Learning, the first course focused on production ML engineering. His domain randomization technique, which transfers neural networks trained in simulation to the real world, has been cited over 600 times and reshaped how robotics teams build perception systems. He runs a newsletter focused on ML infrastructure and ops.

OctoAI (formerly OctoML) was a Seattle-based AI infrastructure company founded in 2019 by University of Washington researchers, including Apache TVM creator Tianqi Chen and CEO Luis Ceze. The company built a generative AI inference platform that gave developers fast, affordable API access to leading open-source LLMs and image generation models, along with OctoStack, an enterprise-grade private AI deployment stack. After raising ~$132M and pivoting from ML optimization to GenAI infrastructure, OctoAI was acquired by NVIDIA in September 2024 and wound down its commercial services by October 31, 2024.

Fireworks AI is a generative AI inference platform founded in 2022 by seven engineers (five of whom built PyTorch at Meta) that gives enterprises fast, cost-efficient, and customizable access to hundreds of open-source models. The company's proprietary FireAttention kernels and speculative-execution engine deliver up to 40× faster inference and 8× cost reduction versus alternatives, while its fine-tuning and model-deployment tooling lets companies own their AI stack end-to-end. With $327M+ raised, a $4B valuation, 10,000+ customers including Samsung, Uber, Shopify, and Cursor, and a $315M annualized run-rate as of early 2026, Fireworks AI has become the go-to inference layer for production generative AI applications.