Tagged Content
Everything on the platform tagged with 'llm'.

Sebastian Raschka is a German-born AI/ML researcher, educator, and author who has built one of the most trusted independent voices in the machine learning community. Through his Substack newsletter 'Ahead of AI' (184,000+ subscribers), bestselling books like 'Build a Large Language Model (From Scratch)', and GitHub repositories with 91,000+ stars, he demystifies cutting-edge AI for practitioners worldwide. After a stint as an Assistant Professor at UW-Madison and a role as Staff Research Engineer at Lightning AI, he now runs RAIR Lab as an independent researcher, writer, and consultant.

Simon Willison is a British software engineer, open source creator, and AI commentator best known for co-creating the Django web framework and building Datasette, the open-source data exploration tool. He coined the term 'prompt injection' in 2022 and popularized 'AI slop' in 2024 - a word later named Merriam-Webster's 2025 Word of the Year. Through his prolific blog (active since 2002), newsletter with 54,000+ subscribers, and 100+ open source tools, he is one of the most influential independent voices at the intersection of LLMs and open source software.

Swyx (Shawn Wang) and Alessio Fanelli are the co-hosts of Latent Space, the #1 AI Engineering podcast and newsletter with 200,000+ subscribers and 10M+ total readers. Swyx — a former Singapore hedge fund trader turned developer advocate who coined the term 'AI Engineer' — and Alessio — a Forbes 30 Under 30 VC partner and Rome-born dropout-turned-engineer — together define the curriculum and culture of a generation of engineers building with AI.

Timothy B. Lee is an independent AI journalist and newsletter writer who runs Understanding AI, a Substack newsletter with over 263,000 subscribers that explains how artificial intelligence actually works - minus the hype and minus the doom. Drawing on a rare combination of a computer science master's from Princeton, two decades of tech policy reporting at outlets like Ars Technica, the Washington Post, and Vox, and an instinct for clear, jargon-free prose, Lee has become one of the most-read independent voices in AI journalism. His superpower is translating complex machine learning concepts into accessible explainers that neither oversell nor undersell the technology.

Jon Stokes is a 25-year veteran of online media who co-founded Ars Technica in 1998 with Ken Fisher, helping build it into the internet's premier tech publication before selling it to Condé Nast for $25 million. An engineer turned journalist turned product builder, he holds a B.S. in Computer Engineering from LSU alongside two master's degrees in early Christian history from Harvard Divinity School - a combination that explains his unusual range: equally comfortable dissecting CPU microarchitecture, AI policy, Second Amendment law, and New Testament scholarship. Today he's co-founder and CPO of Symbolic AI, runs a Substack newsletter on AI and crypto with 13,000+ subscribers, and serves as a fellow at Open Source Defense.

Tom Yeh is an Associate Professor of Computer Science at the University of Colorado Boulder and the creator of AI by Hand, a wildly popular educational newsletter and community that teaches transformers, LLMs, and deep learning architectures through pen-and-paper calculations. With 62,000+ Substack subscribers, 200,000+ social media followers, and a Feynman-inspired philosophy that you only truly understand what you can build by hand, Yeh has become one of the most influential voices in practical AI education - bridging the gap between black-box hype and genuine first-principles understanding.

Azalia Mirhoseini is an Iranian-born AI researcher, Stanford professor, and co-founder of Ricursive Intelligence - a frontier AI lab valued at $4 billion that uses AI to design better chips, which in turn train stronger AI. Best known for AlphaChip, the deep reinforcement learning system that now designs Google's TPUs and has compressed chip floorplanning from months to hours, she also co-invented the Mixture-of-Experts architecture underpinning GPT, Claude, and Gemini. With 20,000+ citations and a $335M-funded startup launched in under four months, she is closing the recursive loop between artificial intelligence and the hardware it runs on.
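
The Mixture-of-Experts idea mentioned above boils down to a learned gate that routes each input to a small subset of expert networks, so only a fraction of the model's parameters run per token. The following is a generic top-k gating sketch in numpy, illustrating the routing math only; shapes, expert counts, and the renormalization scheme are illustrative assumptions, not the architecture of any particular model.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def moe_forward(x, gate_w, experts, k=2):
    """Route each input row to its top-k experts and mix their outputs.

    x:       (batch, d_in) inputs
    gate_w:  (d_in, n_experts) gating weights
    experts: list of (d_in, d_out) weight matrices, one per expert
    """
    scores = softmax(x @ gate_w)                 # (batch, n_experts) gate probs
    topk = np.argsort(scores, axis=-1)[:, -k:]   # indices of the k largest gates
    out = np.zeros((x.shape[0], experts[0].shape[1]))
    for b in range(x.shape[0]):
        # Renormalize the surviving top-k gate weights so they sum to 1.
        w = scores[b, topk[b]]
        w = w / w.sum()
        for weight, e_idx in zip(w, topk[b]):
            out[b] += weight * (x[b] @ experts[e_idx])
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
gate_w = rng.normal(size=(8, 4))                  # 4 experts
experts = [rng.normal(size=(8, 16)) for _ in range(4)]
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # (4, 16)
```

With k=2 of 4 experts active, each forward pass touches only half the expert parameters - the sparsity that lets MoE models scale parameter counts without a proportional compute cost.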

Elvis Saravia is the co-founder and lead AI researcher at DAIR.AI, a mission-driven organization democratizing AI research and education worldwide. Based in Belize, he is the author of the Prompt Engineering Guide - one of the most widely read AI resources on the internet with 73,000+ GitHub stars and over 3 million learners - and publishes the AI Agents Weekly newsletter. A PhD graduate from National Tsing Hua University in Taiwan, he has contributed to landmark AI projects including the Galactica large language model at Meta AI, and is known for bridging rigorous research with accessible, production-minded education for the next generation of AI builders.

Eugene Yan is a Member of Technical Staff at Anthropic, where he bridges cutting-edge AI research with production-scale systems. He previously spent five years at Amazon as a Principal Applied Scientist, building real-time recommendation and LLM-powered systems for Kindle and Search. Eugene is equally well known for his prolific writing: 209 blog posts, 420,000+ words published, and a newsletter with over 11,800 subscribers. His open-source repository applied-ml on GitHub has become a canonical reference for teams shipping machine learning in production. He lives in Seattle, snowboards on weekends, and writes like someone who actually wants you to understand.

Hamel Husain is a machine learning engineer with 25+ years of experience who built part of the foundation beneath GitHub Copilot - his CodeSearchNet project was early LLM research later used by OpenAI for code understanding. Today he runs Parlance Labs, consults with AI teams across 35+ products, co-authored O'Reilly's 'Evals for AI Engineers', and teaches thousands of engineers how to move beyond vibes and actually measure their AI systems.

Woz (WOZCODE) is a Claude Code plugin built by MIT engineers Ben Collins and Brad Eckert that cuts AI coding costs by 25-55% and speeds up most tasks by 30-40%. Instead of letting Claude Code burn tokens on bloated built-in file operations, WOZCODE replaces them with smarter, leaner alternatives - three specialized agents (code, explore, plan) that do more with less. It installs in two commands, runs locally with no data exfiltration, and works alongside your existing Claude subscription. Backed by Y Combinator (W25) and a $6M seed round, Woz is building the efficiency layer that makes AI-assisted development economically viable at scale.

Maarten Grootendorst is a psychologist-turned-ML engineer at Google DeepMind, best known for creating BERTopic, KeyBERT, and PolyFuzz - open-source NLP tools with over 15 million combined downloads. Co-author of the Amazon #1 bestseller 'Hands-On Large Language Models' (O'Reilly, 2024) with Jay Alammar, he runs the 'Exploring Language Models' newsletter with 2M+ views and has taught 50,000+ students on DeepLearning.AI. His work bridges the worlds of psychology and AI, making complex language model internals accessible through strikingly visual guides.

Mihail Eric is a Palo Alto-based ML engineer, researcher, educator, and serial founder who has spent a decade bridging cutting-edge AI research and production systems. A Stanford CS alumnus who studied under Christopher Manning and Percy Liang, he built some of Amazon Alexa's earliest large language models, co-founded YC-backed Storia AI, founded Confetti AI (acquired by Towards AI), and now teaches 'The Modern Software Developer' at Stanford while running a newsletter for 17,000+ AI practitioners.

Tim Dettmers is an Assistant Professor at Carnegie Mellon University and Research Scientist at the Allen Institute for AI (AI2), best known for making large language models accessible on consumer hardware. He created the bitsandbytes library (2.2M monthly installs), co-authored QLoRA - a technique enabling fine-tuning of 65B-parameter models on a single GPU - and pioneered LLM.int8() quantization. With over 18,000 citations across his work, Dettmers has become one of the most influential voices in efficient deep learning, consistently arguing that computational democratization - not AGI hype - is where the real progress lives.
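
The absmax quantization at the heart of techniques like LLM.int8() can be shown in isolation: scale a float tensor so its largest magnitude maps to 127, round to int8, and keep the scale factor for dequantization. This is a simplified single-tensor sketch of the general idea; the real LLM.int8() method additionally routes outlier feature dimensions through higher precision.

```python
import numpy as np

def absmax_quantize(x):
    """Quantize a float array to int8 via absmax scaling.

    Returns (q, scale) such that x ≈ q.astype(float) * scale.
    """
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.array([0.1, -0.5, 2.0, -3.2], dtype=np.float32)
q, scale = absmax_quantize(x)
x_hat = dequantize(q, scale)
print(q)        # int8 codes; the largest-magnitude entry maps to ±127
print(np.abs(x - x_hat).max())  # rounding error is bounded by scale / 2
```

Storing int8 codes plus one float scale per tensor cuts memory roughly 4× versus float32, which is why quantization is central to running large models on consumer hardware.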

Muhammad Umair is a Pakistan-based AI consultant, ML engineer, and PhD researcher who has spent seven-plus years building machine learning systems that actually ship. He leads AI training at atomcamp, has driven AI initiatives for UNDP Pakistan, and has built three AI SaaS products end-to-end. His PhD research at UESTC focuses on multimodal test-time adaptation and low-resource learning - the kind of work that makes AI usable in places where data is scarce and compute is expensive.

Yao Fu (符尧) is an AI researcher at xAI specializing in large language model reasoning, efficient inference, and distributed systems. A PhD graduate of the University of Edinburgh, he previously worked at Google DeepMind on Gemini 3 and Project Astra. With over 5,000 citations and key papers like ServerlessLLM (OSDI '24) and DuoAttention (ICLR '25), Fu bridges systems engineering and ML research. He writes the 'Yao Fu' newsletter on Notion and is known for the Chain-of-Thought Hub benchmark repository, which helped track LLM reasoning progress across the field.

Weave is a YC-backed engineering intelligence platform that uses LLMs to analyze pull requests and measure real engineering output. It helps teams distinguish between human and AI-generated code, introducing the 'Weave Hour' as a metric for actual work completed.

On March 31, 2026, OpenAI closed the largest private funding round in history — $122 billion in committed capital at a post-money valuation of $852 billion. Anchored by Amazon ($50B), Nvidia ($30B), and SoftBank ($30B), with continued participation from Microsoft and a sweeping syndicate of global institutions, the round dwarfs every prior private tech raise and cements OpenAI as the world's most valuable startup by a wide margin. The company is generating $2 billion in monthly revenue, counting 900 million weekly ChatGPT users, and is widely expected to pursue an IPO.

Humanloop was an enterprise LLM development platform founded in 2020 as a UCL spinout, offering prompt management, evaluations, and observability tools for teams building AI applications. With customers like Duolingo and Gusto, it raised ~$8M and reached ~$3.8M ARR before being acqui-hired by Anthropic in August 2025, after which the platform was sunsetted on September 8, 2025. Its technology and team live on inside Anthropic's enterprise console.

LlamaIndex is a San Francisco-based AI infrastructure company and open-source framework that enables enterprises to build intelligent document agents using large language models. Founded in 2022 by Jerry Liu and Simon Suo, it started as a side project called GPT Index and has grown into a full enterprise platform with products like LlamaParse (agentic OCR), LlamaCloud (enterprise SaaS), and a widely-used Python/TypeScript SDK. With 25M+ monthly downloads, 48K+ GitHub stars, and customers including Rakuten, Salesforce, and 90+ Fortune 500 companies, LlamaIndex is a leading player in the enterprise RAG and AI agent infrastructure space.

OctoAI (formerly OctoML) was a Seattle-based AI infrastructure company founded in 2019 by University of Washington researchers — including Apache TVM creator Tianqi Chen and CEO Luis Ceze. The company built a generative AI inference platform that gave developers fast, affordable API access to leading open-source LLMs and image generation models, along with OctoStack, an enterprise-grade private AI deployment stack. After raising ~$132M and pivoting from ML optimization to GenAI infrastructure, OctoAI was acquired by NVIDIA in September 2024 and wound down its commercial services by October 31, 2024.

Weights & Biases (W&B) is the leading AI developer platform for machine learning and generative AI, offering tools for experiment tracking, hyperparameter optimization, model registry, and LLM application development. Founded in 2017 by Lukas Biewald, Chris Van Pelt, and Shawn Lewis in San Francisco, W&B powers over 1 million developers and 1,400+ organizations — including OpenAI, Meta, and NVIDIA — by making it easier to build, train, evaluate, and deploy AI models. Acquired by CoreWeave for ~$1.7B in May 2025, W&B continues expanding its platform with Weave for LLM/agent observability, cementing its position as the de facto infrastructure for modern AI development.

Fireworks AI is a generative AI inference platform founded in 2022 by seven engineers — five of whom built PyTorch at Meta — that gives enterprises fast, cost-efficient, and customizable access to hundreds of open-source models. The company's proprietary FireAttention kernels and speculative-execution engine deliver up to 40× faster inference and 8× cost reduction versus alternatives, while its fine-tuning and model-deployment tooling lets companies own their AI stack end-to-end. With $327M+ raised, a $4B valuation, 10,000+ customers including Samsung, Uber, Shopify, and Cursor, and a $315M annualized run-rate as of early 2026, Fireworks AI has become the go-to inference layer for production generative AI applications.

Predibase was a San Francisco-based AI infrastructure company (founded 2020, acquired by Rubrik in June 2025) that pioneered efficient LLM fine-tuning and serving at scale. Built by the creators of Uber AI's Ludwig and Horovod frameworks, Predibase made it easy for enterprises to fine-tune and deploy open-source LLMs using LoRA adapters — often outperforming GPT-4 on domain-specific tasks for under $8 of compute. Its open-source LoRAX inference server enabled serving thousands of fine-tuned models from a single GPU, dramatically cutting costs. After raising $28M from Greylock and Felicis, Predibase was acquired by cybersecurity firm Rubrik for over $100M to accelerate agentic AI adoption.
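
The LoRA adapters Predibase built on work by freezing a pretrained weight matrix W and learning only a low-rank correction: the effective weight becomes W + BA, where B is d×r and A is r×d with r much smaller than d. A minimal numpy sketch of the parameter math (not Predibase's implementation; the dimensions here are illustrative):

```python
import numpy as np

d, r = 1024, 8                       # hidden size, adapter rank
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable, small random init
B = np.zeros((d, r))                 # trainable, zero init, so training
                                     # starts from the pretrained model

x = rng.normal(size=(d,))
# Adapted forward pass: base output plus the low-rank correction B @ (A @ x).
y = W @ x + B @ (A @ x)              # equals W @ x while B is still zero

full_params = d * d
lora_params = 2 * d * r
print(lora_params / full_params)     # trainable fraction: ~1.6% of full fine-tuning
```

Because only A and B are trained and many adapter pairs can share one frozen base model, a server can hot-swap thousands of fine-tunes on a single GPU - the trick behind LoRAX's multi-model serving.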

Baseten is a San Francisco-based AI inference infrastructure company that provides dedicated and serverless GPU compute for running AI models at scale. Founded in 2019 by four ex-Gumroad engineers, the company has grown into a unicorn with a $5B valuation and $585M in total funding, backed by NVIDIA and other top-tier investors. Baseten powers inference workloads for 100+ enterprises including Cursor, Notion, HeyGen, and Clay, offering an inference stack with near-zero cold starts, proprietary networking, and open-source tooling like Truss for model packaging.

RunPod is an AI cloud infrastructure company that provides on-demand GPU compute for training, fine-tuning, and deploying AI/ML models. Founded in 2022 by two former Comcast engineers who pivoted their Ethereum mining rigs into AI servers, RunPod grew to $120M ARR with just $22M raised by early 2026, serving 500,000+ developers across 183 countries. Its marketplace model, per-second billing, and support for 30+ GPU SKUs — from consumer RTX 4090s to enterprise H100s and B200s — make it a capital-efficient disruptor to hyperscaler GPU clouds like AWS, GCP, and Azure.

Scale AI is a San Francisco-based AI infrastructure company founded in 2016 by Alexandr Wang and Lucy Guo. It provides the data engine, evaluation tools, and AI deployment platforms that power the world's leading AI labs, Fortune 500 enterprises, and US government agencies. By combining a massive distributed workforce with proprietary tooling, Scale accelerates AI development through high-quality data labeling, RLHF, model evaluation, and agentic platforms — making it one of the most consequential picks-and-shovels companies in the modern AI boom, with a $29B valuation as of mid-2025.

Perplexity AI is a San Francisco-based AI company that built the world's leading 'answer engine' — replacing traditional link-based search with real-time, AI-generated responses that cite their sources. Founded in August 2022 by four AI researchers from OpenAI, Meta, Databricks, and Quora, Perplexity has grown from a scrappy post-ChatGPT prototype to a $20B+ company with 45 million monthly active users, over $1.7B in total funding, and a product suite spanning a conversational search engine, a developer API platform, and the Comet AI browser.