Most AI talks. He builds the kind that acts.
Ask the typical chatbot to plan your week and it will write you a beautiful plan. Ask it to actually book the flights, send the invites, and reconcile the spreadsheet, and it quietly falls apart. Zheqing Zhu - everyone calls him Bill - built Pokee AI for that exact gap.
The company, based in Bellevue, Washington, makes foundation AI agents designed to plan, reason, and use tools well enough to execute online workflows from start to finish. Not a wrapper, not a demo, but agents meant to carry a multi-step task across thousands of apps and APIs without dropping it halfway.
That description is the whole company. Generation - the writing, the images, the code suggestions - has been figured out by a dozen labs. Execution, the unglamorous business of reliably doing a thing across many tools, has not. Zhu picked the harder, less photogenic half on purpose.
Reinforcement learning, not a bigger model
The fashionable answer to better agents is to make the language model larger. Zhu's answer is different, and it comes straight from his research. Pokee leans on reinforcement learning - the discipline of teaching systems to make good sequential decisions - to help agents choose and chain tools efficiently.
The payoff is a specific, testable claim. Pokee reports over 97% accuracy when selecting from thousands of tools, a regime where standard large language models start to stumble badly once they get past a few hundred. The company says its foundation tool-usage model surpasses GPT-4o, Claude 3.7, and Gemini 2.5 Pro at function calling by a wide margin, and can extend past 6,000 tools.
Figures as reported by Pokee AI; the company's claim is that reliability holds as the number of available tools grows into the thousands.
He earned a Stanford PhD with a Meta day job
Here is the detail that explains the rest of him: Zhu completed a PhD in reinforcement learning at Stanford while working full-time at Meta. His advisor was Benjamin Van Roy, a major name in the field. He had already picked up a master's in computer science from Stanford along the way, and before that graduated summa cum laude from Duke, where he studied under Ronald Parr and minored in finance.
At Meta he was not a researcher off in a corner. As Senior Staff Research Lead Manager he ran the Applied Reinforcement Learning team, shipping RL into ads, recommendation systems, and Reality Labs. The work is credited with more than $500M in annual revenue impact, and the ads growth efforts he was part of helped take Meta's active advertiser base from 2 million to 12 million.
He also led the team that built and open-sourced Pearl, Meta's production-ready RL platform - the kind of contribution that earns a company-wide highlight at NeurIPS and, more importantly, gets used by people who are not the author.
Straight line, steep grade
- 2017: Graduates Duke summa cum laude; joins Meta as a machine learning engineer on Ads Growth ML.
- 2018: Becomes engineering manager and tech lead for Ads Growth Machine Learning.
- 2020: Earns an MS in Computer Science from Stanford - still while working.
- 2021: Promoted to Senior Staff Research Lead Manager, heading Applied Reinforcement Learning at Meta AI.
- 2023: Finishes his Stanford PhD; open-sources Pearl, Meta's production RL platform.
- 2024: Founds Pokee AI in October to build foundation AI agents.
- 2025: Closes a $12M seed round, 3x oversubscribed, led by Point72 Ventures; ships public beta.
A round that filled up three times over
When Pokee AI raised its seed, demand ran to three times the allocation. Point72 Ventures led, with Qualcomm Ventures and Samsung NEXT joining. The angel list is its own signal of who takes the execution thesis seriously: it includes Intel CEO Lip-Bu Tan and Abhay Parasnis, founder of Typeface and former CTO of Adobe.
The mission Zhu states for the money is deliberately enormous: automate every human workflow on the internet with frictionless AI agents. Reinforcement learning, he argues from experience, is the year's breakout idea for making agents that can actually be deployed rather than merely demoed.
What the bio leaves out
He goes by Bill; his Chinese name is 朱哲清. His GitHub handle, fittingly for a builder, is BillMatrix. He learned from two of the better-known minds in his field - Van Roy at Stanford, Parr at Duke - and then went and did the thing they study at the scale of billions of users.
His publication list runs through JMLR, ICML, ICLR, KDD, RecSys, CIKM, ICRA, and IROS, which is to say he keeps one foot in the research world even while running a company. And he has been collecting recognition along the way, including the Asian American Science and Engineering Innovation Award. The throughline is consistency: the same bet on reinforcement learning, made over and over, from a Duke thesis to a Bellevue startup.