He left a quant seat on Wall Street because the data scientists, the ones doing the actual analysis, had nothing.
Co-Founder & CEO, Sphinx // New York
Rohan Kodialam noticed something on a trading floor that bothered him. The software engineers had Copilot finishing their code. The front-office traders had ChatGPT drafting their memos. And the data scientists, the ones turning raw numbers into decisions worth millions, had nothing built for them at all.
So in 2025 he and engineer Jamie Bloxham started Sphinx, an applied-AI research firm in Queens, New York, with a deliberately unglamorous mission: make AI good at data. Not language. Not code. Data. The messy, schema-less, half-documented stuff sitting in warehouses and Jupyter notebooks where nobody can quite remember what column 14 means.
Sphinx ships a copilot that lives inside Jupyter and VSCode, the places data teams already work. It autocompletes. It reasons. It explores hypotheses and hunts for the insight buried three joins deep. In September 2025 the company came out of stealth with $9.5 million in seed funding led by Lightspeed, with Bessemer Venture Partners, BoxGroup, K5 and Impatient Ventures along for the ride, plus angels who know a thing or two about data: Steve Cohen and Naveen Rao.
"AI is driving a paradigm shift for natural language and code, but traditional data has been left behind."
Rohan Kodialam, on why Sphinx exists
Most founders chasing AI go where the demos look cleanest: chatbots, copilots for code, image generators. Kodialam went the other way, toward the part everyone else finds tedious. His argument is that data work and software work only look similar from a distance.
Code is literal. A function does what it says. Data is interpretive, ambiguous, full of context that lives in someone's head and nowhere else. He puts it as the difference between writing a poem and writing a technical paper. One has rules you can check. The other depends on what you meant.
That is exactly where today's language models stumble. "The moment you try to breach that boundary of language," he says, "you start to run into problems." Sphinx is his answer: agents trained with representation learning and reinforcement learning to interrogate data instead of hallucinating about it.
Engineers got Copilot. Executives got ChatGPT. Data scientists got handed a generic chatbot and told to make it work.
Sphinx Copilot plugs into Jupyter and VSCode, with autocomplete and agentic reasoning rather than a new app to learn.
Not prompt tricks. The bet is that teaching models the structure of data is the real unlock.
Refined forecasts, optimized operations, insights that move a P&L. The boring, valuable end of AI.
"All the software engineers have Claude Code, all the front-office guys have ChatGPT, and the data people have nothing."
ROHAN KODIALAM · FAST COMPANY
Before the venture money and the launch posts, there was an MIT undergraduate in the Physics department, not Computer Science, who kept wandering into machine learning. As a SuperUROP scholar he worked on a delightfully specific problem: the classic "ski rental" dilemma, reframed for an age when a model can guess how long you'll keep skiing.
He stayed for a master's, joining MIT's Clinical ML group, where he co-authored research on predicting patient outcomes from time-series clinical data and a paper with the very good title "Deep Contextual Clinical Prediction with Reverse Distillation." At CSAIL his work turned to embedding complex hierarchical data into transformer architectures, teaching models to read the kind of structure that does not fit neatly into a sentence.
Then Wall Street. At Citadel he became a quantitative researcher specializing in alternative data, the satellite images and credit-card receipts that funds mine for edge. He went on to lead AI R&D building agentic models for alpha generation. It was there, surrounded by the best-tooled engineers and traders in the world, that the absence of anything for the data team became impossible to ignore.
Sphinx is small on purpose and well-backed by design. The $9.5M seed gives a tiny New York team the runway to chase a frontier that bigger labs have mostly skipped. The investor list reads like a who's who of people who understand both AI infrastructure and what data is actually worth.
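Lightspeed: led the round.
Bessemer Venture Partners, BoxGroup, K5 and Impatient Ventures: institutional backing.
Steve Cohen: the Point72 founder and data-hungry investor.
Naveen Rao: AI hardware and infra veteran.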
"AI is driving a paradigm shift for natural language and code, but traditional data has been left behind."
"The moment you try to breach that boundary of language, you start to run into problems."
"It's almost like the difference between writing a poem and writing a technical paper."
"I aim to unlock new paradigms for intelligent systems to learn from data, and to drive lasting value through informed decision-making across industries."
Kodialam sat down with the SuperDataScience podcast (episode 938) to make the case for agents that interpret data, explore hypotheses, and surface the insights humans miss. The host described him as "an outstanding speaker building a revolutionary AI product."
SDS 938 on YouTube: "Frontier AI Agents for Data Science"
"The data people have nothing." The five words that became a company.
His full name is Rohan Sundar Kodialam.
His MIT home department was Physics, not Computer Science.
Before quant finance he co-wrote clinical machine-learning papers predicting patient outcomes.
An early research project tackled the classic "ski rental" algorithm, upgraded with ML predictions.
He describes Sphinx as a "Cursor for data" that lives inside Jupyter and VSCode.
Sphinx launched as a seven-person team based in Queens, New York.