Callers using PolyAI's voice agents occasionally do something strange. Mid-sentence, they pause - sometimes during a complaint, sometimes mid-rant about a late delivery or a billing error - and apologize. Not because someone asked them to, but because a small doubt has crept in: what if this is a person? That moment, that flicker of uncertainty, is what Nikola Mrkšić has spent the better part of a decade engineering. Not a trick, not a parlor gimmick, but the actual, hard, unsolved problem of building a voice AI that handles real conversations the way a skilled human agent would.
Mrkšić grew up in Belgrade in the early 1990s, when Yugoslavia was coming apart at the seams. He attended the Belgrade Mathematical Gymnasium - a competitive school originally designed during the Tito era to produce mathematicians and scientists. Whatever the curriculum was supposed to do, it worked on him. He developed an early obsession with mathematics and computation, the kind that earns full scholarships to Cambridge University.
At Cambridge, he landed in the Machine Intelligence Lab under Professor Steve Young - himself a pioneer of voice technology - and completed a BA, MSc, and PhD. His dissertation: "Data-driven language understanding for spoken dialogue systems." One mentor, the renowned machine learning researcher Zoubin Ghahramani, offered an instruction that Mrkšić has clearly internalized: "Don't just go and become a PowerPoint monkey."