The Elegance of Why Machines Learn: Inside the Mind of Anil Ananthaswamy
By Rose Polipe
Published 15 October 2025, 09:56 GMT
“Nobody sat around labeling the data for us. Our brains learned patterns through evolution. Machines, too, can learn — not because they mimic us, but because mathematics itself allows them to.”
In the world of artificial intelligence, the debate often sways between awe and anxiety — fascination with what machines can do and fear of how little we understand their inner workings. Yet for Anil Ananthaswamy, celebrated science writer and author of Why Machines Learn, the story of AI is neither myth nor menace. It is, quite simply, a story of mathematics — elegant, austere, and profoundly human in its curiosity.
From Circuits to Sentences
Ananthaswamy trained first as a computer and electronics engineer in India before studying at the University of Washington, and his journey from software engineer to science writer was less a career change than a reawakening. “At some point,” he recalls, “I realized the two things I loved most — science and writing — could be combined.”
What followed was a distinguished career at New Scientist, where he rose from intern to deputy news editor, and a string of acclaimed books spanning physics, neuroscience, and quantum mechanics. His first book, The Edge of Physics, transported readers to the ends of the Earth — from the Atacama Desert to the South Pole — to uncover the frontiers of cosmology. The Man Who Wasn’t There delved into the neuroscience of selfhood, while Through Two Doors at Once elegantly traced the double-slit experiment’s enigmatic dance through two centuries of quantum thought.
Then came Why Machines Learn — a book that merges his engineering past with his literary craft, illuminating the subtle mathematical fabric that makes AI possible.
The Beauty Beneath the Code
When the pandemic struck, Ananthaswamy found himself in solitary study, moving between Boston and Berkeley, immersed in online lectures on machine learning. “I started to realize,” he says, “that the mathematics behind machine learning is quite beautiful. The writer in me woke up and wanted to communicate that elegance.”
What captivated him wasn’t the jargon of data science or the engineering of algorithms, but their underlying proofs — the logic that gives the field its quiet poetry. He cites the Perceptron Convergence Theorem of 1959 as a revelation: “It’s a very simple proof, just linear algebra, but when you see it explained, it’s stunningly elegant.”
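The theorem concerns the perceptron’s update rule: as long as the data can be separated by a line (or hyperplane), the rule is guaranteed to find one after finitely many mistakes. A minimal sketch, not taken from the book, might look like this (the dataset and function names are illustrative):

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Perceptron learning rule. For linearly separable data, the
    convergence theorem guarantees the mistake-driven loop terminates."""
    w = np.zeros(X.shape[1])  # weights; bias folded in via a constant feature
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:   # misclassified (or on the boundary)
                w += yi * xi          # nudge the boundary toward the point
                mistakes += 1
        if mistakes == 0:             # every point classified correctly
            break
    return w

# A tiny separable dataset: points above the line x2 = x1 are +1,
# below are -1; a constant 1 is appended so the bias is learned too.
X = np.array([[0.0, 1.0, 1.0], [1.0, 2.0, 1.0],
              [1.0, 0.0, 1.0], [2.0, 1.0, 1.0]])
y = np.array([1, 1, -1, -1])

w = perceptron(X, y)
print(all(np.sign(X @ w) == y))  # True: the learned boundary separates the data
```

The proof Ananthaswamy admires shows, with nothing more than inner products and norms, that the number of updates this loop makes is bounded, no matter the order in which the points arrive.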
Equally moving to him are kernel methods: mathematical techniques that implicitly map low-dimensional data into enormously high, even infinite-dimensional spaces, yet compute everything through simple operations in the original, lower-dimensional space. “It’s like doing algebra in a dream world,” he says. “It’s beautiful, and it works.”
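The trick can be seen in miniature with a polynomial kernel. Here is a hedged sketch (the function names are mine, not the book’s): the kernel value (x · z)² in two dimensions equals an exact inner product in a three-dimensional feature space, computed without ever visiting that space.

```python
import numpy as np

def phi(v):
    """Explicit feature map for 2-D input: (x1^2, x2^2, sqrt(2)*x1*x2).
    Its inner products reproduce the polynomial kernel (x . z)^2."""
    return np.array([v[0] ** 2, v[1] ** 2, np.sqrt(2) * v[0] * v[1]])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = phi(x) @ phi(z)   # inner product in the 3-D feature space
via_kernel = (x @ z) ** 2    # same number, computed entirely in 2-D

print(np.isclose(explicit, via_kernel))  # True
```

For kernels like the Gaussian (RBF) kernel, the implicit feature space is genuinely infinite-dimensional — which is what makes the fact that the computation never leaves the original space feel, as Ananthaswamy puts it, like algebra in a dream world.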
Understanding the Mind of the Machine
Ananthaswamy’s central argument is clear: to truly grasp artificial intelligence, one must engage with the mathematics — not just the metaphors. “We need more people in society — journalists, policy makers, science communicators — who understand enough of the math to see what machines are actually doing,” he insists. “They’re not reasoning like we do. They’re doing sophisticated pattern matching.”
To that end, Why Machines Learn acts as a guided map for readers equipped with high-school or early undergraduate-level math — basic calculus, linear algebra, probability, and a touch of optimization theory. These, he argues, are the essential lenses through which one can look “under the hood” of machine learning.
The Empirical Age of AI
Modern AI, he explains, is largely empirical. “People are building things and finding that they work, without really understanding why,” he says. Deep learning systems — the foundation of ChatGPT, image generators, and self-driving algorithms — perform astonishingly well but remain partly mysterious in their mechanics. “To know why these systems work so well, or what their limits are, we need the math. That’s where the real understanding begins.”
A Forgotten History
Yet Ananthaswamy is quick to remind us that AI is not synonymous with deep learning. “Ask anyone today about AI, and they’ll say ‘ChatGPT.’ But the field has a long and rich history — and it’s not all about neural networks.”
He traces that lineage through the decades: from the single-layer neural networks of the 1950s to the k-nearest neighbor algorithms of the 1960s, the probabilistic methods inspired by Bayes’ theorem, and the support vector machines of the 1990s — models that once dominated machine learning before deep neural networks returned to the stage.
His fascination lies not only in the algorithms themselves, but in their stories — in the human creativity, rivalries, and flashes of insight that birthed them. “We understand things better,” he reflects, “when they’re anchored in stories.”
The Bias–Variance Balancing Act
Among the book’s most lucid explanations is the bias–variance trade-off, a cornerstone of machine learning. Too simple a model, and it misses the patterns in data (high bias); too complex, and it memorizes the noise (high variance). The sweet spot — the “Goldilocks zone” — lies in between, where a model generalizes well from what it has seen to what it has not.
But deep learning, Ananthaswamy notes, complicates this classical picture. Modern models contain billions of parameters — more than the data points they train on — and yet, paradoxically, still perform remarkably well. “They should overfit catastrophically,” he says, “and yet they don’t. That’s the terra incognita of AI — the unexplored land where our current mathematical theories begin to break down.”
The Need for Mathematical Citizenship
Ultimately, Why Machines Learn is not a manual for coders but a call for mathematical citizenship — a world where the understanding of equations and proofs is seen not as elitist, but as part of being an informed participant in society’s technological future.
“Machines will not stop learning,” Ananthaswamy reminds us. “The question is whether we — the humans who build them — will stop understanding.”
