The Engineer Who Built the Newsletter the Industry Actually Trusts
Every week, Ananth Packkildurai sits down and reads. Not skims. Reads. He filters 50+ articles down to 15, writes his own analysis, and ships an issue of Data Engineering Weekly to over 50,000 subscribers who have come to trust his curation more than vendor marketing and more than conference keynotes. He has done this without missing a single week since 2020. There is no team. There is no editorial board. There is just Ananth, the data, and the question: what actually matters this week?
The newsletter exists because of a gap Ananth spotted while at Slack in 2018. The data engineering space was fragmenting fast - new tools, new paradigms, new debates - and most of the signal was buried under vendor noise. He had already lived through the chaos firsthand: he was the engineer responsible for keeping Slack's data pipelines alive during the company's explosive growth phase, running Airflow at a scale that most tutorials couldn't prepare you for. The talk he gave about it - "Operating Data Pipeline using Airflow @ Slack" - became required reading for a generation of data engineers trying to understand what it actually looked like to run this stuff in production.
Data Engineering Weekly launched as a quiet Substack in 2020, grew to 2,000 subscribers by early 2021, and crossed 50,000 by early 2026. The secret was the same as the Airflow talk: practitioner writing for practitioners, zero vendor alignment, specific problems examined honestly. In a space where the loudest voices often have a product to sell, that stance turned out to be scarce enough to be genuinely valuable.
"Two hard problems in data engineering: 1. Counting the data. 2. Deleting the data. As the data community, we actively talk about counting but fail to discuss deleting."
- Ananth Packkildurai on X/Twitter

Pipelines, Contracts, and the People Who Break Them
Ananth's career reads like a tour through the places where data engineering actually gets complicated. At Hotcourses and Evolv he learned the basics. At Slack, from 2015 to 2019, he encountered a different class of problem: what do you do when your pipelines are so important that a failure at 2 AM becomes someone's job to fix, and your monitoring is too noisy to tell you what broke first?
The answer he built at Slack was observability infrastructure for data pipelines - the kind of plumbing work that rarely shows up in product announcements but determines whether a data team can sleep through the night. The 2018 Airflow talk documented that work with unusual honesty: on-call rotations, failure modes, what the dashboards actually looked like. It became one of those resources that practitioners share instead of searching for, which is the highest distinction the engineering internet bestows.
Zendesk, from 2019 to 2024, was a different challenge. As Principal Data Engineer on the next-generation customer analytics platform, the scope was larger, the stakeholders more varied, and the architecture decisions had longer half-lives. It was here that Ananth started writing publicly about data catalogs - a field he described as building "the most expensive data integration systems you never intended to build." The piece went small-scale viral in the data community because it said out loud what many practitioners had been thinking: the promise of discovery and governance had been overtaken by the weight of maintenance.
Schemata came out of the same period. Before "data contracts" became a conference track, Ananth shipped an open-source framework for decentralized, domain-driven schema ownership. It supports ProtoBuf, Avro, and dbt - the formats that actual data teams use. The timing mattered: Schemata appeared when the debate was still forming, which meant it shaped how people thought about the problem rather than just answering an existing specification. That's a harder thing to do than it looks.
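The idea behind domain-driven schema ownership can be sketched in a few lines: a schema carries an explicit owning team, and a contract check flags payloads that drift from what was promised. This is a hypothetical illustration of the concept, not Schemata's actual API (the `SchemaContract` name and its fields are invented for this sketch):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SchemaContract:
    """A schema with an explicit owning domain team - the 'contract'
    that consumers can rely on and producers are accountable for."""
    name: str
    owner: str           # the domain team accountable for this schema
    fields: frozenset    # field names the contract guarantees

    def validate(self, event: dict) -> list:
        """Return a list of contract violations for an event payload."""
        missing = self.fields - event.keys()
        return [f"missing field: {f}" for f in sorted(missing)]

# Each domain team declares and owns its own contracts.
checkout = SchemaContract(
    name="order_placed",
    owner="team-checkout",
    fields=frozenset({"order_id", "user_id", "amount"}),
)
```

In practice Schemata expresses this over ProtoBuf, Avro, and dbt definitions rather than ad hoc dictionaries, but the ownership-plus-validation pairing is the core move.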
"The ETL framework was never really designed to capture meaning."
This is Ananth's central thesis for the current era: that the mechanical work of moving data is increasingly automatable by AI, and the irreducible human contribution is semantic - understanding what data means, who owns it, and what contracts govern its use. His "Data Engineering After AI" podcast series launched in 2025 as a direct extension of this argument.
"Data catalogs are the most expensive data integration systems you never intended to build."
- Ananth Packkildurai, "Data Catalog - A Broken Promise"

What He's Actually Arguing
Ananth's writing tends to move against the grain. When the industry was celebrating data catalogs as the solution to data governance, he published a takedown of their operational costs. When data contracts started trending, he already had working code. When AI started generating pipelines, he asked what happens to the people who used to write them - and concluded the answer wasn't "nothing" but "everything shifts toward meaning."
The newsletter reflects this. Each issue is built around a thesis, not a list. Ananth reads across vendor blogs, academic papers, practitioner posts, and conference transcripts, then synthesizes rather than summarizes. The reader experience is less "here are the links" and more "here's how to think about this week in the field."
His open-source work follows the same pattern. Schemata wasn't built to win a popularity contest - it was built to demonstrate a specific idea about how schema ownership should work in distributed data teams. The 261 GitHub stars it accumulated are almost beside the point; the framework influenced how practitioners and vendors alike framed the data contract conversation.
As an angel investor and advisor to early-stage data startups, Ananth brings the same practitioner lens. He's not investing in market categories. He's investing in teams building things he wishes had existed when he was debugging Airflow at Slack at 2 AM. That's a specific and hard-to-fake signal, and founders who've worked with him describe it as unusually useful.