What the module doesn't tell you


“If programmers sang hymns,” David Parnas wrote in December 1972, “some of the most popular would be hymns in praise of modular programming.” He then spent the next six pages demonstrating that virtually everyone doing it was doing it wrong.

The paper, “On the Criteria To Be Used in Decomposing Systems into Modules”, appeared in Communications of the ACM, Vol. 15, No. 12. Parnas was thirty-one years old, on the faculty at Carnegie Mellon University, and mildly — if politely — exasperated. The field had been talking about modules for years. It had been getting the concept exactly backwards.

The problem was how people decided where to draw the lines. The dominant approach was to follow the flowchart: identify the steps the program must perform and assign one module to each step. Parnas showed, with careful precision, that this was nearly always wrong. He took a small but representative program — a KWIC index generator, the kind of keyword-in-context concordance system that librarians used to produce printed indexes — and modularized it both ways. The flowchart approach yielded five modules: an input reader, a circular shifter, an alphabetizer, an output formatter, and a master controller. Each module did its job and handed data to the next. They were clean in the sense that they were separate. They were not clean in the sense that mattered: change the way you stored lines in memory, and you had to touch all of them.
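A rough Python sketch of that flowchart-style split may make the coupling easier to see. The function names and the shared representation (a list of word lists, with each shift stored as a line index and an offset) are assumptions made for illustration, not taken from the paper.

```python
# Flowchart-style KWIC: one function per processing step.
# Assumed shared format: `lines` is a list of word lists,
# and a shift is a (line_index, offset) pair into it.

def read_input(text):
    """Step 1: parse raw text into the shared list-of-word-lists format."""
    return [line.split() for line in text.splitlines() if line.strip()]

def circular_shifts(lines):
    """Step 2: record every rotation of every line as (line_index, offset)."""
    return [(i, k) for i, words in enumerate(lines) for k in range(len(words))]

def alphabetize(lines, shifts):
    """Step 3: sort the shifts; it must index into `lines` to build sort keys."""
    def key(shift):
        i, k = shift
        words = lines[i]
        return [w.lower() for w in words[k:] + words[:k]]
    return sorted(shifts, key=key)

def format_output(lines, shifts):
    """Step 4: render each shift; it also depends on the word-list layout."""
    out = []
    for i, k in shifts:
        words = lines[i]
        out.append(" ".join(words[k:] + words[:k]))
    return out

def kwic(text):
    """Master control: run the steps in flowchart order."""
    lines = read_input(text)
    shifts = circular_shifts(lines)
    return format_output(lines, alphabetize(lines, shifts))
```

Every step function indexes into `lines` directly. If the storage became, say, one packed string with offsets, each of those functions would need editing, which is the kind of ripple the paragraph above describes.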

The second decomposition hid things. It had six modules, with a new one for line storage joining the earlier five, and each was defined not by what it did in the workflow but by what it knew that nothing else was allowed to know. The line storage module knew how lines were stored; nothing outside it needed to. Change the storage format, and you changed exactly one module. Parnas put it plainly: “Every module in the second decomposition is characterized by its knowledge of a design decision which it hides from all others.”
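The same idea can be sketched in Python. This is an illustration under stated assumptions, not Parnas's design: the class names, the four-method interface, and the two storage layouts are hypothetical, and the shifter, alphabetizer, and output modules are collapsed into a single function for brevity.

```python
class LineStorage:
    """Hides how lines are stored; callers use only the four methods."""
    def __init__(self):
        self._lines = []  # the secret: a list of word lists

    def add_line(self, text):
        self._lines.append(text.split())

    def line_count(self):
        return len(self._lines)

    def word_count(self, line):
        return len(self._lines[line])

    def word(self, line, index):
        return self._lines[line][index]


class PackedLineStorage:
    """Same interface, different secret: one string per line plus word offsets."""
    def __init__(self):
        self._text = []    # one space-normalized string per line
        self._starts = []  # per line, the start offset of each word

    def add_line(self, text):
        words = text.split()
        starts, pos = [], 0
        for w in words:
            starts.append(pos)
            pos += len(w) + 1  # skip the word and one separating space
        self._text.append(" ".join(words))
        self._starts.append(starts)

    def line_count(self):
        return len(self._text)

    def word_count(self, line):
        return len(self._starts[line])

    def word(self, line, index):
        start = self._starts[line][index]
        end = self._text[line].find(" ", start)
        return self._text[line][start:] if end == -1 else self._text[line][start:end]


def kwic(text, storage):
    """Shifting, sorting, and output, written only against the interface."""
    for raw in text.splitlines():
        if raw.strip():
            storage.add_line(raw)
    shifts = []
    for i in range(storage.line_count()):
        n = storage.word_count(i)
        for k in range(n):
            shifts.append([storage.word(i, (k + j) % n) for j in range(n)])
    shifts.sort(key=lambda words: [w.lower() for w in words])
    return [" ".join(words) for words in shifts]
```

Calling `kwic(text, LineStorage())` and `kwic(text, PackedLineStorage())` should give the same index. Swapping the storage layout touches one class and nothing in `kwic`.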

Recognition arrived slowly. Parnas received an ACM Best Paper Award in 1979, seven years after the paper appeared, and two Most Influential Paper awards from the International Conference on Software Engineering, in 1991 and again in 1995. The field, evidently, was a slow reader.

Fred Brooks had been one of the slow ones. In the first edition of The Mythical Man-Month, among the most widely read books in software engineering, Brooks had been skeptical of information hiding. He retracted that position in the book’s twentieth-anniversary edition, in a line quoted by Steve McConnell: “Parnas was right, and I was wrong about information hiding.” In a profession where admissions of error are rarer than correct estimates, this registered.

What Parnas had named was a principle the field half-knew but hadn’t been able to say. A module wasn’t just a bundle of related code — it was the custodian of a secret. The secret was a design decision likely to change. The module’s job was to absorb those changes without broadcasting them outward.

Every object with private fields is working from that blueprint. Every API that specifies what you can call but not how it answers is enforcing it. Every microservice that owns its own database and refuses to share is, knowingly or not, running Parnas’s six-module KWIC experiment at scale.

Fifty years later, the secret is still being kept.

Sources

On the Criteria To Be Used in Decomposing Systems into Modules — ACM Digital Library — original December 1972 CACM paper; publication details, the KWIC example, and core quotes.
David Parnas — Wikipedia — biographical details, academic affiliations, ACM and ICSE award dates.
Missing in Action: Information Hiding — Steve McConnell — Fred Brooks’s retraction and the broader context on why information hiding faded from textbooks.
On the criteria to be used in decomposing systems into modules — The Morning Paper — detailed analysis of both KWIC decompositions and their implications.
