9 extraordinary documents every developer should read

Software development history is rich with "Eureka!" moments that caught the world by surprise. These papers define a century (nearly) of innovation that still shapes programming today.

Contributor, InfoWorld


There are moments that expand what we think is possible in software development, and thus change the fabric of everything we do as developers. Certain historic documents capture the most crucial paradigm shifts in computing technology, and they are priceless. This article looks back over the past century (nearly) of software development, encoded in papers that every developer should read.

9 defining papers in the history of software development

On Computable Numbers, with an Application to the Entscheidungsproblem
First Draft of a Report on the EDVAC
Specifications for the IBM Mathematical FORmula TRANSlating System, FORTRAN
Go To Statement Considered Harmful
New Directions in Cryptography
The GNU Manifesto
Architectural Styles and the Design of Network-based Software Architectures
Bitcoin: A Peer-to-Peer Electronic Cash System
TensorFlow: A System for Large-Scale Machine Learning

1. Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem (1936)

Here is the archetype of a paradigmatic document. Turing's writing has the character of a mind exploring on paper an uncertain terrain, and finding the landmarks to develop a map. What's more, this particular map has served us well for almost a hundred years.

Turing’s paper is readable, with an almost narrative flair, at least for a technical paper. It asks hard questions about what makes a number computable and delves into some tricky mathematics. But the general model, a limitless series of squares on a tape with a “head” (pointer) that moves along it one square at a time, reading and writing symbols, is astonishing even today. Turing describes the essence of the whole world of information machines that followed.

The elegance of the Turing machine idea is in its ability to render math into computing and computing into math. It remains a useful model for describing the complexity of systems.
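To make the tape-and-head model concrete, here is a minimal sketch of a one-tape machine in Python. It is not Turing's own notation: the function name run_turing_machine, the rule-table format, and the binary-increment rules are all assumptions chosen for illustration.

```python
from collections import defaultdict

def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a one-tape machine and return the final tape contents as a string."""
    # The tape is unbounded in both directions; unvisited squares read as blank.
    cells = defaultdict(lambda: blank, enumerate(tape))
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells[head]
        # Each rule maps (state, symbol read) to (symbol to write, move, next state).
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(blank)

# Hypothetical rule table: add 1 to a binary number written on the tape.
INCREMENT = {
    ("start", "0"): ("0", "R", "start"),   # scan right to the end of the number
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),   # step back onto the last digit
    ("carry", "1"): ("0", "L", "carry"),   # 1 + 1 = 0, carry continues left
    ("carry", "0"): ("1", "L", "halt"),    # absorb the carry and stop
    ("carry", "_"): ("1", "L", "halt"),    # ran off the left edge: new digit
}

print(run_turing_machine(INCREMENT, "1011"))  # 1100
```

Everything the machine "knows" lives in the rule table and the current state, which is the point of the model: arithmetic becomes a finite list of read, write, and move steps.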

On Computable Numbers is a must-read on many levels, including as a continuation of Gödel's work on incompleteness. Just the unveiling of the tape-and-machine idea makes it worthwhile.

2. John von Neumann: First Draft of a Report on the EDVAC (1945)

Von Neumann’s proposal for the EDVAC (Electronic Discrete Variable Automatic Computer) architecture is the kind of breakthrough that might make you think, at first, “that wasn’t already obvious?”

What wasn’t obvious was that a computer’s memory could store both data and instructions, together. In other words, memory could hold information that was also executable. Beyond the core idea, though, is the sense of a writer defining what was possible for machines at the time. It's an enormous leap from Turing’s largely philo-mathematical discussion to von Neumann’s practical discussion of information “magnetically impressed on steel tape or wire.”
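The stored-program idea can be sketched in a few lines of Python. This is an illustration only: the four opcodes (LOAD, ADD, STORE, HALT) and the memory layout are hypothetical and are not taken from the EDVAC report.

```python
def run(memory):
    """Execute a program that lives in the same list as the data it works on."""
    acc = 0   # accumulator register
    pc = 0    # program counter: an address into the shared memory
    while True:
        op, arg = memory[pc]          # fetch the next instruction from memory
        pc += 1
        if op == "LOAD":
            acc = memory[arg]         # read a data cell
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc         # write a data cell
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {op!r}")

# One memory holds both the instructions (cells 0-3) and the data (cells 5-7).
memory = [
    ("LOAD", 5),    # 0: acc = memory[5]
    ("ADD", 6),     # 1: acc += memory[6]
    ("STORE", 7),   # 2: memory[7] = acc
    ("HALT", 0),    # 3: stop
    0,              # 4: unused
    2,              # 5: first operand
    3,              # 6: second operand
    0,              # 7: result goes here
]

run(memory)
print(memory[7])  # 5
```

Because the instructions are ordinary memory cells, a STORE aimed at cells 0 through 3 would rewrite the program itself, which is exactly what makes memory contents "also executable."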

This work has all kinds of interesting thinking going on, including ideas about error handling in computation: “The device may recognize the most frequent malfunctions automatically, indicate their presence and location by externally visible signs, and then stop.” Von Neumann's paper stands right at the gateway of modern computers, describing in a half-real, half-speculative way the nature of the devices we use today. That's why the general architecture of computers is still known as von Neumann architecture.

The von Neumann paper asks what the character of a general computer would be, as it “applies to the physical device as well as to the arithmetical and logical arrangements which govern its functioning.” Von Neumann's answer was an outline of the modern digital computer.

3. John Backus et al.: Specifications for the IBM Mathematical FORmula TRANSlating System, FORTRAN (1954)

Although the FORTRAN specification was not published publicly, it exerted a strong influence over language design and software in general.

FORTRAN, now an ancient progenitor among programming languages, was a breakthrough in higher-level languages for its time. It was the first truly general-purpose language.

The significance of FORTRAN becomes clear when you remember that it had been only 18 years since Turing imagined a computer in 1936.

The FORTRAN specification gives a great sense of the moment and helped to create a model that language designers have adopted since. It captures the burgeoning sense of what was then just becoming possible with hardware and software.

4. Edsger Dijkstra: Go To Statement Considered Harmful (1968)

Aside from giving us the “considered harmful” meme, Edsger Dijkstra’s 1968 paper not only identifies the superiority of loops and conditional control flows over the hard-to-follow go-to statement, but instigates a new way of thinking and talking about the quality of code.

Dijkstra’s work gives us a whole milieu, an attitude toward programming, and a way of looking at the discipline in one and a half pages.

Consider this:

For that reason we should do (as wise programmers aware of our limitations) our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.

Here, we are thinking about not only the structure of software at write- and runtimes, but the character of the work and our role as human beings in it. This gives us a glimpse into the culture of software engineering as a passionate endeavor; a culture which underpins everything we do in software today.

Dijkstra’s short treatise also helped to usher in the generation of higher-order languages, bringing us one step closer to the programming languages we use today.

5. Diffie-Hellman: New Directions in Cryptography (1976)

The Diffie-Hellman paper, by Whitfield Diffie and Martin E. Hellman, stands out in three remarkable ways:

The proposal seems impossible at first.
The solution is elegant and easy to understand.
It changed the course of history.

If you already know how the Diffie–Hellman key exchange works, then you know why this paper is on our list. The discovery and invention of public-key/asymmetric encryption laid the groundwork for all secure communications on the internet (like HTTPS), and was a foundation for the Bitcoin white paper 32 years later.
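If you have not seen the key exchange before, the following toy sketch shows the core trick: two parties publish values derived from their secrets and still arrive at the same shared key. The parameters here (the Mersenne prime 2**127 - 1 as modulus and 3 as base) and the helper names are assumptions picked for illustration; they are not vetted for security and are not the parameters any real protocol uses.

```python
import secrets

# Toy public parameters, known to everyone including eavesdroppers.
P = 2**127 - 1   # a prime modulus (assumption: illustrative, not a vetted group)
G = 3            # public base

def make_keypair():
    """Pick a random private exponent and derive the matching public value."""
    private = secrets.randbelow(P - 3) + 2   # uniform in [2, P - 2]
    public = pow(G, private, P)              # G**private mod P
    return private, public

def shared_secret(their_public, my_private):
    """Combine the other party's public value with our own private exponent."""
    return pow(their_public, my_private, P)

alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Only the public values cross the wire, yet both sides compute the same key,
# because (G**a)**b and (G**b)**a are equal modulo P.
alice_key = shared_secret(bob_public, alice_private)
bob_key = shared_secret(alice_public, bob_private)
print(alice_key == bob_key)  # True
```

An eavesdropper sees P, G, and both public values, but recovering either private exponent from them is the discrete logarithm problem, which is believed to be hard for well-chosen parameters.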

When it landed, New Directions in Cryptography set off an epic battle between open communication and government espionage agencies like the NSA. It was an extraordinary moment in software, and history in general, and we have it in writing. The authors also seemed to understand the radical nature of their proposal—after all, the paper's opening words were: “We stand today on the brink of a revolution in cryptography.”

6. Richard Stallman: The GNU Manifesto (1985)

The GNU Manifesto is, in a sense, the manifesto of open source software. It is also a bold claim to the for-the-love-of-it programming ethos that many developers embrace today:

GNU, which stands for Gnu's Not Unix, is the name for the complete Unix-compatible software system which I am writing so that I can give it away free to everyone who can use it. Several other volunteers are helping me. Contributions of time, money, programs and equipment are greatly needed.

Here is the basic open source premise. (Note the clever, self-referential name, another OSS trope.) The paper goes on to describe a bold project—a generally available, quality operating system for anyone to use—and backs it up with a philosophical discussion.

Highly readable and amusing (even a bit smart-ass), the manifesto argues against the closed-source, pay-for-license model that dominated the industry at the time. As history has shown, unbelievably, this was one instance where the plucky rebels won. Open source software is everywhere today, and programming for the love of it, while also making a living from it, is an entire way of life.

The GNU Manifesto is still fresh enough today that it reads like it could have been written for a GitHub project in 2023. It is surely the most entertaining of the papers on this list.

7. Roy Fielding: Architectural Styles and the Design of Network-based Software Architectures (2000)

You'll notice I'm fudging a bit here on the dates. Although Fielding’s paper introducing the REST architectural style landed in 2000, it summarized lessons learned in the distributed programming environment of the 1990s, then proposed a way forward. In this regard, I believe it can stand for two decades of software development history.

This paper gathers up everything developers learned from the early internet and offers a solution to its most pressing problems. REST is important because it takes a well-aimed stab at the heart of modern software complexity. That's why it has remained the touchstone for architectural decision-making for two decades.

Fielding's discussion of complexity and design as it applies to web architecture is apt reading for developers today.

8. Satoshi Nakamoto: Bitcoin: A Peer-to-Peer Electronic Cash System (2008)

The now-famous Nakamoto paper was written by a person, group of people, or entity unknown. It draws together all the prior art in digital currencies and summarizes a solution to their main problems. In particular, the Bitcoin paper addresses the double-spend problem.

This is a short, approachable document. It does a great job of outlining the issue of double spending in plain language, then offers a conceptual response, and then digs into some of the implementation details of a solution.
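The paper's core mechanism, a chain of blocks where each block commits to the one before it and costs work to produce, can be sketched as follows. This is a heavy simplification under stated assumptions: real Bitcoin hashes a binary block header twice with SHA-256 and compares it against a numeric target, while this sketch hashes a text string once and counts leading zero hex digits. The function names and sample payments are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(prev_hash, payload, nonce):
    """Hash a block's contents together with the hash of the previous block."""
    data = f"{prev_hash}|{payload}|{nonce}".encode()
    return hashlib.sha256(data).hexdigest()

def mine(prev_hash, payload, difficulty=3):
    """Try nonces until the hash starts with `difficulty` zero hex digits."""
    nonce = 0
    while True:
        digest = block_hash(prev_hash, payload, nonce)
        if digest.startswith("0" * difficulty):
            return nonce, digest
        nonce += 1

def is_valid(chain, difficulty=3):
    """Recompute every hash and check the proof of work and the links."""
    prev = GENESIS
    for payload, nonce, digest in chain:
        if block_hash(prev, payload, nonce) != digest:
            return False
        if not digest.startswith("0" * difficulty):
            return False
        prev = digest
    return True

chain = []
prev = GENESIS
for payload in ["alice pays bob 1", "bob pays carol 1"]:
    nonce, digest = mine(prev, payload)
    chain.append((payload, nonce, digest))
    prev = digest

print(is_valid(chain))  # True

# Rewriting an earlier payment breaks its stored hash, so the chain is rejected
# unless the attacker redoes the work for that block and every block after it.
tampered = [("alice pays mallory 1",) + chain[0][1:]] + chain[1:]
print(is_valid(tampered))  # False
```

The sketch leaves out the network entirely. In the paper, double spending is prevented because honest nodes extend the longest valid chain, so a rewritten history has to outpace everyone else's accumulated work.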

I don’t need to tell you about the paper's impact, or the furor it has unleashed since.

Beyond the simple notion of a currency like Bitcoin, the paper suggested an engine that could leverage cryptography in producing distributed virtual machines like Ethereum.

The Bitcoin paper is a wonderful example of how to present a simple, clean solution to a seemingly bewildering mess of complexity.

9. Martín Abadi et al.: TensorFlow: A System for Large-Scale Machine Learning (2015)

If you are looking for a significant milestone on the way to modern large language model (LLM) AI systems, the TensorFlow white paper is it. It is relevant as a discussion of a generalized machine learning framework, and it introduces TensorFlow, a flagship AI platform.

Although this paper dives freely into the complex end of the machine-learning pool, it also comes out with succinct distillations such as, “Given a sequence of words, a language model predicts the most probable next word.” That is a concise summary of what is really happening in modern chat AI.
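That one-sentence definition can be illustrated without any neural network at all. The sketch below is a plain bigram counter, not TensorFlow and not the paper's method: it simply predicts whichever word most often followed the given word in a tiny made-up corpus. The function names and the corpus are assumptions for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)

print(predict_next(model, "the"))  # cat
print(predict_next(model, "sat"))  # on
```

Modern systems replace the lookup table with a trained network that conditions on a long sequence rather than a single word, but the question being answered is the same one the quote describes.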

This paper, by Martín Abadi and a host of contributors too extensive to list, focuses on the specifics of TensorFlow, especially in making a more generalized AI platform. In the process, it provides an excellent, high-level tour of the state of the art in machine learning. It makes great reading for the ML-curious and for those looking for a plain-language entry into a deeper understanding of the field.

Conclusion

Perhaps the most valuable takeaway from this tour of brilliance is that there is always room for new ideas and approaches. Right now, someone, somewhere, is working on a way of doing things that will shake up the world of software development. Maybe it's you, with a paper that could wind up being #10 on this list. Just don’t be too quick to dismiss wild ideas—including your own.


Matthew Tyson is a founder of Dark Horse Group, Inc. He believes in people-first technology. When not playing guitar, Matt explores the backcountry and the philosophical hinterlands. He has written for JavaWorld and InfoWorld since 2007.