The Structure of Information Revolutions

In 1962, philosopher Thomas Kuhn published his landmark work The Structure of Scientific Revolutions, forever changing our view of the history of science. Kuhn showed that scientific revolutions follow a certain pattern, in which periods of conceptual continuity in normal science are interrupted by periods of revolutionary science.

In this article I’ll argue that the same pattern applies to information revolutions – eras in which the volume of information humans needed to manage grew dramatically in a short time. These information revolutions were similar to scientific revolutions, introducing whole new ways of thinking that were not possible before.

In order to survive and continue advancing through each of these periods, two things had to be invented:

  • A new paradigm for our relationship to information
  • A new kind of tool that manifested this new paradigm

These two inventions were self-reinforcing: the new paradigm allowed people to conceive of the new tool, and the new tool promoted the spread of the new paradigm.

I previously summarized the story of 60 of these inventions through 45,000 years of human history. These tools were not just slightly new and improved versions of something that existed before. Each one represented a whole new kind of tool that fundamentally expanded our ability to make use of information.

We are in the midst of an information explosion right now, perhaps the most challenging one that humanity has ever faced. 

The growth in information is no longer constrained by physical materials or space. And we aren’t just facing an information explosion at the level of society. Each and every one of us now has to individually handle information of unprecedented scale and complexity just to manage our daily lives.

To survive this explosion, we need to make it through yet another information revolution. By understanding the 7 stages that information revolutions typically follow, we’ll be able to see them when they inevitably arrive in our own era.

The 7 stages are:

  • Externalization
  • Abstraction
  • Centralization
  • Atomization
  • Scaling
  • Standardization
  • Networking

Let’s examine each of them in greater detail.

Externalization

One of the first things that humans do in an information explosion is “externalize” what they know: they transfer information stored in their fragile biological brains to more durable forms.

One of the earliest examples of externalization was the beads and pendants used in the Ice Age. They were made of stone, shells, or ivory, but were much more than trinkets. They allowed people to imbue physical objects with emotion, status, and significance.

They externalized their knowledge of social relationships and tribal alliances into durable physical artifacts, which could be carried around, traded, and gifted. This allowed early humans to form much wider networks of social trust than was possible with only direct relationships. These networks took the form of tribes, clans, federations, nations, and religions.

Many thousands of years later, in the 20th century, engineer and inventor Vannevar Bush described his “Memex,” a research tool for creating connections and associations between documents.

In the most famous passage from his landmark 1945 essay “As We May Think,” Bush explained the significance of creating these connections outside the researcher’s head: “The inheritance from the master becomes, not only his additions to the world’s record, but for his disciples the entire scaffolding by which they were erected.” It was only when this scaffolding was manifested outside the master’s head that it became available for others to build upon.

Externalizing our ideas makes them into tangible building blocks that others can use and incorporate into their own work. Only then can our ideas extend beyond our circle of relationships and beyond our lifespan. 

Abstraction

As more and more information gets externalized, it starts to pile up. It soon becomes too time-consuming to read an entire document from beginning to end just to find a single piece of information.

This is where abstraction comes into play. Abstraction involves drawing out common characteristics from a collection of items, and organizing them according to these characteristics. This could include organizing written works by author or subject, or cataloguing a collection of photos by theme or era.

The earliest examples of abstraction closely followed the invention of writing. The earliest known document “abstract” was found in a Hittite settlement called Hattusas near modern-day Ankara. The abstract contained keywords to help scribes preview the content of tablets in the collection, and call numbers to help them find the tablet they were looking for. The abstract helped readers quickly get the gist of a tablet, to decide whether it met their needs before diving in.

In a later era, the keeper of the Great Library at Alexandria was one of the first to systematically abstract the “meta-data” (or data about data) for the massive collection of books. He assigned works to different rooms based on their subject matter, and then attached small tags to each scroll describing the work’s title, author, and subject. Browsers could visit the room most related to their topic of interest, and then read the tags for a summary of what was in each scroll without having to read through each one.

Abstraction allowed collections of written works to grow without a proportional increase in the time required to manage and reference them. One small label could now reference anything from an individual written work to an entire genre. This allowed stockpiles of information to grow even faster, without sacrificing our ability to find what we needed.
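The principle behind those Alexandrian scroll tags can be sketched in a few lines of code: reduce each work to a small record of shared characteristics, and you can search the metadata instead of the works themselves. The titles and tags below are illustrative, not a real catalogue.

```python
# A minimal sketch of abstraction: each work is reduced to a small
# "tag" of metadata, so a collection can be searched without reading
# every scroll. All entries are illustrative.
scrolls = [
    {"title": "On the Heavens", "author": "Aristotle", "subject": "cosmology"},
    {"title": "Elements", "author": "Euclid", "subject": "geometry"},
    {"title": "On Floating Bodies", "author": "Archimedes", "subject": "geometry"},
]

def works_on(subject, collection):
    """Return the titles of all works tagged with the given subject."""
    return [s["title"] for s in collection if s["subject"] == subject]

print(works_on("geometry", scrolls))  # ['Elements', 'On Floating Bodies']
```

The search touches only the tags; the full text of each work never has to be read, which is exactly what lets a collection outgrow the time available to manage it.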

Centralization

Eventually, even abstraction starts to hit limits. When you have thousands of documents, it becomes a challenge to remember which information has even been recorded, where it is located, and how to find it. Centralization collects all the meta-data and captures it in a single, authoritative reference tool, such as a catalogue or bibliography.

The first document index, a tablet with a list of other tablets, was found in Ebla, Syria, dating to 2,300 B.C.E. Another example comes from the poet Callimachus, another Alexandria librarian, who was the first to create a separate catalogue of the collection. It was a comprehensive bibliography, organized by author, known as the Pinakes. It filled 120 scrolls despite the fact that he only finished 20% of the process.

Centralization is powerful because it creates a single point of reference for a potentially vast collection of documents. Readers have multiple ways of finding what they’re looking for, since these reference tools can themselves be sorted, organized, and cross-referenced. They can be carried around, copied, or annotated with notes as a researcher makes his way through a topic.

Atomization

As information continues to pile up, and we have less and less time to search through piles of books for the answers we need, a new force starts to come into play: atomization. Atomization refers to breaking down information into smaller, more discrete chunks.

One of the earliest examples of atomization was the popularization of the book itself. Originally made up of leafed pages of parchment and vellum bound between thick, hard covers, books allowed an incredible new capability: random access.

Instead of having to read a 12-foot scroll from beginning to end, a scholar could go straight to the page they needed. Chapters and page numbers made it possible to reference a specific idea within a book, saving time and effort.

Atomization also occurred within language itself. The rebirth of written Greek culture in the 9th century B.C.E. was powered by a new form of writing adopted from the Mediterranean Phoenician civilization: a phonetic alphabet. By breaking down complex symbols into letters that signified individual sounds, it took more letters to write something. But it was possible to express many more combinations of things, and writing became open to more people because it was easier to learn.

American engineer and inventor Doug Engelbart stumbled upon the same need in the 20th century. His goal was to augment the human intellect, by “…increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems.” He quickly realized that in order to keep humans at the center of the intellectual process, he would need to find a way to overcome their limited working memory. He sought to break down information into “atomized nuggets” that could be reconstituted in endless possible configurations.

Atomization breaks information free from large, static documents like scrolls and books. It makes each discrete chunk of knowledge into an asset that can be referenced and incorporated into different works. This cross-fertilization produces a whole new wave of insights and connections.
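The shift from scroll to book can be sketched as a data-structure change: from sequential access to random access. The passage contents below are purely illustrative.

```python
# Finding a passage in a scroll means scanning from the beginning;
# a book's page numbers allow a direct jump. Contents are illustrative.
scroll = "Introduction. On the nature of atoms. Conclusion."

def find_in_scroll(text, phrase):
    """Sequential access: scan from the start until the phrase appears."""
    return text.index(phrase)  # position found only by scanning

book = {1: "Introduction", 2: "On the nature of atoms", 3: "Conclusion"}

def find_in_book(pages, page_number):
    """Random access: jump straight to the cited page."""
    return pages[page_number]

print(find_in_book(book, 2))  # On the nature of atoms
```

Once a chunk has a stable address – a page number here, a key in the dictionary – it becomes citable and reusable on its own, independent of the document that contains it.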

Scaling

The potential of information has always been closely tied to the specific physical medium on which it is stored. At some point in an information explosion, a simpler and cheaper storage medium is always needed.

Paper was first invented in the second century B.C.E. in China, and found its way to Medieval Europe by the 13th century. It replaced difficult materials like parchment and vellum, fueling an explosion of popular texts written in the vernacular: travel journals, poems, romances, and lives of saints. An account of Christopher Columbus’s travels in the Americas was a best-seller that found its way across Europe.

The production of cheap paper made it possible to print “broadsheets” – political posters that could be printed quickly on demand and posted all over a city to announce meetings or movements. Hitting their stride in Reformation Europe, broadsheets were used to spread ideas and messages that weren’t sanctioned by the state or Catholic Church.

Martin Luther could write out his 95 theses by hand, but no one would ever have heard of them if those grievances couldn’t be mass printed. Making information cheaper to reproduce led to the spread of all kinds of ideas whose effects we’re still feeling today.

Standardization

As information is mass produced and spreads, people in different regions need standard conventions for how to record, present, and organize it.

Alcuin of York was an English scholar, clergyman, and poet who was called in 782 by Charlemagne to establish a great imperial library. As part of that effort he established a central repository for distributing templates for manuscripts. These templates made it easier to copy and distribute standardized texts across the growing empire. This was important because it ensured that the same books with the same information would be taught everywhere, facilitating the spread of common knowledge. 

A similar need arose with the invention of the printing press in the 15th century in Europe. Scholastic books until then had been printed in black gothic letters, which varied from region to region, and vernacular works used different versions of bastarda gothic type. A new typeface called “littera antiqua,” inspired by ancient Roman letters, took hold among humanists as a more enlightened, modern script. Along with other conventions like movable type, title and chapter pages, and colophons, it allowed something printed in one part of Europe to be understood everywhere.

The adoption of Roman type during the Reformation foreshadowed the invention of the ASCII character encoding standard in the 20th century. ASCII was developed from telegraph code, and formed a common standard for written characters adopted by virtually all electronic communication. 
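What a character-encoding standard buys us can be shown in a few lines of code: ASCII fixes a 7-bit number for every character, so any two machines agree on what a byte means.

```python
# ASCII assigns each character a fixed numeric code, so any machine
# can reproduce the same text from the same bytes.
message = "Hi"

codes = [ord(ch) for ch in message]
print(codes)  # [72, 105]

# The mapping is reversible and universal:
assert bytes(codes).decode("ascii") == "Hi"
assert chr(65) == "A"  # code 65 is always the letter A
```

Because every machine interprets code 65 as “A,” text written on one system is legible on any other – the same interoperability that standardized type brought to print.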

The value of information is a function of how many people have it, and how effectively they’re able to put it to use. By standardizing on common rules, the number of people who are able to communicate and collaborate explodes, further fueling the explosion of information.

Networking

Once information has been externalized, abstracted, centralized, atomized, scaled, and standardized, the stage is set for it to be networked. This involves connecting each piece of information into a network, where it can form new connections and associations.

This networking didn’t start with modern digital technology. We’ve seen it many times before. In Medieval Europe, scholarly networks of monastic librarians sprang up to share bibliographical works. They started to borrow rare works from each other, and even borrowed ideas for how to organize their collections.

The printing press also allowed the formation of “textual communities” around written works, especially anti-establishment ones. Ideas no longer depended on institutional networks like the state and the church to spread. They could be carried from person to person directly, through social networks.

But there’s a reason that the information age is sometimes called “the age of the network.” The above examples show that social networks often sprang up to share and exchange knowledge. But we’ve had to wait for digital technology, with its algorithms, magnetic memory, and communication protocols, to allow us to network knowledge itself.

In the 1930s, visionary Indian librarian Shiyali Ramamrita Ranganathan proposed an entirely new approach to cataloguing. He called it “faceted classification.” Instead of a single, top-down, deterministic list of subjects, such as an index, it used descriptions of works based on multivalent characteristics, or “facets”: personality, matter, energy, space, and time. He believed that these five facets could describe any piece of information in endlessly reconfigurable ways.

Faceted classification was highly influential as an ideal conceptual system, but its complexity made it too hard to implement. Decades later, with the invention of modern computers, faceted classification made a big impact on computer science. Relational databases, which are able to connect chunks of data according to multiple criteria, have finally allowed us to implement Ranganathan’s vision.

Now we can type in any characteristic of a book – its title, author, year of publication, subject, ISBN, etc. – and find our way to it based on what we know. We can also easily see works sorted and grouped according to these criteria with just the push of a few buttons.
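A relational database makes this concrete: any facet of a work can serve as the entry point for a search. Here is a minimal sketch using Python’s built-in sqlite3 module; the schema and catalogue entries are illustrative.

```python
import sqlite3

# A toy relational catalogue: every column is a "facet" that can be
# combined freely in a query. Schema and data are illustrative.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE books
              (title TEXT, author TEXT, year INTEGER, subject TEXT)""")
db.executemany("INSERT INTO books VALUES (?, ?, ?, ?)", [
    ("Systema Naturae", "Linnaeus", 1735, "natural history"),
    ("Histoire Naturelle", "Buffon", 1749, "natural history"),
    ("Colon Classification", "Ranganathan", 1933, "library science"),
])

# Find a work by any combination of facets -- here, subject and era:
rows = db.execute("""SELECT title FROM books
                     WHERE subject = ? AND year < ?""",
                  ("natural history", 1740)).fetchall()
print(rows)  # [('Systema Naturae',)]
```

Unlike a single top-down index, nothing privileges one facet over another: the same table answers queries by author, by year, by subject, or by any combination of them.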

A Lesson for Technological Evangelists

There is one overarching lesson in this history for anyone seeking to invent a new tool for helping people manage the current information explosion: simple, user-friendly tools always win over sophisticated, more complex ones.

Again and again, when two competing tools went head to head, it was the one that gave the greatest number of people the most useful new capabilities with the least amount of cognitive burden that won.

Buffon vs. Linnaeus

In the 18th century, two European naturalists offered competing visions for how nature should be classified and organized. The French naturalist Georges-Louis Leclerc, Count of Buffon, proposed a system that was more accurate, taking into account the evolution of species, for example. He profoundly influenced modern science, including Darwin’s theory of natural selection. But the flexibility of his system made it too complex for most people to use.

In 1735, a young Swedish botanist named Carolus Linnaeus published his Systema Naturae, proposing a universal classification of all life. His system was not especially sophisticated, and to a certain extent even oversimplified the complexity of nature. But it was easy to learn, had precise rules, and supported a division of labor among scientists. The Linnaean system is what we still use to this day, because it was the simplest tool that got the job done.

Dewey vs. Cutter

A couple of centuries later, a similar battle raged between the Dewey Decimal System and the Expansive Classification System, created by American cataloguers Melvil Dewey and Charles A. Cutter, respectively. Cutter’s ECS had elaborate multi-tiered subject schemes describing works by author, title, and subject. His system was more advanced in many ways, recognizing that patrons often came to the library with nothing more than a vague question in mind. But it was difficult to use for librarians who weren’t highly trained.

Dewey’s system, although it lacked some of this sophistication, was easy to learn and easy to use in the day to day work of the small town librarians of America. Often criticized for oversimplifying categories, it won because of its simplicity and usability. This simplicity has allowed it to survive the transition to electronic catalogues in the modern era.

Centralized hypertext systems vs. the World Wide Web

Fast forward to the modern internet. This battle between simplicity and sophistication played out once again. A generation of hypertext systems had demonstrated many advanced capabilities in the 1970s and 1980s. One of the most advanced was IRIS Intermedia, a centralized hypertext system that allowed users to create collections of interlinked original materials online, i.e. a “web” of knowledge about any topic.

Intermedia stored all links in a central database, which allowed for two-way hyperlinks and ensured that links wouldn’t be “broken” by updates. Intermedia pioneered hyperlinks within email, collaborative authoring tools (like Google Docs), and “live” objects that could be updated dynamically across applications. Such systems laid the foundations for the modern internet, and many of their capabilities have yet to be matched.

But in the end, it was Tim Berners-Lee’s World Wide Web that won, even though it was less capable. Intermedia and its cousins couldn’t scale because they were too centralized. The World Wide Web, on the other hand, was so easy to access that it rapidly grew to millions of users.

Our mission in developing tools for information management is to help people continue to externalize, abstract, centralize, atomize, scale, standardize, and network the knowledge they need to grow and to thrive. And to do so with the simplest, most easy to use tool possible.

Thanks to Abby Paredes, Courtney Blanche, and Sean McLaren for their feedback and suggestions
