What I learned from Edward Tufte

I recently attended Edward Tufte’s seminar on presentations and data graphic design.  This blog post covers the essential points I took away from the lecture.

On Space vs. Time

When one has a great deal of content to convey to an audience, it cannot be blurted out all at once and in the same spot: it must be spread out over either space or time (or both).  A slide deck spreads out content in time, using the same space over and over. A document or web page shows everything at once, spread out in space. Documents play to human beings’ strengths, while slide decks play to their weaknesses.

Humans have a natural ability to visually consume a complex field of data by instantly shifting modes from high-level scanning to detailed inspection and back again.  This makes it possible, and even quite easy, for a person to scan through a long document, identify sections of interest, and dive into those sections for a closer look. The same is true for presenting many hundreds or thousands of data points in a graph; the viewer can rapidly scan the overall structure of the data and zoom in on particularly interesting details.  When information is arranged with spatial adjacency, it is easy for people to compare and contrast, scan and examine, and learn efficiently.  This leads to a very high throughput of data transfer from the author to the audience.

Humans, on the other hand, have a limited ability to precisely remember detailed data for any length of time.  They also have a limited attention span, particularly when presented with data that is either confusing or boring to them.  This makes it very difficult for a person to hold the context required to compare data presented sequentially over a series of slides.  The constraints of a slide presentation (i.e., limited space per slide, which must be legible at a distance) mean that the content is broken into tiny chunks and widely distributed over time as the presenter talks through each point, often reiterating the content on each slide slowly (relative to the speed of reading).  It is impossible for any individual listener to speed up or slow down the presentation to suit their needs, or to scan ahead to answer a question, or to skip back to revisit an unclear point. This leads to a very low throughput of relevant data transfer from the author to the audience.

It is highly preferable, therefore, that information displays maximize the “spatial adjacency” of material.  A visually dense presentation with varying levels of headers, “data paragraphs”, and whitespace allows viewers to readily identify and select from large blocks of content at a single glance.

On Giving Presentations

For presentations of virtually any scale, it is far better to provide a narrative document (like this one) instead of a slide deck.  The document should be 2–6 pages long, with all the information to be discussed integrated into a single flow. Tables of numbers, charts, graphs, pictures, etc. should all be integrated with the narrative description of the subject matter.  In all cases, references to source materials, primary sources, etc. should be included. The document should be written to be a permanent record of what was discussed, and therefore should be complete and self-contained.

For the actual presentation, the meeting should begin by handing out copies of the document to each attendee.  This is followed by a study hall session long enough for everyone to carefully read the document and make note of any questions, thoughts, or disagreements.  Once everyone is ready, the remainder of the meeting is not spent rehashing what was just read, but is instead spent discussing the questions, thoughts, and disagreements each person noted while reading.

The primary advantages of this style of meeting are:

  1. People can read through the document at their own pace and to serve their own needs.  Sections irrelevant to a certain person can be skimmed, while sections of intense interest can be lingered over carefully.
  2. People can easily jump ahead to see if a question is answered later, or skip back for extra clarity on a point they may have misunderstood.
  3. People read much more quickly than material can be presented aloud, so meetings can often be shorter.
  4. The document serves as a permanent record of what was presented, which everyone can take away with them to refresh their memories later.
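
To put a rough number on the third point, here is a back-of-the-envelope sketch in Python. The page count, words per page, and both rates are illustrative assumptions of mine, not figures from the seminar; plug in your own.

```python
def minutes_needed(pages, words_per_page, words_per_minute):
    """Minutes required to get through a document at a given pace."""
    return pages * words_per_page / words_per_minute


# All four values below are assumptions chosen for illustration only.
PAGES = 4               # within the suggested 2-6 page range
WORDS_PER_PAGE = 500    # assumed density of a narrative page
READING_WPM = 250       # assumed silent reading rate
SPEAKING_WPM = 125      # assumed rate when presenting aloud

reading = minutes_needed(PAGES, WORDS_PER_PAGE, READING_WPM)
speaking = minutes_needed(PAGES, WORDS_PER_PAGE, SPEAKING_WPM)

print(f"silent reading: {reading:.0f} min, presented aloud: {speaking:.0f} min")
```

Under these assumed numbers the study hall takes about half as long as talking through the same material, which leaves the rest of the meeting free for discussion.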

On Judging an Information Display

“The purpose of information display is to assist people in reasoning about the content.”

Edward Tufte

When judging an information display (e.g., charts, graphs, tables, etc.), people judge both the quality of the data and the reliability of the presenter.  To establish both, apply these six principles whether you are making or consuming an information display.

show comparisons, contrasts, and differences

The information display should be deliberately designed to make it easy to compare various data sets or points within each data set.  The author should be thoroughly conversant with the data, and deliberately highlight those points of contrast which are most surprising, interesting, or useful.

show causality, mechanism, explanation, and systematic structure

Information displays should endeavor to show how changes in one data set caused changes in another.  In charts, for example, one can use labeled arrows not only to show the direction of causality, but also to describe the mechanism or process by which it happened.  On graphs, this can be a block of text describing some causal connection, with an arrow pointing to where this is shown in the data.

show multivariate data (i.e., 3 or more variables)

The real world is complex, and includes a lot of interconnections between different data sets.  Information displays should attempt to draw in as many of these various data sets as possible to show the interconnections between them (see: Minard).

completely integrate words, numbers, images, diagrams, etc.

When helping someone understand a data set, it is very unhelpful to segregate data based upon its source, format, or medium.  Instead, pull all sources of data into a single information display so that they can be compared side-by-side with the other data relevant to the story.  Data labels and other text should be integrated into the data display whenever possible instead of being relegated to sidebars, legends, or other documents.

document the display thoroughly

The reader should be left with no questions about what they are seeing or where it came from.  This often requires extensive textual, even narrative, explanations included within and alongside the information display. A title, the author’s name, units for all numbers, and links to source data are a minimum.  One may also find it helpful to include a paragraph explaining the principal features of the data set, interesting comparisons to make, or surprising results.
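
If you generate displays from code, the minimum list above can be turned into a simple checklist. The sketch below is only one way to do it: the field names and the dictionary layout are hypothetical choices of mine, not anything Tufte prescribes.

```python
# Hypothetical field names standing in for the minimum documentation:
# a title, the author's name, units for all numbers, and links to source data.
REQUIRED_FIELDS = ("title", "author", "units", "sources")


def missing_documentation(display):
    """Return the required fields that are absent or empty in `display`."""
    return [field for field in REQUIRED_FIELDS if not display.get(field)]


# An invented example display that has no source links yet.
chart = {
    "title": "Monthly rainfall",
    "author": "A. Example",
    "units": {"rainfall": "mm"},
    "sources": [],
}

print(missing_documentation(chart))  # prints ['sources']
```

A check like this only catches what is missing; whether the explanation is actually clear still takes a human reader.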

presentations stand or fall based on the quality, relevance, and integrity of the content

Showing the content in the clearest and most accessible fashion should be the only purpose of an information display.  Design for the sake of design should be avoided at all costs. Any extra line, letter, or decoration should be eliminated if it doesn’t serve to help the reader understand the data better.  The data will tell the story better without confusing or distracting embellishment.

✧✧✧

Naturally, this covers only a condensed version of what Tufte presents over the course of his full-day lecture. Along with the lecture, you receive his four published works on data display.

I found the lecture both extremely informative and productive. I came home bursting with ideas on how to improve the data displays of the various projects I was working on, and I’ve been able to put his precepts to good use on a number of occasions since. I highly recommend attending if he comes to a city near you.