
PDF Documents and Footnotes Decrease the Accessibility of Research

Once upon a time, I loved footnotes and PDF documents. Now I don’t. I prefer eBook format and endnotes. I admit that footnotes are handy sometimes. For example, when I read visually, it’s nice to have the notes on the same page as the body text. However, neither footnotes nor PDF documents are so handy for auditory reading. Footnotes, for instance, interrupt the audio stream of the main body of text, sometimes mid-sentence. And since many people have to rely on auditory reading to consume academic research, this means that PDF documents and footnotes decrease the accessibility of research.

1. Books vs. Articles

Sometimes academic books are available in an eBook version that is amenable to auditory reading (e.g., Amazon’s Kindle format and Apple’s iBook format). And some academic books have a proper audiobook version (e.g., Amazon’s audiobooks). This is great, but…

Research articles and chapters are rarely available in a format that is amenable to auditory reading. They are almost exclusively available in portable document format (PDF).

And a PDF document is much trickier to listen to. Some devices, like Apple’s iPhones, can do it with ease. But most devices — including Apple’s Mac and MacBook line — have difficulty reading a PDF start-to-finish. And that’s without footnotes. With footnotes, things get even worse.

2. How Footnotes Wreak Havoc On Auditory Reading In PDF

Auditory reading of PDF documents requires certain software: text-to-speech software. What is text-to-speech software? I have already written about that and about how to read better via text-to-speech software, but here’s the short story:

Text-to-speech software reads the text on your screen aloud.

Sounds great, right? It is… when it works.

There are loads of circumstances in which text-to-speech doesn’t work. And many of these situations involve PDF documents and/or footnotes.

Scanned PDF documents

When you scan a book or paper, the scanned file usually lacks text-encoding. It’s just a picture of each page. So even though you see symbols on the PDF document, your device doesn’t. To allow your device to see the text, you need optical character recognition software (OCR). OCR software finds the symbols on the PDF image of each page and encodes the text information into the PDF file. (Alas, good OCR software is expensive — e.g., Adobe Acrobat Pro comes with OCR, but Adobe products are infamously expensive. And free OCR tools are a joke. So unless you have lots of scanned PDF documents, OCR software is probably not worth the investment.)
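Before paying for OCR, it can help to check which pages actually lack a text layer. Here is a minimal sketch of that check, not a definitive tool. It assumes the third-party pypdf package is installed, and the function names and the `min_chars` cutoff are hypothetical choices made for illustration.

```python
def extract_page_texts(path):
    """Return the extractable text of each page in the PDF at `path`."""
    # Third-party dependency, assumed installed: pip install pypdf
    from pypdf import PdfReader

    reader = PdfReader(path)
    # A page that is only a scanned image typically yields no text at all.
    return [page.extract_text() or "" for page in reader.pages]


def pages_needing_ocr(page_texts, min_chars=20):
    """Return 1-based page numbers whose text layer is empty or nearly so.

    `min_chars` is an arbitrary cutoff (an assumption), not a standard value.
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len(text.strip()) < min_chars
    ]
```

For example, `pages_needing_ocr(extract_page_texts("paper.pdf"))` would list the pages that text-to-speech software probably cannot read, where `paper.pdf` stands in for any file of yours.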

Non-body text

Lots of academic PDF documents have miscellaneous text around the body text. E.g., they sometimes have headers and footers (Figure 1). Or they have information about when the PDF was downloaded (Figure 2). And, of course, they often have footnotes (Figure 3).

All of this non-body text is confusing to text-to-speech software. Text-to-speech software just reads the text on a page. So if the software finds text in the margins of a page, then it’ll start reading it. And sometimes the software starts reading this marginal text in the midst of reading a sentence from the main body of a page. In other words, the audio will jump back and forth between main body text and non-body text. Obviously, that makes auditory reading nearly impossible. E.g.,

“In this paper we argue—Journal of Neuroscience, Copyright 2015—that conscious reasoning is realized—Table 1: Categories of reasoning—by frontal cortical networks—downloaded on September 16, 2010 from www.sciencedirect…..”

What the what?!
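One partial workaround is to filter out text that repeats on most pages before handing it to text-to-speech software, since running headers, footers, and download stamps tend to recur verbatim while body text does not. The sketch below illustrates that idea under stated assumptions: the text has already been extracted into a list of lines per page, and the function name and the `min_share` threshold are hypothetical. It does nothing about footnotes, which differ from page to page.

```python
from collections import Counter


def strip_repeated_lines(pages, min_share=0.6):
    """Drop lines that recur on at least `min_share` of the pages.

    `pages` is a list of pages, each a list of text lines in reading order.
    """
    if len(pages) < 2:
        # With a single page, nothing can be identified as repeating.
        return [list(page) for page in pages]

    # Count each distinct line at most once per page.
    counts = Counter(line for page in pages for line in set(page))
    threshold = min_share * len(pages)
    repeated = {line for line, n in counts.items() if n >= threshold}
    return [[line for line in page if line not in repeated] for page in pages]


# Made-up sample pages, modeled on the garbled example above.
pages = [
    ["Journal of Neuroscience, Copyright 2015",
     "In this paper we argue that conscious reasoning",
     "Downloaded on September 16, 2010"],
    ["Journal of Neuroscience, Copyright 2015",
     "is realized by frontal cortical networks.",
     "Downloaded on September 16, 2010"],
]

body = strip_repeated_lines(pages)
print(" ".join(line for page in body for line in page))
```

With the sample pages, only the two body lines survive, so the sentence is read without interruption.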

Footnotes are just one kind of non-body text. But, importantly, footnotes are unnecessary. After all, footnotes can be replaced by endnotes. And endnotes do not confuse text-to-speech software because endnotes do not live in the margins of a page. They’re placed squarely in the body of the page at the end of the paper.
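To see why endnotes behave better, compare the order in which the text reaches the listener. This is a toy sketch, assuming each page has already been split into body lines and footnote lines. The dictionary layout, the function names, and the sample text are all made up for illustration.

```python
def page_order(pages):
    """Reading order of a footnoted PDF: each page's notes follow its body."""
    lines = []
    for page in pages:
        lines.extend(page["body"])
        lines.extend(page["footnotes"])
    return lines


def to_endnote_order(pages, heading="Notes"):
    """Reading order with endnotes: all body text first, then all notes."""
    body, notes = [], []
    for page in pages:
        body.extend(page["body"])
        notes.extend(page["footnotes"])
    return body + ([heading] + notes if notes else [])


# A sentence that runs across a page break, with a footnote on the first page.
pages = [
    {"body": ["Conscious reasoning is realized [1]"],
     "footnotes": ["[1] A made-up note."]},
    {"body": ["by frontal cortical networks."],
     "footnotes": []},
]

print(page_order(pages))        # the note lands in the middle of the sentence
print(to_endnote_order(pages))  # the note is read after the body text
```

In page order, the note interrupts the sentence at the page break. In endnote order, the listener hears the whole sentence first and the notes afterward.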

3. Why This Matters: Accessibility

Lots of people need auditory reading. Obviously, some of these people are unsighted people who simply cannot read visually.

Perhaps less obviously, lots of sighted people also need auditory reading. After all, some people have to read more than they could read visually. For instance, it is very difficult — if not impossible — to read visually while commuting, doing chores, running errands, childcaring, etc.

The Argument

Given what I’ve said so far, we now have the basis for a strong argument for replacing PDF documents and footnotes with eBook-style formatting and endnotes. It goes like this:

  1. Publicly funded research should not be systematically inaccessible to certain members of the public.
  2. When research is published in PDF with footnotes, then research is systematically inaccessible to certain members of the public.
  3. When research is published in eBook-style format with endnotes, then research is not (or is less) systematically inaccessible to certain members of the public.
  4. Therefore, publicly funded research should be published in an eBook-like format with endnotes in addition to (or instead of) PDF with footnotes.

Recap

Both PDF documents and their footnotes systematically decrease the accessibility of research. This seems like a bad outcome. Fortunately, the outcome is avoidable. Specifically, we can, and should, add an option to download academic research papers in eBook format with endnotes. The current standard of PDF documents with footnotes is not enough.

Published by

Nick Byrd

Nick is a cognitive scientist studying reasoning, wellbeing, and willpower. When he is not teaching, in the lab, writing, exercising, or relaxing, he is blogging at www.byrdnick.com/blog

2 thoughts on “PDF Documents and Footnotes Decrease the Accessibility of Research”

  1. Your argument lacks mentions of hypertext or Marshall McLuhan. Medium has a great influence, which is why I aim to read texts in as many mediums as possible. How would complex multidimensional texts such as hieroglyphics fare under your regime?

    Open Access and Open Source are separate issues, unrelated to format or medium.

    1. Hi Malcolm,

      Can you explain how hypertext would overcome the problems that PDF documents and footnotes pose to text-to-speech software? It’s not obvious.

      Also, I am not sure why hieroglyphics are relevant. In case that wasn’t a joke, I wonder if you could explain the relevance of that as well.

      Thanks for commenting.
