Noting two recent items in digital publishing/open access

1. Academic Presses Explore Open Access for Monographs by Seth Denbo of the AHA

The article notes that “evidence shows that providing free and unrestricted access to digital monographs increases their usage significantly,” which makes sense. I’m glad that the article notes that a move to pay-to-publish puts a burden on scholars who are not at well-funded institutions. I suppose this burden would be greater in the humanities than in the sciences, where research already requires institutional funding from the get-go.

2. Introducing Unpaywall from Impactstory

Just happened to run across this today. I haven’t tried it, but it’s an extension for Chrome and Firefox that links readers to free copies of articles that sit behind paywalls. A cursory look at their FAQ suggests that they source their articles from open access databases and repositories to which the original authors themselves submitted the article, hence it is legal even as it dodges paywalls. It’s more a matter of connecting readers to sources that are already available but not widely known.

As the creators write:

“We loathe paywalls. Now more than ever, humanity needs to access our collective knowledge, not hoard it. Lots of scholars feel the same; that’s why they upload their papers to free, legal servers online. We realized that the missing link is in getting these free resources to the people who want them, at the right time. By using a browser extension, we can do that, leveraging the toll-access distribution system to bring open access to the masses.”


Digitizing the OED

I recently picked up a copy of John Simpson’s memoir The Word Detective: Searching for the Meaning of It All at the Oxford English Dictionary, largely because I like words and dictionaries and the OED in particular. A nice surprise, however, is that Simpson, the former chief editor of the OED, oversaw the digitization of the OED to CD-ROM, beginning in the early 1980s, as well as its later migration online. This is exciting stuff, not only for the description of transferring a massive database from print to computer (~67 million characters), but also because Simpson nicely describes how digitization did not replace the original function of the OED, but rather added new dimensions to it.

One of the benefits of the OED was that its data was already structured; i.e. definitions, pronunciations, etymology, etc. are distinguished by their formatting, “a change of typeface, size of print, special print characters, indentation, etc.,” consistently and repetitively. The OED teamed up with International Computaprint Corporation, IBM, and the University of Waterloo (all in North America). The first two helped with digitizing the data, while Waterloo’s Computer Science Department helped construct the database. The typing took 150 people working for 18 months. Afterwards, the 20,000 pages of type, each three columns of small print, had to be proofread, which was taken on by 50 freelancers.

Simpson’s descriptions of how this large project took shape and was organized are interesting, but he shines when describing the new possibilities that digitization would open for the OED. Up to this point, dictionaries were incredibly linear: you looked up the word you wanted, and there you were. But what if, as Simpson describes it, you were able “to search the entire content of the dictionary instantly for information relating to the language”? He gives the example of finding all the words in English that end in -ology (1,011 in the OED), followed by comparing them with all the words that end in -ography (508). Given how time-consuming doing this would be with the print dictionary, it wasn’t done, but digitization could make such a search feasible and quick.
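To make the -ology/-ography example concrete, here is a minimal sketch in Python of what such a search looks like once the headwords are available as plain text. It is only an illustration, not how the OED database actually works: the function name `words_ending_with` and the tiny `sample_headwords` list are my own inventions, and the sample is obviously not OED data.

```python
def words_ending_with(headwords, suffix):
    """Return the sorted, de-duplicated headwords that end with `suffix`."""
    suffix = suffix.lower()
    # Compare case-insensitively, but keep each headword as it was written.
    return sorted({w for w in headwords if w.lower().endswith(suffix)})


# Hypothetical sample list for illustration only (not taken from the OED).
sample_headwords = [
    "amphibology",
    "tropology",
    "lexicography",
    "etymology",
    "dictionary",
    "geography",
]

ology = words_ending_with(sample_headwords, "ology")
ography = words_ending_with(sample_headwords, "ography")

# Print the count and the matches for each suffix, side by side.
print(len(ology), ology)
print(len(ography), ography)
```

With a full headword list in place of the sample, the same two calls would produce the kind of comparison Simpson describes, in seconds rather than the weeks it would take to page through the print volumes.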

“Hundreds of other questions which might have been asked about the language were not asked, or were only answered falteringly by considering just a sample of the data. What if you could dream up more or less any question you wanted about the language, ask it, and receive an answer seconds later?” Simpson writes. This seems to be the common-sense attraction of what is now collectively referred to as digital humanities: it opens up the possibility of new questions, new forms of analysis, and the ability to see patterns and meanings that would be impossible or extremely impractical to reach without digital tools. At the same time, the possibilities these new avenues offer do not mean we abandon other avenues of research and analysis. Just because we can search the entire corpus for all instances of -ology doesn’t mean that we won’t sometimes just need or want to know the specific meaning(s) and history of amphibology or tropology, two of the many words that Simpson explores in his fascinating memoir.