Digital edition of Darwin’s Origin

“The currently available digital copies of Darwin’s great work suffer serious defects from the point of view of both human and machine readers.” (Goldstein, editor’s introduction)

Adam M. Goldstein, of Iona College, has created a structured source text of Charles Darwin’s Origin of Species (London, 1859). A digital edition of the source text is available for download, in pdf (6.4MB), from the American Museum of Natural History website.

The source text underlying this edition was produced by editing, correcting, and reformatting the Oxford Text Archive’s text number 1783. In the editor’s introduction, Goldstein relays the results of an initial proof-reading of the 1783 text by Eric English:

“The base text (text 1783) is rife with errors, approximately 1,000 of them identified during the first round of proofreading. Some seem to be typographical errors or errors of transcription, and some of these significantly alter the meaning of the text: missing words, a variant of a word differing in meaning from the correct word; missing punctuation; and, most startling, missing phrases or sentences. The base text is Anglicized in some cases, Americanized in others. For instance, “organization” and “organisation” both appear regularly in the base text, and double quotes where Americans today would expect them are frequently changed to single quotes in a manner that accords with today’s British practice. Additionally, no diphthongs, ampersands, or accented characters appear anywhere in text 1783. Dashes, commas, and semicolons are often deleted or misplaced. Superscripts, subscripts, and mathematics are either deleted or incorrectly represented.”

This edition rectifies these issues, creating a nearly word-for-word replica of the 1859 edition. As Goldstein states in his introduction: “The central principle informing the editorial practices used in production of the digital Origin is that the text be presented in a manner as close to its original rendering as possible….” Features which still differ from the original text (for example, the lack of running heads) are identified in the editor’s introduction.
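Some of the defects reported for text 1783, such as mixed spellings and the absence of ampersands and accented characters, are the kind of thing a short script can surface in any plain-text transcription. The following is only a minimal sketch in Python, not the procedure Goldstein or English used; the function names, the list of suffixes, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical helpers for illustration; not part of Goldstein's edition or tooling.

# Matches American "-iz-" forms such as "organization" or "organized".
# The suffix list is an assumption and is far from exhaustive.
IZ_FORM = re.compile(r"^([a-z]+)iz(e|ed|es|ing|ation|ations)$")


def mixed_spellings(text):
    """Report words that occur in both "-iz-" and "-is-" spellings.

    Returns a dict mapping the "-iz-" form to a pair of counts:
    (count of "-iz-" form, count of "-is-" form).
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    report = {}
    for word, n in counts.items():
        match = IZ_FORM.match(word)
        if not match:
            continue
        british = match.group(1) + "is" + match.group(2)
        if british in counts:
            report[word] = (n, counts[british])
    return report


def special_characters(text):
    """Count ampersands and non-ASCII characters (ligatures, accents, and so on).

    An empty result for a long nineteenth-century text suggests that such
    characters were stripped during transcription.
    """
    return Counter(ch for ch in text if ch == "&" or ord(ch) > 127)


# Made-up sample sentence, not taken from the Origin.
sample = "The organization of the organ; the organisation of species. Organization again."
print(mixed_spellings(sample))
print(special_characters(sample))
```

A check like this only flags candidates. Deciding which spelling or character the 1859 edition actually used still requires comparison against the printed text.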

Goldstein expresses the hope that this new source text will “provide a basis in machine-readable code for producing the text of the 1859 Origin in a range of designs, for instance, a large-type edition for the visually impaired, or an edition formatted for reading on a hand-held device.” In addition, the source text is structured to support machine analysis. Goldstein envisions the creation of an appropriate informatics tool that will enable scholars to analyze this source text of the Origin more powerfully than is possible through basic keyword searching. Not only is this pdf the most accurate digital copy of the Origin to date, but Goldstein’s preparation of the structured source text underlying the pdf is an important step toward applying analytical techniques adapted from informatics to the Darwin corpus.

This is the first version of this digital edition; revisions will be posted at the same link. At present, the AMNH site is the only authorized distribution point; refer users to the link above rather than distributing the file itself.

Notes about the edition:

  • The editor’s introduction is an invaluable guide to the technical ins and outs of making an accurate digital edition and a sustainable encoded source text. It discusses the edition’s necessity and methodology.
  • The pdf includes a hyperlinked table of contents and two hyperlinked indices, one that refers to the pagination of the edition in hand, the other to Darwin’s original Origin pagination. The latter is more accurate than the index in Ernst Mayr’s facsimile edition.

A copyright mark appears on every page of the pdf. Goldstein explains that the document is available under the terms set by the AMNH for use of material on the Darwin Manuscripts Project site. He intends eventually to release the source text under the GNU GPL, but that’s taking a little time to work out.


The OU History of Science Collections is providing high-resolution facsimile images of Darwin first editions to the Darwin Manuscripts Project of the American Museum of Natural History. For more about the OU Darwin collection, see Darwin First Editions and Darwin@theLibrary.

About ouhos

Kerry Magruder, Curator; and JoAnn Palmeri, Librarian