The Archimedes Project is constructing a comprehensive library of works in physical science printed during the Scientific Revolution. The project is also the focal point for the application to the history of science of the linguistic technologies pioneered by the Perseus Digital Library at Tufts University. It is one of the leading projects in developing innovative programming and linguistic technologies that will be shared among future open-access, NSF-funded digital projects in the history of science. Funded by the Digital Libraries Initiative Phase 2 program of the National Science Foundation, the Project is a joint endeavor of Harvard University, the Max Planck Institute for the History of Science (MPIWG) in Berlin, and the Perseus Digital Library at Tufts University.
The screenshot above shows a list of available sources, in this case of texts in the category of Renaissance engineers. Available sources are listed in two categories: Digital texts and Digital facsimiles. Digital texts are works that have been processed to include searchable text and other features implemented through machine-automated XML tagging and other forms of markup. The ideal is a work that consists of both high-quality images and processed text.
The screenshot above shows the first page of Agricola’s De re metallica (1556), a work for which high-quality images are needed. (These images have recently been supplied by the OU History of Science Collections; cf. our list of digitized books.)
Searchable text has been produced by the Archimedes project’s powerful OCR technology. The Archimedes OCR technology is optimized for the varied typography found in early printed books and works effectively with non-Roman languages.
Clicking the brown page icon (located top-right, immediately to the left of the “search” link) opens a facsimile image page for comparison with the searchable text pane.
Advanced usability features
The Archimedes Project is particularly innovative in its advanced usability features, which make the texts interactive rather than restricting users to the passive functions of typical web browsing.
First, the Archimedes project supports user annotation of both illustrations and texts within the web browser.
Image annotations: For example, users may mark up illustrations. Agricola’s De re metallica is noted for its remarkable illustrations. The screenshot above shows two user-added markers that indicate features of an illustration (note the little red numbers 1 and 2). I added these two markers from within a web browser while viewing page 135 of the work. After adding such markers, a student or scholar may share the illustration with the same zoom size and markers (click here to open this screen at the Archimedes Project website). The Archimedes Project software supplies a stable link that points to the image at the selected resolution, together with its annotations. This image markup feature allows users to interact with images in a collaborative manner.
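To make the idea of a stable link concrete, here is a minimal sketch of how a page number, a zoom level and a set of markers could be packed into a single shareable URL and read back out again. The base address, the parameter names (`page`, `zoom`, `marks`) and the coordinate convention are all assumptions made for illustration; the actual Archimedes link format may look quite different.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical base address, not the real Archimedes viewer URL.
BASE = "https://example.org/viewer"

def make_link(page, zoom, markers):
    """Encode page, zoom and marker positions into one shareable URL."""
    # Each marker is an (x, y) pair in relative image coordinates (0..1),
    # so the positions stay valid at any resolution.
    marks = ";".join(f"{x:.3f},{y:.3f}" for x, y in markers)
    return BASE + "?" + urlencode({"page": page, "zoom": zoom, "marks": marks})

def read_link(url):
    """Recover (page, zoom, markers) from a link produced by make_link."""
    query = parse_qs(urlsplit(url).query)
    raw = query.get("marks", [""])[0]  # blank values are dropped by parse_qs
    marks = [tuple(float(v) for v in m.split(",")) for m in raw.split(";") if m]
    return int(query["page"][0]), float(query["zoom"][0]), marks

link = make_link(135, 2.0, [(0.25, 0.4), (0.7, 0.55)])
print(link)
print(read_link(link))
```

Because the whole view state travels in the link itself, anyone who opens it sees the same page, magnification and markers without needing an account or a saved session.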
Text annotations: The Archimedes Project also supports annotations of individual words and groups of words within texts. Source texts remain unchanged, while multiple users create their own independent sets of annotated texts, which they may share with other scholars and students.
Annotation is possible even when “terms” consist of multiple associated words, which may be discontinuous or interlaced with each other. That is, “overlay tagging” (unlike inline XML markup, whose elements must nest) enables overlapping “terms” to be marked individually. In addition, corresponding “terms” in parallel texts may be suggested by the software based upon the co-occurrence of words, and edited in user-defined “term lists.” Texts are Unicode-based, with support for non-Roman characters including Greek, Arabic and Chinese.
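One common way to get this behavior is to store annotations apart from the text as character ranges, so that a term can have several fragments and two terms can cross without either one having to nest inside the other. The sketch below illustrates that general approach only; the `Term` class and helper names are hypothetical and are not taken from the Archimedes software.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    """A user annotation stored apart from the text (hypothetical structure)."""
    label: str
    spans: tuple  # one (start, end) character range per fragment

def span_of(text, word):
    """Character range of the first occurrence of word in text."""
    start = text.index(word)
    return (start, start + len(word))

def fragments(text, term):
    """The pieces of source text that a term covers."""
    return [text[s:e] for s, e in term.spans]

def overlaps(a, b):
    """True if any fragment of a shares characters with any fragment of b."""
    return any(s1 < e2 and s2 < e1 for s1, e1 in a.spans for s2, e2 in b.spans)

text = "neither rain nor snow"  # the source text itself is never modified

# A discontinuous term: two fragments with other words in between.
correlative = Term("correlative", (span_of(text, "neither"), span_of(text, "nor")))
# A second term whose fragments interlace with the first.
weather = Term("weather", (span_of(text, "rain"), span_of(text, "snow")))
# A third term that overlaps the first, which nested inline tags could not express.
phrase = Term("phrase", ((span_of(text, "rain")[0], len(text)),))

print(fragments(text, correlative))
print(overlaps(correlative, weather), overlaps(correlative, phrase))
```

Since each user's terms live in their own list, several annotators can mark up the same passage independently and exchange their sets without touching the source.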
The screenshot above shows three panes in parallel containing different editions of Euclid’s Elements of Geometry, where a search for a term in a Greek text (left column) has identified the corresponding terms in the two parallel Latin translations. (Screenshot by Mark Schiefsky.)
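How might software suggest corresponding terms from co-occurrence? The post does not say which measure Archimedes uses, so the following sketch swaps in one simple and widely used option, the Dice coefficient, computed over aligned segments of two texts. The function name and the three transliterated segment pairs are invented toy data for illustration, not quotations from any edition of Euclid.

```python
from collections import Counter

def suggest(pairs, source_word, top=3):
    """Rank target words by Dice coefficient with source_word.

    pairs is a list of (source_segment, target_segment) strings that are
    assumed to be aligned already, e.g. sentence by sentence.
    """
    src_count = 0           # segments containing source_word
    tgt_counts = Counter()  # segments containing each target word
    both = Counter()        # segments containing source_word and the target word
    for src, tgt in pairs:
        tgt_words = set(tgt.split())
        tgt_counts.update(tgt_words)
        if source_word in src.split():
            src_count += 1
            both.update(tgt_words)
    scores = {w: 2 * both[w] / (src_count + tgt_counts[w]) for w in both}
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]

# Invented toy segments (transliterated Greek, Latin).
pairs = [
    ("trigonon isopleuron", "triangulum aequilaterum"),
    ("trigonon orthogonion", "triangulum rectangulum"),
    ("tetragonon isopleuron", "quadratum aequilaterum"),
]
print(suggest(pairs, "isopleuron"))
```

A suggestion of this kind is only a starting point, which fits the workflow described above: the software proposes candidates and the user confirms or corrects them in a term list.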
Interactive features like these change the way scholarship is performed.
The Archimedes Project supports sophisticated morphological searching of texts. With morphological searching, a word as it appears in a text is automatically analyzed into its dictionary form and part of speech. For example, if a user searches the Greek text of Euclid’s Elements of Geometry for “isópleuros,” the search can return any or all of the forms shown below:
Morphological analysis is currently available for texts in Arabic, Dutch, English, French, German, Greek, Italian and Latin; support for further languages is in development.
Morphological searching requires sophisticated linguistic technology, including online dictionaries and automated machine-generated markup of source texts. The Archimedes Project is able to support morphological searching only because of the decades-long development of the underlying linguistic technology by the Perseus Digital Library of classical texts.
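The principle can be shown in miniature: analyze every word of a text into its dictionary form, index the text by those forms, and then analyze the query the same way before looking it up. In the sketch below, a tiny hand-written Latin table stands in for the dictionaries and analyzers that a real system relies on, and the sample sentence is made up for the example; none of the names come from the Perseus or Archimedes code.

```python
# Toy analysis table: surface form -> (dictionary form, part of speech).
# A real system would consult a full morphological analyzer instead.
ANALYSES = {
    "triangulum": ("triangulum", "noun"),
    "trianguli": ("triangulum", "noun"),
    "triangulo": ("triangulum", "noun"),
    "aequalis": ("aequalis", "adjective"),
    "aequales": ("aequalis", "adjective"),
    "aequalia": ("aequalis", "adjective"),
}

def analyze(word):
    """Return (dictionary form, part of speech); unknown words map to themselves."""
    return ANALYSES.get(word, (word, "unknown"))

def build_index(text):
    """Map each dictionary form to the (position, surface form) pairs in the text."""
    index = {}
    for position, token in enumerate(text.lower().split()):
        word = token.strip(".,;:")
        lemma, _tag = analyze(word)
        index.setdefault(lemma, []).append((position, word))
    return index

def search(index, query):
    """Find every form in the text that shares the query's dictionary form."""
    lemma, _tag = analyze(query.lower())
    return index.get(lemma, [])

# Made-up sample sentence, used only to exercise the index.
sample = "Anguli trianguli aequales sunt. In triangulo latera aequalia."
index = build_index(sample)
print(search(index, "triangulum"))
print(search(index, "aequalia"))
```

Note that the query itself may be any inflected form: searching for "aequalia" also finds "aequales", because both are reduced to the same dictionary entry before the lookup.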
This post about the technological innovations of the Archimedes project has focused upon the client side, the features presented to the user. However, the project has also created development tools, including image viewers and annotating applications, to facilitate the markup of texts and images consistent with TEI and RDF standards.
The Max Planck Institute for the History of Science in Berlin hosts a number of other history of science digital projects on topics including Chinese science, cuneiform science, Galileo, quantum physics, and biological experimentation. These projects will also benefit from the technical expertise being developed in the Archimedes project.