Hosting organisations
ACDH-CH - Austrian Centre for Digital Humanities and Cultural Heritage
Responsible persons
Ulrike Czeitschner and Barbara Krautgartner

The travel!digital Corpus, a digital collection of early Baedeker travel guides, brings together a valuable but rarely investigated part of European cultural heritage in an up-to-date and sustainable digital form. The project’s central challenge was to make available a well-equipped language resource intended to foster cross-disciplinary research on cultural representation and identity-constructing discourses. Semantic technologies can contribute significantly to this end. Along with the basic layers of linguistic annotation (lemmatization and part-of-speech tagging), controlled vocabularies and Linked Open Data sources are appropriate and efficient instruments for exploring the German repertoire of speech about culture at the turn of the 19th century.

The corpus comprises first editions of German travel guides on non-European countries published before World War I. It contains more than 1.5 million tokens and covers various regions. Focusing on people and monuments, two essential semantic domains of the guidebook genre and of cultural discourse itself, the project represents the rich lexical inventory by means of the Simple Knowledge Organization System (SKOS). One outcome of this systematic recording is the travel!digital Thesaurus:

  • people: collective terms, ethnic/national communities, geographical concepts, professions [political, religious, economic roles, styles of living], religious communities, social classes
  • monuments: architecture, artwork, activities, nature, folklore, accommodations, breath-taking views
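
As a rough illustration of how such a two-level taxonomy might be expressed in SKOS, the following minimal Python sketch builds a small concept tree and prints it as Turtle. It is not the project's actual data model: the namespace `http://example.org/thesaurus/`, the `Concept` class, the `to_turtle` helper, the slugs and the English labels are all assumptions made for this example. Only the SKOS properties themselves (`skos:Concept`, `skos:prefLabel`, `skos:broader`, `skos:narrower`) come from the SKOS vocabulary.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical namespace for illustration; not the thesaurus's real base URI.
BASE = "http://example.org/thesaurus/"


@dataclass
class Concept:
    """One thesaurus entry with an optional parent and a list of children."""
    slug: str
    pref_label: str
    broader: Optional["Concept"] = field(default=None, repr=False, compare=False)
    narrower: List["Concept"] = field(default_factory=list)

    def add_narrower(self, slug: str, pref_label: str) -> "Concept":
        """Create a child concept and register the link in both directions."""
        child = Concept(slug, pref_label, broader=self)
        self.narrower.append(child)
        return child


def to_turtle(concept: Concept) -> str:
    """Serialize a concept and, recursively, its narrower concepts as Turtle."""
    props = ["a skos:Concept", f'skos:prefLabel "{concept.pref_label}"@en']
    if concept.broader is not None:
        props.append(f"skos:broader <{BASE}{concept.broader.slug}>")
    props += [f"skos:narrower <{BASE}{c.slug}>" for c in concept.narrower]
    block = f"<{BASE}{concept.slug}>\n    " + " ;\n    ".join(props) + " ."
    return "\n\n".join([block] + [to_turtle(c) for c in concept.narrower])


# Two subcategories taken from the "people" domain listed above.
people = Concept("people", "people")
people.add_narrower("religious-communities", "religious communities")
people.add_narrower("social-classes", "social classes")

print("@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n")
print(to_turtle(people))
```

Storing both `broader` and `narrower` keeps the hierarchy navigable in either direction, which is what lets a taxonomy of this kind present a structured overview of the vocabulary.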

The online edition includes the digital texts together with their facsimiles, detailed metadata and a TEI schema, and combines source-oriented approaches with the applied semantic technologies. The most innovative feature is the integrated travel!digital Thesaurus. This taxonomy gives a structured overview of the extensive domain-specific vocabulary. It offers definitions, reveals relationships and adds further information using external data from the Linked Open Data cloud. For practical reasons, all entries are linked to their corresponding occurrences in the travel!digital Corpus to support efficient navigation and comfortable reading. The domain-specific vocabulary and the contextualization of content through LOD sources promise new perspectives on the genre. Without anticipating conclusions, well-structured and carefully annotated data support fine-grained examinations of central components that have a lasting influence on cultural perceptions of “Other” and “Self”. Thus, the application of recent technologies has the potential to reveal much about a discourse that goes far beyond travel literature.