Hosting organisations
ACDH-CH - Austrian Centre for Digital Humanities and Cultural Heritage
Responsible persons
Marie-Luise Pitzl

The aim of the VOICE CLARIAH project is to ensure the long-term, user-friendly and open-access availability of the Vienna-Oxford International Corpus of English (VOICE), a digital one-million-word corpus of spoken English as a lingua franca (ELF) interactions. For this purpose, the project team is building a new, enhanced web interface for VOICE Online (to be released in summer 2021) and integrating VOICE into the CLARIAH-AT infrastructure. The project enhances the system architecture of VOICE Online and the quality of VOICE data, for instance by providing an updated TEI-XML format that merges VOICE XML and VOICE POS XML, combining both layers of annotation in a single XML file for each corpus text. The improved system infrastructure complements existing corpus applications and offers new search, filter and style functions, implemented through an integration of XML, NoSketch Engine, HTML and JSON technologies. The new advanced back-end and front-end tools enable VOICE users to filter the corpus by selecting transcripts based on additional metadata categories that were previously unavailable as filters (such as number of speakers). Additional style options make it easier to customize the visualisation of VOICE transcripts. Enhanced search facilities support an extended range of queries, including searches for part-of-speech categories and for selected features of conversational mark-up. The full range of functions is made available in the newly designed, intuitive and user-friendly VOICE Online web interface.

VOICE CLARIAH is an interdisciplinary collaboration between researchers from the Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH) of the Austrian Academy of Sciences and the Department of English and American Studies of the University of Vienna. The project team combines knowledge of applied, corpus and computational linguistics with IT expertise in software development, programming and web design, and contributes to the international visibility of digital humanities research carried out in Austria.

The interplay of digital technologies and corpus linguistics works towards improved digital data processing for spoken corpora, analysis of interaction and multilingual data.

Principal Researcher: Priv.-Doz. Mag. Dr. Marie-Luise Pitzl, Austrian Centre for Digital Humanities and Cultural Heritage

Project Partners:

  • Mag. Daniel Schopper, Austrian Centre for Digital Humanities and Cultural Heritage
  • Univ.-Prof. Mag. Dr. Barbara Seidlhofer, University of Vienna

Project Team:

  • Hans Christian Breuer, University of Vienna
  • Mag. Dr. Ruth Osimk-Teasdale, University of Vienna
  • Mag. Hannes Pirker, Austrian Centre for Digital Humanities and Cultural Heritage
  • Mag. Stefanie Riegler, University of Vienna
  • Mag. Omar Siam, Austrian Centre for Digital Humanities and Cultural Heritage

