
ACDH-CH Research Lunch with Nikola Ljubešić

When: Tuesday, September 10th 2024 - 12:30 PM

Where: ACDH-CH, OeAW
Bäckerstraße 13, 1010 Vienna
Meeting Room 2D (2nd floor)


The Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH), Austrian Academy of Sciences, is pleased to invite you to the following Research Lunch:

We are looking forward to welcoming Nikola Ljubešić, senior researcher at the Jožef Stefan Institute in Ljubljana, the Faculty of Computer and Information Science of the University of Ljubljana, and the Institute of Contemporary History in Ljubljana, who will give us some insights into the research project “ParlaMint - How recent developments in NLP help us reveal the hidden treasures of parliamentary proceedings”:

Parliaments serve as the cornerstone of democracy, ensuring the political representation of citizens. Despite their empirical relevance, parliamentary studies have often limited their scope to a single parliamentary body or a small group of parliaments analyzed in a comparative perspective. Two recent developments offer the opportunity to change this. The first is the availability of open, comparable parliamentary data through the ParlaMint project, covering transcripts of 26 European national parliaments and comprising over 7 million speeches given in more than 20 languages. The second is the progress in natural language processing, which allows for automatic and consistent enrichment of vast textual data across languages, as well as significantly improved processing and enrichment of large quantities of speech recordings. In this talk, I will present two follow-up projects of ParlaMint.

The first project, ParlaCAP, focuses on enriching each of the 7 million parliamentary speeches, regardless of their language, using multilingual language models. Each text will be automatically labeled with the topic discussed and the sentiment expressed, which will help elucidate differences in agenda setting across the 26 European national parliaments, revisiting the theory of “core issues” receiving prioritized attention in European democracies. This confirmatory work will be further expanded with the question of to what extent tone, primarily negativity in legislative debates, differs between countries and how this variation in communication relates to agenda diversity.

The second project, ParlaSpeech, focuses on ensuring the availability of the original, spoken modality of parliamentary debates. By aligning the recordings with the textual transcripts, we ensure the availability of large quantities of aligned spoken and textual material from the public domain. I will showcase the usefulness of this data, combined with recent improvements in speech processing, by applying a pre-trained speech model to automatically identify disfluencies in speech that are not part of the official parliamentary transcripts. This will enable us to revisit an information-theoretic take on the function of disfluencies in spoken communication.

The Research Lunch takes place at the ACDH-CH in Meeting Room 2D on the 2nd floor, on Tuesday, September 10th, 2024, at 12:30 PM and is kindly supported by CLARIN-ERIC and CLARIAH-AT.

We are looking forward to seeing as many of you as possible and would especially encourage colleagues with an interest in or knowledge of (corpus) linguistics and/or natural language processing (NLP) to attend, share their insights, and take part in fruitful discussions!