Summer School: Machine Learning for Digital Scholarly Editions - Registration open!

When: Monday, September 8 - Friday, September 12, 2025

Where: Department of Digital Humanities (University of Graz)
Elisabethstraße 59/III, 8010 Graz, Austria

Registration: The submission deadline has already passed (31 March 2025).

Organisation: Martina Scholger, Sarah Lang, Bernhard Geiger, and Roman Bleier (Universität Graz), in cooperation with Know Center Graz, TU Graz and CLARIAH-AT.


Machine learning is increasingly shaping research in the Digital Humanities, offering powerful tools for analyzing and enriching textual data. Using the Python library BERTopic, participants will explore various steps of topic modeling. Building upon BERTopic’s modular architecture, students will be introduced to several essential machine learning methods, such as embedding, dimensionality reduction, and clustering. Through practical sessions, students will learn to apply these techniques to historical texts. The aim is to give non-experts a high-level practical overview of how to use the BERTopic library and the essential theory behind its modules.

The school is intended for both students and researchers with an interest in the intersection between digital scholarly editing and machine learning. After attending the school, participants will have a basic understanding of machine learning algorithms and be able to assess their possible applications as well as their strengths and limitations. Participants will be able to use BERTopic in practice on their own data.

For more detailed information, please see the Call for Participation.


Schedule

| Time | Monday (8 Sept.) | Tuesday (9 Sept.) | Wednesday (10 Sept.) | Thursday (11 Sept.) | Friday (12 Sept.) |
| --- | --- | --- | --- | --- | --- |
| 8:30 - 9:00 | Registration | | | | |
| 9:00 - 10:30 | Welcome and setup (Georg Vogeler, Walter Scholger) (Roman Bleier, Martina Scholger) | Embeddings (Michael Jantscher) | Clustering (Max Toller) | Tokenization and weighting (Klara Venglarova) | Experiments |
| 10:30 - 11:00 | Coffee break | Coffee break | Coffee break | Coffee break | Coffee break |
| 11:00 - 12:30 | BERTopic: overview and example (Selina Galka) | Embeddings (Michael Jantscher) | Clustering (Max Toller) | Topic finetuning (Lucija Brozić) | Machine learning and DSE wrap up (Sarah Lang) |
| 12:30 - 13:30 | Lunch | Lunch | Poster Session | Lunch | Lunch |
| 13:30 - 15:00 | Introduction to Python | Dimensionality reduction (Bernhard Geiger) | Excursion: | Build your BERTopic pipeline (Roman Bleier, Martina Scholger) | Keynote: Ulrike Henny-Krahmer (online) |
| 15:00 - 15:30 | Coffee break | Coffee break | “Buschenschank” | Coffee break | Goodbye coffee |
| 15:30 - 17:00 | Prepare a dataset (Roman Bleier, Martina Scholger) | Dimensionality reduction (Bernhard Geiger) | | Experiments (Michael Otto) | |

18:00: Keynote
Wednesday excursion: back in Graz at approx. 21:30

More detailed information about registration, tutors, keynote speakers and much more is available on the Summer School Website:

Summer School: ML for DSE
