Summer School: Machine Learning for Digital Scholarly Editions - Registration open!
When: Monday, September 8 - Friday, September 12, 2025
Where: Department of Digital Humanities (University of Graz)
Elisabethstraße 59/III, 8010 Graz, Austria
Registration: The submission deadline has already passed (31 March 2025).
Organisation: Martina Scholger, Sarah Lang, Bernhard Geiger, and Roman Bleier (Universität Graz), in cooperation with Know Center Graz, TU Graz and CLARIAH-AT.
Machine learning is increasingly shaping research in the Digital Humanities, offering powerful tools for analyzing and enriching textual data. Using the Python library BERTopic, participants will explore various steps of topic modeling. Building upon BERTopic’s modular architecture, students will be introduced to several essential machine learning methods, such as embedding, dimensionality reduction, and clustering. Through practical sessions, students will learn to apply these techniques to historical texts. The aim is to give non-experts a high-level practical overview of how to use the BERTopic library and the essential theory behind its modules.
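As a rough illustration of the modular architecture mentioned above, the following sketch assembles a BERTopic model from explicitly chosen embedding, dimensionality reduction, and clustering modules. It is not taken from the summer school materials. The helper name `build_topic_model`, the embedding model name `all-MiniLM-L6-v2`, and all parameter values are assumptions chosen for illustration, and the code requires the third-party packages `bertopic`, `sentence-transformers`, `umap-learn`, and `hdbscan`.

```python
def build_topic_model(embedding_name="all-MiniLM-L6-v2",
                      n_components=5,
                      min_cluster_size=15):
    """Assemble a BERTopic model from explicitly chosen modules.

    All default values are illustrative assumptions, not recommendations.
    """
    # Heavy third-party dependencies are imported lazily, so defining this
    # function does not require them to be installed.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from sentence_transformers import SentenceTransformer
    from umap import UMAP

    # Step 1, embedding: turn each document into a dense vector.
    embedding_model = SentenceTransformer(embedding_name)

    # Step 2, dimensionality reduction: compress the vectors before clustering.
    umap_model = UMAP(n_neighbors=15, n_components=n_components,
                      min_dist=0.0, metric="cosine", random_state=42)

    # Step 3, clustering: group the reduced vectors into candidate topics.
    hdbscan_model = HDBSCAN(min_cluster_size=min_cluster_size,
                            metric="euclidean",
                            cluster_selection_method="eom",
                            prediction_data=True)

    # BERTopic accepts each module as a keyword argument, so any single
    # stage can be swapped without touching the others.
    return BERTopic(embedding_model=embedding_model,
                    umap_model=umap_model,
                    hdbscan_model=hdbscan_model)
```

With the packages installed, `topics, probs = build_topic_model().fit_transform(docs)` assigns a topic number to each string in `docs` (a hypothetical list of text passages), and calling `get_topic_info()` on the fitted model returns a summary table of the topics. Documents that the clustering step treats as outliers receive the topic number -1.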
The school is intended for students and researchers with an interest in the intersection of digital scholarly editing and machine learning. After attending, participants will have a basic understanding of machine learning algorithms and be able to assess their possible applications, strengths, and limitations. They will also be able to apply BERTopic to their own data.
For more detailed information, see the Call for Participation.
Schedule
| Time | Monday (8 Sep) | Tuesday (9 Sep) | Wednesday (10 Sep) | Thursday (11 Sep) | Friday (12 Sep) |
|---|---|---|---|---|---|
| 8:30 - 9:00 | Registration | | | | |
| 9:00 - 10:30 | Welcome and setup (Georg Vogeler, Walter Scholger) (Roman Bleier, Martina Scholger) | Embeddings (Michael Jantscher) | Clustering (Max Toller) | Tokenization and weighting (Klara Venglarova) | Experiments |
| 10:30 - 11:00 | Coffee break | Coffee break | Coffee break | Coffee break | Coffee break |
| 11:00 - 12:30 | BERTopic: overview and example (Selina Galka) | Embeddings (Michael Jantscher) | Clustering (Max Toller) | Topic finetuning (Lucija Brozić) | Machine learning and DSE wrap up (Sarah Lang) |
| 12:30 - 13:30 | Lunch | Lunch | Poster Session | Lunch | Lunch |
| 13:30 - 15:00 | Introduction to Python | Dimensionality reduction (Bernhard Geiger) | Excursion: | Build your BERTopic pipeline (Roman Bleier, Martina Scholger) | Keynote Ulrike Henny-Krahmer (online) |
| 15:00 - 15:30 | Coffee break | Coffee break | "Buschenschank" | Coffee break | Goodbye coffee |
| 15:30 - 17:00 | Prepare a dataset (Roman Bleier, Martina Scholger) | Dimensionality reduction (Bernhard Geiger) | | Experiments (Michael Otto) | |
| 18:00 | Keynote | | Return to Graz at approx. 21:30 | | |
More detailed information about registration, tutors, keynote speakers and much more is available on the Summer School Website:
Summer School: ML for DSE