Summer School: Machine Learning for Digital Scholarly Editions
- Hosting organisations
- University of Graz (Department of Digital Humanities)
- Responsible persons
- Martina Scholger, Sarah Lang, Bernhard Geiger, and Roman Bleier
- Start
- End
Machine Learning is gaining increasing importance in the Digital Humanities, particularly in the field of digital scholarly editions. The Summer School: Machine Learning for Digital Scholarly Editions builds on the international conference “Machine Learning and Data Mining for Digital Scholarly Editions” (University of Rostock, 2022) and addresses students and researchers working at the intersection of Digital Humanities and AI technologies. Its goal is to combine theoretical foundations with practical application, thereby unlocking the potential of machine learning methods for historical text editions.
Humanities research is increasingly shaped by powerful tools that apply machine learning to analyze and enrich textual data. Using the Python library BERTopic, participants will work through the individual steps of topic modeling. Building on BERTopic’s modular architecture, students will be introduced to several essential machine learning methods, such as embeddings, dimensionality reduction, and clustering. In practical sessions, they will learn to apply these techniques to historical texts. The aim is to give non-experts a high-level, practical overview of how to use the BERTopic library and of the essential theory behind its modules.
Project Information
The one-week Summer School is organized by the Department of Digital Humanities at the University of Graz in cooperation with Know Center Research GmbH Graz and the Institute for Documentology and Scholarly Editing (IDE).
The school is intended for both students and researchers with an interest in the intersection of digital scholarly editing and machine learning. After attending the school, participants will have a basic understanding of machine learning algorithms and be able to assess their possible applications as well as their strengths and limitations. They will also be able to apply BERTopic to their own data.
In addition to lectures and tutorials, two keynotes by internationally renowned scholars will be part of the program. Networking will be fostered through social events.
All teaching materials – including concise introductory videos, presentation slides, and code notebooks – will be made openly accessible via GitHub and DARIAH-Campus, ensuring sustainable use beyond the in-person event.
- Summer School Website
- Call for Participation