Tag der guten Daten 2026
“Where’s the data?” - “Wo sind die Daten?”
When: Tuesday, February 10th, 2026
Where: hybrid at the University of Graz:
SR 62.31 (Hauptbibliothek, 3. OG, Universitätsplatz 3a, 8010 Graz) and online.
Registration: Please register in advance via the registration form.
During International Love Data Week, which this year focuses on the question “Where’s the data?”, the University of Graz is hosting a full-day event dedicated to this topic.
The keynote, which opens the morning lecture series (9:00–13:00), addresses issues of responsibility, maintainability, and long-term thinking, using a long-established Digital Humanities project, the Middle High German Conceptual Database, as a case study. The speakers will present AI-supported research methods, demonstrate that while AI can support research it will never replace research personnel, and argue that open, human-readable standards such as XML are by no means a step backwards, but rather a key prerequisite for digital sustainability and future viability. Subsequent presentations will cover licensing in AUSSDA, the Health Data Research Hub of the Medical University of Graz, performance tests and metadata from the Digital Skills Austria studies, and a transferable model for heterogeneous (psychological) research data.
This part of the event will also be broadcast via livestream.
In the afternoon, a workshop entitled “Making Research Data Available / Reusing Research Data” will take place. Following a series of short impulse talks that characterise data from different disciplines, participants will collaboratively develop a practical guideline for researchers on this topic.
The workshop will be held on site only, and the number of participants is limited.
Programme
- 8:30 - 8:50 - Registration, check-in, coffee
- 8:50 - 9:00 - Welcome and opening remarks
- 9:00 - 10:00 - The data are not gone! They are just somewhere else. On responsibility, maintainability, and long-term thinking, using a DH cornerstone as a case study
Keynote by Mag. Dr.phil. Katharina Zeppezauer-Wachauer; Mag.phil. Christian Steiner, MA
The Middle High German Conceptual Database (MHDBDB) has existed since 1972 and is one of the oldest Digital Humanities projects worldwide. Over more than five decades, it has passed through punch cards, relational databases, and an RDF triplestore with six billion triples—only to ultimately return to XML.
This talk does not tell a success story. It tells how one can lose one’s own data without deleting it: by embedding it in ever more complex structures that only machines can still understand. It shows what happens when technological euphoria collides with institutional realities—lack of permanent positions, tight budgets, and knowledge that disappears with fixed-term contracts. It also presents a way out: migration to TEI XML as a human-readable standard, supported by AI-assisted workflows (promptotyping). We do not trust the LLM. We use it—for reverse engineering, structural transformation, and validation. The result is a migration that would have required six-figure budgets three years ago, now achieved within months and with only a few thousand euros.
We share three lessons learned: first, that frontier LLMs enable migrations that were previously unaffordable; second, that AI scales but does not replace staff—and that this staff requires permanent contracts; and third, that open, human-readable standards such as TEI XML are not a step backwards, but an insurance policy for survival.
Bio-Notes:
Katharina Zeppezauer-Wachauer is a Senior Scientist at the University of Salzburg. She has been part of the Middle High German Conceptual Database (MHDBDB) team since 2010 and has coordinated the project since 2016. Her research focuses on corpus linguistics, computational text semantics, and the application of AI and large language models to historical corpora, with particular emphasis on semantic modelling, narrative structures, and FAIR data practices. She received her PhD in German Studies from the University of Graz with a specialisation in Digital Humanities and serves as the University of Salzburg’s representative in the Austrian research infrastructure consortium CLARIAH-AT.
Christian Steiner completed master’s degrees in Translation Studies and Digital Humanities at the University of Graz, where he worked at the Institute for Digital Humanities from 2012 onwards. He is the founder and CEO of Digital Humanities Craft OG, an IT research company in close cooperation with the University of Graz. He teaches at several universities across Europe. His main areas of expertise are generative AI, particularly coding agents as well as prompt and context engineering, and web programming. He brings extensive experience in data modelling and processing across disciplines, from the humanities to the natural sciences.
- 10:00 - 10:20 - Break
- 10:20 - 11:30 - Presentations
- Licensing of Research Data, using AUSSDA – the Austrian Social Science Data Archive – as an example
Mag.phil. Dr.rer.soc.oec. Otto Bodi-Fernandez (University of Graz, Institute of Sociology)
AUSSDA, the Austrian Social Science Data Archive, supports researchers in the sustainable archiving, documentation, and reuse of social science research data. A central guiding principle is the promotion of Open Science while simultaneously complying with the legal framework, in particular data protection law. This presentation provides an overview of the role of licences in research data archiving and situates them within the FAIR principles, with a particular focus on the “R” (Reusable). Using the AUSSDA licensing model as an example, it explains under which conditions anonymised data can be made available as Open Data and highlights the significance of Creative Commons licences, especially CC BY, for open, non-purpose-bound reuse. In contrast, data protection requirements apply to personal or pseudonymised data. Since such data may not be shared without restriction as to purpose, AUSSDA relies on the so-called Scientific Use File (SUF) licence, which limits reuse to scientific purposes. The licensing system is complemented by access categories and an institutional access policy, which together regulate legally compliant access to research data. Drawing on concrete examples from the AUSSDA repository, the presentation demonstrates how openness, reusability, and data protection can be balanced in practice, guided by the principle: as open as possible, as closed as necessary.
Bio-Note:
Otto Bodi-Fernandez works at the Centre for Social Research as the Graz contact point for AUSSDA. He is active in the field of research data management and teaches empirical methods at the Institute of Sociology. His research interests include childhood and youth studies as well as educational inequality.
- Data Love @ MedUni Graz: HDRH-Tools for FAIR, secure research data management
Mag.Dr.rer.nat. Andrea Groselj-Strele (Medical University of Graz, Core Facility Computational Bioanalytics)
The Health Data Research Hub (HDRH) of the Medical University of Graz provides the foundation for data excellence across the entire research data life cycle by means of a secure data environment and standardised interfaces, from study planning and the creation of data management plans to long-term storage and the reuse of data in a secure analysis environment. The presentation introduces the integrated service and research data management tool ecosystem of the Medical University of Graz, as well as the MedBioNode HPC cluster for data and image analysis. Using the IT architecture landscape as an example, the presentation demonstrates how imaging data, including metadata, can be managed and annotated in OMERO, linked with clinical data, and processed via the Python API on the MedBioNode cluster. For omics data, automated, rule-based import processes using iRODS and watchdog mechanisms are outlined, enabling continuous data ingestion. By indexing metadata, the stored data and images become efficiently searchable, thereby facilitating secondary use in accordance with the FAIR principles.
Bio-Note:
Andrea Groselj-Strele heads the Core Facility Computational Bioanalytics at the Centre for Medical Research of the Medical University of Graz, where she is responsible for methodological support in bioinformatic and biostatistical data analysis. Her main areas of work include the computational analysis of complex biomedical data sets and the development and operation of research-supporting infrastructure. In addition, she serves as a lecturer in biostatistics at FH JOANNEUM.
- 11:30 - 11:50 - Break
- 11:50 - 13:00 - Presentations
- Are these even data? What performance tests and metadata from the Digital Skills Austria studies reveal about digital competence
MMag. Manuela Grünangerl (University of Salzburg)
From a methodological perspective, the measurement of competences poses a particular challenge, especially in online surveys. Competences are difficult to observe directly and can only be captured to a limited extent through self-reports alone. This applies to an even greater degree to digital competences, which are situational, context-dependent, and action-oriented. Against this background, the question arises not only of how digital competences can be measured, but more fundamentally: what actually counts as data in such surveys? And how do we, as researchers, deal with non-response or refusal to act?
This presentation addresses these questions using the Digital Skills Austria study as an example. Since 2023, digital competences have been measured three times in the Austrian online population through an online survey employing a 13- (and in some cases 30-)step problem-oriented performance test. Participants were required to solve everyday digital problems, with different solution strategies—knowledge, application of skills, trial and error, or targeted online research—all being equally recognised as competent behaviour. This measurement approach already shifts the focus away from classical survey data towards observable action outcomes.
Beyond this, such online performance testing generates a wide range of para- and metadata, including processing times, response patterns, dropouts, and forms of response avoidance. These data are not merely by-products of the survey process, but open up an additional analytical layer. The presentation demonstrates how the systematic analysis of these para- and metadata enables key insights into data quality, test engagement, and potential measurement biases. In particular, extreme response behaviours such as speeding, straightlining, or complete task avoidance can be identified, and their effects on measurement results can be analysed.
The presentation argues that the combination of performance testing with para- and metadata constitutes a particularly productive methodological perspective for competence research. It expands the concept of data beyond classical response data and makes visible—or opens up for discussion—the fact that crucial information often lies precisely in the process of responding itself: in hesitation, skipping, or rapid clicking. In this sense, the presentation aims to stimulate a rethinking of the guiding question of Love Data Week—Where are the data?—from an unconventional angle: data are often not only found in the answers, but hidden in response behaviour itself.
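The response behaviours named in this abstract (speeding, straightlining, complete task avoidance) can be illustrated with a minimal Python sketch. It is not the method used in the Digital Skills Austria study: the data layout, the function names, and both thresholds (`speed_ratio`, `min_run`) are assumptions chosen purely for illustration.

```python
from statistics import median


def longest_run(answers):
    """Length of the longest run of identical, non-missing answers."""
    best = run = 0
    prev = object()  # sentinel that never equals a real answer
    for value in answers:
        if value is None:
            run = 0
        elif value == prev:
            run += 1
        else:
            run = 1
        prev = value
        best = max(best, run)
    return best


def flag_respondent(answers, seconds, typical_seconds,
                    speed_ratio=0.3, min_run=5):
    """Return a set of paradata flags for one respondent.

    answers / seconds: one entry per task, None where the task was skipped.
    typical_seconds:   assumed reference time per task (e.g. a sample median).
    speed_ratio, min_run: hypothetical cut-offs, not taken from the study.
    """
    answered = [s for a, s in zip(answers, seconds) if a is not None]
    if not answered:
        return {"task_avoidance"}
    flags = set()
    if median(answered) < speed_ratio * typical_seconds:
        flags.add("speeding")
    if longest_run(answers) >= min_run:
        flags.add("straightlining")
    return flags
```

A respondent who clicks the same option six times in about two seconds each would be flagged for both speeding and straightlining, while a respondent who skips every task is flagged for task avoidance only. In practice the cut-offs would have to be justified against the actual distribution of processing times.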
Bio-Note:
MMag. Manuela Grünangerl is a Senior Lecturer at the Department of Communication Studies at Paris Lodron University of Salzburg, where she specialises in teaching social science research methods. Her teaching and research focus in particular on empirical social research, the development of digital competences, and questions of social change. In addition to her university position, she is also active in adult education and is involved in academic projects and publications in the fields of communication and media research.
- Structure instead of data silos: a transferable model for heterogeneous (psychological) research data
Mag. Dr.rer.nat. Karl Koschutnig (University of Graz)
Research data in psychology and related disciplines are often heterogeneous: questionnaires, physiological measurements, or performance-based tests are generated in different formats and within project-specific structures. This diversity complicates validation, analysis, and reuse, particularly across project and institutional boundaries.
The presentation introduces PRISM (Psychological Research Information & Structure Model), an open structural model for organising heterogeneous research data. PRISM defines consistent conventions for data storage, metadata, and validation, and enables reproducible preparation for statistical analyses. The underlying workflow is openly documented and available on GitHub.
PRISM is presented as a working example from practice that is deliberately not designed as an isolated solution. The focus lies on the question of how fundamental structural principles can be transferred to other research domains and how open models can contribute to improved findability, verifiability, and reuse of research data.
Bio-Note:
Dr Karl Koschutnig is a neuropsychologist and head of the MRI Laboratory Graz. He previously worked at the clinical department of neuroradiology at LKH Graz and specialises in functional and structural MRI processing as well as diffusion tensor imaging. With 58 publications, an h-index of 28, and more than 3,000 citations, he is an expert in the field of neuroimaging.
- 13:00 - 14:00 - Lunch break
- 14:00 - 17:15 - Making Research Data Available / Reusing Research Data
Workshop by Otto Bodi-Fernandez, Helmut W. Klug, Dimitri Prandner
National and international funding agencies and research institutions have strong incentives to focus on research data: they aim to actively promote the reuse of research data. The reasons are diverse: commonly cited motivations include increasing efficiency in the research process by avoiding redundant data collection, enhancing transparency and reproducibility of scientific results, and enabling new research questions through the combination and secondary analysis of existing datasets.
But where are these data, actually? Developments in research data over recent years speak clearly: the FAIR principles, developed in 2014 and published in 2016, provide clear guidance on the publication of research data. They call for data to be easily findable by the relevant research communities, technically open and accessible without barriers, processable in different contexts, and accompanied by clearly defined legal conditions of use. Reasons for not publishing data remain comparatively limited and mostly concern legal, ethical, or security-related restrictions. A key criterion of the FAIR principles is the full machine-readability of at least the metadata, enabling automated workflows, linkages, and findability across distributed infrastructures.
The data management plan gives researchers and project managers a traditional project management tool, newly framed for research data, that can significantly support this goal, from planning data collection through documentation to storage, archiving, and publication. Whether these overarching plans, or the concrete usefulness of the tool, have fully reached all stakeholders remains an open question and a subject for discussion.
In the planned workshop, after a series of short introductory talks presenting the particularities of research data from diverse disciplines, participants will collaboratively work on a practical guideline for researchers. The current working plan, which may be modified during the workshop, is to develop a checklist for both those publishing data and those reusing data. The requirements of the two groups largely overlap but differ in perspective: producers are primarily concerned with structured documentation, appropriate licensing, and sustainable repositories, whereas reusers focus more on clarity, contextual information, and explicit usage conditions.
Introductory talks on the particularities of research data from different research areas:
- Thomas Rauter: Research data at the Medical University
- Elisabeth Steiner: Research data in the Humanities
- Claire Jean-Quartier: Research data at the Technical University
- Otto Bodi-Fernandez: Research data in the Social Sciences