Research
Current research projects and collaborations
FAIrPaCT2, Drug Perturbations, RTG-2978, GönomiX
Key Research Areas
Leverage the Unreachable - Federated Learning for Privacy Preserving Model Aggregation
Sensitive patient information, such as clinical data and medical registry data, is often stored in critical healthcare infrastructure distributed across institutes. Analysing such data carries privacy risks and therefore falls under legal regulations such as the General Data Protection Regulation (GDPR), which often makes the application of traditional machine learning algorithms impossible. Essentially, the need to exchange data among institutions over the internet poses a roadblock that hampers big-data-based medical innovation. We are therefore developing federated learning (FL) algorithms that follow a privacy-by-design architecture. FL techniques aim to overcome the barrier of exchanging raw patient data and to move towards large-scale medical data mining. The idea is to build a generalized global model without access to a shared dataset by merging locally trained models that capture the essence of the data. As part of this focus, we will soon launch the BMBF-funded FAIrPaCT project (see projects). Recently, Maryam Moradpour joined our group and will focus on the development of federated algorithms for non-partially overlapping biomedical datasets.
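The idea of merging locally trained models can be illustrated with a minimal sketch of weighted model averaging (often called federated averaging). It is not the group's algorithm: the function names, the two-parameter linear model, and the toy site data are assumptions chosen purely for illustration. Each site fits a model on its own data and shares only the fitted parameters and its sample count.

```python
from typing import List, Sequence


def local_fit(xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Fit y = slope * x + intercept by least squares on one site's private data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return [slope, mean_y - slope * mean_x]


def federated_average(models: Sequence[Sequence[float]],
                      sizes: Sequence[int]) -> List[float]:
    """Merge local parameter vectors, weighting each site by its sample count."""
    total = sum(sizes)
    n_params = len(models[0])
    return [
        sum(model[k] * size for model, size in zip(models, sizes)) / total
        for k in range(n_params)
    ]


# Hypothetical data held by two sites; the raw values never leave the site.
site_a = ([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
site_b = ([4.0, 5.0, 6.0], [9.0, 11.0, 13.0])

local_models = [local_fit(*site_a), local_fit(*site_b)]
sample_counts = [len(site_a[0]), len(site_b[0])]

# Only parameters and sample counts are exchanged to build the global model.
global_model = federated_average(local_models, sample_counts)
print(global_model)
```

Practical federated learning systems usually repeat this exchange over many training rounds and add further privacy safeguards, which this sketch omits.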
Just-In-Time Prediction - Online and time-critical event prediction
The majority of current clinical and omics research is restricted to investigations of cross-sectional snapshots of specific diseases. However, most diseases traverse various stages during their development, or develop into different subtypes depending on influences such as genetics, environment or medication. Thus, an analysis that focuses on a single time point of a patient's metabolite or protein abundance may overlook potential biomarker patterns, in particular when identifying disease subtypes and optimizing treatment. Novel technologies have paved the way for longitudinal analysis and online monitoring of fast-progressing diseases or changing patient vitals, for instance parameters of lung function or metabolites in exhaled air. Our group is developing methods and packages to model real-world longitudinal data, such as the R package LoBrA for longitudinal linear spline analysis and more advanced recurrent neural networks. For example, we applied the LoBrA package to breath metabolomic data of rats with progressing disease to illuminate the potential of computer-aided metabolic breath analysis as an early alarm system for time-critical diseases such as sepsis. Moreover, we will employ our methodologies to investigate clinical variables that are captured during mechanical ventilation and to model the progression of acute respiratory failure. Dr. Zully Ritter and Stefan Rühlicke are currently working on the Ensure project, which aims to implement a clinical decision support system for diagnostics in the emergency department and to evaluate its usability and accuracy compared to existing guidelines.
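To make the notion of a longitudinal linear spline concrete, the following minimal sketch fits a piecewise linear trend with a single knot to one time series. It does not reproduce the LoBrA package or its interface: the function names, the fixed knot position, and the toy measurements are assumptions for illustration only.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_linear_spline(ts: Sequence[float], ys: Sequence[float],
                      knot: float) -> List[float]:
    """Least-squares fit of y = b0 + b1 * t + b2 * max(0, t - knot).

    b1 is the slope before the knot; b1 + b2 is the slope after it.
    """
    design = [[1.0, t, max(0.0, t - knot)] for t in ts]
    p = 3
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical measurements: the trend steepens after time point 3.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
values = [1.0 + 0.5 * t + 2.0 * max(0.0, t - 3.0) for t in times]

intercept, slope_before, slope_change = fit_linear_spline(times, values, knot=3.0)
slope_after = slope_before + slope_change
print(intercept, slope_before, slope_after)
```

A change in slope at the knot is the kind of signal a cross-sectional snapshot cannot show, because it only appears when several time points of the same subject are modelled together.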
Leveraging the Vast Known - Transfer learning to address challenges of data heterogeneity and sparsity
Data sparsity and heterogeneity are two of the biggest challenges in the medical data science domain. Transfer learning can mitigate these issues and has been successfully employed in medical imaging, for instance to identify skin cancer using models pre-trained on general image databases such as ImageNet. Furthermore, it can be very beneficial in biomedical research, where a lack of sufficient data is even more frequent. Significant technological advances in next-generation sequencing have led to a large number of focused studies with distinct purposes, whose data are often spread across many data sets that are not yet fully utilized.
Data heterogeneity is therefore one of the major challenges for integrative analysis, which requires merging data from various developing fields. For example, different data types require different processing, even if the data sets originate from the same sequencing technique. In addition, other challenges such as batch effects and noise often make biomedical data analysis difficult. Our group therefore develops transfer learning methods that overcome these challenges by leveraging large-scale public data sets from one domain and transferring the knowledge they contain to improve tasks in a different target domain with smaller data sets, with the goal of identifying meaningful biomedical insights. Recently, Youngjun Park has developed a method to transfer knowledge between data sets from different sequencing technologies and to handle batch effects and noise in the target domain. Currently, our team is developing domain adaptation and zero-shot learning methods for proper knowledge transfer to a different biological domain.
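One simple way to picture knowledge transfer from a large source data set to a small target data set is regularization towards the source model. The sketch below is an illustration under that assumption and is not the method developed in the group: the function name, the one-parameter model, and all numbers are hypothetical.

```python
from typing import Sequence


def fit_slope(xs: Sequence[float], ys: Sequence[float],
              prior_slope: float = 0.0, strength: float = 0.0) -> float:
    """Fit y = w * x, shrinking w towards prior_slope.

    Minimises sum((y - w * x) ** 2) + strength * (w - prior_slope) ** 2.
    With strength = 0 this is ordinary least squares through the origin.
    """
    numerator = sum(x * y for x, y in zip(xs, ys)) + strength * prior_slope
    denominator = sum(x * x for x in xs) + strength
    return numerator / denominator


# Large, clean source domain (hypothetical): the relation is y = 2 * x.
source_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
source_y = [2.0 * x for x in source_x]
source_slope = fit_slope(source_x, source_y)

# Small, noisy target domain (hypothetical): only two samples are available.
target_x = [1.0, 2.0]
target_y = [2.5, 3.5]

target_only = fit_slope(target_x, target_y)
transferred = fit_slope(target_x, target_y,
                        prior_slope=source_slope, strength=5.0)
print(source_slope, target_only, transferred)
```

The `strength` parameter controls how much the target model trusts the source domain. Choosing it well matters, because a source that differs strongly from the target can harm rather than help.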
Opening the Black Box - Achieving model and prediction interpretability by explainable artificial intelligence methods
In domains such as clinical decision support systems (CDSS), there is typically a lack of trust in black-box machine learning models that do not allow users to trace back the factors that led to a specific decision. The field of eXplainable Artificial Intelligence (XAI) aims to increase the transparency and trustworthiness of AI models and thereby their use as CDSS. However, until now, the usage of XAI methods has been primarily limited to data science experts. In one of her current projects, Ms. Beinecke is developing a Graphical User Interface (GUI) for visualizing and analysing XAI attributions on graph datasets such as protein-protein interaction networks, to allow researchers from other domains to gain a deeper understanding of the underlying models. The GUI is part of a human-in-the-loop platform that will allow graph manipulation based on the insights gained through the XAI attributions. In addition, she is working on an extensive benchmarking study focusing on the trustworthiness of XAI methods on different types and sizes of data, in particular biomedical data, to develop a guideline that will aid the acceptance of AI methods as CDSS. Moreover, Zully Ritter is employing classical tools such as SHAP and LIME to show which specific parameters are used in the decision-making process and how much weight each carries for the prediction. These methodologies are essential for increasing the acceptance of CDSS, especially in critical and post-hospital care, not only because they make the decision understandable but above all when the ML prediction differs from that of the clinicians.
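Attribution methods such as SHAP build on Shapley values, which distribute the difference between a prediction and a baseline prediction across the input features. The following minimal sketch computes exact Shapley values by enumerating all feature subsets. It does not use the SHAP or LIME libraries, and the model, inputs, and baseline are hypothetical examples.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(predict: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)
    attributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j]
                             for j in range(n)]
                with_i = [x[j] if (j in present or j == i) else baseline[j]
                          for j in range(n)]
                phi += weight * (predict(with_i) - predict(without_i))
        attributions.append(phi)
    return attributions


def toy_model(features: Sequence[float]) -> float:
    """Hypothetical linear risk score over three features."""
    return 0.5 * features[0] + 2.0 * features[1] - 1.0 * features[2] + 0.1


patient = [4.0, 1.0, 3.0]
reference = [2.0, 0.0, 3.0]
attributions = shapley_values(toy_model, patient, reference)
print(attributions)
```

For a linear model, each attribution equals the coefficient times the feature's deviation from the baseline, and the attributions sum to the difference between the two predictions. Library implementations approximate these values because exact enumeration does not scale to many features.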
Clinical Decision Support Systems (CDSS) based on machine learning models
Machine learning (ML) models are an essential tool that, integrated into portable devices or a clinical setup, can be used by medical staff as clinical decision support. Patient data, including vital signs, symptoms, and outcome diagnosis, are frequently used as input. Some considerations, such as physiological filters, are essential to the data pre-processing and cleaning steps. Zully Ritter is working on practical applications of machine learning models, including diagnostic prediction for emergency department patients. In this clinical setup, assessing the patient's diagnosis is a time-critical process needed to initiate adequate treatment strategies. It thus represents one of the most important steps for patient outcomes in emergency care. ML models save time: they handle the clinical case of a specific patient in adequate time by using what was previously learned from the analysed data of many patients. In critical cases, this could be life-saving for emergency patients.
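A physiological filter in the pre-processing step can be pictured as a plausibility check on each vital sign. The sketch below assumes such a filter simply masks values outside a plausible range. The field names and the ranges are illustrative assumptions, not clinically validated limits and not the filters used in the group's projects.

```python
from typing import Dict, Optional, Tuple

# Illustrative bounds only (assumed, not clinical reference values).
PLAUSIBLE_RANGES: Dict[str, Tuple[float, float]] = {
    "heart_rate_bpm": (20.0, 300.0),
    "temperature_c": (25.0, 45.0),
    "spo2_percent": (50.0, 100.0),
}


def apply_physiological_filter(
    record: Dict[str, Optional[float]],
    ranges: Dict[str, Tuple[float, float]] = PLAUSIBLE_RANGES,
) -> Dict[str, Optional[float]]:
    """Replace implausible vital-sign values with None (treated as missing).

    Fields without a configured range, and values that are already missing,
    are passed through unchanged.
    """
    cleaned: Dict[str, Optional[float]] = {}
    for name, value in record.items():
        bounds = ranges.get(name)
        if value is None or bounds is None:
            cleaned[name] = value
            continue
        low, high = bounds
        cleaned[name] = value if low <= value <= high else None
    return cleaned


# Hypothetical record: the temperature looks like a data-entry error (370 vs 37.0).
raw_record = {
    "heart_rate_bpm": 82.0,
    "temperature_c": 370.0,
    "spo2_percent": 97.0,
    "age_years": 54.0,
}
clean_record = apply_physiological_filter(raw_record)
print(clean_record)
```

Masking rather than deleting the record keeps the remaining measurements available, so a later imputation step or the model itself can decide how to handle the missing value.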
Open Source
Publications
Maurer MC, Hempel P, Steinhaus KE, Chereda H, Vollmer M, Krefting D, Spicher N, Hauschild AC
xGNN4MI: explainability of graph neural networks in 12-lead electrocardiography for cardiovascular disease classification
npj Digital Medicine, 2026; DOI:10.1038/s41746-026-02367-1
Moradpour M, Ritter Z, Hauschild AC
Ensemble Multi-Objective Hyperparameter Optimization for the Classification of Imbalanced Heart Disease Data
Expert Systems with Applications 130318, 2025/11/11; DOI:10.1016/j.eswa.2025.130318
Nguyen AT, Nguyen DMH, Diep NT, Nguyen TQ, Ho N, Metsch JM, Maurer MC, Sonntag D, Bohnenberger H, Hauschild AC
MGPATH: A Vision-Language Model with Multi-Granular Prompt Learning for Few-Shot Whole Slide Pathology Classification
Transactions on Machine Learning Research, 05 Oct 2025; https://openreview.net/pdf?id=u7U81JLGjH
Metsch JM, Hauschild AC
BenchXAI: Comprehensive benchmarking of post-hoc explainable AI methods on multimodal biomedical data
Computers in Biology and Medicine 191, 110124; DOI:10.1016/j.compbiomed.2025.110124
Zaschke P, Maurer MC, Hempel P, Hauschild AC, Rodenbeck A, Spicher N
A somnologist’s guide to explainable deep neural networks for sleep scoring
Somnologie, 2025 Apr 30:1-8; DOI:10.1007/s11818-025-00504-8

