PhD projects

Collection of currently running PhD projects.

PhD project of Andreas Hoek

WASP: A versatile, web-accessible single cell RNA-Seq processing platform

Since its first application in 2009, single cell RNA sequencing (scRNA-seq) has developed rapidly. Thanks to its unprecedented resolution, single cell technology is applicable in many fields of research, ranging from basic questions such as analyzing differentiation processes in cells to highly specific biomedical questions such as tumor cell characterisation. Over the last decade, scRNA-seq has also diversified into a variety of protocols and, compared to its early stages, has gained massively in throughput and sensitivity while the cost per cell has dropped significantly.

As a consequence, scRNA-seq requires tailored bioinformatics software that can tackle the resulting challenges. These include protocol-specific processing of barcodes and unique molecular identifier (UMI) sequences for up to hundreds of thousands of cells in parallel, detection and characterisation of cellular clusters, and appropriate visualizations.
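
The role of barcodes and UMIs can be illustrated with a minimal sketch. It assumes that reads have already been assigned to a cell barcode, a UMI and a gene, and it counts each distinct UMI once per cell and gene, so that PCR duplicates are collapsed. The function name `count_umis` and the toy data are hypothetical and are not taken from WASP; real tools additionally correct sequencing errors in barcodes and UMIs.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per (cell barcode, gene).

    `reads` is an iterable of (cell_barcode, umi, gene) tuples.
    Reads sharing barcode, UMI and gene are treated as PCR duplicates
    of one original molecule and are counted only once.
    """
    umis_seen = defaultdict(set)
    for barcode, umi, gene in reads:
        umis_seen[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Toy input (hypothetical): two cells, two genes.
reads = [
    ("AAAC", "TTG", "GeneA"),
    ("AAAC", "TTG", "GeneA"),  # same barcode, UMI and gene: a duplicate
    ("AAAC", "CCA", "GeneA"),
    ("GGTT", "TTG", "GeneB"),
]
counts = count_umis(reads)
print(counts)
```

Here the first cell ends up with two molecules of GeneA although three reads were observed, which is the reason UMIs are used for quantification.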

For my thesis, I am developing WASP, a web-accessible scRNA-seq analysis platform. WASP covers a complete workflow for data based on the ddSeq protocol, from raw reads to cell clustering, differential gene expression detection and visualization. Due to its modular design, the software can easily be adapted to data from other protocols. For on-premises analysis of sensitive data, WASP can be run using Docker or Conda, or simply as a standalone version on Windows-based systems. Furthermore, users can interactively change parameters during the analysis workflow and download publication-ready visualizations.

PhD project of Tobias Zimmermann

Petra: A new R package for epigenome and transcriptome analysis within the Bioconductor platform.

Analyzing NGS data requires computational infrastructure that allows researchers to run the relevant analysis tools without expert programming knowledge. We want to provide an R package, Petra, for epigenome and transcriptome analysis within the Bioconductor platform.

Petra will be made available to the community to enable researchers to perform standard ChIP-seq, ATAC-seq, and RNA-seq analyses. In addition to basic workflows for principal analysis steps, such as differential expression/binding analysis and visualizations for exploratory analysis, we would like to integrate more specific and complex functionality dedicated to questions that arise from the combined analysis. For this purpose, we have implemented a super-enhancer detection algorithm based on peak position data. By combining different peak annotation approaches, we have designed a new peak-to-gene association algorithm. Additionally, functionality for visualizing genomic data, such as browser snapshots, motif analysis, and correlation heat maps, has been implemented.
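
To make the idea of peak-to-gene association concrete, the following sketch shows the simplest common baseline: assigning each peak to the gene with the nearest transcription start site (TSS) on the same chromosome. This is explicitly not Petra's algorithm, which combines several annotation approaches and is written in R. The function name `nearest_gene`, the data layout and the coordinates are assumptions made for illustration only.

```python
from bisect import bisect_left


def nearest_gene(chrom, start, end, tss_by_chrom):
    """Return (gene, distance) for the TSS closest to the peak center.

    `tss_by_chrom` maps a chromosome name to a list of
    (tss_position, gene_name) tuples sorted by position.
    Returns None if the chromosome has no annotated TSS.
    """
    tss_list = tss_by_chrom.get(chrom, [])
    if not tss_list:
        return None
    center = (start + end) // 2
    # Index of the first TSS at or after the peak center.
    i = bisect_left(tss_list, (center,))
    # Only the neighbours directly left and right can be the nearest.
    candidates = tss_list[max(i - 1, 0): i + 1]
    position, gene = min(candidates, key=lambda t: abs(t[0] - center))
    return gene, abs(position - center)


# Hypothetical annotation and peaks.
tss = {"chr1": [(1000, "GeneA"), (5000, "GeneB")]}
print(nearest_gene("chr1", 1800, 2200, tss))
print(nearest_gene("chr1", 4000, 4400, tss))
print(nearest_gene("chr2", 100, 200, tss))
```

A nearest-TSS rule ignores distal regulation and strand information, which is one reason why combining several annotation approaches, as described above, can be preferable.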

Integrating transcriptome and epigenome data analysis will help to put forward and test new hypotheses more efficiently. The R package Petra offers maximum flexibility for adding new features. It provides a combined analysis of epigenome and transcriptome data in which differentially bound transcription factors and differentially expressed genes are identified.

PhD project of Patrick Blumenkamp

The number of citations of DESeq2, edgeR, and limma grows every year (an increase of 535 % from 2015 to 2018), which shows that differential gene expression (DGE) analyses are still gaining importance. The vast amount of data generated by current sequencing instruments underlines the need for automated and reproducible analysis pipelines.

We are therefore developing a two-component software package for analyzing and visualizing RNA-Seq data with a focus on DGE analyses. The first component, Curare, is a modular Snakemake pipeline generator with modules for quality control, preprocessing, mapping, and in-depth analysis. The generated pipelines are built for high-throughput analyses and can be executed on local machines as well as on high-performance compute clusters. Each pipeline is entirely reproducible, and the existing collection of modules, which are customizable and extendable, makes pipeline generation flexible. The second component, the Gene Expression Visualizer (GenExVis), is a tool for visualizing DGE results: results can be analyzed interactively, and numerous charts can be created. All charts can be saved in common image file formats for use in presentations and publications. Together, the two components create an environment that supports the full process of data analysis, from the initial handling of raw RNA-Seq data to the final DGE analyses and result visualization.
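
The principle of generating a pipeline from modules can be sketched in a few lines. The sketch assumes that a user selects modules per stage and that the generator emits the steps in a fixed stage order. Only the four stage names come from the description above; the function `build_pipeline`, the module names and the output format are hypothetical and do not reflect Curare's actual interface or its Snakemake output.

```python
# Stage order as described in the text; everything else is hypothetical.
STAGES = ("quality_control", "preprocessing", "mapping", "analysis")


def build_pipeline(selection):
    """Return an ordered list of 'stage/module' steps.

    `selection` maps a stage name to a list of chosen module names.
    Stages may be given in any order and may be omitted.
    """
    unknown = sorted(set(selection) - set(STAGES))
    if unknown:
        raise ValueError(f"unknown stage(s): {', '.join(unknown)}")
    steps = []
    for stage in STAGES:  # enforce the fixed stage order
        for module in selection.get(stage, []):
            steps.append(f"{stage}/{module}")
    return steps


# Placeholder module names, given deliberately out of order.
selection = {
    "analysis": ["dge"],
    "mapping": ["short_read_aligner"],
    "quality_control": ["read_quality_report"],
}
pipeline = build_pipeline(selection)
print(pipeline)
```

Keeping the stage order in one place and the modules in a user-supplied selection is what makes such a design easy to extend: a new module only needs to be registered under an existing stage.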

PhD project of Nina Hofmann: Integrative and comparative analysis of virus-host interactions

Virus infections remain a major threat to human health. Virus-host interactions are often not fully understood, which in many cases results in a lack of available treatments and vaccines. RNA viruses are of particular interest because their replication machinery introduces a high number of nucleotide substitutions. This leads to high variability among the virus genomes, which is essential for adapting to changing environmental conditions or to new hosts.

In my thesis, I analyze high-throughput RNA-Seq data from human pathogenic RNA viruses of different families. These comprise the respiratory viruses human CoV-229E, MERS-CoV, a highly pathogenic H5N1 influenza virus (IV), a seasonal H1N1 IV and RSV; the hemorrhagic fever viruses Ebola virus (EBOV), Marburg virus (MARV), Lassa virus (LASV) and Rift Valley fever virus (RVFV); as well as Nipah virus (NIV), Sandfly fever Sicilian virus (SFSV), and hepatitis C virus (HCV). For this purpose, I am developing a bioinformatics pipeline tailored to evaluating transcriptome changes over time after infection with RNA viruses. The RNA-Seq pipeline provides an automated workflow for the joint evaluation of host transcriptome and viral genome data.