PhD projects

Collection of currently running PhD projects.

PhD project of Michael Schwabe: The transcriptome of Tineola bisselliella

Keratin is a fibrous structural protein and the main building block of hair, wool, feathers, horn, and nails. Slaughterhouses and poultry farms generate large amounts of keratin-containing waste every year, e.g. in the form of feathers. Keratin is very resistant to physical influences as well as to chemical and biological agents. As a result, such waste is often buried in landfills or burned. Nevertheless, keratin-containing waste does not accumulate in nature.

Only a few microorganisms can break down keratin, and even fewer higher eukaryotes have this ability. One of them is Tineola bisselliella, the common or webbing clothes moth. The mechanisms of keratin digestion in beetles, moths, and microorganisms differ from one another, and the keratin-degrading mechanism in the moth larvae has not yet been fully described. Therefore, we compare the transcriptomic shift in the intestine of T. bisselliella larvae fed with feathers (keratin-rich) and with insect carcasses (keratin-free). We search for known and new enzymes that are believed to be part of the keratin-degrading system. Our data include transcripts of potential symbionts as well as host transcripts.

The project was initiated and is financed by the Fraunhofer IME in Giessen.

PhD project of Patrick Barth

Non-coding RNAs (ncRNAs) are RNA molecules that are not translated into proteins. Nonetheless, they take part in a variety of essential biological processes and play important roles in the complex regulatory system of gene expression. In the last decade, several new ncRNA classes have been characterized, showing that ncRNA research is an active field with a steady stream of discoveries.

To gain more insight into the complex field of ncRNAs, the RTG 2355 was established, consisting of twelve project groups from different disciplines. This project is part of the RTG 2355 and aims at supporting the other members with data analyses and at implementing automated workflows, which will be made accessible in an easy-to-use manner. In this way, collaborations between the members across the different disciplines are encouraged.

As part of those collaborations, an iCLIP analysis pipeline has been developed and is still being extended with various post-processing analyses.

Additionally, a further project investigates the packaging of siRNAs into exosomes in plants and their potential role in cross-kingdom gene silencing.

PhD project of Andreas Hoek

WASP: A versatile, web-accessible single cell RNA-Seq processing platform

Since its first application in 2009, single cell RNA sequencing (scRNA-seq) has developed rapidly. Thanks to the unprecedented resolution of single cell technology, it is applicable in many different fields of research, ranging from basic research questions such as analyzing differentiation processes in cells to highly specific biomedical questions such as tumor cell characterisation. Furthermore, scRNA-seq has undergone many developments in the last decade, leading to a variety of different protocols, a massive gain in throughput and sensitivity, and a significant reduction in cost per cell compared to its early stages.

As a consequence, scRNA-seq requires tailored bioinformatic software solutions that can tackle the new challenges. These include protocol-specific processing of barcodes and unique molecular identifier (UMI) sequences for up to hundreds of thousands of cells in parallel, detection and characterisation of cellular clusters, and appropriate visualizations.
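
As a rough illustration of the barcode and UMI processing mentioned above, the following Python sketch counts unique UMIs per cell barcode and gene, so that PCR duplicates are counted only once. It is a minimal example, not part of WASP: the function name `count_umis`, the tag layout (barcode directly followed by the UMI), and the lengths of 6 and 8 bases are assumptions made for illustration and do not reflect the actual ddSeq read structure.

```python
from collections import defaultdict


def count_umis(reads, barcode_len=6, umi_len=8):
    """Count unique UMIs per (cell barcode, gene).

    `reads` is an iterable of (tag_sequence, gene) pairs. The tag is assumed
    to start with the cell barcode, directly followed by the UMI
    (a simplified, hypothetical layout).
    """
    umis = defaultdict(set)
    for tag, gene in reads:
        if len(tag) < barcode_len + umi_len:
            continue  # skip truncated tags
        barcode = tag[:barcode_len]
        umi = tag[barcode_len:barcode_len + umi_len]
        if "N" in barcode or "N" in umi:
            continue  # skip tags containing ambiguous bases
        # A set keeps each UMI once, which collapses PCR duplicates
        umis[(barcode, gene)].add(umi)
    return {key: len(value) for key, value in umis.items()}


reads = [
    ("AAACCCGGGTTTAA", "geneA"),
    ("AAACCCGGGTTTAA", "geneA"),  # PCR duplicate of the first read
    ("AAACCCTTTGGGCC", "geneA"),  # same cell, different UMI
    ("GGGTTTACGTACGT", "geneB"),
    ("GGGTTTACGTNCGT", "geneB"),  # ambiguous base, discarded
]
counts = count_umis(reads)
print(counts)
```

Real protocols additionally need error correction of barcodes and UMIs (for example, merging sequences within one mismatch), which this sketch deliberately omits.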

In my thesis, I am developing WASP, a web-accessible scRNA-seq analysis platform. WASP covers a complete workflow for data based on the ddSeq protocol, from raw reads to cell clustering, differential gene expression detection, and visualization. Due to its modular design, the software can easily be adapted to data from other protocols. For on-premise analysis of sensitive data, WASP can be run using Docker or Conda, or simply as a standalone version for Windows-based systems. Furthermore, users can interactively change parameters during the analysis workflow and download publication-ready visualizations.

PhD project of Maike Weber

Using Machine Learning to predict bacterial phenotypes

Machine learning and deep learning are on the rise in bioinformatic analysis, but so are their complexity and the knowledge required to use them. I aim to ease the path of biologists and medical scientists to neural networks by creating a framework that focuses on semi-automatic training of artificial neural networks (ANNs).

The first step towards that goal is to lay the groundwork for easier training of neural networks, without the need for the user to understand the underlying libraries and languages, such as TensorFlow, Keras, Nextflow, and Python. Using only the terminal, the training can be performed either locally or on the BCF cluster. Metadata and training parameters are collected during training and can be visualized after the run.

The end goal is a framework that can semi-automatically set the training parameters and find the best-performing model. The resulting models can be saved together with their metadata and reused for predictions on other genomic data.

Predicting bacterial phenotypes based on their genomic sequences serves as a benchmark to demonstrate the capability and accuracy of the neural networks. Several phenotypes, such as oxygen requirement, Gram property, and sporulation, have already been successfully classified by the framework. Additional phenotypes will follow, and the models will be made available for download following open-access guidelines.

PhD project of Tobias Zimmermann

Petra: A new R package for epigenome and transcriptome analysis within the Bioconductor platform.

The analysis of NGS data requires a computational infrastructure that allows relevant analysis tools to be run without expert programming knowledge. We want to provide an R package, Petra, for epigenome and transcriptome analysis within the Bioconductor platform.

Petra will be accessible to the community to enable researchers to perform standard ChIP-seq, ATAC-seq, and RNA-seq analyses. In addition to basic workflows for principal analysis steps, such as differential expression/binding analysis and visualizations for exploratory analysis, we would like to integrate more specific and complex functionality dedicated to questions arising from the combined analysis. For this purpose, we have implemented a super-enhancer detection algorithm based on peak position data. By combining different peak annotation approaches, we have designed a new peak-to-gene association algorithm. Additionally, functionality for the visualization of genomic data, such as browser snapshots, motif analysis, or correlation heat maps, has been implemented.
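
To give an idea of what super-enhancer detection from peak positions can involve, the following Python sketch stitches neighbouring peaks into larger regions and ranks the regions by their summed signal. This is not Petra's algorithm, which is not described here (and Petra itself is an R package). The sketch loosely follows the widely used stitch-and-rank idea, and the function name `stitch_peaks`, the tuple layout, and the 12,500 bp gap are assumptions chosen for illustration.

```python
def stitch_peaks(peaks, max_gap=12_500):
    """Merge peaks on the same chromosome whose gap is at most `max_gap` bp.

    `peaks` is a list of (chrom, start, end, signal) tuples.
    Returns stitched regions as (chrom, start, end, total_signal) tuples,
    sorted by total signal in descending order.
    """
    stitched = []
    # Sorting by chromosome and start lets us merge in a single pass
    for chrom, start, end, signal in sorted(peaks):
        last = stitched[-1] if stitched else None
        if last and last[0] == chrom and start - last[2] <= max_gap:
            last[2] = max(last[2], end)  # extend the current region
            last[3] += signal            # accumulate its signal
        else:
            stitched.append([chrom, start, end, signal])
    return sorted((tuple(r) for r in stitched), key=lambda r: r[3], reverse=True)


peaks = [
    ("chr1", 1_000, 2_000, 5.0),
    ("chr1", 10_000, 11_000, 7.0),   # within 12.5 kb of the first peak
    ("chr1", 50_000, 51_000, 3.0),
    ("chr2", 500, 1_500, 4.0),
]
regions = stitch_peaks(peaks)
for region in regions:
    print(region)
```

In a complete workflow, a cutoff on the ranked signal curve would then separate super-enhancer candidates from typical enhancers. That step is left out here.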

The integration of transcriptome and epigenome data analysis will help to put forward and test new hypotheses more efficiently. The R package Petra is designed to be flexible, so that new features can be added easily. It provides a combined analysis of epigenome and transcriptome data in which differentially bound transcription factors and differentially expressed genes are identified.

PhD project of Patrick Blumenkamp

The citations of DESeq2, edgeR, and limma increase every year (by 535 % from 2015 to 2018), which shows that differential gene expression (DGE) analyses are still gaining importance. The vast amount of data generated by current sequencing instruments underpins the need for automated and reproducible analysis pipelines.

Thus, we are developing a two-component software suite for analyzing and visualizing RNA-Seq data with a focus on DGE analyses. The first component, Curare, is a modular Snakemake pipeline generator consisting of quality control, preprocessing, mapping, and in-depth analysis modules. The pipelines are built for high-throughput analyses and can be executed on local machines as well as on high-performance compute clusters. Each pipeline is entirely reproducible, and the existing collection of modules, which are customizable and extendable, increases the flexibility of pipeline generation. The second component is a tool for visualizing DGE results. With the Gene Expression Visualizer (GenExVis), DGE results can be analyzed interactively, and numerous charts can be created. All charts can be saved in common image file formats for use in presentations and publications. Combined, the two components create an environment that supports the full process of data analysis, from the initial handling of RNA-Seq raw data to the final DGE analyses and result visualization.

PhD project of Raphael Müller

PhD project regarding gene regulation focusing on extracytoplasmic function sigma factors.

The RNA polymerase holoenzyme's ability to recognize promoter regions on the DNA originates from its sigma subunit. In this project, we focus on sigma factors with a particular interest in the alternative, i.e., non-essential, extracytoplasmic function (ECF) sigma factors.

Here, we develop a new online platform, ECFHub, which provides information about the different groups and subgroups of the ECF sigma factor family, together with interactive visualizations and analysis capacities that allow users to analyze their own ECF sequences.

Additionally, using the recently developed technique Cappable-Seq, we perform a new transcription start site (TSS) analysis of the Gram-positive Bacillus subtilis to gain insights into its complex gene regulation mechanisms.

PhD project of Nina Hofmann: Integrative and comparative analysis of virus-host interactions

Virus infections remain a major threat to human health. Virus-host interactions are often not fully understood, which in many cases results in a lack of available treatments and vaccines. RNA viruses are of particular interest because their replication machinery introduces a high number of nucleotide substitutions. This leads to high variability among the virus genomes, which is an essential factor in adapting to changing environmental conditions or to new hosts.

In my thesis, I analyze high-throughput RNA-Seq data from human-pathogenic RNA viruses of different families. These cover the respiratory viruses human CoV-229E, MERS-CoV, a highly pathogenic H5N1 influenza virus (IV), a seasonal H1N1 IV, and RSV; the hemorrhagic fever-causing viruses Ebola virus (EBOV), Marburg virus (MARV), Lassa virus (LASV), and Rift Valley fever virus (RVFV); as well as Nipah virus (NIV), Sandfly fever Sicilian virus (SFSV), and hepatitis C virus (HCV). For this purpose, I am developing a bioinformatics pipeline tailored to evaluating transcriptome changes in virus-host interactions over time after infection with RNA viruses. The RNA-Seq pipeline provides an automated workflow for the joint evaluation of host transcriptome and viral genome data.

PhD project of Christopher Schölzel: Object-oriented multi-scale modeling of biological systems with Modelica using the example of the human cardiovascular system

Modelica is an object-oriented, acausal modeling language built for complex cyber-physical systems that include both continuous variables and discrete events. Models can be formulated in a purely declarative mathematical way or by graphically combining pre-built components in a diagram view. In contrast to traditional approaches using languages like C and Matlab, the equations are arranged and solved entirely by a solver backend. This allows for a very concise hierarchical description that lends itself well to multi-scale modeling of biological systems.
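
For contrast, the following Python sketch shows what the traditional, imperative style looks like for a very small cardiovascular example: a two-element Windkessel model, C * dP/dt = Q(t) - P / R, integrated with the explicit Euler method. The equation has to be rearranged for dP/dt and stepped through time by hand, which is exactly the work a Modelica solver backend takes over. The model choice, the function name `simulate_windkessel`, and all parameter values are assumptions for illustration and are unrelated to the model used in the thesis.

```python
def simulate_windkessel(inflow, resistance, compliance, p0=0.0, dt=0.01, steps=2000):
    """Integrate C * dP/dt = Q(t) - P / R with the explicit Euler method.

    `inflow` is a function Q(t); the return value is the list of pressures
    at t = 0, dt, 2*dt, ..., steps*dt (arbitrary units).
    """
    pressure = p0
    trace = [pressure]
    for step in range(steps):
        t = step * dt
        # The balance equation is manually solved for the derivative dP/dt
        dp_dt = (inflow(t) - pressure / resistance) / compliance
        pressure += dt * dp_dt  # explicit Euler update
        trace.append(pressure)
    return trace


# Constant inflow: the pressure should approach the steady state Q * R
trace = simulate_windkessel(lambda t: 1.0, resistance=2.0, compliance=1.0)
print(f"final pressure: {trace[-1]:.4f}")
```

In a declarative language, only the balance equation would be stated, and the choice of integration scheme and the ordering of the equations would be left to the tool.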

In my thesis, I investigate the usefulness of Modelica in systems biology using the example of the Seidel-Herzel model of the human cardiovascular system, which includes the function of the baroreceptors, the lungs, the heart, and the autonomic nervous system.