Keratin is a fibrous structural protein and the main building block of hair, wool, feathers, horn, and nails. Slaughterhouses and poultry farms generate large amounts of keratin-containing waste every year, for example in the form of feathers. Keratin is highly resistant to physical stress and to chemical and biological agents. As a result, this waste is often buried in landfills or burned. Nevertheless, keratin-containing waste does not accumulate in nature.
Only a few microorganisms can break down keratin, and even fewer higher eukaryotes have this ability. One of them is Tineola bisselliella, the common or webbing clothes moth. The mechanisms of keratin digestion in beetles, moths, and microorganisms differ from one another, and the keratin-degrading mechanism in the moth larvae has not yet been fully described. Therefore, we compare the transcriptomic shift in the intestine between T. bisselliella larvae fed on feathers (keratin-rich) and larvae fed on insect carcasses (keratin-free). We search for known and novel enzymes that are believed to be part of the keratin-degrading system. Our data include transcripts from potential symbionts as well as host transcripts.
The project was initiated and is financed by the Fraunhofer IME in Giessen.
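The comparison between the two diets described above is commonly summarized per transcript as a log2 fold change. The following is only a minimal sketch of that idea, not the project's actual pipeline: the function name, the pseudocount, and the transcript names and counts are hypothetical, and the counts are assumed to be already normalized for library size. Real analyses typically rely on dedicated statistical tools that also model replicate variance.

```python
import math


def log2_fold_change(counts_a, counts_b, pseudocount=1.0):
    """Return log2((a + pseudocount) / (b + pseudocount)) for every transcript.

    counts_a and counts_b map transcript IDs to normalized counts.
    Transcripts missing from one condition are treated as having count 0.
    """
    transcripts = set(counts_a) | set(counts_b)
    return {
        t: math.log2(
            (counts_a.get(t, 0) + pseudocount) / (counts_b.get(t, 0) + pseudocount)
        )
        for t in transcripts
    }


# Hypothetical, made-up counts for illustration only.
feather_diet = {"protease_1": 255, "housekeeping": 99}
carcass_diet = {"protease_1": 31, "housekeeping": 99}

lfc = log2_fold_change(feather_diet, carcass_diet)
for transcript in sorted(lfc):
    print(transcript, round(lfc[transcript], 2))
```

A positive value indicates higher expression on the feather diet; the pseudocount keeps the ratio defined when a transcript is absent from one condition.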
Non-coding RNAs (ncRNAs) are RNA molecules that are not translated into proteins. Nonetheless, they take part in a variety of essential biological processes and play important roles in the complex regulation of gene expression. Several new ncRNA classes have been characterized in the last decade, showing that ncRNA research remains an active field with a steady stream of discoveries.
To gain more insight into the complex field of ncRNAs, the RTG 2355 was established, consisting of twelve project groups from different disciplines. This project is part of the RTG 2355 and aims to support the other members with data analyses and to implement automated workflows that will be made accessible in an easy-to-use manner. In this way, collaborations between members across the different disciplines are encouraged.
As part of these collaborations, an iCLIP analysis pipeline has been developed and is being extended with various post-processing analyses.
Additionally, a further project investigates the packaging of siRNAs into exosomes in plants and their potential role in cross-kingdom gene silencing.
Machine learning and deep learning are on the rise in bioinformatic analysis, but so are their complexity and the knowledge required to use them. I aim to ease the path of biologists and medical scientists to neural networks by creating a framework that focuses on semi-automatic training of artificial neural networks (ANNs).
The first step toward that goal is to lay the groundwork for easier training of neural networks, without requiring the user to understand the underlying libraries, tools, and languages, such as TensorFlow, Keras, Nextflow, and Python. Using only the terminal, training can be performed either locally or on the BCF cluster. Metadata and training parameters will be collected during training and can be visualized after the run.
The end goal is a framework that can semi-automatically set the training parameters and find the best-performing model. The resulting models can be saved together with their metadata and reused for predictions on other genomic data.
Predicting bacterial phenotypes from their genomic sequences serves as a benchmark to demonstrate the capability and accuracy of the neural networks. Several phenotypes, such as oxygen requirement, Gram staining, and sporulation, have already been successfully classified by the framework. Additional phenotypes will follow, and the models will be made available for download following open-access guidelines.
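The idea of semi-automatically choosing training parameters and keeping the best model can be illustrated with an exhaustive grid search. This is a minimal sketch under stated assumptions, not the framework's implementation: `grid_search`, `toy_train_and_score`, and the parameter names are hypothetical, and the scoring function is a stand-in for a real training run that would return, for example, validation accuracy.

```python
import itertools


def grid_search(train_and_score, search_space):
    """Evaluate every parameter combination and keep the best one.

    train_and_score: callable taking a dict of parameters, returning a score
                     (higher is better).
    search_space:    dict mapping parameter name -> list of candidate values.
    Returns (best_params, best_score, history), where history records every
    (params, score) pair so it can be stored as run metadata.
    """
    names = sorted(search_space)
    best_params, best_score = None, float("-inf")
    history = []
    for values in itertools.product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(params)
        history.append((params, score))
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, history


def toy_train_and_score(params):
    # Hypothetical stand-in for training a network: the score peaks at
    # learning_rate=0.01 and hidden_units=64 and falls off elsewhere.
    return (
        1.0
        - abs(params["learning_rate"] - 0.01) * 10
        - abs(params["hidden_units"] - 64) / 1000
    )


search_space = {
    "learning_rate": [0.1, 0.01, 0.001],
    "hidden_units": [32, 64, 128],
}
best_params, best_score, history = grid_search(toy_train_and_score, search_space)
print(best_params, best_score, len(history))
```

Grid search is used here only because it is the simplest strategy to read; the number of runs grows multiplicatively with each added parameter, so random or adaptive search is often preferred for larger search spaces.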
The analysis of NGS data requires computational infrastructure that allows relevant analysis tools to be run without expert programming knowledge. We want to provide an R package, Petra, for epigenome and transcriptome analysis within the Bioconductor platform.
Petra will be accessible to the community to enable researchers to perform standard ChIP-seq, ATAC-seq, and RNA-seq analyses. In addition to basic workflows for principal analysis steps, such as differential expression or binding analysis and visualizations for exploratory analysis, we would like to integrate more specific and complex functionality dedicated to questions that arise from the combined analysis. For this purpose, we have implemented a super-enhancer detection algorithm based on peak position data. By combining different peak annotation approaches, a new peak-to-gene association algorithm has been designed. Additionally, functionality for the visualization of genomic data, such as browser snapshots, motif analysis, and correlation heat maps, has been implemented.
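To make the two peak-based steps more concrete, the sketch below shows one common way such analyses are approached. It is not Petra's algorithm, which is not specified here and is written in R: it assumes a stitching step in the style of widely used super-enhancer callers (merging nearby peaks, with 12.5 kb as an assumed default gap) and a simple nearest-TSS rule for peak-to-gene association. All function names, coordinates, and gene names are hypothetical.

```python
from bisect import bisect_left


def stitch_peaks(peaks, max_gap=12500):
    """Merge peaks on one chromosome that lie within max_gap bp of each other.

    peaks: list of (start, end, signal) tuples.
    Returns stitched regions as (start, end, summed_signal), sorted by start.
    """
    stitched = []
    for start, end, signal in sorted(peaks):
        if stitched and start - stitched[-1][1] <= max_gap:
            prev_start, prev_end, prev_signal = stitched[-1]
            stitched[-1] = (prev_start, max(prev_end, end), prev_signal + signal)
        else:
            stitched.append((start, end, signal))
    return stitched


def nearest_gene(region, tss_list):
    """Return the gene whose TSS is closest to the region's midpoint.

    tss_list: non-empty list of (position, gene_name), sorted by position.
    """
    center = (region[0] + region[1]) // 2
    positions = [position for position, _ in tss_list]
    i = bisect_left(positions, center)
    # Only the TSSs directly left and right of the midpoint can be nearest.
    candidates = tss_list[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda tss: abs(tss[0] - center))[1]


# Hypothetical peaks and transcription start sites on a single chromosome.
peaks = [(1000, 2000, 5.0), (9000, 10000, 7.0), (60000, 61000, 3.0)]
tss_list = [(4000, "geneA"), (58000, "geneB")]

regions = stitch_peaks(peaks)
# Rank stitched regions by signal; super-enhancer callers then apply a
# signal cutoff to this ranking, which is not implemented in this sketch.
ranked = sorted(regions, key=lambda region: region[2], reverse=True)
assignments = [(region, nearest_gene(region, tss_list)) for region in ranked]
print(assignments)
```

The nearest-TSS rule is the simplest association strategy; combining it with other annotation approaches, as described above, would replace `nearest_gene` with a more elaborate scoring step.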
The integration of transcriptome and epigenome data analysis will help to put forward and test new hypotheses more efficiently. The R package Petra is designed for maximum flexibility in adding new features. It provides a combined analysis of epigenome and transcriptome data that identifies differentially bound transcription factors and differentially expressed genes.