Bioinformatics & Systems Biology
Today, massively parallel DNA sequencing or hybridization approaches allow the identification of not only the gene repertoire but also the gene regulatory networks of an organism. The huge amounts of data acquired from such experiments can only be handled with intensive bioinformatics support, which has to provide an adequate infrastructure for storing and analyzing these data. Thus, bioinformatics has to deliver efficient data analysis algorithms, user-friendly tools and software applications, as well as extensive hardware infrastructure to make such analyses feasible.
As part of the Bielefeld-Giessen Resource Center for Microbial Bioinformatics (BiGi), a service unit of the 'German Network for Bioinformatics Infrastructure – de.NBI', the group focuses on data management for genome and post-genome research projects that require new software solutions for systematic data acquisition, secure storage of structured information, and high-throughput data analysis. Bioinformatics training and education, as well as cooperation within the German bioinformatics community, are further core activities of the group.
- Recent publications
Platon: identification and characterization of bacterial plasmid contigs in short-read draft assemblies exploiting protein sequence-based replicon distribution scores
Plasmids play a vital role in the environmental adaptation of bacteria. Due to potential mobilization or conjugation capabilities, they are important genetic vehicles for antimicrobial resistance genes and virulence factors with huge clinical implications. To comprehensively characterize plasmids via NGS methods, Platon identifies and characterizes plasmid-borne contigs from bacterial short-read draft assemblies, achieving both high accuracy and balanced classifications in terms of sensitivity and specificity. The software follows a new approach to this problem, exploiting the natural distribution bias of protein-coding genes between chromosomes and plasmids, which is represented by a new metric: the replicon distribution score (RDS). To further increase sensitivity, Platon applies several heuristics that take plasmid-specific contig characterizations into account.
ASA³P: An automatic and scalable pipeline for the assembly, annotation and higher-level analysis of closely related bacterial isolates
Bacterial whole-genome sequencing has become a daily routine in many fields. Technical advances and dropping costs have resulted in a tremendous increase in available sequence data. However, the comprehensive in-depth analysis thereof remains an arduous and time-consuming task. Here, we introduce ASA³P, a fully automatic, locally executable and scalable assembly, annotation and analysis pipeline for bacterial genomes. The pipeline conducts the necessary raw data processing steps as well as comprehensive genome characterizations, e.g. taxonomic classification and the detection of AMR genes and virulence factors. The workflow is fully automated and scales from tiny to very large cohorts via a portable Docker image or a highly scalable cloud version. All results are presented via interactive HTML reports.
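The idea behind the Platon abstract above, scoring a contig by how strongly its protein-coding genes are biased towards plasmids or chromosomes, can be illustrated with a small sketch. This is not Platon's actual RDS formula or code: the log-ratio score, the averaging per contig, the threshold of 0.0 and all function names are assumptions chosen purely for illustration.

```python
from math import log2


def toy_bias_score(plasmid_freq: float, chromosome_freq: float,
                   pseudocount: float = 1e-6) -> float:
    """Hypothetical per-protein bias score (NOT the published RDS).

    Takes the fraction of known plasmids and of known chromosomes that
    encode a protein and returns a log-ratio: positive values indicate
    plasmid bias, negative values indicate chromosome bias.
    """
    return log2((plasmid_freq + pseudocount) / (chromosome_freq + pseudocount))


def mean_contig_score(protein_scores: list[float]) -> float:
    """Average the per-protein scores of one contig (0.0 if it has none)."""
    if not protein_scores:
        return 0.0
    return sum(protein_scores) / len(protein_scores)


def classify(contig_score: float, threshold: float = 0.0) -> str:
    """Label a contig using an assumed, illustrative threshold."""
    return "plasmid" if contig_score > threshold else "chromosome"


if __name__ == "__main__":
    # Invented example frequencies: (plasmid fraction, chromosome fraction).
    proteins = [(0.80, 0.10), (0.40, 0.05), (0.02, 0.60)]
    scores = [toy_bias_score(p, c) for p, c in proteins]
    print(classify(mean_contig_score(scores)))
```

A real classifier would derive the frequencies from a reference database and calibrate its thresholds against labelled replicons. The sketch only shows how a per-protein distribution bias can be aggregated into a contig-level decision.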