GCB 2015 - Program

10:00 - Registration open, coffee

11:00 - Tutorials begin

Tutorial 1 - Fundamentals of Proteome Bioinformatics Revisited

PD Dr. Martin Eisenacher, Prof. Dr. Oliver Kohlbacher

Tutorial 2 - Introduction to Snakemake

Dr. Johannes Köster

Tutorial 3 - Introduction to Data Mining with RapidMiner

Dr. Tim Ruhe

Contributed by Collaborative Research Center SFB 876

17:30 Welcome Reception (after tutorials)

Please join us for a welcome drink and snacks. An opportunity to register and to pick up the conference documents will be provided.

Schedule overview

10:00 - 11:00	Registration, coffee
11:00 - 13:00	Tutorials
13:00 - 14:00	Lunch Break (catering)
14:00 - 15:40	Tutorials
15:40 - 16:00	Coffee Break
16:00 - 17:30	Tutorials
17:30 - 19:00	Welcome reception

8:00 - Registration, coffee, poster set-up

Session Chair: Sven Rahmann

9:00 - Opening

Prof. Dr. Sven Rahmann (UA Ruhr)

Prof. Dr. Winfried Schulze (Mercator Research Center Ruhr)

Prof. Dr. Matthias Rarey (FaBI)

9:30 - Modelling Coverage in RNA Sequencing

Arndt von Haeseler

(Joint work with Celine Prakash, Florian Pflug, Luis Felipe Paulin Paz)

RNA sequencing (RNA-seq) is the method of choice for measuring the expression of RNAs in a cell population. In an RNA-seq experiment, sequencing the full length of larger RNA molecules requires fragmentation into smaller pieces that are compatible with the limited read lengths of most deep-sequencing technologies. Non-uniform coverage across a genomic feature has long been a concern in RNA-seq and is attributed to preferences for certain fragments during library preparation and sequencing. However, the disparity between the observed non-uniformity of read coverage in RNA-seq data and the assumed uniformity raises the question of what read coverage profile one should expect across a transcript if the sequencing protocol has no biases. We propose a simple model of unbiased fragmentation in which the expected coverage profile is not uniform and, in fact, depends on the ratio of fragment length to transcript length. To compare the non-uniformity predicted by our model with experimental data, we extended this simple model to incorporate empirical attributes matching those of the sequenced transcript in an RNA-seq experiment. In addition, we imposed an experimentally derived distribution on the frequency at which fragment lengths occur.

We used this model to compare our theoretical prediction with experimental data and with the uniform coverage model. If time permits, we will also discuss a potential application of our model.
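The dependence of coverage on the ratio of fragment length to transcript length can be illustrated with a toy calculation. The sketch below is not the speakers' model. It assumes a single fixed fragment length and a fragment start position drawn uniformly from all positions where the fragment still fits on the transcript. The function name and parameters are hypothetical.

```python
def expected_coverage(transcript_len, fragment_len):
    """Probability that one uniformly placed fragment covers each position.

    Assumption (not from the talk): the fragment start is uniform over
    0 .. transcript_len - fragment_len, and all fragments have equal length.
    """
    if not 0 < fragment_len <= transcript_len:
        raise ValueError("need 0 < fragment_len <= transcript_len")
    last_start = transcript_len - fragment_len
    n_starts = last_start + 1
    coverage = []
    for x in range(transcript_len):
        # Starts s with s <= x <= s + fragment_len - 1, clipped to valid starts.
        first = max(0, x - fragment_len + 1)
        last = min(x, last_start)
        coverage.append((last - first + 1) / n_starts)
    return coverage


# Two ratios of fragment length to transcript length.
small_ratio = expected_coverage(1000, 100)  # ratio 0.1
large_ratio = expected_coverage(1000, 600)  # ratio 0.6
```

Under these assumptions the profile rises linearly from both transcript ends to a central plateau. With a ratio of 0.1 the plateau sits at 100/901, while with a ratio of 0.6 every fragment covers the central positions, so the plateau reaches 1.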

10:20 - Coffee break; posters on display

Session Chair: Knut Reinert

11:00 - Statistical Learning in Computational Biology

Nico Pfeifer

11:20 - From raw ion mobility measurements to disease classification: a comparison of analysis processes

Salome Horsch, Dominik Kopczynski, Jörg Ingo Baumbach, Jörg Rahnenführer and Sven Rahmann

11:40 - Computing and Visualizing Precision-Recall Curves and Receiver Operating Characteristic Curves for Soft-labeled and Hard-labeled Data

Ivo Grosse, Jan Grau and Jens Keilwagen

12:00 - FSOL - a workflow for the detection of patient subgroups and affected molecular features in high-throughput omics data

Maike Ahrens, Michael Turewicz, Katrin Marcus, Helmut E. Meyer, Caroline May, Martin Eisenacher and Jörg Rahnenführer

12:20 - Cluster analysis and visualization techniques for large datasets in complexome profiling

Heiko Giese, Joerg Ackermann, Ulrich Brandt, Ilka Wittig and Ina Koch

12:40 - Lunch break (see survival guide)

Session Chair: Martin Eisenacher

14:00 - LoRDEC: a tool for correcting errors in long sequencing reads

Eric Rivals

(Joint work with L. Salmela and A. Makrini)

High-throughput DNA/RNA sequencing is a routine experiment in molecular biology and the life sciences in general. For instance, it is increasingly used in hospitals as a key procedure of personalized medicine. Compared to the second generation, third generation sequencing technologies produce longer reads with comparatively lower throughput and a higher error rate. Those errors include substitutions and indels, and they hinder or at least complicate downstream analyses such as mapping or de novo assembly. However, these long read data are often used in conjunction with short reads of the second generation.

I will present a hybrid strategy for correcting the long reads using the short reads that we introduced last year. Unlike existing error correction tools, ours, called LoRDEC, avoids aligning short reads on long reads, which is computationally intensive. Instead, it takes advantage of a succinct graph to represent the short reads, and compares long reads to paths in the graph. Experiments show that LoRDEC outperforms existing methods in running time and memory while achieving a comparable correction performance. It can correct both Pacific Biosciences and MinION reads from Oxford Nanopore.
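The idea of comparing a long read to paths in a graph built from short reads can be sketched in a few lines. This is a toy illustration, not LoRDEC's implementation. It assumes a de Bruijn graph over k-mers stored in a plain Python set rather than a succinct structure, and all function names, reads and parameters are made up for the example.

```python
def build_kmer_set(short_reads, k):
    """Collect all k-mers of the short reads (the nodes of the graph)."""
    kmers = set()
    for read in short_reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers


def successors(kmer, kmers):
    """Graph nodes that overlap `kmer` by k-1 characters."""
    return [kmer[1:] + c for c in "ACGT" if kmer[1:] + c in kmers]


def solid_mask(long_read, kmers, k):
    """True where the k-mer starting at that position occurs in the short reads."""
    return [long_read[i:i + k] in kmers for i in range(len(long_read) - k + 1)]


def find_path(source, target, kmers, max_len):
    """Depth-first search for a path source -> target in the graph.

    Returns the bases spelled after `source` (at most max_len), or None.
    """
    stack = [(source, "")]
    while stack:
        node, spelled = stack.pop()
        if node == target and spelled:
            return spelled
        if len(spelled) < max_len:
            for nxt in successors(node, kmers):
                stack.append((nxt, spelled + nxt[-1]))
    return None


k = 4
short_reads = ["ACGTTGCAAGCT"]   # hypothetical accurate short read
long_read = "ACGTTGGAAGCT"       # same region with one substitution (C -> G)

kmers = build_kmer_set(short_reads, k)
mask = solid_mask(long_read, kmers, k)

# The weak region lies between the solid k-mers at positions 2 and 7.
bridge = find_path(long_read[2:2 + k], long_read[7:7 + k], kmers, max_len=6)
corrected = long_read[:2 + k] + bridge + long_read[7 + k:]
```

In this example the weak region is bridged by the path spelling `CAAGC`, and `corrected` equals the short read. A real tool would also have to choose among several candidate paths and handle the ends of reads, which this sketch leaves out.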

LoRDEC is available at http://atgc.lirmm.fr/lordec.

14:40 - An Optimization Approach to Detect Differentially Methylated Regions from Whole Genome Bisulfite Sequencing Data

Nina Hesse, Christopher Schröder and Sven Rahmann

15:00 - Integrative analysis of Epigenomics Data using hidden Markov Models in the R package STAN

Benedikt Zacher, Rafael Campos-Martin, Julien Gagneur and Achim Tresch

15:20 - Coffee break; posters on display

Session Chair: Sebastian Böcker

16:00 - Algorithms for Computational Genomics

Tobias Marschall

16:20 - Simultaneous Gene Finding in Multiple Genomes

Stefanie König, Lars Romoth, Lizzy Gerischer and Mario Stanke

16:40 - Fast alignment-free sequence comparison using spaced-word frequencies

Chris-Andre Leimeister, Marcus Boden, Sebastian Lindner, Sebastian Horwege and Burkhard Morgenstern

17:00 - Poster flash presentations

17:30 - Poster session

19:00 - End

20:00 - PC Dinner

(by invitation only)

Schedule overview

8:00 - 9:00	Registration, coffee, poster set-up
9:00 - 9:30	Opening
9:30 - 10:20	Arndt von Haeseler
10:20 - 11:00	Coffee break
11:00 - 12:40	Talks
12:40 - 14:00	Lunch break
14:00 - 14:40	Eric Rivals
14:40 - 15:20	Talks
15:20 - 16:00	Coffee break
16:00 - 17:00	Talks
17:00 - 17:30	Poster flash presentations
17:30 - 19:00	Poster session
20:00 - 21:00	PC Dinner

8:00 - Registration open, coffee, posters on display

Session Chair: Tobias Marschall

9:00 - From sequence analysis to graph analysis

Veli Mäkinen

The abstraction of a genome as a linear sequence has created a vast sequence analysis literature with a plethora of interesting subproblems defined and often solved with optimal algorithms; recent results in compressed indexing provide linear time sequence analysis functionality even in space close to what an input sequence occupies. One could say it is time to move on to more realistic abstractions of genomic content. This talk explores what happens to selected classical sequence analysis tasks when labeled directed acyclic graphs (labeled DAGs) are used as inputs. Applications in partially phased diploid genomes, pan-genomes, and splicing graphs are discussed. Some algorithms for the new problems are presented. The talk concludes with a list of open problems to summarize what needs to be achieved in order for the theory of labeled DAG analysis to reach a level of completion similar to that of sequence analysis.
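As a small illustration of moving a classical task from sequences to labeled DAGs, the sketch below checks whether a pattern is spelled by some path in a DAG whose nodes each carry one character. It is not taken from the talk. The graph encoding (two dictionaries) and the function name are assumptions made for the example.

```python
from functools import lru_cache


def occurs_in_dag(labels, edges, pattern):
    """Return True if some path in the DAG spells `pattern`.

    labels: dict mapping node -> single-character label
    edges:  dict mapping node -> list of successor nodes
    """
    if not pattern:
        return False

    @lru_cache(maxsize=None)
    def match_from(node, i):
        # Does pattern[i:] match a path starting at `node`?
        if labels[node] != pattern[i]:
            return False
        if i == len(pattern) - 1:
            return True
        return any(match_from(nxt, i + 1) for nxt in edges.get(node, []))

    return any(match_from(v, 0) for v in labels)


# Toy graph for the sequence "AC?A" where "?" is either G or T.
labels = {0: "A", 1: "C", 2: "G", 3: "T", 4: "A"}
edges = {0: [1], 1: [2, 3], 2: [4], 3: [4]}

found_cga = occurs_in_dag(labels, edges, "CGA")
found_cgt = occurs_in_dag(labels, edges, "CGT")
```

Here `found_cga` is True and `found_cgt` is False, because G and T lie on alternative branches. Caching each (node, pattern position) pair keeps the work proportional to the pattern length times the number of edges, which hints at why exact matching carries over to DAGs fairly directly while other sequence tasks are harder to generalize.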

9:40 - Virus-Host Transcriptomics

Caroline C. Friedel

10:00 - From Predicting to Analyzing HIV-1 Resistance to Broadly Neutralizing Antibodies

Anna Feldmann and Nico Pfeifer

10:20 - Natural genetic variation impacts expression levels of coding, non-coding and antisense transcripts in fission yeast

Mathieu Clément-Ziza, Francesc Marsellach, Sandra Codlin, Manos Papadakis, Susanne Reinhardt, Maria Rodriguez-Lopez, Stuart Martin, Samuel Marguerat, Alexander Schmidt, Eunhye Lee, Christopher Workman, Jürg Bähler and Andreas Beyer

10:40 - Coffee break

Session Chair: Jörg Rahnenführer

11:20 - Efficient Duplicate Rate Estimation from Subsamples of Sequencing Libraries

Christopher Schröder and Sven Rahmann

11:40 - Ultra-fast functional classification of short reads using UProC with Pfam and KEGG

Manuel Landesfeind, Robin Martinjak, Heiner Klingenberg and Peter Meinicke

12:00 - Next Generation Cluster Editing

Thomas Bellitto, Tobias Marschall, Alexander Schoenhuth and Gunnar W. Klau

12:20 - Collecting reliable clades using the Greedy Strict Consensus Merger

Markus Fleischauer and Sebastian Böcker

12:40 - Causal modelling of stroma-cancer cell communication

Julia Catherine Engelmann, Claus Hellerbrand and Rainer Spang

13:00 - Lunch break (see survival guide)

Session Chair: Axel Mosig

14:00 - ELIXIR Europe: the European life science infrastructure for biological data

Andrew Smith

The life sciences are undergoing a transformation. Scientists are rapidly generating the most complex and heterogeneous datasets that science can currently imagine, with unprecedented volumes of biological data to manage. Data will only generate long-term value if it is Findable, Accessible, Interoperable and Re-usable (‘FAIR’). This requires a scalable infrastructure that connects local, national and European efforts and provides standards, tools and training for data management and analysis.

Established in January 2014, ELIXIR - the European life science Infrastructure for Biological Information - is a distributed organisation comprising national bioinformatics research infrastructures across Europe and the European Bioinformatics Institute (EMBL-EBI). This coordinated infrastructure supports data standards, exchange, interoperability, storage, security and training. From September 2015, the newly-awarded ELIXIR-EXCELERATE Horizon 2020 grant will fast-track ELIXIR’s early implementation phase by coordinating and enhancing existing resources into a world-leading data service for academia and industry and growing bioinformatics capacity and competence across Europe.

14:40 - Gemeinsame Fachgruppe Bioinformatik (FaBI)

Prof. Dr. Matthias Rarey

15:20 - Coffee break

16:00 - Take S-Bahn S1 to Bochum (see survival guide)

17:00 - Social event: Bergbaumuseum

18:30 - Conference Dinner

21:00 - Return to Dortmund (regional train)

Schedule overview

8:00 - 9:00	Registration, coffee, view posters
9:00 - 9:40	Veli Mäkinen
9:40 - 10:40	Talks
10:40 - 11:20	Coffee break
11:20 - 13:00	Talks
13:00 - 14:00	Lunch break
14:00 - 14:40	Andrew Smith
14:40 - 15:20	FaBI
15:20 - 16:00	Coffee break
16:00 - 21:00	Social event (Bergbaumuseum Bochum)

8:00 - Registration, coffee, pick up posters!

Session Chair: Ina Koch

9:00 - Intra-tumour heterogeneity and genomic rearrangements in human malignancies

Roland Schwarz

Accurate reconstruction of the evolutionary history of cancer in the patient and quantification of intra-tumour heterogeneity (ITH) are current challenges in cancer genomics. Genomic rearrangements are thereby of particular importance, but notoriously difficult to deal with computationally. The accuracy of tree inference from genomic rearrangements further depends on the quality of the phasing of copy-numbers: the assignment of major and minor copy-numbers to the two physical parental alleles. So far, phasing has been done using evolutionary criteria alone, a heuristic and computationally expensive procedure which impedes probe-level resolution tree reconstruction.

I will give an overview of the challenges and current state of research in reconstructing cancer trees from copy-number data. Results from our clinical studies demonstrate how ITH is associated with chemotherapy resistance in the clinic. I will further illustrate the importance of haplotype-specific copy-number assignment and show how the common genetic background between multiple samples from the same patient can be used to accurately phase copy-number data. This is a crucial step towards probe-level resolution tree inference on genomic rearrangement events in cancer and exact quantification of genetic heterogeneity for routine applications in translational cancer research.

9:40 - Statistical models of non-coding RNA-mediated gene regulation

Annalisa Marsico

10:00 - Motif clustering with implications for transcription factor interactions

Jan Grau, Ivo Grosse, Stefan Posch and Jens Keilwagen

10:20 - Varying levels of complexity in transcription factor binding motifs

Jan Grau and Jens Keilwagen

10:40 - Coffee break

Session Chair: Caroline Friedel

11:20 - Development and application of computational methods for lead identification and optimization

Johannes Kirchmair

11:40 - Integrating Sequence and Structure Information for Efficient Retrieval and Alignment of Flexible Protein Binding Sites

Stefan Bietz and Matthias Rarey

12:00 - SEMS: Improving the management of simulation studies in computational biology

Dagmar Waltemath

12:20 - A minimal model for explaining the higher ATP production in the Warburg effect

Stefan Schuster, Daniel Boley, Philip Möller and Christoph Kaleta

12:40 - Best poster awards and farewell

13:00 - End (or have another lunch)

Schedule overview

8:00 - 9:00	Registration, coffee, remove posters
9:00 - 9:40	Roland Schwarz
9:40 - 10:40	Talks
10:40 - 11:20	Coffee break
11:20 - 13:00	Talks
13:00 - 14:00	End / lunch