

Research unit: EU RFP
Project number: 95.0236-2
Project title: TRADAT: Characterization of regulatory genomic regions. Development of databases and sequence analysis tools


Inserted texts


Category / Text
Key words
(English)
Gene regulation; transcription; databases; algorithms; bioinformatics
Alternative project number
(English)
EU project number: BIO4-CT95-0226
Research programs
(English)
EU programme: 4th Framework Research Programme - 4.1 Biotechnology
Short description
(English)
See abstract
Further information
(English)
Full name of research-institution/enterprise:
EPF Lausanne
Laboratoire de biotechnologie moléculaire
Partners and International Organizations
(English)
E. Wingender (GBF-D), M. Bishop (MRC-UK), P. Bucher (ISREC-CH), L. Milanesi (CNR-I), W. Thomas (GSF-D)
Abstract
(English)
Efficient computational tools are increasingly used for high-throughput analysis of newly determined DNA sequences in genome sequencing projects. The potential function and regulation of new genes can be addressed by algorithms that predict the regulatory DNA sequences acting as binding sites for transcription factor proteins. However, the efficacy of the currently available algorithms has in most cases not been validated experimentally.

The Lausanne contribution to the TRADAT project was twofold. First, we evaluated the efficacy of various widely available computer methods for predicting binding sequences for the human CTF/NF1 regulatory proteins. None of the tested tools gave results that agreed well with in vitro binding strength, and attempts to define a cut-off between functional and non-functional sequences produced large proportions of false positive and/or false negative results. Our results indicated that the reliability of these prediction tools may be limited both by the set of training data available in the scientific literature and by the prediction methods and algorithms used to build the tools.
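
The cut-off problem described above can be illustrated with a minimal sketch. It is not one of the tools evaluated in the project; the scores, the functional labels, and the name `confusion_counts` are all invented for illustration. The sketch only shows how moving a score cut-off trades false positives against false negatives when predicted scores do not track experimental results.

```python
# Toy data, invented for illustration: predicted scores for five candidate
# sites and whether each site was found functional in an experiment.
SCORES = [9, 7, 6, 4, 3]
FUNCTIONAL = [True, False, True, True, False]


def confusion_counts(scores, functional, cutoff):
    """Return (false_positives, false_negatives) for a score cut-off.

    A site is predicted functional when its score is >= cutoff.
    """
    false_pos = sum(1 for s, f in zip(scores, functional) if s >= cutoff and not f)
    false_neg = sum(1 for s, f in zip(scores, functional) if s < cutoff and f)
    return false_pos, false_neg


if __name__ == "__main__":
    for cutoff in (3, 5, 8):
        fp, fn = confusion_counts(SCORES, FUNCTIONAL, cutoff)
        print(f"cutoff={cutoff}: false positives={fp}, false negatives={fn}")
```

With this toy data, no cut-off removes both kinds of error at once, which is the situation the paragraph above reports for the tested tools.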

In the second part of our work, we tested several experimental methods for measuring protein-DNA interaction and chose one of them to analyze systematically a collection of DNA binding sequences for the CTF/NF1 proteins. This study indicated that additional computational parameters are required for accurate binding site prediction. First, current prediction methods assume that the interactions of the regulatory protein with distinct base pairs are independent, which implies that the effects of multiple base pair substitutions on binding strength should be additive. However, we found that this is not true for most of the combinations of substitutions that we evaluated. Second, the study indicated that the length of the binding sites is quite flexible, another parameter usually not taken into account by current computer tools.
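
The independence assumption described above can be made concrete with a short sketch. It assumes a simple additive position-specific scoring matrix; the matrix values, the four-base toy site, and the function names are hypothetical and are not taken from the project's data or tools. Under the additive model, the predicted effect of a double substitution equals the sum of the two single-substitution effects, and a measured value that differs from that sum indicates non-independent positions.

```python
# Hypothetical per-position penalties (arbitrary units) for a 4-base toy site.
# A penalty of 0 marks the preferred base at that position.
MATRIX = [
    {"A": 0, "C": 2, "G": 3, "T": 1},
    {"A": 1, "C": 0, "G": 2, "T": 3},
    {"A": 2, "C": 3, "G": 0, "T": 1},
    {"A": 3, "C": 1, "G": 2, "T": 0},
]


def additive_score(matrix, site):
    """Score a site by summing independent per-position contributions."""
    if len(site) != len(matrix):
        raise ValueError("site length must match matrix length")
    return sum(matrix[i][base] for i, base in enumerate(site))


def additivity_deviation(effect_a, effect_b, effect_ab):
    """Measured double-substitution effect minus the sum of single effects.

    A result of 0 is what the independence assumption predicts.
    """
    return effect_ab - (effect_a + effect_b)


if __name__ == "__main__":
    reference = additive_score(MATRIX, "ACGT")   # preferred site
    single_a = additive_score(MATRIX, "CCGT")    # substitution at position 1
    single_b = additive_score(MATRIX, "ACGA")    # substitution at position 4
    double = additive_score(MATRIX, "CCGA")      # both substitutions
    # By construction, the model's double effect is the sum of the singles.
    print(double - reference == (single_a - reference) + (single_b - reference))
    # A hypothetical measured double effect of 8 would deviate from additivity.
    print(additivity_deviation(single_a, single_b, 8))
```

Any additive matrix passes the first check by construction, so only experimental measurements of combined substitutions, such as those described above, can reveal whether the assumption holds for a given protein.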

These findings led to a biochemical model for the binding of this transcription factor to regulatory DNA sequences. This model in turn formed the basis for the construction of new computer prediction tools by several partner laboratories. Our subsequent experimental evaluation of the new methods indicated that they predict the binding affinity for natural and synthetic DNA sequences quite accurately, thus validating these prediction tools. Altogether, this study not only highlighted some of the limitations of the usual computational tools but, more importantly, also formed the basis for the generation of improved prediction methods.
References in databases
(English)
Swiss Database: Euro-DB of the
State Secretariat for Education and Research
Hallwylstrasse 4
CH-3003 Berne, Switzerland
Tel. +41 31 322 74 82
Swiss Project-Number: 95.0236-2