Pure Appl. Chem. Vol. 74, No. 6, pp. 899-905 (2002)

Pure and Applied Chemistry

Microbial computational genomics of gene regulation*

Julio Collado-Vides, Gabriel Moreno-Hagelsieb, and Arturo Medrano-Soto**

Program of Computational Genomics, CIFN-UNAM, Av. Universidad s/n, Cuernavaca, 62100 Morelos, Mexico

Abstract: Escherichia coli is a free-living bacterium that embodies a large legacy of knowledge accumulated over years of experimental work in molecular biology. It represents a point of departure for analyses and comparisons with the ever-increasing number of finished microbial genomes. For years, we have been gathering knowledge from the literature on transcriptional regulation and operon organization in E. coli K-12, and organizing it in a relational database, RegulonDB. RegulonDB contains information on 20–25 % of the expected total sets of regulatory interactions at the level of transcription initiation. We have used this knowledge to develop computational methods to predict the missing sets in the genome of E. coli, focusing on the prediction of promoters, regulatory sites, regulatory proteins, operons, and transcription units. These predictions constitute separate pieces of a single puzzle. By putting them all together, we shall be able to predict the complete set of regulatory interactions and the transcription unit organization of E. coli. Orthologs in other genomes of genes known to be co-regulated in E. coli, along with their corresponding predicted operons and predicted transcriptional regulators, shall permit the extension of this goal to many more microbial genomes.

* Plenary lecture presented at the International Conference on Bioinformatics 2002: North-South Networking, Bangkok, Thailand, 6-8 February 2002. Other presentations are published in this issue, pp. 881-914.
** Corresponding author.

