GenoChore: a family of GenoList-based specialized microbial databases

Gang Fang1,3, Christine Ho1, Virginie Cubas1, Yaowu Qiu1, Cédric Cabau1, Zhou Yu1, Frankie Cheung1, Ivan Moszer2,3, Antoine Danchin1,3

1HKU-Pasteur Research Centre, Dexter HC Man Building, 8, Sassoon Road, Pokfulam, Hong Kong, China
2Plateforme Integration et Analyse des Génomes, Génopole, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
3Unité de Génétique des Génomes Bactériens, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France

Contact: antoine.danchin[at]normalesup[dot]org

An example: LeptoList

Database description

LeptoList is the reference database dedicated to the genome of Leptospira interrogans serovar Lai, the paradigm of leptospirosis causative agents (1). It was developed as a spin-off of the B. subtilis genome project (2). GenoChore provides a data schema for the management of DNA and protein sequences, combined with the relevant annotations and functional assignments (3). Information about gene functions and products is updated by a curating team. LeptoList is based on GenoChore, a generic relational data schema and World-Wide Web interface developed for the handling of bacterial genomes. The underlying database management system used in the Hong Kong-based databases is MySQL. The WWW interface was designed to let users browse genome data easily and retrieve information according to common biological queries. LeptoList also provides more elaborate tools, such as pattern searching, which are tightly connected to the overall browsing system. LeptoList was accessible at the HKU Computer Centre, as were similar bacterial databases (for Bacillus anthracis, Bacillus cereus, Neisseria meningitidis, Pseudomonas aeruginosa, Staphylococcus epidermidis, Streptomyces coelicolor, Thermoanaerobacter tengcongensis, Vibrio cholerae, Xylella fastidiosa, etc.).
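As a rough illustration of how a relational schema can link sequences to their annotations and support pattern searching, the sketch below uses Python with an in-memory SQLite database in place of MySQL. The table names, column names, gene names, sequences and products are all hypothetical; they do not reproduce the actual GenoChore schema or any LeptoList content.

```python
import re
import sqlite3

# Hypothetical, simplified schema: every table and column name here is an
# illustrative assumption, not the real GenoList/GenoChore layout.
SCHEMA = """
CREATE TABLE gene (
    gene_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    start_pos INTEGER NOT NULL,
    end_pos   INTEGER NOT NULL,
    strand    TEXT NOT NULL CHECK (strand IN ('+', '-')),
    dna_seq   TEXT NOT NULL
);
CREATE TABLE annotation (
    gene_id INTEGER NOT NULL REFERENCES gene(gene_id),
    product TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database filled with two made-up genes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO gene VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "geneA", 1, 12, "+", "ATGAAAGGATCC"),   # toy sequence
            (2, "ygeneB", 20, 31, "-", "ATGCCCTTTTAA"),  # toy sequence
        ],
    )
    conn.executemany(
        "INSERT INTO annotation VALUES (?, ?)",
        [(1, "toy product A"), (2, "unknown")],
    )
    conn.commit()
    return conn


def genes_matching(conn, pattern):
    """Return (name, product) for each gene whose DNA matches the regex."""
    regex = re.compile(pattern)
    rows = conn.execute(
        "SELECT g.name, g.dna_seq, a.product "
        "FROM gene AS g JOIN annotation AS a ON a.gene_id = g.gene_id "
        "ORDER BY g.gene_id"
    )
    return [(name, product) for name, seq, product in rows if regex.search(seq)]


if __name__ == "__main__":
    demo = build_demo_db()
    # Search for a fixed motif, then for a degenerate pattern.
    print(genes_matching(demo, "GGATCC"))
    print(genes_matching(demo, "ATG[ACGT]{3}TTT"))
```

Keeping annotations in a separate table from the sequences means curators could update functional assignments without touching the sequence records; this is one plausible design choice, not a description of the production system.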

Other databases using the GenoList data schema

Several databases have been constructed following this data schema. Genes likely to be orthologs of genes identified in model genomes have been named accordingly. A gene name starting with the letter "y" indicates that the gene has not yet been identified experimentally, nor convincingly identified by in silico analysis. In addition to providing sequence information and genome annotation, these databases allow selected curators to modify the annotations. If you are interested in curating some of these data, please contact Antoine Danchin and Frankie Cheung to obtain protected access to the curation pages.

Database     Organism                          Accession number(s)  
AeruList     Pseudomonas aeruginosa            AE004091  
AnthraList   Bacillus anthracis                AE016879  
CampyloList  Campylobacter jejuni              AL111168  
CereList     Bacillus cereus                   AE016877  
CholeList    Vibrio cholerae                   AE003852, AE003853  
CoeliList    Streptomyces coelicolor           AL645882  
InfluList    Haemophilus influenzae            L42023  
LeptoList    Leptospira interrogans LAI        AE010300, AE010301  
MeningoList  Neisseria meningitidis            AE002098  
SepiList     Staphylococcus epidermidis        AE015929  
SubtiList    Bacillus subtilis                 AL009126  
ThermaList   Thermoanaerobacter tengcongensis  AE008691  
XylelList    Xylella fastidiosa                AE003849  
CunicuList   Encephalitozoon cuniculi (Microsporidia)
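
The "y" naming convention used in these databases can be checked mechanically. The following minimal sketch, in Python, separates gene names into those that carry a functional name and those still awaiting identification. The function names are hypothetical, the example gene names are purely illustrative, and the test is a naive prefix check that assumes no characterized gene has a name beginning with a lowercase "y".

```python
def is_uncharacterized(gene_name):
    """Naive check: a leading lowercase 'y' marks a gene not yet identified."""
    return gene_name.startswith("y")


def split_by_status(gene_names):
    """Return (named, pending) lists, preserving the input order."""
    named = [n for n in gene_names if not is_uncharacterized(n)]
    pending = [n for n in gene_names if is_uncharacterized(n)]
    return named, pending


if __name__ == "__main__":
    # Illustrative names only; not taken from any of the databases above.
    named, pending = split_by_status(["dnaA", "yaaA", "gyrB", "yabC"])
    print("named:", named)
    print("pending:", pending)
```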

Acknowledgements

These databases are implemented at the Hong Kong University Computer Centre (Dr Nam Ng, Director) by Frankie Cheung. This effort is sponsored by the Innovation and Technology Fund of the Government of the Hong Kong SAR, China (programme BIOSUPPORT), granted to A. Danchin for the creation of the HKU-Pasteur Research Centre, with the collaboration of N. Ng, Computer Centre. Initial development stages (4) of GenoList were performed in the framework of the European B. subtilis genome project (European Commission Biotechnology programme, contracts BIO2-CT93-0272, BIO2-CT94-2011, BIO4-CT96-0655), coordinated by F. Kunst and supported by the BACillus Industrial Platform (BACIP). We thank P. Glaser, A. Hénaut, C. Médigue, M. Pupin, and A. Viari for their contributions at various stages. We acknowledge the fruitful collaboration with A. Bairoch and the SWISS-PROT team.

REFERENCES

1. Ren,S.X., Fu,G., Jiang,X.G., Zeng,R.,
Miao,Y.G., Xu,H., Zhang,Y.X., Xiong,H.,
Lu,G., Lu,L.F., Jiang,H.Q., Jia,J., Tu,Y.F.,
Jiang,J.X., Gu,W.Y., Zhang,Y.Q., Cai,Z.,
Sheng,H.H., Yin,H.F., Zhang,Y., Zhu,G.F.,
Wan,M., Huang,H.L., Qian,Z., Wang,S.Y.,
Ma,W., Yao,Z.J., Shen,Y., Qiang,B.Q.,
Xia,Q.C., Guo,X.K., Danchin,A.,
Saint Girons,I., Somerville,R.L., Wen,Y.M.,
Shi,M.H., Chen,Z., Xu,J.G. and Zhao,G.P. (2003)
Unique physiological and pathogenic features
of Leptospira interrogans revealed by
whole-genome sequencing.
Nature, 422, 888-893.

2. Kunst,F., Ogasawara,N., Moszer,I.,
Albertini,A.M., Alloni,G., Azevedo,V.,
Bertero,M.G., Bessières,P., Bolotin,A.,
Borchert,S., et al. and A. Danchin (1997)
The complete genome sequence of the
Gram-positive bacterium Bacillus subtilis.
Nature, 390, 249-256.

3. Moszer,I. (1998) The complete genome of
Bacillus subtilis: from sequence annotation to
data management and analysis. FEBS Lett.,
430, 28-36.

4. Moszer,I., Glaser,P. and Danchin,A. (1995)
SubtiList: a relational database for the
Bacillus subtilis genome. Microbiology, 141,
261-268.