Génétique des Génomes Bactériens
Genetics of Bacterial Genomes

See also the history page of the Unit Genetics of Bacterial Genomes, see also the WWW site of the HKU-Pasteur Research Centre

Whole genome sequencing and annotation: in vivo and in silico genome analysis

In 1986, AD decided to explore the possibility of sequencing a whole bacterial genome to try to understand its basic principles of construction, and, above all, to explore the coupling between coordination of gene expression and the physical organisation of the genome. After a complex set of political events, impossible to summarise here (see Why sequence genomes? The Escherichia coli imbroglio and The Delphic Boat) he was eventually involved in the sequencing of a large segment of the Bacillus subtilis genome and in the scientific co-ordination of genome sequencing for this organism. This led him to try and organise bioinformatics in France with the help of several colleagues at Universities, CNRS and INRIA, through the creation of a nation-wide group, GDR 1029 (1991-1995) and subsequently through the coordination of the bioinformatics programme of the Groupement de Recherche et d'Etudes des Genomes (1992-1996), then at the Comité de Coordination des Sciences du Vivant (1998-2000). He is at present the director of the Department Genomes and Genetics at the Institut Pasteur.

The Delphic Boat or, what do genomes tell us

The oracle at Delphi had the habit of questioning passers-by. One of the questions told the following story. I have a boat made of wooden planks. As time elapses they rot one after the other. At some time no original plank still remains in the boat: is it the same boat? Clearly the owner will say, yes. And he will be right. The boat is not the matter of the boat, but something else, much more interesting, that orders the matter of the planks: it is the relationships between the planks.

In a very similar way the study of life should never be restricted to the study of objects, but must study their relationships. This is why genomes cannot be considered simply as collections of genes. They are much more. How can we have access to this? Considering the current flow of published genome sequences, two contrasting images emerge. At first sight, genes appear to be distributed randomly along the chromosome. In contrast, their organisation into operons (or pathogenicity islands) suggests that, at least locally, related functions are in physical proximity. In order to try to understand genome organisation, we must therefore explore the distribution of genes along the chromosome, but we should do this by generalising the concept of neighbourhood to many more types of vicinities than the mere succession of genes in the genomic text.

The first observations of the Units successively created by AD suggest that this order is far from being random, but is linked to the function of genes in relation to the cell's architecture. These results are fragmentary, so they must be experimentally validated. This ought to combine in silico analysis of the genome (bioinformatics) of model organisms, such as Escherichia coli or Bacillus subtilis, with their study in vivo (reverse genetics and physiological biochemistry, in particular using transcription expression profiling and two-dimensional protein electrophoresis), and comparative studies with other genomes, with biochemical and structural analyses. If indeed the map of the cell is in the chromosome, this calls for some physical principle linking the succession of the genes - a symbolic text - and the cell's architecture - concrete matter. If we do not claim a divine principle, this should be a simple physical principle. The winning trio of Darwinian natural selection (variation / selection / amplification) shows that evolution creates functions, that functions "capture" (recruit) structures (acquisitive evolution), so that structural analysis only becomes important when functions are understood.

The simplest way to evolve is to follow the arrow of time, to increase the overall entropy of the system. In water, this is indeed the driving force for the construction of many a biological structure: this is at the root of the universal formation of helices, this allows the folding of proteins and the formation of viral capsids. But it should not escape our attention that the largest increase in entropy of a molecular complex in water occurs when the surface-to-volume ratio is the highest: when a planar structure is formed it orders the water molecules on both its faces. As a consequence, if this plane meets another one, it will lose one layer of water molecules, and stick there. Formation of planar layers should therefore be a very strong organising principle. Is it possible to find out, just knowing the genomic text, whether a gene product will form such layers, whether it simply forms hexagons, for example? This is even more unlikely than that an amino acid sequence could tell us exactly the fold of a protein, without knowing pre-existing folds: pancreatic RNase would fold indeed, because selection isolated it for that (it is secreted in bile salts), but this would never be accepted as the paradigm of protein folding.

However, in silico analysis permits us to organise knowledge, and this might be a way to proceed in the future. In order to generate new knowledge, why not explore neighbourhoods of biological objects, considering genes as starting points, stressing that each object exists in relation to other objects? Inductive exploration will consist in finding all neighbours of each given gene. "Neighbour" has here the largest possible meaning. This is not simply a geometrical or structural notion. Each neighbourhood is meant to shed specific light on a gene, looking for its function as bringing together the objects of the neighbourhood. A natural neighbourhood is proximity on the chromosome: operons or pathogenicity islands show that neighbouring genes can be functionally related. Another interesting neighbourhood is similarity between genes or gene products. The isoelectric point often gives a first idea of the compartmentalisation of a gene product. Also, a gene may have been studied by scientists in laboratories all over the world, and it can display features that refer to other genes: its neighbours will be the genes found together with it in the literature. Finally, there exist more complex neighbourhoods, the study of which gives particularly revealing results: two genes may be neighbours because they use the genetic code in the same way. One can also study all genes that belong to the same neighbourhood in the cloud of points describing codon usage of all the genes of the organism.
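The codon usage neighbourhood mentioned above can be illustrated with a minimal sketch. It assumes that each gene is given as an in-frame coding sequence and that "using the genetic code in the same way" is approximated by a small Euclidean distance between codon frequency vectors. This distance and the helper names (`codon_usage`, `codon_neighbours`) are illustrative choices made here, not the method actually used by the Unit.

```python
from collections import Counter
from itertools import product
import math

# The 64 possible codons, in a fixed order, so every gene maps to a comparable vector.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]

def codon_usage(seq):
    """Relative frequency of each codon in an in-frame coding sequence."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)    # only count unambiguous codons
    return [counts[c] / total if total else 0.0 for c in CODONS]

def distance(u, v):
    """Euclidean distance between two codon usage vectors (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def codon_neighbours(genes, name, k=2):
    """Names of the k genes whose codon usage is closest to that of `name`."""
    usage = {g: codon_usage(s) for g, s in genes.items()}
    ranked = sorted((distance(usage[name], usage[g]), g) for g in genes if g != name)
    return [g for _, g in ranked[:k]]

# Toy sequences, invented for illustration only.
toy_genes = {
    "geneA": "GCTGCTGCTAAA",
    "geneB": "GCTGCTGCTAAG",
    "geneC": "GCGGCGGCGAAA",
}
nearest = codon_neighbours(toy_genes, "geneA", k=1)
```

With these toy sequences, geneA and geneB share the same alanine codon and therefore end up as codon usage neighbours, whereas geneC, which encodes the same amino acids with a different alanine codon, does not.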

From the methodological standpoint this requires construction of neighbourhood tables (conveniently available to scientists in databases: a field of choice for bioinformatics). Finally, systematic investigation of bibliography will identify literature neighbourhoods, not only using title and abstracts, but the whole content of articles: "in biblio" analysis is an essential component of inductive reasoning. We do not possess heuristics permitting direct access to unknown functions, and apart from preliminary studies there are not many places where such in silico work is developed. There exists however an excellent illustration of the concept of neighbourhood, the software Entrez, created by David Lipman and colleagues at the NCBI.

All this has some flavour of a once fashionable field, Artificial Intelligence, a highly contentious but fascinating domain! This should also make clear to us that in silico analysis will never replace validation in vivo and in vitro: let us hope that propagation of erroneous assignments of functions by automatic interpretation of the genomic texts will not hinder discoveries. Knowing genome sequences is a marvelous feat, but it is the starting point, not the end.

Main scientific achievements

Before singling out some of the contributions to general knowledge made by AD and his colleagues over more than thirty years of work, it is important to recall the last words of his book, The Delphic Boat, to try and prevent misunderstandings. AD is well aware that, in contrast to Art, Science should have no names: "In our time, now that the written word has lost the almost sacred status it had for so long, I wonder about the significance of this book. Should I have written it? Or indeed, what place is there for scientific writing? This is a minor work; what benefit can it bring? Choose a major work, such as the 1905 article in which Einstein explicitly sets out the foundations of the theory of relativity, and compare it, with its enormously destructive consequences – do you remember Nagasaki? – with Mozart’s Don Giovanni, with Mickiewicz’s love poetry, or Michelangelo’s Pietà. Ask yourself which you would destroy, if fate allowed you to save only three of these works. Science is, and remains, anonymous. Anyone might rediscover Einstein’s findings, one fine day. Science is anonymous, and this is probably why it is both feared and disdained. What ambition, what mad hope drove me to dedicate my days and nights to thinking through what is set down in black and white in this bundle of pages? Astonishment, no doubt. Astonishment at finding myself one of the billions of human beings taking part in the last doubling of the population of humanity, at a time when we can all look around us at this planet we have devastated. At having witnessed the destruction of all my hopes and all my beliefs, and yet finding myself still standing on this transient earth. At knowing that a day will come, not for me but for others, when a new form of creation will replace what we have destroyed, only to disappear in its turn into the night of Oblivion.
I wanted to leave some signs of this new hope of an impossible future, to help those mysterious Others who will one day give it birth, to help them find their way, or rather to know that the way exists, without me."

All the research developed by AD is centred around one unique question: is it possible to uncover rules that would account for the fact that genes function as a whole in the cell and contribute to its consistent and reproducible development? When one tries to isolate some of the important trends of AD's past research, one produces a picture that culminates in what can be considered as "symplectic biology", a biology where the relationships between objects are of more conceptual importance than the objects themselves. As this becomes understood, the idea gains ground that it will be possible to reconstruct life, and even to construct material objects endowed with living properties, based on building blocks that differ from those which exist in present-day living organisms. Synthetic Biology is no longer a dream, it is in the process of becoming a novel achievement.

To answer this very general question, AD initially devised a genetic selection and screening procedure in the model bacterium Escherichia coli, meant to isolate mutants that would orient his future experiments along a rewarding track. The idea was to explore whether some signals which appear to us as redundant (i.e. look somewhat "useless" to the unprepared human mind) in macromolecular syntheses could be separated (i.e. by selecting mutants that would grow with only one active signal instead of several). The underlying hypothesis was that there exists some "secondary punctuation" in the expression of the genetic message allowing coupling between macromolecular syntheses and the bulk metabolism of the cell. Emphasis on this linguistic analogy came from his contribution to the reflection on the role of selective processes at the root of memory and learning. The study of the process of initiation of translation, which, in Bacteria, associates two independent signals (a metabolic signal which labels the first methionine of the nascent polypeptide with a one-carbon residue, and the structure of a special transfer RNA) led him, through experiments using genetics, to the discovery of a ubiquitous anomaly in metabolism, coupling replication, transcription, translation and cell division. The mutants affected in that process were analysed in succession. They involved transcription termination, translation initiation, the “stringent” coupling between these processes and ppGpp biosynthesis, the one-carbon metabolism, synthesis of cyclic AMP, a protein long proposed to be a bacterial histone, H-NS, and the biosynthesis pathway of branched-chain amino acids. This apparently haphazard list, derived from the outcome of genetic experiments, accounts for the threads followed, one by one, to attempt to unravel this complicated network of interactions, finally understood in January 2006 as a chemical effect derived from serine.

From the mid-1980s on, AD considered that the time was ripe to explore this same question not through the study of individual genes, but through a global study of the genes based on knowledge of the complete genome texts. On this occasion, AD introduced the concept of experiments “in silico” as complementing in vivo or in vitro experiments (this term was used for the first time in 1988-1989, in his discussions with the European Commission, meant to justify the setting up of genome projects). The question then became a simple conjecture, based on an earlier reflection by von Neumann about Turing machines: is there a link between the architecture of the cell and that of the genome? The most recent work from the Unit shows that genes are indeed not randomly distributed in genomes. Whether this indicates a link with the architecture of the cell remains, of course, an open question. The work currently under way in the laboratory he heads tries to identify the biochemical and physical constraints which lead to the selective stabilisation of the genome organisation thus uncovered.

• Discovery of toxic adenylyl cyclases (adenylate cyclases) (whooping cough and anthrax), discovery and molecular characterisation of four independent classes of adenylyl cyclases (evolutionary convergence), 1988-1998

The involvement of cyclic AMP in the "serine effect" (wild type strains are sensitive to serine, but cya and crp mutants are more resistant) led AD to develop with his colleagues a thorough study of adenylyl cyclases, in terms of both genetics and biochemistry. After the laboratory had been the first to isolate and fully characterise the gene of an adenylyl cyclase (that of Escherichia coli), the work was extended to the identification of adenylyl cyclase toxins, present in the etiologic agents of whooping cough and anthrax. Using a multipartner cloning technique that they invented, which is the ancestor of the technique now known as “double hybrid”, the genes of the corresponding toxins were isolated and sequenced, the proteins were analysed biochemically, and the secretion process of the cyclases was characterised:

P Glaser, D Ladant, O Sezer, F Pichot, A Ullmann, A Danchin
The calmodulin-sensitive adenylate cyclase of Bordetella pertussis: cloning and expression in Escherichia coli
Mol Microbiol (1988) 2: 19-30
 

P Glaser, H Sakamoto, J Bellalou, A Ullmann, A Danchin
Secretion of cyclolysin, the calmodulin-sensitive adenylate cyclase-haemolysin bifunctional protein of Bordetella pertussis
EMBO J (1988) 7: 3997-4004 

A symmetrical approach was used to clone the cDNA of mammalian calmodulins, showing that the method (double hybrid) is widely applicable:

A Danchin, O Sezer, P Glaser, P Chalon, D Caput
Cloning and expression of mouse-brain calmodulin as an activator of Bordetella pertussis adenylate cyclase in Escherichia coli
Gene (1989) 80: 145-149 

As early as 1988, this work raised a series of ethical problems (recently revived under the name of “bioterrorism”), discussed in:

A Danchin
Doute et création
In: "La Responsabilité, la condition de notre humanité"
Autrement (1994) 14:249-266

A Danchin
Not every truth is good. The dangers of publishing knowledge about potential bioweapons
EMBO Rep (2002) 3: 102-104

An overview of this first work on adenylyl cyclases is summarised in:

A Danchin
Phylogeny of adenylate cyclases
Adv Second Messenger Phosphoprotein Res (1993) 27: 109-162 

This article established the international reference for the classification of adenylyl cyclases. Initially, three classes of different phylogenetic descent (convergent evolution) were identified: Class I, cyclases from enterobacteria and related bacteria; Class II, secreted toxic cyclases; Class III, the "universal" class present in Bacteria and in Eukarya (including higher vertebrates). A fourth class, also of a completely different phylogenetic origin, was discovered several years later in the Unit:

O Sismeiro, P Trotot, F Biville, C Vivarès, A Danchin
Aeromonas hydrophila adenylyl cyclase 2: a new class of adenylyl cyclases with thermophilic properties and sequence similarities to proteins from hyperthermophilic archaebacteria
J Bacteriol (1998) 180: 3339-3344

The "universal" cyclase class (class III) clusters together adenylyl and guanylyl cyclases, and an original selection procedure allows one to go from one type of specificity to the other (this was one of the very first experiments showing that it is possible to change the specificity of an enzyme for its substrate):

A Beuve, A Danchin
From adenylate cyclase to guanylate cyclase. Mutational analysis of a change in substrate specificity
J Mol Biol (1992) 225: 933-938 

• Discovery of the unexpectedly large extent of horizontal gene transfer (HGT) in bacteria, 1991-1999

Genome studies required the creation of a global in silico analysis of the genome texts. A first analysis of 800 genes from E. coli allowed their clustering into three major classes: core metabolism, genes expressed at a high level under rapid growth, and genes coming from outside…

C Médigue, T Rouxel, P Vigier, A Hénaut, A Danchin
Evidence for horizontal gene transfer in Escherichia coli speciation
J Mol Biol (1991) 222: 851-856 

This very early work of genomics in silico demonstrated for the first time that a large fraction (at least one sixth) of the genes of E. coli are derived from horizontal gene transfer. It also shows that antimutator genes are likely to be propagated by horizontal gene transfer, suggesting that bacteria in the environment are often in a highly mutable state, which is fixed in a much more rigid (invariable) form when they meet a stable biotope. Another observation from this study is the clustering of HGT genes in relation with particular cell processes, suggesting that genomes are organised entities:

P Guerdoux-Jamet, A Hénaut, P Nitschké, JL Risler, A Danchin
Using codon usage to predict genes origin: is the Escherichia coli outer membrane a patchwork of products from different genomes?
DNA Research (1997) 4: 257-265  

The fact that this observation is general would be demonstrated later on, in the case of Bacillus subtilis. The importance of HGT is so well accepted nowadays that it has become common knowledge in biology:

I Moszer, EPC Rocha, A Danchin
Codon usage and lateral gene transfer in Bacillus subtilis
Curr Opin Microbiol (1999) 2: 524-528 
 

• Discovery of the massive presence of genes of unknown function in genomes, 1991, and first sequencing and annotation of the genome of a Firmicute, 1997

The sequencing of the genome of Bacillus subtilis, the first project of this type launched for conceptual and not technological reasons, was publicly proposed by AD at the beginning of 1987. This resulted, in parallel with the same result obtained by the consortium sequencing the genome of Saccharomyces cerevisiae, in the first significant discovery of genomics, namely that many genes were completely unknown, not only in their sequence but also in their function and in the structure of their product:

P Glaser, F Kunst, M Arnaud, M-P Coudart, W Gonzales, M-F Hullo, M Ionescu, B Lubochinsky, L Marcelino, I Moszer, E Presecan, M Santana, E Schneider, J Schweizer, A Vertes, G Rapoport, A Danchin
Bacillus subtilis genome project: cloning and sequencing of the 97 kb region from 325° to 333°
Mol Microbiol (1993) 10: 371-384  [it is amusing to note that this article is listed at PubMed with a truncated authors' list: biologists were not, at the time, familiar with the long lists of authors that are frequent in physics]

This article shows, for the first time, that in a long DNA fragment sequenced in full, half of the genes did not look like anything known until then. This utterly unexpected result (opponents of genome sequencing projects had "demonstrated" that we knew at least 95% of all possible gene classes and published this demonstration in the most fashionable journals), presented together with a similar conclusion from the sequencing of yeast chromosome III at the first genomics symposium organised by the Commission of the European Communities in Elounda, Crete, in 1991, was the first major discovery obtained by genome projects.

Performed by a consortium associating Europe and Japan, the sequencing of the B. subtilis genome was completed in 1997, at the same time as that of E. coli. As early as 1995 the total length of continuous fragments from the organism was significantly larger than that of the genomes then sequenced by Craig Venter and his colleagues. This was not much noticed, however: Science has now become an activity in the domain of show business. Nevertheless, this genome remained for five years the only example in its domain (the genomes of the Firmicutes are particularly difficult to sequence, because their DNA is usually toxic in the universal host used to construct DNA libraries, E. coli, for biochemical reasons well understood by the authors of this project):

F Kunst, N Ogasawara, I Moszer, AM Albertini, G Alloni, V Azevedo, MG Bertero, P Bessières, A Bolotin, S Borchert, R Borriss, L Boursier, A Brans, M Braun, SC Brignell, S Bron, S Brouillet, CV Bruschi, B Caldwell, V Capuano, NM Carter, SK Choi, JJ Codani, IF Connerton, NJ Cummings, RA Daniel, F Denizot, KM Devine, A Düsterhöft, SD Ehrlich, PT Emmerson, KD Entian, J Errington, C Fabret, E Ferrari, D Foulger, C Fritz, M Fujita, Y Fujita, S Fuma, A Galizzi, N Galleron, SY Ghim, P Glaser, A Goffeau, EJ Golightly, G Grandi, G Guiseppi, BJ Guy, K Haga, J Haiech, CR Harwood, A Hénaut, H Hilbert, S Holsappel, S Hosono, MF Hullo, M Itaya, L Jones, B Joris, D Karamata, Y Kasahara, M Klaerr-Blanchard, C Klein, Y Kobayashi, P Koetter, G Koningstein, S Krogh, M Kumano, K Kurita, A Lapidus, S Lardinois, J Lauber, V Lazarevic, SM Lee, A Levine, H Liu, S Masuda, C Mauël, C Médigue, N Medina, RP Mellado, M Mizuno, D Moesti, S Nakai, M Noback, D Noone, M O'Reilly, K Ogawa, A Ogiwara, B Oudega, SH Park, V Parro, TM Pohl, D Portetelle, S Porwollik, AM Prescott, E Presecan, P Pujic, B Purnelle, G Rapoport, M Rey, S Reynolds, M Rieger, C Rivolta, E Rocha, B Roche, M Rose, Y Sadaie, T Sato, E Scalan, S Schleich, R Schroeter, F Scoffone, J Sekiguchi, A Sekowska, SJ Seror, P Serror, BS Shin, B Soldo, A Sorokin, E Tacconi, T Takagi, H Takahashi, K Takemaru, M Takeuchi, A Tamakoshi, T Tanaka, P Terpstra, A Tognoni, V Tosato, S Uchiyama, M Vandenbol, F Vannier, A Vassarotti, A Viari, R Wambutt, E Wedler, T Weitzenegger, P Winters, A Wipat, H Yamamoto, K Yamane, K Yasumoto, K Yata, K Yoshida, HF Yoshikawa, E Zumstein, H Yoshikawa, A Danchin
The complete genome sequence of the gram-positive bacterium Bacillus subtilis
Nature (1997) 390: 249-256 
  

As an anecdote, it is amusing to remark that the length of this genome, 4 megabases, represented more than the total length of what The Institute for Genomic Research (TIGR), with its well-chosen name, had already sequenced. It was also, with the genome of E. coli which has a comparable length, the longest sequence of a known DNA fragment until that date.

The distribution of the corresponding sequence and annotations to the international community was coordinated by AD, in the form of a specialised database with no exact counterpart until now:

C Médigue, A Viari, A Hénaut, A Danchin
Colibri: a functional data base for the Escherichia coli genome
Microbiol Rev (1993) 57: 623-654 
 

I Moszer, P Glaser, A Danchin
SubtiList: a relational database for the Bacillus subtilis genome
Microbiology (1995) 141 (Pt 2): 261-268

I Moszer, LM Jones, S Moreira, C Fabry, A Danchin
SubtiList: the reference database for the Bacillus subtilis genome
Nucleic Acids Res (2002) 30: 62-65 

Later on, AD participated in or organised several genome projects: Leptospira interrogans and Staphylococcus epidermidis, in collaboration with the Shanghai Genome Center; Photorhabdus luminescens, at the Institut Pasteur; and more recently, to try and understand the impact of temperature constraints on genomes, the genome of the Antarctic bacterium Pseudoalteromonas haloplanktis TAC125, in collaboration with the Genoscope and several universities around the world. Within a few years, technological progress both in vitro and in silico has been so extraordinary that this last project required, in terms of workforce, one hundred times fewer person-years than that of B. subtilis:

SX Ren, G Fu, XG Jiang, R Zeng, YG Miao, H Xu, YX Zhang, H Xiong, G Lu, LF Lu, HQ Jiang, J Jia, YF Tu, JX Jiang, WY Gu, YQ Zhang, Z Cai, HH Sheng, HF Yin, Y Zhang, GF Zhu, M Wan, HL Huang, Z Qian, SY Wang, W Ma, ZJ Yao, Y Shen, BQ Qiang, QC Xia, XK Guo, A Danchin, I Saint Girons, RL Somerville, YM Wen, MH Shi, Z Chen, JG Xu, GP Zhao
Unique physiological and pathogenic features of Leptospira interrogans revealed by whole-genome sequencing
Nature (2003) 422: 888-893 

 

YQ Zhang, SX Ren, HL Li, YX Wang, G Fu, J Yang, ZQ Qin, YG Miao, WY Wang, RS Chen, Y Shen, Z Chen, ZH Yuan, GP Zhao, D Qu, A Danchin, YM Wen
Genome-based analysis of virulence genes in a non-biofilm-forming Staphylococcus epidermidis strain (ATCC 12228)
Mol Microbiol (2003) 49: 1577-1593   

E Duchaud, C Rusniok, L Frangeul, C Buchrieser, A Givaudan, S Taourit, S Bocs, C Boursaux-Eude, M Chandler, JF Charles, E Dassa, R Derose, S Derzelle, G Freyssinet, S Gaudriault, C Médigue, A Lanois, K Powell, P Siguier, R Vincent, V Wingate, M Zouine, P Glaser, N Boemare, A Danchin, F Kunst
The genome sequence of the entomopathogenic bacterium Photorhabdus luminescens
Nature Biotechnol (2003) 21: 1307-1313 

C Médigue, E Krin, G Pascal, V Barbe, A Bernsel, PN Bertin, F Cheung, S Cruveiller, S D'Amico, A Duilio, G Fang, G Feller, C Ho, S Mangenot, G Marino, J Nilsson, E Parrilli, EPC Rocha, Z Rouy, A Sekowska, ML Tutino, D Vallenet, G von Heijne, A Danchin
Coping with cold: the genome of the versatile marine Antarctica bacterium Pseudoalteromonas haloplanktis TAC125
Genome Res (2005) 15: 1325-1335  

The corresponding data (sequence and annotations) are organised, together with their counterparts from the genomes of bacteria of medical or environmental interest, at the University of Hong Kong:

G Fang, C Ho, YW Qiu, V Cubas, Z Yu, C Cabau, F Cheung, I Moszer, A Danchin
Specialized microbial databases for inductive exploration of microbial genome sequences 
BMC Genomics (2005) 6: 14  

 

• Discovery of the first laws of bacterial genome organisation, 1999-2005

This is the very core of the work developed by AD for some twenty years: can one uncover rules in the organisation of genomes? Several laws have been discovered: first, there is a universal bias in the composition of the genes present in the leading and the lagging strand of DNA; second, and this is quite remarkable, the essential genes (experimentally identified after the sequencing project of B. subtilis) are specifically coded in the leading DNA strand:

EPC Rocha, A Danchin, A Viari
Universal replication biases in bacteria
Mol Microbiol (1999) 32: 11-16 
 

A Danchin, P Guerdoux-Jamet, I Moszer, P Nitschké
Mapping the bacterial cell architecture into the chromosome
Philos Trans R Soc Lond B Biol Sci (2000) 355: 179-190  

EPC Rocha, A Danchin
Ongoing evolution of strand composition in bacterial genomes
Mol Biol Evol (2001) 18: 1789-1799

EPC Rocha, A Danchin
Essentiality, not expressiveness, drives gene-strand bias in bacteria
Nature Genetics (2003) 34: 377-378

EPC Rocha, A Danchin
Gene essentiality determines chromosome organisation in bacteria
Nucleic Acids Res (2003) 31: 6570-6577
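As a rough illustration of what a strand composition bias looks like in practice, the sketch below computes the GC skew, (G - C) / (G + C), over successive windows of a sequence. GC skew is a commonly used measure that tends to change sign between the leading and lagging strands. It is used here as an assumed stand-in and is not necessarily the statistic used in the articles listed above. The function names are hypothetical.

```python
def gc_skew(seq):
    """GC skew of a sequence: (G - C) / (G + C), or 0.0 if it has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0

def windowed_gc_skew(seq, window, step=None):
    """GC skew in successive windows; windows do not overlap unless step < window."""
    step = step or window
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]

# Toy sequence, invented for illustration: G-rich first half, C-rich second half.
profile = windowed_gc_skew("GGGGCCCC", window=4)
```

On a real chromosome one would use windows of several kilobases and look for the positions where the sign of the skew switches.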

Considering genomes as wholes, it had been known for more than a decade that there exists a periodicity of 10-11.5 bp in the nucleotide distribution, and this is true from prokaryotes to eukaryotes. This bias is present throughout a given genome, both in coding and non-coding sequences. Using a technique for analysis of auto-correlations based on linear projection, the sequences responsible for the bias were identified. These ubiquitous patterns were termed "class A flexible patterns". Each pattern is composed of up to ten conserved nucleotides or dinucleotides distributed into a discontinuous motif. Each occurrence spans a region up to 50 bp in length. There is some limited fluctuation in the distances between the nucleotides composing each occurrence of a given pattern, suggesting that they are constrained by DNA supercoiling and/or bending. When taken together, these patterns cover up to half of the genome in the majority of prokaryotes. They generate the previously recognised 11 bp periodic bias. Judging from the structure of the patterns, it was suggested that they may define a dense network of protein interaction sites in chromosomes:

E Larsabal, A Danchin
Genomes are covered with ubiquitous 11bp periodic patterns, the "class A flexible patterns"
BMC Bioinformatics (2005) 6: 206  
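The idea of detecting a periodic bias through auto-correlation can be sketched very simply. The code below counts, for each distance d, how often a short motif occurs at two positions separated by d, and reports the distance with the highest count. It is a plain, unnormalised auto-correlation written for illustration. It does not implement the linear projection technique of the article above, and the motif "AA" and the function names are assumptions made here.

```python
def pair_autocorrelation(seq, motif="AA", max_lag=30):
    """For each lag d, count positions i where `motif` starts at both i and i + d."""
    seq = seq.upper()
    m = len(motif)
    # hits[i] is 1 when the motif starts at position i, else 0
    hits = [1 if seq[i:i + m] == motif else 0 for i in range(len(seq) - m + 1)]
    return {d: sum(hits[i] * hits[i + d] for i in range(len(hits) - d))
            for d in range(1, max_lag + 1)}

def dominant_lag(seq, motif="AA", min_lag=2, max_lag=30):
    """Lag in [min_lag, max_lag] with the highest co-occurrence count."""
    ac = pair_autocorrelation(seq, motif, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda d: ac[d])

# Synthetic sequence, invented for illustration: "AA" repeated every 11 bp.
synthetic = ("AA" + "CGTCGTCGT") * 20
period = dominant_lag(synthetic)
```

Because the counts are not normalised by the number of available pairs, shorter lags are slightly favoured. On real genomic data one would compare the counts with those expected from the base composition alone.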

The corresponding constraints are visible in the amino acid sequence of the proteins, suggesting that the sequence is more constrained by the genome organisation than by the protein function. These novel observations have considerable implications in terms of phylogenetic profiles when one analyses protein sequences:

EPC Rocha, A Danchin
Base composition bias might result from competition for metabolic resources
Trends Genet (2002) 18: 291-294

G Pascal, C Médigue, A Danchin
Universal biases in protein composition of model prokaryotes
Proteins (2005) 60: 27-35

This latter work characterises “orphan” proteins, which account for approximately 10% of the genome of any new species. These proteins are characterised by their enrichment in aromatic amino acids. This work proposes that many among them represent the "self" of the species, by behaving as “gluons” that make an extra contribution to the stability of multiprotein complexes in the cell. This would bring an essential contribution to the functional stabilisation of complex intracellular structures. More generally, the approach thus defined allowed the investigators to define the essentiality of a gene in a real context, by measuring its persistence in many species, not only in sequence but also in its place in the genome:

G Fang, EPC Rocha, A Danchin
How essential are non-essential genes?
Mol Biol Evol (2005) 22: 2147-2156   

In summary, it appears that bacterial genomes are highly organised entities, contrary to a widely spread idea of a random "fluidity" of genomes. What are the selective constraints that support this organisation?

A general analysis of the conservation of syntenies in a large number of complete bacterial genomes has shown that two classes of genes tend to stay together. The class of persistent genes remains grouped in a way that is reminiscent of a scenario of the origin of life. This is why the corresponding set has been named the paleome. In the same way, genes that are rarely found in genomes make clusters that are easily horizontally transferred. The corresponding genes allow the bacteria to live in a specific niche. They are named, for this reason, the cenome (to indicate the fact that they are shared by a community living in a particular environment, and prone to be transferred):

A Danchin
Archives or palimpsests? Bacterial genomes unveil a scenario for the origin of life
Biological Theory (2007) 2: 52-61

A Danchin, G Fang, S Noria
The extant core bacterial proteome is an archive of the origin of life
Proteomics (2007) 7: 875-889

• Identification of the physico-chemical and metabolic principles responsible for selective stabilisation of bacterial genome structure

The functional organisation of the genes in genomes must result from the selection pressure of simple physico-chemical principles. Besides physical causes such as the structure of water (the study of the genome of P. haloplanktis is meant to give access to some of these), AD made the simple hypothesis that gases and radicals, because they are highly diffusible, may play a major role in cellular compartmentalisation, and might be the cause of some of the organisation of the genes in genomes. Sulfur metabolism is particularly sensitive to gases and radicals, and it is therefore important to understand how it is organised. A first study demonstrated that sulfur-related genes are organised into islands:

EPC Rocha, A Sekowska, A Danchin
Sulphur islands in the Escherichia coli genome: markers of the cell's architecture?
FEBS Lett (2000) 476: 8-11 
 

and a detailed analysis, mainly developed during the creation of the HKU-Pasteur Research Centre in Hong Kong, made it possible to uncover the details of the “methionine salvage pathway”:

HKU_Pasteur
A Sekowska, HF Kung, A Danchin
Sulfur metabolism in Escherichia coli and related bacteria: facts and fiction
J Mol Microbiol Biotechnol (2000) 2: 145-177  

A Sekowska, JY Coppée, JP Le Caer, I Martin-Verstraete, A Danchin
S-adenosylmethionine decarboxylase of Bacillus subtilis is closely related to archaebacterial counterparts
Mol Microbiol (2000) 36: 1135-1147 
 

HKU_Pasteur
A Sekowska, L Mulard, S Krogh, JK Tse, A Danchin
MtnK, methylthioribose kinase, is a starvation-induced protein in Bacillus subtilis
BMC Microbiol (2001) 1: 15  
HKU_Pasteur
A Sekowska, S Robin, JJ Daudin, A Hénaut, A Danchin
Extracting biological information from DNA arrays: an unexpected link between arginine and methionine metabolism in Bacillus subtilis
Genome Biol (2001) 2: RESEARCH0019  
HKU_Pasteur
A Sekowska, A Danchin
The methionine salvage pathway in Bacillus subtilis
BMC Microbiol (2002) 2:  

The following work synthesises the catalytic activities involved in this ubiquitous cycle (it is also present in humans and plants), which has the interesting feature that it systematically recruited proteins of diverse structures to complete the cycle. One of these proteins is likely to be related to the ancestor of ribulose-bisphosphate carboxylase/oxygenase (RuBisCO), the most abundant enzyme on the planet (this opens fascinating questions on the origin of catalytic activities):

HKU_Pasteur
A Sekowska, V Dénervaud, H Ashida, K Michoud, D Haas, A Yokota, A Danchin
Bacterial variations on the methionine salvage pathway
BMC Microbiol (2004) 4: 9  

H Ashida, A Danchin, A Yokota
Was photosynthetic RuBisCO recruited by acquisitive evolution from RuBisCO-like proteins involved in sulfur metabolism?
Res Microbiol (2005) 156: 611-618  

As shown in this work, this remarkable metabolic cycle has the surprising property that, under particular conditions, it leads the cell to synthesise carbon monoxide. As this cycle exists in humans, this opens interesting perspectives on possible controls mediated by CO, a gas different from nitric oxide, in the immune system and in the nervous system.

• Selective stabilisation and epigenesis

This work explored the role of selective stabilisation in learning and memory in the nervous system and in the immune system, opening concepts for later work in genomics.

JP Changeux, P Courrège, A Danchin
A theory of the epigenesis of neuronal networks by selective stabilization of synapses
Proc Natl Acad Sci U S A (1973) 70: 2974-2978  

A Danchin, JP Changeux
Apprendre par stabilisation sélective de synapses en développement
In: "L'Unité de l'Homme" (Centre Royaumont pour une Science de l'Homme) Le Seuil (1974): 320-350

JP Changeux, A Danchin
Selective stabilisation of developing synapses as a mechanism for the specification of neuronal networks
Nature (1976) 264: 705-712 

A Danchin
A selective theory for the epigenetic specification of the monospecific antibody production in single cell lines
Ann Immunol (Paris) (1976) 127: 787-804 

A Danchin
Stabilisation fonctionnelle et épigenèse: une approche biologique de la genèse de l'identité individuelle
In: "L'Identité" (JM Benoist, ed) Grasset (1977): 185-221

A Danchin
The specification of the immune response: a general selective model
Mol Immunol (1979) 16: 515-526 

JP Changeux, P Courrège, A Danchin, JM Lasry
Un mécanisme biochimique pour l'épigenèse de la jonction neuro-musculaire
C R Séances Acad Sci III (1981) 292: 449-453 

The question now is to try to understand how the future of daughter cells is organised at the time of cell division, and what the main selective stabilisation processes at work are...

 

 

AD has published more than 500 articles (300 referenced at PubMed, and 365 at the ISI) and four books in Molecular Biology and Genetics:

Ordre et Dynamique du Vivant. Chemins de la Biologie Moléculaire - Le Seuil, 1978.

L'Oeuf et la Poule. Histoires du code génétique - Fayard, 1983.
In Portuguese: O Ovo e a Galina. Historias do Codigo genetico - Relogio d'agua, 1993.
Also in Japanese

and more recently on the Origin of Life:

Une Aurore de Pierres. Aux origines de la vie - Le Seuil, 1990.
In Portuguese: Uma Aurora de Pedras. Nas origens da vida - Almedina, 1992.

A book on the revolution of Genomics was published in 1998 by Odile Jacob:

La Barque de Delphes. Ce que révèle le texte des génomes - Odile Jacob, 1998.

It was updated and translated into English in 2003:

The Delphic Boat. What genomes tell us - Harvard University Press, 2003 [Comments in Nature, Nature Genetics, EMBO Reports]



© 2000-2007