Rolf LANDAUER
From deciphering genomes to synthetic biology: cells are information-gathering machines
Before summarizing the contribution to general knowledge of the work I have developed over several decades, let me recall the last word of The Delphic Boat, to try and avoid misunderstandings: I am well aware that Science, unlike Art, should not have names. This brief account is intended to explain the « style » of a scientist's work, not to promote a narcissistic view. For although Science is universal, a style remains idiosyncratic.
The central question I have been exploring over the last few decades is this. Is there a general principle that explains why biological chemistry seems to be somehow 'animated'? That led to the second question. Is it possible to discover rules that explain how genes work as a whole in the cell and contribute to its consistent and reproducible development? If we isolate some of the important trends in this research, we get a picture that culminates in what can be called « symplectic biology », a biology in which the relations between objects have a greater conceptual importance than the objects themselves (see a view independently proposed by Murray Gell-Mann and archived at this site ***). This means that the material embodiment of abstraction is essential for understanding what life is. A critical consequence of this constraint is that, because the atoms of life have intrinsic properties reflected in the Mendeleev table, which have nothing to do with the abstract world to which they are related in living things, many features of life will look like anecdotes. As a result, life forms are very diverse. This makes it quite difficult to identify any possible underlying law. Once this fact is understood, the idea that it will be possible to reconstruct life, and even to construct material objects with living properties, using building blocks different from those in existing living organisms, will gain ground. Synthetic biology is no longer a dream, but an unprecedented achievement. This makes it all the more important to identify what makes life so special.
Main exploration tracks (from present back to 1968)
Understanding what life is has been the main quest of philosophers, especially since the time of the Presocratic philosophers. In his Lives of Illustrious Men, Plutarch describes the return of Theseus—whose relationship with the temple of Apollo at Delphi is well known, hence my Delphic Boat—from Crete to Athens, and the fate of his ship, which was built by the Athenians. To keep the ship running, the Athenians kept replacing the rotting planks with new ones. Philosophers, later, used this example to discuss permanence and change, some claiming that the ship was no longer the same, others saying the opposite: "The ship wherein Theseus and the youth of Athens returned had thirty oars, and was preserved by the Athenians down even to the time of Demetrius Phalereus, for they took away the old planks as they decayed, putting in new and stronger timber in their place, insomuch that this ship became a standing example among the philosophers, for the logical question of things that grow; one side holding that the ship remained the same, and the other contending that it was not the same." (translated by John Dryden). Following the trend set by this profound question, the study of life must never be limited to the study of things, but must study their relationships. This is an abstract entity—the ship is able to float, whether it is made of oak, pine or even metal planks—, which is represented by the physical currency we usually call information. How is this category of the physical reality dealt with in biological things?
Living organisms produce young offspring. However, their offspring come from parents who are already old. This implies that the parents have somehow recruited or created some kind of new information that restores youth. Information is an essential currency of our world, and it is physical. In 1961, Rolf Landauer established that computation is reversible, with the consequence that the creation of new information does not require the dissipation of energy. However, resetting the process so that it can create information again requires erasing the memory of past events, and this is energy intensive. Charles Bennett illustrated reversible computing in 1988 by showing how to construct a simple arithmetic operation, division, in a reversible way. He showed that the result of a division is obtained by erasing the intermediate states, leaving the remainder of the division as the prominent and 'valuable' result of the computation. In this process, erasing memory dissipates energy.
A question remains, however. In his description, Bennett did not explain how the remainder of the division could be distinguished from the bits left over from the calculation process. One had to find some way to erase them without touching the remainder. To perform this task, one needs some kind of additional (contextual) information: I have suggested that this is where the need for energy dissipation comes in. Indeed, an agent must separate the information that forms the result of the operation (the remainder) from the information generated to achieve that result. Energy is dissipated to prevent the remainder of the division from being erased, while the remaining memory is erased. This returns the remaining memory to a state that can be used for other calculations. How does this process work? Here is an attempt to understand this process within cells, having explicitly identified agents that embody the two main steps of the Landauer principle:
• An information-laden (or 'tense') step, associated with the loading of an energy source, without energy dissipation, and providing a quantum of information (typically the selection of a specific molecule, in an environment containing many related molecules). In the case of enzymes, this typically takes the form of a functional step triggered by binding to a non-hydrolyzable ATP analogue (APPNP or related molecules).
• A reset step in which energy is dissipated (usually the hydrolysis of ATP or other nucleoside triphosphate to ADP and phosphate), so as to return the system to its ground state and allow the process to start again.
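As a rough numerical illustration of the reset step, the sketch below computes the Landauer bound, k_B T ln 2, which is the minimum energy dissipated when one bit of memory is erased at temperature T. The formula is the standard statement of Landauer's principle rather than something spelled out in the text above, and the function name and the temperature of 310 K are assumptions chosen for illustration only.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def landauer_bound_joules(temperature_kelvin: float, bits: float = 1.0) -> float:
    """Minimum energy (in joules) dissipated when erasing `bits` bits at temperature T."""
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive")
    return bits * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # 310 K is an assumed, illustrative temperature (about 37 degrees Celsius).
    print(f"{landauer_bound_joules(310.0):.3e} J per erased bit")
```

The bound is tiny on the scale of a single chemical reaction, which is consistent with the idea that one dissipative step can pay for the reset of a molecular agent; the sketch makes no claim about how close real enzymes come to this limit.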
Among these functions, we have highlighted the presence of ubiquitous but neglected functional categories. Cells are prone to accidents, errors and ageing. The first key function is the separation between clean and altered entities. This process requires the management of information. For example, proteins age, sometimes very rapidly. The cell has to identify old proteins and get rid of them without destroying the young ones. These overlooked functions—which must be identified among the critical functions encoded in all genomes—play a role similar to that of Maxwell’s Demons (MxDs). They can generate classes of material things among similar ones, identify a particular position in a 3D structure, or a particular time in a smooth sequence of events. About fifty such functions have been identified in the minimal set required for an autonomous life. A dozen of these are used to control the correct folding and assembly of the reading head of the genetic message, the ribosome. In fact, this nanomachine is based on the spontaneous folding of a very long RNA by water, and since the number of incorrect conformations is very large, it requires agents capable of retaining only those that are ultimately functional, discarding or refolding the others. There are also other functions that repair broken DNA molecules, calibrate the supercoiling of the double helix or export toxic components from the cell while preserving essential ones, etc.
Before this hypothesis about how animation of life emerged, I wrote a series of articles based on the genome sequencing programs I set up in 1987. These articles show how the hypothesis was formed.
A Danchin Information of the chassis and information of the program in synthetic cells Syst Synth Biol. (2009) 12: 210-242 doi: 10.1007/s11693-009-9036-5
PM Binder, A Danchin Life's demons: information and order in biology. What subcellular machines gather and process the information necessary to sustain life? EMBO Reports (2011) 12: 495-499. In his seventeenth-century classic, Novum Organum, Francis Bacon wrote, “we cannot command nature except by obeying her” (Bacon, 2010). Although our knowledge of living systems is much improved since Bacon’s time, we are still far from understanding, or commanding, all the complex mechanisms of life. To take full advantage of living organisms for the benefit of mankind, we will need to understand those mechanisms to the furthest possible extent. To do so will require that the concept of information and the theories of information science take a more-prominent role in the understanding of living systems...
A Danchin, G Fang Unknown unknowns: essential genes in quest for function Microb Biotechnol. (2016) 9: 530-540 doi: 10.1111/1751-7915.12384
G Boël, O Danot, V de Lorenzo, A Danchin Omnipresent Maxwell’s demons orchestrate information management in living cells Microb Biotechnol. (2019) 12: 210-242 doi: 10.1111/1751-7915.13378
Genomes are not just collections of genes. They are more than that. How can we find this information? We can decipher the sequence of model genomes. To this end, I set up the sequencing of the Bacillus subtilis genome. Launched in spring 1987, this was the first project of its type undertaken for conceptual rather than technological reasons. In 1991, in parallel with the same result obtained by the consortium sequencing the genome of Saccharomyces cerevisiae, we discovered that many genes were completely unknown at the time, not only in their sequence but also in their function and in the structure of their product:
P Glaser, F
Kunst, M Arnaud, M-P Coudart, W Gonzales, M-F Hullo, M Ionescu, B
Lubochinsky, L Marcelino, I Moszer, E Presecan, M Santana, E
Schneider, J Schweizer, A Vertes, G Rapoport, A Danchin
Bacillus subtilis genome project: cloning and sequencing of the
97 kb region from 325° to 333°
Mol Microbiol (1993) 10: 371-384
This completely unexpected result (the opponents of the genome sequencing projects had « proved » that we knew at least 95% of all possible gene classes and published this demonstration in the most fashionable journals) was presented, together with a similar conclusion from the sequencing of yeast chromosome III, at the first genomics symposium organised by the Commission of European Communities in Elounda, Crete, in 1991. Generally overlooked because people tend to erase the history of their mistakes, this was the first major discovery obtained from a genome project.
The sequencing of the B. subtilis genome, carried out by a European-Japanese consortium, was completed in 1997, at the same time as that of E. coli. As early as 1995 the total length of continuous fragments from the organism was significantly greater than that of the genomes sequenced at that time. For five years, this genome sequence remained the only example of its kind (the genomes of the Firmicutes were particularly difficult to sequence because, for biochemical reasons well understood by the authors, their DNA is usually toxic in E. coli, the universal host used at the time to construct DNA libraries):
F Kunst, N
Ogasawara, I Moszer, AM Albertini, G Alloni, V Azevedo, MG Bertero,
P Bessières, A Bolotin, S Borchert, R Borriss, L Boursier, A Brans,
M Braun, SC Brignell, S Bron, S Brouillet, CV Bruschi, B Caldwell, V
Capuano, NM Carter, SK Choi, JJ Codani, IF Connerton, NJ Cummings,
RA Daniel, F Denizot, KM Devine, A Düsterhöft, SD Ehrlich, PT
Emmerson, KD Entian, J Errington, C Fabret, E Ferrari, D Foulger, C
Fritz, M Fujita, Y Fujita, S Fuma, A Galizzi, N Galleron, SY Ghim, P
Glaser, A Goffeau, EJ Golightly, G Grandi, G Guiseppi, BJ Guy, K
Haga, J Haiech, CR Harwood, A Hénaut, H Hilbert, S Holsappel, S
Hosono, MF Hullo, M Itaya, L Jones, B Joris, D Karamata, Y Kasahara,
M Klaerr-Blanchard, C Klein, Y Kobayashi, P Koetter, G Koningstein,
S Krogh, M Kumano, K Kurita, A Lapidus, S Lardinois, J Lauber, V
Lazarevic, SM Lee, A Levine, H Liu, S Masuda, C Mauël, C Médigue, N
Medina, RP Mellado, M Mizuno, D Moesti, S Nakai, M Noback, D Noone,
M O'Reilly, K Ogawa, A Ogiwara, B Oudega, SH Park, V Parro, TM Pohl,
D Portetelle, S Porwollik, AM Prescott, E Presecan, P Pujic, B
Purnelle, G Rapoport, M Rey, S Reynolds, M Rieger, C Rivolta, E
Rocha, B Roche, M Rose, Y Sadaie, T Sato, E Scalan, S Schleich, R
Schroeter, F Scoffone, J Sekiguchi, A Sekowska, SJ Seror, P Serror,
BS Shin, B Soldo, A Sorokin, E Tacconi, T Takagi, H Takahashi, K
Takemaru, M Takeuchi, A Tamakoshi, T Tanaka, P Terpstra, A Tognoni,
V Tosato, S Uchiyama, M Vandenbol, F Vannier, A Vassarotti, A Viari,
R Wambutt, E Wedler, T Weitzenegger, P Winters, A Wipat, H Yamamoto,
K Yamane, K Yasumoto, K Yata, K Yoshida, HF Yoshikawa, E Zumstein, H
Yoshikawa, A Danchin
The complete genome sequence of the gram-positive bacterium Bacillus
subtilis
Nature (1997) 390: 249-256
The sequence with the annotation of its genes was further updated four times:
V Barbe, S Cruveiller, F Kunst, P Lenoble, G Meurice, A Sekowska, D Vallenet, TZ Wang, I Moszer, C Médigue, A Danchin From a consortium sequence to a unified sequence: The Bacillus subtilis 168 reference genome a decade later Microbiology (2009) 155: 1758-1775 doi: 10.1099/mic.0.027839-0
E Belda, A Sekowska, F Le Fèvre, A Morgat, D Mornico, C Ouzounis, D Vallenet, C Médigue, A Danchin An updated metabolic view of the Bacillus subtilis 168 genome Microbiology (2013) 159: 757-770. doi: 10.1099/mic.0.064691-0
R Borriss, A Danchin, CR Harwood, C Médigue, EPC Rocha, A Sekowska, D Vallenet Bacillus subtilis, the model Gram-positive bacterium: 20 years of annotation refinement Microb Biotechnol. (2018) 11: 3-17
Bremer E, Calteau A, Danchin A, Harwood C, Helmann JD, Médigue C, Palsson BO, Sekowska A, Vallenet D, Zuniga A, Zuniga C A model industrial workhorse: Bacillus subtilis strain 168 and its genome after a quarter of a century Microb Biotechnol (2023) 16: 1203-1231
The sequence and its annotations were distributed to the international community through a specialized database. For lack of financial support, this endeavour was suspended in 2010.
Several genome projects followed: Leptospira interrogans and Staphylococcus epidermidis, in collaboration with the Shanghai Genome Center, Photorhabdus luminescens, at the Institut Pasteur, and, to try and understand the impact of temperature constraints on genomes, the genome of the Antarctic bacterium Pseudoalteromonas haloplanktis TAC125, in collaboration with the Genoscope and several universities. Sequencing of the genome of Psychromonas ingrahamii followed as a collaboration with Monica Riley and her colleagues.
Looking at the flow of published genomic sequences, two contrasting pictures emerge: on the one hand, genes appear to be distributed randomly along the chromosome. On the other hand, their organisation into operons (or pathogenicity islands) suggests that related functions are physically close, at least locally. To understand the organisation of the genome, it is therefore useful to study the distribution of genes along the chromosome. This requires generalising the concept of neighbourhood to many types of neighbourhood other than the simple sequence of genes in the genomic text. From a methodological point of view, this approach to inductive research requires the construction of neighbourhood tables (conveniently available to scientists in databases: a field of choice for bioinformatics). Finally, the systematic study of history will identify literature neighbourhoods not only on the basis of titles and abstracts, but on the basis of the entire content of the articles: "in biblio" analysis is an essential component of inductive reasoning. We do not have heuristics that allow direct access to unknown functions and, apart from preliminary studies, there are not many places where such in silico work is being developed. However, there is an excellent illustration of the concept of neighbourhood, the Entrez software developed by David Lipman and colleagues at the NCBI.
Inductive exploration will consist in finding all the neighbours of each given gene. "Neighbour" here has the broadest possible meaning. This is not simply a geometric or structural concept. Each neighbourhood is intended to shed a particular light on a gene, by seeking its function in what brings the objects of the neighbourhood together. One natural neighbourhood is proximity on the chromosome. Another interesting neighbourhood is similarity between genes or gene products. The isoelectric point often gives a first idea of the compartmentalisation of a gene product. A gene may also have been studied by scientists in laboratories all over the world. And it may have features that relate to other genes: its neighbours will be the genes found together with it in the literature. Finally, there are more complex neighbourhoods, the study of which gives particularly revealing results: two genes can be neighbours because they use the genetic code in the same way. One can also study all the genes that belong to the same neighbourhood in the cloud of points describing the codon usage of all the genes of the organism. I proposed this approach at the 20th anniversary symposium of EMBO at EMBL in Heidelberg in 1994, using the example of the possible role of an important enzyme, polynucleotide phosphorylase.
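The codon usage neighbourhood can be sketched in a few lines. The example below describes each gene by the relative frequencies of its 64 codons and ranks the other genes by plain Euclidean distance to a query gene. This distance is a deliberately simple stand-in, not the multivariate analysis used in the original studies, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product
import math

# The 64 codons, in a fixed order, define the axes of the "cloud of points".
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds: str) -> list[float]:
    """Relative frequency of each of the 64 codons in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)  # codons with ambiguous bases are skipped
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


def codon_usage_neighbours(genes: dict[str, str], query: str, k: int = 3) -> list[str]:
    """Names of the k genes whose codon usage is closest to that of `query`."""
    profiles = {name: codon_frequencies(seq) for name, seq in genes.items()}
    target = profiles[query]
    others = [name for name in profiles if name != query]
    return sorted(others, key=lambda name: math.dist(profiles[name], target))[:k]


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_genes = {
        "geneA": "GCTGCTGCTAAAGAA",
        "geneB": "GCTGCTAAAGAAGCT",
        "geneC": "GCGGCGGCGAAGGAG",
    }
    print(codon_usage_neighbours(toy_genes, "geneA", k=2))
```

The same pattern extends to any other neighbourhood: swap the profile function (chromosomal position, isoelectric point, co-citation counts) and keep the ranking step.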
All this has something of the flavour of the then fashionable field of Artificial Intelligence, a highly controversial but fascinating area. This should also remind us that in silico analysis will never replace in vivo and in vitro validation: let us hope that the propagation of false function assignments through automated interpretation of the genomic texts does not hinder discoveries. The knowledge of genome sequences is a wonderful achievement, but it is the starting point, not the end. The first observations made by my laboratory at the Institut Pasteur (Regulation of Gene Expression and Genetics of Bacterial Genomes) were interpreted as proof that the order of the genes was far from random, but was linked to the function of the genes, in relation to the architecture of the cell. These results were fragmentary, so they had to be experimentally confirmed, combining in silico analysis of the genome (bioinformatics) of model organisms, such as Escherichia coli or Bacillus subtilis, with their study in vivo (reverse genetics and physiological biochemistry, in particular using transcription-expression profiling and two-dimensional protein electrophoresis), as well as comparative studies with other genomes, with biochemical and structural analyses. If the map of the cell is indeed in the chromosome, this calls for some physical principle linking the order of the genes - a symbolic text, carrying information - and the architecture of the cell - concrete (i.e. massive or inert) matter. We do not need to resort to the existence of a divine principle of organization. This should be the consequence of a simple physical principle. The winning triad of Darwinian natural selection (variation / selection / amplification) shows that evolution creates functions, that functions « capture » (recruit) structures (acquisitive evolution), so that structural analysis becomes most relevant once functions are understood.
A Danchin, P Guerdoux-Jamet, I Moszer, P Nitschké
Mapping the bacterial cell architecture into the chromosome
Philos Trans R Soc Lond B Biol Sci (2000) 355: 179-190
EPC Rocha, A Danchin
Ongoing evolution of strand composition in bacterial genomes
Mol Biol Evol (2001) 18: 1789-1799
EPC Rocha, A Danchin
Essentiality, not expressiveness, drives gene-strand bias in bacteria
Nature Genetics (2003) 34: 377-378
EPC Rocha, A Danchin
Gene essentiality determines chromosome organisation in bacteria
Nucleic Acids Res (2003) 31: 6570-6577
Looking at genomes as a whole, we have long known that there is a 10-11.5 period in the distribution of nucleotides, and this is true from prokaryotes to eukaryotes. This bias is present throughout a given genome, in both coding and non-coding sequences. Using a linear projection-based autocorrelation analysis technique, the sequences responsible for this bias have been identified. These ubiquitous motifs were termed "flexible class A motifs". Each motif consists of up to ten conserved nucleotides or dinucleotides distributed in a discontinuous pattern. Each occurrence spans a region of up to 50 bp in length. There is limited variation in the distances between the nucleotides comprising each occurrence of a given motif, suggesting that they are constrained by supercoiling and/or bending of the DNA. Taken together, these motifs cover up to half of the genome in most prokaryotes. They generate the previously identified 11 bp period. Based on the structure of the motifs, it has been suggested that they may define a dense network of protein interaction sites in chromosomes:
E Larsabal, A Danchin
Genomes are covered with ubiquitous 11bp periodic patterns, the "class A flexible patterns"
BMC Bioinformatics (2005) 6: 206
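To give a feel for how a periodic bias of this kind can be detected, the sketch below counts, for every lag up to a maximum, how many pairs of occurrences of a short motif start exactly that many base pairs apart. This is a much simpler technique than the linear projection-based autocorrelation analysis mentioned above, and the motif, the function name and the artificial test sequence are all assumptions made for illustration.

```python
def lag_counts(sequence: str, motif: str = "AA", max_lag: int = 40) -> list[int]:
    """counts[d] = number of pairs of motif occurrences whose starts are d bp apart."""
    sequence = sequence.upper()
    starts = [
        i for i in range(len(sequence) - len(motif) + 1)
        if sequence.startswith(motif, i)
    ]
    present = set(starts)
    counts = [0] * (max_lag + 1)
    for i in starts:
        for d in range(1, max_lag + 1):
            if i + d in present:
                counts[d] += 1
    return counts


if __name__ == "__main__":
    # Artificial sequence with an "AA" every 11 bp; not real genomic data.
    toy = ("AA" + "CGTCGTCGT") * 20
    counts = lag_counts(toy)
    best_lag = max(range(1, len(counts)), key=counts.__getitem__)
    print(best_lag, counts[best_lag])
```

On real genomes the raw counts would need to be compared with what base composition alone predicts, and the codon-related period of 3 in coding regions would have to be separated from the longer period discussed here.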
The corresponding constraints are visible in the amino acid sequence of the proteins, suggesting that the sequence is more constrained by the genome organisation than by the protein function. These observations have significant implications for phylogenetic profiles when analysing protein sequences:
G Pascal, C Médigue, A Danchin
Universal biases in protein composition of model prokaryotes
Proteins (2005) 60: 27-35
The latter work characterises “orphan” proteins, which make up about 10% of the genome of each new species. These proteins are characterised by their enrichment in aromatic amino acids. This work proposes that many of these represent the "self" of the species, by behaving as “gluons” which make an additional contribution to the stability of multi-protein complexes in the cell. This would make an essential contribution to the functional stabilisation of complex intracellular structures. More generally, the approach thus defined allowed the researchers to define the essentiality of a gene in a real context, by measuring its persistence in many species, not only in sequence but also in its place in the genome:
G Fang, EPC Rocha, A Danchin
How essential are non-essential genes?
Mol Biol Evol (2005) 22: 2147-2156
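The notion of persistence can be illustrated with a minimal sketch: given, for each gene family, the set of genomes in which a counterpart is found, persistence is taken here as the fraction of genomes that carry it. This covers only the presence aspect (conservation of position in the genome is not modelled), and the function names, the 0.8 threshold and the toy data are hypothetical choices, not values taken from the cited work.

```python
def persistence(presence: dict[str, set[str]], genomes: set[str]) -> dict[str, float]:
    """Fraction of the given genomes in which each gene family is found."""
    if not genomes:
        raise ValueError("at least one genome is required")
    total = len(genomes)
    return {family: len(found & genomes) / total for family, found in presence.items()}


def persistent_families(
    presence: dict[str, set[str]], genomes: set[str], threshold: float = 0.8
) -> list[str]:
    """Gene families present in at least `threshold` of the genomes (assumed cut-off)."""
    scores = persistence(presence, genomes)
    return sorted(family for family, score in scores.items() if score >= threshold)


if __name__ == "__main__":
    # Toy presence table, invented for illustration only.
    genomes = {"g1", "g2", "g3", "g4", "g5"}
    presence = {
        "famA": {"g1", "g2", "g3", "g4", "g5"},
        "famB": {"g1", "g2"},
        "famC": {"g1", "g2", "g3", "g4"},
    }
    print(persistence(presence, genomes))
    print(persistent_families(presence, genomes))
```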
In summary, bacterial genomes appear to be highly organised entities, contrary to the widely held notion of a random « fluidity » of genomes. What are the selective constraints that support this organisation?
A general analysis of the conservation of synteny in a large number of complete bacterial genomes has shown that two classes of genes tend to stay together. The way in which the persistent class of genes remains grouped together is organised in a way that is reminiscent of a scenario for the origin of life. This is why the corresponding set has been called the paleome. Similarly, genes that are rarely found in genomes form clusters that are easily transferred horizontally. These genes allow the bacteria to live in a specific niche. They are therefore called the cenome (to indicate the fact that they are shared by a community living in a particular environment, and are prone to transfer):
G Fang, EP Rocha, A Danchin
Persistence drives gene clustering in bacterial genomes
BMC Genomics (2008) 9: 4
A Danchin
Natural selection and immortality
Biogerontology (2009) 10: 503-516
A Danchin
A phylogenetic view of bacterial ribonucleases
Prog Nucleic Acid Res Mol Biol (2009) 85: 1-41
The type of DNA polymerase III also plays a role in the overall organisation of the genome. Firmicutes, which have two such polymerases (DnaE and PolC), show a strong bias in gene distribution. Analysis of the genes that co-evolve with these polymerases shows that the different bacterial clades have different origins. This has important implications for the origins of life, as it shows that there is no single ancestor, no LUCA, but a population of progenitors that merged and split several times before giving rise to the species we know today.
Our efforts have led to the identification of several rules, linked to the particularities of the building blocks of life:
These rules must be implemented in synthetic biology constructs.
V de Lorenzo, A Danchin Synthetic biology: discovering new worlds and new words EMBO Rep (2008) 9: 822-827. doi: 10.1038/embor.2008.159
A Danchin Bacteria as computers making computers FEMS Microbiol Rev (2009) 33: 3-26. doi: 10.1111/j.1574-6976.2008.00137.x
A Danchin, A Sekowska Frustration: Physico-chemical prerequisites for the construction of a synthetic cell in: Systems Chemistry, May 26th - 30th, 2008, in Bozen, Italy Beilstein Institut for the Advancement of Chemical Sciences (2009) 1-13.
A Danchin, PM Binder, S Noria Antifragility and tinkering in biology (and in business): Flexibility provides an efficient epigenetic way to manage risk Genes (2011), 2: 998-1016; doi:10.3390/genes2040998
M Porcar, A Danchin, V de Lorenzo, VA dos Santos, N Krasnogor, S Rasmussen, A Moya The ten grand challenges of synthetic life Systems and Synthetic Biology (2011) 5:1-9. doi: 10.1007/s11693-011-9084-5
A Danchin Scaling up synthetic biology: Do not forget the chassis FEBS Letters (2012) 586: 2129-2137. doi: 10.1016/j.febslet.2011.12.024
A Danchin Synthetic biology's flywheel EMBO Reports (2012) 13: 92. doi: 10.1038/embor.2011.253
CG Acevedo-Rocha, G Fang, M Schmidt, DW Ussery, A Danchin From essential to persistent genes: a functional approach to constructing synthetic life Trends Genet. (2013) 29: 273-279. doi: 10.1016/j.tig.2012.11.001
A Danchin, A Sekowska Constraints in the design of the synthetic bacterial chassis Methods in Microbiology (2013) 40: 39-68. doi:
A Danchin, A Sekowska, S Noria
Danchin A Isobiology: A variational principle for exploring synthetic life Chembiochem (2020) 21: 1781-1792 doi: 10.1002/cbic.202000060
Danchin A Biological innovation in the functional landscape of a model regulator, or the lactose operon repressor CR Biol (2021) 344: 111-126 doi: 10.5802/crbiol.52
The minimal functions required to make a cell alive revealed many genes coding for unknown functions. This work was followed by a series of developments combining the identification of agents generating classes of things (i.e. protein complexes that dissipate energy during the process of separation between classes of material things, positions, or times) and the coordination of non-homothetic growth, mediated by the omnipresent control of cytidine triphosphate (CTP) synthesis.
Being abstract, information must nevertheless be embodied in material entities, with unavoidable idiosyncratic properties. This inevitably makes new unmet functional needs emerge. Thus, the growth of cells requires specific but clumsy material implementations, "kludges" as the common expression has it. Although difficult to identify, this "tinkering" becomes essential in particular situations. Finally, a specific functional category characterizes the need for growth: metabolic implementations that allow the cell to organize the growth of its cytoplasm, membranes and genome, in different spatial dimensions (3D, 2D, 1D). Solving this metabolic dilemma, which is essential for the engineering of new synthetic biology chassis, has led us to discover an unexpected role for CTP synthetase as a coordinator of non-homothetic growth.
EPC Rocha, A Danchin Base composition bias might result from competition for metabolic resources Trends Genet (2002) 18: 291-294
A Danchin Comparison between the Escherichia coli and Bacillus subtilis genomes suggests that a major function of polynucleotide phosphorylase is to synthesize CDP DNA Res (1997) 4: 9-18
P Nitschké, P Guerdoux-Jamet, H Chiapello, G Faroux, C Hénaut, A Hénaut, A Danchin Indigo: a World-Wide-Web review of genomes and gene functions FEMS Microbiol Rev (1998) 22: 207-227
In particular, it explains the surprising observation that deoxyribonucleotide synthesis starts from ribonucleoside diphosphates, not triphosphates.
A consequence of non-homothetic growth is a huge pressure to make genomes long, not short. This results in the multiplication of accidents (such as local duplications of DNA sequences) or of processes resulting in the accretion of DNA sequences within genomes. Horizontal Gene Transfer, a process we discovered to be omnipresent, is therefore an inevitable consequence of this pressure.
When I decided in 1986 to try to sequence an entire bacterial genome, it was an attempt to understand the basic principles of both its construction and its role. At the time, most biologists regarded this as a waste of time and resources, unlikely to yield important new knowledge, and the idea was greeted with reluctance in spring 1987. My idea was to explore the link between the coordination of gene expression and the physical organisation of the genome, on the basis that a genome is not just a collection of genes. After a complex series of political obstacles that are impossible to summarise here (see Why sequence genomes? The Escherichia coli imbroglio or The Delphic Boat) I ended up driving the sequencing of a large part of the Bacillus subtilis genome and, together with the late Frank Kunst, the scientific co-ordination of an international team to sequence the genome of strain 168 of this organism. This led me to try to organise genome bioinformatics in France with the help of several colleagues from universities and national research agencies, through the creation of a national group, GDR 1029 (1991-1995), then by coordinating the bioinformatics programme of the Groupement de Recherche et d'Études des Génomes (1992-1996, headed by Piotr Slonimski), and then at the Comité de Coordination des Sciences du Vivant (1998-2000). I coined the expression "in silico" to describe this endeavour. Amusingly, it has been so successful that it is now universally accepted as a counterpart to in vivo and in vitro, and its origin has been forgotten.
As the director of the Department Genomes and Genetics at the Institut Pasteur until June 2009, I brought the project to a close by re-sequencing and annotating the B. subtilis reference genome sequence, as a tribute to the entire international community working on this model organism. In 1991, the B. subtilis programme, in parallel with the Yeast programme, discovered that a significant number of the genes that make up the genome were still of unknown function. In the same year, the analysis of this unknown gene complement of the E. coli genome led to the demonstration that horizontal gene transfer (HGT) was not an anecdotal feature, as previously thought, but accounted for a large proportion of bacterial genomes. Most of these unknown genes belonged to an original class that had a specific codon usage bias. Overlooked at its origin and rediscovered ten years later, HGT has since been shown to be an essential component in the construction of most, if not all, genomes. Indeed, the number of articles in the field continues to grow at a rapid pace.
C Médigue, T Rouxel, P Vigier, A Hénaut, A
Danchin
Evidence for horizontal gene transfer in Escherichia coli
speciation
J Mol Biol (1991) 222: 851-856
C Médigue, A
Viari, A Hénaut, A Danchin
Escherichia coli molecular genetic map (1500 kbp): update II
Mol Microbiol (1991) 5: 2629-2640
Our in silico genomics work showed for the first time that a fraction (at least one sixth) of E. coli genes are derived from HGT. It also showed that antimutator genes are likely to be propagated by HGT, suggesting that bacteria in the environment are often in a highly mutable state and are fixed in a much more rigid (invariable) form when they encounter a stable biotope. Another observation from this study was the clustering of HGT genes in relation with particular cell processes, suggesting that genomes are organised entities:
P Guerdoux-Jamet, A Hénaut, P Nitschké, JL Risler, A Danchin
Using codon usage to predict genes origin: is the Escherichia coli outer membrane a patchwork of products from different genomes?
DNA Research (1997) 4: 257-265
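The idea of using codon usage to guess at the origin of genes can be illustrated with a minimal sketch. It is not the method of the articles cited above: it simply assumes that each gene can be summarised by the relative frequencies of its 64 codons, and that a gene whose profile lies far from the pooled profile of the genome is a candidate for foreign origin. The function names, the Manhattan distance and the threshold value are illustrative assumptions, and a real analysis would compare synonymous codons within each amino acid and use a proper multivariate method.

```python
from collections import Counter
from itertools import product

# The 64 possible codons, in a fixed order.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_counts(seq):
    """Count in-frame codons of a coding sequence (ambiguous codons are skipped)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing incomplete codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_SET)


def to_frequencies(counts):
    """Turn codon counts into relative frequencies over the 64 codons."""
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CODONS}


def usage_distance(freq_a, freq_b):
    """Manhattan distance between two codon profiles (ranges from 0 to 2)."""
    return sum(abs(freq_a[c] - freq_b[c]) for c in CODONS)


def flag_atypical_genes(genes, threshold=1.0):
    """Return the names of genes whose codon profile is far from the pooled profile.

    genes: dict mapping a gene name to its coding sequence.
    threshold: hypothetical cut-off, chosen here only for illustration.
    """
    pooled = Counter()
    for seq in genes.values():
        pooled.update(codon_counts(seq))
    reference = to_frequencies(pooled)
    return sorted(
        name
        for name, seq in genes.items()
        if usage_distance(to_frequencies(codon_counts(seq)), reference) > threshold
    )
```

Calling `flag_atypical_genes` on a dictionary of coding sequences returns the outliers by name. In practice the pooled reference would itself be distorted by a large foreign fraction, which is one reason why clustering approaches are preferable to a single threshold.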
That this observation is general would be demonstrated later on, with Bacillus subtilis. The importance of HGT is so well accepted nowadays that it has become common knowledge:
I Moszer, EPC Rocha, A Danchin
My interest in biology stemmed from the idea that selection had a stabilizing role. I had discovered this during my travels through West Africa, where I collected butterflies (and frogs, for the laboratory of Zoology at the École Normale Supérieure), and from the way Ivan Schmalhausen discussed the problems of Darwinism. In line with my interest in this work and my involvement at the Centre Royaumont pour une Science de l'Homme, I organized in 1971 a weekly seminar, every Wednesday afternoon, at the Institut de Biologie Physico-Chimique. There, together with Philippe Courrège and Jean-Pierre Changeux, we tried to delineate the limits of selection in biological processes. Our work explored the role of selective stabilization in learning and memory in the nervous system and in the immune system. This exploration predated the fashion for neural networks, but with a specific feature: synapses evolved in such a way that they could degenerate irreversibly. The outcome of the process was the carving of an image of the environment within the neural network.
JP Changeux, A Danchin
Selective stabilisation of developing synapses as a mechanism for the specification of neuronal networks
Nature (1976) 264: 705-712
A Danchin
A selective theory for the epigenetic specification of the monospecific antibody production in single cell lines
Ann Immunol (Paris) (1976) 127: 787-804
A Danchin
The specification of the immune response: a general selective model
Mol Immunol (1979) 16: 515-526
I then explored the general process of selective stabilisation in the construction of cells as computers that make computers. This process allows for the embodiment of functional properties within material entities that are progressively linked together as networks or organised in space, for example in operons. This view led to an emphasis on the role of information as the authentic currency of the physical world, and to the discovery of the key role of Landauer's principle in biology, as described above. Recently I summarised this old work together with André Fenton.
A Danchin, AA Fenton
From analog to digital computing: Is Homo sapiens’ brain on its way to become a Turing Machine?
Front Ecol Evol (2022) 10: fevo.2022.796413
The functional organisation of the genes in genomes must result from the selection pressure of simple physico-chemical principles. Besides physical causes such as the structure of water (the study of the genome of P. haloplanktis is meant to give access to some of those), gases and radicals, because they are highly diffusible, may play a major role in cellular compartmentalisation, and might be the cause of some of the organisation of the genes in genomes. Sulfur metabolism is particularly sensitive to gases and radicals, and it is therefore important to understand how it is organised. A first study demonstrated that sulfur-related genes are organised into islands:
EPC Rocha, A Sekowska, A Danchin
Sulphur islands in the Escherichia coli genome: markers of the cell's architecture?
FEBS Lett (2000) 476: 8-11
and a detailed analysis, mainly developed during the creation of the HKU-Pasteur Research Centre in Hong Kong, made it possible to uncover the details of the “methionine salvage pathway”:
A Sekowska, HF Kung, A Danchin
Sulfur metabolism in Escherichia coli and related bacteria: facts and fiction
J Mol Microbiol Biotechnol (2000) 2: 145-177
A Sekowska, JY Coppée, JP Le Caer, I Martin-Verstraete, A Danchin
S-adenosylmethionine decarboxylase of Bacillus subtilis is closely related to archaebacterial counterparts
Mol Microbiol (2000) 36: 1135-1147
A Sekowska, L Mulard, S Krogh, JK Tse, A Danchin
MtnK, methylthioribose kinase, is a starvation-induced protein in Bacillus subtilis
BMC Microbiol (2001) 1: 15
A Sekowska, A Danchin
The methionine salvage pathway in Bacillus subtilis
BMC Microbiol (2002) 2: 8
The following work synthesises what is known of the catalytic activities involved in this ubiquitous cycle (it is also present in humans and plants), which has the interesting feature that it systematically recruited proteins of diverse structures to complete the cycle. One of these proteins is likely to be related to the ancestor of ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), the most abundant enzyme on the planet (this opens fascinating questions on the origin of catalytic activities):
A Sekowska, V Dénervaud, H Ashida, K Michoud, D Haas, A Yokota, A Danchin
Bacterial variations on the methionine salvage pathway
BMC Microbiol (2004) 4: 9
H Ashida, A Danchin, A Yokota
Was photosynthetic RuBisCO recruited by acquisitive evolution from RuBisCO-like proteins involved in sulfur metabolism?
Res Microbiol (2005) 156: 611-618
This remarkable metabolic cycle has the surprising property, as shown in our work, of leading the cell to synthesise carbon monoxide under particular conditions. As this cycle exists in humans, this opens interesting perspectives on possible controls mediated by CO, a gas distinct from nitric oxide, in the immune system and in the nervous system.
A Sekowska, H Ashida, A Danchin
Revisiting the methionine salvage pathway and its paralogues
Microb Biotechnol (2019) 12: 77-97 doi: 10.1111/1751-7915.13324
Metabolism can be seen as a pre-requisite for any scenario of the origins of life. I have explored several features of the question, based on surface metabolism, as advocated by Samuel Granick, Freeman Dyson and Günter Wächtershäuser.
A Danchin
Homeotopic transformation and the origin of translation
Progress in Biophysics and Molecular Biology (1989) 54: 81-86
A Danchin
Archives or palimpsests? Bacterial genomes unveil a scenario for the origin of life
Biological Theory (2007) 2: 52-61
A Danchin, G Fang, S Noria
The extant core bacterial proteome is an archive of the origin of life
Proteomics (2007) 7: 875-889
A Danchin
From chemical metabolism to life: the origin of the genetic coding process
Beilstein J Org Chem (2017) 13: 1119-1135 doi: 10.3762/bjoc.13.111
A Danchin
Multiple clocks in the evolution of living organisms
In: Molecular Mechanisms of Microbial Evolution (2018), edited by Pabulo H. Rampelotto, Springer, pp. 101-118, ISBN: 978-3-319-69078-0
The study of the process of initiation of translation, which, in Bacteria, associates two independent signals (a metabolic signal that labels the first methionine of the nascent polypeptide with a one-carbon residue, and the structure of a special transfer RNA) led me, through genetic experiments, to the discovery of a ubiquitous anomaly in metabolism, coupling replication, transcription, translation and cell division. The mutants affected in that process were analysed in succession. They involved transcription termination, translation initiation, the “stringent” coupling between these processes, one-carbon metabolism, the synthesis of cyclic AMP, a protein long proposed to be a bacterial histone, H-NS, and the biosynthesis pathway of branched-chain amino acids. This apparently haphazard list, derived from the outcome of genetic experiments, accounts for the threads followed, one by one, in the attempt to unravel this complicated network of interactions, finally understood in January 2006 through the role of the amino acid serine (this common amino acid is toxic in excess because of at least two processes: the production of hydroxypyruvate, which forms dead-end products with thiamine, and the production of aminoacrylate / iminopropionate when it enters pathways such as cysteine and tryptophan biosynthesis).
The involvement of cyclic AMP in the "serine effect" (wild-type strains are sensitive to serine, but cya and crp mutants are more resistant) led us to a thorough study of the genetics and biochemistry of adenylate cyclases. After our laboratory became the first to isolate and fully characterise the gene of an adenylate cyclase (that of E. coli), the work was extended to the identification of adenylate cyclase toxins present in the pathogens of whooping cough and anthrax. Following the invention of a multi-partner cloning technique, the ancestor of the technique now known as "double hybrid" (patent EP0301954), the genes of the corresponding toxins were isolated and sequenced, the proteins biochemically analysed and the secretion process of the cyclases characterised:
P Glaser, D Ladant, O Sezer, F Pichot, A Ullmann, A Danchin
The calmodulin-sensitive adenylate cyclase of Bordetella pertussis: cloning and expression in Escherichia coli
Mol Microbiol (1988) 2: 19-30
P Glaser, H Sakamoto, J Bellalou, A Ullmann, A Danchin
Secretion of cyclolysin, the calmodulin-sensitive adenylate cyclase-haemolysin bifunctional protein of Bordetella pertussis
EMBO J (1988) 7: 3997-4004
To validate this double hybrid technique, a symmetrical approach was used to clone the cDNA of mammalian calmodulins, showing that the method is widely applicable:
A Danchin, O Sezer, P Glaser, P Chalon, D Caput
Cloning and expression of mouse-brain calmodulin as an activator of Bordetella pertussis adenylate cyclase in Escherichia coli
Gene (1989) 80: 145-149
As early as 1988, this work raised a number of ethical problems (recently revived under the name of "bio-terrorism"):
A Danchin
Not every truth is good. The dangers of publishing knowledge about potential bioweapons
EMBO Rep (2002) 3: 102-104
This led to my appointment as a member of the Conseil National Consultatif pour la Biosécurité (CNCB).
This first work on adenylate cyclases is summarised in:
A Danchin
Phylogeny of adenylyl cyclases
Adv Second Messenger Phosphoprotein Res (1993) 27: 109-162
This article established the international reference for the classification of adenylate cyclases. Three classes of different phylogenetic origin (convergent evolution) were initially identified: Class I, cyclases from enterobacteria and related bacteria; Class II, secreted toxic cyclases; Class III, "universal" class present in Bacteria and in Eukarya (including higher vertebrates). A fourth class, also from a completely different phylogenetic origin, and possibly involved in promiscuous activities, was discovered in our research Unit a few years later:
O Sismeiro, P Trotot, F Biville, C Vivarès, A Danchin
Aeromonas hydrophila adenylyl cyclase 2: a new class of adenylyl cyclases with thermophilic properties and sequence similarities to proteins from hyperthermophilic archaebacteria
J Bacteriol (1998) 180: 3339-3344
The "universal" class of cyclases (class III) brings together adenylyl and guanylyl cyclases, and an original selection procedure makes it possible to switch from one type of specificity to the other (this was one of the very first experiments to show that it is possible to change the specificity of an enzyme for its substrate):
A Beuve, A Danchin
From adenylate cyclase to guanylate cyclase. Mutational analysis of a change in substrate specificity
J Mol Biol (1992) 225: 933-938
***
This text, which strictly follows the rules we discussed in an article published in 2021, does not seem to be available elsewhere, which justifies making it available here.
Murray Gell-Mann [© Complexus 1 (5) 1995-1996 (Out of print)]