Sequencing the genome of Bacillus subtilis: a western inside story
Antoine Danchin (2001)
HKU Pasteur Research Centre, Hong Kong
As is often the case in history, the narrative of the first steps of a radical change in the course of science is reconstructed by dominant actors who were absent from the initial stages that saw the deployment of the new paradigm. The early days of genome sequencing did not follow a smooth path. In fact, most scientists, including today's most ardent advocates of the field, were considerably negative at the beginning, and it is refreshing to look back and get a sense of what actually happened.
In the mid-eighties it was widely believed that the accumulated knowledge about genes, collected in data libraries, was sufficient to fully characterise the genomes of living organisms. As a result, most researchers were reluctant to organise genome sequencing. This was seen as a waste of important resources that should be distributed to other research programmes. In the spring of 1987, I nevertheless proposed to sequence the genome of the model bacterium, Bacillus subtilis, at a time when André Goffeau (1935-2018), together with Piotr Slonimski (1922-2009), was trying to set up a yeast genome project in Europe. This led us to get together, and draw up a research programme for the Biotechnology Action Programme of the forerunner of the European Union. It took several years for microbiologists to accept that this undertaking was necessary. As is often the case in history, most of the current players in the field, who had been hostile or reluctant at the outset, were quick to claim victory for the programme once its importance was recognised.
The text of the White Paper to the European Commission represents the vision we had in 1988-89. Many of the ideas that have subsequently been followed up by work around the world are explicit in it. However, the first discovery of the genome projects, that a large proportion of genes were completely unknown, both in terms of their function and their origin, had not been foreseen. This discovery, presented at the 1991 Elounda meeting in Crete and resulting from work on the yeast chromosome III and the Bacillus subtilis genome, marked the beginning of the interest of the entire scientific community in these projects.
This text is adapted from an article published in a booklet in honour of Professor Hiroshi Yoshikawa, Nara Institute for Science and Technology, Nara, Japan: For the Love of Genome, p. 4-9. We would like to associate our late colleague Frank Kunst, who passed away at the end of the resequencing project, with this tribute.
The prehistory of the sequencing of the Bacillus subtilis genome goes back to 1985, at a time when the idea to sequence the human genome began to be discussed in the United States. Robert Sinsheimer, Renato Dulbecco and Charles DeLisi, each in his own way, proposed to sequence the human genome in 1985-1986. Their project was, at that early time, presented as a technical program whose outcome might be to help solve some of the problems of human health.
Immediately after Japan had been crushed by the atomic bombs, the USA initiated a policy of intensive co-operation with the defeated country, in order to hold off the growing threat of communism. Genetics had a central place in the arena of scientific collaboration. Amongst other aims, this allowed the Americans to salve their conscience by showing an interest in the future of the residents of Hiroshima and Nagasaki. This explains how the US Department of Energy (DoE), the federal agency responsible for the USA’s nuclear programmes, very soon became involved in research which at first sight appears well outside its natural jurisdiction. The main areas of research were the mechanisms of mutagenesis and the identification of the effects of radiation on genes. In 1947 this led to the creation of the Atomic Bomb Casualty Commission (ABCC), financed by the Atomic Energy Commission (which soon became the DoE). Genetics made up an important part of its research.
The mutagenic effects of radiation had been discovered by Hermann Joseph Muller in 1927. For this work, which led him to make appallingly alarmist predictions, he was awarded the Nobel Prize in 1946. In 1954, the ABCC published a report by James Neel and William Schull on the first genetic findings on more than 75,000 births in Hiroshima and Nagasaki. The results were reassuring, but they dealt with only the first generation of children born since the bomb, and were based on an analysis which was still rudimentary. 1954 was only a year after the structure and mode of replication of the DNA molecule had been discovered. A generation later, more sophisticated studies which analysed protein mobility in an electric field did not contradict this early work. But to be really sure of what kind of mutations radiation might have caused, it was necessary to find out what happens in the DNA sequence, right down to the level of the DNA bases. This was not yet possible then.
However, in the meantime the political context had changed beyond recognition. In the mid 80s the cold-war rhetoric gave way to concern about a new adversary. Japan’s economic power threatened America’s leadership in technology. The federal agencies were mobilised to encourage the setting up of new companies and to protect intellectual and industrial property. And we witness today the consequences of this policy in the high number of computer, software and biotech companies in the USA.
It is impossible to give a precise date for the beginning of the Human Genome Project. Some writers date it from the Alta summit in Utah in December 1984, organised by the DoE. The aim of this summit, in which James Neel took part, was to discuss what strategies should be used to detect mutations in the generations after Hiroshima and Nagasaki. The summit succeeded in fulfilling the DoE’s vision for life sciences. In discussions of state-of-the-art technologies and DoE’s capability for using them, all sorts of potential models for identifying mutations were reviewed. Direct sequencing of the DNA involved was already considered to be one of the most obvious methods. The original motives were soon forgotten.
In fact, the Human Genome Project could not have been imagined without efficient DNA sequencing and the constant progress that has been made in this technique. Neither would it have been possible without the systematic development of computer science, both in terms of hardware and software. This is another aspect where the DoE’s contribution is most obvious. In the summer of 1975, Frederick Sanger of the Medical Research Council (MRC) in Cambridge had announced that he had found a way to identify a gene’s sequence (the chain of bases which make it up) by reproducing DNA replication in a test tube. Immediately several laboratories in Europe, the USA and Japan tried their hand at automating these methods. "Fluorescent" sequencing, introduced by Leroy Hood’s team at Caltech in 1986, was a remarkable improvement.
In 1981 Hood had set up Applied BioSystems, which specialised in laboratory equipment for molecular biology. This company developed at remarkable speed, thanks to sales of its DNA sequencers, until it was bought by PE Biosystems in 1997, just as its model 3700 capillary sequencer was coming onto the market. This sequencer was behind the considerable acceleration in sequencing speed worldwide. The technique, imitated elsewhere in the world, has continued to be improved and developed both by its promoters and by its competitors. It led to a ten-fold improvement in laboratory performance between 1995 and the end of 1997, and by a factor of ten again at the end of the century.
This progress would not have been possible without parallel developments in computer memory and calculating speed. As early as 1978, it had been clear that computer support would rapidly become necessary, to allow the scientific community to build the sequences into a continuous text which they could then interpret. A study undertaken by Rockefeller University and the European Molecular Biology Laboratory (EMBL) at Heidelberg led to the idea of the creation of a databank for gene sequences. It became clear very early on that the possession of this information was of vital importance, with political implications. Frequent discussions, sometimes heated, took place between Europe and the USA, to decide where these databanks would be, and how they would be structured. Two banks were established, in competition but also in touch with each other – one at Heidelberg, the other, the first GenBank, at one of the DoE’s laboratories, the Los Alamos National Laboratory (LANL). After the Alta summit, Robert Sinsheimer, then Chancellor of the University of California at Santa Cruz, proposed this project as an appeal for funds. He brought together a group of well-known researchers to discuss the idea in May of the following year (1985), but he was unable to raise the funds needed. Independently, Renato Dulbecco, of the famous Salk Institute, proposed using the human genome sequence to discover the causes of cancer. He published this idea in Science in 1986.
In the same way, in 1986 André Goffeau had proposed to the European Commission a program aimed at sequencing the yeast genome as a typical illustration of the principle of subsidiarity (i.e. a demonstration of synergy between different European countries, needed to obtain support from the European Commission). In April 1996, the sequence of the baker's yeast genome was made public: 16 chromosomes, representing more than 12 megabases, had been sequenced by a consortium of more than 100 laboratories and 641 scientists throughout the world. This was all the more remarkable because the feat was achieved two years ahead of schedule, and because the genome was much larger than the two genomes deciphered until then, those of Haemophilus influenzae (1.8 Mb) and Mycoplasma genitalium (0.58 Mb), sequenced by TIGR a year before. However, in all these cases, the longest contiguous DNA sequences remained shorter than 2 Mb. This was because obtaining large continuous DNA segments without gaps is an extremely hard task. The difficulty of assembling sequences and the probability of encountering unclonable DNA regions and repeated sequences increase with length. Yet the two model bacteria used throughout the world, Escherichia coli and Bacillus subtilis, each have a genome more than 4 Mb long. Obtaining their complete sequence would therefore be a much more difficult endeavour, in particular towards the end of the sequencing process, when the final unsequenced gaps remain to be closed.
At that early time, the central question addressed by the research in my laboratory was to understand how genes are collectively expressed, in a harmonious fashion. Witnessing the first successes of the sequencing of viral genomes, it appeared to me quite natural, and even necessary, to attempt to understand this fundamental feature of living cells by analysing the complete text of genomes. This supposed that one would be able to get the whole genome sequence. It also required two technical prerequisites to be fulfilled: one at the bench (the sequence of several million base pairs had to be determined experimentally) and one in computer science (the sequence had to be assembled and analysed, which would certainly be impossible manually, without automated means).
In order to fulfil the first condition I had met, during the summer of 1986, Pierre Prentki, a young and brilliant scientist who worked in the United States with David Galas (who later became responsible for the genome programs at the DoE). Pierre and I had agreed that he would come and set up in my research unit a laboratory for sequencing and analysis of the gene functions of a model genome. I did not know at that time that he would soon meet a tragic end…
As for the second prerequisite, things were also difficult to set up. Existing genome programs did not aim at answering a specific biological question. They were just descriptive. In contrast, our starting point was different, since the reason why I proposed to sequence the genome of B. subtilis, at the spring meeting of the Société Française de Microbiologie in 1987, was a conceptual one, based on the computer-mediated analysis I had recently performed with Olivier Gascuel. Indeed, for several years I had been regularly meeting computer scientists and biochemists who had long been involved in the computer-mediated analysis of DNA and protein sequences, and more generally of biological knowledge. This had convinced us of the importance of computer science for setting up genome programs. The underlying assumption of research in my laboratory was that collective gene behaviour should be revealed as a prominent feature of the genome (which could therefore never be perceived as a simple collection of genes), and that this could be tackled with "in silico" approaches. As expected, the reaction to my proposal to sequence a bacterial genome was almost universally sceptical, when not plainly negative, in particular when I proposed to sequence the genome of the second model bacterium, Bacillus subtilis (since, at that time, rumours were spreading that its genome might very soon be completed, I did not propose to sequence the E. coli genome). Fortunately, Simon Wain-Hobson, who had recently sequenced the genome of HIV, the AIDS virus, was interested and ready to start a sequencing program. Together we proposed to our local and ministry authorities to sequence the genome of Chlamydia trachomatis, the agent of a widespread sexually transmitted disease, but we did not meet with any success. In June of this same year Raymond Dedonder, then the director of the Institut Pasteur, attended the regular meeting on the biology of B. subtilis in California.
There, James Hoch proposed to the community of specialists of this bacterium to sequence its genome. Back from the San Diego meeting, Dedonder remembered my proposal of the beginning of the year and asked me whether I was still interested. He was willing to create a program if I would take charge of it. Philippe Glaser was just completing the sequence of a long piece of DNA which we had identified as coding for the toxic adenylate cyclases of Bordetella pertussis. He was recruited to set up a sequencing laboratory in my research unit. This is how the E. coli geneticist that I was became involved in working on another model bacterium, B. subtilis.
Here, I must introduce a parenthesis. Knowing the situation fifteen years later (the company Genset announced, at the end of 1997, that it had sequenced the genome of two Chlamydia species, including C. trachomatis), it appears clearly that this history is not a simple anecdote but should be analysed as an interesting feature of the sociology of science. It made me discover that there is such a strong compartmentalization in science that scientists working in a narrow domain do not welcome the intrusion of outsiders into their field, and in fact try to deter it by all means. But, above all, it showed that sociological constraints are very strong in orientating research. Chlamydia trachomatis infection is the most widespread sexually transmitted disease in many countries, and it is the first cause of female sterility. It is easy to cure (with a generic antibiotic, therefore not profitable to companies), but its diagnosis is difficult. Unfortunately, the venal interest in using in vitro human egg fertilization techniques is such (this is the only way to circumvent C. trachomatis induced sterility) that nobody would care to cure the disease…
Let us come back to B. subtilis. Thanks to his talent as an organizer, Dedonder rapidly set up an international meeting where it was decided that a consortium of five European and five American laboratories would cooperate to sequence the B. subtilis genome, as soon as appropriate funds had been collected. The adventure started well. By chance, in November of the same year, in Gif-sur-Yvette near Paris, at a meeting of the scientific council of the Centre de Génétique Moléculaire of the French national research agency, directed by Piotr Slonimski, I met André Goffeau, who had already begun in earnest to launch the yeast genome sequencing project. After a moment of hesitation (the European funds were limited, which implied an implicit competition between the projects), we both became persuaded that the two projects were complementary, if they could be financed by the European Community. André Goffeau promised his support. Early in 1988, I was commissioned by the directorate Biology, division Biotechnology, of the Commission of the European Communities to write the introductory text for their white paper meant to present the Biotechnology Action Program for sequencing genomes. This work was meant to provide a conceptual justification for research in genome sequencing to the politicians of the European government. All this triggered the creation of the B. subtilis genome program, a story in itself: starting from a collaboration between five European and five American laboratories, it ended as a collaboration between Europe and Japan, with a tiny participation of two American groups! Indeed, as we shall now see, this is where Hiroshi Yoshikawa enters the scene, in a truly seminal way: without him, I am afraid that the B. subtilis genome program would never have existed.
With praiseworthy tenacity, without ever being discouraged, Dedonder had, for his part, approached the appropriate directorate at the Commission, and he obtained financing for an exploratory step. This part would be financed by the program named "Science" of the EEC. Raymond Dedonder was its administrative coordinator and I created the sequencing laboratory, within my research unit, under the direction of Philippe Glaser. Unfortunately, things were not going so well in the United States: fights about the priorities to give to genome programs, stirred up by personal animosities and by the emphasis placed on the sequencing of the E. coli genome, led to a lack of support from the federal agencies approached. This was not without consequences for the European project: in these conditions, a simple query from the German advisor of the grant committee meant that the B. subtilis program was not retained for the "Biotechnology" action, which followed the "Science" program! The European support stopped at the end of 1990, in spite of the many efforts of Dedonder, who tried all types of interventions to trigger some support. Fortunately, as is often the case, it was possible to extend the Science program for one year (of course without supplementary funds) and also to maintain the contacts with the yeast BAP program, in which I was invited to participate as an observer.
Luckily, an unexpected event came to change the course of history entirely. At the international meeting on B. subtilis, held that year in July, our Japanese colleague Hiroshi Yoshikawa took the floor to say in a vehement way (this is quite unusual for a Japanese person!) that he did not understand why Japan had not been considered from the start as a possible partner in the project. This very healthy reaction decided the future: instead of the United States, why not attempt the adventure with Japan? Hiroshi Yoshikawa knew he could obtain the appropriate support. And this is how a new project was submitted to the EEC Biotechnology Program, in which it was indicated that a team of European laboratories (including a Swiss one, with the support of the Helvetic Confederation!) would sequence two thirds of the genome, while Japan would sequence the remaining third.
How did the B. subtilis program fare subsequently? The first long sequence of the B. subtilis genome (almost 100 kb long) was presented at Elounda, in Crete, at the same time as the complete sequence of yeast chromosome III, in 1991. The main observation then was that about half of the newly identified genes were of unknown sequence and function, a truly surprising discovery. At the meeting, Piotr Slonimski joked about the European lead over the USA (where there had been much talk about sequencing the human genome, or the E. coli genome, but few results up to that date) by proposing to name the enigmatic genes that had just been discovered in quantity "Elusive, Esoteric, Conspicuous" genes (that is, genes that had been unobtrusive, but were really expressed and typical), with the acronym "EEC genes", in which one immediately recognizes the acronym by which the European Union was known at that time. The representative of the National Institutes of Health, which financed an important part of the genome programs in the USA, reacted in a way that is not unusual, by superbly ignoring the European success and listing what was about to happen in the United States (and which indeed happened four years later).
From this date on, the progress of the sequence was regularly made public, both at meetings of the consortium and at international genome meetings (mostly in the United States and in Japan). When Dedonder retired, Frank Kunst, from his laboratory at the Institut Pasteur, took the helm, and coordinated the program of the consortium during three successive Biotechnology contracts (which scheduled the end of the sequencing program for December 1998). Naotake Ogasawara, from the Nara Institute of Science and Technology, coordinated the team of Japanese laboratories, with the constant support of Hiroshi Yoshikawa. This allowed the consortium to sequence the whole genome and, as in the case of the yeast genome program, to accelerate sequencing. In fact the sequence was completed in April 1997, twenty months before its planned completion (and the Japanese team was the first to complete its part), and presented publicly in July 1997 in Lausanne (Switzerland) at a meeting where Craig Venter described the numerous successes of TIGR. It was finally published in November of this same year. The remaining time was used to check the quality of the data, and to resequence, by direct use of PCR on the chromosome, the regions where errors were suspected. This allowed the consortium to obtain a sequence with an error level thought to be not higher than one base in ten thousand (*).
Much more could be said about this program and about the various persons who brought it to completion, but I think that it is time again to place emphasis on the immense contribution of Hiroshi Yoshikawa. As I am neither a specialist of replication nor a long-standing specialist of B. subtilis, I shall not emphasize the role of Hiroshi Yoshikawa in these domains: his list of publications speaks for itself. However I wish to say that, in contrast to the usual western belief that it is difficult to work with Japanese scientists (like all beliefs, it is something said by hearsay, and by people who, of course, have no experience of the matter), I had exactly the opposite impression. This is not the place to discuss the difficulties of international scientific collaborations (there are countries which act in a way that I would not hesitate to name "unethical", as if it were a matter of course), but I must say that I feel that the European-Japanese collaboration has been a model for future collaborations, and that I certainly wish to go on with as many such collaborations as possible. And this is certainly due to the atmosphere created by Hiroshi Yoshikawa. Not only did we freely exchange information and materials, but each of us tried his best to work as fast as possible, and to make the project real teamwork. There have not been many successes in genomics aside from those in the USA, but I feel that the Bacillus subtilis program is one of those rare successes, and that it owes much to the scientific insight and strong and kind personality of Hiroshi Yoshikawa. Let us greet him for that and wish him many future years of positive impact in Science (and why not in entomology?)!
* A sequel: Sequencing the Bacillus subtilis genome was a very difficult task at the time, because it required cloning DNA fragments in an Escherichia coli host, where B. subtilis DNA is often so highly expressed that it behaves as if it were toxic. As said above, the genome project was the result of the work of a consortium. Taken together, these constraints resulted in a sequence which could not be error-free. The genome was entirely resequenced and entirely reannotated in 2007-2008, with novel techniques which do not require cloning. The present sequence is supposed to have only a very low level of errors. It is accessible at the INSDC entry point EMBL-EBI with the accession number AL009126.3.
A few days after this article was accepted for publication our colleague and friend Frank Kunst passed away (April 2nd, 2009), and this article is dedicated to his memory.
V Barbe, S Cruveiller, F Kunst, P Lenoble, G
Meurice, A Sekowska, D Vallenet, TZ Wang, I Moszer, C Médigue, A
From a consortium sequence to a unified sequence: The Bacillus subtilis 168 reference genome a decade later
Microbiology-SGM (2009) in press
Frank Kunst was instrumental in setting up and managing the Bacillus subtilis genome project, and his tenacity enabled us to get the project on track, despite multiple administrative and technical obstacles. Without him, the project would not have succeeded. Illustrating the difficulty of the undertaking, the Bacillus subtilis genome sequence, completed in 1997, remained for five years the only sequence of an A+T rich model Firmicute. Frank Kunst initiated many other genome projects and also continued to be interested in the results of the B. subtilis resequencing project. Among his last contributions was his outstanding role in organising a sequencing team for the rapid identification of a new variant of the chikungunya virus that invaded the island of Reunion in southern France. His retirement was therefore particularly ill-timed and ill-advised at a time when we are certainly not done with (re-)emerging diseases. The forced retirement was the trigger for a dangerous depression that eventually killed him. It is essential to remember, when using the knowledge created by researchers we often tend to ignore, that discoveries are the result of a collective endeavour. Probably more than the fashionable scientists, Frank Kunst is therefore, in his own way, responsible for far more discoveries than many would like to make, or tend to attribute to their own merits.
Some references in genomics where Frank Kunst was a leader:
F Kunst, A Vassarotti, A Danchin
Organization of the European Bacillus subtilis genome sequencing project
Microbiology (1995) 141:249-255
F Kunst, N
Ogasawara, I Moszer, AM Albertini, G Alloni, V Azevedo, MG Bertero,
P Bessières, A Bolotin, S Borchert, R Borriss, L Boursier, A Brans,
M Braun, SC Brignell, S Bron, S Brouillet, CV Bruschi, B Caldwell, V
Capuano, NM Carter, SK Choi, JJ Codani, IF Connerton, NJ Cummings,
RA Daniel, F Denizot, KM Devine, A Düsterhöft, SD Ehrlich, PT
Emmerson, KD Entian, J Errington, C Fabret, E Ferrari, D Foulger, C
Fritz, M Fujita, Y Fujita, S Fuma, A Galizzi, N Galleron, SY Ghim, P
Glaser, A Goffeau, EJ Golightly, G Grandi, G Guiseppi, BJ Guy, K
Haga, J Haiech, CR Harwood, A Hénaut, H Hilbert, S Holsappel, S
Hosono, MF Hullo, M Itaya, L Jones, B Joris, D Karamata, Y Kasahara,
M Klaerr-Blanchard, C Klein, Y Kobayashi, P Koetter, G Koningstein,
S Krogh, M Kumano, K Kurita, A Lapidus, S Lardinois, J Lauber, V
Lazarevic, SM Lee, A Levine, H Liu, S Masuda, C Mauël, C Médigue, N
Medina, RP Mellado, M Mizuno, D Moesti, S Nakai, M Noback, D Noone,
M O'Reilly, K Ogawa, A Ogiwara, B Oudega, SH Park, V Parro, TM Pohl,
D Portetelle, S Porwollik, AM Prescott, E Presecan, P Pujic, B
Purnelle, G Rapoport, M Rey, S Reynolds, M Rieger, C Rivolta, E
Rocha, B Roche, M Rose, Y Sadaie, T Sato, E Scalan, S Schleich, R
Schroeter, F Scoffone, J Sekiguchi, A Sekowska, SJ Seror, P Serror,
BS Shin, B Soldo, A Sorokin, E Tacconi, T Takagi, H Takahashi, K
Takemaru, M Takeuchi, A Tamakoshi, T Tanaka, P Terpstra, A Tognoni,
V Tosato, S Uchiyama, M Vandenbol, F Vannier, A Vassarotti, A Viari,
R Wambutt, E Wedler, T Weitzenegger, P Winters, A Wipat, H Yamamoto,
K Yamane, K Yasumoto, K Yata, K Yoshida, HF Yoshikawa, E Zumstein, H
Yoshikawa, A Danchin
The complete genome sequence of the gram-positive bacterium Bacillus subtilis
Nature (1997) 390: 249-256
Comment by Chet Raymo in the Boston Globe
F Chetouani, P Glaser, F Kunst
DiffTool: building, visualizing and querying protein clusters
Bioinformatics (2002) 18: 1143-1144
P Glaser, C Rusniok, C Buchrieser, F
Chevalier, L Frangeul, T Msadek, M Zouine, E Couvé, L Lalioui, C
Poyart, P Trieu-Cuot, F Kunst
Genome sequence of Streptococcus agalactiae, a pathogen causing invasive neonatal disease
Mol Microbiol (2002) 45: 1499-1513
E Duchaud, C Rusniok, L Frangeul, C Buchrieser,
A Givaudan, S Taourit, S Bocs, C Boursaux-Eude, M Chandler, JF
Charles, E Dassa, R Derose, S Derzelle, G Freyssinet, S Gaudriault,
C Médigue, A Lanois, K Powell, P Siguier, R Vincent, V Wingate, M
Zouine, P Glaser, N Boemare, A Danchin, F Kunst
The genome sequence of the entomopathogenic bacterium Photorhabdus luminescens
Nat Biotechnol (2003) 21: 1307-1313
L Frangeul, P Glaser, C Rusniok, C Buchrieser,
E Duchaud, P Dehoux, F Kunst
CAAT-Box, Contigs-Assembly and Annotation Tool-Box for genome sequencing projects
Bioinformatics (2004) 20: 790-797
V Barbe, S Cruveiller, F Kunst, P Lenoble, G
Meurice, A Sekowska, D Vallenet, TZ Wang, I Moszer, C Médigue, A
From a consortium sequence to a unified sequence: The Bacillus subtilis 168 reference genome a decade later
Microbiology (2009) 155: 1758-1775
E Belda, A Sekowska, F Le Fèvre, A Morgat, D
Mornico, C Ouzounis, D Vallenet, C Médigue, A Danchin
An updated metabolic view of the Bacillus subtilis 168 genome
Microbiology (2013) 159: 757-770. doi: 10.1099/mic.0.064691-0
A further refined annotation of the sequence was published
twenty years after the first publication of the sequence
R Borriss, A Danchin, CR Harwood, C Médigue,
EPC Rocha, A Sekowska, D Vallenet
Bacillus subtilis, the model Gram-positive bacterium: 20 years of annotation refinement
Microb Biotechnol. (2018) 11: 3-17 doi: 10.1111/1751-7915.12461