The Human Genome programme was born of a political initiative, but
very soon became inseparable from the commercial issues which
surround it. Since 1995 the spotlight has been on that thorn in the
flesh (and the mind) of the research community, Craig Venter. We have certainly
not yet seen the last of this "Joker in the pack".
Five years later the Synthetic Biology effort was
launched at MIT, and in parallel Craig Venter developed several
studies that may lead to the explicit demonstration that in a
living cell the genetic program is separated from the machine
which runs the program (the "chassis" in the Synthetic Biology vocabulary).
Cells can be seen as computers-making-computers and
information can be proposed to be
an authentic category of reality.
On March 14th 2000, Tony Blair and Bill Clinton published a short joint
declaration in which they "applaud the decision
by scientists working on the Human Genome Project to release raw fundamental
information about the human DNA sequence and its variants rapidly into
the public domain".
The declaration ends with an enigmatic phrase in which Blair and Clinton "commend
other scientists around the world to adopt this policy" of rapid
publication. It goes without saying that it is unusual for heads of state
to intervene in scientists'
decisions to publish. Incongruous as this is, the declaration
is a salutary if stern reminder that the Human Genome Project is based
on a political initiative, not a scientific one.
Immediately after Japan had been crushed by the atomic bombs, the USA
initiated a policy of intensive co-operation with the defeated country,
to hold off the growing threat of communism. Genetics had a central place
in the arena of scientific collaboration. Amongst other aims, this allowed
the Americans to salve their conscience by showing an interest in the
future of the residents of Hiroshima and Nagasaki. This explains how the
US Department of Energy (DOE), the federal agency (equivalent to a ministry)
responsible for the USA's nuclear programmes, very soon became involved
in research which at first sight appears well outside its natural jurisdiction.
The main areas of research were the mechanisms of mutagenesis and identifying
the effects of radiation on the genes. In 1947 this led to the creation
of the Atomic Bomb Casualty Commission (ABCC), financed by the Atomic
Energy Commission (which later became the DOE). Genetics made up an important
part of its research.
The mutagenic effects of radiation had been discovered by Hermann Joseph
Muller in 1927. For this work, which led him to make appallingly
alarmist predictions, he was awarded the Nobel Prize in 1946.
In 1954, the ABCC published a report by James Neel and William Schull
on the first genetic findings on more than 75 000 births in Hiroshima
and Nagasaki. The results were reassuring, but they dealt with only the
first generation of children born since the bomb, and were based on an
analysis which was still rudimentary. 1954 was only a year after the structure
and mode of replication of the DNA molecule had been discovered. A generation
later, more sophisticated studies which analysed protein mobility in an
electric field did not contradict this early work (1,2). But to be really
sure of what kind of mutations radiation might have caused, it was necessary
to find out what happens in the DNA sequence, right down to the
level of the nitrogen bases.
However, in the meantime the political context had changed beyond recognition.
In the mid-1980s the cold-war rhetoric gave way to concern about
a new adversary. Japan's economic power threatened America's
leadership in technology. The federal agencies were mobilised to encourage
the setting up of new companies and to protect intellectual and industrial
property.
It is impossible to give a precise date for the beginning of the Human
Genome Project. Some writers date it from the Alta summit in Utah in December
1984, organised by the DOE. The aim of this summit, in which James Neel
took part, was to discuss what strategies should be used to detect mutations
in the generations after Hiroshima and Nagasaki, in the context of the
DOE's vision for life sciences. Discussion focused on the state-of-the-art
technologies which the DOE would be able to deploy, and all sorts of potential
models for identifying mutations were reviewed. Direct sequencing of the
DNA involved was already considered to be one of the most obvious methods.
The original motives were soon forgotten.
In fact, the Human Genome Project could not have been imagined without
efficient DNA sequencing and the constant progress that has been made
in this technique. Neither would it have been possible without the systematic
development of computer science, both in terms of hardware and software.
This is another aspect where the DOE's contribution is most obvious.
In the summer of 1975, Frederick Sanger of the Medical Research Council
(MRC) in Cambridge had announced that he had found a way to identify a
gene's sequence (the chain of bases which make it up) by reproducing
DNA replication in a test tube. Immediately several laboratories in Europe,
the USA and Japan tried their hand at automating these methods. "Fluorescent" sequencing,
introduced by Leroy Hood's team at Caltech in 1986, was a remarkable
In 1981 Hood had set up Applied BioSystems, which specialised in laboratory
equipment for molecular biology. This company developed at remarkable
speed, thanks to sales of its DNA sequencers, until it was bought by Perkin-Elmer
in 1997, just as its model 3700 capillary sequencer was coming onto the
market. This sequencer was behind the considerable acceleration in sequencing
speed worldwide. The technique, imitated elsewhere in the world, has continued
to be improved and developed both by its promoters and by its competitors.
It led to a ten-fold improvement in laboratory performance between 1995
and the end of 1997, and to a further ten-fold improvement by the end of the century.
The DOE's investigators contributed to another improvement: the
use of "cell sorter" methods, where in a mixture of cells, those
marked by the presence of a fluorescent molecule can be separated from
unmarked cells. This method was extended to chromosome sorting, and it
thus became possible to purify human chromosomes and to establish specific
DNA banks for each chromosome. As there are 22, plus the two sex chromosomes,
this meant a considerable reduction in the size of sequencing projects.
Using this method, the French national sequencing centre at Evry, near
Paris, is now finishing the sequencing of chromosome 14, which at just
under 100 megabases (1 megabase = 1 million bases) will represent France's
contribution (only 3%) to the international project.
This progress would not have been possible without parallel developments
in computer memory and calculating speed. As early as 1978, it had been
clear that computer support would rapidly become necessary, to allow the
scientific community to build the sequences into a continuous text which
they could then interpret. A study undertaken by Rockefeller University
and the European Molecular Biology Laboratory (EMBL) at Heidelberg led
to the idea of the creation of a databank for gene sequences. It became
clear very early on that the possession of this information was of vital
importance, with political implications. Frequent discussions, sometimes
heated, took place between Europe and the USA, to decide where these databanks
would be, and how they would be structured. Who would be responsible for
sequence quality: its producer or the database? Who would produce
the annotations? This is clearly no small matter: a bad annotation
is tantamount to disinformation. It is unfortunately now clear that major
annotation errors have spread via data banks through the entire scientific
community. Two banks were established, in competition but also in touch
with each other: one at Heidelberg, the other, the first GenBank, at one of the
DOE's laboratories, the Los Alamos National Laboratory (LANL). After
the Alta summit, Robert Sinsheimer, then Chancellor of the University
of California at Santa Cruz, proposed the genome project as the basis of an appeal for funds.
He brought together a group of well-known investigators to discuss the
idea in May of the following year (1985), but he was unable to raise the
funds needed. Independently, Renato Dulbecco, of the famous Salk Institute,
proposed using the human genome sequence to discover the causes of cancer.
He published this idea in Science in 1986. (4)
The same idea was being developed at the same time at France's
Centre for the Study of Human Polymorphism (the Centre d'étude
du polymorphisme humain or CEPH), set up by Jean Dausset to collect the
entire genetic blueprint of families whose genealogy was well known. Daniel
Cohen, a very active investigator at Dausset's laboratory, who had
realised the value of the genetic heritage that this unique collection
represented, developed an industrial-scale approach which would result
in the sequencing of large segments of the genome. Finally, Charles DeLisi
proposed, independently, that this project should be carried out at the
DOE (5). DeLisi, who had worked on computational models of biology at
the National Cancer Institute, one of the National Institutes of Health
(NIH), had taken on the task of understanding the meaning of the sequences,
and had worked on this with investigators from LANL.
DeLisi was at the time one of the project leaders in biological research
at the DOE, which enabled him to cost the project, and make the first
practical propositions. In 1987 he persuaded the DOE to redirect 5.5 million
dollars intended for other projects to his programme. In 1988, under the
influence of Pete Domenici, the Senator for New Mexico, the programme
was considered by the American Senate and brought into the White House
discussions on large-scale scientific projects. David Galas, a pioneer
in molecular genetics, soon became a keen supporter.
In France, Daniel Cohen and Jean Dausset obtained a preliminary budget
heading under which to explore the feasibility of the project, using the
CEPH's human DNA libraries. More importantly, Cohen managed to persuade
the Minister of Research that the CEPH, with its private structure, could
begin a sequencing programme more easily than public bodies could, if
it had direct help from the ministry. From 1989 onwards, the CEPH
was recruiting scientists and engineers, and purchasing robots and industrial
equipment, to begin to map and sequence the human genome on a large scale.
At the same time, the EEC granted Eureka funds to the CEPH and Bertin,
a private company (in association with two British partners) with the
aim of creating an industrial supplier for the necessary equipment. This
project, called Labimap, was to supply oligonucleotide synthesisers, robots
and reactors for automatic plasmid preparation, sets for large-scale molecular
hybridisation, and miniature electrophoresis gels for sequencing. Daniel
Cohen had already seen quite clearly that genome projects would have to
develop molecular biology techniques on a large scale. It would be interesting
to analyse Labimap's total failure, as it could have given Europe
the equivalent of what Applied BioSystems and Perkin-Elmer gave the USA.
Progress was too slow to suit Daniel Cohen. By a happy coincidence,
Bernard Barataud, the energetic president of the French Muscular Dystrophy
Association (l'Association française contre les myopathies)
had organised an unexpectedly successful Telethon in France in 1987. He
planned to use the money collected each year to finance an ambitious programme
in human genetics. Cohen realised just how far he could turn this to his
advantage, and he convinced Barataud that sequencing the human genome
would speed up the identification of genetic diseases considerably. Barataud
chose Evry, not far from where he lived, as the site for the substantial
laboratories which would be needed. The first Genethon was established
at the end of 1990, with the first prototypes built by Bertin for Labimap.
It very soon became clear that it was too early to sequence the human
genome, given the size of the task (a huge number of large chromosome
segments have to be cloned, which is very difficult). So to begin with,
both in France and elsewhere, the projects were reoriented towards gene
mapping (locating markers spread out along the chromosomes).
Genethon had three major programmes: under Daniel Cohen, the construction of
Yeast Artificial Chromosome (YAC) banks carrying random fragments of human
chromosomes; under Jean Weissenbach, then at the Pasteur Institute, the construction
of a detailed physical map; and under Charles Auffray, the creation of
a complete set of human complementary DNA. To international astonishment,
in spring 1992 Daniel Cohen presented the first complete map of chromosome
21 at the annual meeting of the Cold Spring Harbor Laboratory in the USA,
and in the autumn of the same year he published the first contiguous sequence
map of YACs, containing up to 1 megabase of human DNA. This map, made
using the computer facilities of INRIA (with Guy Vaysseix and Jean-Jacques
Codani), placed France at the forefront of genomics. (6) This is not the
place to discuss the reasons for the rapid collapse of the French lead,
except to say that it was largely the result of a serious error of scientific
judgement on the part of certain decision-makers, acting behind the scenes,
and of a skilful manipulation of the ministerial structure at the time.
At the same time, a fierce struggle was going on for the ownership,
administration and scientific management of GenBank, the database which
holds all the data on DNA sequences worldwide and which had been taken
over by the National Institutes of Health (NIH). This was between the
DOE, which had founded GenBank at the Los Alamos laboratory which it financed,
and the NIH, which financed the National Center for Biotechnology Information
(NCBI). The DOE went as far as to finance a rival bank, Genome Sequence
Data Base (GSDB). This bank was managed by the National Center for Genome
Resources, a non-profit-making foundation created at the end of 1992 on
the initiative of Senator Domenici. The fact that data entry into the
different banks was not synchronised, and the inconsistent labelling of
the data they held, put scientists all over the world in an almost impossible
Clearly it is not possible to look into the detail of these power struggles
here. As often happens, they appeared as the dominant players began to
lose ground. This was the case with the DOE, which was witnessing a slowdown
in research programmes based on nuclear energy, and ran the risk of soon
finding itself bled dry financially if it could not put forward to the
federal government a long-term programme which would be expensive in terms
of manpower and funds. So the evaluation of its projects took place in
a highly charged atmosphere, not very conducive to that national and
international collaboration which would certainly have led to the success
of the project in a much shorter time. Luckily the situation improved
in 1997 when the bank financed by the DOE turned commercial, ending its
position as a competitor to GenBank. The informal association between
GenBank and its European and Japanese counterparts, which had existed
since 1990 and which later became official, also brought stability. On
the European side were the EMBL, first at Heidelberg, then at its outstation
at Hinxton, south of Cambridge, and the European Bioinformatics Institute
(EBI), and on the Japanese side the DNA Data Bank of Japan (DDBJ) at the
National Institute of Genetics (NIG) at Mishima. Effectively, there is
now one single DNA sequence data bank for the whole world, with three
entry points at the NCBI, the EBI and the NIG.
In reality, it was not the end of the 1980s but 1995 which was the most
significant turning point for the Human Genome programme, not through
its creation in the form of the Human Genome Initiative, but because of
an outsider who burst onto the scene. This turning point stemmed from
a method similar to that used by Daniel Cohen, but more successful. In
that year, Craig Venter and his colleagues at The Institute for Genomic
Research (TIGR) near Washington published the sequences of two very small
bacterial genomes one after the other in Science. Craig Venter was not
particularly interested in bacteria. He had been an NIH investigator.
With his interest in technological progress, he was tempted by the challenge
of sequencing the human genome very early on, after having been involved
in locating the gene for a neurotransmitter receptor on human chromosome
15, right at the beginning of the 1990s. He immediately realised that
the scale on which molecular biologists were used to working would have
to change if projects of this kind were to be successful. They would have
to "think big", on an industrial scale. Craig Venter also understood
that working with public bodies involved a long and difficult struggle
with red tape, even in the USA, and that if he wanted rapid success that
route was out of the question. He would have to create a tailor-made organisation,
from scratch. Cleverly, instead of setting up just one, he created two,
together with his colleague William Haseltine. Venter was to manage the
non-profit-making organisation, TIGR, while Haseltine would manage the
commercial organisation, Human Genome Sciences (HGS), which had first
industrial property rights over the whole of TIGR's work. TIGR would
thus benefit not only from advances of funds from HGS's capital,
but also from the contracts it entered into with those two old rivals
the NIH and the DOE.
Craig Venter also understood intuitively that, faced with a riot of
different genome sequencing projects and all the battles and ego-trips
they brought with them, it was essential to establish a presence, and
a reputation for reliability, very quickly. After the first meeting on
the sequencing of micro-organisms organised by David Galas, he understood
that he needed a powerful computer infrastructure. He also realised that
TIGR's industrial-scale set-up meant he could contemplate sequencing
the whole of a bacterial genome, provided it was not too big, by using
a random fragmentation procedure called the "shotgun" technique.
Hamilton Smith of Johns Hopkins University in Baltimore, close to TIGR,
also realised this. He had shared the Nobel prize with Werner Arber and
Daniel Nathans, for their discovery of restriction enzymes, the enzymes
which had made the birth of genetic engineering possible. These enzymes
enable scientists to cut DNA at specific points, and thus to juggle the "cut
and paste" methods which are the basis of molecular biology. As a
bacteriologist and biochemist, Smith was familiar with a pathogenic bacterium, Haemophilus
influenzae, which produces restriction enzymes. With his usual flair,
Craig Venter realised that he could soon be the first to have sequenced
a complete genome!
And so, at a meeting organised by the Wellcome Trust in April 1995 at
Dormy House near Oxford, Craig Venter announced that he and his team of
about forty had succeeded in sequencing the entire genome of H. influenzae.
He also announced that he had practically finished the sequence of the
smallest known genome of any living organism, that of Mycoplasma genitalium.
Even though these were very small genomes, it was still quite an achievement.
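The "shotgun" idea mentioned above (break the genome into random fragments, read each one, then let a computer stitch the reads back together wherever they overlap) can be illustrated with a toy sketch. This is a naive greedy overlap merge written purely for illustration. It is an assumption of this example, not the software TIGR actually used, and the function names (shotgun_reads, overlap, greedy_assemble) are hypothetical. Real genomes contain long repeats that defeat so simple a method.

```python
import random


def shotgun_reads(genome: str, read_len: int, n_reads: int, seed: int = 0) -> list[str]:
    """Cut n_reads fragments of length read_len at random positions (hypothetical helper)."""
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]


def overlap(a: str, b: str, min_overlap: int) -> int:
    """Length of the longest suffix of a that is also a prefix of b (0 if shorter than min_overlap)."""
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the largest overlap; return the remaining contigs."""
    # Remove duplicates, then reads wholly contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_overlap)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge: the contigs do not overlap
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [r for idx, r in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
        # Drop any read now fully covered by the merged contig.
        contigs = [r for r in contigs if r == merged or r not in merged]
    return contigs


if __name__ == "__main__":
    # Three hand-picked overlapping reads from a made-up 14-base "genome".
    print(greedy_assemble(["ATGCGTA", "CGTACGT", "ACGTTAGC"]))
```

In this hand-picked case the three reads merge back into a single contig. With random reads from shotgun_reads, the result depends on whether the fragments happen to cover the whole sequence, which is why real projects read each region many times over.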
Meanwhile, the Human Genome Project was getting organised. It involved
not only the two principal American federal agencies, the DOE and the
NIH, but also many other countries from around the world. In Britain,
the powerful Wellcome Trust, a private charitable foundation, had founded
the Sanger Centre at Hinxton, south of Cambridge, in 1994, where later
an outstation of the EMBL was set up. An informal international association,
the Human Genome Organization, shared out as best it could the task of
sequencing the human genome, chromosome by chromosome, between laboratories
around the world, with a target date of late 2005. The story of the power
struggles and dramatic exploits which, one after the other, left their
mark on the way the programme was organised, would fill a book. A look
at the comments in almost every issue of Science and Nature over the last
five years, as well as the information given on various websites (see,
for example, www.larecherche.fr),
will show not only the struggle between the federal agencies, but also
between personalities within those agencies, and between countries.
In 1998, in one of those dramatic coups he is so good at, Craig Venter
once again changed the face of genomics. He is the reason behind the unexpected
Blair-Clinton declaration. On June 24th 1997, Venter broke off the agreement
between TIGR and HGS. He freed himself to scale up his approach to genomics
and in early 1998 he announced that together with Perkin-Elmer he had
created a new company, Celera ("fast" in Latin), with the aim
of sequencing the human genome within three years. The plan was to use
the "shotgun" approach, without preliminary separation of the
chromosomes, using supercomputers to reassemble the fragments, thanks
to a high-speed algorithm invented by Gene Myers. The sequencing was to
be carried out using several hundred of Perkin-Elmer's capillary
sequencers, and the planned "coverage" of the genome was ten-fold,
that is 30 billion bases. A quick calculation shows that this figure is
not impossible, but is difficult to reach. One machine can sequence 96
templates in three hours, reading more than 500 bases (these machines
now routinely go up to 650 to 700 bases). Allowing for poor quality templates,
this means about 300 000 bases per machine per day, or some 300 megabases
per machine in three years. In
addition, Venter proposed to demonstrate the feasibility of his approach
by sequencing the entire genome (nearly 150 megabases) of the geneticist's
favourite subject, the fruit-fly Drosophila, in collaboration with Gerry
Rubin's group at Berkeley, by the end of 1999. They pulled it off.
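The "quick calculation" above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (96 templates per run, three-hour runs, about 500 bases per read, a 30-billion-base target, three years). That the machines run around the clock and that a year has 365 days are assumptions of this example, and the names are hypothetical.

```python
# Back-of-envelope check of the sequencing arithmetic quoted in the text.
TEMPLATES_PER_RUN = 96          # from the text
HOURS_PER_RUN = 3               # from the text
BASES_PER_READ = 500            # from the text ("more than 500 bases")
TARGET_BASES = 30_000_000_000   # ten-fold coverage, from the text
DAYS = 3 * 365                  # ASSUMPTION: three years of 365 days, no downtime


def raw_bases_per_day() -> int:
    """Bases one machine reads per day, before discounting poor templates."""
    runs_per_day = 24 // HOURS_PER_RUN  # ASSUMPTION: round-the-clock operation
    return TEMPLATES_PER_RUN * BASES_PER_READ * runs_per_day


def machines_needed(usable_bases_per_day: int) -> float:
    """Number of machines required to reach TARGET_BASES within DAYS."""
    bases_per_machine = usable_bases_per_day * DAYS
    return TARGET_BASES / bases_per_machine


if __name__ == "__main__":
    print("raw bases per machine per day:", raw_bases_per_day())
    print("machines needed at 300 000 usable bases/day:", round(machines_needed(300_000)))
```

Under these assumptions a single machine reads 384 000 raw bases per day, consistent with the text's 300 000 once poor templates are discounted, and roughly 90 machines running for three years would reach 30 billion bases. The "several hundred" sequencers planned would therefore leave a margin, which fits the verdict of "not impossible, but difficult".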
It is worth pointing out again Venter's remarkably clear thinking.
It has long been clear that the Drosophila genome should have been chosen
in the first place as the model organism. Not only are the genetics of
this insect by far the best known in the world, but also its development
is, strange as it might seem, remarkably similar to that of vertebrates
such as the mouse or man. Venter could thus rely on data obtained from
Drosophila to help him identify many of the most important human genes,
at least as a first approximation. At the same time as he perfected the
scaling up of his shotgun technique, he could be preparing to annotate
the human genome. Celera is a private company and its aim is obviously
to make a profit. Venter therefore announced that he would not immediately
release his sequences into the public domain, and that in any case any
use of his sequences for profit would attract royalties. In the circumstances,
the organisations engaged in the Human Genome Project, the Sanger Centre,
which planned to produce a third of the sequences, and the groups involved
in Europe and Japan, reacted strongly. They began by speeding up sequence
production considerably, aiming to produce a "working draft",
a "coverage" of the unassembled genome, by summer 2000, and
the complete sequence by 2003, two years before the date originally proposed.
Very soon, the consortium published the sequence of chromosome 22. (9)
They also launched a high-profile public debate about the fact that in
assembling its sequences, Celera made extensive use of public domain sequences,
and that for the company to want to make a profit from this constituted
an abuse. In March 2000, letters between Francis Collins and Craig Venter
were passed to the national newspapers, in an attempt to force Venter
to cooperate with the public project and to make his sequences available
to investigators throughout the world without charge. It is to this exchange
of letters that the Blair-Clinton declaration alludes.
At the end of this all-too-short account, what do we find? An explosive
mixture of the values which make up science: not only the love of
knowledge of course, but also political rivalry, the search for glory,
and the intrusion of the commercial world. In the beginning it was an
entirely American game, inspired by the struggle against communism, then
against the technological supremacy, real or imagined, of Japan. It led
to twenty years of support for innovative private business, in a policy
which is nowadays imitated on this side of the Atlantic. This makes the
Blair-Clinton declaration, which seems to take the opposite view to the
previously established position, all the more surprising, as if suddenly
the free market and its corollary, the protection of intellectual property,
were considered a threat to free access to knowledge.
The most widely-shared value today is the profit motive. There is already
an area in which Perkin-Elmer is quietly piling up the profits: the
sale of its sequencers and other laboratory equipment. The stir that Celera
has caused has been an immense success on that score, if on no other.
From this point of view, it is not the gene sequences themselves which
are valuable, but their annotation, the discovery of their meaning, and
the inventions which may result from all this. Patenting genes does not
make sense, not for moral reasons (after all, we patent arms, which does
not necessarily mean that we agree with their use) but because they are not
something which has been "invented". On the other hand, understanding a biological
function can lead to the discovery of a therapeutic target and thus to
a treatment. Equally, awareness of a function can lead to the discovery
of a basis for diagnosis, and, yes, the use of this could be patented.
Gaining time means gaining a better chance to make intelligent annotations
on the genome texts, and this is what Celera is doing. The motivation
of those who prepared the Blair-Clinton declaration is not sound. What
really needs to be monitored is how knowledge of the genomes will be used
in the future. That is where the real moral problem lies, but who is paying
any attention? A. D.
• Text of the exchanges between the NIH and Celera, plus the Blair-Clinton
Declaration at http://www.bioinform.com
• Other sites of interest : see BoOks
• For the scientific reasons for genome sequencing, see Antoine Danchin, La
Barque de Delphes, Odile Jacob, 1998, updated and adapted as The
Delphic Boat: What Genomes Tell Us, Harvard University Press, 2003.
(1) J. Neel, Physician to the Gene Pool: Genetic Lessons and Other Stories, Wiley, 1994.
(2) W.J. Schull, Song among the Ruins, Harvard University Press.
(3) L. Roberts, « Watson versus Japan », Science, 246, 576, 1989.
(4) R. Dulbecco, « A turning point in cancer research: sequencing the human genome », Science, 231, 1055, 1986.
(5) C. DeLisi, « The Human Genome Project », American Scientist, 76, 488, 1988.
(6) P. Rabinow, French DNA: Trouble in Purgatory, The University of Chicago Press.
(7) Read the 6 issues published by the Groupement de recherche et d'études des génomes (GREG), La Lettre du Greg.
(8) A. Danchin, « A brief history of genome research and bioinformatics in France », Bioinformatics, 16, 65, 2000.
(9) I. Dunham, N. Shimizu, B.A. Roe, S. Chissoe, et al., « The DNA sequence of human chromosome 22 », Nature, 402, 489, 1999.