Non medical biotechnology, a view from Hong Kong at the onset of Synthetic Biology

The challenge is not just to limit and, wherever possible revert emissions of pollutants and greenhouse gases, but also to replace environmentally costly processes based on fossil fuels with bio‐based sustainable alternatives. This task is not exclusively a scientific and technical one but will also require guidelines and regulations for the development and large‐scale deployment of this new type of bio‐based production.

Victor de Lorenzo et al.


The biotech connection: dream or reality?

CASB, rights reserved (August 2000)

Antoine Danchin 唐善 • 安東

Human beings like to play, and they would often like to get rich without working hard. Man is a social primate and is therefore very sensitive to collective behaviour, without much thought. In a growth context, success stories have an extremely powerful impact on the imagination and, unfortunately, this has profound implications for economic choices. Here we will put biotechnology in context and see, based on past experience, what can be predicted for the future.

A set of hard facts: genomics is a conceptual revolution in biology

For a long time, it was believed that life should be defined by the combination of a process of reproduction (or rather the production of offspring composed of individuals similar to their parents) and the association of a sensitive activity capable of modifying its environment (the 'vegetative' soul) and a motor activity (the 'animal' soul of the organism). Plants possessed only the former, while animals possessed the latter (and sometimes several others, as in the case of man, an animal endowed with reason...). This way of conceiving life was based on an animistic vision of a world composed of the four elements (Fire or Ether, Air, Water and Earth), rather than of atoms as we conceive them today. Today we can be much more precise.

Four processes and two formal laws are necessary to describe and explain what life is. They are intimately associated together in all living entities. These processes are: metabolism, compartmentalisation, memory and manipulation. The first two processes are organised by small molecules (comprising a few dozen atoms, with a carbon skeleton), while the last two are controlled by macromolecules (nucleic acids and proteins). Thus, two spatial scales are intertwined in all living processes, which develop on an intermediate scale between our macroscopic world and the microscopic world of atoms, the mesoscopic scale.

Because it is important to put the objects and processes of life into a new perspective, we will quickly go through the catalogue they constitute. Metabolism, as its Greek name suggests, is a state of flux. It represents the construction of molecules from smaller parts (anabolism) and the destruction of larger ones into smaller ones (catabolism), both of which go to make up the individual components of the living machine, and the energy needed to run it.
Compartmentalisation is necessary for metabolism. It separates the inside and outside of the organism. In fact, most living organisms can be distinguished by the way they have to manage compartmentalisation. Some have only a more or less complicated envelope, while others have many membranes and skins. The basic structure of compartmentalisation is a membrane made up of a double layer of small molecules of a particular chemical nature, the lipids. They form a film constituting a cell, defined at this level by its outer environment and its interior (the cytoplasm). The new science of genomics (see below) has discovered a new level of compartmentalisation in all cells, the 10-50 nanometre (billionth of a metre) scale.
When we talk about life today, we often only remember two other processes. The first is the memory of the past generation, which is passed on to subsequent generations, and the second is the general ability of organisms to manipulate a wide variety of objects inside and outside the cell. These processes are so important in describing and explaining life that they are generally the only ones considered today. In contrast to metabolism and compartmentalisation, the scale of their basic components is not that of small molecules, but of giant molecules composed of thousands, millions or even billions of atoms. The concept of macromolecule (linked to the concept of polymer, and the corresponding physics of 'soft' matter) defines a new era in physical-chemical studies. These latter processes organise and regulate the flow of information necessary for the manifestation of life.

Associated with these processes, memory and manipulation, are two fundamental laws, complementarity, accounting for the vertical transmission of memory, and coding, allowing the correspondence between memory and manipulation.

The autonomy of a living organism, its reproduction, development and survival presuppose the existence of something that is preserved from one generation to the next, a memory. This memory is often called the 'genetic heritage' (with the curious connotation present in the word 'heritage', which refers to the transmission of fortune), or the organism's genome. It perpetuates from generation to generation the information that controls the birth and development of the organism. In concrete terms, the material basis of the genome belongs to the field of biochemistry, that of nucleic acids, or more precisely deoxyribonucleic acids (DNA). These are giant molecules made up of a linear sequence of closely related building blocks drawn from a repertoire of four elements, the nucleotides (also called bases). This sequence is so regular that it can be summarised as a sequence of letters, as in the sentences of an alphabetical text. The metaphor of text (the 'genomic text', the 'text' of a gene, etc.) is so remarkably appropriate for a DNA sequence that, in what follows, we will not need to know more about the chemical nature of these building blocks. It is sufficient to recall that they are represented by four letters (each the initial of the actual chemical name): A, T, G and C. Biotechnology today is based on the manipulation of this text and the concretisation of the result of this manipulation, a 'polynucleotide'. This chain of nucleotides forms a helically wound filamentous structure in space. More precisely, the DNA molecule is made up of two complementary helices wound in a spiral, as we shall now see.

The information carried by DNA directs the synthesis of another chemical class of macromolecules, the proteins. These molecules are endowed with special chemical properties, which can be summarized under the general notion of "wielding", or manipulating, because they allow the interconversions of metabolism, the intricacies of control processes, and the constructions needed to create compartmentalization. The chemical reactions of metabolism are enormously accelerated, and made specific (catalyzed), by proteins. As in the case of DNA, proteins form a linear chain of building blocks, the amino-acids, like the letters in an alphabetical word. But this chain now comprises amino-acids of twenty different types (i.e. represented by the whole alphabet save six letters B, J, O, U, X, Z), not four. Depending on the nature of the amino-acid chain, the corresponding thread, a polypeptide, is folded in space into a variety of shapes, much more complicated than that of DNA. The polypeptide chain forms a specific three-dimensional architecture for each type of amino-acid sequence. This architecture is responsible for the function, but also for the localization of each protein within the cell.
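The two alphabets just described can be checked in a few lines. This is only an illustrative sketch: the constant `AMINO_ACIDS` holds the conventional one-letter codes of the twenty standard amino acids, and the helper `count_chains` is a hypothetical name introduced here to show how much larger the space of possible protein texts is than the space of DNA texts of the same length.

```python
import string

# Conventional one-letter codes of the twenty standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Letters of the Latin alphabet that are not used for a standard amino acid.
unused = sorted(set(string.ascii_uppercase) - set(AMINO_ACIDS))
print(len(AMINO_ACIDS), unused)  # 20 ['B', 'J', 'O', 'U', 'X', 'Z']


def count_chains(length: int, alphabet_size: int) -> int:
    """Number of distinct linear texts of a given length (hypothetical helper)."""
    return alphabet_size ** length


# Ten-letter texts: four-letter DNA alphabet versus twenty-letter protein alphabet.
print(count_chains(10, 4))   # 1048576
print(count_chains(10, 20))  # 10240000000000
```

The count only measures how many sequences can be written; it says nothing about which of them fold into a functional protein.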

Proteins play the role of actors in the processes underlying life. They link together a variety of objects that they transform, transport or combine. It is therefore of the utmost importance to understand their function. The correspondence DNA => proteins => functions is a central question that all biologists endeavour to answer. It must be stressed that this by no means implies that knowing the DNA sequence alone is sufficient to identify biological functions.

Between the memory embedded in nucleic acids and protein synthesis there exists a stringent correspondence: portions of the DNA sequence, a text written in a four-letter alphabet, are expressed in another alphabet as another linear text, the text of proteins. By no means is the whole text of DNA expressed as proteins: only certain segments of the DNA text specify proteins. There exists a syntax that establishes rules for the correspondence between these segments (which are, by extension, named genes, in reference to the postulated formal entities specifying heritable characters in Mendelian genetics). The correspondence between the formal gene of original genetics and the physical object of molecular biology is far from being devoid of ambiguities. When the gene is responsible for the colour of a fly's eye, the correspondence is often possible (but not always!), by associating the synthesis of an enzyme (the one that transforms appropriate chemicals into a red pigment) with the alteration of a gene affecting the Drosophila eye. But things become much more complicated when genes play the role of an orchestra's conductor, and control the building up of the harmonious functioning of a cell or of an organ.

A semantic level, as in linguistics, appears when considering the meaning of the protein in the cell, its function (which cannot be directly told from its amino-acid sequence, just as the meaning of a word is not visible in its graphical representation nor in its pronunciation). The "rewriting rules" of the genetic heritage, from DNA to proteins, are summarized in the correspondence known as the genetic code. They are defined by the sequence of nucleotides, associated with the corresponding sequence of amino-acid residues. It is this correspondence, which changes the operating level of the physical entities from that of the nucleic acids to that of the proteins, that allows one to use the metaphor of a program when referring to the formal content of DNA (the sequence of its bases) constituting the chromosomes. The concrete actualization of an individual corresponds to the consequences of the expression of its genetic program, in the form of these wielding objects that are proteins (which are able, in particular, to manipulate the very substance of the program, DNA, copying it and introducing variations in the copies). An important result of the correspondence between DNA and proteins is that it introduces both time and space into a representation of cellular life (or the life of the organism, if it is multicellular) which makes reference to itself. It is therefore necessary, already at this early stage of the formalization of life, to emphasize the physical separation, concrete and effective, between the program and its material support, DNA, and the objects that result from its expression, proteins.

In a purely formal and abstract way the central biological law consists in expressing the genomic information content with the help of a rewriting rule which permits one to go from the four-letter text of DNA to the twenty-letter text of the protein. This complex rule is decomposed into several stages. The first stage is transcription, which maintains a four-letter alphabet, and consists in copying fragments of DNA (A, T, G, C) into chemically similar macromolecules made of another type of nucleic acid, RNA (RiboNucleic Acid) (A, U, G, C). The correspondence rule is simple: to an A in DNA corresponds a U in RNA, to a T corresponds an A, to a G corresponds a C, and to a C corresponds a G. One must note that this transcription follows an oriented reading of the text (as we do with our alphabet, for example from left to right), the transcribed text (RNA) being written with the opposite orientation to that of the initial text (DNA).
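The transcription rule stated above (A to U, T to A, G to C, C to G, with the transcript written in the opposite orientation) can be sketched as follows. This is a minimal illustration, not a model of the actual enzymatic process; the names `DNA_TO_RNA` and `transcribe` are hypothetical, and the input is assumed to be the DNA strand that is actually copied, written in its own reading direction.

```python
# Pairing rule from the text: base read on the DNA strand -> base written in RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_dna: str) -> str:
    """Return the RNA text copied from a DNA strand (hypothetical helper).

    Each letter is rewritten with the pairing rule, and the result is reversed
    because the RNA is written with the opposite orientation to the DNA text.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_dna.upper()))


print(transcribe("TGCCAT"))  # AUGGCA
```

Reversing the string is what encodes the "opposite orientation" remark; without it the letters would be correct but the transcript would be read backwards.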

Transcripts, appropriately modified if needed, generally constitute messages that are further translated into proteins. They are for this reason named messenger RNAs. Translation uses the genetic code to match each sequence of three nucleotides in the sequence of the gene (each codon) to a given amino-acid residue in the polypeptide chain of the protein.
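The codon-by-codon reading described above can be sketched in a few lines. The table below is the standard genetic code written compactly (bases ordered U, C, A, G; `*` marks a stop codon); the names `CODON_TABLE` and `translate` are hypothetical, and the sketch assumes the message is already in the right reading frame, which a real cell has to establish for itself.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
RESIDUES = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), RESIDUES)
}


def translate(mrna: str) -> str:
    """Translate a messenger RNA, read from its first letter, until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":  # stop codon: the polypeptide chain ends here
            break
        protein.append(residue)
    return "".join(protein)


print(translate("AUGGCAUGGUAA"))  # MAW
```

The dictionary has 64 entries for 20 amino acids plus the stop signal, which makes the redundancy of the code visible at a glance.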

The genetic code is, apart from a few rare exceptions, universal. This means that the rule making amino-acids correspond to codons is the same in bacteria and in man. Of course this is not a trivial statement: it means that, when one possesses the text of a gene (its sequence), one knows at the same time the sequence of the protein that is specified by the gene. Coupled with some biological or biochemical knowledge, this allows one to predict the nature, the function and the location in the cell of the corresponding protein. This is therefore one of the first goals of sequencing whole genomes ("genome programs", for short): deciphering the DNA text, using the genetic code, first results in the construction of a catalogue of all the proteins coded by the genome. This ultimately allows one to make the inventory of all the functions that are necessary to establish life. The link with genetics therefore becomes simple.

The genetic program totally specifies, in the sequence of nucleotides and of the corresponding amino-acids, what is necessary for the construction, the functioning and the reproduction of every organism. But the self-consistency of this information (what allows a harmonious functioning of the organism, from the cell level to that of the entire organism, and provides it with a proper "style" that distinguishes it from every other organism) is still not well understood, if only because it results from contingent historical events. We are just beginning to identify some of the signals that control the building up in space and time of the macromolecules keeping and expressing the genetic program in the genome. And, as we shall see later, we do not yet know all the major gene functions, even though it was long believed that we did. Many obscure questions still remain to be answered to understand what life is. In particular, of course, we need to understand not only the genes and the products they specify (this is the first goal of genome programs), but also the chaining and general organization of the relationships they share. Biotechnology must take this entirely new view into consideration to develop in a positive way.

As the knowledge of bacterial genome sequences accumulates, more and more pieces of data suggest that there is a significant correlation between the distribution of genes along the chromosome and the physical architecture of the cell, suggesting that the map of the cell is in the chromosome. Considering sequences and experimental data indicative of cell compartmentalisation, mRNA folding and turnover, as well as known structural features of protein and membrane complexes, analysis of whole genome sequences strongly substantiates this hypothesis. If there is a correlation between the genome sequence and the cell architecture, it must derive from some selection pressure in the organisms growing in the wild. As a consequence, the underlying constraints should be optimised in genetically modified organisms if one is to expect high product yields. Consequences in terms of gene expression for biotechnology are straightforward: knocking genes out and in genomes should not be randomly performed, but should follow the rules of chromosome organisation. Also, the discovery of the ubiquitous presence of highly organized protein complexes entirely changes our view of drug targets, which must take this integrated feature of the cell into consideration for future development, suggesting that product based strategies will be doomed to fail if they only consider targets as isolated entities, as was (and still is often) done.

Another set of hard facts: the biotechnology sector, as a whole, lost money for the past 25 years

The biotechnology sector is very fashionable. It began to be developed about 25 years ago, at the onset of genetic engineering, when the first attempts to sequence DNA were proposed. In fact, it is important to place the development of this new science in the political context of the time (and in the present political context) to better understand what it means. It is also important to understand the curious declaration of Mr Blair and Mr Clinton, who, in contrast with the previous emphasis of the US on supporting the development of private companies, targeted Celera in the case of the Human Genome Project as not providing appropriate free access to the human genome sequences, asking this company in elliptic words to release its data as early and as freely as possible. We have to go back to the time of Hiroshima and Nagasaki, when Japan was crushed under atom bombs. At that time the USA and Japan began a collaborative effort to understand the consequences of irradiation on human populations. The Atomic Bomb Casualty Commission, which later on was absorbed into the Department of Energy, began to develop programs in human genetics which were to become the Human Genome Initiative. It was then thought that the common enemy was communism, and the concerted American-Japanese effort was meant to provide efficient scientific support against developments in communist countries. With the end of the Cold War, things changed dramatically, and, in fact, the new US enemy became Asian countries, and in particular Japan. It was repeatedly said at the beginning of the eighties that much of the inventive power of the West was more or less stolen by Japan (and more recently by China), and that this explained much of the industrial and trade difficulties witnessed by Western countries.
It was then fashionable to freeze concrete collaboration with Japan, and the US government explicitly decided to support, through government agencies (mostly DOE and NIH), the creation of new companies. This is the context which witnessed the creation of the first biotechnology companies, under the lead of well-known scientists, and with the funding help of the federal government (this is rarely emphasized, but up to 85% of the support of new biotechnology companies comes directly from the government, which is very far from what would be expected under conditions of free trade!). The main code words were "venture capital", "start-up", "spin-off" and the like. And indeed literally thousands of companies were born, developed and died in a very short period of time. Initially these companies were often headed by renowned scientists, and, not unexpectedly, they were complete failures (indeed, why would a scientist, even an excellent one, be a manager of genius?). Then came a time when it became apparent that running a company is a truly highly specialized activity, which requires adequate competence, and a second, then a third generation came out, usually with more and more success (though still with a high rate of failure). In fact, the main (unobtrusive) goal of the US government in this domain was not to create commercial success, but to prevent innovation from leaving the United States, with a concomitant development of all kinds of legal protections, and with the heated debate on the patenting of life as an obvious outcome (still unsettled, and subject to much discussion, as witnessed by the Blair-Clinton declaration of 14 March 2000).

It is in this context that the biotechnology sector asked for more and more investment money, while making no profit. As a matter of fact, if one adds up all the money put into the sector over the past 25 years, it has kept on losing money, as a whole. And perhaps 1998-1999 was the first year when the losses began to decrease. This is obviously not what investors think, and it does not correspond to a good return on the huge investment in the sector. Of course this may explain the recent negative reaction (especially following the Blair-Clinton declaration) which led biotechnology stocks to go down almost as fast as they had risen. Clearly the number of successes, in terms of genetic engineering, is very small. One could speak of the Human Growth Hormone as one of the best examples (but one should remember that the invention of the needleless injection system, not the quality of the protein, was one of the major reasons for the success of one specific brand of HGH!), with Erythropoietin, Plasminogen Activator and a few others demonstrating that something can indeed be done, but these are much outnumbered by the dead ends, plain failures and even dangerous ideas. Naturally the investors dream of hitting the jackpot, just by reading, once again, how cancer is cured (cures for cancer have appeared in the headlines of dailies almost every month since the early fifties) and how we shall be young for ever, and they keep on investing. But there may come a moment when the feast ends and reality catches up. Curiously, it is probably now that things may begin to turn in the right direction, with smaller, but more predictable profits (and losses). However, at least one sector linked to biotechnology has kept producing enormous profits: the sector of laboratory equipment and consumables.
This sector is clearly not very prone to be displayed on a TV show (in contrast to all kinds of dreams associated to the development of biotechnology), but it is growing steadily. And one could even think that the investment of PE in the construction of Celera Corp. is more in terms of selling sequencing machines (and presently computer programs as well as new equipment for expression profiling and proteome analysis) than in the Human Genome it is so keen to advertise...

Is there a future for biotechnology?

If one puts together the contents of the two sections above one faces a very simple question: what is the concrete gap between concepts in biology and return on industrial investment? The general (appropriate) idea that new concepts lead to high technology has been distorted in the mind of laymen (and investors are laymen) into the notion that high-tech means fast return in terms of cash-flow and profits. This is in fact justified by one type of high-tech development, that which stems from the use of crystalline silicon and all related objects, mostly based on two inventions, that of the transistor and that of the laser (with many other associated discoveries). It should be stressed that both these discoveries are based on purely academic and esoteric research (the same would hold for superconductivity) associated with fortunate serendipity. To these discoveries one should add the work of specialists in Number Theory, a typically "useless" domain of the mathematics of integers. Therefore, at the root of the extraordinary expansion of the world of computers and communication, we find purely non-profit-making, apparently useless, academic work. And we should notice that this is typically the type of research that investors would be reluctant to support! The return on public investment there has been extremely slow by the usual standards of economic return (it took some 2,500 years for arithmetic and algorithmics, with some acceleration in the past 70 years or so, and more than 50 years for the development of the laser and the transistor). However, once these discoveries had been implemented in actual industrial processes the development of applications has been incredibly fast, Moore's law being the best indicator of the exponential development of computers and associated devices during the past 30 years or so.
It should however be remarked here that we shall soon reach the physical limits of further development at the same exponential speed, unless entirely new discoveries, based on different principles (such as high-temperature superconductivity or the quantum computer), can actually be implemented at an industrial scale (which is not known at present). This means that there will soon be a change in the pace of development of the corresponding applications, with likely concomitant disappointment of investors, and probably huge changes in the distribution of wealth as new areas open while older ones close.

1. Drugs
Many other new concepts have been created in the past century, and several of them led to a variety of applications. This is, for example, the case of problems related to health care. What is the usual time gap here between the discovery of a concept and its concrete implementation? Clearly, much longer than in the case of silicon chips: instead of one doubling every 18 months or so, one could expect something like one doubling every 10 years or so. In fact much of the progress in medicine stems from the discovery of microbes, with the appropriate emphasis on hygiene and vaccination. Further progress was made when antibiotics were discovered (but the last new one dates some 20 years back, with only promises of new discoveries in the near future) and when miniaturisation and appropriate orthopaedics improved surgical techniques. At the present time it would be interesting to investigate the association between technical progress and iatrogenic ailments: in fact it is clear that a very large proportion of the need for medical care is caused by medical care itself. This explains both the discrepancies between countries in terms of general public use of medicine (in comparable countries, such as France and the UK for example), and the more and more frequent recourse to "alternative" medicine, which, in fact, is both ineffective and safe! The new drugs are really powerful, but they are almost never free of dangerous side effects, and this should be taken into consideration in all investment decisions. An important consequence is the more and more frequent involvement of insurance companies in the control of drug use, with a concomitant lengthening of the processes required to put a drug on the market.
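The contrast between the two doubling times quoted in this section (about 18 months for silicon chips, about 10 years for health-care applications) can be made concrete with a little arithmetic. The sketch below is illustrative only: the 30-year horizon is an assumption borrowed from the earlier remark on Moore's law, the function name `growth_factor` is hypothetical, and "doubling" is taken loosely as a doubling of whatever capability is being measured.

```python
def growth_factor(years: float, doubling_time_years: float) -> float:
    """Factor by which a quantity grows if it doubles every `doubling_time_years`."""
    return 2 ** (years / doubling_time_years)


HORIZON_YEARS = 30  # assumed horizon, for illustration only

chips = growth_factor(HORIZON_YEARS, 1.5)  # one doubling every 18 months
drugs = growth_factor(HORIZON_YEARS, 10)   # one doubling every 10 years

print(chips)  # 1048576.0  (2**20)
print(drugs)  # 8.0        (2**3)
```

Over the same three decades the first curve gains a factor of about a million and the second a factor of eight, which is the gap in expectations that the text warns investors about.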

2. Agro-food industry
We are fortunately not sick every day. But we have to eat every day, and this means a lot of biology. In fact fermentation processes are at the root of food treatment everywhere in the world, and there is still room for improvement (in particular by taking into account the habits of different peoples, as well as, in some cases, their genetic background). The main problem today is that laymen are afraid of the so-called genetically modified organisms. While this is understandable because of the incredibly thoughtless way GMOs have usually been created, there are now many new means to modify genes in a very clean and understandable way (e.g. by neat excision of the genes of interest). Investment here is slow, because one usually has to wait for constructions and crops, but genetic engineering has speeded up matters very much. One important trend is to consider food as very important for health, and to use it more or less as an everyday medicine (nutraceuticals). Biotechnology processes could also be used to improve not the quantity of food (which was of course an urgent need in an exponentially growing human population) but its quality. It is probably for want of considering this as a priority that the public reacted so negatively to GMOs. Room should be given to variety of taste and quality in texture, colour etc. This is certainly achievable, but it should be public-driven, and not immediate-profit-driven (incidentally, the lesson of recent experience is that public-driven is in fact always the best attitude, if only because even the producers are, in a way, not different from the public!). In general, research has to be developed to better understand the food needs of man and animals, as well as the long experience in this domain: biotechnology is perhaps the oldest industry, going back to the production of wine and beer, as well as delicatessen, cheese and all kinds of fermented vegetables.
There is no reason why this should come to an end rather than develop further, in a better controlled and understood way.

3. General manufacturing
One may be puzzled by this paragraph: what could be the link between biotechnology and the industry of aluminum, for example? Curiously enough, although not fashionable, this domain might well be one among the most promising in terms of investment for the future. Let us just give a few examples (there are many others of course).

Aluminum is produced by electrolysis of bauxite, a process which is extremely energy costly. In fact, a significant part of the energy losses comes from the clogging up of carbon electrodes. This process is triggered by organic material present in the aluminum ore: any degradation of this material prior to electrolysis would very much improve the process. Why not a process involving enzymes or bacteria?

Painting is a nightmare, especially when drops fall from the ceiling. It would be extremely useful to have an additive which would make the paint almost solid when unstirred, and very liquid when stirred (this is the behaviour of a "non-Newtonian" liquid). Well, this is possible, and somebody thought of it. There is a product, secreted by the bacterium Xanthomonas campestris, named xanthan gum, which has precisely this property, and it is already used. Of course it is easy to imagine the tons of product needed! Not very appealing for a TV show, but extremely lucrative in terms of industry...

Oil is still at a relatively low price, but this will not last for long, and it will be therefore more and more important to recover more oil from oil wells. Assisted recovery will need all kinds of surfactants. Clearly, microbes have a lot to say there, and this is still futuristic.

Cleaning (in house and in the environment): this is already much used, and will further develop in the future. We are just waiting for new original ideas.

Fine chemicals: there are often bottlenecks in the production of complex chemicals; the possibilities of bacterial metabolism have hardly been explored in this respect.

In general, the lesson is that one should first try to identify bottlenecks in any type of industrial process, and subsequently to submit this as a challenge to (micro) biologists. This is rarely done.

4. A domain of fashion: DNA chips and proteomics
Granting that progress in curing diseases will be slow, even if new developments arise, there remains one domain which will probably develop faster and faster, the domain of diagnostics. This word is taken here as a general approach to phenomenology, to the identification of any type of complex situation, be it a disease or the state of the environment. Progress in the knowledge of genomes will certainly improve our abilities in diagnostics. This has been witnessed, for example, with the Polymerase Chain Reaction (PCR), which has entirely changed the nature of forensic science (and of insurance). It allowed the building up of many new types of services, which will keep on developing in the future, especially as the PCR patents come into the public domain (clearly we have here a case where patenting has prevented much development of new ideas, because the royalties on PCR-based approaches were too high). Knowledge in appropriate diagnostics will require much inventive activity (PCR is but a technique, it does not tell what should be amplified) and one can expect new developments ranging from the signature of proprietary organisms (and DNA signatures in ink or paint have even been patented!) to the identification of microbes in the environment, in diseases or in hospitals. When multicriteria searches are required (this will be the case for monitoring the quality of air or water) one will need parallel measurements, and this leads to the development of one category of new fashionable products, "DNA chips" (on a plane surface) or "lab on chips" (in nano-wells). Here, once again, the technique is not the limitation; the limitation is both what is put on the chip, and how the data are analyzed. Curiously, not much emphasis is yet put on these major aspects, probably because of the attractive power of the word "chip", which was such a major success in the profit-making construction of computers.

However, there is one particular situation where DNA chips will be useful, and that is, once again, in the development of research laboratories (remember that in the biotechnology sector, the profitable sector was precisely that of consumables and equipment for laboratories...). Here, it will be important to make chips to help solve research problems. If the cost of this is not too high, we can expect a huge development in this field, as these chips will replace the old spectroscopic analysis techniques. It is likely that the companies that bet on this, and make appropriate development while keeping their cost as low as possible, will have a huge impact. They will, of course, need to continually improve their approaches, and provide associated services in terms of image analysis, statistical processing and database construction. This will initially require significant investment, but success is likely. A final area, which is still at the research stage, is the so-called proteome analysis, with the identification of proteins by mass spectrometry. Appropriate scaling up, combined with the construction of relevant databases, will certainly constitute a revolution in terms of diagnosis, both in hospitals and in research laboratories.

Finally, although it will take some time for Big Pharma to understand that drug targeting needs to change its philosophy from individual targets to integrated targets, it seems that expression profiling (microarrays) and proteome analysis will be the tools of choice for the future of applied biotechnology research. It is therefore time for a paradigm shift and to truly enter the genomics era (there is no such thing as 'post-genomics'!).

CHINESE ASSOCIATION FOR SCIENCE & BUSINESS