In praise of diversity: the orchestra of genetics

Praise the disheveled, praise the sleek;
Austerity and hearts-and-flowers;
People who turn the other cheek
And extroverts who take cold showers;
Saints we can name a holy day for
And infidels the saints can pray for.

Praise youth for pulling things apart,
Toppling the idols, breaking leases;
Then from the upset apple-cart
Praise oldsters picking up the pieces.
Praise wisdom, hard to be a friend to,
And folly one can condescend to.

In Praise of Diversity
Phyllis McGinley

The human genome orchestra

This page is an adaptation of an article published in French in the magazine BoOks, n°101, in October 2019: « Races, pourquoi le débat est faussé ». We propose the image of an orchestra and the score it plays as a metaphor for the link between genotype and phenotype. We also recall that we have known since the time of Vicq d'Azyr that it is possible to form well-defined classes whose members do not all share every trait used to define the class.

Constructing classes to understand reality

Science did not appear in its modern form until 2,500 years ago. The conditions of its birth presupposed that the human animal should organise a coherent view of itself and of what it could recognise in its environment. To this end, it was first necessary to classify events or objects of interest into recognisable families, and then to study and understand the relationships between them. This is how the ancient astronomers began what was to become cosmology, by establishing a catalogue of stars. By forming provisional classes, they recorded the position of these luminaries in the sky (still dark at the time), their luminosity and their local density. Because they had noticed the regularity of their course through the cycle of the seasons, astronomers discovered the moving stars, the planets. Science begins when we notice the return of things. Later, with the birth of optics, it became possible to characterise stars much better, notably by their colour and shape (some are galaxies). To go further and investigate the forces at work in the organisation of the Universe, it became necessary to group these countless objects into well-defined classes. This is when true cosmology was born. Similarly, living species, collected and recognised one by one, as at the birth of botany in treatises such as Theophrastus' The History of the Plants Περὶ φυτῶν ἱστορία (fourth century BC), only began to make sense when they were put into classes, according to their kinship. For this purpose, researchers had to find descriptive characters that were fairly constant among the individuals of a group of plants, so that the group could be identified as an authentic type.

It was soon realised that the formation of classes depended on the number of characters used to describe the elements of the class. It is simple to catalogue the individuals belonging to a population using a single character (for example, for peas, those with white flowers and those with coloured flowers). The situation became more complicated when several characters were considered simultaneously. Classification methods then had to find ways to organise individuals, labelled with a very large number of characters (both macroscopic and microscopic, if relevant analysis methods are available), into natural classes. To form a class, a measure of distance, based on the different characters as they appear in the individuals under study, had to be defined. It was in the second half of the 18th century that scientific discussion focused on this issue. Félix Vicq d'Azyr, a precursor of comparative anatomy and a keen animal physiologist, was interested in the characters used for classification, as can be seen in the articles he signed in the Great Encyclopaedia (edited by Diderot and d'Alembert): « The second way of disposing plants is called method by botanists. It is an arrangement based on the concurrence of several characters taken from the most essential parts of the plants, with the aid of which we succeed in bringing together those which resemble each other the most; and then build up what are called natural families. » (1) Elsewhere, he observed that to form a natural class it is not necessary that all the individuals of the class share all the elements used to define it: « A natural class results from the assembly of a number of species which are held together by a greater number of features than exist between each of them and the species of the other classes. 
For an individual to be part of a class, considered from this point of view, it is not necessary for it to gather all the characters; it suffices that it offers the greatest number of them; whence it follows that it would be possible for a class to be very natural, and that there was not a single character common to all the species that compose it. » (2) This observation, which is counter-intuitive for those unfamiliar with the scientific method, is at the heart of all current classification methods.
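Vicq d'Azyr's observation can be checked on a toy example. The sketch below is only an illustration: the species names and the character labels "a" to "h" are invented, and counting shared characters is just one simple way, assumed here, of measuring resemblance. It builds a class in which every pair of species shares two characters, more than any of them shares with a species of the other class, although no single character is common to all its members.

```python
from itertools import combinations

# Hypothetical species, each described by a set of invented character labels.
class_a = {
    "sp1": {"a", "b", "c"},
    "sp2": {"b", "c", "d"},
    "sp3": {"a", "c", "d"},
    "sp4": {"a", "b", "d"},
}
class_b = {
    "sp5": {"e", "f", "g"},
    "sp6": {"f", "g", "h"},
    "sp7": {"e", "g", "h", "a"},  # shares character "a" with part of class A
}


def shared(x, y):
    """Number of characters two species have in common."""
    return len(x & y)


def common_to_all(members):
    """Characters present in every species of a class."""
    return set.intersection(*members.values())


def min_within(members):
    """Smallest number of characters shared by any two species of a class."""
    return min(shared(members[p], members[q]) for p, q in combinations(members, 2))


def max_between(first, second):
    """Largest number of characters shared across two classes."""
    return max(shared(x, y) for x in first.values() for y in second.values())


print("characters common to all of class A:", common_to_all(class_a))
print("minimum shared within class A:", min_within(class_a))
print("maximum shared between A and B:", max_between(class_a, class_b))
```

With these invented data, class A is "very natural" in Vicq d'Azyr's sense: its members resemble one another more than they resemble any member of class B, yet the set of characters common to all of them is empty.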

Once the natural classes have been formed, we can go further by observing that certain families of characteristics are simultaneously present or simultaneously absent in the entity of interest. As a result, we may want to group them together and see how, from this grouping, new classes can be formed. This is how botany was born. It is still illustrated in this early form in works available in the archives of many British families, such as the Handbook of the British Flora: A description of the Flowering Plants and Ferns Indigenous to or Naturalised in the British Isles by George Bentham (1800-1884; first edition 1858, widely present in 1918 and continuously reprinted since) or, for those familiar with Hong Kong, the Flora Hongkongensis published in 1861.

Shortly before Vicq d'Azyr's reflection, Carl Linnaeus had had the insight to notice that the most important characters leading to a classification of living beings were those responsible for their generation, their sexual organs (stamens and pistil in plants). It then became apparent that the natural classes of living organisms were distinguished by a unique original character, the requirement that their members be able to interbreed. It was then possible to define what we now recognise as a species. Subsequently, the grouping of living organisms by proximity, using increasingly general characters, led to the idea of genus (a coherent group of species), then families and more general classes. For example, the common pea (Pisum sativum L.) is part of the sativum species of the genus Pisum (which is not italicised when referring to the genus alone) and of the family Leguminosae. In this way, plants have been grouped into more and more general classes, up to the separation of plants and animals, for example. We can also go in the opposite direction, attempt a finer classification, and identify races within species.

Alongside scientific description as a prelude to the search for the causes of what defines living things, and in accordance with the interests of the time, came the question of the 'usefulness' of natural classes, in a world progressively dominated by a certain idea of human well-being, based on pleasure or the possession of various goods. This idea introduced a hidden - and dangerous because hidden - ideological bias into the classifications. Some species - plants, animals or even microbes - are considered useful and have been domesticated throughout human history. Depending on the location and the tastes of their owners, individuals of the same species have been selected over the generations to express different characteristics, thanks to the control of their reproduction. This was initially based on rather haphazard choices, but became more and more precise as the mechanisms of heredity came to be understood and informed selection was progressively refined. This selection is what makes it possible to choose cows for their milk or their meat, and it is how many well-characterised cattle breeds have been created and maintained over the centuries. Of course, as we remain within the same species, individuals of different breeds interbreed, but this affects the properties of their offspring, and their corresponding 'utility'. To go further, it is important to understand how individuals of the same species can be different, and what this means in terms of their biological nature and the future of their offspring.

Human polymorphism

Unlike many animals (but certainly not all: think of the rat, for example), Man is exceptionally adaptable. This is evident in the variety of places where the human animal can live. However, it is often forgotten that the most important reason for this adaptability is man's ability to resist disease. We live today because our ancestors resisted smallpox, the plague and more recently influenza. Living in hot, dry, cold and humid regions means that we encounter an infinite variety of pathogens, bacteria, viruses and parasites, which differ from place to place. This remarkable characteristic, as we know it today, comes from the fact that evolution has equipped us with an extraordinary arsenal of defence. This arsenal is based, first of all, on the defensive control systems located on the surface of our cells, which enable them either to escape recognition by these different pathogens or to interact with them in such a way as to divert their virulence. The way out found by natural selection is simple: the cells of human individuals are covered on their surface with extremely polymorphic structures, which differ from one person to another. If one person is infected with a dangerous pathogen, his or her neighbour - often even within the same family - may well escape infection, simply because the pathogen fails to adhere to the surface of his or her cells. One might think that it would have been sufficient to limit the surface structures to a set not recognised by the pathogens, but this would be to forget that pathogens have a very large number of descendants and that they evolve. Interacting with one of these structures is so advantageous for the pathogen that sooner or later one of its descendants will discover how to do so. The resistance solution found by evolution has been to multiply the catalogue of structural variants on the cell surface. Of course, as the number of pathogens is unlimited, the number of variants needed to avoid infection is itself unlimited. 
The solution was then to share the resistance, by spreading it among a large number of individuals. The selective interest of this polymorphism, cruel to the person who will be contaminated, is extremely advantageous for the whole population: during a pandemic, there are always people who will not be affected or for whom the disease will be mild. This incompatibility between cells is not limited to pathogens. Evolution has no rational purpose, and the incompatibility also extends to the cells of different people. We all know that this problem of recognition arises when we have to undergo an organ or tissue transplant: we have to find a "compatible" donor. So we are all different. Is it important to understand this characteristic of our heredity? The question of transplants of course provides a positive answer. It is therefore very useful to classify human beings according to the characteristics (genetically inherited, therefore identifiable in their genome) of the surface of their cells, characterised by their "major histocompatibility complex", HLA (ἱστός means tissue in Greek, and HLA stands for "Human Leukocyte Antigen").
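The argument about sharing resistance can be put in numbers with a deliberately crude sketch. Everything in it is an assumption made for illustration: a pathogen is reduced to the single surface variant it can adhere to, a person is reduced to the single variant he or she carries, and the figures (1,000 individuals, 50 variants) are invented. The function `worst_case` gives the largest fraction of the population that any one such pathogen could infect.

```python
from collections import Counter


def infected_fraction(population, pathogen_target):
    """Fraction of individuals whose surface variant the pathogen can adhere to."""
    hits = sum(1 for variant in population if variant == pathogen_target)
    return hits / len(population)


def worst_case(population):
    """Largest fraction a single-target pathogen could infect,
    assuming it has evolved to recognise the most common variant."""
    counts = Counter(population)
    return max(counts.values()) / len(population)


# Invented populations: 1,000 individuals each.
monomorphic = ["v0"] * 1000                        # everyone carries the same variant
polymorphic = [f"v{i % 50}" for i in range(1000)]  # 50 variants, evenly spread

print("monomorphic, worst case:", worst_case(monomorphic))
print("polymorphic, worst case:", worst_case(polymorphic))
print("polymorphic, pathogen adapted to v7:", infected_fraction(polymorphic, "v7"))
```

In this toy model a population with a single variant is fully protected only until a pathogen descendant matches that variant, at which point everyone is susceptible, whereas the polymorphic population never exposes more than one individual in fifty to any one pathogen.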

To make things clearer, an example, extreme in its simplicity, illustrates the value of classifications. In this example, individuals are divided into only two classes. It was discovered some years ago that patients with narcolepsy (spontaneous involuntary sleep, a very disabling disease) had become ill after a severe bout of flu. Long an enigma, the reason for this surprising association is now understood. The influenza virus - not just any type of virus, but the H1N1 virus - is recognised by some people thanks to a particular HLA (DR15 DQB1*0602), which is present in more than 20% of the European population. This recognition triggers a very effective immune response against the virus, with destruction of the infected cells so that the virus cannot multiply and spread in the infected person. This response occurs after specialised 'killer' cells have displayed a marker of the viral surface on their own surface. This should provide effective protection. Unfortunately, this marker is, by accident, identical to a neuropeptide (a brain mediator specialising in the transmission of nerve impulses) whose role is to control sleep. Two symmetrical brain ganglia secrete this peptide. As a result of the patient's antiviral defence response, they are identified and destroyed by the host immune system because they are mistaken for the influenza virus! (3) It is therefore advisable to warn carriers of this HLA in the event of an H1N1 epidemic but, unfortunately, we can only advise them to avoid human contact. This is because vaccination with the whole virus (if it uses an effective vaccine) would be ruled out in this case because, just as the virus does, the vaccine would trigger the disease. 
These data are important for vaccination policy (it will be possible to make a vaccine without the incriminated peptide), in particular because they make it possible to avoid accidents that would be seized upon by opponents of vaccination and would undermine a practice whose beneficial effects are massive at the population level. Here again we see that the interest of a given person may go against the interest of a population. This is a very well-documented example of a genetic character, spread all over the world but with a frequency that varies according to the geography of populations, which it is certainly very useful to know.

Phenotype and genotype: most characters are multigenic

So far, we have analysed only one family of traits, remarkable for their polymorphism in the human population. This family leads to the formation of natural classes, those that distinguish individuals according to their HLA. There are more than 20,000 genes in the human genome, and stories like this could be told for each of them. However, their polymorphism is highly variable. Some genes whose product is crucial vary very little, and their variants are the cause of those "orphan genetic diseases" that make headlines. Knowing the distribution of these genes and their many variants in human populations is therefore particularly important. To do this, we need to associate each individual with a set of specific traits - a phenotype - and then see if and how they can be grouped into natural classes. Here a new and risky question arises, because the choice of the traits that are retained, on which the classification then operates, may harbour a hidden ideology. Because of our necessary anthropocentrism and the constraints of our cultures, human beings are probably a poor model for reasoning about classification methods and their consequences. This is not without implications for human genetics, for example, because political correctness can bias the choice of characters, favouring some while forbidding the recording of others despite their importance. I will illustrate the method with a domestic animal, the cow (Bos taurus).

This animal is familiar to all of us: we recognise it by its shape or its behaviour. The set of characteristics we select forms its phenotype. However, this phenotype has no fixed boundary; it depends on the particular interest we take in the animal. Industry will retain the productivity and quality of the animal's milk, its meat, but also its resistance to disease, the length of its gestation, the number of pregnancies and the number of calves, longevity, etc. Breeders who are familiar with these animals also recognise characteristics such as "beauty" (yes, there is such a thing as a beautiful cow and it is carefully defined and awarded in agricultural competitions), coat colour, horn shape, weight, volume and temperature of the rumen, behaviour (solitary, social, suitable for automatic milking, etc.). In addition to these macroscopic (visible) characteristics, there are microscopic characteristics, often invisible, which are revealed macroscopically when a disease appears, for example, or by the analysis of the animal's blood. It is thus possible to compile a list of hundreds or even thousands of phenotypic characteristics and then analyse how they can be organised into natural classes. It should be noted that there are many well-characterised breeds of cattle (in France: Charolais, Montbéliarde, Normande, Salers... and the ubiquitous Prim'Holstein), and a multitude of intermediate individuals, resulting from hybridisation between individuals of different breeds. So far, nothing surprising, except that these breeds can dissolve very quickly into a more or less homogeneous mixture, which indicates that the boundary between breeds is necessarily blurred because of generalised inter-fertility within the same species.

Genetics has made it possible to go further and understand how these breeds are formed and how they evolve spontaneously. Indeed, the phenotypic characteristics that define the breeds can be associated with a new kind of knowledge, that of the DNA of the animals, their genome, which defines the genetic program of their constitution, their survival capacity and their reproduction. Better still, this knowledge has made it possible to discover how, from one generation to the next, this program is accidentally modified, revealing new traits [identified mainly by the appearance of negative traits, such as CHARGE syndrome or the very rare profound deafness known as Tietz syndrome (4)] similar to what is observed in human genetic "diseases". But the most striking finding to emerge from coupling the knowledge of a complex phenotype with that of a genotype was that the situation where a gene is directly associated with a specific trait is the exception rather than the rule. In practice, no gene functions alone. The products it encodes interact with others, encoded by other genes. All of this forms complicated structures, like the wheels of a clock, which are made of soft matter, and which animate us.

In this context, it must be understood that the phenotype of a living organism is 100% due to the patterns outlined in its genotype (innate), and 100% due to environmental constraints (acquired), whereas the commonplace arguments about heredity seek to separate the innate from the acquired. Innate is not at all opposed to acquired: they are orthogonal categories. An important consequence is that it is not possible to create a hierarchy between these categories; they are simply not comparable. There may be an order in performance (which is the result of a particular innate/acquired pair), but that is all. In particular, identical performances can result from different innate/acquired pairs. It follows that the same phenotype can be the expression of two different genotypes. And the reverse is also true: the same genotype can be expressed as different phenotypes.
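A small numerical sketch may help to see why the two contributions cannot be ranked. It rests on an assumption that is not in the text: performance is modelled as the product of an "innate" number and an "acquired" number, in the way the area of a rectangle is the product of its length and its width. The function name `performance` and all the values are invented for illustration.

```python
def performance(innate, acquired):
    """Toy model (an assumption): performance is the product of the two
    contributions, as the area of a rectangle is length times width."""
    return innate * acquired


# Different innate/acquired pairs can give an identical performance.
print(performance(2, 6), performance(3, 4))

# The same innate value gives different performances in different environments.
print(performance(2, 6), performance(2, 3))

# Remove either contribution and nothing remains: the result depends
# entirely on the innate value and entirely on the acquired one.
print(performance(0, 6), performance(2, 0))
```

In such a model, asking which share of a performance of 12 is "due to" the first factor has no more meaning than asking which share of a rectangle's area is due to its length.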

To go further, let us take a metaphor (partially inadequate like all metaphors, but nevertheless quite telling), that of a musical work performed by a large orchestra. When we attend the concert, it goes without saying that everything depends on both the score played and the orchestra. The concert depends 100% on the score and 100% on the orchestra. Moreover, the score does not only give the notes, but all sorts of indications: the general tempo, the composer's state of mind, the moment when each musician should intervene, etc. Nor, of course, is the orchestra alone involved. It is also obvious that the psycho-physiological state of the musicians, their style, etc. will influence the interpretation of the work. Likewise, the interpretation will never be free of mistakes, false notes, errors in the bars, which the musician will be able to make up for, or which will be made up for by his colleagues, etc. Each performance will be unique. It may also happen that the score has been badly bound, that pages are missing, that some are inverted, that others are blurred or stained. It may have been altered, corrected or amended. Interpretation also depends on the period in which the work is performed, etc. It is easy to understand what can and will happen. It is regrettable that this way of seeing the relationship between the innate and the acquired, which is very easy to grasp, is not more widespread among the general public. Of course, there may be a musical phrase played by a single instrumentalist, where errors in the score or execution will be very noticeable, but in general the performance comes from the combined playing of several musicians simultaneously. Thus, a sick soloist will lead to a poor performance, and similarly, if his score is badly printed, the interpretation will be poor, unless the musician has a very good memory, for example. 
But more often than not, the end result will be a combination of a large number of faults, in the playing of the individual musicians, or in what they read. It is easy to understand that this situation is analogous to monogenic (first case) or polygenic (second case) genetic diseases. This also accounts for the familiar contribution of "penetrance" and "terrain" to the varied outcome, i.e. the observation that the end result of the same genetic abnormality differs from one individual to another (analogous to a blurred or missing score played by different musicians, some of whom can only play very close to the score, while others can easily remember playing it in the past). What is important to understand is the dialogue between the performance (the phenotype) and the program/score (the genotype).

Here is a concrete illustration of this dialogue, with a butterfly that used to be fairly common in France, Araschnia levana. There are two forms, a spring form called the "Fawn Map" and a summer form, the "Brown Map". These two forms are so different that it was long thought that they were two species. In reality, it is enough to raise the caterpillars of this butterfly over several generations (provided that the cycle of the seasons, and of the temperature in particular, is followed) to see that they are a single species! This is a particularly telling demonstration of the role of the environment in the expression of the genetic program. If we then maintain reproduction by keeping only the winter cycle, we will have only one of the two forms (the spring form), and if we had started with the summer form, we will have the impression that an acquired character has become hereditary.

Classification and evolutionary history

We have seen the value of classification as a prelude to identifying causal relationships, as in the case of diseases linked to human polymorphism. Another interest, of course, would be to determine whether a particular genetic inheritance may be associated with drug side effects, especially when they are severe. But a particularly significant role of classification lies in its relationship to history. Making classes allows investigators to propose family trees. The genus Homo, which emerged as a result of a genetic accident that fused two chromosomes we share with our ape cousins, forming our long chromosome 2, led, most likely in Africa, to the appearance of Homo sapiens. This mobile and adaptable animal probably invaded Europe on several occasions, where another hominid, Homo neanderthalensis - the exact origin of this human species is not known, as skeletons of more primitive hominins are found in Europe - had preceded it. And these two species intermingled. The study of the human genome has taught us that Neanderthals and modern humans interbred in Europe at least twice in the last 100,000 years. As they were different species, most of the Neanderthal DNA segments introduced into modern humans were quickly eliminated in the offspring of the hybrid individuals. On the other hand, some sequences introduced into the sapiens genome by hybridisation were preserved, and if we put all the Neanderthal fragments found in the European genomes together, it is possible to reconstitute almost half of the ancestral Neanderthal genome! Having emerged from Africa, and poorly adapted to the viruses of Europe, unlike the Neanderthals who had been there for a long time, modern humans gained a certain number of adaptive genes thanks to this interbreeding, offering their descendants increased resistance to these diseases. We see here that the longest and most frequently encountered segments of Neanderthal ancestry in modern humans are most likely adaptive. 
They are indeed enriched in genes coding for proteins that interact with viruses. Regions that code for factors that specifically recognise RNA viruses - such as the influenza virus - are more likely to belong to segments of Neanderthal ancestry in modern Europeans. From these observations, we know that conserved segments of Neanderthal ancestry can be used to detect ancient epidemics. A similar story also unfolded in Eastern Europe and Asia, with one or possibly more crosses involving another hominin, Denisova man. And here again a selective advantage of interbreeding appears, perhaps because Denisova's ancestor had lived at high altitude: Tibetans acquired from this hominin a gene that allows them to live at high altitude without suffering from the blood problems encountered by modern humans living on the plains.

The formation of natural classes makes it possible to recognise these events and to understand their timing. It also explains the difference in susceptibility to diseases of different human groups. And today we can recognise four major natural classes for modern man (and this classification will be refined over time). A first dichotomy distinguishes between a group composed of Western Europeans, Asians and Oceanians, all of whom carry a significant percentage of hominin genomes other than Homo sapiens, and a group composed of most of the inhabitants of Sub-Saharan Africa. The first group is further divided into at least three, depending on the amount of Denisova genome in their own genome, with the Pacific Islander group being particularly rich in the Denisovan genome. Of course, as we have seen above in relation to domestication, the boundaries between these natural classes are blurred, and will become increasingly so as a result of the great movements of human populations that we are experiencing today. In some West African populations, some Neanderthal genomic sequences have been introduced among people of certain ethnic groups, making it possible to follow human migrations throughout history. Moreover, by knowing more human genomes, we will better understand this migratory adventure towards the North and the East, which started very early, probably with the migration of the Neanderthal ancestor.

Race and ideology

If it is true that the great stages of science have often been preceded by mapping or classification, it is because a particular intention has dominated these activities, underpinned by a particular model of the world. Thus the concept of race arose from the exploitation of domestication. This concept is an avatar of classifications, which were indispensable for defining the concept of species. While it answers the question of determining natural classes of individuals within a large population, a welcome intention for deciding health policies for example, or for understanding the history of the human species, it has been plagued by the superimposition of the idea of the 'purity' of a race, since for domestication it was important (it still is) to avoid interbreeding. It is amusing to note in this regard that if there is a "pure" sapiens race, it is obviously found among the African populations... In fact, while this question of interbreeding is of only anecdotal importance when it comes to most animal or plant species (it may even be desirable, as when mules are produced from a horse and a donkey, and this will always require the same procedure, as the mules are not fertile, thus avoiding the question of the purity of the offspring), interbreeding is possible and even probable and fertile, within the same species.

As a result, particular phenotypic traits, which may be valued for one reason or another, will appear more and more often, and apparently at random. Moreover, some traits, the so-called "recessive" traits, are quickly masked by others. It is a classic observation that scarcity is very often considered a value by human societies. Scarcity has a price. This implies that, if it is linked to a human trait, it obliterates its dignity, as Immanuel Kant noted [in the realm of ends, everything has either a price or a dignity (5)]. Modern genetics, alas, perpetuates this ideology, as it uses the pseudo-concept of "purification" when genes change or disappear as a result of mutations or when a foreign DNA segment has been introduced into a genome. In fact, in multicellular species, the latter type of "horizontal" gene transfer - i.e. not through the direct "vertical" pathway from parents to offspring - is most often the result of hybridisation. Moreover, the incompatibility that arises between neighbouring but different gene products requires an adjustment or balancing in the offspring, which is often achieved by the complete loss of the foreign genes. The disappearance of a large part of the Neanderthal genome among Europeans is not a purge, but simply a differential effect on the survival of individuals who possess certain genes, to the benefit of those who have lost them or replaced them with a copy of a homologous gene from the Homo sapiens genome.

What must disappear is not the concept of race, which is perfectly clear and harmless in normal times, but the concept associated with it, which hides harmful moral values that have nothing to do with science, derived from the concept of 'purity'. We must also get rid of many other related pseudo-concepts of biology, such as "altruism" or "selfishness", which are attributed as qualifiers to genes that certainly have no moral conscience! No gene, of course, is selfish, no gene is pure. Each gene can be associated with a function - or rather a contribution to the implementation of a function - the role of which will strongly depend on the context in which it is implemented. Having black skin when the sun is dominant will help carriers to have viable offspring in these conditions. On the contrary, it will be a handicap at high latitudes because it triggers a vitamin D deficiency, resulting in a lack of proper ossification, leading to rickets. A smooth skin protects against many parasites, but makes one more sensitive to cold, etc. Human genetic polymorphism is what has allowed humans to invade the planet.

Notes

1. F. Vicq d'Azyr (1792) Grande Encyclopédie Médecine, vol 4
2. F. Vicq d'Azyr (1792) Grande Encyclopédie Système anatomique, vol 2
3. Luo et al. (2018) Autoimmunity and molecular mimicry to flu in type 1 narcolepsy. Proc Natl Acad Sci U S A 115: E12323-E12332
4. Bourneuf et al. (2017) Rapid discovery of de novo deleterious mutations in cattle enhances the value of livestock as a model species. Scientific Reports 7: 11466
5. I. Kant (1785) Grundlegung zur Metaphysik der Sitten
