In praise of diversity: the orchestra of genetics

Praise the disheveled, praise the sleek;
Austerity and hearts-and-flowers;
People who turn the other cheek
And extroverts who take cold showers;
Saints we can name a holy day for
And infidels the saints can pray for.

Praise youth for pulling things apart,
Toppling the idols, breaking leases;
Then from the upset apple-cart
Praise oldsters picking up the pieces.
Praise wisdom, hard to be a friend to,
And folly one can condescend to.

In Praise of Diversity
Phyllis McGinley

The human genome orchestra

This page is an adaptation of an article published in French in the magazine BoOks, no. 101, October 2019: « Races, pourquoi le débat est faussé ». We propose the image of an orchestra and the score it plays as a metaphor for the link between genotype and phenotype. We also recall that it has been known since the time of Vicq d'Azyr that it is possible to build well-formed classes whose members do not all share every feature used to define the class.

Constructing classes to understand reality

Science did not appear in its modern form until 2,500 years ago. Its birth required that the human animal organize what he could recognize of his environment into a self-consistent view. To this end it was first necessary to sort events or objects of interest into recognizable families, and then to study and understand the relationships between them. This is how the astronomers of antiquity began what was to become cosmology, by building up a catalog of stars. Making classes meant recording the position of these luminaries in the sky (still dark at the time), their brightness and their local density. Because they had noticed the regularity of their course through the cycle of the seasons, astronomers discovered moving stars, the planets. Science begins when the return of things is noticed. Later, with the birth of optics, it became possible to characterize stars much better, in particular by their color and their shape (some are galaxies). To go further and seek the forces at work in the organization of the Universe, it became necessary to group these innumerable objects into well-defined classes. This is when true cosmology began. In the same way, living species, collected and recognized one by one, as at the birth of botany in treatises such as The History of the Plants (Περὶ φυτῶν ἱστορία) of Theophrastus (fourth century BC), began to make sense only once they were put into classes based on their kinship. To this end, investigators had to find descriptive characters that remained fairly constant among the individuals of a group of plants, so that the group could be identified as an authentic type.

It was quickly realized that class formation depends on the number of characters used to describe the elements of the class. It is straightforward to catalogue the individuals of a population using a single character (for example, for peas, those with white blossoms and those with colored blossoms). The situation becomes more complicated as more characters are taken into account simultaneously. Classification methods then had to work out how to organize individuals, tagged with a very large number of characters (both macroscopic and microscopic, if one has access to the relevant methods of analysis), into natural classes. To form a class it was convenient to associate a measure of distance with the different characters as displayed by the individuals under study. It was during the second half of the eighteenth century that scientific discussion took a deep look at this question. Félix Vicq d'Azyr, a precursor of comparative anatomy with a keen interest in animal physiology, was concerned with what makes a proper classification, as we see in the articles he signed in The Great Encyclopedia (edited by Diderot and d'Alembert): « The second way of disposing plants is called method by botanists. It is an arrangement founded on the concurrence of several characters taken from the most essential parts of the plants, with the aid of which we succeed in bringing together those which resemble each other the most; and then build up what are called natural families. » (1) Elsewhere, he observed that to form a natural class it is not necessary that all the individuals of the class share all the elements used to define it: « A natural class results from the assembly of a number of species which are held together by a greater number of features than exist between each of them and the species of the other classes. 
For an individual to be part of a class, considered from this point of view, it is not necessary for it to gather all the characters; it suffices that it offers the greatest number of them; whence it follows that it would be possible for a class to be very natural, and that there was not a single character common to all the species that compose it. » (2) This observation, which is counterintuitive for those unfamiliar with the scientific method, is at the heart of all current classification methods.
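
Vicq d'Azyr's observation can be checked on a toy example. The sketch below is only an illustration under invented assumptions: the species names, the five presence/absence characters and the "number of shared characters" measure are hypothetical and do not come from his articles. It builds a class in which every pair of members shares three characters, while an outsider shares at most one with any member, and yet no single character is common to all members.

```python
from itertools import combinations

# Hypothetical species, each described by the presence (1) or absence (0)
# of five characters. Every member lacks exactly one character, a different
# one each time. "X" is an outsider belonging to another class.
species = {
    "A": (0, 1, 1, 1, 1),
    "B": (1, 0, 1, 1, 1),
    "C": (1, 1, 0, 1, 1),
    "D": (1, 1, 1, 0, 1),
    "E": (1, 1, 1, 1, 0),
    "X": (1, 0, 0, 0, 0),
}
members = ["A", "B", "C", "D", "E"]


def shared(a, b):
    """Count the characters present in both descriptions."""
    return sum(x and y for x, y in zip(a, b))


# Smallest number of characters shared by two members of the class.
within = min(shared(species[p], species[q]) for p, q in combinations(members, 2))

# Largest number of characters shared by a member and the outsider.
between = max(shared(species[m], species["X"]) for m in members)

# Characters present in every member of the class.
common_to_all = [i for i in range(5) if all(species[m][i] for m in members)]

print(within, between, common_to_all)
```

Under these assumptions the members are held together by more features than link any of them to the outsider, even though the list of characters common to all of them is empty.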

Once natural classes have been formed, we can go further, observing that some families of characters are simultaneously present or simultaneously absent in the entities of interest. As a consequence, we may want to group these characters and see how new classes form from this grouping. Thus began botany. Its early form is still illustrated today by works found on the shelves of many British families, such as the Handbook of the British Flora: A Description of the Flowering Plants and Ferns Indigenous to, or Naturalised in, the British Isles by George Bentham (1800-1884; first edition 1858, widely available in 1918 and continuously reprinted since) or, for those familiar with Hong Kong, the Flora Hongkongensis published in 1861.

Shortly before Vicq d'Azyr's reflections, Carl von Linné (Linnaeus) had the insight to notice that the most important characters for classifying living beings were those responsible for their generation, their sexual organs (stamens and pistil in plants). It then appeared that the natural classes of living organisms were distinguished by a unique original character, the need for their members to be inter-fertile. This made it possible to define what we recognize today as a species. Subsequently, grouping living organisms by proximity, using more and more general characters, led to the idea of the genus (a consistent group of species), then of families and more general classes. For example, the common pea (Pisum sativum L.) is the sativum species of the genus Pisum (not italicized when referring only to the genus) and belongs to the family Leguminosae. This is how plants were grouped into more and more general classes, up to the separation between plants and animals, for example. We can also go in the opposite direction, attempt a finer classification, and identify races within species.

In parallel with the scientific description used as a prelude to the search for the causes of what defines living beings, and in line with the interests of the time, came the question of the « utility » of natural classes. This happened in a world gradually dominated by a certain idea of human well-being, based on pleasure or on the possession of various goods. This idea introduced into classifications an ideological bias that is hidden, and dangerous because it is hidden. Viewed as useful, some species (plant, animal or even microbe) have been domesticated throughout human history. Depending on the places and the tastes of their owners, individuals of the same species were selected across generations to express different characters, thanks to the control of their reproduction. This was at first based on rather haphazard choices, which became more and more precise as the mechanisms of heredity came to be understood and informed selection was progressively refined. This selection is what allows you to choose your cows for their milk or for their meat. In this way, many well-characterized cattle breeds have been created and maintained over the centuries. Of course, as we remain within the same species, individuals of different races can interbreed, but this affects the properties of their progeny and their corresponding « utility ». To go further, it is important to understand how individuals of the same species may differ, and what this means for their biological nature and for the future of their descendants.

Human polymorphism

Unlike many animals (but certainly not all: think of the rat, for example), man is exceptionally adaptable. This can be seen in the variety of places where the human animal can live. Yet it is often forgotten that the most important reason for this adaptability is man's ability to resist diseases. If we are alive today, it is because our ancestors resisted smallpox, plague and, more recently, flu. Living in hot, dry, cold or wet regions exposes us to an endless variety of pathogenic agents (bacteria, viruses, parasites...) that differ from place to place. This remarkable ability, as we know today, comes from the fact that evolution has armed us with an extraordinary defensive arsenal. This arsenal is based, first, on defensive control systems located at the surface of our cells, which allow them either to escape recognition by these various pathogens or to interact with them in a way that diverts their virulence. The way out found by natural selection is simple: the cells of human individuals are covered with extremely polymorphic structures that differ from person to person. When someone is infected with a dangerous pathogen, their neighbor, often even within the same family, will not be infected, simply because the pathogen fails to stick to the surface of their cells. One might think that it would have been enough to limit the surface structures to a set not recognized by pathogens, but this would be to forget that pathogens have a great many offspring and that they evolve. Being able to interact with one of these structures is so advantageous that sooner or later a descendant of the pathogen will discover how to do so. The solution found by evolution was to multiply the catalog of structural variants at the surface of cells. Of course, as the number of pathogens is unlimited, the number of variants needed to avoid infection is itself limitless. The solution was therefore to share resistance out, spreading it among a large number of individuals. 
This polymorphism, cruel for the person who is infected, is extremely advantageous for the population as a whole: during a pandemic there are always people who are not affected or for whom the disease is benign. This incompatibility is not limited to pathogens (evolution has no rational purpose); it also extends to the cells of different people. We all know that this problem of recognition arises when someone has to undergo an organ or tissue transplant: a « compatible » donor must be found. So, we are all different. Is it important to understand this feature of our heredity? The question of grafts of course provides a positive answer. It is therefore very useful to classify human beings according to the characters (genetically inherited, and therefore identifiable in their genome) displayed at the surface of their cells, which are defined by their « major histocompatibility complex », HLA (ἱστός means tissue in Greek, and HLA stands for « Human Leucocyte Antigen »).

To make things clearer, an example, extreme in its simplicity, illustrates the value of classifications. In this example individuals are distributed into just two classes. It was discovered a few years ago that patients with narcolepsy (spontaneous involuntary sleep, a very debilitating disease) became ill after a severe bout of flu. Long an enigma, the reason for this surprising association is now understood. The influenza virus (not just any type of flu virus, but the H1N1 virus) is recognized by some people through a particular HLA (DR15 DQB1*0602), carried by more than 20% of the European population. This recognition triggers a very effective immune response against the virus, killing the infected cells so that the virus cannot multiply and spread in the infected person. This response appears after specialized « killer » cells have displayed a marker from the viral surface on their own surface. This should provide efficient protection. Unfortunately this marker happens, by accident, to be identical to a neuropeptide (a cerebral mediator specialized in the transmission of nerve impulses) whose role is to control sleep. Two symmetrical brain ganglia secrete this peptide. In the patient, as a result of the antiviral defense response, they are identified and destroyed by the host's own immune system, which mistakes them for the flu virus! (3) It is therefore advisable to warn the relevant carriers in the event of an H1N1 epidemic but, unfortunately, all we can advise them to do is avoid human contact while the epidemic lasts. This is because vaccination with the whole virus, assuming an effective vaccine is used, would be prohibited in this case: just as the virus does, the vaccine would trigger the disease. 
This information is important for vaccination policy, since it will be possible to make a vaccine without the incriminated peptide. It matters in particular because it avoids accidents that opponents of vaccination would seize upon to argue against a practice whose beneficial effects are massive at the level of the population. Here again we see that the interest of a given person may go against the interest of the population. This is a very well-documented example of a genetic character that is spread all over the world but whose frequency varies with the geography of populations, and which is certainly very useful to know about.

Phenotype and genotype: most characters are multigenic

So far we have analyzed only one family of characters, notable for their polymorphism in the human population. This family leads to the formation of natural classes, those that distinguish individuals according to their HLA. There are more than 20,000 genes in the human genome, and stories like this could be told for each of them. However, their polymorphism is very variable. Certain genes whose product is crucial vary very little, and their variants are at the origin of the « orphan genetic diseases » that make the front page of the mass media. Knowing the distribution of these genes and of their many variants in human populations is therefore particularly important. To do this we must associate with each individual a set of specific characters, a phenotype, and then see whether, and how, individuals can be grouped into natural classes. Here a new risk arises, because a hidden ideology may decide which characters are retained and thereby steer the classification. Because of our unavoidable anthropocentrism and the constraints associated with our cultures, man is probably a bad model for reasoning about classification methods and their consequences. This is not without implications for human genetics, for example, because political correctness can dictate the choice of characters, favoring some while forbidding any mention of others despite their importance. I will illustrate the method with a domestic animal, the cow (Bos taurus).

This animal is familiar to all of us. We recognize it by its shape or its behavior. The set of retained characters forms its phenotype. However, this phenotype is open-ended: it depends on the particular interest we take in the animal. Industry will retain the productivity and quality of the animal's milk and meat, but also its resistance to diseases, the duration of its gestation, the number of pregnancies and of calves, longevity, etc. Farmers familiar with these animals also recognize characters such as « beauty » (yes, a beautiful cow exists; beauty is carefully defined and wins prizes at agricultural shows), coat color, shape of the horns, weight, volume and temperature of the rumen, and behavior (solitary, social, suitable for automatic milking, etc.). To these macroscopic (visible) characters are added microscopic characters, often invisible, which are revealed macroscopically when a disease appears, for example, or by analysis of the animal's blood. We can thus build up a list of hundreds or even thousands of phenotypic characters and then analyze how they organize themselves into natural classes. Right away we notice the existence of many well-characterized cattle breeds (in France: Charolais, Montbéliarde, Normande, Salers... and the omnipresent Prim'Holstein), and a multitude of intermediate individuals resulting from hybridization between individuals of different races. So far, nothing surprising, except that these breeds can dissolve very quickly into a more or less homogeneous mixture, which shows that the boundary between races is necessarily blurred because of widespread inter-fertility within a single species.

Genetics made it possible to go further, and to understand how these races were formed and how they evolve spontaneously. Indeed, the phenotypic characters defining the races can be associated with a new kind of knowledge, that of the animals' DNA, their genome, which defines the genetic program for their construction, their ability to survive and their reproduction. Better still, this knowledge has made it possible to discover how, from one generation to the next, this program is accidentally altered, revealing new characters (identified mainly by the appearance of negative traits, such as the CHARGE syndrome or the very rare profound deafness named Tietz syndrome (4)), analogous to what we see in human genetic « diseases ». But the most striking lesson drawn from coupling knowledge of a complex phenotype with that of a genotype has been that the situation where a gene is directly associated with a specific trait is the exception rather than the rule. In practice, no gene works alone. The products whose synthesis it codes for interact with others, encoded by other genes. The whole forms complicated structures, like the wheels of a clock made of soft matter, which animate us.

In this context we must understand that the phenotype of a living organism is 100% due to patterns outlined in its genotype (innate) and 100% due to environmental constraints (acquired), whereas commonplace arguments about heredity seek to separate the innate from the acquired. The innate is in no way opposed to the acquired: they are orthogonal categories. An important consequence is that it is not possible to create a hierarchy between these categories; they are simply not comparable. There may be an order among performances (each of which is the result of a particular innate / acquired pair), but that is all. In particular, identical performances can result from different innate / acquired pairs. It follows that the same phenotype may be the expression of two different genotypes. The converse is also true: the same genotype may be expressed as different phenotypes.
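
A deliberately crude numerical sketch may help to see why the two contributions cannot be ranked. It assumes, purely for illustration, that a performance is the product of an « innate » factor and an « acquired » factor. This multiplicative model, the function name and the numbers are hypothetical and are not part of the original argument.

```python
def performance(innate, acquired):
    """Hypothetical toy model: the performance needs both factors at once."""
    return innate * acquired


# Three different innate / acquired pairs (invented values).
pairs = [(2.0, 6.0), (3.0, 4.0), (6.0, 2.0)]
performances = [performance(g, e) for g, e in pairs]

# Remove either factor entirely and nothing is left: in this sense the
# result depends 100% on the innate part and 100% on the acquired part.
without_innate = performance(0.0, 4.0)
without_acquired = performance(3.0, 0.0)

# The same innate factor in two different environments.
same_innate = (performance(3.0, 4.0), performance(3.0, 2.0))

print(performances, without_innate, without_acquired, same_innate)
```

In this toy model the three pairs give the same performance, so the performance alone says nothing about the pair that produced it, and the same innate factor gives different performances in different environments.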

To go further, let us take a metaphor (partially inadequate like all metaphors, but nevertheless quite telling), that of a musical work performed by a large orchestra. When we attend the concert, it goes without saying that everything depends both on the score played and on the orchestra. The concert is 100% dependent on the score and 100% dependent on the orchestra. Moreover, the score gives not only the notes but all sorts of indications: the general tempo, the state of mind of the composer, the moment when each musician must come in, etc. It also goes without saying that the psycho-physiological state of the musicians, their style, etc. will influence the interpretation of the work, and that the interpretation will never be free of errors, wrong notes or miscounted bars, from which the musician may or may not manage to recover, or which his colleagues will cover for, etc. Each interpretation will be unique. It can also happen that the score has been badly bound, that pages are missing, that some are out of order, blurred or stained. It may have been modified, corrected, amended. The interpretation also depends on the period in which the work is performed, etc. It is easy to understand all that may and will happen. It is unfortunate that this way of seeing the relationship between innate and acquired, which is obviously quite understandable, is not more widespread among the general public. Of course, there may be a musical phrase played by a single instrumentalist, where the errors of the score or of the performance will be very noticeable, but in general the performance results from many musicians playing together. Thus an ill soloist will lead to a mediocre performance and, in the same way, if his score is poorly printed the interpretation will be bad, unless the musician has a very good memory, for example. 
But most often the end result will be the combination of a large number of defects, in the playing of individual musicians or in what they read. This is easily understood as analogous to monogenic (first case) or polygenic (second case) genetic diseases. It also accounts for the familiar contribution of « penetrance » and « terrain » to the varied outcome, that is, the observation that the final result of the same genetic anomaly differs from one individual to another (analogous to a fuzzy or absent score played by different musicians, some able only to play very close to the score, while others easily remember having played it in the past). What is important to understand is the dialogue between the interpretation (the phenotype) and the program / score (the genotype).

Here is a concrete illustration of this dialogue, with a butterfly that used to exist in France, Araschnia levana. It has two forms, a spring form called the "Fawn Map" and a summer form, the "Brown Map". These two forms are so different that they were long thought to be two species. In reality it is enough to raise the caterpillars of this butterfly over several generations (provided the cycle of the seasons is followed, temperature in particular) to see that it is a single species! This is a particularly telling demonstration of the role of the environment in the expression of the genetic program. If we then maintain reproduction while keeping only the winter cycle, we will obtain only one of the two forms (the spring form), and if we started from the summer form we will have the impression that an acquired character has become heritable.

Classification and evolutionary history

We have seen the value of classifying as a prelude to the identification of causal relationships, as in the case of diseases related to human polymorphism. Another benefit, of course, would be to determine whether a particular gene pool can be associated with drug side effects, particularly serious ones. But a particularly significant role of classification lies in its relation to history. Making classes allows investigators to propose family trees. The genus Homo, which appeared as a result of a genetic accident that fused two chromosomes still found separately in our ape cousins, forming our long chromosome 2, led, most probably in Africa, to the appearance of Homo sapiens. This mobile and adaptable animal probably invaded Europe several times, where another hominin, Homo neanderthalensis, had preceded him (we do not know the exact origin of this human species, because skeletons of more primitive hominins are also found in Europe). And these two species mixed. The study of the human genome has taught us that Neanderthals and modern man crossed paths in Europe at least twice in the last 100,000 years. Because they were different species, most of the Neanderthal DNA segments introduced into modern man were rapidly eliminated in the offspring of hybrid individuals. On the other hand, some hybridized sequences have been conserved in the sapiens genome, and if all the Neanderthal fragments found in European genomes are put end to end, it is possible to reconstitute almost half of the ancestral Neanderthal genome! Coming out of Africa, and poorly adapted to the viruses of Europe, unlike the Neanderthals who had been there for a long time, modern man gained a number of adaptive genes thanks to this crossbreeding, which offered his progeny increased resistance to these diseases. We find that the longest and most frequently encountered segments of Neanderthal ancestry in modern humans are most likely adaptive. 
They are indeed enriched in genes coding for proteins that interact with viruses. Regions that code for factors that specifically recognize RNA viruses, such as the influenza virus, are more likely to belong to segments of Neanderthal descent in modern Europeans. These observations tell us that conserved segments of Neanderthal ancestry can be used to detect ancient epidemics. A similar story is emerging in Eastern Europe and Asia, with one or possibly several crossbreeding events involving another hominin, Denisova man. There again crossbreeding appears to have brought a selective advantage, perhaps because the Denisovan ancestor had lived at altitude: the Tibetans acquired from this hominin a gene that allows them to live at high altitude without suffering the blood problems faced there by modern men from the lowlands.

The formation of natural classes makes it possible to recognize these events and to understand their chronology. It also explains why different human groups differ in their susceptibility to diseases. Today we can recognize four great natural classes for modern man (and this classification will be refined over time). A first dichotomy distinguishes a group made up of Western Europeans, Asians and Oceanians, all carrying a significant percentage of genome from hominins other than Homo sapiens, from a group made up of most inhabitants of sub-Saharan Africa. The first group splits in turn into at least three, depending on the amount of Denisovan genome they carry, with Pacific Islanders being particularly rich in Denisovan genome. Of course, as we saw above about domestication, the boundaries between these natural classes are fuzzy, and they will become more and more so because of the large movements of human populations that we are experiencing today. In some populations of West Africa, a few Neanderthal genome sequences have been introduced among people of certain ethnic groups, which makes it possible to follow human migrations in the course of history. Moreover, as more human genomes become known, this migratory adventure to the North and East, which started very early, probably with the migration of the ancestor of the Neanderthals, will be better understood.

Race and ideology

If it is true that the main stages of science have often been preceded by cartography or classification, this is because a particular intention dominated these activities, underpinned by a particular model of the world. This is how the concept of race was born from the practice of domestication. This concept is an offshoot of the classifications that were essential to define the concept of species. It answers the question of determining natural classes of individuals within a large population, a quite welcome aim when deciding on health policies, for example, or when trying to understand the history of the human species. But it was plagued by the superimposed idea of the « purity » of a race, since for domestication it was important (and still is) to avoid crossbreeding. It is amusing to note in this connection that if there is a « pure » sapiens race, it will obviously be found among African populations... In fact, crossbreeding between species is of only anecdotal importance for most animals or plants. It may even be desirable, as when we produce mules from a horse and a donkey; this always requires the same procedure because mules are not fertile, which avoids the question of the purity of the offspring. Within the same species, by contrast, crossbreeding is possible, even probable, and fertile.

It follows that particular phenotypic characters, which may be valued for one reason or another, will appear more and more often, and apparently at random. Moreover, some characters are quickly masked by others: these are the so-called « recessive » characters. Now, it is a standard observation that scarcity is very often treated as a value by human societies. Scarcity has a price. When a price is attached in this way to a feature of human beings, it obliterates their dignity, as Immanuel Kant noted [in the kingdom of ends everything has either a price or a dignity (5)]. Modern genetics, alas, perpetuates this ideology, because it uses the pseudo-concept of « purification » when genes change or disappear as a result of mutations, or when a segment of foreign DNA has been introduced into a genome. In fact, in multicellular species the latter type of « horizontal » gene transfer (that is, one that does not follow the straightforward « vertical » pathway from parents to their descendants) most often results from hybridization. Furthermore, the incompatibility that then appears between neighboring but different gene products imposes an adjustment or rebalancing in the offspring, which is often achieved by the complete loss of the foreign genes. The disappearance of a large part of the Neanderthal genome in Europeans is not a purification, but simply a differential effect on the survival of individuals who possess certain genes, to the advantage of those who have lost them or replaced them with a copy of the homologous gene from the Homo sapiens genome.

What must disappear is by no means the concept of race, which is perfectly clear and harmless in normal circumstances, but the concept of « purity » that is associated with it and that conceals unwanted moral values which have nothing to do with science. It is also necessary to get rid of many other related pseudo-concepts of biology, such as « altruism » or « selfishness », applied as qualifiers to genes that certainly have no moral conscience! No gene, of course, is selfish; no gene is pure. We can associate with each gene a function, or more usually a contribution to the implementation of a function, whose role will depend strongly on the context in which it is implemented. Having black skin where the sun dominates helps carriers to have viable offspring in these conditions. At high latitudes, on the contrary, it is a handicap because it leads to vitamin D deficiency, hence to poor ossification and to rickets. A smooth skin protects against many parasites, but makes its bearer more sensitive to cold, etc. Human genetic polymorphism is what allowed man to invade the planet.

Notes

1. F. Vicq d'Azyr (1792) Grande Encyclopédie Médecine, vol 4
2. F. Vicq d'Azyr (1792) Grande Encyclopédie Système anatomique, vol 2
3. Luo et al. (2018) Autoimmunity and molecular mimicry to flu in type 1 narcolepsy, Proc Natl Acad Sci U S A 115: E12323-E12332
4. Bourneuf et al. (2017) Rapid discovery of de novo deleterious mutations in cattle enhances the value of livestock as a model species, Scientific Reports 7: 11466
5. I. Kant (1785) Grundlegung zur Metaphysik der Sitten
