The HKU-Pasteur Research Centre presents seminars in genomics on a regular basis

The E-seminar "causeries du jeudi", a sequel to the seminar on learning and memory held at the Institut de Biologie Physico-Chimique in Paris with Philippe Courrège and Jean-Pierre Changeux, is held at the Department of Mathematics of the Faculty of Sciences of HKU every Wednesday afternoon at 2:30 pm.

Innovation Expo 2001


A presentation of the BIOSUPPORT programme was given on March 20, 2003.

"Bioinformatics Resources on the BIOINFO System"

Mr Frankie Cheung
Assistant Computer Officer
Computer Centre, HKU

"Specialized Microbial Databases in the BIOSUPPORT Project"

Mr Cedric Cabau
Research Assistant
HKU-Pasteur Research Centre Ltd

"The BIOINFO System for Teaching and Research in Hong Kong"

Dr David K Smith
Assistant Professor
Department of Biochemistry, HKU

(Abstracts of their presentations can be found at
The HKU-Pasteur Research Centre Ltd and the Computer Centre are jointly organizing a seminar entitled "Bioinformatics Resources of the BIOSUPPORT Project". The seminar introduces the facilities set up for genome analysis by academia and the state-of-the-art bioinformatics tools and resources developed under BIOSUPPORT (From gene regulation to gene function: Bioinformatics Sustainable Programme and Portability). This project, funded by the Innovation and Technology Fund, seeks to provide a stable basis for in silico (computer-mediated) genome studies, available for use by the HKSAR community at large.

A seminar entitled "The Microsporidia: from the minimal genome to the minimal proteome in a parasitic eukaryote" was presented by Prof. Christian Vivarès, Professor of Parasitology, Equipe de Parasitologie moléculaire, LBP, Université Blaise Pascal-Clermont-Ferrand, 63177 Aubière Cedex, France, on Friday 21st, 2003.

The "microsporidial world" is formed by unicellular eukaryotes that are all obligate intracellular parasites and are widely distributed throughout the animal kingdom. Responsible for severe pathologies in invertebrates (insects) and fishes, microsporidia can also cause opportunistic infections in immunodeficient persons. Anti-microsporidia antibodies have been detected in 17% of the European population. The genome of Encephalitozoon cuniculi (2.9 Mbp) was the first genome of a parasitic eukaryote to be fully sequenced. Our analytical annotation has shown that such genome miniaturization reflects not only various functional losses, as a possible result of the parasitic lifestyle, but also minimal sets of genes for conserved functions and an extensive reduction in gene size. Microsporidia are energetically dependent on their host cells but have retained a mitochondrion-derived relic organelle, the mitosome. Post-genomic studies (proteomics and comparative genomics) are in progress.

A seminar entitled "Genomic variability of HIV-1: implications for viral diagnosis, pathophysiology and viral resistance to antiretroviral drugs" was presented by Prof. Hervé J FLEURY, Professor of Virology, University of Bordeaux, France, on Wednesday, 29 January 2003.

HIV-1 infection is a chronic disease which leads from primary infection to immunodepression and death; the virus replicates in target cells, monocytes-macrophages and TCD4 lymphocytes, and this replication is associated with the decrease of the TCD4 cells and the emergence of immunodepression. It is therefore worthwhile to slow down viral replication in order to protect the TCD4 cells. The targets for drug therapy of HIV-1 infection are the reverse transcriptase (RT), the viral protease and the fusion between the viral envelope and the cellular membrane. The main drugs are nucleoside reverse transcriptase inhibitors (NRTI), non-nucleoside reverse transcriptase inhibitors (NNRTI), protease inhibitors (PIs) and fusion inhibitors (T20). Patients are treated with a highly active antiretroviral therapy (HAART) which combines three drugs (2 NRTI + 1 PI, for example); HAART enables viral replication to be kept below the detection threshold (50 viral RNA copies/ml of plasma) of the assays used to determine viral load. Treatment failures occur as a consequence of the emergence of viral mutants resistant to the drugs. Resistance to the drugs is associated with specific mutations in the viral genes (RT, protease, env). It is therefore useful to study the genotypic resistance of HIV-1 isolates from patients with a therapeutic failure. This genotypic study can be carried out by sequencing viral genes or by using DNA microarrays. Once the mutations have been determined, a new therapeutic regimen is decided; some examples will be provided.
Another important question relates to the genomic variability of HIV-1. As is now well known, there are three groups of HIV-1: M, O and N. The M group (which is pandemic) contains subtypes A to J and recombinants; those recombinants which are circulating in the human population are named circulating recombinant forms (CRFs). For example, CRF01_AE, which is predominant in South East Asia, is a recombinant between subtypes A and E. The question is: are subtypes and recombinants of HIV-1 (besides subtype B, which is well characterized in Europe and North America) fully sensitive to the drugs, particularly in countries where these will be introduced in the near future? The French national agency on AIDS research (ANRS) has decided to set up an observatory of genotypic viral resistance of isolates from untreated patients in developing countries of Africa (Ivory Coast, Burkina Faso, Cameroon, Senegal) and Asia (Viet Nam and India). Non-B subtypes from these countries are being sequenced in RT, protease and env; the resistance mutations (if present) and the polymorphisms (if potentially related to resistance) are noted. Preliminary results from France, Ivory Coast and Viet Nam will be presented.

In collaboration with the Institute of Mathematical Research
Department of Mathematics
A Workshop on Mathematical and Computational Biology was held on December 30-31, 2002.

Organizing Committee: Antoine Danchin, HKU-Pasteur; Tze Leung Lai, Stanford U. & HKU; Ngaiming Mok, HKU


A progress report entitled "Discovery and description of a Temperature-sensitive Regulatory Network in E. coli & Salmonella enterica serovar Typhi" was given by Alessandra RIVA, Laboratoire Génome et Informatique, Tour Evry II, FRANCE, on Friday, November 29, 2002.

We have constructed a model chromosome for each of E. coli and Salmonella enterica, specifically to identify unusual distributions of the tetranucleotide GATC. In particular, we were interested in the number of occurrences of GATC clusters. GATC is of particular interest because it is methylated by the Dam methylase in both E. coli and Salmonella enterica. GATC motifs already have known functions, for example in mismatch repair and in chromosome replication, and we were looking for a further function of this tetranucleotide. GATC, especially in clusters, alters the stability of DNA during a temperature shift from warm to cold (cold shock); transcription of genes containing the motif or its clusters is blocked under these conditions. We have obtained a list of genes containing GATC clusters for both E. coli and Salmonella enterica and have examined the role each gene plays in the bacterium's metabolism. Most notably, we have found that a considerable proportion of the affected genes code for proteins involved in anaerobic or aerobic respiration. Another part of the cell's metabolism affected by the GATC clusters is the metabolism of macromolecules (DNA, cell wall). We have also discovered that the two organisms accumulate succinate during cold shock. Its possible function was discussed.
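As an illustration only, the search for GATC clusters described above might be sketched as follows. This is not the authors' method: the abstract does not say how a cluster was defined, so the thresholds `max_gap` and `min_size`, the function names and the toy sequence are all assumptions made for the example.

```python
def find_motif_positions(sequence, motif="GATC"):
    """Return the 0-based start position of every motif occurrence."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one base further on so overlapping hits are not missed.
        start = sequence.find(motif, start + 1)
    return positions


def group_into_clusters(positions, max_gap=20, min_size=2):
    """Group sorted positions into runs whose neighbours lie at most
    max_gap bases apart; keep runs with at least min_size members.
    Both thresholds are hypothetical, chosen only for illustration."""
    clusters = []
    current = []
    for pos in positions:
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Toy sequence (invented): three close GATC sites, then an isolated one.
toy_sequence = "AAGATCTTGATCGGGATC" + "A" * 50 + "GATC"
positions = find_motif_positions(toy_sequence)
clusters = group_into_clusters(positions)
print(positions)
print(clusters)
```

On a real chromosome the same counts would be compared with those obtained on the model chromosome, which is the step that reveals unusual distributions; that comparison is not shown here.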

Prof. Yiping Wang
College of Life Sciences, Peking University, Beijing, China

A seminar "POPULATION GENETICS AND EVOLUTION" was presented on Friday, July 19, 2002 by
Prof. Yoshio Tateno
Laboratory for Gene Function Research, Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Japan

Prof. Takashi Gojobori
Center for Information Biology and DNA Data Bank of Japan,
National Institute of Genetics, Japan

With the aim of searching for a set of ancestral genes involved in the central nervous system, we first attempted to identify genes expressed in the planarian head. In practice, we sequenced about 10,000 EST clones from the head region of a planarian and identified about 3,000 non-redundant sequences. By conducting an extensive homology search, we found 116 neural-related genes, most of which were homologous to EST clones from the human brain. Taking into account that planarian and human may have diverged from the common ancestor of deuterostomes and protostomes, we strongly suspect that a set of ancestral genes capable of forming a brain already existed at the time of their divergence. Moreover, we developed a planarian cDNA chip. Using these cDNA chips, we found more than 200 genes expressed specifically in the planarian brain. When we extended the EST analysis to the brains of fish, newt, and chicken, we could successfully compare the gene expression profiles with each other by defining a "geometric distance" between two gene expression profiles of interest. We then constructed a phylogenetic tree based upon those distances. The tree obtained was consistent with the so-called "species tree." In this presentation, a possible outline of the evolutionary history of genes involved in the central nervous system and the brain will be discussed from the viewpoint of comparative functional genomics.
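The abstract does not define its "geometric distance". As a hedged sketch, one plausible reading is the Euclidean distance between expression profiles after each profile has been scaled to sum to one. That choice, the function names and the toy numbers below are assumptions for illustration, not the speakers' definition or data.

```python
import math


def normalise(profile):
    """Scale a profile so its values sum to 1 (relative expression)."""
    total = sum(profile)
    return [value / total for value in profile]


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(profiles):
    """Return {(name_a, name_b): distance} for every unordered pair,
    with names taken in alphabetical order."""
    names = sorted(profiles)
    normalised = {name: normalise(profiles[name]) for name in names}
    return {
        (first, second): euclidean_distance(normalised[first], normalised[second])
        for i, first in enumerate(names)
        for second in names[i + 1:]
    }


# Invented toy profiles: expression counts for four hypothetical genes.
toy_profiles = {
    "fish": [10, 20, 30, 40],
    "newt": [12, 18, 30, 40],
    "chicken": [40, 30, 20, 10],
}
distances = distance_matrix(toy_profiles)
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])
```

A matrix of this kind is the usual input to distance-based tree-building methods; the tree construction itself is not shown.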


Speaker: Yi Huang, Centre of Bioinformatics, Beijing Da Xue, Beijing, China

Ms Huang is experienced in the data processing of human genome sequences, including screening, alignment and homology analysis of large-scale sequencing data, and in managing sequence databases. She is conversant with programming applications that connect to international bioinformatics databases and with building intranets. She has participated in team research projects for the 863/973 National Projects of Human Genome. She is interested in computational research on molecular structure and function and in the construction of bioinformatics databases. She intends to pursue further study of algorithms and data/information management.

Over the last several years, to isolate novel genes related to human hepatocellular carcinoma (HCC), we sequenced the P1-derived artificial chromosome PAC579 (D17S926 locus), which maps to the region of chromosome 17p13.3 deleted in HCC. Four novel genes mapping to this genomic sequence were cloned by wet-lab experiments, and the exons of these genes were located. In parallel, 60 kb of this genomic sequence was scanned with 5 computational exon prediction programs and 4 splice site recognition programs. After comparing the computational predictions with the wet-lab results, some further potential exons were predicted in the genomic sequence using these programs.
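The comparison between predicted and experimentally located exons can be pictured as an interval-overlap check. The sketch below is illustrative only: the program names, coordinates and the `min_support` rule (a prediction is kept as a potential new exon when at least two programs agree on it) are assumptions, not details given in the abstract.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end)
    share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def count_supporting_programs(candidate, predictions_by_program):
    """Number of programs with at least one prediction overlapping candidate."""
    return sum(
        any(overlaps(candidate, exon) for exon in exons)
        for exons in predictions_by_program.values()
    )


def classify_predictions(predictions_by_program, confirmed_exons, min_support=2):
    """Split predictions into those matching a confirmed exon and those
    that are unconfirmed but supported by at least min_support programs."""
    confirmed_hits, potential_new = set(), set()
    for exons in predictions_by_program.values():
        for exon in exons:
            if any(overlaps(exon, known) for known in confirmed_exons):
                confirmed_hits.add(exon)
            elif count_supporting_programs(exon, predictions_by_program) >= min_support:
                potential_new.add(exon)
    return sorted(confirmed_hits), sorted(potential_new)


# Invented toy data: two hypothetical programs and one wet-lab exon.
toy_predictions = {
    "program_a": [(100, 200), (500, 600)],
    "program_b": [(110, 190), (520, 610), (900, 950)],
}
toy_confirmed = [(95, 205)]
hits, candidates = classify_predictions(toy_predictions, toy_confirmed)
print(hits)
print(candidates)
```

Requiring agreement between programs is one common way to limit false positives; any real threshold would need to be tuned against the exons already confirmed in the wet lab.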

Two seminars, "DNA SEQUENCING: FROM GELS TO FREE-SOLUTION" and "CHARACTERIZATION OF THE GENE CONVERSION EVENTS IN THE YEAST GENOME", were presented on Friday, November 30, 2001 and Monday, December 03, 2001.

Speaker: Dr Guy Drouin, Biology Department, University of Ottawa, Canada

Seminar 1: DNA sequences are usually read by resolving the bases of DNA sequencing reactions by polyacrylamide gel electrophoresis. As predicted by the reptation model of gel electrophoresis, the maximum number of bases that can be resolved by polyacrylamide electrophoresis is limited to about 1000 bases. However, these "long reads" are only possible when relatively low voltages are used, and such low voltages lead to long electrophoresis times. Using high electric fields to decrease the time of electrophoresis leads to reduced resolution, and the resolution obtained using high electric fields cannot be improved using pulsed fields. We investigated two alternative electrophoresis methods based on the modified migration behaviour of DNA molecules having a protein (streptavidin) attached at their end(s). In the DNA trapping method (Nature 343:190), DNA molecules labeled with streptavidin at one of their ends are trapped by the gel fibers during electrophoresis. This technique requires inverted preruns and is of limited utility because, for a given voltage, the increased interband separation obtained is limited to a narrow size range and leads to broader bands. In the End-Labeled Free-Solution Electrophoresis (ELFSE; Anal. Chem. 66:1777) method, DNA molecules labeled with streptavidin at either one or both of their ends are separated by capillary electrophoresis in the absence of gel. As predicted, DNA fragments can be separated in the absence of gel when one streptavidin molecule is attached at one end of the DNA fragments, and higher resolution is obtained when two streptavidin molecules are attached, one at each end. Furthermore, higher resolution is also obtained at higher voltages (J. Chromatogr. A 806:113). In addition, the ELFSE method can be used to sequence 100 bases in less than 18 minutes (Electrophoresis 20: 2501).
Therefore, this technique could become the electrophoresis method of choice because it does not require gel-filled capillaries and because using higher voltages leads to both higher resolution and faster separations.

Seminar 2: I used the yeast genome databank and the GENECONV method of Stanley Sawyer to characterise the gene conversion events that occurred between the members of the multigene families found in the yeast genome. I found that gene conversions occur at a frequency of 7.8% per pair of genes compared, have an average size of 173 ± 220 nucleotides, that larger gene conversions are found only between more similar genes, that the genes involved in gene conversions are distributed equally among chromosomes, that the frequency of gene conversion increases as the distance between the genes decreases, and that the frequency of gene conversions is independent of the number of genes in a multigene family. In contrast with previous studies, no relationship was observed between the level of expression of a gene and its involvement in gene conversions. These analyses also suggest that gene conversions occur by different mechanisms in linked and unlinked genes. The excess of converted regions at the 3' end of unlinked genes suggests that recombination with incomplete cDNA molecules is the main mechanism responsible for gene conversions between unlinked genes.

A seminar "Global control of methionine biosynthesis by Escherichia coli" was presented on Friday, November 16, 2001.

Speaker: Prof. Mark Levinthal, Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA

E. coli’s genes are regulated to produce optimal growth in a variety of environmental conditions. External environmental conditions such as temperature, oxygen concentration, pH, and nutritional content are sensed and reacted to. Changes in these parameters provoke stress in bacterial cultures. The response to stress involves a change in the bacterial transcriptome, which results in new proteins appearing and other proteins disappearing. Global regulators mediate this stress response. Previous work on this subject has identified several global regulators and the genes that they control. These genes code for specialized proteins, specific to a stress condition, whose activity restores the homeostatic condition. In addition to these highly specialized proteins, certain other genes expressed during homeostatic conditions are regulated. These homeostatic genes respond to changes in the internal environment of the cell. They are frequently controlled by a variety of different global regulators, depending on their role in the cell. The study of the global regulation of homeostatic genes could give us a deeper understanding of the physiology and evolution of their gene expression.

We have chosen to study the genes of methionine biosynthesis. This amino acid has a special role in protein synthesis and one-carbon metabolism. In addition, the methionine regulon is complex but very well understood. Our current results identified the previously known global regulators, H-NS and Lrp, as regulators of the methionine regulon. H-NS is required for the complete expression of the metA gene when methionine is limiting for the cell. Lrp is a powerful repressor of metA expression and an activator of metF expression. We have identified a new global regulator, cspC, that is part of the H-NS mediated regulatory circuit. We have perfected a selection procedure for new mutations in genes globally regulating metA. Our ideas of possible modes of action for these regulators will be discussed.

A seminar "Repeating to change: Variations on a theme around bacterial evolution and genomics" was presented on Thursday, October 26, 2001.

Speaker: Dr. Eduardo P. C. ROCHA, Unité Génétique des Génomes Bactériens, Institut Pasteur & Atelier de BioInformatique, Université Pierre et Marie Curie, France

Due to their compact genomes, prokaryotes have been thought to lack long repeats. To conclude from this that any redundant sequence would be counter-selected was, however, too hasty. Innovation brought about by repeats may confer significant selective advantages in terms of gene transfer, antigenic variation, and genome plasticity. We have for some time been analyzing the distribution of different types of repeats in bacterial genomes and have found different patterns related to the bacteria’s lifestyle and evolutionary history. It seems not only that repeats are positively selected in certain circumstances of bacterial evolution, but also that they are a major motor of this evolution through the rearrangements, integrations and deletions of genetic material that they mediate.

A seminar on Annotation of the Human and Mouse genomes was presented on Monday, October 15, 2001.

To Decipher the Book of Life: Functional Analysis of the entire human and mouse genomes

Speaker: Dr. Bo Yuan, Ph.D., M.D., Director, Assistant Professor, Bioinformatics Group, The Ohio State University

A seminar on Comparative Genomic for Pathogenic Micro-organisms was presented on Thursday, 12 July 2001.

Dr Médigue shared with us the results her research group has obtained on the automatic re-annotation of coding sequences and the detection of frameshift errors in available prokaryotic genomes. In addition, the application of their methodological strategies to the annotation of the Yersinia pestis genome was presented. In combination with assays for function, such in silico genomic approaches facilitate efficient and directed research strategies to elucidate mechanisms of bacterial pathogenicity.

Dr Médigue has extensive research and teaching experience in biological science, bioinformatics, genome annotation and in silico sequence analysis. She has been one of the developers of the bioinformatics module of the HKU-Pasteur Research Centre.

Speaker: Claudine MÉDIGUE, Comparative Genomic for Pathogenic Micro-organisms, GENOPOLE / INFOBIOGEN, France

Topic: Comparative Genomic for Pathogenic Micro-organisms & their Models

For the first time in history, we have access to the entire genetic content of a growing number and variety of living organisms. It is conceivable that the complete nucleotide sequence for all the major human bacterial pathogens will be available by the end of the decade. This explosive growth of information is forcing changes in many scientific disciplines, particularly in computational biology and molecular genetics. One of the challenges is to predict and annotate the functions of the gene products as rapidly and completely as possible, taking into account both molecular interactions and higher order processes such as the regulation of gene expression and metabolic pathways. New infrastructures which integrate specialised databases and various levels of sequence annotation and function prediction are then required.

The main purpose of Dr Médigue’s research group is to extract significant information from available genomic data. To achieve this goal, they focus their research activity on the in silico annotation of microbial genomes, particularly on the design and development of tools which will support exploration of biological data organized in appropriate database structures. They thus (1) develop novel strategies for the in silico annotation of genomes using their platform dedicated to sequence analysis and exploration (Imagene system); (2) design specialised databases for the management and exploitation of bacterial genome data: their Prokaryotic Genome DataBase (PkGDB) gathers information on sequenced bacterial genomes together with additional data coming from their analysis, i.e. identification of wrong databank annotations or new genes, and detection of potential frameshift errors; and (3) apply these methods and tools to the annotation of chosen organisms to identify crucial experiments which will validate (or falsify) their in silico predictions.

A seminar on BIOINFORMATICS was presented twice, on Friday, 20th and on Thursday, 26th April 2001.

Speaker 1: Dr. Ivan MOSZER, Genetics of Bacterial Genomes, Institut Pasteur, France

Topic: The SubtiList & GenoList microbial databases:
Linking genome & transcriptome data

The central theme of the work developed at the Regulation of Gene Expression Unit, created in 1986, is the identification of the regulation processes that involve small metabolic molecules and allow the coordination of gene expression, both in Escherichia coli and in Bacillus subtilis. Over all these years, the Unit has created a world-class research consortium with top-notch scientists for the development of research in computer sciences devoted to the study of genomes. The scientific coordination of annotation and management of the genome data was performed by the Unit, and in particular by Mr. Ivan Moszer, who constructed the reference database, SubtiList, which is regularly maintained in the Unit. He will introduce the functions, implementation and development of the SubtiList & GenoList microbial databases, which may help provide an adequate framework for collecting, querying and analyzing transcriptome data. Mr. Moszer obtained his PhD in Genetics (Bioinformatics) from University Paris 6. He has extensive research and teaching experience in biology and computer science. He is now one of the key drivers in the development of the bioinformatics module of the HKU-Pasteur Research Centre.

Speaker 2: Gang FANG, Centre of Bioinformatics, Peking University, PRC

Topic: The Development of the Centre of Bioinformatics in Peking University

The Centre of Bioinformatics (CBI) at Peking University was founded in 1997 and is now one of the pioneering centres of this kind in PRC. As one of the first M.S. students and curators for the CBI, Mr. Fang has much knowledge to share on matters ranging from how to mirror (e.g. ExPASy, GDB, TransFac and GenoList) and maintain bioinformatics databases (e.g. EMBL, GenBank, SwissProt, PIR, PDB) to the applications of different database searching, query and analysis tools (e.g. BLAST, SRS, Proteomics, Predict Protein, GCG, Staden and WHAT IF). A graduate of the College of Life Sciences, Peking University, Mr. Fang is competent in both biology and computer science. He is interested in Secondary Biological Database Design, Molecular Pattern Recognition and Artificial Neural Network in Biology.

Speaker: Dr. Mathias SPRINGER, Service de Biochimie, Institut de Biologie Physico-Chimique, France

About the HKU-Pasteur Research Centre

The Centre will promote scientific and technological research in the fields of microbiology, immunology and related disciplines, as well as education, teaching and learning. With a state-of-the-art laboratory, scientists in the Centre will work on the genetics of microbial genomes so as to generate the knowledge, skills and training needed to tackle emerging pathogens.

A goal of the Centre is to integrate various scientific approaches into a single entity. This interactive area of research is likely to result in a variety of potential new discoveries in domains such as cell metabolism or the development of antibacterial agents, which in turn will contribute to the study and surveillance of emerging and re-emerging infectious diseases. The Centre will also study the application of functional genomics to medical and industrial or agricultural purposes.

Thank you for your participation.

HKU-Pasteur RCL – I/F, Dexter HC Man Building - 8. Sassoon Road - Pok Fu Lam Hong Kong
tel. : (852) 2816 8403 – fax : (852) 2872 5782 - e-mail :