Regulation of transcript levels, intermediate phenotypes between genotype and phenotype

Focus on long non-coding RNA (lncRNA) genes in hens and lipid-related phenotypes

Better understanding of the genotype-phenotype relationship

Since the emergence of new sequencing technologies in the early 2010s, our knowledge of genomes has steadily improved, opening the way to a better understanding of the genotype-phenotype relationship. Beyond the acquisition of new fundamental knowledge on phenotype variation and plasticity, a better understanding of this relationship in livestock is key to improving genetic predictions and genomic selection. It is now accepted that most of the polymorphisms responsible for the variation of complex traits are located in non-coding regions of the genome, which presumably regulate the abundance of gene transcripts. Variation in gene expression, controlled by both genetics and the environment, is therefore a major component of the genotype-phenotype relationship.

In this context, and within the framework of the international FAANG project [1] and the associated INRAE FR-AgENCODE project [2], we are interested in long non-coding RNA (lncRNA) genes, which are thought to be important regulators given their number, their mechanisms of action and their varied properties (interaction with DNA, RNA or proteins). We recently extended the Ensembl reference annotation of the chicken genome [3], adding more than 12,000 lncRNA genes and bringing their number to about 17,000, comparable to the number of protein-coding genes (PCGs). This addition greatly enlarges the set of regulators that can modulate gene expression and thereby act on phenotypes.

Study of the expression regulation of protein-coding genes

The first objective of this thesis is to provide new knowledge on the regulation of PCG and lncRNA expression, and on the regulatory action of lncRNAs on PCGs. The first step is to further improve the annotation of genes in the chicken genome. The second step is to provide a list of genome regions, called expression quantitative trait loci (eQTL), that explain part of the variation in transcript levels, using 250 RNA-seq samples (100 of which already exist) from a commercial line. Two approaches will be used. The first is a genome-wide association study (GWAS), a statistical method that looks, within a population, for genome regions associated with variation in transcript levels. The second looks for genes that are expressed differently from the two alleles of the same individual, which reflects the existence of a variant regulating the expression of the gene. This second method, called allele-specific expression (ASE) analysis, has become possible thanks to RNA-seq data.
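To make the two approaches more concrete, here is a minimal sketch in Python, given for illustration only and not the pipeline used in the thesis. It assumes that an eQTL signal can be summarised by the least-squares slope of expression on genotype dosage (0, 1 or 2 copies of the alternate allele), and that ASE at a heterozygous site can be tested with an exact two-sided binomial test against an expected 50/50 allelic ratio. The function names `eqtl_slope` and `ase_binomial_p` and all input values are hypothetical.

```python
from math import comb


def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on genotype dosage (0, 1 or 2).

    Assumption: one variant, one gene, no covariates. A real eQTL GWAS
    would also correct for population structure and batch effects.
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    return sxy / sxx


def ase_binomial_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Assumption: reads covering a heterozygous site are independent draws,
    with probability p of carrying the reference allele. Intended for
    modest read depths (a few hundred reads at most).
    """
    n = ref_reads + alt_reads
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = probs[ref_reads]
    # Sum the probabilities of all outcomes at least as unlikely as the
    # observed one; the small tolerance absorbs floating-point ties.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


# Hypothetical data: expression rises by one unit per alternate allele.
slope = eqtl_slope([0, 1, 2, 0, 1, 2], [1.0, 2.0, 3.0, 1.0, 2.0, 3.0])

# Hypothetical read counts at one heterozygous site of one individual.
balanced = ase_binomial_p(10, 10)   # equal allelic counts
skewed = ase_binomial_p(20, 0)      # all reads from one allele
```

In this sketch, a slope far from zero suggests an association between the variant and the transcript level across individuals, whereas a small binomial p-value suggests allelic imbalance within a single individual.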

Figure 1: Understanding the regulation of PCGs by lncRNAs in the liver and its impact on lipid phenotypes

Study of genes involved in lipid metabolism

The focus will be on the transcriptome of the liver, a major organ of energy homeostasis, with priority given to genes involved in lipid metabolism, which is associated with many traits and diseases of importance in agronomy and animal health. The detected eQTL regions will then be compared with genome regions (QTL) associated with lipid-related phenotypes of varying complexity, ranging from elementary phenotypes in the studied tissue (e.g. cholesterol and individual fatty acids in the liver) to more complex phenotypes measured at the animal level (e.g. body adiposity).
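The comparison between eQTL and QTL regions can be pictured as an interval-overlap search. The sketch below is only an illustration under simple assumptions: each region is described by a chromosome, 1-based inclusive start and end positions, and a label. The `Region` type, the `overlapping_pairs` function and all coordinates and labels are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    label: str


def overlapping_pairs(eqtls: List[Region], qtls: List[Region]) -> List[Tuple[str, str]]:
    """Return (eQTL label, QTL label) pairs whose intervals overlap.

    Two intervals on the same chromosome overlap when each one starts
    before the other ends. The double loop is quadratic, which is fine
    for a sketch; sorted or indexed intervals would scale better.
    """
    pairs = []
    for e in eqtls:
        for q in qtls:
            if e.chrom == q.chrom and e.start <= q.end and q.start <= e.end:
                pairs.append((e.label, q.label))
    return pairs


# Hypothetical regions, for illustration only.
eqtls = [
    Region("1", 1_000_000, 1_200_000, "geneA_eQTL"),
    Region("2", 5_000_000, 5_100_000, "geneB_eQTL"),
]
qtls = [
    Region("1", 1_150_000, 1_400_000, "liver_cholesterol_QTL"),
    Region("2", 9_000_000, 9_500_000, "body_adiposity_QTL"),
]

shared = overlapping_pairs(eqtls, qtls)
```

With these hypothetical inputs, only the first eQTL shares positions with a QTL, which would make the gene it regulates a candidate for the associated phenotype.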

Map of regulatory genome regions and interest of eQTL approaches

This thesis should provide a map of the genome regions regulating hepatic gene expression in the chicken and evaluate, in more or less favorable cases (elementary vs. complex phenotypes), the value of eQTL approaches for identifying the genes responsible for the variation of phenotypes of interest. Moreover, because chickens and mammals diverged about 300 million years ago, the lncRNA genes identified in chickens as regulators of lipid metabolism will also be searched for in humans and mice. Regulators found in all three species can be considered major regulators of lipid metabolism, since they have been conserved through evolution.

Fabien Degalez has been working on this subject since November 2020, for a period of three years. He is supervised by Sandrine Lagarrigue in the genetics and genomics team.

Contact

Fabien Degalez (PhD student): fabien.degalez[at]inrae.fr
Sandrine Lagarrigue (supervisor): sandrine.lagarrigue[at]agrocampus-ouest.fr

References

1. FAANG data portal. https://data.faang.org/#
2. FR-AgENCODE · functional annotation of livestock genomes. http://www.fragencode.org/
3. Jehl F, Muret K, Bernard M, Boutin M, Lagoutte L, Désert C, et al. An integrative atlas of chicken long non-coding genes and their annotations across 25 tissues. Scientific Reports. 2020;10:20457.