Translational Genomics: Lecture 1
Introduction and Genomic Architecture – 07-09-2023
In the near future, we expect to be able to sequence the genome at birth. This will allow us to immediately diagnose monogenic diseases, use genomic data in personalized treatment protocols, and build predictive profiles, and thus health advice, for late-onset diseases. However, it also opens the door to DNA-food, DNA-dating, DNA-jobs and DNA-discrimination.
Currently, personalized medicine can be divided into 4 parts: assess risk, refine assessment, predict/diagnose, and monitor progression/prevent events/inform therapeutics.
The baseline risk is assessed from the stable genome via haplotype mapping, gene sequencing and single-nucleotide polymorphisms. Combined with measures of preclinical progression (gene expression, proteomics, metabolomics and clinical risk models), this leads to a treatment plan, i.e. a therapeutic decision.
Currently, getting a diagnosis can take a long time and involve many different doctors before the patient really knows what the disease is. In the future, we want personal diagnosis, prognosis, disease management and treatment.
This lecture covers the human genome, functional DNA, “junk” DNA and the epigenome.
The Human Genome
DNA is found in the nucleus and in the mitochondria of cells. The nucleus contains 22 pairs of autosomes and 2 sex chromosomes (either XX or XY). The nuclear genome contains ~20,000 coding genes and ~25,000 non-coding genes.
To build a mouse model, one needs to create the equivalent deletion on the corresponding mouse chromosome. It is therefore important to know which chromosomes are alike in human and mouse.
Base pairs are either C-G or A-T. The start of a gene (exon 1) contains a lot of G-C base pairs to prevent an accidental ATG, which would cause translation of the gene to start too early.
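The point about G-C rich first exons can be illustrated with a small sketch: a sequence made up only of G and C cannot contain the start codon ATG. The two toy sequences and the helper names `gc_content` and `atg_positions` below are hypothetical and chosen for illustration only; they are not real exon data.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atg_positions(seq: str) -> list[int]:
    """Return the 0-based start position of every ATG triplet in the sequence."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]


# Hypothetical toy sequences (assumption: made up for this example).
gc_rich = "GCGGCCGCGGGCCGCGCC"
at_rich = "ATGATATTAATGTTAATA"

for name, seq in (("GC-rich", gc_rich), ("AT-rich", at_rich)):
    print(name, round(gc_content(seq), 2), atg_positions(seq))
```

The scan checks all three reading frames at once, since an accidental ATG in any frame could in principle act as a start codon.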
Functional DNA
Functional DNA can be divided into protein-coding genes, non-coding genes and regulatory elements.
Protein Coding Genes
The different parts of a protein-coding gene are shown in the accompanying image. Splicing can differ depending on the organ in which the gene is expressed (e.g. liver versus brain), resulting in different isoforms.
Non-Coding Genes
Non-coding genes can be divided into long non-coding RNAs and small non-coding RNAs. The latter can be subdivided into piRNAs, siRNAs and miRNAs. They all play a role in the regulation of genes and thus also in disease.
miRNA, siRNA and piRNA precursors are transcribed and processed from small RNA loci. They undergo a Dicer-dependent (miRNA and siRNA) or a Dicer-independent (piRNA) process to turn into mature small RNAs. They can then form the human RISC complex and target genes: miRNA targets coding genes, siRNA targets transposons and exogenous genes, and piRNA targets transposons and other genes. The RISC complexes can inhibit translation initiation or elongation, or cause mRNA deadenylation. In genome browsers, miRNAs and siRNAs look like little blocks. Feingold syndrome 2 is caused by a deletion of miR-17~92.
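The three small RNA classes described above can be summarized as a small lookup structure. This is only a sketch of the classification as given in these notes; the names `SmallRNAClass`, `SMALL_RNA_CLASSES` and `classes_targeting` are hypothetical and do not come from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmallRNAClass:
    """One class of small non-coding RNA, as summarized in the lecture notes."""
    name: str
    dicer_dependent: bool      # True if maturation requires Dicer
    targets: tuple[str, ...]   # what the mature RNA targets


# Content taken from the notes; the structure itself is an illustrative assumption.
SMALL_RNA_CLASSES = {
    "miRNA": SmallRNAClass("miRNA", True, ("coding genes",)),
    "siRNA": SmallRNAClass("siRNA", True, ("transposons", "exogenous genes")),
    "piRNA": SmallRNAClass("piRNA", False, ("transposons", "other genes")),
}


def classes_targeting(target: str) -> list[str]:
    """Return the names of all small RNA classes that list the given target."""
    return sorted(
        name for name, cls in SMALL_RNA_CLASSES.items() if target in cls.targets
    )


dicer_independent = [
    name for name, cls in SMALL_RNA_CLASSES.items() if not cls.dicer_dependent
]
print(classes_targeting("transposons"))
print(dicer_independent)
```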
Long non-coding RNAs (lncRNAs) can be intronic, intergenic or natural antisense transcripts (NATs). They can be found anywhere in the genome. They exert their effects through protein binding, DNA binding or RNA binding. This can result in blocking of RNA pol II initiation, different splicing or nuclear retention, in which gene expression is inhibited.
Regulatory Elements
Regulatory elements include the locus control region (LCR), insulators, silencers and enhancers, but also the proximal promoter elements and core promoters.
The core promoters contain:
• the BRE (B recognition element), to which TFIIB (transcription factor IIB) binds;
• the TATA box, to which the TBP (TATA-binding protein) binds;
• the Inr (initiator element/motif), to which TAF1/2 (TATA-box binding protein associated factor 1/2) binds;
• the MTE (motif ten element);
• the DPE (downstream promoter element), to which TAF6/9 binds;
• the DCE (downstream core element), to which TAF1 binds.
“Junk” DNA
“Junk” DNA mostly refers to transposons/transposable elements, which are important for structural integrity. About 45% of the human genome is built of transposable elements, but less than 0.05% is active. The most abundant are the Alu elements (making up 10% of the human genome).
The Epigenome
The most extreme example of the effect of epigenetic modification on gene expression is X-inactivation. DNA methylation is the process in which a methyl group is placed on a C base that is followed by a G base (a CpG site). This leads to imprinting and thus inactivation.
During gametogenesis (for example spermatogenesis), the old imprints are erased and new, sex-specific imprints are made.
Genomic imprinting is essential for normal development, so deregulation of this process results in complex genetic diseases such as Prader-Willi syndrome (PWS) and Angelman syndrome (AS), which are both caused by a deletion of the 15q11-13 region. There are about 100 imprinted genetic loci.
Angelman syndrome is caused by a maternal deletion, leading to loss of maternal gene expression, and Prader-Willi syndrome is caused by a paternal deletion, leading to loss of paternal gene expression.
Angelman syndrome leads to intellectual disability, random laughing, ataxia, no speech, epilepsy, a typical face and friendliness. PWS leads to hypotonia and feeding problems in newborns, and to obesity, lack of growth, a mild intellectual disability, hypogonadism and behavioural problems in the first decade of life.
Uniparental disomy, in which both members of a chromosome pair are inherited from one parent, also leads to AS or PWS, depending on which parent the chromosome pair was inherited from (PWS for maternal and AS for paternal).
Questions
Question 1: There are multiple regions in which the causative gene might reside. In one of the regions, on chr18, a deletion of MIR122HG has been found. Argue based on Fig.1 what type of gene this is and how it is processed.
It is a non-coding RNA gene: it does not code for protein, because the lines in the figure aren’t “thick” (thick blocks mark coding sequence), and the name of the gene starts with MIR, indicating that it is a miRNA. HG stands for host gene. After processing (which happens through Drosha/Dicer), the gene gives rise to the hairpins MIR122 and MIR3591.
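Returning to the parent-of-origin rules for the 15q11-13 region described under The Epigenome: the outcome depends only on which parental contribution is missing. The sketch below encodes that rule as stated in these notes; the function name `syndrome_15q11_13` and its string arguments are hypothetical, and it ignores other mechanisms such as imprinting defects or single-gene mutations.

```python
def syndrome_15q11_13(event: str, parent: str) -> str:
    """Return 'AS' or 'PWS' for an event affecting the 15q11-13 region.

    event:  'deletion' (parent = the parent whose copy is deleted) or
            'upd' (uniparental disomy; parent = the parent who supplied both copies).
    parent: 'maternal' or 'paternal'.
    """
    event, parent = event.lower(), parent.lower()
    if parent not in ("maternal", "paternal"):
        raise ValueError(f"unknown parent: {parent!r}")

    if event == "deletion":
        missing = parent  # the deleted copy is the missing contribution
    elif event == "upd":
        # Both copies come from one parent, so the other parent's copy is missing.
        missing = "paternal" if parent == "maternal" else "maternal"
    else:
        raise ValueError(f"unknown event: {event!r}")

    # Missing maternal expression -> Angelman; missing paternal -> Prader-Willi.
    return "AS" if missing == "maternal" else "PWS"


for event in ("deletion", "upd"):
    for parent in ("maternal", "paternal"):
        print(event, parent, syndrome_15q11_13(event, parent))
```

Reducing both mechanisms to "which parental copy is missing" shows why a maternal deletion and paternal uniparental disomy give the same syndrome.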