Introduction to Genetics
Debashis Ghosh
Professor and Chair, Biostatistics and Informatics, ColoradoSPH
Question we tackle today
• What do we mean by a gene?
• Steve Mount (ongenetics.blogspot.com): “A gene is all of the DNA elements required in cis for the properly regulated production of a set of RNAs whose sequences overlap in the genome.”
• Mark Gerstein (2007, Genome Research): “The gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products”
What is a gene?
• No “one-size-fits-all” definition
• The previous definitions are useful to contextualize data that are generated from experiments
• Thinking carefully about evolution and the constraints it has placed on functions is also important
From Genotype to Phenotype
• Full genotypes (genomes) are coming… But inheritance is complex
• Genetic markers are characters inherited in a way that is simple enough to easily track
• Want to find genetic markers that explain or predict phenotypes
  – e.g., disease, susceptibility
  – Ideally, the marker would be causative
    • But that is rare
Alleles as Genes
• At each gene locus, we have two alleles, one transmitted to us by our father, and one by our mother.
• Usual assumption: Each parent randomly transmits one of his/her alleles to the child
• In real datasets, alleles are typically observed as DNA variants, most commonly single-nucleotide polymorphisms (SNPs)
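The random-transmission assumption can be illustrated with a short Python sketch that enumerates the four equally likely allele combinations a child can receive. The function name `offspring_genotype_probs` and the allele labels `A` and `a` are placeholders chosen for this example; they do not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotype_probs(mother, father):
    """Return {genotype: probability} for a child of the two parents.

    Each parent is a 2-character string of alleles, e.g. "Aa", and is
    assumed to transmit either allele with probability 1/2.
    """
    counts = Counter()
    for m_allele, f_allele in product(mother, father):
        # Sort so that "aA" and "Aa" are counted as the same genotype.
        genotype = "".join(sorted(m_allele + f_allele))
        counts[genotype] += 1
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


# Cross of two heterozygotes: the classic 1:2:1 genotype ratio.
cross = offspring_genotype_probs("Aa", "Aa")
for genotype, prob in sorted(cross.items()):
    print(genotype, prob)
```

If `A` is dominant, the `AA` and `Aa` classes look alike, so the same cross gives a 3:1 ratio of phenotypes.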
Diploid Inheritance
[Figure: two pairs of alleles, each with one copy from Mom and one from Dad; one pair is labeled Heterozygote, the other Homozygote]
Phenotypic Dominance
[Figure: a heterozygote (one allele from Mom, one from Dad) under three scenarios: light blue dominant and dark blue recessive; dark blue dominant and light blue recessive; mixed dominance]
Diploid Inheritance
[Figure: heterozygote and homozygote; dark blue is dominant, so the recessive phenotype is only visible in the homozygote]
Mendelian Ratios
Recombination
[Figure: chromosomal segment in Mom (she’s a diploid, remember), with one copy from Grandma and one from Grandpa; chromosomal segment in you (you’re diploid too), with one copy from Mom and one from Dad]
Crossing Over
[Figure: non-sister chromatids of the homologous chromosomes from Grandma and Grandpa recombine (cross over) during meiosis; of the products of meiosis, one is inherited by you and the others are lost (except in tetrad analysis)]
Recombination: Basic Points
• Recombination switches which chromosome in the parent (i.e., originating from which grandparent) is passed along to the offspring
• Alleles physically adjacent on a chromosome are more likely to be passed on together than alleles far apart
• Alleles very far apart or on different chromosomes are inherited independently of each other
Finding Disease Genes
• Assemble data set of probands
• Assemble data set of control population
• Might have pedigree if runs in families
• Might have trios to determine linkage
  – Proband plus two parents
• Look for linkage between genetic markers and disease
  – In pedigree
  – In dataset of less related individuals
Genetic Markers
• Polymorphic in population
  – Different variants in different individuals
  – Single Nucleotide Polymorphism (SNP)
  – Variable Number of Tandem Repeats (VNTR)
    • Minisatellites
  – Short Tandem Repeats (STR)
    • Microsatellites
    • Very high mutation rate: strand slippage
• Haplotype
  – A set of closely linked SNPs inherited as a unit
Linkage Analysis
• Set of variable markers distributed throughout genome
• Identify linked regions (haplotypes) that cosegregate (are inherited together) with the disease or trait
Pedigree Analysis
• Tabulate the occurrence of a trait in an extended family
  – Pedigree is family’s mating history
Assumptions and Complications
• Single gene with Mendelian inheritance
  – Best use of extended families
  – Few extended families with trait
• Quantitative traits are multigenic
  – Includes most widespread or “common” inherited diseases
  – Sib pairs are best for complex traits with incomplete penetrance (see next slide)
Incomplete Penetrance
• Not everyone with the genotype will have the disease
  – Delayed or adult onset
  – Mild or undetectable symptoms
  – Environmental and developmental factors
  – Unknown genetic factors
• Disease allele = increased probability of disease (relative risk)
• We don’t always know in pedigree who has the disease genotype!
Evaluating Linkage
• Remember, an individual is a recombinant with respect to two genes, A and B, if he or she inherits the allele from one parental chromosome at A and the allele from the other parental chromosome at B
• The recombination fraction $\theta$ is the probability that a child is recombinant
• If A and B are tightly linked, then $\theta$ is small
Simple LOD Scores
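• Total number of offspring, P
• Number of recombinant offspring, R
• Likelihood of the data = $\theta^{R}(1-\theta)^{P-R}$
• Maximum likelihood estimate $\hat{\theta} = R/P$
• LOD score for linkage in pedigree is $\mathrm{LOD} = \log_{10}\left[ L(\hat{\theta}) / L(1/2) \right]$, where $L(\theta)$ is the likelihood of the data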
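The quantities on this slide follow from a short derivation. It is a sketch under the usual simplifying assumptions: phase is known, the P offspring are scored independently, and $\theta$ denotes the recombination fraction (the symbol is assumed here, since the formulas are missing from the extracted slide text).

```latex
\begin{align*}
L(\theta) &= \theta^{R}(1-\theta)^{P-R} \\
\log L(\theta) &= R\log\theta + (P-R)\log(1-\theta) \\
\frac{d}{d\theta}\log L(\theta) &= \frac{R}{\theta} - \frac{P-R}{1-\theta} = 0
  \quad\Longrightarrow\quad \hat{\theta} = \frac{R}{P} \\
\mathrm{LOD}(\theta) &= \log_{10}\frac{L(\theta)}{L(1/2)}
  = R\log_{10}\theta + (P-R)\log_{10}(1-\theta) + P\log_{10}2
\end{align*}
```

The denominator $L(1/2) = (1/2)^{P}$ is the likelihood under no linkage, so a LOD of 3 corresponds to the data being 1000 times more likely under linkage at $\theta$ than under free recombination. By convention $\hat{\theta}$ is capped at 1/2.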
Complications
• Need to know phase, genotypes of parents, to identify recombinants
  – Can estimate informativeness of additional data depending on heterozygosity of markers
• Many disease versus marker comparisons are involved
  – Multiple comparisons
  – But, markers are not independent
• Population structure
• LOD scores > 3 (odds of 1000:1) give a general sense of linkage; > 5 is very strong
Population structure
• Genetic markers have different patterns in different populations; this can confound associations between genetic markers and disease phenotypes.
Realistic Complications
• Include Penetrance(X|G)
  – Likelihood of observing trait X given the genotype G
• Prior(G)
  – Likelihood of observing the genotype in an individual
• Transmit(Gm|Gk,Gl, $\theta$)
  – Probability that offspring will have genotype Gm given parental genotypes Gk and Gl, and the recombination parameter $\theta$
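To show how the three terms combine, here is a minimal Python sketch of the likelihood for a single father-mother-child trio whose genotypes are unobserved. It makes simplifying assumptions that are not in the slides: one biallelic locus (so the recombination parameter drops out of Transmit), Hardy-Weinberg proportions for Prior, and made-up values for the allele frequency `q` and the penetrances `f`. All function names are illustrative.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # number of copies of the disease allele


def prior(g, q):
    """Prior(G): Hardy-Weinberg genotype frequency for allele frequency q."""
    return ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)[g]


def transmit(child, father, mother):
    """Transmit(Gm | Gk, Gl) at a single locus (Mendelian segregation)."""
    pf, pm = father / 2, mother / 2  # chance each parent passes the disease allele
    return ((1 - pf) * (1 - pm),
            pf * (1 - pm) + (1 - pf) * pm,
            pf * pm)[child]


def penetrance(affected, g, f):
    """Penetrance(X | G): f[g] is the probability of disease given genotype g."""
    return f[g] if affected else 1 - f[g]


def trio_likelihood(phenotypes, q, f):
    """Likelihood of (father, mother, child) affection status,
    summed over all 27 combinations of unobserved genotypes."""
    x_father, x_mother, x_child = phenotypes
    total = 0.0
    for gf, gm, gc in product(GENOTYPES, repeat=3):
        total += (prior(gf, q) * prior(gm, q) * transmit(gc, gf, gm)
                  * penetrance(x_father, gf, f)
                  * penetrance(x_mother, gm, f)
                  * penetrance(x_child, gc, f))
    return total


q = 0.01              # hypothetical disease-allele frequency
f = (0.01, 0.8, 0.8)  # hypothetical penetrances for 0, 1, 2 copies
print(trio_likelihood((False, True, True), q, f))
```

Founders contribute a Prior term, non-founders a Transmit term, and every phenotyped person a Penetrance term. Larger pedigrees follow the same pattern, though real software avoids the brute-force sum.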
LOD Graph
• Can look at LOD score over a range of $\theta$'s, not just the MLE.
• Usual assumption is LOD > 3 is evidence for linkage, LOD < -2 is evidence for exclusion
Example: 27 recombinants out of 139 gametes (example from S. Purcell)
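The curve for this example can be reproduced with a few lines of Python. The sketch assumes the simple phase-known binomial likelihood, writes the recombination fraction as `theta`, and uses an arbitrary grid of 0.01 to 0.50; the function name `lod_score` is illustrative.

```python
import math


def lod_score(theta, recombinants, gametes):
    """log10 of L(theta) / L(1/2) for phase-known gametes."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must be strictly between 0 and 1")
    nonrecombinants = gametes - recombinants
    log10_l_theta = (recombinants * math.log10(theta)
                     + nonrecombinants * math.log10(1.0 - theta))
    log10_l_null = gametes * math.log10(0.5)
    return log10_l_theta - log10_l_null


R, P = 27, 139                           # counts from the slide's example
theta_hat = R / P                        # maximum likelihood estimate
grid = [i / 100 for i in range(1, 51)]   # theta = 0.01, 0.02, ..., 0.50
curve = [(t, lod_score(t, R, P)) for t in grid]
best_theta, best_lod = max(curve, key=lambda pair: pair[1])

print(f"MLE theta = {theta_hat:.3f}, LOD at MLE = {lod_score(theta_hat, R, P):.2f}")
print(f"grid maximum at theta = {best_theta:.2f}, LOD = {best_lod:.2f}")
```

Plotting `curve` gives the LOD graph. Values of `theta` where the curve drops below -2 would be excluded under the convention above.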
Recombination Probability and Distance along Chromosome
• Recombination fraction does not increase linearly with distance
  – Multiple recombination events possible over greater distances, but also interference
• Can estimate genetic distance from recombination rates
  – Measure in Morgans, or cM
  – Map distance, the expected number of crossovers, is additive
Mapping Functions
• Haldane’s mapping function
  – Crossovers are assumed random and independent
• Kosambi’s mapping function
  – Models interference: crossovers not too close
  – Most popular
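Both mapping functions have simple closed forms, sketched below in Python. The slide does not give the formulas, so these are the standard textbook versions, with map distance `d` in Morgans and recombination fraction `theta`. The function names are illustrative.

```python
import math


def haldane_theta(d_morgans):
    """Recombination fraction for map distance d, assuming no interference."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def haldane_distance(theta):
    """Inverse of haldane_theta; requires 0 <= theta < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * theta)


def kosambi_theta(d_morgans):
    """Recombination fraction for map distance d, allowing for interference."""
    return 0.5 * math.tanh(2.0 * d_morgans)


def kosambi_distance(theta):
    """Inverse of kosambi_theta; requires 0 <= theta < 0.5."""
    return 0.25 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


for cm in (1, 10, 50, 100):
    d = cm / 100.0  # centimorgans -> Morgans
    print(f"{cm:>3} cM  Haldane theta = {haldane_theta(d):.4f}  "
          f"Kosambi theta = {kosambi_theta(d):.4f}")
```

At short distances both functions give `theta` close to `d`. At long distances both level off at 0.5, which is why recombination fractions are not additive while map distances are.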
Genetic versus Physical
• Mapping is not simple
  – Recombination rate varies along chromosomes
• Male versus Female
  – Men: 28.51 M over whole genome
    • 1.05 Mb/cM
  – Women: 42.96 M (excluding X)
    • 0.88 Mb/cM
• In Drosophila, about 0.4 Mb/cM
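Modeling Penetrance
• Single locus, three genotypes, with penetrances $f_0, f_1, f_2$ for 0, 1, or 2 copies of the disease allele
• If $f_0 = 0$ and $f_1 = f_2 = 1$
  – Disease is Mendelian dominant
• If $f_0 = f_1 = 0$ and $f_2 = 1$
  – Disease is Mendelian recessive
• Spontaneous mutations: $f_0 > 0$
• Incomplete penetrance: $f_1 < 1$ or $f_2 < 1$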
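A penetrance model for a single locus can be written as a vector of three probabilities, one per genotype. The Python sketch below uses the notation `(f0, f1, f2)` for carriers of 0, 1, or 2 disease alleles; that notation, Hardy-Weinberg genotype frequencies, the allele frequency `q`, and the two non-Mendelian vectors are assumptions made for illustration, not values from the slides.

```python
import math


def hwe_frequencies(q):
    """Genotype frequencies for 0, 1, 2 copies of an allele with frequency q."""
    return ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)


def prevalence(f, q):
    """Population disease frequency: sum over genotypes of Prior(G) * f_G."""
    return sum(freq * pen for freq, pen in zip(hwe_frequencies(q), f))


def relative_risk(f, copies):
    """Risk for carriers of `copies` disease alleles relative to non-carriers.

    Returns infinity when non-carriers are never affected but carriers can be,
    and NaN when both risks are zero.
    """
    if f[0] == 0:
        return math.inf if f[copies] > 0 else math.nan
    return f[copies] / f[0]


# Penetrance vectors (f0, f1, f2); the non-Mendelian values are made up.
MODELS = {
    "Mendelian dominant": (0.0, 1.0, 1.0),
    "Mendelian recessive": (0.0, 0.0, 1.0),
    "dominant with sporadic cases": (0.02, 1.0, 1.0),
    "dominant, incomplete penetrance": (0.0, 0.6, 0.6),
}

q = 0.1  # hypothetical disease-allele frequency
for name, f in MODELS.items():
    print(f"{name}: prevalence = {prevalence(f, q):.4f}, "
          f"relative risk (1 copy) = {relative_risk(f, 1)}")
```

A nonzero `f0` means some affected individuals do not carry the disease allele, and a carrier penetrance below 1 means some carriers are unaffected. Both make it harder to infer genotype from phenotype in a pedigree.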
Extending Analysis
• SNPs scattered throughout genome
  – LOD scores for regions, not individual markers
• Multipoint linkage analysis
  – Establish order relationship among 3+ markers
• Non-parametric analysis can be better for complex traits, incomplete penetrance
  – Work with affected siblings
  – Less statistical power than model-based methods
• Identical by descent (IBD) versus chance
Non-Parametric
• Concerning siblings or other relatives
  – Need “both affected” and “only one affected” pairs
• Correlate shared IBD alleles with affected state, proportion in two classes
  – High correlation means linkage to disease
Mention T1D
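One simple version of this comparison is sketched below in Python. It relies on the standard result that, at a locus unlinked to disease, full siblings share 0, 1, or 2 alleles IBD with probabilities 1/4, 1/2, 1/4. The pair counts are hypothetical, and the goodness-of-fit test is only one of several statistics used for sib-pair data.

```python
import math

NULL_IBD = (0.25, 0.50, 0.25)  # sib pairs sharing 0, 1, 2 alleles IBD, no linkage


def mean_ibd_proportion(counts):
    """Average proportion of alleles shared IBD (0.5 expected without linkage)."""
    n = sum(counts)
    return (0.5 * counts[1] + 1.0 * counts[2]) / n


def ibd_chi_square(counts):
    """Goodness-of-fit statistic (2 df) against the no-linkage proportions."""
    n = sum(counts)
    expected = [n * p for p in NULL_IBD]
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))


def chi2_2df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-stat / 2.0)


# Hypothetical counts of sib pairs sharing 0, 1, 2 alleles IBD at a marker.
both_affected = (15, 48, 37)
one_affected = (30, 50, 20)

stat = ibd_chi_square(both_affected)
print(f"both affected: mean sharing = {mean_ibd_proportion(both_affected):.2f}, "
      f"chi-square = {stat:.2f}, p = {chi2_2df_pvalue(stat):.4f}")
print(f"one affected:  mean sharing = {mean_ibd_proportion(one_affected):.2f}")
```

Excess sharing in the “both affected” class, together with reduced sharing in the “only one affected” class, is the pattern that points to linkage between the marker and the disease.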
(Genomewide) Association Studies
• Correlate markers with disease over a large population
• Marker may itself be the disease variant (rare)
• Large regions of chromosome in linkage disequilibrium with disease allele
  – Marker is in disease gene haplotype
• Regions of chromosome tend to be inherited as a unit
  – Tapers off over time due to recombination
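Linkage disequilibrium and its decay can be made concrete with a small Python sketch. It uses the standard measures D and r² for two biallelic loci, and the textbook result that D shrinks by a factor of (1 - theta) per generation under random mating in a large population. The frequencies and the recombination fraction below are hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r_squared) from the AB haplotype frequency and the
    frequencies of allele A and allele B."""
    d = p_ab - p_a * p_b
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


def ld_after_generations(d0, theta, generations):
    """Expected D after t generations: D_t = D_0 * (1 - theta) ** t."""
    return d0 * (1.0 - theta) ** generations


# Hypothetical haplotype and allele frequencies.
d0, r2 = linkage_disequilibrium(p_ab=0.30, p_a=0.40, p_b=0.50)
print(f"D = {d0:.3f}, r^2 = {r2:.3f}")

for t in (0, 10, 100, 1000):
    print(f"after {t:>4} generations, D = {ld_after_generations(d0, 0.01, t):.4f}")
```

Markers very close to the disease allele (small `theta`) keep their association for many generations, which is why a marker can tag a disease variant without being causal.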
Association Studies
• Linkage disequilibrium varies among populations
  – Depends on population structure, age
    • coalescent
  – Europeans have a lot, African populations only a little
  – Population of human origin is more diverse, older
• Need dense, cheap markers over genome: Genome Wide Association Studies (GWAS)
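A single-marker association test in a GWAS can be sketched as a chi-square test on allele counts in cases and controls, followed by a multiple-testing correction. The Python below is a minimal illustration: the counts and the figure of one million tests are hypothetical, and Bonferroni is only one possible correction (it is conservative when markers are correlated).

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    n = case_alt + case_ref + control_alt + control_ref
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(stat / 2.0))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


# Hypothetical allele counts: 1000 cases and 1000 controls, two alleles each.
stat = allelic_chi_square(case_alt=600, case_ref=1400,
                          control_alt=500, control_ref=1500)
p_value = chi2_1df_pvalue(stat)
threshold = bonferroni_threshold(0.05, 1_000_000)
print(f"chi-square = {stat:.2f}, p = {p_value:.2e}, "
      f"genome-wide significant: {p_value < threshold}")
```

A marker that looks convincing on its own can still fall short once the number of markers tested is taken into account.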
QTL and GWAS
• Quantitative Traits, polygenic traits that are assumed to have additive effects
  – Height, heart disease
  – Quixotic Trait Loci?
• Each gene has a small effect
• Huge genotyping efforts now paying off
• BUT only a small fraction of genetic component is accounted for even in huge studies
  – Tradeoffs of including broader human population
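The gap between detected loci and the total genetic component can be illustrated with the standard additive-variance formula for a biallelic locus, 2p(1 - p)a², where p is the allele frequency and a the per-allele effect. The Python sketch below assumes Hardy-Weinberg proportions, independent loci, purely additive effects, and a trait scaled to unit variance. The loci and the heritability value are invented for illustration.

```python
def additive_variance(p, a):
    """Additive genetic variance from one biallelic locus: 2 p (1 - p) a^2."""
    return 2.0 * p * (1.0 - p) * a * a


# Hypothetical detected loci: (allele frequency, effect in phenotypic SD units).
loci = [(0.30, 0.05), (0.10, 0.08), (0.45, 0.03)]
heritability = 0.8  # hypothetical share of trait variance that is genetic

explained = sum(additive_variance(p, a) for p, a in loci)
print(f"variance explained by detected loci: {explained:.5f}")
print(f"fraction of genetic component: {explained / heritability:.4f}")
```

With effects this small, each locus accounts for a fraction of a percent of the trait variance, so even many such loci leave most of the heritable component unexplained.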
Common Disease versus Rare Variants
• Common disease, common variants: The most frequently occurring alleles/SNPs should explain most of the etiology of a disease.
  – Current studies do NOT show this to be the case.
• Newer paradigm: rare variants
  – Occur less frequently but have larger associations with disease
Sullivan, Daly and O'Donovan, Nature Reviews Genetics, 2012
• Different results in different populations
• Heritability
  – What makes a gene matter to a disease?
  – Take advantage of human phenotyping
  – What genes CAN contribute to disease or modification of disease?
• A golden age of personal genomics?
Acknowledgments
• David Pollock, Biochemistry and Molecular Genetics