The Nature of a GENE; Component Parts
Susquehanna MAGNET School for Medicine and Health Sciences
October 7, 2013
Professor Michael Chorney
Learning Objectives
Explain the nature of codons and the information inherent in their base composition
Explain what is meant by degeneracy
Describe the nature of a gene, its component parts and hallmarks
Discuss the nature of exons, and what is meant by an open reading frame; in addition, define what is meant by exon splicing
DNA Scale, the numbers, Review
We have a lot of DNA in each gamete (3.1647 x 10^9 basepairs); cf. a bacterium, which contains about 4.6 million basepairs
200 phonebooks the size of Manhattan’s (1,000 pages) would be required to tally the information; if you read the bases nonstop, it would take 9.5 years to complete
The fertilized ovum contains twice as much DNA as each gamete, or about 6.3 x 10^9 basepairs; half maternal, half paternal
• The body contains 10-100 trillion cells (up to 10^14)
• A cell contains 12 picograms of DNA = 12 x 10^-12 g = 6 x 10^9 basepairs (G, A, T, C)
• The body contains 10^14 cells x 12 x 10^-12 g = 1,200 g of DNA = 0.25% wt = about 6 x 10^23 basepairs
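A rough check of these scale figures can be sketched in a few lines. It uses only numbers quoted on this slide (the per-cell mass of 12 picograms and the upper cell count of 10^14 are taken as given, not independently verified); the variable names are illustrative.

```python
# Rough check of the DNA-scale arithmetic, using only figures quoted on the slide.
HAPLOID_BP = 3.1647e9       # basepairs per gamete
CELLS = 1e14                # upper end of the 10-100 trillion range
DNA_PER_CELL_G = 12e-12     # 12 picograms per cell, as quoted

diploid_bp = 2 * HAPLOID_BP             # basepairs in a fertilized ovum or somatic cell
total_dna_g = CELLS * DNA_PER_CELL_G    # grams of DNA in the whole body
total_bp = CELLS * diploid_bp           # basepairs in the whole body

print(f"basepairs per diploid cell: {diploid_bp:.2e}")
print(f"total DNA in the body:      {total_dna_g:,.0f} g")
print(f"total basepairs in body:    {total_bp:.2e}")
```

Changing CELLS to 1e13 gives the lower end of the quoted range.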
DNA Scale, the numbers.2, Review
The DNA in each somatic cell is arranged into chromosomes, i.e., linear strands of DNA of varying lengths
The DNA is condensed by proteins of opposite (positive) charge, called histones; this packaging provides a means for regulating access to the bases (information) by other proteins
Condensed DNA, during mitosis, can be easily stained, revealing the chromosomes’ size and banding variation (reflected in the variation of A/T and C/G content)
Giemsa-stained metaphase spread of human chromosomes from one cell, the most condensed form of DNA within the cell, seen at MITOSIS
Vocabulary
metacentric
sub-metacentric
acro(telo)centric
centromere
p arm
q arm
banding
heterochromatin
euchromatin
telomeres
autosome
Cytogenetics
Each chromosome is a linear strand of helical ds-DNA with capped ends called telomeres
Figure 6-2 Molecular Biology of the Cell (© Garland Science 2008)
Information flow
Sense strand
Anti-sense strand
A GENE
Like copying the leading strand
TRANSCRIPT
Figure 6-21 Molecular Biology of the Cell (© Garland Science 2008)
Figure 6-4 Molecular Biology of the Cell (© Garland Science 2008)
RNA polymerase substitutes U for T (why?) in RNA, and ribose for the deoxy sugar
Deaminated C = U; because DNA uses T, a U found in DNA can be recognized as a damaged C and repaired
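The relationship between the sense strand, the anti-sense strand and the transcript can be sketched as below. This is a minimal illustration, not a model of the polymerase: the short sequence and the function names are assumptions made for the example.

```python
# Minimal sketch: the transcript matches the sense strand, with U in place of T.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}   # template base -> RNA base

def reverse_complement(dna):
    """Return the opposite strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))

def transcribe(sense):
    """Shortcut: copy the sense strand, replacing T with U."""
    return sense.replace("T", "U")

def transcribe_from_template(antisense):
    """Pair RNA bases against the anti-sense strand, read 3'->5'."""
    return "".join(RNA_PAIRING[base] for base in reversed(antisense))

sense = "ATGGTGCACCTGACT"            # illustrative sense strand, 5'->3'
antisense = reverse_complement(sense)
mrna = transcribe(sense)

print(antisense)
print(mrna)
print(transcribe_from_template(antisense) == mrna)
```

Both routes give the same transcript, which is why the sense strand is the one usually written down when a gene sequence is shown.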
Only a small amount (percentage) of human DNA contains information that is ostensibly converted into proteins: these sequences are associated with genes. The proteins coded for by genes do biochemical work and regulate cell division, generate energy, respond to the environment, provide immunity to invasive DNA sequences (infection), etc.
DNA, the Puzzle, Review
What (and where) is this information we keep hearing so much about?
It resides in the bases: particular triplet base combinations, called codons, make up the exons and carry the information
You should have been exposed to the list of codons last week in the case study; see the next slide
There are 64 codons: 61 equate to the twenty amino acids (a.a.’s), with multiple codons existing for most of the a.a.’s, a property called degeneracy. The remaining three codons are called termination codons; more on these later
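Degeneracy can be counted directly from the standard genetic code. The sketch below builds the codon table from the conventional one-letter amino acid string in T, C, A, G order ("*" marks a termination codon); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons specify each amino acid (or a stop)?
degeneracy = Counter(CODON_TABLE.values())
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

print(len(CODON_TABLE), "codons in total")
print("termination codons:", stop_codons)
for aa, n in sorted(degeneracy.items(), key=lambda item: -item[1]):
    print(aa, n)
```

Only methionine (M) and tryptophan (W) have a single codon; leucine, serine and arginine each have six.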
For starters,
Figure 6-50 Molecular Biology of the Cell (© Garland Science 2008)
CODONS, what do you notice?
Figure 4-7 Molecular Biology of the Cell (© Garland Science 2008)
The question for molecular biologists:
What distinguishes a gene (1-2% of DNA) from the remaining DNA (98%)?
This has posed a problem for some time; now that it is being solved, the question becomes: what does the ‘gene’ do?
Can you see, at a quick glance, a gene in the sequence at the left?
The yellow highlighted bases signify the beta globin gene!!!
Genes are subject to the following:
1. They must be recognized by a polymerase, that is, an RNA polymerase that will guide gene copying, called TRANSCRIPTION (compare DNA polymerase)
2. The collective DNA sequence that summons forth RNA polymerase is called a PROMOTER
3. The information copied into RNA immediately adjacent to the promoter must be readable (CODING SEQUENCE); i.e., no stop codons until the naturally determined end of translation
4. There has to be a place after the coding sequence that signals the end of transcription, which is different from the end of translation
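Point 3, "readable" coding sequence, can be sketched as a simple scan: start at the first ATG and read codon by codon until an in-frame stop appears. This is a minimal illustration that ignores promoters, introns and the other strand; the function name and the test sequence are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orf(dna, start=0):
    """Return the first ATG...stop reading frame at or after `start`, or None."""
    begin = dna.find("ATG", start)
    if begin == -1:
        return None                      # no start codon at all
    for i in range(begin, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            return dna[begin:i + 3]      # include the stop codon
    return None                          # ran off the end without an in-frame stop

example = "CCATGGTGCACCTGTAAGG"          # hypothetical sequence for illustration
print(find_orf(example))
print(find_orf("ATGGTGCAC"))
```

Stops that fall out of frame are skipped, because only every third position after the ATG is examined.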
The eukaryotic gene’s general features and processing characteristics
Gene schematic: 5’ - p - 5’UTR - exon 1 (ATG) - intron - exon 2 - intron - exon 3 - intron - exon 4 (STOP) - 3’UTR (AATAAA) - 3’; each intron is marked AG/GT ... A ... AG/G
The gene is controlled by a promoter (p), which is not simple: there are generalized transcription factors and more gene-specific ones whose binding sites may reside outside of the promoter proper, within the gene, within the 3’ end of the gene, or even far 5’ and/or 3’ of the gene itself. They open the DNA and expose sites
The gene is structured in ‘staccato,’ with coding sequence (exons) interrupted by noncoding intervening sequences called introns; the coding sequence of the first exon begins with the ATG (met) codon, and the last exon ends with one of three translational termination codons (TAA, TAG, TGA)
Termination of transcription occurs in the 3’ untranslated region (3’UTR), which possesses termination signals and an RNA domain that drives 3’ processing, the AATAAA polyadenylation signal
Exon-intron borders possess sequences which aid in splicing, AG/GT……A……AG/G, along with small nuclear RNAs forming the spliceosome
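The features described on this slide can be put together in a toy example: exons are joined, the intron is checked for the GT ... AG borders, the spliced product is tested as an open reading frame, and the AATAAA signal is located downstream. Every sequence below is hypothetical and built only for illustration, as are the function names; real splice-site recognition is far less simple.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical toy gene assembled from labelled parts (not a real sequence).
exon1 = "ATGGTGCAG"                  # begins with ATG, ends ...AG
intron = "GTAAGTCTTACTAACTTTTCAG"    # GT ... A ... AG
exon2 = "GGCACCTGA"                  # begins with G, ends with the stop codon TGA
utr3 = "GCTCGCAATAAAGGTCT"           # carries the AATAAA signal
gene = exon1 + intron + exon2 + utr3

# Exon positions as (start, end) half-open intervals on `gene`.
exon2_start = len(exon1) + len(intron)
exons = [(0, len(exon1)), (exon2_start, exon2_start + len(exon2))]

def splice(seq, exon_coords):
    """Join the exons, discarding everything between them."""
    return "".join(seq[start:end] for start, end in exon_coords)

def introns_between(seq, exon_coords):
    """Return the sequences lying between consecutive exons."""
    return [seq[a_end:b_start]
            for (_, a_end), (b_start, _) in zip(exon_coords, exon_coords[1:])]

def is_open_reading_frame(cds):
    """ATG first, stop last, no stop in between, length a multiple of three."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return (len(cds) % 3 == 0
            and codons[0] == "ATG"
            and codons[-1] in STOP_CODONS
            and not any(c in STOP_CODONS for c in codons[1:-1]))

cds = splice(gene, exons)
polya_position = gene.find("AATAAA")

print(cds, is_open_reading_frame(cds))
print([(i[:2], i[-2:]) for i in introns_between(gene, exons)])
print("unspliced gene is an ORF:", is_open_reading_frame(gene))
print("AATAAA found downstream of last exon:", polya_position >= exons[-1][1])
```

In this toy gene, the unspliced sequence hits a stop codon inside the 3’UTR and does not end on one, so only the spliced product reads as an open frame.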
CpG Islands: clusters of the otherwise under-represented CpG dinucleotide, found at the 5’ end of eukaryotic genes
Gene schematic (repeated from the previous slide) with added labels: AUG, STOP, [CG], CH3ase (cytosine methylase)
Maintaining DNA in a euchromatic state also rests upon factors that bind to C’s and G’s, which protect the CpG ‘islands’ from cytosine methylases, best known for their role in imprinting
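One common way to put a number on CpG under-representation is the observed/expected ratio: the count of CG dinucleotides, multiplied by the sequence length, divided by the product of the C and G counts. The sketch below assumes that definition; the function name and the short test sequences are hypothetical.

```python
def cpg_observed_expected(seq):
    """Observed/expected CpG ratio: (CG count * length) / (C count * G count)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0                     # no CpG is possible without both bases
    return seq.count("CG") * len(seq) / (c_count * g_count)

def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

island_like = "CGCGCGATAT"             # hypothetical CpG-rich stretch
depleted = "CATGCATG"                  # hypothetical stretch with C and G but no CpG

print(cpg_observed_expected(island_like), gc_content(island_like))
print(cpg_observed_expected(depleted), gc_content(depleted))
```

A ratio well below 1 means CpG occurs less often than the base composition alone would predict; a ratio near or above 1 is what sets an island apart.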
…..WlsjeutlsjimsatouttutyecmdsisladksltkaldThedayforeeeuslkeiandseveeubhismomand ttugosocunntewherebudtedandtueislsiecnTisnggotallsixeooaltaxlekqzztiellforthebigbadsumrrrrrrrrrrrrteidas………
Let’s try a poor analogy, constrained by the English language and a dearth of three-letter words, but here goes….
Find the three-letter (codon)-containing ‘exons’ that make a kind of sensible phrase (names included). This is comparable to an open reading frame
Answer: jimsatoutthedayforhismomandbudtedandgotallsixforthebigbadsum……….
WordDNA
jim sat out the day for his mom and bud ted and got all six for the big bad sum……….
What happens if I delete the s?
Jim sat out the day for him oma ndb udt eda ndg ota lls ixf ort heb igb ads um……….
FRAMESHIFT: the OPEN READING FRAME IS GONE
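The same word analogy can be run as code: the spliced phrase is read three letters at a time, then read again after a single letter is deleted. The helper name is illustrative.

```python
def triplets(text):
    """Split text into consecutive three-letter 'codons' (the last may be short)."""
    return [text[i:i + 3] for i in range(0, len(text), 3)]

phrase = "jimsatoutthedayforhismomandbudtedandgotallsixforthebigbadsum"

# Delete the 's' of 'his' to mimic a single-base deletion.
cut = phrase.index("his") + 2
shifted = phrase[:cut] + phrase[cut + 1:]

print(" ".join(triplets(phrase)))
print(" ".join(triplets(shifted)))
```

Everything before the deletion still reads correctly; every triplet after it is shifted by one letter, which is exactly what a one-base deletion does to the codons downstream.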
Figure 6-51 Molecular Biology of the Cell (© Garland Science 2008)
RNA
Figure 4-76 Molecular Biology of the Cell (© Garland Science 2008)
CODING SEQUENCE IS CONSERVED SEQUENCE ACROSS SPECIES
LEPTIN GENE ALIGNMENT
Figure 4-78 Molecular Biology of the Cell (© Garland Science 2008)
THERE IS GREATER EVOLUTIONARY PRESSURE TO CONSERVE CODING SEQUENCE (EXONS) THAN INTRON SEQUENCES
Humans have approximately 23,000 genes (down from the 80-140k prediction)
Genes are dispersed along the chromosomes in what appears to be a random fashion, although many gene clusters exist which seem to aid coordinate expression: globin, histone, immunoglobulin, MHC, etc.
Some chromosomes are richer in genes than others, although chromosome size roughly correlates with gene number
A gene’s location is termed its locus, as we have touched upon
DNA, the puzzle.2
Genes vary in size, from beginning to end
And in their number of exons, which, once spliced together, must form an open reading frame, or ORF
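The average protein is 45 kDa, or about 410 amino acids (taking 110 Da as the mw of an average amino acid); the average size of a spliced gene (mRNA) is 1.5 kb; therefore, with about 23,000 genes, the amount of spliced, expressed sequence in the human genome is only about 1% (23,000 x 1.5 kb = 34.5 Mb out of 3.16 x 10^9 basepairs)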
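A rough check of this estimate, using only figures quoted in these slides (45 kDa average protein, 110 Da per amino acid, 1.5 kb per spliced mRNA, about 23,000 genes, 3.1647 x 10^9 basepairs per haploid genome). The variable names are illustrative, and the result is only as good as those averages.

```python
# Back-of-the-envelope estimate of how much of the genome ends up in mRNA.
PROTEIN_MASS_DA = 45_000     # average protein
AMINO_ACID_MASS_DA = 110     # average amino acid
MRNA_BP = 1_500              # average spliced mRNA, in bases
GENES = 23_000               # approximate human gene count
GENOME_BP = 3.1647e9         # haploid genome size

residues = PROTEIN_MASS_DA / AMINO_ACID_MASS_DA      # amino acids per protein
coding_bp = residues * 3                             # three bases per codon

spliced_fraction = GENES * MRNA_BP / GENOME_BP       # whole mRNAs, UTRs included
coding_fraction = GENES * coding_bp / GENOME_BP      # protein-coding part only

print(f"average protein length:  {residues:.0f} amino acids")
print(f"coding bases per gene:   {coding_bp:.0f}")
print(f"spliced mRNA fraction:   {spliced_fraction:.2%}")
print(f"protein-coding fraction: {coding_fraction:.2%}")
```

The gap between the two fractions reflects the untranslated regions carried by the average mRNA.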
Exon size varies, but averages about 200 basepairs (based on my knowledge of the Ig superfamily members); their translated sequences often equate to ‘domains,’ units of primary amino acid sequence that perform a function
BIG GENES: http://www.cshlp.org/ghg5_all/section/gene.shtml