LESSON 9: Analyzing DNA Sequences and DNA Barcoding

PowerPoint slides to accompany

Using Bioinformatics: Genetic Research

Chowning, J., Kovarik, D., Porter, S., Griswold, J., Spitze, J., Farris, C., Petersen, K., and Caraballo, T. Using Bioinformatics: Genetic Research. Published Online October 2012. figshare. http://dx.doi.org/10.6084/m9.figshare.936568


Image Source: Wikimedia Commons

How DNA Sequence Data is Obtained for Genetic Research

Obtain Samples: Blood, Saliva, Hair Follicles, Feathers, Scales

Extract DNA from Cells

Sequence DNA

Genetic Data

…TTCACCAACAGGCCCACA…

Compare DNA Sequences to One Another

TTCAACAACAGGCCCAC
TTCACCAACAGGCCCAC
TTCATCAACAGGCCCAC

GOALS:
• Identify the organism from which the DNA was obtained.
• Compare DNA sequences to each other.
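The second goal, comparing DNA sequences to each other, can be illustrated with a minimal Python sketch. It assumes the 51-base run on this slide is three 17-base sequences written end to end, and that the sequences are already aligned and of equal length. The function name `differing_positions` is hypothetical and not part of the lesson.

```python
def differing_positions(sequences):
    """Return 1-based positions where equal-length, pre-aligned sequences disagree."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must have equal length")
    # zip(*sequences) walks the sequences column by column.
    return [i + 1 for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]

# Assumption: the slide's 51-base run split into three 17-base sequences.
samples = [
    "TTCAACAACAGGCCCAC",
    "TTCACCAACAGGCCCAC",
    "TTCATCAACAGGCCCAC",
]

print(differing_positions(samples))  # [5]
```

The three sequences differ only at position 5 (A, C, or T). Real comparisons would use an alignment tool such as BLAST, because sequences from different samples rarely line up base for base.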

Overview of DNA Sequencing

DNA Sample

Mix with primers

Perform sequencing reaction

…T T C A C C A A C T G G C C C A C A…

DNA Sequence Chromatogram

Image Source: Wikimedia Commons

Sequence Both Strands of DNA

Sequence #1: Top Strand

A T G A C G G A T C A G C

Sequence #2: Bottom Strand

T A C T G C C T A G T C G

Compare the Two Sequences

Sequence #1: Top (“F”)

5’- A T G A C G G A T C A G C – 3’

Sequence #2: Bottom (“R”)

3’- T A C T G C C T A G T C G – 5’

Bioinformatics tools like BLAST can be used to compare the sequences from the two strands.
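To see why the two reads can be compared at all, here is a minimal Python sketch that reverse-complements the bottom-strand read and checks it against the top-strand read. It is a simplified stand-in for what an alignment tool does, not BLAST itself. It assumes the reads contain only A, C, G, and T and are the same length; the helper names are hypothetical.

```python
# Map each base to its complement; assumes only A, C, G, T (no N or gaps).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def mismatches(a, b):
    """Return 1-based positions where two equal-length sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]

top = "ATGACGGATCAGC"             # Sequence #1 ("F"), written 5' to 3'
bottom_drawn = "TACTGCCTAGTCG"    # Sequence #2 ("R") as drawn on the slide, 3' to 5'
bottom_read = bottom_drawn[::-1]  # the same strand written 5' to 3'

flipped = reverse_complement(bottom_read)
print(flipped)                   # bottom read expressed in top-strand terms
print(mismatches(top, flipped))  # an empty list means the two reads agree
```

Because the two strands are complementary, a bottom-strand read must be reverse-complemented before it can be compared base by base with the top-strand read.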


Image Source: Wikimedia Commons

Image Source: NCBI, FinchTV, BOLD.

Analyzing DNA Sequences

Day One:
1. Obtain two chromatograms for each sample.
2. Align the sequences with BLAST.

Day Two:
3. Visualize the chromatograms using FinchTV. Compare BLAST alignments against base calls in the chromatogram.
4. Review any differences and determine which base is most likely correct.
5. Edit and trim the DNA sequence using quality data.

Day Three:
6. Translate the sequence to check for stop codons.
7. Use BLAST to identify the origin of the sequence.
8. Use BOLD to confirm identity and make a phylogenetic tree.

ATGCCGTAA: M P STOP

Sequence #1: Top Strand

A T G A C G G A T C A G C

Sequence #2: Bottom Strand

T A C T G C C T A G T C G

Viewing DNA Sequences with FinchTV

Image Source: FinchTV

DNA Peaks Can Vary in Height and Width

Image Source: FinchTV

Quality Values Represent the Accuracy of Each Base Call

Quality values represent the ability of the DNA sequencing software to identify the base at a given position.

Quality Value (Q) = -10 × log10(error probability).

Q10 means the base has a one in ten chance (probability) of being misidentified.

Q20 = probability of 1 in 100 of being misidentified.

Q30 = probability of 1 in 1,000 of being misidentified.

Q40 = probability of 1 in 10,000 of being misidentified.
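The relationship between quality values and error probabilities can be checked with a short Python sketch. The function names are hypothetical; the formula is the one stated on this slide.

```python
import math

def error_probability(q):
    """Probability that a base call with quality value q is wrong."""
    return 10 ** (-q / 10)

def quality_value(p):
    """Quality value for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

# Reproduce the "1 in N" figures listed on the slide.
for q in (10, 20, 30, 40):
    print(f"Q{q}: 1 in {round(1 / error_probability(q)):,}")
```

Each increase of 10 in the quality value makes a misidentified base ten times less likely.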

Quality Values Are Used When Comparing Sequences


Image Source: FinchTV
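One simple way quality values could be used when two reads disagree is to keep the base call with the higher quality value at each position. The Python sketch below assumes the two reads are already aligned, the same length, and in the same orientation. The reads and quality values are made up for illustration, and `consensus` is a hypothetical helper, not a FinchTV or BLAST function.

```python
def consensus(read1, quals1, read2, quals2):
    """At each position, keep the base call that has the higher quality value."""
    result = []
    for base1, q1, base2, q2 in zip(read1, quals1, read2, quals2):
        result.append(base1 if q1 >= q2 else base2)
    return "".join(result)

# Hypothetical data: read2 has a low-quality T at position 8 where read1 has an A.
read1 = "ATGACGGATCAGC"
quals1 = [38] * 13
read2 = "ATGACGGTTCAGC"
quals2 = [40] * 13
quals2[7] = 12  # a Q12 call, roughly a 1 in 16 chance of being wrong

print(consensus(read1, quals1, read2, quals2))
```

In practice the chromatogram peaks should still be inspected by eye, since a quality value summarizes the trace but does not replace it.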

Background “Noise” May Be Present

Image Source: FinchTV

The Beginning and End of a Sequence Are Likely to Be of Poor Quality

Image Source: FinchTV
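Trimming the poor-quality ends can be sketched in Python as follows. This is a minimal illustration that removes bases from each end until it reaches one at or above a threshold. The threshold of Q20 and the example read are assumptions, and real trimming tools often use sliding windows rather than single bases.

```python
def trim_by_quality(seq, quals, threshold=20):
    """Drop bases from both ends until a base with quality >= threshold is reached."""
    start = 0
    while start < len(seq) and quals[start] < threshold:
        start += 1
    end = len(seq)
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return seq[start:end], quals[start:end]

# Hypothetical read: two low-quality bases at each end of a good 13-base core.
seq = "GTATGACGGATCAGCTA"
quals = [5, 9] + [35] * 13 + [11, 6]

trimmed_seq, trimmed_quals = trim_by_quality(seq, quals)
print(trimmed_seq)
```

A low-quality base in the middle of the read is left in place by this sketch; it would be reviewed against the second read instead.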

Examples of Chromatogram Data

Circle #1: Example of a series of the same nucleotide (many T’s in a row). Notice the highest peaks are visible at each position.

Circle #2: Example of an ambiguous base call. Notice that the T (red) called at position 57 (highlighted in blue) sits just below a green peak (A) at the same position, and that the quality score shown at the bottom left of the screen is poor (Q12). An A may be the actual nucleotide at this position.

Circle #3: Example of two A’s together. The peaks look different, but are the highest peaks at these positions.


Image Source: FinchTV


Translation Begins at the Start Codon

Sequence #1: 5’- A T G A C G G A T G A G C – 3’

Sequence #2: 3’- T A C T G C C T A C T C G – 5’

Reading Frame +1: M T D E
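Translation from the start codon can be sketched in Python. The sketch assumes the standard genetic code; mitochondrial genes, which are commonly used for animal barcoding, use different codon tables, so a real analysis would need the appropriate table. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_from_start(dna):
    """Translate from the first ATG until a stop codon or the end of the sequence."""
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate_from_start("ATGCCGTAA"))      # MP, then a stop codon
print(translate_from_start("ATGACGGATGAGC"))  # MTDE (GAG encodes E)
print(translate_from_start("ATGACGGATCAGC"))  # MTDQ (CAG encodes Q)
```

The final C of each 13-base sequence is left over because it does not complete a codon.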

There Are Six Potential Reading Frames in DNA

Sequence #1: 5’- A T G A C G G A T G A G C – 3’

Sequence #2: 3’- T A C T G C C T A C T C G – 5’

Reading Frame +1: M T D E

Reading Frame +2

Reading Frame +3

Reading Frame -1

Reading Frame -2

Reading Frame -3
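All six reading frames can be produced with a short Python sketch: three offsets on the top strand and three on its reverse complement. It assumes the standard genetic code and only the bases A, C, G, and T. The labels for the minus frames follow one common convention (offset counted from the 5' end of the reverse complement); tools differ on this, so treat the labels as an assumption.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna, offset=0):
    """Translate every complete codon starting at the given offset (0, 1, or 2)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3))

def six_frames(dna):
    """Return translations of the three forward and three reverse frames."""
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna, offset)
        frames[f"-{offset + 1}"] = translate(reverse, offset)
    return frames

for name, protein in sorted(six_frames("ATGACGGATGAGC").items()):
    print(name, protein)
```

For this sequence, frames +2 and +3 both contain a stop codon (`*`), which is one reason the correct frame for a protein-coding region can often be recognized.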

Frame-Shifts, Amino Acid Changes, and Stop Codons

Sequence #1: 5’- A T G A C G G A T G A G C – 3’

Sequence #2: 3’- T A C T G C C T A C T C G – 5’

Reading Frame +1: M T D E

Reading Frame +2

Accidental insertion of an extra “G” when editing:

5’- A T G G A C G G A T G A G – 3’

M D G STOP
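The effect of the accidental insertion can be reproduced with a minimal Python sketch that translates the sequence before and after the extra G is added. It assumes the standard genetic code; the variable and function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate every complete codon in reading frame +1."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

original = "ATGACGGATGAGC"
edited = original[:3] + "G" + original[3:]  # extra G inserted after the ATG

print(translate(original))  # MTDE
print(translate(edited))    # MDG* ("*" marks the stop codon TGA)
```

A single inserted base shifts every downstream codon, changing the amino acids and here creating a premature stop codon. This is why step 6 of the workflow translates the edited sequence to check for unexpected stop codons.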