1
LSM2241AY0910 Semester 2
MiniProject Briefing
Round 5
2
FOCUS
Identify and analyze sequence differences among NF-κB protein sequences
Specific objectives
1. Analyze mammalian NF-κB proteins and their homologues
2. Generate sequence patterns for conserved regions in mammalian NF-κB proteins and their homologues
3. Map conserved sequence patterns to 3D structures
Topics for Round 5
• Data collection - DONE
• Data processing - DONE
• Data analysis
  – Phylogenetic analysis
    • Multiple Sequence Alignment - DONE
    • Generating a phylogenetic tree - DONE
  – Sequence pattern analysis - TODAY
  – Structural analysis
4
Round 4 Review
• Adding root sequence to your MSA
• Editing your MSA
• Generating a phylogenetic tree with bootstrap values
• Any questions?
5
Sequence Pattern Analysis
Goal
To identify patterns within conserved domains of NF-κB proteins
How?
Generate Prosite patterns within the conserved domains of selected mammalian NF-κB proteins using the PRATT tool
6
Identifying highly conserved sequences from your phylogenetic tree
• From the phylogenetic tree generated in the previous section, record the accession numbers of the sequences that are present in each statistically significant clade (bootstrap value ≥ 70)
Note: Record the accession numbers for each clade separately
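The clade-recording step can also be scripted. The following is a minimal sketch, assuming the tree was exported in Newick format with bootstrap values written as internal node labels on a 0-100 scale (some programs use 0-1 or store support elsewhere). The function name `supported_clades` and the sequence names in the example are hypothetical placeholders, not real accession numbers.

```python
import re

def supported_clades(newick, threshold=70.0):
    """List (support, leaf names) for internal nodes labelled with support >= threshold.

    Assumes bootstrap values are written as internal node labels, e.g. "(A,B)95".
    Quoted labels containing parentheses or commas are not handled.
    """
    tokens = re.findall(r"[(),;]|[^(),;]+", newick)
    stack = [[]]      # leaf names collected for each clade that is still open
    closed = []       # leaves of the clade that was closed most recently
    previous = None
    clades = []
    for raw in tokens:
        token = raw.strip()
        if not token:
            continue
        if token == "(":
            stack.append([])
        elif token == ")":
            closed = stack.pop()
            stack[-1].extend(closed)   # a clade's leaves also belong to its parent
        elif token not in ",;":
            label = token.split(":")[0].strip()   # drop the branch length
            if previous == ")":
                # Label of the internal node that was just closed
                try:
                    support = float(label)
                except ValueError:
                    support = None
                if support is not None and support >= threshold:
                    clades.append((support, sorted(closed)))
            else:
                stack[-1].append(label)
        previous = token
    return clades

# Hypothetical tree: one clade with support 95, one with support 60
tree = "((SEQ_A:0.1,SEQ_B:0.2)95:0.3,(SEQ_C:0.1,SEQ_D:0.1)60:0.2,SEQ_E:0.5);"
for support, names in supported_clades(tree):
    print(support, names)
```

Each returned clade keeps its own list of names, which matches the note about recording accession numbers separately per clade.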
8
Identifying highly conserved sequences from your phylogenetic tree
• Extract the aligned conserved domain sequences for each statistically significant clade from the edited multiple sequence alignment file
• Launch BioEdit: Edit → Search → Search for titles that contain a list of substrings
• In the popup window, type or paste the list of accession numbers you want to extract in the blank field and click “OK”
Note: Do this separately for each statistically significant clade
11
Identifying highly conserved sequences from your phylogenetic tree
• Now, all the sequences containing the accession numbers you have keyed in will be selected.
• Copy these sequences by selecting Edit → Copy Sequence(s).
• Open a blank alignment file (File → New Alignment) and paste your sequences into it (Edit → Paste)
• Save your new alignment in FASTA format
• Repeat for the sequences in each statistically significant clade
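The BioEdit extraction steps above could also be reproduced in a script. This is a minimal sketch, assuming the edited alignment is available as FASTA text and that each title line contains the accession number. The helper names (`read_fasta`, `select_by_accession`, `to_fasta`) and the example records are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (title, sequence) pairs, keeping gap characters."""
    records = []
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:], []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        records.append((title, "".join(chunks)))
    return records

def select_by_accession(records, accessions):
    """Keep records whose title contains any of the given substrings.

    Plain substring matching: "SEQ_1" would also match a title with "SEQ_10".
    """
    return [(t, s) for t, s in records if any(a in t for a in accessions)]

def to_fasta(records, width=60):
    """Format (title, sequence) pairs as FASTA text with wrapped sequence lines."""
    lines = []
    for title, seq in records:
        lines.append(">" + title)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

# Hypothetical aligned records; real titles would carry the accession numbers
alignment = ">SEQ_A human\nMK-LV\n>SEQ_B mouse\nMR-LI\n>SEQ_C fly\nMKALV\n"
clade = select_by_accession(read_fasta(alignment), ["SEQ_A", "SEQ_C"])
print(to_fasta(clade))
```

Running the selection once per clade, with that clade's accession list, gives one FASTA file per statistically significant clade.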
12
Generating Patterns
• Do you think you need to do some processing to the alignment of each clade before generating a pattern?
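One processing step worth considering: sequences extracted from a larger alignment usually keep columns that are gaps in every member of the clade, and PRATT is generally run on unaligned sequences. The sketch below shows one possible cleanup under those assumptions. It is not a required procedure, and the function names are hypothetical.

```python
def drop_gap_only_columns(aligned, gap_chars="-."):
    """Remove alignment columns that contain only gap characters."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    keep = [i for i in range(length)
            if any(seq[i] not in gap_chars for seq in aligned)]
    return ["".join(seq[i] for i in keep) for seq in aligned]

def ungap(seq, gap_chars="-."):
    """Strip every gap character, giving the unaligned sequence."""
    return "".join(c for c in seq if c not in gap_chars)

# Hypothetical clade alignment: columns 3 and 7 are gaps in every sequence
clade_alignment = ["MK--LV-", "MR--LI-", "MK-ALV-"]
trimmed = drop_gap_only_columns(clade_alignment)
unaligned = [ungap(seq) for seq in trimmed]
```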
13
[Figure: original sequence and clade alignment]
Generating Prosite patterns
• Use the PRATT tool to generate Prosite patterns for parts of each statistically significant clade (follow the instructions in P7)
• Scan your patterns against all mammalian NF-κB sequences
  – How do you obtain these sequences?
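To make the Prosite pattern notation concrete, here is a simplified sketch that writes a pattern from a gap-free block of aligned residues, one pattern element per column. This is not PRATT's algorithm, which searches for patterns itself; it only illustrates how conserved and variable columns map to pattern elements. The function name, the `max_set` cutoff, and the example block are hypothetical.

```python
def columns_to_pattern(block, max_set=3):
    """Write a PROSITE-style pattern from equal-length, gap-free aligned segments.

    Identical column    -> the residue itself
    Up to max_set kinds -> a bracketed set such as [HK]
    More variable       -> x (any residue); runs of x are merged into x(n)
    """
    if not block or len(set(map(len, block))) != 1:
        raise ValueError("block must contain equal-length segments")
    elements = []
    for column in zip(*block):
        residues = sorted(set(column))
        if "-" in residues:
            raise ValueError("block must be gap-free")
        if len(residues) == 1:
            elements.append(residues[0])
        elif len(residues) <= max_set:
            elements.append("[" + "".join(residues) + "]")
        else:
            elements.append("x")
    merged, run = [], 0
    for element in elements + [None]:   # None flushes a trailing run of x
        if element == "x":
            run += 1
            continue
        if run:
            merged.append("x" if run == 1 else "x(%d)" % run)
            run = 0
        if element is not None:
            merged.append(element)
    return "-".join(merged)

# Hypothetical conserved block from four sequences
block = ["CAHGC", "CSHDC", "CTHEC", "CVHKC"]
print(columns_to_pattern(block))
print(columns_to_pattern(block, max_set=4))
```

A small `max_set` gives a looser pattern with more `x` positions; a large one gives a stricter pattern that may miss related sequences outside the clade.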
14
Generating Prosite patterns
15
• Copy and paste the contents of nfkb_all.fas into the ScanProsite page, under “Sequence(s) to be scanned”
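What the scan does can be approximated locally by translating a Prosite pattern into a regular expression. This is a rough sketch, not the ScanProsite implementation. It covers the common syntax (`x`, `[...]`, `{...}`, repeat counts, and the `<` / `>` anchors at the pattern ends) but not every feature, and it reports non-overlapping matches only. The function names and example sequences are hypothetical.

```python
import re

ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")

def prosite_to_regex(pattern):
    """Translate a PROSITE pattern such as 'C-x(2,4)-[HK]-{P}-C.' into a regex string."""
    pattern = pattern.strip().rstrip(".")
    at_start = pattern.startswith("<")   # '<' anchors the pattern at the N-terminus
    at_end = pattern.endswith(">")       # '>' anchors the pattern at the C-terminus
    pattern = pattern.lstrip("<").rstrip(">")
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: %r" % element)
        core, low, high = match.groups()
        if core == "x":
            regex = "."                        # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"    # any residue except these
        else:
            regex = core                       # single residue or [set]
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return ("^" if at_start else "") + "".join(parts) + ("$" if at_end else "")

def scan(pattern, sequences):
    """Return {name: [1-based start positions]} for ungapped sequences that match."""
    regex = re.compile(prosite_to_regex(pattern))
    hits = {}
    for name, seq in sequences.items():
        positions = [m.start() + 1 for m in regex.finditer(seq)]
        if positions:
            hits[name] = positions
    return hits

# Hypothetical sequences; in practice these would be read from nfkb_all.fas
sequences = {"seq1": "MACAAHGCK", "seq2": "MKKLLP"}
hits = scan("C-x(2,4)-[HK]-{P}-C.", sequences)
```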
Generating Prosite patterns
• Analyze your results:
  – Compare and contrast each Prosite pattern. How are they similar or different?
  – Compare the results for each Prosite pattern. How many hits do you get for each Prosite pattern?
  – Which are the proteins that contain your pattern?
  – How accurate is your pattern?
  – Are your results reliable? If not, what can you do to generate more reliable results?
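One way to put a number on "how accurate is your pattern" is to compare the hits against the sequences the pattern is expected to find. The sketch below computes precision and recall, assuming you have a list of expected members (for example, the clade the pattern was built from). The function name and identifiers are hypothetical, and other accuracy measures are equally valid.

```python
def pattern_accuracy(hits, expected):
    """Compare the sequences a pattern matched with the sequences it should match."""
    hits, expected = set(hits), set(expected)
    true_pos = len(hits & expected)    # expected sequences that were found
    false_pos = len(hits - expected)   # matched sequences that were not expected
    false_neg = len(expected - hits)   # expected sequences that were missed
    precision = true_pos / (true_pos + false_pos) if hits else 0.0
    recall = true_pos / (true_pos + false_neg) if expected else 0.0
    return {"true_pos": true_pos, "false_pos": false_pos,
            "false_neg": false_neg, "precision": precision, "recall": recall}

# Hypothetical result: two of three expected sequences found, plus one extra hit
report = pattern_accuracy(hits=["SEQ_A", "SEQ_B", "SEQ_X"],
                          expected=["SEQ_A", "SEQ_B", "SEQ_C"])
```

Low recall suggests the pattern is too strict; low precision suggests it is too permissive.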
16
Generating Prosite patterns
• Repeat the ScanProsite step
• This time, scan your patterns against UniProtKB/Swiss-Prot, UniProtKB/TrEMBL and PDB. Select the option: include all splice variants.
17
Generating Prosite patterns
• Compare and contrast your results
  – How many hits do you get now?
  – Do you get the same results as the last step? Why?
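If the hit identifiers from both scans are saved as lists, the comparison can be done with set operations. This is a small sketch with hypothetical identifiers. Note that identifiers from your own FASTA file and from the databases may be formatted differently and may need to be normalized first.

```python
def compare_hits(first_scan, second_scan):
    """Split two lists of hit identifiers into shared and scan-specific hits."""
    first, second = set(first_scan), set(second_scan)
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }

# Hypothetical hit lists from the two ScanProsite runs
comparison = compare_hits(["SEQ_A", "SEQ_B"], ["SEQ_A", "SEQ_B", "SEQ_Y", "SEQ_Z"])
```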
18
Generating Prosite patterns
• Analyze your results:
  – Compare and contrast each Prosite pattern. How are they similar or different?
  – Compare the results for each Prosite pattern. How many hits do you get for each Prosite pattern?
  – Which are the proteins that contain your pattern?
  – How accurate is your pattern?
  – Are your results reliable? If not, what can you do to generate more reliable results?
19
Generating Prosite patterns
• Points for discussion
  – What is the significance of sequence pattern analysis? In what ways are Prosite patterns useful?
  – What is the significance of your results?
  – Is there any other pattern generation software? Does it generate better results?
  – How can you improve on your results?
20
Topics for Round 5
• Data collection - DONE
• Data processing - DONE
• Data analysis
  – Phylogenetic analysis
    • Multiple Sequence Alignment - DONE
    • Generating a phylogenetic tree - DONE
  – Sequence pattern analysis - DONE
  – Structural analysis - NEXT WEEK