How to use the web for bioinformatics
Ethan Strauss
ethan.strauss@promega.com
274-4330 X 1171
http://www.q7.com/~ethan
Objectives
At the end of this session you should be able to do all of the following using freely available tools on the World Wide Web:
• Use GenBank or a similar database to find nucleic acid sequences of interest
• Understand the parts of a GenBank entry
• Use a BLAST server to find related sequences
• Perform an alignment of several nucleic acid sequences
• Obtain the protein sequence which corresponds to a specific nucleic acid sequence
How to find all those dang URLs!
http://q7.com/~ethan/molbio/
Outline
• Sequence Databases
– What does a GenBank entry look like?
• Translation and other Utilities
• BLAST
• Multiple Sequence Alignment
• PCR Primer Design
Sequence Databases
• NCBI databases – Nucleic acids, proteins, literature, genomes, taxonomy, SNPs and more!
• EMBL – Nucleic acid, protein, structure, microarray data and more.
• DDBJ – Nucleic acid, protein.
• Swiss-Prot – Very well annotated protein database.
• Many other general and specialized databases exist.
Sequence Databases
NCBI/GenBank
National Center for Biotechnology Information (NCBI)
Sponsored and run by the US government.
Contains many different databases and huge amounts of information.
Most or all data is freely downloadable.
This one site is probably sufficient for all your nucleic acid and protein database needs!
Sequence Databases
Entrez
• Allows searching and access to NCBI databases.
Sequence Databases
Sequence Records
• LOCUS - Number, size, type, topology, division, date
• DEFINITION - Name of the sequence
• ACCESSION - Unique ID number
• VERSION - Other numbers associated with the sequence
• KEYWORDS
• SOURCE – What it was isolated from
• ORGANISM - More taxonomic detail
• REFERENCE - Paper or papers about the sequence
– AUTHORS
– TITLE
– JOURNAL
• FEATURES - A complete list of all of the features of a sequence. Can be very extensive and useful!
• ORIGIN – The actual sequence!
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=58533118
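The record layout lends itself to simple parsing. Below is a minimal Python sketch, assuming the flat-file convention that top-level keywords start in the first column, that indented lines belong to the keyword above them, and that `//` ends the record. The record in the example is made up for illustration and is not a real GenBank entry; `split_sections` and `origin_sequence` are hypothetical helper names.

```python
import re

# A made-up, heavily shortened record used only for illustration.
TOY_RECORD = """\
LOCUS       TOY00001                  30 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Toy sequence for illustration.
ACCESSION   TOY00001
VERSION     TOY00001.1
SOURCE      synthetic construct
  ORGANISM  synthetic construct
FEATURES             Location/Qualifiers
     source          1..30
ORIGIN
        1 atggcgtacc tgaacgcgat tatttaatga
//
"""


def split_sections(record):
    """Group lines under the top-level keyword that starts in column 1."""
    sections = {}
    current = None
    for line in record.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if line and not line[0].isspace():
            # New top-level keyword such as LOCUS or ORIGIN.
            current = line.split()[0]
            sections[current] = [line[len(current):].strip()]
        elif current is not None:
            # Indented line (sub-keyword or continuation): keep with its parent.
            sections[current].append(line.strip())
    return sections


def origin_sequence(sections):
    """Drop position numbers and spaces from the ORIGIN block."""
    return re.sub(r"[^a-zA-Z]", "", "".join(sections["ORIGIN"]))


sections = split_sections(TOY_RECORD)
print(sections["ACCESSION"])      # the accession line content
print(origin_sequence(sections))  # the bare sequence
```

Sub-keywords such as ORGANISM stay inside their parent section (SOURCE) here, which is enough to pull out the accession and the sequence itself.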
Hands on
Find a gene of interest using the Entrez interface.
We will be working with this sequence throughout class, so you may want to open a word processing program and save the sequence (only) there for future reference.
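The class uses the Entrez web pages, but a record can also be requested by script. The sketch below only builds a request URL, assuming NCBI's E-utilities `efetch` endpoint and its `db`, `id`, `rettype` and `retmode` parameters; `efetch_url` is a hypothetical helper name, and the identifier is simply the number from the example URL on the Sequence Records slide (whether it still resolves has not been checked here).

```python
from urllib.parse import urlencode

# Assumed endpoint: NCBI E-utilities EFetch.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(identifier, rettype="gb"):
    """Build a URL asking for one nucleotide record as plain text.

    rettype="gb" asks for the full GenBank-style record,
    rettype="fasta" for the sequence only.
    """
    params = {
        "db": "nucleotide",
        "id": identifier,
        "rettype": rettype,
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


print(efetch_url("58533118", rettype="fasta"))
```

Requesting `fasta` rather than `gb` matches the advice to save the sequence only.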
General Utilities
• http://searchlauncher.bcm.tmc.edu/seq-util/seq-util.html
– Translation
– Restriction Digestion
– Reformatting (alternately FASTA Formatter)
– Complement/Reverse
– Etc.
• http://www.promega.com/biomath/calc11.htm
– Melting temperature of an oligo.
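For a rough feel of what a melting temperature calculation involves, here is a minimal Python sketch of the Wallace rule of thumb, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). This is an assumption chosen for illustration: it is only a crude estimate for short oligos and is not necessarily the method used by the calculator linked above. `wallace_tm` is a hypothetical name.

```python
def wallace_tm(oligo):
    """Estimate oligo melting temperature in degrees C with the Wallace rule."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


print(wallace_tm("ATGCATGCATGCATGC"))
```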
Hands on
Translate your sequence in all 6 reading frames.
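What the translation tool does can be sketched in a few lines of Python: three frames on the given strand and three on its reverse complement. This sketch assumes the standard genetic code and plain A/C/G/T input; any codon it cannot look up is shown as `X`. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; '*' marks a stop."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Translate frames +1..+3 (given strand) and -1..-3 (reverse complement)."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

The frame with the longest stretch free of `*` is usually the first candidate for the real open reading frame.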
BLAST
• Basic Local Alignment Search Tool
• Compares a query sequence against all sequences in a database.
• Very powerful for finding biologically significant relationships, and for finding full gene sequences in the database when you only have a fragment.
• Different types:
– Nucleic Acid – Nucleic Acid
– Protein – Protein
– Nucleic Acid Translation – Protein
– Protein – Nucleic Acid Translation
– Translation – Translation
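These five comparison types correspond to the standard BLAST program names, which is what you will see on a BLAST server's menu. A small Python lookup, where the key labels are just illustrative wording and `pick_program` is a hypothetical helper:

```python
# (query type, database type) -> BLAST program name
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("translated nucleotide", "protein"): "blastx",
    ("protein", "translated nucleotide"): "tblastn",
    ("translated nucleotide", "translated nucleotide"): "tblastx",
}


def pick_program(query, database):
    """Return the BLAST program for a query/database combination."""
    return BLAST_PROGRAMS[(query, database)]


print(pick_program("nucleotide", "nucleotide"))
```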
Hands on
Use ~120 bases (2 lines) from your sequence to find at least two other sequences related to it.
Note that if we all hit NCBI BLAST at once, it will be slow. We may not have time to wait.
Get all 3 sequences (your original and two others) into FASTA format using READSEQ.
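FASTA format is simply a header line starting with `>` followed by lines of sequence. The Python sketch below shows the idea; it is not READSEQ, and `to_fasta` is a hypothetical helper. It assumes the input may be pasted straight from an ORIGIN block, so it drops position numbers and spaces, and it wraps at 60 characters per line.

```python
def to_fasta(name, sequence, width=60):
    """Format a sequence as a FASTA record.

    Anything that is not a letter (digits, spaces, line breaks) is dropped,
    so text pasted from a GenBank ORIGIN block can be used directly.
    """
    seq = "".join(ch for ch in sequence if ch.isalpha()).upper()
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


pasted = "        1 atggcgtacc tgaacgcgat tatttaatga"
print(to_fasta("toy_sequence", pasted), end="")
```

With the default width of 60, taking the first two sequence lines of the output gives the ~120 bases asked for in the BLAST exercise.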
Multiple Sequence Alignment
Many programs can align multiple sequences with each other to find the best fit for all.
This is generally more biologically meaningful for protein sequences, since protein sequences are more highly conserved than the nucleic acid sequences that encode them.
Clustal is the most commonly used program.
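Once sequences are aligned, conserved positions are easy to spot. Clustal output marks columns that are identical in every sequence with `*`; the Python sketch below reproduces only that one mark and does not perform the alignment itself. It assumes the sequences are already aligned to equal length with `-` for gaps, and `conservation_line` is a hypothetical name.

```python
def conservation_line(aligned):
    """Return '*' under columns identical in all sequences, ' ' elsewhere."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        residues = {c.upper() for c in column}
        conserved = len(residues) == 1 and "-" not in residues
        marks.append("*" if conserved else " ")
    return "".join(marks)


rows = ["MEAGAY", "MEA-AY", "MEAGSY"]
for row in rows:
    print(row)
print(conservation_line(rows))
```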
Multiple Sequence Alignment
MEAGAYLNAIIFVLVATIIAVISRGLTRTEPCTIRITGESITVHACHIDSX  ETIKALA
MEAGAYLNAIIFVLVATIIAVISRGLTRTEPCTIRITGESITVHACHIDS...ETIKALA
MEA..YLNAII.VLV.TIIAVIS..L.RTEPC.IkITGESITV.ACklDa.....I..L.
MEAgaYLNAIIfVLVaTIIAVISrgLtRTEPCtIrITGESITVhAChiDsx  etIkaLa

LK PLSLERLFQ
LK.PLSLERLFQ
......L.....
lk plsLerlfq
Hands on
Use your FASTA-formatted sequences to perform a multiple sequence alignment.
Transfer the alignment to a word processing program and see if you can make it look decent.
• Change to Courier or Courier New
• Reduce font size
• Change to landscape view
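Alignment columns only line up in a fixed-width font, which is why Courier helps. The same layout can be produced before pasting: the Python sketch below prints already-aligned sequences in labelled blocks of a chosen width. It assumes equal-length aligned input, and `format_alignment` is a hypothetical helper.

```python
def format_alignment(names, aligned, width=60):
    """Lay out aligned sequences in labelled blocks of `width` columns."""
    label_width = max(len(n) for n in names) + 2
    length = len(aligned[0])
    blocks = []
    for start in range(0, length, width):
        rows = [name.ljust(label_width) + seq[start:start + width]
                for name, seq in zip(names, aligned)]
        blocks.append("\n".join(rows))
    return "\n\n".join(blocks)


print(format_alignment(["seq1", "seq2"], ["ACGTACGTAC", "AC-TACGTAC"], width=5))
```

A smaller `width` does the same job as reducing the font size or switching to landscape: it keeps each block from wrapping on the page.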
PCR Primer Design
There are many PCR primer design programs online and off.
I recommend Primer3. It is complex, but powerful.
You can ignore most parameters.
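Whatever program picks the primers, it is worth checking that a pair really brackets the region you want. The Python sketch below is not Primer3 and does no design; it only assumes both primers are written 5' to 3', match the template exactly, and that the reverse primer binds the opposite strand. `amplicon_size` and the toy template are made up for illustration.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_size(template, forward, reverse):
    """Return the expected product length, or None if a primer has no exact site."""
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the other strand, so look for its
    # reverse complement on the given strand, at or after the forward site.
    site = template.find(revcomp(reverse), start)
    if site == -1:
        return None
    return site + len(reverse) - start


toy_template = "GGATCCATGGCGTACCTGAACGCGATTATTTAATGAATTC"
print(amplicon_size(toy_template, "ATGGCGTAC", "GAATTCATT"))
```

Running the same check against each sequence in your alignment shows quickly whether one primer pair could amplify all of them.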
Hands on
Design primers for the sequence you have been working with.
Homework Report:
Please turn in a report which includes the following:
• Information about your initial sequence, including:
– GenBank Accession Number
– Species
– Description
– Location of ORF and any other important features.
• Information about the 4 other sequences, including the above:
– GenBank Accession Number
– Species
– Description
– Location of ORF and any other important features.
– E value from your BLAST results.
• The sequences of the PCR primers you chose, or a short explanation of why you could not find primers to amplify all of these genes.
• The multiple sequence alignment with the locations of the primers clearly marked.