National Center for Biotechnology Information
A Field Guide to GenBank and NCBI’s Molecular Biology Resources. University of Colorado Health Sciences Center. August 30, 2005.
![Page 1: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/1.jpg)
National Center for Biotechnology Information
A Field Guide to GenBank
and NCBI’s Molecular Biology Resources
August 30, 2005 University of Colorado Health Sciences Center
![Page 2: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/2.jpg)
Topics
• About NCBI
• GenBank overview
• Primary vs derivative databases
• The Reference Sequence (RefSeq) project
• Entrez databases
• Genome resources
• Bookshelf
-break-
• Entrez text searching
• BLAST sequence searching
• VAST structure searching
• An integrated example
![Page 3: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/3.jpg)
The National Institutes of Health
Bethesda, MD
![Page 4: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/4.jpg)
The National Center for Biotechnology Information
• Accepts submissions of primary data
• Develops tools to analyze these data
• Creates derivative databases based on the primary data
• Provides free search, link, and retrieval of these data, primarily through the Entrez system
![Page 5: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/5.jpg)
NCBI WWW Users per Day
![Page 6: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/6.jpg)
Number of Users Per Day
[Chart: number of users per day, 1997 to 2003; y-axis from 0 to 450,000; annotation: Christmas & New Year]
![Page 7: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/7.jpg)
Homepage - accessing the data
all[filter]
![Page 8: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/8.jpg)
all[filter]
1/11/2005
3/15/2005
8/15/2005
![Page 9: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/9.jpg)
Entrez Nucleotide
Primary Data
• GenBank / DDBJ / EMBL: 57.3 million (97.4 %)
Derivative Data
• RefSeq: 1.47 million (2.5 %)
• RefSeq reviewed: 60,000
• PDB (structures): 5,973
“Total”: 59 million
[Chart labels: GenBank, # records]
![Page 10: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/10.jpg)
GenBank: NCBI’s Primary Sequence Database
ftp://ftp.ncbi.nih.gov/genbank/
ftp://genbank.sdsc.edu/pub
ftp://bio-mirror.net/biomirror/genbank
Release 149, August 2005
47 x 10^6 records
52 x 10^9 nucleotides
195 gigabytes
816 files
• full release every two months
• incremental and cumulative updates daily
• available only through internet
• release notes: gbrel.txt
Over 100 billion bases!
![Page 11: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/11.jpg)
What is GenBank?
• Nucleotide only sequence database
• Archival in nature
• GenBank Data
  - Direct submissions (traditional records)
  - Batch submissions (EST, GSS, STS)
  - ftp accounts (genome data)
• Three collaborating databases
  - GenBank
  - DNA Database of Japan (DDBJ)
  - European Molecular Biology Laboratory (EMBL) Database
![Page 12: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/12.jpg)
GenBank Divisions
“Organismal”
• PRI (28) Primate
• ROD (15) Rodent
• PLN (13) Plant and Fungal
• BCT (11) Bacterial/Archaeal
• INV (7) Invertebrate
• VRT (7) Other Vertebrate
• VRL (4) Viral
• MAM (2) Mammalian
• PHG (1) Phage
• SYN (1) Synthetic
• UNA (1) Unannotated
Characteristics of the organismal divisions:
• Organized by taxonomy (sort of)
• Direct submissions (Sequin/Bankit)
• Accurate (~1 error per 10,000 bp)
• Well characterized
“Functional”
• EST (377) Expressed Sequence Tag
• GSS (138) Genome Survey Sequence
• HTG (63) High Throughput Genomic
• PAT (17) Patent
• STS (9) Sequence Tagged Site
• CON (1) Contigs, virtual
Characteristics of the functional divisions:
• Organized by sequence type
• Batch submissions (ftp/email)
• Inaccurate
• Poorly characterized
![Page 13: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/13.jpg)
GenBank Functional (Bulk) Divisions
• EST: Expressed Sequence Tag; 1st pass single read cDNA
• GSS: Genome Survey Sequence; 1st pass single read gDNA
• HTG: High Throughput Genomic; incomplete sequences of genomic clones
• STS: Sequence Tagged Site; PCR-based mapping reagents
• Whole Genome Shotgun
![Page 14: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/14.jpg)
EST Division: Expressed Sequence Tags
• nucleus: 30,000 genes
• RNA gene products
• make cDNA library: 80-100,000 unique cDNA clones in library
• isolate unique clones
• sequence once from each end (5’ and 3’)
>IMAGE:275615 3', mRNA sequence
NNTCAAGTTTTATGATTTATTTAACTTGTGGAACAAAAATAAACCAGATTAACCACAACCATGCCTTATTATCAAATGTATAAGANGTAAATATGAATCTTATATGACAAAATGTTTCATTCATTATAACAAATTTAATAATCCTGTCAATNATATTTCTAAATTTTCCCCCAAATTCTAAGCAGAGTATGTAAATTGGAAGTTCTTATGCACGCTTAACTATCTTAACAAGCTTTGAGTGCAAGAGATTGANGAGTTCAAATCTGACCAAGGTTGATGTTGGATAAGAGAATTCTCTGCTCCCCACCTCTANGTTGCCAGCCCTC
>IMAGE:275615 5' mRNA sequence
GACAGCATTCGGGCCGAGATGTCTCGCTCCGTGGCCTTAGCTGTGCTCGCGCTACTCTCTCTTTCTGGTGGAGGTATCCAGCGTACTCCAAAGATTCAGGTTTACTCACGTCATCCAGCAGAGAATGGAAAGTCAATTCCTGAATTGCTATGTGTCTGGGTTTCATCCATCCGACATTGAAGTTGACTTACTGAAGAATGGAGAGAATTGAAAAAGTGGAGCATTCAGACTTGTCTTTCAGCAAGGACTGGTCTTTCTATCTCTTGTACTACTGAATTCACCCCCACTGAAAAAGATGAGTATGCCTGCCGTGTTGAACCATGTNGACTTTGTCACAGNCAAGTTNAGTTTAAGTGGGNATCGAGACATGTAAGGCAGGCATCATGGGAGGTTTTGAAGNATGCCGCNTTGGATTGGGATGAATTCCAAATTTCTGGTTTGCTTGNTTTTTTAATATTGGATATGCTTTTG
![Page 15: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/15.jpg)
GSS, WGS, HTG
• Whole BAC insert (or genome)
• shred
• isolate clones
• sequence: GSS division or trace archive
• assembly:
  - Draft sequence (HTG division)
  - whole genome shotgun assemblies (traditional division)
![Page 16: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/16.jpg)
HTG Example: Honeybee Draft Sequences
• Unfinished sequences of BACs
• Gaps and unordered pieces
• Finished sequences (Phase 3) move to traditional GenBank division
LOCUS AC141845 147720 bp DNA linear HTG 19-MAR-2004
DEFINITION Apis mellifera clone CH224-4A2, WORKING DRAFT
SEQUENCE, 14 unordered pieces.
ACCESSION AC141845
VERSION AC141845.1 GI:29124029
KEYWORDS HTG; HTGS_PHASE1; HTGS_DRAFT.
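The KEYWORDS line of an HTG record carries the sequencing phase, so a script can tell unfinished drafts from finished (Phase 3) sequences. The following is a minimal sketch under stated assumptions: the helper name `htg_phase` is hypothetical, and it only handles a single-line KEYWORDS field like the one in the record above.

```python
import re


def htg_phase(record_text):
    """Return the HTGS phase number from a GenBank flat-file header, or None."""
    for line in record_text.splitlines():
        if line.startswith("KEYWORDS"):
            match = re.search(r"HTGS_PHASE(\d)", line)
            return int(match.group(1)) if match else None
    return None


# Header lines copied from the honeybee example record.
header = """LOCUS AC141845 147720 bp DNA linear HTG 19-MAR-2004
DEFINITION Apis mellifera clone CH224-4A2, WORKING DRAFT
SEQUENCE, 14 unordered pieces.
ACCESSION AC141845
VERSION AC141845.1 GI:29124029
KEYWORDS HTG; HTGS_PHASE1; HTGS_DRAFT.
"""

phase = htg_phase(header)
# Per the slide, only Phase 3 counts as finished.
print(phase, "finished" if phase == 3 else "unfinished")
```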
![Page 17: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/17.jpg)
Whole Genome Shotgun Projects
351 projects
• Bacteria (251)
• Environmental sequences (6)
• Archaea (6)
• Eukaryotes (88), including:
  - Chicken, Rat, Mouse, Dog (2), Chimpanzee, Human
  - Pufferfish (2)
  - Honeybee, Anopheles, Fruit Flies (3), Silkworm
  - Nematode (2)
  - Yeasts (8), Aspergillus (2)
  - Rice (2)
![Page 18: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/18.jpg)
Whole Genome Shotgun (WGS) Projects
wgs master[properties]
![Page 19: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/19.jpg)
Derivative Databases
• GenBank: updated ONLY by submitters (sequencing centers, labs)
  - Divisions: EST, STS, HTG, GSS, PRI, ROD, PLN, MAM, BCT, INV, VRT, PHG, VRL
• Derivative databases: updated by NCBI
  - UniGene
  - UniSTS
  - RefSeq: Entrez Gene and annotation pipelines
![Page 20: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/20.jpg)
Why Make Reference Sequences?
Entrez Nucleotide query:
human[organism] AND lipase[title]
![Page 21: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/21.jpg)
Why Make Reference Sequences?
Entrez Nucleotide query:
human[organism] AND lipase[title]
![Page 22: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/22.jpg)
human[organism] AND lipase[title] AND endothelial[title]
Lengths of the records returned:
• 3927 bp
• 4150 bp
• 3927 bp
• 2323 bp
• 261 bp
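Fielded queries like the ones above can also be composed in a script. The sketch below only builds the query string and a request URL and makes no network call. It assumes NCBI's E-utilities `esearch` endpoint and the `nuccore` database name, neither of which appears in the slides, and the helper names `entrez_term` and `esearch_url` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint (E-utilities ESearch); not part of the original slides.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def entrez_term(*clauses):
    """Join (value, field) pairs into a query such as 'human[organism] AND lipase[title]'."""
    return " AND ".join(f"{value}[{field}]" for value, field in clauses)


def esearch_url(db, term):
    """Build a search URL for the given database and query term."""
    return f"{ESEARCH}?{urlencode({'db': db, 'term': term})}"


broad = entrez_term(("human", "organism"), ("lipase", "title"))
narrow = entrez_term(("human", "organism"), ("lipase", "title"), ("endothelial", "title"))

print(broad)
print(narrow)
print(esearch_url("nuccore", narrow))
```

Each extra `[title]` clause narrows the result set, which is how the second query on the slide trims the lipase hits down to a handful of records.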
![Page 23: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/23.jpg)
RefSeq Benefits
genomes, transcripts, proteins
• non-redundant; best representative
• updates to reflect current sequence data and biology
• distinct, stable accession series
![Page 24: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/24.jpg)
Reference Sequence: RefSeq
Accession: Sequence Type
• NM_123456789: mRNA
• NP_123456789: protein, from NM_
• NR_123456: non-coding RNA
• XM_123456: predicted mRNA
• XP_123456: predicted protein
• XR_123456: predicted non-coding RNA
• ZP_12345678: predicted from NZ_
• NC_123456: genomic, e.g., chromosomes
• NG_123455: genomic, incomplete region
• NT_123456: genomic, BAC assembly
• NW_123456: genomic, WGS assembly
• NZ_ABCD12345678: genomic, WGS collection
blue=curated
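Because the two-letter prefix encodes the record type, a lookup table is enough to classify an accession. This is a minimal sketch; the table simply restates the prefixes listed above, and the function name `refseq_type` is hypothetical.

```python
# Prefix-to-type table restating the accession series listed on the slide.
REFSEQ_PREFIXES = {
    "NM": "mRNA",
    "NP": "protein, from NM_",
    "NR": "non-coding RNA",
    "XM": "predicted mRNA",
    "XP": "predicted protein",
    "XR": "predicted non-coding RNA",
    "ZP": "predicted from NZ_",
    "NC": "genomic, e.g., chromosomes",
    "NG": "genomic, incomplete region",
    "NT": "genomic, BAC assembly",
    "NW": "genomic, WGS assembly",
    "NZ": "genomic, WGS collection",
}


def refseq_type(accession):
    """Return the sequence type for a RefSeq accession, or None if it is not one."""
    prefix, separator, _ = accession.partition("_")
    if not separator:
        # GenBank accessions such as AC141845 have no underscore.
        return None
    return REFSEQ_PREFIXES.get(prefix.upper())


for acc in ["NM_123456789", "XP_123456", "NZ_ABCD12345678", "AC141845"]:
    print(acc, "->", refseq_type(acc))
```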
![Page 25: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/25.jpg)
Annotation Process
• Genomic DNA (NC, NT, NW)
• Model mRNA (XM) (XR)
• Curated mRNA (NM) (NR)
• Model protein (XP)
• Curated Protein (NP)
Scanning....
GenBank Sequences
RefSeq
![Page 26: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/26.jpg)
Creating NM_ Records
NM’s must have cDNA support
Genome annotation
Longest mRNA
• transcript variant 1
• transcript variant 2
• transcript variant 3
![Page 27: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/27.jpg)
Where is RefSeq?
![Page 28: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/28.jpg)
The Entrez System
Entrez databases:
• Nucleotide
• PubMed
• Protein
• Taxonomy
• Structure
• Domains
• 3D Domains
• Journals
• PMC
• OMIM
• Books
• PopSet
• SNP
• UniGene
• UniSTS
• Genome
• Gene
• GEO
• MeSH
• Cancer Chromosomes
• Homologene
• PubChem
• GENSAT
![Page 29: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/29.jpg)
A Few Entrez Databases
• UniGene: Clusters of ESTs, mRNAs
• dbSNP: Single Nucleotide Polymorphisms
• GEO: Gene Expression Omnibus; microarray and other expression data
• CDD: Conserved Domain Database
  - protein families (COGs and KOGs)
  - single domains (PFAM, SMART, CD)
![Page 30: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/30.jpg)
UniGene (unique gene)
Gene-oriented clusters of expressed sequences
• Automatic clustering using MegaBlast
• Each cluster represents a unique gene
• Informed by genome hits
• Information on tissue types and map locations
• Useful for gene discovery and selection of mapping reagents
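The idea of grouping expressed sequences into gene-oriented clusters can be illustrated with single-linkage clustering over pairwise similarity hits. This is only a sketch of the general technique, not NCBI's actual UniGene pipeline. The sequence identifiers and the hit list are invented for illustration, and in practice the hits would come from a similarity search such as MegaBlast.

```python
def cluster_by_hits(ids, hits):
    """Single-linkage clustering: sequences connected by any chain of hits share a cluster."""
    parent = {seq_id: seq_id for seq_id in ids}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for seq_id in ids:
        clusters.setdefault(find(seq_id), []).append(seq_id)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical identifiers and pairwise hits.
ids = ["est1", "est2", "est3", "mrna1", "est4"]
hits = [("est1", "mrna1"), ("est2", "mrna1"), ("est3", "est4")]

for members in cluster_by_hits(ids, hits):
    print(members)
```

Here `est1` and `est2` never hit each other directly, but both hit `mrna1`, so all three land in one cluster.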
![Page 31: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/31.jpg)
A Cluster of ESTs
query
5’ EST hits
3’ EST hits
![Page 32: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/32.jpg)
UniGene Collections
![Page 33: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/33.jpg)
Example UniGene Cluster
![Page 34: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/34.jpg)
Histogram of cluster sizes for UniGene Hs Build 177
(Now at Build #186)
![Page 35: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/35.jpg)
UniGene Cluster Hs.95351
SELECTED PROTEIN SIMILARITIES
![Page 36: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/36.jpg)
UniGene Cluster Hs.95351
GENE EXPRESSION
![Page 37: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/37.jpg)
UniGene Cluster Hs.95351: expression
![Page 38: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/38.jpg)
UniGene Cluster Hs.95351: seqs
![Page 39: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/39.jpg)
Download sequences
web page
ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/
![Page 40: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/40.jpg)
Entrez GEO
![Page 41: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/41.jpg)
NCBI’s SNP Database
• Primary and derivative (RefSNP)
• Single nucleotide polymorphisms
• Repeat polymorphisms
• Insertion-deletion polymorphisms
Over 19 million refSNPs (rsXXXXXXX) (August 2005)
![Page 42: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/42.jpg)
Searching dbSNP
![Page 43: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/43.jpg)
RefSNP
![Page 44: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/44.jpg)
RefSNP
![Page 45: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/45.jpg)
RefSNP
![Page 46: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/46.jpg)
RefSNP
Search Mouse SNP between strains
![Page 47: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/47.jpg)
RefSNP
MapView, GeneView, SeqView, OMIM, No 3D
![Page 48: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/48.jpg)
RefSNP
![Page 49: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/49.jpg)
Entrez GEO
![Page 50: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/50.jpg)
Entrez GEO and Entrez GEO Datasets
• GPL: Platform descriptions. Submitted by Manufacturer*
• GSM (GEO SaMple): Raw/processed spot intensities from a single slide/chip; experimental conditions. Submitted by Experimentalists
• GSE (GEO SEries): Grouping of slide/chip data, “a single experiment”; set of related samples. Submitted by Experimentalists
• GDS: Grouping of experiments. Curated by NCBI
![Page 51: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/51.jpg)
What’s a DataSet?
Supplied by submitter:
• Platform (GPL): array definition
• Sample (GSM): hyb. measurements
• Series (GSE): related Samples
Assembled by GEO staff:
• DataSet (GDS)
  - A collection of experimentally-related samples processed using the same platform.
  - Samples within DataSets are organized into subgroups based on experimental variables.
  - Form the basis of GEO’s query, analysis and data display tools.
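The relationship between Samples and DataSets described above can be sketched as a small data model. This is an illustrative sketch only, not GEO's actual schema: the class names, field names, and accessions are all hypothetical. It enforces the one rule stated on the slide, that a DataSet groups samples processed on the same platform, and shows samples organized into subgroups by an experimental variable.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One GSM record: measurements from a single slide/chip."""
    accession: str
    platform: str                       # GPL accession of the array used
    variables: dict = field(default_factory=dict)


@dataclass
class DataSet:
    """One GDS record: experimentally related samples that share a platform."""
    accession: str
    platform: str
    samples: list = field(default_factory=list)

    def add(self, sample):
        # A DataSet only groups samples processed on the same platform.
        if sample.platform != self.platform:
            raise ValueError(f"{sample.accession} uses {sample.platform}, not {self.platform}")
        self.samples.append(sample)

    def subgroups(self, variable):
        """Group sample accessions by the value of one experimental variable."""
        groups = {}
        for sample in self.samples:
            groups.setdefault(sample.variables.get(variable), []).append(sample.accession)
        return groups


# Hypothetical accessions and variables, for illustration only.
gds = DataSet("GDS_EXAMPLE", platform="GPL_A")
gds.add(Sample("GSM_1", "GPL_A", {"agent": "control"}))
gds.add(Sample("GSM_2", "GPL_A", {"agent": "treated"}))
gds.add(Sample("GSM_3", "GPL_A", {"agent": "treated"}))
print(gds.subgroups("agent"))
```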
![Page 52: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/52.jpg)
Gene Expression Omnibus (GEO)
Dataset browser
![Page 53: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/53.jpg)
GEO Dataset Browser
![Page 54: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/54.jpg)
GEO Dataset Report
![Page 55: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/55.jpg)
GEO Profiles
… of 12625
![Page 56: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/56.jpg)
Entrez CDD
![Page 57: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/57.jpg)
Conserved Domain Database
• Multiple sequence alignments
• Position-specific scoring matrices (PSSM)
• Sources: SMART, PFAM, COGs, KOGs, and NCBI curated domains (structure-informed alignments)
![Page 58: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/58.jpg)
CDD
>gi|45549418|gb|AAS67634.1| ATP7A [Solenodon paradoxus]
IVYQPHLITVEEIKKQIKAVGFPAFIKKQPKYLKLGAIDIERLKNIPVKSSEGSQQMSPSSTNDSKVTLTIDGMHCNSCVSNIESALSTLHYVSSIVVSLQNKSAIIKYNANSVTPEILKKAIEAISPGQYRVSITSEVESTSNSPSSSSQKAPLNVVSQPLTQVTVININGMTCNSCVQSIEGVMSKKAGVKSIQVSLANRNGTVEYDPLLTSPEILRE
![Page 59: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/59.jpg)
CDD
CD
Pfam
COG
Click on a colored bar to align your sequence to the CD
![Page 60: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/60.jpg)
Conserved Domain Database: cd00371.1, HMA
![Page 61: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/61.jpg)
CDD
![Page 62: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/62.jpg)
CDART: Conserved Domain Architecture Retrieval Tool
![Page 63: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/63.jpg)
cdd
Linking from Entrez Protein
![Page 64: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/64.jpg)
Genome Resources
Gene database
Trace Archive
Map Viewer
Homologene
Genomic Biology
![Page 65: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/65.jpg)
Genomic Biology
![Page 66: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/66.jpg)
Gen Biol: Gen Resources
![Page 67: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/67.jpg)
Gen Biol: Gen Resources
![Page 68: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/68.jpg)
Gen Biol: Gen Resources
![Page 69: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/69.jpg)
Genome Projects: microb
![Page 70: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/70.jpg)
Gen Biol: Gen Resources
![Page 71: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/71.jpg)
Gen Biol: Gen Resources
![Page 72: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/72.jpg)
Gen Biol: Gen Resources
![Page 73: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/73.jpg)
Gen Biol: Gen Resources
![Page 74: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/74.jpg)
Gen Biol: Gen Resources
![Page 75: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/75.jpg)
Genome Resources
Gene database
Trace Archive
Map Viewer
Homologene
Genomic Biology
![Page 76: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/76.jpg)
Entrez Gene
A single query interface to …
• Sequences
  - RefSeqs
  - GenBank
  - Homologene
• Maps: MapViewer
• Entrez links
• Linkouts
More organisms, ~ 3000
Entrez integration
![Page 77: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/77.jpg)
Global Entrez: NADH2
![Page 78: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/78.jpg)
Entrez Gene: NADH2
![Page 79: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/79.jpg)
Gene Record for Pongo NADH2
Homo sapiens
Not found with “nadh2”
![Page 80: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/80.jpg)
A Record With More Data: Human HFE
![Page 81: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/81.jpg)
Human HFE: Transcripts
Transcripts with experimental evidence
![Page 82: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/82.jpg)
Gene Table
![Page 83: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/83.jpg)
Introns/Exons: Gene Table
links to sequence
![Page 84: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/84.jpg)
Human HFE: Links
![Page 85: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/85.jpg)
Genotype
![Page 86: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/86.jpg)
Genotype
![Page 87: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/87.jpg)
Human HFE: Links
![Page 88: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/88.jpg)
GeneView in dbSNP
![Page 89: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/89.jpg)
SNP in Structure
![Page 90: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/90.jpg)
SNP in Structure
![Page 91: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/91.jpg)
SNP in Structure
H41
S43
C260
![Page 92: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/92.jpg)
Another Variation Source: OMIM
![Page 93: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/93.jpg)
Variants in OMIM
![Page 94: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/94.jpg)
Genome Resources
Gene database
Trace Archive
Map Viewer
Homologene
Genomic Biology
![Page 95: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/95.jpg)
The New Homologene
Automated detection of homologs among the annotated genes of completely sequenced eukaryotic genomes.
• No longer UniGene based
• Protein similarities first
• Guided by taxonomic tree
• Includes orthologs and paralogs
![Page 96: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/96.jpg)
The New Homologene
Homologene Build 43.1 (8/23/05)
Species Number of genes input grouped groups
![Page 97: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/97.jpg)
RAG1 → Homologene
![Page 98: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/98.jpg)
RAG1 → Homologene
RAG1
![Page 99: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/99.jpg)
RAG1
RING-finger
![Page 100: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/100.jpg)
RAG1 → Homologene
RAG1
![Page 101: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/101.jpg)
RAG1
Sugar_tr
![Page 102: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/102.jpg)
Homologene: alignment scores
![Page 103: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/103.jpg)
BLASTP bl2seq
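The alignment-score slides are screenshots, so no numbers survive here. As a rough illustration of one statistic a pairwise protein alignment yields, the sketch below computes percent identity over the aligned (non-gap) columns. The two aligned sequences are made up for the example, and the function is an assumption for illustration, not the scoring that BLASTP or Homologene actually reports.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of non-gap aligned columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap character.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical gapped alignment of two short protein fragments.
identity = percent_identity("MKT-AYIAK", "MKTQAYLAK")
print(identity)
```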
![Page 104: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/104.jpg)
Genome Resources
LocusLink / Gene database
UniGene
Trace Archive
Map Viewer
Homologene
![Page 105: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/105.jpg)
List View
![Page 106: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/106.jpg)
Human MapViewer
adar
![Page 107: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/107.jpg)
MapViewer: Human ADAR
![Page 108: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/108.jpg)
MV Hs ADAR
3’ UTR
5’ UTR
![Page 109: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/109.jpg)
Maps & Options
--Sequence maps--
Ab initio, Assembly, Repeats, BES_Clone, Clone, NCI_Clone, Contig, Component, CpG island, dbSNP haplotype, Fosmid, GenBank_DNA, Gene, Phenotype, SAGE_Tag, STS, TCAG_RNA, Transcript (RNA), Hs_UniGene, Hs_EST, Mm_UniGene, Mm_EST, Rn_UniGene, Rn_EST, Ssc_UniGene, Ssc_EST, Bt_UniGene, Bt_EST, Gga_UniGene, Gga_EST, Variation (= SNP)
--Cytogenetic maps--
Ideogram, FISH Clone, Gene_Cytogenetic, Mitelman Breakpoint, Morbid/Disease
--Genetic maps--
deCODE, Genethon, Marshfield
--RH maps--
GeneMap99-G3, GeneMap99-GB4, NCBI RH, Stanford-G3, TNG, Whitehead-RH, Whitehead-YAC
![Page 110: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/110.jpg)
MapViewer
UniGene
Component
Repeats
Gene
![Page 111: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/111.jpg)
Gene, Phenotype, Variation
![Page 112: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/112.jpg)
Maps & Options
![Page 113: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/113.jpg)
Genome Resources
LocusLink / Gene database
UniGene
Trace Archive
Map Viewer
Homologene
![Page 114: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/114.jpg)
Trace Archive Page
![Page 115: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/115.jpg)
Macaca mulatta Traces
![Page 116: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/116.jpg)
![Page 117: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/117.jpg)
Trace Archive BLAST Page
Access to sequences NOT in GenBank
![Page 118: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/118.jpg)
Literature Links
![Page 119: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/119.jpg)
BOOKS Database
![Page 120: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/120.jpg)
BOOKS Database: hyperlinked
![Page 121: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/121.jpg)
BOOKS Database
![Page 122: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/122.jpg)
BOOKS Database
![Page 123: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/123.jpg)
BOOKS Database
![Page 124: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/124.jpg)
Genes & Dis
![Page 125: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/125.jpg)
Genes & Dis
![Page 126: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/126.jpg)
For More Information…
![Page 127: National Center for Biotechnology Information](https://reader033.vdocuments.site/reader033/viewer/2022051517/5681593c550346895dc67ad5/html5/thumbnails/127.jpg)
Intermission