Investigation into the phylogeny of
Odobenus rosmarus
A report for Nello Cristianini for the unit EMATM0004
Computational Genomics and Bioinformatics
Algorithms
By Samuel R Neaves SN0550
November 2011
Introduction
This project investigates the evolutionary history of Odobenus rosmarus (the walrus). The evolution
of the Pinnipedia (Odobenidae, walruses; Otariidae, eared seals, including sea lions and fur seals;
and Phocidae, earless seals) is said to be enigmatic, with the exact relationships between the
families in dispute. The majority of authors support a monophyletic origin of the pinnipeds from a
caniform ancestor; however, others suggest a diphyletic origin, with the Phocidae being related to
the mustelids (themselves a disputed family) (Arnason et al., 1995).
A further dispute is that some authors divide the walrus into three subspecies of Odobenus
rosmarus (rosmarus, divergens and laptevi); however, recent work by Lindqvist et al. (2009)
concludes that laptevi is not distinct from divergens. The aim of this investigation is to
gather evidence for the true phylogeny.
Data Description
The primary species for this investigation is Odobenus rosmarus rosmarus. Its complete
mitochondrial DNA has the GenBank accession number NC_004029(.2). The phylogeny of Odobenus
rosmarus rosmarus is computed in relation to Erignathus barbatus (Bearded Seal, representing
Phocidae), Zalophus californianus (California Sea Lion, representing Otariidae), Ursus maritimus
(Polar bear, representing Caniformia) and Gulo gulo (Wolverine, representing mustelids).
Homo sapiens is used as an outgroup to root the phylogenetic trees. For the full table of accession
numbers see Appendix A.
Sequence statistics.
The mitochondrial DNA of Odobenus rosmarus rosmarus was analyzed statistically, with the following
results.
The size of the genome is 16565 base pairs.
The number of each base:
A       C       G       T
5401    4310    2414    4440
The base frequencies:
A       C       G       T
0.3260  0.2602  0.1457  0.2680
This shows that there are more than twice as many A’s as G’s, with roughly the same number of C’s
and T’s over the whole genome. This is an interesting break from the norm of A and T content being
similar and G and C content being similar. To investigate further, and to consider local
fluctuations in the frequencies of nucleotides, we employ sliding windows of size 5000, 2000 and
500 and plot the frequencies.
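The counting and windowing described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the report: the function names are hypothetical, and the whole-genome counts are copied from the table in this section rather than recomputed from the GenBank record.

```python
from collections import Counter


def base_frequencies(seq):
    """Return the fraction of A, C, G and T in seq; other symbols are ignored."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGT"}
    return {base: counts[base] / total for base in "ACGT"}


def sliding_window_frequencies(seq, window, step=1):
    """Yield (start, frequencies) for every full window of the given size."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, base_frequencies(seq[start:start + window])


# Whole-genome base counts reported in this section for NC_004029.
counts = {"A": 5401, "C": 4310, "G": 2414, "T": 4440}
genome_size = sum(counts.values())
frequencies = {base: round(n / genome_size, 4) for base, n in counts.items()}
print(genome_size, frequencies)
```

Calling sliding_window_frequencies(genome, 500) on the full sequence would give the values plotted for the smallest window; a larger step can be used to thin out the points.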
A sliding window of size 5000 does not show a great deal of variation in composition; however, the
smaller windows clearly show peaks and troughs. This shows that the nucleotides are not drawn from
an independent and identically distributed probability distribution, as the distribution changes
along the genome.
The GC content is also plotted, with the caveat that the aggregate frequencies above depart from
the usual pattern of similar A and T content and similar G and C content; at the smallest window
size this seems to show six distinct waves of variation in both AT and GC content.
Next we employ an ab initio method to find protein-encoding genes. The single-nucleotide
permutation test calculates the significance of open reading frames (ORFs). With the threshold set
so that an ORF must be longer than all ORFs in a random sequence, it finds 1 gene. If we set α to
5% we find a larger number, 12 genes. We are careful to set the correct genetic code, that of
vertebrate mitochondria. We translate these genes into amino acid sequences and identify
cytochrome B and cytochrome C by running BLAST searches. Once these are identified, we run further
protein BLAST searches using both cytochromes to identify the nearest other species.
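A minimal sketch of such a permutation test is given below, under simplifying assumptions that the report does not state: only the three forward reading frames are scanned, an ORF is taken to be any stop-free run of codons, and the null distribution pools the ORF lengths of shuffled copies of the sequence. The function names and the n_perm and seed parameters are hypothetical. The stop codons are those of the vertebrate mitochondrial code (NCBI translation table 2).

```python
import random

# Stop codons of the vertebrate mitochondrial genetic code (NCBI table 2).
MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def orf_lengths(seq, stops=MITO_STOPS):
    """Lengths in codons of all stop-free runs in the three forward frames."""
    lengths = []
    for frame in range(3):
        run = 0
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in stops:
                if run:
                    lengths.append(run)
                run = 0
            else:
                run += 1
        if run:
            lengths.append(run)
    return lengths


def permutation_threshold(seq, alpha=0.05, n_perm=20, seed=0):
    """Length that at most a fraction alpha of ORFs in shuffled sequences exceed.

    With alpha = 0 this is the longest ORF seen in any shuffled sequence.
    """
    rng = random.Random(seed)
    bases = list(seq)
    null_lengths = []
    for _ in range(n_perm):
        rng.shuffle(bases)  # single-nucleotide permutation keeps base composition
        null_lengths.extend(orf_lengths("".join(bases)))
    if not null_lengths:
        return 0
    null_lengths.sort()
    index = min(len(null_lengths) - 1, int((1 - alpha) * len(null_lengths)))
    return null_lengths[index]


def significant_orfs(seq, alpha=0.05, n_perm=20, seed=0):
    """ORF lengths in seq that exceed the permutation threshold."""
    threshold = permutation_threshold(seq, alpha, n_perm, seed)
    return [n for n in orf_lengths(seq) if n > threshold]
```

Because the same shuffles are reused for a given seed, lowering the threshold by raising alpha from 0 to 0.05 can only keep the same ORFs or add more, which is the direction of the change from 1 to 12 genes reported above.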
[Figure: nucleotide density (A, C, G, T) and A-T / C-G density along the genome (positions 0 to 18000) for sliding windows of size 5000, 2000 and 500.]
Results of CYTB BLAST:
Rank  Latin name                 Common name              Total score (max 760)
1     Halichoerus grypus         Grey Seal                681
2     Gulo gulo                  Wolverine                680
3     Phoca vitulina stejnegeri  Harbour Seal             679
4     Erignathus barbatus        Bearded Seal             679
5     Ictonyx libyca             Saharan Striped Polecat  679
These results are interesting because they do not include any Otariidae, suggesting that the
Pinnipedia have a diphyletic origin from the ancient caniform, with the Odobenidae, Phocidae and
mustelids on one branch and the Otariidae on another.
Results of CYTC BLAST:
Rank  Latin name               Common name              Total score
1     Tremarctos ornatus       Spectacled bear          447
2     Otaria byronia           South American Sea Lion  446
3     Arctocephalus townsendi  Guadalupe fur seal       445
4     Neophoca cinerea         Australian Sea Lion      444
5     Callorhinus ursinus      Northern fur seal        444
This result contrasts with the CYTB BLAST results, this time with many Otariidae, no Phocidae, a
high-ranking caniform and no mustelids. This data appears to support the monophyletic origin
hypothesis, or a diphyletic origin but with the Odobenidae on the branch of the Otariidae. To
further the investigation we add the initially selected organisms to the data set and compute the
genetic distance between each pair. We use the Jukes-Cantor correction to account for multiple
substitutions that have occurred at the same site:
$$K = -\frac{3}{4}\ln\left(1 - \frac{4}{3}d\right)$$
This states that the number of substitutions per site between two sequences (K) can be estimated
from the observed fraction of sites that differ (d).
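As a worked illustration, the sketch below computes d for a pair of aligned sequences and applies the standard nucleotide form of the Jukes-Cantor correction, K = -(3/4) ln(1 - (4/3) d). The function names are hypothetical, and skipping gap columns is an assumption; the report does not say how gaps were treated.

```python
import math


def p_distance(seq_a, seq_b):
    """Observed fraction of differing sites, skipping columns that contain a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(d):
    """Estimated substitutions per site K from the observed fraction d."""
    if d >= 0.75:
        return math.inf  # the correction is undefined at or beyond saturation
    return -0.75 * math.log(1 - 4 * d / 3)


d = p_distance("ACGTACGT", "ACGTACGA")  # one difference in eight sites
K = jukes_cantor(d)
print(d, round(K, 4))
```

K is always at least as large as d, and the gap between them widens as d grows, which is how the correction accounts for substitutions hidden by later changes at the same site.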
With this applied to cytochrome B, it is clear that the Polar bear is very distantly related
compared to the other species. It is interesting that the data suggest that the Spectacled bear is
a closer relation to the pinnipeds than the Polar bear.
If we remove the Polar bear to allow us to zoom in, we can see five distinct groups. The data show
that the Walrus is about equally distant from the Otariidae and the Phocidae, with the Otariidae
closer to the Mustelids and as far from the Phocidae as they are from the Spectacled bear.
Performing the same procedure for cytochrome C gives similar results; however, this time the
Phocidae are clearly grouped with the Mustelids, along with the Polar bear. The Spectacled bear is
once again on its own, slightly closer to the Phocidae than to the Otariidae. This leaves the
Walrus again as an outlier, roughly equally distant from the two major clusters.
Four phylogenetic trees were built, one for each cytochrome from both amino acid and nucleotide
sequences. To build the cytochrome C nucleotide tree, the sequences of a number of animals,
including Odobenus rosmarus, had to be back-translated from amino acids to nucleotides because
nucleotide sequence data were unavailable. As this is not a one-to-one relation, it introduces
some random substitutions, which may affect the accuracy of this tree.
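To make the one-to-many nature of back-translation concrete, the sketch below counts how many nucleotide sequences encode a given peptide. It assumes the vertebrate mitochondrial code (NCBI translation table 2), written out here in the usual TCAG codon order; the function name is hypothetical, and this is not the back-translation tool used for the report.

```python
import math
from itertools import product

BASES = "TCAG"
# Amino acids of NCBI translation table 2 (vertebrate mitochondrial),
# listed for codons TTT, TTC, TTA, TTG, TCT, ... in TCAG order.
MITO_AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), MITO_AA)
}

# Invert the table: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def back_translation_count(protein):
    """Number of distinct nucleotide sequences that translate to protein."""
    return math.prod(len(CODONS_FOR[aa]) for aa in protein)


# First six residues of the walrus cytochrome B sequence in Appendix B.
print(back_translation_count("MTNIRK"))
```

Even this six-residue prefix has hundreds of possible encodings, so any single back-translated sequence involves an arbitrary choice at most codon positions.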
The results present a confused picture, with many contradictions between the four trees. However,
if we discount the cytochrome C nucleotide tree there appears to be some consensus: all the
Otariidae and Phocidae are consistently grouped together, and Odobenus rosmarus is seen to split
first from the common ancestor of both the Otariidae and Phocidae, which then diverged at a later
date. This stands in contrast to the results in Arnason et al. (1995), which show the Phocidae
splitting first, with a later split between the Otariidae and Odobenus rosmarus. However, Lento et
al. (1995) does offer some evidence for Odobenus rosmarus being an early divergence from the
common pinniped ancestor, which would be consistent with these results. There are major
differences in the placing of Ursus maritimus, Tremarctos ornatus and Gulo gulo between the
cytochrome B and C trees: the cytochrome C tree puts the Mustelids, Ursus maritimus and Tremarctos
ornatus on the same branch as the Phocidae, whereas the cytochrome B tree has the Mustelids and
Tremarctos ornatus close to the Otariidae, with Ursus maritimus being a distant relation.
Castresana (2001) presents evidence that cytochrome B is more reliable for constructing trees at
the genus and family level, and therefore this tree may be taken as a more reliable indicator of
the true phylogeny.
The online Taxonomy Browser maintained by NCBI has the Odobenidae, Phocidae and Otariidae as
three distinct families within the suborder Caniformia and does not have any one group as an
ancestor of the others.
Multiple alignments
In order to build multiple alignments and identify polymorphic sites, the heuristic CLUSTALW tool
was used to align both the cytochrome B and cytochrome C protein sequences. It was set to use the
BLOSUM protein weight matrix with the gap open penalty set to 10, the gap extension penalty set to
0.20, gap distances set to 5 and No End Gaps set to ‘No’. To see the full alignments refer to
Appendix B. It is clear that both alignments are very good, apart from, of course, the outgroup
and the Polar bear in cytochrome B. The majority of polymorphic sites in cytochrome B are
consistent with the groupings of Odobenidae, Phocidae and Otariidae. They include both indels and
point mutations. The sites are fairly sporadic across the sequences, in contrast to the
polymorphic sites in cytochrome C, which mostly lie between the 50th and 100th amino acid, with
the extremities remaining constant.
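Polymorphic sites can be read off an alignment column by column. The sketch below is one simple way to do it, assuming that a site counts as polymorphic when it holds more than one distinct residue, and that gap characters are ignored unless requested; the function name is hypothetical. The example rows are the first ten columns of three cytochrome B sequences from Appendix B.

```python
def polymorphic_sites(alignment, ignore_gaps=True):
    """Return 1-based column positions holding more than one distinct residue."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    sites = []
    for position, column in enumerate(zip(*alignment), start=1):
        residues = set(column)
        if ignore_gaps:
            residues.discard("-")
        if len(residues) > 1:
            sites.append(position)
    return sites


rows = [
    "MTNIRKVHPL",  # YP_778707.1, Zalophus californianus
    "MTNIRKTHPL",  # YP_778746.1, Neophoca cinerea
    "MANIRKTHPL",  # ABV57060.1, Ictonyx libyca
]
print(polymorphic_sites(rows))
```

Passing ignore_gaps=False makes gap-versus-residue columns count as well, which is one way to include indels alongside point mutations.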
To address the question of how many subspecies of Odobenus rosmarus there are, we use a selection
of walrus samples from the Lindqvist et al. (2009) study. These sequences are ATL25 tRNA-Trp and
tRNA-Pro genes from the mtDNA region of the genomes. We follow the same procedure as earlier,
computing the genetic distance between the samples using the Jukes-Cantor correction and plotting
these on a graph. We use this computation to build an unrooted phylogenetic tree. Both the tree
and the distance plot conform with the conclusion of Lindqvist et al. (2009) that the walruses
sampled from the Laptev Sea are indeed just a subgroup of the Pacific walrus, because they sit in
a sub-branch of Odobenus rosmarus divergens and their genetic distances are mixed amongst the
Pacific samples. This data and analysis therefore do not justify labeling these as a separate
subspecies.
A further point of note is that the Atlantic walrus genetic data show signs of a genetic
bottleneck, given the lack of diversity compared to the Pacific walrus. This fits with the
historical fact that the Atlantic walrus was almost hunted to extinction by the 1950s, with
numbers beginning to recover since then, whereas the more remote locations inhabited by the
Pacific walrus protected them from human hunting, which allowed their numbers to remain much
higher throughout the 20th century and therefore accounts for the greater genetic diversity shown
in the samples. If further, larger samples are collected and more detailed analyses show the same
results, then it may be time to change the current NCBI Taxonomy Browser to show only two
subspecies of Odobenus rosmarus.
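One common way to put a number on within-group diversity is the mean pairwise difference between aligned sequences. The sketch below shows that measure as an assumption about how the comparison could be made; the report does not state which measure it used, and the sequences here are invented toy data, not the Lindqvist et al. samples.

```python
from itertools import combinations


def mean_pairwise_difference(seqs):
    """Average fraction of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = 0.0
    for a, b in pairs:
        if len(a) != len(b):
            raise ValueError("sequences must be aligned to the same length")
        total += sum(x != y for x, y in zip(a, b)) / len(a)
    return total / len(pairs)


# Invented toy groups: one nearly uniform, one more varied.
low_diversity = ["ACGTACGT", "ACGTACGT", "ACGTACGA"]
high_diversity = ["ACGTACGT", "ACGAACGA", "TCGTACCT"]
print(mean_pairwise_difference(low_diversity))
print(mean_pairwise_difference(high_diversity))
```

A group that has passed through a bottleneck would be expected to score lower on such a measure than a group whose numbers stayed high, which is the pattern described above for the Atlantic and Pacific samples.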
[Figure: genetic distance plot and unrooted phylogenetic tree of the walrus samples. Legend: Atlantic, Pacific, Laptev.]
Conclusion
The analysis that we have performed presents results that stand in contrast to the two papers
Arnason et al. (1995) and Lento et al. (1995), showing that the question of pinniped evolution is
indeed very interesting, with a variety of hypotheses still in contention. The examination of
whether there are two or three walrus subspecies came to the same conclusion as Lindqvist et al.
(2009) despite using different techniques and methods. It must be said that the same data were
used for this study and that of Lindqvist et al. (2009). Taken together with the low number of
samples, the use of amplicons and the inherent difficulty of sampling Odobenus rosmarus, which
potentially leads to sampling errors such as close relatives being sampled, this leaves the
hypothesis very much open to refutation.
While the evolution of pinnipeds remains inconclusive, there remains a need for further, more
in-depth studies to allow reliable conclusions to be drawn, so that wise actions can be taken to
protect this charismatic and vulnerable Arctic creature from the threats of hunting and habitat
destruction that continue to push many creatures to extinction.
A pair of curious Walruses (image from http://www.free-extras.com/images/walrus-8927.htm)
Appendix A
Accession numbers
                                                                             Proteins                        Nucleotides
Latin name                  Common name                                      Cytochrome B    Cytochrome C    mtDNA / Cyt B  Cytochrome C
Odobenus rosmarus rosmarus  Atlantic Walrus                                  CAD21718        NP_659340.3     NC_004029.2    NA
Zalophus californianus      California sea lion, representing the Otariidae  YP_778707.1     YP_778698.1     D26524.1       AJ616896.1
Erignathus barbatus         Bearded Seal, representing Phocidae              YP_778837.1     YP_778828.1     AY140982.1     FJ839388.1
Ursus maritimus             Polar bear, representing Caniformia              AAF71578.1      NP_597984.1     NC_003428.1    NA
Gulo gulo                   Wolverine, representing Mustelids                YP_001382271.1  YP_001382262.1  L77960.2       EU544598.1
Homo sapiens                Human, used as an outgroup                       AAA31851.1      NP_061820.1     S88250.1       NM_018947.5
Halichoerus grypus          Grey Seal                                        ACZ28998.1      NP_007072.1     GU167293.1     GU733706.1
Phoca vitulina stejnegeri   Harbor seal                                      BAI60013.1      NP_006931.1     AB510422.1     NA
Ictonyx libyca              Saharan Striped Polecat                          ABV57060.1      NA              EF987739.1     NA
Tremarctos ornatus          Spectacled bear                                  AAB50570.1      YP_001542732.1  U23554.1       NA
Otaria byronia              South American Sea Lion                          AAQ95107.1      AAR00312.1      AY713034.1     AJ891144.1
Arctocephalus townsendi     Guadalupe fur seal                               YP_778759.1     YP_778750.1     AF380897.1     NA
Neophoca cinerea            Australian Sea Lion                              YP_778746.1     YP_778737.1     AF380915.1     NA
Callorhinus ursinus         Northern fur seal                                YP_778694.1     YP_778685.1     HQ895717.1     HM171421.1
Odobenus rosmarus samples.
Lap 1 EU728526 Pac 8 EU728538 Atlan 4 EU728567 Atlan 14 EU728549
Lap 2 EU728527 Pac 9 EU728539 Atlan 5 EU728568 Atlan 15 EU728550
Lap 3 EU728529 Pac 12 EU728542 Atlan 6 EU728569 Atlan 16 EU728551
Lap 4 EU728530 Pac 13 EU728543 Atlan 7 EU728570 Atlan 17 EU728552
Lap 5 EU728525 Pac 14 EU728562 Atlan 8 EU728571 Atlan 18 EU728553
Pac 1 EU728531 Pac 15 EU728563 Atlan 9 EU728572 Atlan 19 EU728554
Pac 2 EU728532 Pac 16 EU728564 Atlan 10 EU728573 Atlan 20 EU728555
Pac 3 EU728533 Atlan 1 EU728561 Atlan 11 EU728546 Atlan 21 EU728556
Pac 4 EU728534 Atlan 2 EU728565 Atlan 12 EU728547 Atlan 22 EU728557
Pac 5 EU728535 Atlan 3 EU728566 Atlan 13 EU728548 Atlan 23 EU728558
Pac 6 EU728536
Pac 7 EU728537
Appendix B
CLUSTAL 2.1 multiple sequence alignment cytochrome B
gi|115494578|ref|YP_778707.1| MTNIRKVHPLAKIINSSLIDLPTPSNISAWWNFGSLLAACLALQILTGLF 50
gi|115494844|ref|YP_778746.1| MTNIRKTHPLAKIINNSLIDLPAPSNISAWWNFGSLLAVCLALQILTGLF 50
gi|37620596|gb|AAQ95107.1| MTNIRKVHPLAKIINNLLIDLPAPSNISAWWNFGSLLAVCLALQILTGLF 50
gi|115494690|ref|YP_778694.1| MTNIRKVHPLAKIINSSLIDLPAPSNISAWWNFGSLLATCLVLQILTGLF 50
gi|115494830|ref|YP_778759.1| MTNIRKTHPLAKIINNSLIDLPAPSNISTWWNFGSLLAACLALQILTGLF 50
gi|269302297|gb|ACZ28998.1| MTNIRKTHPLMKIINNSFIDLPTPSNISAWWNFGSLLGICLILQILTGLF 50
gi|282154709|dbj|BAI60013.1| MTNIRKTHPLMKIINNSFIDLPTPSNISAWWNFGSLLGICLILQILTGLF 50
gi|115494788|ref|YP_778837.1| MTNIRKTHPLIKIINSSFIDLPTPSNISAWWNFGSLLGICLILQILTGLF 50
gi|8038011|gb|AAF71578.1| --------------------------------------------------
gi|1122916|gb|AAB50570.1| MTNIRKTHPLAKIINSSFIDLPTPSNISAWWNFGSLLGVCLILHILTGLF 50
gi|157461069|gb|ABV57060.1| MANIRKTHPLAKIINNSFVDLPTPSSISAWWNFGSLLGICLIIQILTGLF 50
gi|153124668|ref|YP_001382271. MTNIRKTHPLAKIINNSFIDLPTPSNISAWWNFGSLLGICLILQILTGLF 50
gi|21425423|emb|CAD21718.1| MTNIRKTHPLAKIINNTFIDLPTPSNISAWWNFGSLLATCLILQILTGLF 50
gi|552606|gb|AAA31851.1| --------------------------------------------------
gi|115494578|ref|YP_778707.1| LAMHYTSDTTTAFSSVTHICRDVNYGWIIRYMHANGASMFFICLYMHVGR 100
gi|115494844|ref|YP_778746.1| LAMHYTSDTTTAFSSVTHICRDVNYGWIIRYMHANGASMFFICLYMHVGR 100
gi|37620596|gb|AAQ95107.1| LAMHYTSDTTTAFSSVTHICRDVNYGWIIRYMHANGASMFFICLYMHVGR 100
gi|115494690|ref|YP_778694.1| LAMHYTSDTTTAFSSVAHICRDVNYGWIIRYMHANGASMFFICLYMHVGR 100
gi|115494830|ref|YP_778759.1| LAMHYTSDTTTAFSSVTHICRDVNYGWIIRYMHANGASMFFICLYMHVGR 100
gi|269302297|gb|ACZ28998.1| LAMHYTSDTTTAFSSVTHICRDVNYGWIIRYLHANGASMFFICLYMHVGR 100
gi|282154709|dbj|BAI60013.1| LAMHYTSDTTTAFSSVTHICRDVNYGWIIRYLHANGASMFFICLYMHVGR 100
gi|115494788|ref|YP_778837.1| LAMHYTSDTTTAFSSVTHICRDVNYGWIIRYMHANGASMFFICLYMHVGR 100
gi|8038011|gb|AAF71578.1| --------------------------------------------------
gi|1122916|gb|AAB50570.1| LAMHYTADTTTAFSSVAHICRDVNYGWVIRYMHANGASMFFICLFMHVGR 100
gi|157461069|gb|ABV57060.1| LAMHYTSDTTTAFSSVTHICRDVNYGWIIRYMHANGASMFFICLFLHVGR 100
gi|153124668|ref|YP_001382271. LAMHYTSDTATAFSSVTHICRDVNYGWVIRYMHANGASMFFICLFLHVGR 100
gi|21425423|emb|CAD21718.1| LAMHYTSDTTTAFSSITHICRDVNYGWIIRYMHANGASMFFICLYAHMGR 100
gi|552606|gb|AAA31851.1| --------------------------------------------------
gi|115494578|ref|YP_778707.1| GLYYGSYTLTETWNIGIILLFTIMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|115494844|ref|YP_778746.1| GLYYGSYTLTETWNIGIILLFTIMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|37620596|gb|AAQ95107.1| GLYYGSYTLTETWNIGIILLLTVMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|115494690|ref|YP_778694.1| GLYYGSYTLTETWNIGIILLLTIMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|115494830|ref|YP_778759.1| GLYYGSYTLAETWNIGIILLFTIMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|269302297|gb|ACZ28998.1| GLYYGSYTFTETWNIGIILLFTIMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|282154709|dbj|BAI60013.1| GLYYGSYTFTETWNIGIILLFTVMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|115494788|ref|YP_778837.1| GLYYGSYTFMETWNIGIILLFTVMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|8038011|gb|AAF71578.1| --------------------------------------------------
gi|1122916|gb|AAB50570.1| GLYYGSYLFSETWNIGIILLLTIMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|157461069|gb|ABV57060.1| GLYYGSYLFPETWNIGIILLFTVMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|153124668|ref|YP_001382271. GLYYGSYTYSETWNIGIILLFTVMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|21425423|emb|CAD21718.1| GIYYGSYTLAETWNIGIVLLLTIMATAFMGYVLPWGQMSFWGATVITNLL 150
gi|552606|gb|AAA31851.1| --------------------------------------------------
gi|115494578|ref|YP_778707.1| SAVPYIGTNLVEWIWGGFSVDKATLTRFFAFHFILPFMASALVMVHLLFL 200
gi|115494844|ref|YP_778746.1| SAVPYIGTNLVEWIWGGFSVDKATLTRFFAFHFILPFMASALVMVHLLFL 200
gi|37620596|gb|AAQ95107.1| SAIPYIGTNLVEWIWGGFSVDKATLTRFFAFHFILPFVVSALVMVHLLFL 200
gi|115494690|ref|YP_778694.1| SAIPYIGANLVEWIWGGFSVDKATLTRFFAFHFILPFMVSALVMVHLLFL 200
gi|115494830|ref|YP_778759.1| SAIPYIGTNLVEWIWGGFSVDKATLTRFFAFHFILPFVASALVMVHLLFL 200
gi|269302297|gb|ACZ28998.1| SAIPYIGTDLVQWIWGGFSVDKATLTRFFAFHFILPFVVLALAAVHLLFL 200
gi|282154709|dbj|BAI60013.1| SAIPYIGTDLVQWIWGGFSVDKATLTRFFAFHFILPFVVSALAAVHLLFL 200
gi|115494788|ref|YP_778837.1| SAIPYIGTDLVQWIWGGFSVDKATLTRFFAFHFILPFVVLALAAVHLLFL 200
gi|8038011|gb|AAF71578.1| ---------------GGFSVDKATLTRFFAFHFILPFIILALAAVHLLFL 35
gi|1122916|gb|AAB50570.1| SAIPYIGTDLVEWIWGGFSVDKATLTRFFAFHFILPFIILALAMVHLLFL 200
gi|157461069|gb|ABV57060.1| SAIPYIGNNLVEWIWGGFSVDKATLTRFFAFHFILPFIISALAAVHLLFL 200
gi|153124668|ref|YP_001382271. SAIPYIGTSLVEWIWGGFSVDKATLTRFFAFHFILPFIILALAAIHLLFL 200
gi|21425423|emb|CAD21718.1| SAIPYVGTDLVEWVWGGFSVDKATLTRFLALHFVLPFMALALTAVHLLFL 200
gi|552606|gb|AAA31851.1| --------------------------------------------------
gi|115494578|ref|YP_778707.1| HETGSNNPSGISSDSDKIPFHPYYTIKDILGTLLLILTLMLLVMFSPDLL 250
gi|115494844|ref|YP_778746.1| HETGSNNPSGISSDSDKIPFHPYYTIKDILGTLFLILILMLLVMFSPDLL 250
gi|37620596|gb|AAQ95107.1| HETGSNNPSGISSDSDKIPFHPYYTIKDILGTLLLILILMLLVMFSPDLL 250
gi|115494690|ref|YP_778694.1| HETGSNNPSGVSSDSDKIPFHPYYTIKDILGTLLLILILMLLVMFSPDLL 250
gi|115494830|ref|YP_778759.1| HETGSNNPSGVSSDSDKIPFHPYYTIKDILGALLLILILMLLVMFSPDLL 250
gi|269302297|gb|ACZ28998.1| HETGSNNPSGIMSDSDKIPFHPYYTIKDILGALLLILVLTLLVLFSPDLL 250
gi|282154709|dbj|BAI60013.1| HETGSNNPSGIMSDSDKIPFHPYYTIKDILGALLFILVLTLLVLFSPDLL 250
gi|115494788|ref|YP_778837.1| HETGSNNPSGISSDSDKIPFHPYYTIKDILGALLLILVLMLLVLFSPDLL 250
gi|8038011|gb|AAF71578.1| HETGSNNPSGIPSDSDKIPFHPYYTIKDILGALLLTLALATLVLFSPDLL 85
gi|1122916|gb|AAB50570.1| HETGSNNPSGISSNSDKIPFHPYYTIKDILGVLLLLLALVTLVLFSPDLL 250
gi|157461069|gb|ABV57060.1| HETGSNNPSGIPSNSDKIPFHPYYTIKDILGVLLLIITLMTLVLFSPDLL 250
gi|153124668|ref|YP_001382271. HETGSNNPSGIPSDSDKIPFHPYYTIKDILGALFLALVLMMLVLFSPDLL 250
gi|21425423|emb|CAD21718.1| HETGSNNPSGILSDSDKIPFHPYYTIKDILGLIILILILMLLVLFSPDLL 250
gi|552606|gb|AAA31851.1| --------------------------------------------------
gi|115494578|ref|YP_778707.1| GDPDNYIPANPLSTPPHIKPEWYFLFAYAILRSIPNKLGGVLALLLSILI 300
gi|115494844|ref|YP_778746.1| GDPDNYIPANPLSTPPHIKPEWYFLFAYAILRSIPNKLGGVLALLLSILI 300
gi|37620596|gb|AAQ95107.1| GDPDNYIPANPLSTPPHIKPEWYFLFAYAILRSIPNKLGGVLALLLSILI 300
gi|115494690|ref|YP_778694.1| GDPDNYIPANPLSTPPHIKPEWYFLFAYAILRSIPNKLGGVVALLLSILV 300
gi|115494830|ref|YP_778759.1| GDPDNYIPANPLSTPPHIKPEWYFLFAYAILRSIPNKLGGVLALLLSILV 300
gi|269302297|gb|ACZ28998.1| GDPDNYIPANPLSTPPHIKPEWYFLFAYAILRSIPNKLGGVLALVLSILI 300
gi|282154709|dbj|BAI60013.1| GDPDNYIPANPLSTPPHIKPEWYFLFAYAILRSIPNKLGGVLALVLSILI 300
gi|115494788|ref|YP_778837.1| GDPDNYTPANPLSTPPHIKPEWYFLFAYAILRSIPNKLGGVLALVLSILI 300
gi|8038011|gb|AAF71578.1| GDPDNYIPAN---------------------------------------- 95
gi|1122916|gb|AAB50570.1| GDPDNYTPANPVSTPLHIKPEWYFLFAYAILRSIPNKLGGVLALIFSILI 300
gi|157461069|gb|ABV57060.1| GDPDNYTPANPLNTPPHIKPEWYFLFAYAILRSIPNKLGGVLALILSILV 300
gi|153124668|ref|YP_001382271. GDPDNYTPANPLNTPPHIKPEWYFLFAYAILRSIPNKLGGVLALVLSILV 300
gi|21425423|emb|CAD21718.1| GDPDNYTPANPLSTPPHIKPEWYFLFAYAILRSIPNKLGGVLALLLSILV 300
gi|552606|gb|AAA31851.1| ------------------KPEWYFLFAYTILRSVPNKLGGVLALLLSILI 32
gi|115494578|ref|YP_778707.1| LAIIPLLHTSKQRGMMFRPISQCLFWLLVADLLTLTWIGGQPVEHPFITI 350
gi|115494844|ref|YP_778746.1| LAIVPLLHTSKQRGMMFRPISQCLFWLLVADLLTLTWIGGQPVEHPFITI 350
gi|37620596|gb|AAQ95107.1| LAIIPLLHTSKQRGMMFRPISQCLFWLLAADLLTLTWIGGQPVEHPFITI 350
gi|115494690|ref|YP_778694.1| LAIIPLLHTSKQRGMMFRPISQCLFWLLVADLLTLTWIGGQPVEYPFIAI 350
gi|115494830|ref|YP_778759.1| LAIIPLLHTSKQRGMMFRPISQFLFWLLVADLLTLTWIGGQPVEYPFITI 350
gi|269302297|gb|ACZ28998.1| LAIVPLLHTSKQRGMMFRPISQCLFWLLVADLLTLTWIGGQPVEHPYITI 350
gi|282154709|dbj|BAI60013.1| LAIVPLLHTSKQRGMMFRPISQCLFWFLVADLLTLTWIGGQPVEHPYITI 350
gi|115494788|ref|YP_778837.1| LAIAPLLHTSKQRGMMFRPISQCLFWLLVADLLTLTWIGGQPVEHPYITI 350
gi|8038011|gb|AAF71578.1| --------------------------------------------------
gi|1122916|gb|AAB50570.1| LAIIPLLHTSKQRGMMFRPLSQCLFWLLAADLLTLTWIGGQPVEHPLVII 350
gi|157461069|gb|ABV57060.1| LAIIPLLHTSKQRSMMFRPLSQCLFWLLVADLLTLTWIGGQPVEHPFIII 350
gi|153124668|ref|YP_001382271. LAIIPLLHTSKQRGMMFRPLSQCLFWLLVADLLTLTWIGGQPVEHPFITI 350
gi|21425423|emb|CAD21718.1| LAIVPSLHTSKQRSMMFRPISQCLFWLLVADLITLTWIGGQPVEHPFIII 350
gi|552606|gb|AAA31851.1| LAMIPILHMSKQQSMMFRPLSQSLYWLLAADLLILTWIGGQPVSYPFTII 82
gi|115494578|ref|YP_778707.1| GQLASILYFTILLVFMPIAGIIENNILKW- 379
gi|115494844|ref|YP_778746.1| GQLASILYFAILLILMPIAGIIENNILKW- 379
gi|37620596|gb|AAQ95107.1| GQLASILYFTILLVLMPIAGIIENNILKW- 379
gi|115494690|ref|YP_778694.1| GQLASILYFMILLVLMPMAGIIENNILKW- 379
gi|115494830|ref|YP_778759.1| GQLASILYFTILLILMPVAGIIENNILKW- 379
gi|269302297|gb|ACZ28998.1| GQLASILYFMILLVLMPIASIIENNILKW- 379
gi|282154709|dbj|BAI60013.1| GQLASILYFMILLVLMPIASIIENNILKW- 379
gi|115494788|ref|YP_778837.1| GQLASILYFAILLVFMPIASIIENNILKW- 379
gi|8038011|gb|AAF71578.1| ------------------------------
gi|1122916|gb|AAB50570.1| GQLASILYFTILLVLMPIAGIIENNLSKW- 379
gi|157461069|gb|ABV57060.1| GQLASILYFMILLVFMPIASIAENNLLKW- 379
gi|153124668|ref|YP_001382271. GQLASILYFAILLIFMPVASIVENNLLKW- 379
gi|21425423|emb|CAD21718.1| GQLASILYFMILLVFMPIAGMIENSILKW- 379
gi|552606|gb|AAA31851.1| GQVASVLYFTTILILMPTISLIENKMLKWA 112
CLUSTAL 2.1 multiple sequence alignment Cytochrome C
gi|115494569|ref|YP_778698.1| MAYPFQMGLQDATSPIMEELTHFHDHTLMIVFLISSLVLYIISTMLTTKL 50
gi|115494681|ref|YP_778685.1| MAYPFQMGLQDATSPIMEELTHFHDHTLMIVFLISSLVLYIISTMLTTKL 50
gi|37695534|gb|AAR00312.1| MAYPLQMGLQDATSPIMEELTHFHDHTLMIVFLISSLVLYIISTMLTTKL 50
gi|115494821|ref|YP_778750.1| MAYPFQMGLQDATSPIMEELTHFHDHTLMIVFLISSLVLYIISTMLTTKL 50
gi|115494835|ref|YP_778737.1| MAYPFQMGLQDATSPIMEELTHFHDHTLMIVFLISSLVLYIISTMLTTKL 50
gi|21717327|ref|NP_659340.3| MAYPLQMGLQDATSPIMEELLHFHDHTLMIVFLISSLVLYIISTMLTTKL 50
gi|159524414|ref|YP_001542732. MAYPFQMGLQDATSPIMEELLHFHDHTLMIVFLISSLVLYIISTMLTTKL 50
gi|19343520|ref|NP_597984.1| MACPFQMGLQDATSPIMEELLHFHDHTLMIVFLISSLVLYIISTMLTTKL 50
gi|115494779|ref|YP_778828.1| MAYPLQMGLQDATSPIMEELLHFHDHTLMIVFLISSLVLYIISLMLTTKL 50
gi|5835013|ref|NP_007072.1|COX MAYPLQMGLQDATSPIMEELLHFHDHTLMIVFLISSLVLYIISLMLTTKL 50
gi|5834861|ref|NP_006931.1|COX MAYPLQMGLQDATSPIMEELLHFHDHTLMIVFLISSLVLYIISLMLTTKL 50
gi|153124659|ref|YP_001382262. MAYPFQLGLQDATSPIMEELLHFHDHTLMIVFLISSLVLYIISLMLTTKL 50
gi|11128019|ref|NP_061820.1| -------------------------------------------MGDVEKG 7
. *
gi|115494569|ref|YP_778698.1| THTNTMDAQEVETVWTILPAIILIMIALPSLRILYIMDEINNP--SLTVK 98
gi|115494681|ref|YP_778685.1| THTSTMDAQEVETVWTILPAIILIMIALPSLRILYIMDEINNP--SLTVK 98
gi|37695534|gb|AAR00312.1| THTSTMDAQEVETVWTILPAIILIMIALPSLRILYIMDEINNP--SLTVK 98
gi|115494821|ref|YP_778750.1| THTSTMDAQEVETVWTILPAIILIMIALPSLRILYIMDEINNP--SLTVK 98
gi|115494835|ref|YP_778737.1| THTCTMDAQEVETVWTILPAIILIMIALPSLRILYIMDEINNP--SLTVK 98
gi|21717327|ref|NP_659340.3| THTNTMDAQEVETVWTILPAIILIMIALPSLRILYMMDEINSP--FLTVK 98
gi|159524414|ref|YP_001542732. THTNTMDAQEVETVWTILPAIILVLIALPSLRILYMMDEINNP--LLTVK 98
gi|19343520|ref|NP_597984.1| THTSTMDAQEVETVWTILPAIILILIALPSLRILYMMDEINNP--SLTVK 98
gi|115494779|ref|YP_778828.1| THTSTMDAQEVETVWTILPAIILILIALPSLRILYMMDEINNP--SLTVK 98
gi|5835013|ref|NP_007072.1|COX THTSTMDAQEVETVWTILPAIILILIALPSLRILYMMDEINNP--SLTVK 98
gi|5834861|ref|NP_006931.1|COX THTSTMDAQEVETVWTILPAIILILIALPSLRILYMMDEINNP--SLTVK 98
gi|153124659|ref|YP_001382262. THTSTMDAQEVQTVWTILPAIILILIALPSLRILYMMDEINNP--SLTVK 98
gi|11128019|ref|NP_061820.1| KKIFIMKCSQCHTVEKGG-----KHKTGPNLHGLFGRKTGQAPGYSYTAA 52
.: *...: .** . : *.*: *: . : * *.
gi|115494569|ref|YP_778698.1| TMGHQWYWTYEYTDYEDLSFDSYMIPTQELKPGELRLLEVDNRVVLPMEM 148
gi|115494681|ref|YP_778685.1| TMGHQWYWTYEYTDYEDLSFDSYMIPTQELKPGELRLLEVDNRVVLPMEM 148
gi|37695534|gb|AAR00312.1| TMGHQWYWTYEYTDYEDLSFDSYMIPTQELKPGELRLLEVDNRVVLPMEM 148
gi|115494821|ref|YP_778750.1| TMGHQWYWTYEYTDYEDLNFDSYMIPTQELKPGELRLLEVDNRVVLPMEM 148
gi|115494835|ref|YP_778737.1| TMGHQWYWTYEYTDYEDLNFDSYMIPTQELKPGELRLLEVDNRVVLPMEM 148
gi|21717327|ref|NP_659340.3| TMGHQWYWSYEYTDYEDLSFDSYMVPTQELKPGELRLLEVDNRMVLPMEM 148
gi|159524414|ref|YP_001542732. TMGHQWYWSYEYTDYEDLNFDSYMIPTQELKPGELRLLEVDNRAVLPMEM 148
gi|19343520|ref|NP_597984.1| TMGHQWYWSYEYTDYEDLNFDSYMIPTQELKPGELRLLEVDNRVVLPVEM 148
gi|115494779|ref|YP_778828.1| TMGHQWYWSYEYTDYEDLNFDSYMIPTQELKPGELRLLEVDNRVVLPMEM 148
gi|5835013|ref|NP_007072.1|COX TMGHQWYWSYEYTDYEDLNFDSYMIPTQELKPGELRLLEVDNRVVLPMEM 148
gi|5834861|ref|NP_006931.1|COX TMGHQWYWSYEYTDYEDLNFDSYMIPTQELKPGELRLLEVDNRVVLPMEM 148
gi|153124659|ref|YP_001382262. TMGHQWYWSYEYTDYEDLNFDSYMVPTQELKPGELRLLEVDNRVVLPMEM 148
gi|11128019|ref|NP_061820.1| NKNKGIIWGEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKA 102
. .: * : . ..*: *: : * : * : . :
gi|115494569|ref|YP_778698.1| TVRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|115494681|ref|YP_778685.1| TVRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|37695534|gb|AAR00312.1| TIRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|115494821|ref|YP_778750.1| TIRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|115494835|ref|YP_778737.1| TIRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|21717327|ref|NP_659340.3| TVRMLISSEDVLHSWTVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|159524414|ref|YP_001542732. TIRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|19343520|ref|NP_597984.1| TIRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|115494779|ref|YP_778828.1| TIRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|5835013|ref|NP_007072.1|COX TIRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|5834861|ref|NP_006931.1|COX TIRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMTMRPGLYYGQCSE 198
gi|153124659|ref|YP_001382262. TIRMLISSEDVLHSWAVPSLGLKTDAIPGRLNQTTLMAMRPGLYYGQCSE 198
gi|11128019|ref|NP_061820.1| TNE----------------------------------------------- 105
* .
gi|115494569|ref|YP_778698.1| ICGSNHSFMPIVIESVPLSCFEKWSASML- 227
gi|115494681|ref|YP_778685.1| ICGSNHSFMPIVIESVPLSCFEKWSASMLQ 228
gi|37695534|gb|AAR00312.1| ICGSNHSFMPIVIESVPLSYFEKWSTSMLQ 228
gi|115494821|ref|YP_778750.1| ICGSNHSFMPIVIESVPLSYFEKWSASMLQ 228
gi|115494835|ref|YP_778737.1| ICGSNHSFMPIVIESVPLSYFEKWSASMLQ 228
gi|21717327|ref|NP_659340.3| ICGSNHSFMPIVLESVPLSYFEKWSASILQ 228
gi|159524414|ref|YP_001542732. ICGSNHSFMPIVLELVPLSYFEKWSASML- 227
gi|19343520|ref|NP_597984.1| ICGSNHSFMPIVLELVPLSYFEEWSASML- 227
gi|115494779|ref|YP_778828.1| ICGSNHSFMPIVLELVPLSHFEKWSTSML- 227
gi|5835013|ref|NP_007072.1|COX ICGSNHSFMPIVLELVPLSHFEKWSTSML- 227
gi|5834861|ref|NP_006931.1|COX ICGSNHSFMPIVLELVPLSHFEKWSTSML- 227
gi|153124659|ref|YP_001382262. ICGSNHSFMPIVLELVPLSHFEKWSASML- 227
gi|11128019|ref|NP_061820.1| ------------------------------
Appendix C bibliography
Andersen et al. (1998). Population structure and gene flow of the Atlantic walrus (Odobenus
rosmarus rosmarus) in the eastern Atlantic Arctic based on mitochondrial DNA and
microsatellite variation. Molecular Ecology(7), 1323-1336.
Arnason, U., Bodin, K., Gullberg, A., Ledje, C., & Mouchaty, S. (1995). A molecular view of pinniped
relationships with particular emphasis on the true seals. Journal of Molecular Evolution(40),
78-85.
Castresana J. (2001). Cytochrome b phylogeny and the taxonomy of great apes and mammals.
Molecular Biology and Evolution(18), 465-471.
Lento et al. (1995). Use of spectral analysis to test hypotheses on the origin of pinnipeds. Molecular
Biology and Evolution(12), 28-52.
Lindqvist et al. (2009). The Laptev Sea walrus Odobenus rosmarus laptevi: an enigma revisited.
Zoologica Scripta(38), 113-127.