

Families of Nuclear Receptors in Vertebrate Models: Characteristic and Comparative Toxicological Perspective

Yanbin Zhao 1, Kun Zhang 1, John P. Giesy 2,3,4 & Jianying Hu 1

1 MOE Laboratory for Earth Surface Processes, College of Urban and Environmental Sciences, Peking University, Beijing 100871, China
2 Department of Veterinary Biomedical Sciences and Toxicology Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
3 Department of Zoology, and Center for Integrative Toxicology, Michigan State University, East Lansing, MI, USA
4 Department of Biology & Chemistry and State Key Laboratory in Marine Pollution, City University of Hong Kong, Kowloon, Hong Kong, SAR, China

Various synthetic chemicals are ligands for nuclear receptors (NRs) and can cause adverse effects in vertebrates mediated by NRs. While several model vertebrates, such as mouse, chicken, western clawed frog and zebrafish, are widely used in toxicity testing, few NRs have been well described for most of these classes. In this report, NRs in genomes of 12 vertebrates are characterized via bioinformatics approaches. Although numbers of NRs varied among species, from 40–42 genes in birds to 66–74 genes in teleost fishes, all NRs had clear homologs in human and could be categorized into seven subfamilies defined as NR0B-NR6A. Phylogenetic analysis revealed conservative evolutionary relationships for most NRs, which were consistent with traditional morphology-based systematics, with some exceptions in dolphin (Tursiops truncatus). Evolution of PXR and CAR exhibited unexpected multiple patterns, and the existence of CAR could possibly be traced back to ancient lobe-finned fishes and tetrapods (Sarcopterygii). Compared to the more conserved DBD of NRs, sequences of the LBD were less conserved: sequences of THRs, RARs and RXRs were ≥90% similar to those of the human; ERs, AR, GR, ERRs and PPARs were more variable, with similarities of 60%–100%; and PXR, CAR, DAX1 and SHP were least conserved among species.

Nuclear receptors (NRs) are one of the largest groups of transcription factors in vertebrates, and serve important functions in regulation of a range of physiological functions including growth and differentiation of cells, metabolic processes, reproduction, development and overall homeostasis. Transcriptional activities of NRs are regulated by binding of endogenous small lipophilic compounds [1,2]. There is growing evidence that diverse chemicals that occur in the environment, including synthetic molecules such as pharmaceuticals, endocrine disrupting chemicals and some industrial compounds, can mimic endogenous small compounds, bind to ligand binding domains (LBDs) and activate NR-mediated signals that then lead to toxic responses [3,4]. Typically, interactions of some pesticides and industrial chemicals with estrogen (ER) and androgen (AR) receptors have been linked to a number of adverse effects including birth defects, developmental neurotoxicity, impaired male and female reproductive health, such as decreased quality of sperm, and increased incidences of cancers [5–7].

OPEN
SUBJECT AREAS: EVOLUTIONARY ECOLOGY; ENVIRONMENTAL SCIENCES
Received 23 October 2014
Accepted 21 January 2015
Published 25 February 2015
Correspondence and requests for materials should be addressed to Y.Z. (zhaoyb@pku.edu.cn) or J.H. (hujy@urban.pku.edu.cn)
SCIENTIFIC REPORTS | 5 : 8554 | DOI: 10.1038/srep08554

A series of in vitro bioassays, based on signaling of endocrine receptors including well-studied steroid hormone receptors such as ER, AR, glucocorticoid receptors (GRs), and progesterone receptor (PR) and the less well-studied retinoic acid receptor (RAR), retinoid X receptor (RXR), and thyroid hormone receptor (THR), have been established or are under assessment by OECD and/or US EPA [8–10]. Due to their relatively clear physiological functions and responses to environmentally-relevant organic micropollutants, these NR-based assays have been used in assessment of toxicological effects of chemicals in the environment. For example, ERs, AR and THRs, involved in development and maintenance of the endocrine system, have been demonstrated to be targets of alkylphenols, phthalates (PAEs), dichlorodiphenyltrichloroethane and some metabolites of polychlorinated biphenyls (PCBs) and polybrominated diphenyl ethers (PBDEs) [11–13]. Besides endocrine receptors, PXR and CAR, NRs that participate in metabolism of both endobiotics and xenobiotics to detoxify or bioactivate chemicals, can be activated by a variety of pharmaceuticals such as rifampicin, pesticides such as chlorpyrifos and methoxychlor, and other synthetic chemicals used in industry, such as PBDEs and BPA [14–17]. In addition to these well-known NRs, there are more NRs that, during the past decade, have been identified in genomes of several

vertebrates. These include 48 NR genes in human (Homo sapiens), 47 genes in rat (Rattus norvegicus), 49 genes in mouse (Mus musculus) and 68 genes in the teleost puffer fish Fugu rubripes [18,19]. Specifically, structures of 48 NRs in the human have been identified and categorized, based on sequence homology, into seven different subfamilies NR0B-NR6A [20]. Except for two NRs in the subfamily NR0B, which lack a DNA binding domain (DBD), the other 46 NRs contain the following six functional domains: (A–B) variable N-terminal regulatory domain; (C) conserved DNA-binding domain; (D) variable hinge region; (E) conserved ligand binding domain (LBD) and (F) variable C-terminal domain [20]. In addition, sets of NRs described in humans offered a better understanding of characteristics of NRs, and provided insight for uncovering novel molecular and signal targets and mechanisms of action of synthetic toxicants. For instance, it has been found that some widely used pharmaceuticals and other chemicals that are found in the environment, including thiazolidinediones, trichloroacetic acid and toxaphene, are ligands for human RORα, PPARα and ERRα, respectively [21–23]. Compared with the extensive understanding of NRs in human, fewer NRs have been identified in other vertebrates used as models to screen chemicals for toxic potencies, such as reptiles, amphibians and teleost fishes. In recent years, owing to extensive information about their developmental biology and molecular genetics and the availability of completed genome sequences, vertebrate species such as western clawed frog (X. tropicalis), zebrafish (Danio rerio) and freshwater Japanese medaka (Oryzias latipes) have been much used as toxicological models [24–26]. However, information on NRs in these vertebrates is still limited to ERs, AR, GR, PXR, RARs and PPARs, though studies on some novel NRs, such as VDR, FXR and NURR, are in progress [27–29]. Additionally, since sets of NRs in human, mouse and rat identified in previous studies were based on genomes assembled a decade ago [18], there is also a need to reevaluate the characteristics of NRs in these genomes in light of constantly updated sequence data and annotations. The sequences of genomes, predicted transcriptomes and proteomes now available for all of these species in GenBank and Ensembl provide useful databases that can be further used to uncover and characterize additional NRs. Therefore, comprehensive descriptions of NRs and their families for these vertebrates used as models to screen for toxic potencies of chemicals will be helpful for their further development and for interpretation of results of studies of synthetic chemicals of environmental significance.

In this study, complete sets of NRs were described for genomes of 12 vertebrates used as models in studies of toxic potency and mechanisms of action of chemicals. Several bioinformatics approaches were applied to four mammals (human, Homo sapiens; mouse, Mus musculus; rat, Rattus norvegicus and dolphin, Tursiops truncatus), two birds (chicken, Gallus gallus and mallard (wild duck), Anas platyrhynchos), a reptile (Chinese softshell turtle, Pelodiscus sinensis), an amphibian (Western clawed frog, Xenopus tropicalis) and four teleost fishes (zebrafish, Danio rerio; medaka, Oryzias latipes; tilapia, Oreochromis niloticus and stickleback, Gasterosteus aculeatus). The locations of NRs on chromosomes, phylogenetic relationships and conservation of DBD and LBD sequences among species were also analyzed to better understand the characteristics of these NRs in these vertebrates.

Results and Discussion

Identification of NRs in 12 vertebrates. Substantial and continuous information gathered from developmental biology and molecular genetics, together with the complete sequencing of genomes, has made a series of vertebrate species attractive for use in toxicological research. Twelve species were chosen for description and complete sets of NR genes within their genomes were identified by use of a systematic bioinformatics approach. In total, 42–74 NR genes were uncovered within these vertebrates and a large number of variations were observed among classes (Fig. 1A, Table S2). Comparisons of sequences showed that all of these NRs displayed significant similarity to NRs of the human and could be categorized into the seven subfamilies NR0B-NR6A, with no novel subfamilies. For mammals, there were 48, 49, 49 and 47 NRs identified in human, mouse, rat and dolphin genomes, respectively (Fig. 1A). Compared to the human, one more gene (NR1H5) was observed for mouse and rat and one (NR2F2) was absent from dolphin (Fig. 2). Sets of NRs in human and mouse were consistent with previous reports [18], while two more NRs (NR1D2 and NR2E3) were newly identified for the rat. The absence of these two NRs from the rat set in the previous study [18] was due to sequence gaps in the rat genome, which was assembled in 2003.

The numbers of NRs in birds were smaller than those in human, though some unique genes were observed. Seven NRs (NR1B3, NR1D1, NR1H2, NR1I2, NR2B2, NR3B1 and NR4A1) present in the human were absent from the chicken. Similarly, nine NRs (NR1B3, NR1D1, NR1H2, NR1I2, NR1I3, NR2B2, NR2E3, NR2F1 and NR3B1) present in the human were absent from the mallard, though three new NRs (NR1F3, NR1H5 and NR2A3) were identified that were unique to chicken and mallard (Fig. 2). Similar absences were observed in the genomes of turkey (Meleagris gallopavo), flycatcher (Ficedula albicollis) and zebra finch (Taeniopygia guttata), where 9, 5 and 6 NRs, respectively, that are present in the human genome were absent (Fig. 3C). These results demonstrated that a cluster of NRs was indeed absent from genomes of the class Aves, especially in Galloanserae, having been deleted during the course of evolution.

Some NRs present in the human were absent from turtle and western clawed frog, while some others were unique to these species. In the one species of turtle, 48 NRs were identified, with four genes absent (NR1B3, NR1H2, NR1I2 and NR2B2) and four new genes gained (NR1F3, NR1H5, NR2A3 and NR2F1) compared with those in human. Similarly, 52 NRs were identified in western clawed frog, with two genes absent (NR1H2 and NR4A3) and six additional genes (NR1F2, NR1H5, NR2A3, NR2F5, NR3B3 and NR4A2) that were not present in the human (Fig. 2).
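The gene counts quoted for birds, turtle and frog can be cross-checked against the reported absences and additions. The following is a minimal sketch in Python, assuming that each count is simply the 48 human NRs minus the genes reported absent plus the additional genes reported; the variable and function names are illustrative and are not part of the original analysis.

```python
# Reference count: the 48 NR genes reported for human.
HUMAN_NR_COUNT = 48

# species -> (NRs present in human but absent, additional NRs reported).
# Gene names are transcribed from the text; nothing else is assumed.
REPORTED_DIFFERENCES = {
    "chicken": (
        ["NR1B3", "NR1D1", "NR1H2", "NR1I2", "NR2B2", "NR3B1", "NR4A1"],
        ["NR1F3", "NR1H5", "NR2A3"],
    ),
    "mallard": (
        ["NR1B3", "NR1D1", "NR1H2", "NR1I2", "NR1I3",
         "NR2B2", "NR2E3", "NR2F1", "NR3B1"],
        ["NR1F3", "NR1H5", "NR2A3"],
    ),
    "turtle": (
        ["NR1B3", "NR1H2", "NR1I2", "NR2B2"],
        ["NR1F3", "NR1H5", "NR2A3", "NR2F1"],
    ),
    "frog": (
        ["NR1H2", "NR4A3"],
        ["NR1F2", "NR1H5", "NR2A3", "NR2F5", "NR3B3", "NR4A2"],
    ),
}


def implied_count(absent, additional, reference=HUMAN_NR_COUNT):
    """Return the NR count implied by the reference count, losses and gains."""
    return reference - len(absent) + len(additional)


implied = {
    species: implied_count(absent, additional)
    for species, (absent, additional) in REPORTED_DIFFERENCES.items()
}

for species, count in implied.items():
    print(f"{species}: {count}")
```

Under this assumption the implied totals for turtle and frog match the 48 and 52 NRs stated above.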

For the four teleost fishes studied, many additional NRs were uncovered. Specifically, 73 and 74 NRs were identified in zebrafish and tilapia, respectively (Fig. 1A), which were consistent with those reported for Fugu rubripes (68 NRs identified) [19]. The additional NRs were mainly due to paralogous genes in their sets of NRs (Fig. 1C). In zebrafish, two or more paralogues were identified for each of 20 human NRs; the corresponding numbers were 18, 22 and 17 in medaka, tilapia and stickleback, respectively. Paralogous genes in teleost fishes were not randomly distributed but were concentrated in specific NRs. For instance, NR1F3 (RORγ) was the most abundant NR, with a total of seven paralogous gene copies in these four teleost fishes. The NRs NR1A1, NR1B3, NR1C1, NR1I1, NR2B2, NR2F6, NR3A2, and NR3B3 were also rich in paralogues, with one paralogous gene copy in each of the four teleosts (Fig. 3D).

Characteristics of NR families. Genomic locations of NRs in seven vertebrate genomes (human, mouse, rat, chicken, zebrafish, medaka and stickleback) were retrieved via the Ensembl annotations. In general, distributions of NRs on chromosomes were more widespread in teleost fishes than in mammals and birds (Fig. 1B). This is possibly due to the existence of more paralogous genes in teleosts. For example, NRs in zebrafish, medaka and stickleback were distributed throughout their genomes except for 1–2 chromosomes. The most abundant clusters of NRs were observed on chromosomes 8 and 16 in zebrafish, each with 6 NRs; on chromosomes 7 and 16 in medaka, each with 7 NRs; and on chromosome 12 in stickleback, with 8 NRs. The narrowest distribution of NRs was observed for chicken, in which 44 NRs were distributed over 61% (19/31) of chromosomes.
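The per-chromosome summary described above (NRs per chromosome and the fraction of chromosomes carrying at least one NR) can be sketched as follows. This is an illustrative Python sketch only: the gene labels and chromosome assignments in `toy_locations` are hypothetical placeholders, not real annotations, and the function name is an assumption.

```python
from collections import Counter


def chromosome_summary(gene_to_chromosome, n_chromosomes):
    """Count NRs per chromosome and the fraction of chromosomes carrying any NR."""
    per_chromosome = Counter(gene_to_chromosome.values())
    coverage = len(per_chromosome) / n_chromosomes
    return per_chromosome, coverage


# Hypothetical placeholder data: labels and chromosomes are NOT real annotations.
toy_locations = {"nr_a": "8", "nr_b": "8", "nr_c": "16", "nr_d": "3"}
per_chromosome, coverage = chromosome_summary(toy_locations, n_chromosomes=25)

print(per_chromosome.most_common(1))  # chromosome with the most NRs in the toy data
print(f"{coverage:.0%}")              # share of chromosomes with at least one NR

# The chicken figure quoted in the text: 19 of 31 chromosomes.
chicken_coverage = f"{19 / 31:.0%}"
print(chicken_coverage)
```

With real Ensembl coordinates in place of the toy dictionary, the same two outputs correspond to the cluster sizes and the distribution percentage reported in the text.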

www.nature.com/scientificreports


Phylogenetic analyses, based on full amino acid sequences and on DBD plus LBD compositions of NRs, were performed for 48 types of NRs among these 12 vertebrates. The Neighbor-Joining (NJ) and Maximum-Likelihood (ML) phylogenetic analyses showed similar patterns, while the Neighbor-Joining algorithm gave better resolution at the base of the phylogram. Conservative evolutionary relationships were observed for most NRs, i.e. the evolutionary relationships were generally consistent with traditional morphology-based systematics (Fig. S1). As exemplified for NR3A1 (ERα), closer relationships were observed within each class and the traditional teleost-amphibian-reptile-bird and mammal evolutionary relationships were followed (Fig. 3A). This was verified by the similarity of sequences of the LBD of ERα among species (Fig. 4). In detail, sequence similarities of about 82–93% among teleosts, 99% between birds and 98–99% among mammals were observed, and sequence similarities among classes were relatively small (Fig. 5).

Some exceptions were observed in dolphin, such as NR2A1 and NR2A2 (Fig. S1). Though dolphin, which diverged from artiodactyls approximately 50 million years ago [30], was thought to show the closest relationship with human among the 12 vertebrates, 32% of NRs showed closer relationships between rodents and human than between dolphin and human. Similarities between sequences of the DBD and LBD also confirmed this likely historical divergence. In rodents, 13% of amino acid sequences of the DBD and 26% of those of the LBD were more similar to those of the human than were those of dolphin (Fig. 3B). These variations in NRs in dolphin were possibly the result of positive Darwinian selection, the major driving force for adaptive evolution and diversification among species, during adaptation to the radical habitat transition from land to a marine environment. Though an increasing amount of toxicological research has been performed using dolphins, and extrapolations from dolphin to human were thought to be more significant, results of the present study demonstrated more variations, indicating that more genetic characteristics should be taken into account when assessing toxicities of chemicals based on results of studies with dolphins.

In addition, since PXR and CAR displayed the largest variations and were absent in several vertebrates used in this study (Fig. 2 and 4), more comparisons among species were conducted. The presence of NR1I (VDR, PXR and CAR) genes was examined in 35 vertebrate species (20 mammals, 5 birds, 2 reptiles, 1 amphibian and 7 teleost fishes) for which complete sequences of genomes were available, and unexpected patterns were shown for their evolution. VDR genes appeared in all vertebrate genomes, a result consistent with previous reports that VDR could be detected in mammals, birds, amphibians, reptiles, teleost fishes, and even the sea lamprey [31]. PXR (also known as SXR) appeared in most teleost fishes (except for stickleback), amphibians and mammals, but was totally absent from reptiles and birds. Though CAR also appeared in all mammals, it exhibited quite different patterns in other classes. CAR was mostly absent in birds (except for chicken), but retained in reptiles and amphibians, and appeared in lobe-finned fishes and tetrapods (Sarcopterygii) (Fig. 3E). Since Sarcopterygii appeared nearly 400 million years ago during the Devonian, and are widely accepted as ancestors of all Tetrapoda, including amphibians, reptiles, birds and mammals [32], the appearance of CAR in Sarcopterygii possibly indicates that CAR arose much earlier than previously thought. In general, these results revealed a novel evolutionary relationship for PXR/CAR. These two NRs likely arose through duplication events and coexisted in ancient Sarcopterygii, then descended into amphibians and mammals, but one of them was absent from reptiles and both were absent from most birds (Fig. S2).
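The presence and absence pattern of the NR1I genes described above can be written down as a small lookup table, which makes the comparison among classes easier to query. This is a sketch in Python that only transcribes statements from the text; the status labels ("present", "partial", "absent") and the helper name are assumptions, and combinations the text does not state explicitly are left as None.

```python
# Presence of NR1I genes by vertebrate class, transcribed from the text.
# "partial" = present in some species of the class only; None = not stated explicitly.
NR1I_PRESENCE = {
    "mammals":    {"VDR": "present", "PXR": "present", "CAR": "present"},
    "birds":      {"VDR": "present", "PXR": "absent",  "CAR": "partial"},  # CAR only in chicken
    "reptiles":   {"VDR": "present", "PXR": "absent",  "CAR": "present"},
    "amphibians": {"VDR": "present", "PXR": "present", "CAR": "present"},
    "teleosts":   {"VDR": "present", "PXR": "partial", "CAR": None},       # PXR absent in stickleback
}


def classes_with(genes, table=NR1I_PRESENCE):
    """Return classes in which every listed gene is present in at least some species."""
    found = []
    for vertebrate_class, status in table.items():
        if all(status[gene] in ("present", "partial") for gene in genes):
            found.append(vertebrate_class)
    return sorted(found)


both_pxr_and_car = classes_with(["PXR", "CAR"])
print(both_pxr_and_car)
print(classes_with(["VDR"]))
```

Querying for classes that retain both PXR and CAR returns amphibians and mammals, which is the lineage along which the text suggests both genes descended.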

Figure 1 | Identification of NRs in genomes of 12 toxicological vertebrate models. (A) Total number of NRs in each vertebrate genome. (B) The genomic distributions of NRs in seven vertebrate species. (C) The number of NRs for each type (NR0B-NR6A) and the paralogous gene numbers (P.G.) in total.

Alignment of sequences of DBD and LBD. Since cross-species extrapolations from surrogate vertebrate species to humans are usually considered to be crucial for human risk assessment of chemicals, a better understanding of similarities of these NR sequences among species will be useful to facilitate these extrapolations and to better understand the toxicities of environmental chemicals. In the present study, pairwise alignments were constructed between sequences of DBD/LBD of 48 human NRs and their corresponding orthologs in the other eleven vertebrate species (Fig. 4). As expected, DBDs of the orthologous proteins generally shared relatively great conservation with sequences in human (Fig. 4, left), especially for the mouse, rat and dolphin, in which 94%–100% sequence similarities were observed for most NRs, except CAR (70%–89%), and almost 70% (31/46, 32/46 and 31/42, respectively) of orthologous proteins showed 100% similarity with sequences of the human. For bird, reptile, amphibian and teleost fishes, most NRs also displayed conservation of sequences (usually >90%), especially RORβ (100% for all species). There were also some exceptions, such as PXR (61%–73%), CAR (64%–67%), and PPARα and TR2 in teleosts (87%–90% and 84%–87%, respectively), which indicate potential alterations in target genes and signals for these NRs among vertebrate species.
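The percentages quoted above come from pairwise alignments of each human domain with its ortholog. A minimal Python sketch of one common way to score such an alignment is given below. It assumes the score is percent identity over aligned, non-gap columns; the text speaks of "similarity" and does not specify the scoring scheme, so this is an assumption. The sequences are short toy fragments, not real receptor sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip gapped columns
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
reference = "CEGCKGFFRR"
ortholog = "CEGCKAFFKR"
print(percent_identity(reference, ortholog))  # 8 of 10 columns identical
```

Applying such a function to every human NR and each of the eleven other species would yield a matrix of the kind summarized in Fig. 4.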

Figure 2 | Nuclear receptor families in 12 model vertebrates. Each nuclear receptor is presented as a colored block. The white spaces indicate that no ortholog was identified. The nuclear receptor family of each vertebrate species is marked with a different color. From left to right: human, mouse, rat, dolphin, chicken, duck, turtle, frog, zebrafish, medaka, tilapia and stickleback.

Figure 3 | Characteristics of the 12 NR families. (A) Phylogenetic tree for 12 NR3A1 (ERα) genes. (B) The evolutionary relationships of NRs among dolphin, rodents and human. Left: the proportions of dolphin NRs with closer relationships with human compared to rodents are presented as percent/number in blue. The proportions of rodent NRs with closer relationships with human are presented as percent/number in orange. Green represents the numbers of NRs with equivalent sequence similarities with human for dolphin and rodents. Right: phylogenetic tree for NR2C1 and NR2A1, showing the different positions of NRs for dolphin. (C) Comparative searches for the ten missing NRs in five bird species. (D) Paralogous gene copy numbers for each type of NR. (E) Comparative searches for NR1I genes (VDR, PXR and CAR) in 35 vertebrates, including 20 mammals, 5 birds, 2 reptiles, 1 amphibian and 7 teleosts (details are described in Table S4). The phylogenetic tree was developed utilizing 35 full amino acid sequences of VDR.

Figure 4 | Pairwise alignments between DBD/LBD amino acid sequences of 48 human NRs and the corresponding orthologs in the other eleven vertebrate species. Left: DBD sequence comparisons. Right: LBD sequence comparisons. The sequence similarities are presented as the percentage (%) and relevant color. NRs with incomplete amino acid sequences of DBD/LBD were not included in this comparison.

Compared to the more conserved sequences of DBD regions of NRs among species, sequences of the LBD displayed more variation. The greatest variation was observed for DAX1 (40%–81%), while the least variation was observed for COUP-TFII (99%–100%) compared with those in human (Fig. 4, right). To the best of our knowledge, this is the first time that LBDs of all NRs have been compared among vertebrates, which offers a broader and novel insight into LBD differences between species and between multiple NRs. In the present study, three groups were identified in general based on similarities in sequences of NRs.

The first group contained 13 NRs including THRα, THRβ, RARα, RARβ, RARγ, RORα, RXRα, RXRβ, RXRγ, COUP-TFII, ERRγ, NURR1 and LRH1 (except some orthologs for RARα, RORα, RXRβ, RXRγ and NURR1) with ≥90% similarity of sequences of the LBDs for all eleven vertebrates compared with those of the human (Fig. 4, right). As observed for RXRα, 97–100% similarities in sequences, for the best aligned orthologs, were observed from multiple sequence alignment (Fig. 5). Variations in conservation of sequences, window averaged across 10 amino acid residues, showed that there were fewer than 5 variations in amino acid residues among these 12 vertebrate species, and most of them were observed in α-helix 3 to α-helix 6 of the LBD structures (Fig. 5). RXRα commonly functions as a heterodimer with other NRs and mainly mediates signaling of hormones derived from vitamin A (retinol) such as 9-cis retinoic acid, and is involved in multiple physiological functions of vertebrates such as embryonic patterning and organogenesis, proliferation of cells and differentiation of tissues [33]. It has been reported that among vertebrates, such as mouse and human, LBDs of RXRα interacted with similar types of ligands with similar binding affinities [34,35]. Sequence similarities of these 13 NRs among vertebrates suggested potential straightforward interspecies extrapolations when assessing toxicity of chemicals via these NRs.

Approximately 77% of NRs, such as the well-known ERs, AR, PR, PPARs and VDR, can be sorted into the second group, exhibiting 60–100% similarities of sequences (for the best aligned orthologs) compared with those of human. Similarities in sequences of these NRs among the four fishes were substantially the same and were usually ≥90% in mouse, rat and dolphin, showing apparent differences in sequences of amino acids between teleosts and mammals. Specifically, LBDs of NRs in the second group, such as ERα and PPARγ, always shared the same variations in amino acids within the four fishes, which were quite different from those of mammals (Fig. 5 for ERα). ERα is a well-studied NR, activated by endogenous and exogenous estrogens, and plays a variety of central physiological roles, such as maintenance of reproductive, cardiovascular and central nervous systems in vertebrates [36]. Potencies of binding of ligands to LBDs of ERα were different for fishes when compared to mammals. It has been reported that widespread chemicals like 4-t-octylphenol and bisphenol A (BPA) bound with greater avidity to rainbow trout ER than to that of human or rat. The range of ligands also differed: of 34 chemicals tested, 29 bound to ER of rainbow trout, while only 20 of them bound to ER of human/rat [37]. PPARγ is also a well-studied transcription factor, which can be activated by fatty acids and is involved in lipid and glucose metabolism [38]. Reports on binding strengths of LBDs for PPARγ were rare, but interspecies extrapolations of LBD binding activities can likely be estimated, owing to the similar sequence characteristics of PPARγ and ERα.
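The conservation profiles discussed above for RXRα are window averaged across 10 residues. One way this could be computed is sketched below in Python. It assumes a sliding (overlapping) window over a per-column identity score against the human sequence; the text does not say whether the windows overlap, so that choice, like the function names, is an assumption.

```python
def column_identity(reference, other):
    """Per-column score: 1.0 where the aligned residues match, else 0.0."""
    return [1.0 if a == b else 0.0 for a, b in zip(reference, other)]


def window_average(values, window=10):
    """Sliding-window mean; returns one value per full window position."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return []
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window by one column: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Toy example with a window of 2 so the result is easy to follow by hand.
scores = column_identity("ACDE", "ACFE")   # [1.0, 1.0, 0.0, 1.0]
print(window_average(scores, window=2))
```

Regions where the windowed average dips mark stretches of the LBD that differ from the human sequence, which is how the text localizes the RXRα variation to α-helix 3 through α-helix 6.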

Figure 5 | Variations in LBD sequence conservation across the sequence of RXRα, ERα and SHP. Left: LBD sequences for eleven vertebrates compared to the related human nuclear receptors. All sequences were window averaged across 10 residues. Right: multiple sequence alignments among the 12 vertebrates. The sequence similarities are presented as the percentage (%) and relevant color. The LBD sequence of ERα in dolphin was not included in this comparison due to the incomplete amino acid sequence.

In the third group, with less than 85% similarity in sequences of eleven vertebrate species compared with those in human, four NRs including PXR, CAR, DAX1 and SHP (Fig. 4) were classified as being different from human. DAX1 and SHP, which belong to the subfamily NR0B, displayed the greatest variations among NRs and among vertebrates (Fig. 4 and 5), a result consistent with previous reports that NRs in the NR0B group are a unique class of NRs with among-species variability in sequences and lacking DBD domains [18]. PXR and CAR were also assigned to this group, and exhibited apparent differences among vertebrates and even among fishes. PXR and CAR can be activated by xenobiotics and have relatively broad abilities to bind ligands [39]. The unusually great diversity in sequences of the LBD among species could be related to diversity in binding activities among species. This is exemplified by the fact that phenobarbital, a pharmaceutical that is generally detectable in effluents of municipal wastewater treatment plants (WWTP), was a moderate activator of the zebrafish PXR and exhibited greater binding affinity with human PXR, while it did not bind to PXR of mouse [39]. These differences among species might be due to differences in diet and physiology among vertebrates, and such large differences in sequences of PXR and CAR among vertebrates complicate in silico extrapolations.
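The three-way grouping of NRs by LBD similarity to human can be expressed as a simple rule. The Python sketch below is a simplification under stated assumptions: group 1 requires every ortholog to reach 90%, group 3 requires every ortholog to fall below 85%, and everything else is group 2. The text allows exceptions for some orthologs in group 1, which this rule ignores. The similarity ranges used are those quoted in the text for RXRα, COUP-TFII and DAX1; the 60–100% entry stands for a generic second-group receptor.

```python
def lbd_group(similarities):
    """Assign a group (1, 2 or 3) from LBD similarities (%) to the human ortholog."""
    lowest, highest = min(similarities), max(similarities)
    if lowest >= 90:
        return 1   # conserved across all species compared
    if highest < 85:
        return 3   # divergent from human in every species compared
    return 2       # intermediate, variable among species


# (min, max) similarity ranges quoted in the text.
quoted_ranges = {
    "RXRα": (97, 100),
    "COUP-TFII": (99, 100),
    "DAX1": (40, 81),
    "generic second-group NR": (60, 100),
}

groups = {name: lbd_group(bounds) for name, bounds in quoted_ranges.items()}
for name, group in groups.items():
    print(f"{name}: group {group}")
```

A rule of this kind makes explicit where the group boundaries lie and which receptors sit close to them, which matters when deciding how far results from a model species can be extrapolated.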

Here, for the first time, genes that code for NRs and their relative characteristics are provided for 12 vertebrate species used as model animals in screening of toxic potencies of chemicals. These results will help understanding of the NRs in vertebrates and will be useful for clarifying mechanisms of toxic effects of environmental chemicals on these model species and also for extrapolating from the effects on these surrogates to human.

Methods

Identification of NRs in 12 vertebrate genomes. Identification of sequences for NRs was performed as described previously40,41 with slight modifications. In brief, the putative NRs for each vertebrate were identified through a combination of BLASTn and BLASTp searches of the genome and protein databases, which were obtained from NCBI and Ensembl. The nucleotide and protein sequences of 165 described NRs in three vertebrates (48 in human, 49 in mouse and 68 in Fugu rubripes) were downloaded from GenBank and used as templates for interrogating the vertebrate databases. Nucleotide homology searches were performed using the full nucleotide sequences of each of the 165 NRs against the 12 genomic sequence databases at NCBI by use of nucleotide BLAST with the blastn algorithm and an e-value cut-off of 1e-04. Protein sequences were then used to construct multiple sequence alignments with ClustalX2 (http://www.clustal.org/clustal2/), from which the amino acid sequences of the DNA-binding domain (DBD) and the ligand-binding domain (LBD) were delineated. BLASTp searches were performed using the conserved DBD plus LBD domains against the non-redundant vertebrate protein sequence database at NCBI by use of protein BLAST with the blastp algorithm and an e-value cut-off of 1e-25. The e-value cut-offs were set to be just loose enough to find all the Fugu NRs when using human NRs as queries. Genes identified by BLASTn and BLASTp searches were then combined, and individual putative genes were sorted according to their unique DNA and amino acid sequences. All of these putative genes were verified with the online software NRpred and iNR-PhysChem to remove false-positive hits, and NR0B1 and NR0B2, which are known to lack the DBD region, were added to the final sets of NRs. Details of the sequence searches are shown in Table S1. Finally, complete sequences for each NR in each vertebrate species were loaded into the Ensembl database. The nomenclature of NRs was based on Ensembl's GeneTree and Orthology annotations.
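The pooling step described above (BLASTn hits at an e-value cut-off of 1e-04, BLASTp hits at 1e-25, then reduction to unique putative genes) can be sketched as follows. This is only an illustration under stated assumptions: the `Hit` record, the gene identifiers and the inclusive treatment of the cut-off are hypothetical, and the sketch does not reproduce the authors' actual pipeline or the later NRpred and iNR-PhysChem verification.

```python
from typing import Iterable, List, NamedTuple, Set


class Hit(NamedTuple):
    gene_id: str    # hypothetical identifier of the matched subject gene
    evalue: float   # e-value reported by the BLAST search


BLASTN_CUTOFF = 1e-4    # nucleotide search cut-off stated in the text
BLASTP_CUTOFF = 1e-25   # protein search cut-off stated in the text


def passing_genes(hits: Iterable[Hit], cutoff: float) -> Set[str]:
    """Gene IDs whose e-value is at or below the cut-off (inclusive by assumption)."""
    return {hit.gene_id for hit in hits if hit.evalue <= cutoff}


def candidate_genes(blastn_hits: Iterable[Hit], blastp_hits: Iterable[Hit]) -> List[str]:
    """Union of genes found by either search, de-duplicated and sorted."""
    pooled = passing_genes(blastn_hits, BLASTN_CUTOFF) | passing_genes(blastp_hits, BLASTP_CUTOFF)
    return sorted(pooled)


# Made-up hits for illustration only.
blastn = [Hit("geneA", 1e-10), Hit("geneB", 1e-2)]
blastp = [Hit("geneA", 1e-30), Hit("geneC", 1e-26), Hit("geneD", 1e-20)]
candidates = candidate_genes(blastn, blastp)
```

Taking the union rather than the intersection mirrors the statement that genes identified by the two searches were combined; false positives are then left for the downstream verification step.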

Genomic distributions. The genomic location of each nuclear receptor in seven vertebrate genomes (human, mouse, rat, chicken, zebrafish, medaka and stickleback) was retrieved from the Ensembl annotations and then mapped onto complete vertebrate karyograms.

Analyses of sequences of DBD and LBD. Sequences of peptides in the DBD and LBD domains for each NR were identified by use of Pfam software (http://pfam.sanger.ac.uk/, Pfam 27.0) and modified manually, based on characteristics of DBD and LBD regions reported previously. The DBD, which is classified as a type-II zinc finger motif, corresponds to a 75–80 amino acid residue segment that starts two residues before the first conserved cysteine and encompasses both C4 zinc fingers. The LBD, a flexible unit made of α-helices and containing 170 to 210 amino acid residues, begins at the 12th residue of α-helix 3 and extends through α-helix 10 (refs 42,43).

The pairwise alignments between sequences of the DBD and LBD of each human protein and the corresponding orthologs in the other 11 vertebrates were constructed by use of the NCBI BLASTp software with default parameters. Similarities in sequences were calculated as the number of identical residues divided by the total number of aligned residues in human.
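The similarity measure just described (identical residues over aligned human residues) can be written as a short function. This is a sketch under assumptions: it takes two already-aligned, equal-length strings with `-` as the gap character, and it counts every human residue in the alignment in the denominator, including those facing a gap in the ortholog. The paper does not specify how gaps were handled, so that choice and the function name are illustrative.

```python
def percent_identity(human_aln: str, ortholog_aln: str, gap: str = "-") -> float:
    """Percent of aligned human residues that are identical in the ortholog.

    Assumption: columns where human has a gap are skipped; columns where
    only the ortholog has a gap count as aligned but non-identical.
    """
    if len(human_aln) != len(ortholog_aln):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for human_res, ortholog_res in zip(human_aln, ortholog_aln):
        if human_res == gap:
            continue            # not a human residue, so not in the denominator
        aligned += 1
        if human_res == ortholog_res:
            identical += 1
    if aligned == 0:
        raise ValueError("no human residues in alignment")
    return 100.0 * identical / aligned


# Toy alignment: 5 human residues, 4 of them identical in the ortholog.
similarity = percent_identity("ACDE-F", "ACDQGF")
```

Under a measure of this kind, a receptor falls into the least conserved group discussed in the text when the value for its LBD drops below 85%.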

Phylogenetic analysis. Phylogenetic trees were constructed by use of amino acid sequences of 48 types of NRs downloaded from Ensembl, based on the set of homologous NRs in human. Only full-length molecules were included in the analysis. Some genes without complete amino acid sequences in the Ensembl database were retrieved from the NCBI/EMBL/DDBJ databases (Table S3) and were also included. The Ensembl ID of each NR used in the analyses is available in Table S2. Conserved sequences of DBD and LBD for each NR were also isolated and used as a supportive analysis. Sequences of DBD and LBD were combined and then aligned, except for NR0B1 and NR0B2. Multiple alignments of amino acid sequences were generated by use of ClustalX2 with default parameters, and the results were used for construction of phylogenetic trees by implementation of the Neighbour-Joining and Maximum-Likelihood algorithms with a Poisson model in MEGA6 (http://www.megasoftware.net/mega.php). Confidence for branching patterns was assessed by bootstrap analysis (1000 replicates). For the NR1I1 (VDR) analysis, the full amino acid sequences of NR1I1 in 35 vertebrates, including 20 mammals, 5 birds, 2 reptiles, 1 amphibian and 7 teleost fishes (Table S4), were downloaded from the Ensembl database. These sequences were then aligned and used for gene phylogenetic analysis by the same method described above.
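For readers unfamiliar with the Poisson model named above, the standard Poisson-corrected amino acid distance is d = -ln(1 - p), where p is the proportion of sites that differ between two aligned sequences. The sketch below computes it for one pair. It is not MEGA6 code; skipping gapped columns (pairwise deletion) and the function name are assumptions for illustration.

```python
import math


def poisson_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Poisson-corrected distance d = -ln(1 - p) between two aligned sequences.

    Assumption: columns with a gap in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a != res_b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differing / compared
    if p >= 1.0:
        raise ValueError("distance undefined when every site differs")
    return -math.log(1.0 - p)


# Toy pair differing at 1 of 10 sites, so p = 0.1.
distance = poisson_distance("ACDEFGHIKL", "ACDEFGHIKV")
```

A matrix of such pairwise distances is the usual input to a Neighbour-Joining tree; the correction matters most for divergent pairs, where multiple substitutions at the same site make the raw proportion p an underestimate.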

1. Mangelsdorf, D. J. et al. The nuclear receptor superfamily: the second decade. Cell 83, 835–839 (1995).

2. Pardee, K., Necakov, A. S. & Krause, H. Nuclear receptors: small molecule sensors that coordinate growth, metabolism and reproduction. Subcell. Biochem. 52, 123–153 (2011).

3. Janosek, J., Hilscherova, K., Blaha, L. & Holoubek, I. Environmental xenobiotics and nuclear receptors - interactions, effects and in vitro assessment. Toxicol. in Vitro 20, 18–37 (2006).

4. Grun, F. & Blumberg, B. Environmental obesogens: organotins and endocrine disruption via nuclear receptor signaling. Endocrinology 147, S50–S55 (2006).

5. Damstra, T., Barlow, S., Bergman, A., Kavlock, R. & Van Der Kraak, G. eds. International programme on chemical safety global assessment: the state-of-the-science of endocrine disruptors. Geneva: World Health Organization (2002). Available at: http://www.who.int/ipcs/publications/new_issues/endocrine_disruptors/en/ (Accessed: 23rd December 2014).

6. Huang, R. et al. Chemical genomics profiling of environmental chemical modulation of human nuclear receptors. Environ. Health. Perspect. 119, 1142–1148 (2011).

7. Toppari, J. et al. Male reproductive health and environmental xenoestrogens. Environ. Health. Perspect. 104, 741–803 (1996).

8. Kortenkamp, A. et al. State of the art assessment of endocrine disrupters, final report. 2011. Available at: http://ec.europa.eu/environment/chemicals/endocrine/documents/studies_en.htm (Accessed: 23rd December 2014).

9. Kavlock, R. et al. Update on EPA's ToxCast Program: Providing high throughput decision support tools for chemical risk management. Chem. Res. Toxicol. 25, 1287–1302 (2012).

10. Martin, M. T. et al. Impact of environmental chemicals on key transcription regulators and correlation to toxicity end points within EPA's ToxCast program. Chem. Res. Toxicol. 23, 578–590 (2010).

11. Kuiper, G. G. J. M. et al. Interaction of estrogenic chemicals and phytoestrogens with estrogen receptor beta. Endocrinology 139, 4252–4263 (1998).

12. Sonnenschein, C. & Soto, A. M. An updated review of environmental estrogen and androgen mimics and antagonists. J. Steroid Biochem. Mol. Biol. 65, 143–150 (1998).

13. Zoeller, R. T. Environmental chemicals as thyroid hormone analogues: new studies indicate that thyroid hormone receptors are targets of industrial chemicals? Mol. Cell. Endocrinol. 242, 10–15 (2005).

14. Jacobs, M. N., Nolan, G. T. & Hood, S. R. Lignans, bacteriocides and organochlorine compounds activate the human pregnane X receptor (PXR). Toxicol. Appl. Pharmacol. 209, 123–133 (2005).

15. Chang, T. K. & Waxman, D. J. Synthetic drugs and natural products as modulators of constitutive androstane receptor (CAR) and pregnane X receptor (PXR). Drug Metab. Rev. 38, 51–73 (2006).

16. Zhao, Y. B., Luo, K., Fan, Z. L., Huang, C. & Hu, J. Y. Modulation of benzo[a]pyrene-induced toxic effects in Japanese medaka (Oryzias latipes) by 2,2′,4,4′-tetrabromodiphenyl ether. Environ. Sci. Technol. 47, 13068–13076 (2013).

17. DeKeyser, J. G., Laurenzana, E. M., Peterson, E. C., Chen, T. & Omiecinski, C. J. Selective phthalate activation of naturally occurring human constitutive androstane receptor splice variants and the pregnane X receptor. Toxicol. Sci. 120, 381–391 (2011).

18. Zhang, Z. et al. Genomic analysis of the nuclear receptor family: new insights into structure, regulation, and evolution from the rat genome. Genome Res. 14, 580–590 (2004).

19. Maglich, J. M. et al. The first completed genome sequence from a teleost fish (Fugu rubripes) adds significant diversity to the nuclear receptor superfamily. Nucleic Acids Res. 31, 4051–4058 (2003).

20. Germain, P., Staels, B., Dacquet, C., Spedding, M. & Laudet, V. Overview of nomenclature of nuclear receptors. Pharmacol. Rev. 58, 685–704 (2006).

21. Missbach, M. et al. Thiazolidine diones, specific ligands of the nuclear receptor retinoid Z receptor/retinoid acid receptor related orphan receptor alpha with potent antiarthritic activity. J. Biol. Chem. 271, 13515–13522 (1996).

22. Maloney, E. K. & Waxman, D. J. Trans-activation of PPARα and PPARγ by structurally diverse environmental chemicals. Toxicol. Appl. Pharmacol. 161, 209–218 (1999).

23. Yang, C. & Chen, S. Two organochlorine pesticides, toxaphene and chlordane, are antagonists for estrogen-related receptor alpha-1 orphan receptor. Cancer Res. 59, 4519–4524 (1999).

24. Showell, C. & Conlon, F. L. The western clawed frog (Xenopus tropicalis): an emerging vertebrate model for developmental genetics and environmental toxicology. Cold Spring Harb. Protoc. 2009, pdb.emo131 (2009); DOI:10.1101/pdb.emo131.

25. Hill, A. J., Teraoka, H., Heideman, W. & Peterson, R. E. Zebrafish as a model vertebrate for investigating chemical toxicity. Toxicol. Sci. 86, 6–19 (2005).

26. Ankley, G. T. & Johnson, R. D. Small fish models for identifying and assessing the effects of endocrine-disrupting chemicals. Inst. Lab Anim. Res. 45, 469–483 (2004).

27. Ciesielski, F., Rochel, N., Mitschler, A., Kouzmenko, A. & Moras, D. Structural investigation of the ligand binding domain of the zebrafish VDR in complexes with 1alpha, 25(OH)2D3 and Gemini: purification, crystallization and preliminary X-ray diffraction analysis. J. Steroid Biochem. Mol. Biol. 89–90, 55–59 (2004).

28. Howarth, D. L. et al. Two farnesoid X receptor α isoforms in Japanese medaka (Oryzias latipes) are differentially activated in vitro. Aquat. Toxicol. 98, 245–255 (2010).

29. Kapsimali, M., Bourrat, F. & Vernier, P. Distribution of the orphan nuclear receptor Nurr1 in medaka (Oryzias latipes): cues to the definition of homologous cell groups in the vertebrate brain. J. Comp. Neurol. 431, 276–292 (2001).

30. Meredith, R. W. et al. Impacts of the cretaceous terrestrial revolution and KPg extinction on mammal diversification. Science 334, 521–524 (2011).

31. Whitfield, G. K. et al. Cloning of a functional vitamin D receptor from the lamprey (Petromyzon marinus), an ancient vertebrate lacking a calcified skeleton and teeth. Endocrinology 144, 2704–2716 (2003).

32. Georges, D. & Blieck, A. Rise of the earliest tetrapods: an early Devonian origin from marine environment. PLoS ONE 6, e22136 (2011); DOI:10.1371/journal.pone.0022136.

33. Szanto, A. et al. Retinoid X receptors: X-ploring their (patho) physiological functions. Cell Death Differ. 11, S126–S143 (2004).

34. Heyman, R. A. et al. 9-cis retinoic acid is a high affinity ligand for the retinoid X receptor. Cell 68, 397–406 (1992).

35. Mangelsdorf, D. J. et al. Characterization of three RXR genes that mediate the action of 9-cis retinoic acid. Genes Dev. 6, 329–344 (1992).

36. Heldring, N. et al. Estrogen receptors: How do they signal and what are their targets. Physiol. Rev. 87, 905–931 (2007).

37. Matthews, J., Celius, T., Halgren, R. & Zacharewski, T. Differential estrogen receptor binding of estrogenic substances: a species comparison. J. Steroid Biochem. Mol. Biol. 74, 223–234 (2000).

38. Lee, C. H., Olson, P. & Evans, R. M. Minireview: lipid metabolism, metabolic diseases, and peroxisome proliferator-activated receptors. Endocrinology 144, 2201–2207 (2003).

39. Moore, L. B. et al. Pregnane X receptor (PXR), constitutive androstane receptor (CAR), and benzoate X receptor (BXR) define three pharmacologically distinct classes of nuclear receptors. Mol. Endocrinol. 16, 977–986 (2002).

40. Thomson, S. A., Baldwin, W. S., Wang, Y. H., Kwon, G. & Leblanc, G. A. Annotation, phylogenetics, and expression of the nuclear receptors in Daphnia pulex. BMC Genomics 10, 500 (2009); DOI:10.1186/1471-2164-10-500.

41. Vogeler, S., Galloway, T. S., Lyons, B. P. & Bean, T. P. The nuclear receptor gene family in the Pacific oyster, Crassostrea gigas, contains a novel subfamily group. BMC Genomics 15, 369 (2014); DOI:10.1186/1471-2164-15-369.

42. Wurtz, J. M. et al. A canonical structure for the ligand-binding domain of nuclear receptors. Nat. Struct. Biol. 3, 87–94 (1996).

43. Greschik, H. et al. Characterization of the DNA-binding and dimerization properties of the nuclear orphan receptor germ cell nuclear factor. Mol. Cell Biol. 19, 690–703 (1999).

Acknowledgments

This study was supported by the National Natural Science Foundation of China [41330637 and 41171385] and the 111 Project (B14001). Prof. Giesy was supported by the Canada Research Chair program and a Visiting Distinguished Professorship in the Department of Biology and Chemistry and State Key Laboratory in Marine Pollution, City University of Hong Kong.

Author contributions

Y.B.Z. and J.Y.H. designed the experiments, Y.B.Z. and K.Z. performed the experiments and analyzed the data, and Y.B.Z., K.Z., J.P.G. and J.Y.H. wrote the manuscript. All authors contributed to scientific discussions of the manuscript.

Additional information

Supplementary information accompanies this paper at http://www.nature.com/scientificreports

Competing financial interests: The authors declare no competing financial interests.

How to cite this article: Zhao, Y., Zhang, K., Giesy, J.P. & Hu, J. Families of Nuclear Receptors in Vertebrate Models: Characteristic and Comparative Toxicological Perspective. Sci. Rep. 5, 8554; DOI:10.1038/srep08554 (2015).

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/


Supplementary information for:

Families of Nuclear Receptors in Vertebrate Models: Characteristic and Comparative Toxicological Perspective

Yanbin Zhao1, Kun Zhang1, John P. Giesy2,3,4, and Jianying Hu1

1MOE Laboratory for Earth Surface Processes, College of Urban and Environmental Sciences, Peking University, Beijing 100871, China

2Department of Veterinary Biomedical Sciences and Toxicology Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada

3Department of Zoology, and Center for Integrative Toxicology, Michigan State University, East Lansing, MI, USA

4Department of Biology & Chemistry and State Key Laboratory in Marine Pollution, City University of Hong Kong, Kowloon, Hong Kong, SAR, China

Address for Correspondence

Dr. Yanbin Zhao; Prof. Dr. Jianying Hu

College of Urban and Environmental Sciences

Peking University, Yi Fu Second Building

Beijing 100871 China

TEL & FAX: 86-10-62765520

Email: [email protected]; [email protected]

Figure S1. Phylogenetic analysis for 48 types of nuclear receptor genes in twelve vertebrate species. Numbers at branches indicate the bootstrap probabilities (≥90%) with 1,000 replicates. Neighbour-Joining trees of ClustalX-aligned full amino acid or DBD plus LBD sequences were constructed and displayed for the majority of NRs. For some trees, which displayed better topological structures in Maximum-Likelihood analysis, the ML trees were constructed instead.

[Tree images not reproduced. Panels, in order of appearance: NR0B1, NR0B2, NR1A1, NR1A2, NR1B1, NR1B2, NR1B3, NR1C1, NR1C2, NR1C3, NR1D2, NR1D1, NR1F2, NR1H3, NR1H4, NR1F1, NR1F3, NR1H2, NR1I1, NR1I2, NR1I3, NR2A1, NR2A2, NR2B1, NR2B2, NR2B3, NR2C1, NR2C2, NR2E3, NR2E1, NR2F6, NR2F1, NR2F2, NR3A1, NR3A2, NR3B1, NR3B2, NR3B3, NR3C1, NR3C2, NR3C3, NR3C4, NR5A2, NR5A1, NR6A1, NR4A1, NR4A3, NR4A2.]

Figure S2. Schematic diagram depicting the evolution of PXR and CAR in vertebrates.

Table S1. Details for nuclear receptor sequence searches in 12 model vertebrates.

| Species | BLASTn hits | BLASTp hits | Sum | After sorting | Verified by software | NR0B subfamily | Final set of NRs |
|---|---|---|---|---|---|---|---|
| Human | 33849 | 24967 | 58816 | 57 | 46 | 2 | 48 |
| Mouse | 23014 | 12540 | 35554 | 62 | 47 | 2 | 49 |
| Rat | 8312 | 8896 | 17208 | 70 | 47 | 2 | 49 |
| Dolphin | 2834 | 2752 | 5586 | 74 | 45 | 2 | 47 |
| Chicken | 3712 | 3761 | 7473 | 50 | 42 | 2 | 44 |
| Duck | 2381 | 4034 | 6415 | 48 | 40 | 2 | 42 |
| Turtle | 2922 | 3421 | 6343 | 48 | 46 | 2 | 48 |
| Xenopus | 2289 | 3850 | 6139 | 53 | 50 | 2 | 52 |
| Zebrafish | 9788 | 9230 | 19018 | 72 | 70 | 3 | 73 |
| Medaka | 3601 | 4090 | 7691 | 78 | 65 | 2 | 67 |
| Tilapia | 7586 | 6630 | 14216 | 83 | 71 | 3 | 74 |
| Stickleback | 571 | 268 | 839 | 64 | 64 | 2 | 66 |

Table S2. Sequence ID. for each nuclear receptor gene in Ensembl database. 36

37

Human Mouse Rat Dolphin Chicken Duck Turtle Xenopus Zebrafish Medaka Tilapia Stickleback

NR1A1 ENSG0000012

6351

ENSMUSG00

000058756

ENSRNOG00

000009066

ENSTTRG000

00016893

ENSGALG00

000000270

ENSAPLG000

00016001

ENSPSIG0000

0012754

ENSXETG000

00024399

ENSDARG00

000000151

ENSORLG000

00016941

ENSONIG000

00018247

ENSGACG000

00003766

ENSDARG00

000052654

ENSORLG000

00012005

ENSONIG000

00006456

ENSGACG000

00006540

NR1A2 ENSG0000015

1090

ENSMUSG00

000021779

ENSRNOG00

000006649

ENSTTRG000

00001859

ENSGALG00

000011294

ENSAPLG000

00006081

ENSPSIG0000

0008182

ENSXETG000

00003871

ENSDARG00

000021163

ENSORLG000

00008122

ENSONIG000

00010312

ENSGACG000

00007996

NR1B1 ENSG0000013

1759

ENSMUSG00

000037992

ENSRNOG00

000009972

ENSTTRG000

00016901

ENSGALG00

000005629

ENSAPLG000

00006377

ENSPSIG0000

0002372

ENSXETG000

00024390

ENSDARG00

000056783

ENSORLG000

00004373

ENSONIG000

00019915

ENSGACG000

00012955

ENSDARG00

000034893

ENSONIG000

00006314

ENSGACG000

00005297

NR1B2 ENSG0000007

7092

ENSMUSG00

000017491

ENSRNOG00

000024061

ENSTTRG000

00010874

ENSGALG00

000011298

ENSAPLG000

00006432

ENSPSIG0000

0007930

ENSXETG000

00007272

ENSORLG000

00008502

ENSONIG000

00010320

ENSGACG000

00007999

ENSORLG000

00016394

ENSONIG000

00006493

NR1B3 ENSG0000017

2819

ENSMUSG00

000001288

ENSRNOG00

000012499

ENSTTRG000

00002778

ENSXETG000

00012670

ENSDARG00

000034117

ENSORLG000

00015382

ENSONIG000

00012223

ENSGACG000

00009372

ENSDARG00

000054003

ENSORLG000

00007861

ENSONIG000

00019165

ENSGACG000

00000612

NR1C1 ENSG0000018

6951

ENSMUSG00

000022383

ENSRNOG00

000021463

ENSTTRG000

00004136

ENSGALG00

000022985

ENSAPLG000

00010641

ENSPSIG0000

0018221

ENSXETG000

00023454

ENSDARG00

000031777

ENSORLG000

00002413

ENSONIG000

00016715

ENSGACG000

00018958

ENSDARG00

000054323

ENSORLG000

00011091

ENSONIG000

00008831

ENSGACG000

00003703

NR1C2 ENSG0000011

2033

ENSMUSG00

000002250

ENSRNOG00

000000503

ENSTTRG000

00009416

ENSGALG00

000002588

ENSAPLG000

00004751

ENSPSIG0000

0005889

ENSXETG000

00015121

ENSDARG00

000044525

ENSORLG000

00006636

ENSONIG000

00011871

ENSGACG000

00008288

ENSDARG00

000009473

NR1C3 ENSG0000013

2170

ENSMUSG00

000000440

ENSRNOG00

000008839

ENSTTRG000

00016565

ENSGALG00

000004974

ENSAPLG000

00009031

ENSPSIG0000

0011100

ENSXETG000

00017422

ENSDARG00

000031848

ENSORLG000

00004432

ENSONIG000

00014331

ENSGACG000

00001665

NR1D1 ENSG0000012

6368

ENSMUSG00

000020889

ENSRNOG00

000009329

ENSTTRG000

00016894

ENSPSIG0000

0014806

ENSXETG000

00024397

ENSDARG00

000033160

ENSONIG000

00009283

ENSGACG000

00009356

NR1D2 ENSG0000017

4738

ENSMUSG00

000021775

ENSRNOG00

000046912

ENSTTRG000

00010829

ENSGALG00

000011291

ENSAPLG000

00005753

ENSPSIG0000

0008488

ENSXETG000

00003869

ENSDARG00

000003820

ENSORLG000

00016431

ENSONIG000

00008699

ENSGACG000

00012958

ENSDARG00

000009594

ENSONIG000

00010308

ENSGACG000

00007986

NR1D4 ENSDARG00

000031161

ENSORLG000

00007837

ENSONIG000

00012213

ENSGACG000

00000614

ENSDARG00

000059370

ENSORLG000

00015399

ENSONIG000

00019164

NR1F1 ENSG0000006

9667

ENSMUSG00

000032238

ENSRNOG00

000027145

ENSTTRG000

00007718

ENSGALG00

000003759

ENSAPLG000

00005866

ENSPSIG0000

0011314

ENSXETG000

00021123

ENSDARG00

000031768

ENSORLG000

00007645

ENSONIG000

00015289

ENSDARG00

000001910

ENSONIG000

00015603

NR1F2 ENSG0000019

8963

ENSMUSG00

000036192

ENSRNOG00

000013413

ENSTTRG000

00008387

ENSGALG00

000015150

ENSAPLG000

00007187

ENSPSIG0000

0005579

ENSXETG000

00031251

ENSDARG00

000033498

ENSORLG000

00012441

ENSONIG000

00010762

ENSGACG000

00011556

ENSXETG000

00008148

NR1F3 ENSG0000014

3365

ENSMUSG00

000028150

ENSRNOG00

000046831

ENSTTRG000

00003151

ENSGALG00

000025988

ENSAPLG000

00013051

ENSPSIG0000

0008995

ENSXETG000

00002131

ENSDARG00

000087195

ENSORLG000

00009486

ENSONIG000

00004686

ENSGACG000

00012280

ENSGALG00

000001035

ENSAPLG000

00011493

ENSPSIG0000

0016262

ENSDARG00

000057231

ENSORLG000

00003765

ENSONIG000

00010247

ENSGACG000

00015341

ENSDARG00

000017780

ENSORLG000

00014886

ENSONIG000

00006222

NR1H3 ENSG0000002

5434

ENSMUSG00

000002108

ENSRNOG00

000013172

ENSTTRG000

00014149

ENSGALG00

000008202

ENSAPLG000

00010925

ENSPSIG0000

0010360

ENSXETG000

00000307

ENSDARG00

000043170

ENSORLG000

00001286

ENSONIG000

00005828

ENSGACG000

00017167

NR1H2 ENSG0000013

1408

ENSMUSG00

000060601

ENSRNOG00

000019812

ENSTTRG000

00002416

NR1H5 ENSMUSG00

000048938

ENSRNOG00

000023073

ENSGALG00

000002170

ENSAPLG000

00008338

ENSPSIG0000

0003828

ENSXETG000

00021443

ENSDARG00

000031046

ENSONIG000

00009252

ENSGACG000

00004938

NR1H4 ENSG0000001

2504

ENSMUSG00

000047638

ENSRNOG00

000007197

ENSTTRG000

00016373

ENSGALG00

000011594

ENSAPLG000

00013289

ENSPSIG0000

0005774

ENSXETG000

00030372

ENSDARG00

000057741

ENSORLG000

00011270

ENSONIG000

00014678

ENSGACG000

00011745

NR1I1 ENSG0000011

1424

ENSMUSG00

000022479

ENSRNOG00

000008574

ENSTTRG000

00012578

ENSGALG00

000026166

ENSAPLG000

00005087

ENSPSIG0000

0018108

ENSXETG000

00010658

ENSDARG00

000043059

ENSORLG000

00001063

ENSONIG000

00009200

ENSGACG000

00004763

ENSDARG00

000070721

ENSORLG000

00016402

ENSONIG000

00019378

ENSGACG000

00007975

NR1I2 ENSG0000014

4852

ENSMUSG00

000022809

ENSRNOG00

000002906

ENSTTRG000

00016650

ENSXETG000

00018029

ENSDARG00

000029766

ENSORLG000

00017953

ENSONIG000

00014385

NR1I3 ENSG0000014

3257

ENSMUSG00

000005677

ENSRNOG00

000003260

ENSTTRG000

00009227

ENSGALG00

000028624

ENSPSIG0000

0004437

ENSXETG000

00031759

NR2A1 ENSG0000010

1076

ENSMUSG00

000017950

ENSRNOG00

000008895

ENSTTRG000

00013004

ENSGALG00

000004285

ENSAPLG000

00008950

ENSPSIG0000

0012689

ENSXETG000

00001775

ENSDARG00

000021494

ENSORLG000

00016380

ENSONIG000

00016515

ENSGACG000

00011485

NR2A3 ENSGALG00

000015670

ENSAPLG000

00011331

ENSPSIG0000

0017650

ENSXETG000

00016389

ENSDARG00

000012764

ENSONIG000

00005911

NR2A2 ENSG00000164749 ENSMUSG00000017688 ENSRNOG00000008971 ENSTTRG00000003691 ENSGALG00000005708 ENSAPLG00000011794 ENSPSIG00000003756 ENSXETG00000017845 ENSDARG00000071565 ENSORLG00000006996 ENSONIG00000014490 ENSGACG00000002422

NR2B1 ENSG00000186350 ENSMUSG00000015846 ENSRNOG00000009446 ENSTTRG00000009492 ENSGALG00000002626 ENSAPLG00000013150 ENSPSIG00000011977 ENSXETG00000012733 ENSDARG00000057737 ENSORLG00000012155 ENSONIG00000013076 ENSGACG00000018189

ENSDARG00000035127 ENSORLG00000016690

NR2B2 ENSG00000204231 ENSMUSG00000039656 ENSRNOG00000000464 ENSTTRG00000004291 ENSXETG00000020416 ENSDARG00000078954 ENSORLG00000006476 ENSONIG00000020007 ENSGACG00000000096

ENSDARG00000002006 ENSORLG00000007020 ENSONIG00000002873 ENSGACG00000007982

NR2B3 ENSG00000143171 ENSMUSG00000015843 ENSRNOG00000004537 ENSTTRG00000003653 ENSGALG00000003406 ENSAPLG00000004831 ENSPSIG00000004871 ENSXETG00000004750 ENSDARG00000005593 ENSONIG00000002143 ENSGACG00000011685

ENSDARG00000004697

NR2C1 ENSG00000120798 ENSMUSG00000005897 ENSRNOG00000006983 ENSTTRG00000016305 ENSGALG00000011327 ENSAPLG00000006253 ENSPSIG00000017190 ENSXETG00000023840 ENSDARG00000045527 ENSORLG00000004114 ENSONIG00000008566 ENSGACG00000010174

NR2C2 ENSG00000177463 ENSMUSG00000005893 ENSRNOG00000010536 ENSTTRG00000009876 ENSGALG00000008519 ENSAPLG00000007538 ENSPSIG00000008928 ENSXETG00000004817 ENSDARG00000042477 ENSORLG00000010877 ENSONIG00000017240 ENSGACG00000002941

NR2E1 ENSG00000112333 ENSMUSG00000019803 ENSRNOG00000050550 ENSTTRG00000008863 ENSGALG00000015305 ENSAPLG00000010675 ENSPSIG00000006035 ENSXETG00000014853 ENSDARG00000017107 ENSORLG00000013426 ENSONIG00000013281 ENSGACG00000008934

NR2E3 ENSG00000031544 ENSMUSG00000032292 ENSRNOG00000050690 ENSTTRG00000009410 ENSGALG00000002093 ENSPSIG00000017480 ENSXETG00000005219 ENSDARG00000045904 ENSORLG00000000011 ENSONIG00000007109 ENSGACG00000004739

ENSORLG00000007175 ENSONIG00000015396

NR2F1 ENSG00000175745 ENSMUSG00000069171 ENSRNOG00000014795 ENSTTRG00000001519 ENSGALG00000027907 ENSPSIG00000009818 ENSXETG00000011594 ENSDARG00000052695 ENSORLG00000010191 ENSONIG00000011840 ENSGACG00000010385

ENSPSIG00000010198 ENSDARG00000017168

NR2F2 ENSG00000185551 ENSMUSG00000030551 ENSRNOG00000010308 ENSGALG00000007000 ENSAPLG00000010629 ENSPSIG00000017164 ENSXETG00000022346 ENSDARG00000040926 ENSORLG00000008429 ENSONIG00000015133 ENSGACG00000013235

ENSONIG00000003070 ENSGACG00000014846

NR2F5 ENSXETG00000011046 ENSDARG00000033172 ENSORLG00000016315 ENSONIG00000008594 ENSGACG00000013191

NR2F6 ENSG00000160113 ENSMUSG00000002393 ENSRNOG00000016892 ENSTTRG00000003132 ENSGALG00000027294 ENSAPLG00000003193 ENSPSIG00000013773 ENSXETG00000013531 ENSDARG00000003607 ENSORLG00000008749 ENSONIG00000010512 ENSGACG00000007766

ENSDARG00000003165 ENSORLG00000008911 ENSONIG00000010104 ENSGACG00000015583

NR3A1 ENSG00000091831 ENSMUSG00000019768 ENSRNOG00000019358 ENSTTRG00000002996 ENSGALG00000012973 ENSAPLG00000004585 ENSPSIG00000004166 ENSXETG00000012364 ENSDARG00000004111 ENSORLG00000014514 ENSONIG00000013354 ENSGACG00000008711

NR3A2 ENSG00000140009 ENSMUSG00000021055 ENSRNOG00000005343 ENSTTRG00000000517 ENSGALG00000011801 ENSAPLG00000011895 ENSPSIG00000018210 ENSXETG00000007257 ENSDARG00000016454 ENSORLG00000017721 ENSONIG00000005633 ENSGACG00000007514

ENSDARG00000034181 ENSORLG00000018012 ENSONIG00000001710 ENSGACG00000000213

NR3B1 ENSG00000173153 ENSMUSG00000024955 ENSRNOG00000021139 ENSTTRG00000010296 ENSPSIG00000016751 ENSXETG00000007211 ENSDARG00000069266 ENSORLG00000010624 ENSONIG00000001778 ENSGACG00000020287

NR3B2 ENSG00000119715 ENSMUSG00000021255 ENSRNOG00000010259 ENSTTRG00000001302 ENSGALG00000010365 ENSAPLG00000012470 ENSPSIG00000017916 ENSXETG00000013217 ENSDARG00000040151 ENSORLG00000016581 ENSONIG00000015282 ENSGACG00000010561

ENSORLG00000009126 ENSONIG00000020192 ENSGACG00000007542

NR3B3 ENSG00000196482 ENSMUSG00000026610 ENSRNOG00000002593 ENSTTRG00000006004 ENSGALG00000009645 ENSAPLG00000005309 ENSPSIG00000005595 ENSXETG00000020932 ENSDARG00000004861 ENSORLG00000011528 ENSONIG00000000573 ENSGACG00000013426

ENSXETG00000016948 ENSDARG00000011696 ENSORLG00000016819 ENSONIG00000017162 ENSGACG00000016275

NR3B4 ENSDARG00000015064 ENSONIG00000001134 ENSGACG00000004898

NR3C1 ENSG00000113580 ENSMUSG00000024431 ENSRNOG00000014096 ENSTTRG00000003260 ENSGALG00000007394 ENSAPLG00000007318 ENSPSIG00000015245 ENSXETG00000001879 ENSDARG00000025032 ENSORLG00000006022 ENSONIG00000017907 ENSGACG00000018209

ENSORLG00000001565 ENSONIG00000008483 ENSGACG00000020725

NR3C2 ENSG00000151623 ENSMUSG00000031618 ENSRNOG00000034007 ENSTTRG00000014440 ENSGALG00000010035 ENSAPLG00000015146 ENSPSIG00000006383 ENSXETG00000026061 ENSDARG00000037025 ENSORLG00000007530 ENSONIG00000010029 ENSGACG00000017193

NR3C3 ENSG00000082175 ENSMUSG00000031870 ENSRNOG00000006831 ENSTTRG00000000030 ENSGALG00000017195 ENSAPLG00000003887 ENSPSIG00000013654 ENSXETG00000005482 ENSDARG00000035966 ENSORLG00000002651 ENSGACG00000012162

NR3C4 ENSG00000169083 ENSMUSG00000046532 ENSRNOG00000005639 ENSTTRG00000004230 ENSGALG00000004596 ENSAPLG00000006566 ENSPSIG00000010176 ENSXETG00000005089 ENSDARG00000067976 ENSORLG00000008220 ENSONIG00000012854 ENSGACG00000018525

ENSORLG00000009520 ENSONIG00000017538 ENSGACG00000020332

NR4A1 ENSG00000123358 ENSMUSG00000023034 ENSRNOG00000007607 ENSTTRG00000002817 ENSAPLG00000014123 ENSPSIG00000018018 ENSXETG00000000579 ENSDARG00000000796 ENSORLG00000015557 ENSONIG00000016717 ENSGACG00000010788

ENSORLG00000015279 ENSONIG00000019260 ENSGACG00000000045

NR4A2 ENSG00000153234 ENSMUSG00000026826 ENSRNOG00000005600 ENSTTRG00000005740 ENSGALG00000012538 ENSAPLG00000012071 ENSPSIG00000008054 ENSXETG00000031753 ENSDARG00000017007 ENSORLG00000016692 ENSONIG00000008976 ENSGACG00000005831

ENSXETG00000024016 ENSDARG00000044532 ENSORLG00000000050 ENSONIG00000012131

NR4A3 ENSG00000119508 ENSMUSG00000028341 ENSRNOG00000005964 ENSTTRG00000007458 ENSGALG00000013568 ENSAPLG00000011263 ENSPSIG00000012281 ENSDARG00000055854 ENSORLG00000008732 ENSONIG00000006026 ENSGACG00000009027

NR5A1 ENSG00000136931 ENSMUSG00000026751 ENSRNOG00000012682 ENSTTRG00000017390 ENSGALG00000001080 ENSAPLG00000004548 ENSPSIG00000006131 ENSXETG00000011456 ENSDARG00000017704 ENSORLG00000016486 ENSONIG00000020218 ENSGACG00000003539

ENSDARG00000023362 ENSORLG00000013196 ENSGACG00000018317

NR5A2 ENSG00000116833 ENSMUSG00000026398 ENSRNOG00000000653 ENSTTRG00000003256 ENSGALG00000002182 ENSAPLG00000009302 ENSPSIG00000003632 ENSXETG00000000314 ENSDARG00000042556 ENSORLG00000006933 ENSONIG00000012517 ENSGACG00000008896

NR5A5 ENSDARG00000039116 ENSORLG00000006019 ENSONIG00000001686 ENSGACG00000009952

NR6A1 ENSG00000148200 ENSMUSG00000063972 ENSRNOG00000013232 ENSTTRG00000017391 ENSGALG00000001073 ENSAPLG00000004788 ENSPSIG00000006445 ENSXETG00000008578 ENSDARG00000018030 ENSORLG00000016492 ENSONIG00000020217 ENSGACG00000003560

ENSDARG00000014480

NR0B1 ENSG00000169297 ENSMUSG00000025056 ENSRNOG00000003765 ENSTTRG00000013272 ENSGALG00000016287 ENSAPLG00000003894 ENSPSIG00000009740 ENSXETG00000015374 ENSDARG00000056541 ENSORLG00000011824 ENSONIG00000012111 ENSGACG00000002817

ENSONIG00000006662

NR0B2 ENSG00000131910 ENSMUSG00000037583 ENSRNOG00000007229 ENSTTRG00000016680 ENSGALG00000000887 ENSAPLG00000010744 ENSPSIG00000017134 ENSXETG00000011771 ENSDARG00000044685 ENSORLG00000004442 ENSONIG00000006772 ENSGACG00000007198

Table S3. Genes with incomplete or missing DBD/LBD regions in the Ensembl database. Genes marked in red are those for which the full sequences were retrieved from the NCBI/EMBL/DDBJ databases.

Gene and related Ensembl ID.

Human —

Mouse —

Rat NR2E3 (ENSRNOG00000050690); NR3A1 (ENSRNOG00000019358); NR3C3 (ENSRNOG00000006831); NR5A2 (ENSRNOG00000000653)

Dolphin NR1A1 (ENSTTRG00000016893); NR1B2 (ENSTTRG00000010874); NR1C1 (ENSTTRG00000004136); NR1F3 (ENSTTRG00000003151); NR1I2 (ENSTTRG00000016650); NR2A1 (ENSTTRG00000013004); NR2B1 (ENSTTRG00000009492); NR2B3 (ENSTTRG00000003653); NR2F6 (ENSTTRG00000003132); NR3A1 (ENSTTRG00000002996); NR3B2 (ENSTTRG00000001302); NR4A1 (ENSTTRG00000002817); NR4A3 (ENSTTRG00000007458)

Chicken NR1B1 (ENSGALG00000005629); NR2F6 (ENSGALG00000027294)

Duck NR1A1 (ENSAPLG00000016001); NR1B1 (ENSAPLG00000006377); NR1F2 (ENSAPLG00000007187); NR1F3b (ENSAPLG00000011493); NR1H4 (ENSAPLG00000013289); NR1I1 (ENSAPLG00000005087); NR2F6 (ENSAPLG00000003193); NR3A1 (ENSAPLG00000004585); NR3C3 (ENSAPLG00000003887); NR4A1 (ENSAPLG00000014123); NR4A2 (ENSAPLG00000012071); NR5A1 (ENSAPLG00000004548); NR6A1 (ENSAPLG00000004788); NR0B1 (ENSAPLG00000003894)

Turtle NR1A1 (ENSPSIG00000012754); NR1B1 (ENSPSIG00000002372); NR1D1 (ENSPSIG00000014806); NR2A2 (ENSPSIG00000003756); NR2E3 (ENSPSIG00000017480); NR2F1b (ENSPSIG00000010198); NR3B1 (ENSPSIG00000016751); NR4A2 (ENSPSIG00000008054); NR5A1 (ENSPSIG00000006131); NR6A1 (ENSPSIG00000006445)

Xenopus NR1C2 (ENSXETG00000015121); NR1C3 (ENSXETG00000017422); NR1F3 (ENSXETG00000002131); NR2A1 (ENSXETG00000001775); NR4A2b (ENSXETG00000024016)

Zebrafish NR1C3 (ENSDARG00000031848); NR1I1 (ENSDARG00000043059)

Medaka NR1B1 (ENSORLG00000004373); NR2A1 (ENSORLG00000016380); NR2F1 (ENSORLG00000010191); NR3C3 (ENSORLG00000002651); NR6A1 (ENSORLG00000016492)

Tilapia —

Stickleback —
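Because the Ensembl identifiers in these tables share a fixed layout (the letters ENS, an optional three-letter species code, the feature letter G, and 11 digits), an identifier can be checked and assigned to its species column programmatically. The following is a minimal sketch rather than part of the original analysis: the function name `species_of` is hypothetical, and the prefix-to-species mapping is inferred from the species labels in Table S3 together with the standard Ensembl stable-ID convention.

```python
import re

# Species codes as they appear in the identifiers of these tables
# (mapping inferred from the table labels; an assumption of this sketch).
SPECIES_BY_PREFIX = {
    "": "Human",
    "MUS": "Mouse",
    "RNO": "Rat",
    "TTR": "Dolphin",
    "GAL": "Chicken",
    "APL": "Duck",
    "PSI": "Turtle",
    "XET": "Xenopus",
    "DAR": "Zebrafish",
    "ORL": "Medaka",
    "ONI": "Tilapia",
    "GAC": "Stickleback",
}

# ENS + optional 3-letter species code + 'G' (gene) + exactly 11 digits.
_GENE_ID = re.compile(r"^ENS([A-Z]{3})?G(\d{11})$")


def species_of(gene_id):
    """Return the species for a complete Ensembl gene ID, or None if the
    string is truncated, malformed, or uses an unlisted species code."""
    match = _GENE_ID.match(gene_id.strip())
    if match is None:
        return None
    return SPECIES_BY_PREFIX.get(match.group(1) or "")


if __name__ == "__main__":
    for gene_id in ("ENSG00000025434", "ENSTTRG00000014149", "ENSORLG000"):
        print(gene_id, "->", species_of(gene_id))
```

A truncated fragment such as `ENSORLG000` is rejected, which makes the check useful for detecting identifiers that were split across lines during text extraction.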

Table S4. The vertebrate species used for NR1I1 (VDR) gene phylogenetic analysis.

Common name Scientific name Common name Scientific name

Human Homo sapiens Flycatcher Ficedula albicollis

Gibbon Nomascus leucogenys Zebra Finch Taeniopygia guttata

Gorilla Gorilla gorilla gorilla Duck Anas platyrhynchos

Macaque Macaca mulatta Chicken Gallus gallus

Marmoset Callithrix jacchus Turkey Meleagris gallopavo

Bushbaby Otolemur garnettii

Cat Felis catus Anole lizard Anolis carolinensis

Dog Canis lupus familiaris Chinese softshell turtle Pelodiscus sinensis

Ferret Mustela putorius furo

Hedgehog Erinaceus europaeus Xenopus Xenopus tropicalis

Rabbit Oryctolagus cuniculus

Dolphin Tursiops truncatus Coelacanth Latimeria chalumnae

Pig Sus scrofa Tilapia Oreochromis niloticus

Opossum Monodelphis domestica Zebrafish Danio rerio

Cow Bos taurus Tetraodon Tetraodon nigroviridis

Sheep Ovis aries Medaka Oryzias latipes

Mouse Mus musculus Platyfish Xiphophorus maculatus

Rat Rattus norvegicus Stickleback Gasterosteus aculeatus

Guinea Pig Cavia porcellus

Squirrel Ictidomys tridecemlineatus
