Pseudogenes
Sean D. Pitman M.D.
© January 2005
Latest Update: May 2008

Table of Contents: Pseudogenes, Shared Pseudogenes, Signs of Function, One Man's Junk, Shared Mistakes, A New Paradigm, The Human Genome in Numbers, Pyknons

Upload: sean-pitman-md

Post on 12-Nov-2014

Pseudogenes - markers of common descent or misunderstood functional genetic elements?

Page 2: Pseudogenes

Pseudogenes are DNA sequences that resemble functional genes but are generally

thought to have no purpose.  In fact many scientists think that pseudogenes are nothing

more than discarded genetic fossils of a bygone era when they did have some sort of

important function. Of course, it logically follows that similar pseudogenes that are

shared by different species give evidence of common ancestry and even potential times

of divergence.11 For example, the eta-globin pseudogene, which is found in both

humans and chimps, has been used as an argument for the common ancestry of the

two species.

The first pseudogene was reported in 1977.1 Since that time, a large number of

these genes have been reported and described in humans and many other species.

There are two types of pseudogenes known as "processed" and "unprocessed"

pseudogenes.2,11

Processed pseudogenes are found on different chromosomes from their functional
counterparts. They lack introns and certain regulatory sequences, often end in a run of
adenines (a poly-A tract), and are flanked by direct repeats (which are associated with
movable genetic elements). They may be complete or incomplete copies of genes or mixtures
of several genes. They are believed to have arisen through a 3-step process: transcribing
DNA into RNA, splicing out the introns to make mRNA, and then copying the mRNA
back into DNA through reverse transcription. This process is thought to have
created the "L1 family of pseudogenes."2 Other theories include retroviruses as means
of pseudogene transport between different organisms.

Unprocessed pseudogenes are usually found in clusters of similar functional

sequences on the same chromosome. They usually have introns and associated

regulatory sequences. Their expression is usually prevented by a "misplaced" stop

codon or codons. There may be other changes from the "original" as the result of

deletions, insertions, and point mutations. Some form of mRNA may or may not be

produced depending on the damage to the gene. Many of these are believed to have

arisen by gene duplication, which produced an extra copy of the gene. The extra copy

could then accumulate mutations without harming the organism since it would still have

a completely functional original copy.2 (The evolutionary gene duplication hypothesis

suggests that over time, random mutations may produce a new gene with new functions

by using this gene duplicate while maintaining the original gene function5).
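The idea of a "misplaced" stop codon can be illustrated with a minimal sketch. It assumes a sequence that is already aligned to its reading frame and uses two made-up toy sequences (not real globin data). The function name `premature_stops` is hypothetical and chosen only for this illustration.

```python
# Standard DNA stop codons (TAA, TAG, TGA).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(coding_dna):
    """Return 0-based codon indices of in-frame stop codons that occur
    before the final complete codon of the sequence."""
    seq = coding_dna.upper()
    # Split into complete codons, ignoring any trailing 1-2 bases.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The last codon is where an intact gene would be expected to stop,
    # so only stops before it count as "misplaced".
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


# Hypothetical toy sequences for illustration only.
intact = "ATGGCTGAAGGTTAA"     # ATG GCT GAA GGT TAA
disrupted = "ATGGCTTGAGGTTAA"  # ATG GCT TGA GGT TAA

print(premature_stops(intact))     # no early stop codons
print(premature_stops(disrupted))  # early stop at codon index 2
```

A real analysis would also have to deal with frameshifts, introns, and the choice of reading frame, none of which this sketch attempts.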

 

 

Shared Pseudogenes

 

It is felt by many, especially evolutionary biologists, that shared pseudogenes, which
have no function in any form in different species, are evidence of common ancestry.

Comparison of DNA sequences from humans, chimps, and other mammals shows a

great number of shared pseudogenes. Perhaps the best-known example of a shared

pseudogene is the eta-globin gene.

The eta gene is located on chromosome 11 in humans and is fourth in a series of 6

beta globin genes (five are functional).4 It has no start codon (AUG) and it has several

stop codons.  So, whether or not any RNA is transcribed from it, no functional protein is made. Humans,

chimps, and gorillas have the same number of beta-globin genes arranged in the same

sequence. The exon sequences within these genes are also similar - as are the exons

of the eta gene.4 It is thought that the eta-globin gene originated by a duplication of the

gamma-A-globin gene because of the high similarity of the sequences. Also, both

genes are present in primates.

The history of the eta-globin pseudogene is thought to have begun some 140
million years ago in the ancestors of marsupials and placental mammals. After the "evolutionary

divergence" of marsupials, the gamma-globin gene formed by duplication of an existing

gene in the beta-globin family. Later, but before radiation of the orders of placental

mammals, the eta-globin gene formed from a duplication of the gamma-globin gene.

Gamma and eta genes must therefore have been present in ancestral placentals, but

presumably gamma was lost by goats (which do not have gamma) and eta was lost by

rabbits (which do not have eta).

According to this scenario, the eta gene must have been functional at first, because

it is functional in goats today. 2 It is non-functional in all primates, which is interpreted

to mean it was already non-functional in ancestral primates some 70-80 million years

ago. This interpretation implies that the eta-globin gene has been maintained for more

than 70 million years without being converted to a useful new gene and without being

eliminated through random mutations.

 

 

Signs of Function?

So, the persistence of a non-functional DNA sequence in an entire lineage for such a

supposedly long period of time seems remarkable in the context of the gene duplication

hypothesis. The very fact that pseudogenes are still present and recognizable after

tens of millions of years without any beneficial function just doesn't seem to make

sense.  Certainly, without some beneficial function, natural selection would not have

maintained their sequences for such long periods of time.  There is in fact a cost to

maintain non-functional DNA.  It takes energy to replicate and maintain DNA that

doesn't pay for its keep.  Although this cost might seem small over the short term, an

extremely small cost compounded over the course of millions of generations starts to

turn into a significant disadvantage.   So, the fact that pseudogenes have any

recognizable gene-like structure at all suggests that they do in fact serve some kind of

purpose.  

 

   The persistence of pseudogenes is in itself evidence for their activity. 

This is a serious problem for evolution, as it is expected that natural

selection would remove this type of DNA if it were useless, since DNA

manufactured by the cell is energetically costly.  Because of the lack of

selective pressure on this neutral DNA, one would expect that ‘old’

pseudogenes would be scrambled beyond recognition as a result of

accumulated random mutations.  Moreover, a removal mechanism for

neutral DNA is now known.6

 

 

 

 

     “Typically when people say that the human genome contains 27,000 genes or so,

they are referring to genes that code for proteins,” points out Michel Georges, a

geneticist at the University of Liège in Belgium. But even though that number is still

tentative—estimates range from 20,000 to 40,000—it seems to confirm that there is no

clear correspondence between the complexity of a species and the number of genes in

its genome. “Fruit flies have fewer coding genes than roundworms, and rice plants have

more than humans,” notes John S. Mattick, director of the Institute for Molecular

Bioscience at the University of Queensland in Brisbane, Australia. “The amount of

noncoding DNA, however, does seem to scale with complexity.". . . 

        "Increasingly we are realizing that there is a large collection of ‘genes’ that are

clearly functional even though they do not code for any protein” but produce only RNA,

Georges remarks. The term “gene” has always been somewhat loosely defined; these

RNA-only genes muddle its meaning further. To avoid confusion, says Claes

Wahlestedt of the Karolinska Institute in Sweden, “we tend not to talk about ‘genes’

anymore; we just refer to any segment that is transcribed [to RNA] as a ‘transcriptional

unit.’” Based on detailed scans of the mouse genome for all such elements, “we

estimate that there will be 70,000 to 100,000,” Wahlestedt announced at the

International Congress of Genetics, held this past July in Melbourne. “Easily half of

these could be noncoding.” If that is right, then for every DNA sequence that generates

a protein, another works solely through active forms of RNA—forms that are not simply

intermediate blueprints for proteins but, rather, directly alter the behavior of cells.” . . . 

        “I think this will come to be a classic story of orthodoxy derailing objective analysis

of the facts, in this case for a quarter of a century,” Mattick says. “The failure to

recognize the full implications of this particularly the possibility that the intervening

noncoding sequences may be transmitting parallel information in the form of RNA

molecules—may well go down as one of the biggest mistakes in the history of

molecular biology." [emphasis added] 16

 

 

 

Given this, it is not known if all of what are currently thought of as pseudogenes

have absolutely no function. In fact, some pseudogenes are believed to function as

sources of information for producing genetic diversity. It is thought that partial

pseudogenes are copied into functional genes during genetic recombination, producing

variants of the functional gene. This phenomenon has been reported many times, including
in various immunoglobulin genes in mice and birds, mouse histone genes, horse
globin genes, and human beta-globin genes. It is not known if this could be a possible

role for the eta-globin gene as well. However, the fact that the eta-globin pseudogene is

located between the fetal and adult genes suggests that it might play a role in gene

switching (there seems to be some preliminary evidence to this effect although the eta

gene sequence’s part in this is still unknown).

It seems that the protein-coding genes are actually rather simple informationally,
and that the real informational complexity and functionality lie in the non-coding portion of
the genome.  This portion of the genome directs when and where the protein building

blocks are placed and therefore is vitally important to the overall structure and ultimate

function of the resulting creature.  It was because of the evolutionary bias that these

non-coding regions of DNA were assumed to be junk for so long - and therefore

overlooked and unrecognized as key informational components in the genome.

Interestingly enough, such findings actually support the predictions of intelligent design

theory while countering long-held evolutionary assumptions. Of course, there are

always ad hoc modifications to explain such failed predictions resulting from an

evolutionary bias.

 

 

One Man's Junk . . .

 

Other pseudogenes and so-called transposons, such as the “Alu element” (once

thought to be completely useless), are being found to have important functions.

 

There is a growing body of evidence that Alu (a SINE – Short

Interspersed Nuclear Element) sequences are involved in gene regulation,

such as in enhancing and silencing gene activity, or can act as a receptor-

binding site… This is surely a precedent for the functionality of other types

of pseudogenes. 6, 7

 

Around 1998 Carl Schmid, a molecular biologist at the University of California at

Davis, started advancing what seemed like a nutty idea to explain Alu’s unusual affinity

for genes.  Schmid suggested Alu sequences resided near genes because they are not

really “junk” sequences, but are rather useful sequences involved with a mechanism

that helps cells repair themselves. With the entire genome map in front of them,

showing so many instances of Alu sequences around genes, scientists are beginning to

take Schmid seriously.  “It looks pretty convincing,” Francis Collins said. Others such as

M.I.T. geneticist Eric Lander agree.8

More recently in 2001, a team of molecular geneticists discovered two “hot spots”

where the same SINEs inserted independently:

 

  Vertebrate retrotransposons have been used extensively for

phylogenetic analyses and studies of molecular evolution. Information can

be obtained from specific inserts either by comparing sequence

differences that have accumulated over time in orthologous copies of that

insert or by determining the presence or absence of that specific element

at a particular site.  The presence of specific copies has been deemed to

be an essentially homoplasy-free phylogenetic character because the

probability of multiple independent insertions into any one site has been

believed to be nil. . . . We have identified two hot spots for SINE insertion

within mys-9 and at each hot spot have found that two independent SINE

insertions have occurred at identical sites.  These results have major

repercussions for phylogenetic analyses based on SINE insertions,

indicating the need for caution when one concludes that the existence of a

SINE at a specific locus in multiple individuals is indicative of common

ancestry.  Although independent insertions at the same locus may be rare,

SINE insertions are not homoplasy-free phylogenetic markers.9

 

 

Even more recently, in the May 2003 issue of Nature, Jeannie Lee published an

article entitled, "Complicity of Gene and Pseudogene" in which some interesting findings

from work done by Hirotsune et al.13 were presented:

 

"Dysfunctional in the sense that they cannot be used as a template for producing a

protein, pseudogenes are in fact nearly as abundant as functional genes.  Why have

mammals allowed their accumulation on so large a scale?  One proposed answer is

that, although pseudogenes are often cast as evolutionary relics and a nuisance to

genomic analysis, the processes by which they arise are needed to create whole gene

families, such as those involved in immunity and smell.  But, are pseudogenes

themselves merely byproducts of this process?  Or do apparent evolutionary pressures

to retain them [natural selection] hint at some hidden biological function?  For one

particular pseudogene, the latter seems to be true . . . Hirotsune and colleagues report

the unprecedented finding that the Makorin1-p1 pseudogene [located on chromosome 5

in mice] performs a specific biological task [it regulates the expression of the Makorin1

gene which is located on a completely different chromosome - chromosome 6 in mice].

The work of Hirotsune et al. is provocative for revealing the first biological function of

any pseudogene.  It challenges the popular belief that pseudogenes are simply

molecular fossils -- the evidence of Mother Nature's experiments gone awry." 12,13

 

    In yet another recent Science article by Wojciech Makalowski, the following

comments are made that seem to echo what design theorists have been saying for a

very long time:

 

     "Although catchy, the term "junk DNA" for many years repelled mainstream

researchers from studying noncoding DNA.  Who, except a small number of genomic

clochards, would like to dig through genomic garbage?  However, in science as in

normal life, there are some clochards who, at the risk of being ridiculed, explore

unpopular territories.  Because of them, the view of junk DNA, especially repetitive

elements, began to change in the early 1990s.  Now, more and more biologists regard

repetitive elements as genomic treasure." 14

   

       Then, as recently as the December 2003 issue of Annual Review of Genetics,

Balakirev and Ayala published a paper entitled, "Pseudogenes: Are They 'Junk' or

Functional DNA?"  Consider just a few of their conclusions and see if they do not again

remind you of what design theorists have been claiming for a long time - that

pseudogenes surely have important functions and therefore are not really "pseudo" after

all:

 

      Pseudogenes have been defined as nonfunctional sequences of genomic DNA

originally derived from functional genes. It is therefore assumed that all pseudogene

mutations are selectively neutral and have equal probability to become fixed in the

population. Rather, pseudogenes that have been suitably investigated often exhibit

functional roles, such as gene expression, gene regulation, generation of genetic

(antibody, antigenic, and other) diversity. Pseudogenes are involved in gene conversion

or recombination with functional genes. Pseudogenes exhibit evolutionary conservation

of gene sequence, reduced nucleotide variability, excess synonymous over

nonsynonymous nucleotide polymorphism, and other features that are expected in

genes or DNA sequences that have functional roles. . .

       An extensive and fast-increasing literature does not justify a sharp division between

genes and pseudogenes that would place pseudogenes in the class of genomic "junk"

DNA that lacks function and is not subject to natural selection. Pseudogenes are often

extremely conserved and transcriptionally active. . .

       There seems to be the case that some functionality has been discovered in all

cases, or nearly, whenever this possibility has been pursued with suitable

investigations. One may well conclude that most pseudogenes retain or acquire some

functionality and, thus, that it may not be appropriate to define pseudogenes as

nonfunctional sequences of genomic DNA originally derived from functional genes, or

as "genes that are no longer expressed but bear sequence similarity to active genes".

Rather, pseudogenes might be defined as DNA sequences derived by duplication or

retroposition from functional genes that are often subject to natural selection and

therefore retain much of the original sequence and structure because they have

acquired new regulatory or other functions, or may serve as reservoirs of genetic

variability.15

 

 

Shared Mistakes

 

 

Another interesting argument is that various pseudogenes in

different species often have certain shared "mistakes" -  that "must

have originated in a common ancestor." 11 However, there is some

evidence that nucleotide changes may not be completely random in

certain gene locations. Mutational "hotspots" have been identified in

many genes as well as pseudogenes. In these locations, point

mutations, even specific types of point mutations, are much more

common than elsewhere in the gene.

 

 

Consider the GULOP (or GULO) pseudogene for example. In most mammals this is an
active gene encoding the enzyme L-gulono-γ-lactone oxidase (LGGLO), which catalyzes
the last step in the synthesis of ascorbic acid (vitamin C). GULO is located on
chromosome 8 at p21.1 in a region that is rich in genes (see figure). As it turns out,
this particular gene is defective in humans and other primates as well as several other
creatures, including guinea pigs, bats and certain kinds of fish. Compared to the rat
GULO gene, the human version, as well as the great ape version, has large or clearly
function-disrupting deletions involving exons I-III, V-VI, VIII, and XI (see figure above).18-21
Compare this with the significant deletions of the guinea pig GULO sequence that
involve exons I, V, and VI, all of which match losses seen in the primate
sequences.  In addition to this, all four functionally detrimental stop codons (three TGA and
one TAA) that are identified in the guinea pig are shared at the same
locations in the primate GULO pseudogene.

 

Of course, it seems that we humans are able to get along just fine without this gene

because we eat a lot of foods that are rich in vitamin C, like citrus fruits. So, what's the

big deal? Well, the argument goes something like this (as per a popular Talk.Origins

essay by Edward E. Max, Ph.D.):

 

In most mammals functional GLO genes are present, inherited - according to the

evolutionary hypothesis - from a functional GLO gene in a common ancestor of

mammals. According to this view, GLO gene copies in the human and guinea pig

lineages were inactivated by mutations. Presumably this occurred separately in guinea

pig and primate ancestors whose natural diets were so rich in ascorbic acid that the

absence of GLO enzyme activity was not a disadvantage--it did not cause selective

pressure against the defective gene.

Molecular geneticists who examine DNA sequences from an evolutionary

perspective know that large gene deletions are rare, so scientists expected that non-

functional mutant GLO gene copies--known as "pseudogenes"--might still be present in

primates and guinea pigs as relics of the functional ancestral gene. . . [Beyond this],  the

theory of evolution would make the strong prediction that primates [like apes and

monkeys] would carry similar crippling mutations to the ones found in the human

pseudogene. A test of this prediction has recently been reported. A small section of the

GLO pseudogene sequence was recently compared from human, chimpanzee,

macaque and orangutan; all four pseudogenes were found to share a common crippling

single nucleotide deletion that would cause the remainder of the protein to be translated

in the wrong triplet reading frame (Ohta and Nishikimi BBA 1472:408, 1999). 11,20

 

Now, it is interesting that many, though not all, of the various substitution mutations in the
"GLO" pseudogene are shared, including a single
deletion mutation that is shared by all primates (when compared to the rat of course). If
not for common descent, why would the sequences of human, chimpanzee, gorilla and

orangutan reveal a single nucleotide deletion at position 97 in the coding region of Exon

X? What are the odds that out of 165 base pairs the same one would be mutated in all

these primates by random chance?  Pretty slim - right?  Is this not then overwhelming

evidence of common evolutionary ancestry?
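The "what are the odds" question can be made concrete with a small calculation. This is only a sketch under two loudly stated assumptions that are not in the sources quoted here: each lineage independently acquires exactly one deletion, and every one of the 165 positions is equally likely to be hit. The function name `prob_same_site` is hypothetical.

```python
from fractions import Fraction


def prob_same_site(n_sites, n_lineages):
    """Probability that independent single hits in n_lineages all land on
    the same site, assuming every site is equally likely (an assumption)."""
    # The first lineage may hit any site; each remaining lineage must
    # match that site, with probability 1/n_sites each.
    return Fraction(1, n_sites) ** (n_lineages - 1)


# 165 base pairs and four primates (human, chimpanzee, gorilla, orangutan).
p = prob_same_site(165, 4)
print(p)         # exact fraction
print(float(p))  # roughly 2.2e-07
```

Under those assumptions the odds are about 1 in 4.5 million. The whole question in what follows is whether the equal-likelihood assumption holds, since a mutational hotspot would make some positions far more likely than others.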

This would indeed seem to be the case at first approximation. However, in 2003, the

same Japanese group published the complete sequence of the guinea pig GLO

pseudogene, which is thought to have evolved independently, and compared it to that of

humans [Inai et al, 2003]. 21 Surprisingly, they reported many shared mutations

(deletions and substitutions) present in both humans and guinea pigs. Remember now

that humans and guinea pigs are thought to have diverged at the time of the common

ancestor with rodents. Therefore, a mutational difference between a guinea pig and a

rat should not be shared by humans with better than random odds. But, this was not

what was observed. Many mutational differences were shared by humans, including the

one at position 97.  According to Inai et al, this indicated some form of non-random bias

that was independent of common descent or evolutionary ancestry. The probability of

the same substitutions in both humans and guinea pigs occurring at the observed

number of positions was calculated, by Inai et al, to be 1.84 x 10^-12 - consistent with

mutational hotspots.  

 

 

 

 

What is interesting here is that the mutational hot spots found in guinea pigs and

humans exactly match the mutations that set humans and primates apart from the rat

(see figure below). 21,22  This particular feature has given rise to the obvious argument

that Inai et al got it wrong.  Reed Cartwright, a population geneticist, has noted a

methodological flaw in the Inai paper:

     "However, the sections quoted from Inai et al. (2003) suffer from a major

methodological error; they failed to consider that substitutions could have occurred in

the rat lineage after the splits from the other two. The researchers actually clustered

substitutions that are specific to the rat lineage with separate substitutions shared by

guinea pigs and humans. . . 

     If I performed the same analysis as Inai et al. (2003), I would conclude that there are

ten positions where humans and guinea pigs experienced separate substitutions of the

same nucleotide, otherwise known as shared, derived traits. These positions are 1, 22,

31, 58, 79, 81, 97, 100, 109, 157. However, most of these are shown to be substitutions

in the rat lineage when we look at larger samples of species.

     When we look at this larger data table, only one position of the ten, 81, stands out as

a possible case of a shared derived trait, one position, 97, is inconclusive, and the other

eight positions are more than likely shared ancestral sites. With this additional

phylogenetic information, I have shown that the "hot spots" Inai et al. (2003) found are

not well supported." (see Link) 

   

  

 

It does indeed seem that a number of the sequence differences noted by Cartwright

are fairly unique to the rat - especially when one includes several other species in the

comparison. However, I do have a question regarding this point.  It seems to me that

there simply are too many loci where the rat is the only odd sequence out in Exon X

(i.e., there are seven and arguably eight of these loci).  Given the published estimate on

mutation rates (Drake) of about 2 x 10^-10 per locus per generation, one should expect to

see only 1 or 2 mutations in the 164 nucleotide exon in question (Exon X) over the

course of the assumed time of some 30 Ma (million years).  Therefore, the argument of

the mutational differences being due to mutations in the rat lineage pre-supposes a

much greater mutation rate in the rat than in the guinea pig.  The same thing is true if

one compares the rat with the mouse (i.e., the rat's evident mutation rate is much higher

than that of the mouse).
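The "1 or 2 mutations" estimate above can be reproduced with a short calculation. This is a rough sketch, not a population-genetic model: it treats the quoted rate as applying to each nucleotide position, and it has to assume a number of generations per year, which the text does not state. The function name and the generations-per-year values are assumptions made for illustration.

```python
def expected_mutations(rate_per_site_per_gen, n_sites, years, gens_per_year):
    """Expected number of mutations in a stretch of DNA, assuming a constant
    rate per site per generation and no selection (a simple neutral sketch)."""
    generations = years * gens_per_year
    return rate_per_site_per_gen * n_sites * generations


RATE = 2e-10   # per site per generation, as quoted in the text
SITES = 164    # length of Exon X as quoted in the text
YEARS = 30e6   # 30 Ma

# gens_per_year is an assumption; 1 and 2 are tried here for comparison.
for gens_per_year in (1, 2):
    n = expected_mutations(RATE, SITES, YEARS, gens_per_year)
    print(gens_per_year, round(n, 3))
```

With one generation per year the expectation is about 1 mutation, and with two it is about 2, which matches the range given above. The result scales linearly with the assumed number of generations per year.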

This is especially interesting since many of the DNA mutations are synonymous (see

Link).  Why should essentially neutral mutations become fixed to a much greater extent

in the rat gene pool as compared to the other gene pools? Wouldn't this significant

mutation rate difference, by itself, seem to suggest a mutationally "hot" region - at least

in the rat?

Beyond this, several loci differences are not exclusive to the rat/mouse gene pools

and therefore suggest mutational hotspots beyond the general overall "hotness" or

propensity for mutations in this particular genetic sequence.

 

 

 

 

Some have noted that although the shared mutations may be the result of hotspots,

there are many more mutational differences between humans and rats/guinea pigs as

compared to apes.  Therefore, regardless of hotspots, humans and apes are clearly

more closely related than are humans and rats/guinea pigs. 

The problem with this argument is that the rate at which mutations occur is related to

the average generation time.  Those creatures that have a shorter generation time have

a correspondingly higher mutation rate over the same absolute period of time - like 100

years.  Therefore, it is only to be expected that those creatures with very long
generation times, like humans and apes, would have fewer mutational differences
relative to each other over the same period of time than those creatures with much

shorter generation times, like rats and guinea pigs.  

As an aside, many other genetic mutations that result in functional losses are known

to commonly affect the same genetic loci in the same or similar manner outside of

common descent.  For example, achondroplasia is a spontaneous mutation in humans
in about 85% of the cases. In humans achondroplasia is due to mutations in the FGFR3
gene. A remarkable observation on the FGFR3 gene is that the major part of the
mutations are introduced at the same spot (1138 G->A and, less often, 1138 G->C)

independent of common descent. The short legs of the Dachshund are also due to the

same mutation(s). The same allelic mutation has occurred in sheep as well.  

What is interesting about many of these mutational losses is that they often share

the same mutational changes.  It is at least reasonably plausible then that the GULO

mutation could also be the result of a similar genetic instability that is shared by similar

creatures (such as humans and the great apes).

Another interesting example of this phenomenon has been studied in detail in more

rapidly reproducing organisms, such as viruses.  For example, an interesting study was

published by Bull et al., on replicate lineages of the bacteriophage φX174.

Numerous mutations occurred in each genome during propagation. Across nine

separate lineages 119 independent substitutions occurred at 68 nucleotide sites. 

What is interesting here is that over half of these substitutions at 1/3 of the sites were

identical in the different lineages. Some convergent substitutions were specific to

specific hosts while others were shared between the two separate hosts.  Phylogenetic

reconstruction using the complete genome sequence not only failed to recover the

correct evolutionary history because of these convergent changes, but the true history

was rejected as being a significantly inferior fit to the data (see Link).
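The kind of tally described in this study (how many substitution events are repeated across independent lineages) can be sketched as follows. The lineage names and substitutions below are invented toy data, not the published φX174 results, and `convergence_summary` is a hypothetical helper written only for this illustration.

```python
from collections import Counter


def convergence_summary(lineages):
    """Count substitution events and those shared by two or more lineages.

    lineages maps a lineage name to a list of (site, new_base) tuples.
    Returns (total_events, convergent_events, sorted convergent changes).
    """
    # Count in how many lineages each specific change (site, base) occurs.
    counts = Counter(sub for subs in lineages.values() for sub in set(subs))
    total_events = sum(counts.values())
    convergent = {sub: n for sub, n in counts.items() if n > 1}
    convergent_events = sum(convergent.values())
    return total_events, convergent_events, sorted(convergent)


# Invented toy data for illustration only.
toy = {
    "line_A": [(10, "T"), (42, "G")],
    "line_B": [(10, "T"), (77, "A")],
    "line_C": [(10, "T"), (42, "G"), (90, "C")],
}

total, shared, changes = convergence_summary(toy)
print(total, shared, changes)
```

In this toy case 5 of the 7 substitution events are identical changes that arose in more than one lineage, even though the lineages are independent by construction.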

 

This same sort of thing is seen to a fairly significant degree in the GULO region. 

Many of the same significant mutations are shared between humans and guinea pigs. 

Consider the following illustration yet again:

 

 

Why would both humans and guinea pigs share major deletions of exons I, V and VI

as well as four stop codons if these mutations were truly random?  In addition to this, a

mutant group of Danish pigs has also been found to show a loss of GULO

functionality.  And, guess what, the key mutation in these pigs was a loss of a sizable

portion of exon VIII.  This loss also matches the loss of primate exon VIII.  In addition,

there is a frame shift in intron 8 which results in a loss of correct coding for exons 9-12. 

This also reflects a very similar loss in this region in primates (see Link).  That's quite a

few key similarities that were clearly not the result of common ancestry for the GULO

region.  This seems to be very good evidence that many if not all of the mutations of the

GULO region are indeed the result of similar genetic instabilities that are prone to

similar mutations - especially in similar animals.

 

Back to mutational hotspots, what makes hotspots so "hot"? Perhaps the answer

lies in the chemical nature of the hotspot region. The type of molecular bonds, their

stability or instability, or other molecular interactions may lend themselves to specific

nucleotide pair switches, especially given certain environmental changes. No one really

knows for sure except to say that mutational hot spots do exist. So, given that they do

exist, similar genes should be expected to function in similar ways and this includes

having similar mutational "hotspots" and/or "shared mistakes." 3 In any case, it is

interesting to note that there are no such examples of "shared errors" between

mammals and other groups of animals (although there are plenty of common "errors"

that are shared by widely divergent mammalian groups).

 

There are no examples of 'shared errors' that link mammals to other

branches of the genealogic tree of life on earth. . . Therefore, the

evolutionary relationships between distant branches on the evolutionary

genealogic tree must rest on other evidence besides 'shared errors.' 11

 

Of course the argument used to explain this fact is that mammals split off from other

groups of animals over 200 million years ago. Given this amount of time, random

mutations would have obliterated any trace of common genetic errors. 11 This is a very

good point. The question remains, however, as to why some identifiable genetic

errors are maintained as long as they are if they are in fact functionless?  Also,

"processed pseudogenes" are very similar to "movable genetic elements" which are

often transmitted from animal to animal by viruses.  Certain interspecies pseudogenes

of this type might in fact share a common ancestor while the various types of animals

themselves, that harbor certain of these genetic sequences, may not be related through

common descent so much as they are partially related through common infection.

In any case, there really are no "foolproof" genetic markers of common descent.  All

of the ones proposed so far to be foolproof have been shown to have significant flaws.

The prediction that pseudogenes, transposons (SINEs and LINEs) and other shared

mutational mistakes are conclusive evidence for common descent has not held up over

recent years. For example, consider the following excerpt from David Hillis' paper,

"SINEs of the perfect character," published in the Proceedings of the National

Academy of Sciences, 1999:

 

  What of the claim that the SINE/LINE insertion events are perfect

markers of evolution (i.e., they exhibit no homoplasy)?  Similar claims

have been made for other kinds of data in the past, and in every case

examples have been found to refute the claim.  For instance, DNA-DNA

hybridization data were once purported to be immune from convergence,

but many sources of convergence have been discovered for this

technique.  Structural rearrangements of genomes were thought to be

such complex events that convergence was highly unlikely, but now

several examples of convergence in genome rearrangements have been

discovered.  Even simple insertions and deletions within coding regions

have been considered to be unlikely to be homoplastic, but numerous

examples of convergence and parallelism of these events are now

known.  Although individual nucleotides and amino acids are widely

acknowledged to exhibit homoplasy, some authors have suggested that

widespread simultaneous convergence in many nucleotides is virtually

impossible. Nonetheless, examples of such convergence have been

demonstrated in experimental evolution studies. 10

 

 

A New Paradigm

 

       Obviously then, the old notion that pseudogenes and other forms of shared "junk"

DNA give clear evidence of common ancestry, rather than common functional need, will have

to be discarded.  Certainly if organisms share similar environments and have similar

morphologic appearances and needs, should one be surprised to find similar functional

genetic elements shared between such creatures?   Such sequences cannot be used to

clearly establish evolutionary trees and to estimate divergence times since such beneficial

sequences would be maintained over time via natural selection without any significant

changes.  The similarities and differences would not be based so much on evolutionary

changes over the time since a shared common ancestor as they would be the result of

similarities and differences in functional needs that have always been there, maintained

by the forces of natural selection, since these creatures came to be.

 

        No one knows yet just what the big picture of genetics will look like once this

hidden layer of information is made visible. "Indeed, what was damned as junk because

it was not understood may, in fact, turn out to be the very basis of human complexity,"

Mattick suggests. Pseudogenes, riboswitches and all the rest aside, there is a good

reason to suspect that is true. Active RNA, it is now coming out, helps to control the

large-scale structure of the chromosomes and some crucial chemical modifications to

them—an entirely different, epigenetic layer of information in the genome.16

       In fact, the most detailed probe yet into the workings of the human genome has led

scientists to conclude [as of June 14, 2007] that a cornerstone concept about the

chemical code for life is badly flawed.  Reporting in the British journal Nature and the

US journal Genome Research on Thursday [June 14, 2007], they suggest that an

established theory about the genome should be consigned to history.

        In between the genes and the sequences known to regulate their activity are long,

tedious stretches that appear to do nothing. The term for them is "junk" DNA, reflecting

the presumption that they are merely driftwood from our evolutionary past and have no

biological function. But the work by the ENCODE (ENCyclopaedia of DNA Elements)

consortium implies that this nuggets-and-dross concept of DNA should be, well, junked.

        The genome turns out to be a highly complex, interwoven machine with very few

inactive stretches, the researchers report. Genes, it transpires, are just one of many

types of DNA sequences that have a functional role. And "junk" DNA turns out to have

an essential role in regulating the protein-making business. Previously written off as

silent, it emerges as a singer with its own discreet voice, part of a vast, interacting

molecular choir. 

        "The majority of the genome is copied, or transcribed, into RNA, which is the active

molecule in our cells, relaying information from the archival DNA to the cellular

machinery," said Tim Hubbard of the Wellcome Trust Sanger Institute, a British

research group that was part of the team. "This is a remarkable finding, since most prior

research suggested only a fraction of the genome was transcribed."

        Francis Collins, director of the US National Human Genome Research Institute

(NHGRI), which corralled 35 scientific groups from around the world into the ENCODE

project, said the scientific community "will need to rethink some long-held views about

what genes are and what they do."17

           

 

The Human Genome in Numbers 26

 

1.5% of the genome translated into proteins

27% of the genome transcribed as part of protein-coding gene expression but not translated into proteins

25% of the genome that is transcribed but not translated, and is not associated with protein-coding genes

250 microRNAs currently identified (as of June 2005); ~1,000 as of 2007 ( Link )

10,000 protein-coding genes estimated to be regulated by microRNAs; each microRNA can target several genes, and a particular gene may be regulated by several microRNAs

98% of genomic output that is non-coding RNA

9% of genes that appear to have associated antisense transcripts

~20,000 "pseudogenes" in the genome
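
As a rough check on the figures above, the three transcribed fractions can be added up. The following is a minimal Python sketch; it assumes the three categories do not overlap (the list does not say so explicitly), and the variable and function names are illustrative only.

```python
# Percentages taken from the list above (reference 26).
# Assumption: the three categories do not overlap.
translated = 1.5               # translated into proteins
transcribed_with_genes = 27.0  # transcribed with protein-coding genes, not translated
transcribed_other = 25.0       # transcribed, not translated, not gene-associated


def transcribed_total(*fractions: float) -> float:
    """Sum non-overlapping genome fractions given in percent."""
    return sum(fractions)


total = transcribed_total(translated, transcribed_with_genes, transcribed_other)

# Share of the transcribed sequence that is never translated into protein.
noncoding_share = (transcribed_with_genes + transcribed_other) / total * 100

print(f"transcribed: {total:.1f}% of the genome")
print(f"non-coding share of transcribed sequence: {noncoding_share:.1f}%")
```

Under that assumption the transcribed total comes to a little over half of the genome, and the non-coding share of what is transcribed lands close to the 98% figure given in the list, although the list does not state how that figure was derived.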

 

 

        This is very interesting.  I mean, who would have thought that the majority of the

genome would be copied or transcribed into RNA? - and that it would in fact be

functional?  Only a few years ago the scientific community believed that less than 5% of

the genome was actually functional and the rest was non-functional evolutionary

remnants.  After all, "noncoding genomic regions account for 98% to 99% of the human

genome and consist of introns found within protein-coding transcripts and the intergenic

regions between them."25  Add to these numbers the very surprising finding that many

genetic sequences that do not produce either proteins or RNA are also being found to

be functional (see discussion of Pyknons).

       Who would have predicted this - besides creationists and intelligent design

theorists that is?  Creationists and intelligent design theorists have been claiming for

many years that the concept of "Junk DNA" (as well as vestigial structures) was not

entirely correct. I myself have been promoting this idea for over 11 years (as of June,

2008).  Yet, only now are mainstream scientists finally starting to realize the significant

errors in their long-cherished beliefs when it comes to the ill-conceived notion of junk

DNA - an idea which was based on ardently held evolutionary presuppositions that

blinded mainstream science and prevented them from searching out the hidden

treasures of so-called "junk DNA" for a fairly long time. 

       When are scientists going to start realizing that the creationist paradigm does

indeed have very good predictive scientific value when it comes to accurately

understanding and investigating the physical world and universe?

 

 

 

Pyknons

 

 

       To add to this, consider the fairly recent finding (2006) of "pyknons" by Rigoutsos et

al.24 Pyknons are variable-length patterns within DNA sequences that have identically

conserved copies and multiplicities above what is expected by chance. They are also not

transcribed into RNA (unlike miRNAs noted above) or translated into protein. Among the

millions of discovered patterns, Rigoutsos et al. found a subset of 127,998 patterns,

which they termed pyknons, that have additional nonoverlapping instances in the

untranslated and protein-coding regions of 30,675 transcripts from 20,059 human

genes. The pyknons arrange combinatorially in the untranslated and coding regions of

numerous human genes where they form mosaics. Consecutive instances of pyknons in

these regions show a strong bias in their relative placement, favoring distances of ~22

nucleotides.

       Pyknons are also very common in the human genome.  They form 1/6th of the

human intergenic and intronic regions for a total of 127,998 pyknons covering

898,424,004 DNA nucleotide positions on the forward and reverse strands of the human

genome. 
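
The coverage figures can be turned into a quick back-of-the-envelope calculation. The Python sketch below uses only the two numbers quoted above and the stated one-sixth fraction; the implied region size is derived from that fraction rather than reported by Rigoutsos et al., and the variable names are illustrative.

```python
# Figures quoted in the text above (reference 24).
PYKNON_PATTERNS = 127_998        # distinct pyknon patterns
COVERED_POSITIONS = 898_424_004  # nucleotide positions covered, both strands
STATED_FRACTION = 1 / 6          # share of intergenic and intronic regions

# Average number of covered positions per pattern, summed over all its copies.
avg_positions_per_pyknon = COVERED_POSITIONS / PYKNON_PATTERNS

# Size of the intergenic and intronic regions implied by the 1/6 figure
# (derived here, not a reported measurement).
implied_region_positions = COVERED_POSITIONS / STATED_FRACTION
implied_per_strand = implied_region_positions / 2

print(f"average positions per pyknon: {avg_positions_per_pyknon:,.0f}")
print(f"implied region size, both strands: {implied_region_positions:,.0f}")
print(f"implied region size, per strand: {implied_per_strand:,.0f}")
```

On these numbers each pattern accounts for roughly seven thousand positions on average, and the one-sixth figure implies about 2.7 billion intergenic and intronic positions per strand, which is of the same order as the roughly 3 billion base pairs of the human genome.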

       What is interesting here, of course, is that pyknons are associated with specific

biologic processes - i.e., they are functional. Cross-genome comparisons reveal that

many of the pyknons have instances in the 3' UTRs of genes from other vertebrates and

invertebrates where they are overrepresented in similar biological processes, as in the

human genome. This "unexpected finding" suggests, according to the authors, potential

unique functional connections between the coding and noncoding parts of the human

genome - such as a possible link with posttranscriptional gene silencing and RNA

interference. 

 

 

     "Human pyknons are also present in other genomes, where they associate with

similar biological processes.  Notably, >600 million nucleotides that are associated with

nongenic copies of pyknons in the human genome are absent from the mouse and rat

genomes. Interestingly, the human pyknons have many instances in the intergenic and

intronic regions of the phylogenetically distant worm and fruit fly genomes, covering ~1.6

million nucleotides in each."24

 

 

     Given that genetic sequences that are transcribed or translated or both seem to

account for the "majority" of the genome, and are thought to be functionally beneficial, it

is interesting that certain types of genetic sequences that are neither translated nor

transcribed are also being found to be functional.  Taken together, it seems like the

significant majority of the genome is indeed functional to at least some degree - well

over 50%, if not 85-90% or even higher.

 

 

 

The Key Human-Ape Differences

 

       It is becoming more and more clear that the key functional differences between

living things, like humans and apes, are not so much found in protein-coding genes, but

in the non-coding regions of DNA once thought to be functionless "junk-DNA" -

evolutionary remnants of past mistakes that are shared between various creatures. 

This notion is starting to be shed with more and more discoveries that show that many

of these same regions are not just functional, they carry the vast majority of the genetic

information.  The "genes" that were once thought to be so important for genetic

function are turning out to be equivalent to the most low-level basic building blocks

within the genome, like bricks and mortar.  Surprisingly, it is the non-coding regions of

DNA that control what is done with these building blocks - that determine what kind of

"house" to build so to speak.  The following article is very interesting in this regard:

 

 

     "Seventy-five percent of known human miRNAs [microRNAs] cloned in this study

were conserved in vertebrates and mammals, 14% were conserved in invertebrates,

10% were primate specific and 1% are human specific. The new miRNAs have a

different conservation distribution: more than half of the human miRNAs were

conserved only in primates, about 30% in mammals and 9% in nonmammalian

vertebrates or invertebrates; 8% were specific to humans. We saw a similar distribution

for the chimpanzee miRNAs.

     The different miRNA repertoire, as well as differences in expression levels of

conserved miRNAs, may contribute to gene expression differences observed in human

and chimpanzee brain. Although the physiological relevance of miRNAs expressed

at low levels remains to be shown, it is tempting to speculate that a pool of such

miRNAs may contribute to the diversity of developmental programs and cellular

processes . . . For example, miRNAs recently have been implicated in synaptic

development and in memory formation. As the species specific miRNAs described here

are expressed in the brain, which is the most complex tissue in the human body, with an

estimated 10,000 different cell types, these miRNAs could have a role in establishing or

maintaining cellular diversity and could thereby contribute to the differences in human

and chimpanzee brain ... function." 23

     

     Pseudogenes are also being found to have functionality similar to that of miRNAs. 

"Transcripts of processed pseudogenes can contain regions with significant antisense

homology, which may suggest a regulatory role for transcribed pseudogenes through an

RNAi-like mechanism" (see Link ).  Two recent studies have demonstrated that such

transcribed pseudogenes can regulate transcription of homologous protein-coding

genes. Transcription of a pseudogene in Lymnaea stagnalis that is homologous to the

nitric oxide synthase gene decreases the expression levels for the gene through

formation of an RNA duplex; this is thought to arise via a reverse-complement sequence

found at the 5′ end of the pseudogene transcript (Link). In a second example,

transcription of the makorin1-p1 TPΨg in mouse was required for the stability of the

mRNA from a homologous gene makorin1. This regulation was deduced to arise from

an element in the 5′ areas of both the gene and the pseudogene (Link).  More recently,

Weil et al. discovered that the murine FGFR-3 pseudogene is transcribed in fetal tissues

in an antisense direction. This prompted the following consideration:

 

     'As the regions of exact identity between FGFR-3 and its pseudogene can be up to

60 nt long, it may be envisioned that FGFR-3 transcripts could play a regulatory role in

FGFR-3 expression. If these antisense transcripts could hybridize to sense FGFR-3

transcripts inside the cells, this may lead to either rapid degradation or inhibition of

translation.' (Link)
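
The hybridization idea in this quotation can be illustrated with a small Python sketch. It finds the longest stretch of exact identity between two sequences and builds the antisense (reverse-complement) strand that would pair with it. The two sequences are short made-up strings, not real FGFR-3 data, the function names are hypothetical, and the DNA alphabet is used for simplicity even though the molecules discussed are RNA transcripts.

```python
def reverse_complement(seq: str) -> str:
    """Return the antisense strand of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def longest_shared_run(a: str, b: str) -> str:
    """Longest stretch of exact identity between a and b (dynamic programming)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


# Toy sequences, invented for illustration only.
gene = "ATGGCCATTGTAATGGGCCGCTG"
pseudogene = "TTGGCCATTGTAATGCACCGCTA"

shared = longest_shared_run(gene, pseudogene)
antisense = reverse_complement(shared)
print(f"shared run ({len(shared)} nt): {shared}")
print(f"antisense strand that could pair with it: {antisense}")
```

An antisense transcript covering such a shared run is complementary to the sense transcript over its whole length, which is the situation the authors describe for runs of up to 60 nt.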

 

     As Yao et al. predict, "Further studies on transcribed pseudogenes will add to our

understanding of their potential roles as non-coding RNA genes or other new types of

functional elements." (Link)  It seems like many transcribed pseudogenes may act as

giant miRNAs to regulate the function of protein-coding genes and other genetic

elements.

 

 

  Additional information dealing with this most interesting topic is listed in a fairly extensive essay by

Wade Schauer (used with permission).

1. Jacq C., Miller J.R., Brownlee G.G., A pseudogene structure in 5S DNA of Xenopus laevis, Cell 12:109-120. 1977.

2. Gibson L.J., Pseudogenes and Origins, Origins 21(2):91-108. 1994.

3. Menotti R.M., Starmer W.T., Sullivan D.T., Characterization of the structure and evolution of the Adh region of Drosophila hydei, Genetics 127:355-366. 1991.

4. Lalley P.A., Davisson M.T., Graves J.A.M., O’Brien S.J., Womack J.E., Roderick T.H., Creau-Goldberg N., Hillyard A.L., Doolittle D.P., Rogers J.A., Report of the committee on comparative mapping, Cytogenetics and Cell Genetics 51:503-532. 1989.

5. Long M., Langley C.H., Natural selection and the origin of jingwei, a chimeric processed functional gene in Drosophila, Science 260:91-95. 1993.

6. Jerlstrom, Pierre. 2000. Pseudogenes. Creation Ex Nihilo Technical Journal 14 (no. 3):15.

7. Woodmorappe, John. 2000. Are Pseudogenes 'Shared Mistakes' Between Primate Genomes? Creation Ex Nihilo Technical Journal 14 (no. 3):58-71.

8. Abate, Tom. 2001. Genome Discovery Shocks Scientists. San Francisco Chronicle (February 11).  

9. Cantrell, Michael A. and others. 2001. An Ancient Retrovirus-like Element Contains Hot Spots for SINE Insertion. Genetics 158:769-777.

10. Hillis, David M. 1999. SINEs of the perfect character. Proceedings of the National Academy of Sciences 96:9979-9981.

11. Max, Edward. Plagiarized Errors and Molecular Genetics. Creation/Evolution (XIX, p.34) 1986-2003. ( http://www.talkorigins.org/faqs/molgen/ )

12. Lee, Jeannie T., Complicity of the gene and pseudogene, Nature 423:26-28. 2003.

13. Hirotsune, Shinji et al., An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene, Nature 423:91-96. 2003.

14. Makalowski, Wojciech. 2003. Not Junk After All, Science 300:1246-1247.

15. Balakirev, Evgeniy S., Ayala, Francisco J., PSEUDOGENES: Are They "Junk" or Functional DNA? Annual Review of Genetics, Vol. 37, pp. 123-151, December 2003 ( http://arjournals.annualreviews.org/doi/abs/10.1146%2Fannurev.genet.37.040103.103949 )

16. Wyatt Gibbs, The Unseen Genome: Gems among the Junk, Scientific American, November 2003, pp 45-53 ( Link )

17. ENCODE Project Consortium et al., Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project, Nature 447, 799-816 (14 June 2007); Richard Ingham, Landmark study prompts rethink of genetic code, Yahoo News, accessed June 15, 2007 (Link1, Link2)

18. Nishikimi, M. and Yagi, K. (1991) Molecular basis for the deficiency in humans of gulonolactone oxidase, a key enzyme for ascorbic acid biosynthesis. Am. J. Clin. Nutr. 54(6 Suppl):1203S-1208S.

19. Nishikimi, M., Fukuyama, R., Minoshima, S., Shimizu, N. and Yagi. K. (1994) Cloning and chromosomal mapping of the human nonfunctional gene for L-gulono-gamma-lactone oxidase, the enzyme for L-ascorbic acid biosynthesis missing in man. J. Biol. Chem. 269:13685-13688.

20. Ohta, Y. and Nishikimi, M. (1999) Random nucleotide substitutions in primate nonfunctional gene for L-gulono-gamma-lactone oxidase, the missing enzyme in L-ascorbic acid biosynthesis. Biochim. Biophys. Acta. 1472:408-411.

21. Inai, Y., Ohta. Y., and Nishikimi, M. (2003) The whole structure of the human nonfunctional L-gulono-gamma-lactone oxidase gene--the gene responsible for scurvy--and the evolution of repetitive sequences thereon. J Nutr Sci Vitaminol (Tokyo) 49:315-319.

22. Peter Borger, Shared mutations: Common descent or common mechanism?, The Independent Research Institute on Origins, Accessed 8/10/07 ( Link )

23. Eugene Berezikov, Fritz Thuemmler, Linda W van Laake, Ivanela Kondova, Ronald Bontrop, Edwin Cuppen & Ronald H A Plasterk, "Diversity of microRNAs in human and chimpanzee brain", Nature Genetics, Vol 38 | Number 12 | December 2006 pp. 1375-1377. ( Link )

24. Isidore Rigoutsos, Tien Huynh, Kevin Miranda, Aristotelis Tsirigos, Alice McHardy, and Daniel Platt, Short blocks from the noncoding parts of the human genome have instances within nearly all known genes and relate to biological processes, PNAS | April 25, 2006 | vol. 103 | no. 17 | 6605-6610 ( Link )

25. Jill Cheng, Philipp Kapranov, Jorg Drenkow, Sujit Dike, Shane Brubaker, Sandeep Patel, Jeffrey Long, David Stern, Hari Tammana, Gregg Helt, Victor Sementchenko, Antonio Piccolboni, Stefan Bekiranov, Dione K. Bailey, Madhavan Ganesh, Srinka Ghosh, Ian Bell, Daniela S. Gerhard, Thomas R. Gingeras, Transcriptional Maps of 10 Human Chromosomes at 5-Nucleotide Resolution, Science 20 May 2005: Vol. 308. no. 5725, pp. 1149 - 1154 ( Link )

26. Richard Twyman, Small RNA: BIG NEWS, The Human Genome, January 2005 ( Link )  

  

 
