FUNCTIONAL CONSEQUENCES OF
CHROMOSOMAL REARRANGEMENTS IN
NEURODEVELOPMENTAL DISORDER
KAGISTIA HANA UTAMI
(M. Sc) University Medical Center Utrecht
A THESIS SUBMITTED
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF PAEDIATRICS
YONG LOO LIN SCHOOL OF MEDICINE
NATIONAL UNIVERSITY OF SINGAPORE
2014
DECLARATION
I hereby declare that this thesis is my original work and it has been written by me
in its entirety. I have duly acknowledged all the sources of information which
have been used in the thesis.
This thesis has also not been submitted for any degree in any university
previously.
___________________
Kagistia Hana Utami
23 August 2014
ACKNOWLEDGEMENT
I would like to start by acknowledging the people who made it possible for me to
pursue my PhD, my supervisors: Dr. Valere Cacheux, who has been generous in
devoting her time between busy schedules to guide me and has remained actively
involved in supervising me from miles away; my heartfelt thanks to Dr.
Sonia Davila, who has been extremely supportive during the course of my study
and gave an unlimited amount of time to guiding and supervising me, especially to
improve my scientific writing; and Dr. Stacey Tay Kiat Hong, for accepting me as a
student in her department and providing constructive ideas from a clinical point
of view.
This thesis would not have been possible without the collaborators: Dr. Robyn
Jamieson, Dr. Sylvain Briault, and Dr. Pierre Sarda, who provided patient
samples and assistance in manuscript writing.
I wish to express my sincere appreciation to the following people: Dr. Axel
Hillmer, for his helpful guidance in analyzing genome sequencing data, his
continuous support and his help with manuscript writing; Dr. Irene Aksoy, for her
patience in teaching me the basics of culturing embryonic stem cells for the first
time, and her constructive suggestions to develop my project; Dr. Larry Stanton,
for allowing me to use his cell culture lab space; Dr. Vladimir Korzh, for the hours
of discussion about neural crest cell biology and the access to the zebrafish facility;
Dr. Sinakaruppan Mathavan, who kindly assisted with the bioinformatics analysis
on the evolutionary conservation of candidate genes. I also thank my Thesis
Advisory Committee, Prof. Fu Xin Yuan and Dr. Bruno Reversade, for their
helpful suggestions.
My big thanks to my Indonesian friends in Biopolis: Lanny, Teddy, Astrid, and
Herty, who have given me continuous support throughout my entire journey. I
would also like to thank all the people that I have got to know during my time at
GIS: Seong Soo, Wei Yong, and Edward Chee, for making the quiet level 5
more enjoyable, and Sonia’s group lab members: Vikrant, Katrin, Clarabelle,
Lisa, Melissa and Zai Yang, for their help in one way or another.
I would like to thank the Agency for Science, Technology and Research
(A*STAR), which awarded me a Singapore International Graduate Award
(SINGA) scholarship, including conference support throughout my study. I am
very grateful for the opportunity.
My deepest gratitude goes to my parents for their endless encouragement and
for giving me the greatest love and support. This thesis is dedicated to you. Finally, I
would also especially like to thank Ryan for his patience and generous understanding.
Last but not least, my thanks go to the patients who donated their cells to the
studies that make up this thesis, and to all the fish!
TABLE OF CONTENTS
Declaration
Acknowledgement
Table of Contents
Summary
List of tables
List of figures
List of Abbreviations
Chapter 1: Introduction
1.1 Neurodevelopmental disorders overview
1.1.1 Developmental Delay (DD)/Intellectual disability (ID)
1.1.2 Language Delay (LD)
1.1.3 Speech Delay (SD)
1.1.4 Autism spectrum disorders (ASD)
1.2 Clinical evaluation of NDDs
1.3 Causes of NDDs
1.3.1 Environmental contributions of NDDs
1.3.2 Genetics of Neurodevelopmental Disorders
1.4 Genetic evaluation for NDDs
1.4.1 G-banding karyotyping
1.4.2 Fluorescence in situ hybridization (FISH)
1.4.3 Array comparative genomic hybridization (aCGH)
1.4.4 Next generation sequencing
1.5 Pathophysiology of NDDs
1.6 Overview of DNA Paired-End Tag (DNA-PET) sequencing
1.7 Thesis aims
Chapter 2: Materials and Methods
2.1 Patient samples and clinical information
2.1.1 Patient CD5
2.1.2 Patient CD10
2.1.3 Patient CD8
2.1.4 Patient CD9
2.1.5 Patient CD14
2.1.6 Patient CD6
2.1.7 Patient CD23
2.2 G-Banding karyotype
2.3 Fluorescence in situ hybridization (FISH)
2.4 Genomic DNA isolation
2.5 Array comparative genomic hybridization (aCGH)
2.6 DNA-PET
2.7 Post-sequencing analysis
2.8 Filtering of normal structural variation (SVs)
2.9 Functional analysis of regulatory regions
2.10 Validation of breakpoints by Sanger sequencing
2.11 Quantitative Real Time PCR (qPCR)
2.12 CNV analysis from published studies
2.13 Functional analysis by pluripotent stem cells
2.13.1 Cell lines used and maintenance
2.13.2 Induction of iPSC from fibroblast
2.13.3 Neural progenitor cells differentiation
2.13.4 Neuronal differentiation
2.13.5 pSUPER shRNA cloning and transfection
2.13.6 shRNA vector for GTDC1
2.13.7 EdU proliferation assay
2.13.8 Immunocytochemistry
2.13.9 Immunocytochemistry quantification analysis
2.13.10 Microarray
2.13.11 Gene enrichment analysis for microarray
2.14 Functional analysis by zebrafish
2.14.1 Fish lines and maintenance
2.14.2 Embryo preparation
2.14.3 RNA probe synthesis
2.14.4 Whole mount in situ hybridization
2.14.5 Morpholino microinjection
2.14.6 Human MED13L mRNA synthesis
2.14.7 Alkaline phosphatase staining
2.14.8 Alcian Blue staining
2.14.9 Image quantification analysis
2.14.10 qPCR analysis
Chapter 3: Results: Discovery of Candidate Genes for NDDs
3.1 Study background
3.2 Characterization of SVs by DNA-PET
3.3 Breakpoint characterization through detailed SVs analysis
Patient CD5
Patient CD10
Patient CD8
Patient CD9
Patient CD14
Patient CD23
Patient CD6
3.4 Secondary CNV screening in published studies and databases
Chapter 4: Results: Dissecting Functional Role of MED13L during Neurodevelopment and Neural crest cells (NCCs) Specification
4.1 Study background
4.2 Expression profile of MED13L orthologue in zebrafish
4.3 Loss of function of med13b in zebrafish embryo
4.4 Loss of med13b impaired craniofacial cartilage development
4.5 med13b suppression affects neurodevelopment in zebrafish embryo
4.6 MED13L knockdown in neural stem cells did not affect proliferation
4.7 MED13L knockdown did not affect neuronal maturation
4.8 Transcriptome profiling of MED13L-deficient neurons
Chapter 5: Results: Studying the role of GTDC1 during neurogenesis
5.1 Study background
5.2 Somatic cells reprogramming from patient’s fibroblasts
5.3 Phenotypic characterization of patient’s NPCs and GTDC1-deficient NPCs
5.4 Transcriptome profiling of patients and shGTDC1 cells
Chapter 6: Discussion
6.1 Clinically relevant gene disruptions in the chromosomal rearrangement breakpoints
6.2 Limitations of DNA-PET sequencing
6.3 Large phenotypic spectrum in patients with MED13L disruptions
6.4 MED13L haploinsufficiency contributes to craniofacial anomalies and ID
6.5 Morphological alterations in neurons derived from patient’s iPSCs and GTDC1-deficient cells
Chapter 7: Conclusion
Chapter 8: Future Directions
Appendix A: List of SVs per patients.
Appendix B: List of Publications.
SUMMARY
Neurodevelopmental Disorders (NDDs) are a heterogeneous group of conditions
characterized by impairments in cognition, communication and/or motor skills
resulting from abnormal development of the central nervous system (CNS). They
are usually diagnosed during infancy or childhood. NDDs occur in 1-3% of the
general population, and the diagnostic yield of currently available techniques
has been estimated at 15-25%. Although mutations in
more than 450 genes have been implicated in NDDs, the majority of affected
patients remain undiagnosed due to genetic and phenotypic heterogeneity.
Chromosomal rearrangements are known contributors to NDDs and have been
routinely detected by G-banding karyotyping and fluorescence in situ
hybridization, albeit at extremely low resolution.
The first aim of my study was to identify novel candidate genes in NDDs by
performing genome paired-end tag sequencing in patients with unexplained
NDDs carrying known chromosomal rearrangements. These analyses led to the
identification of several disrupted genes within the chromosomal breakpoint
regions, and one candidate gene from the private structural variants (SVs) of one
patient. In total, eight disrupted genes were identified in the breakpoint regions of
six patients: guanine nucleotide binding protein (G protein), q polypeptide (GNAQ),
RNA binding protein, fox-1 homolog (C. elegans) 3 (RBFOX3), unc-5 homolog D
(C. elegans) (UNC5D), X-linked inhibitor of apoptosis (XIAP), transmembrane
protein 47 (TMEM47), non-SMC condensin II complex, subunit G2 (NCAPG2),
glycosyltransferase-like domain containing 1 (GTDC1), and mediator complex
subunit 13-like (MED13L). Gene disruptions in four out of seven patients were
likely to explain the phenotypic features in these patients, and two candidate
genes (MED13L and GTDC1) were selected for further functional study.
The second part of the study focused on characterizing one candidate gene,
MED13L, and its potential involvement in the clinical manifestations seen in
Patient CD23. Overlapping mutations and variants encompassing MED13L have
been reported previously and are associated with a broad phenotypic spectrum
consistent with the clinical presentation of the patient described in this study.
Zebrafish studies showed that MED13L is required for cranial neural crest
migration, and its disruption in this animal model recapitulated the craniofacial
defects seen in patients. Transcriptomic analysis in neuronal cells lacking MED13L
showed significant gene expression changes in components of the Wnt and FGF
signaling pathways.
The third aim of the study was to functionally characterize the role of GTDC1
using neurons derived from the patient’s induced pluripotent stem cells (iPSCs).
Based on trio sequencing, GTDC1 was identified as the sole candidate gene
disrupted in a balanced translocation (Patient CD6). A slower proliferation rate of
progenitor cells and altered neuronal morphology were observed in both the
patient’s cells and GTDC1 knockdown cells, suggesting that GTDC1 may
contribute to the neurodevelopmental phenotype in the patient.
Taken together, this study highlights the clinical relevance of gene disruptions due
to chromosomal rearrangements, and provides novel insights into the functional
impact of individual gene disruptions in patients with NDDs.
LIST OF TABLES
Table 1. Frequency of chromosome abnormalities in patients with ID/DD based on G-banding karyotype analysis.
Table 2. Comparison of widely-used sequencing platform technology, adapted from Morozova et al., 2008(93).
Table 3. List of patients and participating hospitals included in this study.
Table 4. qPCR primers used to measure transcript level in the EBV-LCL for each patient and human tissue panel.
Table 5. shRNA sequences for cloning into pSUPER vectors.
Table 6. shRNA sequence for GTDC1.
Table 7. List of primary antibodies used in this study.
Table 8. PCR primer sequences to synthesize RNA in situ probes.
Table 9. List of qPCR primers for validation of microarray fold change.
Table 10. Summary of DNA-PET post-sequencing analysis.
Table 11. Summary of DNA-PET findings to identify SVs in nine individuals.
Table 12. List of SVs found on chromosome X to analyze complex rearrangements.
Table 13. Summary of the breakpoint analysis for each patient.
Table 14. CNV counts in cases and controls from published and public dataset.
Table 15. Clinical presentation of reported patients with structural variants or mutations affecting MED13L.
Table 16. qPCR validation of microarray-predicted changes in shMED13L NPCs, neurons and med13b MO.
LIST OF FIGURES
Figure 1. Recommended clinical evaluation scheme for early diagnosis of NDDs.
Figure 2. Illustration of cytogenetically visible chromosome rearrangements.
Figure 3. DNA-PET sequencing workflow.
Figure 4. Classification of SVs based on dPET mapping criteria.
Figure 5. Study design of the thesis.
Figure 6. Patients’ pedigree and their partial karyotypes indicating the rearrangements.
Figure 7. Pedigree of Patient CD5 family.
Figure 8. Validation of DNA-PET breakpoints by Sanger sequencing in Patient CD5.
Figure 9. qPCR analysis of GNAQ and RBFOX3 in human tissue panels.
Figure 10. Pedigree of Patient CD10 family.
Figure 11. Validation of DNA-PET breakpoints by Sanger sequencing.
Figure 12. qPCR analysis of UNC5D in human tissue panel.
Figure 13. FISH validation of DNA-PET predicted breakpoints.
Figure 14. qPCR analysis of TMEM47 in human tissue panel.
Figure 15. qPCR analysis of XIAP and TMEM47 in patient cells.
Figure 16. qPCR analysis of SH2D1A and ODZ1 in patient’s cells.
Figure 17. Telomeric deletion detected by cPET reads and aCGH.
Figure 18. qPCR analysis of NCAPG2 and MCPH1 in patient’s cells.
Figure 19. Validation of DNA-PET breakpoints in Patient CD23 by Sanger sequencing.
Figure 20. qPCR analysis of MED13L in patient’s lymphoblastoid cell lines.
Figure 21. qPCR analysis of MED13L in human tissue panel.
Figure 22. Venn diagram showing total SVs found in each individual after trio sequencing.
Figure 23. qPCR analysis of GTDC1 in patient’s lymphoblastoid cell line.
Figure 24. qPCR analysis of the expression of GTDC1 in human tissue panel.
Figure 25. Gene and Protein structure of MED13L.
Figure 26. Sequence conservation of MED13L across vertebrate species.
Figure 27. Expression profile of med13b in zebrafish embryo.
Figure 28. Knockdown of med13b in zebrafish embryo.
Figure 29. Eye size of morphants was smaller than controls and could be partially rescued by human MED13L mRNA.
Figure 30. Survival rate of med13b MO embryos compared to wild type (uninjected and control MO) and rescue embryo observed on 5 dpf.
Figure 31. Expression of NCCs markers by in situ hybridization.
Figure 32. Morpholino knockdown of med13b perturbed neuronal distribution across zebrafish brain.
Figure 33. Knockdown efficiency of shMED13L cells.
Figure 34. Immunocytochemistry of NPCs markers in shScrambled, shMED13L1 and shMED13L2 cells.
Figure 35. Proliferating cells in different cell cycle phases measured by EdU-incorporated cells and Ki67 staining.
Figure 36. Expression of early neuronal marker (TuJ1) and mature neuronal marker (MAP2) in shScrambled, shMED13L1 and shMED13L2 cells assessed by immunocytochemistry.
Figure 37. qPCR analysis of SP8 and FGFR3 in MED13L-knockdown cells.
Figure 38. Heatmap clustering of shMED13L1/2 neuronal cells.
Figure 39. Pluripotent markers expression in iPSCs clones.
Figure 40. Similar translocation profile is retained in patient’s iPSC clones.
Figure 41. Immunocytochemistry of NPCs markers (NESTIN, Ki67, and SOX2) in iPSCs and shGTDC1 cells.
Figure 42. Proliferation rate was measured by using flow-cytometry-based EdU-incorporation assay.
Figure 43. Glycosylation status is assessed by ICAM1 expression in NPCs.
Figure 44. Neuronal markers expressions in patient’s cells and quantification of neuronal morphologies.
Figure 45. Venn diagram of differentially expressed genes that are in common in NPCs and neurons between patient’s and shGTDC1 cells.
Figure 46. Heatmap clustering of the differentially expressed genes that are in common between patient’s and shGTDC1 NPCs.
LIST OF ABBREVIATIONS
aCGH Array Comparative Genomic Hybridization
ACMG American College of Medical Genetics
ADM-2 Aberration Detection Method-2
ASD Autism Spectrum Disorders
BAC Bacterial Artificial Chromosome
CCR Complex Chromosomal Rearrangements
CMA Chromosomal Microarray
CNV Copy Number Variation
cPET Concordant Paired-End Tag
dbSNPs Database of Single Nucleotide Polymorphisms
DD Developmental Delay
DGV Database of Genomic Variants
DNA-PET DNA Paired-End Tag
dPET Discordant Paired-End Tag
DSM Diagnostic and Statistical Manual of Mental Disorders
EBV-LCL Epstein-Barr Virus Lymphoblastoid Cell Lines
EdU 5-ethynyl-2´-deoxyuridine
EEG Electroencephalogram
FISH Fluorescence in situ hybridization
FRAXA Fragile X Test
fMRI Functional Magnetic Resonance Imaging
FXS Fragile X syndrome
GNAQ G-protein alpha subunit q
GTDC1 Glycosyltransferase-like Domain Containing 1
hESCs Human Embryonic Stem Cells
ID Intellectual Disability
ISCA International Standards for Cytogenomic Arrays
IQ Intelligence Quotient
iPSCs Induced Pluripotent Stem Cells
LD Language Delay
LMP Long Mate Paired
MED13L Mediator complex subunit 13-like
MCA Multiple Congenital Anomalies
MCPH1 Microcephalin 1
MO Morpholino
MRI Magnetic Resonance Imaging
NCAPG2 Non-SMC Condensin II Complex, subunit G2
NCC Neural Crest Cells
NDDs Neurodevelopmental Disorders
NGS Next-Generation Sequencing
NPCs Neural Progenitor Cells
PET Paired-End Tag
RBFOX3 RNA binding protein, fox-1 homolog (C. elegans) 3
SD Speech Delay
SNP Single Nucleotide Polymorphism
SNV Single Nucleotide Variations
SV Structural Variations
TMEM47 Transmembrane protein 47
UNC5D Unc-5 homolog D
WES Whole Exome Sequencing
WGS Whole Genome Sequencing
WISH Whole-mount In Situ Hybridization
XIAP X-linked inhibitor of apoptosis
CHAPTER 1: INTRODUCTION
1.1 NEURODEVELOPMENTAL DISORDERS OVERVIEW
Neurodevelopmental disorders (NDDs) are among the most common health
burdens in pediatric health care. Up to 3% of the general population is estimated
to have some form of NDD(1), and the prevalence tends to be higher in
developing countries with lower socioeconomic status and poor health care.(2)
NDD is an umbrella term for a heterogeneous group of conditions
characterized by impairments in cognition, communication, behavior and/or
motor skills resulting from abnormal brain development.(3, 4) There are no
curative pharmacological treatments for cognitive delay.(5) Thus, children with
NDDs usually undergo a variety of rehabilitative therapies and
early intervention strategies to optimize their developmental potential. NDDs can
be classified based on abnormalities in specific areas, such as intellectual
functioning, speech, language and fine motor skills, and may coexist with a known
syndrome. In some cases, minor dysmorphism (facial and other
superficial physical anomalies) or multiple congenital anomalies (MCA) may
accompany NDD symptoms. The most common clinical features observed in
NDD patients include intellectual disability (ID) or developmental delay (DD),
speech delay (SD), language delay (LD), and Autism Spectrum Disorder (ASD),
which are further described below:
1.1.1 DEVELOPMENTAL DELAY (DD)/INTELLECTUAL DISABILITY (ID)
According to the new criteria of the Diagnostic and Statistical Manual of
Mental Disorders (DSM)-V(6), developmental delay (DD) or intellectual
disability (ID) is characterized by an impairment of general mental
abilities that affects adaptive functioning in the conceptual domain
(language, reading, writing), the social domain and the practical domain
(organizing tasks). The term DD is used for younger children (less than 5
years of age), whereas ID is applied to older children, for whom
Intelligence Quotient (IQ) assessment is valid and reliable. Children with
DD usually present with significant delays in reaching developmental
milestones at the expected age.(7, 8) DD/ID is estimated to occur in 1-3 of
every 100 live births. ID is the newly recommended term to replace
‘Mental Retardation’ according to Rosa’s Law, and is documented in the
new International Classification of Diseases (11th revision).(9)
ID is graded by IQ into four degrees of severity: mild ID (IQ 50-70),
moderate ID (IQ 35-49), severe ID (IQ 20-34) and profound ID (IQ
below 20). DD/ID can appear as a distinct, isolated condition or coexist as
part of well-defined syndromes such as autistic disorder or X-linked ID
syndromes.
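The IQ bands listed above can be expressed as a simple lookup. The following
is a minimal illustrative sketch only, not a diagnostic tool: the function
name id_severity and the label returned for scores above 70 are assumptions
introduced here, and whole-number IQ scores are assumed.

```python
def id_severity(iq: int) -> str:
    """Map a whole-number IQ score to the ID severity band given in the text.

    Bands (from the text): mild 50-70, moderate 35-49, severe 20-34,
    profound below 20. The label for scores above 70 is an assumption.
    """
    if iq < 0:
        raise ValueError("IQ score cannot be negative")
    if iq < 20:
        return "profound"
    if iq <= 34:
        return "severe"
    if iq <= 49:
        return "moderate"
    if iq <= 70:
        return "mild"
    return "outside ID range"


if __name__ == "__main__":
    # Print the band for a few example scores, including band boundaries.
    for score in (15, 20, 34, 35, 49, 50, 70, 85):
        print(score, id_severity(score))
```

As noted in the text, such a grading applies only where IQ assessment is valid
and reliable, that is, in older children rather than those described as DD.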
1.1.2 LANGUAGE DELAY (LD)
Language encompasses the understanding, processing and production of
communication. Language delay (LD) occurs more frequently than ID in
the general population. Its prevalence is estimated at 5-8% in pre-school
children, and it may co-occur with other conditions such as autism
or cleft palate.(10) LD is typically recognized by difficulty with grammar,
words or vocabulary, units of word meaning, and the use of language,
particularly in social contexts.(11) LD is diagnosed by using the Early
Language Milestone scale, which focuses on expressive, receptive and visual
language.(12) Children diagnosed with LD have a higher risk of learning
disabilities, as they have difficulties with reading and written language,
which subsequently lead to academic underachievement and lower IQ
scores. The difference between LD and speech delay (SD) is that LD
pertains to both expressive and receptive delays, whereas SD is
specific to the speech mechanism alone.
1.1.3 SPEECH DELAY (SD)
Speech refers to the mechanics of oral communication, or the motor act of
communicating by articulating verbal expressions.(11) Early signs of SD
include stuttering or dysfluency, articulation problems and inability to
speak, with onset before 5 years of age.(11) However, not all
children develop linguistic skills at the same speed or to equivalent
proficiency.(13) Similar to LD, SD is a common childhood problem that
affects 3-10% of children. It can manifest together with other disorders
such as autism or intellectual disability, and is more frequently seen in boys.
1.1.4 AUTISM SPECTRUM DISORDERS (ASD)
According to a new classification in DSM-V, ASD is defined as a
behavioral disorder recognized in early childhood that shows selective
impairment mainly in social interaction, communication, language
development, and restricted or repetitive patterns of behavior
(stereotyped), which largely limit everyday functioning. ASD has recently
been reclassified as a collective presentation of Rett syndrome, Asperger
syndrome and Autistic disorder or Pervasive Developmental Disorder Not
Otherwise Specified (PDD-NOS) that were previously presented as
distinct subtypes of ASD in the DSM-IV manual. ASD affects about 1 in
110 individuals, with onset at around three years of age. ASD is highly
heritable compared to other types of NDDs, and its presentation
is highly variable between patients, with symptoms ranging from mild to severe
in terms of behavioral and IQ performance.
1.2 CLINICAL EVALUATION OF NDDS
Patients and their families may benefit from an established etiologic diagnosis,
which informs recurrence risk, treatment options and prevention strategies.
Generally, two clinical evaluations are required in the first year of life, yearly
evaluations until the early school years and a re-evaluation during puberty. In
clinical genetics, establishing a diagnosis usually requires a process of gathering
data from the visit history, repeated physical examinations and staged diagnostic
testing.(14, 15) In 1997, Curry and colleagues (14) described the recommended
‘gold standard’ for clinical evaluation of NDDs, which is summarized in Figure 1.
Figure 1. Recommended clinical evaluation scheme for early diagnosis of NDDs. The
patients are initially investigated from the prenatal, perinatal and postnatal history. The
family pedigree of three generations is important to determine possibility of inherited
disorders. Next, physical examination is important to monitor developmental milestones.
Based on the phenotypic data and patient history, different types of genetic diagnostic
tests are recommended for follow-up, such as Fragile X test for a suspected phenotype or
chromosomal karyotyping.
First, clinicians assess the prenatal and birth history records, as these are
important indicators of the likelihood of perinatal complications such as birth
trauma or asphyxia. Second, the family history, including a three-generation
pedigree, is required to examine possible transmission of neurological traits
running in the family, such as learning disabilities or psychiatric disorders. Then,
the child undergoes a complete physical examination, focusing on minor
anomalies such as dysmorphisms and on measurements of growth parameters and
head circumference, which are compared with normal developmental stages.
Assessment of facial features, with special attention to the inter-eye distance,
width of the nasal root, forehead size, and appearance of the nose, upper lip,
palate and jaw, usually provides clues for specific syndromic diagnoses, such as
Down syndrome. When a patient presents features that resemble a specific
diagnosis for which genetic testing is available, this analysis should be
performed first, for example the Fragile X syndrome test (FRAXA) or a Down
syndrome test. However, when a detailed
clinical history, physical examination and family history are not suggestive of a
specific disorder, unbiased genome-wide screening such as G-banding
karyotyping or chromosomal microarray should be considered as first-line
genetic testing of individuals with NDDs.(16)
1.3 CAUSES OF NDDS
NDDs can be caused by environmental insults, such as exposure to viral
infections, birth traumas, toxins or radiation, which mostly occur during the
prenatal period. Mounting evidence has shown that genetic factors play a major
role in NDDs, although the majority of cases (approximately 60%) have unknown
etiology.(17)
1.3.1 ENVIRONMENTAL CONTRIBUTIONS OF NDDS
Environmental insults occurring at different time points during fetal development
could interfere with normal brain development and may contribute to cognitive
impairments in NDDs. Understanding the environmental risk factors is
particularly important to identify potentially amenable factors for clinical
intervention. Formation of the CNS is a highly dynamic process that requires
orchestrated steps from early embryonic development to adult maturation,
and the developing brain is particularly vulnerable to environmental insults.
Previous studies have shown that environmental stressors, including
complications during the prenatal period and complications during delivery,
can influence normal neurodevelopment. Perinatal asphyxia, or lack of oxygen
intake in the newborn, is one of the most frequent causes of NDDs and may result
in increased mortality and morbidity, such as an increased risk of cognitive
impairment.(18)
Smoking, alcohol use, drugs and exposure to toxins during pregnancy may be
associated with birth defects. Prenatal exposure to toxins from drug abuse or
alcohol use has been shown to exert adverse effects on neurodevelopmental
outcome.(19, 20) In utero exposure to nicotine in animal models resulted in
behavioral and cognitive impairments, suggesting that prenatal exposure to
nicotine may perturb neurodevelopment.(21) Furthermore, epidemiological
studies have shown that maternal smoking is associated with slightly poorer
academic achievement, and increased symptoms of attention-deficit disorder and
hyperactivity in the offspring.(22)
Folate deficiency has also been associated with specific birth defects affecting
neurodevelopment, such as neural tube defects (NTDs).(23) Impaired folate
metabolism was originally observed in mothers of infants with NTDs in the 1960s,
and folate supplementation during pregnancy has substantially reduced the
occurrence and recurrence of NTDs.(24, 25) The nutritional status, physiological
condition and psychological state of pregnant mothers also appear to be associated
with the risk of NDDs in their newborns. A large-scale population
study showed that maternal metabolic conditions such as diabetes, hypertension
and obesity are associated with autism, DD or impairments in specific
domains of development in the offspring.(26, 27) Depression or anxiety during
pregnancy has also been shown to correlate with decreased IQ, learning and
memory deficits and delayed social development in the offspring.(28)
In addition, prenatal infections such as rubella, or even fever, have been linked to
an increased risk of developing neuropsychiatric disorders, including
schizophrenia and autism.(29) A large study involving 10,000 autism cases
found a significant association with maternal viral infection in the first
trimester.(30) Further evidence for maternal effects on NDD risk comes from
studies of short inter-pregnancy intervals between first- and second-born children.
Registry-based studies have shown a significant correlation
between NDD incidence and short inter-pregnancy intervals. A group in
California reported that closely spaced pregnancies were associated with an
increased risk of autistic disorder in the later-born child, with the largest increase
observed for pregnancies less than 1 year apart.(31) Subsequent studies in
Swedish, Danish and Norwegian populations showed a similar trend of increased
risk of developing NDDs such as autism or schizophrenia.(31-34) The explanation
proposed by these studies is a deficiency of essential micronutrients during
pregnancy, with short inter-pregnancy intervals being insufficient to restore
nutritional status after delivery.
Apart from maternal effects, the father’s age appears to be a risk factor for
NDD-associated diseases such as autism. A recent study found that sperm from
older fathers contain more DNA mutations than sperm from young men,
and that these mutations are commonly transmitted to their autistic offspring.(35)
Advanced maternal age did not seem to correlate with the risk of autism,
suggesting paternal age as an important determinant of autism incidence.(36)
Another group investigating the rate of mutations in older fathers provided further
evidence of a paternal age effect, with the number of de novo mutations
increasing with each additional year of paternal age.(35, 36)
1.3.2 GENETICS OF NEURODEVELOPMENTAL DISORDERS
Accumulating evidence has demonstrated that NDDs have a strong genetic
component, based on twin studies, family segregation studies, or spontaneous
genetic mutations.
In 1938, the British geneticist Lionel Penrose reported a detailed examination of
1280 patients with ID and their family members, conducted over a period of 7
years.(37) He observed that family members of ID-affected individuals were at
risk of developing NDDs, and that the risk fell with decreasing relatedness.(38) He
proposed de novo mutations in sporadic cases, multifactorial inheritance, and
incomplete penetrance (to account for non-Mendelian segregation of the
phenotype) as the major genetic causes of cognitive impairment. Recent studies
have shown that, in accordance with his theory, the majority of sporadic NDD
cases arise from spontaneous de novo mutations.(39-46) Despite the high
frequency of NDDs in the population, a large proportion of cases (~60%) have
unknown etiology, and thus the majority of them remain undiagnosed. Some forms
of syndromic NDDs account for a total of ~10%, including FXS (~1-2%), tuberous
sclerosis (~1%), Rett syndrome (~0.5%), and neurofibromatosis 1 (less than 1%).
Furthermore, other rare monogenic disorders with more subtle patterns of
malformation account for an even smaller proportion of NDD cases.
Numerous efforts have been made in large-scale population studies to identify
common genetic variants in several forms of NDDs, without success.(47, 48)
Interestingly, many studies have observed that mutations in different genes can
underlie seemingly identical forms of NDDs, and that mutations in the same
gene may result in different disease outcomes.(47) These observations led to an
emerging view that rare de novo mutations, rather than common variants, are more
likely to contribute to NDDs, which emphasizes the considerable genetic and
phenotypic heterogeneity of these disorders.(49) These rare variants typically
comprise different forms of genetic mutation with large effects, such as
chromosomal anomalies, copy number variants (CNVs), and single nucleotide
variants (SNVs), that can inactivate genes or alter gene dosage.(50)
Advances in technologies over the past two decades, such as microarray
comparative genomic hybridization (aCGH) and next-generation sequencing
(NGS), have identified more than 400 candidate genes associated with
NDDs. Some of these genes are involved in general physiological processes
required for normal development, including cell adhesion, gene transcription,
metabolism, synaptogenesis and chromatin remodeling, that converge on similar
neurodevelopmental pathways.(49) Altered dosage of genes involved in these
processes could disrupt neuronal networks and interfere with normal brain
development, leading to cognitive dysfunction.
1.4 GENETIC EVALUATION FOR NDDS
Since the early 1970s, patients with ID or DD, with or without dysmorphic features,
have been routinely assessed with conventional cytogenetic methods on
samples of peripheral blood as the first line of genetic evaluation. Cytogenetics
is a branch of genetics that studies the structure and function of
chromosomes and the DNA they carry, which makes up the genome.(51)
Conventional cytogenetic techniques such as G-banding karyotyping or FISH are
very informative and have allowed a better understanding of human diseases,
normal phenotypic variation and karyotypic evolution. Importantly, due to its
genome-wide coverage and rapid turnaround time, cytogenetic analysis has been
instrumental for rapid genetic evaluation of unexplained NDDs.
1.4.1 G-BANDING KARYOTYPING
G-banding karyotyping is a conventional cytogenetic method that relies on
harvesting chromosomes in mitosis by treating cells with tubulin inhibitors such
as colcemid, which depolymerize the mitotic spindle and arrest the cells in
metaphase. The chromosomes are stained with Giemsa
dye, and the process is therefore referred to as G-banding. This technique was
developed in the late 1960s. Its principle relies on identifying the
alternating light and dark staining bands that make up each chromosomal locus,
which allows identification of large chromosomal rearrangements at a resolution of
5-10 Mb.(16)
These structural chromosomal changes include an exchange of material between two
chromosomes (translocation), a chromosome segment reversed in
orientation (inversion), loss of a portion of a chromosome (deletion), and an
additional copy of a chromosomal region (duplication) (Figure 2). Based on a
large cytogenetic survey in 1991, the risk of serious congenital anomalies is
estimated to be 6.7% for individuals carrying de novo balanced chromosome
rearrangements (translocation and inversion).(46)
Figure 2. Illustration of cytogenetically visible chromosome rearrangements.
Translocation occurs when there is an exchange of genetic material between two
chromosomes, illustrated by chromosomes A and B. The rearranged chromosome is
referred to as a derivative chromosome (Der(A) or Der(B)). Duplication is depicted as an
additional copy of certain chromosomal region, resulting in a duplicated segment.
Deletion occurs when there is a loss of genetic material in the chromosome. Inversion
occurs when there is a chromosomal break within a chromosome that results in a reversed
orientation of the genetic material within the chromosomal break. Dotted white lines
represent the region of chromosomal break.
The presence of a balanced rearrangement could potentially explain the phenotype
when the rearrangement occurs de novo or segregates with disease within the
family. Typically, carriers of a balanced rearrangement retain one normal
chromosomal allele with normal expression and one derivative allele containing the
rearrangement breakpoint. The breakpoint may disrupt a gene,
leading to absent or altered gene dosage through gene truncation, inactivation,
gain of function or creation of chimeric fusion genes.
According to the recent guidelines of the American College of Medical Genetics
(ACMG), G-banding techniques are recommended as first-tier genetic testing
for specific groups of patients with clinically suspected chromosome aneuploidy,
such as Down, Turner and Klinefelter syndromes, or a family history suggestive
of chromosomal rearrangements.(16) This is based on earlier observations that
a sizeable proportion of NDD cases (in the range of 4-28.4%) are attributable to
chromosome abnormalities, including trisomy, subtelomeric rearrangements and
balanced chromosomal rearrangements (Table 1).(14, 15) Furthermore,
subtelomeric chromosome rearrangements have been found in 6% of patients with
idiopathic severe ID.(52, 53)
Study                        # of patients   Type of NDDs   Country of study   Frequency (%)
Bourgeois and Benezech(54)   600             ID             France             9
Kodama(55)                   197             Severe ID      Japan              4
Rasmussen(56)                1,905           ID             Denmark            18.8
Wuu(57)                      470             ID             Taiwan             8.1
Gustavson(58)                171             Mild ID        Sweden             11.9
Srsen(59)                    324             ID             Czech              28.4
Wuu(60)                      1,323           ID             Taiwan             7.87
Schwartz(61)                 350             ID/DD          U.S.               11.9
Table 1. Frequency of chromosome abnormalities in patients with ID/DD based on
G-banding karyotype analysis. These studies assess the presence of chromosome
anomalies (including trisomies) among individuals with neurological deficits.
The diagnostic yield of routine G-banding karyotyping is approximately 3.7%.
A meta-analysis of 33 studies by the International Standard Cytogenetic Array
Consortium estimated that balanced translocations are identified in ~0.3% of
individuals with ID who were tested with G-banding karyotyping.(62) However,
G-banding techniques are limited to the detection of microscopically visible
chromosomal aberrations (megabases in size), and the breakpoint cannot
be precisely delineated without further validation by ‘chromosome walking’,
in which probes surrounding the breakpoint are hybridized in fluorescence in
situ hybridization (FISH) analysis.
1.4.2 FLUORESCENCE IN SITU HYBRIDIZATION (FISH)
G-banding karyotyping provides an unbiased view of whole chromosomes,
which is useful for genetic testing of individuals with unknown cause and no
family history. However, for patients with phenotypes suggestive of a specific
disorder, such as a trisomy or a subtelomeric or microdeletion/duplication
syndrome, a focused FISH analysis is a useful step to investigate that
syndrome. When FISH analysis reveals an abnormality, FISH should be
performed on both parents to identify a carrier parent. FISH analysis involves
hybridization of fluorescently labeled polymorphic marker probes such as
Bacterial Artificial Chromosomes (BACs) or fosmids to the denatured DNA of
metaphase chromosomes or interphase nuclei. FISH can detect submicroscopic
aberrations of less than 5 Mb, and the resolution depends on the size of the probes
used. For example, BACs provide a resolution of 150-200 kb, whereas cosmid
probes provide a resolution of 30-40 kb. FISH analysis has enabled identification
of many disease genes associated with congenital anomalies at the chromosomal
breakpoints, such as dystrophin in Duchenne Muscular Dystrophy (DMD),(63)
DISC1 in schizophrenia,(64, 65) and ATP7A in Menkes disease, as well as
subtelomeric deletion syndromes.(66, 67) The diagnostic yield of FISH analysis
in patients with ID/DD is approximately 6.8%. However, FISH can only interrogate
known regions; it can therefore only be used when the phenotype is suggestive of
a particular disorder or when there is prior knowledge of the genomic region to be
investigated.
1.4.3 ARRAY COMPARATIVE GENOMIC HYBRIDIZATION (ACGH)
aCGH was first developed by Daniel Pinkel in 1998,(68) and the method has
continued to improve over the last ten years. aCGH has higher resolution and
sensitivity for detecting CNVs such as deletions and duplications that were
previously difficult to detect by G-banding and FISH analyses. The principle of
aCGH relies on cloned BACs or synthesized oligonucleotide DNA fragments
covering chromosomal loci across the genome, which are spotted on the array
chip. Copy numbers are determined from the differences in hybridization
intensity between two differentially labeled DNA samples (patient and reference
DNA).
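As a rough illustration of this principle, the sketch below converts paired patient and reference intensities into log2 ratios and labels each probe. The function names and the thresholds of +/-0.3 are assumptions chosen for illustration only; real aCGH pipelines normalize the signal and segment neighbouring probes before calling copy number.

```python
import math


def log2_ratio(patient_intensity: float, reference_intensity: float) -> float:
    """Return log2(patient / reference) for one array probe."""
    if patient_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(patient_intensity / reference_intensity)


def call_probe(ratio: float,
               gain_threshold: float = 0.3,    # hypothetical cut-off
               loss_threshold: float = -0.3):  # hypothetical cut-off
    """Label a probe as 'gain', 'loss' or 'normal' from its log2 ratio."""
    if ratio >= gain_threshold:
        return "gain"
    if ratio <= loss_threshold:
        return "loss"
    return "normal"


if __name__ == "__main__":
    # In a diploid genome, 1 copy vs 2 gives log2(1/2) = -1,
    # and 3 copies vs 2 gives log2(3/2), roughly 0.58.
    for patient, reference in ((1.0, 2.0), (2.0, 2.0), (3.0, 2.0)):
        ratio = log2_ratio(patient, reference)
        print(patient, reference, round(ratio, 2), call_probe(ratio))
```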
CNVs are defined as alterations of copy number at a genomic locus that are
greater than 1 kb in size. The resolution of aCGH varies depending on the array used:
1 Mb resolution BAC arrays, tiling resolution BAC arrays, or 100k SNP arrays.
The presence of CNVs could have functional consequences through diverse
mechanisms, including dosage changes, gene interruption, generation of fusion
genes, position effects that alter regulatory sequences of genes near the
breakpoint, and unmasking of a recessive allele.(69) These mechanisms could
potentially affect genes that have a role in disease, and thus may pinpoint a
disease-associated locus. However, it has been shown that large-scale CNVs are
in fact scattered randomly throughout the genome of healthy individuals,
covering 360 Mb (12% of the genome).(70, 71) These studies suggested that such
CNVs are unlikely to be disease-related, and could be a source of genetic
variation between individuals.(70, 71) Based on high-resolution genomic
microarray studies, it has been estimated that on average every individual
carries ~1000 CNVs that range in size from 500 bp to 1.2 Mb.(71) Following the
discovery of this genome-wide prevalence of large-scale CNVs in healthy
individuals, the Database of Genomic Variants (DGV;
http://dgv.tcag.ca/dgv/app/home) was launched in 2008 as a resource that
extensively catalogs and maps the precise location of CNVs and inversions from
hundreds of disease-free individuals. DGV is an invaluable resource for clinical
aCGH applications, as it can be used to filter out
CNVs that are present in normal individuals and so help to define candidate
disease susceptibility loci. Following the recent ACMG guidelines, aCGH has
replaced G-banding karyotyping as the first-line genetic test for individuals
with DD/ID without a specific diagnosis, and has greatly improved the diagnostic
yield.
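The filtering step described above can be sketched as follows. This is an illustrative outline only: the `CNV` record, the 50% reciprocal-overlap threshold and the toy coordinates are assumptions, and the code does not query DGV itself; a catalog of control CNVs is simply passed in as a list.

```python
from typing import List, NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int   # start coordinate (inclusive)
    end: int     # end coordinate (exclusive)
    kind: str    # "deletion" or "duplication"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Smallest fraction of either CNV that is covered by the other."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def filter_common_cnvs(patient: List[CNV],
                       controls: List[CNV],
                       min_overlap: float = 0.5) -> List[CNV]:
    """Keep patient CNVs that do not match any CNV seen in healthy controls."""
    return [
        cnv for cnv in patient
        if not any(reciprocal_overlap(cnv, c) >= min_overlap for c in controls)
    ]


if __name__ == "__main__":
    # Toy coordinates, invented for the example.
    patient_cnvs = [CNV("chr1", 1_000, 2_000, "deletion"),
                    CNV("chr2", 5_000, 9_000, "duplication")]
    control_cnvs = [CNV("chr1", 1_100, 2_100, "deletion")]
    print(filter_common_cnvs(patient_cnvs, control_cnvs))
```

Requiring reciprocal overlap, rather than any overlap, avoids discarding a large patient CNV merely because a small benign CNV lies inside it.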
aCGH is a powerful approach for identifying CNVs associated with NDDs, and many
studies have reported the identification of recurrent microdeletions or
microduplications associated with specific clinical features.(72-75) CNVs
are considered pathogenic when they are large (>500 kb),
overlap with known microdeletion/duplication syndromes, or encompass genes
with known phenotypes.(76) In addition, recent studies suggested that rare and de
novo CNVs are clinically relevant and might be responsible for
15-20% of NDD cases.(72, 77, 78) The diagnostic yield of aCGH in NDDs
was approximately 15%, about four-fold higher than that of karyotyping.(74)
Recurrent microdeletions or microduplications identified in patients with nearly
identical phenotypes have allowed clinical geneticists to classify these patients
into locus-specific syndromes.(79) This novel classification has greatly improved
the diagnostic outcome in certain groups of patients. However, it is extremely
challenging to identify the causative gene within the affected region for follow-up
functional studies, because these regions may comprise multiple genes.
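The three criteria quoted above for treating a CNV as pathogenic (size above 500 kb, overlap with a known microdeletion/duplication syndrome, or inclusion of a gene with a known phenotype) can be combined into a simple flag. The sketch below is a hypothetical illustration: the `CandidateCNV` record and its field names are assumptions, and in practice the two boolean fields would come from curated annotation rather than being set by hand.

```python
from dataclasses import dataclass


@dataclass
class CandidateCNV:
    size_bp: int
    overlaps_known_syndrome: bool   # assumed to come from prior annotation
    contains_phenotype_gene: bool   # assumed to come from prior annotation


def meets_pathogenic_criteria(cnv: CandidateCNV,
                              size_threshold_bp: int = 500_000) -> bool:
    """True if the CNV satisfies at least one of the three criteria."""
    return (cnv.size_bp > size_threshold_bp
            or cnv.overlaps_known_syndrome
            or cnv.contains_phenotype_gene)


if __name__ == "__main__":
    small_benign = CandidateCNV(40_000, False, False)
    large_novel = CandidateCNV(1_700_000, False, False)
    small_known = CandidateCNV(120_000, True, False)
    for cnv in (small_benign, large_novel, small_known):
        print(cnv, meets_pathogenic_criteria(cnv))
```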
As an example, the 17q21.31 microdeletion syndrome (also referred to as
Koolen-De Vries syndrome) was the first microdeletion syndrome identified through
aCGH in patients with DD/ID.(79) Subsequent studies(72, 80) reported additional
patients with 17q21.31 deletions who showed a phenotype nearly identical to that
in the first study, including ID, hypotonia, characteristic facial features,
epilepsy, and heart and kidney defects. Another clinically defined microdeletion
syndrome maps to the 15q24 locus. The first report described four individuals
with idiopathic ID, microcephaly, digit abnormalities, genital abnormalities,
hypospadias, and facial dysmorphism, all of whom had a common deletion in the
15q24 region. Further microdeletions in the same region, ranging from 1.7 to
3.9 Mb in size, were reported with nearly identical phenotypes, which led to a
consistent and well-recognized clinical diagnosis.(81, 82) However, a number of
genomic loci have recently been identified with variable inheritance and
penetrance, which complicates clinical interpretation, such as CNVs at loci
1q21.1,(83, 84) 15q13.3,(85, 86) and 16p13.1.(87, 88)
In terms of the molecular characteristics of CNVs, recent aCGH studies performed
in large cohorts of NDD patients have provided insight into the molecular
signatures of CNVs in these individuals. Baptista et al. compared phenotypically
normal and abnormal carriers of translocations and found that translocations in
the diseased cohort were more likely to be associated with cryptic genomic
imbalances at the breakpoint regions.(89) Girirajan et al. postulated that a
second large CNV in NDD patients, in addition to an existing
microdeletion or microduplication syndrome, causes a more severe clinical
phenotype through the combined effect of rare variants.(90) These studies suggest
that the co-occurrence of rare CNVs with existing variants contributes to the
severity of cognitive impairment in NDDs. Despite its improved
resolution, aCGH is unable to detect copy-neutral rearrangements or complex
intra-chromosomal aberrations.
1.4.4 NEXT GENERATION SEQUENCING
The first draft of the human genome sequence was announced in 2001, and in
total the Human Genome Project (HGP) took about 13 years to sequence 3 billion
base pairs by Sanger sequencing. The method involved creating massive
libraries of sheared DNA fragments inserted into large vectors such as BACs,
sequencing each fragment and assembling the fragments based on sequence
overlaps. Although Sanger sequencing is highly accurate and produces long
reads, it is not a preferred method for sequencing human genomes on a large scale
due to its high cost and labor intensiveness. Recent advances in next-generation
sequencing (NGS) technology have revolutionized gene discovery and the study
of human genetic variation in the past few years. In contrast to
Sanger sequencing, which typically produces a single long read (800 bp - 1 kb),
NGS generates millions of short reads (starting from 35 bp) using reversible
sequencing chemistries. NGS technologies have substantially reduced the time
and cost required for genome-wide screening, and have also increased resolution
compared to conventional methods.
The era of NGS technology began in 2004 when 454 pyrosequencing was
developed by Roche Applied Science, allowing thousands of sequencing reactions
to be carried out in parallel in high-density picoliter reactors, massively
increasing the sequencing throughput.(91) In 2006, Illumina launched the Solexa
Genome Analyzer (GA), which performs reversible terminator sequencing by
attaching single-stranded DNA fragments to a single-molecule array or flow cell,
allowing solid-phase bridge amplification of single-molecule DNA templates.
In 2007, Applied Biosystems introduced the ‘supported oligonucleotide
ligation and detection’ (SOLiD) system for massively parallel sequencing by
hybridization-ligation. Earlier studies employing NGS technologies to catalog
genome variation mainly utilized the SOLiD platform.(92)
In early 2010, Illumina released the HiSeq 2000, which adopts a sequencing
strategy similar to the Illumina/Solexa GA system, is able to produce up to
600 Gb in 7 days, and is more cost-effective per million bases than the SOLiD or
454 systems. The HiSeq 2000 also allows multiplexing of samples by barcoding
with primers and adapters, to process hundreds of samples in parallel. These NGS
platforms differ greatly in read length, output data capacity per run and the
time needed for each run. A comparison of the different sequencing platforms is
given in Table 2.
Platform technology   Sequencing mechanism                                  Read length (bp)   Output data per run   Cost per million bases (USD)
Sanger ABI3730xl      Synthesis in the presence of dye terminators          400-900            0.7 Gb                2400
454 Roche             Pyrosequencing on solid support                       700                600 Gb                10
Illumina HiSeq 2000   Sequencing by synthesis with reversible terminators   50-100             120 Gb                0.07
ABI/SOLiD             Massively parallel sequencing by ligation             35-50              90 Kb                 0.13
Table 2. Comparison of widely used sequencing platform technologies, adapted from
Morozova et al., 2008.(93) Different sequencing machines have their own advantages
and disadvantages in terms of read length, error rate, output data for each run and the cost
required for each million bases.
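To put the cost column of Table 2 in perspective, the sketch below multiplies each platform's cost per million bases by the size of a human genome. The 3 billion bp genome size is the figure quoted earlier in the text; the single-pass (1x) coverage used by default is an assumption, so the cost of a usable genome would be several-fold higher, and the helper name `cost_for_genome` is hypothetical.

```python
GENOME_SIZE_BP = 3_000_000_000  # figure quoted in the text

# Cost per million bases (USD), copied from Table 2.
COST_PER_MILLION_BASES_USD = {
    "Sanger ABI3730xl": 2400.0,
    "454 Roche": 10.0,
    "Illumina HiSeq 2000": 0.07,
    "ABI/SOLiD": 0.13,
}


def cost_for_genome(cost_per_million_bases: float,
                    coverage: float = 1.0,
                    genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Sequencing cost in USD for the given fold coverage of one genome."""
    millions_of_bases = genome_size_bp / 1_000_000
    return cost_per_million_bases * millions_of_bases * coverage


if __name__ == "__main__":
    for platform, unit_cost in COST_PER_MILLION_BASES_USD.items():
        print(f"{platform}: {cost_for_genome(unit_cost):,.0f} USD at 1x coverage")
```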
Depending on the platform, sequencing method and application,
NGS techniques are able to identify structural variations
(SVs), single nucleotide variations (SNVs), or both. SVs are typically defined as
variations in a DNA region greater than 1 kb and include structural changes such
as translocations, inversions, insertions/deletions (indels) and CNVs. Similar to
CNVs, SVs can lead to diverse mechanistic outcomes, including altered gene
dosage, positional regulatory change, chimeric fusion gene formation and
reorganization of functional elements.(94) SNVs are single nucleotide changes in a
DNA region that may alter protein-coding sequence, regulatory regions or
splicing. For SV identification, mate-paired library genome sequencing can
reveal the presence of SVs through discordantly mapped
sequencing reads.
Mate-paired libraries are generated by ligating the two ends of DNA fragments
of a defined size (1.5-5 kb), so that the resulting paired-end reads are
separated by that distance in the genome; the approach is thus
referred to as long-insert paired-end sequencing. Mate-paired analysis relies on
the presence of both reads and the expected distance between them when the
reads are mapped to the reference genome. Paired reads that do not map in the
correct orientation, at the correct position in the chromosome or at the correct
distance from each other are
informative for SV annotation, including CNVs and balanced rearrangements
(translocation, inversion and complex copy-neutral rearrangement).(92, 95) Because
the sequencing output gives the detailed position of each breakpoint, this
approach can substantially reduce the time needed for positional cloning of
candidate genes or for mapping the breakpoints of balanced rearrangements by
BAC-probe-oriented chromosome walking with FISH.
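The three signals described above (position, orientation and distance) can be illustrated with a small classifier for mapped read pairs. This is a simplified sketch, not the pipeline used in this thesis: the `ReadPair` record and the category names are hypothetical, the 1.5-5 kb window is taken from the insert size quoted in the text, and the expected orientation (leftmost read on the forward strand, rightmost on the reverse strand) is an assumption that depends on the library protocol.

```python
from typing import NamedTuple


class ReadPair(NamedTuple):
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair: ReadPair,
                  min_insert: int = 1_500,
                  max_insert: int = 5_000) -> str:
    """Label a mapped read pair as concordant or by its type of discordance."""
    if pair.chrom1 != pair.chrom2:
        return "different_chromosomes"  # candidate translocation
    # Order the two reads by genomic position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if (left_strand, right_strand) != ("+", "-"):
        return "wrong_orientation"  # candidate inversion or tandem duplication
    span = right_pos - left_pos
    if span > max_insert:
        return "span_too_large"  # candidate deletion
    if span < min_insert:
        return "span_too_small"  # candidate insertion
    return "concordant"


if __name__ == "__main__":
    pairs = [
        ReadPair("chr1", 1_000, "+", "chr1", 4_000, "-"),
        ReadPair("chr1", 1_000, "+", "chr5", 4_000, "-"),
        ReadPair("chr1", 1_000, "+", "chr1", 4_000, "+"),
        ReadPair("chr1", 1_000, "+", "chr1", 60_000, "-"),
    ]
    for p in pairs:
        print(p, classify_pair(p))
```

A single discordant pair is weak evidence; SV callers generally require a cluster of pairs supporting the same breakpoint before reporting a rearrangement.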
Following this application, many studies have employed this technique to map the
breakpoints of disease-associated balanced rearrangements in NDD patients by
various methods such as chromosome sorting,(96) targeted capture of
breakpoints,(97) or large-insert jumping libraries.(98, 99) Talkowski et al.
reported 33 gene-disruptive loci in 38 NDD patients carrying balanced
rearrangements, and determined the possible correlation of these genes with the
phenotypes.(98) Cuppen et al. described a detailed analysis of the genetic
architecture of complex chromosomal rearrangements (CCR) involving more than
3 breakpoints through paired-end sequencing, providing mechanistic insight into
the reorganization of shattered chromosomes.(100) Recently, the Genome
Institute of Singapore has developed a large-insert genome paired-end tag
sequencing method, referred to as DNA-PET, which is the primary method used in
this thesis, and it will be described in section 1.4. However, paired-end tag
sequencing techniques are limited to the detection of SVs and are unable to
detect small variants such as SNVs.
Alternatively, whole genome sequencing (WGS) can detect both SVs and SNVs
in a high-throughput manner. In principle, WGS employs a paired-end
approach, sequencing short reads from both the forward and reverse ends of each
fragment, and reads can be aligned by overlapping sequence data. A recent study
utilized WGS in a large cohort of patients with unexplained severe ID and their
parents, which resulted in an improved diagnostic yield of 42-62%.(101)
This is particularly beneficial for the diagnosis of a genetically heterogeneous
condition such as ID, for which WGS can provide unbiased genome-wide
screening at very high resolution. However, WGS generates a large
amount of data, requires labor-intensive and time-consuming
bioinformatics analysis, and its cost per genome is still relatively
high compared to paired-end tag sequencing. Furthermore, the
analysis and interpretation of non-coding variants are very challenging, although
the technology is still improving and annotation of gene desert regions is likely
to improve in the coming years through the ENCODE project.
To overcome this limitation, targeted enrichment technologies provide
opportunities to sequence only protein-coding genes or specific sets of genes.
Theoretically, majority of disease causing mutations reported so far occurs in the
coding regions, which account for <1% of the genome. Thus, by sequencing only
24
a minor fraction of the genome, whole exome sequencing (WES) requires shorter
time to generate libraries, to produce sequence output, less amount of sequence
data for high-throughput data analysis, and is less expensive compared to WGS.
WES technologies are able to detect SNVs and CNVs based on the sequencing
read depth. Over the past few years, experimental and analytical approaches
related to WES have established a standardized scheme for disease gene
discovery in Mendelian disorders. Typically, WES identifies ~20,000-24,000
variants in every individual, and more than 95% of these variants are known
polymorphisms in the human population, as documented by the Database of Single
Nucleotide Polymorphisms (dbSNP) or the 1000 Genomes Project. Filtering out
these polymorphic variants is likely to reveal rare variants that might have
an impact on disease phenotypes. Owing to the low cost of WES, parental samples
are usually included in the analysis to identify rare de novo variants, so that
germline variants inherited from the parents can be excluded. Recent studies
have reported a significant excess of de novo SNVs in NDD patients with severe
ID or autism. One large-scale WES trio analysis of 100 patients with severe ID
identified de novo SNVs that could explain the phenotype in ~50% of the
cases.(102) This analysis produced a diagnostic yield of ~16% based on de novo
mutations in known and novel genes. A subsequent trio-exome sequencing study in
autism families showed an excess of de novo mutations in brain-expressed genes
that might be associated with autism and carry large effects.(103)
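The trio filtering logic described above can be illustrated with a minimal
sketch. It assumes that variants have already been called and are represented
as simple (chromosome, position, reference, alternate) records; the names
Variant and candidate_de_novo, and the example data, are hypothetical and do
not correspond to any specific pipeline used in the cited studies.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def candidate_de_novo(child, mother, father, known_polymorphisms):
    """Return child variants absent from both parents and from the known set."""
    inherited = set(mother) | set(father)
    return sorted(
        v for v in set(child)
        if v not in inherited and v not in known_polymorphisms
    )


# Illustrative data only (not real variants).
child = [
    Variant("chr1", 1000, "A", "G"),   # also present in the mother: inherited
    Variant("chr2", 500, "C", "T"),    # listed as a known polymorphism
    Variant("chr12", 42, "G", "A"),    # absent from parents and known set
]
mother = [Variant("chr1", 1000, "A", "G")]
father = []
known = {Variant("chr2", 500, "C", "T")}

de_novo = candidate_de_novo(child, mother, father, known)
print(de_novo)
```

In practice, candidate de novo calls obtained in this way still need to be
checked for adequate read depth in both parents and confirmed by an
independent method.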
WES is a powerful tool for clinical application and novel disease gene discovery.
However, large structural variants such as translocations or inversions cannot
be detected through WES. Despite the successful identification of potentially
pathogenic mutations through WES, WGS or PET sequencing, there are
challenges in interpreting NGS data and their causal role in disease. Each
individual generally carries private variants of unknown clinical
significance, as well as potentially damaging mutations, which may lead to false
assignments of disease-causing mutations or may raise ethical issues such as
incidental findings of other disease-causing mutations from unbiased
genome-wide screening.(104, 105)
1.5 PATHOPHYSIOLOGY OF NDDS
The underlying pathology of NDDs has been studied mainly through phenotypic
characteristics of brain morphology, post-mortem brain tissues, and animal
models in which genes identified through genetic analysis are knocked out.
Brain morphology is typically analyzed by structural magnetic resonance
imaging (MRI) or functional MRI (fMRI), both of which are safe and non-invasive
tools for evaluating gross neuroanatomical changes of the brain. MRI uses strong
magnetic fields to define morphological differences in the brain, whereas fMRI
measures neural activity based on changes in blood flow. The majority of
neuroimaging studies in NDDs have focused on autism, Down syndrome or other
syndromic disorders (FXS, Williams syndrome, or velocardiofacial syndrome),
rather than on DD, SD or ID as single entities. Clinically, individuals with
these syndromes usually have a lower IQ and have difficulties in language
and communication.
One of the most replicated neuroanatomical findings in children with autism is
age-specific abnormality of brain anatomy. Abnormal brain overgrowth is
observed at early ages (1-2 years), followed by a decline or arrested
growth in early adulthood, suggesting possible degeneration due to cell cycle
defects, neuronal loss or apoptosis defects.(106) Based on MRI studies that
measure brain size and head circumference, the abnormal early overgrowth is
found specifically in several regions of the brain, including the frontal lobe,
temporal cortex and amygdala, which are known to be important for the
development of higher-order social, emotional, language and communication
abilities.(106-109) Subsequent studies showed an enlarged cerebellum in
children with autism, but smaller cerebellar volumes in DD patients.(110, 111)
At the tissue level, brain enlargement reflects increases in both cerebral gray
and white matter, especially the white matter immediately underlying the
cortex.(112-115) However, fMRI studies investigating perceptual skills and
overall intelligence in ASD patients did not produce consistent results,
indicating that not all brain systems are equally affected. Overall, these
studies suggest that brain pathology changes dynamically across developmental
stages: defects present at younger ages may not be present at older ages, and
vice versa (for example, overgrowth of the amygdala at a young age but not at
an older age). This theory implies that the age-related changes will mark gene
expression, molecular features, synaptic composition, cellular characteristics,
and circuit patterns. The overgrowth phenomenon was hypothesized to be due to
excess neurons resulting from cell cycle defects or failure of
apoptosis.(116, 117) To test this, follow-up studies measured neuron numbers in
post-mortem brain tissues of children with autism.(118)
A recent study found that brain overgrowth in autism involves an excess number
of neurons (~67% more) in the prefrontal cortex, a brain region that is important
for social, communication and complex cognitive behaviour.(119) Another
post-mortem study showed an excess number of von Economo neurons (~53% more) in
the frontoinsular cortex of autism brains compared with controls.(120) Von Economo
neurons are a primate-specific subtype of neurons located mainly in the
dorsolateral prefrontal cortex and frontoinsular cortex, which are important for
expressing emotion and for higher cognitive functioning. Another post-mortem study
found a decreased density of Purkinje cells in the cerebellar hemisphere of children
with autism and ID, which contradicts the MRI finding of an enlarged cerebellum in
autism.(121, 122) Stereological studies of the amygdala did not show consistent
findings either: one study reported smaller, densely packed neurons in the
amygdala(123), while another group found fewer neurons in the
amygdala.(124)
Both neuroimaging and post-mortem studies pose different challenges and
limitations. Neuroimaging techniques can document the different phases of
development. However, although MRI studies may show some minor abnormalities
or variations in structure, autism is a condition for which there is no
diagnostic or easily recognizable pattern on neuroimaging. Post-mortem
brain studies have typically examined older adults with autism, and most have
been performed on small samples, owing to the limited number and quality of
the tissues available. Therefore, the earlier developmental time window, which
is the critical period in which MRI studies showed enlargement of brain
structures, has not been explored. Further studies to identify novel
quantitative imaging biomarkers that can detect early disturbance of
neurodevelopment are needed to facilitate early detection of NDDs.
The majority of animal models for NDDs are based on monogenic diseases,
such as Rett syndrome, FXS, and tuberous sclerosis. Apart from
behavioral studies of specific tasks or abilities in knockout mice, morphological
changes in neurons, such as altered neurite numbers, dendritic arborization,
synaptic plasticity, and axonal outgrowth, have provided insight into specific
defects that alter normal neurodevelopment. One example of well-characterized
pathophysiology in NDDs is FXS, a leading cause of inherited autism and ID that
is caused by expansion of a CGG trinucleotide repeat in the FMR1 gene on
chromosome X (premutation: 55-200 repeats; full mutation: >200
repeats).(125, 126) FMRP, the protein encoded by FMR1, regulates synaptic
protein synthesis by binding to ribosomes and repressing translation of target
mRNAs. In the Fmr1-/- mouse model, Huber and colleagues(127) showed that the
absence of FMRP leads to increased protein synthesis and enhanced long-term
depression in the hippocampus, due to excessive glutamatergic signaling via
metabotropic glutamate receptor 5 (mGluR5).(128) Fmr1 KO mice have
abnormal dendritic spine morphology and increased spine density, which
recapitulate the neuronal profiles of post-mortem brains from patients with FXS.
It has been postulated that excessive protein translation is the underlying cause
of the NDD phenotype in FXS patients, as compensating for the loss of FMRP
function corrected some FXS phenotypic features.(129) Therefore,
therapeutic strategies for FXS have been directed towards modulating mGluR5
activity. These efforts resulted in the identification of several mGluR5 antagonists
that are currently in preclinical and clinical development.(129, 130) Some of
these preclinical studies have shown preliminary improvement in motor
learning and better accuracy on performance tasks.(131-133)
This example shows the advantage of characterizing highly penetrant mutations
for understanding molecular pathogenesis and developing therapeutic targets
based on genetic findings. However, the majority of idiopathic NDD cases have
no single major genetic cause, but are instead caused by rare mutations that
differ between individuals with similar diagnoses. Thus, it remains challenging to
establish a single defining pathology through animal models. One alternative
is to use patient-derived iPSCs, which can be differentiated into the
relevant cell types, such as neurons or neural progenitor cells. This approach
has the potential to improve the validation of biological hypotheses, because
phenotypic assays can be performed on derived cell types that share the genetic
background of the patient, without the alteration of gene expression required
in mouse models.
1.6 OVERVIEW OF DNA PAIRED-END TAG (DNA-PET)
SEQUENCING
Large structural rearrangements such as translocations, inversions, deletions and
duplications have functional roles in human traits and diseases, and have
previously been characterized mainly by karyotyping, FISH and aCGH. WES has
recently demonstrated its ability to detect CNVs at higher resolution based on
sequencing read depth. However, neither aCGH nor exome sequencing can
identify copy number neutral rearrangements, such as insertions, inversions, and
translocations. For studies focusing on the identification of such rearrangements,
paired-end tag (PET) sequencing is a well-suited approach.
The principle of the PET technique is to sequence only the short 5’ and 3’ tags of
DNA fragments of a specific insert size derived from genomic DNA, in a massively
parallel manner.(92, 95, 134, 135) This strategy was initially applied
by Snyder and colleagues to systematically analyze SVs in two cell lines derived
from healthy individuals, which revealed extensive variation in the human
genome.(92) Another study extended the analysis to eight individuals using
cloning-based PET sequencing, and provided an SV map that is used as a reference
for normal SVs in the population.(95) These studies used short insert sizes of
a few kilobases (3 kb), which limited the ability to detect larger
rearrangements, often missed repetitive sequences, and resulted in
very low coverage (less than 10-fold).
To this end, our institute has recently established a modified PET sequencing
technique that uses larger insert sizes than the regular protocol.(134, 136-138)
Different insert sizes have been tested, and 10 kb fragments proved to be
the optimal size for SV analysis.(134, 136) In theory, larger inserts
are superior to shorter ones because they are better able to span complicated
DNA sequence features such as repetitive sequences. Moreover, they provide higher
physical coverage (50 to 100-fold per genome) than short-insert PET
sequencing. The experimental scheme of large-insert DNA-PET sequencing is
illustrated in Figure 3, as adapted from Yao et al.(134)
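The relationship between insert size and physical coverage can be made concrete
with a short calculation. The sketch below assumes the usual definition of
physical coverage (number of mapped PETs multiplied by the insert size, divided
by the genome size) and an approximate genome size of 3 Gb; the function name
and the library sizes are illustrative assumptions, not values from this study.

```python
def physical_coverage(num_pets, insert_size_bp, genome_size_bp):
    """Physical coverage = total bases spanned by inserts / genome size."""
    return num_pets * insert_size_bp / genome_size_bp


GENOME_SIZE_BP = 3_000_000_000  # assumption: approximate human genome size

# The same hypothetical number of non-redundant PETs with two insert sizes.
cov_10kb = physical_coverage(15_000_000, 10_000, GENOME_SIZE_BP)
cov_3kb = physical_coverage(15_000_000, 3_000, GENOME_SIZE_BP)

print(f"10 kb inserts: {cov_10kb:.0f}x physical coverage")
print(f"3 kb inserts:  {cov_3kb:.0f}x physical coverage")
```

For a fixed number of sequenced PETs, the physical coverage scales linearly
with the insert size, which is why large-insert libraries span more potential
breakpoints per sequencing run.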
Figure 3. DNA-PET sequencing workflow.
Genomic DNA is sheared into ~10 kb fragments, and LMP cap linkers are attached to
each fragment end. After fragments of the correct size are selected, each fragment is
circularized by adding a biotinylated internal adaptor. Upon restriction digest, the
constructs containing both ends (5’ and 3’) of each genomic fragment are released,
ligated with SOLiD sequencing adaptors, and PCR-amplified for high-throughput
sequencing on an ABI SOLiD sequencer.
After massively parallel sequencing, the PETs are mapped back to the reference
genome, and the orientation of the tags and the mapping distance between them are
examined. PETs whose 5’ and 3’ tags map with incorrect orientation, with a larger
or shorter distance than expected, or to different chromosomes are referred to as
discordant PETs (dPETs) and are indicative of SVs, classified as illustrated in
Figure 4.
Figure 4. Classification of SVs based on dPET mapping criteria.
A balanced translocation is characterized by two paired-end reads that map to
different chromosomes. An inversion is identified by two paired-end reads at adjacent
coordinates showing reversed strand orientation on the same chromosome. A deletion is
detected when the two tags of a pair map in the same orientation but at a larger
distance than expected. An unpaired inversion is recognized by a single paired-end
read in reversed orientation. An isolated translocation is identified by a single
paired-end read whose tags map to different chromosomes.
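The mapping criteria above can be sketched as a simple decision rule applied to
a single PET. This is a simplified illustration only: it assumes that the two
tags of a concordant PET are reported on the same strand, uses the 7-11 kb size
selection as hypothetical span thresholds, and does not distinguish paired from
unpaired events, which requires comparing clusters of dPETs. The names Tag and
classify_pet are hypothetical.

```python
from typing import NamedTuple


class Tag(NamedTuple):
    chrom: str
    pos: int
    strand: str  # "+" or "-"


def classify_pet(tag5, tag3, min_span=7_000, max_span=11_000):
    """Label one PET from the mapping of its 5' and 3' tags.

    Thresholds are illustrative; they are not the filtering criteria
    used in the thesis.
    """
    if tag5.chrom != tag3.chrom:
        return "translocation"
    if tag5.strand != tag3.strand:
        return "inversion"
    span = abs(tag3.pos - tag5.pos)
    if span > max_span:
        return "deletion"
    if span < min_span:
        return "short_span"  # shorter than expected; left unclassified here
    return "concordant"


examples = [
    (Tag("chr9", 100, "+"), Tag("chr17", 5_000, "+")),
    (Tag("chr5", 1_000, "+"), Tag("chr5", 10_000, "-")),
    (Tag("chr7", 1_000, "+"), Tag("chr7", 60_000, "+")),
    (Tag("chr7", 1_000, "+"), Tag("chr7", 10_000, "+")),
]
labels = [classify_pet(t5, t3) for t5, t3 in examples]
print(labels)
```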
These discordant mapping criteria allow the identification of large SVs in a
single sequencing run, which was previously not feasible using conventional
methods. For accurate SV annotation, DNA-PET findings require validation by
longer-read sequencing such as the Sanger method, or by FISH analysis. In addition,
DNA-PET is limited to the identification of large SVs and is unable to detect
point mutations or SNVs. This strategy has been applied by several groups from
our institute to study the complex genomic architecture of cancer genomes, and
revealed distinct hard-wired patterns of somatic rearrangements.(136-138) To
apply the technique to the identification of germline and de novo SVs in
human disorders, several adjustments to the filtering criteria were necessary to
provide less stringent thresholds and normalized parameters than those used for
cancer genomes. Cross-comparison of the investigated genomes with other normal
genomic libraries enabled the exclusion of germline SVs found in the normal
population, which usually account for >95% of identified SVs. Sequencing artifacts
produced by chimeric ligation products during library construction, and technical
noise, were removed using the filtering criteria described in the Methods
(Chapter 2).
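The cross-comparison step can be illustrated with a minimal sketch in which an
SV found in a patient is discarded when an SV of the same type with nearby
breakpoints is present in any normal library. The record layout, the function
names and the 5 kb breakpoint tolerance are assumptions made for illustration;
the actual filtering criteria are those given in Chapter 2.

```python
from typing import NamedTuple


class SV(NamedTuple):
    kind: str      # e.g. "deletion", "inversion", "translocation"
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int


def same_event(a, b, tolerance=5_000):
    """True if two SVs have the same type and breakpoints within tolerance."""
    return (
        a.kind == b.kind
        and a.chrom_a == b.chrom_a
        and a.chrom_b == b.chrom_b
        and abs(a.pos_a - b.pos_a) <= tolerance
        and abs(a.pos_b - b.pos_b) <= tolerance
    )


def remove_common_svs(patient_svs, normal_svs, tolerance=5_000):
    """Keep only patient SVs that match no SV in the normal libraries."""
    return [
        sv for sv in patient_svs
        if not any(same_event(sv, n, tolerance) for n in normal_svs)
    ]


# Illustrative coordinates only.
patient = [
    SV("translocation", "chr6", 99_000_000, "chr8", 41_000_000),
    SV("deletion", "chr3", 1_000_000, "chr3", 1_050_000),
]
normals = [SV("deletion", "chr3", 1_001_200, "chr3", 1_049_000)]

rare = remove_common_svs(patient, normals)
print(rare)
```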
In principle, DNA-PET sequencing, with its capability to detect both balanced and
unbalanced rearrangements, is attractive for mapping the breakpoints of
chromosomal rearrangements for novel disease gene identification. As
described earlier in this chapter, many studies have documented
disease-associated chromosomal rearrangements that were identified through
breakpoint cloning. DNA-PET sequencing is well suited to
such a setting, as it can rapidly pinpoint the breakpoint regions of chromosomal
rearrangements that segregate with disease phenotypes in familial cases, or of
rare de novo events. DNA-PET may also be applicable to non-syndromic
NDD patients who have no definitive diagnosis after exhaustive genetic
testing.
1.7 THESIS AIMS
In this thesis, I focused primarily on the functional impact of chromosomal
rearrangements in NDDs. The major aims of this thesis are: (i) to explore the
application of DNA-PET in mapping genes disrupted by chromosomal
rearrangements in order to identify novel NDD genes, and (ii) to functionally
validate selected candidate genes using various disease modeling strategies. The
schematic workflow of the thesis is outlined in Figure 5.
Figure 5. Study design of the thesis. Seven patients were included in this study.
Three of them were suspected of having a specific phenotype and had undergone
specific genetic/metabolic testing in addition to routine G-banding karyotyping
and aCGH. The other four patients were undiagnosed and had only been tested by
chromosome analysis. Each of these patients carried chromosomal rearrangements
that had been identified prior to DNA-PET sequencing. The patients were then
analyzed by DNA-PET sequencing, followed by SV filtering analysis and validation
to identify candidate genes. After extensive analysis, several candidate genes
were selected for further functional validation. FRAXA indicates the Fragile X
test for FMR1 mutation.
Chapter 1 gives an overview of NDDs and of the clinical and genetic
evaluations used to assess patients, followed by the causes and pathophysiology
of NDDs. For the discovery of candidate genes, patients were
selected by clinical geneticists. These subjects belonged to the non-syndromic
NDD category, had undergone exhaustive genetic testing, and each
carried a distinct, previously undescribed chromosomal rearrangement. The
clinical data of these patients and the experimental methodologies
are described in Chapter 2.
DNA-PET sequencing was used as the main approach to pinpoint the
chromosomal breakpoints in these patients, and to detect additional clinically
relevant variants from the sequencing data. The results of the sequencing
and variant analysis for each patient are described in detail in Chapter 3.
Focusing on one candidate gene, MED13L, Chapter 4 delineates the
functional impact of its disruption on neural crest-derived organs and its
correlation with ID phenotypes, based on assays performed in zebrafish and human
embryonic stem cell-derived neurons. Follow-up validation of another candidate
gene, GTDC1, using induced pluripotent stem cell-derived neurons is presented
in Chapter 5. Ultimately, these data offer broad perspectives on altered
gene dosage as a result of chromosomal rearrangements and provide insight
into the molecular mechanisms of novel NDD genes.
CHAPTER 2: MATERIALS AND METHODS
2.1 PATIENT SAMPLES AND CLINICAL INFORMATION
All DNA samples were obtained from peripheral blood lymphocytes or cultured
skin fibroblasts of patients seen at the Children’s Hospital of Westmead in
Sydney, Australia; Centre Hospitalier Regionale d’Orleans in Orleans, France; and
Centre Hospitalier Regional University de Montpellier, Hospital
Arnaud-de-Villeneuve, Montpellier, France, as described in Table 3. Patient
samples and clinical data were collected after written informed consent was
obtained from the patients’ parents, in accordance with protocols approved by
the local Institutional Review Boards (IRB) of the National University of
Singapore in Singapore; the Children’s Hospital of Westmead in Sydney, Australia;
Centre Hospitalier Regional University de Montpellier, Hospital
Arnaud-de-Villeneuve, Montpellier, France; and Centre Hospitalier Regionale
d’Orleans in Orleans, France.
Patient ID | Hospital                                              | Country of Origin | Sample obtained from the hospital
CD5        | Centre Hospitalier Regionale d’Orleans                | France            | EBV-LCL
CD6        | Centre Hospitalier Regionale d’Orleans                | France            | EBV-LCL and Fibroblast
CD14       | Centre Hospitalier Regionale d’Orleans                | France            | EBV-LCL
CD8        | Children’s Hospital of Westmead                       | Sydney            | Fibroblast
CD9        | Children’s Hospital of Westmead                       | Sydney            | gDNA
CD10       | Children’s Hospital of Westmead                       | Sydney            | gDNA
CD23       | Centre Hospitalier Regional University de Montpellier | France            | EBV-LCL
Table 3. List of patients and participating hospitals included in this study. Patients
were recruited by the clinicians in each hospital. The source of the samples obtained
from each hospital is indicated: EBV-LCL, Epstein-Barr virus-transformed
lymphoblastoid cell line; gDNA, genomic DNA extracted from blood.
Figure 6. Patients’ pedigrees and partial karyotypes indicating the
rearrangements.
Seven patients were included in the study. Patient IDs are assigned as CD(number),
corresponding to the clinical description of each patient below. Prior karyotypic
analysis identified a unique chromosomal rearrangement in each patient, and the
karyotype is indicated below the pedigree. Filled boxes/circles indicate male/female
probands carrying a chromosomal rearrangement together with a phenotype, and dotted
boxes/circles indicate unaffected carriers. Patients who were analyzed by DNA-PET
sequencing are indicated by a red arrow. Abbreviations are as follows: LD, Language
Delay; DD, Developmental Delay; ID, Intellectual Disability; SD, Speech Delay.
2.1.1 PATIENT CD5
This case is a familial balanced translocation detected in a father and his two sons,
who presented with variable degrees of NDD features. The first son displayed autistic
behavior and global DD at three years of age, with absence of speech, feeding and
sleeping difficulties, and stereotypic movements. At four years of age he showed
improvement in communication and speech. Chromosome analysis revealed a
translocation 46,XY,t(9;17)(q12;q24) (Figure 6), which he shares with his
father and sibling. During genetic counseling, the father reported that he had
suffered from LD and DD during childhood, which were not investigated at the time
and had resolved by adolescence. His second son was found to have DD and autistic
features at the age of two years. The father’s (Patient CD5) DNA was obtained for
DNA-PET sequencing and downstream analysis. Breakpoint analysis by Sanger
sequencing was performed on both sons. Chromosome analysis of the
phenotypically normal biological mother of the two affected sons revealed a
normal karyotype.
2.1.2 PATIENT CD10
The patient was born at term by lower segment Caesarean section, due to breech
presentation. Delay in developmental milestones was noted at 18 months of age
involving both walking and speech. At 9 years old, comprehension and
behavioural difficulties were noted at school. Karyotype analysis revealed a
balanced translocation 46,XX,t(6;8)(q16.2;p11.2) (Figure 6). Metabolic screening
and FRAXA testing were normal. Karyotypic analysis in his parents revealed that
the translocation was derived from his mother, and she had no intellectual
difficulties, but was reported to have had a ‘hole in the heart’ in childhood that
closed spontaneously. His mother had another subsequent pregnancy and
amniocentesis revealed a female fetus with the same balanced translocation. This
second child was noted to have plagiocephaly soon after birth and had feeding
difficulties in early infancy, which gradually resolved. At 2 years old, his sister
had LD with low-average fine motor and gross motor skills. Right intermittent
exotropia was noted. An MRI brain scan showed closed lip schizencephaly
involving the right frontal lobe with polymicrogyria in the right Sylvian fissure.
MRI brain scans in her brother and mother were normal.
2.1.3 PATIENT CD8
Patient CD8 was one of monozygotic twins (Figure 6). At the age of three years,
he had an acute EBV infection with prolonged hepatosplenomegaly and
abnormality of liver function tests. Immunological investigations showed reduced
IgM, decreased CD4 T-helper cells and decreased natural killer (NK) cell
function. At the age of five years, mild ID was diagnosed, as well as a specific
language disorder affecting his receptive and expressive skills. Both twins had
articulation problems with velo-pharyngeal incompetence. In view of the
velopharyngeal incompetence, 22q11.2 FISH (TUPLE1) was requested and the
result was normal. Karyotype analysis revealed a complex inversion
46,Y,inv(X)(?p22?q26) (Figure 6), derived from his phenotypically normal
mother. His identical twin showed similar clinical features and carried the same
karyotype. His maternal uncle was reported to have recurrent infections and a
maternal great uncle died at eight months of age, due to liver abnormalities,
suggesting that the male carriers of this inversion in this family had features of
X-linked lymphoproliferative disease. In view of possible features of Duncan
disease or XLP-1 due to SH2D1A mutations, western blot analysis was performed on
whole-cell lysates of the patient’s mononuclear cells and showed normal expression
of SAP, the protein encoded by SH2D1A.
2.1.4 PATIENT CD9
The patient was born by normal delivery after an uneventful pregnancy. Mild
delay in early motor and speech developmental milestones was reported. At age
twelve, he was found to have average intellectual ability, an expressive language
disorder, and specific learning difficulties in maths, spelling and reading; as well
as attention deficit disorder. Chromosome analysis revealed a paracentric
inversion of chromosome 5: 46,XY,inv(5)(q22q35.1) (Figure 6). Both parents
had normal karyotypes and did not show any phenotypic abnormality.
2.1.5 PATIENT CD14
The patient was the only child of a young, healthy and unrelated couple. The
patient was born at 38 weeks by caesarean section. At 8 months, psychomotor
delay was noted and confirmed at 20 months. Clinical examination at the age of 7
years 9 months showed a relatively small-sized boy with microcephaly (-4.7 SD
for age). Physical examination revealed slightly down-slanting palpebral fissures,
deep-set eyes, flat nasal bridge, bulbous nasal tip, large dysplastic posteriorly
rotated ears, widely spaced nipples, dorsal kyphoscoliosis, bilaterally tapering
fingers, unilateral simian crease, shawl scrotum and hypoplastic genitalia.
Neurological examination showed axial hypotonia with hypertonia of the lower
limbs. The patient had psychomotor retardation with absence of speech. Brain
magnetic resonance imaging showed microcephaly at age 14 months and at 7
years 8 months. His electroencephalogram was normal for his age and he did not
have a history of seizures. Chromosome analysis revealed a normal karyotype,
and telomeric FISH showed a subtelomeric deletion of the long arm of
chromosome 7 (7q; Figure 6), which was inherited from his asymptomatic father.
2.1.6 PATIENT CD6
The patient is the fifth child from a second marriage of a healthy mother and
father. During pregnancy, amniocentesis was done for advanced maternal age and
revealed a de novo translocation 46,XY,t(2;8)(q23;q21) (Figure 6). Ultrasound
examination was normal and the pregnancy continued to term. The baby was born
at term with Apgar score of 10 at both 1 and 5 minutes. At the age of two years,
he presented with global DD without dysmorphic features or growth delay. He
had excessive salivation and deep tendon reflexes. He initially walked on tiptoe,
but his gait had returned to normal by six years of age. At the age of four
years, he had cognitive difficulties, SD and LD. DNA from the patient and both
biological parents was used for DNA-PET analysis. The mother and father are
referred to as CD15 and CD16, respectively.
2.1.7 PATIENT CD23
The patient is the only child of healthy and unrelated parents. She was diagnosed
with Pierre-Robin sequence at birth, characterized by cleft palate, glossoptosis
and retrognathia, which was corrected by surgery. She had multiple limb
contractures and camptodactyly, including metatarsus adductus of the thumb and
bilateral equinovarus foot deformity. At 4 years of age she had language
difficulties with absence of speech. Other developmental milestones were
significantly delayed and she required special education. Physical examination
revealed dysmorphic features including flat occiput, hypertelorism, flat philtrum,
and broad nasal bridge. Strabismus and hirsutism were noted. Additionally, she
developed scoliosis during puberty (at about 14 years of age). Her EEG showed
epileptiform discharges, and she was seizure-free on anti-epileptic drug treatment.
Chromosome analysis revealed a balanced translocation between chromosomes 12
and 19. Both parents had normal karyotypes and did not show any phenotypic
abnormality. Array-CGH performed on the patient’s genomic DNA showed no
additional chromosomal imbalances.
2.2 G-BANDING KARYOTYPE
Karyotypes were determined in the participating hospitals by G-banding analysis
using standard protocols and were described according to the ISCN nomenclature.
G-banding karyotypes were subsequently confirmed for each
patient sample by the Cytogenetics Unit, Genome Institute of Singapore (GIS).
Metaphase chromosomes were prepared from Epstein-Barr virus-transformed
lymphoblastoid cell lines (EBV-LCL) or cultured skin fibroblast cells obtained
from patients, parents or siblings.
2.3 FLUORESCENCE IN SITU HYBRIDIZATION (FISH)
FISH was performed on metaphase chromosomes using standard protocols.(139,
140) BAC and fosmid probes were obtained from BACPAC Resources, CHORI.
Probes were labeled using a nick-translation kit (Enzo) with Biotin-16-dUTP or
Digoxigenin-11-dUTP (Roche). The probes were blocked with 1 µg/µl Human
Cot1 DNA (Life Technologies) and resuspended at a concentration of 5
ng/µl in hybridization buffer (2x SSC, 10% dextran sulfate, 1x PBS, 50%
formamide). Prior to hybridization, metaphase slides were pre-treated with 0.01%
pepsin for 5 min at 37 °C, followed by a 1x PBS rinse, treatment with 1%
formaldehyde for 10 min, a 1x PBS rinse (5 min) and dehydration through an
ethanol series (70%, 80%, and 100%). Probes and slides were co-denatured for
5 min at 75 °C and hybridized overnight at 37 °C. After post-hybridization
washes, signals were detected with avidin-conjugated fluorescein isothiocyanate
(Vector Laboratories) or anti-Digoxigenin-Rhodamine (Roche). Slides were mounted
with VectaShield (Vector Laboratories) and observed under an upright Nikon
fluorescence microscope. Image analysis was performed using ISIS Metasystems
software.
2.4 GENOMIC DNA ISOLATION
Genomic DNA was extracted from the patients’ EBV-LCL, fibroblasts, or blood
using Qiagen Blood and Cell Culture DNA Kits (Qiagen), according to the
manufacturer’s instructions. EBV-LCL and fibroblast cell lines were kept to
a minimum number of passages prior to DNA extraction. The quality and quantity of
the extracted DNA were assessed using a NanoDrop 1000 Spectrophotometer and
agarose gel electrophoresis.
2.5 ARRAY COMPARATIVE GENOMIC HYBRIDIZATION
(ACGH)
aCGH was performed on all patient samples using the SurePrint G3 Human 2 x
400k aCGH Microarray (Agilent Technologies Inc., Santa Clara) according to the
manufacturer’s instructions. The microarray slides were scanned on the Agilent
Microarray Scanner. Data were processed with Genomic Workbench software,
standard edition 5.0.14 (Agilent). The Aberration Detection Method (ADM-2)
algorithm was used to identify DNA copy number variations (CNVs). The ADM-2
algorithm identifies all aberrant intervals in a given sample with consistently high
or low log ratios, based on a statistical score. A filter requiring a minimum of
three probes per region was applied, with a centralization threshold of 6. NCBI
Build 36 was used as the reference genome. To compare the detection rates of
aCGH and DNA-PET, the aCGH raw data were used as a standard to validate small
CNVs identified by DNA-PET; the copy number change was interpreted from
the individual probe ratios, using a cutoff of 0.1 (<0.1 indicated
deletion, >0.1 indicated duplication).
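The probe-level interpretation can be sketched as follows. The sketch assumes
that the cutoff is applied symmetrically to the probe log ratios (deletion
below -0.1, duplication above +0.1), which is one reading of the thresholds
quoted above, and it reuses the minimum of three probes per region as a simple
agreement rule. The function names are hypothetical, and the code is not the
ADM-2 algorithm.

```python
def call_probe(log_ratio, cutoff=0.1):
    """Call a single probe from its log ratio (symmetric cutoff assumed)."""
    if log_ratio < -cutoff:
        return "deletion"
    if log_ratio > cutoff:
        return "duplication"
    return "neutral"


def call_region(log_ratios, cutoff=0.1, min_probes=3):
    """Call a region if at least `min_probes` probes, and a majority, agree."""
    calls = [call_probe(r, cutoff) for r in log_ratios]
    for state in ("deletion", "duplication"):
        n = calls.count(state)
        if n >= min_probes and n > len(calls) / 2:
            return state
    return "neutral"


# Illustrative log ratios only.
region_del = call_region([-0.6, -0.5, -0.7, 0.02])
region_dup = call_region([0.3, 0.45, 0.5])
region_few = call_region([0.4, 0.5])  # only two probes: not called

print(region_del, region_dup, region_few)
```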
2.6 DNA-PET
DNA-PET libraries were constructed for seven patients and the two parents of one
patient according to Hillmer et al.(136) Briefly, genomic DNA was hydrosheared
to 7-11 kb fragments. Long Mate Paired (LMP) cap adaptors were ligated to
the hydrosheared and end-repaired DNA fragments. The cap adaptor-ligated DNA
fragments were separated by agarose gel electrophoresis, recovered using the
QIAEX II Gel Extraction Kit (QIAGEN) and circularized with a biotinylated
adaptor that connects the cap adaptors at both ends of the DNA fragments.
Because the cap adaptors lack 5’ phosphate groups, a nick remained on each strand
after circularization of the DNA. Both nicks were translated outwards by >50 bp into
the circularized genomic DNA fragment by DNA polymerase I (NEB). The
nick-translated constructs were then digested with T7 exonuclease and S1 nuclease
(NEB) to release the paired-end tag (PET) library constructs. These constructs were
ligated with SOLiD sequencing adaptors P1 and P2 (Life Technologies) and
amplified using 2x HF Phusion Master Mix (Finnzymes OY) for sequencing.
High-throughput sequencing of the 50 bp libraries was performed on SOLiD
sequencers (v3plus and v4) according to the manufacturer’s
recommendations (Life Technologies), for all samples except the mother of
Patient CD6. Sequence tags were mapped to the human reference sequence
(NCBI Build 36) and paired using the SOLiD System Analysis Pipeline Tool
Bioscope, allowing up to 12 color code mismatches per 50 bp tag and up to 4
color code mismatches per 35 bp tag. For samples CD5 and CD8, two DNA size
fractions were merged during library construction as part of technical optimization
in the early stage of this project, which resulted in reduced sensitivity for
identifying small deletions and in low physical coverage.
2.7 POST-SEQUENCING ANALYSIS
The majority of the PET sequences mapped concordantly to the reference genome
NCBI Build 36 (concordant PETs or cPETs), with the expected mapping orientation
(5’ tag to 3’ tag) and expected mapping distance (according to the selected
fragment size). The distribution of cPETs in the genome was used to compute copy
number information based on random simulation of different blocks of genomic
fragments over the coverage, according to the Supplementary Methods in Hillmer
et al.(136). Briefly, the genome was divided into non-overlapping windows of
equal size; the window size was set as a parameter that depends on the overall
coverage of the sequence data. PETs of 1x genome coverage were simulated by
generating randomly distributed fragments across the reference genome, adjusted
to the size distribution of the actual experimental library, and the tags at both
ends of the fragments were subsequently extracted. The simulated tags were then
mapped back to the reference genome and paired using the same ABI SOLiD mapping
and pairing pipeline. The tag density in each window was then corrected for its
mappability by computing the ratio of the number of mapped tags of the
experimental library to that of the simulated library within the window.
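The mappability correction described above can be illustrated with a minimal sketch. It assumes tag positions on a single chromosome are already available as integer coordinates; the function name `window_ratios` and the toy coordinates are hypothetical and are not part of the published pipeline.

```python
from collections import Counter


def window_ratios(exp_positions, sim_positions, window_size):
    """Return {window_index: experimental/simulated tag-count ratio}.

    Windows without any simulated tag are treated as unmappable and skipped,
    so the ratio is only reported where the simulation placed tags.
    """
    exp_counts = Counter(pos // window_size for pos in exp_positions)
    sim_counts = Counter(pos // window_size for pos in sim_positions)
    return {
        window: exp_counts.get(window, 0) / n_sim
        for window, n_sim in sorted(sim_counts.items())
    }


# Toy tag coordinates (hypothetical values, one chromosome).
experimental = [10, 50, 120, 130, 140, 150, 310]
simulated = [20, 80, 110, 160, 220, 330, 390]
ratios = window_ratios(experimental, simulated, window_size=100)
```

In this toy example a ratio above 1 in a window would hint at a gain and a ratio below 1 at a loss, relative to the simulated 1x library; real data would need normalization for the actual sequencing depth.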
The remaining portion of the PETs mapped discordantly to the reference genome
(discordant PETs or dPETs), classified as those with incorrect paired-tag
orientation and incorrect genomic distances. These dPETs provided information
to search for genomic rearrangements, with specific criteria for different types of
SVs according to PET mapping orientation and genomic region, as described in
Figure 4 (Chapter 1.6). Overlapping dPETs representing similar SVs were
clustered together as dPET clusters and counted as the cluster size, according to
the criteria described in the Supplementary Methods of Hillmer et al.(136), with
refined data curation as described in Ng et al.(138), as follows:
(i) to obtain a 5’ and 3’ anchor region, the mapping locations of the 5’ and 3’ tags
of a dPET were extended by the maximum insert size of the respective genomic
library in both directions; (ii) if the 5’ and 3’ tags of a second dPET mapped
within the 5’ and 3’ windows of the first dPET, the two PETs were defined as a
cluster of size 2, and the 5’ and 3’ windows were extended to accommodate the
second dPET; (iii) the number of dPETs clustering together at a similar
genomic position was represented by the cluster size.
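Steps (i) to (iii) can be sketched as a simple greedy procedure. This is an illustration only: it assumes all dPETs belong to the same chromosome pair and orientation class, and the `Cluster` class, the function `cluster_dpets` and the coordinates are hypothetical rather than taken from the DNA-PET software.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    start5: int  # 5' anchor window
    end5: int
    start3: int  # 3' anchor window
    end3: int
    size: int = 1


def cluster_dpets(dpets, max_insert):
    """Greedily cluster (pos5, pos3) tag pairs whose tags fall in shared windows."""
    clusters = []
    for pos5, pos3 in sorted(dpets):
        for c in clusters:
            if c.start5 <= pos5 <= c.end5 and c.start3 <= pos3 <= c.end3:
                # Step (ii): extend both anchor windows to cover the new dPET.
                c.start5 = min(c.start5, pos5 - max_insert)
                c.end5 = max(c.end5, pos5 + max_insert)
                c.start3 = min(c.start3, pos3 - max_insert)
                c.end3 = max(c.end3, pos3 + max_insert)
                c.size += 1
                break
        else:
            # Step (i): open a new cluster with windows of +/- max_insert.
            clusters.append(Cluster(pos5 - max_insert, pos5 + max_insert,
                                    pos3 - max_insert, pos3 + max_insert))
    return clusters


pairs = [(100_000, 900_000), (104_000, 903_000),
         (108_000, 899_000), (500_000, 2_000_000)]
clusters = cluster_dpets(pairs, max_insert=10_000)
sizes = [c.size for c in clusters]  # step (iii): cluster sizes
```

The first three toy pairs share anchor windows and form one cluster, while the fourth pair remains a singleton.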
The following criteria were applied to remove sequencing artefacts for annotation
of germline SVs: (i) dPET clusters with anchor regions of less than 1,000 bp were
excluded; (ii) only dPET clusters of size 6 or higher were retained, as a stringent
cutoff to reduce the number of false positives; (iii) dPET clusters with potential
sequence homology between aligned breakpoints (BLAST scores) of more than
2000 were excluded, as these SVs were difficult to validate by PCR due to high
sequence homology.
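The three curation criteria can be expressed as a single predicate. The sketch below assumes that the anchor-size criterion applies to the shorter of the two anchor regions, which the text does not state explicitly; the function and parameter names are hypothetical.

```python
def passes_curation(anchor5_bp, anchor3_bp, cluster_size, blast_score,
                    min_anchor=1000, min_size=6, max_blast=2000):
    """Return True if a dPET cluster survives criteria (i) to (iii)."""
    return (
        min(anchor5_bp, anchor3_bp) >= min_anchor  # (i) anchor regions >= 1,000 bp
        and cluster_size >= min_size               # (ii) cluster size of 6 or higher
        and blast_score <= max_blast               # (iii) breakpoint homology score
    )


kept = passes_curation(3500, 4200, cluster_size=8, blast_score=150)
short_anchor = passes_curation(800, 4200, cluster_size=8, blast_score=150)
small_cluster = passes_curation(3500, 4200, cluster_size=5, blast_score=150)
homologous = passes_curation(3500, 4200, cluster_size=8, blast_score=2500)
```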
2.8 FILTERING OF NORMAL STRUCTURAL VARIATION (SVS)
Comparison of clusters across different genomes was performed as described by
Ng et al.(138). Normal SVs were filtered using the following data sources: (i)
in-house DNA-PET data of 23 normal individuals (25 DNA-PET data sets); (ii)
the pilot release set of the 1000 Genomes Project from Mills et al.(141); (iii)
published SVs identified from paired-end sequencing studies of 18 normal
individuals(92, 95); and (iv) the DGV. The fraction of a predicted SV that
overlapped with a published SV was calculated as the percentage of overlap
relative to the larger event, with a cutoff of 80% or more for filtration of SVs
shared with normal individuals. Thus, SVs seen in normal individuals were
excluded, with the assumption that they are less likely to underlie the disease,
although this stringent analysis could potentially miss common variants with
reduced penetrance of the disease in the population.
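The overlap criterion can be sketched as follows, assuming both SVs lie on the same chromosome and are given as (start, end) intervals; the helper names `overlap_fraction` and `is_known_sv` are hypothetical.

```python
def overlap_fraction(a_start, a_end, b_start, b_end):
    """Overlap length divided by the length of the larger of the two events."""
    overlap = max(0, min(a_end, b_end) - max(a_start, b_start))
    larger = max(a_end - a_start, b_end - b_start)
    return overlap / larger if larger > 0 else 0.0


def is_known_sv(sv, known_svs, cutoff=0.80):
    """True if the SV overlaps any catalogued SV by at least the cutoff."""
    return any(
        overlap_fraction(sv[0], sv[1], k[0], k[1]) >= cutoff for k in known_svs
    )


shared = is_known_sv((1000, 2000), [(1100, 2100)])       # 900 / 1000 = 0.9
not_shared = is_known_sv((1000, 2000), [(1000, 5000)])   # 1000 / 4000 = 0.25
```

Dividing by the larger event prevents a small patient SV nested inside a much larger control SV from being filtered out as a shared variant.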
For the identification of matching SVs within the parent-offspring trio of patient
CD6, a cross-genome comparison was generated for the trio, including the normal
individual datasets. Filtering of sequencing artifacts was performed according to
the DNA-PET data curation criteria. SVs were classified into: (i) maternal-identical
SVs, shared between mother and offspring; (ii) paternal-identical SVs, shared
between father and offspring; (iii) identical germline SVs, shared within the trio
and presumably found in the normal population; (iv) non-identical SVs, de novo SVs
exclusively found in the offspring; and (v) total SVs in each individual (paternal,
maternal or offspring). The rates of maternal and paternal inheritance were
calculated as the percentage of maternal/paternal-identical SVs relative to the
total number of maternal/paternal SVs. Filtration of normal SVs and parental SVs
(maternal-identical SVs and paternal-identical SVs) resulted in specific de novo
SVs found in the offspring.
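The trio classification lends itself to set operations. The sketch below assumes that matching SVs have already been assigned a common identifier across the three genomes, and that SVs shared by the whole trio are counted separately from maternal-identical and paternal-identical SVs; both points, like the identifiers themselves, are illustrative assumptions.

```python
def classify_trio(mother, father, child):
    """Split SV identifiers into the classes used for the trio comparison."""
    return {
        "germline_shared": mother & father & child,
        "maternal_identical": (mother & child) - father,
        "paternal_identical": (father & child) - mother,
        "de_novo": child - mother - father,
    }


def inheritance_rate(identical, parent_total):
    """Percentage of a parent's SVs that are also found in the offspring."""
    return 100.0 * len(identical) / len(parent_total)


# Hypothetical SV identifiers.
mother = {"del_1", "del_2", "dup_1", "inv_1"}
father = {"del_1", "del_3", "ins_1"}
child = {"del_1", "del_2", "del_3", "tra_1"}

classes = classify_trio(mother, father, child)
maternal_rate = inheritance_rate(classes["maternal_identical"], mother)
```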
2.9 FUNCTIONAL ANALYSIS OF REGULATORY REGIONS
The hg18 coordinates of genomic regions from each patient’s SV list were
converted to hg19 using LiftOver from the UCSC Genome Browser. SVs occurring
within intronic regions of genes or non-coding regions were assessed for the
probability score of functional regulatory regions in RegulomeDB and ENCODE data
in the UCSC Genome Browser.(142, 143) A RegulomeDB score of 1-2 indicates a
functional binding site of a transcription factor.(144)
2.10 VALIDATION OF BREAKPOINTS BY SANGER
SEQUENCING
Primers were designed with the Primer3 program (http://primer3.ut.ee), and the
amplicons spanning the breakpoint were predicted from dPET clusters according to
human genome assembly NCBI Build 36. PCR was carried out with JumpStart
REDAccuTaq LA polymerase (Sigma Aldrich) in a 50 µl reaction volume and
with 100-500 ng of genomic DNA as a template. The following program was
used: (i) initial denaturation at 96 °C for 30 sec; (ii) 40 cycles of 15 sec at
94 °C, 30 sec at 58 °C and 10 min at 68 °C; (iii) 68 °C for 10 min. PCR products
showing single bands were purified with the Gel Extraction Kit (Qiagen) and used
as templates for sequencing in both directions by Sanger sequencing. The
sequences of junction fragments were aligned to the human genome reference
sequence using Blat in the UCSC Genome Browser(145).
2.11 QUANTITATIVE REAL TIME PCR (QPCR)
Total RNA was extracted from patients’ EBV-LCLs or fibroblasts using the RNeasy
kit (Qiagen). Reverse transcription of 2 µg RNA derived from patients, 2
lymphoblast controls, 2 fibroblast controls and a commercially available human
tissue panel RNA (Clontech) was performed in 20 µl of SuperScript III Reverse
Transcriptase reaction buffer (Life Technologies) using random hexamer primers.
qPCR reactions were performed in the ViiA7 Real Time PCR system (Life
Technologies) with a ten-fold dilution of cDNA and 200 nM of each primer using the
SybrGreen PCR Master Mix (Life Technologies). Data were analyzed using the
2^-∆∆Ct method, and normalized against the control sample with human ACTB. Each
measurement was performed in triplicate. The controls used in this study were
derived from EBV-LCL or skin fibroblast cells of 4 normal individuals. Primers
used to quantify transcript expression of affected genes in patients are listed in
Table 4.
Primer Name Primer Sequence Primer Location
GNAQ F1 GGACAGGAGAGAGTGGCAAG exon 2
GNAQ R1 GAGTGTGTCCATGGCTCTGA exon 2
RBFOX3 F1 GCCACGGCCCGAGTGATGAC exon 8
RBFOX3 R1 ACACGACCGCTCCGTAAGTCG exon 10
UNC5D F1 GCACTTCTCTTTGTCCTGTGG exon 1
UNC5D R1 TTCTCAATGCTTTGGGGTTT exon 2
XIAP F1 CAACACTGGCACGAGCAGGG exon 1
XIAP R1 CATGGCAGGGTTCCTCGGGT exon 2
XIAP F2 GCTGGATTTTATGCTTTAGGTGA exon 2
XIAP R2 ACCAGACACTCCTCAAGTGAA exon 3
TMEM47 F1 AGCCTGGTCCTTTACCCAAT exon 3
TMEM47 R1 AGGGTTCAGGCAATAAAGGA exon 3
TMEM47 F2 TTCTCATTGCATTCCTGGTG exon 2
TMEM47 R2 TTAGGGTTCAGGCAATAAAGGA exon 3
SH2D1A F1 AGGCGTGTACTGCCTATGTG exon 1
SH2D1A R1 TGCAGAGGTATTACAATGCCTTG exon 2
ODZ1 F1 GCAACCAGTTGAAGGAGAGC exon 6
ODZ1 R1 ACGCCAGAATAAACCAGGTG exon 7
NCAPG2 F1 AAGAAAGTTCGGCAGGGAGT exon 10
NCAPG2 R1 TGCAGCATTTGATCGAACTT exon 11
NCAPG2 F2 AAGAGGACGGAAGGGAGAAG exon 15
NCAPG2 R2 GGAAGCACAGAGGCAAACTT exon 16
MCPH1 F1 AGGAAAACTTACCCGGAGGA exon 8
MCPH1 R1 AGGTCCTTAAAGCCGTCACA exon 9
MCPH1 F2 TCCTGAAAGATGTAGTGGCCT exon 1
MCPH1 R2 TGTCCCAAGTGCTCTGGTAG exon 2
MCPH1 F3 CTACCAGAGCACTTGGGACA exon 3
MCPH1 R3 AGCTGCAGGGAACAATGATTC exon 4
GTDC1 F1 CACAGGTGGGAGCATGATAA exon 6-7
GTDC1 R1 CAACATTGCCACTCCAAAGA exon 8
GTDC1 F2 CAGGACCCTCTTTGCAAGTT exon 3-4
GTDC1 R2 GGCATGAATCTGCTCACATC exon 5
GTDC1 F3 CAGACCCAAGGATCTGGAAA exon 5
GTDC1 R3 CTCTGCTCTGGCTGAAAAGG exon 6
MED13L F1 GAGCCTGGAGGATTGTCACT exon 1
MED13L R1 GACATCACGACGCCATACAC exon 2
MED13L F2 AGTGAGCATTTGTCCTGTGCT exon 5
MED13L R2 CTGCATACCTCCACTCTCACC exon 8
MED13L F3 CTGCAGCCACTTTCATTAGAGAT exon 16
MED13L R3 TCGGAGAGAATCAGGGTAACATA exon 17
MED13L F4 TCTGCCTCTGCTCCTGGTAT exon 21
MED13L R4 CATCAAGCTCAACAGCCAAA exon 22
Table 4. qPCR primers used to measure transcript level in the EBV-LCL for each
patient and human tissue panel. One or more candidate genes were tested for each
patient as follows: (i) Patient CD5: GNAQ and RBFOX3; (ii) Patient CD10: UNC5D;
(iii) Patient CD8: XIAP, TMEM47, SH2D1A, and ODZ1; (iv) Patient CD14: MCPH1
and NCAPG2; (v) Patient CD6: GTDC1; (vi) Patient CD23: MED13L.
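The ∆∆Ct analysis named in Section 2.11 can be illustrated with a short sketch of the standard 2^-∆∆Ct calculation. The function name and the Ct values below are hypothetical; triplicate Ct values are averaged before normalization to the reference gene (ACTB in this study).

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript by the 2^-ddCt method."""
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    dd_ct = d_ct_sample - d_ct_control
    return 2 ** -dd_ct


# Hypothetical triplicate Ct values for one candidate gene and ACTB.
fold = relative_expression(
    ct_target_sample=[26.0, 26.5, 25.5],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_control=[25.0, 25.0, 25.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
```

In this toy case the patient sample crosses the threshold one cycle later than the control after normalization, which corresponds to half the control expression level.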
2.12 CNV ANALYSIS FROM PUBLISHED STUDIES
CNV information within genes from 10,151 cases of DD and 8,329 adult controls
was extracted from published data by Cooper et al.(72) and from the DECIPHER
database (with a collection of >17,000 cases). These data were used to obtain
CNV counts within our listed candidate genes. As a control dataset, the first SV
release set of the 1000 Genomes Project, consisting of 185 individuals, was used.
CNVs within the candidate genes were counted in both cases and controls in these
datasets. A chi-square test was performed to compare the CNV counts present in
cases versus controls.
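As an illustration of the case-control comparison, the sketch below computes a Pearson chi-square statistic for a 2x2 table without continuity correction; whether a correction was applied in the original analysis is not stated, and the counts used here are hypothetical. For one degree of freedom the p-value follows from the complementary error function.

```python
import math


def chi_square_2x2(case_with, case_total, ctrl_with, ctrl_total):
    """Pearson chi-square test (1 degree of freedom, no continuity correction)."""
    a, b = case_with, case_total - case_with
    c, d = ctrl_with, ctrl_total - ctrl_with
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, df = 1
    return chi2, p_value


# Hypothetical counts: 20 of 100 cases vs 10 of 100 controls carry a CNV.
chi2, p_value = chi_square_2x2(20, 100, 10, 100)
```

With very small expected counts, as can occur for rare CNVs in a single gene, an exact test may be the safer choice.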
2.13 FUNCTIONAL ANALYSIS BY PLURIPOTENT STEM CELLS
2.13.1 CELL LINES USED AND MAINTENANCE
Human embryonic stem cells (hESCs; H1 line) and induced pluripotent stem cells
(iPSCs) were grown in feeder-free conditions on Matrigel-coated plates (BD
Biosciences) in mTeSR1 medium (STEMCELL Technologies).
2.13.2 INDUCTION OF IPSC FROM FIBROBLAST
Human fibroblasts were reprogrammed to iPS cells by retroviral transduction of
transcription factors following established protocols(146). For virus production,
293-GP2 viral packaging cells (Clontech) were plated at 80% confluence per
100-mm dish and transfected with 8 μg of each retroviral vector (plasmids
expressing OCT4, KLF4, c-MYC and the mouse Sox factor) and 8 μg of VSVG plasmid
using Fugene6 (Roche) following the manufacturer’s instructions. The
virus-containing media was collected 48 hours after transfection, filtered through
a 0.45 μm pore-size cellulose acetate filter (Whatman), and concentrated using
Amicon ultra centrifugal filter units (Millipore) for 15 minutes at 3,000 rpm.
Fibroblast cultures were then infected with virus-containing media with the four
reprogramming factors mixed in equal volumes. The media was changed the next
day. After 2 days, the cells were refreshed with human ES media containing
DMEM/F12 (Life Technologies) supplemented with 20% Knock-out serum and 10 ng/ml
of bFGF (Life Technologies). ES-like colonies were picked and transferred into
24-well plates, followed by mild trypsinization overnight, and the media was
changed the next day. Picked iPSCs were expanded in feeder-free conditions for
further characterization.
2.13.3 NEURAL PROGENITOR CELLS DIFFERENTIATION
NPC differentiation of ESCs and iPSCs was done based on established
protocols(147). Cells were first plated in clumps on Matrigel-coated 6-well plates
in mTeSR1 media at a density of 5 × 10⁴/cm². The following day, the
media was changed to NPC media composed of (1:1) DMEM-F12/Neurobasal
media (Life Technologies) supplemented with N2 and B27 without vitamin A
supplements (Life Technologies) and CHIR99021 (4 µM), SB431542 (3 µM),
Compound E (0.1 µM) and human LIF (10 ng/ml). After 7 days, cells were
passaged using Accutase (Sigma-Aldrich) at 1:6 dilution and grown in the same
media using CHIR99021 (3 µM) and SB431542 (2 µM) together with LIF
(10 ng/ml), EGF and bFGF (20 ng/ml), without Compound E.
2.13.4 NEURONAL DIFFERENTIATION
For neuronal differentiation, NPCs were plated on poly-L-ornithine/Laminin-coated
plates at a density of 200,000 cells per well of a 6-well plate (approximately
20,000 cells/cm²) and grown in (1:1) DMEM-F12/Neurobasal (Gibco) media
supplemented with N2 and B27 supplements (Gibco) together with GDNF
(20 ng/ml) (R&D systems), BDNF (20 ng/ml) (R&D systems), dibutyryl-cyclic
AMP (1 mM) (Sigma-Aldrich) and Ascorbic Acid (200 nM) (Sigma-Aldrich) for
four weeks. For characterization, neurons were passaged using Accutase and
re-plated at very low density (5,000 cells per well of a 24-well plate) on
poly-L-ornithine/Laminin coverslips.
2.13.5 PSUPER SHRNA CLONING AND TRANSFECTION
shRNA sequences targeting candidate genes were designed with the Dharmacon
shRNA design software (http://www.dharmacon.com/designcenter/designcenterpage.aspx)
and oligos were purchased commercially (IDT Technologies). The sequences of the
oligos are as follows:
Oligo Name Sequences
shHMed13l-F1 GATCCCCACACAGAAATGCTGGATAATTCAAGAGATTATCCAGCATTTCTGTGTTTTTTA
shHMed13l-R1 AGCTTAAAAAACACAGAAATGCTGGATAATCTCTTGAATTATCCAGCATTTCTGTGTGGG
shHMed13l-F2 GATCCCCAGGAAGACGAGTTGGGATATTCAAGAGATATCCCAACTCGTCTTCCTTTTTTA
shHMed13l-R2 AGCTTAAAAAAGGAAGACGAGTTGGGATATCTCTTGAATATCCCAACTCGTCTTCCTGGG
Table 5. shRNA sequences for cloning into pSUPER vectors. shRNAs were designed
to target MED13L gene and cloned into pSUPER-Neo vector for stable transfection into
human NPCs.
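The oligos in Table 5 appear to share a common layout: a BglII-compatible 5’ end, a 19-nt target sequence, a 9-nt loop, the reverse complement of the target, and a terminator followed by a HindIII-compatible end. The sketch below rebuilds the first oligo pair from its 19-nt target under that assumption; the layout is inferred from the listed sequences rather than quoted from the vector manual, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def psuper_oligos(target, loop="TTCAAGAGA"):
    """Build forward and reverse hairpin oligos around a 19-nt target."""
    antisense = reverse_complement(target)
    forward = "GATCCCC" + target + loop + antisense + "TTTTTA"
    reverse = ("AGCTTAAAAA" + target + reverse_complement(loop)
               + antisense + "GGG")
    return forward, reverse


# 19-nt MED13L target taken from the shHMed13l-F1 sequence in Table 5.
forward, reverse = psuper_oligos("ACACAGAAATGCTGGATAA")
```

Rebuilding the oligos this way is a quick check that the forward and reverse strands are complementary over the hairpin insert and differ only in their overhangs.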
Oligos were annealed using the annealing buffer (100 mM Tris Buffer pH 8.0, 10
mM EDTA pH 8.0, 5 M NaCl in H2O) in 50 µl reactions. The reaction was
incubated as follows: 90 °C for 4 minutes, 70 °C for 10 minutes, 60 °C for 10
minutes, 50 °C for 10 minutes, 37 °C for 20 minutes, and 20 °C for 5 minutes.
pSUPER-Neo plasmid (OligoEngine) was used for shRNA cloning. Approximately 5 µg
of plasmid was linearized using BglII (40 U) and HindIII (10 U) for 4 hours at
37 °C. Linearized plasmid was checked for a single band using agarose gel
electrophoresis, and purified using the QiaQuick PCR Purification Kit (Qiagen).
The annealed product was ligated to the linearized plasmid in a reaction with
the following composition: 2 µl of annealed oligos, 1 µl of T4 Ligase buffer, 500
ng of linearized pSUPER plasmid and 1 µl of T4 Ligase enzyme in a 10 µl ligation
reaction.
The ligation reaction was incubated overnight at 16 °C. On the next day,
approximately 5 µl of ligation reaction was added to 25 µl of One Shot TOP10
Competent cells (Invitrogen), and incubated on ice for 30 minutes. Heat shock
was performed at 42 °C for 45 seconds, followed directly by incubation on ice for
2 minutes. SOC medium was added to the mixture, which was incubated in a 37 °C
horizontal shaker for 1 hour. The transformation reaction was plated on an
LB plate with 100 µg/ml ampicillin, and grown overnight at 37 °C. Single
colonies were picked and grown in 3 ml LB/Amp for plasmid isolation using
Qiagen Miniprep (Qiagen). Positive clones were selected upon digestion with
EcoRI (10 U) and HindIII (10 U), showing 280 bp fragments on agarose gel
electrophoresis. Negative clones showed a 220 bp fragment. Positive clones
were stored as glycerol stocks, and amplified for plasmid transfection using the
Endo-Free Plasmid Maxi kit (Omega-Biotek). Approximately 2.5 µg of plasmid DNA
was transfected into hNPCs with Fugene6 Transfection reagent (Promega).
2.13.6 SHRNA VECTOR FOR GTDC1
Pre-designed pLKO.1 vectors harboring a shRNA sequence targeting GTDC1
were obtained commercially (Sigma-Aldrich). The sequences of the duplexes are
shown in Table 6.
Oligo Name Sequences
shGTDC1-1 CCGGCCTGCTGCTTAAAGTCAGATTCTCGAGAATCTGACTTTAAGCAGCAGGTTTTTG
shGTDC1-2 CCGGTGGGTCAGTAAGATGATTGATCTCGAGATCAATCATCTTACTGACCCATTTTTG
Table 6. shRNA sequences for GTDC1. shRNAs were selected based
on commercially available lenti-shRNAs for GTDC1.
Viral particles were produced using the pLKO.1 protocol available online and
concentrated using Amicon ultra centrifugal filter units (Millipore) for 15
minutes at 3,000 rpm and 4 °C. For infection, NPCs derived from H1 human ES
cells were seeded in 6-well plates at about 200,000 cells per well in NPC medium.
Following infection, the medium was replaced with fresh medium.
2.13.7 EDU PROLIFERATION ASSAY
Proliferation was quantified using a 5-ethynyl-2´-deoxyuridine (EdU)
incorporation kit, the Click-iT EdU Alexa Fluor 647 Flow assay (Molecular Probes,
Life Technologies, USA). EdU (20 µM) was added to NPCs, which were incubated for
2 hours at 37 °C in a 5% CO2 incubator and then harvested. The Click-iT
reaction was performed according to the manufacturer’s protocol (Molecular
Probes). The percentages of EdU-incorporated cells were quantified using a BD
LSR II flow cytometry analyser in the Biopolis Shared Facility.
2.13.8 IMMUNOCYTOCHEMISTRY
Cells were plated on ethanol-treated coverslips, fixed with 4%
paraformaldehyde in phosphate buffered saline (PBS) at 4 °C for 1 hour and
incubated with Tris-buffered saline (TBS) containing 5% FBS, 1% BSA (Bovine
Serum Albumin, Sigma-Aldrich) and 0.1% Triton X-100 (Sigma-Aldrich) for
45 minutes at room temperature. Primary antibodies were incubated with fixed
cells overnight at 4 °C, and cells were subsequently stained with secondary
antibody conjugated to Alexa Fluor 594 or 488 (Molecular Probes) (4 µg/ml) for 1
hour at room temperature in the dark. Images were captured using a ZEISS
AxioObservor DI inverted fluorescence microscope (Carl Zeiss International).
The primary antibodies, working dilutions and host species are as follows:
Antibody Cat. No. Company Species Working dilution
Oct4 Sc8628 Santa Cruz Goat 1:250
SSEA1 Sc21702 Santa Cruz Mouse 1:100
Nanog Sc30328 Santa Cruz Goat 1:100
Nestin MAB353 Millipore Mouse 1:500
Nestin Ab22030 Abcam Mouse 1:500
Musashi AB5977 Millipore Rabbit 1:500
Pax6 PRB-278P Covance Rabbit 1:100
Ki67 7B11 Life technologies Mouse 1:100
NeuN MAB377 Millipore Mouse 1:100
TuJ1 MMS-435P Covance Mouse 1:500
Map2 A2320 Millipore Mouse 1:1000
TuJ1 PRB-435P Covance Rabbit 1:2000
Med13l Ab87831 Abcam Rabbit 1:100
Icam1 BBA3 R&D system Mouse 1:100
Table 7. List of primary antibodies used in this study. These antibodies are used to
characterize pluripotency (Oct4, Nanog, and SSEA1), differentiated neural progenitors
(Nestin, Musashi, and Pax6), proliferation capacity (Ki67), neuronal state (NeuN, TuJ1,
and Map2), candidate gene (Med13l), and glycosylation status (Icam-1).
2.13.9 IMMUNOCYTOCHEMISTRY QUANTIFICATION ANALYSIS
To determine the proportion of positive cells in each sample, images were
captured at 40× magnification from at least five randomly selected areas. ImageJ
software was then used to compute the total number of cells (DAPI-stained
nuclei) and the number of cells expressing TuJ1, MAP2 or NESTIN. Nuclei size
was measured in ImageJ from at least 60 nuclei per image. Neurite outgrowth
was characterized by counting the number of neurite extensions per cell body in
at least 10 TuJ1-positive neurons per image. All data were expressed as
the mean ± standard error of the mean from three individual experiments.
Neuronal maturation was measured with the colocalization plugin in ImageJ,
based on threshold quantification of the overlapping signal between TUJ1- and
MAP2-positive neurons. Statistical analysis was carried out using Student’s paired
t-test between control and samples, with a p-value threshold of 0.05 for
significance.
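The summary statistics used for the quantification can be sketched as follows. The percentages are hypothetical, and the sketch stops at the paired t statistic, which would then be compared with a t distribution with n - 1 degrees of freedom (for three experiments, the two-sided 5% critical value is about 4.30).

```python
import math
from statistics import mean, stdev


def mean_sem(values):
    """Mean and standard error of the mean of a list of measurements."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def paired_t_statistic(control, sample):
    """t statistic of a paired t-test on matched control/sample values."""
    differences = [s - c for c, s in zip(control, sample)]
    mean_diff, sem_diff = mean_sem(differences)
    return mean_diff / sem_diff


# Hypothetical percentages of marker-positive cells in three experiments.
control = [40.0, 42.0, 44.0]
knockdown = [30.0, 31.0, 35.0]

control_mean, control_sem = mean_sem(control)
t_stat = paired_t_statistic(control, knockdown)
```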
2.13.10 MICROARRAY
Whole Genome Gene Expression Direct Hybridization Human Ref-12 Expression
BeadChip microarrays (Illumina) were used for genome-wide expression analysis.
For hybridization, cRNAs from duplicate or triplicate samples were synthesized
and labelled using the TotalPrep RNA Amplification Kit (Ambion), following the
manufacturer's instructions. After measuring the quality of RNA, the cRNAs were
hybridized to the BeadChip arrays according to the manufacturer’s protocol
(Illumina). Scanned data from the BeadChip raw files for all samples were
retrieved and background corrected using GenomeStudio software (Illumina), and
subsequent filtering analyses were completed in GeneSpring GX (Agilent).
Replicates were grouped together for the interpretation, and validated using
Principal Component Analysis (PCA) to test the reproducibility between
duplicates. Data were normalized using percentile-shift with a baseline to median
of all samples, and averaged within array. False intensities were filtered based on
the averaged probe set and present flags. Data normalized within arrays were
tested by analysis of variance (ANOVA) with Benjamini–Hochberg correction for
multiple testing, and genes were defined as significantly differentially
expressed when the FDR was <0.05.
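The Benjamini–Hochberg correction can be illustrated with a minimal sketch of the step-up procedure. The analysis in this study was run in GeneSpring GX; the function below is a generic illustration on hypothetical p-values and does not reproduce that software.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical per-gene p-values from the ANOVA.
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
fdr = benjamini_hochberg(p_values)
significant = [i for i, q in enumerate(fdr) if q < 0.05]
```

In this toy example the third and fourth genes have raw p-values below 0.05 but adjusted values just above it, so only the first two genes pass the FDR threshold.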
2.13.11 GENE ENRICHMENT ANALYSIS FOR MICROARRAY
Differentially expressed genes were assessed by Gene Ontology analysis with the
DAVID Functional Annotation tools (http://david.abcc.ncifcrf.gov) and Panther
(http://www.pantherdb.org). Functional classifications based on biological
function and known pathways were used for further analysis. A heatmap of raw fold
changes with hierarchical clustering using Euclidean distance was generated with
the MeV microarray software suite (http://www.tm4.org/mev.html).
2.14 FUNCTIONAL ANALYSIS BY ZEBRAFISH
2.14.1 FISH LINES AND MAINTENANCE
Zebrafish were maintained at 28.5 °C on a 14-hour light/10-hour dark cycle.
Embryos were staged by examination of the head angle and time of birth. All
embryos and fish were raised and cared for using established protocols of the
Research Animal Care, Zebrafish Facility in the Institute of Molecular and Cell
Biology (IMCB). The transgenic HuC:GFP line is maintained by Dr. Vladimir Korzh’s
group at IMCB.
2.14.2 EMBRYO PREPARATION
Embryos were staged until the desired developmental time point and fixed in 4%
paraformaldehyde at 4 °C overnight. The next day, the embryos were washed
several times in PBS at room temperature, and dehydrated through increasing
concentrations of methanol (25%-50%-75%) in PBS. The embryos were stored in
100% MeOH at -20 °C until needed for in situ hybridization.
2.14.3 RNA PROBE SYNTHESIS
The antisense RNA probes were synthesized with DIG RNA Labelling mix solution
(Roche), 1× transcription buffer and RNA polymerases (Roche) for 2-4 hours at
37 °C. For med13b, the cDNA clone was obtained commercially from Open
Biosystems (Clone ID: 7050861). Antisense probe templates were generated with
PCR primers as follows:
Gene Sequences
sox10 F: GGGATTCAGAGCGCGAGCGA
R: CGCGCATTTAGGTGACACTATAGAAGTGACAGGTACTAGCATCATGTG
twist1b F: CCCTCCGTGACGCAGGAGGA
R: CGCGCATTTAGGTGACACTATAGAAGTGTTCGTTGAGTGTGTGTGTTT
islet1 F: TCCAGGCTCAAACTCCAC
R: GGATCCATTAACCCTCACTAAAGGGAATGTCCGACCGTTTACTTACAG
Table 8. PCR primer sequences to synthesize RNA in situ probes. PCR primers were
designed to include a T7 or SP6 promoter and to hybridize to the Expressed
Sequence Tag library clones. The resulting PCR products contain the respective
promoter (T7 or SP6) required for the mRNA synthesis.
2.14.4 WHOLE MOUNT IN SITU HYBRIDIZATION
Whole mount in situ hybridization (WISH) of zebrafish embryos was carried out
as previously described by Thisse et al.(148). Briefly, embryos were rehydrated
with PBS from concentrated MeOH, and digested with Proteinase K according to
embryo stage. The embryos were then fixed in 4% PFA and washed several times
with PBS, before pre-hybridization at 70 °C in a shaking water bath. Hybridization
with 300 ng of antisense probe was performed overnight at 70 °C. The next day,
embryos were washed in salt solutions (decreasing 2× SSC concentration) at
70 °C, and blocked for 1 hour at room temperature in a blocking solution (5% lamb
serum, 2 mg/ml BSA, 1% dimethyl sulfoxide in 0.1% PBS-Tween). Anti-DIG
alkaline phosphatase antibody (Roche) at a dilution of 1:2000 was added to the
blocking solution and incubated overnight at 4 °C. The next day, the alkaline
phosphatase staining solution, NBT-BCIP, was added to the alkaline Tris buffer
containing 100 mM Tris-HCl pH 9.5, 50 mM MgCl2, 100 mM NaCl and 0.1%
Tween 20, and used for staining the washed embryos for 2-3 hours at room
temperature. Embryos were fixed in 4% PFA once the desired staining intensity
was reached, and mounted in 50% glycerol:PBS for capturing images on Leica
microscopes.
2.14.5 MORPHOLINO MICROINJECTION
Antisense morpholino oligonucleotides (MOs) were purchased from GeneTools.
Two MOs against med13b were designed
by the manufacturer to target the start codon (med13b-1-Start:
ATGGCAGAGCCCCTCGTTTGTTAGA) and the splice site between exons 14 and
15 (med13b-2-Splice: AGAACTCATCATCAAAGCGCAGTCC). Subsequently,
4 ng of each MO was injected into the yolk of one- to two-cell stage embryos from
the wild-type AB strain. Injected embryos were then incubated at 28.5 °C until
reaching the appropriate developmental stage. The embryos were then fixed in 4%
paraformaldehyde (PFA) overnight at 4 °C and stored in 100% MeOH at -20 °C
overnight.
2.14.6 HUMAN MED13L MRNA SYNTHESIS
Full-length human MED13L cDNA was amplified using primers Hmed13l-F
5’-AGGATCATGACTGCGGCAGC-3’ and Hmed13l-R 5’-TCGCGGAGGATCATGACT-3’
from cDNA of the HapMap lymphoblastoid cell line GM12878. Full-length MED13L
cDNA was subcloned into the TOPOII Dual Promoter vector using the Topo TA
Cloning Kit (Invitrogen, Life Technologies) for capped mRNA synthesis using the
mMESSAGE Machine T3 Kit (Ambion, Life Technologies) according to the
manufacturer’s instructions. Capped mRNA was purified using the RNeasy mini kit
(Qiagen) and co-injected together with MOs for rescue experiments.
2.14.7 ALKALINE PHOSPHATASE STAINING
Embryos were fixed at 72 hpf in formalin (3.7% formaldehyde:PBS) at room
temperature for 30 minutes. Fixed embryos were then washed in a methanol
dehydration series (50%-70%-100%) for 5 minutes and in acetone for 30 minutes.
After two 5-minute washes with 0.1% PBS-Tween, embryos were
transferred to NTMT buffer (0.1 M Tris pH 9.5, 0.05 M MgCl2, 0.02 M NaCl, 1%
Tween-20 in ddH2O) three times for 5 minutes. Embryos were then incubated in
staining solution containing NTMT and 1:50 NBT/BCIP solution (Roche) for
a maximum of 30 minutes. After the staining was completed, embryos were fixed in
formalin and kept in 50% glycerol:PBS for capturing images using a Leica
microscope.
2.14.8 ALCIAN BLUE STAINING
Embryos were fixed at 5 dpf with 4% PFA for 2 hours at room temperature. After
fixation, embryos were dehydrated in 50% ethanol for 10 minutes at room
temperature. Embryos were then transferred to Alcian Blue staining solution
(0.4% Alcian Blue (w/v) in 70% ethanol) and rocked at room temperature overnight.
To remove the pigmentation in older embryos, embryos were bleached the next
day in 3% H2O2 and 2% KOH for 20 minutes. To clear the tissue, embryos were
soaked in 20% glycerol/0.25% KOH for 30 minutes, and prepared for imaging in
the same solution.
2.14.9 IMAGE QUANTIFICATION ANALYSIS
Images of up to 5 individual embryos per group were captured for
uninjected siblings, control morpholino-injected embryos, morphants and rescued
embryos, and eye size and body length were quantified using ImageJ.
The areas and lengths were tested for statistical significance by Student’s
paired t-test, with a p-value threshold of 0.05 for significance.
2.14.10 QPCR
Total RNA was extracted from 1-72 hpf zebrafish embryos using the RNeasy kit
(Qiagen). Reverse transcription of 2 µg RNA was performed in 20 µl of
SuperScript III Reverse Transcriptase reaction buffer (Life Technologies) using
random hexamer primers. qPCR reactions were performed in the ViiA7 Real
Time PCR system (Life Technologies) with a ten-fold dilution of cDNA and 200 nM
of each primer using the SybrGreen PCR Master Mix (Life Technologies). Data
were analyzed using the 2^-∆∆Ct method, and normalized against zf-18s ribosomal
RNA or zf-actin. Each measurement was performed in duplicate. The controls used
in this study were the uninjected clutchmates or control MO-injected embryos. The
primers used for zebrafish transcripts are listed in Table 9.
Primer name Primer sequence
zf_med13b F1 ACCTGTTCAAAGACTGCAACTTC
zf_med13b R1 ACATGTTGTCCATGAACTGTCTG
zf_sp8a F1 ACTCCATTGGCCATGTTAGC
zf_sp8a R1 CCGTTAAGGAGAACGAGCTG
zf_sp8b F1 CATCCCTACGAGTCGTGGTT
zf_sp8b R1 GGCAGAGTGGCTTAAACTCG
zf_fgfr3 F1 CGGTGGAGATGGAGAGAGAG
zf_fgfr3 R1 GGGGGAATTTGGACAGTTTT
zf_18s rRNA F1 CGCCACTTGTCCCTCTAAGAA
zf_18s rRNA R1 GTAGTTCCGACCGTAAACGAT
zf_actin F1 TACAGCTTCACCACCACAGC
zf_actin R1 AAGGAAGGCTGGAAGAGAGC
Table 9. List of zebrafish qPCR primers for validation of microarray fold change.
med13b primer was used to assess MO knockdown efficiency and developmental stages
expression level. Sp8a and fgfr3 were top-ranked genes with significant upregulation in
MED13L-deficient neurons based on the microarray. 18s and actin specific to zebrafish
were used to normalize the qPCR expression data.
CHAPTER 3: RESULTS
DISCOVERY OF CANDIDATE GENES FOR NDDS
3.1 STUDY BACKGROUND
Chromosomal rearrangements such as translocations, inversions, duplications or
deletions occur in the germline at a rate of 1 in every 2,000 live births, and about
15% of NDD-associated phenotypes are attributed to cytogenetically visible
abnormalities. Balanced and unbalanced rearrangements may disrupt genes,
affect regulatory elements, alter copy number and give rise to position
effects on the neighboring chromosomal regions.(149) Chromosome abnormalities
have provided a useful means for disease gene identification in clinically
well-defined genetic disorders (as described in Chapter 1). However, the majority
of NDD cases (60-80%) are idiopathic and cannot be classified into specific
syndromes. These are predominantly isolated cognitive impairments such as
learning delay or intellectual disability, often associated with multiple
congenital anomalies. These general clinical features present challenges in
diagnosis, counselling and management.
Karyotyping and chromosomal microarrays are currently the first-tier diagnostic
tools applied for investigation of chromosomal abnormalities in individuals with
NDDs.(62) Furthermore, advances in NGS technologies have enabled more rapid
detection of SVs at very high resolution, which has led to a flood of new
variants associated with NDDs in the past few years. DNA-PET sequencing, an
NGS-based technology established in our institute, uses long inserts of 7-11 kb
(rather than the regular 1.5-2 kb) of genomic DNA, resulting in
high physical coverage for the identification of genomic SVs. By combining a
classical approach of disease gene identification based on chromosomal
rearrangements with NGS technology, the first part of this study aimed to
delineate novel candidate genes associated with NDDs by mapping the
chromosomal breakpoint regions in seven patients carrying cytogenetically visible
chromosomal rearrangements and presenting with cognitive impairments (DD, SD,
ID) with or without multiple congenital anomalies.
3.2. CHARACTERIZATION OF SVS BY DNA-PET
Genomic DNA libraries from seven patients and the parents of one patient (n = 9)
were generated and sequenced using the SOLiD sequencing platform. About 60
million non-redundant PET reads were obtained on average per patient, resulting
in a relatively high physical coverage (mean = 127 from 9 datasets) (Table 10).
Sample     PET a)       Insert size range (bp)   cPET (NR) b)   Coverage c)   dPET (NR) d)
CD5        42,680,822   6,070-15,900             15,377,825     58.1          3,291,424
CD10       44,421,203   7,100-9,230              32,994,073     93.5          5,781,459
CD8        57,591,305   7,470-14,880             27,552,231     103.0         3,177,920
CD9        61,252,841   5,130-10,350             40,545,617     122.2         13,659,051
CD14       87,508,075   8,650-11,110             41,231,642     143.0         3,871,310
CD23       70,741,827   7,580-10,090             48,798,329     151           3,530,694
CD6        55,695,575   8,110-10,900             47,991,119     161.5         2,466,666
CD15 *)    82,692,289   8,560-12,100             59,076,344     215.3         3,650,581
CD16 *)    45,724,748   5,860-7,870              38,584,959     93            3,285,306
Table 10. Summary of DNA-PET post-sequencing analysis
a) Number of tags which have been mapped and paired to form PETs; b) Non-redundant
concordant PETs; c) Physical coverage of cPETs determined by the average number of cPET
connections crossing a chromosomal position; d) Non-redundant discordant PETs; *) Unaffected
parents of patient CD6
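The physical coverage values in Table 10 can be roughly approximated from the number of concordant PETs and the insert size. The sketch below is only an illustration: it assumes a genome size of about 3.0 Gb and uses the midpoint of the reported insert size range as a stand-in for the mean insert size, neither of which is stated in the table. The function name is hypothetical.

```python
# Rough physical-coverage estimate:
#   coverage ~ (number of cPETs x mean insert size) / genome size
# ASSUMPTIONS (not from the thesis): genome size of ~3.0e9 bp, and the
# midpoint of the insert size range used in place of the true mean.
GENOME_SIZE_BP = 3.0e9


def estimate_physical_coverage(n_cpet, insert_min_bp, insert_max_bp,
                               genome_size_bp=GENOME_SIZE_BP):
    """Approximate the mean number of cPET connections crossing a position."""
    mean_insert_bp = (insert_min_bp + insert_max_bp) / 2
    return n_cpet * mean_insert_bp / genome_size_bp


# cPET counts and insert size ranges for CD5 and CD10, taken from Table 10
cd5_estimate = estimate_physical_coverage(15_377_825, 6_070, 15_900)
cd10_estimate = estimate_physical_coverage(32_994_073, 7_100, 9_230)

print(f"CD5:  estimated {cd5_estimate:.1f}, reported 58.1")
print(f"CD10: estimated {cd10_estimate:.1f}, reported 93.5")
```

Under these assumptions the estimates fall within a few percent of the reported values; the residual difference is expected, since the reported coverage was computed from the actual cPET connections rather than from a midpoint.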
After filtering, the SV calls generated for each library were compared with SVs
reported in healthy individuals according to the DGV, the 1000 Genomes Project
SVs Pilot Release set, and previously published paired-end sequencing studies
of control individuals (Chapter 2.7); approximately 96% of the calls overlapped
with these ‘normal’ SVs. After their removal, the number of patient-specific SVs
was reduced to 7-27 events per patient (Table 11).
Sample   Karyotype             dPET    SVs   SVs(-)   fSVs   D    BT   IB   I   UI   TD   In
CD5      t(9;17)(q12;q24)      505     284   270      14     12   2    0    0   0    0    0
CD10     t(6;8)(q16.2;p11.2)   927     616   601      15     9    2    1    0   0    1    2
CD8      inv(X)(p22;q26)       724     387   368      19     9    0    0    1   1    7    0
CD9      inv(5)(q22;q35.1)     390     581   574      7      2    0    0    0   0    3    0
CD14     Del7qter              1,053   618   596      22     20   0    0    0   0    0    2
CD23     t(12;19)(q12;q24)     999     641   614      27     19   2    0    0   4    2    0
CD6      t(2;8)(q23;p21)       1,030   635   618      17     9    2    0    3   3    3    0
CD15     46,XX                 913     558   543      15     9    0    1    0   1    4    0
CD16     46,XY                 1,003   682   654      28     24   0    1    0   0    3    0
Table 11. Summary of DNA-PET findings to identify SVs in nine individuals
Abbreviations: SVs(-), SVs present in normal individuals; fSVs, final list of SVs per individual
after filtering; D, deletion; BT, balanced translocation; IB, isolated breakpoints; I, Inversion; UI,
Unpaired Inversion; TD, Tandem Duplication; In, Insertion.
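The relationship between the SVs, SVs(-) and fSVs columns of Table 11 can be checked with a few lines of arithmetic. The following is a minimal sketch using only the counts in the table; the variable names are illustrative and are not part of the DNA-PET pipeline.

```python
# (total SV calls, SVs also present in normal individuals) per library,
# copied from the "SVs" and "SVs(-)" columns of Table 11.
sv_counts = {
    "CD5": (284, 270),
    "CD10": (616, 601),
    "CD8": (387, 368),
    "CD9": (581, 574),
    "CD14": (618, 596),
    "CD23": (641, 614),
    "CD6": (635, 618),
    "CD15": (558, 543),
    "CD16": (682, 654),
}

# Individual-specific SVs (fSVs) are what remains after removing 'normal' SVs.
specific_svs = {sample: total - normal
                for sample, (total, normal) in sv_counts.items()}

# Overall fraction of SV calls that were also seen in normal individuals.
total_calls = sum(total for total, _ in sv_counts.values())
normal_calls = sum(normal for _, normal in sv_counts.values())
overall_fraction = normal_calls / total_calls

for sample, (total, normal) in sv_counts.items():
    print(f"{sample}: {normal / total:.1%} in controls, "
          f"{specific_svs[sample]} remaining")
print(f"Overall: {overall_fraction:.1%}")
```

The differences reproduce the fSVs column, and the pooled fraction is close to the approximately 96% quoted in the text.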
Sanger sequencing was used to determine precise breakpoints from the
SOLiD-predicted breakpoints. Of 40 SVs randomly selected from 94 SVs, 31 were
confirmed by PCR, giving a validation rate of approximately 77%. This is
consistent with previous studies employing DNA-PET to investigate cancer
genomes, which reported validation rates in the range of 69-80%.(134, 136, 137)
In parallel with DNA-PET sequencing, array CGH was performed to compare the
deletions/duplications detected by both techniques as a systematic validation.
On average, the concordance rate between aCGH and DNA-PET for copy number
detection was about 93%, indicating high overlap between the two approaches.
Based on this, SVs detected by both techniques were considered validated. As in
other NGS-based studies (150), aCGH was used to systematically validate SV
findings from the NGS data, and any discordance between the two techniques was
considered a false positive.
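Because the validation rate above is based on only 40 tested SVs, it carries sampling uncertainty. The sketch below computes the observed rate and a 95% Wilson score interval; the choice of the Wilson interval is an assumption made here for illustration and is not a method described in the thesis.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p_hat = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p_hat + z ** 2 / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z ** 2 / (4 * trials ** 2)
    ) / denom
    return center - half_width, center + half_width


# Counts from the text: 31 of 40 randomly selected SVs confirmed by PCR.
validated, tested = 31, 40
rate = validated / tested
low, high = wilson_interval(validated, tested)

print(f"Validation rate: {rate:.1%}")
print(f"Approximate 95% interval: {low:.1%} to {high:.1%}")
```

Under this assumption, the interval comfortably contains the 69-80% range reported in earlier DNA-PET studies.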
3.3 BREAKPOINT CHARACTERIZATION THROUGH
DETAILED SVS ANALYSIS
Each patient was analyzed individually to identify the potential candidate genes
disrupted by cytogenetically visible rearrangements, with the assumption that
these were the most likely causative events. DNA-PET predicted breakpoints
matching the known chromosomal rearrangements were validated by PCR
followed by Sanger sequencing and FISH. Transcript expressions of candidate
genes within the chromosomal breakpoints or other potential SVs were evaluated
by qPCR.
PATIENT CD5
In the first patient (Patient CD5), the monoallelic t(9;17) translocation was
transmitted to his two sons (CD21 and CD22) who displayed different levels of
cognitive impairment including DD, absence of speech, and autistic features
(Figure 7). aCGH analyses were performed for all three affected subjects, and did
not reveal any additional copy number changes in the translocation carriers.
Through DNA-PET SV analysis, two abnormally oriented clusters corresponding
to the balanced translocation between chromosomes 9 and 17 were identified
(Table S1). The breakpoint coordinates predicted by DNA-PET were used to
validate the region by PCR amplification and Sanger sequencing. The breakpoints
were validated at positions chr9:79,572,001-79,572,005 and
chr17:74,764,927-74,767,964. Sanger sequencing in the two affected sons revealed
identical breakpoints in all three translocation carriers (Figure 8).
Figure 7. Pedigree of Patient CD5 family.
Familial translocation in this family is transmitted
from the father (CD5) to his two sons (CD21 and
CD22).
Figure 8. Validation of DNA-PET breakpoints by Sanger sequencing in Patient
CD5. PCR validation followed by Sanger sequencing showed identical breakpoints in
three translocation carriers; father (CD5), and two sons (CD21 and CD22). The resulting
fusion genes predicted at the genomic level are indicated.
There is a loss of 4 bp and 3,036 bp on chromosome 9 and chromosome 17,
respectively, with a microhomology of 3 bp between paired breakpoints. The
translocation disrupted GNAQ at intron 5 on chromosome 9 and RBFOX3 at
intron 2 on chromosome 17.
GNAQ encodes a member of the Gαq heterotrimeric protein family, and Gnaq-null
mice exhibit cerebellar ataxia and motor coordination deficits (151).
Heterozygous Gnaq mice were viable and did not show any obvious phenotypic
defect for more than 20 months (151). RBFOX3 is a neuron-specific splicing
factor that is exclusively expressed in neuronal nuclei. The two disrupted
genes share the same orientation, the breakpoints lie within introns, and the
resulting fusion is predicted to be in frame (Figure 8). However, RT-PCR
analysis of the patient’s EBV-LCL did not reveal any fusion transcript, as
neither gene is expressed in these cells. In a human tissue panel, the mRNA of
both genes was highly expressed in the fetal brain and the cerebellum (Figure
9).
Figure 9. qPCR analysis of GNAQ and RBFOX3 in human tissue panels. Both genes
are highly expressed in brain tissues such as fetal brain, adult brain and
cerebellum.
Nine out of twelve SVs were validated by PCR, of which five were shared with
his two affected sons (SV1, 2, 4, 6 and 13). These five SVs overlapped with
known CNV regions (SV1, 2, 4) or were in non-coding regions (SV6, 13) and
were therefore excluded from further analysis (Table S1). Overall, the
translocation and the genes it disrupts may have a functional impact,
since the translocation completely segregated with the language and
developmental delay in this family. To relate the predicted fusion genes
to the observed phenotype, neurons derived from induced pluripotent stem cells
of these patients could be used to test whether the fusion transcripts are
expressed and functional in the relevant cell type.
PATIENT CD10
In patient CD10, the impact of the t(6;8) balanced translocation was unclear, due
to variable penetrance in the translocation carriers; his mother was asymptomatic
and his younger sister had schizencephaly and DD (Figure 10).
Figure 10. Pedigree of Patient CD10 family.
This case showed incomplete segregation of the
phenotype among the translocation carriers: a
healthy mother and two affected children.
DNA-PET identified abnormal paired reads corresponding to the balanced
translocation between chromosomes 6 and 8 (Table S2). Sanger sequencing
refined the breakpoint coordinates on chromosome 6 at position 98,318,526-
98,318,538 and chromosome 8 at position 35,527,969-35,527,976, with
microhomologies of 3 bp between paired breakpoints (Figure 11).
Figure 11. Validation of DNA-PET breakpoints by Sanger sequencing.
Sequencing of breakpoint junctions by using primers designed at DNA-PET breakpoint
coordinates revealed a loss of 11 bp on chromosome 6 and a loss of 8 bp on chromosome
8. There is a 3 bp (CCA) microhomology between paired breakpoints.
The translocation disrupted UNC5D at intron 5 on chromosome 8, while no
coding genes were found at the chromosome 6 breakpoint. UNC5D encodes a
member of the human dependence receptor UNC5 family and is specifically
expressed in layer 4 of the developing neocortex in rats, which makes it an
interesting candidate (152, 153). Since his healthy mother carried the same
translocation, all coding exons of UNC5D were sequenced in the translocation
carriers to identify a potential secondary mutation affecting the other allele
of UNC5D. No additional point mutations were found in this gene, suggesting
that the UNC5D disruption was heterozygous. UNC5D is highly expressed in the CNS
(adult brain, fetal brain and cerebellum), weakly expressed in the kidney, and
absent in other human tissues (Figure 12).
UNC5D mRNA is not endogenously expressed in the patient’s EBV-LCL; therefore,
differences in transcript level could not be investigated. Further
analysis of other SVs showed 5 deletions in non-coding regions (SV1, 2, 11, 12,
and 14) and 8 within genes (SV3, 4, 5, 6, 9, 10, 13 and 15), all of which
overlapped with CNVs seen in normal individuals based on the DGV (Table S2).
PATIENT CD8
Patient CD8 carried a pericentric inversion at chromosome X inherited from his
mother, between Xp22 and Xq26. His monozygotic twin brother harbors the same
inversion, and they both show signs of immunodeficiency combined with mild ID
and language delay. DNA-PET sequencing analysis revealed a 1.2 Mb inversion
located at chromosome Xq24-25 between paired breakpoints (Table S3), which
did not correspond to the large inversion seen in the karyotype.
Sanger sequencing refined the breakpoint positions to
chrX:119,984,613-119,984,615 and chrX:122,839,398-122,844,862, with a loss of
1 bp and 5,463 bp at the respective breakpoint junctions. Due to the
discrepancies between the karyogram and the NGS data, other SVs on chromosome X
were extracted from the raw dataset with a cluster size cutoff ≥4 (Table 12).
Figure 12. qPCR analysis of UNC5D in human tissue panel.
Transcript expression of UNC5D in human tissue panel showed high
expression in adult brain, fetal brain and cerebellum.
SV #   SV Coordinate on chrX (hg18)   Cluster Size   Span         Gene
1      120,059,370-120,309,394        30             250,024
2      34,649,915-125,672,128         49             91,022,213
3      119,984,597-122,839,027        26             2,854,430    XIAP
4      119,984,844-122,845,538        29             2,860,694    XIAP
5      123,107,520-34,569,874         23             88,554,66    TMEM47
6      125,658,649-123,127,427        30             2,555,127    ODZ1,SH2D1A
7      125,828,479-125,110,988        26             737,420      CXorf64
8      34,558,550-34,567,696          5              9,146        TMEM47
9      34,560,395-123,109,280         5              88,548,885   TMEM47
10     61,754,201-61,771,832          4              28,847
Table 12. List of SVs found on chromosome X to analyze complex rearrangements.
DNA-PET data were re-analyzed with a modified filtering parameter to include SVs with
cluster size less than 4 (lower confidence clusters). Three additional SVs were identified
from low-confidence clusters, resulting in 10 SVs found on chromosome X.
These SVs clustered into breakpoint hotspots on the p arm (SV2, 5, 8, 9), the
centromeric region (SV10) and the q arm (SV1, 3, 4, 6, 7), suggesting a complex
chromosomal rearrangement mechanism. The breakpoints were then validated by
PCR and FISH in all inversion carriers (mother and twin boys) (Table S4).
Probes used for FISH were focused on the main hotspots on Xp21 for the p-arm,
Xq11.1 for the centromeric region, and Xq24-25 for the q-arm. The probes on
Xq24 (RP1-315G1) and Xq25 (RP1-296G17) showed a fusion localized at the
centromere of der(X), while the probe on Xp21 (RP11-330K13) showed a split
signal and fused to the Xq25 probe (W12-499N23) (Figure 13A). The centromeric
probe on Xq11 (RP11-762M23) was localized on the longer chromosomal arm,
suggesting a highly rearranged chromosome. These data suggested two large
sequential breaks in each arm of chromosome X (Xp21 and Xq24-25) that
resulted in tandem duplications, a deletion and isolated breakpoints (Figure
13B). This double inversion mechanism rearranged the architecture of
chromosome X and appeared as a large inversion on the G-banding karyotype
(Figure 13B).
Figure 13. FISH validation of DNA-PET predicted breakpoints. A) Mislocalized
signals of probes located in the breakpoint hotspots were indicative of complex
rearrangements. B) Proposed mechanism of a double inversion occurring sequentially, first
between Xp21 and Xq25, and second between Xq11 and Xq25, based on the FISH
analysis.
This complex rearrangement disrupted two genes at the inversion breakpoints:
XIAP with a breakpoint in intron 1 and TMEM47 with a breakpoint in the second
intron. XIAP encodes X-linked inhibitor of apoptosis, and mutations in this gene
have been associated with X-linked lymphoproliferative syndrome 2, XLP-2
(MIM 300635). TMEM47 encodes a member of PMP22/EMP/Claudin protein
family with unknown function that is ubiquitously expressed in human tissues
including adult and fetal brain (Figure 14).
Figure 14. qPCR analysis of
TMEM47 in human tissue panel.
TMEM47 is ubiquitously expressed
in human tissue panel.
Two other genes were affected by tandem duplications as part of the complex
rearrangement (SV6, 7). The first, SH2D1A, encodes the SH2 domain containing 1A
protein (SAP), which has been associated with X-linked lymphoproliferative
syndrome type 1, XLP-1 (MIM: 308240). This gene was initially suspected as the
primary cause of the immunodeficiency phenotype seen in the patient, but SAP
testing in the patient’s blood lysates showed normal protein expression, so
XLP-1 was ruled out. The second, ODZ1 (also known as TENM1), encodes Odd
Oz/Ten-M homolog 1, the homolog of a Drosophila pair-rule gene involved in
post-segmentation processes, including embryonic development of the central
nervous system (CNS), eyes, and limbs in Drosophila.(154, 155) qPCR analysis of
the four genes revealed a total absence of expression for XIAP and TMEM47
(Figure 15) and moderately variable expression for SH2D1A and ODZ1/TENM1 in the
patient’s fibroblast cell line (Figure 16) compared to two controls, supporting
the absence of functional XIAP and TMEM47 in the male patient. Considering the
variable pattern of expression of SH2D1A and the absence of functional XIAP,
these data suggest that XLP-2 underlies the immunodeficiency symptoms of the
affected twin boys. However, further functional studies should be pursued to
fully clarify the role of each of the disrupted genes in this phenotype. Other
autosomal SVs identified in patient CD8 were located in intergenic regions
(SV2, 3, 4, 5, 6, 7, 8, 9, 11, 12) or in intronic regions (SV1 and 10) (Table
S3).
Figure 15. qPCR analysis of XIAP and TMEM47 in patient cells. XIAP and TMEM47
transcript expression was almost completely absent in the patient’s cells (dashed bar)
compared to controls. The XIAP inversion breakpoint is located at intron 1, and the TMEM47
unpaired inversion breakpoint is located at intron 2.
Figure 16. qPCR analysis of SH2D1A and ODZ1 in patient’s cells. SH2D1A and
ODZ1 showed moderate to normal transcript expression in patient’s cells compared to
two control individuals.
PATIENT CD9
In patient CD9, sequencing analysis identified two paired-inversion clusters
corresponding to the large inversion on chromosome 5. This inversion spans
53 Mb between chromosome 5q22.2 (chr5:111,966,591-111,963,149) and 5q34
(chr5:165,575,819-165,576,628), with a 1 bp deletion and a microhomology of 3
bp between the paired breakpoints (Table S5). No gene was disrupted at the
inversion breakpoints, and no evidence of regulatory elements at either
breakpoint was found in the RegulomeDB and ENCODE databases (Table S5).
Additionally, 5 other patient-specific SVs were identified in this patient: two
were located in intergenic regions (SV5 and 6); two were tandem duplications
overlapping with known CNVs (SV4 and 7); and one was a 4 kb intronic deletion
(SV1) in the RNF19B gene, which is not listed in DGV and could be an
interesting candidate for follow-up. Based on this analysis, it seems
unlikely that the large inversion on chromosome 5 is causative for the phenotype.
Other patient-specific SVs (Table S5), mutations that cannot be identified by
DNA-PET such as point mutations, or environmental exposures might
explain the phenotype.
PATIENT CD14
Patient CD14 has a heterozygous telomeric deletion on 7q, which was initially
identified by subtelomeric FISH. This telomeric deletion was not captured in
the list of patient-specific SVs (Table S6), which highlights a limitation of
DNA-PET: it requires mappable sequences on both reads of a pair for SV
annotation. However, by using the copy number information derived from
DNA-PET sequencing read depth, the deletion was detected at the telomeric
region of chromosome 7q36.3 and then confirmed by aCGH (Figure 17).
Figure 17. Telomeric deletion detected by cPET reads and aCGH. In the upper panel
(cPET copy number browser), the deletion appears as a steep decline, indicated by the
black arrow, corresponding to lower copy number. The lower panel shows an
aCGH browser snapshot in which the deletion appears as a green-only region.
The deletion disrupted NCAPG2 at exon 14. NCAPG2 encodes a subunit of the
Condensin II complex, which plays a role in compacting chromosomes in
preparation for mitosis (156). Four other genes, ESYT2, WDR60, LINC00689 and
VIPR2, were deleted in the distal telomeric region. Duplications of VIPR2 have
been shown to confer risk of schizophrenia (157), and a subtelomeric deletion
spanning NCAPG2, WDR60, ESYT2, and VIPR2 has been reported in a patient with
idiopathic DD (158). However, in this case, the telomeric deletion was
inherited from an asymptomatic father. Hence, other factors might explain the
complex clinical phenotype in this patient.
From the list of patient-specific SVs (Table S6), there is a 37 kb deletion inside
the Microcephalin 1 (MCPH1) gene, which was too small to be detected by
G-banding karyotyping. This deletion was observed by aCGH and further
validated by Sanger sequencing, confirming a deletion from exon 6 to exon 9 of
this gene. MCPH1 is an interesting candidate gene, as it has been reported to be
causative for Autosomal Recessive Primary Microcephaly (MCPH MIM:
251200). However, upon parental testing, this deletion was found to be inherited
from his asymptomatic mother. To test whether the second copy of this gene was
also affected, the coding regions of MCPH1 on the paternal allele were sequenced
by the Sanger method, revealing 3 non-synonymous SNVs: two in exon 8 and one in
exon 14. All of them were present in dbSNP and classified as non-pathogenic,
confirming that the deletion is heterozygous in this patient.
The mRNA expression of both MCPH1 and NCAPG2 in the patient’s
lymphoblastoid cells was assessed by qPCR (Figure 18). NCAPG2 expression
was reduced in the patient’s cell line with primers at exon 15-16, downstream of
the telomeric deletion, and was comparable to the normal level with primers at
exon 10-11. MCPH1 expression was variably reduced (by 30-60%) in the patient’s
cell line with primers at exon 1-2 and exon 8-9, compared to two controls.
Figure 18. qPCR analysis of NCAPG2 and MCPH1 in patient’s cells. mRNA
expression of NCAPG2 in the patient’s cells is normal at exon 10-11 (upstream of the
telomeric deletion) and reduced at exon 15-16 (downstream of the telomeric deletion)
compared to controls. The NCAPG2 telomeric deletion is predicted at exon 14
based on the aCGH and cPET windows. mRNA expression of MCPH1 is variably reduced
(by 30-60%) in the patient’s cells with primers at either exon position. The MCPH1
deletion spans exons 6 to 9.
Further studies are warranted to decipher the molecular mechanism regulating
transcription of the affected genes.
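The percentage reductions quoted for the qPCR experiments above are relative expression values. As an illustration only, the sketch below uses the widely used 2^-ΔΔCt calculation, which is assumed here and is not confirmed by this section as the method applied in the thesis. The function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_patient, ct_reference_patient,
                        ct_target_control, ct_reference_control):
    """Fold change of a target gene in patient vs control by 2^-ddCt.

    ASSUMPTION: the 2^-ddCt method with a single reference gene and
    ~100% primer efficiency; not taken from the thesis.
    """
    delta_ct_patient = ct_target_patient - ct_reference_patient
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_patient - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies one cycle later in the
# patient than in the control, with identical reference-gene Ct values.
fold_change = relative_expression(
    ct_target_patient=26.0, ct_reference_patient=18.0,
    ct_target_control=25.0, ct_reference_control=18.0,
)
reduction_percent = (1 - fold_change) * 100

print(f"Fold change: {fold_change:.2f}")
print(f"Reduction: {reduction_percent:.0f}%")
```

With these made-up values, a one-cycle delay corresponds to half the control expression, which is the level expected if one of two alleles contributes no transcript.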
PATIENT CD23
Patient CD23 carried a de novo monoallelic translocation t(12;19) identified by
G-banding karyotyping. She had Pierre-Robin sequence (severe micrognathia),
dysmorphic features and intellectual disability. Two abnormally oriented paired
clusters corresponded to the balanced translocation between chromosomes 12 and 19,
disrupting MED13L at intron 5 on chromosome 12 and no coding genes at the
chromosome 19 breakpoint (Table S7).
Figure 19. Validation of DNA-PET breakpoints in Patient CD23 by Sanger
sequencing. The illustration of translocated chromosomes between chr12 and 19 and
Sanger sequencing validation chromatogram showing 1 bp microhomology on
chromosome 12 breakpoint and 2 bp insertion on chromosome 19 breakpoint.
Sanger sequencing confirmed the breakpoint junction at intron 4 of
MED13L, with a 1-bp microhomology in derivative chromosome 12 and a 2-bp
insertion in derivative chromosome 19 (Figure 19).
To check whether the MED13L disruption is heterozygous, the coding exons of
MED13L were sequenced; no other mutations were found on the non-translocated
chromosome. MED13L encodes Mediator complex subunit 13-like, one of the
subunits of the CDK8 module of the Mediator complex, a module that
functions as a transcriptional repressor. MED13L mRNA expression was assessed
by qPCR in the patient’s cell lines compared to two controls, using primers at
different exon positions (Figure 20), with inconclusive results: although a
consistent decrease in mRNA expression was observed in the patient’s cells,
further studies are needed to confirm the molecular consequences
of the translocated allele.
Figure 20. qPCR analysis of MED13L in patient’s lymphoblastoid cell lines.
MED13L expression is reduced by approximately 50-60% in the patient’s cells compared to
two controls. Primers were designed at different exons within MED13L to predict the
molecular consequences. The breakpoint is located at intron 5.
The mRNA expression of MED13L across human tissue panel showed ubiquitous
expression, with the highest expression observed in the cerebellum, fetal brain
and skeletal muscle (Figure 21).
Figure 21. qPCR analysis of MED13L in human tissue panel. MED13L is
ubiquitously expressed in human tissues and its highest expression is in the cerebellum.
There are 25 other SVs found in this patient, of which 7 deletions spanning
coding genes have known variants reported in DGV, and 16 other SVs were
located in intergenic regions of unknown function (Table S7). Previous case
studies reporting SVs within MED13L described clinical phenotypes nearly
identical to that of our patient. This evidence, together with our analysis,
suggests that the MED13L translocation might be functionally relevant.
Further functional studies on MED13L are described in Chapter 4.
PATIENT CD6
Trio sequencing was performed for patient CD6, since parental DNA was
available. This allowed filtering of non-pathogenic germline SVs and
determination of the paternal/maternal SV inheritance rates. This case served as
an example of the complete delineation of a patient’s de novo SVs. Before
filtration, 650 SVs were detected in the patient, of which 616 were detected
in at least one of the parents, indicating that these were germline variants
(Figure 22).
Figure 22. Venn diagram showing total SVs found in each individual after trio
sequencing; 345 SVs were shared between three of them, and 34 were found to be de
novo. The number in parenthesis indicated the final number of SVs after removing
sequencing artifacts.
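The trio filtering logic described above can be sketched with set operations: an SV in the child that is also present in either parent is treated as inherited, and the remainder are de novo candidates. This is a simplified illustration under the assumption that SVs can be matched by an exact identifier; in practice, matching would need a tolerance around the breakpoint coordinates. All SV identifiers below are hypothetical.

```python
def split_by_inheritance(child_svs, father_svs, mother_svs):
    """Split a child's SVs into inherited SVs and de novo candidates."""
    inherited = child_svs & (father_svs | mother_svs)
    de_novo_candidates = child_svs - inherited
    return inherited, de_novo_candidates


# Hypothetical SV identifiers (type:location), for illustration only.
child = {"del:chr1:A", "dup:chr5:B", "tra:chr2-chr8:C", "del:chr7:D"}
father = {"del:chr1:A", "inv:chr3:E"}
mother = {"dup:chr5:B"}

inherited, de_novo_candidates = split_by_inheritance(child, father, mother)

print("Inherited:", sorted(inherited))
print("De novo candidates:", sorted(de_novo_candidates))
```

Applied to the counts in the text, 650 SVs in the patient minus 616 found in at least one parent leaves the 34 de novo candidates shown in Figure 22, before removal of sequencing artifacts.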
The paternal/maternal SV inheritance rates were found to be relatively balanced
(46.1% and 52.5%, respectively), as expected for a 50:50 inheritance ratio during
meiosis. After the filtering criteria described in the Methods (Chapter
2.7-2.8) were applied, only four de novo SVs were found in the offspring, of
which two corresponded to the previously identified balanced translocation
t(2;8), and the other two were located in intergenic regions (Table S8). The
translocation breakpoints were validated by PCR and Sanger sequencing at
positions chr2:144,606,846 and chr8:23,007,141-23,007,144, with a loss of 5 bp on
chromosome 8 and a microhomology of 4 bp between the flanking regions. The
chromosome 2 breakpoint was located in intron 5 of the GTDC1 gene, and no
coding gene was found at the chromosome 8 breakpoint, suggesting that GTDC1 is
the main candidate gene in this patient. GTDC1 encodes glycosyltransferase-like
domain-containing protein 1, of unknown function. GTDC1 transcript
expression was reduced in the patient’s lymphoblastoid cell line compared to two
controls (Figure 23).
Figure 23. qPCR analysis of GTDC1 in patient’s lymphoblastoid cell line. GTDC1
expression is reduced in the patient’s cell line compared to two controls. Breakpoint is
located at intron 5.
Based on secondary CNV screening in DD cases by Cooper et al.(72) and in
DECIPHER, there are 21 deletions/duplications overlapping with GTDC1,
whereas no CNVs spanning GTDC1 were found in DGV, suggesting that
CNVs affecting GTDC1 are predominantly present in the disease population.
GTDC1 is ubiquitously expressed in human tissues, with the highest expression
observed in the cerebellum of the adult brain (Figure 24).
3.4 SECONDARY CNV SCREENING IN PUBLISHED STUDIES
AND DATABASES
From these analyses, eight genes were disrupted at the chromosomal breakpoint
regions in six out of seven patients screened, as summarized in Table 13.
Patient   G-banding/FISH diagnosis   aCGH finding         DNA-PET breakpoint coordinates   Putative gene disruption
CD5       t(9;17)(q12;q24)           Normal               Chr9:79,572,001-79,572,005       GNAQ; RBFOX3
                                                          Chr17:74,764,927-74,767,964
CD10      t(6;8)(q16.2;p11.2)        Normal               Chr6:98,318,526-98,318,538       UNC5D
                                                          Chr8:35,527,964-35,527,976
CD8       inv(X)(p22;q26)            Normal               ChrX:119,984,613-119,984,615     XIAP; TMEM47
                                                          ChrX:122,839,398-122,844,862
CD9       inv(5)(q22;q35.1)          Normal               Chr5:111,962,767
                                                          Chr5:165,576,568-165,576,574
CD14      Del7qter                   Telomeric deletion   Chr7:158,150,000                 NCAPG2
                                                          Chr8:6,282,434-6,318,538
CD23      t(12;19)(q12;q24)          Normal               Chr12:114,971,734-114,971,744    MED13L
                                                          Chr19:35,769,937-35,769,968
CD6       t(2;8)(q23;p21)            Normal               Chr2:144,606,846                 GTDC1
                                                          Chr8:23,007,141-23,007,144
Table 13. Summary of the breakpoint analysis for each patient.
Patients were initially assessed by G-banding karyotyping, which indicated the breakpoint
positions. aCGH findings were usually normal, whereas DNA-PET sequencing
precisely pinpointed the chromosomal breakpoints, which were subsequently narrowed down to the
putative candidate gene that is disrupted.
Figure 24. qPCR analysis of the
expression of GTDC1 in human tissue
panel. This analysis showed high
expression in neurological tissues,
emphasizing critical role of GTDC1 in
brain development.
In addition, an MCPH1 deletion was identified by screening the patient-specific SVs
in one patient, since the telomeric deletion was inherited from a healthy father. Among
these genes, XIAP has been associated with X-linked lymphoproliferative disorder
2 (XLP-2: MIM 300635) and MCPH1 has been linked to autosomal recessive
primary microcephaly (MCPH MIM 251200).
To further evaluate the disrupted genes and their potential implication in NDDs,
CNVs overlapping these genes were counted in DD cases from published studies and
from the Developmental Disorder database DECIPHER, and compared with the control
set of the same study and the DGV database(72, 159, 160) (Table 14). Enrichment
of CNV counts in cases versus controls (p ≤ 0.05) was observed for RBFOX3, GTDC1,
and NCAPG2, suggesting that altered dosage of these genes might contribute to
the neurodevelopmental phenotypes seen in the patients.
                    Cases                       Controls
Patient   Gene      Cooper   DECIPHER   Total   Cooper   DGV   Total   Chi-square p-value
CD5       GNAQ      3        7          10      14       0     14      1 × 10^-1
CD5       RBFOX3    68       11         79      1        4     5       1 × 10^-4
CD10      UNC5D     2        8          10      0        0     0       7 × 10^-2
CD14      NCAPG2    16       14         30      1        1     2       1 × 10^-2
CD8       TMEM47    0        10         10      0        4     4       4 × 10^-1
CD6       GTDC1     11       14         35      0        0     0       4 × 10^-3
CD23      MED13L    13       10         23      2        4     6       4 × 10^-1
Table 14. CNV counts in cases and controls from published and public datasets. In
Cooper et al., the total number of cases is 10,151 and the total number of controls is
8,329. The total number of cases in the DECIPHER database is 17,000. The total number of
controls included in the 1000 Genome SV Release set is 185. P-values are calculated by
Chi-square test based on two-tailed rank.
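The case-versus-control comparison in Table 14 can be illustrated with a Pearson chi-square test on a 2x2 table. The sketch below makes several assumptions that are not stated in the caption: cases are pooled as 10,151 + 17,000, controls are pooled as 8,329 + 185, and no continuity correction is applied. For one degree of freedom, the p-value can be obtained from the complementary error function, so only the standard library is needed.

```python
import math


def chi_square_2x2(cases_with_cnv, cases_total,
                   controls_with_cnv, controls_total):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 table."""
    a = cases_with_cnv
    b = cases_total - cases_with_cnv
    c = controls_with_cnv
    d = controls_total - controls_with_cnv
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# ASSUMED pooled cohort sizes, built from the totals in the Table 14 caption.
CASES_TOTAL = 10_151 + 17_000
CONTROLS_TOTAL = 8_329 + 185

# RBFOX3 row of Table 14: 79 CNVs in cases, 5 in controls.
chi2, p_value = chi_square_2x2(79, CASES_TOTAL, 5, CONTROLS_TOTAL)
print(f"RBFOX3: chi-square = {chi2:.2f}, p = {p_value:.1e}")
```

Under these assumptions the RBFOX3 result is of the same order of magnitude as the value listed in Table 14, although the exact procedure used for the table may differ.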
CHAPTER 4: RESULTS
DISSECTING FUNCTIONAL ROLE OF MED13L
DURING NEURODEVELOPMENT AND NEURAL
CREST CELLS (NCCS) SPECIFICATION
4.1 STUDY BACKGROUND
MED13L is a subunit of the Mediator complex located in the CDK8 module, a
dissociable component of the complex that has been described to have both
activating and repressing functions in transcription regulation.(161) The
Mediator complex regulates gene expression by physically linking transcription
factors to RNA Polymerase II.(162-164) Recent studies have shown that MED13L has
a role in cancer pathways by specifically repressing Retinoblastoma/E2F-induced
growth, and that it is targeted for degradation by FBW7, a tumor suppressor and
a component of the SCF (Skp-Cullin-F box) ubiquitin ligase, which disrupts the
CDK8-Mediator association.(165) Mutations in other subunits of the Mediator
complex have been linked to DD syndromes such as: (i) MED12 in Lujan-Fryns
syndrome (MIM 309520),(166) X-linked Ohdo syndrome (MIM 249620)(167) and
Opitz-Kaveggia syndrome (MIM 305450);(168) (ii) MED17 in
infantile cerebral and cerebellar atrophy (MIM 613688);(169) and (iii) MED25 in
autosomal recessive Charcot-Marie-Tooth disease Type 2B (MIM 505589).(170)
Mutations or SVs in MED13L have been previously reported in several case
studies. The first study reported a MED13L disruption at a translocation
breakpoint in a patient with ID and dextro-looped transposition of the great
arteries (dTGA), identified by positional cloning; three additional patients
with MED13L missense mutations were identified after mutation screening in dTGA
patients. Thus, MED13L mutations were suggested to be causative for dTGA (MIM
608771). Subsequently, a homozygous mutation, p.Arg1416His, in MED13L was
reported in two siblings presenting with isolated non-syndromic ID without
cardiac defects.(171) In the third study, Asadollahi et al. reported three
patients with ID, craniofacial abnormalities, and cardiac defects, of whom two
had exonic deletions in MED13L and one had a triplication involving the
whole gene. That study suggested that patients with MED13L mutations could
be classified as having a recognizable haploinsufficiency syndrome, owing to
the clinical features shared among patients with mutations in this gene.
In this thesis, a MED13L disruption was identified at a translocation breakpoint
in patient CD23, resulting in MED13L haploinsufficiency (Chapter 3.3).
Interestingly, this patient showed a phenotype nearly identical to that of the
previously described patients with MED13L mutations.(172-175) Table 15 compares
the clinical phenotypes of patients carrying MED13L mutations (see also
Figure 25).
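Table 15. Clinical presentation of reported patients with structural variants or mutations affecting MED13L.
Abbreviations: CoA, coarctation of the aorta; dTGA, dextro-looped transposition of great arteries; pVSD, perimembranous ventricular septal
defect; TAPVC, total anomalous pulmonary venous connection; TOF, tetralogy of Fallot; ID, intellectual disability; DD, developmental delay;
SD, speech delay; LD, language delay; NR, not reported.

Patient 1 (Muncke et al. 2003)
  DNA alteration: t(12;17)(q24.1;q21) (intron 1)
  Cardiac malformation: dTGA, pVSD, mild CoA
  DD and cognitive anomalies: DD, ID, absence of speech, discrete ataxia, microcephaly
  Craniofacial anomalies: NR
  Skeletal and muscle manifestation: NR

Patient 2 (Muncke et al. 2003)
  DNA alteration: c.752A>G
  Cardiac malformation: dTGA
  DD and cognitive anomalies: NR
  Craniofacial anomalies: NR
  Skeletal and muscle manifestation: NR

Patient 3 (Muncke et al. 2003)
  DNA alteration: c.5615G>A
  Cardiac malformation: dTGA
  DD and cognitive anomalies: NR
  Craniofacial anomalies: NR
  Skeletal and muscle manifestation: NR

Patient 4 (Muncke et al. 2003)
  DNA alteration: c.6068A>G
  Cardiac malformation: dTGA
  DD and cognitive anomalies: NR
  Craniofacial anomalies: NR
  Skeletal and muscle manifestation: NR

Patient 5 (Najmabadi et al. 2011)
  DNA alteration: c.4246C>T
  Cardiac malformation: NR
  DD and cognitive anomalies: Mild ID
  Craniofacial anomalies: NR
  Skeletal and muscle manifestation: NR

Patient 6 (Asadollahi et al. 2013)
  DNA alteration: Deletion in exon 2 (17 Kb)
  Cardiac malformation: Pulmonary atresia, supracardial TAPVC, VSD
  DD and cognitive anomalies: ID, DD, SD, motor coordination problem, hypotonia
  Craniofacial anomalies: Prominent occipital protuberance, broad forehead with nevus simplex, upslanting palpebral fissures, flat nasal root, bulbous nose, large ears, macroglossia
  Skeletal and muscle manifestation: Muscular hypotonia

Patient 7 (Asadollahi et al. 2013)
  DNA alteration: Deletion in exon 3-4 (115 Kb)
  Cardiac malformation: TOF
  DD and cognitive anomalies: DD
  Craniofacial anomalies: Upslanting palpebral fissures, large ears, flat nasal root, bulbous nose, deep philtrum, micrognathia
  Skeletal and muscle manifestation: Muscular hypotonia, bowed legs, overlapping of fifth toe over fourth toe

Patient 8 (Asadollahi et al. 2013)
  DNA alteration: Triplication of the whole gene
  Cardiac malformation: pVSD
  DD and cognitive anomalies: Learning difficulties
  Craniofacial anomalies: Broad nasal bridge, mild pectus excavatum
  Skeletal and muscle manifestation: Neonatal muscular hypotonia

Patient 9 (our patient)
  DNA alteration: t(12;19)(q24;q21) (intron 4)
  Cardiac malformation: NR
  DD and cognitive anomalies: DD, LD, ID
  Craniofacial anomalies: Flat occiput, hypertelorism, flat philtrum, large nasal bridge, Pierre-Robin, microretrognathia, strabismus, double-gum aspect
  Skeletal and muscle manifestation: Bilateral equino varus foot deformity, metatarsus adductus (thumb), camptodactyly, scoliosis, multiple contractures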
Figure 25. Gene and protein structure of MED13L. The mutations or SVs in reported
patients (P1-P9) are indicated at their coding positions at the mRNA and protein
levels. The upper diagram shows the gene structure of MED13L. Patient 1 carries a
de novo t(12;17) translocation with a breakpoint in intron 1; Patients 2, 3, 4 and
5 carry missense mutations in exons 2, 25, 27, and 19, respectively. Patient 6
carries a deletion in exon 2 and Patient 7 a deletion encompassing exons 3-4.
Patient 8 carries a triplication involving the whole gene and is therefore not
indicated in this diagram. Patient CD23 is indicated in red as Patient 9, with a
translocation breakpoint in intron 4.
Patients with MED13L haploinsufficiency show similar syndromic features affecting
the craniofacial mesenchyme (macroglossia, bulbous nose, upslanting palpebral
fissures), the nervous system (ID) or the cardiac system, which to some extent
overlap with DiGeorge syndrome. Many parts of these affected organs are known to
originate from neural crest cells (NCCs). NCCs are migratory, multipotent cells
that give rise to a wide variety of cell lineages in the embryo, including
melanocytes of the skin and inner ear, craniofacial cartilage and bone,
peripheral and enteric neurons and glia, and smooth muscle. Other components of
the CDK8 module of the Mediator complex have previously been shown in animal
studies to be important in the development of NCC-derived organs: (i) med12 is
required as a coactivator for Sox9-dependent neural crest, brain, kidney, and ear
development; (ii) Med13 regulates energy homeostasis in the heart. These studies
suggest that components of the Mediator complex may have specific roles in
different aspects of NCC development.
To uncover the role of MED13L in NCC development and neurodevelopment,
knockdown experiments were performed in zebrafish embryos and in human
ES-derived NPCs.
4.2 EXPRESSION PROFILE OF MED13L ORTHOLOGUE IN
ZEBRAFISH
In humans, MED13L is expressed in fetal brain, cerebellum, and skeletal muscle
(see Chapter 3). To study the role of MED13L during embryogenesis, the zebrafish
was used as a model for neural crest development and neurodevelopment.
MED13L is highly conserved across vertebrate species (Figure 26), and the
protein sequence of med13b, the closest zebrafish orthologue of MED13L, shows
65% sequence similarity to the human protein.
Figure 26. Sequence conservation of MED13L across vertebrate species.
Sequences were extracted from UCSC Genome browser and aligned by using
CLC Main Workbench.
In zebrafish, med13b is expressed maternally at the 1-cell stage, and its
expression gradually decreases until 6 hours post fertilization (hpf), when the
maternal-to-zygotic transition is activated (Figure 27A). To determine the
spatiotemporal expression of med13b during embryogenesis, whole-mount in situ
hybridization was performed on zebrafish embryos at several time points.
Maternal expression of med13b was detected in the blastomeres at the 1-2 cell
stage, and at 24-48 hpf zygotic med13b expression was restricted to the
forebrain, mid-hindbrain boundary, eye, pectoral fin and lateral line (Figure
27B-F). Some of these tissues require derivatives of the neural crest lineage
to develop: the pectoral fin arises from trunk neural crest, the lateral line
arises from cranial placodal ectoderm, and eye development involves migration of
NCCs to the periocular mesenchyme.
Figure 27. Expression profile of med13b in zebrafish embryo. (A) qPCR analysis of
med13b from 1 hpf to 72 hpf showed maternal expression that is subsequently
replaced by zygotic transcription at 6 hpf. (B-F) Maternal expression was
confirmed by in situ hybridization, showing expression in the blastomeres at the
1- and 2-cell stages (B and C), restricted expression in brain regions at 24 and
48 hpf (D and E), and no expression with the sense probe (F).
4.3 LOSS OF FUNCTION OF MED13B IN ZEBRAFISH EMBRYO
To investigate the role of med13b in development, med13b was suppressed by
injecting antisense morpholino oligonucleotides (MO) into zebrafish embryos at
the 1-2 cell stage. Embryos injected with med13b-MO or control-mismatch-MO, as
well as uninjected clutchmates, were analyzed up to 5 days post fertilization
(dpf) for survival and morphology. Developmental defects in med13b morphants
were apparent from 2 dpf. More than 95% of the embryos were affected, showing a
curved body axis, underdeveloped head, pericardial edema, and microphthalmia
(Figure 28). The phenotype was categorized as severe in 40% and as
mild-to-moderate in 55% of the analyzed embryos.
Figure 28. Knockdown of med13b in zebrafish embryo. med13b MO injection (~150
embryos per injection) resulted in three categories of morphant phenotype by
severity of the developmental defect: severe, moderate and mild. Co-injection of
MO and human MED13L mRNA restored the phenotype.
Co-injection of approximately 150 pg of full-length human MED13L mRNA with
med13b MO resulted in significant rescue of the eye size and head development,
while the body length was partially rescued (Figure 29).
Figure 29. Eye size of morphants was smaller than that of controls and could be
partially rescued by human MED13L mRNA. The body length of morphants was not
severely affected, and thus the rescue construct did not fully restore the body
length. Six representative embryos were selected for measurements in each group
(n=6, ± SEM). *) p<0.05 and **) p<0.001.
Knockdown of med13b did not appear to impair survival: at 5 dpf, the survival
rate of morphants was similar to that of control-MO-injected, uninjected and
rescued embryos (Figure 30). This may indicate functional redundancy with other
members of the Mediator complex, such as MED13, which shares ~50% protein
similarity with MED13L.
4.4 LOSS OF MED13B IMPAIRED CRANIOFACIAL CARTILAGE
DEVELOPMENT
To further study the implications of med13b deficiency for craniofacial and
cartilage development, the migration of cranial NCCs in the morphant embryos was
assessed using sox10, which is expressed along the cranial ganglia and otic
vesicle at 24 hpf (Figure 31A). In the morphants, NCC migration was impaired, as
shown by clustered sox10-positive cells along the migratory cranial ganglia at a
more posterior location (Figure 31A’). Upon co-injection of human MED13L mRNA,
the location of sox10-positive cells was comparable to that in control embryos
(Figure 31A’’), suggesting that med13b is required for proper migration of NCCs.
Cartilage development was further assessed by Alcian blue staining of morphant
embryos at 5 dpf, a time point at which all craniofacial cartilage has fully
formed. The morphants showed a severely distorted ceratohyal cartilage, which
was shifted posteriorly (Figure 31B’). This phenotype was fully restored upon
introduction of human MED13L mRNA (Figure 31B’’), further confirming that the
defect is due to MED13L deficiency. These data suggest that defective cranial
NCCs at early developmental stages contributed to the craniofacial anomalies
upon med13b suppression in these embryos.
Figure 30. Survival rate of med13b MO embryos compared to wild type
(uninjected and control MO) and rescue embryos observed at 5 dpf. Loss of
med13b did not affect the survival rate, as shown by differences of only 2-3%
between control, rescued and morphant embryos.
Figure 31. Expression of NCCs markers by in situ hybridization.
(A-A’’) sox10 was expressed along the cranial ganglia and otic vesicle (ov) at
24 hpf in the control embryos (A). Delayed migration of NCCs, seen as clustered
sox10-positive cells at a more posterior location, in the morphant embryo (A’);
the rescued embryo showed sox10 expression comparable to normal (A’’). (B-B’’)
Alcian blue staining at 5 dpf showed staining of the ceratobranchial 1-5 (cb),
palatoquadrate (Pq), Meckel’s (m) and ceratohyal (ch) cartilages in the control
(B). The ch was severely distorted and shifted posteriorly in the morphants
(B’), which was restored upon introduction of the human construct in the rescued
embryo (B’’).
4.5 MED13B SUPPRESSION AFFECTS NEURODEVELOPMENT
IN ZEBRAFISH EMBRYO
Another prominent feature of patients with MED13L mutations is mild-to-moderate
intellectual disability. Considering the conserved expression of MED13L in early
brain development from fish to human, the role of med13b in early
neurodevelopment was investigated in the zebrafish embryo. The marker islet1,
which is primarily expressed in the retinal ganglion cells (RGCs),
hypothalamus, and ventral telencephalon in the CNS, was used. med13b morphants
showed absence of islet1 expression in RGCs and forebrain (Figure 32C’) and
reduced expression in the hypothalamus and telencephalon, which was restored
upon human MED13L mRNA co-injection (Figure 32C’’), suggesting impaired
neuronal development in early embryogenesis. To better examine the events of
neural differentiation, med13b MO was injected into transgenic HuC:GFP fish,
in which cells that start to develop as neurons express green fluorescent
protein (GFP) under the control of the HuC promoter. At 48 hpf, a decrease in
GFP expression in the optic tectum of the midbrain was observed in morphants
(Figure 32D’), whereas rescued embryos showed expression similar to that of
control embryos (Figure 32D’’). These results suggest that med13b has a role in
the early stages of neural differentiation.
Figure 32. Morpholino knockdown of med13b perturbed neuronal distribution
across zebrafish brain.
(C-C’’) Expression of islet1 in the hypothalamus (ht), forebrain (fb), and
retinal ganglion cells (rgc) is shown in the control (C). Loss of islet1
expression in the fb, ht and rgc of the morphant embryo suggests developmental
delay (C’), which was rescued by co-injection of human MED13L mRNA (C’’).
(D-D’’) The HuC:GFP transgenic line showed the distribution of developing
neurons as GFP expression across brain regions in the control embryos (D).
Abnormal distribution of GFP expression in the forebrain and mid-hindbrain
boundary was seen in the morphant (D’), which was restored upon mRNA
introduction (D’’).
4.6 MED13L KNOCKDOWN IN NEURAL STEM CELLS DID
NOT AFFECT PROLIFERATION
To further assess the effect of MED13L disruption in a human system,
hESC-derived neural progenitor cells (H1-NPCs) were used as a model. These
NPCs were differentiated from H1 human ES cells (H1-hESCs) using small
molecule inhibitor cocktails targeting the GSK3β and TGF-β signaling pathways,
based on an established protocol.(147) Two shRNAs targeting MED13L were
transfected into H1-NPCs and resulted in 50-60% suppression of MED13L
expression (Figure 33).
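As an illustration of how a knockdown efficiency of this kind can be derived from qPCR data, the sketch below applies the commonly used 2^-ΔΔCt calculation. The text above does not state which quantification method was used, so the method, the function names and all Ct values here are assumptions chosen for illustration only.

```python
def relative_expression(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Relative target expression (knockdown vs control) by the 2^-ddCt method.

    Each Ct is a threshold cycle; the reference gene normalizes for input amount.
    """
    d_ct_kd = ct_target_kd - ct_ref_kd        # normalized Ct in knockdown cells
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl  # normalized Ct in control cells
    dd_ct = d_ct_kd - d_ct_ctrl
    return 2.0 ** (-dd_ct)


def knockdown_percent(rel_expr):
    """Percentage reduction in expression relative to the control."""
    return (1.0 - rel_expr) * 100.0


# Hypothetical Ct values (not taken from the thesis): the target gene crosses
# the threshold 1.2 cycles later in knockdown cells than in control cells.
rel = relative_expression(ct_target_kd=26.2, ct_ref_kd=18.0,
                          ct_target_ctrl=25.0, ct_ref_ctrl=18.0)
print(f"relative expression: {rel:.2f}, knockdown: {knockdown_percent(rel):.0f}%")
```

A delay of exactly one cycle corresponds to a 50% reduction, which is why suppression in the 50-60% range translates into a ΔΔCt of only slightly more than one cycle.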
Protein expression of the neural stem cell markers NESTIN, SOX2 and
VIMENTIN was not remarkably different between shScrambled and
shMED13L1/2 cells, as shown by immunocytochemistry (Figure 34), indicating that
MED13L knockdown did not alter cell fate and that the cells retained NPC
characteristics.
Figure 33. Knockdown efficiency of shMED13L cells. MED13L expression was
significantly suppressed (up to ~50-60%) by two shRNAs, shMED13L1 and
shMED13L2, in H1-NPCs. Data were collected based on triplicate measurements.
**) p < 0.001
Figure 34. Immunocytochemistry of NPCs markers (SOX2, NESTIN, and
VIMENTIN) in shScrambled, shMED13L1 and shMED13L2 cells. Sox2,
Nestin and Vimentin were expressed at comparable levels between
shScrambled and shMED13L NPCs. Images are representative of two
independent experiments and six field-images were taken for each experiment.
NPCs can self-renew and proliferate only a limited number of times before they
become post-mitotic. The proliferative capacity of these cells was not severely
affected by MED13L knockdown, as assessed by EdU-incorporation assay and Ki67
labeling (Figure 35). The proportion of proliferating cells in the different
cell cycle phases did not differ significantly between shScrambled and shMED13L
cells. Ki67-expressing cells, which represent proliferating cells across the G1
to G2/M phases, likewise showed no significant difference between these samples.
Figure 35. Proliferating cells in different cell cycle phases measured by EdU
incorporation and Ki67 staining. EdU-incorporated cells were measured using
flow cytometry and showed no significant differences between shScrambled and
shMED13L cells. Experiments were performed twice in duplicate. Similarly, the
percentage of Ki67-positive cells was calculated relative to the total number of
DAPI-stained nuclei and also revealed no significant differences. Ki67-positive
cells were counted in five randomly selected image fields per sample, each
containing 40-50 nuclei.
4.7 MED13L KNOCKDOWN DID NOT AFFECT NEURONAL
MATURATION
To investigate the impact of MED13L disruption in neurons, H1-derived NPCs
expressing shScrambled or shMED13L were maintained at low passage and further
differentiated into neurons. Neurons generated with this technique consist of a
mixed population of neuronal subtypes and are not directed towards a specific
neuronal lineage. At four weeks, these cells expressed the mature neuron marker
MAP2 and the early differentiating neuron marker TuJ1. Expression of both
markers in shMED13L1/2 neuronal cells was similar to that in shScrambled cells
(Figure 36), indicating that MED13L suppression did not affect neuronal
maturation. This result differs from the observation in zebrafish morphants,
which showed abnormal distribution of developing neurons in the brain. The
difference could be explained by the lack of positional identity in neuronal
cell culture, which does not recapitulate specific brain regions, and by the
possibility that the neurons generated in vitro are not identical to the
neuronal cell types affected in zebrafish morphants.
Figure 36. Expression of early neuronal marker (TuJ1) and mature neuronal
marker (MAP2) in shScrambled, shMED13L1 and shMED13L2 cells assessed by
immunocytochemistry. TuJ1 and MAP2 were co-expressed at similar levels in
shScrambled and shMED13L cells at 4 weeks of differentiation. Images are
representative of two independent experiments and six field-images were taken
for each experiment.
4.8 TRANSCRIPTOME PROFILING OF MED13L-DEFICIENT
NEURONS
To characterize whether MED13L knockdown affects other genes, which might
provide clues to its regulatory role in the genome, microarray experiments were
performed on shMED13L1/2 neurons at week 4, as this time point is a critical
transition period during which neural progenitors become post-mitotic neurons.
In total, 1,117 genes were differentially expressed in shMED13L neurons. The
top two upregulated genes were SP8, a zinc finger transcription factor gene,
and FGFR3, the fibroblast growth factor receptor 3 gene. Both were upregulated
nearly 10-fold in shMED13L1 and shMED13L2 neurons, and both were confirmed by
quantitative RT-PCR to be upregulated in shMED13L NPCs and med13b morphant
embryos as well (Figure 37). Four other genes, which according to the ZFIN
database are expressed in NCC-derived structures in zebrafish, namely the
pharyngeal arches (calcr), lateral line (rspo3), and pectoral fin (hoxa5 and
atp1a2), were significantly downregulated (Log2FC = -3.83 to -2.39, p<0.05).
These data suggest that MED13L suppression also affects the regulation of genes
that are important for the development of NCC derivatives.
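To make the Log2FC range quoted above easier to read, the following minimal sketch converts a log2 fold change into a linear fold change. The function names are illustrative and are not part of the analysis software used in the thesis; only the two Log2FC values are taken from the text.

```python
def log2fc_to_fold(log2fc):
    """Linear expression ratio (treated / control) for a given log2 fold change."""
    return 2.0 ** log2fc


def fold_reduction(log2fc):
    """For a negative log2FC, how many times lower expression is than in control."""
    return 1.0 / log2fc_to_fold(log2fc)


# The two ends of the range reported for calcr, rspo3, hoxa5 and atp1a2.
for value in (-3.83, -2.39):
    print(f"Log2FC {value}: ratio {log2fc_to_fold(value):.3f}, "
          f"{fold_reduction(value):.1f}-fold lower than control")
```

On this scale, the reported range corresponds to expression roughly 5- to 14-fold lower than in control cells.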
Pathway enrichment analysis using PANTHER software was applied to 1,147
genes and revealed that 16 components of the canonical Wnt pathway were
significantly differentially expressed in shMED13L1/2 neurons (Figure 38).
Figure 37. qPCR analysis of SP8 and FGFR3 in MED13L-knockdown cells.
This analysis showed upregulated expression of SP8 and FGFR3 in shMED13L1
and shMED13L2 NPCs and neurons, as well as med13b MO, thus confirming the
increased fold-change in microarray.
Figure 38. Heatmap clustering of shMED13L1/2 neuronal cells. NCCs genes (shown
in red) and components of the Wnt pathway are differentially expressed in
shMED13L1/2 neurons compared to shScrambled cells. The dendrogram shows the
hierarchical clustering based on the differentially expressed genes.
Eight genes from Wnt pathway were selected and validated by qPCR, confirming
the microarray-predicted changes in these genes for neurons and NPCs, as well as
med13b morphant embryos, supporting the notion that Wnt signaling is
deregulated in cells lacking MED13L (Table 16).
Gene Name  shRNA/Morphant  Microarray    qRT-PCR Fold Change
                           Fold Change   Neuron    NPC       med13b MO
WNT7A      shMED13L-1       3.278        2.329     7.639     1.139
           shMED13L-2       3.112        2.371     34.004
LRP5       shMED13L-1       1.343        1.778     1.375     1.384
           shMED13L-2       1.489        3.349     1.662
FRZB       shMED13L-1      -5.497        0.096     0.533     0.609
           shMED13L-2      -6.017        0.112     0.365
PCDHB15    shMED13L-1      -1.576        0.294     0.51      n.o.
           shMED13L-2      -1.939        0.211     0.793
PCDH17     shMED13L-1       1.488        1.488     1.603     n.o.
           shMED13L-2       1.788        1.473     3.822
HOXA5      shMED13L-1      -5.277        0.066     0.366     0.701
           shMED13L-2      -6.701        0.02      0.416
SFRP4      shMED13L-1      -2.180        0.122     0.304     n.o.
           shMED13L-2      -2.614        0.169     0.227
FZD9       shMED13L-1       1.696        1.465     1.781     0.718
           shMED13L-2       2.745        1.605     2.212
Table 16. qPCR validation of microarray-predicted changes in shMED13L NPCs,
Neurons and med13b MO. Eight genes that represent the components of canonical
Wnt signaling pathways were selected for PCR validations in MED13L
knockdown cells and zebrafish morphants. qPCR experiments were performed in
triplicates. n.o. indicates no orthologue in the zebrafish.
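The two fold-change columns in Table 16 appear to use different conventions, which can make the comparison hard to follow. The sketch below assumes that the microarray values are signed fold changes (for example, -5.497 meaning 5.497-fold lower than control) and that the qRT-PCR values are ratios relative to control (values below 1 meaning lower expression). Under these assumptions, which are an interpretation and are not stated in the table, it checks whether the direction of change agrees for the shMED13L-1 neuron and NPC values. The function names are illustrative.

```python
def signed_fc_to_ratio(signed_fc):
    """Convert a signed fold change (e.g. -5.5 = 5.5-fold down) to a ratio."""
    if abs(signed_fc) < 1.0:
        raise ValueError("signed fold change must have magnitude >= 1")
    return signed_fc if signed_fc > 0 else 1.0 / abs(signed_fc)


def same_direction(microarray_signed_fc, qpcr_ratio):
    """True if both measurements are up (>1) or both are not up."""
    return (signed_fc_to_ratio(microarray_signed_fc) > 1.0) == (qpcr_ratio > 1.0)


# shMED13L-1 rows of Table 16: (gene, microarray, qRT-PCR neuron, qRT-PCR NPC)
ROWS = [
    ("WNT7A", 3.278, 2.329, 7.639),
    ("LRP5", 1.343, 1.778, 1.375),
    ("FRZB", -5.497, 0.096, 0.533),
    ("PCDHB15", -1.576, 0.294, 0.51),
    ("PCDH17", 1.488, 1.488, 1.603),
    ("HOXA5", -5.277, 0.066, 0.366),
    ("SFRP4", -2.180, 0.122, 0.304),
    ("FZD9", 1.696, 1.465, 1.781),
]

# Genes whose qRT-PCR direction disagrees with the microarray in either cell type.
discordant = [
    gene
    for gene, array_fc, neuron, npc in ROWS
    if not (same_direction(array_fc, neuron) and same_direction(array_fc, npc))
]
print("discordant genes:", discordant)
```

The check compares direction only; it does not compare magnitudes, which differ between the two platforms.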
CHAPTER 5: RESULTS
STUDYING THE ROLE OF GTDC1 DURING
NEUROGENESIS
5.1 STUDY BACKGROUND
This chapter focuses on a follow-up functional study of Patient CD6, who carries
a t(2;8) translocation and was diagnosed with global DD, SD and language
impairment. Sequencing analysis of his unaffected parents revealed that this
translocation is the only de novo SV within coding regions, and that it disrupts
GTDC1 at intron 5.
GTDC1 encodes glycosyltransferase-like domain containing 1, a protein of 458
amino acids (AA) whose function is largely unknown. The gene was first
discovered and characterized in a human fetal library, and shown to have ~75%
homology with its mouse orthologue.(176) It has a glycosyltransferase group 1
domain at its C-terminus and a transmembrane region at its N-terminus,
indicating a type II transmembrane protein.(176) Another family member, GTDC2,
also known as POMGNT2, was recently reported to be causative for Walker-Warburg
syndrome through an exome sequencing study of a consanguineous family and
functional validation in a zebrafish model.(177) Based on the preliminary
expression analysis, GTDC1 is ubiquitously expressed in human tissues, including
fetal brain and cerebellum (Chapter 3.3).
In this study, the relationship between GTDC1 disruption and the
neurodevelopmental phenotype of the patient was investigated through
morphological analysis of NPCs and neurons derived from the patient’s induced
pluripotent stem cells (iPSCs) and of H1-NPCs lacking GTDC1.
5.2 SOMATIC CELLS REPROGRAMMING FROM PATIENT’S
FIBROBLASTS
Somatic cell reprogramming was performed on the patient’s and control
fibroblasts and generated multiple iPSC clones for further characterization. The
controls were derived from two healthy donors; because the results were similar
for both, data from one of them are shown as representative. Two of the
patient’s iPSC clones were selected (Clones 5 and 15). Each clone expressed the
pluripotency markers OCT4, NANOG, SSEA4 and TRA-1-60 (Figure 39) and maintained
the t(2;8) karyotype (Figure 40), which confirmed successful reprogramming of
the patient’s fibroblasts into pluripotent iPSCs and stable genome integrity
after induction of the reprogramming factors.
Figure 39. Pluripotent markers (OCT4, NANOG, SSEA4, and TRA1-60) expression
in iPSCs clones. Two clones of patient’s derived iPSCs (CD6 #5 and CD6#15) and
control iPSCs showed expression of pluripotent markers, indicating successful
reprogramming from fibroblast to iPSCs. Images are representative of two independent
experiments and six field-images were taken for each experiment.
Figure 40. Similar translocation profile is retained in patient’s iPSC clones.
G-banding karyotype of the two patient iPSC clones (CD6 #5 and #15) showed the
t(2;8) translocation, similar to the patient’s EBV-LCL.
5.3 PHENOTYPIC CHARACTERIZATION OF PATIENT’S NPCS
AND GTDC1-DEFICIENT NPCS
To examine neurogenesis in the patient’s cells, patient CD6-iPSCs were
differentiated into NPCs using small molecule inhibitors.(147) After one week
of differentiation, these cells were positive for NESTIN and Ki67, as shown by
immunocytochemistry (Figure 41). In parallel, H1 human ES cells were
differentiated into NPCs using a similar protocol. After several passages, GTDC1
expression was suppressed in H1-NPCs using two shRNAs, resulting in a 50%
reduction of GTDC1 mRNA expression in shGTDC1-1 and shGTDC1-2 NPCs compared to
scrambled shRNA. Similar to the result obtained in the patient’s NPCs,
immunocytochemistry of the NPC markers NESTIN and SOX2 in these cells showed
comparable expression levels between shScrambled and shGTDC1 cells (Figure 41).
Figure 41. Immunocytochemistry of NPCs markers (NESTIN, Ki67, and SOX2) in
iPSCs and shGTDC1 cells. Upper Panel: Expression of Nestin (green) and Ki67 (red) in
NPCs differentiated from patient’s iPSCs (CD6 #5 and CD6 #15) and control iPSCs.
Lower Panel: Similarly, Nestin (green) and Sox2 (red) were expressed in shGTDC1 cells
and shScrambled cells after differentiation from H1-hESCs. Images are representative of
two independent experiments and six field-images were taken for each experiment.
To evaluate whether the proliferative capacity of NPCs is affected, the
proliferation rate of patient’s and shGTDC1 NPCs was measured by
EdU-incorporation assays. Interestingly, patient-derived NPCs showed a
significantly slower proliferation rate than control NPCs, and shGTDC1 NPCs
also appeared to have a reduced proliferation rate compared to shScrambled NPCs
(Figure 42). The rate of proliferation in NPCs is tightly regulated by cell
cycle checkpoint genes and is crucial for maintaining the efficiency of neuronal
differentiation. Therefore, perturbation of the cycling rate of these
progenitors may alter the normal timing of the generation of post-mitotic
neurons.
Figure 42. Proliferation rate was measured by using flow-cytometry-based EdU-
incorporation assay. Significantly reduced proliferation rate was observed in the two
clones of patient-derived NPCs (CD6 #5 and CD6 #15) and shGTDC1/2 NPCs in the S-
phase. Measurements were performed in duplicates and repeated twice.
Glycosyltransferases are key players in regulating protein glycosylation, an
important post-translational modification for protein function in diverse
biological processes. To investigate the glycosylation status in the patient’s
cells, ICAM-1 expression was assessed in shGTDC1 and patient’s NPCs. ICAM-1 is a
recently described hypoglycosylation biomarker for patients with congenital
disorders of glycosylation (CDG). In CDG-patient cells, the ICAM-1 protein is
unstable and degraded upon hypoglycosylation. Analysis of ICAM-1 protein
expression by immunocytochemistry revealed distinct patterns of ICAM-1
staining localization (Figure 43). In the patient’s NPCs, dotted-like staining
diffused throughout the cytoplasm was observed in both clones, whereas in
control and shScrambled NPCs the staining localized to the plasma membrane. In
contrast, a zipped-like staining pattern was observed in the shGTDC1-1 and
shGTDC1-2 NPCs. The mislocalized expression in the patient’s cells may point to
a possible dominant-negative effect of GTDC1 disruption. However, the knockdown
cells did not show a similar staining pattern, and further experiments are
required to confirm the functional impact of this translocation in the patient.
Figure 43. Glycosylation status is assessed by ICAM1 expression in NPCs. ICAM1
(green) is expressed on the plasma membrane, as shown in control and shScrambled
NPCs. In patient’s cells (CD6 #5 and CD6 #15) ICAM1 is mislocalized to the cytoplasm
in a dotted-like staining, and shGTDC1 cells showed a zipped-like staining pattern.
Images are representative of two independent experiments and six field-images were
taken for each experiment.
To extend the analysis to the cell type that is relevant in NDD patients, the
NPCs were differentiated into neurons. The expression level of GTDC1 mRNA was
reduced in the patient’s neurons, indicating that the reprogramming and
differentiation processes did not alter the genomic characteristics. To
investigate neuronal morphological phenotypes in patient’s and shGTDC1 neurons,
immunocytochemistry was performed using TuJ1, an early neuronal marker, and
MAP2, a mature neuron marker, with DAPI counterstaining for quantification.
Both markers were expressed in patient’s and shGTDC1 neurons (Figure 44A).
Notably, the reduction in the number of neurons was more pronounced in the
shGTDC1 cells.
Figure 44. Neuronal markers expressions in patient’s cells and quantification of
neuronal morphologies. A) Immunocytochemistry of TuJ1 (green) and MAP2 (red) in
patient-derived neurons (CD6 #5 and CD6 #15) and shGTDC1 neurons after 4 weeks of
differentiation. Images are representative of two independent experiments and six field-
images were taken for each experiment. B-D) Quantification analysis of neuronal
maturation showed lower maturation status (B), reduced soma size (C), and lower
number of neurites per neuronal cell body (D) in the patient’s cells. *) p < 0.05, **) p <
0.01, ***) p < 0.001
First, the efficiency of neuronal maturation was assessed by calculating the
percentage of colocalized signal, above threshold, between TuJ1-positive and
MAP2-positive cells. In theory, the colocalized signal reflects the number of
differentiated neurons that undergo maturation. A reduced percentage of cells
positive for both TuJ1 and MAP2 was observed in both patient’s and shGTDC1
neurons (Figure 44B), suggesting impaired neuronal maturation in these cells.
A decrease in neuronal soma size has been described previously in neurons
derived from Rett syndrome iPSCs and Parkinson’s disease iPSCs. The soma size of
individual neurons in patient’s and shGTDC1 cells was measured in 3-5 different
fields and revealed a similar reduction in patient’s and shGTDC1 neurons
(Figure 44C). To measure neurite outgrowth, neurons were passaged at a lower
density under the same conditions for patient, shGTDC1 and control cells. There
was a significant decrease in neurite density per neuronal cell body in both
patient’s and shGTDC1 neurons compared to control (Figure 44D). Together, these
data provide evidence of impaired neuronal maturation and morphologically
abnormal neuronal properties, shown by reduced soma size and a lower number of
neurites per cell body.
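The maturation measure described above can be sketched as a simple thresholded co-positivity calculation. The example below is only an illustration: it assumes per-cell intensity values for the two channels, fixed thresholds, and TuJ1-positive cells as the denominator, none of which is specified in the text. The function name and all numbers are hypothetical.

```python
def percent_colocalized(tuj1, map2, tuj1_threshold, map2_threshold):
    """Percentage of TuJ1-positive cells that are also MAP2-positive.

    tuj1 and map2 are per-cell intensities in the same cell order.
    """
    if len(tuj1) != len(map2):
        raise ValueError("both channels must describe the same cells")
    tuj1_positive = [value >= tuj1_threshold for value in tuj1]
    n_tuj1 = sum(tuj1_positive)
    if n_tuj1 == 0:
        return 0.0  # no TuJ1-positive cells in this field
    n_both = sum(
        1
        for is_pos, map2_value in zip(tuj1_positive, map2)
        if is_pos and map2_value >= map2_threshold
    )
    return 100.0 * n_both / n_tuj1


# Hypothetical intensities for five cells in one image field.
tuj1_intensity = [10, 200, 180, 50, 220]
map2_intensity = [5, 150, 20, 160, 190]
pct = percent_colocalized(tuj1_intensity, map2_intensity, 100, 100)
print(f"{pct:.1f}% of TuJ1-positive cells are also MAP2-positive")
```

Choosing TuJ1-positive cells as the denominator expresses the result as the fraction of differentiating neurons that have matured; using all DAPI-stained nuclei instead would answer a different question.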
5.4 TRANSCRIPTOME PROFILING OF PATIENTS AND SHGTDC1 CELLS
To characterize the transcriptome alterations in four types of cells (shGTDC1,
shScrambled, Control-iPSCs, and Patient-iPSCs), exploratory microarray
experiments were performed in NPCs and neurons. The patient’s and shGTDC1
expression profiles were hypothesized to be closely related due to GTDC1 loss
of function in both cell types. In total, 659 and 430 genes were differentially
expressed in the patient’s NPCs and neurons, respectively (Figure 45). In
shGTDC1 cells, 721 and 2264 genes were differentially expressed at the NPC and
neuron stages, respectively. Of these, 49 genes at the NPC stage and 61 genes at
the neuron stage overlapped between patient’s cells and shGTDC1 cells.
Figure 45. Venn diagram of differentially expressed genes that are in common in
NPCs and neurons between patient’s and shGTDC1 cells. Differentially expressed
genes in each sample were extracted and analyzed with GeneSpring analysis software
using parameters described in Chapter 2. Data were collected from two biological
replicates and normalized with controls in one microarray slide.
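The overlap shown in the Venn diagram is a set intersection of two gene lists. The sketch below illustrates this, together with a rough estimate of how many shared genes would be expected by chance if the two lists were drawn independently. The thesis analysis was carried out in GeneSpring, so this is not the method used; the gene lists are toy examples with placeholder names, and the background of 20,000 genes is a hypothetical figure chosen only for illustration.

```python
def overlap(genes_a, genes_b):
    """Genes present in both lists, sorted alphabetically."""
    return sorted(set(genes_a) & set(genes_b))


def expected_overlap_by_chance(n_a, n_b, n_background):
    """Expected size of the intersection of two independent random gene lists."""
    if n_background <= 0:
        raise ValueError("background size must be positive")
    return n_a * n_b / n_background


# Toy lists: STMN2 and CNTN2 are named in the thesis; GENE_A to GENE_C are placeholders.
patient_npc = ["STMN2", "CNTN2", "GENE_A", "GENE_B"]
shgtdc1_npc = ["CNTN2", "GENE_C", "STMN2"]
shared = overlap(patient_npc, shgtdc1_npc)
print("shared genes:", shared)

# List sizes reported for NPCs (659 and 721); the background size is an assumption.
chance = expected_overlap_by_chance(659, 721, 20000)
print(f"expected overlap by chance: {chance:.1f} genes")
```

The chance estimate depends strongly on the assumed background, so it should be recomputed with the number of genes actually assayed on the array before drawing any conclusion.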
Hierarchical clustering showed that samples from patient’s and shGTDC1 cells
tightly clustered with each other, and separated from the control-iPSCs and
shScrambled cells (Figure 46), suggesting that patient’s cells and shGTDC1 cells
are closely related with each other.
Figure 46. Heatmap clustering of the differentially expressed genes that are in
common between patient’s and shGTDC1 NPCs. Each column represents duplicate
samples indicated in the upper panel. Genes that were highly expressed are shown
in yellow. The dendrogram shows the hierarchical clustering based on the
differentially expressed genes. Patient’s cells and shGTDC1 cells are clustered
separately from the controls.
Among the differentially expressed genes, two genes, STMN2 and CNTN2, were
consistently upregulated 5- to 8-fold in the NPCs of both patient’s and shGTDC1
cells. STMN2 encodes stathmin-like 2, which plays a role in neuronal growth and
functions in regulating microtubule stability. CNTN2 encodes contactin 2, a
glycosylphosphatidylinositol (GPI)-anchored neuronal membrane protein with a
role in axon guidance.
To examine the biological pathways affected in the patient’s cells and GTDC1
knockdown cells at both stages, enrichment analysis was performed using the
DAVID web server, which reports the Gene Ontology (GO) categories enriched among
significantly altered genes. The top five most-enriched GO categories of
biological processes that are in common between patient cells and shGTDC1 cells
are shown in Figure 48. At the NPC stage, the top GO categories were related to
organ development, cellular growth and proliferation, cell death and survival,
and cellular movement. The GO terms in cellular growth and proliferation might
be related to the reduced number of cells cycling in S-phase at the NPC stage
in both patient’s and shGTDC1 cells. The upregulated genes in NPCs showed
enrichment of embryonic development, nervous system development, cellular
assembly, and cellular function and maintenance, suggesting that GTDC1
disruption has a significant impact on other genes involved in
neurodevelopmental processes. Among the downregulated genes in neurons,
biological processes related to nervous system development were enriched, such
as morphology of the nervous system, neuritogenesis, organization of the
cytoskeleton, and cellular and tissue development, which could explain the
abnormal neuronal properties of patient’s and shGTDC1 cells in culture.
Figure 48. The top five GO categories that are in common between patient’s and
shGTDC1 cells. This analysis showed significant enrichment of nervous system
development, embryonic development and cytoskeleton organization.
CHAPTER 6: DISCUSSION
The major challenge in identifying novel NDD genes is the heterogeneous nature
of the genetic etiologies and phenotypic expression observed in these patients.
Many symptoms are not unique to a single NDD syndrome, and several NDDs have
clusters of clinical entities in common.(4) Similarly, genetic mutations found in
certain syndromic forms of NDDs can be shared with idiopathic NDDs; for example,
TCF4 is known to cause Pitt-Hopkins syndrome (MIM 610954)(178), and mutations in
this gene have been reported in individuals with idiopathic DD as well.(179)
Depending on the type of variant being investigated (single-nucleotide changes,
large structural rearrangements, or epigenomic changes), the choice of screening
method is crucial in determining the diagnostic yield. In this thesis, the first
study aimed to pinpoint candidate genes within the breakpoints of chromosomal
rearrangements using a high-throughput sequencing approach. The second part of
the thesis focused on defining the functional significance of selected disrupted
genes within the chromosomal rearrangements in patients diagnosed with NDDs or
DDs.
First, patients carrying known chromosomal rearrangements and diagnosed
with symptoms of NDDs were screened by genome paired-end tag
sequencing, to fine-map the rearrangement breakpoints and to identify potential
additional SVs apart from the cytogenetically visible rearrangements. I then
prioritized candidate genes for functional studies from seven cases and selected
two genes for further characterization: GTDC1, based on the availability of trio
sequencing analysis, and MED13L, based on previously reported mutations in
patients with a similar clinical phenotype. I showed, using a zebrafish model and
neuronal cultures derived from hESCs, that MED13L plays a role in cranial NCC
development and neurodevelopment, which is likely to contribute to the craniofacial
anomalies and ID in the patient. Finally, the GTDC1 studies used
iPSCs and human ES cells with suppression of GTDC1, and the cellular
phenotype showed abnormal neuronal morphology in the patient’s cells.
6.1 CLINICALLY RELEVANT GENE DISRUPTIONS IN THE
CHROMOSOMAL REARRANGEMENT BREAKPOINTS
Over the last few years, a large number of novel candidate genes
involved in NDDs have been reported through studies employing NGS
technologies. Not only has this approach enhanced the
understanding of the underlying etiology, but the implications of interpreting
such variants have also slowly emerged. Disease gene discovery through mapping
genes at chromosomal breakpoints has long been applied in the classical
cytogenetic model. This model assumes that large structural rearrangements that
disrupt, truncate or inactivate a coding gene, or subject it to a position
effect of regulatory elements, are the primary cause of phenotypes that segregate
with the rearrangement. In cases where the rearrangements are transmitted to
individuals without disease, the causality of such rearrangements in the
expression of a phenotype is not entirely clear. In addition, other types of genetic
mutations such as SNVs also have functional consequences: they can alter
protein sequences, introduce a premature termination codon, or affect alternative
splicing. It is also understood that genetically heterogeneous diseases such
as NDDs may not be caused by a single gene disruption, but rather by a combined
effect of rare variants, either SVs or SNVs.(47, 48, 180)
In this study, seven patients from unrelated families and two parents were
examined by DNA-PET sequencing to assess genome-wide SVs that could
potentially explain their phenotypic features. These patients carried specific
chromosomal rearrangements that were interpreted as likely pathogenic by a
clinical geneticist. The sequencing yielded relatively high coverage, comparable to
other studies employing a similar technique in patients with chromosomal
rearrangements. In total, eight genes were identified at the chromosomal
breakpoints (GNAQ, RBFOX3, UNC5D, XIAP, TMEM47, NCAPG2, GTDC1 and
MED13L), and one gene, MCPH1, was identified from a patient’s SVs. These genes
have biologically relevant functions important for normal neurodevelopment,
including axon guidance, chromosomal condensation, glycosylation,
transcriptional regulation and alternative splicing regulation. Sanger sequencing
of the breakpoint junctions revealed cryptic deletions ranging from 3 bp to
5,462 bp and microhomologies of 2-4 bp between paired breakpoints,
suggesting a microhomology-mediated end joining (MMEJ)(181) or
non-homologous end joining (NHEJ)(182) mechanism. Both are major
pathways for double-strand break repair and often act in non-homologous
regions of chromosomal rearrangements such as translocations or inversions,
involving deletions of a few nucleotides and short stretches of microhomology
between the junctions(183-185).
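The notion of junction microhomology can be made concrete with a minimal sketch. It assumes a simplified definition: the microhomology is the number of bases at the junction that could have come from either partner, counted as the shared suffix of the retained left flank and the lost right flank plus the shared prefix of the lost left flank and the retained right flank. The function names and sequences are hypothetical, and real breakpoint annotation also has to account for strand orientation and inserted bases.

```python
def common_suffix_len(a, b):
    """Number of identical bases at the 3' ends of two sequences."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def common_prefix_len(a, b):
    """Number of identical bases at the 5' ends of two sequences."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def junction_microhomology(left_retained, left_lost, right_lost, right_retained):
    """Count bases at a junction that match the reference of both partners.

    left_retained / left_lost: reference sequence before / after breakpoint 1.
    right_lost / right_retained: reference sequence before / after breakpoint 2.
    """
    upstream = common_suffix_len(left_retained.upper(), right_lost.upper())
    downstream = common_prefix_len(left_lost.upper(), right_retained.upper())
    return upstream + downstream

# Hypothetical flanking sequences for illustration only
homology = junction_microhomology(
    left_retained="ACGTTGCA", left_lost="CATCGA",
    right_lost="TTTAGGCA", right_retained="TGGACC",
)
print(f"microhomology at junction: {homology} bp")
```

A result of a few base pairs, as in this toy example, falls in the range reported above for the sequenced junctions.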
Of the seven patients, four were familial cases. In one of these, the family of
patient CD5, the translocation fully segregated with the disease. The translocation
created GNAQ-RBFOX3 fusion genes on both derivative chromosomes, which
could not be validated in the patient’s cell line, as RBFOX3 is a neuron-specific
gene and is not expressed in lymphocytes. In theory, fusion genes, which are
commonly seen in several forms of cancer, may give rise to gain of function or
haploinsufficiency of individual genes, and could serve as important biomarkers
for diagnostic purposes. Both genes have important roles in neuronal growth and
development. GNAQ null mice displayed cerebellar ataxia and delayed growth of
neuronal cells, and RBFOX3 is an important regulator of alternative splicing of
exons during neuronal differentiation. Since the translocation was transmitted to
two affected sons with variable degrees of NDDs, this translocation
illustrates that variable phenotypic expression of a genetic mutation is a
common phenomenon in neurological disorders.
In one familial case, the inversion found in patient CD8 disrupted XIAP,
a gene that has been linked to XLP2 [MIM 300635](186, 187),
and was likely to be the cause of the immunodeficiency phenotype seen in this
patient. Although the inversion was inherited from a phenotypically normal mother,
this could be explained by X-inactivation in the mother. Moreover,
his monozygotic twin brother has identical symptoms, suggesting high penetrance
of the causative gene in this familial case. The remaining two familial cases
(patients CD10 and CD14) did not show complete segregation with the disease,
which further complicates the interpretation of the genes disrupted at the
chromosomal breakpoints. The translocation in patient CD10 affecting UNC5D, a
dependence receptor involved in axon guidance and neuronal migration, was
inherited from a phenotypically normal mother and is also present in his younger
sister, who displayed schizencephaly and polymicrogyria. These structural brain
abnormalities were not present in patient CD10, suggesting that additional genetic
factors may influence the phenotype. However, CNVs encompassing UNC5D are
enriched in DD cases and in the DECIPHER database,
indicating that disruption of this locus could be clinically relevant.
In patient CD14, two deletions were identified from the SV analysis. A telomeric
deletion at 7q36 that truncates the NCAPG2 gene and deletes three other genes
(VIPR2, WDR60 and ESYT2) was inherited from his asymptomatic father. The patient
harbored a second deletion in MCPH1, spanning exons 6 to 8, which was
inherited from his healthy mother. Initially, the interpretation of this case was
complicated by the apparently normal carrier parents, which cast doubt on the
relevance of the identified CNVs to the phenotype of this patient. Interestingly,
MCPH1 and NCAPG2 are known to interact during chromosome condensation
in preparation for mitosis, and the NCAPG2 binding site in MCPH1 is deleted
in this patient. Thus, the impaired interaction due to the two-hit deletions could
contribute to ID and severe microcephaly in this patient, as described
by Perche et al.(188)
Of the remaining three de novo cases, the inversion in patient CD9 did not
disrupt any coding genes or regulatory elements; thus the causality of the
inversion is not proven, which emphasizes the limitation of detecting candidate
genes solely through ABCR breakpoint cloning. In this case, point mutations
and exposure to environmental factors, which could not be
investigated with the technology applied in this study, might be contributing
factors to the patient’s phenotype. Alternatively, a position effect due to disruption
of regulatory elements that control neighboring gene expression and that have not
yet been characterized by ENCODE is another possibility; such effects have been
demonstrated in previous studies to contribute significantly to the phenotype.(189-193)
The remaining two de novo translocation cases disrupted potentially causative
genes, GTDC1 and MED13L. GTDC1 was identified as the only disrupted gene in
patient CD6 based on de novo SV analysis using parental data. GTDC1 has a
glycosyltransferase domain; most glycosyltransferases
catalyze the synthesis of the carbohydrate portions of glycoproteins, glycolipids
and proteoglycans. Mutations in glycosyltransferases have been associated with
congenital disorders of glycosylation (CDG [MIM 602579]). MED13L is a
Mediator complex subunit that plays a role in repression of RNA polymerase
II-dependent transcription. Structural rearrangements in MED13L, including
translocations and CNVs, have been reported in patients with craniofacial
anomalies and ID, suggesting a possible new syndrome due to
MED13L haploinsufficiency.
In summary, gene disruption in four out of seven (~57%) patients might
help explain the underlying phenotype of these patients. By combining
classical methods and recent technology, large-scale screening of chromosomal
rearrangements can be performed more rapidly, and reporting gene disruptions in
microscopically visible rearrangements is important for determining the clinically
relevant variants, especially in undiagnosed cases. Although the technology
continues to improve, validation with longer-read Sanger sequencing is still
necessary for better annotation of SVs.
6.2 LIMITATIONS OF DNA-PET SEQUENCING
DNA-PET sequencing has been shown to be powerful in finding candidate genes
by mapping chromosomal rearrangement breakpoints. However, this approach
is limited to the detection of large rearrangements and cannot detect
single-nucleotide variants. Highly repetitive sequences such as telomeric regions
are difficult to call (as shown in patient CD14), although this can be addressed by
extracting read-depth copy number information from the sequencing data, which
can facilitate detection of subtelomeric rearrangements. Furthermore, reliable
implementation of this technique in the clinic requires a more established
SV calling pipeline that allows rapid and simplified interpretation by
clinicians. For example, the turnaround time for DNA-PET sequencing, from
the patient’s sample collection to post-sequencing analysis, is about 1 month
per patient, compared to less than a week for karyotyping or
aCGH. In addition, a much larger study is needed to carefully define the
groups of patients that may benefit from DNA-PET sequencing analysis. Overall,
because of its turnaround time, this technique may not be suitable for patients who
require a fast and ‘actionable’ outcome or for prenatal diagnosis. On
the other hand, it could be applied to non-syndromic NDD patients who
have gone through extensive genetic testing without a conclusive diagnosis. These
patients may benefit from a genome-wide screening that could provide novel
insights into disruptive mutations associated with their phenotypes.
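The read-depth idea mentioned above can be sketched as follows. This is a generic illustration and not the pipeline used in this thesis: read counts per genomic bin are normalized to the sample median, compared with a control as a log2 ratio, and runs of consecutive low-ratio bins are reported as candidate deletions. The bin counts, the threshold of -0.6 and the minimum run length are all hypothetical choices.

```python
from math import log2
from statistics import median

def log2_ratios(patient_counts, control_counts, pseudocount=0.5):
    """Median-normalized log2 ratio of patient over control read counts per bin."""
    p_med = median(patient_counts)
    c_med = median(control_counts)
    return [
        log2(((p + pseudocount) / p_med) / ((c + pseudocount) / c_med))
        for p, c in zip(patient_counts, control_counts)
    ]

def deleted_runs(ratios, threshold=-0.6, min_bins=3):
    """Return (first_bin, last_bin) for runs of bins at or below the threshold."""
    runs, start = [], None
    for i, r in enumerate(ratios + [0.0]):  # sentinel closes a trailing run
        if r <= threshold and start is None:
            start = i
        elif r > threshold and start is not None:
            if i - start >= min_bins:
                runs.append((start, i - 1))
            start = None
    return runs

# Hypothetical counts: the last four bins mimic a heterozygous terminal deletion,
# where roughly half the reads (log2 ratio near -1) would be expected.
patient = [100, 98, 103, 101, 99, 102, 97, 104, 51, 49, 50, 52]
control = [100] * 12
candidate_deletions = deleted_runs(log2_ratios(patient, control))
print(candidate_deletions)
```

Because read depth does not rely on uniquely mapped tag pairs spanning a breakpoint, it can complement paired-end tag analysis in repetitive regions, although the bin size and thresholds would need to be tuned to the actual coverage.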
6.3 LARGE PHENOTYPIC SPECTRUM IN PATIENTS WITH
MED13L DISRUPTIONS
NDD syndromes can be classified based on known genetic etiology and
multiple overlapping clinical phenotypes. In this study, I described the large
phenotypic spectrum of patients with distinctive facial characteristics and ID
who share disruption of a single gene. Eight patients with mutations in MED13L
have been reported in the literature, and one is described in this study.
The clinical features of four patients largely overlap: two patients with
out-of-frame deletions, the first translocation case reported by Muncke (194), and
the patient described in this study. All of them displayed craniofacial
anomalies that coexisted with ID. Three patients carrying heterozygous
non-synonymous SNVs had cardiac defects (172), and two siblings with homozygous
non-synonymous SNVs displayed non-syndromic ID without a cardiac
defect (174). One patient carrying a triplication of the whole gene, which also
involved the neighboring gene MAP1LC3B2, showed a
similar phenotype including a cardiac defect, craniofacial
anomalies and muscular hypotonia. Overall, cardiovascular defects were present
in 7 out of 9 cases (78%).
Some of the patients reported in previous studies were initially screened and
tested negative for the 22q11.2 deletion, since their clinical features resembled a
well-described neurocristopathy, velocardiofacial syndrome (VCFS [MIM 192430])
or DiGeorge sequence (DGS [MIM 188400]), collectively VCFS/DS. This group of
disorders is caused by defects in the development of specific neural crest cells
that contribute to organs such as the thyroid, thymus and conotruncal septum of
the heart.(195) In the early classification of patients into VCFS/DS, some patients
were noted to have Pierre-Robin syndrome (~10%)(196, 197), and a cardiac defect
was not always present (~82% of the cases had cardiac anomalies)(198).
In comparison to the primary symptoms of VCFS/DS, patients with MED13L
haploinsufficiency displayed nearly identical craniofacial features such as
upslanting palpebral fissures, bulbous nose, flat nasal root, and various palatal
problems including macroglossia, micrognathia or Pierre-Robin sequence. In
addition, neurocognitive defects including SD, DD, LD or ID were
consistently seen in each individual with MED13L mutations, suggesting a
link between MED13L disruption and the neurodevelopmental phenotype in these
patients. A recent study described two additional patients with craniofacial
anomalies and ID but without heart defects, carrying de novo exonic deletions in
MED13L.(175) These observations support a reduced cardiac penetrance for
MED13L haploinsufficiency. Overall, these clinical data suggest a potentially
novel classification of patients with MED13L haploinsufficiency, which could
provide diagnostic value that may benefit patients in the future. Further screening
in a larger number of patients is required to carefully establish this diagnostic
classification.
6.4 MED13L HAPLOINSUFFICIENCY CONTRIBUTES TO
CRANIOFACIAL ANOMALIES AND ID
To investigate the relationship between MED13L disruption and its possible
involvement in neural crest defects, zebrafish was used to model craniofacial
cartilage structures, as genes controlling the development of the jaw,
pharyngeal arches, and midline of the skull are conserved between fish and
human.(199) Studies in this animal model have enhanced our understanding of the
mechanisms of craniofacial development. Indeed, many congenital anomalies
involving craniofacial dysmorphism have been modeled in zebrafish (177,
200-202).
I showed that the craniofacial cartilage defect caused by med13b loss of function
results from early migration defects of cranial NCCs. Since the craniofacial defects
coexist with ID in individuals with MED13L mutations, I attempted to address whether
loss of MED13L affects the process of neurodevelopment. By examining the expression
of neuronal markers in the zebrafish brain, I observed a lack of expression in
certain brain regions due to abnormal distribution of early differentiating neurons.
However, the findings obtained in human cells were not concordant with the zebrafish
data: they showed no remarkable differences in NPC and neuronal morphology,
neuronal differentiation efficiency, proliferation rate of NPCs, or
neuronal maturation. This could be due to the lack of positional identity of the
cell types differentiated from human ES cells. Further functional analysis of these
neurons might provide clues to the perturbed neurodevelopment that is consistently
seen in patients with MED13L haploinsufficiency.
Interestingly, the transcriptome profile of MED13L-deficient neurons showed
enrichment of genes important for cranial neural crest induction, such as SP8 and
FGFR3. SP8 encodes a member of the SP family of transcription factors that plays
a crucial role in the regulation of craniofacial development through the FGF
signaling pathway.(203) Among the differentially expressed genes, I showed that
components of the canonical Wnt pathway were dysregulated in MED13L-deficient
neuronal cells, suggesting a possible role of MED13L in Wnt signaling.
Canonical Wnt signaling is essential to a variety of biological processes, including
NCC induction, neurogenesis, cell proliferation, cell fate specification, and axis
patterning. This finding is consistent with previously published reports
describing interactions between other Mediator complex subunits and the
canonical Wnt signaling pathway in mouse, Drosophila and zebrafish studies.
A Med12 hypomorphic mouse mutant had impaired canonical Wnt
signaling.(204) In Drosophila, transcription of Wnt target genes by Pygopus
required recruitment of the Mediator complex through Med12 and Med13.(205)
Depletion of Med10, Med12 and Med13 enhanced Wnt
signaling in zebrafish morphants.(206) MED12 has been demonstrated to
physically interact with β-catenin, and Mediator was recruited to Wnt-responsive
genes in a β-catenin-dependent manner.(207) The data presented in this thesis
extend this involvement to MED13L, which, as part of the Mediator complex, may
interact with the Wnt pathway for transcription of specific genes; this has yet to
be investigated further. Together, these expression data derived from human neuronal
cultures provide mechanistic insight into the patient’s craniofacial defects, which
may arise from altered genes and pathways that are important for early cranial NCC
induction and specification.
6.5 MORPHOLOGICAL ALTERATIONS IN NEURONS DERIVED
FROM PATIENT’S IPSCS AND GTDC1-DEFICIENT CELLS
Since the breakthrough discovery by Yamanaka and colleagues in 2006,(146,
208) the number of studies using patient-derived iPSCs has grown rapidly over
the last few years, mainly in neurological and cardiac diseases. These iPSCs
exhibit transcriptional and epigenetic profiles similar to those of hESCs and could
also be used for non-invasive high-throughput drug screening or cell replacement
therapy. Despite the growing trend of using iPSCs to explore the cellular
phenotypes associated with diseases, there are still many issues to overcome before
iPSCs can be widely accepted in clinical applications. One of
the issues is the choice of an appropriate control for a fair comparison with the
patient’s phenotype, which is still highly debated in the community. Ideally, the
control could be derived from an unaffected family member. However, in NDDs, parents
or older healthy individuals are not ideal, especially for iPSCs derived from skin
fibroblasts, considering the likelihood of environmental exposures such as UV
radiation accumulated over the years in older individuals. On the other hand,
isogenic cell lines can be generated through currently available genome-editing
methods and serve as the most appropriate control for iPSC-based
disease modelling studies. Another issue is the variability between individual cell
lines arising during iPSC generation, which confounds the ability to distinguish
disease-associated alterations. Maherali and Hochedlinger therefore proposed several
criteria for considering iPSCs to be truly pluripotent, including morphological
resemblance to ESCs, self-renewal, low genome instability, expression of
pluripotency genes and teratoma formation.(209) Once the iPSCs have passed these
criteria, the abnormalities seen in the patient’s iPSCs could provide valuable
insight into disease mechanisms and may lead to the development of diagnostic
tools and drugs for early intervention.
In this study, the identification of GTDC1 in a de novo translocation by trio
sequencing provided a compelling candidate to follow up in functional studies,
considering it was the only de novo candidate gene from DNA-PET screening. De
novo mutations in sporadic NDD cases are known to contribute to disease
etiology and have been studied for many years. The availability of the patient’s
fibroblasts enabled somatic reprogramming into iPSCs to further extend the
understanding of the disease mechanism in the patient’s own genetic
background. Pluripotency assays of the patient’s iPSCs showed effective
reprogramming, with expression of pluripotency markers and a karyotype carrying
the same translocation. To investigate the disease state in relevant cell types,
the patient’s iPSCs were differentiated into NPCs and neurons, in parallel with
GTDC1-depleted cells and control iPSCs. The first striking observation was the
slower proliferation rate of the NPCs derived from the patient’s iPSCs and shGTDC1
cells, although the effect was moderate in shGTDC1 cells. Previous iPSC studies in
Rett syndrome(210), Timothy syndrome(211, 212) or FXS(213) did not show any
differences in cell-cycle length between patient cells and controls. The slower
proliferation observed here could be
explained by the transcriptome profile of downregulated genes, which showed
enrichment of GO terms in cell growth and proliferation, including Cyclin E1
(CCNE1), Cyclin D1 (CCND1), and Cyclin D2 (CCND2), which are important
regulators of the transitions between cell-cycle phases. The
slower proliferation rate at the NPC stage might underlie the lower maturation status
of differentiated neurons, as shown by the lower percentage of
early neurons that became mature neurons in the patient’s and shGTDC1 cells.
Additionally, several morphological alterations were observed in both the
patient’s and shGTDC1 neuronal cells, including reduced soma size and
a decreased number of neurites. Reduced soma size has also been reported
previously in Rett syndrome iPSCs, which was supported by prior observations in
post-mortem brain tissue of Rett patients. Post-mortem brain studies in other
NDDs have also shown reduced neuronal soma size in the cortex of
schizophrenia and autism patients. A decrease in the size of the
neuronal soma is thought to affect the projection of output signals and the
packing density of individual neurons within a brain region. The second
morphological alteration in the patient’s and shGTDC1 cells is the decreased number
of neurites, used as a measure of neurite outgrowth in these cells. A reduced
number of neurites could significantly affect connectivity in the brain, as
information is transmitted between neurons through
synapses, which form on neurites extending from individual neurons. Furthermore,
the enriched GO terms in the downregulated genes of both patient and shGTDC1
neurons are involved in neuritogenesis, morphology of the nervous system and
organization of the cytoskeleton, which are biological processes important for the
formation of neurites and dendrites. Together, these data show profound
morphological changes that could be explained by the transcriptome profile of
these cells, and the findings in the patient’s cells show a trend similar
to that in the shGTDC1 cells, suggesting that the morphological phenotypes
could be due to disruption of GTDC1.
Because GTDC1 is a glycosyltransferase and mutations in glycosyltransferases
are known to cause CDG syndromes(214-216), I further sought to determine
whether GTDC1 disruption could lead to a novel CDG syndrome by assessing
glycosylation status through ICAM1 expression in the patient’s cells.(217)
A distinct staining pattern of ICAM1 was observed in the patient’s cells, with
diffuse staining in the cytoplasm rather than at the plasma membrane as in the
control cells. A previous study reporting the use of ICAM1 in CDG patients showed
complete absence of ICAM1 in the patients’ fibroblasts. However, shGTDC1
NPCs did not show a staining pattern similar to that of the patient’s cells; instead,
the protein localized to the plasma membrane in a zipper-like pattern. This
discrepancy could be due to the genetic or epigenetic background of the patient’s
cells, compared to the specific disruption of GTDC1 in the knockdown cells.
Alternatively, mislocalized ICAM1 expression in the patient’s cells could be due
to dominant-negative protein expression from the mutated allele. Additional
experiments should be performed to test this hypothesis, such as overexpressing a
truncated form of GTDC1 protein in H1-NPCs and checking ICAM1 localization.
Together, these data show the possible involvement of GTDC1 in this patient’s
neurodevelopmental phenotype. Mutation screening to identify additional patients and
further functional assays are required to establish its potential delineation as a
novel mild form of CDG syndrome, considering that the phenotype is limited to the
nervous system rather than affecting multiple organ systems.
CHAPTER 7: CONCLUSION
The work presented in this thesis consisted of three major findings:
- The genome screening conducted in seven patients with prior cytogenetic
information led to the identification of eight genes disrupted in the
chromosomal breakpoint regions, and one candidate gene detected from
private SVs of one patient that had not been detected by karyotyping.
These eight genes are expressed in the brain and have
functions relevant to neurodevelopmental processes. In
comparison to the laborious chromosome walking method by FISH to map
breakpoints, which would take 6-12 months, DNA-PET sequencing provided a
more straightforward detection of breakpoints, so that faster validation by
PCR or FISH can be achieved using the precise coordinates derived from
sequence data.
- The clinical evaluation of patients with MED13L disruptions
showed a consistent phenotype, including craniofacial anomalies,
cognitive deficits and, in some cases, cardiovascular defects. Loss-of-function
assays in med13b, the zebrafish orthologue of MED13L, provided evidence of an
early migration defect of cranial NCCs that subsequently resulted in
craniofacial cartilage anomalies. Transcriptome profiling of MED13L-deficient
neurons showed expression changes in cranial NCC genes and
components of canonical Wnt signaling, suggesting that MED13L alteration
could affect the transcriptional regulation of genes important for
developmental processes.
- The characterization of GTDC1 as the only disrupted gene based on
sequencing of a trio provided a possible correlation with the neurodevelopmental
phenotype seen in the patient. Further functional assays in neurons derived
from the patient’s iPSCs, with parallel comparison to GTDC1 knockdown cells,
showed defects affecting the proliferation rate of progenitor
cells, maturation efficiency, neuronal soma size and neurite outgrowth. These
data suggest that GTDC1 may underlie
neurodevelopmental disturbances. Although disruptions or mutations in this
gene have not been reported in databases of normal individuals, description of
additional patients with GTDC1 mutations is required to further strengthen
the hypothesis that this gene is causative for idiopathic NDDs.
In summary, the work presented in this thesis shows how recent technological
advances have improved the evaluation of chromosomal rearrangements and
have thereby allowed rapid identification of new developmental genes, with a
significant impact on the understanding of the pathophysiology of ID
and NDDs. These findings are clinically significant in terms of the discovery of
novel candidate genes in ABCR and the potential involvement of MED13L and
GTDC1 in the underlying pathology of individuals carrying mutations in these
genes.
CHAPTER 8: FUTURE DIRECTIONS
In the discovery part of this thesis, I demonstrated the application of PET
sequencing as an alternative to WGS, and its efficacy for identification of large
SVs. Before this technique can be implemented in the clinic, several issues have
to be addressed, including the requirement for large amounts of DNA (40 µg),
coverage variability of sequencing machines, the filtering pipeline, interpretation
of variants, turnaround time and cost-effectiveness. This study used a SOLiD
sequencing machine that required a high amount of genomic DNA. During the course
of this study, improvements to the protocol have been implemented in the institute:
a lower DNA input (5 µg), sequencing on the Illumina platform, and an improved
algorithm for sequence mapping to the reference genome and variant calling. As
the sequencing cost continues to drop, DNA-PET is potentially suited for
the diagnostic workup of unexplained NDDs. However, the
turnaround time for DNA-PET sequencing (~1 month) is much longer than that of
G-banding karyotyping (2-3 days) and aCGH (4 days), which is the major limiting
factor of DNA-PET for a detailed genetic assessment. In addition, the cost of trio
sequencing is currently still very high (SGD 5,000 per individual), suggesting that
certain parameters will need to be considered before its potential application in
diagnostic settings.
In the second part of this thesis, I assessed the overlapping clinical phenotypes
among patients with MED13L disruptions, reflecting the potential delineation of a
novel clinical entity. Although it is clear that these patients resemble each other
due to MED13L disruption, additional screening for MED13L
mutations in a large number of cases would need to be performed to carefully
delineate the novel classification. In addition, further functional studies of MED13L
can be pursued to strengthen the current hypothesis. One approach would be to use
genome editing in zebrafish to generate stable
mutant lines, instead of the transient knockdown induced by
morpholinos. Several genome-editing methods are currently available, such
as TALENs, zinc-finger nucleases and the CRISPR/Cas9 system. In this way, a
comparison between morphants and mutant fish could show whether the mutant
lines display similar defects in craniofacial development and neuronal
distribution in the brain. Further investigation of Wnt or FGF pathway activity
through luciferase assays in zebrafish and neuronal cells could provide insight
into downstream effects in MED13L-deficient cells, and could facilitate
cross-validation in different cell types. Characterization of the interactions of
MED13L with the Wnt pathway, considering its functions in various biological
processes, would be important to follow up. This could be done by
injecting morpholinos or genome-editing nucleases into zebrafish transgenic lines
for Wnt pathway components, which could help to directly visualize the distribution
of Wnt-expressing cells in live embryos.
In addition, neurons derived from H1-derived NPCs upon MED13L knockdown
did not show morphological alterations, and further studies would be required to
validate its involvement in neurogenesis. In the transgenic zebrafish
model, the post-mitotic neurons were not proportionally distributed across
brain regions of med13b MO embryos. Thus, migration assays in MED13L-deficient
neuronal cells could be performed by time-lapse imaging of neurite
extension from neurons plated at a very low density. Alternatively, since
MED13L is expressed in the cerebellum, it would be interesting to modify the
differentiation protocol to generate cerebellar neurons. The correct cell type could
theoretically provide a less biased view of the mechanism than the currently applied
general differentiation protocol.
In the last part of the thesis, I showed several morphological alterations in neurons
derived from the patient’s iPSCs and GTDC1 knockdown cells. However, these are
morphological observations and do not provide insight into the
neuronal activity of these cells. Electrophysiological studies are required to
measure the firing of action potentials, an important feature of mature,
electrophysiologically active neurons. The patient’s iPSC-derived neurons
and GTDC1 knockdown cells have to be depolarized using sodium (Na) or
calcium (Ca) prior to measurement of the spontaneous excitatory and inhibitory
postsynaptic currents. Furthermore, rescue experiments are necessary to establish
the causality of GTDC1 by testing whether the phenotypic defects seen in the
patient’s cells can be restored. So far, none of the iPSC studies in NDDs has used
a rescue experiment involving the introduction of an intact copy of the gene to
restore the neuronal phenotype; rather, they used pharmacological agents such as
antibiotics to read through the premature stop codon of nonsense mutations, or IGF-1
treatment, which demonstrated a correction of the neuronal phenotype. It is unclear
whether these studies attempted to overexpress the affected gene, given the
well-defined genetic etiology of these syndromes. In the case of GTDC1, where only
a single patient with a mutation has been identified, a rescue experiment is
required to build a stronger case for its involvement in NDDs. This could be
performed by overexpressing GTDC1 in the patient’s NPCs and testing whether the
neuronal morphological phenotypes can be restored. This study highlights the
importance of combining genome screening with functional studies in animal models
and iPSCs to better understand disease mechanisms at the cellular level. The
potential use of this combined approach in clinical genetics would help to provide
insight into the pathogenesis of rare diseases.
BIBLIOGRAPHY
1. H. Leonard, X. Wen, The epidemiology of mental retardation: challenges and
opportunities in the new millennium. Ment Retard Dev Disabil Res Rev 8, 117-
134 (2002).
2. A. Rauch et al., Diagnostic yield of various genetic approaches in patients with
unexplained developmental delay or mental retardation. Am J Med Genet A 140,
2063-2074 (2006).
3. M. Yeargin-Allsopp, C. Boyle, Overview: the epidemiology of
neurodevelopmental disorders. Ment Retard Dev Disabil Res Rev 8, 113-116
(2002).
4. A. P. Mullin et al., Neurodevelopmental disorders: mechanisms and boundary
definitions from genomes, interactomes and proteomes. Transl Psychiatry 3,
e329 (2013).
5. J. D. Picker, C. A. Walsh, New innovations: therapeutic opportunities for
intellectual disabilities. Ann Neurol 74, 382-390 (2013).
6. K. S. Kendler, A history of the DSM-5 Scientific Review Committee. Psychol
Med 43, 1793-1800 (2013).
7. J. B. Moeschler, Genetic evaluation of intellectual disabilities. Semin Pediatr
Neurol 15, 2-9 (2008).
8. J. B. Moeschler, Medical genetics diagnostic evaluation of the child with global
developmental delay or intellectual disability. Curr Opin Neurol 21, 117-122
(2008).
9. L. Salvador-Carulla et al., Intellectual developmental disorders: towards a new
name, definition and framework for "mental retardation/intellectual disability" in
ICD-11. World Psychiatry 10, 175-180 (2011).
10. P. A. Silva, S. Williams, R. McGee, A longitudinal study of children with
developmental language delay at age three: later intelligence, reading and
behaviour problems. Dev Med Child Neurol 29, 630-640 (1987).
11. P. Shetty, Speech and language delay in children: a review and the role of a
pediatric dentist. J Indian Soc Pedod Prev Dent 30, 103-108 (2012).
12. J. Coplan, J. R. Gleason, R. Ryan, M. G. Burke, M. L. Williams, Validation of an
early language milestone scale in a high-risk population. Pediatrics 70, 677-683
(1982).
13. S. A. Graham, S. E. Fisher, Decoding the genetics of speech and language. Curr
Opin Neurobiol 23, 43-51 (2013).
14. C. J. Curry et al., Evaluation of mental retardation: recommendations of a
Consensus Conference: American College of Medical Genetics. Am J Med Genet
72, 468-477 (1997).
15. A. Battaglia, J. C. Carey, Diagnostic evaluation of developmental delay/mental
retardation: An overview. Am J Med Genet C Semin Med Genet 117C, 3-14
(2003).
16. L. G. Shaffer, American College of Medical Genetics Professional Practice and
Guidelines Committee, American College of Medical Genetics guideline on the
cytogenetic evaluation of the individual with developmental delay or mental
retardation. Genet Med 7, 650-654 (2005).
17. H. van Bokhoven, Genetic and epigenetic networks in intellectual disabilities.
Annu Rev Genet 45, 81-104 (2011).
18. A. Perez et al., Long-term neurodevelopmental outcome with hypoxic-ischemic
encephalopathy. J Pediatr 163, 454-459 (2013).
19. J. H. Williams, L. Ross, Consequences of prenatal toxin exposure for mental
health in children and adolescents: a systematic review. Eur Child Adolesc
Psychiatry 16, 243-253 (2007).
20. F. V. O'Callaghan, M. O'Callaghan, J. M. Najman, G. M. Williams, W. Bor,
Prenatal alcohol exposure and attention, learning and intellectual ability at 14
years: a prospective longitudinal study. Early Hum Dev 83, 115-123 (2007).
21. M. Ernst, E. T. Moolchan, M. L. Robinson, Behavioral and neural consequences
of prenatal exposure to nicotine. J Am Acad Child Adolesc Psychiatry 40, 630-
641 (2001).
22. B. Eskenazi, R. Castorina, Association of prenatal maternal or postnatal child
environmental tobacco smoke exposure and neurodevelopmental and behavioral
problems in children. Environ Health Perspect 107, 991-1000 (1999).
23. Folic acid to prevent neural tube defects. Lancet 338, 505-506 (1991).
24. G. Anneren, [Folic acid therapy recommended for women with increased risk of
giving birth to children with neural tube defects]. Lakartidningen 88, 4110
(1991).
25. A. E. Czeizel, I. Dudas, Prevention of the first occurrence of neural-tube defects
by periconceptional vitamin supplementation. N Engl J Med 327, 1832-1835
(1992).
26. H. Stolp, A. Neuhaus, R. Sundramoorthi, Z. Molnar, The Long and the Short of
it: Gene and Environment Interactions During Early Cortical Development and
Consequences for Long-Term Neurological Disease. Front Psychiatry 3, 50
(2012).
27. P. Krakowiak et al., Maternal metabolic conditions and risk for autism and other
neurodevelopmental disorders. Pediatrics 129, e1121-1128 (2012).
28. D. P. Laplante et al., Stress during pregnancy affects general intellectual and
language functioning in human toddlers. Pediatr Res 56, 400-410 (2004).
29. P. Boksa, Effects of prenatal infection on brain development and behavior: a
review of findings from animal models. Brain Behav Immun 24, 881-897 (2010).
30. H. O. Atladottir et al., Maternal infection requiring hospitalization during
pregnancy and autism spectrum disorders. J Autism Dev Disord 40, 1423-1430
(2010).
31. K. Cheslack-Postava, K. Liu, P. S. Bearman, Closely spaced pregnancies are
associated with increased odds of autism in California sibling births. Pediatrics
127, 246-253 (2011).
32. L. Gunawardana et al., Pre-conception inter-pregnancy interval and risk of
schizophrenia. Br J Psychiatry 199, 338-339 (2011).
33. L. Smits, C. Pedersen, P. Mortensen, J. van Os, Association between short birth
intervals and schizophrenia in the offspring. Schizophr Res 70, 49-56 (2004).
34. N. Gunnes et al., Interpregnancy interval and risk of autistic disorder.
Epidemiology 24, 906-912 (2013).
35. A. Kong et al., Rate of de novo mutations and the importance of father's age to
disease risk. Nature 488, 471-475 (2012).
36. J. J. Michaelson et al., Whole-genome sequencing in autism identifies hot spots
for de novo germline mutation. Cell 151, 1431-1442 (2012).
37. L. S. Penrose, Eugenic prognosis with respect to mental deficiency. Eugen Rev
31, 35-37 (1939).
38. J. W. Ellison, J. A. Rosenfeld, L. G. Shaffer, Genetic basis of intellectual
disability. Annu Rev Med 64, 441-450 (2013).
39. L. E. Vissers et al., A de novo paradigm for mental retardation. Nat Genet 42,
1109-1112 (2010).
40. A. C. Need et al., Clinical application of exome sequencing in undiagnosed
genetic conditions. J Med Genet 49, 353-361 (2012).
41. J. R. Lupski, New mutations and intellectual function. Nat Genet 42, 1036-1038
(2010).
42. P. N. Robinson, Whole-exome sequencing for finding de novo mutations in
sporadic mental retardation. Genome Biol 11, 144 (2010).
43. K. Huang, De novo paradigm: the ultimate answer to the paradox in mental
retardation? Clin Genet 79, 427-428 (2011).
44. S. J. Funderburk, M. A. Spence, R. S. Sparkes, Mental retardation associated
with "balanced" chromosome rearrangements. Am J Hum Genet 29, 136-141
(1977).
45. M. Bugge et al., Disease associated balanced chromosome rearrangements: a
resource for large scale genotype-phenotype delineation in man. J Med Genet 37,
858-865 (2000).
46. D. Warburton, De novo balanced chromosome rearrangements and extra marker
chromosomes identified at prenatal diagnosis: clinical significance and
distribution of breakpoints. Am J Hum Genet 49, 995-1013 (1991).
47. B. P. Coe, S. Girirajan, E. E. Eichler, A genetic model for neurodevelopmental
disease. Curr Opin Neurobiol 22, 829-836 (2012).
48. B. P. Coe, S. Girirajan, E. E. Eichler, The genetic variability and commonality of
neurodevelopmental disease. Am J Med Genet C Semin Med Genet 160C, 118-
129 (2012).
49. M. B. Ramocki, H. Y. Zoghbi, Failure of neuronal homeostasis results in
common neuropsychiatric phenotypes. Nature 455, 912-918 (2008).
50. F. M. Mikhail et al., Clinically relevant single gene or intragenic deletions
encompassing critical neurodevelopmental genes in patients with developmental
delay, mental retardation, and/or autism spectrum disorders. Am J Med Genet A
155A, 2386-2396 (2011).
51. S. Le Scouarnec, S. M. Gribble, Characterising chromosome rearrangements:
recent technical advances in molecular cytogenetics. Heredity (Edinb) 108, 75-85
(2012).
52. E. M. Ruiter et al., Pure subtelomeric microduplications as a cause of mental
retardation. Clin Genet 72, 362-368 (2007).
53. J. Flint, S. Knight, The use of telomere probes to investigate submicroscopic
rearrangements associated with mental retardation. Curr Opin Genet Dev 13,
310-316 (2003).
54. M. Bourgeois, M. Benezech, [Cytogenic survey of 600 mentally retarded
hospitalized patients]. Encephale 3, 189-202 (1977).
55. Y. Kodama, Cytogenetic and dermatoglyphic studies on severely handicapped
patients in an institution. Acta Med Okayama 36, 383-397 (1982).
56. K. Rasmussen, J. Nielsen, G. Dahl, The prevalence of chromosome abnormalities
among mentally retarded persons in a geographically delimited area of Denmark.
Clin Genet 22, 244-255 (1982).
57. K. D. Wuu, S. W. Wuu, I. W. Liu, A cytogenetic survey of mentally retarded
children in Taiwan: final report on the incidence of chromosome abnormalities.
Proc Natl Sci Counc Repub China B 8, 83-88 (1984).
58. K. H. Gustavson, G. Holmgren, H. K. Blomquist, Chromosomal aberrations in
mildly mentally retarded children in a northern Swedish county. Ups J Med Sci
Suppl 44, 165-168 (1987).
59. S. Srsen, N. Misovicova, K. Srsnova, J. Volna, [Chromosome aberrations in a
group of mentally retarded persons]. Cesk Psychiatr 85, 9-16 (1989).
60. K. D. Wuu et al., Chromosomal and biochemical screening on mentally retarded
school children in Taiwan. Jinrui Idengaku Zasshi 36, 267-274 (1991).
61. C. E. Schwartz et al., Fragile X syndrome: incidence, clinical and cytogenetic
findings in the black and white populations of South Carolina. Am J Med Genet
30, 641-654 (1988).
62. D. T. Miller et al., Consensus statement: chromosomal microarray is a first-tier
clinical diagnostic test for individuals with developmental disabilities or
congenital anomalies. Am J Hum Genet 86, 749-764 (2010).
63. P. N. Ray et al., Cloning of the breakpoint of an X;21 translocation associated
with Duchenne muscular dystrophy. Nature 318, 672-675 (1985).
64. W. J. Muir et al., Direct microdissection and microcloning of a translocation
breakpoint region, t(1;11)(q42.2;q21), associated with schizophrenia. Cytogenet
Cell Genet 70, 35-40 (1995).
65. J. K. Millar et al., Disruption of two novel genes by a translocation co-
segregating with schizophrenia. Hum Mol Genet 9, 1415-1423 (2000).
66. J. Chelly et al., Isolation of a candidate gene for Menkes disease that encodes a
potential heavy metal binding protein. Nat Genet 3, 14-19 (1993).
67. V. Verga et al., Localization of the translocation breakpoint in a female with
Menkes syndrome to Xq13.2-q13.3 proximal to PGK-1. Am J Hum Genet 48,
1133-1138 (1991).
68. D. Pinkel et al., High resolution analysis of DNA copy number variation using
comparative genomic hybridization to microarrays. Nat Genet 20, 207-211
(1998).
69. J. R. Lupski, P. Stankiewicz, Genomic disorders: molecular mechanisms for
rearrangements and conveyed phenotypes. PLoS Genet 1, e49 (2005).
70. R. Redon et al., Global variation in copy number in the human genome. Nature
444, 444-454 (2006).
71. D. F. Conrad et al., Origins and functional impact of copy number variation in
the human genome. Nature 464, 704-712 (2010).
72. G. M. Cooper et al., A copy number variation morbidity map of developmental
delay. Nat Genet 43, 838-846 (2011).
73. E. B. Kaminsky et al., An evidence-based approach to establish the functional
and clinical significance of copy number variants in intellectual and
developmental disabilities. Genet Med 13, 777-784 (2011).
74. L. E. Vissers, B. B. de Vries, J. A. Veltman, Genomic microarrays in mental
retardation: from copy number variation to gene, from research to diagnosis. J
Med Genet 47, 289-297 (2010).
75. E. M. Morrow, Genomic copy number variation in disorders of cognitive
development. J Am Acad Child Adolesc Psychiatry 49, 1091-1104 (2010).
76. L. A. Flore, J. M. Milunsky, Updates in the genetic evaluation of the child with
global developmental delay or intellectual disability. Semin Pediatr Neurol 19,
173-180 (2012).
77. F. Zahir, J. M. Friedman, The impact of array genomic hybridization on mental
retardation research: a review of current technologies and their clinical utility.
Clin Genet 72, 271-287 (2007).
78. R. Hochstenbach et al., Array analysis and karyotyping: workflow consequences
based on a retrospective study of 36,325 patients with idiopathic developmental
delay in the Netherlands. Eur J Med Genet 52, 161-169 (2009).
79. D. A. Koolen et al., Clinical and molecular delineation of the 17q21.31
microdeletion syndrome. J Med Genet 45, 710-720 (2008).
80. T. Y. Tan et al., Phenotypic expansion and further characterisation of the
17q21.31 microdeletion syndrome. J Med Genet 46, 480-489 (2009).
81. E. Klopocki et al., A further case of the recurrent 15q24 microdeletion syndrome,
detected by array CGH. Eur J Pediatr 167, 903-908 (2008).
82. A. W. El-Hattab et al., Redefined genomic architecture in 15q24 directed by
patient deletion/duplication breakpoint mapping. Hum Genet 126, 589-602
(2009).
83. N. Brunetti-Pierri et al., Recurrent reciprocal 1q21.1 deletions and duplications
associated with microcephaly or macrocephaly and developmental and
behavioral abnormalities. Nat Genet 40, 1466-1471 (2008).
84. H. C. Mefford et al., Recurrent rearrangements of chromosome 1q21.1 and
variable pediatric phenotypes. N Engl J Med 359, 1685-1699 (2008).
85. A. J. Sharp et al., A recurrent 15q13.3 microdeletion syndrome associated with
mental retardation and seizures. Nat Genet 40, 322-328 (2008).
86. B. W. van Bon et al., Further delineation of the 15q13 microdeletion and
duplication syndromes: a clinical spectrum varying from non-pathogenic to a
severe outcome. J Med Genet 46, 511-523 (2009).
87. F. D. Hannes et al., Recurrent reciprocal deletions and duplications of 16p13.11:
the deletion is a risk factor for MR/MCA while the duplication may be a rare
benign variant. J Med Genet 46, 223-232 (2009).
88. R. Ullmann et al., Array CGH identifies reciprocal 16p13.1 duplications and
deletions that predispose to autism and/or mental retardation. Hum Mutat 28,
674-682 (2007).
89. J. Baptista et al., Breakpoint mapping and array CGH in translocations:
comparison of a phenotypically normal and an abnormal cohort. Am J Hum
Genet 82, 927-936 (2008).
90. S. Girirajan et al., Phenotypic heterogeneity of genomic disorders and rare copy-
number variants. N Engl J Med 367, 1321-1331 (2012).
91. T. Langaee, M. Ronaghi, Genetic variation analyses by Pyrosequencing. Mutat
Res 573, 96-102 (2005).
92. J. O. Korbel et al., Paired-end mapping reveals extensive structural variation in
the human genome. Science 318, 420-426 (2007).
93. O. Morozova, M. A. Marra, Applications of next-generation sequencing
technologies in functional genomics. Genomics 92, 255-264 (2008).
94. M. E. Hurles, E. T. Dermitzakis, C. Tyler-Smith, The functional impact of
structural variation in humans. Trends Genet 24, 238-245 (2008).
95. J. M. Kidd et al., Mapping and sequencing of structural variation from eight
human genomes. Nature 453, 56-64 (2008).
96. W. Chen et al., Breakpoint analysis of balanced chromosome rearrangements by
next-generation paired-end sequencing. Eur J Hum Genet 18, 539-543 (2010).
97. M. E. Talkowski et al., Next-generation sequencing strategies enable routine
detection of balanced chromosome rearrangements for clinical diagnostics and
genetic research. Am J Hum Genet 88, 469-481 (2011).
98. M. E. Talkowski et al., Sequencing chromosomal abnormalities reveals
neurodevelopmental loci that confer risk across diagnostic boundaries. Cell 149,
525-537 (2012).
99. W. P. Kloosterman et al., Chromothripsis as a mechanism driving complex de
novo structural rearrangements in the germline. Hum Mol Genet 20, 1916-1924
(2011).
100. W. P. Kloosterman et al., Constitutional chromothripsis rearrangements involve
clustered double-stranded DNA breaks and nonhomologous repair mechanisms.
Cell Rep 1, 648-655 (2012).
101. C. Gilissen et al., Genome sequencing identifies major causes of severe
intellectual disability. Nature 511, 344-347 (2014).
102. J. de Ligt et al., Diagnostic exome sequencing in persons with severe intellectual
disability. N Engl J Med 367, 1921-1929 (2012).
103. S. J. Sanders et al., De novo mutations revealed by whole-exome sequencing are
strongly associated with autism. Nature 485, 237-241 (2012).
104. G. J. Lyon, K. Wang, Identifying disease mutations in genomic medicine
settings: current challenges and how to accelerate progress. Genome Med 4, 58
(2012).
105. R. C. Green et al., ACMG recommendations for reporting of incidental findings
in clinical exome and genome sequencing. Genet Med 15, 565-574 (2013).
106. E. Courchesne, R. Carper, N. Akshoomoff, Evidence of brain overgrowth in the
first year of life in autism. JAMA 290, 337-344 (2003).
107. G. Dawson et al., Rate of head growth decelerates and symptoms worsen in the
second year of life in autism. Biol Psychiatry 61, 458-464 (2007).
108. Y. A. Dementieva et al., Accelerated head growth in early development of
individuals with autism. Pediatr Neurol 32, 102-108 (2005).
109. H. C. Hazlett et al., Magnetic resonance imaging and head circumference study
of brain size in autism: birth through age 2 years. Arch Gen Psychiatry 62, 1366-
1376 (2005).
110. B. F. Sparks et al., Brain structural abnormalities in young children with autism
spectrum disorder. Neurology 59, 184-192 (2002).
111. M. W. Mosconi et al., Longitudinal study of amygdala volume and joint attention
in 2- to 4-year-old children with autism. Arch Gen Psychiatry 66, 509-516
(2009).
112. C. M. Schumann et al., Longitudinal magnetic resonance imaging study of
cortical development through early childhood in autism. J Neurosci 30, 4419-
4427 (2010).
113. M. R. Herbert et al., Brain asymmetries in autism and developmental language
disorder: a nested whole-brain analysis. Brain 128, 213-226 (2005).
114. E. Courchesne et al., Unusual brain growth patterns in early life in patients with
autistic disorder: an MRI study. Neurology 57, 245-254 (2001).
115. R. A. Carper, P. Moses, Z. D. Tigue, E. Courchesne, Cerebral lobes in autism:
early hyperplasia and abnormal age effects. Neuroimage 16, 1038-1051 (2002).
116. E. Courchesne, K. Pierce, Brain overgrowth in autism during a critical time in
development: implications for frontal pyramidal neuron and interneuron
development and connectivity. Int J Dev Neurosci 23, 153-170 (2005).
117. E. Courchesne, Brain development in autism: early overgrowth followed by
premature arrest of growth. Ment Retard Dev Disabil Res Rev 10, 106-111
(2004).
118. E. Courchesne, K. Campbell, S. Solso, Brain growth across the life span in
autism: age-specific changes in anatomical pathology. Brain Res 1380, 138-145
(2011).
119. E. Courchesne et al., Neuron number and size in prefrontal cortex of children
with autism. JAMA 306, 2001-2010 (2011).
120. M. Santos et al., Von Economo neurons in autism: a stereologic study of the
frontoinsular cortex in children. Brain Res 1380, 206-217 (2011).
121. E. R. Whitney, T. L. Kemper, M. L. Bauman, D. L. Rosene, G. J. Blatt,
Cerebellar Purkinje cells are reduced in a subpopulation of autistic brains: a
stereological experiment using calbindin-D28k. Cerebellum 7, 406-416 (2008).
122. E. R. Whitney, T. L. Kemper, D. L. Rosene, M. L. Bauman, G. J. Blatt, Density
of cerebellar basket and stellate cells in autism: evidence for a late developmental
loss of Purkinje cells. J Neurosci Res 87, 2245-2254 (2009).
123. T. L. Kemper, M. L. Bauman, The contribution of neuropathologic studies to the
understanding of autism. Neurol Clin 11, 175-187 (1993).
124. C. M. Schumann, D. G. Amaral, Stereological analysis of amygdala neuron
number in autism. J Neurosci 26, 7674-7679 (2006).
125. I. Oberle et al., Instability of a 550-base pair DNA segment and abnormal
methylation in fragile X syndrome. Science 252, 1097-1102 (1991).
126. S. Yu et al., Fragile X genotype characterized by an unstable region of DNA.
Science 252, 1179-1181 (1991).
127. K. M. Huber, S. M. Gallagher, S. T. Warren, M. F. Bear, Altered synaptic
plasticity in a mouse model of fragile X mental retardation. Proc Natl Acad Sci U
S A 99, 7746-7750 (2002).
128. M. F. Bear, K. M. Huber, S. T. Warren, The mGluR theory of fragile X mental
retardation. Trends Neurosci 27, 370-377 (2004).
129. A. Michalon et al., Chronic pharmacological mGlu5 inhibition corrects fragile X
in adult mice. Neuron 74, 49-56 (2012).
130. C. H. Choi et al., Pharmacological reversal of synaptic plasticity deficits in the
mouse model of fragile X syndrome by group II mGluR antagonist or lithium
treatment. Brain Res 1380, 106-119 (2011).
131. D. D. Krueger, M. F. Bear, Toward fulfilling the promise of molecular medicine
in fragile X syndrome. Annu Rev Med 62, 411-429 (2011).
132. E. W. Fish et al., Changes in sensitivity of reward and motor behavior to
dopaminergic, glutamatergic, and cholinergic drugs in a mouse model of fragile x
syndrome. PLoS One 8, e77896 (2013).
133. S. Jacquemont et al., Epigenetic modification of the FMR1 gene in fragile X
syndrome is associated with differential response to the mGluR5 antagonist
AFQ056. Sci Transl Med 3, 64ra61 (2011).
134. F. Yao et al., Long span DNA paired-end-tag (DNA-PET) sequencing strategy
for the interrogation of genomic structural mutations and fusion-point-guided
reconstruction of amplicons. PLoS One 7, e46152 (2012).
135. M. J. Fullwood, C. L. Wei, E. T. Liu, Y. Ruan, Next-generation DNA sequencing
of paired-end tags (PET) for transcriptome and genome analyses. Genome Res
19, 521-532 (2009).
136. A. M. Hillmer et al., Comprehensive long-span paired-end-tag mapping reveals
characteristic patterns of structural variations in epithelial cancer genomes.
Genome Res 21, 665-675 (2011).
137. K. Inaki et al., Transcriptional consequences of genomic structural aberrations in
breast cancer. Genome Res 21, 676-687 (2011).
138. K. P. Ng et al., A common BIM deletion polymorphism mediates intrinsic
resistance and inferior responses to tyrosine kinase inhibitors in cancer. Nat Med,
(2012).
139. P. Lichter, S. A. Ledbetter, D. H. Ledbetter, D. C. Ward, Fluorescence in situ
hybridization with Alu and L1 polymerase chain reaction probes for rapid
characterization of human chromosomes in hybrid cell lines. Proc Natl Acad Sci
U S A 87, 6634-6638 (1990).
140. D. Cherif, J. Derre, R. Berger, Fluorescence in situ hybridization on metaphase
chromosomes with biotinylated probes. In situ hybridization, biotin labeling,
cosmids, gene mapping, oncogene amplification. Nouv Rev Fr Hematol 32, 459-
460 (1990).
141. R. E. Mills et al., Mapping copy number variation by population-scale genome
sequencing. Nature 470, 59-65 (2011).
142. B. E. Bernstein et al., An integrated encyclopedia of DNA elements in the human
genome. Nature 489, 57-74 (2012).
143. R. E. Thurman et al., The accessible chromatin landscape of the human genome.
Nature 489, 75-82 (2012).
144. A. P. Boyle et al., Annotation of functional variation in personal genomes using
RegulomeDB. Genome Res 22, 1790-1797 (2012).
145. W. J. Kent, BLAT--the BLAST-like alignment tool. Genome Res 12, 656-664
(2002).
146. K. Takahashi et al., Induction of pluripotent stem cells from adult human
fibroblasts by defined factors. Cell 131, 861-872 (2007).
147. W. Li et al., Rapid induction and long-term self-renewal of primitive neural
precursors from human embryonic stem cells by small molecule inhibitors. Proc
Natl Acad Sci U S A 108, 8299-8304 (2011).
148. C. Thisse, B. Thisse, High-resolution in situ hybridization to whole-mount
zebrafish embryos. Nat Protoc 3, 59-69 (2008).
149. D. J. Kleinjan, V. van Heyningen, Position effect in human genetic disease. Hum
Mol Genet 7, 1611-1618 (1998).
150. R. E. Mills et al., Mapping copy number variation by population-scale genome
sequencing. Nature 470, 59-65 (2011).
151. S. Offermanns et al., Impaired motor coordination and persistent multiple
climbing fiber innervation of cerebellar Purkinje cells in mice lacking Galphaq.
Proc Natl Acad Sci U S A 94, 14089-14094 (1997).
152. M. Takemoto et al., Laminar and areal expression of unc5d and its role in
cortical cell survival. Cereb Cortex 21, 1925-1934 (2011).
153. Y. Zhu et al., Dependence receptor UNC5D mediates nerve growth factor
depletion-induced neuroblastoma regression. J Clin Invest 123, 2935-2947
(2013).
154. T. Ben-Zur, E. Feige, B. Motro, R. Wides, The mammalian Odz gene family:
homologs of a Drosophila pair-rule gene with expression implying distinct yet
overlapping developmental roles. Dev Biol 217, 107-120 (2000).
155. T. Ben-Zur, R. Wides, Mapping homologs of Drosophila odd Oz (odz):
Doc4/Odz4 to mouse chromosome 7, Odz1 to mouse chromosome 11; and ODZ3
to human chromosome Xq25. Genomics 58, 102-103 (1999).
156. T. Hirano, Condensins: organizing and segregating the genome. Curr Biol 15,
R265-275 (2005).
157. V. Vacic et al., Duplications of the neuropeptide receptor gene VIPR2 confer
significant risk for schizophrenia. Nature 471, 499-503 (2011).
158. Y. Wu et al., Submicroscopic subtelomeric aberrations in Chinese patients with
unexplained developmental delay/mental retardation. BMC Med Genet 11, 72
(2010).
159. M. Corpas, E. Bragin, S. Clayton, P. Bevan, H. V. Firth, Interpretation of
genomic copy number variants using DECIPHER. Curr Protoc Hum Genet
Chapter 8, Unit 8 14 (2012).
160. G. J. Swaminathan et al., DECIPHER: web-based, community resource for
clinical interpretation of rare variants in developmental disorders. Hum Mol
Genet 21, R37-44 (2012).
161. K. L. Tsai et al., A conserved Mediator-CDK8 kinase module association
regulates Mediator-RNA polymerase II interaction. Nat Struct Mol Biol 20, 611-
619 (2013).
162. H. J. Baek, S. Malik, J. Qin, R. G. Roeder, Requirement of TRAP/mediator for
both activator-independent and activator-dependent transcription in conjunction
with TFIID-associated TAF(II)s. Mol Cell Biol 22, 2842-2852 (2002).
163. P. M. Flanagan, R. J. Kelleher, 3rd, M. H. Sayre, H. Tschochner, R. D. Kornberg,
A mediator required for activation of RNA polymerase II transcription in vitro.
Nature 350, 436-438 (1991).
164. Y. Takagi, R. D. Kornberg, Mediator as a general transcription factor. J Biol
Chem 281, 80-89 (2006).
165. M. A. Davis et al., The SCF-Fbw7 ubiquitin ligase degrades MED13 and
MED13L and regulates CDK8 module association with Mediator. Genes Dev 27,
151-156 (2013).
166. C. E. Schwartz et al., The original Lujan syndrome family has a novel missense
mutation (p.N1007S) in the MED12 gene. J Med Genet 44, 472-477 (2007).
167. A. T. Vulto-van Silfhout et al., Mutations in MED12 cause X-linked Ohdo
syndrome. Am J Hum Genet 92, 401-406 (2013).
168. P. Rump et al., A novel mutation in MED12 causes FG syndrome (Opitz-
Kaveggia syndrome). Clin Genet 79, 183-188 (2011).
169. R. Kaufmann et al., Infantile cerebral and cerebellar atrophy is associated with a
mutation in the MED17 subunit of the transcription preinitiation mediator
complex. Am J Hum Genet 87, 667-670 (2010).
170. R. Bernard, A. De Sandre-Giovannoli, V. Delague, N. Levy, Molecular genetics
of autosomal-recessive axonal Charcot-Marie-Tooth neuropathies.
Neuromolecular Med 8, 87-106 (2006).
171. H. Najmabadi et al., Deep sequencing reveals 50 novel genes for recessive
cognitive disorders. Nature 478, 57-63 (2011).
172. N. Muncke et al., Missense mutations and gene interruption in PROSIT240, a
novel TRAP240-like gene, in patients with congenital heart defect (transposition
of the great arteries). Circulation 108, 2843-2850 (2003).
173. R. Asadollahi et al., Dosage changes of MED13L further delineate its role in
congenital heart defects and intellectual disability. Eur J Hum Genet 21, 1100-
1104 (2013).
174. H. Najmabadi et al., Deep sequencing reveals 50 novel genes for recessive
cognitive disorders. Nature 478, 57-63 (2011).
175. M. M. van Haelst et al., Further confirmation of the MED13L haploinsufficiency
syndrome. Eur J Hum Genet, (2014).
176. E. Zhao et al., Cloning and expression of human GTDC1 gene
(glycosyltransferase-like domain containing 1) from human fetal library. DNA
Cell Biol 23, 183-187 (2004).
177. M. C. Manzini et al., Exome sequencing and functional validation in zebrafish
identify GTDC2 mutations as a cause of Walker-Warburg syndrome. Am J Hum
Genet 91, 541-547 (2012).
178. J. Amiel et al., Mutations in TCF4, encoding a class I basic helix-loop-helix
transcription factor, are responsible for Pitt-Hopkins syndrome, a severe epileptic
encephalopathy associated with autonomic dysfunction. Am J Hum Genet 80,
988-993 (2007).
179. F. F. Hamdan et al., Parent-child exome sequencing identifies a de novo
truncating mutation in TCF4 in non-syndromic intellectual disability. Clin Genet
83, 198-200 (2013).
180. D. H. Geschwind, Autism: many genes, common pathways? Cell 135, 391-395
(2008).
181. M. McVey, S. E. Lee, MMEJ repair of double-strand breaks (director's cut):
deleted sequences and alternative endings. Trends Genet 24, 529-538 (2008).
182. J. C. Wang, L. Dang, T. Fisker, Chromosome 6 between-arm intrachromosomal
insertion with intrasegmental double inversion: a four-break model. Am J Med
Genet A 152A, 209-211 (2010).
183. M. R. Lieber, T. E. Wilson, SnapShot: Nonhomologous DNA end joining
(NHEJ). Cell 142, 496-496 e491 (2010).
184. M. R. Lieber, NHEJ and its backup pathways in chromosomal translocations. Nat
Struct Mol Biol 17, 393-395 (2010).
185. M. R. Lieber, J. Gu, H. Lu, N. Shimazaki, A. G. Tsai, Nonhomologous DNA end
joining (NHEJ) and chromosomal translocations in humans. Subcell Biochem 50,
279-296 (2010).
186. S. Rigaud et al., XIAP deficiency in humans causes an X-linked
lymphoproliferative syndrome. Nature 444, 110-114 (2006).
187. S. Rigaud, S. Latour, [An X-linked lymphoproliferative syndrome (XLP) caused
by mutations in the inhibitor-of-apoptosis gene XIAP]. Med Sci (Paris) 23, 235-
237 (2007).
188. O. Perche et al., Combined deletion of two Condensin II system genes (NCAPG2
and MCPH1) in a case of severe microcephaly and mental deficiency. Eur J Med
Genet 56, 635-641 (2013).
189. L. A. Lettice et al., A long-range Shh enhancer regulates expression in the
developing limb and fin and is associated with preaxial polydactyly. Hum Mol
Genet 12, 1725-1735 (2003).
190. W. Balemans et al., Identification of a 52 kb deletion downstream of the SOST
gene in patients with van Buchem disease. J Med Genet 39, 91-97 (2002).
191. A. C. Fonseca et al., The clinical impact of chromosomal rearrangements with
breakpoints upstream of the SOX9 gene: two novel de novo balanced
translocations associated with acampomelic campomelic dysplasia. BMC Med
Genet 14, 50 (2013).
192. R. E. Hill, L. A. Lettice, Alterations to the remote control of Shh gene expression
cause congenital abnormalities. Philos Trans R Soc Lond B Biol Sci 368,
20120357 (2013).
193. G. V. Velagaleti et al., Position effects due to chromosome breakpoints that map
approximately 900 Kb upstream and approximately 1.3 Mb downstream of
SOX9 in two patients with campomelic dysplasia. Am J Hum Genet 76, 652-662
(2005).
194. N. Muncke et al., Missense mutations and gene interruption in PROSIT240, a
novel TRAP240-like gene, in patients with congenital heart defect (transposition
of the great arteries). Circulation 108, 2843-2850 (2003).
195. K. Mølsted, M. Boers, I. Kjaer, The morphology of the sella turcica in
velocardiofacial syndrome suggests involvement of a neural crest developmental
field. Am J Med Genet A 152A, 1450-1457 (2010).
196. R. J. Shprintzen et al., A new syndrome involving cleft palate, cardiac anomalies,
typical facies, and learning disabilities: velo-cardio-facial syndrome. Cleft Palate
J 15, 56-62 (1978).
197. R. J. Shprintzen, R. B. Goldberg, D. Young, L. Wolford, The velo-cardio-facial
syndrome: a clinical and genetic analysis. Pediatrics 67, 167-172 (1981).
198. R. Goldberg, B. Motzkin, R. Marion, P. J. Scambler, R. J. Shprintzen, Velo-
cardio-facial syndrome: a review of 120 patients. Am J Med Genet 45, 313-319
(1993).
199. P. C. Yelick, T. F. Schilling, Molecular dissection of craniofacial development
using zebrafish. Crit Rev Oral Biol Med 13, 308-322 (2002).
200. C. Golzio et al., KCTD13 is a major driver of mirrored neuroanatomical
phenotypes of the 16p11.2 copy number variant. Nature 485, 363-367 (2012).
201. H. G. Kim et al., Translocations disrupting PHF21A in the Potocki-Shaffer-
syndrome region are associated with intellectual disability and craniofacial
anomalies. Am J Hum Genet 91, 56-72 (2012).
202. A. M. Lindgren et al., Haploinsufficiency of KDM6A is associated with severe
psychomotor retardation, global growth restriction, seizures and cleft palate. Hum
Genet 132, 537-552 (2013).
203. A. D. Kasberg, E. W. Brunskill, S. Steven Potter, SP8 regulates signaling centers
during craniofacial development. Dev Biol 381, 312-323 (2013).
204. P. P. Rocha, M. Scholze, W. Bleiss, H. Schrewe, Med12 is essential for early
mouse development and for canonical Wnt and Wnt/PCP signaling. Development
137, 2723-2731 (2010).
205. I. Carrera, F. Janody, N. Leeds, F. Duveau, J. E. Treisman, Pygopus activates
Wingless target gene transcription through the mediator complex subunits Med12
and Med13. Proc Natl Acad Sci U S A 105, 6644-6649 (2008).
206. X. Lin, L. Rinaldo, A. F. Fazly, X. Xu, Depletion of Med10 enhances Wnt and
suppresses Nodal signaling during zebrafish embryogenesis. Dev Biol 303, 536-
548 (2007).
207. S. Kim, X. Xu, A. Hecht, T. G. Boyer, Mediator is a transducer of Wnt/beta-
catenin signaling. J Biol Chem 281, 14066-14075 (2006).
208. K. Takahashi, S. Yamanaka, Induction of pluripotent stem cells from mouse
embryonic and adult fibroblast cultures by defined factors. Cell 126, 663-676
(2006).
209. N. Maherali, K. Hochedlinger, Guidelines and techniques for the generation of
induced pluripotent stem cells. Cell Stem Cell 3, 595-605 (2008).
210. M. C. Marchetto et al., A model for neural development and treatment of Rett
syndrome using human induced pluripotent stem cells. Cell 143, 527-539 (2010).
211. J. F. Krey et al., Timothy syndrome is associated with activity-dependent
dendritic retraction in rodent and human neurons. Nat Neurosci 16, 201-209
(2013).
212. S. P. Paşca et al., Using iPSC-derived neurons to uncover cellular phenotypes
associated with Timothy syndrome. Nat Med 17, 1657-1662 (2011).
213. M. C. Marchetto, K. J. Brennand, L. F. Boyer, F. H. Gage, Induced pluripotent
stem cells (iPSCs) and neurological disease modeling: progress and promises.
Hum Mol Genet 20, R109-115 (2011).
214. H. H. Freeze, Understanding human glycosylation disorders: biochemistry leads
the charge. J Biol Chem 288, 6936-6945 (2013).
215. T. M. Snow, C. W. Woods, A. G. Woods, Congenital disorder of glycosylation: a
case presentation. Adv Neonatal Care 12, 96-100 (2012).
216. A. G. Woods, C. W. Woods, T. M. Snow, Congenital disorders of glycosylation.
Adv Neonatal Care 12, 90-95 (2012).
217. P. He, B. G. Ng, M. E. Losfeld, W. Zhu, H. H. Freeze, Identification of
intercellular cell adhesion molecule 1 (ICAM-1) as a hypoglycosylation marker
in congenital disorders of glycosylation cells. J Biol Chem 287, 18210-18217
(2012).
APPENDIX A.
LIST OF SVs PER PATIENT
Table S1. List of SVs found in patient CD5.
SV No SV Type Breakpoint Cluster Size Span (bp) Gene DNA alteration DGV PCR validation (Patient CD005, Affected son CD020, Affected son CD021)
1 Del chr1 (+) :72,390,624-72,430,488 17 39,864 NEGR1 Intronic deletion 28 + + +
2 Del chr1 (+) :109,337,135-109,357,109 15 19,974 WDR47 Deleted exon 5-8 9 + + +
3 Del chr1 (+) :114,674,249-114,682,110 16 7,861 - - -
4 Del chr2 (+) :31,004,746-31,033,602 17 28,856 GALNT14 5 + + +
5 Del chr4 (+) :91,698,589-94,246,096 16 2,547,507 TMSL3 Full length deletion 2 + - +
GRID2 Deleted exon 1-2 46
6 Del chr4 (+) :122,847,359-122,891,237 16 43,878 + + +
7 Del chr6 (+) :89,226,451-89,232,156 7 5,705 + + -
8 Del chr7 (+) :16,201,711-16,211,601 25 9,890 + + -
9 Del chr7 (+) :142,534,798-142,604,566 16 69,768 TAS2R39 Full length 3 + - -
PIP Full length 67
10 BT chr9 (+) :79,571,716; chr17(+) :74,768,401 16 NA GNAQ Bp at intron 5 + + +
RBFOX3 Bp at intron 2
11 BT chr9 (-) :79,573,787; chr17(-) :74,764,804 14 NA GNAQ Bp at intron 5 + + +
RBFOX3 Bp at intron 2
12 Del chr11 (+) :7,643,628-7,653,635 6 10,007 CYB5R2 Deleted exon 4-7 1 - - -
13 Del chr15 (+) :59,473,758-59,487,405 28 13,647 + + +
14 Del chr16 (+) :4,072,901-4,082,506 7 9,605 ADCY9 Intronic deletion 6 - - -
Abbreviations: Del, Deletion; BT, Balanced translocation
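For the deletion rows above, the Span column appears to equal the distance between the two breakpoint coordinates. The following is a minimal sketch of that consistency check, not part of the original analysis pipeline. The function names and the regular expression are illustrative assumptions, and the relation is only checked here for deletions (the tandem duplication and inversion rows in the later tables do not follow it).

```python
import re

# Matches breakpoint strings as printed in the tables,
# e.g. "chr1 (+) :72,390,624-72,430,488".
BREAKPOINT_PATTERN = re.compile(
    r"chr(\w+)\s*\(([+-])\)\s*:\s*([\d,]+)\s*-\s*([\d,]+)"
)


def parse_interval(text: str) -> tuple[str, str, int, int]:
    """Return (chromosome, strand, start, end) from a table breakpoint string."""
    match = BREAKPOINT_PATTERN.search(text)
    if match is None:
        raise ValueError(f"unrecognised breakpoint string: {text!r}")
    chrom, strand, start, end = match.groups()
    return chrom, strand, int(start.replace(",", "")), int(end.replace(",", ""))


def deletion_span(text: str) -> int:
    """Distance in bp between the two breakpoints of a deletion."""
    _, _, start, end = parse_interval(text)
    return end - start


# Three deletion rows from Table S1: (breakpoint string, Span column).
rows = [
    ("chr1 (+) :72,390,624-72,430,488", 39_864),
    ("chr1 (+) :109,337,135-109,357,109", 19_974),
    ("chr4 (+) :91,698,589-94,246,096", 2_547_507),
]

for breakpoint, reported_span in rows:
    computed = deletion_span(breakpoint)
    status = "ok" if computed == reported_span else "MISMATCH"
    print(f"{breakpoint:<40}{computed:>12,}  {status}")
```

Rows where the computed and reported spans differ are worth checking against the original cluster output, since they may reflect a transcription error in either column.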
Table S2. List of SVs found in patient CD10
SV No SV Type Breakpoint Cluster Size Span (bp) Gene DNA alteration DGV
1 Del chr1 (+) :202,173,929-202,188,904 27 14,975
2 Del chr2 (+) :139,455,505-139,459,801 29 4,296
3 Ins chr2:178,545,739; chr12:54,068,618 14 NA PDE11A Disruption intron 1 18
4 Ins chr2:178,564,890; chr12:54,070,437 6 NA PDE11A Disruption intron 1 18
5 Del chr4 (+) :92,499,237-92,504,340 29 5,103 FAM190A Intronic deletion 79
6 Del chr6 (+) : 34,511,385- 34,517,410 31 6,025
7 BT chr6:98,318,059; chr8:35,527,808 36 NA UNC5D Breakpoint at intron 2
8 BT chr6:98,326,375; chr8:35,528,282 34 NA UNC5D Breakpoint at intron 2
9 IT chr7:224,069; chr15:100,042,752 48 NA TARSL2 Disruption intron 11
10 Del chr7 (+) :5,846,915-5,855,247 13 8,332 ZNF815 Deleted exon 4-6 6
11 Del chr7 (+) :11,886,866-11,889,627 26 2,761
12 Del chr7 (+) :100,369,276-100,374,934 22 5,658
13 TD chr10 (+) : 84,405,124-84,414,680 30 22,078 NRG3 Intronic duplication 54
14 Del chr11 (+) :62,267,054-62,270,934 13 3,880
15 Del chr15 (+) :48,758,881-48,764,977 32 6,096 TRPM7 Intronic deletion 2
Abbreviations: Del, Deletion; BT, Balanced translocation; Ins, Insertion; IT, Isolated translocation; TD, Tandem duplication
Table S3. List of SVs found in patient CD8.
SV No SV Type Breakpoint Cluster Size Span (bp) Gene DNA alteration DGV
1 TD chr1 (+) :143,753,709-143,758,374 25 12,876 PDE4DIP Intronic duplication 0
2 Del chr4 (+) :32,973,945-32,982,487 6 8,542
3 Del chr5 (+) :97,026,615-97,046,690 22 20,075
4 Del chr5 (+) :113,354,357-113,364,703 45 10,346
5 Del chr7 (+) :12,986,617-12,995,338 24 8,721
6 TD chr8 (+) :128,450,797-128,413,587 46 58,701
7 Del chr8 (+) :16,778,010-16,786,728 6 8,718
8 Del chr9 (+) :90,123,341-90,127,712 12 4,371
9 Del chr14 (+) :83,112,675-83,120,540 56 7,865
10 TD chr19 (+) :43,243,485-43,164,809 16 96,613 SIPA1L3 Intronic duplication 4
11 Del chr20 (+) :58,544,674-58,558,612 40 13,938
12 TD chr22 (+) :49,417,248-49,424,209 17 8,643
13 Del chrX (+) :120,059,370-120,309,394 30 250,024
14 UI chrX:34,661,885-125,684,797 49 91,022,213
15 PI chrX:119,984,597-122,839,027 26 2,854,430 XIAP Breakpoint at intron 1
16 PI chrX:119,997,337-122,856,765 29 2,860,694 XIAP Breakpoint at intron 1
17 TD chrX (+) : 34,569,874-123,107,520 23 88,554,667 TMEM47 Part of complex inversion
18 TD chrX (+) : 123,127,427-125,658,649 30 2,555,127 ODZ1;SH2D1A Part of complex inversion
Abbreviations: Del, Deletion; UI, Unpaired inversion; PI, Paired inversion; TD, Tandem duplication
Table S4. List of SVs found in chromosome X including lower confidence clusters
SVa) SV Type Cluster Size Span (bp) Breakpoint A BAC Probe for Bp A (FISH validation) Breakpoint B BAC Probe for Bp B (FISH validation) Gene PCR validation (Patient, Affected twin, Unaffected mother)
13 Del 30 250,024 chrX (+) :120,059,370 - chrX (+) :120,309,394 - + + +
14 UI 49 91,022,213 chrX (+) :34,661,885 - chrX (-) :125,684,797 -
15 PI 26 2,854,430 chrX (+) :119,984,597 RP1-296G17 chrX (-) : 122,839,027 RP1-315G1 XIAP + + +
16 PI 29 2,860,694 chrX (-) :119,997,337 RP1-296G17 chrX (+) :122,856,765 RP1-315G1 XIAP + + +
17 TD 23 88,554,667 chrX (+) :123,107,520 W12-499N23 chrX (+) : 34,569,874 RP11-330K13 TMEM47 + + +
18 TD 30 2,555,127 chrX (+) :125,658,649 - chrX (+) :123,127,427 W12-499N23 ODZ1, SH2D1A
19 TD 26 737,420 chrX (+) :125,828,479 - chrX (+) :125,110,988 - CXorf64
20* UI 5 9,146 chrX (+) :34,560,359 RP11-330K13 chrX (-) :34,567,696 - TMEM47 + + +
21* UI 5 88,548,885 chrX (-) :34,560,395 RP11-330K13 chrX (+) :123,109,280 W12-499N23 TMEM47 + + +
22* TD 4 28,847 chrX (+) :61,754,201 RP11-762M23 chrX (+) : 61,771,832 RP11-762M23 - - -
Abbreviations: Del, Deletion; UI, Unpaired inversion; PI, Paired inversion; TD, Tandem duplication
Table S5. List of SVs found in patient CD9
SV No SV Type Breakpoint Cluster Size Span (bp) Gene DNA alteration DGV
1 Del chr1 (+):33,189,324-33,193,992 43 4,668 RNF19B Intronic deletion 0
2 PI chr5 (+):111,962,591-165,575,819 59 53,613,228
3 PI chr5 (-):111,972,555-165,576,628 63 53,613,479
4 TD chr5 (+):641,668-738,952 17 97,284 TPPP Duplicated exon 1-3 47
CEP72 Full length duplication 42
5 TD chr9 (+):78,960,669-78,968,871 7 6,202
6 Del chr9 (+):106,837,792-106,843,557 44 5,765
7 TD chr17(+):2,734,890-2,747,456 18 12,556 RAP1GAP2 Intronic duplication 17
Abbreviations: Del, Deletion; PI, Paired inversion; TD, Tandem duplication
Table S6. List of SVs found in patient CD14
SV No SV Type Breakpoint Cluster Size Span (bp) Gene DNA alteration DGV
1 Del chr1 (+) :38,374,440-38,378,079 17 3,639
2 Del chr1 (+) :177,243,686-177,246,818 24 3,132
3 Del chr1 (+) :217,861,277-217,867,288 8 6,011
4 Del chr2(+):66,357,659-66,361,616 13 3,957
5 Del chr2 (+) :178,563,026-178,567,389 6 4,363 PDE11A Intronic deletion 18
6 TD chr2(+):76,660,390-76,670,369 28 9,979
7 Del chr4 (+) :171,064,817-171,069,617 55 4,800
8 Ins chr6 (+) :30,666,984;chr17 (-) :25,082,330 15 NA ABCF1 Breakpoint at 3' UTR 0
SSH2 Breakpoint at intron 2 2
9 Ins chr6 (+) :30,667,054;chr17 (-) :25,068,847 13 NA ABCF1 Breakpoint at 3' UTR 0
10 Del chr7(+):26,107,987-26,120,427 9 12,440
11 Del chr8 (+) :6,282,090-6,319,264 50 37,174 MCPH1 Deleted exon 6-9 4
12 Del chr8(+):131,919,651-131,929,123 6 9,472 ADCY8 Intronic deletion 11
13 Del chr10(+):65,178,815-65,186,592 8 7,777
14 Del chr10(+):132,798,979-132,809,373 38 10,394 TCERG1L Deleted exon 9 25
15 TD chr10(+):134,978,086-134,991,430 21 13,344 CALY Duplicated exon 4-6 17
16 Del chr12 (+) :34,589,132-34,604,152 9 15,020
17 Del chr13 (+) :28,375,180-28,381,616 18 6,436
18 Del chr14 (+) :45,822,161-45,836,976 53 21,953
19 Del chr15 (+) :50,051,999-50,061,520 35 9,521
20 Del chr15 (+) :72,525,564-72,638,632 15 113,068 ARID3B Deleted exon 1-2 4
UBL7 Deleted exon 1-10 1
21 Del chr18(+):63,040,662-63,045,286 46 4,624
22 Del chrX (+) :41,043,082-41,048,219 8 5,137
Abbreviations: Del, Deletion; Ins, Insertion; TD, Tandem duplication
Table S7. List of SVs found in patient CD23
SV No SV Type Breakpoint Cluster Size Span (bp) Gene DNA alteration DGV
1 Del chr1(+):97,567,377-97,686,357 68 118980 DPYD intronic deletion 29
2 Del chr1(+):245,206,262-245,210,280 62 4018
3 Del chr2(+):53,410,893-53,413,627 6 2734
4 Del chr3(+):198,934,253-198,938,109 35 3856 KIAA0226 intronic deletion 14
5 Del chr3(+):198,958,247-198,960,827 22 2580 KIAA0226 intronic deletion 14
6 Del chr3(+):199,003,740-199,007,633 58 3893 LRCH3 8
7 Del chr5(+):99,865,951-99,874,835 38 8884
8 Del chr6(+):140,899,577-140,951,731 80 52154
9 Del chr7(+):26,985,897-64,394,905 43 37409008 HOXA1 to INTS4L1 large deletion N.A.
10 Del chr8(+):52,200,993-52,241,082 78 40089
11 Del chr9(+):10,272,864-10,316,145 90 43281 PTPRD 26
12 Del chr10(+):85,087,405-85,092,891 53 5486
13 Del chr12(+):36,557,212-36,563,493 11 6281
14 Del chr13(+):22,603,539-22,608,466 43 4927
15 Del chr13(+):106,936,748-106,941,359 8 4611 FAM155A intronic deletion 39
16 Del chr15(+):70,506,305-70,512,147 46 5842
17 Del chr18(+):62,928,173-62,930,606 40 2433
18 Del chr23(+):45,669,476-45,674,724 11 5248
19 Del chr23(+):89,208,871-89,215,941 92 7070
20 BT chr12(+):114,971,361; chr19(+):35,770,127 52 NA MED13L breakpoint at intron 4
21 BT chr12(-):114,971,801; chr19(-):35,769,914 70 NA MED13L breakpoint at intron 4
22 UI chr7(+):64,497,403; chr7(-):64,533,448 53 36045 ZNF92 affecting 3' UTR
23 UI chr7(-):26,954,838; chr7(+):64,497,640 57 37542802 ZNF92 large affected region
24 UI chr14(-):105,289,264; chr14(+):105,397,567 7 108303
25 UI chrX(+):105,434,960; chrX(-):105,437,562 33 2602
26 TD chr2(+):195,606,431-195,630,007 100 23576
27 TD chr12(+):83,454,680-83,534,229 71 79549
Table S8. List of SVs found in Patient CD6
SV No SV Type Inheritance Breakpoint Cluster Size Span (bp) Gene DNA alteration DGV
1 BT De Novo chr2 (-) :144,606,985;chr8 (+) :23,007,302 60 NA GTDC1 Breakpoint at intron 5
2 BT De Novo chr2 (+) :144,606,673;chr8 (-) :23,007,044 85 NA GTDC1 Breakpoint at intron 5
3 TD Maternal chr2 (+) :57,291,143-57,262,536 62 48,200 -
4 TD Maternal chr3 (+) :47,037,877-47,045,232 13 8,555 SETD2 Intronic duplication 0
5 Del Paternal chr6 (+) :15,295,552-15,299,842 9 4,290 -
6 Del Paternal chr7 (+) :134,791,258-134,794,840 49 3,582 CNOT4 Intronic deletion 3
7 TD Paternal chr7 (+) :151,866,685-151,876,245 22 26,951 -
8 UI Paternal chr8 (-) :75,250,456-75,259,603 37 9,147 -
9 UI Paternal chr8 (+) :75,248,253-75,257,941 45 9,688 -
10 Del Paternal chr11 (+) :50,650,376-50,658,583 6 8,207 -
11 Del Paternal chr12 (+) :118,472,836-118,478,089 53 5,253 -
12 Del Paternal chr14 (+) :40,757,141-40,760,560 24 3,419 -
13 Del Maternal chr19 (+) :12,268,514-12,272,220 8 3,706 -
14 Del Paternal chr19 (+) :21,182,476-21,192,939 45 10,463 -
15 Del De Novo chr20 (+) :31,748,702-31,753,823 6 5,121 -
16 Del Maternal chrX (+) :67,873,609-67,881,314 43 7,705 -
17 UI De Novo chrY (+) :11,941,563-57,398,141 7 45,456,578 -
Abbreviations: Del, Deletion; BT, Balanced translocation; UI, Unpaired inversion; PI, Paired inversion; TD, Tandem duplication
APPENDIX B.
LIST OF PUBLICATIONS
PUBLICATIONS
Utami KH, Winata CL, Hillmer AM, Aksoy I, Long TH, Chew EGY, Mathavan S, Tay SKH,
Korzh V, Sarda P, Davila S, Cacheux V. Haploinsufficiency of MED13L impairs development of
neural-crest cells derived organs and contributes to intellectual disability. Human Mutation.
2014 Epub 2014 Aug 18
Utami KH, Hillmer AM, Aksoy I, Chew EGY, Teo ASM, Zhang Z, Lee CWH, Chen PJ,
Chan CS, Ariyaratne PN, Rouam SL, Lim SS, Yousoof S, Prokudin I, Peters G, Collins F,
Wilson M, Kakakios A, Haddad G, Menuet A, Perche O, Tay SKH, Sung KWK, Ruan X, Ruan
Y, Liu ET, Briault S, Jamieson RV, Davila S, Cacheux V. Detection of chromosomal
breakpoints in patients with developmental delay and speech disorders. PLoS One. 2014
9(3):e90852
Perche O, Menuet A, Marcos M, Liu L, Paris A, Utami KH, Kervran D, Cacheux V, Laudier B,
Briault S. Combined deletion of two Condensin II system genes (NCAPG2 and MCPH1) in a
case of severe microcephaly and mental deficiency. Eur. J. Med. Genet. 2013. Epub 2013 Sep 4.
The work in this thesis has been presented at the following conferences:
Utami KH, Hillmer AM, Chew EGY, Aksoy I, Tay SKH, Briault S, Stanton LW, Jamieson R,
Davila S, Cacheux V. Zooming into the chromosomal breakpoints by genome paired-end tag
sequencing: application in patients with intellectual disabilities. Oral Presentation. National
Conference of Neuroscience Indonesia 2013, Jakarta, Indonesia September 14-15, 2013.
Utami KH, Hillmer AM, Chew EGY, Aksoy I, Tay SKH, Briault S, Stanton LW, Jamieson R,
Davila S, Cacheux V. Zooming into the chromosomal breakpoints by genome paired-end tag
sequencing: application in patients with intellectual disabilities. Poster Presentation. European
Society of Human Genetics 2013, Paris, France June 8-11, 2013.
Utami KH, Winata CL, Hillmer AM, Chew EGY, Mathavan S, Tay SKH, Korzh V, Sarda P,
Davila S, Cacheux V. De novo translocation disrupting Mediator complex subunit in a patient
with Pierre-Robin syndrome and developmental delay. Poster Presentation. American Society of
Human Genetics 2012, San Francisco, United States, November 6-10, 2012.
Utami KH, Chew EGY, Teo ASM, Zhang Z, Lim SS, Ruan X, Ruan Y, Liu ET, Briault S,
Jamieson RV, Hillmer AM, Davila S, Cacheux V. The use of DNA-PET
technology in Neurodevelopmental Disorders. Poster Presentation. Human Genome Meeting,
Sydney, Australia March 11-14, 2012.
Detection of Chromosomal Breakpoints in Patients with Developmental Delay and Speech Disorders
Kagistia H. Utami1,13, Axel M. Hillmer2, Irene Aksoy3, Elaine G. Y. Chew2, Audrey S. M. Teo2,
Zhenshui Zhang2, Charlie W. H. Lee4, Pauline J. Chen4, Chan Chee Seng5, Pramila N. Ariyaratne4,
Sigrid L. Rouam4, Lim Seong Soo1, Saira Yousoof6,7, Ivan Prokudin6,7, Gregory Peters8, Felicity Collins9,
Meredith Wilson9, Alyson Kakakios10, Georges Haddad11, Arnaud Menuet12, Olivier Perche12, Stacey
Kiat Hong Tay13, Ken W. K. Sung4, Xiaoan Ruan14, Yijun Ruan14, Edison T. Liu2, Sylvain Briault12,
Robyn V. Jamieson6, Sonia Davila1, Valere Cacheux1*
1 Human Genetics, Genome Institute of Singapore, Singapore, Singapore, 2 Cancer Therapeutics and Stratified Oncology, Genome Institute of Singapore, Singapore,
Singapore, 3 Stem Cells and Developmental Biology, Genome Institute of Singapore, Singapore, Singapore, 4 Computational and Mathematical Biology, Genome Institute
of Singapore, Singapore, Singapore, 5 Scientific & Research Computing, Genome Institute of Singapore, Singapore, Singapore, 6 Eye and Developmental Genetics
Research, The Children’s Hospital at Westmead, Children’s Medical Research Institute and Save Sight Institute, Sydney, New South Wales, Australia, 7 Disciplines of
Paediatrics and Child Health and Genetic Medicine, Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia, 8 Department of Cytogenetics, The
Children’s Hospital at Westmead, Sydney, New South Wales, Australia, 9 Department of Clinical Genetics, The Children’s Hospital at Westmead, Sydney, New South Wales,
Australia, 10 Department of Immunology, The Children’s Hospital at Westmead, Sydney, New South Wales, Australia, 11 Unite de Genetique, Centre Hospitalier, Blois,
France, 12 Service de Genetique INEM UMR7355 CNRS-University, Centre Hospitalier Regional d’Orleans, Orleans, France, 13 Department of Paediatrics, Yong Loo Lin
School of Medicine, National University of Singapore, Singapore, Singapore, 14 Genome Technology and Biology, Genome Institute of Singapore, Singapore, Singapore
Abstract
Delineating candidate genes at the chromosomal breakpoint regions in apparently balanced chromosome rearrangements (ABCR) has been shown to be more effective with the emergence of next-generation sequencing (NGS) technologies. We employed a large-insert (7–11 kb) paired-end tag sequencing technology (DNA-PET) to systematically analyze the genomes of four patients harbouring cytogenetically defined ABCR with neurodevelopmental symptoms, including developmental delay (DD) and speech disorders. We characterized structural variants (SVs) specific to each individual, including those matching the chromosomal breakpoints. Refinement of these regions by Sanger sequencing resulted in the identification of five disrupted genes in three individuals: guanine nucleotide binding protein, q polypeptide (GNAQ), RNA-binding protein, fox-1 homolog (RBFOX3), unc-5 homolog D (C. elegans) (UNC5D), transmembrane protein 47 (TMEM47), and X-linked inhibitor of apoptosis (XIAP). Among them, XIAP is the causative gene for the immunodeficiency phenotype seen in the patient. The remaining genes displayed specific expression in the fetal brain and have known biologically relevant functions in brain development, suggesting putative candidate genes for neurodevelopmental phenotypes. This study demonstrates the application of NGS technologies in mapping individual gene disruptions in ABCR as a resource for deciphering candidate genes in human neurodevelopmental disorders (NDDs).
Citation: Utami KH, Hillmer AM, Aksoy I, Chew EGY, Teo ASM, et al. (2014) Detection of Chromosomal Breakpoints in Patients with Developmental Delay and Speech Disorders. PLoS ONE 9(3): e90852. doi:10.1371/journal.pone.0090852
Editor: Osman El-Maarri, University of Bonn, Institute of Experimental Hematology and Transfusion Medicine, Germany
Received November 7, 2013; Accepted February 4, 2014; Published March 5, 2014
Copyright: © 2014 Utami et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This study is supported by grants from the Biomedical Research Council (BMRC) of the Agency for Science, Technology and Research (A*STAR), Singapore. Support is also acknowledged from the National Health and Medical Research Council of Australia (NHMRC). KHU is funded by a Singapore International Graduate Award (SINGA) fellowship. Additional support was also provided by the Genome Institute of Singapore internal research funds from the BMRC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: [email protected]
Introduction
Apparently balanced chromosomal rearrangements (ABCR)
occur sporadically in the population or segregate within families,
with a frequency of 1 in every 2000 live births. ABCR have been
largely associated with infertility, ovarian failure and intellectual
disability (ID) when detected during pre- and post-natal investi-
gation or genetic counselling. [1,2,3,4,5,6] The risk of developing
congenital anomalies or NDDs has been estimated to be 6.1% for
de novo ABCR. [2] Although ABCR is also found in phenotypically
normal individuals, there is an increased incidence of ABCR in
patients with NDDs, which may lead to the identification of novel candidate
disease genes through breakpoint cloning methods. Such methods
have been successfully applied in characterizing disease genes
including DMD (encoding dystrophin) in Duchenne muscular dystrophy
[7], DISC1 in schizophrenia [8,9] and ATP7A in Menkes disease.
[10,11] Karyotyping, array-CGH (aCGH) and SNP arrays are
currently first-tier diagnostic tools to investigate ABCR in pre- and
post-natal settings. [6,12,13] Of these, only karyotyping can detect balanced
chromosomal aberrations, and only at a
low resolution of approximately 5–10 megabases (Mb). [14]
Subsequent fluorescence in situ hybridization (FISH) mapping
across the breakpoints gives a more detailed delineation but is restricted to
the investigated region. Microarray technologies are able to
PLOS ONE | www.plosone.org 1 March 2014 | Volume 9 | Issue 3 | e90852
identify copy number changes at considerable resolution but fail to
identify copy number neutral rearrangements. [15,16] Recent
advances in NGS technologies render it possible to systematically
identify genomic rearrangements including copy number neutral
events at a base-pair resolution, thus facilitating candidate disease
gene identification in the chromosomal breakpoint regions.
We have established an NGS pipeline, referred to as DNA-PET
sequencing [17,18,19], in which short tags (50 bp) from the 5′ and 3′ ends
of large-insert (7–11 kb) genomic DNA fragments are ligated
to form paired-end tag (PET) constructs, which are then sequenced in a
massively parallel manner (Figure S1, see Supporting
Information S1) [17,20]. Hillmer and colleagues [17,18,19] have
demonstrated that large-insert fragments provide higher
physical coverage with minimal sequencing effort, and have an
advantage over short inserts in detecting large genomic SVs
and in spanning complicated DNA sequence features such
as repetitive regions. We implemented this technique to analyze
the genomes of four individuals with NDD symptoms including
Developmental Delay (DD), Speech Delay (SD), Language Delay
(LD) and autistic disorder. These patients harbour cytogenetically
defined ABCR (two familial translocations, one familial inversion and one
de novo inversion), and prior aCGH analysis did not reveal
chromosomal imbalances. Our study shows that NGS technology
enables rapid identification of individual gene disruptions and
potential candidate genes in ABCR. We also show that
disruption of XIAP at an inversion breakpoint is
causative for the patient’s immunodeficiency phenotype.
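The advantage of large inserts for physical coverage can be illustrated with a short calculation. This is a rough sketch rather than the authors' computation: the genome size (3.1 Gb), the 9 kb insert size (midpoint of the 7–11 kb range) and the 300 bp short-insert comparison are assumptions, while the 50 bp tag length and the average of 35 million PETs per library are the figures given in the text.

```python
# Physical vs. sequence coverage for a paired-end tag (PET) library.
# Assumptions (not from the text): a 3.1 Gb genome, 9 kb as a representative
# insert size, and a 300 bp library as a hypothetical short-insert comparison.

GENOME_SIZE_BP = 3.1e9   # assumed approximate human genome size
N_PETS = 35e6            # average non-redundant PETs per library (from the text)
TAG_LENGTH_BP = 50       # length of each sequenced tag (from the text)


def physical_coverage(n_pairs: float, insert_size_bp: float, genome_size_bp: float) -> float:
    """Average number of inserts spanning any given base of the genome."""
    return n_pairs * insert_size_bp / genome_size_bp


def sequence_coverage(n_pairs: float, tag_length_bp: float, genome_size_bp: float) -> float:
    """Average number of sequenced bases (two tags per pair) covering any given base."""
    return n_pairs * 2 * tag_length_bp / genome_size_bp


large_insert = physical_coverage(N_PETS, 9_000, GENOME_SIZE_BP)
short_insert = physical_coverage(N_PETS, 300, GENOME_SIZE_BP)
seq_cov = sequence_coverage(N_PETS, TAG_LENGTH_BP, GENOME_SIZE_BP)

print(f"physical coverage, 9 kb inserts  : {large_insert:.0f}x")
print(f"physical coverage, 300 bp inserts: {short_insert:.1f}x")
print(f"sequence coverage (2 x 50 bp)    : {seq_cov:.1f}x")
```

Under these assumptions the same number of read pairs gives roughly 100x physical coverage with 9 kb inserts but only a few fold with 300 bp inserts, while the sequence coverage stays close to 1x. This is of the same order as the physical coverage reported in the Results.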
Results
Structural Variants Detection by DNA-PET Sequencing in Four Patients with Developmental Delay and Speech Disorders
By using patients’ genomic DNA as starting material, libraries
were generated and sequenced using a SOLiD platform, and we
obtained an average of 35 million non-redundant paired-end reads
(Table S1, see Supporting Information S1). The median physical
coverage using our technique was 98x, with an average of 94x
(Table S1, see Supporting Information S1). The majority of the paired-end
tags (PETs) mapped concordantly to the reference genome NCBI
Build 36 and are referred to as concordant PETs (cPETs); these
provided copy number information based on sequencing read
depth. The remaining PETs were referred to as
discordantly mapped PETs (dPETs); clusters of dPETs allowed the
identification of structural variants (SVs). After filtering (see Methods
section), approximately 96% of the SV calls generated for each
library were shared with normal individuals published in the
Database of Genomic Variants [21], 1000 Genome Project SVs
Pilot release set [22,23] or previous paired-end sequencing studies
of normal individuals [24,25,26].
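The distinction between concordant and discordant PETs can be sketched as a simple classification rule. This is a simplified illustration under stated assumptions, not the published pipeline: the `Pet` class, the `classify` function, the label names and the strand convention (both tags of a concordant PET map to the same strand, with the 5′ tag upstream on the plus strand) are all hypothetical, and the 7–11 kb window is taken from the insert range given in the text. In practice dPETs are also clustered and filtered before an SV is called.

```python
from dataclasses import dataclass

# Assumed window for a concordant span; the text gives 7-11 kb as the insert range.
MIN_INSERT_BP = 7_000
MAX_INSERT_BP = 11_000


@dataclass(frozen=True)
class Pet:
    """One paired-end tag: mapped chromosome, position and strand of each tag."""
    chrom5: str
    pos5: int
    strand5: str
    chrom3: str
    pos3: int
    strand3: str


def classify(pet: Pet) -> str:
    """Return 'cPET' for a concordant mapping, otherwise a coarse dPET label."""
    if pet.chrom5 != pet.chrom3:
        return "dPET:inter-chromosomal"      # e.g. translocation or insertion
    if pet.strand5 != pet.strand3:
        return "dPET:inversion-like"
    # Assumed convention: on the + strand the 5' tag lies upstream of the 3' tag,
    # and on the - strand the order is reversed.
    distance = pet.pos3 - pet.pos5 if pet.strand5 == "+" else pet.pos5 - pet.pos3
    if distance < 0:
        return "dPET:wrong-order"            # e.g. tandem-duplication-like
    if distance > MAX_INSERT_BP:
        return "dPET:span-too-long"          # e.g. deletion-like
    if distance < MIN_INSERT_BP:
        return "dPET:span-too-short"
    return "cPET"


# Hypothetical mappings, invented for illustration.
examples = [
    Pet("chr1", 1_000_000, "+", "chr1", 1_009_000, "+"),
    Pet("chr1", 1_000_000, "+", "chr1", 1_050_000, "+"),
    Pet("chr9", 5_000_000, "+", "chr17", 7_000_000, "+"),
    Pet("chrX", 3_000_000, "+", "chrX", 9_000_000, "-"),
]
for pet in examples:
    print(pet.chrom5, pet.pos5, pet.chrom3, pet.pos3, classify(pet))
```

The labels only indicate which kind of rearrangement a discordant mapping is compatible with; a single dPET is not evidence of an SV on its own.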
Using our analysis pipeline for patient-specific SV discovery,
the removal of SVs also found in normal individuals reduced this number to 7–19 events,
with a mean of 14 SVs per patient (Tables S2–S6, see Supporting
Information S1). Deletions were the most frequent
SVs, comprising 58% of all patient-specific SVs (32
of 55), while tandem duplications
comprised 20% (11 of 55) and the remaining SVs (translocations, inversions
and insertions) comprised 21% (Table 1). We performed
PCR analysis on 36 randomly chosen SVs with 2 sets of primers
spanning predicted breakpoint junctions. Of these, 27 SVs were
validated and produced a single, clear PCR band at expected size
range, suggesting ~75% validation rate. In parallel to DNA-PET,
we compared the copy number detection overlap to aCGH and
observed an average of 93% overlap between both experiments
(Table S7, see Supporting Information S1).
Table 1. List of SVs obtained after filtration of four patients.
The columns from Total to Ins give the patient-specific SVs after filtration.
Family  Patient  Karyotype                  dPET clusters(a)  SVs after data curation(b)  SVs overlap with normal(c)  Total  Del         BT        IT        UI        PI        TD        Ins
1       CD5      46,XY,t(9;17)(q12;q24)     505               284                         270 (95.07%)                14     12          2         0         0         0         0         0
2       CD10     46,XY,t(6;8)(q16.2;p11.2)  927               616                         601 (97.50%)                15     9           2         1         0         0         1         2
3       CD8      46,XY,inv(X)(p22q26)       724               387                         368 (95.09%)                19     9           0         0         1         2         7         0
4       CD9      46,XY,inv(5)(q22q35.1)     930               581                         574 (98.79%)                7      2           0         0         0         2         3         0
Total                                       3,086             1,868                       1,813                       55     32 (58.1%)  4 (7.2%)  1 (1.8%)  1 (1.8%)  4 (7.2%)  11 (20%)  2 (3.6%)
Abbreviations: Del, Deletion; BT, Balanced translocation; IT, Isolated translocation; UI, Unpaired inversion; PI, Paired inversion; TD, Tandem duplication; Ins, Insertion
a Discordantly mapped PETs (dPETs) which connect the same two genomic regions are clustered together.
b dPET clusters which has passed quality filters to remove sequencing artefacts provided the list of predicted SVs.
c Approximately 95% of SVs overlapped with SVs found in normal population (see Methods).
doi:10.1371/journal.pone.0090852.t001
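The proportions quoted for each SV type and the PCR validation rate follow directly from the counts in Table 1 and in the text. The short tally below is an illustrative sketch (the variable and function names are not from the source); it also shows that the percentages in the text and table appear to be truncated rather than rounded.

```python
# Patient-specific SV counts per type, summed over the four patients (Table 1 totals).
sv_counts = {
    "deletion": 32,
    "balanced translocation": 4,
    "isolated translocation": 1,
    "unpaired inversion": 1,
    "paired inversion": 4,
    "tandem duplication": 11,
    "insertion": 2,
}

total = sum(sv_counts.values())


def percent(count: int, whole: int) -> float:
    """Share of `count` in `whole`, as a percentage."""
    return 100.0 * count / whole


for sv_type, count in sv_counts.items():
    print(f"{sv_type:<24}{count:>3}  {percent(count, total):5.1f}%")

# Translocations, inversions and insertions taken together.
other = total - sv_counts["deletion"] - sv_counts["tandem duplication"]
print(f"other SV types: {other} of {total} ({percent(other, total):.1f}%)")

# PCR validation: 27 of 36 tested SVs gave a single band of the expected size.
validated, tested = 27, 36
print(f"validation rate: {percent(validated, tested):.0f}%")
```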
For each patient, we performed a case by case study to identify
the potential genes which were disrupted by the cytogenetically
visible rearrangements with the assumption that these were the
most likely causative events. We validated DNA-PET breakpoints
matching the cytogenetic rearrangements by Sanger sequencing
(Table 2) and FISH, and evaluated gene expression by quantitative
real-time PCR (RT-PCR). Apart from the specific chromosomal
rearrangements, we checked other SVs obtained for each patient.
Most of the SVs that coincide with coding regions have been
reported in the Database of Genomic Variants (DGV), indicating
that these SVs are likely benign (Tables S2–6, see Supporting
Information S1). We checked for the presence of regulatory
elements, such as transcription factor binding sites and epigenetic
marks, using RegulomeDB and the ENCODE dataset, on SVs
present in intra/intergenic regions and found no evidence of
known regulatory elements in any of them.
From these extensive analyses, we identified five disrupted genes
in three patients: GNAQ, RBFOX3, UNC5D, XIAP and TMEM47
(Table 2). Among them, XIAP has already been associated with X-
linked lymphoproliferative disorder 2 (XLP-2: MIM [300635]). To
evaluate the potential implication of the disrupted genes, we
assessed the occurrence of pre-existing Copy Number Variations
(CNVs) in DD cases from published studies and the DECIPHER
Consortium [27,28,29] (Table 3). Enrichment of CNV counts in
the cases versus controls was observed in RBFOX3 and UNC5D,
suggesting that altered gene dosage in these genes might have a
role in NDDs.
Breakpoint Characterization through Detailed SVs Analysis
Case 1 (Patient CD5). In this family, the translocation
segregates with a variable degree of phenotypic manifestation.
Patient CD5 had language delay (LD) and DD during his
childhood, and his two sons (CD21 and CD22) displayed absence
of speech, DD and autistic disorder (Figure 1A). The translocation
t(9;17) was represented by two abnormally oriented clusters
corresponding to the balanced translocation between chromosome
9 and 17. Sanger sequencing refined the breakpoint coordinates in
the translocation carriers, and revealed identical breakpoint
patterns on chromosome 9 at position chr9:79,572,001–
79,572,005 and on chromosome 17 at position
chr17:74,764,927–74,767,964. There is a loss of 4 bp and
3,036 bp on chromosome 9 and 17, respectively, with a
microhomology of 3 bp between paired breakpoints (Figure 1B).
The translocation disrupted GNAQ at intron 5 on chromosome 9
and RBFOX3 at intron 2 on chromosome 17. Both disrupted genes
shared the same orientation, breakpoints lie within introns, and
the resulting fusion is predicted to be in frame. However, RT-PCR
analysis on the patient’s lymphoblastoid cell line did not reveal any
fusion transcript expression as neither gene is expressed in these
cells, reflecting the importance of relevant cell type for validation
of brain-specific genes. We checked the mRNA expression of both
genes in a human tissue panel and found that both genes are highly
expressed in the fetal brain and the cerebellum (Figure 1C). GNAQ
encodes a member of the Gαq heterotrimeric protein family, and
null Gnaq mice exhibit cerebellar ataxia and motor coordination
deficits [30]. RBFOX3 is a neuron-specific splicing factor that is
exclusively expressed in neuronal nuclei. Twelve other
SVs were found in this patient; nine of them were validated by
PCR, of which 5 SVs were shared with his two affected sons (SV1,
2, 4, 6 and 13). These five SVs overlapped with known CNV
regions (SV1, 2, 4) reported in healthy individuals or were in
intergenic regions (SV6, 13) and were therefore excluded from
further analysis (Table S2, see Supporting Information S1).
Overall, the translocation segregates with LD and DD in this
family with variable penetrance, hinting at a potential functional
impact.
Case 2 (Patient CD10). In the second patient (CD10), the
causative effect of the t(6;8) balanced translocation is unclear, due
to variable expression of phenotype in the translocation carriers;
his mother is asymptomatic and his younger sibling displayed
schizencephaly and DD (CD11) (Figure 2A). DNA-PET identified
abnormal paired reads matching the balanced translocation, and
Sanger sequencing refined the breakpoint coordinates on chro-
mosome 6 at position chr6:98,318,526–98,318,538 and chromo-
some 8 at position chr8:35,527,969–35,527,976. The translocation
disrupted UNC5D at intron 5 on chromosome 8, while no coding
genes were disrupted on chromosome 6 (Figure 2B). UNC5D
encodes a member of the human UNC5 dependence receptor family
that is specifically expressed in layer 4 of the developing
neocortex in rats, which makes it a biologically plausible candidate
[31,32]. Since the translocation does not fully segregate with the
disease, all coding exons of UNC5D were sequenced in each
translocation carrier, but we did not find additional point
mutations in the other allele. UNC5D mRNA is exclusively
expressed in the adult brain, fetal brain, and cerebellum
(Figure 2C), thus expression changes could not be investigated in
the patient’s lymphoblastoid cell line. Notably, we observed
enrichment of CNV counts in 15 cases described by DECIPHER
(Figure 2D). Analysis of other SVs showed 5 deletions in intergenic
regions (SV1, 2, 11, 12, and 14) and 8 within genes (SV3, 4, 5, 6,
9, 10, 13 and 15), all of which overlapped with known CNV
regions (Table S3, see Supporting Information S1).
Table 2. Validated breakpoints and disrupted candidate genes from DNA-PET sequencing in four patients.
Patient  Cytogenetic analysis  Chr.  DNA-PET breakpoint predicted coordinate (SOLiD)  Validated breakpoint coordinate (Sanger)  Gene
CD5      t(9;17)               9     79,571,716–79,573,787                            79,572,001–79,572,005                     GNAQ
                               17    74,764,804–74,768,401                            74,764,927–74,767,964                     RBFOX3
CD10     t(6;8)                6     98,318,059–98,318,840                            98,318,526–98,318,538
                               8     35,527,808–35,528,282                            35,527,964–35,527,976                     UNC5D
CD8      inv(X)                X     119,984,597–119,984,844                          119,984,613–119,984,615
                               X     122,839,027–122,845,538                          122,839,398–122,844,862                   XIAP
CD9      inv(5)                5     111,962,591–111,963,149                          111,962,767
                               5     165,575,819–165,576,628                          165,576,568–165,576,574
doi:10.1371/journal.pone.0090852.t002
Case 3 (Patient CD8). The third patient (CD8) carried a
maternally derived pericentric inversion of chromosome X, between
Xp22 and Xq26 (Figure 3A). His monozygotic twin brother
harbors the same inversion. They both show signs of immunode-
ficiency combined with mild intellectual disability (ID) and LD.
SVs analysis of this patient revealed a 1.2 Mb inversion located at
chromosome Xq24–25 between paired breakpoints (Table S4, see
Supporting Information S1), which did not correspond to the large
inversion seen in the karyotype, inv(X)(p11.4;q24). Sanger
sequencing allowed us to delineate the precise breakpoint on
chromosome X at position chrX:119,984,613–119,984,615 and at
position chrX:122,839,398–122,844,862 with a loss of 1 bp and
5,463 bp on each breakpoint junction, respectively. Due to the
discrepancies between karyogram and NGS data, we traced other
SVs in chromosome X by including lower confidence SVs (cluster
size ≥4), and revealed a total of 10 SVs that clustered into multiple
breakpoint hotspots on the p arm (SV14, 17, 20, 21), the
centromeric region (SV22) and the q arm (SV13, 15, 16, 18, 19),
suggesting a complex chromosomal rearrangement mechanism
(Table S5, see Supporting Information S1). We verified the
breakpoints in the inversion carriers (the mother and the twin
boys) by PCR and FISH using probes RP1-315G1 (Xq25), RP1-
296G17 (Xq24), RP11-330K13 (Xp21), W12-499N23 (Xq25)
BAC and Fosmid clones (Figure 3B). These data suggested two
large sequential breaks in each arm of chromosome X (Xp21 and
Xq24–25) that resulted in tandem duplications, deletion and
isolated breakpoints (Figure 3C). This double-inversion mecha-
nism rearranged the landscape of chromosome X architecture,
and appeared as a large inversion on G-banding karyotype.
Combining DNA-PET sequencing and extensive FISH validations
allowed us to characterize the complex chromosomal rearrangements
in chromosome X as the result of a sequential double
inversion mechanism.
This complex rearrangement disrupted two genes: XIAP and
TMEM47. The latter encodes a member of the PMP22/EMP/
Claudin protein family and is ubiquitously expressed in human tissues,
including adult and fetal brain (Figure 3D). Two other genes were
affected by tandem duplications as part of the complex
rearrangement (SV18, 19). The first, SH2D1A, encodes SH2
domain-containing protein 1A (SAP), which has been associated with
X-linked lymphoproliferative syndrome type 1 (XLP-1 [MIM:
308240]). This gene was initially suspected as the primary cause
of the immunodeficiency phenotype seen in the patient, but it
was ruled out because SAP expression was normal in the patient’s blood
lysates (see Patients and Methods). The second, ODZ1, encodes a
homolog of the Drosophila pair-rule gene Odd Oz/Ten-M, which is
involved in post-segmentation processes, including embryonic
development of the central nervous system (CNS), eyes, and limbs
in Drosophila [33,34].
RT-PCR of the four genes revealed a total absence of
expression for XIAP and TMEM47 and a moderate to normal
expression for SH2D1A and ODZ1 in the patient’s fibroblast cell line
compared to sex-matched controls, supporting the absence of
functional XIAP and TMEM47 genes in the male patient
(Figure 3E). Considering the normal expression of SAP protein
and the absence of functional gene expression of XIAP, these data
suggest that XLP-2 underlies the immunodeficiency
symptoms of the affected twin boys.
Case 4 (Patient CD9). In patient 4 (CD9), sequencing
analysis identified two paired-inversion clusters corresponding to
the large inversion on chromosome 5. This inversion spans 53 Mb
between chromosome 5q22.2 (chr5:111,962,591–111,963,149) and 5q34
(chr5:165,575,819–165,576,628), with a 1 bp deletion and
microhomologies of 3 bp between paired breakpoints. No gene was
disrupted at the breakpoint junctions (Table S6, see Supporting
Information S1), and no evidence of regulatory elements at
either breakpoint was found in the RegulomeDB
and ENCODE databases. Additionally, there are 5 other
patient-specific SVs: two were located in intergenic regions (SV5 and
6); two were tandem duplications that overlapped with known
CNVs (SV4 and 7); and one was a 4 kb intronic deletion in the RNF19B
gene that is not listed in DGV (SV1) (Table S6, see
Supporting Information S1). Based on this analysis, it seems
unlikely that the large inversion on chromosome 5 causes the
clinical phenotype. Other patient-specific SVs, mutations that
cannot be identified by DNA-PET (such as point mutations), or
exposure to environmental factors might be the underlying cause
of the phenotypic features seen in this patient.
Table 3. Screening of CNVs in cases/controls from published and public datasets.
                         Cases                                            Control
Patient  Disrupted gene  Cooper et al. [28]  DECIPHER Consortium  Total  Cooper et al. [28]  1000 Genome  Total
CD5      GNAQ            3                   7                    10     14                  0            14
CD5      RBFOX3          68                  11                   79     1                   4            5
CD10     UNC5D           2                   15                   17     0                   0            0
CD8      TMEM47          0                   10                   10     0                   4            4
The total number of cases in Cooper et al. [28] is 15,767, and for DECIPHER is 17,000 [29]. The total number of controls in Cooper et al. is 8,329, and for the 1000 Genome SV Release set is 185 [22].
doi:10.1371/journal.pone.0090852.t003
Discussion
In the past decade, we have seen substantial progress in the
identification of novel candidate genes in NDDs, driven by recent
developments in technology. Two studies of large cohorts of
patients with developmental delay described the enrichment of
large CNVs in 15% of the cases [28], and highlighted the presence
of additional large CNVs that co-exist with primary microdele-
tion/duplication syndrome in 10% of the cases as an additive
contributing factor to more severe phenotype [35]. These CNVs
have been useful to provide a better classification of microdele-
tion/duplication syndromes; however these regions often encom-
pass multiple genes and thus make it challenging to identify
plausible candidates. A recent exome sequencing study in individuals
with ID identified potentially causative de novo single
nucleotide variants (SNVs) with a diagnostic yield of 16%,
comparable to the CNV burden obtained in copy number studies
[36]. A more conventional approach in candidate gene identifi-
cation involves delineating candidate genes in the chromosomal
breakpoints of the ABCR [9,37,38,39]. In contrast to the
downstream effects of SNVs or CNVs, genes that are disrupted
by translocations or inversions are presumably more severely
affected, generally resulting in protein truncation or heterozygous
inactivation of the affected allele.
Figure 1. Patient CD5 with translocation t(9;17). A) The pedigree of patient CD5 is indicated. The translocation is transmitted to his two sons (CD21 and CD22). B) The translocation between chromosomes 9 and 17 was validated by Sanger sequencing in three translocation carriers. The reference sequence is indicated, showing the fusion of two genes at the genomic level: the first five exons of GNAQ fused to exons 3–14 of RBFOX3, and the first two exons of RBFOX3 fused to exons 6–7 of GNAQ. C) mRNA expression of GNAQ and RBFOX3 showed high expression in fetal brain, adult brain and cerebellum in a human tissue panel.
doi:10.1371/journal.pone.0090852.g001
Recent studies have described the feasibility of using NGS
technologies to map the ABCR breakpoints in patients with
neurodevelopmental abnormalities [37,38,40,41,42,43,44]. These
technologies include (i) a shotgun sequencing approach using
flow-sorted derivative chromosomes [40], (ii) custom jumping
libraries coupled to targeted breakpoint capture [38], which are
limited to the chromosomal breakpoint regions, (iii) standard
paired-end sequencing or large-insert jumping libraries of 3–4 kb
[37,38,41], and (iv) mate-pair sequencing of 2–3 kb insert sizes
[42,43].
Figure 2. Patient CD10 with translocation t(6;8). A) The pedigree of patient CD10 is indicated. The familial translocation is inherited from the asymptomatic carrier mother and shared with his affected sister (CD11). B) Sanger sequencing analysis refined the chromosomal breakpoint regions and revealed a loss of 11 bp on chromosome 6 and 8 bp on chromosome 8, with a microhomology of 3 bp between the paired breakpoints. C) UNC5D mRNA expression in a human tissue panel showed high expression in the fetal brain, adult brain and cerebellum compared to other tissues. D) The translocation breakpoint is located at intron 1 of UNC5D (indicated by the black arrow), in a region encompassing 15 CNV cases described in DECIPHER.
doi:10.1371/journal.pone.0090852.g002
Figure 3. Patient CD8 with a complex chromosomal inversion. A) Karyogram of normal chromosome X compared to der(X) in patient CD8. B) FISH validation of 10 SVs shown in Table S5 (see Supporting Information S1) with the respective FISH probes: hybridization signals of RP1-296G17-Biot (SV15) and RP1-315-Dig (SV16) were localized on the centromere of the patient’s metaphase. Probes for SV17 and SV21 on Xp21 (RP11-330K13-Biot) and Xq25 (W12-499N23-Dig), respectively, resulted in a split signal between Xp21 and the centromeric region in the patient’s chromosome. Further FISH analysis was performed using probe RP11-762M23-Biot on Xq11.1 (SV22), which was found to localize on the upper chromosomal arm. Probe RP11-655E22 on Xp11.2 was localized on the lower arm of derivative chromosome X. C) Reconstructed derivative chromosome X for patient CD8. Normal human chromosome X according to ISCN 2009 with the arrow orientation from a to d and the proposed mechanism of sequential double inversion in patient CD8. Based on our FISH analysis, an inversion occurred first between Xp21 and Xq25, changing the orientation of the p and q arms with a shift of the centromere position towards the lower q-arm, shown by inverted red arrows b and c. This was followed by the second inversion that occurred between Xq11.1 and Xq25, altering the orientation of the q-arm (inverted green arrow c). D) Expression of TMEM47 in a human tissue panel assessed by qRT-PCR. E) Expression analysis of four disrupted genes in patient CD8 assessed by qRT-PCR.
doi:10.1371/journal.pone.0090852.g003
Our work describes the use of genome paired-end tag (DNA-PET)
sequencing with larger insert sizes of 7–11 kb
[17,18,19,45,46]. Insert sizes of approximately 8–15 kb
have been shown to be more advantageous for SV
detection in more complicated DNA sequence features, such as
repetitive regions or large genomic rearrangements, and also
provide higher physical coverage with minimal sequencing
effort compared to smaller insert sizes [19,47]. We implemented
this technique to map the breakpoints of four patients with DD
harbouring cytogenetically defined ABCR, and identified specific
breakpoints for all of them. Sanger sequencing was required to
refine the breakpoints at the base pair level. The observed cryptic
breakpoint anomalies were deletions ranging from 3 bp to
5,462 bp and/or microhomologies of 2–4 bp, suggesting a
microhomology-mediated end joining (MMEJ) [48] or
non-homologous end joining (NHEJ) [49] mechanism. Both are major
pathways for double-strand break repair that often operate in
non-homologous regions at non-recurrent chromosomal breakpoints,
emphasizing the unique characteristics of these rearrangements.
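To make the notion of junction microhomology concrete, the sketch below counts the bases around a junction that are identical in both reference sequences, so that the breakpoint could be shifted without changing the junction sequence. The function name `microhomology_length`, the coordinate convention and the toy sequences are assumptions for illustration; this is not the procedure used to annotate the breakpoints in this study.

```python
def microhomology_length(ref_a, break_a, ref_b, break_b):
    """Count microhomology at a junction ref_a[:break_a] + ref_b[break_b:].

    Bases immediately left of both breaks, or immediately right of both
    breaks, that are identical in the two references make the exact
    breakpoint position ambiguous; their number is the microhomology.
    """
    # Extend to the left: compare bases just before each break.
    left = 0
    while (break_a - left > 0 and break_b - left > 0
           and ref_a[break_a - left - 1] == ref_b[break_b - left - 1]):
        left += 1
    # Extend to the right: compare bases just after each break.
    right = 0
    while (break_a + right < len(ref_a) and break_b + right < len(ref_b)
           and ref_a[break_a + right] == ref_b[break_b + right]):
        right += 1
    return left + right


if __name__ == "__main__":
    # Toy sequences (hypothetical): both references share "ACG" at the join.
    ref_a = "TTTTTACGCCCCC"
    ref_b = "GGGGGACGAAAAA"
    print(ref_a[:8] + ref_b[8:])                       # junction sequence
    print(microhomology_length(ref_a, 8, ref_b, 8))    # shared bases
```

In this toy case the three shared bases would be reported as a 3 bp microhomology, while a blunt join between unrelated flanks gives zero.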
In this study, five genes were identified within the ABCR-
breakpoint regions in three patients (GNAQ, RBFOX3, UNC5D,
XIAP, and TMEM47), and we observed various complications in
attempting to correlate these genes to the expressed phenotypes.
For one case (Patient CD8), the complex inversion pattern seen in
chromosome X resembles chromothripsis, a phenomenon commonly
found in cancer and, more recently, in congenital diseases.
Chromothripsis results from localized shattering of one or a few
chromosomes and reassembly of the chromosomal pieces by
non-homologous end joining (NHEJ), and thus appears as a complex
chromosome rearrangement involving three or more breakpoints
[41,42,43,50,51]. This complex event highlights both the advantage
and the limitation of the DNA-PET technology, as FISH experiments
were necessary to better understand the higher-order structure of
the rearrangement. The inversion breakpoint disrupted the XIAP gene,
which is the primary cause of a rare form of XLP2 [52,53] (XLP2 [MIM:
300635]) and is likely to be causative for the immunodeficiency
phenotype of the patient. More interestingly, these patients also
presented with mild ID and LD, features that are not usually
associated with XLP syndromes, suggesting the possibility of
additional contributing candidate genes within the complex
rearrangements. TMEM47 is among the potentially interesting
candidates, with total absence of transcript expression in the patient’s
cell line and expression in fetal to adult brain. We propose
mutational screening of TMEM47 in X-linked ID
patients in future studies.
For one patient (Patient CD9), there were no disrupted genes or
regulatory elements at the breakpoint regions, nor other
potentially interesting SVs, showing the limitation of candidate gene
detection through ABCR breakpoint cloning methods alone.
Alternatively, exome sequencing of parent-offspring trios can be
considered as a method of choice to further investigate possible
causal variants, especially in cases where there is a lack of
association between ABCR and disease.
For the two remaining familial translocations, we identified a
fusion between the GNAQ and RBFOX3 genes at the genomic level in a
t(9;17) translocation carried by three affected members of a family
(Patient CD5) with different levels of neurological symptoms (mild
to severe), although the fusion could not be verified at the
transcript level due to lack of expression in the available cell
lines. We also identified UNC5D disruption in a family harbouring a
t(6;8) translocation carried by two affected siblings (Patient CD10)
and their mother. The translocation carriers in the latter family
displayed a broad range of clinical presentations: an asymptomatic
mother, her first child with mild DD, and her second child
presenting with schizencephaly, polymicrogyria and LD, suggesting
additional etiological factors underlying these features apart from
the t(6;8) translocation. Besides the large phenotypic variability
between translocation carriers seen in both families, we observed an
enrichment of CNV counts for UNC5D and RBFOX3 in NDD-associated
cases [28]. Therefore, we cannot totally exclude the possibility
that disruption of these genes has highly variable effects.
Independent validation screening in patients with isolated
neurological phenotypes, rather than in collective multi-symptom
cohorts, would be useful to establish significant associations with
specific neurodevelopmental symptoms.
Despite the potential functions of four identified genes (GNAQ,
RBFOX3, UNC5D and TMEM47) in brain development, further
functional validations are required to establish a correlation
between these disrupted genes and patients’ clinical phenotype.
Furthermore, reporting such ABCR-disrupted genes is crucial to
determine the clinical relevance of newly identified CNVs or
SNVs encompassing these genes.
In this study, we showed the potential of a cost-effective NGS
technology to rapidly pinpoint disrupted genes within ABCR
breakpoint regions. The high overlap (93%) between NGS- and
array-based technologies shown in this study emphasizes the
importance of employing a single technique that can provide high
coverage of genome information with consistent validation rate as
a routine clinical diagnosis tool. The stringency of our filtering
pipeline can be optimized and adjusted depending on the
sequencing platforms, and this would revolutionize the character-
ization of individual genomes of patients without prior karyotypic
observation. For the implementation in the clinical settings,
unaffected parents or siblings should be included for genome
investigation to reduce the number of familial, non-pathogenic
SVs. Whilst the technology continues to improve, validation with
longer reads by Sanger sequencing is still necessary for better
SV annotation. Our study complements the existing application
of NGS technology in unexplained NDDs patients for better
characterization of chromosomal rearrangements and discovery of
potential candidate genes.
Patients and Methods
Patients
All samples and information were collected after written
informed consent from patient’s parents was obtained and in
accordance with local institutional review board approved
protocols from National University of Singapore in Singapore,
Children’s Hospital Westmead in Sydney, Australia and Centre
Hospitalier Regional in Orleans, France. DNA samples were
obtained from peripheral blood lymphocytes and cultured skin
fibroblasts obtained from patients seen at the participating
institutes.
Patient CD5. This family carries a balanced translocation and
presents with variable degrees of DD and autistic features. The first
son displayed autistic behavior and global DD at three years of
age, with an absence of speech, feeding and sleeping difficulties,
habit disorders, and stereotypic movements. At 4 years,
communication and speech had improved in the first son.
Chromosome analysis revealed a translocation t(9;17) (see Table 1),
which is shared with his father and sibling. During genetic
counselling, the father reported that he suffered from LD and DD
during childhood, which was not investigated at the time and had
resolved by adolescence. His second son was found to have DD
and autistic features at the age of two years. DNA tested for the
translocation was obtained from the father who was referred to as
patient CD5. Chromosome analysis in the phenotypically normal
mother revealed a normal karyotype.
Patient CD10. The patient was born at term by lower
segment caesarean section, due to breech presentation. Delay in
developmental milestones was noted at 18 months of age affecting
both walking and speech. At 9 years old, comprehension and
behavioural difficulties were noted at school. Karyotype analysis
revealed a balanced translocation t(6;8) (see Table 1). Metabolic
screening and FRAXA testing were normal. Karyotypic analysis in
his parents revealed that his mother carried the same balanced
translocation. She had no intellectual difficulties, but was reported
to have had a ‘hole in the heart’ in childhood, which closed
spontaneously. His younger sister had the same translocation
detected by amniocentesis during pregnancy. She was noted to
have plagiocephaly soon after birth. She had feeding difficulties in
early infancy, which gradually resolved. At 2 years old, she had LD
with low-average fine motor and gross motor skills. A right
intermittent exotropia was noted. An MRI head scan showed a
closed lip schizencephaly involving the right frontal lobe with
polymicrogyria in the right Sylvian fissure. MRI brain scans in her
brother and mother were normal.
Patient CD8. The patient was one of monozygotic twins. At
the age of three years, he had an acute EBV infection with
prolonged hepatosplenomegaly and abnormality of liver function
tests. Immunological investigations showed reduced IgM, de-
creased CD4 T-helper cells and decreased natural killer (NK) cell
function. XLP-1 was suspected, but western blot
analysis of the patient’s whole blood cell lysates showed normal
expression of SAP. At the age of five years, mild ID was diagnosed,
as well as a specific language disorder affecting his receptive and
expressive skills. Karyotype analysis revealed a complex inversion
inv(X) (see Table 1), derived from his phenotypically normal
mother. His identical twin showed similar clinical features and
carries the same karyotype. The maternal uncle was reported to
have recurrent infections and a maternal great uncle died at eight
months of age, apparently due to liver abnormalities.
Patient CD9. The patient was born by normal delivery after
an uneventful pregnancy. Mild delay in early motor and speech
developmental milestones was reported. At age twelve, he was
found to have average intellectual ability, an expressive language
disorder, and specific learning difficulties in maths, spellings and
reading; as well as attention deficit disorder. Chromosome analysis
revealed a paracentric inversion of chromosome 5 (see Table 1).
Both parents have normal karyotypes and do not show any
phenotypic abnormality.
Cytogenetic and FISH Analysis
Karyotypes were determined from G-banding analysis using
standard protocols according to the ISCN nomenclature. FISH
analysis was carried out using protocols described elsewhere
[54,55]. Metaphase chromosomes were prepared from Epstein-
Barr virus-transformed lymphoblastoid cell lines (EBV-LCL) or
cultured skin fibroblast cells obtained from patients, parents or
sibling carriers by standard techniques. BAC and fosmid probes
were obtained from BACPAC Resources (Oakland, CA). Probes
were labelled by nick-translation kit (Enzo) with biotin-16-dUTP
or digoxigenin-11-dUTP (Roche). The probes were blocked with
1 mg/ml Cot-1 DNA (Life Technologies), and resuspended at a
concentration of 5 ng/μl in hybridization buffer (2xSSC, 10%
Dextran Sulfate, 1x PBS, 50% Formamide). Fluorescent signals
were visualized by avidin-conjugated fluorescein isothiocyanate
(FITC) (Vector Laboratories, CA) or anti-Digoxigenin-Rhoda-
mine (Roche). Chromosomes were counter-stained with DAPI and
the signal was analysed using a Nikon Epifluorescence Microscope
equipped with ISIS Metasystems for imaging analysis.
Genomic DNA Preparation
Genomic DNA from lymphoblastoid cell lines, fibroblasts, or
blood from the patient was extracted using Qiagen Blood and Cell
Culture DNA Kits (Qiagen), according to the manufacturer’s
instructions. Lymphoblastoid and fibroblast cell lines were maintained for a
minimum number of passages prior to DNA extraction. Quality
and quantity of the extracted DNA were measured using
NanoDrop 1000 Spectrophotometer and agarose gel electropho-
resis.
aCGH
aCGH was performed in all patient samples using the SurePrint
G3 Human 2 x 400 k aCGH Microarray (Agilent Technologies
Inc. Santa Clara) according to the manufacturer’s instructions. The
microarray slides were scanned on an Agilent Microarray
Scanner. Data were processed by Genomic Workbench software,
standard edition 5.0.14 (Agilent). We used the Aberration
Detection Method (ADM-2) algorithm to identify DNA copy
number variations (CNV). The ADM-2 algorithm identifies all
aberrant intervals in a given sample with consistently high or low
log ratio based on the statistical score. We applied a filtering
option of minimum of three probes in region and centralization
threshold of 6. We used NCBI Build 36 as a reference genome.
For smaller CNVs that were identified by DNA-PET, we
compared with the aCGH raw data and interpreted the copy
number change by looking at the individual probe ratio, using a
cutoff of 0.1 (<0.1 indicated deletion, >0.1 indicated duplication).
DNA-PET Library Construction
We constructed the DNA-PET libraries for four patients
according to the method described by Hillmer et al. [17]. Briefly,
genomic DNA was hydrosheared to 7–11 kb DNA fragments.
Long Mate Paired (LMP) cap adaptors were ligated to the
hydrosheared and end-repaired DNA fragments. The cap
adaptor-ligated DNA fragments were separated by agarose gel
electrophoresis, recovered using the QIAEXII Gel Extraction Kit
(QIAGEN) and circularized with a biotinylated adaptor that
connects the cap adaptors at both ends of the DNA fragments.
Missing 5′ phosphate groups of the cap adaptors created a nick on
each strand after circularization of the DNA. Both nicks were
translated outwards by >50 bp into the circularized genomic
DNA fragment by DNA polymerase I (NEB). The nick-translated
constructs were then digested with T7 exonuclease and S1
nuclease (NEB), to release paired-end tag (PETs) library
constructs. These constructs were ligated with SOLiD sequencing
adaptors P1 and P2 (Life Technologies), and amplified using 2x
HF Phusion Master Mix (Finnzymes OY) for sequencing. High
throughput sequencing of the 50 bp libraries was performed on
SOLiD sequencers (v3plus and v4, respectively) according to the
manufacturer’s recommendation (Life Technologies). Sequence
tags were mapped to the human reference sequence (NCBI Build
36) and paired using SOLiD System Analysis Pipeline Tool
Bioscope, allowing up to 12 color code mismatches per 50 bp tag.
For samples CD5 and CD8, two DNA size fractions were merged
for library construction, which resulted in a reduced sensitivity to
identify small deletions.
DNA-PET Data Curation
The majority of the PET sequences mapped concordantly to the
reference genome (concordant PETs or cPETs), with the expected
mapping orientation (5′ tag to 3′ tag) and expected mapping
distance (according to the selected fragment size). The distribution
of cPETs in the genome was used to retrieve copy number
information. The remaining portion of the PETs mapped
discordantly to the reference genome (discordant PETs or dPETs),
classified as those with incorrect paired-tag orientation and/or
incorrect genomic distances. These dPETs provided information
to search for genomic rearrangements; with specific criteria for
different types of SVs according to PET mapping orientation and
genomic region as described by Yao et al. [19]. The overlapping
dPETs representing similar SVs were clustered together as dPET
clusters and counted as the cluster size, according to the procedure
as described in Hillmer et al. [17] with refined data curation as
described in Ng et al. [20].
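The distinction between cPETs and dPETs, and the grouping of dPETs into clusters, can be sketched as follows. This is a simplified illustration and not the pipeline of Hillmer et al. [17] or Ng et al. [20]: the `PET` record, the 7–11 kb span window (taken from the library fragment size), the assumption that concordant tags map to the same strand with the 5′ tag upstream of the 3′ tag, and the greedy clustering rule are all assumptions of this sketch.

```python
from collections import namedtuple

# One paired-end tag: mapping position and strand of the 5' and 3' tags.
PET = namedtuple("PET", "chrom5 pos5 strand5 chrom3 pos3 strand3")

# Assumed concordant span window, based on the 7-11 kb fragment selection.
MIN_SPAN, MAX_SPAN = 7_000, 11_000


def classify_pet(pet, min_span=MIN_SPAN, max_span=MAX_SPAN):
    """Label a PET as concordant or as one kind of discordant mapping."""
    if pet.chrom5 != pet.chrom3:
        return "discordant: inter-chromosomal"
    if pet.strand5 != pet.strand3:
        return "discordant: orientation"
    # Signed distance from the 5' tag to the 3' tag along the mapped strand.
    span = pet.pos3 - pet.pos5 if pet.strand5 == "+" else pet.pos5 - pet.pos3
    if span < 0:
        return "discordant: tag order"
    if not min_span <= span <= max_span:
        return "discordant: distance"
    return "concordant"


def cluster_dpets(dpets, window=MAX_SPAN):
    """Greedily group dPETs whose two tags both lie near a cluster's seed."""
    clusters = []
    for pet in sorted(dpets, key=lambda p: (p.chrom5, p.chrom3, p.pos5)):
        for cluster in clusters:
            seed = cluster[0]
            same_layout = (
                (pet.chrom5, pet.chrom3, pet.strand5, pet.strand3)
                == (seed.chrom5, seed.chrom3, seed.strand5, seed.strand3)
            )
            if (same_layout and abs(pet.pos5 - seed.pos5) <= window
                    and abs(pet.pos3 - seed.pos3) <= window):
                cluster.append(pet)
                break
        else:
            clusters.append([pet])
    return clusters


if __name__ == "__main__":
    # Hypothetical tags near the reported t(9;17) breakpoints.
    dpets = [PET("chr9", 79_572_000 + 50 * i, "+", "chr17", 74_765_000 + 50 * i, "+")
             for i in range(4)]
    for cluster in cluster_dpets(dpets):
        print(classify_pet(cluster[0]), "cluster size:", len(cluster))
```

The cluster size reported by such a grouping corresponds to the number of supporting dPETs, which is the quantity thresholded when lower-confidence SVs are included.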
Filtering of Normal Structural Variations (SVs)
Comparison of clusters across different genomes was performed
as described by Ng et al. [20]. We included DNA-PET data of 23
normal individuals (25 DNA-PET data sets) and the pilot release
set of 1000 Genome Project [22] from Mills et al. in the cross-
genome comparison, and identified SVs that were present in the
normal libraries. In addition, we used the breakpoint locations to
compare the identified SVs with published SVs based on paired-
end sequencing studies of 18 additional normal individuals [24,25]
and the Database of Genomic Variants. The fraction of a predicted SV
that overlapped with a published SV was calculated as the
percentage of overlap relative to the larger event. We
categorized SVs that overlapped by 80% or more with those
identified by these studies as normal SVs. SVs classified as
normal were excluded in order to identify rearrangements that
underlie the diseases.
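The overlap rule described above can be written as a short sketch: the overlap between a predicted SV and a published SV is divided by the length of the larger of the two events, and a predicted SV is treated as normal when that fraction reaches 80%. The tuple layout `(chrom, start, end)` and the function names are assumptions for illustration; matching on SV type, which a real comparison would likely also require, is omitted.

```python
def overlap_fraction(sv_a, sv_b):
    """Overlap of two SVs as a fraction of the larger event.

    Each SV is a (chrom, start, end) tuple with start < end.
    """
    chrom_a, start_a, end_a = sv_a
    chrom_b, start_b, end_b = sv_b
    if chrom_a != chrom_b:
        return 0.0
    overlap = min(end_a, end_b) - max(start_a, start_b)
    if overlap <= 0:
        return 0.0
    larger = max(end_a - start_a, end_b - start_b)
    return overlap / larger


def is_normal_sv(sv, known_svs, threshold=0.8):
    """True when the SV overlaps any known SV by at least `threshold`."""
    return any(overlap_fraction(sv, known) >= threshold for known in known_svs)


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    known = [("chr5", 1_100, 2_100), ("chr5", 50_000, 90_000)]
    print(is_normal_sv(("chr5", 1_000, 2_000), known))   # 90% overlap
    print(is_normal_sv(("chr5", 1_000, 3_000), known))   # 50% overlap
```

Dividing by the larger event makes the criterion symmetric, so a small patient SV nested inside a much larger published CNV is not automatically classified as normal.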
CNVs Screening in Public Datasets
We used the CNV map published by Cooper et al.,
with 15,767 cases of DD and 8,329 adult controls, to screen for
deletions or duplications in our candidate genes [28]. We also
screened the DECIPHER database, with >17,000 cases
carrying CNVs disrupting individual genes. As an additional control
dataset, we screened for normal CNVs in the first release SV set
of the 1000 Genomes Project Consortium of 185 individuals [22].
CNVs were counted in both cases and controls spanning candidate
genes in these datasets.
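Counting CNVs that span a candidate gene reduces to an interval-overlap count, sketched below. The coordinates are hypothetical placeholders rather than the actual gene or CNV positions, and treating any partial overlap as "spanning" is an assumption of this sketch.

```python
def count_cnvs_spanning(gene, cnvs):
    """Count CNVs that overlap a gene interval on the same chromosome.

    `gene` and each CNV are (chrom, start, end) tuples.
    """
    chrom, start, end = gene
    return sum(
        1
        for cnv_chrom, cnv_start, cnv_end in cnvs
        if cnv_chrom == chrom and cnv_start <= end and cnv_end >= start
    )


if __name__ == "__main__":
    # Hypothetical gene interval and CNV calls, for illustration only.
    gene = ("chr8", 35_000_000, 35_700_000)
    case_cnvs = [
        ("chr8", 34_900_000, 35_100_000),   # overlaps the gene start
        ("chr8", 35_200_000, 35_300_000),   # lies within the gene
        ("chr8", 36_000_000, 36_500_000),   # downstream, no overlap
        ("chr6", 35_200_000, 35_300_000),   # different chromosome
    ]
    print(count_cnvs_spanning(gene, case_cnvs))
```

Running the same count on case and control call sets gives the per-gene tallies that are then compared between cohorts.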
In silico Analysis of Regulatory Regions
The hg18 coordinates of genomic regions from each patient’s
SV list were converted to hg19 using LiftOver from the UCSC genome
browser. SVs located within introns or intergenic regions were
assessed for the probability score of functional regulatory regions
in RegulomeDB [56] and ENCODE data [57,58] in the UCSC
genome browser.
Validations of Expected Breakpoints by PCR
Primers were designed using the Primer3 program for amplicons
spanning the breakpoints predicted by dPET clusters,
according to human genome assembly NCBI Build 36. PCR
was carried out with JumpStart REDAccuTaq LA polymerase
(Sigma Aldrich Inc., St. Louis, MO) in a 50 μl reaction volume
with 500 ng of genomic DNA as a template. The following
program was used: 1) initial denaturation at 96 °C for 30 sec; 2) 40
cycles of 15 sec at 94 °C, 30 sec at 58 °C, and 10 min at 68 °C; 3) 68 °C for
10 min. PCR products showing single bands were purified by Gel
Extraction Kit (Qiagen) and used as templates for sequencing in
both directions by Sanger sequencing. The sequences of junction
fragments were aligned to the human genome reference sequence
using Blat [59].
Expression Analysis
Total RNA was extracted from patients’ EBV-LCLs or fibroblasts
using the RNeasy kit (Qiagen). Reverse transcription of 2 μg RNA
derived from patients, 2 lymphoblast controls, 2 fibroblast
controls and a commercially available human tissue panel RNA
(Clontech) was performed in 20 μl of SuperScript III Reverse
Transcriptase reaction buffer (Life Technologies) using random
hexamer primers. Quantitative real-time PCR (qRT-PCR) reactions
were performed in the ABI PRISM 7500 HT system (Life Technologies)
with a five-fold dilution of cDNA and 200 nM of each primer using
the SybrGreen PCR Master Mix (Life Technologies). Data were
analyzed using the 2^(-ΔΔCt) method and normalized against a
control sample with human ACTB. Each measurement was performed in
triplicate. The controls used in this study were derived from
EBV-LCLs or skin fibroblast cells of 4 normal individuals.
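For readers unfamiliar with the ΔΔCt calculation, the sketch below shows the arithmetic on triplicate Ct values. The function name and the example Ct values are hypothetical, and the formula assumes roughly 100% amplification efficiency for both the target and the reference gene (ACTB in this study).

```python
from statistics import mean


def relative_expression(target_ct_sample, ref_ct_sample,
                        target_ct_control, ref_ct_control):
    """Fold change of a target gene in a sample relative to a control.

    Each argument is a list of replicate Ct values. The reference gene
    (e.g. ACTB) is used to normalize both the sample and the control.
    """
    delta_sample = mean(target_ct_sample) - mean(ref_ct_sample)
    delta_control = mean(target_ct_control) - mean(ref_ct_control)
    delta_delta_ct = delta_sample - delta_control
    return 2 ** (-delta_delta_ct)


# Hypothetical triplicates: the target amplifies one cycle later in the
# patient sample than in the control, with identical reference Ct values.
fold = relative_expression([26.0, 26.0, 26.0], [18.0, 18.0, 18.0],
                           [25.0, 25.0, 25.0], [18.0, 18.0, 18.0])
```

In this made-up example the one-cycle delay corresponds to a fold change of 0.5, the value expected if one of two alleles contributed no transcript.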
Supporting Information
Information S1 The file SupportingInformation_S1.pdf contains
additional information to the manuscript. It consists of 9 pages, 1
Figure and 7 tables.
(DOCX)
Acknowledgments
We thank the patients and family members for their participation. We
thank Dr. Stuart Tangye for Western Blot analysis in patient CD8. This
study makes use of data generated by the DECIPHER Consortium. A full
list of centres who contributed to the generation of the data is available
from http://decipher.sanger.ac.uk and via email from decipher@sanger.ac.uk.
The sequencing data from this study have been submitted to NCBI
Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo)
under accession no. GSE51430.
Author Contributions
Conceived and designed the experiments: AMH VC SD. Performed the
experiments: KHU EGYC LSS ASMT ZZ LSS SY IP. Analyzed the data:
KHU AMH CWHL PJC CCS PNA SLR. Wrote the paper: KHU AMH
IA VC SD SKHT. Provided sequencing platform and facility: XR YR
AMH ETL. Contributed patient samples: GP FC MW AK GH AM OP SB RVJ.
Provided analysis tools: KWKS.
References
1. Fantes JA, Boland E, Ramsay J, Donnai D, Splitt M, et al. (2008) FISH mapping
of de novo apparently balanced chromosome rearrangements identifies
characteristics associated with phenotypic abnormality. Am J Hum Genet 82:
916–926.
2. Warburton D (1991) De novo balanced chromosome rearrangements and extra
marker chromosomes identified at prenatal diagnosis: clinical significance and
distribution of breakpoints. Am J Hum Genet 49: 995–1013.
3. MacGregor DJ, Imrie S, Tolmie JL (1989) Outcome of de novo balanced
translocations ascertained prenatally. J Med Genet 26: 590–591.
4. Vandeweyer G, Kooy RF (2009) Balanced translocations in mental retardation.
Hum Genet 126: 133–147.
5. Hochstenbach R, van Binsbergen E, Engelen J, Nieuwint A, Polstra A, et al.
(2009) Array analysis and karyotyping: workflow consequences based on a
retrospective study of 36,325 patients with idiopathic developmental delay in the
Netherlands. Eur J Med Genet 52: 161–169.
6. Rauch A, Hoyer J, Guth S, Zweier C, Kraus C, et al. (2006) Diagnostic yield of various genetic approaches in patients with unexplained developmental delay or mental retardation. Am J Med Genet A 140: 2063–2074.
7. Ray PN, Belfall B, Duff C, Logan C, Kean V, et al. (1985) Cloning of the breakpoint of an X;21 translocation associated with Duchenne muscular dystrophy. Nature 318: 672–675.
8. Muir WJ, Gosden CM, Brookes AJ, Fantes J, Evans KL, et al. (1995) Direct microdissection and microcloning of a translocation breakpoint region, t(1;11)(q42.2;q21), associated with schizophrenia. Cytogenet Cell Genet 70: 35–40.
9. Millar JK, Wilson-Annan JC, Anderson S, Christie S, Taylor MS, et al. (2000) Disruption of two novel genes by a translocation co-segregating with schizophrenia. Hum Mol Genet 9: 1415–1423.
10. Chelly J, Tumer Z, Tonnesen T, Petterson A, Ishikawa-Brush Y, et al. (1993) Isolation of a candidate gene for Menkes disease that encodes a potential heavy metal binding protein. Nat Genet 3: 14–19.
11. Verga V, Hall BK, Wang SR, Johnson S, Higgins JV, et al. (1991) Localization of the translocation breakpoint in a female with Menkes syndrome to Xq13.2-q13.3 proximal to PGK-1. Am J Hum Genet 48: 1133–1138.
12. Miller DT, Adam MP, Aradhya S, Biesecker LG, Brothman AR, et al. (2010) Consensus statement: chromosomal microarray is a first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies. Am J Hum Genet 86: 749–764.
13. Rauch A, Ruschendorf F, Huang J, Trautmann U, Becker C, et al. (2004) Molecular karyotyping using an SNP array for genomewide genotyping. J Med Genet 41: 916–922.
14. Shaffer LG (2005) American College of Medical Genetics guideline on the cytogenetic evaluation of the individual with developmental delay or mental retardation. Genet Med 7: 650–654.
15. Le Scouarnec S, Gribble SM (2012) Characterising chromosome rearrangements: recent technical advances in molecular cytogenetics. Heredity (Edinb) 108: 75–85.
16. Savage MS, Mourad MJ, Wapner RJ (2011) Evolving applications of microarray analysis in prenatal diagnosis. Curr Opin Obstet Gynecol 23: 103–108.
17. Hillmer AM, Yao F, Inaki K, Lee WH, Ariyaratne PN, et al. (2011) Comprehensive long-span paired-end-tag mapping reveals characteristic patterns of structural variations in epithelial cancer genomes. Genome Res 21: 665–675.
18. Inaki K, Hillmer AM, Ukil L, Yao F, Woo XY, et al. (2011) Transcriptional consequences of genomic structural aberrations in breast cancer. Genome Res 21: 676–687.
19. Yao F, Ariyaratne PN, Hillmer AM, Lee WH, Li G, et al. (2012) Long span DNA paired-end-tag (DNA-PET) sequencing strategy for the interrogation of genomic structural mutations and fusion-point-guided reconstruction of amplicons. PLoS One 7: e46152.
20. Alam MM, Chattopadhyaya M, Chakrabarti S (2012) On the origin of large two-photon activity of DANS molecule. J Phys Chem A 116: 11034–11040.
21. Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, et al. (2004) Detection of large-scale variation in the human genome. Nat Genet 36: 949–951.
22. Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, et al. (2011) Mapping copy number variation by population-scale genome sequencing. Nature 470: 59–65.
23. Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, et al. (2012) An integrated map of genetic variation from 1,092 human genomes. Nature 491: 56–65.
24. Kidd JM, Cooper GM, Donahue WF, Hayden HS, Sampas N, et al. (2008) Mapping and sequencing of structural variation from eight human genomes. Nature 453: 56–64.
25. Korbel JO, Urban AE, Affourtit JP, Godwin B, Grubert F, et al. (2007) Paired-end mapping reveals extensive structural variation in the human genome. Science 318: 420–426.
26. Conrad DF, Pinto D, Redon R, Feuk L, Gokcumen O, et al. (2010) Origins and functional impact of copy number variation in the human genome. Nature 464: 704–712.
27. Corpas M, Bragin E, Clayton S, Bevan P, Firth HV (2012) Interpretation of genomic copy number variants using DECIPHER. Curr Protoc Hum Genet Chapter 8: Unit 8 14.
28. Cooper GM, Coe BP, Girirajan S, Rosenfeld JA, Vu TH, et al. (2011) A copy number variation morbidity map of developmental delay. Nat Genet 43: 838–846.
29. Swaminathan GJ, Bragin E, Chatzimichali EA, Corpas M, Bevan AP, et al. (2012) DECIPHER: web-based, community resource for clinical interpretation of rare variants in developmental disorders. Hum Mol Genet 21: R37–44.
30. Offermanns S, Hashimoto K, Watanabe M, Sun W, Kurihara H, et al. (1997) Impaired motor coordination and persistent multiple climbing fiber innervation of cerebellar Purkinje cells in mice lacking Galphaq. Proc Natl Acad Sci U S A 94: 14089–14094.
31. Takemoto M, Hattori Y, Zhao H, Sato H, Tamada A, et al. (2011) Laminar and areal expression of unc5d and its role in cortical cell survival. Cereb Cortex 21: 1925–1934.
32. Zhu Y, Li Y, Haraguchi S, Yu M, Ohira M, et al. (2013) Dependence receptor UNC5D mediates nerve growth factor depletion-induced neuroblastoma regression. J Clin Invest 123: 2935–2947.
33. Ben-Zur T, Feige E, Motro B, Wides R (2000) The mammalian Odz gene family: homologs of a Drosophila pair-rule gene with expression implying distinct yet overlapping developmental roles. Dev Biol 217: 107–120.
34. Ben-Zur T, Wides R (1999) Mapping homologs of Drosophila odd Oz (odz): Doc4/Odz4 to mouse chromosome 7, Odz1 to mouse chromosome 11; and ODZ3 to human chromosome Xq25. Genomics 58: 102–103.
35. Girirajan S, Rosenfeld JA, Coe BP, Parikh S, Friedman N, et al. (2012) Phenotypic heterogeneity of genomic disorders and rare copy-number variants. N Engl J Med 367: 1321–1331.
36. de Ligt J, Willemsen MH, van Bon BW, Kleefstra T, Yntema HG, et al. (2012) Diagnostic exome sequencing in persons with severe intellectual disability. N Engl J Med 367: 1921–1929.
37. Talkowski ME, Rosenfeld JA, Blumenthal I, Pillalamarri V, Chiang C, et al. (2012) Sequencing chromosomal abnormalities reveals neurodevelopmental loci that confer risk across diagnostic boundaries. Cell 149: 525–537.
38. Talkowski ME, Ernst C, Heilbut A, Chiang C, Hanscom C, et al. (2011) Next-generation sequencing strategies enable routine detection of balanced chromosome rearrangements for clinical diagnostics and genetic research. Am J Hum Genet 88: 469–481.
39. Millar JK, Christie S, Anderson S, Lawson D, Hsiao-Wei Loh D, et al. (2001) Genomic structure and localisation within a linkage hotspot of Disrupted In Schizophrenia 1, a gene disrupted by a translocation segregating with schizophrenia. Mol Psychiatry 6: 173–178.
40. Chen W, Ullmann R, Langnick C, Menzel C, Wotschofsky Z, et al. (2010) Breakpoint analysis of balanced chromosome rearrangements by next-generation paired-end sequencing. Eur J Hum Genet 18: 539–543.
41. Chiang C, Jacobsen JC, Ernst C, Hanscom C, Heilbut A, et al. (2012) Complex reorganization and predominant non-homologous repair following chromosomal breakage in karyotypically balanced germline rearrangements and transgenic integration. Nat Genet 44: 390–397, S391.
42. Kloosterman WP, Guryev V, van Roosmalen M, Duran KJ, de Bruijn E, et al. (2011) Chromothripsis as a mechanism driving complex de novo structural rearrangements in the germline. Hum Mol Genet 20: 1916–1924.
43. Kloosterman WP, Tavakoli-Yaraki M, van Roosmalen MJ, van Binsbergen E, Renkens I, et al. (2012) Constitutional chromothripsis rearrangements involve clustered double-stranded DNA breaks and nonhomologous repair mechanisms. Cell Rep 1: 648–655.
44. Schluth-Bolard C, Labalme A, Cordier MP, Till M, Nadeau G, et al. (2013) Breakpoint mapping by next generation sequencing reveals causative gene disruption in patients carrying apparently balanced chromosome rearrangements with intellectual deficiency and/or congenital malformations. J Med Genet 50: 144–150.
45. Ng KP, Hillmer AM, Chuah CT, Juan WC, Ko TK, et al. (2012) A common BIM deletion polymorphism mediates intrinsic resistance and inferior responses to tyrosine kinase inhibitors in cancer. Nat Med.
46. Nagarajan N, Bertrand D, Hillmer AM, Zang ZJ, Yao F, et al. (2012) Whole-genome reconstruction and mutational signatures in gastric cancer. Genome Biol 13: R115.
47. van Heesch S, Kloosterman WP, Lansu N, Ruzius FP, Levandowsky E, et al. (2013) Improving mammalian genome scaffolding using large insert mate-pair next-generation sequencing. BMC Genomics 14: 257.
48. McVey M, Lee SE (2008) MMEJ repair of double-strand breaks (director’s cut): deleted sequences and alternative endings. Trends Genet 24: 529–538.
49. Wang JC, Dang L, Fisker T (2010) Chromosome 6 between-arm intrachromosomal insertion with intrasegmental double inversion: a four-break model. Am J Med Genet A 152A: 209–211.
50. Kloosterman WP, Cuppen E (2013) Chromothripsis in congenital disorders and cancer: similarities and differences. Curr Opin Cell Biol.
51. Liu P, Erez A, Nagamani SC, Dhar SU, Kolodziejska KE, et al. (2011) Chromosome catastrophes involve replication mechanisms generating complex genomic rearrangements. Cell 146: 889–903.
52. Rigaud S, Fondaneche MC, Lambert N, Pasquier B, Mateo V, et al. (2006) XIAP deficiency in humans causes an X-linked lymphoproliferative syndrome. Nature 444: 110–114.
53. Rigaud S, Latour S (2007) [An X-linked lymphoproliferative syndrome (XLP) caused by mutations in the inhibitor-of-apoptosis gene XIAP]. Med Sci (Paris) 23: 235–237.
54. Lichter P, Tang CJ, Call K, Hermanson G, Evans GA, et al. (1990) High-resolution mapping of human chromosome 11 by in situ hybridization with cosmid clones. Science 247: 64–69.
55. Cherif D, Derre J, Berger R (1990) Fluorescence in situ hybridization on metaphase chromosomes with biotinylated probes. In situ hybridization, biotin labeling, cosmids, gene mapping, oncogene amplification. Nouv Rev Fr Hematol 32: 459–460.
56. Boyle AP, Hong EL, Hariharan M, Cheng Y, Schaub MA, et al. (2012) Annotation of functional variation in personal genomes using RegulomeDB. Genome Res 22: 1790–1797.
57. Thurman RE, Rynes E, Humbert R, Vierstra J, Maurano MT, et al. (2012) The accessible chromatin landscape of the human genome. Nature 489: 75–82.
58. Bernstein BE, Birney E, Dunham I, Green ED, Gunter C, et al. (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489: 57–74.
59. Kent WJ (2002) BLAT–the BLAST-like alignment tool. Genome Res 12: 656–664.
RESEARCH ARTICLE
Impaired Development of Neural-Crest Cell-Derived Organs and Intellectual Disability Caused by MED13L Haploinsufficiency
Kagistia Hana Utami,1,2 Cecilia Lanny Winata,1 Axel M. Hillmer,3 Irene Aksoy,4 Hoang Truong Long,5 Herty Liany,1 Elaine G. Y. Chew,3 Sinnakaruppan Mathavan,1 Stacey K. H. Tay,2 Vladimir Korzh,6 Pierre Sarda,7 Sonia Davila,1∗ and Valere Cacheux1∗
1Human Genetics, Genome Institute of Singapore, Singapore, 138672; 2Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117597; 3Cancer Therapeutics and Stratified Oncology, Genome Institute of Singapore, Singapore, 138672; 4Stem Cells and Developmental Biology, Genome Institute of Singapore, Singapore, 138672; 5Infectious Disease, Genome Institute of Singapore, Singapore, 138672; 6Zebrafish Translational Unit, Institute of Molecular and Cell Biology, Singapore, 138673; 7CHRU de Montpellier, Hospital Arnaud-de-Villeneuve, Montpellier Cedex 5 34295, France
Communicated by Arnold Munnich
Received 22 April 2014; accepted revised manuscript 22 July 2014.
Published online 18 August 2014 in Wiley Online Library (www.wiley.com/humanmutation). DOI: 10.1002/humu.22636
ABSTRACT: MED13L is a component subunit of the Mediator complex, an important regulator of transcription that is highly conserved across eukaryotes. Here, we report MED13L disruption in a translocation t(12;19) breakpoint of a patient with Pierre–Robin syndrome, moderate intellectual disability, craniofacial anomalies, and muscular defects. The phenotype is similar to that of previously described patients with MED13L haploinsufficiency. Knockdown of the MED13L orthologue in zebrafish, med13b, showed early defective migration of cranial neural crest cells (NCCs) that contributed to cartilage structure deformities at later stages, recapitulating craniofacial anomalies seen in human patients. Notably, we observed abnormal distribution of developing neurons in different brain regions of med13b morphant embryos, which could be rescued upon introduction of full-length human MED13L mRNA. To compare with a mammalian system, we suppressed MED13L expression by short-hairpin RNA in ES-derived human neural progenitors, and differentiated them into neurons. Transcriptome analysis revealed differential expression of components of Wnt and FGF signaling pathways in MED13L-deficient neurons. Our finding provides a novel insight into the mechanism of overlapping phenotypic outcome targeting NCC-derived organs in patients with MED13L haploinsufficiency, and emphasizes a clinically recognizable syndromic phenotype in these patients.
Hum Mutat 0:1–10, 2014. © 2014 Wiley Periodicals, Inc.
KEY WORDS: MED13L; zebrafish model; craniofacial dysmorphism; neural crest; next-generation sequencing
Additional Supporting Information may be found in the online version of this article.
∗Correspondence to: Sonia Davila, Human Genetics, Genome Institute of Singapore,
Singapore 138672. E-mail: [email protected]; Valere Cacheux, Human Genetics,
Genome Institute of Singapore, Singapore 138672. E-mail: [email protected]
Contract grant sponsors: Biomedical Medical Research Council (BMRC) of the
Agency for Science, Technology and Research (A∗STAR), Singapore; Singapore Inter-
national Graduate Award (SINGA).
Introduction
MED13L (MIM #608771) is one of the subunits of the CDK8 module of the Mediator complex, a dissociable component of the Mediator complex that has been described as having activating or repressing functions in regulating transcription [Tsai et al., 2013]. The Mediator complex regulates gene expression by physically linking transcription factors to RNA polymerase II [Ansari and Morse, 2013; Davis et al., 2013; Nakamura et al., 2011; Tsai et al., 2013]. Recent studies have shown that MED13L has a role in cancer pathways, specifically in suppressing Retinoblastoma/E2F-induced growth [Angus and Nevins, 2012], and that it is targeted for degradation by FBW7 (F-box and WD repeat domain-containing 7), a tumor suppressor and a component of the SCF (Skp-Cullin-F box) ubiquitin ligase [Welcker and Clurman, 2008], which disrupts the CDK8-Mediator association [Davis et al., 2013].
Several studies have identified structural variants (SVs) and mutations affecting MED13L in patients with heart defects, craniofacial anomalies, and intellectual disability (ID) [Muncke et al., 2003; Najmabadi et al., 2011; Asadollahi et al., 2013]. The first study reported a patient with dextro-looped transposition of the great arteries 1 (dTGA1; MIM #608808) and ID who harbored a translocation disrupting MED13L, and identified three additional missense mutations in MED13L (p.Glu251Gly, p.Arg1872His, and p.Asp2023Gly) by mutation screening of a cohort of dTGA patients [Muncke et al., 2003]. Subsequently, copy-number variants encompassing MED13L were reported in three patients: two carried out-of-frame deletions and presented with conotruncal heart defect, moderate ID, hypotonia, and facial anomalies, and one carried a triplication involving full-length MED13L and a neighboring gene, MAP1LC3B2, and displayed a much milder phenotype with learning disability and no major facial dysmorphism [Asadollahi et al., 2013]. In addition, a homozygous nonsynonymous variant, p.Arg1416His, was found in two siblings presenting with isolated nonsyndromic ID without associated anomalies, suggesting that disruption of MED13L might underlie the overlapping phenotypes in these individuals.
Here, we report a new patient harboring a monoallelic translocation disrupting MED13L who displays phenotypes remarkably similar to those of previously reported patients harboring MED13L deletions, which strongly supports the view that haploinsufficiency of MED13L represents a clinically recognizable entity. Using zebrafish as a developmental model system, we demonstrated, through generation and rescue of zebrafish phenotypes, that morpholino (MO)-mediated knockdown of med13b, the closest zebrafish orthologue of MED13L, resulted in improper development of branchial and pharyngeal arches due to defects in early migration of neural crest cells (NCCs), thereby recapitulating the craniofacial anomalies seen in human patients. Furthermore, we analyzed the gene expression changes upon MED13L knockdown in neurons derived from human embryonic stem cells (hESCs) and identified significant differences in expression of genes from Wnt and FGF signaling pathways.
Materials and Methods
Ethics Statement
Patient sample and information were collected after written informed consent was obtained from the patient’s parents and in accordance with protocols approved by the local institutional review boards of the National University of Singapore, Singapore, and CHRU de Montpellier, Hospital Arnaud-de-Villeneuve, France.
Patient
The patient is the only child of healthy and unrelated parents. She was diagnosed with Pierre–Robin sequence at birth, characterized by cleft palate, glossoptosis, and retrognathia. She had multiple limb contractures and camptodactyly, including metatarsus adductus of the thumb, and bilateral equinovarus foot deformity. Her electroencephalogram showed epileptiform discharges with absence seizures. At 4 years old, she had moderate intellectual disability (IQ = 50) and language difficulties with the absence of speech. Other developmental milestones were significantly delayed and she required special education. At the age of 14 years, speech remained globally delayed, with difficulties in isolated words, graphism, and phonological programming. MRI done at the age of 8 years revealed ventricular enlargement in correlation with global atrophy. Physical examinations revealed dysmorphic features including flat occiput, hypertelorism, flat philtrum, bulbous nose, and broad nasal bridge. Strabismus and hirsutism were noted. She developed scoliosis during puberty, at about 14 years old. Chromosome analysis revealed a balanced translocation between chromosomes 12 and 19, t(12;19)(q24;q12). Both parents have normal karyotypes and did not show any phenotypic abnormality. Array-CGH (Agilent 244K) performed on the patient’s genomic DNA showed no additional chromosomal imbalances.
Genomic DNA
The DNA sample was obtained from Epstein–Barr virus-transformed lymphoblastoid cell lines derived from peripheral blood lymphocytes of the patient, and was extracted with the QIAamp DNA Blood and Cell Culture DNA Kit (Qiagen, Valencia, CA) according to the manufacturer’s instructions. Quality and quantity of the extracted DNA were measured using a NanoDrop 1000 Spectrophotometer and agarose gel electrophoresis.
DNA–PET Sequencing
We constructed the DNA–PET library according to the method described in Hillmer et al. (2011). Genomic DNA (40 μg) was fragmented to 10 kb DNA fragments, ligated to long mate paired cap adaptors, and end repaired. After size selection, the library was amplified by PCR and subjected to high-throughput sequencing of 50 bp libraries using the SOLiD sequencers (v4) according to the manufacturer’s instructions (Life Technologies, Carlsbad, CA). Sequence tags were mapped to the human reference sequence (NCBI Build 36) and paired using the SOLiD system Analysis Pipeline Tool Bioscope, allowing up to 12 color code mismatches per 50 bp tag.
The majority of the PET sequences mapped to the reference genome NCBI Build 36 with the expected mapping orientation and expected mapping distances (concordant PETs or cPETs). The remaining PETs mapped discordantly to the reference genome (discordant PETs or dPETs) and were classified as those with incorrect paired-tag orientation and/or incorrect genomic distances. These abnormally oriented dPETs provided information on different types of structural variations, as described in Hillmer et al. (2011) and Ng et al. (2012). Filtering of SVs was performed against 23 normal individuals (25 DNA-PET data sets), the pilot release SV set of the 1000 Genome Project, previously published paired-end sequencing studies in 18 normal individuals, and the Database of Genomic Variants (DGV). Sequences have been submitted to the Short Read Archive (http://trace.ncbi.nlm.nih.gov/Traces/sra) at the National Center for Biotechnology Information (NCBI) with the accession number SRP034864.
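The distinction between cPETs and dPETs can be sketched as a simple rule set. This is a simplified illustration, not the Bioscope or DNA-PET pipeline: it assumes that, after mapping, both tags of a concordant PET are reported on the same strand with the 5′ tag upstream of the 3′ tag, and the span limits of 7 to 13 kb are hypothetical values chosen around the 10 kb fragment size.

```python
def classify_pet(tag5, tag3, min_span=7_000, max_span=13_000):
    """Classify one paired-end tag as concordant (cPET) or discordant (dPET).

    tag5, tag3: (chrom, position, strand) for the 5' and 3' tags.
    The orientation convention and span limits are assumptions.
    """
    chrom5, pos5, strand5 = tag5
    chrom3, pos3, strand3 = tag3
    if chrom5 != chrom3:
        return "dPET:inter-chromosomal"   # e.g. candidate translocation
    if strand5 != strand3:
        return "dPET:orientation"         # e.g. candidate inversion
    # On the minus strand the 5' tag is expected at the higher coordinate.
    span = pos3 - pos5 if strand5 == "+" else pos5 - pos3
    if span < 0:
        return "dPET:order"               # e.g. candidate tandem duplication
    if span > max_span:
        return "dPET:span-too-large"      # e.g. candidate deletion
    if span < min_span:
        return "dPET:span-too-small"
    return "cPET"
```

In practice a single dPET is not treated as evidence on its own; clusters of dPETs that agree on the same pair of genomic regions are what point to a breakpoint, as in the t(12;19) clusters reported in the Results.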
Sanger Sequencing Validation
Primers were designed with the Primer3 program, and the amplicons spanned the breakpoints predicted by dPET clusters according to the human genome assembly NCBI Build 36. PCR was carried out with JumpStart REDAccuTaq LA polymerase (Sigma–Aldrich Inc., St. Louis, MO) in a 50-μl reaction volume with 500 ng of genomic DNA template. The following program was used: (1) initial denaturation at 96°C for 30 sec; (2) 40 cycles of 15 sec at 94°C, 30 sec at 58°C, 10 min at 68°C; and (3) 68°C for 10 min. PCR products showing single bands were purified with the QiaQuick PCR Purification Kit (Qiagen) and used as templates for Sanger sequencing in both directions. The sequences of junction fragments were aligned to the human genome reference NCBI Build 36 sequence using Blat in the UCSC Genome Browser.
Quantitative RT-PCR
Total RNA was extracted using the RNeasy Mini kit (Qiagen). Reverse transcription of 2 μg RNA was performed in 20 μl of SuperScript III Reverse Transcriptase reaction buffer (Life Technologies) using random hexamer primers. qPCR reactions were performed in the ViiA 7™ Real-Time PCR system (Life Technologies) with 100 ng of cDNA and 200 nM of each primer using the SybrGreen PCR Master Mix (Life Technologies). Measurements were performed in triplicate. Relative mRNA expression was obtained by using the 2^(–ΔΔCt) method and normalized against a control sample with human ACTB. For zebrafish morphants, data were normalized against zebrafish 18S ribosomal RNA or actin.
Western Blot
Lymphoblastoid cells were lysed in RIPA buffer with protease inhibitor, and protein concentration was measured with a Bradford assay kit (Bio-Rad, Hercules, CA). Cell lysate (50 μg) was resolved on a 10% SDS-PAGE gel and transferred to a polyvinylidene fluoride membrane (Bio-Rad). After blocking in 5% milk for 1 hr, the appropriate primary antibodies were added for 1 hr: anti-MED13L (rabbit polyclonal antibody generated against the region of amino acids 550–600, ab87831; Abcam, Cambridge, United Kingdom) and anti-β-actin (sc-1616; Santa Cruz, Dallas, TX). After washing with TBS/0.1% Tween-20 (TBS-T), horseradish peroxidase-conjugated anti-rabbit or anti-goat IgG secondary antibodies were added for 1 hr. After washing with TBS-T, signals were developed using the Amersham ECL Select detection kit (GE Healthcare, Uppsala, Sweden). Western blot experiments were repeated twice.
MO Microinjection
Two antisense MOs for zebrafish Mediator complex subunit 13-like, med13b (NM_001083838), were designed by the manufacturer (GeneTools, Philomath, OR) to target the start codon (med13b-1-Start: ATGGCAGAGCCCCTCGTTTGTTAGA) and the splice site between exons 14 and 15 (med13b-2-Splice: AGAACTCATCATCAAAGCGCAGTCC). Subsequently, 4 ng of each MO was injected into the yolk of one- to two-cell stage embryos from the wild-type AB zebrafish strain. Injected embryos were incubated at 28.5°C until reaching the appropriate developmental stage. A consistent phenotype was observed in at least three independent experiments of around 90–150 embryos each. The embryos were then fixed in 4% paraformaldehyde in PBS (PFA/PBS) overnight at 4°C, dehydrated through a gradual increase of methanol concentration, and stored in 100% MeOH at –20°C overnight.
In Situ Hybridization
Whole mount in situ hybridizations were carried out as previously described by Thisse and Thisse (2008). Antisense DIG (Roche, Basel, Switzerland) riboprobes were synthesized for med13b (cDNA clone ID: 7050861; GE Dharmacon, Lafayette, CO), sox10 (PCR amplified with primers 5′-GGGATTCAGAGCGCGAGCGA-3′ and 5′-CGCGCATTTAGGTGACACTATAGAAGTGACAGGTACTAGCATCATGTG-3′), twist1b (PCR amplified with primers 5′-CCCTCCGTGACGCAGGAGGA-3′ and 5′-CGCGCATTTAGGTGACACTATAGAAGTGTTCGTTGAGTGTGTGTGTTT-3′), and islet1 (PCR amplified with primers 5′-TCCAGGCTCAAACTCCAC-3′ and 5′-GGATCCATTAACCCTCACTAAAGGGAATGTCCGACCGTTTACTTACAG-3′).
MED13L mRNA Synthesis
Full-length MED13L cDNA (NM_015335.4) was amplified from cDNA of the human cell line GM12878 by using primers hMED13L-F 5′-AGGATCATGACTGCGGCAGC-3′ and hMED13L-R 5′-TCGCGGAGGATCATGACT-3′. Full-length MED13L cDNA was subcloned into the TOPOII Dual Promoter vector by using the Topo TA Cloning Kit (Life Technologies). The validity and orientation of clones were confirmed by PCR and sequencing. Capped mRNA was generated by using the SP6 mMESSAGE Machine T3 Kit (Life Technologies) according to the manufacturer’s instructions and purified by using the RNeasy mini kit (Qiagen). mRNA rescue experiments were performed by coinjection of 4 ng med13b-MO and 150 pg of MED13L mRNA.
Alcian Blue Staining
Embryos were fixed at 5 dpf with 4% PFA for 2 hr at room temperature. After fixation, embryos were dehydrated in ethanol (50%) for 10 min at room temperature. Embryos were then transferred to Alcian Blue staining solution (0.4% Alcian Blue [w/v] in 70% ethanol) and rocked at room temperature overnight. To remove the pigmentation in older embryos, embryos were bleached the next day in 3% H2O2 and 2% KOH for 20 min. To clear the tissue, embryos were soaked in 20% glycerol/0.25% KOH for 30 min, and prepared for imaging in the same solution.
Neural Progenitor Cells Maintenance
NPC differentiation of ESCs was done based on established protocols [Li et al., 2011]. Cells were first plated in clumps on matrigel-coated 6-well plates using mTESR1 media (StemCell Technologies, Vancouver, British Columbia, Canada). The following day, the media was changed to NPC media composed of DMEM-F12/neurobasal media (1:1) supplemented with N2 and B27 without vitamin A supplements (Gibco, Life Technologies) and CHIR99021 (4 μM) (StemCell Technologies), SB431542 (3 μM) (Millipore, Billerica, MA), Compound E (0.1 μM) (Millipore), and human LIF (10 ng/ml) (Millipore). After 7 days, cells were passaged using accutase and grown in the same basal media with CHIR99021 (3 μM), SB431542 (2 μM), and additional growth factors human LIF (10 ng/ml), EGF (20 ng/ml) (Life Technologies), and bFGF (20 ng/ml) (Life Technologies).
shRNA Transfection
Two shRNA sequences targeting MED13L were designed and cloned into pSUPER neo vectors (Oligoengine, Seattle, WA). The two shRNAs were transfected into H1-NPCs by using Fugene 6 transfection reagent (Promega, Madison, WI). After 48 hr, G418 (Gibco, Life Technologies) was added to the culture for selection of stably transfected cells. The surviving cells were expanded for NPC characterization and neuronal differentiation.
Neuronal Differentiation
Neuronal differentiation was performed according to the protocol described by Brennand et al. (2011). NPCs were plated at a density of 200,000 cells per well of a 6-well plate on Poly-L-Ornithine/Laminin-coated coverslips, and grown in neuronal differentiation media (DMEM-F12/neurobasal media [1:1] supplemented with N2 and B27 supplements [Gibco, Life Technologies], GDNF [20 ng/ml] [R&D Systems, Minneapolis, MN], BDNF [20 ng/ml] [R&D Systems], dibutyryl-cyclic AMP [1 mM] [Sigma–Aldrich Inc.], and ascorbic acid [200 nM] [Sigma–Aldrich Inc.]). H1-NPC-derived neurons were differentiated for 4 weeks to achieve TUJ1- and MAP2-positive neuronal cells. For characterization, neurons were passaged using accutase and replated at a very low density (5,000 cells per well of a 24-well plate) on coverslips.
Immunocytochemistry
Cells were plated on ethanol-treated coverslips, fixed with 4% paraformaldehyde in PBS at 4°C, and incubated with TBS containing 5% fetal bovine serum, 1% BSA (Sigma–Aldrich Inc.), and 0.1% Triton X-100 (Sigma–Aldrich Inc.) for 45 min at room temperature. The following primary antibodies were used: Nestin (ab22030, 1:500), Sox2 (sc17320, 1:200), Vimentin (sc7557, 1:100), Map2 (A2320, 1:1000), and Tuj1 (PRB-435P, 1:2000). Each primary antibody was incubated with fixed cells overnight at 4°C, and cells were subsequently stained with secondary antibody conjugated to Alexa Fluor 594 or 488 (Molecular Probes, Life Technologies, Carlsbad, CA) (4 μg/ml) for 1 hr at room temperature in the dark. Images were captured using a ZEISS AxioObservor DI inverted fluorescence microscope (Carl Zeiss, Oberkochen, Baden-Wurttemberg, Germany).
Microarray
Human Ref-12 Expression BeadChip microarrays (Illumina, San Diego, CA) were used for genome-wide expression analysis. For hybridization, cRNAs from duplicate or triplicate samples were synthesized and labeled using TotalPrep RNA Amplification Kit (Ambion, Austin, TX), following the manufacturer’s instructions. Scanned data from the BeadChip raw files for all samples were retrieved and background corrected using BeadStudio, and subsequent analyses were completed in GeneSpring GX. Data were normalized both within and between arrays, and corrected for multiple testing according to Benjamini–Hochberg. We defined genes as significantly differentially expressed when FDR was <0.05.
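The Benjamini–Hochberg step named above can be illustrated with a minimal sketch. This is not the GeneSpring GX implementation, only the textbook step-up procedure; the function name `benjamini_hochberg` and the list of p-values are hypothetical and chosen purely for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values (not taken from the study).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
q_values = benjamini_hochberg(p_values)
# Apply the FDR < 0.05 threshold used in the text.
significant = [q < 0.05 for q in q_values]
```

With these illustrative inputs only the two smallest p-values pass the FDR < 0.05 threshold, even though five raw p-values are below 0.05, which is the behaviour the correction is meant to enforce.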
Data Analysis
Gene ontology and pathway enrichment analyses were performed using DAVID Functional Annotation Bioinformatics Microarray Analysis and the PantherDB Classification system. Lists of differentially expressed genes were submitted to these systems, and the top GO categories were considered significant at P < 0.05 after multiple testing correction with Benjamini–Hochberg.
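The idea behind an enrichment analysis of this kind can be sketched as an over-representation test. The block below uses a plain one-sided hypergeometric test, which is an assumption made for illustration: DAVID and PANTHER apply their own statistics, and the function name `hypergeom_enrichment_p` and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(hits, list_size, category_size, background_size):
    """P(X >= hits) when list_size genes are drawn without replacement from
    background_size genes, of which category_size belong to the category."""
    upper = min(list_size, category_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(category_size, k)
        * comb(background_size - category_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20 background genes, 5 in the category,
# a list of 4 differentially expressed genes, 2 of which are in the category.
p_toy = hypergeom_enrichment_p(hits=2, list_size=4, category_size=5, background_size=20)
```

In a real analysis the background would be the set of genes represented on the array, and each category's p-value would then be corrected across all tested categories, as stated in the text.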
Results
Identification of MED13L Disruption in a Patient with Mild ID and Dysmorphism
We analyzed a patient with Pierre–Robin sequence at birth, moderate ID, limb defects, and dysmorphic features. Prior cytogenetic analysis revealed a de novo balanced translocation between chromosomes 12 and 19, t(12;19)(q24;q12). To fine-map the translocation breakpoint, we performed large-insert genome paired-end tag (DNA–PET) sequencing as described previously [Hillmer et al., 2011; Yao et al., 2012; Utami et al., 2014], which generated approximately 66 million sequence reads and achieved 150× physical coverage. Two abnormally oriented paired clusters corresponding to the t(12;19) translocation were identified. The breakpoint on chromosome 12 (chr12:114,971,734–114,971,744, hg18) disrupted MED13L, and no coding genes were found at the chromosome 19 breakpoint (chr19:35,769,937–35,769,968, hg18). Sanger sequencing confirmed the breakpoint junction in intron 4 of MED13L with a 1-bp microhomology in derivative chromosome 12, likely resulting in a MED13L protein truncated at amino acid position Ser160 (Fig. 1A). Capillary sequencing of the coding exons of MED13L confirmed the heterozygous disruption and did not reveal any additional mutation in the flanking exons (Supp. Fig. S1). Both mRNA and protein expression of MED13L were reduced in the patient’s lymphoblastoid cell lines compared with two healthy controls, as shown by qPCR and Western blot analysis (Supp. Fig. S2). MED13L mRNA is ubiquitously expressed in
Figure 1. Mapping the translocation breakpoint disrupting MED13L by sequencing and comparison with previously reported mutations or disruptions affecting MED13L. A: DNA–PET sequencing precisely pinpointed the translocation t(12;19) breakpoint disrupting the MED13L gene within intron 4 at position chr12:114,971,734–114,971,744 (NCBI Build 36), with a loss of 10 bp and a 1-bp microhomology, and no annotated gene at the chromosome 19 breakpoint at position chr19:35,769,937–35,769,968, with a 2-bp insertion at the breakpoint junction. B: The location of mutations or SVs within the MED13L gene (NG_023366.1) and their protein (NP_056150.1) consequences reported to date (see Table 1 for the precise clinical presentation of each patient). P1 is the first reported patient with translocation t(12;17), presumably resulting in a p.25∗ truncated protein when this allele is translated. P9 is the patient described in this study, harboring a translocation t(12;19) with a breakpoint in intron 4, likely resulting in a protein truncated at p.Ser160∗. P6 and P7 have deletions of 17 kb in exon 2 and 115 kb in exons 3–4, respectively. Both out-of-frame deletions are located in the N-terminal domain and result in shorter proteins of 2,155 and 2,132 amino acids. P2, P3, and P4 carry heterozygous mutations p.Glu251Gly, p.Arg1872His, and p.Asp2023Gly, identified in mutational screening of patients with dTGA related to case P1; these do not alter the protein length. P5 represents the two siblings in a consanguineous family harboring the homozygous mutation p.Arg1416His. P8 is not indicated in this figure, and has a triplication involving the whole of MED13L and a neighboring gene, MAP1LC3B2. P1, P2, P6, P7, and P9 are located in the N-terminal MED13 domain, whereas P3 and P4 are located in the C-terminal MED13 domain. P5 is located outside the N- and C-terminal MED13 domains.
Table 1. Summary of key clinical features in patients with MED13L structural variants

Feature | Patient 1 [Muncke et al., 2003] | Patient 2 [Asadollahi et al., 2013] | Patient 3 [Asadollahi et al., 2013] | Patient 4 [Asadollahi et al., 2013] | Patient 5 (this study)
MED13L SVs | t(12;17)(q24.1;q21) (intron 1) | Deletion in exon 2 (c.71-?_310+?del) | Deletion in exon 3–4 (c.311-?_479+?del) | Triplication of the whole gene (c.1?_6574+?) (Flanagan et al., 1991) | t(12;19)(q24;q21) (intron 4)
Inheritance | De novo | De novo | De novo | De novo | De novo
Neurological features
Intellectual disability | + | + | + | (+) | +
Language delay | NR | + | + | NR | +
Speech | Nearly absent | Delayed | Absent | NR | Absent
Ataxia | + | NR | NR | NR | NR
Motor skills | + | + | + | NR | +
Hypotonia | - | + | + | + | NR
Dysmorphic features
Facial/skull prominence | Microcephaly | Broad forehead with nevus simplex | Broad forehead, prominent occipital protuberance | NR | Plagiocephaly
Cranial | NR | Macroglossia, micrognathia | Micrognathia | NR | Pierre–Robin sequence: cleft palate, glossoptosis, microretrognathia
Ears | NR | Large, low-set ears | Large, low-set ears | NR | NR
Eyes | NR | Upslanting palpebral fissures | Upslanting palpebral fissures | NR | Hypertelorism, strabismus
Nose | NR | Flat nasal root, bulbous nose | Flat nasal root, bulbous nasal tip | Broad nasal bridge | Broad nasal bridge
Mouth | NR | Deep philtrum, facial hypotonia with open mouth appearance | Deep and short philtrum, hypotonic with open mouth appearance | NR | Flat philtrum
Digital | NR | NR | Bowed legs, overlapping fifth toe | NR | Metatarsus adductus, camptodactyly, bilateral equino-varus deformity
Cardiac manifestation
Septal defect | pVSD, open foramen ovale, coA | Pulmonary atresia, supracardial TAPVC, VSD, ToF | ToF | pVSD | NR
Transposition of arteries | dTGA | - | - | - | -

CoA, coarctation of the aorta; dTGA, dextro-looped transposition of great arteries; pVSD, perimembranous ventricular septal defect; TAPVC, total anomalous pulmonary venous connection; ToF, tetralogy of Fallot; ID, intellectual disability; DD, developmental delay; SD, speech delay; LD, language delay; NR, not reported; (+), learning disability.
human tissues with the highest levels in cerebellum, fetal brain, and skeletal muscle (Supp. Fig. S3). We compared the affected regions of MED13L mutations in previously published subjects [Muncke et al., 2003; Najmabadi et al., 2011; Asadollahi et al., 2013] (Fig. 1B); their clinical findings are summarized in Table 1. It was previously suggested that variants within the N- or C-terminal MED13 domains were likely to cause cardiac phenotypes [Asadollahi et al., 2013]. However, the translocation breakpoint in our patient lies within the N-terminal MED13 domain, and no cardiac defect was present at birth or in early adulthood. Interestingly, during the review of this manuscript, two additional patients with deletions within the N-terminal MED13 domain were reported, and no cardiac anomalies were noted in either patient [van Haelst et al., 2014], suggesting a reduced penetrance for congenital heart defects in patients with MED13L haploinsufficiency.
Knockdown of med13b Impairs Cranial NCCs Migration
To investigate the role of MED13L underlying the broad phenotypic spectrum observed in these patients, we examined whether a knockdown of med13b, which has 65% protein sequence similarity to the human protein (Supp. Fig. S4), could model the clinical features seen in patients. First, we checked the mRNA expression of med13b across developmental stages between 1 and 72 hr postfertilization (hpf) by qPCR, and observed maternal and zygotic transcripts during embryonic development (Fig. 2A). To determine the spatiotemporal expression of med13b, we performed in situ hybridization of med13b and confirmed maternal expression at the one- to two-cell stage in the first blastomeres, supporting a role during early embryogenesis (Fig. 2B and C). At 24 hpf and 48 hpf, zygotic med13b expression became restricted to the forebrain, midbrain-hindbrain boundary, eye, pectoral fin, and the lateral line (Fig. 2D and E), whereas the sense probe showed no specific hybridization signal (Fig. 2F). Some of these tissues require derivatives of the neural crest lineage to develop: pectoral fin arises from trunk neural crest, lateral line arises from cranial placodal ectoderm, and eye development involves migration of NCCs to the periocular mesenchyme [Langenberg et al., 2008]. med13b suppression by splice-site antisense MO oligonucleotides resulted in developmental defects affecting eyes, head, and antero-posterior axis seen at 2 days postfertilization (dpf) (Fig. 3A). More than 95% of the embryos were affected with a curved body axis, underdeveloped head, pericardial edema, and microphthalmia, which were categorized as “severe” for 40%, and as “mild-to-moderate” for 55% of the analyzed embryos. Coinjection of approximately 150 pg full-length human MED13L
Figure 2. med13b mRNA expression across developmental stages in zebrafish. A: qPCR analysis in zebrafish developmental time points between 0 and 72 hpf showed maternal expression of med13b between 0 and 3 hpf, replaced by the activation of zygotic transcript at 6 hpf, and gradually decreased throughout the developmental stages. Relative mRNA expression was normalized with zebrafish actin and 18s ribosomal RNA (18s). In situ hybridization of med13b with antisense probe in zebrafish embryo showed maternal expression in 1-cell (B) and 2-cell stage (C) at the blastomeres. At 24 (D) and 48 hpf (E), the expression was restricted to forebrain (fb), mid-hindbrain boundaries (mhb), eyes, pectoral fin (pf), and posterior lateral line (pll). F: Sense probe, used as a negative control, did not show any expression in the embryo. Lateral view of the embryo, anterior to the left.
mRNA with med13b MO was sufficient to rescue the eye size of the morphants, and fully restore head development, whereas body length was partially rescued (Fig. 3B and C).
One of the striking observations among patients with MED13L haploinsufficiency is the presence of craniofacial dysmorphisms (including upslanting palpebral fissures, flat nasal root, bulbous nose, broad forehead, deep philtrum, and palatal problems) (see Table 1). To study the implication of med13b loss of function in craniofacial and cartilage structure development, we checked the migration of cranial NCCs in the morphant embryos using sox10, which is expressed in the early migrating cranial and trunk NCCs. At 24 hpf, sox10 was strongly expressed in the cranial ganglia and otic vesicle in control embryos, whereas NCC migration was delayed in the morphants, as shown by clustered sox10-positive cells along the migratory cranial ganglia at a more posterior location (Fig. 4A and A′). Upon coinjection with human MED13L mRNA, the location of sox10-positive cells was comparable to that of control embryos (Fig. 4A″), suggesting that med13b is required for proper NCC migration. Further examination of cartilage development in 5-day-old morphant embryos stained with Alcian Blue, a time point when all craniofacial cartilage has fully formed, revealed a severely distorted ceratohyal cartilage (Fig. 4B). This phenotype was fully restored upon introduction of human MED13L mRNA (Fig. 4B). Similar pharyngeal skeletal defects have been observed in 5-dpf larvae of the zebrafish bielak mutants, with the ceratohyal pointing caudally, suggesting a role in the specification of the craniofacial skeleton [Neuhauss et al., 1996]. These data support the view that defective cranial NCC migration contributes to craniofacial cartilage defects upon disruption of med13b in these embryos.
Loss of med13b Affects Neuronal Distribution in Zebrafish Brain
Another shared clinical feature among patients with MED13L disruptions is mild to moderate intellectual disability. To explore the impact of med13b knockdown on neuronal development, we used islet1 as a marker of differentiating motor neurons, which is expressed in the retinal ganglion cells (RGCs), hypothalamus, and ventral telencephalon. med13b morphants showed a lack of islet1 expression in RGCs and forebrain, and reduced expression in the hypothalamus and telencephalon, which was restored upon human MED13L mRNA coinjection, suggesting impaired neuronal development in early brain patterning (Fig. 4C–C″). To better examine events of neural differentiation, we further analyzed the distribution of neuroblasts in the developing zebrafish brain using the HuC:GFP (green fluorescent protein) transgenic line [Park et al., 2000], in which cells that start to develop as neurons initiate expression of GFP under the regulation of the HuC promoter. At 48 hpf, we observed a decrease in GFP expression in the optic tectum of the midbrain in the morphant embryos (Fig. 4D′), whereas rescued embryos showed expression relatively similar to that in control embryos (Fig. 4D″). These results suggest that med13b may be involved in early events of neural differentiation in the zebrafish embryo.
Knockdown of MED13L in hESC-Derived Neural Progenitors Did Not Affect Neuronal Differentiation
We further attempted to validate the defects observed in zebrafish morphants in human cells. To this end, we used H1-hESC-derived neural progenitor cells (H1-NPCs), generated using an established protocol. Knockdown of MED13L in these cells using two short hairpin RNAs targeting MED13L resulted in 50%–60% suppression of MED13L mRNA after 2 days of G418 antibiotic selection, relative to an shRNA targeting a scrambled sequence as a control (Supp. Fig. S5). The protein expression of nestin, SOX2, and vimentin, markers of NPCs, was not remarkably different between shScrambled and shMED13L1/2 cells as shown by immunocytochemistry, indicating that MED13L knockdown did not alter the cell fate, as the cells still retained NPC characteristics (Supp. Fig. S6). NPCs were able to self-renew and proliferate
Figure 3. Suppression of med13b resulted in developmental defects affecting head, eyes, and body axis. A: Morphology of embryos (from left to right): uninjected clutchmate, injected with control MO, med13b MO (ranging from severe to mild), and rescue construct. Knockdown of med13b caused morphological and growth defects in the developing zebrafish embryo. Reintroduction of human MED13L mRNA (150 pg) restored the zebrafish wild-type phenotype. The control MO-injected embryo was indistinguishable from the uninjected embryo. med13b MO resulted in hypomorphic head, microphthalmia, curved body axis, and pericardial edema. Phenotypes were classified into three distinct categories according to the anteroposterior axis extension: severe, mild, and moderate, with the majority of MO-injected embryos (>40%) included in the severe category. Lateral view of embryo at 48 hpf, anterior to the left. Scale bar, 500 μm. B: Microphthalmia is one of the morphant phenotypes observed, and upon coinjection of human MED13L mRNA, the eye sizes were comparable to those of uninjected and control embryos. C: The body lengths of the morphants were slightly shorter than those of controls, and coinjection of human MED13L mRNA (150 pg) was not sufficient to fully rescue the body size.
for a limited number of times before they became postmitotic. The proliferative capacity of shMED13L1/2 NPCs was slightly increased, by 5%–8%, compared with shScrambled NPCs based on quantification of 5-ethynyl-2′-deoxyuridine (EdU)-incorporated cells and Ki67-positive cells (Supp. Fig. S7). The NPCs were maintained at low passage and differentiated into a mixed neuronal population, including glutamatergic and GABAergic neurons, according to the established protocol [Brennand et al., 2011]. At 4 weeks, shMED13L neurons showed expression of the early differentiating neuron marker TUJ1 and the mature neuron marker MAP2 similar to that of shScrambled cells, suggesting no significant differences in differentiated neurons in culture upon MED13L knockdown, in contrast to the findings in zebrafish morphants (Supp. Fig. S8). This might be explained by the lack of information specifying distinct brain regions in the in vitro model.
Transcriptome Profile of MED13L-Deficient Neurons Revealed Dynamic Expression Changes of Cranial NCCs Genes
To explore the gene expression changes upon MED13L knockdown, we conducted exploratory microarray experiments comparing shMED13L1/2 to shScrambled neurons at week 4, as this time point is a critical transition period in which neural progenitors become postmitotic neurons. We identified 1,117 genes showing significant expression changes in shMED13L neurons (Supp. Fig. S9). This deregulated gene list was submitted to DAVID annotation, and the significant (P < 0.05) gene ontology terms were enriched in developmental processes, which corroborated the functional role of MED13L in regulating other genes during development (Supp. Fig. S9). The top two upregulated genes were SP8, a zinc finger transcription factor gene known to be crucially involved in craniofacial development [Kasberg et al., 2013], and FGFR3, the fibroblast growth factor (FGF) receptor 3 gene, a repressor of bone growth that was found to be nearly 10-fold upregulated in shMED13L neurons. Both were confirmed to be upregulated in shMED13L NPCs and med13b morphant embryos by qRT-PCR (Fig. 5A). Furthermore, higher expression of sp8a in the olfactory vesicle and motor neurons, and of fgfr3 in the diencephalon and midbrain-hindbrain boundary, was demonstrated in med13b MO embryos compared with wild type by in situ hybridization (Fig. 5B). Four other genes, whose zebrafish orthologues are known according to the ZFIN database to be expressed during NCC development in the pharyngeal arches (Calcr), lateral line (Rspo3), and pectoral fin (Hoxa5 and Atp1a2), were significantly downregulated (Log2 FC = –3.83 to –2.39, P < 0.05), suggesting that MED13L suppression also affects the regulation of genes that are important for the development of NCC derivatives (Supp. Fig. S10). We next performed pathway-enrichment analysis on the 1,117 genes using PANTHER Web tools, which revealed 16 components of the canonical Wnt pathway that were significantly differentially expressed in shMED13L1/2 neurons (Supp. Fig. S10). We selected eight of these genes based on their known roles as main components (ligands, antagonist, receptors, and downstream target gene) of the canonical Wnt pathway, and qRT-PCR confirmed the microarray-predicted changes in the genes tested (Supp. Table S1) for neurons and NPCs, as well as med13b morphant embryos, supporting the notion that Wnt signaling is deregulated in cells lacking MED13L.
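To make the reported Log2 FC range easier to read, the following minimal sketch converts log2 fold changes into linear expression ratios. The helper name `log2fc_to_ratio` is hypothetical; only the two boundary values quoted in the text are used as inputs.

```python
def log2fc_to_ratio(log2_fc):
    """Convert a log2 fold change into a linear expression ratio."""
    return 2.0 ** log2_fc


# Boundary values of the Log2 FC range reported for the downregulated genes.
reported_log2_fc = (-3.83, -2.39)

# Ratio of knockdown to control expression, and the equivalent fold reduction.
fold_reductions = {}
for value in reported_log2_fc:
    ratio = log2fc_to_ratio(value)
    fold_reductions[value] = 1.0 / ratio
    print(f"log2 FC {value:+.2f}: ratio {ratio:.3f}, {1.0 / ratio:.1f}-fold lower")
```

Under this conversion the reported range corresponds to reductions of roughly 14-fold and roughly 5-fold relative to shScrambled neurons.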
Figure 4. med13b is required for proper NCC migration and neuronal distribution in zebrafish head morphogenesis and craniofacial prominence. sox10, a cranial NCC marker, showed expression along cranial ganglia and otic vesicle (ov) at 24 hpf in the control embryos (A). Dorsal view of the embryo at 24 hpf, anterior to the left. Delayed migration of NCCs is depicted as clustered sox10-positive cells toward a more posterior axis in the morphant embryos (A′). Injection of 150 pg human MED13L mRNA restored NCC migration, as shown in the rescued embryo (A″). Dashed lines outline the migratory axis of sox10-positive cells. Craniofacial cartilage formation was visualized by Alcian Blue staining at 5 dpf, and developed normally in control embryos (B). Ceratohyal cartilage was severely distorted and shifted posteriorly in the morphants (B′). Injection of the human MED13L mRNA rescued the wild-type phenotype (B″). M indicates Meckel’s cartilage, pq indicates palatoquadrate cartilage, ch indicates ceratohyal cartilage, and cb indicates ceratobranchial cartilages 1–4. Ventral view of the embryo at 5 dpf, anterior to the left. Early differentiating neurons were visualized by Islet1 expression in the hypothalamus (ht), forebrain (fb), and retinal ganglion cells (rgcs) in 48 hpf control embryos (C). Islet1 expression was absent in the rgcs, and reduced in the hypothalamus and forebrain, indicating developmental delay in the morphant embryos (C′). Upon injection of human MED13L mRNA, the expression was restored in the rescued embryo (C″). Dashed lines outline the eye. Dorsal view of the embryo at 48 hpf, anterior to the left. Imaging of the HuC:GFP transgenic line showed the distribution of developing neurons as GFP expression across brain regions in the control embryos (D). Abnormal distribution of GFP expression in the forebrain and mid-hindbrain boundaries was apparent in the morphants (D′), and was restored upon addition of human MED13L mRNA (D″). tc indicates telencephalon, d indicates diencephalon, cb indicates cerebellum, ot indicates optic tectum, and mb indicates midbrain. Lateral view of the head region at 48 hpf, anterior to the left.
Discussion
Our work provides mechanistic insight into the functional consequences of MED13L disruption in human patients. Here, we functionally demonstrated that depletion of its orthologue in a zebrafish model could partly phenocopy the craniofacial abnormalities seen in humans, and that the observed phenotype can be reversed by introducing intact human mRNA, suggesting that correct gene dosage is required for normal gene function. Previously reported cases with a shorter or presumably truncated MED13L protein displayed more severe phenotypes [Muncke et al., 2003; Asadollahi et al., 2013; van Haelst et al., 2014] than patients carrying single amino acid substitutions associated with either isolated dTGA or ID, indicating distinct molecular mechanisms for single-point mutations and haploinsufficiency of MED13L [Muncke et al., 2003; Najmabadi et al., 2011]. Patients with missense mutations have either isolated heart defects or isolated ID, whereas patients with MED13L haploinsufficiency displayed moderate ID and craniofacial anomalies, with a reduced penetrance of cardiac defects. Our in vivo model demonstrated that early migration defects of cranial NCCs toward the branchial arches contributed to the craniofacial defect seen in 5 dpf morphants, suggesting that med13b is required for early craniofacial development. This could partly be explained by the enrichment of cranial NCC genes, including components of the Wnt and FGF signaling pathways, in the transcriptome profile of MED13L-deficient neuronal cells. FGF signaling is known to play a role in promoting the fate of NCCs, and is required for migration and patterning of the cranial NCCs into the pharyngeal arches. Fgf8 is secreted by the anterior neural ridge (ANR), and is required for cranial NCC proliferation, as well as the development of forebrain and midbrain structures. Loss of FGF signaling leads to a failure of pharyngeal arch cartilage development [Sarkar et al., 2001; David et al., 2002].
Figure 5. Expression of sp8 and fgfr3 in NPCs, neurons, and med13b morphants assessed by qPCR and in situ hybridization. A: qPCR analysis showed a similar trend of upregulated expression in MED13L-depleted cells (NPCs, neurons, and pooled zebrafish embryos), thus confirming the microarray-predicted expression changes. B: In situ hybridization experiments showed enhanced expression of both genes, indicated by arrows: (i) sp8a in the olfactory vesicle (ov) and motoneurons (mn) along the spinal cord and (ii) fgfr3 in diencephalon (di) and midbrain-hindbrain boundaries (MHB) in the med13b MO embryos compared with control.
A recent study has shown that SP8, one of the top upregulated genes in our transcriptome data, promotes Fgf8 expression in the ANR and olfactory pit signaling centers that regulate survival, proliferation, and differentiation of the NCCs to form the facial skeleton and the anterior brain [Kasberg et al., 2013]. Studies in frogs and mice have shown that the induction of NCCs is dependent on the activation of Fgf and Wnt pathways in the paraxial mesoderm [Kengaku and Okamoto, 1993; Mayor et al., 1995; Hong et al., 2008]. Wnt signaling induces the protrusion of the frontonasal and maxillary prominences, which eventually form the facial skeleton [Helms et al., 2005; Tapadia et al., 2005; Brugmann et al., 2006]. Conditional knockout in mice of β-catenin, the key nuclear effector in canonical Wnt signaling, resulted in brain malformation and failure of craniofacial development [Brault et al., 2001]. Interestingly, another subunit in the CDK8 module of the Mediator complex, MED12, has been shown to interact physically with β-catenin [Kim et al., 2006], and genetic studies in Drosophila and C. elegans have revealed that mutations in other Mediator subunits (MED1, MED6, MED12, MED13, and MED23) variously affect developmental processes regulated by Wnt signaling [Zhang and Emmons, 2000; Treisman, 2001; Zhang and Emmons, 2001; Janody et al., 2003; Yoda et al., 2005]. Our work further extends the involvement of MED13L as part of the Mediator complex, showing that its knockdown significantly deregulates genes within the Wnt signaling pathway.
There are seven additional patients with ID and congenital anomalies documented by DECIPHER [Firth et al., 2009], consisting of five deletions and two duplications encompassing MED13L. A complete description of additional patients with MED13L disruption is clinically relevant to extend the definition of MED13L haploinsufficiency syndrome, as initially postulated by Asadollahi et al. (2013). This could be explored further by performing additional mutation screening within MED13L in patients who tested negative for DiGeorge syndrome or other neurocristopathies. Considering the coexistence of ID in patients with MED13L mutations and our functional validation showing abnormal regulation of neurogenesis in the developing brain of med13b-depleted zebrafish embryos, we further recommend adding MED13L to the ID-linked screening panel for clinical genetic testing of ID patients.
In summary, we have presented a mechanistic correlation in patients sharing a single gene disruption and provided an in vivo model of MED13L loss of function that strongly suggests that defective NCC migration during craniofacial development is implicated, mimicking the phenotype of the human patients. Taken together, our study demonstrates that the ID and craniofacial phenotype in our patient are attributable to haploinsufficiency of MED13L.
Acknowledgments
We thank the patient and family members for their participation. This study makes use of data generated by the DECIPHER Consortium. A full list of centers who contributed to the generation of the data is available at http://decipher.sanger.ac.uk and via email from [email protected]. We thank Teddy Young and David Greenald for providing reagents, and Cathleen Teh, Hongyuan Shen, and William Goh for their helpful advice on zebrafish transgenic lines and protocols.
References
Angus SP, Nevins JR. 2012. A role for Mediator complex subunit MED13L in Rb/E2F-induced growth arrest. Oncogene 31:4709–4717.
Ansari SA, Morse RH. 2013. Mechanisms of Mediator complex action in transcriptional activation. Cell Mol Life Sci 70:2743–2756.
Asadollahi R, Oneda B, Sheth F, Azzarello-Burri S, Baldinger R, Joset P, Latal B, Knirsch W, Desai S, Baumer A, Houge G, Andrieux J, et al. 2013. Dosage changes of MED13L further delineate its role in congenital heart defects and intellectual disability. Eur J Hum Genet 21:1100–1104.
Brault V, Moore R, Kutsch S, Ishibashi M, Rowitch DH, McMahon AP, Sommer L, Boussadia O, Kemler R. 2001. Inactivation of the beta-catenin gene by Wnt1-Cre-mediated deletion results in dramatic brain malformation and failure of craniofacial development. Development 128:1253–1264.
Brennand KJ, Simone A, Jou J, Gelboin-Burkhart C, Tran N, Sangar S, Li Y, Mu Y, Chen G, Yu D, McCarthy S, Sebat J, et al. 2011. Modelling schizophrenia using human induced pluripotent stem cells. Nature 473:221–225.
Brugmann SA, Tapadia MD, Helms JA. 2006. The molecular origins of species-specific facial pattern. Curr Top Dev Biol 73:1–42.
David NB, Saint-Etienne L, Tsang M, Schilling TF, Rosa FM. 2002. Requirement for endoderm and FGF3 in ventral head skeleton formation. Development 129:4457–4468.
Davis MA, Larimore EA, Fissel BM, Swanger J, Taatjes DJ, Clurman BE. 2013. The SCF-Fbw7 ubiquitin ligase degrades MED13 and MED13L and regulates CDK8 module association with mediator. Genes Dev 27:151–156.
Firth HV, Richards SM, Bevan AP, Clayton S, Corpas M, Rajan D, Van Vooren S, Moreau Y, Pettett RM, Carter NP. 2009. DECIPHER: Database of Chromosomal Imbalance and Phenotype in Humans Using Ensembl Resources. Am J Hum Genet 84:524–533.
Flanagan PM, Kelleher RJ 3rd, Sayre MH, Tschochner H, Kornberg RD. 1991. A mediator required for activation of RNA polymerase II transcription in vitro. Nature 350:436–438.
Helms JA, Cordero D, Tapadia MD. 2005. New insights into craniofacial morphogenesis. Development 132:851–861.
Hillmer AM, Yao F, Inaki K, Lee WH, Ariyaratne PN, Teo AS, Woo XY, Zhang Z, Zhao H, Ukil L, Chen JP, Zhu F, et al. 2011. Comprehensive long-span paired-end-tag mapping reveals characteristic patterns of structural variations in epithelial cancer genomes. Genome Res 21:665–675.
Hong CS, Park BY, Saint-Jeannet JP. 2008. Fgf8a induces neural crest indirectly through the activation of Wnt8 in the paraxial mesoderm. Development 135:3903–3910.
Janody F, Martirosyan Z, Benlali A, Treisman JE. 2003. Two subunits of the Drosophila mediator complex act together to control cell affinity. Development 130:3691–3701.
Kasberg AD, Brunskill EW, Potter SS. 2013. SP8 regulates signaling centers during craniofacial development. Dev Biol 381:312–323.
Kengaku M, Okamoto H. 1993. Basic fibroblast growth factor induces differentiation of neural tube and neural crest lineages of cultured ectoderm cells from Xenopus gastrula. Development 119:1067–1078.
Kim S, Xu X, Hecht A, Boyer TG. 2006. Mediator is a transducer of Wnt/beta-catenin signaling. J Biol Chem 281:14066–14075.
Langenberg T, Kahana A, Wszalek JA, Halloran MC. 2008. The eye organizes neural crest cell migration. Dev Dyn 237:1645–1652.
Li W, Sun W, Zhang Y, Wei W, Ambasudhan R, Xia P, Talantova M, Lin T, Kim J, Wang X, Kim WR, Lipton SA, et al. 2011. Rapid induction and long-term self-renewal of primitive neural precursors from human embryonic stem cells by small molecule inhibitors. Proc Natl Acad Sci USA 108:8299–8304.
Mayor R, Morgan R, Sargent MG. 1995. Induction of the prospective neural crest of Xenopus. Development 121:767–777.
Muncke N, Jung C, Rudiger H, Ulmer H, Roeth R, Hubert A, Goldmuntz E, Driscoll D, Goodship J, Schon K, Rappold G. 2003. Missense mutations and gene interruption in PROSIT240, a novel TRAP240-like gene, in patients with congenital heart defect (transposition of the great arteries). Circulation 108:2843–2850.
Najmabadi H, Hu H, Garshasbi M, Zemojtel T, Abedini SS, Chen W, Hosseini M, Behjati F, Haas S, Jamali P, Zecha A, Mohseni M, et al. 2011. Deep sequencing reveals 50 novel genes for recessive cognitive disorders. Nature 478:57–63.
Nakamura Y, Yamamoto K, He X, Otsuki B, Kim Y, Murao H, Soeda T, Tsumaki N, Deng JM, Zhang Z, Behringer RR, Crombrugghe B, et al. 2011. Wwp2 is essential for palatogenesis mediated by the interaction between Sox9 and mediator subunit 25. Nat Commun 2:251.
Neuhauss SC, Solnica-Krezel L, Schier AF, Zwartkruis F, Stemple DL, Malicki J, Abdelilah S, Stainier DY, Driever W. 1996. Mutations affecting craniofacial development in zebrafish. Development 123:357–367.
Ng KP, Hillmer AM, Chuah CT, Juan WC, Ko TK, Teo AS, Ariyaratne PN, Takahashi N, Sawada K, Fei Y, Soh S, Lee WH, et al. 2012. A common BIM deletion polymorphism mediates intrinsic resistance and inferior responses to tyrosine kinase inhibitors in cancer. Nat Med 18:521–528.
Park HC, Kim CH, Bae YK, Yeo SY, Kim SH, Hong SK, Shin J, Yoo KW, Hibi M, Hirano T, Miki N, Chitnis AB, et al. 2000. Analysis of upstream elements in the HuC promoter leads to the establishment of transgenic zebrafish with fluorescent neurons. Dev Biol 227:279–293.
Sarkar S, Petiot A, Copp A, Ferretti P, Thorogood P. 2001. FGF2 promotes skeletogenic differentiation of cranial neural crest cells. Development 128:2143–2152.
Tapadia MD, Cordero DR, Helms JA. 2005. It’s all in your head: new insights into craniofacial development and deformation. J Anat 207:461–477.
Thisse C, Thisse B. 2008. High-resolution in situ hybridization to whole-mount zebrafish embryos. Nat Protoc 3:59–69.
Treisman J. 2001. Drosophila homologues of the transcriptional coactivation complex subunits TRAP240 and TRAP230 are required for identical processes in eye-antennal disc development. Development 128:603–615.
Tsai KL, Sato S, Tomomori-Sato C, Conaway RC, Conaway JW, Asturias FJ. 2013. A conserved Mediator-CDK8 kinase module association regulates Mediator-RNA polymerase II interaction. Nat Struct Mol Biol 20:611–619.
Utami KH, Hillmer AM, Aksoy I, Chew EG, Teo AS, Zhang Z, Lee CW, Chen PJ, Seng CC, Ariyaratne PN, Rouam SL, Soo LS, et al. 2014. Detection of chromosomal breakpoints in patients with developmental delay and speech disorders. PLoS One 9:e90852.
van Haelst MM, Monroe GR, Duran K, van Binsbergen E, Breur JM, Giltay JC, van Haaften G. 2014. Further confirmation of the MED13L haploinsufficiency syndrome. Eur J Hum Genet. doi:10.1038/ejhg.2014.69 [Epub ahead of print]
Welcker M, Clurman BE. 2008. FBW7 ubiquitin ligase: a tumour suppressor at the crossroads of cell division, growth and differentiation. Nat Rev Cancer 8:83–93.
Yao F, Ariyaratne PN, Hillmer AM, Lee WH, Li G, Teo AS, Woo XY, Zhang Z, Chen JP, Poh WT, Zawack KF, Chan CS, et al. 2012. Long span DNA paired-end-tag (DNA-PET) sequencing strategy for the interrogation of genomic structural mutations and fusion-point-guided reconstruction of amplicons. PLoS One 7:e46152.
Yoda A, Kouike H, Okano H, Sawa H. 2005. Components of the transcriptional Mediator complex are required for asymmetric cell division in C. elegans. Development 132:1885–1893.
Zhang H, Emmons SW. 2000. A C. elegans mediator protein confers regulatory selectivity on lineage-specific expression of a transcription factor gene. Genes Dev 14:2161–2172.
Zhang H, Emmons SW. 2001. The novel C. elegans gene sop-3 modulates Wnt signaling to regulate Hox gene expression. Development 128:767–777.