Abstracts of the 4th Biennial Schizophrenia International Research Conference / Schizophrenia Research 153, Supplement 1 (2014) S1–S384 S87
SLEEP EEG STUDIES IN SCHIZOPHRENIA: METHODOLOGICAL
CONSIDERATIONS
Matcheri Keshavan
Beth Israel Deaconess Medical Center and Massachusetts Mental Health
Center, Harvard University, Boston, MA, USA
Alterations in electroencephalographic (EEG) sleep architecture are common
in psychotic disorders, but the literature is limited by inconsistent
replication and a lack of diagnostic specificity, issues common to other
electrophysiological assessments such as ERP and eye-tracking studies. This
presentation will review potential methodological reasons for the lack of
consistency in the literature. A review of the literature on EEG biomarkers
in psychotic disorders was done with the following questions in mind: how
robustly do these measures differ between patients and controls, and how
diagnostically specific are they? How are these tests traditionally admin-
istered and standardized, and what are their advantages and limitations?
What are the issues that need to be addressed? What are the best ways to
maximize reliability and validity, while keeping costs and subject burden
to a minimum? We focused on sleep biomarkers, but also considered other
EEG biomarkers (event-related potentials, resting EEG) in the review.
While there are robust differences between patients and controls on several
sleep EEG measures such as slow wave sleep and REM proportions and
densities, few show effects that discriminate between DSM diagnostic
categories. These limitations stem, at least in part, from variations
between studies in patient populations, small sample sizes and effect
sizes, labor-intensive procedures and patient burden, factors affecting
physiological state (such as smoking and activity levels), and differences
in instrumentation, data-processing approaches, and scoring procedures.
Establishing group
differences will require reduction of both instrumentation and biological
noise. Automated detection algorithms (e.g. delta counts, spindle density,
etc.) and artifact rejection procedures are likely to increase sensitivity and
specificity. Ambulatory studies might reduce cost and subject burden.
Centralized, blinded data processing, clear quality-control procedures,
and standardized equipment calibration are critical.
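The automated-detection idea above (e.g., spindle density) can be
illustrated with a minimal sketch. This is not a method from the studies
reviewed: the sigma band, the 2-SD amplitude threshold, and the 0.5 s
minimum duration below are all illustrative assumptions.

```python
import numpy as np


def spindle_density(eeg, fs, band=(11.0, 16.0), thresh_sd=2.0, min_dur=0.5):
    """Crude sigma-band event counter: FFT band-pass, sliding RMS envelope,
    threshold at `thresh_sd` standard deviations above the mean envelope,
    keep supra-threshold runs lasting at least `min_dur` seconds.
    Returns detected events per minute."""
    # Band-pass filter by zeroing FFT bins outside the sigma band
    spec = np.fft.rfft(eeg)
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    spec[(freqs < band[0]) | (freqs > band[1])] = 0.0
    sigma = np.fft.irfft(spec, n=len(eeg))

    # Smoothed amplitude envelope via a 0.1 s moving RMS
    win = max(1, int(0.1 * fs))
    env = np.sqrt(np.convolve(sigma ** 2, np.ones(win) / win, mode="same"))

    # Threshold, then count contiguous runs of sufficient duration
    above = env > env.mean() + thresh_sd * env.std()
    edges = np.diff(above.astype(int))
    starts, ends = np.where(edges == 1)[0], np.where(edges == -1)[0]
    if above[0]:
        starts = np.insert(starts, 0, 0)
    if above[-1]:
        ends = np.append(ends, len(above) - 1)
    n_events = sum((e - s) / fs >= min_dur for s, e in zip(starts, ends))
    return 60.0 * n_events / (len(eeg) / fs)


# Demo on synthetic data: 60 s of low-amplitude noise with a single
# 1 s, 13 Hz burst at t = 30 s
fs = 100
t = np.arange(0, 60, 1 / fs)
sig = 0.05 * np.random.default_rng(0).standard_normal(t.size)
burst = (t >= 30.0) & (t < 31.0)
sig[burst] += np.sin(2 * np.pi * 13.0 * t[burst])
density = spindle_density(sig, fs)  # events per minute
```

A production pipeline would add artifact rejection and validated filter
design, but even this sketch shows how a fixed, scriptable rule removes
scorer-to-scorer variability.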
CLINICIAN-BASED ASSESSMENT OF PSYCHIATRIC DIAGNOSIS AND
SYMPTOM SEVERITY
Janet B.W. Williams
MedAvante, Hamilton, USA
The FDA estimates that 25% of schizophrenia trials fail, with placebo
response growing steadily. In psychiatric clinical trials, most procedures
are conducted by site-based clinicians whose diagnoses and assessments
may be influenced by various forms of bias, and whose individual findings
may contribute to variability of results when combined with those of
other raters and sites. Bias and variability are two key contributors to
trial failure, and several strategies have been tried to address them. On a
systems level, dramatically increasing the sample size of a study has been
attempted as a way to improve signal detection, but larger samples require
more raters, sites, and countries, which adds variability and signal noise
and may decrease effect size. Others have tried to be more
selective in choosing sites; however, personnel turnover can affect success
rates if a site does not have consistent training, and good performance
in one trial may have a very low correlation with good performance in
the next study at that same site. Finally, despite efforts to conduct trials
in certain countries that demonstrated good signal detection in previous
studies, placebo response seems to be increasing nonetheless. At the rater
level, strategies to avoid bias and reduce variability have included limit-
ing participation to very experienced raters, and increasing rater training.
Unfortunately, clinical experience alone does not result in good interrater
reliability within a cohort of raters; experienced raters must be carefully
calibrated with each other to achieve this. In addition, studies have shown
that even intensive group rater training at the beginning of a study does not
improve interrater reliability significantly, even in the short term. Thus, the
variability that results from combining ratings from many different sites
remains. Recent methodological approaches to improving trial failure rates
include supplementing site ratings with patients’ self-reports of symptoms,
dual assessments by site-based and independent raters, and providing
feedback to site raters based on reviews of their recorded interviews.
Each of these methods has advantages and disadvantages, which will be
reviewed. Another particularly powerful method replaces a large number
of site-based raters with a much smaller cohort of remote centralized
raters who are geographically and financially independent from the sites.
These experienced raters can be trained and calibrated to a single standard,
continuously monitored to avoid rater drift, and blinded to protocol de-
tails (e.g., inclusion criteria) and visit number (e.g., baseline vs. endpoint)
to avoid expectation bias. This method has met considerable resistance
from site-based clinicians, who believe they can provide the most informed
ratings based on their strong relationships with study subjects. However, a
growing body of evidence indicates that centralization to ensure blinding
and independence of raters improves signal detection or lowers placebo
response.
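The calibration requirement discussed above is usually verified with a
chance-corrected agreement statistic. A minimal sketch of Cohen's kappa
for two raters follows; the severity scale and ratings are hypothetical,
not data from any trial discussed here.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters scoring the same
    subjects: (observed - expected) / (1 - expected)."""
    n = len(ratings_a)
    # Observed agreement: fraction of subjects given identical ratings
    p_obs = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement expected from each rater's marginal frequencies
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_exp = sum(freq_a[c] * freq_b[c]
                for c in freq_a.keys() | freq_b.keys()) / n ** 2
    return 1.0 if p_exp == 1.0 else (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical severity ratings (0-3) from two raters on eight subjects
rater_1 = [0, 1, 2, 2, 3, 1, 0, 2]
rater_2 = [0, 1, 2, 3, 3, 1, 0, 1]
kappa = cohens_kappa(rater_1, rater_2)
```

Raw percent agreement overstates reliability when some categories are
common; the chance correction is what makes kappa (or its weighted and
multi-rater variants) the usual calibration target.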