The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

Post on 20-Dec-2015


Page 1: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

The DIMACS Working Group on Disease and

Adverse Event Surveillance

Henry Rolka and David Madigan

Page 2: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

Background

• WG Objective: Bring together researchers in adverse event monitoring and disease surveillance

• Part of a 5-year special focus on computational and mathematical epidemiology

• 50+ WG members: epidemiologists, public health professionals, biostatisticians, etc.

• Focus on analytic/statistical methods

• Two WG meetings plus week-long tutorial (02-03)

• Coordinated closely with National Syndromic Surveillance Conferences

Page 3: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

Disease Surveillance

Drug Safety Surveillance

Syndromic Surveillance

Vaccine Safety Surveillance

Areas of Common Interest

Page 4: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

Representation

• Carnegie-Mellon University

• FDA

• Quintiles Inc.

• CDC

• Rutgers University

• Emergint, Inc.

• AT&T Labs

• NJ State

• NYC Dept. of Health

• University of Pennsylvania

• Aventis

• ATSDR

• University of Connecticut

• Los Alamos National Lab

• Lincoln Technologies

• SAS Institute

Page 5: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

Background, cont.

• WG conceived before September 11, 2001

• Surveillance landscape has changed drastically

• Major public health effort directed at bioterrorism detection

• Proliferation of novel surveillance projects in response to national threat

• “Good for detecting outbreaks of various kinds”

Page 6: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

New Data Types for Public Health Surveillance

• Managed care patient encounter data

• Pre-diagnostic/chief complaint (text data)

• Over-the-counter sales transactions

– Drug store

– Grocery store

• 911-emergency calls

• Ambulance dispatch data

• Absenteeism data

• ED discharge summaries

• Prescription/pharmaceuticals

• Adverse event reports

Page 7: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

New Analytic Methods and Approaches

• Spatial-temporal scan statistics

• Statistical process control (SPC)

• Bayesian applications

• Market-basket association analysis

• Text mining

• Rule-based surveillance

• Change-point techniques

Page 8: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

ANALYTIC METHODS IN USE

• Scan statistics (e.g., Kulldorff’s SaTScan)

• Statistical process control (e.g., Hutwagner’s EARS)

• Association rule mining (e.g., Moore’s WSARE)

• Bayesian shrinkage (e.g., DuMouchel’s MGPS)

• Generalized linear mixed models (e.g., Kleinman)

• Sequential probability ratio tests (e.g., Spiegelhalter, Evans)
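
To make the statistical process control entry in the list above concrete, here is a minimal sketch of a moving-baseline control chart. It is not the EARS implementation: the 7-day baseline, the 3-standard-deviation threshold, the standard-deviation floor, and the names `c1_style_alerts` and `daily_visits` are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def c1_style_alerts(counts, baseline=7, threshold=3.0, sd_floor=0.5):
    """Return indices of days whose count is far above the recent baseline.

    Each day is compared with the mean and standard deviation of the
    `baseline` days immediately before it (all parameters are illustrative).
    """
    alerts = []
    for t in range(baseline, len(counts)):
        window = counts[t - baseline:t]
        mu = mean(window)
        # Floor the sd so a perfectly flat baseline does not divide by zero.
        sd = max(stdev(window), sd_floor)
        if (counts[t] - mu) / sd > threshold:
            alerts.append(t)
    return alerts

# Hypothetical daily visit counts with one obvious jump on day 9.
daily_visits = [12, 9, 11, 10, 13, 12, 10, 11, 12, 25, 11]
print(c1_style_alerts(daily_visits))
```

Note that a flagged day enters the baseline of the days that follow it, which inflates the standard deviation and can mask a sustained outbreak; this is one reason real systems offer several baseline variants.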

                              

Page 9: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

SCAN STATISTICS

• Martin Kulldorff’s SaTScan: software for spatial and space-time scan statistics.

• E.g., spatial scan: using a Poisson model, computes a likelihood ratio for every candidate circle, comparing the likelihood of the counts inside and outside the circle with their likelihood under the null distribution

• Picks the circle with the biggest likelihood ratio

• P-value computed via Monte Carlo

• Big literature on disease clustering: Besag & Newell, Diggle, Moran test, Turnbull’s method, Cuzick & Edwards, etc.

• Need methodology for multiple sources
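
The spatial scan steps on this slide can be sketched in a few lines. This is a simplified illustration, not SaTScan itself: circles are centred only on the data points, ties in distance are broken arbitrarily, the expected counts are assumed to sum to the total number of cases, and the function names and the toy data are hypothetical.

```python
import math
import random

def poisson_llr(c, e, total):
    """Log likelihood ratio for a zone with c observed and e expected cases."""
    if c <= e:
        return 0.0  # only an excess inside the circle counts as a cluster
    llr = c * math.log(c / e)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr

def best_circle(points, cases, expected):
    """Scan circles centred on each point; return (max LLR, member indices)."""
    total = sum(cases)
    best = (0.0, [])
    for centre in points:
        # Grow the circle by adding points in order of distance from the centre.
        order = sorted(range(len(points)),
                       key=lambda j: math.dist(centre, points[j]))
        c = e = 0.0
        for k, j in enumerate(order[:-1]):  # never cover the whole map
            c += cases[j]
            e += expected[j]
            llr = poisson_llr(c, e, total)
            if llr > best[0]:
                best = (llr, sorted(order[:k + 1]))
    return best

def monte_carlo_p(points, cases, expected, n_sim=199, seed=0):
    """Rank the observed maximum LLR among maxima from null simulations."""
    rng = random.Random(seed)
    observed, zone = best_circle(points, cases, expected)
    exceed = 0
    for _ in range(n_sim):
        # Under the null, each case lands in a region with probability
        # proportional to that region's expected count.
        sim = [0] * len(points)
        for j in rng.choices(range(len(points)), weights=expected, k=sum(cases)):
            sim[j] += 1
        if best_circle(points, sim, expected)[0] >= observed:
            exceed += 1
    return observed, zone, (exceed + 1) / (n_sim + 1)

# Hypothetical data: six regions, with an excess in the last three.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
cases = [2, 3, 2, 9, 8, 6]
expected = [5.0] * 6
llr, zone, p_value = monte_carlo_p(points, cases, expected)
print(zone, round(llr, 2), p_value)
```

Because the p-value is based on the maximum over all circles in each simulated data set, it already accounts for the many circles examined.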

Page 10: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

Farzad Mostashari

Page 11: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

BAYESIAN SHRINKAGE ESTIMATION

• DuMouchel’s GPS/MGPS

• Compares observed counts of “market baskets” to expected counts under some (simple) model. For example, saw 30 cases in the ER today with G.I. syndrome AND fever AND work in Newark compared with an expectation of 3 cases

• 30-to-3 is more convincing than 3-to-0.3 but less convincing than 300-to-30. Idea: shrink the ratios based on smaller counts towards one.
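
The shrinkage idea can be illustrated with a deliberately simplified model. The sketch below assumes a single Gamma(alpha, beta) prior with mean one on the relative rate and reports the posterior mean; it is not DuMouchel’s GPS/MGPS procedure. The prior parameters and the name `shrunken_ratio` are assumptions made for illustration.

```python
def shrunken_ratio(observed, expected, alpha=2.0, beta=2.0):
    """Posterior mean of the relative rate under a Gamma(alpha, beta) prior.

    Assumes observed ~ Poisson(rate * expected) and rate ~ Gamma(alpha, beta),
    so the posterior is Gamma(alpha + observed, beta + expected).
    With alpha == beta the prior mean is one, so small counts are pulled
    towards one and large counts are left nearly unchanged.
    """
    return (observed + alpha) / (expected + beta)

# The three observed-to-expected pairs from the slide, all with raw ratio 10.
for n, e in [(3, 0.3), (30, 3), (300, 30)]:
    print(f"{n}-to-{e}: raw ratio {n / e:.1f}, shrunken {shrunken_ratio(n, e):.2f}")
```

With these illustrative prior values the three pairs shrink to about 2.17, 6.40, and 9.44 respectively, which matches the ordering of how convincing each one is.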

Page 12: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

GPS SHRINKAGE – AERS DATA

[Figure: plot with axes log RR and log EBGM, each running from 0 to 6; legend “number of reports”: 1, 2, 3, 5, 10, 50-100]

Page 13: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

BAYESIAN SHRINKAGE ESTIMATION

• Issues:

Appropriate amount of shrinkage?

Where do the expected values come from?

Temporal dimension?

Covariate information

Simpson’s paradox (“innocent bystander”)

Page 14: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

SEQUENTIAL PROBABILITY RATIO TESTS

• Classical, much-studied statistical method dating back to Wald (1945)
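
As a reminder of how the test works, here is a minimal sketch of a sequential probability ratio test applied to daily counts. It assumes Poisson counts with a known baseline rate `rate0` and a single alternative rate `rate1`, and it uses Wald’s approximate stopping boundaries; the function name and the example numbers are hypothetical.

```python
import math

def poisson_sprt(counts, rate0, rate1, alpha=0.05, beta=0.05):
    """Run an SPRT of H0: rate = rate0 against H1: rate = rate1.

    Returns (decision, index of the last observation used).
    alpha and beta are the target false-alarm and missed-detection rates.
    """
    upper = math.log((1 - beta) / alpha)   # cross this: decide for H1
    lower = math.log(beta / (1 - alpha))   # cross this: decide for H0
    llr = 0.0
    for t, x in enumerate(counts):
        # Log likelihood ratio contribution of one Poisson observation.
        llr += x * math.log(rate1 / rate0) - (rate1 - rate0)
        if llr >= upper:
            return "signal", t
        if llr <= lower:
            return "no signal", t
    return "continue", len(counts) - 1

print(poisson_sprt([12, 14, 16, 17, 18], rate0=10, rate1=15))
print(poisson_sprt([9, 10, 8, 11], rate0=10, rate1=15))
```

For ongoing surveillance the test is usually restarted after an acceptance of H0, since monitoring never truly stops.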

Page 15: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

NATURAL LANGUAGE

• Important sources of health data begin life as free text “chief complaints” (ED visits, primary care encounters, adverse event reports, e-mail, etc.)

“Approximately 5 minutes after receiving flu and pneumonia vaccine pt began hollering, "Oh, Oh my neck is hurting. Feels like a knot in my throat, a medicine taste." Complained of chest pain moving to back and leg numbness.”

• Some (successful) work on automated coding of free text.

• Little work on direct surveillance of text data
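
To show what automated coding of free text involves at its simplest, here is a keyword-matching sketch that maps a complaint to syndrome categories. The categories, the keyword lists, and the name `code_complaint` are hypothetical; practical coders rely on much richer vocabularies or on trained classifiers.

```python
import re

# Hypothetical syndrome categories and trigger phrases, for illustration only.
SYNDROME_KEYWORDS = {
    "respiratory": ["cough", "shortness of breath", "wheez"],
    "gastrointestinal": ["vomit", "diarrhea", "nausea", "abd pain"],
    "neurological": ["numbness", "dizz", "headache"],
    "cardiac": ["chest pain", "palpitation"],
}

def code_complaint(text):
    """Return the sorted syndrome categories whose keywords appear in text."""
    # Lower-case, replace punctuation and digits with spaces, squeeze spaces.
    cleaned = re.sub(r"[^a-z ]+", " ", text.lower())
    cleaned = re.sub(r"\s+", " ", cleaned)
    return sorted(
        syndrome
        for syndrome, keywords in SYNDROME_KEYWORDS.items()
        if any(keyword in cleaned for keyword in keywords)
    )

report = "Complained of chest pain moving to back and leg numbness."
print(code_complaint(report))
```

Substring matching of this kind ignores negation ("denies chest pain"), misspellings, and abbreviations, which are common in chief complaints.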

Page 16: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

CONCLUSION

• Analytic methods for surveillance have a long history in Statistics but currently attract substantial new interest from researchers in both CS and Statistics

• Urgently need new methods for multivariate, multi-data type streams

• Data availability a bottleneck; simulation non-trivial.

• DARPA currently staging a competition

Page 17: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

THE IDEA OF A COMPETITION

Thesis: Rapid growth in the number of deployed health surveillance systems and increasing complexity require new analytic methodologies

Goal: Stimulate mainstream Computer Science and Statistics researchers to focus on this area

How: A signal detection competition

Examples: the Message Understanding Conferences (MUC), Text Retrieval Conferences (TREC), KDD Cup, M3 Time Series competition

Page 18: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

COMPETITION STATUS

• DIMACS Working Group on Adverse Event and Disease Reporting, Surveillance, Analysis

• Subgroup focused on competition; applied for funding; identified data sources

• Key challenge: appropriate methods for inserting signals into real data (“spiking”)

• Other groups face the same challenge (e.g., BioSTORM)
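
The mechanics of spiking can be sketched as follows. This minimal version simply adds a fixed number of extra cases per day to a real baseline series and records which days were altered; the outbreak shape, the function name `spike_series`, and the example counts are hypothetical, and the hard part named on this slide, choosing signals that are realistic, is not addressed here.

```python
def spike_series(baseline, start, extra_cases):
    """Return (spiked counts, truth labels) without modifying baseline.

    extra_cases[i] is the number of synthetic cases added on day start + i.
    The labels mark the days that carry injected cases, so a detection
    algorithm run on the spiked series can be scored against them.
    """
    spiked = list(baseline)
    is_outbreak = [False] * len(baseline)
    for offset, extra in enumerate(extra_cases):
        day = start + offset
        if day >= len(spiked):
            break  # the signal runs past the end of the data
        spiked[day] += extra
        is_outbreak[day] = extra > 0
    return spiked, is_outbreak

# Hypothetical baseline with a three-day signal inserted from day 3.
baseline = [10, 12, 9, 11, 10, 13, 12]
spiked, truth = spike_series(baseline, start=3, extra_cases=[2, 5, 3])
print(spiked)
print(truth)
```

Keeping the truth labels alongside the spiked series is what allows competing algorithms to be compared on sensitivity, false alarms, and time to detection.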

Page 20: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

SCAN STATISTICS

• Martin Kulldorff’s SaTScan: software for spatial and space-time scan statistics.

• E.g., spatial scan: using a Poisson model, computes a likelihood ratio for all possible circles, comparing event counts inside and outside

• Picks the circle with the biggest likelihood ratio

• P-value computed via Monte Carlo

• Big literature on disease clustering: Besag & Newell, Cuzick & Edwards, Diggle, Moran test, Pagano, Turnbull’s method, etc.

• Need methodology for multiple sources

Page 24: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

BAYESIAN SHRINKAGE ESTIMATION

• Issues:

Appropriate amount of shrinkage?

Where do the expected values come from?

Temporal dimension?

Covariate information

Page 25: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

SEQUENTIAL PROBABILITY RATIO TESTS

• Classical, much-studied statistical method dating back to Wald (1945). Mostly univariate.

Page 27: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

THE IDEA OF A COMPETITION

Thesis: Rapid growth in the number of deployed health surveillance systems and increasing complexity require new analytic methodologies

Goal: Stimulate mainstream Computer Science and Statistics researchers to focus on this area

How: A signal detection competition

Examples: the Message Understanding Conferences (MUC), Text Retrieval Conferences (TREC), KDD Cup, M3 Time Series competition

Page 28: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

HOW CAN THIS BE ACCOMPLISHED

• Definitions of signals.

• Test data sets for refining signal detection procedures.

• Modular, interoperable signal generation algorithms.

• Computing efficiencies for Monte Carlo simulations of signal detection events in large complex data.

• Multidimensional graphical displays to interpret results and evaluate algorithms.

• Multivariate statistical techniques for evaluating signal detection profiles across multiple data sources.

Page 29: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

COMPETITION STATUS

• DIMACS Working Group on Adverse Event and Disease Reporting, Surveillance, Analysis

• Subgroup focused on competition; applied for funding; identified data sources

• Key challenge: appropriate methods for inserting signals into real data (“spiking”)

• Other groups face the same challenge (e.g., BioSTORM)

Page 30: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

CONCLUSION

• Short-term goals/benefits:

– Promote coordination and collaboration

• Long-term goals/benefits:

– Stimulate methodological research

– Provide objective evaluation of competing algorithms

– Produce high quality spiking algorithms

Page 31: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

ANALYTICAL METHODS FOR HEALTH

SURVEILLANCE

DAVID MADIGAN

DEPARTMENT OF STATISTICS

RUTGERS UNIVERSITY

Page 32: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

Novel Surveillance Applications: Methodologies

• Early Aberration Reporting System (EARS), CDC

• What’s Strange About Recent Events? (WSARE), U of Pittsburgh and Carnegie-Mellon U

• Spatial and Space-Time Scan Statistics (SaTScan™, Kulldorff)

• Web Visual Data Mining Environment (WebVDME), Lincoln Technologies, Inc.

Page 33: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

Novel Surveillance Applications: Projects

• Electronic Surveillance System for the Early Notification of Community-based Epidemics (ESSENCE I&II), DOD

• Real-time Outbreak and Disease Surveillance (RODS), U of Pittsburgh

• Biological Spatio-Temporal Outbreak Reasoning Module (BioSTORM), Stanford U

• Rapid Syndrome Validation Project (RSVP), Sandia NL, NM

• Alternative Surveillance Alert Program (ASAP), Health Canada

• Syndromic Surveillance Project, NYC

• Bioterrorism Syndromic Surveillance Demonstration Program, CDC/Harvard

Page 34: The DIMACS Working Group on Disease and Adverse Event Surveillance Henry Rolka and David Madigan

Conceptual Taxonomy

[Diagram: taxonomy tree rooted at Public Health Surveillance. Node labels as extracted: Adverse event (to intervention exposure); Disease; Traditional; Syndromic; Drug; Vaccine; Birth defect; Injuries; Other; Etc.; Infectious disease]