Shaping the Future
of Drug Development
Transparent machine learning
Alind Gupta
Real-World and Advanced Analytics
Machine learning
4/28/2020 Cytel Inc. 2
• Algorithms that learn from data + evaluation criteria
• Key challenge – identify the (unobservable) data-generating distribution from a finite sample
[Figure: linear regression and neural network fits of Y against X. Which model better approximates the data distribution P(X,Y)?]
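One common way to ask which model better approximates the data distribution is to compare error on held-out data. The sketch below is illustrative only: the data are simulated, and a k-nearest-neighbour regressor stands in for the flexible model (it is not a neural network). All function names are hypothetical.

```python
import math
import random


def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=5):
    """Flexible stand-in model: average the outcomes of the k nearest points."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict


def mse(model, xs, ys):
    """Mean squared prediction error on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def simulate(n, rng):
    """Simulated non-linear relationship: Y = sin(X) + Gaussian noise."""
    xs = [rng.uniform(0.0, 2 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


rng = random.Random(0)
x_train, y_train = simulate(200, rng)
x_test, y_test = simulate(100, rng)  # held-out sample from the same distribution

linear_mse = mse(fit_linear(x_train, y_train), x_test, y_test)
knn_mse = mse(fit_knn(x_train, y_train), x_test, y_test)
print(f"held-out MSE, linear: {linear_mse:.3f}  flexible (k-NN): {knn_mse:.3f}")
```

On this simulated non-linear relationship the flexible model should have the lower held-out error; on truly linear data the comparison could go the other way.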
Research areas
• Prediction
• Knowledge discovery
• Anomaly detection
• Summarization
• Optimal decision-making
Potential applications in clinical research
Shah, Pratik, et al. "Artificial intelligence and machine learning in clinical development: a translational perspective." NPJ digital medicine 2.1 (2019): 1-5.
Black-box machine learning
• High capacity, low interpretability models (e.g. deep neural networks)
• Problems:
• Biases and limitations?
• Inability to audit decision-making
• Difficult to troubleshoot
• May not engender trust in users, regulators
[Figure: Input → Black box model (?) → Prediction]
Transparency is important
EU GDPR
• Individual’s “Right to explanation” about automated decisions
FDA guidance for Good Machine Learning Practices (GMLP)
• “[A]ppropriate validation, transparency” to assure “safety and effectiveness”
• Focus on validation with “clinicians in the loop” where necessary
https://www.fda.gov/files/medical%20devices/published/US-FDA-Artificial-Intelligence-and-Machine-Learning-Discussion-Paper.pdf
Limitations of black boxes
Adversarial attacks
• What patterns are black box models really representing?
IBM Watson for Oncology
• “Overpromising and under-delivery”
COMPAS
• As good as random people on the internet at predicting recidivism
[Figure: adversarial example. An image classified as "Panda" (57%) plus a perturbation is classified as "Gibbon" (99%).]
Strickland, E. (2019). IBM Watson, heal thyself: How IBM overpromised and underdelivered on AI health care. IEEE Spectrum, 56(4), 24-31.
Dressel, J., & Farid, H. (2018). The accuracy, fairness, and limits of predicting recidivism. Science Advances, 4(1), eaao5580.
Buolamwini, J., & Gebru, T. (2018, January). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on fairness, accountability and transparency (pp. 77-91).
Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z. B., & Swami, A. (2017, April). Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia conference on computer and communications security (pp. 506-519).
Bayesian networks
A transparent and flexible machine learning method
Key idea
Performing computations on a Directed Acyclic Graph (DAG)
[Figure: data and subject-matter expert input feed the DAG structure, followed by DAG parametrization (maximum likelihood, Bayesian estimation)]
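The two steps on this slide (fix a DAG structure, then parametrize it) can be sketched for discrete variables as follows. This is a minimal illustration, assuming a hand-specified three-node DAG and a tiny made-up data set; the node names and the functions `fit_cpts` and `joint_prob` are hypothetical. Maximum likelihood estimates of the conditional probability tables are just relative frequencies within each parent configuration, and the joint probability is the product of one table entry per node.

```python
from collections import Counter

# Hypothetical DAG, written as node -> list of parents.
PARENTS = {
    "biomarker": [],
    "response": ["biomarker"],
    "survival": ["response"],
}

# Made-up toy records (biomarker, response, survival); not real patient data.
ROWS = (
    [("high", "yes", "alive")] * 3
    + [("high", "no", "dead")] * 1
    + [("low", "yes", "alive")] * 1
    + [("low", "no", "dead")] * 2
    + [("low", "no", "alive")] * 1
)
DATA = [dict(zip(("biomarker", "response", "survival"), row)) for row in ROWS]


def fit_cpts(data, parents):
    """Maximum likelihood CPTs: count(value, parent config) / count(parent config)."""
    cpts = {}
    for node, pa in parents.items():
        joint = Counter((tuple(r[p] for p in pa), r[node]) for r in data)
        marginal = Counter(tuple(r[p] for p in pa) for r in data)
        cpts[node] = {
            (cfg, value): n / marginal[cfg] for (cfg, value), n in joint.items()
        }
    return cpts


def joint_prob(assignment, cpts, parents):
    """P(assignment) as the product over nodes of P(node | its parents)."""
    prob = 1.0
    for node, pa in parents.items():
        cfg = tuple(assignment[p] for p in pa)
        # Combinations never seen in the data get probability 0 under plain MLE.
        prob *= cpts[node].get((cfg, assignment[node]), 0.0)
    return prob


cpts = fit_cpts(DATA, PARENTS)
example = {"biomarker": "high", "response": "yes", "survival": "alive"}
print(joint_prob(example, cpts, PARENTS))  # 0.5 * 0.75 * 1.0 = 0.375
```

Because every table can be read directly, each factor in a prediction is traceable to counts in the data, which is the sense in which the model is transparent. Bayesian estimation would replace the raw frequencies with smoothed ones (for example by adding prior pseudo-counts).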
Applications of graphical models
• Risk prediction
• Causal inference
• Bayesian inference
• Computer vision
• NLP
• Gene networks
Use case: Immunotherapy for advanced cancer
Challenges
• High individual-level heterogeneity in response to treatment
• Subsets show durable response, severe adverse events
• Short follow-up
• Multiple outcomes of interest
Uses for machine learning
• Identifying predictors of response
• Long-term predictions for health economic evaluations for HTA (better than curve-fitting?)
• Informing future trial design, surrogate endpoints
• Patient simulation with time-varying interventions
Multivariate prediction model
• Based on IPD from RCT
• 3 outcomes over 3 years
DAG learning
• MLE + bootstrapping + model averaging
• Constrained edge orientation based on causal tiers
• Comparison to known/expected relationships
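The DAG-learning recipe above can be sketched on simulated binary data. This is not the authors' implementation: the variable names, tier assignment, BIC score, and 0.5 inclusion threshold are all assumptions made for illustration. Edges are only allowed from a strictly earlier tier to a later one (a simplification that also guarantees acyclicity), each node's parent set is chosen by maximizing a likelihood-based BIC score, and the search is repeated on bootstrap resamples so that only edges appearing in most resamples are kept.

```python
import itertools
import math
import random
from collections import Counter

# Hypothetical causal tiers: lower tiers may cause higher tiers, never the reverse.
TIERS = {"age": 0, "biomarker": 1, "response": 2}


def bic(data, node, pa):
    """Maximized log-likelihood of node given parents, minus a BIC penalty."""
    joint = Counter((tuple(r[p] for p in pa), r[node]) for r in data)
    marginal = Counter(tuple(r[p] for p in pa) for r in data)
    loglik = sum(c * math.log(c / marginal[cfg]) for (cfg, _), c in joint.items())
    n_params = 2 ** len(pa)  # binary variables: one free parameter per parent config
    return loglik - 0.5 * math.log(len(data)) * n_params


def learn_dag(data, tiers):
    """Pick the best-scoring parent set per node among earlier-tier variables."""
    edges = set()
    for node in tiers:
        candidates = [u for u in tiers if tiers[u] < tiers[node]]
        subsets = [
            s
            for k in range(len(candidates) + 1)
            for s in itertools.combinations(candidates, k)
        ]
        best = max(subsets, key=lambda s: bic(data, node, s))
        edges |= {(u, node) for u in best}
    return edges


def bootstrap_average(data, tiers, n_boot=50, threshold=0.5, seed=0):
    """Edge strength = share of bootstrap DAGs containing the edge."""
    rng = random.Random(seed)
    freq = Counter()
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]  # resample rows with replacement
        freq.update(learn_dag(sample, tiers))
    return {e: c / n_boot for e, c in freq.items() if c / n_boot >= threshold}


def simulate(n, seed=1):
    """Simulated chain age -> biomarker -> response (illustrative only)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = int(rng.random() < 0.5)
        biomarker = int(rng.random() < (0.8 if age else 0.2))
        response = int(rng.random() < (0.9 if biomarker else 0.1))
        rows.append({"age": age, "biomarker": biomarker, "response": response})
    return rows


averaged = bootstrap_average(simulate(300), TIERS)
for (u, v), strength in sorted(averaged.items()):
    print(f"{u} -> {v}: strength {strength:.2f}")
```

The averaged edge list is what would then be compared against known or expected relationships. With many variables, exhaustive parent-set search is infeasible and a heuristic search would be needed instead.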
[Figure label: Risk score X]
Classification performance
[Figure: classification performance for Outcome 1; results for other outcomes]
External validation using RWD
• To assess generalizability and limitations
• Problem – what to do about missing covariates?
• HRQoL is highly prognostic in RCT but not present in RWD
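One way a Bayesian network can handle a covariate that is absent from the validation data is to marginalize over it, using its conditional distribution given the covariates that are observed. The sketch below assumes a hypothetical two-covariate model in which HRQoL depends on a second covariate (called `ecog` here purely for illustration). All probabilities are made-up numbers, not estimates from the trial.

```python
# Made-up illustrative probabilities; none of these come from the trial.
# P(HRQoL | ecog): distribution of the covariate that is missing in RWD.
P_HRQOL_GIVEN_ECOG = {
    0: {"good": 0.8, "poor": 0.2},
    1: {"good": 0.4, "poor": 0.6},
}
# P(event | ecog, HRQoL): outcome model learned where both covariates exist.
P_EVENT_GIVEN = {
    (0, "good"): 0.10,
    (0, "poor"): 0.30,
    (1, "good"): 0.25,
    (1, "poor"): 0.60,
}


def predict_event(ecog, hrqol=None):
    """P(event | observed covariates).

    If HRQoL is observed, read the probability off the table. If it is
    missing, sum it out: sum over h of P(h | ecog) * P(event | ecog, h).
    """
    if hrqol is not None:
        return P_EVENT_GIVEN[(ecog, hrqol)]
    return sum(
        p_h * P_EVENT_GIVEN[(ecog, h)]
        for h, p_h in P_HRQOL_GIVEN_ECOG[ecog].items()
    )


print(f"{predict_event(1, 'poor'):.2f}")  # HRQoL observed (RCT-like record)
print(f"{predict_event(1):.2f}")          # HRQoL missing (RWD-like record)
```

The marginalized prediction is less sharp than the fully observed one, which is one reason performance on a common subset of variables can differ from performance with all variables.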
[Figure: validation performance using all variables, the common subset of variables, and real-world data; annotation: good real-world generalizability]
External validation by key subgroups
[Figure: panels for Subgroup A and Subgroup B, annotated "good real-world generalizability" and "moderate real-world generalizability"]
Prognostic variables by treatment group
[Figure: prognostic value (y-axis) of variables ordered by increasing prognostic value, in panels for Treatment A and Treatment B; a differentially prognostic variable is highlighted]
Communication
Dynamic Bayesian networks
Extending Bayesian networks for time modelling
Predicting trends in time
Challenges
• Extrapolation in time
• Time-varying covariates
• How prognostic are changes in variables?
• Time-varying interventions
[Figure: initial distribution; time replication]
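The "initial distribution plus time replication" idea can be sketched in its simplest form: a single state variable, where a dynamic Bayesian network reduces to a Markov chain. The state names and all probabilities below are made up for illustration. The same transition table is reused at every monthly step, and a survival curve falls out as one minus the probability of the absorbing "dead" state.

```python
STATES = ("stable", "progressed", "dead")

# Made-up illustrative numbers, not estimates from any trial.
INITIAL = {"stable": 0.7, "progressed": 0.3, "dead": 0.0}
TRANSITION = {  # TRANSITION[state at month T][state at month T+1]
    "stable": {"stable": 0.90, "progressed": 0.08, "dead": 0.02},
    "progressed": {"stable": 0.00, "progressed": 0.85, "dead": 0.15},
    "dead": {"stable": 0.00, "progressed": 0.00, "dead": 1.00},
}


def step(dist):
    """Advance the state distribution by one time slice."""
    return {
        nxt: sum(dist[cur] * TRANSITION[cur][nxt] for cur in STATES)
        for nxt in STATES
    }


def survival_curve(n_months):
    """Return S(0), S(1), ..., S(n_months), where S(t) = 1 - P(dead at t)."""
    dist = dict(INITIAL)
    curve = [1.0 - dist["dead"]]
    for _ in range(n_months):
        dist = step(dist)  # the same transition model is replicated each month
        curve.append(1.0 - dist["dead"])
    return curve


curve = survival_curve(36)
print([round(s, 3) for s in curve[:4]])
```

Extrapolating beyond the observed follow-up simply means continuing to apply `step`, which is only as trustworthy as the assumption that the transition model stays the same over time. A full dynamic Bayesian network would factorize the state into several variables (covariates, interventions, outcomes) with their own parent sets.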
Survival curves
[Figure: survival curves over time]
Prediction performance from baseline
[Figure: prediction performance from baseline for Treatment group A and Treatment group B; annotation: "Plateau?"]
Prognostic value of changes
[Figure: changes in biomarker level (high, intermediate, low) from month T to month T+1, related to survival or death. Transitions shown:
• Biomarker A high → high
• Biomarker A med → med
• Biomarker B low → med
• Biomarker A high → med
• Biomarker B high → low
• Biomarker A med → low]
Future directions
• Relaxing the Markov assumption
• Latent variable models
• Adding outcomes
• PFS, ORR, TFS
Conclusion
• Bayesian networks are transparent, interpretable models
• Useful for multivariate prediction
• Useful for missing data problems + small data
• Useful as time models for dynamic processes