DIMACS Special Focus on Computational and Mathematical Epidemiology

Post on 19-Dec-2015

TRANSCRIPT

Page 1: DIMACS Special Focus on Computational and Mathematical Epidemiology

DIMACS Special Focus on Computational and Mathematical Epidemiology

Page 2: DIMACS Special Focus on Computational and Mathematical Epidemiology

The Role of the Mathematical Sciences in Epidemiology

Emergence of new infectious diseases:

•Lyme disease

•HIV/AIDS

•Hepatitis C

•West Nile Virus

Evolution of antibiotic-resistant strains:

•tuberculosis

•pneumonia

•gonorrhea

Page 3: DIMACS Special Focus on Computational and Mathematical Epidemiology

Great concern about the deliberate introduction of diseases by bioterrorists

•anthrax

•smallpox

•plague

Understanding infectious systems requires being able to reason about highly complex biological systems, with hundreds of demographic and epidemiological variables.

Intuition alone is insufficient to fully understand the dynamics of such systems.

Page 4: DIMACS Special Focus on Computational and Mathematical Epidemiology

Experimentation or field trials are often prohibitively expensive or unethical and do not always lead to fundamental understanding.

Therefore, mathematical modeling becomes an important experimental and analytical tool.

Page 5: DIMACS Special Focus on Computational and Mathematical Epidemiology

Mathematical models have become important tools in analyzing the spread and control of infectious diseases, especially when combined with powerful, modern computer methods for analyzing and/or simulating the models.

Page 6: DIMACS Special Focus on Computational and Mathematical Epidemiology

What Can Math Models Do For Us?

Page 7: DIMACS Special Focus on Computational and Mathematical Epidemiology

What Can Math Models Do For Us?

•Sharpen our understanding of fundamental processes

•Compare alternative policies and interventions

•Help make decisions.

•Prepare responses to bioterrorist attacks.

•Provide a guide for training exercises and scenario development.

•Guide risk assessment.

•Predict future trends.

Page 8: DIMACS Special Focus on Computational and Mathematical Epidemiology

In order for math. and CS to become more effectively utilized, we need to:

•make better use of existing tools

Page 9: DIMACS Special Focus on Computational and Mathematical Epidemiology

In order for math. and CS to become more effectively utilized, we need to:

•develop new tools

•establish working partnerships between mathematical scientists and biological scientists;

•introduce the two communities to each other’s problems, language, and tools;

Page 10: DIMACS Special Focus on Computational and Mathematical Epidemiology

•introduce outstanding junior researchers from both sides to the issues, problems, and challenges of mathematical and computational epidemiology;

Page 11: DIMACS Special Focus on Computational and Mathematical Epidemiology

•involve biological and mathematical scientists together to define the agenda and develop the tools of this field.

These are all fundamental goals of this special focus.

Page 12: DIMACS Special Focus on Computational and Mathematical Epidemiology

Methods of Math. and Comp. Epi.

Math. models of infectious diseases go back to Daniel Bernoulli’s mathematical analysis of smallpox in 1760.

Page 13: DIMACS Special Focus on Computational and Mathematical Epidemiology

Hundreds of math. models since have:

•highlighted concepts like core population in STD’s;

Page 14: DIMACS Special Focus on Computational and Mathematical Epidemiology

•Made explicit concepts such as herd immunity for vaccination policies;

Page 15: DIMACS Special Focus on Computational and Mathematical Epidemiology

•Led to insights about drug resistance, rate of spread of infection, epidemic trends, effects of different kinds of treatments.

Page 16: DIMACS Special Focus on Computational and Mathematical Epidemiology

The size and overwhelming complexity of modern epidemiological problems call for new approaches.

New methods are needed for dealing with:

•dynamics of multiple interacting strains of viruses through construction and simulation of dynamic models;

•spatial spread of disease through pattern analysis and simulation;

•early detection of emerging diseases or bioterrorist acts through rapidly-responding surveillance systems.

Page 17: DIMACS Special Focus on Computational and Mathematical Epidemiology

Statistical Methods

•Long used in epidemiology.

•Used to evaluate role of chance and confounding associations.

•Used to ferret out sources of systematic error in observations.

•Role of statistical methods is changing due to the increasingly huge data sets involved, calling for new approaches.

Page 18: DIMACS Special Focus on Computational and Mathematical Epidemiology

Dynamical Systems

Page 19: DIMACS Special Focus on Computational and Mathematical Epidemiology

Dynamical Systems

•Used for modeling host-pathogen systems, phase transitions when a disease becomes epidemic, etc.

•Use difference and differential equations.

•Little systematic effort has been made to apply today’s powerful computational tools to these dynamical systems, and few computer scientists are involved.

We hope to change this situation.
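
To make the difference-equation approach concrete, here is a minimal sketch of a discrete-time SIR (susceptible, infected, recovered) model. The SIR model is a standard textbook example and is not taken from these slides; the function name `simulate_sir` and the rates `beta` (transmission) and `gamma` (recovery) are illustrative assumptions.

```python
def simulate_sir(s0, i0, r0, beta, gamma, steps):
    """Iterate a discrete-time SIR model (assumed textbook form).

    beta  : assumed transmission rate per time step
    gamma : assumed recovery fraction per time step (0 <= gamma <= 1)
    Returns a list of (S, I, R) tuples, one per time step.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        # New infections are proportional to contacts between S and I,
        # capped so that S never becomes negative.
        new_infections = min(s, beta * s * i / n)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    for step, (s, i, r) in enumerate(simulate_sir(990, 10, 0, 0.3, 0.1, 50)):
        if step % 10 == 0:
            print(step, round(s), round(i), round(r))
```

Replacing the loop body with a call to a numerical ODE solver would give the differential-equation version of the same model.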

Page 20: DIMACS Special Focus on Computational and Mathematical Epidemiology

Probabilistic Methods

•Important role of stochastic processes, random walk models, percolation theory, Markov chain Monte Carlo methods.

Page 21: DIMACS Special Focus on Computational and Mathematical Epidemiology

Probabilistic Methods Continued

•Computational methods for simulating stochastic processes in complex spatial environments or on large networks have started to enable us to simulate more and more complex biological interactions.

Page 22: DIMACS Special Focus on Computational and Mathematical Epidemiology

Probabilistic Methods Continued

•However, few mathematicians and computer scientists have been involved in efforts to bring the power of modern computational methods to bear.
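
As a small illustration of simulating a stochastic epidemic process, the sketch below implements the Reed-Frost chain-binomial model, a classical simple stochastic model. It is offered as an assumed example, not as a method named in these slides. The function name `reed_frost` and the per-contact infection probability `p` are hypothetical.

```python
import random


def reed_frost(susceptible, infected, p, rng):
    """Simulate one run of a Reed-Frost chain-binomial epidemic.

    In each generation, every susceptible independently escapes infection
    from each current infective with probability (1 - p).
    Returns a list of (susceptible, infected) pairs, one per generation.
    """
    s, i = susceptible, infected
    history = [(s, i)]
    while i > 0:
        escape = (1.0 - p) ** i  # chance of avoiding all i infectives
        new_infected = sum(1 for _ in range(s) if rng.random() > escape)
        s -= new_infected
        i = new_infected  # previous infectives are removed after one generation
        history.append((s, i))
    return history


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    for generation, state in enumerate(reed_frost(100, 1, 0.02, rng)):
        print(generation, state)
```

Running the function many times with different seeds gives an empirical distribution of final epidemic sizes, which is the kind of quantity a deterministic model cannot provide.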

Page 23: DIMACS Special Focus on Computational and Mathematical Epidemiology

Discrete Math. and Theoretical Computer Science

•Many fields of science, in particular molecular biology, have made extensive use of discrete mathematics (DM), broadly defined.

Page 24: DIMACS Special Focus on Computational and Mathematical Epidemiology

Discrete Math. and Theoretical Computer Science Cont’d

•Especially useful have been those tools that make use of the algorithms, models, and concepts of theoretical computer science (TCS).

•These tools remain largely unused and unknown in epidemiology and even mathematical epidemiology.

Page 25: DIMACS Special Focus on Computational and Mathematical Epidemiology

DM and TCS Continued

•These tools are made especially relevant to epidemiology because of:

–Geographic Information Systems

Page 26: DIMACS Special Focus on Computational and Mathematical Epidemiology

DM and TCS Continued

–Availability of large and disparate computerized databases on subjects relating to disease and the relevance of modern methods of data mining.

Page 27: DIMACS Special Focus on Computational and Mathematical Epidemiology

DM and TCS Continued

–The increasing importance of an evolutionary point of view in epidemiology and the relevance of DM/TCS methods of phylogenetic tree reconstruction.

Page 28: DIMACS Special Focus on Computational and Mathematical Epidemiology

How does a Special Focus Work?

•Get researchers with different backgrounds and approaches together.

•Stimulate new collaborations.

•Set the agenda for future research.

•Act as a catalyst for new developments at the interface among disciplines.

DIMACS has been doing this for a long time.

Page 29: DIMACS Special Focus on Computational and Mathematical Epidemiology

Components of a Special Focus

•Working Groups

•Tutorials

•Workshops

•Visitor Programs

•Graduate Student Programs

•Postdoc Programs

•Dissemination

Page 30: DIMACS Special Focus on Computational and Mathematical Epidemiology

Working Groups

Page 31: DIMACS Special Focus on Computational and Mathematical Epidemiology

Working Groups Continued

•Interdisciplinary, international groups of researchers.

•Come together at DIMACS.

•Informal presentations, lots of time for discussion.

•Emphasis on collaboration.

•Return as a full group or in subgroups to pursue problems/approaches identified in first meeting.

•By invitation; but contact the organizer.

•Junior researchers welcomed. Nominate them.

Page 32: DIMACS Special Focus on Computational and Mathematical Epidemiology

Tutorials

Page 33: DIMACS Special Focus on Computational and Mathematical Epidemiology

Tutorials Continued

•Integrate research and education.

•Introduce mathematical scientists to relevant topics in epidemiology and biology

•Introduce epidemiologists and biologists to relevant methods of math., CS, statistics, operations research.

•Financial support available by application.

Page 34: DIMACS Special Focus on Computational and Mathematical Epidemiology

Workshops

Page 35: DIMACS Special Focus on Computational and Mathematical Epidemiology

Workshops Continued

•More formal programs.

•Widely publicized.

•One-time programs.

•Some educational component: encourage participation by graduate students; tutorials.

•Interdisciplinary flavor.

•Can spawn new working groups.

•Financial support available in limited amounts; contact the organizer.

Page 36: DIMACS Special Focus on Computational and Mathematical Epidemiology

Visitor Programs

Page 37: DIMACS Special Focus on Computational and Mathematical Epidemiology

Visitor Programs Continued

•Interdisciplinary groups of researchers will return after working group meetings.

•Workshop participants can come early or stay late.

•Visits can be arranged independent of workshops or working group meetings. Contact DIMACS Visitor Coordinator.

•Visits by junior researchers and students will be encouraged.

We want to make DIMACS a center for collaboration in mathematical and computational epidemiology for the next 5 years (and beyond).

Page 38: DIMACS Special Focus on Computational and Mathematical Epidemiology

Grad. Student/Postdoc Programs

Page 39: DIMACS Special Focus on Computational and Mathematical Epidemiology

Grad. Student/Postdoc Programs

•Each working group, workshop, tutorial will support students/postdocs. Contact organizer.

•Students/postdocs visiting for longer will have a host/mentor. Contact DIMACS visitor coordinator.

•Local graduate students will get involved through participation in working groups and small research projects.

•We hope to raise funds for postdoctoral fellows to participate by spending a year or more at DIMACS.

Page 40: DIMACS Special Focus on Computational and Mathematical Epidemiology

Dissemination

•DIMACS technical report series.

•Working group and workshop websites.

•DIMACS book series.

Page 41: DIMACS Special Focus on Computational and Mathematical Epidemiology

Working Groups

WG’s on Large Data Sets:

•Adverse Event/Disease Reporting, Surveillance & Analysis.

•Data Mining and Epidemiology.

WG’s on Analogies between Computers and Humans:

•Analogies between Computer Viruses/Immune Systems and Human Viruses/Immune Systems

•Distributed Computing, Social Networks, and Disease Spread Processes

Page 42: DIMACS Special Focus on Computational and Mathematical Epidemiology

WG’s on Methods/Tools of TCS

•Phylogenetic Trees and Rapidly Evolving Diseases

•Order-Theoretic Aspects of Epidemiology

WG’s on Computational Methods for Analyzing Large Models for Spread/Control of Disease

•Spatio-temporal and Network Modeling of Diseases

•Methodologies for Comparing Vaccination Strategies

Page 43: DIMACS Special Focus on Computational and Mathematical Epidemiology

WG’s on Mathematical Sciences Methodologies

•Mathematical Models and Defense Against Bioterrorism

•Predictive Methodologies for Infectious Diseases

•Statistical, Mathematical, and Modeling Issues in the Analysis of Marine Diseases

WG on Noninfectious Diseases

•Computational Biology of Tumor Progression

Page 44: DIMACS Special Focus on Computational and Mathematical Epidemiology

Workshops on Modeling of Infectious Diseases

•The Pathogenesis of Infectious Diseases

•Models/Methodological Problems of Botanical Epidemiology

WS on Modeling of Non-Infectious Diseases

•Disease Clusters

Page 45: DIMACS Special Focus on Computational and Mathematical Epidemiology

Workshops on Evolution and Epidemiology

•Genetics and Evolution of Pathogens

•The Epidemiology and Evolution of Influenza

•The Evolution and Control of Drug Resistance

•Models of Co-Evolution of Hosts and Pathogens

Page 46: DIMACS Special Focus on Computational and Mathematical Epidemiology

Workshops on Methodological Issues

•Capture-recapture Models in Epidemiology

•Spatial Epidemiology and Geographic Information Systems

• Ecologic Inference

•Combinatorial Group Testing

Other Topics: Suggestions are encouraged.

Page 47: DIMACS Special Focus on Computational and Mathematical Epidemiology

Tutorials

•Dynamic Models of Epidemiological Problems

•The Foundations of Molecular Genetics for Non-Biologists

•Introduction to Epidemiological Studies

•DM and TCS for Epidemiologists and Biologists

•Promising Statistical Methods for Epidemiology for Epidemiologists and Biologists

Page 48: DIMACS Special Focus on Computational and Mathematical Epidemiology

Challenges for Discrete Math and Theoretical Computer Science

Page 49: DIMACS Special Focus on Computational and Mathematical Epidemiology

What are DM and TCS?

DM deals with:

•arrangements

•designs

•codes

•patterns

•schedules

•assignments

Page 50: DIMACS Special Focus on Computational and Mathematical Epidemiology

TCS deals with the theory of computer algorithms.

During the first 30-40 years of the computer age, TCS, aided by powerful mathematical methods, especially DM, probability, and logic, had a direct impact on technology, by developing models, data structures, algorithms, and lower bounds that are now at the core of computing.

Page 51: DIMACS Special Focus on Computational and Mathematical Epidemiology

DM and TCS have found extensive use in many areas of science and public policy, for example in Molecular Biology.

These tools, which seem especially relevant to problems of epidemiology, are not well known to those working on public health problems.

Page 52: DIMACS Special Focus on Computational and Mathematical Epidemiology

So How are DM/TCS Relevant to the Fight Against Disease?

Page 53: DIMACS Special Focus on Computational and Mathematical Epidemiology

Detection/Surveillance

Streaming Data Analysis:

•When you only have one shot at the data

•Widely used to detect trends and sound alarms in applications in telecommunications and finance

•AT&T uses this to detect fraudulent use of credit cards or impending billing defaults

•Columbia has developed methods for detecting fraudulent behavior in financial systems

•Uses algorithms based in TCS

•Needs modification to apply to disease detection
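
The "one shot at the data" constraint can be illustrated with a minimal one-pass sketch that keeps only a running mean and variance and raises an alarm when a new count is unusually high. This is an assumed illustration using an exponentially weighted moving average. It is not the method used by AT&T or Columbia, and the name `stream_alarms` and the parameters `alpha`, `k`, and `warmup` are hypothetical.

```python
def stream_alarms(counts, alpha=0.2, k=3.0, warmup=5):
    """Scan a stream of counts once and return the indices that raise alarms.

    Only two numbers (running mean and variance) are stored, so memory use
    does not grow with the length of the stream.
    alpha  : assumed smoothing weight for the moving average
    k      : assumed number of standard deviations that triggers an alarm
    warmup : assumed number of initial observations used only for learning
    """
    mean = None
    var = 0.0
    alarms = []
    for t, x in enumerate(counts):
        if mean is None:
            mean = float(x)  # first observation initializes the baseline
            continue
        if t >= warmup and x > mean + k * var ** 0.5:
            alarms.append(t)
        # Update the exponentially weighted mean and variance.
        diff = x - mean
        mean += alpha * diff
        var = (1.0 - alpha) * (var + alpha * diff * diff)
    return alarms


if __name__ == "__main__":
    daily_cases = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50, 10]
    print(stream_alarms(daily_cases))
```

Applying such a detector to disease data would require the modifications the slide mentions, for example handling reporting delays and seasonal baselines, which this sketch ignores.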

Page 54: DIMACS Special Focus on Computational and Mathematical Epidemiology

Research Issues:

•Modify methods of data collection, transmission, processing, and visualization

•Explore use of decision trees, vector-space methods, Bayesian and neural nets

•How are the results of monitoring systems best reported and visualized?

•To what extent can they trigger fast and safe automated responses?

•How are relevant queries best expressed, giving the user sufficient power while implicitly restraining him/her from incurring unwanted computational overhead?

Page 55: DIMACS Special Focus on Computational and Mathematical Epidemiology

Cluster Analysis

•Used to extract patterns from complex data

•Application of traditional clustering algorithms hindered by extreme heterogeneity of the data

•Newer clustering methods based on TCS for clustering heterogeneous data need to be modified for infectious disease and bioterrorist applications.

Page 56: DIMACS Special Focus on Computational and Mathematical Epidemiology

Visualization

•Large data sets are sometimes best understood by visualizing them.

Page 57: DIMACS Special Focus on Computational and Mathematical Epidemiology

Visualization

•Sheer data sizes require new visualization regimes, which require suitable external memory data structures to reorganize tabular data to facilitate access, usage, and analysis.

•Visualization algorithms become harder when data arises from various sources and each source contains only partial information.

Page 58: DIMACS Special Focus on Computational and Mathematical Epidemiology

Data Cleaning

•Disease detection problem: Very “dirty” data:

Page 59: DIMACS Special Focus on Computational and Mathematical Epidemiology

Data Cleaning

•Very “dirty” data due to:

–manual entry

–lack of uniform standards for content and formats

–data duplication

–measurement errors

•TCS-based methods of data cleaning:

–duplicate removal

–“merge purge”

–automated detection
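
A minimal sketch of duplicate removal in the spirit of "merge purge": records are normalized, sorted, and each one is compared only with the few most recently kept records in sorted order. The record format, the helper names `normalize` and `merge_purge`, and the `window` and `threshold` values are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(record):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", record.lower())
    return " ".join(text.split())


def merge_purge(records, window=3, threshold=0.9):
    """Drop near-duplicate records.

    Records are sorted by their normalized form so that likely duplicates
    end up close together. Each record is then compared only with the last
    (window - 1) records kept so far, which avoids all-pairs comparison.
    """
    keyed = sorted((normalize(r), r) for r in records)
    kept = []  # (normalized, original) pairs that survived
    for norm, original in keyed:
        recent = kept[-(window - 1):] if window > 1 else []
        is_duplicate = any(
            SequenceMatcher(None, norm, other).ratio() >= threshold
            for other, _ in recent
        )
        if not is_duplicate:
            kept.append((norm, original))
    return [original for _, original in kept]


if __name__ == "__main__":
    reports = [
        "Smith, John  1970-01-02",
        "smith john 1970-01-02",
        "Smyth, John 1970-01-02",
        "Doe, Jane 1980-05-06",
    ]
    print(merge_purge(reports))
```

The similarity threshold trades missed duplicates against false merges. A real system would tune it on labeled data and compare individual fields rather than whole strings.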

Page 60: DIMACS Special Focus on Computational and Mathematical Epidemiology

Dealing with “Natural Language” Reports

•Devise effective methods for translating natural language input into formats suitable for analysis.

•Develop computationally efficient methods to provide automated responses consisting of follow-up questions.

•Develop semi-automatic systems to generate queries based on dynamically changing data.

Page 61: DIMACS Special Focus on Computational and Mathematical Epidemiology

Social Networks

•Diseases are often spread through social contact.

•Contact information is often key in controlling an epidemic, man-made or otherwise.

•There is a long history of the use of DM tools in the study of social networks: Social networks as graphs.

Page 62: DIMACS Special Focus on Computational and Mathematical Epidemiology

Spread of Disease through a Network

•Dynamically changing networks: discrete times.

•Nodes (individuals) are infected or non-infected (simplest model).

•An individual becomes infected at time t+1 if sufficiently many of its neighbors are infected at time t. (Threshold model)

•Analogy: saturation models in economics.

•Analogy: spread of opinions through social networks.
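
The threshold model above can be sketched in a few lines. This sketch assumes the simplest setting: a fixed (not dynamically changing) graph, infection that is never lost, and the same threshold for every node. The function names and the adjacency-dictionary representation are illustrative choices.

```python
def threshold_step(neighbors, infected, threshold):
    """Return the infected set at time t+1 given the infected set at time t.

    neighbors : dict mapping each node to an iterable of adjacent nodes
    A non-infected node becomes infected if at least `threshold` of its
    neighbors are infected at time t. Infected nodes stay infected
    (an assumption of this sketch).
    """
    newly = {
        v
        for v, nbrs in neighbors.items()
        if v not in infected
        and sum(1 for u in nbrs if u in infected) >= threshold
    }
    return infected | newly


def spread(neighbors, seeds, threshold):
    """Iterate the threshold rule until no new node becomes infected."""
    infected = set(seeds)
    while True:
        nxt = threshold_step(neighbors, infected, threshold)
        if nxt == infected:
            return infected
        infected = nxt


if __name__ == "__main__":
    # A cycle on four nodes: 0-1-2-3-0.
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(spread(cycle, {0}, threshold=1))     # one infected neighbor suffices
    print(spread(cycle, {0}, threshold=2))     # two infected neighbors required
    print(spread(cycle, {0, 2}, threshold=2))
```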

Page 63: DIMACS Special Focus on Computational and Mathematical Epidemiology

Complications and Variants

•Infection only with a certain probability.

•Individuals have degrees of immunity and infection takes place only if sufficiently many neighbors are infected and degree of immunity is sufficiently low.

•Add recovered category.

•Add levels of infection.

•Markov models.

•Dynamic models on graphs related to neural nets.

Page 64: DIMACS Special Focus on Computational and Mathematical Epidemiology

Research Issues:

•What sets of vertices have the property that their infection guarantees the spread of the disease to x% of the vertices?

•What vertices need to be “vaccinated” to make sure a disease does not spread to more than x% of the vertices?

•How do the answers depend upon network structure?

•How do they depend upon choice of threshold?
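
For very small networks, the first two questions can be explored by exhaustive search. The sketch below is a brute-force illustration under the same simple threshold model (fixed graph, permanent infection). Its running time grows exponentially with the number of nodes, so it is not a practical algorithm. All function names are hypothetical.

```python
from itertools import combinations


def final_infected(neighbors, seeds, threshold, vaccinated=frozenset()):
    """Run the threshold rule to a fixed point; vaccinated nodes never get infected."""
    infected = set(seeds) - set(vaccinated)
    while True:
        newly = {
            v
            for v, nbrs in neighbors.items()
            if v not in infected
            and v not in vaccinated
            and sum(1 for u in nbrs if u in infected) >= threshold
        }
        if not newly:
            return infected
        infected |= newly


def smallest_seed_set(neighbors, threshold, fraction):
    """Find a smallest set whose infection reaches at least `fraction` of the nodes."""
    nodes = sorted(neighbors)
    target = fraction * len(nodes)
    for size in range(len(nodes) + 1):
        for seeds in combinations(nodes, size):
            if len(final_infected(neighbors, seeds, threshold)) >= target:
                return set(seeds)
    return None


def best_vaccination(neighbors, seeds, threshold, budget):
    """Try every set of `budget` non-seed nodes and keep the one that limits spread most."""
    candidates = [v for v in sorted(neighbors) if v not in seeds]
    best, best_size = set(), len(final_infected(neighbors, seeds, threshold))
    for chosen in combinations(candidates, budget):
        size = len(final_infected(neighbors, seeds, threshold, frozenset(chosen)))
        if size < best_size:
            best, best_size = set(chosen), size
    return best, best_size


if __name__ == "__main__":
    # Star network: node 0 is connected to nodes 1 through 4.
    star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
    print(smallest_seed_set(star, threshold=1, fraction=1.0))
    print(best_vaccination(star, seeds={1}, threshold=1, budget=1))
```

On the star network, vaccinating the hub is the obvious best choice, which hints at how strongly the answers depend on network structure.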

Page 65: DIMACS Special Focus on Computational and Mathematical Epidemiology

These Types of Questions Have Been Studied in Other Contexts Using DM/TCS

Distributed Computing:

Page 66: DIMACS Special Focus on Computational and Mathematical Epidemiology

Distributed Computing:

•Eliminating damage by failed processors -- when a fault occurs, let a processor change state if a majority of its neighbors are in a different state or if the number of such neighbors is above a threshold.

•Distributed database management.

•Quorum systems.

•Fault-local mending.

Page 67: DIMACS Special Focus on Computational and Mathematical Epidemiology

Spread of Opinion

Page 68: DIMACS Special Focus on Computational and Mathematical Epidemiology

Spread of Opinion

•Of relevance to bioterrorism.

•Dynamic models of how opinions spread through social networks.

•Your opinion changes at time t+1 if the number of neighboring vertices with the opposite opinion at time t exceeds threshold.

•Widely studied.

•Relevant variants: confidence in your opinion (= immunity); probabilistic change of opinion.

Page 69: DIMACS Special Focus on Computational and Mathematical Epidemiology

Evolution

Page 70: DIMACS Special Focus on Computational and Mathematical Epidemiology

Evolution

•Models of evolution might shed light on new strains of infectious agents used by bioterrorists.

•New methods of phylogenetic tree reconstruction owe a significant amount to modern methods of DM/TCS.

• Phylogenetic analysis might help in identification of the source of an infectious agent.

Page 71: DIMACS Special Focus on Computational and Mathematical Epidemiology

Some Relevant Tools of DM/TCS

•Information-theoretic bounds on tree reconstruction methods.

•Optimal tree refinement methods.

•Disk-covering methods.

•Maximum parsimony heuristics.

•Nearest-neighbor-joining methods.

•Hybrid methods.

•Methods for finding consensus phylogenies.

Page 72: DIMACS Special Focus on Computational and Mathematical Epidemiology

New Challenges for DM/TCS

•Tailoring phylogenetic methods to describe the idiosyncrasies of viral evolution -- going beyond a binary tree with a small number of contemporaneous species appearing as leaves.

•Dealing with trees of thousands of vertices, many of high degree.

•Making use of data about species at internal vertices (e.g., when data comes from serial sampling of patients).

•Network representations of evolutionary history - if recombination has taken place.

Page 73: DIMACS Special Focus on Computational and Mathematical Epidemiology

New Challenges for DM/TCS: Continued

•Modeling viral evolution by a collection of trees -- to recognize the “quasispecies” nature of viruses.

•Devising fast methods to average the quantities of interest over all likely trees.

Page 74: DIMACS Special Focus on Computational and Mathematical Epidemiology

Decision Making/Policy Analysis

Page 75: DIMACS Special Focus on Computational and Mathematical Epidemiology

Decision Making/Policy Analysis

•DM/TCS have a close historical connection with mathematical modeling for decision making and policy making.

•Mathematical models can help us:

–understand fundamental processes

–compare alternative policies and interventions

–provide a guide for scenario development

–guide risk assessment

–aid forensic analysis

–predict future trends

Page 76: DIMACS Special Focus on Computational and Mathematical Epidemiology

Consensus

•DM/TCS fundamental to theory of group decision making/consensus

•Based on fundamental ideas in theory of “voting” and “social choice”

•Key problem: combine expert judgments (e.g., rankings of alternatives) to make policy
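
One classical rule from voting theory for combining rankings is the Borda count. The sketch below uses it as an assumed example of a consensus method; the slides do not say which rule is intended. The policy names in the usage example are invented for illustration.

```python
def borda_consensus(rankings):
    """Combine expert rankings into one consensus ranking with the Borda count.

    rankings : list of lists, each ordering the same alternatives from best
               to worst.
    An alternative in position `pos` of a ranking of n items scores
    n - 1 - pos points. Ties in total score are broken alphabetically
    (an arbitrary choice of this sketch).
    """
    candidates = sorted(rankings[0])
    scores = {c: 0 for c in candidates}
    for ranking in rankings:
        n = len(ranking)
        for pos, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - pos
    return sorted(candidates, key=lambda c: (-scores[c], c))


if __name__ == "__main__":
    expert_rankings = [
        ["quarantine", "vaccinate", "monitor"],
        ["vaccinate", "quarantine", "monitor"],
        ["vaccinate", "monitor", "quarantine"],
    ]
    print(borda_consensus(expert_rankings))
```

Other consensus rules, such as median rankings, can give different answers on the same input, which is part of what makes the choice of rule a research question.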

Page 77: DIMACS Special Focus on Computational and Mathematical Epidemiology

Consensus Continued

•Prior application to biology (Bioconsensus):

–Find common pattern in library of molecular sequences

–Find consensus phylogeny given alternative phylogenies

•Developing algorithmic view in consensus theory: fast algorithms for finding the consensus policy

•Special challenge re bioterrorism/epidemiology: instead of many “decision makers” and few “candidates,” could be few decision makers and many candidates (lots of different parameters to modify)

Page 78: DIMACS Special Focus on Computational and Mathematical Epidemiology

Decision Science

•Formalizing utilities and costs/benefits.

•Formalizing uncertainty and risk.

•DM/TCS aid in formalizing optimization problems and solving them: maximizing utility, minimizing pain, …

•Bringing in DM-based theory of meaningful statements and meaningful statistics.

•Some of these ideas virtually unknown in public health applications.

•Challenges are primarily to apply existing tools to new applications.

Page 79: DIMACS Special Focus on Computational and Mathematical Epidemiology

Game Theory

Page 80: DIMACS Special Focus on Computational and Mathematical Epidemiology

Game Theory

•History of use in military decision making

•Relevant to conflicts: bioterrorism

•DM/TCS especially relevant to multi-person games

•Of use in allocating scarce resources to different players or different components of a comprehensive policy.

•New algorithmic point of view in game theory: finding efficient procedures for computing the winner or the appropriate resource allocation.

Page 81: DIMACS Special Focus on Computational and Mathematical Epidemiology

Some Additional Relevant DM/TCS Topics

Order-Theoretic Concepts:

•Relevance of partial orders and lattices.

•The exposure set (set of all subjects whose exposure levels exceed some threshold) is a common construction in dimension theory of partial orders.

•Point lattices may be useful for visualizing the relationships of contingency tables to effect measures and cut-off choices.

Page 82: DIMACS Special Focus on Computational and Mathematical Epidemiology

Combinatorial Group Testing

•Natural or human-induced epidemics might require us to test samples from large populations at once.

•Combinatorial group testing arose from need for mathematical methods to test millions of WWII draftees for syphilis.

•Identify all positive cases in large population by:

–dividing items into subsets

–testing if subset has at least one positive item

–iterating by dividing into smaller groups.
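
The divide, test, and iterate procedure can be sketched as adaptive binary splitting. This is one simple scheme among many in the group testing literature, and it assumes an error-free pooled test. The names `find_positives` and `pool_is_positive` are hypothetical.

```python
def find_positives(items, pool_is_positive):
    """Identify all positive items using pooled tests and repeated halving.

    items            : sequence of sample identifiers
    pool_is_positive : function taking a list of items and returning True
                       if at least one item in the pool is positive
                       (assumed to be error-free)
    Returns (list of positive items, number of pooled tests performed).
    """
    tests = 0
    positives = []
    stack = [list(items)]
    while stack:
        group = stack.pop()
        if not group:
            continue
        tests += 1
        if not pool_is_positive(group):
            continue  # the whole pool is cleared with a single test
        if len(group) == 1:
            positives.append(group[0])
        else:
            mid = len(group) // 2
            stack.append(group[mid:])
            stack.append(group[:mid])
    return positives, tests


if __name__ == "__main__":
    truly_positive = {3, 42}
    found, used = find_positives(
        range(64), lambda pool: any(x in truly_positive for x in pool)
    )
    print(found, "found with", used, "tests instead of 64")
```

The saving is large only when positives are rare. If most samples are positive, pooling can cost more tests than testing each sample individually.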

Page 83: DIMACS Special Focus on Computational and Mathematical Epidemiology

Challenges Outside of DM/TCS

We’re expecting your input!

Page 84: DIMACS Special Focus on Computational and Mathematical Epidemiology

See You at DIMACS