Overarching technologies: Information Mgmt
Ed Hovy, USC/ISI
Bill Scherlis, CMU
Phil Cohen, OHSU
Hsinchun Chen, Arizona
Mike Goodchild, UCSB
Eva Kingsbury, NSF/CISE
Sharad Mehrotra, UCI
Dave Kehrlein, Calif OES
Bob Neches, USC/ISI
Research on the unexpected
The unexpected: a strawman perspective
• Understand the triggers
  – Reduce scope of what is “unexpected”
  – Our mission: understand fault models
  – Not our mission: model threats
• Design robust infrastructural systems
  – “System” = technology + people + policy + process
  – Respond gracefully to misuse
      Dampen cascading; reduce consequences of failure
      A dependable system “allows reliance to be justifiably placed on the service it delivers” [IFIP Dependability WG]
  – Our mission: identify requirements; understand feasibility
      Prevent, detect, mitigate
• Design robust response systems
  – Preparedness of response mechanisms
      Closely coupled with design of infrastructural systems
      Rapid change in rules of engagement: a flawed plan
  – Our mission: identify requirements; understand feasibility
Building an agenda for impact-oriented research
• What’s the problem?
  – Who cares?
  – What are the (social, economic) stakes of failure/success?
• What can we do now?
  – What are the limiters to progress?
• What are the great ideas?
  – How can they be developed?
  – Which research communities to engage?
• What are the barriers to adoption and how to overcome?
  – Risks, scaling, culture, turf, incentives, …
  – Economics: funding, incentives, sustainment
  – E.g., mainstream headroom (cell, net) vs. crisis-specific (tents)
• What steps to take now?
  – How to get early validation of potential for long-term impact?
  – What is the expected overall timescale?
  – What is the structure and scale (critical mass) for the effort?
Scenarios
Information management scenarios
• Multiple October “flu” outbreaks
  – Instant epidemiology
      Sensors + Fusion + Reportback + Iteration
  – Detection, confirmation, etc.
      Data sources: physicians, grocery scans, school attendance, lab tests, published pt records, etc.
      Issues: Data overwhelm, etc.
  – Fusion: data mining, modeling, visualization
      Data sources: occupational, industrial, geographic, weather, transportation routes, etc.
      Issues: Variable data quality, etc.
Information management scenarios
• Hurricane / earthquake
  – Instant bureaucracy
  – Claims management, identification, etc.
      What data is needed?
  – Resourcing: planning, routing, tracking
      Example: Dave’s cranes
Information management scenarios
• Explosion on a highway
  – Triage: Bio? Chem? Nuclear?
  – Placarded?
  – Instant confusion
  – Guiding triage
      Rapidly eliminate possibilities
      • Back-propagation in sensor data
        – Highway sensors
        – Vehicle sensors nearby
        – Airborne sensors
      • Fusion of human input
      • Situational context: anniversary date, etc.
        – Role of other databases, web, etc.
      • Causal reasoning and diagnosis
      Prediction
      • Respond according to worst case?
      • Or is it a truck explosion?
      • What will happen next?
[Figure labels: Bad guy on board; Placard reqt?]
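The "rapidly eliminate possibilities" step can be pictured as set elimination: start with every event type on the table and strike out whatever each observation contradicts. The sketch below is illustrative only. The hypothesis names, the observation names, and the rule table are all invented for this example and do not come from the slides; a fielded system would need uncertain, probabilistic rules, since a placard or a sensor can be wrong.

```python
# Hypothetical candidate event types for the highway explosion scenario.
HYPOTHESES = {"bio", "chem", "nuclear", "conventional"}

# Hypothetical rule table: observation -> hypotheses it rules out.
RULES_OUT = {
    "radiation_normal": {"nuclear"},
    "no_chemical_trace": {"chem"},
    "placard_flammable": {"bio", "nuclear"},
}


def triage(observations):
    """Return the hypotheses still consistent with all observations."""
    remaining = set(HYPOTHESES)
    for obs in observations:
        # Unknown observations eliminate nothing.
        remaining -= RULES_OUT.get(obs, set())
    return remaining


remaining = triage(["radiation_normal", "placard_flammable"])
print(sorted(remaining))
```

Treating an unrecognized observation as uninformative keeps the responder on the worst-case side: a hypothesis is dropped only when some rule explicitly excludes it.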
Information Management, generally
Human interaction issues
• Attention management
  – “Overwhelm”
• Stress effects
• Awareness
  – Tailoring; push and pull
• Computer mediated collaboration
  – Group effects: f2f and computer mediated
  – Division of labor, expertise
… a rich HCI and social science literature here, but….
[Figure labels: time, perf, human attn, computing, comms, stress]
Cycles and leverage points
• Needs by CM phase
  – Preparedness
  – Mitigation
  – Response
  – Recovery
• Analog: computer security
  – Prevent: write “safe code,” …
  – Detect: IDS, firewall, audit, …
  – Mitigate: self-healing architecture, …
• Analog: military C2
  – Military planning cycles
  – Response: Observe, Decide, Act
  – C2 goal: shorten/overlap the iterations
  – Particular challenge: Coalitions, trust, access
… what are the right process models? …
The flow of information
• Provisioning / gathering
  – Sensors (passive, active/mobile, ubiquitous), human input, simulated
  – Goal: Everything is a sensor
• Fusion / validation
  – Linking (human and automated), monitoring, triggering
  – Goal: Quality and comprehensiveness modeled
  – Goal: Ongoing (meta)data reconciliation
• Access
  – Security (military, civilian), authentication, protection
  – Goal: More trust from more “localized” trusted parties
• Exploitation / dissemination
  – Mining (automated and human), querying, triggering
  – Model development and simulation
  – What-if analysis, planning, decision-making
  – Exploration, visualization, presentation, pushing
  – Drive back to {data, sensor, human, fused} source
  – Goal: Tailored to user (“perceptual dissemination”)
  – Goal: Manage overwhelm
  – Goal: Detect errors and drive back to sources
• Collaboration / orgware
  – Ad hoc
  – Goal: Awareness (push/pull), no info loss, rapid consensus
  – Goal: Expertise effectively exploited
The corpus of information
• Schematization
  – Traditional schema-first
      Structural and semantic GIS
  – Corpus
      Traditional IR
      Textual / image structure
      Intrinsic metadata
  – Semi-structured
  – Ad-hoc / extrinsic
      WWW and raw hyperlinks
      Schema-later
      Rich links
• Economics
  – Who pays and when
  – Example: Strd Arg failures
  – Costs/benefits/risks of preparation
• Information types
  – DB types
  – Geographical
  – Media: imagery, video, sensor data, documents
• Metadata
  – Security and privacy
      Classification (fed, state)
      Proprietary (multiple)
      Limited use (expiry?)
  – Quality
      Trustworthiness
      Completeness
      Source
  – Policy / legal
      Authority to use; turf (Coalition warfare model)
  – Extrinsic
      Annotations, links, etc.
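One way to make the metadata categories above concrete is a record that travels with each information item and is consulted before the item is used. The following is a minimal sketch under stated assumptions: the class name `InfoMetadata`, its field names, the 0.0 to 1.0 quality scale, and the `may_use` check are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class InfoMetadata:
    """Hypothetical metadata record for one information item."""
    source: str                               # Quality: source
    classification: str = "public"            # Security: e.g. "public", "state", "fed"
    proprietary_owner: Optional[str] = None   # Security: proprietary
    use_expires: Optional[date] = None        # Security: limited use (expiry?)
    trustworthiness: float = 0.5              # Quality: assumed 0.0 .. 1.0 scale
    completeness: float = 0.5                 # Quality: assumed 0.0 .. 1.0 scale
    annotations: List[str] = field(default_factory=list)  # Extrinsic


def may_use(meta: InfoMetadata, clearances: Set[str], today: date) -> bool:
    """Allow use only if the user holds the classification and the
    limited-use period (if any) has not run out."""
    if meta.use_expires is not None and today > meta.use_expires:
        return False
    return meta.classification in clearances


record = InfoMetadata(
    source="county hospital",
    classification="state",
    use_expires=date(2002, 1, 31),
)
print(may_use(record, {"public", "state"}, date(2002, 1, 15)))
print(may_use(record, {"public", "state"}, date(2002, 2, 1)))
```

Clearances are modeled as a plain set rather than an ordered scale because federal and state classifications need not nest inside one another.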
The long term trajectory of information
• New issues
  – Enablement of future crisis-driven integration of info
      How to anticipate linking and needs?
  – Policy consequences: security, privacy
      Rapid (emergency) policy reconfiguration
      Understanding and modeling consequences of release
      • Privacy, unwanted linking, etc.
      • Gander: Pen/paper DB, controlled copies, limited access.
        – Destroyed during mop-up.
  – Understanding the economics
      Costs, risks, benefits, time, incentives
      “Dual use” (train as you fight; fight as you train)
      • E.g., headroom in mainstream infrastructure
      Clearinghouse, transition, validation, maturity
      • Risk and access
• Familiar topics, still critical
  – Schema evolution
  – Common data elements and metadata consensus
  – Legacy management: Reconciliation and wrapping
Technical challenges
Sensors and data collection
• Diverse sources
  – Digital dust
      Large sensor networks: 100,000s of sensors
  – Self-report patterns: 911 calls, etc.
  – Multi-modal
• Mobile ubiquitous sources
  – Camera immersion
  – Sleeper sensors
      Structural sensors in bridges, buildings
      Automobile sensors
  – Rapid sensor-net deployment
• Where to store and process
  – Processing of massive data streams
• Sensor reliability and maintenance
  – Models and records
  – User feedback
• Security, authentication, etc.
  – E.g., for dust: emergent badness
  – E.g., for water security: internal sabotage threat
Communication
• Now
  – Intermittent (wireless) connectivity
      Telecom crises are responder crises
  – Interoperation challenges
  – Multiple redundant systems
• Needed
  – Instant dependable comms infrastructure
  – Bandwidth
  – [Enables offsite datacenters, reachback.]
  – [Raise the baseline]
Ontologies
• Definitions
  – 1. semantically enriched schema
  – 2. set of core conceptual elements and relationships
  – E.g., Objects and classes
  – E.g., Procedures and entities in the world
  – Example: street centerline (i.e., digraph of streets)
• Bkgd resources: model this
• Enable rapid filling of crisis-specific gaps
• Accommodate multiple media types
  – Imagery, sensor data, video, text, maps, etc., etc.
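The street centerline example in the definitions above treats the road network as a directed graph: intersections are nodes and each drivable direction of a street segment is an edge. A minimal sketch, assuming a made-up four-intersection network (the labels "A" to "D" and the `route` helper are hypothetical), shows why direction matters for routing responders.

```python
from collections import deque

# Hypothetical street digraph: intersection -> intersections reachable
# by driving one segment. B <-> A is two-way; C -> D is one-way.
streets = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["D"],
    "D": [],
}


def route(graph, start, goal):
    """Breadth-first search: return a route with the fewest segments,
    or None if the goal cannot be reached along legal directions."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


print(route(streets, "A", "D"))
print(route(streets, "D", "A"))
```

The first call finds a route through B and C; the second returns None because no segment leads out of D, which an undirected street map would hide.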
Fusion
• Semantics
  – Sensor level (raw data)
  – (information level)
      How to overcome differences of definition?
      E.g., descriptions of fuel for forest fires: beyond “trees”
      E.g., unleaded gasoline
      E.g., race in census
• Syntax
  – Format: XML is not the whole solution
• Scale
  – Fusion wrt different levels of detail
• Currency, Trustworthiness
• Commensurability
  – E.g., positional correction
• Policy and policy aggregation
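As an illustration of the positional correction mentioned under commensurability, the simplest case is a constant shift between two map layers, estimated from control points visible in both. This sketch assumes a translation-only error, which is a deliberate simplification: real datasets may also need rotation, scaling, or a datum transformation. The function names and coordinates are invented for the example.

```python
def estimate_offset(control_pairs):
    """Average (dx, dy) that moves observed points onto reference points.

    control_pairs: list of ((obs_x, obs_y), (ref_x, ref_y)) tuples.
    """
    n = len(control_pairs)
    dx = sum(ref[0] - obs[0] for obs, ref in control_pairs) / n
    dy = sum(ref[1] - obs[1] for obs, ref in control_pairs) / n
    return dx, dy


def correct(points, offset):
    """Shift every point by the estimated offset."""
    dx, dy = offset
    return [(x + dx, y + dy) for x, y in points]


# Hypothetical control points: the same two landmarks in both layers.
pairs = [
    ((10.0, 20.0), (12.0, 19.0)),
    ((30.0, 40.0), (32.0, 39.0)),
]
offset = estimate_offset(pairs)
corrected = correct([(0.0, 0.0), (5.0, 5.0)], offset)
print(offset, corrected)
```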
Mining
• Detecting anomalous or “interesting” patterns
  – Over diverse media types, e.g., surveillance cameras
• Working “upstream” from an anomalous event
  – Back-propagation: Mining (preceding) data for the (new) pattern that should now be detected
• Media and representations
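Working "upstream" from an anomalous event amounts to re-reading data that was already collected, using a pattern that only became interesting once the event occurred. The sketch below assumes a simple in-memory list of timestamped records and a caller-supplied predicate; the record fields, the sensor names, and `mine_upstream` itself are hypothetical.

```python
def mine_upstream(records, event_time, pattern, window):
    """Return records from the `window` time units before `event_time`
    that match the newly identified `pattern` predicate."""
    start = event_time - window
    return [
        r for r in records
        if start <= r["t"] < event_time and pattern(r)
    ]


# Hypothetical sensor log; "t" is a timestamp in arbitrary units.
records = [
    {"t": 10, "sensor": "highway-3", "value": 0.2},
    {"t": 42, "sensor": "highway-7", "value": 0.9},
    {"t": 55, "sensor": "highway-7", "value": 0.95},
    {"t": 61, "sensor": "highway-7", "value": 0.99},
]


def pattern(r):
    # The pattern learned from the event: high readings on one sensor.
    return r["sensor"] == "highway-7" and r["value"] > 0.8


hits = mine_upstream(records, event_time=60, pattern=pattern, window=30)
print([r["t"] for r in hits])
```

Once validated against the preceding data, the same predicate can be installed as a forward-looking trigger so the pattern is detected the next time it appears.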
Modeling and simulation
• Decision trees and anticipation
  – What is our “event type”?
      E.g., explosion: chem/bio spread?
      E.g., anthrax in the mail
  – How expected is this unexpected event?
  – What are the potential cascading steps?
• Multiple simulation models
  – Location of vulnerabilities
      Co-located personnel
  – Interoperation of simulation models
• Modeling domains
  – Human behavior under stress
  – Organizational response
Presentation / visualization
• Emerging
  – Deployment to field PDAs
• Needed
  – Personalization/customization to field users, media, others.
  – Drilldown and detail-level control
  – What is the usage model?
  – Fluid-modal interaction
  – Support for diversity in user population
      Cultural, disabilities, language, context
      Sharing information with media, general public
GIS role
• Idea
  – Map as result of planning:
      GIS as “instrument of choice” for intelligence fusion
• Current capability
  – FireScope: tools, people, organizational structure
      GIS subcommittee: common mapping system
      Dependency on mutual aid
      • (Calif: 21 fires with >100 fire depts involved)
  – Mobile GIS labs
  – Web/FTP sites
• Needed
  – 4D representation: navigate in {location + time}
  – USAR issue: CAD + GIS
  – Policy: tax/insurance advantage to capture data
Architecture and distribution
• Interlinking: systems and organizations
• Instant infrastructure
  – Pluggable interconnection medium
      Comms
      KB and ontology
• Robustness
  – Maximize and localize capabilities
  – Principle: Graceful degradation
  – E.g., Networked PDAs without networks
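One reading of "networked PDAs without networks" is store-and-forward: the device keeps working locally and queues outgoing reports until a link returns. The sketch below is a minimal illustration under that assumption; the `StoreAndForward` class, the `send` callable, and the message strings are hypothetical, and a real device would also persist the queue across restarts.

```python
from collections import deque


class StoreAndForward:
    """Queue messages locally and deliver them in order when possible."""

    def __init__(self, send):
        # send: callable(message) -> bool, True if the message was delivered.
        self._send = send
        self._pending = deque()

    def submit(self, message):
        """Accept a message now; deliver immediately if the link allows."""
        self._pending.append(message)
        self.flush()

    def flush(self):
        """Try to deliver queued messages; return how many remain."""
        while self._pending:
            if not self._send(self._pending[0]):
                break  # link is down: keep the message for later
            self._pending.popleft()
        return len(self._pending)


# Simulated link state and delivery log (assumptions for the example).
link = {"up": False}
delivered = []


def send(message):
    if link["up"]:
        delivered.append(message)
    return link["up"]


pda = StoreAndForward(send)
pda.submit("triage report 1")  # queued: no network
pda.submit("triage report 2")  # queued: no network
link["up"] = True
pda.flush()                    # both reports go out, oldest first
print(delivered)
```

A message is removed from the queue only after `send` reports success, so a link that drops mid-flush loses nothing.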
HCI and human factors
• Team support and collaboration:
  – f2f and dist/mediated
  – Shift change
  – Role definition
  – Planning cycle: 12 hours (link w/shift chg)
      How to accelerate planning cycles?
      In ICS: Situation Analysis (intelligence fusion)
  – Trust and emotional state
Making it happen
Program formulation issues (examples)
• Delivery mechanism:
  – Adoption and risk
  – Examples:
      Mainstream headroom (cell, net) vs. crisis-specific (tents)
      • Principle: Headroom model
      • E.g., SETI and other grid computing
      Awareness of cultural context (e.g., crisis responders)
      What are the user’s real risk issues for acceptance?
• Role of tight collaborations
  – Researchers with users
  – Interdisciplinary:
      IT researchers with social scientists
      Recognize the process cost of collaborative research