A High Security C4I Information System for
Emergency Management
Joseph E. Johnson [email protected]
University of South Carolina
Norfolk VA DARPA Oasis Conference
February 16, 2001
Our Team
Theoretical Physics – 3 PhD Faculty + 2 GRA
Computer Science – 2 PhD Faculty + 1 Post Doc + 1 GRA
Applied Mathematics – 2 PhD Faculty + 2 GRA
My R&D group – several Oracle / Web / Network development staff.
Phase 0 - Non-DARPA Work
Historic Foundations & Background
Historical Setting – A real-world system
Our R&D group has designed and developed the SC Emergency Management Information System called IRIS.
IRIS is a Web based Java & Oracle 8i system running on IBM RS/6000 (with North American maps & pager triggers).
IRIS has managed the SC emergency information for over 4 years including Hurricane Floyd, the largest US peacetime evacuation.
We have a working voice recognition I-O interface to IRIS allowing updating and querying of the Oracle database by cell phone.
IRIS is a C4I type system that manages all incident reports, resource requests, messaging, resources and critical facilities
databases and logging of ongoing actions.
Objective
Maintain full operation capability of this statewide emergency information management system at 99.999% (down 5 minutes/yr).
System is to run via browser on the Internet with a large number of users at low security.
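The downtime figure attached to the 99.999% target can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare three common availability targets.
    for target in (0.999, 0.9999, 0.99999):
        print(f"{target:.3%} -> {allowed_downtime_minutes(target):.2f} min/yr")
```

At 99.999% the budget comes to a little over 5 minutes per year, which matches the figure quoted in the objective.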
The Problem: IRIS Must Function Adequately Under All Threats
Acts of Nature: hurricanes, floods, fires, earthquakes, tornados, loss of power or ISP.
Acts of Man Unintentional: HW/SW & Network failures & bugs, human error, BNCI disasters.
Acts of Man Intentional: Hackers, terrorists, disgruntled employees, war, a full spectrum of acts of criminal intent & intentional BNCI.
Some Good Features
Rather complex roles & permissions are managed using Oracle with individual IDs.
Use of 128 bit encryption – secure socket layer.
Reasonably good firewall.
System not yet successfully hacked (not a challenge).
Separate tables contain a full continuous backup of every historical image (no data is ever erased or overwritten).
A complete mirrored system runs at USC to back up the State system (but with substantial delay).
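The "no data is ever erased or overwritten" rule can be pictured as an append-only history table, where every update is a new row. The sketch below uses Python's built-in sqlite3 purely as a stand-in for Oracle. The table name, column names, and helper functions are hypothetical and are not taken from IRIS.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with an append-only history table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE incident_history (
            version_id  INTEGER PRIMARY KEY AUTOINCREMENT,
            incident_id INTEGER NOT NULL,
            status      TEXT NOT NULL,
            recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    return conn


def record(conn: sqlite3.Connection, incident_id: int, status: str) -> None:
    """An 'update' is a new row; earlier images are never touched."""
    conn.execute(
        "INSERT INTO incident_history (incident_id, status) VALUES (?, ?)",
        (incident_id, status),
    )
    conn.commit()


def current_status(conn: sqlite3.Connection, incident_id: int):
    """Return the most recent status, or None if the incident is unknown."""
    row = conn.execute(
        "SELECT status FROM incident_history WHERE incident_id = ? "
        "ORDER BY version_id DESC LIMIT 1",
        (incident_id,),
    ).fetchone()
    return row[0] if row else None


def history(conn: sqlite3.Connection, incident_id: int):
    """Return every historical image of the incident, oldest first."""
    rows = conn.execute(
        "SELECT status FROM incident_history WHERE incident_id = ? "
        "ORDER BY version_id",
        (incident_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Because the code never issues UPDATE or DELETE, any earlier state of a record can be recovered by reading back through its versions.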
Difficulties So Far
Firewall reconfigurations have blocked users.
False alarms, as well as duplicate & bad data.
ISP failures (before new site).
Power failures (before new site).
Earlier programming errors (e.g. an incident site Lon/Lat is plotted at the city geocenter which is in the bay).
User misuse of technical terms and codes.
Mistakes of technical staff in system management.
Difficulty of managing a rapidly changing approved user list.
Technical infrastructure problems of a continuing random nature.
Dominant Anticipated Threats
BCNI Catastrophe or Hurricane - Massive Incident with Staff and Information Overload.
Loss of Internet & Telephony.
Terrorism & Hackers.
Earthquake, Tornado, Flood, Fire.
Random Unanticipated Failures.
Philosophy – Phase 1:
The dominant threats are correlated with a given site (earthquake/hurricane, ISP loss, terrorism, random failures, BCN incident, disgruntled employees, poor technical personnel actions).
If we can develop a robust and rapid dominant host site transfer, we minimize site correlated failure.
Also multiple systems could best handle multiple regions in the future, as disasters are usually geo-centered.
IRIS Catalogues Major Threats
The emergency management software must track the system failures including power, internet, and computer failures.
It must also track its own failure (on a replicated system).
Phase I - DARPA Funded Work
System Replication
Solution Phase 1: System Replication at Widely Dispersed Sites.
We have installed an IBM H70 at each of USC, UU, and MHPCC in secure environments.
The IRIS system with current SC real data is currently being replicated over the three sites.
Oracle replication specific to the needs of this C4I type system is being studied.
Objectives:
Maximum system availability at the minimum cost.
Minimum information loss upon fail-over.
Reasonably good security at reasonable cost for a system with a highly dynamic user base.
Multiple hosts well synchronized with fail-over and potential immersion in a larger set of hosts.
Reliance on expert staff to monitor system for additional security
Staff monitors statistical aspects of network traffic, data density categorized by type, intensity, threat, and geography (Oracle query-by-example filters).
Staff monitors and manages users added and deleted and usage by role and individual.
Specialized regional personnel are responsible for data quality assurance in their area.
Advantages of this Environment
The entire operation is wrapped in Oracle.
Each piece of information is an Oracle record.
County senior administrator oversees all data submissions which are invisible until approval.
All mail is an Oracle record – no attachments.
“Regular” mail is maintained on a separate NT server and separate network.
Photos & enclosures are allowed separately.
Phase 2 - DARPA Funded Work
Network Attacks –
Threats that are not Site Specific
Philosophy
Read-only access is not a disaster.
Write or Use denial is a major disaster.
Keep only one of three systems on-line replicating to the other two.
Rapidly identify a crisis, identify problem, and ‘repair’.
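The "one of three on-line, replicating to the other two" arrangement can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the IRIS implementation. The class, its methods, and the rule "promote the first healthy standby in list order" are all hypothetical. The site names are the three hosts named earlier in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    sites: List[str]                 # ordered by fail-over preference (assumed)
    active: str                      # the single on-line system
    healthy: Dict[str, bool] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every site is assumed healthy until a check says otherwise.
        for site in self.sites:
            self.healthy.setdefault(site, True)

    def standbys(self) -> List[str]:
        """Sites that receive replicated data from the active system."""
        return [s for s in self.sites if s != self.active]

    def report(self, site: str, ok: bool) -> None:
        """Record a health check; losing the active site triggers fail-over."""
        self.healthy[site] = ok
        if site == self.active and not ok:
            self.fail_over()

    def fail_over(self) -> str:
        """Promote the first healthy standby to be the on-line system."""
        for site in self.standbys():
            if self.healthy[site]:
                self.active = site
                return site
        raise RuntimeError("no healthy standby site available")


if __name__ == "__main__":
    cluster = Cluster(sites=["USC", "UU", "MHPCC"], active="USC")
    cluster.report("USC", False)     # active site lost
    print(cluster.active, cluster.standbys())
```

How a failure is detected, and how much unreplicated data is lost at the moment of transfer, are the hard parts. This sketch leaves both out.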
Approach to Identify Network Attacks
Treat the network information as a Complex System
Theoretical Analysis of the Local Network Traffic
To understand the structure of the variables for internet host-to-host communications, we used dumped output of our local network traffic.
1. Dump of IP traffic at EPD site (get characteristics of different ‘emergencies’)
2. Dump of IP traffic at a major USC site (get characteristics of different ‘attacks on a general system’).
Real Time Identification of Threat
Parameters encapsulated in all IP packets have been divided into two classes – dynamic (those that
change during propagation) and static (those that are unchanged).
The information traffic for the host-to-host communication can be described as a trajectory in a multi-dimensional static parameter space.
There are well defined patterns in the parameter subspaces related to the ‘normal’ and ‘abnormal’ network behavior.
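One way to picture the static/dynamic split and the resulting trajectory is sketched below. The talk does not list which fields fall into each class, so the choice here is an assumption. For IPv4, the TTL is decremented and the header checksum recomputed at each hop, so both are treated as dynamic. Protocol and total length are treated as static, which ignores fragmentation. The destination port is a transport-layer field added only for illustration. All names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Packet(NamedTuple):
    timestamp: float
    src: str
    dst: str
    protocol: int       # static (assumed)
    dst_port: int       # static; transport-layer field, added for illustration
    total_length: int   # static in the absence of fragmentation (assumed)
    ttl: int            # dynamic: decremented at each hop
    checksum: int       # dynamic: recomputed whenever the TTL changes


# Hypothetical choice of axes for the static parameter space.
STATIC_FIELDS = ("protocol", "dst_port", "total_length")


def trajectory(packets: List[Packet], src: str, dst: str) -> List[Tuple[float, tuple]]:
    """Time-ordered points in static parameter space for one host pair."""
    flow = sorted(
        (p for p in packets if p.src == src and p.dst == dst),
        key=lambda p: p.timestamp,
    )
    return [(p.timestamp, tuple(getattr(p, f) for f in STATIC_FIELDS)) for p in flow]
```

Each host pair then yields a sequence of points, which can be compared against patterns labelled ‘normal’ or ‘abnormal’.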
First Set of Objectives
What is a characteristic dimension of the network parameter space?
How many nodes are needed to consider the network as a ‘complex enough’ system?
How does the dimension of the space depend upon the network topology and the number of nodes?
Desired Outcomes
A structure of the possible network intrusion in terms of the network parameters.
A quantitative method for the classification and characteristics of attacks.
A model independent way to obtain the best possible (optimized) level for the detection of an intrusion for a given class of intrusions.
The ability to detect an abnormal network behavior at the reconnaissance stage of the attack.
Methods for Pattern Recognition for Intrusion Detection in Real Time
Fast Fourier Transform (FFT) for obtaining stable nodes of the network.
Random Matrix Theory simulations to be able to distinguish between chaotic and regular network behavior.
Wavelet Analysis (WA) for fast pattern recognition used for network analysis and for detection of possible intrusions.
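Two of the listed methods can be illustrated on a toy series of packet counts per time bin. This is a sketch under assumptions, not the project's analysis code. It uses a plain O(n^2) discrete Fourier transform instead of an FFT so that only the standard library is needed; the two give the same spectrum. It also uses a single level of the Haar wavelet as the simplest wavelet. The function names are hypothetical.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive discrete Fourier transform; returns |X_k|^2 for k = 0..n//2."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        x_k = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(x_k) ** 2)
    return spectrum


def dominant_period(samples):
    """Period, in samples, of the strongest non-constant frequency component."""
    spectrum = power_spectrum(samples)
    # Skip k = 0, which only carries the mean traffic level.
    k = max(range(1, len(spectrum)), key=lambda i: spectrum[i])
    return len(samples) / k


def haar_details(samples):
    """One level of the Haar transform: scaled differences of adjacent pairs."""
    return [
        (samples[i] - samples[i + 1]) / math.sqrt(2)
        for i in range(0, len(samples) - 1, 2)
    ]


if __name__ == "__main__":
    # Synthetic traffic with a period of 8 bins.
    traffic = [10 + 5 * math.sin(2 * math.pi * t / 8) for t in range(64)]
    print("dominant period:", dominant_period(traffic))

    # A flat series with one sudden burst.
    burst = [10.0] * 8
    burst[5] = 50.0
    print("Haar details:", haar_details(burst))
```

The spectrum picks out regular, repeating structure in the traffic. The Haar detail coefficients are near zero for smooth stretches and large where the series jumps, which makes them a cheap localized change detector.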
Results to Date
We have found patterns related to ‘normal’ and ‘abnormal’ network behavior.
We have also found that for the initial set of examples under investigation, the abnormal behavior can be detected on the reconnaissance stage of the attack.
Conclusions
Phase 1 Replication is operational but needs a lot of ‘tuning’ to minimize transient data loss and optimize ‘fail-over’.
Phase II Initial results are promising and will be reported in detail in 6 months at the next PI meeting.
If Phase II works, we seek to be able to identify an attack and its general type, and thus activate a fail-over prior to a system compromise.
We will probably fail-over to a secure (disconnected from open internet) mode of operation that only lets core staff onto the fail-over system.
Our Hope
Our hope is that the approach of treating a selected network space of parameters+time as a complex system can be scaled to systems of any size and design.
It is essential to be able to process the vast amount of network data in real time. We envision an attached Linux cluster.
We will seek to identify both anomalies and signatures.
JL Question 1: What threats is your project considering?
Part 1 considers all threats except network attacks to a central Web/Oracle C4I system.
Part 2 considers all network attacks on any system.
JL Question 2: What assumptions does your project make?
Part 1 Assumes that most threats to a system are site specific and if the system is a relational database, these can be managed by server replication and fail-over.
Part 2 Assumes the network information host-to-host traffic can be described as a trajectory, in a multi-dimensional space of parameters+time, and which behaves as a complex system revealing patterns indicating ‘normal’ and ‘abnormal’ behavior.
JL Question 3: What policies can your project enforce?
Part 1 allows fail-over to a replicated system based upon any policy indicating an appropriate threat.
Part 2 allows any policy actions resulting from abnormal network patterns.
JL Question 4: What policies can the group of projects enforce?
Part 1 suggests the policy, where possible, of storing information in relational DB wrappers and replicating to dual systems on a continuous basis for mission critical applications.
Part 2 suggests that agents triggered by abnormal pattern detection in our system could act in concert with agents from other projects indicating attack and thus triggering an appropriate response or human intervention.
Phase III – Non-DARPA Parallel Work
Phase 3 – Outside and Parallel to This Project
Cost-Benefit Analysis
Cost/benefit analysis cannot be achieved devoid of an underlying value of the associated information.
We are working to quantify the cost-benefit of damage, response, resources, and general system management.
All attacks result in ‘misinformation’ or ‘lack of information’, which results in higher costs directly linked to the ‘information failure’ of the system.
We believe that this problem is of the greatest importance for all decision makers.
Phase 3 – Outside & Parallel to This Project
Synthesis of Expert Opinion
Voting weights from both experts and software agents (such as the FFT/wavelet system we are studying) can be folded with each other and with human votes on system malfunction.
Such a system can be self-adjusting.
We have derived associated sets of coupled nonlinear equations and can show that iterative solutions can be found with rapid convergence.
We will study these applications if possible at the end of our current project.
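The talk does not give the coupled equations, so the sketch below is only a hypothetical stand-in showing what a self-adjusting vote-folding loop can look like. The weighting rule is an assumption: each voter's weight is made inversely proportional to its mean squared distance from the weighted consensus, and the consensus is then recomputed. The function and parameter names are likewise assumptions.

```python
def weighted_consensus(votes, weights):
    """Weighted mean score per event; votes[v][e] is voter v's score for event e."""
    n_events = len(votes[0])
    return [
        sum(w * voter[e] for w, voter in zip(weights, votes))
        for e in range(n_events)
    ]


def fold_votes(votes, iterations=50, eps=1e-3, tol=1e-9):
    """Iterate weights and consensus together; returns (weights, consensus)."""
    n_voters = len(votes)
    n_events = len(votes[0])
    weights = [1.0 / n_voters] * n_voters    # start with equal weights
    for _ in range(iterations):
        consensus = weighted_consensus(votes, weights)
        # Mean squared disagreement of each voter with the current consensus.
        errors = [
            sum((voter[e] - consensus[e]) ** 2 for e in range(n_events)) / n_events
            for voter in votes
        ]
        # Assumed rule: weight shrinks as disagreement grows; eps avoids 1/0.
        raw = [1.0 / (eps + err) for err in errors]
        total = sum(raw)
        new_weights = [r / total for r in raw]
        change = max(abs(a - b) for a, b in zip(weights, new_weights))
        weights = new_weights
        if change < tol:
            break
    return weights, weighted_consensus(votes, weights)


if __name__ == "__main__":
    # Rows: two experts who broadly agree and one software agent that does not.
    # Scores are made-up likelihoods of malfunction for three events.
    votes = [
        [0.9, 0.1, 0.8],
        [0.8, 0.2, 0.9],
        [0.1, 0.9, 0.2],
    ]
    weights, consensus = fold_votes(votes)
    print("weights:", [round(w, 3) for w in weights])
    print("consensus:", [round(c, 3) for c in consensus])
```

The weights depend on the consensus and the consensus on the weights, so the two must be solved together. In this sketch the loop stops when the weights stop changing or after a fixed number of rounds.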
Phase 3 – Outside & Parallel to this Project
Use of numerical uncertainty and continuous valued logic as data.
Use to record probabilities of threat as output of both software agents and on-line experts via their ‘evaluations’ of each threat using a continuous valued logic developed by the PI.
Use of Markov models
We are looking at the use of these models to evaluate cost-benefit analysis.
Use of User Agents that constantly test performance features including timing and report information will be explored.
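As a rough illustration of how a Markov model could feed a cost-benefit estimate, the sketch below computes the long-run fraction of time a system spends in each state and weights it by a per-state cost. The states, transition probabilities, and costs are all made-up placeholders. The talk does not specify a model.

```python
def stationary_distribution(transition, steps=1000):
    """Long-run state probabilities, found by repeatedly applying the chain."""
    n = len(transition)
    dist = [1.0 / n] * n
    for _ in range(steps):
        dist = [
            sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)
        ]
    return dist


def expected_cost(transition, costs):
    """Average cost per time step once the chain has settled."""
    dist = stationary_distribution(transition)
    return sum(p * c for p, c in zip(dist, costs))


if __name__ == "__main__":
    # Hypothetical states: 0 = normal, 1 = degraded, 2 = failed over.
    # Row i holds the probabilities of moving from state i; each row sums to 1.
    transition = [
        [0.98, 0.015, 0.005],
        [0.60, 0.30, 0.10],
        [0.50, 0.00, 0.50],
    ]
    costs = [0.0, 10.0, 100.0]   # placeholder cost per time step in each state
    print("long-run distribution:", stationary_distribution(transition))
    print("expected cost per step:", expected_cost(transition, costs))
```

Changing a transition probability, for example to represent faster fail-over, and re-running the calculation gives a simple way to compare the cost of alternative designs.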
Acknowledge Support
DARPA
Supporting Sites: USC, UU, MHPCC
SC EPD
My team – that works a lot on their own initiatives.