Revealing the Characteristics of Cyber Analysts’ Reasoning Processes: A Trace Analysis Approach Annual Review ARO MURI on Computer-aided Human-centric Cyber SA October 29, 2013 Pennsylvania State University John Yen Chen Zhong Peng Liu Army Research Laboratory Robert Erbacher Steve Hutchinson Renee Etoty Hasan Cam William Glodek


Page 1: Pennsylvania State University John Yen Chen Zhong Peng Liu

Revealing the Characteristics of Cyber Analysts’ Reasoning Processes: A Trace Analysis Approach

Annual Review
ARO MURI on Computer-aided Human-centric Cyber SA

October 29, 2013

Pennsylvania State University
John Yen
Chen Zhong
Peng Liu

Army Research Laboratory
Robert Erbacher
Steve Hutchinson
Renee Etoty
Hasan Cam
William Glodek

Page 2: Pennsylvania State University John Yen Chen Zhong Peng Liu

Objectives:
• Understand the analytical reasoning process of cyber analysts
• Capture the analytical reasoning traces of cyber analysts through a non-invasive tool
• Develop a model of the analytical reasoning process that can capture rich traces and enable automated trace analysis
• Conduct experiments involving cyber analysts

Scientific/Technical Approach:
• Developed the Observation-Hypothesis-Action (OHA) model of the analytical reasoning process
• Developed and implemented the Analytical Reasoning Support Tool for Cyber Analysis (ARSCA)
• Designed experiments that capture realistic challenges in cyber SA using VAST 2012
• Collaborated with an ARL study on visualization of cyber SA led by Dr. Erbacher
• Conducted multiple pilot studies (at Penn State and Army Research Lab) to polish ARSCA

Accomplishments:
• Conducted experiments, in collaboration with Army Research Lab, involving subjects from Penn State and ARL
• An initial case study of trace analysis provided new insights into the reasoning process of analysts
• Initial correlation analysis suggests a relationship between characteristics of traces and performance/expertise

Opportunities:
• Improve performance of analysts through OHA-based training
• Investigate the different strategies of experts and novices
• Investigate using aggregated analyst experiences to support the analytical reasoning process

Computer-Aided Human Centric Cyber Situation Awareness

J. Yen, C. Zhong, P. Liu, R. Erbacher, S. Hutchinson, R. Etoty, H. Cam, W. Glodek

Page 3: Pennsylvania State University John Yen Chen Zhong Peng Liu

[Project overview diagram; recoverable labels:]

• System Analysts
• Multi-Sensory Human Computer Interaction
• Computer network
• Software Sensors, probes: Hyper Sentry, Cruiser
• Enterprise Model, Activity Logs, IDS reports, Vulnerabilities
• Data Conditioning, Association & Correlation
• Cognitive Models & Decision Aids: Instance Based Learning Models, Simulation, Measures of SA & Shared SA
• Automated Reasoning Tools: R-CAST, Plan-based narratives, Graphical models, Uncertainty analysis
• Information Aggregation & Fusion: Transaction Graph methods, Damage assessment
• Real World, Test-bed

Page 4: Pennsylvania State University John Yen Chen Zhong Peng Liu

Year 4 Accomplishments at a Glance

Publications:
1. Zhong, C., Kirubakaran, D.S., Yen, J., Liu, P., Hutchinson, S., & Cam, H., “How to Use Experience in Cyber Analysis: An Analytical Reasoning Support System”, in Proceedings of IEEE Conference on Intelligence and Security Informatics (ISI), 2013.
2. Chen, P.C., Liu, P., Yen, J., & Mullen, T., “Experience-based cyber situation recognition using relaxable logic patterns”, in IEEE International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA), pp. 243-250, 2012.
3. Chen Zhong, VAST 2013 Workshop Presenter
4. Working papers for CogSIMA 2014

Tools:
• ARSCA

Technology transfer:
• J. Yen as summer faculty fellow at ARL
• Deep collaborations with ARL researchers:
– Brought the ARSCA toolkit to Adelphi site
– 12 ARL security analysts participated
– Weekly teleconferences
– Joint work on a series of papers
• Invention Disclosure to PSU

Awards:
• Best Paper Award, CogSIMA 2012
• Chen Zhong: Grace Hopper Celebration of Women in Computing Scholarship
• Chen Zhong: Honorable Mention, VAST Challenge 2013, Mini-Challenge 3 (Visual Analytic for Cyber SA)

Students:
• Chen Zhong, PhD

Page 5: Pennsylvania State University John Yen Chen Zhong Peng Liu

Cyber SA Depends on Human Analysts

[Diagram labels: Network; Attacks; Data Sources (feeds); Depicted Situation; Ground Truth (estimates); Compare; Job Performance]

Page 6: Pennsylvania State University John Yen Chen Zhong Peng Liu

“Hi Bob, how did you nail it?”

Answer A: “you know this is my job”

Answer B: “this tool is awesome”

Answer C: “I talked to Jacob”

Answer D: “I employ good reasoning” [our research focus]

6

Page 7: Pennsylvania State University John Yen Chen Zhong Peng Liu

High level research questions

Q1. How do analysts reason?

Q2. Does good reasoning matter?

Q3. If it matters, how to enable analysts to do more good reasoning, and less bad reasoning? [training?]

Q4. How to automate, and to what extent?
– Understanding the analyst’s reasoning processes is essential to bridge the gaps between humans and tools.
– The analyst’s reasoning processes provide insights on how to automate…

7

Page 8: Pennsylvania State University John Yen Chen Zhong Peng Liu

Prerequisites

P1. Need to get the reasoning processes of analysts

P2. Need to characterize these reasoning processes

P3. Need to correlate the characteristics with job performance

• In AI, this is related to “knowledge acquisition/elicitation”

• In Cognitive Science, this is denoted “theories on how we reason”, e.g., the mental model theory, the procedural memory concept in ACT-R

8

Page 9: Pennsylvania State University John Yen Chen Zhong Peng Liu

Existing Knowledge Acquisition Approaches

• CTA: cognitive task analysis

• Simulation: ACT-R needs “procedural memory”

• Knowledge engineering in expert systems

• Case-based learning

9

The Knowledge Acquisition Bottleneck (Feigenbaum)

Page 10: Pennsylvania State University John Yen Chen Zhong Peng Liu

Our Approach

Insight 1: Diverse reasoning processes may share common structures and critical elements

– We propose: OHA model

Insight 2: These critical elements and the relationships among them later on could be used to recover the reasoning processes

Insight 3: Using a software tool to track the traces of analysts’ reasoning processes

– We built ARSCA (Analytical Reasoning Support Tool for Cyber Analysis) toolkit

10

Page 11: Pennsylvania State University John Yen Chen Zhong Peng Liu

Three Merits

1. The analyst does not need to remember what he/she did; the reasoning process can be restored automatically from traces.

2. Traces can be correlated directly with job performance.

3. Traces provide abundant detail; thoughts are expressed in natural language.

11

Page 12: Pennsylvania State University John Yen Chen Zhong Peng Liu

Challenges

C1. Validation challenge: Are the restored reasoning processes really the original?

C2. How to trace in a non-intrusive, non-distracting manner?

C3. Tradeoff challenge: tradeoffs among
– Automation in trace analysis
– How structured?
– How much info can be collected?

Page 13: Pennsylvania State University John Yen Chen Zhong Peng Liu

Task Design

Data from VAST 2012 Challenge

– Data sources
• Corporate network configuration
• Firewall logs: 26,000,000 entries
• IDS alerts: 35,000 entries

– Ground truth
• An attack over two days (40 hours)

Page 14: Pennsylvania State University John Yen Chen Zhong Peng Liu

Task Design (2)

Task  Time period                Raw Data Size
1     4/5 20:18-20:30 (12 min)   IDS: 214; Firewall: 123,133
2     4/5 22:15-22:26 (11 min)   IDS: 239; Firewall: 115,524
3     4/6 0:00-0:10 (10 min)     IDS: 296; Firewall: 112,766
4     4/6 18:01-18:15 (14 min)   IDS: 252; Firewall: 85,463
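The four task windows above can be expressed as a small lookup for slicing the raw logs. This is a minimal sketch, not the project's tooling: the year 2012 is assumed from the VAST 2012 data set, and the entry format (a dict with a "time" key) and the function names are hypothetical.

```python
from datetime import datetime

# Task windows from the table; the year 2012 is an assumption.
TASK_WINDOWS = {
    1: (datetime(2012, 4, 5, 20, 18), datetime(2012, 4, 5, 20, 30)),
    2: (datetime(2012, 4, 5, 22, 15), datetime(2012, 4, 5, 22, 26)),
    3: (datetime(2012, 4, 6, 0, 0), datetime(2012, 4, 6, 0, 10)),
    4: (datetime(2012, 4, 6, 18, 1), datetime(2012, 4, 6, 18, 15)),
}


def window_minutes(task_id):
    """Length of a task window in whole minutes."""
    start, end = TASK_WINDOWS[task_id]
    return int((end - start).total_seconds() // 60)


def slice_for_task(entries, task_id):
    """Keep entries whose 'time' lies inside the task window (inclusive).

    `entries` is a hypothetical list of dicts with a datetime under 'time'.
    """
    start, end = TASK_WINDOWS[task_id]
    return [e for e in entries if start <= e["time"] <= end]
```

The window lengths computed this way agree with the durations listed in the table (12, 11, 10, and 14 minutes).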

Page 15: Pennsylvania State University John Yen Chen Zhong Peng Liu

Tracing Tool Architecture

[Architecture diagram; recoverable labels:]

• Data sources: IDS alerts, Firewall logs, Others
• DBMS Engine: Queries, Answers
• Views (three shown)
• A tree of thoughts
• Mouse, Keyboard
• Invisible tracking:
– Keystrokes
– Data filtering conditions
– Observations
• XML traces
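To make the "invisible tracking to XML traces" step concrete, here is a minimal sketch of writing tracked events to XML and reading them back. The slides do not show the ARSCA trace schema, so the element and attribute names (`trace`, `event`, `time`, `kind`) and both function names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def trace_to_xml(events):
    """Serialize (timestamp, kind, detail) string tuples to an XML string.

    The <trace>/<event> layout is hypothetical, not the ARSCA schema.
    """
    root = ET.Element("trace")
    for timestamp, kind, detail in events:
        item = ET.SubElement(root, "event", {"time": timestamp, "kind": kind})
        item.text = detail
    return ET.tostring(root, encoding="unicode")


def xml_to_events(xml_text):
    """Parse the XML produced by trace_to_xml back into tuples."""
    root = ET.fromstring(xml_text)
    return [(e.get("time"), e.get("kind"), e.text or "") for e in root.iter("event")]
```

Keeping each tracked item as one element with a timestamp is what would make a later replay of the reasoning process possible, since the events can be re-read in order.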

Page 16: Pennsylvania State University John Yen Chen Zhong Peng Liu

How to work with the Tool?

• Demo1: working with the tool.

• Demo2: traces are captured in XML files.

16

Page 17: Pennsylvania State University John Yen Chen Zhong Peng Liu

Let’s Look into the Traces

• One trace (“pilot 1”)
– A quick replay of the analytical reasoning process
– A quick look at the trace
– Look into the Hypotheses
– Look into the Actions and Observations (which form the “context” of the hypotheses)
• Compare 10 Traces
• Initial Correlation

Page 18: Pennsylvania State University John Yen Chen Zhong Peng Liu

Tour: First Step


Page 19: Pennsylvania State University John Yen Chen Zhong Peng Liu

A Quick Replay of the Analytical Reasoning Process

Video

19

Page 20: Pennsylvania State University John Yen Chen Zhong Peng Liu

Tour: Step 2


Page 21: Pennsylvania State University John Yen Chen Zhong Peng Liu

A Quick Look at the Trace

Duration: 36 min

E-Tree:
– # of Nodes: 30
– Width: 8
– Depth: 3

Trace Operations:
– # of Operations: 92
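Width and depth of a tree such as the E-Tree can be computed with a breadth-first pass. The slides do not define the two measures, so this sketch assumes width is the largest number of nodes on any one level and depth is the number of edges on the longest root-to-leaf path; the adjacency-dict input and the function name are hypothetical.

```python
from collections import deque


def width_and_depth(children, root):
    """Return (width, depth) of a tree given as {node: [child, ...]}.

    Assumed definitions: width = max nodes on one level,
    depth = edges on the longest path from the root.
    """
    level_sizes = []
    queue = deque([(root, 0)])
    while queue:
        node, level = queue.popleft()
        if level == len(level_sizes):
            level_sizes.append(0)  # first node seen on this level
        level_sizes[level] += 1
        for child in children.get(node, []):
            queue.append((child, level + 1))
    return max(level_sizes), len(level_sizes) - 1
```

If the tool counts depth in levels rather than edges, the second value would be one larger.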

Page 22: Pennsylvania State University John Yen Chen Zhong Peng Liu

Tour: Step 3


Page 23: Pennsylvania State University John Yen Chen Zhong Peng Liu

E-Tree:

• EU (Experience Unit) 1
– Action 1: Looking into IDS alerts
– Observation 1: Continuous occurrence of alerts showing an outside IP connecting to various inner IPs
• Hypothesis 1 (H1): The outside IP is the malicious C&C server

• EU (Experience Unit) 2
– Action 2: Looking into Firewall Log, check network flow from this suspicious IP
– Observation 2: All the destination ports are different.
• Hypothesis 2 (H2): This outside IP may do a port scan.

H-Tree: H1, H2
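The example on this slide can be written down as a small data model. This is a sketch under assumptions, not the ARSCA implementation: it treats an experience unit as one action with its observation, attaches the hypotheses it led to, and records (via the hypothetical `under` field) which hypothesis was being investigated when the unit was created. All class and field names are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hypothesis:
    label: str
    content: str
    truth: str = "Unknown"  # assumed default truth value


@dataclass
class ExperienceUnit:
    action: str
    observation: str
    under: Optional[Hypothesis] = None  # assumed link to the hypothesis in focus
    hypotheses: List[Hypothesis] = field(default_factory=list)


def hypotheses_in_order(units):
    """Flatten the hypotheses attached to a sequence of experience units."""
    return [h for unit in units for h in unit.hypotheses]


# The two units from the slide.
h1 = Hypothesis("H1", "The outside IP is the malicious C&C server")
h2 = Hypothesis("H2", "This outside IP may do a port scan")
eu1 = ExperienceUnit(
    action="Looking into IDS alerts",
    observation="Alerts show an outside IP connecting to various inner IPs",
    hypotheses=[h1],
)
eu2 = ExperienceUnit(
    action="Looking into Firewall Log, check network flow from this suspicious IP",
    observation="All the destination ports are different",
    under=h1,
    hypotheses=[h2],
)
e_tree = [eu1, eu2]
```

Keeping hypotheses as separate objects mirrors the slide's split between the E-Tree (units plus hypotheses) and the H-Tree (hypotheses only).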

Page 24: Pennsylvania State University John Yen Chen Zhong Peng Liu

Operations on Hypotheses

[Diagram: H-Tree with nodes H1 and H2]

H_New: Create a hypothesis

H_Add_Sibling: Add a sibling/alternative hypothesis

H_Jump: Change the current focus from one hypothesis to another

H_Edit_Content: Edit the content of a hypothesis

H_Edit_Truth: Edit the truth value of a hypothesis

Page 25: Pennsylvania State University John Yen Chen Zhong Peng Liu

Look into the Hypotheses of Pilot1 (Cont’d)

• # of hypotheses: 21

• Operations on the hypotheses (next slide)

H_New            Create a hypothesis
H_Add_Sibling    Add a sibling/alternative hypothesis
H_Jump           Change the current focus from one hypothesis to another
H_Edit_Content   Edit the content of a hypothesis
H_Edit_Truth     Change the truth value (true or false) of a hypothesis
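The five hypothesis operations can be sketched as methods on a small tree class that also logs each operation name, which is the kind of record a trace analysis would count. The semantics here are assumptions: H_New is taken to create a child of the hypothesis currently in focus, and "Unknown" is allowed as a truth value alongside "True" and "False". The class and method names are hypothetical.

```python
class HTree:
    """Hypothesis tree with a current focus and an operation log (a sketch)."""

    TRUTH_VALUES = ("True", "False", "Unknown")

    def __init__(self):
        self.nodes = {}      # id -> {"content", "truth", "parent"}
        self.current = None  # id of the hypothesis in focus
        self.log = []        # operation names in the order performed
        self._next = 1

    def _add(self, content, parent):
        node_id = f"H{self._next}"
        self._next += 1
        self.nodes[node_id] = {"content": content, "truth": "Unknown", "parent": parent}
        self.current = node_id
        return node_id

    def h_new(self, content):
        # Assumption: a new hypothesis becomes a child of the current focus.
        node_id = self._add(content, self.current)
        self.log.append("H_New")
        return node_id

    def h_add_sibling(self, content):
        parent = self.nodes[self.current]["parent"] if self.current else None
        node_id = self._add(content, parent)
        self.log.append("H_Add_Sibling")
        return node_id

    def h_jump(self, node_id):
        if node_id not in self.nodes:
            raise KeyError(node_id)
        self.current = node_id
        self.log.append("H_Jump")

    def h_edit_content(self, content):
        self.nodes[self.current]["content"] = content
        self.log.append("H_Edit_Content")

    def h_edit_truth(self, truth):
        if truth not in self.TRUTH_VALUES:
            raise ValueError(truth)
        self.nodes[self.current]["truth"] = truth
        self.log.append("H_Edit_Truth")
```

For example, creating a hypothesis, adding an alternative, jumping back to the first, and marking it false yields a log of H_New, H_Add_Sibling, H_Jump, H_Edit_Truth.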

Page 26: Pennsylvania State University John Yen Chen Zhong Peng Liu

Look into the Hypotheses of Pilot1

[Bar chart: Trace Operations on Hypotheses. Categories: H_New, H_Add_Sibling, H_Jump, H_Edit_Content, H_Edit_Truth; y-axis 0 to 16.]

Page 27: Pennsylvania State University John Yen Chen Zhong Peng Liu

Tour: Step 4


Page 28: Pennsylvania State University John Yen Chen Zhong Peng Liu

The Context of a Hypothesis

• Action 1: Looking into IDS alerts
• Observation 1: Continuous occurrence of alerts showing an outside IP connecting to various inner IPs
• Hypothesis 1: The outside IP is the malicious C&C server
• Action 2: Looking into Firewall Log, check network flow from this suspicious IP
• Observation 2: All the destination ports are different.
• Hypothesis 2: This outside IP may do a port scan.

Context1

Page 29: Pennsylvania State University John Yen Chen Zhong Peng Liu

The Context of a Hypothesis

• Action 1: Looking into IDS alerts
• Observation 1: Continuous occurrence of alerts showing an outside IP connecting to various inner IPs
• Hypothesis 1: The outside IP is the malicious C&C server
• Action 2: Looking into Firewall Log, check network flow from this suspicious IP
• Observation 2: All the destination ports are different.
• Hypothesis 2: This outside IP may do a port scan.

Context2

Page 30: Pennsylvania State University John Yen Chen Zhong Peng Liu

Actions and Observations in E-Tree (Cont’d)

• Action 1: Looking into IDS alerts
• Observation 1: Continuous occurrence of alerts showing an outside IP connecting to various inner IPs
• Hypothesis 1: The outside IP is the malicious C&C server
• Action 2: Looking into Firewall Log, check network flow from this suspicious IP
• Observation 2: All the destination ports are different.
• Hypothesis 2: This outside IP may do a port scan.

Page 31: Pennsylvania State University John Yen Chen Zhong Peng Liu

Actions and Observations

Action                      Observation
Checking IDS Alerts         Finding … in IDS Alerts
Checking Network Topology   Finding … in Network Topology
Checking Firewall logs      Finding … in Firewall logs
…                           …

Page 32: Pennsylvania State University John Yen Chen Zhong Peng Liu

Operations on Actions and Observations

AO_Lookup_Port_Term   Look up an explanation of a port or a term
AO_Link               Link a set of items in observation together for a reason (e.g. same port)
AO_Filter             Filter the data by creating a filtering condition
AO_QuickFind          Quick Find a term in the data
AO_Selecting          Select some data entries (i.e. create action items)
AO_Finding            Find something in the selected data entries (i.e. create observation items)

Page 33: Pennsylvania State University John Yen Chen Zhong Peng Liu

Look into Pilot1’s E-Tree: # of Nodes

EU_Num   H_Num   Total_Num
9        21      30

[Chart: Number of Nodes in Pilot1's E-Tree, split into EU_Num and H_Num]

Page 34: Pennsylvania State University John Yen Chen Zhong Peng Liu

Look into Pilot1’s Trace: # of Operations

[Bar chart: # of Operations in Pilot1's Trace. Categories: AO_Selecting, AO_Finding, AO_QuickFind, AO_Filter, AO_Link, AO_Lookup_Port_Term; y-axis 0 to 25.]
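Per-type operation counts like those charted here can be tallied from a list of operation names. A minimal sketch, assuming the trace has already been reduced to such a list; the operation names come from the slides, while the function name and the input form are hypothetical.

```python
from collections import Counter

AO_OPERATIONS = (
    "AO_Selecting", "AO_Finding", "AO_QuickFind",
    "AO_Filter", "AO_Link", "AO_Lookup_Port_Term",
)
H_OPERATIONS = (
    "H_New", "H_Add_Sibling", "H_Jump", "H_Edit_Content", "H_Edit_Truth",
)


def count_operations(operation_names):
    """Count each known operation type, reporting zero for unused ones."""
    counts = Counter(operation_names)
    unknown = set(counts) - set(AO_OPERATIONS) - set(H_OPERATIONS)
    if unknown:
        raise ValueError(f"unknown operations: {sorted(unknown)}")
    return {name: counts.get(name, 0) for name in AO_OPERATIONS + H_OPERATIONS}
```

Reporting zeros explicitly keeps the result aligned across traces, which matters when several analysts' traces are compared side by side.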

Page 35: Pennsylvania State University John Yen Chen Zhong Peng Liu

Look into the E-Tree of Pilot1 (Cont’d)

[Bar chart: E-Tree Width and Depth for ten traces (pilot1, pilot2, pilot4, 101, 128, 174, 193, 239, 246, 285); y-axis 0 to 10.]

Page 36: Pennsylvania State University John Yen Chen Zhong Peng Liu

Summary: A Slower Replay

Video

36

Page 37: Pennsylvania State University John Yen Chen Zhong Peng Liu

Two Cases of Jumping Back

• Go back to a previous node

– Case 1
JUMP_FROM_TO (H39431008 H46131157)
ADD_SIBLING (H46131157 H66431551 )

– Case 2
JUMP_FROM_TO (H89931527 H58331044)
CHANGE_TRUTH_VALUE (H58331044 False,Unknown)
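The two jump-back cases can be detected automatically from operation lines in the form shown above. This sketch assumes each trace line is an operation name followed by parenthesized arguments, as in the four lines on the slide; the function names are hypothetical, and the meaning of the two values in CHANGE_TRUTH_VALUE is not interpreted.

```python
import re

# An operation name, then arguments in parentheses (format taken from the slide).
LINE = re.compile(r"^(\w+)\s*\(([^)]*)\)$")


def parse_line(line):
    """Split 'NAME (arg1 arg2)' into ('NAME', ['arg1', 'arg2'])."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized trace line: {line!r}")
    args = match.group(2).replace(",", " ").split()
    return match.group(1), args


def classify_jump_backs(lines):
    """For each jump, report (target hypothesis, operation performed on it next)."""
    ops = [parse_line(line) for line in lines]
    found = []
    for (name, args), (next_name, next_args) in zip(ops, ops[1:]):
        if name == "JUMP_FROM_TO" and len(args) == 2:
            target = args[1]
            if next_args and next_args[0] == target:
                found.append((target, next_name))
    return found
```

On the slide's four lines this yields one jump followed by ADD_SIBLING (Case 1) and one followed by CHANGE_TRUTH_VALUE (Case 2).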

Page 38: Pennsylvania State University John Yen Chen Zhong Peng Liu

Tour: Step 5


Page 39: Pennsylvania State University John Yen Chen Zhong Peng Liu

Trace Comparison: # of Nodes in E-Tree

More alternative hypotheses

Page 40: Pennsylvania State University John Yen Chen Zhong Peng Liu

Trace Comparison: E-Trees

[Bar chart: Total_Num, Width, and Depth of the E-Tree for ten traces (pilot1, pilot2, pilot4, 101, 128, 174, 193, 239, 246, 285); y-axis 0 to 35.]

Page 41: Pennsylvania State University John Yen Chen Zhong Peng Liu

Trace Comparison: Trace Operations

Page 42: Pennsylvania State University John Yen Chen Zhong Peng Liu

Tour: Step 6


Page 43: Pennsylvania State University John Yen Chen Zhong Peng Liu

Initial Correlation

• Correlated performance with E-Tree features:

[Chart: Expertise, Performance, E-Tree Features. Series: Expertise (Pre-Questionnaire), Performance Score, Total_Num, Width, Depth; traces: pilot1, pilot2, pilot4, 101, 128, 174, 193, 239, 246, 285; y-axis 0 to 35.]
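One common way to quantify a relationship between an E-Tree feature and a performance score is the Pearson correlation coefficient. The slides do not say which statistic was used, so this is only a sketch of that one option, with a hypothetical function name; no experimental values are reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

With only ten traces, any coefficient computed this way would be a rough indication at best, which fits the slide's framing of the correlation as initial.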

Page 44: Pennsylvania State University John Yen Chen Zhong Peng Liu

FY 2014 Plan

• Continue to conduct, in collaboration with ARL researchers, the Analytical Reasoning Experiment (VAST 2012)
• Analyze the traces of analytical reasoning
– Is the first thought important for an analyst’s performance?
– How will the key observation influence the analytical reasoning process?
– What are the differences between strategies used by experts and novices?
• Design and conduct, in collaboration with ARL researchers, a collaborative analytical reasoning experiment
– Enables digging into flow data
– Two-analyst teams
– Leverages VAST 2013
• Enhance the context-guided experience-based analytical reasoning support
– Aggregating multiple experiences of analysts
– Support context-guided experience-based simulation

Page 45: Pennsylvania State University John Yen Chen Zhong Peng Liu

46

Q & A

Thank you.