ON THE EQUIVALENCE BETWEEN GRAPHICAL AND TABULAR REPRESENTATIONS FOR SECURITY RISK ASSESSMENT

Katsiaryna Labunets 1, Fabio Massacci 1, Federica Paci 2

1 University of Trento, Italy
2 University of Southampton, UK

REFSQ’17, Essen, Germany, March 2nd, 2017



The Problem [1/2]
• Several security risk assessment (SRA) methods and standards are available to identify threats and possible security requirements
• Academia relies on graphical methods (e.g. Anti-Goals, Secure Tropos, CORAS)
• Industry opts for tabular methods (OCTAVE, ISO 27005, NIST 800-30)
• REFSQ’17 representation stats:
  • 5 papers discuss graphical notations (i*, Use Cases, BPMN diagrams),
  • 3 papers on mixed methods,
  • 1 paper studied requirements in natural language.


The Problem [2/2]
• Are graphical methods actually better?
• No clear winner from the past experiments
  • [ESEM 2013]:
    • Graph > Table w.r.t. # of threats (p < 5%)
    • Table > Graph w.r.t. # of security controls (p < 5%)
    • Graph =? Table w.r.t. perceived efficacy (not statistically significant)
  • [EmpiRE at RE 2014]:
    • Graph =? Table w.r.t. # of threats and controls (not statistically significant)
    • Graph > Table w.r.t. perceived efficacy (p < 5%)
• Are they really different?


Both methods have clear process

Tabular method has less clear process


Research Questions
• RQ1: Are tabular and graphical SRA methods equivalent w.r.t. actual efficacy?
• RQ2: Are tabular and graphical SRA methods equivalent w.r.t. perceived efficacy?

How to answer?


Difference tests
• Problem
  • H0: μA = μB
  • Ha: μA ≠ μB
• Test: t-test, Wilcoxon, Mann-Whitney, etc.
  • We can only reject the null hypothesis H0.
  • We cannot accept the alternative hypothesis Ha.
• Lack of evidence for difference ≠ evidence for equivalence
• How different should two methods be in order to be considered different?
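A minimal sketch of why a non-significant difference test says little about equivalence. The scores are hypothetical, not data from the study. To stay within the Python standard library, the sketch swaps in a normal approximation for the t distribution; this is an assumption made for illustration, not the test used by the authors.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def difference_test(a, b):
    """Two-sided test of H0: mu_A = mu_B.

    Uses the Welch standard error with a normal approximation in place of
    the t distribution (an illustrative simplification).
    Returns (mean difference, p-value, 95% confidence interval).
    """
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    half_width = NormalDist().inv_cdf(0.975) * se
    return diff, p, (diff - half_width, diff + half_width)


# Hypothetical 1-5 quality scores for two small groups (not from the study).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 5]

diff, p, ci = difference_test(group_a, group_b)
print(f"difference = {diff:+.2f}, p = {p:.2f}, 95% CI = [{ci[0]:+.2f}, {ci[1]:+.2f}]")
```

With these samples the test does not reject H0 at the 5% level, yet the confidence interval is far wider than a margin such as ±0.6. The data are compatible with no difference and with a large one alike.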


Equivalence test
• Two One-Sided Tests (TOST) [Schuirmann, 1981]
• Problem
• ẟ defines the range within which two methods are considered to be equivalent
  • Percentage ([80%;125%] by FDA or [70%;143%] by EU) for ratio data
  • Fixed value (e.g. 0.6 for ordinal values on a 1-5 Likert scale with 3 as the mean value) for ordinal data
• We can use t-test, or Wilcoxon, or Mann-Whitney, etc.
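TOST is commonly stated as testing H0: |μA − μB| ≥ ẟ against Ha: |μA − μB| < ẟ, using two one-sided tests, one per bound. The percentage ranges are symmetric on a ratio scale (1/0.8 = 1.25 and 1/0.7 ≈ 1.43). The sketch below shows the fixed-margin case only. The scores are hypothetical, and a normal approximation stands in for the t distribution to stay within the standard library; both are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost(a, b, delta):
    """Two one-sided tests for equivalence of means within +/- delta.

    H0: |mu_A - mu_B| >= delta   vs.   Ha: |mu_A - mu_B| < delta.
    Normal approximation with the Welch standard error (illustrative only).
    Returns the TOST p-value, i.e. the larger of the two one-sided p-values.
    """
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    norm = NormalDist()
    p_lower = 1 - norm.cdf((diff + delta) / se)  # one-sided test of H0: diff <= -delta
    p_upper = norm.cdf((diff - delta) / se)      # one-sided test of H0: diff >= +delta
    return max(p_lower, p_upper)


DELTA = 0.6  # fixed margin for a 1-5 scale, as quoted on the slide

# Hypothetical Likert-style scores (not data from the study).
large_a = [3, 4] * 15      # 30 scores, mean 3.5
large_b = [3, 3, 4] * 10   # 30 scores, mean about 3.33
small_a = [1, 2, 3, 4, 5]
small_b = [2, 3, 4, 5, 5]

print("large samples:", tost(large_a, large_b, DELTA))
print("small samples:", tost(small_a, small_b, DELTA))
```

The two larger samples give a TOST p-value below 0.05, so equivalence within ±0.6 can be claimed. The two small, noisy samples give a p-value above 0.05, which is inconclusive rather than evidence of a difference.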


Experimental Design
• Goal
  • Compare graphical and tabular representations w.r.t. the actual and perceived efficacy of an SRA method when applied by novices.
• Treatments
  • Method: Graphical and tabular SRA methods used in industry
  • Task: Conduct SRA for each of four security tasks
    1. Identity Management security (IM),
    2. Access Management security (AM),
    3. Web Application and Database security (WebApp/DB),
    4. Network and Infrastructural security (Network/Infr).
• Experiment: we conducted two controlled experiments, in 2015 and 2016.


Experimental Execution
• ATM Domain
  • Remotely Operated Tower (ROT) Scenario by Eurocontrol
  • Unmanned Air Traffic Management (UTM) Scenario by NASA
• Methods:
  • Graphical CORAS by SINTEF
  • Tabular SecRAM by SESAR
• Participants were provided with catalogues of security threats and controls*
• Participants: 35 and 48 MSc students in Computer Science took part in the ROT2015 and UTM2016 controlled experiments, respectively


* M. de Gramatica, K. Labunets, F. Massacci, F. Paci and A. Tedeschi. “The Role of Catalogues of Threats and Security Controls in Security Risk Assessment: An Empirical Study with ATM Professionals”. In Proc. of REFSQ’15.


Experimental Protocol

[Figure: experimental protocol. Phases: Training, Application, Evaluation. Roles: participants (Group 1 ... Group X), researchers, domain experts, method designers. Elements: Background (Q1); training on the ROT/UTM scenario, on the graphical and tabular methods, and on the security topics (IM, AM, WebApp/DB, Network/Infr); Initial method impression (Q31); groups of Type A and Type B apply the graphical or tabular method to each task; groups deliver results (reports); Final method impression (Q32); focus groups interview; report quality assessment by method designers + domain experts.]

Number of groups per type:

         Type A     Type B
ROT2015  9 groups   9 groups
UTM2016  13 groups  11 groups


Results: Actual Efficacy

Exp      Actual efficacy    Tabular mean  Graphical mean  ẟmean (Tab − Graph)  TOST p-value
ROT2015  Threats            3.17          2.95            +0.22                0.0009
ROT2015  Security controls  3.28          2.97            +0.31                0.001
UTM2016  Threats            3.28          3.24            +0.04                6.3×10^-6
UTM2016  Security controls  3.31          3.29            +0.02                2.4×10^-7

Table ≈ Graph (both experiments) w.r.t. quality of threats and controls

Actual Efficacy: whether the treatment improves performance of the task
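The conclusion for actual efficacy follows from reading each TOST p-value against a significance level. A small sketch of that decision rule, with values copied from the results table; the 5% level is an assumption taken from the "p < 5%" convention used earlier in the slides.

```python
ALPHA = 0.05  # assumed significance level

# (experiment, measure, tabular mean, graphical mean, TOST p-value), from the table
actual_efficacy = [
    ("ROT2015", "Threats", 3.17, 2.95, 0.0009),
    ("ROT2015", "Security controls", 3.28, 2.97, 0.001),
    ("UTM2016", "Threats", 3.28, 3.24, 6.3e-6),
    ("UTM2016", "Security controls", 3.31, 3.29, 2.4e-7),
]


def verdict(p_value, alpha=ALPHA):
    """Equivalence is claimed only when the TOST p-value is below alpha."""
    return "equivalent" if p_value < alpha else "inconclusive"


for exp, measure, tab, graph, p in actual_efficacy:
    print(f"{exp} {measure}: Tab - Graph = {tab - graph:+.2f}, {verdict(p)}")
```

All four rows fall below the assumed 5% level, which matches the slide's reading of Table ≈ Graph in both experiments.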


Results: Perceived Efficacy

Exp      Perceived efficacy  Tabular mean  Graphical mean  ẟmean (Tab − Graph)  TOST p-value
ROT2015  PEOU                3.63          3.20            +0.43                0.08
ROT2015  PU                  3.54          3.05            +0.37                0.18
UTM2016  PEOU                3.74          3.60            +0.14                2.6×10^-5
UTM2016  PU                  3.67          3.29            +0.38                0.03

ROT2015
• PEOU & PU: Tabular ? Graphical (inconclusive)
UTM2016
• PEOU & PU: Tabular ≈ Graphical
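As a quick arithmetic check, the sketch below recomputes Tab − Graph from the reported means for PEOU and PU (perceived ease of use and perceived usefulness, in the usual technology-acceptance sense) and applies an assumed 5% level to the TOST p-values. Every row reproduces the tabulated difference except ROT2015 PU, where 3.54 − 3.05 gives 0.49 rather than the reported +0.37. The transcript gives no way to tell which figure is off, so the values are used as reported.

```python
ALPHA = 0.05  # assumed significance level

# (experiment, measure, tabular mean, graphical mean, reported difference, TOST p-value)
perceived_efficacy = [
    ("ROT2015", "PEOU", 3.63, 3.20, 0.43, 0.08),
    ("ROT2015", "PU", 3.54, 3.05, 0.37, 0.18),
    ("UTM2016", "PEOU", 3.74, 3.60, 0.14, 2.6e-5),
    ("UTM2016", "PU", 3.67, 3.29, 0.38, 0.03),
]


def check_row(tab, graph, reported, p_value, alpha=ALPHA):
    """Return (recomputed difference, matches reported?, verdict)."""
    recomputed = round(tab - graph, 2)
    matches = abs(recomputed - reported) < 0.005  # tolerate float rounding
    result = "equivalent" if p_value < alpha else "inconclusive"
    return recomputed, matches, result


mismatches = []
for exp, measure, tab, graph, reported, p in perceived_efficacy:
    recomputed, matches, result = check_row(tab, graph, reported, p)
    if not matches:
        mismatches.append((exp, measure))
    print(f"{exp} {measure}: recomputed {recomputed:+.2f}, reported {reported:+.2f}, {result}")

print("rows whose difference does not match the means:", mismatches)
```

The verdicts agree with the slide: both ROT2015 rows are inconclusive, and both UTM2016 rows support equivalence.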


Threats to Validity
• Difference between two experiments (internal validity)

• Low statistical significance (conclusion validity)

• Use of students instead of practitioners (external validity)

• Simple scenario (external validity)


Conclusions
• No difference? Check with an equivalence test
• How to measure Actual Efficacy: Quantity vs. Quality?
• Both graphical and tabular methods provide similar support for SRA
  • Clear process matters!
• What is next?
  • Comprehensibility of risk modeling notations
    • Labunets et al. “Model Comprehension for Security Risk Assessment: An Empirical Comparison of Tabular vs. Graphical Representations”. Empirical Software Engineering, 2017.
