graphical vs. tabular notations for risk...
TRANSCRIPT
1
Graphical vs. Tabular Notations for Risk Models:On the Role of Textual Labels and Complexity
Katsiaryna (Kate) LabunetsTU Delft, The Netherlands
Joint work with Fabio Massacci, University of Trento, ItalyAlessandra Tedeschi, DeepBlue srl, Rome, Italy
InternationalSymposiumon EmpiricalSoftwareEngineeringandMeasurements,Toronto,Canada– November10th,2017
2
Rationale
• Risk recommendations should be “consumed” mostly by not-experts in security
• What if the security representation is not easy to understand?– Stakeholder does not
understand you– The security recommendations
are not implemented• “Understand” != “Believe to
have understood”
3
Example Risk ModelsA simple example of one attack path represented in graphical and tabular notation.
Customershares credentials with next-of-kin
Unauthorized account login[unlikely]
Regularly inform customers of terms of
use
Integrity of account data
Lack of compliance with terms
of use
Customer
severe
Threat
Vulnerability
Threat scenario
Consequence
Unwanted incident
Treatment
Likelihood
Asset
Threatevent Threatsource
Vulnerability Impact Overalllikelihood
Levelofimpact
Asset Securitycontrol
Customersharescredentialswithnext-of-kin
Customer Lackofcompliancewithtermsofuse
Unauthorizedaccountlogin
Unlikely Severe Integrityofaccountdata
Regularlyinformcustomersoftermsofuse
CORASdiagram
NISTtablerowentry
UML-stylediagram
Threat
Customer
Vulnerability
Lack of compliance
with terms of use
Threat scenario
Customer
shares credental
with next-of-kin
Treatment
Regularly inform
customers of security
best practces
Unwanted incident
Unauthorized
account login
[Likelihood: unlikely]
Asset
Integrity of
account data
Consequence
Severe
4
Previous work
• Published in EMSE journal:– Labunets, K., Massacci, F., Paci, F., Marczak, S. and
de Oliveira, F.M., 2017. Model comprehension for security risk assessment: an empirical comparison of tabular vs. graphical representations. Empirical Software Engineering, pp.1-40.
• Included two studies: – [2014] 69 MSc and BSc students from Italy and Brazil – [2015] 83 professional post-master course and MSc students from Italy
• Treatment– graphical and tabular risk modeling notations
• Findings– Tabular is more effective that the graphical representation for simple
comprehension tasks– Less difference for complex tasks, but still tabular is better
5
Research questions
We address the following questions for participants with a significant work experience:
RQ1: Does the task complexity have an impact on the comprehensibility of the models?
RQ2: Does the availability of textual labels improve the participants effectiveness in extracting correct information about security risks?
6
Experiment Description [1/2]
• Goal: – Compare tabular vs. graphical risk models w.r.t.
comprehensibility• Treatments
– Notations: NIST 800-30 (tabular); CORAS (graphical); UML-style (graphical)
– Task: open questions with different level of complexity about information presented in the model
• 7 questions (originally 12 but 5 questions were discarded due to an implementation error)
• Between-subject design– one treatment per participant
7
Experiment Description [2/2]
• Application scenario: – Online Banking scenario by Poste Italiane
• Recruitment process: – Email invitation distributed via mailing lists by UNITN and DeepBlue– Offered a reward of 20 euro (via PayPal)– 572 attempts to start the experiment
• Participants: – 61 professional (avg. 9 years of working experience)
The number of participants reached each experimental phase
8
Demographics
36%
41%
23%
Age
24–30yrsold
31-40yrsold
41-62yrsold
3% 18%
43%
36%
Workingexperience
Didnotreport
Had1-3yrs
Had4-7yrs
Had>7yrs
2% 23%
31% 28%
16%
Experienceingraph.modelinglanguages
Novices
Beginners
Competentusers
Proficientusers
Experts
11%
36%
8%
45%
Educationdegree
BSc
MSc
MBA
PhD
9
Used Risk Models: NIST
1
ThreatEvent ThreatSource
Vulnerabilities Impact Asset OverallLikelihoo
LevelofImpact
SecurityControls
Customer'sbrowserinfectedbyTrojanandthisleadstoalterationoftransactiondata
Hacker
1.Poorsecurityawareness2.Weakmalwareprotection
Unauthorizedtransactionviawebapplication
Integrityofaccountdata
Likely Severe
1.Regularlyinformcustomersaboutsecuritybestpractices.2.Strengthenauthenticationoftransactioninwebapplication.
Keyloggerinstalledoncomputerandthisleadstosniffingcustomercredentials.Whichleadstounauthorizedaccesstocustomeraccountviawebapplication.
Cybercriminal
Insufficientdetectionofspyware
Unauthorizedtransactionviawebapplication
Integrityofaccountdata
Likely SevereStrengthenauthenticationoftransactioninwebapplication.
Spear-phishingattackoncustomersleadstosniffingcustomercredentials.Whichleadstounauthorizedaccesstocustomeraccountviawebapplication.
Cybercriminal
PoorsecurityawarenessUnauthorizedtransactionviawebapplication
Integrityofaccountdata
Likely Severe
1.Regularlyinformcustomersaboutsecuritybestpractices.2.Strengthenauthenticationoftransactioninwebapplication.
Keyloggerinstalledoncustomer'scomputerleadstosniffingcustomercredentials
Cybercriminal
Insufficientdetectionofspyware
Unauthorizedaccesstocustomeraccountviawebapplication
Userauthenticity Certain Severe
Spear-phishingattackoncustomersleadstosniffingcustomercredentials
Cybercriminal
PoorsecurityawarenessUnauthorizedaccesstocustomeraccountviawebapplication
Userauthenticity Certain SevereRegularlyinformcustomersaboutsecuritybestpractices.
Keyloggerinstalledoncustomer'scomputerleadstosniffingcustomercredentials
Cybercriminal
Insufficientdetectionofspyware
Unauthorizedaccesstocustomeraccountviawebapplication
Confidentialityofcustomerdata
Certain Severe
Spear-phishingattackoncustomersleadstosniffingcustomercredentials
Cybercriminal
PoorsecurityawarenessUnauthorizedaccesstocustomeraccountviawebapplication
Confidentialityofcustomerdata
Certain SevereRegularlyinformcustomersaboutsecuritybestpractices.
Fakebankingappofferedonapplicationstoreandthisleadstosniffingcustomercredentials
Cybercriminal
Lackofmechanismsforauthenticationofapp
Unauthorizedaccesstocustomeraccountviafakeapp
Userauthenticity Likely Critical Conductregularsearchesforfakeapps.
Fakebankingappofferedonapplicationstoreandthisleadstosniffingcustomercredentials
Cybercriminal
Lackofmechanismsforauthenticationofapp
Unauthorizedaccesstocustomeraccountviafakeapp
Confidentialityofcustomerdata
Likely Severe Conductregularsearchesforfakeapps.
Fakebankingappofferedonapplicationstoreleadstosniffingcustomercredentials.Whichleadstounauthorizedaccesstocustomeraccountviafakeapp.
Cybercriminal
Lackofmechanismsforauthenticationofapp
UnauthorizedtransactionviaPosteApp
Integrityofaccountdata
Unlikely Minor Conductregularsearchesforfakeapps.
Fakebankingappofferedonapplicationstoreleadstoalterationoftransactiondata
Cybercriminal
Lackofmechanismsforauthenticationofapp
UnauthorizedtransactionviaPosteApp
Integrityofaccountdata
Unlikely Minor Conductregularsearchesforfakeapps.
Smartphoneinfectedbymalwareandthisleadstoalterationoftransactiondata
Hacker WeakmalwareprotectionUnauthorizedtransactionviaPosteApp
Integrityofaccountdata
Unlikely MinorRegularlyinformcustomersaboutsecuritybestpractices.
Denial-of-serviceattack Hacker1.Useofwebapplication2.Insufficientresilience
Onlinebankingservicegoesdown
Availabilityofservice
Certain Minor1.Monitornetworktraffic.2.Increasebandwidth.
Web-applicationgoesdownSystemfailure
ImmaturetechnologyOnlinebankingservicegoesdown
Availabilityofservice
Certain MinorStrengthenverificationandvalidationprocedures.
10
Used Risk Models: CORAS
11
Used Risk Models: UML-style
12
Comprehension Questions
We ask to identify a risk element of a specific type that is related to another element of a different type.
“Which threats can exploit the vulnerability ‘Poor security awareness’? Please specify all threats:”
At least one question per element type:Graphicalelementtypes:1. Threat2. Vulnerability3. Threatscenario4. Unwantedincident5. Likelihood6. Consequence7. Asset8. Treatment
Tabularelementtypes:1. Threatsource2. Vulnerability3. Threatevent4. Impact5. Overalllikelihood6. Levelofimpact7. Asset8. Securitycontrol
13
Comprehension Questions: Complexity
• Task complexity is based on [Wood, 1986].• Our mapping of Wood’s components to the elements of
modeling notations:– Information cues (IC) describe some characteristics that help to
identify the desired element of the model. • Which are the assets that can be harmed by the unwanted
incident “Unauthorized access to HCN"?– Required acts (A) are judgment acts that require to select a
subset of elements meeting some explicit or implicit criteria.• What is the highest consequence?
– Relationships (R) between a desired element and other elements of the model that must be identified.
• … the assets that can be harmed by Cyber criminal…
Complexity (Qi) = |ICi| + |Ri| + |Ai|
14
Comprehension Questions: Complexity
• Simple question: – “What are the assets that can be harmed by Cyber criminal?”
• Complex question: – “Which unwanted incidents may be exploited by Hacker to
harm the asset “Integrity of account data” with the highest likelihood?”
Legend:– Information cue– Relationship– Required act
15
Measurements
• Precision of the response to a question: – # of identified correct elements / # of all elements in a
response• Recall of the response to a question:
– # of identified correct elements / # of all expected correct elements
• F-measure is a weighted harmonic mean of precision and recall
• Subject’s Comprehension– Aggregated F-measure of all questions about assigned risk
model
16
ResultsAllquestions
●
●
●●
●
●
●●
●
● ●
●●●●●
●
●●
●
●
med
ian
all =
0.8
3median all = 0.91
T: N= 0 UML: N= 1
CORAS: N= 1
T: N= 13 −>UML: N= 8 −>
CORAS: N= 5 −>
T: N= 4 UML: N= 3
CORAS: N= 3
T: N= 4 UML: N= 8
CORAS: N= 11
0.00
0.25
0.50
0.75
1.00
0.00 0.25 0.50 0.75 1.00Aggregated Recall
Aggr
egat
ed P
resis
ion
●●●
●
●
All questions
METHOD● Tabular
UML
CORAS
17
Results: RQ1 – Complexity effect
• F-measure of simple (compl. lvl =2) vs. complex (compl. lvl >2) questions:
– UML & CORAS: Small difference, but not statistically significant
– Tabular: Statistical equivalence using an equivalence test (TOST)
Noeffecthasbeenobservedinourexperiment
Table: F-measure by task complexity
18
Results: RQ2 – Textual labels
• Precision:– Tabular and UML are
equivalent • pTOST-MW = 0.04
– Tabular > CORAS • pMW = 0.0009, pKW = 0.002
Availability of textual labels helps to elicit better responses
• Recall:– Tabular > UML
• pMW = 0.004, pKW = 0.008– Tabular > CORAS
• pMW = 1.4·10−5, pKW = 6.6·10−5
– UML > CORAS • pMW = 0.04
Table: Precision and recall by modeling notation
19
RQ2: Discussion• Tables are good, but textual labels also help
– Tables provide possibility to use computer-aided search and copy&paste information
– CORAS group used search feature more often than UML group• CORAS does not have textual labels to identify elements of necessary
type
0.00
0.25
0.50
0.75
1.00
Tabular UML CORASModel type
Usa
ge o
f sea
rch
feat
ure,
%
ResponseNo
Yes
0.00
0.25
0.50
0.75
1.00
Tabular UML CORASModel type
Usa
ge o
f cop
y-pa
ste
feat
ure,
%
ResponseNo
Yes
Search usage Copy&paste usage
20
Conclusions
• Electronic tables may be your best choice to communicate security risk assessment results with stakeholders
• Pure graphical models are prone to comprehension errors (comparing to tables). Mitigation options: – Textual labels [cheap]– Invest more in training stakeholders on notation
[expensive]
21
Open questions
• How well models support memorization of information about security risk– What information you can recall from your memory?
• Task complexity– Can we measure it better? – Which questions are complex enough to trigger
declared benefits of graphical models?