Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing
(#ddti)
Alex Pinto, Chief Data Scientist, MLSec Project
@alexcpsec @MLSecProject
Alexandre Sieira, CTO, Niddel
@AlexandreSieira @NiddelCorp
Agenda
• Cyber War… Threat Intel – What is it good for?
• Combine and TIQ-test
• Measuring indicators
• Threat Intelligence Sharing
• Future research direction (i.e. will work for data)
Presentation Metrics!!
• 50-ish Slides
• 3 Key Takeaways
• 2 Heartfelt and genuine defenses of Threat Intelligence Providers
• 1 Prediction on “The Future of Threat Intelligence Sharing”
HT to @RCISCwendy
What is TI good for anyway?
What is TI good for (1) Attribution
TY to @bfist for his work on http://sony.attributed.to
What is TI good for (2) – Cyber Maps!!
TY to @hrbrmstr for his work on https://github.com/hrbrmstr/pewpew
What is TI good for anyway?
• (3) How about actual defense?
• Strategic and tactical: planning
• Technical indicators: DFIR and monitoring
Affirming the Consequent Fallacy
1. If A, then B.
2. B.
3. Therefore, A.

1. Evil malware talks to 8.8.8.8.
2. I see traffic to 8.8.8.8.
3. ZOMG, APT!!!
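One way to see why the argument form above is invalid is to enumerate every truth assignment and look for a case where both premises hold and the conclusion does not. The sketch below does that; the function names are illustrative and not part of the talk.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: 'if a, then b'."""
    return (not a) or b


def counterexamples():
    """Return every (A, B) where 'if A then B' and 'B' hold but 'A' is false."""
    return [
        (a, b)
        for a, b in product([True, False], repeat=2)
        if implies(a, b) and b and not a
    ]


# A = "evil malware is present", B = "I see traffic to 8.8.8.8"
print(counterexamples())  # [(False, True)]
```

The single counterexample is the everyday case: traffic to 8.8.8.8 with no malware at all.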
But this is a Data-Driven talk!
Combine and TIQ-Test
• Combine (https://github.com/mlsecproject/combine)
  • Gathers TI data (ip/host) from Internet and local files
  • Normalizes the data and enriches it (AS / Geo / pDNS)
  • Can export to CSV, “tiq-test format” and CRITs
  • Coming Soon™: CybOX / STIX / SILK / ArcSight CEF
• TIQ-Test (https://github.com/mlsecproject/tiq-test)
  • Runs statistical summaries and tests on TI feeds
  • Generates charts based on the tests and summaries
  • Written in R (because you should learn a stat language)
• https://github.com/mlsecproject/tiq-test-Summer2015
Using TIQ-TEST – Feeds Selected
• Dataset was separated into “inbound” and “outbound”
TY to @kafeine and John Bambenek for access to their feeds
Using TIQ-TEST – Data Prep
• Extract the “raw” information from indicator feeds
• Both IP addresses and hostnames were extracted
• Convert the hostname data to IP addresses:
  • Active IP addresses for the respective date (“A” query)
  • Passive DNS from Farsight Security (DNSDB)
• For each IP record (including the ones from hostnames):
  • Add asnumber and asname (from MaxMind ASN DB)
  • Add country (from MaxMind GeoLite DB)
  • Add rhost (again from DNSDB) – most popular “PTR”
Using TIQ-TEST – Data Prep Done
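The data prep steps above can be sketched roughly as follows. This is not the Combine or TIQ-test code: the lookup tables are hypothetical in-memory stand-ins for the "A" query / passive DNS results, the MaxMind databases and DNSDB, and every value in them is made up. Only the field names asnumber, asname, country and rhost come from the slides.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-ins for the real data sources. All values are invented;
# the addresses and AS numbers come from ranges reserved for documentation.
HOST_TO_IPS: Dict[str, List[str]] = {"bad.example.com": ["192.0.2.10", "198.51.100.7"]}
ASN_DB: Dict[str, Tuple[int, str]] = {
    "192.0.2.10": (64500, "EXAMPLE-NET-A"),
    "198.51.100.7": (64501, "EXAMPLE-NET-B"),
    "203.0.113.5": (64502, "EXAMPLE-NET-C"),
}
GEO_DB: Dict[str, str] = {"192.0.2.10": "US", "198.51.100.7": "BR", "203.0.113.5": "DE"}
PTR_DB: Dict[str, str] = {"192.0.2.10": "host10.example.net"}


@dataclass
class Record:
    ip: str
    source_host: Optional[str]  # hostname the IP was derived from, if any
    asnumber: Optional[int]
    asname: Optional[str]
    country: Optional[str]
    rhost: Optional[str]


def is_ip(value: str) -> bool:
    """True if the raw indicator parses as an IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def enrich(ip: str, source_host: Optional[str] = None) -> Record:
    """Attach AS, country and reverse-host fields to one IP record."""
    asnumber, asname = ASN_DB.get(ip, (None, None))
    return Record(ip, source_host, asnumber, asname, GEO_DB.get(ip), PTR_DB.get(ip))


def prepare(raw_indicators: List[str]) -> List[Record]:
    """Convert hostnames to IPs, then enrich every resulting IP record."""
    records = []
    for indicator in raw_indicators:
        if is_ip(indicator):
            records.append(enrich(indicator))
        else:
            for ip in HOST_TO_IPS.get(indicator, []):
                records.append(enrich(ip, source_host=indicator))
    return records


for record in prepare(["203.0.113.5", "bad.example.com"]):
    print(record)
```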
Novelty Test
Measuring added and dropped indicators
Novelty Test - Inbound
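A minimal sketch of the idea behind the Novelty Test, assuming each feed is available as one set of indicators per day. The ratio definitions (added relative to today's feed, dropped relative to yesterday's) are an assumption for illustration and may differ from what TIQ-test computes.

```python
from typing import Dict, Set


def novelty(previous: Set[str], current: Set[str]) -> Dict[str, object]:
    """Compare two consecutive snapshots of one feed."""
    added = current - previous      # present today, absent yesterday
    dropped = previous - current    # present yesterday, absent today
    return {
        "added": added,
        "dropped": dropped,
        "added_ratio": len(added) / len(current) if current else 0.0,
        "dropped_ratio": len(dropped) / len(previous) if previous else 0.0,
    }


# Made-up snapshots using documentation addresses.
day1 = {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"}
day2 = {"192.0.2.3", "192.0.2.4", "192.0.2.5"}
print(novelty(day1, day2))
```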
Aging Test
Is anyone cleaning this mess up eventually?
INBOUND
OUTBOUND
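One way to approximate the Aging Test is to measure how many days each indicator stays on a feed, from its first to its last appearance in the daily snapshots. This is a sketch under that assumption; the function name and the sample data are hypothetical.

```python
from datetime import date
from typing import Dict, Set


def indicator_ages(snapshots: Dict[date, Set[str]]) -> Dict[str, int]:
    """Days between first and last sighting of each indicator, inclusive."""
    first_seen: Dict[str, date] = {}
    last_seen: Dict[str, date] = {}
    for day in sorted(snapshots):
        for indicator in snapshots[day]:
            first_seen.setdefault(indicator, day)
            last_seen[indicator] = day
    return {ind: (last_seen[ind] - first_seen[ind]).days + 1 for ind in first_seen}


# Made-up snapshots: one indicator is never removed, one is gone after a day.
snapshots = {
    date(2015, 3, 1): {"192.0.2.1", "192.0.2.2"},
    date(2015, 3, 2): {"192.0.2.1"},
    date(2015, 3, 3): {"192.0.2.1", "192.0.2.3"},
}
print(indicator_ages(snapshots))
```

A feed where most ages equal the full observation window is one where nothing is ever cleaned up.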
Population Test
• Let us use the ASN and GeoIP databases that we used to enrich our data as a reference of the “true” population.
• But, but, human beings are unpredictable! We will never be able to forecast this!
Is your sampling poll as random as you think?
Can we get a better look?
• Statistical inference-based comparison models (hypothesis testing)
• Exact binomial tests (when we have the “true” pop)
• Chi-squared proportion tests (similar to independence tests)
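As an illustration of the exact binomial test mentioned above, the sketch below asks whether a country's share of a feed is compatible with its share of the reference ("true") population. It uses only the standard library and takes the two-sided p-value as the total probability of all outcomes no more likely than the observed one. The counts and proportions are invented for the example.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_test(k: int, n: int, p: float) -> float:
    """Two-sided exact p-value for observing k of n under true proportion p."""
    observed = binom_pmf(k, n, p)
    # Small relative tolerance so ties are not lost to floating-point noise.
    threshold = observed * (1 + 1e-7)
    total = sum(pm for pm in (binom_pmf(i, n, p) for i in range(n + 1)) if pm <= threshold)
    return min(1.0, total)


# Hypothetical numbers: 30 of 200 indicators on a feed geolocate to a country
# that holds 5% of the address space in the reference population.
p_value = exact_binomial_test(k=30, n=200, p=0.05)
print(f"p-value = {p_value:.3g}")
```

A very small p-value suggests the feed over- or under-represents that country relative to the reference population; it does not say why.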
Overlap Test
More data can be better, but make sure it is not the same data
Overlap Test - Inbound
Overlap Test - Outbound
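The Overlap Test can be pictured as a pairwise matrix: for each pair of feeds, what fraction of the first feed's indicators also appears in the second. The sketch below assumes feeds are plain sets of indicators; the feed names and contents are made up.

```python
from typing import Dict, Set


def overlap_matrix(feeds: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """matrix[a][b] = share of feed a's indicators that also appear in feed b."""
    return {
        a: {
            b: (len(feeds[a] & feeds[b]) / len(feeds[a]) if feeds[a] else 0.0)
            for b in feeds
        }
        for a in feeds
    }


feeds = {
    "feed_a": {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"},
    "feed_b": {"192.0.2.3", "192.0.2.4"},
    "feed_c": {"198.51.100.1"},
}
for name, row in overlap_matrix(feeds).items():
    print(name, row)
```

The matrix is not symmetric: here all of feed_b is contained in feed_a, but only half of feed_a is in feed_b.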
Uniqueness Test
I hate quoting myself, but…
• “Domain-based indicators are unique to one list between 96.16% and 97.37%”
• “IP-based indicators are unique to one list between 82.46% and 95.24% of the time”
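A rough sketch of how a uniqueness figure like the ones quoted above could be computed: count how many distinct indicators appear on exactly one feed. The definition (share of distinct indicators across all feeds) is an assumption, and the sample feeds are invented.

```python
from collections import Counter
from typing import Dict, Set


def uniqueness(feeds: Dict[str, Set[str]]) -> float:
    """Share of distinct indicators that appear on exactly one feed."""
    counts = Counter(indicator for members in feeds.values() for indicator in members)
    if not counts:
        return 0.0
    unique = sum(1 for seen_on in counts.values() if seen_on == 1)
    return unique / len(counts)


sample = {
    "feed_a": {"a.example", "b.example", "c.example"},
    "feed_b": {"c.example", "d.example"},
    "feed_c": {"d.example", "e.example"},
}
print(f"{uniqueness(sample):.2%}")  # 60.00%
```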
Key Takeaway #1
MORE != BETTER
Threat Intelligence Indicator Feeds
Threat Intelligence Program
Key Takeaway #1
Intermission
Key Takeaway #2
Key Takeaway #1
“These are the problems Threat Intelligence Sharing is here to solve!”
Right?
Herd Immunity, is it?
Source: www.vaccines.gov
Herd Immunity…
…would imply that others in your sharing community being immune to malware A meant you wouldn’t get it even if you were still vulnerable to it.
Threat Intelligence Sharing
• How many indicators are being shared?
• How many members actually share and how many just leech?
• Can we measure that? What a super-deeee-duper idea!
Threat Intelligence Sharing
We would like to thank the kind contribution of data from the fine folks at Facebook ThreatExchange and ThreatConnect…
…and also the sharing communities that chose to remain anonymous. You know who you are, and we ❤ you too.
Threat Intelligence Sharing – Data
From a period of 2015-03-01 to 2015-05-31:
- Number of Indicators Shared
  § Per day
  § Per member
Not sharing this data – privacy concerns for the members and communities
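Since the underlying data is not public, the sketch below only shows the shape of the measurement on invented events: indicators shared per day, per member, and which members shared nothing in the period. The event format and member names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def sharing_metrics(events: Iterable[Tuple[str, str]], members: List[str]):
    """events are (day, member) pairs, one per shared indicator."""
    events = list(events)
    per_day = Counter(day for day, _ in events)
    per_member = Counter(member for _, member in events)
    leechers = sorted(m for m in members if per_member[m] == 0)
    return per_day, per_member, leechers


# Invented data, not from any real sharing community.
members = ["org_a", "org_b", "org_c", "org_d"]
events = [
    ("2015-03-01", "org_a"),
    ("2015-03-01", "org_a"),
    ("2015-03-01", "org_b"),
    ("2015-03-02", "org_a"),
]
per_day, per_member, leechers = sharing_metrics(events, members)
print(dict(per_day))
print(dict(per_member))
print(leechers)
```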
Update frequency chart
OVERLAP SLIDE
OVERLAP SLIDE
UNIQUENESS SLIDE
MATURITY?
“Reddit of Threat Intelligence”?
Key Takeaway #1
'How can sharing make me better understand what are attacks that “are targeted” and what are “commodity”?'
Key Takeaway #1
TELEMETRY > CONTENT
Key Takeaway #3 (Also Prediction #1)
More Takeaways (I lied)
• Analyze your data. Extract more value from it!
• If you ABSOLUTELY HAVE TO buy Threat Intelligence or data, evaluate it first.
• Try the sample data, replicate the experiments:
  • https://github.com/mlsecproject/tiq-test-Summer2015
  • http://rpubs.com/alexcpsec/tiq-test-Summer2015
• Share data with us. I’ll make sure it gets proper exercise!
Thanks!
• Q&A?
• Feedback!
“The measure of intelligence is the ability to change.” - Albert Einstein
Alex Pinto
@alexcpsec @MLSecProject
Alexandre Sieira
@AlexandreSieira @NiddelCorp