secure storage in the cloud using property preserving encryption · 2017. 7. 18. · secure storage...
Secure storage in the cloud using property preserving encryption
Kenny Paterson
Information Security Group
Overview
1. Application scenarios.
2. Deterministic encryption and search.
3. OPE/ORE and range queries.
4. Analysing access pattern leakage from range queries.
Application Scenarios
• Data owners wish to securely outsource storage to cloud providers whilst preserving the capability for users to query data in various ways.
• What kinds of queries?
• What kinds of users?
• What kinds of data?
• What kinds of adversary?
• Meta: Why not just use FHE and be done?
Two scenarios, one picture
Scenario 1: Searchable File Storage
• Owner has a large collection of files, indexed by keywords.
• Owner encrypts files and stores these on a remote server.
• Owner encodes keywords in such a way that keyword searches can still be carried out.
• Encoded keywords are also stored on the server, as an encoded index.
• Owner sends a search token to the server; server uses token and index to find identifiers for matching files.
• Matching file identifiers are returned to the owner.
Scenario 2: Database Encryption
• Data owner has a large database of records; each record has multiple fields.
• Owner encrypts data in each field in such a way that standard database queries can still be carried out.
• Basic: simple searches.
- “Give me all records in which surname = Dubois”.
• Advanced: compound searches.
• More advanced: range queries.
- “Give me all records with ages between 21 and 30”.
• Finally: arbitrary SQL queries.*
* Other db query languages are available.
Searchable Encryption
Solution for Scenario 1: Searchable Encryption.
• Naïve scheme: owner uses IND-CPA symmetric encryption for files and PRF_K(kw) as the encoding of keyword kw.
• Store encrypted files and encoded keywords per file on the server.
• Owner sends tok = PRF_K(kw) to the server; server matches tok against encoded keywords; returns matching files.
• Can use an inverted index and file identifiers: server stores a database of tuples (tok, (fid1, fid2, …)).
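The inverted-index variant of the naïve scheme can be sketched as follows. This is a minimal illustration, not the talk's implementation: HMAC-SHA256 stands in for the PRF, file encryption is omitted, and the helper names (prf, build_index, search) are hypothetical.

```python
import hashlib
import hmac
import os
from collections import defaultdict


def prf(key: bytes, keyword: str) -> bytes:
    """PRF_K(kw), instantiated here with HMAC-SHA256 (an assumption of this sketch)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()


def build_index(key: bytes, files: dict) -> dict:
    """Owner side: map each token PRF_K(kw) to the ids of the files containing kw."""
    index = defaultdict(list)
    for fid, keywords in files.items():
        for kw in set(keywords):
            index[prf(key, kw)].append(fid)
    return dict(index)


def search(index: dict, tok: bytes) -> list:
    """Server side: look up the token; the server never sees the keyword itself."""
    return sorted(index.get(tok, []))


key = os.urandom(32)
files = {
    "fid1": ["cloud", "storage"],
    "fid2": ["cloud", "crypto"],
    "fid3": ["penguin"],
}
index = build_index(key, files)          # stored on the server
result = search(index, prf(key, "cloud"))
print(result)                            # ['fid1', 'fid2']
```

Even in this toy form the server learns which tokens repeat and which files each token matches, which is exactly the kind of leakage the security analysis has to account for.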
Security Analysis
• Adversarial objectives?
- Keyword recovery, recovery of file contents, …?
• Adversarial capabilities?
- “Snapshot”, “Honest-but-curious”, “Fully malicious”.
- Can/cannot observe queries; can/cannot make queries; can/cannot inject files.
• What about auxiliary information?
- What if the adversary has a representative data sample or keyword sample?
• Cash et al. (CCS15): detailed analysis of different attack models, leakage profiles, etc. against SE schemes in general: Leakage Abuse Attacks.
• Fuller et al. (S&P17): SoK paper on cryptographically protected database search.
Two scenarios, one picture
Deterministic Encryption
Partial solution for Scenario 2: DE
• Simplest possible scheme: owner uses a deterministic encryption scheme (KGen, Enc, Dec) to encrypt each column of the database using a per-column key K.
• Server can store the encrypted data in a traditional database.
• To find matches with value x in a column, send a search query for y = Enc_K(x) to the server.
• Server finds matches on y and returns full encrypted records to the client.
• Client decrypts returned records using per-column keys.
• Use of DE preserves equality of plaintexts and allows simple searches.
• (Very similar to naïve SE, with PRF replaced by Enc/Dec).
Property Preserving/Revealing Encryption (PPE/PRE)
More general solution for Scenario 2: PPE/PRE
• Generalises the idea of the “equality preserving/revealing” property of DE.
• Main example: Order Preserving/Revealing Encryption (OPE/ORE).
• OPE: if x < y then Enc(x) < Enc(y).
• ORE: there exists a (public) efficiently computable function “Order” such that: x < y iff Order(Enc(x), Enc(y)) = 1.
• OPE/ORE allows range queries!
• Client who wishes to query on range [a, b] instead sends a query for range [Enc(a), Enc(b)] to the server.
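The range-query idea can be illustrated with a toy OPE. The sketch below is only an illustration under stated assumptions: the "key" is a random strictly increasing lookup table (hypothetical helper toy_ope_keygen), which is workable only for tiny domains and is not a scheme from the talk.

```python
import random


def toy_ope_keygen(domain_size, range_size, seed):
    """Toy key: a random strictly increasing table, so x < y implies key[x] < key[y]."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(range_size), domain_size))


def toy_ope_encrypt(key, x):
    """Enc(x) is just a table lookup in this toy scheme."""
    return key[x]


def server_range_query(encrypted_column, ct_lo, ct_hi):
    """Server side: an ordinary comparison on ciphertexts, no key needed."""
    return [row for row, c in enumerate(encrypted_column) if ct_lo <= c <= ct_hi]


key = toy_ope_keygen(domain_size=120, range_size=1 << 20, seed=7)
ages = [18, 21, 25, 30, 31, 47]
encrypted_ages = [toy_ope_encrypt(key, a) for a in ages]

# Client asks for ages in [21, 30] by sending [Enc(21), Enc(30)].
matching_rows = server_range_query(
    encrypted_ages, toy_ope_encrypt(key, 21), toy_ope_encrypt(key, 30)
)
print(matching_rows)  # [1, 2, 3]
```

The server answers the query with a plain numeric comparison, which is why OPE-encrypted columns can sit in an unmodified database.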
Analysis of Deterministic Encryption
Reminder: ECB information leakage
Tux the Penguin, the Linux mascot. Created in 1996 by Larry Ewing
ECB-Tux
• DE is equality preserving, by design.
• DE therefore preserves frequencies of plaintexts in the ciphertexts, cf. monoalphabetic substitution cipher.
• Naveed-Kamara-Wright (CCS15): let’s apply frequency analysis! (al-Kindi, 9th century.)
• Assumption 1: attacker has auxiliary information – a reasonably accurate estimate for the plaintext distribution.
• Assumption 2: attacker has a snapshot of the encrypted database.
Frequency Analysis is Maximum Likelihood!
• Given a column of ciphertexts y, frequency analysis matches:
- Most frequent item in y with most frequent item in aux. dist.
- Second most frequent item in y with second most frequent item in aux. dist.
- etc.
• Defines a permutation π mapping plaintexts x to ciphertexts y.
• This procedure is maximum likelihood, that is, it maximises the likelihood L(π|y) := Pr(y|π).
• Proof: fun exercise, see also eprint 2015/1158.
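The rank-matching procedure described above can be sketched in a few lines. The data, the toy "encryption" table and the function name frequency_analysis are illustrative assumptions; the function returns the inverse of π, that is, a guess mapping each ciphertext to a plaintext.

```python
from collections import Counter


def frequency_analysis(ciphertexts, aux_dist):
    """Match the i-th most frequent ciphertext with the i-th most likely plaintext."""
    ct_ranked = [c for c, _ in Counter(ciphertexts).most_common()]
    pt_ranked = sorted(aux_dist, key=aux_dist.get, reverse=True)
    return dict(zip(ct_ranked, pt_ranked))


# Toy deterministic encryption: a fixed secret substitution (illustrative only).
toy_de = {"flu": "c7", "asthma": "c2", "gout": "c9"}
column = ["flu"] * 5 + ["asthma"] * 3 + ["gout"] * 1
snapshot = [toy_de[p] for p in column]

# Auxiliary distribution: the attacker's estimate, not the exact frequencies.
aux = {"flu": 0.50, "asthma": 0.35, "gout": 0.15}

guess = frequency_analysis(snapshot, aux)
print(guess)  # {'c7': 'flu', 'c2': 'asthma', 'c9': 'gout'}
```

The attack needs only the snapshot and the auxiliary estimate; it succeeds here because the ranking of frequencies in the estimate agrees with the ranking in the data.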
Performance of Frequency Analysis Against DE
• Naveed-Kamara-Wright [CCS15] performed an empirical investigation of the performance of frequency analysis against DE.
• Using a large medical dataset: per-patient data in 12 categories for the 200 largest hospitals in the 2009 Nationwide Inpatient Sample (NIS), from the Healthcare Cost and Utilization Project (HCUP), run by the US Agency for Healthcare Research and Quality.
• DE-encrypt data per hospital for each category.
• Use 2004 aggregated HCUP data as the auxiliary data.
• Run frequency analysis and measure the percentage of data items correctly recovered per hospital.
Frequency Analysis Makes Headlines!
Combatting Frequency Analysis
• We want to smooth out the frequency distribution so that frequency analysis becomes ineffective.
- Performing worse than random guessing of plaintext.
• We also want to preserve the ability to efficiently perform search queries on a standard database.
- Rules out fully randomised/IND-CPA secure encryption.
• What about adding a limited amount of randomness?
• Leads to the idea of applying homophonic encoding to produce Frequency Smoothing Encryption (FSE) schemes (Lacharité-Paterson, forthcoming).
Frequency Smoothing Encryption – Combatting Frequency Analysis
[Diagram: Plaintext p → HE → Encodings e0, e1, e2, e3 → DE → Ciphertexts c0, c1, c2, c3]
• Homophonic Encoding (HE) consumes a small amount of randomness.
• Make the number of encodings proportional to the frequency of p for good frequency smoothing.
• DE = Deterministic Encryption.
• Match on {c0, c1, c2, c3} instead of a single ciphertext.
• Query complexity blow-up by max. number of encodings in worst case.
Interval-based Homophonic Encoding (IBHE)
• Encoding space = r-bit strings / interval [0, 2^r).
• Represent encodings of p having frequency f by an interval of size approximately f × 2^r.
• Select uniformly at random from the interval to encode p.
• Needs an encoding table to store an interval for each plaintext item; |p| × 2r bits.
• Also needs a decoding table mapping bits back to plaintexts.
[Diagram: the range 0 to 2^r − 1 split into consecutive intervals for p0, p1, p2, p3, …]
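A minimal sketch of interval-based homophonic encoding, assuming a known frequency table and a small r. The function names (build_table, encode, decode) are hypothetical, Python's random module stands in for a proper randomness source, and the deterministic encryption step that would follow is omitted.

```python
import bisect
import random


def build_table(freqs, r):
    """Split [0, 2^r) into consecutive intervals of size about f * 2^r per plaintext."""
    space = 1 << r
    plaintexts = sorted(freqs)
    starts, cumulative = [], 0.0
    for p in plaintexts:
        starts.append(round(cumulative * space))
        cumulative += freqs[p]
    bounds = starts + [space]
    if any(lo >= hi for lo, hi in zip(bounds, bounds[1:])):
        raise ValueError("r is too small: some plaintext would get an empty interval")
    return plaintexts, bounds


def encode(table, p, rng=random):
    """Pick a uniformly random r-bit encoding from the interval assigned to p."""
    plaintexts, bounds = table
    i = plaintexts.index(p)
    return rng.randrange(bounds[i], bounds[i + 1])


def decode(table, e):
    """Map an encoding back to the plaintext whose interval contains it."""
    plaintexts, bounds = table
    return plaintexts[bisect.bisect_right(bounds, e) - 1]


freqs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
table = build_table(freqs, r=8)
print(table[1])  # [0, 128, 192, 224, 256]

e = encode(table, "a")
print(e, decode(table, e))
```

Because a plaintext with frequency f is spread over about f × 2^r encodings, each individual encoding appears with roughly the same probability, which is the smoothing effect the scheme relies on.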
Effectiveness of FSE from IBHE + DE
• Can prove that as r goes to ∞, no distinguisher can tell apart ciphertexts from uniformly random strings.
• But even for moderate r, IBHE + DE smooths well for all but very skewed data.
• Rapidly limits (generalised) frequency analysis to being worse than a pure guessing attack.
- Such an attack is always possible for a limited domain of plaintexts.
• We used the same evaluation framework as Naveed-Kamara-Wright (CCS15).
- Except that we gave the adversary the exact, per-hospital distribution as the auxiliary distribution!
Effectiveness of FSE from IBHE + DE
• Warning: FSE only protects against a basic snapshot attacker.
• Recent work of Grubbs-Ristenpart-Shmatikov (HotOS17) questions the legitimacy of the snapshot attack model.
• Columns are treated in isolation.
• A more powerful adversary could perform frequency analysis on the sets of responses to queries.
• Scheme does not protect against an active attacker who can inject his own queries.
Analysis of OPE/ORE
Order Preserving/Revealing Encryption
• OPE: if x < y then Enc(x) < Enc(y).
• ORE: there exists a (public) efficiently computable function “Order” such that: x < y iff Order(Enc(x), Enc(y)) = 1.
• OPE/ORE allows range queries.
• Client who wishes to query on range [a, b] instead sends a query for range [Enc(a), Enc(b)] to the server.
Order Preserving/Revealing Encryption
• Q: If DE leaks badly, does OPE/ORE leak even more?
• A: Often, yes.
• Folklore: if the OPE scheme is deterministic and the plaintext data is dense (every possible plaintext occurs) then a snapshot adversary can learn which plaintext is which.
• Simply order the ciphertexts and then read off the plaintexts.
• Take-away: beware of formal security models for OPE/ORE.
• The folklore attack can sometimes be generalised to the non-dense case…
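The folklore attack on dense data is short enough to write out. The sketch assumes a toy OPE (a random increasing lookup table, not a real scheme) and an attacker who knows the plaintext domain is 0..15; all names are illustrative.

```python
import random


def toy_ope_keygen(domain_size, range_size, seed):
    """Toy OPE key: a random strictly increasing table from plaintexts to ciphertexts."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(range_size), domain_size))


def dense_snapshot_attack(ciphertexts):
    """Sort the distinct ciphertexts; the i-th smallest must encrypt the i-th plaintext."""
    ordered = sorted(set(ciphertexts))
    return {c: i for i, c in enumerate(ordered)}


key = toy_ope_keygen(domain_size=16, range_size=1 << 16, seed=1)

# Dense column: every value in 0..15 occurs at least once, plus random repeats.
rng = random.Random(2)
column = list(range(16)) + [rng.randrange(16) for _ in range(50)]
rng.shuffle(column)

snapshot = [key[x] for x in column]          # all the attacker sees
recovered = dense_snapshot_attack(snapshot)  # no key, no queries
print([recovered[c] for c in snapshot] == column)  # True
```

The attack uses no auxiliary distribution at all: density plus order is enough to recover every plaintext.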
The Scheme of Chenette-Lewi-Weis-Wu (FSE16)
• CLWW (FSE16) presented a clever and practical ORE scheme built using only PRFs.
• CLWW gave a precise characterisation of leakage in a simulation-based security model:
• Given two ciphertexts Enc(x) and Enc(y), the scheme leaks exactly the first index at which bits of x and y differ (and which is bigger).
• Example: given Enc(x = 1101_2) and Enc(y = 1001_2), the scheme would leak that the two plaintexts are equal in the MSB (bit 0) but that the first one has 1 in bit 1 and the other 0 in bit 1.
• Leakage is greater than in an ideal OPE scheme, which would leak only order.
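To make the leakage concrete, here is a minimal sketch of a PRF-based bitwise ORE in the style of CLWW. The details are assumptions of this sketch rather than a faithful copy of the paper's specification: HMAC-SHA256 plays the PRF, each ciphertext symbol is (PRF of the bit's prefix + the bit) mod 3, and the function names are hypothetical.

```python
import hashlib
import hmac


def _prf_mod3(key: bytes, i: int, prefix: str) -> int:
    """PRF on (position, preceding bits), reduced mod 3 (HMAC-SHA256 is an assumption)."""
    digest = hmac.new(key, f"{i}|{prefix}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % 3


def ore_encrypt(key: bytes, x: int, nbits: int) -> list:
    """One symbol per bit: u_i = PRF(i, bits before i) + bit_i (mod 3)."""
    bits = format(x, f"0{nbits}b")
    return [(_prf_mod3(key, i, bits[:i]) + int(bits[i])) % 3 for i in range(nbits)]


def ore_compare(ct_x: list, ct_y: list):
    """Public comparison: returns (first differing bit index, sign of x - y)."""
    for i, (u, v) in enumerate(zip(ct_x, ct_y)):
        if u != v:
            # Same prefix means same PRF output, so v == u + 1 (mod 3) iff x has 0, y has 1.
            return i, (-1 if v == (u + 1) % 3 else 1)
    return None, 0


key = b"\x00" * 32  # fixed demo key, illustrative only
leak = ore_compare(ore_encrypt(key, 0b1101, 4), ore_encrypt(key, 0b1001, 4))
print(leak)  # (1, 1): plaintexts agree on bit 0, differ at bit 1, and x > y
```

Anyone holding two ciphertexts can run the comparison, so the index of the first differing bit is available to a snapshot attacker for every pair of ciphertexts.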
An Attack on the CLWW Scheme
• In the dense case, the folklore analysis applies.
• What about the non-dense case?
• Assumption 1: snapshot attacker.
• Assumption 2: N plaintexts, close to uniformly random on the s MSBs, where N > s·2^s.
• Then, with high probability, the attacker can learn the s most significant bits of every plaintext.
• The second assumption implies that, with high probability, every possible s-bit prefix occurs in at least one plaintext:
Prob ≈ 1 – 2-s/2^(s+1).
• Use the CLWW scheme’s leakage to order the N ciphertexts on the 2^s distinct s-bit prefixes.
• Now read off the s most significant bits of each plaintext.
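The attack steps can be simulated against an abstract leakage oracle. In this sketch the oracle (make_leakage_oracle) models only what the scheme is said to leak for a pair of ciphertexts; the parameter choices (20-bit values, s = 4, N = 400) and all names are illustrative assumptions.

```python
import random
from functools import cmp_to_key


def make_leakage_oracle(plaintexts, nbits):
    """Models the leakage for ciphertexts i, j: (first differing bit index, order)."""
    def leak(i, j):
        x, y = plaintexts[i], plaintexts[j]
        if x == y:
            return None, 0
        first_diff = nbits - (x ^ y).bit_length()  # MSB is bit 0
        return first_diff, (-1 if x < y else 1)
    return leak


def recover_msbs(n, leak, s):
    """Order ciphertexts, split them into s-bit-prefix classes, number the classes."""
    order = sorted(range(n), key=cmp_to_key(lambda i, j: leak(i, j)[1]))
    prefixes, current = {}, 0
    for pos, idx in enumerate(order):
        if pos > 0:
            first_diff, _ = leak(order[pos - 1], idx)
            if first_diff is not None and first_diff < s:
                current += 1  # neighbours differ within the first s bits: new prefix
        prefixes[idx] = current
    if current != (1 << s) - 1:
        return None  # some prefix never occurred; the read-off step does not apply
    return prefixes


NBITS, S, N = 20, 4, 400  # N comfortably above s * 2^s = 64
rng = random.Random(0)
plaintexts = [rng.randrange(1 << NBITS) for _ in range(N)]  # hidden from the attacker

recovered = recover_msbs(N, make_leakage_oracle(plaintexts, NBITS), S)
print(recovered is not None)
```

When all 2^s prefixes occur, the k-th class in sorted order must have prefix k, so the attacker reads off the s most significant bits without any auxiliary data.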
Implications of the Attack
• Suppose a company has 10,000 employees with salaries that are 20-bit numbers (between $0 and $2^20 − 1).
• We can set s = 10 (10 × 2^10 ≈ 10,000).
• Attack yields 10 MSBs of every salary.
• This is enough to identify each salary to within about $1k, since the 10 unknown low-order bits span only 2^10 = 1024 values.
• Example generalises to, say, 32-bit salaries that are all zero in the first 12 bit positions.
• Sufficient that data be dense in some positions (and constant in leading positions).
Further Research on OPE/ORE Leakage
Several recent attack papers examine the real-world implications of the leakage of OPE/ORE schemes for snapshot attackers:
• Durak-DuBuisson-Cash (CCS16): attacks on correlated columns of OPE/ORE-encrypted data, especially longitude/latitude data.
• Grubbs-Sekniqi-Bindschaedler-Naveed-Ristenpart (S&P17): revisit Naveed-Kamara-Wright for OPE/ORE; recast the ptxt/ctxt matching problem as a min-weight, non-crossing bipartite matching problem; solve it efficiently for many types of data; relies on auxiliary distributions.
Access Pattern Leakage for Range Queries
Analysis of Access Pattern Leakage for Range Queries
Kellaris-Kollios-Nissim-O’Neill (CCS16): analysis of access pattern leakage for SE; applicable to OPE/ORE schemes too.
• Honest-but-curious attack setting, stronger than snapshot adversary.
• Assumption: adversary can see which database rows are returned in response to any range query.
• For N-valued database, complete reconstruction in O(N^4) queries.
• For dense case: O(N^2 log N) queries suffice.
Analysis of Access Pattern Leakage for Range Queries
• Adversary in KKNO (CCS16) does not need to directly see the actual ranges queried.
- In OPE/ORE, adversary would see only ciphertexts Enc(x), Enc(y) corresponding to range endpoints.
- But in OPE/ORE setting, and in some SE schemes*, the rank also leaks.
- The rank of a ciphertext is its position in an ordered list of all the ciphertexts.
* e.g. Arx scheme of Poddar-Boelter-Popa and FH-OPE scheme of Kerschbaum.
Exploiting Rank in Analysis of Access Pattern Leakage
• Can we use the rank leakage to improve attack complexity?
• Lacharité-Minaud-Paterson (forthcoming):
- Yes!
- And much more besides…
Exploiting Rank in Analysis of Access Pattern Leakage
Simple motivating example: consider range queries [a, b] in which a is uniformly random.
• Then with probability 1 – 1/N, after 2N log N queries, all N possible values for a will have arisen.
- Follows from standard analysis of the coupon collector problem.
• Easy to identify and order different a values based on rank leakage.
• All values of a in queries are now known.
• Pick out N queries with distinct values of a; each such query produces a set of responses Y_a (records in database).
• Then the set of records with value a is: Y_a – ∪_{i>a} Y_i.
[Diagram: response sets Y_0, Y_1, Y_2, …, Y_N drawn against left endpoints a = 0, 1, 2, …, N-2, N-1, N, with the per-value record sets labelled Y_0 – ∪_{i>0} Y_i, Y_1 – ∪_{i>1} Y_i, …, Y_{N-1} – Y_N, and Y_N.]
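The motivating example can be simulated directly. The sketch assumes the adversary already knows the left endpoint a of each query (this is what rank leakage is used for) and sees the set of record identifiers returned; the database, query distribution and function name reconstruct are illustrative.

```python
import random


def reconstruct(responses_by_a):
    """Records with value a are Y_a minus the union of Y_i over all i > a."""
    values = sorted(responses_by_a)
    recovered = {}
    for a in values:
        later = set().union(*(responses_by_a[i] for i in values if i > a))
        for record_id in responses_by_a[a] - later:
            recovered[record_id] = a
    return recovered


rng = random.Random(0)
N = 8  # values are 0 .. N-1

# Hidden database: record id -> value (ids 1000+v guarantee every value occurs).
database = {rid: rng.randrange(N) for rid in range(40)}
database.update({1000 + v: v for v in range(N)})

# Observe random range queries [a, b] until every left endpoint a has been seen.
responses_by_a = {}
while len(responses_by_a) < N:
    a = rng.randrange(N)
    b = rng.randrange(a, N)
    response = {rid for rid, v in database.items() if a <= v <= b}
    responses_by_a.setdefault(a, response)  # keep one response set per value of a

print(reconstruct(responses_by_a) == database)  # True
```

The right endpoints b never need to be known: any record with a value above a is removed because it also appears in the response to a query starting at its own value.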
Improved Analysis of Access Pattern Leakage
• Our simple example appears to show that rank leakage helps the adversary.
• In fact, we can dispense with rank leakage and obtain an attack needing N log N + O(N) queries in the general “dense” case!
- Improving on KKNO’s O(N^2 log N) attack.
• We also consider the problem of approximate reconstruction.
• We can efficiently reconstruct values in records up to an absolute error of εN after seeing only O(N) queries!
- With a constant of 2 log(1/ε).
Improved Analysis of Access Pattern Leakage
Finally, we study algorithms for approximate reconstruction with the assistance of rank and an auxiliary distribution.
• Significant reduction in the number of queries required for accurate reconstruction.
• Perform set intersections and then map back to underlying data using rank + auxiliary distribution.
• Experiments with aggregated HCUP data…
Approximate Reconstruction with Auxiliary Distribution
Concluding Remarks
• Use DE/OPE/ORE with extreme care, if at all.
• We are currently in a propose/break/patch cycle.
• Despite the provision of security models and proofs.
• Just identifying and proving leakage is not enough; we need to also identify real-world implications of that leakage.
• Does PPE provide added security or a false sense of security?