Content-based image retrieval · thomas.deselaers.de/teaching/files/tutorial_icpr08/02-cbir.pdf ·...
Content-based Image Retrieval
Tutorial Image Retrieval
Thomas Deselaers, Henning Müller
Outline
• What is CBIR
• Approaches
  – Continuous approach
  – Discrete approach
• Features for content-based image retrieval
  – Global
  – Local
• Relationships between different approaches
• Available resources
Content-based Image Retrieval
• Two main questions
  – How are the images represented? Features
  – How are the image descriptors compared? Similarity measures
• Two main views on the concept of a 'feature':
  – Features are numerical values computed from each image
    • View connected to image classification
    • Ideas and methods from classification and machine learning
  – Features are image properties that are present or absent
    • View connected to textual information retrieval
    • Ideas and methods from text retrieval
Continuous Approach
• Inspired by nearest-neighbour classification
• Each image is represented by a feature vector
• Feature vectors are compared using distance functions
• Images similar to a query image are assumed to be relevant
• Normally: query-by-visual example(s)
Nearest Neighbour
• Voronoi Diagram in a 2D space
• A set of points and the corresponding Voronoi cells
• A Voronoi cell is the area where a point’s nearest neighbour is the seed of the cell
Source: Wikipedia
Nearest Neighbour
• Standard nearest neighbour search
    X[1] … X[N]: dataset
    q: query
    mindist = ∞
    bestN = -1
    for n = 1:N:
        d = dist(q, X[n])
        if d < mindist:
            mindist = d
            bestN = n
    return bestN
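The linear scan above can be sketched in Python. This is a minimal illustration, assuming Euclidean distance for `dist` and plain tuples as feature vectors; both choices are assumptions for the example, not something the slides prescribe.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbour(q, X):
    """Return the index of the vector in X closest to q, or -1 if X is empty."""
    min_dist = math.inf
    best_n = -1
    for n, x in enumerate(X):
        d = dist(q, x)
        if d < min_dist:
            min_dist = d
            best_n = n
    return best_n


# Toy dataset of 2D feature vectors (hypothetical values).
X = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
best = nearest_neighbour((0.9, 1.2), X)
```

Every query touches all N vectors, which is the cost that the index structures on the following slides try to avoid.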
R*-Tree
• Data structure for multi-dimensional data
• Save data in a tree structure
• Directory nodes (blue)
• Data nodes (red)
• Save minimum bounding rectangles in directory nodes
[Figure: R*-tree with root, directory nodes p1 and p2, and data nodes p11, p12, p13, p21, p22, p23, shown with their minimum bounding rectangles]
Index Structures: Complexity
• Linear search
  – Complexity O(n/C), very small overhead
• Bad situation for index structures
  – Large range
  – Strongly overlapping regions
  – Few regions are not accessed
  – Common with high dimensionalities
  – Complexity O(n/C), high overhead
• Good situation
  – Small range
  – Small overlaps
  – Many regions do not have to be read
  – Complexity O(log_C(n))
Nearest Neighbour Search with Index
    Init: resultdist = ∞
    Function SimpleNNQuery(Point q, Address: page):
        page.load()
        if isDataPage(page):
            for x in page.points:
                d = dist(q, x)
                if d < resultdist:
                    resultdist = d
                    result = x
        else:
            for p in page.childPages:
                if MINDIST(q, p) < resultdist:
                    SimpleNNQuery(q, p)
• First path finds arbitrary point
• Search space only slowly reduced
• Many pages unnecessarily read
Example Query
[Figure: example query on the R*-tree with pages root, p1, p2, p11, p12, p13, p21, p22, p23]
• If the search had started with p2, no page from p1 would have been read
• Clearly non-optimal
Nearest Neighbour Search with Index: Best-First Search
• Avoid recursion
• Instead use priority queue APL (active page list)
  – List which contains directory pages to be processed, sorted by priority
• Definition: a page p is active, if and only if
  – p not yet processed
  – Parent of p processed
  – Minimum distance between p and q < current best distance
• Initialisation: APL = [root]
• In each step, process best page from APL
  – Data pages: as before
  – Directory pages: check minimum distance to query
Best First Nearest Neighbour Search with Index
    Init:
        apl = [(0.0, root)]  // sorted by dist
        resultdist = ∞
    While apl.notEmpty() and apl[0].dist < resultdist:
        page = apl[0].load()
        delete(apl[0])
        if isDataPage(page):
            for x in page.points:
                d = dist(q, x)
                if d < resultdist:
                    resultdist = d
                    result = x
        else:
            for p in page.childPages:
                h = MINDIST(q, p)
                if h < resultdist:
                    apl.insert((h, p))
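The best-first search can be sketched in runnable Python as follows. The `Page` class, the in-memory tree and the toy coordinates are hypothetical stand-ins for disk pages of an R*-tree; `heapq` plays the role of the sorted APL, and `mindist` is the usual point-to-rectangle distance.

```python
import heapq
import itertools
import math
from dataclasses import dataclass, field


@dataclass
class Page:
    lo: tuple  # lower corner of the minimum bounding rectangle
    hi: tuple  # upper corner of the minimum bounding rectangle
    points: list = field(default_factory=list)    # non-empty for data pages
    children: list = field(default_factory=list)  # non-empty for directory pages


def mindist(q, page):
    """Smallest possible distance from q to any point inside the page's rectangle."""
    return math.sqrt(sum(max(l - c, 0.0, c - h) ** 2
                         for c, l, h in zip(q, page.lo, page.hi)))


def best_first_nn(q, root):
    """Return (nearest point, its distance, number of pages read)."""
    tie = itertools.count()  # tie-breaker so the heap never compares Page objects
    apl = [(0.0, next(tie), root)]
    result, result_dist, pages_read = None, math.inf, 0
    while apl and apl[0][0] < result_dist:
        _, _, page = heapq.heappop(apl)
        pages_read += 1
        if page.points:  # data page: compare against every stored point
            for x in page.points:
                d = math.dist(q, x)
                if d < result_dist:
                    result, result_dist = x, d
        else:  # directory page: enqueue children that could still hold a closer point
            for p in page.children:
                h = mindist(q, p)
                if h < result_dist:
                    heapq.heappush(apl, (h, next(tie), p))
    return result, result_dist, pages_read


# Toy tree (hypothetical coordinates): two directory pages with two data pages each.
p11 = Page((0, 0), (1, 1), points=[(0, 0), (1, 1)])
p12 = Page((2, 0), (3, 1), points=[(2, 0), (3, 1)])
p21 = Page((8, 8), (9, 9), points=[(8, 8), (9, 9)])
p22 = Page((6, 6), (7, 7), points=[(6, 6), (7, 7)])
p1 = Page((0, 0), (3, 1), children=[p11, p12])
p2 = Page((6, 6), (9, 9), children=[p21, p22])
root = Page((0, 0), (9, 9), children=[p1, p2])

result, result_dist, pages_read = best_first_nn((6.5, 5.0), root)
```

In this toy example only root, p2 and p22 are loaded; p1 and p21 stay in the queue because their MINDIST exceeds the distance of the point already found.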
Example Query with Best First Search
[Figure: example query on the R*-tree with pages root, p1, p2, p11, p12, p13, p21, p22, p23]
APL: [(0.0, root)]  resultdist: ∞
APL: [(7.2, p1)]  resultdist: ∞
APL: [(6.8, p2) (7.2, p1)]  resultdist: ∞
APL: [(7.2, p1) (8.2, p21)]  resultdist: ∞
APL: [(7.0, p22) (7.2, p1) (8.2, p21)]  resultdist: ∞
APL: [(6.9, p23) (7.0, p22) (7.2, p1) (8.2, p21)]  resultdist: ∞
APL: [(7.0, p22) (7.2, p1) (8.2, p21)]  resultdist: 7.1
APL: [(7.2, p1) (8.2, p21)]  resultdist: 7.0
Best First Search is optimal
Here: a draft of the proof
1. Completeness: It will find the correct NN of a query
  – Every correct algorithm has to access all pages that intersect with the NN sphere of q
  – These pages have MINDIST < resultdist
2. It accesses pages in ascending order of distance from the query
  – The APL is sorted by MINDIST and the algorithm will terminate once it is impossible to find any point closer to q than the current result
3. It will not access a single page with MINDIST larger than the true NN distance
  – A child page cannot have a MINDIST smaller than its parent's
Continuous Approach
• Simple case:
  • Database: X[1] … X[N]
  • Query: q
  • Distance between q and X[n]: dist(q, X[n])
  • Sort images
  • Such that dist(q, X[n]) ≤ dist(q, X[n'])
  • Holds for all n and n' with n < n'
Continuous Approach
• How to use multiple features
• Each image is represented by multiple descriptors
• Calculate score instead of distance to allow for different weights
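One way to turn several per-descriptor distances into a single weighted score is sketched below. The exact scoring formula did not survive in this transcript, so the form used here (the exponential of the negative weighted distance sum) is an assumption, and the descriptors and weights are hypothetical toy values.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def score(query, image, weights, distances):
    """Map a weighted sum of per-descriptor distances to a score in (0, 1].

    Assumed form: exp(-sum_m w_m * d_m(query_m, image_m)).
    """
    total = sum(w * d(qm, xm)
                for w, d, qm, xm in zip(weights, distances, query, image))
    return math.exp(-total)


def rank(query, database, weights, distances):
    """Return database indices sorted from highest to lowest score."""
    scores = [score(query, image, weights, distances) for image in database]
    return sorted(range(len(database)), key=lambda n: scores[n], reverse=True)


# Each image is (colour descriptor, texture descriptor); values are made up.
database = [
    ((0.9, 0.1), (0.2,)),
    ((0.1, 0.9), (0.2,)),
    ((0.8, 0.2), (0.9,)),
]
query = ((1.0, 0.0), (0.2,))
order = rank(query, database, weights=(1.0, 0.5),
             distances=(euclidean, euclidean))
```

Raising a weight makes the corresponding descriptor dominate the ranking; setting it to zero removes the descriptor entirely.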
Discrete Approach
• Inspired by textual information retrieval
• Each image is represented by a set of binary features (features may be present (possibly multiple times) or absent)
• Feature is either present or absent
  – Similar to words being absent or present in a document
• Images containing the same (informative) features are assumed to be relevant to a query
• Example: GIFT – GNU Image Finding Tool
Discrete Approach
• GIFT uses TF-IDF (term frequency / inverse document frequency) ranking
  – Reduce the impact of features which occur frequently in the data (comparable to “the” in texts)
• TF: frequency a feature i has in a document d_j
• IDF: measures importance of a term
Discrete Approach
• tf captures how often a feature occurs in a document
  – Features that occur often in a document describe this document well
• idf captures how relevant a feature is
  – Features that occur rarely in the full database are important
• Important are those features which occur often in one image, but seldom over the full dataset
  – Images which share rare features are relevant with respect to each other
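The interplay of tf and idf can be illustrated with a small Python sketch. It uses one common textbook variant (relative frequency for tf, log(N/df) for idf); the precise weighting used in GIFT is not given here, and the feature names are hypothetical.

```python
import math
from collections import Counter


def tf(feature, image_features):
    """Relative frequency of a feature within one image."""
    return Counter(image_features)[feature] / len(image_features)


def idf(feature, collection):
    """log(N / df): large for features that occur in few images."""
    df = sum(1 for image_features in collection if feature in image_features)
    return math.log(len(collection) / df) if df else 0.0


# Each image is a list of (possibly repeated) discrete features; names are made up.
collection = [
    ["sky", "sky", "grass"],
    ["sky", "sand"],
    ["sky", "grass", "grass", "grass"],
]

# "sky" occurs in every image, so it carries no weight despite a high tf.
w_sky = tf("sky", collection[0]) * idf("sky", collection)
# "grass" is frequent in image 2 and absent from image 1, so it gets a positive weight.
w_grass = tf("grass", collection[2]) * idf("grass", collection)
```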
Discrete Approach in GIFT
• In GIFT, 4 different feature sets are considered
  – Global colour
  – Local colour
  – Global texture
  – Local texture
• For all local features a tf-idf-like score is calculated and these are fused as a weighted sum
  – Global features are compared with a histogram intersection
• For many cases, the discrete and the continuous approach can be simulated in the respective other
• In GIFT
  – Images have about 1,500 to 2,000 features
  – Similar images share about 400-500
Discrete Approach: Inverted Files
• Store a mapping from content to location
  – E.g. for each feature a list of images that contain this feature
• Allow for efficient searches even for huge amounts of images
  – idf for each feature can be pre-calculated
  – tf for each feature is stored in the files
  – Allows for searching without accessing the images
• In the continuous approach searching for neighbours is linear in the number of images, here it is at most linear in the number of features in an image
  – In practice, the features with high impact are processed first and the other features have less influence (Zipf’s Law). Therefore, in practice, this approach is very fast
The inverted file
[Figure: each feature (Feature 1, Feature 2, ..., Feature n-1, Feature n) points to the list of images that contain it, e.g. Image 5, Image 7, Image 1, Image 25]
Inverted Files
• Access feature by feature instead of image by image
• Extremely fast access for rare features
• Efficient for sparsely populated spaces
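A minimal in-memory sketch of an inverted file, assuming each image is described by a set of discrete feature labels (the labels and image ids are hypothetical). The query only walks the lists of the features it contains and never touches the images themselves; scoring here is a plain count of shared features rather than a tf-idf weight.

```python
from collections import defaultdict


def build_inverted_file(images):
    """Map each feature to the set of image ids that contain it."""
    inverted = defaultdict(set)
    for image_id, features in images.items():
        for f in features:
            inverted[f].add(image_id)
    return inverted


def query(inverted, query_features):
    """Count shared features per image, reading only the lists of the query's features."""
    votes = defaultdict(int)
    for f in set(query_features):
        for image_id in inverted.get(f, ()):
            votes[image_id] += 1
    # Most shared features first; ties broken by image id for a stable order.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


images = {
    "img1": {"red", "stripes", "round"},
    "img2": {"blue", "stripes"},
    "img3": {"red", "round", "smooth"},
}
inverted = build_inverted_file(images)
hits = query(inverted, {"red", "round", "dots"})
```

The image `img2` never appears in the result because none of its features is in the query, so its entry is simply never visited.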
Visual Properties of Images
• Colour
• Texture
• Shapes
• Image parts
• Complete image
• Metadata
• Textual labels / captions / annotations
Features for Content-based Image Retrieval
• Global Descriptors
  – Colour histograms
  – Texture Features
  – Shape Features
• Local Descriptors
  – Direct approach
  – Patch-histograms / bag-of-visual words
  – SIFT features
Global Descriptors
• Capture a property of the image with few values
• Describing the image in its entirety
• E.g.
  – Which colours occur in the image?
  – Is the image of high contrast?
  – Is the image bright/dark?
Colour histograms
• Describe the distribution of colours in an image
• Discard spatial information
1. Quantise colour space
2. Count which colour occurs how often
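The two steps above (quantise, then count) can be sketched as follows, assuming 8-bit RGB pixels given as tuples and a uniform quantisation with the same number of bins per channel. Both are illustrative choices; other colour spaces and quantisations work the same way.

```python
def colour_histogram(pixels, bins_per_channel=2):
    """Quantise 8-bit RGB pixels uniformly and return a normalised histogram."""
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Step 1: quantise each channel value (0..255) to a bin index.
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        # Step 2: count the occurrence of the quantised colour.
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
    total = len(pixels)
    return [count / total for count in hist]


# Four toy pixels: two reddish, one blue, one dark.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (10, 20, 30)]
hist = colour_histogram(pixels)
```

Because only counts are kept, shuffling the pixel order (that is, the spatial layout) leaves the histogram unchanged.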
Colour Histograms: Example
[Figure panels: RGB colour space, HSV colour space]
Visualisation done with 3D Color Inspector: http://rsbweb.nih.gov/ij/plugins/color-inspector.html
Texture Features
“Texture refers to the properties held and sensations caused by the external surface of objects received through the sense of touch.”
Texture in image processing:
• Different definitions
• Different representations
Tamura Features
• Proposed by Tamura [1978]
  – Features corresponding to human perception
  – Examined 6 different features, found 3 to correspond strongly to human perception
  – Coarseness – coarse vs. fine
  – Contrast – high vs. low
  – Directionality – directional vs. non-directional
  – Line-likeness – line-like vs. non-line-like
  – Regularity – regular vs. irregular
  – Roughness – rough vs. smooth
Tamura Features
• Originally one value per feature per image was determined
• Create a descriptor by
1. Calculate coarseness, contrast, and directionality in a local neighbourhood for each pixel
2. Create a joint histogram over these values
Gabor Features
• Obtain several values per pixel denoting spatial frequencies and directions
Gabor Features
• Windowed Fourier transform with Gaussian as window function:
• Gabor features:
  – Yield one value per pixel (for grey value images)
  – For colour images use a transformed HSV space
• Compute global (histogram) or local features from Gabor responses
Gray-Level Co-Occurrence Matrices
• Statistical descriptor for texture properties of an image by comparing neighbouring pixels
  – Direction and distance
  – Extract features from matrix
    • Entropy
    • Contrast
    • Correlation
    • …
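A small sketch of a grey-level co-occurrence matrix and two of the statistics listed above. It assumes the image is already quantised to a few grey levels and given as nested lists; the offset `(dx, dy)` encodes direction and distance. The contrast and entropy formulas are the commonly used definitions, and the toy image is hypothetical.

```python
import math


def glcm(image, dx, dy, levels):
    """Normalised co-occurrence matrix of grey levels for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(map(sum, counts))  # assumes at least one valid pixel pair
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Sum of p[i][j] * (i - j)^2: large when neighbouring grey levels differ."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def entropy(p):
    """Shannon entropy (in bits) of the co-occurrence distribution."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


# 4x4 toy image with four grey levels arranged in 2x2 blocks.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
p = glcm(image, dx=1, dy=0, levels=4)  # compare each pixel with its right neighbour
```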
Histogram Comparison
• ‐distances
• Jensen Shannon Divergence
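Two ways of comparing normalised histograms, sketched in Python. The Minkowski distance is included as one example of a bin-wise distance (which family of distances the slide listed is not recoverable from this transcript, so that choice is an assumption); the Jensen-Shannon divergence follows its standard definition with base-2 logarithms.

```python
import math


def minkowski(h1, h2, p=1):
    """Bin-wise L_p distance between two histograms of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(h1, h2)) ** (1 / p)


def jensen_shannon(h1, h2):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint histograms."""
    def kl(a, m):
        # Kullback-Leibler divergence; empty bins of a contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, m) if x > 0)

    mid = [(a + b) / 2 for a, b in zip(h1, h2)]
    return 0.5 * kl(h1, mid) + 0.5 * kl(h2, mid)


h1 = [0.5, 0.5, 0.0]
h2 = [0.0, 0.5, 0.5]
d_l1 = minkowski(h1, h2, p=1)
d_js = jensen_shannon(h1, h2)
```

Unlike the plain Kullback-Leibler divergence, the Jensen-Shannon divergence is symmetric and stays finite when one histogram has empty bins.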
Pixel Values as Features
• Most straightforward
• Scale all images to a common size
• Compare pixels using e.g. Euclidean distance
• Multi-scale Representations:
Direct Comparison of Images
• Pixel-wise:
Image Distortion Model
• Allow for small local displacements
• Computationally efficient
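The difference between a pixel-wise comparison and a comparison that tolerates small local displacements can be sketched as below. This is a simplified illustration, not the full image distortion model: it matches single grey values without local context, assumes both images have the same size, and the parameter name `warp` is hypothetical. With `warp=0` it reduces to the pixel-wise squared Euclidean distance.

```python
def idm_distance(a, b, warp=1):
    """Sum over pixels of a of the smallest squared difference to b within +-warp pixels."""
    height, width = len(a), len(a[0])
    total = 0
    for y in range(height):
        for x in range(width):
            # Search the (2 * warp + 1)^2 neighbourhood in b, clipped at the border.
            total += min(
                (a[y][x] - b[yy][xx]) ** 2
                for yy in range(max(0, y - warp), min(height, y + warp + 1))
                for xx in range(max(0, x - warp), min(width, x + warp + 1))
            )
    return total


# The bright pixel is shifted by one position between the two toy images.
a = [[0, 0, 0],
     [0, 9, 0],
     [0, 0, 0]]
b = [[0, 0, 0],
     [0, 0, 9],
     [0, 0, 0]]

pixelwise = idm_distance(a, b, warp=0)
tolerant = idm_distance(a, b, warp=1)
```

Each pixel is matched independently within a small window, so the cost grows only with the window size, which is what keeps this kind of model cheap to compute.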
Shape: GIST descriptor
• Describe the shapes occurring in an image with one descriptor
  – Subdivide image in 4×4 subimages
  – Calculate Gabor responses in each of these
  – Create histograms of Gabor responses in each subimage
  – Has been shown to be helpful to distinguish
    • Naturalness
    • Openness
    • Roughness
    • Expansion
    • Ruggedness
Shape: GIST descriptor
GIST descriptor: Oliva and Torralba, IJCV 2001
Slide by James Hays and Alexei Efros
Local Descriptors
• Definition:
  • Features extracted from local regions of the image
  • E.g. patches, SIFT features, local colour histograms, …
• Extraction position determined by interest points
• Known to achieve good results in many tasks
• Active field of research in object recognition, detection, scene classification, image annotation, and more recently: image retrieval
Local Descriptors: Interest Points
Local Descriptors: Patches
• Extract patches from the image
• Apply a PCA transformation to reduce dimensionality
• Can easily handle colour and gray value images
• All methods from invariant image comparison can be applied at patch level
Local Descriptors: SIFT
• Store a histogram of gradients in local areas
• SIFT = Scale Invariant Feature Transform
• Leading to 128-dimensional feature vectors
• Have been shown to perform well in many tasks
Figure by T. Weyand
Local Descriptors: Direct Retrieval
Histograms of Local Descriptors
Figure by T. Weyand
Histograms of Local Descriptors
Features for CBIR: Performance Evaluation
Correlation Between Features
1: colour histogram
2: MPEG7: colour layout
3: LF SIFT histogram
4: LF SIFT signature
5: LF SIFT global search
6: MPEG7: edge histogram
7: Gabor vector
8: Gabor histograms
9: gray value histogram
10: global texture feature
11: inv. feat. histo. color
12: LF patches global
13: LF patches histogram
14: LF patches signature
15: inv. feat. histo. rel.
16: MPEG7: scalable color
17: Tamura
18: 32x32 image
19: Xx32 image
Correlation Between Features
Combining Features
• Manually tuned
  – Have an 'expert' find a proper set of parameters
• Heuristic to capture different image properties
• Combination to reflect human perception
• Combination to obtain optimal performance (given a set of training queries)
Combining Features to Capture Different Image Properties
• Given the result from the correlation analysis, first choose a simple feature
• Then add features which have low correlation
Color Histogram: 50.5% MAP
+ Global Texture Features: 49.5% MAP
+ Tamura Texture Histogram: 51.2% MAP
+ Image Thumbnails: 53.9% MAP
+ Patch Histograms: 55.7% MAP
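The heuristic of starting with a simple feature and then adding weakly correlated ones can be sketched as a greedy selection. The selection rule (pick the candidate whose strongest correlation with the already chosen features is smallest) is an assumption for illustration, and the correlation values below are made-up toy numbers, not results from the tutorial.

```python
def select_features(corr, start, k):
    """Greedily add the feature least correlated with those already chosen."""
    chosen = [start]
    while len(chosen) < k:
        rest = [i for i in range(len(corr)) if i not in chosen]
        # Score a candidate by its strongest correlation with any chosen feature.
        nxt = min(rest, key=lambda i: max(abs(corr[i][j]) for j in chosen))
        chosen.append(nxt)
    return chosen


names = ["colour histogram", "colour layout", "Tamura", "thumbnail"]
# Hypothetical symmetric correlation matrix between the four features.
corr = [
    [1.0, 0.8, 0.1, 0.4],
    [0.8, 1.0, 0.2, 0.5],
    [0.1, 0.2, 1.0, 0.3],
    [0.4, 0.5, 0.3, 1.0],
]
picked = select_features(corr, start=0, k=3)
picked_names = [names[i] for i in picked]
```

Low correlation only suggests that a feature adds new information; whether retrieval quality actually improves still has to be checked on evaluation queries.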
Combining Features Reflecting Human Perception
• Comparison of human perception of image similarity and low-level image descriptors:
Combining Features Reflecting Human Perception
• Combinations of image descriptors to achieve best compliance with human perception
Combining Features Reflecting Human Perception
Combining Features to Obtain Optimal Performance
• Consider retrieval to be a two-class problem
  • Classify images as 'relevant' or 'irrelevant'
• Use trained classifiers
• Requires training data
  – Can be obtained from experts
  – Or relevance feedback (more on this later)
Combining Features: Maximum Entropy / Log-Linear Models
• Learn a model to predict probabilities for an image to be relevant
• Each feature function describes some similarity/dissimilarity of the query Q and image X
• The lambdas (λ) are the parameters to be trained
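For two classes, a log-linear model can be written in logistic form, which allows a compact sketch. The code below is an illustration under stated assumptions: the feature functions are per-descriptor distances between query and image, training is plain stochastic gradient ascent on the log-likelihood, and the training pairs are hypothetical. It is not the training procedure used in the tutorial's experiments.

```python
import math


def p_relevant(lambdas, bias, f):
    """Probability of 'relevant' given feature-function values f (logistic form)."""
    z = bias + sum(l * v for l, v in zip(lambdas, f))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, steps=2000, lr=0.5):
    """Fit the lambdas by stochastic gradient ascent on the log-likelihood."""
    lambdas = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(steps):
        for f, y in zip(samples, labels):
            err = y - p_relevant(lambdas, bias, f)
            bias += lr * err
            lambdas = [l + lr * err * v for l, v in zip(lambdas, f)]
    return lambdas, bias


# f = (colour distance, texture distance) for hypothetical query/image pairs;
# label 1 = relevant, 0 = irrelevant.
samples = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
labels = [1, 1, 0, 0]
lambdas, bias = train(samples, labels)

p_near = p_relevant(lambdas, bias, (0.1, 0.1))
p_far = p_relevant(lambdas, bias, (0.9, 0.9))
```

Because the inputs are distances, the learned lambdas come out negative here: a larger distance lowers the predicted probability of relevance.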
Trained Lambdas for medical retrieval
1. En
2. Fr
3. Ge
4. Colour
5. Gray
6. GTF
7. IFH
8. Tamura
9. 32x32
10. PatchHisto
Combining Features: Support Vector Machines
• Train a two-class SVM, +1 = relevant, -1 = irrelevant
• Since SVMs sometimes have problems with non-uniformly distributed classes, subsample training samples to have approximately the same amount of positive and negative samples
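The subsampling step can be sketched independently of any particular SVM library. The helper below, with the hypothetical name `balance`, randomly reduces the larger class to the size of the smaller one; the resulting list could then be passed to the SVM trainer of choice.

```python
import random


def balance(samples, labels, seed=0):
    """Subsample the larger class so both classes have the same number of samples."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    pos = [s for s, y in zip(samples, labels) if y == +1]
    neg = [s for s, y in zip(samples, labels) if y == -1]
    k = min(len(pos), len(neg))
    data = [(s, +1) for s in rng.sample(pos, k)]
    data += [(s, -1) for s in rng.sample(neg, k)]
    rng.shuffle(data)
    return data


# Toy data: 10 relevant (+1) and 90 irrelevant (-1) samples.
samples = list(range(100))
labels = [+1 if n < 10 else -1 for n in samples]
balanced = balance(samples, labels)
```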
Available Resources
• Image Retrieval Systems:
  – GIFT – GNU Image Finding Tool
    • http://www.gnu.org/software/gift/
    • Full image retrieval system
    • Following the discrete approach
  – FIRE – Flexible Image Retrieval Engine
    • http://www-i6.informatik.rwth-aachen.de/~deselaers/fire/
    • Research image retrieval system
    • Developed to allow for easy extension
    • Following the continuous approach
  – openCV – computer vision library
    • http://sourceforge.net/projects/opencvlibrary/
    • Implements many image processing operations
    • E.g. face detection and recognition, feature extraction
Available Resources
Datasets
• IAPR TC12 Dataset
  – Used in ImageCLEF since 2006
  – 20,000 images with text in English, German, Spanish
  – Available from www.imageclef.org/photodata
• ImageCLEFmed Datasets
  – Nearly 70,000 images
  – From 50,000 medical cases
  – Images and text
Available Resources
Datasets
• Corel dataset
  – Widely used but not freely available
  – Questionable for evaluation
  – Different sets are in use in the CBIR community
    • Between 1,000 and 200,000 images
• MSRC dataset
  – Provided by Microsoft Research, Cambridge, UK
  – 4320 images from 45 classes
• Flickr
  – Difficult to use for evaluation although large number of images
References
• Overview
  – Smeulders, A. W., Worring, M., Santini, S., Gupta, A., and Jain, R. 2000. Content-Based Image Retrieval at the End of the Early Years. IEEE Trans. Pattern Anal. Mach. Intell. 22, 12 (Dec. 2000), 1349-1380.
  – Rui, Y., Huang, T., & Chang, S. (1999). Image retrieval: Current techniques, promising directions and open issues. Journal of Visual Communication and Image Representation, 10(4), 39-62.
  – Datta, R., Li, J., & Wang, J. Z. (2005). Content-based image retrieval: approaches and trends of the new age. In ACM Intl. Workshop on Multimedia Information Retrieval, ACM Multimedia. Singapore.
  – Lew, M. S., Sebe, N., Djeraba, C., & Jain, R. (2006). Content-based multimedia information retrieval: State of the art and challenges. ACM Transactions on Multimedia Computing, Communications and Applications, 2(1), 1-19.
  – Henning Müller, Nicolas Michoux, David Bandon, Antoine Geissbuhler, A review of content-based image retrieval systems in medicine - clinical benefits and future directions, International Journal of Medical Informatics, volume 73, pages 1-23, 2004
• Features:
  – Deselaers T., Keysers D., Ney H., "Features for Image Retrieval: An Experimental Comparison", Information Retrieval, vol. 11, issue 2, The Netherlands, Springer, pp. 77-107, 03/2008.
  – SIFT Descriptors
    • Lowe, David G. (1999). "Object recognition from local scale-invariant features". Proceedings of the International Conference on Computer Vision 2: 1150-1157.
    • Lowe, David G. (2004). "Distinctive Image Features from Scale-Invariant Keypoints". International Journal of Computer Vision 60(2): 91-110.
  – Patch Histograms
    • Deselaers T., Keysers D., Ney H., "Discriminative Training for Object Recognition using Image Patches", CVPR, vol. 2, San Diego, CA, USA, IEEE, pp. 157-162, 20/06/2005.
  – GIST
    • Modeling the shape of the scene: a holistic representation of the spatial envelope. Aude Oliva, Antonio Torralba. International Journal of Computer Vision, Vol. 42(3): 145-175, 2001
References
• Continuous/Discrete
  – FIRE: see features
  – GIFT: Squire, D. M., Müller, W., Müller, H., & Raki, J. (1999) Content-Based query of image databases, inspirations from text retrieval: Inverted files, frequency-based weights and relevance feedback. In Scandinavian Conference on Image Analysis (pp. 143-149). Kangerlussuaq.
  – de Vries, A. P., & Westerveld, T. (2004). A comparison of continuous vs. discrete image models for probabilistic image and video retrieval. In Proc. International Conference on Image Processing (pp. 2387-2390). Singapore.
• Datasets
  – Grubinger M., Clough P. D., Müller H., Deselaers T., "The IAPR Benchmark: A New Evaluation Resource for Visual Information Systems", International Conference on Language Resources and Evaluation, Genoa, Italy, 24/05/2006.
  – Müller, H., Marchand-Maillet, S., and Pun, T. 2002. The Truth about Corel - Evaluation in Image Retrieval. In Proceedings of the International Conference on Image and Video Retrieval (July 18-19, 2002). M. S. Lew, N. Sebe, and J. P. Eakins, Eds. Lecture Notes In Computer Science, vol. 2383. Springer-Verlag, London, 38-49.
  – www.imageclef.org
• Learning Feature Combinations
  – Deselaers T., Weyand T., Ney H., "Image Retrieval and Annotation Using Maximum Entropy", CLEF Workshop 2006, vol. 4730, Alicante, Spain, Springer, pp. 725-734, 20/09/2006, 2007.
  – Gass T., Weyand T., Deselaers T., Ney H., "FIRE in ImageCLEF 2007: Support Vector Machines and Logistic Regression to Fuse Image Descriptors for Photo Retrieval", Advances in Multilingual and Multimodal Information Retrieval, 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, vol. 5152, Budapest, Hungary, Springer, 19/09/2007, 2008.
• Efficient NN search
  – Hjaltason G. R., Samet H.: Ranking in Spatial Databases, Int. Symp. on Large Spatial Databases (SSD), 1995.
  – Berchtold, Böhm, Keim, Kriegel: A Cost Model for Nearest Neighbor Search in High-Dimensional Space, ACM Symposium on Principles of Database Systems, 1997.