content‐based image retrievalthomas.deselaers.de/teaching/files/tutorial_icpr08/02-cbir.pdf ·...


Content-based Image Retrieval

Tutorial Image Retrieval
Thomas Deselaers, Henning Müller

Outline

•  What is CBIR
•  Approaches
   –  Continuous approach
   –  Discrete approach
•  Features for content-based image retrieval
   –  Global
   –  Local
•  Relationships between different approaches
•  Available resources

Content-based Image Retrieval

•  Two main questions
   –  How are the images represented? (features)
   –  How are the image descriptors compared? (similarity measures)
•  Two main views on the concept of a `feature':
   –  Features are numerical values computed from each image
      •  View connected to image classification
      •  Ideas and methods from classification and machine learning
   –  Features are image properties that are present or absent
      •  View connected to textual information retrieval
      •  Ideas and methods from text retrieval

Continuous Approach

•  Inspired by nearest-neighbour classification
•  Each image is represented by a feature vector
•  Feature vectors are compared using distance functions
•  Images similar to a query image are assumed to be relevant
•  Normally: query-by-visual example(s)

Nearest Neighbour

•  Voronoi diagram in a 2D space
•  A set of points and the corresponding Voronoi cells
•  A Voronoi cell is the area where a point's nearest neighbour is the seed of the cell

Source: Wikipedia

Nearest Neighbour

•  Standard nearest neighbour search

   X[1] … X[N]: dataset
   q: query
   mindist = ∞
   bestN = -1
   for n = 1:N:
       d = dist(q, X[n])
       if d < mindist:
           mindist = d
           bestN = n
   return bestN
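The linear scan above can be written as a short runnable sketch. The Euclidean distance and the toy 2D dataset are assumptions made for illustration; the slide leaves `dist` open, and any distance function could be substituted.

```python
import math

def nearest_neighbour(query, dataset):
    """Return the index of the dataset vector closest to `query` (-1 if empty)."""
    min_dist = math.inf
    best_n = -1
    for n, x in enumerate(dataset):
        d = math.dist(query, x)  # Euclidean distance (an assumption, see lead-in)
        if d < min_dist:
            min_dist = d
            best_n = n
    return best_n

# Toy dataset of 2D feature vectors (made up for illustration)
dataset = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
print(nearest_neighbour((0.9, 1.2), dataset))
```

Every query touches all N vectors, which is the cost that the index structures on the following slides try to avoid.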

R*-Tree

•  Data structure for multi-dimensional data
•  Save data in a tree structure
•  Directory nodes (blue)
•  Data nodes (red)
•  Save minimum bounding rectangles in directory nodes

[Figure: R*-tree example with root, pages p1 and p2, and child pages p11, p12, p13, p21, p22, p23]

Index Structures: Complexity

•  Linear search
   –  Complexity O(n/C), very small overhead
•  Bad situation for index structures
   –  Large range
   –  Strongly overlapping regions
   –  Few regions are not accessed
   –  Common with high dimensionalities
   –  Complexity O(n/C), high overhead
•  Good situation
   –  Small range
   –  Small overlaps
   –  Many regions do not have to be read
   –  Complexity O(log_C(n))

Nearest Neighbour Search with Index

Init: resultdist = ∞

Function SimpleNNQuery(Point q, Address page):
    page.load()
    if isDataPage(page):
        for x in page.points:
            d = dist(q, x)
            if d < resultdist:
                resultdist = d
                result = x
    else:
        for p in page.childPages:
            if MINDIST(q, p) < resultdist:
                SimpleNNQuery(q, p)

•  First path finds arbitrary point
•  Search space only slowly reduced
•  Many pages unnecessarily read
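The pseudocode uses MINDIST(q, p) without defining it. A minimal sketch, assuming each page stores an axis-aligned minimum bounding rectangle given by its lower and upper corners (`lo` and `hi` are hypothetical names) and that distances are Euclidean:

```python
import math

def mindist(q, lo, hi):
    """Smallest Euclidean distance from point q to the axis-aligned box [lo, hi]."""
    total = 0.0
    for c, l, h in zip(q, lo, hi):
        if c < l:
            total += (l - c) ** 2   # q lies below the box along this axis
        elif c > h:
            total += (c - h) ** 2   # q lies above the box along this axis
        # otherwise q is within the box's extent on this axis and adds nothing
    return math.sqrt(total)

print(mindist((0.0, 0.0), (3.0, 4.0), (5.0, 6.0)))  # closest box point is the corner (3, 4)
print(mindist((4.0, 5.0), (3.0, 4.0), (5.0, 6.0)))  # q lies inside the box
```

No point stored in a page can be closer to q than this value, which is why a page whose MINDIST is not below the current best distance can be skipped.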

Example Query

[Figure: example query on the R*-tree with root, pages p1 and p2, and child pages p11, p12, p13, p21, p22, p23]

•  If the search had started with p2, no page from p1 would have been read
•  Clearly non-optimal

Nearest Neighbour Search with Index: Best-First Search

•  Avoid recursion
•  Instead use a priority queue APL (active page list)
   –  List which contains the directory pages to be processed, sorted by priority
•  Definition: a page p is active if and only if
   –  p not yet processed
   –  Parent of p processed
   –  Minimum distance between p and q < current best distance
•  Initialisation: APL = [root]
•  In each step, process the best page from the APL
   –  Data pages: as before
   –  Directory pages: check minimum distance to query

Best-First Nearest Neighbour Search with Index

Init: apl = [(0.0, root)]   // sorted by dist
      resultdist = ∞

while apl.notEmpty() and apl[0].dist < resultdist:
    page = apl[0].load()
    delete(apl[0])
    if isDataPage(page):
        for x in page.points:
            d = dist(q, x)
            if d < resultdist:
                resultdist = d
                result = x
    else:
        for p in page.childPages:
            h = MINDIST(q, p)
            if h < resultdist:
                apl.insert((h, p))
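The best-first loop can be sketched with a binary heap as the active page list. The `Page` class, the in-memory tree, and the toy coordinates are assumptions made for illustration; a real index would load pages from disk.

```python
import heapq
import itertools
import math
from dataclasses import dataclass, field

@dataclass
class Page:
    name: str
    lo: tuple                                      # lower corner of the bounding box
    hi: tuple                                      # upper corner of the bounding box
    points: list = field(default_factory=list)    # non-empty for data pages
    children: list = field(default_factory=list)  # non-empty for directory pages

def mindist(q, page):
    """Smallest Euclidean distance from q to the page's bounding box."""
    return math.sqrt(sum(max(l - c, 0.0, c - h) ** 2
                         for c, l, h in zip(q, page.lo, page.hi)))

def best_first_nn(q, root):
    """Return (distance, point, names of loaded pages) for the nearest neighbour of q."""
    tie = itertools.count()             # tie-breaker so Page objects are never compared
    apl = [(0.0, next(tie), root)]      # active page list, ordered by MINDIST
    result_dist, result, loaded = math.inf, None, []
    while apl and apl[0][0] < result_dist:
        _, _, page = heapq.heappop(apl)
        loaded.append(page.name)
        if page.points:                 # data page: compare every stored point
            for x in page.points:
                d = math.dist(q, x)
                if d < result_dist:
                    result_dist, result = d, x
        else:                           # directory page: enqueue promising children
            for child in page.children:
                h = mindist(q, child)
                if h < result_dist:
                    heapq.heappush(apl, (h, next(tie), child))
    return result_dist, result, loaded

# Toy tree (coordinates made up for illustration)
p11 = Page("p11", (0, 0), (2, 2), points=[(1, 1), (2, 2)])
p21 = Page("p21", (6, 6), (8, 8), points=[(6, 6), (8, 7)])
p22 = Page("p22", (9, 0), (10, 2), points=[(9, 1), (10, 2)])
p1 = Page("p1", (0, 0), (2, 2), children=[p11])
p2 = Page("p2", (6, 0), (10, 8), children=[p21, p22])
root = Page("root", (0, 0), (10, 8), children=[p1, p2])

dist, point, loaded = best_first_nn((7, 5), root)
print(dist, point, loaded)
```

In this toy run only root, p2 and p21 are loaded: once a point at distance √2 is found, every page left in the queue has a larger MINDIST and the loop stops.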

Example Query with Best-First Search

[Figure: example query on the R*-tree with root, pages p1 and p2, and child pages p11, p12, p13, p21, p22, p23]

APL: [(0.0, root)]   resultdist: ∞
APL: [(7.2, p1)]   resultdist: ∞
APL: [(6.8, p2) (7.2, p1)]   resultdist: ∞
APL: [(7.2, p1) (8.2, p21)]   resultdist: ∞
APL: [(7.0, p22) (7.2, p1) (8.2, p21)]   resultdist: ∞
APL: [(6.9, p23) (7.0, p22) (7.2, p1) (8.2, p21)]   resultdist: ∞
APL: [(7.0, p22) (7.2, p1) (8.2, p21)]   resultdist: 7.1
APL: [(7.2, p1) (8.2, p21)]   resultdist: 7.0

Best-First Search is Optimal

Here: a draft of the proof

1.  Completeness: it will find the correct NN of a query
    –  Every correct algorithm has to access all pages that intersect with the NN sphere of q
    –  These pages have MINDIST < resultdist
2.  It accesses pages in ascending order of distance from the query
    –  The APL is sorted by MINDIST and the algorithm will terminate once it is impossible to find any point closer to q than the current result
3.  It will not access a single page with MINDIST larger than the true NN distance
    –  A child page cannot have a MINDIST smaller than that of its parent

Continuous Approach

•  Simple case:
•  Database: B = {X_1, …, X_N}
•  Query: Q
•  Distance between Q and X_n: d(Q, X_n)
•  Sort the images X_n such that d(Q, X_n) ≤ d(Q, X_n') holds for all n and n' with n < n'

Continuous Approach

•  How to use multiple features
•  Each image is represented by several descriptors
•  Calculate a score instead of a distance to allow for different weights
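One way to turn several per-descriptor distances into a single weighted score is sketched below. The exponential of the negative weighted sum, the descriptor names, and the weights are all assumptions made for illustration; the score formula on the slide did not survive in this transcript.

```python
import math

def l1(a, b):
    """City-block distance between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def score(query, image, distances, weights):
    """Combine per-descriptor distances into one score (higher = more similar)."""
    total = sum(weights[name] * dist(query[name], image[name])
                for name, dist in distances.items())
    return math.exp(-total)

distances = {"colour": l1, "texture": math.dist}  # hypothetical descriptors
weights = {"colour": 1.0, "texture": 0.5}         # hypothetical weights

query = {"colour": [0.5, 0.5], "texture": [1.0, 0.0]}
database = {
    "img_a": {"colour": [0.5, 0.5], "texture": [1.0, 0.0]},
    "img_b": {"colour": [0.9, 0.1], "texture": [0.0, 1.0]},
}
ranking = sorted(database,
                 key=lambda k: score(query, database[k], distances, weights),
                 reverse=True)
print(ranking)
```

Because each descriptor has its own weight, a descriptor with a large numeric range can be kept from dominating the ranking.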

Discrete Approach

•  Inspired by textual information retrieval
•  Each image is represented by a set of binary features (features may be present (possibly multiple times) or absent)
•  Feature is either present or absent
   –  Similar to words being absent or present in a document
•  Images containing the same (informative) features are assumed to be relevant to a query
•  Example: GIFT (GNU Image Finding Tool)

Discrete Approach

•  GIFT uses TF-IDF (term frequency / inverse document frequency) ranking
   –  Reduce the impact of features which occur frequently in the data (comparable to "the" in texts)
•  TF: frequency of a feature i in a document d_j
•  IDF: measures the importance of a term

Discrete Approach

•  tf captures how often a feature occurs in a document
   –  Features that occur often in a document describe this document well
•  idf captures how relevant a feature is
   –  Features that occur rarely in the full database are important
•  Important are those features which occur often in one image, but seldom over the full dataset
   –  Images which share rare features are relevant with respect to each other
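The slide formulas for tf and idf are not preserved here, so the sketch below uses the common textbook definitions as an assumption: tf is the relative frequency of a feature within one image, and idf is the logarithm of the number of images divided by the number of images containing the feature. The feature names are made up, and GIFT's actual weighting may differ.

```python
import math
from collections import Counter

def tf_idf(documents):
    """documents: list of feature lists. Returns one {feature: weight} dict per document."""
    n_docs = len(documents)
    # document frequency: in how many documents does each feature occur at least once?
    df = Counter(f for doc in documents for f in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({f: (c / len(doc)) * math.log(n_docs / df[f])
                        for f, c in counts.items()})
    return weights

# Each image is described by a list of discrete features (made up for illustration)
images = [
    ["blue", "blue", "sky", "edge"],
    ["green", "edge", "edge", "grass"],
    ["blue", "edge", "sea", "sea"],
]
weights = tf_idf(images)
for w in weights:
    print(w)
```

The feature "edge" occurs in every image and therefore gets weight 0, while "sea" is frequent in one image and absent elsewhere and gets the largest weight in that image.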

Discrete Approach in GIFT

•  In GIFT, 4 different feature sets are considered
   –  Global colour
   –  Local colour
   –  Global texture
   –  Local texture
•  For all local features a tf-idf-like score is calculated and these are fused as a weighted sum
   –  Global features are compared with a histogram intersection
•  For many cases, the discrete and the continuous approach can each be simulated in the other
•  In GIFT
   –  Images have about 1,500 to 2,000 features
   –  Similar images share about 400-500

Discrete Approach: Inverted Files

•  Store a mapping from content to location
   –  E.g. for each feature a list of images that contain this feature
•  Allow for efficient searches even for huge numbers of images
   –  idf for each feature can be pre-calculated
   –  tf for each feature is stored in the files
   –  Allows for searching without accessing the images
•  In the continuous approach, searching for neighbours is linear in the number of images; here it is at most linear in the number of features in an image
   –  In practice, the features with high impact are processed first and the other features have less influence (Zipf's Law). Therefore, in practice, this approach is very fast

The inverted file

[Figure: the inverted file. Each feature (Feature 1, Feature 2, ..., Feature n-1, Feature n) points to a list of the images that contain it, with lists such as "Image 5, Image 7, Image 1, Image 25".]

Inverted Files

•  Access feature by feature instead of image by image
•  Extremely fast access for rare features
•  Efficient for sparsely populated spaces
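A minimal in-memory sketch of an inverted file follows. The image names and feature ids are made up, and ranking by the raw count of shared features is a simplification chosen for brevity; it stands in for the weighted scoring described earlier.

```python
from collections import defaultdict

def build_inverted_file(images):
    """images: {image_id: set of feature ids}. Returns {feature id: list of image ids}."""
    inverted = defaultdict(list)
    for image_id, features in images.items():
        for f in features:
            inverted[f].append(image_id)
    return inverted

def query(inverted, query_features):
    """Rank images by the number of features they share with the query."""
    shared = defaultdict(int)
    for f in query_features:                  # one lookup per query feature
        for image_id in inverted.get(f, ()):  # only images containing f are touched
            shared[image_id] += 1
    return sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))

images = {"img1": {1, 2, 3}, "img2": {2, 3, 4}, "img3": {7}}
inverted = build_inverted_file(images)
result = query(inverted, {2, 3, 4})
print(result)
```

Note that "img3" never appears in the result and is never visited: the cost depends on the query's features and the length of their lists, not on the size of the database.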

Visual Properties of Images

•  Colour
•  Texture
•  Shapes
•  Image parts
•  Complete image
•  Metadata
•  Textual labels / captions / annotations

Features for Content-based Image Retrieval

•  Global Descriptors
   –  Colour histograms
   –  Texture features
   –  Shape features
•  Local Descriptors
   –  Direct approach
   –  Patch-histograms / bag-of-visual words
   –  SIFT features

Global Descriptors

•  Capture a property of the image with few values
•  Describing the image in its entirety
•  E.g.
   –  Which colours occur in the image?
   –  Is the image of high contrast?
   –  Is the image bright/dark?

Colour Histograms

•  Describe the distribution of colours in an image
•  Discard spatial information

1.  Quantise the colour space
2.  Count how often each colour occurs
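The two steps can be sketched in plain Python. Uniform quantisation of each RGB channel and the normalisation by the pixel count are assumptions; the slides do not fix the quantisation scheme or the colour space.

```python
def colour_histogram(pixels, bins_per_channel=4):
    """pixels: iterable of (r, g, b) with values 0..255. Returns a normalised histogram."""
    hist = [0] * bins_per_channel ** 3
    n = 0
    for r, g, b in pixels:
        # Step 1: quantise each channel into bins_per_channel equal-width bins
        qr, qg, qb = (v * bins_per_channel // 256 for v in (r, g, b))
        # Step 2: count the quantised colour
        hist[(qr * bins_per_channel + qg) * bins_per_channel + qb] += 1
        n += 1
    return [h / n for h in hist] if n else hist

# Four pixels: two reddish, one blue, one black (made up for illustration)
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 0, 0)]
hist = colour_histogram(pixels, bins_per_channel=2)
print(hist)
```

The two reddish pixels land in the same bin, and the position of each pixel plays no role, which is exactly the loss of spatial information mentioned above.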

Colour Histograms: Example

[Figure: colour histograms of example images in RGB colour space and in HSV colour space]

Visualisation done with 3D Color Inspector: http://rsbweb.nih.gov/ij/plugins/color-inspector.html

Texture Features

"Texture refers to the properties held and sensations caused by the external surface of objects received through the sense of touch."

Texture in image processing:
•  Different definitions
•  Different representations

Tamura Features

•  Proposed by Tamura [1978]
   –  Features corresponding to human perception
   –  Examined 6 different features, found 3 to correspond strongly to human perception
   –  Coarseness: coarse vs. fine
   –  Contrast: high vs. low
   –  Directionality: directional vs. non-directional
   –  Line-likeness: line-like vs. non-line-like
   –  Regularity: regular vs. irregular
   –  Roughness: rough vs. smooth

Tamura Features

•  Originally one value per feature per image was determined
•  Create a descriptor by
   1.  Calculating coarseness, contrast, and directionality in a local neighbourhood for each pixel
   2.  Creating a joint histogram over these values

Gabor Features

•  Obtain several values per pixel denoting spatial frequencies and directions

Gabor Features

•  Windowed Fourier transform with a Gaussian as window function
•  Gabor features:
   –  Yield one value per pixel (for grey value images)
   –  For colour images use a transformed HSV space
•  Compute global (histogram) or local features from Gabor responses

Gray-Level Co-Occurrence Matrices

•  Statistical descriptor for texture properties of an image by comparing neighbouring pixels
   –  Direction and distance
   –  Extract features from matrix
      •  Entropy
      •  Contrast
      •  Correlation
      •  …
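A small sketch of a co-occurrence matrix and two of the features derived from it. The offset convention (dx, dy), the non-symmetric counting, and the textbook forms of contrast and entropy are assumptions; the image is a tiny made-up example with four grey levels and must be at least two pixels wide for the default offset.

```python
import math

def glcm(image, dx=1, dy=0, levels=4):
    """Normalised co-occurrence matrix for pixel pairs separated by the offset (dx, dy)."""
    matrix = [[0.0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    total = 0
    for y in range(height):
        for x in range(width):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < width and 0 <= y2 < height:
                matrix[image[y][x]][image[y2][x2]] += 1  # count the grey-level pair
                total += 1
    return [[v / total for v in row] for row in matrix]

def contrast(p):
    """Weights each pair by its squared grey-level difference."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))

def entropy(p):
    """Shannon entropy (in bits) of the pair distribution."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = glcm(image)          # horizontal neighbours at distance 1
print(contrast(p), entropy(p))
```

Changing (dx, dy) selects the direction and distance mentioned above, so several matrices are usually computed per image.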

Histogram Comparison

•  ‐distances

•  Jensen-Shannon divergence
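The formulas on this slide are not preserved in the transcript. As a hedged illustration, the sketch below uses the standard definition of the Jensen-Shannon divergence (base-2 logarithm, factor 1/2) together with the histogram intersection mentioned for GIFT's global features; the tutorial's own normalisation may differ.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def jensen_shannon(p, q):
    """Symmetric divergence between two normalised histograms (0 = identical)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def histogram_intersection(p, q):
    """Similarity between two normalised histograms (1 = identical)."""
    return sum(min(a, b) for a, b in zip(p, q))

print(jensen_shannon([0.5, 0.5], [0.5, 0.5]))          # identical histograms
print(jensen_shannon([1.0, 0.0], [0.0, 1.0]))          # disjoint histograms
print(histogram_intersection([0.2, 0.8], [0.5, 0.5]))
```

Unlike the plain Kullback-Leibler divergence, the Jensen-Shannon divergence stays finite when one histogram has empty bins, which is common for sparse colour histograms.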

Pixel Values as Features

•  Most straightforward
•  Scale all images to a common size
•  Compare pixels using e.g. Euclidean distance
•  Multi-scale representations:

Direct Comparison of Images

•  Pixel-wise:

Image Distortion Model

•  Allow for small local displacements
•  Computationally efficient
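The difference between pixel-wise comparison and a distortion model can be sketched as follows. This is a deliberately simplified version that matches single grey values within a small window; the `warp` parameter name and the toy images are assumptions, and published variants of the image distortion model typically compare local context rather than single pixels.

```python
def euclidean_sq(a, b):
    """Squared pixel-wise Euclidean distance between two equally sized grey images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def idm_distance(a, b, warp=1):
    """Each pixel of `a` may match any pixel of `b` at most `warp` positions away;
    the cheapest match is used."""
    height, width = len(a), len(a[0])
    total = 0
    for y in range(height):
        for x in range(width):
            total += min((a[y][x] - b[y2][x2]) ** 2
                         for y2 in range(max(0, y - warp), min(height, y + warp + 1))
                         for x2 in range(max(0, x - warp), min(width, x + warp + 1)))
    return total

# The same vertical bar, shifted by one pixel (made up for illustration)
a = [[0, 9, 0],
     [0, 9, 0],
     [0, 9, 0]]
b = [[0, 0, 9],
     [0, 0, 9],
     [0, 0, 9]]
print(euclidean_sq(a, b), idm_distance(a, b, warp=1))
```

The pixel-wise distance penalises the one-pixel shift heavily, while the distortion model absorbs it. Each pixel is matched independently, so the cost grows only with the window size.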

Shape: GIST descriptor

•  Describe the shapes occurring in an image with one descriptor
   –  Subdivide the image into 4×4 subimages
   –  Calculate Gabor responses in each of these
   –  Create histograms of Gabor responses in each subimage
   –  Has been shown to be helpful to distinguish
      •  Naturalness
      •  Openness
      •  Roughness
      •  Expansion
      •  Ruggedness

Shape: GIST descriptor

GIST descriptor: Oliva and Torralba, IJCV 2001

Slide by James Hays and Alexei Efros

Local Descriptors

•  Definition:
•  Features extracted from local regions of the image
•  E.g. patches, SIFT features, local colour histograms, …
•  Extraction position determined by interest points
•  Known to achieve good results in many tasks
•  Active field of research in object recognition, detection, scene classification, image annotation, and more recently: image retrieval

Local Descriptors: Interest Points

Local Descriptors: Patches

•  Extract patches from the image
•  Apply a PCA transformation to reduce dimensionality
•  Can easily handle colour and gray value images
•  All methods from invariant image comparison can be applied at patch level

Local Descriptors: SIFT

•  Store a histogram of gradients in local areas
•  SIFT = Scale Invariant Feature Transform
•  Leading to 128-dimensional feature vectors
•  Have been shown to perform well in many tasks

Figure by T. Weyand

Local Descriptors: Direct Retrieval

Histograms of Local Descriptors

Figure by T. Weyand

Features for CBIR: Performance Evaluation

Correlation Between Features

1: colour histogram
2: MPEG7: colour layout
3: LF SIFT histogram
4: LF SIFT signature
5: LF SIFT global search
6: MPEG7: edge histogram
7: Gabor vector
8: Gabor histograms
9: gray value histogram
10: global texture feature
11: inv. feat. histo. color
12: LF patches global
13: LF patches histogram
14: LF patches signature
15: inv. feat. histo. rel.
16: MPEG7: scalable color
17: Tamura
18: 32x32 image
19: Xx32 image

Combining Features

•  Manually tuned
   –  Have an `expert' find a proper set of parameters
•  Heuristic to capture different image properties
•  Combination to reflect human perception
•  Combination to obtain optimal performance (given a set of training queries)

Combining Features to Capture Different Image Properties

•  Given the result from the correlation analysis, first choose a simple feature
•  Then add features which have low correlation

Color Histogram: 50.5% MAP
+ Global Texture Features: 49.5% MAP
+ Tamura Texture Histogram: 51.2% MAP
+ Image Thumbnails: 53.9% MAP
+ Patch Histograms: 55.7% MAP

Combining Features Reflecting Human Perception

•  Comparison of human perception of image similarity and low-level image descriptors
•  Combinations of image descriptors to achieve best compliance with human perception

Combining Features to Obtain Optimal Performance

•  Consider retrieval to be a two-class problem
•  Classify images as `relevant' or `irrelevant'
•  Use trained classifiers
•  Requires training data
   –  Can be obtained from experts
   –  Or relevance feedback (more on this later)

Combining Features: Maximum Entropy / Log-Linear Models

•  Learn a model to predict probabilities for an image to be relevant
•  Each feature function describes some similarity/dissimilarity of the query Q and image X
•  The lambdas are the parameters to be trained
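The model equation itself is missing from this transcript. The sketch below shows the generic two-class log-linear (softmax) form as an assumption: one weight vector per class is applied to the feature functions, and the two exponentiated scores are normalised. The feature values and lambdas are made up.

```python
import math

def p_relevant(features, lambdas_rel, lambdas_irr):
    """Two-class log-linear model; returns p(relevant | Q, X).

    features: values of the feature functions for one (query, image) pair,
              e.g. one distance per image descriptor.
    """
    s_rel = sum(l * f for l, f in zip(lambdas_rel, features))
    s_irr = sum(l * f for l, f in zip(lambdas_irr, features))
    m = max(s_rel, s_irr)  # subtract the maximum for numerical stability
    e_rel, e_irr = math.exp(s_rel - m), math.exp(s_irr - m)
    return e_rel / (e_rel + e_irr)

lambdas_rel = [-2.0, -1.0]  # hypothetical trained weights: large distances lower relevance
lambdas_irr = [0.0, 0.0]

close = p_relevant([0.1, 0.2], lambdas_rel, lambdas_irr)
far = p_relevant([0.9, 0.8], lambdas_rel, lambdas_irr)
print(close, far)
```

Ranking the database by this probability gives a learned weighting of the individual descriptors in place of manually tuned weights.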

Trained Lambdas for medical retrieval

1.  En
2.  Fr
3.  Ge
4.  Colour
5.  Gray
6.  GTF
7.  IFH
8.  Tamura
9.  32x32
10. PatchHisto

Combining Features: Support Vector Machines

•  Train a two-class SVM, +1 = relevant, -1 = irrelevant
•  Since SVMs sometimes have problems with unevenly sized classes, subsample the training samples to have approximately the same number of positive and negative samples
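The subsampling step can be sketched independently of any SVM library. The function name, the fixed random seed, and the choice to shrink the larger class to exactly the size of the smaller one are assumptions made for illustration.

```python
import random

def balance(samples, labels, seed=0):
    """Subsample the larger class so both classes have the same number of samples."""
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == +1]
    neg = [s for s, y in zip(samples, labels) if y == -1]
    n = min(len(pos), len(neg))
    pos, neg = rng.sample(pos, n), rng.sample(neg, n)
    return pos + neg, [+1] * n + [-1] * n

# 2 relevant and 8 irrelevant training samples (made up for illustration)
samples = list(range(10))
labels = [+1] * 2 + [-1] * 8
xs, ys = balance(samples, labels)
print(xs, ys)
```

In retrieval the irrelevant class is usually far larger than the relevant one, so this step mostly discards negative samples.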

Available Resources

•  Image Retrieval Systems:
   –  GIFT (GNU Image Finding Tool)
      •  http://www.gnu.org/software/gift/
      •  Full image retrieval system
      •  Following the discrete approach
   –  FIRE (Flexible Image Retrieval Engine)
      •  http://www-i6.informatik.rwth-aachen.de/~deselaers/fire/
      •  Research image retrieval system
      •  Developed to allow for easy extension
      •  Following the continuous approach
   –  OpenCV (computer vision library)
      •  http://sourceforge.net/projects/opencvlibrary/
      •  Implements many image processing operations
      •  E.g. face detection and recognition, feature extraction

Available Resources: Datasets

•  IAPR TC12 Dataset
   –  Used in ImageCLEF since 2006
   –  20,000 images with text in English, German, Spanish
   –  Available from www.imageclef.org/photodata
•  ImageCLEFmed Datasets
   –  Nearly 70,000 images
   –  From 50,000 medical cases
   –  Images and text

Available Resources: Datasets

•  Corel dataset
   –  Widely used but not freely available
   –  Questionable for evaluation
   –  Different sets are in use in the CBIR community
      •  Between 1,000 and 200,000 images
•  MSRC dataset
   –  Provided by Microsoft Research, Cambridge, UK
   –  4320 images from 45 classes
•  Flickr
   –  Difficult to use for evaluation despite the large number of images

References

•  Overview
   –  Smeulders, A. W., Worring, M., Santini, S., Gupta, A., and Jain, R. 2000. Content-Based Image Retrieval at the End of the Early Years. IEEE Trans. Pattern Anal. Mach. Intell. 22, 12 (Dec. 2000), 1349-1380.
   –  Rui, Y., Huang, T., & Chang, S. (1999). Image retrieval: Current techniques, promising directions and open issues. Journal of Visual Communication and Image Representation, 10(4), 39-62.
   –  Datta, R., Li, J., & Wang, J. Z. (2005). Content-based image retrieval: approaches and trends of the new age. In ACM Intl. Workshop on Multimedia Information Retrieval, ACM Multimedia. Singapore.
   –  Lew, M. S., Sebe, N., Djeraba, C., & Jain, R. (2006). Content-based multimedia information retrieval: State of the art and challenges. ACM Transactions on Multimedia Computing, Communications and Applications, 2(1), 1-19.
   –  Henning Müller, Nicolas Michoux, David Bandon, Antoine Geissbuhler, A review of content-based image retrieval systems in medicine: clinical benefits and future directions, International Journal of Medical Informatics, volume 73, pages 1-23, 2004.
•  Features:
   –  Deselaers T., Keysers D., Ney H., "Features for Image Retrieval: An Experimental Comparison", Information Retrieval, vol. 11, issue 2, The Netherlands, Springer, pp. 77-107, 03/2008.
   –  SIFT Descriptors
      •  Lowe, David G. (1999). "Object recognition from local scale-invariant features". Proceedings of the International Conference on Computer Vision 2: 1150-1157.
      •  Lowe, David G. (2004). "Distinctive Image Features from Scale-Invariant Keypoints". International Journal of Computer Vision 60(2): 91-110.
   –  Patch Histograms
      •  Deselaers T., Keysers D., Ney H., "Discriminative Training for Object Recognition using Image Patches", CVPR, vol. 2, San Diego, CA, USA, IEEE, pp. 157-162, 20/06/2005.
   –  GIST
      •  Modeling the shape of the scene: a holistic representation of the spatial envelope. Aude Oliva, Antonio Torralba. International Journal of Computer Vision, Vol. 42(3): 145-175, 2001.

References

•  Continuous/Discrete
   –  FIRE: see features
   –  GIFT: Squire, D. M., Müller, W., Müller, H., & Raki, J. (1999). Content-based query of image databases, inspirations from text retrieval: Inverted files, frequency-based weights and relevance feedback. In Scandinavian Conference on Image Analysis (pp. 143-149). Kangerlussuaq.
   –  de Vries, A. P., & Westerveld, T. (2004). A comparison of continuous vs. discrete image models for probabilistic image and video retrieval. In Proc. International Conference on Image Processing (pp. 2387-2390). Singapore.
•  Datasets
   –  Grubinger M., Clough P. D., Müller H., Deselaers T., "The IAPR Benchmark: A New Evaluation Resource for Visual Information Systems", International Conference on Language Resources and Evaluation, Genoa, Italy, 24/05/2006.
   –  Müller, H., Marchand-Maillet, S., and Pun, T. 2002. The Truth about Corel: Evaluation in Image Retrieval. In Proceedings of the International Conference on Image and Video Retrieval (July 18-19, 2002). M. S. Lew, N. Sebe, and J. P. Eakins, Eds. Lecture Notes in Computer Science, vol. 2383. Springer-Verlag, London, 38-49.
   –  www.imageclef.org
•  Learning Feature Combinations
   –  Deselaers T., Weyand T., Ney H., "Image Retrieval and Annotation Using Maximum Entropy", CLEF Workshop 2006, vol. 4730, Alicante, Spain, Springer, pp. 725-734, 20/09/2006, 2007.
   –  Gass T., Weyand T., Deselaers T., Ney H., "FIRE in ImageCLEF 2007: Support Vector Machines and Logistic Regression to Fuse Image Descriptors in for Photo Retrieval", Advances in Multilingual and Multimodal Information Retrieval, 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, vol. 5152, Budapest, Hungary, Springer, 19/09/2007, 2008.
•  Efficient NN search
   –  Hjaltason G. R., Samet H.: Ranking in Spatial Databases, Int. Symp. on Large Spatial Databases (SSD), 1995.
   –  Berchtold, Böhm, Keim, Kriegel: A Cost Model for Nearest Neighbor Search in High-Dimensional Space, ACM Symposium on Principles of Database Systems, 1997.
