Introduction to Deep Convolutional Neural Networks

Ye Li ([email protected])
1 Dept. of Electrical and Computer Engineering, Whiting School of Engineering
2 Division of Medical Imaging Physics, Dept. of Radiology, School of Medicine
Johns Hopkins University
Contents
• Introduction to Neural Networks
• Introduction to Deep Convolutional Neural Networks (DCNN)
• Deep Learning in Medical Image Segmentation
• DCNN Layer Functionality
• DCNN Architecture Functionality
Conventional Neural Networks
Single-layer Perceptrons (SLP)

• Can classify linearly separable data into binary classes: -1 and 1.
• A feed-forward network based on a threshold transfer function
[Figure: a single-layer perceptron with inputs $x_1, x_2, \ldots, x_n$ (input layer), weights $\omega_1, \omega_2, \ldots, \omega_n$, and a single output $y$ (output layer).]

$$y = \sum_{i=1}^{n} \omega_i x_i$$

$$\text{Output} = \begin{cases} 1 & \text{if } y > \theta \\ -1 & \text{otherwise} \end{cases}$$
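A minimal sketch of this forward pass in Python with NumPy; the weights, threshold, and inputs are illustrative values, not from the slides:

```python
import numpy as np

def slp_predict(x, w, theta):
    """Single-layer perceptron: weighted sum followed by a hard threshold."""
    y = np.dot(w, x)                 # y = sum_i w_i * x_i
    return 1 if y > theta else -1    # threshold transfer function

# Illustrative values: 3 inputs with hand-picked weights and threshold.
x = np.array([0.5, -1.0, 2.0])
w = np.array([1.0, 0.5, 0.25])
print(slp_predict(x, w, theta=0.0))  # -> 1, since y = 0.5 - 0.5 + 0.5 > 0
```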
Multi-layer Perceptrons (MLP)

[Figure: a multi-layer perceptron with input layer $x_1, x_2, \ldots, x_n$, a hidden layer, and output layer $y_1, y_2$; weight $\omega_{ji}$ connects input $x_i$ to unit $j$.]

$$s_j = \sum_{i=1}^{n} \omega_{ji} x_i, \qquad y_j = f(s_j)$$
About MLP

• Differs from SLP in two ways:
- A soft thresholding function after each summation
- The introduction of hidden layers
• Many levels can be specified to model non-linear relationships
• The number of hidden units is related to the capacity of the perceptron
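A minimal sketch of an MLP forward pass, assuming a sigmoid as the soft thresholding function (the slides do not name a specific one); all sizes and values are illustrative:

```python
import numpy as np

def sigmoid(s):
    """Soft thresholding function applied after each summation."""
    return 1.0 / (1.0 + np.exp(-s))

def mlp_forward(x, W, V):
    """Two-layer MLP: hidden layer z = sigma(W x), output y = V z."""
    z = sigmoid(W @ x)   # hidden activations
    y = V @ z            # linear output layer, as on the slides
    return y, z

rng = np.random.default_rng(0)
x = rng.normal(size=4)          # n = 4 inputs
W = rng.normal(size=(3, 4))     # 3 hidden units
V = rng.normal(size=(2, 3))     # 2 outputs
y, _ = mlp_forward(x, W, V)
print(y)
```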
Backpropagation
• Apply the chain rule many, many times to calculate the gradient of a loss function with respect to all of the inputs (weights, input data) in the network.
Backpropagation (continued)

[Figure: computational graph for $f = (a \times b) + c$ with intermediate node $q = a \times b$; example values $a = -2$, $b = 5$, $c = -10$.]

$$\frac{\partial f}{\partial q} = 1, \qquad \frac{\partial f}{\partial c} = 1$$

$$\frac{\partial f}{\partial a} = \frac{\partial f}{\partial q}\frac{\partial q}{\partial a} = b = 5, \qquad \frac{\partial f}{\partial b} = \frac{\partial f}{\partial q}\frac{\partial q}{\partial b} = a = -2$$
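The same chain-rule bookkeeping written out as plain Python, using the example values from the slide:

```python
# Backpropagation on f = (a * b) + c, with a = -2, b = 5, c = -10.
a, b, c = -2.0, 5.0, -10.0

# Forward pass
q = a * b          # q = -10
f = q + c          # f = -20

# Backward pass: apply the chain rule node by node
df_dq = 1.0        # f = q + c  =>  df/dq = 1
df_dc = 1.0        # f = q + c  =>  df/dc = 1
df_da = df_dq * b  # dq/da = b  =>  df/da = 5
df_db = df_dq * a  # dq/db = a  =>  df/db = -2

print(df_da, df_db, df_dc)  # 5.0 -2.0 1.0
```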
Backpropagation for MLP

[Figure: an MLP with inputs $x_1, \ldots, x_n$, hidden units $z_1, \ldots, z_m$, and outputs $y_1, \ldots, y_i, \ldots$; weights $\omega_{ji}$ connect the inputs to the hidden layer, and weights $v_{i1}, v_{i2}, \ldots, v_{im}$ connect the hidden layer to output $y_i$.]

$$z_j = \sigma\!\left(\sum_{i=1}^{n} \omega_{ji} x_i\right), \qquad y_i = \sum_{j=1}^{m} v_{ij} z_j$$
Backpropagation for MLP (cont’d)

• Loss function
• Update terms are the negative derivatives of the loss w.r.t. the local parameters (weights)
• By defining $z_j = \sigma\!\left(\sum_{i=1}^{n} \omega_{ji} x_i\right)$ and $E = \sum_{i=1}^{n} \left( y_i - \sum_j v_{ij} z_j \right)^2$ (with $y_i$ here denoting the target output) ...
Detailed derivations are available at: http://garyliye.com/Multilayer_perceptron_and_backpropagration.pdf
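As a sketch of what those derivations produce: the gradients of the squared-error loss above for a single sample, assuming a sigmoid hidden layer; the targets `t` and the learning rate 0.01 are illustrative choices:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def mlp_gradients(x, t, W, V):
    """Gradients of E = sum_i (t_i - sum_j v_ij z_j)^2 for one sample."""
    z = sigmoid(W @ x)                    # hidden activations z_j
    y = V @ z                             # network outputs
    err = t - y                           # (t_i - y_i)
    dE_dV = -2.0 * np.outer(err, z)       # dE/dv_ij = -2 (t_i - y_i) z_j
    delta = (V.T @ err) * z * (1.0 - z)   # chain rule through sigma: sigma' = z(1-z)
    dE_dW = -2.0 * np.outer(delta, x)     # dE/dw_jk = -2 delta_j x_k
    return dE_dW, dE_dV

rng = np.random.default_rng(1)
x, t = rng.normal(size=4), rng.normal(size=2)
W, V = rng.normal(size=(3, 4)), rng.normal(size=(2, 3))
dW, dV = mlp_gradients(x, t, W, V)
W -= 0.01 * dW  # gradient-descent update: step against the gradient
V -= 0.01 * dV
```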
What about a Real-world Image?

[Figure: a fully connected network on a 23×23 image. The 529 input pixels (input layer) connect to 200 hidden units $z_1, \ldots, z_{200}$ through weights $\omega_{1,1}, \ldots, \omega_{1,529}, \ldots$, and the hidden units connect to output $y_1$ through weights $V_{1,1}, \ldots, V_{1,200}$.]

Even this tiny image has 23 × 23 = 529 pixels, so the input-to-hidden weight matrix alone needs 529 × 200 = 105,800 entries to hold the $\omega$s — on the order of $10^5$ parameters!!
Spatial Structure

[Figure: what we see vs. what computers see.]

Vectorizing an image completely ignores the complex 2D spatial structure of the image.
Limitations of Conventional Neural Networks

• Impractical for real-world image classification
• Ignores the 2D/3D spatial structure of the image
• A solution to overcome both of these disadvantages?
One solution: Convolution

Use 2D convolution instead of matrix multiplications:
- Learning a set of convolutional filters (each of 5×5, say) is much more tractable than learning a large matrix (529×200); see the sketch below.
https://adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/
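A minimal NumPy sketch of a 2D convolution (strictly, the cross-correlation that most CNN libraries compute), with sizes mirroring the 23×23 example above; it shows why one small filter is so much cheaper than a full weight matrix:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation, as in most CNN libraries)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.random.rand(23, 23)        # the 23x23 image from the earlier slide
kernel = np.random.rand(5, 5)         # one learnable 5x5 filter
print(conv2d(image, kernel).shape)    # (19, 19)

# Parameter count: one 5x5 filter has 25 weights, reused at every position,
# versus 529 * 200 = 105,800 weights for the fully connected layer.
```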
Convolutional Neural Networks

Convolutional Neural Networks (CNN)
• CNNs have proven very powerful:
- Retain structural or configural information in neighboring pixels or voxels in a (medical) image
- Exploit extensive weight sharing to reduce the degrees of freedom of the model
- Composed of convolution layers interspersed with pooling (sub-sampling) layers
- Highly parallelizable: GPU implementations can accelerate computation 40 times or more
- Trained using the backpropagation algorithm and lots of labeled data
- First used in medical imaging in the 1990s
• An improvement over artificial neural networks:
- More layers, higher levels of abstraction, improved predictions
H. Greenspan et al., IEEE TMI, May 2016
High Level Perspective of Convolution

• Convolutional filters are essentially feature identifiers
• Features can be high-level (abstract) or low-level, such as straight edges, simple colors, and curves
https://adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/
High Level Perspective of Convolution (cont’d)

[Figure: a filter slid over two image patches. When a patch matches the filter, the output of the filter has a high activation value — or say, the neuron is fired/excited! When it does not match, the output of the filter has a low activation value (no activation).]
https://adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/
1st Conv Layer Filters Learned in AlexNet

Example filters learned by Krizhevsky et al. Each of the 96 filters shown is of size 11×11×3.

Each layer of the activation map(s) basically describes the locations in the original image where certain low-level features appear.
The 2nd Conv Layer Activation Map
https://www.youtube.com/watch?v=AgkfIQ4IGaM
Filters and Activation Maps
https://www.youtube.com/watch?v=AgkfIQ4IGaM
Connection Weights Between Convolutional Layers

• Let the learnable connection weights connecting feature map $i$ at layer $l-1$ and feature map $j$ at layer $l$ be $k_{ij}^{l}$. Specifically, the units of the convolutional layer $l$ compute their activations $A_j^{l}$ based only on a spatially contiguous subset of units in the feature maps $A_i^{l-1}$ of the preceding layer $l-1$, by convolving with the kernels $k_{ij}^{l}$ as follows:

$$A_j^{l} = f\!\left( \sum_{i=1}^{M^{(l-1)}} A_i^{l-1} * k_{ij}^{l} + b_j^{l} \right)$$

where $M^{(l-1)}$ is the number of feature maps at layer $l-1$.

• For example, if there are 5 feature maps at layer $l-1$ and 4 feature maps at layer $l$, there would be 4 connection-weight kernels, each of size $a \times a \times 5$ (the depth matches the number of feature maps at the previous layer).
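A naive sketch of this layer, using illustrative sizes (5 input maps, 4 output maps, $a = 3$) and tanh as a stand-in for $f$:

```python
import numpy as np

def conv_layer(A_prev, kernels, biases, f=np.tanh):
    """A_j^l = f( sum_i A_i^{l-1} * k_ij^l + b_j^l ), valid convolution.

    A_prev : (M_prev, H, W)      feature maps at layer l-1
    kernels: (M, M_prev, a, a)   one a x a x M_prev kernel per output map
    biases : (M,)
    """
    M, M_prev, a, _ = kernels.shape
    H, W = A_prev.shape[1:]
    out = np.zeros((M, H - a + 1, W - a + 1))
    for j in range(M):                      # each output feature map
        for i in range(M_prev):             # sum over all input feature maps
            for r in range(out.shape[1]):
                for c in range(out.shape[2]):
                    out[j, r, c] += np.sum(A_prev[i, r:r+a, c:c+a] * kernels[j, i])
        out[j] += biases[j]
    return f(out)

A_prev = np.random.rand(5, 12, 12)                     # 5 maps at layer l-1
kernels = np.random.rand(4, 5, 3, 3)                   # 4 kernels, each 3x3x5
print(conv_layer(A_prev, kernels, np.zeros(4)).shape)  # (4, 10, 10)
```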
How are objects represented in a CNN?

Bolei Zhou et al., arXiv.org
Neuroscience connection

• Similar (convolution-like) computations occur within the human brain
• The primary visual cortex has simple and complex cells
• The simple cells respond primarily to oriented edges and gratings
• The complex cells are also sensitive to these edges and gratings, but exhibit spatial invariance
Deep Learning in Medical Imaging

• It is difficult to obtain large enough training data

H. Greenspan et al., IEEE TMI, May 2016

• Some solutions to the lack of "big data" in medical imaging
• What architecture to use?
Segmentation: pixel-wise classification

Transforming fully connected layers into convolution layers enables a classification net to output a spatial map; a sketch follows below.

Shelhamer, E., Long, J., Darrell, T., IEEE Trans Pattern Anal Mach Intell, 2016
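A small sketch of the idea, assuming a hypothetical single-class FC "classifier" over a 4×4 feature map: reinterpreting its weights as a 4×4 kernel reproduces the FC score exactly, and on a larger input yields a spatial map of scores:

```python
import numpy as np

def conv2d(x, k):
    """Valid 2D cross-correlation."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
    return out

# A fully connected "classifier" over a 4x4 feature map: one weight per unit.
w_fc = np.random.rand(16)            # FC weights for a single class score
feat = np.random.rand(4, 4)
fc_score = w_fc @ feat.ravel()       # ordinary FC output: a single number

# The same weights viewed as a 4x4 convolution kernel give the same score...
k = w_fc.reshape(4, 4)
print(np.allclose(conv2d(feat, k), fc_score))   # True (a 1x1 output map)

# ...but on a larger feature map the "FC-as-conv" layer outputs a spatial
# map of class scores, one per 4x4 window.
big_feat = np.random.rand(10, 10)
print(conv2d(big_feat, k).shape)                # (7, 7) spatial map
```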
Network Depth and Receptive Field Size

• As you go deeper into the network, the filters have larger and larger receptive fields: they can consider information from a larger area of the original input volume (put another way, they are responsive to a larger region of pixel space).
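One common way to make this concrete is the receptive-field recurrence $r_l = r_{l-1} + (k_l - 1)\prod_{m<l} s_m$; the sketch below applies it to an illustrative layer stack (not an architecture from the slides):

```python
# Receptive field growth with depth: r_l = r_{l-1} + (k_l - 1) * j_{l-1},
# where the "jump" j_l is the product of all strides up to layer l.
def receptive_fields(layers):
    """layers: list of (kernel_size, stride) per conv/pool layer."""
    r, j = 1, 1
    out = []
    for k, s in layers:
        r += (k - 1) * j
        j *= s
        out.append(r)
    return out

# Illustrative stack: three 3x3 convs (stride 1) with stride-2 pools between.
print(receptive_fields([(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]))
# -> [3, 4, 8, 10, 18]: deeper filters see ever larger input regions.
```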
Layer functionality

Convolutional layer

• Local connectivity: the convolutional filter is much smaller than the image it operates on. This contrasts with the global connectivity paradigm of vectorized images.
• Weight sharing: the same filter is applied across the whole image.
• Can be seen as a local, independent feature detector: it detects local features (local connectivity) at different positions in the input feature maps.
Max-Pooling

• The Neocognitron model inspired the modeling of simple cells as convolutions.
• Complex cells can be modeled as a max-pooling operation, which can be thought of as a max filter.
• Max-pooling picks the highest activation in a local region, thus providing a small degree of spatial invariance, analogous to the operation of complex cells.
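A minimal sketch of 2×2 max-pooling with stride 2 (sizes and values illustrative):

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """Picks the highest activation in each local region (a 'max filter')."""
    H, W = x.shape
    out = np.zeros((H // stride, W // stride))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

x = np.array([[1., 3., 2., 0.],
              [4., 2., 1., 1.],
              [0., 1., 5., 6.],
              [2., 2., 7., 8.]])
print(max_pool(x))   # [[4. 2.] [2. 8.]]
# Shifting a peak by one pixel inside its window leaves the output unchanged:
# a small degree of spatial invariance.
```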
Non-linearity layer

• Necessary because a cascade of linear systems (like convolutions) is just another linear system
• Non-linearity between layers ensures that the model is more expressive than a linear model
• In theory, no non-linearity has more expressive power than any other, as long as they are continuous, bounded, and monotonically increasing
• Maas et al. introduced a new kind of non-linearity, called the leaky ReLU: LeakyReLU(x) = max(0, x) + b · min(0, x) (the standard ReLU is ReLU(x) = max(0, x))
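A minimal sketch of both non-linearities; the slope b = 0.01 is an illustrative choice:

```python
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x)."""
    return np.maximum(0.0, x)

def leaky_relu(x, b=0.01):
    """Leaky ReLU (Maas et al.): max(0, x) + b * min(0, x)."""
    return np.maximum(0.0, x) + b * np.minimum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))         # [0.    0.     0.  1.5]
print(leaky_relu(x))   # [-0.02 -0.005 0.  1.5]
```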
Deconvolutional Layer

[Figure: a deconvolution (transposed convolution) shown without padding and with padding.]

https://datascience.stackexchange.com/questions/6107/what-are-deconvolutional-layers
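A minimal sketch of a stride-2 transposed convolution, ignoring the padding variants shown in the figure: each input value scatters a scaled copy of the kernel into a larger output map, upsampling the feature map:

```python
import numpy as np

def transposed_conv2d(x, k, stride=2):
    """Minimal transposed ('deconvolution') layer: each input value scatters
    a copy of the kernel into the output, upsampling the feature map."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((stride * (H - 1) + kh, stride * (W - 1) + kw))
    for i in range(H):
        for j in range(W):
            out[i*stride:i*stride+kh, j*stride:j*stride+kw] += x[i, j] * k
    return out

x = np.random.rand(3, 3)               # small feature map
k = np.random.rand(3, 3)               # learnable kernel
print(transposed_conv2d(x, k).shape)   # (7, 7): spatially upsampled
```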
Architecture functionality (segmentation)
Encoder-decoder architecture (U-net)

Olaf Ronneberger et al., arXiv.org
Summation-based skip architecture

U-net: copy and paste (concatenation)
FusionNet: copy and add (summation)

Tran M. Q. et al., arXiv.org
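A minimal sketch contrasting the two skip-merge styles; the shapes are illustrative:

```python
import numpy as np

# Merging an encoder feature map into the decoder path of a U-shaped network:
encoder = np.random.rand(8, 32, 32)   # (channels, H, W) copied from the encoder
decoder = np.random.rand(8, 32, 32)   # upsampled decoder features

# U-net style: copy and concatenate along the channel axis.
unet_merge = np.concatenate([encoder, decoder], axis=0)
print(unet_merge.shape)       # (16, 32, 32): channels stack; the next conv mixes them

# FusionNet style: copy and add (summation-based skip).
fusionnet_merge = encoder + decoder
print(fusionnet_merge.shape)  # (8, 32, 32): channel count unchanged
```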