CONVOLUTIONALNEURALNETWORKSFORAUTOMATEDCELLDETECTIONINMAGNETICRESONANCEIMAGINGDATAByMuhammadJamalAfridiADISSERTATIONSubmittedtoMichiganStateUniversityinpartialoftherequirementsforthedegreeofComputerScienceŠDoctorofPhilosophy2017ABSTRACTCONVOLUTIONALNEURALNETWORKSFORAUTOMATEDCELLDETECTIONINMAGNETICRESONANCEIMAGINGDATAByMuhammadJamalAfridiCell-basedtherapy(CBT)isemergingasapromisingsolutionforalargenumberofserioushealthissuessuchasbraininjuriesandcancer.RecentadvancesinCBT,hasheightenedinterestinthenon-invasivemonitoringoftransplantedcellsininvivoMRI(MagneticResonanceImaging)data.ThesecellsappearasdarkspotsinMRIscans.However,todate,thesespotsaremanuallylabeledbyexperts,whichisanextremelytediousandatimeconsumingprocess.ThislimitstheabilitytoconductlargescalespotanalysisthatisnecessaryforthelongtermsuccessofCBT.Toaddressthisgap,wedevelopmethodstoautomatethespotdetectiontask.Inthisregardwe(a)assembleanannotatedMRIdatabaseforspotdetectioninMRI;(b)presentasuperpixelbasedstrategytoextractregionsofinterestfromMRI;(c)designaconvolutionalneuralnetwork(CNN)architectureforautomaticallycharacterizingandclassifyingspotsinMRI;(d)proposeatransferlearningapproachtocircumventtheissueoflimitedtrainingdata,and(e)proposeanewCNNframeworkthatexploitslabelingbehavioroftheexpertinthelearningprocess.Extensiveexperi-mentsconveytheoftheproposedmethods.Tomyparentsandsiblingsfortheirsupportandencouragement.iiiACKNOWLEDGMENTSWorkingtowardsmyPhDhasbeenarewardingandanenrichingexperience.Lookingback,therearemanywhoshapedmyjourney.Ihavebeenprivilegedtohavethreeco-advisorsformyPhD.Iamthankfultoallofthemforguidingmeonthisjourney.IamespeciallythankfultoDr.ErikM.Shapiroforhisconstantsupportovertheyears.Myworkwouldnothavebeenpossiblewithouthisguidance.Ihavebeenveryfortunatetohaveanadvisorwhohasalsobeenagoodfriend.Hehelpedmerealizetheimportanceofintelligentautomationinmolecularimagingandradiology.IhavelearnedalotfromErikovertheseyears.Erikalwaysencouragedmetoattendthetopconferencesandpresentmywork.Thishelpedmetointeractwithanumberofotherresearcherswhoworkonsimilartopics.In2015,hesentmetopresentourpaperinMICCAIwhichwasheldinGermanyandin2016,heintroducedmetomanyotherresearchersandprofessionalsattheWMICwhereIwaspresentingourwork.ThanksErik!ImustalsoexpressmygratitudeformyPhDadvisorinthecomputerscienceandengineeringdepartment,Dr.ArunRoss.Hisguidancehasakeyroleinshapingmyjourney.ItisDr.Ross,withwhomIhavebeenthoroughlydiscussingthedetailsofallthetechnicalcontentpresentedinthisthesis.Dr.Rosshasalwaysbeenpolite,humble,motivating,andknowledgeable.Ihavelearnedalotfromourregularweeklymeetings,hisfeedbackonpapersandpresentationsandevenfromhisapproachtowardsdealingnon-technicalmatters.Dr.Rosshasalsobeenhighlyencouragingonsendingmetotechnicalconferences.In2015,hesentmetoattendICCVinChile.Ienjoyedthetripandmetanumberoffellowresearchers.ThankyouDr.Ross!ivIamalsoverythankfultoDr.XiaomingLiuforhisguidanceasanadvisor,especially,duringtheinitialyearsofmyPhD.IhavelearnedalotfromDr.Liuandhisguidanceandsupporthasbeenofgreatvaluetome.Dr.HayderRadhaisalsoonmyPhDcommitteeandIwouldthankhimforalltheacademicdiscussionsIhavehadwithhim.Tome,heisalsoanexcellentteacher.IamgratefultoallthelabmembersincludingDorelaShuboni,ShatadruChakrvarty,BarbaraBlanco,ChristianeMallett,LauraSzkolar,StevenHoffmann,ThomasSwearingin,DentonBobeldyk,EricDing,AminJourabloo,YousefAtoum,JosephRothandXiYin.IamalsothankfultoMargaretBenniwitzforremotelyworkingwithusonourjournalpaper.Further,IamthankfultotheComputerScienceandEngineringdepartmentatMSUforprovidingmewiththeTAandRAopportunities.ThankstothedepartmentofRadiologyatMSUandtotheNIHforsupportingthisresearch!IwillalsoliketothankMuhammadShahzadandZubairShaforanumberofacademicdiscussionsIhadwiththemovertheseyears.Iwillliketothankmyfamilyandfriendsfortheirsupportandencouragementalongtheway.ImustalsoappreciatetheroleofOISSatMSU.ThankyouforallyourhelpandformakingMSUagreatplacetostudyandworkat!IamalsothankfultoKatherineTrinklein,CourtneyKosloski,LindaMoore,CathyDavison,andDebbieKruchfortheiradministrativeassistance.vTABLEOFCONTENTSLISTOFTABLES.......................................ixLISTOFFIGURES......................................xChapter1Introduction..................................11.1Background......................................11.2Challengesandcontributions.............................5Chapter2DevelopingMRIDatabase..........................72.1Introduction......................................72.2Approach.......................................92.2.1Cellpreparation................................92.2.2Animalpreparation..............................112.2.2.1Anaesthesia.............................112.2.2.2Cellinjection............................112.2.2.3Incubation.............................112.2.3MRIscanning.................................122.2.3.1InvitroMRIscans.........................162.2.3.2InvivoMRIscans.........................172.2.4Labelcollection................................182.2.4.1Dataloadingandsliceselection..................202.2.4.2Zooming-intopixellevel.....................212.2.4.3OperatingaZoom-out.......................222.2.4.4Labelingstatisticsandcontrastadjustment............23Chapter3Regions-of-InterestandFeatureRepresentations.............273.1Introduction......................................273.2Approach.......................................283.2.1GeneratingRoI................................283.2.2Featureextraction...............................303.2.2.1Featureextractionwitheddesigns(P-1)............303.2.2.2Featureextractionwithlearneddesigns(P-2)...........323.3Experiments,resultsanddiscussion..........................373.3.1Invivoevaluationstudies...........................373.3.2Invitroevaluationstudies...........................393.3.3Comparisonwiththeoreticallycomputedspotnumbers...........413.3.4Modelgeneralizationstudies.........................413.4Conclusion......................................45viChapter4LearningwithSmallTrainingData.....................464.1Introduction......................................464.1.1Backgroundandmotivation..........................464.1.2Technicalgoal.................................494.1.3Noveltyandcontributions...........................504.1.4Relatedwork.................................514.1.4.1TransferlearningviaCNNs....................514.1.4.2Transferlearningintraditionalresearch..............524.1.4.3Supervisedtransferlearning....................524.1.4.4Semi-supervisedtransferlearning.................524.1.4.5Unsupervisedtransferlearning...................534.1.4.6Inductiveandtransductivetransferlearning............534.2Approach.......................................554.2.1Intuitiveapproach:Asolutionspacebasedapproach............554.2.1.1CNNsolutionspace........................564.2.1.2Solutiondifference.........................564.2.1.3Solutionpath............................564.2.1.4Path-to-point........................574.2.1.5SourceCNNranking........................584.2.2Theoreticalapproach.............................594.2.2.1Notations..............................594.2.2.2Derivingthemeasure........................604.2.2.3Discussion.............................644.2.2.4Upperboundontransferability...................644.2.3Datasets....................................654.2.3.1TargetData-MRIdatabase....................654.2.3.2TargetData-MNISTdatabase...................654.2.3.3SourceData-Places-MITdatabase................664.3Experiments,ResultsandDiscussion........................674.3.1MRIbasedtargettask.............................674.3.2MNISTbasedtargettask...........................724.3.3ExperimentsusingCalTech-256.......................734.4Conclusion......................................754.4.1Multiplesources...............................754.4.2Layerstotransfer...............................76Chapter5ExploitingLabelinglatency.........................775.1Introduction......................................775.1.1Priorliterature.................................785.1.1.1Claslearningwithlabelinglatency..............795.1.1.2CNNlearningwithsideinformation................805.2Approach.......................................805.2.1Imageviewer:Human-computerinterface..................815.2.1.1Labelingspots...........................815.2.1.2ExtractingRoIfor..................82vii5.2.2approach............................855.2.2.1Clustering.............................855.2.2.2Transferlearning..........................855.3Experiments,results,anddiscussion.........................875.3.1ComparisonwithconventionalCNNapproach................875.3.2Comparisonwithrandomclustering.....................885.3.3Comparisonusingdifferentnumberoftransferlayers............885.3.4Comparisonwithapreviousapproach....................88Chapter6SupplementaryInformation.........................906.1Amodelbasedapproachforspotdetection......................916.1.1Approach...................................916.1.1.1Spotmodeling...........................926.1.1.2Modelinstantiationviasuperpixel.................926.1.1.3Superfernsfeatureextraction....................946.1.1.4Partition-basedbayesian...............956.1.2Experimentalresults.............................966.1.2.1Experimentalsetup.........................966.1.2.2Performanceandcomparison...................986.1.2.3Superfernsvs.ferns........................986.1.2.4Diversityanalysis.........................996.2CNNrankingwithintuitiveapproach.........................1016.2.1Experimentalsetup..............................1016.2.1.1Targettask.............................1016.2.1.2Sourcetask.............................1016.2.2Resultsanddiscussion............................1026.2.2.1Impactofsizeoftargettrainingset................1026.2.2.2Correlationbetweensourcerankingandperformancegain....1046.2.2.3Layerstobetransferred......................1046.2.2.4ofinformationfusion...................106Chapter7Conclusion..................................108BIBLIOGRAPHY.......................................111viiiLISTOFTABLESTable2.1:CollectiondetailsandcharacteristicsofourMRIdatabase.........16Table3.1:ExperimentalcomparisonofinvivospotdetectionperformanceusingP-1andP-2..................................37Table3.2:Automaticallydetectednumberofspotsin5samplesunder5conditions.Thetheoreticallyexpectednumberofspotsineachsampleis2400.....41Table4.1:AsummaryofrelatedresearchintransferlearningviaCNNs.......51Table4.2:Abriefoverviewoftransferlearningresearch...............54Table4.3:Summaryofthebasicnotationsusedinthissection............59ixLISTOFFIGURESFigure1.1:ThreeorthogonalMRIslicesextractedfrom3DdatasetsofthebrainfromanimalsinjectedwithunlabeledMSCs(toprow)andmagneticallylabeledMSCs(middlerow).NotethelabeledMSCsappearasdistributeddarkspotsinthebrainonly.ThebottomrowshowsthreedifferentcencehistologysectionsfromanimalsinjectedwithmagneticallylabeledMSCsthatthesecellswerepresentinthebrainmostlyasisolated,singlecells.Blueindicatescellnuclei,greenisthelabelinthecell,redisthelabelofthemagneticparticle.........3Figure2.1:Overallarchitectureofthedatacollectionprocess.............8Figure2.2:Thetwoimagesontheleftshowthemediautilizedincellculture.Themediausuallycontainsadiversesetofessentialingredientssuchasglu-coseandglutamine.TheimageontherightshowtheMPIOpackageutilizedinourcellpreparationprocess...................9Figure2.3:(Left)Containerswithcellculture.(Middle)Temperatureandaircontrolequipmentthatwasutilized.(Right)TemperatureandAircontrolsettingsfortheculture................................9Figure2.4:CulturedMSCswithMPIOsasseenunderamicroscope..........10Figure2.5:(Top)Ratundergoesanaesthesiabyinhaling(Bottom)Iodinesolutionisutilizedtomarktheheartregionoftherat............12Figure2.6:MSCswithMPIOsareinjectedintotherat(intracardiacinjection).....13Figure2.7:Amedicalexpertcarefullymountstherattoasuitableframeandpreparesitfortheincubationequipment.......................13Figure2.8:(Top)Incubationequipmentisattachedtotherat.(Bottom)Ageneralviewoftheincubationprocedure.Theequipmentdisplaysthestatusoftherat'sbreathingprocess..........................14Figure2.9:(Top)MountingtherattotheMRImachine'smechanicalframe.(Bot-tom)RattoundergoanMRI........................15Figure2.10:A-FshowvariationinthebrainmorphologyacrossMRIslices.......17xFigure2.11:(Top)ThesoftwareinterfaceprovidesanoptiontobrowsetothedirectorycontainingtheMRIdata.(Bottom)Oncethedataisloaded,anexpertcanbeginlabelingfromanysliceusingthesliderindicatedwitharedarrow..20Figure2.12:(Top)Forzooming-in,theoperatorsimplyclickanddraginthedirectionshownwiththeredarrow.Thiscreatesaboxedregionthatwillappearinthezoomed-inview.Thisprocesscanberepeatedmultipletimesiffurtherzoom-inisrequiredwithinthatboxedregion.(Bottom)Thecor-respondingzoomed-inview.........................21Figure2.13:Illustratingthezoom-outoperation.Theexpertclicksanddragsalongthediagonaldirectionasindicatedbytheredarrow.Thisoperationbringsuptheoriginallabelingview..........................22Figure2.14:(Top)TheexpertlabelsareoverlaidontheMRIslice.Theoperatorusesaleft-clicktoindicatealabel.Alabelcanalsobedeletedbyclickingonitagain.Basiclabelingstatistics,suchaslocationofthelastlabeledpoint,totalnumberoflabeledspots,labeledspotsonthecurrentslide,andtheslicenumber,aredisplayedontherightsideofthetool.(Bottom)Showstheeffectofcontrastadjustment.Notethatalltheseoperationscanalsobeperformedwiththezoomed-inview...................23Figure2.15:Thesquaresrepresentthelabelsfromanexpert.DistributionoftheselabelsontwoMRIslicesisshownhere...................24Figure2.16:Thesquaresrepresentthelabelsfromanexpert.DistributionoftheselabelsontwoMRIslicesisshownhere...................25Figure2.17:Thesquaresrepresentthelabelsfromexpert.DistributionoftheselabelsontwoMRIslicesisshownhere......................26Figure3.1:AdiagrammaticrepresentationofaspotinMRIslices.ThealsoshowstworealspotsinMRIslicesandhowtheywerecapturedbysuper-pixels.....................................29Figure3.2:(Top)Illustratingthegenerationofcandidateregions:Asuperpixelal-gorithmisappliedtoeachsliceinMRIandthenthebrainregionisautomaticallysegmentedusingbasicimageprocessingtechniques.Thesuperpixelsthatcorrespondtoonlythesegmentedbrainregionarecon-sideredandtherestareignored.Foreachsuchsuperpixel,thedarkestpixelisselectedasthecenterandaedsizepatchisextractedaroundit.(Bottom)Amosaicofseveral99patchesextractedfromanMRIslice.Itcanbeseenthatallpatcheshaveadarkregioninthecenterrepresentingaspotina2Dslice..............................30xiFigure3.3:PrincipleComponentAnalysis(PCA)wasutilizedtoextracteigenspotshapesusingallofthe99spotpatchesinthetrainingset.ThetopPCAcomponentsforthespotpatchesobtainedonthreelabeledratsinGAareshownhere.AniterativelyincreasingthresholdisthenappliedonthevaluesofthesetopPCAcomponentstoextractdifferentbinarypatchesthatareutilizedastocapturetheshapeandintensityinformationonspotpatches................................31Figure3.4:BinaryshapeareobtainedusingthetopPCAcomponents.Byiterativelyincreasingthethresholdvaluesfromdarktolightintensities,PCAcomponentscanresultindifferentbinaryshapes.Domainexpertsagreethatthesebinarypatchesrepresentmanyfrequentshapesoftheactualspots.AllthesepatchesarerotatedandtranslatedtoobtainalargesetofdifferentshapeTheseareconvolvedwitheachcandidatepatchandthecomputedresponseistakenasafeature.Alargesetoftheseresponsescomprehensivelycapturetheshapeandintensityinformationofspotpatches..........................32Figure3.5:Visualrepresentationofcontextfeaturesfortwocontextpatches:Whilelearningtheofaspot,itmayalsobeusefulforatolearnmoreaboutitssurroundingcontext.Therefore,tocapturetheap-pearanceofthepatch'scontext,alargerpatchof2121wasextractedaroundthecenterofacandidatepatchxi.Twowell-knownappearancedescriptorsincomputervisionwerethenusedtoextractfeatures:(a)His-togramofGradients(HoG)[1]and(b)Gist[2].AvisualrepresentationofHoGfortwocontextpatchesisshownhere.Redlineshereindicatethedifferentdirectionsofintensitygradientswhereasthelengthsofthelinesdeterminetheirmagnitude..........................33Figure3.6:CNNarchitectureusedinthiswork.Thenetworktakesa9x9imagepatchasinputforataski.EachcompositelayerLk,k2f1;2;3g,iscomposedofaconvolutionallayerCikwhichproducesthefeaturemapsFikandanon-lineargatingfunctionbproducingthetransformedfeaturemapsFikb.Afterpassingthroughthecompositelayers,thenetpassesthroughthefullyconnectedlayerLfcwhichproducestheoutput.Thesoftmaxfunctionisthenappliedtotheoutput.Notethatinthecontextofthiswork,thisarchitecturerepresentthemodelM.Theweightsofalltheacrossitsprocessinglayersarelearnedusingthetrainingdata....35Figure3.7:Someconvolutionlearnedbythe2ndcompositelayersofourdeeplearningapproach.Usually,eachactsasaneuronandduetoco-adaptationbetweenalargenumberofsuchneurons,highlysophisticatedfeaturesareextractedthatcanpotentiallymodelspotshape,intensity,textureetc...................................40xiiFigure3.8:Comparisonandresults:(Top)invitroresults100micron,(Middle)gen-eralizationtestusinginvivoscans,(Bottom)invitroresults200micron..43Figure3.9:3Dvisualizationofthedetectedspotsinaninvitroscan..........44Figure3.10:3Dvisualizationofthedetectedspotsinaninvivoscan...........44Figure4.1:Givenalargenumberofpre-trainedsourceCNNs,theproposedapproachranksthemintheorderinwhichtheyarelikelytoimpacttheperformanceofagiventargettask.Thesourcetaskdataisnotusedinthisdetermination.47Figure4.2:TheavailabilityofsourcetaskdataisnotnecessaryinCNNbasedtransferlearning.TransferlearningmayonlyrequirethesourceCNNmodelandthetargetdatafortuning...........................48Figure4.3:Thisdemonstratethebasicprocessofknowledgetransfer.LearnedfeaturelayersofasourceCNNaretransplantedtoinitializeatargetCNNwhichisthentunedusingthetargetdata...................48Figure4.4:Theoutputofthelastlayermistaskdependentandisthereforeitsdimen-sionalityisdifferentdependingontask.Hence,thek-dimensionaloutputoflayerm1isutilized.Also,itistheoutputofthislayerwhichwillbeutilizedlaterintheexperimentsectionforvisualization...........57Figure4.5:Theoutputofthelastlayermistaskdependentandisthereforeitsdimen-sionalityisdifferentdependingontask.Hence,thek-dimensionaloutputoflayerm1isutilized.Also,itistheoutputofthislayerwhichwillbeutilizedlaterintheexperimentsectionforvisualization...........60Figure4.6:Informationdiagram:TheterminEqn.(7)isrepresentedbyregion-1whereasthesecondtermisrepresentedbyregion-2.Thelargertheregion-2,themoreusefulisthesourceCNN................63Figure4.7:Eachimageinthesourcedatasetwasconvertedtograyscaleandthendown-sampledto2020and99.Someoftheseimagesalongwiththeirtransformedversionsareshownhere..................66Figure4.8:(Toprow)Thetwoshowtheresultoftheproposedapproachontwodifferenttestsets.(BottomRow)ThesetwoshowtheresultoftheRestrictedBoltzmannMachinebasedapproachin[3].Thehorizontalaxisshowsthereconstructionerrorcomputedonthetarget'strainingdatausingthesourceRBMmodel.Notethehighdegreeofcor-relationexhibitedbytheproposedmeasure(toprow)withimprovementinperformance................................69xiiiFigure4.9:Priortoconductinganytransfer,theabilitytodiscriminatebetweenspotandnon-spotpatchesisvisualizedin3DspaceforthebestrankedandtheworstrankedCNN.TheintheleftcolumncorrespondtothebestrankedCNNontwotestsets,whiletheintherightcolumncorrespondtotheworstrankedCNNontwotestsets.Seetextforfurtherexplanation..................................70Figure4.10:Thehorizontalaxisshowthepercentageofthetargettrainingdatauti-lizedintuning.They-axisshowstheperformanceaftertransfer.Datasetsizewasincrementedinvaluesof5%,andforeachdataset,theproposedapproachwasusedtorankthesourceCNNs.Here,thetransferwasonlyconductedusingthebestandtheworstrankedCNN.Notetheperfor-manceimprovementforsmallertrainingsizeswhichconveystheimpor-tanceoftheproposedmethod........................71Figure4.11:RankingperformancefortheMNISTtargettask:44CNNs,basedonrandomlychosensourcetasks,areranked.Fortuning,only5%oftheavailabletrainingsetwasrandomlychosenforthegiventask.TestingwasperformedonalltheimagesintheMNISTtestset.Notethattheper-formancewithouttransferlearning,usingtheselected5%ofthetrainingdata,wasabout0:21.............................72Figure4.12:(Toprow)Thetwoshowtheresultoftheproposedapproachontwodifferenttestsets.(BottomRow)ThesetwoshowtheresultoftheRestrictedBolzmannMachinebasedapproachin[3].Thehorizontalaxisshowsthereconstructionerrorcomputedonthetarget'strainingdatausingthesourceRBMmodel.................73Figure4.13:Informationdiagram:Emphasizingtheneedtoexploitmultiplesources..74Figure5.1:ThischapterdescribesaCNNarchitecturethatincorporatesthelabelingbehaviorofanexpertduringthetrainingphase.Thelabelingbehaviorisanticipatedtoprovidesideinformationthatcapturestheintra-classvari-abilityofpositiveexemplarsinatwo-classproblem.(Note:Greenmark-ershavebeenusedtoindicatespotsinaMRIscan).............78Figure5.2:Unliketraditionalfeatures,labelinglatencyisonlyavailableduringthetrainingphase.Further,unliketraditionalsideinformation,itisassoci-atedwithasingleclassonly.Inthisatwo-classproblem(fi+"andfi-")isconsidered...............................79Figure5.3:BasicarchitectureoftheproposedL-CNNframework...........80Figure5.4:Basicviewisusedtolabeleasy-to-detectspots...............83xivFigure5.5:(Left)Zoomed-inviewtolocateaspot.(Right)Zoomed-inviewwithcontrastadjustmentfordetailedcontextualobservationpriortolabelingspots.....................................83Figure5.6:LabelinglatencyforasingleMRIscan...................84Figure5.7:(A)SpotpatchesextractedfromoneMRIscan(concatenatedas1015patches),(B)SpotpatchesfromanotherMRIscan.Thesepatchesrepre-sentinter-scanandintra-scanvariationsinspotpatches...........84Figure5.8:PerformanceoftheproposedL-CNN....................89Figure5.9:Resultswithdifferentnumberoftransferlayers...............89Figure6.1:Thearchitectureofourapproach.Blue,red,andblackarrowsaretheprocessingwduringthetrainingstage,testingstageandbothstages,respectively..................................92Figure6.2:Fernsvs.Superferns.............................95Figure6.3:Detectionperformancecomparisonsandwithvariouscomponents.....97Figure6.4:Spotdetectionexamples:(a)truedetection,(b)falsenegative,(c)falsealarm.....................................98Figure6.5:Superfernsvs.Ferns.............................99Figure6.6:diversityanalysis.........................99Figure6.7:Transformingsourceimagesto99.Transformed,averageimagesfordifferententitiesareshownhere.......................102Figure6.8:Sourceentitiesandtheircorrespondingtransformedaverageimages....103Figure6.9:Comparisonofempiricalresultsonthreeofthesixtestingscenarios.Notetheperformancegainondatasetswithsmalleramountsoftrainingdataandtheefyoftherankingmetric....................104Figure6.10:Correlationbetweenrankingscoreandperformancegain.........105xvFigure6.11:(A)Performancegainanalysisw.r.ttransferringlayers.L1indicatesthatonly1convolutionallayerwastransferredandL3thatall3convolutionallayersweretransfered.Theredregionsshowstheareaspannedby20differentsources,whiletheblacklineshowsonlythebestrankedsourceoutof25.(B)ofinformationfusion.(C)Correlationofrankingscorewithnumberofclasses.........................106xviChapter1Introduction1.1BackgroundCell-basedtherapiesarepoisedtomakeaimpactacrossabroadspectrumofmedicalscenarios.Inregenerativemedicine,stemcelltransplantsareinvariousstagesofclinicaltrialsfortreatingorslowingamyriadofdiseases,includingParkinson'sdisease[4,5],rheumatoidarthri-tis[6,7]andmultiplesclerosis[8,9].Cell-basedtherapyintheformofcancerimmunotherapyisalsobeingtestedinclinicaltrials[10,11].Itiswellacknowledgedthatimagingthelocationoftransplantedcells,bothimmediatelyandseriallyafterdelivery,willbeacrucialcomponentformonitoringthesuccessofthetreatment.Twoimportantapplicationsforimagingtransplantedcellsare:1.tonon-invasivelyquantifythenumberofcellsthatweredeliveredorthathomedtoapartic-ularlocation,and2.toseriallydetermineiftherearecellsthatareleavingdesirableorintendedlocationsandenteringundesirablelocations.Formultiplereasons,includingimageresolution,lackofradiation,andestablishedsafetyandimagingversatility,magneticresonanceimagingorMRIhasemergedasthemostpopularandper-hapsmostpromisingmodalityfortrackingcellsinvivofollowingtransplantordelivery.Ingeneral,MRI-baseddetectionofcellsisaccomplishedbylabelingcellswithsuperparamagneticiron1oxidenano-ormicroparticles,thoughsomecelltypescanbelabeleddirectlyinvivo,suchasneu-ralprogenitorcells.Followingtransplant,theselabeledcellsarethendetectedinanMRIbyusingimagingsequenceswherethesignalintensityissensitivetothelocalmagneticinhomogeneitycausedbytheironoxideparticles.ThisresultsindarkcontrastintheMRI[12,13].Inthecaseofatransplantoflargenumbersofmagneticallylabeledcells,largeareasofdarkcontrastareformed.Inthecaseofisolatedcells,givensufmagneticlabelingandhighimageresolution,invivosinglecelldetectionispossible,indicatedbyaandwellcharacterizeddarkspotintheimage(SeeFig.1).Duetotherathercomplexrelationshipbetweenironcontent,particledistribution,ironcrystalintegrity,distributionofmagneticlabelandcellsetc.,itisdiftoquantifycellnumbersinanMRI-basedcelltrackingexperiment.Thisisespeciallythecaseforasinglegraftwithalargenum-berofcells.Thereareefmethodsofquantifyingironcontent,mostnotablyusingSWIFTbasedimaging[14],butthedirectcorrelationtocellnumberisnotstraightforward,duetothereasonslistedabove.MRI-baseddetectionofsinglecellspresentsamuchmoredirectwayofenu-meratingcellsincertaincelltherapytypeapplications,suchashepatocytetransplant[13],orforimmunecellsthathavehomedtoanorganoratumor[15].Inthiscase,thesolutionisstraightfor-ward:ifdarkspotsintheMRIarefromsinglecells,thencountingthesespotsintheMRIshouldyieldcellnumber.Whileseeminglystraightforward,performingsuchquantitativeanalysisonthree-dimensionaldatasetsisadiftaskthatcannotbeaccomplishedusingtraditionalmanualmethodologies.ManualanalysisandenumerationofcellsinMRIistedious,laborious,andalsolimitedincapturingpatternsofcellbehavior.Inthisrespect,amanualapproachcannotbeadoptedtoanalyzelargescaledatasetscomprisingdozensofresearchsubjects.Variouscommercialsoft-warethatarecurrentlyavailableforMRIcanonlyassistamedicalexpertinconductingmanualanalysis.TheproblemisfurthercompoundedinthecaseofeventualMRIdetectionofsinglecells2Figure1.1:ThreeorthogonalMRIslicesextractedfrom3DdatasetsofthebrainfromanimalsinjectedwithunlabeledMSCs(toprow)andmagneticallylabeledMSCs(middlerow).NotethelabeledMSCsappearasdistributeddarkspotsinthebrainonly.ThebottomrowshowsthreedifferenthistologysectionsfromanimalsinjectedwithmagneticallylabeledMSCsthatthesecellswerepresentinthebrainmostlyasisolated,singlecells.Blueindicatescellnuclei,greenisthelabelinthecell,redisthecentlabelofthemagneticparticle.atclinicalresolution,whichislowerthanthatachievedonhighsmallanimalsystems.Atlowerimageresolution,thewell-characterizeddarkspotlosesshapeandintensityandcanbediftomanuallyinalargenumberofMRIslices.3ThesehurdleshighlightthepressingneedtodevelopanautomaticandintelligentapproachfordetectingandenumeratingtransplantedcellsinMRI,meetingalltheaforementionedchallenges.AnautomaticandintelligentapproachcanallowresearcherstoefconductlargescaleanalysisoftransplantedcellsinMRI,facilitatingtheexplorationofnewtransplantparadigmsandcellsources.Suchgeneralizedintelligenttoolswilluseacrossabroadspectrumofbiomedicalpursuits.However,theuniquechallengesofdesigningsuchatoolhasnotbeenaddressedinanypriorliterature,especiallyinthecontextofdetectingcellsinMRI.Withmicroscopyimaging,automaticcelldetectionapproacheshaveachievedreasonablesuc-cess,especiallywhenmakinguseoftohighlightthecellsinanimage.Notableistheworkin[16],whichpresentsadetailedcomparativestudyofdifferentautomaticapproachesforcelldetectioninmicroscopyimages.Theirstudyconcludedthatamachinelearning(ML)basedautomaticcelldetectionapproachperformedmoresuperiorthanotherwellknownmethods.However,Notethatdetectingandlocatingtransplantedcellsina3DMRIisadifferentandmorechallengingtaskthancelldetectioninmicroscopyimaging.Cellsappearasverysmalldarkspots,andunlikemicroscopyimagesin[16],theMRIdataalsocontainsback-groundtissuewithmanyspot-likestructuralentities.TheproblembecomesmoredifwithsmallgroupsofnoisypixelsinMRIslicesthatarealsodark.TodesignandevaluateanintelligentandautomaticapproachforcellspotdetectioninMRI,groundtruthi.e.,labels,thatannotatespotsinMRIimages,arerequired.In[17],authorsrecognizedtheneedforautomationandadoptedathresholdbasedstrategyforautomat-icallydetectingspotsinMRI.However,theirapproachwasnotevaluatedusingagroundtruth.Althoughsuchthresholdreliantapproachesarenotknowntobeintelligentforhandlingvariationsanddiversityindata,theirstudyinfacthighlightstheneedforautomation[16,18].AutomaticMLapproacheshavebeensuccessfullyusedinawiderangeofimageanalysisapplications[19,20,16].4However,itisunexploredhowsuchapproachescanbeappropriatedtotheproblemofMRIspotdetection.Further,state-of-the-artMLapproachesrelyonalargevolumeoftrainingdataforaccu-ratelearning.Unfortunately,duetopracticallimitations,generatinglargescaleannotateddataischallenginginbothpreclinicalandtheclinicalarenas.Annotationcanalsobeprohibitivelytime-consumingandcanonlybeperformedbyamedicalexpert.Hence,crowdsourcingapproachessuchastheuseofAmazon'sMechanicalTurk[21],cannotbeadoptedforannotationinsuchap-plications.Therefore,theproblemofspotdetectionusingalimitedamountofannotatedtrainingdata,isanadditionalunaddressedchallenge.1.2Challengesandcontributions1.Datasetcollection:Forthoroughevaluationandtrainingoftheautomatedapproach,anannotatedMRIdatabaseneededtobedeveloped.Therefore,adiversedatabaseconsistingof40MRIscanswasassembledandmorethan19;700manuallabelswereassigned.Tothebestofourknowledge,thisistheannotateddatabasecollectedforautomatedcelldetectioninMRI.2.Candidateregionextraction:GivenanMRIscan,asetofcandidateregionsneededtobeextractedeffectively.EachcandidateregionmustrepresentaregioninMRIthatcanpotentiallycontainaspot.Thisstudydiscusseshowasuperpixelbasedstrategycanbedesignedtoextractsuchregions.3.Featuredesign:Spotshavehighintra-classvariationduetotheirdiverseappearancesintermsofshapeandintensity.Therefore,formachinelearningapproachestoworkeffectively,asetofrobustfeaturedescriptorsneededtobeextractedfromthecandidateregions.Anew5CNNarchitecturewasdesigned,forthisproblem,toautomaticallyextractthemostusefulspotfeatures.Theperformanceofthesefeatureswassystematicallycomparedagainstthoseextractedbyutilizinghand-craftedfeatureextractiontechniques.Resultsshowthatautomaticallylearnedfeaturesperformedbetterwithanaccuracyofupto97.3%invivo.4.Learningwithlimiteddata:Machinelearningapproachestypicallyrequirealargetrain-ingdatasetforaccuratelearning.However,inapplicationsinthemedicaldomain,itcanbechallengingtoobtainalargevolumeoftrainingdata.Therefore,thisthesisexploredhowautomaticspotdetectioncanbeperformedusingalimitedamountoftrainingdata.AnoveltransferlearningstrategyforCNNswasdeveloped,wherethebestsourceCNNisautomati-callyselectedfromanensembleofmanysourceCNNs.5.Exploitinglabelingbehavior:Labelingdatainmedicalapplicationsisusuallymoreex-pensiveandrequiresamedicalexpert.Therefore,canthelabelingprocessinmedicalappli-cationsbebetterexploitedbytheMore,inadditiontothelabelsonspots,canthelabelingbehaviorofamedicalexpertbeincorporatedinasupervisedlearningframework?Inthiscontext,anewCNNframeworkisproposedthataddressesthetechnicalchallengesassociatedwiththisresearchandexploitslabelingbehaviorinCNNlearning.6Chapter2DevelopingMRIDatabase2.1IntroductionDevelopingalabeledcellularMRIdatabaserequiresseveralstepsandinvolvesexpertswithdif-ferentspecialities.Therefore,thegoalofthischapteristoprovideageneralreader,anoverviewofourdatacollectionprocess.detailsofthecollecteddatawillalsobeexplained.Theoverallprocesscanbedividedintofourmainsteps:(1)Cellpreparationwherethegoalistogrowacellculturewithmagneticparticlesinjectedinthem.(2)Animalpreparationwherethepreparedcellsareinjectedintotheanimalunderthestudy.Thisstepinvolvestheprocessofanaesthesiaandanimalincubation.(3)MRIscanningwheretheanimalundergoesanMRIandthescanoftherequiredorganisobtained.TheinjectedcellsappearasdarkspotsinMRI.IncaseofinvitroMRI,theanimalmaybereplacedwithatubecontainingpreparedcells.(4)LabelcollectionwhereamedicalexpertthoroughlyanalyzeseachsliceintheMRIscanusingacustomizedsoftwareandprovidesmanualgroundtruthonthespots.ThearchitectureofthisprocedurecanbeseeninFig.2.1.IntheoverallcontextofcollectingalabeleddatabaseforspotdetectioninMRI,thecontributionsherecanbelistedasfollows:Byfollowingalltheaforementionedfoursteps,thisstudycollectsthelabeledMRIdatabasethatcanbeutilizedforresearchinautomaticspotdetection.Adiversesetof40MRIscans(bothinvivoandinvitro)constitutethisdatabase.7Figure2.1:OverallarchitectureofthedatacollectionprocessForlabelcollection,acustomizedsoftwarewasdevelopedforexpertstoconvenientlyana-lyzeMRIslicesandprovidelabels.Thesoftwareallowsanexperttoperformzoom-inandzoom-outoperations;changecontrastoftheimage;seebasicstatistics;andalsorecordthelabelingbehavior.Atotalofmorethan19;700manuallabelswerecollectedonthegivenMRIdatabase.8Moredetailsofoneachstepofthecollectionapproachispresentednext.2.2ApproachFigure2.2:Thetwoimagesontheleftshowthemediautilizedincellculture.Themediausuallycontainsadiversesetofessentialingredientssuchasglucoseandglutamine.TheimageontherightshowtheMPIOpackageutilizedinourcellpreparationprocess.Figure2.3:(Left)Containerswithcellculture.(Middle)Temperatureandaircontrolequipmentthatwasutilized.(Right)TemperatureandAircontrolsettingsfortheculture.2.2.1CellpreparationThegoalhereistoculturestemcellssuchthattheirformcontainssuperparamagneticironoxideparticles(MPIOs)insidethem.InthisstudyweutilizedMesenchymalstemcells(MSCs).9Figure2.4:CulturedMSCswithMPIOsasseenunderamicroscope.Thesecellswereculturedusingamediathatwasmixedwithnano-sizedMPIOs.InFig.2.2,theimagesoftheutilizedmediaandtheMPIOsareshown.Cellsfeedonthismediaandthus,absorbMPIOs.Themediagenerallycontainsanumberofingredientsincludingglucoseandglutamine.ThistakesplaceinspecialcontainerthatismaintainedinsideatemperatureandaircontrolledequipmentshowninFig.2.3.Oncethecellabsorbtheseparticles,theycanbeviewedunderamicroscope.InFig.2.4,onesuchimageofcellswithMPIOsisshown.Theintensedarkregionsrepresnttheparticlesinsidethecell.Notethatthelargedarkregionsoutsidecellsarefreeironparticleswhicharelatercleanedusingacentrifugebasedprocedure.Suchaprocedurepushesthecellslowinatubewhereastheparticlesontopwhicharethenremoved.Thisproceduremayberepeatedseveraltimes.MoredetailsrelatedtothecollectedinvitroandinvivoMRIscanswillbediscussedlater.102.2.2Animalpreparation2.2.2.1AnaesthesiaThegoalofthisstepistoprepareasubjectanimalforcellinjectionandforconductingMRI.Generally,acolonyofrequiredlivinganimals(Ratsinthiscase)ismaintainedbyexpertsunderspecial,universityapprovedguidelines.ForcollectingeachMRIscan,theanimalisgivenanesthesiatokeepitunconsciousduringtherestoftheprocedure.Anesthesiaisperformedbyallowingtheanimaltoinhalegas.Once,theanimalisunconscious,theinhalationisstillmonitoredbyanexpertforashorttime(about5min.inthiscase).Fig.2.5showsanalbinoratgoingthroughtheunconsciousinhalationofwhichismonitoredbyanexpert.2.2.2.2CellinjectionAsanextstep,aregiononanimal'sbodyisclearlymarkedforcellinjection.Themarkingalsohasasterilizingpurposeandisperformedhereusinga10%iodinesolution.InFig.2.5,onesuchmarkfortheratisshown.Theshowsthemarkovertheheartregionoftheoftheratindicatingthatthesubjectanimalwillundergoanintracardiacinjection.InFig.2.6,anexpertisshowninjectinglabeledcellsviaanintracardiacinjection.2.2.2.3IncubationAsanextstep,theanimalgoesthroughanincubationphase.TheanimalismountedonaframebyanexpertasshowninFig.2.7andthentheincubationequipmentisattachedtoit.Theincubationequipmentcontrolsthebreathingcycleoftheunconsciousrat.ThismakessurethattheratinhalesadequateoxygenandthatCO2isproperlyexhaledfromitslungs.Fig.2.8visuallyshowthisprocedureforarat.11Figure2.5:(Top)Ratundergoesanaesthesiabyinhaling(Bottom)Iodinesolutionisutilizedtomarktheheartregionoftherat.2.2.3MRIscanningThepreparedanimalwithMPIOlabeledcellsisthenshiftedtotheMRImachineframewhichislaterinsertedintothemachine.ThisprocedureisshowninFig.2.9.Usually,beforeperforming12Figure2.6:MSCswithMPIOsareinjectedintotherat(intracardiacinjection).Figure2.7:Amedicalexpertcarefullymountstherattoasuitableframeandpreparesitfortheincubationequipment.thisstep,expertswaitforatleastanhourtomakesurethattheinjectedcellscirculatethroughtherat'sbodyandreachthedesiredorgan.Afterthis,anMRIexpertperformstheimagingunderastrengthusingaechotime(TE).13Figure2.8:(Top)Incubationequipmentisattachedtotherat.(Bottom)Ageneralviewoftheincubationprocedure.Theequipmentdisplaysthestatusoftherat'sbreathingprocess.NotethatinadditiontoperforminginvivoMRI(thoseinvolvinglivinganimals),invitroscanswerealsoobtained.Tab.2.1showsthebasicdetailsofthecollecteddatabase.Asetof33invitro14scansand7invivoscanswerecollected.ThedetailsofourcollectedinvitroandinvivoMRIalongwiththecorrespondingscanningdetailsarepresentednext.Figure2.9:(Top)MountingtherattotheMRImachine'smechanicalframe.(Bottom)RattoundergoanMRI15Table2.1:CollectiondetailsandcharacteristicsofourMRIdatabaseSetTypeSubjectLabelerMachineLabeledScansTotalLabelsResolutionSizeGAinvivoBrainR111.7TG1A;G2A;G3A;G4A;G5A15442100mm256256256GBinvivoBrainR27TG1B;G2B2992100mm256200256GCinvitroTubeR27TG1C;G2C;G3C;G4C814100mm1288080GDinvitroTubeR27TG1D;G2D;G3D;G4D514200mm644040GEinvitroTubet7TG1E;G2E;G3E:::;G25E(240025)100mm10064642.2.3.1InvitroMRIscansImagingphantomswereconstructedconsistingofaknownnumberof4.5microndiameter,mag-neticmicroparticleswith10pgironperparticle,suspendedinagarosesamples.EachmicroparticleapproximatesasinglemagneticallylabeledcellwithappropriateironcontentforMRI-basedsinglecelldetection[13].T2*-weightedgradientechoMRIwasthenperformedonthesesamplesatastrengthof7T.AscanbeseeninTab.2.1,thesescanshavevariationinresolution,matrixsizes,andamountofspots(labels).GEhas25datasets,collectedfrom5samplesunder5differentMRIconditions.TheseconditionswerevariationsinTEfrom10-30ms(signaltonoise>30:1),andimageswithlowsignaltonoiseratio(˘8:1)atTE=10and20.TheeffectofincreasingTEistoenhancethesizeofthespots.ThehighertheTE,thelargerthespot[13].ThedownsideofhigherTEisthatthephysicswhichgovernsenlargementofthespot,thedifferenceinmagneticsusceptibilitybetweenthelocationinandaroundthemagneticparticlesandthesurroundingtissue,alsocausesthebackgroundtissuetodarken.Therationaletocollectimageswithbothhighandlowsignaltonoiseratioistotesttherobustnessofourspotdetectionprocedureintwopotentialinvivosce-narios.Manualgroundtruthswerecollectedfromexpertson8invitroMRIscansofGCandGD.Notethat,tostudytheeffectofchangeinimageresolution,GDwasobtainedusingalowresolu-tionMRI.ForGE,thetheoreticallycomputedgroundtruthwasknown.Thissetwasusedforadirectcomparisonbetweentheautomaticallydetectedspotsandthetheoreticallyexpectednumber.16Figure2.10:A-FshowvariationinthebrainmorphologyacrossMRIslices.2.2.3.2InvivoMRIscansTwodifferentsetsofinvivoMRIwerecollectedfromtwodifferentmachineshavingdifferentstrengths.Usingonemachinewithastrengthof11:7T,5MRIscansofratswerecollected,whicharedenotedasGAinTab.2.1.Threeofthemwereinjectedintracardiac,11:5hourspriortothescan,withratmesenchymalstemcells(MSCs)thathadbeenlabeledwithmicronsizedironoxideparticles(MPIOs)toalevelof˘14pgironpercell.Thistransplantationschemedeliverscellstothebrain-anintravenousinjectionwoulddelivercellsonlytotheliverandlungs.Twoadditionalratswerenotinjectedatall.Usinganothermachinewith7T,2additionalbrainMRIscansofratswerecollected.ThesewerealsotransplantedwithMPIOlabeledMSCs.GBisusedto17denotethese2scansinTab.2.1.Therationalebehindcollectingthesetwodifferentinvivosetswastobeabletovalidatethegeneralizationandrobustnessofourlearnedalgorithmagainstpotentialvariationsarisingfromdifferentimagingsystems.NotethatadifferentamountofMSCswereinjectedindifferentratstoachievefurthervariationsinthedata.AllMRIwere3DT2*-weightedgradientecho.2.2.4LabelcollectionTocollectlabelsondata,aLabelingtoolwasdesignedwiththeassistanceofamedicalexpert.Thistoolwasdesignedandtestedseveraltimesbyanexperttomeettherequirementsofthisresearchwork.TheltoolthatwasadoptedbytheexpertisavailableasanexecutableandcanbeconvenientlyutilizedwithoutinstallingMatlab.Thetoolallowsamedicalexperttoanalyzeimagesandzoomintoportionsoftheimage,atthepixellevel,ifnecessary.Anoptionforcontrastadjustmentisalsoprovided.Expertscanviewsomebasicstatisticsontheinterfaceofthesoftwaretool,e.g.,thetotalnumberofspotslabeled,labeledspotsonthecurrentslice,slicenumber,etc.Ifrequired,theexpertscandeleteapreviouslylabeledpointandimmediatelyskiptoadifferentsliceintheMRI.Inadditiontocollectingtraditionaldata,thetoolalsocapturesaspectsrelatedtothelabelingbehaviorofanexpert.Forexample,thetimetakentolabeleachpoint,overalltimespentoneachslice,numberofkeyboardhits,timeoftheday,numberofdeletedpoints,etc.arealsorecorded.Notethattheselabelsrepresenttheentitiesthatahumanexpertconsidersasspots/cellsintheMRI.Duetohumanlabelingerror,itisalsopossibletohavesomeMRInoisebeingincorrectlymarkedasspotsandspotsbeingconfusedwithMRInoise.Ontheotherhand,forthesetGE,thenumberofspotsistheoreticallycomputed.However,consideringthattheprocessofpreparingandinjectingspotsaremanuallyconducted,theactualspotnumbersinthesescansmaynotbeexactly18thesameasthetheoreticallyestimatednumber.192.2.4.1DataloadingandsliceselectionFigure2.11:(Top)ThesoftwareinterfaceprovidesanoptiontobrowsetothedirectorycontainingtheMRIdata.(Bottom)Oncethedataisloaded,anexpertcanbeginlabelingfromanysliceusingthesliderindicatedwitharedarrow.202.2.4.2Zooming-intopixellevelFigure2.12:(Top)Forzooming-in,theoperatorsimplyclickanddraginthedirectionshownwiththeredarrow.Thiscreatesaboxedregionthatwillappearinthezoomed-inview.Thisprocesscanberepeatedmultipletimesiffurtherzoom-inisrequiredwithinthatboxedregion.(Bottom)Thecorrespondingzoomed-inview.212.2.4.3OperatingaZoom-outFigure2.13:Illustratingthezoom-outoperation.Theexpertclicksanddragsalongthediagonaldirectionasindicatedbytheredarrow.Thisoperationbringsuptheoriginallabelingview.222.2.4.4LabelingstatisticsandcontrastadjustmentFigure2.14:(Top)TheexpertlabelsareoverlaidontheMRIslice.Theoperatorusesaleft-clicktoindicatealabel.Alabelcanalsobedeletedbyclickingonitagain.Basiclabelingstatistics,suchaslocationofthelastlabeledpoint,totalnumberoflabeledspots,labeledspotsonthecurrentslide,andtheslicenumber,aredisplayedontherightsideofthetool.(Bottom)Showstheeffectofcontrastadjustment.Notethatalltheseoperationscanalsobeperformedwiththezoomed-inview.23Figure2.15:Thesquaresrepresentthelabelsfromanexpert.DistributionoftheselabelsontwoMRIslicesisshownhere.24Figure2.16:Thesquaresrepresentthelabelsfromanexpert.DistributionoftheselabelsontwoMRIslicesisshownhere.25Figure2.17:Thesquaresrepresentthelabelsfromexpert.DistributionoftheselabelsontwoMRIslicesisshownhere.26Chapter3Regions-of-InterestandFeatureRepresentations3.1IntroductionInmachinelearning,aapproachmapsarealworldproblemintoataskwheretwoormoreentities(classes)aretobeintelligentlydistinguishedfromeachother(see[22]forbasicdetails).Forexample,classifyingapotentialcandidateregioninMRIasspotornon-spotcanbeviewedasaclastask.However,inthecontextofthiswork,thechallengeishowtoeffectivelyextractthesepotentialcandidateregions,calledRegions-of-Interest(RoI),fromanMRIscan.ShouldtheRoIbebasedononepixel,twopixels,howmanypixels?whatwillbeasystematicapproachtoextractsuchRoI?Onceextracted,theRoIcanbeinputtoathatcanbeconstructedusingdifferentparadigms.Thewillthenlearntodistinguishaspotfromanon-spotintheseRoIs.Inthecontextofthiswork,paradigmscanbecategorizedintotwofundamentaltypes.Intheparadigmtype(P-1),discriminatinginformationisextractedfromtheimages(RoIinthiscase)usingaapproachthatisdesignedbyanexpertbasedonintuitionandexperience.Thisinformationmaybeintheformofanumericarrayofvaluesknownasfeatures.ForeachRoIsuchfeaturesalongwiththeirgroundtruthclasslabelsarethenforwardedtoanotheralgorithmcalledortechniquewhichlearnstodistinguishbetween27theclasses.Inthesecondparadigmtype(P-2),thefeaturerepresentationsarenotmanuallydesignedbyanexpertbutratherautomaticallylearnedfromthedata.Generally,bothfeaturerepresentationsandtheclasarelearnedautomaticallyinasingleframework.Manyneuralnetworkbasedapproachesfallintothiscategorywhichcantakeimagedatasetsdirectlyasinput,alongwiththelabels,andlearnamodel.NotethatboththeaforementionedapproachesrequireRoIasinput.Therefore,inthenextsectionweproposeaneffectivesuperpixelbasedstrategytoextractRoIandtheninvestigatethedesignofapproachesthatbelongtoboththeparadigms.3.2Approach3.2.1GeneratingRoIThechallengeinthisresearchistotheRoI.ProcessingeachpixelanRoIcanresultinahugecomputationalburden.WeaddressedthisissuebyextractingsuperpixelsfromeachMRIscan[18].Asuperpixeltechniquegroupslocallyclosepixelswithsimilarintensitiesintoasingleunit.Eachunitiscalledasuperpixel.Superpixel-basedmethodsarebecomingincreasinglypopu-lar.Forexample,authorsof[23]discusshowthesuperpixelsextractedusingdifferenttechniquescanbecombinedtoachievebetterimagesegmentation.Similarly,variousstudiesutilizesuperpix-elsforclassifyinglocalimagesegments.In[24],authorsuseamulti-scalesuperpixelapproachfortumorsegmentation.Furthermore,superpixelshavebeenutilizedinvariousotherap-plicationsasshownin[25,26,27].Sincespotsareusuallydarkerthantheirsurrounding,theyarecharacterizedassuperpixelswithloweraverageintensitythanthesurroundingsuperpixels.ThissuperpixelbasedmodelofaspotasillustratedinFig.3.1.28Figure3.1:AdiagrammaticrepresentationofaspotinMRIslices.ThealsoshowstworealspotsinMRIslicesandhowtheywerecapturedbysuperpixels.Basedonthisidea,anovelsetoffeaturesutilizingtheaveragesuperpixelintensities,wasproposedinourstudyin[18](seesupplementarymaterialinChap6fordetails).However,thisapproachhasthefollowinglimitations:(1)Theaccuracyoftheapproachwasdependentontheprecisenessofthesuperpixelalgorithms.(2)TheapproachassumesasuperpixelbasedmodelforaspotintermsofitsdepthacrossconsecutiveMRIslices.ThisdoesnotholdtrueforallspotsindifferentMRIsettings.Thestrategyadoptedinthisthesisisresilienttoimprecisionsinthesuperpixelextractional-gorithms.Basedoneachsuperpixelunit,arepresentativepatchisextractedfromtheMRIscanasexplainedinFig.3.2.Eachpatchisthentakenasacandidateregionandundergoesafeatureex-tractionprocess.Theapproachismodel-freeandimitatesthestrategyadoptedbyahumanlabeler.Allcandidatepatchesaredetectedin2DMRIslicesandthenneighboringpatchesdetectedinconsecutiveslicesareconnectedwithoutimposinganyrestrictionontheirdepthin3D.ThespatiallocationofeachpatchinMRIisalsorecorded.Consequently,theseextractedpatchesareforwardedtothemachinelearningalgorithmsasinputdata.29Figure3.2:(Top)Illustratingthegenerationofcandidateregions:AsuperpixelalgorithmisappliedtoeachsliceinMRIandthenthebrainregionisautomaticallysegmentedusingbasicimageprocessingtechniques.Thesuperpixelsthatcorrespondtoonlythesegmentedbrainregionareconsideredandtherestareignored.Foreachsuchsuperpixel,thedarkestpixelisselectedasthecenterandaedsizepatchisextractedaroundit.(Bottom)Amosaicofseveral99patchesextractedfromanMRIslice.Itcanbeseenthatallpatcheshaveadarkregioninthecenterrepresentingaspotina2Dslice.3.2.2Featureextraction3.2.2.1Featureextractionwithdesigns(P-1)Thisisthetraditionalandmostwidelyadoptedparadigmincomputervisionandpatternrecogni-tionbasedstudies[18,16].Manystudiesonautomatedcelldetectioninmicroscope-basedimaging30arebasedonthisparadigm[16].Further,ourinitialstudyproposedin[18]canalsobecategorizedintothisparticularparadigm.Inthisstudy,anelaboratesetoffeatureextractionmethodsareuti-lizedthatextractshape,intensity,textureandcontextinformationabouttheentitiesinthecandidatepatches.Fig.3.3,Fig.3.4,andFig.3.5presentabriefexplanationonhowhand-designedfeaturescanbeextractedforthetaskofcapturingspotappearanceinMRI.Figure3.3:PrincipleComponentAnalysis(PCA)wasutilizedtoextracteigenspotshapesusingallofthe99spotpatchesinthetrainingset.ThetopPCAcomponentsforthespotpatchesobtainedonthreelabeledratsinGAareshownhere.AniterativelyincreasingthresholdisthenappliedonthevaluesofthesetopPCAcomponentstoextractdifferentbinarypatchesthatareutilizedastocapturetheshapeandintensityinformationonspotpatches.Alltheextractedfeaturesareconcatenatedtoformafeaturevectorforeachcandidatepatch.Usually,inlongfeaturevectors,somefeaturesareirrelevantorredundant.Therefore,fromtheobtainedfeaturevector,themostusefulfeaturesareselectedandtheirrelevantfeaturesareeliminatedusingafeatureselectionmodulethatemploysacorrelationbasedfeatureselectionalgorithm[28].Theseselectedfeaturevectorsalongwiththeircorrespondinglabelsforthepatchesarethenforwardedtotunea.Inthisstudy,adiversegroupofwell-knownsuchasprobabilistic(Naivebayes),functional(Multi-layerperceptron(MLP)),anddecisiontree(RandomForest),areutilizedandcompared(see[22]fordetails).31Figure3.4:BinaryshapeareobtainedusingthetopPCAcomponents.Byiterativelyincreas-ingthethresholdvaluesfromdarktolightintensities,PCAcomponentscanresultindifferentbi-naryshapes.Domainexpertsagreethatthesebinarypatchesrepresentmanyfrequentshapesoftheactualspots.AllthesepatchesarerotatedandtranslatedtoobtainalargesetofdifferentshapeTheseareconvolvedwitheachcandidatepatchandthecomputedresponseistakenasafeature.Alargesetoftheseresponsescomprehensivelycapturetheshapeandintensityinformationofspotpatches.3.2.2.2Featureextractionwithlearneddesigns(P-2)Basedonexpertintuitionandexperience,featuresextractedinP-1canbesubjective.Therefore,thekeygoalofP-2approachesistoautomaticallylearnthemostoptimalspotfeaturerepresentationfromthedata.Neuralnetworksareawell-knownexampleoftheP-2approaches.Deepconvolutionalneuralnetworks(CNN)[29,19]havebeenhighlysuccessfulinmanyim-agebasedMLstudies.Before,wediscussthedesignoftheproposedCNNarchitectureforthistask,abriefintroductionofthewell-knownCNNarchitecturesispresentedasfollows:LeNet:ThisisoneoftheCNNarchitectureproposedbyYannLeCuninthe1990's.Thisarchitecturewasappliedforautomaticallyreadingzipcodes,handwrittendigits,etc.Moredetailsonthisarchitecturecanbeseenin[30].AlexNet:ThisarchitecturepopularizedtheuseofCNNarchitecturesinmoderncomputer32Figure3.5:Visualrepresentationofcontextfeaturesfortwocontextpatches:Whilelearningtheofaspot,itmayalsobeusefulforatolearnmoreaboutitssurroundingcontext.Therefore,tocapturetheappearanceofthepatch'scontext,alargerpatchof2121wasextractedaroundthecenterofacandidatepatchxi.Twowell-knownappearancedescriptorsincomputervisionwerethenusedtoextractfeatures:(a)HistogramofGradients(HoG)[1]and(b)Gist[2].AvisualrepresentationofHoGfortwocontextpatchesisshownhere.Redlineshereindicatethedifferentdirectionsofintensitygradientswhereasthelengthsofthelinesdeterminetheirmagnitude.vision.Thiswasproposedin2012byAlexKrizhevsky,IlyaSutskeverandGeoffHinton.ThisarchitecturewasmoredeeperthanLeNetandperformedsuperiortootherapproachesontheImageNetILSVRC-2012.Theirapproachpopularizedtheuseoflinearunits(ReLU)asnon-linearitiesintheCNNarchitectures.Also,theiruseofdropouttechniquetoselectivelyignoreneuronsinthetrainingphasewasconsideredeffectivetoavoidoverMoredetailscanbeseenin[29].Overfeat:OverfeatarchitecturewasthewinnerofthelocalizationtaskoftheILSVRC-332013.ThisarchitecturecanbeseenasaderivativeoftheAlexNetCNNarchitecture.Over-featalsoobtainedcompetitiveresultsforthedetectionandcationstasksinILSVRC-2013.Moredetailscanbeseenin[31].ZFNet:ThisCNNarchitecturewasproposedbyMatthewZeilerandRobFergusandhencebecamefamousasZFNet.ThisarchitectureisalsoaderivativeofthebasicAlexNetarchi-tecture.ThisarchitecturebecamefamousafteritshighaccuracyonILSVRC-2013.ThedetailsonZFNetcanbeseenin[32].VGGNet:VGGwastherunner-upoftheImageNetILSVRC-2014.VGGpresentedamoredeeperCNNnetworkswhichresultedinabetterperformance.Forexample,VGG-16(16layered)andVGG-19(19layered)weredeeperthanAlexNetanditsderivatives.Anotherinterestingpropertyoftheirarchitecturewastheuseofverysmallforconvolution(33)andpooling(22).FurtherdetailsonVGGCNNarchitecturescanbeseenin[33].GoogLeNet:ThisCNNarchitecturebySzegedyetal.fromGooglewasthewinneroftheImageNetILSVRC-2014.Duetotheuseoftheirproposedinceptionmodule,thisCNNarchitecturewasabletoreducethenumberofparametersintheirnetwork.FormoredetailsonGoogLeNetsee[34].ResNet:KaimingHeetal.in[35]designedResidualnetworkwhichwasthewinnerofILSVRC2015.Itutilizesspecialskipconnectionswhichallowsthelowerlayerstobecon-nectedwiththehigherlayers.ResNetalsoallowsforlearningmuchdeeperCNNstoimproveperformance.MorerecentworkofKaimingHecanbeseenin[36].NotethatunlikeP-1approaches,featureswerehierarchicallylearnedinalltheaforementioned34CNNsusingmultiplelayersinanautomaticfashion.IntheseCNNarchitectures,featureextrac-tionandwasperformedinaframework.However,thesearchitecturesweredesignedforstandardcomputervisiontaskswherealargeentityisofinterestintheimage.Consider,M=f()asanoverallclassimodellearnedbyourP-2approach.Inthecon-textofthiswork,letthisMdenoteasequenceofconvolutionalandtransformationfunctionsthatwillseriallyappliedtoaninputimagetooutputadecision.Consideringthis,fcanbedecomposedintomultiplefunctionallayers:Figure3.6:CNNarchitectureusedinthiswork.Thenetworktakesa9x9imagepatchasinputforataski.EachcompositelayerLk,k2f1;2;3g,iscomposedofaconvolutionallayerCikwhichproducesthefeaturemapsFikandanon-lineargatingfunctionbproducingthetransformedfeaturemapsFikb.Afterpassingthroughthecompositelayers,thenetpassesthroughthefullyconnectedlayerLfcwhichproducestheoutput.Thesoftmaxfunctionisthenappliedtotheoutput.Notethatinthecontextofthiswork,thisarchitecturerepresentthemodelM.Theweightsofalltheacrossitsprocessinglayersarelearnedusingthetrainingdata.f()=(fuf(u1)f(u2):::f1):(3.1)Here,eachfunction,fj,j2[1;u],canrepresenta(a)convolutionallayer,(b)non-lineargatinglayer,(c)poolinglayer,(d)full-connectedlayer(see[29,37,19]formoredetails).Foragiventask,weightsfortheseconvolutionalarelearnedautomaticallyusingthetrainingdata.DifferentarchitecturesofaCNNarecreatedbyutilizingdifferentnumberoflayersandalsobysequencingtheselayersdifferently.CNNarchitecturesalsovarydependingonthechoiceofthenon-linear35gatingfunction.Filtersizesforconvolutionallayersarealsodetermineddependingontheappli-cationathand.Well-knownCNNarchitecturessuchasAlexNet[29]orGoogLeNet[34]cannotbeutilizedforspotdetectioninMRI.Therefore,anewCNNarchitecture,designedforspotdetectioninMRI,isproposedhere.TheproposedCNNarchitecturehas3compositelayersand1fullyconnectedlayer(seeFig3.6).Eachcompositelayerconsistsofaconvolutionallayerandagatingfunction.NotethatinaconventionalCNNarchitecture,apoolinglayerisalsousedwhichreducesthedimensionalityoftheinputdata.However,apoolinglayerisnotutilizedinthisarchitectureduetothesmallsizeoftheinputpatches(99).Usingapoolinglayer,inthiscontext,mayresultinthelossofvaluableinformationwhichmaybeessentialtobeutilizedbythenextlayers.Further,agatingfunctionisusuallyaddedforintroducingnon-linearityintoaCNN.Withoutnon-lineargating,aCNNcanbeseenasasequenceoflinearoperationswhichcanhinderitsabilitytolearntheinherentnon-linearitiesinthetrainingdata.Inconventionalneuralnetworks,asigmoidfunctionorahyperbolictangentfunctionwasgenerallyutilizedforthispurpose.How-ever,inrecentstudies,utilizingReLULinearUnits)hasshownsuperiorresultsforthisrole[29].Therefore,theproposedarchitectureusesReLUasanon-lineargatingfunction.Furthercustomizingtothetaskathand,thesizesofalltheconvolutionalwerekeptsmall.However,theirnumberswerekepthigh.ThegoalwastoprovideahighercapacitytotheCNNarchitectureforcapturingadiversesetoflocalfeaturesofapatch.FiltersizesanddimensionsofresultingfeaturemapscanbeseeninFig.3.6(seeFig.3.7forlearnedForanytaski,theproposedmodel(CNNarchitecture)canbewrittenasM=(gLfcbCi3bCi2bCi1):(3.2)36Table3.1:ExperimentalcomparisonofinvivospotdetectionperformanceusingP-1andP-2.AlgorithmsJ1J2J3J4J5J6meansP-1RandomForest94.086.995.394.186.094.791:84:2NaiveBayes82.981.884.384.180.183.782:81:6P-2CNN96.492.396.196.491.295.094:62:3MLP91.185.290.991.484.290.388:93:3MLP(P-1/2)93.989.495.895.490.095.793:42:9means91:75:287:14:092:55:092:34:986:34:591:95:0wheregrepresentsastandardsoftmaxfunctionthatcanbeappliedtotheoutputofthefullycon-nectedlayerLfc.bdenotesthenon-lineargatingfunctionandCikrepresentstheconvolutionallayerinthecompositelayerk.3.3Experiments,resultsanddiscussionExperimentswereperformedtoanswerthefollowingmainquestions:(1)WhichofthetwoMLtechniqueresultsinthebestdetectionaccuracyforinvivospotsinMRI?(2)Howdoesthebestapproachperformoninvitroevaluationstudies?(4)CanaMLapproachlearnedoninvivodatabetestedforspotdetectiononinvitrodata?(5)HowistheperformanceaffectediftheMRIisconductedatlowresolution?(6)IstheproposedapproachrobusttothedifferencesinMRImachinesintermsofstrength,makeandmodeletc.?Importantly,itisalsoofinteresttoinvestigatehowthetheoreticallycomputednumberofspotsforinvitroMRIscanscompareswiththeautomaticallydetectedspotnumbers.3.3.1InvivoevaluationstudiesInthisstudy,thespotperformanceofadiversesetofapproacheswasevaluatedusingthetwosetsofinvivoMRIscansi.eGAandGB.First,experimentsandresultsarediscussedforGAthathas5differentMRIscansobtainedfromoneMRImachineandlabeledbyoneexpert.Three37oftheseinvivoscanscontainspotsthatweremanuallylabeledbyexpertswhereastheremainingtwowerenaive.Sixcombinationsoftestingandtrainingpairsarecreatedsuchthattwoscansarealwayspresentinthetestingsetofeachpair,whereoneofthescansisanaiveandtheothercontainsspots.Theremaining3outofthe5scansareusedfortrainingtheMLalgorithms.NotethateachMRIscanresultedinabouta100;000candidatepatchesandabout5000oftheserepresentedthespots.AreaUndertheCurve(AUC)wasutilizedasastandardmeasureforaccuracy.ExperimentalresultsforallthealgorithmsarelistedinTab.3.1.ItwasobservedthatthebestresultswereachievedbyaCNN,withameanaccuracyof94:6%.ThesuperiorperformanceofCNNcanbemainlyattributedtoitsabilitytoautomaticallyexplorethemostoptimalfeaturesusingtrainingdataratherthanrelyingonhand-craftedfeaturesutilizedintraditionalmachinelearning.Second,CNNlearnfeaturesinadeephierarchyacrossmultiplelayers.RecentresearchshowsthatsuchahierarchyprovidesasuperiorframeworktoCNNforlearningmorecomplexconcepts,unliketraditionalmachinelearningapproacheswhichlearnsinashallowmanner[29,34,37].ThesecondbestresultswereobservedwiththesimpleMLPapproachwhenittakesthecare-fullydesigned,handcraftedfeaturesasaninput,ratherthantherawdataX.ThisMLPcanbeviewedasamixedparadigmapproach(P-1/2).However,thedeeplearningCNNthatinherentlyextractshierarchicalfeatureswithoutusinganyhandcraftedfeaturesresultedintheoverallbestperformance.ProbabilisticNaivebayes,usingP-1,showstheworstdetectionperformancewithanaverageaccuracyof82.8%.Thiscanbebecausenaivebayesassumescompleteindependencebetweenthefeatureswhichinmanypracticalproblemsmaynotbetrue.Further,itcanbeseeninTab.3.1thatJ2andJ5testingsetsprovedtobethemostchallengingwithlowmeanaccuraciesof87:1%and86:3%,respectively,fromallalgorithms.DatasetJ4resultedintheoverallbestperformancewith38meanaccuracyof92:5%.Wheninvestigatingthis,itwasfoundthatbothJ2andJ5containedMRIscanGA1intheirtestsetaccompaniedwithadifferentnaivescan.ItwasseenthatthelabeledpatchesinGA1weretlymorechallengingintermsofmorphologyandintensitythanthoseextractedfromotherscans.Thebesttwoapproaches,i.e.,MLP(P-1/2)anddeepCNN(P-2),werethenfurthercomparedusinganothersetofinvivoscansi.eGB.Thisdatawascollectedfromadifferentmachinehavingadifferentstrengthandwasalsolabeledbyadifferentexpert.Inthisstudy,alltheprevious5scansofGAwereusedfortrainingbothapproaches(creatingalargertrainingset),andthenthelearnedspotdetectionmodelsweretestedontheinvivoscansinGB=fGB1;GB2g.Notethatdespitethedifferencesinmachine,itsstrength,andalsothelabelingexpert,CNNperformedbestwithanaccuracyof97:3%whereasthemixedparadigmMLP(P-1/2)achieved95:3%.WeshowtheROCcurvesforthistestinFig.3.8.3.3.2InvitroevaluationstudiesItcanbeobservedthatCNNyieldsthebestresultonthein-vivodatasetsdespitethesimplicityofitsapproach.Inthisstudy,itsperformanceisevaluatedontheinvitrodatainsetGCandGD.ItsperformanceistestedonGCthathas4invitroMRIscanseachwitha100mmresolutioncreatinga3Dmatrixof(1288080).Usingthese4scans,3differenttestingandtrainingpairsweredeveloped.Eachtestingandtrainingpairhas2MRIscans.ThenaiveMRIscanwasalwayskeptinthetestset,therebygenerating3combinationswiththeremainingothersets.ItwasobservedthatCNNperformedwithameanaccuracyof99.6%oninvitroscans.TheindividualROCplotsforthesetestsareshowninFig4.Adifferentstudywasthenconductedtoseethedegradationinperformancewheneachofthe4invitroscansareobtainedwithamuchlowerresolutionof200mmcreatingamatrixof39Figure3.7:Someconvolutionlearnedbythe2ndcompositelayersofourdeeplearningapproach.Usually,eachactsasaneuronandduetoco-adaptationbetweenalargenumberofsuchneurons,highlysophisticatedfeaturesareextractedthatcanpotentiallymodelspotshape,intensity,textureetc.(644040).SuchastudyisdesirablesinceinsomepracticalapplicationsitmaybemoreconvenienttorapidlyobtainanMRIatalowerresolution,particularlyinhumanexaminations.Usingthesameprocedureasbefore,threedifferenttestingandtrainingpairswerecreated.Itwasnotedthatthemeanperformancedecreasedto86:6%5:6.However,itwasalsoseenthatwhenthenumberoflearninglayersforCNNwasincreasedto5(4compositeand1fullyconnected)theperformanceimprovesto90:6%7:1.TheindividualimprovementsonallthethreesetsareshowninFig.3.8.40Table3.2:Automaticallydetectednumberofspotsin5samplesunder5conditions.Thetheoreticallyexpectednumberofspotsineachsampleis2400.ConditionTube1Tube2Tube3Tube4Tube5TE1021472272247421522270TE2026082750303926442660TE3028442993327228092909TE10(LowSNR)19822023224719492014TE20(LowSNR)241925632794240124453.3.3ComparisonwiththeoreticallycomputedspotnumbersAcomparisonbetweentheautomaticallydetectednumberofspotswiththetheoreticallycomputednumberofspotswasconductedusing25invitroMRIscansofsetGE.Thisisanimportantexperimentasitallowsadirectcomparisonwiththeactualnumberofinjectedspots.AlltheavailabledatafromGAtoGDwasusedfortrainingaCNNandthenthetrainedCNNmodelwasusedfortestingonthese25scansinsetGE.Eachscanisexpectedtocontainabout2400spots.However,itisimportanttounderstandthatduetotheuseofmanualproceduresinpreparingthesolutionintubes,theactualnumberofspotsmayvaryabout2400.TheresultsofautomaticspotdetectionaretabulatedinTab.3.2underdifferentMRIconditions.3.3.4ModelgeneralizationstudiesInthissection,thegeneralizationabilityoftheproposedapproachisdeterminedbytestingitindifferentpossiblepracticalscenarios.Inpractice,invivoscansmightbecollectedwithdifferentMRImachinesatdifferentlaboratoriesusingdifferentstrengths.GAandGBrepresenttwosuchinvivodatasets.Asdiscussedbeforeintheinvivoevaluationstudies,andasshowninFig.3.8,theCNNbasedapproachdemonstratesrobustnesstosuchvariationsandachieves97:3%accuracydespitesuchdifferences.Further,itisnecessarytoknowhowtheperformancewouldbeaffected41ifinvivodataisusedfortrainingbuttheinvitrodataisusedfortesting.Therefore,anexperimentwasconductedwhereaCNNwastrainedusingGA(invivo)andthentesteditusingGC(invitro).CNNstillperformedwithanaccuracyof96:1%.AvisualizationforthedetectedspotsininvitroandinvivoMRIscansisshowninFig.3.9andFig.3.9respectively.42Figure3.8:Comparisonandresults:(Top)invitroresults100micron,(Middle)generalizationtestusinginvivoscans,(Bottom)invitroresults200micron.43Figure3.9:3Dvisualizationofthedetectedspotsinaninvitroscan.Figure3.10:3Dvisualizationofthedetectedspotsinaninvivoscan.443.4ConclusionInconclusion,thisstudyinvestigateddifferentfeaturedesignapproachesforspotdetectioninMRI.AnapproachtoextractRoIfromMRIwaspresented.ACNNarchitecturetotheproblemofspotdetectionwasalsoproposed.ResultsshowthatfeaturesthatareautomaticallylearnedusingadeeplearningCNNoutperformhand-craftedfeatures.Further,theproposedapproachwasevaluatedusingadiversesetofMRIscansthatwereobtainedwithvariationsinstrength,echotimes,andresolutionchanges.AstudywasalsoconductedtocompareitsperformanceagainsttheknownnumberofspotsininvitroMRIscans.45Chapter4LearningwithSmallTrainingData4.1Introduction4.1.1BackgroundandmotivationOnekeyreasonbehindtheunprecedentedsuccessofCNNsistheavailabilityoflargeapplication-annotateddatasets.However,inmanypracticalapplications,especiallythoserelatedtomedicalimagingandradiology(e.g.spotdetection),obtainingalargeannotated(e.g.,labeled)datasetcanbechallenging.Inmanycases,annotationcanonlybeperformedbyexpertsandsocrowdsourcingmethods,suchasAmazon'sMechanicalTurk[21],cannotbeusedforannotatingdata.TheselimitationscanoftenprecludetheuseofCNNsinsuchapplications.Inordertoaddresstheproblemoflimitedtrainingdata,theconceptoftransferlearningcanbeused.Intransferlearning,knowledgelearnedforperformingonetaskisusedforlearningadifferenttask.Theideaoftransferlearningisnotnew.Forexample,theNIPS'95workshoponLearningtoLearnhighlightedtheimportanceofpursuingresearchintransferlearning.Anum-berofresearchstudieshavebeenpublishedinthepastinvestigatingdifferentaspectsoftransferlearningassummarizedinTab.4.2.IncaseofCNNs,transferlearningtypicallyentailsthetransferofinformationfromaselectedsourceconcept(sourceCNN,learnedforasourcetask)tolearnthetargetconcept(targetCNN,learnedforatargettask).RecentstudiesdetailhowtransferlearningcanbeperformedviaCNNs46Figure4.1:Givenalargenumberofpre-trainedsourceCNNs,theproposedapproachranksthemintheorderinwhichtheyarelikelytoimpacttheperformanceofagiventargettask.Thesourcetaskdataisnotusedinthisdetermination.bytransplantingthelearnedfeaturelayersfromoneCNNtoinitializeanother[37](SeeFig.4.2andFig.4.3).Duetoitsimpactonimprovingtheperformanceofthetargettask,transferlearningisbecomingacriticaltoolinmanyapplications[38][39].UsuallythisprocessisreferredtoastoindicatethatthetransplantedfeaturelayersofasourceCNNaremerelyusingthetargetdata.Itisnecessarytonotethatforsuchatransfer,thesourcedataisnotneeded;onlythesourceconceptasembodiedbythesourceCNNisrequired.ThisallowsresearcherstofreelyshareandreusepreviouslylearnedCNNmodels.1AttemptstoconvertCNNmodelsfromoneprogrammingplatformtoanother2hasalsofacilitatedthereusabilityofCNNs.Giventhese1https://github.com/BVLC/caffe/wiki/Model-Zoo2https://github.com/facebook/fb-caffe-exts47Figure4.2:TheavailabilityofsourcetaskdataisnotnecessaryinCNNbasedtransferlearning.TransferlearningmayonlyrequirethesourceCNNmodelandthetargetdatafortuning.Figure4.3:Thisdemonstratethebasicprocessofknowledgetransfer.LearnedfeaturelayersofasourceCNNaretransplantedtoinitializeatargetCNNwhichisthentunedusingthetargetdata.developments,ithasbecomenecessarytoinvestigatehowCNNmodelslearnedonvarioussourcetaskscanbeeffectivelyusedwhenlearningatargettaskthathasverylimitedtrainingdata.GivenaselectedsourcetaskorasourceCNN,recentstudiesshowanumberofusefulwaystotransferandexploititsinformationformaximizingtheperformancegainonthetargettask[37][40][41][42][43].48PreviousresearchhasclearlydemonstratedthatthechoiceofthesourceCNNhasanimpactontheperformanceofthetargettask[37].Somesources3mayalsoresultinaphenomenoncallednega-tivetransferwheretheperformanceonthetargettaskisdegradedasaresultoftransferlearning.However,aprincipledreasonforsuchadegradationhasnotbeenclearlydetermined.Further,inCNN-basedtransferlearning,thesourceismanuallychosen(e.g.,[39,38]).Severaldifferentap-proacheshavebeensuggestedtomanuallyselectasourcefortransferlearning.In[38],Agrawaletal.demonstratethatsourcedataobtainedfromamovingvehicle[44]canbeeffectivefortransferlearning,therebyhighlightingtheimportanceofmotion-baseddata.In[37],Yosinkietal.arguethatsourcetasksthatappeartobesemanticallyrelevanttothetargettaskwouldresultinbetterperformance.Alargenumberofstudies,however,showthatsemanticrelevancebetweensourceandtargettasksisnotalwaysnecessary;performanceimprovementhasbeenobservedevenwhenthesourceandtargettasksarenotrelated[38][45].Manualselectionhasthreemajordrawbacks:itissubjective,wheremultipleexpertsmaychooseadifferentsourceforthesametargettask;unreliable,wherethereisnoguaranteethatthechosensourcewillresultinbetterperformancethanothers;andlaborious,whereanexperthastomanuallyanalyzeaverylargenumberofpotentialsourcestasks.Currently,thereisnoprincipledwaytoautomaticallyselectthebestsourceCNNforagiventargettask.4.1.2TechnicalgoalThekeytechnicalgoalofthisstudy,therefore,istoinvestigatethepossibilityofautomatingsourceCNNselection.BychoosingthebestsourceCNNforagiventargettask,weanticipatethathighperformancecanbeachieveddespitetuningwithverylimitedtargetdata.Since,thisisthestudyattemptingtoautomatesourceCNNselection,wepresentthefollowingthreeideal3Notethatinthisstudy,sourcewillbeusedasageneraltermreferringtobothsourcetaskandsourceCNN.49requirementsofsucharankingmeasure:Scalable:ItonlyutilizessourceCNNs.Itdoesnotrequireustoadditionallystoreandmaintainthesourcedataofeachsourcetask.Unlikeastandardlearningbasedproblemwhereanobjectivefunctionisandoptimizedusingatrainingdataset,theidealrankingapproachshouldperformazero-shotrankingofCNNs,i.e.therankingapproachshouldnotutilizealearningphasethatisbasedonsourceCNNcharacteristics.Reliable:Ideally,therankingmeasureshouldnotbebasedonheuristics,especiallythosesimplybasedonthenotionofperceivedsimilarityordifferencebetweenthetasks.Therankingmeasureshouldbetheoreticallysustainedandnotheavilydependentonthetargettask.4.1.3NoveltyandcontributionsThisstudyisthetodemonstratethatautomaticallyrankingpre-trainedsourceCNNsispossible.ThisstudypresentsaninformationtheoreticframeworktoranksourceCNNsinanefreliable,zero-shotmannertherebysatisfyingalltherequirementsstatedabove.ThisstudypresentsathoroughexperimentalevaluationoftheproposedtheoryusingPlaces-MITdatabase,CalTech-256database,MNISTdatabaseandareal-worldMRIdatabase(whichisthefocusofthisthesis).50Table4.1:AsummaryofrelatedresearchintransferlearningviaCNNsResearchFocusSourceselectionOqubetal.[46]TransferinCNNsbytransplantingfeaturelayersManualYosinkietal.[37]ImpactoftransplantingdifferentCNNlayersManualLongetal.[40]CNNbasedtransferindeeperlayersManualAgrawaletal.[38]ApplicationoftransferlearningviaCNNManualTulsianietal.[39]ApplicationoftransferlearningviaCNNManualE.Littwinetal.[43]EffectofamultiverselossforimprovingtransferinCNNsManualProposedZero-shotrankingofsourceCNNsAutomated4.1.4Relatedwork4.1.4.1TransferlearningviaCNNsOqubetal.in[46]explainedhowtransferbetweenCNNscanbeimplementedbytransplantingnetworklayersfromoneCNNtoinitializeanother.Thisprocedureprovidesimprove-mentonthetargettaskandhasbeenutilizedindifferentapplications[38][39][45].Yosinkietal.in[37]presentanempiricalunderstandingoftheimpactoftransferringfeatureslearnedindiffer-entCNNlayers.TheyshowthatCNNfeatureslearnedinthelayeraregenericandsimilaracrossmultipletasks.Thesefeaturesbecomemoreandmoretaskinthedeeperlayers.TheauthorsalsodiscussthedifferentialimpactofsourceCNNsonthetargettask.Longetal.in[40]describehowdeeperlayerscanbemoreeffectivelytransferredtothetargetCNN.Arecentstudyin[43]providesintuitionsontheeffectofamultiverselossfunctioninimprovingtheperformanceoftransferlearninginCNNs.Thegoalofourworkisdifferentfromtheabove.Inparticular,weseektodevelopaprincipledwayforautomaticallyrankingsourceCNNSbasedontheirpotentialtofavorablytheperformanceofthetargettask.GiventheincreasingavailabilityofsourceCNNsinthepublicdomainandthediversityofpracticalapplicationsthathavetocontendwithscarcityoftrainingdata,theproposedapproachisexpectedtohaveaimpactontheviabilityoftransferlearning.514.1.4.2TransferlearningintraditionalresearchAsshowninTab.4.1,anumberstudieshavebeenpublishedintraditionaltransferlearningresearchthatdoesnotutilizeCNNs.Thesestudiesadoptdifferentapproachesfortransferringinformationacrosstasks.Inthecontextoftransferlearning,themeaningofseverallearningterminologiesvaryacrossdifferentstudiesintheliterature.Inthefollowingsubsections,wepresentabriefdiscussionontheseterminologiesanddifferences.4.1.4.3SupervisedtransferlearningDifferentstudiesrefertothetermsupervisedtransferlearningtorepresentslightlydifferentcon-textsoflearning.Inmanystudiessupervisedtransferlearningmeansthecasewherethereisabundantlabeleddataforthesourcetaskbutlimitedlabeleddataforthetargettask.ResearchbyDaume[47]andChattophadyay[48]usethisterminologytomeanthisparticularcontext.Ontheotherhand,instudiessuchasthosebyGong[49]andBlitzer[50]useadifferenttermforthissetup.Theyrelatesuchasetupwithasemi-supervisedtransferlearning.ForCook[51]andFeuz[52],thetermsupervisedtransferlearningonlyrelatestothesourcetaskdata.Ifthesourcetaskhasanylabeleddata,theyconsideritasthecaseofsupervisedtransferlearning.Further,theycallthelearningasinformedoruninformedbasedontheavailabilityorabsenceoflabeled,targettaskdata.4.1.4.4Semi-supervisedtransferlearningThetermsemi-supervisedtransferlearninghasalsobeenusedintheliteraturetorepresentdiffer-entcontexts.Forexample,studiesbyDaume[47]andChattophadyay[48]usethistermtorefertoacasewherethereisabundantavailabilityoflabeledsourcetaskdatabutthetargettaskdataisnotavailable.Ontheotherhand,asmentionedbefore,Blitzer[50]andGong[49]usetheterm52semi-supervisedtransferlearningtorefertoacasewherethelabeledsourcetaskdataisabundantandthelabeledtargettaskdataislimited.4.1.4.5UnsupervisedtransferlearningInthegeneralcontext,whenlearningapproachonlyutilizesunlabeledtrainingdata,theapproachisreferredtoasunsupervised.Inthecontextoftransferlearning,differentstudiesusethetermunsupervisedtransferlearningtorefertoslightlydifferentcontexts.ForFuez[52]andCook[51]unsupervisedtransferlearningrepresentsacasewherethelabeledsourcetaskdataisnotavailable.ForBlitzer[50]andGong[49],itmeansthecasewherethereisabundantlabeleddataforthesourcetaskbutnolabeleddataforthetargettask.Notethatthisscenariowasreferredassemi-supervisedtransferlearningbyChattophadyay[48]andDaume[47].Further,forPan[53],thistermreferstothecasewherethereisnolabeleddataforboth,thesourcetaskandthetargettask.4.1.4.6InductiveandtransductivetransferlearningPan[53],usetheterminductivetransfertorefertoascenariowheresomelabeleddataforthetargettaskavailable.Ontheotherhand,transductivetransferrefertoacasewherelabeledtargettaskdataisnotavailable,however,labeledsourcetaskdataispresent.Notethatforthissetup,Gong[49]andBlitzer[50]usedthetermofunsupervisedlearning.Basedonwhatistransferred,theseapproachescanbemainlycategorizedas(1)instance-basedtransferlearning,wherethelabeleddatainthesourcetaskisre-weightedtobeutilizedforthetar-gettask[54,55,56,57],(2)feature-basedtransferlearning,wherethefeaturesofthesourcetaskaretransformedtocloselymatchthoseofthetargettask,oracommonlatentfeaturespaceisdis-covered[58,59,60],(3)parameter-basedtransferlearning,wherethethegoalistodiscoversharedparametersacrosstasks[61,62]and(4)relationalknowledge-basedtransferlearning,whichisa53Table4.2:AbriefoverviewoftransferlearningresearchPaperFocusofresearchDaietal.[54]TransferlearningviaboostingalgorithmJiangetal.[55]SourceinstanceweightingfordomainadaptationLiaoetal.[56]UtilizingauxiliarydatafortargetlabelingWuetal.[57]IntegratingsourcetaskdatainSVMlearningframeworkPanetal.[58]TransferlearningviadimensionalityreductionPanetal.[59]DomainadaptationusingeffeaturetransformationBlitzeretal.[50]extractingfeaturestoreducedifferencebetweendomainsDaietal.[64]LabelingtargettaskdatausingunlabeledsourcetaskdataDuameetal.[47]DomainadaptationusingfeatureaugmentationXingetal.[65]Correctingthepredictedlabelsofshift-unawareRosensteinetal.[66]NegativetransferbetweentasksPanetal.[67]SpectralfeaturealignmentfortransferlearningRainaetal.[60]Learninghigh-levelfeaturesfortransferlearningGongetal.[49]ReducingdomaindifferenceinalowdimensionalfeaturespaceTommasietal.[61]TransferringSVMhyperplaneinformationYaoetal.[62]TransferringinternallearnerparameterinformationMihalkovaetal.[63]MarkovlogicnetworksfortransferringrelationalknowledgeLongetal.[68]JointdomainadaptationAmmaretal.[3]AutomatedsourceselectioninreinforcementlearningusingRBMscomparativelylessexploredareainthiscontext,andwherethegoalistotransfertherelationshipamongdatafromasourcetasktoatargettask[63].Whilethehistoryoftransferlearningresearchspansovertwodecades[69,70,53],thequestionofhowtopredictthetransferabilityofasourcetask,inasupervisedframework,isrelativelylessstudied.Somestudiesassumedthatthesourceandtargettaskshadtobesimilarinorderforthetransferlearningtobeeffective[66][54].Suchanassumptionmaynotbetrueinpractice.Forexample,ifthetargettaskitselfisduplicatedandpresentedasasourcetask,thesimilaritybetweentargetandsourcetaskswouldbemaximal;however,suchanarrangementwillbeundesirableduetoredundancyandoverInaddition,suchapproachesmaynecessitatethestoringofsourcedata.In[71],amethodtochooseauxiliarytrainingdatatofacilitatetransferlearningisdiscussed.Themethodutilizesavalidationsetbasedonthetargettaskinordertoselecttheauxiliarytraining54samples.However,themethodisiterative,computationallyexpensive,anddoesnotutilizetheauxiliarydatainazero-shotmanner.Recently,in[3],theauthorsutilizedarestrictedboltzmanmachinebasedapproachtoautomaticallyselectthesourcetaskfortransferinthecontextofreinforcementlearning.However,theapproachhastwodistinctshortcomings.Firstly,itisbasedontheimplicitassumptionthatthesourceandtargetdatahavetobevisuallysimilarinorderforthetransferlearningtobeeffective.Secondly,theapproachdoesnotexplicitlylinktherankingcriteriawithperformancegainonthetargettask.However,theapproachisobservedtoperformwellonthetargettasksconsideredbytheauthors.Therefore,wecomparetheproposedapproachwiththeapproachin[3].4.2ApproachInthissection,beforewepresenttheproposedtheoreticalframework,twointuitiveandpreliminarystudiesarediscussedThedetailedexperimentalresultsfromthesetwointuitiveapproacheswillbepresentedlaterinthesupplementarymaterial.Inthefollowingsubsections,onlythemo-tivationfortheseapproachesandtheirlimitationsarediscussed.Thiswillbefollowedbyadetaildiscussionontheproposedtheoreticalframeworkthatmeetsallthedesignrequirementsmentionedpreviously.4.2.1Intuitiveapproach:AsolutionspacebasedapproachTheapproachhereisbasedonthehypothesisthatgivenaCNNarchitecture,thereexistasolutionspaceforit.DifferentpointsinthisspacerepresentCNNbasedsolutionsfordifferentapplications.Thisindicatesthattheremaybeaspatialregionrepresentingidealsolution(s).Thus,thedataismerelyutilizedbyaCNNduringitstrainingphasetotraversethisspacesoasitmovesaway55fromarandomlyinitializedspatiallocation.AstheCNNsutilizetrainingdataforatask,ittraversesfarfromtherandom,non-idealspaceandhencelearnsmoreusefulfeatures.Inthisregard,thefollowingconceptsaremoreformallypresented:4.2.1.1CNNsolutionspaceConsiderahighdimensionalsolutionspacewitheachpointdenotingtheweightsinthelayersofaCNNthataretransferred.ForaedCNNarchitecture,eachpointinthisspacedenotesaCNNbasedsolutionforsometask.Forexample,onepointinthisspacemayrepresentanidealsolutionfortheface-recognitionproblemwhereasanothermayrepresentanon-idealsolutionfordiseaseestimation.IthasbeengenerallyacceptedintheliteraturethattheweightsintheinitiallayersofaCNNactasgeneralfeatureextractionoperatorsand,thus,theCNNsformanydifferenttasksmayhavesimilarlayers;incontrast,theweightsthatareinthedeeperlayersofthenetworkbecomeincreasinglytask[37,40].Hence,theweightsinthedeepestconvolutionallayersrepresentthemosttaskweightsandareutilizedheretodenoteaCNN.4.2.1.2SolutiondifferenceThismeasurestheEuclideandistancebetweentwopointsinthesolutionspace.SolutiondifferencebetweentwoCNNsNiandNjiscomputedasrij=jjWiWjjj22.HereWiandWjaretwopointsinsolutionspacedenotingNiandNjrespectively.4.2.1.3SolutionpathDuringtraining,aCNNisinitializedtoapoint(randomorotherwise)inthesolutionspace.Thenthelearningalgorithmadjuststheweightsincrementallyaftereachepoch,andtheupdatedweightstraverseapathinthesolutionspace,referredtohereasthesolutionpath.Forataski,let56Figure4.4:Theoutputofthelastlayermistaskdependentandisthereforeitsdimensionalityisdifferentdependingontask.Hence,thek-dimensionaloutputoflayerm1isutilized.Also,itistheoutputofthislayerwhichwillbeutilizedlaterintheexperimentsectionforvisualization.Pti=[Noi;N1i;:::;Nti]denoteitssolutionpath,whereNitdenotesthesolutionpointatepochtandNiorepresentstheinitializationpoint.4.2.1.4Path-to-pointprAsequenceofdifferencesbetweenaCNNNjandeachpointinPicanbecomputedusingthesolutiondifferencemeasureabove.Thisresultsinasequenceofdifferenceshtij=[roij;r1ij;:::;rtij],57whereroijisthesolutiondifferencebetweenNoiandNj.4.2.1.5SourceCNNrankingGivenasourceCNNNtiandarandomlyinitializedCNNNo,therankingscorecanbecomputedas,Ei=rtio:(4.1)Now,thetrainingprocessresultsintintermediateCNNs:i.e.,theCNNsinPti.AnyofthesetCNNscouldpotentiallybeasuitablecandidatefortransferlearning;itisnotnecessarilythecasethatNtiwillresultinthebestperformancegainaftertransferonthetargettask.Therefore,thecriterionEiisupdatedas,Ei=maxfhtijg:(4.2)NotethatthedevelopmentofEireliesonthefactthatsmalltrainingdatasets,ingeneral,areincapableofimputingacomprehensiverepresentationalpowertoaCNN.Asthesupplementaryinformationshowsthisapproachcanworkwellwhenthetargettrainingdataislimited(seeChap.6forexperimentalresults).However,thisapproachalsohastwodistinctdrawbacks:Thepresentedapproachisintuitiveandisnotderivedfromtheory.CNNrankingdoesnottakeintoconsiderationthevariabilityinthetargettask.Consideringtheselimitations,atheoreticalapproachispresentednextwhichmeetsalltheaforementionedidealrequirements.InSupplementarymaterial(Sec.6.2)experimentalresultsofthisapproacharepresented.58Table4.3:SummaryofthebasicnotationsusedinthissectionNotationDescriptionDi=(Xi;Yi)Trainingdatasetforsourcetaski.De=(Xe;Ye)Testdatasetfortargettask.Dt=(Xt;Yt)Trainingdatasetfortargettask.Dv=(Xv;Yv)Validationdatasetfortargettask.XiTrainingsamples(images)forsourcetaski.YiCorrespondinggroundtruthlabelsonXi.Xt,Xv,XeTraining,Validation,andTestsamplesfortargettask,respectively.Yt,Yv,YeCorrespondinggroundtruthlabelsonXt,Xv,andXerespectively.NiSourceCNNlearnedusingdataDi.NtTargetCNNlearnedusingdataDt.mTotalnumberofprocessinglayersinaCNN.lDenotesthelayernumberinaCNN,wherelm.H(A)EntropyofvariableA.H(AjB)ConditionalentropyofAgivenvariableB.I(A;B)MutualinformationbetweenAandB.4.2.2Theoreticalapproach4.2.2.1NotationsConsiderasetofqsourcetaskswithcorrespondingtrainingdatasetsfD1;D2;:::;Dqg.Foreachtaskiq,datasetDi=(Xi;Yi)whereXirepresentsthetrainingsamplesandYidenotesthecorre-spondinglabelsonthem.Also,foreachtaski,aCNNNiislearnedbyutilizingDifortraining.ThisresultsinasetofqsourceCNNsfN1;N2;:::;Nqg.Similarly,consideratargetdatasetthatisdividedintoDeandDa.ThetestsetisrepresentedasDewhileDa=fDt;Dvgrepresentsthedatathatcanbeutilizedfortraining(Dt)andvalidation(Dv).Notethatthesizesofthetrainingandvalidationdatawillbekeptverysmallinourexperimentsinordertoassessefyoftheproposedapproachinreal-worldapplicationswithsmalltrainingdata.Similartothesourcetaskdatasets,eachofthedatasetscorrespondingtothetargettaskalsocomprisesofimagesandcorrespondinglabels.Forexample,thetargettrainingsetcanbedenoted59Figure4.5:Theoutputofthelastlayermistaskdependentandisthereforeitsdimensionalityisdifferentdependingontask.Hence,thek-dimensionaloutputoflayerm1isutilized.Also,itistheoutputofthislayerwhichwillbeutilizedlaterintheexperimentsectionforvisualization.asDt=(Xt;Yt),whereXtaretheimagesandYtarethecorrespondinglabels.Further,letNtdenotetheCNNthatislearnedusingthesmalltrainingsetDt.Abriefsummaryofthenotationsistabu-latedinTab.4.3.4.2.2.2DerivingthemeasureThegoalhereistoderivearankingmeasureonsourceCNNsthatisexplicitlybasedonreducingtheerroronthetargettask.TheuncertaintyinpredictingthetestinglabelsYeisgivenbytheentropyH(Ye),whereH()representstheentropyfunction.Ahigherentropyvaluewouldmeanalargeruncertaintyinpredictionand,therefore,thegoalistoreduceH(Ye).Withtheavailabilityofmoreinformation,whichcanbepotentiallyusefulinlabelprediction,thisuncertaintycandecrease.GiventhatwehaveatrainedCNNNtthatwasderivedusingthesmalltrainingdataDt,additionalinformationNmt(Xe)can,inprinciple,beextractedfromthetestimagesXe.ThenotationNmt(Xe)60indicatesthatimagesinXeareinputintotheCNNNtandtheoutputofthemthlayerisobtained.Here,misthelastlayer(totaldepth)oftheCNN.Since,Nmt(Xe)representstheoutputscorebythelastlayerofNtonXe,wewriteNmt(Xe)asNt(Xe)forsimplicity.Theoretically,asconditioningreducesentropy,therefore,H(Ye)H(YejNt(Xe)):(4.3)Similarly,additionalinformationcanalsobeextractedfromXebyutilizingthefeaturerepre-sentationslearnedbytheCNNforasourcetaski.ThisinformationcanbedenotedasNli(Xe).Since,thedimensionalityoftheoutputoflastlayer,i.e.,atl=m,canbedifferentfordifferentsourcetasks4,theoutputofthelayerl=m1isextractedandutilized.Again,asconditioningreducestheentropy,wehave,H(Ye)H(YejNt(Xe))H(YejNt(Xe);Nm1i(Xe)):(4.4)Further,asthetestimagesXeandthelabelsYewillnotbeavailableduringthetrainingstage,thevalidationdataDv=(Xv;Yv)isutilizedinstead.Thus,H(Yv)H(YvjNt(Xv))H(YvjNt(Xv);Nm1i(Xv)):(4.5)ThisequationshowsthatwithadditionalinformationextractedusingasourceCNN,theuncer-taintyinpredictioncanfurtherdecrease.Now,thetotaldecreaseinuncertaintycanbewrittenas4Here,thedimensionalitypertainstothenumberofclassesinatask61thedifferencebetweenthefollowingterms:f=H(Yv)H(YvjNt(Xv);Nm1i(Xv)):(4.6)Ininformationtheory,thisdifferencefiscalledgainorinformationgain.Thisgaincanalsoberewrittenintheformofmutualinformationas,f=H(Yv)[H(YvjNt(Xv))I(Nm1i(Xv);YvjNt(Xv))]:(4.7)Here,themutualinformationisdenotedbythefunctionI().ForanythreevariablesA,BandC;I(A;BjC)=I(B;AjC)andso:f=H(Yv)H(YvjNt(Xv))+I(Yv;Nm1i(Xv)jNt(Xv)):(4.8)Inthecontextoftwovariables,H(Yv)H(YvjNt(Xv))=I(Yv;Nt(Xv))hence:f=I(Yv;Nt(Xv))+I(Yv;Nm1i(Xv)jNt(Xv)):(4.9)Theequationherehastwoterms.ThetermI(Yv;Nt(Xv))denotesthegainduetothemutualinformationbetweenthetargetlabelsYvandthepredictedoutputscoresNt(Xv)bythetargetCNN.Thehigherthismutualinformation,thelessertheuncertaintyinpredictingYv.Fig.4.6showsaninformationdiagram5fortheaforementionedterms.Region-1inthisrepresentsthetermofEqn.(7).Notethatthesecondterm,I(Yv;Nm1i(Xv)jNt(Xv)),representsthegainduetoasource,5AninformationdiagramissimilartoavenndiagrambutisusedtoshowrelationshipbetweenShannon'sbasicmeasuresofinformation.62Figure4.6:Informationdiagram:TheterminEqn.(7)isrepresentedbyregion-1whereasthesecondtermisrepresentedbyregion-2.Thelargertheregion-2,themoreusefulisthesourceCNN.thatisnotalreadyaccountedforbyNt(Xv).InFig.4.6,region-2representsthisterm.Thistermprovidesadditional,relevantinformationthatwasnotavailablewhenonlyutilizingthetarget'strainingdataDt.Thehigherthevalueofthisterm,themoreusefulasourcewillbe.Since,thetermisindependentofthesource,thesecondtermherecanbeutilizedtomeasuretheworthofasourceCNN.Therefore,forasourceCNNNi,itstransferability6giisgivenas,gi=I(Yv;Nm1i(Xv)jNt(Xv)):(4.10)Notethatthistermcaneasilybecomputedusingpubliclyavailableimplementationsformutualinformation.Forthereproducibilityoftheresults,theimplementationanddatasetsusedherewillbemadepubliclyavailable.6Inthisstudy,thetermtransferabilityandrankingscorewillbeusedalternatively.634.2.2.3DiscussionTheproposedtermisawarethatiftheinformationextractedviasourceCNNisexactlythatofthetargetCNN,i.e.,Nm1i(Xv)=Nt(Xv),thetransferabilitywillbezero.Bycomputingtheconditionalmutualinformation,thetermisexplicitlyevaluatingasourceCNN,Ni,basedontheadditional,predictiveinformationbetweenYvandNm1i(Xv)thatcannotbeobtainedfromNt(Xv).IfthesourceCNNiscompletelyirrelevant,theadditional,predictiveinformationfromNm1i(Xv)willhavezeromutualinformationwithYv,resultinginzerotransferability.4.2.2.4UpperboundontransferabilityTheupper-boundontransferabilitycanalsobeestimated.ThisestimatewilldenotethemaximumtransferabilitythatcanbeachievedbyasourceCNN.ThetotaluncertaintyinpredictinglabelsisestimatedbyH(Yv).SomepredictiveinformationisprovidedbyNtthatistrainedonthetarget'strainingdata.Thisinformationisdenotedbyregion-1inFig.4.6.ThisoverlapofinformationcanbewrittenasH(Yv)\H(Nt(Xv))orsimplyasthemutualinformationI((Yv);Nt(Xv)),asdiscussedbefore.Theremaininginformation,H(Yv)I((Yv);Nt(Xv)),canbeprovidedbyasourceCNN.Theoretically,thisestimatesthemaximumamountofinformationthatisrequired.Since,H(Yv)I((Yv);Nt(Xv))=H(YvjNt(Xv)),theupperboundontransferability,gmax,cansimplybewrittenasH(YvjNt(Xv)).644.2.3Datasets4.2.3.1TargetData-MRIdatabaseArealworldMRIdataset[18][72]isutilizedasthetargetdata.ThetaskistodetecttheinjectedcellsininvivoMRIscansthatappearasdarkspots.Inthisthesis,thisdataisdenotedbysetGA.Inmanymedicalapplicationssuchasthis,notonlyisthecollectionofdatachallengingbutthelabelingofthedataisalsoexpensiveandhighlytimeconsuming.Forthelong-termsuccessofcellbasedtherapies,itisessentialthatinsuchapplications,injectedcellsaredetectedaccuratelywithminimumlabelinginputwhichiscurrentlyapracticalchallenge[72][73].Thisdatasetcomprisesof5MRIscansofdifferentinvivoratbrains.Spotsin3ofthesescanswerelabeledbyamedicalexpert.These3scanswereutilizedinthisstudy.Fromeachscanabout100;000patcheswereextractedaspotentialspotsbyauthorsin[72].Onlyabout5;000ofthesewerespot-patches(positiveclass)andtheremainingwerenon-spotpatches(negativeclass).Trainandtestscansweremutuallyexclusive.Fromeachtrainingscan,only5%ofthepatches(about5;000)wererandomlyselectedandutilized.Further,only85%oftheselected5%wereusedfortrainingNtandtheremaining15%wasusedasthevalidationsetDv.4.2.3.2TargetData-MNISTdatabaseInaseparateexperiment,wetestthegeneralizationoftheproposedapproachusingthestandardMNISTdatabase.Here,themulti-classtaskinvolvesdifferentiatingbetweenwrittendigitsrangingfromfi0"andfi9".ThetotalnumberoftrainingsamplesinMNISTdatabaseisabout60;000,outofwhichonly5%arerandomlychosenandutilizedinthesamemanner.65Figure4.7:Eachimageinthesourcedatasetwasconvertedtograyscaleandthendown-sampledto2020and99.Someoftheseimagesalongwiththeirtransformedversionsareshownhere.4.2.3.3SourceData-Places-MITdatabaseInthisstudy,thepubliclyavailablePlaces-MITdatasetwasutilized[74].Thisdatasethasadiversesetof205differentclasseswithimagescontainingclutteredurbanscenes,emptyhall-ways,cakes(inbakery),(inaquarium),etc.Asetof500differenttaskswererandomlygeneratedwithclassesrangingfrom2to205.Theimagesinthisdatabasearemuchdifferentindimensionsfromthe99patchesintheMRIdatabaseandthe2020imagesoftheMNISTdatabase.Therefore,eachimageherewasconvertedtograyscaleandthendown-sampledtothesizecompatiblewith66theimagesofthetwotargettasks.Thetransformedimagesexhibitdiversityintheircontent,asshowninFig.4.7.4.3Experiments,ResultsandDiscussionInthissection,wedesignexperimentstoanswerthefollowingquestions:(1)HowwelldoestheproposedmeasurerankthesourceCNNsforatargettaskthathasscarcityoftrainingdata?(2)Howdoestheperformanceoftheproposedapproachcomparewithapreviousapproachintheliteraturethatisheuristic-based?(3)Cantheimpactofthetopandtheworstrankedsourceonthetargettaskbevisualizedandcompared?(4)HowdoesthenumberoftrainingsamplesimpacttheperformancegainduetotransferlearninginCNNs?Inallexperiments,AUC(AreaUnderROC)wasutilizedasthemeasureofaccuracy.4.3.1MRIbasedtargettaskRankingSourceCNNs:Usingthe500sourcetasksgeneratedfromPlaces-MITdatabase,500CNNswerelearned.TheCNNarchitectureusedin[72][73]wasadoptedforthistargettask.Usingtheproposedapproach,allthesesourceCNNswererankedpriortoconductingthetransfer.TherankingscoresforeachCNN,i.e.,measuredtransferability,isshownonthehorizontalaxisofFig.4.8whiletheperformancewithtransferlearningispresentedontheverticalaxis.ForverticalaxisinFig.4.8,500moreCNNswerelearnedbytuningeachsourceCNNonthetargettrainingdata.Notethehighdegreeofcorrelationbetweentherankingscoreandthedegreeofimprovementinperformanceaftertransferlearning.ThescoresshownherearethenormalizedscoresobtainedafterdividingtherankscoreofeachsourceCNNbythemaximumscoreachievedbyanyofthe500sourceCNNs.ThetwoinFig.4.8representtheresultsontwo67differenttestMRIscans.Ineachcase500CNNswereevaluatedonthecompletesetoftestpatches(about100;000ineachscan).Whentrainingusingthesourcedata,eachsourceCNNunderwentapre-determinednumberof15epochs.Whentuningonthetargetdata,thetrainingofeachsourceCNNproceededuntilconvergence.PerformanceComparison:Inthisexperiment,thegoalistocomparetheperformanceoftheproposedapproachwithanotherapproachintheliteraturethatmerelyreliesonsimilaritybetweenthesourceandtargettasks.In[3]theauthorsproposeutilizingRBMsforautomatedsourcese-lectionwhichisbasedonthesimilaritybetweentasks.Therefore,usingtheirproposedprotocol,anRBMmodelwastrainedforeachsourcetask.Then,usingeachsourceRBMmodel,thereconstructionerroronthetargetdataiscomputed.ThenormalizedreconstructionerrorsforeachsourceRBMmodelisshownonthehorizontalaxisinFig.4.8.Theverticalaxisrepresentstheper-formancetransferlearningusingthecorrespondingsourceCNN.Itcanbeclearlyobservedthatthereisalackofcorrelationbetweenreconstructionerrorandperformanceimprovementaftertransferlearning.Asmentionedbefore,theapproachesbasedonheuristicsofsimilarityordifferencecanfailinpracticeandmaynotbeapplicableforallsource/targettasks.AnalyzingRankedSourceCNNs:Thegoalhereistovisuallyinvestigatethedifferencebe-tweensourceCNNsthatwererankedthebestandtheworst.ToboththesesourceCNNs,twodifferenttestsetsweregivenasinputsandthe200-dimensionaloutputofthefullyconnectedlayerthesampleswasobtainedforalltestsamples.NotethattheseoutputsarefromsourceCNNsthathavenotyetbeentunedusinganytargetdata.The200dimensionaloutputswerethenprojectedtoa3Dspaceusingprincipalcomponentanalysis.Thespotsamplesandthenon-spotsampleswere68Figure4.8:(Toprow)Thetwoshowtheresultoftheproposedapproachontwodif-ferenttestsets.(BottomRow)ThesetwoshowtheresultoftheRestrictedBoltzmannMachinebasedapproachin[3].Thehorizontalaxisshowsthereconstructionerrorcomputedonthetarget'strainingdatausingthesourceRBMmodel.Notethehighdegreeofcorrelationexhib-itedbytheproposedmeasure(toprow)withimprovementinperformance.coloreddifferentlyandvisualizedinthisspace(seeFig4.9).Ineachtheviewpointthatbestillustratesthedecisionboundaryispresented.Itcanbeseenthatthebestrankedsource,evenpriortoobservinganyMRIdata,hasthepo-69Figure4.9:Priortoconductinganytransfer,theabilitytodiscriminatebetweenspotandnon-spotpatchesisvisualizedin3DspaceforthebestrankedandtheworstrankedCNN.TheintheleftcolumncorrespondtothebestrankedCNNontwotestsets,whiletheintherightcolumncorrespondtotheworstrankedCNNontwotestsets.Seetextforfurtherexplanation.tentialtoseparatespotsamples(yellow)fromnon-spotsamples(black).Theworstrankedsourcedoesnotdifferentiatebetweenthetwoclasses.Infact,thespreadofsamplesacrossthethreedi-mensionsisverylowandallsamplesappeartobeconcentratedinasmallerregion.Therefore,takingthebestCNNastheinitialpointinlearningthetargetconceptclearlyprovidesanedgeoverrandominitializationorusingothersourceCNNswithmuchlesserrankingscores.Impactoftrainingsamplesize:Although,themainfocusofthisstudyistotesttheefy70Figure4.10:Thehorizontalaxisshowthepercentageofthetargettrainingdatautilizedintuning.They-axisshowstheperformanceaftertransfer.Datasetsizewasincrementedinvaluesof5%,andforeachdataset,theproposedapproachwasusedtorankthesourceCNNs.Here,thetransferwasonlyconductedusingthebestandtheworstrankedCNN.Notetheperformanceimprovementforsmallertrainingsizeswhichconveystheimportanceoftheproposedmethod.oftheproposedapproachwhenthetargettrainingdataisverysmall,wearealsointerestedinin-vestigatingtheeffectofincreasingthetargettrainingsize.InFig.4.10,weseethattheproposedapproachisespeciallyveryusefulwhenthetargettrainingsizesaresmall.Usingonly5%oftheavailabletrainingdata,theperformanceisobservedtoimprovebymorethan35%aftertransferlearning.ThismeansthatthelabelingeffortfromamedicalexpertcanbereducedwithoutcompromisingtheAUCperformance.However,whenthereisalreadyalargeamountoftrainingdataavailable,transferofknowledgefromasourceCNNmaynotbringachangeintheresults.71Figure4.11:RankingperformancefortheMNISTtargettask:44CNNs,basedonrandomlychosensourcetasks,areranked.Fortuning,only5%oftheavailabletrainingsetwasrandomlychosenforthegiventask.TestingwasperformedonalltheimagesintheMNISTtestset.Notethattheperformancewithouttransferlearning,usingtheselected5%ofthetrainingdata,wasabout0:21.4.3.2MNISTbasedtargettaskWefurtherevaluatetheproposedrankingmeasureusingthemulti-classMNISTdatabase.Theexperimentalprotocolusedhereisthesameastheoneusedintheprevioustargettask.AstandardLeNet-likeCNNarchitecturewithReLUactivationlayerswasutilized.However,only44differentsourcetaskswererandomlypickedandthecorrespondingCNNswerelearned.Notethatsimilartotheprevioustargettask,only5%oftheavailabledatawasutilized,asexplainedin4.2.3.2.TheexperimentalresultsareshowninFig.4.11.72Figure4.12:(Toprow)Thetwoshowtheresultoftheproposedapproachontwodifferenttestsets.(BottomRow)ThesetwoshowtheresultoftheRestrictedBolzmannMachinebasedapproachin[3].Thehorizontalaxisshowsthereconstructionerrorcomputedonthetarget'strainingdatausingthesourceRBMmodel.4.3.3ExperimentsusingCalTech-256Generally,intheliteratureontransferlearning,thetargettaskisassumedtocontainlimitedtrainingdatawhilethesourcetaskisassumedtohavealargeamountoftrainingdata.Inthisexperiment,73Figure4.13:Informationdiagram:Emphasizingtheneedtoexploitmultiplesources.achallenging,non-conventionalcaseisconsideredtofurthertesttherobustnessoftherankingmeasure.Here,thesourceCNNsarealsotrainedusinglimitedtrainingdata.Further,thetrainingdataforeachclasshaslargeintra-classvariations.Tofacilitatethis,500additionalsourcetaskswererandomlygeneratedusingthepubliclyavailableCalTech-256dataset.Thisdatasetcontainsabout256classesandtheaveragenumberofimagesineachclassisabout120.Classesintheadditionalsourcetasksrangedfrom2to256.TheproblemofspotdetectioninMRI,asdiscussedin4.2.3.1,wasusedforthetargettask.Despiteanon-traditionalscenario,weseeinFig.4.12thattheapproachisstillabletodifferentiatebetweenthesourceswhentestedontwodifferentMRItestsets.However,thevarianceathigherrankingscoresislarger,indicatingthechallengeposedbyrankingsuchCNNmodels.744.4ConclusionThisstudyisthetoshowthatthesourceCNNscanberankedinincreasingorderofforagiventargettask.Aninformationtheoreticframeworkthatperformsreliable,zero-shotrankingofCNNswaspresented.TheapproachwasthoroughlyevaluatedusingPlaces-MITdatabase,CalTech-256database,MNISTdatbase,andarealworldMRIdatabase(whichisthefocusofthisthesis).WedemonstratedthatduetransferringknowledgefromthebestsourceCNN,highperformancecanbeachievedonthetargettaskdespiteusingsmalltrainingdata.AutomatingthecrucialstepofsourceselectionisafundamentalimprovementinthestandardpracticeoftransferlearninginCNNs.Thisstudy,alsoopendoorstobetterinvestigateseveralotherrelatedresearchproblemssuchasautomaticallytheoptimalnumbersoflayerstotransfer.Moredetailsonpotentialfutureworkareasfollows:4.4.1MultiplesourcesUsingtheproposedframework,inFig.4.13,weshowaninformationdiagramwhereanothersourceCNNNjbringsinformationthatisnotaccountedforbybothNtandNi.Since,Region-3ismuchsmallerincomparisontoRegion-2,suchasourceshouldhavealowrankscoreandisantici-patedtobelesscomparedtoNi.However,ifNjisutilizedappropriatelyincombinationwithothersourcessuchasNi,theoverallentropywillfurtherreduceasH(YvjNt(Xv);Nm1i(Xv))H(YvjNt(Xv);Nm1i(Xv);Nm1j(Xv))Therefore,oneinterestingfuturedirectionwouldbetoextendthecurrentframeworktoincorporatemultiplesourceCNNs.Forexample,theformulationpresentedherecanalsobeseenassimplifyingtheproblemofsourceCNNselectiontoafeatureselectionexercise.Inthiscontext,itwillalsobeinterestingtoinvestigatehowdifferentfeatureselectionapproachescanbeappropriatedandexperimentally75comparedinthiscontext.4.4.2LayerstotransferThegoalofthisstudywasnottotheoptimalnumberoflayerstotransfer,ratheralltheconvolutionallayersweretransferredhere.Findinganoptimalnumberoflayerstotransfer,inaprincipledmanner,isstillanopenproblem.Infuture,weplantoinvestigatehowtheperformanceduetodifferentnumberoftransferlayersiscorrelatedwiththerankingscoreofagivensourceCNN.76Chapter5ExploitingLabelinglatency5.1IntroductionInthischapter,weinvestigatetheroleofincorporatinganexpert'slabelingbehaviorintoapartic-ular,theconvolutionalneuralnetwork(CNN).Theinspirationforthisapproachcomesfromresearchinpsychophysiologywhereithasbeenobservedthatthehumanmindprocessesdif-ferentimagesdifferently,basedonthesalientcharacteristicsofindividualimages/stimuli[75][76].ThisisperhapsthecasewhenamedicalexpertcarefullyanalyzesandlabelsspotsinMRIscans.Forexample,aneasy-to-classifyspotmaytakelesstimetolabel,whileadifspotmayrequiremoretimetolabel.Thus,thetimetakenbyanexperttolabeleachspot,i.e.,thelabel-inglatency,canbeviewedasavariablethatmodelsthelabelingbehaviorofanexpert.However,thelabelinglatencyvalueassociatedwithaspot(positivesample)providesadditionalinforma-tionthatisonlyavailableduringtrainingandisabsentduringtesting.Further,amedicalexpertonlylabelsthepositivesamplesinanMRI(theremainingsamplesintheMRIareautomaticallyassumedtobenegativesamples).Thus,labelinglatencyisonlyavailableforoneclass,i.e.,thepositiveclass.Theparadigmoflearningusingprivilegedinformation(LUPI)iscloselyrelatedtotheproblemathand.Privilegedinformation(alsoknownassideorhiddeninformation)isalsoavailableonlyduringtrainingbutabsentduringtesting[77,78,79,80,81,82,83,84].However,existingLUPIapproachescannotbeappropriatedinthecontextofsupervisedrlearningwheretheside77Figure5.1:ThischapterdescribesaCNNarchitecturethatincorporatesthelabelingbehaviorofanexpertduringthetrainingphase.Thelabelingbehaviorisanticipatedtoprovidesideinformationthatcapturestheintra-classvariabilityofpositiveexemplarsinatwo-classproblem.(Note:GreenmarkershavebeenusedtoindicatespotsinaMRIscan).informationisonlyavailableforsamplesofoneclassandmissingfortheotherclass(es).Inthisregard,thecontributionsofthischapterarethree-fold:Utilizinglabelinglatencyasanadditionalvariableforlearninginthecontextofamedicalimag-ingapplication.Introducingtheproblemofexploitingsideinformationthatisonlyavailableforoneclass.DesigninganewCNNframework,L-CNN,1thatexploitslabelinglatencyassideinformation.5.1.1PriorliteratureInthissection,abriefoverviewoftherelatedworkispresented.1ThetermL-CNNisusedtoindicatethattheCNNexploitsLabelingbehavior.78Figure5.2:Unliketraditionalfeatures,labelinglatencyisonlyavailableduringthetrainingphase.Further,unliketraditionalsideinformation,itisassociatedwithasingleclassonly.Inthisatwo-classproblem(fi+"andfi-")isconsidered.5.1.1.1learningwithlabelinglatencyTheliteratureonLUPI-basedapproachesiscloselyrelatedtoourwork.ThebasicgoaloftheLUPIparadigmistoexploitsideorprivilegedinformationthatisavailableonlyduringtrainingandnotduringtesting.Sideinformationhasbeensuccessfullyutilizedinthecontextofunsuper-visedlearningframeworks[85,86].Anumberofapproachesalsoshowtheofusingsideinformationinasupervisedlearningframework[77,78,79,80,81,82,83].However,inthesu-pervisedlearningframework,existingLUPIapproachescannotbeutilizedifthesideinformationisonlyavailableforasingleclassandcompletelyabsentfortheotherclasses.Arecentstudy[84]utilizedreactiontimeofalabelerasanadditionalsideinformationinaSVMbasedframework.However,in[84],thesideinformationisavailableforbothpositiveandnegativeclasses.Intheexperiments,animagewasdisplayedforaveryshortperiodoftimetothelabeler,whohadtoindicatewhethertheimagecontainedafaceimageornot.Thereactiontimetolabelwastakenasthesideinformationforeachimagethatcouldpotentiallymodelthedifofeachimage.79Figure5.3:BasicarchitectureoftheproposedL-CNNframework.5.1.1.2CNNlearningwithsideinformationTheideaofexploitingsideinformationwithCNNsisrelativelylessexplored,especiallyinthecontextofimagebasedlearning.Thefewapproachesthathavebeenstudied[87,88]sufferfromthesamelimitationasstandardLUPIapproachesand,therefore,cannotbeeasilyappropriatedtotheproblemathand.ThisstudyisoneofthetodemonstratehowaCNNlearningframeworkcanexploitlabelinglatency.5.2ApproachThebasicarchitectureoftheproposedL-CNNframeworkisshowninFig.6.1.Thehumancomputerinterface(HCI)takesanMRIscanGasaninputandextractspatchesXfromitsslices,whereeachextractedpatchcanpotentiallycontainaspot.Duringthetrainingphase,italsoallowstheexperttolabelspots(positivepatches)ineachsliceinaninteractivemannerusinganimageviewer(e.g.,byzoomingin,changingimagecontrast,etc.),resultinginasetoflabelsYassociatedwiththesepatches.Further,thelabelinglatencyvalueassociatedwitheachpositivespotlabelisalsorecorded.Labelinglatencyisutilizedlaterasanadditionalsourceofinformationtocategorizespotpatches.Afterthis,atransferlearningparadigmisadoptedfor80whichthesourcetaskinvolveslearningaCNNthatcandistinguishbetweenthesecategoriesofspotpatches.Thetargettaskistodifferentiatebetweenspotandnon-spotpatchesusingtrainingdatathathasalimitednumberofpositiveexamples.SinceCNNsareinitializationdependentandunfavorablyimpactedwhenthetrainingdataislimited,ratherthanusingarandomlyinitializedCNN,theproposedapproachtransferslayersfromthesourceCNNtothetargetCNN.Theresultsobtainedwiththisapproacharesuperiortopreviousstate-of-the-artfortheproblemofspotdetectioninMRIscans.5.2.1Imageviewer:Human-computerinterfaceThismodulehastwomainpurposes:First,itisusedtoobtaingroundtruthonspotsandtorecordlabelinglatencywhenanexpertmanuallylocatesspotsinanMRIscan.Second,afterthespotshavebeenlabeledbytheexpert,thesystemextractspatches(describedbelow)fromtheMRIslice.Anextractedpatchcontainingaclickedpixelislabeledasapositivesample(containingaspot)andthecorrespondinglabelinglatencyisassociatedwithit.Theremainingpatchesarelabeledasnon-spotpatches(negativesamples).5.2.1.1LabelingspotsGivenanMRIscanG,thismodulepresentsasoftwarewithzoomingandcontrastadjustmentcapabilitiesfortheexperttocarefullyanalyzeeach2DsliceintheMRIscanandclickonthespatiallocationtoindicateaspot.Collectively,these3DlocationsaredenotedbyasetF=f[fugku=1wherefudenotesthelocationofaclickedpointandkrepresentsthetotalnumberofclickedpoints.Inaddition,thetimelapsebetweenclicksisalsorecordedaslabelinglatency.ItwasobservedthatexpertslabeltheeasierspotswithoutengaginginanydetailedanalysisasshowninFig.5.4.Difabelspots,ontheotherhand,weretypicallylabeledatthe81end.Everypotentialspotentityiscarefullyanalyzedbythemedicalexpertbyzooming-inand,occasionally,bychangingthecontrastofthelocallyselectedregion(seeFig.5.5).Labelinglatencyisindirectlyrelatedtothecognitiveoverloadinvolvedinlabelingapointasaspot.ItisdenotedasR=f[rugku=1whereruisthelabelinglatencyassociatedwiththeclickedpointfu.Certainlatencyvalueswereratherhigh,duetobreakstakenbytheexpertorduetodistractions.Hence,valuesgreaterthan45secondsweresimplyreplacedwiththemeantimetakenbetweenmouseclicks.Thesepre-processedvaluesoflabelinglatenciescorrespondingtooneMRIscanareshowninFig.5.6.Factorssuchasspatialdistancebetweentwoconsecutiveclickshavelittleornoeffectonlabelinglatency(assmallmovementsonthemouse,translatetolargespatialdistancesonthescreen).Forsimplicity,anypossibleeffectsofsuchfactorsareignored.Sincethelabelingtaskislaborious,expertsareaskedtoindicatepositivesamples(spoten-tities)only.However,thismeansthatlatencyinformationisavailableonlyforpositivesamplesandnotforthenegativesamples.Forexperts,thiscreatesaneasierandmorepracticallabelingenvironment.However,fromapatternstandpoint,thisintroducesanewchallenge:fifeatures"thatareavailableduringtrainingbutnotduringtestingandareassociatedwithoneclassonly.5.2.1.2ExtractingRoIforTheannotated(i.e.,clicked)locations,F,mustbeassociatedwithsomeregionsintheMRIscan;theseregions,representingacollectionofpixelintensities,willthenbeinputtothe.Thequestiontoaddresshereis,howshouldaregionbeThishasbeendiscussedbeforeinChapter2.Forabriefoverview,asuperpixel[89]basedstrategysimilarto[18]isutilizedtoextractalargenumberofpatchesfromMRIGandtheseregionsaredenotedasX=fx1;x2;:::xngwheren>k.AsshowninFig.3.2,foreachsuperpixelina2DMRIslice,apatchofsizezzis82Figure5.4:Basicviewisusedtolabeleasy-to-detectspots.Figure5.5:(Left)Zoomed-inviewtolocateaspot.(Right)Zoomed-inviewwithcontrastadjust-mentfordetailedcontextualobservationpriortolabelingspots.extractedbykeepingthedarkestpixelofthesuperpixelatthecenter(sincespotstypicallyappeardarkerthanthesurroundingtissue).Notethatasuperpixelapproachispreferredoveradensesamplingmethodforregionsastheformerresultsinfewerbutmorerelevantpatchesforfurtherprocessingandislesslikelytoassociatemultiplespotstoasinglepatch.Further,asmentionedin[18],thesuperpixelalgorithmusedin[89][20]outperformsothersuperpixeland3Dsupervoxelalgorithms(suchas3DSLIC)forcapturingspotboundaries.Therefore,the2Dspotpatchesdetectedinneighboringslicesarelaterjoinedtoforma3Dspot.Formally,X=fx1;x2;:::xngwhereeachpatchxuhasacenterau.Notethataqudenotestheslice83Figure5.6:LabelinglatencyforasingleMRIscan.Figure5.7:(A)SpotpatchesextractedfromoneMRIscan(concatenatedas1015patches),(B)SpotpatchesfromanotherMRIscan.Thesepatchesrepresentinter-scanandintra-scanvaria-tionsinspotpatches.numberwherethepatchislocatedbasedonthe2Dspatiallocationbya1uanda2u.ThedistancebetweenthecenterlocationofeachpatchwithrespecttoalltheclickedlocationsinFiscomputed.Ifthesmallestofthesedistances,dmin,islessthanorequaltoathresholdt,thepatchisconsideredtobeaspot;otherwise,itisconsideredtobeanon-spotthatforms84thenegativeclassofthedataset.Notethatdmin=minjkfjauk2and1jk.Basedonthisstep,alabelyuisassignedtoeachpatchxu.Thus,Y=fy1;y2;:::;yngwhereyu2f0;1g.Further,X=fXp;XggwhereXp=fx1;x2;:::xkgdenotesallthespotpatcheswhileXg=fxk+1;xk+2;:::;xngrepresentsallthenon-spotpatches.Notethat8xu2Xp,thetrainingsamplesexistastripletsf[(xu;yu;ru)gku=1,whilefortheremainingnon-spotpatchesthelabelinglatencyvalues(ru)arenotavailable.5.2.2approach5.2.2.1ClusteringThevariationinlabelinglatencyvaluesforanMRIscancanbeseeninFig.5.6.AGaussianMix-tureModel(GMM)isutilizedheretocategorizeXpintomclustersbasedontheircorrespondinglabelinglatencyvalues.Here,mdenotestheoptimumnumberofclustersselectedusingthestan-dardAkaikeinfomationcriterion.Thus,eachpatchxu2Xpisassociatedwithaclusteringindexvuwhere1vumandV=fv1;v2;:::;vkg.5.2.2.2TransferlearningIntransferlearning,theknowledgegainedtoperformonetask(sourcetask)isusedtolearninginanothertask(targettask).InthecontextofCNNs,transferlearningcanbeimple-mentedbytransplantingthelearnednetworklayers(featurerepresentations)fromasourceCNNforinitializingatargetCNN.Inthisstudy,thetargettaskisthetaskofseparatingspotpatchesfromnon-spotpatches.TotrainaCNNtoperformthistask,oneapproachwouldbetoinitializeitrandomlyandthenutilizethetargettasktrainingdatatoupdateit.AnotherapproachwouldbetoinitializeitbasedonthelayersofasourceCNNthathasbeentrainedforadifferenttask.Here,85thesourcetaskinvolvesdifferentiatingbetweencategoriesofspotsgeneratedusingtheclusteringprocessdescribedearlier.ThelayersoftheensuingsourceCNNisthenusedtoinitializethetargetCNN.ThesourcedatasetDS=(Xp;V)isdeveloped,wheretheclusterindicesinVactlikelabelsforthespotpatchesinXp.ACNN,NS,isthentrainedtodistinguishbetweenthepatchesoftheseclusters.Formally,thegoalistolearnweightsforNSthatminimizethelossåku=1J(NS(xu);vu)whereJdenotesthestandardcross-entropylossfunction.Functionally,thisCNNlearningprocessisdenotedasNS=h(No;DS)whereNorepresentsarandomlyinitializedCNNarchitectureandNSdenotestheCNNthathaslearnedafeaturerepresentation(weights)todistinguishbetweenspotpatchesbelongingtodifferentclusters.TheCNNarchitecturecustomizedforthisdataisshowninFig.3.6.Notethatduetothesmallsizeoftheinputpatches(z=9),apoolinglayerhasnotbeenutilized.Inthesecondstep,thegoalistoutilizethetargetdatasetDT=(X;Y)forlearningatargetCNN,NT,whichcandistinguishbetweenspotandnon-spotpatches.Formally,theobjectiveistominimizethelossånu=1J(NT(xu);yu).However,inthiscase,NTisnotrandomlyinitialized;ratherthefeaturelayersinNSaretransplantedtoinitializeit.ThisisdenotedasNT=h(NoS;DT).Thetransferisconductedinthestandardmannerdetailedin[37,40].Inthiscase,alltheconvolutionallayersaretransferredforinitialization.Thefullyconnectedlayerisincompatiblefortransferduetostructuraldifferencesinducedbythetwotasks.Therefore,astypicallydoneintransferlearning,thefullyconnectedlayerisrandomlyinitialized.TheresultinginitializedCNNisdenotedasNoS.ExperimentalresultsshowthatthistransferofknowledgebringsanimprovementthatisnotachievedwhenusingarandomlyinitializedCNNNT=h(No;DT)thatisupdatedusingonlythedatasetDT.Notethatitmaybepossibletoachievedifferentlevelsofimprovementbasedonthelabelingbehaviorofdifferentexperts.Thiscouldbeoneinterestingdirectiontoexploreinthe86future.5.3Experiments,results,anddiscussionInthissectionexperimentsaredesignedtoanswerthefollowingquestions:(a)HowdoestheL-CNNcomparewithatraditionalCNN?(b)WhatistheresultifrandomclusteringisusedinsteadofGMMbasedclustering?(c)WhatistheeffectoftransferringdifferentnumberofCNNlayers?(d)Howdotheresultsobtainedinthisstudycomparewiththepreviousstate-of-the-artforspotdetectioninMRIscans?Notethatinallexperiments,theAreaUndertheCurve(AUC)valuewasusedasmeasureofaccuracy.SetupInthisstudy,theinvivoMRIdatabaseof[18]comprising5MRIscansofdifferentRatbrainswasused.3oftheseRatswereinjectedwithMesenchymalstemcellswhichappearasdarkspotsinMRI.About100;000patchesareextractedfromeachofthe3scans.Thenumberofpositivesamplesineachscanisabout5000.Thelabelinglatencyforeachlabeledpatchwasalsodocumented.EachofthethreescansissuccessivelyusedfortrainingwhiletheremainingtwoindependentMRIscansareusedfortesting.Thiscreates6testingscenarios.Thefollowingparameterswereused:z=9,m2[5;9],andt=p2.5.3.1ComparisonwithconventionalCNNapproachInthisexperiment,theresultofL-CNNiscomparedwithaconventionalCNNNTthatisrandomlyinitializedandthensimplytrainedusingDT.ResultsinFig.5.8clearlydemonstratethattheL-CNNresultsinbetterperformancethantheconventionalCNNonall6testingscenarios.ItisinterestingtonotethatexploitinglabelingbehaviorusingL-CNNcanprovideaperformanceincreaseofupto4%(seetestsetT3).Thus,theoflabelingbehaviorinperformance87improvementhasbeenclearlyestablished.5.3.2ComparisonwithrandomclusteringItcanbeseenthattheL-CNNarchitectureexploitsclusteringtocreatesub-categoriesofthelabeledspotpatches.Inthisexperiment,weinvestigatetheperformancewhenspotpatchesarerandomlyassignedtocategoriesinsteadofusingGMM.TheseresultsareshowninFig.5.8foreachofthe6testingscenarios,andalsocomparedwiththeL-CNN.Inalltestingscenarios,L-CNNclearlyperformsbetterwhenGMMisusedinsteadofrandomclustering.Further,itcanbeseenthattheperformanceduetorandomclusteringis,ingeneral,verysimilartothatoftheconventionalCNN.5.3.3ComparisonusingdifferentnumberoftransferlayersHere,theeffectoftransferringdifferentlayersisinvestigated.TheproposedCNNarchitecturehasthreeconvolutionallayersandafullyconnectedlayer.TheresultsoftransferringdifferentnumberofconvolutionallayersareshowninFig.5.9.Itisevidentthat,ingeneral,transferringallthreelayersresultsinsuperiorperformance.5.3.4ComparisonwithapreviousapproachWecomparetheresultsofL-CNNwiththepreviousstate-of-the-artforspotdetectionreportedin[18].Forthiscomparisontobecompatible,aleave-2-outapproachwasutilizedusingthesameexperimentalsetupmentionedin[18].Theproposedapproachclearlyresultsinsuperiorperformancewithanaccuracyof94:68%,comparedtothe89:1%accuracyachievedin[18].88Figure5.8:PerformanceoftheproposedL-CNN.Figure5.9:Resultswithdifferentnumberoftransferlayers.89Chapter6SupplementaryInformationInordertomakethisthesisself-contained,informationonourrelatedsupplementarystudiesispresentedinthischapter.Thefollowingtwostudieshavebeenreferencedinthemainchapters:1.Amodelbasedapproachforspotdetection:ThisstudywasdiscussedinChap.3.detailsoftheapproachandtheexperimentalresultsarediscussedhere.2.CNNrankingwithintuitiveapproach:ThisapproachwasdiscussedinChap.4.Experimentalresultsusingthisapproacharepresentedhere.906.1AmodelbasedapproachforspotdetectionThissectionpresentssupplementaryinformationonourlearning-basedapproachthatutilizesaspotmodel.ThelimitationsofthisapproachwerediscussedinChap.3.Inthisapproach,weconsiderspotsas3Dentitiesandrepresentitsgeneralstructuralmodelusingsuperpixels.Wethenextractanovelsetoffisuperfernsflfeaturesandclassifyitusingmultipleofspotslearnedbyapartitioning-basedensembleofBayesiannetworks.Experimentalresultsshowthatitperformsbetterthanpreviouslyrelatedapproaches.Insummary,thischaptermakesthefollowingcontributions:(i)Itproposesanovelsuperpixel-based3Dmodeltocharacterizecellularspotsthatcanpotentiallybeusedinothermedicalprob-lems.(ii)Itintroducesthesuperfernsfeaturethatexploitssuperpixel-basedrepresentationsandismorediscriminativethantraditionalfernfeatures.(iii)Itdemonstrateshowapartitioning-basedensemblelearningcanbeeffectivelyutilizedforMRIspotdetection.6.1.1ApproachAsmentionedbefore,thecell/spotdetectionprobleminMRIscanshasuniquechallenges,whereanumberofquestionsshouldbecarefullyconsideredpriortoalgorithmdesign.First,sinceaspotisessentiallya3DentityinanMRIcube,howtomodelitsthreedimensionalcharacteristics?Second,aspotisalsoasmallgroupofdarkpixelswithvaryingshapesandsizes.WhatisthebasicunitwithinanMRIcube(e.g.,one,two,orNpixels)forwhichthetwo-classdecisioncanbemade?Third,thereisahugenumberofcandidatelocations.Therefore,ourfeaturerepresentationforspotsshouldbenotonlyhighlydiscriminative,butalsoefandbasedoncomputationallylightoperations.Fourth,theappearanceofaspotvariesrelativetoitslocalandregionalneighborhood.Howtomakelearningrobusttothesevariationsshouldbeaddressed.91Figure6.1:Thearchitectureofourapproach.Blue,red,andblackarrowsaretheprocessingwduringthetrainingstage,testingstageandbothstages,respectively.Consideringthesechallenges,wedesignourtechnicalapproachasinFig.6.1,withdetailsbelow.6.1.1.1SpotmodelingVisually,acellularspotSappearsasaclusterofNdark3Dpixelswithhighvariationsinits3Dshapeandintensity,wrappedinsideacoverofbackgroundpixels.Inthiswork,wecallthesmallgroupofdarkpixelsasaspot'sinteriorI,andtheirlocalneighboringpixelsinthebackgroundastheexteriorEofa3Dspot.Thismodelisconsistentwiththemanuallabelingofspotsbydomainexperts,whoinspectthecross-sectionsofthesespotsinconsecutive2DMRIslices,andlookforasmallregion(interior)thatisdarkerthanitsneighboringpixels(exterior).Furthermore,humaneyescanalsoadjusttheamountofrelativedarknessbasedonthecharacteristicsofthebrainregioncontainingthatspot.Therefore,inadditiontomodelaspotwithitsinterior/exterior,wealsomodeltheregionitbelongsto,termedregioncontextR.6.1.1.2ModelinstantiationviasuperpixelGiventheconceptualspotmodelS=fI;E;Rg,wenowdescribehowtoI,E,andRforaspot,bythreesteps.Sincenospotshouldbeoutsidethebrainregion,thestepistoperformbrainsegmenationinevery2DMRIsliceswithbasicimageprocessingtechniques.Thesecond92stepistoIandEbyapplying2Dsuperpixelextraction[89]tothesegmentedbrainregionofeachMRIslice.AsuperpixelisagroupofNneighboringpixelswithsimilarintensities,i.e.,Vz;u=fxi;yi;zgNi=1whereuisthesuperpixelIDinslicez.Ingeneralsuperpixelscantightlycapturetheboundariesofaspot'sinterior;however,someimpreciselocalizationisalsoexpectedinpractice(seeChap.3).Afterextraction,wedenoteM=fVz;ugL;Uz=u=1asthesetofallsuperpixelsinthebrainregion,whereLandUarethenumberofslicesandsuperpixelIDs,respectively.Duetotheexclusivenessoftheinteriorandexteriorofspots,wehaveM=I[EwhereIandEarethesetofallinteriororexteriorsuperpixels,respectively.Withthat,foraspotSwithlengthlinz-axis,weformallyitsinteriorasI=fVz;u;;Vz+l;ujVˆIgandtheexteriorasE=fVz1;:;Vz;¯u;:::;Vz+l;¯u;Vz+l+1;:jjj(m(I)m(V)jjt;VˆEg,wherem()computesthemeanofaset,tisthemaximumL2distancebetweenthecentersofaspotandanexteriorsuperpixel,andVz1;:andVz+l+1;:aresuperpixelsintwoadjacentneighboringslices.AssumingthesecondstepextractsN1superpixelsperslice,thethirdstepalsoreliesonsuperpixelstoRwherethenumberofextractedsuperpixelsN2˝N1.ThisisreasonablesinceRcanincludeverylargesuperpixelsthatarerepresentativeoftheregionalappearance.Thus,wetheregioncontextofaspotasR=fŸVz;vjm(I)ˆŸVz;vg,whichisthelargesuperpixelenclosingthespotcenterm(I).Thesuperpixel-based3Dspotmodelhasafewadvantages.First,itaddressestheissueofunit,bygoingbeyondpixelsandusingthesuperpixel-basedmodelforfeatureextractionandcation.Second,thismodelsubstantiallyreducesthenumberoftotalcandidatespotstobetested,sincethecandidatescanbenominatedbasedonsuperpixelsratherthanpixels.Notethatwemayextendourmodelinstantiationbyusing3Dsupervoxelinsteadof2Dsuperpixel.Wechoosethelatterinthisworkduetoitsdemonstratedreliabilityandefyduringtheexperiments.936.1.1.3SuperfernsfeatureextractionWithaninstantiatedspotmodelS=fI;E;Rg,thenextstepistoextractadiscriminativeandeffeaturerepresentation.Sinceaspotgenerallyhasdarkerinteriorthanitsexterior,itmakessensetofeaturesbasedonthecomputationallyefntintensitydifferencesbetweenpixelsintheinteriorandexterior.Difference-basedfernfeatureshaveshowngreatsuccessincomputervision[90].Fernscomputetheintensitydifferencebetweenasubjectpixelandanotherpixelwithacertainoffsetw.r.t.thesubjectpixel.Usingthesameoffsetindifferentimagesleadstofeaturecorrespondenceamongtheseimages.Forourproblem,thespotcenterm(I)canberegardedasthesubjectpixel,anditsintensityistheaverageintensityofallinteriorpixelsm(G(I)).Wethenrandomlygenerateh3DoffsetsO=foighi=1withauniformdistribution,whosecenteristhespotcenterandradiusist.Finally,thefeaturesetiscomputedasF=ffighi=1,wherefi=G(m(I)+oi)m(G(I)).Whilefiiseftocompute,G(m(I)+oi)istheintensityofasinglepixel,whichcanbenoisy,speciallyinin-vivoMRIandleadtolowdiscriminabilityoffi.Thus,itisdesirabletoreplaceitwiththeaverageintensityofallpixelswithinanexteriorsuperpixel.However,theexteriorsuperpixelsarounddifferentspotshavenocorrespondence,and,asaresult,fifordifferentspotsalsohavethecorrespondenceissue.Toaddressthisissue,wepresentanapproachtoexploittheaverageintensitywithoutlosingcorrespondenceinformation.Thenewfeature,termedasfisuperferns",issimilartoFexceptitreplacesthesinglepixel-basedintensitywiththeaverageintensityofthesuperpixel,i.e.,F0=ff0ighi=1,wheref0i=m(G(V))m(G(I)),8m(I)+oi2V.Notethatitispossibletohavethesamefeatureattwodifferentoffsetsduetothembeinginthesamesuperpixel,i.e.,f0i=f0j.Thisisnotanissuebecausethisequalitymaynotbetrueforotherspots,hencethefeaturedistributionsoff0i94Figure6.2:Fernsvs.Superferns.andf0jarenotthesame,andtheycontributedifferentlytotheFeaturesarealsoneededfortheregioncontextR.Givenitsroleofsupportingregion-dependentwethatsimplefeaturesworkwellforR,e.g.,themeanandstandarddeviationofpixelintensitiesinR,Fr=(m(G(R));s(G(R))).6.1.1.4Partition-basedbayesianHavingcomputedthefeatureFs=(F;Fr)forasetofspotsandnon-spots,wenowpresentourapproachtolearnanaccuratetwo-class.Sincedifferentlocalregionshavedifferentappearance,wepartitionthebrainregionintoN0partitions,learnasetofN0eachforoneregion,andfusethemviaaprobabilisticBayesianformulation.,foranyspot95candidateS,itsprobabilityofbeingaspotisP(Fs)=N0åi=1P(Fs;ri)=N0åi=1P(Fsjri)P(ri);(6.1)whererirepresentstheithpartition,P(ri)istheprobabilityofSbelongingtori,andP(Fsjri)istheconditionalprobabilityofaspotatri.WelearnP(ri)usingthewell-knowGaussianMixtureModels(GMM)technique.BycollectingFrforalltrainingsamples,weperformGMMtoestimateN0componentGaussiandensities,eachconsideredasonepartition.Duringthetesting,fP(ri)gN0i=1isobtainedbyevaluatingFrofthetestingsamplew.r.t.eachcomponentdensities.InordertolearnP(Fsjri),wegroupalltrainingsamplesintoN0groupsbasedontheirrespectivemaximumfP(ri)g,andtraintheP(Fsjri)usingthestandardimplementationofBayesianNetworksin[91].Duringthetest,foratestingcandidatespot,GMMenablesasoftpartitionassignment,anditsprobabilityofbeingaspotistheweightedaverage.6.1.2ExperimentalresultsInthissectionwedesignexperimentstoinvestigateanswerstothefollowingquestions:(i)howdoesourapproachperformandcomparewiththepreviousapproachesusingbothinvivoandinvitrodata?(ii)howdoesthediscriminatingpotentialofsuperfernsquantitativelycompareswiththefernfeatures?(iii)howdiverseistheensemblecreatedbyourproposedapproach?6.1.2.1ExperimentalsetupTheROC,andAreaundertheCurve(AUC)areusedastheevaluationmetrics.Forthe5-scansinvivodata(GA),weadoptaleave-two-outschemesuchthatourtestingsetalwayscontainsone96Figure6.3:Detectionperformancecomparisonsandwithvariouscomponents.labeledandonespotlessscan.Thiscreatessixpairsoftrainingandtestingsets,whichallowsustocomputetheerrorbarofROC.Forthe4-scansinvitrodata,threepairsoftrainingandtestingsets97Figure6.4:Spotdetectionexamples:(a)truedetection,(b)falsenegative,(c)falsealarm.areformedsuchthatthenaivescanalwaysremainsinthetestingsetaccompaniedbyeveryotherscanonce.Weimplementthepriorworkof[92]and[16]andusethemasthebaselines,sincetheyarethemostrelevantexamplesofMRIcelldetectionusinglearning-basedandrule-basedmethods.Weexperimentallydeterminez=2,t=9,hvariesfrom2002000andqfrom2060dependingonthesizeofbrainregions.6.1.2.2PerformanceandcomparisonAsshowninFig.6.3(a,b),theproposedmethodoutperformstwobaselineswithanaverageAUCof98:9%(invitro)and89:1%(invivo).TheimprovementmarginisespeciallylargeratlowerFPRs,whicharethemainoperationpointsinpractice.Further,Fig.6.3(c)showsthatwithinvivodata,byusingfernsinsteadofsuperfernsorbymakingnopartitionsofthebrainregion,weobserveadecreaseinperformanceto85:3%and87:1%,respectively.Fig.6.4showsthreetypesofspotdetectionresultswithourmethod.Eachcolumnrepresentstwoconsecutiveslicesofonespot.Theappearanceandshapevariationsamongthespotsclearlyshowthechallengeofthisproblem.6.1.2.3Superfernsvs.fernsTofurtherillustratethestrengthofthenovelsuperfernsfeature,wecomparethediscriminatingpotentialofsuperfernswithferns,regardlessthecdesign.Informationgainisastandardtooltomeasuretheworthofafeature,whereahighergainindicatesitshigherdiscriminating98Figure6.5:Superfernsvs.FernsFigure6.6:diversityanalysispotential.Givenasetof50randomlygeneratedoffsetsoi,wecalculatetheirsuperfernsfeaturesontheinvitrotrainingdataincludingbothspotsandnonspots,whichallowsustocomputetheinformationgainsAsofeachoffsetorsuperfearn.ThesameoffsetsareappliedtothefernsfeaturesandresultsintheirinformationgainsAf.Thenwecomputetheratiooftwoinformationgain,As(i)Af(i),foroi,andcollectivelytheircumulativedensityfunction(CDF)isshowninFig.6.5.Using100randomoffsets,thesameexperimentisrepeatedfortheinvivodata.Thefactthatalmostallratiosarelargerthan1showsthesuperiorityofsuperferns.6.1.2.4DiversityanalysisOurframeworkincludesanensembleofoneforeachpartition.Sincediversediscriminativefeaturesareutilizedindifferentpartitions,learningondisjointpartitionsshouldfavorhighdiversitiesamongwhichisanstrongindicatorforeffectivecation.Toevaluatethediversityofourensemble,weusethestandardCohen'skappavalueas[93],whichrangesfrom0to1,withalowervalueindicatingahigherdiversity.Foreachofsixinvivotrainingsets,wecomputeN0(N01)2kappavalues,eachbetweenapairoflearnedondifferentpartitions.Fig6.6showstheirmeanandstandarddeviationforeachtrainingset.Basedonthestudyin[93],weconsiderourkappavaluestobeverylow,indicatingthehigh99diversityinourlearnedensemble.1006.2CNNrankingwithintuitiveapproachThissectionprovidessupplementaryinformationontheexperimentsandresultsoftheintuitiveapproachforCNNrankingwhichwasproposedinChap.4.6.2.1Experimentalsetup6.2.1.1TargettaskSincemanymedicalapplicationssufferfromthelackoflargescaleannotateddata,inthisstudy,anexisting,realworldMRIdatabase[18]wasutilizedasatargettask.Inthisthesis,thissetisdenotedbyGA.ThisdatabasehasthreedifferentsetsoflabeledMRIscanspertainingtoratbrains.Theinjectedstemcellsappearasdarkspotsintheseimages.Fromeachscanabout100;000non-spotpatchesand5000spotpatcheswereextractedasmentionedbefore.Thesepatcheswereobtaineddirectlyfromtheauthorsin[18].Intheexperimentsbelow,allpatchesfromasinglescan(singleset)wereusedfortrainingandthepatchesfromtheremainingtwoscanswereindependentlyutilizedfortesting,generatingatotalof6testingscenarios.Inalltheexperiments,theAreaUndertheCurve(AUC)wasutilizedforsummarizingaccuracy.6.2.1.2SourcetaskThefocusofthisstudyistorankasetofgivensourcetasksinordertodeterminetheirtransferabil-ityforaedtargettask.Therefore,25diversesourcetaskswerearbitrarilydesignedusingthepubliclyavailable,standardImageNetdatabase.Fourteenofthesewerebinarytasks,whilethenumberofclassesrangedfrom5to20fortheremainingsourcetasks.Notethatthegoalhereistobeagnostictothedatacharacteristicsofasourceandonlyutilizetheweightslearnedby101Figure6.7:Transformingsourceimagesto99.Transformed,averageimagesfordifferententi-tiesareshownhere.thesourceCNNtoassessitstransferabilitytothetargetdomain.Inthefollowingsections,experimentsaredesignedtostudythefollowingquestions:(1)Howwelldoestheproposedapproachrankthesourcesfrombesttoworst?(2)Whatisthedifferenceinperformancewhenthebestrankedsourceisusedfortransferlearningincomparisontotheworstrankedsource?(3)WhatisthegaininaccuracywhenresultsarecomparedagainstaCNNwithoutanytransferlearning?(4)Howdoesthesizeofthetargettrainingdataimpacttheperformancegain?(5)Whatroledoesthechoiceoflayers,thataretransplanted,haveontransferlearning?(6)Doestheinformationfusionofsourcesproviderobustnessagainstrankingerrors?(7)Canthenegativeimpactoftransferlearningbepredictedinadvance,basedonthesourcetask'srankingscore?(8)Whatdoestherankingscoreofasourcetasktellusaboutitsdatacharacteristics?6.2.2Resultsanddiscussion6.2.2.1ImpactofsizeoftargettrainingsetInthisexperiment,wecomparethefollowing:(1)Theperformanceoftransferlearningwhenusingthesourcethatwasrankedthebestagainstthesourcethatwasrankedworstbytheproposed102Figure6.8:Sourceentitiesandtheircorrespondingtransformedaverageimages.sourcerankingapproach.(2)PerformanceofbestandworstrankedagainstabaselineCNNthatwasonlytrainedusingtargettrainingdataXwithnotransferlearning.(3)PerformanceoftheaforementionedCNNswhenusingadifferentproportionoftargettrainingdata.Trainingwasaccomplishedwith12differentpercentagevaluesthatrangedfrom5%to60%ofthetrainingsetinincrementsof5.Fig.6.9showstheresultsonthreedifferenttestingscenarios.Fig.6.9(left)indicatesaperformancegainofabout˘35%withrespecttothebaselinewhenonly5%ofthetargettrainingdataisused!Wefurtherobservethattheperformancegainismorewhenthetrainingdataissmallinsizewhichispreciselythescenarioenvisionedinthisstudy.103Figure6.9:Comparisonofempiricalresultsonthreeofthesixtestingscenarios.Notetheper-formancegainondatasetswithsmalleramountsoftrainingdataandtheefyoftherankingmetric.6.2.2.2CorrelationbetweensourcerankingandperformancegainInFig.6.10(A),thex-axisrepresentstherankingscoreofasourcetaskthatiscomputedusingtheproposedapproach,withthetop-rankedsourcehavingthelargestvalue.They-axisshowsthenormalizedsumoftheoverallperformancegainachievedbyusingthatsourceinalltheafore-mentioned126scenarios(12differentsizedtrainingsets,6testscenarios).This(6.10A)depictstheoverallcorrelation,whenutilizingtrainingsetsizesrangingfrom5%to60%.How-ever,itisobservedthatsuchacorrelationishighwhenthesizeofthetargettrainingdataissmall,ascanbeseeninFig.6.10(B,C,D,E).PerformancegainismeasuredintermsofthedifferencebetweentheAreaUndertheCurve(AUC)values.Sincethecriterionwasdesignedforsmalltargettrainingsets,thisresultisdesired.From(FtoI)inFig.6.10,thetrainingdatasizeincreasesanditcanbeseenthatperformancegainbeginstodecreaseasthetargettrainingdataisnowsufforthespotdetectionproblem.6.2.2.3LayerstobetransferredFig.6.11(A)denotestheaccuracywhen(a)onlythemostgenerallayer(weightsfromtheconvolutionallayeronly)istransferredfromthesourceCNNs,and(b)allthree104Figure6.10:Correlationbetweenrankingscoreandperformancegain.convolutionallayersaretransferredfromthebestrankedsource.Theshadedregionrepresentstheareabetweenthecurvesplottedwhenonly1layerwastransferredfromall25sourceCNNs(indicatedbyL1).Experimentalresultsonall6testingscenariosclearlyshowthatforsourceswithhigherrankingscores,transferringalllayersresultinsuperiorperformance.Forexample,Fig.6.11(A)showsthattheshadedregionliescompletelyunderthebestrankedsourcewhenallthreelayersofthesourceCNNsaretransferred.Inthefuture,wewouldliketoutilizetherankingscorestodeterminethenumberoflayersthatcanbetransferred.105Figure6.11:(A)Performancegainanalysisw.r.ttransferringlayers.L1indicatesthatonly1convolutionallayerwastransferredandL3thatall3convolutionallayersweretransfered.Theredregionsshowstheareaspannedby20differentsources,whiletheblacklineshowsonlythebestrankedsourceoutof25.(B)ofinformationfusion.(C)Correlationofrankingscorewithnumberofclasses.6.2.2.4ofinformationfusionTransferlearningcaninvolvetransferringinformationfrommultiplesourceCNNs,Z,simultane-ously,ratherthanfromasingleCNNonly.LettwosourceCNNsbeNiandNj,respectively.LetNtiandNtj,respectively,betheupdatedCNNsaftertransferlearning.TheCNNsareupdatedusing106thetrainingdata,fromthetargettask.TheoutputofeachCNNisthesetofprobabilitiesindicatingtheposteriorprobabilitythatthegiveninputbelongstoaparticularclass(label).Consideratestsamples.Then,eachCNNwillpredicttheclasslabelsoftheinputdatadifferently,therespectiveprobabilitiescanbedenotedasP(sjNti)andP(sjNtj),respectively.Thesetwoexpressionscanbecombinedas:P(s)=ziP(sjNti)+zjP(sjNtj):(6.2)Usingtherankingscores,theweightscanbecomputedas:zi=EiEi+Ej;zj=EjEi+Ej:(6.3)Fordsources,thisapproachcanbeextendedsuchthatådk=1zk=1.Fig.6.11(B)showsthatthefusionapproachcanovercomeanypotentialerrorsinrankingthesources.Theshadedregiondisplaystheareabetweenthetop-3rankedsourceCNNs.Theresultofthefusedperformanceisplottedasalinewhichclearlystaysnearthetopofthisarea.Incases,where`poor'sourcesarerankedhigher,usingafusionapproachcanprovetobemorereliable.107Chapter7ConclusionThisworkpresentedthecomprehensivestudyonlearningbased,automatedspotdetectionandinMRI.Thisworkhighlightedandaddressedanumberofchallengesinthiscontext.Toutilizeintelligentmachinelearningandcomputervisionapproaches,theannotatedMRIdatabasewasdevelopedforspotdetection.AnextensivestudywasconductedbydesigningadiversesetoflearningbasedapproacheswhichwereevaluatedusingbothinvitroandinvivoMRIscans.EvaluationwasalsoperformedagainstaknownnumberofspotsininvitroMRIscans.TheimpactofresolutionchangeinMRIwasalsostudied.Further,inmanymedicalapplicationssuchasthis,itischallengingtocollectalargevolumeofannotateddata.Inthisstudy,wealsoinvestigatedhowaccurateconvolutionalneuralnetworkarchitecturescanbelearnedusingtransferlearningschemesdespiteusingverylimitedtrainingdata.Infact,morethan35%improvementinaccuracywasobservedwhentrainingwasconductedonlywith5%oftheavailabletrainingdata.Inthiscontext,atheoreticalframeworkwasalsopresentedwhichcanbealsogeneralizedtootherrelatedtasks.Inaddition,wealsodemonstratethatthelabelingprocessofamedicalexpertcanbeincorporatedintotheframework.ItisimportanttonotethatMRI-basedcelltrackinghasremainedlargelyphenomenologicalforitshistory,startinginthelate80's.Movingforward,automatedspotdetectionforMRI-basedcelltrackingwouldproveusefulacrossabroadspectrumofresearchtracks.Forexample,Walczaketal,infusedneuralstemcellsviathecarotidarteryinanefforttotargetstrokelesions[94].HighresolutioninvivoandinvitroMRIappeartoshowsmallclustersofcells,perhapseven108singlecells,distributedinthebrainasafunctionoftheintervention.Onlyqualitativeanalysiswasperformedonthisimagingdata;automatedspotdetectionwouldhaveenabledquantitativemetricsofcellnumbers.Anotherapplicationwouldbefortheevaluationoftransplantedisletsencapsulatedwithironoxidenanoparticleswithinalginatemicrospheres.Theseimagingfeatures,typicallyareindividualhypointensities,examplesbeing[95]and[96].Inbothcases,onlyqualitativeorsemi-quantitativedatawerecompiled,withoutadirectenumerationoftransplantedandsurvivinggrafts.Alastexamplewouldbeforenumerationofkidneyglomeruliinconjunctionwiththeuseofcationizedferritinasacontrastagent[97].ThegeneraluseofMRI-basedcelltrackingandthisapproachtoquantifyingthisdatahassomelimitations.Still,MRIofmagneticallylabeledcellsonlydetectstheiron,notthecellitself,andthismethodisstillunabletodistinguishlivecellsfromdeadcells.Further,ifmorethanonecellgeneratesaparticularspotintheMRI,thenthecalculatedcellnumberwouldbeinaccurate.Inthiswork,only67%ofspotswereresultantfromindividualcells,theother33%from2or3cells.ItremainsanopenquestionastohowaccurateanautomatedspotdetectionalgorithmforMRI-basedcelltrackingneedstobeinordertoprovideusefulclinicalinformation.However,wedonotfeelthatheterogeneousmagneticcelllabelingisaproblem.Indeed,cellswithmoreinternalizedironwouldhavedarkerandlargerspotsonMRI,whilecellswithlessinternalizedironwouldhavelighterandsmallerspots.However,ourautomatedalgorithmcanaccountfordifferencesinspotsizeandintensitytocompensateforheterogeneouscelllabeling.Forfuturework,severaldifferentstudieshavebeensuggestedattheendofChap.4andChap.5.Inadditiontothese,itwillbeinterestingtoexploretheefyoftheproposedapproachusingthegroundtruthobtainedwithhistology.Suchagroundtruthcanalsobeutilizedtoevaluatethelabelingperformanceofamedicalexpert.Anotherinterestingdirectionwouldbetoutilizeahier-109archicalapproachwherethedifferentareexploitedinmultiplelayers.Aineachlayercanrejectsomeofthespotcandidatesandtransfertheothercandidatestothenextlayer.Thiswillallowindeeperlayerstospecializeindetectinghighlychalleng-ingspotcandidates.Further,obtaininghighresolutionMRIcanbetime-consuming.Therefore,anotherinterestingdirectionofresearchwouldbetoexploreCNNarchitecturesthatcanperformaccuratemappingoflowresolutionMRItoahigherresolution.110BIBLIOGRAPHY111BIBLIOGRAPHY[1]NavneetDalalandBillTriggs.Histogramsoforientedgradientsforhumandetection.InComputerSocietyConferenceonComputerVisionandPatternRecognition,volume1,pages886Œ893.IEEE,2005.[2]AudeOlivaandAntonioTorralba.Buildingthegistofascene:Theroleofglobalimagefeaturesinrecognition.ProgressinBrainResearch,155:23Œ36,2006.[3]HaithamBouAmmar,EricEaton,MatthewETaylor,DecebalConstantinMocanu,KurtDriessens,GerhardWeiss,andKarlTuyls.Anautomatedmeasureofmdpsimilarityfortransferinreinforcementlearning.InWorkshopsattheAAAIConferenceonIntel-ligence,2014.[4]Clinicaltrial:Outcomesdataofadiposestemcellstotreatparkinson'sdisease,web-sitehttps://clinicaltrials.gov/ct2/show/nct02184546(received:July3,2014lastupdated:June17,2015lastvJune2015).[5]Clinicaltrial:Astudytoevaluatethesafetyofneuralstemcellsinpatientswithparkin-son'sdisease,websitehttps://clinicaltrials.gov/ct2/show/nct02452723(received:May18,2015lastupdated:March10,2016lastvFebruary2016).[6]Clinicaltrial:Umbilicalcordtissue-derivedmesenchymalstemcellsforrheumatoidarthritis,websitehttps://clinicaltrials.gov/ct2/show/nct01985464(received:October31,2013lastupdated:February4,2016lastvFebruary2016).[7]Clinicaltrial:Cx611-0101,eascsintravenousadministrationtorefractoryrheumatoidarthri-tispatients,websitehttps://clinicaltrials.gov/ct2/show/nct01663116received:August5,2011lastupdated:March5,2013lastvFebruary2013).[8]Clinicaltrial:Evaluationofautologousmesenchymalstemcelltransplantation(effectsandsideeffects)inmultiplesclerosis,websitehttps://clinicaltrials.gov/ct2/show/nct01377870received:June19,2011lastupdated:April24,2014lastvAugust2010).[9]Clinicaltrial:Stemcelltherapyforpatientswithmultiplesclerosisfailingalternateapprovedtherapy-arandomizedstudy,websitehttps://clinicaltrials.gov/ct2/show/nct00273364received:January5,2006lastupdated:March21,2016lastvMarch2016).[10]Clinicaltrial:Pilotstudyofredirectedautologoustcellsengineeredtocontainhumanizedanti-cd19inpatientswithrelapsedorrefractorycd19+leukemiaandlymphomapreviouslytreatedwithcelltherapy,websitehttps://clinicaltrials.gov/ct2/show/nct02374333re-ceived:February23,2015lastupdated:February23,2016lastvFebruary2016).112[11]Clinicaltrial:Geneticallyt-cellsintreatingpatientswithrecurrentorrefractorymalignantglioma,websitehttps://clinicaltrials.gov/ct2/show/nct02208362.[12]JonathanRSlotkin,KevinSCahill,SuzanneATharin,andErikMShapiro.Cellularmag-neticresonanceimaging:nanometerandmicrometersizeparticlesfornoninvasivecelllocal-ization.Neurotherapeutics,4(3):428Œ433,2007.[13]ErikMShapiro,KathrynSharer,StankoSkrtic,andAlanPKoretsky.Invivodetectionofsinglecellsbymri.MagneticResonanceinMedicine,55(2):242Œ249,2006.[14]RongZhou,DjaudatIdiyatullin,SteenMoeller,CurtCorum,HualeiZhang,HuiQiao,JiaZhong,andMichaelGarwood.Swiftdetectionofspio-labeledstemcellsgraftedinthemy-ocardium.MagneticResonanceinMedicine,63(5):1154Œ1161,2010.[15]YijenLWu,QingYe,DanielleFEytan,LiLiu,BeddaLRosario,TKevinHitchens,Fang-ChengYeh,ChienHo,etal.Magneticresonanceimaginginvestigationofmacrophagesinacutecardiacallograftrejectionafterhearttransplantation.Circulation:CardiovascularImaging,6(6):965Œ973,2013.[16]IhorSmal,MarcoLoog,WiroNiessen,andErikMeijering.Quantitativecomparisonofspotdetectionmethodsinmicroscopy.IEEETransactionsonMedicalImaging,29(2):282Œ301,2010.[17]YukiMori,TingChen,TetsuyaFujisawa,SyojiKobashi,KohjiOhno,ShinichiYoshida,YoshiyukiTago,YutakaKomai,YutakaHata,andYoshichikaYoshioka.Fromcartoontorealtimemri:invivomonitoringofphagocytemigrationinmousebrain.reports,4:6997,2014.[18]MuhammadJamalAfridi,XiaomingLiu,ErikShapiro,andArunRoss.AutomaticinvivocelldetectioninMRI.InMedicalImageComutingandMedicalAssistedInterventions,pages391Œ399.Springer,2015.[19]YanivTaigman,MingYang,Marc'AurelioRanzato,andLarsWolf.Deepface:Closingthegaptohuman-levelperformanceinfacevation.InComputerVisionandPatternRecog-nition,pages1701Œ1708,2014.[20]MuhammadJamalAfridi,XiaomingLiu,andJMitchellMcGrath.Anautomatedsystemforplant-leveldiseaseratinginrealInInternationalConferenceonPatternRecognition,pages148Œ153,2014.[21]AlexanderSorokinandDavidForsyth.Utilitydataannotationwithamazonmechanicalturk.InComputerVisionandPatternRecognitionworkshops,2008.[22]IanHWittenandEibeFrank.DataMining:Practicalmachinelearningtoolsandtechniques.MorganKaufmann,2005.113[23]ZhenguoLi,Xiao-MingWu,andShihFuChang.Segmentationusingsuperpixels:Abipartitegraphpartitioningapproach.InComputerVisionandPatternRecognition,2012.[24]ZhihuiHao,QiangWang,HaibingRen,KuanhongXu,YeongKyeongSeong,andJiyeunKim.Multiscalesuperpixelfortumorsegmentationinbreastultrasoundimages.InInternationalConferenceonImageProcessing,pages2817Œ2820,2012.[25]BrianFulkerson,AndreaVedaldi,andStefanoSoatto.Classsegmentationandobjectlo-calizationwithsuperpixelneighborhoods.InInternationalConferenceonComputerVision,2009.[26]JosephTigheandSvetlanaLazebnik.Superparsing:scalablenonparametricimageparsingwithsuperpixels.InEuropearnConferenceonComputerVision,pages352Œ365,2010.[27]HanLiu,YanyunQu,YangWu,andHanziWang.segmentationwithmulti-scalesuperpixels.InAsianConferenceonComputerVisionWorkshops,pages158Œ169,2013.[28]MarkAHall.Correlation-basedfeatureselectionformachinelearning.PhDthesis,TheUniversityofWaikato,1999.[29]AlexKrizhevsky,IlyaSutskever,andGeoffreyEHinton.Imagenetwithdeepconvolutionalneuralnetworks.InadvancesinNeuralInformationProcessingSsystems,pages1097Œ1105,2012.[30]YannLeCun,LéonBottou,YoshuaBengio,andPatrickHaffner.Gradient-basedlearningappliedtodocumentrecognition.ProceedingsoftheIEEE,86(11):2278Œ2324,1998.[31]PierreSermanet,DavidEigen,XiangZhang,MichaëlMathieu,RobFergus,andYannLeCun.Overfeat:Integratedrecognition,localizationanddetectionusingconvolutionalnetworks.arXivpreprintarXiv:1312.6229,2013.[32]MatthewDZeilerandRobFergus.Visualizingandunderstandingconvolutionalnetworks.InEuropeanConferenceonComputerVision,pages818Œ833.Springer,2014.[33]KarenSimonyanandAndrewZisserman.Verydeepconvolutionalnetworksforlarge-scaleimagerecognition.arXivpreprintarXiv:1409.1556,2014.[34]ChristianSzegedy,WeiLiu,YangqingJia,PierreSermanet,ScottReed,DragomirAnguelov,DumitruErhan,VincentVanhoucke,andAndrewRabinovich.Goingdeeperwithconvolu-tions.InComputerVisionandPatternRecognition,pages1Œ9,2015.[35]KaimingHe,XiangyuZhang,ShaoqingRen,andJianSun.Deepresiduallearningforimagerecognition.arXivpreprintarXiv:1512.03385,2015.114[36]KaimingHe,XiangyuZhang,ShaoqingRen,andJianSun.Identitymappingsindeepresid-ualnetworks.arXivpreprintarXiv:1603.05027,2016.[37]JasonYosinski,JeffClune,YoshuaBengio,andHodLipson.Howtransferablearefeaturesindeepneuralnetworks?InadvancesinNeuralInformationProcessingSystems,pages3320Œ3328,2014.[38]PulkitAgrawal,JoaoCarreira,andJitendraMalik.Learningtoseebymoving.InInterna-tionalConferenceonComputerVision,pages37Œ45,2015.[39]ShubhamTulsiani,JoaoCarreira,andJitendraMalik.Poseinductionfornovelobjectcate-gories.InInternationalConferenceonComputerVision,pages64Œ72,2015.[40]MingshengLong,JianminWang,MichaelJordan,andYueCao.Learningtransferablefea-tureswithdeepadaptationnetworks.InInternationalConferenceonMachineLearning,2015.[41]HaoChen,DongNi,JingQin,ShenliLi,XinYang,TianfuWang,andPhengHeng.Standardplanelocalizationinfetalultrasoundviadomaintransferreddeepneuralnetworks.IEEEJournalofBiomedicalandHealthInformatics,19,2015.[42]HosseinAzizpour,AliSharifRazavian,JosephineSullivan,AtsutoMaki,andStefanCarls-son.Fromgenerictodeeprepresentationsforvisualrecognition.InComputerVisionandPatternRecognitionworkshop,2015.[43]EtaiLittwinandLiorWolf.Themultiverselossforrobusttransferlearning.ComputerVisionandPatternRecognition,2016.[44]AndreasGeiger,PhilipLenz,ChristophStiller,andRaquelUrtasun.Visionmeetsrobotics:Thekittidataset.InternationalJournalofRoboticsResearch,2013.[45]SachinSudhakarFarfade,MohammadJSaberian,andLi-JiaLi.Multi-viewfacedetectionusingdeepconvolutionalneuralnetworks.InInternationalConferenceonMultimediaRe-trieval,pages643Œ650.ACM,2015.[46]MaximeOquab,LeonBottou,IvanLaptev,andJosefSivic.Learningandtransferringmid-levelimagerepresentationsusingconvolutionalneuralnetworks.InComputerVisionandPatternRecognition,pages1717Œ1724,2014.[47]HalDauméIII.Frustratinglyeasydomainadaptation.arXivpreprintarXiv:0907.1815,2009.[48]RitaChattopadhyay,QianSun,WeiFan,IanDavidson,SethuramanPanchanathan,andJiepingYe.Multisourcedomainadaptationanditsapplicationtoearlydetectionoffatigue.TransactionsonKnowledgeDiscoveryfromData,6(4):18,2012.115[49]BoqingGong,YuanShi,FeiSha,andKristenGrauman.Geodesicwkernelforunsuper-viseddomainadaptation.InComputerVisionandPatternRecognition,pages2066Œ2073.IEEE,2012.[50]JohnBlitzer,RyanMcDonald,andFernandoPereira.Domainadaptationwithstructuralcor-respondencelearning.InConferenceonEmpiricalMethodsinNaturalLanguageProcessing,pages120Œ128.ACM,2006.[51]DianeCook,KyleDFeuz,andNarayananCKrishnan.Transferlearningforactivityrecog-nition:Asurvey.KnowledgeandInformationSystems,36(3):537Œ556,2013.[52]KyleDFeuzandDianeJCook.Transferlearningacrossfeature-richheterogeneousfeaturespacesviafeature-spaceremapping(FSR).TransactionsonIntelligentSystemsandTechnol-ogy,6(1):3,2015.[53]SinnoJialinPanandQiangYang.Asurveyontransferlearning.IEEETransactionsonKnowledgeandDataEngineering,22(10):1345Œ1359,2010.[54]WenyuanDai,QiangYang,GuiRongXue,andYongYu.Boostingfortransferlearning.InInternationalConferenceonMachineLearning,pages193Œ200,2007.[55]JingJiangandChengXiangZhai.InstanceweightingfordomainadaptationinNLP.InACL,volume7,pages264Œ271,2007.[56]XuejunLiao,YaXue,andLawrenceCarin.Logisticregressionwithanauxiliarydatasource.InInternationalConferenceonMachinelearning,pages505Œ512.ACM,2005.[57]PengchengWuandThomasGDietterich.Improvingsvmaccuracybytrainingonauxiliarydatasources.InInternationalConferenceonMachineLearning,page110.ACM,2004.[58]SinnoJialinPan,JamesTKwok,andQiangYang.Transferlearningviadimensionalityreduction.InAAAIConferenceonIntelligence,volume8,pages677Œ682,2008.[59]SinnoJialinPan,IvorWTsang,JamesTKwok,andQiangYang.Domainadaptationviatransfercomponentanalysis.TransactionsonNeuralNetworks,22(2):199Œ210,2011.[60]RajatRaina,AlexisBattle,HonglakLee,BenjaminPacker,andAndrewYNg.Self-taughtlearning:transferlearningfromunlabeleddata.InInternationalConferenceonMachinelearning,pages759Œ766.ACM,2007.[61]TatianaTommasi,FrancescoOrabona,andBarbaraCaputo.Safetyinnumbers:Learningcategoriesfromfewexampleswithmultimodelknowledgetransfer.InComputerVisionandPatternRecognition,pages3081Œ3088.IEEE,2010.[62]YiYaoandGianfrancoDoretto.Boostingfortransferlearningwithmultiplesources.In116ComputerVisionandPatternRecognition,pages1855Œ1862.IEEE,2010.[63]LilyanaMihalkova,TuyenHuynh,andRaymondJMooney.Mappingandrevisingmarkovlogicnetworksfortransferlearning.InAAAIConferenceonIntelligence,volume7,pages608Œ614,2007.[64]WenyuanDai,QiangYang,Gui-RongXue,andYongYu.Self-taughtclustering.InInterna-tionalConferenceonMachineLearning,pages200Œ207.ACM,2008.[65]DikanXing,WenyuanDai,Gui-RongXue,andYongYu.Bridgedfortransferlearning.InEuropeanConferenceonPrinciplesofDataMiningandKnowledgeDiscovery,pages324Œ335.Springer,2007.[66]MichaelTRosenstein,ZvikaMarx,LesliePackKaelbling,andThomasGDietterich.Totransferornottotransfer.InadvancesinNeuralInformationProcessingSystemsWorkshop,volume898,2005.[67]SinnoJialinPan,XiaochuanNi,Jian-TaoSun,QiangYang,andZhengChen.Cross-domainsentimentviaspectralfeaturealignment.InInternationalConferenceonWorldWideWeb,pages751Œ760.ACM,2010.[68]MingshengLong,JianminWang,GuiguangDing,JiaguangSun,andPhilipSYu.Transferfeaturelearningwithjointdistributionadaptation.InInternationalConferenceonComputerVision,pages2200Œ2207,2013.[69]SebastianThrunandTomMMitchell.Learningonemorething.Technicalreport,DTICDocument,1994.[70]SebastianThrunandLorienPratt.Learningtolearn.SpringerScienceandBusinessMedia,2012.[71]ZhongqiLu,YinZhu,SinnoJialinPan,EvanWeiXiang,YujingWang,andQiangYang.SourcefreetransferlearningfortextInAAAIConferenceonIntelli-gence,pages122Œ128,2014.[72]MuhammadJamalAfridi,ArunRoss,andErikM.Shapiro.L-CNN:ExploitinglabelinglatencyinaCNNlearningframework.InInternationalConferenceonPatternRecognition,2016.[73]MuhammadJamalAfridi,ArunRoss,XioamingLiu,MargaretBennewitz,DorelaShuboni,andErikM.Shapiro.IntelligentandautomaticcelldetectionandinMRI.MagneticResonanceinMedicine.[74]BoleiZhou,AgataLapedriza,JianxiongXiao,AntonioTorralba,andAudeOliva.Learningdeepfeaturesforscenerecognitionusingplacesdatabase.InadvancesinNeuralInformation117ProcessingSystems,pages487Œ495,2014.[75]DanFoti,GregHajcak,andJosephDien.Differentiatingneuralresponsestoemotionalpic-tures:evidencefromtemporal-spatialPCA.Psychophysiology,46(3):521Œ530,2009.[76]MDGrimaMurcia,MALopez-Gordo,MariaJOrtíz,JMFerrández,andEduardoFernández.Spatio-temporaldynamicsofimageswithemotionalbivalence.InComputationinBiologyandMedicine,pages203Œ212.Springer,2015.[77]VladimirVapnikandAkshayVashist.Anewlearningparadigm:Learningusingprivilegedinformation.NeuralNetworks,22(5):544Œ557,2009.[78]VladimirVapnik,AkshayVashist,andNatalyaPavlovitch.Learningusinghiddeninforma-tion(learningwithteacher).InInternationalJointConferenceonNeuralNetworks,pages3188Œ3195.IEEE,2009.[79]JixuChen,XiaomingLiu,andSiweiLyu.Boostingwithsideinformation.InAsianConfer-enceonComputerVision,pages563Œ577.Springer,2012.[80]ZihengWangandQiangJi.learningwithhiddeninformation.InComputerVisionandPatternRecognition,pages4969Œ4977,2015.[81]ViktoriiaSharmanska,NoviQuadrianto,andChristophLampert.Learningtorankusingprivilegedinformation.InInternationalConferenceonComputerVision,pages825Œ832,2013.[82]LiorWolfandNogaLevy.Thesvm-minussimilarityscoreforvideofacerecognition.InComputerVisionandPatternRecognition,pages3523Œ3530,2013.[83]ViktoriiaSharmanska,NoviQuadrianto,andChristophHLampert.Learningtotransferprivilegedinformation.arXivpreprintarXiv:1410.0389,2014.[84]WalterJ.Scheirer,SamuelE.Anthony,KenNakayama,andDavidD.Cox.Perceptualanno-tation:Measuringhumanvisiontoimprovecomputervision.PAMI,36,August2014.[85]EricPXing,AndrewYNg,MichaelIJordan,andStuartRussell.Distancemetriclearn-ingwithapplicationtoclusteringwithside-information.advancesinNeuralInformationProcessingSsystems,15:505Œ512,2003.[86]SugatoBasu,MikhailBilenko,andRaymondJMooney.Aprobabilisticframeworkforsemi-supervisedclustering.InKDD,pages59Œ68.ACM,2004.[87]HermanKamper,WeiranWang,andKarenLivescu.Deepconvolutionalacousticwordem-beddingsusingword-pairsideinformation.arXivpreprintarXiv:1510.01032,2015.118[88]RuthJanning,CarlottaSchatten,andLarsSchmidt-Thieme.Hnnp-ahybridneuralnetworkplaitforimprovingimagewithadditionalsideinformation.InICTAI,pages24Œ29.IEEE,2013.[89]Ming-YuLiu,OncelTuzel,SrikumarRamalingam,andRamaChellappa.Entropyratesuper-pixelsegmentation.InComputerVisionandPatternRecognition,pages2097Œ2104,2011.[90]MustafaOzuysal,MichaelCalonder,VincentLepetit,andPascalFua.Fastkeypointrecog-nitionusingrandomferns.TransactionsonPatternAnalysisandMachineIntelligence,32(3):448Œ461,2010.[91]RemcoRBouckaert,EibeFrank,MarkHall,RichardKirkby,PeterReutemann,AlexSee-wald,andDavidScuse.Wekamanualforversion3-7-8.Hamilton,NewZealand,2013.[92]YukiMori,TingChen,TetsuyaFujisawa,SyojiKobashi,KohjiOhno,ShinichiYoshida,YoshiyukiTago,YutakaKomai,YutakaHata,andYoshichikaYoshioka.FromcartoontorealtimeMRI:invivomonitoringofphagocytemigrationinmousebrain.reports,4,2014.[93]JuanJoséRodriguez,LudmilaIKuncheva,andCarlosJAlonso.Rotationforest:Anewensemblemethod.TransactionsonPatternAnalysisandMachineIntelligence,28(10):1619Œ1630,2006.[94]PiotrWalczak,JianZhang,AssafAGilad,DorotaAKedziorek,JesusRuiz-Cabello,Ran-dellGYoung,MarkFPittenger,PeterCMvanZijl,JudyHuang,andJeffWMBulte.Dual-modalitymonitoringoftargetedintraarterialdeliveryofmesenchymalstemcellsaftertran-sientischemia.Stroke,39(5):1569Œ1574,2008.[95]DianRStefValdeig,RobertAAnders,JeffWMBulte,andCliffordRWeiss.Mag-netoencapsulatedhumanisletsxenotransplantedintoswine:acomparisonofdifferenttrans-plantationsites.Xenotransplantation,23(3):211Œ221,2016.[96]PingWang,ChristianSchuetz,PrashanthVallabhajosyula,ZdravkaMedarova,AsedaTena,LinglingWei,KazuhikoYamada,ShaopingDeng,JamesFMarkmann,DavidHSachs,etal.Monitoringofallogeneicisletgraftsinnonhumanprimatesusingmri.Transplantation,99(8):1574Œ1581,2015.[97]EdwinJBaldelomar,JenniferRCharlton,ScottCBeeman,BradleyDHann,LuiseCullen-McEwen,ValeriaMPearl,JohnFBertram,TeresaWu,MinZhang,andKevinMBennett.Phenotypingbymagneticresonanceimagingnondestructivelymeasuresglomerularnumberandvolumedistributioninmicewithandwithoutnephronreduction.Kidneyinternational,2015.119